I measured around 0.08ms overhead. LiteLLM's Python proxy added about 7ms to 8ms per request. However, an LLM call takes 500ms to 30 seconds. A 7ms delay is almost invisible compared to the model ...
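To put those numbers side by side, here is a minimal sketch of the arithmetic, using only the figures quoted above. The function name `relative_overhead_pct` and the labels are my own, chosen for illustration; they do not come from LiteLLM or any benchmark tool.

```python
def relative_overhead_pct(overhead_ms: float, call_ms: float) -> float:
    """Return proxy overhead as a percentage of the model call duration."""
    if call_ms <= 0:
        raise ValueError("call_ms must be positive")
    return overhead_ms / call_ms * 100.0


# Figures quoted in the text: 0.08 ms, and 7 to 8 ms for LiteLLM's Python proxy.
overheads_ms = {"0.08 ms overhead": 0.08, "7 ms overhead": 7.0, "8 ms overhead": 8.0}
# Range of LLM call durations quoted in the text: 500 ms to 30 seconds.
call_durations_ms = [500.0, 30_000.0]

for label, overhead in overheads_ms.items():
    for call in call_durations_ms:
        pct = relative_overhead_pct(overhead, call)
        print(f"{label} on a {call:.0f} ms call: {pct:.4f}% of the call time")
```

On the fastest quoted call (500 ms), 7 ms works out to roughly 1.4% of the call time; on a 30-second call it is about 0.02%.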