Derek Worthen
|
2b70e4a4f3
|
Tokenizer (#2051)
* Add LiteLLM chat and embedding model providers.
* Fix code review findings.
* Add litellm.
* Fix formatting.
* Update dictionary.
* Update litellm.
* Fix embedding.
* Remove manual use of tiktoken and replace it with
a Tokenizer interface. Adds support for encoding
and decoding with the models supported by litellm.
* Update litellm.
* Configure litellm to drop unsupported params.
* Clean up semversioner release notes.
* Add num_tokens util to Tokenizer interface.
* Update litellm service factories.
* Clean up litellm chat/embedding model argument assignment.
* Update chat and embedding type field for litellm use and future migration away from fnllm.
* Flatten litellm service organization.
* Update litellm.
* Update litellm factory validation.
* Flatten litellm rate limit service organization.
* Update rate limiter: disable with None/null instead of 0.
* Fix usage of get_tokenizer.
* Update litellm service registrations.
* Add jitter to exponential retry.
* Update validation.
* Update validation.
* Add litellm request logging layer.
* Update cache key.
* Update defaults.
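
For illustration only, here is a minimal sketch of what a Tokenizer interface with encode, decode, and a num_tokens utility could look like. The class and method signatures below are assumptions and are not taken from the code in this commit. The whitespace-based implementation is a toy stand-in, not tiktoken or litellm.

```python
from abc import ABC, abstractmethod


class Tokenizer(ABC):
    """Hypothetical tokenizer interface; names and signatures are assumed."""

    @abstractmethod
    def encode(self, text: str) -> list[int]:
        """Turn text into a list of token ids."""

    @abstractmethod
    def decode(self, tokens: list[int]) -> str:
        """Turn token ids back into text."""

    def num_tokens(self, text: str) -> int:
        """Count tokens by encoding the text and measuring the result."""
        return len(self.encode(text))


class WhitespaceTokenizer(Tokenizer):
    """Toy implementation (assumption) used only to exercise the interface."""

    def __init__(self) -> None:
        self._ids: dict[str, int] = {}   # word -> token id
        self._words: list[str] = []      # token id -> word

    def encode(self, text: str) -> list[int]:
        tokens = []
        for word in text.split():
            # Assign a new id the first time a word is seen.
            if word not in self._ids:
                self._ids[word] = len(self._words)
                self._words.append(word)
            tokens.append(self._ids[word])
        return tokens

    def decode(self, tokens: list[int]) -> str:
        return " ".join(self._words[t] for t in tokens)
```

Callers that depend only on the abstract class can count or round-trip tokens without knowing which backend produced them, which is the usual motivation for replacing direct library calls with an interface.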
---------
Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
|
2025-09-22 13:55:14 -06:00 |
|