Dynamo v0.7.1

Dynamo v0.7.1 - Release Notes

Summary

Dynamo 0.7.1 is a patch release focusing on tool calling support, NIXL performance improvements, and preprocessing fixes. This release significantly expands function calling capabilities with new tool parsers for DeepSeek V3/R1 models and XML Coder format, improves NIXL concurrency and byte handling for better distributed inference performance, and fixes a critical preprocessor issue with stop token handling.

Base Branch: release/0.7.0.post1

Full Changelog

Performance and Framework Support

NIXL Byte Handling: Refactored how bytes are passed to NIXL in the nixl_connect module (#4860) to improve memory handling efficiency and compatibility with NIXL's native byte processing requirements for distributed KV cache transfers.
NIXL Concurrency Improvements: Enhanced concurrency support in the nixl_connect module (#4862) to enable better parallel processing of NIXL operations, improving throughput for disaggregated inference workloads with multiple concurrent requests.

Tool Calling Support

DeepSeek V3/R1 Tool Parser: Added toolcall parser support for DeepSeek V3 and DeepSeek R1 models (#4861) enabling function calling capabilities with these popular open-weight reasoning models for agentic workflows and structured output generation.
XML Coder Tool Parser: Implemented XML Coder tool parser format (#4859) providing an additional function calling format option for models that use XML-based tool definitions and responses.
Tool Call Configuration Types: Refactored tool call configuration with new config types (#4857) improving type safety, validation, and extensibility of tool calling configuration options across supported models and parsers.

Bug Fixes

Preprocessor Stop Field: Fixed preprocessor to properly populate the "stop" field in request handling (#4858) ensuring stop sequences are correctly propagated through the inference pipeline and models properly terminate generation at specified stop tokens.
min_tokens with ignore_eos: Fixed an issue where setting ignore_eos=true would automatically override min_tokens to equal max_tokens (#4908) ensuring users can continue generation past the EOS token without being forced to generate the maximum number of tokens.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Dynamo v0.7.1

Choose a tag to compare

Sorry, something went wrong.

Sorry, something went wrong.

Uh oh!

No results found

Dynamo v0.7.1 - Release Notes

Summary

Full Changelog

Performance and Framework Support

Tool Calling Support

Bug Fixes

Uh oh!