Key Features
- All Providers: OpenAI, Anthropic, Google, Mistral, Cohere, and any LiteLLM-compatible endpoint
- Python Tools: Native Python integration for custom tools and Skills (see the sketch after this list)
- Fast Startup: ~50ms cold start, ~10ms warm start
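To make the Python tool integration concrete, the sketch below shows the general pattern using LiteLLM's OpenAI-style tool calling rather than Agno Runtime's own registration API (which is not shown here): an ordinary Python function is described with a JSON schema and executed when the model requests it. The `get_weather` function, its schema, and the model name are all hypothetical.

```python
import json

import litellm


# An ordinary Python function exposed as a tool (hypothetical example).
def get_weather(city: str) -> str:
    return f"Sunny and 22°C in {city}"  # stub; a real tool would call an API


# OpenAI-style schema that tells the model how to call the function.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Paris?"}]
response = litellm.completion(model="gpt-4o", messages=messages, tools=tools)

# If the model requested the tool, run the Python function with its arguments.
for call in response.choices[0].message.tool_calls or []:
    if call.function.name == "get_weather":
        args = json.loads(call.function.arguments)
        print(get_weather(**args))
```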
Capabilities
| Feature | Support |
|---|---|
| Streaming | ✅ Yes |
| Tool Calling | ✅ Yes |
| MCP Servers | ✅ Yes |
| History | 100 messages |
| Cancellation | ✅ Yes |
| Custom Tools | ✅ Yes |
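Streaming and cancellation can be illustrated with LiteLLM's Python client directly; this is a minimal sketch, not Agno Runtime's own API. The model name is an example, and the early `break` stands in for whatever cancellation signal your application uses:

```python
import litellm

# Request a streamed response; chunks arrive as OpenAI-style deltas.
stream = litellm.completion(
    model="gpt-4o",  # any LiteLLM-routable model string
    messages=[{"role": "user", "content": "Explain streaming in one paragraph."}],
    stream=True,
)

received = ""
for chunk in stream:
    delta = chunk.choices[0].delta.content or ""
    print(delta, end="", flush=True)
    received += delta
    if len(received) > 500:  # stand-in for a real cancellation signal
        break  # stopping iteration abandons the rest of the stream
```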
Supported Models
Agno Runtime supports any model available through LiteLLM. The runtime itself imposes no restrictions; model support is determined by your LiteLLM proxy configuration. Commonly used providers:

- OpenAI: GPT-4, GPT-4 Turbo, GPT-4o, GPT-3.5 Turbo
- Anthropic: Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Haiku
- Google: Gemini Pro, Gemini Flash
- Mistral AI: Mistral Large, Mistral Medium
- Cohere: Command, Command R+
- Any LiteLLM-compatible endpoint
Choose a model via the `model_id` parameter or the `LITELLM_DEFAULT_MODEL` environment variable (see the sketch below).
Learn more: LiteLLM Supported Models
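As a minimal sketch of that selection, assuming direct use of LiteLLM's Python client (the default model string and the prompt are examples only):

```python
import os

import litellm

# Fall back to the environment variable when no explicit model is given,
# mirroring the model_id / LITELLM_DEFAULT_MODEL behaviour described above.
model_id = os.getenv("LITELLM_DEFAULT_MODEL", "gpt-4o")  # default is an example

response = litellm.completion(
    model=model_id,  # e.g. "gpt-4o" or "anthropic/claude-3-5-sonnet-20241022"
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```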
When to Use
Best for:

- Multi-provider flexibility
- Cost optimization across providers
- General-purpose agents (support, Q&A, workflows)
- Organizations avoiding vendor lock-in

Consider Claude Code Runtime instead for:

- Code-intensive tasks
- Extended conversations (200+ messages)
Trade-offs
Pros:

- ✅ All LLM providers supported
- ✅ Mix models from different providers
- ✅ Python ecosystem integration
- ✅ Fast startup time

Cons:

- ❌ 100-message history limit (vs. 200 for Claude Code Runtime)
- ❌ Not code-specialized
- ❌ Requires a LiteLLM proxy