Multi-model execution engine supporting all major LLM providers. Built on the Agno framework with LiteLLM integration for maximum flexibility.

Key Features

  • All Providers: OpenAI, Anthropic, Google, Mistral, Cohere, and any LiteLLM-compatible endpoint
  • Python Tools: native Python integration for custom tools and Skills (see the sketch below)
  • Fast Startup: ~50ms cold start, ~10ms warm start
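
Custom tools are plain Python functions. A minimal sketch of the pattern, assuming Agno's Agent accepts functions directly in its tools list and that the LiteLLM model class lives at agno.models.litellm (import paths may differ across Agno versions):

```python
from agno.agent import Agent
from agno.models.litellm import LiteLLM  # assumed import path; verify against your Agno version

def count_words(text: str) -> int:
    """Count the words in a piece of text."""
    # Agno reads the signature and docstring to describe the tool to the model.
    return len(text.split())

agent = Agent(
    model=LiteLLM(id="gpt-4o"),  # any LiteLLM-compatible model id
    tools=[count_words],         # plain functions register as callable tools
)
agent.print_response("How many words are in 'the quick brown fox'?")
```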

Capabilities

Feature      | Support
------------ | ------------
Streaming    | ✅ Yes
Tool Calling | ✅ Yes
MCP Servers  | ✅ Yes
History      | 100 messages
Cancellation | ✅ Yes
Custom Tools | ✅ Yes
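
Because LiteLLM normalizes every provider to OpenAI-style responses, streaming looks the same whichever model you run. A minimal sketch using the litellm library directly (the model id is illustrative; any supported provider works):

```python
import litellm

# stream=True yields chunks with OpenAI-style deltas, regardless of provider.
for chunk in litellm.completion(
    model="anthropic/claude-3-5-sonnet-20241022",
    messages=[{"role": "user", "content": "Explain token streaming in one sentence."}],
    stream=True,
):
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```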

Supported Models

Agno Runtime supports any model available through LiteLLM. The runtime itself imposes no restrictions; model support is determined by your LiteLLM proxy configuration. Commonly used providers:
  • OpenAI: GPT-4, GPT-4 Turbo, GPT-4o, GPT-3.5 Turbo
  • Anthropic: Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Haiku
  • Google: Gemini Pro, Gemini Flash
  • Mistral AI: Mistral Large, Mistral Medium
  • Cohere: Command, Command R+
  • Any LiteLLM-compatible endpoint
Model selection: configure via the model_id parameter or the LITELLM_DEFAULT_MODEL environment variable. Learn more: LiteLLM Supported Models
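
A sketch of how the two selection knobs might combine, assuming an explicit model_id takes precedence over the environment variable (the docs name both options; the precedence order and the resolve_model helper here are illustrative):

```python
import os
import litellm

def resolve_model(model_id: str | None = None) -> str:
    # Assumed precedence for illustration: explicit model_id first, then the
    # LITELLM_DEFAULT_MODEL environment variable, then a hard-coded fallback.
    return model_id or os.environ.get("LITELLM_DEFAULT_MODEL", "gpt-4o")

response = litellm.completion(
    model=resolve_model(),
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```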

When to Use

Best for:
  • Multi-provider flexibility
  • Cost optimization across providers
  • General-purpose agents (support, Q&A, workflows)
  • Organizations avoiding vendor lock-in
Consider alternatives: for code-specialized agents or conversations beyond the 100-message history limit, a code-focused runtime such as Claude Code may be a better fit (see Trade-offs below).

Trade-offs

Pros:
  • ✅ All LLM providers supported
  • ✅ Mix models from different providers
  • ✅ Python ecosystem integration
  • ✅ Fast startup time
Cons:
  • ❌ 100-message history limit (vs 200 for Claude Code)
  • ❌ Not code-specialized
  • ❌ Requires a LiteLLM proxy (see the sketch below)
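
"Requires a LiteLLM proxy" means requests are routed through a proxy server rather than sent to providers directly. The proxy exposes an OpenAI-compatible API, so a standard OpenAI client can talk to it. A sketch assuming a proxy running at its default port 4000 (the key is a placeholder for whatever your proxy is configured to accept):

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:4000",  # LiteLLM proxy's default address
    api_key="sk-proxy-key",            # placeholder; use your proxy's configured key
)
response = client.chat.completions.create(
    model="gpt-4o",  # must match a model_name in the proxy's model_list
    messages=[{"role": "user", "content": "ping"}],
)
print(response.choices[0].message.content)
```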


Next Steps