The Model-Agnostic Advantage
Unlike platforms tied to specific AI providers, Kubiya separates intelligence from execution. Switch models based on your needs (a worked example appears under Example: Model Selection by Use Case below):
- Use GPT-4 for complex reasoning tasks
- Use Claude for detailed analysis and planning
- Use local models for sensitive data processing
- Use specialized models for domain-specific tasks

Supported AI Providers
Commercial Models
- OpenAI: GPT-4, GPT-4 Turbo, and GPT-3.5 Turbo, with function calling and structured outputs
- Anthropic: Claude 3.5 Sonnet, Claude 3 Opus, and Claude 3 Haiku, with tool use capabilities
- Google: Gemini Pro and Gemini Ultra, with multimodal understanding
- Microsoft: Azure OpenAI Service, with enterprise features and compliance
Open Source & Self-Hosted
- Together AI: Llama 2, CodeLlama, and other open-source models
- Groq: ultra-fast inference for real-time automation
- Ollama: run Llama, Mistral, and other models locally
- vLLM: high-performance serving for production deployments
How AI Models Work with Kubiya
Workflow Generation Process
AI models don’t execute operations directly. Instead, they generate structured workflows that runners execute deterministically:

1. Context Analysis: AI analyzes your request along with infrastructure context from the context graph.
2. Tool Selection: Based on available integrations, AI chooses appropriate tools for the task.
3. Workflow Generation: AI creates a structured, deterministic workflow with proper error handling and safety checks.
4. Human Review: Complex or sensitive workflows can require approval before execution.
5. Deterministic Execution: Runners execute the workflow using serverless tools, with a full audit trail.
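To make step 3 concrete, here is a sketch of what a generated workflow might look like. The schema shown (field and step names) is illustrative, not Kubiya's published format:

```python
# Illustrative only: the field names below are a guess at the shape of a
# generated workflow, not Kubiya's published schema.
generated_workflow = {
    "name": "restart-payments-service",
    "steps": [
        {"id": "check-health", "tool": "kubectl",
         "args": ["get", "pods", "-n", "payments"]},
        {"id": "restart", "tool": "kubectl",
         "args": ["rollout", "restart", "deployment/payments"],
         "depends_on": ["check-health"]},
        {"id": "verify", "tool": "kubectl",
         "args": ["rollout", "status", "deployment/payments"],
         "depends_on": ["restart"], "on_failure": "rollback"},
    ],
    "approval_required": True,       # step 4: human review for sensitive changes
    "audit": {"log_level": "full"},  # step 5: full audit trail on execution
}
```

Because the workflow is a plain data structure rather than free-form model output, runners can validate it, require approval, and replay it deterministically.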
Example: Model Selection by Use Case
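For instance, pulling the guidance above together (the configuration shape here is illustrative, not Kubiya's actual SDK):

```python
# Hypothetical mapping -- the configuration keys are illustrative, not Kubiya's API.
MODEL_BY_USE_CASE = {
    # Complex, multi-step reasoning such as incident diagnosis
    "incident_diagnosis": {"provider": "openai",    "model": "gpt-4"},
    # Detailed analysis and planning documents
    "runbook_authoring":  {"provider": "anthropic", "model": "claude-3-opus-20240229"},
    # Routine, latency-sensitive operations (see Response Time Optimization below)
    "status_checks":      {"provider": "openai",    "model": "gpt-3.5-turbo"},
    # Anything touching sensitive data stays on local infrastructure
    "pii_processing":     {"provider": "ollama",    "model": "llama3"},
}

def model_for(use_case: str) -> dict:
    """Resolve the model for a use case, defaulting to a fast, cheap model."""
    return MODEL_BY_USE_CASE.get(use_case, {"provider": "openai", "model": "gpt-3.5-turbo"})
```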
Model Configuration
API-Based Models
Configure cloud-based AI services:
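A sketch of what wiring up cloud providers might look like. The credential environment variables are the providers' standard ones; the settings object itself is a hypothetical stand-in for Kubiya's model configuration:

```python
import os

# Standard provider credential variables: OPENAI_API_KEY, ANTHROPIC_API_KEY,
# and Azure OpenAI's endpoint/key pair. The dict shape is illustrative only.
api_model_settings = {
    "openai": {
        "api_key": os.environ["OPENAI_API_KEY"],
        "model": "gpt-4-turbo",
        "timeout_s": 30,
    },
    "anthropic": {
        "api_key": os.environ["ANTHROPIC_API_KEY"],
        "model": "claude-3-5-sonnet-latest",
        "timeout_s": 30,
    },
    "azure_openai": {
        # Azure OpenAI uses a per-resource endpoint plus a deployment name.
        "endpoint": os.environ["AZURE_OPENAI_ENDPOINT"],
        "api_key": os.environ["AZURE_OPENAI_API_KEY"],
        "deployment": "gpt-4",
    },
}
```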
Self-Hosted Models
Run models on your own infrastructure:
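For self-hosted serving, both Ollama and vLLM expose local HTTP endpoints (Ollama listens on port 11434 by default; vLLM serves an OpenAI-compatible API, typically on port 8000). The settings shape is again hypothetical:

```python
# Default local endpoints for two common self-hosted servers.
# The dict shape is illustrative, not Kubiya's actual configuration format.
self_hosted_settings = {
    "ollama": {
        "base_url": "http://localhost:11434",
        "model": "llama3",  # pulled beforehand with `ollama pull llama3`
    },
    "vllm": {
        # vLLM speaks the OpenAI chat-completions protocol, so any
        # OpenAI-compatible client can point at it.
        "base_url": "http://localhost:8000/v1",
        "model": "mistralai/Mistral-7B-Instruct-v0.2",
    },
}
```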
Advanced AI Features
Function Calling & Tool Use
Modern AI models can call functions and use tools directly:
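For example, with OpenAI's function-calling API, a tool is declared as a JSON schema and the model returns a structured call rather than free text. The `restart_service` tool here is a made-up example; the schema format is OpenAI's standard one:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# A made-up tool definition in OpenAI's standard tools schema.
tools = [{
    "type": "function",
    "function": {
        "name": "restart_service",
        "description": "Restart a Kubernetes deployment in a given namespace",
        "parameters": {
            "type": "object",
            "properties": {
                "namespace": {"type": "string"},
                "deployment": {"type": "string"},
            },
            "required": ["namespace", "deployment"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[{"role": "user", "content": "Restart the payments service in prod"}],
    tools=tools,
)

# The model responds with a structured tool call instead of prose.
print(response.choices[0].message.tool_calls)
```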
Structured Output Generation
AI models generate properly formatted workflows:
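One way to enforce this is OpenAI's JSON mode, which guarantees syntactically valid JSON output. The workflow shape requested in the prompt is our own illustrative schema:

```python
from openai import OpenAI

client = OpenAI()

# JSON mode guarantees valid JSON; the name/steps schema we ask for is
# an illustrative workflow shape, not Kubiya's published format.
response = client.chat.completions.create(
    model="gpt-4-turbo",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system",
         "content": "Return a JSON object with keys 'name' and 'steps' "
                    "(a list of {id, tool, args}) describing the workflow."},
        {"role": "user", "content": "Scale the api deployment to 5 replicas"},
    ],
)
print(response.choices[0].message.content)  # valid JSON, parseable downstream
```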
Model Context Protocol (MCP)
Kubiya supports the Model Context Protocol for seamless AI agent integration (a minimal server sketch follows this list):
- Claude Desktop with Kubiya context
- ChatGPT with workflow execution capabilities
- Custom agents built with LangChain or similar frameworks
- Enterprise copilots with infrastructure automation
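Below is a minimal MCP server sketch using the FastMCP helper from the official MCP Python SDK (`pip install mcp`). The `run_workflow` tool body is a hypothetical stand-in for a real Kubiya workflow trigger:

```python
from mcp.server.fastmcp import FastMCP

# Minimal MCP server exposing one tool to MCP clients such as Claude Desktop.
mcp = FastMCP("kubiya-automation")

@mcp.tool()
def run_workflow(name: str) -> str:
    """Trigger a named automation workflow and return its status."""
    # Hypothetical: a real server would call Kubiya's API here.
    return f"workflow '{name}' queued for execution"

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default, which is what Claude Desktop expects
```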
Model Performance & Optimization
Response Time Optimization
Different models excel at different tasks. For real-time operations:
- GPT-3.5 Turbo: ~2-3 seconds response time
- Claude Haiku: Fastest Claude model for simple tasks
- Open-weight models served on Groq: sub-second inference
- Good for: Status checks, simple deployments, routine operations
Cost Optimization Strategies
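A common pattern, sketched here generically rather than as Kubiya-specific behavior, is a model cascade: route to a cheap model first and escalate to an expensive one only when needed:

```python
# Generic cost-optimization sketch (not Kubiya-specific): try the cheap model
# first and escalate only when it cannot produce a confident answer.
CHEAP, EXPENSIVE = "gpt-3.5-turbo", "gpt-4"

def cascade(prompt: str, ask) -> str:
    """`ask(model, prompt)` is caller-supplied and returns (answer, confident)."""
    answer, confident = ask(CHEAP, prompt)
    if confident:
        return answer                   # most routine requests stop here
    answer, _ = ask(EXPENSIVE, prompt)  # pay for the big model only when needed
    return answer
```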
Quality Assurance
Ensure consistent AI performance across models:
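One generic approach, sketched below, is to replay a fixed set of golden prompts against each configured model and compare the generated workflows to expected outputs (`generate` is a caller-supplied function, not a Kubiya API):

```python
# Generic QA sketch: replay golden prompts against each model and check that
# the generated workflow matches the expected one.
GOLDEN_CASES = [
    ("Restart the payments service", {"tool": "kubectl", "action": "rollout restart"}),
    ("Scale api to 5 replicas",      {"tool": "kubectl", "action": "scale"}),
]

def evaluate(models: list[str], generate) -> dict[str, float]:
    """Return the pass rate per model; `generate(model, prompt)` yields a workflow dict."""
    scores = {}
    for model in models:
        passed = sum(generate(model, prompt) == expected
                     for prompt, expected in GOLDEN_CASES)
        scores[model] = passed / len(GOLDEN_CASES)
    return scores
```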
Security & Privacy
Data Handling Policies
Control what data AI models can access:
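A simple illustration of masking data before a prompt leaves your environment; the patterns here are examples, not an exhaustive policy:

```python
import re

# Example-only patterns: mask obvious secrets and PII before a prompt is sent
# to a cloud-hosted model. A real policy would cover far more than this.
PATTERNS = {
    "email":   re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "aws_key": re.compile(r"AKIA[0-9A-Z]{16}"),
}

def mask(prompt: str) -> str:
    """Replace matches with typed placeholders so context is kept but data is not."""
    for label, pattern in PATTERNS.items():
        prompt = pattern.sub(f"<{label}-redacted>", prompt)
    return prompt

print(mask("Notify ops@example.com, key AKIAABCDEFGHIJKLMNOP"))
```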
Audit & Compliance
Track all AI model interactions (a minimal logging sketch follows this list):
- Prompt logging: Record all inputs sent to AI models
- Response tracking: Store AI-generated workflows and decisions
- Model attribution: Track which model generated each workflow
- Cost tracking: Monitor AI API usage and costs by team/project
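Sketched generically, a single audit record might capture all four of these dimensions:

```python
import json
import time

def audit_record(model: str, prompt: str, workflow: dict,
                 team: str, cost_usd: float) -> str:
    """Build one audit-log entry covering the four tracking dimensions above."""
    return json.dumps({
        "timestamp": time.time(),
        "model": model,        # model attribution
        "prompt": prompt,      # prompt logging
        "workflow": workflow,  # response tracking
        "team": team,          # cost tracking by team/project
        "cost_usd": cost_usd,
    })

print(audit_record("gpt-4", "restart payments", {"steps": []}, "platform", 0.012))
```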
Privacy Notice: When using cloud-based AI models, your prompts and context may be processed by third-party services. Use local models or data masking for sensitive information.