Kubiya’s model-agnostic design means you’re never locked into a single AI provider. Choose from leading commercial models, open-source alternatives, or even run your own models on-premises while Kubiya handles the orchestration and execution.
## The Model-Agnostic Advantage

Unlike platforms tied to specific AI providers, Kubiya separates intelligence from execution. That separation gives you three things: **flexibility**, **cost optimization**, and **security & privacy**.

Switch models based on your needs:

- Use GPT-4 for complex reasoning tasks
- Use Claude for detailed analysis and planning
- Use local models for sensitive data processing
- Use specialized models for domain-specific tasks
## Supported AI Providers

### Commercial Models

- **OpenAI**: GPT-4, GPT-4 Turbo, and GPT-3.5 Turbo with function calling and structured outputs
- **Anthropic**: Claude 3.5 Sonnet, Claude 3 Opus, and Claude 3 Haiku with tool use capabilities
- **Google**: Gemini Pro and Gemini Ultra with multimodal understanding
- **Microsoft**: Azure OpenAI Service with enterprise features and compliance

### Open Source & Self-Hosted

- **Together AI**: Llama 2, CodeLlama, and other open-source models
- **Groq**: Ultra-fast inference for real-time automation
- **Ollama**: Run Llama, Mistral, and other models locally
- **vLLM**: High-performance serving for production deployments
## How AI Models Work with Kubiya

### Workflow Generation Process

AI models don’t execute operations directly. Instead, they generate structured workflows that runners execute deterministically:

1. **Context Analysis**: AI analyzes your request along with infrastructure context from the context graph
2. **Tool Selection**: Based on available integrations, AI chooses appropriate tools for the task
3. **Workflow Generation**: AI creates a structured, deterministic workflow with proper error handling and safety checks
4. **Human Review**: Complex or sensitive workflows can require approval before execution
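
To make the request-review-execute split concrete, here is a minimal sketch of what driving this process from the SDK might look like. The `client.workflows.generate` and `client.workflows.execute` calls and their parameters are hypothetical, used only to illustrate the flow; consult the SDK reference for the actual API.

```python
from kubiya import Kubiya

client = Kubiya()

# Hypothetical call: ask the AI to generate (but not run) a workflow
# from a natural-language request plus the current context graph.
draft = client.workflows.generate(
    prompt="Restart the checkout service and verify it is healthy",
    model="claude-3-5-sonnet",   # any configured provider/model
    require_approval=True,       # route through human review first
)

# The result is a structured workflow, not an executed action.
print(draft.steps)

# Execution happens separately, on a runner, after review.
client.workflows.execute(draft.id)
```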
### Example: Model Selection by Use Case

```yaml
# Different models for different workflow types
model_selection:
  incident_response:
    provider: anthropic
    model: claude-3-sonnet
    reason: "Excellent at analyzing logs and suggesting solutions"
  routine_deployments:
    provider: openai
    model: gpt-3.5-turbo
    reason: "Cost-effective for standard operations"
  sensitive_operations:
    provider: local
    model: llama-3-8b
    reason: "Keep sensitive context on-premises"
  complex_troubleshooting:
    provider: openai
    model: gpt-4-turbo
    reason: "Best reasoning for complex multi-system issues"
```
## Model Configuration

### API-Based Models

Configure cloud-based AI services:

```yaml
# OpenAI configuration
ai_providers:
  openai:
    api_key: "${OPENAI_API_KEY}"
    organization: "org-xyz123"
    models:
      - gpt-4-turbo
      - gpt-3.5-turbo
    default_model: gpt-4-turbo
  anthropic:
    api_key: "${ANTHROPIC_API_KEY}"
    models:
      - claude-3-5-sonnet
      - claude-3-haiku
    default_model: claude-3-5-sonnet
  azure_openai:
    endpoint: "https://mycompany.openai.azure.com"
    api_key: "${AZURE_OPENAI_KEY}"
    deployment_name: "gpt-4-deployment"
```
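
The `${...}` placeholders keep secrets out of the file and are resolved from the environment at load time. As a minimal sketch of that pattern (using the standard library's `os.path.expandvars`; the filename is hypothetical, and how Kubiya itself resolves placeholders is not shown here):

```python
import os
import yaml

# Expand ${VAR} references against the process environment before parsing.
with open("ai_providers.yaml") as f:
    config = yaml.safe_load(os.path.expandvars(f.read()))

openai_key = config["ai_providers"]["openai"]["api_key"]
# expandvars leaves unknown ${VAR} references untouched, so this catches a missing env var.
assert not openai_key.startswith("${"), "OPENAI_API_KEY is not set"
```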
### Self-Hosted Models

Run models on your own infrastructure:

```yaml
# Local model configuration
ai_providers:
  local_llama:
    type: ollama
    endpoint: "http://ollama-service:11434"
    model: "llama3:8b"
  vllm_cluster:
    type: openai_compatible
    endpoint: "http://vllm-cluster:8000/v1"
    model: "meta-llama/Llama-2-13b-chat-hf"
    api_key: "not-required"
  azure_ml:
    type: azure_ml
    endpoint: "${AZURE_ML_ENDPOINT}"
    api_key: "${AZURE_ML_KEY}"
    deployment_name: "custom-model-deployment"
```
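
Because vLLM exposes an OpenAI-compatible API, you can smoke-test the endpoint above with the standard `openai` Python client before pointing Kubiya at it. The endpoint and model name below are the example values from the config:

```python
from openai import OpenAI

# vLLM speaks the OpenAI API; any non-empty string works as the key.
client = OpenAI(base_url="http://vllm-cluster:8000/v1", api_key="not-required")

resp = client.chat.completions.create(
    model="meta-llama/Llama-2-13b-chat-hf",
    messages=[{"role": "user", "content": "Reply with 'ok' if you can hear me."}],
    max_tokens=5,
)
print(resp.choices[0].message.content)
```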
## Advanced AI Features

### Function Calling & Tool Use

Modern AI models can call functions and use tools directly:
```python
# AI model with tool access
from kubiya import Kubiya

client = Kubiya()

# Models can directly query context and execute workflows
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_cluster_status",
            "description": "Get current status of a Kubernetes cluster",
            "parameters": {
                "type": "object",
                "properties": {
                    "cluster_name": {"type": "string"},
                    "namespace": {"type": "string"},
                },
                "required": ["cluster_name"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "scale_deployment",
            "description": "Scale a Kubernetes deployment",
            "parameters": {
                "type": "object",
                "properties": {
                    "deployment": {"type": "string"},
                    "replicas": {"type": "integer"},
                    "namespace": {"type": "string"},
                },
                "required": ["deployment", "replicas", "namespace"],
            },
        },
    },
]

response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[{
        "role": "user",
        "content": "The payment service seems slow. Can you investigate and scale if needed?"
    }],
    tools=tools,
)
```
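
Assuming the Kubiya client mirrors the OpenAI chat-completions response shape (as the snippet above implies), the model's tool calls can be dispatched like this. The handler functions are the illustrative ones defined above, not a built-in registry:

```python
import json

def get_cluster_status(cluster_name, namespace=None):
    ...  # placeholder: query the cluster here

def scale_deployment(deployment, replicas, namespace):
    ...  # placeholder: scale the deployment here

handlers = {
    "get_cluster_status": get_cluster_status,
    "scale_deployment": scale_deployment,
}

# Tool-call arguments arrive as a JSON string, keyed by function name.
message = response.choices[0].message
for call in message.tool_calls or []:
    args = json.loads(call.function.arguments)
    result = handlers[call.function.name](**args)
```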
### Structured Output Generation

AI models generate properly formatted workflows:

```json
{
  "workflow": {
    "name": "investigate-payment-service",
    "steps": [
      {
        "name": "check-current-status",
        "tool": "kubectl",
        "args": ["get", "deployment", "payment-service", "-o", "json"],
        "namespace": "production"
      },
      {
        "name": "analyze-metrics",
        "tool": "datadog-query",
        "query": "avg:kubernetes.cpu.usage{service:payment}",
        "timeframe": "1h"
      },
      {
        "name": "scale-if-needed",
        "tool": "kubectl",
        "args": ["scale", "deployment", "payment-service", "--replicas=5"],
        "condition": "${analyze-metrics.cpu_usage} > 80"
      }
    ]
  }
}
```
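
Because the output is structured rather than free-form, it can be validated before anything runs. Here is a minimal sketch using the `jsonschema` package; the schema is an illustrative subset, not Kubiya's actual workflow schema:

```python
import json
from jsonschema import validate

# raw_output stands in for the model's JSON text, e.g. the document above.
raw_output = '{"workflow": {"name": "investigate-payment-service", "steps": [{"name": "check-current-status", "tool": "kubectl"}]}}'
generated_workflow = json.loads(raw_output)

workflow_schema = {
    "type": "object",
    "required": ["workflow"],
    "properties": {
        "workflow": {
            "type": "object",
            "required": ["name", "steps"],
            "properties": {
                "name": {"type": "string"},
                "steps": {
                    "type": "array",
                    "minItems": 1,
                    "items": {"type": "object", "required": ["name", "tool"]},
                },
            },
        }
    },
}

# Raises jsonschema.ValidationError if the generated document is malformed.
validate(instance=generated_workflow, schema=workflow_schema)
```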
### Model Context Protocol (MCP)

Kubiya supports the Model Context Protocol for seamless AI agent integration:

```bash
# Start MCP server for external AI agents
kubiya mcp serve --port 3000

# External AI agents can now:
# - Query your infrastructure context
# - Generate and execute workflows
# - Access your integrations securely
# - Use Kubiya's audit and compliance features
```

This enables AI agents such as:

- **Claude Desktop** with Kubiya context
- **ChatGPT** with workflow execution capabilities
- **Custom agents** built with LangChain or similar frameworks
- **Enterprise copilots** with infrastructure automation
## Response Time Optimization

Different models excel at different tasks.

**Fast models** suit real-time operations:

- GPT-3.5 Turbo: ~2-3 seconds response time
- Claude Haiku: fastest Claude model for simple tasks
- Open models served by Groq: sub-second inference

Good for: status checks, simple deployments, routine operations.

**Powerful models** such as GPT-4 Turbo and Claude Opus trade latency for stronger reasoning, making them the better fit for complex, multi-system tasks.
## Cost Optimization Strategies

Route each request to the cheapest model that can handle it:

```yaml
# Intelligent model routing based on complexity
routing_rules:
  - condition: "workflow_complexity == 'simple'"
    model: "gpt-3.5-turbo"
  - condition: "incident_severity >= 'critical'"
    model: "gpt-4-turbo"
  - condition: "contains_sensitive_data == true"
    model: "local_llama"
  - condition: "user_tier == 'premium'"
    model: "claude-3-opus"
default_model: "gpt-3.5-turbo"
```
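
First-match-wins evaluation of such rules is straightforward. The sketch below is illustrative only: it assumes each request arrives as a dict of the flags used in the conditions, and it resolves conditions through a lookup table rather than a full expression language.

```python
def route_model(request: dict, rules: list[dict], default: str) -> str:
    """Return the model of the first rule whose condition the request satisfies."""
    checks = {
        "workflow_complexity == 'simple'": lambda r: r.get("workflow_complexity") == "simple",
        "incident_severity >= 'critical'": lambda r: r.get("incident_severity") == "critical",
        "contains_sensitive_data == true": lambda r: r.get("contains_sensitive_data") is True,
        "user_tier == 'premium'": lambda r: r.get("user_tier") == "premium",
    }
    for rule in rules:
        check = checks.get(rule["condition"])
        if check and check(request):
            return rule["model"]
    return default

# The sensitive-data rule fires before the premium-tier rule is reached.
print(route_model(
    {"contains_sensitive_data": True, "user_tier": "premium"},
    rules=[
        {"condition": "contains_sensitive_data == true", "model": "local_llama"},
        {"condition": "user_tier == 'premium'", "model": "claude-3-opus"},
    ],
    default="gpt-3.5-turbo",
))  # -> local_llama
```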
## Quality Assurance

Ensure consistent AI performance across models:

```python
# Model quality testing
test_scenarios = [
    {
        "input": "Scale the frontend service to handle increased traffic",
        "expected_tools": ["kubectl", "monitoring-check"],
        "expected_safety_checks": ["resource_limits", "rollback_plan"],
    },
    {
        "input": "Investigate database connection issues",
        "expected_tools": ["database-client", "network-test", "log-analyzer"],
        "required_context": ["database_endpoints", "connection_pools"],
    },
]

for scenario in test_scenarios:
    for model in ["gpt-4", "claude-3", "local-llama"]:
        result = test_model_quality(model, scenario)
        assert_workflow_quality(result, scenario)  # compare against the scenario's expectations
```
## Security & Privacy

### Data Handling Policies

Control what data AI models can access:

```yaml
# Data classification and model access rules
data_policies:
  public:
    allowed_models: ["openai/*", "anthropic/*", "google/*"]
  internal:
    allowed_models: ["azure_openai/*", "local/*"]
  confidential:
    allowed_models: ["local/*"]
    data_masking: true
  restricted:
    allowed_models: ["air_gapped_local/*"]
    no_cloud_access: true
```
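
Enforcing these wildcard rules takes only a glob match. A minimal sketch, assuming model identifiers follow the `provider/model` form used above; this is not Kubiya's enforcement code:

```python
from fnmatch import fnmatch

data_policies = {
    "public": ["openai/*", "anthropic/*", "google/*"],
    "internal": ["azure_openai/*", "local/*"],
    "confidential": ["local/*"],
    "restricted": ["air_gapped_local/*"],
}

def model_allowed(classification: str, model_id: str) -> bool:
    """Check a 'provider/model' id against the allow-list for a data class."""
    return any(fnmatch(model_id, pattern) for pattern in data_policies[classification])

print(model_allowed("confidential", "openai/gpt-4-turbo"))  # False
print(model_allowed("confidential", "local/llama3:8b"))     # True
```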
### Audit & Compliance

Track all AI model interactions:

- **Prompt logging**: Record all inputs sent to AI models
- **Response tracking**: Store AI-generated workflows and decisions
- **Model attribution**: Track which model generated each workflow
- **Cost tracking**: Monitor AI API usage and costs by team/project

> **Privacy Notice**: When using cloud-based AI models, your prompts and context may be processed by third-party services. Use local models or data masking for sensitive information.
## What’s Next?

AI models generate the intelligence, but workflows provide the structure and safety guardrails that make automation reliable in production. Learn how AI-generated workflows become deterministic, auditable operations.