Kubiya’s model-agnostic design means you’re never locked into a single AI provider. Choose from leading commercial models, open-source alternatives, or even run your own models on-premises while Kubiya handles the orchestration and execution.
## The Model-Agnostic Advantage

Unlike platforms tied to specific AI providers, Kubiya separates intelligence from execution. That separation gives you three things: **flexibility**, **cost optimization**, and **security & privacy**.

Switch models based on your needs:

- Use GPT-4 for complex reasoning tasks
- Use Claude for detailed analysis and planning
- Use local models for sensitive data processing
- Use specialized models for domain-specific tasks
## Supported AI Providers

### Commercial Models

- **OpenAI**: GPT-4, GPT-4 Turbo, and GPT-3.5 Turbo with function calling and structured outputs
- **Anthropic**: Claude 3.5 Sonnet, Claude 3 Opus, and Claude 3 Haiku with tool use capabilities
- **Google**: Gemini Pro and Gemini Ultra with multimodal understanding
- **Microsoft**: Azure OpenAI Service with enterprise features and compliance

### Open Source & Self-Hosted

- **Together AI**: Llama 2, CodeLlama, and other open-source models
- **Groq**: Ultra-fast inference for real-time automation
- **Ollama**: Run Llama, Mistral, and other models locally
- **vLLM**: High-performance serving for production deployments
## How AI Models Work with Kubiya

### Workflow Generation Process

AI models don’t execute operations directly. Instead, they generate structured workflows that runners execute deterministically:

1. **Context Analysis**: AI analyzes your request along with infrastructure context from the context graph
2. **Tool Selection**: Based on available integrations, AI chooses appropriate tools for the task
3. **Workflow Generation**: AI creates a structured, deterministic workflow with proper error handling and safety checks
4. **Human Review**: Complex or sensitive workflows can require approval before execution
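
To make the request-review-execute split concrete, here is a minimal sketch of what driving this process from the SDK might look like. The `client.workflows.generate` and `client.workflows.execute` calls and their parameters are hypothetical, used only to illustrate the flow; consult the SDK reference for the actual API.

```python
from kubiya import Kubiya

client = Kubiya()

# Hypothetical call: ask the AI to generate (but not run) a workflow
# from a natural-language request plus the current context graph.
draft = client.workflows.generate(
    prompt="Restart the checkout service and verify it is healthy",
    model="claude-3-5-sonnet",   # any configured provider/model
    require_approval=True,       # route through human review first
)

# The result is a structured workflow, not an executed action.
print(draft.steps)

# Execution happens separately, on a runner, after review.
client.workflows.execute(draft.id)
```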
### Example: Model Selection by Use Case

```yaml
# Different models for different workflow types
model_selection:
  incident_response:
    provider: anthropic
    model: claude-3-sonnet
    reason: "Excellent at analyzing logs and suggesting solutions"
  routine_deployments:
    provider: openai
    model: gpt-3.5-turbo
    reason: "Cost-effective for standard operations"
  sensitive_operations:
    provider: local
    model: llama-3-8b
    reason: "Keep sensitive context on-premises"
  complex_troubleshooting:
    provider: openai
    model: gpt-4-turbo
    reason: "Best reasoning for complex multi-system issues"
```
## Model Configuration

### API-Based Models

Configure cloud-based AI services:

```yaml
# OpenAI configuration
ai_providers:
  openai:
    api_key: "${OPENAI_API_KEY}"
    organization: "org-xyz123"
    models:
      - gpt-4-turbo
      - gpt-3.5-turbo
    default_model: gpt-4-turbo
  anthropic:
    api_key: "${ANTHROPIC_API_KEY}"
    models:
      - claude-3-5-sonnet
      - claude-3-haiku
    default_model: claude-3-5-sonnet
  azure_openai:
    endpoint: "https://mycompany.openai.azure.com"
    api_key: "${AZURE_OPENAI_KEY}"
    deployment_name: "gpt-4-deployment"
```
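
The `${...}` placeholders keep secrets out of the file and are resolved from the environment at load time. As a minimal sketch of that pattern (using the standard library's `os.path.expandvars`; the filename is hypothetical, and how Kubiya itself resolves placeholders is not shown here):

```python
import os
import yaml

# Expand ${VAR} references against the process environment before parsing.
with open("ai_providers.yaml") as f:
    config = yaml.safe_load(os.path.expandvars(f.read()))

openai_key = config["ai_providers"]["openai"]["api_key"]
# expandvars leaves unknown ${VAR} references untouched, so this catches a missing env var.
assert not openai_key.startswith("${"), "OPENAI_API_KEY is not set"
```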
### Self-Hosted Models

Run models on your own infrastructure:

```yaml
# Local model configuration
ai_providers:
  local_llama:
    type: ollama
    endpoint: "http://ollama-service:11434"
    model: "llama3:8b"
  vllm_cluster:
    type: openai_compatible
    endpoint: "http://vllm-cluster:8000/v1"
    model: "meta-llama/Llama-2-13b-chat-hf"
    api_key: "not-required"
  azure_ml:
    type: azure_ml
    endpoint: "${AZURE_ML_ENDPOINT}"
    api_key: "${AZURE_ML_KEY}"
    deployment_name: "custom-model-deployment"
```
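
Because vLLM exposes an OpenAI-compatible API, you can smoke-test the endpoint above with the standard `openai` Python client before pointing Kubiya at it. The endpoint and model name below are the example values from the config:

```python
from openai import OpenAI

# vLLM speaks the OpenAI API; any non-empty string works as the key.
client = OpenAI(base_url="http://vllm-cluster:8000/v1", api_key="not-required")

resp = client.chat.completions.create(
    model="meta-llama/Llama-2-13b-chat-hf",
    messages=[{"role": "user", "content": "Reply with 'ok' if you can hear me."}],
    max_tokens=5,
)
print(resp.choices[0].message.content)
```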
## Advanced AI Features

### Function Calling & Tool Use

Modern AI models can call functions and use tools directly:
```python
# AI model with tool access
from kubiya import Kubiya

client = Kubiya()

# Models can directly query context and execute workflows
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_cluster_status",
            "description": "Get current status of a Kubernetes cluster",
            "parameters": {
                "type": "object",
                "properties": {
                    "cluster_name": {"type": "string"},
                    "namespace": {"type": "string"},
                },
                "required": ["cluster_name"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "scale_deployment",
            "description": "Scale a Kubernetes deployment",
            "parameters": {
                "type": "object",
                "properties": {
                    "deployment": {"type": "string"},
                    "replicas": {"type": "integer"},
                    "namespace": {"type": "string"},
                },
                "required": ["deployment", "replicas", "namespace"],
            },
        },
    },
]

response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[{
        "role": "user",
        "content": "The payment service seems slow. Can you investigate and scale if needed?"
    }],
    tools=tools,
)
```
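
Assuming the Kubiya client mirrors the OpenAI chat-completions response shape (as the snippet above implies), the model's tool calls can be dispatched like this. The handler functions are the illustrative ones defined above, not a built-in registry:

```python
import json

def get_cluster_status(cluster_name, namespace=None):
    ...  # placeholder: query the cluster here

def scale_deployment(deployment, replicas, namespace):
    ...  # placeholder: scale the deployment here

handlers = {
    "get_cluster_status": get_cluster_status,
    "scale_deployment": scale_deployment,
}

# Tool-call arguments arrive as a JSON string, keyed by function name.
message = response.choices[0].message
for call in message.tool_calls or []:
    args = json.loads(call.function.arguments)
    result = handlers[call.function.name](**args)
```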
### Structured Output Generation

AI models generate properly formatted workflows:

```json
{
  "workflow": {
    "name": "investigate-payment-service",
    "steps": [
      {
        "name": "check-current-status",
        "tool": "kubectl",
        "args": ["get", "deployment", "payment-service", "-o", "json"],
        "namespace": "production"
      },
      {
        "name": "analyze-metrics",
        "tool": "datadog-query",
        "query": "avg:kubernetes.cpu.usage{service:payment}",
        "timeframe": "1h"
      },
      {
        "name": "scale-if-needed",
        "tool": "kubectl",
        "args": ["scale", "deployment", "payment-service", "--replicas=5"],
        "condition": "${analyze-metrics.cpu_usage} > 80"
      }
    ]
  }
}
```
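
Because the output is structured rather than free-form, it can be validated before anything runs. Here is a minimal sketch using the `jsonschema` package; the schema is an illustrative subset, not Kubiya's actual workflow schema:

```python
import json
from jsonschema import validate

# raw_output stands in for the model's JSON text, e.g. the document above.
raw_output = '{"workflow": {"name": "investigate-payment-service", "steps": [{"name": "check-current-status", "tool": "kubectl"}]}}'
generated_workflow = json.loads(raw_output)

workflow_schema = {
    "type": "object",
    "required": ["workflow"],
    "properties": {
        "workflow": {
            "type": "object",
            "required": ["name", "steps"],
            "properties": {
                "name": {"type": "string"},
                "steps": {
                    "type": "array",
                    "minItems": 1,
                    "items": {"type": "object", "required": ["name", "tool"]},
                },
            },
        }
    },
}

# Raises jsonschema.ValidationError if the generated document is malformed.
validate(instance=generated_workflow, schema=workflow_schema)
```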
### Model Context Protocol (MCP)

Kubiya supports the Model Context Protocol for seamless AI agent integration:

```bash
# Start MCP server for external AI agents
kubiya mcp serve --port 3000

# External AI agents can now:
# - Query your infrastructure context
# - Generate and execute workflows
# - Access your integrations securely
# - Use Kubiya's audit and compliance features
```

This enables AI agents such as:

- **Claude Desktop** with Kubiya context
- **ChatGPT** with workflow execution capabilities
- **Custom agents** built with LangChain or similar frameworks
- **Enterprise copilots** with infrastructure automation
## Response Time Optimization

Different models excel at different tasks.

**Fast models** suit real-time operations:

- GPT-3.5 Turbo: ~2-3 seconds response time
- Claude Haiku: fastest Claude model for simple tasks
- Open models served by Groq: sub-second inference

Good for: status checks, simple deployments, routine operations.

**Powerful models** such as GPT-4 Turbo and Claude Opus trade latency for stronger reasoning, making them the better fit for complex, multi-system tasks.
## Cost Optimization Strategies

Route each request to the cheapest model that can handle it:

```yaml
# Intelligent model routing based on complexity
routing_rules:
  - condition: "workflow_complexity == 'simple'"
    model: "gpt-3.5-turbo"
  - condition: "incident_severity >= 'critical'"
    model: "gpt-4-turbo"
  - condition: "contains_sensitive_data == true"
    model: "local_llama"
  - condition: "user_tier == 'premium'"
    model: "claude-3-opus"
default_model: "gpt-3.5-turbo"
```
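
First-match-wins evaluation of such rules is straightforward. The sketch below is illustrative only: it assumes each request arrives as a dict of the flags used in the conditions, and it resolves conditions through a lookup table rather than a full expression language.

```python
def route_model(request: dict, rules: list[dict], default: str) -> str:
    """Return the model of the first rule whose condition the request satisfies."""
    checks = {
        "workflow_complexity == 'simple'": lambda r: r.get("workflow_complexity") == "simple",
        "incident_severity >= 'critical'": lambda r: r.get("incident_severity") == "critical",
        "contains_sensitive_data == true": lambda r: r.get("contains_sensitive_data") is True,
        "user_tier == 'premium'": lambda r: r.get("user_tier") == "premium",
    }
    for rule in rules:
        check = checks.get(rule["condition"])
        if check and check(request):
            return rule["model"]
    return default

# The sensitive-data rule fires before the premium-tier rule is reached.
print(route_model(
    {"contains_sensitive_data": True, "user_tier": "premium"},
    rules=[
        {"condition": "contains_sensitive_data == true", "model": "local_llama"},
        {"condition": "user_tier == 'premium'", "model": "claude-3-opus"},
    ],
    default="gpt-3.5-turbo",
))  # -> local_llama
```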
## Quality Assurance

Ensure consistent AI performance across models:

```python
# Model quality testing
test_scenarios = [
    {
        "input": "Scale the frontend service to handle increased traffic",
        "expected_tools": ["kubectl", "monitoring-check"],
        "expected_safety_checks": ["resource_limits", "rollback_plan"],
    },
    {
        "input": "Investigate database connection issues",
        "expected_tools": ["database-client", "network-test", "log-analyzer"],
        "required_context": ["database_endpoints", "connection_pools"],
    },
]

for scenario in test_scenarios:
    for model in ["gpt-4", "claude-3", "local-llama"]:
        result = test_model_quality(model, scenario)
        assert_workflow_quality(result, scenario)  # compare against the scenario's expectations
```
## Security & Privacy

### Data Handling Policies

Control what data AI models can access:

```yaml
# Data classification and model access rules
data_policies:
  public:
    allowed_models: ["openai/*", "anthropic/*", "google/*"]
  internal:
    allowed_models: ["azure_openai/*", "local/*"]
  confidential:
    allowed_models: ["local/*"]
    data_masking: true
  restricted:
    allowed_models: ["air_gapped_local/*"]
    no_cloud_access: true
```
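
Enforcing these wildcard rules takes only a glob match. A minimal sketch, assuming model identifiers follow the `provider/model` form used above; this is not Kubiya's enforcement code:

```python
from fnmatch import fnmatch

data_policies = {
    "public": ["openai/*", "anthropic/*", "google/*"],
    "internal": ["azure_openai/*", "local/*"],
    "confidential": ["local/*"],
    "restricted": ["air_gapped_local/*"],
}

def model_allowed(classification: str, model_id: str) -> bool:
    """Check a 'provider/model' id against the allow-list for a data class."""
    return any(fnmatch(model_id, pattern) for pattern in data_policies[classification])

print(model_allowed("confidential", "openai/gpt-4-turbo"))  # False
print(model_allowed("confidential", "local/llama3:8b"))     # True
```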
### Audit & Compliance

Track all AI model interactions:

- **Prompt logging**: Record all inputs sent to AI models
- **Response tracking**: Store AI-generated workflows and decisions
- **Model attribution**: Track which model generated each workflow
- **Cost tracking**: Monitor AI API usage and costs by team/project

> **Privacy Notice**: When using cloud-based AI models, your prompts and context may be processed by third-party services. Use local models or data masking for sensitive information.
## What’s Next?

AI models generate the intelligence, but workflows provide the structure and safety guardrails that make automation reliable in production. Learn how AI-generated workflows become deterministic, auditable operations.