The Models Service provides programmatic control over the lifecycle and configuration of Large Language Models (LLMs) within the Kubiya platform. It serves as the main interface for discovering, configuring, and managing both standard and custom LLMs, allowing teams to optimize their AI-powered workflows for performance, cost, and compliance.
The Models Service makes it easy to:
- List and filter available models by provider, runtime compatibility, and recommendation status.
- Retrieve detailed model metadata such as capabilities, pricing, and context window.
- Create and update custom model configurations for private, fine-tuned, or organization-specific models.
- Manage the model lifecycle (enable/disable, recommend, update, or delete) so only approved models are available for use by agents and workflows.
- Audit and report on model availability, usage, and provider coverage.
This service is especially valuable for platform administrators and advanced users who need to balance innovation with governance, cost control, and security. By using the Models Service, you can ensure that your agents and workflows always use the most appropriate, cost-effective, and compliant LLMs for your business needs.
Overview
The Models Service provides a set of high-level methods designed to be intuitive and flexible, supporting a wide range of operational and administrative tasks:
- list(): Retrieve all available LLM models, with filtering options for provider, runtime, recommendation status, and pagination. This is the primary entry point for discovering which models are available in your environment.
- get(model_id): Fetch detailed metadata for a specific model, using either its unique identifier (UUID) or its value string (e.g., "kubiya/claude-sonnet-4"). This is useful for inspecting model capabilities, pricing, and configuration before use.
- create(model_data): Register a new custom model configuration. This is typically used for private or fine-tuned models that are not available from public providers, and requires admin privileges.
- update(model_id, model_data): Modify the configuration of an existing model, such as updating its description, pricing, or enabled status. This allows organizations to keep model metadata up to date as providers change their offerings.
- delete(model_id): Remove a model from the registry. This operation is restricted to ensure that no active agents are using the model at the time of deletion.
- get_default(): Retrieve the default recommended model for general-purpose use. This is especially useful for agents or workflows that do not have strict model requirements and should use the platform’s best-practice recommendation.
- list_providers(): List all unique LLM providers currently available in your environment. This helps with auditing, reporting, and provider-specific filtering.
By using these methods, you can build robust, dynamic workflows that adapt to changes in the LLM landscape, enforce organizational policies, and provide a seamless experience for both developers and end-users.
The following sections provide practical guidance, detailed examples, and best practices for leveraging the Models Service effectively in your own projects.
Quick Start
from kubiya import ControlPlaneClient
# Initialize client
client = ControlPlaneClient(api_key="your-api-key")
# List all available models
models = client.models.list()
for model in models:
    print(f"Model: {model['value']} - Provider: {model['provider']}")
# Get default recommended model
default_model = client.models.get_default()
print(f"Default: {default_model['value']}")
List Models
List all available LLM models with optional filtering.
Basic Listing
# List all enabled models
models = client.models.list()
for model in models:
    print(f"""
    Model: {model['value']}
    Provider: {model['provider']}
    Enabled: {model['enabled']}
    Recommended: {model.get('recommended', False)}
    """)
Filter by Provider
# List only Anthropic models
anthropic_models = client.models.list(provider="Anthropic")
for model in anthropic_models:
    print(f"Anthropic Model: {model['value']}")
# List OpenAI models
openai_models = client.models.list(provider="OpenAI")
# List Google models
google_models = client.models.list(provider="Google")
Filter by Runtime
# List models compatible with claude_code runtime
claude_code_models = client.models.list(runtime="claude_code")
for model in claude_code_models:
    print(f"Claude Code Compatible: {model['value']}")
# List models for agno runtime
agno_models = client.models.list(runtime="agno")
Filter by Recommendation
# Get only recommended models
recommended_models = client.models.list(recommended=True)
for model in recommended_models:
    print(f"Recommended: {model['value']}")
    print(f"  Provider: {model['provider']}")
    print(f"  Description: {model.get('description', 'N/A')}")
Include Disabled Models
# List all models including disabled ones
all_models = client.models.list(enabled_only=False)
print(f"Total models: {len(all_models)}")
enabled = [m for m in all_models if m['enabled']]
disabled = [m for m in all_models if not m['enabled']]
print(f"Enabled: {len(enabled)}, Disabled: {len(disabled)}")
Pagination

# Paginate through models
skip = 0
limit = 10

while True:
    models = client.models.list(skip=skip, limit=limit)
    if not models:
        break

    for model in models:
        print(f"Model: {model['value']}")

    skip += limit
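If you need the complete catalog rather than a single page, the same loop can be wrapped in a small helper. This is a local convenience sketch built only from the list() parameters shown above, not an SDK method:

def list_all_models(client, page_size: int = 50, **filters):
    """Collect every page of models into a single list (local helper)."""
    collected, skip = [], 0
    while True:
        page = client.models.list(skip=skip, limit=page_size, **filters)
        if not page:
            break
        collected.extend(page)
        skip += page_size
    return collected

# Usage
full_catalog = list_all_models(client, enabled_only=False)
print(f"Fetched {len(full_catalog)} models across all pages")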
Get Model Details
Retrieve detailed information about a specific model.
By Model ID
# Get model by UUID
model = client.models.get("model-uuid-here")
print(f"Model: {model['value']}")
print(f"Provider: {model['provider']}")
print(f"Enabled: {model['enabled']}")
print(f"Context Window: {model.get('context_window', 'N/A')}")
print(f"Max Tokens: {model.get('max_tokens', 'N/A')}")
By Model Value
# Get model by value string
model = client.models.get("kubiya/claude-sonnet-4")
print(f"Model Details:")
print(f" Value: {model['value']}")
print(f" Provider: {model['provider']}")
print(f" Description: {model.get('description', 'N/A')}")
print(f" Capabilities: {model.get('capabilities', [])}")
def get_model_info(model_value: str):
    """Get comprehensive model information"""
    try:
        model = client.models.get(model_value)
        info = {
            "value": model['value'],
            "provider": model['provider'],
            "enabled": model['enabled'],
            "recommended": model.get('recommended', False),
            "context_window": model.get('context_window'),
            "max_tokens": model.get('max_tokens'),
            "cost_per_1k_input": model.get('cost_per_1k_input'),
            "cost_per_1k_output": model.get('cost_per_1k_output'),
            "capabilities": model.get('capabilities', []),
            "runtime_compatibility": model.get('runtime_compatibility', [])
        }
        return info
    except Exception as e:
        print(f"Failed to get model info: {e}")
        return None
# Usage
info = get_model_info("kubiya/claude-sonnet-4")
if info:
    print(f"Model: {info['value']}")
    print(f"Provider: {info['provider']}")
    print(f"Context Window: {info['context_window']}")
Get Default Model
Get the default recommended LLM model.
# Get default model
default = client.models.get_default()
print(f"Default Model: {default['value']}")
print(f"Provider: {default['provider']}")
print(f"Why recommended: {default.get('description', 'Balanced performance and cost')}")
List Providers
Get all available LLM providers.
# List all providers
providers = client.models.list_providers()
print("Available Providers:")
for provider in providers:
    print(f"  - {provider}")

    # Get models for this provider
    provider_models = client.models.list(provider=provider)
    print(f"    Models: {len(provider_models)}")
Create Custom Model
Create a new custom LLM model configuration.
Creating models requires organization admin privileges. This operation is typically used for custom/private model deployments.
# Create custom model
model_data = {
    "value": "custom/my-fine-tuned-model",
    "provider": "Custom",
    "enabled": True,
    "recommended": False,
    "description": "Custom fine-tuned model for our domain",
    "context_window": 128000,
    "max_tokens": 4096,
    "capabilities": ["code", "analysis", "planning"],
    "runtime_compatibility": ["agno"],
    "cost_per_1k_input": 0.015,
    "cost_per_1k_output": 0.075
}
created_model = client.models.create(model_data)
print(f"Created Model: {created_model['uuid']}")
print(f"Value: {created_model['value']}")
Update Model
Update an existing model configuration.
# Update model settings
model_id = "model-uuid-or-value"
update_data = {
    "enabled": True,
    "recommended": True,
    "description": "Updated description",
    "cost_per_1k_input": 0.010  # Updated pricing
}
updated_model = client.models.update(model_id, update_data)
print(f"Updated Model: {updated_model['value']}")
print(f"Enabled: {updated_model['enabled']}")
Delete Model
Delete a custom model configuration.
Deleting a model affects any agents configured to use it. Ensure no active agents depend on the model before deletion.
# Delete model
model_id = "model-uuid-or-value"
result = client.models.delete(model_id)
print(f"Deletion result: {result}")
Practical Examples
1. Find Best Model for Use Case
Use this approach when you want to programmatically select the most suitable LLM for a specific task type, such as code generation, cost-sensitive operations, or handling large context windows. This is helpful for dynamic agent workflows that need to adapt to different requirements on the fly.
def find_best_model_for_task(task_type: str):
    """Find the best model for a specific task type"""
    # Get recommended models
    recommended = client.models.list(recommended=True)
    if not recommended:
        # No recommended models available; fall back to the platform default
        return client.models.get_default()

    if task_type == "code":
        # Prefer Claude models for code
        code_models = [m for m in recommended if "claude" in m['value'].lower()]
        return code_models[0] if code_models else recommended[0]
    elif task_type == "cost-sensitive":
        # Find the cheapest recommended model (fall back if none report pricing)
        models_with_cost = [m for m in recommended if m.get('cost_per_1k_input')]
        return min(models_with_cost, key=lambda m: m['cost_per_1k_input']) if models_with_cost else recommended[0]
    elif task_type == "large-context":
        # Find the model with the largest context window
        models_with_context = [m for m in recommended if m.get('context_window')]
        return max(models_with_context, key=lambda m: m['context_window']) if models_with_context else recommended[0]
    else:
        # Default to the platform recommendation
        return client.models.get_default()
# Usage
code_model = find_best_model_for_task("code")
print(f"Best for code: {code_model['value']}")
cheap_model = find_best_model_for_task("cost-sensitive")
print(f"Most cost-effective: {cheap_model['value']}")
context_model = find_best_model_for_task("large-context")
print(f"Largest context: {context_model['value']}")
2. Compare Model Capabilities
This example is useful when you need to evaluate and compare multiple models side by side, such as when deciding which model to standardize on for your team or when presenting options to stakeholders. It helps you quickly assess differences in context window, provider, and cost.
def compare_models(model_values: list):
    """Compare multiple models side by side"""
    models = [client.models.get(value) for value in model_values]

    print(f"{'Model':<40} {'Provider':<15} {'Context':<12} {'Cost/1K':<10}")
    print("-" * 80)

    for model in models:
        context = model.get('context_window', 'N/A')
        cost = model.get('cost_per_1k_input', 'N/A')
        print(f"{model['value']:<40} {model['provider']:<15} {str(context):<12} ${str(cost):<10}")
# Usage
compare_models([
"kubiya/claude-sonnet-4",
"kubiya/claude-opus-4",
"kubiya/gpt-4o"
])
3. Model Cost Calculator
Use this pattern to estimate the cost of running a specific workload on a given model. This is valuable for budgeting, cost tracking, or when you want to compare the financial impact of different model choices before running large jobs.
def calculate_cost(model_value: str, input_tokens: int, output_tokens: int):
    """Calculate estimated cost for using a model"""
    model = client.models.get(model_value)

    input_cost_per_1k = model.get('cost_per_1k_input', 0)
    output_cost_per_1k = model.get('cost_per_1k_output', 0)

    input_cost = (input_tokens / 1000) * input_cost_per_1k
    output_cost = (output_tokens / 1000) * output_cost_per_1k
    total_cost = input_cost + output_cost

    return {
        "model": model_value,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "input_cost": round(input_cost, 4),
        "output_cost": round(output_cost, 4),
        "total_cost": round(total_cost, 4)
    }
# Usage
cost = calculate_cost("kubiya/claude-sonnet-4", 10000, 2000)
print(f"Model: {cost['model']}")
print(f"Input tokens: {cost['input_tokens']:,} = ${cost['input_cost']}")
print(f"Output tokens: {cost['output_tokens']:,} = ${cost['output_cost']}")
print(f"Total cost: ${cost['total_cost']}")
4. Audit Model Usage
This example demonstrates how to generate a summary report of all available models, including how many are enabled, recommended, or available from each provider. It’s helpful for platform administrators who need to audit model inventory or prepare usage reports.
def audit_available_models():
    """Generate audit report of available models"""
    all_models = client.models.list(enabled_only=False)
    providers = client.models.list_providers()

    report = {
        "total_models": len(all_models),
        "enabled": len([m for m in all_models if m['enabled']]),
        "disabled": len([m for m in all_models if not m['enabled']]),
        "recommended": len([m for m in all_models if m.get('recommended')]),
        "providers": len(providers),
        "by_provider": {}
    }

    for provider in providers:
        provider_models = [m for m in all_models if m['provider'] == provider]
        report["by_provider"][provider] = {
            "total": len(provider_models),
            "enabled": len([m for m in provider_models if m['enabled']])
        }

    return report
# Usage
audit = audit_available_models()
print(f"Total Models: {audit['total_models']}")
print(f"Enabled: {audit['enabled']}")
print(f"Recommended: {audit['recommended']}")
print(f"\nBy Provider:")
for provider, stats in audit['by_provider'].items():
    print(f"  {provider}: {stats['total']} total, {stats['enabled']} enabled")
Error Handling
When working with models, you may encounter errors such as requesting a non-existent model or losing access to a provider. This example shows how to handle such errors gracefully, ensuring your application can recover or provide fallback behavior if a model is unavailable.
from kubiya.resources.exceptions import ModelError
try:
    # Try to get a model
    model = client.models.get("non-existent-model")
except ModelError as e:
    print(f"Model error: {e}")

    # Handle error - maybe use default model instead
    model = client.models.get_default()
    print(f"Using default model: {model['value']}")
Best Practices
Follow these best practices to make your use of the Models Service more robust, efficient, and maintainable. These patterns help you avoid common pitfalls and ensure your code is resilient to changes in model availability or configuration.
1. Cache Model Information

To reduce API calls and improve performance, cache model information locally after the first retrieval. This is especially useful in applications that repeatedly access the same model details.
# Cache models to reduce API calls
class ModelCache:
    def __init__(self, client):
        self.client = client
        self._cache = {}
        self._providers = None

    def get_model(self, model_value: str):
        if model_value not in self._cache:
            self._cache[model_value] = self.client.models.get(model_value)
        return self._cache[model_value]

    def list_providers(self):
        if self._providers is None:
            self._providers = self.client.models.list_providers()
        return self._providers
# Usage
cache = ModelCache(client)
model1 = cache.get_model("kubiya/claude-sonnet-4") # API call
model2 = cache.get_model("kubiya/claude-sonnet-4") # From cache
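Model metadata can change over time (pricing updates, models being enabled or disabled), so a cache that never expires can serve stale data. A minimal time-based variant of the cache above:

import time

class TTLModelCache(ModelCache):
    """ModelCache variant whose entries expire after ttl_seconds."""
    def __init__(self, client, ttl_seconds: float = 300.0):
        super().__init__(client)
        self.ttl = ttl_seconds
        self._fetched_at = {}

    def get_model(self, model_value: str):
        fetched = self._fetched_at.get(model_value)
        if fetched is None or time.time() - fetched > self.ttl:
            # Entry missing or expired: refresh from the API
            self._cache[model_value] = self.client.models.get(model_value)
            self._fetched_at[model_value] = time.time()
        return self._cache[model_value]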
2. Use Recommended Models by Default
Unless you have a specific requirement, always use the recommended model. This ensures you benefit from the platform’s best-practice guidance and reduces the risk of using deprecated or suboptimal models.
# Always prefer recommended models unless specific requirements
def select_model(specific_model=None):
    if specific_model:
        return client.models.get(specific_model)

    # Use recommended by default
    return client.models.get_default()
# Usage
model = select_model() # Gets default recommended
model = select_model("kubiya/claude-opus-4") # Gets specific model
3. Validate Model Before Use
Before using a model, check that it is enabled and supports the capabilities you need. This prevents runtime errors and ensures your workflow only uses models that meet your requirements.
def validate_model(model_value: str, required_capabilities=None):
    """Validate that a model exists and has required capabilities"""
    try:
        model = client.models.get(model_value)

        if not model['enabled']:
            raise ValueError(f"Model {model_value} is disabled")

        if required_capabilities:
            model_caps = set(model.get('capabilities', []))
            required_caps = set(required_capabilities)
            if not required_caps.issubset(model_caps):
                missing = required_caps - model_caps
                raise ValueError(f"Model missing capabilities: {missing}")

        return True
    except Exception as e:
        print(f"Model validation failed: {e}")
        return False
# Usage
if validate_model("kubiya/claude-sonnet-4", ["code", "planning"]):
    print("Model is valid and has required capabilities")
4. Handle Model Unavailability
Always provide a fallback mechanism in case your preferred model is unavailable. This keeps your application resilient and ensures continuity of service even if a model is removed or temporarily inaccessible.
def get_model_with_fallback(preferred_model: str, fallback_to_default=True):
    """Get model with automatic fallback to default"""
    try:
        return client.models.get(preferred_model)
    except ModelError as e:
        print(f"Preferred model unavailable: {e}")
        if fallback_to_default:
            print("Falling back to default model")
            return client.models.get_default()
        raise
# Usage
model = get_model_with_fallback("custom/my-model")
API Reference
Methods
| Method | Description | Parameters |
|---|---|---|
| list() | List all LLM models | enabled_only, provider, runtime, recommended, skip, limit |
| get(model_id) | Get specific model | model_id: UUID or value string |
| get_default() | Get default recommended model | None |
| list_providers() | List available providers | None |
| create(model_data) | Create custom model | model_data: dictionary with model config |
| update(model_id, model_data) | Update model | model_id: UUID/value; model_data: fields to update |
| delete(model_id) | Delete model | model_id: UUID or value string |
Model Object Structure
{
    "uuid": "string",                      # Unique identifier
    "value": "string",                     # Model value (e.g., "kubiya/claude-sonnet-4")
    "provider": "string",                  # Provider name (e.g., "Anthropic")
    "enabled": bool,                       # Whether model is enabled
    "recommended": bool,                   # Whether model is recommended
    "description": "string",               # Model description
    "context_window": int,                 # Maximum context window size
    "max_tokens": int,                     # Maximum output tokens
    "cost_per_1k_input": float,            # Cost per 1K input tokens
    "cost_per_1k_output": float,           # Cost per 1K output tokens
    "capabilities": ["string"],            # Model capabilities
    "runtime_compatibility": ["string"]    # Compatible runtimes
}
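If you prefer typed access over raw dictionaries, this structure maps naturally onto a local dataclass. This is purely a client-side convenience sketch, not part of the SDK:

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ModelInfo:
    """Typed mirror of the model object structure (local convenience)."""
    uuid: str
    value: str
    provider: str
    enabled: bool
    recommended: bool = False
    description: Optional[str] = None
    context_window: Optional[int] = None
    max_tokens: Optional[int] = None
    cost_per_1k_input: Optional[float] = None
    cost_per_1k_output: Optional[float] = None
    capabilities: List[str] = field(default_factory=list)
    runtime_compatibility: List[str] = field(default_factory=list)

    @classmethod
    def from_dict(cls, data: dict) -> "ModelInfo":
        # Keep only known, non-null keys so extra API fields are ignored
        known = {k: v for k, v in data.items()
                 if k in cls.__dataclass_fields__ and v is not None}
        return cls(**known)

# Usage
model = ModelInfo.from_dict(client.models.get("kubiya/claude-sonnet-4"))
print(model.value, model.provider)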
Next Steps