Manage LLM models and configurations through the Control Plane SDK
The Models Service provides programmatic control over the lifecycle and configuration of Large Language Models (LLMs) within the Kubiya platform. It serves as the main interface for discovering, configuring, and managing both standard and custom LLMs, allowing teams to optimize their AI-powered workflows for performance, cost, and compliance. The Models Service makes it easy to:
List and filter available models by provider, runtime compatibility, and recommendation status.
Retrieve detailed model metadata such as capabilities, pricing, and context window.
Create and update custom model configurations for private, fine-tuned, or organization-specific models.
Manage the model lifecycle (enable/disable, recommend, update, or delete) so only approved models are available for use by agents and workflows.
Audit and report on model availability, usage, and provider coverage.
This service is especially valuable for platform administrators and advanced users who need to balance innovation with governance, cost control, and security. By using the Models Service, you can ensure that your agents and workflows always use the most appropriate, cost-effective, and compliant LLMs for your business needs.
For conceptual information about models and how they’re used in agents, see Agents Core Concepts.
The Models Service provides a set of high-level methods designed to be intuitive and flexible, supporting a wide range of operational and administrative tasks:
list(): Retrieve all available LLM models, with filtering options for provider, runtime, recommendation status, and pagination. This is the primary entry point for discovering which models are available in your environment.
get(model_id): Fetch detailed metadata for a specific model, using either its unique identifier (UUID) or its value string (e.g., “kubiya/claude-sonnet-4”). This is useful for inspecting model capabilities, pricing, and configuration before use.
create(model_data): Register a new custom model configuration. This is typically used for private or fine-tuned models that are not available from public providers, and requires admin privileges.
update(model_id, model_data): Modify the configuration of an existing model, such as updating its description, pricing, or enabled status. This allows organizations to keep model metadata up to date as providers change their offerings.
delete(model_id): Remove a model from the registry. This operation is restricted to ensure that no active agents are using the model at the time of deletion.
get_default(): Retrieve the default recommended model for general-purpose use. This is especially useful for agents or workflows that do not have strict model requirements and should use the platform’s best-practice recommendation.
list_providers(): List all unique LLM providers currently available in your environment. This helps with auditing, reporting, and provider-specific filtering.
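The `create()`, `update()`, and `delete()` methods can be combined into a simple lifecycle flow for custom models. The sketch below assumes admin privileges; the specific fields in `model_data` (`value`, `provider`, `description`, `enabled`) and the returned `id` key are assumptions based on the model attributes referenced elsewhere on this page, so verify them against your deployment before relying on them.

```python
def register_custom_model(client):
    """Register a custom model, then update its configuration.

    The model_data fields below are illustrative assumptions, not a
    definitive schema.
    """
    # Register a private/fine-tuned model (hypothetical value string)
    model = client.models.create({
        "value": "custom/my-finetuned-model",
        "provider": "Custom",
        "description": "Fine-tuned model for internal support tickets",
        "enabled": True,
    })

    # Later: keep metadata current, e.g. disable the model
    client.models.update(model['id'], {"enabled": False})
    return model

# Once no active agents use the model, it can be removed:
# client.models.delete(model['id'])
```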
By using these methods, you can build robust, dynamic workflows that adapt to changes in the LLM landscape, enforce organizational policies, and provide a seamless experience for both developers and end-users. The following sections provide practical guidance, detailed examples, and best practices for leveraging the Models Service effectively in your own projects.
```python
from kubiya import ControlPlaneClient

# Initialize client
client = ControlPlaneClient(api_key="your-api-key")

# List all available models
models = client.models.list()
for model in models:
    print(f"Model: {model['value']} - Provider: {model['provider']}")

# Get default recommended model
default_model = client.models.get_default()
print(f"Default: {default_model['value']}")
```
```python
# List only Anthropic models
anthropic_models = client.models.list(provider="Anthropic")
for model in anthropic_models:
    print(f"Anthropic Model: {model['value']}")

# List OpenAI models
openai_models = client.models.list(provider="OpenAI")

# List Google models
google_models = client.models.list(provider="Google")
```
```python
# List models compatible with the claude_code runtime
claude_code_models = client.models.list(runtime="claude_code")
for model in claude_code_models:
    print(f"Claude Code Compatible: {model['value']}")

# List models for the agno runtime
agno_models = client.models.list(runtime="agno")
```
```python
# Get only recommended models
recommended_models = client.models.list(recommended=True)
for model in recommended_models:
    print(f"Recommended: {model['value']}")
    print(f"  Provider: {model['provider']}")
    print(f"  Description: {model.get('description', 'N/A')}")
```
```python
# List all models, including disabled ones
all_models = client.models.list(enabled_only=False)
print(f"Total models: {len(all_models)}")

enabled = [m for m in all_models if m['enabled']]
disabled = [m for m in all_models if not m['enabled']]
print(f"Enabled: {len(enabled)}, Disabled: {len(disabled)}")
```
```python
# Paginate through models
skip = 0
limit = 10

while True:
    models = client.models.list(skip=skip, limit=limit)
    if not models:
        break
    for model in models:
        print(f"Model: {model['value']}")
    skip += limit
```
```python
# Get a model by its value string
model = client.models.get("kubiya/claude-sonnet-4")
print("Model Details:")
print(f"  Value: {model['value']}")
print(f"  Provider: {model['provider']}")
print(f"  Description: {model.get('description', 'N/A')}")
print(f"  Capabilities: {model.get('capabilities', [])}")
```
```python
# List all providers
providers = client.models.list_providers()
print("Available Providers:")
for provider in providers:
    print(f"  - {provider}")
    # Get models for this provider
    provider_models = client.models.list(provider=provider)
    print(f"    Models: {len(provider_models)}")
```
Use this approach when you want to programmatically select the most suitable LLM for a specific task type, such as code generation, cost-sensitive operations, or handling large context windows. This is helpful for dynamic agent workflows that need to adapt to different requirements on the fly.
```python
def find_best_model_for_task(task_type: str):
    """Find the best model for a specific task type."""
    # Get recommended models
    recommended = client.models.list(recommended=True)

    if task_type == "code":
        # Prefer Claude models for code
        code_models = [m for m in recommended if "claude" in m['value'].lower()]
        return code_models[0] if code_models else recommended[0]
    elif task_type == "cost-sensitive":
        # Find the cheapest recommended model
        models_with_cost = [m for m in recommended if m.get('cost_per_1k_input')]
        return min(models_with_cost, key=lambda m: m['cost_per_1k_input'])
    elif task_type == "large-context":
        # Find the model with the largest context window
        models_with_context = [m for m in recommended if m.get('context_window')]
        return max(models_with_context, key=lambda m: m['context_window'])
    else:
        # Default to the platform recommendation
        return client.models.get_default()

# Usage
code_model = find_best_model_for_task("code")
print(f"Best for code: {code_model['value']}")

cheap_model = find_best_model_for_task("cost-sensitive")
print(f"Most cost-effective: {cheap_model['value']}")

context_model = find_best_model_for_task("large-context")
print(f"Largest context: {context_model['value']}")
```
This example is useful when you need to evaluate and compare multiple models side by side, such as when deciding which model to standardize on for your team or when presenting options to stakeholders. It helps you quickly assess differences in context window, provider, and cost.
```python
def compare_models(model_values: list):
    """Compare multiple models side by side."""
    models = [client.models.get(value) for value in model_values]

    print(f"{'Model':<40} {'Provider':<15} {'Context':<12} {'Cost/1K':<10}")
    print("-" * 80)
    for model in models:
        context = model.get('context_window', 'N/A')
        cost = model.get('cost_per_1k_input', 'N/A')
        print(f"{model['value']:<40} {model['provider']:<15} {str(context):<12} ${str(cost):<10}")

# Usage
compare_models([
    "kubiya/claude-sonnet-4",
    "kubiya/claude-opus-4",
    "kubiya/gpt-4o"
])
```
Use this pattern to estimate the cost of running a specific workload on a given model. This is valuable for budgeting, cost tracking, or when you want to compare the financial impact of different model choices before running large jobs.
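The cost-estimation pattern described above can be sketched as a small helper that works on the pricing metadata a model record carries. The field name `cost_per_1k_input` appears elsewhere on this page; `cost_per_1k_output` is an assumption for symmetry, so check the metadata returned by `get()` in your environment before depending on it.

```python
def estimate_cost(model: dict, input_tokens: int, output_tokens: int = 0) -> float:
    """Estimate the USD cost of a workload from per-1K-token pricing.

    Missing pricing fields are treated as zero, so the result is a
    lower bound when metadata is incomplete.
    """
    input_rate = model.get('cost_per_1k_input') or 0.0
    output_rate = model.get('cost_per_1k_output') or 0.0  # assumed field name
    return (input_tokens / 1000) * input_rate + (output_tokens / 1000) * output_rate

# Usage (assumes the `client` from the earlier examples):
# model = client.models.get("kubiya/claude-sonnet-4")
# print(f"Estimated cost: ${estimate_cost(model, 500_000, 100_000):.2f}")
```

This makes it easy to compare the financial impact of two candidate models on the same projected token volume before committing to a large job.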
This example demonstrates how to generate a summary report of all available models, including how many are enabled, recommended, or available from each provider. It’s helpful for platform administrators who need to audit model inventory or prepare usage reports.
```python
def audit_available_models():
    """Generate an audit report of available models."""
    all_models = client.models.list(enabled_only=False)
    providers = client.models.list_providers()

    report = {
        "total_models": len(all_models),
        "enabled": len([m for m in all_models if m['enabled']]),
        "disabled": len([m for m in all_models if not m['enabled']]),
        "recommended": len([m for m in all_models if m.get('recommended')]),
        "providers": len(providers),
        "by_provider": {}
    }

    for provider in providers:
        provider_models = [m for m in all_models if m['provider'] == provider]
        report["by_provider"][provider] = {
            "total": len(provider_models),
            "enabled": len([m for m in provider_models if m['enabled']])
        }

    return report

# Usage
audit = audit_available_models()
print(f"Total Models: {audit['total_models']}")
print(f"Enabled: {audit['enabled']}")
print(f"Recommended: {audit['recommended']}")
print("\nBy Provider:")
for provider, stats in audit['by_provider'].items():
    print(f"  {provider}: {stats['total']} total, {stats['enabled']} enabled")
```
When working with models, you may encounter errors such as requesting a non-existent model or losing access to a provider. This example shows how to handle such errors gracefully, ensuring your application can recover or provide fallback behavior if a model is unavailable.
```python
from kubiya.resources.exceptions import ModelError

try:
    # Try to get a model
    model = client.models.get("non-existent-model")
except ModelError as e:
    print(f"Model error: {e}")
    # Handle the error - fall back to the default model
    model = client.models.get_default()
    print(f"Using default model: {model['value']}")
```
Follow these best practices to make your use of the Models Service more robust, efficient, and maintainable. These patterns help you avoid common pitfalls and ensure your code is resilient to changes in model availability or configuration.
To reduce API calls and improve performance, cache model information locally after the first retrieval. This is especially useful in applications that repeatedly access the same model details.
```python
# Cache models to reduce API calls
class ModelCache:
    def __init__(self, client):
        self.client = client
        self._cache = {}
        self._providers = None

    def get_model(self, model_value: str):
        if model_value not in self._cache:
            self._cache[model_value] = self.client.models.get(model_value)
        return self._cache[model_value]

    def list_providers(self):
        if self._providers is None:
            self._providers = self.client.models.list_providers()
        return self._providers

# Usage
cache = ModelCache(client)
model1 = cache.get_model("kubiya/claude-sonnet-4")  # API call
model2 = cache.get_model("kubiya/claude-sonnet-4")  # From cache
```
Unless you have a specific requirement, always use the recommended model. This ensures you benefit from the platform’s best-practice guidance and reduces the risk of using deprecated or suboptimal models.
```python
# Always prefer recommended models unless you have specific requirements
def select_model(specific_model=None):
    if specific_model:
        return client.models.get(specific_model)
    # Use the recommended model by default
    return client.models.get_default()

# Usage
model = select_model()                        # Gets the default recommended model
model = select_model("kubiya/claude-opus-4")  # Gets a specific model
```
Before using a model, check that it is enabled and supports the capabilities you need. This prevents runtime errors and ensures your workflow only uses models that meet your requirements.
```python
def validate_model(model_value: str, required_capabilities=None):
    """Validate that a model exists and has the required capabilities."""
    try:
        model = client.models.get(model_value)

        if not model['enabled']:
            raise ValueError(f"Model {model_value} is disabled")

        if required_capabilities:
            model_caps = set(model.get('capabilities', []))
            required_caps = set(required_capabilities)
            if not required_caps.issubset(model_caps):
                missing = required_caps - model_caps
                raise ValueError(f"Model missing capabilities: {missing}")

        return True
    except Exception as e:
        print(f"Model validation failed: {e}")
        return False

# Usage
if validate_model("kubiya/claude-sonnet-4", ["code", "planning"]):
    print("Model is valid and has required capabilities")
```
Always provide a fallback mechanism in case your preferred model is unavailable. This keeps your application resilient and ensures continuity of service even if a model is removed or temporarily inaccessible.
```python
def get_model_with_fallback(preferred_model: str, fallback_to_default=True):
    """Get a model with automatic fallback to the default."""
    try:
        return client.models.get(preferred_model)
    except ModelError as e:
        print(f"Preferred model unavailable: {e}")
        if fallback_to_default:
            print("Falling back to default model")
            return client.models.get_default()
        raise

# Usage
model = get_model_with_fallback("custom/my-model")
```