Kubiya Runners are the execution engine that orchestrates serverless tools on your infrastructure. They manage container lifecycles, enforce security policies, and provide the bridge between Kubiya’s AI-generated workflows and your actual systems.

Why Runners Matter for Production

Runners solve critical challenges for production automation:

Data Sovereignty

  • All execution happens on your infrastructure
  • Sensitive data never leaves your environment
  • Meet compliance requirements for regulated industries
  • Full control over data residency and processing

Security & Isolation

  • Network policies control tool access to systems
  • Resource limits prevent runaway processes
  • Security scanning of all container images
  • Audit logging of every operation

Performance & Reliability

  • Execute tools close to your data and services
  • Automatic retry and error recovery
  • Load balancing across multiple runner instances
  • Caching of frequently used tool images

[Image: Runner Selection Interface]

Runner Architecture

Core Components

Each runner manages the complete tool execution lifecycle (a conceptual sketch follows this list):
  • Pull and cache tool container images
  • Create isolated execution environments
  • Enforce resource limits and security policies
  • Collect logs and metrics from running containers
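
Conceptually, each execution environment resembles a short-lived, locked-down pod with a pinned image, resource limits, and no elevated privileges. The sketch below is illustrative only; the labels, image, and exact objects Kubiya creates are assumptions:

# Conceptual sketch of a single tool execution environment (illustrative only)
apiVersion: v1
kind: Pod
metadata:
  generateName: tool-exec-
  labels:
    app: kubiya-tool-execution   # hypothetical label
spec:
  restartPolicy: Never
  containers:
  - name: tool
    image: registry.example.com/tools/aws-cli:1.2.3   # pulled and cached by the runner
    resources:
      limits:
        cpu: "1"
        memory: 1Gi
    securityContext:
      runAsNonRoot: true
      allowPrivilegeEscalation: false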

Deployment Options

Self-Hosted Runners

Deploy runners on your own infrastructure for maximum control:

Kubernetes

Native Kubernetes deployment with Helm charts

Docker Compose

Simple deployment for development and testing (see the Docker Compose sketch below)

VM/Bare Metal

Direct installation on Linux systems

Cloud Native

Optimized for AWS EKS, GCP GKE, Azure AKS
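
The Kubernetes path is detailed below. For the Docker Compose option, a minimal sketch could look like the following; the image name, environment variable, and volume path are illustrative assumptions, not published defaults:

# docker-compose.yml sketch for a single runner (illustrative values)
services:
  kubiya-runner:
    image: kubiya/runner:latest              # hypothetical image name
    restart: unless-stopped
    environment:
      KUBIYA_API_KEY: ${KUBIYA_API_KEY}      # hypothetical variable name
    volumes:
      - runner-cache:/var/lib/kubiya/cache   # hypothetical cache path
volumes:
  runner-cache: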

Kubernetes Deployment

# values.yaml for Kubiya Runner Helm chart
runner:
  replicas: 3
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: 2000m
      memory: 4Gi

security:
  networkPolicies: true
  podSecurityStandards: restricted
  runAsNonRoot: true

storage:
  cacheSize: 20Gi
  logsRetention: 30d

integrations:
  kubernetes:
    inClusterConfig: true
  aws:
    roleArn: "arn:aws:iam::123456789012:role/KubiyaRunner"

Benefits of Self-Hosted

Security

Complete control over network access, data processing, and credential handling

Compliance

Meet SOC2, HIPAA, PCI-DSS requirements with on-premises execution

Performance

Low latency access to internal systems and databases

Cost

No data transfer costs for large-scale operations

Hosted Runners

Use Kubiya’s managed infrastructure for quick setup:

Quick Start

No installation required - start automating immediately

Maintenance Free

Automatic updates, scaling, and monitoring

Global Reach

Runners available in multiple regions worldwide

Enterprise SLA

99.9% uptime guarantee with 24/7 support

Hosted runners are ideal for development and testing, but production workloads typically require self-hosted runners for security and compliance reasons.

Cross-Environment Orchestration

Runners enable seamless automation across different environments and clusters:

Multi-Cluster Workflows

Deploy applications across multiple Kubernetes clusters:
# Workflow that deploys to multiple environments
name: multi-environment-deployment
steps:
  - name: deploy-to-staging
    runner: staging-cluster-runner
    tool: kubernetes-deployer
    inputs:
      namespace: myapp-staging
      image: myapp:${BUILD_VERSION}
      
  - name: run-integration-tests
    runner: testing-runner
    tool: test-suite
    depends_on: [deploy-to-staging]
    
  - name: deploy-to-production
    runner: production-cluster-runner  
    tool: kubernetes-deployer
    inputs:
      namespace: myapp-prod
      image: myapp:${BUILD_VERSION}
    condition: ${run-integration-tests.status} == "success"

Cross-Cloud Operations

Orchestrate operations spanning multiple cloud providers:
# Disaster recovery workflow across clouds
name: cross-cloud-failover
steps:
  - name: backup-aws-data
    runner: aws-us-east-runner
    tool: aws-backup
    
  - name: restore-to-gcp
    runner: gcp-us-central-runner  
    tool: gcp-restore
    inputs:
      backup_location: ${backup-aws-data.backup_url}
    
  - name: update-dns
    runner: cloudflare-runner
    tool: dns-updater
    inputs:
      record: api.myapp.com
      target: ${restore-to-gcp.new_endpoint}

Intelligent Runner Selection

Kubiya automatically selects the best runner for each operation based on:
  • Proximity to target systems and data
  • Available resources and current load
  • Security policies and network access rules
  • Cost optimization preferences

[Image: Runner Configuration Interface]
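
The selection logic itself is managed by Kubiya; as a sketch of how such preferences might be expressed on a workflow step, the keys below are illustrative assumptions rather than documented fields:

# Illustrative runner-selection hints on a workflow step (hypothetical keys)
- name: query-production-database
  tool: postgres-query
  runner_selector:
    region: us-east-1            # prefer runners close to the data
    labels:
      network-zone: prod-db      # must satisfy network access rules
    max_cost_tier: standard      # cost optimization preference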

Security & Compliance

Network Security

Runners implement defense-in-depth networking:
# Network policy for runner security
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy  
metadata:
  name: kubiya-runner-policy
spec:
  podSelector:
    matchLabels:
      app: kubiya-runner
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: kubiya-control-plane
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          name: allowed-namespaces
  - to: []
    ports:
    - protocol: TCP
      port: 443  # HTTPS only

Resource Isolation

Each tool execution runs with strict resource controls:
# Resource limits for tool execution
limits:
  cpu: "2000m"
  memory: "4Gi"  
  ephemeral-storage: "10Gi"
  
security_context:
  runAsNonRoot: true
  runAsUser: 65534
  readOnlyRootFilesystem: true
  allowPrivilegeEscalation: false
  
capabilities:
  drop:
  - ALL

Audit & Monitoring

Complete visibility into all runner operations via the Activity Center dashboard:
  • Execution logs: Every command, API call, and file access
  • Performance metrics: Resource usage, execution time, error rates
  • Security events: Failed authentication, policy violations, anomalies
  • Compliance reports: SOC2, GDPR, HIPAA compliance summaries
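
As a sketch, an individual execution record surfaces this kind of information; the field names below are illustrative, not the exact Activity Center schema:

# Illustrative audit record for one tool execution (hypothetical schema)
execution_id: 7f3a9c2e
tool: kubernetes-deployer
runner: production-cluster-runner
started_at: 2024-05-01T02:14:07Z
duration_seconds: 42
exit_status: success
resources:
  peak_cpu: 850m
  peak_memory: 1.2Gi
security_events: []              # policy violations would appear here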

Advanced Configuration

High Availability

Deploy runners with automatic failover:
# HA runner configuration
runner:
  replicas: 5
  antiAffinity: hard  # Spread across nodes
  
persistence:
  storageClass: fast-ssd
  replication: 3
  
loadBalancer:
  enabled: true
  sessionAffinity: ClientIP
  
backup:
  enabled: true
  schedule: "0 2 * * *"
  retention: 30d

Custom Resource Types

Define organization-specific resource types and policies:
# Custom resource definitions for your infrastructure
custom_resources:
  - name: microservice
    properties:
      team: string
      criticality: [low, medium, high, critical]
      data_classification: [public, internal, confidential, restricted]
    
policies:
  - name: critical_service_protection
    selector:
      criticality: critical
    rules:
      - require_approval: true
      - max_concurrent_operations: 1
      - rollback_required: true

Integration Plugins

Extend runner capabilities with custom plugins:
# Custom plugin for specialized monitoring
from kubiya_runner import Plugin

class CustomMonitoringPlugin(Plugin):
    # Helper methods such as start_monitoring, collect_metrics,
    # send_to_custom_system, and trigger_incident_response are
    # implemented by the plugin author; they are not provided by
    # the Plugin base class.
    def pre_execution(self, context):
        """Called before each tool execution"""
        self.start_monitoring(context.tool_name)

    def post_execution(self, context, result):
        """Called after each tool execution"""
        metrics = self.collect_metrics()
        self.send_to_custom_system(metrics)

    def on_error(self, context, error):
        """Called when tool execution fails"""
        self.trigger_incident_response(context, error)
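
How a plugin is loaded depends on your deployment. A sketch of enabling it through the Helm values shown earlier might look like this, where the plugins key and module path are assumptions:

# Illustrative plugin registration in values.yaml (hypothetical keys)
runner:
  plugins:
    - name: custom-monitoring
      module: plugins.custom_monitoring.CustomMonitoringPlugin
      config:
        metrics_endpoint: https://metrics.internal.example.com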

Performance Optimization

Image Caching

Runners aggressively cache tool images for fast startup:
  • Multi-layer caching: Share common base layers across tools
  • Predictive pre-pulling: Download images before they’re needed
  • Garbage collection: Automatically clean up unused images
  • Compression: Reduce storage and transfer overhead
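
Cache behavior is typically tuned alongside the storage settings shown earlier; the keys below are an illustrative extension of that configuration, not guaranteed option names:

# Illustrative cache tuning (hypothetical option names)
storage:
  cacheSize: 50Gi
cache:
  prePull:
    - kubernetes-deployer        # warm frequently used tool images
    - aws-cli
  garbageCollection:
    threshold: 80%               # start evicting at 80% cache usage
    interval: 1h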

Resource Scaling

Automatically scale runner capacity based on demand:
# Horizontal Pod Autoscaler for runners
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: kubiya-runner-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: kubiya-runner
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Pods
    pods:
      metric:
        name: active_executions
      target:
        type: AverageValue
        averageValue: "5"
Production Tip: Start with 3 runner replicas for high availability, then use metrics to determine optimal scaling parameters for your workload patterns.

What’s Next?

With runners managing tool execution, you need AI models to generate the intelligent workflows that determine what tools to run and when. Kubiya’s model-agnostic approach lets you choose the best AI for your use case.