Kubiya Workers are Temporal-based execution engines that process AI agent workflows with enterprise-grade reliability and scalability. This guide covers worker deployment, configuration, and management.
What is a Worker?
Workers are distributed execution engines that:
Poll Task Queues: Listen for workflow and activity tasks from Temporal
Execute Agent Workflows: Run AI agents with tools and integrations
Report Health: Send heartbeats to the Control Plane
Stream Events: Provide real-time execution updates
Handle Failures: Automatically retry failed tasks with exponential backoff
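The retry behavior above can be sketched as a simple exponential-backoff loop. This is an illustration of the pattern only, not the worker's actual implementation (in practice Temporal drives retries through its own retry policies):

```python
import random
import time

def retry_with_backoff(task, max_attempts=5, base_delay=1.0, max_delay=60.0):
    """Retry a callable, doubling the delay after each failure."""
    for attempt in range(max_attempts):
        try:
            return task()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            # Delay doubles each attempt, capped, plus a little jitter
            delay = min(base_delay * (2 ** attempt), max_delay)
            time.sleep(delay + random.uniform(0, delay * 0.1))

# Example: a task that fails twice before succeeding
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

print(retry_with_backoff(flaky, base_delay=0.01))  # -> ok
```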
Quick Start
Start Your First Worker
# 1. Configure API Key
export KUBIYA_API_KEY="your-api-key"
# 2. Start a local worker
kubiya worker start --queue-id=my-queue --type=local
# 3. Monitor logs (in another terminal)
tail -f ~/.kubiya/workers/my-queue/logs/worker.log
Deployment Modes
Local Mode: Development and testing with a Python virtual environment
Daemon Mode: Production deployment running as a background process
Docker Mode: Containerized deployment with isolation
Kubernetes Mode: Scalable multi-replica deployment with auto-scaling
Worker Architecture
Core Components
Communication Flow
Registration: Worker registers with the Control Plane and receives configuration
Temporal Connection: Connects to Temporal Cloud using provided credentials
Task Polling: Continuously polls the assigned queue for new tasks
Task Execution: Executes agent workflows and activities
Event Streaming: Sends real-time events to the Control Plane
Health Reporting: Periodic heartbeats with metrics and status
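A heartbeat payload for the health-reporting step might be assembled roughly like this. The field names here are hypothetical (the actual wire format is defined by the Control Plane API); the sketch only shows how queue ID, hostname, and metrics fit together:

```python
import os
import socket
import time

def build_heartbeat(tasks_completed, avg_duration_s, memory_mb):
    """Assemble a heartbeat payload (hypothetical field names)."""
    return {
        # QUEUE_ID and WORKER_HOSTNAME match the worker's environment variables
        "queue_id": os.environ.get("QUEUE_ID", "unknown"),
        "hostname": os.environ.get("WORKER_HOSTNAME", socket.gethostname()),
        "timestamp": time.time(),
        "metrics": {
            "tasks_completed": tasks_completed,
            "avg_duration_s": avg_duration_s,
            "memory_mb": memory_mb,
        },
    }

hb = build_heartbeat(42, 3.2, 512)
print(hb["metrics"]["tasks_completed"])  # -> 42
```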
Deployment Modes
Local Mode
Best for development and testing.
# Start local worker
kubiya worker start --queue-id=dev-queue --type=local
# With custom environment
export LOG_LEVEL=DEBUG
export HEARTBEAT_INTERVAL=15
kubiya worker start --queue-id=dev-queue --type=local
Features:
Automatic Python virtual environment setup
Foreground process with live logging
Quick iteration and debugging
Dependencies auto-installed
Directory Structure:
~/.kubiya/workers/dev-queue/
├── venv/              # Python virtual environment
├── logs/
│   └── worker.log     # Execution logs
├── worker.py          # Worker implementation
└── requirements.txt   # Python dependencies
Daemon Mode
Production deployment as a background process.
# Start daemon
kubiya worker start --queue-id=prod-queue --type=local --daemon
# Or use shorthand
kubiya worker start --queue-id=prod-queue --type=local -d
# Check status
cat ~/.kubiya/workers/prod-queue/daemon_info.txt
# View logs
tail -f ~/.kubiya/workers/prod-queue/logs/worker.log
# Stop daemon
pkill -f "worker.py.*prod-queue"
Features:
Runs in background
Automatic restart on crash
Log rotation (configurable)
PID and status tracking
Configuration:
# Start with custom log settings
# 100 MB max log size, keep 10 rotated backups
kubiya worker start \
  --queue-id=prod-queue \
  --type=local \
  --daemon \
  --max-log-size=104857600 \
  --max-log-backups=10
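The `--max-log-size` and `--max-log-backups` flags describe standard size-based log rotation. As a point of reference, the same policy expressed with Python's standard library looks like this (a sketch; the worker's internal logging setup may differ):

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler

log_path = os.path.join(tempfile.mkdtemp(), "worker.log")

# 100 MB per file, keep 10 rotated backups -- mirrors
# --max-log-size=104857600 --max-log-backups=10
handler = RotatingFileHandler(log_path, maxBytes=104_857_600, backupCount=10)
handler.setFormatter(logging.Formatter("[%(levelname)s] %(message)s"))

logger = logging.getLogger("kubiya.worker.example")
logger.setLevel(logging.INFO)
logger.addHandler(handler)

logger.info("worker started")
handler.flush()
print(open(log_path).read().strip())  # -> [INFO] worker started
```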
Docker Mode
Isolated containerized deployment.
# Start Docker worker via CLI
kubiya worker start --queue-id=docker-queue --type=docker
# Or run container directly
docker run -d \
--name kubiya-worker \
--restart unless-stopped \
-e KUBIYA_API_KEY="your-api-key" \
-e CONTROL_PLANE_URL="https://control-plane.kubiya.ai" \
-e QUEUE_ID="docker-queue" \
-e LOG_LEVEL="INFO" \
ghcr.io/kubiyabot/agent-worker:latest
# View logs
docker logs -f kubiya-worker
# Stop worker
docker stop kubiya-worker
docker rm kubiya-worker
Docker Compose:
# docker-compose.yml
version: '3.8'

services:
  kubiya-worker:
    image: ghcr.io/kubiyabot/agent-worker:latest
    container_name: kubiya-worker
    restart: unless-stopped
    environment:
      - KUBIYA_API_KEY=${KUBIYA_API_KEY}
      - CONTROL_PLANE_URL=https://control-plane.kubiya.ai
      - QUEUE_ID=docker-queue
      - LOG_LEVEL=INFO
      - HEARTBEAT_INTERVAL=30
    volumes:
      - ./logs:/root/.kubiya/workers/logs
    logging:
      driver: "json-file"
      options:
        max-size: "100m"
        max-file: "10"
Start with:
docker-compose up -d
docker-compose logs -f
Kubernetes Mode
Scalable production deployment with high availability.
Basic Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kubiya-worker
  namespace: kubiya
spec:
  replicas: 3
  selector:
    matchLabels:
      app: kubiya-worker
  template:
    metadata:
      labels:
        app: kubiya-worker
    spec:
      containers:
        - name: worker
          image: ghcr.io/kubiyabot/agent-worker:latest
          command: ["kubiya", "worker", "start"]
          args:
            - "--queue-id=$(QUEUE_ID)"
            - "--type=local"
          env:
            - name: KUBIYA_API_KEY
              valueFrom:
                secretKeyRef:
                  name: kubiya-secrets
                  key: api-key
            - name: CONTROL_PLANE_URL
              value: "https://control-plane.kubiya.ai"
            - name: QUEUE_ID
              value: "production-queue"
            - name: LOG_LEVEL
              value: "INFO"
            - name: WORKER_HOSTNAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "2Gi"
              cpu: "1000m"
          livenessProbe:
            exec:
              command: ["pgrep", "-f", "worker.py"]
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            exec:
              command: ["pgrep", "-f", "worker.py"]
            initialDelaySeconds: 10
            periodSeconds: 5
---
apiVersion: v1
kind: Secret
metadata:
  name: kubiya-secrets
  namespace: kubiya
type: Opaque
stringData:
  api-key: "your-api-key-here"
Deploy:
# Create namespace
kubectl create namespace kubiya
# Apply manifests
kubectl apply -f kubiya-worker.yaml
# Scale deployment
kubectl scale deployment kubiya-worker -n kubiya --replicas=5
# View logs
kubectl logs -f deployment/kubiya-worker -n kubiya
# Check status
kubectl get pods -n kubiya
kubectl describe deployment kubiya-worker -n kubiya
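For autoscaling instead of manual `kubectl scale`, the same Deployment can be targeted by a HorizontalPodAutoscaler. The manifest below is a sketch (the replica bounds and the 70% CPU threshold are illustrative values, not recommendations from Kubiya):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: kubiya-worker
  namespace: kubiya
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: kubiya-worker
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

Note that HPA resource metrics require the metrics-server and the CPU requests already set on the Deployment.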
Configuration
Environment Variables
KUBIYA_API_KEY (string): API authentication key for the Control Plane
CONTROL_PLANE_URL (string, default "https://control-plane.kubiya.ai"): Control Plane base URL
CONTROL_PLANE_GATEWAY_URL (string): Override for the Control Plane URL (takes precedence)
QUEUE_ID (string): Worker queue identifier (must match a queue in the Control Plane)
Environment name for the worker
WORKER_HOSTNAME (string, default: auto-detected): Custom worker hostname for identification
HEARTBEAT_INTERVAL (integer): Heartbeat interval in seconds (15-300)
LOG_LEVEL (string): Logging level: DEBUG, INFO, WARN, ERROR
MAX_CONCURRENT_ACTIVITIES (integer): Maximum concurrent activity executions
MAX_CONCURRENT_WORKFLOWS (integer): Maximum concurrent workflow executions
Advanced Configuration
# Performance tuning
export MAX_CONCURRENT_ACTIVITIES=20
export MAX_CONCURRENT_WORKFLOWS=10
export ACTIVITY_TIMEOUT=600
# Custom control plane
export CONTROL_PLANE_GATEWAY_URL="https://cp.company.internal"
# Debug mode
export LOG_LEVEL=DEBUG
export KUBIYA_DEBUG=true
# Resource limits (Docker/K8s)
export MEMORY_LIMIT="2Gi"
export CPU_LIMIT="1000m"
# Start worker with configuration
kubiya worker start --queue-id=tuned-queue --type=local
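Inside a worker process, tuning variables like these are typically read with integer fallbacks. A minimal sketch (the default values here are assumptions, not the worker's documented defaults):

```python
import os

def int_env(name, default):
    """Read an integer environment variable, falling back on a default."""
    try:
        return int(os.environ.get(name, default))
    except ValueError:
        return default  # ignore malformed values rather than crash

os.environ["MAX_CONCURRENT_ACTIVITIES"] = "20"

max_activities = int_env("MAX_CONCURRENT_ACTIVITIES", 10)  # set -> 20
max_workflows = int_env("MAX_CONCURRENT_WORKFLOWS", 10)    # unset -> default
print(max_activities, max_workflows)  # -> 20 10
```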
Monitoring
Log Management
# View real-time logs
tail -f ~/.kubiya/workers/<queue-id>/logs/worker.log
# Search logs for errors
grep ERROR ~/.kubiya/workers/<queue-id>/logs/worker.log
# View last 100 lines
tail -n 100 ~/.kubiya/workers/<queue-id>/logs/worker.log
# Follow logs with filtering
tail -f ~/.kubiya/workers/<queue-id>/logs/worker.log | grep "Task completed"
Health Checks
# Check if worker is running
ps aux | grep "worker.py.*<queue-id>"
# Check daemon status
cat ~/.kubiya/workers/<queue-id>/daemon_info.txt
# Test connectivity to Control Plane
curl https://control-plane.kubiya.ai/health
# Verify Temporal connection (in worker logs)
grep "Connected to Temporal" ~/.kubiya/workers/<queue-id>/logs/worker.log
Metrics
Workers report the following metrics:
Task Execution: Success/failure counts, execution time
Resource Usage: CPU, memory, network
Queue Status: Pending tasks, poll rate
Health Status: Heartbeat success, connectivity
# View metrics in logs
tail -f ~/.kubiya/workers/<queue-id>/logs/worker.log | grep "metrics"
# Example output:
# [INFO] Metrics: tasks_completed=42, avg_duration=3.2s, memory_mb=512
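A metrics line like the example above can be picked apart with a small parser, e.g. for shipping values to a monitoring system. The key=value format is assumed from the example output shown here:

```python
import re

def parse_metrics(line):
    """Extract key=value metric pairs from a worker log line."""
    match = re.search(r"Metrics:\s*(.+)", line)
    if not match:
        return {}
    metrics = {}
    for pair in match.group(1).split(","):
        key, _, value = pair.strip().partition("=")
        metrics[key] = value
    return metrics

line = "[INFO] Metrics: tasks_completed=42, avg_duration=3.2s, memory_mb=512"
print(parse_metrics(line)["tasks_completed"])  # -> 42
```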
Troubleshooting
Worker Won’t Start
# Check Python version (requires 3.8+)
python3 --version
# Clear virtual environment
rm -rf ~/.kubiya/workers/<queue-id>/venv
# Start with debug logging
export LOG_LEVEL=DEBUG
kubiya worker start --queue-id=<queue-id> --type=local
# Check for port conflicts
lsof -i :7233 # Temporal port
Connection Issues
# Test Control Plane connectivity
curl -v https://control-plane.kubiya.ai/health
# Check API key
echo $KUBIYA_API_KEY
# Verify DNS resolution
nslookup control-plane.kubiya.ai
# Check firewall/proxy settings
echo $HTTP_PROXY
echo $HTTPS_PROXY
Worker Crashes
# Check crash logs
tail -n 500 ~/.kubiya/workers/<queue-id>/logs/worker.log | grep ERROR
# Increase memory limits (Docker/K8s)
# Edit deployment and set higher resource limits
# Enable auto-restart (daemon mode)
kubiya worker start --queue-id=<queue-id> --type=local --daemon
# Check for dependency issues
~/.kubiya/workers/<queue-id>/venv/bin/pip list
Task Execution Failures
# Check activity timeouts
export ACTIVITY_TIMEOUT=1200  # Increase to 20 minutes
# Verify tool availability
kubiya tool list --source <source-id>
# Check integration credentials
kubiya integration list
# Review execution logs
tail -f ~/.kubiya/workers/<queue-id>/logs/worker.log | grep "Activity failed"
Best Practices
Production Deployment
Use Daemon or Kubernetes Mode: Never run production workers in the foreground
Monitor Health: Set up alerting for heartbeat failures
Resource Limits: Configure appropriate CPU/memory limits
Log Rotation: Enable log rotation to prevent disk fill
Multiple Replicas: Run at least 2-3 workers for high availability
Security
✅ Store API keys in secrets (Kubernetes Secrets, AWS Secrets Manager)
✅ Rotate keys regularly (at least quarterly)
✅ Use network policies to restrict traffic
✅ Enable TLS for all communications
✅ Monitor access logs for suspicious activity
Performance
# Tune concurrency based on workload
export MAX_CONCURRENT_ACTIVITIES=50  # For high-throughput workloads
export MAX_CONCURRENT_WORKFLOWS=20
# Adjust heartbeat interval
export HEARTBEAT_INTERVAL=15  # More frequent for critical workers
# Keep the Python environment up to date
~/.kubiya/workers/<queue-id>/venv/bin/pip install --upgrade -r ~/.kubiya/workers/<queue-id>/requirements.txt
Scaling Strategy
Vertical Scaling: Increase resources per worker
Horizontal Scaling: Add more worker replicas
Multi-Queue: Split workloads across dedicated queues
# Increase resources per worker
resources:
  requests:
    memory: "2Gi"
    cpu: "1000m"
  limits:
    memory: "4Gi"
    cpu: "2000m"
Command Reference
# Start worker
kubiya worker start --queue-id=<id> --type=<local|docker>
# Start daemon
kubiya worker start --queue-id=<id> --type=local --daemon
# Stop worker (daemon mode)
kubiya worker stop --queue-id=<id>
# View worker status
kubiya worker status --queue-id=<id>
# List all workers
kubiya worker list
# View logs
kubiya worker logs --queue-id=<id>
# Clear worker data
kubiya worker clean --queue-id=<id>
Next Steps