Workers are the compute engines that power Kubiya’s distributed execution. They run on your infrastructure—whether that’s a laptop, a server, or a Kubernetes cluster—and pull tasks from queues to execute agent workflows. The Kubiya CLI manages the entire worker lifecycle, from registration to health monitoring, making deployment and management straightforward.

What is a Worker?

A worker is a program that connects to Kubiya and executes tasks. Think of workers like delivery drivers waiting at a distribution center: when packages (tasks) arrive at the distribution center (task queue), available drivers (workers) pick them up and deliver them. Workers are designed for distributed execution, which means:
  • They run on your own machines, not in Kubiya’s cloud—giving you control over where your workloads execute
  • Multiple workers can share the workload for scalability—add more workers to handle more tasks in parallel
  • Workers poll task queues for work assignments—they actively check for new work and pull tasks when available
  • Each worker operates independently—if one worker goes down, others continue processing tasks
This architecture gives you the flexibility to scale execution across your infrastructure while maintaining security and control over where sensitive operations run.

Where Workers Run

One of the key strengths of workers is platform flexibility. You can deploy workers on virtually any compute infrastructure:

macOS

Perfect for local development and individual contributor machines. Developers can run workers on their MacBooks to test agent workflows in a local environment before deploying to production.

Windows

Ideal for desktop automation and Windows-specific tooling. If your agents need to interact with Windows applications or APIs, run workers on Windows servers or workstations.

Linux

The most common deployment target for production workloads. Run workers on Linux servers, VMs, or cloud instances (AWS EC2, Google Compute Engine, Azure VMs).

Kubernetes

Scalable production deployments with auto-scaling capabilities. Deploy workers as Kubernetes pods that automatically scale based on workload, with built-in health checks and rolling updates.

OpenShift

Enterprise Kubernetes distribution with additional security and governance features. Ideal for organizations with strict compliance requirements and existing OpenShift infrastructure.

This flexibility means you can:
  • Run workers locally during development
  • Deploy to production on Kubernetes for scale
  • Place workers inside private networks to reach internal systems
  • Use specialized hardware (GPU machines, high-memory servers) for specific workloads

Worker Lifecycle Management

The Kubiya CLI takes care of everything workers need to function—you don’t need to manually configure connections, manage credentials, or handle retries. Here’s what happens in a worker’s lifecycle:

1. Registration

When a worker starts, it authenticates with the Kubiya Control Plane using an API key. The Control Plane verifies the worker’s identity and sends back everything it needs: Temporal connection credentials, LLM gateway settings, and queue configuration.
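For example, registration can be as simple as supplying an API key and starting the worker. The KUBIYA_API_KEY environment variable name below is illustrative; check the Worker Management CLI Reference for the exact variable or flag your CLI version expects:
# Provide the API key the worker uses to authenticate (variable name is illustrative)
export KUBIYA_API_KEY="<your-api-key>"

# Start the worker; on registration it receives Temporal credentials,
# LLM gateway settings, and queue configuration from the Control Plane
kubiya worker start --queue-id=my-queue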

2. Polling

Once registered, the worker continuously checks its assigned task queue for new work. This is an active process—workers pull work when they’re ready, rather than having work pushed to them. This design prevents overloading workers and provides natural load balancing.

3. Execution

When a worker receives a task, it executes the agent workflow in an isolated environment. The worker streams execution logs and events in real time so you can monitor progress. If a task fails, the worker handles retry logic automatically.

4. Health Monitoring

Every 30 seconds (configurable), workers send heartbeats to the Control Plane reporting their status, how many tasks they’re processing, and system health metrics. This allows the Control Plane to route tasks only to healthy workers.

5. Shutdown

When a worker shuts down (planned or unplanned), it drains in-flight tasks gracefully whenever possible, ensuring tasks complete before the worker terminates.

The key takeaway: the Kubiya CLI handles registration, configuration, credentials, retries, and health monitoring automatically. You just start the worker; everything else is managed for you.

Worker Registration Flow

Here’s a visual representation of how workers register and begin processing tasks. The flow shows the complete lifecycle, from starting a worker with a single CLI command to continuous task execution and health monitoring.

Key Steps:
  1. CLI starts the worker with kubiya worker start --queue-id=<queue-id>
  2. Worker authenticates with the Control Plane using an API key
  3. Control Plane sends configuration including Task Queue ID, Temporal credentials, and LLM settings
  4. Worker connects to the Task Queue and begins polling for tasks
  5. Continuous execution loop where the worker pulls tasks, executes agent workflows, and reports results
  6. Health monitoring via periodic heartbeats every 30 seconds
Platform Support: Workers run on macOS, Windows, Linux, Kubernetes, and OpenShift, giving you the flexibility to deploy on any infrastructure.

Quick Start

Getting started with workers is straightforward. Here’s how to run your first worker:
# 1. Install the Kubiya CLI
brew install kubiya-cli

# 2. Authenticate with your Kubiya account
kubiya auth login

# 3. Start a worker (one command!)
kubiya worker start --queue-id=my-queue
That’s it! The CLI handles everything automatically:
  • Creates a Python virtual environment
  • Installs dependencies
  • Registers with the Control Plane
  • Connects to the task queue
  • Begins polling for tasks
  • Streams logs to your terminal
For production deployments or different configurations, see the deployment modes below.

Deployment Modes

Workers support multiple deployment modes to fit different use cases:

Local Mode

For development and testing. Runs in the foreground with live logging, perfect for debugging agent workflows during development. Use when: You’re developing agents locally and want to see real-time output.

Daemon Mode

For production deployments on a single server. Runs in the background as a daemon process with automatic restart on crashes and log rotation. Use when: You need a worker running continuously on a single machine without manual intervention.
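As a sketch, one way to keep a worker running unattended on a Linux server is a systemd unit wrapping the same kubiya worker start command from the Quick Start. This is illustrative rather than an official template; the CLI's own daemon flags and log handling are documented in the Worker Management CLI Reference:
# Illustrative systemd unit; adjust the binary path, user, and environment for your host
sudo tee /etc/systemd/system/kubiya-worker.service <<'EOF'
[Unit]
Description=Kubiya worker
After=network-online.target

[Service]
# API key variable name is illustrative; see the CLI reference for specifics
Environment=KUBIYA_API_KEY=<your-api-key>
ExecStart=/usr/local/bin/kubiya worker start --queue-id=prod-queue
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now kubiya-worker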

Docker Mode

For containerized deployments with complete environment isolation. Package workers as Docker containers for portable, reproducible deployments. Use when: You want container isolation or need to deploy workers across different environments with consistent behavior.
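A minimal sketch of running a worker in a container is shown below; the image name is a placeholder for whatever image packages the Kubiya CLI in your environment, and the API key variable name is illustrative:
# Placeholder image name; use an image that has the Kubiya CLI installed
docker run -d \
  --name kubiya-worker \
  --restart unless-stopped \
  -e KUBIYA_API_KEY="<your-api-key>" \
  your-registry/kubiya-worker:latest \
  kubiya worker start --queue-id=prod-queue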

Kubernetes Mode

For scalable production deployments with high availability. Deploy workers as Kubernetes pods with horizontal auto-scaling, health checks, and rolling updates. Use when: You need production-scale deployment with automatic scaling based on workload and enterprise-grade reliability.
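As a rough sketch, workers can run as a Kubernetes Deployment applied inline with kubectl. The image and Secret names below are placeholders; official manifests, probes, and autoscaling settings are covered in the Worker Management CLI Reference:
# Illustrative Deployment; image and Secret names are placeholders
cat <<'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kubiya-worker
spec:
  replicas: 3
  selector:
    matchLabels:
      app: kubiya-worker
  template:
    metadata:
      labels:
        app: kubiya-worker
    spec:
      containers:
        - name: worker
          image: your-registry/kubiya-worker:latest   # placeholder image
          command: ["kubiya", "worker", "start", "--queue-id=prod-queue"]
          env:
            - name: KUBIYA_API_KEY                    # variable name is illustrative
              valueFrom:
                secretKeyRef:
                  name: kubiya-worker-secrets
                  key: api-key
EOF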
For detailed deployment guides, YAML configurations, environment variables, troubleshooting steps, and advanced patterns, see the Worker Management CLI Reference.

Workers vs Task Queues

It’s important to understand the relationship between workers and task queues:
Concept | What It Is | Analogy | Purpose
Task Queue | A waiting area for tasks that need execution | Restaurant order queue | Holds and distributes work
Worker | The execution engine that processes tasks | Kitchen staff preparing orders | Executes and completes work
The relationship: Task queues hold the work; workers do the work.
  • One queue, multiple workers: This is the standard pattern for horizontal scaling. Multiple workers attached to the same queue process tasks in parallel, increasing throughput.
  • One worker, multiple queues: A single worker can be attached to multiple queues if it has capacity, allowing it to handle different types of work.
  • Queue-based routing: Tasks are routed to specific queues based on environment (dev/staging/prod), priority (high/low), or other criteria. Workers attached to those queues process the relevant tasks.
Learn more about task queues and how they distribute work in the Task Queues documentation.

Common Patterns

Here are typical deployment patterns you’ll see in production:

Single Queue, Multiple Workers

The most common pattern for scaling throughput. Create one task queue and attach multiple workers to it. As workload increases, add more workers to handle the load. Kubernetes deployments can auto-scale workers based on CPU or custom metrics. Example: A production queue with 10 workers handling agent workflows. During peak hours, scale to 20 workers automatically.
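If the workers run as the Kubernetes Deployment sketched earlier, a HorizontalPodAutoscaler can handle that peak-hours scale-out; the numbers below mirror the example and are illustrative:
# Scale the illustrative kubiya-worker Deployment between 10 and 20 replicas based on CPU load
kubectl autoscale deployment kubiya-worker --min=10 --max=20 --cpu-percent=70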

Environment-Specific Queues

Separate queues for different environments to isolate workloads and prevent production issues from affecting development. Example: Three queues—dev-queue, staging-queue, prod-queue—each with its own workers running in the appropriate environment.
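In practice this just means starting each environment's workers against its own queue ID, for example:
# On a developer machine
kubiya worker start --queue-id=dev-queue

# On the staging host
kubiya worker start --queue-id=staging-queue

# On production infrastructure
kubiya worker start --queue-id=prod-queue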

Specialized Workers

Dedicate workers with specific hardware or configurations for particular workload types. Example:
  • GPU workers for machine learning inference tasks
  • High-memory workers for data processing pipelines
  • Workers with access to specific internal APIs for integration workflows

Hybrid Deployment

Combine different deployment modes for flexibility and cost optimization. Example:
  • Kubernetes workers for production scale and auto-scaling
  • Local workers for developers during feature development
  • Daemon mode workers on edge servers for low-latency regional processing

Next Steps