Quick Start
Execution Lifecycle
Every execution follows this lifecycle: Submit Request → Planning → Execute → Complete
- Submit Request: Prompt analysis, resource discovery
- Planning: Agent selection, cost estimation, plan creation
- Execute: Queue selection, worker assignment, streaming output
- Complete: Status update, response storage, graph updates
Lifecycle Stages
1. Submit Request
- User provides natural language prompt
- CLI authenticates with API key
- Request routed to control plane
2. Planning (Auto Mode)
- Resource Discovery: Fetches available agents, teams, environments
- Agent Selection: AI analyzes prompt and selects best agent/team
- Cost Estimation: Calculates token usage and estimated cost
- Plan Generation: Creates detailed execution plan
- User Approval: Shows plan for review (unless `--yes` is passed)
3. Execute
- Queue Selection: Chooses worker queue (persistent or ephemeral)
- Worker Assignment: Assigns task to available worker in queue
- Execution: Worker runs task using agent/team configuration
- Streaming: Real-time output streamed to CLI
- Context Updates: Updates context graph with results
4. Complete
- Status Update: Marks execution as complete/failed
- Response Storage: Stores full response in database
- Graph Updates: Updates knowledge graph with execution metadata
- Cleanup: For ephemeral workers, deletes queue and terminates process
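Taken together, the four stages map onto a single CLI invocation. A hypothetical example (the `kubiya run` form and the prompt argument are assumptions; `--yes` and `--output` are the flags documented below):

```shell
# Submit a natural-language task. Planning, worker assignment, and
# cleanup all happen behind this one command; --yes skips the
# plan-approval prompt and --output json makes the result parseable.
kubiya run "summarize open incidents from the last 24 hours" \
  --yes --output json
```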
Worker Deployment Modes
Choose between two deployment strategies based on your needs.
Persistent Workers
Best for: Production environments and high-frequency tasks
Long-running workers that stay active and handle multiple executions:
- ⚡ Instant task startup (dependencies already installed)
- 🔄 Handles multiple tasks sequentially
- 📦 Pre-warmed environment ready to execute
- 🎯 Best for production workloads
Use cases:
- Production agent deployments
- Frequent executions throughout the day
- Shared team resources
- Mission-critical agents
Ephemeral Workers
Best for: CI/CD pipelines and one-off tasks
Short-lived workers created on-demand for single executions:
- 💰 Pay only when running (no idle costs)
- 🧹 Automatic cleanup after execution
- 🔒 Isolated environment per execution
- 🚀 Perfect for CI/CD pipelines
Use cases:
- CI/CD pipeline integration
- Development and testing
- Occasional or scheduled tasks
- Local execution from CLI
Quick Comparison
| | Persistent Workers | Ephemeral Workers |
|---|---|---|
| Startup | Instant (pre-warmed) | 10-120s first run, then cached |
| Lifecycle | Always running | Created → Execute → Cleanup |
| Cost | Continuous resource usage | Pay per execution |
| Best For | Production, frequent use | CI/CD, occasional tasks |
| Cleanup | Manual management | Automatic (5 min TTL) |
Local Execution Mode
Run tasks with an ephemeral worker directly from your CLI or CI/CD pipeline.
How It Works
When you run with `--local`, here’s what happens:
- Queue Setup - CLI creates a temporary queue with 5-minute auto-cleanup
- Worker Launch - CLI starts a local Python worker process
- Task Execution - Worker runs in your specified directory with full file access
- Stream Results - Real-time output appears in your terminal
- Auto Cleanup - Worker exits and queue is automatically deleted
- 📁 Full access to local files and git repository
- 🔄 Works from any directory with `--cwd`
- ⚡ Fast for cached runs (dependencies installed once)
- 🧹 Zero cleanup needed (automatic)
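As a sketch, a local run might look like this (the `kubiya run` subcommand is an assumption; `--local`, `--cwd`, and `--yes` are the flags described in this section):

```shell
# Ephemeral local worker: a temporary queue is created, the task runs
# in the current directory, output streams to the terminal, and the
# queue is deleted automatically afterwards.
kubiya run "fix the failing lint errors" --local --cwd "$(pwd)" --yes
```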
Working Directory Support
The `--cwd` flag sets the execution working directory:
- Local File Access: Worker can read/write files in specified directory
- Git Integration: Access to .git directory and git commands
- Project Context: Reads package.json, requirements.txt, etc.
- Pipeline Integration: CI/CD jobs can pass their working directory
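For example, pointing the worker at a specific project directory (again assuming a `kubiya run` invocation):

```shell
# The worker reads and writes files under /path/to/project, can run
# git commands against its .git directory, and picks up project
# context such as package.json or requirements.txt.
kubiya run "update the changelog from recent commits" \
  --local --cwd /path/to/project --yes
```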
Pipeline Integration
Local mode is perfect for CI/CD pipelines. Here are examples for popular platforms:
GitHub Actions
GitLab CI
CircleCI
- `--local`: Creates an ephemeral worker in the CI environment
- `--cwd`: Sets the working directory to the pipeline workspace
- `--yes`: Skips approval prompts for automation
- `KUBIYA_NON_INTERACTIVE="true"`: Ensures non-interactive mode
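The same shell step works on each of these platforms. A sketch, assuming the CLI is invoked as `kubiya run` and authentication is provided via a CI secret:

```shell
# Works as a script step in GitHub Actions, GitLab CI, or CircleCI.
export KUBIYA_NON_INTERACTIVE="true"

# Each platform exposes its workspace differently:
# $GITHUB_WORKSPACE (GitHub Actions), $CI_PROJECT_DIR (GitLab CI).
WORKSPACE="${GITHUB_WORKSPACE:-${CI_PROJECT_DIR:-$PWD}}"

kubiya run "run only the tests affected by this commit" \
  --local --cwd "$WORKSPACE" --yes --output json
```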
Real-World Use Case: Smart Test Selection
Problem: Traditional CI runs ALL tests on every commit, even when only one module changed.
Solution: Kubiya analyzes changes and runs only relevant tests.
Example from a Node.js project with modular test suites:
- Single module change: Run 8/36 tests (78% faster)
- Documentation change: Run 0/36 tests (100% time saved)
- Average savings: 77% time reduction
- Git Diff Analysis: `git diff --name-only` shows `src/tasks/tasks.js`
- Module Detection: Recognizes the “tasks” module changed
- Test Selection: Runs `npm run test:tasks` (8 tests)
- Skip Irrelevant: Skips the projects, comments, tags, and search tests (28 tests)
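The selection steps above can be sketched in plain shell; the module-to-script mapping below is hypothetical (the real analysis is done by the agent), but it shows the shape of the decision:

```shell
#!/bin/sh
# Map one changed file path to the test command for its module.
# Non-source changes (docs, config) select no tests at all.
select_tests() {
  case "$1" in
    src/tasks/*)    echo "npm run test:tasks" ;;
    src/projects/*) echo "npm run test:projects" ;;
    src/comments/*) echo "npm run test:comments" ;;
    src/*)          echo "npm test" ;;  # unknown module: run everything
    *)              : ;;                # doc/config change: skip tests
  esac
}

# In CI the input would come from: git diff --name-only HEAD~1
select_tests "src/tasks/tasks.js"   # prints: npm run test:tasks
select_tests "docs/README.md"       # prints nothing
```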
Execution Modes
Auto-Planning Mode (Recommended)
Automatically selects agent/team and creates plan:
- Analyzes task requirements
- Selects best agent or team
- Estimates cost and time
- Shows detailed plan
- Asks for approval
Local Mode
Execute with an ephemeral local worker:
- `--local`: Use an ephemeral worker
- `--cwd PATH`: Set the working directory
- `--environment ID`: Specify the environment
- `--package-source SOURCE`: Custom worker package
- `--local-wheel PATH`: Local wheel file (development)
Direct Mode
Skip planning and execute directly:
- Testing specific agents
- Bypassing plan approval
- Scripting and automation
Plan File Mode
Execute from a saved plan.
Command Reference
Global Flags
- `--yes`, `-y`: Auto-approve plan (skip confirmation)
- `--output`, `-o`: Output format (`text`, `json`, `yaml`)
- `--non-interactive`: Skip all prompts
- `--priority`: Task priority (`low`, `medium`, `high`, `critical`)
- `--save-plan PATH`: Custom plan save location
Local Mode Flags
- `--local`: Enable local execution mode
- `--cwd PATH`: Working directory for execution
- `--environment ID`: Environment ID
- `--package-source SOURCE`: Worker package source (PyPI version, git URL, GitHub ref)
- `--local-wheel PATH`: Path to a local wheel file (development)
Queue Flags
- `--queue ID`: Specific worker queue ID(s)
- `--queue-name NAME`: Queue selection by name
Advanced Flags
- `--parent-execution ID`: Parent execution for conversation continuation
Environment Variables
Best Practices
Local Execution
- Use `--local` for CI/CD pipelines
- Always specify `--cwd` for file access
- Set `KUBIYA_NON_INTERACTIVE=true` in pipelines
- Use `--yes` to skip approval prompts
- Capture `--output json` for result parsing
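Combined, these practices give a pipeline step like the following; the `kubiya run` form and the `"status"` field in the JSON response are assumptions, not documented behavior:

```shell
export KUBIYA_NON_INTERACTIVE="true"

# Capture machine-readable output for later steps.
result="$(kubiya run "triage new error reports" \
  --local --cwd "$PWD" --yes --output json)"

# Fail the job unless the execution completed; the exact field name
# in the response is an assumption, so adjust to the real schema.
printf '%s' "$result" | grep -q '"status": *"complete"' || exit 1
```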
Persistent Workers
- Use for production environments
- Pre-deploy workers to reduce latency
- Monitor worker health and capacity
- Scale workers based on load
Planning
- Review plans before approval in production
- Save important plans for reproducibility
- Use `--priority` for urgent tasks
- Check cost estimates for large operations
Error Handling
- Parse `--output json` for status codes
- Implement retries for transient failures
- Monitor execution logs
- Set up alerts for failed executions
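A minimal retry wrapper for transient failures, in plain POSIX shell; the command passed to `retry` is a placeholder for the real CLI invocation:

```shell
#!/bin/sh
# Retry a command up to 3 times, waiting a little longer each attempt.
retry() {
  attempts=0
  until "$@"; do
    attempts=$((attempts + 1))
    if [ "$attempts" -ge 3 ]; then
      echo "giving up after $attempts attempts" >&2
      return 1
    fi
    sleep "$attempts"  # back off: 1s, then 2s
  done
}

# Usage with a placeholder command (substitute the real invocation):
retry true && echo "succeeded"
```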