Key Features
- Python-native DSL – Define workflows using familiar Python syntax
- Chain and graph workflows – Model simple linear pipelines or complex dependency graphs
- Rich executors – Run shell, Python, Docker build/run, HTTP/SSH calls, tools, and agents
- First-class data flow – Pass parameters, env vars, secrets, and step outputs between steps
- Operational controls – Configure scheduling, queues, timeouts, retries, and notifications
- Testable and reviewable – Store workflows in Git and exercise them in unit/integration tests
Quick Start
Simple Workflow
chain(...) creates a sequential workflow where steps run in the order you define them. For more complex topologies you can use graph(...) and explicit dependencies (see below).
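For example, a minimal two-step chain might look like this (a sketch; the import path below is a placeholder for wherever chain lives in your installation):

```python
# Placeholder import path - adjust to your package layout.
from kubiya_workflow_sdk.dsl import chain

wf = (
    chain("hello-world")
    .description("Print a greeting, then the current time")
    .step("greet", "echo 'Hello from the DSL'")
    .step("timestamp", "date -u")
)

print(wf.to_yaml())  # inspect the compiled definition
```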
Multi-step deployment workflow
The workflow below builds, tests, and deploys a service, capturing the final step's output as DEPLOYMENT_STATUS for later reporting.
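A sketch of what such a workflow can look like; the service name, commands, and parameter values are illustrative:

```python
from kubiya_workflow_sdk.dsl import chain  # placeholder import path

deploy = (
    chain("deploy-service")
    .params(SERVICE="api", ENVIRONMENT="staging")
    .step("build", "docker build -t ${SERVICE}:latest .")
    .step("test", "pytest tests/")
    .step(
        "deploy",
        callback=lambda s: (
            s.shell("kubectl -n ${ENVIRONMENT} rollout restart deploy/${SERVICE}")
             .output("DEPLOYMENT_STATUS")  # captured for the report step below
        ),
    )
    .step("report", "echo 'Deployment finished: {{DEPLOYMENT_STATUS}}'")
)
```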
Core Concepts
1. Workflows
A workflow is a container for steps plus all of the operational configuration around them.
- Type – chain for simple linear flows, graph when you want explicit dependencies
- Scheduling – Cron-style schedules for periodic workflows
- Environment and parameters – Shared configuration for all steps in the workflow
- Queues and concurrency – Control how many workflow runs and steps can be active at once
- Runners – Choose where the workflow executes (e.g., a specific worker pool)
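Put together, a workflow's operational surface might look like this (a sketch; the runner and queue names are illustrative):

```python
from kubiya_workflow_sdk.dsl import chain  # placeholder import path

nightly = (
    chain("nightly-cleanup")
    .description("Purge expired sessions every night")
    .schedule("0 3 * * *")                  # cron: 03:00 UTC daily
    .env(LOG_LEVEL="info")                  # shared by every step
    .queue("maintenance", max_active_runs=1)
    .runner("default-pool")                 # illustrative runner name
    .timeout(1800)
    .step("purge", "python purge_sessions.py")
)
```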
2. Steps and executors
Steps are the building blocks of a workflow. Each step describes what should run (shell, Python, Docker, HTTP, tool, agent, etc.) and how it should behave (env, timeouts, retries, dependencies).
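For example, a single step can pair an executor with behavioral settings via the callback form of .step(...) (a sketch; wf is a workflow under construction, as in the Quick Start):

```python
# wf: a workflow built with chain(...) or graph(...)
wf.step(
    "migrate",
    callback=lambda s: (
        s.shell("alembic upgrade head")        # what to run
         .env(DATABASE_URL="${DATABASE_URL}")
         .timeout(300)                         # how to behave: 5-minute cap...
         .retries(2)                           # ...and up to two retries
    ),
)
```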
3. Dependencies and workflow shape
In chain workflows, steps run sequentially in the order they are declared. In graph workflows, you can create complex dependency graphs using depends(...).
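A sketch of a diamond-shaped graph, where two steps fan out from a fetch step and join in a publish step:

```python
from kubiya_workflow_sdk.dsl import graph  # placeholder import path

wf = (
    graph("fan-out-fan-in")
    .step("fetch", "curl -sO https://example.com/data.csv")
    .step("clean", callback=lambda s: s.shell("python clean.py").depends("fetch"))
    .step("stats", callback=lambda s: s.shell("python stats.py").depends("fetch"))
    .step(
        "publish",
        callback=lambda s: s.shell("python publish.py").depends("clean", "stats"),
    )
)
```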
4. Parameters, outputs, and data flow
Parameters and outputs make it easy to pass data between steps without hard-coding values. Use ${PARAM_NAME} for parameters and {{OUTPUT_NAME}} for values produced by previous steps.
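For instance (VERSION and BUILD_ID are illustrative names):

```python
wf = (
    chain("release")
    .params(VERSION="1.0.0")  # referenced below as ${VERSION}
    .step(
        "build",
        callback=lambda s: s.shell("./build.sh ${VERSION}").output("BUILD_ID"),
    )
    # {{BUILD_ID}} resolves to the output captured by the previous step
    .step("announce", "echo 'Built {{BUILD_ID}} for version ${VERSION}'")
)
```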
Step Types
The DSL ships with several built-in executors so you can describe most real-world automation tasks without custom glue code.
Shell commands
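A minimal sketch (paths and commands are illustrative):

```python
# wf: a workflow built with chain(...) as in the Quick Start
wf.step(
    "archive-logs",
    callback=lambda s: (
        s.shell("tar czf /tmp/logs.tar.gz .")
         .dir("/var/log/myapp")   # run from the log directory
         .shell_type("bash")
    ),
)
```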
Python code
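A minimal sketch:

```python
# wf: a workflow built with chain(...) as in the Quick Start
wf.step(
    "summarize",
    callback=lambda s: s.python(
        """
import json
print(json.dumps({"status": "ok", "rows": 42}))
"""
    ).output("SUMMARY"),  # downstream steps can read {{SUMMARY}}
)
```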
Native Docker build and run
Build images and run containers as first-class workflow steps.
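A sketch using the documented signatures (extra **config options omitted; the image name is illustrative):

```python
# wf: a workflow built with chain(...) as in the Quick Start
wf.step("build-image", callback=lambda s: s.docker_build("myapp:latest"))
wf.step(
    "container-tests",
    callback=lambda s: s.docker("myapp:latest", command="pytest -q").timeout(900),
)
```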
HTTP and SSH executors
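Call HTTP endpoints or run remote commands over SSH directly from steps. For example (the host, user, and URL are illustrative):

```python
# wf: a workflow built with chain(...) as in the Quick Start
wf.step(
    "health-check",
    callback=lambda s: s.http("https://example.com/healthz").output("HEALTH"),
)
wf.step(
    "restart-remote",
    callback=lambda s: s.ssh(
        host="10.0.0.5", user="deploy", command="sudo systemctl restart myapp"
    ),
)
```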
Agents and LLM-powered steps
You can embed Kubiya agents or inline agents directly into workflows to handle free-form tasks like incident analysis, ticket triage, or runbook execution.
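A sketch; .agent(...) accepts further configuration not shown here, and the keyword names below are assumptions, not the confirmed signature:

```python
# wf: a workflow built with chain(...) as in the Quick Start
wf.step(
    "triage",
    callback=lambda s: s.agent(
        name="incident-analyst",  # an existing agent (illustrative name)
        message="Summarize the alerts in {{ALERT_PAYLOAD}} and suggest next steps",
    ).output("TRIAGE_REPORT"),
)
```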
Tools and bounded services
Define tools inline or reference existing tools, and attach temporary services like databases or caches for testing. Use with_database, with_cache, or with_message_queue to spin up temporary dependencies for tests and data migrations.
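For example (the tool name and the service keyword arguments are illustrative):

```python
# wf: a workflow built with chain(...) as in the Quick Start
wf.step(
    "seed-test-data",
    callback=lambda s: (
        s.tool("seed-database", args={"rows": 1000})  # existing tool, by name
         .with_database(image="postgres:16")          # ephemeral DB for this step
    ),
)
```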
Advanced Features
Conditional and guarded execution
Use preconditions to control when a step should run, based on parameters, outputs, or external checks.
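For example (the condition expressions below are illustrative; the exact syntax accepted by .preconditions(...) is not specified here):

```python
# wf: a workflow built with chain(...) as in the Quick Start
wf.step(
    "deploy-prod",
    callback=lambda s: (
        s.shell("./deploy.sh production")
         .preconditions(
             "${ENVIRONMENT} == production",  # parameter check
             "{{TESTS_PASSED}} == true",      # gate on an earlier step's output
         )
    ),
)
```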
Error handling, retries, and continue-on
Configure how steps behave on failure so workflows are resilient but still predictable. Use repeat(...) for polling patterns (for example, waiting until a condition in an external system becomes true).
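A sketch; the keyword arguments to retry(...), repeat(...), and continue_on(...) are illustrative assumptions:

```python
# wf: a workflow built with chain(...) as in the Quick Start
wf.step(
    "wait-for-green",
    callback=lambda s: (
        s.http("https://example.com/status")
         .retry(limit=5)             # bounded retries on failure
         .repeat(interval=30)        # poll every 30 seconds
         .continue_on(failure=True)  # let the workflow proceed regardless
    ),
)
```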
Tool definitions with services
Inline tools can be paired with temporary services such as databases to build realistic ephemeral environments.
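A sketch; the tool fields follow the (image, content, arguments, services) shape noted in the reference below, and the service keyword is illustrative:

```python
# wf: a workflow built with chain(...) as in the Quick Start
wf.step(
    "migration-dry-run",
    callback=lambda s: s.tool_def(
        name="run-migrations",            # inline tool (illustrative fields)
        image="python:3.12-slim",
        content="python manage.py migrate",
    ).with_service(image="postgres:16"),  # ephemeral database next to the tool
)
```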
Complete Example
The example below shows a realistic CI/CD workflow that builds a Docker image, runs tests inside a container, deploys to Kubernetes, and notifies a webhook.
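A hedged sketch of that pipeline; the image names, commands, and webhook URL are illustrative, and the import path is a placeholder:

```python
from kubiya_workflow_sdk.dsl import chain  # placeholder import path

pipeline = (
    chain("ci-cd-pipeline")
    .description("Build, test, deploy, and notify")
    .params(IMAGE="myapp", TAG="latest", NAMESPACE="staging")
    .env(REGISTRY="registry.example.com")
    .step(
        "build",
        callback=lambda s: s.docker_build("${REGISTRY}/${IMAGE}:${TAG}"),
    )
    .step(
        "test",
        callback=lambda s: (
            s.docker("${REGISTRY}/${IMAGE}:${TAG}", command="pytest -q")
             .timeout(900)
             .retries(1)
        ),
    )
    .step(
        "deploy",
        callback=lambda s: (
            s.shell(
                "kubectl -n ${NAMESPACE} set image deploy/${IMAGE} "
                "${IMAGE}=${REGISTRY}/${IMAGE}:${TAG}"
            ).output("DEPLOY_RESULT")
        ),
    )
    .step(
        "notify",
        callback=lambda s: s.http(
            "https://hooks.example.com/deployments",
            method="POST",
            body='{"result": "{{DEPLOY_RESULT}}"}',
        ).continue_on(failure=True),  # a failed notification shouldn't fail the run
    )
)

print(pipeline.to_json())
```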
DSL Methods Reference
Workflow methods
| Method | Description |
|---|---|
| .description(text) | Set a human-readable description for the workflow |
| .type("chain" \| "graph") | Explicitly set the workflow type (sequential vs dependency graph) |
| .schedule(cron) | Attach a cron schedule for periodic runs |
| .env(**variables) | Define shared environment variables for all steps |
| .params(**parameters) | Define workflow parameters with default values |
| .with_files(files) | Attach in-memory files (e.g., configs, scripts) to the workflow |
| .dotenv(*files) | Load environment from one or more .env files |
| .step(name, command=None, callback=fn) | Add a step (either a simple command or fully configured via callback) |
| .parallel_steps(name, items, command, max_concurrent=None) | Run the same command in parallel across multiple items |
| .sub_workflow(name, workflow, params=None) | Call another workflow as a sub-workflow step |
| .get_secret_step(name, secret_name, **kwargs) | Add a step that retrieves a secret by name |
| .runner(name) | Set which runner / worker group should execute the workflow |
| .queue(name, max_active_runs=None) | Assign the workflow to a queue and cap concurrent runs |
| .max_active_runs(limit) | Limit how many runs of this workflow may be active at once |
| .max_active_steps(limit) | Limit how many steps may run in parallel |
| .skip_if_successful(skip=True) | Skip execution if a successful run already exists for the period |
| .timeout(seconds) | Set a maximum runtime for the workflow |
| .cleanup_timeout(seconds) | Set a timeout for cleanup logic after the workflow ends |
| .delay(seconds) | Delay workflow start after it is triggered |
| .max_output_size(bytes) | Limit the size of captured output |
| .handlers(success=None, failure=None, exit=None, cancel=None) | Register lifecycle hooks that run on workflow events |
| .notifications(...) | Configure email notifications on success/failure |
| .tags(*tags) | Attach tags for search and organization |
| .group(name) | Assign the workflow to a logical group |
| .preconditions(*conditions) | Define workflow-level preconditions that must be satisfied before running |
| .to_dict() | Convert the workflow to a Python dictionary |
| .to_json(indent=2) | Convert the workflow definition to JSON |
| .to_yaml() | Convert the workflow definition to YAML |
| .compile(indent=2) | Compile to JSON (alias for to_json) |
| .validate() | Perform basic validation and return errors/warnings |
Step configuration methods
| Method | Description |
|---|---|
| .description(text) | Set a description for the step |
| .shell(command, **config) | Run a shell command with optional configuration (env, shell type, etc.) |
| .python(script) | Run a Python script inside the step |
| .docker(image, command=None, content=None) | Run a Docker container with an optional script or command |
| .docker_build(image, **config) | Build a Docker image (optionally from git) with advanced configuration |
| .docker_run(image, **config) | Run a container with resource limits, env, volumes, and Kubernetes options |
| .http(url, method="GET", headers=None, body=None) | Call an HTTP endpoint |
| .ssh(host, user, command, port=22, key_file=None) | Run a command over SSH on a remote host |
| .kubiya(url, method="GET", **config) | Call Kubiya APIs as part of a workflow step |
| .llm_completion(...) | Run an LLM completion (e.g., summarization, classification) as a step |
| .inline_agent(...) | Configure and run an inline agent for a single step |
| .agent(...) | Invoke an existing agent by name as a workflow step |
| .tool_def(...) | Define a tool inline (image, content, arguments, services) |
| .tool(name_or_tool, args=None, timeout=None, **kwargs) | Use an existing tool by name or Tool instance |
| .jq(query) | Process JSON output using a jq-style query |
| .args(**arguments) | Provide arguments for tool or executor configuration |
| .depends(*step_names) | Declare dependencies on other steps |
| .parallel(items, max_concurrent=None) | Run the same step across multiple items in parallel |
| .output(name) | Capture the step’s output into a named variable |
| .stdout(path) | Redirect standard output to a file |
| .stderr(path) | Redirect standard error to a file |
| .env(variables=None, **kwargs) | Attach environment variables to the step |
| .dir(path) | Set the working directory for the command |
| .shell_type(shell) | Use a specific shell (e.g., bash, sh) |
| .id(identifier) | Set a stable identifier for programmatic referencing |
| .preconditions(*conditions) | Add preconditions that must be met before the step runs |
| .retry(...) | Configure retry policy (limit, intervals, backoff, exit codes, etc.) |
| .repeat(...) | Configure repeat/polling behavior for the step |
| .continue_on(...) | Control when the workflow should continue despite failures |
| .timeout(seconds) | Set a timeout for the step |
| .retries(count) | Set a simple retry count |
| .signal_on_stop(signal) | Choose which signal to send when stopping the step |
| .mail_on_error(send=True) | Enable email notification when the step fails |
| .with_service(...) | Attach a bounded service (e.g., helper container) to a tool step |
| .with_database(...) | Attach a temporary database service for the step |
| .with_cache(...) | Attach a cache service (e.g., Redis) for the step |
| .with_message_queue(...) | Attach a message queue service (e.g., RabbitMQ) for the step |
Best Practices
The DSL is flexible enough to describe very large workflows. These guidelines help keep things readable and maintainable as they grow.
Keep Steps Atomic
Each step should do one thing well:
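For example:

```python
# One responsibility per step: failures are isolated and retries stay cheap.
wf.step("download", "curl -sO https://example.com/data.csv")
wf.step("validate", "python validate.py data.csv")
wf.step("load", "python load.py data.csv")

# Avoid a single opaque mega-step:
# wf.step("do-everything", "curl ... && python validate.py && python load.py")
```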
Use Meaningful Names
Prefer descriptive names over positional ones:
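For example:

```python
wf.step("build-api-image", "docker build -t api .")  # intent is obvious in logs
wf.step("run-unit-tests", "pytest tests/unit")

# Avoid names that only make sense by position:
# wf.step("step1", "docker build -t api .")
# wf.step("step2", "pytest tests/unit")
```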
Leverage Parameters
Make workflows reusable across environments and services:
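For example, the values given to .params(...) are defaults that callers can override per run:

```python
wf = (
    chain("restart-service")
    .params(SERVICE="api", ENVIRONMENT="staging")  # override at trigger time
    .step("restart", "kubectl -n ${ENVIRONMENT} rollout restart deploy/${SERVICE}")
)
```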
Add Descriptions
Document intent so others (and future you) can understand the workflow quickly:
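For example:

```python
wf = (
    chain("rotate-credentials")
    .description("Rotate DB credentials and restart dependent services")
    .step(
        "rotate",
        callback=lambda s: s.shell("./rotate.sh").description(
            "Generates a new password and updates the secret store"
        ),
    )
)
```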