Strategies and best practices for testing Kubiya workflows
Testing workflows ensures reliability and correctness before deploying to production. Learn how to test workflows locally, mock dependencies, and integrate with CI/CD pipelines.
```python
import pytest

from kubiya import ControlPlaneClient


@pytest.fixture
def client():
    """Create test client."""
    return ControlPlaneClient(api_key="test-api-key")


def test_workflow_validation(client):
    """Test that workflow structure is valid."""
    workflow_spec = {
        "name": "test-workflow",
        "steps": [
            {"name": "step1", "action": "echo", "params": {"message": "Hello"}}
        ],
    }

    # Validate workflow structure
    assert "name" in workflow_spec
    assert "steps" in workflow_spec
    assert len(workflow_spec["steps"]) > 0
```
When you build workflows with the Kubiya Workflow DSL (workflow, chain, or graph) instead of raw dicts, you can unit-test the compiled specification itself without running any containers or hitting the Control Plane. These tests never execute the workflow, but they catch structural issues before anything is submitted, which is useful for flagging breaking changes early, for example when someone renames a step or removes a queue configuration.

This mirrors how Workflow.validate() is implemented: it ensures the workflow has a name, at least one step, no duplicate step names, and that each step has either a command, run, or type field set. In chain mode it also flags explicit dependencies with warnings, which you can turn into failed tests if your team prefers to keep chains strictly sequential.

These tests stay fast and deterministic: they only touch in-memory Python objects, yet still protect you against accidental changes to your workflow shape, such as switching the execution type from chain to graph or dropping the schedule/queue configuration.
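The checks described above can be exercised locally even without the DSL installed. The sketch below re-implements the same rules over a plain dict spec; `validate_spec` and its error messages are illustrative stand-ins, not part of the Kubiya SDK.

```python
def validate_spec(spec: dict) -> None:
    """Standalone re-implementation of the checks described above.

    Illustrative only: mirrors what Workflow.validate() is documented
    to check, but is not part of the Kubiya SDK.
    """
    if not spec.get("name"):
        raise ValueError("workflow must have a name")
    steps = spec.get("steps", [])
    if not steps:
        raise ValueError("workflow must have at least one step")
    names = [step.get("name") for step in steps]
    if len(names) != len(set(names)):
        raise ValueError("step names must be unique")
    for step in steps:
        if not any(step.get(field) for field in ("command", "run", "type")):
            raise ValueError(f"step {step.get('name')!r} needs command, run, or type")


# A passing spec: unique step names, each step has a command.
good = {"name": "deploy", "steps": [{"name": "build", "command": "make build"}]}
validate_spec(good)  # no exception

# A failing spec: duplicate step names.
bad = {
    "name": "deploy",
    "steps": [
        {"name": "build", "command": "make build"},
        {"name": "build", "command": "make push"},
    ],
}
try:
    validate_spec(bad)
    raise AssertionError("expected a validation error")
except ValueError as err:
    assert "unique" in str(err)
```

Because the helper raises on the first problem it finds, each structural rule can be asserted in its own test case.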
```python
import pytest


@pytest.fixture(scope="session")
def setup_test_environment():
    """Setup test environment once per session."""
    print("\nSetting up test environment...")
    # Setup code
    yield
    print("\nTearing down test environment...")
    # Cleanup code


@pytest.fixture(scope="function")
def test_data():
    """Provide test data for each test."""
    data = {"test": "value"}
    yield data
    # Cleanup after each test
    data.clear()


def test_with_fixtures(setup_test_environment, test_data):
    """Test using fixtures."""
    assert test_data["test"] == "value"
```
```python
import pytest


@pytest.mark.unit
def test_unit_logic():
    """Fast unit test."""
    assert True


@pytest.mark.integration
def test_integration_with_api(client):
    """Integration test with real API."""
    result = client.datasets.list_datasets()
    assert isinstance(result, list)


@pytest.mark.e2e
@pytest.mark.slow
def test_full_workflow(client):
    """Slow end-to-end test."""
    # Complete workflow test
    pass
```
Run specific test types:
```bash
# Run only unit tests
pytest -m unit

# Run integration and e2e tests
pytest -m "integration or e2e"

# Skip slow tests
pytest -m "not slow"
```
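Custom markers such as unit, integration, e2e, and slow should be registered so pytest does not emit unknown-mark warnings (and `--strict-markers` does not fail the run). A minimal pytest.ini sketch, with descriptions as assumptions:

```ini
[pytest]
markers =
    unit: fast, isolated unit tests
    integration: tests that call the real API
    e2e: full end-to-end workflow tests
    slow: tests excluded from quick runs
```

The same `markers` list can live under `[tool.pytest.ini_options]` in pyproject.toml instead.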
```python
@pytest.fixture
def cleanup_datasets(client):
    """Clean up test datasets after tests."""
    created_datasets = []
    yield created_datasets
    # Cleanup
    for dataset_id in created_datasets:
        try:
            client.datasets.delete_dataset(dataset_id=dataset_id)
        except Exception:
            pass  # Already deleted or doesn't exist


def test_with_cleanup(client, cleanup_datasets):
    """Test that cleans up after itself."""
    dataset = client.datasets.create_dataset(name="temp", scope="user")
    cleanup_datasets.append(dataset['id'])
    # Test logic...
```
```python
def test_graceful_error_handling(client):
    """Test that errors are handled gracefully."""
    try:
        # Attempt invalid operation
        client.graph.intelligent_search(keywords="", max_turns=-1)
        assert False, "Should have raised an error"
    except Exception as e:
        # Verify error is handled appropriately
        assert "invalid" in str(e).lower() or "error" in str(e).lower()
```
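The try/except pattern above can be written more idiomatically with pytest.raises, which fails the test automatically if no exception is raised. The sketch below uses a hypothetical FakeGraphAPI stand-in that raises ValueError, since the concrete exception type raised by the real SDK is not specified here.

```python
import pytest


class FakeGraphAPI:
    """Hypothetical stand-in for client.graph, used only for illustration."""

    def intelligent_search(self, keywords, max_turns):
        if not keywords or max_turns < 1:
            raise ValueError("invalid search parameters")
        return []


def test_invalid_search_raises():
    graph = FakeGraphAPI()
    # pytest.raises fails the test if no exception is raised,
    # and `match` checks the error message in the same step.
    with pytest.raises(ValueError, match="invalid"):
        graph.intelligent_search(keywords="", max_turns=-1)
```

With the real client, swap FakeGraphAPI for the SDK exception type your version actually raises.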
```python
# ❌ BAD - Unclear test names
def test_1():
    pass


def test_workflow():
    pass


# ✅ GOOD - Descriptive test names
def test_dataset_creation_with_org_scope():
    """Test creating organization-scoped dataset."""
    pass


def test_intelligent_search_returns_relevant_results():
    """Test that intelligent search returns results with high relevance scores."""
    pass
```
```python
import pytest


# Requires the pytest-rerunfailures plugin (pip install pytest-rerunfailures)
@pytest.mark.flaky(reruns=3, reruns_delay=2)
def test_sometimes_fails():
    """Test that might fail due to external factors."""
    # Test that depends on external service availability
    pass
```