This guide walks through two practical agent use cases, CloudOps Engineer and Platform Engineer, showing how to configure them, when to use them, and how they fit into real operational workflows. These examples are designed as quickstarts so teams can create production-grade agents with minimal setup. Agents in Kubiya operate using Skills, optional MCP integrations, and configuration inherited from attached Environments. They run deterministic plans and can automate DevOps, SRE, platform, and workflow tasks in a repeatable and safe way.
When to use Agents for Engineering Workflows
Use these agents when you need to automate engineering tasks that are:
- Repetitive (provisioning, deployments, audits, checks).
- Standardized (pipeline operations, resource creation, validations).
- High-context (logs, metrics, environments, IaC).
- Triggered by users, workflows, chat commands, or webhooks.
Prerequisites
Before creating these agents ensure:
- At least one Environment is Ready
- That Environment has a Task Queue with at least one connected worker
- Required secrets exist (cloud credentials, registry tokens, access keys)
- Required Skills are enabled at the Environment level (shell, Docker, Python, etc.)
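The checklist above can be expressed as a small preflight check. This is a minimal sketch, assuming a hypothetical `EnvironmentSnapshot` record that you fill in by hand from whatever inventory you keep; neither the class nor the function is part of Kubiya's API.

```python
from dataclasses import dataclass, field


@dataclass
class EnvironmentSnapshot:
    """Hypothetical, hand-populated view of one Environment (not a Kubiya type)."""
    name: str
    status: str                 # e.g. "Ready"
    connected_workers: int      # workers connected to the Task Queue
    secrets: set[str] = field(default_factory=set)
    skills: set[str] = field(default_factory=set)


def missing_prerequisites(env, required_secrets, required_skills):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if env.status != "Ready":
        problems.append(f"environment '{env.name}' is {env.status}, not Ready")
    if env.connected_workers < 1:
        problems.append("task queue has no connected worker")
    for secret in sorted(set(required_secrets) - env.secrets):
        problems.append(f"missing secret: {secret}")
    for skill in sorted(set(required_skills) - env.skills):
        problems.append(f"skill not enabled: {skill}")
    return problems


# Example: a staging Environment with no worker and a missing secret and skill.
staging = EnvironmentSnapshot("staging", "Ready", 0, {"AWS_ACCESS_KEY"}, {"shell"})
for problem in missing_prerequisites(
    staging, ["AWS_ACCESS_KEY", "AWS_SECRET_KEY"], ["shell", "python"]
):
    print(problem)
```

Returning every problem at once, rather than stopping at the first, lets you fix the Environment in a single pass before creating either agent.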
CloudOps Engineer Agent
A CloudOps Engineer agent manages cloud infrastructure operations. It handles tasks such as provisioning, resource audits, monitoring, and deployment execution across providers such as AWS, GCP, or Azure.
What this agent does
A CloudOps Engineer typically performs:
- Cloud resource inspections and audits
- VM, Kubernetes, or serverless deployment actions
- Reading logs and metrics to surface anomalies
- Managing cloud accounts/tokens stored in Environments
- Responding to alerts or executing runbooks autonomously
When to use it
Choose this agent when you need:
- Multi-cloud operational support
- Automated cloud infrastructure checks
- On-demand remediation (restart nodes, scale resources)
- Daily/weekly cost or resource drift audits
- Integration with alerting workflows (Slack, webhooks)
Required configuration
This agent requires:
Environment-level prerequisites
- Cloud credentials (AWS IAM key, GCP service account, Azure SP)
- Skills:
  - Shell (Full Access or Restricted)
  - Docker (if running local tooling)
  - Python (Unrestricted or Restricted)
- Kubiya Platform APIs enabled
Optional MCP integrations
- Cloud provider-specific MCP servers
- GitHub/GitLab for IaC repos
- Terraform Cloud/Atlantis MCP
Step-by-step: Create a CloudOps Engineer agent
1. Basic Info
- Name: CloudOps Engineer
- Description: Manages cloud infrastructure. Performs audits, deployments, and operational diagnostics.
- Model: Claude Sonnet or Claude Opus for high-precision operational tasks
- Capabilities: cloudops, infrastructure, devops
2. Deployment
- Environments: Add a production, staging, or shared cloud environment
- Runtime: Default recommended
3. Execution Environment
Attach:
- Environment variables such as DEFAULT_REGION=us-east-1
- Secrets like AWS_ACCESS_KEY, AWS_SECRET_KEY, GCP_CREDENTIALS
- Integrations: GitHub if working with IaC repos
4. Tools (Skills & MCP)
Skills to enable:
- Shell – Full Access
- Python – Unrestricted
- Docker – Full Control
- Cloud provider MCP (aws-cli, gcloud, az)
- Terraform MCP
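It can help to keep the values from steps 1 through 4 in a reviewable record alongside your IaC. The sketch below is one way to do that, assuming a hypothetical `AgentSpec` dataclass invented for illustration; it is not a Kubiya schema and nothing here calls the platform. Secrets are listed by name only, since their values live in the Environment.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AgentSpec:
    """Hypothetical record of the form fields above; not a Kubiya schema."""
    name: str
    description: str
    model: str
    capabilities: tuple[str, ...]
    environments: tuple[str, ...]
    env_vars: dict[str, str]
    secret_names: tuple[str, ...]   # names only; values stay in the Environment
    skills: dict[str, str]          # skill -> access level
    mcp_servers: tuple[str, ...]


cloudops = AgentSpec(
    name="CloudOps Engineer",
    description=(
        "Manages cloud infrastructure. Performs audits, deployments, "
        "and operational diagnostics."
    ),
    model="Claude Sonnet",
    capabilities=("cloudops", "infrastructure", "devops"),
    environments=("staging",),
    env_vars={"DEFAULT_REGION": "us-east-1"},
    secret_names=("AWS_ACCESS_KEY", "AWS_SECRET_KEY", "GCP_CREDENTIALS"),
    skills={"shell": "Full Access", "python": "Unrestricted", "docker": "Full Control"},
    mcp_servers=("aws-cli", "gcloud", "az", "terraform"),
)

# Serialize for code review or change tracking.
print(json.dumps(asdict(cloudops), indent=2))
```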
5. Policies (after creation)
Example restrictions:
- Restrict shell to investigative commands only
- Allow deployments only in specific namespaces or regions
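To make the first restriction concrete, here is a minimal sketch of the decision logic behind "investigative commands only". It is plain Python for illustration, not a policy in Kubiya's own policy format, and the allowlist contents are assumptions you would replace with your own read-only tooling.

```python
import shlex

# Assumed allowlist of read-only binaries; adjust to your own tooling.
INVESTIGATIVE_COMMANDS = {"ls", "cat", "grep", "tail", "df", "ps"}

# Characters that enable chaining, piping, redirection, or substitution.
SHELL_METACHARACTERS = ";|&><`$"


def shell_allowed(command_line: str) -> bool:
    """Allow a command only if it is a single allowlisted binary invocation."""
    if any(ch in command_line for ch in SHELL_METACHARACTERS):
        return False
    try:
        tokens = shlex.split(command_line)
    except ValueError:  # e.g. unbalanced quotes
        return False
    return bool(tokens) and tokens[0] in INVESTIGATIVE_COMMANDS


print(shell_allowed("tail -n 50 /var/log/syslog"))
print(shell_allowed("cat notes.txt; rm notes.txt"))
```

Rejecting metacharacters outright is deliberately blunt: it blocks legitimate pipelines too, which is usually an acceptable trade while an agent's behavior is still being verified.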
How to use the CloudOps Engineer
Once created, you can use the agent in the following ways.
Chat-based
Ask:
- “Check unused EC2 instances older than 30 days.”
- “Run a drift check against staging Terraform.”
- “Analyze CloudWatch errors for service X.”
Workflow-based
Use this agent in automation:
- Daily cost audits
- Auto-remediation tasks
- Alert enrichment
- Deployment orchestration
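A recurring audit such as the unused-instance check ultimately reduces to a filter over an inventory. The sketch below shows that filter in isolation, assuming "unused" means a stopped instance and assuming hypothetical record keys (`id`, `state`, `launched`); how the agent actually gathers the inventory depends on the Skills and MCP servers you attach.

```python
from datetime import datetime, timedelta, timezone


def stale_instances(instances, now, max_age_days=30):
    """Return IDs of stopped instances launched more than max_age_days ago.

    `instances` is a list of dicts with hypothetical keys 'id', 'state',
    and 'launched' (a timezone-aware datetime).
    """
    cutoff = now - timedelta(days=max_age_days)
    return [
        inst["id"]
        for inst in instances
        if inst["state"] == "stopped" and inst["launched"] < cutoff
    ]


# Illustrative inventory; the IDs and dates are made up.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
inventory = [
    {"id": "i-old", "state": "stopped", "launched": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"id": "i-new", "state": "stopped", "launched": datetime(2024, 5, 20, tzinfo=timezone.utc)},
    {"id": "i-run", "state": "running", "launched": datetime(2024, 1, 1, tzinfo=timezone.utc)},
]
print(stale_instances(inventory, now))
```

Passing `now` in as an argument keeps the function deterministic, which makes a scheduled audit easy to test.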
Webhook-based
Trigger from cloud alerts or monitoring tools.
You can verify your agent by running the provided command with your agent ID.
Platform Engineer Agent
A Platform Engineer agent manages platform infrastructure, CI/CD tooling, internal developer platforms, and pipeline orchestration. It is ideal for automating internal platform tasks.
What this agent does
A Platform Engineer agent typically performs:
- Pipeline debugging and execution
- Infrastructure-as-code workflows (Terraform, Helm, Argo, Flux)
- Container builds and validations
- Deployments across clusters/environments
- Reading logs and analyzing build failures
- Managing platform lifecycle automation
When to use it
Use this agent when you need:
- Developer workflow automation
- Consistent CI/CD operations
- Standardized environment/platform management
- Automated validation of IaC or manifests
- Assistance with Kubernetes, Helm, Docker, Terraform tooling
Required configuration
Environment-level prerequisites
- CI/CD credentials
- Kubernetes API tokens
- Registry credentials
- Skills:
  - Shell – Full Access
  - Docker – Full Control
  - Full Diagramming Suite (optional for workflow visualization)
Optional MCP integrations:
- GitHub/GitLab
- ArgoCD MCP
- Kubernetes MCP
- Jenkins MCP
Step-by-step: Create a Platform Engineer agent
1. Basic Info
- Name: Platform Engineer
- Description: Automates platform tasks, pipelines, and deployments.
- Model: Claude Sonnet for performance or Claude Opus for deeper reasoning
- Capabilities: devops, platform, cicd
2. Deployment
- Environments: staging, prod, infra
- Supports multi-environment use
3. Execution Environment
Add:
- K8S_CONTEXT=prod-cluster
- Registry secrets
- GitHub integration for pipeline repos
4. Tools (Skills & MCP)
Recommended Skills:
- Docker – Full Control
- Shell – Full Access
- Full Diagramming Suite
- Python (optional)
- Kubernetes
- ArgoCD
- GitHub Repos
5. Policies
Define guardrails:
- Only deploy to specific namespaces
- Allow Docker operations only on controlled repos
- Restrict shell commands
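The first two guardrails can be sketched as a single decision function. This is illustrative Python only, not Kubiya's policy format; the namespace names and the registry prefix are placeholders, and "controlled repos" is interpreted here as image repositories, which is an assumption.

```python
# Placeholder values; replace with your own namespaces and registry prefixes.
ALLOWED_NAMESPACES = {"staging", "team-sandbox"}
CONTROLLED_REGISTRIES = ("registry.example.com/platform/",)


def deploy_decision(namespace: str, image: str) -> tuple[bool, str]:
    """Return (allowed, reason) for a proposed deployment."""
    if namespace not in ALLOWED_NAMESPACES:
        return False, f"namespace '{namespace}' is not allowlisted"
    # str.startswith accepts a tuple of prefixes.
    if not image.startswith(CONTROLLED_REGISTRIES):
        return False, f"image '{image}' is not from a controlled registry"
    return True, "allowed"


print(deploy_decision("staging", "registry.example.com/platform/api:1.4.2"))
print(deploy_decision("kube-system", "registry.example.com/platform/api:1.4.2"))
```

Returning a reason alongside the verdict gives the agent something useful to report back when a request is denied.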
How to use the Platform Engineer
Chat-based workflows
Ask the agent to:
- “Validate this Helm chart and show me errors.”
- “Deploy service X to staging.”
- “Explain why pipeline run #438 failed.”
Workflow-based automation
Attach it to:
- PR-triggered IaC validations
- Continuous deployment pipelines
- Automatic container scanning and fixes
Developer workflow self-service
Developers can delegate tasks via chat: “Create a temporary Kubernetes namespace for testing.”
Run the provided CLI command to verify your new agent.
Troubleshooting Both Agents
If CloudOps or Platform Engineer agents appear stuck:
- Ensure the Environment is in Ready state
- Confirm the Task Queue has at least one active worker
- Check that required Skills are attached
- Verify secrets and credentials exist at the correct level
- Review OPA policies for command or environment restrictions
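One way to work through this list is in order, stopping at the first check that fails. The sketch below assumes you supply each check yourself as a zero-argument function; the `state` dictionary and its keys are hypothetical stand-ins for whatever you read from the dashboard.

```python
from typing import Callable, Optional


def first_failing_check(
    checks: list[tuple[str, Callable[[], bool]]]
) -> Optional[str]:
    """Run checks in order and return the label of the first one that fails."""
    for label, check in checks:
        if not check():
            return label
    return None


# Hypothetical observations about a stuck agent.
state = {"env_status": "Ready", "active_workers": 0, "skills_attached": True}

checks = [
    ("Environment is Ready", lambda: state["env_status"] == "Ready"),
    ("Task Queue has an active worker", lambda: state["active_workers"] >= 1),
    ("Required Skills are attached", lambda: state["skills_attached"]),
]
print(first_failing_check(checks))
```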
Best Practices
- Keep Skills minimal; add more only when needed
- Store cloud tokens and CI credentials at the Environment level
- Use multiple Environments instead of duplicating agents
- Add policies once the agent’s behavior is verified
- Always start with safe read-only commands during testing
Next Steps
- Add alerts or workflows that trigger these agents
- Connect MCP integrations for deeper automation
- Build dashboards showing agent health and activity
- Add environment-specific OPA policies for guardrails