Semantic Search enables natural language queries across your cognitive memories. Unlike traditional keyword matching, semantic search understands the meaning and context of your queries, finding relevant information even when exact words don’t match.

How Semantic Search Works

Keyword search looks for exact word matches:
Query: "pod crash"
Matches: Documents containing exactly "pod" AND "crash"
Misses: "container restart", "service failure", "deployment error"
Semantic search understands meaning:
Query: "pod crash"
Matches:
  - "Kubernetes containers failing" (similarity: 0.87)
  - "Service restart loop detected" (similarity: 0.82)
  - "Deployment rollout errors" (similarity: 0.79)
  - "Memory limit OOMKilled" (similarity: 0.75)
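
Similarity values like the ones above are commonly produced by comparing embedding vectors. The sketch below assumes cosine similarity and uses hand-made three-dimensional toy vectors; the embedding model and scoring function that Cognitive Memory actually uses are not specified here, so treat every name and number in it as illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings; real embeddings have hundreds of dimensions.
TOY_EMBEDDINGS = {
    "pod crash": [0.9, 0.8, 0.1],
    "Kubernetes containers failing": [0.8, 0.9, 0.2],
    "quarterly sales report": [0.1, 0.0, 0.9],
}


def rank(query, embeddings):
    """Rank every other text by similarity to the query, best first."""
    query_vec = embeddings[query]
    scored = [
        (text, cosine_similarity(query_vec, vec))
        for text, vec in embeddings.items()
        if text != query
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Calling `rank("pod crash", TOY_EMBEDDINGS)` places "Kubernetes containers failing" ahead of "quarterly sales report" even though it shares no words with the query, which is the behavior keyword matching cannot provide.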

Search Types

Cognitive Memory supports multiple search strategies:

CHUNKS (Default)

Basic semantic search returning matching content chunks. Best for:
  • Quick lookups
  • Specific fact retrieval
  • Simple Q&A

GRAPH_COMPLETION

Traverses the knowledge graph to find connected entities and relationships. Returns direct matches plus related entities (services, databases, configs). Best for:
  • Understanding context
  • Discovering related concepts
  • Following relationship chains
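
To illustrate what "direct matches plus related entities" can mean, here is a minimal sketch of a breadth-first traversal over a toy graph. The graph contents, the `related_entities` helper, and the `max_hops` parameter are all hypothetical; the real traversal strategy behind GRAPH_COMPLETION is not described in this document.

```python
from collections import deque

# Hypothetical toy knowledge graph: entity -> directly related entities.
KNOWLEDGE_GRAPH = {
    "checkout-service": ["orders-db", "checkout-config"],
    "orders-db": ["db-backup-job"],
    "checkout-config": [],
    "db-backup-job": [],
    "billing-service": ["orders-db"],
}


def related_entities(graph, direct_matches, max_hops=1):
    """Collect entities reachable from the direct matches within max_hops."""
    seen = set(direct_matches)
    queue = deque((entity, 0) for entity in direct_matches)
    related = []
    while queue:
        entity, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop budget
        for neighbor in graph.get(entity, []):
            if neighbor not in seen:
                seen.add(neighbor)
                related.append(neighbor)
                queue.append((neighbor, hops + 1))
    return related
```

With one hop, a match on `checkout-service` pulls in its database and config; raising `max_hops` to 2 also follows the chain to `db-backup-job`.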

TEMPORAL

Time-aware search that considers when memories were created. Returns only memories within the specified time range. Best for:
  • Recent incidents
  • Historical analysis
  • Trend identification

FEEDBACK

Learns from user interactions and relevance feedback to improve results over time. Best for:
  • Personalized search
  • Iterative refinement
  • Learning user preferences

RAG_COMPLETION

Retrieval-Augmented Generation: combines search with LLM generation. Returns an LLM-generated answer plus source citations. Best for:
  • Answering questions with citations
  • Summarizing multiple sources
  • Generating reports from knowledge

Understanding Similarity Scores

Similarity scores range from 0.0 (no match) to 1.0 (perfect match).
| Score Range | Interpretation | Action |
|---|---|---|
| 0.90 - 1.00 | Highly relevant | Direct answer/solution |
| 0.80 - 0.89 | Very relevant | Strong candidate |
| 0.70 - 0.79 | Relevant | Worth considering |
| 0.60 - 0.69 | Moderately relevant | May contain useful info |
| < 0.60 | Low relevance | Likely not helpful |
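
If you post-process results in a script, the score bands can be encoded as a small lookup. This is a sketch only: `interpret_score` is a hypothetical helper, not part of any Kubiya SDK, and it treats each band's lower bound as inclusive so that values such as 0.895 fall into the band below 0.90.

```python
# (lower bound, interpretation, action), checked from highest to lowest.
SCORE_BANDS = [
    (0.90, "Highly relevant", "Direct answer/solution"),
    (0.80, "Very relevant", "Strong candidate"),
    (0.70, "Relevant", "Worth considering"),
    (0.60, "Moderately relevant", "May contain useful info"),
]


def interpret_score(score):
    """Map a similarity score in [0.0, 1.0] to (interpretation, action)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0.0 and 1.0, got {score}")
    for lower_bound, interpretation, action in SCORE_BANDS:
        if score >= lower_bound:
            return interpretation, action
    return "Low relevance", "Likely not helpful"
```

For example, `interpret_score(0.87)` yields `("Very relevant", "Strong candidate")`.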

Search via CLI

# Basic search
kubiya cognitive memory search "how to troubleshoot pod crashes"

# Search in specific dataset
kubiya cognitive memory search "database performance tuning" --dataset production

# Limit results
kubiya cognitive memory search "api errors" --limit 5
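
When invoking the CLI from a script, it can help to build the argument list programmatically. The sketch below uses only the `--dataset` and `--limit` flags shown above; that the two flags can be combined in one call is an assumption, and `build_search_command` is a hypothetical helper rather than part of the Kubiya tooling.

```python
import shlex


def build_search_command(query, dataset=None, limit=None):
    """Return the argv list for a `kubiya cognitive memory search` call."""
    argv = ["kubiya", "cognitive", "memory", "search", query]
    if dataset is not None:
        argv += ["--dataset", dataset]
    if limit is not None:
        if limit < 1:
            raise ValueError("limit must be a positive integer")
        argv += ["--limit", str(limit)]
    return argv


def as_shell(argv):
    """Render argv as a safely quoted shell command for logging."""
    return shlex.join(argv)
```

The resulting list can be passed to `subprocess.run(argv, check=True)` without going through a shell, which avoids quoting problems with multi-word queries.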

Agent Memory Recall

Agents automatically use semantic search when recalling memories. Behind the scenes:
  1. The agent’s recall_memory() method calls semantic search
  2. The search runs against the environment-based dataset (e.g., “production”)
  3. It returns memories from the same org (shared knowledge)
  4. The agent uses the results to inform its next actions

Search Best Practices

Query Formulation

Good queries:
  • ✅ “how to increase kubernetes pod memory limits”
  • ✅ “common causes of database connection timeouts”
  • ✅ “steps to debug api gateway 502 errors”
Poor queries:
  • ❌ “error” (too vague)
  • ❌ “fix it” (no context)
  • ❌ “kubernetes” (too broad)
Tips:
  • Use natural language questions
  • Include context and specifics
  • Describe the problem, not just keywords
  • Mention relevant systems/services

Result Evaluation

  1. Check similarity scores - Prefer results ≥ 0.70
  2. Review metadata - Filter by relevant tags
  3. Verify recency - Older solutions may be outdated
  4. Cross-reference - Compare multiple high-scoring results
  5. Attribute sources - Check which agent/user stored it
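
The first three checks can be automated when results are processed in code. The following sketch assumes each result is a dict with `score` and `created_at` fields; those field names, the `evaluate_results` helper, and the 180-day staleness cutoff are illustrative assumptions, not the actual response schema.

```python
from datetime import datetime, timedelta, timezone

MIN_SCORE = 0.70  # threshold recommended in the checklist above


def evaluate_results(results, now, max_age_days=180):
    """Keep results scoring >= MIN_SCORE, best first, and flag old ones.

    `max_age_days` is an arbitrary illustrative cutoff for "may be outdated".
    """
    kept = [r for r in results if r["score"] >= MIN_SCORE]
    kept.sort(key=lambda r: r["score"], reverse=True)
    cutoff = now - timedelta(days=max_age_days)
    return [{**r, "possibly_outdated": r["created_at"] < cutoff} for r in kept]
```

Passing `now` explicitly (for example `datetime.now(timezone.utc)`) keeps the function deterministic and easy to test. Flagged results are kept rather than dropped, since an older solution may still be the only relevant one.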

Iterative Refinement

If results aren’t satisfactory:
  1. Start broad: “deployment issues”
  2. Add specificity: “kubernetes deployment rollout stuck on pending status”
  3. Apply filters: Filter by cluster, time range, or metadata

Performance Optimization

For fast queries:
  • Limit results to 10-20
  • Use CHUNKS search type
  • Search within specific datasets
For comprehensive analysis:
  • Increase limit to 50-100
  • Use GRAPH_COMPLETION
  • Search across multiple datasets
For time-sensitive queries:
  • Use TEMPORAL search type
  • Apply recent time range
  • Filter by recency
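
These three profiles can be captured as presets so that callers pick a goal instead of tuning each setting by hand. This is a sketch under assumptions: the option names `search_type` and `limit` and the `search_options` helper are hypothetical, and only the search types and limit ranges come from the guidance above.

```python
# Presets derived from the guidance above; limit_range is (min, max) or None.
SEARCH_PROFILES = {
    "fast": {"search_type": "CHUNKS", "limit_range": (10, 20)},
    "comprehensive": {"search_type": "GRAPH_COMPLETION", "limit_range": (50, 100)},
    "time_sensitive": {"search_type": "TEMPORAL", "limit_range": None},
}


def search_options(goal, limit=None):
    """Return search options for a goal, clamping limit to the preset range."""
    if goal not in SEARCH_PROFILES:
        raise ValueError(f"unknown goal: {goal!r}")
    profile = SEARCH_PROFILES[goal]
    options = {"search_type": profile["search_type"]}
    limit_range = profile["limit_range"]
    if limit_range is not None:
        low, high = limit_range
        requested = low if limit is None else limit
        options["limit"] = max(low, min(high, requested))
    elif limit is not None:
        options["limit"] = limit  # no recommended range for this goal
    return options
```

Dataset scope and time range are left out of the presets because they depend on the specific query rather than on the performance goal.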

Common Patterns

Incident Response

Search for similar past incidents using natural language queries:
  • Query: “api gateway 503 errors in production”
  • Filter by incident type and severity
  • Review resolutions and root causes from high-scoring results

Runbook Lookup

Find relevant procedures and runbooks:
  • Query: “how to scale database during peak traffic”
  • Search in sre-runbooks dataset
  • Apply solutions from similar scenarios

Knowledge Discovery

Explore related concepts using GRAPH_COMPLETION:
  • Discover connections between services, issues, and solutions
  • Follow relationship chains to understand dependencies
  • Group results by topic or metadata

Troubleshooting

No Results Found

Possible causes:
  • Dataset is empty or doesn’t contain relevant memories
  • Query is too specific or uses unusual terminology
  • Filters are too restrictive
Solutions:
  • Broaden your query
  • Search across all datasets instead of one
  • Remove restrictive filters
  • Check if dataset contains relevant memories

Low-Quality Results

Possible causes:
  • Query is too vague
  • Memories lack detail or context
  • The dataset contains too little data
Solutions:
  • Make queries more specific
  • Add metadata when storing memories
  • Use RAG_COMPLETION for synthesized answers

Slow Queries

Possible causes:
  • Large dataset
  • Complex graph traversal
  • High result limit
Solutions:
  • Use CHUNKS search type instead of GRAPH_COMPLETION
  • Lower result limit (10 instead of 50)
  • Search within specific dataset instead of all datasets
