Semantic Search

CLOUD

Natural language search across assets, terms, and metrics using vector embeddings to find relevant results based on meaning, not just keywords.

Overview

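Semantic search uses vector embeddings to match your query against assets, terms, and metrics across your entire project by meaning rather than exact wording. A query such as "Where is customer revenue calculated?" can return an asset named revenue even though the two share only one word.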

Searching

Endpoint

POST /api/msh/ai/semantic-search

Search using natural language queries.

Request:

curl -X POST \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Where is customer revenue calculated?",
    "project_id": 1,
    "entity_types": ["asset", "term", "metric"],
    "limit": 10
  }' \
  https://api.msh.io/api/msh/ai/semantic-search
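
The same request can be built from a script. The following is a minimal Python sketch using only the standard library; the helper names build_search_request and semantic_search are hypothetical and not part of any official client. It assumes entity_types and limit may be omitted (later examples on this page omit limit, but no defaults are documented) and that the token is available in the API_TOKEN environment variable, as in the curl example.

```python
import json
import os
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.msh.io/api/msh/ai/semantic-search"


def build_search_request(query, project_id, entity_types=None, limit=None, token=None):
    """Build (but do not send) a POST request for the semantic search endpoint."""
    payload = {"query": query, "project_id": project_id}
    # Assumption: optional fields are only sent when the caller sets them.
    if entity_types is not None:
        payload["entity_types"] = list(entity_types)
    if limit is not None:
        payload["limit"] = limit
    if token is None:
        token = os.environ.get("API_TOKEN", "")
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def semantic_search(query, project_id, **kwargs):
    """Send the request and return the decoded JSON response body."""
    request = build_search_request(query, project_id, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any network call is made.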

Response:

{
  "success": true,
  "query": "Where is customer revenue calculated?",
  "results": [
    {
      "entity_type": "asset",
      "entity_id": 5,
      "entity_name": "revenue",
      "relevance_score": 0.92,
      "snippet": "Calculates monthly recurring revenue..."
    },
    {
      "entity_type": "metric",
      "entity_id": 2,
      "entity_name": "Monthly Recurring Revenue",
      "relevance_score": 0.88,
      "snippet": "Total recurring revenue normalized..."
    }
  ],
  "count": 2
}
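
To work with results in code, it can help to turn each entry into a typed record. The sketch below is one possible approach in Python: the SearchResult class and parse_results function are hypothetical names, and the field list is taken only from the sample response above, so a real response may carry fields not handled here. Treating "success": false as an error is also an assumption, since the error format is not documented on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchResult:
    """One entry from the "results" array of the sample response."""
    entity_type: str
    entity_id: int
    entity_name: str
    relevance_score: float
    snippet: str


def parse_results(body):
    """Convert a decoded response body into a list of SearchResult objects."""
    # Assumption: a falsy "success" flag means the results should not be used.
    if not body.get("success"):
        raise ValueError("semantic search request was not successful")
    return [
        SearchResult(
            entity_type=item["entity_type"],
            entity_id=item["entity_id"],
            entity_name=item["entity_name"],
            relevance_score=float(item["relevance_score"]),
            snippet=item.get("snippet", ""),
        )
        for item in body.get("results", [])
    ]


# The sample response from above, written as a Python dict.
sample = {
    "success": True,
    "query": "Where is customer revenue calculated?",
    "results": [
        {"entity_type": "asset", "entity_id": 5, "entity_name": "revenue",
         "relevance_score": 0.92, "snippet": "Calculates monthly recurring revenue..."},
        {"entity_type": "metric", "entity_id": 2, "entity_name": "Monthly Recurring Revenue",
         "relevance_score": 0.88, "snippet": "Total recurring revenue normalized..."},
    ],
    "count": 2,
}

results = parse_results(sample)
for r in results:
    print(f"{r.relevance_score:.2f}  {r.entity_type:<6}  {r.entity_name}")
```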

Entity Types

The entity_types field accepts any combination of the following values:

  • asset - Search assets (tables, views, models)
  • term - Search glossary terms
  • metric - Search metrics

Search Specific Types

# Search only assets
curl -X POST \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "customer revenue",
    "project_id": 1,
    "entity_types": ["asset"],
    "limit": 10
  }' \
  https://api.msh.io/api/msh/ai/semantic-search

# Search only metrics
curl -X POST \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "revenue metrics",
    "project_id": 1,
    "entity_types": ["metric"],
    "limit": 10
  }' \
  https://api.msh.io/api/msh/ai/semantic-search
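
An alternative to sending one request per type is to request several types at once and split the results on the client. The Python sketch below groups result entries by their entity_type field; group_by_entity_type is a hypothetical helper, and the input is assumed to be the "results" array of a decoded response.

```python
from collections import defaultdict


def group_by_entity_type(results):
    """Group result dicts by their "entity_type" value, preserving order."""
    groups = defaultdict(list)
    for item in results:
        groups[item["entity_type"]].append(item)
    return dict(groups)


# Entries shaped like the "results" array in the sample response.
mixed_results = [
    {"entity_type": "asset", "entity_name": "revenue", "relevance_score": 0.92},
    {"entity_type": "metric", "entity_name": "Monthly Recurring Revenue", "relevance_score": 0.88},
]

grouped = group_by_entity_type(mixed_results)
for entity_type, items in grouped.items():
    names = ", ".join(item["entity_name"] for item in items)
    print(f"{entity_type}: {names}")
```

Filtering on the server with entity_types remains the simpler choice when only one type is needed, since it keeps the limit from being consumed by results of other types.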

Relevance Scores

Results are ranked by relevance score, a value from 0.0 to 1.0 where a higher score means a closer match:

  • 0.9 and above: Highly relevant
  • 0.7 to <0.9: Relevant
  • 0.5 to <0.7: Somewhat relevant
  • Below 0.5: Less relevant
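
The score bands above can be applied in code to label or discard results. This is a minimal Python sketch with hypothetical helper names (relevance_band, filter_by_score); it assumes a score that falls exactly on a boundary (0.5, 0.7, 0.9) belongs to the higher band, and the default cutoff of 0.7 is an illustrative choice rather than a documented recommendation.

```python
def relevance_band(score):
    """Map a relevance score (0.0 to 1.0) to the band labels listed above."""
    # Assumption: boundary values belong to the higher band.
    if score >= 0.9:
        return "highly relevant"
    if score >= 0.7:
        return "relevant"
    if score >= 0.5:
        return "somewhat relevant"
    return "less relevant"


def filter_by_score(results, min_score=0.7):
    """Keep only result dicts whose relevance_score is at least min_score."""
    return [r for r in results if r["relevance_score"] >= min_score]


scored = [
    {"entity_name": "revenue", "relevance_score": 0.92},
    {"entity_name": "Monthly Recurring Revenue", "relevance_score": 0.88},
    {"entity_name": "orders", "relevance_score": 0.41},
]

for r in filter_by_score(scored):
    print(r["entity_name"], "->", relevance_band(r["relevance_score"]))
```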

Example Queries

Finding Assets

# "Where is customer data stored?"
# "Show me assets that calculate revenue"
# "What tables contain order information?"

Finding Metrics

# "What metrics measure revenue?"
# "How is customer lifetime value calculated?"
# "Show me retention metrics"

Finding Terms

# "What is the definition of customer?"
# "Explain what MRR means"
# "What are the business terms for orders?"

Use Cases

Data Discovery

Find assets and metrics quickly:

curl -X POST \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "customer lifetime value",
    "project_id": 1,
    "entity_types": ["asset", "metric"]
  }' \
  https://api.msh.io/api/msh/ai/semantic-search

Understanding Business Terms

Find glossary definitions:

curl -X POST \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "what is MRR",
    "project_id": 1,
    "entity_types": ["term", "metric"]
  }' \
  https://api.msh.io/api/msh/ai/semantic-search

Related Assets

Discover related assets:

curl -X POST \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "assets related to customer orders",
    "project_id": 1,
    "entity_types": ["asset"]
  }' \
  https://api.msh.io/api/msh/ai/semantic-search

Best Practices

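  1. Be Specific: Name the entity, measure, or process you are looking for, for example "monthly recurring revenue by customer" rather than "revenue"
  2. Use Business Terms: Phrase queries with the vocabulary from your glossary
  3. Filter by Type: Set entity_types to only the types you need to narrow results
  4. Review Scores: Check each result's relevance_score and treat low-scoring results as weak matches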