msh ai review

Use AI to review an asset for risks and issues.

Usage

msh ai review <asset_path> [--json]

Description

Analyzes an asset and reports:

  • Performance risks: Missing indexes, inefficient queries
  • Data quality risks: Missing tests, potential null issues
  • Glossary alignment issues: Columns not linked to glossary terms
  • Best practice violations: Common anti-patterns
  • Suggested improvements: Actionable recommendations

Options

  • --json: Return the review as a structured JSON object instead of formatted text

Examples

Basic Review

msh ai review assets/revenue.msh

Example Output:

Asset: revenue

Summary:
This asset calculates monthly revenue per customer. Overall structure
is good, but there are some performance and quality concerns.

Risks:
⚠️ Performance: Missing index on customer_id column
⚠️ Performance: Large JOIN without WHERE clause may be slow
⚠️ Data Quality: No tests defined for revenue calculations
⚠️ Data Quality: Potential null values in amount column

Glossary Issues:
- Column 'customer_id' not linked to 'Customer' term
- Column 'revenue' not linked to 'Revenue' metric

Performance Notes:
- Consider adding WHERE clause to filter recent data
- Consider partitioning by month for better performance

Suggested Changes:
1. Add tests for revenue calculations
2. Link columns to glossary terms
3. Add WHERE clause for incremental processing
4. Add index on customer_id

Structured Output

msh ai review assets/revenue.msh --json

Example JSON Output:

{
  "summary": "This asset calculates monthly revenue...",
  "risks": [
    {
      "type": "performance",
      "severity": "warning",
      "message": "Missing index on customer_id column"
    },
    {
      "type": "data_quality",
      "severity": "warning",
      "message": "No tests defined for revenue calculations"
    }
  ],
  "glossary_issues": [
    {
      "column": "customer_id",
      "suggestion": "Link to 'Customer' term"
    }
  ],
  "performance_notes": [
    "Consider adding WHERE clause to filter recent data"
  ],
  "suggested_changes": [
    "Add tests for revenue calculations",
    "Link columns to glossary terms"
  ]
}
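
The structured output is convenient to consume from a script. The following is a minimal Python sketch that counts risks by type. It relies only on the field names visible in the example above ("risks", "type"); the full schema is not documented here, so a real review object may contain additional fields or values. The function name count_risks_by_type is illustrative and not part of msh.

```python
import json
from collections import Counter

# Sample payload copied from the example JSON output in this page.
# A real review may contain more fields than shown here.
SAMPLE_REVIEW = """
{
  "summary": "This asset calculates monthly revenue...",
  "risks": [
    {"type": "performance", "severity": "warning",
     "message": "Missing index on customer_id column"},
    {"type": "data_quality", "severity": "warning",
     "message": "No tests defined for revenue calculations"}
  ],
  "glossary_issues": [
    {"column": "customer_id", "suggestion": "Link to 'Customer' term"}
  ],
  "performance_notes": ["Consider adding WHERE clause to filter recent data"],
  "suggested_changes": [
    "Add tests for revenue calculations",
    "Link columns to glossary terms"
  ]
}
"""


def count_risks_by_type(review_json: str) -> dict:
    """Parse a review object and count its risks per "type" value."""
    review = json.loads(review_json)
    # Treat a missing "risks" key as an empty list rather than an error.
    risks = review.get("risks", [])
    return dict(Counter(risk.get("type", "unknown") for risk in risks))


if __name__ == "__main__":
    print(count_risks_by_type(SAMPLE_REVIEW))
```

In practice the input string would come from the output of msh ai review assets/revenue.msh --json rather than from an embedded sample.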

Risk Types

Performance Risks

  • Missing indexes
  • Inefficient queries (large JOINs, no WHERE clauses)
  • Full table scans
  • Missing partitioning

Data Quality Risks

  • Missing tests
  • Potential null values
  • Type mismatches
  • Missing constraints

Glossary Alignment Issues

  • Columns not linked to glossary terms
  • Missing business context
  • Unclear column names

Best Practice Violations

  • Anti-patterns
  • Code smells
  • Security concerns

Prerequisites

  1. AI Configuration: Configure an AI provider before running any AI command

    msh config ai --provider openai --model gpt-4 --api-key env:OPENAI_API_KEY

  2. Manifest: Generate the manifest before using AI commands

    msh manifest

  3. Glossary (optional): Link glossary terms to get better reviews

    msh glossary link-term "Customer" --asset revenue --column customer_id
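
Before running a review from a script, it can help to check the prerequisites up front. The sketch below is a hypothetical preflight helper in Python, not an msh feature. It assumes that the env:OPENAI_API_KEY value in the configuration example means the key is read from the OPENAI_API_KEY environment variable, and that the msh executable is expected on PATH. It does not verify that the manifest has been generated.

```python
import os
import shutil
from typing import Callable, Mapping, Optional


def missing_prerequisites(
    env: Mapping[str, str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list:
    """Return a list of human-readable problems; empty means nothing was found.

    `which` is injectable so the check can be exercised without msh installed.
    """
    problems = []
    # Assumption: the CLI is invoked as `msh` and must be discoverable on PATH.
    if which("msh") is None:
        problems.append("msh executable not found on PATH")
    # Assumption: `--api-key env:OPENAI_API_KEY` reads this environment variable.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites(os.environ):
        print(f"prerequisite missing: {problem}")
```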

Use Cases

  • Pre-Deployment Review: Review assets before deploying to production
  • Code Quality: Identify potential issues early
  • Performance Optimization: Find performance bottlenecks
  • Documentation: Ensure glossary alignment
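
As an illustration of the pre-deployment use case, the following Python sketch turns a review into a pass/fail gate. It assumes that msh ai review <asset_path> --json prints only the JSON object to stdout, and it uses only the "risks", "type", and "message" fields shown in the example output. Whether msh itself returns a non-zero exit code when risks are found is not stated here, so the sketch decides from the JSON alone. The function names run_review, blocking_risks, and gate are hypothetical.

```python
import json
import subprocess
import sys


def run_review(asset_path: str) -> dict:
    """Run the documented review command and parse its stdout as JSON.

    Assumption: with --json, stdout contains nothing but the review object.
    """
    result = subprocess.run(
        ["msh", "ai", "review", asset_path, "--json"],
        capture_output=True,
        text=True,
        check=True,  # raise if the command itself fails
    )
    return json.loads(result.stdout)


def blocking_risks(review: dict, blocking_types: set) -> list:
    """Return the risks whose "type" is in the set of blocking types."""
    return [
        risk
        for risk in review.get("risks", [])
        if risk.get("type") in blocking_types
    ]


def gate(asset_path: str, blocking_types: set) -> int:
    """Return 0 if no blocking risk was reported, 1 otherwise."""
    found = blocking_risks(run_review(asset_path), blocking_types)
    for risk in found:
        print(f"[{risk.get('type')}] {risk.get('message')}", file=sys.stderr)
    return 1 if found else 0
```

A CI step could call sys.exit(gate("assets/revenue.msh", {"data_quality"})) to stop a deployment when the review reports any data quality risk. Which risk types should block a deployment is a policy choice for each team.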

Acting on Suggestions

After reviewing suggestions, you can:

  1. Apply fixes manually: Review and implement the suggested changes yourself

  2. Use AI to fix: Let AI generate fixes automatically

    msh ai fix assets/revenue.msh --apply

  3. Generate tests: Use AI to generate tests

    msh ai tests assets/revenue.msh --apply