CLI Reference

The msh command-line interface is your primary tool for managing the data lifecycle.

Command Aliases

For consistency, many commands can be accessed via the msh asset group:

# Traditional commands (still work)
msh run orders
msh rollback orders
msh status
msh sample orders
msh versions orders

# Alias form (for consistency)
msh asset run orders
msh asset rollback orders
msh asset status
msh asset sample orders
msh asset versions orders

Both forms work identically. Use whichever you prefer.

Commands

msh init

Initializes a new msh project or sets up the state in a new destination.

msh init

This command creates the necessary internal tables (msh_state_history, msh_catalog) in your destination database.


msh create

Scaffolds new resources to get started quickly.

msh create asset <name>

Subcommands:

  • asset <name>: Creates a new .msh file in the models/ directory with a production-ready template.

Example:

msh create asset revenue_mart
# [OK] Created new asset: models/revenue_mart.msh

msh discover

Auto-discovers source schemas from REST APIs or SQL databases and generates .msh file configurations automatically.

msh discover <url_or_connection_string> [table:name]

REST API Discovery:

msh discover https://api.github.com/repos/dlt-hub/dlt/issues

Example Output:

Detected source type: rest_api
Generated .msh configuration:
============================================================
name: issues
description: Auto-discovered from rest_api
ingest:
  type: rest_api
  endpoint: https://api.github.com/repos/dlt-hub/dlt/issues
  resource: data
contract:
  evolution: evolve
  enforce_types: true
  required_columns:
    - id
    - title
    - state
    - created_at
transform: |
  SELECT * FROM {{ source }}
============================================================

✓ Written to: models/issues.msh
You can now run: msh run issues

SQL Database Discovery:

msh discover postgresql://user:pass@localhost:5432/mydb table:orders

Options:

  • --name <name>: Custom asset name (defaults to inferred name)
  • --output, -o <path>: Custom output file path
  • --write/--no-write: Write to file (default) or print only

The command automatically generates schema contracts based on discovered columns. See Discover Command for complete documentation.


msh sample

Previews and samples data from assets for testing and debugging.

msh sample <asset_name> [options]

Options:

  • --limit <number>: Number of rows to preview (default: 10)
  • --size <number>: Create a temporary sample table with this many rows
  • --env <name>: Environment to query (default: dev)
  • --source <type>: Source to sample from - raw, model, or view (default: view)

Examples:

# Quick preview (10 rows from view)
msh sample orders

# Preview raw data
msh sample orders --source raw --limit 50

# Create test dataset
msh sample orders --size 1000

Note: Asset must be deployed first (msh run <asset_name>). See Sample Command for complete documentation.


msh doctor

Diagnoses issues with your configuration, connections, or pipeline health.

msh doctor

Example Output:

msh Doctor v1.0.0
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Environment Checks
[OK] Python 3.11.5
[OK] dbt-core 1.7.0
[OK] dlt 0.4.12

Warehouse Connection
[OK] Connected to postgres://localhost:5432/msh_db
[OK] Schema 'analytics_blue' exists
[OK] State table 'msh_state_history' found

Source Connectivity
[OK] Stripe API (verified with test request)
[FAIL] Salesforce API - Invalid credentials

Recommendations:
• Check SALESFORCE_API_KEY in your .env file

Use msh doctor whenever you encounter unexpected behavior. It will pinpoint configuration issues before you waste time debugging.


msh validate

Performs static analysis on your project to catch errors before runtime. Ideal for CI/CD pipelines.

msh validate

Checks Performed:

  • Syntax: Validates YAML structure.
  • Schema: Ensures required fields (name, ingest, transform) are present.
  • Safety: Checks for unclosed Jinja tags.
  • Security: Scans for hardcoded credentials (heuristic).

Example Output:

Validating 5 files...
[PASS] revenue.msh
[FAIL] broken.msh
- Missing 'ingest' block
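
Because msh validate exits non-zero when any file fails, it drops straight into CI. A hypothetical GitHub Actions step (job and step names are illustrative):

```yaml
# Hypothetical CI step: block the pipeline on static errors.
- name: Validate msh project
  run: msh validate
```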

msh plan

Previews the changes that will be applied to your data warehouse. Shows what will be created, modified, or deleted.

msh plan [--env <name>]

Options:

  • --env <name>: Target environment (defaults to dev). Toggles between dev (Git-aware schemas) and prod (fixed schemas).

This is the equivalent of terraform plan or dbt run --dry-run. It shows you the diff without executing anything.

What It Does:

  1. Parses all .msh files.
  2. Compiles SQL and Python code.
  3. Checks for syntax errors.
  4. Validates contracts against sources (if contract blocks exist).
  5. Prints a summary of what would happen.

msh run

Executes the pipeline. This includes:

  1. Ingestion: Running dlt loaders.
  2. Transformation: Running dbt models.
  3. Blue/Green Swap: Promoting the new state to production.

msh run [options] [asset_selector]

Options:

  • --env <name>: Target environment (defaults to dev). Toggles between dev (Git-aware schemas) and prod (fixed schemas).
  • --debug: Enable debug mode to see raw dbt logs. Useful for troubleshooting compilation errors.
  • --dry-run: Simulate run without moving data (same as msh plan).
  • --all: Run all assets in the project.

Asset Selectors:

  • asset_name: Run only this asset.
  • +asset_name: Run asset and all upstream dependencies.
  • asset_name+: Run asset and all downstream dependencies.

Note: Cannot combine --all with asset selectors.
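
The selector syntax can be read mechanically: a leading + means "include upstream", a trailing + means "include downstream". A rough sketch of that decomposition in shell (illustrative only, not msh's actual parser):

```shell
# Decompose run selectors (sketch only; msh's real parser may differ).
for sel in "+orders" "orders+" "orders"; do
  case "$sel" in
    +*) echo "$sel: run ${sel#+} and all upstream dependencies" ;;
    *+) echo "$sel: run ${sel%+} and all downstream dependencies" ;;
    *)  echo "$sel: run only $sel" ;;
  esac
done
```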

Example Output:

msh Run v1.0.0
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

[1/4] Initializing Green Schema (analytics_green_a3f2b1c)
[OK] Schema created

[2/4] Ingestion (dlt)
[OK] stripe.customers → raw_stripe_customers (1,245 rows)
[OK] stripe.invoices → raw_stripe_invoices (3,891 rows)

[3/4] Transformation (dbt)
[OK] models/customers.msh → customers (1,245 rows)
[OK] models/revenue.msh → revenue (89 rows)

[4/4] Blue/Green Deploy
[OK] Swapping analytics_blue ↔ analytics_green_a3f2b1c
[OK] Deployment complete

State saved to msh_state_history (run_id: a3f2b1c)

Examples:

# Run all assets
msh run --all

# Run specific asset
msh run orders

# Run with upstreams
msh run +orders

# Run with downstreams
msh run orders+

# Run in production environment
msh run --env prod

# Preview changes
msh run --dry-run

# Debug mode
msh run --debug

msh rollback

The "Undo Button". Reverts the production state to the previous deployment instantly.

msh rollback [options] [asset_name(s)]

Options:

  • --env <name>: Target environment (defaults to dev). Toggles between dev (Git-aware schemas) and prod (fixed schemas).
  • --all: Rollback all assets.

Usage:

  • Without arguments: Rolls back the entire warehouse to the last successful run.
  • With asset_name: Rolls back only that specific asset.
  • With comma-separated names: msh rollback orders,revenue,users
  • With --all: Rollback all assets.

Note: Cannot combine --all with asset names.

Example Output:

msh Rollback v1.0.0
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Rolling back to previous state (run_id: 9d8e7f6)
[OK] Swapping analytics_blue ↔ analytics_archive_9d8e7f6
[OK] Rollback complete

Production is now at state: 9d8e7f6 (deployed 2 hours ago)

Warning: Rollback is instantaneous but destructive to the current Blue state. The current production schema becomes the new archive. Use with caution in production environments.


msh lineage

Generates and serves standard dbt documentation (dbt docs). This provides interactive documentation including lineage graphs, table schemas, and test results.

msh lineage [--debug]

Options:

  • --debug: Enable debug mode to see full dbt logs (useful for debugging compilation errors).

What it does:

  1. Generates dbt documentation using dbt docs generate
  2. Serves interactive documentation using dbt docs serve
  3. Opens a web browser with the documentation

Note: Requires running msh run first to create the build directory. The documentation shows:

  • Table schemas and column descriptions
  • Lineage graphs (dependencies between models)
  • Test results and data quality metrics
  • Source definitions

For custom React-based lineage visualization, use msh ui instead.


msh ui

Launches the local dashboard to view pipeline status, logs, and lineage graphs.

msh ui [--port <number>]

Options:

  • --port <number>: Port to serve the UI on (default: 3000). Change this if port 3000 is already in use.

By default, this starts a web server on http://localhost:3000. The dashboard shows:

  • Active and archived deployments.
  • Smart Ingest metrics (columns fetched vs. columns available).
  • Real-time lineage graph.

Example:

# Use default port 3000
msh ui

# Use custom port
msh ui --port 8080

API Endpoints:

  • GET /api/catalog.json - Asset catalog data
  • GET /api/quality.json - Quality metrics and test results
  • GET /api/stream/run?asset=<name> - Stream run logs

msh versions

View the deployment history of a specific asset.

msh versions <asset_name>

Example Output:

Versions: revenue_mart
┌──────┬─────────────────────┬──────────┐
│ Hash │ Deployed At         │ Status   │
├──────┼─────────────────────┼──────────┤
│ a1b2 │ 2023-10-27 10:00:00 │ deployed │
│ c3d4 │ 2023-10-26 09:30:00 │ deployed │
└──────┴─────────────────────┴──────────┘

msh status

View the currently active version hash for all assets in the project.

msh status [--format <format>]

Options:

  • --format <format>: Output format - table (default) or json. JSON format is useful for automation and CI/CD pipelines.

Example Output (Table):

Project Status
┌──────────────┬─────────────┐
│ Asset        │ Active Hash │
├──────────────┼─────────────┤
│ revenue_mart │ a1b2        │
│ users_clean  │ e5f6        │
└──────────────┴─────────────┘

Example Output (JSON):

{
  "assets": [
    {"name": "revenue_mart", "active_hash": "a1b2"},
    {"name": "users_clean", "active_hash": "e5f6"}
  ]
}
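
The JSON form pairs naturally with jq in scripts. A minimal sketch, assuming jq is installed; the payload is echoed inline here so the example runs standalone, but in practice you would pipe msh status --format json instead:

```shell
# Turn the status payload into name=hash lines for scripting.
status='{"assets":[{"name":"revenue_mart","active_hash":"a1b2"},{"name":"users_clean","active_hash":"e5f6"}]}'
echo "$status" | jq -r '.assets[] | "\(.name)=\(.active_hash)"'
```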

msh fmt

Format your .msh files to a standard style. This ensures consistency across your team.

msh fmt [path/to/models]

What It Does:

  • Indents YAML blocks correctly (2 spaces).
  • Formats SQL blocks (keywords uppercase, proper indentation).
  • Formats Python blocks (using black).

Example:

Before:

name:users
sql:|
select id,name from source('api','users')

After msh fmt:

name: users

sql: |
  SELECT
    id,
    name
  FROM source('api', 'users')

msh preview

Runs the pipeline but skips the final deployment (Swap). Useful for testing changes in the Green schema without affecting production.

msh preview [options]

Options:

  • Same as msh run (--env, --debug, --dry-run, --all, asset selectors).

What it does:

  • Executes ingestion and transformation
  • Builds the Green schema
  • Does NOT swap to production
  • Allows you to inspect the Green schema before deploying

msh publish

Publishes transformed data assets to external destinations (Reverse ETL).

msh publish <asset_name> --to <destination>

Options:

  • --to <destination>: Required. Target destination (e.g., salesforce, hubspot, postgres).

Security: Credentials are read from DESTINATION__<TARGET>__CREDENTIALS in your .env file.

Example:

msh publish customers --to salesforce
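
For the example above, msh would read a variable like the following from .env. The JSON payload shape here is hypothetical; the exact fields depend on the destination connector:

```shell
# .env — illustrative credentials entry for `msh publish customers --to salesforce`
DESTINATION__SALESFORCE__CREDENTIALS='{"username": "user@example.com", "password": "..."}'
```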

See Publish Command for complete documentation.


msh freshness

Checks whether your data assets are up to date and meet freshness expectations.

msh freshness

What it does:

  • Queries msh_state_history table (fast) rather than scanning raw data (slow)
  • Compares last run time against warn_after and error_after thresholds
  • Displays status table with OK/WARN/ERROR indicators

Configuration: Add freshness block to your .msh file:

freshness:
  warn_after: 12h   # Warn if data is older than 12 hours
  error_after: 24h  # Error if data is older than 24 hours
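
The threshold comparison itself is simple. A sketch with hypothetical values (13-hour-old data against the policy above), not msh's actual implementation:

```shell
# Compare data age (hours) against warn/error thresholds.
age_h=13; warn_after=12; error_after=24
if   [ "$age_h" -ge "$error_after" ]; then echo "ERROR: data ${age_h}h old"
elif [ "$age_h" -ge "$warn_after" ];  then echo "WARN: data ${age_h}h old"
else echo "OK: data ${age_h}h old"
fi
```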

See Freshness Command for complete documentation.


msh inspect

Parse a .msh asset into structured AST + metadata.

msh inspect <asset_path> [--json]

Description: Parses a .msh file and extracts structured metadata including:

  • Asset blocks (ingest, transform, contract, deploy)
  • Schema information (columns, types)
  • Lineage (dependencies via {{ ref() }})
  • Test definitions

Options:

  • --json: Output structured JSON instead of human-readable text

Examples:

# Human-readable output
msh inspect assets/revenue.msh

# JSON output
msh inspect assets/revenue.msh --json

See AI Features for more details.


msh manifest

Generate project-level manifest for AI context.

msh manifest [--json] [--update]

Description: Scans all .msh files in the project and generates a manifest with:

  • All asset metadata
  • Project configuration
  • Warehouse information
  • Default schema

Options:

  • --json: Output manifest as JSON
  • --update: Update existing manifest instead of regenerating

Examples:

# Generate manifest
msh manifest

# Generate and output as JSON
msh manifest --json

# Update existing manifest
msh manifest --update

Output:

  • Saves to .msh/manifest.json
  • Also generates .msh/lineage.json, .msh/schemas.json, .msh/tests.json

See Metadata Cache for more details.


msh ai context

Generate AI-ready context pack for asset or project.

msh ai context [--asset <asset_id>] [--json] [--include-tests] [--include-history]

Description: Generates a comprehensive context pack containing:

  • Project information
  • Asset metadata
  • Lineage graph
  • Glossary terms
  • Schemas
  • Tests (optional)
  • Metrics and policies

Options:

  • --asset <asset_id>: Focus context pack on specific asset (includes upstream/downstream)
  • --json: Output context pack as JSON
  • --include-tests: Include test metadata in context pack
  • --include-history: Include recent run/deploy history

Examples:

# Generate project-level context pack
msh ai context

# Generate context pack for specific asset
msh ai context --asset revenue

# Include tests and history
msh ai context --asset revenue --include-tests --include-history

# Output as JSON
msh ai context --asset revenue --json

See Context Packs for complete documentation.


msh ai explain

Use AI to explain what an asset does in plain language.

msh ai explain <asset_path> [--json]

Description: Uses AI to generate a natural language explanation of an asset, including:

  • What data it ingests
  • What transformation it performs
  • Output schema
  • Dependencies
  • Business purpose

Options:

  • --json: Return structured explanation object instead of plain text

Prerequisites:

  • AI configuration must be set up (msh config ai)
  • Asset must exist and be parseable

Examples:

# Explain an asset
msh ai explain assets/revenue.msh

# Get structured explanation
msh ai explain assets/revenue.msh --json

See AI Explain for complete documentation.


msh ai review

Use AI to review asset for risks and issues.

msh ai review <asset_path> [--json]

Description: Uses AI to analyze an asset and identify:

  • Performance risks (missing indexes, inefficient queries)
  • Data quality risks (missing tests, potential null issues)
  • Glossary alignment issues (columns not linked to glossary terms)
  • Best practice violations
  • Suggested improvements

Options:

  • --json: Return structured review object

Examples:

# Review an asset
msh ai review assets/revenue.msh

# Get structured review
msh ai review assets/revenue.msh --json

See AI Review for complete documentation.


msh ai new

Generate a new .msh asset from natural language description.

msh ai new [--name <name>] [--apply] [--path <path>]

Description: Prompts the user for a natural language description and generates a complete .msh file with:

  • Ingest block (appropriate type based on description)
  • Transform block with SQL
  • Tests
  • Contract block (if needed)

Options:

  • --name <name>: Suggested asset ID/name
  • --apply: Write the generated asset to disk automatically
  • --path <path>: Path to save the new .msh file (if --apply is set)

Examples:

# Generate asset (preview only)
msh ai new --name customer_revenue

# Generate and save automatically
msh ai new --name customer_revenue --apply

# Generate and save to specific path
msh ai new --name customer_revenue --apply --path assets/customer_revenue.msh

See AI New for complete documentation.


msh ai fix

Use AI to suggest fixes for broken or failing assets.

msh ai fix <asset_path> [--apply] [--json] [--error <error_message>]

Description: Analyzes a broken asset and suggests fixes:

  • Parses error messages (if provided)
  • Analyzes asset structure
  • Generates JSON patch with suggested fixes
  • Optionally applies fixes automatically

Options:

  • --apply: Apply the suggested patch after user confirmation
  • --json: Return patch as JSON structure
  • --error <error_message>: Provide error message from failed run

Examples:

# Suggest fixes (preview only)
msh ai fix assets/revenue.msh

# Suggest fixes with error context
msh ai fix assets/revenue.msh --error "Column 'customer_id' not found"

# Apply fixes automatically
msh ai fix assets/revenue.msh --apply

# Get patch as JSON
msh ai fix assets/revenue.msh --json

See AI Fix for complete documentation.


msh ai tests

Use AI to generate or improve tests for an asset.

msh ai tests <asset_path> [--apply] [--json]

Description: Analyzes an asset and suggests appropriate tests:

  • Reviews asset schema
  • Identifies missing tests
  • Suggests test improvements
  • Generates JSON patch with test additions

Options:

  • --apply: Apply suggested test changes to the file
  • --json: Return suggested tests as JSON structure

Examples:

# Suggest tests (preview only)
msh ai tests assets/revenue.msh

# Apply test changes automatically
msh ai tests assets/revenue.msh --apply

# Get suggestions as JSON
msh ai tests assets/revenue.msh --json

See AI Tests for complete documentation.


msh ai apply

Apply an AI-generated patch file to one or more .msh assets.

msh ai apply <patch_file> [--dry-run]

Description: Applies a JSON patch file (generated by AI commands) to .msh files:

  • Validates patch using safety layer
  • Shows diff in dry-run mode
  • Applies changes atomically
  • Reports success/failure for each file

Options:

  • --dry-run: Show diff without modifying any files

Examples:

# Preview changes
msh ai apply patch.json --dry-run

# Apply patch
msh ai apply patch.json

See AI Apply for complete documentation.


Glossary Commands

msh glossary add-term

Create a new glossary term in the project glossary file.

msh glossary add-term <name> [--description <text>] [--id <id>] [--json]

Description: Creates a new business term in the glossary:

  • Generates ID automatically if not provided
  • Validates uniqueness
  • Saves to glossary.yaml and .msh/glossary.json cache

Options:

  • --description <text>: Description of the term
  • --id <id>: Optional explicit ID (e.g., term.customer)
  • --json: Output as JSON

Examples:

# Add term with description
msh glossary add-term "Customer" --description "A customer entity"

# Add term with explicit ID
msh glossary add-term "Customer" --id "term.customer" --description "A customer entity"

# Output as JSON
msh glossary add-term "Customer" --json

See Managing Terms for complete documentation.


msh glossary link-term

Link an existing glossary term to assets and columns.

msh glossary link-term <term_name_or_id> --asset <asset> [--column <column>] [--role <role>] [--json]

Description: Links a glossary term to:

  • Assets (required)
  • Columns (optional)
  • Roles (optional: primary_key, foreign_key, attribute)

Options:

  • --asset <asset>: Asset ID or path (required)
  • --column <column>: Column name within the asset (optional)
  • --role <role>: Role of the column (optional)
  • --json: Output as JSON

Examples:

# Link term to asset
msh glossary link-term "Customer" --asset revenue

# Link term to specific column
msh glossary link-term "Customer" --asset revenue --column customer_id --role primary_key

# Link term to multiple columns
msh glossary link-term "Customer" --asset revenue --column customer_id
msh glossary link-term "Customer" --asset revenue --column customer_name

See Linking Terms for complete documentation.


msh glossary list

List glossary terms in the current project.

msh glossary list [--json]

Description: Lists all glossary terms, metrics, dimensions, and policies:

  • Shows the term count
  • Lists the first 10 terms (with an indicator when more exist)
  • Shows counts of metrics, dimensions, and policies

Options:

  • --json: Return glossary as JSON

Examples:

# List glossary
msh glossary list

# Output as JSON
msh glossary list --json

See Managing Terms for complete documentation.


msh glossary export

Export the full glossary as JSON for AI or external tools.

msh glossary export [--path <path>]

Description: Exports the complete glossary as JSON:

  • Includes all terms, metrics, dimensions, policies
  • Suitable for AI consumption or external tools
  • Prints to stdout if no path specified

Options:

  • --path <path>: Optional output file path; otherwise print to stdout

Examples:

# Print to stdout
msh glossary export

# Save to file
msh glossary export --path glossary.json

See Managing Terms for complete documentation.


msh config ai

Configure AI provider and model for msh.

msh config ai [--provider <provider>] [--model <model>] [--api-key <key>] [--endpoint <url>]

Description: Configures AI provider settings:

  • Saves to ~/.msh/config (global) or msh.yaml (project-level)
  • Supports multiple providers (OpenAI, Anthropic, Ollama, etc.)
  • API key resolution with env:VAR_NAME support
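
The env:VAR_NAME convention keeps secrets out of config files: a value prefixed with env: names an environment variable to read at runtime. A sketch of that resolution rule (illustrative; msh's resolver may differ):

```shell
# Resolve an api-key setting: literal value vs. env:VAR_NAME reference.
key="env:OPENAI_API_KEY"
case "$key" in
  env:*) echo "read key from environment variable ${key#env:}" ;;
  *)     echo "use literal key" ;;
esac
```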

Options:

  • --provider <provider>: AI provider key (e.g., openai, anthropic, ollama)
  • --model <model>: Model identifier (e.g., gpt-4.5, claude-3-opus)
  • --api-key <key>: API key or env:VAR_NAME reference
  • --endpoint <url>: Optional custom endpoint URL

Examples:

# Configure OpenAI
msh config ai --provider openai --model gpt-4 --api-key env:OPENAI_API_KEY

# Configure Anthropic
msh config ai --provider anthropic --model claude-3-opus --api-key env:ANTHROPIC_API_KEY

# Configure Ollama (local)
msh config ai --provider ollama --model llama2 --endpoint http://localhost:11434/api/generate

Supported Providers:

  • openai - OpenAI (GPT-4, GPT-3.5, etc.)
  • anthropic - Anthropic (Claude 3 Opus, Sonnet, etc.)
  • ollama - Ollama (local models, no API key needed)
  • huggingface - HuggingFace (local models)
  • azure - Azure OpenAI

See AI Setup for complete documentation.