CLI Reference
The msh command-line interface is your primary tool for managing the data lifecycle.
Command Aliases
For consistency, many commands can be accessed via the msh asset group:
# Traditional commands (still work)
msh run orders
msh rollback orders
msh status
msh sample orders
# Alias form (for consistency)
msh asset run orders
msh asset rollback orders
msh asset status
msh asset sample orders
msh asset versions orders
Both forms work identically. Use whichever you prefer.
Commands
msh init
Initializes a new msh project or sets up the state in a new destination.
msh init
This command creates the necessary internal tables (msh_state_history, msh_catalog) in your destination database.
msh create
Scaffold new resources to get started quickly.
msh create asset <name>
Subcommands:
asset <name>: Creates a new .msh file in the models/ directory with a production-ready template.
Example:
msh create asset revenue_mart
# [OK] Created new asset: models/revenue_mart.msh
msh discover
Auto-discovers source schemas from REST APIs or SQL databases and generates .msh file configurations automatically.
msh discover <url_or_connection_string> [table:name]
REST API Discovery:
msh discover https://api.github.com/repos/dlt-hub/dlt/issues
Example Output:
Detected source type: rest_api
Generated .msh configuration:
============================================================
name: issues
description: Auto-discovered from rest_api
ingest:
  type: rest_api
  endpoint: https://api.github.com/repos/dlt-hub/dlt/issues
  resource: data
contract:
  evolution: evolve
  enforce_types: true
  required_columns:
    - id
    - title
    - state
    - created_at
transform: |
  SELECT * FROM {{ source }}
============================================================
✓ Written to: models/issues.msh
You can now run: msh run issues
SQL Database Discovery:
msh discover postgresql://user:pass@localhost:5432/mydb table:orders
Options:
--name <name>: Custom asset name (defaults to inferred name)
--output, -o <path>: Custom output file path
--write/--no-write: Write to file (default) or print only
The command automatically generates schema contracts based on discovered columns. See Discover Command for complete documentation.
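The contract generation can be pictured as a simple heuristic over the discovered columns. Below is a minimal sketch assuming, hypothetically, that columns observed as non-null in a sample row become required; `build_contract` and this heuristic are illustrative, not msh's actual inference code:

```python
def build_contract(columns, sample_row):
    """Mark columns observed as non-null in the sample as required."""
    required = [c for c in columns if sample_row.get(c) is not None]
    return {
        "evolution": "evolve",
        "enforce_types": True,
        "required_columns": required,
    }

# Columns and sample values loosely modeled on the GitHub issues example above.
columns = ["id", "title", "state", "created_at", "milestone"]
sample = {"id": 1, "title": "Add docs", "state": "open",
          "created_at": "2023-10-27T10:00:00Z", "milestone": None}

print(build_contract(columns, sample)["required_columns"])
# ['id', 'title', 'state', 'created_at']
```

The nullable `milestone` column is excluded from `required_columns`, matching the idea that only reliably-present columns should be enforced by the contract.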
msh sample
Preview and sample data from assets for testing and debugging.
msh sample <asset_name> [options]
Options:
--limit <number>: Number of rows to preview (default: 10)
--size <number>: Create a temporary sample table with this many rows
--env <name>: Environment to query (default: dev)
--source <type>: Source to sample from - raw, model, or view (default: view)
Examples:
# Quick preview (10 rows from view)
msh sample orders
# Preview raw data
msh sample orders --source raw --limit 50
# Create test dataset
msh sample orders --size 1000
Note: Asset must be deployed first (msh run <asset_name>). See Sample Command for complete documentation.
msh doctor
Diagnoses issues with your configuration, connections, or pipeline health.
msh doctor
Example Output:
msh Doctor v1.0.0
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Environment Checks
[OK] Python 3.11.5
[OK] dbt-core 1.7.0
[OK] dlt 0.4.12
Warehouse Connection
[OK] Connected to postgres://localhost:5432/msh_db
[OK] Schema 'analytics_blue' exists
[OK] State table 'msh_state_history' found
Source Connectivity
[OK] Stripe API (verified with test request)
[FAIL] Salesforce API - Invalid credentials
Recommendations:
• Check SALESFORCE_API_KEY in your .env file
Use msh doctor whenever you encounter unexpected behavior. It will pinpoint configuration issues before you waste time debugging.
msh validate
Perform static analysis on your project to catch errors before runtime. Ideal for CI/CD pipelines.
msh validate
Checks Performed:
- Syntax: Validates YAML structure.
- Schema: Ensures required fields (name, ingest, transform) are present.
- Safety: Checks for unclosed Jinja tags.
- Security: Scans for hardcoded credentials (heuristic).
Example Output:
Validating 5 files...
[PASS] revenue.msh
[FAIL] broken.msh
- Missing 'ingest' block
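The Safety and Security checks above are heuristic by nature. A minimal illustration of how such heuristics might work; these helper functions are hypothetical, not msh internals:

```python
import re

def has_unclosed_jinja(text):
    # Cheap balance check on Jinja expression delimiters.
    return text.count("{{") != text.count("}}")

def looks_like_hardcoded_secret(text):
    # Flag key-like assignments whose value is a literal, not an env: reference.
    pattern = re.compile(r"(api_key|password|secret)\s*:\s*\S+", re.I)
    return any("env:" not in m.group(0) for m in pattern.finditer(text))

good = "api_key: env:STRIPE_KEY\ntransform: |\n  SELECT * FROM {{ source }}"
bad = "api_key: sk_live_abc123\ntransform: SELECT {{ source"

print(has_unclosed_jinja(bad))            # True
print(looks_like_hardcoded_secret(good))  # False
print(looks_like_hardcoded_secret(bad))   # True
```

As the doc notes, the credential scan is a heuristic: it can only catch patterns that look like secrets, which is why env:-style references are the safe convention.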
msh plan
Previews the changes that will be applied to your data warehouse. Shows what will be created, modified, or deleted.
msh plan [--env <name>]
Options:
--env <name>: Target environment (defaults to dev). Toggles between dev (Git-aware schemas) and prod (fixed schemas).
This is the equivalent of terraform plan or dbt run --dry-run. It shows you the diff without executing anything.
What It Does:
- Parses all .msh files.
- Compiles SQL and Python code.
- Checks for syntax errors.
- Validates contracts against sources (if contract blocks exist).
- Prints a summary of what would happen.
msh run
Executes the pipeline. This includes:
- Ingestion: Running dlt loaders.
- Transformation: Running dbt models.
- Blue/Green Swap: Promoting the new state to production.
msh run [options] [asset_selector]
Options:
--env <name>: Target environment (defaults to dev). Toggles between dev (Git-aware schemas) and prod (fixed schemas).
--debug: Enable debug mode to see raw dbt logs. Useful for troubleshooting compilation errors.
--dry-run: Simulate run without moving data (same as msh plan).
--all: Run all assets in the project.
Asset Selectors:
asset_name: Run only this asset.
+asset_name: Run the asset and all upstream dependencies.
asset_name+: Run the asset and all downstream dependencies.
Note: Cannot combine --all with asset selectors.
Example Output:
msh Run v1.0.0
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
[1/4] Initializing Green Schema (analytics_green_a3f2b1c)
[OK] Schema created
[2/4] Ingestion (dlt)
[OK] stripe.customers → raw_stripe_customers (1,245 rows)
[OK] stripe.invoices → raw_stripe_invoices (3,891 rows)
[3/4] Transformation (dbt)
[OK] models/customers.msh → customers (1,245 rows)
[OK] models/revenue.msh → revenue (89 rows)
[4/4] Blue/Green Deploy
[OK] Swapping analytics_blue ↔ analytics_green_a3f2b1c
[OK] Deployment complete
State saved to msh_state_history (run_id: a3f2b1c)
Examples:
# Run all assets
msh run --all
# Run specific asset
msh run orders
# Run with upstreams
msh run +orders
# Run with downstreams
msh run orders+
# Run in production environment
msh run --env prod
# Preview changes
msh run --dry-run
# Debug mode
msh run --debug
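The asset selectors (+asset_name, asset_name+) amount to graph traversals over the dependency DAG. A minimal sketch under an assumed three-asset graph; the deps mapping and helper names are illustrative, not msh internals:

```python
from collections import deque

# Hypothetical dependency graph: asset -> its upstream dependencies.
deps = {
    "raw_orders": [],
    "orders": ["raw_orders"],
    "revenue": ["orders"],
}

def _walk(start, edges):
    """Collect start plus everything reachable via edges (BFS)."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def select(selector):
    if selector.startswith("+"):   # asset plus all upstreams
        return _walk(selector[1:], deps)
    if selector.endswith("+"):     # asset plus all downstreams
        name = selector[:-1]
        downstream = {a: [b for b in deps if a in deps[b]] for a in deps}
        return _walk(name, downstream)
    return {selector}              # just the asset itself

print(sorted(select("+revenue")))  # ['orders', 'raw_orders', 'revenue']
print(sorted(select("orders+")))   # ['orders', 'revenue']
```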
msh rollback
The "Undo Button". Reverts the production state to the previous deployment instantly.
msh rollback [options] [asset_name(s)]
Options:
--env <name>: Target environment (defaults to dev). Toggles between dev (Git-aware schemas) and prod (fixed schemas).
--all: Roll back all assets.
Usage:
- Without arguments: Rolls back the entire warehouse to the last successful run.
- With asset_name: Rolls back only that specific asset.
- With comma-separated names: msh rollback orders,revenue,users
- With --all: Rolls back all assets.
Note: Cannot combine --all with asset names.
Example Output:
msh Rollback v1.0.0
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Rolling back to previous state (run_id: 9d8e7f6)
[OK] Swapping analytics_blue ↔ analytics_archive_9d8e7f6
[OK] Rollback complete
Production is now at state: 9d8e7f6 (deployed 2 hours ago)
Rollback is instantaneous but destructive to the current Blue state. The current production schema becomes the new archive. Use with caution in production environments.
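The swap mechanics described above can be modeled as a pointer exchange between the live and archived states. A toy sketch, with run_ids taken from the example output; the state dict itself is an assumption for illustration:

```python
# Before rollback: production is at a3f2b1c, the archive holds 9d8e7f6.
state = {"live": "a3f2b1c", "archive": "9d8e7f6"}

def rollback(state):
    """Swap live and archive: the current production becomes the new archive."""
    state["live"], state["archive"] = state["archive"], state["live"]
    return state

rollback(state)
print(state["live"])     # 9d8e7f6 (production reverted)
print(state["archive"])  # a3f2b1c (old production kept as the new archive)
```

This is why rollback is instantaneous (no data is rewritten, only the pointers move) and why it is destructive to the current Blue state: there is only one archive slot to swap with.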
msh lineage
Generates and serves standard dbt documentation (dbt docs). This provides interactive documentation including lineage graphs, table schemas, and test results.
msh lineage [--debug]
Options:
--debug: Enable debug mode to see full dbt logs (useful for debugging compilation errors).
What it does:
- Generates dbt documentation using dbt docs generate
- Serves interactive documentation using dbt docs serve
- Opens a web browser with the documentation
Note: Requires running msh run first to create the build directory. The documentation shows:
- Table schemas and column descriptions
- Lineage graphs (dependencies between models)
- Test results and data quality metrics
- Source definitions
For custom React-based lineage visualization, use msh ui instead.
msh ui
Launches the local dashboard to view pipeline status, logs, and lineage graphs.
msh ui [--port <number>]
Options:
--port <number>: Port to serve the UI on (default: 3000). Change this if port 3000 is already in use.
By default, this starts a web server on http://localhost:3000. The dashboard shows:
- Active and archived deployments.
- Smart Ingest metrics (columns fetched vs. columns available).
- Real-time lineage graph.
Example:
# Use default port 3000
msh ui
# Use custom port
msh ui --port 8080
API Endpoints:
GET /api/catalog.json - Asset catalog data
GET /api/quality.json - Quality metrics and test results
GET /api/stream/run?asset=<name> - Stream run logs
msh versions
View the deployment history of a specific asset.
msh versions <asset_name>
Example Output:
Versions: revenue_mart
┌──────┬─────────────────────┬──────────┐
│ Hash │ Deployed At │ Status │
├──────┼─────────────────────┼──────────┤
│ a1b2 │ 2023-10-27 10:00:00 │ deployed │
│ c3d4 │ 2023-10-26 09:30:00 │ deployed │
└──────┴─────────────────────┴──────────┘
msh status
View the currently active version hash for all assets in the project.
msh status [--format <format>]
Options:
--format <format>: Output format - table (default) or json. JSON format is useful for automation and CI/CD pipelines.
Example Output (Table):
Project Status
┌──────────────┬─────────────┐
│ Asset │ Active Hash │
├──────────────┼─────────────┤
│ revenue_mart │ a1b2 │
│ users_clean │ e5f6 │
└──────────────┴─────────────┘
Example Output (JSON):
{
  "assets": [
    {"name": "revenue_mart", "active_hash": "a1b2"},
    {"name": "users_clean", "active_hash": "e5f6"}
  ]
}
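The JSON format makes msh status easy to script against in CI. A small example of consuming that payload (inlined here rather than captured from a subprocess):

```python
import json

# Payload shape matches the msh status --format json example above.
payload = json.loads("""
{
  "assets": [
    {"name": "revenue_mart", "active_hash": "a1b2"},
    {"name": "users_clean", "active_hash": "e5f6"}
  ]
}
""")

# Build a name -> hash lookup, e.g. to compare against an expected deploy.
active = {a["name"]: a["active_hash"] for a in payload["assets"]}
print(active["revenue_mart"])  # a1b2
```

In a real pipeline you would capture the command's stdout (for example via subprocess) instead of inlining the JSON.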
msh fmt
Format your .msh files to a standard style. This ensures consistency across your team.
msh fmt [path/to/models]
What It Does:
- Indents YAML blocks correctly (2 spaces).
- Formats SQL blocks (keywords uppercase, proper indentation).
- Formats Python blocks (using black).
Example:
Before:
name:users
sql:|
select id,name from source('api','users')
After msh fmt:
name: users
sql: |
  SELECT
    id,
    name
  FROM source('api', 'users')
msh preview
Runs the pipeline but skips the final deployment (Swap). Useful for testing changes in the Green schema without affecting production.
msh preview [options]
Options:
- Same as msh run (--env, --debug, --dry-run, --all, asset selectors).
What it does:
- Executes ingestion and transformation
- Builds the Green schema
- Does NOT swap to production
- Allows you to inspect the Green schema before deploying
msh publish
Publish transformed data assets to external destinations (Reverse ETL).
msh publish <asset_name> --to <destination>
Options:
--to <destination>: Required. Target destination (e.g., salesforce, hubspot, postgres).
Security: Credentials are read from DESTINATION__<TARGET>__CREDENTIALS in your .env file.
Example:
msh publish customers --to salesforce
See Publish Command for complete documentation.
msh freshness
Check if your data assets are up-to-date and meet freshness expectations.
msh freshness
What it does:
- Queries the msh_state_history table (fast) rather than scanning raw data (slow)
- Compares last run time against warn_after and error_after thresholds
- Displays a status table with OK/WARN/ERROR indicators
Configuration: Add freshness block to your .msh file:
freshness:
  warn_after: 12h   # Warn if data is older than 12 hours
  error_after: 24h  # Error if data is older than 24 hours
See Freshness Command for complete documentation.
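The warn_after/error_after comparison can be sketched in a few lines, assuming m/h/d duration suffixes as in the config above (msh's actual parser may support more units):

```python
from datetime import timedelta

def parse_threshold(text):
    """Parse durations like '30m', '12h', '2d' into timedeltas."""
    units = {"m": "minutes", "h": "hours", "d": "days"}
    return timedelta(**{units[text[-1]]: int(text[:-1])})

def freshness_status(age, warn_after="12h", error_after="24h"):
    """Classify the age of the last run against the configured thresholds."""
    if age > parse_threshold(error_after):
        return "ERROR"
    if age > parse_threshold(warn_after):
        return "WARN"
    return "OK"

print(freshness_status(timedelta(hours=3)))   # OK
print(freshness_status(timedelta(hours=15)))  # WARN
print(freshness_status(timedelta(hours=30)))  # ERROR
```

The age itself would come from the last run timestamp recorded in msh_state_history, which is what makes the check fast.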
msh inspect
Parse a .msh asset into structured AST + metadata.
msh inspect <asset_path> [--json]
Description:
Parses a .msh file and extracts structured metadata including:
- Asset blocks (ingest, transform, contract, deploy)
- Schema information (columns, types)
- Lineage (dependencies via {{ ref() }})
- Test definitions
Options:
--json: Output structured JSON instead of human-readable text
Examples:
# Human-readable output
msh inspect assets/revenue.msh
# JSON output
msh inspect assets/revenue.msh --json
See AI Features for more details.
msh manifest
Generate project-level manifest for AI context.
msh manifest [--json] [--update]
Description:
Scans all .msh files in the project and generates a manifest with:
- All asset metadata
- Project configuration
- Warehouse information
- Default schema
Options:
--json: Output manifest as JSON
--update: Update existing manifest instead of regenerating
Examples:
# Generate manifest
msh manifest
# Generate and output as JSON
msh manifest --json
# Update existing manifest
msh manifest --update
Output:
- Saves to .msh/manifest.json
- Also generates .msh/lineage.json, .msh/schemas.json, .msh/tests.json
See Metadata Cache for more details.
msh ai context
Generate AI-ready context pack for asset or project.
msh ai context [--asset <asset_id>] [--json] [--include-tests] [--include-history]
Description: Generates a comprehensive context pack containing:
- Project information
- Asset metadata
- Lineage graph
- Glossary terms
- Schemas
- Tests (optional)
- Metrics and policies
Options:
--asset <asset_id>: Focus the context pack on a specific asset (includes upstream/downstream)
--json: Output context pack as JSON
--include-tests: Include test metadata in context pack
--include-history: Include recent run/deploy history
Examples:
# Generate project-level context pack
msh ai context
# Generate context pack for specific asset
msh ai context --asset revenue
# Include tests and history
msh ai context --asset revenue --include-tests --include-history
# Output as JSON
msh ai context --asset revenue --json
See Context Packs for complete documentation.
msh ai explain
Use AI to explain what an asset does in plain language.
msh ai explain <asset_path> [--json]
Description: Uses AI to generate a natural language explanation of an asset, including:
- What data it ingests
- What transformation it performs
- Output schema
- Dependencies
- Business purpose
Options:
--json: Return structured explanation object instead of plain text
Prerequisites:
- AI configuration must be set up (msh config ai)
- Asset must exist and be parseable
Examples:
# Explain an asset
msh ai explain assets/revenue.msh
# Get structured explanation
msh ai explain assets/revenue.msh --json
See AI Explain for complete documentation.
msh ai review
Use AI to review asset for risks and issues.
msh ai review <asset_path> [--json]
Description: Uses AI to analyze an asset and identify:
- Performance risks (missing indexes, inefficient queries)
- Data quality risks (missing tests, potential null issues)
- Glossary alignment issues (columns not linked to glossary terms)
- Best practice violations
- Suggested improvements
Options:
--json: Return structured review object
Examples:
# Review an asset
msh ai review assets/revenue.msh
# Get structured review
msh ai review assets/revenue.msh --json
See AI Review for complete documentation.
msh ai new
Generate a new .msh asset from natural language description.
msh ai new [--name <name>] [--apply] [--path <path>]
Description:
Prompts user for a natural language description and generates a complete .msh file with:
- Ingest block (appropriate type based on description)
- Transform block with SQL
- Tests
- Contract block (if needed)
Options:
--name <name>: Suggested asset ID/name
--apply: Write the generated asset to disk automatically
--path <path>: Path to save the new .msh file (if --apply is set)
Examples:
# Generate asset (preview only)
msh ai new --name customer_revenue
# Generate and save automatically
msh ai new --name customer_revenue --apply
# Generate and save to specific path
msh ai new --name customer_revenue --apply --path assets/customer_revenue.msh
See AI New for complete documentation.
msh ai fix
Use AI to suggest fixes for broken or failing assets.
msh ai fix <asset_path> [--apply] [--json] [--error <error_message>]
Description: Analyzes a broken asset and suggests fixes:
- Parses error messages (if provided)
- Analyzes asset structure
- Generates JSON patch with suggested fixes
- Optionally applies fixes automatically
Options:
--apply: Apply the suggested patch after user confirmation
--json: Return patch as JSON structure
--error <error_message>: Provide the error message from a failed run
Examples:
# Suggest fixes (preview only)
msh ai fix assets/revenue.msh
# Suggest fixes with error context
msh ai fix assets/revenue.msh --error "Column 'customer_id' not found"
# Apply fixes automatically
msh ai fix assets/revenue.msh --apply
# Get patch as JSON
msh ai fix assets/revenue.msh --json
See AI Fix for complete documentation.
msh ai tests
Use AI to generate or improve tests for an asset.
msh ai tests <asset_path> [--apply] [--json]
Description: Analyzes an asset and suggests appropriate tests:
- Reviews asset schema
- Identifies missing tests
- Suggests test improvements
- Generates JSON patch with test additions
Options:
--apply: Apply suggested test changes to the file
--json: Return suggested tests as JSON structure
Examples:
# Suggest tests (preview only)
msh ai tests assets/revenue.msh
# Apply test changes automatically
msh ai tests assets/revenue.msh --apply
# Get suggestions as JSON
msh ai tests assets/revenue.msh --json
See AI Tests for complete documentation.
msh ai apply
Apply an AI-generated patch file to one or more .msh assets.
msh ai apply <patch_file> [--dry-run]
Description:
Applies a JSON patch file (generated by AI commands) to .msh files:
- Validates patch using safety layer
- Shows diff in dry-run mode
- Applies changes atomically
- Reports success/failure for each file
Options:
--dry-run: Show diff without modifying any files
Examples:
# Preview changes
msh ai apply patch.json --dry-run
# Apply patch
msh ai apply patch.json
See AI Apply for complete documentation.
Glossary Commands
msh glossary add-term
Create a new glossary term in the project glossary file.
msh glossary add-term <name> [--description] [--id] [--json]
Description: Creates a new business term in the glossary:
- Generates ID automatically if not provided
- Validates uniqueness
- Saves to glossary.yaml and the .msh/glossary.json cache
Options:
--description: Description of the term
--id: Optional explicit ID (e.g., term.customer)
--json: Output as JSON
Examples:
# Add term with description
msh glossary add-term "Customer" --description "A customer entity"
# Add term with explicit ID
msh glossary add-term "Customer" --id "term.customer" --description "A customer entity"
# Output as JSON
msh glossary add-term "Customer" --json
See Managing Terms for complete documentation.
msh glossary link-term
Link an existing glossary term to assets and columns.
msh glossary link-term <term_name_or_id> --asset <asset> [--column] [--role] [--json]
Description: Links a glossary term to:
- Assets (required)
- Columns (optional)
- Roles (optional: primary_key, foreign_key, attribute)
Options:
--asset <asset>: Asset ID or path (required)
--column <column>: Column name within the asset (optional)
--role <role>: Role of the column (optional)
--json: Output as JSON
Examples:
# Link term to asset
msh glossary link-term "Customer" --asset revenue
# Link term to specific column
msh glossary link-term "Customer" --asset revenue --column customer_id --role primary_key
# Link term to multiple columns
msh glossary link-term "Customer" --asset revenue --column customer_id
msh glossary link-term "Customer" --asset revenue --column customer_name
See Linking Terms for complete documentation.
msh glossary list
List glossary terms in the current project.
msh glossary list [--json]
Description: Lists all glossary terms, metrics, dimensions, and policies:
- Shows term count
- Lists first 10 terms (with more indicator)
- Shows metrics, dimensions, policies counts
Options:
--json: Return glossary as JSON
Examples:
# List glossary
msh glossary list
# Output as JSON
msh glossary list --json
See Managing Terms for complete documentation.
msh glossary export
Export the full glossary as JSON for AI or external tools.
msh glossary export [--path]
Description: Exports the complete glossary as JSON:
- Includes all terms, metrics, dimensions, policies
- Suitable for AI consumption or external tools
- Prints to stdout if no path specified
Options:
--path <path>: Optional output file path; otherwise print to stdout
Examples:
# Print to stdout
msh glossary export
# Save to file
msh glossary export --path glossary.json
See Managing Terms for complete documentation.
msh config ai
Configure AI provider and model for msh.
msh config ai [--provider] [--model] [--api-key] [--endpoint]
Description: Configures AI provider settings:
- Saves to ~/.msh/config (global) or msh.yaml (project-level)
- Supports multiple providers (OpenAI, Anthropic, Ollama, etc.)
- API key resolution with env:VAR_NAME support
Options:
--provider <provider>: AI provider key (e.g., openai, anthropic, ollama)
--model <model>: Model identifier (e.g., gpt-4.5, claude-3-opus)
--api-key <key>: API key or env:VAR_NAME reference
--endpoint <url>: Optional custom endpoint URL
Examples:
# Configure OpenAI
msh config ai --provider openai --model gpt-4 --api-key env:OPENAI_API_KEY
# Configure Anthropic
msh config ai --provider anthropic --model claude-3-opus --api-key env:ANTHROPIC_API_KEY
# Configure Ollama (local)
msh config ai --provider ollama --model llama2 --endpoint http://localhost:11434/api/generate
Supported Providers:
openai - OpenAI (GPT-4, GPT-3.5, etc.)
anthropic - Anthropic (Claude 3 Opus, Sonnet, etc.)
ollama - Ollama (local models, no API key needed)
huggingface - HuggingFace (local models)
azure - Azure OpenAI
See AI Setup for complete documentation.