Troubleshooting
When things go wrong, msh provides tools to help you diagnose and fix issues quickly.
Diagnosing Failures
The key to debugging msh is understanding which layer is failing: Ingestion (dlt) or Transformation (dbt).
Reading the Logs
msh aggregates logs from its internal engine as well as the underlying tools (dlt and dbt). Each log line is prefixed with its source:
[msh] Starting run (run_id: a3f2b1c)
[msh] Initializing Green schema: analytics_green_a3f2b1c
[dlt] Loading resource: stripe.customers
[dlt] ✓ Loaded 1,245 rows to raw_stripe_customers
[dbt] Running model: models/customers.msh
[dbt] ✗ Compilation Error in model customers (models/customers.msh)
[dbt] column "email_address" does not exist
[msh] ✗ Run failed at transformation stage
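Because every line carries its source prefix, the failing layer can also be located mechanically, for example in a CI job. The following is a minimal Python sketch, assuming the `[source]` prefix and the ✗ marker appear exactly as in the sample log above; `first_failure` is a hypothetical helper and not part of msh.

```python
import re

# Matches the "[source] message" prefix used in the sample log.
LINE_RE = re.compile(r"^\[(?P<source>\w+)\]\s+(?P<message>.*)$")


def first_failure(log_text):
    """Return (source, message) for the first line marked with ✗, or None."""
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group("message").startswith("✗"):
            # Drop the leading marker and whitespace from the message.
            return match.group("source"), match.group("message").lstrip("✗ ").strip()
    return None


sample = """\
[msh] Starting run (run_id: a3f2b1c)
[dlt] ✓ Loaded 1,245 rows to raw_stripe_customers
[dbt] ✗ Compilation Error in model customers (models/customers.msh)
[dbt] column "email_address" does not exist
[msh] ✗ Run failed at transformation stage
"""

print(first_failure(sample))
# ('dbt', 'Compilation Error in model customers (models/customers.msh)')
```

The first ✗ line comes from dbt, so the problem is in the transformation layer; the later `[msh]` failure line only summarizes it.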
dlt (Ingestion) Errors
Errors during the ingestion phase usually relate to:
- Authentication: Wrong API key or database password.
- Schema Mismatch: The source data changed and doesn't match the expected schema.
- Network: Transient network failures.
Example Error:
[dlt] Pipeline stripe_to_postgres failed
[dlt] requests.exceptions.HTTPError: 401 Client Error: Unauthorized
[dlt] Hint: Check your STRIPE_API_KEY environment variable
Fix: Verify the API key in your .env file.
dbt (Transformation) Errors
Errors during the transformation phase are typically SQL errors:
- Syntax Error: Invalid SQL in your .msh file.
- Missing Column: You referenced a column that doesn't exist.
- Test Failure: A data quality test (e.g., unique, not_null) failed.
Example Error:
[dbt] Compilation Error in model customers (models/customers.msh)
[dbt] column "email_address" does not exist
[dbt] LINE 3: email_address,
[dbt] ^
[dbt] Hint: Available columns are: email, status, created_at
Fix: Update your SQL to use the correct column name (email instead of email_address).
Improved Error Messages
msh now provides consistent, helpful error messages with actionable suggestions.
Standard Error Format
All error messages follow this format:
[ERROR] {Phase}: {Message}
Tip: {Actionable suggestion}
Usage examples:
msh run --all
msh run orders
msh run +orders # Run orders and upstreams
Error Message Examples
Before:
Error: Cannot use --all with asset selector.
After:
[ERROR] Run: Cannot use --all with asset selector. Use one or the other.
Usage examples:
msh run --all
msh run orders
msh run +orders # Run orders and upstreams
Command-Specific Improvements
- Run Command (msh run):
  - Clearer messages for conflicting options
  - Usage examples for common scenarios
  - Suggestions when no assets match selector
- Rollback Command (msh rollback):
  - Helpful messages when backup tables don't exist
  - Suggestions to check available versions with msh versions
  - Usage examples for bulk operations
- Validate Command (msh validate):
  - File locations for validation errors
  - Suggestions for fixing common issues
  - Template references for scaffolding (msh create asset)
- Orchestrator Errors:
  - Connection error hints
  - Setup guidance for common issues
  - Compilation error context
Debug Mode
When troubleshooting issues, enable debug mode to see detailed logs from underlying tools (dbt, dlt).
Usage:
msh run --debug
msh plan --debug
msh lineage --debug
What it shows:
- Full dbt compilation logs
- SQL queries being executed
- Detailed error stack traces
- Verbose dlt pipeline logs
Example:
# Run with debug to see detailed dbt logs
msh run orders --debug
# Output includes:
[dbt] Compiling model 'orders'...
[dbt] SQL: SELECT id, name FROM {{ source('api', 'users') }}
[dbt] Error: column "name" does not exist
Running msh doctor
The first step in debugging is always msh doctor. This command performs a suite of health checks on your environment.
msh doctor
Example Output:
msh Doctor v1.0.0
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Environment Checks
[OK] Python 3.11.5
[OK] dbt-core 1.7.0
[OK] dlt 0.4.12
Warehouse Connection
[OK] Connected to postgres://localhost:5432/msh_db
[OK] Schema 'analytics_blue' exists
[OK] State table 'msh_state_history' found
[OK] User has CREATE, DROP, ALTER privileges
Source Connectivity
[OK] Stripe API (verified with test request)
[FAIL] Salesforce API - Invalid credentials
Recommendations:
• Check SALESFORCE_API_KEY in your .env file
• Verify API key has not expired
What msh doctor Checks
- Python Dependencies: Ensures dlt, dbt, and other required packages are installed.
- Database Connectivity: Tests connection to the destination database.
- Permissions: Verifies the database user has necessary privileges for Blue/Green swaps.
- State Integrity: Checks that msh_state_history exists and is readable.
- Source Connectivity: Attempts to connect to each configured source.
Common Setup Failures
Missing API Key
Symptom: dlt fails to connect to a source.
[dlt] requests.exceptions.HTTPError: 401 Client Error: Unauthorized
Fix: Ensure the environment variable names match what the source expects. For example:
- Stripe expects STRIPE_API_KEY
- Salesforce expects SALESFORCE_USERNAME, SALESFORCE_PASSWORD, SALESFORCE_SECURITY_TOKEN
Check your .env file and verify the variable names.
Docker Amnesia
Symptom: msh rollback says there is no previous state to roll back to.
[msh] Error: No previous deployment found in msh_state_history
Cause: You might be running msh in a container without persistent storage and you haven't configured a remote destination for state.
Fix: Ensure your destination credentials are set (e.g., DESTINATION__POSTGRES__CREDENTIALS) so state is stored in the database, not locally.
Permission Denied
Symptom: msh fails during the swap phase.
[msh] Error: permission denied to create schema "analytics_green_a3f2b1c"
Cause: The database user lacks the CREATE privilege on the database, which is what allows it to create schemas.
Fix: Grant the necessary permissions:
GRANT CREATE ON DATABASE msh_db TO msh_user;
Port Conflicts
Symptom: msh ui fails to start.
[msh] Error: Address already in use (port 3000)
Fix: The default port 3000 might be in use. Kill the process using the port:
# Find the process
lsof -i :3000
# Kill it
kill -9 <PID>
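If `lsof` is not available (for example inside a slim container), the same condition can be checked from Python. This is a minimal sketch that reproduces the "Address already in use" situation by attempting to bind the port; `port_is_free` is a hypothetical helper, and the assumption that the UI listens on 127.0.0.1 may not match your setup.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if the port can be bound, False if it is already in use."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically EADDRINUSE: another process already holds the port.
            return False
        return True


if __name__ == "__main__":
    state = "free" if port_is_free(3000) else "in use"
    print(f"Port 3000 is {state}")
```

The socket is closed as soon as the check finishes, so running it does not itself block the port.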
Schema Mismatch
Symptom: dlt fails with a schema evolution error.
[dlt] SchemaEvolutionError: Column 'new_field' not found in destination table
Cause: The source API added a new field, but your destination table doesn't have it.
Fix: msh normally handles schema evolution automatically, so this error means schema_evolution has been disabled. Enable it in your source configuration:
source:
type: dlt
source_name: stripe
config:
schema_evolution: true
Circular Dependencies
Symptom: dbt fails with a circular dependency error.
[dbt] Compilation Error: Circular dependency detected
[dbt] customers -> orders -> customers
Cause: Two models reference each other, creating a loop.
Fix: Refactor your models to break the cycle. Use CTEs or intermediate models to resolve the dependency.
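In larger projects the loop may pass through several models, so it helps to see the full cycle before refactoring. The following is a minimal Python sketch of depth-first cycle detection over a hand-written dependency map; `find_cycle` and the `deps` dictionary are illustrative and not how dbt or msh expose their graph.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of model names, or None.

    deps maps each model name to the models it references.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {}
    path = []

    def visit(node):
        state[node] = GREY
        path.append(node)
        for upstream in deps.get(node, ()):
            if state.get(upstream, WHITE) == GREY:
                # Reached a model already on the current path: a cycle.
                return path[path.index(upstream):] + [upstream]
            if state.get(upstream, WHITE) == WHITE:
                found = visit(upstream)
                if found:
                    return found
        path.pop()
        state[node] = BLACK
        return None

    for model in deps:
        if state.get(model, WHITE) == WHITE:
            cycle = visit(model)
            if cycle:
                return cycle
    return None


deps = {"customers": ["orders"], "orders": ["customers"], "products": []}
print(" -> ".join(find_cycle(deps)))  # customers -> orders -> customers
```

Once one edge of the reported cycle is removed (for example, by moving the shared logic into an intermediate model that both depend on), `find_cycle` returns `None`.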
Source Resolution Errors
When using source definitions in msh.yaml, you may encounter source resolution errors.
Source Not Found
Symptom: Error when referencing a source that doesn't exist in msh.yaml.
[ERROR] Source 'prod_db' not found in msh.yaml.
Available sources: staging_db, analytics_db
Cause: The source name in your .msh file doesn't match any source defined in msh.yaml.
Fix:
- Check the source name spelling in your .msh file
- Verify the source is defined in msh.yaml
- Compare the name against the available sources listed in the error message
Example:
# msh.yaml
sources:
- name: prod_db # ✅ Correct name
type: sql_database
...
# models/stg_orders.msh
ingest:
source: prod_db # ✅ Matches msh.yaml
table: orders
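When the cause is a typo, the closest entry in the "Available sources" list is usually the intended one. The sketch below uses Python's standard `difflib` module to pick it; `suggest_source` is a hypothetical helper for your own scripts, and nothing here implies that msh performs this matching itself.

```python
import difflib


def suggest_source(name, available):
    """Return the closest available source name, or None if nothing is similar."""
    matches = difflib.get_close_matches(name, available, n=1, cutoff=0.6)
    return matches[0] if matches else None


available = ["prod_db", "staging_db", "analytics_db"]
print(suggest_source("prod_bd", available))  # prod_db
```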
Table/Resource Not Found
Symptom: Error when referencing a table or resource that isn't listed in the source definition.
[ERROR] Table 'orders' not found in source 'prod_db'.
Available tables: customers, products
Cause: The table/resource isn't listed in the source's tables or resources array in msh.yaml.
Fix: Add the table/resource to the source definition:
# msh.yaml
sources:
- name: prod_db
type: sql_database
tables:
- name: customers
- name: products
- name: orders # ✅ Add missing table
Mixing Source Reference with Direct Credentials
Symptom: Error when using both source reference and direct credentials.
[ERROR] Cannot specify both 'source' reference and direct 'type'/'credentials' in ingest block
Cause: You're trying to use both a source reference (source: prod_db) and direct credentials (type:, credentials:) in the same ingest block.
Fix: Use either a source reference OR direct credentials, not both:
# ❌ Wrong - mixing both
ingest:
source: prod_db
type: sql_database
credentials: "${DB_URI}"
# ✅ Correct - source reference only
ingest:
source: prod_db
table: orders
# ✅ Correct - direct credentials only
ingest:
type: sql_database
credentials: "${DB_URI}"
table: orders
Missing Required Fields
Symptom: Error when source reference is missing required fields.
[ERROR] Source reference requires either 'table' (for SQL sources) or 'resource' (for API sources)
Cause: You're using a source reference but haven't specified which table (SQL) or resource (API) to use.
Fix: Add the required field:
# ✅ SQL Database Source
ingest:
source: prod_db
table: orders # Required for SQL sources
# ✅ REST API Source
ingest:
source: stripe_api
resource: customers # Required for API sources
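The two rules for ingest blocks (no mixing of a source reference with direct credentials, and a source reference must name a table or resource) can be checked before a run. This is a minimal Python sketch that assumes the ingest block has already been loaded into a dictionary; `validate_ingest` and its messages are illustrative, not msh's own validator.

```python
def validate_ingest(ingest):
    """Return a list of problems found in an ingest block given as a dict."""
    problems = []
    has_source = "source" in ingest
    has_direct = "type" in ingest or "credentials" in ingest

    if has_source and has_direct:
        problems.append("cannot combine 'source' with direct 'type'/'credentials'")
    if has_source and not ("table" in ingest or "resource" in ingest):
        problems.append("source reference needs 'table' or 'resource'")
    return problems


mixed = {"source": "prod_db", "type": "sql_database", "credentials": "${DB_URI}"}
for problem in validate_ingest(mixed):
    print(problem)
# cannot combine 'source' with direct 'type'/'credentials'
# source reference needs 'table' or 'resource'
```

A block such as `{"source": "prod_db", "table": "orders"}` passes both checks and yields an empty list.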
Environment Variable Not Resolved
Symptom: A source definition uses ${VAR_NAME}, but the variable isn't set.
[ERROR] Environment variable 'DB_PROD_CREDENTIALS' not found
Cause: The environment variable referenced in msh.yaml isn't set in your .env file or system environment.
Fix:
- Check your .env file for the variable
- Verify the variable name matches exactly (case-sensitive)
- Restart your terminal/session if you just added it
Example:
# msh.yaml
sources:
- name: prod_db
credentials: "${DB_PROD_CREDENTIALS}" # Must exist in .env
# .env
DB_PROD_CREDENTIALS="postgresql://user:pass@host/db" # ✅ Set this
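To catch every unset variable at once rather than one error at a time, you can scan the configuration text for `${...}` placeholders. The following Python sketch assumes placeholders use only the `${NAME}` form shown above and checks them against the current process environment; `unresolved_vars` is a hypothetical helper, and it does not read the .env file by itself.

```python
import os
import re

# Matches ${NAME} placeholders such as ${DB_PROD_CREDENTIALS}.
PLACEHOLDER_RE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def unresolved_vars(text, env=None):
    """Return the sorted placeholder names in text that are missing from env."""
    env = os.environ if env is None else env
    names = sorted(set(PLACEHOLDER_RE.findall(text)))
    return [name for name in names if name not in env]


config = 'credentials: "${DB_PROD_CREDENTIALS}"\nstart_date: "${START_DATE}"\n'
print(unresolved_vars(config, env={"START_DATE": "2024-01-01"}))
# ['DB_PROD_CREDENTIALS']
```

Passing `env` explicitly, as in the example, makes the check easy to test; omitting it checks the real environment of the shell that runs the script.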
Schema.Table Format Issues
Symptom: Error when using schema.table format with source references.
[ERROR] Table 'public.orders' not found in source 'prod_db'. Available tables: orders
Cause: You're using schema.table format, but the source already has a schema defined.
Fix: Use just the table name (the source's schema will be used):
# msh.yaml
sources:
- name: prod_db
schema: public # Schema defined here
tables:
- name: orders
# ✅ Correct - just table name
ingest:
source: prod_db
table: orders # Not "public.orders"
# ✅ Also correct - explicit schema override
ingest:
source: prod_db
table: public.orders # Explicit override
See msh.yaml Configuration Reference for more details on source definitions.
Variable Resolution Errors
When using variables in SQL with {{ var("variable_name") }}, you may encounter variable resolution errors.
Variable Not Found
Symptom: Error when referencing a variable that doesn't exist in msh.yaml.
[ERROR] Variable 'active_status' not found in msh.yaml vars.
Available variables: start_date, end_date
Cause: The variable name in your SQL doesn't match any variable defined in the vars: section of msh.yaml.
Fix: Add the variable to the vars: section of msh.yaml:
# msh.yaml
vars:
active_status: "active" # ✅ Add missing variable
start_date: "${START_DATE}"
Environment Variable Not Resolved
Symptom: Variable references an environment variable that isn't set.
[ERROR] Environment variable 'START_DATE' not found
Cause: The environment variable referenced in the vars: section isn't set in your .env file or system environment.
Fix: Add the environment variable to your .env file:
# .env
START_DATE=2024-01-01 # ✅ Add missing variable
Test Suite Errors
When using test suites from msh.yaml, you may encounter test suite resolution errors.
Test Suite Not Found
Symptom: Error when referencing a test suite that doesn't exist.
[ERROR] Test suite 'staging' not found in msh.yaml.
Available suites: financial, quality
Cause: The test suite name in your .msh file doesn't match any suite defined in the test_suites: section of msh.yaml.
Fix: Check the spelling of the suite name, or add the suite to the test_suites: section:
# msh.yaml
test_suites:
staging: # ✅ Add missing suite
- assert: "id IS NOT NULL"
- unique: id
No Test Suites Defined
Symptom: Error when using test_suites: but none are defined.
[ERROR] Test suites requested but no test_suites defined in msh.yaml
Cause: You're using test_suites: in a .msh file but haven't defined any suites in msh.yaml.
Fix: Either define test suites in msh.yaml or use individual tests: instead.
Config Default Errors
When using layer defaults from msh.yaml, you may encounter default resolution errors.
Layer Not Found
Symptom: Error when layer doesn't exist in defaults.
[ERROR] Layer 'staging' not found in defaults.
Available layers: marts, intermediate
Cause: The layer: field in your .msh file doesn't match any layer defined in the defaults: section of msh.yaml.
Fix: Check the layer: field for typos, or add the layer to the defaults: section:
# msh.yaml
defaults:
staging: # ✅ Add missing layer
write_disposition: merge
primary_key: id
Auto-Detection Failed
Symptom: Layer auto-detection doesn't work as expected.
Cause: The file is not in a recognized directory structure (models/staging/, models/marts/, etc.).
Fix: Either move the file to a recognized directory or explicitly set the layer: field:
# models/custom/stg_orders.msh
layer: staging # ✅ Explicitly set layer
Macro Errors
When using SQL macros, you may encounter macro rendering errors.
Macro Not Found
Symptom: Error when calling a macro that doesn't exist.
[ERROR] Macro 'clean_string' not found
Cause: The macro isn't defined in any file in the macros/ directory.
Fix: Create the macro in a .sql file in the macros/ directory:
-- macros/common.sql
{% macro clean_string(column_name) %}
TRIM(UPPER({{ column_name }}))
{% endmacro %}
Macro Syntax Error
Symptom: Error when macro has syntax issues.
[ERROR] Error rendering macro 'clean_string': Syntax error in macro file
Cause: The macro definition has invalid Jinja2 syntax.
Fix: Check macro syntax and ensure proper Jinja2 formatting:
-- ✅ Correct syntax
{% macro clean_string(column_name) %}
TRIM(UPPER({{ column_name }}))
{% endmacro %}
Macro Parameter Error
Symptom: Error when calling macro with wrong parameters.
[ERROR] Macro 'format_date' called with incorrect number of arguments
Cause: The macro call doesn't match the macro definition's parameters.
Fix: Check macro definition and ensure parameters match:
-- Macro definition
{% macro format_date(column_name, format='YYYY-MM-DD') %}
DATE_FORMAT({{ column_name }}, '{{ format }}')
{% endmacro %}
-- ✅ Correct usage
{{ format_date('created_at', 'YYYY-MM') }}
{{ format_date('created_at') }} -- Uses default format