Troubleshooting

When things go wrong, msh provides tools to help you diagnose and fix issues quickly.

Diagnosing Failures

The key to debugging msh is understanding which layer is failing: Ingestion (dlt) or Transformation (dbt).

Reading the Logs

msh aggregates logs from its internal engine as well as the underlying tools (dlt and dbt). Each log line is prefixed with its source:

[msh] Starting run (run_id: a3f2b1c)
[msh] Initializing Green schema: analytics_green_a3f2b1c
[dlt] Loading resource: stripe.customers
[dlt] ✓ Loaded 1,245 rows to raw_stripe_customers
[dbt] Running model: models/customers.msh
[dbt] ✗ Compilation Error in model customers (models/customers.msh)
[dbt] column "email_address" does not exist
[msh] ✗ Run failed at transformation stage
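Because every line carries its source prefix, you can find the failing layer mechanically. The following is a minimal sketch, not part of msh: it assumes failures are marked with ✗ as in the excerpt above, and the function name first_failing_source is hypothetical.

```python
import re
from typing import Optional

# Matches the "[source] message" prefix shown in the log excerpt above.
LOG_LINE = re.compile(r"^\[(?P<source>msh|dlt|dbt)\]\s+(?P<message>.*)$")


def first_failing_source(log_text: str) -> Optional[str]:
    """Return 'dlt' or 'dbt' for the first tool line marked with ✗, else None."""
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match and match["source"] != "msh" and match["message"].startswith("✗"):
            return match["source"]
    return None


sample = """\
[msh] Starting run (run_id: a3f2b1c)
[dlt] ✓ Loaded 1,245 rows to raw_stripe_customers
[dbt] ✗ Compilation Error in model customers (models/customers.msh)
[msh] ✗ Run failed at transformation stage
"""
print(first_failing_source(sample))  # dbt
```

A result of dlt points at ingestion and dbt at transformation, which tells you which of the two sections below to read.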

dlt (Ingestion) Errors

Errors during the ingestion phase usually relate to:

  • Authentication: Wrong API key or database password.
  • Schema Mismatch: The source data changed and doesn't match the expected schema.
  • Network: Transient network failures.

Example Error:

[dlt] Pipeline stripe_to_postgres failed
[dlt] requests.exceptions.HTTPError: 401 Client Error: Unauthorized
[dlt] Hint: Check your STRIPE_API_KEY environment variable

Fix: Verify the API key in your .env file.

dbt (Transformation) Errors

Errors during the transformation phase are typically SQL errors:

  • Syntax Error: Invalid SQL in your .msh file.
  • Missing Column: You referenced a column that doesn't exist.
  • Test Failure: A data quality test (e.g., unique, not_null) failed.

Example Error:

[dbt] Compilation Error in model customers (models/customers.msh)
[dbt] column "email_address" does not exist
[dbt] LINE 3: email_address,
[dbt] ^
[dbt] Hint: Available columns are: email, status, created_at

Fix: Update your SQL to use the correct column name (email instead of email_address).


Improved Error Messages

msh now provides consistent, helpful error messages with actionable suggestions.

Standard Error Format

All error messages follow this format:

[ERROR] {Phase}: {Message}
Tip: {Actionable suggestion}
Usage examples:
msh run --all
msh run orders
msh run +orders # Run orders and upstreams

Error Message Examples

Before:

Error: Cannot use --all with asset selector.

After:

[ERROR] Run: Cannot use --all with asset selector. Use one or the other.
Usage examples:
msh run --all
msh run orders
msh run +orders # Run orders and upstreams

Command-Specific Improvements

  1. Run Command (msh run):

    • Clearer messages for conflicting options
    • Usage examples for common scenarios
    • Suggestions when no assets match selector
  2. Rollback Command (msh rollback):

    • Helpful messages when backup tables don't exist
    • Suggestions to check available versions with msh versions
    • Usage examples for bulk operations
  3. Validate Command (msh validate):

    • File locations for validation errors
    • Suggestions for fixing common issues
    • Template references for scaffolding (msh create asset)
  4. Orchestrator Errors:

    • Connection error hints
    • Setup guidance for common issues
    • Compilation error context

Debug Mode

When troubleshooting issues, enable debug mode to see detailed logs from underlying tools (dbt, dlt).

Usage:

msh run --debug
msh plan --debug
msh lineage --debug

What it shows:

  • Full dbt compilation logs
  • SQL queries being executed
  • Detailed error stack traces
  • Verbose dlt pipeline logs

Example:

# Run with debug to see detailed dbt logs
msh run orders --debug

# Output includes:
[dbt] Compiling model 'orders'...
[dbt] SQL: SELECT id, name FROM {{ source('api', 'users') }}
[dbt] Error: column "name" does not exist

Running msh doctor

The first step in debugging is always msh doctor. This command performs a suite of health checks on your environment.

msh doctor

Example Output:

msh Doctor v1.0.0
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Environment Checks
[OK] Python 3.11.5
[OK] dbt-core 1.7.0
[OK] dlt 0.4.12

Warehouse Connection
[OK] Connected to postgres://localhost:5432/msh_db
[OK] Schema 'analytics_blue' exists
[OK] State table 'msh_state_history' found
[OK] User has CREATE, DROP, ALTER privileges

Source Connectivity
[OK] Stripe API (verified with test request)
[FAIL] Salesforce API - Invalid credentials

Recommendations:
• Check SALESFORCE_API_KEY in your .env file
• Verify API key has not expired

What msh doctor Checks

  1. Python Dependencies: Ensures dlt, dbt, and other required packages are installed.
  2. Database Connectivity: Tests connection to the destination database.
  3. Permissions: Verifies the database user has necessary privileges for Blue/Green swaps.
  4. State Integrity: Checks that msh_state_history exists and is readable.
  5. Source Connectivity: Attempts to connect to each configured source.
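If msh doctor itself cannot start, you can approximate the first check by hand. This is a rough sketch using only the Python standard library; the package names dbt-core and dlt are taken from the example output above, and installed_versions is a hypothetical helper, not an msh API.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if missing."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Package names as reported in the doctor output above.
print(installed_versions(["dbt-core", "dlt"]))
```

A value of None means the package is not installed in the interpreter you ran the script with, which is often a sign that the wrong virtual environment is active.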

Common Setup Failures

Missing API Key

Symptom: dlt fails to connect to a source.

[dlt] requests.exceptions.HTTPError: 401 Client Error: Unauthorized

Fix: Ensure each environment variable name matches what the source expects. For example:

  • Stripe expects STRIPE_API_KEY
  • Salesforce expects SALESFORCE_USERNAME, SALESFORCE_PASSWORD, SALESFORCE_SECURITY_TOKEN

Check your .env file and verify the variable names.
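A quick way to spot a missing or misspelled name is to check the environment directly. The sketch below is an illustration only: the variable names come from the list above, the helper missing_vars is hypothetical, and it reads the process environment rather than parsing .env, so load your .env first if your shell does not export it.

```python
import os
from typing import List, Mapping

# Variable names taken from the list above; extend for your own sources.
REQUIRED_VARS = {
    "stripe": ["STRIPE_API_KEY"],
    "salesforce": [
        "SALESFORCE_USERNAME",
        "SALESFORCE_PASSWORD",
        "SALESFORCE_SECURITY_TOKEN",
    ],
}


def missing_vars(source: str, env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or empty for a source."""
    return [name for name in REQUIRED_VARS.get(source, []) if not env.get(name)]


print(missing_vars("salesforce", {"SALESFORCE_USERNAME": "me"}))
# ['SALESFORCE_PASSWORD', 'SALESFORCE_SECURITY_TOKEN']
```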


Docker Amnesia

Symptom: msh rollback says there is no previous state to roll back to.

[msh] Error: No previous deployment found in msh_state_history

Cause: msh is probably running in a container without persistent storage, and no remote destination is configured for state, so the deployment history is lost when the container exits.

Fix: Ensure your destination credentials are set (e.g., DESTINATION__POSTGRES__CREDENTIALS) so state is stored in the database, not locally.


Permission Denied

Symptom: msh fails during the swap phase.

[msh] Error: permission denied to create schema "analytics_green_a3f2b1c"

Cause: The database user doesn't have CREATE SCHEMA privileges.

Fix: Grant the necessary permissions:

GRANT CREATE ON DATABASE msh_db TO msh_user;

Port Conflicts

Symptom: msh ui fails to start.

[msh] Error: Address already in use (port 3000)

Fix: Another process is already listening on the default port 3000. Find it and stop it:

# Find the process
lsof -i :3000

# Stop it (add -9 only if it ignores the default signal)
kill <PID>
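Where lsof is not available, the same check can be done from Python. This is a small sketch using the standard socket module; it only tests whether the port can be bound on localhost and does not identify the process holding it. The helper name port_is_free is hypothetical.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if host:port can be bound, i.e. nothing else is using it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


# 3000 is the default msh ui port mentioned above.
print(port_is_free(3000))
```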

Schema Mismatch

Symptom: dlt fails with a schema evolution error.

[dlt] SchemaEvolutionError: Column 'new_field' not found in destination table

Cause: The source API added a new field, but your destination table doesn't have it.

Fix: msh handles schema evolution automatically. If you see this error, it means you have schema_evolution disabled. Enable it in your source configuration:

source:
  type: dlt
  source_name: stripe
  config:
    schema_evolution: true

Circular Dependencies

Symptom: dbt fails with a circular dependency error.

[dbt] Compilation Error: Circular dependency detected
[dbt] customers -> orders -> customers

Cause: Two models reference each other, creating a loop.

Fix: Refactor your models to break the cycle. Use CTEs or intermediate models to resolve the dependency.
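To locate the loop in a larger project, it can help to write the dependencies down and search them for a cycle. The following is a minimal depth-first-search sketch; the dictionary format (model name mapped to the models it references) and the function find_cycle are assumptions made for illustration, not something msh or dbt exposes.

```python
from typing import Dict, List, Optional


def find_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Depth-first search; return one cycle as a list of model names, or None."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in deps}
    stack: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        stack.append(node)
        for upstream in deps.get(node, []):
            state = color.get(upstream, WHITE)
            if state == GRAY:
                # upstream is already on the current path: that is the loop.
                return stack[stack.index(upstream):] + [upstream]
            if state == WHITE:
                found = visit(upstream)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in list(deps):
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


print(find_cycle({"customers": ["orders"], "orders": ["customers"]}))
# ['customers', 'orders', 'customers']
```

The returned path shows which reference to remove or move into an intermediate model.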


Source Resolution Errors

When using source definitions in msh.yaml, you may encounter source resolution errors.

Source Not Found

Symptom: Error when referencing a source that doesn't exist in msh.yaml.

[ERROR] Source 'prod_db' not found in msh.yaml.
Available sources: staging_db, analytics_db

Cause: The source name in your .msh file doesn't match any source defined in msh.yaml.

Fix:

  1. Check the source name spelling in your .msh file
  2. Verify the source is defined in msh.yaml
  3. Compare your source name against the available sources listed in the error message

Example:

# msh.yaml
sources:
  - name: prod_db # ✅ Correct name
    type: sql_database
    ...

# models/stg_orders.msh
ingest:
  source: prod_db # ✅ Matches msh.yaml
  table: orders
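Most source-not-found errors are typos. As an illustration, the available names from the error message can be compared against the misspelled one with Python's difflib; suggest_source is a hypothetical helper and the misspelling prod_bd is invented for the example.

```python
import difflib
from typing import List


def suggest_source(name: str, available: List[str]) -> List[str]:
    """Return up to three available source names that closely resemble name."""
    return difflib.get_close_matches(name, available, n=3, cutoff=0.6)


# Available names copied from the error message above.
print(suggest_source("prod_bd", ["prod_db", "staging_db", "analytics_db"]))
```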

Table/Resource Not Found

Symptom: Error when referencing a table or resource that isn't listed in the source definition.

[ERROR] Table 'orders' not found in source 'prod_db'.
Available tables: customers, products

Cause: The table/resource isn't listed in the source's tables or resources array in msh.yaml.

Fix: Add the table/resource to the source definition:

# msh.yaml
sources:
  - name: prod_db
    type: sql_database
    tables:
      - name: customers
      - name: products
      - name: orders # ✅ Add missing table

Mixing Source Reference with Direct Credentials

Symptom: Error when using both source reference and direct credentials.

[ERROR] Cannot specify both 'source' reference and direct 'type'/'credentials' in ingest block

Cause: You're trying to use both a source reference (source: prod_db) and direct credentials (type:, credentials:) in the same ingest block.

Fix: Use either a source reference OR direct credentials, not both:

# ❌ Wrong - mixing both
ingest:
  source: prod_db
  type: sql_database
  credentials: "${DB_URI}"

# ✅ Correct - source reference only
ingest:
  source: prod_db
  table: orders

# ✅ Correct - direct credentials only
ingest:
  type: sql_database
  credentials: "${DB_URI}"
  table: orders

Missing Required Fields

Symptom: Error when source reference is missing required fields.

[ERROR] Source reference requires either 'table' (for SQL sources) or 'resource' (for API sources)

Cause: You're using a source reference but haven't specified which table (SQL) or resource (API) to use.

Fix: Add the required field:

# ✅ SQL Database Source
ingest:
  source: prod_db
  table: orders # Required for SQL sources

# ✅ REST API Source
ingest:
  source: stripe_api
  resource: customers # Required for API sources
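The two ingest rules above (no mixing of a source reference with direct credentials, and a source reference needs a table or a resource) can be expressed as a small check. This is only a sketch of the rules as described in this section: it assumes the ingest block has already been parsed into a dictionary, and ingest_block_problems is a hypothetical name, not msh's own validator.

```python
from typing import Any, Dict, List


def ingest_block_problems(ingest: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in a parsed ingest block."""
    problems: List[str] = []
    has_source = "source" in ingest
    has_direct = "type" in ingest or "credentials" in ingest

    if has_source and has_direct:
        problems.append("both 'source' and direct 'type'/'credentials' are set")
    if has_source and not ("table" in ingest or "resource" in ingest):
        problems.append("source reference needs 'table' or 'resource'")
    return problems


print(ingest_block_problems({"source": "prod_db", "type": "sql_database"}))
print(ingest_block_problems({"source": "prod_db", "table": "orders"}))  # []
```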

Environment Variable Not Resolved

Symptom: Source definition uses ${VAR_NAME} but variable isn't set.

[ERROR] Environment variable 'DB_PROD_CREDENTIALS' not found

Cause: The environment variable referenced in msh.yaml isn't set in your .env file or system environment.

Fix:

  1. Check your .env file for the variable
  2. Verify the variable name matches exactly (case-sensitive)
  3. Restart your terminal/session if you just added it

Example:

# msh.yaml
sources:
  - name: prod_db
    credentials: "${DB_PROD_CREDENTIALS}" # Must exist in .env

# .env
DB_PROD_CREDENTIALS="postgresql://user:pass@host/db" # ✅ Set this
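To find every unset variable at once instead of one error at a time, you can scan the configuration text for ${VAR_NAME} placeholders. The sketch below is an assumption-laden illustration: it treats the file as plain text, checks the process environment rather than parsing .env, and unresolved_placeholders is a hypothetical helper.

```python
import os
import re
from typing import List, Mapping

# Matches ${VAR_NAME} placeholders like the one in the example above.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def unresolved_placeholders(
    text: str, env: Mapping[str, str] = os.environ
) -> List[str]:
    """Return placeholder names in text that are not set in env (no duplicates)."""
    missing: List[str] = []
    for name in PLACEHOLDER.findall(text):
        if name not in env and name not in missing:
            missing.append(name)
    return missing


config_text = 'credentials: "${DB_PROD_CREDENTIALS}"'
print(unresolved_placeholders(config_text, env={}))  # ['DB_PROD_CREDENTIALS']
```

Names are compared exactly, so a case mismatch such as db_prod_credentials shows up as missing.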

Schema.Table Format Issues

Symptom: Error when using schema.table format with source references.

[ERROR] Table 'public.orders' not found in source 'prod_db'. Available tables: orders

Cause: You're using schema.table format, but the source already has a schema defined.

Fix: Use just the table name (the source's schema will be used):

# msh.yaml
sources:
  - name: prod_db
    schema: public # Schema defined here
    tables:
      - name: orders

# ✅ Correct - just table name
ingest:
  source: prod_db
  table: orders # Not "public.orders"

# ✅ Also correct - explicit schema override
ingest:
  source: prod_db
  table: public.orders # Explicit override

See msh.yaml Configuration Reference for more details on source definitions.


Variable Resolution Errors

When using variables in SQL with {{ var("variable_name") }}, you may encounter variable resolution errors.

Variable Not Found

Symptom: Error when referencing a variable that doesn't exist in msh.yaml.

[ERROR] Variable 'active_status' not found in msh.yaml vars.
Available variables: start_date, end_date

Cause: The variable name in your SQL doesn't match any variable defined in the vars: section of msh.yaml.

Fix: Add the variable to the vars: section of msh.yaml:

# msh.yaml
vars:
  active_status: "active" # ✅ Add missing variable
  start_date: "${START_DATE}"
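To list every undefined variable in a model before running it, you can extract the {{ var("...") }} calls with a regular expression. This is a minimal sketch: it assumes variables are always referenced with a quoted literal name, and undefined_vars is a hypothetical helper rather than an msh command.

```python
import re
from typing import Iterable, List

# Matches {{ var("name") }} or {{ var('name') }} with optional whitespace.
VAR_CALL = re.compile(r"""\{\{\s*var\(\s*["']([^"']+)["']\s*\)\s*\}\}""")


def undefined_vars(sql: str, defined: Iterable[str]) -> List[str]:
    """Return variable names used in sql that are absent from defined."""
    known = set(defined)
    missing: List[str] = []
    for name in VAR_CALL.findall(sql):
        if name not in known and name not in missing:
            missing.append(name)
    return missing


sql = """SELECT * FROM orders
WHERE status = '{{ var("active_status") }}'
  AND created_at >= '{{ var("start_date") }}'"""
print(undefined_vars(sql, ["start_date", "end_date"]))  # ['active_status']
```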

Environment Variable Not Resolved

Symptom: Variable references an environment variable that isn't set.

[ERROR] Environment variable 'START_DATE' not found

Cause: The environment variable referenced in the vars: section isn't set in your .env file or system environment.

Fix: Add the environment variable to your .env file:

# .env
START_DATE=2024-01-01 # ✅ Add missing variable

Test Suite Errors

When using test suites from msh.yaml, you may encounter test suite resolution errors.

Test Suite Not Found

Symptom: Error when referencing a test suite that doesn't exist.

[ERROR] Test suite 'staging' not found in msh.yaml.
Available suites: financial, quality

Cause: The test suite name in your .msh file doesn't match any suite defined in the test_suites: section of msh.yaml.

Fix: Check the spelling of the suite name, or add the suite to the test_suites: section:

# msh.yaml
test_suites:
  staging: # ✅ Add missing suite
    - assert: "id IS NOT NULL"
    - unique: id

No Test Suites Defined

Symptom: Error when using test_suites: but none are defined.

[ERROR] Test suites requested but no test_suites defined in msh.yaml

Cause: You're using test_suites: in a .msh file but haven't defined any suites in msh.yaml.

Fix: Either define test suites in msh.yaml or use individual tests: instead.


Config Default Errors

When using layer defaults from msh.yaml, you may encounter default resolution errors.

Layer Not Found

Symptom: Error when layer doesn't exist in defaults.

[ERROR] Layer 'staging' not found in defaults.
Available layers: marts, intermediate

Cause: The layer: field in your .msh file doesn't match any layer defined in the defaults: section of msh.yaml.

Fix: Add the layer to the defaults: section, or correct the layer: field:

# msh.yaml
defaults:
  staging: # ✅ Add missing layer
    write_disposition: merge
    primary_key: id

Auto-Detection Failed

Symptom: Layer auto-detection doesn't work as expected.

Cause: The file is not in a recognized directory structure (models/staging/, models/marts/, etc.).

Fix: Either move the file to a recognized directory or set the layer: field explicitly:

# models/custom/stg_orders.msh
layer: staging # ✅ Explicitly set layer
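The idea behind path-based detection can be sketched as follows. This is not msh's actual implementation: the set of recognized directory names is an assumption pieced together from the layers mentioned on this page, and detect_layer is a hypothetical function.

```python
from pathlib import PurePosixPath
from typing import Optional

# Assumed set of recognized layer directories; msh's real list may differ.
KNOWN_LAYERS = {"staging", "intermediate", "marts"}


def detect_layer(path: str) -> Optional[str]:
    """Return the layer implied by models/<layer>/..., or None if unrecognized."""
    parts = PurePosixPath(path).parts
    if len(parts) >= 3 and parts[0] == "models" and parts[1] in KNOWN_LAYERS:
        return parts[1]
    return None


print(detect_layer("models/staging/stg_orders.msh"))  # staging
print(detect_layer("models/custom/stg_orders.msh"))   # None
```

A file that yields None here is the kind that needs an explicit layer: field.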

Macro Errors

When using SQL macros, you may encounter macro rendering errors.

Macro Not Found

Symptom: Error when calling a macro that doesn't exist.

[ERROR] Macro 'clean_string' not found

Cause: The macro isn't defined in any file in the macros/ directory.

Fix: Create the macro in a .sql file in the macros/ directory:

-- macros/common.sql
{% macro clean_string(column_name) %}
TRIM(UPPER({{ column_name }}))
{% endmacro %}

Macro Syntax Error

Symptom: Error when macro has syntax issues.

[ERROR] Error rendering macro 'clean_string': Syntax error in macro file

Cause: The macro definition has invalid Jinja2 syntax.

Fix: Check macro syntax and ensure proper Jinja2 formatting:

-- ✅ Correct syntax
{% macro clean_string(column_name) %}
TRIM(UPPER({{ column_name }}))
{% endmacro %}

Macro Parameter Error

Symptom: Error when calling macro with wrong parameters.

[ERROR] Macro 'format_date' called with incorrect number of arguments

Cause: The macro call doesn't match the macro definition's parameters.

Fix: Check macro definition and ensure parameters match:

-- Macro definition (TO_CHAR is the PostgreSQL formatting function)
{% macro format_date(column_name, format='YYYY-MM-DD') %}
TO_CHAR({{ column_name }}, '{{ format }}')
{% endmacro %}

-- ✅ Correct usage
{{ format_date('created_at', 'YYYY-MM') }}
{{ format_date('created_at') }} -- Uses default format