Universal Connectivity

msh is designed to be destination-agnostic. You can switch your underlying data warehouse from Postgres to Snowflake (or vice versa) with a single configuration change.

Secrets Management

All secrets are managed via environment variables or a .env file. msh follows the dlt convention for naming credentials, which means they are automatically injected into the pipeline without additional configuration.

The .env File

Create a .env file in your project root:

# .env
STRIPE_API_KEY=sk_live_abc123xyz
SALESFORCE_USERNAME=user@company.com
SALESFORCE_PASSWORD=secure_password
SALESFORCE_SECURITY_TOKEN=token123

# Destination credentials
DESTINATION__POSTGRES__CREDENTIALS="postgresql://user:pass@localhost:5432/msh_db"

msh automatically loads this file and makes the variables available to both the orchestration layer (for schema swaps) and the ingestion layer (for API calls).

info

Never commit .env files to version control. Add .env to your .gitignore.
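
For example, a minimal .gitignore entry (also covering environment-specific variants such as .env.prod, mentioned under Best Practices below):

# .gitignore
.env
.env.*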


Destination Configuration

The destination is where your data lands. msh supports any database that dlt and dbt support, including Postgres, Snowflake, BigQuery, Redshift, and DuckDB.
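
Postgres and Snowflake are covered in detail below. Other destinations follow the same dlt naming convention; for example, a local DuckDB destination would presumably take a single credential string like the one below (an assumption based on dlt's DESTINATION__<NAME>__CREDENTIALS pattern, not an msh-specific guarantee):

# .env (assumed)
DESTINATION__DUCKDB__CREDENTIALS="duckdb:///path/to/local.duckdb"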

Postgres

Set the following environment variables. The DESTINATION__... credential string is read by dlt/msh (ingestion and orchestration), while the POSTGRES_... variables are read by dbt (transformation).

# dlt / msh (Ingestion & Orchestration)
DESTINATION__POSTGRES__CREDENTIALS="postgresql://user:password@host:port/database"

# dbt (Transformation)
POSTGRES_HOST="host"
POSTGRES_USER="user"
POSTGRES_PASSWORD="password"
POSTGRES_PORT="5432"
POSTGRES_DB="database"
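
The POSTGRES_* variables are consumed by dbt during the transformation step. As a rough sketch, they map onto a standard dbt profiles.yml via env_var() roughly like this (illustrative only; the profile name msh, the target schema, and the thread count are assumptions, not necessarily what msh uses internally):

# profiles.yml (sketch)
msh:
  target: prod
  outputs:
    prod:
      type: postgres
      host: "{{ env_var('POSTGRES_HOST') }}"
      user: "{{ env_var('POSTGRES_USER') }}"
      password: "{{ env_var('POSTGRES_PASSWORD') }}"
      port: "{{ env_var('POSTGRES_PORT') | as_number }}"
      dbname: "{{ env_var('POSTGRES_DB') }}"
      schema: public
      threads: 4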

Snowflake

For Snowflake, you need both the dlt credentials and the dbt environment variables.

Important Notes:

  • Account name should be just the identifier (e.g., xyz123), not the full URL
  • Schema names are automatically converted to uppercase (Snowflake convention)
  • Schema names are validated and sanitized (max 255 chars, recommended 63)
  • In dev mode, git branch suffixes are automatically appended to schema names

# dlt / msh (Ingestion & Orchestration)
DESTINATION__SNOWFLAKE__CREDENTIALS__DATABASE="ANALYTICS"
DESTINATION__SNOWFLAKE__CREDENTIALS__PASSWORD="secure_password"
DESTINATION__SNOWFLAKE__CREDENTIALS__USERNAME="MSH_USER"
DESTINATION__SNOWFLAKE__CREDENTIALS__ACCOUNT="xyz123" # or HOST="xyz123.snowflakecomputing.com"
DESTINATION__SNOWFLAKE__CREDENTIALS__WAREHOUSE="COMPUTE_WH"
DESTINATION__SNOWFLAKE__CREDENTIALS__ROLE="TRANSFORMER"

# dbt (Transformation)
SNOWFLAKE_ACCOUNT="xyz123"
SNOWFLAKE_USER="MSH_USER"
SNOWFLAKE_PASSWORD="secure_password"
SNOWFLAKE_ROLE="TRANSFORMER"
SNOWFLAKE_DATABASE="ANALYTICS"
SNOWFLAKE_WAREHOUSE="COMPUTE_WH"

Connection String Format: msh automatically builds the Snowflake connection string in the format:

snowflake://user:password@account/database?warehouse=warehouse&role=role&schema=schema
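
With the example values above, the generated string would be:

snowflake://MSH_USER:secure_password@xyz123/ANALYTICS?warehouse=COMPUTE_WH&role=TRANSFORMER&schema=PUBLIC

(PUBLIC stands in here for whatever schema your pipeline targets.)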

Schema Naming Best Practices:

  • Use uppercase names (automatically enforced)
  • Keep names under 63 characters (recommended)
  • Avoid special characters (automatically sanitized)
  • In dev mode, branch names are appended: PUBLIC_feature_branch

Switching Destinations

To switch from Postgres to Snowflake:

  1. Update your .env file with the new destination credentials.
  2. Run msh init to initialize the internal state in the new destination.
  3. Run msh run to deploy your pipeline.

msh handles the dialect differences for you. Standard SQL transformations will work on both platforms without modification.
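
For example, the .env change amounts to swapping the destination credential block (values taken from the examples above):

# Before: Postgres
# DESTINATION__POSTGRES__CREDENTIALS="postgresql://user:password@host:port/database"

# After: Snowflake
DESTINATION__SNOWFLAKE__CREDENTIALS__ACCOUNT="xyz123"
DESTINATION__SNOWFLAKE__CREDENTIALS__USERNAME="MSH_USER"
DESTINATION__SNOWFLAKE__CREDENTIALS__PASSWORD="secure_password"
DESTINATION__SNOWFLAKE__CREDENTIALS__DATABASE="ANALYTICS"
DESTINATION__SNOWFLAKE__CREDENTIALS__WAREHOUSE="COMPUTE_WH"
DESTINATION__SNOWFLAKE__CREDENTIALS__ROLE="TRANSFORMER"

# Remember to update the dbt-side SNOWFLAKE_* variables as well (see the Snowflake section above).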

note

msh automatically detects the destination type based on the environment variables you provide. You don't need to explicitly set a DESTINATION_TYPE variable.


Source Configuration

msh supports two approaches for defining sources:

  1. Source Definitions (Recommended): Define sources once in msh.yaml and reference them from .msh files
  2. Direct Credentials: Define credentials directly in each .msh file

Both approaches work and can be mixed in the same project. Source definitions follow the DRY principle and are recommended for projects with multiple assets using the same sources.

Source Definitions (Recommended)

Define sources once in msh.yaml and reference them from your .msh files.

Define in msh.yaml:

# msh.yaml
sources:
  - name: prod_db
    type: sql_database
    credentials: "${DB_PROD_CREDENTIALS}"
    schema: public
    tables:
      - name: orders
      - name: customers

  - name: stripe_api
    type: rest_api
    endpoint: "https://api.stripe.com"
    credentials:
      api_key: "${STRIPE_API_KEY}"
    resources:
      - name: customers
      - name: charges

Reference in .msh files:

# models/stg_orders.msh
ingest:
  source: prod_db
  table: orders

transform: |
  SELECT * FROM {{ source }}

# models/stg_stripe_customers.msh
ingest:
  source: stripe_api
  resource: customers

transform: |
  SELECT
    id,
    email,
    created_at
  FROM {{ source }}
  WHERE status = 'active'

Benefits:

  • ✅ Change credentials once, update everywhere
  • ✅ Clear source catalog in one place
  • ✅ Environment variables for sensitive data
  • ✅ Eliminates credential repetition

See msh.yaml Configuration Reference for complete documentation.

Direct Credentials (Alternative)

You can also define credentials directly in each .msh file:

API Sources:

For API sources, reference the credentials using ${ENV_VAR} syntax:

# models/stripe_customers.msh
source:
  type: dlt
  source_name: stripe
  resource: customers
  credentials:
    api_key: ${STRIPE_API_KEY}

sql: |
  SELECT
    id,
    email,
    created_at
  FROM source('stripe', 'customers')
  WHERE status = 'active'

Database Sources (DB-to-DB):

For database-to-database replication, use the sql_database source type:

# models/legacy_users.msh
source:
  type: dlt
  source_name: sql_database
  credentials: ${SOURCE__POSTGRES__CREDENTIALS}
  table: users
  schema: public

sql: |
  SELECT
    user_id,
    email,
    signup_date
  FROM source('sql_database', 'users')

Environment Variable:

SOURCE__POSTGRES__CREDENTIALS="postgresql://readonly_user:pass@legacy-db.company.com:5432/production"

When to Use Each Approach

Use Source Definitions (msh.yaml) when:

  • You have multiple .msh files using the same source
  • You want a centralized source catalog
  • You're building a layered project (staging → intermediate → marts)
  • You want to follow DRY principles

Use Direct Credentials when:

  • You have a single asset using a unique source
  • You're prototyping or testing
  • You need source-specific configuration that doesn't fit the source definition pattern

Both approaches can coexist in the same project, allowing gradual migration from direct credentials to source definitions.


Troubleshooting

Snowflake-Specific Issues

If you encounter errors with Snowflake, see the Snowflake Troubleshooting Guide for detailed solutions to common problems including:

  • Warehouse suspension errors
  • Connection timeouts
  • Authentication failures
  • Schema name issues
  • Quota exceeded errors

Quick Checks:

  1. Verify all required environment variables are set (run env | grep SNOWFLAKE)
  2. Ensure your warehouse is running (not suspended) in Snowflake UI
  3. Check user permissions (CREATE SCHEMA, CREATE TABLE, CREATE VIEW)
  4. Run with the --debug flag for detailed error messages: msh run --debug (combined with check 1 in the snippet below)
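
Checks 1 and 4 combined, for quick copy-paste:

env | grep SNOWFLAKE   # confirm all Snowflake-related variables are set
msh run --debug        # re-run with detailed error output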

General Connection Issues

Problem: Connection fails immediately

  • Check that all required environment variables are set
  • Verify credentials are correct (test manually if possible)
  • Check network connectivity to the destination

Problem: Connection timeout

  • Verify the destination is accessible from your network
  • Check firewall rules and IP whitelisting
  • For Snowflake, ensure warehouse is running

Problem: Authentication errors

  • Double-check username and password (case-sensitive)
  • Verify account/database names match exactly
  • Check user permissions and role assignments

Best Practices

  1. Use Read-Only Credentials for Sources: When connecting to production databases, always use a read-only user to prevent accidental writes.
  2. Separate Environments: Use different .env files for development, staging, and production (e.g., .env.dev, .env.prod); see the sketch after this list.
  3. Rotate Secrets Regularly: API keys and database passwords should be rotated periodically and updated in your secret management system.
  4. Test Connections First: Use msh run --dry-run to verify connections before running full pipelines.
  5. Monitor Warehouse Status: For Snowflake, set up alerts for warehouse suspension and monitor compute usage.
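
Items 2 and 4 above combine naturally into a pre-deployment check. A minimal shell sketch (assuming the environment file contains plain KEY=value lines; copying it to .env, which msh reads automatically, works just as well):

# Export production credentials into the current shell, then verify connections
set -a && source .env.prod && set +a    # or: cp .env.prod .env
msh run --dry-run                        # verify connections without running the full pipeline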