Environment Configuration

msh supports multiple environments (development, staging, production) with different configurations, schema naming, and deployment behaviors. This guide explains how to configure and use environments effectively.

Overview

Environments in msh control:

  • Schema naming: Git-aware schemas in dev, fixed schemas in prod
  • Configuration files: Environment-specific .env files
  • Deployment behavior: Different destinations and credentials per environment

The --env Flag

All msh commands support the --env flag to specify the target environment:

msh run --env dev
msh run --env staging
msh run --env prod

Default: If --env is not specified, msh defaults to dev.

Environment-Specific Configuration

Creating Environment Files

Create separate .env files for each environment:

my-msh-project/
├── .env.dev # Development environment
├── .env.staging # Staging environment
├── .env.production # Production environment
└── .env # Default (used if no --env specified)

Environment File Naming

msh loads environment files using the pattern: .env.<environment>

Examples:

  • msh run --env dev → loads .env.dev
  • msh run --env staging → loads .env.staging
  • msh run --env prod → loads .env.production or .env.prod
  • msh run (no flag) → loads .env (defaults to dev behavior)
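
The lookup order above can be sketched in a few lines of Python. This is a hypothetical helper illustrating the .env.<environment> pattern, not msh's actual loader; in particular, the prod/production fallback order shown is an assumption.

```python
from pathlib import Path
from typing import Optional

def resolve_env_file(env: Optional[str], root: Path = Path(".")) -> Path:
    """Resolve which .env file to load for an environment.

    Tries .env.<environment> first (with .env.production as a fallback
    spelling for prod), then falls back to the plain .env file.
    """
    candidates = []
    if env:
        candidates.append(root / f".env.{env}")
        if env == "prod":
            candidates.append(root / ".env.production")
    for candidate in candidates:
        if candidate.exists():
            return candidate
    # No environment-specific file found (or no --env given): use .env
    return root / ".env"
```

For example, with only .env.dev present, resolve_env_file("dev") picks it, while resolve_env_file("staging") falls back to .env.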

Example Environment Files

.env.dev (Development):

# Development database (local or shared dev instance)
DESTINATION__DUCKDB__CREDENTIALS="duckdb:///dev_data.duckdb"

# Or Postgres dev instance
DESTINATION__POSTGRES__CREDENTIALS="postgresql://dev_user:dev_pass@localhost:5432/analytics_dev"

# Source credentials (can use test/sandbox APIs)
STRIPE_API_KEY="sk_test_..."
SALESFORCE_USERNAME="dev@example.com"

.env.staging (Staging):

# Staging database (shared staging instance)
DESTINATION__POSTGRES__CREDENTIALS="postgresql://staging_user:staging_pass@staging-db:5432/analytics_staging"

# Source credentials (staging/test APIs)
STRIPE_API_KEY="sk_test_..."
SALESFORCE_USERNAME="staging@example.com"

.env.production (Production):

# Production database (Snowflake, Postgres, etc.)
DESTINATION__SNOWFLAKE__CREDENTIALS__DATABASE="ANALYTICS_PROD"
DESTINATION__SNOWFLAKE__CREDENTIALS__USERNAME="MSH_PROD_USER"
DESTINATION__SNOWFLAKE__CREDENTIALS__PASSWORD="secure_prod_password"
DESTINATION__SNOWFLAKE__CREDENTIALS__HOST="abc123.snowflakecomputing.com"
DESTINATION__SNOWFLAKE__CREDENTIALS__WAREHOUSE="COMPUTE_WH"
DESTINATION__SNOWFLAKE__CREDENTIALS__ROLE="TRANSFORMER"

# Production source credentials
STRIPE_API_KEY="sk_live_..."
SALESFORCE_USERNAME="prod@example.com"

Schema Naming by Environment

Development Environment (--env dev)

In development, msh uses Git-aware schemas for automatic isolation:

Behavior:

  • Schema names include git branch suffix
  • Each developer branch gets isolated schemas
  • Prevents conflicts when multiple developers work simultaneously

Examples:

  • Branch feature/new-api → Schema: main_feature_new_api
  • Branch bugfix/issue-123 → Schema: main_bugfix_issue_123
  • Branch main → Schema: main_main
  • No git repo → Schema: main_local (fallback)

Raw Dataset:

  • Branch feature/new-api → Dataset: msh_raw_feature_new_api
  • Production → Dataset: msh_raw (no suffix)
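
The branch-to-schema mapping above amounts to a simple sanitization step. A minimal Python sketch of the convention (illustrative only, not msh's actual implementation):

```python
import re
from typing import Optional

def branch_to_suffix(branch: Optional[str]) -> str:
    """Sanitize a git branch name into a schema suffix.

    Lowercases the branch and collapses every run of non-alphanumeric
    characters into a single underscore, matching the examples above.
    "local" is the documented fallback when no git repo is present.
    """
    if branch is None:
        return "local"
    return re.sub(r"[^a-z0-9]+", "_", branch.lower()).strip("_")

def dev_schema(base: str, branch: Optional[str]) -> str:
    """Git-aware schema name used in --env dev."""
    return f"{base}_{branch_to_suffix(branch)}"
```

For example, dev_schema("main", "feature/new-api") yields "main_feature_new_api", and dev_schema("msh_raw", "bugfix/issue-123") yields "msh_raw_bugfix_issue_123".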

See Git-Aware Schemas for complete details.

Production Environment (--env prod)

In production, msh uses fixed schemas without git suffixes:

Behavior:

  • Schema names are consistent and predictable
  • No git branch suffixes
  • Always uses base schema names

Examples:

  • Schema: main (DuckDB), PUBLIC (Snowflake), or public (Postgres)
  • Raw Dataset: msh_raw

Why: Production deployments should always use the same schema names for consistency and to avoid confusion.

Environment Comparison

Feature              Development (--env dev)           Production (--env prod)
Schema Naming        Git-aware (with branch suffix)    Fixed (no suffix)
Example Schema       main_feature_new_api              main
Raw Dataset          msh_raw_feature_new_api           msh_raw
Use Case             Local development, branches       Production deployments
Isolation            Per-branch isolation              Single production schema
Configuration File   .env.dev                          .env.production

Using Environments in Commands

Run Command

# Development (default)
msh run
msh run --env dev

# Staging
msh run --env staging

# Production
msh run --env prod

Plan Command

# Preview changes in dev
msh plan --env dev

# Preview changes in prod
msh plan --env prod

Rollback Command

# Rollback dev environment
msh rollback --env dev orders

# Rollback production
msh rollback --env prod orders

Status Command

# Check dev status (default environment)
msh status

# Check prod status (if you have prod credentials configured)
msh status --env prod

CI/CD Integration

GitHub Actions Example

name: Deploy

on:
  push:
    branches: [main]

jobs:
  deploy-staging:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Deploy to Staging
        run: msh run --env staging
        env:
          DESTINATION__POSTGRES__CREDENTIALS: ${{ secrets.STAGING_DB }}

  deploy-production:
    runs-on: ubuntu-latest
    needs: deploy-staging
    if: github.ref == 'refs/heads/main'
    steps:
      - uses: actions/checkout@v3
      - name: Deploy to Production
        run: msh run --env prod
        env:
          DESTINATION__SNOWFLAKE__CREDENTIALS__DATABASE: ${{ secrets.PROD_DB }}

Docker Example

FROM python:3.11-slim

WORKDIR /app

# Copy environment-specific config
# (avoid baking real secrets into the image; prefer runtime injection
# via your secret manager — see Best Practices below)
COPY .env.production .env

# Run with production environment
CMD ["msh", "run", "--env", "prod"]

Best Practices

1. Never Commit Environment Files

Add to .gitignore:

.env*
!.env.example

Create .env.example as a template:

# .env.example
DESTINATION__POSTGRES__CREDENTIALS="postgresql://user:pass@host:5432/db"
STRIPE_API_KEY="your_api_key_here"

2. Use Secret Management in Production

Don't:

# ❌ Hardcode in .env file
DESTINATION__SNOWFLAKE__CREDENTIALS__PASSWORD="hardcoded_password"

Do:

# ✅ Use environment variables from secret manager
# Set in CI/CD or orchestration platform
DESTINATION__SNOWFLAKE__CREDENTIALS__PASSWORD="${SNOWFLAKE_PASSWORD}"
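
If you keep ${VAR} references in an env file, they must be expanded from the real environment at some point — many CI/CD platforms substitute the value before the file is ever read. As an illustration of the pattern (not a description of msh's own behavior), the expansion can be done with Python's standard library:

```python
import os

def expand_secret(value: str) -> str:
    """Expand ${VAR} references using the process environment.

    Raises KeyError if a referenced variable is not set, so a missing
    secret fails loudly instead of reaching the database driver.
    """
    expanded = os.path.expandvars(value)
    if "${" in expanded:
        raise KeyError(f"unresolved secret reference in {value!r}")
    return expanded
```

With SNOWFLAKE_PASSWORD set in the environment, expand_secret("${SNOWFLAKE_PASSWORD}") returns the real value; if it is unset, the function raises instead of silently passing the literal placeholder along.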

3. Separate Dev and Prod Databases

Development:

  • Use local databases (DuckDB, local Postgres)
  • Or shared dev instance with isolated schemas (Git-aware)

Production:

  • Use dedicated production database
  • Never mix dev and prod data

4. Test in Staging First

Workflow:

  1. Develop locally (--env dev)
  2. Deploy to staging (--env staging)
  3. Verify in staging
  4. Deploy to production (--env prod)

5. Environment-Specific Asset Configuration

You can also vary the destination and target schema per environment in msh.yaml:

# msh.yaml
project_name: my_project

environments:
  dev:
    destination: duckdb
    target_schema: main
  staging:
    destination: postgres
    target_schema: analytics_staging
  prod:
    destination: snowflake
    target_schema: ANALYTICS_PROD

Then use:

msh run --env dev       # Uses DuckDB
msh run --env staging   # Uses Postgres staging
msh run --env prod      # Uses Snowflake production

Troubleshooting

Wrong Environment File Loaded

Symptom: Command uses wrong credentials or database.

Fix: Verify the environment file name matches:

# Check if file exists
ls -la .env.dev .env.production

# Verify file is being loaded
msh run --env dev --debug

Schema Name Mismatch

Symptom: Tables not found or wrong schema used.

Cause: Environment mismatch (dev vs prod) or git branch changed.

Fix:

# Check current environment
msh status

# Verify git branch (in dev)
git branch

# Use correct environment
msh run --env prod

Credentials Not Found

Symptom: Connection errors or authentication failures.

Fix: Ensure environment file exists and contains correct credentials:

# Check if file exists
cat .env.production

# Verify credentials format
# Should match: DESTINATION__<TYPE>__CREDENTIALS=...
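
As a quick sanity check, a small script can flag DESTINATION keys that don't follow the double-underscore shape. The key pattern below matches the examples in this guide; it is a hypothetical helper, not an official msh validator.

```python
import re

# Shape used throughout this guide: DESTINATION__<TYPE>__CREDENTIALS,
# optionally followed by one __<FIELD> segment (e.g. __PASSWORD)
KEY_RE = re.compile(
    r"^DESTINATION__[A-Z][A-Z0-9]*__CREDENTIALS(__[A-Z][A-Z0-9_]*)?$"
)

def find_malformed(env_text: str) -> list:
    """Return DESTINATION-prefixed keys that do not match the expected shape."""
    bad = []
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key = line.split("=", 1)[0].strip()
        if key.startswith("DESTINATION") and not KEY_RE.match(key):
            bad.append(key)
    return bad
```

A key written with single underscores, such as DESTINATION_POSTGRES_CREDENTIALS, would be reported, while DESTINATION__SNOWFLAKE__CREDENTIALS__PASSWORD passes.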