Environment Configuration
msh supports multiple environments (development, staging, production) with different configurations, schema naming, and deployment behaviors. This guide explains how to configure and use environments effectively.
Overview
Environments in msh control:
- Schema naming: Git-aware schemas in dev, fixed schemas in prod
- Configuration files: Environment-specific .env files
- Deployment behavior: Different destinations and credentials per environment
The --env Flag
All msh commands support the --env flag to specify the target environment:
msh run --env dev
msh run --env staging
msh run --env prod
Default: If --env is not specified, msh defaults to dev.
Environment-Specific Configuration
Creating Environment Files
Create separate .env files for each environment:
my-msh-project/
├── .env.dev # Development environment
├── .env.staging # Staging environment
├── .env.production # Production environment
└── .env # Default (used if no --env specified)
Environment File Naming
msh loads environment files using the pattern: .env.<environment>
Examples:
- msh run --env dev → loads .env.dev
- msh run --env staging → loads .env.staging
- msh run --env prod → loads .env.production or .env.prod
- msh run (no flag) → loads .env (defaults to dev behavior)
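The lookup rules above can be sketched in Python. This is an illustration only: `resolve_env_file` is a hypothetical helper, not part of msh, and trying `.env.production` before `.env.prod` is an assumption, since this guide only states that either name is accepted.

```python
from typing import Iterable, List, Optional

# Candidate file names per --env value, following the naming rules in this guide.
# The order for "prod" (.env.production first) is an assumption.
CANDIDATES = {
    "dev": [".env.dev"],
    "staging": [".env.staging"],
    "prod": [".env.production", ".env.prod"],
}


def resolve_env_file(env: Optional[str], existing: Iterable[str]) -> Optional[str]:
    """Return the first candidate env file found in `existing`, or None."""
    present = set(existing)
    if env is None:
        # No --env flag: the plain .env file is used.
        candidates: List[str] = [".env"]
    else:
        # Environments not listed above fall back to .env.<environment>.
        candidates = CANDIDATES.get(env, [f".env.{env}"])
    for name in candidates:
        if name in present:
            return name
    return None
```

Calling `resolve_env_file("prod", [".env.prod"])` returns `".env.prod"`, while a missing file yields `None`, which is a useful signal when checking why the wrong credentials were picked up.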
Example Environment Files
.env.dev (Development):
# Development database (local or shared dev instance)
DESTINATION__DUCKDB__CREDENTIALS="duckdb:///dev_data.duckdb"
# Or Postgres dev instance
DESTINATION__POSTGRES__CREDENTIALS="postgresql://dev_user:dev_pass@localhost:5432/analytics_dev"
# Source credentials (can use test/sandbox APIs)
STRIPE_API_KEY="sk_test_..."
SALESFORCE_USERNAME="dev@example.com"
.env.staging (Staging):
# Staging database (shared staging instance)
DESTINATION__POSTGRES__CREDENTIALS="postgresql://staging_user:staging_pass@staging-db:5432/analytics_staging"
# Source credentials (staging/test APIs)
STRIPE_API_KEY="sk_test_..."
SALESFORCE_USERNAME="staging@example.com"
.env.production (Production):
# Production database (Snowflake, Postgres, etc.)
DESTINATION__SNOWFLAKE__CREDENTIALS__DATABASE="ANALYTICS_PROD"
DESTINATION__SNOWFLAKE__CREDENTIALS__USERNAME="MSH_PROD_USER"
DESTINATION__SNOWFLAKE__CREDENTIALS__PASSWORD="secure_prod_password"
DESTINATION__SNOWFLAKE__CREDENTIALS__HOST="abc123.snowflakecomputing.com"
DESTINATION__SNOWFLAKE__CREDENTIALS__WAREHOUSE="COMPUTE_WH"
DESTINATION__SNOWFLAKE__CREDENTIALS__ROLE="TRANSFORMER"
# Production source credentials
STRIPE_API_KEY="sk_live_..."
SALESFORCE_USERNAME="prod@example.com"
Schema Naming by Environment
Development Environment (--env dev)
In development, msh uses Git-aware schemas for automatic isolation:
Behavior:
- Schema names include a git branch suffix
- Each developer branch gets isolated schemas
- Prevents conflicts when multiple developers work simultaneously
Examples:
- Branch feature/new-api → Schema: main_feature_new_api
- Branch bugfix/issue-123 → Schema: main_bugfix_issue_123
- Branch main → Schema: main_main
- No git repo → Schema: main_local (fallback)
Raw Dataset:
- Branch feature/new-api → Dataset: msh_raw_feature_new_api
- Production → Dataset: msh_raw (no suffix)
See Git-Aware Schemas for complete details.
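The branch-to-schema mapping in the examples above is consistent with a simple sanitizing rule, sketched here in Python. The function names are hypothetical, and the exact rule (each run of characters other than letters, digits, and underscores becomes one underscore) is an assumption inferred from the examples, not msh's documented implementation.

```python
import re
from typing import Optional


def schema_suffix(branch: Optional[str]) -> str:
    """Turn a git branch name into a schema-safe suffix.

    `None` stands for "not inside a git repository" and maps to "local".
    """
    if branch is None:
        return "local"
    # Assumed rule: collapse every run of unsafe characters into one underscore.
    return re.sub(r"[^A-Za-z0-9_]+", "_", branch).strip("_")


def dev_schema_name(base: str, branch: Optional[str]) -> str:
    """Append the branch suffix to a base schema or dataset name (dev only)."""
    return f"{base}_{schema_suffix(branch)}"
```

For instance, `dev_schema_name("main", "feature/new-api")` gives `main_feature_new_api`, and the same rule applied to the raw dataset base name `msh_raw` gives `msh_raw_feature_new_api`.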
Production Environment (--env prod)
In production, msh uses fixed schemas without git suffixes:
Behavior:
- Schema names are consistent and predictable
- No git branch suffixes
- Always uses base schema names
Examples:
- Schema: main (DuckDB) or PUBLIC (Snowflake) or public (Postgres)
- Raw Dataset: msh_raw
Why: Production deployments should always write to the same schema names, so that downstream queries and dashboards keep pointing at the same objects no matter which branch was deployed.
Environment Comparison
| Feature | Development (--env dev) | Production (--env prod) |
|---|---|---|
| Schema Naming | Git-aware (with branch suffix) | Fixed (no suffix) |
| Example Schema | main_feature_new_api | main |
| Raw Dataset | msh_raw_feature_new_api | msh_raw |
| Use Case | Local development, feature branches | Production deployments |
| Isolation | Per-branch isolation | Single production schema |
| Configuration File | .env.dev | .env.production |
Using Environments in Commands
Run Command
# Development (default)
msh run
msh run --env dev
# Staging
msh run --env staging
# Production
msh run --env prod
Plan Command
# Preview changes in dev
msh plan --env dev
# Preview changes in prod
msh plan --env prod
Rollback Command
# Rollback dev environment
msh rollback --env dev orders
# Rollback production
msh rollback --env prod orders
Status Command
# Check dev status (default environment)
msh status
# Check prod status (requires prod credentials to be configured)
msh status --env prod
CI/CD Integration
GitHub Actions Example
name: Deploy
on:
  push:
    branches: [main]
jobs:
  deploy-staging:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Deploy to Staging
        run: msh run --env staging
        env:
          DESTINATION__POSTGRES__CREDENTIALS: ${{ secrets.STAGING_DB }}
  deploy-production:
    runs-on: ubuntu-latest
    needs: deploy-staging
    if: github.ref == 'refs/heads/main'
    steps:
      - uses: actions/checkout@v3
      - name: Deploy to Production
        run: msh run --env prod
        env:
          DESTINATION__SNOWFLAKE__CREDENTIALS__DATABASE: ${{ secrets.PROD_DB }}
Docker Example
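FROM python:3.11-slim
WORKDIR /app
# Install msh and copy your project files here (steps depend on your setup)
# Copy environment-specific config under the name that --env prod loads
COPY .env.production .env.production
# Run with production environment
CMD ["msh", "run", "--env", "prod"]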
Best Practices
1. Never Commit Environment Files
Add to .gitignore:
.env*
!.env.example
Create .env.example as a template:
# .env.example
DESTINATION__POSTGRES__CREDENTIALS="postgresql://user:pass@host:5432/db"
STRIPE_API_KEY="your_api_key_here"
2. Use Secret Management in Production
Don't:
# ❌ Hardcode in .env file
DESTINATION__SNOWFLAKE__CREDENTIALS__PASSWORD="hardcoded_password"
Do:
# ✅ Use environment variables from secret manager
# Set in CI/CD or orchestration platform
DESTINATION__SNOWFLAKE__CREDENTIALS__PASSWORD="${SNOWFLAKE_PASSWORD}"
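When secrets are injected by a CI/CD or orchestration platform, it can help to fail fast if one of them is missing. The following Python sketch shows one way to do that before invoking msh. `missing_vars` is a hypothetical helper, and the list of required names is simply copied from the Snowflake example in this guide; adjust it to your destination.

```python
import os
from typing import Iterable, List, Mapping, Optional

# Variable names taken from the .env.production example in this guide.
REQUIRED_PROD_VARS = [
    "DESTINATION__SNOWFLAKE__CREDENTIALS__DATABASE",
    "DESTINATION__SNOWFLAKE__CREDENTIALS__USERNAME",
    "DESTINATION__SNOWFLAKE__CREDENTIALS__PASSWORD",
    "DESTINATION__SNOWFLAKE__CREDENTIALS__HOST",
    "DESTINATION__SNOWFLAKE__CREDENTIALS__WAREHOUSE",
    "DESTINATION__SNOWFLAKE__CREDENTIALS__ROLE",
]


def missing_vars(
    required: Iterable[str],
    environ: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the required names that are unset or empty in `environ`."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars(REQUIRED_PROD_VARS)
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
```

Running this as a step before `msh run --env prod` turns a late authentication failure into an early, explicit error that names the missing variables without printing any secret values.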
3. Separate Dev and Prod Databases
Development:
- Use local databases (DuckDB, local Postgres)
- Or shared dev instance with isolated schemas (Git-aware)
Production:
- Use dedicated production database
- Never mix dev and prod data
4. Test in Staging First
Workflow:
- Develop locally (--env dev)
- Deploy to staging (--env staging)
- Verify in staging
- Deploy to production (--env prod)
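The promotion order above could be scripted roughly as follows. This Python sketch only builds the command lines; `commands_for` and `next_environment` are hypothetical helper names, and pairing `msh plan` with `msh run` for each environment is a suggested convention rather than something msh enforces.

```python
from typing import List, Optional

# Promotion order from the workflow above.
PROMOTION_ORDER = ["dev", "staging", "prod"]


def commands_for(env: str) -> List[List[str]]:
    """Preview with `msh plan`, then apply with `msh run`, for one environment."""
    if env not in PROMOTION_ORDER:
        raise ValueError(f"unknown environment: {env!r}")
    return [
        ["msh", "plan", "--env", env],
        ["msh", "run", "--env", env],
    ]


def next_environment(env: str) -> Optional[str]:
    """Return the environment that follows `env`, or None after prod."""
    index = PROMOTION_ORDER.index(env)
    if index + 1 < len(PROMOTION_ORDER):
        return PROMOTION_ORDER[index + 1]
    return None
```

Each command list can be passed to `subprocess.run(cmd, check=True)`, so that a failed plan or run stops the promotion before the next environment is touched.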
5. Environment-Specific Asset Configuration
You can also configure assets differently per environment in msh.yaml:
# msh.yaml
project_name: my_project
environments:
  dev:
    destination: duckdb
    target_schema: main
  staging:
    destination: postgres
    target_schema: analytics_staging
  prod:
    destination: snowflake
    target_schema: ANALYTICS_PROD
Then use:
msh run --env dev # Uses DuckDB
msh run --env staging # Uses Postgres staging
msh run --env prod # Uses Snowflake production
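Conceptually, the `--env` value selects one entry from the `environments` block. The Python sketch below mirrors the msh.yaml example as a plain dictionary to show that lookup. `target_for` is a hypothetical name, and how msh actually reads the file is not covered in this guide.

```python
from typing import Dict, Tuple

# Mirrors the `environments` block of the msh.yaml example above.
ENVIRONMENTS: Dict[str, Dict[str, str]] = {
    "dev": {"destination": "duckdb", "target_schema": "main"},
    "staging": {"destination": "postgres", "target_schema": "analytics_staging"},
    "prod": {"destination": "snowflake", "target_schema": "ANALYTICS_PROD"},
}


def target_for(env: str = "dev") -> Tuple[str, str]:
    """Return (destination, target_schema) for an environment; dev is the default."""
    try:
        settings = ENVIRONMENTS[env]
    except KeyError:
        known = ", ".join(sorted(ENVIRONMENTS))
        raise ValueError(f"unknown environment {env!r}; expected one of: {known}") from None
    return settings["destination"], settings["target_schema"]
```

Raising on an unknown name, rather than silently falling back to dev, keeps a typo such as `--env prd` from writing to the wrong destination.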
Troubleshooting
Wrong Environment File Loaded
Symptom: Command uses wrong credentials or database.
Fix: Verify that the environment file name matches the --env value you pass:
# Check if file exists
ls -la .env.dev .env.production
# Verify file is being loaded
msh run --env dev --debug
Schema Name Mismatch
Symptom: Tables not found or wrong schema used.
Cause: Environment mismatch (dev vs prod) or git branch changed.
Fix:
# Check current environment
msh status
# Verify git branch (in dev)
git branch
# Use correct environment
msh run --env prod
Credentials Not Found
Symptom: Connection errors or authentication failures.
Fix: Ensure environment file exists and contains correct credentials:
# Confirm the file exists and review its contents (this prints secrets to the terminal)
cat .env.production
# Verify credentials format
# Should match: DESTINATION__<TYPE>__CREDENTIALS=...
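The format check can also be automated without echoing secret values. The Python sketch below parses simple KEY=VALUE lines and reports which keys follow the destination credential pattern. Both helpers are hypothetical, and the regular expression is inferred from the variable names used in this guide, so treat it as an approximation of what msh accepts.

```python
import re
from typing import Dict, Iterable, List

# Matches DESTINATION__<TYPE>__CREDENTIALS and nested forms such as
# DESTINATION__SNOWFLAKE__CREDENTIALS__DATABASE (pattern inferred from this guide).
DESTINATION_KEY = re.compile(r"^DESTINATION__[A-Z0-9]+__CREDENTIALS(__[A-Z0-9_]+)?$")


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding double quotes, as used in the example files.
        values[key.strip()] = value.strip().strip('"')
    return values


def destination_keys(keys: Iterable[str]) -> List[str]:
    """Return the keys that look like destination credentials, sorted."""
    return sorted(key for key in keys if DESTINATION_KEY.match(key))
```

An empty result from `destination_keys(parse_env(open(".env.production").read()))` suggests the file contains no destination credentials in the expected format.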
Related Documentation
- Git-Aware Schemas: How dev environment uses git branches
- Production Deployment: Complete production setup guide
- CLI Reference: All commands with --env flag documentation