Migrating from dbt

If you are already using dbt, you don't need to rewrite your entire project to start using msh. msh is designed to wrap your existing SQL logic, adding the power of Atomic Data Engineering (Blue/Green deployment, Smart Ingest, and Rollbacks) without changing your transformation code.

The Wrapper Pattern

The easiest way to migrate is to use the "Wrapper Pattern". You keep your complex .sql files exactly as they are and create a lightweight .msh file to manage the lifecycle.

1. Existing dbt Model

Suppose you have a complex revenue model in models/revenue_mart.sql:

-- models/revenue_mart.sql
WITH transactions AS (
    SELECT * FROM {{ source('stripe', 'charges') }}
),
refunds AS (
    SELECT * FROM {{ source('stripe', 'refunds') }}
)
SELECT
    t.id,
    t.amount - COALESCE(r.amount, 0) AS net_amount,
    t.created_at
FROM transactions t
LEFT JOIN refunds r ON t.id = r.charge_id

2. Create the Wrapper

Create a file named models/revenue_mart.msh next to it:

name: revenue_mart

# Define where the data comes from (msh handles the ingest)
ingest:
  type: dlt
  source_name: stripe
  resource: charges

# Point to your existing SQL file
transform_file: revenue_mart.sql

# Add Atomic Lifecycle features
deployment:
  strategy: blue_green

tests:
  - not_null: id
  - unique: id

3. Run msh

msh run

What happens:

  1. Ingest: msh sees the ingest block and automatically loads the Stripe data into a raw table (e.g., raw_revenue_mart_a1b2).
  2. Compile: msh reads revenue_mart.sql, replaces {{ source(...) }} with the new raw table, and compiles it into a dbt model.
  3. Execute: msh runs dbt to build the model in a staging schema.
  4. Swap: Once tests pass, msh atomically swaps the view to point to the new table.
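The Compile step above can be pictured as a plain text substitution. The following is a minimal sketch, not msh's actual implementation: the function name compile_sql, the regular expression, and the mapping from (source, table) pairs to raw table names are all assumptions made for illustration.

```python
import re

# Matches dbt-style calls such as {{ source('stripe', 'charges') }}.
SOURCE_CALL = re.compile(
    r"\{\{\s*source\(\s*['\"](\w+)['\"]\s*,\s*['\"](\w+)['\"]\s*\)\s*\}\}"
)


def compile_sql(sql: str, raw_tables: dict) -> str:
    """Replace each source() call with the raw table ingested for it.

    raw_tables maps (source_name, table_name) to a physical table name.
    This mapping is hypothetical; msh's internal bookkeeping may differ.
    """

    def substitute(match):
        key = (match.group(1), match.group(2))
        if key not in raw_tables:
            # Fail loudly rather than emit SQL that points at nothing.
            raise KeyError(f"no ingested table for source {key}")
        return raw_tables[key]

    return SOURCE_CALL.sub(substitute, sql)


sql = "SELECT * FROM {{ source('stripe', 'charges') }}"
compiled = compile_sql(sql, {("stripe", "charges"): "raw_revenue_mart_a1b2"})
print(compiled)
```

Raising on an unknown source is a design choice in this sketch: a model that references a source with no ingested table would otherwise compile into SQL that fails later, at execution time.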

Benefits of Wrapping

  • Zero Rewrites: Your SQL logic remains untouched.
  • State Management: msh takes over the responsibility of managing table state (Blue/Green), so you never see "table not found" errors during updates.
  • Unified Pipeline: Ingest and Transform are now coupled in a single atomic unit. If ingestion fails, transformation doesn't run. If transformation fails, the old data remains live.

Source Definitions (dbt-style)

msh supports dbt-style source definitions, allowing you to define sources once in msh.yaml and reference them from individual .msh files. This eliminates credential repetition and follows the DRY principle.

dbt sources.yml → msh.yaml

dbt approach (sources.yml):

# models/sources.yml
sources:
  - name: stripe
    description: Stripe payment data
    tables:
      - name: charges
      - name: customers

  - name: postgres
    description: Production database
    schema: public
    tables:
      - name: orders
      - name: customers

msh approach (msh.yaml):

# msh.yaml
sources:
  - name: stripe_api
    type: rest_api
    endpoint: "https://api.stripe.com"
    credentials:
      api_key: "${STRIPE_API_KEY}"
    resources:
      - name: charges
      - name: customers

  - name: prod_db
    type: sql_database
    credentials: "${DB_PROD_CREDENTIALS}"
    schema: public
    tables:
      - name: orders
      - name: customers
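
The ${VAR_NAME} placeholders keep secrets out of the file itself. How msh resolves them internally is not described here; the sketch below only illustrates the general idea, assuming the configuration has already been parsed into Python dicts and lists. The function name expand and the dummy key value are hypothetical.

```python
import os
import re

# ${VAR_NAME}: letters, digits and underscores, not starting with a digit.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand(value, env=os.environ):
    """Recursively expand ${VAR_NAME} in strings nested in dicts and lists."""
    if isinstance(value, str):

        def lookup(match):
            name = match.group(1)
            if name not in env:
                # A missing secret should stop the run, not become "".
                raise KeyError(f"environment variable {name} is not set")
            return env[name]

        return PLACEHOLDER.sub(lookup, value)
    if isinstance(value, dict):
        return {key: expand(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand(item, env) for item in value]
    return value  # numbers, booleans and None pass through unchanged


source = {
    "name": "stripe_api",
    "type": "rest_api",
    "credentials": {"api_key": "${STRIPE_API_KEY}"},
}
# A plain dict stands in for the real environment in this example.
resolved = expand(source, env={"STRIPE_API_KEY": "dummy-key"})
print(resolved["credentials"]["api_key"])
```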

Referencing Sources

dbt approach:

-- models/stg_orders.sql
SELECT * FROM {{ source('postgres', 'orders') }}

msh approach:

# models/stg_orders.msh
ingest:
  source: prod_db
  table: orders

transform: |
  SELECT * FROM {{ source }}

Key Differences

| Feature | dbt | msh |
| --- | --- | --- |
| Source Definition | sources.yml | msh.yaml |
| Credentials | Defined separately | Included in source definition |
| Environment Variables | Manual handling | ${VAR_NAME} syntax |
| Source Types | SQL databases only | SQL databases + REST APIs + more |
| Usage | {{ source('name', 'table') }} | source: name in ingest block |

Migration Path

  1. Identify Sources: List all sources used in your dbt project
  2. Create msh.yaml: Convert sources.yml entries to msh format
  3. Update .msh Files: Replace {{ source(...) }} with source: references
  4. Test: Verify all sources resolve correctly
  5. Gradual Migration: Both styles can coexist during migration
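
Step 1 can be done by hand, but for larger projects a short script helps. The following is one possible approach, not an msh feature: it scans a models directory for dbt-style source() calls and reports which files use each source. The function name find_sources and the regular expression are assumptions, and the temporary directory exists only to make the example runnable.

```python
import re
import tempfile
from collections import defaultdict
from pathlib import Path

# Matches dbt-style calls such as {{ source('stripe', 'charges') }}.
SOURCE_CALL = re.compile(
    r"\{\{\s*source\(\s*['\"](\w+)['\"]\s*,\s*['\"](\w+)['\"]\s*\)\s*\}\}"
)


def find_sources(models_dir):
    """Map (source, table) to the sorted model files that reference it."""
    found = defaultdict(set)
    for path in Path(models_dir).rglob("*.sql"):
        text = path.read_text(encoding="utf-8")
        for source, table in SOURCE_CALL.findall(text):
            found[(source, table)].add(path.name)
    return {key: sorted(files) for key, files in found.items()}


# Demo on a throwaway directory holding two models from this guide.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "stg_charges.sql").write_text(
        "SELECT * FROM {{ source('stripe', 'charges') }}", encoding="utf-8"
    )
    Path(tmp, "stg_orders.sql").write_text(
        "SELECT * FROM {{ source('postgres', 'orders') }}", encoding="utf-8"
    )
    inventory = find_sources(tmp)

for (source, table), files in sorted(inventory.items()):
    print(f"{source}.{table}: {', '.join(files)}")
```

Each (source, table) pair in the result corresponds to one entry you would need under sources in msh.yaml.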

Example: Complete Migration

Before (dbt):

# models/sources.yml
sources:
  - name: stripe
    tables:
      - name: charges

-- models/stg_charges.sql
SELECT * FROM {{ source('stripe', 'charges') }}

After (msh):

# msh.yaml
sources:
  - name: stripe_api
    type: rest_api
    endpoint: "https://api.stripe.com"
    credentials:
      api_key: "${STRIPE_API_KEY}"
    resources:
      - name: charges

# models/stg_charges.msh
ingest:
  source: stripe_api
  resource: charges

transform: |
  SELECT * FROM {{ source }}

See msh.yaml Configuration Reference for complete documentation.