Skip to main content

Sample Command

Quickly preview and sample data from assets for testing, debugging, and data exploration.

Overview

The msh sample command provides a fast way to preview data from your assets without writing SQL queries. It queries the active version of deployed assets and displays results in a formatted table.

Usage

msh sample <asset_name> [options]

Arguments:

  • asset_name (required): Name of the asset to sample

Options

--limit <number>

Number of rows to preview (default: 10).

msh sample orders --limit 20

--size <number>

Create a temporary sample table with this many rows for testing.

msh sample orders --size 1000

What it does:

  • Creates a new table {asset_name}_sample_{size} in the same schema
  • Uses random sampling to select rows
  • Useful for creating test datasets without copying entire tables

Example:

msh sample customers --size 5000
# Creates: customers_sample_5000 table

--env <name>

Environment to query (default: dev).

msh sample orders --env prod

Behavior:

  • dev: Queries dev environment (Git-aware schemas)
  • prod: Queries production environment (fixed schemas)

--source <type>

Source to sample from (default: view).

Options:

  • view: Active view (default, most common) - Shows production data
  • model: Transformed model table - Shows the transformed data
  • raw: Raw ingested table - Shows data as ingested from source
# Sample from active view (production data)
msh sample orders

# Sample from raw ingested table
msh sample orders --source raw

# Sample from transformed model table
msh sample orders --source model

Examples

Quick Preview

Preview the latest 10 rows from an asset:

msh sample revenue

Output:

Sample from revenue (main.revenue)
┌──────────┬─────────────┬──────────┐
│ order_id │ amount │ date │
├──────────┼─────────────┼──────────┤
│ 1001 │ 150.50 │ 2024-01-15│
│ 1002 │ 89.99 │ 2024-01-15│
└──────────┴─────────────┴──────────┘
Showing 10 row(s). Use --limit to change.

Check Raw Data Quality

Inspect raw ingested data before transformation:

msh sample revenue --source raw --limit 50

This helps you:

  • Verify data ingestion is working correctly
  • Check for data quality issues early
  • Understand source data structure

Create Test Dataset

Generate a sample table for testing:

msh sample orders --size 5000

Output:

[OK] Created sample table 'main.orders_sample_5000' with 5000 rows.
Sample table created: main.orders_sample_5000
You can now query this table or use it for testing.

Use cases:

  • Create smaller datasets for faster testing
  • Generate test data for development
  • Isolate data for debugging specific issues

Sample from Production

Query production data (use with caution):

msh sample orders --env prod --limit 5

Sample Multiple Sources

Compare raw vs transformed data:

# Raw data
msh sample orders --source raw --limit 10

# Transformed data
msh sample orders --source model --limit 10

# Production view
msh sample orders --source view --limit 10

How It Works

  1. Resolves Asset Location: Finds the active version of the asset using Blue/Green deployment hash
  2. Queries Database: Executes SELECT * FROM {schema}.{table} LIMIT {limit}
  3. Displays Results: Formats output in a readable table

Source Resolution:

  • View: Queries the active view (e.g., main.orders)
  • Model: Queries the model table (e.g., main.model_orders_a1b2c3)
  • Raw: Queries the raw ingested table (e.g., msh_raw.raw_orders_a1b2c3)

Requirements

Asset Must Be Deployed

The asset must have been deployed at least once:

# If you see this error:
# [ERROR] Sample: Asset 'orders' not found. Run 'msh run orders' first.

# Deploy the asset first:
msh run orders

# Then sample:
msh sample orders

Database Connection

The command requires:

  • Valid database credentials in .env file
  • Active connection to the destination database
  • Read permissions on the target tables/views

Use Cases

1. Quick Data Inspection

# Check latest data
msh sample customers

# Inspect specific number of rows
msh sample customers --limit 100

2. Data Quality Checks

# Check raw data
msh sample orders --source raw --limit 50

# Check transformed data
msh sample orders --source model --limit 50

3. Test Data Generation

# Create test dataset
msh sample orders --size 1000

# Use in SQL queries
SELECT * FROM main.orders_sample_1000 WHERE amount > 100;

4. Debugging Transformations

# Compare before and after transformation
msh sample orders --source raw --limit 10
msh sample orders --source model --limit 10

Command Aliases

You can also use the msh asset group:

# Traditional form
msh sample orders

# Alias form
msh asset sample orders

Both forms work identically. The alias form provides consistency with other asset commands.

Troubleshooting

"Asset not found"

Error:

[ERROR] Sample: Asset 'orders' not found. Run 'msh run orders' first.

Solution: Deploy the asset first:

msh run orders
msh sample orders

"Table not found"

Error:

[ERROR] Sample: Table 'main.model_orders_abc123' not found.

Causes:

  • Asset hasn't been deployed yet
  • Wrong environment specified (--env)
  • Asset name mismatch

Solution:

# Check asset status
msh status

# Verify environment
msh sample orders --env dev

# Deploy if needed
msh run orders

"Connection failed"

Error:

[ERROR] Sample: Failed to connect to database: ...

Solution: Check your .env file and database credentials:

# Verify credentials
msh doctor

# Check .env file
cat .env
  • msh run: Deploy assets before sampling
  • msh status: Check which assets are deployed
  • msh versions: View deployment history

See Also