Cloud Metadata Management

Upload and manage project metadata in the cloud for centralized access, cross-project analysis, and team collaboration.

Overview

The Cloud Metadata Engine stores all project metadata centrally, enabling:

  • Centralized Asset Discovery: Find assets across all projects
  • Cross-Project Lineage: Analyze dependencies across projects
  • Version History: Track asset versions and deployments
  • Test Aggregation: Aggregate test results across projects

Uploading Metadata

From CLI

Generate a context pack and upload it to the cloud:

# Generate context pack
msh ai context --json > context.json

# Upload to cloud
curl -X POST \
-H "Authorization: Bearer $API_TOKEN" \
-H "Content-Type: application/json" \
-d "{\"context_pack\": $(cat context.json), \"source\": \"cli\"}" \
https://api.msh.io/api/msh/metadata/projects/$PROJECT_ID/upload
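
For scripts that would rather not splice JSON together in the shell, the same request can be assembled in Python. The following is a minimal sketch using only the standard library. The helper name build_upload_request and its parameters are assumptions made for illustration; the endpoint path, headers, and body fields are taken from the curl command above.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.msh.io"  # host taken from the curl example


def build_upload_request(context_pack, project_id, api_token,
                         source="cli", base_url=BASE_URL):
    """Build (but do not send) the upload request from the curl example.

    Hypothetical helper for illustration; it is not part of msh itself.
    """
    # Percent-encode the project ID so it is safe to place in the URL path.
    project = urllib.parse.quote(str(project_id), safe="")
    url = f"{base_url}/api/msh/metadata/projects/{project}/upload"
    # json.dumps takes care of quoting and escaping the wrapper object.
    body = json.dumps({"context_pack": context_pack, "source": source})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it; the context_pack argument is expected to be the parsed contents of context.json.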

Request Body

{
  "context_pack": {
    "project": {
      "id": "my-project",
      "name": "My Project",
      "warehouse": "postgres",
      "default_schema": "public"
    },
    "assets": [
      {
        "id": "revenue",
        "path": "assets/revenue.msh",
        "blocks": {...},
        "schema": {...}
      }
    ],
    "lineage": [...],
    "glossary": [...]
  },
  "source": "cli"
}
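
A quick local check before uploading can catch an obviously malformed context pack. The sketch below is illustrative only: which fields the server actually requires is not stated here, so treating project.id and each asset's id as mandatory is an assumption, and check_context_pack is a hypothetical helper name.

```python
def check_context_pack(pack):
    """Return a list of problems found; an empty list means none were found.

    Assumption: project.id and every asset's id must be present. The
    server's real validation rules may differ.
    """
    problems = []
    project = pack.get("project")
    if not isinstance(project, dict) or not project.get("id"):
        problems.append("project.id is missing")
    assets = pack.get("assets")
    if not isinstance(assets, list):
        problems.append("assets is not a list")
    else:
        for index, asset in enumerate(assets):
            if not isinstance(asset, dict) or not asset.get("id"):
                problems.append(f"assets[{index}] has no id")
    return problems
```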

Response

{
  "success": true,
  "assets_synced": 15,
  "lineage_edges_synced": 23,
  "schemas_synced": 15,
  "tests_synced": 8
}
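
A caller will usually want to confirm the upload and log what was synced. The sketch below reads the fields shown in the response above. The shape of a failed response is not documented here, so treating a missing or false success value as a failure is an assumption, and summarize_upload is a hypothetical helper.

```python
def summarize_upload(response):
    """Turn a parsed upload response into a one-line summary.

    Raises RuntimeError when "success" is not true (assumed failure signal).
    """
    if response.get("success") is not True:
        raise RuntimeError(f"metadata upload was not successful: {response!r}")
    counts = [
        ("assets", "assets_synced"),
        ("lineage edges", "lineage_edges_synced"),
        ("schemas", "schemas_synced"),
        ("tests", "tests_synced"),
    ]
    # Counters that are absent from the response are reported as 0.
    return ", ".join(f"{response.get(key, 0)} {label}" for label, key in counts)
```

With the sample response above, the function returns "15 assets, 23 lineage edges, 15 schemas, 8 tests".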

API Endpoints

Upload Context Pack

Endpoint: POST /api/msh/metadata/projects/{project_id}/upload

Upload project metadata from CLI or CI/CD.

Example:

curl -X POST \
-H "Authorization: Bearer $API_TOKEN" \
-H "Content-Type: application/json" \
-d "{\"context_pack\": $(cat context.json), \"source\": \"cli\"}" \
https://api.msh.io/api/msh/metadata/projects/1/upload

Get Project Manifest

Endpoint: GET /api/msh/metadata/projects/{project_id}/manifest

Retrieve the complete project manifest.

Example:

curl -X GET \
-H "Authorization: Bearer $API_TOKEN" \
https://api.msh.io/api/msh/metadata/projects/1/manifest

Response:

{
  "project": {...},
  "assets": [...],
  "lineage": [...],
  "schemas": {...},
  "tests": [...]
}

Get Asset Metadata

Endpoint: GET /api/msh/metadata/assets/{asset_id}/metadata

Retrieve metadata for a specific asset.

Example:

curl -X GET \
-H "Authorization: Bearer $API_TOKEN" \
https://api.msh.io/api/msh/metadata/assets/5/metadata

Get Asset Lineage

Endpoint: GET /api/msh/metadata/assets/{asset_id}/lineage

Retrieve lineage information for an asset.

Example:

curl -X GET \
-H "Authorization: Bearer $API_TOKEN" \
https://api.msh.io/api/msh/metadata/assets/5/lineage

Response:

{
  "upstream": [
    {"asset_id": 3, "asset_name": "stg_orders"}
  ],
  "downstream": [
    {"asset_id": 7, "asset_name": "revenue_dashboard"}
  ]
}
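
To follow dependencies further than one hop, the lineage response can be walked repeatedly. The sketch below assumes the endpoint returns only direct neighbours (this is not stated above) and takes a caller-supplied fetch_lineage function, a hypothetical name, that returns the parsed response for a given asset ID.

```python
from collections import deque


def transitive_upstream(asset_id, fetch_lineage):
    """Collect every upstream asset reachable from asset_id.

    fetch_lineage(asset_id) must return a dict shaped like the lineage
    response above. Returns {asset_id: asset_name}, excluding the start.
    """
    found = {}
    visited = {asset_id}
    queue = deque([asset_id])
    while queue:  # breadth-first walk over "upstream" edges
        current = queue.popleft()
        for parent in fetch_lineage(current).get("upstream", []):
            parent_id = parent["asset_id"]
            if parent_id not in visited:  # guards against cycles and repeats
                visited.add(parent_id)
                found[parent_id] = parent["asset_name"]
                queue.append(parent_id)
    return found
```

Swapping "upstream" for "downstream" would give the set of assets affected by a change instead.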

Get Version History

Endpoint: GET /api/msh/metadata/assets/{asset_id}/versions

Retrieve version history for an asset.

Example:

curl -X GET \
-H "Authorization: Bearer $API_TOKEN" \
https://api.msh.io/api/msh/metadata/assets/5/versions

Get Test Definitions

Endpoint: GET /api/msh/metadata/assets/{asset_id}/tests

Retrieve test definitions and results for an asset.

Example:

curl -X GET \
-H "Authorization: Bearer $API_TOKEN" \
https://api.msh.io/api/msh/metadata/assets/5/tests
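
The read endpoints above all follow the same pattern, so a small wrapper can cover them. This is a sketch under stated assumptions: the function names are hypothetical, only the URL paths and the Authorization header come from the examples above, and the opener parameter exists purely so the HTTP call can be replaced in tests.

```python
import json
import urllib.request

BASE_URL = "https://api.msh.io/api/msh/metadata"
ASSET_VIEWS = ("metadata", "lineage", "versions", "tests")


def manifest_url(project_id):
    """URL of the project manifest endpoint."""
    return f"{BASE_URL}/projects/{project_id}/manifest"


def asset_url(asset_id, view):
    """URL of one of the per-asset endpoints listed in ASSET_VIEWS."""
    if view not in ASSET_VIEWS:
        raise ValueError(f"unknown asset view: {view!r}")
    return f"{BASE_URL}/assets/{asset_id}/{view}"


def fetch_json(url, api_token, opener=urllib.request.urlopen):
    """GET a URL with a bearer token and return the parsed JSON body."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_token}"})
    with opener(request) as response:
        return json.load(response)
```

For example, fetch_json(asset_url(5, "versions"), token) would request the version history of asset 5.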

CI/CD Integration

Automatically sync metadata on every deployment:

# .github/workflows/msh-sync.yml
name: Sync msh Metadata

on:
  push:
    branches: [main]

jobs:
  sync-metadata:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Install msh
        run: pip install msh-cli
      - name: Generate context pack
        run: msh ai context --json > context.json
      - name: Upload to cloud
        run: |
          curl -X POST \
            -H "Authorization: Bearer ${{ secrets.MSH_API_TOKEN }}" \
            -H "Content-Type: application/json" \
            -d "{\"context_pack\": $(cat context.json), \"source\": \"webhook\"}" \
            https://api.msh.io/api/msh/metadata/projects/${{ secrets.MSH_PROJECT_ID }}/upload

Best Practices

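  1. Regular Syncs: Upload metadata after every change to assets, schemas, lineage, or tests so the cloud copy does not drift from the project
  2. CI/CD Integration: Automate the sync in your CI/CD pipeline instead of relying on manual uploads
  3. Version Control: Generate the context pack from the same commit you deploy so metadata stays in sync with code changes
  4. Error Handling: Check the HTTP status and the success field of every upload response, and retry or fail the job instead of ignoring errors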
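
One way to act on the error-handling advice is to retry transient failures with exponential backoff. The sketch below is an illustration, not a documented msh feature: with_retries and is_transient are hypothetical helpers, and treating network errors, HTTP 429, and HTTP 5xx as retryable is an assumption about how the API behaves.

```python
import time
import urllib.error


def is_transient(exc):
    """Guess whether an error is worth retrying (assumed policy)."""
    if isinstance(exc, urllib.error.HTTPError):
        return exc.code == 429 or exc.code >= 500
    # URLError covers connection failures such as DNS errors or refusals.
    return isinstance(exc, urllib.error.URLError)


def with_retries(fn, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying transient errors with delays of 1x, 2x, 4x...

    Non-transient errors and the final failed attempt are re-raised.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except Exception as exc:
            is_last = attempt == attempts - 1
            if is_last or not is_transient(exc):
                raise
            sleep(base_delay * 2 ** attempt)
```

Here fn would be a zero-argument function that performs the upload; in CI, letting the final exception propagate makes the job fail visibly.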