Glossary Overview

OSS

The glossary system helps you maintain a business dictionary of terms, metrics, dimensions, and policies for data governance.

Cloud vs OSS

This section documents OSS (Open Source) glossary features. For Cloud Platform glossary features, see Cloud Glossary Management.

Purpose

The glossary provides:

  • Business context: Link technical columns to business terms
  • Data governance: Define policies and rules
  • AI enhancement: Better AI understanding of your data
  • Documentation: Self-documenting data assets

Structure

The glossary contains:

Terms

Business concepts (e.g., "Customer", "Revenue"):

terms:
  - id: term.customer
    name: Customer
    description: A customer entity

Metrics

Calculated measures (e.g., "Monthly Revenue"):

metrics:
  - id: metric.monthly_revenue
    name: Monthly Revenue
    description: Total revenue per month
    formula: SUM(amount)
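
To make the metric concrete, here is a minimal Python sketch of what "SUM(amount) per month" means over a few made-up rows. The row layout (an ISO date string plus an amount) and the function name `monthly_revenue` are assumptions for illustration only; this shows the meaning of the metric, not how msh evaluates formulas.

```python
from collections import defaultdict

def monthly_revenue(rows):
    """Sum `amount` per calendar month (YYYY-MM) for rows of (iso_date, amount)."""
    totals = defaultdict(float)
    for iso_date, amount in rows:
        month = iso_date[:7]  # "2024-01-15" -> "2024-01"
        totals[month] += amount
    return dict(totals)

# Hypothetical sample data, not taken from any real asset.
rows = [("2024-01-05", 100.0), ("2024-01-20", 50.0), ("2024-02-03", 75.0)]
print(monthly_revenue(rows))  # {'2024-01': 150.0, '2024-02': 75.0}
```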

Dimensions

Categorization attributes (e.g., "Region"):

dimensions:
  - id: dim.region
    name: Region
    description: Geographic region

Policies

Rules and constraints (e.g., "No PII in public assets"):

policies:
  - name: No PII in public assets
    rule: PII columns cannot be in public schema
    pii_columns: [email, ssn, phone]
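
The rule above can be read as a simple membership check. The following sketch shows one way such a check might look, assuming an asset is described by a column list and a schema name; `find_pii_violations` is a hypothetical helper, not part of msh.

```python
def find_pii_violations(asset_columns, schema, pii_columns):
    """Return the PII columns present in an asset that lives in the public schema."""
    if schema != "public":
        return []  # the policy only restricts public assets
    pii = set(pii_columns)
    return [col for col in asset_columns if col in pii]

pii_columns = ["email", "ssn", "phone"]  # same list as the policy above

print(find_pii_violations(["customer_id", "email", "amount"], "public", pii_columns))  # ['email']
print(find_pii_violations(["customer_id", "email"], "internal", pii_columns))          # []
```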

Storage

The glossary is stored in:

  • Project-level: glossary.yaml or msh.yaml (glossary section)
  • Cache: .msh/glossary.json (auto-generated)

Quick Start

1. Create Terms

msh glossary add-term "Customer" --description "A customer entity"

2. Link Terms

msh glossary link-term "Customer" --asset revenue --column customer_id

3. List Glossary

msh glossary list

4. Export Glossary

msh glossary export --path glossary.json
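
Once exported, the file can be inspected with any JSON tooling. The sketch below counts entries per section, assuming the export uses top-level keys that mirror the YAML sections (`terms`, `metrics`, `dimensions`, `policies`). That layout is an assumption, so check it against a real export. A stand-in file is written to a temporary directory so the example runs on its own.

```python
import json
import tempfile
from pathlib import Path

# Assumed top-level keys; the real export layout may differ.
SECTIONS = ("terms", "metrics", "dimensions", "policies")

def summarize_glossary(path):
    """Count entries per glossary section in a JSON file."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return {section: len(data.get(section, [])) for section in SECTIONS}

# Stand-in for an exported file, built from the examples on this page.
sample = {
    "terms": [{"id": "term.customer", "name": "Customer"}],
    "metrics": [{"id": "metric.monthly_revenue", "name": "Monthly Revenue"}],
}
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "glossary.json"
    path.write_text(json.dumps(sample), encoding="utf-8")
    summary = summarize_glossary(path)

print(summary)  # {'terms': 1, 'metrics': 1, 'dimensions': 0, 'policies': 0}
```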

Benefits

AI Enhancement

Glossary terms improve AI understanding:

# The AI understands that "Customer" refers to the customer_id column
msh ai explain assets/revenue.msh
# Output includes: "This asset calculates revenue per Customer..."

Data Governance

Policies enforce rules:

policies:
  - name: No PII in public assets
    rule: PII columns cannot be in public schema

Enforcement:

  • PII columns are masked in context packs
  • PII columns are blocked in public assets
  • Policy violations are reported
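
As a rough illustration of the masking step, the sketch below replaces the values of PII columns in a single record before it is shared. The record shape, the `***` placeholder, and the `mask_pii` helper are all assumptions; msh's actual masking format is not documented here.

```python
def mask_pii(record, pii_columns, mask="***"):
    """Return a copy of `record` with the values of PII columns replaced by `mask`."""
    pii = set(pii_columns)
    return {key: (mask if key in pii else value) for key, value in record.items()}

# Hypothetical record; column names follow the policy example on this page.
record = {"customer_id": 42, "email": "a@example.com", "amount": 19.99}
print(mask_pii(record, ["email", "ssn", "phone"]))
# {'customer_id': 42, 'email': '***', 'amount': 19.99}
```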

Documentation

The glossary makes assets self-documenting:

msh glossary list
# Shows all terms, metrics, dimensions, and policies