Glossary Overview
The glossary system helps you maintain a business dictionary of terms, metrics, dimensions, and policies for data governance.
Cloud vs OSS
This section documents OSS (Open Source) glossary features. For Cloud Platform glossary features, see Cloud Glossary Management.
Purpose
The glossary provides:
- Business context: Link technical columns to business terms
- Data governance: Define policies and rules
- AI enhancement: Better AI understanding of your data
- Documentation: Self-documenting data assets
Structure
The glossary contains:
Terms
Business concepts (e.g., "Customer", "Revenue"):
terms:
  - id: term.customer
    name: Customer
    description: A customer entity
Metrics
Calculated measures (e.g., "Monthly Revenue"):
metrics:
  - id: metric.monthly_revenue
    name: Monthly Revenue
    description: Total revenue per month
    formula: SUM(amount)
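To make the metric definition concrete, here is a minimal Python sketch of what "SUM(amount) per month" computes. The sample rows and the `monthly_revenue` helper are hypothetical illustrations, not part of msh, and the sketch assumes each row carries a date and an amount.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample rows: (order date, amount). Not taken from a real asset.
rows = [
    (date(2024, 1, 5), 120.0),
    (date(2024, 1, 20), 80.0),
    (date(2024, 2, 3), 50.0),
]


def monthly_revenue(rows):
    """Sum the amount column, grouped by (year, month)."""
    totals = defaultdict(float)
    for day, amount in rows:
        totals[(day.year, day.month)] += amount
    return dict(totals)


for (year, month), total in sorted(monthly_revenue(rows).items()):
    print(f"{year}-{month:02d}: {total}")
```

Grouping by `(year, month)` rather than by month alone keeps January 2024 separate from January of any other year.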
Dimensions
Categorization attributes (e.g., "Region"):
dimensions:
  - id: dim.region
    name: Region
    description: Geographic region
Policies
Rules and constraints (e.g., "No PII in public assets"):
policies:
  - name: No PII in public assets
    rule: PII columns cannot be in public schema
    pii_columns: [email, ssn, phone]
Storage
The glossary is stored in:
- Project-level: glossary.yaml or msh.yaml (glossary section)
- Cache: .msh/glossary.json (auto-generated)
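If you need to locate the project-level glossary from a script, a lookup along these lines may help. This is only a sketch: the `find_glossary_source` helper is hypothetical, and the order in which the two files are checked is an assumption, since this page does not say which file wins when both exist.

```python
from pathlib import Path
from typing import Optional


def find_glossary_source(project_dir: Path) -> Optional[Path]:
    """Return the first project-level glossary file that exists, else None.

    Assumption: glossary.yaml is checked before msh.yaml. The actual
    precedence used by msh is not documented here.
    """
    for name in ("glossary.yaml", "msh.yaml"):
        candidate = project_dir / name
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    print(find_glossary_source(Path(".")))
```

The cache file under .msh/ is deliberately left out of the lookup, because it is auto-generated rather than edited by hand.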
Quick Start
1. Create Terms
msh glossary add-term "Customer" --description "A customer entity"
2. Link Terms
msh glossary link-term "Customer" --asset revenue --column customer_id
3. List Glossary
msh glossary list
4. Export Glossary
msh glossary export --path glossary.json
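Conceptually, the link created in step 2 associates a business term with one column of one asset. The sketch below models that association in plain Python to show the idea. The `TermLink` class and `terms_for_column` function are hypothetical names for illustration and do not reflect how msh stores links internally.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TermLink:
    """One term-to-column association (hypothetical model)."""
    term: str
    asset: str
    column: str


# Mirrors the link-term example: "Customer" -> revenue.customer_id
links = [TermLink(term="Customer", asset="revenue", column="customer_id")]


def terms_for_column(links: List[TermLink], asset: str, column: str) -> List[str]:
    """Return the business terms linked to a given asset column."""
    return [link.term for link in links if link.asset == asset and link.column == column]


print(terms_for_column(links, "revenue", "customer_id"))
```

A lookup like this is what lets a tool translate a technical column name back into business vocabulary.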
Benefits
AI Enhancement
Glossary terms improve AI understanding:
# The AI understands that "Customer" refers to the customer_id column
msh ai explain assets/revenue.msh
# Output includes: "This asset calculates revenue per Customer..."
Data Governance
Policies enforce rules:
policies:
  - name: No PII in public assets
    rule: PII columns cannot be in public schema
Enforcement:
- PII columns are masked in context packs
- PII columns are blocked in public assets
- Policy violations are reported
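The three enforcement behaviors above can be illustrated with a short Python sketch. It is not msh's implementation: the function names, the `***` mask token, and the violation message format are all assumptions made for the example. Only the column list comes from the `pii_columns` entry in the policy example.

```python
from typing import Dict, List

# Column names taken from the pii_columns list in the policy example.
PII_COLUMNS = {"email", "ssn", "phone"}


def mask_pii(row: Dict[str, object]) -> Dict[str, object]:
    """Replace PII values with a placeholder (the '***' token is an assumption)."""
    return {key: ("***" if key in PII_COLUMNS else value) for key, value in row.items()}


def check_public_asset(schema: str, columns: List[str]) -> List[str]:
    """Return one violation message per PII column found in a public asset."""
    if schema != "public":
        return []
    return [
        f"No PII in public assets: column '{col}' is PII"
        for col in columns
        if col in PII_COLUMNS
    ]


print(mask_pii({"customer_id": 1, "email": "a@example.com"}))
print(check_public_asset("public", ["customer_id", "email"]))
```

Returning a list of messages instead of raising on the first match lets a caller both block the asset and report every violation at once.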
Documentation
The glossary makes your assets self-documenting:
msh glossary list
# Shows all terms, metrics, dimensions, and policies
Related Documentation
- Managing Terms - Create and manage terms
- Linking Terms - Link terms to assets
- Policies - Define and enforce policies
- Glossary Commands - CLI reference