Vocabulary & Taxonomy Governance

Maintaining Common Semantics Across Sources #


1 Purpose #

Every enterprise speaks multiple dialects — business, technical, regulatory.
EA 2.0 acts as the interpreter, translating those dialects into a shared semantic model that all systems and teams can understand.

Vocabulary and taxonomy governance keep the meaning of entities — “Customer,” “Service,” “Incident,” “Control,” “Outcome” — consistent across cloud systems, BI dashboards, and AI reasoning layers.

Without this shared meaning, AI can correlate data across sources but cannot reliably interpret what that data refers to.


2 Core Principles #

| Principle | Meaning |
|:--|:--|
| Single Semantic Backbone | One canonical term store feeding all domains. |
| Contextual Inheritance | Terms adapt per domain while preserving parent meaning. |
| Machine & Human Alignment | Taxonomies are readable by both business users and LLMs. |
| Change Control | Every definition change is versioned, approved, and propagated. |
| Traceable Usage | Every graph node references a term, making all data “typed.” |

3 Components of EA 2.0 Vocabulary System #

| Component | Description | Tooling / Source |
|:--|:--|:--|
| Business Glossary | Defines business terms, KPIs, policies. | Collibra / SharePoint Term Store / Excel seed |
| Technical Catalog | Enumerates data entities, APIs, and schema names. | Purview / Snowflake / Data Factory |
| Risk & Control Taxonomy | Defines standardized risk types and mitigations. | ServiceNow GRC / ISO 27001 Mapping |
| Capability Taxonomy | Defines enterprise functions and outcomes. | EA 2.0 Ontology / Capability Map |
| Tag Dictionary | Maps informal labels or aliases to canonical terms. | EA 2.0 Graph extension |

Together, these components form EA 2.0’s semantic mesh.


4 Term Object Structure #

Each term in the EA 2.0 graph carries attributes:

| Field | Description | Example |
|:--|:--|:--|
| term_id | Unique key | TERM_12345 |
| preferred_label | Canonical name | “Customer” |
| definition | Official meaning | “Individual or organization purchasing goods/services.” |
| aliases | Synonyms or variants | “Client,” “Buyer,” “Account” |
| domain | Business / Technical scope | Sales |
| version | Semantic version | 3.1 |
| source_of_truth | Glossary source | Collibra |
| last_reviewed | Date of stewardship approval | 2025-09-01 |

Every node (Capability, DataEntity, Policy, Risk, etc.) references at least one term_id.
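The term structure above can be sketched as a small data model. This is only an illustration: the field names come from the table, but the `Term` and `GraphNode` classes, the `NODE_1` identifier, and the validation rule's implementation are assumptions, not the actual EA 2.0 schema.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Term:
    """One entry in the canonical term store (fields mirror the table)."""
    term_id: str
    preferred_label: str
    definition: str
    domain: str
    version: str
    source_of_truth: str
    last_reviewed: date
    aliases: tuple[str, ...] = ()


@dataclass
class GraphNode:
    """Hypothetical graph node that must reference at least one term_id."""
    node_id: str
    node_type: str  # e.g. "Capability", "DataEntity", "Policy", "Risk"
    term_ids: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Enforce the "every node references at least one term" rule.
        if not self.term_ids:
            raise ValueError(f"node {self.node_id} must reference at least one term_id")


# Example values taken from the table; the node itself is made up.
customer = Term(
    term_id="TERM_12345",
    preferred_label="Customer",
    definition="Individual or organization purchasing goods/services.",
    domain="Sales",
    version="3.1",
    source_of_truth="Collibra",
    last_reviewed=date(2025, 9, 1),
    aliases=("Client", "Buyer", "Account"),
)
crm_accounts = GraphNode("NODE_1", "DataEntity", [customer.term_id])
```

Rejecting a node at construction time is one way to keep untyped nodes out of the graph; a real deployment might enforce the same rule at the API or database layer instead.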


5 Governance Workflow #

  1. Proposal: New term request submitted via form or Teams bot.
  2. Review: Steward checks duplication and domain alignment.
  3. Approval: Governance board validates definition.
  4. Propagation: EA 2.0 API syncs term to Graph, SharePoint, and Purview.
  5. Deprecation: Old terms are flagged, and their relationships are automatically repointed to the replacement term.

This workflow ensures controlled evolution rather than chaotic sprawl.
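One way to picture the workflow is as a small state machine in which a term can only move to the next permitted stage. The sketch below is hypothetical: the state names, the `REJECTED` state, and the `advance` and `repoint` helpers are assumptions introduced for illustration and do not describe the EA 2.0 API.

```python
from enum import Enum


class TermState(Enum):
    PROPOSED = "proposed"
    IN_REVIEW = "in_review"
    APPROVED = "approved"
    PROPAGATED = "propagated"
    DEPRECATED = "deprecated"
    REJECTED = "rejected"  # assumption: the source does not name a rejection path


# Permitted transitions, following the five workflow steps in order.
ALLOWED = {
    TermState.PROPOSED: {TermState.IN_REVIEW},
    TermState.IN_REVIEW: {TermState.APPROVED, TermState.REJECTED},
    TermState.APPROVED: {TermState.PROPAGATED},
    TermState.PROPAGATED: {TermState.DEPRECATED},
    TermState.DEPRECATED: set(),
    TermState.REJECTED: set(),
}


def advance(current: TermState, target: TermState) -> TermState:
    """Return the target state, or raise if the workflow forbids the move."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


def repoint(relationships: dict[str, str], old_term_id: str, new_term_id: str) -> dict[str, str]:
    """Return a copy of node_id -> term_id with the deprecated term replaced."""
    return {
        node: (new_term_id if term == old_term_id else term)
        for node, term in relationships.items()
    }
```

With this shape, a term cannot be propagated before it is approved, and deprecating a term becomes a single pass over the relationships that reference it.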


6 Semantic Versioning #

Each change increments the term's version (Major.Minor):

  • Major (x.0): the meaning of the definition changed.
  • Minor (x.y): formatting or metadata updated.

Relationships store which version they were created under, enabling semantic time-travel:

“Show all applications using the pre-2024 definition of ‘Customer.’”
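A minimal sketch of the version bump and the time-travel query follows. It assumes each term keeps a history of versions with an effective date and each relationship records the version it was created under; the class names, the `effective_from` field, and all sample data are made up for illustration.

```python
from dataclasses import dataclass
from datetime import date


def bump(version: str, meaning_changed: bool) -> str:
    """Major bump when the meaning changes, minor bump otherwise."""
    major, minor = (int(part) for part in version.split("."))
    return f"{major + 1}.0" if meaning_changed else f"{major}.{minor + 1}"


@dataclass(frozen=True)
class TermVersion:
    version: str
    effective_from: date  # assumption: versions carry an effective date


@dataclass(frozen=True)
class Relationship:
    node_id: str
    term_id: str
    term_version: str  # version the relationship was created under


def nodes_using_versions_before(relationships, history, term_id, cutoff):
    """Node ids whose link to term_id was created under a version dated before cutoff."""
    old_versions = {v.version for v in history if v.effective_from < cutoff}
    return sorted(
        r.node_id
        for r in relationships
        if r.term_id == term_id and r.term_version in old_versions
    )


# Made-up history and relationships for the "Customer" term.
history = [
    TermVersion("2.0", date(2022, 3, 1)),
    TermVersion("3.0", date(2024, 2, 1)),
    TermVersion("3.1", date(2024, 6, 1)),
]
relationships = [
    Relationship("APP_BILLING", "TERM_12345", "2.0"),
    Relationship("APP_CRM", "TERM_12345", "3.1"),
]
pre_2024 = nodes_using_versions_before(
    relationships, history, "TERM_12345", date(2024, 1, 1)
)
```

Here `bump("3.1", meaning_changed=True)` gives `"4.0"`, `bump("3.1", meaning_changed=False)` gives `"3.2"`, and `pre_2024` contains only `APP_BILLING`.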


7 Synonym & Mapping Table #

EA 2.0 maintains a Synonym Map:

| Alias | Canonical Term | Confidence | Source |
|:--|:--|:--|:--|
| Client | Customer | 0.95 | CRM API |
| Buyer | Customer | 0.9 | Procurement DB |
| Cust_ID | Customer | 0.8 | Legacy Schema |

The NLQ and RAG layers use this map to interpret user prompts correctly: “show all clients” resolves to the canonical term Customer.
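Alias resolution against the Synonym Map could look roughly like the sketch below. The entries are the three rows from the table; the `resolve` function, the 0.85 default threshold, and the naive plural handling are assumptions for illustration only.

```python
from typing import Optional

# alias (lowercased) -> (canonical term, confidence, source), taken from the table
SYNONYM_MAP = {
    "client": ("Customer", 0.95, "CRM API"),
    "buyer": ("Customer", 0.90, "Procurement DB"),
    "cust_id": ("Customer", 0.80, "Legacy Schema"),
}


def resolve(alias: str, min_confidence: float = 0.85) -> Optional[str]:
    """Map an alias to its canonical term if the mapping is confident enough."""
    key = alias.strip().lower()
    # Assumption: strip a trailing "s" as a crude plural fallback ("clients").
    candidates = [key, key[:-1]] if key.endswith("s") else [key]
    for candidate in candidates:
        entry = SYNONYM_MAP.get(candidate)
        if entry is not None and entry[1] >= min_confidence:
            return entry[0]
    return None
```

With the default threshold, `resolve("clients")` returns `"Customer"`, while `resolve("Cust_ID")` returns `None` because its confidence of 0.8 falls below 0.85. Lowering the threshold to 0.8 accepts it. Where to set the cutoff is a governance decision rather than something this sketch can settle.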


8 Integration with External Catalogs #

EA 2.0 exposes and consumes term metadata through APIs:

| System | Direction | Method | Purpose |
|:--|:--|:--|:--|
| Collibra | Import / Sync | REST API / CSV export | Business glossary seed |
| Azure Purview | Bi-directional | Purview Lineage API | Data entity ↔ glossary link |
| SharePoint Term Store | Export | Graph API | Reuse in intranet / Teams |
| ServiceNow GRC | Import | Table API | Align risk/control taxonomies |

This federation keeps all sources semantically consistent without duplication.


9 Semantic Governance Metrics #

| KPI | Definition | Target |
|:--|:--|:--|
| Term Coverage | % of nodes referencing valid terms | ≥ 95 % |
| Duplicate Terms | Terms with overlapping meanings | ≤ 2 % |
| Review Compliance | Terms reviewed in last 12 months | 100 % |
| Synonym Accuracy | Verified synonym-to-term mappings | ≥ 90 % |
| Cross-System Sync Lag | Time between term update and sync | < 24 h |
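
Two of these KPIs are simple ratios and can be sketched directly. The functions below are illustrative only: the input shapes are assumptions, and "last 12 months" is approximated as 365 days.

```python
from datetime import date, timedelta


def term_coverage(node_terms: dict[str, list[str]], valid_term_ids: set[str]) -> float:
    """Percentage of nodes that reference at least one valid term."""
    if not node_terms:
        return 0.0
    covered = sum(
        1 for ids in node_terms.values() if any(t in valid_term_ids for t in ids)
    )
    return 100.0 * covered / len(node_terms)


def review_compliance(last_reviewed: dict[str, date], today: date) -> float:
    """Percentage of terms reviewed within the last 365 days (assumed window)."""
    if not last_reviewed:
        return 0.0
    cutoff = today - timedelta(days=365)
    recent = sum(1 for reviewed in last_reviewed.values() if reviewed >= cutoff)
    return 100.0 * recent / len(last_reviewed)


# Made-up sample: four nodes, two of which reference a valid term.
nodes = {"N1": ["TERM_12345"], "N2": [], "N3": ["TERM_999"], "N4": ["TERM_12345"]}
coverage = term_coverage(nodes, {"TERM_12345"})  # 50.0, well below the 95 % target
```

A node that references only an unknown or retired term counts as uncovered here, which matches the "valid terms" wording of the KPI.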

10 Visualization #

  • Semantic Map: displays relationships among terms and synonyms.
  • Domain Tree: hierarchical view of capabilities, risks, and data concepts.
  • Change Timeline: shows evolution of term versions.
  • Term Impact View: highlights all graph nodes affected by a definition change.

These are rendered in Power BI or directly in EA 2.0’s React front-end.


11 Common Pitfalls & Remedies #

| Issue | Effect | Remedy |
|:--|:--|:--|
| Multiple glossaries per domain | Conflicting meanings | Central glossary + federation API |
| Missing term links | Orphaned nodes | Enforce term_id as mandatory field |
| Unapproved alias growth | Inconsistent NLQ results | Auto-detect aliases via LLM + steward validation |
| Over-engineered hierarchies | Governance fatigue | Keep ≤ 3 levels deep per domain |

12 AI Augmentation #

EA 2.0 uses AI to maintain its taxonomy intelligently:

  • LLM detects potential duplicates or synonym clusters.
  • NLP auto-suggests missing definitions or domain placements.
  • Predictive tagging: new data feeds auto-linked to known terms with confidence scoring.

Human stewards approve AI-suggested terms — ensuring both speed and control.
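The suggest-then-approve loop can be illustrated without an LLM. The sketch below swaps in plain string similarity from Python's standard library as a crude stand-in for the LLM-based duplicate detection described above, so it only catches near-identical labels, not true synonyms. The function names, the 0.8 threshold, and the status values are assumptions.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]; a stand-in for an LLM judgment."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def suggest_duplicates(labels: list[str], threshold: float = 0.8) -> list[dict]:
    """Propose label pairs that look like duplicates; nothing is merged automatically."""
    suggestions = []
    for a, b in combinations(labels, 2):
        score = similarity(a, b)
        if score >= threshold:
            suggestions.append(
                {"pair": (a, b), "score": round(score, 2), "status": "pending_review"}
            )
    return suggestions


def approve(suggestion: dict) -> dict:
    """Steward decision: return a copy of the suggestion marked as approved."""
    return {**suggestion, "status": "approved"}


pending = suggest_duplicates(["Customer", "Customers", "Incident"])
```

Every suggestion starts as `pending_review` and changes only through an explicit steward action, which mirrors the human-in-the-loop control the section describes.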


13 Benefits #

✅ Consistent semantics across systems and analytics.
✅ Reliable NLQ responses (no synonym confusion).
✅ Easier integration with external data catalogs.
✅ Regulatory confidence through definition traceability.
✅ Faster onboarding for new architects and analysts.


14 Cultural Impact #

Vocabulary governance turns architecture from “diagramming” into shared literacy.
When everyone — from developer to CFO — means the same thing by service, collaboration accelerates.


15 Takeaway #

A shared language is the foundation of collective intelligence.
EA 2.0’s vocabulary governance makes the enterprise not only connected, but coherent — ensuring that every query, model, and decision speaks the same tongue.
