Data Quality Metrics & Gates

Table of Contents

Completeness • Freshness • Validity
1 Purpose
2 Core DQ Principles
3 DQ Dimensions Tracked
4 DQ Scoring Model
5 DQ Gates in Pipeline
6 DQ Dashboard KPIs
7 Governance Rules
8 Visualization Views
9 Automation Example
10 Benefits
11 Common Pitfalls & Mitigations
12 Takeaway

Completeness • Freshness • Validity #

1 Purpose #

The power of EA 2.0 depends on the reliability of its inputs.
Bad data breaks reasoning faster than bad algorithms.
Data Quality (DQ) Gates ensure that only trusted, current, and complete information enters the knowledge graph, which keeps predictions, audits, and automations credible.

2 Core DQ Principles #

Principle	Meaning
Evidence over assumption	Every node and relationship must trace back to a verifiable source.
Continuous validation	DQ rules run automatically on ingestion and nightly refresh.
Transparency by design	Each node carries DQ scores visible to users.
Governance integration	Violations raise GRC tasks automatically.

3 DQ Dimensions Tracked #

Dimension	Definition	Example Metric	Target
Completeness	Percentage of mandatory fields populated.	`filled_fields / total_mandatory * 100`	≥ 95 %
Freshness	Time since last update, compared against the data's TTL (time-to-live) threshold.	`now - last_seen_at`	≤ 7 days
Validity	Conformance to pattern, type, or range.	email format, date ISO check	100 %
Uniqueness	No duplicates of IDs or names.	duplicate count per domain	0
Accuracy	Cross-checked against trusted source.	MSI vs Cloud Inventory match rate	≥ 90 %
Lineage Completeness	Linked nodes per ontology rule.	capability → app link %	≥ 80 %
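
The metric formulas in the table can be sketched in a few lines of Python. This is a minimal illustration, not the platform's implementation: the mandatory field list, the email pattern, and all function names are assumptions made for the example; only the completeness formula, `last_seen_at`, and the 7-day target come from the table above.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical mandatory fields and email pattern, for illustration only.
MANDATORY_FIELDS = ("id", "name", "owner_email", "last_seen_at")
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
FRESHNESS_TTL = timedelta(days=7)  # the 7-day freshness target from the table


def completeness_pct(record: dict) -> float:
    """filled_fields / total_mandatory * 100."""
    filled = sum(1 for f in MANDATORY_FIELDS if record.get(f) not in (None, ""))
    return filled / len(MANDATORY_FIELDS) * 100


def is_fresh(record: dict, now: datetime) -> bool:
    """True when (now - last_seen_at) is within the TTL."""
    return now - record["last_seen_at"] <= FRESHNESS_TTL


def is_valid_email(value: str) -> bool:
    """Simple pattern check; production-grade email validation is stricter."""
    return bool(EMAIL_PATTERN.match(value))


def duplicate_count(ids: list) -> int:
    """Number of IDs that repeat an ID already seen."""
    return len(ids) - len(set(ids))


# Example record with one empty mandatory field (owner_email).
now = datetime(2025, 11, 9, tzinfo=timezone.utc)
record = {
    "id": "app-001",
    "name": "Billing",
    "owner_email": "",
    "last_seen_at": now - timedelta(days=3),
}
print(completeness_pct(record))  # 75.0 (3 of 4 mandatory fields filled)
print(is_fresh(record, now))     # True (3 days old, TTL is 7 days)
```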

4 DQ Scoring Model #

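Each node receives a composite DQ Score (0–1): a weighted sum of five dimension scores, each expressed on a 0–1 scale (so 95 % completeness counts as 0.95). The weights sum to 1; Lineage Completeness is tracked as a dimension but carries no weight in the formula.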

DQ Score = 0.3 × Completeness + 0.2 × Freshness + 0.2 × Validity + 0.2 × Uniqueness + 0.1 × Accuracy

Displayed as:
🟢 ≥ 0.9 = Trusted
🟡 0.7 to < 0.9 = Review
🔴 < 0.7 = Critical
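
As a rough sketch of the scoring model, the weighted sum and the three status bands might look like this in Python. The function names and the lowercase status labels are assumptions for illustration; the weights and thresholds are taken from the formula and bands above, and every dimension score is assumed to be on a 0–1 scale already.

```python
# Weights from the DQ Score formula; they sum to 1.0.
WEIGHTS = {
    "completeness": 0.3,
    "freshness": 0.2,
    "validity": 0.2,
    "uniqueness": 0.2,
    "accuracy": 0.1,
}


def dq_score(dimensions: dict) -> float:
    """Weighted sum of the five dimension scores (each in the range 0..1)."""
    return sum(weight * dimensions[name] for name, weight in WEIGHTS.items())


def dq_status(score: float) -> str:
    """Map a composite score to its display band."""
    if score >= 0.9:
        return "trusted"
    if score >= 0.7:
        return "review"
    return "critical"


example = {
    "completeness": 0.95,
    "freshness": 1.0,
    "validity": 1.0,
    "uniqueness": 1.0,
    "accuracy": 0.9,
}
score = dq_score(example)
print(round(score, 3), dq_status(score))  # 0.975 trusted
```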

Scores propagate upward: an application inherits the mean DQ Score of its linked data entities and controls.
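
Upward propagation can be illustrated with a plain mean over the scores of linked nodes. The function name and the choice to return `None` for an application with no linked nodes are assumptions; the source does not say how that case is handled.

```python
from statistics import mean
from typing import Optional


def inherited_dq_score(linked_scores: list) -> Optional[float]:
    """Mean DQ Score of an application's linked data entities and controls.

    Returns None when nothing is linked (an assumption; the article does not
    define this case).
    """
    if not linked_scores:
        return None
    return mean(linked_scores)


# Scores of two data entities and one control linked to one application.
app_score = inherited_dq_score([0.95, 0.80, 0.65])
print(round(app_score, 2))  # 0.8
```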

5 DQ Gates in Pipeline #

Stage 1 – Extract:

Validate file headers, API response codes.
Reject feeds with missing IDs.

Stage 2 – Transform:

Apply schema validation (YAML contract).
Normalize taxonomy (spelling, case, codes).

Stage 3 – Load (Graph):

Compute DQ Scores.
Tag nodes with dq_status.
Log violations → DQ Incident Table.

Stage 4 – Govern:

If dq_status = critical, a ServiceNow GRC ticket is created.
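
The four stages can be sketched as a small in-memory pipeline. Everything here is a simplified assumption: the record fields (`id`, `domain`), the lists standing in for the DQ Incident Table and for GRC tickets, and the rule that any node below "trusted" counts as a violation. Schema validation against the YAML contract and the real ServiceNow call are deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class GateResult:
    loaded: list = field(default_factory=list)
    rejected: list = field(default_factory=list)
    incidents: list = field(default_factory=list)  # stand-in for the DQ Incident Table
    tickets: list = field(default_factory=list)    # stand-in for GRC ticket requests


def extract_gate(records):
    """Stage 1: reject records with a missing ID."""
    accepted, rejected = [], []
    for record in records:
        (accepted if record.get("id") else rejected).append(record)
    return accepted, rejected


def transform_gate(record):
    """Stage 2: normalize taxonomy values (whitespace and case only)."""
    normalized = dict(record)
    normalized["domain"] = str(record.get("domain", "")).strip().lower()
    return normalized


def load_gate(record, score):
    """Stage 3: attach the DQ Score and tag the node with dq_status."""
    status = "trusted" if score >= 0.9 else "review" if score >= 0.7 else "critical"
    return {**record, "dq_score": score, "dq_status": status}


def run_gates(records, score_fn):
    """Run all stages; score_fn is a caller-supplied scoring function."""
    result = GateResult()
    accepted, result.rejected = extract_gate(records)
    for record in accepted:
        node = load_gate(transform_gate(record), score_fn(record))
        result.loaded.append(node)
        if node["dq_status"] != "trusted":  # assumption: non-trusted = violation
            result.incidents.append({"id": node["id"], "dq_status": node["dq_status"]})
        if node["dq_status"] == "critical":  # Stage 4: govern
            result.tickets.append({"node_id": node["id"], "action": "create_grc_ticket"})
    return result


feed = [
    {"id": "a1", "domain": " Finance ", "score": 0.95},
    {"id": "", "domain": "HR", "score": 0.90},
    {"id": "a2", "domain": "HR", "score": 0.50},
]
result = run_gates(feed, score_fn=lambda r: r["score"])
print(len(result.loaded), len(result.rejected), len(result.tickets))  # 2 1 1
```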

6 DQ Dashboard KPIs #

Metric	Description	Threshold
Overall DQ Score	Mean of all active nodes	≥ 0.9
Nodes Failing DQ Gates	Share of active nodes with DQ Score below 0.7	≤ 5 %
Average Data Age	Mean days since last_seen_at	≤ 7 days
Duplicate Rate	% of duplicate IDs	≤ 1 %
Policy-linked DQ Incidents	Open vs closed tickets	≥ 95 % closed within 14 days
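
A hedged sketch of how the first four KPIs could be derived from a list of node records. The node shape (`id`, `dq_score`, `last_seen_at`) and the function name are assumptions for illustration; the failing threshold of 0.7 comes from the table.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean


def dashboard_kpis(nodes: list, now: datetime) -> dict:
    """Compute dashboard KPIs over a non-empty list of active nodes."""
    if not nodes:
        raise ValueError("at least one active node is required")
    scores = [n["dq_score"] for n in nodes]
    ids = [n["id"] for n in nodes]
    ages_days = [(now - n["last_seen_at"]).total_seconds() / 86400 for n in nodes]
    return {
        "overall_dq_score": mean(scores),
        "failing_pct": 100 * sum(s < 0.7 for s in scores) / len(nodes),
        "avg_data_age_days": mean(ages_days),
        "duplicate_rate_pct": 100 * (len(ids) - len(set(ids))) / len(ids),
    }


now = datetime(2025, 11, 9, tzinfo=timezone.utc)
nodes = [
    {"id": "a", "dq_score": 0.95, "last_seen_at": now - timedelta(days=1)},
    {"id": "b", "dq_score": 0.90, "last_seen_at": now - timedelta(days=3)},
    {"id": "c", "dq_score": 0.60, "last_seen_at": now - timedelta(days=10)},
    {"id": "c", "dq_score": 0.85, "last_seen_at": now - timedelta(days=2)},
]
kpis = dashboard_kpis(nodes, now)
print(kpis["failing_pct"], kpis["avg_data_age_days"], kpis["duplicate_rate_pct"])
# 25.0 4.0 25.0
```

In this sample the failing share (25 %) and duplicate rate (25 %) both breach their thresholds, while the average data age (4 days) is within the 7-day limit.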

7 Governance Rules #

Every feed owner has a DQ Steward.
DQ violations trigger automatic notifications in a Teams channel.
Weekly DQ Stand-Up reviews top 10 critical issues.
The Maturity Score is updated quarterly based on DQ improvement.

8 Visualization Views #

DQ Radar Chart: visualizes six dimensions per domain.
Heatmap: color-codes low-score applications.
Trend Line: DQ Score progress over time.
DQ Incident Log: ServiceNow integration view.

Each view feeds Power BI and the natural-language query (NLQ) interface for queries like:

“Show all applications with DQ Score < 0.8 and last seen > 14 days.”
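
To make the semantics of that query concrete, here is how the same filter might read as plain Python over a list of application records. This does not show how the NLQ interface translates questions; the record fields and function name are assumptions.

```python
from datetime import datetime, timedelta, timezone


def stale_low_quality_apps(apps, now, max_score=0.8, min_age=timedelta(days=14)):
    """Applications with DQ Score < max_score and last seen more than min_age ago."""
    return [
        app
        for app in apps
        if app["dq_score"] < max_score and now - app["last_seen_at"] > min_age
    ]


now = datetime(2025, 11, 9, tzinfo=timezone.utc)
apps = [
    {"name": "Billing", "dq_score": 0.75, "last_seen_at": now - timedelta(days=20)},
    {"name": "Payroll", "dq_score": 0.75, "last_seen_at": now - timedelta(days=2)},
    {"name": "CRM", "dq_score": 0.92, "last_seen_at": now - timedelta(days=30)},
]
matches = stale_low_quality_apps(apps, now)
print([app["name"] for app in matches])  # ['Billing']
```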

9 Automation Example #

Policy Trigger:

If dq_status = critical for any data entity linked to a risk, create a GRC ticket and notify the data steward.

Remediation Logic App:

Assign to data owner.
Request corrected file via secure form.
Reload feed and recompute DQ Score.
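
The trigger and the remediation sequence above could be sketched as follows. This is an illustrative outline only: the entity fields (`linked_risks`, `steward`, `owner`), the action names, and the injected `reload_feed` and `score_fn` callables are all hypothetical, and no real ServiceNow or Logic Apps API is called.

```python
def evaluate_policy(entity: dict) -> list:
    """Return the actions the policy trigger would request for one data entity."""
    is_critical = entity.get("dq_status") == "critical"
    has_risk = bool(entity.get("linked_risks"))
    if not (is_critical and has_risk):
        return []
    return [
        {"action": "create_grc_ticket", "entity_id": entity["id"]},
        {"action": "notify_steward", "recipient": entity["steward"]},
    ]


def remediate(entity: dict, reload_feed, score_fn):
    """Walk the three remediation steps in order and return the refreshed entity."""
    log = [
        f"assigned to {entity['owner']}",
        "requested corrected file via secure form",
    ]
    refreshed = reload_feed(entity["id"])       # caller-supplied reload function
    refreshed["dq_score"] = score_fn(refreshed)  # caller-supplied scoring function
    log.append("reloaded feed and recomputed DQ Score")
    return refreshed, log


entity = {
    "id": "de-42",
    "dq_status": "critical",
    "linked_risks": ["R-7"],
    "steward": "steward@example.com",
    "owner": "owner@example.com",
}
actions = evaluate_policy(entity)
print([a["action"] for a in actions])  # ['create_grc_ticket', 'notify_steward']

refreshed, log = remediate(
    entity,
    reload_feed=lambda entity_id: {"id": entity_id},
    score_fn=lambda e: 0.95,
)
print(refreshed["dq_score"], len(log))  # 0.95 3
```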

10 Benefits #

✅ Objective measurement of trust.
✅ Fewer false alarms in predictive governance.
✅ Faster root-cause analysis for data issues.
✅ Direct link between DQ and EA maturity metrics.

11 Common Pitfalls & Mitigations #

Issue	Effect	Solution
Over-strict rules reject too many feeds	Loss of data coverage	Tiered gates with grace periods
No DQ ownership	Issues linger	Assign feed stewards per domain
Late DQ reporting	Decisions on stale data	Nightly DQ jobs + alerts
Blind spot in manual uploads	Shadow data	Mandatory OneDrive drop folder governed by policy

12 Takeaway #

Data without quality is noise; architecture without trust is fiction.
DQ Metrics and Gates make EA 2.0 a truthful foundation where AI can reason confidently, executives can act decisively, and auditors can verify instantly.


Updated on November 9, 2025
