Data Quality Metrics & Gates

Completeness • Freshness • Validity


1 Purpose

The power of EA 2.0 depends on the reliability of its inputs.
Bad data breaks reasoning faster than bad algorithms.
Data Quality (DQ) Gates ensure that only trusted, current, and complete information enters the knowledge graph, which keeps predictions, audits, and automations credible.


2 Core DQ Principles

| Principle | Meaning |
|---|---|
| Evidence over assumption | Every node and relationship must trace back to a verifiable source. |
| Continuous validation | DQ rules run automatically on ingestion and nightly refresh. |
| Transparency by design | Each node carries DQ scores visible to users. |
| Governance integration | Violations raise GRC tasks automatically. |

3 DQ Dimensions Tracked

| Dimension | Definition | Example Metric | Target |
|---|---|---|---|
| Completeness | Percentage of mandatory fields populated. | filled_fields / total_mandatory * 100 | ≥ 95 % |
| Freshness | Time since last update, compared against the data's TTL threshold. | now - last_seen_at | ≤ 7 days |
| Validity | Conformance to pattern, type, or range. | email format, ISO date check | 100 % |
| Uniqueness | No duplicate IDs or names. | duplicate count per domain | 0 |
| Accuracy | Cross-checked against a trusted source. | MSI vs Cloud Inventory match rate | ≥ 90 % |
| Lineage Completeness | Linked nodes per ontology rule. | capability → app link % | ≥ 80 % |
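The example metrics in the table can be sketched in a few lines of Python. This is a minimal illustration, not the platform's implementation: the mandatory field list, the 7-day TTL constant, and the loose email pattern are assumptions chosen to match the targets above.

```python
import re
from datetime import date, datetime, timedelta

# Hypothetical mandatory fields and TTL; the source states only the targets.
MANDATORY_FIELDS = ("id", "name", "owner_email", "last_seen_at")
TTL = timedelta(days=7)
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # loose check, not RFC 5322


def completeness(record: dict) -> float:
    """Percentage of mandatory fields that are populated (not None or empty)."""
    filled = sum(1 for f in MANDATORY_FIELDS if record.get(f) not in (None, ""))
    return filled / len(MANDATORY_FIELDS) * 100


def is_fresh(record: dict, now: datetime) -> bool:
    """True when (now - last_seen_at) is within the TTL.

    Both datetimes must be naive or both timezone-aware.
    """
    return now - record["last_seen_at"] <= TTL


def is_valid_email(value: str) -> bool:
    """Validity by pattern: does the value look like an email address?"""
    return bool(EMAIL_RE.match(value))


def is_iso_date(value: str) -> bool:
    """Validity by type: does the value parse as an ISO 8601 calendar date?"""
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False


def duplicate_count(ids: list) -> int:
    """Number of IDs that repeat an earlier ID within one domain."""
    return len(ids) - len(set(ids))
```

A record with three of its four mandatory fields filled scores 75 % on completeness, which is below the ≥ 95 % target.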

4 DQ Scoring Model

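Each node receives a composite DQ Score (0–1).

DQ Score = 0.3 × Completeness + 0.2 × Freshness + 0.2 × Validity + 0.2 × Uniqueness + 0.1 × Accuracy

The weights sum to 1.0, so each dimension must be expressed on a 0–1 scale before weighting (for example, 95 % completeness becomes 0.95). Lineage Completeness is tracked as a dimension but does not appear in the composite.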

Displayed as:
🟢 ≥ 0.9 = Trusted 🟡 0.7 to < 0.9 = Review 🔴 < 0.7 = Critical
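The weighted formula and the three display bands can be sketched as follows. The weights come from the formula above. The function names, the lowercase status labels, and the assumption that each dimension arrives already scaled to 0–1 are illustrative.

```python
# Weights taken from the DQ Score formula; they sum to 1.0.
WEIGHTS = {
    "completeness": 0.3,
    "freshness": 0.2,
    "validity": 0.2,
    "uniqueness": 0.2,
    "accuracy": 0.1,
}


def dq_score(dimensions: dict) -> float:
    """Weighted sum of dimension scores, each assumed to be on a 0-1 scale."""
    return round(sum(w * dimensions[name] for name, w in WEIGHTS.items()), 4)


def dq_status(score: float) -> str:
    """Map a composite score to its display band."""
    if score >= 0.9:
        return "trusted"
    if score >= 0.7:
        return "review"
    return "critical"


example = {
    "completeness": 0.95,
    "freshness": 1.0,
    "validity": 1.0,
    "uniqueness": 1.0,
    "accuracy": 0.9,
}
score = dq_score(example)
status = dq_status(score)
```

Treating the Review band as "0.7 up to but not including 0.9" avoids leaving scores such as 0.895 unclassified.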

Scores propagate upward: an application inherits the mean DQ Score of its linked data entities and controls.
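As a minimal sketch of that roll-up, assuming the scores of the linked data entities and controls are already available as a flat list (the function name and the handling of an application with no links are assumptions, since the source does not say what such an application inherits):

```python
from statistics import mean
from typing import List, Optional


def application_dq_score(linked_scores: List[float]) -> Optional[float]:
    """Mean DQ Score of linked data entities and controls.

    Returns None when nothing is linked, so the caller must decide
    how to display an application with no lineage.
    """
    return mean(linked_scores) if linked_scores else None
```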


5 DQ Gates in Pipeline

Stage 1 – Extract:

  • Validate file headers, API response codes.
  • Reject feeds with missing IDs.
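The Stage 1 checks could look like the sketch below for a CSV feed. The expected header list, the function names, and the choice to reject the whole feed on the first failure are assumptions, not documented behavior.

```python
import csv
import io

EXPECTED_HEADERS = ["id", "name", "owner_email"]  # hypothetical feed layout


def check_status(status_code: int) -> None:
    """Reject an API feed whose HTTP response code is not 2xx."""
    if not 200 <= status_code < 300:
        raise ValueError(f"unexpected API response code: {status_code}")


def extract_gate(csv_text: str) -> list:
    """Return parsed rows, or raise ValueError if the feed fails the gate."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != EXPECTED_HEADERS:
        raise ValueError(f"unexpected headers: {reader.fieldnames}")
    rows = list(reader)
    # Data rows are numbered from 1, excluding the header row.
    missing = [n for n, row in enumerate(rows, start=1)
               if not (row["id"] or "").strip()]
    if missing:
        raise ValueError(f"missing IDs in data rows {missing}")
    return rows
```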

Stage 2 – Transform:

  • Apply schema validation (YAML contract).
  • Normalize taxonomy (spelling, case, codes).
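A rough sketch of the two Stage 2 steps follows. The contract is written as a Python dict so the example needs no YAML parser; in practice it would be loaded from the YAML contract file. The field names and the alias map are hypothetical.

```python
# Hypothetical contract: field name -> required Python type.
CONTRACT = {"id": str, "name": str, "criticality": int}

# Hypothetical alias map; a real one would come from the taxonomy owner.
ALIASES = {
    "hr": "Human Resources",
    "human resource": "Human Resources",
    "fin": "Finance",
}


def contract_violations(row: dict) -> list:
    """List every field that is missing or has the wrong type."""
    problems = []
    for field_name, expected_type in CONTRACT.items():
        if field_name not in row:
            problems.append(f"{field_name}: missing")
        elif not isinstance(row[field_name], expected_type):
            problems.append(f"{field_name}: expected {expected_type.__name__}")
    return problems


def normalize_domain(raw: str) -> str:
    """Collapse whitespace, fold case, then map known aliases to one spelling."""
    key = " ".join(raw.split()).lower()
    return ALIASES.get(key, key.title())
```

Returning a list of violations rather than raising on the first one lets the Load stage log all of them against the same row.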

Stage 3 – Load (Graph):

  • Compute DQ Scores.
  • Tag nodes with dq_status.
  • Log violations → DQ Incident Table.
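The Stage 3 tagging and logging steps might be sketched like this. The `Node` class and the in-memory list standing in for the DQ Incident Table are illustrative. Which statuses count as a violation is an assumption: here anything below Trusted is logged.

```python
from dataclasses import dataclass


@dataclass
class Node:
    node_id: str
    dq_score: float
    dq_status: str = ""


def status_for(score: float) -> str:
    """Same bands as the scoring model: trusted, review, critical."""
    if score >= 0.9:
        return "trusted"
    if score >= 0.7:
        return "review"
    return "critical"


def load_gate(nodes: list, incident_table: list) -> list:
    """Tag each node with dq_status and log non-trusted nodes as incidents."""
    for node in nodes:
        node.dq_status = status_for(node.dq_score)
        if node.dq_status != "trusted":  # assumption: review also counts
            incident_table.append({
                "node_id": node.node_id,
                "dq_status": node.dq_status,
                "dq_score": node.dq_score,
            })
    return nodes
```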

Stage 4 – Govern:

  • If dq_status = critical, a ServiceNow GRC ticket is created.

6 DQ Dashboard KPIs

| Metric | Description | Threshold |
|---|---|---|
| Overall DQ Score | Mean DQ Score across all active nodes | ≥ 0.9 |
| Nodes Failing DQ Gates | Share of active nodes with a DQ Score below 0.7 | ≤ 5 % |
| Average Data Age | Mean days since last_seen_at | ≤ 7 days |
| Duplicate Rate | % of duplicate IDs | ≤ 1 % |
| Policy-linked DQ Incidents | Open vs closed tickets | 95 % closure within 14 days |
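The first four KPIs can be derived from node records alone, as in this sketch. The dictionary keys (`id`, `dq_score`, `last_seen_at`) and the result names are assumptions. The incident-closure KPI is left out because it depends on ticket data the source does not describe.

```python
from datetime import datetime
from statistics import mean


def dashboard_kpis(nodes: list, now: datetime) -> dict:
    """Compute dashboard KPIs over a non-empty list of active nodes."""
    if not nodes:
        raise ValueError("no active nodes to summarize")
    scores = [n["dq_score"] for n in nodes]
    ids = [n["id"] for n in nodes]
    ages = [(now - n["last_seen_at"]).total_seconds() / 86400 for n in nodes]
    return {
        "overall_dq_score": mean(scores),                       # target >= 0.9
        "failing_pct": 100 * sum(s < 0.7 for s in scores) / len(scores),  # <= 5
        "avg_age_days": mean(ages),                             # target <= 7
        "duplicate_pct": 100 * (len(ids) - len(set(ids))) / len(ids),     # <= 1
    }
```

Expressing failing nodes as a percentage rather than a raw count keeps the KPI comparable with its ≤ 5 % threshold.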

7 Governance Rules

  1. Every feed owner has a DQ Steward.
  2. DQ violations are auto-notified in the Teams channel.
  3. Weekly DQ Stand-Up reviews top 10 critical issues.
  4. Quarterly Maturity Score update based on DQ improvement.

8 Visualization Views

  • DQ Radar Chart: visualizes six dimensions per domain.
  • Heatmap: color-codes low-score applications.
  • Trend Line: DQ Score progress over time.
  • DQ Incident Log: ServiceNow integration view.

Each view feeds Power BI and the NLQ interface for queries like:

“Show all applications with a DQ Score < 0.8 that were last seen more than 14 days ago.”
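How the NLQ interface translates such a question is not described here. The filter it would need to resolve to can be sketched as follows, assuming application records with hypothetical `name`, `dq_score`, and `last_seen_at` keys.

```python
from datetime import datetime, timedelta


def stale_low_quality_apps(apps: list, now: datetime,
                           max_score: float = 0.8,
                           min_age_days: int = 14) -> list:
    """Names of applications scoring below max_score and unseen for longer
    than min_age_days."""
    cutoff = now - timedelta(days=min_age_days)
    return [
        app["name"]
        for app in apps
        if app["dq_score"] < max_score and app["last_seen_at"] < cutoff
    ]
```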


9 Automation Example

Policy Trigger:

If dq_status = critical for any data entity linked to a risk, create a GRC ticket and notify the data steward.
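A minimal sketch of this trigger is shown below. The ticketing and notification calls are passed in as plain functions, so no ServiceNow or Teams API is assumed. The entity keys (`dq_status`, `linked_risks`, `steward`) are hypothetical.

```python
from typing import Callable, List


def apply_policy(entities: List[dict],
                 create_ticket: Callable[[dict], str],
                 notify_steward: Callable[[str, str], None]) -> List[str]:
    """Create a ticket and notify the steward for each critical entity
    that is linked to at least one risk. Returns the ticket IDs created."""
    ticket_ids = []
    for entity in entities:
        if entity["dq_status"] == "critical" and entity["linked_risks"]:
            ticket_id = create_ticket(entity)
            notify_steward(entity["steward"], ticket_id)
            ticket_ids.append(ticket_id)
    return ticket_ids
```

Injecting the two callbacks keeps the rule itself testable without a live GRC system.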

Remediation Logic App:

  • Assign to data owner.
  • Request corrected file via secure form.
  • Reload feed and recompute DQ Score.

10 Benefits

✅ Objective measurement of trust.
✅ Fewer false alarms in predictive governance.
✅ Faster root-cause analysis for data issues.
✅ Direct link between DQ and EA maturity metrics.


11 Common Pitfalls & Mitigations

| Issue | Effect | Solution |
|---|---|---|
| Over-strict rules reject too many feeds | Loss of data coverage | Tiered gates with grace periods |
| No DQ ownership | Issues linger | Assign feed stewards per domain |
| Late DQ reporting | Decisions on stale data | Nightly DQ jobs + alerts |
| Blind spot in manual uploads | Shadow data | Mandatory OneDrive drop folder governed by policy |

12 Takeaway

Data without quality is noise; architecture without trust is fiction.
DQ Metrics and Gates make EA 2.0 a truthful foundation where AI can reason confidently, executives can act decisively, and auditors can verify instantly.
