- Cosmos DB Gremlin / Neo4j Aura Gov
- 1 Purpose
- 2 Platform Options
- 3 Schema Philosophy
- 4 Core Label and Index Setup (Neo4j Example)
- 5 Graph Loading Pattern
- 6 Performance Tuning Guidelines
- 7 Governance and Security
- 8 Graph Algorithms Enabled
- 9 NLQ and Reasoning Integration
- 10 Backups & Disaster Recovery
- 11 Monitoring Metrics
- 12 Multi-Environment Strategy
- 13 Advantages of Graph Architecture
- 14 Sample Governance Policy in Graph
- 15 Takeaway
Cosmos DB Gremlin / Neo4j Aura Gov
1 Purpose
The EA 2.0 graph database is not just a data store — it’s the semantic core of the enterprise.
It stores every node, relationship, event, and confidence score so the reasoning engine can understand how the enterprise operates, why outcomes occur, and where to intervene.
2 Platform Options
| Option | Type | Ideal For | Notes / Integration |
|---|---|---|---|
| Azure Cosmos DB (Gremlin API) | Managed PaaS | Government / Sovereign deployments | Fully integrated with Azure RBAC + VNet; schema-optional JSON. |
| Neo4j Aura Gov / Enterprise | SaaS or Self-Managed | Private sector or federated tenants | Advanced Cypher support, graph algorithms, Bloom visualization. |
Both are interchangeable under EA 2.0’s data-access layer; the model is portable.
3 Schema Philosophy
EA 2.0 uses a “semistructured ontology”:
- Fixed core labels (Capability, Application, Data, Risk, Control, Outcome).
- Flexible extensions via metadata properties.
- Relationship types carry direction, weight, and evidence.
This design allows schema-on-read flexibility while keeping query performance predictable.
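The split between fixed labels and flexible metadata can be sketched in Python as a small validation layer. This is an illustration only: the class names, the `metadata` and `evidence` field shapes, and the assumption that weights fall in the range 0 to 1 are hypothetical and not part of the EA 2.0 specification.

```python
from dataclasses import dataclass, field

# Fixed core labels listed in this section.
CORE_LABELS = {"Capability", "Application", "Data", "Risk", "Control", "Outcome"}


@dataclass
class Node:
    label: str                                    # must be one of CORE_LABELS
    id: str
    metadata: dict = field(default_factory=dict)  # flexible extensions, schema-on-read

    def __post_init__(self):
        if self.label not in CORE_LABELS:
            raise ValueError(f"unknown core label: {self.label}")


@dataclass
class Relationship:
    type: str                                     # e.g. "SUPPORTED_BY"
    source: str                                   # direction runs source -> target
    target: str
    weight: float = 1.0                           # assumed to lie in [0, 1]
    evidence: list = field(default_factory=list)  # hypothetical: references backing the link

    def __post_init__(self):
        if not 0.0 <= self.weight <= 1.0:
            raise ValueError("weight must be in [0, 1]")


app = Node("Application", "APP-1", {"owner": "finance-it"})
link = Relationship("SUPPORTED_BY", "APP-1", "CAP-7", weight=0.8, evidence=["cmdb-export"])
```

Rejecting unknown labels at the edge keeps the indexed label set small, while anything placed in `metadata` can vary without a schema change.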
4 Core Label and Index Setup (Neo4j Example)
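```cypher
// Uniqueness constraints on the core identifiers (each also creates a backing index)
CREATE CONSTRAINT capability_id IF NOT EXISTS
FOR (c:Capability) REQUIRE c.id IS UNIQUE;
CREATE CONSTRAINT app_id IF NOT EXISTS
FOR (a:Application) REQUIRE a.id IS UNIQUE;
// Named, idempotent indexes for common filters
CREATE INDEX risk_severity IF NOT EXISTS FOR (r:Risk) ON (r.severity);
CREATE INDEX supported_by_confidence IF NOT EXISTS FOR ()-[rel:SUPPORTED_BY]-() ON (rel.confidence);
```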
Tip: Use compact, sortable identifiers (numeric IDs or ULIDs) so that index lookups during MERGE stay efficient.
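For readers unfamiliar with ULIDs, the following is a minimal Python sketch of how one is built: a 48-bit millisecond timestamp followed by 80 random bits, encoded as 26 Crockford base32 characters. The function name `new_ulid` and its optional parameters are assumptions made for illustration; a production system would more likely use an existing ULID library.

```python
import os
import time

# Crockford base32 alphabet (omits I, L, O and U).
CROCKFORD32 = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def new_ulid(timestamp_ms=None, randomness=None) -> str:
    """Return a 26-character ULID; the arguments exist only to make the sketch testable."""
    ts = int(time.time() * 1000) if timestamp_ms is None else timestamp_ms
    rnd = os.urandom(10) if randomness is None else randomness
    if not 0 <= ts < 2 ** 48 or len(rnd) != 10:
        raise ValueError("timestamp must fit in 48 bits and randomness must be 10 bytes")
    value = (ts << 80) | int.from_bytes(rnd, "big")
    chars = []
    for _ in range(26):                  # 26 characters x 5 bits cover the 128-bit value
        chars.append(CROCKFORD32[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


node_id = new_ulid()
```

Because the timestamp occupies the most significant bits, identifiers created later sort after earlier ones as plain strings, which keeps index inserts roughly append-only.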
5 Graph Loading Pattern
- Stage incoming JSON to Blob Storage.
- Transform with an Azure Function app into a normalized YAML/JSON graph payload.
- Load via Cypher / Gremlin bulk upsert.
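Cypher example:
```cypher
// $records: list of maps with app_id, app_name, cap_id, conf
UNWIND $records AS r
MERGE (a:Application {id: r.app_id})
  ON CREATE SET a.name = r.app_name, a.created_at = timestamp()
MERGE (c:Capability {id: r.cap_id})
// Merge on the relationship type only, then set confidence, so a changed
// score updates the existing edge instead of creating a duplicate
MERGE (a)-[s:SUPPORTED_BY]->(c)
SET s.confidence = r.conf;
```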
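The transform step that feeds `UNWIND $records` might look like the Python sketch below. The staged document shape (a list of applications, each with a nested `capabilities` array) and the default confidence of 0.5 are assumptions for illustration; only the output keys `app_id`, `app_name`, `cap_id` and `conf` come from the Cypher example.

```python
import json


def to_records(staged_json: str) -> list:
    """Flatten staged application documents into one row per application-capability pair."""
    records = []
    for app in json.loads(staged_json):
        for cap in app.get("capabilities", []):
            records.append({
                "app_id": app["id"],
                "app_name": app["name"],
                "cap_id": cap["id"],
                # Assumed default when the source system supplies no score.
                "conf": float(cap.get("confidence", 0.5)),
            })
    return records


staged = json.dumps([
    {"id": "APP-1", "name": "Billing",
     "capabilities": [{"id": "CAP-7", "confidence": 0.9}, {"id": "CAP-8"}]},
    {"id": "APP-2", "name": "Archive", "capabilities": []},
])
records = to_records(staged)
```

The resulting list can be passed as the `records` query parameter by whichever driver the loader uses.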
6 Performance Tuning Guidelines
| Category | Practice | Impact |
|---|---|---|
| Partitioning | Split by domain or region | Parallel queries & smaller index footprint |
| Caching | Use server-side Gremlin or Neo4j page cache | 2–5× faster traversals |
| Query Hints | Add LIMIT clauses to queries that serve UI requests | Prevents large payloads |
| Batch Ingest | ≤ 500 nodes / transaction | Stable RUs in Cosmos |
| Compression | GZIP payloads from Functions | Lower network latency |
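The batch-ingest and compression rows can be combined in a few lines of Python. This is a sketch under assumptions: the helper names are hypothetical, and the batch size counts records rather than nodes, so a loader whose records each touch two nodes (as in the loading example) may want a smaller size.

```python
import gzip
import json
from typing import Iterator

MAX_BATCH = 500  # batch-ingest ceiling from the table above


def batches(records: list, size: int = MAX_BATCH) -> Iterator[list]:
    """Yield consecutive slices of at most `size` records, one per transaction."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def compress_payload(batch: list) -> bytes:
    """Serialize a batch to compact JSON and GZIP it before it leaves the Function."""
    return gzip.compress(json.dumps(batch, separators=(",", ":")).encode("utf-8"))


def decompress_payload(blob: bytes) -> list:
    """Inverse of compress_payload, used on the receiving side."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


rows = [{"app_id": f"APP-{i}", "cap_id": "CAP-1", "conf": 0.5} for i in range(1201)]
batch_sizes = [len(b) for b in batches(rows)]
first_blob = compress_payload(next(batches(rows)))
```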
7 Governance and Security
| Area | Control | Implementation |
|---|---|---|
| Access | Microsoft Entra ID (formerly Azure AD) RBAC | Role → Graph scope mapping |
| Encryption | In-transit TLS 1.2+, at-rest AES-256 | Default in Cosmos / Aura |
| Audit Logs | Query logs + change capture to Blob | Immutable history |
| Least Privilege | Separate read/write keys per service | Rotation via Key Vault |
| Data Residency | Region-locked instances | Supports GDPR and UAE government data-residency requirements |
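The "Role → Graph scope mapping" in the Access row could be expressed as a lookup like the one below. The role names and their scopes are invented for illustration; in practice the roles would come from the identity provider's token claims and the check would sit in the data-access layer.

```python
# Hypothetical roles mapped to the labels they may touch and whether they may write.
ROLE_SCOPES = {
    "ea-reader": {"labels": {"Capability", "Application", "Outcome"}, "write": False},
    "risk-analyst": {"labels": {"Risk", "Control", "Capability"}, "write": False},
    "graph-loader": {
        "labels": {"Capability", "Application", "Data", "Risk", "Control", "Outcome"},
        "write": True,
    },
}


def is_allowed(roles, label: str, write: bool = False) -> bool:
    """Return True if any of the caller's roles grants the requested access to the label."""
    for role in roles:
        scope = ROLE_SCOPES.get(role)
        if scope is None:
            continue  # unknown roles grant nothing
        if label in scope["labels"] and (scope["write"] or not write):
            return True
    return False
```

Denying by default means a newly added label stays invisible until a role is explicitly granted access to it.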
8 Graph Algorithms Enabled
EA 2.0 activates built-in graph analytics to power insight queries:
| Algorithm | Purpose | Example Use |
|---|---|---|
| Centrality | Find critical applications | “Which apps support most capabilities?” |
| Community Detection | Identify related functions / teams | Cluster analysis of domains |
| Shortest Path | Trace risk impact chains | “How does Risk-R17 reach Finance?” |
| Similarity | Recommend reusable controls | Compliance reuse patterns |
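To make two of these algorithms concrete, here is a plain-Python sketch over small edge lists. It is not how either platform implements them: out-degree stands in for the simplest form of centrality, breadth-first search stands in for unweighted shortest path, and the sample node IDs are invented.

```python
from collections import Counter, deque


def out_degree(edges) -> Counter:
    """Count distinct targets per source, e.g. capabilities supported per application."""
    targets = {}
    for source, target in edges:
        targets.setdefault(source, set()).add(target)
    return Counter({source: len(found) for source, found in targets.items()})


def shortest_path(edges, start, goal):
    """Breadth-first search along directed edges; returns the node list or None."""
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, []).append(target)
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:   # walk back through the predecessors
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in adjacency.get(node, []):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


supports = [("APP-1", "CAP-1"), ("APP-1", "CAP-2"), ("APP-2", "CAP-1"), ("APP-1", "CAP-2")]
impact = [("RISK-R17", "APP-3"), ("APP-3", "CAP-9"), ("CAP-9", "Finance"), ("RISK-R17", "APP-4")]

most_critical = out_degree(supports).most_common(1)
risk_chain = shortest_path(impact, "RISK-R17", "Finance")
```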
9 NLQ and Reasoning Integration
- The Reasoning API translates natural-language questions into Cypher/Gremlin.
- Safe query templates are stored in Prompt Library.
- Results return as JSON for Power BI or React UI rendering.
Example Query:
“Show all capabilities with AI-related risks and their mitigations.”
→ This translates into a multi-hop graph query (2 ms average response time on a graph of 100k nodes).
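One way to keep generated queries safe is to let the language model choose a template and supply parameters, never raw query text. The Python sketch below illustrates that idea only; the template ID, the `THREATENS` relationship, the `category` property and the `Control`-to-`Risk` link are hypothetical and do not describe the actual Prompt Library.

```python
# Hypothetical template registry: query text is fixed, only parameters vary.
TEMPLATES = {
    "capabilities_with_risk_category": {
        "cypher": (
            "MATCH (c:Capability)<-[:THREATENS]-(r:Risk {category: $category}) "
            "OPTIONAL MATCH (ctl:Control)-[:MITIGATES]->(r) "
            "RETURN c.id AS capability, r.id AS risk, collect(ctl.id) AS mitigations "
            "LIMIT $limit"
        ),
        "params": {"category": str, "limit": int},
    },
}


def build_query(template_id: str, params: dict):
    """Return (query text, parameters) for an allowlisted template, or raise."""
    template = TEMPLATES.get(template_id)
    if template is None:
        raise KeyError(f"unknown template: {template_id}")
    expected = template["params"]
    if set(params) != set(expected):
        raise ValueError(f"expected parameters {sorted(expected)}")
    for name, expected_type in expected.items():
        if not isinstance(params[name], expected_type):
            raise TypeError(f"{name} must be {expected_type.__name__}")
    # User-supplied values travel only as bound parameters, never inside the query text.
    return template["cypher"], dict(params)


query, bound = build_query("capabilities_with_risk_category", {"category": "AI", "limit": 50})
```

Because the values are bound rather than concatenated, a question containing quote characters or Cypher keywords cannot alter the query structure.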
10 Backups & Disaster Recovery
| DB Type | Native Option | RPO | Retention |
|---|---|---|---|
| Cosmos DB | Continuous Backup + Point-in-Time Restore | 4 hours | 30 days |
| Neo4j Aura | Daily auto-snapshot | 24 hours | 7–30 days |
| Self-Managed Neo4j | Cron export via neo4j-admin database dump (neo4j-admin dump on Neo4j 4.x) | 6 hours | Configurable |
11 Monitoring Metrics
Track these KPIs in Azure Monitor (Cosmos DB) or the Neo4j metrics tooling:
| Metric | Target | Interpretation |
|---|---|---|
| Query Latency (ms) | < 200 avg | Healthy response |
| CPU / RU Utilization | < 70 % | Stable load |
| Failed Writes | 0 | Data integrity |
| Graph Size | ≤ 10 GB per partition | Manageable index |
| Node Growth Rate | < 5 %/day | Predictable scaling |
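The targets in the table can be turned into an automated check. In this Python sketch the metric key names are assumptions; the thresholds are taken directly from the table, and how the sample values are collected is left to the monitoring platform.

```python
# Thresholds from the table above; the key names are illustrative.
TARGETS = {
    "query_latency_ms": lambda v: v < 200,
    "utilization_pct": lambda v: v < 70,
    "failed_writes": lambda v: v == 0,
    "partition_size_gb": lambda v: v <= 10,
    "node_growth_pct_per_day": lambda v: v < 5,
}


def breaches(sample: dict) -> list:
    """Return the sorted names of reported metrics that miss their target."""
    return sorted(
        name for name, within_target in TARGETS.items()
        if name in sample and not within_target(sample[name])
    )


alerts = breaches({"query_latency_ms": 250, "utilization_pct": 55, "failed_writes": 0})
```

Metrics missing from a sample are skipped rather than flagged, so a partial scrape does not raise false alerts.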
12 Multi-Environment Strategy
- Dev/Test/Prod graphs with promotion pipeline (Azure DevOps).
- Snapshot export/import for data migration.
- Feature-flag schema changes via metadata version.
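Feature-flagging schema changes by metadata version could work roughly as follows: each environment's graph records a schema version, and code paths that rely on newer structures check it before running. The feature names, version numbers and dotted version format in this Python sketch are assumptions for illustration.

```python
# Hypothetical features and the minimum schema version each one needs.
FEATURES = {
    "relationship_evidence": (1, 2, 0),
    "outcome_scoring": (2, 0, 0),
}


def parse_version(text: str) -> tuple:
    """Turn a dotted version string such as '1.2.0' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def enabled_features(graph_schema_version: str) -> set:
    """Return the features that the given environment's schema version supports."""
    current = parse_version(graph_schema_version)
    return {name for name, minimum in FEATURES.items() if current >= minimum}


dev_features = enabled_features("2.1.0")   # e.g. Dev already migrated
prod_features = enabled_features("1.2.0")  # e.g. Prod one release behind
```

With this arrangement the same build can be promoted through Dev, Test and Prod while each graph is migrated on its own schedule.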
13 Advantages of Graph Architecture
✅ Native model of enterprise interdependencies.
✅ Real-time impact analysis (queries in ms).
✅ Foundation for RAG and predictive governance.
✅ Schema-light yet semantically rich.
✅ Easily visualized and narrative-ready.
14 Sample Governance Policy in Graph
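```cypher
// Match the policy on its id only, so renaming it does not create a duplicate node
MERGE (p:Policy {id:'POL-123'})
  ON CREATE SET p.name = 'Tagging Enforcement'
MERGE (r:Risk {id:'RISK-22'})
// Merge the event on its own first, so reruns reuse one Event node per type
MERGE (e:Event {type:'NonCompliantResource'})
MERGE (p)-[:MITIGATES]->(r)
MERGE (p)-[:TRIGGERED_BY]->(e);
```
This pattern gives each policy a traceable trigger and a target risk.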
15 Takeaway
The graph is the brainstem of EA 2.0 — where insight becomes structure.
When deployed on Cosmos DB or Neo4j Aura Gov, it delivers speed, security, and semantic clarity at enterprise scale.