
Integration Patterns (Cloud, COTS, On-Prem)


How to Connect These Sources Safely and Scalably #


1. Purpose #

Connecting data to the EA 2.0 graph is not a technical chore; it is the moment the architecture comes alive.
But integrations differ: some are API-rich, others locked in legacy COTS (Commercial-Off-The-Shelf) systems, and some still live in on-prem databases.
This guide defines safe, repeatable integration patterns that bring all these worlds into a single reasoning fabric without breaking security, sovereignty, or performance.


2. Integration Philosophy #

EA 2.0 follows five integration commandments:

  1. No direct coupling: All systems connect through a mediation layer (Function, API Gateway, or Event Bus).
  2. Incremental ingestion: Pull only deltas or metadata, never entire tables.
  3. Stateless connectors: Functions execute, commit, and exit — no long-running sync jobs.
  4. Source authority preserved: Never transform data in the source system; only enrich or map it downstream.
  5. Lineage everywhere: Every node carries its source, timestamp, and load event ID.
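
To make the fifth commandment concrete, here is a minimal sketch of a lineage stamp. The field names (`source_system`, `loaded_at`, `load_event_id`) and the helper names are assumptions chosen for illustration, not a defined EA 2.0 schema.

```python
import uuid
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Lineage:
    """Provenance carried by every node (hypothetical field names)."""
    source_system: str   # e.g. "servicenow"
    loaded_at: str       # ISO-8601 timestamp in UTC
    load_event_id: str   # one ID shared by all nodes of a loader run


def new_lineage(source_system: str) -> Lineage:
    """Create one lineage stamp per loader run."""
    return Lineage(
        source_system=source_system,
        loaded_at=datetime.now(timezone.utc).isoformat(),
        load_event_id=str(uuid.uuid4()),
    )


def stamp(node: dict, lineage: Lineage) -> dict:
    """Return a copy of the node with lineage attached; the input is not mutated."""
    return {**node, "lineage": asdict(lineage)}


run_lineage = new_lineage("servicenow")
stamped = stamp({"name": "billing-app"}, run_lineage)
```

Creating the stamp once per run and reusing it means every node from the same load shares one `load_event_id`, which is what makes a load traceable and replayable as a unit.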

3. Three Integration Archetypes #

| Archetype | Typical Sources | Connector Pattern | Example Stack |
| --- | --- | --- | --- |
| Cloud-Native | Azure, AWS, SaaS APIs | Serverless API Poller or Event Subscription | Azure Function → Graph API → Neo4j REST |
| COTS / Enterprise Apps | ServiceNow, SAP, Oracle, Salesforce | API Adapter or Scheduled Extract → Blob → Function | ServiceNow REST → Blob Storage → Loader |
| On-Prem / Legacy | SQL, CSV, shared drives | Gateway Sync or Agent Push | Self-hosted agent → HTTPS webhook → Function |

Each pattern implements the same contract:
Extract → Normalize → Map → Upsert → Log.
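
One way to read this contract is as a single function that every connector fills in with its own steps. The sketch below is illustrative only: `run_connector` and its parameter names are assumptions, and the lambdas stand in for a real source and graph endpoint.

```python
from typing import Callable, Iterable


def run_connector(
    extract: Callable[[], Iterable[dict]],
    normalize: Callable[[dict], dict],
    map_to_ontology: Callable[[dict], dict],
    upsert: Callable[[dict], None],
    log: Callable[[dict], None],
) -> int:
    """Run Extract -> Normalize -> Map -> Upsert -> Log once, then exit."""
    processed = 0
    for raw in extract():
        node = map_to_ontology(normalize(raw))
        upsert(node)
        processed += 1
    log({"records_processed": processed})
    return processed


# Usage with in-memory stand-ins for a real source and graph endpoint.
graph: dict = {}
events: list = []

count = run_connector(
    extract=lambda: [{" App_Name ": "Billing", "ID": "APP-1"}],
    normalize=lambda r: {k.strip().lower(): v for k, v in r.items()},
    map_to_ontology=lambda r: {"natural_key": r["id"], "name": r["app_name"]},
    upsert=lambda n: graph.__setitem__(n["natural_key"], n),
    log=events.append,
)
```

Because the function holds no state between calls, it also fits the stateless-connector rule: each invocation executes, commits, and exits.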


4. Cloud-Native Integration #

🔹 Example: Azure Resource Graph → EA 2.0 #

  1. Trigger: Timer (daily) or EventGrid (resource change).
  2. Extract: Azure Function queries Resource Graph API for new/changed resources.
  3. Transform: Normalize key metadata: name, type, region, tags, owner.
  4. Upsert: POST to EA 2.0 Graph Loader endpoint.
  5. Log: Write metrics to Application Insights (rows processed, latency).

With the EventGrid trigger this yields near-real-time visibility of infrastructure; with the daily timer, visibility is at most a day old. Neither option touches production workloads.
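
The Transform step above can be sketched as a pure function over one result row. The input keys (`name`, `type`, `location`, `tags`) follow the usual shape of Azure resource records; deriving `owner` from an `owner` tag is an assumption, since the source system for ownership is not specified here.

```python
def normalize_resource(row: dict) -> dict:
    """Reduce one resource row to the key metadata loaded into the graph."""
    tags = row.get("tags") or {}  # untagged resources may carry None
    # Tag keys vary in case between teams, so match "owner" case-insensitively.
    owner = next((v for k, v in tags.items() if k.lower() == "owner"), None)
    return {
        "name": row["name"],
        "type": row["type"].lower(),
        "region": row.get("location"),
        "tags": tags,
        "owner": owner,
    }


node = normalize_resource({
    "name": "vm-01",
    "type": "Microsoft.Compute/virtualMachines",
    "location": "westeurope",
    "tags": {"Owner": "team-a", "env": "prod"},
})
```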

🔹 Example: AWS Config → EA 2.0 #

  • Lambda function subscribed to Config SNS topic.
  • Parses JSON payload → maps to “Infrastructure Node.”
  • Upserts via secure REST to EA Graph endpoint.

Cloud events feed the architecture continuously — no scheduled exports needed.
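
A minimal sketch of such a Lambda handler follows. It assumes the SNS message body is a Config change notification carrying a `configurationItem` object; the node field names and the `InfrastructureNode` label are hypothetical, and the actual upsert call is left as a comment because the EA Graph endpoint is not specified here.

```python
import json


def to_infrastructure_node(item: dict) -> dict:
    """Map one AWS Config configuration item to a graph node (assumed shape)."""
    return {
        "label": "InfrastructureNode",
        "natural_key": item["resourceId"],
        "type": item.get("resourceType"),
        "region": item.get("awsRegion"),
        "tags": item.get("tags") or {},
        "source_system": "aws_config",
        "observed_at": item.get("configurationItemCaptureTime"),
    }


def handler(event, context):
    """Lambda entry point for SNS-delivered Config notifications."""
    nodes = []
    for record in event.get("Records", []):
        message = json.loads(record["Sns"]["Message"])
        item = message.get("configurationItem")
        if item is None:
            continue  # not a configuration-item change; nothing to map
        nodes.append(to_infrastructure_node(item))
    # A real connector would upsert each node to the EA Graph endpoint here.
    return {"upserted": len(nodes), "nodes": nodes}
```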


5. COTS / SaaS Integration #

🔹 Example: ServiceNow CMDB #

Pattern: Incremental REST Extract → Blob → Function Loader

  1. CMDB tables queried via REST API with sys_updated_on > last_sync.
  2. Results written to a secure Blob container (isolated per tenant).
  3. Loader Function maps columns to canonical ontology fields.
  4. Each record stamped with source_system='servicenow'.

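Staging extracts in Blob between ServiceNow and the graph provides traceability (the raw extract is retained), replayability (a load can be rerun without calling ServiceNow again), and throttling safety (API calls are decoupled from load speed).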
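
The incremental filter in step 1 can be expressed as a Table API request URL. This is a sketch: the instance hostname is a placeholder, the helper name is hypothetical, and the exact timestamp format the instance expects should be checked against its configuration.

```python
from urllib.parse import urlencode


def build_delta_url(instance: str, table: str, last_sync: str,
                    limit: int = 1000, offset: int = 0) -> str:
    """Build a paginated Table API URL for records changed since last_sync."""
    # Order by sys_updated_on so pages stay stable while paging through a delta.
    query = f"sys_updated_on>{last_sync}^ORDERBYsys_updated_on"
    params = {
        "sysparm_query": query,
        "sysparm_limit": limit,
        "sysparm_offset": offset,
    }
    return f"https://{instance}/api/now/table/{table}?{urlencode(params)}"


url = build_delta_url(
    "example.service-now.com",   # placeholder instance
    "cmdb_ci_appl",
    "2024-05-01 00:00:00",       # last successful sync, in UTC
)
```

Advancing `offset` by `limit` until a page comes back short walks the whole delta without pulling the full table.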

🔹 Example: SAP / Oracle / Salesforce #

Use middleware iPaaS (Azure Data Factory, Mulesoft, Boomi, etc.) for heavy COTS APIs.
Avoid direct JDBC pulls; use certified connectors for:

  • Rate-limit handling
  • Retry + DLQ (dead-letter queue)
  • Secure token management

All extracts end in a landing zone (Blob/S3) before ingestion to EA 2.0.


6. On-Prem Integration #

Legacy sources can still join the EA 2.0 ecosystem via hybrid bridges:

  • Self-hosted Gateway Agent: Runs in DMZ, connects outward to secure HTTPS Function endpoint.
  • SFTP/CSV Drop Pattern: Systems export CSVs to a watched drop location (an SFTP directory or a OneDrive/SharePoint folder).
  • Database Proxy Pattern: Read-only SQL user queries a view and pushes data via REST.

Every on-prem push is outbound-initiated, so no inbound firewall rules are needed and the zero-trust posture is maintained.
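
The Database Proxy Pattern can be sketched as follows, with an in-memory SQLite database standing in for the legacy SQL source. The view name `v_ea_apps`, the `updated_at` column, and the payload shape are assumptions; the outbound HTTPS POST itself is omitted.

```python
import json
import sqlite3


def read_view_delta(conn: sqlite3.Connection, view: str, last_sync: str) -> list:
    """Read rows changed since last_sync from a view exposed to a read-only user."""
    # The view name comes from connector config, never from user input;
    # the timestamp is passed as a bound parameter.
    cur = conn.execute(f"SELECT * FROM {view} WHERE updated_at > ?", (last_sync,))
    columns = [col[0] for col in cur.description]
    return [dict(zip(columns, row)) for row in cur]


def build_push_payload(rows: list, source_system: str) -> str:
    """Serialize the delta as the JSON body of the outbound REST push."""
    return json.dumps({"source_system": source_system, "records": rows})


# Stand-in for the on-prem database; the view hides the internal_note column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE apps (app_id TEXT, app_name TEXT, updated_at TEXT, internal_note TEXT);
    INSERT INTO apps VALUES ('APP-1', 'Billing', '2024-05-01T08:00:00Z', 'private');
    INSERT INTO apps VALUES ('APP-2', 'Payroll', '2024-05-03T09:30:00Z', 'private');
    CREATE VIEW v_ea_apps AS SELECT app_id, app_name, updated_at FROM apps;
""")

rows = read_view_delta(conn, "v_ea_apps", "2024-05-02T00:00:00Z")
payload = build_push_payload(rows, "legacy_sql")
```

Querying a view rather than the base table keeps the exposed columns under the source owner's control, which matches the rule that source authority is preserved.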


7. Transformation & Normalization #

The Normalization Layer harmonizes data before graph load:

| Step | Action | Example |
| --- | --- | --- |
| Map Fields | Align to canonical schema | app_name → name, owner → person_ref |
| Classify Sensitivity | Apply label based on field | “PII” → Confidential |
| De-duplicate | Merge by natural key | Same App ID from two systems |
| Score Confidence | Assign trust weight per source | CMDB = 1.0, Spreadsheet = 0.6 |
| Emit Events | Publish record.updated message | Feeds Predictive Layer |

Transformation is policy-driven, not hard-coded — new sources can join by adding mapping config, not new code.
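
A minimal sketch of config-driven mapping, confidence scoring, and de-duplication follows. The structure of `MAPPING` and the spreadsheet column names are assumptions made for illustration; only the example mappings and trust weights come from the table above.

```python
# Per-source mapping config: adding a source means adding an entry, not code.
MAPPING = {
    "cmdb": {
        "fields": {"app_id": "natural_key", "app_name": "name", "owner": "person_ref"},
        "confidence": 1.0,
    },
    "spreadsheet": {
        "fields": {"ID": "natural_key", "Application": "name", "Cost Center": "cost_center"},
        "confidence": 0.6,
    },
}


def normalize(record: dict, source: str) -> dict:
    """Rename source fields to canonical names and attach the trust weight."""
    cfg = MAPPING[source]
    node = {dst: record[src] for src, dst in cfg["fields"].items() if src in record}
    node["source_system"] = source
    node["confidence"] = cfg["confidence"]
    return node


def deduplicate(nodes: list) -> list:
    """Merge nodes sharing a natural key; higher-confidence values win conflicts."""
    merged: dict = {}
    # Apply low-confidence nodes first so more trusted sources overwrite them.
    for node in sorted(nodes, key=lambda n: n["confidence"]):
        merged.setdefault(node["natural_key"], {}).update(node)
    return list(merged.values())


nodes = [
    normalize({"app_id": "APP-1", "app_name": "Billing", "owner": "jdoe"}, "cmdb"),
    normalize({"ID": "APP-1", "Application": "Billing (old)", "Cost Center": "CC-9"}, "spreadsheet"),
]
result = deduplicate(nodes)
```

In this sketch the merged node keeps the CMDB name and owner while retaining the cost center that only the spreadsheet supplied.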


8. Performance-Safe Patterns #

  • Use pagination + delta windows (updated_since) to limit pull size.
  • Implement back-off & retry (HTTP 429/503) logic for APIs.
  • Cache source metadata locally (e.g., schema hash) to skip unchanged fields.
  • Split heavy syncs into parallel Function executions by domain.
  • Batch telemetry events before sending (for example, 1,000 events into one payload).

These patterns let EA 2.0 ingest thousands of records daily without noticeable impact on the source systems.
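
The back-off and retry bullet can be sketched as a small wrapper with exponential delays and jitter. The function name, the defaults, and the convention that `request` returns a `(status, body)` pair are assumptions for illustration.

```python
import random
import time

RETRYABLE = {429, 503}  # throttled or temporarily unavailable


def call_with_backoff(request, max_attempts=5, base_delay=1.0,
                      max_delay=60.0, sleep=time.sleep):
    """Call request() until it returns a non-retryable status or attempts run out.

    request is any zero-argument callable returning (status_code, body).
    """
    status = None
    for attempt in range(max_attempts):
        status, body = request()
        if status not in RETRYABLE:
            return status, body
        if attempt == max_attempts - 1:
            break  # no point sleeping after the final attempt
        delay = min(max_delay, base_delay * 2 ** attempt)
        # Up to 10% jitter keeps parallel Functions from retrying in lockstep.
        sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError(f"gave up after {max_attempts} attempts (last status {status})")
```

Passing `sleep` as a parameter keeps the wrapper testable without real waiting. If the API returns a `Retry-After` header, honoring it in place of the computed delay is usually the safer choice.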


9. Security & Compliance Controls #

| Control | Mechanism |
| --- | --- |
| Authentication | OAuth 2.0 client credentials via Entra ID / IAM roles |
| Encryption | TLS 1.2+ in transit; blob storage encrypted with tenant key |
| Data Residency | Connectors restricted to sovereign region endpoints |
| Secrets Management | Keys stored in Azure Key Vault / AWS Secrets Manager |
| Audit Trail | Every extraction logged with timestamp + checksum |
| Error Isolation | Failed loads quarantined in a dead_letter container |

Integration pipelines are treated as first-class governed assets — visible in dashboards with freshness SLAs.
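
As an illustration of the Audit Trail control, the sketch below builds one log entry per extraction with a UTC timestamp and a SHA-256 checksum of the raw payload. The entry's field names are assumptions; the choice of SHA-256 is also an assumption, since the checksum algorithm is not stated.

```python
import hashlib
from datetime import datetime, timezone


def audit_entry(source_system: str, payload: bytes) -> dict:
    """Describe one extraction: what was pulled, when, and a content checksum."""
    return {
        "source_system": source_system,
        "extracted_at": datetime.now(timezone.utc).isoformat(),
        "sha256": hashlib.sha256(payload).hexdigest(),
        "bytes": len(payload),
    }


raw_extract = b'[{"sys_id": "abc123", "name": "Billing"}]'
entry = audit_entry("servicenow", raw_extract)
```

Storing the checksum next to the staged file lets a later replay confirm that the bytes being reloaded are the bytes that were originally extracted.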


10. Monitoring and Alerting #

  • Ingestion Health Dashboard: shows record counts, latency, and failure rate per source.
  • Freshness Gauge: color-coded indicator of last successful sync.
  • Incident Hooks: failed Function invocations auto-create tickets in ServiceNow.
  • Predictive Trend: ML flags degradation early; for example, a falling record count may indicate API drift.

Integration becomes observable infrastructure, not a hidden script.
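
The Freshness Gauge could be computed along these lines. The thresholds are assumptions: green within half of the freshness SLA, amber up to the SLA, red beyond it, with a 24-hour default SLA.

```python
from datetime import datetime, timedelta, timezone


def freshness_color(last_success: datetime, now: datetime = None,
                    sla: timedelta = timedelta(hours=24)) -> str:
    """Map the age of the last successful sync to a gauge color."""
    now = now or datetime.now(timezone.utc)
    age = now - last_success
    if age <= sla / 2:
        return "green"
    if age <= sla:
        return "amber"
    return "red"


checked_at = datetime(2024, 5, 2, 12, 0, tzinfo=timezone.utc)
color = freshness_color(checked_at - timedelta(hours=20), now=checked_at)
```

Both datetimes should be timezone-aware and in UTC; mixing naive and aware values raises a `TypeError` in Python rather than silently producing a wrong age.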


11. Example Hybrid Pattern (visual logic) #

┌──────────┐     ┌──────────────┐     ┌─────────────┐     ┌───────────┐
│ Source   │ --> │ Extractor    │ --> │ Normalizer  │ --> │ Graph API │
│ (CMDB)   │     │ (Function)   │     │ (Mapping)   │     │ (Upsert)  │
└──────────┘     └──────────────┘     └─────────────┘     └───────────┘
       │                                        ↑
       └────────── Error Log / Retry ───────────┘

This pattern repeats identically across all domains — only connectors differ.


12. Key KPIs #

| KPI | Target | Meaning |
| --- | --- | --- |
| Average Sync Latency | < 15 minutes | Fresh data visible in near-real time |
| API Success Rate | > 99 % | Reliable source integration |
| Throughput per Function | > 500 records/sec | Scalable ingestion |
| Data Freshness SLA | ≤ 24 hours | Up-to-date graph representation |
| Integration Confidence | ≥ 0.8 | Quality of mapping and reconciliation |

13. Common Pitfalls #

| Issue | Root Cause | Fix |
| --- | --- | --- |
| Duplicate nodes | Missing unique ID mapping | Define natural_key |
| API throttling | Over-aggressive polling | Implement exponential backoff |
| Schema drift | Source update not tracked | Add schema hash validation |
| Missed deltas | Timezone mismatch in filters | Store UTC timestamps |
| Stale on-prem data | Manual exports forgotten | Add automatic file watcher |
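
Schema hash validation, the fix listed for schema drift, can be sketched as below. The canonical form (sorted `name:type` pairs joined with `|`) and the function names are assumptions; any stable serialization would serve.

```python
import hashlib


def schema_hash(columns: dict) -> str:
    """Hash a {column_name: type} mapping independently of column order."""
    canonical = "|".join(f"{name}:{dtype}" for name, dtype in sorted(columns.items()))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def schema_drifted(columns: dict, last_known_hash: str) -> bool:
    """True when the source schema no longer matches the stored hash."""
    return schema_hash(columns) != last_known_hash


baseline = schema_hash({"sys_id": "string", "name": "string"})
drifted = schema_drifted({"sys_id": "string", "name": "string", "owner": "string"}, baseline)
```

A connector can compare the hash at the start of each run and quarantine the load or raise an alert when it changes, rather than silently mapping fields that have moved.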

14. Takeaway #

Integration in EA 2.0 isn’t plumbing — it’s governance in motion.
Each connector is a living contract of trust between systems.
The more seamless the integration, the more intelligent the enterprise becomes.
