Policies, Triggers & Thresholds

Structure and Authoring Guidelines #


1 Purpose #

The Predictive and Autonomous layers rely on written logic—policies.
A policy is a digital rule that describes what must be true, when to react, and how to respond.
Together, policies, triggers, and thresholds create EA 2.0’s governance nervous system:

  • Policies define intent.
  • Triggers sense deviation.
  • Thresholds calibrate sensitivity.

2 Policy Taxonomy #

| Category | Example Intent | Common Trigger | Typical Action |
|---|---|---|---|
| Compliance | Stay aligned with security or privacy standards | Policy violations > 0 | Create GRC ticket |
| Operational | Maintain SLA and performance | SLA < target | Scale up infra / notify |
| Financial | Prevent overspend or under-utilisation | Cost > budget × 1.1 | Suspend non-prod resources |
| Lifecycle | Keep apps supported and owned | Support end < 90 days | Trigger upgrade plan |
| Risk Control | Contain exposure before incident | Risk score > threshold | Invoke mitigation workflow |

Policies can be preventive (avoid a breach) or reactive (contain a breach).


3 Policy Structure #

Every policy has five core blocks: identity (id, title, intent), trigger, action, governance, and metadata:

id: unique_policy_id
title: Short description
intent: What outcome this rule protects
trigger:
  source_metric: KPI or event
  operator: ">"
  threshold: numeric or logical value
  condition: optional context filter
action:
  type: task | notification | automation
  endpoint: API URL or ServiceNow table
governance:
  owner: team email
  approval: required | optional | auto
  severity: info | warning | critical
metadata:
  version: 1.0
  last_updated: 2025-11-08

This YAML-based structure is human-readable and deployable through Git CI/CD.
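Before deployment, a definition like this can be checked for structural completeness. The sketch below is illustrative, not part of any EA 2.0 tooling; the required keys simply mirror the YAML layout above.

```python
# Minimal structural check for a policy definition (illustrative sketch).
REQUIRED_KEYS = {"id", "title", "intent", "trigger", "action", "governance", "metadata"}
TRIGGER_FIELDS = ("source_metric", "operator", "threshold")

def validate_policy(policy: dict) -> list:
    """Return a list of problems; an empty list means structurally valid."""
    problems = [f"missing block: {key}" for key in sorted(REQUIRED_KEYS - policy.keys())]
    for field in TRIGGER_FIELDS:
        if field not in policy.get("trigger", {}):
            problems.append(f"trigger missing field: {field}")
    return problems

policy = {
    "id": "policy_cost_guard",
    "title": "Prevent budget overshoot",
    "intent": "Keep cloud cost within the approved budget",
    "trigger": {"source_metric": "cost", "operator": ">", "threshold": 1.1},
    "action": {"type": "automation", "endpoint": "https://example.invalid/pause"},
    "governance": {"owner": "finops@example.invalid", "approval": "required", "severity": "critical"},
    "metadata": {"version": "1.0", "last_updated": "2025-11-08"},
}
print(validate_policy(policy))  # []
```

A check like this runs naturally as a CI step on the Git branch holding the policy files, so malformed definitions never reach an environment.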


4 Trigger Design #

Triggers are sensors—each listens to one or more metrics and fires when a condition is met.

  • Event-Based: real-time log or message (incident.created).
  • Time-Based: scheduled KPI check (every hour).
  • Threshold-Based: numeric breach (cpu > 80 %).
  • Anomaly-Based: ML-detected deviation (cost z-score > 2).

Good triggers are specific, debounced (don’t spam), and explainable.
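Debouncing can be sketched as a small wrapper that suppresses repeat firings inside a cooldown window. The class, metric, and parameters here are illustrative assumptions, not an EA 2.0 API.

```python
class DebouncedTrigger:
    """Fire at most once per cooldown window, to avoid alert spam (a sketch)."""

    def __init__(self, condition, cooldown_s: float):
        self.condition = condition          # predicate over the observed value
        self.cooldown_s = cooldown_s        # minimum seconds between firings
        self._last_fired = float("-inf")

    def check(self, value, now: float) -> bool:
        """Evaluate the condition; fire only if the cooldown has elapsed."""
        if self.condition(value) and now - self._last_fired >= self.cooldown_s:
            self._last_fired = now
            return True
        return False

cpu_high = DebouncedTrigger(lambda cpu: cpu > 80, cooldown_s=300)
print(cpu_high.check(92, now=0))    # True  -> fires
print(cpu_high.check(95, now=60))   # False -> still in cooldown
print(cpu_high.check(95, now=400))  # True  -> window elapsed
```

Passing the clock in explicitly (`now`) keeps the trigger testable and its firing history explainable.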


5 Threshold Tuning #

Thresholds decide how sensitive automation is.
Three design principles:

  1. Dynamic, not static.
    Use rolling averages or percentiles instead of hard numbers.
  2. Confidence-weighted.
    Couple thresholds with model certainty (fire if confidence > 0.8).
  3. Context-aware.
    Adjust per environment: Dev > 100 %, Prod > 90 %.

Example SQL-style rule:

IF cost/current_budget > 1.1 AND confidence > 0.8 THEN trigger('CostOverrun')
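Combining the three principles, a rule like the one above can be sketched in Python with a rolling-percentile threshold and a confidence gate. The function names and the sample history are illustrative assumptions.

```python
from statistics import quantiles

def dynamic_threshold(history, pct=95):
    """Rolling percentile over recent samples instead of a hard-coded number."""
    # quantiles(..., n=100) returns the 1st..99th percentile cut points.
    return quantiles(history, n=100)[pct - 1]

def should_fire(value, history, confidence):
    """Confidence-weighted check: fire only when the model is certain enough."""
    return confidence > 0.8 and value > dynamic_threshold(history)

history = [100 + i for i in range(60)]  # e.g. the last 60 hourly cost samples
print(should_fire(170, history, confidence=0.9))  # True
print(should_fire(170, history, confidence=0.5))  # False
```

The per-environment adjustment from principle 3 would simply swap in a different `pct` (or budget multiplier) for Dev versus Prod.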

6 Authoring Workflow #

  1. Define Intent – What problem should never occur?
  2. Select Signal – Which KPI or event detects it earliest?
  3. Set Threshold – When does it become unacceptable?
  4. Choose Action – Fix automatically or notify humans?
  5. Tag Owner & Severity – Who’s accountable?
  6. Publish & Test – Run dry-mode simulation.

All new policies go through peer review before activation.


7 Simulation & Testing #

EA 2.0 includes a policy sandbox that can replay 30 days of historical data and show which triggers would have fired.
Benefits:

  • Detect over-sensitivity (too many alerts).
  • Benchmark thresholds.
  • Visualise cost or risk avoided if policy had existed earlier.
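The replay idea can be sketched as a simple loop over historical records. The record fields, the growth pattern, and the trigger lambda below are hypothetical, chosen only to illustrate how a sandbox report might be computed.

```python
def replay(records, trigger):
    """Replay historical records through a trigger and summarise its behaviour."""
    fired = sum(1 for record in records if trigger(record))
    return {
        "evaluated": len(records),
        "fired": fired,
        "fire_rate": fired / len(records) if records else 0.0,
    }

# Hypothetical 30-day history: cost grows by 2 per day against a fixed budget.
records = [{"day": day, "cost": 100 + 2 * day, "budget": 120} for day in range(30)]
report = replay(records, lambda r: r["cost"] > r["budget"] * 1.1)
print(report["fired"], "of", report["evaluated"], "days would have fired")
```

A high fire rate in replay is exactly the over-sensitivity signal the sandbox is meant to surface before a policy goes live.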

8 Policy Dependencies #

Policies often depend on each other.
Use explicit links to avoid feedback loops:

depends_on: [risk_score_policy, cost_guard_policy]
conflicts_with: [debug_mode_policy]

Dependency graphs ensure orchestrated execution.
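One way to guard against feedback loops is a depth-first cycle check over the `depends_on` links. This is a sketch; the two-policy graph below is an illustrative example, not a real library.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of policy ids, or None if acyclic."""
    visiting, done = set(), set()

    def visit(node, path):
        if node in visiting:                       # back-edge: we found a loop
            return path[path.index(node):] + [node]
        if node in done or node not in deps:
            return None
        visiting.add(node)
        for dep in deps[node]:
            cycle = visit(dep, path + [node])
            if cycle:
                return cycle
        visiting.discard(node)
        done.add(node)
        return None

    for node in deps:
        cycle = visit(node, [])
        if cycle:
            return cycle
    return None

deps = {
    "cost_guard_policy": ["risk_score_policy"],
    "risk_score_policy": ["cost_guard_policy"],  # feedback loop
}
print(find_cycle(deps))
```

Run as a CI check, this rejects a policy merge that would introduce a loop, so execution order can always be derived from the graph.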


9 Governance Metadata #

Every policy automatically records:

  • Version, author, last change.
  • Approval status (draft → approved → active → retired).
  • Execution count & success rate.
  • Exceptions granted (with expiry date).

This makes audits effortless and preserves historical lineage.


10 Policy Lifecycle #

| Phase | Description | Tool |
|---|---|---|
| Draft | Authored, awaiting peer review | Git branch |
| Approved | CAB or EA Ops sign-off | Merge → main |
| Active | Deployed & monitored | Logic App / API Mgmt |
| Suspended | Temporarily off | Policy dashboard |
| Retired | Archived, immutable record | Blob archive |

Policies evolve like software—not documents.


11 Monitoring & Feedback #

Dashboards track:

  • Policy coverage (% of systems governed).
  • Trigger frequency by domain.
  • Mean time to resolve (automated vs manual).
  • False positive ratio.

EA Ops reviews these metrics quarterly to adjust thresholds and retire obsolete rules.
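Two of these metrics can be computed directly from trigger events. The event fields below are illustrative assumptions about what an execution log might record.

```python
def policy_metrics(events):
    """Compute false-positive ratio and mean time to resolve from trigger events."""
    fired = len(events)
    false_positives = sum(1 for e in events if not e["was_real"])
    resolve_times = [e["minutes_to_resolve"] for e in events if e["was_real"]]
    return {
        "false_positive_ratio": false_positives / fired if fired else 0.0,
        "mean_time_to_resolve": sum(resolve_times) / len(resolve_times) if resolve_times else 0.0,
    }

events = [
    {"was_real": True,  "minutes_to_resolve": 30},
    {"was_real": True,  "minutes_to_resolve": 50},
    {"was_real": False, "minutes_to_resolve": 0},
    {"was_real": False, "minutes_to_resolve": 0},
]
print(policy_metrics(events))  # {'false_positive_ratio': 0.5, 'mean_time_to_resolve': 40.0}
```

A quarterly review would compare these numbers against the KPI targets in section 14 and tighten or retire rules accordingly.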


12 Best Practices #

✅ Use plain-language titles: “Prevent Unlabeled PII Storage.”
✅ Include a business-impact field (“saves $50K / yr in cloud cost”).
✅ Version policies like code.
✅ Always test before trust.
✅ Link each policy to an owner node in the graph for accountability.


13 Example Policy Library Snapshot #

| ID | Intent | Severity | Action | Owner |
|---|---|---|---|---|
| policy_cost_guard | Prevent budget overshoot | High | Logic App pause | FinOps |
| policy_sla_watch | Maintain SLA ≥ 95 % | Medium | Notify Ops | EA Ops |
| policy_data_label | Enforce Confidential tag | High | Apply label | Data Governance |
| policy_orphan_app | Detect unowned apps | Medium | Create ticket | ITSM |
| policy_risk_drift | Cap risk score increase < 10 % | Low | Log alert | Security |

14 KPIs for Policy Governance #

| KPI | Target | Meaning |
|---|---|---|
| Policy Coverage | ≥ 85 % of systems governed | Maturity |
| Approval Turnaround | ≤ 3 days | Efficiency |
| False Trigger Rate | ≤ 5 % | Precision |
| Rollback Rate | ≤ 2 % | Reliability |
| Active Policy Ratio | ≥ 70 % active vs retired | Relevance |

15 Cultural Guidelines #

  • Start small – pilot 10 policies first.
  • Treat policies as collaboration between architects and operators.
  • Encourage feedback from executors to authors.
  • Celebrate prevented incidents as success metrics.

Governance becomes a shared craft, not enforcement.


16 Takeaway #

Policies are architecture expressed as code.
When rules are transparent, measurable, and adaptive, autonomy feels safe instead of risky.
