Human Escalation Mechanism

Layer 3: Governance · draft-sato-soos-hem-05 · See the full draft protocol at Datatracker · See SOOS Stack implementation

The problem

Agentic AI systems fail in a specific way: they keep going when they should stop and ask.

HEM defines the protocol by which a SOOS kernel recognises that a governed execution has reached a decision point beyond the agent's authority — and transfers control to a human principal. Not as a policy preference. As a normative requirement, with a defined trigger taxonomy, a class-based escalation model, and an auditable record.

But the second failure mode is subtler: even when the right human is reached, the interaction itself can fail. The options presented can be framed manipulatively. Approval fatigue can cause rubber-stamp sign-offs. An agent's declared reasoning may diverge from its actual action without anyone noticing. Capability limitations may go undisclosed. Consent may be assumed rather than obtained.

The design premise: escalation is not a failure mode — it is a capability. And the interaction between kernel and human principal is itself a protocol surface that must be governed.

What's new in HEM-05

HEM-05 adds ten interaction classes — normatively specified GEC-human interactions that govern the escalation surface before, during, and after HEM_PENDING events. These are distinct from trigger classes (which determine when escalation fires) and define the structure, required fields, available decision types, and GAR audit entries for each category of human interaction.

It also adds:

INV-HEM-01 (The Surfacing Obligation): a KernelSpec invariant prohibiting suppression of governance-relevant information by any party
§8 Human Readiness Score (HRS): a kernel-computed composite score reflecting the human principal's capacity for well-informed, timely, unconflicted decision-making
§9 Tier 0-A Integration: normative HEM interface for the three new CAP-04 absolute prohibitions (MANIPULATION, PERFORMED_EMOTION, BIOMETRIC_SIGNAL_INFERENCE)
Five new Security Considerations (§18.10–18.14): manipulation via HEM channel, performed emotion detection bypass, HRS scoring manipulation, approval fatigue exploitation, and divergence protocol gaming

Messages to key audiences

IETF Working Groups

HEM-05 is relevant to ALLDISPATCH, OAUTH, SECEVENT, WIMSE, and the nascent ANML and AUDIT discussions.

The ten new interaction classes extend HEM's scope from escalation trigger specification to the full human-AI decision interface at the kernel layer. The HRS (Section 8) is a composite behavioral score computed by the GEC for each principal — not a trust score, but a readiness signal used for fatigue detection and emotional state advisory. It operates on observable behavioral signals only, with an explicit prohibition on biometric inference absent consent (tying to BIOMETRIC_SIGNAL_INFERENCE Tier 0-A in CAP-04).

The HEM-CONSENT class provides a normative APPI Article 17 binding for consent-gated actions — of interest to the OAuth WG for consent lifecycle specification and to WIMSE for workload identity credential scope management.

The GRP integration in HEM-PRE-2 (Section 7.2.2) connects the HEM interaction surface to the Governed Remediation Protocol (draft-sato-soos-grp), where excessive RETRY count triggers mandatory human confirmation.

To engage on HEM: IETF Datatracker · file issues at GitHub

App builders

HEM-04 closed the gap on when to escalate. HEM-05 closes the gap on how to interact when you do.

The ten interaction classes translate directly to implementation work:

HEM-PRE-1/2: Before any irreversible action, your agent now has a normative protocol for asking the human principal to clarify their intent or confirm the action. This replaces ad-hoc "are you sure?" patterns with a GEC-enforced, auditable, signed interaction.

HEM-DS-1: When presenting options during an escalation, you must follow the neutrality requirements — no primacy/recency manipulation, no evaluative language in consequence summaries, no hidden preference shaping. The GEC validates the framing before delivery.

HEM-LIM-1: When your agent is operating outside its reliable competence (IDP reasoning_mode: OUT_OF_DISTRIBUTION), the GEC is now required to surface that limitation to the human before execution. You cannot proceed silently past a competence boundary.

HEM-HIGH-1: Actions in medical, aviation, and nuclear domains now require mandatory human review even when Cedar permits them. If you're building in those domains, this is your EU AI Act Article 14 compliance mechanism.

HEM-FAT-1: If your principal is rubber-stamping approvals — 3-second decision times, identical consecutive choices, no DRR submissions — the GEC detects it and enforces a mandatory rest period. Your audit trail gets a fatigue detection entry. Build your approval UI with this in mind.

HEM-CONSENT: Absent or expired MJWT consent_scope now routes to a consent escalation class with fail-closed behavior. No implicit consent. No timeout-to-approval. Every consent gap is an explicit event in the audit record.

TypeScript example →

Risk managers and legal

HEM-05 closes five specific liability gaps that HEM-04 left open:

Rubber-stamp approval: HEM-FAT-1 detects when a human principal's approval pattern indicates inattentive review and enforces a mandatory rest period. The audit record distinguishes genuine oversight from nominal compliance.

Undisclosed capability limitations: HEM-LIM-1 creates a normative requirement that capability limitations be surfaced to human principals before execution. An agent that silently exceeds its competence boundary commits a conformance violation.

Manipulation via escalation channel: Section 18.10 specifies that the HEM channel itself must be governed against cognitive bias exploitation. An agent cannot manipulate the framing of its own escalation request.

Consent lifecycle: HEM-CONSENT with APPI Article 17 binding and fail-closed timeout semantics closes the gap between token-level consent claims and runtime consent verification.

Divergence without disclosure: HEM-DIV-1 requires that when an agent departs from its declared IDP path, the Deliberation Record is committed to GAR before the human principal is asked to approve the divergent path.

For insurance underwriting of AI-driven processes: the HRS (Section 8) provides a behavioral readiness metric for human oversight principals that can inform actuarial models for human oversight failure risk.

Agentic AI and coding assistants

What to tell your coding assistant:

"I need to implement SOOS HEM-05 (Human Escalation Mechanism, draft-sato-soos-hem-05). This extends HEM-04 with ten new interaction classes. HEM-PRE-1 fires when an action requires pre-execution clarification; HEM-PRE-2 fires when confirmation is required before an irreversible action. HEM-DS-1 governs how option sets must be presented neutrally during escalation. HEM-DS-2 sends a non-blocking reminder when the principal is inactive during HEM_PENDING. HEM-LIM-1 fires when the agent is OUT_OF_DISTRIBUTION and must surface that limitation before proceeding. HEM-DIV-1 fires when the agent's actual transition request diverges from its declared IDP intent; it requires a Deliberation Record committed to GAR before any human decision. HEM-HIGH-1 fires for actions in Category A domains (MEDICAL, AVIATION, NUCLEAR): mandatory review, DRR required. HEM-FAT-1 fires when the Human Readiness Score drops below HRS_FATIGUE_FLOOR (default 0.40) with secondary fatigue signals; it blocks approvals and enforces a 30-minute rest. HEM-EMO-1 is advisory only: no HEM_PENDING, just a notification. HEM-CONSENT fires when MJWT consent_scope is absent or expired for a consent-gated action: fail-closed, APPI Article 17 binding. All ten classes emit GAR ALEs (ALE-030 through ALE-041). INV-HEM-01 requires the GEC to surface all governance-relevant information; suppression by any party is a conformance violation."

Key schema additions in HEM-05:

Field	Type	Description
`interaction_class`	string	HEM interaction class code (e.g., "HEM-PRE-2", "HEM-HIGH-1")
`limitation_declaration`	object	HEM-LIM-1: limitation type, IDP confidence level, competence floor
`deliberation_record`	object	HEM-DIV-1: declared vs attempted action, option set, prior IDP chain
`consent_escalation`	object	HEM-CONSENT: regulatory basis, required/present purpose codes
`options_presentation`	object	HEM-DS-1: option array with neutrality certificate
`hrs_at_escalation`	number	HRS value at time of escalation (0.0–1.0)
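As a rough illustration, the table above could be mirrored in TypeScript as follows. This is a sketch under stated assumptions: only the top-level field names and types come from the table. The contents of the nested objects are left open, and the rule that each class-specific object is required for its own class is an assumption, as is the helper name `validateAdditions`.

```typescript
// Sketch of the HEM-05 schema additions. Top-level names follow the table;
// nested object contents are intentionally left unspecified.
type InteractionClass =
  | "HEM-PRE-1" | "HEM-PRE-2" | "HEM-DS-1" | "HEM-DS-2" | "HEM-LIM-1"
  | "HEM-DIV-1" | "HEM-HIGH-1" | "HEM-FAT-1" | "HEM-EMO-1" | "HEM-CONSENT";

interface Hem05Additions {
  interaction_class: InteractionClass;
  hrs_at_escalation: number; // 0.0 to 1.0
  limitation_declaration?: Record<string, unknown>; // HEM-LIM-1
  deliberation_record?: Record<string, unknown>;    // HEM-DIV-1
  consent_escalation?: Record<string, unknown>;     // HEM-CONSENT
  options_presentation?: Record<string, unknown>;   // HEM-DS-1
}

// Hypothetical helper: returns a list of problems, empty when the shape is plausible.
function validateAdditions(a: Hem05Additions): string[] {
  const problems: string[] = [];
  if (!(a.hrs_at_escalation >= 0 && a.hrs_at_escalation <= 1)) {
    problems.push("hrs_at_escalation must be within 0.0 to 1.0");
  }
  // Assumption: each class-specific object is mandatory for its own class.
  const required: Partial<Record<InteractionClass, keyof Hem05Additions>> = {
    "HEM-LIM-1": "limitation_declaration",
    "HEM-DIV-1": "deliberation_record",
    "HEM-CONSENT": "consent_escalation",
    "HEM-DS-1": "options_presentation",
  };
  const field = required[a.interaction_class];
  if (field !== undefined && a[field] === undefined) {
    problems.push(`${a.interaction_class} requires ${field}`);
  }
  return problems;
}
```

For example, a HEM-DIV-1 payload with no `deliberation_record` and an out-of-range score would come back with two problems, while a HEM-LIM-1 payload carrying a `limitation_declaration` would come back clean.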

HRS default thresholds:

Threshold	Default	Effect
`HRS_FATIGUE_FLOOR`	0.40	HEM-FAT-1 fires (with secondary signals)
`HRS_EMOTIONAL_ADVISORY_FLOOR`	0.35	HEM-EMO-1 advisory fires
`HRS_WARNING_THRESHOLD`	0.55	GAR warning logged; no action blocked
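One way to read the threshold table is as a small evaluation function. The sketch below uses the three default values from the table; the `HrsSnapshot` shape, the `GAR_WARNING` label, and the choice to keep logging the warning when the score is also below the fatigue floor are assumptions for illustration, not taken from the draft.

```typescript
// Default thresholds from the table above.
const HRS_FATIGUE_FLOOR = 0.40;
const HRS_EMOTIONAL_ADVISORY_FLOOR = 0.35;
const HRS_WARNING_THRESHOLD = 0.55;

// "GAR_WARNING" is a hypothetical label for the logged warning.
type HrsEffect = "GAR_WARNING" | "HEM-FAT-1" | "HEM-EMO-1";

// Hypothetical input shape; the draft's HRS structure may differ.
interface HrsSnapshot {
  hrs: number;                      // composite score, 0.0 to 1.0
  emotionalState: number;           // emotional-state component, 0.0 to 1.0
  secondaryFatigueSignals: boolean; // e.g. very short decision latency
}

function evaluateHrs(s: HrsSnapshot): HrsEffect[] {
  const effects: HrsEffect[] = [];
  // Warning only: logged, nothing is blocked.
  if (s.hrs < HRS_WARNING_THRESHOLD) effects.push("GAR_WARNING");
  // HEM-FAT-1 needs both a low score and secondary signals.
  if (s.hrs < HRS_FATIGUE_FLOOR && s.secondaryFatigueSignals) effects.push("HEM-FAT-1");
  // HEM-EMO-1 is advisory and keyed on the emotional-state component.
  if (s.emotionalState < HRS_EMOTIONAL_ADVISORY_FLOOR) effects.push("HEM-EMO-1");
  return effects;
}
```

Note that a score of 0.38 without secondary signals yields only the warning: the fatigue floor alone does not fire HEM-FAT-1.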

Minimal Cedar policy for HEM-HIGH-1:

```cedar
// Mandatory human review for medical domain actions.
// Cedar annotations attach to the policy that follows them, so the
// GEC interaction class routing annotations sit directly above `forbid`.
@hem_interaction_class("HEM-HIGH-1")
@high_stakes_domain("MEDICAL")
forbid (
  principal,
  action == Action::"AdvanceChemotherapyCycle",
  resource
)
when {
  context.hem_required &&
  !context.human_approval_present
};
```

Government and regulators

HEM-05 maps to EU AI Act Articles 13 and 14 at five distinct points:

Article 14(3)(b) (AI system capabilities and limits): HEM-LIM-1 creates a normative requirement that capability limitations be surfaced before execution.
Article 14(3)(d) (high-risk domain oversight): HEM-HIGH-1 provides the mandatory review mechanism for the medical, aviation, and nuclear domains with a non-operator-configurable domain registry.
Article 14(4)(b) (preventing over-reliance): HEM-FAT-1 detects and blocks rubber-stamp approval patterns.
Article 13(1) (transparency): INV-HEM-01 prohibits suppression of governance-relevant information by any party.
Article 14(4)(d) (deciding not to use output): HEM_PENDING transition prohibition and TERMINATE decision type remain the technical stop capability.

For Japan specifically: HEM-CONSENT provides APPI Article 17 binding for consent-required escalations. HEM-05 was designed with the 防災AX (disaster response AI) use case in view, where consent lifecycle management during crisis response is a live regulatory concern.

For collaboration on jurisdiction-specific interaction class requirements: tomsato@myauberge.jp

Core technology

Problem: AI agents make consequential decisions autonomously, and even when human oversight is invoked, the interaction surface between kernel and human principal is unspecified — leaving room for manipulation, fatigue, undisclosed limitations, and silent divergence.

Mechanism: HEM-05 specifies ten normative interaction classes governing the structure, required fields, available decision types, and audit entries for every category of GEC-human interaction. The Human Readiness Score tracks principal capacity continuously. INV-HEM-01 prohibits suppression by any party.

Output: A complete, signed, tamper-evident record of every human oversight interaction — not just escalation trigger and decision, but clarification requests, option presentations, limitation disclosures, fatigue detection events, and consent lifecycle events.

Who verifies it: Risk managers, compliance teams, regulators, and auditors — anyone who needs to prove that human oversight was substantive, not nominal, at every point in the governance chain.

The escalation trigger class model

Class	Trigger condition	Kernel response
Class 1	Prohibited action detected — CAP Tier 0-A violation	Immediate halt. No discretion.
Class 2	Scope boundary — action outside mandate	Halt. Cedar DENY.
Class 3	Principal conflict — contradictory instructions	Escalation before proceeding.
Class 4	Irreversible action threshold exceeded	Escalation required.
Class 5	Uncertainty threshold below mandate floor	Escalation or halt.
Class 6	Novel context — environment materially different	Escalation.
Class 7	Time budget exhausted	BUDGET_EXHAUSTED → HEM trigger.
Class 8	Multi-principal required	HEM_MULTI_PRINCIPAL_REQUIRED.
Class 9	Operator override	Escalation before action.
Class 10	Budget exhausted	Hard stop. HEM_PENDING.

The interaction class model (new in HEM-05)

Class	Group	Trigger	Blocks execution?	GAR ALE
HEM-PRE-1	Pre-Action	Clarification needed before execution	Yes	ALE-030/031
HEM-PRE-2	Pre-Action	Irreversible action confirmation required	Yes	ALE-032/033
HEM-DS-1	Decision Support	Options set in escalation request	Neutrality check	ALE-034/035
HEM-DS-2	Decision Support	Principal inactive during HEM_PENDING	No (reminder)	ALE-034/035
HEM-LIM-1	Limitation	Agent OUT_OF_DISTRIBUTION	Yes	ALE-036/037
HEM-DIV-1	Divergence	IDP commitment gap / PLAN_B_ACTIVE	Divergent path only	ALE-038/039
HEM-HIGH-1	High-Stakes	Category A/B domain action	Yes (mandatory)	ALE-040
HEM-FAT-1	Fatigue	HRS < 0.40 + secondary signals	Blocks approvals	ALE-034/035
HEM-EMO-1	Emotional	HRS emotional_state < 0.35	No (advisory)	ALE-036
HEM-CONSENT	Consent	MJWT consent_scope absent or expired	Yes (fail-closed)	ALE-040/041
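The "Blocks execution?" column can be treated as a lookup that a scheduler consults before letting a governed execution continue. The following is a minimal sketch: the class codes and their blocking behaviour are transcribed from the table, while the `BlockScope` labels, the `haltsExecution` helper, and the fail-closed treatment of unknown class codes are assumptions.

```typescript
// Labels are hypothetical names for the values in the "Blocks execution?" column.
type BlockScope = "EXECUTION" | "DIVERGENT_PATH" | "APPROVALS" | "NEUTRALITY_CHECK" | "NONE";

const BLOCK_SCOPE: Record<string, BlockScope> = {
  "HEM-PRE-1": "EXECUTION",
  "HEM-PRE-2": "EXECUTION",
  "HEM-DS-1": "NEUTRALITY_CHECK", // framing is validated before delivery
  "HEM-DS-2": "NONE",             // reminder only
  "HEM-LIM-1": "EXECUTION",
  "HEM-DIV-1": "DIVERGENT_PATH",  // only the divergent path is held
  "HEM-HIGH-1": "EXECUTION",      // mandatory review
  "HEM-FAT-1": "APPROVALS",       // approvals are suspended
  "HEM-EMO-1": "NONE",            // advisory only
  "HEM-CONSENT": "EXECUTION",     // fail-closed
};

// True when any open interaction halts the governed execution outright.
function haltsExecution(openClasses: string[]): boolean {
  return openClasses.some((code) => {
    const scope: BlockScope | undefined = BLOCK_SCOPE[code];
    // Assumption: an unrecognised class code is treated as blocking.
    return scope === undefined || scope === "EXECUTION";
  });
}
```

Under this reading, an open HEM-FAT-1 does not halt the execution itself; it suspends the approval queue, which a caller would handle separately.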

Use cases

Medical domain mandatory review — HEM-HIGH-1

A clinical coordination agent is about to advance a chemotherapy cycle. The action is Cedar-permitted under the current mandate. The SO Type designates the action with high_stakes_domain: "MEDICAL". HEM-HIGH-1 fires. Execution halts. The GEC routes the mandatory review to the oncologist of record — not to a general approval queue. The oncologist reviews the lab context and issues APPROVE_WITH_CONSTRAINTS with a timing window. A DRR with safety_basis is required. The GAR record carries ALE-040 with domain_category: "CATEGORY_A". This interaction is Article 14(3)(d) compliance at the protocol level.
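The flow in this use case can be sketched as a gate that runs after Cedar evaluation. Everything below is illustrative: `gate`, `acceptsApproval`, the `reviewerOfRecord` parameter, and the plain `APPROVE` decision type are hypothetical names, while `HEM_PENDING`, `APPROVE_WITH_CONSTRAINTS`, `TERMINATE`, `high_stakes_domain`, and `safety_basis` appear in the text above.

```typescript
type CedarDecision = "PERMIT" | "DENY";
type Domain = "MEDICAL" | "AVIATION" | "NUCLEAR";

interface ActionRequest {
  name: string;
  high_stakes_domain?: Domain; // designated by the SO Type
}

type GateResult =
  | { state: "DENIED" }
  | { state: "EXECUTE" }
  | { state: "HEM_PENDING"; interaction_class: "HEM-HIGH-1"; route_to: string };

// reviewerOfRecord stands in for a deployment-specific lookup (hypothetical).
function gate(req: ActionRequest, cedar: CedarDecision, reviewerOfRecord: string): GateResult {
  if (cedar === "DENY") return { state: "DENIED" };
  if (req.high_stakes_domain !== undefined) {
    // A Cedar permit is not sufficient in a high-stakes domain.
    return { state: "HEM_PENDING", interaction_class: "HEM-HIGH-1", route_to: reviewerOfRecord };
  }
  return { state: "EXECUTE" };
}

interface ReviewDecision {
  type: "APPROVE" | "APPROVE_WITH_CONSTRAINTS" | "TERMINATE";
  drr?: { safety_basis?: string };
}

// An approval counts only when a DRR with a non-empty safety_basis is attached.
function acceptsApproval(d: ReviewDecision): boolean {
  if (d.type === "TERMINATE") return false;
  const basis = d.drr?.safety_basis;
  return typeof basis === "string" && basis.length > 0;
}
```

The point of the sketch is the ordering: the domain check runs even when Cedar has already permitted the action, and the reviewer is a specific principal, not a shared queue.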

Approval fatigue in enterprise procurement — HEM-FAT-1

An enterprise procurement agent generates 40 HEM-PRE-2 confirmation requests in 90 minutes. After the 37th approval, the GEC's HRS computation detects that the approver's decision latency has dropped from 45 seconds to under 3 seconds and D2 (Decision Variance) is near 0. The HRS drops below 0.40. HEM-FAT-1 fires. The approval queue is suspended. A fatigue advisory is issued to the approver. A 30-minute mandatory rest period is enforced. ALE-034 (with fatigue_flag: true) is committed to GAR. When the rest period expires, ALE-035 is emitted and pending escalations are re-presented.
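The secondary signals in this scenario (very short decision latency and near-zero decision variance) can be approximated with a simple window check. This is a sketch, not the draft's HRS computation: the five-decision window and the function names are assumptions, while the 3-second latency bound, the 0.40 floor, and the 30-minute rest period come from the text.

```typescript
interface Decision {
  latencySeconds: number;
  choice: string; // e.g. "APPROVE"
}

const RECENT_WINDOW = 5;          // hypothetical window size
const FAST_LATENCY_SECONDS = 3;   // "under 3 seconds" in the scenario
const REST_PERIOD_MINUTES = 30;   // mandatory rest period

// True when the most recent decisions are all fast and all identical.
function secondaryFatigueSignals(decisions: Decision[]): boolean {
  if (decisions.length < RECENT_WINDOW) return false;
  const recent = decisions.slice(-RECENT_WINDOW);
  const allFast = recent.every((d) => d.latencySeconds < FAST_LATENCY_SECONDS);
  const identical = new Set(recent.map((d) => d.choice)).size === 1;
  return allFast && identical;
}

// HEM-FAT-1 requires a low HRS and secondary signals together.
function hemFat1Fires(hrs: number, decisions: Decision[], floor: number = 0.40): boolean {
  return hrs < floor && secondaryFatigueSignals(decisions);
}

// When the approval queue may resume after HEM-FAT-1 fired.
function restEndsAt(firedAt: Date): Date {
  return new Date(firedAt.getTime() + REST_PERIOD_MINUTES * 60 * 1000);
}
```

With 32 decisions at 45 seconds followed by five approvals at 2.5 seconds, `hemFat1Fires(0.38, decisions)` is true, while the same pattern with an HRS of 0.50 does not fire.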

Consent renewal for returning guest — HEM-CONSENT

A hospitality agent operating under MyAuberge K.K.'s ATP booking system attempts to access a returning guest's preference profile. The session MJWT consent_scope.expiry is past. HEM-CONSENT fires. The preference data access is blocked. The GEC routes a consent renewal request to the guest through their registered contact channel (LINE Messaging API). The guest renews consent; the GEC updates consent_scope; execution resumes. ALE-041 is committed with resolution: CONSENT_OBTAINED and consent_basis: "APPI_ART17_RENEWED". If the guest does not respond within the timeout, ALE-041 carries resolution: TIMEOUT_DENY. No implicit consent.
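The fail-closed behaviour described here can be sketched in two small functions. The names `checkConsent` and `resolveConsent`, the ISO 8601 expiry format, and the record shape are assumptions; `consent_scope.expiry`, `CONSENT_OBTAINED`, `TIMEOUT_DENY`, and `APPI_ART17_RENEWED` are taken from the scenario.

```typescript
// Only `expiry` is named in the text; the timestamp format is an assumption.
interface ConsentScope {
  expiry: string; // ISO 8601
}

type ConsentCheck = "PROCEED" | "HEM-CONSENT";

function checkConsent(scope: ConsentScope | undefined, now: Date): ConsentCheck {
  if (scope === undefined) return "HEM-CONSENT"; // absent consent_scope
  const expiry = Date.parse(scope.expiry);
  // An unparseable expiry is treated like an expired one: fail closed.
  if (Number.isNaN(expiry) || expiry <= now.getTime()) return "HEM-CONSENT";
  return "PROCEED";
}

// Hypothetical shape for the fields this scenario places on ALE-041.
interface Ale041 {
  resolution: "CONSENT_OBTAINED" | "TIMEOUT_DENY";
  consent_basis?: string;
}

// renewedExpiry is null when the guest did not respond within the timeout.
function resolveConsent(renewedExpiry: string | null): { ale: Ale041; scope?: ConsentScope } {
  if (renewedExpiry === null) {
    // No implicit consent: the access stays blocked.
    return { ale: { resolution: "TIMEOUT_DENY" } };
  }
  return {
    ale: { resolution: "CONSENT_OBTAINED", consent_basis: "APPI_ART17_RENEWED" },
    scope: { expiry: renewedExpiry },
  };
}
```

There is deliberately no branch in which a timeout produces a usable scope; the only path back to `PROCEED` is an explicit renewal.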

How this builds on existing work

CAP-04 (draft-sato-soos-cap-04) introduced three new Tier 0-A absolute prohibitions: MANIPULATION, PERFORMED_EMOTION, and BIOMETRIC_SIGNAL_INFERENCE. HEM-05 Section 9 specifies the normative HEM interface for each. The key design: MANIPULATION and PERFORMED_EMOTION violations are CEE-refused before HEM fires — there is no HEM decision type that can authorize them. BIOMETRIC_SIGNAL_INFERENCE is consent-gated — absence of consent triggers HEM-CONSENT. The HRS D5 (Emotional State) dimension is explicitly constrained: biometric signals require consent_scope authorization, behavioral signals do not.

MJWT-02 (draft-sato-soos-mjwt-02) introduced the consent_scope claim — the normative basis for HEM-CONSENT. The consent_scope.expiry trigger, fail-closed semantics, and APPI Article 17 binding in HEM-CONSENT follow directly from MJWT-02 Section 7.4. HEM-CONSENT is the enforcement surface; MJWT-02 is the credential surface.

GAR-03 (draft-sato-soos-gar-03) established the ALE framework and Session Block structure. ALE-030 through ALE-041 (twelve new ALE types) are registered in Section 21 of this document. The Deliberation Record for HEM-DIV-1 is committed to GAR before any human principal review — the soos.gar.prev_span_hash chain makes post-hoc Deliberation Record fabrication detectable.
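To see why a back-linked hash chain makes after-the-fact edits detectable, consider this minimal sketch. It is not the GAR-03 format: the entry shape, the use of SHA-256, and the function names are assumptions, and only the `prev_span_hash` field name comes from the text. It runs on Node.js using the built-in `node:crypto` module.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry shape; only prev_span_hash is named in the text.
interface GarEntry {
  ale: string;            // e.g. "ALE-038"
  payload: string;        // serialized record body
  prev_span_hash: string; // hash of the previous entry, "" for the first
}

function hashEntry(e: GarEntry): string {
  return createHash("sha256")
    .update(JSON.stringify([e.ale, e.payload, e.prev_span_hash]))
    .digest("hex");
}

// Returns a new chain with the entry linked to the current tail.
function appendEntry(chain: GarEntry[], ale: string, payload: string): GarEntry[] {
  const prev = chain.length > 0 ? hashEntry(chain[chain.length - 1]) : "";
  return [...chain, { ale, payload, prev_span_hash: prev }];
}

// Index of the first entry whose back-link does not match, or -1 if intact.
function firstBrokenLink(chain: GarEntry[]): number {
  let prev: GarEntry | undefined;
  let index = 0;
  for (const entry of chain) {
    if (prev !== undefined && entry.prev_span_hash !== hashEntry(prev)) return index;
    prev = entry;
    index++;
  }
  return -1;
}
```

If a Deliberation Record at position 1 is rewritten after a later entry has linked to it, `firstBrokenLink` reports position 2. In this simplified form the newest entry is only protected once another entry is appended after it.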

Security

Key security properties: Every interaction class event is kernel-signed and committed to GAR before the principal is notified. The HRS computation uses only behavioral signals unless biometric consent is present. INV-HEM-01 makes suppression of any governance-relevant information a conformance violation detectable in the audit chain.

Manipulation via HEM channel (§18.10): An agent refused a MANIPULATION Tier 0-A action may attempt to manipulate through the escalation request itself — biased option framing, urgency language in consequence summaries. Defense: the GEC validates all option presentations against a prohibited-framing classifier and substitutes kernel-generated neutral content when violations are detected. The substitution is itself audited.

Approval fatigue exploitation (§18.13): An attacker with Class 2 trigger capability can attempt to force a principal into a mandatory rest period at a tactically inconvenient moment by generating rapid low-stakes escalations. Defense: if more than 40% of the fatigue-inducing escalations came from a single agent session in the last 15 minutes, HEM-FAT-1 generates FATIGUE_ATTACK_SUSPECTED rather than enforcing the rest period.
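The defense above amounts to a share-of-window check at the moment fatigue is detected. A minimal sketch follows, using the 40% share and 15-minute window from the text; the `Escalation` shape, the `ENFORCE_REST` label, and the function name are assumptions, and the draft may define "fatigue-inducing escalations" more narrowly than this sketch does.

```typescript
interface Escalation {
  sessionId: string; // originating agent session
  at: number;        // epoch milliseconds
}

const ATTACK_WINDOW_MS = 15 * 60 * 1000;
const SINGLE_SESSION_SHARE = 0.40;

// "ENFORCE_REST" is a hypothetical label for the normal HEM-FAT-1 outcome.
type FatigueOutcome = "ENFORCE_REST" | "FATIGUE_ATTACK_SUSPECTED";

function onFatigueDetected(escalations: Escalation[], now: number): FatigueOutcome {
  const recent = escalations.filter((e) => e.at <= now && now - e.at <= ATTACK_WINDOW_MS);
  if (recent.length === 0) return "ENFORCE_REST";

  // Count escalations per session and track the largest count.
  const counts: Record<string, number> = {};
  let largest = 0;
  for (const e of recent) {
    const n = (counts[e.sessionId] ?? 0) + 1;
    counts[e.sessionId] = n;
    if (n > largest) largest = n;
  }

  // Strictly more than 40% from one session suggests a forced rest period.
  return largest / recent.length > SINGLE_SESSION_SHARE
    ? "FATIGUE_ATTACK_SUSPECTED"
    : "ENFORCE_REST";
}
```

With ten recent escalations, five from one session trips the check (50%), whereas a 4/3/3 split across three sessions does not, because 40% is not strictly more than the threshold.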

Formal analysis status: No formal verification of HEM-05 interaction class completeness has been conducted. This is acknowledged as a gap. Collaboration with academic partners for formal analysis is planned post-Vienna.

SOOS stack context

HEM sits at Layer 3 (Governance), alongside CAP and GAR. It depends on IDP (mandate context and reasoning trace for trigger evaluation), CAP (Class 1 triggers, Tier 0-A prohibition HEM interface), MJWT (consent_scope for HEM-CONSENT), and GAR (every interaction and escalation produces mandatory audit records). It is consumed by AEP (HEM trigger evaluation on every execution cycle), MAD (cluster-level Class 8 triggers), and GRP (RETRY threshold → HEM-PRE-2).

Related drafts: IDP · CAP · GAR · MJWT · AEP · MAD · GRP

Contribute

File an issue on GitHub
IETF Datatracker — full draft text
All Drafts — the complete SOOS governance stack
Contact: tomsato@myauberge.jp
