Four Surfaces, No Witness — Luminity Digital
The Provenance Gap · Post 2 of 3 · May 2026

Four Surfaces, No Witness

Post 1 named the dyad: provenance and intent. Post 2 names the substrate that has to do the work — four surfaces where alignment-grade provenance has to live, and the architectural reason almost no current deployment satisfies any of them.

May 2026 Tom M. Gomez Luminity Digital 13 Min Read
The first post in this series established the structural defect: the question healthcare’s safety culture wants to ask of an agentic action — what was the intent — cannot be load-bearing when the system that produced the action also produces the answer. The dyad of provenance and intent was named. Intent verification was shown to be coordination-grade at best for agentic systems. Provenance was named as the substrate property that has to carry the load instead. This post develops what carrying the load would actually look like. Not as a checklist. As architecture. Four surfaces where alignment-grade provenance has to live for an agentic system in clinical scope to be safely deployable. The fourth axis — authorization — enters here, completing the triad that the rest of the series rests on. And the substrate the surfaces would need is examined honestly: the standards exist, the resource was specified, the deployment pattern did not populate it. The gap is not new. Agents are widening it. Post 02 of three.

From the dyad to the triad

Post 1 named two structural properties: provenance (what happened, with what attestation) and intent (what the actor was trying to do). For humans, both could be questioned because both had substrates underneath them — the chart, the deposition, the peer record, the professional reputation. For agents, only provenance can be made structurally answerable. Intent verification by examining the agent’s own reasoning collapses into the alibi problem.

The dyad explains the failure. It does not, by itself, describe the substrate.

To describe the substrate, a third axis has to be named: authorization. Authorization is not the same as provenance. Provenance asks what happened, attested by whom. Authorization asks what was permitted in this session, by what authority, within what scope. The two are independently necessary. A perfectly attested chain of artifact history is silent on whether the action that emitted at the end of the chain was sanctioned to emit. An impeccable authorization record is silent on whether the artifact chain that fed the decision was untampered.

Intent then becomes the third axis. What was the actor trying to accomplish. For humans, this third axis was the one the M&M conference asked about, because the other two were already grounded in substrate. For agents, the structural argument of Post 1 holds: intent verification, examined through the agent’s own outputs, cannot be load-bearing. Which means provenance and authorization have to be substrate-level — alignment-grade — or the system has no foundation at all.

The Triad, Named

Provenance — what happened, attested by whom, traceable from outside the actor.

Authorization — what was permitted in this session, by what authority, within what scope, independent of the actor’s claim.

Intent — what the actor was trying to accomplish. For agents, this remains coordination-grade and cannot be made load-bearing through the agent’s own outputs.

Two of these properties have to be structurally enforceable from outside the system that emitted the action. The third — intent — can only ever be coordination-grade for agents, layered on top of the two that carry the load. Post 2 is about the two that have to carry the load.

What alignment-grade actually means

The term alignment-grade has been used in this series as a structural distinction, separate from coordination-grade controls that share form but lack load-bearing properties. It is worth defining precisely now, because the next section will use it to describe what each of the four surfaces has to be.

A control or a substrate property is alignment-grade when it satisfies three conditions.

First, external attestation. The substrate that establishes a fact is not produced by the system whose claim is being verified. A clinician’s license is not produced by the clinician. A chart entry’s timestamp is not produced by the actor making the entry. A cryptographic signature is not produced by the holder of the private key being verified. Alignment-grade substrates carry their authority from outside the actor whose action they attest.

Second, independent lineage. Each link in the chain of attestation is independently auditable from outside the system that produced it. The reader of the chain can verify each link without trusting any single actor in the chain — including the reader’s own counterparty. Lineage that depends on the actor’s own narration of itself is coordination-grade at best.

Third, non-repudiation under adversarial conditions. The substrate has to survive an actor inside the system attempting to forge, replay, modify, or rewrite history. Coordination-grade controls work when actors are honest; alignment-grade controls work when an actor is hostile, compromised, redirected by prompt injection, or operating outside its sanctioned scope. The substrate has to deliver the truth even when the system is misbehaving.

These three together — external attestation, independent lineage, non-repudiation under adversarial conditions — are what alignment-grade means in the strict sense. Anything less is coordination-grade, regardless of how sophisticated the implementation looks. An audit log produced by the system whose actions are being audited is not alignment-grade. A signature whose verification depends on the signer’s own claim of identity is not alignment-grade. A retrieval record produced by the same agent that performed the retrieval is not alignment-grade.

The four surfaces are the architectural locations where these three conditions have to hold simultaneously for an agentic system in clinical scope.

The four surfaces (SACS)

The substrate of an agentic deployment is not a single thing. It is four distinct surfaces, each with its own provenance and authorization questions, each requiring alignment-grade properties to be structurally enforceable rather than coordination-grade.

For brevity throughout this post and the rest of the series, Luminity refers to these four surfaces by their short names: Source, Action, Context, Scope — SACS. Each is defined formally here, then used in short form.

Source — Tool-Response Provenance

Every tool the agent invokes — every API call, every retrieval, every external service request, every database query — produces an artifact that re-enters the agent’s working context. The Source surface is the provenance and authorization layer for these tool responses. The structural questions are whether the response is attested by the system that produced it with an external signature that survives transport; whether the version, configuration, and freshness of the responding system are independently auditable; and whether the agent’s claim that it invoked tool X can verify against the tool’s record that it received the call from this specific session.

In current deployments, tool responses typically arrive as data with at most TLS attestation of transport — which establishes confidentiality between endpoints but not provenance of the artifact content across the agent’s working memory. The agent’s claim that “I called the lab system and it returned this value” is the only record of what was called. The lab system’s matching record, if it exists, is not bound to the artifact the agent then uses. Source provenance is almost universally coordination-grade.

Action — Action-Emission Provenance

The agent emits an action that has consequence — a prescription, a transfer, an order, an authorization, a referral. The Action surface is the provenance and authorization layer for what the agent actually emitted. The structural questions are whether the emitted action carries a lineage trace back to the authenticated source data, the sanctioned scope, and the accepted intermediate transformations — independently verifiable without depending on the agent’s reasoning narrative; whether the recipient of the action can verify it was emitted within scope, from authenticated context, by an authorized session — from artifact attestation alone, without trusting the agent’s own claim of correctness; and whether the action is bound cryptographically to the substrate that authorized it, such that replay, forgery, or post-hoc reconstruction is structurally detectable.

In current deployments, action emission is recorded in audit logs at the EHR or platform layer. Those logs answer what was emitted, when, under what session credential. They do not answer what substrate authorized the action versus what the agent claimed authorized it. That distinction is exactly the alignment-grade gap.

Context — Retrieval-Context Provenance

Content flows into the agent’s working memory from many sources: clinical notes the agent reads, guideline documents it retrieves, lab values it queries, patient-supplied messages, prior-session context, scratchpad memory. The Context surface is the provenance and authorization layer for everything that enters the agent’s working memory. The structural questions are whether each piece of context carries attestation of source, integrity, and authorization to enter the working memory of this specific agentic session; whether the substrate can distinguish authoritative content (a guideline from the official repository at the correct version) from content of unverified provenance (a note that arrived through a less-controlled channel); and whether, when content is adversarial — prompt injection embedded in a clinical note, a manipulated guideline retrieved from a poisoned source — the substrate detects and bounds the contamination, or the contamination propagates to the action without trace.

This is the surface where the adversarial argument from Post 1 lives. The Nature Communications and JAMA Network Open studies named there demonstrated that flagship models can be manipulated through realistic prompt-injection strategies, including in clinical scenarios. The substrate question is whether any current deployment makes context provenance auditable at alignment-grade. Almost universally, the answer is no.

Scope — Identity-vs-Action Provenance

The agent operates under a credential — usually the clinician’s session, sometimes a service account, sometimes a delegated identity. The Scope surface is the provenance and authorization layer for the relationship between who the agent is authorized as and what the agent actually did. The structural questions are whether there is independent attestation that the action emitted under this credential was within the scope the credential authorizes — separate from the agent’s claim that it was; whether the substrate can detect scope creep — actions the agent took that fall outside the sanctioned envelope of the session, even when those actions appear authorized at the credential layer; and whether the binding between credential and emitted action is cryptographic, or merely inferred from co-occurrence in an audit log produced by the system that emitted the action.

Role-based access control answers part of this question at the credential layer. RBAC does not answer the second question — whether the credential’s theoretical scope and the agent’s actual action are bound by attestation, or merely by assumption.

The pattern across the four

The Naming

Four surfaces. None of them belongs to the agent.

Each one names a place where the substrate has to attest a fact about the agent’s interaction with the world, externally, independently, and survivably under adversarial conditions. Each one names a place where current healthcare infrastructure typically does not provide that attestation at alignment-grade.

The four surfaces do not compose from existing pieces. Source attestation cannot be retrofitted by adding more logging. Action provenance cannot be retrofitted by adding more audit. Context provenance cannot be retrofitted by adding more filtering. Scope binding cannot be retrofitted by adding more RBAC. Each surface needs its own substrate work, and the substrate work is not what the existing infrastructure was designed to do. The next section is about why.

The standard exists. The substrate was not built.

The provenance gap is not new. It is not a thing agents introduced. It is a thing that has existed since the FHIR standard — the dominant U.S. healthcare data-exchange specification, mandated for interoperability under the 21st Century Cures Act since December 2022 — was first written.

The FHIR specification includes a Provenance resource. It has been part of the standard since the early drafts. It is normative in FHIR R5, the current published version. It is anchored in the W3C Provenance specification, in ISO/HL7 10781 EHR System Functional Model Release 2, and in ISO 21089 Trusted End-to-End Information Flows. The architects of FHIR contemplated provenance. They specified the resource. They documented its purpose in language that reads, in 2026, like a direct description of what alignment-grade attestation requires: “Provenance provides a critical foundation for assessing authenticity, enabling trust, and allowing reproducibility.”

The resource exists. The deployments did not populate it.

This is not interpretation. The standards body documented the gap in its own implementation guide. The US Core Implementation Guide — the version of FHIR mandated by ONC for U.S. healthcare under the Cures Act — explicitly states: “systems typically do not use the Provenance Resource to represent this information at an individual level (in other words, activities by the patient or provider).”

To work around the gap, US Core invented a parallel construct called “small-p provenance” — fragments of provenance metadata scattered across other resources, providing partial attribution at the data-element level. The same guide documents the decision that produced this workaround: the Argonaut community and HL7 security working group concluded that the most important provenance information was “the last organization making a meaningful clinical update to the data and the prior system providing it — the ‘last hop.'” They acknowledged the need for chain-of-custody attestation across the full lineage but, in their words, did not see it as “relevant to the immediate end-[user].”

That decision was reasonable for the world it was made in. In a human-paced clinical workflow, with strict role-based access, an audit trail at the organizational level, and a clinician of record on every action, last hop provenance answered the questions the workflow asked. Humans make a finite number of decisions per shift. Each one passes through the human’s judgment, the EHR’s RBAC, the chart’s session audit, the clinician’s professional record. The substrate did not have to attest each individual action because the human was the substrate. The gap was tolerable because the load on it was light.

Agents change the load. An agent in clinical scope can emit hundreds or thousands of artifact-touching actions per session — retrievals, tool calls, action emissions, context ingestion. Each one is exactly the kind of event the FHIR Provenance resource was specified to attest. None of them is attested in current deployments at alignment-grade, because the deployment pattern was built around the human-paced assumption that last hop provenance was sufficient.

The architects were not wrong. The deployment pattern was not unreasonable. The substrate was built for the world it was built in. The world changed.

The push that brought FHIR itself into U.S. healthcare came from a particular voice. John Halamka — longtime CIO at Beth Israel Deaconess, current President of Mayo Clinic Platform, and the visionary voice that pushed U.S. healthcare toward RESTful API thinking when no one else in the HL7 world would publicly challenge the orthodoxy — created the political and cultural momentum that turned the FHIR specification into a deployed standard. The API revolution had its visionary. The provenance revolution did not. In the years between the resource’s specification and today, the deployment pattern that populated FHIR’s API surface did not extend to populating the Provenance resource at the individual level. The standards body did its work. The push that would have made the field populate the resource at clinical-action granularity never came at the volume the field would have required.

That push has to come now. And the substrate work that would close the gap is, finally, in development — but it is happening downstream of deployment, not upstream of it. The HL7 Security workgroup is currently building an AI Transparency Implementation Guide, explicitly aimed at using FHIR Provenance to track what AI influenced what data — what model, what version, what prompt. John Moehrke, co-chair of the HL7 Security workgroup and a FHIR core team member, documented the project in September 2025. He noted that at the recent FHIR Connectathon, practitioners voiced that “Provenance is hard to work with.” The implementation guide is still in development. Agentic deployment is already happening at scale.

Halamka himself has named the workflow-side response. In an October 2025 interview, he counseled that agent orchestration in clinical settings should “prefer recommendations over actions” and that fully autonomous control should be avoided where harm is plausible. That is the right counsel for the substrate that currently exists. It is also a tacit acknowledgment that the substrate cannot currently support what fully autonomous agentic action would require.

The Structural Move, Named

The gap was not created by agents. The gap was widened by agents. The substrate that was good enough for human-paced workflow cannot bear the load of agent-paced workflow — not because the substrate got worse, but because the load changed, and because the substrate work that would address it is downstream of the deployment that requires it.

The substrate question, restated

Post 1 ended with the M&M conference asking the question it has always asked, and the substrate answering it from the artifacts rather than from the suspect’s own narrative. That was the structural promise. Post 2 makes the promise concrete.

In the world this series describes, the M&M does not ask what was the intent of the action. It asks four substrate-level questions. Source. Where did the artifacts the agent acted on come from, and what attests their integrity? Action. What did the agent emit, and what authorized the emission? Context. What entered the agent’s working memory in this session, and what attests that the content was authorized to enter? Scope. Was the action within the sanctioned envelope of the credential under which the agent operated, attested independently of the agent’s own claim?

These questions are answerable from substrate. They are not answerable from the agent’s reasoning trace. They are not answerable from coordination-grade controls layered on top of the existing infrastructure. They require alignment-grade provenance — external attestation, independent lineage, non-repudiation under adversarial conditions — at each of four surfaces.

Four surfaces. None of them belongs to the agent.

The substrate has to do the work. Right now it cannot.

The next post in this series turns to what happens when the regulatory frameworks that authorize clinical software are asked to evaluate systems built on this substrate — and what those frameworks discover when they reach the same gap from the other direction.

The Post 2 Claim

Alignment-grade provenance is a structural property of the substrate, not a feature of the agent. It exists at four architectural surfaces — Source, Action, Context, and Scope — and at each surface it requires three conditions simultaneously: external attestation, independent lineage, and non-repudiation under adversarial conditions. The FHIR standard contemplated provenance and specified the resource; the deployment pattern did not populate it at the individual level, because the human-paced workflow tolerated the gap. Agentic deployment changes the load by orders of magnitude and widens a gap that was already there. The standards body is now building the substrate that would close it. The deployment is widening it faster than the substrate work can close it. The substrate has to do the work. Right now it cannot.

The Regulatory Frameworks Come Next

Post 03 — The Witnesses Turn State’s Evidence — turns to what happens when the regulatory frameworks that authorize clinical software are asked to evaluate systems built on this substrate, and what those frameworks discover when they reach the same gap from the other direction.

The Provenance Gap  ·  A 3-Post Series
Post 02 · Now Reading Four Surfaces, No Witness
References & Sources

Share this:

Like this:

Like Loading…