The evidence graph: watching the agent show its receipts

Under each agent answer, a live crawl plays the agent reading the real context graph, then collapses to an inspectable graph cross-linked to the prose.
- trust
- provenance
- evidence graph
TL;DR
The deepest trust problem is not whether the agent will do the wrong thing; it is whether you can believe what it just told you.
Under each agent answer, a live crawl plays the agent reading the real context graph, then collapses to an inspectable graph whose nodes are cross-linked to the answer's prose via interactive citations.
The problem
Fluent answers are not the same as correct answers, and acting on a confident but wrong claim in infra is expensive.
Users had no way to interrogate why the agent said what it said, so they trusted blindly or not at all.
Process
I mapped how engineers trust a colleague's claim: they ask how you know, and expect a chain back to a source. That is the whole idea here: trust is provenance made watchable and traversable.
A static citation list was not enough, because infra reasoning is relational and temporal, so the evidence should show the agent moving through it rather than list its sources.
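A minimal sketch of that idea, assuming the crawl is recorded as a time-ordered event log. The `CrawlEvent`, `EvidenceGraph`, and `collapseCrawl` names and the sample node ids are hypothetical, chosen for illustration rather than taken from the shipped code. Replaying the events in order gives the live crawl; folding them gives the graph that remains after it collapses.

```typescript
// Hypothetical event shapes: a node is either read (optionally reached
// from another node) or discarded later in the crawl.
type CrawlEvent =
  | { kind: "read"; nodeId: string; from?: string; atMs: number }
  | { kind: "discard"; nodeId: string; atMs: number };

interface EvidenceGraph {
  nodes: string[];
  edges: Array<{ source: string; target: string }>;
}

// Fold the event log into the final graph: keep every node that was read
// and not discarded afterwards, plus the edges between kept nodes.
function collapseCrawl(events: CrawlEvent[]): EvidenceGraph {
  const ordered = events.slice().sort((a, b) => a.atMs - b.atMs);
  const kept = new Set<string>();
  const edges: Array<{ source: string; target: string }> = [];

  for (const e of ordered) {
    if (e.kind === "read") {
      kept.add(e.nodeId);
      if (e.from !== undefined) {
        edges.push({ source: e.from, target: e.nodeId });
      }
    } else {
      kept.delete(e.nodeId);
    }
  }

  return {
    nodes: Array.from(kept),
    edges: edges.filter((x) => kept.has(x.source) && kept.has(x.target)),
  };
}

// Illustrative data only: the runbook is read, then discarded.
const sampleEvents: CrawlEvent[] = [
  { kind: "read", nodeId: "alert-1", atMs: 0 },
  { kind: "read", nodeId: "deploy-42", from: "alert-1", atMs: 10 },
  { kind: "read", nodeId: "runbook-7", from: "alert-1", atMs: 20 },
  { kind: "discard", nodeId: "runbook-7", atMs: 30 },
];
const sampleGraph = collapseCrawl(sampleEvents);
```

Keeping the timestamps is what separates this from a citation list: the same log can drive the animation and produce the static graph, so the two cannot drift apart.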
I also prototyped and rejected a trust ledger sidebar; it tested as clutter that nobody opened twice, which is how I learned that provenance had to live inside the answer, not beside it.
The solution
- A live crawl (searching the context graph) where nodes light up as they are read and discarded ones fall away, collapsing to an inspectable React-Flow graph.
- Node drill-down to the exact provenance record.
- Clickable citations that light the matching node.
- The whole answer is built for trust: streamed reasoning, typed value badges (red for a cost anomaly, green for a saving), and a guarded rollback that asks before acting.
- A sanitizer plus a leak-guard test ensure no raw query ever reaches the browser, so designing the trust feature also meant designing what the user is not allowed to see.
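The sanitizer and leak guard in the last bullet can be sketched as an allow-list plus a check on the serialized payload. This is an assumption-heavy illustration: the `ProvenanceRecord` and `BrowserNode` shapes, the field names, and both function names are hypothetical, and the real implementation may work differently.

```typescript
// Hypothetical server-side record. rawQuery must never reach the browser.
interface ProvenanceRecord {
  id: string;
  label: string;
  source: string;
  observedAt: string;
  rawQuery: string;
}

// Hypothetical browser-safe shape: only allow-listed fields.
interface BrowserNode {
  id: string;
  label: string;
  source: string;
  observedAt: string;
}

// Sanitizer: copy allow-listed fields explicitly rather than deleting
// forbidden ones, so a field added later is hidden by default.
function sanitize(record: ProvenanceRecord): BrowserNode {
  return {
    id: record.id,
    label: record.label,
    source: record.source,
    observedAt: record.observedAt,
  };
}

// Leak guard: returns true if the serialized payload contains the text
// of any raw query. The needle is JSON-escaped so quotes and backslashes
// match the form they take on the wire.
function leaksRawQuery(payload: unknown, records: ProvenanceRecord[]): boolean {
  const wire = JSON.stringify(payload);
  return records.some((r) => {
    if (r.rawQuery.length === 0) return false;
    const needle = JSON.stringify(r.rawQuery).slice(1, -1);
    return wire.indexOf(needle) !== -1;
  });
}

// Illustrative record only.
const sampleRecord: ProvenanceRecord = {
  id: "n1",
  label: "Cost anomaly",
  source: "billing",
  observedAt: "2024-01-01T00:00:00Z",
  rawQuery: "SELECT cost FROM billing WHERE team = \"infra\"",
};
const safePayload = [sanitize(sampleRecord)];
```

A test in this style would assert that `leaksRawQuery(safePayload, [sampleRecord])` is false for every payload sent to the browser, and that the guard itself fires when given an unsanitized record.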
Impact
Shipped internally, behind a feature flag, and still awaiting review, so there are no adoption numbers yet.
In every demo it addresses the single hardest objection: that the answer has to be taken on faith rather than checked.
Reflection
Reasoning-state explains intent, the approval gate controls action, and the evidence graph explains grounding.
Grounding is where infra trust actually lives.