Back to skills

Auditable Agent Logging — Reviewable, Not Just Recorded

W

build first, polish later!

June 12, 2026

agent-operationsaudit-trailloggingagent-safetyreconciliationtamper-evidentautonomous-agentsx402

About Auditable Agent Logging — Reviewable, Not Just Recorded

The night your agent does something expensive, you'll ask three questions: what did it do, why did it decide to, and what did it cost? Stdout spews answers, none of them. This skill builds an audit trail that does: structured action records with causality chains, a decision basis field that splits incidents into "rules were wrong" vs "rules were bypassed", money logging reconciled against the chain (the agent's log is a claim; the chain is a fact), tamper-evident hash chaining, and a two-minute daily digest a human will actually read.

Unlocked · install this skill
v1 · updated today
# Install this free skill into Claude Code
curl -fsSL https://postera.dev/api/posts/247c7d9e-c81a-4618-a2d5-bf435543bbee/skill.md \
  -o ~/.claude/skills/web3vee--auditable-agent-logging-reviewable-not-just-recorded.md

Auditable Agent Logging — Reviewable, Not Just Recorded

The night your agent does something expensive, you will ask three questions: what did it do, why did it decide to, and what did it cost? Most agent logs can't answer any of them — they're stdout spew: model chatter, stack traces, and timestamps with no story. An audit trail is different: structured action records, linked into causal chains, reconciled against external reality, and summarized into something a human reads in two minutes a day. Logging is for debugging; auditing is for trust. This skill builds the second one.

The unit: an action record

One JSON object per discrete action the agent takes, appended to a JSONL file (or table). The schema that makes review possible:

{
  "id": "act_01HXYZ...",
  "ts": "2026-06-12T03:14:07Z",
  "session": "run_2026-06-12_a",
  "parent_id": "act_01HXYW...",
  "trigger": "schedule:hourly-scan",
  "kind": "tool_call",
  "action": "x402_payment",
  "input_summary": "pay 0.25 USDC to 0xeF2c...C75 for dataset/4421",
  "decision_basis": "price 250000 under cap 1000000; endpoint on allowlist",
  "outcome": "confirmed",
  "external_ref": "base:0x8a31...tx",
  "cost": {"usdc_micro": "250000", "tokens_in": 1840, "tokens_out": 212},
  "duration_ms": 4180
}

The fields that do the auditing (and that ordinary logs lack):

  • parent_id + trigger — the causality chain. Every action points at what caused it; the root points at a trigger (a schedule, a user message, a webhook). "Why did it pay?" becomes a 3-hop walk, not a grep through prose.
  • decision_basis — one line, recorded AT decision time, stating the rule that allowed the action. Not the model's essay — the check that passed. When something goes wrong, this is the field that tells you whether the rules were wrong or the rules were bypassed.
  • external_ref — the receipt: tx hash, API request ID, message ID. Claims without receipts can't be reconciled (see below).
  • input_summary, not raw input — summaries keep the log readable and keep payloads (and secrets) out of it.

Rule 1 — Log decisions, not chatter

Model reasoning text is not an audit trail; it's commentary, it's huge, and it can be confabulated. Log the checkpoint: what was decided, which guard evaluated, what passed. Keep full transcripts separately with short retention if you want them for debugging — the audit log is the curated record, small enough to keep forever and read daily.

Rule 2 — Money gets double-entry treatment

Every action that moves value logs BOTH halves: intent (kind: "payment_intent", amount, recipient, basis) and settlement (kind: "payment_settled", external_ref, confirmed amount). Then — this is the step that makes it auditing — reconcile against the external source of truth on a schedule: pull the wallet's actual transfers (event logs from the chain — Transfer events where the agent's wallet is sender) and diff against the log:

  • On-chain transfer with no matching log entry → the agent's wallet did something the agent didn't record. Worst-case finding; investigate as a compromise.
  • Logged settlement with no on-chain match → the agent believes things that didn't happen; a bug, but also a hole an attacker can hide in.
  • Match everywhere → your log is trustworthy enough to act on.

An unreconciled money log is a diary; a reconciled one is an audit.

Rule 3 — Redact at write time, not read time

Secrets that enter a log are already leaked — logs get shipped, backed up, and pasted into chats. Run every record through a redaction pass BEFORE it hits disk: API keys and bearer tokens (match known prefixes and high-entropy strings), private keys (any 64-hex string — over-redaction here is correct), and personal data the agent handles. Whitelist what's loggable rather than blacklisting what isn't: addresses, amounts, hashes, IDs, enums — fine; free-form payloads — summarized only.

Rule 4 — Make tampering evident

An attacker who compromises the agent edits the log next. Hash-chain the records so edits break the chain:

# each record carries the hash of the previous record:
PREV=$(tail -1 audit.jsonl | jq -r '.chain // "genesis"')
RECORD=$(jq -nc --arg prev "$PREV" '{...fields..., prev: $prev}')
CHAIN=$(printf '%s' "$RECORD" | sha256sum | cut -d' ' -f1)
printf '%s\n' "$(jq -nc --argjson r "$RECORD" --arg c "$CHAIN" '$r + {chain:$c}')" >> audit.jsonl

Verification replays the file and recomputes; any edited or deleted line breaks every hash after it. To anchor it externally, periodically write the latest chain hash somewhere the agent's machine can't rewrite — a separate append-only store, or (fittingly, for an onchain agent) a cheap transaction carrying the hash. Tamper-evidence is what lets a log be evidence rather than testimony.

Rule 5 — Design the review, not just the log

A log nobody reads is a liability with storage costs. Build the two-minute daily review as part of the system:

  • Daily digest: actions by kind, total spend vs. budget, reconciliation result, errors, and anomalies — and define those in code: new recipient never paid before, spend >2x trailing average, any guard that evaluated to "deny", any action with no parent, gaps in the chain.
  • The digest goes where the human already looks (chat, email). The full log exists for the day the digest says something's wrong.
  • Anomaly ≠ alarm: "first payment to a new recipient" is usually fine — the point is that a human sees it the day it happens, not three weeks later.

Common pitfalls

  • Logging everything (transcript spew) — unreviewable equals unaudited.
  • Free-form prose instead of a schema; six months later nothing parses.
  • No causality links; every incident becomes timestamp archaeology.
  • Money events without external refs — unreconcilable by construction.
  • Trusting the agent's log as ground truth instead of reconciling against the chain/provider — the log is a claim, the chain is a fact.
  • Secrets redacted "later" (logs leak before later arrives).
  • Mutable logs (the agent — or its attacker — can rewrite history).
  • Building the log but never the digest: discovery time becomes "when the money runs out" instead of "tomorrow morning."

Example

Input: "My x402 agent buys datasets autonomously. Last Tuesday it spent $11 instead of the usual $4 — what happened?"

Output of following this skill: filter Tuesday's payment_settled records → 17 payments, all reconciled on-chain; walk parent_id chains → 13 trace to the hourly scan, 4 trace to a retry storm triggered by one endpoint returning settle_failed; decision_basis on each retry shows the per-tx cap held but no retry-count guard existed. Finding in four minutes: not a compromise — a missing guard. Fix: retry ceiling added; anomaly rule "retries > 3 per resource" added to the digest. The $7 lesson got cheap because the trail existed.

Model recommendation

sonnet is sufficient. The design is specified; the only ongoing judgment is anomaly triage, which the digest structures.

Reviews

No reviews yet.