Hooks
A hook is a single Markdown file that subscribes to a lifecycle event and returns an
allow/block/modify decision in deterministic Starlark.
If tools are what the agent can do, hooks are what the harness watches and rules on. They are the policy and observability plane of AI Harness — the layer where deny-lists, audits, redactions, retries, and rate limits live, all expressed as code that diffs cleanly in a pull request.
What a hook is
A hook artifact has three jobs:
- Subscribe to a lifecycle event: tool.pre, tool.post, turn.start, or turn.end.
- Inspect the event payload: the tool name and arguments, the model response, the structured result.
- Return a decision: allow(), block(reason), or modify(payload).
Hooks are loaded from .harness/hooks/*.md. Unlike tools, they are not
addressable by the model. The model never sees a hook by name; it only ever
sees the consequences (a tool call rejected, a result redacted, a turn
re-prompted).
That is the point. A hook is a piece of harness policy that the agent cannot route around.
Anatomy of a hook
A complete, real hook from the governed-agent example:
---
event: tool.pre
priority: 10
when: payload["name"] == "run_command"
script: |
  def handle(event, payload):
      cmd = payload.get("args", {}).get("command", "")
      dangerous = [
          "rm -rf /",
          ":(){ :|:& };:",
          "mkfs",
          "dd if=",
          "shutdown",
      ]
      for d in dangerous:
          if d in cmd:
              metrics.incr("audit.policy.deny")
              return block("dangerous command pattern blocked: '" + d + "'")
      return allow()
---
# command_guard
Hard-blocks well-known destructive shell patterns. Pair with the systemd
unit (`deploy/systemd/harness.service`) for real isolation.
Four things to notice:
- event: is the subscription. This hook is considered on every tool.pre event the harness fires.
- when: is a static gate. It is a Starlark expression evaluated against the payload before handle is called; it lets a hook scope itself to a single tool, model, or turn shape without paying the cost of running its body.
- priority: resolves order. Lower numbers run first. The audit hook ships at priority 1 so it sees every call, even ones that a later policy hook (one with a larger priority number) will go on to block.
- The decision shape is allow/block/modify. That three-way choice is the entire contract between hook and harness.
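The handler logic can be exercised outside the harness by supplying stand-ins for the built-ins. What follows is a minimal Python sketch, not harness code: allow, block, and the metrics object are hypothetical stubs, and the dict shape they return is an assumption. Only the body of handle is taken from the hook above.

```python
# Hypothetical stand-ins for the harness built-ins. The real decision
# objects and metrics sink come from the harness, not from this file.
def allow():
    return {"decision": "allow"}


def block(reason):
    return {"decision": "block", "reason": reason}


class _Metrics:
    def __init__(self):
        self.counters = {}

    def incr(self, name):
        self.counters[name] = self.counters.get(name, 0) + 1


metrics = _Metrics()


# Body copied from command_guard; it is valid Python as well as Starlark.
def handle(event, payload):
    cmd = payload.get("args", {}).get("command", "")
    dangerous = [
        "rm -rf /",
        ":(){ :|:& };:",
        "mkfs",
        "dd if=",
        "shutdown",
    ]
    for d in dangerous:
        if d in cmd:
            metrics.incr("audit.policy.deny")
            return block("dangerous command pattern blocked: '" + d + "'")
    return allow()


safe = handle("tool.pre", {"name": "run_command", "args": {"command": "ls -la"}})
denied = handle(
    "tool.pre",
    {"name": "run_command", "args": {"command": "sudo mkfs.ext4 /dev/sda1"}},
)
```

The check is plain substring matching, so a variant such as `rm  -rf /` with two spaces passes. That is one reason the hook's own description recommends pairing it with OS-level isolation.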
The four lifecycle events
The hook event catalog is intentionally small. Every event surface is a deterministic place where the harness needs a yes/no/rewrite answer.
| Event | When it fires | Typical use |
|---|---|---|
| turn.start | Before the model is called for a new turn | Inject context, reject empty turns, stamp a trace ID |
| tool.pre | After argument validation, before run(args) | Deny-list tools, scrub args, enforce path/network policy |
| tool.post | After run(args) returns | Redact results, truncate, attach metrics, append audit lines |
| turn.end | After the model produces its turn output | Emit summaries, write transcripts, fire OTel spans |
Each event is dispatched by the runtime unconditionally — there is no fast path that skips hooks. That uniformity is what lets a single hook enforce a policy across every tool the agent will ever call, including ones added later or generated by a self-augmenting workflow.
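A rough picture of where the four events sit in a turn, as a Python sketch. The fire function, the call_model and run_tool callables, and every payload key are hypothetical; only the event names and their relative ordering come from the table above. A real turn may loop between the model and tools several times, which this sketch leaves out.

```python
def run_turn(user_input, call_model, run_tool, fire):
    """Drive one simplified turn, firing each lifecycle event in order.

    `fire(event, payload)` stands in for the harness hook dispatcher. In this
    sketch it returns the (possibly rewritten) payload, or None on a block.
    """
    turn = fire("turn.start", {"input": user_input})
    if turn is None:
        return {"error": "turn blocked"}

    response = call_model(turn["input"])

    results = []
    for call in response.get("tool_calls", []):
        pre = fire("tool.pre", {"name": call["name"], "args": call["args"]})
        if pre is None:
            # A blocked call never reaches run_tool.
            results.append({"name": call["name"], "error": "blocked"})
            continue
        output = run_tool(pre["name"], pre["args"])
        post = fire("tool.post", {"name": pre["name"], "result": output})
        if post is None:
            post = {"name": pre["name"], "error": "blocked"}
        results.append(post)

    end = fire("turn.end", {"output": response.get("text", ""), "results": results})
    return end if end is not None else {"error": "turn blocked"}


# Usage: a recording dispatcher that allows everything makes the order visible.
seen = []


def record(event, payload):
    seen.append(event)
    return payload


out = run_turn(
    "list files",
    call_model=lambda text: {
        "text": "done",
        "tool_calls": [{"name": "ls", "args": {"path": "."}}],
    },
    run_tool=lambda name, args: "a.txt",
    fire=record,
)
```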
Coming primitives. The event catalog is designed to grow. Two events in active spec, delegation.post_verify (#103) and agent.stop (#104), extend the same allow/block/modify model to sub-agent verification and loop-exit decisions. The hook contract you learn here is the contract you keep.
The decision model
Every hook handler ends in one of three calls:
allow() # pass through unchanged
block("reason for the agent") # reject; reason is surfaced as an error
modify({"args": new_args}) # rewrite the payload, then continue
The harness composes decisions across hooks deterministically:
- Hooks for an event run in priority order (low to high).
- The first block wins: the chain short-circuits and the rest are skipped.
- modify rewrites the payload in place for downstream hooks and the underlying operation.
- allow is a no-op pass.
There is no "after-the-fact override" and no implicit rule that lets a
later hook silently undo an earlier block. The order is the rule.
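The composition rules can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the harness implementation: decisions are modeled as (kind, value) tuples, hooks as (priority, name, handler) triples, and modify is assumed to shallow-merge its argument into the payload. How the real harness breaks priority ties is not specified here; the sketch keeps declaration order.

```python
# Hypothetical decision constructors, modeled as (kind, value) tuples.
def allow():
    return ("allow", None)


def block(reason):
    return ("block", reason)


def modify(changes):
    return ("modify", changes)


def compose(hooks, event, payload):
    """Run hooks in ascending priority and fold their decisions together."""
    ran = []
    current = dict(payload)
    for _priority, name, handler in sorted(hooks, key=lambda h: h[0]):
        ran.append(name)
        kind, value = handler(event, current)
        if kind == "block":
            # First block wins: later hooks are never called.
            return {"decision": "block", "reason": value, "payload": current, "ran": ran}
        if kind == "modify":
            # Assumption: modify shallow-merges into the payload.
            current.update(value)
    return {"decision": "allow", "payload": current, "ran": ran}


hooks = [
    (20, "late", lambda e, p: allow()),
    (10, "guard", lambda e, p: block("no sudo") if "sudo" in p["args"]["command"] else allow()),
    (5, "scrub", lambda e, p: modify({"args": {"command": p["args"]["command"].strip()}})),
]

blocked = compose(hooks, "tool.pre", {"name": "run_command", "args": {"command": "  sudo reboot "}})
passed = compose(hooks, "tool.pre", {"name": "run_command", "args": {"command": " ls "}})
```

In the blocked case, scrub runs first and its rewrite is what guard inspects; late never runs. In the passed case all three hooks run and the operation receives the trimmed command.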
The Starlark sandbox (and what hooks get extra)
Hook scripts run in the same Starlark dialect as tools — no I/O at the language level, no imports, no mutable globals. Hooks pick up a small set of additional built-ins shaped around their job:
| Built-in | Purpose |
|---|---|
| allow() / block() / modify() | Decision constructors |
| metrics.incr / metrics.set | Counters and gauges visible to metrics.snapshot() |
| log / log.info / log.warn | Structured logs that flow into turn.end payloads |
| cache.get / cache.set | Per-run KV cache shared with tools |
| http.get / http.post | Outbound HTTP, gated by the network allowlist |
Hooks deliberately do not receive exec.run or fs.write. Policy
code that can shell out is policy code an attacker can pivot through. If a
hook genuinely needs to mutate state (rare), do it through a named tool
the hook calls explicitly — the call goes back through the lifecycle and
inherits all the same audit guarantees.
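As an illustration of a hook that needs nothing beyond the decision constructors and metrics.incr, here is a sketch of a tool.post redaction handler, written in Python with stubbed built-ins. The stubs, the "result" payload key, the metric name, and the idea that modify accepts {"result": ...} on tool.post are all assumptions. The handler body sticks to string methods because the sandbox has no imports.

```python
# Hypothetical stand-ins for the harness built-ins.
def allow():
    return {"decision": "allow"}


def modify(changes):
    return {"decision": "modify", "payload": changes}


class _Metrics:
    def __init__(self):
        self.counters = {}

    def incr(self, name):
        self.counters[name] = self.counters.get(name, 0) + 1


metrics = _Metrics()

SECRET_MARKERS = ["API_KEY=", "PASSWORD="]  # hypothetical patterns


def handle(event, payload):
    # Assumption: tool.post payloads carry the tool output as a string in "result".
    result = payload.get("result", "")
    out = []
    redacted = 0
    for line in result.split("\n"):
        if any([m in line for m in SECRET_MARKERS]):
            # Keep the key name, drop everything after the first "=".
            out.append(line.split("=")[0] + "=[REDACTED]")
            redacted += 1
        else:
            out.append(line)
    if redacted == 0:
        return allow()
    metrics.incr("redact.hits")
    return modify({"result": "\n".join(out)})


clean = handle("tool.post", {"name": "read_file", "result": "HOST=db\nPORT=5432"})
scrubbed = handle(
    "tool.post",
    {"name": "read_file", "result": "HOST=db\nAPI_KEY=abc123\nPORT=5432"},
)
```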
Why hooks are files, not callbacks
A hook could in principle be a Go interface registered in init(). We
reject that for the default path for the same reasons as
tools:
- Reviewable. A diff like + .harness/hooks/command_guard.md shows the entire policy (subscription, scope, priority, decision logic) in one file.
- Composable. Layer policies by adding files; remove them with git rm. There is no central registration table to keep in sync.
- Portable. Hook artifacts move between repos, teams, and harness versions without a code change.
- Governed. Because hooks are Markdown, other hooks can read them. A meta-policy hook can enforce that every new tool ships with a matching audit hook, or that no hook in .harness/hooks/ lacks a priority:.
The Go-level hook API still exists for cases that genuinely need it (performance-critical paths, native integrations). It is the escape hatch, not the default.
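The "no hook lacks a priority:" rule from the Governed point is simple enough to sketch. The Python below shows the check itself as a standalone script, for example something a CI job could run; how a meta-policy hook would read other hook files from inside the sandbox is not covered here. The function names are hypothetical, and the parser only understands the simple frontmatter shape shown in the command_guard example, not YAML in general.

```python
import tempfile
from pathlib import Path


def frontmatter_keys(text):
    """Return the top-level frontmatter keys of a hook artifact."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return set()
    keys = set()
    for line in lines[1:]:
        if line.strip() == "---":
            break
        # Top-level keys start in column 0; indented lines belong to `script: |`.
        if line and not line[0].isspace() and ":" in line:
            keys.add(line.split(":", 1)[0])
    return keys


def hooks_missing_priority(hook_dir):
    """List hook files in hook_dir whose frontmatter has no priority key."""
    return sorted(
        p.name
        for p in Path(hook_dir).glob("*.md")
        if "priority" not in frontmatter_keys(p.read_text(encoding="utf-8"))
    )


# Usage against two throwaway hook files.
GOOD = "---\nevent: tool.pre\npriority: 10\nscript: |\n  def handle(event, payload):\n      return allow()\n---\n# good\n"
BAD = "---\nevent: tool.post\nscript: |\n  def handle(event, payload):\n      return allow()\n---\n# bad\n"

with tempfile.TemporaryDirectory() as d:
    Path(d, "good.md").write_text(GOOD, encoding="utf-8")
    Path(d, "bad.md").write_text(BAD, encoding="utf-8")
    missing = hooks_missing_priority(d)
```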
Composing hooks: the policy stack
Real harnesses don't have a hook — they have a stack of hooks for
each event. The governed-agent example ships seven, every one of which
is independently reviewable:
.harness/hooks/
├── audit_tool_pre.md # priority 1 — count + log every call
├── audit_tool_post.md # priority 1 — count + log every result
├── command_guard.md # priority 10 — deny dangerous shell patterns
├── path_guard.md # priority 10 — jail filesystem writes
├── prefer_named_tools.md # priority 20 — reject raw exec.run
├── meta_tool_guard.md # priority 30 — block tools editing .harness/
└── completion_window_guard.md # priority 40 — cap output size per turn
Reading top to bottom, the policy reads like English: "audit everything,
deny dangerous commands, jail the filesystem, only let the agent use
named tools, don't let it edit the harness itself, cap completion size."
Each line is a file. Each file is a Markdown artifact of roughly 30 lines. The whole
governance posture is a git log.
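Only command_guard is reproduced on this page. To give a feel for what a "jail filesystem writes" rule involves, here is a hypothetical handler in Python. It is not the shipped path_guard: the workspace root, the "path" argument name, and the stubbed built-ins are all assumptions, and posixpath stands in for whatever path normalization the real hook performs, since the sandbox has no imports.

```python
import posixpath

WORKSPACE = "/workspace"  # hypothetical jail root


# Hypothetical stand-ins for the harness built-ins.
def allow():
    return {"decision": "allow"}


def block(reason):
    return {"decision": "block", "reason": reason}


def handle(event, payload):
    path = payload.get("args", {}).get("path", "")
    # Resolve relative paths against the jail root, then collapse "." and "..".
    # An absolute path replaces the root entirely, which is what we want here.
    resolved = posixpath.normpath(posixpath.join(WORKSPACE, path))
    # Compare on a path boundary so "/workspace-evil" does not pass.
    if resolved == WORKSPACE or resolved.startswith(WORKSPACE + "/"):
        return allow()
    return block("write outside workspace: '" + resolved + "'")


inside = handle("tool.pre", {"name": "write_file", "args": {"path": "notes/a.txt"}})
escaped = handle("tool.pre", {"name": "write_file", "args": {"path": "../etc/passwd"}})
sibling = handle("tool.pre", {"name": "write_file", "args": {"path": "/workspace-evil/x"}})
```

This is purely lexical: it does not follow symlinks, so a link inside the workspace that points elsewhere would still pass.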
Hooks versus middleware versus interceptors
People coming from Express, Rails, or gRPC ask where the line is. In AI Harness:
- Hooks subscribe to agent lifecycle events, not HTTP requests. They see semantic payloads (tool name, arguments, results), not bytes.
- Hooks return a decision, not a continuation. There is no next() call; the harness owns the chain and runs it deterministically.
- Hooks are governed alongside tools. They live next to the capabilities they regulate, version with them, and ship with them.
That last point is the one that matters most. In a typical service, the
middleware stack and the handler stack live in different repos, owned by
different teams, deployed on different cadences. In AI Harness they live
in the same .harness/ directory and ship in the same pull request. You
cannot land a tool without landing the policy that guards it.
Hook execution lifecycle
For any event, the harness runs this sequence:
- Filter. Evaluate each hook's when: expression against the payload; drop the ones that don't match.
- Sort. Order surviving hooks by priority ascending.
- Dispatch. Call handle(event, payload) for each in order.
- Compose. Apply modify rewrites in place; short-circuit on the first block; treat allow as pass-through.
- Return. Hand the final decision and (possibly modified) payload back to the caller: the tool dispatcher, the turn loop, or whichever subsystem fired the event.
This pipeline is identical for every event. There is no privileged hook, no built-in policy that runs outside the chain, and no way for a tool or sub-agent to bypass it.
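The Filter and Sort steps decide which handlers run and in what order; Dispatch and Compose then follow the decision model described earlier. A minimal Python sketch of those first two steps follows. The hook records are hypothetical dicts mirroring the frontmatter fields, Python's eval stands in for the Starlark evaluator (it is not a sandbox), a hook without when: is assumed to match every payload, and the path_guard expression is invented for illustration. Only command_guard's when: comes from this page.

```python
def matches(when, payload):
    """Evaluate a when: expression against the payload."""
    if when is None:
        # Assumption: a hook with no when: gate matches every payload.
        return True
    # Stand-in for the Starlark evaluator. Python eval is NOT a sandbox.
    return bool(eval(when, {"__builtins__": {}}, {"payload": payload}))


def plan(hooks, event, payload):
    """Return the names of the hooks that would run, in dispatch order."""
    # Filter: keep hooks subscribed to this event whose when: gate passes.
    survivors = [
        h for h in hooks
        if h["event"] == event and matches(h.get("when"), payload)
    ]
    # Sort: ascending priority, so lower numbers run first.
    return [h["name"] for h in sorted(survivors, key=lambda h: h["priority"])]


hooks = [
    {"name": "command_guard", "event": "tool.pre", "priority": 10,
     "when": 'payload["name"] == "run_command"'},
    {"name": "path_guard", "event": "tool.pre", "priority": 10,
     "when": 'payload["name"] == "write_file"'},  # invented expression
    {"name": "audit_tool_pre", "event": "tool.pre", "priority": 1},
    {"name": "audit_tool_post", "event": "tool.post", "priority": 1},
]

shell_plan = plan(hooks, "tool.pre", {"name": "run_command", "args": {"command": "ls"}})
write_plan = plan(hooks, "tool.pre", {"name": "write_file", "args": {"path": "a.txt"}})
post_plan = plan(hooks, "tool.post", {"name": "run_command", "result": ""})
```

In each plan the audit hook comes first, which is how it gets to record calls that a later guard goes on to block.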
What to read next
- Delegation: how sub-agents inherit the same hook surface, and how the upcoming delegation.post_verify and agent.stop events extend it.
- Governance & Policy: patterns for stacking hooks, allowlists, and tool wrappers into a deployable agent.
- Guide: Writing a Hook walks through a hook from blank file to production review.
- Reference: the full Hook Artifact Schema documents every supported frontmatter field and built-in.