Claude Code Hooks: Make "Done" Mean Tests Passed, Not Agent Stopped

An agent saying "done" is not the same as a change being shippable. Claude Code hooks give you a deterministic layer around the agent loop: run checks after edits, and block final completion when the repository is not green.

This article shows a practical pattern using PostToolUse and Stop hooks so "done" means "the configured checks passed."

The short version

  • PostToolUse hooks are useful for cheap, local feedback after Edit, Write, or related file-changing tools.
  • Stop hooks run when Claude believes it has finished and are the right place for final gates.
  • Hooks run shell commands with your user permissions, so treat them like production scripts.
  • Keep fast checks in PostToolUse; keep expensive integration suites for Stop or CI.
  • Do not rely on the model to remember tests. Put the rule in hooks.

The failure mode

Most coding agents are optimized to keep moving. They can edit files, explain the change, and stop before running the checks you would have run manually. Even when prompted to test, the model may skip tests after a small change, run the wrong package's tests, or ignore a failing command because the output is long.

Hooks fix a different layer of the problem. They do not ask the model to be more disciplined. They execute when lifecycle events happen.

In Claude Code, hooks can be configured in settings files such as project-level .claude/settings.json or user-level ~/.claude/settings.json. They receive JSON on stdin and can run commands. Events include PreToolUse, PostToolUse, Notification, Stop, and others.

Use PostToolUse for tight feedback

PostToolUse is best for checks that should happen immediately after a file modification: formatting, type-aware lint for the touched package, or a focused unit test if the mapping is cheap.

A minimal project hook might look like this:

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write|MultiEdit",
        "hooks": [
          {
            "type": "command",
            "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/after-edit.sh"
          }
        ]
      }
    ]
  }
}

Then keep the script boring and defensive:

#!/usr/bin/env bash
set -euo pipefail

cd "${CLAUDE_PROJECT_DIR:?}"

# Cheap checks only. Leave the full suite for Stop or CI.
npm run format:check
npm run lint -- --max-warnings=0

If your repository is large, avoid running the full monorepo suite on every edit. Instead, inspect the hook JSON, identify the changed file, and map it to the nearest package. Be conservative: if the mapping is unclear, print guidance and let the final gate catch it.

Use Stop as the final gate

The Stop event fires when Claude is about to finish a response. That makes it the right place to prevent "I'm done" when checks are failing.

Example:

{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/final-check.sh"
          }
        ]
      }
    ]
  }
}

And the gate:

#!/usr/bin/env bash
set -euo pipefail
cd "${CLAUDE_PROJECT_DIR:?}"

npm test -- --runInBand
npm run typecheck

The exact blocking behavior depends on the hook event and Claude Code's current hook semantics, so verify it in your installed version. The important design principle is stable: final checks should be deterministic and outside the model's discretion.

Feed failures back as actionable text

A failing hook should produce output the agent can use. Avoid dumping 5,000 lines of logs. Capture the command, exit code, and the relevant failure lines.

A wrapper helps:

#!/usr/bin/env bash
set -euo pipefail

run() {
  echo "==> $*"
  tmp=$(mktemp)
  if ! "$@" >"$tmp" 2>&1; then
    echo "FAILED: $*" >&2
    tail -n 120 "$tmp" >&2
    exit 2
  fi
}

cd "${CLAUDE_PROJECT_DIR:?}"
run npm test -- --runInBand
run npm run typecheck

Prefer a non-zero exit that Claude Code treats as a blocker for the relevant event. Test that behavior locally after upgrades.

Security rules for hooks

Hooks run with your user permissions. That is powerful and dangerous.

Follow these rules:

  • use absolute paths or $CLAUDE_PROJECT_DIR;
  • quote every variable;
  • never eval hook input;
  • validate file paths before passing them to tools;
  • avoid network calls unless the hook explicitly needs them;
  • keep project hooks reviewed like application code;
  • do not copy hook scripts from untrusted repos.

If an organization uses managed hooks, prefer that for non-negotiable controls such as secret scanning or release gates.

Gotchas

A hook that is too slow will train developers to disable hooks. Keep the per-edit path fast.

A hook that changes files can create confusing loops. If you auto-format after edits, make sure the agent sees the resulting diff and does not fight the formatter.

Hooks do not replace CI. Local checks are a fast gate; CI still verifies clean checkout behavior, matrix builds, and protected-branch requirements.

We're building Grass around the same principle: an agent run should come back as reviewable work, not just a stopped process. Grass runs agents in isolated sandboxes and keeps the handoff focused on what changed, what passed, and what still needs a human decision. You can try it at https://codeongrass.com.

Conclusion

Claude Code hooks are the right place to encode "done means tested." Use PostToolUse for immediate feedback after edits and Stop for final verification. Keep scripts small, secure, and deterministic, and let CI remain the outer quality gate.

Sources

  • Claude Code hooks documentation and hooks guide
  • Anthropic guidance that hooks execute commands with user permissions