Guards

Three guardrails check what the agent writes to disk and what it sends to your browser. They are always on and cannot be disabled.

Write guard

services/write_guard.py. Validates every file write before it touches disk.

Rejects:

  • Paths with .. (path traversal).
  • Absolute paths.
  • Paths outside the workspace root.
  • Files larger than 1 MB.
  • Binary files (detected by null-byte scan).
  • Dangerous patterns (e.g., overwriting .git/config, or overwriting requirements.txt without confirmation in some modes).

When a write is rejected, the receipt records guards.write: "fail" and the diff is dropped. The agent is informed and may retry with a corrected path.
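
The path, size, and binary checks listed above can be sketched roughly as follows. This is an illustration only, not the contents of services/write_guard.py: the function name check_write, the MAX_BYTES constant, the exact byte count behind "1 MB", and the reason strings are all assumptions. It needs Python 3.9 or later for Path.is_relative_to, and it omits the dangerous-pattern rule.

```python
from pathlib import Path, PurePosixPath
from typing import Optional

# Assumption: "1 MB" is taken to mean 1024 * 1024 bytes.
MAX_BYTES = 1024 * 1024


def check_write(workspace_root: Path, rel_path: str, content: bytes) -> Optional[str]:
    """Return a rejection reason, or None if the write passes these checks.

    Hypothetical helper; the real guard's interface is not shown in the docs.
    """
    pure = PurePosixPath(rel_path)
    if pure.is_absolute():
        return "absolute path"
    if ".." in pure.parts:
        return "path traversal"

    # Resolve symlinks so a link that points outside the workspace is caught.
    root = workspace_root.resolve()
    target = (root / rel_path).resolve()
    if not target.is_relative_to(root):
        return "outside workspace root"

    if len(content) > MAX_BYTES:
        return "file larger than 1 MB"
    if b"\x00" in content:
        # Null-byte scan: a null byte is treated as a sign of binary content.
        return "binary content"
    return None
```

A caller would treat any non-None result as a rejection, drop the diff, and record guards.write: "fail" in the receipt.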

Hallucination guard

services/hallucination_guard.py. Cross-references every function/class reference in the generated code against the workspace's symbol index.

A reference like from app.utils import foo triggers a lookup: does foo exist in app.utils? If not:

  • The receipt records guards.hallucination: "warn" plus a list of missing references.
  • The change is not blocked — the agent may have legitimately created the symbol elsewhere in the same diff.
  • A warning chip appears in the Changes panel for human review.
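
The lookup described above could look something like this minimal sketch, which handles only from X import Y statements. The function name missing_import_refs and the shape of the symbol index (module name mapped to a set of symbol names) are assumptions made for illustration. The real guard also covers function and class references and accounts for symbols created in the same diff, which this sketch does not attempt.

```python
import ast


def missing_import_refs(source: str, symbol_index: dict[str, set[str]]) -> list[str]:
    """List 'module.name' references from `from X import Y` that the index lacks.

    Hypothetical helper; symbol_index maps a module name to its known symbols.
    """
    missing = []
    for node in ast.walk(ast.parse(source)):
        # Only absolute `from module import name` statements are checked here.
        if isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            known = symbol_index.get(node.module)
            if known is None:
                # Assumption: modules absent from the index (e.g. third-party
                # packages) are skipped rather than flagged.
                continue
            for alias in node.names:
                if alias.name != "*" and alias.name not in known:
                    missing.append(f"{node.module}.{alias.name}")
    return missing
```

A non-empty result would correspond to guards.hallucination: "warn" in the receipt, with the returned list as the missing references.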

Future work: surface the warning more aggressively in the editor gutter (Appendix E1).

Egress guard

services/egress_guard.py. Strips sensitive content from agent frames before they reach your browser.

Patterns redacted:

  • Live-looking API keys (sk-…, xoxb-…, AWS access keys, Google service-account JSON, etc.).
  • PEM private-key blocks.
  • Database connection strings with embedded passwords.
  • Paths outside the workspace root (prevents the agent from accidentally exposing your home directory in chat).

Redacted content is replaced inline with a placeholder like [REDACTED:api-key]. The receipt records the count of redactions, not the redacted values.
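
As a rough sketch of inline redaction with a count, the snippet below applies a small list of regex rules. The patterns, the name EXAMPLE_RULES, the (label, pattern) tuple shape, and the private-key label are illustrative assumptions; they are not the rules shipped in services/egress_guard.py and are far less thorough than a production rule set would need to be.

```python
import re

# Hypothetical rules: (placeholder label, compiled pattern).
EXAMPLE_RULES = [
    ("api-key", re.compile(r"\bsk-[A-Za-z0-9]{20,}\b")),
    ("api-key", re.compile(r"\bxoxb-[A-Za-z0-9-]{10,}")),
    ("api-key", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    (
        "private-key",
        re.compile(
            r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
            re.DOTALL,
        ),
    ),
]


def redact(text: str) -> tuple[str, int]:
    """Replace matches with [REDACTED:<label>] and return (text, match count)."""
    total = 0
    for label, pattern in EXAMPLE_RULES:
        text, n = pattern.subn(f"[REDACTED:{label}]", text)
        total += n
    return text, total
```

Only the count is returned alongside the cleaned text, so a receipt built from it never has to hold the redacted values themselves.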

Why guards can't be disabled

Defense-in-depth. The agent is constrained at multiple layers:

  1. Autonomy mode (may the agent apply changes at all?).
  2. Hunk review (does this specific change get accepted?).
  3. Write guard (is the path/size/type safe?).
  4. Hallucination + egress guards (is the content safe?).

Disabling any one would punch a hole through the whole chain. If you find a guard pattern is too aggressive for your use case, file an issue or self-host with a custom rule set.

Tuning

Self-hosters can extend the rule sets:

  • Write guard: BLOCKED_PATHS and BLOCKED_PATTERNS in services/write_guard.py.
  • Egress guard: REDACTION_RULES (regex list) in services/egress_guard.py.

Per-workspace allowlists for write paths are on the roadmap.