Guards

Three guardrails sit between the agent and you: two check what the agent writes, and one filters what it sends to your browser. They are always on and cannot be disabled.

Write guard

services/write_guard.py. Validates every file write before it touches disk.

Rejects:

Paths with .. (path traversal).
Absolute paths.
Paths outside the workspace root.
Files larger than 1 MB.
Binary files (detected by null-byte scan).
Dangerous patterns (e.g., overwriting .git/config or requirements.txt without confirmation, in some modes).

When a write is rejected, the receipt records guards.write: "fail" and the diff is dropped. The agent is informed and may retry with a corrected path or corrected content.
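
The rejection rules above can be sketched roughly as follows. This is a minimal illustration, not the code in services/write_guard.py: the function name check_write, the constants MAX_BYTES and PROTECTED_PATHS, and the reason strings are all hypothetical, and it assumes POSIX-style relative paths and that "1 MB" means 10**6 bytes.

```python
from pathlib import Path, PurePosixPath

MAX_BYTES = 1_000_000  # assumption: "1 MB" read as 10**6 bytes
PROTECTED_PATHS = {".git/config"}  # hypothetical stand-in for the real rule set


def check_write(root: Path, rel_path: str, content: bytes) -> tuple[str, str]:
    """Return ("pass", "") or ("fail", reason) for a proposed write."""
    path = PurePosixPath(rel_path)
    if path.is_absolute():
        return "fail", "absolute path"
    if ".." in path.parts:
        return "fail", "path traversal"
    # Resolve symlinks so a link inside the workspace cannot point outside it.
    root = root.resolve()
    if not (root / rel_path).resolve().is_relative_to(root):
        return "fail", "outside workspace root"
    if len(content) > MAX_BYTES:
        return "fail", "larger than 1 MB"
    if b"\x00" in content:
        return "fail", "binary content"
    if path.as_posix() in PROTECTED_PATHS:
        return "fail", "protected path"
    return "pass", ""


status, reason = check_write(Path("."), "../secrets.env", b"KEY=1")
receipt = {"guards": {"write": status}}  # {"guards": {"write": "fail"}}
```

The cheap string checks run first so that an obviously bad path is rejected before anything touches the filesystem; the symlink-aware check is what catches a path that looks relative but resolves outside the root.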

Hallucination guard

services/hallucination_guard.py. Cross-references every function/class reference in the generated code against the workspace's symbol index.

A reference like from app.utils import foo triggers a lookup: does foo exist in app.utils? If not:

The receipt records guards.hallucination: "warn" plus a list of missing references.
The change is not blocked, because the agent may have legitimately created the symbol elsewhere in the same diff.
A warning chip appears in the Changes panel for human review.
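
As a rough sketch of the lookup described above, the following uses Python's ast module to collect `from X import Y` references and compare them with a symbol index. The index shape (module name mapped to a set of symbol names), the function names, and the "pass" status are assumptions for illustration; the real guard also covers references that this sketch ignores.

```python
import ast


def missing_references(source: str, symbol_index: dict[str, set[str]]) -> list[str]:
    """List `module.name` imports whose name is absent from a workspace module."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        # Only absolute `from X import Y` statements are checked in this sketch.
        if not isinstance(node, ast.ImportFrom) or node.level != 0:
            continue
        known = symbol_index.get(node.module)
        if known is None:
            continue  # not a workspace module (stdlib or third party)
        for alias in node.names:
            if alias.name != "*" and alias.name not in known:
                missing.append(f"{node.module}.{alias.name}")
    return missing


def guard_entry(source: str, symbol_index: dict[str, set[str]]) -> dict:
    """Build a receipt-style entry; a miss warns, it never blocks."""
    missing = missing_references(source, symbol_index)
    # "pass" is an assumed status name; the source only documents "warn".
    return {"hallucination": "warn" if missing else "pass", "missing": missing}


index = {"app.utils": {"bar"}}  # hypothetical symbol index
entry = guard_entry("from app.utils import foo, bar\n", index)
```

Here `entry` carries the "warn" status and lists app.utils.foo as missing, while the write itself would still go through for human review.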

Future work: surface the warning more aggressively in the editor gutter (Appendix E1).

Egress guard

services/egress_guard.py. Strips sensitive content from agent frames before they reach your browser.

Patterns redacted:

Live-looking API keys (sk-…, xoxb-…, AWS access keys, Google service-account JSON, etc.).
PEM private-key blocks.
Database connection strings with embedded passwords.
Paths outside the workspace root (prevents the agent from accidentally exposing your home directory in chat).

Redacted content is replaced inline with a placeholder like [REDACTED:api-key]. The receipt records the count of redactions, not the redacted values.
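
Inline redaction with a running count might look like the sketch below. The patterns are deliberately simplified illustrations, not the rules shipped in services/egress_guard.py; the (label, pattern) pair shape, the redact function, and every label other than api-key are assumptions.

```python
import re

# Illustrative rules only; real key formats vary more than these patterns allow.
REDACTION_RULES = [
    ("api-key", re.compile(r"\bsk-[A-Za-z0-9_-]{20,}")),
    ("api-key", re.compile(r"\bxoxb-[A-Za-z0-9-]{10,}")),
    ("api-key", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("private-key", re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL,
    )),
    # scheme://user:password@host...
    ("db-url", re.compile(r"\b[a-z][a-z0-9+]*://[^\s:/@]+:[^\s@/]+@\S+")),
]


def redact(frame: str) -> tuple[str, int]:
    """Replace sensitive spans with placeholders; return the text and the count."""
    total = 0
    for label, pattern in REDACTION_RULES:
        frame, hits = pattern.subn(f"[REDACTED:{label}]", frame)
        total += hits
    return frame, total


clean, count = redact("use sk-" + "a" * 24 + " with postgres://app:hunter2@db/prod")
receipt = {"guards": {"egress": {"redactions": count}}}  # hypothetical receipt shape
```

Only the count leaves the function alongside the cleaned text, which matches the point above that the receipt never stores the redacted values themselves.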

Why guards can't be disabled

Defense-in-depth. The agent is constrained at multiple layers:

Autonomy mode (is the agent allowed to apply changes at all?).
Hunk review (does this specific change get accepted?).
Write guard (is the path/size/type safe?).
Hallucination + egress guards (is the content safe?).

Disabling any one would punch a hole through the whole chain. If you find a guard pattern is too aggressive for your use case, file an issue or self-host with a custom rule set.

Tuning

Self-hosters can extend the rule sets:

Write guard: services/write_guard.py → BLOCKED_PATHS and BLOCKED_PATTERNS.
Egress guard: services/egress_guard.py → REDACTION_RULES (regex list).
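
For self-hosters, adding a rule could look something like this. The sketch uses a local stand-in list rather than importing the real module, because the exact shape of REDACTION_RULES is not documented here; the (label, pattern) pairs, the internal-token label, and the acme_ token format are all hypothetical.

```python
import re

# Local stand-in for REDACTION_RULES; the real list's element shape is assumed.
REDACTION_RULES = [
    ("api-key", re.compile(r"\bxoxb-[A-Za-z0-9-]{10,}")),
]

# Hypothetical in-house token: "acme_" followed by 32 lowercase hex characters.
custom_rule = ("internal-token", re.compile(r"\bacme_[0-9a-f]{32}\b"))
REDACTION_RULES.append(custom_rule)

# Check the new pattern against known-good and known-bad samples before deploying.
sample = "acme_" + "0123456789abcdef" * 2
matches_token = custom_rule[1].search(f"token={sample}") is not None
matches_prose = custom_rule[1].search("acme_corp quarterly report") is not None
```

Testing a new pattern against both a real-looking token and ordinary text helps catch a rule that is too broad before it starts redacting normal chat output.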

Per-workspace allowlists for write paths are on the roadmap.