Guards
Three guardrails that fire on every write. Always on. Cannot be disabled.
Write guard
services/write_guard.py. Validates every file write before it touches disk.
Rejects:
- Paths with .. (path traversal).
- Absolute paths.
- Paths outside the workspace root.
- Files larger than 1 MB.
- Binary files (detected by null-byte scan).
- Dangerous patterns (e.g., overwriting .git/config or requirements.txt without confirmation in some modes).
When a write is rejected, the receipt records guards.write: "fail" and the diff is dropped. The agent is informed and may retry with a corrected path.
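The rejection rules above can be sketched roughly as follows. This is a minimal illustration, not the contents of services/write_guard.py: the function name check_write, the MAX_BYTES constant, the reading of "1 MB" as 10^6 bytes, and the single entry in BLOCKED_PATHS are all assumptions made for the example.

```python
from pathlib import PurePosixPath
from typing import Optional

MAX_BYTES = 1_000_000  # assumption: "1 MB" read as 10^6 bytes
BLOCKED_PATHS = {".git/config"}  # hypothetical contents, for illustration only


def check_write(rel_path: str, data: bytes) -> Optional[str]:
    """Return a rejection reason, or None if the write passes every check."""
    path = PurePosixPath(rel_path)
    if path.is_absolute():
        return "absolute path"
    if ".." in path.parts:
        return "path traversal"
    # A relative path with no ".." component stays under the workspace root
    # (symlinks are ignored in this sketch).
    if path.as_posix() in BLOCKED_PATHS:
        return "blocked path"
    if len(data) > MAX_BYTES:
        return "file too large"
    if b"\x00" in data:  # null-byte scan for binary content
        return "binary content"
    return None
```

A caller would drop the diff and record guards.write: "fail" in the receipt whenever check_write returns a reason.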
Hallucination guard
services/hallucination_guard.py. Cross-references every function/class reference in the generated code against the workspace's symbol index.
A reference like from app.utils import foo triggers a lookup: does foo exist in app.utils? If not:
- The receipt records guards.hallucination: "warn" plus a list of missing references.
- The change is not blocked, because the agent may have legitimately created the symbol elsewhere in the same diff.
- A warning chip appears in the Changes panel for human review.
Future work: surface the warning more aggressively in the editor gutter (Appendix E1).
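The lookup described above might look something like the sketch below. It covers only from-imports and assumes the symbol index is a plain mapping from module name to defined names; the real index format and the name missing_references are hypothetical.

```python
import ast

# Hypothetical symbol index: module name -> names defined in that module.
SYMBOL_INDEX = {"app.utils": {"slugify", "parse_date"}}


def missing_references(source: str, index: dict[str, set[str]]) -> list[str]:
    """List 'module.name' for each from-import that the index does not contain."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        # Only modules present in the index are checked, so third-party and
        # standard-library imports are skipped in this sketch.
        if isinstance(node, ast.ImportFrom) and node.module in index:
            for alias in node.names:
                if alias.name != "*" and alias.name not in index[node.module]:
                    missing.append(f"{node.module}.{alias.name}")
    return missing
```

A non-empty result would correspond to guards.hallucination: "warn" in the receipt, with the returned list as the missing references. The change itself is still applied.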
Egress guard
services/egress_guard.py. Strips sensitive content from agent frames before they reach your browser.
Patterns redacted:
- Live-looking API keys (sk-…, xoxb-…, AWS access keys, Google service-account JSON, etc.).
- PEM private-key blocks.
- Database connection strings with embedded passwords.
- Paths outside the workspace root (prevents the agent from accidentally exposing your home directory in chat).
Redacted content is replaced inline with a placeholder like [REDACTED:api-key]. The receipt records the count of redactions, not the redacted values.
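As a rough sketch of inline redaction with a count, assuming REDACTION_RULES is a list of (label, regex) pairs: the two patterns, the private-key label, and the redact function below are illustrative guesses, not the rules shipped in services/egress_guard.py.

```python
import re

# Hypothetical rules: (label, compiled pattern). The real list may differ.
REDACTION_RULES = [
    ("api-key", re.compile(r"\b(?:sk|xoxb)-[A-Za-z0-9-]{16,}")),
    (
        "private-key",
        re.compile(
            r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
            re.DOTALL,
        ),
    ),
]


def redact(text: str) -> tuple[str, int]:
    """Replace each match with a placeholder and return (clean text, count)."""
    total = 0
    for label, pattern in REDACTION_RULES:
        text, n = pattern.subn(f"[REDACTED:{label}]", text)
        total += n
    return text, total
```

Only the integer count would be written to the receipt, so the redacted values never leave the guard.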
Why guards can't be disabled
Defense-in-depth. The agent is constrained at multiple layers:
- Autonomy mode (does the change get applied at all?).
- Hunk review (does this specific change get accepted?).
- Write guard (is the path/size/type safe?).
- Hallucination + egress guards (is the content safe?).
Disabling any one would punch a hole through the whole chain. If you find a guard pattern is too aggressive for your use case, file an issue or self-host with a custom rule set.
Tuning
Self-hosters can extend the rule sets:
- Write guard: services/write_guard.py → BLOCKED_PATHS and BLOCKED_PATTERNS.
- Egress guard: services/egress_guard.py → REDACTION_RULES (regex list).
Per-workspace allowlists for write paths are on the roadmap.