Reins

Security controls for AI agents.

In Greek myth, Athena gave Bellerophon the golden bridle — reins included — that let him guide Pegasus. Reins applies the same idea to AI agents: raw power is not enough — what matters is making it controllable.

Reins watches your agent’s every move. Here’s how, technically.

Prevent, Pause, Prove

Reins enforces deterministic security policies on every agent action — synchronously, before execution, with no LLM in the enforcement path.

Prevent: Block destructive actions before they execute. Regex and heuristic rules catch rm -rf /, DROP DATABASE, fork bombs, writes to ~/.ssh, and dozens of other patterns. Rules are evaluated locally in under 50 ms, with no network call.
Pause: Route high-impact actions through human approval before they proceed. In a terminal, this is an interactive prompt. In a messaging workflow (WhatsApp, Telegram), an out-of-band notification delivers the approval request directly to the human, bypassing the agent context entirely so the agent cannot self-approve.
Prove: Every decision is appended to an immutable JSONL audit trail that records the tool, action, decision, rule, timestamp, and decision time. Pending entries are buffered locally and flushed to Reins Cloud on the next sync.
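The Prevent and Prove steps above can be sketched roughly as follows. This is a minimal illustration, not Reins' actual rule engine: the `Rule` shape, the four patterns, the `evaluate` and `toJsonl` helpers, and the audit field names are all assumptions made for the example. Only the ALLOW / ASK / DENY decisions and the list of audited fields come from this page.

```typescript
// Hypothetical rule and audit shapes; Reins' real formats are not shown here.
type Decision = "ALLOW" | "ASK" | "DENY";

interface Rule {
  id: string;
  pattern: RegExp;
  decision: Decision;
}

interface AuditEntry {
  tool: string;
  action: string;
  decision: Decision;
  rule: string | null;
  timestamp: string;
  decisionTimeMs: number;
}

// Illustrative patterns only, not Reins' actual rule set.
const RULES: Rule[] = [
  { id: "rm-rf-root", pattern: /\brm\s+-rf\s+\/(\s|$)/, decision: "DENY" },
  { id: "drop-database", pattern: /\bDROP\s+DATABASE\b/i, decision: "DENY" },
  { id: "ssh-write", pattern: />\s*~\/\.ssh\//, decision: "DENY" },
  { id: "force-push", pattern: /\bgit\s+push\b.*--force\b/, decision: "ASK" },
];

const SEVERITY: Record<Decision, number> = { ALLOW: 0, ASK: 1, DENY: 2 };

// Checks the action against every rule and keeps the strictest match.
// No model call and no network call: the same input always gives the same decision.
function evaluate(tool: string, action: string, now: Date = new Date()): AuditEntry {
  const start = Date.now();
  let decision: Decision = "ALLOW";
  let matched: string | null = null;
  for (const rule of RULES) {
    if (rule.pattern.test(action) && SEVERITY[rule.decision] > SEVERITY[decision]) {
      decision = rule.decision;
      matched = rule.id;
    }
  }
  return {
    tool,
    action,
    decision,
    rule: matched,
    timestamp: now.toISOString(),
    decisionTimeMs: Date.now() - start,
  };
}

// One JSON object per line, which is what makes the trail JSONL.
function toJsonl(entry: AuditEntry): string {
  return JSON.stringify(entry) + "\n";
}
```

In this sketch, `evaluate("Bash", "rm -rf /")` yields a DENY entry tagged with the `rm-rf-root` rule, and appending `toJsonl(entry)` to a file would give one audit line per decision.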

Because enforcement is synchronous, the agent cannot proceed until the hook exits. An unhandled hook error blocks the action: Reins fails closed, not open.
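One way to picture fail-closed behavior is a wrapper that turns any thrown error into a block. The sketch below is an assumption-laden illustration: `failClosed`, `Verdict`, and `GuardedResult` are hypothetical names, and Reins' own error handling is not documented on this page.

```typescript
// Hypothetical types for illustration only.
type Verdict = "allow" | "block";

interface GuardedResult {
  verdict: Verdict;
  reason: string;
}

// Wraps a policy check so that a crash inside the check can never
// be mistaken for permission: any thrown error becomes a block.
function failClosed(check: (action: string) => Verdict): (action: string) => GuardedResult {
  return (action: string): GuardedResult => {
    try {
      const verdict = check(action);
      return { verdict, reason: verdict === "allow" ? "policy allowed" : "policy blocked" };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { verdict: "block", reason: "hook error: " + message };
    }
  };
}
```

A fail-open design would return "allow" from the `catch` branch instead, which is exactly the case where a bug in the policy code silently disables enforcement.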

Why this matters

AI agents are increasingly capable of taking real actions: deleting files, pushing code, sending messages, dropping database tables, making purchases. Most of the time, this is exactly what you want. But capability without control creates a class of failure modes that are hard to anticipate and easy to trigger.

Irreversibility. A git push --force, DROP TABLE, or rm -rf is not a suggestion; it is a change that is hard or impossible to undo. Agents don’t inherently distinguish a reversible action from one that destroys data or breaks production. They optimize for task completion, not for the preservation of things you might want back.

Drift and manipulation. An agent working through a long task can drift far from its original intent — through prompt injection in tool output, through cumulative low-risk steps that compose into a harmful chain, or simply through losing context. By the time an agent reaches a destructive action, the reason it was authorized may no longer apply.

In-context approval is not enforcement. Asking a model to “confirm with the user before deleting” is a suggestion, not a control. A sufficiently creative prompt, a poisoned tool response, or simple model error can bypass it. The approval request is just more text in the model’s context: the same model that asked can also answer.

No audit trail. When something goes wrong, you need to know what happened: which action was attempted, which rule fired, what the decision was, and who approved it. Without a structured record, post-incident review is guesswork.

An agent cannot be its own watchdog. Neither can any runtime that runs inside the agent’s context.

Why Reins

| | Reins | ClawSec | DefenseClaw |
|---|---|---|---|
| Architecture | External to agent; cannot be prompt-injected | Runs inside agent context; can be compromised | External, multi-runtime |
| Install | `npm i -g @pegasi-ai/reins` | Skill install | 3 runtimes + Go daemon |
| Hosted dashboard | Yes (Reins Cloud) | No | No (Splunk only) |
| HITL approvals | Yes | No | No |
| Target | Developers + small teams | OpenClaw users | Enterprise SOC teams |

The architecture difference is the one that matters most. Security controls that run inside the agent’s context can be bypassed by the agent — whether through prompt injection, model error, or adversarial tool output. Reins runs at the OS hook level (Claude Code) or inside the gateway process before tool dispatch (OpenClaw), where the agent has no influence over the enforcement decision.
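To make the Claude Code side concrete: a PreToolUse hook is a separate process that receives a JSON description of the pending tool call on stdin, and an exit code of 2 blocks the call and returns stderr to the model. The sketch below models only that decision step as a pure function. `handlePreToolUse`, `HookResult`, and the `isDestructive` callback are hypothetical names, and this is not Reins' actual hook code.

```typescript
// Result a hook script would translate into its exit code and stderr output.
interface HookResult {
  exitCode: 0 | 2; // 0 lets the tool call proceed; 2 blocks it
  stderr: string;
}

// Shape of the fields this sketch reads from the hook's stdin payload.
interface PreToolUseEvent {
  tool_name?: unknown;
  tool_input?: { command?: unknown };
}

// Decides the outcome for one pending tool call.
// Malformed input is treated as a block (fail-closed).
function handlePreToolUse(
  stdinJson: string,
  isDestructive: (command: string) => boolean
): HookResult {
  try {
    const event = JSON.parse(stdinJson) as PreToolUseEvent | null;
    if (event === null || typeof event !== "object" || typeof event.tool_name !== "string") {
      throw new Error("malformed hook input");
    }
    const command = event.tool_input ? event.tool_input.command : undefined;
    if (event.tool_name === "Bash" && typeof command === "string" && isDestructive(command)) {
      return { exitCode: 2, stderr: "Blocked by policy: " + command };
    }
    return { exitCode: 0, stderr: "" };
  } catch (_err) {
    return { exitCode: 2, stderr: "Hook error; blocking the action (fail-closed)" };
  }
}
```

A real hook script would read all of stdin, call a function like this, write `stderr`, and exit with `exitCode`. Because that script runs outside the model's context, nothing the agent generates can change the value it returns.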

In the news

TechCrunch (February 23, 2026): A Meta AI security researcher said an OpenClaw agent ran amok on her inbox

Install

npm install -g @pegasi-ai/reins
reins init

reins init runs an interactive wizard that installs PreToolUse and PostToolUse hooks into .claude/settings.json (Claude Code) or registers the plugin with your OpenClaw gateway, then runs an initial security scan.
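For orientation, the hook registration step might look something like the sketch below, which merges PreToolUse and PostToolUse entries into a settings object using Claude Code's `hooks` layout (event name, `matcher`, and a list of `command` hooks). The `registerHooks` helper and the command string passed to it are hypothetical; the exact command that `reins init` writes is not stated on this page.

```typescript
// Shapes follow Claude Code's settings.json hook layout.
interface HookCommand {
  type: "command";
  command: string;
}

interface HookMatcher {
  matcher: string;
  hooks: HookCommand[];
}

interface ClaudeSettings {
  hooks?: { [event: string]: HookMatcher[] };
  [key: string]: unknown;
}

// Returns a copy of the settings with the command registered for both events.
// Existing keys and hooks are preserved, and re-running it adds no duplicates.
function registerHooks(settings: ClaudeSettings, command: string): ClaudeSettings {
  const hooks: { [event: string]: HookMatcher[] } = { ...(settings.hooks || {}) };
  for (const event of ["PreToolUse", "PostToolUse"]) {
    const existing: HookMatcher[] = hooks[event] || [];
    const already = existing.some((m) => m.hooks.some((h) => h.command === command));
    if (!already) {
      // "*" matches every tool; a narrower matcher such as "Bash" is also possible.
      const entry: HookMatcher = { matcher: "*", hooks: [{ type: "command", command }] };
      hooks[event] = existing.concat([entry]);
    } else {
      hooks[event] = existing;
    }
  }
  return { ...settings, hooks };
}
```

Calling `registerHooks({}, "example-hook-command")` (a placeholder command) produces a `hooks` object with one entry under each of `PreToolUse` and `PostToolUse`, and calling it again on the result leaves the settings unchanged.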

Next steps

Getting Started — install, init, first scan
Claude Code — hooks, skill, examples
OpenClaw — plugin, channel mode, OOB approvals
How It Works — terminal mode, channel mode, memory forecasting
Security Policies — ALLOW / ASK / DENY rules, OWASP coverage
CLI Reference — every command
Security Scan — 13-check audit, drift monitoring
Reins Cloud — centralized governance