Article source ↗ source url updated Thu Jun 04 2026 00:00:00 GMT+0000 (Coordinated Universal Time)

What AI Agents Should Never Do on Their Own (Towards Data Science)

A practitioner argument that “more freedom → better output” is incomplete: agent autonomy is only useful when bounded. The piece gives the wiki its first explicit framework for where to put the human checkpoints — the agent-guardrails concept.

Six things agents shouldn’t do autonomously

Destructive file ops — rm -rf, git reset --hard (discard unrecoverable work).
DB writes/migrations — DELETE without WHERE, DROP, TRUNCATE, prod schema changes.
Cloud infra — terraform apply, kubectl delete, IAM changes.
Production deployments — any release to live, “regardless of code quality.”
Auth/security logic — code that “looks correct” but fails on edge cases (“bugs here… show up in incident reports, sometimes months later”).
Secrets/credentials — reading or writing .env files and API keys.

The organizing lens — reversibility / recovery cost

Assess every task by “can this be undone?” Low recovery cost (refactors, unit tests) → high autonomy; steep penalty (dropped prod table) → mandatory pre-execution checkpoint. Autonomy is graduated by blast radius, not all-or-nothing.

Concrete guardrail mechanisms

AGENTS.md contract — repo file with setup, coding rules, safety rules, and a “Definition of Done”; the agreed contract before work begins (the cross-vendor sibling of the CLAUDE.md / spec-driven-development “constitution”).
blocked_commands.md — an explicit block list of destructive ops needing approval; “use explicit block lists rather than trusting agents to infer boundaries.”
Two-agent loop — one implements, a second reviews the diff “without ego investment” — the agent-orchestration adversarial-verification pattern applied as a safety gate.
Final reports — summarize changes, files, test results, risks, and incomplete work.
Never let the agent pick deploy timing — “you know what’s in flight, what incidents are open… the agent doesn’t have any of that context.”

Why it matters here

It turns the wiki’s scattered reliability/governance mentions (Scout’s policy-conformance audit, Hermes’ command-approval, the harness review/eval discipline) into a named, usable discipline — a permission matrix keyed on reversibility, not vibes. It’s the human-in-the-loop counterpart to autonomy: where durable-agents and self-improving-agents push agents to do more alone, this maps the bright lines they shouldn’t cross. (Caveat: single practitioner op-ed, no benchmarks.)

agent-guardrails · agentic-coding-harness · agent-orchestration · spec-driven-development · agent-middleware · durable-agents

What AI Agents Should Never Do on Their Own (Towards Data Science)

Six things agents shouldn’t do autonomously

The organizing lens — reversibility / recovery cost

Concrete guardrail mechanisms

Why it matters here

Related

Linked from