Agentic coding harness
The scaffolding layer wrapped around a code-writing LLM — everything the model itself
doesn’t do: task selection, branch/PR management, code review, CI/CD, delivery gates, state
persistence, and multi-agent/human coordination. The recurring thesis across its instances is
blunt: frontier models can write code; the hard, valuable part is “everything else”
(agentsys). langchain states the formula cleanly — agent = model + harness, the
harness being “the scaffolding around the model that connects it to the real world,” whose core
job is delivering the right context to the model at each step (langchain-custom-harness).
The shared bet
Output quality comes less from raw model power than from structure around the model: phase gates that forbid skipping tests/review, deterministic tools instead of token spend for mechanical work, confidence grading to route auto-fix vs. human review, and persistent state so work survives interruption. agentsys’s claim that Sonnet + harness beats raw Opus on cost-effectiveness is the strong form: a good harness substitutes for capability. langchain restates the same bet as “task-harness fit determines agent usefulness more than raw model capability” (langchain-custom-harness).
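A minimal sketch of three of these mechanisms (phase gates, confidence routing, persistent state), in plain Python. Everything here is hypothetical and for illustration only: the phase names, the `0.9` threshold, the JSON state file, and the function names are assumptions, not taken from agentsys or any other harness listed on this page.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical phase order; a real harness would define its own.
PHASES = ["plan", "implement", "test", "review", "ship"]


def load_state(path: Path) -> dict:
    """Return saved progress, or a fresh state if nothing was saved yet."""
    if path.exists():
        return json.loads(path.read_text())
    return {"completed": []}


def complete_phase(state: dict, phase: str, path: Path) -> None:
    """Phase gate: a phase can only complete after all earlier phases have."""
    required = PHASES[: PHASES.index(phase)]
    missing = [p for p in required if p not in state["completed"]]
    if missing:
        raise RuntimeError(f"cannot complete {phase!r}; missing {missing}")
    if phase not in state["completed"]:
        state["completed"].append(phase)
    # Persist after every phase so work survives an interruption.
    path.write_text(json.dumps(state))


def route_finding(confidence: float, threshold: float = 0.9) -> str:
    """Confidence grading: auto-fix high-confidence findings, escalate the rest."""
    return "auto_fix" if confidence >= threshold else "human_review"


if __name__ == "__main__":
    state_file = Path(tempfile.mkdtemp()) / "harness_state.json"
    state = load_state(state_file)
    complete_phase(state, "plan", state_file)
    complete_phase(state, "implement", state_file)
    try:
        complete_phase(state, "ship", state_file)  # tries to skip test and review
    except RuntimeError as err:
        print(err)
    # A restarted process would reload the same progress from disk.
    print(load_state(state_file)["completed"])
    print(route_finding(0.95), route_finding(0.4))
```

The gate and the router are ordinary deterministic code, so neither costs tokens. The model is only called for the work inside each phase.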
The structural unit — agent-middleware
langchain names the primitive the harness is built from: composable agent-middleware,
each piece hooking into the agent loop (before/after model calls, before/after tool calls,
startup/teardown) to handle one concern — assembled by stacking, not as a monolith
(langchain-custom-harness). Its prebuilt middleware map onto threads this wiki tracks separately:
delegation (SubAgentMiddleware → agent-orchestration), supervision (HumanInTheLoopMiddleware
→ agent-kanban), state (durable-agents), memory (gbrain). Build tooling for this ranges
from minimal loop-builders (LangChain’s create_agent) to pre-assembled stacks (Deep Agents,
anthropic’s Claude Agent SDK) to configurable harnesses like Pi (cf. oh-my-pi).
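The stacking idea can be sketched in a few lines of plain Python. This is not LangChain's middleware API: the `Middleware` base class, the two hook names, `run_step`, and the stand-in `fake_model` are all hypothetical, chosen only to show single-concern pieces hooking in before and after a model call.

```python
from typing import Callable


class Middleware:
    """Hypothetical base class: each hook takes the shared state and returns it."""

    def before_model(self, state: dict) -> dict:
        return state

    def after_model(self, state: dict) -> dict:
        return state


class ContextMiddleware(Middleware):
    """One concern: put project context in front of the model."""

    def __init__(self, notes: str) -> None:
        self.notes = notes

    def before_model(self, state: dict) -> dict:
        state["prompt"] = f"{self.notes}\n\n{state['prompt']}"
        return state


class ApprovalMiddleware(Middleware):
    """One concern: flag risky output for human review."""

    def after_model(self, state: dict) -> dict:
        state["needs_approval"] = "rm -rf" in state["output"]
        return state


def run_step(model: Callable[[str], str], stack: list[Middleware], prompt: str) -> dict:
    """Run one model call wrapped by every middleware in the stack."""
    state = {"prompt": prompt}
    for mw in stack:  # pre-hooks run in stack order
        state = mw.before_model(state)
    state["output"] = model(state["prompt"])
    for mw in reversed(stack):  # post-hooks unwind in reverse order
        state = mw.after_model(state)
    return state


def fake_model(prompt: str) -> str:
    """Stand-in for an LLM call: echoes the last line of the prompt."""
    return f"echo: {prompt.splitlines()[-1]}"


if __name__ == "__main__":
    stack = [ContextMiddleware("Repo uses pytest."), ApprovalMiddleware()]
    print(run_step(fake_model, stack, "add a test"))
```

Adding a concern means appending one more object to `stack`. `run_step` itself never changes, which is the sense in which the harness is assembled by stacking instead of being written as a monolith.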
Instances in this wiki
- agentsys — 24 plugins / 49 agents / 44 skills; phase-gated pipelines, certainty grading.
- agent-kanban — shared board giving the harness a collaborative, supervised surface (agent identity, human↔agent messaging, mutual PR review).
- compound-engineering-plugin — the harness as a learning loop (compound-engineering): each task codifies reuse for the next.
- claude-financial-services (partially) — skills + connectors + managed deployment as a domain-specific harness.
- gstack — garry-tan’s “software factory”: 23 skills + role-split agents on a Think→Plan→Build→Review→Test→Ship→Reflect sprint; the richest instance, and it folds in conductor (parallel sprints) and gbrain (memory).
- conductor — the planning/context side: Context → Spec & Plan → Implement on Gemini CLI.
- claw-code — the minimal end: a provider-agnostic Rust CLI harness.
- oh-my-pi — the IDE-integrated, feature-rich end: LSP, first-class debuggers, in-process tools, hash-anchored edits, persistent kernels, subagents (40+ providers).
- agent-starter-pack — the deploy/operate end: templates + eval + Cloud Run/CI-CD.
These map a spectrum across one workflow: plan/context (conductor) → build/orchestrate (agentsys, agent-kanban, gstack) → deploy/operate (agent-starter-pack), with claw-code (minimal) and oh-my-pi (IDE-deep) bracketing the CLI-harness layer. Many are explicitly multi-runtime / multi-provider, reinforcing that the harness, not any one vendor’s client, is the durable unit.
Connections
This is the product/runtime crystallization of three threads already here: agent-orchestration (how subagents fan out), agent-skills / agentskills-spec (the loadable capabilities the harness routes), and spec-driven-development (process-as-markdown the gates enforce). On the augment→automate axis (synthesis) the harness is what moves a team rightward — automating the wrapping work so the human supervises outcomes, not steps.
Related
agentsys · agent-kanban · agent-orchestration · agent-skills · compound-engineering · spec-driven-development · jaseci
Adjacent at the language level: jaseci (the Jac language) takes a different tack from these markdown-skill + CLI-harness products — the framework is the language (typed signatures build prompts; graph “walkers” drive agentic flows). Same goal (structure around the model), a language-design point on the spectrum.