Defined Term mechanism updated 2026-06-04 (UTC)

Agentic coding harness

The scaffolding layer wrapped around a code-writing LLM: everything the model itself doesn’t do, including task selection, branch/PR management, code review, CI/CD, delivery gates, state persistence, and multi-agent/human coordination. The recurring thesis across its instances is blunt: frontier models can write code; the hard, valuable part is “everything else” (agentsys). langchain states the formula cleanly: agent = model + harness, the harness being “the scaffolding around the model that connects it to the real world,” whose core job is delivering the right context to the model at each step (langchain-custom-harness).

The shared bet

Output quality comes less from raw model power than from structure around the model: phase gates that forbid skipping tests or review, deterministic tools instead of token spend for mechanical work, confidence grading to route work between auto-fix and human review, and persistent state so work survives interruption. agentsys’s claim that Sonnet + harness beats raw Opus on cost-effectiveness is the strong form: a good harness substitutes for capability. langchain restates the same bet as “task-harness fit determines agent usefulness more than raw model capability” (langchain-custom-harness).
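Three of the structures named above (phase gates, confidence grading, persistent state) can be sketched in a few lines. This is a minimal illustration and not any listed project's implementation: the phase names, the `TaskState` class, the `advance` and `route` helpers, and the 0.9 threshold are all assumptions made for the example.

```python
import json
from dataclasses import dataclass, field, asdict
from pathlib import Path

# Hypothetical phase order; a real harness defines its own pipeline.
PHASES = ["plan", "implement", "test", "review", "ship"]


@dataclass
class TaskState:
    task_id: str
    completed: list = field(default_factory=list)

    def save(self, path: Path) -> None:
        # Persist after each phase so an interrupted run can resume.
        path.write_text(json.dumps(asdict(self)))

    @classmethod
    def load(cls, path: Path) -> "TaskState":
        return cls(**json.loads(path.read_text()))


def advance(state: TaskState, phase: str) -> None:
    """Phase gate: a phase may run only once every earlier phase is done."""
    idx = PHASES.index(phase)
    missing = [p for p in PHASES[:idx] if p not in state.completed]
    if missing:
        raise RuntimeError(f"cannot enter {phase!r}: skipped {missing}")
    if phase not in state.completed:
        state.completed.append(phase)


def route(confidence: float, auto_fix_threshold: float = 0.9) -> str:
    """Confidence grading: high-certainty work is fixed automatically,
    everything else goes to a human. The threshold is illustrative."""
    return "auto_fix" if confidence >= auto_fix_threshold else "human_review"
```

With this sketch, calling `advance(state, "review")` before `"test"` has completed raises an error instead of letting the step be skipped, which is the point of a gate: the rule lives in deterministic code rather than in the model's prompt.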

The structural unit — agent-middleware

langchain names the primitive the harness is built from: composable agent-middleware, each piece hooking into the agent loop (before/after model calls, before/after tool calls, startup/teardown) to handle one concern, assembled by stacking rather than as a monolith (langchain-custom-harness). Its prebuilt middleware map onto threads this wiki tracks separately: delegation (SubAgentMiddleware → agent-orchestration), supervision (HumanInTheLoopMiddleware → agent-kanban), state (durable-agents), memory (gbrain). Build tooling for this ranges from minimal loop-builders (LangChain’s create_agent) to pre-assembled stacks (Deep Agents, anthropic’s Claude Agent SDK) to configurable harnesses like Pi (cf. oh-my-pi).
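The stacking idea can be shown without any framework. The sketch below is a generic illustration of the pattern, not LangChain's middleware API: the `Middleware` base class, its hook names, `ContextInjector`, `CallLogger`, `run_step`, and `fake_model` are all hypothetical, and only the model-call hooks are shown (tool-call and startup/teardown hooks would follow the same shape).

```python
from typing import Callable


class Middleware:
    """Hypothetical base class: each hook defaults to a pass-through."""

    def before_model(self, messages: list) -> list:
        return messages

    def after_model(self, reply: str) -> str:
        return reply


class ContextInjector(Middleware):
    """One concern: put project context in front of the model."""

    def __init__(self, context: str):
        self.context = context

    def before_model(self, messages: list) -> list:
        return [f"[context] {self.context}"] + messages


class CallLogger(Middleware):
    """One concern: record what went into and out of each model call."""

    def __init__(self):
        self.log = []

    def before_model(self, messages: list) -> list:
        self.log.append(("in", len(messages)))
        return messages

    def after_model(self, reply: str) -> str:
        self.log.append(("out", reply))
        return reply


def run_step(model: Callable[[list], str], stack: list, messages: list) -> str:
    # Before-hooks run in stack order, after-hooks in reverse order.
    for mw in stack:
        messages = mw.before_model(messages)
    reply = model(messages)
    for mw in reversed(stack):
        reply = mw.after_model(reply)
    return reply


def fake_model(messages: list) -> str:
    # Stand-in for an LLM call: reports how many messages it received.
    return f"saw {len(messages)} messages"
```

A harness in this style is just a list such as `[ContextInjector("repo uses pytest"), CallLogger()]` passed to `run_step`; adding or removing a concern means editing the list, not the loop.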

Instances in this wiki

agentsys: 24 plugins / 49 agents / 44 skills; phase-gated pipelines, certainty grading.
agent-kanban: shared board giving the harness a collaborative, supervised surface (agent identity, human↔agent messaging, mutual PR review).
compound-engineering-plugin: the harness as a learning loop (compound-engineering), where each task codifies reuse for the next.
claude-financial-services (partially): skills + connectors + managed deployment as a domain-specific harness.
gstack: garry-tan’s “software factory”, with 23 skills + role-split agents on a Think→Plan→Build→Review→Test→Ship→Reflect sprint; the richest instance, and it folds in conductor (parallel sprints) and gbrain (memory).
conductor: the planning/context side, running Context → Spec & Plan → Implement on Gemini CLI.
claw-code: the minimal end, a provider-agnostic Rust CLI harness.
oh-my-pi: the IDE-integrated, feature-rich end, with LSP, first-class debuggers, in-process tools, hash-anchored edits, persistent kernels, subagents (40+ providers).
agent-starter-pack: the deploy/operate end, with templates + eval + Cloud Run/CI-CD.

These map a spectrum across one workflow: plan/context (conductor) → build/orchestrate (agentsys, agent-kanban, gstack) → deploy/operate (agent-starter-pack), with claw-code (minimal) and oh-my-pi (IDE-deep) bracketing the CLI-harness layer. Many are explicitly multi-runtime / multi-provider, reinforcing that the harness, not any one vendor’s client, is the durable unit.

Connections

This is the product/runtime crystallization of three threads already here: agent-orchestration (how subagents fan out), agent-skills / agentskills-spec (the loadable capabilities the harness routes), and spec-driven-development (process-as-markdown the gates enforce). On the augment→automate axis (synthesis) the harness is what moves a team rightward — automating the wrapping work so the human supervises outcomes, not steps.

agentsys · agent-kanban · agent-orchestration · agent-skills · compound-engineering · spec-driven-development · jaseci

Adjacent at the language level: jaseci (the Jac language) takes a different tack from these markdown-skill + CLI-harness products — the framework is the language (typed signatures build prompts; graph “walkers” drive agentic flows). Same goal (structure around the model), a language-design point on the spectrum.
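To make “typed signatures build prompts” concrete, here is a rough Python analogy. It is not Jac syntax and does not reflect how jaseci implements the idea; `prompt_from_signature` and `summarize_diff` are hypothetical names, and the prompt wording is an assumption.

```python
import inspect
from typing import get_type_hints


def prompt_from_signature(fn) -> str:
    """Derive a prompt from a function's name, type hints, and docstring."""
    hints = get_type_hints(fn)
    ret = hints.pop("return", None)
    params = ", ".join(f"{name}: {tp.__name__}" for name, tp in hints.items())
    ret_name = ret.__name__ if ret is not None else "Any"
    doc = inspect.getdoc(fn) or ""
    return (
        f"Task: {fn.__name__}({params}) -> {ret_name}\n"
        f"{doc}\n"
        f"Return only a value of type {ret_name}."
    )


def summarize_diff(diff: str, max_words: int) -> str:
    """Summarize a code diff for a reviewer."""
    ...  # body intentionally empty: the model, not this code, does the work
```

The design point is that the structure the harness would otherwise carry in markdown skills lives in the type system instead: changing the signature changes the prompt.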
