agentic-tooling-wiki

Synthesis — Agentic Tooling

The evolving thesis. This spoke was split out of research-wiki’s “cluster B” on 2026-06-01, when a single Telegram burst added ~10 agent-tooling sources and the cluster became larger than its parent’s core. It owns the tooling for building and running LLM agents; the knowledge-management / tools-for-thought lineage stays in research-wiki, joined here by a few explicitly cross-linked bridge nodes (gbrain, agent-skills, compound-engineering, model-context-protocol, anthropic, claude-opus-4-8).

Current thesis

The frontier model writes code; the value has moved to everything wrapped around it. Six layers recur across the sources:

  1. Capability as portable markdown — agent-skills. Domain expertise + procedures bundled as markdown an agent loads at runtime: claude-financial-services (FSI verticals), agentic-seo-skill and claude-skills-ppc (marketing), gstack’s 23 role-skills. This is hardening from per-tool conventions into an open cross-vendor standard, agentskills-spec (agentskills.io) — one skill across “Gemini CLI, Claude Code, Cursor, 40+ products” — with progressive disclosure (metadata → body → resources on demand) as the load-on-demand mechanism. Skills become agentic when combined with tools/data via model-context-protocol (the explicit “skills × MCP = agency” formula).

    The standards layer around this primitive now spans four edges: skills (agentskills-spec = what an agent can do), context (agents-md = what it must know about a project; 60k+ repos, Linux Foundation), tools (MCP = how it reaches data/APIs), and interop (a2a-protocol = how agents from different frameworks collaborate without exposing internal state — Agent Cards, tasks, JSON-RPC/HTTP). The framing: MCP = agent→tool; A2A = agent→agent — complementary, MCP runs inside an A2A ecosystem. Four distinct open standards, each a candidate attractor or a fragmentation front; the governance move repeats (vendor → neutral foundation → default). Within this scheme agents-md and agentskills-spec are siblings: one covers the project, the other the capability.
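
The progressive-disclosure mechanism named in this layer (metadata → body → resources on demand) can be sketched as a three-stage loader. This is a minimal illustration under stated assumptions, not the agentskills-spec reference loader: the `Skill` class, `parse_skill`, `catalog`, and the example skill are all hypothetical, and the front matter is parsed with a naive `key: value` split where a real loader would use a YAML parser.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Skill:
    """One skill, disclosed in three stages: metadata, body, resources."""
    name: str
    description: str
    _raw_body: str = field(repr=False)
    _resources: dict[str, str] = field(default_factory=dict, repr=False)
    body_loaded: bool = False

    def body(self) -> str:
        # Stage 2: the full instructions enter context only on activation.
        self.body_loaded = True
        return self._raw_body

    def resource(self, path: str) -> str:
        # Stage 3: supporting files are read one at a time, on demand.
        return self._resources[path]


def parse_skill(text: str, resources: dict[str, str] | None = None) -> Skill:
    """Split a SKILL.md-style document into front matter and body.

    Assumes simple `key: value` lines between two `---` markers.
    """
    _, front, body = text.split("---", 2)
    meta = {}
    for line in front.strip().splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return Skill(meta["name"], meta["description"], body.strip(), resources or {})


def catalog(skills: list[Skill]) -> str:
    # Stage 1: only name + description are advertised to the model.
    return "\n".join(f"- {s.name}: {s.description}" for s in skills)


# Hypothetical skill used purely for illustration.
EXAMPLE = """---
name: pdf-report
description: Build a PDF report from tabular data.
---
1. Read the table.
2. Render it using templates/report.tex.
"""

skill = parse_skill(EXAMPLE, {"templates/report.tex": "\\documentclass{article}"})
```

Calling `catalog([skill])` costs one line of context per skill; `skill.body()` and `skill.resource(...)` are only paid for once the agent decides the skill is relevant.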

  2. Structure around the model — the agentic-coding-harness. A wave of products (agentsys, agent-kanban, claw-code, conductor, gstack, agent-starter-pack) crystallizes one bet: structure substitutes for capability — phase gates, deterministic tools instead of token-spend, confidence grading, persistent state, multi-agent + human coordination. agentsys pushes the strong form (Sonnet + harness > raw Opus on cost-effectiveness). The instances span a full plan → build → deploy spectrum: conductor (context/plan) → agentsys/agent-kanban/gstack (build/orchestrate) → agent-starter-pack (deploy/operate), over the thin provider-agnostic substrate claw-code. langchain supplies the canonical formula and the structural unit: agent = model + harness, “task-harness fit > raw model capability”, and the harness built from composable agent-middleware — single-concern pieces that hook the agent loop (before/after model & tool calls, startup/teardown) and stack rather than forming a monolith (langchain-custom-harness). Tellingly, LangChain’s prebuilt middleware pieces are just this wiki’s other threads as drop-ins — delegation = agent-orchestration, human-in-the-loop = supervision (agent-kanban), state = durable-agents, memory = gbrain — so the harness is a middleware stack over orchestration / durability / supervision / skills. The pattern is now generalizing beyond coding to everyday verticals: ai-job-search is a fork-and-customize Claude Code harness for job applications (skills + slash commands + a drafter–reviewer subagent loop) — same primitives, non-coding domain. It also adds a sharp reliability pattern, output-grounded verification: rather than trusting generation, it compiles the LaTeX, reads the rendered PDF, and fixes layout until visual inspection passes — verifying the artifact, not the model output (the same “structure/checks over raw capability” discipline as the durability/guardrails threads). 
The vertical-generalization is now itself productized as a marketplace: pm-skills packages 68 skills / 9 plugins / 42 chained workflows for product management (discovery→strategy→execution→launch→growth) — same primitives (agent-skills + slash commands that chain skills), a knowledge-heavy non-coding domain, and notably cross-vendor (Claude Code/Cowork, Codex, exported to Gemini CLI/Cursor/Kiro) — a real-world stress test of the portable-skill thesis.
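
The "harness as a middleware stack" idea in this layer can be made concrete with a small sketch. It is not LangChain's API: the `Middleware` base class, its hook names, `run_turn`, and `fake_model` are hypothetical names chosen to mirror the before/after-model hooks described above.

```python
from typing import Callable, Dict, List

State = Dict[str, object]
Model = Callable[[str], str]


class Middleware:
    """Single-concern hook pair around each model call (hypothetical interface)."""

    def before_model(self, state: State) -> None: ...
    def after_model(self, state: State) -> None: ...


class CallCounter(Middleware):
    """Concern: bookkeeping. Counts model calls in shared state."""

    def before_model(self, state):
        state["calls"] = state.get("calls", 0) + 1


class HumanApproval(Middleware):
    """Concern: supervision. Flags replies that mention a risky action."""

    def after_model(self, state):
        if "deploy" in state["reply"]:
            state["needs_approval"] = True


def run_turn(model: Model, stack: List[Middleware], prompt: str) -> State:
    state: State = {"prompt": prompt}
    for m in stack:               # hooks run in stack order before the call
        m.before_model(state)
    state["reply"] = model(state["prompt"])
    for m in reversed(stack):     # and unwind in reverse order after it
        m.after_model(state)
    return state


def fake_model(prompt: str) -> str:
    # Stand-in for a real LLM call.
    return f"plan: deploy {prompt}"


result = run_turn(fake_model, [CallCounter(), HumanApproval()], "the service")
```

Each piece knows nothing about the others; swapping the model or adding a memory middleware changes the stack, not the loop.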

  3. Execution & deployment patterns. agent-orchestration (orchestrator → parallel subagent fan-out, adversarial verification) is the execution-time engine, paired at authoring time with spec-driven-development (agree a reviewable spec — or, per conductor, persistent context — before the agent codes). A deployment axis also appears (claude-code-channels-vs-openclaw): event-driven (Anthropic’s Channels — reactive, human-messaged; the channel driving this very wiki) vs self-driven (OpenClaw — an autonomous heartbeat daemon). The self-driven pole is going mainstream: OpenClaw is now both forked (hermes-agent, open-source successor) and productized by a major vendor (microsoft-scout, built on OpenClaw, Microsoft-365-integrated) — the autonomous, durable, self-improving personal assistant is no longer fringe. Notably both ship governance (Scout’s continuous “policy conformance” audit; Hermes’ command-approval/isolation), making the reliability discipline a product feature, not an afterthought. That discipline now has an explicit practitioner framework — agent-guardrails (agents-never-do-alone): bound autonomy by reversibility / recovery cost, hard-stop the irreversible (prod deploys, infra, auth, secrets, destructive ops) behind human checkpoints, and codify the limits in an AGENTS.md contract + blocked_commands.md block list + a two-agent review loop. It’s the explicit counterweight to the autonomy push of layers 4–5: as agents do more alone, the guardrails map the bright lines they shouldn’t cross. 
A sharper counter-current is lathe: it uses the same primitive (Claude Code/Cursor/Codex agent-skills) for the opposite goal — “LLMs to teach you, rather than think for you.” Where the rest of the spoke automates work away, Lathe is skills that deliberately keep the human doing the work (generating hands-on tutorials you type out by hand) — the extreme augment pole, the agent-tooling instance of Engelbart’s augmenting human intellect (research-wiki’s augmenting-human-intellect) rather than replacing it. So “skills” are pole-agnostic: the same load-on-demand primitive serves both maximal automation and deliberate human practice. renwei-writing extends the augment pole to writing: a skill that edits without erasing the author — resisting the homogenization where “each AI pass strips more human voice” — and verifying the result against a checklist adapted from Wikipedia’s “Signs of AI writing” (output-grounded verification, like ai-job-search). Lathe keeps the human doing the work; renwei-writing keeps the human audible in it — both the anti-replacement use of the same primitive. A third entrant, openhuman (GPL-3.0, ~30.7k★), stakes out the local-first / privacy corner: a Tauri desktop app whose memory layer is a local SQLite “Memory Tree + Obsidian wiki” — a gbrain-style personal knowledge base as the agent’s memory. It’s the cleanest demonstration that this pole sits astride the research-wiki seam: an autonomous harness whose differentiator is a tools-for-thought knowledge base.
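
The agent-guardrails rule from this layer (bound autonomy by reversibility, hard-stop the irreversible behind a human checkpoint) reduces to a small gate in front of command execution. The sketch below is illustrative only: the source names a blocked_commands.md file but gives neither its contents nor its format, so the `IRREVERSIBLE_PREFIXES` list and the `check_command` function are assumptions.

```python
import shlex
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_HUMAN = "needs_human"


# Hypothetical block list of command prefixes treated as irreversible.
IRREVERSIBLE_PREFIXES = [
    ("rm", "-rf"),
    ("git", "push", "--force"),
    ("terraform", "apply"),
    ("kubectl", "delete"),
]


def check_command(command: str) -> Decision:
    """Route irreversible commands to a human checkpoint; allow the rest."""
    tokens = tuple(shlex.split(command))
    for prefix in IRREVERSIBLE_PREFIXES:
        if tokens[: len(prefix)] == prefix:
            return Decision.NEEDS_HUMAN
    return Decision.ALLOW
```

Prefix matching is deliberately crude. It only catches commands written in exactly this form, which is one reason the source pairs the block list with a two-agent review loop instead of relying on it alone.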

    Two boundary markers sharpen the execution thread from opposite ends. tgpt is the non-agentic floor: a terminal LLM tool with none of the structure (no tool use, no file edits, no planning loop, just prompt→reply). Its one axis is provider breadth across ~10 backends; it reinforces the spoke’s thesis by contrast — the value sits in what tgpt deliberately omits. arrow-js is the output ceiling — the rendered layer the rest of the stack has always assumed but never addressed. A reactive UI framework (< 5 kb, no build step, plain TypeScript, three functions) with WASM sandboxing: component logic runs in a WebAssembly sandbox while rendering to the DOM, so a chat agent can hand generated UI code to a host application safely. The sandbox is the agent-guardrails containment discipline applied to browser-side code rather than server-side actions. The corpus now traces the full stack from skills and orchestration down to what gets rendered to the user.

    The unit of authoring is shifting from the prompt to the loop — loop-engineering. Rather than crafting one request, you define a goal, a stop condition, and a feedback signal and let the agent iterate against its own prior work (agent-loops-verification, Arjun Iyer/Signadot; the Ralph technique is the canonical instance). This relocates the bottleneck from generation to verification: a loop is only as good as its feedback signal — “feedback is only as truthful as the system that generates it.” That is the strongest statement yet of why the spoke’s reliability thread is load-bearing — output-grounded verification (ai-job-search compiling and reading its PDF; renwei-writing’s checklist) and agent-guardrails containment matter more as loops generate faster than humans can review. The new claim it adds: those examples verify the artifact, but for cloud-native code the harder problem is verifying behaviour against a live system, which pushes verification down into runtime infrastructure (ephemeral environments, observability) — the live seam to platform-ops-wiki. So verification is now a first-class layer of the stack, not a post-hoc check, and it straddles the agentic-tooling / platform-ops boundary. A vivid worked instance arrived 2026-06-16: autonovel (Nous/hermes-agent) writes whole novels as a loop — goal + stop-conditions (quality >7.5/>6.0, plateau detection) + a feedback signal built from two “immune systems” (mechanical regex scan + a separate LLM-judge model). It is the artifact-verification pole (a shipped 79k-word novel, second-son-house-of-bells — published under the byline “Claude Hermes,” the orchestrator+model persona worn as a name) of the same thesis, and shows the feedback signal can be engineered as a layered evaluator rather than a single check. 
Its origin, Karpathy’s autoresearch (also 2026-06-16), pins the load-bearing variable: autoresearch’s loop runs on an objective metric (val_bpb), so autonomy is nearly free; autonovel has no ground-truth metric for prose, so building a trustworthy evaluator is the research problem. Same paradigm, and the feedback signal’s availability is what makes one easy and the other hard — the sharpest statement yet of “a loop is only as good as its feedback signal.” autoresearch also reframes containment as an autonomy enabler: agents edit only train.py, keeping diffs reviewable, which is what lets the loop run unattended.
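
The loop described above (a goal, a stop condition, a feedback signal, and a keep-or-discard step) fits in a few lines. This is a generic sketch, not autoresearch's or autonovel's code: `run_loop`, its parameters, and the toy `improve` and `score` functions are hypothetical, and the toy scorer stands in for whatever evaluator (an objective metric or an LLM judge) supplies the signal.

```python
from typing import Callable, List, Tuple


def run_loop(
    improve: Callable[[str], str],
    score: Callable[[str], float],
    draft: str,
    target: float,
    patience: int = 2,
    max_iters: int = 20,
) -> Tuple[str, List[float], str]:
    """Iterate until the score hits `target`, plateaus, or the budget runs out."""
    best, best_score = draft, score(draft)
    history = [best_score]
    stale = 0
    for _ in range(max_iters):
        if best_score >= target:
            return best, history, "target"
        candidate = improve(best)
        s = score(candidate)
        history.append(s)
        if s > best_score:
            best, best_score, stale = candidate, s, 0   # keep the improvement
        else:
            stale += 1                                  # discard the candidate
            if stale >= patience:
                return best, history, "plateau"
    reason = "target" if best_score >= target else "budget"
    return best, history, reason


def toy_improve(text: str) -> str:
    # Stand-in for an agent revising its own prior work.
    return text + "a"


def toy_score(text: str) -> float:
    # Stand-in feedback signal that saturates at 10.0.
    return min(len(text), 4) / 4 * 10


reached = run_loop(toy_improve, toy_score, "a", target=7.5)
stalled = run_loop(toy_improve, toy_score, "a", target=11.0)
```

Everything interesting lives in `score`: with a trustworthy signal the loop can run unattended, and with a weak one it optimizes confidently toward the wrong thing.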

  4. Durability — from session to process. durable-agents (adk-long-running-agents) is the maturation axis: agents that run for days/weeks, pause, survive crashes, and resume by separating workflow state from conversation history (state machines + persistent sessions + webhook-driven resume). It reframes the deployment axis — durable pause/resume (ADK) vs. continuous heartbeat (OpenClaw) are two routes to the same end: agents that outlive a single session. And like the harness thesis, durability is a reliability discipline (checkpoints, atomic transitions), not a model-capability gain — structure unlocking production use again.
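
Separating workflow state from conversation history, as this layer describes, means the checkpoint holds only the current step and the facts needed to resume. The sketch below is a minimal illustration and not ADK's API: the step names, the `approved` event, and the `save`, `load`, and `advance` helpers are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path

STEPS = ["collect", "await_approval", "publish", "done"]


def save(path: Path, state: dict) -> None:
    """Write the checkpoint atomically: temp file first, then rename."""
    fd, tmp = tempfile.mkstemp(dir=path.parent)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def load(path: Path) -> dict:
    if path.exists():
        return json.loads(path.read_text())
    return {"step": STEPS[0], "approved": False}


def advance(path: Path, event: str = "") -> str:
    """Run steps until the workflow must wait for an external event."""
    state = load(path)
    if event == "approved":
        state["approved"] = True
    while state["step"] != "done":
        if state["step"] == "await_approval" and not state["approved"]:
            break  # pause here; a later webhook-style call resumes the workflow
        state["step"] = STEPS[STEPS.index(state["step"]) + 1]
        save(path, state)  # checkpoint after every transition
    save(path, state)
    return state["step"]
```

Because each call to `advance` starts from the file on disk, a crash or restart between calls loses nothing: the process is disposable and the workflow is not.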

  5. Self-improvement — the growth axis. self-improving-agents (hermes-agent’s closed learning loop; adk’s self-generating meta-skills; gstack’s Reflect) are agents that author and refine their own skills and accumulate memory over time. This is the clearest bridge back to research-wiki: it’s compound-engineering inside the agent and rhymes with gbrain’s compounding personal knowledge base (hermes-agent is a convergence node — skills + orchestration + durability + self-improvement, and an OpenClaw successor). Open risk: self-authored capability can compound mistakes, so it rests on the same review/eval discipline — no published longitudinal evidence yet that self-improvement stays net-positive. A new convergence instance, zouroboros (2026-06-16), tackles that open risk head-on: its daily introspection→prescription→evolution loop gates every self-authored procedure change through a three-model consensus vote (GLM-5.1 / Kimi-K2.6 / MiniMax-M2.5 in parallel), plus circuit breakers and loop guards fencing the recursion. So where layer 3’s loop thesis asks “how truthful is the feedback signal?”, zouroboros answers for the self-improvement case with multi-model agreement — a third design point alongside autonovel’s engineered LLM-judge and autoresearch’s objective metric. It also fuses all of layers 2–6 in one repo (orchestration + memory + durability + self-improvement + resilience), the densest single instance after hermes-agent/gstack — though young, single-author, and deliberately platform-locked to zo-computer (an instance of the patterns, not a portable framework).
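
A consensus gate with a circuit breaker, the shape attributed to zouroboros above, can be sketched as follows. This is not zouroboros code. The source does not say whether the vote must be unanimous, so the `required` threshold is an assumed parameter, and the three reviewer functions are trivial hypothetical stand-ins for independent model calls.

```python
from typing import Callable, List

Reviewer = Callable[[str], bool]


def consensus_gate(change: str, reviewers: List[Reviewer], required: int) -> bool:
    """Accept a self-authored change only if enough reviewers approve it."""
    approvals = sum(1 for review in reviewers if review(change))
    return approvals >= required


class CircuitBreaker:
    """Stops the self-edit loop after too many consecutive rejections."""

    def __init__(self, max_rejections: int) -> None:
        self.max_rejections = max_rejections
        self.rejections = 0

    def record(self, accepted: bool) -> None:
        self.rejections = 0 if accepted else self.rejections + 1

    @property
    def open(self) -> bool:
        return self.rejections >= self.max_rejections


# Hypothetical stand-ins for three independent model reviewers.
def no_deletes(change: str) -> bool:
    return "delete" not in change


def is_short(change: str) -> bool:
    return len(change) < 80


def has_rationale(change: str) -> bool:
    return "because" in change


REVIEWERS = [no_deletes, is_short, has_rationale]
```

The gate only helps to the extent the reviewers fail independently; three models that share a blind spot will agree on the same mistake.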

  6. Memory and state as infrastructure. The threads above all assume a place to keep state: durable-agents separates workflow state from history, self-improving-agents accumulate memory, openhuman ships a SQLite “Memory Tree”, gbrain (cross-wiki) is a KB used as agent memory. That layer now has its own concept node — agent-memory (state + retrievable knowledge) — and, for the first time, a standalone infrastructure product: seekdb (OceanBase, Apache-2.0). It is MySQL-compatible, does hybrid vector + full-text search in one query, makes fresh writes immediately retrievable (async indexing + two-level HNSW), and adds the distinctive copy-on-write FORK/MERGE sandbox: agent state as a branchable, mergeable, rollback-able thing rather than one mutable blob. That branching primitive is the storage-layer expression of the reversibility/exploration discipline named in agent-guardrails and the fork-and-try pattern of agent-orchestration — pushed down into the database. A new vendor stratum: a proven DB team (not an agent-platform vendor) competing to be the substrate the harnesses run on. Same caveat as ever — the 10×-Milvus throughput claim is the repo’s own benchmark, unreplicated (see Open questions). A second standalone product, memory-vault (2026-06-17), marks the other end of the weight spectrum: a small open-source MCP server — stock Postgres + pgvector, hybrid vector + keyword search, docker compose up — that externalizes the context Claude Code otherwise loses to lossy auto-compaction (memory survives /clear and a change of machine). Its distinguishing move is delivery: where seekdb is a database the harness queries, Memory Vault is handed to the agent over the MCP — the storage-layer reading of the spoke’s “skills × MCP = agency” formula. So the memory layer now has two poles — a full-featured branch/merge DB and a thin MCP-wired store — and the recurring shape is that the store reaches the agent as a tool. 
(T3, a MakeUseOf how-to, not the project’s docs or an independent test — retrieval quality unmeasured.)
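
The branchable-state idea behind the FORK/MERGE sandbox can be illustrated with an in-memory copy-on-write store. This is a conceptual sketch only: it is not seekdb's SQL syntax or implementation (the source shows neither), and the `Store` class and its methods are hypothetical. One simplification to note: a fork here reads through to its parent live, so it is not snapshot-isolated.

```python
class Store:
    """Key-value store with copy-on-write forks (illustrative, in-memory)."""

    _DELETED = object()  # tombstone marking a key removed in this branch

    def __init__(self, parent=None):
        self._parent = parent
        self._local = {}  # only the keys written in this branch

    def get(self, key, default=None):
        if key in self._local:
            value = self._local[key]
            return default if value is Store._DELETED else value
        if self._parent is not None:
            return self._parent.get(key, default)
        return default

    def set(self, key, value):
        self._local[key] = value

    def delete(self, key):
        self._local[key] = Store._DELETED

    def fork(self):
        # No data is copied; the fork reads through to its parent.
        return Store(parent=self)

    def merge(self):
        """Apply this branch's writes to the parent."""
        if self._parent is None:
            raise ValueError("the root store has no parent to merge into")
        for key, value in self._local.items():
            if value is Store._DELETED:
                self._parent.delete(key)
            else:
                self._parent.set(key, value)
        self._local.clear()
```

An agent can fork, try a risky plan against the fork, and either merge the result or drop the fork; rollback is simply not merging.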

The convergence node — gstack

garry-tan’s “software factory” is the clearest single artifact of the thesis: 23 agent-skills, role-split agent-orchestration, parallel sprints via conductor, and gbrain as persistent memory, all run through a Think→Plan→Build→Review→Test→Ship→Reflect sprint in which Reflect is compound-engineering. It is also the cleanest bridge back to research-wiki: the same author’s knowledge base is the harness’s memory layer.

hermes-profile-builder (Nous) gives the spoke’s most abstract claim a UI: an agent is composed from identity + model + skills + MCP servers, and the Profile Builder is a local dashboard with exactly those four toggleable blocks (writing config.yaml/.env/SOUL.md, CLI-parity, one-click Skills-Hub/MCP-catalog installs). So “skills × MCP = agency” + model-selection + identity stop being separate threads and become a single composition surface in one shipping product. Evidence on the standardization-reach question: the blocks are stable enough to assemble in a GUI. Local-only (127.0.0.1) keeps it on the openhuman local-first pole.

Vendors

anthropic (Claude Code / Channels / FSI) is the furthest along the model-provider → full-agent-platform shift: the ant CLI (ant-cli) drives claude-managed-agents from the terminal, and claude-agent-sdk ships the Claude Code harness itself as a Python/TS library — tools, agent loop, hooks, subagents, MCP, sessions, and .claude/ skills/CLAUDE.md dirs. A three-rung ladder: Client SDK → Agent SDK → Managed Agents. The wiki’s threads are now literally one vendor’s SDK feature list. google (adk, agent-starter-pack, Gemini CLI / conductor) and now microsoft (Agent Framework + Azure AI Foundry, per agents-that-build-agents-ms) are all building the same harness + skills patterns — three independent vendors converging on skills as the primitive, strong evidence the form factor is an industry attractor, not one vendor’s house style. The open agentskills-spec is where that convergence is most explicit — and Microsoft adopts it outright: per agents-that-build-agents-ms (confirmed via Microsoft Learn), Foundry skills follow the agentskills.io format and surface to any MCP client as MCP Resources (SEP-2640), so all three vendors land on the same open file format, not rival proprietary ones. langchain is the fourth vendor, but decomposes the harness differently: its primitive is composable agent-middleware, not a loadable skill — a framing tension worth watching (skill-as-unit vs middleware-as-unit; in practice likely both, skills being what the agent can do and middleware how the loop is wired).

Open questions

  • Benchmarks. Nearly every claim here is from READMEs / vendor blogs / one practitioner (claude-code-best-practices) — including agentsys’s “Sonnet+harness > Opus” and the ~90% progressive-disclosure saving. A neutral, measured comparison is the highest-value next source. Partial step (2026-06-16): awesome-hermes-usecases is third-party, primary-source-gated deployment evidence for hermes-agent — it narrows the README-only-claims gap (the features demonstrably ship and get used across 13 domains) without closing it, since it documents that Hermes is deployed, not how well it performs against alternatives. Usage evidence, not a benchmark.
  • Does structure really substitute for capability, and how far? The harness-vs-model tradeoff is asserted, not quantified.
  • Standardization reach. Will agentskills-spec actually unify formats, or fragment (agent-starter-pack already migrating to agents-cli; ACP, MCP, vendor-native dirs coexist)? renwei-writing ships for Cola (~/.cola/skills/) — yet another host with a SKILL.md-style dir, i.e. the pattern keeps spreading but via per-tool directories, leaving open whether they converge on the spec or just proliferate. The evidence now favors convergence. Anthropic’s own anthropic-skills repo publishes the spec first-party, and Microsoft Foundry adopts it outright: agents-that-build-agents-ms confirms (via Microsoft Learn) that Foundry skills “follow the Agent Skills specification format”: SKILL.md + YAML front matter, the spec’s progressive-disclosure advertise→load→read pattern, surfaced to any MCP client as MCP Resources (SEP-2640). So the three converging vendors (Anthropic, Google, Microsoft) are not just each shipping skills but landing on the same open format wired to the same transport (MCP) — strong evidence the spec unifies rather than fragments. The residual fragmentation is at the per-tool directory/host level (Cola, agents-cli), not the file format.
  • The A-bridge. gstack fuses harness (here) + knowledge base (gbrain, research-wiki). Is “agent tooling” ultimately a sub-case of “LLMs operating over file-based markdown,” or a distinct engineering discipline? The split makes the question explicit.

Contradictions / tensions

None flagged yet — the sources are complementary. Watch the harness-substitutes-for-model claim against future neutral benchmarks.

Cross-spoke adjacency

  • research-wiki — parent; holds the knowledge-management / memex / llm-wiki lineage (cluster A) and formal methods (E), plus the bridge nodes above. gbrain, agent-skills, compound-engineering, and model-context-protocol are the seams between the two.
  • llm-inference-wiki — the mechanism layer (how models run/serve). This spoke sits above it: harnesses and skills consume inference; neither owns the other’s subject.
  • platform-ops-wiki — the verification-runtime seam (new, via agent-loops-verification). This spoke owns the loop + verification paradigm (loop-engineering); when verification of agent-written code requires a real running system (cloud-native behaviour, not just the artifact), it becomes a runtime/ephemeral-environment problem that platform-ops owns. Watch for sources where agent loops drive CI/CD or validate against live infrastructure.

Index — Agentic Tooling Wiki

Catalog of every page, grouped by schema.org @type. Spine: synthesis (thesis), log.md (history), this file (catalog). Some wiki-links resolve to bridge nodes in the sibling research-wiki (intentional cross-wiki links).

DefinedTerm (concepts / mechanisms)

  • agentic-tooling — umbrella: the tools wrapped around a model to build/run LLM agents; structure as the lever · domain
  • agentic-coding-harness — the scaffolding around a code-writing model (“everything else”); structure as capability · mechanism
  • agentskills-spec — agentskills.io open cross-vendor skills standard + progressive disclosure · standard
  • agent-orchestration — orchestrator → parallel-subagent fan-out; the execution-time engine · mechanism
  • spec-driven-development — agree the spec/context before the agent codes; process-as-markdown · practice
  • durable-agents — long-running agents that pause/resume/survive crashes via state machines + persistent sessions · mechanism
  • self-improving-agents — agents that author/refine their own skills + accumulate memory (the “growth” axis) · mechanism
  • agent-middleware — the composable structural unit of a harness: single-concern pieces hooking the agent loop · mechanism
  • agent-guardrails — autonomy boundaries: bound agents by reversibility/recovery cost; human checkpoints on irreversible actions · practice
  • agents-md — open project-context convention (AGENTS.md); cross-tool (60k+ repos, Linux Foundation) · source · standard
  • agent-memory — the persistent state + retrievable-knowledge substrate beneath durability/self-improvement; a storage layer of its own · mechanism
  • loop-engineering — designing the agent’s iterative loop (goal + stop condition + feedback) instead of a prompt; the Ralph technique generalized; relocates the bottleneck to verification · practice

SoftwareSourceCode (sources)

  • claude-financial-services — Anthropic’s reference FSI agents/skills/connectors repo · source · src: claude-financial-services.md
  • compound-engineering-plugin — Every Inc’s compound-engineering skillpack for coding agents · source · src: compound-engineering-plugin.md
  • agentic-seo-skill — LLM-first SEO skill pack for agent IDEs · source · src: agentic-seo-skill.md
  • agent-kanban — shared Kanban board for multi-agent + human coding collaboration · source · github.com
  • agentsys — modular agent orchestration runtime (24 plugins / 49 agents / 44 skills) · source · github.com
  • claw-code — provider-agnostic Rust CLI agent harness · source · github.com
  • oh-my-pi — IDE-wired terminal coding agent: LSP, debuggers, hash-anchored edits, subagents (40+ providers) · source · github.com
  • hermes-agent — Nous Research self-improving autonomous agent framework (“grows with you”); OpenClaw successor · source · github.com
  • awesome-hermes-usecases — community-curated, primary-source-gated catalog of real-world Hermes Agent deployments (13 domains); the evidence counterweight to hermes-agent’s README claims · source · T2 · github.com
  • autonovel — Nous/Hermes Agent autonomous novel-writing pipeline (4 phases, ~27 scripts, mechanical+LLM-judge eval loops); shipped a 79k-word novel — the loop/verification thesis as a working artifact · source · T1 · github.com
  • zouroboros — self-enhancing multi-agent orchestration + hybrid memory platform (5 executors incl. Hermes; Health Council; 3-model consensus gate on self-edits); convergence instance, Zo-Computer-native · source · T1 · github.com
  • autoresearch — Karpathy’s “agentic scientist”: agents autonomously iterate ML training overnight (edit train.py → 5-min run → val_bpb → keep/discard); the loop thesis with an objective feedback signal · source · T1 · github.com
  • microsoft-scout — Microsoft’s OpenClaw-built personal assistant (autonomous, durable, self-improving) + policy-conformance governance · source · techcrunch.com
  • openhuman — local-first (Tauri/SQLite) open-source personal AI agent; Memory Tree + Obsidian wiki; 118+ Composio integrations · source · github.com
  • spec-kit — GitHub’s reference spec-driven-development toolkit (six /speckit.* commands, 30+ agents) · source · github.com
  • anthropic-skills — Anthropic’s official Agent Skills repo: the spec, template, example + production document skills, plugin marketplace; first-party home of the skills standard · source · T1 · github.com
  • get-shit-done — GSD: Claude Code system pairing context engineering + spec-driven development · source · github.com
  • jaseci — Jac language + full-stack AI framework (Meaning Typed Programming, graph walkers) · source · docs.jaseci.org
  • node-js-functional-patterns-skill — a skill pack via MCP Market; honest stub (429) · source · mcpmarket.com
  • conductor — Gemini CLI extension; Context-Driven Development (context → spec → implement) · source · github.com
  • gstack — Garry Tan’s “AI software factory”: 23 skills + role-agents; uses GBrain + Conductor · source · github.com
  • claude-to-speech — Claude Code plugin: ElevenLabs TTS for responses via hooks · source · github.com
  • ai-job-search — Claude Code job-application framework (skills + slash commands + drafter–reviewer subagents; output-grounded PDF verification); harness pattern in a non-coding vertical · source · github.com
  • lathe — Claude Code/Cursor/Codex skills that generate hands-on tutorials you work through by hand; the anti-automation “teach, don’t do for you” augment pole · source · github.com
  • renwei-writing — 人味儿写作: a Cola skill that edits text without erasing the author’s voice; augment pole applied to writing; Wikipedia “Signs of AI writing” post-edit checklist · source · T1 · github.com
  • pm-skills — PM Skills Marketplace: 68 skills / 9 plugins / 42 chained workflows for product management; cross-vendor (Claude Code/Cowork/Codex/Gemini/Cursor) · source · github.com
  • seekdb — OceanBase’s “AI-native state store for agents”: MySQL-compatible, hybrid vector+FTS, COW FORK/MERGE sandboxes (Apache-2.0); the memory-substrate layer as a product · source · github.com
  • tgpt — multi-provider terminal LLM chat frontend (Go, GPL-3.0); ~10 backends incl. free tiers + Ollama; the non-agentic boundary of the terminal-AI ecosystem · source · T1 · github.com
  • arrow-js — reactive UI framework for agent-generated interfaces; WASM sandboxing for safe browser-side code execution; < 5 kb, no build step · source · T1 · arrow-js.com
  • agent-starter-pack — Google Cloud toolkit for building/deploying production agents from templates (Cloud Run / Agent Engine, eval, CI/CD); the deploy/operate end of the harness · source · T1 · googlecloudplatform.github.io

TechArticle (sources)

  • orchestration-mode — Claude API doc: multi-agent fan-out via mid-conversation steering · source · src: orchestration-mode.md
  • hermes-profile-builder — Nous: web dashboard composing an agent from identity/model/skills/MCP (the “agent = composition” thesis as a form) · source · T3 · marktechpost.com
  • agent-loops-verification — Arjun Iyer/Signadot (TNS): loops replace prompts, so verification becomes the bottleneck — and for cloud-native code it’s a runtime problem · source · T4 · thenewstack.io

BlogPosting / Article (sources)

  • adk-agents-with-skills — Google guide: building ADK agents with skills; the open skills spec · source · developers.googleblog.com
  • adk-long-running-agents — Google guide: durable long-running ADK agents (pause/resume, state machines, persistent sessions) · source · developers.googleblog.com
  • claude-code-channels-vs-openclaw — event-driven vs self-driven Claude agents; the deployment axis · source · aimaker.substack.com
  • claude-code-best-practices — practitioner lessons; BMAD vs plan mode, CLAUDE.md, model selection · source · ranthebuilder.cloud
  • agents-never-do-alone — TDS: what agents shouldn’t do autonomously; reversibility matrix + AGENTS.md/blocked_commands.md/two-agent review · source · towardsdatascience.com
  • langchain-custom-harness — LangChain: agent = model + harness; middleware as the composition unit; task-harness fit · source · langchain.com
  • ant-cli — hands-on: Anthropic’s ant CLI deploys/manages Claude Managed Agents from the terminal (model provider → agent platform) · source · medium.com
  • claude-skills-ppc — Claude Skills for PPC; skills × MCP = agency; system-designer framing · source · searchengineland.com
  • spec-driven-ai-tools — comparison of BMAD / Spec-Kit / OpenSpec · source · src: spec-driven-ai-tools.md
  • agents-that-build-agents-ms — Microsoft SKILL-first blueprint (Agent Framework + Foundry); Foundry adopts the agentskills.io spec (SKILL.md, MCP Resources/SEP-2640) · source · techcommunity.microsoft.com

Book (sources)

  • second-son-house-of-bells — the shipped novel from the autonovel pipeline, bylined “Claude Hermes” (PDF/ePub/audiobook); the reader-facing artifact end of the loop thesis · source · T2 · nousresearch.com

SoftwareApplication

  • adk — Google’s Agent Development Kit; loads skills via the open agentskills.io spec
  • claude-cowork — interactive plugin host for the FSI agents
  • claude-agent-sdk — Anthropic’s official SDK: the Claude Code harness as a Python/TS library (tools, loop, hooks, subagents, MCP) · source
  • memory-vault — open-source MCP server: Postgres + pgvector persistent memory for Claude Code; externalizes context lost to auto-compaction; the lightweight pole of agent-memory vs seekdb · source · T3 · makeuseof.com

WebAPI

  • claude-managed-agents — headless /v1/agents deployment surface for the FSI agents
  • a2a-protocol — A2A (Agent2Agent): cross-vendor agent↔agent interop standard (Agent Cards, tasks, JSON-RPC); Google→Linux Foundation; complements MCP (agent↔tool) · source

Organization

  • microsoft — Microsoft Foundry (agent factory) + open-source Agent Framework SDK; third agent-platform vendor; adopts agentskills.io + MCP Resources; also microsoft-scout
  • langchain — create_agent / Deep Agents; harness-from-middleware (the fourth agent-platform vendor)
  • oceanbase — distributed-DB team (Alipay/Taobao) behind seekdb; the agent-infrastructure/state-store vendor stratum
  • signadot — Kubernetes-native ephemeral environments for runtime/behavioural verification; the loop-verification substrate (platform-ops seam)
  • nous-research — AI lab behind the Hermes line; owner/developer of hermes-agent + hermes-profile-builder (open-source, model-agnostic, self-driven pole)
  • zo-computer — the platform zouroboros is built natively on; thin node (host platform), evidence-only

Person

Synthesis

  • synthesis — the evolving thesis: skills → harness → orchestration/deployment

Bridge nodes (live in sibling wikis, linked cross-wiki)

gbrain · agent-skills · compound-engineering · model-context-protocol · anthropic · claude-opus-4-8 · garry-tan (research-wiki) · google (llm-providers-wiki — canonical Google node; agent-platform-vendor facet noted there)