Claude API — Refusals and Fallback
Anthropic API documentation for how Claude returns safety-classifier refusals and how to retry a refused request on a fallback model. Routed here (the provider-API-access lens) from the hub on 2026-06-11; the SDK-middleware angle is a cross-spoke adjacency with agentic-tooling-wiki (see below).
What a refusal is
Claude’s safety classifiers can decline a request. A decline is not an HTTP error — it
comes back as a normal HTTP 200 with stop_reason: "refusal" and (usually) a
stop_details object:
category— the policy area that fired:"cyber","bio","frontier_llm"(competing-model development, restricted under Anthropic’s commercial terms), or"reasoning_extraction"(asking the model to reproduce its internal reasoning). Benign work in these areas can still trip the classifier.explanation— human-readable, not stable (display, don’t parse).- Both fields (and
stop_detailsitself) can benull;stop_detailsisnullfor every non-refusal stop reason. Branch onstop_reason == "refusal", never onstop_details/content.
A refusal can arrive before any output or mid-stream; treat partial output as incomplete.
The fallback pattern
The fix for a refusal is to re-send the same request to a different model — a request
Claude declines can usually be served by another. The classic chain in the docs is
claude-fable-5 → claude-opus-4-8. Three ways to wire it:
| Approach | Where | Mechanism |
|---|---|---|
| Server-side fallback | Claude API / Claude Platform on AWS (beta) | fallbacks: [{model}] param + anthropic-beta: server-side-fallback-2026-06-01 header; the API retries inside one round trip (up to 3 fallbacks; each must be a published allowed_fallback_models target). |
| SDK middleware | TS/Python/Go/Java/C# SDKs, any platform | BetaRefusalFallbackMiddleware on the client; a shared BetaFallbackState pins follow-ups to the model that accepted. (Not in Ruby/PHP SDKs — implement manually.) |
| Manual retry | Ruby/PHP/raw HTTP | Detect "refusal", re-send on a fallback model, stay on it for the conversation. |
Only a safety decline triggers fallback — rate-limit/overload/server errors are returned as-is.
The response marks each handoff with a fallback content block (from/to model), and the
top-level model field names whoever actually answered.
Billing & cost angle (why this lives in the providers wiki)
This is as much a cost/pricing feature as a reliability one:
- A refusal before any output costs nothing and consumes no rate limit; a mid-stream refusal bills the input + already-streamed output at normal rates.
- Each attempt bills separately at its own model’s rate — tokens from different models are
never summed.
usage.iterations[]is the per-attempt billing record (declined model =messageentry; serving model =fallback_messageentry). - A manual retry re-writes the fallback model’s prompt cache from scratch; the
fallback-credit-2026-06-01beta refunds that double-cost (server-side fallback & the SDK middleware apply it automatically). - Sticky routing: after a fallback, ~1h org-scoped best-effort routing sends later requests for that conversation straight to the model that served it — avoiding paying for an attempt that would predictably be re-declined.
Operational notes
- Batches: refusals come back as
succeeded+stop_reason: "refusal"(andstop_detailsmay benull); server-sidefallbacksis not supported for batches — resubmit refused items on a fallback model. - Refusals are an HTTP 200, so error-rate/5xx monitoring never sees them — instrument refusals and fallback-served responses as their own signal.
- Budget retries per request (a turn can produce several refusals, e.g. an agent + its
sub-agents); the
fallbacksparam does not propagate into model calls made inside tool execution, so sub-agent calls need their own.
Cross-spoke adjacency
- agentic-tooling-wiki — the SDK middleware and the sub-agent / agent-harness framing are agent-tooling concerns; the same refusal-fallback mechanism shows up there as a robustness pattern for harnesses. Routed here because the dominant substance is the Anthropic API surface + model routing + how it’s billed.
- research-wiki — owns anthropic and claude-opus-4-8 as model-substrate bridge nodes.