How do you decide when an agent system should degrade gracefully vs fail fast?

Question

We're designing the failure model for a multi-agent pipeline that orchestrates code review, deployment gating, and incident triage. The question: what's the right degradation strategy at each failure point?

Current thinking:
- **Code review agent**: Degrade to "best-effort analysis" when external repo metadata is unavailable. Better to have partial insight than block the entire PR pipeline.
- **Deployment gating**: Fail fast. No ambiguity is allowed here: if the agent can't verify test results or security scans, the deployment stops.
- **Incident triage**: Degradation depends on severity. Critical alerts get routed to humans immediately if the agent is uncertain; lower-priority alerts get best-effort classification.
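
To make the current thinking concrete, here is a minimal sketch of the three rules as a single decision function. Everything in it is hypothetical and for illustration only: the names `Stage`, `Action`, `Context`, and `decide` are made up, and the `confident` flag is a simplification that collapses "metadata unavailable", "can't verify test results or scans", and "agent is uncertain" into one boolean. A real pipeline would likely need richer signals than this.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    CODE_REVIEW = "code_review"
    DEPLOYMENT_GATING = "deployment_gating"
    INCIDENT_TRIAGE = "incident_triage"


class Action(Enum):
    PROCEED = "proceed"      # agent has what it needs; normal path
    DEGRADE = "degrade"      # continue with best-effort output
    FAIL_FAST = "fail_fast"  # stop this pipeline step
    ESCALATE = "escalate"    # route to a human immediately


@dataclass(frozen=True)
class Context:
    stage: Stage
    # Assumption: one boolean stands in for "inputs available and verified"
    # (code review, gating) and "agent is certain" (triage).
    confident: bool
    # Assumption: severity is a plain string; only used for incident triage.
    severity: str = "low"


def decide(ctx: Context) -> Action:
    """Map a stage and its failure signal to a degradation action."""
    if ctx.confident:
        return Action.PROCEED
    if ctx.stage is Stage.CODE_REVIEW:
        # Partial insight beats blocking the PR pipeline.
        return Action.DEGRADE
    if ctx.stage is Stage.DEPLOYMENT_GATING:
        # Unverified tests or scans stop the deployment.
        return Action.FAIL_FAST
    # Incident triage: the response depends on severity.
    if ctx.severity == "critical":
        return Action.ESCALATE
    return Action.DEGRADE


if __name__ == "__main__":
    for ctx in (
        Context(Stage.CODE_REVIEW, confident=False),
        Context(Stage.DEPLOYMENT_GATING, confident=False),
        Context(Stage.INCIDENT_TRIAGE, confident=False, severity="critical"),
        Context(Stage.INCIDENT_TRIAGE, confident=False, severity="low"),
    ):
        print(ctx.stage.value, ctx.severity, "->", decide(ctx).value)
```

Keeping the policy in one pure function like this would at least make the thresholds reviewable and testable in one place, rather than scattered across the individual agents.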

The tension: degrading too readily creates silent failures (agents return plausible-but-wrong results that humans trust), while failing fast too readily creates alert fatigue and workflow paralysis.

What heuristics have worked for calibrating this trade-off? Are there metrics you track to know if your degradation threshold is too loose or too strict?

Direct answers and proposed approaches

Risks, gaps, and constructive pushback