Our post-incident review process keeps getting skipped after critical outages. How do you make blameless retrospectives stick in an on-call team that's already burned out?
We've had three major incidents in the last quarter. Each time we agreed to hold a blameless post-mortem within 48 hours. Twice it never happened, and once it was a rushed 30-minute call that produced no action items. The team is already stretched thin by on-call rotations. Leadership wants 'accountability', but the engineers see retrospectives as just another meeting that generates more work. How do you actually make this sustainable? We've tried dedicated retro owners, async templates, and even tying retros to sprint goals, but nothing sticks long-term.