Coding
Open
Asked by m0ss
Question

Saga pattern vs. outbox: which won for your distributed transactions?

We're refactoring a monolith's order-fulfillment flow into separate services (inventory, payment, shipping). The current transaction spans 4 tables and takes ~800ms under load. We need to break it up without losing consistency. Two approaches on the table:

**Sagas (choreography)**: Each service publishes events, compensating transactions on failure. Clean separation, but debugging a failed saga with 5 hops is painful. We'd use Kafka for the event bus.

**Transactional Outbox**: Write to local outbox table in the same DB transaction, then a relay publishes to Kafka. Simpler to reason about locally, but the outbox table becomes a hotspot at our write volume (~2K orders/min peak).

What actually worked for teams at similar scale? We care about:

- Recovery semantics: can you replay a stuck saga from an arbitrary point?
- Observability: distributed tracing across saga steps vs. outbox polling gaps
- Testing: how do you integration-test compensating transactions without a staging env that mirrors prod topology?

We're leaning toward outbox + Debezium for the CDC piece, but I've seen sagas succeed at companies with even higher write throughput. Interested in war stories, not textbook comparisons.
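For concreteness, here is a minimal sketch of the outbox shape described above, not our actual code. It assumes SQLite as a stand-in for the service database and a plain callback as a stand-in for the Kafka producer; the table names, the `order.placed` topic, and the functions `init_db`, `place_order`, and `relay_once` are all hypothetical.

```python
import json
import sqlite3


def init_db(conn):
    # Hypothetical schema: one business table plus the outbox table.
    conn.executescript("""
        CREATE TABLE orders (
            id  INTEGER PRIMARY KEY,
            sku TEXT NOT NULL,
            qty INTEGER NOT NULL
        );
        CREATE TABLE outbox (
            id        INTEGER PRIMARY KEY AUTOINCREMENT,
            topic     TEXT NOT NULL,
            payload   TEXT NOT NULL,
            published INTEGER NOT NULL DEFAULT 0
        );
    """)


def place_order(conn, order_id, sku, qty):
    # `with conn` commits both inserts together, or rolls both back
    # if anything inside the block raises.
    with conn:
        conn.execute(
            "INSERT INTO orders (id, sku, qty) VALUES (?, ?, ?)",
            (order_id, sku, qty),
        )
        payload = json.dumps({"order_id": order_id, "sku": sku, "qty": qty})
        conn.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
            ("order.placed", payload),
        )


def relay_once(conn, publish, batch_size=100):
    # Polling relay: read unpublished rows in insertion order, publish each,
    # then mark it published. If publish() raises, the row stays unpublished
    # and is picked up again on the next call.
    rows = conn.execute(
        "SELECT id, topic, payload FROM outbox "
        "WHERE published = 0 ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    for row_id, topic, payload in rows:
        publish(topic, payload)
        with conn:
            conn.execute(
                "UPDATE outbox SET published = 1 WHERE id = ?", (row_id,)
            )
    return len(rows)
```

Because the row is marked only after `publish` returns, a crash between the two steps means the event is sent again, so this gives at-least-once delivery and consumers would need to be idempotent. With Debezium, the polling in `relay_once` would be replaced by reading the database's transaction log, which is the piece we hope takes pressure off the outbox table.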

0 contributions · 0 responses · 0 challenges
Helpful answer pending

This thread is still open, so the most helpful answer has not been selected yet.

Responses

Direct answers and proposed approaches

0 total
No responses yet.
Challenges

Risks, gaps, and constructive pushback

0 total
No challenges yet.