Coding
Open
Asked by m0ss
Question

Handling race conditions in distributed lock managers with Redis

We run a distributed task scheduler backed by Redis locks (the SET NX EX pattern) and have hit a subtle race. When a worker crashes mid-execution, its lock expires but the task is never marked failed, so another worker picks the task up while the original process is still limping along. Redlock helps, but it adds latency we can't afford at 200ms p99. How do you handle the gap between lock expiry and actual task completion? We're considering a two-phase approach (a short-TTL lock plus heartbeat extension), but that adds complexity to every worker. We're curious which patterns have held up in production at scale.
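To make the short-TTL-plus-heartbeat idea concrete, here is a minimal sketch of what each worker would have to carry. It is an illustration under stated assumptions, not our production code: `InMemoryStore` is a hypothetical stand-in for Redis so the example runs without a server, and `HeartbeatLock`, `extend_if_owner`, and `delete_if_owner` are made-up names. Against real Redis, the compare-and-extend and compare-and-delete steps would need to be atomic, for example via a Lua script.

```python
import threading
import time
import uuid


class InMemoryStore:
    """Hypothetical stand-in for Redis so the sketch runs without a server."""

    def __init__(self):
        self._data = {}  # key -> (value, expiry in monotonic seconds)
        self._mutex = threading.Lock()

    def _live_value(self, key):
        # Return the stored value, or None if the key is missing or expired.
        entry = self._data.get(key)
        if entry is None or entry[1] <= time.monotonic():
            self._data.pop(key, None)
            return None
        return entry[0]

    def set_nx_px(self, key, value, ttl_ms):
        # Models SET key value NX PX ttl_ms: succeeds only if the key is absent.
        with self._mutex:
            if self._live_value(key) is not None:
                return False
            self._data[key] = (value, time.monotonic() + ttl_ms / 1000)
            return True

    def extend_if_owner(self, key, value, ttl_ms):
        # Assumption: in real Redis this check-then-extend must be one atomic step.
        with self._mutex:
            if self._live_value(key) != value:
                return False
            self._data[key] = (value, time.monotonic() + ttl_ms / 1000)
            return True

    def delete_if_owner(self, key, value):
        # Only the holder whose token matches may release the lock.
        with self._mutex:
            if self._live_value(key) != value:
                return False
            del self._data[key]
            return True


class HeartbeatLock:
    """Short-TTL lock that a background thread keeps extending."""

    def __init__(self, store, key, ttl_ms=300):
        self.store, self.key, self.ttl_ms = store, key, ttl_ms
        self.token = uuid.uuid4().hex  # unique per holder
        self.lost = threading.Event()  # set if an extension is refused
        self._stop = threading.Event()
        self._thread = None

    def acquire(self):
        if not self.store.set_nx_px(self.key, self.token, self.ttl_ms):
            return False
        self._thread = threading.Thread(target=self._beat, daemon=True)
        self._thread.start()
        return True

    def _beat(self):
        # Renew at one third of the TTL until release() is called.
        while not self._stop.wait(self.ttl_ms / 3000):
            if not self.store.extend_if_owner(self.key, self.token, self.ttl_ms):
                self.lost.set()  # the worker should stop and not commit results
                return

    def release(self):
        self._stop.set()
        if self._thread is not None:
            self._thread.join()
        return self.store.delete_if_owner(self.key, self.token)
```

A worker would call `acquire()`, run the task, check `lost` before committing any result, and then call `release()`. If the process dies, the heartbeat stops and the lock expires after one TTL. The sketch does not cover a worker whose task code hangs while the heartbeat thread keeps running, since the lock would then stay held indefinitely.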

