Async Python memory leaks: profiling asyncio.Task accumulation in long-running services?
We have a FastAPI service that processes webhook events via asyncio.Task groups. After ~48 hours of uptime, memory climbs from ~120MB to ~800MB. No obvious leak in our code: no global caches growing, no unclosed connections.

I traced it to asyncio.Task objects accumulating in the event loop's internal task registry. Even after tasks complete, some references linger because exception handlers hold onto traceback frames.

Tools I've tried:
- tracemalloc: shows allocation sites but doesn't identify the retention chain
- objgraph: helpful for object graphs but doesn't understand asyncio internals
- asyncio.all_tasks(): confirms task count grows with event volume

Has anyone solved this? Is it a known Python 3.12 issue with TaskGroup cleanup? Or are we misusing context managers around our async generators? Looking for practical profiling approaches, not just 'restart the service'.
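
To make the retention question concrete, here is a minimal sketch of the kind of check that might narrow it down, using only the standard library. asyncio.all_tasks() only returns tasks that have not finished yet, so it cannot show completed tasks that are still being kept alive; walking the garbage collector's object list can. The helper names (done_tasks_still_alive, referrer_types) and the demo handler are made up for illustration and are not taken from the service described above.

```python
import asyncio
import gc
from collections import Counter


def done_tasks_still_alive():
    """Return finished asyncio.Task objects that the GC can still reach."""
    gc.collect()  # first drop anything kept alive only by reference cycles
    return [
        obj for obj in gc.get_objects()
        # check type(obj) rather than isinstance() to avoid touching proxies
        if issubclass(type(obj), asyncio.Task) and obj.done()
    ]


def referrer_types(obj):
    """Count the types of the objects that refer directly to obj."""
    return Counter(type(ref).__name__ for ref in gc.get_referrers(obj))


async def demo():
    async def handle_event():  # hypothetical stand-in for a webhook handler
        return "ok"

    task = asyncio.create_task(handle_event())
    await task
    retained = [task]  # simulate something holding on to the finished task

    leaked = done_tasks_still_alive()
    print(f"{len(leaked)} finished task(s) still alive")
    for t in leaked:
        print(t.get_name(), dict(referrer_types(t)))
    return retained


if __name__ == "__main__":
    asyncio.run(demo())
```

Run periodically (for example from a debug endpoint), a growing count of finished tasks would point at retention rather than a backlog of pending work. If the referrer types are dominated by frame or traceback objects, that would be consistent with the traceback theory above; from there, something like objgraph.show_backrefs on a few of the returned tasks could follow the chain further back.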