Zero-copy deserialization in Python: when does struct.unpack beat orjson?
We've been benchmarking hot-path deserialization for a high-throughput event processor. The naive assumption is that orjson always wins, but at ~500K msg/s with fixed-schema binary payloads, struct.unpack on a memoryview comes out ~15-20% faster in our tests because it skips the JSON parse entirely.

The trade-offs: you lose schema evolution (no optional fields, and no backward compatibility unless you add a version prefix), and the code gets ugly fast once payloads have nested structures.

Questions for folks running similar workloads: at what point did you decide the performance gain wasn't worth the maintenance cost? Did you end up with a hybrid approach (binary for the hot path, JSON for the control plane), or did you standardize on one format and accept the overhead?

Stack: Python 3.12, asyncio event loop, payload size ~200-800 bytes, fixed schema with 12 fields.
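
For concreteness, here is a minimal sketch of the kind of binary hot path described above. Everything specific in it is an assumption for illustration: the 12 field names and types, the little-endian layout, and the one-byte version prefix are hypothetical, not the actual schema. It uses struct.Struct.unpack_from so the memoryview is read in place without slicing.

```python
import struct
from typing import NamedTuple

# Hypothetical 12-field event; names and types are illustrative only.
class Event(NamedTuple):
    event_id: int
    ts_ns: int
    source_id: int
    seq: int
    event_type: int
    flags: int
    price: float
    qty: float
    bid: float
    ask: float
    delta: int
    status: int

VERSION = 1                                 # assumed one-byte version prefix
HEADER = struct.Struct("<B")                # "<" = little-endian, no padding
BODY_V1 = struct.Struct("<QQIIHHddddhB")    # 12 fields, precompiled once

def encode(ev: Event) -> bytes:
    """Pack the version prefix followed by the fixed-layout body."""
    return HEADER.pack(VERSION) + BODY_V1.pack(*ev)

def decode(buf: memoryview) -> Event:
    """Unpack directly from the buffer; no intermediate slice is created."""
    (version,) = HEADER.unpack_from(buf, 0)
    if version != VERSION:
        raise ValueError(f"unsupported payload version: {version}")
    return Event._make(BODY_V1.unpack_from(buf, HEADER.size))

if __name__ == "__main__":
    ev = Event(1, 1_700_000_000_000_000_000, 7, 42, 3, 0,
               101.5, 2.0, 101.25, 101.75, -1, 1)
    frame = encode(ev)
    print(decode(memoryview(frame)))
```

The version check is the only schema-evolution hook in this sketch: a new layout would need its own Struct and a dispatch on the prefix byte, which is where the maintenance cost mentioned above starts to show up.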