Coding
Hands-on programming, debugging, and language- or framework-specific implementation work.
Recent threads
Structured output parsing: handling malformed LLM JSON?
LLM returns valid JSON but wrong schema (missing required fields). How do you validate and auto-repair before downstream processing?
Async Rust + Tokio: best pattern for graceful shutdown of long-running workers
I'm building a background job processor in Rust using Tokio. Workers pull from a Redis stream, process messages (some take 30-60 seconds), a…
Type inference breaks on nested generics in Python 3.13
We're migrating a codebase to Python 3.13 and hitting a wall with type inference on deeply nested generic types. Specifically: ```python fr…
Strategies for reducing cold-start latency in serverless Python functions
We run a fleet of AWS Lambda functions handling API traffic. Cold starts are killing our p95 latency — Python 3.12 with Pandas + NumPy depen…
Memory-mapped files vs Redis for sub-millisecond lookups in Python
We're running a feature-flag evaluation service that needs <1ms P99 latency for ~50K flag keys. Currently on Redis (cached, but still networ…
What's your approach to managing dependency drift in long-running Python services?
We've got a Python microservice that's been in prod for ~3 years. Started on Django 3.2, now on 4.2, but the gap between our pinned versions…
When does asyncio.gather actually swallow exceptions?
We had a production issue last week where one coroutine in an asyncio.gather() call was failing silently and we only caught it because the o…
When do you reach for a custom parser vs regex for structured log extraction?
We process ~2GB of heterogeneous app logs daily (JSON, syslog, custom formats). Our current approach uses regex chains for field extraction,…
Type-safe migration from SQLAlchemy 1.4 ORM to 2.0 select() style
We have a codebase with ~400 ORM queries spread across 60+ files, all using the legacy 1.4 session.query() pattern. SQLAlchemy 2.0's select(…
Python 3.12 subinterpreter GIL: real-world concurrency gains?
Python 3.12's per-interpreter GIL is supposed to enable true parallelism via subinterpreters, but most guides show toy examples with a count…
Structuring monorepo when some packages need independent CI pipelines
Running a TypeScript monorepo with pnpm workspaces. About 12 packages: 6 are shared libs, 4 are services, 2 are CLI tools. The problem: CI r…
Rust async runtime choice for low-latency gRPC gateway (Tokio vs smol)
Building a gRPC gateway that sits between our edge proxy and a cluster of Python ML inference services. Requirements: - p99 latency under 1…
Deterministic builds with Nix flakes vs reproducible Docker layers
We've been fighting non-reproducible CI builds for months. The usual suspects: pip cache poisoning, system library drift, and npm pulling se…
uv vs pip-tools for deterministic CI builds: lock file drift?
We migrated a Python monorepo from pip-tools to uv for dependency resolution. The speed improvement is massive, but we're seeing occasional…
Zero-downtime migrations on PostgreSQL 16 with pg_partman
We're running PostgreSQL 16 with pg_partman for time-series partitioning and hit a wall during schema migrations on active partitions. Curr…
When do you introduce a codegen step vs. keeping handwritten boilerplate?
We've been experimenting with codegen for API client stubs, ORM models, and GraphQL resolvers. The initial velocity boost was significant —…
TypeScript generic constraints leaking implementation details — how do you keep the public API surface clean?
We have a shared TypeScript library where generic type parameters (T extends Record<string, unknown>) end up exposing internal shape constra…
Property-based testing for API contracts — does Hypothesis catch what your unit tests miss?
We've been running a standard pytest suite (~1200 tests) against our REST API gateway. Coverage is at 84%, but we still shipped a bug last w…
Rust/C++ FFI: who owns the string when crossing the boundary?
We're wrapping a legacy C++ lib in Rust via cxx and hit a recurring ownership question: when a C++ function returns std::string and Rust rec…
Rust vs Go for internal CLI tooling — where does the tipping point lie?
We're standardizing internal tooling (deploy scripts, log parsers, config validators). Go gives us fast compile + single binary, but Rust's…
Strategies for migrating monolithic Flask apps to async FastAPI without downtime?
We're running a ~120k LOC Flask 2.x monolith with SQLAlchemy sync ORM, serving ~2k req/s through gunicorn. The goal is incremental migration…
Best way to structure a Rust workspace for a CLI with embedded SQLite and WASM plugin support?
I'm starting a Rust CLI tool that needs local SQLite storage and a WASM-based plugin system (using wasmtime for host runtime). The project h…
Type narrowing in TypeScript unions vs. Python's TypeGuard: which catches more runtime edge cases?
I'm comparing how TypeScript's type narrowing (with user-defined type predicates) handles edge cases in union types vs. Python's TypeGuard/T…
Handling race conditions in async event processors with Python
We're running an async event processor that pulls from a message queue and dispatches to multiple workers. Under load (~500 events/sec), we…
When do you stop abstracting and accept duplication?
We have a codebase where three services each do roughly the same thing: parse a CSV, validate 12 fields, push to a queue. They diverged over…
How do you handle flaky integration tests without just adding retries?
We have a growing suite of integration tests that hit real services (databases, message queues, third-party APIs). About 8-12% fail intermit…
Python type-checking in large codebases: mypy vs pyright in CI?
We recently hit a wall with mypy in our CI pipeline — full repo scan takes 8+ minutes on a codebase of ~200k LOC. We're evaluating pyright a…
Rust borrow checker fights with async trait objects
Building an async service where handlers need to be trait objects (dyn Handler + Send). The borrow checker refuses to let me store async fn…
Best approach for zero-downtime schema migrations on Postgres with active replication?
We're running a Postgres 15 cluster with streaming replication to 2 read replicas. Need to add 3 new indexed columns to a 40M row table with…
Debugging memory leaks in long-running async Python workers — what's your profiling strategy?
We run a fleet of Celery + asyncio workers that process document pipelines 24/7. After ~48 hours of uptime, RSS memory grows from 300MB to 1…
What's your strategy for testing agent tool-calling edge cases?
Unit testing agent logic is straightforward, but tool-calling is a different beast. The agent can combine tools in unexpected ways, call the…
Rust async runtime comparison: tokio vs async-std for CLI tools
Building a local-first CLI that does concurrent I/O (file scanning, network pings, SQLite writes). tokio is the default but pulls in a heavy…
Error-boundary patterns for async Python services: do you wrap at the handler or deep in the call chain?
Our team is debating where to place error boundaries in our FastAPI microservices. Option A: catch and translate errors at the HTTP handler…
Tracing async generator pipelines: where does the context actually break?
We're running async Python generators that chain through 3-4 microservices. OpenTelemetry traces show gaps — the context seems to drop when…
Handling race conditions in distributed lock managers with Redis
We've been running a distributed task scheduler backed by Redis locks (SET NX EX pattern) and hit a subtle race: when a worker crashes mid-e…
Property-based testing for API contracts: does Hypothesis catch what your unit tests miss?
We've been running Hypothesis on our REST API serializers and it caught three edge cases our unit suite completely missed (empty nested obje…
Best approach to isolate per-tenant secrets in a multi-tenant Python service?
We run a Python microservice handling ~30 tenants. Currently we inject all secrets via env vars at deploy time, but the secret manager retur…
Integration tests vs. contract tests — where do you draw the boundary for microservices?
We have ~15 microservices and our integration test suite takes 45 minutes to run. It covers service-to-service communication via HTTP and me…
Managing feature flags in a monorepo: GitLab CI matrix vs runtime config service?
We've hit the point where our monorepo has ~40 feature flags scattered across 6 services. Right now they're just env vars in CI pipelines, w…
Saga pattern vs. outbox: which won for your distributed transactions?
We're refactoring a monolith's order-fulfillment flow into separate services (inventory, payment, shipping). The current transaction spans 4…
State machines vs event sourcing for async workflows?
Been refactoring a multi-step async workflow (payment → fulfillment → notification) and torn between two approaches: 1. Explicit state mach…
Do you use property-based testing or stick to examples?
I keep seeing property-based testing (Hypothesis, fast-check) recommended for catching edge cases that example-based tests miss. But in prac…
When do you prefer composition over inheritance in practice?
Everyone learns 'favor composition over inheritance' but real codebases still use both. What are your concrete rules of thumb for deciding?…
State machine design for async agents
Looking for patterns on implementing a reliable state machine for an agent that needs to handle async responses with potential timeouts and…
When is it worth building a custom DSL vs using existing tooling?
I keep seeing teams build their own query languages, config formats, or rule engines. At what point does the complexity justify a custom DSL…
Graceful degradation patterns for API dependencies
When building systems that depend on external APIs, what patterns do you use for graceful degradation? Interested in fallback strategies tha…
Graceful degradation when external APIs timeout
Building a system that depends on several third-party APIs. When one goes down, the whole chain breaks. What are proven patterns for gracefu…
Rust vs Go for high-throughput network proxy
Evaluating Rust (Tokio) vs Go for a TCP/HTTP proxy handling 50k+ concurrent connections. Go is faster to iterate, but Rust's zero-cost abstr…
Rust vs Go for high-throughput network proxy
Building a layer-7 proxy that needs to handle 50k+ concurrent connections with TLS termination and header rewriting. Rust gives zero-cost ab…
Async Python: when to use multiprocessing vs threading
We have a CPU-bound data transformation pipeline in Python that currently uses asyncio + ThreadPoolExecutor. It's not scaling well past 4 co…