How do you handle database migrations in a CI/CD pipeline with zero-downtime deploys?

Question

We're running a Python/FastAPI service with PostgreSQL. Our CI/CD deploys every 2-3 hours during the day. The problem: migration timing.

If the migration runs before the new code is live, the old code can break against the new schema. If it runs after, the new code crashes on startup because the schema it expects isn't there yet. We tried the expand/contract pattern (add column → dual-write → backfill → drop old), but it adds a lot of complexity for what should be a simple column rename.
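
For concreteness, here is a minimal sketch of how the expand/contract steps above break down for a single rename. The table and column names (`users`, `name`, `full_name`) and the `Phase` / `rename_column_phases` helpers are made up for illustration, and the "switch-reads" step is an extra step we assume is needed before the drop. Identifiers are interpolated without quoting, so this is a planning aid, not something to run against a real database.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str    # which step of expand/contract this is
    kind: str    # "sql" = run as a migration, "app" = ship as a code deploy
    action: str  # SQL statement, or a description of the code change


def rename_column_phases(table: str, old: str, new: str, col_type: str) -> list[Phase]:
    """Plan a column rename as separate phases, each shipped in its own deploy."""
    return [
        # Additive change: old code ignores the new (nullable) column.
        Phase("expand", "sql", f"ALTER TABLE {table} ADD COLUMN {new} {col_type}"),
        # New code keeps both columns in sync but still reads the old one.
        Phase("dual-write", "app", f"write both {old} and {new}; keep reading {old}"),
        # Copy over rows written before dual-write was live.
        Phase("backfill", "sql", f"UPDATE {table} SET {new} = {old} WHERE {new} IS NULL"),
        # Assumed extra step: nothing may depend on the old column before the drop.
        Phase("switch-reads", "app", f"read and write only {new}"),
        # Destructive change, run last and only once no release reads the old column.
        Phase("contract", "sql", f"ALTER TABLE {table} DROP COLUMN {old}"),
    ]


if __name__ == "__main__":
    for p in rename_column_phases("users", "name", "full_name", "text"):
        print(f"{p.name:12} [{p.kind}] {p.action}")
```

That is five separate rollouts for one rename, which is the complexity we're complaining about.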

Specific questions:
- Do you run migrations as a separate step before the deploy, or bundle them into the container entrypoint?
- How do you handle backward-incompatible changes (dropping a column the old code still reads)?
- Is anyone using Django-style migration tooling in a non-Django codebase? We're looking at Alembic but wondering if there's something lighter.

We've had two incidents this quarter where a migration rolled out but the deployment was rolled back, leaving the DB in a newer schema than the running code expected.
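
To make that failure mode concrete, here is a rough sketch of the kind of check we're considering, assuming each release declares the range of schema revisions it can run against. The integer revision numbers and both function names are hypothetical. Alembic's revision ids are not ordered integers, so a real version would need its own mapping from revision to position in the history.

```python
def schema_is_compatible(db_revision: int, min_supported: int, max_supported: int) -> bool:
    """True if the running code can work against the schema currently in the DB."""
    return min_supported <= db_revision <= max_supported


def safe_to_roll_back(db_revision: int, previous_release_max: int) -> bool:
    """A rollback is only safe if the previous release tolerates the current schema."""
    return db_revision <= previous_release_max


if __name__ == "__main__":
    # Hypothetical numbers: the DB was migrated to revision 43,
    # but the previous release only supports revisions 40 through 42.
    print(schema_is_compatible(43, 40, 42))  # False
    print(safe_to_roll_back(43, 42))         # False
```

In both of our incidents a check like `safe_to_roll_back` would have returned False before the rollback went out, but we don't know whether gating rollbacks this way is common practice.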

Direct answers and proposed approaches

Risks, gaps, and constructive pushback