CI dry-run revealed an architectural ordering bug: db-init/004 and
db-init/005 ALTER TABLE the Directus-managed tables (organization_users,
events, etc.), but db-init runs BEFORE schema-apply creates those
tables. On a fresh CI Postgres this fails with "relation does not
exist." Local dev never tripped this because we'd created the tables
via MCP first.
Fix: introduce a post-schema migration phase. Two db-init runs in the
entrypoint, with schema-apply in between:
  1. apply-db-init.sh db-init/       → positions hypertable + faulty
                                       column (tables Directus does
                                       NOT manage)
  2. schema-apply.sh                 → creates Directus-managed tables
                                       from snapshots/schema.yaml
  3. apply-db-init.sh db-init-post/  → composite UNIQUE constraints on
                                       the Directus-managed tables
  4. directus bootstrap
  5. directus start
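
The ordering above can be sketched as a small entrypoint skeleton. This
is an illustration, not the actual entrypoint.sh: the relative script
paths, the run_step helper, and the DRY_RUN switch are assumptions, and
DB_INIT_DIR is passed as an environment variable because the runner is
described as accepting it.

```shell
#!/bin/sh
# Sketch of the five-step boot order. Paths and DRY_RUN are hypothetical.
set -eu

# Run a command, or only print it when DRY_RUN=1 (illustration helper).
run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

boot_sequence() {
  # 1. pre-schema: tables Directus does NOT manage
  run_step env DB_INIT_DIR=db-init ./apply-db-init.sh
  # 2. create the Directus-managed tables from the snapshot
  run_step ./schema-apply.sh
  # 3. post-schema: constraints on the Directus-managed tables
  run_step env DB_INIT_DIR=db-init-post ./apply-db-init.sh
  # 4. and 5. hand over to Directus
  run_step directus bootstrap
  run_step directus start
}
```

Calling boot_sequence with DRY_RUN=1 prints the five commands in order
without touching a database, which is a cheap way to review the
sequence. Because of set -e, a failing step stops the real run before
the later steps execute.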
Files moved:
  db-init/004_junction_unique_constraints.sql →
    db-init-post/001_junction_unique_constraints.sql
  db-init/005_event_participation_unique_constraints.sql →
    db-init-post/002_event_participation_unique_constraints.sql
Each ALTER TABLE in the post-schema migrations is now wrapped in a
pg_constraint existence guard for idempotency. This handles the dev DB,
where the constraints already exist (from the original 004/005 runs
plus the manual psql recovery during task 1.5's destructive-apply
incident). The old 004/005 rows in migrations_applied become orphans:
harmless, since the moved files are tracked under their new basenames.
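
A minimal sketch of what such a guard can look like. The real
migrations are plain .sql files; the shell function here only prints
the guarded statement so it can be inspected or piped to psql. The
function name, the constraint name, and the column list in the usage
comment are hypothetical.

```shell
# Print an idempotent "add UNIQUE constraint" statement.
# Args: table, constraint name, comma-separated column list.
guarded_unique_sql() {
  table=$1
  constraint=$2
  columns=$3
  # \$\$ keeps the shell from expanding $$ inside the heredoc.
  cat <<EOF
DO \$\$
BEGIN
  IF NOT EXISTS (
    SELECT 1 FROM pg_constraint
    WHERE conname = '${constraint}'
      AND conrelid = '${table}'::regclass
  ) THEN
    ALTER TABLE ${table}
      ADD CONSTRAINT ${constraint} UNIQUE (${columns});
  END IF;
END
\$\$;
EOF
}

# Hypothetical usage (constraint and column names are made up):
#   guarded_unique_sql organization_users org_user_uq \
#     "organization_id, user_id" | psql -v ON_ERROR_STOP=1
```

The guard looks the constraint up by name on the specific table, so a
second run finds the existing row in pg_constraint and skips the
ALTER TABLE instead of failing.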
Updates:
- Dockerfile: COPY db-init-post into the image
- entrypoint.sh: 4-step → 5-step flow with the post-schema run between
  schema-apply and bootstrap
- .gitea/workflows/build.yml: dry-run chains all three pre-boot scripts
  (pre-schema → schema-apply → post-schema); path filter includes
  db-init-post/**
- Task specs 1.4 and 1.5 Done sections: updated to reference the new
  db-init-post/ path (db-init/004 → db-init-post/001, etc.)
The reusable runner script (apply-db-init.sh) didn't need to change —
it already accepts DB_INIT_DIR and uses just the basename for the
guard-table key. The two phases share migrations_applied; filenames
don't collide because pre-schema and post-schema use distinct
descriptive names.
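
Since both phases share one guard table keyed on basename, a collision
check is easy to sketch. The helper names below are hypothetical and
the snippet is not part of apply-db-init.sh; it only illustrates why
the numeric prefix alone does not decide whether two files clash.

```shell
# Guard-table key for a migration file: the basename only.
migration_key() {
  basename "$1"
}

# Print every key that exists in both directories (one per line).
# Args: pre-schema dir, post-schema dir.
colliding_keys() {
  for f in "$1"/*.sql; do
    [ -e "$f" ] || continue        # directory has no .sql files
    key=$(migration_key "$f")
    if [ -e "$2/$key" ]; then
      echo "$key"
    fi
  done
}

# Usage: colliding_keys db-init db-init-post
# Empty output means the two phases can share migrations_applied.
```

Both directories can contain a 001_ file; the keys only clash when the
whole filename is identical.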
Phase 1 is still "done" — this is a Phase 1 architectural correction
exposed by the CI dry-run, not a new task.
The runner host typically has another Postgres listening on 5432
(local dev stack, stage instance, etc.), which made the services:
postgres container fail at start with "port already allocated."
Remap the host-side port from 5432:5432 to 15432:5432. The service
container still listens on 5432 internally; only the runner host
binding changes. Dry-run's DB_PORT updated to 15432 to match.
--network host semantics preserved: DB_HOST=localhost reaches the
service on the runner's loopback at the new port.
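
For orientation, the dry-run side of this change might look roughly
like the command printed below. This is a sketch, not the workflow's
actual step: the function name, the image name, and the script chain
passed in the usage comment are assumptions; only the docker flags and
the DB_HOST/DB_PORT values come from the description above.

```shell
# Print (do not execute) a dry-run command.
# Args: image, script chain, optional host-side port (default 15432).
dry_run_cmd() {
  image=$1
  chain=$2
  port=${3:-15432}
  printf 'docker run --rm --network host --entrypoint bash'
  # With --network host the service is reached via the runner's loopback.
  printf ' -e DB_HOST=localhost -e DB_PORT=%s' "$port"
  printf " %s -c '%s'\n" "$image" "$chain"
}

# Hypothetical usage:
#   dry_run_cmd directus:ci 'apply-db-init.sh && schema-apply.sh'
```

Only the port the dry-run connects to changes; inside the service
container Postgres still listens on 5432.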
Why we still need a Postgres container at all: the dry-run gate
applies db-init/*.sql migrations and the directus schema snapshot
against a real DB to catch breakage before pushing the image. No
Postgres = no validation = the gate is bypassed.
Inline comment in the workflow now explains the choice; task spec's
Done section captures the correction so future readers don't
re-discover this.
.gitea/workflows/build.yml builds the directus image on path-filtered
pushes to main and validates the boot pipeline against a throwaway
Postgres before pushing the image to the registry. The dry-run is the
gate that catches snapshot drift, broken db-init scripts, or
incompatible schema changes before they reach stage.
Workflow shape (mirrors processor's CI but tailored to Directus):
- Path filter: snapshots/, db-init/, extensions/, scripts/,
  entrypoint.sh, Dockerfile, the workflow file itself.
  Docs-only commits (.planning/, README.md, compose.dev.yaml,
  package.json) do NOT trigger CI.
- Throwaway Postgres via services: block, pinned to the same
  timescale/timescaledb-ha:pg16.6-ts2.17.2-all tag as compose.dev.yaml.
- Plain `docker build` (NOT build-push-action) so the image stays in
  the local daemon for the subsequent docker run dry-run.
- Dry-run: --network host + --entrypoint bash to override the upstream
  entrypoint and run only apply-db-init.sh && schema-apply.sh.
  Skips bootstrap and pm2-runtime; the schema apply is the gate.
- Two image tags: :main (mutable) and :<sha> (immutable).
- Optional Portainer webhook gated on secret presence; curl -fsS so a
  misconfigured URL fails the step explicitly.
Spec corrections folded in (the spec's draft had two contradictions
that would have failed at runtime):
  1. DB_HOST=localhost (not 'postgres'). With --network host, service
     containers are reachable on the runner's loopback by their port
     mapping, NOT by service name. Service-name resolution only works
     for containers attached to the same Docker network as the
     service; --network host puts the dry-run container in the host's
     network namespace instead.
  2. health-retries 20 (not 10). timescaledb-ha:*-all does more init
     work at boot than vanilla postgres; the 50s that 10 retries
     allowed isn't always enough.
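
The retry budget in item 2 can be illustrated with a generic wait loop.
This is not how the workflow waits (that is handled by the service
container's health options); it is a sketch of the same arithmetic,
where total wait is roughly retries times interval. The wait_for name
is hypothetical, and the 5s interval in the usage comment is inferred
from "10 retries, 50s", not taken from the workflow file.

```shell
# Retry a probe command until it succeeds or the budget runs out.
# Args: retries, interval in seconds, then the probe command.
wait_for() {
  retries=$1
  interval=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$retries" ]; do
    if "$@"; then
      return 0                     # probe succeeded
    fi
    sleep "$interval"
    attempt=$((attempt + 1))
  done
  return 1                         # budget exhausted
}

# Hypothetical usage: up to 20 probes, 5s apart (about 100s in total).
#   wait_for 20 5 pg_isready -h localhost
```

Doubling the retries doubles the worst-case wait but costs nothing when
the database comes up quickly, since the loop returns on the first
successful probe.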
Operator action required in the Gitea repo Settings before first run:
configure REGISTRY_USERNAME and REGISTRY_PASSWORD secrets (required for
push); optionally PORTAINER_WEBHOOK_URL (for auto-deploy).
Live verification deferred to first relevant commit. Documented in the
task spec's Done section: positive (clean snapshot → push succeeds)
and negative (malformed snapshot → halt before push) cases to validate
once CI runs.
ROADMAP marks 1.8 done. Phase 1 progress: 8/9 tasks complete (1.1–1.8);
only 1.9 (Rally Albania 2026 dogfood seed) remains before Phase 1 ships.