Split db-init into pre-schema and post-schema phases

CI dry-run revealed an architectural ordering bug: db-init/004 and
db-init/005 ALTER TABLE the Directus-managed tables (organization_users,
events, etc.), but db-init runs BEFORE schema-apply creates those
tables. On a fresh CI Postgres this fails with "relation does not
exist." Local dev never tripped this because we'd created the tables
via MCP first.

Fix: introduce a post-schema migration phase. Two db-init runs in the
entrypoint, with schema-apply in between:

  1. apply-db-init.sh   db-init/        → positions hypertable + faulty
                                          column (tables Directus does
                                          NOT manage)
  2. schema-apply.sh                    → creates Directus-managed tables
                                          from snapshots/schema.yaml
  3. apply-db-init.sh   db-init-post/   → composite UNIQUE constraints on
                                          the Directus-managed tables
  4. directus bootstrap
  5. directus start
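The same ordering, as a minimal entrypoint sketch. This is illustrative, not the actual entrypoint.sh: the script paths are assumptions, but DB_INIT_DIR is the directory parameter the runner already accepts (see below).

```shell
#!/bin/sh
set -e

# 1. Pre-schema migrations: tables Directus does NOT manage.
DB_INIT_DIR=/directus/db-init /directus/apply-db-init.sh

# 2. Create the Directus-managed tables from snapshots/schema.yaml.
/directus/schema-apply.sh

# 3. Post-schema migrations: constraints on Directus-managed tables.
DB_INIT_DIR=/directus/db-init-post /directus/apply-db-init.sh

# 4 + 5. Hand off to Directus.
directus bootstrap
exec directus start
```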

Files moved:
  db-init/004_junction_unique_constraints.sql →
    db-init-post/001_junction_unique_constraints.sql
  db-init/005_event_participation_unique_constraints.sql →
    db-init-post/002_event_participation_unique_constraints.sql

Each ALTER TABLE in the post-schema migrations is now wrapped in a
pg_constraint existence guard for idempotency. This handles the dev DB
where the constraints already exist (from the original 004/005 runs +
the manual psql recovery during task 1.5's destructive-apply
incident). Old 004/005 rows in migrations_applied become orphans —
harmless.
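The guard pattern looks roughly like this (the constraint and table names below are one illustrative example; each migration defines its own):

```sql
-- Idempotent: skip the ALTER TABLE when the constraint already exists,
-- e.g. on the dev DB where the original 004/005 runs created it.
DO $$
BEGIN
  IF NOT EXISTS (
    SELECT 1
    FROM pg_constraint
    WHERE conname  = 'organization_users_organization_id_user_id_unique'
      AND conrelid = 'organization_users'::regclass
  ) THEN
    ALTER TABLE organization_users
      ADD CONSTRAINT organization_users_organization_id_user_id_unique
      UNIQUE (organization_id, user_id);
  END IF;
END
$$;
```

Running the migration twice is then a no-op, which is exactly what the shared migrations_applied guard plus the manual-recovery history requires.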

Updates:
- Dockerfile: COPY db-init-post into the image
- entrypoint.sh: 4-step → 5-step flow with the post-schema run between
  schema-apply and bootstrap
- .gitea/workflows/build.yml: dry-run chains all three pre-boot scripts
  (pre-schema → schema-apply → post-schema); path filter includes
  db-init-post/**
- Task specs 1.4 and 1.5 Done sections: updated to reference the new
  db-init-post/ path (db-init/004 → db-init-post/001, etc.)

The reusable runner script (apply-db-init.sh) didn't need to change —
it already accepts DB_INIT_DIR and uses just the basename for the
guard-table key. The two phases share migrations_applied; filenames
don't collide because pre-schema and post-schema use distinct
descriptive names.
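As an illustration of why the shared guard table is safe (the pre-schema filename here is hypothetical):

```shell
# The runner keys migrations_applied on the basename only, so both
# phases can share the table; the basenames themselves must stay
# distinct, which the descriptive filenames guarantee.
key_for() { basename "$1"; }

key_for /directus/db-init/001_positions_hypertable.sql
key_for /directus/db-init-post/001_junction_unique_constraints.sql
```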

Phase 1 is still "done" — this is a Phase 1 architectural correction
exposed by the CI dry-run, not a new task.
2026-05-02 10:47:52 +02:00
parent 82615c0a66
commit e01abfef27
10 changed files with 245 additions and 157 deletions
@@ -140,7 +140,7 @@ Unique constraint: `(organization_id, device_id)`.
- `organization_devices` — 6 fields (id UUID PK, organization_id M2O, device_id M2O, registered_at, date_created, date_updated).
- 6 M2O relations on the junctions, all with `ON DELETE RESTRICT`.
-**Composite unique constraints landed via `db-init/004_junction_unique_constraints.sql`** because Directus's snapshot YAML format does not capture composite unique constraints (only single-column ones via `is_unique`). The migration adds:
+**Composite unique constraints landed via `db-init-post/001_junction_unique_constraints.sql`** because Directus's snapshot YAML format does not capture composite unique constraints (only single-column ones via `is_unique`). The migration adds:
- `organization_users (organization_id, user_id)`
- `organization_vehicles (organization_id, vehicle_id)`
- `organization_devices (organization_id, device_id)`
@@ -157,7 +157,7 @@ Boot logs confirm: `[db-init] apply 004_junction_unique_constraints.sql` → `[d
- ✅ All seven collections exist with the fields specified.
- ✅ Required fields flagged (organizations.name/slug, devices.imei/model, vehicles.make/model, junction org/target/role).
- ✅ Single-column unique constraints (organizations.slug, devices.imei) enforced.
-- ✅ Composite unique constraints on junctions enforced via db-init/004 (assertion block confirms).
+- ✅ Composite unique constraints on junctions enforced via db-init-post/001 (assertion block confirms).
- ✅ M2O relations clickable in admin UI (Directus auto-resolves the dropdowns from the relation metadata).
- ✅ No permission policies attached — admin-only by default.
- `pnpm run schema:snapshot` produces snapshots/schema.yaml with all 7 collections present.
@@ -133,7 +133,7 @@ Unique constraint: `(entry_id, device_id)` — a device can't appear twice in th
**10 relations** wired across the 5 collections, all `ON DELETE RESTRICT` except `entry_devices.assigned_user_id` (`SET NULL`, deviation noted above).
-**Composite unique constraints landed via `db-init/005_event_participation_unique_constraints.sql`:**
+**Composite unique constraints landed via `db-init-post/002_event_participation_unique_constraints.sql`:**
- `events (organization_id, slug)`
- `classes (event_id, code)`
- `entries (event_id, race_number)`
@@ -149,10 +149,10 @@ This task surfaced a real foot-gun in our boot pipeline. Documenting in detail s
**What happened:**
1. We created 5 new collections via MCP against the running Directus.
-2. We then ran `docker compose build && up -d` to make `db-init/005_*.sql` apply.
+2. We then ran `docker compose build && up -d` to make `db-init-post/002_*.sql` apply.
3. The image rebuild baked in the OLD `snapshots/schema.yaml` (committed in task 1.4 — only had 7 collections).
4. Boot ran the entrypoint chain. db-init applied 005 successfully (constraints landed on the new tables). But step 2/4 (`schema-apply.sh` → `directus schema apply --yes /directus/snapshots/schema.yaml`) compared the running DB against the stale snapshot and saw 5 collections that "shouldn't exist" — so it **deleted them**, taking the constraints with them.
-5. End state: 5 collections gone, db-init/005 row in `migrations_applied` still recorded as applied (so it wouldn't re-run), production-shape damage in dev.
+5. End state: 5 collections gone, db-init-post/002 row in `migrations_applied` still recorded as applied (so it wouldn't re-run), production-shape damage in dev.
**Why `directus schema apply --yes` is destructive by design:**
@@ -161,7 +161,7 @@ The `--yes` flag tells Directus to enforce the snapshot as the single source of
**Recovery performed:**
1. Re-created the 5 collections + 10 relations via MCP (same calls as the original task 1.5 work — repeatable since the data was source-controlled in the conversation).
-2. Re-applied the 5 ALTER TABLE statements from `db-init/005_*.sql` directly via psql (since `migrations_applied` already had 005 recorded).
+2. Re-applied the 5 ALTER TABLE statements from `db-init-post/002_*.sql` directly via psql (since `migrations_applied` already had 005 recorded).
3. Ran `pnpm run schema:snapshot` *before* any further restart. Snapshot now reflects the full 13-collection state.
**Discipline going forward (operator rule):**
@@ -181,7 +181,7 @@ The entrypoint's hard-coded `--yes` is a long-term issue. Phase 3 hardening coul
- ✅ All 5 collections exist with the fields specified.
- ✅ Required fields flagged (events.organization_id/name/slug/discipline/starts_at/ends_at, classes.event_id/code/name, entries.event_id/class_id/race_number/status, entry_crew.entry_id/user_id/role, entry_devices.entry_id/device_id).
- ✅ Single-column unique constraints — none in this task (all uniqueness is composite).
-- ✅ Composite unique constraints (5 of them) enforced via db-init/005.
+- ✅ Composite unique constraints (5 of them) enforced via db-init-post/002.
- ✅ M2O relations wired (10 total).
- ✅ status enum dropdown shows all 8 values in lifecycle order.
- ✅ race_number is integer.