Task 3.5 — Retire processor migration runner
Phase: 3 — Production hardening
Status: ⬜ Not started
Depends on: Phase 1.5 ideally landed (avoid mid-flight code churn for the agent shipping the WS endpoint). No hard code dependency.
Replaces: the original 3.5 sketch ("Migration advisory lock"). Once the processor doesn't run migrations, the lock concern is delegated to Directus's db-init runner — outside this service's surface.
Wiki refs: docs/wiki/entities/processor.md §"Schema ownership vs. write access" (the line that needs to change), docs/wiki/entities/directus.md §"Schema management — snapshot/apply pipeline", docs/wiki/entities/postgres-timescaledb.md
Goal
Establish directus as the single owner of all DDL against the shared Postgres database. Retire the processor's migration runner. After this task, the only DDL paths are:
- `trm/directus/db-init/*.sql` (pre-schema: extensions, hypertables, raw tables Directus's snapshot-yaml format can't express).
- `trm/directus/snapshots/schema.yaml` (Directus-managed user collections).
- `trm/directus/db-init-post/*.sql` (post-schema: composite UNIQUE constraints on Directus-managed tables).
Processor exclusively does INSERT / SELECT / UPDATE. No CREATE, ALTER, CREATE EXTENSION, or any other DDL.
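To make the boundary concrete, the processor's entire remaining SQL surface is plain DML. A minimal sketch of its write path (the column list follows the pre-flight notes in this task; verify against the live schema before treating it as canonical):

```sql
-- Illustrative shape of the processor's only remaining write path: pure DML.
-- Columns are an assumption drawn from the pre-flight notes below.
INSERT INTO positions (device_id, ts, ingested_at)
VALUES ($1, $2, now())
ON CONFLICT (device_id, ts) DO NOTHING;  -- relies on the positions_device_ts UNIQUE index
```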
Context — why this exists
The current state has both services creating the positions hypertable and the faulty column:
- `trm/processor/src/db/migrations/0001_positions.sql` and `0002_positions_faulty.sql` (processor's runner, from Phase 1 task 1.4 + the recent 1.5.5 prep).
- `trm/directus/db-init/001_extensions.sql`, `002_positions_hypertable.sql`, `003_faulty_column.sql` (directus's runner, added later when the destructive-apply incident showed positions had to exist before `directus schema apply` runs or it would get wiped).
Both runners are idempotent (IF NOT EXISTS, etc.) so the runtime collision is benign at the moment, but the architectural risks are real:
- Two sources of truth. Adding a column means editing two files in two repos; either one can drift silently.
- Schema divergence. A processor migration that adds a column the directus side doesn't know about is invisible to the admin UI.
- Two `migrations_applied` tables, which already caused the ghost-collection apply conflict earlier in directus's Phase 1.
- Operator confusion. The wiki says "Directus owns the schema," but the processor runs migrations — newcomers can't tell which is canonical.
The fix is the wiki's stated intent: directus owns DDL. Processor was the historical owner before directus's db-init story matured; the legacy runner survived the transition because nobody retired it.
Pre-flight (before deleting anything)
1. Confirm directus's db-init/ covers the full processor schema surface
Check that trm/directus/db-init/'s SQL is byte-equivalent (or semantically equivalent) to processor's migrations. As of writing, directus has:
- `001_extensions.sql` — `CREATE EXTENSION IF NOT EXISTS timescaledb` + postgis.
- `002_positions_hypertable.sql` — `CREATE TABLE positions (...)` + `create_hypertable(...)` + indexes.
- `003_faulty_column.sql` — `ALTER TABLE positions ADD COLUMN IF NOT EXISTS faulty ...` + `positions_device_ts_idx`.
Processor has:
- `src/db/migrations/0001_positions.sql` — extensions + table + hypertable + `positions_device_ts` (UNIQUE on `(device_id, ts)`) + `positions_ts` (DESC).
- `src/db/migrations/0002_positions_faulty.sql` — `faulty` column + `positions_device_ts_idx` (`(device_id, ts DESC)`).
Diff the two before retiring. If processor's SQL has an index, column, or constraint directus's db-init/ doesn't, the deliverable starts with porting that diff into directus's db-init/ (and snapshotting if applicable). Specific things to verify:
- All of the processor's explicit indexes exist in directus's db-init: `positions_device_ts` (UNIQUE), `positions_ts`, `positions_device_ts_idx`.
- Column types match exactly: `device_id text`, `ts timestamptz`, `ingested_at timestamptz DEFAULT now()`, etc.
- `chunk_time_interval` is `INTERVAL '1 day'` on both sides.
- The `ON CONFLICT (device_id, ts) DO NOTHING` upsert path requires the UNIQUE on `(device_id, ts)` — that's the `positions_device_ts` index, not `positions_device_ts_idx`. Both must exist.
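One way to run the index comparison concretely: execute the same catalog query against a processor-migrated database and a directus db-init'd one, then diff the two outputs. A sketch using Postgres's standard `pg_indexes` view:

```sql
-- Pre-flight index check: run on both databases and diff the results.
-- indexdef includes uniqueness, columns, and sort order, so a clean diff
-- here covers the "all indexes exist and match" verification in one shot.
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'positions'
ORDER BY indexname;
```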
2. Verify the directus db-init apply order is fixed
Per docs/wiki/entities/directus.md's 5-step boot pipeline:
1. db-init pre-schema → positions hypertable, faulty column, timescaledb extension
2. directus bootstrap → Directus system tables + first admin
3. directus schema apply → Directus-managed user collections
4. db-init post-schema → composite UNIQUE constraints on user collections
5. pm2-runtime start → server up at :8055
So when processor boots against a stage stack:
- `directus` container has run steps 1–4 (positions exists; everything else exists).
- `processor` container can connect and `INSERT` immediately.
Compose ordering. trm/deploy/compose.yaml's processor service should depends_on: directus with condition: service_healthy so processor doesn't try to read positions before directus's db-init has run on first deploy. Verify in this task.
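The compose fragment in question is small; a sketch of the relevant part (service names taken from this task, everything else in each service definition elided):

```yaml
# trm/deploy/compose.yaml — relevant fragment only.
services:
  processor:
    depends_on:
      directus:
        condition: service_healthy   # wait for the full 5-step boot pipeline
```

`condition: service_healthy` requires the directus service to define a healthcheck; step 5 of its boot pipeline is what flips the container healthy.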
Deliverables
trm/processor/

- Delete `src/db/migrations/0001_positions.sql`.
- Delete `src/db/migrations/0002_positions_faulty.sql`.
- Delete the migrations directory if it's now empty.
- Delete `src/db/migrate.ts` (or whatever the migration-runner module is named — the file that owns the `migrations_applied` table, the file walker, the `pg_advisory_lock` if any).
- Update `src/main.ts` to remove the `await migrate(...)` step from boot. Postgres pool creation stays; the migration call goes.
- Update tests that exercise the migration runner — most likely delete the corresponding test file. Integration tests that previously seeded the schema via `migrate()` either:
  - (a) Use directus's `db-init/*.sql` files directly (read them in `beforeAll`, execute against the testcontainer Postgres), or
  - (b) Carry a fixture SQL file in `test/fixtures/` (the same approach Phase 1.5 task 1.5.6 already takes for its integration test).

  Pick (b) — it's already the established pattern.
- Update `Dockerfile` to drop any migration-running step from the entrypoint (Phase 1's Dockerfile may not have this, but verify; the runtime container shouldn't carry the migrations directory if the runner is gone).
- Update `package.json` dependencies — if `pg-migrate` or any migration-runner library was a Phase-1-only dep, remove it.
- Update `phase-1-throughput/04-postgres-schema.md`'s Done section with a note: "Migration runner retired in Phase 3 task 3.5 — see that task for context."
- Update `ROADMAP.md` to reflect the retired runner under Phase 1's "what changed since landing" note.
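The fixture-based setup (option (b) above) can be sketched as below. The statement splitter is deliberately naive — a semicolon at end of line ends a statement — which holds for plain DDL in the db-init style but not for procedural SQL; the testcontainer and `pg` client wiring are elided, and all names here are assumptions, not the real test's API.

```typescript
// Sketch: load a fixture SQL file and run it statement-by-statement in
// beforeAll. Splitting is naive (";" at end of line terminates a statement),
// which is fine for plain DDL like the db-init files.
function splitSqlStatements(sql: string): string[] {
  return sql
    .split(/;\s*(?:\r?\n|$)/)   // statement boundary: ";" at end of line or EOF
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// In the integration test (testcontainer + pg client wiring elided):
//   const fixture = await fs.readFile("test/fixtures/schema.sql", "utf8");
//   for (const stmt of splitSqlStatements(fixture)) await client.query(stmt);

// Tiny inline stand-in for test/fixtures/schema.sql:
const example = `
CREATE EXTENSION IF NOT EXISTS timescaledb;
CREATE TABLE IF NOT EXISTS positions (
  device_id text NOT NULL,
  ts timestamptz NOT NULL
);
`;
```

Running each statement separately (rather than one multi-statement string) keeps error messages pointed at the exact failing DDL line.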
trm/docs/wiki/

- Update `wiki/entities/processor.md` — drop the "Schema ownership vs. write access" caveat that says "the positions hypertable is owned by processor's migration runner." Replace with a single sentence: "Processor never runs DDL. Schema is exclusively owned by Directus (`snapshot.yaml` + `db-init/` for things the snapshot can't express)."
- Update `wiki/entities/directus.md` — confirm the Schema-management section already lists `db-init/` as covering everything (no edits unless the current text implies a split).
- Update `wiki/entities/postgres-timescaledb.md` — verify the writer-side documentation; remove any "split schema authority" framing.
- Append a `note` entry to `docs/log.md` recording the retirement.
trm/deploy/

- Verify `compose.yaml`'s `processor` service has `depends_on: directus: { condition: service_healthy }`. Add if missing.
- Confirm the deploy README doesn't mention the processor's migration runner anywhere.
Specification
What stays in src/db/
- `pool.ts` — Postgres `Pool` factory. Untouched.
- Connection helpers, query helpers (if any). Untouched.
What goes
- `migrations/*.sql` — gone.
- `migrate.ts` (the runner) — gone.
- `migrations_applied` table — directus's runner has its own; the processor's becomes orphaned but harmless. Don't drop it from existing databases. The retirement is a runtime change; the table is just unused. Phase 3 hardening's eventual `OPERATIONS.md` (task 3.7) can document a one-off `DROP TABLE migrations_applied` step for operators who want a clean schema.
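The shape of the `src/main.ts` change can be sketched as below. Every name here (`Pool` shape, `migrate`, `startConsumer`) is a stand-in assumption — match them to the real module names when editing.

```typescript
// Stand-in for pg's Pool — just enough shape for the sketch.
type Pool = { query: (sql: string, params?: unknown[]) => Promise<unknown> };

// Before (Phase 1): the processor ensured its own schema at boot.
async function bootBefore(
  pool: Pool,
  migrate: (p: Pool) => Promise<void>,
  startConsumer: (p: Pool) => Promise<void>,
): Promise<void> {
  await migrate(pool); // ← the step this task deletes
  await startConsumer(pool);
}

// After: pool creation stays; DDL is directus's problem. Boot assumes
// positions already exists because directus's db-init ran first
// (enforced by depends_on: service_healthy).
async function bootAfter(
  pool: Pool,
  startConsumer: (p: Pool) => Promise<void>,
): Promise<void> {
  await startConsumer(pool);
}
```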
Boot order on first deploy
1. postgres container starts → DB available.
2. directus container starts → runs 5-step boot pipeline.
├─ Step 1 (db-init pre-schema) creates positions hypertable + faulty column + extensions.
├─ Steps 2-4 set up Directus's own world.
└─ Step 5 marks the container healthy.
3. processor container starts (depends_on: directus: service_healthy) → connects, finds positions, starts consuming.
If processor races directus on a fresh stack (no depends_on), it'll fail to find the positions table and crash-loop until directus catches up. depends_on: service_healthy makes the order deterministic.
Dev workflow
compose.dev.yaml in trm/processor (if it exists for processor-side dev) should depends_on: directus if running both. For pure-processor dev (no directus), the developer either:
- Runs directus's `db-init/*.sql` manually against their local Postgres before booting processor.
- Or copies the equivalent SQL into a one-off bootstrap script in `processor/test/fixtures/`.
Document the chosen path in processor/README.md.
What this task does NOT do
- Does not retire directus's snapshot-managed collections.
- Does not change Phase 1 or Phase 1.5 code paths beyond removing the migration runner step.
- Does not introduce a new migration tool. The fix is fewer moving parts, not different ones.
Acceptance criteria
- `pnpm typecheck`, `pnpm lint`, `pnpm test` clean.
- `pnpm test:integration` runs green — the integration test no longer relies on `migrate()`; it loads schema from a fixture SQL file instead.
- `src/db/migrations/` directory is gone (or empty + gitignored).
- No `migrate()` call anywhere in the source tree.
- No `migrations_applied` references in processor source.
- Stage smoke against a fresh DB: redeploy the stack, watch directus boot through its 5 steps, watch processor connect and start writing positions. No errors.
- `docs/wiki/entities/processor.md` and `directus.md` agree: directus is the sole DDL owner.
- `docs/log.md` has a `note` entry recording the retirement.
- `trm/deploy/compose.yaml`'s `processor` service has `depends_on: directus: service_healthy`.
Risks / open questions
- Existing prod databases. If anyone has deployed the processor's migrations on a real DB, the `migrations_applied` table is harmless but stale. Document a one-off cleanup query for operators (in `OPERATIONS.md` when 3.7 lands).
- Schema drift between processor's old migrations and directus's db-init. If the diff in pre-flight step 1 surfaces anything, that diff must land in directus's `db-init/` before the processor's runner is retired. Order of operations matters: never delete a processor migration before the equivalent SQL is verified live in directus's runner.
- Test container schema setup. The integration test fixture has to mirror what directus actually creates. If directus's `db-init/` changes in a way that breaks processor's read paths, the fixture and the read paths both need updating. Mitigation: the fixture file lives in `test/fixtures/` and a comment at its top says "syncs with `trm/directus/db-init/` — update both when schema changes."
- The original 3.5 ("Migration advisory lock") concern. Once the processor doesn't run migrations, the advisory-lock concern is delegated to directus's runner. That's a directus concern; whether to add an advisory lock to directus's `apply-db-init.sh` is tracked as a follow-up in directus's own roadmap, not here.
- PostGIS usage in Phase 2. Processor's `0001_positions.sql` enables PostGIS even though Phase 1 doesn't use it. Directus's `db-init/001_extensions.sql` does the same. Confirm in pre-flight; no change needed if the directus side already has it.
Done
(Filled in when the task lands.)