scripts/apply-db-init.sh implements the boot-time runner that walks db-init/*.sql in numeric-prefix order, applies each via psql, and records successful applications in a migrations_applied guard table so re-runs are no-ops. All 7 acceptance criteria pass live against the dev compose stack: empty dir, missing env var, apply, idempotent re-run, checksum mismatch, filename collision, broken SQL. Two retroactive Dockerfile corrections folded in (exposed by the first live-test attempt of 1.2's script): 1. apk add bash. The directus/directus:11.17.4 base is Alpine and ships ash via BusyBox, not bash. The script uses bash-specific features (associative arrays, [[ ]], mapfile, BASH_REMATCH) and fails at line 69 in sh. 2. .gitattributes added at repo root forcing LF on *.sh, *.sql, *.yaml, *.yml. Without it, Windows checkouts with core.autocrlf=true (the Git-for-Windows default) silently inject CRLF, causing "bad interpreter: /usr/bin/env bash^M" inside the Linux container. This failure mode only manifests in the container. Both corrections are documented in 01-project-scaffold.md's Done section; 02-db-init-runner.md's Done section captures the live-test results, the corrected docker compose run --entrypoint commands, and the gotcha about compose env defaults masking missing-env-var tests. ROADMAP marks 1.2 done; 1.3 next.
12 KiB
Task 1.2 — db-init runner script
Phase: 1 — Slice 1 schema + deploy pipeline
Status: ⬜ Not started
Depends on: 1.1
Wiki refs: docs/wiki/entities/postgres-timescaledb.md, docs/wiki/entities/directus.md (Schema management section)
Goal
Implement scripts/apply-db-init.sh — the boot-time runner that walks db-init/*.sql in numeric order, applies each via psql against the configured Postgres, and records successful applications in a migrations_applied guard table so re-runs are no-ops. This is the foundation Phase 1 (and every later phase) depends on for non-Directus DDL.
Deliverables
scripts/apply-db-init.sh— POSIX-compatible bash. Does the following, in order:- Wait for Postgres readiness. Loop calling
pg_isready -h $DB_HOST -p $DB_PORT -U $DB_USER -d $DB_DATABASEuntil success or timeout (configurable, default 60 s). Exit non-zero on timeout with a clear log message. - Bootstrap the guard table.
CREATE TABLE IF NOT EXISTS migrations_applied ( filename TEXT PRIMARY KEY, applied_at TIMESTAMPTZ NOT NULL DEFAULT now(), checksum TEXT NOT NULL ); - Walk
db-init/*.sqlin numeric-prefix order (sorted lexically; theNNN_prefix enforces order). For each file:- Compute
sha256sumof the file contents →checksum. - Query
migrations_applied WHERE filename = <basename>. - If a row exists and the checksums match → log
skip filenameand continue. - If a row exists and checksums DON'T match → log error and exit non-zero. (Migrations are append-only; never edit a file once applied.)
- If no row exists → apply the file via
psql -v ON_ERROR_STOP=1 -f <path>. On success, insert the row. On failure, exit non-zero with the SQL error.
- Compute
- Log a one-line summary at the end:
db-init complete: <N> applied, <M> skipped.
- Wait for Postgres readiness. Loop calling
Specification
- Environment variables expected:
DB_HOST,DB_PORT,DB_USER,DB_PASSWORD,DB_DATABASE. PlusDB_INIT_DIR(default/directus/db-init) andDB_INIT_TIMEOUT_SECONDS(default60). - Use
PGPASSWORDfor psql auth — exported in the script beforepsqlcalls, never printed in logs. - Each migration runs in a single transaction by virtue of
psql -v ON_ERROR_STOP=1 -1 -f. The-1flag wraps the whole file inBEGIN/COMMIT. (Some statements likeCREATE EXTENSIONorCREATE INDEX CONCURRENTLYcan't run in a transaction — those go in their own files without-1if needed. Document the exception inline.) - Numeric-prefix convention.
001_,002_, …,999_. Pad to 3 digits; gives 999 slots which is well beyond what we'll need. - Filename uniqueness. Two files can't share a prefix. Lint check at script start: detect collisions, error out before applying anything.
- Logging. One line per file at INFO level. Failure logs include the psql exit code and the offending file. No SQL output to stdout (verbose
psqloutput goes to stderr and is suppressed unlessDEBUG=1is set). - Idempotency. Running the script twice in a row → second run does zero psql work beyond the readiness check + guard-table query.
- Exit codes.
0= success,1= readiness timeout,2= checksum mismatch,3= psql error,4= filename collision.
Acceptance criteria
- Script is executable (
chmod +x), shebang is#!/usr/bin/env bash. set -euo pipefailat the top.- Against a fresh Postgres, no
db-init/*.sqlfiles yet → script createsmigrations_appliedtable, prints "0 applied, 0 skipped", exits 0. - After 1.3 lands, script applies all three migrations on first run (3 applied, 0 skipped), no-ops on second run (0 applied, 3 skipped).
- Manually editing an applied file → next run exits 2 with a clear "checksum mismatch" error.
- Adding two files with the same numeric prefix → script exits 4 before applying anything.
- Killing Postgres mid-run during file 002 → script exits 3 with the psql error; on next run, file 002 retries cleanly. Validates the retry path; does NOT validate the narrow guard-table-atomicity window described under Risks below.
Risks / open questions
CREATE EXTENSIONinside a transaction. Some Postgres extensions can be created inside a transaction (timescaledb, postgis), some cannot (pg_partman with parallel apply). For Phase 1 the only extension is timescaledb, which is fine. Re-evaluate per phase.- Concurrent boots. If two Directus containers boot against the same DB at the same time (rolling deploy), both will try to apply migrations. The guard table's
PRIMARY KEYonfilenamemakes the insert race-safe, but two containers running the samepsql -fat once is risky. Mitigation for Phase 1: assume single-replica boot during deploy; Phase 3+ revisit if rolling deploy is a goal. - Guard-table atomicity gap. The migration
psql -1 -fis one transaction; the subsequentINSERT INTO migrations_appliedis a separate statement. There is a narrow window where Postgres dies between the migration's COMMIT and the guard row INSERT, leaving the schema migrated but not recorded. Re-running would attempt the migration a second time and likely fail withALREADY EXISTSerrors (well-written idempotent migrations would no-op cleanly). Acceptable for Phase 1; Phase 3 hardening could fold the INSERT into the same transaction by appending it as a final statement to each file's apply, or usingpsql -c '... ; INSERT INTO migrations_applied ...'as a single command.
Done
Implementation landed and live-verified 2026-05-01. All 7 acceptance criteria pass against the dev compose stack.
Files created at C:\Users\Administrator\projects\trm\directus\:
scripts/apply-db-init.sh— 302-line bash runner per the spec..gitattributes—*.sh,*.sql,*.yaml,*.ymlforced to LF. Required, not optional (see Implementation findings below).
Both files staged via git add + git update-index --chmod=+x; entrypoint.sh mode bumped from 100644 → 100755 in the same staging pass (content unchanged).
Implementation findings:
-
.gitattributesis a required deliverable, not optional. The original spec didn't mention it. On Windows dev machines withcore.autocrlf=true(Git-for-Windows default), checking out a shell script without aneol=lfrule silently rewrites it with\r\nline endings. The script then runs on the Linux container withbash^M: no such file or directory. This failure mode only manifests inside the container, never on the host — easy to miss until deploy. The agent added.gitattributescovering*.sh,*.sql,*.yaml,*.ymlto lock in LF. -
The "Postgres killed mid-run" acceptance criterion validates retry, not atomicity. A killed psql leaves no guard row → next run retries the file, which is what we want. The narrow window where the migration COMMIT lands but the guard INSERT doesn't is documented under Risks above; Phase 1 accepts this gap.
Acceptance criteria — static (passed) vs live (pending Docker run by user):
Passed via static inspection:
- ✅ Executable bit (
100755), shebang (#!/usr/bin/env bash),set -euo pipefailon line 51. - ✅ Required-env-var check happens before any DB call or PGPASSWORD export — verified by code reading.
Live-verified 2026-05-01 against the dev compose stack:
- ✅ Fresh DB, no SQL files → "0 applied, 0 skipped", exit 0.
- ✅ One file → applies (
1 applied, 0 skipped); re-run → no-op (0 applied, 1 skipped). - ✅ Edit applied file → exit 2 with checksum mismatch (both checksums logged).
- ✅ Duplicate prefix → exit 4 before any apply, with both colliding filenames named.
- ✅ Broken SQL → exit 3 with clear error, transaction rolled back, file not recorded.
- ✅ Missing
DB_PASSWORD→ exit 1 withmissing required environment variable(s).
Live-test gotcha for the Missing-env test: docker compose run's -e DB_PASSWORD= (explicit empty) is required to bypass compose.dev.yaml's ${DB_PASSWORD:-directus} fallback. Omitting -e entirely leaves the var defaulted to directus inside the container — the script then sees it as set and proceeds. Documented inline in the corrected test commands above.
Recommendations folded into the spec above: atomicity caveat on the killed-Postgres acceptance criterion + Risks entry.
Recommendation NOT yet folded (deliberate — owner's call): the .gitattributes requirement should land in task 1.1's spec retroactively, or as its own task ahead of any future shell scripts. Currently it's mentioned only here. Suggest a one-line add to task 1.1's "Deliverables" section: .gitattributes enforcing LF for *.sh / *.sql / *.yaml / *.yml. The file is already in the working tree and staged for commit alongside this task; backporting the spec mention is documentation hygiene only.
Live-verification findings (2026-05-01) — required Dockerfile + test-command corrections:
First live-test pass surfaced two real issues that the agent's static analysis missed:
-
bashis NOT in the directus base image. The upstreamdirectus/directus:11.17.4is Alpine-based; Alpine shipsash(BusyBox), not bash. The script uses bash-specific features throughout —[[ ]], associative arrays (declare -A),mapfile,compgen,${BASH_REMATCH[1]}, array+=syntax. Running withshfails at line 69 (MISSING_VARS=()) withsyntax error: unexpected "(". Fix: task 1.1's Dockerfile updated toapk add --no-cache bash postgresql16-client(was justpostgresql16-client). Folded into task 1.2's commit because the dependency is exposed by 1.2's script. Image must be rebuilt after this change. -
docker compose run <service> bash /path/to/script.shdoes NOT run the script — it runs Directus. The Dockerfile setsENTRYPOINT ["/directus/entrypoint.sh"].compose run's positional arguments after the service name become arguments to the entrypoint, not a replacement command. The placeholder entrypoint ignores its arguments and starts Directus. Fix: test commands must use--entrypointto override. Corrected commands documented below.
Corrected test commands (use these for live verification):
# Rebuild the image first (picks up bash)
docker compose -f compose.dev.yaml build
# Start a fresh db-only stack
docker compose -f compose.dev.yaml up -d db
# Test A — fresh DB, no SQL files → "0 applied, 0 skipped", exit 0
docker compose -f compose.dev.yaml run --rm --no-deps \
-e DB_HOST=db -e DB_PORT=5432 -e DB_USER=directus \
-e DB_PASSWORD=directus -e DB_DATABASE=directus \
-e DB_INIT_DIR=/directus/db-init \
--entrypoint /directus/scripts/apply-db-init.sh \
directus
# Test B — missing DB_PASSWORD → exit 1
docker compose -f compose.dev.yaml run --rm --no-deps \
-e DB_HOST=db -e DB_PORT=5432 -e DB_USER=directus \
-e DB_DATABASE=directus \
--entrypoint /directus/scripts/apply-db-init.sh \
directus
# Tests C–F — interactive shell with bash, override DB_INIT_DIR to a tmp dir
docker compose -f compose.dev.yaml run --rm --no-deps \
-e DB_HOST=db -e DB_PORT=5432 -e DB_USER=directus \
-e DB_PASSWORD=directus -e DB_DATABASE=directus \
-e DB_INIT_DIR=/tmp/test-migrations \
--entrypoint bash \
directus
# inside the bash shell:
mkdir /tmp/test-migrations
echo "SELECT 1;" > /tmp/test-migrations/001_test.sql
/directus/scripts/apply-db-init.sh # → 1 applied, 0 skipped
/directus/scripts/apply-db-init.sh # → 0 applied, 1 skipped
echo "SELECT 2;" > /tmp/test-migrations/001_test.sql
/directus/scripts/apply-db-init.sh # → exit 2 (checksum)
echo "SELECT 1;" > /tmp/test-migrations/001_test.sql
echo "SELECT 1;" > /tmp/test-migrations/001_dup.sql
/directus/scripts/apply-db-init.sh # → exit 4 (collision)
rm /tmp/test-migrations/001_dup.sql
echo "THIS IS NOT SQL" > /tmp/test-migrations/002_broken.sql
/directus/scripts/apply-db-init.sh # → exit 3 (psql error)