ef8bd91d7727b7a9e13193dc29a65b5bf3b21cdb
4 Commits
commit ef8bd91d77
Reorder boot: bootstrap before schema-apply (and harden schema-apply)
Second CI dry-run failure exposed two more issues:
1. Schema-apply runs against a fresh Postgres → fails with "Directus
isn't installed on this database. Please run 'directus bootstrap'
first." Bootstrap is what creates Directus's system tables; schema
apply requires those tables to exist. Local dev never tripped this
because bootstrap had been done in earlier sessions.
2. `node cli.js schema apply` printed an ERROR but exited 0 in the
not-installed case. schema-apply.sh trusted the exit code,
reported "schema apply complete," and the chain continued — until
the post-schema migration tried to ALTER TABLE on user tables that
never got created.
Fixes:
- entrypoint.sh: reorder steps from
pre-schema → schema-apply → post-schema → bootstrap → start
to
pre-schema → bootstrap → schema-apply → post-schema → start
Bootstrap is idempotent ("Database already initialized, skipping
install" on warm DB) so adding it earlier costs nothing on warm
boots and unblocks fresh boots.
- .gitea/workflows/build.yml: dry-run chain updated to mirror the new
entrypoint order. Bootstrap is now part of the pre-boot validation,
not skipped for speed. CI dry-run now genuinely covers the same path
the production entrypoint takes (minus the final pm2-runtime step,
which doesn't add validation value).
- scripts/schema-apply.sh: defense in depth. After the apply call
succeeds (exit 0), grep the output for ' ERROR: ' and fail loudly if
found. Catches the silent-failure pattern Directus's CLI exhibits
when bootstrap hasn't run. Error message names the likely cause
(schema-apply before bootstrap) for fast operator triage.
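The silent-failure pattern and the grep guard can be sketched as follows. This is an illustrative stand-alone sketch, not the repo's schema-apply.sh: `fake_schema_apply` stands in for `node /directus/cli.js schema apply` in the not-installed case, and the message wording is illustrative.

```shell
#!/usr/bin/env bash
# Sketch of the defense-in-depth check, assuming the observed CLI behavior:
# an ERROR line printed in the output while the process still exits 0.
set -euo pipefail

# Stand-in for the Directus CLI's not-installed failure mode:
# prints ERROR, exits 0 anyway.
fake_schema_apply() {
  echo " ERROR: Directus isn't installed on this database."
  return 0
}

output="$(fake_schema_apply 2>&1)"   # the exit code alone says "success" here

status=ok
if grep -q ' ERROR: ' <<<"$output"; then
  # fail loudly and name the likely cause for fast operator triage
  echo "[schema-apply] ERROR in output despite exit 0; likely schema-apply ran before bootstrap" >&2
  status=failed
fi
```

Trusting the exit code alone reports success; scanning the captured output catches the case the exit code misses.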
This is the second Phase 1 architectural correction exposed by the CI
dry-run gate. The gate is paying for itself in the very first PR it
runs against.
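The corrected five-step order can be sketched as a minimal boot chain. This is a sketch, not the repo's entrypoint.sh: the `run_step` wrapper and `step_*` stub functions are illustrative stand-ins for the real commands noted in the comments.

```shell
#!/usr/bin/env bash
# Sketch of the reordered boot chain. set -euo pipefail halts the boot on
# the first failing step, as in the real entrypoint.
set -euo pipefail

run_step() {
  echo "[entrypoint] $1"   # per-step log marker for operator triage
  shift
  "$@"
}

step_pre_schema()   { :; }  # real: /directus/scripts/apply-db-init.sh
step_bootstrap()    { :; }  # real: node /directus/cli.js bootstrap (idempotent on a warm DB)
step_schema_apply() { :; }  # real: /directus/scripts/schema-apply.sh
step_post_schema()  { :; }  # real: post-schema migrations

boot_log="$(
  run_step "1/5 pre-schema"   step_pre_schema
  run_step "2/5 bootstrap"    step_bootstrap      # now BEFORE schema apply
  run_step "3/5 schema apply" step_schema_apply
  run_step "4/5 post-schema"  step_post_schema
  echo "[entrypoint] 5/5 start"                   # real: exec pm2-runtime start ecosystem.config.cjs
)"
```

Because bootstrap is a no-op on an already-initialized database, moving it ahead of schema apply is free on warm boots and is exactly what a fresh database needs.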
commit e22d9d489a
Tasks 1.6 + 1.7 — schema tooling + real entrypoint flow
Two parallel tasks landing together. The boot pipeline is now wired end-to-end: db-init → schema apply → directus bootstrap → pm2-runtime. Live-verified by booting a fresh compose stack to a serving Directus admin UI on :8055.

Task 1.6 — snapshot tooling:
- scripts/schema-snapshot.sh — host-side, dev-time. Verifies docker is on PATH and the directus compose service is running, runs `node /directus/cli.js schema snapshot --yes` inside the container, and copies the YAML out to ./snapshots/schema.yaml. Used after admin-UI schema changes to capture the new state for git commit.
- scripts/schema-apply.sh — image-side, boot-time. Reads /directus/snapshots/schema.yaml, runs a dry-run preview, then applies. Gracefully skips when the snapshot is absent or whitespace-only (Phase 1 first-boot path before tasks 1.4/1.5 produce collections). SNAPSHOT_PATH env var override for CI flexibility.
- snapshots/README.md — lifecycle doc; warns against hand-editing.

Task 1.7 — real entrypoint flow:
- entrypoint.sh rewritten from Phase 1.1's placeholder to the 4-step boot per ROADMAP design rule #3:
  1/4 db-init → /directus/scripts/apply-db-init.sh
  2/4 schema apply → /directus/scripts/schema-apply.sh
  3/4 directus bootstrap → node /directus/cli.js bootstrap
  4/4 directus start → exec pm2-runtime start ecosystem.config.cjs
- set -euo pipefail halts boot on any step's non-zero exit. Each step emits an [entrypoint] log marker so an operator reading container logs sees which step failed.

Bug found and fixed during live verification:
- Both 1.6 scripts initially called bare `directus schema ...` as if the CLI were on PATH. Upstream directus/directus:11.17.4 does NOT expose `directus` on PATH — invocation is via `node /directus/cli.js`, the same pattern as the entrypoint's bootstrap step. Both scripts corrected.
- Also added -T to docker compose exec in schema-snapshot.sh so the script works in non-TTY contexts (CI).
Phase 5 follow-up (non-blocking) flagged in 07's Done section: Directus warns "Collection 'positions' doesn't have a primary key column and will be ignored". The positions table uses a UNIQUE INDEX (device_id, ts) matching the processor's pattern, not a PK constraint. This means positions is not auto-registered as a Directus collection — fine for Phase 1, but the operator faulty-flag workflow will need a custom endpoint or manual collection registration in Phase 5.

ROADMAP marks 1.6 + 1.7 done. Phase 1 progress: 5/9 tasks complete (1.1, 1.2, 1.3, 1.6, 1.7); 1.4, 1.5, 1.8, 1.9 remain.
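The graceful-skip rule in schema-apply.sh can be sketched as a small predicate. This is a sketch under the behavior described above; `should_skip` and the demo filenames are hypothetical, not the script's actual code.

```shell
#!/usr/bin/env bash
# Sketch: a snapshot that is absent or whitespace-only means there is
# nothing to apply yet (first boot before tasks 1.4/1.5 produce collections).
set -euo pipefail

should_skip() {
  local path="$1"
  [ -f "$path" ] || return 0          # no snapshot at all: skip
  ! grep -q '[^[:space:]]' "$path"    # skip iff no non-whitespace content
}

# demonstration against temp files
tmp="$(mktemp -d)"
printf '   \n\t\n'    > "$tmp/blank.yaml"
printf 'version: 1\n' > "$tmp/real.yaml"

should_skip "$tmp/missing.yaml" && r_missing=skip || r_missing=apply
should_skip "$tmp/blank.yaml"   && r_blank=skip   || r_blank=apply
should_skip "$tmp/real.yaml"    && r_real=skip    || r_real=apply
```

In the real script the path would come from the SNAPSHOT_PATH override with a default of /directus/snapshots/schema.yaml, per the commit text.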
commit dec2d190ce
Task 1.2 — db-init runner script
scripts/apply-db-init.sh implements the boot-time runner that walks db-init/*.sql in numeric-prefix order, applies each via psql, and records successful applications in a migrations_applied guard table so re-runs are no-ops. All 7 acceptance criteria pass live against the dev compose stack: empty dir, missing env var, apply, idempotent re-run, checksum mismatch, filename collision, broken SQL.

Two retroactive Dockerfile corrections folded in (exposed by the first live-test attempt of 1.2's script):
1. apk add bash. The directus/directus:11.17.4 base is Alpine and ships ash via BusyBox, not bash. The script uses bash-specific features (associative arrays, [[ ]], mapfile, BASH_REMATCH) and fails at line 69 in sh.
2. .gitattributes added at the repo root forcing LF on *.sh, *.sql, *.yaml, *.yml. Without it, Windows checkouts with core.autocrlf=true (the Git-for-Windows default) silently inject CRLF, causing "bad interpreter: /usr/bin/env bash^M" inside the Linux container. This failure mode only manifests in the container.

Both corrections are documented in 01-project-scaffold.md's Done section; 02-db-init-runner.md's Done section captures the live-test results, the corrected docker compose run --entrypoint commands, and the gotcha about compose env defaults masking missing-env-var tests. ROADMAP marks 1.2 done; 1.3 next.
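The runner's core contract (ordered walk, guard table, checksum-mismatch detection, idempotent re-run) can be sketched in-memory. All names here are illustrative: the `applied` map stands in for the migrations_applied guard table, and log strings stand in for real psql applies.

```shell
#!/usr/bin/env bash
# In-memory sketch of the runner's contract, not the repo's actual script.
set -euo pipefail

declare -A applied   # filename -> checksum of the applied file
log=()

apply_file() {
  local f="$1" name sum
  name="$(basename "$f")"
  sum="$(sha256sum "$f" | cut -d' ' -f1)"
  if [ -n "${applied[$name]:-}" ]; then
    if [ "${applied[$name]}" != "$sum" ]; then
      log+=("mismatch $name")   # file edited after it was applied: hard error
      return 1
    fi
    log+=("skip $name")         # already applied: re-run is a no-op
    return 0
  fi
  # real runner: psql -v ON_ERROR_STOP=1 -f "$f", then record name + checksum
  log+=("apply $name")
  applied[$name]="$sum"
}

tmp="$(mktemp -d)"
printf 'SELECT 2;\n' > "$tmp/02-add-index.sql"
printf 'SELECT 1;\n' > "$tmp/01-create-table.sql"

run_all() { local f; for f in "$tmp"/*.sql; do apply_file "$f"; done; }
run_all   # fresh DB: applies in numeric-prefix order
run_all   # warm DB: both skipped
```

Note the bash-only constructs (`declare -A`, arrays): this is exactly why the Alpine base needed `apk add bash` rather than running under BusyBox ash.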
commit 387c3c4cfa
Task 1.1 — Project scaffold
Phase 1 task 1.1 lands. Directus 11.17.4 boots locally end-to-end against a TimescaleDB+PostGIS container; admin UI serves at :8055, admin bootstrap from env vars works, named volumes preserve data across down/up cycles.

Scaffold:
- Dockerfile — FROM directus/directus:11.17.4. Pre-installs postgresql16-client (ahead of task 1.2's db-init runner needing psql). Bakes in /directus/snapshots, /directus/db-init, /directus/scripts, /directus/extensions, /directus/entrypoint.sh.
- compose.dev.yaml — db (timescale/timescaledb-ha:pg16.6-ts2.17.2-all) + directus (local build), healthchecks, named volumes directus-pg-data + directus-uploads.
- entrypoint.sh — placeholder using upstream's actual flow (node cli.js bootstrap && pm2-runtime start ecosystem.config.cjs); the real db-init → schema apply → start wrapper lands in task 1.7.
- package.json — scripts-only (dev, dev:down, dev:reset, schema:snapshot, schema:apply, db:init), no runtime deps.
- .env.example — sectioned, fully documented, KEY/SECRET marked required with generation hints.
- .gitignore, .dockerignore — match the processor service conventions.
- snapshots/, db-init/, scripts/, extensions/ — empty with .gitkeep, filled by later Phase 1 tasks (1.3, 1.6) and Phase 5.

Lessons locked in (against the empirical pnpm dev boot):
- timescale/timescaledb-ha:pg16-latest does NOT exist on Docker Hub. Pin a concrete version (we used pg16.6-ts2.17.2-all).
- This image's data directory is /home/postgres/pgdata/data, not /pgdata or /var/lib/postgresql/data. The PGDATA env var and the volume mount must both target it.
- The -all variant bundles PostGIS binaries, but the extension is not auto-created on the directus database; CREATE EXTENSION lands in Phase 2 alongside the geofences/SLZs/waypoints collections.
- The upstream image's CMD is bootstrap + pm2-runtime, not a simple cli.js start. Bypassing pm2 would lose crash recovery.
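Given the lessons above, the shape of compose.dev.yaml might look like the following sketch. The image tag and data-directory path come from the commit text; the healthcheck details, env key names, and port mapping are assumptions, not the repo's actual file.

```yaml
# Sketch only: not the repo's actual compose.dev.yaml.
services:
  db:
    image: timescale/timescaledb-ha:pg16.6-ts2.17.2-all   # concrete pin; pg16-latest does not exist
    environment:
      PGDATA: /home/postgres/pgdata/data                  # this image's real data dir
    volumes:
      - directus-pg-data:/home/postgres/pgdata/data       # volume mount must target the same path
    healthcheck:                                          # assumed shape
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      retries: 12
  directus:
    build: .                                              # local Dockerfile build
    ports:
      - "8055:8055"                                       # admin UI
    depends_on:
      db:
        condition: service_healthy
    volumes:
      - directus-uploads:/directus/uploads
volumes:
  directus-pg-data:
  directus-uploads:
```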
These corrections are folded into 01-project-scaffold.md (deliverable line + Done section), 08-gitea-ci-dryrun.md (CI service tag), and the inline comments in compose.dev.yaml so future implementers don't re-discover them.

Status: ROADMAP marks 1.1 done, Phase 1 in progress, 1.2 next.