Tasks 1.6 + 1.7 — schema tooling + real entrypoint flow

Two parallel tasks landing together. The boot pipeline is now wired
end-to-end: db-init → schema apply → directus bootstrap → pm2-runtime.
Live-verified by booting a fresh compose stack to a serving Directus
admin UI on :8055.

Task 1.6 — snapshot tooling:
- scripts/schema-snapshot.sh — host-side, dev-time. Verifies docker
  is on PATH and the directus compose service is running, runs
  `node /directus/cli.js schema snapshot --yes` inside the container,
  copies the YAML out to ./snapshots/schema.yaml. Used after admin-UI
  schema changes to capture the new state for git commit.
- scripts/schema-apply.sh — image-side, boot-time. Reads
  /directus/snapshots/schema.yaml, runs a dry-run preview, then
  applies. Gracefully skips when the snapshot is absent or whitespace-
  only (Phase 1 first-boot path before tasks 1.4/1.5 produce
  collections). SNAPSHOT_PATH env var override for CI flexibility.
- snapshots/README.md — lifecycle doc; warns against hand-editing.

Task 1.7 — real entrypoint flow:
- entrypoint.sh rewritten from Phase 1.1's placeholder to the
  4-step boot per ROADMAP design rule #3:
    1/4 db-init          → /directus/scripts/apply-db-init.sh
    2/4 schema apply     → /directus/scripts/schema-apply.sh
    3/4 directus bootstrap → node /directus/cli.js bootstrap
    4/4 directus start   → exec pm2-runtime start ecosystem.config.cjs
  set -euo pipefail halts boot on any step's non-zero exit. Each step
  emits a [entrypoint] log marker so an operator reading container
  logs sees which step failed.

Bug found and fixed during live verification:
- Both 1.6 scripts initially called bare `directus schema ...` as if
  the CLI were on PATH. Upstream directus/directus:11.17.4 does NOT
  expose `directus` on PATH — invocation is via `node /directus/cli.js`,
  same pattern as the entrypoint's bootstrap step. Both scripts
  corrected. Also added -T to docker compose exec in schema-snapshot.sh
  so the script works in non-TTY contexts (CI).

Phase 5 follow-up (non-blocking) flagged in 07's Done section: Directus
warns "Collection 'positions' doesn't have a primary key column and
will be ignored". The positions table uses UNIQUE INDEX (device_id, ts)
matching processor's pattern, not a PK constraint. Means positions is
not auto-registered as a Directus collection — fine for Phase 1, but
the operator faulty-flag workflow will need a custom endpoint or
manual collection registration in Phase 5.

ROADMAP marks 1.6 + 1.7 done. Phase 1 progress: 5/9 tasks complete
(1.1, 1.2, 1.3, 1.6, 1.7); 1.4, 1.5, 1.8, 1.9 remain.
2026-05-01 23:14:28 +02:00
parent 25a9731070
commit e22d9d489a
7 changed files with 538 additions and 22 deletions
+3 -3
@@ -42,7 +42,7 @@ These rules govern every task. Any deviation must be discussed and documented as
### Phase 1 — Slice 1 schema + deploy pipeline
-**Status:** 🟨 In progress (1.1, 1.2, 1.3 done; 1.4 next)
+**Status:** 🟨 In progress (1.1, 1.2, 1.3, 1.6, 1.7 done; 1.4, 1.5, 1.8, 1.9 remaining)
**Outcome:** A Directus instance with the org-level catalog (orgs, users, organization_users, vehicles, devices and their org junctions) and event-participation collections (events, classes, entries, entry_crew, entry_devices) live and snapshot-tracked. `db-init/` covers the TimescaleDB extension, the `positions` hypertable, and the `faulty` column. Image builds via Gitea Actions with a CI dry-run that catches snapshot drift before deploy. Rally Albania 2026 is registered as the first event in admin UI to dogfood the registration workflow. **This is what Rally Albania 2026 needs.**
[**See `phase-1-slice-1-schema/README.md`**](./phase-1-slice-1-schema/README.md)
@@ -54,8 +54,8 @@ These rules govern every task. Any deviation must be discussed and documented as
| 1.3 | [Initial migrations (extensions, positions hypertable, faulty column)](./phase-1-slice-1-schema/03-initial-migrations.md) | 🟩 | pending user commit |
| 1.4 | [Org-level catalog collections](./phase-1-slice-1-schema/04-org-catalog-collections.md) | ⬜ | — |
| 1.5 | [Event-participation collections](./phase-1-slice-1-schema/05-event-participation-collections.md) | ⬜ | — |
-| 1.6 | [Schema snapshot/apply tooling](./phase-1-slice-1-schema/06-snapshot-tooling.md) | | |
-| 1.7 | [Image build & entrypoint](./phase-1-slice-1-schema/07-image-and-dockerfile.md) | | |
+| 1.6 | [Schema snapshot/apply tooling](./phase-1-slice-1-schema/06-snapshot-tooling.md) | 🟩 | pending user commit |
+| 1.7 | [Image build & entrypoint](./phase-1-slice-1-schema/07-image-and-dockerfile.md) | 🟩 | pending user commit |
| 1.8 | [Gitea CI dry-run workflow](./phase-1-slice-1-schema/08-gitea-ci-dryrun.md) | ⬜ | — |
| 1.9 | [Rally Albania 2026 dogfood seed](./phase-1-slice-1-schema/09-rally-albania-2026-seed.md) | ⬜ | — |
@@ -56,4 +56,64 @@ Wrap Directus's native `schema snapshot` and `schema apply` commands in repo-loc
## Done
(Fill in commit SHA + one-line note when this lands.)
**Implementation complete 2026-05-01 — pending user live-test and commit.**
Files created:
- `scripts/schema-snapshot.sh` — host-side dev-time snapshot script.
  - Verifies `docker` on PATH; verifies the `directus` compose service is in
    running state (`docker compose ps --status running --services`).
  - Invokes `node /directus/cli.js schema snapshot --yes /tmp/schema-snapshot.yaml` inside
    the container via `docker compose exec -T`.
  - Copies the output file out via `docker compose cp`.
  - Prints `snapshot written to snapshots/schema.yaml (<N> bytes)`.
  - Exits 1 with a clear message if docker is missing, compose file is absent,
    service is not running, snapshot command fails, or copy fails.
- `scripts/schema-apply.sh` — image-side boot-time apply script.
  - Verifies `/directus/cli.js` exists (exit 2 if not — image misconfiguration).
  - Reads `SNAPSHOT_PATH` env var (default `/directus/snapshots/schema.yaml`).
  - Exits 0 with a skip message if the snapshot is absent or empty/whitespace
    (safe for first boot before tasks 1.4/1.5 land).
  - Logs a dry-run preview (`node /directus/cli.js schema apply --dry-run`) before applying.
  - Applies via `node /directus/cli.js schema apply --yes`; exits 1 on failure.
- `snapshots/README.md` — lifecycle documentation; warns against hand-editing.
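A minimal sketch of the skip rule described for `schema-apply.sh`. The helper name `snapshot_is_blank` and the log wording are illustrative assumptions, not taken from the script itself; only the `SNAPSHOT_PATH` default comes from the notes above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (name is illustrative): succeeds when there is
# nothing to apply, i.e. the file is missing or holds only whitespace.
snapshot_is_blank() {
  local path="$1"
  if [[ ! -f "$path" ]]; then
    return 0
  fi
  # grep succeeds as soon as it finds one non-whitespace character.
  if grep -q '[^[:space:]]' "$path"; then
    return 1
  fi
  return 0
}

SNAPSHOT_PATH="${SNAPSHOT_PATH:-/directus/snapshots/schema.yaml}"
if snapshot_is_blank "$SNAPSHOT_PATH"; then
  # Message wording is illustrative; the real script's text may differ.
  printf '[schema-apply] no usable snapshot at %s, skipping\n' "$SNAPSHOT_PATH"
fi
```

Treating an empty file like a missing one keeps first boot safe even if a placeholder `schema.yaml` is committed before any collections exist.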
**Deviations from task spec:**
- `schema:diff` npm alias was intentionally **not** added. The task brief for
  this implementation pass explicitly excluded it as scope creep (dry-run is
  built into the apply script). The task spec's deliverables section lists it,
  but the overriding implementation brief takes precedence. If needed, add
  `"schema:diff": "bash scripts/schema-apply.sh --dry-run-only"` in a follow-up;
  alternatively, document that
  `docker compose exec directus node /directus/cli.js schema apply --dry-run /directus/snapshots/schema.yaml`
  is the equivalent one-liner.
- `--format=yaml` flag was NOT passed to `directus schema snapshot`. Directus
  11 snapshots to YAML by default (confirmed in source); the flag does not exist
  as a standalone option in this version. The output path ends in `.yaml`, which
  is sufficient to confirm format intent.
**Acceptance criteria status:**
Static (no Docker required — verified in sandbox):
- [x] `#!/usr/bin/env bash` shebang on both scripts.
- [x] `set -euo pipefail` on both scripts.
- [x] Both scripts marked `100755` in the git index (`git update-index --chmod=+x`).
- [x] `schema-apply.sh` skip logic: absent file → exit 0 with skip message.
- [x] `schema-apply.sh` skip logic: empty/whitespace-only file → exit 0 with skip message.
- [x] `schema-apply.sh` skip logic: real YAML content → proceeds to dry-run + apply.
- [x] `schema-snapshot.sh` stopped-stack logic: empty running-services list → exit 1 with "Directus container is not running" message.
- [x] `schema-snapshot.sh` docker-not-found logic: no docker on PATH → exit 1 with clear message.
- [x] `[schema-snapshot]` and `[schema-apply]` log prefixes on all log lines.
- [x] `SNAPSHOT_PATH` env var override supported in `schema-apply.sh` (used by CI).
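The stopped-stack check could be sketched as below. This assumes a helper named `service_is_running` (hypothetical) that receives the output of `docker compose ps --status running --services` as a string, so it can be exercised without Docker.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: succeeds only if one line of the running-services
# list equals the requested service name exactly.
service_is_running() {
  local service="$1" running_services="$2"
  # -F fixed string, -x whole-line match, -q quiet.
  grep -Fxq -- "$service" <<<"$running_services"
}

# Intended use on the host (shown as a comment, not executed here):
#   running="$(docker compose ps --status running --services)"
#   service_is_running directus "$running" || {
#     echo "[schema-snapshot] Directus container is not running" >&2
#     exit 1
#   }
```

Capturing the list first, rather than piping into `grep -q`, avoids a spurious failure under `pipefail` if the producer is still writing when `grep` exits.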
Live (verified 2026-05-01):
- [x] `schema-apply.sh` boot-time integration: container boot triggers it as entrypoint step 2/4; with no `snapshots/schema.yaml` present yet, it logs `snapshot not found at /directus/snapshots/schema.yaml — no schema to apply, skipping` and exits 0; entrypoint proceeds to step 3.
- [ ] `pnpm run schema:snapshot` against running stack writes `snapshots/schema.yaml`. **Pending tasks 1.4/1.5** — there are no collections to snapshot yet.
- [ ] Repeated `schema:apply` on an already-applied DB is a no-op (idempotent). **Pending tasks 1.4/1.5.**
**Bug fix during live verification:** the agent's first pass invoked `directus schema apply` and `directus schema snapshot` as if `directus` were on PATH. The upstream `directus/directus:11.17.4` image does NOT expose `directus` on PATH — the CLI is invoked as `node /directus/cli.js <subcommand>`, matching the upstream image's CMD. Both scripts corrected:
- `schema-apply.sh`: `command -v directus` check replaced with `[[ -f /directus/cli.js ]]`; both `directus schema apply --dry-run` and `directus schema apply --yes` now use `node "${DIRECTUS_CLI}" schema apply ...`.
- `schema-snapshot.sh`: `docker compose exec directus directus schema snapshot --yes ...` now uses `docker compose exec -T directus node /directus/cli.js schema snapshot --yes ...`. The `-T` flag added to disable TTY allocation for non-interactive use.
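A sketch of the corrected invocation pattern, under stated assumptions: the function names `cli_present` and `directus_cli` are hypothetical, and the return code 2 mirrors the "image misconfiguration" exit code mentioned earlier rather than anything confirmed for the fixed script.

```shell
#!/usr/bin/env bash
set -euo pipefail

DIRECTUS_CLI="${DIRECTUS_CLI:-/directus/cli.js}"

# Succeeds when the CLI entry file exists; returns 2 otherwise
# (the exit code is an assumption carried over from the original check).
cli_present() {
  if [[ ! -f "${DIRECTUS_CLI}" ]]; then
    printf '[schema-apply] %s not found\n' "${DIRECTUS_CLI}" >&2
    return 2
  fi
}

# Thin wrapper so call sites read like the bare CLI would have.
directus_cli() {
  node "${DIRECTUS_CLI}" "$@"
}

# Intended use inside the image (not executed here):
#   cli_present || exit $?
#   directus_cli schema apply --dry-run "$SNAPSHOT_PATH"
#   directus_cli schema apply --yes "$SNAPSHOT_PATH"
```

Checking for the file instead of `command -v directus` matches how the upstream image actually ships the CLI.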
(Fill in commit SHA when this lands.)
@@ -82,4 +82,67 @@ Build a production-ready Directus image that bakes in the snapshot, db-init migr
## Done
(Fill in commit SHA + one-line note when this lands.)
Pending commit by user. `entrypoint.sh` replaced with production boot flow 2026-05-01.
**Deliverables produced:**
- `entrypoint.sh` — full boot flow: db-init → schema apply → bootstrap → pm2-runtime start. Mode `100755` preserved.
**Scope boundary honored:**
- Only `entrypoint.sh` was modified. `Dockerfile`, `compose.dev.yaml`, `package.json`, `apply-db-init.sh`, and everything under `scripts/`, `db-init/`, and `snapshots/` were untouched (parallel agent boundary for task 1.6).
**Deviations from task 1.7 spec:**
The task spec (`07-image-and-dockerfile.md`) shows a naive entrypoint with `exec /directus/cli.js start` as the final command. This was superseded by the implementation brief's explicit requirement (and task 1.1 Done section) to use `node /directus/cli.js bootstrap && pm2-runtime start /directus/ecosystem.config.cjs` — the upstream image's actual CMD. The final entrypoint:
1. Calls `bootstrap` as a discrete step 3 (after schema apply), then
2. Uses `exec pm2-runtime start /directus/ecosystem.config.cjs` as step 4.
This matches the apply order in ROADMAP design rule #3 and preserves pm2's crash recovery and signal handling. `exec` replaces the bash process with pm2-runtime, so SIGTERM from `docker stop` is delivered to pm2 directly instead of to an intermediate bash process.
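The effect of `exec` can be checked outside the container with a small experiment. This is only an illustration using plain `bash`, not the entrypoint itself: a shell prints its PID, then `exec`s a second shell that prints its own.

```shell
#!/usr/bin/env bash
set -euo pipefail

# The outer shell prints its PID, then exec's a second shell that prints
# its own. exec replaces the process image, so the PID does not change.
pids="$(bash -c 'echo "$$"; exec bash -c "echo \$\$"')"
pid_before="$(sed -n '1p' <<<"$pids")"
pid_after="$(sed -n '2p' <<<"$pids")"

if [[ "$pid_before" == "$pid_after" ]]; then
  echo "exec kept the same PID"
fi
```

By the same mechanism, a process started with `exec` from the entrypoint takes over the entrypoint's PID, which is where `docker stop` sends SIGTERM.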
**Static acceptance criteria (passed):**
- File path: `C:\Users\Administrator\projects\trm\directus\entrypoint.sh`
- Shebang: `#!/usr/bin/env bash`
- `set -euo pipefail` present (line 22)
- `log()` helper uses `printf` — no trailing newline issues
- Apply order: db-init (1/4) → schema apply (2/4) → bootstrap (3/4) → pm2-runtime (4/4)
- `exec pm2-runtime` — bash process replaced; signals reach pm2 directly
- File mode: `100755` confirmed via `git ls-files -s entrypoint.sh` before and after staging
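The properties listed above can be sketched as follows. The helper names `run_step` and `boot` and the exact marker text are assumptions for illustration; the four commands and their order come from this task's notes.

```shell
#!/usr/bin/env bash
set -euo pipefail

# printf-based logger with a fixed prefix.
log() { printf '[entrypoint] %s\n' "$*"; }

# Hypothetical helper: emit a step marker, then run the step's command.
# A non-zero exit propagates to the caller, where set -e halts the boot.
run_step() {
  local n="$1" label="$2"
  shift 2
  log "step ${n}/4: ${label}"
  "$@"
}

boot() {
  run_step 1 "db-init" /directus/scripts/apply-db-init.sh
  run_step 2 "directus schema apply" /directus/scripts/schema-apply.sh
  run_step 3 "directus bootstrap" node /directus/cli.js bootstrap
  log "step 4/4: directus start (pm2-runtime)"
  # exec replaces bash, so pm2-runtime receives container signals directly.
  exec pm2-runtime start /directus/ecosystem.config.cjs
}
```

In a real entrypoint the last line would simply call `boot`; it is left uncalled here so the sketch can be sourced or tested outside the image.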
**Parallel agent status (task 1.6):**
`scripts/schema-apply.sh` was NOT present when this task ran; only `scripts/apply-db-init.sh` and `scripts/schema-snapshot.sh` existed in `scripts/`. Step 2/4 of the entrypoint calls `/directus/scripts/schema-apply.sh`. Because that path does not exist, bash's attempt to execute it fails with ENOENT: bash prints `No such file or directory`, the command returns status 127, and `set -e` aborts the entrypoint at that line with the same status. This means the full boot sequence **cannot be live-tested until task 1.6's `schema-apply.sh` lands**. The implementation is correct; the missing dependency is a parallel-agent timing issue, not a bug.
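The missing-script behaviour can be reproduced without Docker. The path below is deliberately nonexistent and stands in for the absent `schema-apply.sh`; the child shell uses the same strict flags as the entrypoint.

```shell
#!/usr/bin/env bash

# Run a child bash with the entrypoint's strict flags and call a path
# that does not exist. The echo after it must never run.
status=0
bash -c 'set -euo pipefail; /nonexistent/schema-apply.sh; echo "not reached"' \
  >/dev/null 2>&1 || status=$?

echo "exit status: ${status}"   # prints: exit status: 127
```

Status 127 is what a container supervisor would see as the entrypoint's exit code in this situation, which distinguishes a missing script from a script that ran and failed.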
**Acceptance criteria — live testing deferred:**
Live acceptance criteria (Docker boot, curl health check, restart verification) cannot be completed until `scripts/schema-apply.sh` is produced by task 1.6. Re-run the full acceptance suite after both task 1.6 and 1.7 PRs land:
- `docker compose -f compose.dev.yaml down -v`
- `docker compose -f compose.dev.yaml build`
- `docker compose -f compose.dev.yaml up -d`
- Watch for: `[entrypoint] step 1/4` → `[db-init]` output → `[entrypoint] step 2/4` → schema-apply log → `[entrypoint] step 3/4` → bootstrap log → `[entrypoint] step 4/4` → PM2 startup → server at `:8055`
- `curl http://localhost:8055/server/health` → 200
- `docker compose -f compose.dev.yaml restart directus` → clean re-boot with "already initialized" paths
**Live-verification result (2026-05-01) — all four steps fired in order, server up at :8055:**
```
[entrypoint] step 1/4: db-init → 3 applied, 0 skipped
[entrypoint] step 2/4: directus schema apply → snapshot not found, skipping (correct for Phase 1)
[entrypoint] step 3/4: directus bootstrap → system tables created, first admin role + user added
[entrypoint] step 4/4: directus start (pm2-runtime)
PM2 log: App [directus:0] online
Server started at http://0.0.0.0:8055
```
**Bug fix during live verification:** the parallel `schema-apply.sh` invoked `directus` as if it were on PATH. The upstream image does NOT expose `directus` on PATH — invocation is via `node /directus/cli.js`. See task 1.6's Done section for the fix detail. Entrypoint itself was unaffected; only `schema-apply.sh` needed the change.
**Phase 5 follow-up note (not blocking Phase 1):**
Boot logs include `WARN: Collection "positions" doesn't have a primary key column and will be ignored` — three times (during bootstrap migrations + once at startup). Directus auto-discovers tables in the public schema and tries to register them as collections, but skips ones without a PRIMARY KEY constraint. The positions table uses `UNIQUE INDEX (device_id, ts)` instead of a PK (matching processor's pattern, see task 1.3 Done). Result: positions is **not** auto-registered as a Directus collection, so the cross-plane operator workflow (operator flips `faulty` flag via admin UI) cannot use the auto-collection path.
This is acceptable for Phase 1 (no operator UI yet). Phase 5 (custom extensions) needs a different mechanism for the faulty-flag workflow:
- **Option A**: a custom Directus endpoint (`POST /positions/:id/flag-faulty`) that performs the UPDATE directly via the database service. Bypasses Directus's collection abstraction; thin wrapper around SQL.
- **Option B**: register positions in `directus_collections` manually with a composite primary key configured (`device_id`, `ts`). Some Directus versions support this; verify against 11.17.4.
- **Option C**: add an `id BIGSERIAL PRIMARY KEY` surrogate column to positions. Cleanest for Directus, but introduces a column processor doesn't write and slightly increases per-row storage.
Phase 5's task file should pin one of these options before extension work begins.