e22d9d489a
Two parallel tasks landing together. The boot pipeline is now wired end-to-end: db-init → schema apply → directus bootstrap → pm2-runtime. Live-verified by booting a fresh compose stack to a serving Directus admin UI on :8055. Task 1.6 — snapshot tooling: - scripts/schema-snapshot.sh — host-side, dev-time. Verifies docker is on PATH and the directus compose service is running, runs `node /directus/cli.js schema snapshot --yes` inside the container, copies the YAML out to ./snapshots/schema.yaml. Used after admin-UI schema changes to capture the new state for git commit. - scripts/schema-apply.sh — image-side, boot-time. Reads /directus/snapshots/schema.yaml, runs a dry-run preview, then applies. Gracefully skips when the snapshot is absent or whitespace- only (Phase 1 first-boot path before tasks 1.4/1.5 produce collections). SNAPSHOT_PATH env var override for CI flexibility. - snapshots/README.md — lifecycle doc; warns against hand-editing. Task 1.7 — real entrypoint flow: - entrypoint.sh rewritten from Phase 1.1's placeholder to the 4-step boot per ROADMAP design rule #3: 1/4 db-init → /directus/scripts/apply-db-init.sh 2/4 schema apply → /directus/scripts/schema-apply.sh 3/4 directus bootstrap → node /directus/cli.js bootstrap 4/4 directus start → exec pm2-runtime start ecosystem.config.cjs set -euo pipefail halts boot on any step's non-zero exit. Each step emits a [entrypoint] log marker so an operator reading container logs sees which step failed. Bug found and fixed during live verification: - Both 1.6 scripts initially called bare `directus schema ...` as if the CLI were on PATH. Upstream directus/directus:11.17.4 does NOT expose `directus` on PATH — invocation is via `node /directus/cli.js`, same pattern as the entrypoint's bootstrap step. Both scripts corrected. Also added -T to docker compose exec in schema-snapshot.sh so the script works in non-TTY contexts (CI). Phase 5 follow-up (non-blocking) flagged in 07's Done section: Directus warns "Collection 'positions' doesn't have a primary key column and will be ignored". The positions table uses UNIQUE INDEX (device_id, ts) matching processor's pattern, not a PK constraint. Means positions is not auto-registered as a Directus collection — fine for Phase 1, but the operator faulty-flag workflow will need a custom endpoint or manual collection registration in Phase 5. ROADMAP marks 1.6 + 1.7 done. Phase 1 progress: 5/9 tasks complete (1.1, 1.2, 1.3, 1.6, 1.7); 1.4, 1.5, 1.8, 1.9 remain.
149 lines
10 KiB
Markdown
149 lines
10 KiB
Markdown
# Task 1.7 — Image build & entrypoint
|
|
|
|
**Phase:** 1 — Slice 1 schema + deploy pipeline
|
|
**Status:** ⬜ Not started
|
|
**Depends on:** 1.2, 1.3, 1.6 (need the runner, migrations, and snapshot tooling all in place)
|
|
**Wiki refs:** `docs/wiki/entities/directus.md` (Schema management section)
|
|
|
|
## Goal
|
|
|
|
Build a production-ready Directus image that bakes in the snapshot, db-init migrations, extensions directory, and entrypoint script. Replace the placeholder entrypoint from 1.1 with the real boot sequence: db-init → schema apply → directus start.
|
|
|
|
## Deliverables
|
|
|
|
- `Dockerfile` (replacing the placeholder from 1.1):
|
|
```dockerfile
|
|
FROM directus/directus:11.5.1 # pin specific patch version
|
|
|
|
USER root
|
|
RUN apk add --no-cache postgresql16-client bash coreutils
|
|
USER node
|
|
|
|
COPY --chown=node:node snapshots/ /directus/snapshots/
|
|
COPY --chown=node:node db-init/ /directus/db-init/
|
|
COPY --chown=node:node extensions/ /directus/extensions/
|
|
COPY --chown=node:node scripts/ /directus/scripts/
|
|
COPY --chown=node:node entrypoint.sh /directus/entrypoint.sh
|
|
RUN chmod +x /directus/entrypoint.sh /directus/scripts/*.sh
|
|
|
|
ENTRYPOINT ["/directus/entrypoint.sh"]
|
|
```
|
|
Adjust `apk` / `apt-get` based on the upstream image's distro. `postgresql-client` is required for `psql` in the db-init runner.
|
|
- `entrypoint.sh`:
|
|
```sh
|
|
#!/usr/bin/env bash
|
|
set -euo pipefail
|
|
|
|
echo "[entrypoint] running db-init"
|
|
/directus/scripts/apply-db-init.sh
|
|
|
|
echo "[entrypoint] applying Directus schema snapshot"
|
|
/directus/scripts/schema-apply.sh
|
|
|
|
echo "[entrypoint] starting Directus"
|
|
exec /directus/cli.js start
|
|
```
|
|
(Verify `/directus/cli.js start` is the correct upstream command for the pinned version. Some versions use `node /directus/server.js`.)
|
|
- Update `compose.dev.yaml` so the dev image uses the same Dockerfile (no special path in dev). The local image has identical boot semantics to prod — only env vars differ.
|
|
|
|
## Specification
|
|
|
|
- **Pin the Directus version exactly** (e.g. `11.5.1`, not `11`). Version bumps land via PR.
|
|
- **Layer ordering for cache friendliness.**
|
|
1. `FROM` + apk install (rarely changes).
|
|
2. `COPY scripts/` (changes occasionally).
|
|
3. `COPY entrypoint.sh` (rarely changes).
|
|
4. `COPY db-init/` (changes per migration PR).
|
|
5. `COPY snapshots/` (changes per schema PR — most volatile).
|
|
6. `COPY extensions/` (Phase 5+).
|
|
Putting the most-changed layer last maximizes cache reuse for the rest.
|
|
- **`USER node`** for runtime (matches upstream image's non-root convention).
|
|
- **Health check.** Add a `HEALTHCHECK` instruction calling `wget -qO- http://localhost:8055/server/ping` (or the upstream's health endpoint), with sensible interval/timeout. Useful in compose and Portainer.
|
|
- **Entrypoint failure modes.** If db-init fails → exit, container restarts (Docker will retry). If schema apply fails → same. Both failures should produce clear log lines so an operator looking at Portainer logs can diagnose.
|
|
- **No `EXPOSE` change** — the upstream image already exposes `8055`.
|
|
- **No `ENV` overrides** for Directus runtime config in the Dockerfile — that's the deployer's concern via env vars at runtime.
|
|
|
|
## Acceptance criteria
|
|
|
|
- [ ] `docker build -t trm-directus:dev .` succeeds.
|
|
- [ ] Image size is reasonable (< 600 MB; upstream image + tooling).
|
|
- [ ] Booting against a fresh Postgres: db-init applies all three migrations, schema apply creates 12 collections, Directus starts and serves on `:8055`.
|
|
- [ ] Re-booting against the same Postgres (warm DB): db-init reports "0 applied, 3 skipped", schema apply reports "no changes", Directus starts.
|
|
- [ ] Killing Postgres mid-db-init → container exits non-zero with clear error in logs.
|
|
- [ ] Killing Postgres mid-schema-apply → container exits non-zero with clear error in logs.
|
|
- [ ] HEALTHCHECK reports "healthy" once Directus is serving.
|
|
- [ ] `compose.dev.yaml` `directus` service uses the local Dockerfile build and works end-to-end (`pnpm dev:reset` → fresh boot → admin UI loads).
|
|
|
|
## Risks / open questions
|
|
|
|
- **Upstream image distro.** Directus's official image has used both Alpine and Debian-based bases over the years. Verify the current 11.x base and adjust `apk` vs `apt-get` accordingly.
|
|
- **`/directus/cli.js start` path.** Confirm against the upstream Dockerfile / docs for the pinned version. Bake the right command into entrypoint.sh.
|
|
- **Permissions on `/directus/snapshots/` etc.** If the upstream user is `node` (uid 1000), the `--chown=node:node` flag is right. Verify with `docker run --rm trm-directus:dev id`.
|
|
|
|
## Done
|
|
|
|
Pending commit by user. `entrypoint.sh` replaced with production boot flow 2026-05-01.
|
|
|
|
**Deliverables produced:**
|
|
|
|
- `entrypoint.sh` — full boot flow: db-init → schema apply → bootstrap → pm2-runtime start. Mode `100755` preserved.
|
|
|
|
**Scope boundary honored:**
|
|
- Only `entrypoint.sh` was modified. `Dockerfile`, `compose.dev.yaml`, `package.json`, `apply-db-init.sh`, and everything under `scripts/`, `db-init/`, and `snapshots/` were untouched (parallel agent boundary for task 1.6).
|
|
|
|
**Deviations from task 1.7 spec:**
|
|
|
|
The task spec (`07-image-and-dockerfile.md`) shows a naive entrypoint with `exec /directus/cli.js start` as the final command. This was superseded by the implementation brief's explicit requirement (and task 1.1 Done section) to use `node /directus/cli.js bootstrap && pm2-runtime start /directus/ecosystem.config.cjs` — the upstream image's actual CMD. The final entrypoint:
|
|
1. Calls `bootstrap` as a discrete step 3 (after schema apply), then
|
|
2. Uses `exec pm2-runtime start /directus/ecosystem.config.cjs` as step 4.
|
|
|
|
This matches the ROADMAP design rule #3 apply order and preserves pm2's crash recovery and signal handling. `exec` replaces the bash process so SIGTERM from `docker stop` reaches pm2 directly without traversal through bash.
|
|
|
|
**Static acceptance criteria (passed):**
|
|
|
|
- File path: `C:\Users\Administrator\projects\trm\directus\entrypoint.sh`
|
|
- Shebang: `#!/usr/bin/env bash`
|
|
- `set -euo pipefail` present (line 22)
|
|
- `log()` helper uses `printf` — no trailing newline issues
|
|
- Apply order: db-init (1/4) → schema apply (2/4) → bootstrap (3/4) → pm2-runtime (4/4)
|
|
- `exec pm2-runtime` — bash process replaced; signals reach pm2 directly
|
|
- File mode: `100755` confirmed via `git ls-files -s entrypoint.sh` before and after staging
|
|
|
|
**Parallel agent status (task 1.6):**
|
|
|
|
`scripts/schema-apply.sh` was NOT present when this task ran — only `scripts/apply-db-init.sh` and `scripts/schema-snapshot.sh` existed in `scripts/`. Step 2/4 of the entrypoint calls `/directus/scripts/schema-apply.sh`. With `set -euo pipefail`, a missing script causes bash to exit non-zero at that line before attempting execution (the shell resolves the command, finds it executable, then the kernel `exec` fails with ENOENT → bash reports the error and exits 127). This means the full boot sequence **cannot be live-tested until task 1.6's `schema-apply.sh` lands**. The implementation is correct; the missing dependency is a parallel-agent timing issue, not a bug.
|
|
|
|
**Acceptance criteria — live testing deferred:**
|
|
|
|
Live acceptance criteria (Docker boot, curl health check, restart verification) cannot be completed until `scripts/schema-apply.sh` is produced by task 1.6. Re-run the full acceptance suite after both task 1.6 and 1.7 PRs land:
|
|
- `docker compose -f compose.dev.yaml down -v`
|
|
- `docker compose -f compose.dev.yaml build`
|
|
- `docker compose -f compose.dev.yaml up -d`
|
|
- Watch for: `[entrypoint] step 1/4` → `[db-init]` output → `[entrypoint] step 2/4` → schema-apply log → `[entrypoint] step 3/4` → bootstrap log → `[entrypoint] step 4/4` → PM2 startup → server at `:8055`
|
|
- `curl http://localhost:8055/server/health` → 200
|
|
- `docker compose -f compose.dev.yaml restart directus` → clean re-boot with "already initialized" paths
|
|
|
|
**Live-verification result (2026-05-01) — all four steps fired in order, server up at :8055:**
|
|
|
|
```
|
|
[entrypoint] step 1/4: db-init → 3 applied, 0 skipped
|
|
[entrypoint] step 2/4: directus schema apply → snapshot not found, skipping (correct for Phase 1)
|
|
[entrypoint] step 3/4: directus bootstrap → system tables created, first admin role + user added
|
|
[entrypoint] step 4/4: directus start (pm2-runtime)
|
|
PM2 log: App [directus:0] online
|
|
Server started at http://0.0.0.0:8055
|
|
```
|
|
|
|
**Bug fix during live verification:** the parallel `schema-apply.sh` invoked `directus` as if it were on PATH. The upstream image does NOT expose `directus` on PATH — invocation is via `node /directus/cli.js`. See task 1.6's Done section for the fix detail. Entrypoint itself was unaffected; only `schema-apply.sh` needed the change.
|
|
|
|
**Phase 5 follow-up note (not blocking Phase 1):**
|
|
|
|
Boot logs include `WARN: Collection "positions" doesn't have a primary key column and will be ignored` — three times (during bootstrap migrations + once at startup). Directus auto-discovers tables in the public schema and tries to register them as collections, but skips ones without a PRIMARY KEY constraint. The positions table uses `UNIQUE INDEX (device_id, ts)` instead of a PK (matching processor's pattern, see task 1.3 Done). Result: positions is **not** auto-registered as a Directus collection, so the cross-plane operator workflow (operator flips `faulty` flag via admin UI) cannot use the auto-collection path.
|
|
|
|
This is acceptable for Phase 1 (no operator UI yet). Phase 5 (custom extensions) needs a different mechanism for the faulty-flag workflow:
|
|
- **Option A**: a custom Directus endpoint (`POST /positions/:id/flag-faulty`) that performs the UPDATE directly via the database service. Bypasses Directus's collection abstraction; thin wrapper around SQL.
|
|
- **Option B**: register positions in `directus_collections` manually with a composite primary key configured (`device_id`, `ts`). Some Directus versions support this; verify against 11.17.4.
|
|
- **Option C**: add an `id BIGSERIAL PRIMARY KEY` surrogate column to positions. Cleanest for Directus, but introduces a column processor doesn't write and slightly increases per-row storage.
|
|
|
|
Phase 5's task file should pin one of these options before extension work begins.
|