Add Phase 1 task decisions documentation for db-init runner, positions table divergences, and Gitea CI workflow

This commit is contained in:
2026-05-02 12:23:06 +02:00
parent f9b96efc6b
commit 130b44778a
5 changed files with 126 additions and 8 deletions
@@ -3,3 +3,6 @@
- [Directus image facts](reference_directus_image.md) — directus/directus:11.17.4 confirmed on Docker Hub; Alpine-based (node:22-alpine); runs as user `node`; CMD is `node cli.js bootstrap && pm2-runtime start ecosystem.config.cjs` - [Directus image facts](reference_directus_image.md) — directus/directus:11.17.4 confirmed on Docker Hub; Alpine-based (node:22-alpine); runs as user `node`; CMD is `node cli.js bootstrap && pm2-runtime start ecosystem.config.cjs`
- [TimescaleDB-HA image facts](reference_timescaledb_ha.md) — timescale/timescaledb-ha:pg16-latest includes PostGIS by default; PGDATA=/pgdata (not /var/lib/postgresql/data); env vars: POSTGRES_USER, POSTGRES_PASSWORD, POSTGRES_DB - [TimescaleDB-HA image facts](reference_timescaledb_ha.md) — timescale/timescaledb-ha:pg16-latest includes PostGIS by default; PGDATA=/pgdata (not /var/lib/postgresql/data); env vars: POSTGRES_USER, POSTGRES_PASSWORD, POSTGRES_DB
- [Phase 1 scaffold decisions](project_phase1_scaffold.md) — key implementation decisions made during task 1.1 (image pins, volume paths, entrypoint command, apk package name) - [Phase 1 scaffold decisions](project_phase1_scaffold.md) — key implementation decisions made during task 1.1 (image pins, volume paths, entrypoint command, apk package name)
- [Phase 1 task 1.2 decisions](project_phase1_task12.md) — apply-db-init.sh: .gitattributes required for LF on Windows, chmod+x via git update-index, collision detection pattern, psql error capture idiom
- [Phase 1 task 1.3 decisions](project_phase1_task13.md) — positions table: 8 divergences between spec and processor migration; processor wins; assertion block patterns for information_schema + pg_indexes checks
- [Phase 1 task 1.8 decisions](project_phase1_task18.md) — CI workflow: no build-push-action (image must be in local daemon for docker run); --network host requires DB_HOST=localhost not postgres; curl -fsS for Portainer webhook
@@ -0,0 +1,26 @@
---
name: Phase 1 task 1.2 — db-init runner decisions
description: Implementation decisions and gotchas from writing apply-db-init.sh
type: project
---
# Phase 1 task 1.2 — db-init runner decisions
## Key decisions
- **`.gitattributes` required**: `core.autocrlf=true` on the Windows dev machine would corrupt shell scripts at checkout (turning LF → CRLF → `bad interpreter: /usr/bin/env bash^M`). Added `directus/.gitattributes` with `*.sh text eol=lf` and `*.sql text eol=lf`. This is a non-negotiable requirement for any new shell script in the directus service.
- **`git update-index --chmod=+x`**: The only way to set executable bit on Windows without WSL. Must be done after `git add` (not before — the file must already be indexed). If re-staged after a chmod, the bit is preserved as long as `--chmod=+x` was called on the indexed file.
- **Filename collision detection uses `[[ -v "ARRAY[key]" ]]`**: The Bash 4.2+ idiom for checking associative array key existence. Safe in the `node:22-alpine` base image (ships Bash 5.x). Do not use `[[ "${ARRAY[$key]+_}" ]]` — less readable.
- **`set -euo pipefail` + `|| psql_exit=$?`**: The `||` short-circuits `set -e` for that single statement, which is correct and idiomatic Bash. This is the approved pattern for capturing psql exit status without disabling `set -e` globally.
- **`run_psql` wrapper vs inline**: The guard-table bootstrap and the record-insert use `run_psql` / inline psql respectively. The apply step cannot use `run_psql` because it needs `-v ON_ERROR_STOP=1 -1 --file=` which are not generic flags. This duplication is intentional.
- **`compgen -G` for glob check**: Used to detect whether any `*.sql` files exist before calling `ls | sort`. Avoids `ls: cannot access ... No such file or directory` error under `set -e`.
- **Filename quoting in SQL**: Basenames are passed directly into SQL strings with single quotes. This is safe for filenames matching `[0-9]+_[a-z0-9_]+.sql` (the project convention). If someone introduces a filename with a single quote, it would break. Acceptable for now — document if it becomes a concern.
## What was NOT done (and why)
- No `trap ERR` — the spec says "if it adds clarity". With `set -euo pipefail` and explicit exit codes on every failure path, ERR trapping would only add noise.
- No lock file for concurrent-boot safety — the task spec acknowledges this risk and defers it to Phase 3+.
## Acceptance testing status
Docker was not available in the agent's shell environment. Acceptance tests must be run manually — see the task report for the exact commands.
@@ -0,0 +1,48 @@
---
name: Phase 1 task 1.3 decisions
description: positions hypertable schema divergences from spec, cross-check against processor migration, and assertion block patterns
type: project
---
Task 1.3 authored three SQL migrations under directus/db-init/. The critical finding was a substantial divergence between the task spec (03-initial-migrations.md) and the processor's actual migration (processor/src/db/migrations/0001_positions.sql). The processor migration wins in all cases.
**Why:** The processor is the sole writer for positions. If the table schema doesn't match what the processor inserts, writes fail at runtime with NOT NULL violations or column-not-found errors.
**How to apply:** Always read processor/src/db/migrations/0001_positions.sql before writing or modifying the positions table schema. Do not trust 03-initial-migrations.md column list without cross-checking.
## Divergences found (spec vs. processor ground truth)
1. **ingested_at** — not in spec, present in processor migration as `timestamptz NOT NULL DEFAULT now()`. Required.
2. **codec** — not in spec, present in processor migration as `text NOT NULL`. Required.
3. **altitude/angle/speed** — spec says `DOUBLE PRECISION nullable`; processor has `real NOT NULL`. Use `real NOT NULL`.
4. **satellites/priority** — spec says nullable; processor has `NOT NULL`. Use NOT NULL.
5. **attributes DEFAULT** — spec adds `DEFAULT '{}'::jsonb`; processor has no default. No default in the migration (processor always supplies the value).
6. **PRIMARY KEY vs UNIQUE INDEX** — spec uses `PRIMARY KEY (device_id, ts)`; processor uses `CREATE UNIQUE INDEX positions_device_ts ON positions (device_id, ts)` (no PK). TimescaleDB idiomatic: unique index, no PK. Index name is `positions_device_ts` (no `_idx` suffix).
7. **Chunk interval** — spec says `INTERVAL '7 days'`; processor uses `INTERVAL '1 day'`. Use 1 day.
8. **Indexes** — spec has one composite `(device_id, ts DESC)` index; processor has two: `positions_device_ts (device_id, ts)` (unique) and `positions_ts (ts DESC)`. Both are required.
## Assertion block pattern established
Each migration ends with a `DO $$ DECLARE ... BEGIN ... END $$;` block that:
- Checks table/column existence via `information_schema.columns`
- Checks hypertable registration via `timescaledb_information.hypertables`
- Checks index existence via `pg_indexes`
- Checks index predicate via `pg_indexes.indexdef ILIKE '%where (faulty = false)%'`
- Raises named exceptions on any mismatch
For the faulty column NOT NULL check, the pattern is:
```sql
SELECT data_type, is_nullable = 'NO', column_default
INTO _col_type, _col_notnull, _col_default
FROM information_schema.columns ...
```
The column_default for `DEFAULT FALSE` is stored by Postgres as the string `'false'` (lower-case, no quotes) in information_schema.
## Files created
- directus/db-init/001_extensions.sql
- directus/db-init/002_positions_hypertable.sql
- directus/db-init/003_faulty_column.sql
All three are append-only once applied. The runner's checksum guard (exit 2) enforces this.
@@ -0,0 +1,33 @@
---
name: Phase 1 task 1.8 decisions
description: Key implementation decisions and divergences from spec made during task 1.8 (Gitea CI dry-run workflow)
type: project
---
# Phase 1 task 1.8 — Gitea CI dry-run workflow decisions
## Core decisions
**No `docker/build-push-action`**: Used plain `docker build -t trm-directus:ci .` instead of `docker/build-push-action`.
**Why**: `build-push-action` with the docker-container Buildx driver exports the image into a separate buildkitd cache that is NOT accessible to a subsequent `docker run`. The dry-run step needs the image in the local Docker daemon. The processor workflow uses `build-push-action` but it has no post-build dry-run step.
**How to apply**: Any Directus workflow variant that needs to run the image after building must use plain `docker build`, not `build-push-action`.
**`--network host` + `DB_HOST=localhost`**: Service container is bound via `ports: ['5432:5432']` to the runner's loopback (127.0.0.1:5432). The `docker run` container uses `--network host` to share that namespace, making Postgres reachable as `localhost:5432`.
**Why**: The spec draft had a bug — it used `--network host` but set `DB_HOST: postgres`. With host networking, service containers are NOT reachable by their service name; only `localhost` works. The service name (`postgres`) is only resolvable in bridge-network mode.
**How to apply**: Always use `DB_HOST=localhost` when pairing `--network host` with a `services:` port-mapped container.
**`health-retries 20`**: Raised from spec's default of 10.
**Why**: The timescaledb-ha image has a slower startup than plain postgres (init script runs TimescaleDB preload). 10 retries at 5s = 50s max wait; 20 retries = 100s, safer margin.
**Portainer step uses `curl -fsS`**: Added `-f` (fail on HTTP error) and `-sS` (silent but show errors).
**Why**: Bare `curl -X POST` exits 0 even on a 4xx/5xx response. `-f` makes curl exit non-zero on server errors, so a misconfigured webhook URL surfaces as a workflow failure rather than a silent no-op.
**`--health-cmd` includes `-d directus`**: Spec draft had `pg_isready -U directus` without `-d directus`. Added the `-d` flag for precision.
**Deliberate divergences from processor workflow**:
- No `actions/setup-node`, no `corepack enable`, no `pnpm install` — Directus is not a Node project; no TypeScript to compile or test.
- No `docker/setup-buildx-action` — Buildx with docker-container driver sequesters images from `docker run`.
- No typecheck/lint/test steps — Phase 1 has no extensions. Phase 5 will add these.
- Added `services:` block — processor has no service dependency.
- Separate build + dry-run + push steps instead of single `build-push-action`.
- `runs-on: ubuntu-22.04` (pinned) vs processor's `ubuntu-latest` (floating).
@@ -1,16 +1,24 @@
--- ---
name: TimescaleDB-HA Docker image facts name: TimescaleDB-HA Docker image facts
description: Key facts about timescale/timescaledb-ha needed for compose and CI configuration description: Empirically verified facts about timescale/timescaledb-ha used in TRM compose files
type: reference type: reference
--- ---
# TimescaleDB-HA Docker image facts # TimescaleDB-HA Docker image facts
- **Image used**: `timescale/timescaledb-ha:pg16-latest` **CORRECTED after task 1.1 live boot (2026-05-01).**
- **PostGIS**: Included by default in all (non-oss) tags. The `-oss` suffix variants ship only OSS-licensed extensions (PostGIS is Apache-licensed, included even in `-oss`). The GitHub README confirms: "By default, the Docker image contains many extensions, including TimescaleDB and PostGIS."
- **`-all` suffix meaning**: Contains multiple Postgres major versions in one image (e.g. pg13+pg14+pg15). NOT about extra extensions. Not needed for TRM. - **Image pinned in TRM**: `timescale/timescaledb-ha:pg16.6-ts2.17.2-all`
- **PGDATA directory**: `/pgdata` — set via `PGDATA=/pgdata` env var. NOT the standard `/var/lib/postgresql/data`. Volume mounts must target `/pgdata`. - `:pg16-latest` does NOT exist on Docker Hub — it 404s at pull time.
- The `-all` suffix here means "all extensions bundled" (TimescaleDB + PostGIS + more), NOT "all PG versions".
- Always pin a concrete `pgX.Y-tsA.B.C-all` tag; floating tags are unreliable for this image.
- **PGDATA directory**: `/home/postgres/pgdata/data`
- NOT `/pgdata`, NOT `/var/lib/postgresql/data`.
- Setting `PGDATA=/pgdata` and mounting a volume to `/pgdata` causes initdb to fail with "could not change permissions of directory".
- compose.dev.yaml correctly sets `PGDATA: /home/postgres/pgdata/data` and mounts to the same path.
- **PostGIS**: Binaries are bundled in the `-all` image but the extension is NOT auto-created on user databases.
- Directus logs a benign warning: `PostGIS isn't installed. Geometry type support will be limited.`
- Must be created explicitly: `CREATE EXTENSION IF NOT EXISTS postgis;` — lands in db-init Phase 2.
- **Environment variables**: `POSTGRES_USER`, `POSTGRES_PASSWORD`, `POSTGRES_DB` (same as official postgres image). - **Environment variables**: `POSTGRES_USER`, `POSTGRES_PASSWORD`, `POSTGRES_DB` (same as official postgres image).
- **Healthcheck**: `pg_isready -U $$POSTGRES_USER -d $$POSTGRES_DB` (double-$ in compose YAML to escape interpolation; the container sees single $ at runtime). - **Healthcheck**: `pg_isready -U $$POSTGRES_USER -d $$POSTGRES_DB` (double-$ in compose YAML to escape interpolation).
- **Compatible with `CREATE EXTENSION postgis`**: Yes, binaries present. - **`psql` / `pg_isready` in the Directus container**: Installed via `apk add --no-cache postgresql16-client` in the Dockerfile (task 1.1). Available on PATH at runtime.
- **Compatible with `CREATE EXTENSION timescaledb`**: Yes, it's the whole point of the image.