Add planning documents for Phase 1 (throughput pipeline) and stub Phases 2-4
ROADMAP.md establishes status legend, architectural anchors pointing at the wiki, and seven non-negotiable design rules — most importantly the core/domain boundary that protects Phase 1 from Phase 2 churn, the schema-authority split (positions hypertable owned here; everything else owned by Directus), and idempotent-writes via (device_id, ts) ON CONFLICT. Phase 1 (throughput pipeline) is fully detailed across 11 task files: scaffold, core types + sentinel decoder, config + logging, Postgres hypertable, Redis Stream consumer, per-device LRU state, batched writer, main wiring, observability, integration test, Dockerfile + Gitea CI. Observability is in Phase 1 (not deferred) — lesson learned from tcp-ingestion task 1.10. Phases 2-4 are stub READMEs. Phase 2 (domain logic) blocks on Directus schema decisions and lists those open questions explicitly. Phase 3 (production hardening) and Phase 4 (future) sketch the task shape.
# Task 1.4 — Postgres connection & `positions` hypertable

**Phase:** 1 — Throughput pipeline
**Status:** ⬜ Not started
**Depends on:** 1.1, 1.3
**Wiki refs:** `docs/wiki/entities/postgres-timescaledb.md`

## Goal

Stand up the Postgres connection (a single `pg.Pool`) and define the `positions` hypertable migration. This is the only table whose schema the Processor owns directly (per the design rule in ROADMAP.md). Every other table is owned by Directus.

## Deliverables

- `src/db/pool.ts` exporting:
  - `createPool(url: string): pg.Pool` — a single Pool with sane defaults (`max: 10`, `idleTimeoutMillis: 30_000`, `connectionTimeoutMillis: 5_000`). Sets `application_name = 'processor'` so connections are identifiable in `pg_stat_activity`.
  - `connectWithRetry(pool, logger): Promise<void>` — runs `SELECT 1` with exponential backoff (3 attempts, up to 5 s). Mirrors `tcp-ingestion`'s `connectRedis` pattern. Calls `process.exit(1)` on final failure.
- `src/db/migrations/0001_positions.sql` containing:
  - `CREATE EXTENSION IF NOT EXISTS timescaledb;` (a no-op if already enabled)
  - `CREATE TABLE IF NOT EXISTS positions (...)` per the schema below
  - `SELECT create_hypertable('positions', 'ts', if_not_exists => TRUE, chunk_time_interval => INTERVAL '1 day');`
  - `CREATE UNIQUE INDEX IF NOT EXISTS positions_device_ts ON positions (device_id, ts);`
  - `CREATE INDEX IF NOT EXISTS positions_ts ON positions (ts DESC);`
- `src/db/migrate.ts` — a minimal runner that executes pending migration files in order. Tracks applied migrations in a `schema_migrations(version, applied_at)` table. Idempotent. Called from `main.ts` before the consumer starts.
- `test/db/migrate.test.ts` covering: applying a fresh migration; applying twice is a no-op; bad SQL fails loudly.

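The two `src/db/pool.ts` deliverables can be sketched as follows. This is a minimal sketch, not the final module: `PoolLike`, `log`, and the injectable `exit` hook exist only to make the retry path testable with a fake Pool (as the acceptance criteria require), and the option names in `poolConfig` assume node-postgres.

```typescript
// Sketch only — the real module is src/db/pool.ts. `PoolLike` and the
// injectable `exit` are illustrative seams, not the deliverable's signature.
interface PoolLike {
  query(sql: string): Promise<unknown>;
}

// The config object handed to `new pg.Pool(...)` (node-postgres option names).
export function poolConfig(url: string) {
  return {
    connectionString: url,
    max: 10,
    idleTimeoutMillis: 30_000,
    connectionTimeoutMillis: 5_000,
    application_name: "processor", // identifiable in pg_stat_activity
  };
}

export async function connectWithRetry(
  pool: PoolLike,
  log: (msg: string) => void,
  attempts = 3,
  exit: (code: number) => void = (code) => process.exit(code),
): Promise<void> {
  for (let i = 0; i < attempts; i++) {
    try {
      await pool.query("SELECT 1"); // cheap liveness probe
      log("postgres connected");
      return;
    } catch {
      log(`postgres connect failed (attempt ${i + 1}/${attempts})`);
      if (i === attempts - 1) return exit(1); // final failure: give up
      // exponential backoff: 500 ms, 1 s, 2 s, … capped at 5 s
      await new Promise((r) => setTimeout(r, Math.min(5_000, 500 * 2 ** i)));
    }
  }
}
```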
## Specification

### `positions` table schema

```sql
CREATE TABLE IF NOT EXISTS positions (
  device_id   text             NOT NULL,
  ts          timestamptz      NOT NULL,                -- canonical event time from device GPS
  ingested_at timestamptz      NOT NULL DEFAULT now(),  -- when the Processor wrote the row
  latitude    double precision NOT NULL,
  longitude   double precision NOT NULL,
  altitude    real             NOT NULL,
  angle       real             NOT NULL,
  speed       real             NOT NULL,
  satellites  smallint         NOT NULL,
  priority    smallint         NOT NULL,
  codec       text             NOT NULL,                -- '8' | '8E' | '16'
  attributes  jsonb            NOT NULL                 -- the IO bag, sentinel-decoded
);
```

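As a worked example of what the `(device_id, ts)` unique index buys: ROADMAP.md's idempotent-writes rule means the Phase 1 writer can replay Redis Stream entries safely via `ON CONFLICT ... DO NOTHING`. A hedged sketch with a hypothetical `buildUpsert` helper — the column list is abbreviated for readability; the real writer (task 1.7) must supply every `NOT NULL` column:

```typescript
// Hypothetical helper (not a deliverable of this task): builds one multi-row
// INSERT whose conflict clause leans on the positions_device_ts unique index.
type PositionRow = {
  device_id: string;
  ts: string; // ISO 8601 timestamptz literal
  latitude: number;
  longitude: number;
  attributes: Record<string, unknown>;
};

export function buildUpsert(rows: PositionRow[]): { text: string; values: unknown[] } {
  const cols = ["device_id", "ts", "latitude", "longitude", "attributes"] as const;
  const values: unknown[] = [];
  const tuples = rows.map((row, r) => {
    for (const c of cols) values.push(c === "attributes" ? JSON.stringify(row[c]) : row[c]);
    const placeholders = cols.map((_, i) => `$${r * cols.length + i + 1}`);
    return `(${placeholders.join(", ")})`;
  });
  return {
    // A replayed stream entry hits the (device_id, ts) index and is dropped.
    text:
      `INSERT INTO positions (${cols.join(", ")}) VALUES ${tuples.join(", ")} ` +
      `ON CONFLICT (device_id, ts) DO NOTHING`,
    values,
  };
}
```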
### Why these column types

- `device_id text` — IMEIs are 15 ASCII digits. It could be `bigint`, but `text` keeps the door open for non-IMEI device identifiers (future vendors) and avoids leading-zero loss.
- `ts timestamptz` — the **device-reported** time, not ingestion time. This is the hypertable partitioning column.
- `ingested_at timestamptz` — diagnostic: helps spot devices with clock skew or buffered records (the 55-record buffer flush we saw on stage). Not part of the natural key.
- `altitude`/`angle`/`speed` as `real` — float32 precision is plenty, and it saves space on a high-volume table.
- `attributes jsonb` — preserves the IO bag verbatim. Per the design rule, no naming or unit conversion happens here; that's Phase 2 in `src/domain/`.

### bigint and Buffer attributes — JSONB encoding

The codec (task 1.2) decodes `__bigint` to `bigint` and `__buffer_b64` to `Buffer`. Postgres `jsonb` is plain JSON, so we re-encode for storage:

- `bigint` → JSON number if it fits in `Number.MAX_SAFE_INTEGER`, else JSON string. Always storing a string is simpler and unambiguous; **decision: always string for bigint**.
- `Buffer` → base64 string.

**Re-encoding loses the type tag.** Phase 2 IO interpretation (the per-model mapping table) is responsible for knowing that `attributes.io_240` is a u64 stored as a string. Phase 1 doesn't need to query individual attributes — it's pass-through storage.

If this becomes painful later, options to revisit: a separate `attributes_typed` column with a structured shape, or storing bigints as `numeric` and Buffers as `bytea` in dedicated columns. **Defer** — 80% of attributes are small ints, and the simple string approach unblocks Phase 1.

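The re-encoding rule above can be sketched as a recursive walk — a hypothetical `toJsonbSafe` (name illustrative), assuming Node's `Buffer`:

```typescript
// Sketch of the bigint/Buffer → JSON re-encoding decision: bigint is ALWAYS
// a string, Buffer is base64; everything else passes through untouched.
export function toJsonbSafe(value: unknown): unknown {
  if (typeof value === "bigint") return value.toString(); // always string, even when it fits in a double
  if (value instanceof Buffer) return value.toString("base64");
  if (Array.isArray(value)) return value.map(toJsonbSafe);
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value as Record<string, unknown>).map(([k, v]) => [k, toJsonbSafe(v)]),
    );
  }
  return value; // numbers, strings, booleans, null
}
```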
### Migration runner

Follow the simplest possible pattern. The runner:

1. `CREATE TABLE IF NOT EXISTS schema_migrations (version text PRIMARY KEY, applied_at timestamptz NOT NULL DEFAULT now())`.
2. Lists the `*.sql` files in `src/db/migrations/`, sorted by filename.
3. For each, runs `SELECT 1 FROM schema_migrations WHERE version = $1`. If absent, runs the SQL inside a transaction and inserts the row.
4. Logs each applied or skipped migration at `info`.

Do **not** introduce a heavy framework (Knex, node-pg-migrate). The Processor has one migration file in Phase 1 — a 30-line runner is the right answer.

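The four steps above can be sketched as follows. A hypothetical `runMigrations` written against a minimal `Queryable` seam so it can be exercised with a fake; the real `migrate.ts` would pass a pg client and read the files from `src/db/migrations/` itself:

```typescript
// Sketch of the ~30-line runner. `Queryable` is an illustrative seam, not
// the deliverable's signature.
interface Queryable {
  query(sql: string, params?: unknown[]): Promise<{ rows: unknown[] }>;
}

export async function runMigrations(
  db: Queryable,
  migrations: { version: string; sql: string }[], // already sorted by filename
  log: (msg: string) => void = () => {},
): Promise<void> {
  // Step 1: bootstrap the tracking table.
  await db.query(
    "CREATE TABLE IF NOT EXISTS schema_migrations (version text PRIMARY KEY, applied_at timestamptz NOT NULL DEFAULT now())",
  );
  for (const m of migrations) {
    // Step 3: skip anything already recorded.
    const seen = await db.query("SELECT 1 FROM schema_migrations WHERE version = $1", [m.version]);
    if (seen.rows.length > 0) {
      log(`skipped ${m.version}`);
      continue;
    }
    // Run the migration and record it atomically.
    await db.query("BEGIN");
    try {
      await db.query(m.sql);
      await db.query("INSERT INTO schema_migrations (version) VALUES ($1)", [m.version]);
      await db.query("COMMIT");
      log(`applied ${m.version}`);
    } catch (err) {
      await db.query("ROLLBACK");
      throw err; // bad SQL fails loudly — abort startup
    }
  }
}
```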
## Acceptance criteria

- [ ] `pnpm typecheck`, `pnpm lint`, `pnpm test` clean.
- [ ] Integration test (testcontainers TimescaleDB): apply the migration; insert a row with a bigint-as-string attribute; query it back; verify the shape.
- [ ] Re-running the migration on an already-migrated database is a no-op.
- [ ] `connectWithRetry` retries 3 times with exponential backoff, then calls `process.exit(1)`. Verify with a unit test using a fake Pool.

## Risks / open questions

- **TimescaleDB extension availability.** The `deploy/` repo's Postgres container must be the `timescale/timescaledb` image, not stock `postgres`. Document this explicitly in the deploy README when Phase 1 ships. If the extension is unavailable, fall back to a regular table (no hypertable): `create_hypertable` will error, but the `IF NOT EXISTS` table creation succeeds. Performance falls off a cliff at scale, but it remains functional.
- **Schema-authority overlap with Directus.** Directus also speaks Postgres. When Directus connects and introspects the schema, it will see the `positions` table created by the Processor. That's fine — Directus can reflect tables it didn't create. But if an operator later modifies `positions` from the Directus admin UI, the migration may break. Document: `positions` is a Processor-owned table; do not edit it from Directus.

## Done

(Fill in once complete: commit SHA, brief notes.)