Scaffold mirrors tcp-ingestion conventions: ESM, strict TS, pnpm, vitest
with unit/integration split, ESLint flat config with no-floating-promises
+ no-misused-promises + import/no-restricted-paths (the new src/core/ →
src/domain/ boundary that protects Phase 1 from Phase 2 churn).
Core types in src/core/types.ts (Position, StreamRecord, DeviceState,
Metrics, AttributeValue) — Position is byte-equivalent to tcp-ingestion's
output. Codec in src/core/codec.ts implements sentinel reversal:
{__bigint:"..."} → bigint, {__buffer_b64:"..."} → Buffer, ISO timestamp
string → Date. CodecError surfaces malformed payload reasons with the
failing field named.
Config in src/config/load.ts (zod schema, all 13 env vars with defaults
and bounded numerics). Logger in src/observability/logger.ts matches
tcp-ingestion exactly: ISO timestamps, string level labels, pino-pretty
in development.
Postgres in src/db/: createPool with sane defaults and application_name,
connectWithRetry mirroring the ioredis retry pattern, a 30-line
migration runner using a schema_migrations table, and 0001_positions.sql
with the hypertable + (device_id, ts) unique index + ts DESC index.
Migration runner unit-tested against a mocked pg.Pool; the real
TimescaleDB round-trip is deferred to task 1.10 per spec.
Verification: typecheck, lint, build all clean; 73 unit tests passing
across 4 files. import/no-restricted-paths verified live by temporarily
adding a forbidden src/domain/ import.
Task 1.4 — Postgres connection & positions hypertable
Phase: 1 — Throughput pipeline
Status: 🟩 Done
Depends on: 1.1, 1.3
Wiki refs: docs/wiki/entities/postgres-timescaledb.md
Goal
Stand up the Postgres connection (a single pg.Pool) and define the positions hypertable migration. This is the only table whose schema the Processor owns directly (per the design rule in ROADMAP.md). Every other table is owned by Directus.
Deliverables
- src/db/pool.ts exporting:
  - createPool(url: string): pg.Pool: single Pool with sane defaults (max: 10, idleTimeoutMillis: 30_000, connectionTimeoutMillis: 5_000). Sets application_name = 'processor' so connections are identifiable in pg_stat_activity.
  - connectWithRetry(pool, logger): Promise<void>: runs SELECT 1 with exponential backoff (3 attempts, up to 5s). Mirrors tcp-ingestion's connectRedis pattern. Calls process.exit(1) on final failure.
- src/db/migrations/0001_positions.sql containing:
  - CREATE EXTENSION IF NOT EXISTS timescaledb; (no-op if already enabled)
  - CREATE TABLE IF NOT EXISTS positions (...) per the schema below
  - SELECT create_hypertable('positions', 'ts', if_not_exists => TRUE, chunk_time_interval => INTERVAL '1 day');
  - CREATE UNIQUE INDEX IF NOT EXISTS positions_device_ts ON positions (device_id, ts);
  - CREATE INDEX IF NOT EXISTS positions_ts ON positions (ts DESC);
- src/db/migrate.ts: minimal runner that executes pending migration files in order. Tracks applied migrations in a schema_migrations(version, applied_at) table. Idempotent. Called from main.ts before the consumer starts.
- test/db/migrate.test.ts covering: applying a fresh migration; applying twice is a no-op; bad SQL fails loudly.
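The pool options and retry behaviour listed above can be sketched as follows. This is a minimal illustration, not the shipped src/db/pool.ts: the Queryable and Log interfaces are structural stand-ins so the block runs without the pg package, and RetryOptions (including the 1s base delay and the injectable sleep and exit hooks) is a hypothetical addition to make the function easy to unit-test. Only the option values, the 3 attempts, the 5s cap, and the process.exit(1) call come from the deliverable.

```typescript
// Structural stand-ins so the sketch runs without the pg package installed.
interface Queryable {
  query(sql: string): Promise<unknown>;
}
interface Log {
  warn(msg: string): void;
  error(msg: string): void;
}

// Option values from the deliverable; the keys follow pg's pool/client config.
function poolConfig(url: string) {
  return {
    connectionString: url,
    max: 10,
    idleTimeoutMillis: 30_000,
    connectionTimeoutMillis: 5_000,
    application_name: "processor",
  };
}

// Hypothetical knobs (not in the spec), injectable so tests avoid real timers/exit.
interface RetryOptions {
  maxAttempts?: number;
  baseDelayMs?: number; // assumed 1s; the spec only says "up to 5s"
  maxDelayMs?: number;
  sleep?: (ms: number) => Promise<void>;
  exit?: (code: number) => void;
}

async function connectWithRetry(
  pool: Queryable,
  logger: Log,
  opts: RetryOptions = {},
): Promise<void> {
  const maxAttempts = opts.maxAttempts ?? 3;
  const baseDelayMs = opts.baseDelayMs ?? 1_000;
  const maxDelayMs = opts.maxDelayMs ?? 5_000;
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((r) => setTimeout(r, ms)));
  const exit = opts.exit ?? ((code: number) => process.exit(code));

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await pool.query("SELECT 1");
      return; // connection verified
    } catch (err) {
      if (attempt === maxAttempts) {
        logger.error(`postgres unreachable after ${attempt} attempts: ${String(err)}`);
        exit(1);
        return;
      }
      // Exponential backoff: base, 2x base, 4x base, ... capped at maxDelayMs.
      const delay = Math.min(baseDelayMs * 2 ** (attempt - 1), maxDelayMs);
      logger.warn(`postgres connect attempt ${attempt} failed; retrying in ${delay}ms`);
      await sleep(delay);
    }
  }
}
```

With the real library, createPool would amount to new pg.Pool(poolConfig(url)). Injecting sleep and exit lets a fake Pool drive all three attempts in a unit test without fake timers.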
Specification
positions table schema
CREATE TABLE IF NOT EXISTS positions (
  device_id   text             NOT NULL,
  ts          timestamptz      NOT NULL,               -- canonical event time from device GPS
  ingested_at timestamptz      NOT NULL DEFAULT now(), -- when Processor wrote the row
  latitude    double precision NOT NULL,
  longitude   double precision NOT NULL,
  altitude    real             NOT NULL,
  angle       real             NOT NULL,
  speed       real             NOT NULL,
  satellites  smallint         NOT NULL,
  priority    smallint         NOT NULL,
  codec       text             NOT NULL,               -- '8' | '8E' | '16'
  attributes  jsonb            NOT NULL                -- the IO bag, sentinel-decoded
);
Why these column types
- device_id text: IMEIs are 15 ASCII digits. Could be bigint, but text keeps the door open for non-IMEI device identifiers (future vendors) and avoids leading-zero loss.
- ts timestamptz: the device-reported time, not ingestion time. This is the hypertable partitioning column.
- ingested_at timestamptz: diagnostic; helps spot devices with clock skew or buffered records (the 55-record buffer flush we saw on stage). Not part of the natural key.
- altitude / angle / speed real: float32 is plenty; saves space on a high-volume table.
- attributes jsonb: preserves the IO bag verbatim. Per the design rule, no naming or unit conversion happens here; that's Phase 2 in src/domain/.
bigint and Buffer attributes — JSONB encoding
The codec (task 1.2) decodes __bigint to bigint and __buffer_b64 to Buffer. Postgres jsonb is JSON, so we re-encode for storage:
- bigint → JSON string, always. The alternative considered was a JSON number when the value fits in Number.MAX_SAFE_INTEGER and a JSON string otherwise; always storing a string is simpler and unambiguous, so that is the decision.
- Buffer → base64 string.
Re-encoding loses the type tag. Phase 2 IO interpretation (per-model mapping table) is responsible for knowing that attributes.io_240 is a u64 stored as a string. Phase 1 doesn't need to query individual attributes — it's pass-through storage.
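As an illustration of the re-encoding rule, here is a minimal sketch. The AttributeValue union shown is an assumed shape (the real type lives in src/core/types.ts), the helper names toJsonbValue and toJsonbAttributes are hypothetical, and io_21 and io_raw are made-up keys alongside the io_240 example from the text.

```typescript
// Assumed decoded shape; the real AttributeValue is defined in src/core/types.ts.
type AttributeValue = number | string | boolean | bigint | Buffer;
type JsonValue = number | string | boolean;

// bigint -> decimal string (always), Buffer -> base64 string, everything else as-is.
function toJsonbValue(value: AttributeValue): JsonValue {
  if (typeof value === "bigint") return value.toString();
  if (Buffer.isBuffer(value)) return value.toString("base64");
  return value;
}

function toJsonbAttributes(
  attrs: Record<string, AttributeValue>,
): Record<string, JsonValue> {
  const out: Record<string, JsonValue> = {};
  for (const [key, value] of Object.entries(attrs)) {
    out[key] = toJsonbValue(value);
  }
  return out;
}

const stored = toJsonbAttributes({
  io_240: 18446744073709551615n, // u64 max: too large for a JSON number
  io_21: 4, // hypothetical small-int attribute
  io_raw: Buffer.from([0xde, 0xad, 0xbe, 0xef]), // hypothetical binary attribute
});

// JSON.stringify would throw on a raw bigint; after re-encoding it is safe.
console.log(JSON.stringify(stored));
```

Note that the stored object no longer distinguishes a bigint-as-string from an ordinary string, which is exactly the type-tag loss described above.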
If this becomes painful later, options to revisit: a separate attributes_typed column with structured shape; or store bigints as numeric and Buffers as bytea in dedicated columns. Defer — 80% of attributes are small ints, and the simple string approach unblocks Phase 1.
Migration runner
Follow the simplest possible pattern. The runner:
- CREATE TABLE IF NOT EXISTS schema_migrations (version text PRIMARY KEY, applied_at timestamptz NOT NULL DEFAULT now()).
- Lists *.sql files in src/db/migrations/ sorted by filename.
- For each, SELECT 1 FROM schema_migrations WHERE version = $1. If absent, run the SQL inside a transaction and insert the row.
- Logs each applied or skipped migration at info.
Do not introduce a heavy framework (Knex, node-pg-migrate). The Processor has one migration file in Phase 1 — a 30-line runner is the right answer.
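The runner steps can be sketched roughly as below. This is an illustration under stated assumptions rather than the actual src/db/migrate.ts: Client is a structural stand-in for a single checked-out pg client (a transaction has to stay on one connection), the Migration type is hypothetical, and the file listing is replaced by an in-memory array so the block stays self-contained.

```typescript
// Structural stand-in for one pg client; BEGIN/COMMIT must share a connection.
interface Client {
  query(sql: string, params?: unknown[]): Promise<{ rows: unknown[] }>;
}
interface Log {
  info(msg: string): void;
}
// Hypothetical shape: version is the filename, sql is the file's contents.
interface Migration {
  version: string;
  sql: string;
}

async function runMigrations(
  client: Client,
  migrations: Migration[],
  logger: Log,
): Promise<string[]> {
  await client.query(
    "CREATE TABLE IF NOT EXISTS schema_migrations (" +
      "version text PRIMARY KEY, applied_at timestamptz NOT NULL DEFAULT now())",
  );

  // Apply in filename order without mutating the caller's array.
  const ordered = [...migrations].sort((a, b) =>
    a.version < b.version ? -1 : a.version > b.version ? 1 : 0,
  );

  const applied: string[] = [];
  for (const m of ordered) {
    const seen = await client.query(
      "SELECT 1 FROM schema_migrations WHERE version = $1",
      [m.version],
    );
    if (seen.rows.length > 0) {
      logger.info(`migration ${m.version} skipped (already applied)`);
      continue;
    }
    await client.query("BEGIN");
    try {
      await client.query(m.sql);
      await client.query(
        "INSERT INTO schema_migrations (version) VALUES ($1)",
        [m.version],
      );
      await client.query("COMMIT");
    } catch (err) {
      await client.query("ROLLBACK"); // bad SQL fails loudly and leaves no row
      throw err;
    }
    logger.info(`migration ${m.version} applied`);
    applied.push(m.version);
  }
  return applied;
}
```

Returning the list of applied versions is a convenience for tests: a second run over the same input should return an empty array.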
Acceptance criteria
- pnpm typecheck, pnpm lint, pnpm test clean.
- Integration test (testcontainers TimescaleDB): apply migration; insert a row with a bigint-as-string attribute; query it back; verify shape.
- Re-running the migration on an already-migrated database is a no-op.
- connectWithRetry retries 3 times with exponential backoff, then calls process.exit(1). Verify with a unit test using a fake Pool.
Risks / open questions
- TimescaleDB extension availability. The deploy/ repo's Postgres container must be the timescale/timescaledb image, not stock postgres. Document this explicitly in the deploy README when Phase 1 ships. Fall back to a regular table (no hypertable) if the extension is unavailable: create_hypertable will error, but the IF NOT EXISTS table creation succeeds. Performance falls off a cliff at scale, but the pipeline stays functional. Caveat: this only holds if the statements are not applied atomically. The runner described above wraps each migration file in a transaction, so a failing CREATE EXTENSION or create_hypertable would also roll back the table creation.
- Schema authority overlap with Directus. Directus also speaks Postgres. When Directus connects and introspects the schema, it will see the positions table created by Processor. That's fine: Directus can reflect tables it didn't create. But if an operator later modifies positions from the Directus admin UI, the migration may break. Document: positions is a Processor-owned table; do not edit from Directus.
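One way to surface the first risk early is a preflight check before migrations run. The sketch below is a suggestion, not part of the spec: the function name timescaleAvailable is hypothetical and Queryable is a structural stand-in for a pg Pool or client. It relies on PostgreSQL's pg_available_extensions view, which lists extensions installable on the server.

```typescript
// Structural stand-in for a pg Pool or client.
interface Queryable {
  query(sql: string, params?: unknown[]): Promise<{ rows: unknown[] }>;
}

// True when the server image ships the timescaledb extension files,
// whether or not CREATE EXTENSION has been run yet.
async function timescaleAvailable(db: Queryable): Promise<boolean> {
  const res = await db.query(
    "SELECT 1 FROM pg_available_extensions WHERE name = $1",
    ["timescaledb"],
  );
  return res.rows.length > 0;
}
```

A caller could log a clear error naming the required timescale/timescaledb image when this returns false, instead of failing inside the migration.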
Done
(pending commit SHA) — Implemented src/db/pool.ts (createPool, connectWithRetry), src/db/migrate.ts (runMigrations — 30-line runner), and src/db/migrations/0001_positions.sql (hypertable + unique index + ts-desc index). Unit tests use a mocked pg.Pool throughout; the real TimescaleDB round-trip is deferred to task 1.10 per spec. The "calls process.exit(1)" pool test uses maxAttempts=1 to avoid fake-timer unhandled-rejection noise that surfaces when a backoff setTimeout resolves after the outer promise has already thrown.