Task 1.8 — Main wiring & ACK semantics
Phase: 1 — Throughput pipeline
Status: ⬜ Not started
Depends on: 1.5, 1.6, 1.7
Wiki refs: docs/wiki/entities/processor.md
Goal
Assemble the throughput pipeline in src/main.ts: connect Redis + Postgres → run migrations → build the device-state store → build the writer → build the consumer with a sink that calls state.update() then writer.write() → start. Establish the rule for what to ACK and when.
Deliverables
- src/main.ts updated to:
  - loadConfig() (from task 1.3).
  - createLogger() (from task 1.3).
  - createPool(config.POSTGRES_URL) and connectWithRetry() (from task 1.4).
  - Run migrations via migrate() (from task 1.4) before any consumer activity.
  - Connect Redis with connectRedis(...) (re-implement the tcp-ingestion retry pattern; small enough to copy).
  - Build state = createDeviceStateStore(config, logger).
  - Build writer = createWriter(pool, config, logger, metrics).
  - Build consumer = createConsumer(redis, config, logger, metrics, sink) where sink is the function defined below.
  - await consumer.start().
  - Install graceful shutdown stub (full Phase 3 hardening later): on SIGTERM/SIGINT, call consumer.stop(), await pending writes, close Redis + Pool, exit.
- src/main.ts defines the sink function (the central decision point):

      async function sink(records: ConsumedRecord[]): Promise<string[]> {
        // 1. Update in-memory state for every record (cheap, synchronous, can't fail meaningfully)
        for (const r of records) state.update(r.position);

        // 2. Write to Postgres
        const results = await writer.write(records);

        // 3. ACK only the IDs that succeeded or were duplicates
        return results
          .filter(r => r.status === 'inserted' || r.status === 'duplicate')
          .map(r => r.id);
      }

- A placeholder metrics shim: the same trace-logging stub as tcp-ingestion originally had (task 1.9 replaces it with prom-client). Use Metrics from src/core/types.ts.
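A minimal sketch of what such a trace-logging shim could look like. The Metrics shape below (inc and observe) and the TraceLogger surface are assumptions made for illustration; the real interface lives in src/core/types.ts and may differ, and createTraceMetrics is a hypothetical name.

```typescript
// Hypothetical Metrics shape; the real interface is in src/core/types.ts.
interface Metrics {
  inc(name: string, labels?: Record<string, string>, value?: number): void;
  observe(name: string, value: number, labels?: Record<string, string>): void;
}

// Minimal logger surface the shim needs (assumed: fields first, message second).
interface TraceLogger {
  trace(fields: Record<string, unknown>, msg: string): void;
}

// Emits one trace log line per metric call instead of recording anything.
function createTraceMetrics(logger: TraceLogger): Metrics {
  return {
    inc(name: string, labels: Record<string, string> = {}, value: number = 1): void {
      logger.trace({ metric: name, labels, value }, "metric inc");
    },
    observe(name: string, value: number, labels: Record<string, string> = {}): void {
      logger.trace({ metric: name, labels, value }, "metric observe");
    },
  };
}
```

Because the shim satisfies the same interface the writer and consumer receive, task 1.9 should be able to swap in a prom-client-backed implementation without touching the call sites.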
Specification
State update happens before write — by design
The sink updates state first, then writes. If the write fails:
- The state update has already happened.
- The record is not ACKed, so it stays pending.
- On re-delivery (same instance retries, or another instance claims), the record will be processed again.
- state.update is effectively idempotent for a given position: applying the same record twice produces the same last_position. Only position_count_session is double-counted, and that is a session counter that resets on restart anyway, so it's a non-issue.
If we wrote first and updated state second, a successful write followed by a state-update crash would leave Postgres ahead of state — but state is hot-path, so that's worse. The chosen order keeps state consistent with what's been seen, even if not yet persisted.
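The re-delivery behaviour described in this section can be exercised with toy stand-ins. This is a sketch under stated assumptions, not the real implementation: the type shapes are simplified, and createToyState, createToyWriter, and createSink are hypothetical helpers that only mimic the contracts of the device-state store and the writer.

```typescript
// Simplified stand-ins for the real types in src/core/types.ts (assumed shapes).
interface Position { device_id: string; ts: number }
interface ConsumedRecord { id: string; position: Position }
interface WriteResult { id: string; status: "inserted" | "duplicate" | "failed" }
interface DeviceState { last_position: Position; position_count_session: number }

// Toy state store: remembers the latest position and a per-session counter.
function createToyState() {
  const devices = new Map<string, DeviceState>();
  return {
    update(p: Position): void {
      const seen = devices.get(p.device_id)?.position_count_session ?? 0;
      devices.set(p.device_id, { last_position: p, position_count_session: seen + 1 });
    },
    get(deviceId: string): DeviceState | undefined {
      return devices.get(deviceId);
    },
  };
}

// Toy writer: ids in failOnce fail on their first attempt only; a repeated
// (device_id, ts) pair is reported as a duplicate, mimicking a unique key.
function createToyWriter(failOnce: Set<string>) {
  const rows = new Set<string>();
  return {
    async write(records: ConsumedRecord[]): Promise<WriteResult[]> {
      return records.map((r): WriteResult => {
        if (failOnce.delete(r.id)) return { id: r.id, status: "failed" };
        const key = `${r.position.device_id}|${r.position.ts}`;
        if (rows.has(key)) return { id: r.id, status: "duplicate" };
        rows.add(key);
        return { id: r.id, status: "inserted" };
      });
    },
  };
}

// Same order as the sink in the deliverables: state first, write second,
// then return only the IDs that are safe to ACK.
function createSink(
  state: { update(p: Position): void },
  writer: { write(records: ConsumedRecord[]): Promise<WriteResult[]> },
) {
  return async function sink(records: ConsumedRecord[]): Promise<string[]> {
    for (const r of records) state.update(r.position);
    const results = await writer.write(records);
    return results.filter((r) => r.status !== "failed").map((r) => r.id);
  };
}
```

With a writer configured to fail one record on its first attempt, the first sink call returns every ID except the failed one. Re-delivering that record then returns its ID, leaves last_position unchanged, and increments only position_count_session.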
What the sink does NOT do
- No business logic. No "is this a finish-line crossing" detection. That's Phase 2's domain.
- No multi-stream fanout. No publishing to derived streams (e.g. for the SPA). The Phase 1 model is: positions go into Postgres, Directus reads them and pushes via WebSocket. If that fanout proves insufficient at the SPA layer, Phase 4 considers a dedicated WebSocket gateway reading from Redis directly.
Graceful shutdown — Phase 1 stub vs. Phase 3 final
Phase 1 stub is enough to not lose data in the common case:
- Catch SIGTERM/SIGINT.
- consumer.stop(): exits the read loop after the current batch.
- Await any in-flight writer.write().
- redis.quit() and pool.end().
- process.exit(0).
- Force-exit timer at 15s as a backstop.
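The stub sequence above could be sketched as follows. Everything here is illustrative: ShutdownDeps, pendingWrites, and createShutdown are hypothetical names, and exit is injected (process.exit in production, a spy in tests) so the sequence can be exercised without killing the process.

```typescript
// Minimal surfaces assumed for this sketch; the real objects come from earlier tasks.
interface ShutdownDeps {
  consumer: { stop(): Promise<void> };
  pendingWrites: () => Promise<void>; // hypothetical: resolves once in-flight writes settle
  redis: { quit(): Promise<unknown> };
  pool: { end(): Promise<void> };
  log: (msg: string) => void;
  exit: (code: number) => void;
  forceExitMs?: number; // defaults to the 15s backstop
}

function createShutdown(deps: ShutdownDeps): (signal: string) => Promise<void> {
  let started = false;
  return async function shutdown(signal: string): Promise<void> {
    if (started) return; // ignore repeated signals
    started = true;
    deps.log(`received ${signal}, shutting down`);

    // Backstop: exit non-zero if the orderly path hangs.
    const timer = setTimeout(() => {
      deps.log("force-exit timer fired");
      deps.exit(1);
    }, deps.forceExitMs ?? 15000);

    try {
      await deps.consumer.stop(); // leave the read loop after the current batch
      await deps.pendingWrites(); // let in-flight writes finish
      await deps.redis.quit();
      await deps.pool.end();
      clearTimeout(timer);
      deps.exit(0);
    } catch (err) {
      clearTimeout(timer);
      deps.log(`shutdown failed: ${String(err)}`);
      deps.exit(1);
    }
  };
}
```

In src/main.ts the returned function would be registered for both SIGTERM and SIGINT via process.on. Clearing the timer on the orderly path is what keeps an idle shutdown free of the force-exit warning.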
What Phase 1 does NOT do (deferred to Phase 3):
- Explicit consumer-group offset commit on SIGTERM (the current model relies on XACK after each successful write, which is already the right thing, but Phase 3 documents and tests this rigorously).
- Uncaught exception / unhandled rejection handlers that flush state to logs before crashing.
- Multi-instance coordination on shutdown (drain mode).
Logger shape
Match tcp-ingestion's convention:
- info for lifecycle: processor starting, Postgres connected, Redis connected, migrations applied, consumer started on stream X group Y consumer Z, processor ready.
- debug for per-batch: batch consumed n=42, batch written inserted=40 duplicates=2 failed=0.
- warn / error for the obvious.
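As an illustration of the per-batch debug line, the counts can be derived directly from the writer's results. This is a sketch only: summarizeBatch and formatBatchWritten are hypothetical helpers, and the WriteResult shape is assumed from the statuses the sink filters on.

```typescript
type WriteStatus = "inserted" | "duplicate" | "failed";
interface WriteResult { id: string; status: WriteStatus }

// Count results per status so one debug line can summarise a batch.
function summarizeBatch(results: WriteResult[]): Record<WriteStatus, number> {
  const counts: Record<WriteStatus, number> = { inserted: 0, duplicate: 0, failed: 0 };
  for (const r of results) counts[r.status] += 1;
  return counts;
}

// Render the message in the "batch written ..." shape used in the list above.
function formatBatchWritten(results: WriteResult[]): string {
  const c = summarizeBatch(results);
  return `batch written inserted=${c.inserted} duplicates=${c.duplicate} failed=${c.failed}`;
}
```

If the logger is structured, the counts could instead be passed as fields alongside a fixed message, which may be easier to query later.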
After this task lands you should be able to run pnpm dev against a local Redis + Postgres, publish a synthetic Position to telemetry:t, and watch a row appear in positions while seeing the lifecycle logs above.
Acceptance criteria
- pnpm typecheck, pnpm lint, pnpm test clean.
- pnpm dev (with local Redis + Postgres reachable) shows the lifecycle log sequence and processor ready.
- Manually publishing a Position to telemetry:t results in a row in positions within seconds.
- SIGTERM during idle exits cleanly (no error, no force-exit warning).
- SIGTERM with in-flight writes waits for them to complete before exiting.
Risks / open questions
- metrics placeholder is intentional. Don't try to wire prom-client here; that's task 1.9. Use the trace-logging shim from tcp-ingestion's pre-1.10 main.ts as the model.
- Migration during deploy. Phase 1 runs migrations on every startup. With multiple instances, two starting at once both try to migrate; Postgres advisory locks would solve this. Defer to Phase 3 (it's a Production hardening concern); for the pilot with one instance, this is fine. Document the limitation.
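For reference when Phase 3 picks this up, an advisory-lock guard around migrations might look roughly like the sketch below. It is not part of this task. SqlClient, MIGRATION_LOCK_KEY, and migrateWithLock are hypothetical names, and the key value is arbitrary; pg_advisory_lock and pg_advisory_unlock are the standard Postgres session-level advisory lock functions.

```typescript
// Minimal query surface; a dedicated client connection is assumed to match this shape.
interface SqlClient {
  query(text: string, values?: unknown[]): Promise<unknown>;
}

// Arbitrary application-chosen lock key (hypothetical value).
const MIGRATION_LOCK_KEY = 727001;

// Serialises migrations across instances. The lock is session-level, so lock,
// migrate, and unlock must all run on the same connection, not a pooled query.
async function migrateWithLock(
  client: SqlClient,
  migrate: () => Promise<void>,
): Promise<void> {
  // Blocks until no other session holds the same key.
  await client.query("SELECT pg_advisory_lock($1)", [MIGRATION_LOCK_KEY]);
  try {
    await migrate();
  } finally {
    // Always release, even if a migration throws.
    await client.query("SELECT pg_advisory_unlock($1)", [MIGRATION_LOCK_KEY]);
  }
}
```

A second instance starting at the same moment would wait on the lock, then run migrate() against an already-migrated schema, which is harmless as long as the migrations themselves are idempotent.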
Done
(Fill in once complete: commit SHA, brief notes.)