Implement Phase 1 tasks 1.5-1.8 (consumer + state + writer + main wiring)

src/core/consumer.ts — XREADGROUP loop with consumer-group resumption,
ensureConsumerGroup (BUSYGROUP-tolerant), decodeBatch (CodecError → log
+ skip + leave pending; never speculative ACK), partial-ACK semantics,
connectRedis (mirroring tcp-ingestion's retry pattern), clean stop.
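A minimal sketch of the BUSYGROUP-tolerant group creation described above; the `RedisLike` interface and all names here are illustrative stand-ins, not the actual module's API:

```typescript
// Minimal stand-in for the real redis client; only the one call we need.
interface RedisLike {
  xGroupCreate(
    stream: string,
    group: string,
    id: string,
    opts?: { MKSTREAM?: boolean },
  ): Promise<unknown>;
}

// XGROUP CREATE fails with a BUSYGROUP error when the group already exists —
// which is exactly the resumption case, so that one error is swallowed and
// everything else is rethrown.
async function ensureConsumerGroup(
  client: RedisLike,
  stream: string,
  group: string,
): Promise<void> {
  try {
    // "0" lets a freshly created group read the stream from the beginning;
    // MKSTREAM creates the stream if it does not exist yet.
    await client.xGroupCreate(stream, group, "0", { MKSTREAM: true });
  } catch (err) {
    if (err instanceof Error && err.message.includes("BUSYGROUP")) {
      return; // group already exists: resume with its pending entries intact
    }
    throw err;
  }
}
```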

src/core/state.ts — LRU Map<device_id, DeviceState> using delete+set
bump trick (no third-party LRU dep); last_seen = max(prev, ts) so
out-of-order replays don't regress the high-water mark; evictedTotal()
counter.
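The delete+set bump relies on Map preserving insertion order, so re-inserting a key moves it to the most-recently-used end. A sketch under assumed names (`DeviceState`, `StateStore`, `maxDevices` are illustrative, not the real module's shape):

```typescript
interface DeviceState {
  lastSeen: number; // high-water mark; never regresses on out-of-order replay
  sessions: number;
}

class StateStore {
  private devices = new Map<string, DeviceState>();
  private evicted = 0;

  constructor(private readonly maxDevices: number) {}

  update(deviceId: string, ts: number): DeviceState {
    const prev = this.devices.get(deviceId);
    const next: DeviceState = prev
      ? { ...prev, lastSeen: Math.max(prev.lastSeen, ts) } // max() guards the mark
      : { lastSeen: ts, sessions: 1 };
    // Map iterates in insertion order, so delete+set moves this key to the
    // "most recently used" end — no third-party LRU dependency needed.
    this.devices.delete(deviceId);
    this.devices.set(deviceId, next);
    if (this.devices.size > this.maxDevices) {
      // First key in iteration order is the least recently touched.
      const lru = this.devices.keys().next().value as string;
      this.devices.delete(lru);
      this.evicted++;
    }
    return next;
  }

  evictedTotal(): number {
    return this.evicted;
  }

  get size(): number {
    return this.devices.size;
  }
}
```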

src/core/writer.ts — multi-row INSERT ON CONFLICT (device_id, ts) DO
NOTHING with RETURNING. Duplicate detection by set-difference between
input and RETURNING rows (xmax=0 doesn't work for skipped-conflict
rows, only returned ones — confirmed in the task spec's own Note).
Sequential chunking to WRITE_BATCH_SIZE; bigint→string and Buffer→base64
attribute serialization that handles Buffer.toJSON shape.
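With `ON CONFLICT ... DO NOTHING`, conflicting rows are simply absent from `RETURNING`, so duplicates fall out as input keys minus returned keys. A hedged sketch of that diff plus the attribute serialization; row shape, key format, and function names are assumptions, not the writer's real interface:

```typescript
interface PositionRow {
  deviceId: string;
  ts: number; // epoch ms
  attributes: Record<string, unknown>;
}

// Composite key for the (device_id, ts) set-difference; NUL separator is an
// illustrative choice to avoid collisions between id/ts concatenations.
const rowKey = (deviceId: string, ts: number) => `${deviceId}\u0000${ts}`;

function findDuplicates(
  input: PositionRow[],
  returned: Array<{ deviceId: string; ts: number }>,
): PositionRow[] {
  const inserted = new Set(returned.map((r) => rowKey(r.deviceId, r.ts)));
  return input.filter((r) => !inserted.has(rowKey(r.deviceId, r.ts)));
}

// JSON.stringify throws on bigint and turns Buffer into its toJSON shape
// ({ type: "Buffer", data: [...] }) before the replacer runs; normalize both.
function serializeAttributes(attrs: Record<string, unknown>): string {
  return JSON.stringify(attrs, (_key, value) => {
    if (typeof value === "bigint") return value.toString();
    if (
      value !== null &&
      typeof value === "object" &&
      value.type === "Buffer" &&
      Array.isArray(value.data)
    ) {
      return Buffer.from(value.data).toString("base64");
    }
    return value;
  });
}
```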

src/main.ts — full pipeline: pool → migrate → redis → state → writer →
sink → consumer → graceful-shutdown stub. Sink ordering is
state.update BEFORE writer.write per spec rationale (state stays
consistent with what's been seen even if not yet persisted; redelivery
is idempotent on state). Metrics is still the trace-logging shim from
tcp-ingestion's pre-1.10 pattern; real prom-client lands in task 1.9.
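The sink ordering rationale can be sketched as below; `Decoded`, `StateLike`, and `WriterLike` are hypothetical stand-ins for the real module shapes:

```typescript
interface Decoded {
  deviceId: string;
  ts: number;
}
interface StateLike {
  update(deviceId: string, ts: number): void;
}
interface WriterLike {
  write(batch: Decoded[]): Promise<void>;
}

// state.update runs BEFORE writer.write: if the write fails, the batch stays
// pending and is redelivered, and re-applying update() is idempotent
// (last_seen = max(prev, ts)), so state never lags behind what was seen.
function makeSink(state: StateLike, writer: WriterLike) {
  return async (batch: Decoded[]): Promise<void> => {
    for (const msg of batch) state.update(msg.deviceId, msg.ts);
    await writer.write(batch); // throws → no ACK → redelivery
  };
}
```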

Verification: typecheck, lint clean; 112 unit tests passing across 7
test files (+39 from this batch).
2026-04-30 21:47:43 +02:00
parent 6a14eb1d01
commit 2a50aaf175
12 changed files with 2218 additions and 15 deletions
@@ -1,7 +1,7 @@
# Task 1.5 — Redis Stream consumer (XREADGROUP)
**Phase:** 1 — Throughput pipeline
**Status:** ⬜ Not started
**Status:** 🟩 Done
**Depends on:** 1.2, 1.3
**Wiki refs:** `docs/wiki/entities/redis-streams.md`, `docs/wiki/entities/processor.md`
@@ -90,4 +90,4 @@ On `start()`:
## Done
(Fill in once complete: commit SHA, brief notes.)
`src/core/consumer.ts` — XREADGROUP loop with `ensureConsumerGroup`, `decodeBatch`, partial-ACK semantics, `connectRedis` (co-located, not in `src/db/`), and clean stop. `test/consumer.test.ts` — 11 tests covering happy path, partial ACK, BUSYGROUP swallow, decode error skip, missing payload skip, XREADGROUP backoff, clean stop. *(pending commit SHA)*
@@ -1,7 +1,7 @@
# Task 1.6 — Per-device in-memory state
**Phase:** 1 — Throughput pipeline
**Status:** ⬜ Not started
**Status:** 🟩 Done
**Depends on:** 1.2
**Wiki refs:** `docs/wiki/entities/processor.md` (§ State management)
@@ -78,4 +78,4 @@ The interface is built to extend: Phase 2 may add fields, but the existing field
## Done
(Fill in once complete: commit SHA, brief notes.)
`src/core/state.ts` — LRU Map using delete+set bump trick, `last_seen = max(prev, position.timestamp)` semantics, `evictedTotal()` counter. `test/state.test.ts` — 14 tests covering new-device creation, session counter increment, LRU eviction at cap, LRU re-touch, evictedTotal, out-of-order timestamp rejection, get/size. *(pending commit SHA)*
@@ -1,7 +1,7 @@
# Task 1.7 — Position writer (batched upsert)
**Phase:** 1 — Throughput pipeline
**Status:** ⬜ Not started
**Status:** 🟩 Done
**Depends on:** 1.2, 1.4
**Wiki refs:** `docs/wiki/entities/postgres-timescaledb.md`
@@ -91,4 +91,6 @@ If a transaction-wide failure occurs (Pool dead, transient network), all records
## Done
(Fill in once complete: commit SHA, brief notes.)
`src/core/writer.ts` — multi-row INSERT with RETURNING, duplicate detection by (device_id, ts) set diff, sequential chunking, bigint/Buffer attribute serialization (handles Buffer.toJSON shape). `test/writer.test.ts` — 14 tests covering happy path, all-duplicate, mixed, pool error, chunk split, Buffer base64, bigint string, parameter ordering, metrics. *(pending commit SHA)*
**Note:** The spec's `RETURNING (xmax = 0) AS inserted` idiom was replaced with a simpler set-difference approach — compare RETURNING rows against input by (device_id, ts). The xmax approach is mentioned in the spec but then immediately qualified: "rows that hit the conflict are NOT returned." The set-diff is cleaner and avoids confusion. The spec's own Note section confirms this is the right approach.
@@ -1,7 +1,7 @@
# Task 1.8 — Main wiring & ACK semantics
**Phase:** 1 — Throughput pipeline
**Status:** ⬜ Not started
**Status:** 🟩 Done
**Depends on:** 1.5, 1.6, 1.7
**Wiki refs:** `docs/wiki/entities/processor.md`
@@ -97,4 +97,4 @@ After this task lands you should be able to run `pnpm dev` against a local Redis
## Done
(Fill in once complete: commit SHA, brief notes.)
`src/main.ts` — full pipeline wiring: Postgres pool → migrations → Redis → state store → writer → sink → consumer → graceful shutdown stub. Metrics shim uses `logger.trace`. Sink ordering: state.update before writer.write per spec. *(pending commit SHA)*