2a50aaf175
src/core/consumer.ts: XREADGROUP loop with consumer-group resumption, ensureConsumerGroup (BUSYGROUP-tolerant), decodeBatch (CodecError → log + skip + leave pending; never speculative ACK), partial-ACK semantics, connectRedis (mirroring tcp-ingestion's retry pattern), clean stop.

src/core/state.ts: LRU Map<device_id, DeviceState> using the delete+set bump trick (no third-party LRU dep); last_seen = max(prev, ts) so out-of-order replays don't regress the high-water mark; evictedTotal() counter.

src/core/writer.ts: multi-row INSERT ON CONFLICT (device_id, ts) DO NOTHING with RETURNING. Duplicate detection by set-difference between input and RETURNING rows (xmax=0 doesn't work for skipped-conflict rows, only returned ones; confirmed in the task spec's own Note). Sequential chunking to WRITE_BATCH_SIZE; bigint→string and Buffer→base64 attribute serialization that handles the Buffer.toJSON shape.

src/main.ts: full pipeline: pool → migrate → redis → state → writer → sink → consumer → graceful-shutdown stub. Sink ordering is state.update BEFORE writer.write per spec rationale (state stays consistent with what's been seen even if not yet persisted; redelivery is idempotent on state).

Metrics is still the trace-logging shim from tcp-ingestion's pre-1.10 pattern; real prom-client lands in task 1.9.

Verification: typecheck and lint clean; 112 unit tests passing across 7 test files (+39 from this batch).
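The set-difference duplicate detection and sequential chunking described for src/core/writer.ts can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `PositionKey`, `keyOf`, `findDuplicates`, and `chunk` are hypothetical names, and `ts` is assumed to arrive as a JavaScript `Date` both in the input and in the RETURNING rows.

```typescript
// Hypothetical row shape: only the conflict-target columns matter here.
interface PositionKey {
  device_id: string;
  ts: Date;
}

// Composite key mirroring the ON CONFLICT (device_id, ts) target.
function keyOf(row: PositionKey): string {
  return `${row.device_id}|${row.ts.getTime()}`;
}

// Rows sent to INSERT that did not come back from RETURNING were
// skipped by DO NOTHING, so they are treated as duplicates.
function findDuplicates<T extends PositionKey>(
  input: T[],
  returned: PositionKey[],
): T[] {
  const inserted = new Set(returned.map(keyOf));
  return input.filter((row) => !inserted.has(keyOf(row)));
}

// Split a batch into sequential chunks of at most `size` rows
// (size would be WRITE_BATCH_SIZE in the writer).
function chunk<T>(rows: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < rows.length; i += size) {
    out.push(rows.slice(i, i + size));
  }
  return out;
}
```

One limitation of this sketch: if the same (device_id, ts) pair appears twice within a single input batch, both copies match the one returned row and neither is reported as a duplicate. A real writer would need to decide how to count that case.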
Phase 1 — Throughput pipeline
Implement a Node.js worker that joins a Redis Streams consumer group, decodes Position records, upserts them into a TimescaleDB hypertable, maintains per-device in-memory state, and ships with the operational baseline (Prometheus metrics, health/readiness endpoints, integration tests, Dockerfile, Gitea CI/CD pipeline).
Outcome statement
When Phase 1 is done:
- The Processor connects to Redis and joins consumer group `processor` on stream `telemetry:t` (configurable). On startup it creates the group with `MKSTREAM` if missing.
- Every `Position` record published by `tcp-ingestion` lands as exactly one row in the `positions` hypertable, with `device_id`, `ts`, GPS fields, and the IO `attributes` bag preserved as `JSONB` (sentinel-decoded: bigint values become `numeric`, Buffer values become `bytea` or `text` per the spec in task 1.2).
- Per-device in-memory state (`last_position`, `last_seen`, `position_count_session`) is updated on every record and bounded by an LRU cap.
- `XACK` is sent only after the Postgres write succeeds. A crashed instance leaves work pending; on its next start it picks up via consumer-group resumption, and any other instance can claim its pending entries (full `XAUTOCLAIM` polish lives in Phase 3, but the basic resumption works in Phase 1).
- `GET /metrics` returns Prometheus exposition format with consumer lag, throughput, write-latency histogram, error counters.
- `GET /healthz` and `GET /readyz` cover liveness and readiness (Redis ready + Postgres ready).
- The service builds reproducibly via a Gitea Actions workflow, publishing a Docker image to the Gitea container registry tagged `:main` (and per-commit SHA tags later if needed).
- An integration test spins up Redis + Postgres via testcontainers, publishes a synthetic `Position` to the input stream, and verifies the resulting row in `positions`. End-to-end byte-level round-trip including bigint and Buffer sentinel reversal.
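The LRU-bounded per-device state from the list above could look something like the sketch below. It assumes a plain `Map` whose insertion order doubles as recency order, with delete-then-set to bump a key. The `Position` fields beyond `device_id` and `ts`, the epoch-millisecond `ts`, the class name `DeviceStateStore`, and the choice to always store the most recently processed record as `last_position` are assumptions made for illustration.

```typescript
// Hypothetical record shape; ts is assumed to be epoch milliseconds.
interface Position {
  device_id: string;
  ts: number;
  lat: number;
  lon: number;
}

interface DeviceState {
  last_position: Position;
  last_seen: number;
  position_count_session: number;
}

class DeviceStateStore {
  private readonly map = new Map<string, DeviceState>();
  private readonly capacity: number;
  private evicted = 0;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  update(p: Position): DeviceState {
    const prev = this.map.get(p.device_id);
    const next: DeviceState = {
      // Assumption: last_position tracks the most recently processed record.
      last_position: p,
      // High-water mark: an out-of-order replay never moves last_seen backwards.
      last_seen: prev ? Math.max(prev.last_seen, p.ts) : p.ts,
      position_count_session: (prev?.position_count_session ?? 0) + 1,
    };
    // delete + set moves the key to the end of the Map's insertion order,
    // so the first key is always the least recently updated device.
    this.map.delete(p.device_id);
    this.map.set(p.device_id, next);
    if (this.map.size > this.capacity) {
      const oldest = this.map.keys().next().value;
      if (oldest !== undefined) {
        this.map.delete(oldest);
        this.evicted += 1;
      }
    }
    return next;
  }

  get(deviceId: string): DeviceState | undefined {
    return this.map.get(deviceId);
  }

  evictedTotal(): number {
    return this.evicted;
  }
}
```

Because redelivered records only bump the counter and never lower `last_seen`, replaying a pending entry after a crash leaves the high-water mark intact.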
Sequencing
1.1 Project scaffold
 ├─→ 1.2 Core types & contracts
 │    ├─→ 1.3 Configuration & logging
 │    ├─→ 1.4 Postgres connection & positions hypertable
 │    │    └─→ 1.7 Position writer (batched upsert)
 │    └─→ 1.5 Redis Stream consumer
 │         ├─→ 1.6 Per-device in-memory state
 │         └─→ 1.8 Main wiring & ACK semantics (depends on 1.5, 1.6, 1.7)
 │              └─→ 1.9 Observability
 │                   └─→ 1.10 Integration test
 │                        └─→ 1.11 Dockerfile & CI
Tasks 1.5/1.6/1.7 can be developed in parallel after 1.4 lands. Task 1.10 (integration test) should land before 1.11 because the Dockerfile depends on knowing what pnpm test and pnpm test:integration will do.
Files modified
Phase 1 produces this layout in processor/:
processor/
├── .gitea/workflows/build.yml
├── src/
│   ├── core/
│   │   ├── types.ts                  # Position, DeviceState, Metrics
│   │   ├── consumer.ts               # XREADGROUP loop + claim handler
│   │   ├── writer.ts                 # Postgres batched upsert
│   │   ├── state.ts                  # in-memory device state with LRU
│   │   └── codec.ts                  # sentinel decode (__bigint, __buffer_b64)
│   ├── db/
│   │   ├── pool.ts                   # pg.Pool factory
│   │   └── migrations/
│   │       └── 0001_positions.sql    # hypertable creation
│   ├── config/load.ts                # zod schema for env
│   ├── observability/
│   │   ├── logger.ts                 # pino root logger
│   │   └── metrics.ts                # prom-client + HTTP server
│   └── main.ts
├── test/
│   ├── codec.test.ts
│   ├── state.test.ts
│   ├── consumer.test.ts              # mocked Redis behaviour
│   ├── writer.test.ts                # mocked pg behaviour
│   └── pipeline.integration.test.ts  # testcontainers Redis + Postgres
├── Dockerfile
├── compose.dev.yaml
├── package.json
├── pnpm-lock.yaml
├── tsconfig.json
├── vitest.config.ts
├── vitest.integration.config.ts
├── .dockerignore
├── .gitignore
├── .prettierrc
├── eslint.config.js
└── README.md
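As a rough idea of what the sentinel decode in `codec.ts` might involve, the sketch below walks a parsed JSON value and reverses the two sentinels named in the layout. The exact wire shape is an assumption: it treats `{ "__bigint": "<decimal string>" }` and `{ "__buffer_b64": "<base64 string>" }` as single-key wrapper objects, which the task spec may define differently. The function name `decodeSentinels` is hypothetical.

```typescript
type Json = null | boolean | number | string | Json[] | { [k: string]: Json };
type Decoded =
  | null
  | boolean
  | number
  | string
  | bigint
  | Buffer
  | Decoded[]
  | { [k: string]: Decoded };

class CodecError extends Error {}

// Recursively replace sentinel wrappers with native bigint / Buffer values.
// Assumed sentinel shape: an object whose only key is the sentinel name.
function decodeSentinels(value: Json): Decoded {
  if (Array.isArray(value)) return value.map((item) => decodeSentinels(item));
  if (value === null || typeof value !== "object") return value;

  const keys = Object.keys(value);
  if (keys.length === 1 && keys[0] === "__bigint") {
    const raw = value["__bigint"];
    if (typeof raw !== "string" || !/^-?\d+$/.test(raw)) {
      throw new CodecError("malformed __bigint sentinel");
    }
    return BigInt(raw);
  }
  if (keys.length === 1 && keys[0] === "__buffer_b64") {
    const raw = value["__buffer_b64"];
    if (typeof raw !== "string") {
      throw new CodecError("malformed __buffer_b64 sentinel");
    }
    return Buffer.from(raw, "base64");
  }

  // Ordinary object: decode each property in turn.
  const out: { [k: string]: Decoded } = {};
  for (const [k, v] of Object.entries(value)) {
    out[k] = decodeSentinels(v);
  }
  return out;
}
```

Throwing a dedicated `CodecError` lets the consumer distinguish a malformed record (log, skip, leave pending) from an infrastructure failure.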
Tech stack (decided)
- Node.js 22 LTS, ESM-only.
- TypeScript 5.x with `strict: true`, `noUncheckedIndexedAccess: true`.
- pnpm for dependency management.
- vitest for tests (unit + integration split, same pattern as `tcp-ingestion`).
- pino for structured logging (ISO timestamps, string level labels; same config as `tcp-ingestion`).
- prom-client for Prometheus metrics.
- ioredis for Redis Streams (XREADGROUP, XACK, XAUTOCLAIM).
- pg (`pg` package, not `postgres.js`) for Postgres: battle-tested, simple Pool API.
- zod for environment-variable validation.
- testcontainers for integration tests (Redis 7 + TimescaleDB).
If an implementer wants to deviate, they must update the relevant task file first.
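To make the Redis Streams side of the stack concrete, here is a minimal sketch of BUSYGROUP-tolerant group creation and ACK-after-write, the two behaviours the outcome statement relies on. It is written against a small hypothetical `StreamClient` interface that mirrors the ioredis-style `xgroup` and `xack` calls rather than the real client type. The `"$"` start ID, the `Entry` shape, and the function names are assumptions.

```typescript
// Hypothetical slice of an ioredis-style client; the real client type is richer.
interface StreamClient {
  xgroup(...args: string[]): Promise<unknown>;
  xack(stream: string, group: string, ...ids: string[]): Promise<number>;
}

interface Entry<T> {
  id: string; // stream entry ID, e.g. "1700000000000-0"
  record: T;
}

// Create the group (and the stream, via MKSTREAM) if missing.
// Redis answers with a BUSYGROUP error when the group already exists,
// which is treated as success here.
async function ensureConsumerGroup(
  redis: StreamClient,
  stream: string,
  group: string,
): Promise<void> {
  try {
    // Assumption: "$" (only new entries) as the initial position.
    await redis.xgroup("CREATE", stream, group, "$", "MKSTREAM");
  } catch (err) {
    if (err instanceof Error && err.message.includes("BUSYGROUP")) return;
    throw err;
  }
}

// Write first, ACK second. If write() rejects, no XACK is sent and the
// entries stay in the group's pending list for later redelivery.
async function processBatch<T>(
  redis: StreamClient,
  stream: string,
  group: string,
  entries: Entry<T>[],
  write: (records: T[]) => Promise<void>,
): Promise<number> {
  if (entries.length === 0) return 0;
  await write(entries.map((e) => e.record));
  return redis.xack(stream, group, ...entries.map((e) => e.id));
}
```

Keeping the ACK strictly after the awaited write is what makes a crash between the two steps safe: the worst case is a redelivered record, which the `ON CONFLICT DO NOTHING` insert absorbs.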
Key design decisions inherited from tcp-ingestion
- ESLint `import/no-restricted-paths`: `src/core/` cannot import from `src/domain/` (the boundary that protects Phase 1 from Phase 2 churn). `src/db/` is shared.
- Logger config: `pino.stdTimeFunctions.isoTime` + level-as-string formatter. Lifecycle events at `info`; high-frequency per-record events at `debug` or `trace`.
- Slim Dockerfile: multi-stage with BuildKit cache mounts, `pnpm fetch` + `pnpm install --offline` in the build stage, `pnpm prune --prod` for runtime.
- CI workflow: single-job pattern matching `tcp-ingestion/.gitea/workflows/build.yml`. No `services:` block; no separate test container.