Task 1.8 — Main wiring & ACK semantics

Phase: 1 — Throughput pipeline
Status: 🟩 Done
Depends on: 1.5, 1.6, 1.7
Wiki refs: docs/wiki/entities/processor.md

Goal

Assemble the throughput pipeline in src/main.ts: connect Redis + Postgres → run migrations → build the device-state store → build the writer → build the consumer with a sink that calls state.update() then writer.write() → start. Establish the rule for what to ACK and when.

Deliverables

  • src/main.ts updated to (a wiring sketch follows this Deliverables list):

    1. loadConfig() (from task 1.3).
    2. createLogger() (from task 1.3).
    3. createPool(config.POSTGRES_URL) and connectWithRetry() (from task 1.4).
    4. Run migrations via migrate() (from task 1.4) before any consumer activity.
    5. Connect Redis with connectRedis(...) (re-implement the tcp-ingestion retry pattern; small enough to copy).
    6. Build state = createDeviceStateStore(config, logger).
    7. Build writer = createWriter(pool, config, logger, metrics).
    8. Build consumer = createConsumer(redis, config, logger, metrics, sink) where sink is the function defined below.
    9. await consumer.start().
    10. Install graceful shutdown stub (full Phase 3 hardening later): on SIGTERM/SIGINT, call consumer.stop(), await pending writes, close Redis + Pool, exit.
  • src/main.ts defines the sink function (the central decision point):

    async function sink(records: ConsumedRecord[]): Promise<string[]> {
      // 1. Update in-memory state for every record (cheap, synchronous, can't fail meaningfully)
      for (const r of records) state.update(r.position);
    
      // 2. Write to Postgres
      const results = await writer.write(records);
    
      // 3. ACK only the IDs that succeeded or were duplicates
      return results
        .filter(r => r.status === 'inserted' || r.status === 'duplicate')
        .map(r => r.id);
    }
    
  • A placeholder metrics shim — the same trace-logging stub as tcp-ingestion originally had (task 1.9 replaces it with prom-client). Use Metrics from src/core/types.ts.
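
A minimal sketch of how these pieces fit together, not a definitive implementation: the factory names come from the steps above, but the module paths, the exact signatures, the Metrics shape, and the assumption that consumer.start() resolves once the read loop is running are all guesses to be checked against tasks 1.3 through 1.7.

  // src/main.ts wiring sketch; module paths and exact signatures are assumptions.
  import { loadConfig } from './config/load';
  import { createLogger } from './core/logger';                     // task 1.3 (path assumed)
  import { createPool, connectWithRetry, migrate } from './db';     // task 1.4 (path assumed)
  import { connectRedis } from './redis';                           // retry pattern copied from tcp-ingestion
  import { createDeviceStateStore } from './state';                 // task 1.6 (path assumed)
  import { createWriter } from './writer';                          // task 1.5 (path assumed)
  import { createConsumer, type ConsumedRecord } from './consumer'; // task 1.7 (path assumed)
  import type { Metrics } from './core/types';

  async function main(): Promise<void> {
    const config = loadConfig();
    const logger = createLogger(config);
    logger.info('processor starting');

    // Placeholder metrics shim (task 1.9 replaces it with prom-client).
    // The Metrics shape and the pino-style logger.trace call are assumptions.
    const metrics: Metrics = {
      increment: (name: string, value = 1) => logger.trace({ name, value }, 'metric'),
      observe: (name: string, value: number) => logger.trace({ name, value }, 'metric'),
    };

    const pool = createPool(config.POSTGRES_URL);
    await connectWithRetry(pool, logger);
    logger.info('Postgres connected');

    await migrate(pool, logger); // must complete before any consumer activity
    logger.info('migrations applied');

    const redis = await connectRedis(config, logger);
    logger.info('Redis connected');

    const state = createDeviceStateStore(config, logger);
    const writer = createWriter(pool, config, logger, metrics);

    // The sink from the deliverable above: update state, write, ACK only successes/duplicates.
    async function sink(records: ConsumedRecord[]): Promise<string[]> {
      for (const r of records) state.update(r.position);
      const results = await writer.write(records);
      return results
        .filter((r) => r.status === 'inserted' || r.status === 'duplicate')
        .map((r) => r.id);
    }

    const consumer = createConsumer(redis, config, logger, metrics, sink);

    await consumer.start(); // assumed to resolve once the read loop is running
    logger.info('processor ready');

    // Graceful-shutdown stub installed here; see the sketch under "Graceful shutdown" in the Specification.
  }

  main().catch((err) => {
    console.error('processor failed to start', err);
    process.exit(1);
  });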

Specification

State update happens before write — by design

The sink updates state first, then writes. If the write fails:

  • The state update has already happened.
  • The record is not ACKed, so it stays pending.
  • On re-delivery (same instance retries, or another instance claims), the record will be processed again.
  • state.update is idempotent for a given position (same record applied twice produces the same last_position, only position_count_session is double-counted — and that's a session counter that resets on restart anyway, so it's a non-issue).

If we wrote first and updated state second, a successful write followed by a state-update crash would leave Postgres ahead of the in-memory state — and state serves the hot path, so leaving it stale is the worse outcome. The chosen order keeps state consistent with what's been seen, even if not yet persisted.
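
To make the idempotency claim concrete, here is a toy version of the update path. This is not the task 1.6 implementation; the Position fields and the state shape are invented for illustration.

  // Illustration only; the real store is built in task 1.6.
  type Position = { deviceId: string; lat: number; lon: number; recordedAt: string };

  interface DeviceState {
    lastPosition: Position;
    positionCountSession: number; // resets on restart, so double-counting is harmless
  }

  const byDevice = new Map<string, DeviceState>();

  function update(position: Position): void {
    const entry = byDevice.get(position.deviceId) ?? { lastPosition: position, positionCountSession: 0 };
    entry.lastPosition = position;   // overwrite: applying the same record twice yields the same last_position
    entry.positionCountSession += 1; // the only field that drifts on re-delivery
    byDevice.set(position.deviceId, entry);
  }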

What the sink does NOT do

  • No business logic. No "is this a finish-line crossing" detection. That's Phase 2's domain.
  • No multi-stream fanout. No publishing to derived streams (e.g. for the SPA). The Phase 1 model is: positions go into Postgres, Directus reads them and pushes via WebSocket. If that fanout proves insufficient at the SPA layer, Phase 4 considers a dedicated WebSocket gateway reading from Redis directly.

Graceful shutdown — Phase 1 stub vs. Phase 3 final

The Phase 1 stub is enough to avoid losing data in the common case (a sketch follows the list):

  1. Catch SIGTERM/SIGINT.
  2. consumer.stop() — exits the read loop after the current batch.
  3. Await any in-flight writer.write().
  4. redis.quit() and pool.end().
  5. process.exit(0).
  6. Force-exit timer at 15s as a backstop.
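
A sketch of that stub, assuming consumer.stop() resolves only after the current batch (and its write) has finished, so awaiting it covers step 3. The dependency shapes below are minimal stand-ins for the objects built in main().

  // Sketch only; Phase 3 hardens this. deps are the objects built in main().
  import type { Pool } from 'pg';

  interface ShutdownDeps {
    logger: { info: (obj: object, msg: string) => void };
    consumer: { stop: () => Promise<void> };
    redis: { quit: () => Promise<unknown> };
    pool: Pool;
  }

  export function installShutdownStub({ logger, consumer, redis, pool }: ShutdownDeps): void {
    let shuttingDown = false;

    const shutdown = async (signal: NodeJS.Signals): Promise<void> => {
      if (shuttingDown) return; // ignore repeated signals
      shuttingDown = true;
      logger.info({ signal }, 'processor shutting down');

      // Step 6: force-exit backstop in case draining hangs.
      const forceExit = setTimeout(() => process.exit(1), 15_000);
      forceExit.unref();

      await consumer.stop(); // steps 2-3: the read loop exits after the current batch, in-flight write included
      await redis.quit();    // step 4
      await pool.end();
      process.exit(0);       // step 5
    };

    process.on('SIGTERM', (signal) => void shutdown(signal));
    process.on('SIGINT', (signal) => void shutdown(signal));
  }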

What Phase 1 does NOT do (deferred to Phase 3):

  • Explicit consumer-group offset commit on SIGTERM (the current model relies on XACK after each successful write, which is already the right thing — but Phase 3 documents and tests this rigorously).
  • Uncaught exception / unhandled rejection handlers that flush state to logs before crashing.
  • Multi-instance coordination on shutdown (drain mode).

Logger shape

Match tcp-ingestion's convention (illustrated after the list):

  • info for lifecycle: processor starting, Postgres connected, Redis connected, migrations applied, consumer started on stream X group Y consumer Z, processor ready.
  • debug for per-batch: batch consumed n=42, batch written inserted=40 duplicates=2 failed=0.
  • warn / error for the obvious.
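
For instance (the structured-call shape is assumed to match whatever createLogger from task 1.3 returns; field names are illustrative):

  logger.info({ stream, group, consumer: consumerName }, 'consumer started');
  logger.debug({ n: 42 }, 'batch consumed');
  logger.debug({ inserted: 40, duplicates: 2, failed: 0 }, 'batch written');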

After this task lands you should be able to run pnpm dev against a local Redis + Postgres, publish a synthetic Position to telemetry:teltonika, and watch a row appear in positions while seeing the lifecycle logs above.
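
One way to publish that synthetic Position with node-redis. The script name, the stream field name (payload), and the Position fields are placeholders; match whatever tcp-ingestion actually writes to the stream.

  // scripts/publish-test-position.ts (hypothetical dev-only helper)
  import { createClient } from 'redis';

  const redis = createClient({ url: process.env.REDIS_URL ?? 'redis://localhost:6379' });
  await redis.connect();

  // Placeholder Position; use the real field names from the shared Position type.
  const position = {
    deviceId: 'test-0001',
    lat: 52.37,
    lon: 4.9,
    recordedAt: new Date().toISOString(),
  };

  const id = await redis.xAdd('telemetry:teltonika', '*', { payload: JSON.stringify(position) });
  console.log('published entry', id);

  await redis.quit();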

Acceptance criteria

  • pnpm typecheck, pnpm lint, pnpm test clean.
  • pnpm dev (with local Redis + Postgres reachable) shows the lifecycle log sequence and processor ready.
  • Manually publishing a Position to telemetry:teltonika results in a row in positions within seconds.
  • SIGTERM during idle exits cleanly (no error, no force-exit warning).
  • SIGTERM with in-flight writes waits for them to complete before exiting.

Risks / open questions

  • metrics placeholder is intentional. Don't try to wire prom-client here; that's task 1.9. Use the trace-logging shim from tcp-ingestion's pre-1.10 main.ts as the model.
  • Migration during deploy. Phase 1 runs migrations on every startup. With multiple instances, two starting at once would both try to migrate — Postgres advisory locks would solve this (a sketch follows). Defer to Phase 3 (it's a production-hardening concern); for the pilot with one instance, this is fine. Document the limitation.
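
For when Phase 3 picks this up, a sketch of the advisory-lock guard. The lock key is an arbitrary fixed integer, and migrate() is the task 1.4 helper with an assumed signature.

  // Sketch only; deferred to Phase 3. Serialises migrations across instances.
  import { Pool } from 'pg';
  import { migrate } from './db'; // path and signature assumed (task 1.4)

  const MIGRATION_LOCK_KEY = 723551; // any fixed app-wide integer

  async function migrateWithLock(pool: Pool): Promise<void> {
    const client = await pool.connect();
    try {
      // Session-level lock: blocks until no other instance is migrating.
      await client.query('SELECT pg_advisory_lock($1)', [MIGRATION_LOCK_KEY]);
      await migrate(pool);
    } finally {
      await client.query('SELECT pg_advisory_unlock($1)', [MIGRATION_LOCK_KEY]);
      client.release();
    }
  }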

Done

src/main.ts — full pipeline wiring: Postgres pool → migrations → Redis → state store → writer → sink → consumer → graceful shutdown stub. Metrics shim uses logger.trace. Sink ordering: state.update before writer.write per spec. Landed in 68d3da3.