tcp-ingestion/.planning/phase-1-telemetry/08-redis-publisher.md
julian c33c7a4f6b Implement Phase 1 task 1.8 (Redis Streams publisher + main wiring)
- Bounded in-memory queue (default 10000); overflow throws PublishOverflowError
  so the framing layer skips ACK and the device retransmits.
- Background worker drains via XADD with MAXLEN ~ approximate trimming.
- JSON serialization with sentinel encoding for bigint/Buffer/Date; correctly
  handles Buffer.prototype.toJSON firing before the replacer.
- AdapterContext.publish(position, codec) with codec-label closure at dispatch
  in adapters/teltonika/index.ts; zero changes to the three codec parsers.
- connectRedis with retry-on-startup; main.ts wires the full pipeline.
- installGracefulShutdown stubbed (full hardening in task 1.12).
- 19 new tests (17 unit + 2 Docker-conditional integration). Total 81 passing.
2026-04-30 16:39:34 +02:00


Task 1.8 — Redis Streams publisher & main wiring

Phase: 1 — Inbound telemetry
Status: 🟩 Done
Depends on: 1.2, 1.3, 1.4, 1.5, 1.6, 1.7
Wiki refs: docs/wiki/entities/redis-streams.md, docs/wiki/concepts/position-record.md

Goal

Implement the real publishPosition that writes Position records to a Redis Stream, then wire the entire Phase 1 pipeline together in src/main.ts.

Deliverables

  • src/core/publish.ts (replacing the stub from task 1.2):
    • createPublisher(redis: Redis, config: Config, logger: Logger, metrics: Metrics): Publisher factory.
    • Publisher.publish(p: Position): Promise<void> that serializes and XADDs.
    • Internal serialization helper serializePosition(p: Position): Record<string, string> returning the field-value pairs Redis expects.
  • src/main.ts updated to:
    1. Load config (task 1.3).
    2. Build logger and metrics (tasks 1.3, 1.10).
    3. Connect to Redis with retry-on-startup logic.
    4. Build the publisher.
    5. Build the Teltonika adapter and register codec handlers.
    6. Start the TCP server.
    7. Start the metrics HTTP server (task 1.10).
    8. Install graceful shutdown (task 1.12 finalizes; stub here).

Specification

Stream record shape

XADD telemetry:teltonika MAXLEN ~ <maxlen> * <fields> where fields are flat key→string pairs (Redis Streams do not nest). Use a JSON-encoded payload field for simplicity:

1) ts        → ISO8601 string (timestamp from the Position)
2) device_id → IMEI string
3) codec     → "8" | "8E" | "16" (the codec that produced this record — useful for downstream filtering)
4) payload   → JSON string of the full Position

The duplicated device_id and ts at the top level let downstream tools filter without parsing the JSON; payload is the source of truth.
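The field mapping above can be sketched as a pure helper. The Position shape here is illustrative only (the real interface comes from earlier Phase 1 tasks), and the payload is stringified without the bigint/Buffer replacer covered in the next section, to keep the sketch minimal:

```typescript
// Illustrative Position shape — the real type is defined earlier in Phase 1.
interface Position {
  deviceId: string;                      // IMEI
  ts: Date;
  lat: number;
  lon: number;
  attributes: Record<string, unknown>;
}

type CodecLabel = '8' | '8E' | '16';

// Flatten a Position into the flat key→string pairs XADD expects.
function serializePosition(p: Position, codec: CodecLabel): Record<string, string> {
  return {
    ts: p.ts.toISOString(),
    device_id: p.deviceId,
    codec,
    payload: JSON.stringify(p),          // real code passes the custom replacer
  };
}
```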

JSON serialization

Position.attributes contains number | bigint | Buffer. Out of the box, JSON.stringify handles number, but it throws on bigint and produces an unwieldy byte-array encoding for Buffer. Implement a custom replacer:

function replacer(_key: string, value: unknown): unknown {
  if (typeof value === 'bigint') return { __bigint: value.toString() };
  if (Buffer.isBuffer(value))    return { __buffer_b64: value.toString('base64') };
  if (value instanceof Date)     return value.toISOString();
  return value;
}

The __bigint and __buffer_b64 sentinels are decoded by the Processor (and any other consumer). Document this contract in the position-record page once landed.
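On the consumer side, the contract inverts with a JSON.parse reviver. A sketch of what the Processor's decoder might look like (the function name and its placement are assumptions — the Processor is outside this task):

```typescript
// Decode the __bigint / __buffer_b64 sentinels produced by the replacer.
function reviver(_key: string, value: unknown): unknown {
  if (value !== null && typeof value === 'object') {
    const v = value as Record<string, unknown>;
    if (typeof v.__bigint === 'string') return BigInt(v.__bigint);
    if (typeof v.__buffer_b64 === 'string') return Buffer.from(v.__buffer_b64, 'base64');
  }
  return value;
}
```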

XADD options

  • MAXLEN ~ <REDIS_STREAM_MAXLEN> — approximate trimming, much cheaper than exact.
  • * for auto-generated message ID.
  • Use a single connection (no pooling — ioredis multiplexes commands automatically).

Backpressure / non-blocking property

The TCP handler is await-ing ctx.publish(p). Two strategies:

Option A: Direct XADD per record. Simplest. Latency per publish is sub-millisecond on a healthy Redis. The risk: if Redis hangs, the TCP handler blocks → device sockets back up → Phase 1's "TCP handler never blocks" property is violated.

Option B: Bounded in-memory queue + worker drain. A Promise-based bounded queue (e.g. p-queue or hand-rolled). publish() resolves once the record is enqueued; a worker drains via XADD. If the queue is full, the worker has fallen behind catastrophically — at that point we have to choose: drop oldest, drop newest, or throw. Recommendation: drop newest with a structured error log + metric, because the device will retransmit (we won't ACK).

Decision: Option B. Specification:

  • Queue capacity: 10,000 records (configurable via PUBLISH_QUEUE_CAPACITY).
  • On overflow: do not publish; throw a typed PublishOverflowError. The framing layer (task 1.4) catches this and skips the ACK so the device retransmits.
  • Worker concurrency: 1 (commands on a single ioredis connection are pipelined and processed serially by Redis anyway; extra workers would add overhead without throughput).
  • Metric: teltonika_publish_queue_depth gauge, teltonika_publish_overflow_total counter.
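A hand-rolled version of the bounded queue could look like the sketch below; the real implementation in publish.ts would additionally wire the depth gauge and overflow counter:

```typescript
class PublishOverflowError extends Error {
  constructor(capacity: number) {
    super(`publish queue full (capacity ${capacity}); record dropped, device will retransmit`);
    this.name = 'PublishOverflowError';
  }
}

// Minimal bounded FIFO; push() throws on overflow so the framing layer
// can skip the ACK, per the decision above.
class BoundedQueue<T> {
  private items: T[] = [];
  constructor(private readonly capacity: number) {}

  push(item: T): void {
    if (this.items.length >= this.capacity) throw new PublishOverflowError(this.capacity);
    this.items.push(item);
  }

  shift(): T | undefined {
    return this.items.shift();
  }

  get depth(): number {   // reported as the queue-depth gauge
    return this.items.length;
  }
}
```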

The worker uses XADD with a per-call timeout (e.g. 2s) and exits the process on prolonged Redis unavailability — graceful shutdown should restart the process via the orchestrator.
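The per-call timeout is not an XADD feature; it has to be imposed client-side, e.g. with a generic race helper along these lines (a sketch — name and shape are assumptions):

```typescript
// Race a promise against a deadline; used by the worker around each XADD.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
    p.then(
      (v) => { clearTimeout(timer); resolve(v); },
      (e) => { clearTimeout(timer); reject(e); },
    );
  });
}
```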

main.ts skeleton

async function main() {
  const config = loadConfig();
  const logger = createLogger(config);
  const metrics = createMetrics();
  const redis = await connectRedis(config, logger);
  const publisher = createPublisher(redis, config, logger, metrics);
  const adapter = createTeltonikaAdapter({ publisher, logger, metrics });
  const server = startServer(config.TELTONIKA_PORT, adapter, { publish: (p) => publisher.publish(p), logger, metrics });
  const metricsServer = startMetricsServer(config.METRICS_PORT, metrics);
  installGracefulShutdown({ server, metricsServer, redis, publisher, logger });
  logger.info({ port: config.TELTONIKA_PORT }, 'tcp-ingestion ready');
}

main().catch((err) => { console.error(err); process.exit(1); });
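The retry-on-startup behaviour of connectRedis can be factored as a generic helper. A sketch under assumed names — in the real function, fn would wrap the ioredis constructor plus a PING health check:

```typescript
// Retry fn up to `attempts` times with a fixed delay between tries.
async function retryOnStartup<T>(
  fn: () => Promise<T>,
  attempts: number,
  delayMs: number,
): Promise<T> {
  let lastErr: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      // A real implementation would log { attempt, attempts, err } here.
      if (attempt < attempts) await new Promise((r) => setTimeout(r, delayMs));
    }
  }
  // Exhausted: surface the last error so main() logs it and exits non-zero.
  throw lastErr;
}
```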

Acceptance criteria

  • Integration test: spin up a Redis (testcontainers or redis-mock), publish a known Position, XREAD it back, parse the JSON, and assert it equals the input (with bigint and Buffer round-tripped through the sentinel encoding).
  • Overflow test: artificially block the worker, fill the queue, verify the next publish() rejects with PublishOverflowError, verify metrics increment.
  • Startup test: with a wrong REDIS_URL, the process logs a clear error and exits non-zero.
  • An end-to-end test: open a TCP client to the running server, send the canonical Codec 8 fixture, verify a Position lands on the Stream and the ACK comes back with 00 00 00 01.

Risks / open questions

  • redis-mock does not implement Streams. Use testcontainers + a real Redis for integration tests.
  • The bounded queue could cause backpressure concerns — discuss with the Processor team whether they prefer the device-retransmit path (overflow throw) or a soft-drop with logging. Defaulting to retransmit because it's the safer correctness choice.

Done

Implemented in task 1.8. Key deviations from spec:

  1. Buffer.toJSON() trap — Buffer.prototype.toJSON() converts a Buffer to {type:'Buffer',data:[...]} before the JSON.stringify replacer sees it. The replacer checks both instanceof Uint8Array (direct calls) and the {type:'Buffer',data:[...]} shape (JSON.stringify path) to handle both cases. The spec's Buffer.isBuffer(value) check would not work here; documented in publish.ts.
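A sketch of the shape-aware replacer described above (the isBufferJson guard name is illustrative, not necessarily the one in publish.ts):

```typescript
// Shape Buffer.prototype.toJSON produces before the replacer runs.
interface BufferJson { type: 'Buffer'; data: number[] }

function isBufferJson(v: unknown): v is BufferJson {
  return v !== null && typeof v === 'object'
    && (v as BufferJson).type === 'Buffer'
    && Array.isArray((v as BufferJson).data);
}

function replacer(_key: string, value: unknown): unknown {
  if (typeof value === 'bigint') return { __bigint: value.toString() };
  // Direct invocation: the raw bytes are still intact.
  if (value instanceof Uint8Array) {
    return { __buffer_b64: Buffer.from(value).toString('base64') };
  }
  // JSON.stringify path: toJSON has already expanded the Buffer.
  if (isBufferJson(value)) {
    return { __buffer_b64: Buffer.from(value.data).toString('base64') };
  }
  return value;
}
```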

  2. Codec label plumbing — Chose Option B (handler wrapper), not a signature change to CodecHandlerContext.publish. AdapterContext.publish was updated to (position, codec) => Promise<void>; the framing layer (index.ts) builds a (pos) => ctx.publish(pos, codecLabel) closure at dispatch time. Codec parsers (codec8.ts, codec8e.ts, codec16.ts) are unchanged.
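The dispatch-time closure can be sketched as follows, with both context interfaces reduced to their publish members for illustration (the wrapper name is assumed):

```typescript
type Codec = '8' | '8E' | '16';

// Shapes reduced to the publish members for illustration.
interface Position { deviceId: string }
interface AdapterContext {
  publish(p: Position, codec: Codec): Promise<void>;
}
interface CodecHandlerContext {
  publish(p: Position): Promise<void>;
}

// At dispatch time the framing layer pins the codec label in a closure,
// so the per-codec parsers keep their one-argument publish signature.
function withCodecLabel(ctx: AdapterContext, codecLabel: Codec): CodecHandlerContext {
  return { publish: (pos) => ctx.publish(pos, codecLabel) };
}
```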

  3. connectRedis exported from publish.ts — co-located with publisher for testability; spec showed it in main.ts but extraction is cleaner.

  4. Integration tests skipped (Docker unavailable) — Two integration tests in test/publish.integration.test.ts log "Docker not available — skipping" and pass without executing. Will run in CI (task 1.11).

  5. startMetricsServer omitted from main.ts — Task 1.10 is out of scope; placeholder metrics (stub inc/observe) used per spec. The main.ts skeleton in the spec included startMetricsServer — deferred.

Test count: 81 (was 62, +19).