processor/.planning/phase-1-throughput/10-integration-test.md
julian e1c6f59948 Realign processor stream-name default to telemetry:teltonika
Stage discovered the wrong default at runtime: tcp-ingestion's compiled
default REDIS_TELEMETRY_STREAM is 'telemetry:teltonika' but processor's
was 'telemetry:t', so the two services were talking past each other —
tcp-ingestion publishing to one stream, processor reading another empty
one. The deploy stack now pins both to the same value via a shared env
var, but the processor's compiled default should also match so local
development and the integration test stay aligned with reality.

Changes:
- src/config/load.ts — default changed to 'telemetry:teltonika'
- .env.example — same
- test/config.test.ts — default-value assertion updated
- planning docs (ROADMAP, phase-1 README, tasks 03/08/10, phase-3 README) —
  occurrences of 'telemetry:t' replaced with 'telemetry:teltonika'

The deploy stack remains the single source of truth via the shared
REDIS_TELEMETRY_STREAM env var. Compiled defaults are belt-and-braces.
2026-05-01 11:43:31 +02:00


Task 1.10 — Integration test (testcontainers Redis + Postgres)

Phase: 1 — Throughput pipeline
Status: 🟩 Done
Depends on: 1.5, 1.7, 1.8, 1.9
Wiki refs:

Goal

End-to-end pipeline test: spin up Redis 7 and TimescaleDB via testcontainers, boot the Processor against them, publish a synthetic Position to telemetry:teltonika, verify the row appears in positions with byte-equivalent attribute decoding (bigint, Buffer included).

This is the integration test that proves the upstream contract from tcp-ingestion flows through end-to-end. Mirror tcp-ingestion/test/publish.integration.test.ts's structure and skip-on-no-Docker pattern.

Deliverables

  • test/pipeline.integration.test.ts:

    • beforeAll: start the Redis container, start the TimescaleDB container, run migrations, build a Processor instance pointed at both. If Docker is unavailable, log a clear skip message and set a flag so every it block early-returns without failing.
    • afterAll: stop the Processor, stop containers.
    • Test 1: publish a Position with bigint and Buffer attributes via XADD; wait for the row in positions (poll, timeout 10s); assert device_id, ts, GPS fields, and a JSON round-trip of attributes matches the original (bigint as string, Buffer as base64).
    • Test 2: publish two records with the same (device_id, ts); verify only one row in positions (idempotency check).
    • Test 3: publish a malformed payload (broken JSON) on the stream; verify processor_decode_errors_total increments and the bad entry stays in PEL (not ACKed).
    • Test 4: simulate the writer failing once (e.g. by temporarily shutting Postgres mid-test, then bringing it back); verify the record gets retried and eventually lands.
  • Use the TimescaleDB image, not stock postgres:16-alpine. Suggested: timescale/timescaledb:latest-pg16. Confirm that the migration's CREATE EXTENSION IF NOT EXISTS timescaledb no-ops (the extension is already loaded).

  • Use the same Vitest config split as tcp-ingestion: vitest.integration.config.ts with hookTimeout: 120_000, testTimeout: 60_000. Default pnpm test excludes *.integration.test.ts; opt-in via pnpm test:integration.
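A minimal sketch of what that split config could look like, assuming the same shape as tcp-ingestion's (field names are Vitest's documented `test` options; the exact `include` glob is an assumption):

```typescript
// vitest.integration.config.ts — opt-in config for *.integration.test.ts only.
// hookTimeout is generous because the first run pulls container images.
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    include: ['test/**/*.integration.test.ts'],
    hookTimeout: 120_000,
    testTimeout: 60_000,
  },
});
```

`pnpm test:integration` could then map to `vitest run --config vitest.integration.config.ts`, while the default config keeps integration files out of plain `pnpm test`.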

Specification

Skip-on-no-Docker pattern

Copy tcp-ingestion/test/publish.integration.test.ts's pattern verbatim:

  • Try to start the first container in beforeAll. On error, set dockerAvailable = false, log a warning, and return.
  • Each it block early-returns with a console.warn if !dockerAvailable.
  • This pattern was the fix for the CI test failure on the runner without Docker — keep it.
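The guard itself is just a module-level flag plus an early return. A self-contained sketch of the shape (the `startContainers` stand-in is hypothetical — the real suite calls testcontainers' container `.start()` there, and this stub throws to simulate a missing Docker daemon):

```typescript
// Skip-on-no-Docker guard, sketched without testcontainers. In the real
// suite, startContainers() boots Redis and TimescaleDB; any throw inside
// beforeAll is treated as "no Docker on this machine".
let dockerAvailable = true;

async function startContainers(): Promise<void> {
  // Stand-in: the real code would do `new GenericContainer('redis:7').start()` etc.
  throw new Error('Cannot connect to the Docker daemon');
}

async function beforeAllHook(): Promise<void> {
  try {
    await startContainers();
  } catch (err) {
    dockerAvailable = false;
    console.warn('Docker unavailable, skipping integration tests:', (err as Error).message);
  }
}

// Every `it` body starts with this check and returns early when it is true.
function shouldSkip(): boolean {
  if (!dockerAvailable) {
    console.warn('skipped: Docker not available');
    return true;
  }
  return false;
}
```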

Synthetic Position publishing

Reuse serializePosition from tcp-ingestion's publish.ts if it can be imported (likely not — separate repos). Otherwise inline the encoding: a Position object → JSON.stringify with the bigint/Buffer replacer → XADD telemetry:teltonika * ts <iso> device_id <imei> codec 8E payload <json>.
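If the replacer has to be inlined, note one Buffer gotcha: `JSON.stringify` applies `Buffer.prototype.toJSON` before calling the replacer, so the replacer must read the raw value through its `this` holder. A sketch under that assumption (field layout per the XADD shape above; `toStreamFields` and the interface are hypothetical names, and the real serializePosition may differ):

```typescript
// bigint → decimal string, Buffer → base64. `this` is the object holding
// the current key, so this[key] is the raw value before Buffer#toJSON ran.
function replacer(this: Record<string, unknown>, key: string, value: unknown): unknown {
  const raw = this[key];
  if (typeof raw === 'bigint') return raw.toString();
  if (Buffer.isBuffer(raw)) return raw.toString('base64');
  return value;
}

interface SyntheticPosition {
  deviceId: string;
  ts: string;
  attributes: Record<string, unknown>;
}

// Field list for: XADD telemetry:teltonika * ts <iso> device_id <imei> codec 8E payload <json>
function toStreamFields(pos: SyntheticPosition): string[] {
  return [
    'ts', pos.ts,
    'device_id', pos.deviceId,
    'codec', '8E',
    'payload', JSON.stringify(pos.attributes, replacer),
  ];
}
```

Test 1's byte-equivalence assertion then compares the stored row's attributes against this same encoding (bigint as string, Buffer as base64).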

Why test 4 (writer failure → retry)

This validates the core ACK semantics: if a write fails, the record stays pending, and re-delivery brings it back. Without this test we have unit tests showing each piece behaves correctly, but no proof that the pieces compose. Fallback: if simulating a Postgres failure mid-test proves too flaky under testcontainers, weaken the scenario to: stop Postgres before publishing, publish, restart Postgres, verify the row appears.
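The semantics under test can be stated without Redis at all: a failed write leaves the entry un-ACKed, and a later delivery attempt retries it. A minimal simulation of that contract (names hypothetical — the real processor does this via XREADGROUP re-delivery and XACK):

```typescript
// One failed write leaves the entry pending; a later delivery retries and
// only then ACKs. maxDeliveries bounds the loop the way re-delivery would.
interface StreamEntry {
  id: string;
  acked: boolean;
}

async function deliverWithRetry(
  entry: StreamEntry,
  write: (e: StreamEntry) => Promise<void>,
  maxDeliveries = 3,
): Promise<void> {
  for (let attempt = 1; attempt <= maxDeliveries; attempt++) {
    try {
      await write(entry);
      entry.acked = true; // ACK only after the row is durably written
      return;
    } catch {
      // swallow: the entry stays in the PEL and will be re-delivered
    }
  }
}
```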

Acceptance criteria

  • pnpm test:integration runs all four scenarios green when Docker is available.
  • Without Docker, the suite logs skip messages and exits 0 (does not fail).
  • CI (pnpm test, unit only) does not run these — they are opt-in.
  • First-run container pull is reasonable; subsequent runs are fast (testcontainers caches the image).

Risks / open questions

  • Image pull on first CI run. The TimescaleDB image is large (~700MB). If we ever wire integration tests into CI (separate job with Docker), pre-pulling may be required. Document but defer.
  • Test flakiness from polling. Polling for "row appears in positions" uses a 10s timeout. If CI is slow, raise it. Don't replace polling with await sleep(2000) — that's reliably wrong.
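The polling this implies is a deadline loop over a probe, not a fixed sleep. A sketch of such a helper (name and defaults are assumptions, not existing code):

```typescript
// Retry an async probe until it yields a value or the deadline passes.
// Intended use: const row = await pollUntil(() => queryRow(client, id));
// (queryRow is a hypothetical per-test lookup.)
async function pollUntil<T>(
  probe: () => Promise<T | undefined>,
  { timeoutMs = 10_000, intervalMs = 200 } = {},
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const result = await probe();
    if (result !== undefined) return result;
    if (Date.now() >= deadline) {
      throw new Error(`pollUntil: timed out after ${timeoutMs} ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Raising the timeout on slow CI is then a single option change, and a fast local run still returns as soon as the row lands.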

Done

test/pipeline.integration.test.ts: four scenarios (happy path with bigint+Buffer, idempotency, malformed payload stays pending, writer failure → retry after Postgres restart). Uses timescale/timescaledb:latest-pg16; skip-on-no-Docker pattern verified (exits 0 without Docker). pnpm test:integration runs 4 tests green with Docker, 4 skips without. Landed in 9791620.