# Task 1.5 — Redis Stream consumer (XREADGROUP)

**Phase:** 1 — Throughput pipeline
**Status:** 🟩 Done
**Depends on:** 1.2, 1.3
**Wiki refs:** `docs/wiki/entities/redis-streams.md`, `docs/wiki/entities/processor.md`

## Goal

Build the Redis Stream consumer: join the consumer group, fetch batches via `XREADGROUP`, decode each entry to a `Position`, hand off to a sink callback, and return successfully handled IDs to the caller for `XACK`. This task does **not** wire in the Postgres writer or the in-memory state — those are tasks 1.7 and 1.6, joined to the consumer in 1.8.

The consumer accepts a `sink: (records: ConsumedRecord[]) => Promise<string[]>` callback that returns the IDs it wants ACKed. Only those IDs are ACKed; failures stay pending and get claimed on the next loop.

## Deliverables

- `src/core/consumer.ts` exporting:
  - `createConsumer(redis, config, logger, metrics, sink): Consumer` — factory.
  - `Consumer` interface: `start(): Promise<void>` (resolves when the consumer loop starts), `stop(): Promise<void>` (signals the loop to exit, then waits for the in-flight batch).
  - `ensureConsumerGroup(redis, stream, group)` — `XGROUP CREATE ... MKSTREAM`, ignoring `BUSYGROUP` errors. Called once at start.
  - `type ConsumedRecord = { id: string; position: Position; codec: string; ts: string }` — what's passed to the sink.
- `test/consumer.test.ts` (mocked `ioredis`):
  - Decodes a synthetic stream entry into a `ConsumedRecord` with the right shape.
  - Calls `sink` with the decoded batch and ACKs only the IDs the sink returned.
  - On a `BUSYGROUP` error from `XGROUP CREATE`, swallows the error and continues.
  - On a malformed payload, increments `processor_decode_errors_total`, logs at `error`, and **does not** ACK the bad entry — it stays pending for inspection.
  - On `stop()`, the loop exits cleanly without losing in-flight work.
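The types above can be sketched as follows. This is a minimal illustration, not the final `src/core/consumer.ts`: the real `Position` type and `decodePosition` come from task 1.2, so a JSON stand-in is used here, and `decodeEntry` is a hypothetical helper name showing how an ioredis flat field array maps onto a `ConsumedRecord`.

```typescript
// Stand-in for the Position type and decoder from task 1.2 (assumption: JSON payloads).
export type Position = Record<string, unknown>;
const decodePosition = (payload: string): Position => JSON.parse(payload);

export type ConsumedRecord = {
  id: string;         // Redis stream entry ID, e.g. "1700000000000-0"
  position: Position; // decoded payload
  codec: string;      // codec tag carried on the entry
  ts: string;         // producer timestamp field
};

// Sink contract: receives the decoded batch, returns the IDs it wants ACKed.
export type Sink = (records: ConsumedRecord[]) => Promise<string[]>;

export interface Consumer {
  start(): Promise<void>; // resolves once the loop is running
  stop(): Promise<void>;  // signals exit, waits for the in-flight batch
}

// ioredis returns each entry's fields as a flat [k1, v1, k2, v2, ...] array.
function fieldsToObject(fields: string[]): Record<string, string> {
  const out: Record<string, string> = {};
  for (let i = 0; i + 1 < fields.length; i += 2) out[fields[i]] = fields[i + 1];
  return out;
}

// Hypothetical per-entry decode step; throwing here is what the
// decode-error path in the spec catches, counts, and skips.
export function decodeEntry(id: string, fields: string[]): ConsumedRecord {
  const f = fieldsToObject(fields);
  if (f.payload === undefined) throw new Error(`entry ${id} missing payload`);
  return { id, position: decodePosition(f.payload), codec: f.codec ?? '', ts: f.ts ?? '' };
}
```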
## Specification

### Consumer loop shape

```ts
async function runLoop() {
  while (!stopping) {
    let entries: StreamEntry[];
    try {
      entries = await redis.xreadgroup(
        'GROUP', group, consumerName,
        'COUNT', batchSize,
        'BLOCK', batchBlockMs,
        'STREAMS', stream, '>',
      );
    } catch (err) {
      logger.error({ err }, 'XREADGROUP failed; backing off');
      await sleep(1000);
      continue;
    }
    if (!entries) continue; // BLOCK timeout

    const records = decodeBatch(entries); // <— may emit decode errors
    const ackIds = await sink(records);   // <— writer + state
    if (ackIds.length > 0) {
      await redis.xack(stream, group, ...ackIds);
    }
  }
}
```

### Decode error handling

`decodeBatch` calls `decodePosition` (from task 1.2) on each entry's `payload` field. If a single entry fails to decode:

- Increment `processor_decode_errors_total{stream=...}`.
- Log at `error` with the entry ID and a truncated raw payload (first 256 chars).
- **Skip** the entry — do not pass it to the sink, do not ACK it. It stays in the consumer's PEL (Pending Entries List) and will be re-attempted on the next claim.

Phase 3 will route truly-poison entries to a dead-letter stream; for Phase 1, leaving them pending and visible in `XPENDING` is enough.

### `XACK` semantics

ACK only what the sink returned. If the sink returns `['id1', 'id3']` from a batch of `[id1, id2, id3]`, then `id2` stays pending. A sink returns a partial list when it failed to write some records. The consumer must trust the sink's signal — never ACK speculatively.

### Consumer group setup

On `start()`:

1. `XGROUP CREATE <stream> <group> $ MKSTREAM` — creates the stream if missing, with the group positioned at "now" so we don't replay history. If the group already exists, the call fails with `BUSYGROUP Consumer Group name already exists` — catch and ignore.
2. Log at `info` whether the group was created or already existed.

### Why `>` not `0` for the read ID

`>` means "deliver only new entries, not this consumer's pending ones." That's what we want for the steady-state loop.
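The group-setup step above can be sketched as a small standalone function. The `RedisLike` interface is a stand-in for the ioredis client (whose `xgroup` method takes the subcommand and arguments positionally); the `'created' | 'exists'` return value is an assumption added here to support the "log whether the group was created" requirement.

```typescript
// Minimal stand-in for the ioredis client surface this function needs.
interface RedisLike {
  xgroup(...args: (string | number)[]): Promise<unknown>;
}

export async function ensureConsumerGroup(
  redis: RedisLike,
  stream: string,
  group: string,
): Promise<'created' | 'exists'> {
  try {
    // '$' starts the group at "now" so history is not replayed;
    // MKSTREAM creates the stream if it does not exist yet.
    await redis.xgroup('CREATE', stream, group, '$', 'MKSTREAM');
    return 'created';
  } catch (err) {
    // Redis rejects a duplicate CREATE with a BUSYGROUP error — not a failure.
    if (err instanceof Error && err.message.startsWith('BUSYGROUP')) {
      return 'exists';
    }
    throw err; // anything else (connection loss, wrong type) is a real error
  }
}
```

The caller logs the returned status at `info` and only then enters the read loop.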
Phase 3 will add an explicit `XAUTOCLAIM` step at startup (and periodically) to pull stuck pending entries from dead consumers; Phase 1 relies on natural redelivery via consumer-group resumption (when a dead instance restarts with the same name, it sees its old PEL).

## Acceptance criteria

- [ ] `pnpm typecheck`, `pnpm lint`, `pnpm test` clean.
- [ ] Unit tests cover: happy path, `BUSYGROUP` swallow, decode-error skip, partial ACK, clean stop.
- [ ] A stop signal causes the loop to exit within one `BATCH_BLOCK_MS` tick.

## Risks / open questions

- **Consumer name uniqueness.** Two instances with the same `REDIS_CONSUMER_NAME` will both read from the same PEL, which is undefined behaviour. Task 1.3 already documents that `INSTANCE_ID` (which defaults to `REDIS_CONSUMER_NAME`) must be unique per instance — surface this again in the operator-facing README later.
- **Long sink calls block the loop.** If the Postgres writer takes 30 s, no new records are read. That's fine for Phase 1 (Postgres should be fast); Phase 3 may add a configurable max-in-flight if writer pressure becomes an issue.

## Done

`src/core/consumer.ts` — XREADGROUP loop with `ensureConsumerGroup`, `decodeBatch`, partial-ACK semantics, `connectRedis` (co-located, not in `src/db/`), and clean stop. `test/consumer.test.ts` — 11 tests covering: happy path, partial ACK, BUSYGROUP swallow, decode-error skip, missing-payload skip, XREADGROUP backoff, clean stop. *(pending commit SHA)*