processor/.planning/phase-1-throughput/03-config-and-logging.md

# Task 1.3 — Configuration & logging

**Phase:** 1 — Throughput pipeline
**Status:** ⬜ Not started
**Depends on:** 1.1
**Wiki refs:** `docs/wiki/entities/processor.md`

## Goal

Validate environment variables on startup with `zod`, build the pino root logger with the same conventions as `tcp-ingestion` (ISO timestamps, string level labels, `instance_id` base field), and fail fast with a readable error message if the config is invalid.

## Deliverables

- `src/config/load.ts` exporting:
  - `loadConfig(): Config` — reads `process.env`, runs zod parse, returns a typed `Config`. Throws on invalid input with a multi-line message that names every invalid field.
  - `Config` type derived from the zod schema.
- `src/observability/logger.ts` exporting:
  - `createLogger({ level, nodeEnv, instanceId }): Logger` — pino root logger with base fields `service: 'processor'` and `instance_id`. ISO timestamps via `pino.stdTimeFunctions.isoTime`. Level formatter that emits `"level":"info"`, not `"level":30`. When `nodeEnv === 'development'`, use the pino-pretty transport.
  - `type Logger` re-exported from `pino`.
- Wire both into `src/main.ts`: `loadConfig()` → `createLogger()` → `logger.info('processor starting')` → exit 0 (still a stub; consumer wiring lands in 1.8).
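
The wiring in the last bullet, together with the fail-fast requirement from the Goal, might look like the sketch below. To keep it self-contained, `loadConfig` and `createLogger` are passed in as dependencies and typed with minimal local interfaces; `runStartup`, `StartupDeps`, `MinimalConfig`, and `MinimalLogger` are hypothetical names, and the real `src/main.ts` would import both modules directly. Two further assumptions: `Config` keeps the env var names as its keys, and invalid config maps to exit code 1 (the task only specifies exit 0 on success).

```ts
// Minimal stand-ins for the real Config and pino Logger types (hypothetical names).
interface MinimalConfig { LOG_LEVEL: string; NODE_ENV: string; INSTANCE_ID: string }
interface MinimalLogger { info(msg: string): void }

interface StartupDeps {
  loadConfig: () => MinimalConfig;
  createLogger: (opts: { level: string; nodeEnv: string; instanceId: string }) => MinimalLogger;
  stderr: (text: string) => void;
}

// Returns the exit code instead of calling process.exit, which keeps it testable.
export function runStartup(deps: StartupDeps): number {
  let config: MinimalConfig;
  try {
    config = deps.loadConfig();
  } catch (err) {
    // No logger exists yet, so the readable multi-line message goes straight to stderr.
    deps.stderr(err instanceof Error ? err.message : String(err));
    return 1; // assumed non-zero exit code for invalid config
  }
  const logger = deps.createLogger({
    level: config.LOG_LEVEL,
    nodeEnv: config.NODE_ENV,
    instanceId: config.INSTANCE_ID,
  });
  logger.info('processor starting');
  return 0; // still a stub; consumer wiring lands in 1.8
}
```

In `src/main.ts` the returned code would be assigned to `process.exitCode`, with `stderr` bound to `console.error`.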

## Specification

### Environment variables

| Var | Required | Default | Notes |
|---|---|---|---|
| `NODE_ENV` | no | `production` | `development` enables pino-pretty |
| `INSTANCE_ID` | no | `processor-1` | Used in metrics + log base field |
| `LOG_LEVEL` | no | `info` | `trace` / `debug` / `info` / `warn` / `error` |
| `REDIS_URL` | yes | — | e.g. `redis://redis:6379` |
| `POSTGRES_URL` | yes | — | e.g. `postgres://user:pass@db:5432/trm` |
| `REDIS_TELEMETRY_STREAM` | no | `telemetry:t` | Must match `tcp-ingestion`'s `REDIS_TELEMETRY_STREAM` |
| `REDIS_CONSUMER_GROUP` | no | `processor` | All Processor instances join this group |
| `REDIS_CONSUMER_NAME` | no | `${INSTANCE_ID}` | Unique per instance — defaults to instance id |
| `METRICS_PORT` | no | `9090` | HTTP server port for `/metrics`, `/healthz`, `/readyz` |
| `BATCH_SIZE` | no | `100` | Max records per `XREADGROUP` call |
| `BATCH_BLOCK_MS` | no | `5000` | `BLOCK` timeout on `XREADGROUP` when stream is empty |
| `WRITE_BATCH_SIZE` | no | `50` | Max rows per Postgres `INSERT` |
| `DEVICE_STATE_LRU_CAP` | no | `10000` | Max devices kept in memory; LRU eviction beyond this |

### Validation rules

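- All defaults must be expressed in the zod schema with `.default(...)` so the parsed `Config` is fully typed and never has `undefined` for an optional field. The one exception is `REDIS_CONSUMER_NAME`: its default depends on `INSTANCE_ID`, which a field-level `.default(...)` cannot see, so resolve it in an object-level `.transform(...)` (or immediately after parsing).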
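- Numeric env vars must be coerced from string and bounded: `BATCH_SIZE` 1–10000, `BATCH_BLOCK_MS` 0–60000, `WRITE_BATCH_SIZE` 1–1000, `DEVICE_STATE_LRU_CAP` 100–1_000_000. `METRICS_PORT` must also be coerced to an integer; no range is specified here, but it has to be a valid TCP port (at most 65535).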
- `REDIS_URL` and `POSTGRES_URL` must parse as URLs with the expected protocol (`redis:` or `rediss:`; `postgres:` or `postgresql:`).
- `LOG_LEVEL` must be one of pino's accepted levels. Note that pino also accepts `fatal` and `silent`, which the table above does not list.
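
The rules above could be expressed roughly as follows. This is a minimal sketch, not the final `src/config/load.ts`. It assumes zod 3.20 or later (for `z.coerce`); the helper names `urlWithProtocol` and `boundedInt` are hypothetical; `LOG_LEVEL` is limited to the five levels in the table; the 1–65535 bound on `METRICS_PORT` is an assumption; and `loadConfig` takes an optional `env` argument purely so it can be exercised without mutating `process.env`.

```ts
import { z } from 'zod';

// Hypothetical helper: a string that parses as a URL with one of the given protocols.
const urlWithProtocol = (protocols: string[]) =>
  z.string().refine(
    (value) => {
      try {
        return protocols.includes(new URL(value).protocol);
      } catch {
        return false; // not parseable as a URL at all
      }
    },
    { message: `must be a URL with protocol ${protocols.join(' or ')}` },
  );

// Hypothetical helper: coerce the env string to an integer, bound it, and default it.
const boundedInt = (min: number, max: number, fallback: number) =>
  z.coerce.number().int().min(min).max(max).default(fallback);

const ConfigSchema = z
  .object({
    NODE_ENV: z.string().default('production'),
    INSTANCE_ID: z.string().min(1).default('processor-1'),
    LOG_LEVEL: z.enum(['trace', 'debug', 'info', 'warn', 'error']).default('info'),
    REDIS_URL: urlWithProtocol(['redis:', 'rediss:']),
    POSTGRES_URL: urlWithProtocol(['postgres:', 'postgresql:']),
    REDIS_TELEMETRY_STREAM: z.string().min(1).default('telemetry:t'),
    REDIS_CONSUMER_GROUP: z.string().min(1).default('processor'),
    REDIS_CONSUMER_NAME: z.string().min(1).optional(),
    METRICS_PORT: boundedInt(1, 65535, 9090), // range is an assumption
    BATCH_SIZE: boundedInt(1, 10_000, 100),
    BATCH_BLOCK_MS: boundedInt(0, 60_000, 5000),
    WRITE_BATCH_SIZE: boundedInt(1, 1000, 50),
    DEVICE_STATE_LRU_CAP: boundedInt(100, 1_000_000, 10_000),
  })
  // A field-level .default() cannot see INSTANCE_ID, so fill the fallback here.
  .transform((env) => ({ ...env, REDIS_CONSUMER_NAME: env.REDIS_CONSUMER_NAME ?? env.INSTANCE_ID }));

export type Config = z.infer<typeof ConfigSchema>;

export function loadConfig(env: Record<string, string | undefined> = process.env): Config {
  const result = ConfigSchema.safeParse(env);
  if (!result.success) {
    // One line per issue so every invalid field is named in the thrown message.
    const lines = result.error.issues.map((issue) => `  - ${issue.path.join('.')}: ${issue.message}`);
    throw new Error(`Invalid configuration:\n${lines.join('\n')}`);
  }
  return result.data;
}
```

With only `REDIS_URL` and `POSTGRES_URL` set, every other field of the returned `Config` takes its table default, and `REDIS_CONSUMER_NAME` resolves to `processor-1`.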

### Logger conventions

Match `tcp-ingestion/src/observability/logger.ts` line for line where applicable. Future-you grepping across services should see the same shape:

```ts
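import { pino, type Logger } from 'pino';

// Signature and base fields follow the Deliverables section above.
export function createLogger({ level, nodeEnv, instanceId }: { level: string; nodeEnv: string; instanceId: string }): Logger {
  const base = { service: 'processor', instance_id: instanceId };
  // Emit "level":"info" instead of pino's numeric "level":30.
  const formatters = { level: (label: string) => ({ level: label }) };

  if (nodeEnv === 'development') {
    return pino({ level, base, timestamp: pino.stdTimeFunctions.isoTime, formatters,
      transport: { target: 'pino-pretty', options: { colorize: true, translateTime: 'SYS:standard', ignore: 'pid,hostname' } } });
  }
  return pino({ level, base, timestamp: pino.stdTimeFunctions.isoTime, formatters });
}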
```

## Acceptance criteria

- [ ] `pnpm test` covers config validation: missing required vars throw with a message that names each missing field; invalid URLs throw; bounded numerics throw on out-of-range values.
- [ ] Running with valid env (and `INSTANCE_ID` unset, so the default applies) emits a single `processor starting` info log with base fields `service=processor` and `instance_id=processor-1`.
- [ ] Running with `NODE_ENV=development` produces colorized output via pino-pretty.
- [ ] Running with `NODE_ENV=production` produces JSON output with ISO `time` and string `level`.
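
The production-output criterion can be checked mechanically by capturing one line of stdout and asserting on its shape. The helper below is a hedged sketch: `checkLogLine` is a hypothetical name, not an existing test utility. It assumes pino's default `time` and `msg` keys, and that `pino.stdTimeFunctions.isoTime` emits a UTC timestamp in the same form as `Date.prototype.toISOString`.

```ts
// Returns a list of problems; an empty list means the line has the expected shape.
export function checkLogLine(line: string, expectedInstanceId: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch {
    return ['line is not valid JSON'];
  }
  if (typeof parsed !== 'object' || parsed === null) {
    return ['line is not a JSON object'];
  }
  const { level, time, service, instance_id, msg } = parsed as Record<string, unknown>;

  const problems: string[] = [];
  if (typeof level !== 'string') {
    problems.push('level must be a string label such as "info"');
  }
  // Round-trip through Date so that only canonical ISO 8601 UTC strings pass.
  const isIso =
    typeof time === 'string' &&
    !Number.isNaN(Date.parse(time)) &&
    new Date(time).toISOString() === time;
  if (!isIso) {
    problems.push('time must be an ISO 8601 timestamp');
  }
  if (service !== 'processor') {
    problems.push('service base field must be "processor"');
  }
  if (instance_id !== expectedInstanceId) {
    problems.push(`instance_id base field must be "${expectedInstanceId}"`);
  }
  if (msg !== 'processor starting') {
    problems.push('msg must be "processor starting"');
  }
  return problems;
}
```

A test would feed it the first stdout line from a `NODE_ENV=production` run and expect an empty array; a numeric `level` or epoch-millisecond `time` each produce one entry.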

## Risks / open questions

- `REDIS_CONSUMER_NAME` defaulting to `INSTANCE_ID` means `INSTANCE_ID` must be unique per instance for safe consumer-group operation. Document this in `.env.example` so operators don't accidentally run two instances with the same `INSTANCE_ID`.

## Done

(Fill in once complete: commit SHA, brief notes.)