Files
processor/.planning/phase-1-throughput/03-config-and-logging.md
T
julian c314ba0902 Add planning documents for Phase 1 (throughput pipeline) and stub Phases 2-4
ROADMAP.md establishes status legend, architectural anchors pointing at the
wiki, and seven non-negotiable design rules — most importantly the
core/domain boundary that protects Phase 1 from Phase 2 churn, the
schema-authority split (positions hypertable owned here; everything else
owned by Directus), and idempotent-writes via (device_id, ts) ON CONFLICT.

Phase 1 (throughput pipeline) is fully detailed across 11 task files:
scaffold, core types + sentinel decoder, config + logging, Postgres
hypertable, Redis Stream consumer, per-device LRU state, batched writer,
main wiring, observability, integration test, Dockerfile + Gitea CI.
Observability is in Phase 1 (not deferred) — lesson learned from
tcp-ingestion task 1.10.

Phases 2-4 are stub READMEs. Phase 2 (domain logic) blocks on Directus
schema decisions and lists those open questions explicitly. Phase 3
(production hardening) and Phase 4 (future) sketch the task shape.
2026-04-30 21:16:59 +02:00

77 lines
4.3 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Task 1.3 — Configuration & logging
**Phase:** 1 — Throughput pipeline
**Status:** ⬜ Not started
**Depends on:** 1.1
**Wiki refs:** `docs/wiki/entities/processor.md`
## Goal
Validate environment variables on startup with `zod`, build the pino root logger with the same conventions as `tcp-ingestion` (ISO timestamps, string level labels, instance_id base field), and fail fast with a readable error message if config is invalid.
## Deliverables
- `src/config/load.ts` exporting:
- `loadConfig(): Config` — reads `process.env`, runs zod parse, returns a typed `Config`. Throws on invalid input with a multi-line message that names every invalid field.
- `Config` type derived from the zod schema.
- `src/observability/logger.ts` exporting:
- `createLogger({ level, nodeEnv, instanceId }): Logger` — pino root logger with base fields `service: 'processor'`, `instance_id`. ISO timestamps via `pino.stdTimeFunctions.isoTime`. Level formatter that emits `"level":"info"` not `"level":30`. In `nodeEnv === 'development'`, use the pino-pretty transport.
- `type Logger` re-exported from `pino`.
- Wire both into `src/main.ts`: `loadConfig()``createLogger()``logger.info('processor starting')` → exit 0 (still a stub; consumer wiring lands in 1.8).
## Specification
### Environment variables
| Var | Required | Default | Notes |
|---|---|---|---|
| `NODE_ENV` | no | `production` | `development` enables pino-pretty |
| `INSTANCE_ID` | no | `processor-1` | Used in metrics + log base field |
| `LOG_LEVEL` | no | `info` | `trace` / `debug` / `info` / `warn` / `error` |
| `REDIS_URL` | yes | — | e.g. `redis://redis:6379` |
| `POSTGRES_URL` | yes | — | e.g. `postgres://user:pass@db:5432/trm` |
| `REDIS_TELEMETRY_STREAM` | no | `telemetry:t` | Must match `tcp-ingestion`'s `REDIS_TELEMETRY_STREAM` |
| `REDIS_CONSUMER_GROUP` | no | `processor` | All Processor instances join this group |
| `REDIS_CONSUMER_NAME` | no | `${INSTANCE_ID}` | Unique per instance — defaults to instance id |
| `METRICS_PORT` | no | `9090` | HTTP server port for `/metrics`, `/healthz`, `/readyz` |
| `BATCH_SIZE` | no | `100` | Max records per `XREADGROUP` call |
| `BATCH_BLOCK_MS` | no | `5000` | `BLOCK` timeout on `XREADGROUP` when stream is empty |
| `WRITE_BATCH_SIZE` | no | `50` | Max rows per Postgres `INSERT` |
| `DEVICE_STATE_LRU_CAP` | no | `10000` | Max devices kept in memory; LRU eviction beyond this |
### Validation rules
- All defaults must be expressed in the zod schema with `.default(...)` so the parsed `Config` is fully typed and never has `undefined` for an optional field.
- Numeric env vars must be coerced from string and bounded: `BATCH_SIZE` 110000, `BATCH_BLOCK_MS` 060000, `WRITE_BATCH_SIZE` 11000, `DEVICE_STATE_LRU_CAP` 1001_000_000.
- `REDIS_URL` and `POSTGRES_URL` must parse as URLs with the expected protocol (`redis:` or `rediss:`; `postgres:` or `postgresql:`).
- `LOG_LEVEL` must be one of pino's accepted levels.
### Logger conventions
Match `tcp-ingestion/src/observability/logger.ts` line for line where applicable. Future-you grepping across services should see the same shape:
```ts
const formatters = { level: (label: string) => ({ level: label }) };
if (nodeEnv === 'development') {
return pino({ level, base, timestamp: pino.stdTimeFunctions.isoTime, formatters,
transport: { target: 'pino-pretty', options: { colorize: true, translateTime: 'SYS:standard', ignore: 'pid,hostname' } } });
}
return pino({ level, base, timestamp: pino.stdTimeFunctions.isoTime, formatters });
```
## Acceptance criteria
- [ ] `pnpm test` covers config validation: missing required vars throw with the right message; invalid URLs throw; bounded numerics throw on out-of-range values.
- [ ] Running with valid env emits a single `processor starting` info log with `service=processor` and `instance_id=processor-1` base fields.
- [ ] Running with `NODE_ENV=development` produces colorized output via pino-pretty.
- [ ] Running with `NODE_ENV=production` produces JSON output with ISO `time` and string `level`.
## Risks / open questions
- `REDIS_CONSUMER_NAME` defaulting to `INSTANCE_ID` means `INSTANCE_ID` must be unique per instance for safe consumer-group operation. Document this in `.env.example` so operators don't accidentally run two instances with the same `INSTANCE_ID`.
## Done
(Fill in once complete: commit SHA, brief notes.)