Add planning documents for Phase 1 (throughput pipeline) and stub Phases 2-4

ROADMAP.md establishes status legend, architectural anchors pointing at the
wiki, and seven non-negotiable design rules — most importantly the
core/domain boundary that protects Phase 1 from Phase 2 churn, the
schema-authority split (positions hypertable owned here; everything else
owned by Directus), and idempotent-writes via (device_id, ts) ON CONFLICT.

Phase 1 (throughput pipeline) is fully detailed across 11 task files:
scaffold, core types + sentinel decoder, config + logging, Postgres
hypertable, Redis Stream consumer, per-device LRU state, batched writer,
main wiring, observability, integration test, Dockerfile + Gitea CI.
Observability is in Phase 1 (not deferred) — lesson learned from
tcp-ingestion task 1.10.

Phases 2-4 are stub READMEs. Phase 2 (domain logic) blocks on Directus
schema decisions and lists those open questions explicitly. Phase 3
(production hardening) and Phase 4 (future) sketch the task shape.
2026-04-30 21:16:26 +02:00
parent 1a4202f4d1
commit c314ba0902
17 changed files with 1191 additions and 0 deletions
@@ -0,0 +1,76 @@
# Task 1.3 — Configuration & logging
**Phase:** 1 — Throughput pipeline
**Status:** ⬜ Not started
**Depends on:** 1.1
**Wiki refs:** `docs/wiki/entities/processor.md`
## Goal
Validate environment variables on startup with `zod`, build the pino root logger with the same conventions as `tcp-ingestion` (ISO timestamps, string level labels, instance_id base field), and fail fast with a readable error message if config is invalid.
## Deliverables
- `src/config/load.ts` exporting:
  - `loadConfig(): Config` reads `process.env`, runs the zod parse, and returns a typed `Config`. It throws on invalid input with a multi-line message that names every invalid field.
  - `Config` type derived from the zod schema.
- `src/observability/logger.ts` exporting:
  - `createLogger({ level, nodeEnv, instanceId }): Logger` returns the pino root logger with base fields `service: 'processor'` and `instance_id`. ISO timestamps via `pino.stdTimeFunctions.isoTime`. Level formatter that emits `"level":"info"`, not `"level":30`. When `nodeEnv === 'development'`, it uses the pino-pretty transport.
  - `type Logger` re-exported from `pino`.
- Wire both into `src/main.ts`: `loadConfig()` → `createLogger()` → `logger.info('processor starting')` → exit 0 (still a stub; consumer wiring lands in 1.8).
## Specification
### Environment variables
| Var | Required | Default | Notes |
|---|---|---|---|
| `NODE_ENV` | no | `production` | `development` enables pino-pretty |
| `INSTANCE_ID` | no | `processor-1` | Used in metrics + log base field |
| `LOG_LEVEL` | no | `info` | `trace` / `debug` / `info` / `warn` / `error` |
| `REDIS_URL` | yes | — | e.g. `redis://redis:6379` |
| `POSTGRES_URL` | yes | — | e.g. `postgres://user:pass@db:5432/trm` |
| `REDIS_TELEMETRY_STREAM` | no | `telemetry:t` | Must match `tcp-ingestion`'s `REDIS_TELEMETRY_STREAM` |
| `REDIS_CONSUMER_GROUP` | no | `processor` | All Processor instances join this group |
| `REDIS_CONSUMER_NAME` | no | `${INSTANCE_ID}` | Unique per instance — defaults to instance id |
| `METRICS_PORT` | no | `9090` | HTTP server port for `/metrics`, `/healthz`, `/readyz` |
| `BATCH_SIZE` | no | `100` | Max records per `XREADGROUP` call |
| `BATCH_BLOCK_MS` | no | `5000` | `BLOCK` timeout on `XREADGROUP` when stream is empty |
| `WRITE_BATCH_SIZE` | no | `50` | Max rows per Postgres `INSERT` |
| `DEVICE_STATE_LRU_CAP` | no | `10000` | Max devices kept in memory; LRU eviction beyond this |
### Validation rules
- All defaults must be expressed in the zod schema with `.default(...)` so the parsed `Config` is fully typed and never has `undefined` for an optional field.
- Numeric env vars must be coerced from string and bounded: `BATCH_SIZE` 1–10000, `BATCH_BLOCK_MS` 0–60000, `WRITE_BATCH_SIZE` 1–1000, `DEVICE_STATE_LRU_CAP` 100–1_000_000.
- `REDIS_URL` and `POSTGRES_URL` must parse as URLs with the expected protocol (`redis:` or `rediss:`; `postgres:` or `postgresql:`).
- `LOG_LEVEL` must be one of pino's accepted levels.
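The rules above could be expressed roughly as follows. This is a minimal sketch, not the final `src/config/load.ts`: `parseConfig`, `boundedInt`, and `urlWithProtocol` are hypothetical names, the `LOG_LEVEL` enum simply mirrors the table, and the env object is passed in as a parameter so the snippet stays self-contained (the real `loadConfig()` would presumably call it with `process.env`).

```typescript
import { z } from 'zod';

// Hypothetical helper: accept a string only if it parses as a URL with an allowed protocol.
const urlWithProtocol = (protocols: string[]) =>
  z.string().refine(
    (value) => {
      try {
        return protocols.includes(new URL(value).protocol);
      } catch {
        return false; // not parseable as a URL at all
      }
    },
    { message: `must be a URL using ${protocols.join(' or ')}` },
  );

// Hypothetical helper: coerce "123" to 123, enforce integer bounds, apply a default.
const boundedInt = (min: number, max: number, fallback: number) =>
  z.coerce.number().int().min(min).max(max).default(fallback);

const schema = z
  .object({
    NODE_ENV: z.string().default('production'),
    INSTANCE_ID: z.string().min(1).default('processor-1'),
    // Assumption: the accepted levels are exactly the ones listed in the table.
    LOG_LEVEL: z.enum(['trace', 'debug', 'info', 'warn', 'error']).default('info'),
    REDIS_URL: urlWithProtocol(['redis:', 'rediss:']),
    POSTGRES_URL: urlWithProtocol(['postgres:', 'postgresql:']),
    REDIS_TELEMETRY_STREAM: z.string().min(1).default('telemetry:t'),
    REDIS_CONSUMER_GROUP: z.string().min(1).default('processor'),
    REDIS_CONSUMER_NAME: z.string().min(1).optional(),
    METRICS_PORT: z.coerce.number().int().default(9090),
    BATCH_SIZE: boundedInt(1, 10_000, 100),
    BATCH_BLOCK_MS: boundedInt(0, 60_000, 5000),
    WRITE_BATCH_SIZE: boundedInt(1, 1000, 50),
    DEVICE_STATE_LRU_CAP: boundedInt(100, 1_000_000, 10_000),
  })
  // The consumer name falls back to the instance id, so it is never undefined.
  .transform((cfg) => ({
    ...cfg,
    REDIS_CONSUMER_NAME: cfg.REDIS_CONSUMER_NAME ?? cfg.INSTANCE_ID,
  }));

export type Config = z.infer<typeof schema>;

// Parse an env-like object; on failure, throw one error listing every invalid field.
export function parseConfig(env: Record<string, string | undefined>): Config {
  const result = schema.safeParse(env);
  if (!result.success) {
    const lines = result.error.issues.map(
      (issue) => `  ${issue.path.join('.')}: ${issue.message}`,
    );
    throw new Error(`Invalid configuration:\n${lines.join('\n')}`);
  }
  return result.data;
}
```

Because zod collects every issue on the object before failing, one bad deploy surfaces all misconfigured variables at once rather than one per restart.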
### Logger conventions
Match `tcp-ingestion/src/observability/logger.ts` line for line where applicable. Future-you grepping across services should see the same shape:
```ts
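import pino, { type Logger } from 'pino';

export function createLogger({ level, nodeEnv, instanceId }: { level: string; nodeEnv: string; instanceId: string }): Logger {
  // Base fields attached to every log line.
  const base = { service: 'processor', instance_id: instanceId };
  // Emit "level":"info" rather than the numeric "level":30.
  const formatters = { level: (label: string) => ({ level: label }) };
  if (nodeEnv === 'development') {
    return pino({ level, base, timestamp: pino.stdTimeFunctions.isoTime, formatters,
      transport: { target: 'pino-pretty', options: { colorize: true, translateTime: 'SYS:standard', ignore: 'pid,hostname' } } });
  }
  return pino({ level, base, timestamp: pino.stdTimeFunctions.isoTime, formatters });
}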
```
## Acceptance criteria
- [ ] `pnpm test` covers config validation: missing required vars throw with the right message; invalid URLs throw; bounded numerics throw on out-of-range values.
- [ ] Running with valid env emits a single `processor starting` info log with `service=processor` and `instance_id=processor-1` base fields.
- [ ] Running with `NODE_ENV=development` produces colorized output via pino-pretty.
- [ ] Running with `NODE_ENV=production` produces JSON output with ISO `time` and string `level`.
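One way to check the production-output criterion in a test is to capture a log line and inspect its shape. The helper below is a hypothetical sketch (the name `checkLogLine` and its return format are assumptions, not part of the task spec); it relies only on the conventions stated above: JSON output, ISO `time`, string `level`, and the `service` / `instance_id` base fields.

```typescript
// Hypothetical test helper: returns a list of problems found in one production log line.
export function checkLogLine(line: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch {
    return ['line is not valid JSON'];
  }
  if (typeof parsed !== 'object' || parsed === null) {
    return ['line is not a JSON object'];
  }
  const record = parsed as Record<string, unknown>;
  const problems: string[] = [];

  // The level formatter should emit the label, not pino's numeric level.
  if (typeof record.level !== 'string') {
    problems.push('level must be a string label such as "info"');
  }

  // An ISO timestamp survives a round trip through Date unchanged.
  const time = record.time;
  if (
    typeof time !== 'string' ||
    Number.isNaN(Date.parse(time)) ||
    new Date(time).toISOString() !== time
  ) {
    problems.push('time must be an ISO 8601 timestamp');
  }

  if (record.service !== 'processor') {
    problems.push('missing base field service=processor');
  }
  if (typeof record.instance_id !== 'string') {
    problems.push('missing base field instance_id');
  }
  return problems;
}
```

An empty array means the line conforms; a line with a numeric `level` and an epoch-millisecond `time` (pino's defaults) would report two problems.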
## Risks / open questions
- `REDIS_CONSUMER_NAME` defaulting to `INSTANCE_ID` means `INSTANCE_ID` must be unique per instance for safe consumer-group operation. Document this in `.env.example` so operators don't accidentally run two instances with the same `INSTANCE_ID`.
## Done
(Fill in once complete: commit SHA, brief notes.)