# Task 1.3 — Configuration & logging

**Phase:** 1 — Throughput pipeline
**Status:** ⬜ Not started
**Depends on:** 1.1
**Wiki refs:** `docs/wiki/entities/processor.md`

## Goal

Validate environment variables on startup with `zod`, build the pino root logger with the same conventions as `tcp-ingestion` (ISO timestamps, string level labels, `instance_id` base field), and fail fast with a readable error message if the config is invalid.

## Deliverables

- `src/config/load.ts` exporting:
  - `loadConfig(): Config` — reads `process.env`, runs zod parse, returns a typed `Config`. Throws on invalid input with a multi-line message that names every invalid field.
  - `Config` type derived from the zod schema.
- `src/observability/logger.ts` exporting:
  - `createLogger({ level, nodeEnv, instanceId }): Logger` — pino root logger with base fields `service: 'processor'`, `instance_id`. ISO timestamps via `pino.stdTimeFunctions.isoTime`. Level formatter that emits `"level":"info"` not `"level":30`. In `nodeEnv === 'development'`, use the pino-pretty transport.
  - `type Logger` re-exported from `pino`.
- Wire both into `src/main.ts`: `loadConfig()` → `createLogger()` → `logger.info('processor starting')` → exit 0 (still a stub; consumer wiring lands in 1.8).

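
The startup sequence in the last deliverable, together with the multi-line error message, can be sketched without pulling in zod or pino. Everything named here is an assumption for illustration: `start`, `Deps`, `formatIssues`, and the `Config` field names are hypothetical, and the `Issue` shape only mirrors the `path` and `message` fields that a zod issue carries. The real `loadConfig` and `createLogger` would be passed in from the two modules above.

```typescript
// Hypothetical, dependency-free sketch of the main.ts flow; all names are illustrative.
type Issue = { path: string[]; message: string };
type Config = { logLevel: string; nodeEnv: string; instanceId: string };
type MinimalLogger = { info: (msg: string) => void };
type Deps = {
  loadConfig: () => Config; // expected to throw with a formatted message
  createLogger: (opts: { level: string; nodeEnv: string; instanceId: string }) => MinimalLogger;
  reportFatal: (message: string) => void; // e.g. a write to stderr
};

// One line per invalid field, so the operator sees every problem at once.
function formatIssues(issues: Issue[]): string {
  const lines = issues.map((i) => `  - ${i.path.join('.') || '(root)'}: ${i.message}`);
  return ['Invalid configuration:', ...lines].join('\n');
}

// Returns the exit code instead of calling process.exit, which keeps it testable.
function start(deps: Deps): number {
  let config: Config;
  try {
    config = deps.loadConfig();
  } catch (err) {
    // No logger exists yet, so the readable message is reported directly.
    deps.reportFatal(err instanceof Error ? err.message : String(err));
    return 1;
  }
  const logger = deps.createLogger({
    level: config.logLevel,
    nodeEnv: config.nodeEnv,
    instanceId: config.instanceId,
  });
  logger.info('processor starting');
  return 0; // still a stub; consumer wiring lands in task 1.8
}
```

Returning the exit code rather than exiting inside `start` is a design choice of this sketch: it lets a unit test drive both the success and the fail-fast path with stubbed dependencies.
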
## Specification

### Environment variables

| Var | Required | Default | Notes |
|---|---|---|---|
| `NODE_ENV` | no | `production` | `development` enables pino-pretty |
| `INSTANCE_ID` | no | `processor-1` | Used in metrics + log base field |
| `LOG_LEVEL` | no | `info` | `trace` / `debug` / `info` / `warn` / `error` |
| `REDIS_URL` | yes | — | e.g. `redis://redis:6379` |
| `POSTGRES_URL` | yes | — | e.g. `postgres://user:pass@db:5432/trm` |
| `REDIS_TELEMETRY_STREAM` | no | `telemetry:t` | Must match `tcp-ingestion`'s `REDIS_TELEMETRY_STREAM` |
| `REDIS_CONSUMER_GROUP` | no | `processor` | All Processor instances join this group |
| `REDIS_CONSUMER_NAME` | no | `${INSTANCE_ID}` | Unique per instance — defaults to instance id |
| `METRICS_PORT` | no | `9090` | HTTP server port for `/metrics`, `/healthz`, `/readyz` |
| `BATCH_SIZE` | no | `100` | Max records per `XREADGROUP` call |
| `BATCH_BLOCK_MS` | no | `5000` | `BLOCK` timeout on `XREADGROUP` when stream is empty |
| `WRITE_BATCH_SIZE` | no | `50` | Max rows per Postgres `INSERT` |
| `DEVICE_STATE_LRU_CAP` | no | `10000` | Max devices kept in memory; LRU eviction beyond this |

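
One row in the table differs from the rest: the default for `REDIS_CONSUMER_NAME` depends on another variable, so it cannot be a constant attached to the field alone. A minimal sketch of that default chain follows, assuming it is resolved after the individual fields (in a zod schema this could plausibly be done in a `.transform(...)` on the parsed object). `resolveIdentity` and its return shape are hypothetical names, not part of the spec.

```typescript
// Hypothetical sketch: resolves the identity-related defaults from the table above.
type Env = Record<string, string | undefined>;

function resolveIdentity(env: Env): {
  instanceId: string;
  consumerGroup: string;
  consumerName: string;
} {
  const instanceId = env['INSTANCE_ID'] ?? 'processor-1';
  return {
    instanceId,
    consumerGroup: env['REDIS_CONSUMER_GROUP'] ?? 'processor',
    // Falls back to the resolved instance id rather than the raw env value,
    // so the chain INSTANCE_ID -> REDIS_CONSUMER_NAME holds even when both are unset.
    consumerName: env['REDIS_CONSUMER_NAME'] ?? instanceId,
  };
}
```

Note that `??` only treats `undefined` as missing; whether an empty string should also fall back to the default is left open here.
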
### Validation rules

- All defaults must be expressed in the zod schema with `.default(...)` so the parsed `Config` is fully typed and never has `undefined` for an optional field.
- Numeric env vars must be coerced from string and bounded: `BATCH_SIZE` 1–10000, `BATCH_BLOCK_MS` 0–60000, `WRITE_BATCH_SIZE` 1–1000, `DEVICE_STATE_LRU_CAP` 100–1_000_000.
- `REDIS_URL` and `POSTGRES_URL` must parse as URLs with the expected protocol (`redis:` or `rediss:`; `postgres:` or `postgresql:`).
- `LOG_LEVEL` must be one of pino's accepted levels.

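
The numeric bounds and protocol checks above reduce to two small functions. The sketch below is dependency-free and is not the zod schema itself: `parseBoundedInt` and `hasProtocol` are hypothetical helpers, shown only to make the intended behaviour concrete. The second could plausibly back a zod `.refine(...)` on the URL fields.

```typescript
// Hypothetical helpers; in the real module these checks would live in the zod schema.

// Coerces a string env value to an integer and enforces inclusive bounds.
function parseBoundedInt(
  name: string,
  raw: string | undefined,
  fallback: number,
  min: number,
  max: number,
): number {
  if (raw === undefined) return fallback;
  // Number('') and Number('  ') are 0, which would slip through a 0-based bound,
  // so blank input is rejected explicitly.
  const value = raw.trim() === '' ? NaN : Number(raw);
  if (!Number.isInteger(value) || value < min || value > max) {
    throw new Error(`${name} must be an integer between ${min} and ${max}, got "${raw}"`);
  }
  return value;
}

// True only if `raw` parses as a URL whose protocol (with trailing colon) is allowed.
function hasProtocol(raw: string, allowed: string[]): boolean {
  try {
    return allowed.includes(new URL(raw).protocol);
  } catch {
    return false; // not parseable as a URL at all
  }
}
```

For example, `hasProtocol(value, ['redis:', 'rediss:'])` covers the `REDIS_URL` rule and `hasProtocol(value, ['postgres:', 'postgresql:'])` covers `POSTGRES_URL`.
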
### Logger conventions

Match `tcp-ingestion/src/observability/logger.ts` line for line where applicable. Future-you grepping across services should see the same shape:

```ts
// Excerpt from the body of createLogger; `base` holds the base fields
// ({ service: 'processor', instance_id: instanceId }).
const formatters = { level: (label: string) => ({ level: label }) };

if (nodeEnv === 'development') {
  return pino({ level, base, timestamp: pino.stdTimeFunctions.isoTime, formatters,
    transport: { target: 'pino-pretty', options: { colorize: true, translateTime: 'SYS:standard', ignore: 'pid,hostname' } } });
}
return pino({ level, base, timestamp: pino.stdTimeFunctions.isoTime, formatters });
```

## Acceptance criteria

- [ ] `pnpm test` covers config validation: missing required vars throw with the right message; invalid URLs throw; bounded numerics throw on out-of-range values.
- [ ] Running with valid env emits a single `processor starting` info log with `service=processor` and `instance_id=processor-1` base fields.
- [ ] Running with `NODE_ENV=development` produces colorized output via pino-pretty.
- [ ] Running with `NODE_ENV=production` produces JSON output with ISO `time` and string `level`.

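
The production-output criterion can be checked mechanically. The following is a sketch of a test helper, assuming one JSON object per line on stdout; `isConformingLogLine` is a hypothetical name, and the level list uses pino's default level labels.

```typescript
// Hypothetical test helper: checks one line of production output against the logger conventions.
const LEVEL_LABELS = ['trace', 'debug', 'info', 'warn', 'error', 'fatal'];

function isConformingLogLine(line: string): boolean {
  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch {
    return false; // pino-pretty output, or anything else that is not JSON
  }
  if (typeof parsed !== 'object' || parsed === null) return false;
  const entry = parsed as Record<string, unknown>;

  // String level label, not pino's numeric default such as 30.
  const level = entry['level'];
  if (typeof level !== 'string' || !LEVEL_LABELS.includes(level)) return false;

  // ISO timestamp: must round-trip through Date unchanged.
  const time = entry['time'];
  if (typeof time !== 'string') return false;
  const millis = Date.parse(time);
  if (Number.isNaN(millis) || new Date(millis).toISOString() !== time) return false;

  // Base fields set by createLogger.
  const instanceId = entry['instance_id'];
  return entry['service'] === 'processor' && typeof instanceId === 'string' && instanceId.length > 0;
}
```

A test could spawn the built entry point with `NODE_ENV=production`, split stdout on newlines, and assert that every non-empty line passes this check.
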
## Risks / open questions

- `REDIS_CONSUMER_NAME` defaulting to `INSTANCE_ID` means `INSTANCE_ID` must be unique per instance for safe consumer-group operation. Document this in `.env.example` so operators don't accidentally run two instances with the same `INSTANCE_ID`.

## Done

(Fill in once complete: commit SHA, brief notes.)