processor/.planning/phase-1-throughput/03-config-and-logging.md
julian c314ba0902 Add planning documents for Phase 1 (throughput pipeline) and stub Phases 2-4
ROADMAP.md establishes status legend, architectural anchors pointing at the
wiki, and seven non-negotiable design rules — most importantly the
core/domain boundary that protects Phase 1 from Phase 2 churn, the
schema-authority split (positions hypertable owned here; everything else
owned by Directus), and idempotent-writes via (device_id, ts) ON CONFLICT.

Phase 1 (throughput pipeline) is fully detailed across 11 task files:
scaffold, core types + sentinel decoder, config + logging, Postgres
hypertable, Redis Stream consumer, per-device LRU state, batched writer,
main wiring, observability, integration test, Dockerfile + Gitea CI.
Observability is in Phase 1 (not deferred) — lesson learned from
tcp-ingestion task 1.10.

Phases 2-4 are stub READMEs. Phase 2 (domain logic) blocks on Directus
schema decisions and lists those open questions explicitly. Phase 3
(production hardening) and Phase 4 (future) sketch the task shape.
2026-04-30 21:16:59 +02:00

Task 1.3 — Configuration & logging

Phase: 1 (Throughput pipeline)
Status: Not started
Depends on: 1.1
Wiki refs: docs/wiki/entities/processor.md

Goal

Validate environment variables on startup with zod, build the pino root logger with the same conventions as tcp-ingestion (ISO timestamps, string level labels, instance_id base field), and fail fast with a readable error message if config is invalid.

Deliverables

  • src/config/load.ts exporting:
    • loadConfig(): Config — reads process.env, runs zod parse, returns a typed Config. Throws on invalid input with a multi-line message that names every invalid field.
    • Config type derived from the zod schema.
  • src/observability/logger.ts exporting:
    • createLogger({ level, nodeEnv, instanceId }): Logger — pino root logger with base fields service: 'processor', instance_id. ISO timestamps via pino.stdTimeFunctions.isoTime. Level formatter that emits "level":"info" not "level":30. In nodeEnv === 'development', use the pino-pretty transport.
    • type Logger re-exported from pino.
  • Wire both into src/main.ts: loadConfig() → createLogger() → logger.info('processor starting') → exit 0 (still a stub; consumer wiring lands in 1.8).

Specification

Environment variables

| Var | Required | Default | Notes |
| --- | --- | --- | --- |
| NODE_ENV | no | production | development enables pino-pretty |
| INSTANCE_ID | no | processor-1 | Used in metrics + log base field |
| LOG_LEVEL | no | info | trace / debug / info / warn / error |
| REDIS_URL | yes | (none) | e.g. redis://redis:6379 |
| POSTGRES_URL | yes | (none) | e.g. postgres://user:pass@db:5432/trm |
| REDIS_TELEMETRY_STREAM | no | telemetry:t | Must match tcp-ingestion's REDIS_TELEMETRY_STREAM |
| REDIS_CONSUMER_GROUP | no | processor | All Processor instances join this group |
| REDIS_CONSUMER_NAME | no | ${INSTANCE_ID} | Unique per instance; defaults to the instance ID |
| METRICS_PORT | no | 9090 | HTTP server port for /metrics, /healthz, /readyz |
| BATCH_SIZE | no | 100 | Max records per XREADGROUP call |
| BATCH_BLOCK_MS | no | 5000 | BLOCK timeout on XREADGROUP when stream is empty |
| WRITE_BATCH_SIZE | no | 50 | Max rows per Postgres INSERT |
| DEVICE_STATE_LRU_CAP | no | 10000 | Max devices kept in memory; LRU eviction beyond this |

Validation rules

  • All defaults must be expressed in the zod schema with .default(...) so the parsed Config is fully typed and never has undefined for an optional field.
  • Numeric env vars must be coerced from string and bounded: BATCH_SIZE 1–10000, BATCH_BLOCK_MS 0–60000, WRITE_BATCH_SIZE 1–1000, DEVICE_STATE_LRU_CAP 100–1_000_000.
  • REDIS_URL and POSTGRES_URL must parse as URLs with the expected protocol (redis: or rediss:; postgres: or postgresql:).
  • LOG_LEVEL must be one of pino's accepted levels.
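
The rules above could be expressed roughly as follows. This is a minimal sketch, not the final src/config/load.ts: the helper names urlWithProtocol and boundedInt, the optional env parameter on loadConfig (added here only so the function can be exercised without touching process.env), the METRICS_PORT bounds, and treating NODE_ENV as a free-form string are all assumptions made for illustration.

```typescript
import { z } from 'zod';

// Hypothetical helper: a string that parses as a URL with an allowed protocol.
const urlWithProtocol = (protocols: string[]) =>
  z.string().refine(
    (value) => {
      try {
        return protocols.includes(new URL(value).protocol);
      } catch {
        return false; // not parseable as a URL at all
      }
    },
    { message: `must be a URL using ${protocols.join(' or ')}` },
  );

// Hypothetical helper: integer coerced from a string, bounded, with a default.
const boundedInt = (min: number, max: number, fallback: number) =>
  z.coerce.number().int().min(min).max(max).default(fallback);

export const configSchema = z
  .object({
    NODE_ENV: z.string().default('production'), // allowed values not specified; assumed free-form
    INSTANCE_ID: z.string().min(1).default('processor-1'),
    // Levels as listed in the environment variable table.
    LOG_LEVEL: z.enum(['trace', 'debug', 'info', 'warn', 'error']).default('info'),
    REDIS_URL: urlWithProtocol(['redis:', 'rediss:']),
    POSTGRES_URL: urlWithProtocol(['postgres:', 'postgresql:']),
    REDIS_TELEMETRY_STREAM: z.string().min(1).default('telemetry:t'),
    REDIS_CONSUMER_GROUP: z.string().min(1).default('processor'),
    REDIS_CONSUMER_NAME: z.string().min(1).optional(), // filled in by the transform below
    METRICS_PORT: boundedInt(1, 65535, 9090), // bounds are an assumption
    BATCH_SIZE: boundedInt(1, 10000, 100),
    BATCH_BLOCK_MS: boundedInt(0, 60000, 5000),
    WRITE_BATCH_SIZE: boundedInt(1, 1000, 50),
    DEVICE_STATE_LRU_CAP: boundedInt(100, 1_000_000, 10000),
  })
  // Default REDIS_CONSUMER_NAME to INSTANCE_ID so the parsed value is never undefined.
  .transform((env) => ({
    ...env,
    REDIS_CONSUMER_NAME: env.REDIS_CONSUMER_NAME ?? env.INSTANCE_ID,
  }));

export type Config = z.infer<typeof configSchema>;

export function loadConfig(
  env: Record<string, string | undefined> = process.env,
): Config {
  const result = configSchema.safeParse(env);
  if (!result.success) {
    // One line per issue, so the message names every invalid field.
    const lines = result.error.issues.map(
      (issue) => `  ${issue.path.join('.')}: ${issue.message}`,
    );
    throw new Error(`Invalid configuration:\n${lines.join('\n')}`);
  }
  return result.data;
}
```

Using safeParse and collecting all issues before throwing is what allows a single startup failure to report every bad variable at once, rather than one per restart.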

Logger conventions

Match tcp-ingestion/src/observability/logger.ts line for line where applicable. Future-you grepping across services should see the same shape:

const formatters = { level: (label: string) => ({ level: label }) };

if (nodeEnv === 'development') {
  return pino({ level, base, timestamp: pino.stdTimeFunctions.isoTime, formatters,
    transport: { target: 'pino-pretty', options: { colorize: true, translateTime: 'SYS:standard', ignore: 'pid,hostname' } } });
}
return pino({ level, base, timestamp: pino.stdTimeFunctions.isoTime, formatters });

Acceptance criteria

  • pnpm test covers config validation: missing required vars throw with the right message; invalid URLs throw; bounded numerics throw on out-of-range values.
  • Running with valid env emits a single processor starting info log with service=processor and instance_id=processor-1 base fields.
  • Running with NODE_ENV=development produces colorized output via pino-pretty.
  • Running with NODE_ENV=production produces JSON output with ISO time and string level.
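
One possible way to check the production-output criterion in a unit test is to hand pino an in-memory destination and inspect the emitted line. The sketch below is an assumption about how such a check might look, not the project's actual test: captureStartupLog and the sink object are hypothetical names, and the options are inlined rather than going through createLogger.

```typescript
import pino from 'pino';

// Hypothetical helper: log one startup line into memory and return it parsed.
export function captureStartupLog(instanceId: string): Record<string, unknown> {
  const lines: string[] = [];
  // pino accepts any destination object exposing write(msg: string).
  const sink = {
    write: (line: string) => {
      lines.push(line);
    },
  };

  const logger = pino(
    {
      level: 'info',
      // Setting base replaces pino's default pid/hostname fields.
      base: { service: 'processor', instance_id: instanceId },
      timestamp: pino.stdTimeFunctions.isoTime,
      // Emit "level":"info" rather than the numeric "level":30.
      formatters: { level: (label: string) => ({ level: label }) },
    },
    sink,
  );

  logger.info('processor starting');
  return JSON.parse(lines[0]) as Record<string, unknown>;
}
```

A test could then assert on the returned object: level is the string 'info', time is an ISO-formatted string, and service and instance_id carry the expected base values.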

Risks / open questions

  • REDIS_CONSUMER_NAME defaulting to INSTANCE_ID means INSTANCE_ID must be unique per instance for safe consumer-group operation. Document this in .env.example so operators don't accidentally run two instances with the same INSTANCE_ID.

Done

(Fill in once complete: commit SHA, brief notes.)