tcp-ingestion/.planning/phase-1-telemetry/03-config-and-logging.md
julian c8a5f4cd68 Add Phase 1 and Phase 2 planning documents
ROADMAP plus granular task files per phase. Phase 1 (12 tasks + 1.13
device authority) covers Codec 8/8E/16 telemetry ingestion; Phase 2
(6 tasks) covers Codec 12/14 outbound commands; Phase 3 enumerates
deferred items.
2026-04-30 15:50:49 +02:00


# Task 1.3 — Configuration & logging
**Phase:** 1 — Inbound telemetry
**Status:** ⬜ Not started
**Depends on:** 1.1
**Wiki refs:** `docs/wiki/sources/gps-tracking-architecture.md` § Deployment topology, § Observability
## Goal
Provide a single source of truth for runtime configuration (env-var-driven, validated at startup, fail-fast on misconfiguration) and a structured JSON logger.
## Deliverables
- `src/config/load.ts`:
  - Exports `loadConfig(): Config` that parses `process.env` through a zod schema, returning a typed `Config` object. Throws with a clear error message on missing/malformed values.
  - All env vars except `REDIS_URL` are optional in dev (with sensible defaults) and required in production-like deployments. Use `NODE_ENV` to gate.
- `src/observability/logger.ts`:
  - Exports a configured `pino` logger. JSON output by default; pretty-printed via `pino-pretty` only when `NODE_ENV === 'development'` (lazy-loaded so it's not in the prod bundle).
  - Log level controlled by the `LOG_LEVEL` env var (default `info` in production, `debug` in development).
  - Adds `service: 'tcp-ingestion'` and `instance_id` (from the `INSTANCE_ID` env var, or a short UUID generated at startup) to every log line.
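
The environment-dependent `LOG_LEVEL` default described above cannot be expressed as a single static schema default. A minimal sketch of one way to resolve it follows. The helper name `resolveLogLevel` is hypothetical, and treating `test` the same as `production` is an assumption rather than something this task specifies.

```typescript
// Hypothetical helper: the name and where it is called from are assumptions.
const LOG_LEVELS = ['fatal', 'error', 'warn', 'info', 'debug', 'trace'] as const;
type LogLevel = (typeof LOG_LEVELS)[number];

function isLogLevel(value: string): value is LogLevel {
  return LOG_LEVELS.some((level) => level === value);
}

function resolveLogLevel(env: Record<string, string | undefined>): LogLevel {
  const explicit = env.LOG_LEVEL;
  if (explicit !== undefined) {
    // An explicit value always wins, but it must be a known level.
    if (!isLogLevel(explicit)) {
      throw new Error(`LOG_LEVEL: expected one of ${LOG_LEVELS.join(', ')}, got "${explicit}"`);
    }
    return explicit;
  }
  // NODE_ENV itself defaults to 'development', so an unset NODE_ENV counts as dev.
  const nodeEnv = env.NODE_ENV ?? 'development';
  return nodeEnv === 'development' ? 'debug' : 'info';
}
```

With this shape, `resolveLogLevel(process.env)` could feed the logger's `level` option, provided the schema field is left without its own default.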
## Specification
### Config schema (zod)
```ts
import { randomUUID } from 'node:crypto';
import { z } from 'zod';

const ConfigSchema = z.object({
  NODE_ENV: z.enum(['development', 'test', 'production']).default('development'),
  // Generated per process start when not set explicitly.
  INSTANCE_ID: z.string().min(1).default(() => `local-${randomUUID().slice(0, 8)}`),
  LOG_LEVEL: z.enum(['fatal', 'error', 'warn', 'info', 'debug', 'trace']).default('info'),

  // Vendor port bindings; extend as adapters are added.
  TELTONIKA_PORT: z.coerce.number().int().min(1).max(65535).default(5027),

  // Redis (REDIS_URL has no default, so it is required in every environment)
  REDIS_URL: z.string().url(),
  REDIS_TELEMETRY_STREAM: z.string().min(1).default('telemetry:teltonika'),
  REDIS_STREAM_MAXLEN: z.coerce.number().int().min(0).default(1_000_000), // approximate cap

  // Observability
  METRICS_PORT: z.coerce.number().int().min(0).max(65535).default(9090),

  // Phase 2 (planned, not used in Phase 1)
  // COMMANDS_OUTBOUND_STREAM_PREFIX: z.string().default('commands:outbound'),
});

export type Config = z.infer<typeof ConfigSchema>;
```
The Phase 2 fields are commented out so they do not become runtime requirements before Phase 2 ships. Add them when Phase 2 is in flight.
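
The Deliverables ask for env vars to be required in production-like deployments, while the schema gives most of them defaults. One possible way to layer that gate on top of the schema is a pre-check that runs before parsing. This is only a sketch: the helper name `missingInProduction` and the list of keys treated as production-required are assumptions, not decisions made in this task.

```typescript
// Assumption: which keys must be explicit in production is a deployment decision.
// This list is illustrative only.
const REQUIRED_IN_PRODUCTION = ['INSTANCE_ID', 'REDIS_URL', 'TELTONIKA_PORT'] as const;

// Returns the production-required keys that are unset or blank.
// Outside production the gate is a no-op and schema defaults apply.
function missingInProduction(env: Record<string, string | undefined>): string[] {
  if (env.NODE_ENV !== 'production') return [];
  return REQUIRED_IN_PRODUCTION.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === '';
  });
}
```

`loadConfig()` could call this first and throw if the returned list is non-empty, so that production never silently runs on a dev default.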
### Logger conventions
- Always emit JSON in production (pino default).
- Always include: `time`, `level`, `service`, `instance_id`, `msg`.
- Adapter log lines include `imei` when known; framing log lines include `codec_id` when applicable; CRC failures include `expected_crc`, `computed_crc`, `frame_length`.
- Use `logger.child({ imei })` to scope a logger per session, so subsequent log lines auto-include the IMEI.
- Never log raw frame payloads at info or above — they're huge and may contain sensitive telemetry. At debug, truncate to first/last 16 bytes.
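
The debug-level truncation rule above can be sketched as a small pure helper. The name `summarizeFrame` and the output field names `head` and `tail` are hypothetical; only `frame_length` and the 16-byte window come from the conventions above.

```typescript
// Hypothetical helper for debug-level frame logging.
const EDGE_BYTES = 16;

function toHex(bytes: Uint8Array): string {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('');
}

// Keeps at most the first and last EDGE_BYTES of a frame as hex strings.
// Frames short enough to fit in both windows are returned whole in `head`.
function summarizeFrame(frame: Uint8Array): { frame_length: number; head: string; tail?: string } {
  if (frame.length <= EDGE_BYTES * 2) {
    return { frame_length: frame.length, head: toHex(frame) };
  }
  return {
    frame_length: frame.length,
    head: toHex(frame.subarray(0, EDGE_BYTES)),
    tail: toHex(frame.subarray(frame.length - EDGE_BYTES)),
  };
}
```

A session logger could then call `logger.debug(summarizeFrame(frame), 'frame received')`, which keeps the raw payload out of the log line.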
### Failure mode
`loadConfig()` is called once in `main.ts`. If it throws, the process exits with a non-zero code and a single human-readable line listing the missing/invalid keys. **Do not fall back to silent defaults for required keys** — the operational habit we want is "missing config = process refuses to start," not "process starts and behaves weirdly later."
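
A minimal sketch of how the single human-readable line could be built. It assumes validation failures arrive as objects with `path` and `message` fields, which is the shape zod issues have; the `ConfigIssue` interface and the `formatConfigError` name are hypothetical and declared locally so the sketch has no dependencies.

```typescript
// Structural subset of a validation issue; an assumption modelled on zod's issue shape.
interface ConfigIssue {
  path: ReadonlyArray<string | number>;
  message: string;
}

// Collapses all issues into one line that names every offending key.
function formatConfigError(issues: ReadonlyArray<ConfigIssue>): string {
  const parts = issues.map((issue) => {
    const key = issue.path.length > 0 ? issue.path.join('.') : '(root)';
    return `${key} (${issue.message})`;
  });
  return `Invalid configuration: ${parts.join('; ')}`;
}

// Intended use in main.ts (shown as a comment, not executed here):
//   try { config = loadConfig(); }
//   catch (err) { console.error(String(err)); process.exit(1); }
```

Emitting one line rather than a stack trace keeps the failure easy to spot in container logs, and the non-zero exit code lets the orchestrator see that startup failed.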
## Acceptance criteria
- [ ] Calling `loadConfig()` with `REDIS_URL` unset throws and the error names `REDIS_URL` specifically.
- [ ] Calling `loadConfig()` in dev with `NODE_ENV=development` and only `REDIS_URL` set returns a fully valid `Config` with sensible defaults for everything else.
- [ ] The logger emits JSON when `NODE_ENV=production` and pretty-printed text when `NODE_ENV=development`.
- [ ] `logger.child({ imei: '...' })` produces lines with `imei` included.
## Risks / open questions
- `INSTANCE_ID` defaults to a `local-`-prefixed short random UUID generated at each process start. That is fine for dev, but in production K8s/compose deployments, set it explicitly to a stable identifier (pod name, hostname, etc.). The Phase 2 connection registry depends on `INSTANCE_ID` being stable across the lifetime of the process; document this in the deployment notes (task 1.11).
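- Log volume could be high under load. Pino is fast (~100k+ lines/sec on modern hardware), but consider a higher `LOG_LEVEL` or sampling for the busiest events (e.g. per-frame debug logs). Pino's `useOnlyCustomLevels` option only restricts which levels exist and does not by itself reduce volume.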
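
Sampling for the busiest events could look like the sketch below. It is a plain counter-based sampler, not a pino feature, and the name `makeSampler` is hypothetical.

```typescript
// Hypothetical helper: returns a predicate that is true on the first call
// and then on every `every`-th call after it.
function makeSampler(every: number): () => boolean {
  if (!Number.isInteger(every) || every < 1) {
    throw new RangeError('every must be a positive integer');
  }
  let count = 0;
  return () => {
    const hit = count === 0;
    count = (count + 1) % every; // wraps back to 0 after `every` calls
    return hit;
  };
}

// Intended use (shown as a comment, not executed here):
//   const shouldLogFrame = makeSampler(100);
//   if (shouldLogFrame()) logger.debug({ codec_id }, 'frame decoded');
```

Keeping one sampler per event type avoids a busy event starving a rare one of log lines.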
## Done
(Fill in once complete.)