Task 1.3 — Configuration & logging
Phase: 1 — Inbound telemetry
Status: 🟩 Done — landed in commit 1e9219d
Depends on: 1.1
Wiki refs: docs/wiki/sources/gps-tracking-architecture.md § Deployment topology, § Observability
Goal
Provide a single source of truth for runtime configuration (env-var-driven, validated at startup, fail-fast on misconfiguration) and a structured JSON logger.
Deliverables
- src/config/load.ts:
  - Exports loadConfig(): Config that parses process.env through a zod schema, returning a typed Config object. Throws with a clear error message on missing/malformed values.
  - All env vars optional in dev (with sensible defaults) and required in production-like deployments. Use NODE_ENV to gate.
- src/observability/logger.ts:
  - Exports a configured pino logger. JSON output by default; pretty-printed via pino-pretty only when NODE_ENV === 'development' (lazy-loaded so it's not in the prod bundle).
  - Log level controlled by LOG_LEVEL env var (default info in production, debug in development).
  - Adds a service: 'tcp-ingestion' and instance_id (from INSTANCE_ID env var or a generated short UUID at startup) to every log line.
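The logger deliverable could look roughly like the sketch below. It is a minimal illustration, not the landed implementation: the createLogger name, the env parameter, and the optional sink parameter (used only to capture output in tests) are assumptions. The pino options used (level, base, transport) are part of pino's documented API.

```typescript
import { randomUUID } from 'node:crypto';
import { pino } from 'pino';

// Minimal sink shape. The `sink` parameter is an assumption added so the
// sketch can be exercised without writing to stdout.
type LineSink = { write(line: string): void };

export function createLogger(
  env: Record<string, string | undefined> = process.env,
  sink?: LineSink,
) {
  const isDev = env.NODE_ENV === 'development';
  const options = {
    // Default level depends on NODE_ENV unless LOG_LEVEL is set.
    level: env.LOG_LEVEL ?? (isDev ? 'debug' : 'info'),
    // Replaces pino's default base fields (pid, hostname) with the two
    // fields the spec requires on every line.
    base: {
      service: 'tcp-ingestion',
      instance_id: env.INSTANCE_ID ?? randomUUID().slice(0, 8),
    },
  };

  if (sink) {
    return pino(options, sink);
  }
  // The transport target is only resolved when a transport is configured,
  // so pino-pretty is never loaded outside development.
  return isDev
    ? pino({ ...options, transport: { target: 'pino-pretty' } })
    : pino(options);
}
```

A session-scoped logger is then created with createLogger().child({ imei }), and every line it emits carries service, instance_id, and imei alongside pino's own time, level, and msg.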
Specification
Config schema (zod)
import { randomUUID } from 'node:crypto';
import { z } from 'zod';

const ConfigSchema = z.object({
  NODE_ENV: z.enum(['development', 'test', 'production']).default('development'),
  INSTANCE_ID: z.string().min(1).default(() => `local-${randomUUID().slice(0, 8)}`),
  LOG_LEVEL: z.enum(['fatal', 'error', 'warn', 'info', 'debug', 'trace']).default('info'),

  // Vendor port bindings: extend as adapters are added.
  TELTONIKA_PORT: z.coerce.number().int().min(1).max(65535).default(5027),

  // Redis (REDIS_URL has no default, so it is required in every environment)
  REDIS_URL: z.string().url(),
  REDIS_TELEMETRY_STREAM: z.string().min(1).default('telemetry:teltonika'),
  REDIS_STREAM_MAXLEN: z.coerce.number().int().min(0).default(1_000_000), // approximate cap

  // Observability
  METRICS_PORT: z.coerce.number().int().min(0).max(65535).default(9090),

  // Phase 2 (planned, not used in Phase 1)
  // COMMANDS_OUTBOUND_STREAM_PREFIX: z.string().default('commands:outbound'),
});

export type Config = z.infer<typeof ConfigSchema>;
The Phase 2 fields are commented out so they do not become runtime requirements before Phase 2 ships. Add them when Phase 2 is in flight.
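One way loadConfig() might wrap the schema is sketched below. The schema is an abbreviated copy so the block runs on its own. The env parameter and the exact error wording are assumptions for illustration; the spec only fixes the loadConfig(): Config signature and the requirement that the error names the offending keys. safeParse and error.issues are standard zod APIs.

```typescript
import { randomUUID } from 'node:crypto';
import { z } from 'zod';

// Abbreviated copy of the schema above, enough to show the flow.
const ConfigSchema = z.object({
  NODE_ENV: z.enum(['development', 'test', 'production']).default('development'),
  INSTANCE_ID: z.string().min(1).default(() => `local-${randomUUID().slice(0, 8)}`),
  LOG_LEVEL: z.enum(['fatal', 'error', 'warn', 'info', 'debug', 'trace']).default('info'),
  TELTONIKA_PORT: z.coerce.number().int().min(1).max(65535).default(5027),
  REDIS_URL: z.string().url(),
});

export type Config = z.infer<typeof ConfigSchema>;

// `env` is injectable here (an assumption) so tests do not have to mutate
// process.env; production code would call loadConfig() with no arguments.
export function loadConfig(
  env: Record<string, string | undefined> = process.env,
): Config {
  const result = ConfigSchema.safeParse(env);
  if (!result.success) {
    // One "KEY: reason" entry per failing key, joined into a single line.
    const problems = result.error.issues.map(
      (issue) => `${issue.path.join('.')}: ${issue.message}`,
    );
    throw new Error(`Invalid configuration: ${problems.join('; ')}`);
  }
  return result.data;
}
```

Because every issue carries the path of the key that failed, an unset REDIS_URL surfaces by name in the thrown message rather than as a generic validation failure.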
Logger conventions
- Always emit JSON in production (pino default).
- Always include: time, level, service, instance_id, msg.
- Adapter log lines include imei when known; framing log lines include codec_id when applicable; CRC failures include expected_crc, computed_crc, frame_length.
- Use logger.child({ imei }) to scope a logger per session, so subsequent log lines auto-include the IMEI.
- Never log raw frame payloads at info or above: they're huge and may contain sensitive telemetry. At debug, truncate to first/last 16 bytes.
Failure mode
loadConfig() is called once in main.ts. If it throws, the process exits with a non-zero code and a single human-readable line listing the missing/invalid keys. Do not fall back to silent defaults for required keys — the operational habit we want is "missing config = process refuses to start," not "process starts and behaves weirdly later."
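The fail-fast behaviour in main.ts might be wired up as sketched below. The loadOrExit name and the injectable exit and report callbacks are assumptions made so the behaviour can be tested; in main.ts the defaults (process.exit and console.error) would apply.

```typescript
// Hypothetical wrapper around a config loader such as loadConfig().
export function loadOrExit<C>(
  load: () => C,
  exit: (code: number) => void = (code) => process.exit(code),
  report: (line: string) => void = (line) => console.error(line),
): C | undefined {
  try {
    return load();
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    // Collapse any line breaks so operators see exactly one line.
    report(`config error: ${reason.replace(/\s*\n\s*/g, '; ')}`);
    exit(1);
    // Only reached when `exit` is a test double that returns.
    return undefined;
  }
}
```

Nothing in this path substitutes a default for a missing required key: the loader either returns a complete config or the process stops with exit code 1.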
Acceptance criteria
- Calling loadConfig() with REDIS_URL unset throws and the error names REDIS_URL specifically.
- Calling loadConfig() in dev with NODE_ENV=development and only REDIS_URL set returns a fully valid Config with sensible defaults for everything else.
- The logger emits JSON when NODE_ENV=production and pretty-printed text when NODE_ENV=development.
- logger.child({ imei: '...' }) produces lines with imei included.
Risks / open questions
- INSTANCE_ID default is a random UUID per process start, which is fine for dev, but in production K8s/compose deployments, set it explicitly to a stable identifier (pod name, hostname, etc.). The Phase 2 connection registry depends on INSTANCE_ID being stable across the lifetime of the process; document this in the deployment notes (task 1.11).
- Log volume could be high under load. Pino is fast (~100k+ lines/sec on modern hardware) but consider useOnlyCustomLevels or sampling for the busiest events (e.g. per-frame debug logs).
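Sampling for the busiest events could be as simple as a 1-in-N gate in front of the log call. The sketch below is one possible approach; the createSampler name and the fixed-interval policy are assumptions, not something the spec prescribes.

```typescript
// Hypothetical 1-in-N sampler: returns true on the first call and then on
// every Nth call after that.
export function createSampler(every: number): () => boolean {
  if (!Number.isInteger(every) || every < 1) {
    throw new RangeError('every must be a positive integer');
  }
  let seen = 0;
  return () => {
    const emit = seen === 0;
    seen = (seen + 1) % every;
    return emit;
  };
}

// Intended use for per-frame debug logs (not executed here):
//   const shouldLogFrame = createSampler(100);
//   if (shouldLogFrame()) sessionLogger.debug({ codec_id }, 'frame decoded');
```

A counter-based gate is deterministic, which makes it easy to test; a random sampler would spread emitted lines more evenly across sessions at the cost of reproducibility.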
Done
(Fill in once complete.)