Implement Phase 1 task 1.10 (Prometheus metrics + /healthz + /readyz)

Replaces the placeholder Metrics shim with a prom-client implementation
in src/observability/metrics.ts: all 10 Phase 1 metrics from the wiki
spec, plus nodejs_* defaults. Exposes /metrics, /healthz, /readyz over
node:http on METRICS_PORT (9090); /readyz returns 503 when Redis status
is not 'ready' or the TCP listener isn't bound.
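
The readiness rule above can be sketched as follows. This is a minimal
illustration, not the code in src/observability/metrics.ts: the names
ReadyzDepsSketch, readyzStatusCode, createHealthHandler and
startHealthServer are hypothetical, as are the response bodies and the
assumption that /healthz always answers 200 while the process is alive.

```typescript
import { createServer } from "node:http";
import type { IncomingMessage, Server, ServerResponse } from "node:http";

// Hypothetical dependency shape; the real ReadyzDeps may look different.
interface ReadyzDepsSketch {
  redisStatus: () => string;   // current status string of the Redis client
  tcpListening: () => boolean; // true once the device-facing TCP listener is bound
}

// Ready only when Redis reports 'ready' AND the TCP listener is bound.
function readyzStatusCode(deps: ReadyzDepsSketch): 200 | 503 {
  const ready = deps.redisStatus() === "ready" && deps.tcpListening();
  return ready ? 200 : 503;
}

function createHealthHandler(deps: ReadyzDepsSketch) {
  return (req: IncomingMessage, res: ServerResponse): void => {
    const headers = { "Content-Type": "text/plain; charset=utf-8" };
    if (req.url === "/healthz") {
      // Assumption: liveness does not depend on Redis or the listener.
      res.writeHead(200, headers);
      res.end("ok\n");
    } else if (req.url === "/readyz") {
      const code = readyzStatusCode(deps);
      res.writeHead(code, headers);
      res.end(code === 200 ? "ready\n" : "not ready\n");
    } else {
      res.writeHead(404, headers);
      res.end("not found\n");
    }
  };
}

// Plain node:http server, defaulting to the METRICS_PORT value named above.
function startHealthServer(deps: ReadyzDepsSketch, port = 9090): Server {
  return createServer(createHealthHandler(deps)).listen(port);
}
```

Keeping the status decision in a pure function makes it testable
without opening a socket; the handler only maps paths to responses.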

The Metrics interface in src/core/types.ts is unchanged — adapter call
sites continue to use the same inc/observe shape. Only main.ts sees the
extended type that adds serializeMetrics().
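
A rough sketch of that split, under stated assumptions: the method
signatures below are guesses at the "inc/observe shape" (the real
interface in src/core/types.ts is not shown here), and the in-memory
store merely stands in for prom-client to keep the example
self-contained. MetricsSketch, SerializableMetricsSketch, sampleKey,
renderExposition and createInMemoryMetrics are all hypothetical names.

```typescript
type Labels = Record<string, string>;

// Assumed narrow interface that adapter call sites depend on.
interface MetricsSketch {
  inc(name: string, labels?: Labels, value?: number): void;
  observe(name: string, value: number, labels?: Labels): void;
}

// Extended type that only the composition root (main.ts) would see.
interface SerializableMetricsSketch extends MetricsSketch {
  serializeMetrics(): Promise<string>;
}

// Builds a sample key such as name{a="1",b="2"}; label names are sorted
// so the same label set always maps to the same key.
// Label-value escaping is omitted for brevity.
function sampleKey(name: string, labels: Labels = {}): string {
  const pairs = Object.entries(labels)
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([k, v]) => `${k}="${v}"`);
  return pairs.length > 0 ? `${name}{${pairs.join(",")}}` : name;
}

// Renders one "key value" line per sample, newline-terminated.
function renderExposition(samples: Map<string, number>): string {
  return Array.from(samples.entries())
    .map(([key, value]) => `${key} ${value}\n`)
    .join("");
}

function createInMemoryMetrics(): SerializableMetricsSketch {
  const samples = new Map<string, number>();
  const add = (key: string, delta: number): void => {
    samples.set(key, (samples.get(key) ?? 0) + delta);
  };
  return {
    inc: (name, labels, value = 1) => add(sampleKey(name, labels), value),
    // Tracks only _sum and _count; real histograms also keep buckets.
    observe: (name, value, labels) => {
      add(sampleKey(`${name}_sum`, labels), value);
      add(sampleKey(`${name}_count`, labels), 1);
    },
    serializeMetrics: async () => renderExposition(samples),
  };
}
```

Code that only records metrics can be typed against MetricsSketch,
so the serialization method never leaks into adapters.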

Side effects:
- Dockerfile re-enables HEALTHCHECK (pointing at /readyz) and EXPOSE 9090.
- frame-ingested log downgraded back to debug now that
  teltonika_records_published_total is scrapeable.
- 19 new unit tests covering exposition format, all metric types, and
  every HTTP endpoint path. Total now 98 passing.

Note: deploy/compose.yaml still does not expose 9090. How Prometheus
reaches the service (host port vs. internal scraper on the same Docker
network) is a separate decision.
2026-04-30 20:52:12 +02:00
parent ff9c8d67a4
commit d4a6d8f713
8 changed files with 720 additions and 27 deletions
@@ -1,7 +1,7 @@
# Task 1.10 — Observability (Prometheus metrics)
**Phase:** 1 — Inbound telemetry
-**Status:** ⏸ Paused — deferred until after the real-device pilot test. See ROADMAP.md "Deferred" section for resume triggers. The placeholder `Metrics` interface in `src/core/types.ts` is what code currently uses; this task replaces it with `prom-client` and adds the `/metrics`, `/healthz`, `/readyz` HTTP endpoints.
+**Status:** 🟩
**Depends on:** 1.2, 1.3
**Wiki refs:** `docs/wiki/sources/teltonika-ingestion-architecture.md` § 7. Observability, `docs/wiki/sources/gps-tracking-architecture.md` § 7.4
@@ -81,4 +81,4 @@ Use Node's `node:http` directly — no Express/Fastify dependency for two endpoi
## Done
-(Fill in once complete.)
+Implemented `src/observability/metrics.ts` with `createMetrics()`, `startMetricsServer()`, and `ReadyzDeps`. Replaced the placeholder shim in `src/main.ts`, wired metrics server into boot and graceful shutdown, downgraded `frame ingested` log to debug, and re-enabled the Dockerfile `HEALTHCHECK`. Landed in: *(pending commit SHA)*