Implement Phase 1 task 1.10 (Prometheus metrics + /healthz + /readyz)

Replaces the placeholder Metrics shim with a prom-client implementation
in src/observability/metrics.ts: all 10 Phase 1 metrics from the wiki
spec, plus nodejs_* defaults. Exposes /metrics, /healthz, /readyz over
node:http on METRICS_PORT (9090); /readyz returns 503 when Redis status
is not 'ready' or the TCP listener isn't bound.
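For reviewers, the readiness rule described above can be sketched as a
pure routing function. This is only an illustration, not the code in
this commit: `ReadinessProbes`, `ProbeResponse` and `route` are
hypothetical names, the Redis status is assumed to be an ioredis-style
string, and /metrics is left out to keep the sketch dependency-free.

```typescript
// Hypothetical probe inputs. The real service would read these from its
// Redis client and its TCP listener.
interface ReadinessProbes {
  redisStatus: () => string; // assumed ioredis-style status, e.g. "ready"
  tcpListening: () => boolean; // assumed to mirror net.Server#listening
}

interface ProbeResponse {
  status: number;
  body: string;
}

// Maps a request path to a status code and plain-text body.
function route(path: string, probes: ReadinessProbes): ProbeResponse {
  switch (path) {
    case "/healthz":
      // Liveness: the process is up and able to answer HTTP.
      return { status: 200, body: "ok" };
    case "/readyz": {
      // Readiness: both dependencies must be usable, otherwise 503.
      const ready =
        probes.redisStatus() === "ready" && probes.tcpListening();
      return ready
        ? { status: 200, body: "ready" }
        : { status: 503, body: "not ready" };
    }
    default:
      return { status: 404, body: "not found" };
  }
}
```

A node:http request handler would call `route(req.url ?? "/", probes)`
and write the returned status and body to the response. Keeping the
decision in a pure function makes each endpoint path easy to unit test.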

The Metrics interface in src/core/types.ts is unchanged — adapter call
sites continue to use the same inc/observe shape. Only main.ts sees the
extended type that adds serializeMetrics().
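A minimal sketch of that split follows. The signatures are assumptions
(the real interface lives in src/core/types.ts and is not shown here),
`SerializableMetrics`, `createInMemoryMetrics` and `publishRecord` are
hypothetical names, and an in-memory map stands in for prom-client.
serializeMetrics() is shown as synchronous for brevity, and the output
is simplified to bare sample lines without HELP/TYPE comments.

```typescript
type Labels = Record<string, string>;

// Assumed shape of the narrow interface that adapters depend on.
interface Metrics {
  inc(name: string, labels?: Labels, value?: number): void;
  observe(name: string, value: number, labels?: Labels): void;
}

// Extended type, visible only to the composition root (main.ts).
interface SerializableMetrics extends Metrics {
  serializeMetrics(): string;
}

// Builds a series key such as name{a="1",b="2"} with sorted label names.
function seriesKey(name: string, labels: Labels = {}): string {
  const pairs = Object.keys(labels)
    .sort()
    .map((k) => `${k}="${labels[k]}"`);
  return pairs.length > 0 ? `${name}{${pairs.join(",")}}` : name;
}

function createInMemoryMetrics(): SerializableMetrics {
  const samples = new Map<string, number>();
  const add = (key: string, delta: number): void => {
    samples.set(key, (samples.get(key) ?? 0) + delta);
  };
  return {
    inc: (name, labels, value = 1) => add(seriesKey(name, labels), value),
    // Observations are tracked as a running _sum and _count only.
    observe: (name, value, labels) => {
      add(seriesKey(`${name}_sum`, labels), value);
      add(seriesKey(`${name}_count`, labels), 1);
    },
    serializeMetrics: () =>
      Array.from(samples.entries())
        .map(([key, value]) => `${key} ${value}`)
        .join("\n") + "\n",
  };
}

// An adapter call site only needs the narrow Metrics interface.
function publishRecord(metrics: Metrics): void {
  metrics.inc("teltonika_records_published_total");
}
```

Because adapters accept `Metrics` rather than the extended type, swapping
the shim for a real registry does not touch any call site.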

Side effects:
- Dockerfile re-enables HEALTHCHECK pointing at /readyz, and EXPOSE 9090.
- frame-ingested log downgraded back to debug now that
  teltonika_records_published_total is scrapeable.
- 19 new unit tests covering exposition format, all metric types, and
  every HTTP endpoint path. Total now 98 passing.

Note: deploy/compose.yaml still does not expose 9090. How Prometheus
reaches the service (host port vs. internal scraper on the same Docker
network) is a separate decision.
2026-04-30 20:52:12 +02:00
parent ff9c8d67a4
commit d4a6d8f713
8 changed files with 720 additions and 27 deletions
+3 -4
@@ -24,9 +24,8 @@ COPY --from=build --chown=app:app /app/node_modules ./node_modules
 COPY --from=build --chown=app:app /app/dist ./dist
 COPY --from=build --chown=app:app /app/package.json ./package.json
 USER app
-# Only the TCP port is exposed. METRICS_PORT (9090) is in the config schema but
-# no HTTP server runs today — task 1.10 (observability) adds that server.
 EXPOSE 5027
-# HEALTHCHECK deferred — re-add `wget -qO- http://localhost:${METRICS_PORT}/readyz`
-# when task 1.10 (observability) ships and the HTTP server is running.
+EXPOSE 9090
+HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
+CMD wget -qO- http://localhost:9090/readyz || exit 1
 CMD ["node", "dist/main.js"]