julian d758c211ae Fix metric wiring gaps audited against live processor output
Several Phase 1 metrics were registered in observability/metrics.ts but
were either never wired at the call sites or wired with incorrect
counts. In production, logs showed 11 records ingested while metrics
reported only 4. The fixes below align metric values with actual
hot-path activity.

Wiring gaps closed (consumer.ts):
- processor_consumer_reads_total{result=ok|empty|error} — was registered
  but never inc'd. Now fires on each XREADGROUP outcome.
- processor_consumer_records_total — was registered but never inc'd.
  Now fires once per XREADGROUP, with the entry count.

Counts corrected (writer.ts):
- processor_position_writes_total{status} — was inc'd unconditionally
  by 1 per chunk for each of inserted/duplicate. Now inc'd by the
  actual per-status count, and only when count > 0.
- processor_position_writes_total{status='failed'} — was inc'd by 1
  per failed chunk. Now inc'd by chunk.length so every failed record
  is counted.
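
  The corrected pattern can be sketched as follows. All names here are
  illustrative, not the actual writer.ts code; only the shape of the fix
  (inc by the real per-status count, and only when non-zero) comes from
  this change.

  ```typescript
  // Illustrative sketch: inc by the actual per-status count, and only
  // when the count is non-zero, so zero-activity statuses do not fire.
  // `inc(name, labels, value)` follows the batched-increment shape
  // added in this change.
  interface Metrics {
    inc(name: string, labels: Record<string, string>, value?: number): void;
  }

  function recordChunkWrites(
    metrics: Metrics,
    counts: { inserted: number; duplicate: number },
  ): void {
    for (const [status, count] of Object.entries(counts)) {
      if (count > 0) {
        metrics.inc("processor_position_writes_total", { status }, count);
      }
    }
  }

  // Hypothetical call capture, showing only the non-zero status fires:
  const calls: Array<[string, number]> = [];
  const metrics: Metrics = {
    inc: (_name, labels, value = 1) => calls.push([labels.status, value]),
  };
  recordChunkWrites(metrics, { inserted: 7, duplicate: 0 });
  ```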

Counts corrected (main.ts):
- processor_acks_total — was inc'd by 1 per non-empty batch. Now
  inc'd by ackIds.length so every ACK'd ID is counted.

Wiring gap closed (state.ts):
- processor_device_state_evictions_total — internal `evicted` counter
  existed but was never published to metrics. createDeviceStateStore
  now accepts a Metrics injection and inc's on each eviction.

Metrics interface extended (types.ts, metrics.ts):
- Metrics.inc gained an optional third `value` parameter (defaults to 1)
  for batched increments. dispatchInc passes it through to prom-client's
  Counter.inc(labels, value).
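
  The extended shape can be sketched as below. The in-memory store is
  illustrative only; the real implementation dispatches to prom-client's
  Counter.inc(labels, value), which is elided here.

  ```typescript
  // Illustrative in-memory Metrics showing the optional third `value`
  // argument (defaults to 1) for batched increments.
  type Labels = Record<string, string>;

  interface Metrics {
    inc(name: string, labels?: Labels, value?: number): void;
  }

  function createInMemoryMetrics(): Metrics & { total(name: string): number } {
    const counts = new Map<string, number>();
    return {
      inc(name, _labels = {}, value = 1) {
        counts.set(name, (counts.get(name) ?? 0) + value);
      },
      total(name) {
        return counts.get(name) ?? 0;
      },
    };
  }

  const metrics = createInMemoryMetrics();
  metrics.inc("processor_acks_total", {}, 5); // batched: count 5 ACKs at once
  metrics.inc("processor_acks_total");        // value defaults to 1
  ```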

Tests updated to reflect the new third arg and the state.ts factory's
new metrics parameter. Total 134 unit tests passing (no count change —
existing tests adjusted, no new tests added; the real verification is
on stage where the metrics are now meaningful again).
2026-05-01 11:43:06 +02:00

processor

Node.js worker that consumes Position records from a Redis Stream (produced by tcp-ingestion), maintains per-device runtime state, applies racing-domain rules, and writes durable state to Postgres / TimescaleDB.

For the architectural specification see ../docs/wiki/entities/processor.md. For the work plan and task status see .planning/ROADMAP.md.

This service is part of the TRM (Time Racing Management) platform.


Quick start (local)

Prerequisites: Node.js 22+, pnpm, a local Redis instance, and a TimescaleDB instance.

git clone <repo-url>
cd processor
pnpm install
cp .env.example .env
# Edit .env — at minimum set REDIS_URL and POSTGRES_URL
pnpm dev

pnpm dev uses tsx watch for hot-reload during development. The metrics server listens on METRICS_PORT (default 9090). The service connects to Redis and Postgres on startup; both must be reachable before the process starts.


Test the Docker build locally

compose.dev.yaml builds the image from source and runs it next to Redis and TimescaleDB containers. Useful for verifying Dockerfile changes before pushing:

docker compose -f compose.dev.yaml up --build

Once running, the readiness endpoint confirms everything is wired:

curl http://localhost:9090/readyz
# {"status":"ok"}

For day-to-day development, prefer pnpm dev directly — it has hot reload and faster iteration.


Production / stage deployment

This service is not deployed standalone. It runs as part of the platform stack defined in the deploy/ repo, which Portainer pulls and runs on the stage and production hosts.

The image itself is published to git.dev.microservices.al/trm/processor:main on every push to main (see CI behavior below). The deploy/ repo's compose.yaml references that image; updates flow through there, not through this repo.

To pin a specific commit in production, set PROCESSOR_TAG=<sha> in the deploy stack's environment variables.

Note: deploy/compose.yaml needs a processor service entry and a TimescaleDB service before this service can run in stage/production. See .planning/phase-1-throughput/11-dockerfile-and-ci.md for the expected service block shape. That change is made in the deploy/ repo, not here.


Environment variables

See .env.example for all variables with descriptions and defaults. Required variables:

Variable       Description
-------------  ------------------------------------------------------------------
REDIS_URL      Redis connection URL, e.g. redis://localhost:6379
POSTGRES_URL   TimescaleDB connection URL, e.g. postgres://user:pass@host:5432/trm

All other variables have sensible defaults (see .env.example).
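
A fail-fast loading pattern for the required variables might look like the sketch below. The function and field names are assumptions for illustration; the service's actual config code may differ.

```typescript
// Illustrative config loader: fail fast at startup if a required
// connection URL is missing, and fall back to the documented default
// for optional settings like METRICS_PORT.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

function loadConfig() {
  return {
    redisUrl: requireEnv("REDIS_URL"),
    postgresUrl: requireEnv("POSTGRES_URL"),
    metricsPort: Number(process.env.METRICS_PORT ?? 9090),
  };
}
```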


Tests

  • pnpm test — unit tests only. Fast (~12 s), no external dependencies. This is what CI runs.
  • pnpm test:integration — integration tests that need Docker (testcontainers spins up real Redis 7 and TimescaleDB containers). Opt-in. Run locally before changes to the consumer, writer, or migration.

Integration tests live in test/**/*.integration.test.ts and are excluded from the default run by vitest.config.ts.
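
The exclusion in vitest.config.ts might look roughly like this sketch (not the actual file; the real config may set other options):

```typescript
// vitest.config.ts (sketch) — keep integration tests out of the
// default `pnpm test` run; `pnpm test:integration` targets the
// *.integration.test.ts files explicitly.
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    include: ["test/**/*.test.ts"],
    exclude: ["test/**/*.integration.test.ts", "node_modules/**"],
  },
});
```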

Without Docker

If Docker is unavailable, pnpm test:integration still exits 0 — the suite logs a skip message per test and does not fail the build. This is the correct behavior for CI runners that lack Docker access.


CI behavior

Gitea Actions workflow is at .gitea/workflows/build.yml.

  • Push to main (only when src/, test/, build config, Dockerfile, or the workflow file itself changes): runs typecheck, lint, test (unit tests only), then builds and pushes the Docker image tagged :main. Auto-deploys to stage if a Portainer webhook is configured via secrets.PORTAINER_WEBHOOK_URL.
  • Manual trigger (workflow_dispatch): same flow, run on demand.

Integration tests are not run in CI — they need Docker access on the runner, which is not currently configured. Run them locally as needed.

The workflow uses secrets.REGISTRY_USERNAME and secrets.REGISTRY_PASSWORD for the Gitea registry login — these must be configured in the repo's (or org's) Actions secrets.
