src/observability/metrics.ts — full prom-client implementation. All 10
Phase 1 metrics registered (processor_consumer_reads_total,
_records_total, _lag, _decode_errors_total, processor_position_writes_total
{status}, _write_duration_seconds, processor_acks_total,
processor_device_state_{size,evictions_total}) plus nodejs_* defaults.
node:http server with /metrics, /healthz, /readyz. /readyz checks
redis.status === 'ready' AND a 5s-cached SELECT 1 Postgres probe.
processor_consumer_lag sampled every 10s via XINFO GROUPS, falling back
to a no-op when the consumer group hasn't been created yet.
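The /readyz rule above (Redis status plus a 5s-cached Postgres probe) can be pictured with the rough sketch below. It is an illustration under assumptions, not the code in src/observability/metrics.ts: `cacheFor`, `readyzStatus`, and the injected clock are hypothetical names, and the stand-in probe simply resolves to true where the real one would run SELECT 1.

```typescript
// Hypothetical sketch: the names below are illustrative, not taken from the repo.
type NowFn = () => number;

/**
 * Wraps `probe` so it runs at most once per `ttlMs` window.
 * With an async probe the cached value is the Promise itself, so
 * concurrent /readyz requests share one in-flight query.
 */
function cacheFor<T>(probe: () => T, ttlMs: number, now: NowFn = () => Date.now()): () => T {
  let cached: { value: T; at: number } | undefined;
  return () => {
    const t = now();
    if (cached === undefined || t - cached.at >= ttlMs) {
      cached = { value: probe(), at: t };
    }
    return cached.value;
  };
}

/** 200 only when Redis reports 'ready' AND the Postgres probe resolves to true. */
async function readyzStatus(
  redisStatus: string,
  postgresProbe: () => Promise<boolean>,
): Promise<200 | 503> {
  if (redisStatus !== "ready") return 503;
  const ok = await postgresProbe().catch(() => false);
  return ok ? 200 : 503;
}

// Usage with a fake clock and a stand-in for the `SELECT 1` probe.
let fakeNow = 0;
let probeCalls = 0;
const cachedProbe = cacheFor(
  () => {
    probeCalls += 1;
    return Promise.resolve(true);
  },
  5000,
  () => fakeNow,
);

void readyzStatus("ready", cachedProbe); // first call runs the probe
void readyzStatus("ready", cachedProbe); // served from the 5 s cache
fakeNow = 5000;
void readyzStatus("ready", cachedProbe); // cache expired, probe runs again
```

One consequence of caching the Promise is that a failed probe also stays cached for the full window, so /readyz can lag a Postgres recovery by up to 5 seconds.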
src/main.ts — replaces the trace-logging shim with createMetrics() and
startMetricsServer(); shutdown closes the metrics server before
redis.quit() and pool.end().
test/metrics.test.ts — 22 unit tests: exposition format, every metric
type behaviour, all four HTTP endpoint paths including /readyz 503 cases.
test/pipeline.integration.test.ts — testcontainers Redis 7 +
TimescaleDB latest-pg16. Four scenarios: happy path with bigint+Buffer
attribute round-trip, idempotency on (device_id, ts), malformed payload
stays in PEL (decode_errors_total increments), writer failure → retry
(weaker variant per spec: stop Postgres before publish, restart, verify
row appears). Skip-on-no-Docker pattern verified — exits 0 without
Docker.
Dockerfile — multi-stage matching tcp-ingestion. EXPOSE 9090 only,
HEALTHCHECK on /readyz, image-source label points at processor repo.
.gitea/workflows/build.yml — single-job workflow mirroring
tcp-ingestion. Path filters cover src/, test/, build config, Dockerfile.
Portainer webhook step uncommented for :main auto-deploy.
compose.dev.yaml — local-build variant with Redis + TimescaleDB +
processor-dev for verifying Dockerfile changes without the registry
round-trip.
README.md — fleshed out from stub: quick-start, Docker build, deployment
note, env vars, tests (unit vs. integration), CI behavior. Flags the
deploy-side change needed: deploy/compose.yaml needs a TimescaleDB
service and a processor service entry added.
Verification: typecheck, lint clean; 134 unit tests passing across 8
files (+22 from this batch). pnpm test:integration runs cleanly under
the no-Docker skip pattern.
Phase 1 is now complete. Service is pilot-ready.
Task 1.11 — Dockerfile & Gitea workflow
Phase: 1 (Throughput pipeline)
Status: 🟩 Done
Depends on: 1.10
Wiki refs: none
Goal
Containerize the service and add the Gitea Actions workflow that builds and publishes git.dev.microservices.al/trm/processor:main on every push to main. Mirror tcp-ingestion's slim variant — same multi-stage Dockerfile, same single-job workflow with path filters.
Deliverables
- Dockerfile: multi-stage (deps → build → runtime). Match tcp-ingestion/Dockerfile line for line, adjusting only:
  - EXPOSE 9090 only (Processor has no TCP listener).
  - HEALTHCHECK pointing at /readyz on ${METRICS_PORT}.
  - CMD ["node", "dist/main.js"].
- .gitea/workflows/build.yml: single-job workflow matching tcp-ingestion/.gitea/workflows/build.yml:
  - Trigger: push to main (path filters: src/, test/, package.json, pnpm-lock.yaml, tsconfig.json, Dockerfile, .gitea/workflows/build.yml) + workflow_dispatch.
  - Steps: checkout, setup-node@v4 (Node 22, pnpm), install, typecheck, lint, test (unit only), docker buildx build-push to git.dev.microservices.al/trm/processor:main.
  - Uses secrets.REGISTRY_USERNAME / secrets.REGISTRY_PASSWORD.
  - Final step: trigger Portainer webhook on success (uncommented; same as tcp-ingestion after the :main -> webhook auto-deploy got working).
- compose.dev.yaml: local-build variant (build: .) named processor-dev, depending on a Redis service and a TimescaleDB service. Useful for verifying Dockerfile changes without the registry round-trip.
- README.md (the repo-level one, already a stub): flesh out with:
  - Quick-start (local: pnpm install && cp .env.example .env && pnpm dev).
  - "Run the Docker build locally" section (docker compose -f compose.dev.yaml up --build).
  - Production-deployment note: image is pulled by the deploy/ repo's stack; do not run standalone.
  - Pin to a specific commit via PROCESSOR_TAG=<sha> in the deploy stack.
  - Tests section (unit vs. integration).
  - CI behavior summary.
  - "Pilot deployment notes" section if anything is paused (Phase 1 has nothing paused; note this and remove the section if so).
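As a rough illustration of the workflow's path filters, the sketch below decides whether a set of changed files would start a build. It approximates the filters with directory-prefix and exact-file matches; the real filters are evaluated by the Actions runner (likely as glob patterns), and `BUILD_PATHS`, `matchesFilter`, and `shouldTriggerBuild` are hypothetical names used only here.

```typescript
// Hypothetical sketch: approximates the workflow's path filters in plain TypeScript.
const BUILD_PATHS: readonly string[] = [
  "src/",
  "test/",
  "package.json",
  "pnpm-lock.yaml",
  "tsconfig.json",
  "Dockerfile",
  ".gitea/workflows/build.yml",
];

/** Entries ending in "/" match as directory prefixes; all others must match exactly. */
function matchesFilter(path: string, filter: string): boolean {
  return filter.endsWith("/") ? path.startsWith(filter) : path === filter;
}

/** True when at least one changed file falls under a build-relevant path. */
function shouldTriggerBuild(changedPaths: readonly string[]): boolean {
  return changedPaths.some((p) => BUILD_PATHS.some((f) => matchesFilter(p, f)));
}

console.log(shouldTriggerBuild(["README.md"])); // false
console.log(shouldTriggerBuild(["src/main.ts", "README.md"])); // true
```

Under this list, a push that touches only README.md or compose.dev.yaml would not rebuild the image; workflow_dispatch remains the manual escape hatch.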
Specification
Dockerfile parity with tcp-ingestion
Open tcp-ingestion/Dockerfile and copy its structure verbatim. The only diffs for the Phase 1 Processor are:
- No EXPOSE 5027: there's no TCP listener.
- HEALTHCHECK URL path is /readyz (already true for tcp-ingestion).
- Image label: org.opencontainers.image.source should point to the processor repo URL.
This parity matters: when a future engineer needs to debug a build, having two services build the same way reduces cognitive load.
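To make the HEALTHCHECK target concrete, here is a small sketch of how the probe URL could be derived from METRICS_PORT. The default of 9090 is an assumption inferred from the EXPOSE line, and `readyzUrl` is a hypothetical helper; the Dockerfile's own check relies on wget rather than Node.

```typescript
// Hypothetical helper; the default port is an assumption based on EXPOSE 9090.
const DEFAULT_METRICS_PORT = 9090;

/** Builds the URL a container health probe would request. */
function readyzUrl(env: Record<string, string | undefined>): string {
  const raw = env["METRICS_PORT"];
  const port = raw === undefined || raw === "" ? DEFAULT_METRICS_PORT : Number(raw);
  // Reject non-numeric or out-of-range values instead of probing a bogus port.
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`invalid METRICS_PORT: ${String(raw)}`);
  }
  return `http://127.0.0.1:${port}/readyz`;
}

console.log(readyzUrl({})); // http://127.0.0.1:9090/readyz
console.log(readyzUrl({ METRICS_PORT: "9100" })); // http://127.0.0.1:9100/readyz
```

If METRICS_PORT is overridden at runtime, the EXPOSE line and the healthcheck need to agree with it, which is one reason to keep the port in a single variable.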
Workflow parity with tcp-ingestion
Same approach. Open tcp-ingestion/.gitea/workflows/build.yml, copy it, and change the image name and (if needed) the path filters. The webhook step at the end should be uncommented so :main builds auto-deploy through Portainer.
Stage deploy
Phase 1 ships ready to land in the deploy/compose.yaml (trm/deploy repo) as a new service. Do not edit deploy/compose.yaml from this task. Surface it in the final report: "Add processor service to deploy/compose.yaml with image, env, depends_on Redis + Postgres." That is a deploy-side change, made by the user.
The deploy/compose.yaml's service block will look roughly like:
    processor:
      image: git.dev.microservices.al/trm/processor:${PROCESSOR_TAG:-main}
      depends_on:
        redis: { condition: service_healthy }
        postgres: { condition: service_healthy }
      environment:
        NODE_ENV: production
        INSTANCE_ID: ${PROCESSOR_INSTANCE_ID:-processor-1}
        REDIS_URL: redis://redis:6379
        POSTGRES_URL: postgres://...
        LOG_LEVEL: ${LOG_LEVEL:-info}
      restart: unless-stopped
Plus a Postgres service (TimescaleDB image) added to the stack — the stack currently only has Redis + tcp-ingestion. That's the user's deploy decision to make.
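The `${VAR:-default}` placeholders in the service block use Compose's default-value interpolation: the default applies when the variable is unset or empty. The sketch below mirrors that rule for the three interpolated values; `withDefault` and `resolveProcessorService` are hypothetical names, and `abc1234` is a placeholder tag rather than a real commit.

```typescript
type ComposeEnv = Record<string, string | undefined>;

/** Mirrors `${NAME:-fallback}`: use the fallback when NAME is unset or empty. */
function withDefault(env: ComposeEnv, name: string, fallback: string): string {
  const value = env[name];
  return value === undefined || value === "" ? fallback : value;
}

/** Resolves the three interpolated values from the service block. */
function resolveProcessorService(env: ComposeEnv): {
  image: string;
  instanceId: string;
  logLevel: string;
} {
  return {
    image: `git.dev.microservices.al/trm/processor:${withDefault(env, "PROCESSOR_TAG", "main")}`,
    instanceId: withDefault(env, "PROCESSOR_INSTANCE_ID", "processor-1"),
    logLevel: withDefault(env, "LOG_LEVEL", "info"),
  };
}

console.log(resolveProcessorService({}).image);
// git.dev.microservices.al/trm/processor:main
console.log(resolveProcessorService({ PROCESSOR_TAG: "abc1234" }).image);
// git.dev.microservices.al/trm/processor:abc1234
```

With nothing set, the stack tracks the moving :main tag; setting PROCESSOR_TAG pins it to one build.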
Acceptance criteria
- docker build . succeeds locally; resulting image runs and exposes /healthz on 9090.
- docker compose -f compose.dev.yaml up --build boots Redis + TimescaleDB + Processor; /readyz reports 200 once everything is up.
- Pushing to main (or hitting workflow_dispatch) builds the image, runs typecheck/lint/test, and pushes :main to the registry.
- Portainer webhook fires on successful push and the stage stack picks up the new image (assuming the deploy/ stack is set up).
- Image size is reasonable (target < 250 MB final stage; the tcp-ingestion slim variant lands around there).
Risks / open questions
- Re-pull on stack redeploy. The same Portainer issue we hit with tcp-ingestion (stack redeploy doesn't pull new images by default) will apply here. Make sure the same fix is in place ("Re-pull image" toggle, or per-commit-SHA tags) before this lands. Cross-reference the tcp-ingestion deploy note in deploy/README.md.
- HEALTHCHECK wget availability. node:22-alpine includes wget. If we ever switch base image, revisit.
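One way to picture the per-commit-SHA option: each build publishes both the moving :main tag and an immutable tag derived from the commit, and the deploy stack pins the latter through PROCESSOR_TAG. The sketch below is an assumption about how such tags could be derived, not a description of the current workflow; `imageTags` and the 12-character short SHA are illustrative choices, and the SHA shown is a placeholder.

```typescript
const IMAGE = "git.dev.microservices.al/trm/processor";

/** Returns the moving branch tag plus an immutable short-SHA tag. */
function imageTags(branch: string, commitSha: string, shortLen = 12): string[] {
  if (!/^[0-9a-f]{40}$/.test(commitSha)) {
    throw new Error("expected a full 40-character lowercase hex commit SHA");
  }
  return [`${IMAGE}:${branch}`, `${IMAGE}:${commitSha.slice(0, shortLen)}`];
}

const placeholderSha = "0123456789abcdef0123456789abcdef01234567"; // not a real commit
console.log(imageTags("main", placeholderSha));
```

Because the SHA tag changes on every build, a redeploy that references it names an image the host has not seen before, which sidesteps the stale-image behaviour described in the first risk.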
Done
Dockerfile (multi-stage, EXPOSE 9090 only, HEALTHCHECK on /readyz), .gitea/workflows/build.yml (mirrors tcp-ingestion; Portainer webhook uncommented), compose.dev.yaml (Redis + TimescaleDB + processor-dev), README.md fleshed out. (pending commit SHA)