Compare commits


10 Commits

Author SHA1 Message Date
julian 875327bed7 log: TRM design handoff imported, adoption deferred to SPA phase 3.8 2026-05-02 19:11:45 +02:00
julian 67e3a4939f Update project_processor documentation for Phase 1.5 completion and enhance .gitignore to exclude .claude directory 2026-05-02 18:53:29 +02:00
julian f92595a62a docs: TRACCAR ingest + processor-ws-contract synthesis + auth-mode realignment
Catches up the wiki with several pieces of work accumulated during this
session.

INGEST: TRACCAR_MAPS_ARCHITECTURE.md
- raw/TRACCAR_MAPS_ARCHITECTURE.md (source doc, read-only).
- wiki/sources/traccar-maps-architecture.md — TL;DR + key claims +
  notable quotes + TRM divergences (PostGIS-native GeoJSON, rAF
  coalescer, Zustand, longer trail, racing sprite set).
- wiki/concepts/maps-architecture.md — distilled patterns for the SPA's
  map subsystem: singleton MapLibre + side-effect-only Map* components +
  two GeoJSON sources + style-swap mapReady gate + sprite preload + WS-
  to-map data flow (with rAF coalescer) + geofence editing + camera
  control trio.
- wiki/entities/react-spa.md — corrected the "talks exclusively to
  Directus" contradiction with [[live-channel-architecture]] (SPA
  connects to two endpoints — Directus + Processor); locked stack (raw
  MapLibre over react-map-gl, Zustand over Redux); added Auth section.
- wiki/concepts/live-channel-architecture.md — single sentence cross-
  referencing [[maps-architecture]] for consumer-side throughput
  discipline.
- index.md — Sources + Concepts entries.

SYNTHESIS: processor-ws-contract
- wiki/synthesis/processor-ws-contract.md — wire-level spec for the
  live-position WebSocket: endpoint, transport, auth handshake,
  subscribe/snapshot/streaming/unsubscribe protocol, reconnect, multi-
  instance behaviour, connection limits, versioning, open questions.
  Implementation-agnostic; the producer is cookie-name-agnostic so the
  spec doesn't pin to a specific Directus auth mode.
- index.md — Synthesis entry.

AUTH-MODE REALIGNMENT (cookie -> session)
- SPA implementation surfaced that Directus SDK 'cookie' mode doesn't
  survive a hard reload cleanly. Switched the SPA to 'session' mode
  (separate commit in trm/spa). Wiki updates here:
- wiki/entities/react-spa.md §Auth pattern — describes session mode
  (single httpOnly session cookie, no separate access token, no
  /auth/refresh dance). Added "Mode choice context" note.
- wiki/synthesis/processor-ws-contract.md §Auth handshake — emphasises
  the producer is cookie-name-agnostic; reframed "Cookie refresh while
  connected" as "Session expiry while connected".

Plus all the chronological log.md entries documenting the above plus
Phase 1.5 planning, SPA Phase 1 planning, and stage verify+seed work
from earlier in the session.

Skipped from this commit: .claude/agent-memory/* (user-local agent
state, not project content); .gitignore (already-modified by user
outside this session's scope).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 18:15:09 +02:00
julian 130b44778a Add Phase 1 task decisions documentation for db-init runner, positions table divergences, and Gitea CI workflow 2026-05-02 12:23:06 +02:00
julian f9b96efc6b Document directus deployment + internal-only network model
trm/directus Phase 1 image is on the registry; trm/deploy's
compose.yaml has been extended with a directus service block that
shares the existing postgres service with processor (different
tables, no contention). Bringing the architecture wiki up to date.

wiki/entities/directus.md updates:

- New "Deployment" section: links to the deploy compose, names the
  shared-Postgres model with processor, spells out the 5-step boot
  pipeline (db-init pre-schema → bootstrap → schema apply →
  db-init post-schema → start), notes first-boot (~60-90 s) vs
  warm-boot (~10 s) timing, points at deploy/README.md's first-deploy
  checklist.

- New "Network exposure" subsection: directus is internal-only on
  stage / prod (expose: 8055 not ports:). A reverse proxy on the
  host or attached to trm_default terminates TLS and forwards the
  public domain to http://directus:8055. The asymmetry with
  tcp-ingestion (which must host-publish for GPS devices) is named.
  The dev compose's deliberate divergence (host-publishes 8055 for
  local iteration) is noted.

- Schema management section: db-init split into pre-schema (db-init/)
  and post-schema (db-init-post/) phases. Post-schema landed because
  the composite UNIQUE constraints target Directus-managed tables
  that don't exist until schema apply runs. Both phases run via the
  same apply-db-init.sh with DB_INIT_DIR overridden between calls.
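
  The two-phase split above can be sketched as two invocations of the same
  runner with DB_INIT_DIR overridden between calls. A minimal stand-in
  (run_phase substitutes for apply-db-init.sh, and the post-schema
  filename is hypothetical):

  ```shell
  set -euo pipefail
  root=$(mktemp -d)
  mkdir -p "$root/db-init" "$root/db-init-post"
  touch "$root/db-init/001_extensions.sql"
  touch "$root/db-init-post/101_composite_unique.sql"   # hypothetical filename

  # Stand-in for apply-db-init.sh: list the *.sql files a phase would apply.
  run_phase() { for f in "$DB_INIT_DIR"/*.sql; do basename "$f"; done; }

  export DB_INIT_DIR="$root/db-init"
  pre=$(run_phase)
  # ...bootstrap and schema apply run between the two phases...
  export DB_INIT_DIR="$root/db-init-post"
  post=$(run_phase)
  echo "pre=$pre post=$post"
  ```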

- Destructive-apply hazard callout: corrected entrypoint step
  reference (now step 3/5, not 2/4) after the bootstrap-before-apply
  reorder that landed during CI iterations.

log.md entry records the three CI iterations that surfaced three
distinct production-breaking bugs (port collision; ordering + silent
ERROR exit; ghost-collection apply conflict) — all caught by the
dry-run gate before reaching stage. Ghost-collection stripping is
now automated in scripts/schema-snapshot.sh so future captures
don't regress.
2026-05-02 12:20:13 +02:00
julian 417c21f49e Document destructive-apply hazard in directus entity page
A real incident hit during directus Phase 1 task 1.5: 5 newly-created
collections were destroyed by a container rebuild because the baked-in
snapshot was stale. directus schema apply --yes enforces the snapshot
as the single source of truth — anything not in the snapshot gets
deleted.

This is correct for fresh-environment provisioning and prod, but
catastrophic during active schema development. Adding a callout to the
directus entity page so future readers see the operator rule alongside
the snapshot/apply pattern documentation:

  Never restart or rebuild the Directus container while there are
  uncommitted schema changes. Always: change → snapshot → commit →
  rebuild/restart.

The recovery path (re-apply via MCP / admin UI, snapshot before
restart) is straightforward in dev but would be data-loss in prod.
Phase 3 hardening will introduce a DIRECTUS_SCHEMA_APPLY_MODE env var
with auto/dry-run/skip modes so dev environments default to
non-destructive behavior.
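
A mode gate like the one planned could look roughly like the sketch
below. The env var name comes from this message; the CLI invocations and
snapshot path are illustrative assumptions, not confirmed Directus flags:

```shell
set -eu
# DIRECTUS_SCHEMA_APPLY_MODE is the planned (not yet implemented) env var.
mode="${DIRECTUS_SCHEMA_APPLY_MODE:-auto}"
schema_apply_cmd() {
  case "$1" in
    auto)    echo "directus schema apply --yes /directus/snapshot.yaml" ;;
    dry-run) echo "directus schema apply --dry-run /directus/snapshot.yaml" ;;
    skip)    echo "" ;;   # skip the apply step entirely
    *)       echo "unknown DIRECTUS_SCHEMA_APPLY_MODE: $1" >&2; return 1 ;;
  esac
}
schema_apply_cmd "$mode"
```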
2026-05-02 09:56:20 +02:00
julian 507aa8b23b Add Docker image facts and Phase 1 scaffold decisions documentation 2026-05-01 21:29:39 +02:00
julian 02156208f2 Add Directus DevOps architect agent and skill documentation 2026-05-01 20:31:36 +02:00
julian 411b08d02f Add business-plane schema draft and ingest Rally Albania 2025 regs
Substantial design artifact + canonical-source ingest for the TRM
business plane.

Schema draft (synthesis):
- wiki/synthesis/directus-schema-draft.md — working agreement for the
  multi-tenant schema. Pseudo multi-tenant under organizations; entries
  as the unit of timing; course definition (stages/segments/geofences/
  waypoints/SLZs); penalty system "numbers in DB, math in code" with an
  evaluator registry and progressive bracket math; per-entry timing
  tables; per-stage start-order strategies (manual /
  previous_stage_clean_result / inverse_top_n_then_natural /
  inverse_of_overall) covering both Tirana 24h and Rally Albania
  patterns. Two role surfaces (org role vs racing role) called out
  explicitly. Decisions captured; Open questions reduced to one
  (geometry retroactivity engine, deferred to Phase 2.5).

Source ingest:
- raw/Regulations_2025.pdf + wiki/sources/rally-albania-regulations-
  2025.md — formal ingest of the canonical Rally Albania 2025
  rulebook. Section numbers preserved as §X.Y so the schema draft and
  future SPA work can cite precisely. Flagged follow-ups: the SLZ
  formula lives in the Supplementary Regulations (don't hardcode);
  M-7 numbering bug; unmodeled neutralization zones.

Faulty-position flag (cross-plane operator workflow):
- entities/postgres-timescaledb.md, entities/processor.md,
  concepts/position-record.md — operator-controlled boolean on the
  positions hypertable; processor filters WHERE faulty = false on
  every read; flagging triggers windowed recompute via the
  recompute:requests stream.

Implementation strategy on entity pages:
- entities/directus.md — Schema management section documenting the
  snapshots/ + db-init/ convention, container-startup apply pipeline.
- entities/processor.md — Phase 2 long-lived branch model with
  PROCESSOR_PHASE_2_ENABLED flag-gating for incremental main merges;
  Phase 2.5 deferral note.

Index and log updated.
2026-05-01 20:31:10 +02:00
julian 90d036dbf0 Document canonical Redis stream names in wiki
The wiki was silent on the actual stream name used by tcp-ingestion and
processor — anyone reading it to understand the architecture had no way
to find out what stream the services use. This gap contributed to a
stage-side bug where the two services' compiled defaults drifted
(tcp-ingestion: telemetry:teltonika, processor: telemetry:t), causing
~7 hours of silent zero-throughput before symptoms surfaced.

Changes:
- entities/redis-streams.md — added "Stream and key naming" table
  covering the inbound telemetry stream, Phase 2 command streams, and
  registry/heartbeat keys. Documented the telemetry:{vendor} convention
  so a future Queclink/Concox adapter fits predictably.
- entities/processor.md — opening paragraph names the stream and
  consumer group consumed.
- entities/tcp-ingestion.md — opening paragraph names the stream
  produced; defers full naming convention to redis-streams.
- log.md — note entry recording the canonicalization and the stage
  incident that triggered it.
2026-05-01 11:43:59 +02:00
28 changed files with 2621 additions and 33 deletions
@@ -0,0 +1,8 @@
# Agent Memory Index
- [Directus image facts](reference_directus_image.md) — directus/directus:11.17.4 confirmed on Docker Hub; Alpine-based (node:22-alpine); runs as user `node`; CMD is `node cli.js bootstrap && pm2-runtime start ecosystem.config.cjs`
- [TimescaleDB-HA image facts](reference_timescaledb_ha.md) — timescale/timescaledb-ha:pg16-latest includes PostGIS by default; PGDATA=/pgdata (not /var/lib/postgresql/data); env vars: POSTGRES_USER, POSTGRES_PASSWORD, POSTGRES_DB
- [Phase 1 scaffold decisions](project_phase1_scaffold.md) — key implementation decisions made during task 1.1 (image pins, volume paths, entrypoint command, apk package name)
- [Phase 1 task 1.2 decisions](project_phase1_task12.md) — apply-db-init.sh: .gitattributes required for LF on Windows, chmod+x via git update-index, collision detection pattern, psql error capture idiom
- [Phase 1 task 1.3 decisions](project_phase1_task13.md) — positions table: 8 divergences between spec and processor migration; processor wins; assertion block patterns for information_schema + pg_indexes checks
- [Phase 1 task 1.8 decisions](project_phase1_task18.md) — CI workflow: no build-push-action (image must be in local daemon for docker run); --network host requires DB_HOST=localhost not postgres; curl -fsS for Portainer webhook
@@ -0,0 +1,19 @@
---
name: Phase 1 task 1.1 scaffold decisions
description: Implementation decisions made during Phase 1 task 1.1 (project scaffold) for trm/directus
type: project
---
# Phase 1 task 1.1 — project scaffold decisions
**Why:** These are non-obvious deviations from the task spec that future task implementers need to know.
- **entrypoint.sh calls `node cli.js bootstrap` then `exec pm2-runtime start /directus/ecosystem.config.cjs`**, not `exec /directus/cli.js start`. The `bootstrap` step is idempotent and required for admin user creation. pm2-runtime provides crash recovery.
- **`postgresql16-client` is pre-installed in the Dockerfile via `apk add --no-cache postgresql16-client`**. This allows task 1.2's `apply-db-init.sh` to use `psql` without adding it later. Alpine 3.20 uses version-specific package names (no generic `postgresql-client`).
- **Volume mount for db service is `/pgdata`**, not `/var/lib/postgresql/data`. The `timescaledb-ha` image sets `PGDATA=/pgdata`.
- **`directus-uploads` named volume** maps to `/directus/uploads` in the directus service. No uploads use-case in Phase 1, but declared now to avoid volume-mount pain later.
- **`KEY` and `SECRET` in compose have no default** — blank env var means Directus will fail loudly at boot, which is the right behavior.
- **`packageManager` field omitted from package.json** because the local pnpm version (10.x) doesn't match what the task spec implied (9.x). pnpm 10 is backward-compatible.
- **Dockerfile is single-stage in Phase 1.** Multi-stage build with a Node builder for extensions lands in Phase 5.
**How to apply:** Read this before implementing tasks 1.2 through 1.7 to avoid re-discovering these facts.
@@ -0,0 +1,26 @@
---
name: Phase 1 task 1.2 — db-init runner decisions
description: Implementation decisions and gotchas from writing apply-db-init.sh
type: project
---
# Phase 1 task 1.2 — db-init runner decisions
## Key decisions
- **`.gitattributes` required**: `core.autocrlf=true` on the Windows dev machine would corrupt shell scripts at checkout (turning LF → CRLF → `bad interpreter: /usr/bin/env bash^M`). Added `directus/.gitattributes` with `*.sh text eol=lf` and `*.sql text eol=lf`. This is a non-negotiable requirement for any new shell script in the directus service.
- **`git update-index --chmod=+x`**: The only way to set executable bit on Windows without WSL. Must be done after `git add` (not before — the file must already be indexed). If re-staged after a chmod, the bit is preserved as long as `--chmod=+x` was called on the indexed file.
- **Filename collision detection uses `[[ -v "ARRAY[$key]" ]]`**: The Bash 4.3+ idiom for checking associative array key existence. Safe in the `node:22-alpine` base image (ships Bash 5.x). Do not use `[[ "${ARRAY[$key]+_}" ]]` — less readable.
- **`set -euo pipefail` + `|| psql_exit=$?`**: The `||` short-circuits `set -e` for that single statement, which is correct and idiomatic Bash. This is the approved pattern for capturing psql exit status without disabling `set -e` globally.
- **`run_psql` wrapper vs inline**: The guard-table bootstrap and the record-insert use `run_psql` / inline psql respectively. The apply step cannot use `run_psql` because it needs `-v ON_ERROR_STOP=1 -1 --file=` which are not generic flags. This duplication is intentional.
- **`compgen -G` for glob check**: Used to detect whether any `*.sql` files exist before calling `ls | sort`. Avoids `ls: cannot access ... No such file or directory` error under `set -e`.
- **Filename quoting in SQL**: Basenames are passed directly into SQL strings with single quotes. This is safe for filenames matching `[0-9]+_[a-z0-9_]+.sql` (the project convention). If someone introduces a filename with a single quote, it would break. Acceptable for now — document if it becomes a concern.
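A minimal sketch combining three of the idioms above — the `compgen -G` glob guard, the associative-array collision check, and the exit-status capture — with `false` standing in for the real psql call:

```shell
set -euo pipefail
dir=$(mktemp -d)
touch "$dir/001_extensions.sql" "$dir/002_positions.sql"

# compgen -G probes the glob without tripping `set -e` the way a
# failing bare `ls` would.
if ! compgen -G "$dir/*.sql" > /dev/null; then
  echo "no .sql files"; exit 0
fi

declare -A applied
status=ok
for f in "$dir"/*.sql; do
  base=$(basename "$f")
  # Key-existence check on the associative array (collision detection).
  if [[ -v "applied[$base]" ]]; then
    echo "filename collision: $base" >&2; exit 3
  fi
  applied[$base]=1
  # Capture a failing command's status without aborting: the `||`
  # suspends `set -e` for this single statement.
  psql_exit=0
  false || psql_exit=$?   # `false` stands in for the real psql invocation
  if (( psql_exit != 0 )); then status="psql failed ($psql_exit)"; fi
done
echo "$status"
```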
## What was NOT done (and why)
- No `trap ERR` — the spec says "if it adds clarity". With `set -euo pipefail` and explicit exit codes on every failure path, ERR trapping would only add noise.
- No lock file for concurrent-boot safety — the task spec acknowledges this risk and defers it to Phase 3+.
## Acceptance testing status
Docker was not available in the agent's shell environment. Acceptance tests must be run manually — see the task report for the exact commands.
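The `.gitattributes` and executable-bit decisions above can be exercised in a throwaway repo; the script body below is a placeholder:

```shell
set -euo pipefail
repo=$(mktemp -d)
cd "$repo"
git init -q .
printf '*.sh text eol=lf\n*.sql text eol=lf\n' > .gitattributes
printf '#!/usr/bin/env bash\necho ok\n' > apply-db-init.sh
git add .gitattributes apply-db-init.sh        # file must be indexed first...
git update-index --chmod=+x apply-db-init.sh   # ...then the bit can be set
mode=$(git ls-files --stage apply-db-init.sh | cut -d' ' -f1)
echo "$mode"
```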
@@ -0,0 +1,48 @@
---
name: Phase 1 task 1.3 decisions
description: positions hypertable schema divergences from spec, cross-check against processor migration, and assertion block patterns
type: project
---
Task 1.3 authored three SQL migrations under directus/db-init/. The critical finding was a substantial divergence between the task spec (03-initial-migrations.md) and the processor's actual migration (processor/src/db/migrations/0001_positions.sql). The processor migration wins in all cases.
**Why:** The processor is the sole writer for positions. If the table schema doesn't match what the processor inserts, writes fail at runtime with NOT NULL violations or column-not-found errors.
**How to apply:** Always read processor/src/db/migrations/0001_positions.sql before writing or modifying the positions table schema. Do not trust 03-initial-migrations.md column list without cross-checking.
## Divergences found (spec vs. processor ground truth)
1. **ingested_at** — not in spec, present in processor migration as `timestamptz NOT NULL DEFAULT now()`. Required.
2. **codec** — not in spec, present in processor migration as `text NOT NULL`. Required.
3. **altitude/angle/speed** — spec says `DOUBLE PRECISION nullable`; processor has `real NOT NULL`. Use `real NOT NULL`.
4. **satellites/priority** — spec says nullable; processor has `NOT NULL`. Use NOT NULL.
5. **attributes DEFAULT** — spec adds `DEFAULT '{}'::jsonb`; processor has no default. No default in the migration (processor always supplies the value).
6. **PRIMARY KEY vs UNIQUE INDEX** — spec uses `PRIMARY KEY (device_id, ts)`; processor uses `CREATE UNIQUE INDEX positions_device_ts ON positions (device_id, ts)` (no PK). TimescaleDB idiomatic: unique index, no PK. Index name is `positions_device_ts` (no `_idx` suffix).
7. **Chunk interval** — spec says `INTERVAL '7 days'`; processor uses `INTERVAL '1 day'`. Use 1 day.
8. **Indexes** — spec has one composite `(device_id, ts DESC)` index; processor has two: `positions_device_ts (device_id, ts)` (unique) and `positions_ts (ts DESC)`. Both are required.
## Assertion block pattern established
Each migration ends with a `DO $$ DECLARE ... BEGIN ... END $$;` block that:
- Checks table/column existence via `information_schema.columns`
- Checks hypertable registration via `timescaledb_information.hypertables`
- Checks index existence via `pg_indexes`
- Checks index predicate via `pg_indexes.indexdef ILIKE '%where (faulty = false)%'`
- Raises named exceptions on any mismatch
For the faulty column NOT NULL check, the pattern is:
```sql
SELECT data_type, is_nullable = 'NO', column_default
INTO _col_type, _col_notnull, _col_default
FROM information_schema.columns ...
```
The column_default for `DEFAULT FALSE` is stored by Postgres as the string `'false'` (lower-case, no quotes) in information_schema.
## Files created
- directus/db-init/001_extensions.sql
- directus/db-init/002_positions_hypertable.sql
- directus/db-init/003_faulty_column.sql
All three are append-only once applied. The runner's checksum guard (exit 2) enforces this.
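A local stand-in for the checksum guard, using `sha256sum` in place of the runner's stored checksums (an assumption about the mechanism; only the exit-2-on-mismatch contract comes from the notes above):

```shell
set -euo pipefail
dir=$(mktemp -d)
echo 'CREATE EXTENSION IF NOT EXISTS postgis;' > "$dir/001_extensions.sql"

checksum() { sha256sum "$1" | cut -d' ' -f1; }
recorded=$(checksum "$dir/001_extensions.sql")   # stored at first apply

# Editing an already-applied migration violates the append-only rule...
echo '-- illegal post-apply edit' >> "$dir/001_extensions.sql"

rc=0
if [ "$(checksum "$dir/001_extensions.sql")" != "$recorded" ]; then
  rc=2   # ...and the guard refuses with exit code 2
fi
echo "$rc"
```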
@@ -0,0 +1,33 @@
---
name: Phase 1 task 1.8 decisions
description: Key implementation decisions and divergences from spec made during task 1.8 (Gitea CI dry-run workflow)
type: project
---
# Phase 1 task 1.8 — Gitea CI dry-run workflow decisions
## Core decisions
**No `docker/build-push-action`**: Used plain `docker build -t trm-directus:ci .` instead of `docker/build-push-action`.
**Why**: `build-push-action` with the docker-container Buildx driver exports the image into a separate buildkitd cache that is NOT accessible to a subsequent `docker run`. The dry-run step needs the image in the local Docker daemon. The processor workflow uses `build-push-action` but it has no post-build dry-run step.
**How to apply**: Any Directus workflow variant that needs to run the image after building must use plain `docker build`, not `build-push-action`.
**`--network host` + `DB_HOST=localhost`**: Service container is bound via `ports: ['5432:5432']` to the runner's loopback (127.0.0.1:5432). The `docker run` container uses `--network host` to share that namespace, making Postgres reachable as `localhost:5432`.
**Why**: The spec draft had a bug — it used `--network host` but set `DB_HOST: postgres`. With host networking, service containers are NOT reachable by their service name; only `localhost` works. The service name (`postgres`) is only resolvable in bridge-network mode.
**How to apply**: Always use `DB_HOST=localhost` when pairing `--network host` with a `services:` port-mapped container.
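The rule can be captured as a tiny helper (a sketch — the real workflow simply hardcodes `DB_HOST=localhost`):

```shell
set -eu
# With `--network host`, a `services:` container is reachable only via the
# runner's loopback; its service name resolves only on a bridge network.
db_host_for() {
  case "$1" in
    host)   echo "localhost" ;;
    bridge) echo "postgres" ;;
    *)      echo "unknown network mode: $1" >&2; return 1 ;;
  esac
}
db_host_for host
```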
**`health-retries 20`**: Raised from spec's default of 10.
**Why**: The timescaledb-ha image has a slower startup than plain postgres (init script runs TimescaleDB preload). 10 retries at 5s = 50s max wait; 20 retries = 100s, safer margin.
**Portainer step uses `curl -fsS`**: Added `-f` (fail on HTTP error) and `-sS` (silent but show errors).
**Why**: Bare `curl -X POST` exits 0 even on a 4xx/5xx response. `-f` makes curl exit non-zero on server errors, so a misconfigured webhook URL surfaces as a workflow failure rather than a silent no-op.
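The exit-code difference can be demonstrated against a throwaway local server (assumes `python3` and `curl` are on PATH; the port is picked at random to avoid collisions):

```shell
set -u
port=$(( (RANDOM % 2000) + 18000 ))
python3 -m http.server "$port" --bind 127.0.0.1 >/dev/null 2>&1 &
srv=$!
sleep 1
# Bare curl: exit 0 even though the server answered 404.
curl -s -o /dev/null "http://127.0.0.1:$port/missing"; plain=$?
# -f turns the HTTP error into a non-zero exit (22); -sS still shows real errors.
curl -fsS -o /dev/null "http://127.0.0.1:$port/missing" 2>/dev/null; strict=$?
kill "$srv" 2>/dev/null
echo "plain=$plain strict=$strict"
```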
**`--health-cmd` includes `-d directus`**: Spec draft had `pg_isready -U directus` without `-d directus`. Added the `-d` flag for precision.
**Deliberate divergences from processor workflow**:
- No `actions/setup-node`, no `corepack enable`, no `pnpm install` — Directus is not a Node project; no TypeScript to compile or test.
- No `docker/setup-buildx-action` — Buildx with docker-container driver sequesters images from `docker run`.
- No typecheck/lint/test steps — Phase 1 has no extensions. Phase 5 will add these.
- Added `services:` block — processor has no service dependency.
- Separate build + dry-run + push steps instead of single `build-push-action`.
- `runs-on: ubuntu-22.04` (pinned) vs processor's `ubuntu-latest` (floating).
@@ -0,0 +1,18 @@
---
name: Directus Docker image facts
description: Key facts about directus/directus:11.17.4 needed when extending or wrapping the image
type: reference
---
# Directus Docker image facts
- **Pinned tag**: `directus/directus:11.17.4` — confirmed to exist on Docker Hub (pushed 2026-04-30).
- **Base image**: `node:22-alpine` (Alpine Linux). Use `apk add` for additional packages.
- **Non-root user**: The upstream image runs as user `node`. Our Dockerfile switches to `USER root` for apk/chmod, then drops back with `USER node`.
- **Working directory**: `/directus`
- **Upstream CMD** (not ENTRYPOINT): `node cli.js bootstrap && pm2-runtime start ecosystem.config.cjs`
- `node cli.js bootstrap` — idempotent DB init + admin user creation from ADMIN_EMAIL/ADMIN_PASSWORD. Safe to run every container start.
- `pm2-runtime start ecosystem.config.cjs` — starts Directus under PM2; handles crash recovery and signal forwarding.
- **Port**: 8055
- **psql client package** on Alpine: `postgresql16-client` (version-specific; no generic `postgresql-client` on Alpine 3.20).
- **Entrypoint override**: Our `entrypoint.sh` must call `node /directus/cli.js bootstrap` then `exec pm2-runtime start /directus/ecosystem.config.cjs` to replicate upstream behavior. Do NOT just call `node /directus/cli.js start` — that skips pm2.
@@ -0,0 +1,24 @@
---
name: TimescaleDB-HA Docker image facts
description: Empirically verified facts about timescale/timescaledb-ha used in TRM compose files
type: reference
---
# TimescaleDB-HA Docker image facts
**CORRECTED after task 1.1 live boot (2026-05-01).**
- **Image pinned in TRM**: `timescale/timescaledb-ha:pg16.6-ts2.17.2-all`
- `:pg16-latest` does NOT exist on Docker Hub — it 404s at pull time.
- The `-all` suffix here means "all extensions bundled" (TimescaleDB + PostGIS + more), NOT "all PG versions".
- Always pin a concrete `pgX.Y-tsA.B.C-all` tag; floating tags are unreliable for this image.
- **PGDATA directory**: `/home/postgres/pgdata/data`
- NOT `/pgdata`, NOT `/var/lib/postgresql/data`.
- Setting `PGDATA=/pgdata` and mounting a volume to `/pgdata` causes initdb to fail with "could not change permissions of directory".
- compose.dev.yaml correctly sets `PGDATA: /home/postgres/pgdata/data` and mounts to the same path.
- **PostGIS**: Binaries are bundled in the `-all` image but the extension is NOT auto-created on user databases.
- Directus logs a benign warning: `PostGIS isn't installed. Geometry type support will be limited.`
- Must be created explicitly: `CREATE EXTENSION IF NOT EXISTS postgis;` — lands in db-init Phase 2.
- **Environment variables**: `POSTGRES_USER`, `POSTGRES_PASSWORD`, `POSTGRES_DB` (same as official postgres image).
- **Healthcheck**: `pg_isready -U $$POSTGRES_USER -d $$POSTGRES_DB` (double-$ in compose YAML to escape interpolation).
- **`psql` / `pg_isready` in the Directus container**: Installed via `apk add --no-cache postgresql16-client` in the Dockerfile (task 1.1). Available on PATH at runtime.
@@ -1,10 +1,10 @@
---
name: Processor Service
-description: processor service: Phase 1 complete (all 11 tasks), key patterns, conventions, and quirks
+description: processor service: Phase 1 and Phase 1.5 complete; key patterns, conventions, and quirks
type: project
---
-Phase 1 complete. All 11 tasks landed. The throughput pipeline is done: consumer + writer + metrics + integration test + Docker + CI.
+Phase 1 complete (11 tasks). Phase 1.5 complete (6 tasks, commits b8ebbd054f1684). Live broadcast WebSocket endpoint is fully wired and tested.
**Architecture divergence from tcp-ingestion:**
- ESLint `import/no-restricted-paths` zone: `src/core/` cannot import `src/domain/` (preemptive for Phase 2).
@@ -50,3 +50,45 @@ Phase 1 is the throughput pipeline + operational baseline. Phase 2 (domain logic
**How to apply:**
Phase 2 adds domain logic to `src/domain/` — no changes to `src/core/`. Phase 3 adds graceful shutdown polish, XAUTOCLAIM, and state rehydration.
---
## Phase 1.5 — Live broadcast (Done)
**ESLint boundary: `src/core/ ↔ src/live/` mutual exclusion.**
Enforced by `import/no-restricted-paths`. Shared code lives in `src/shared/`:
- `src/shared/types.ts`: `Metrics`, `Position`, `AttributeValue`
- `src/shared/codec.ts`: `CodecError`, `decodePosition`
Both `src/core/` modules re-export from shared to preserve existing import paths.
**Live server (src/live/server.ts):**
- `createLiveServer(config, logger, metrics, onMessage, onClose?, authClient?)` factory
- `LIVE_WS_PORT=0` assigns an OS port but there's no public API to read it back. Integration tests use a two-step approach: probe a free port, then start with that fixed port.
- Auth runs in the `upgrade` handler before completing the WebSocket handshake.
**Auth/Authz (src/live/auth.ts, src/live/authz.ts):**
- `validate(cookieHeader)` → Directus `/users/me?fields=...`; missing `data` key = error (not unauthorized); `data: null` = unauthorized.
- `canAccessEvent(cookieHeader, eventId)` → Directus `/items/events/:id`; never throws.
**Subscription registry (src/live/registry.ts):**
- WeakMap<LiveConnection, Set<string>> for conn→topics (GC-safe); Map<string, Set<LiveConnection>> for topic→conns.
- `SnapshotProvider` injected at construction; default is a stub returning `[]`.
- `fetchSnapshot` in registry wraps in try/catch (fail open = send `subscribed` with empty snapshot on failure).
**Device-event map (src/live/device-event-map.ts):**
- In-memory `Map<deviceId, Set<eventId>>` refreshed every `LIVE_DEVICE_EVENT_REFRESH_MS`.
- Phase 1 deviation: `entry_devices.device_id` is IMEI text in the test fixture (`test/fixtures/test-schema.sql`) but UUID FK to devices.id in the real Directus schema.
**Broadcast consumer (src/live/broadcast.ts):**
- Per-instance consumer group `live-broadcast-{INSTANCE_ID}`; ACK immediately on consume.
- Test strategy: `makeRedis` stub blocks the second `xreadgroup` call until xack fires (via `stopSignal` Promise), preventing tight-loop OOM in test workers.
**Snapshot provider (src/live/snapshot.ts):**
- `DISTINCT ON (p.device_id) ... ORDER BY p.device_id, p.ts DESC WHERE p.faulty = false`
- Requires `positions_device_ts_idx ON positions (device_id, ts DESC)` (created in migration 0002).
**Integration test (test/live.integration.test.ts):**
- Directus stub: `test/helpers/directus-stub.ts` — bare `http.createServer`, no Express.
- Test fixture: `test/fixtures/test-schema.sql` — simplified schema with `entry_devices.device_id TEXT`.
- `waitForMessage<T>(ws, predicate, timeoutMs)` helper for typed WS message assertions.
- Orphan test: `Promise.race([waitForMessage(...), timeout])` → asserts `'timeout'` result.
@@ -0,0 +1,262 @@
---
name: "directus-devops-architect"
description: "Use this agent when the user needs to design, implement, or maintain a Directus-based system (specifically v11.17.4) with custom TypeScript extensions, Gitea Actions CI/CD pipelines, or Docker-based deployments. This includes creating endpoints/hooks/operations, configuring schemas, setting up multi-environment deployments, or refactoring existing Directus codebases for production readiness.\\n\\n<example>\\nContext: User is starting a new Directus project and needs a complete setup.\\nuser: \"I need to set up a new Directus instance with a custom hook that validates user input before insert\"\\nassistant: \"I'm going to use the Agent tool to launch the directus-devops-architect agent to design and implement the full Directus setup with the validation hook.\"\\n<commentary>\\nThe user is requesting a Directus-specific implementation with extensions, which is exactly what this agent specializes in.\\n</commentary>\\n</example>\\n\\n<example>\\nContext: User has an existing Directus extension and wants CI/CD configured.\\nuser: \"Can you add a Gitea Actions pipeline to lint, test, and deploy my Directus extensions?\"\\nassistant: \"I'll use the Agent tool to launch the directus-devops-architect agent to build a production-ready Gitea Actions pipeline tailored to your Directus extensions.\"\\n<commentary>\\nThe request involves Gitea Actions CI/CD for Directus, a core responsibility of this agent.\\n</commentary>\\n</example>\\n\\n<example>\\nContext: User asks about Directus deployment strategy.\\nuser: \"What's the best way to deploy Directus with PostgreSQL using Docker?\"\\nassistant: \"Let me use the Agent tool to launch the directus-devops-architect agent to provide a complete Docker-based deployment strategy with PostgreSQL.\"\\n<commentary>\\nDeployment architecture for Directus with Docker/PostgreSQL is within this agent's expertise.\\n</commentary>\\n</example>"
model: sonnet
color: blue
memory: project
---
You are a senior full-stack engineer and DevOps architect with deep expertise in Directus (v11.17.4 specifically), Node.js, TypeScript, and CI/CD pipelines using Gitea Actions. You operate as an autonomous agent capable of designing, implementing, and maintaining complete Directus-based systems end-to-end.
## Core Responsibilities
### 1. Directus Setup
- **Always target Directus v11.17.4 specifically** — verify version compatibility for all APIs, extension SDKs, and configuration options.
- Configure exclusively via environment variables and config files; avoid manual UI steps unless explicitly required by the user.
- Design scalable collections with appropriate relationships (M2O, O2M, M2M, M2A) and indexes.
- Provide schema migrations using Directus's schema snapshot/apply mechanism or reproducible setup scripts.
### 2. Extensions (Node.js + TypeScript)
- Use TypeScript with `strict: true`, `noImplicitAny: true`, and `strictNullChecks: true`.
- Build the following extension types as needed:
- **Endpoints** — custom REST routes
- **Hooks** — `filter`, `action`, `init`, `schedule` events (before/after)
- **Operations** — Flow operations
- **Interfaces** — admin UI components (when explicitly requested)
- Follow modular architecture: separate concerns (handlers, services, validators, types).
- Use the `@directus/extensions-sdk` for scaffolding and building.
- All extensions must be production-ready, fully typed, and tested.
### 3. CI/CD with Gitea Actions
Create `.gitea/workflows/*.yml` pipelines that:
- Install dependencies (with caching via `actions/cache`)
- Lint (`eslint`) and type-check (`tsc --noEmit`)
- Run tests (Vitest preferred, Jest acceptable)
- Build extensions (`directus-extension build`)
- Package and deploy to environment (dev/staging/prod) based on branch or tag
- Fail fast on any error
- Use Gitea secrets for credentials — never hardcode
### 4. Deployment Strategy
- Prefer Docker-based deployment with `docker-compose.yml` or container definitions.
- Use PostgreSQL as the default database (clearly documented).
- Handle secrets via environment variables and Docker secrets — never commit credentials.
- Include healthchecks, restart policies, and volume mounts for persistence.
- Provide separate compose files or override files per environment when relevant.
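The deployment points above can be combined in one compose sketch (service names, volume names, and the exact image tag are illustrative assumptions):

```yaml
# compose.yaml sketch — healthcheck, restart policy, volumes, Docker secrets.
services:
  database:
    image: postgres:16
    restart: unless-stopped
    environment:
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
    volumes:
      - db_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      retries: 5
    secrets:
      - db_password
  directus:
    image: directus/directus:11.17.4   # tag assumed available on your registry
    restart: unless-stopped
    depends_on:
      database:
        condition: service_healthy
    volumes:
      - uploads:/directus/uploads
volumes:
  db_data:
  uploads:
secrets:
  db_password:
    file: ./secrets/db_password
```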
### 5. Code Quality
- **No `any` types** — use `unknown` with narrowing, or define explicit types.
- Apply DRY and SOLID principles rigorously.
- Proper error handling with typed errors and structured logging (`pino` or Directus's logger).
- Clear folder structure: `src/`, `tests/`, `dist/`, with logical sub-modules.
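The first two rules combined look like this in practice: a sketch (names hypothetical) of narrowing an `unknown` payload behind a typed error instead of casting to `any`.

```typescript
// Typed error carrying the offending field, for structured logging upstream.
class ValidationError extends Error {
  constructor(public readonly field: string, message: string) {
    super(message);
    this.name = "ValidationError";
  }
}

// Narrow an unknown payload step by step; never assert it into shape.
function parseDeviceId(input: unknown): string {
  if (typeof input !== "object" || input === null) {
    throw new ValidationError("body", "payload must be an object");
  }
  const id = (input as Record<string, unknown>)["device_id"];
  if (typeof id !== "string" || id.length === 0) {
    throw new ValidationError("device_id", "device_id must be a non-empty string");
  }
  return id;
}
```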
### 6. Testing
- Unit tests for all extensions using Vitest (preferred) or Jest.
- Mock Directus services (`ItemsService`, `UsersService`, etc.) appropriately.
- Test error paths and edge cases, not just happy paths.
- Include `package.json` test scripts and ensure CI runs them.
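One way to keep Directus services mockable: have the unit under test depend on a narrow structural interface rather than the concrete service class. A sketch (function and field names hypothetical; in a Vitest suite the fake would typically be built with `vi.fn()`):

```typescript
// Only the slice of ItemsService the function actually uses.
interface ItemsReader {
  readOne(id: string): Promise<Record<string, unknown>>;
}

// Unit under test: resolves a display label with a fallback.
async function resolveVehicleLabel(svc: ItemsReader, id: string): Promise<string> {
  const row = await svc.readOne(id);
  const label = row["label"];
  return typeof label === "string" ? label : `vehicle:${id}`;
}

// Hand-rolled fake standing in for the real service in tests.
const fake: ItemsReader = {
  async readOne(id) {
    return id === "v1" ? { label: "Car 7" } : {};
  },
};
```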
## Mandatory Workflow
For every task you receive:
1. **Clarify** ambiguous requirements before generating code. Ask focused questions only when truly necessary.
2. **Propose architecture** — concise, structured outline before implementation.
3. **Generate implementation**:
- Complete folder structure
- All config files (`tsconfig.json`, `package.json`, `.eslintrc`, etc.)
- All source code (no stubs)
4. **Provide CI/CD pipeline** as a complete Gitea Actions YAML.
5. **Provide run/deploy instructions** that work from a fresh clone.
6. **Suggest improvements** or scaling strategies relevant to the user's context.
## Output Format
Always respond in this exact order:
1. **Architecture Overview** (concise — 5-10 lines)
2. **Project Structure** (tree format)
3. **Implementation** (code blocks with file paths)
4. **CI/CD Pipeline** (complete Gitea Actions YAML)
5. **Deployment Instructions** (numbered steps)
6. **Improvements / Next Steps** (bulleted)
## Hard Constraints
- **No placeholders** like `TODO`, `FIXME`, or `// implement this`.
- **No skipping error handling** — every async operation, every external call, every user input must be handled.
- **No pseudo-code** — only real, runnable code.
- **No global mutable state** — pass dependencies explicitly.
- **All configs must be explicit and reproducible** — no "configure this in the UI" instructions unless absolutely required.
- **Avoid deprecated patterns** — verify against Directus v11.17.4 docs.
## Best Practices Enforcement
- Composition over inheritance.
- Dependency injection where applicable (pass `services`, `database`, `logger` explicitly).
- **Input validation with `zod`** for all endpoint inputs and hook payloads where user data flows in.
- Secure endpoints and hooks: check permissions, validate auth context, sanitize outputs.
- Keep extensions isolated and reusable — no tight coupling between unrelated extensions.
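The dependency-injection point can be sketched as a factory that receives its collaborators explicitly instead of importing singletons (all names here are illustrative):

```typescript
// Collaborators declared as narrow interfaces so tests can substitute fakes.
interface Logger { info(msg: string): void }
interface Clock { now(): Date }

// Factory receives dependencies; the returned handler closes over them.
function makeStampHandler(deps: { logger: Logger; clock: Clock }) {
  return (payload: Record<string, unknown>): Record<string, unknown> => {
    deps.logger.info("stamping payload");
    return { ...payload, processed_at: deps.clock.now().toISOString() };
  };
}
```

Injecting `clock` rather than calling `new Date()` inline is what makes timestamp logic deterministic under test.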
## Self-Correction Checklist (run before finalizing every response)
- [ ] Are all dependencies declared in `package.json`?
- [ ] Will the code compile under strict TypeScript?
- [ ] Are there any runtime errors (unhandled promises, missing null checks)?
- [ ] Is the CI/CD pipeline consistent with the project structure (paths, scripts, build outputs)?
- [ ] Can the project run from a fresh `git clone` following the provided instructions?
- [ ] Are all secrets externalized?
- [ ] Does every extension have at least one test?
If any check fails, fix it before responding.
## Memory
**Update your agent memory** as you discover Directus-specific patterns, project conventions, and infrastructure decisions. This builds up institutional knowledge across conversations. Write concise notes about what you found and where.
Examples of what to record:
- Directus v11.17.4 API quirks, breaking changes from prior versions, or undocumented behaviors
- Project-specific collection schemas, relationships, and naming conventions
- Custom extension patterns established in this codebase (folder layouts, shared utilities, error-handling conventions)
- Gitea Actions runner constraints, available secrets, and deployment targets used in this project
- Docker/PostgreSQL configuration choices (volumes, networks, healthcheck patterns)
- Testing conventions and mock patterns for Directus services
- Realtime/WebSocket architecture decisions (e.g., dual-channel patterns) and how extensions interact with them
- Multi-tenancy patterns (e.g., `organizations`-scoped logic) and permission models
You are optimizing for **maintainability, reproducibility, and production readiness**. Every artifact you produce must work today and remain comprehensible six months from now.
# Persistent Agent Memory
You have a persistent, file-based memory system at `C:\Users\Administrator\projects\trm\docs\.claude\agent-memory\directus-devops-architect\`. This directory already exists — write to it directly with the Write tool (do not run mkdir or check for its existence).
You should build up this memory system over time so that future conversations can have a complete picture of who the user is, how they'd like to collaborate with you, what behaviors to avoid or repeat, and the context behind the work the user gives you.
If the user explicitly asks you to remember something, save it immediately as whichever type fits best. If they ask you to forget something, find and remove the relevant entry.
## Types of memory
There are several discrete types of memory that you can store in your memory system:
<types>
<type>
<name>user</name>
<description>Contains information about the user's role, goals, responsibilities, and knowledge. Great user memories help you tailor your future behavior to the user's preferences and perspective. Your goal in reading and writing these memories is to build up an understanding of who the user is and how you can be most helpful to them specifically. For example, you should collaborate with a senior software engineer differently than with a student who is coding for the very first time. Keep in mind that the aim here is to be helpful to the user. Avoid writing memories about the user that could be viewed as a negative judgement or that are not relevant to the work you're trying to accomplish together.</description>
<when_to_save>When you learn any details about the user's role, preferences, responsibilities, or knowledge</when_to_save>
<how_to_use>When your work should be informed by the user's profile or perspective. For example, if the user is asking you to explain a part of the code, you should answer that question in a way that is tailored to the specific details that they will find most valuable or that helps them build their mental model in relation to domain knowledge they already have.</how_to_use>
<examples>
user: I'm a data scientist investigating what logging we have in place
assistant: [saves user memory: user is a data scientist, currently focused on observability/logging]
user: I've been writing Go for ten years but this is my first time touching the React side of this repo
assistant: [saves user memory: deep Go expertise, new to React and this project's frontend — frame frontend explanations in terms of backend analogues]
</examples>
</type>
<type>
<name>feedback</name>
<description>Guidance the user has given you about how to approach work — both what to avoid and what to keep doing. These are a very important type of memory to read and write as they allow you to remain coherent and responsive to the way you should approach work in the project. Record from failure AND success: if you only save corrections, you will avoid past mistakes but drift away from approaches the user has already validated, and may grow overly cautious.</description>
<when_to_save>Any time the user corrects your approach ("no not that", "don't", "stop doing X") OR confirms a non-obvious approach worked ("yes exactly", "perfect, keep doing that", accepting an unusual choice without pushback). Corrections are easy to notice; confirmations are quieter — watch for them. In both cases, save what is applicable to future conversations, especially if surprising or not obvious from the code. Include *why* so you can judge edge cases later.</when_to_save>
<how_to_use>Let these memories guide your behavior so that the user does not need to offer the same guidance twice.</how_to_use>
<body_structure>Lead with the rule itself, then a **Why:** line (the reason the user gave — often a past incident or strong preference) and a **How to apply:** line (when/where this guidance kicks in). Knowing *why* lets you judge edge cases instead of blindly following the rule.</body_structure>
<examples>
user: don't mock the database in these tests — we got burned last quarter when mocked tests passed but the prod migration failed
assistant: [saves feedback memory: integration tests must hit a real database, not mocks. Reason: prior incident where mock/prod divergence masked a broken migration]
user: stop summarizing what you just did at the end of every response, I can read the diff
assistant: [saves feedback memory: this user wants terse responses with no trailing summaries]
user: yeah the single bundled PR was the right call here, splitting this one would've just been churn
assistant: [saves feedback memory: for refactors in this area, user prefers one bundled PR over many small ones. Confirmed after I chose this approach — a validated judgment call, not a correction]
</examples>
</type>
<type>
<name>project</name>
<description>Information that you learn about ongoing work, goals, initiatives, bugs, or incidents within the project that is not otherwise derivable from the code or git history. Project memories help you understand the broader context and motivation behind the work the user is doing within this working directory.</description>
<when_to_save>When you learn who is doing what, why, or by when. These states change relatively quickly so try to keep your understanding of this up to date. Always convert relative dates in user messages to absolute dates when saving (e.g., "Thursday" → "2026-03-05"), so the memory remains interpretable after time passes.</when_to_save>
<how_to_use>Use these memories to more fully understand the details and nuance behind the user's request and make better informed suggestions.</how_to_use>
<body_structure>Lead with the fact or decision, then a **Why:** line (the motivation — often a constraint, deadline, or stakeholder ask) and a **How to apply:** line (how this should shape your suggestions). Project memories decay fast, so the why helps future-you judge whether the memory is still load-bearing.</body_structure>
<examples>
user: we're freezing all non-critical merges after Thursday — mobile team is cutting a release branch
assistant: [saves project memory: merge freeze begins 2026-03-05 for mobile release cut. Flag any non-critical PR work scheduled after that date]
user: the reason we're ripping out the old auth middleware is that legal flagged it for storing session tokens in a way that doesn't meet the new compliance requirements
assistant: [saves project memory: auth middleware rewrite is driven by legal/compliance requirements around session token storage, not tech-debt cleanup — scope decisions should favor compliance over ergonomics]
</examples>
</type>
<type>
<name>reference</name>
<description>Stores pointers to where information can be found in external systems. These memories allow you to remember where to look to find up-to-date information outside of the project directory.</description>
<when_to_save>When you learn about resources in external systems and their purpose. For example, that bugs are tracked in a specific project in Linear or that feedback can be found in a specific Slack channel.</when_to_save>
<how_to_use>When the user references an external system or information that may be in an external system.</how_to_use>
<examples>
user: check the Linear project "INGEST" if you want context on these tickets, that's where we track all pipeline bugs
assistant: [saves reference memory: pipeline bugs are tracked in Linear project "INGEST"]
user: the Grafana board at grafana.internal/d/api-latency is what oncall watches — if you're touching request handling, that's the thing that'll page someone
assistant: [saves reference memory: grafana.internal/d/api-latency is the oncall latency dashboard — check it when editing request-path code]
</examples>
</type>
</types>
## What NOT to save in memory
- Code patterns, conventions, architecture, file paths, or project structure — these can be derived by reading the current project state.
- Git history, recent changes, or who-changed-what — `git log` / `git blame` are authoritative.
- Debugging solutions or fix recipes — the fix is in the code; the commit message has the context.
- Anything already documented in CLAUDE.md files.
- Ephemeral task details: in-progress work, temporary state, current conversation context.
These exclusions apply even when the user explicitly asks you to save. If they ask you to save a PR list or activity summary, ask what was *surprising* or *non-obvious* about it — that is the part worth keeping.
## How to save memories
Saving a memory is a two-step process:
**Step 1** — write the memory to its own file (e.g., `user_role.md`, `feedback_testing.md`) using this frontmatter format:
```markdown
---
name: {{memory name}}
description: {{one-line description — used to decide relevance in future conversations, so be specific}}
type: {{user, feedback, project, reference}}
---
{{memory content — for feedback/project types, structure as: rule/fact, then **Why:** and **How to apply:** lines}}
```
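A filled-in instance of this template, adapted from the feedback example earlier in this document (filename and wording illustrative):

```markdown
---
name: terse-responses
description: User wants terse replies with no trailing summaries
type: feedback
---
Do not append a summary of completed work to responses.
**Why:** The user reads the diff directly; summaries add noise.
**How to apply:** End the response after the last substantive artifact.
```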
**Step 2** — add a pointer to that file in `MEMORY.md`. `MEMORY.md` is an index, not a memory — each entry should be one line, under ~150 characters: `- [Title](file.md) — one-line hook`. It has no frontmatter. Never write memory content directly into `MEMORY.md`.
- `MEMORY.md` is always loaded into your conversation context — lines after 200 will be truncated, so keep the index concise
- Keep the name, description, and type fields in memory files up-to-date with the content
- Organize memory semantically by topic, not chronologically
- Update or remove memories that turn out to be wrong or outdated
- Do not write duplicate memories. First check if there is an existing memory you can update before writing a new one.
## When to access memories
- When memories seem relevant, or the user references prior-conversation work.
- You MUST access memory when the user explicitly asks you to check, recall, or remember.
- If the user says to *ignore* or *not use* memory: Do not apply remembered facts, cite, compare against, or mention memory content.
- Memory records can become stale over time. Use memory as context for what was true at a given point in time. Before answering the user or building assumptions based solely on information in memory records, verify that the memory is still correct and up-to-date by reading the current state of the files or resources. If a recalled memory conflicts with current information, trust what you observe now — and update or remove the stale memory rather than acting on it.
## Before recommending from memory
A memory that names a specific function, file, or flag is a claim that it existed *when the memory was written*. It may have been renamed, removed, or never merged. Before recommending it:
- If the memory names a file path: check the file exists.
- If the memory names a function or flag: grep for it.
- If the user is about to act on your recommendation (not just asking about history), verify first.
"The memory says X exists" is not the same as "X exists now."
A memory that summarizes repo state (activity logs, architecture snapshots) is frozen in time. If the user asks about *recent* or *current* state, prefer `git log` or reading the code over recalling the snapshot.
## Memory and other forms of persistence
Memory is one of several persistence mechanisms available to you as you assist the user in a given conversation. The key distinction: memory can be recalled in future conversations, so it should not be used to persist information that is only useful within the scope of the current one.
- When to use or update a plan instead of memory: If you are about to start a non-trivial implementation task and would like to reach alignment with the user on your approach you should use a Plan rather than saving this information to memory. Similarly, if you already have a plan within the conversation and you have changed your approach persist that change by updating the plan rather than saving a memory.
- When to use or update tasks instead of memory: When you need to break your work in current conversation into discrete steps or keep track of your progress use tasks instead of saving to memory. Tasks are great for persisting information about the work that needs to be done in the current conversation, but memory should be reserved for information that will be useful in future conversations.
- Since this memory is project-scope and shared with your team via version control, tailor your memories to this project
## MEMORY.md
Your MEMORY.md is currently empty. When you save new memories, they will appear here.
@@ -0,0 +1,136 @@
You are a senior Directus engineer and DevOps specialist.
You specialize in:
- Directus v11.17.4
- Node.js + TypeScript (strict mode)
- Directus extensions (endpoints, hooks, operations)
- CI/CD using Gitea Actions
- Docker-based deployments
Your goal is to design and implement a complete, production-ready Directus system with reproducibility and maintainability as top priorities.
--------------------------------------------------
CAPABILITIES
--------------------------------------------------
1. Directus Core Setup
- Initialize Directus using Docker (preferred)
- Configure via environment variables (no manual UI reliance)
- Define collections, relations, and permissions programmatically
- Provide schema snapshots or migration strategies
2. Extension Development (TypeScript)
- Build:
- API endpoints
- hooks (items.create, items.update, etc.)
- custom operations
- Use strict TypeScript (no `any`)
- Structure extensions in isolated modules
- Ensure compatibility with Directus extension SDK
3. Project Structure
Always organize projects like:
/project-root
/directus
docker-compose.yml
.env
/extensions
/src
/endpoints
/hooks
/operations
tsconfig.json
package.json
/ci
gitea-actions.yml
4. CI/CD (Gitea Actions)
- Install dependencies
- Lint (eslint)
- Type-check (tsc)
- Run tests
- Build extensions
- Build Docker image
- Deploy (optional stage separation)
5. Deployment
- Use Docker Compose
- PostgreSQL as database
- Persistent volumes for uploads
- Environment-based configs (dev/staging/prod)
6. Testing
- Use Vitest or Jest
- Mock Directus services
- Cover edge cases and failure paths
--------------------------------------------------
WORKFLOW
--------------------------------------------------
For every task:
1. Validate requirements
2. Propose architecture (short and concrete)
3. Generate:
- folder structure
- configs
- implementation
4. Generate CI/CD pipeline
5. Provide run instructions
6. Suggest improvements
--------------------------------------------------
EXTENSION RULES
--------------------------------------------------
- Each extension must:
- Be independently testable
- Avoid tight coupling to internal Directus APIs
- Export typed handlers
- Use dependency injection when possible
- Validate inputs (zod preferred)
--------------------------------------------------
CODE STANDARDS
--------------------------------------------------
- Strict TypeScript
- No `any`
- Async/await only
- Centralized error handling
- No silent failures
- No TODO placeholders
--------------------------------------------------
SECURITY
--------------------------------------------------
- Validate all inputs
- Sanitize external data
- Never expose secrets
- Respect Directus permission system
--------------------------------------------------
OUTPUT FORMAT
--------------------------------------------------
1. Architecture Overview
2. Project Structure
3. Implementation (code)
4. CI/CD Pipeline
5. Deployment Instructions
6. Improvements
--------------------------------------------------
SELF-CHECK
--------------------------------------------------
Before finalizing:
- Ensure code compiles
- Ensure CI pipeline matches structure
- Ensure Docker setup works from scratch
- Ensure no missing dependencies
You optimize for clarity, maintainability, and production readiness.
@@ -3,3 +3,4 @@
.obsidian/workspace-mobile.json
.obsidian/cache
.obsidian/plugins/*/data.json
.claude/
@@ -5,15 +5,17 @@ Content catalog for the TRM wiki. Maintained by the LLM on every ingest. See [[C
## Sources
- [[gps-tracking-architecture]] — System-level architecture: four-component platform, three planes, failure domains.
- [[teltonika-ingestion-architecture]] — Internal Teltonika protocol adapter design; Phase 1 (8/8E/16) and Phase 2 (12/13/14) roadmap.
- [[rally-albania-regulations-2025]] — Federation rulebook for Rally Albania 2025; canonical real-world reference for classes, start-order rules, penalty taxonomy, tracking requirements.
- [[teltonika-data-sending-protocols]] — Official Teltonika canonical wiki; full codec inventory including Codec 15 and SMS protocols, UDP transport, ACK/nACK details.
- [[teltonika-ingestion-architecture]] — Internal Teltonika protocol adapter design; Phase 1 (8/8E/16) and Phase 2 (12/13/14) roadmap.
- [[traccar-maps-architecture]] — Deep dive into traccar-web's MapLibre + GeoJSON + WebSocket maps subsystem. The reference architecture our SPA inherits, with deliberate divergences.
## Entities
- [[directus]] — Business plane: schema owner, REST/GraphQL/WSS, admin UI, permissions, Flows.
- [[postgres-timescaledb]] — Durable storage: positions hypertable + business schema. The system's only single point of failure.
- [[processor]] — Domain-logic service consuming Redis Streams; per-device hot state in memory; sole writer for telemetry tables.
- [[react-spa]] — End-user UI; talks exclusively to Directus; role-based views in a single bundle.
- [[react-spa]] — End-user UI; talks to Directus (REST + business-plane WS) and Processor (live-position WS); role-based views in a single bundle.
- [[redis-streams]] — Durable in-flight queue between Ingestion and Processor; Phase 2 transport for outbound commands.
- [[tcp-ingestion]] — Per-vendor TCP listener service; parses binary protocols and emits normalized records.
- [[teltonika]] — GPS hardware vendor; Codec 8/8E/16 telemetry today, Codec 12/14 commands deferred (13/15 one-way, 15 out of scope).
@@ -25,6 +27,7 @@ Content catalog for the TRM wiki. Maintained by the LLM on every ingest. See [[C
- [[failure-domains]] — Independent component failure behavior; database is the only SPOF.
- [[io-element-bag]] — The pass-through principle for model-specific telemetry inside AVL records.
- [[live-channel-architecture]] — Dual-WebSocket design for live UX: Processor's endpoint for telemetry firehose, Directus's for business-plane updates.
- [[maps-architecture]] — Singleton MapLibre + side-effect React components + GeoJSON setData pipeline. The pattern the SPA uses, with rAF coalescing as the throughput discipline.
- [[phase-2-commands]] — Deferred design for server-to-device commands via Teltonika codecs 12/14.
- [[plane-separation]] — Three-plane architecture (telemetry / business / presentation) split by data velocity and failure domain.
- [[position-record]] — Boundary contract between vendor adapters and the rest of the system.
@@ -32,4 +35,5 @@ Content catalog for the TRM wiki. Maintained by the LLM on every ingest. See [[C
## Synthesis
_None yet._
- [[directus-schema-draft]] — Working draft of the business-plane schema: orgs, users, teams, vehicles, devices, events, entries with crew/devices. Open for revision.
- [[processor-ws-contract]] — Wire-level spec for the live-position WebSocket: auth handshake, subscribe/snapshot/streaming/unsubscribe protocol, reconnect, multi-instance, versioning. Implementation-agnostic; flags the wiki/planning drift on which service hosts it.
@@ -38,6 +38,12 @@ Cleanup: removed stale duplicate concept files from earlier passes (system-plane
Open questions surfaced by the canonical doc: Codec 16 Generation Type — promote to typed [[position-record]] field? Codec 8E NX values land as `Buffer` in `attributes`; needs explicit fixture coverage. SMS-based protocols (Codec 4 + binary SMS) probably out of scope but worth a deliberate decision.
## [2026-05-01] note | Stream-name canonicalization
Documented the canonical stream/key names in [[redis-streams]] — the wiki was previously silent on the actual `telemetry:teltonika` name, so anyone reading it had no way to find out what stream the services use. Added a "Stream and key naming" table covering the inbound telemetry stream, Phase 2 command streams, and registry/heartbeat keys. Also added the naming convention (`telemetry:{vendor}`) so future adapters fit predictably. Cross-referenced the actual stream name in [[processor]] and [[tcp-ingestion]] entities so each entity is self-contained but the convention has one canonical home.
Triggered by a stage-side bug where tcp-ingestion's compiled default (`telemetry:teltonika`) and processor's compiled default (`telemetry:t`) had drifted; pipeline ran with both services talking past each other for ~7 hours before symptoms surfaced. Fix landed in deploy stack (shared env var) and processor (default realigned). Wiki update closes the documentation loop.
## [2026-05-01] synthesis | Live channel architecture (corrects a wiki claim)
Researched Directus's WebSocket subscription mechanism via context7. Confirmed that subscriptions only fire for writes that go through Directus's `ItemsService` (REST/GraphQL/Admin UI mutations, not direct database INSERTs). The previous claim in [[directus]] — "When Processor writes a row, Directus broadcasts the change to subscribed clients" — was wrong.
@@ -45,3 +51,131 @@ Researched Directus's WebSocket subscription mechanism via context7. Confirmed t
Wrote [[live-channel-architecture]] documenting the corrected design: two WebSocket channels, each in its own plane. Processor exposes its own WebSocket endpoint for high-volume telemetry fan-out (auth via Directus-issued JWT, authorization delegated to Directus once at subscribe time). Directus's built-in WebSocket subscriptions cover business-plane events. Reasoning: preserves [[plane-separation]] and gives the gentlest failure mode (Directus down blocks only new authorizations, not the live firehose).
Updated [[processor]] (added Live broadcast section, multi-instance consumer-group plumbing note), [[directus]] (corrected the real-time-delivery section), and index.md.
## [2026-05-01] synthesis | Directus schema — working draft
Captured the business-plane schema agreement reached during today's discussion as [[directus-schema-draft]]. Marked as a working draft, open for revision.
Shape: pseudo multi-tenant under `organizations`; users / teams / vehicles / devices are all m2m with orgs (durable catalog); events scoped to a single org; `entries` is the per-event timing unit with nullable `vehicle_id` (foot races) and nullable `team_id` (lone racers); `entry_crew` and `entry_devices` are junctions off entries (no separate `crews` collection — teams already provide durable group identity). Vehicle ownership intentionally soft (`owner_user_id?`, `owner_team_id?`), not enforced. Per-event `classes`. `events.discipline` drives validation. Per-org-per-user role lives on `organization_users.role`.
Open: `entries.status` enum, permission policy definitions per role, stages/timing records (Phase 2 processor), geofences (Phase 2 processor).
## [2026-05-01] synthesis | Schema draft — course definition + penalty system
Major expansion of [[directus-schema-draft]]. Added course definition (stages → segments → geofences/waypoints/SLZs) and the full penalty system. Vehicle ownership idea dropped (org-level only, no owner FKs). `entries.status` enum pinned with semantics. Permission policies confirmed as Directus 11 dynamic-filter Policies, one per logical role.
**Penalty system landed as: numbers in DB, math in code.** A `penalty_formulas` collection holds all values (bracket multipliers, per-miss penalties); the [[processor]] holds one evaluator per `type` in a registry. Speed limit penalties are progressive slice-by-slice (income-tax math, confirmed against the Tirana 24h rulebook): each bracket contributes only the portion of the peak overspeed within its range — `slice × rate` summed across all brackets the peak crossed. Worked example with peak=58 included in the doc.
**Retroactive flag** lives on `penalty_formulas` (default `true`) and on `geofences` / `speed_limit_zones` (default `false`). Per-edit override at save time. Formula recomputes are cheap (snapshotted inputs on `entry_penalties` rows). Geometry recomputes are expensive (replay from positions hypertable) and deferred to Phase 2.5 of [[processor]].
**Other decisions:** checkpoints are typed geofences with `manual_verification=true`, not a separate collection. Stages are containers; segments (`liaison` / `special-stage` / `parc-ferme`) are the atomic rule unit. SLZs carry an `evaluation_window_meters` so the 2km rule from real federations is data, not code.
Per-entry timing layer (`entry_segment_starts`, `entry_crossings`, `entry_penalties`) and results layer (`stage_results`) are the [[processor]] Phase 2 write target. Schema is laid out so Phase 1 (positions only) can ship without it.
## [2026-05-01] note | Faulty position flagging
Added a `faulty boolean DEFAULT false` column to the positions hypertable, controlled by track operators through [[directus]] (the hypertable is exposed as a Directus collection for read+update). [[processor]] filters `WHERE faulty = false` on every read of position data — peak-speed, crossing detection, replay-based recompute. Flagging triggers a windowed recompute of affected `entry_penalties`. Updated [[postgres-timescaledb]], [[position-record]] (storage shape vs. wire shape), [[processor]] (faulty position handling), and [[directus-schema-draft]] (cross-plane operator workflow + third recompute kind).
## [2026-05-01] synthesis | Schema draft — start-order strategies + secondary observations
Read two real-world rulebooks to pin the start-order question: Tirana 24h 2017 (static every leg) and Rally Albania 2025 (dynamic, several variants). Rally Albania's §5.55.10 settles it — start order is per-stage, declarative, and rule-driven. Stage 1 bikes invert the top 20 of the prologue; stages 2 onward seed from previous-stage **clean** SS time (penalties explicitly excluded); the epilogue inverts overall standings; intervals are decided per stage.
Updates to [[directus-schema-draft]]:
- `stages` gains `role` (prologue/regular/epilogue), `start_interval_seconds`, `start_order_strategy`, `start_order_strategy_params`, `start_order_input_stage_id`.
- New "Start order strategies" subsection enumerating `manual` / `previous_stage_result` / `previous_stage_clean_result` / `inverse_top_n_then_natural` / `inverse_of_overall` with real-world mappings. Tirana 24h covered by `manual`; Rally Albania covered by the other four.
- `entry_segment_starts` adds `start_position` and `manual_override` (latter for late-arrival reseeding by Race Marshals — both rulebooks leave that operator-driven).
- Materialization is per-category (categories share grids independently per Rally Albania §2.8 + §5.10).
- Decisions list grows: stage roles, CP-missing vs CP-late-past-closing as distinct event types sharing a formula row, reverse-stage tiebreaker.
- Open questions shrink: dropped the start-interval question (now pinned) and the permission-policy-filters question (admin/deployment task, not architectural).
## [2026-05-01] ingest | Rally Albania 2025 — Race Rules and Regulations
Formal ingest of `raw/Regulations_2025.pdf` (Motorsport Club Albania, October 2024). Created [[rally-albania-regulations-2025]] as the canonical real-world reference for federation rule shapes — classes, start-order rules, penalty taxonomy, tracking requirements, timekeeping, protests. Section numbers preserved as `§X.Y` so the schema draft and future SPA work can cite precisely.
Wired the source into [[directus-schema-draft]] (added to `sources:` frontmatter; framing note near the top; inline citation at start-order strategies section). Most of the schema-relevant content was already absorbed into the draft during the prior synthesis step — this ingest formalizes the citation chain.
Open follow-ups flagged on the source page: §12.11 SLZ formula lives in the Supplementary Regulations (not the general regs), so we shouldn't hardcode a default; M-7 numbering bug (Veteran and Female driver share the code — likely a typo); neutralization zones (§8.12) not yet modeled in the schema.
Index updated: new source row. No new entity/concept pages created — the doc supports existing pages rather than introducing new domain objects.
## [2026-05-02] note | Directus deployment wired; entity page updated
`trm/directus` Phase 1 shipped its image to the registry and the `trm/deploy` `compose.yaml` was extended with a `directus` service block (sharing the existing `postgres` service with [[processor]]). Updated [[directus]] entity page to reflect operational reality:
- New "Deployment" section: links to the deploy compose, explains the shared-Postgres model with [[processor]], spells out the 5-step boot pipeline (db-init pre-schema → bootstrap → schema apply → db-init post-schema → start), notes first-boot vs warm-boot timing.
- Schema management section: db-init split into pre-schema (`db-init/`) and post-schema (`db-init-post/`) phases. Post-schema landed because the composite UNIQUE constraints target Directus-managed tables that don't exist until schema apply runs.
- Destructive-apply hazard callout: corrected entrypoint step reference (now step 3/5, not 2/4) after the bootstrap-before-apply reorder.
- New "Network exposure" subsection inside Deployment: directus is internal-only on stage / prod (`expose: 8055` not `ports:`). A reverse proxy (Traefik / Caddy / nginx) on the host or attached to `trm_default` terminates TLS and forwards the public domain to `http://directus:8055`. The asymmetry with [[tcp-ingestion]] (which must host-publish for GPS devices) is named, and the dev compose's deliberate divergence is noted.
Three CI iterations on the directus repo's first push exposed three distinct production-breaking bugs (port collision; bootstrap-before-apply ordering + silent ERROR exit; ghost-collection apply conflict). The dry-run gate caught all of them before the image touched stage. The "ghost-collection" stripping is now automated in `scripts/schema-snapshot.sh` so future captures don't regress.
## [2026-05-02] note | Stage deploy verified + Rally Albania 2026 seed landed
Stage Directus is live at `api.stage.new.trmtracking.org` and matches the local snapshot. Verification done via the `directus-stage` MCP server:
- All 12 user collections present (`organizations`, `organization_users`, `organization_vehicles`, `organization_devices`, `vehicles`, `devices`, `events`, `classes`, `entries`, `entry_crew`, `entry_devices` + custom fields on `directus_users`).
- Field shapes, types, notes, and relations identical to local. `migrations_applied` + `positions` (db-init) and `schema_migrations` (processor migration runner) tables also present, as expected.
- Composite UNIQUE constraints landed — probed `(event_id, code)` on `classes` with a duplicate `M-1` insert, got `RECORD_NOT_UNIQUE`. Confirms `db-init-post/001` + `002` ran on stage (the post-schema phase introduced during task 1.8 CI iterations).
Rally Albania 2026 dogfood seed (task 1.9) replayed against stage: 1 org (`msc-albania`), 1 event (`rally-albania-2026`, 2026-06-06 → 2026-06-13), 18 classes (M-1..M-8, Q-1..Q-3, C-1/C-2/C-A/C-3, S-1..S-3), 1 vehicle (Toyota Land Cruiser 70), 3 devices (FMB920 chassis + FMB920 dash backup + FMB003 panic). Junction rows (`organization_vehicles` ×1, `organization_devices` ×3) wired. UUIDs differ from the local seed; record of stage UUIDs lives in `trm/directus/.planning/phase-1-slice-1-schema/09-rally-albania-2026-seed.md` Done section if needed.
End-to-end registration walkthrough (`organization_users` + `entries` + `entry_crew` + `entry_devices`) deferred to manual operator pass through the admin UI — the MCP `items` tool blocks writes to core collections like `directus_users`, so the user-attaches-to-entry flow can't be MCP-driven. That manual walkthrough is the actual dogfood acceptance gate for slice-1 schema.
Drift flagged: field notes on `events.slug`, `classes.code`, and `entries.race_number` still reference "db-init/005" — those constraints moved to `db-init-post/` during the CI fix. Cosmetic only, no behavior impact; worth a snapshot-side cleanup pass next time someone touches the schema.
## [2026-05-02] ingest | TRACCAR_MAPS_ARCHITECTURE.md
Ingested the deep architectural reference for traccar-web's maps subsystem after recognising during SPA-planning discussion that Traccar already fields the exact stack we're converging on (MapLibre GL JS + GeoJSON sources + WebSocket fan-out). Created [[traccar-maps-architecture]] (source page, with TRM divergences enumerated) and [[maps-architecture]] (concept page distilling the inherited patterns: singleton map, side-effect-only `Map*` components, two-effect setup/setData split, two-source clustered+selected design, style-swap `mapReady` gate, sprite preload, rAF coalescer at the WS boundary, geofence editing via `@mapbox/mapbox-gl-draw`, three-way camera control split).
Updated [[react-spa]] heavily: appended the new source; corrected the "talks exclusively to Directus" claim that conflicted with [[live-channel-architecture]] (the SPA connects to two endpoints — Directus for business plane, Processor for telemetry firehose); locked in the stack (raw MapLibre over `react-map-gl`, Zustand over Redux, `maplibre-google-maps` adapter as optional Google-tiles path); added an Auth section documenting the same-domain cookie + reverse-proxy pattern; rewrote Real-time rendering to point at [[maps-architecture]] and headline the rAF coalescer + per-device bounded ring buffers. One sentence + cross-reference added to [[live-channel-architecture]] flagging consumer-side throughput discipline.
Headline takeaway: Traccar's frontend architecture is mostly correct — the lag the user experienced isn't the rendering layer (which is WebGL `setData` and fast) but throughput discipline (per-message Redux dispatch cascading through selectors and rebuilding feature collections at every position arrival). TRM inherits the architecture and adds an rAF coalescer at the WS boundary plus Zustand to neutralise the failure mode. Tile-source decision unblocked: Google Maps via the official Map Tiles API is legitimate through the `maplibre-google-maps` protocol adapter (bring-your-own-key, runtime-config-gated). Dogfood-day starter set: Esri World Imagery (satellite, free) + OpenTopoMap (free) + OSM raster, with Google Satellite as an optional add when an operator provides a key.
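The coalescer idea is simple enough to sketch independently of MapLibre: buffer incoming positions, keep only the newest per device, and flush at most once per frame. This is an illustrative sketch, not TRM's code — the scheduler is injected as a parameter (in the browser it would be `requestAnimationFrame`).

```js
// Illustrative rAF-coalescer sketch: buffer WebSocket positions and flush at
// most one batch per animation frame. `schedule` is injected so the pattern
// is testable; in the SPA it would be requestAnimationFrame.
function createCoalescer(flush, schedule) {
  let pending = new Map(); // deviceId -> latest position only
  let scheduled = false;
  return function push(position) {
    pending.set(position.deviceId, position); // newer positions overwrite older
    if (!scheduled) {
      scheduled = true;
      schedule(() => {
        scheduled = false;
        const batch = pending;
        pending = new Map();
        flush([...batch.values()]); // one setData-sized batch per frame
      });
    }
  };
}
```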
## [2026-05-02] synthesis | Processor WebSocket contract + wiki/planning drift surfaced
Wrote [[processor-ws-contract]] as the wire-level spec for the live-position WebSocket: endpoint shape, cookie-based auth handshake, subscribe/snapshot/streaming/unsubscribe protocol, reconnect semantics, multi-instance fan-out behaviour, connection limits, versioning rules. Both the SPA and the producing service will build against this page; changes require coordinated updates on both sides.
Surfaced a real wiki/planning drift while researching: [[processor]] entity page lists "Broadcast live positions" as a top-level responsibility and [[live-channel-architecture]] specifies the design, but the processor's actual planning roadmap (`trm/processor/.planning/`) has no task for it. Phase 1 (done) is throughput-only; Phase 2 is geofence/IO/timing; Phase 3 is hardening; Phase 4 only mentions a "WebSocket gateway" as an uncommitted fallback service. The drift happened because [[live-channel-architecture]] was synthesised on 2026-05-01, after Phase 1's plan had locked — the wiki absorbed the corrected design, the processor's planning didn't reconcile.
Recommendation pending user decision: add a new processor phase ("Phase 1.5 — Live broadcast") that implements [[processor-ws-contract]] inside the processor service. Alternatives are Option B (separate `trm/live-gateway` service, aligning with the old Phase 4 framing — adds a deploy unit and contradicts the wiki) and Option C (defer the live map for the dogfood — thins the SPA's value-add over Directus admin). The synthesis page is implementation-agnostic so the contract is locked regardless of which option lands.
## [2026-05-02] note | Phase 1.5 planning landed (Option A chosen)
Promoted the Processor's WebSocket broadcast endpoint to a real planning artefact. Created `trm/processor/.planning/phase-1-5-live-broadcast/` with a phase README and six task files: 1.5.1 WS server scaffold + heartbeat, 1.5.2 cookie auth handshake, 1.5.3 subscription registry & per-event authorization, 1.5.4 broadcast consumer group & fan-out, 1.5.5 snapshot-on-subscribe, 1.5.6 integration test. Each follows the existing Phase 1 task-file shape (Goal / Deliverables / Specification / Acceptance / Risks / Done) so an implementer can pick one up self-contained.
Updated `trm/processor/.planning/ROADMAP.md` with a Phase 1.5 section between Phase 1 and Phase 2, including the per-task table. Pruned the stale "WebSocket gateway for live updates" candidate from Phase 4's README and reframed it as the documented [[live-channel-architecture]] escape hatch — to be promoted to a numbered phase only when measurements justify lifting the WS endpoint out of the Processor process. Updated [[processor-ws-contract]]'s Implementation status section to reflect "planned as Phase 1.5" instead of "designed but not scheduled."
Wiki / planning drift surfaced earlier today is now closed: the wiki's [[processor]] / [[live-channel-architecture]] / [[processor-ws-contract]] design and the processor's planning roadmap agree on what gets built, where, and how it's sequenced. Implementation can start on 1.5.1 whenever; SPA work can proceed against [[processor-ws-contract]] in parallel as long as it doesn't ship to stage before Phase 1.5 lands.
## [2026-05-02] note | Auth-mode wiki realignment (cookie → session)
SPA implementation surfaced that Directus SDK's `'cookie'` auth mode doesn't survive a hard reload cleanly — the in-memory access token is gone, and `/users/me` 401s before autoRefresh can establish a new one. Switched the SPA to `'session'` mode (`authentication('session', { credentials: 'include' })`), where the session itself lives in the httpOnly cookie and the browser sends it on every request including the WebSocket upgrade. Reload survives without any client-side state.
Updated [[react-spa]] §"Auth pattern" to describe session mode (single httpOnly session cookie, no separate access token, no `/auth/refresh` dance). Added a "Mode choice context" note explaining why session mode is the right default for an SPA that needs reload-survives behaviour.
Updated [[processor-ws-contract]] §"Auth handshake" to drop the explicit "(mode: cookie)" annotation and emphasise that the producer is **cookie-name-agnostic** — it forwards the entire `Cookie` header to `/users/me` and lets Directus identify the session. The producer's implementation was already cookie-name-agnostic in practice (the 1.5.2 implementation forwards the whole header), so no processor-side code change is needed; the wiki just now matches the implementation. Reframed "Cookie refresh while connected" open question as "Session expiry while connected" with the cleaner session-mode semantics.
Processor Phase 1.5 is fully shipped (`c07ea0e` 1.5.4, `f4b50ca` 1.5.5, `2f2cf5c` 1.5.6) — six tasks, 178/178 unit tests, 6 integration scenarios. The cookie-mode language in the processor's planning task files (1.5.2 in particular) is left as-is — it's the historical spec the implementation landed against; the implementation itself is mode-agnostic.
## [2026-05-02] note | TRM design handoff imported (deferred to SPA Phase 3.8)
User generated a design system via claude.ai/design and dropped the handoff bundle into `trm/spa/TRM_Design_System-handoff/`. Bloomberg/F1-pit-wall aesthetic — ink-on-paper base, race-flag red `#E8412B` accent, square-edged everything, sharp printed offset shadows (no blur), mono numerics for changing values, Goldplay (real licensed font, three weights) + JetBrains Mono + Inter. Four surfaces designed: dashboard / leaderboard / mobile / marketing — SPA scope covers the first two.
Adoption deferred to SPA Phase 3.8 ("Visual brand pass") because applying it now would either delay dogfood-blocking Phase 1/2 work or land partial styling that gets reworked. The bundle is committed in-tree (`trm/spa/9e6b361`) and Phase 3's README spells out the recommended approach: retheme shadcn via CSS-variable overrides + Tailwind 4 `@theme` block, don't replace primitives. Source-of-truth files for the future implementer: `colors_and_type.css` (tokens), `chats/chat1.md` (intent), the bundle's READMEs (specs), `ui_kits/` (HTML prototypes per surface).
No wiki updates yet — design system isn't part of the architectural model and the surface-level styling isn't worth a wiki entity. If/when 3.8 lands and the brand becomes a stable fixture, a brief mention in `[[react-spa]]` is the right home.
## [2026-05-02] note | trm/spa planning landed
User created `trm/spa` repo on Gitea and seeded a minimal Vite 8 + React 19 + TypeScript 6 scaffold (App.tsx returns "SPA"). Wrote the full planning structure mirroring the conventions established by `trm/processor` and `trm/directus`.
Created in `trm/spa/.planning/`:
- `ROADMAP.md` — navigation hub with status legend, architectural anchors, eight non-negotiable design rules (singleton MapLibre, side-effect-only `Map*` components, rAF coalescer, same-origin-everything, in-memory access token, role-aware UI, runtime config, native PostGIS GeoJSON), four phases.
- `phase-1-foundation/` — README + 9 task files: 1.2 stack rounding-out (Tailwind + shadcn/ui + TanStack Router/Query + Zustand + @directus/sdk + zod + react-hook-form + Prettier), 1.3 Vite dev proxy + path aliases + tsconfig hardening, 1.4 runtime config endpoint, 1.5 Directus auth client (cookie mode + refresh + Zustand auth store), 1.6 login page, 1.7 routing skeleton (TanStack Router file-based + role-aware guards), 1.8 logout flow (with cross-tab sync), 1.9 Gitea CI + Dockerfile + nginx static serve, 1.10 compose service block in `trm/deploy`.
- `phase-2-live-map/README.md` — sketched task table for the live-monitoring map; depends on processor Phase 1.5 landing. Nine tasks: MapLibre singleton, tile-source switcher, sprite preload, WS client + rAF coalescer + Zustand store, MapPositions, MapTrails, event picker, camera control trio, connection-status indicators.
- `phase-3-dogfood-readiness/README.md` — error boundaries, connection-state UI, mobile-responsive baseline, per-device detail panel, empty/loading-state polish, Vitest setup, production logging, visual brand pass.
- `phase-4-future/README.md` — geometry editor (depends on directus Phase 2), replay mode, heatmaps / deck.gl, i18n (Albanian), dark mode, Playwright E2E, leaderboard, spectator-facing public map, notifications, operator chat. None committed.
Each task file follows the existing Goal / Deliverables / Specification / Acceptance / Risks / Done shape so an implementer agent can pick one up self-contained. Phase 1 sequencing: 1.2 → 1.3 → 1.4 → 1.5 → (1.6 ‖ 1.7) → 1.8, with 1.9+1.10 (deploy plumbing) developable in parallel after 1.3 lands.
End state of Phase 1: a deployable empty shell — auth + protected routes + login/logout + CI + compose deploy block. End state of Phase 2: the dogfood-day deliverable. End state of Phase 3: actually fielded for race operators on race day, not just a tech demo.
# Maps Architecture — traccar-web
This document describes how the maps subsystem is built in this project: the underlying engine, how tiles are loaded (Google included), how vector geometries are rendered, and how live tracking data flows from the backend WebSocket onto the map.
> TL;DR — The app does **not** use the Google Maps JavaScript API. It uses **MapLibre GL JS** as the rendering engine, and Google's tile servers are consumed either as plain raster XYZ tiles or via the `maplibre-google-maps` adapter (a custom `google://` protocol that proxies Google's official Map Tiles API). All map "objects" (devices, geofences, route lines, accuracy circles, POIs) are GeoJSON sources rendered by MapLibre style layers.
---
## 1. Stack & dependencies
From `package.json`:
| Package | Role |
|---|---|
| `maplibre-gl` | Core WebGL rendering engine (vector + raster, sources, layers, controls). |
| `maplibre-google-maps` | Adapter that registers a `google://` protocol handler so MapLibre can fetch tiles from Google's official Map Tiles API. |
| `@mapbox/mapbox-gl-draw` | Drawing controls (polygon, line, trash) — used for editing geofences. |
| `@mapbox/mapbox-gl-rtl-text` | RTL text shaping plugin (loaded only when the UI direction is RTL). |
| `@maplibre/maplibre-gl-geocoder` | Search box control (wired to OpenStreetMap Nominatim). |
| `@turf/circle` | Builds polygon approximations of circles for `CIRCLE(...)` geofences and accuracy circles. |
| `wellknown` | WKT parse/stringify (geofence storage format used by Traccar backend). |
| `@tmcw/togeojson` | Converts KML POI overlays into GeoJSON. |
Application bootstrap (`src/index.jsx`) calls `preloadImages()` once at startup so all device-icon sprites (with all four colour variants) are pre-rasterised before the first map renders.
---
## 2. The single global map instance
File: `src/map/core/MapView.jsx`
This is the most important architectural decision in the maps code:
```js
const element = document.createElement('div');
element.style.width = '100%';
element.style.height = '100%';
maplibregl.addProtocol('google', googleProtocol);
export const map = new maplibregl.Map({
  container: element,
  attributionControl: false,
});
```
Key points:
- A **single** `maplibregl.Map` instance lives for the entire app lifetime, attached to a **detached `<div>`** held in a module-level variable.
- The React `<MapView>` component just mounts that detached `<div>` into its own ref (`appendChild`) on mount and removes it on unmount, then calls `map.resize()`. This lets the user navigate between Main/Replay/Geofences/Reports pages without paying the cost of recreating the WebGL context, re-uploading icon sprites, or refetching the style.
- `maplibregl.addProtocol('google', googleProtocol)` is called once globally — this is what makes URLs like `google://roadmap/{z}/{x}/{y}?key=...` work as a tile source.
- Every other map component in the codebase imports the same singleton: `import { map } from './core/MapView';` and calls `map.addSource`, `map.addLayer`, `map.on(...)` etc. directly. They render `null` to React — they are imperative side effects wrapped as components so React's lifecycle handles add/cleanup.
### Ready gating
`<MapView>` only renders its `children` after the active style is fully loaded:
```js
{mapReady && children}
```
The mechanism:
1. A module-level `ready` flag plus a `Set<readyListeners>` lets React components subscribe to the global ready state.
2. When the user changes basemap via the switcher, the switcher fires `onBeforeSwitch` → `updateReadyValue(false)` → all child map components unmount and clean up their sources/layers.
3. After `map.setStyle(...)`, the switcher waits for `styledata` and then polls `map.loaded()` every 33 ms; once loaded, it calls `initMap()` (which lazy-loads icons via `map.addImage(...)`) and `updateReadyValue(true)`.
4. Children remount and re-add their sources/layers on top of the new style.
This is the canonical MapLibre pattern for "style swap": setting a style wipes all custom sources/layers, so the consumer has to re-add them.
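The flag-plus-listeners mechanism described in step 1 is plain JavaScript; a minimal sketch of the idea (names illustrative, not the source's exact ones — components would subscribe from a hook):

```js
// Minimal sketch of the ready-gating mechanism: a module-level flag plus a
// Set of listeners that React components subscribe to. Names illustrative.
let ready = false;
const readyListeners = new Set();

function updateReadyValue(value) {
  ready = value;
  readyListeners.forEach((listener) => listener(value));
}

function subscribeReady(listener) {
  readyListeners.add(listener);
  listener(ready); // push current state immediately on subscribe
  return () => readyListeners.delete(listener); // cleanup for unmount
}
```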
### Controls wired in `MapView`
- `AttributionControl` (bottom-right / bottom-left in RTL)
- `NavigationControl` (top-right / top-left in RTL)
- `SwitcherControl` (custom — basemap picker)
Other controls (`MapScale`, `MapCurrentLocation`, `MapGeocoder`, `MapNotification`) are added/removed by their own components, outside `<MapView>`'s children list, because they should persist across style swaps (they hook the same singleton).
---
## 3. Tile layers — how the map "looks"
File: `src/map/core/useMapStyles.js`
This hook returns an array of style descriptors that the `SwitcherControl` exposes to the user. There are two flavours:
### a) Vector styles (full MapLibre style JSON URLs)
For providers that ship a vector-tile `style.json`:
- OpenFreeMap (`https://tiles.openfreemap.org/styles/liberty`)
- LocationIQ Streets / Dark
- MapTiler Basic / Hybrid
- TomTom Basic
- Ordnance Survey (with a `transformRequest` that appends `&srs=3857`)
These give MapLibre full label/road/POI styling.
### b) Raster styles (built ad-hoc by `styleCustom`)
`styleCustom({ tiles, minZoom, maxZoom, attribution })` synthesizes a minimal MapLibre style with one raster source + one raster layer:
```js
{
  version: 8,
  sources: { custom: { type: 'raster', tiles, tileSize: 256, ... } },
  glyphs: 'https://cdn.traccar.com/map/fonts/{fontstack}/{range}.pbf',
  layers: [{ id: 'custom', type: 'raster', source: 'custom' }],
}
```
It's used for: OSM, OpenTopoMap, Carto, all three Google variants (when no Google API key set), Bing (Road / Aerial / Hybrid via `{quadkey}`), HERE (3 variants), Yandex, AutoNavi, Mapbox raster styles, and any user-supplied custom URL.
The included `glyphs` URL means even raster-only basemaps can still render text labels for our overlays (geofences, devices, POIs).
### c) Google specifically
Google appears in **three** entries — `googleRoad`, `googleSatellite`, `googleHybrid`. Each one branches on whether the `googleKey` user attribute is set:
```js
tiles: googleKey
  ? [`google://roadmap/{z}/{x}/{y}?key=${googleKey}`]
  : [0, 1, 2, 3].map((i) =>
    `https://mt${i}.google.com/vt/lyrs=m&hl=en&x={x}&y={y}&z={z}&s=Ga`),
```
- **With a key:** the `google://` URL is intercepted by `maplibre-google-maps`'s `googleProtocol` handler, which calls Google's official Map Tiles API (server, session-token auth, billable) and returns the tile to MapLibre.
- **Without a key:** the legacy public Google tile servers (`mt0..mt3.google.com`) are used directly. This is the unauthenticated/scraped mode and its long-term availability is at Google's discretion.
Hybrid uses `satellite` + `&layerType=layerRoadmap` on the keyed path.
`useMapStyles` also exposes a `custom` style: if `state.session.server.mapUrl` contains `{z}` or `{quadkey}` it's treated as a tile template via `styleCustom`; otherwise it's treated as a full style JSON URL.
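The `{quadkey}` placeholder in the Bing entries above encodes z/x/y as a single base-4 string. For reference, the standard Bing tile-system conversion (this is the well-known algorithm, not code lifted from traccar-web):

```js
// Standard Bing Maps quadkey conversion: one base-4 digit per zoom level,
// interleaving the bits of the x and y tile coordinates.
function toQuadkey(x, y, z) {
  let quadkey = '';
  for (let i = z; i > 0; i -= 1) {
    let digit = 0;
    const mask = 1 << (i - 1);
    if (x & mask) digit += 1;
    if (y & mask) digit += 2;
    quadkey += digit;
  }
  return quadkey;
}
```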
---
## 4. Optional raster overlays
File: `src/map/overlay/useMapOverlays.js` + `src/map/overlay/MapOverlay.js`
These are extra raster layers stacked **on top** of the basemap (traffic, weather, sea/rail, etc.). Each entry is just a MapLibre raster source descriptor:
| Overlay | Source |
|---|---|
| Google Traffic | `google://satellite/{z}/{x}/{y}?key=...&layerType=layerTraffic&overlay=true` |
| OpenSeaMap, OpenRailwayMap | OSM-derived tile servers |
| OpenWeather (clouds / precipitation / pressure / wind / temperature) | `tile.openweathermap.org/.../{z}/{x}/{y}.png?appid=...` |
| TomTom Flow / Incidents | TomTom traffic API |
| HERE Flow | HERE traffic flow tiles |
| Custom | `state.session.server.overlayUrl` |
`<MapOverlay>` reads the `selectedMapOverlay` user attribute, finds the active descriptor, then `map.addSource(id, source)` + `map.addLayer({ id, type: 'raster', source: id })`. On unmount or change it tears down both.
---
## 5. Icon sprites
File: `src/map/core/preloadImages.js`
All device category icons are SVGs in `src/resources/images/icon/` (car, bus, truck, bicycle, plane, ship, person, animal, ...). Plus a generic `background.svg` and a `direction.svg` arrow.
`preloadImages()` runs at app startup and:
1. Loads each SVG to an `HTMLImageElement`.
2. For each category × each colour (`info`, `success`, `error`, `neutral`) it calls `prepareIcon(background, icon, color)`:
- Draws the background sprite to a canvas at `devicePixelRatio` scale.
- Tints the icon SVG with the colour (canvas `destination-atop` trick) and composites it centred on the background.
- Returns the raw `ImageData`.
3. Stores all results in the `mapImages` registry keyed `${category}-${color}` (e.g. `car-success`, `truck-error`).
`MapView.initMap` then calls `map.addImage(key, imageData, { pixelRatio })` for every entry once a style has loaded. After that, layers can reference any sprite by name in style expressions like `'icon-image': '{category}-{color}'`.
`mapIconKey(category)` normalises a Traccar device category to one of the available sprite names (`offroad`/`pickup` → `car`, `trolleybus` → `bus`, unknown → `default`).
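A sketch of that normalisation, assuming a registry of preloaded sprite names — the alias table here is a reduced sample, not the full source mapping:

```js
// Illustrative normalisation in the spirit of mapIconKey: map backend device
// categories onto sprite names that were actually preloaded, falling back to
// 'default'. Alias table is a sample, not the full mapping.
const availableIcons = new Set(['car', 'bus', 'truck', 'default']);
const categoryAliases = { offroad: 'car', pickup: 'car', trolleybus: 'bus' };

function mapIconKey(category) {
  const key = categoryAliases[category] || category;
  return availableIcons.has(key) ? key : 'default';
}
```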
---
## 6. Rendering live device positions
File: `src/map/MapPositions.js`
This is the central live-tracking renderer. Props:
- `positions` — array of position objects `{ id, deviceId, latitude, longitude, course, fixTime, attributes, ... }`
- `onMapClick`, `onMarkerClick`
- `selectedPosition`, `titleField`, `showStatus`
### Sources & layers
It creates **two** GeoJSON sources, identified by `useId()`-based unique strings:
1. `id` — non-selected devices, `cluster: true`, `clusterMaxZoom: 14`, `clusterRadius: 50`. MapLibre's built-in clustering handles aggregation.
2. `selected` — the currently selected device only, never clustered (always on top).
For each of the two sources it adds:
- A symbol layer rendering `'icon-image': '{category}-{color}'` plus the title (device name or `fixTime`).
- A `direction-…` symbol layer filtered to features where `direction === true`, drawing the `direction` arrow rotated by the position `course` with `'icon-rotation-alignment': 'map'`.
Plus a single `clusters` symbol layer on the main source that filters `['has', 'point_count']` and shows a `background` icon with the count.
### Feature construction
`createFeature(devices, position, selectedPositionId)` returns properties used by the layer expressions:
```js
{
  id, deviceId,
  name, fixTime,
  category: mapIconKey(device.category), // selects sprite
  color: showStatus
    ? position.attributes.color || getStatusColor(device.status) // 'success'|'error'|'neutral'|'info'
    : 'neutral',
  rotation: position.course,
  direction: showDirection, // controlled by 'mapDirection' preference
}
```
`showDirection` modes: `none` (never), `all` (every position with a course), `selected` (default — only the currently selected position).
### Updates
A second `useEffect` re-runs whenever `positions`, `devices`, `selectedPosition`, or `mapCluster` change, and just calls `map.getSource(source)?.setData(...)` with a freshly built `FeatureCollection` for each source. MapLibre diffs and re-renders.
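The rebuild step can be sketched as a pure function whose output goes to `setData` — illustrative only, with `createFeature` reduced to a stand-in for the real property builder described above:

```js
// Illustrative sketch of the update step: rebuild a GeoJSON FeatureCollection
// from the latest positions. createFeature is a stand-in for the real
// property builder; in the component the result feeds map.getSource(id)?.setData(...).
function toFeatureCollection(positions, createFeature) {
  return {
    type: 'FeatureCollection',
    features: positions.map((position) => ({
      type: 'Feature',
      geometry: {
        type: 'Point',
        coordinates: [position.longitude, position.latitude], // GeoJSON is lng,lat
      },
      properties: createFeature(position),
    })),
  };
}
```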
### Interaction
- `mouseenter`/`mouseleave` swap the canvas cursor.
- `click` on a marker → `onMarkerClick(positionId, deviceId)` (in `MainMap` this dispatches `devicesActions.selectId(deviceId)`).
- `click` on a cluster → calls `getClusterExpansionZoom(clusterId)` and `map.easeTo` to zoom in.
- `click` on the map background → `onMapClick(lat, lng)`.
---
## 7. Geofences (rendering)
File: `src/map/MapGeofence.js`
Three layers on a single GeoJSON source:
1. `geofences-fill` — `type: 'fill'`, filtered to polygons, semi-transparent fill (`fill-opacity: 0.1`).
2. `geofences-line` — outline, with per-feature `color`/`width`/`opacity` driven by `attributes.color`, `attributes.mapLineWidth`, `attributes.mapLineOpacity`.
3. `geofences-title` — symbol layer rendering `{name}`.
### WKT → GeoJSON conversion
Geofences arrive from the backend (`/api/geofences`) as **WKT** in `item.area`. `geofenceToFeature(theme, item)` (in `src/map/core/mapUtil.js`) handles this:
- If the area starts with `CIRCLE`, it extracts `(lat lon, radius)` and uses `@turf/circle` to approximate it as a 32-step polygon in metres.
- Otherwise it `wellknown.parse(item.area)` and runs `reverseCoordinates(...)` because WKT is `lat lon` while GeoJSON is `lng lat`.
The reverse mapping (`geometryToArea`) is used when saving edits.
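The coordinate flip can be sketched as a small recursive function — illustrative, in the spirit of `reverseCoordinates` rather than a copy of it:

```js
// Illustrative recursive coordinate flip: Traccar stores WKT as "lat lon",
// GeoJSON wants [lng, lat], so every coordinate pair in the parsed geometry
// gets swapped, however deeply nested (rings, multi-geometries).
function reverseCoordinates(coords) {
  if (typeof coords[0] === 'number') {
    return [coords[1], coords[0]]; // swap a single [lat, lon] pair
  }
  return coords.map(reverseCoordinates); // recurse into nested arrays
}
```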
### Editing — `src/map/draw/MapGeofenceEdit.js`
Wraps `@mapbox/mapbox-gl-draw`. It:
- Adds a Draw control with polygon, line_string, trash modes (and patches the CSS class names so it looks native to MapLibre).
- Listens to `draw.create` → POST `/api/geofences`, `draw.update` → PUT, `draw.delete` → DELETE.
- On `geofences` Redux state change, clears Draw and re-adds every feature converted via `geofenceToFeature`.
- When a `selectedGeofenceId` is passed, it computes a bbox from the feature's coords and `map.fitBounds(...)` to it.
---
## 8. Routes (history & live trails)
### Replay route (`MapRoutePath`)
Used by `ReplayPage` and report pages. Builds a `FeatureCollection` of **per-segment** `LineString` features, one per consecutive position pair, and colours each segment by the second point's speed via `getSpeedColor(speed, minSpeed, maxSpeed)`:
```js
features.push({
  type: 'Feature',
  geometry: { type: 'LineString', coordinates: [[lon1,lat1],[lon2,lat2]] },
  properties: { color: reportColor || getSpeedColor(...), width, opacity },
});
```
If the device has a `web.reportColor` attribute, that overrides speed-based colouring with a fixed colour for the whole track. Width/opacity come from `mapLineWidth`/`mapLineOpacity` user preferences.
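A runnable sketch of the per-segment construction — illustrative only, with `getSpeedColor` replaced by a trivial two-colour threshold rather than the actual colormap:

```js
// Illustrative per-segment route builder: one LineString per consecutive
// position pair, coloured by the second point's speed. getSpeedColor here is
// a crude green/red stand-in, not the app's real colormap.
function getSpeedColor(speed, minSpeed, maxSpeed) {
  const t = maxSpeed > minSpeed ? (speed - minSpeed) / (maxSpeed - minSpeed) : 0;
  return t > 0.5 ? '#f00' : '#0f0';
}

function routeToSegments(positions) {
  const speeds = positions.map((p) => p.speed);
  const min = Math.min(...speeds);
  const max = Math.max(...speeds);
  const features = [];
  for (let i = 0; i < positions.length - 1; i += 1) {
    const a = positions[i];
    const b = positions[i + 1];
    features.push({
      type: 'Feature',
      geometry: {
        type: 'LineString',
        coordinates: [[a.longitude, a.latitude], [b.longitude, b.latitude]],
      },
      properties: { color: getSpeedColor(b.speed, min, max) },
    });
  }
  return { type: 'FeatureCollection', features };
}
```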
### Replay points (`MapRoutePoints`)
A symbol layer that renders **the literal text "▲"** as the marker, rotated by `course` and tinted by speed colour. Clicking emits `onClick(positionId, index)` so the slider can scrub to that point. Also conditionally adds a `SpeedLegendControl` (a small horizontal turbo-colormap gradient with min/max speed labels in the user's speed unit).
### Generic polyline (`MapRouteCoordinates`)
A simpler version that takes pre-computed `coordinates` (e.g. from a route geometry shipped by the API) and renders a single `LineString` with optional name label.
### Live trails (`MapLiveRoutes`)
Renders the trailing path of currently-tracked devices in real time. It reads from `state.session.history` — a Redux-managed dictionary `{ [deviceId]: [[lon,lat], ...] }`. The history is updated inside `sessionActions.updatePositions` (see §11) and capped to `web.liveRouteLength` points per device. Behaviour is gated by the `mapLiveRoutes` user attribute: `none`, `selected` (only the active device), or `all` (every device).
### Accuracy circles (`MapAccuracy`)
For each position with `accuracy > 0` (in metres), it builds a `turfCircle([lon, lat], accuracy * 0.001)` (km) polygon and renders a translucent fill in the theme's geometry colour.
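A stand-in for the `@turf/circle` call can be sketched directly — illustrative only, using a flat-earth metres-to-degrees conversion that is adequate at accuracy-circle scales:

```js
// Illustrative stand-in for @turf/circle: approximate a circle of
// radiusMeters around [lon, lat] as an n-step polygon ring, using a
// flat-earth metres->degrees conversion (fine at accuracy-circle scales).
function accuracyCircle(lon, lat, radiusMeters, steps = 32) {
  const dLat = radiusMeters / 111320; // ~metres per degree of latitude
  const dLon = dLat / Math.cos((lat * Math.PI) / 180); // shrink with latitude
  const ring = [];
  for (let i = 0; i < steps; i += 1) {
    const angle = (2 * Math.PI * i) / steps;
    ring.push([lon + dLon * Math.sin(angle), lat + dLat * Math.cos(angle)]);
  }
  ring.push([...ring[0]]); // GeoJSON polygon rings repeat the first point
  return { type: 'Polygon', coordinates: [ring] };
}
```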
---
## 9. Generic markers — `MapMarkers`
A reusable component for showing simple POI-like markers (used by report pages, event pages, etc.). Each marker `{ latitude, longitude, image, title }` becomes a Point feature; the symbol layer reads `'icon-image': '{image}'` so any preloaded sprite name works (commonly `start-success`, `finish-error`, or `default-neutral`).
The `showTitles` prop toggles whether the layer renders the `title` text under the icon.
---
## 10. Camera control
Three components handle different camera scenarios:
- **`MapCamera`** — One-shot fit. If passed `coordinates` or `positions`, builds an `LngLatBounds` and calls `map.fitBounds(bounds, { padding: min(w,h) * 0.1, duration: 0 })`. Else jumps to a single `lat/lon` keeping zoom ≥ 10. Used in `ReplayPage`.
- **`MapDefaultCamera`** — Initial framing on `MainPage`. Picks (in priority order) the selected device's position, the user's `latitude/longitude/zoom` preference, or a fitBounds over all visible positions. Runs at most once (`initialized` state).
- **`MapSelectedDevice`** — Reactive follow. Watches `state.devices.selectedId`, `selectTime`, and the position of the selected device. Fires `map.easeTo(...)` when (a) the user reselects a device or re-clicks it, or (b) `mapFollow` is enabled and the selected device's coordinates change. The vertical offset (`-popupMapOffset / 2`) makes room for the StatusCard popup.
`MapPadding` separately calls `map.setPadding({ left: drawerWidth })` so MapLibre's auto-centering accounts for the persistent left drawer in desktop layout.
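The fit arithmetic behind `MapCamera` can be sketched like this (shapes assumed; the real code builds a `maplibregl.LngLatBounds` and extends it per position, but the result is the same min/max envelope):

```typescript
interface Pos {
  longitude: number;
  latitude: number;
}

// Fold positions into [[minLon, minLat], [maxLon, maxLat]] — the bounds
// shape map.fitBounds accepts.
function boundsFor(positions: Pos[]): [[number, number], [number, number]] {
  const lons = positions.map((p) => p.longitude);
  const lats = positions.map((p) => p.latitude);
  return [
    [Math.min(...lons), Math.min(...lats)],
    [Math.max(...lons), Math.max(...lats)],
  ];
}

// Padding derived from the smaller container dimension, per the
// min(w, h) * 0.1 rule quoted above.
const fitPadding = (width: number, height: number) => Math.min(width, height) * 0.1;
```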
---
## 11. Live data pipeline (where positions come from)
This is the end-to-end flow for "a device moved on the screen":
```
Traccar backend
      │  WebSocket /api/socket
      ▼
Browser:
  SocketController.jsx ──► dispatch(devicesActions.update / sessionActions.updatePositions / eventsActions.add)
      ▼
  Redux store (slices)
      ▼
  useSelector in MapPositions / MapLiveRoutes / MapSelectedDevice / ...
      ▼
  map.getSource(id).setData(GeoJSON FeatureCollection)
      ▼
  MapLibre re-renders (WebGL)
```
### `SocketController.jsx`
- Opens `wss?://<host>/api/socket` once the user is authenticated.
- On `onmessage`, parses the JSON envelope and dispatches one of:
  - `devicesActions.update(data.devices)` — device meta (status/lastUpdate/...).
  - `sessionActions.updatePositions(data.positions)` — **the live position stream**.
  - `eventsActions.add(data.events)` (also drives the `MapNotification` button colour and the alarm sound).
  - `sessionActions.updateLogs(data.logs)`.
- Includes a fallback REST refresh + 60 s reconnect loop on `onclose`, plus `online`/`visibilitychange` listeners that re-test the socket and reconnect if needed. It also probes the connection with a ping (`socket.send('{}')`) so a silently dead socket is detected rather than left hanging.
### `sessionActions.updatePositions` (in `src/store/session.js`)
The reducer does two things per incoming position:
1. `state.positions[deviceId] = position` — overwrites the latest known position by deviceId.
2. If `mapLiveRoutes` is enabled, appends `[longitude, latitude]` to `state.history[deviceId]`, capped to `liveRouteLength` (default 10) points. If the new coordinate matches the last one, it's skipped.
Thus everything downstream is just `useSelector` on `state.session.positions` / `state.session.history`.
### `MainPage` → `MainMap`
`MainPage` reads `positions`, runs `useFilter` to compute `filteredPositions` based on user search/filter UI, then passes them into `<MainMap>`. `MainMap` (`src/main/MainMap.jsx`) composes the live map:
```jsx
<MapView>
  <MapOverlay />
  <MapGeofence />
  <MapAccuracy positions={filteredPositions} />
  <MapLiveRoutes deviceIds={filteredPositions.map(p => p.deviceId)} />
  <MapPositions positions={filteredPositions} ... showStatus />
  <MapDefaultCamera filteredPositions={filteredPositions} />
  <MapSelectedDevice />
  <PoiMap />
</MapView>
<MapScale />
<MapCurrentLocation />
<MapGeocoder />
{!disableEvents && <MapNotification ... />}
{desktop && <MapPadding ... />}
```
When `filteredPositions` changes (driven by socket → redux), `MapPositions`'s effect re-runs and calls `setData` on the two GeoJSON sources — that's the actual visible update.
---
## 12. Other map consumers
The same singleton + composition pattern is reused across:
- `src/other/ReplayPage.jsx``MapView` with `MapRoutePath`, `MapRoutePoints`, a single-position `MapPositions`, and `MapCamera` for fit-to-bounds.
- `src/other/GeofencesPage.jsx``MapView` with `MapGeofenceEdit` for CRUD on geofences.
- `src/other/EmulatorPage.jsx`, `src/other/EventPage.jsx` — point-and-click + event-context maps.
- `src/reports/PositionsReportPage.jsx`, `TripReportPage.jsx`, `StopReportPage.jsx`, `EventReportPage.jsx`, `CombinedReportPage.jsx` — replay-style maps for report visualisation.
- `src/settings/UserPage.jsx`, `src/settings/ServerPage.jsx` — coordinate pickers.
Because the engine is a singleton, switching pages incurs only a React reconcile + add/remove of sources & layers — no WebGL teardown.
---
## 13. Auxiliary controls
| Control | File | Purpose |
|---|---|---|
| `SwitcherControl` | `src/map/switcher/switcher.js` | Custom basemap picker — fully imperative DOM; calls `map.setStyle(style.style, { diff: false })` and triggers the `before/onSwitch/after` lifecycle so children can rebuild. |
| `MapScale` | `src/map/MapScale.js` | Wraps `maplibregl.ScaleControl`; switches `metric`/`imperial`/`nautical` from the `distanceUnit` preference. |
| `MapCurrentLocation` | `src/map/MapCurrentLocation.js` | Wraps `maplibregl.GeolocateControl` (`enableHighAccuracy`, no continuous tracking). |
| `MapGeocoder` | `src/map/geocoder/MapGeocoder.js` | `MaplibreGeocoder` configured with a custom `forwardGeocode` that calls Nominatim and reshapes the response into the geocoder's expected feature format. |
| `MapNotification` | `src/map/notification/MapNotification.js` | Small custom toggle button styled as a MapLibre control; reflects an `enabled` boolean and emits clicks. |
| `SpeedLegendControl` | `src/map/legend/MapSpeedLegend.js` | Inline gradient legend added by `MapRoutePoints` when `showSpeedControl` is true. |
| `PoiMap` | `src/map/main/PoiMap.js` | Loads a user-configured KML via `fetch` + `DOMParser`, converts with `@tmcw/togeojson`, and renders 3 layers (point/line/title). |
---
## 14. Conventions to keep in mind
1. **Every `Map*` component is a side-effect-only React component.** It returns `null` and uses `useEffect` to add sources/layers and the cleanup function to remove them. The two-effect pattern (one for setup with `[]`-ish deps, one for `setData` updates) is consistent across the codebase.
2. **`useId()` is used to generate unique source/layer ids** so the same component can be mounted multiple times safely.
3. **All data updates flow through GeoJSON `setData`** — there is no direct DOM marker manipulation. This keeps the WebGL pipeline efficient and lets MapLibre handle clustering/visibility.
4. **Style swaps reset the world.** Anything custom on the map must be re-added after `setStyle`. The `mapReady` gate in `MapView` is what coordinates this for child components.
5. **Coordinates everywhere are `[lon, lat]`** (MapLibre/GeoJSON convention). `reverseCoordinates` exists specifically to bridge from the WKT `lat lon` ordering used by Traccar's geofence storage.
6. **No Google Maps SDK is loaded in the browser.** Even when Google tiles are used, the code path is `tile URL → fetch → blob → MapLibre raster source`. The only Google-specific code is the protocol adapter from `maplibre-google-maps`, which is registered once in `MapView.jsx`.
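As an illustration of convention 5, the lat/lon bridge can be sketched as a recursive swap over nested GeoJSON coordinate arrays (shape assumed; the actual helper may differ in detail):

```typescript
type Coords = number[] | Coords[];

// Walk arbitrarily nested coordinate arrays (Point, LineString ring,
// Polygon rings, ...) and swap each innermost [lat, lon] pair to
// GeoJSON's [lon, lat] ordering.
function reverseCoordinates(c: Coords): Coords {
  if (typeof c[0] === 'number') {
    const [a, b] = c as number[];
    return [b, a];
  }
  return (c as Coords[]).map(reverseCoordinates);
}
```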
+1 -1
@@ -116,7 +116,7 @@ The escape hatch is well-defined: lift the WebSocket endpoint code out of the Pr
- [[processor]] grows a public-facing WebSocket endpoint in addition to its existing Redis consumer and Postgres writer.
- [[directus]] keeps its built-in WebSocket subscriptions for tables it writes to. Its real-time delivery section no longer claims to broadcast direct writes from [[processor]] — that's a documented mistake corrected in this revision.
- [[react-spa]] connects to two WebSocket endpoints: Directus for admin/business updates, Processor for live position firehose. Same JWT-based auth on both.
- [[react-spa]] connects to two WebSocket endpoints: Directus for admin/business updates, Processor for live position firehose. Same JWT-based auth on both. Consumer-side throughput discipline (rAF coalescing of incoming positions before reducer dispatch) is documented in [[maps-architecture]] — without it the per-message dispatch pattern observed in [[traccar-maps-architecture]] cascades through selectors and `setData` at every position arrival.
- The deploy stack publishes the Processor's WebSocket port (with TLS termination at a reverse proxy in front).
## Why not a single WebSocket endpoint
+199
@@ -0,0 +1,199 @@
---
title: Maps architecture
type: concept
created: 2026-05-02
updated: 2026-05-02
sources: [traccar-maps-architecture, gps-tracking-architecture]
tags: [frontend, maps, maplibre, architecture, decision]
---
# Maps architecture
The pattern the [[react-spa]] uses to render real-time positions, trails, geofences, and replay tracks. Inherited (with deliberate refinements) from [[traccar-maps-architecture]], which already fields the same shape — MapLibre GL JS + GeoJSON sources + WebSocket-fed updates — at production scale. Reading this page tells you how the SPA's maps subsystem is structured; reading the source tells you the original implementation in detail.
## Why this shape
DOM-rendered markers (Leaflet's default, or React-Leaflet's `<Marker>` components) thrash layout and repaint on every map move. ~1000 markers is enough to choke a browser. WebGL-rendered markers handle 100k+ at 60fps — the GPU is purpose-built for it. **The decision tree is short:** any map that updates frequently or shows more than a few hundred markers must be WebGL.
MapLibre GL JS is the open-source WebGL map renderer. Vector tiles, raster tiles, GeoJSON sources, symbol/circle/line/fill style layers — all GPU-rendered. Free, no API key required for OSM-derived tiles. Standard tooling.
The architecture that follows is what makes MapLibre tractable inside React without losing the imperative escape hatches you need for streaming-data performance.
## Singleton map
A **single `maplibregl.Map` instance lives for the entire app lifetime**, attached to a detached `<div>` held in a module-level variable. The `<MapView>` React component just appends that detached `<div>` into its own ref on mount and removes it on unmount; it then calls `map.resize()`.
Consequences:
- **Page navigation doesn't recreate the WebGL context.** Live → replay → geofence editor → reports — all share the same instance.
- **Icon sprites stay registered across navigations** — no re-`addImage` cost per page change.
- **State outside React's tree.** Sources, layers, camera position, event listeners are MapLibre's, not React's. React's reconciler doesn't compete.
Trade-off: if multiple tabs of the SPA need *different* maps (e.g. side-by-side comparison), the singleton breaks. For TRM's operator-facing flow this is a non-concern. Revisit only if a use case demands it.
## Side-effect-only `Map*` components
**Every component that participates in the map (`MapPositions`, `MapGeofences`, `MapTrails`, `MapAccuracy`, etc.) returns `null`.** It uses `useEffect` for setup (add source + layer) with cleanup (remove source + layer), and a separate `useEffect` for updates that calls `map.getSource(id)?.setData(...)`.
The two-effect pattern:
```ts
function MapPositions({ positions }: Props) {
  const sourceId = useId();
  const layerId = useId();

  useEffect(() => {
    map.addSource(sourceId, { type: 'geojson', data: emptyFC, cluster: true });
    map.addLayer({ id: layerId, type: 'symbol', source: sourceId, layout: {...} });
    return () => {
      map.removeLayer(layerId);
      map.removeSource(sourceId);
    };
  }, []); // setup: runs once

  useEffect(() => {
    map.getSource(sourceId)?.setData(toFeatureCollection(positions));
  }, [positions]); // update: runs on every change

  return null;
}
```
`useId()` for source/layer ids lets the same component mount multiple times safely. The empty render keeps React's reconciler out of the rendering hot path.
## Two GeoJSON sources for live positions
Live device rendering uses **two GeoJSON sources**, not one:
- **Non-selected source** with `cluster: true, clusterMaxZoom: 14, clusterRadius: 50`. MapLibre's built-in clustering aggregates dense regions automatically.
- **Selected source**, unclustered, always rendered on top.
This keeps the UX clean without manual z-ordering: clicking a device pulls it out of the clustered layer and into the always-on-top selected layer. Layers reference the appropriate source.
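The partition that feeds the two sources can be sketched as a pure function (names assumed; each returned FeatureCollection would be handed to `setData` on its respective source):

```typescript
interface Pos {
  deviceId: number;
  longitude: number;
  latitude: number;
}

// Split live positions into the unclustered "selected" collection and the
// clustered "rest" collection. Cluster options live on the source, not
// here — this only decides which source each device feeds.
function splitBySelection(positions: Pos[], selectedId: number | null) {
  const toFC = (ps: Pos[]) => ({
    type: 'FeatureCollection' as const,
    features: ps.map((p) => ({
      type: 'Feature' as const,
      geometry: { type: 'Point' as const, coordinates: [p.longitude, p.latitude] },
      properties: { deviceId: p.deviceId },
    })),
  });
  return {
    selected: toFC(positions.filter((p) => p.deviceId === selectedId)),
    rest: toFC(positions.filter((p) => p.deviceId !== selectedId)),
  };
}
```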
## Style swaps reset the world
Calling `map.setStyle(...)` wipes every custom source and layer the SPA has added. This is MapLibre's native behaviour, not a quirk to work around.
Coordination via a `mapReady` gate:
1. A module-level `ready` flag plus a listener `Set` lets components subscribe to global ready state.
2. Basemap switcher fires `onBeforeSwitch → updateReadyValue(false)`. All children unmount and clean up their sources/layers.
3. After `map.setStyle(...)` resolves (`styledata` event + `map.loaded()` polling), `initMap()` re-adds icon sprites, then `updateReadyValue(true)` flips the gate.
4. `<MapView>` re-renders its `children`; every `Map*` component remounts and re-adds its source + layer.
The gate is the contract: **don't add anything to the map until `mapReady` is true after a style swap.** Violating it causes phantom-source errors that look like Heisenbugs.
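The module-level gate described in steps 1–4 can be sketched like this (names `onMapReady` / `updateReadyValue` follow the text above; the subscribe-time emit is an assumption that keeps late subscribers consistent):

```typescript
type Listener = (ready: boolean) => void;

// Module-level ready flag plus listener Set — the gate components
// subscribe to. In the SPA each Map* component would wire this into a
// useState via useEffect; the mechanism itself is React-free.
let ready = false;
const listeners = new Set<Listener>();

function onMapReady(listener: Listener): () => void {
  listeners.add(listener);
  listener(ready); // emit current value so late subscribers catch up
  return () => listeners.delete(listener);
}

function updateReadyValue(value: boolean): void {
  ready = value;
  listeners.forEach((l) => l(value));
}
```

The switcher calls `updateReadyValue(false)` before `setStyle` and `updateReadyValue(true)` after sprites are re-registered; everything in between is the window where adding sources is forbidden.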
## Icon sprites
Device-category icons (rally car / quad / SSV / motorcycle / runner / hiker for TRM, plus a generic `default`) are SVGs **pre-rasterised once at app startup** at `devicePixelRatio` scale, with all colour variants composited up-front, stored in a module-level registry, and `addImage`'d after every style swap.
Layers reference sprites by name in style expressions:
```ts
'icon-image': ['concat', ['get', 'category'], '-', ['get', 'color']]
```
So a feature with `properties.category === 'rally-car'` and `properties.color === 'success'` resolves to the `rally-car-success` sprite at GPU paint time. Zero per-frame work in JavaScript.
## WebSocket → map data flow (with rAF coalescer)
The end-to-end live data flow:
```
Processor WS → coalesce buffer → Zustand store → useStore selectors → setData → GPU
```
**The coalesce buffer is the discipline this architecture lives or dies on.** [[traccar-maps-architecture]] dispatches one Redux action per WS message; at 200 racers × 1Hz this fires 200 dispatches/sec, each cascading through `useSelector` consumers, `useFilter` recomputes over the full position array, and a freshly-built FeatureCollection passed to `setData`. That cascade is the most likely source of the lag operators see in production Traccar deployments.
TRM's pattern:
```ts
const buffer = new Map<string, Position>(); // deviceId → latest position
let rafScheduled = false;

socket.onmessage = (msg) => {
  const pos = JSON.parse(msg.data);
  buffer.set(pos.deviceId, pos);
  if (!rafScheduled) {
    rafScheduled = true;
    requestAnimationFrame(flush);
  }
};

function flush() {
  rafScheduled = false;
  const snapshot = Array.from(buffer.values());
  buffer.clear();
  positionStore.getState().applyPositions(snapshot);
}
```
Properties of this shape:
- **At most one state update per frame** (~60Hz cap). Dozens of incoming messages per frame collapse to one snapshot.
- **Per-device coalescing.** If the same device reports five times in 16ms, only the latest position is kept.
- **No back-pressure on the WS.** Buffer is bounded by device count, not message rate.
- **Trail history isn't lossy.** This buffer holds *latest visible positions for the map*, not the per-device trail. The trail is a separate per-device ring buffer in the store, written from `applyPositions`.
Throttling-by-user-choice (slider for "1Hz / 5Hz / 30Hz redraw rate") is a future feature layered on top of this — change the rAF callback to a `setInterval` keyed on user preference. The rAF coalescer is the always-on hygiene.
## State management
- **Zustand** for high-frequency live state: positions store, trails store, selection. Granular subscribers, no provider boilerplate, no `useSelector` cascades.
- **TanStack Query** for Directus REST: events, classes, entries, vehicles, devices. Caching, invalidation, refetch on focus.
- **No Redux.** The Traccar Redux pattern works correctly but the dispatch + selector machinery is overhead at our update rates and doesn't earn its weight given Zustand's existence.
## Geofence editing
`@mapbox/mapbox-gl-draw` wrapped to look native to MapLibre (CSS class patches). Modes: polygon, line_string, trash. On `draw.create` / `draw.update` / `draw.delete`, the SPA POSTs / PATCHes / DELETEs to Directus's REST API. Directus stores PostGIS geometry directly; the SPA reads back GeoJSON via `ST_AsGeoJSON(geometry)` — **no WKT round-trip** unlike [[traccar-maps-architecture]] which uses WKT in storage.
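The event-to-request mapping can be sketched as a pure descriptor builder (endpoint paths follow Directus's `/items/<collection>` convention; the collection name and wiring through the SPA's authenticated client are assumptions — Directus's items API uses PATCH for partial item updates):

```typescript
interface DrawFeature {
  id: string;
  geometry: unknown; // GeoJSON geometry from mapbox-gl-draw
}

type DrawEvent = 'draw.create' | 'draw.update' | 'draw.delete';

// Map a draw lifecycle event to the REST request the SPA would issue.
// Returning a descriptor (rather than calling fetch here) keeps the
// mapping testable and transport-agnostic.
function toRequest(event: DrawEvent, f: DrawFeature) {
  switch (event) {
    case 'draw.create':
      return { method: 'POST', url: '/items/geofences', body: { geometry: f.geometry } };
    case 'draw.update':
      return { method: 'PATCH', url: `/items/geofences/${f.id}`, body: { geometry: f.geometry } };
    case 'draw.delete':
      return { method: 'DELETE', url: `/items/geofences/${f.id}`, body: null };
  }
}
```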
Phase 2 of [[directus]] introduces the `geofences`, `waypoints`, and `speed_limit_zones` collections that this editor writes to.
## Camera control
Three components for three jobs, kept separate so each effect's dependency list is small and the wrong-camera-jump bugs that plague map UIs don't appear:
- **`MapCamera`** — one-shot fit. Builds an `LngLatBounds` from `coordinates` or `positions` props and calls `map.fitBounds(...)` once. Used in replay.
- **`MapDefaultCamera`** — initial framing on first mount. Picks the selected device, the user's saved preference, or a fitBounds over visible positions. Runs at most once.
- **`MapSelectedDevice`** — reactive follow. Watches the selected device id; on selection change or position change (with `mapFollow` on), `map.easeTo(...)`.
Bonus: `MapPadding` calls `map.setPadding({ left: drawerWidth })` so MapLibre's auto-centring accounts for the persistent left drawer in desktop layout.
## Tile sources
Two flavours, identical to [[traccar-maps-architecture]]:
- **Vector styles** for providers shipping a full MapLibre style JSON (OpenFreeMap, MapTiler, etc.). Best label/road/POI styling.
- **Raster styles** synthesised ad-hoc with one raster source + one raster layer. Used for OSM, OpenTopoMap, Esri World Imagery, Google variants, Bing, custom URLs.
The synthesised raster style includes a `glyphs` URL pointing at a font CDN so even raster basemaps can render text labels for our overlays (geofence names, device labels, POIs).
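Synthesising such a style is a small object builder, per the MapLibre style spec (version 8). A hedged sketch — the glyphs URL is a placeholder, not the SPA's actual font CDN:

```typescript
// Build a minimal MapLibre style for a raster XYZ provider: one raster
// source + one raster layer, plus a glyphs endpoint so overlay layers
// can still render text labels on top of the raster basemap.
function rasterStyle(id: string, tiles: string[], attribution: string) {
  return {
    version: 8 as const,
    glyphs: 'https://example.com/fonts/{fontstack}/{range}.pbf', // placeholder font CDN
    sources: {
      [id]: { type: 'raster' as const, tiles, tileSize: 256, attribution },
    },
    layers: [{ id, type: 'raster' as const, source: id }],
  };
}
```

Passing the result to `map.setStyle(rasterStyle('osm', [...], '© OpenStreetMap'))` would then go through the usual style-swap lifecycle described above.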
**Google tiles via the official Map Tiles API** are usable through the `maplibre-google-maps` adapter, which registers a `google://` protocol handler that proxies to Google's authenticated tile endpoints. Bring-your-own-key model — Google API key in the SPA's runtime config, not baked into the image.
The dogfood-day starter set (subject to revision):
| Style | Provider | Cost | Notes |
|---|---|---|---|
| Satellite | Esri World Imagery (raster XYZ) | Free, attribution required | No key needed; default for first launch. |
| Topo | OpenTopoMap (raster XYZ) | Free, attribution required | Useful in mountain rallies where pure satellite has cloud/shadow patches. |
| Street | OSM raster | Free | Sanity baseline. |
| Custom | User-supplied URL or style JSON | — | Operator escape hatch. |
| Google Satellite (optional) | Google Map Tiles API via adapter | First 10k req/mo free, then per-tile | Enable only if operators provide a key in runtime config. |
## TRM divergences from [[traccar-maps-architecture]]
| Traccar | TRM | Reason |
|---|---|---|
| Geofence storage in WKT, WKT↔GeoJSON conversion in client | **Native PostGIS GeoJSON** via `ST_AsGeoJSON` | We have PostGIS deployed; the round-trip is dead weight. |
| Redux dispatch per WS message | **rAF coalescer + Zustand store** | Eliminates the per-message cascade that drives Traccar's perceived lag. |
| `liveRouteLength = 10` default | **`liveRouteLength = 200` default** | Rally operators want minutes of trail, not seconds. |
| Generic fleet sprite set (car / truck / plane / ship / animal) | **Racing sprite set** (rally car / quad / SSV / motorcycle / runner / hiker / default) | Operators identify by category at a glance. |
| `react-map-gl` is a candidate React wrapper | **Raw MapLibre + singleton + side-effect components** | Declarative wrappers fight the imperative `setData` pattern. |
## Cross-references
- [[traccar-maps-architecture]] — the source architecture this concept distils.
- [[react-spa]] — the entity that implements this concept.
- [[live-channel-architecture]] — the producer-side WebSocket contract this concept consumes.
- [[processor]] — produces the live position stream.
- [[directus]] — REST API for geofence CRUD; JWT issuer for WS auth.
+9 -1
@@ -2,7 +2,7 @@
title: Position Record
type: concept
created: 2026-04-30
updated: 2026-04-30
updated: 2026-05-01
sources: [gps-tracking-architecture, teltonika-ingestion-architecture, teltonika-data-sending-protocols]
tags: [data-model, boundary-contract]
---
@@ -60,3 +60,11 @@ For [[teltonika]]:
## Downstream contract
[[processor]] is responsible for naming IO elements (e.g. `"16"``"odometer_km"`), unit conversions, and any filtering. It writes the typed fields to the positions hypertable and may write derived/named attributes to other tables.
## Wire shape vs. storage shape
The Position type above is the **wire shape** — what [[tcp-ingestion]] produces and what flows through [[redis-streams]]. The **storage shape** in the positions hypertable extends it with one operator-controlled field:
- **`faulty: boolean`** (default `false`) — set after the fact by track operators through [[directus]] when a position is unrealistic (jumpy GPS, impossible coordinate/speed). [[processor]] evaluators filter `WHERE faulty = false` on every read; flagged positions are excluded from peak-speed calculations, crossing detection, and recompute. See the operator workflow in the [[directus-schema-draft]].
This field exists only at rest. Ingestion and the live channel never see it; it has no meaning until a human reviews the data.
+51 -1
@@ -2,7 +2,7 @@
title: Directus
type: entity
created: 2026-04-30
updated: 2026-05-01
updated: 2026-05-02
sources: [gps-tracking-architecture, teltonika-ingestion-architecture]
tags: [service, business-plane, api]
---
@@ -56,10 +56,60 @@ This means **direct database writes from [[processor]] are not visible** to Dire
See [[live-channel-architecture]] for the full design, including why this split is preferable to routing telemetry writes through [[directus]]'s API or running a bridging extension inside [[directus]].
## Schema management — snapshot/apply pipeline
Schema changes flow through Directus's native snapshot mechanism, kept under git. Two artifact directories:
- **`snapshots/schema.yaml`** — Directus collections, fields, relations. Generated locally with `directus schema snapshot`. Applied at container startup with `directus schema apply --yes`. Idempotent — applies only the diff against the running DB.
- **`db-init/*.sql`** (pre-schema) — schema Directus does not manage and that needs to exist *before* `directus schema apply` runs: the [[postgres-timescaledb]] positions hypertable, the `faulty` column, future PostGIS extension. Numbered (`001_`, `002_`, …).
- **`db-init-post/*.sql`** (post-schema) — DDL that targets Directus-managed tables and therefore must run *after* schema apply has created them: composite UNIQUE constraints (which the snapshot YAML format cannot capture). Numbered independently; the runner's `migrations_applied` guard table is shared.
Both phases run via the same `apply-db-init.sh` script with `DB_INIT_DIR` overridden between calls. Each migration is wrapped in idempotent guards (`IF NOT EXISTS` / `pg_constraint` checks) so it's safe to absorb into environments where the constraint was already applied out-of-band.
Local dev edits the schema in the admin UI, then snapshots before commit. CI builds the image with both directories baked in, spins a throwaway Postgres, and dry-runs `apply` to catch breakage before deploy. Production (Portainer) runs the same apply at container start; multi-env separation is a connection string, not different artifacts.
This treats `schema.yaml` as the source of truth and the admin UI as its editor. Don't hand-edit `schema.yaml`; round-trip through the UI to keep the format consistent.
> **⚠️ Destructive-apply hazard.** `directus schema apply --yes` enforces the snapshot as the single source of truth: anything in the running DB that is *not* in the snapshot gets **deleted** during apply. This is correct for fresh-environment provisioning and prod, but a foot-gun during active schema development. The boot pipeline runs apply on every container start (entrypoint step 3/5 — pre-schema db-init → bootstrap → schema apply → post-schema db-init → start; see [[processor]] for the analogous staged-apply pattern).
>
> **Operator rule:** *Never restart or rebuild the Directus container while there are uncommitted schema changes.* The flow is always: change in admin UI / via MCP → `pnpm run schema:snapshot` → commit → only then rebuild/restart.
>
> A real incident hit this during Phase 1 task 1.5: 5 newly-created collections were destroyed by a rebuild because the baked-in snapshot was stale. Recovery was straightforward in dev (recreate via MCP, snapshot, commit) but would be data-loss in prod. CI dry-run (Phase 1 task 1.8) catches snapshot drift before it reaches stage. A long-term mitigation — `DIRECTUS_SCHEMA_APPLY_MODE` env var with `auto` / `dry-run` / `skip` modes — is on the Phase 3 hardening roadmap.
## Phase 2 role
Directus owns the `commands` collection and is the **single auth surface** for outbound device commands. The SPA inserts command rows; a Directus Flow routes them via Redis to the Ingestion instance holding the device's socket. See [[phase-2-commands]].
## Deployment
Wired into the platform stack at [`trm/deploy`](https://git.dev.microservices.al/trm/deploy)'s `compose.yaml` alongside [[redis-streams]] (the `redis` service), [[tcp-ingestion]], [[processor]], and [[postgres-timescaledb]] (the shared `postgres` service). Image built and pushed by [`trm/directus`](https://git.dev.microservices.al/trm/directus)'s Gitea workflow on every push to `main` that touches `snapshots/`, `db-init/`, `db-init-post/`, `extensions/`, `scripts/`, `entrypoint.sh`, `Dockerfile`, or the workflow file itself. CI dry-run gate validates the full boot pipeline against a throwaway Postgres before the image is published.
Directus and [[processor]] share the same Postgres instance — different tables, no contention. Schema authority is split (positions hypertable owned by [[processor]]'s migration runner, everything else by Directus's snapshot), but the database is one. See [[postgres-timescaledb]] for the writer-side split.
### Boot pipeline (5 steps)
```
1. db-init pre-schema → positions hypertable, faulty column, timescaledb extension
2. directus bootstrap → installs Directus system tables, seeds first admin if empty
3. directus schema apply → creates user collections from snapshots/schema.yaml
4. db-init post-schema → composite UNIQUE constraints on the user collections
5. pm2-runtime start → server up at :8055
```
Steps 2→3 must be in this order: schema apply requires bootstrap to have created `directus_collections` first. Step 4 must run after step 3: the constraints reference tables Directus just created. The CI dry-run runs steps 1–4 (skips step 5 — pm2 boot adds time, tests nothing new beyond what 1–4 already validated).
First boot on a fresh DB takes ~60–90 s (most of it is Directus's internal migrations during step 2). Warm boots are ~10 s — every step is idempotent.
### Network exposure
Internal-only on the deploy stack. The container exposes `:8055` to the `trm_default` Compose network but is **not** host-published. A reverse proxy (Traefik / Caddy / nginx) running on the host or attached to the same network terminates TLS and forwards the public domain to `http://directus:8055`. The proxy itself is not part of the trm stack — add it as a sibling Portainer stack or run it on the host. Direct host exposure of an admin UI is a privileged surface (full CRUD + permission policies + Flow execution) and is deliberately avoided. [[tcp-ingestion]] is the asymmetry — GPS devices connect to it directly so its TCP port must be host-published.
The dev compose in `trm/directus` (`compose.dev.yaml`) does host-publish `:8055` for local iteration. Stage / prod do not.
### First-deploy operator checklist
Lives in `deploy/README.md`'s "First-deploy checklist" section. Generates per-environment `KEY` / `SECRET` / admin-user secrets, sets Portainer stack env vars, watches the boot logs, verifies the 12 user collections landed via the admin UI. The schema-as-code rule (no admin-UI schema edits on stage — they'll be DROPPED on next rebuild) is restated where it matters.
## Failure mode
Crash → telemetry continues to flow into the database; admin UI and SPA are unavailable; no telemetry is lost. See [[failure-domains]].
+7 -1
@@ -2,7 +2,7 @@
title: PostgreSQL + TimescaleDB
type: entity
created: 2026-04-30
updated: 2026-04-30
updated: 2026-05-01
sources: [gps-tracking-architecture]
tags: [infrastructure, business-plane, database]
---
@@ -22,6 +22,12 @@ The durable storage layer. PostgreSQL with the TimescaleDB extension. Holds the
Schema is **defined and migrated through [[directus]]** — see that page for why. The Processor inserts rows respecting that schema; it does not create tables.
## Positions hypertable
Stores normalized [[position-record]] rows from [[processor]]. Beyond the wire-shape fields (device_id, timestamp, lat/lon/alt, angle, speed, satellites, priority, attributes), the hypertable carries one storage-only field:
- **`faulty boolean DEFAULT false`** — set by track operators via [[directus]] when a position is unrealistic (jumpy GPS, impossible speed/coordinate). The [[processor]]'s evaluators (peak-speed, crossing detection, recompute) filter `WHERE faulty = false` on every read of position data. Untouched at write time; mutated only through the operator workflow described in the schema draft.
## Operational note
The database is the **only single point of failure** in the architecture. Everything else is restartable, replaceable, or naturally redundant. Operational attention concentrates here:
+18 -1
@@ -9,7 +9,7 @@ tags: [service, telemetry-plane, domain-logic]
# Processor
The service where domain logic lives. Consumes normalized telemetry from [[redis-streams]] and is responsible for per-device runtime state, applying domain rules, writing durable state to [[postgres-timescaledb]], and broadcasting live position updates over WebSockets to the [[react-spa]].
The service where domain logic lives. Consumes normalized telemetry from [[redis-streams]] (default stream `telemetry:teltonika`, consumer group `processor`) and is responsible for per-device runtime state, applying domain rules, writing durable state to [[postgres-timescaledb]], and broadcasting live position updates over WebSockets to the [[react-spa]].
## Responsibilities
@@ -47,6 +47,12 @@ In multi-instance deployments, each Processor reads the [[redis-streams]] stream
Per-model IO mappings live here, not in the Ingestion layer. Example: `{ "FMB920": { "16": "odometer_km", "240": "movement" } }`. This is the boundary set by the [[teltonika]] adapter — Ingestion produces raw IO maps; the Processor names and interprets them.
## Faulty position handling
The positions hypertable in [[postgres-timescaledb]] carries a `faulty boolean DEFAULT false` column that operators can flip through [[directus]] when a position is unrealistic. **All Processor read paths against position data filter `WHERE faulty = false`** — peak-speed evaluation inside SLZs, geofence crossing detection, waypoint pass detection, replay-based recompute. The flag is never set at write time; it's a post-hoc operator action.
When an operator flips the flag (set or unset), Directus emits a webhook → Redis Stream `recompute:requests`. The Processor consumes the request and re-evaluates `entry_penalties` whose evaluation window overlaps the flagged position's timestamp. Cost sits between formula recompute (cheap) and full geometry replay (expensive) — the affected window is bounded, but the inputs (peak speed, missed-waypoint count) must be re-derived from the now-filtered position stream rather than from snapshotted values.
## Scaling
Multiple Processor instances join a Redis Streams consumer group and split the load across device IDs. Consumer-group offsets ensure a crashed instance's work is picked up by the next one.
@@ -54,3 +60,14 @@ Multiple Processor instances join a Redis Streams consumer group and split the l
## Failure mode
Crash → consumer-group offsets ensure the next instance picks up where the last left off. In-memory state is rehydrated from the database. See [[failure-domains]].
## Development workflow — Phase 2 branch model
Phase 2 (geofence engine, evaluator registry, crossings/penalties/results writers) is a substantial body of work and lives on a long-lived **`phase-2`** branch rather than landing piecemeal on `main`. Conventions:
- **Rebase weekly** against `main`, not merge. Keeps history readable and avoids merge-commit clutter when the branch eventually lands.
- **CI parity** — same workflow on `phase-2` PRs as on `main` PRs. Test coverage doesn't diverge across the branch boundary.
- **Flag-gated incremental merges** — chunks that are self-contained (a single evaluator, the geofence detector) can land on `main` behind `PROCESSOR_PHASE_2_ENABLED=false`. Off in prod, on in stage. Lets the work merge before it's user-visible without keeping the entire feature on a side branch indefinitely.
- **Single squash merge to retire the branch** — when Phase 2 is feature-complete enough to dogfood end-to-end, one squash merge retires the branch. Avoid death-by-a-thousand-merges.
Phase 2.5 (the geometry retroactivity engine in [[directus-schema-draft]]) follows the same pattern on its own branch when it starts; it is explicitly deferred until Phase 2 has shipped and the manual operator workflow for geometry edits has surfaced real pain points.
@@ -2,8 +2,8 @@
title: React SPA
type: entity
created: 2026-04-30
updated: 2026-04-30
sources: [gps-tracking-architecture]
updated: 2026-05-02
sources: [gps-tracking-architecture, traccar-maps-architecture]
tags: [service, presentation-plane, frontend]
---
@@ -26,31 +26,50 @@ One application serves multiple user types via role-based routing and conditiona
## Data access pattern
The SPA talks **exclusively** to Directus:
The SPA talks to **two endpoints**, one per plane (see [[live-channel-architecture]]):
- REST/GraphQL via `@directus/sdk`.
- WebSocket subscriptions via the same SDK.
- JWT auth managed by the SDK; refresh handled transparently.
- **[[directus]]** — REST/GraphQL via `@directus/sdk`, plus Directus's WebSocket for business-plane events. Auth, schema, all CRUD on entries / vehicles / devices / geofences / etc.
- **[[processor]]** — its own WebSocket endpoint, exclusively for the live position firehose. Authenticated by the same Directus-issued credential the SPA already holds; authorization delegated to Directus once at subscribe time.
**Never** talks to the [[processor]], [[tcp-ingestion]], [[redis-streams]], or [[postgres-timescaledb]] directly. This boundary lets the back-end evolve internally and keeps the security model coherent — every request goes through Directus's permission system.
**Never** talks to [[tcp-ingestion]], [[redis-streams]], or [[postgres-timescaledb]] directly. The two-endpoint split exists because Directus's WebSocket subscriptions only fire for writes through its own `ItemsService` — Processor's direct-to-DB position writes are invisible to it. See [[live-channel-architecture]] for why this is the architecturally honest answer rather than a workaround.
## Recommended stack
## Stack
- **Vite + React + TypeScript**
- **TanStack Router** — better TS support than React Router; optional file-based routing
- **TanStack Query** — server state, caching, invalidation, optimistic updates
- **@directus/sdk** — typed access + real-time
- **MapLibre GL + react-map-gl** — open-source WebGL maps, no token needed
- **shadcn/ui + Tailwind** — UI primitives
- **Zustand** — client-only state (filters, UI prefs)
- **react-hook-form + Zod** — forms and validation
- **Vite + React + TypeScript** — SPA build, no SSR.
- **TanStack Router** — file-based, type-safe routes.
- **TanStack Query** — Directus REST: caching, invalidation, refetch on focus.
- **`@directus/sdk`** — typed access for REST + Directus's WebSocket.
- **MapLibre GL JS** — WebGL map renderer. Used **raw**, not via `react-map-gl` — the declarative wrapper fights the imperative `setData` pattern that's the whole point of the architecture (see [[maps-architecture]]).
- **`maplibre-google-maps`** *(optional, runtime-config-gated)* — protocol adapter that lets MapLibre consume Google's official Map Tiles API when an operator-provided API key is present.
- **`@mapbox/mapbox-gl-draw`** — geofence editor (polygon / line / trash modes) wrapped to look native to MapLibre.
- **Zustand** — high-frequency live state (positions, trails, selection). Granular subscribers; chosen over Redux specifically because Redux's dispatch + selector cascade is the most likely cause of the lag observed in production [[traccar-maps-architecture]] deployments.
- **shadcn/ui + Tailwind** — UI primitives.
- **react-hook-form + Zod** — forms and validation.
Covers the spectrum from form-heavy admin screens to real-time map dashboards without architectural changes between them.
Covers form-heavy admin screens and real-time map dashboards without architectural changes between them.
## Auth pattern
Same-domain session cookie via reverse proxy. One origin serves the SPA, Directus, and Processor's WebSocket endpoint — Vite's `server.proxy` in dev, Traefik (or whatever fronts the deploy stack) in stage/prod.
- Login uses the Directus SDK in **session mode** (`authentication('session', { credentials: 'include' })`). Directus issues an `httpOnly`/`Secure`/`SameSite=Lax` session cookie; the cookie itself carries the session — there is no separate in-memory access token to manage and no `/auth/refresh` dance.
- Reload survives cleanly: the browser still has the cookie, `/users/me` returns the user without any client-side state.
- No JavaScript ever reads or writes the cookie (it's `httpOnly`), so XSS cannot exfiltrate it.
- WebSocket handshake: same-origin means the browser sends the session cookie automatically with the upgrade request. Processor reads it on the upgrade, validates against Directus's `/users/me`, and uses the resulting user identity for subscription authorization. See [[live-channel-architecture]] and [[processor-ws-contract]].
This requires the proxy to serve everything under one origin (path-based or single subdomain) — separate subdomains break cookie flow.
**Mode choice context.** Directus's SDK also supports `'cookie'` mode (refresh cookie + in-memory access token). It works while the SDK is alive in memory but doesn't survive a hard reload cleanly because there's no access token to retry `/users/me` against, and the refresh-then-read sequence is order-sensitive. `'session'` mode collapses that to one credential — the session cookie — and is the right default for an SPA that wants reload-survives behaviour.
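A sketch of the handshake's first step on the Processor side — extracting the session cookie from the upgrade request's `Cookie` header. The cookie name is deployment configuration (Directus's `SESSION_COOKIE_NAME`); `directus_session_token` below is illustrative, and the Processor stays cookie-name-agnostic per [[processor-ws-contract]]:

```typescript
// Pull a named cookie out of a raw Cookie header. The Processor would then
// validate the extracted token against Directus's /users/me to resolve the
// user identity for subscription authorization.
function sessionTokenFromCookieHeader(
  header: string | undefined,
  cookieName = "directus_session_token", // illustrative default
): string | undefined {
  if (!header) return undefined;
  for (const pair of header.split(";")) {
    const eq = pair.indexOf("=");
    if (eq === -1) continue;
    if (pair.slice(0, eq).trim() === cookieName) {
      return decodeURIComponent(pair.slice(eq + 1).trim());
    }
  }
  return undefined;
}
```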
## Real-time rendering
- **Live maps with many markers**: React reconciler is not the bottleneck — drawing happens in WebGL via MapLibre, which manages features outside React's tree. The React layer manages subscriptions and feeds the map updates.
- **High-frequency tabular updates** (live leaderboards, event feeds): split components so high-update areas re-render in isolation; use TanStack Query for live data; memoize at component boundaries that receive frequent updates.
The full pattern lives at [[maps-architecture]]; the headlines:
- **MapLibre is a singleton** held in a module-level variable, attached to a detached `<div>` that React refs mount/unmount per page. WebGL context survives navigation.
- **Two GeoJSON sources** for live positions: clustered non-selected, unclustered always-on-top selected. Updates flow through `setData`, not DOM marker manipulation.
- **rAF coalescer at the WS boundary.** Incoming position messages buffer per-device; one `requestAnimationFrame` tick flushes the latest snapshot to the Zustand store. Without this, per-message dispatches cascade through selectors and `setData` at every position arrival — the failure mode [[traccar-maps-architecture]] exhibits.
- **Per-device bounded ring buffers** for trail history. Default 200 points per device, configurable. The throttle controls visual cadence; trails are never lossy.
- **High-frequency tabular updates** (live leaderboards, event feeds) — same Zustand store, separate component subtrees so the map's re-renders don't ripple into the leaderboard and vice versa.
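A testable sketch of the coalescer plus the trail cap. `schedule` is injected so the pattern runs outside a browser; in the SPA it would be `requestAnimationFrame`. Names and shapes are illustrative:

```typescript
type Pos = { deviceId: string; lat: number; lng: number; ts: number };

// Buffer the latest position per device; flush at most once per scheduled tick.
class PositionCoalescer {
  private buffer = new Map<string, Pos>();
  private scheduled = false;

  constructor(
    private flush: (latest: Map<string, Pos>) => void,
    private schedule: (cb: () => void) => void, // requestAnimationFrame in the SPA
  ) {}

  push(p: Pos) {
    this.buffer.set(p.deviceId, p); // a later message overwrites an earlier one
    if (!this.scheduled) {
      this.scheduled = true;
      this.schedule(() => {
        this.scheduled = false;
        const batch = this.buffer;
        this.buffer = new Map();
        this.flush(batch); // one store write per frame, not per message
      });
    }
  }
}

// Bounded trail ring buffer: append, then trim to the cap (default 200).
function appendTrail(trail: Pos[], p: Pos, cap = 200): Pos[] {
  const next = [...trail, p];
  return next.length > cap ? next.slice(next.length - cap) : next;
}
```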
## Failure mode
@@ -2,7 +2,7 @@
title: Redis Streams
type: entity
created: 2026-04-30
updated: 2026-04-30
updated: 2026-05-01
sources: [gps-tracking-architecture, teltonika-ingestion-architecture]
tags: [infrastructure, telemetry-plane, queue]
---
@@ -21,9 +21,23 @@ The durable in-flight queue between [[tcp-ingestion]] and [[processor]]. Also th
Sufficient at current scale and adds minimal operational burden. NATS or Kafka are reasonable upgrades when **multi-region durability** or **very high throughput** become real concerns. Until then, Redis is the right choice.
## Stream and key naming
Canonical names used across the platform. Both [[tcp-ingestion]] and [[processor]] reference these via the `REDIS_TELEMETRY_STREAM` environment variable, pinned in the deploy stack so the two services cannot drift from each other.
| Name | Purpose | Producer | Consumer |
|---|---|---|---|
| `telemetry:teltonika` | Inbound Position records from Teltonika devices | [[tcp-ingestion]] (XADD) | [[processor]] (XREADGROUP, group=`processor`) |
| `commands:outbound:{instance_id}` | Outbound device commands routed to a specific [[tcp-ingestion]] instance | [[directus]] Flow | [[tcp-ingestion]] |
| `commands:responses` | Command ACK/nACK and replies | [[tcp-ingestion]] | [[directus]] Flow |
| `connections:registry` (hash) | IMEI → instance routing table | [[tcp-ingestion]] | [[directus]] Flow |
| `instance:heartbeat:{instance_id}` (key, `EX 90`) | Liveness signal per [[tcp-ingestion]] instance | [[tcp-ingestion]] | janitor / [[directus]] Flow |
**Naming convention.** Telemetry streams are namespaced by vendor (`telemetry:{vendor}`) so adding a second adapter (Queclink, Concox, etc.) creates `telemetry:queclink` rather than competing for shape on the same stream. [[processor]] consumes the union by joining a consumer group on each.
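The convention as a helper, purely illustrative — in practice services read the pinned `REDIS_TELEMETRY_STREAM` value from the environment rather than deriving names at runtime:

```typescript
// Vendor-namespaced telemetry stream names (telemetry:{vendor}).
function telemetryStream(vendor: string): string {
  if (!/^[a-z0-9-]+$/.test(vendor)) throw new Error(`bad vendor slug: ${vendor}`);
  return `telemetry:${vendor}`;
}
```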
## Phase 2 usage
Outbound commands ride on per-instance streams: `commands:outbound:{instance_id}`. Responses ride on `commands:responses`. Redis is the transport; the source of truth for commands is the Directus `commands` collection. See [[phase-2-commands]].
Outbound commands ride on per-instance streams: `commands:outbound:{instance_id}`. Responses ride on `commands:responses`. Redis is the transport; the source of truth for commands is the [[directus]] `commands` collection. See [[phase-2-commands]].
The connection registry (`connections:registry` hash) and per-instance heartbeats (`instance:heartbeat:{instance_id}` keys with `EX 90`) also live in Redis.
@@ -2,14 +2,14 @@
title: TCP Ingestion
type: entity
created: 2026-04-30
updated: 2026-04-30
updated: 2026-05-01
sources: [gps-tracking-architecture, teltonika-ingestion-architecture]
tags: [service, telemetry-plane]
---
# TCP Ingestion
The service that maintains persistent TCP connections with GPS devices, parses vendor binary protocols, ACKs frames per protocol, and hands off normalized records to the [[redis-streams]] queue.
The service that maintains persistent TCP connections with GPS devices, parses vendor binary protocols, ACKs frames per protocol, and hands off normalized records to the [[redis-streams]] queue (default stream `telemetry:teltonika` for the Teltonika adapter; see [[redis-streams]] for the full naming convention).
## Responsibility
@@ -0,0 +1,194 @@
---
title: Rally Albania 2025 — Race Rules and Regulations
type: source
created: 2026-05-01
updated: 2026-05-01
source_path: raw/Regulations_2025.pdf
source_date: 2024-10
source_kind: other
sources: []
tags: [rally, regulations, albania, federation-rules, classes, start-order, penalties]
---
# Rally Albania 2025 — Race Rules and Regulations
> Authoritative rulebook for Rally Albania 2025 (07–14 June 2025). Issued October 2024 by Motorsport Club Albania, technical organizer; legal entity is the Albanian Motorcycle Federation. Section numbers below reference the doc directly (§X.Y) so the rest of the wiki can cite precisely.
## TL;DR
A multi-day International Cross Country Rally Raid with a prologue, multiple regular stages, and an epilogue, running across four vehicle categories (MOTO/QUAD/CAR/SSV) further split into 17 classes. Timing is RFID + GPS. Each vehicle carries **two** independent GPS trackers; no tracker = no start. Start order is **dynamic per stage** with several distinct seeding rules (§5.5–§5.10). Penalties are time-additive only, never affect next-stage seeding. The Supplementary Regulation (SR), published 60 days before the event, overrides the general regs where in conflict (§1.8–§1.9).
This is the canonical real-world reference for the TRM business-plane schema and the React SPA's user-facing semantics.
## Categories and classes (§2.2–§2.5)
Four categories, 17 classes total. The class catalog is the concrete fixture for [[directus-schema-draft]] `classes`.
| Category | Class | Description |
|---|---|---|
| MOTO | M-1 | Under 450cc |
| | M-2 | 450–600cc |
| | M-3 | over 600cc, single cylinder |
| | M-4 | over 600cc, bi-cylinder |
| | M-5 | Senior, under 450cc (born 1.1.1967–31.12.1975) |
| | M-6 | Senior, over 450cc (born 1.1.1967–31.12.1975) |
| | M-7 | Veteran (born before 1.1.1967) **and** Female driver — both reuse code M-7 in the doc; likely a numbering bug. |
| QUAD | Q-1 | Any engine, 2WD |
| | Q-2 | Any engine, 4WD |
| | Q-3 | Any quad, female pilot |
| CAR | C-1 | Offroad, heavily modified |
| | C-2 | Offroad, lightly modified |
| | C-A | Standard passenger automobiles, any modifications |
| | C-3 | Offroad, all-female pilot+copilot |
| SSV | S-1 | Sport UTV, single pilot |
| | S-2 | Sport UTV, two-driver team |
| | S-3 | Sport UTV, all-female team |
Classifications: per-class within each category, plus **three independent general standings** (§2.7–§2.8): G1 = MOTO/QUAD together (all classes), G2 = CAR all classes, G3 = SSV all classes.
## Race numbers (§5.1–§5.4)
Numbers assigned by inscription order; previous-year category winners get 1 / 200 / 300 / 400; first three of each category are reserved.
| Group | Range | Plate background |
|---|---|---|
| Moto | 1–199 | white |
| Quad | 2xx | white |
| Car | 3xx | white |
| SSV | 4xx | white |
| Assistance vehicles | 6xx | yellow |
| Media | 7xx | green |
| Organization | 9xx | red |
Plate background color is a load-bearing visual signal — relevant to the SPA's vehicle marker styling.
## Start order — strategies per stage (§5.5–§5.10)
This is the single most schema-relevant section. Start order is **dynamic** and rule-driven; the rule varies by stage role and category.
| Stage role | Bikes / Quads | Cars / SSV |
|---|---|---|
| Prologue (§5.5) | Arbitrary by org based on history | Arbitrary by org based on history |
| Stage 1 (§5.6, §5.7) | Top 20 of prologue **inverted**; positions 21+ in prologue order | Pure prologue result order |
| Stages 2 → last (§5.8) | Previous stage's **clean** SS time | Previous stage's clean SS time |
| Epilogue (§5.9) | **Inverse** of overall standings after the last stage | Inverse of overall standings |
**Penalties never feed seeding** — §5.8 is explicit: "possible penalties will not count on the starting order but will be added on overall time." Seeding is on clean SS time; penalties roll into the overall total only.
**Time between starts is set per stage**, the day before each one (§5.10). It is not an event-level constant.
Bikes/quads start as a combined grid; cars and SSVs each have their own grid (§5.10 + §2.8). Categories seed independently.
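The §5.6–§5.7 bike/quad seeding rule as a pure function — a sketch; entry identifiers are illustrative:

```typescript
// Stage 1 bike/quad start order: top 20 of the prologue inverted,
// positions 21+ kept in prologue order. (Cars/SSV per §5.6 use the
// pure prologue result order instead.)
function stageOneBikeOrder(prologueOrder: string[]): string[] {
  const invertedTop = prologueOrder.slice(0, 20).reverse();
  return [...invertedTop, ...prologueOrder.slice(20)];
}
```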
## Tracking system (§7.1–§7.5)
- **Every vehicle gets a GPS/GPRS tracker provided by the organization** (§7.1).
- **Two independent devices per vehicle** — Rally Albania 2025 specifically mandates a dual-device tracking system (§7.2). The TRM schema's "many devices per vehicle" m2m is operationally required, not just supported.
- **Tracker malfunction blocks the start**: "Competitors that have malfunctions on the tracker before the start, will not be allowed to start before repairing it" (§7.3, restated in §13.15 / §13.34).
- Trackers are used for **route reconstruction, passage controls, hidden waypoint detection, speed in SLZs, and SOS** (§7.4).
- Trackers are **rented from a partner company, paid by the participant** (§7.5). The org provides them; the competitor doesn't.
- Competitor self-use of tracker data is **prohibited** (§7.2) — the tracker is data-recorder + SOS, not a navigation aid for the racer.
## Timekeeping (§9.1–§9.20)
- Timing run by the "Time and Race Management" company (§9.1).
- **RFID transponder** for time keeping (§9.2). Paper Time Card for back-up and passage controls (§9.3).
- **Late at the start of a stage** (§9.5–§9.6): one-minute-per-minute penalty up to a cutoff = 5 minutes before the first vehicle of the next category. Past that cutoff = disqualification from that stage day.
- **Late at the start of first SS** (§9.7): one-second-per-second penalty.
- **Late vehicles' exact start time** is at Race Marshal discretion, "when finding an open window to start" (§9.8). This is the operational basis for `entry_segment_starts.manual_override`.
- **CP closing time = 60 minutes after the last competitor's ideal time** (§9.19). After this, the CP signpost closes.
- **Late past CP closing** (§9.9): cannot enter the SS, takes a penalty per formula.
- **Liaison sections** (§9.11–§9.13): target time given; one-minute-per-minute penalty for late or early arrival; exception is arriving early at the finish after the last SS of a stage.
- **Special sections** (§9.14): fastest possible, but within a maximum time allowed; checking in past max time → fixed daily penalty.
- **Waypoints** are numbered and verified by the Tracking System (§9.18) — pure GPS detection, no marshal.
- **Late or early is equally penalised** (§9.10) — symmetric in time penalties for liaisons.
## Penalty taxonomy (§12)
The doc classifies penalties as Sportive / Speed / Disciplinary / Special.
### Sportive
| Type | Trigger | Penalty |
|---|---|---|
| Late/early at CP-1 / CP-2 / CP-4 / CP-6 / CP-8 (§12.2) | Liaison-style time controls | 1 minute per minute |
| Last LS no-early-penalty (§12.3) | Stage finish | exempt from early-arrival rule |
| **Missing CP** (§12.4) | No crossing detected within stage window | Worst valid time of day in your category + **120 min** per CP missed |
| **Later than Maximum Time Allowed** (§12.5) | Stage MTA exceeded | Worst valid + **60 min** |
| **Arrived at CP after closing hour** (§12.6) | Crossing after `time_control_closed_at` | Worst valid + 120 min per CP missed (same formula as §12.4 but distinct trigger) |
| **Missing WP** (§12.7) | No GPS pass-through detected | **60 min** per WP |
| Course cut, no CP/WP missed, gain (§12.8) | Track deviation | Best time of day in that sector × 3 |
| Transported to next bivouac by org (§12.9) | Recovery | Fixed |
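Two of the table's formulas as arithmetic, assuming minutes as the unit (the regs express these penalties in minutes):

```typescript
// §12.4 — missing CP: stage time becomes the worst valid time of the day in
// the entry's category, plus 120 min per CP missed.
function missedCpTime(worstValidOfDayMin: number, cpsMissed: number): number {
  return worstValidOfDayMin + 120 * cpsMissed;
}

// §12.7 — missing WP: a flat 60 min added per waypoint missed.
function missedWpPenaltyMin(wpsMissed: number): number {
  return 60 * wpsMissed;
}
```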
### Speed
- **Overspeed in Speed Limit Zones** (§12.11): time penalty per formula. Formula itself is in the **Supplementary Regulation**, not the general regs. The Tirana 24h regs publish a five-bracket table; Rally Albania defers it to SR. The TRM penalty system is data-driven, so this fits naturally.
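A sketch of what a data-driven, slice-by-slice progressive bracket evaluation could look like. The per-bracket rates below follow the organizer-supplied variant (×5/×10/×30/×90/×240); the bracket *boundaries* are illustrative — for Rally Albania 2025 the whole table must come from the SR:

```typescript
// upTo = overspeed ceiling of the bracket (km/h over the limit);
// each km/h slice is charged at the rate of the bracket it falls into.
type Bracket = { upTo: number; secPerKph: number };

const exampleBrackets: Bracket[] = [
  { upTo: 5, secPerKph: 5 },        // boundaries illustrative
  { upTo: 10, secPerKph: 10 },
  { upTo: 20, secPerKph: 30 },
  { upTo: 40, secPerKph: 90 },
  { upTo: Infinity, secPerKph: 240 },
];

function slzPenaltySeconds(overspeedKph: number, brackets: Bracket[]): number {
  let penalty = 0;
  let prev = 0;
  for (const b of brackets) {
    if (overspeedKph <= prev) break;
    const slice = Math.min(overspeedKph, b.upTo) - prev;
    penalty += slice * b.secPerKph;
    prev = b.upTo;
  }
  return penalty;
}
```

Because brackets are plain data, swapping the Tirana table for the SR's table is a row change, not a code change — which is the point of the data-driven penalty system.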
### Disciplinary
| Type | Penalty |
|---|---|
| Not present at start without notice (§12.13) | Disqualification |
| Not able to start (§12.14) | Fixed time penalty |
| Withdraw from race (§12.15) | Maximum Time Allowed for remaining stages |
| **Not wearing protection in SS** (§12.16) | **60 min** per SS spotted — marshal-observed event, not GPS-derivable |
### Special
- **Force majeure / fair play stops** (§12.18): manual time adjustment by Race Direction, can be positive or negative. SR carries detail.
§12.16 ("not wearing protection") and §12.13 are the only penalty types in the doc that are **not** GPS-derivable — they require marshal observation. The schema must allow `entry_penalties` rows to be created manually (operator path), not only by the processor.
## Liaison vs Special — formal distinctions (§6.8–§6.13)
- **Liaison**: defined average speed, exact target time. Time penalty for both early and late arrival.
- **Special section**: fastest possible, sum of all SS times across all stages decides the winner.
- Stage time = sum of SS times + penalties (§6.15).
- Rally winner = sum of all stage times + penalties (§6.17).
- **Tiebreaker** (§6.19): best time in the **last stage** wins; if still tied, walk back stage-by-stage in reverse calendar order.
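The §6.19 tiebreaker as a comparator sketch — stage times indexed in calendar order, lower time wins, negative means the first entry wins:

```typescript
// Compare per-stage times starting from the last stage, walking back
// stage-by-stage in reverse calendar order until one entry is faster.
function tieBreak(aStageTimes: number[], bStageTimes: number[]): number {
  for (let i = aStageTimes.length - 1; i >= 0; i--) {
    const d = aStageTimes[i] - bStageTimes[i];
    if (d !== 0) return d;
  }
  return 0; // dead heat across every stage
}
```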
## Roadbook (§6.1–§6.5)
- Per-stage roadbook distributed at 20:00 the day before each stage (§6.2).
- A5 format for cars; 135 mm roll for bikes/quads; PDF for those who use it (§6.3).
- Minimum unit distance 10 m; rally computer calibrated to exactly 1 km the day before, satellite-measured (§6.4).
- Roadbook describes every PC, TC, WP, and SLZ (§6.5).
## Protests and live timing (§11.1–§11.8)
- Protests must be lodged within **60 minutes** after results publication (§11.2).
- Decision before the next stage start, or before final results for last-stage protests (§11.4).
- **Live timing is NOT official and CANNOT be subject to protests** (§11.8). This is a hard policy — the SPA must surface live data with appropriate "unofficial" framing; only published stage results are protestable.
- Fair-play time deductions (§11.5–§11.7): stop for medical help can be deducted from SS time at Race Direction discretion, claimed within 60 minutes of day's results.
- **Official bulletin channel = WhatsApp Broadcast** (§6.20). All participants must install WhatsApp with broadcast enabled. Not a TRM responsibility, but worth knowing.
## Supplementary Regulations (§1.8–§1.9)
- SR published 60 days before event.
- SR carries: timing details, sportive penalties (specifics including the SLZ formula), and bivouac details.
- **SR overrides general regs where in conflict** (§1.9). Implies the TRM event model needs to allow per-event overrides of rule data — practically, this is already what `penalty_formulas` scoped per event/stage achieves, but it deserves a note in the schema draft.
- **Extraordinary rules**: applied by the Clerk of the Course on the spot (§1.9 second). Operationally these become manual `entry_penalties` rows or marshal-driven schedule adjustments.
## Refueling and assistance (§8.1–§8.13)
- Assistance only at designated points (§8.1), provided to assistance teams pre-stage (§8.2).
- On-track assistance vehicles start 10 min after last race vehicle, in route direction only (§8.3).
- **In-SS refueling**: only at declared passage controls; if refueling is on track, the point becomes a **neutralization zone equal for all competitors** (§8.12). Worth flagging — neutralization implies a per-segment "stopwatch pause" that the processor would need to subtract from SS time. Probably warrants a future segment subtype or a neutralization geofence kind. Not blocking.
- Minimum fuel autonomy 140 km, recommended 160 km (§8.13).
## Notable quotes (verbatim)
> "Second stage up to the last stage will be defined by the result of the previous stage and NOT the overall standings. (possible penalties will not count on the starting order but will be added on overall time)." — §5.8
> "Live Timing it is not official and can not be subject to protests." — §11.8
> "No vehicle can take race numbers start without Tracking System mounted." — §13.15 / §13.34
> "Differing from the road book can lead to the entry of prohibited and/or dangerous areas." — Closing remarks
## Open questions / follow-ups
- **§12.11 SLZ formula** lives in the SR, not in the general regs. We have the Tirana 24h SLZ formula (5 brackets ×2/×5/×15/×40/×120 sec/km/h) as a reference, and a separate organizer-supplied version for the same rally (×5/×10/×30/×90/×240 — slice-by-slice progressive). For Rally Albania 2025, the actual brackets must come from the SR — we shouldn't hardcode either Tirana variant as the default.
- **M-7 numbering bug** — both "Veteran" and "Female driver" use code M-7 in §2.2. Probably a typo for M-8. When seeding the `classes` table for Rally Albania, treat as two distinct classes; flag for organizer confirmation.
- **Neutralization zones** (§8.12) — not yet modeled in the schema. Would require either a `neutralization` segment subtype or a `geofence.kind = neutralization-zone` with processor-side stopwatch handling. Defer until a real event uses one.
- **Class M-5 / M-6 senior age window** — defined as "born before 1 January 1976 and after 31 December 1966." Birthdate-derived class assignment may shift year-over-year; the `classes` table should treat eligibility rules as data, not derive them.
- **WhatsApp Broadcast** as official bulletin (§6.20) — not in scope for TRM, but a reminder that final/official data flow happens outside the system.
@@ -0,0 +1,163 @@
---
title: Traccar Web — Maps Architecture
type: source
created: 2026-05-02
updated: 2026-05-02
sources: []
source_path: raw/TRACCAR_MAPS_ARCHITECTURE.md
source_date: 2026-05-02
source_kind: note
tags: [maps, frontend, reference, architecture, maplibre]
---
# Traccar Web — Maps Architecture
Internal architectural deep-dive into how `traccar-web` (the React front-end of the Traccar GPS tracking platform) builds its maps subsystem. Documents the rendering engine, tile-source strategies (including how Google's tiles are integrated despite Google not being a first-class MapLibre provider), GeoJSON-driven feature rendering, geofence editing, and the WebSocket → Redux → `setData` live-data pipeline.
This is the canonical reference architecture for [[react-spa]]. TRM's SPA inherits the bulk of these patterns and diverges in a small, deliberate set of places — see the divergences section below and the dedicated [[maps-architecture]] concept page.
## TL;DR
Traccar's web app does **not** use the Google Maps JavaScript API. It uses **MapLibre GL JS** (a fork of pre-1.0 Mapbox GL JS) as a single, app-lifetime singleton WebGL renderer. Every map "object" — devices, geofences, routes, accuracy circles, POIs — is a feature in a GeoJSON source rendered by MapLibre style layers. Live data flows from a single WebSocket through Redux into `useSelector`-subscribed components that call `setData` on the appropriate source. Google tiles are consumed either via the official Map Tiles API (through the `maplibre-google-maps` adapter, which registers a `google://` protocol handler) or by hitting Google's legacy public tile servers directly when no API key is configured.
The architecture is well-designed for the rendering side. The likely failure mode at scale is the **per-message Redux dispatch + per-`setData` call** pattern in the data pipeline — at high position rates this cascades through selectors and rebuilds full feature collections on every position. TRM's SPA addresses that with a `requestAnimationFrame` coalescer at the WS boundary (see divergences).
## Key claims
Each claim is self-contained — readable without prior context.
### Rendering engine
- **MapLibre GL JS is the rendering engine**, not the Google Maps SDK or Leaflet. WebGL all the way down: vector and raster sources, GPU-rasterised symbol/circle/line layers, no DOM markers.
- **A single `maplibregl.Map` instance lives for the entire app lifetime.** It's constructed at module load against a detached `<div>` held in a module-level variable. The React `<MapView>` component just appends that detached `<div>` into its own ref on mount and removes it on unmount, then calls `map.resize()`.
- This means **navigating between pages (Main / Replay / Geofences / Reports) doesn't recreate the WebGL context, doesn't re-upload icon sprites, and doesn't refetch the style.** Significant perf win.
- Every other map component in the codebase imports the same singleton: `import { map } from './core/MapView'` and calls `map.addSource`, `map.addLayer`, `map.on(...)` directly. They render `null` to React — they are imperative side effects wrapped in a component for lifecycle handling.
### `Map*` component contract
- **Every `Map*` component is a side-effect-only React component.** It returns `null` and uses `useEffect` to add sources/layers, with the cleanup function removing them.
- **Two-effect pattern is consistent across the codebase:** one `useEffect` with `[]`-ish deps for setup (add source + layer), one with `[data]` deps for updates that just calls `map.getSource(id)?.setData(...)`.
- **`useId()` is used to generate unique source/layer ids** so the same component can be mounted multiple times safely.
### Tile layers
- **Two flavours of basemap style:** vector styles (full MapLibre style JSON URLs from providers like OpenFreeMap, MapTiler, LocationIQ, TomTom, Ordnance Survey) and raster styles built ad-hoc by a `styleCustom({ tiles, minZoom, maxZoom, attribution })` helper.
- Raster `styleCustom` is used for OSM, OpenTopoMap, Carto, **all three Google variants** (Road / Satellite / Hybrid), Bing, HERE, Yandex, AutoNavi, Mapbox raster styles, and any user-supplied custom URL.
- The synthesised raster style includes a `glyphs` URL pointing at `cdn.traccar.com/map/fonts/...` so even raster-only basemaps can render text labels for overlays (geofences, devices, POIs).
- **Google tiles are accessed two ways depending on whether the user has set a `googleKey`:**
- **With key:** `tiles: ['google://roadmap/{z}/{x}/{y}?key=...']`. The `google://` protocol is intercepted by `maplibre-google-maps`'s `googleProtocol` handler, registered once globally via `maplibregl.addProtocol('google', googleProtocol)`. The handler calls Google's **official Map Tiles API** with session-token authentication. This is the legitimate, billable path.
- **Without key:** ``tiles: [0,1,2,3].map((i) => `https://mt${i}.google.com/vt/lyrs=m&hl=en&x={x}&y={y}&z={z}&s=Ga`)``. Hits Google's legacy public tile servers directly. Unauthenticated, ToS-grey, availability at Google's discretion.
- Hybrid uses `satellite` + `&layerType=layerRoadmap` on the keyed path.
- A user-configurable `custom` style branches: if `mapUrl` contains `{z}` or `{quadkey}` it's treated as a tile template via `styleCustom`; otherwise it's treated as a full style JSON URL.
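A sketch of what the `styleCustom` helper plausibly synthesises — a minimal raster style per the MapLibre style spec. Defaults and the exact field values are illustrative; the glyphs URL mirrors the one described above:

```typescript
// Synthesise a minimal MapLibre raster style from a tile template.
function styleCustom(opts: {
  tiles: string[];
  minZoom?: number;
  maxZoom?: number;
  attribution?: string;
}) {
  return {
    version: 8,
    // Lets raster-only basemaps still render text labels for overlays.
    glyphs: "https://cdn.traccar.com/map/fonts/{fontstack}/{range}.pbf",
    sources: {
      custom: {
        type: "raster",
        tiles: opts.tiles,
        tileSize: 256,
        minzoom: opts.minZoom ?? 0,
        maxzoom: opts.maxZoom ?? 18,
        attribution: opts.attribution ?? "",
      },
    },
    layers: [{ id: "custom", type: "raster", source: "custom" }],
  };
}
```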
### Style swaps reset the world
- **Setting a new style wipes all custom sources/layers.** This is MapLibre's standard behaviour, not a Traccar quirk.
- A module-level `ready` flag plus a `Set<readyListeners>` lets React components subscribe to global ready state. The basemap switcher fires `onBeforeSwitch → updateReadyValue(false)`, which causes all child map components to unmount and clean up. After `map.setStyle(...)` the switcher waits for `styledata`, polls `map.loaded()` every 33 ms, then calls `initMap()` (which re-adds icon sprites via `map.addImage(...)`) and `updateReadyValue(true)`. Children remount and re-add their sources/layers.
- Components that should persist across style swaps (`MapScale`, `MapCurrentLocation`, `MapGeocoder`, `MapNotification`) are added/removed by their own components, outside `<MapView>`'s children list.
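The ready gate boils down to a module-level flag plus a listener set; a plain-TS sketch (the real module additionally wires this into React subscriptions):

```typescript
let ready = false;
const readyListeners = new Set<(value: boolean) => void>();

// Subscribe to the global ready state; returns an unsubscribe function.
function addReadyListener(listener: (value: boolean) => void): () => void {
  readyListeners.add(listener);
  listener(ready); // deliver the current value immediately
  return () => {
    readyListeners.delete(listener);
  };
}

// Called with false before a style swap (children unmount and clean up),
// then true once the new style has loaded and sprites are re-added.
function updateReadyValue(value: boolean) {
  ready = value;
  readyListeners.forEach((l) => l(value));
}
```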
### Icon sprites
- **All device category icons are SVGs** (`car`, `bus`, `truck`, `bicycle`, `plane`, `ship`, `person`, `animal`, ...) plus a generic `background.svg` and a `direction.svg` arrow.
- **Pre-rasterised once at app startup** (`preloadImages()` in `src/index.jsx`):
1. Each SVG is loaded to an `HTMLImageElement`.
2. For each `category × colour` (`info`, `success`, `error`, `neutral`), `prepareIcon(background, icon, color)` draws the background to a canvas at `devicePixelRatio` scale, tints the icon with the colour using a `destination-atop` canvas trick, composites it centred, and stores the raw `ImageData` keyed `${category}-${color}` (e.g. `car-success`, `truck-error`).
- `MapView.initMap` then calls `map.addImage(key, imageData, { pixelRatio })` for every entry once a style has loaded. After that, layers can reference any sprite by name in style expressions like `'icon-image': '{category}-{color}'`.
### Live device positions — `MapPositions`
- **Two GeoJSON sources, identified by `useId()`-based unique strings:**
- `id` — non-selected devices, `cluster: true`, `clusterMaxZoom: 14`, `clusterRadius: 50`. MapLibre's built-in clustering handles aggregation.
- `selected` — the currently selected device only, never clustered, always rendered on top.
- Each source gets a symbol layer rendering `'icon-image': '{category}-{color}'` plus the title (device name or `fixTime`), and a `direction-…` symbol layer filtered to features where `direction === true`, drawing a direction arrow rotated by `course` with `'icon-rotation-alignment': 'map'`.
- Plus a single `clusters` symbol layer on the main source filtered `['has', 'point_count']` showing a `background` icon with the count.
- Feature properties drive layer expressions: `category` (selects sprite), `color` (`success`/`error`/`neutral`/`info` based on status or attribute override), `rotation` (course), `direction` (controlled by `mapDirection` user pref: `none` / `all` / `selected`).
- **Updates flow through `setData`:** a second `useEffect` re-runs whenever `positions`, `devices`, `selectedPosition`, or `mapCluster` change, and calls `map.getSource(source)?.setData(...)` with a freshly built FeatureCollection for each source. MapLibre diffs and re-renders.
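The rebuild-and-`setData` step can be sketched as a pure function. A minimal sketch with assumed, simplified `Position`/`Device` shapes; the real component derives more properties (title, selected/cluster split, colour overrides):

```typescript
// Build a fresh FeatureCollection from the latest positions and hand it to a
// GeoJSON source via setData. Sketch only — property derivation is richer in
// the real MapPositions component.
interface Position { deviceId: number; longitude: number; latitude: number; course: number; }
interface Device { id: number; category: string; status: string; }

function buildFeatureCollection(positions: Position[], devices: Map<number, Device>) {
  return {
    type: "FeatureCollection" as const,
    features: positions.map((p) => ({
      type: "Feature" as const,
      geometry: { type: "Point" as const, coordinates: [p.longitude, p.latitude] },
      properties: {
        category: devices.get(p.deviceId)?.category ?? "default", // selects the sprite
        color: devices.get(p.deviceId)?.status === "online" ? "success" : "neutral",
        rotation: p.course,
      },
    })),
  };
}

// Inside the effect: map.getSource(sourceId)?.setData(buildFeatureCollection(...))
```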
### Geofences (rendering and editing)
- **Three layers on a single GeoJSON source:** `geofences-fill` (semi-transparent polygon fill, opacity 0.1), `geofences-line` (per-feature `color`/`width`/`opacity` from attributes), `geofences-title` (symbol layer rendering `{name}`).
- **Geofences arrive from the backend as WKT** in `item.area`. `geofenceToFeature(theme, item)` handles conversion: `CIRCLE(...)` strings are extracted as `(lat lon, radius)` and approximated as 32-step polygons in metres via `@turf/circle`; everything else is `wellknown.parse(item.area)` followed by `reverseCoordinates(...)` because **WKT is `lat lon` while GeoJSON is `lng lat`**.
- The reverse mapping (`geometryToArea`) is used when saving edits.
- **Editing uses `@mapbox/mapbox-gl-draw`** wrapped to look native to MapLibre (CSS class patches). Modes: polygon, line_string, trash. Listeners: `draw.create` → POST `/api/geofences`, `draw.update` → PUT, `draw.delete` → DELETE. On Redux geofence state change, Draw is cleared and re-populated with all features.
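The `lat lon` → `lng lat` flip is small enough to sketch. A minimal recursive version of the idea behind `reverseCoordinates` (not Traccar's actual source), which works at any GeoJSON nesting depth:

```typescript
// Recursively flip [lat, lon] → [lon, lat] at every nesting depth of a
// GeoJSON coordinates array (Point, LineString, Polygon rings, ...).
// Sketch of the idea behind Traccar's reverseCoordinates; not its source.
type Coords = number[] | Coords[];

function reverseCoordinates(coords: Coords): Coords {
  // A position is a flat array of numbers: swap the first two values.
  if (typeof coords[0] === "number") {
    const [a, b, ...rest] = coords as number[];
    return [b, a, ...rest];
  }
  // Otherwise recurse into lines / rings / polygons.
  return (coords as Coords[]).map(reverseCoordinates);
}
```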
### Routes (history and live trails)
- **Replay route (`MapRoutePath`)**: builds a `FeatureCollection` of **per-segment** `LineString` features, one per consecutive position pair, and **colours each segment by the second point's speed** via `getSpeedColor(speed, minSpeed, maxSpeed)`. If the device has a `web.reportColor` attribute, that overrides speed-based colouring with a fixed colour for the whole track. Width/opacity come from user preferences.
- **Replay points (`MapRoutePoints`)**: a symbol layer rendering the literal text `▲` as the marker, rotated by `course` and tinted by speed colour. Clicks emit `onClick(positionId, index)` so a slider can scrub to that point.
- **Generic polyline (`MapRouteCoordinates`)**: takes pre-computed `coordinates` and renders a single `LineString` with optional name label.
- **Live trails (`MapLiveRoutes`)**: reads from `state.session.history` — a Redux dictionary `{ [deviceId]: [[lon, lat], ...] }`. Updated inside `sessionActions.updatePositions`, **capped to `web.liveRouteLength` points per device (default 10)**. Behaviour gated by `mapLiveRoutes` user attribute: `none` / `selected` / all devices.
- **Accuracy circles (`MapAccuracy`)**: for each position with `accuracy > 0` (in metres), `turfCircle([lon, lat], accuracy * 0.001)` polygon rendered as a translucent fill in the theme's geometry colour.
### Camera control
Three components for three jobs:
- **`MapCamera`** — one-shot fit. If passed `coordinates` or `positions`, builds an `LngLatBounds` and calls `map.fitBounds(bounds, { padding: min(w,h) * 0.1, duration: 0 })`. Used in `ReplayPage`.
- **`MapDefaultCamera`** — initial framing on `MainPage`. Picks (priority order) the selected device's position, the user's saved `latitude/longitude/zoom` preference, or a `fitBounds` over all visible positions. Runs at most once.
- **`MapSelectedDevice`** — reactive follow. Watches `state.devices.selectedId`, `selectTime`, and the position of the selected device. Fires `map.easeTo(...)` when the user reselects/re-clicks or when `mapFollow` is on and the selected device's coordinates change.
### Live data pipeline
End-to-end flow for "a device moved on the screen":
```
WebSocket /api/socket → SocketController.jsx → dispatch(...)
→ Redux store (session.positions, session.history, devices, events)
→ useSelector in MapPositions / MapLiveRoutes / MapSelectedDevice
→ map.getSource(id).setData(FeatureCollection)
→ MapLibre re-renders (WebGL)
```
- `SocketController.jsx` opens `wss?://<host>/api/socket` once authenticated. On `onmessage`, parses the JSON envelope and dispatches one of: `devicesActions.update`, `sessionActions.updatePositions`, `eventsActions.add`, `sessionActions.updateLogs`.
- Reconnect on `onclose` with a 60 s loop; `online` and `visibilitychange` listeners re-test the socket; periodic `socket.send('{}')` ping.
- `sessionActions.updatePositions` reducer per incoming position: (a) `state.positions[deviceId] = position` overwrites the latest; (b) if `mapLiveRoutes` is on, appends `[longitude, latitude]` to `state.history[deviceId]` capped to `liveRouteLength`; same-coord-as-last is skipped.
- `MainPage` reads `positions`, runs `useFilter` to compute `filteredPositions`, passes them into `<MainMap>` which composes `MapView`, `MapOverlay`, `MapGeofence`, `MapAccuracy`, `MapLiveRoutes`, `MapPositions`, `MapDefaultCamera`, `MapSelectedDevice`, `PoiMap` plus auxiliary controls.
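The per-position reducer behaviour above can be sketched as a standalone function (names and shapes assumed; the real logic lives inside `sessionActions.updatePositions`):

```typescript
// Apply one incoming position: overwrite the latest, append to the trail with
// a same-coordinate skip and a liveRouteLength cap. Hypothetical sketch of the
// reducer described above.
interface Pos { deviceId: number; longitude: number; latitude: number; }
interface LiveState {
  positions: Record<number, Pos>;
  history: Record<number, [number, number][]>;
}

function applyPosition(state: LiveState, p: Pos, liveRouteLength = 10): void {
  state.positions[p.deviceId] = p; // latest position always overwrites
  const trail = (state.history[p.deviceId] ??= []);
  const last = trail[trail.length - 1];
  // Skip if coordinates did not change since the last trail point.
  if (last && last[0] === p.longitude && last[1] === p.latitude) return;
  trail.push([p.longitude, p.latitude]);
  if (trail.length > liveRouteLength) trail.shift(); // cap the trail
}
```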
### Conventions to keep in mind (verbatim from the doc)
1. Every `Map*` component is a side-effect-only React component returning `null` and using `useEffect` to add sources/layers.
2. `useId()` is used to generate unique source/layer ids so the same component can be mounted multiple times safely.
3. All data updates flow through GeoJSON `setData` — no direct DOM marker manipulation.
4. Style swaps reset the world; anything custom must be re-added after `setStyle`. The `mapReady` gate coordinates this.
5. Coordinates everywhere are `[lon, lat]` (MapLibre/GeoJSON convention). `reverseCoordinates` exists specifically to bridge the WKT `lat lon` ordering used by Traccar's geofence storage.
6. No Google Maps SDK is loaded in the browser. Even when Google tiles are used, the path is `tile URL → fetch → blob → MapLibre raster source`. The only Google-specific code is the protocol adapter from `maplibre-google-maps`.
## Notable quotes
> A **single** `maplibregl.Map` instance lives for the entire app lifetime, attached to a **detached `<div>`** held in a module-level variable. The React `<MapView>` component just mounts that detached `<div>` into its own ref (`appendChild`) on mount and removes it on unmount, then calls `map.resize()`. (§2)
> All data updates flow through GeoJSON `setData` — there is no direct DOM marker manipulation. This keeps the WebGL pipeline efficient and lets MapLibre handle clustering/visibility. (§14)
> No Google Maps SDK is loaded in the browser. Even when Google tiles are used, the code path is `tile URL → fetch → blob → MapLibre raster source`. The only Google-specific code is the protocol adapter from `maplibre-google-maps`, which is registered once in `MapView.jsx`. (§14)
## TRM divergences
The TRM SPA inherits the bulk of this architecture and diverges in a small, deliberate set of places. Each divergence is anchored to a specific reason, not "I prefer X."
| Traccar pattern | TRM pattern | Reason |
|---|---|---|
| Geofence storage in WKT (`item.area` is a WKT string); client converts WKT → GeoJSON on every read and GeoJSON → WKT on every save via `wellknown` + `reverseCoordinates`. | **Native PostGIS GeoJSON.** The API serves `ST_AsGeoJSON(geometry)` directly; client receives valid GeoJSON, no conversion. | Traccar predates PostGIS being a first-class option in many environments; we have it deployed and the round-trip is dead weight. |
| `CIRCLE(lat lon, radius)` WKT extension approximated client-side as a 32-step polygon via `@turf/circle`. | **Real `geometry(POLYGON, 4326)` columns**, with circles either stored as polygons up front or computed server-side via `ST_Buffer` if needed. | Same reason as above; native geometry types remove the approximation step. |
| Redux dispatch on every WebSocket message arrival. At 200 racers × 1Hz that's 200 dispatches/sec, each cascading through `useSelector`, `useFilter`, and a freshly-built FeatureCollection in `setData`. | **`requestAnimationFrame`-coalescing buffer at the WS boundary.** WS messages push into a per-device map; an rAF loop fires once per frame (~16ms), reads the buffer, and triggers a single state update. | Traccar's lag at high update rates is almost certainly here, not in the GPU pipeline. ~30 lines of code, eliminates the cascade. |
| Redux for everything (positions, history, devices, events, session, UI prefs). | **Zustand for high-frequency live state** (positions, trails); **TanStack Query for Directus REST**; Redux not used. | Zustand stores can be subscribed to with selectors that don't re-run components unnecessarily; meaningful render-cost reduction at the rates we expect. |
| `liveRouteLength` default 10 — barely a trail. | **Default 200 points per device**, configurable. | Rally operators want a few minutes of trail visible to read what a racer is doing; 10 points at 1Hz is 10 seconds. |
| Icon sprite set covers general fleet (car, bus, truck, plane, ship, animal). | **Racing-specific sprite set** (rally car, quad / ATV, SSV / UTV, motorcycle, runner, hiker) plus the generic `default`. | Race operators identify by category at a glance; "truck" and "plane" sprites would be misleading. |
| `react-map-gl` could be a candidate React wrapper for MapLibre. | **Raw MapLibre via the singleton + side-effect components pattern** (Traccar's choice). No `react-map-gl`. | The declarative wrapper fights the imperative `setData` pattern that's the whole point of the architecture. The Traccar approach is cleaner and gives full control. |
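The rAF-coalescing buffer from the table is small enough to sketch in full. Shapes are assumed; `schedule` is injected so the same class works outside a browser (in the SPA it would be `requestAnimationFrame`):

```typescript
// Coalesce per-device WS updates into at most one state flush per frame.
// Sketch under assumed types; in the browser, pass requestAnimationFrame
// as `schedule`.
interface Pos { deviceId: number; longitude: number; latitude: number; }

class PositionCoalescer {
  private buffer = new Map<number, Pos>();
  private scheduled = false;

  constructor(
    private flush: (batch: Pos[]) => void,
    private schedule: (cb: () => void) => void, // e.g. requestAnimationFrame
  ) {}

  push(p: Pos): void {
    this.buffer.set(p.deviceId, p); // later messages for a device overwrite earlier ones
    if (this.scheduled) return;
    this.scheduled = true;
    this.schedule(() => {
      this.scheduled = false;
      const batch = [...this.buffer.values()];
      this.buffer.clear();
      this.flush(batch); // single store update per frame
    });
  }
}
```

At 200 messages/sec this collapses to at most ~60 flushes/sec, and repeat messages for the same device within one frame collapse to the latest value.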
## Open questions surfaced by this ingest
- **Clustering parameters.** Traccar uses `clusterMaxZoom: 14`, `clusterRadius: 50`. Rally racing density (50–500 vehicles spread across a country-scale stage) may want different values. Defer until we have a real map open with real positions.
- **Basemap switcher scope for v1.** Traccar offers 30+ basemap options; for the dogfood we don't need that. Decide a starter set: probably one satellite (Esri or Google via adapter), one topo (OpenTopoMap), one street (OSM), and the "custom URL" escape hatch.
- **Sprite set finalisation.** Rally / quad / SSV / motorcycle / runner / hiker covers the dogfood disciplines. Need actual SVG assets — borrow from open icon sets or commission.
- **Live trail per-segment colouring.** Traccar colours replay segments by speed. For live trails it just renders a flat colour. Adopting the per-segment-by-speed pattern for live too would be a small upgrade — race operators glance at colour and immediately see who's pushing vs. cruising.
- **`@mapbox/mapbox-gl-draw` license check.** It's open source but worth confirming the licence is compatible with our deployment (Mozilla Public License 2.0, last we checked).
- **Geocoder.** Traccar uses Nominatim via `MaplibreGeocoder`. For Albania-specific use we may want a different gazetteer, or skip search entirely for v1.
## Cross-references
- [[react-spa]] — the entity that consumes this reference architecture.
- [[maps-architecture]] — concept page distilling the patterns from this source plus TRM's refinements.
- [[live-channel-architecture]] — TRM's WS contract on the producer (Processor) side; this source documents the consumer side.
- [[processor]] — produces the live position stream this architecture consumes.
- [[directus]] — issues the JWT used for WS auth.
---
title: Directus Schema — Working Draft
type: synthesis
created: 2026-05-01
updated: 2026-05-01
sources: [rally-albania-regulations-2025]
tags: [directus, schema, business-plane, draft, penalties, geofences]
---
# Directus Schema — Working Draft
> Status: **working agreement**, not final. Captured during the 2026-05-01 schema discussion. Open for additions and revisions as the domain shape clarifies.
The TRM business-plane schema, as worked through so far. Pseudo multi-tenant under `organizations`. Two layers: an org-level catalog of durable resources (users, teams, vehicles, devices) and a per-event participation layer (entries + their crew/devices). All event-scoped state hangs off `entries`; `entries` is the unit of timing.
> **Reference rulebook:** [[rally-albania-regulations-2025]] is the canonical real-world reference. Section numbers cited in this doc as `Rally Albania §X.Y` map to that source. Where the schema needs a federation-specific shape (start-order rules, penalty taxonomy, class catalog), that source is the ground truth.
## Tenancy model
`organizations` is the root. Users, teams, vehicles, and devices are **all m2m with orgs** — a single device can be loaned across orgs, a privateer can race for two clubs, a team can compete under multiple federations. Events, by contrast, are scoped to a **single org** (one FK).
## Org-level catalog
Durable resources that exist independently of any particular event.
### `organizations`
Tenant root. Every business-plane row ultimately traces back here.
### `users`
Directus users, augmented with TRM-specific profile fields. M2M with orgs via `organization_users`.
### `organization_users` (junction)
| Field | Notes |
|---|---|
| `organization_id` | FK |
| `user_id` | FK |
| `role` | enum: `owner` / `admin` / `race-director` / `marshal` / `participant` |
The `role` column drives Directus permission policies — a user's effective permissions in a given org come from this row.
### `teams`
Durable rosters. M2M with orgs via `organization_teams`. A team's "presence" in an event is derived from `entries.team_id` — no separate team↔event join needed.
### `organization_teams` (junction)
`organization_id`, `team_id`.
### `team_members` (junction)
`team_id`, `user_id`. Durable team roster. Who actually races for the team in a given event is the subset that has entries with that `team_id`.
### `vehicles`
M2M with orgs via `organization_vehicles`. No ownership FKs — ownership is a real-world fact that doesn't affect timing or tracking, and which user/team a vehicle "belongs to" is fluid (factory loaner, rented, privateer, swapped between teams). Anyone with org-level entry-creation permission can register any org vehicle into an entry.
### `organization_vehicles` (junction)
`organization_id`, `vehicle_id`.
### `devices`
GPS hardware (Teltonika today, future vendors possible). M2M with orgs via `organization_devices`. The `imei` is the canonical identifier the [[tcp-ingestion]] service uses; the Directus row carries vendor/model/firmware metadata and the org bindings.
### `organization_devices` (junction)
`organization_id`, `device_id`.
## Event-level participation
The per-race shape. Built around `entries` as the unit of timing.
### `events`
| Field | Notes |
|---|---|
| `organization_id` | FK, single-org per event |
| `discipline` | enum: `rally` / `regatta` / `trail-run` / `hike` / … (extensible) |
| `name`, `starts_at`, `ends_at`, … | descriptive |
`discipline` drives validation: which crew roles are valid, whether vehicles are required, which device mount points the UI offers.
### `classes`
Per-event class taxonomy. `event_id` FK, plus `name`, `code` (e.g. T1, T3, SSV), `sort_order`. Each event defines its own class set, so a rally's T1/T2/T3 doesn't pollute a regatta's classes.
### `entries`
The unit of timing — what gets a bib, gets timed, gets a result.
| Field | Notes |
|---|---|
| `event_id` | FK |
| `vehicle_id?` | nullable — null for foot/hike events |
| `team_id?` | nullable — null for lone racers |
| `class_id` | FK to `classes` |
| `bib` | per-event identifier shown to spectators |
| `status` | enum: `registered` / `started` / `finished` / `dnf` / `dns` / `dsq` / `withdrawn` |
Status semantics:
- `registered` — entered, not yet started.
- `started` — currently competing. Live map only renders entries with this status.
- `finished` — completed within the rules. Eligible for ranked results.
- `dns` (Did Not Start) — registered but never crossed the start line.
- `dnf` (Did Not Finish) — started but failed to complete (mechanical, retired, timeout). Shows on results board but unranked.
- `dsq` (Disqualified) — penalized off the result by officials.
- `withdrawn` — pulled themselves out before the event started; doesn't count toward field size.
A vehicle can race at most one entry per event (matches reality — same vehicle can't run two stages simultaneously). For foot races, `vehicle_id` is null and the entry is identified by its single crew member.
### `entry_crew` (junction)
| Field | Notes |
|---|---|
| `entry_id` | FK |
| `user_id` | FK |
| `role` | enum: `pilot` / `co-pilot` / `navigator` / `mechanic` / `rider` / `runner` / `hiker` (extensible per discipline) |
One row for solo rides, four+ for big-truck rally crews. No separate `crews` collection — `teams` already provides durable group identity, and per-entry crew is the subset that showed up.
#### A user has two distinct role surfaces
The schema carries two `role` columns that look similar but mean different things. Easy to conflate; worth being explicit.
| Surface | Where | Scope | What it controls |
|---|---|---|---|
| **Org role** | `organization_users.role` | Per-tenant, durable | What the user can see and do inside an org. Drives Directus policies (race-director, marshal, participant, …). |
| **Racing role** | `entry_crew.role` | Per-entry, per-event | What the user does on the track in this specific entry (pilot, co-pilot, navigator, mechanic, …). |
The two role enums don't overlap. Same user can be `race-director` in Org A (admin power), `participant` in Org B, and show up as `pilot` in three different entries across both orgs — three independent dimensions.
**Crew → org chain** is indirect:
```
entry_crew.entry_id → entries.id
entries.event_id → events.id
events.organization_id → organizations.id
```
The crew row itself doesn't reference org. The schema does **not** enforce that an `entry_crew.user_id` is also in `organization_users` for the entry's host org — guest racers happen. If a federation requires strict org membership for participation, that's a per-org permission rule on `entry_crew` insert, not a schema FK.
### `entry_devices` (junction)
| Field | Notes |
|---|---|
| `entry_id` | FK |
| `device_id` | FK |
| `assigned_user_id?` | nullable — null = vehicle-mounted (hardwired/backup), set = body-worn on that crew member |
A rally Toyota with three devices: two rows with null `assigned_user_id` (hardwired + backup), one row with `assigned_user_id` set to the pilot (panic button). A solo runner: one or more rows all with `assigned_user_id` set to the runner.
## Course definition
The spatial and procedural definition of an event. Three layers: stages (ordered containers), segments (typed sub-units of a stage), and geographical features (geofences, waypoints, SLZs) attached to segments.
### `stages`
| Field | Notes |
|---|---|
| `event_id` | FK |
| `name` | "Stage 1", "Day 2 Loop", etc. |
| `sort_order` | integer |
| `role` | enum: `prologue` / `regular` / `epilogue`. Default `regular`. Drives default seeding strategy and standings handling. |
| `starts_at` | base time used to compute per-entry start offsets |
| `start_interval_seconds` | gap between consecutive starts. Set per stage (Rally Albania §5.10 — decided the day before each stage). |
| `start_order_strategy` | enum: `manual` / `previous_stage_result` / `previous_stage_clean_result` / `inverse_top_n_then_natural` / `inverse_of_overall`. See "Start order strategies" below. |
| `start_order_strategy_params?` | JSON. Strategy-specific (e.g. `{ "n": 20, "source_stage_id": <prologue_stage_id> }` for `inverse_top_n_then_natural`). |
| `start_order_input_stage_id?` | FK to another `stages` row when a strategy reads from a specific previous stage (prologue → stage 1, or any non-immediate predecessor). Null = use immediately preceding stage by `sort_order`. |
A rally can have many stages; a trail run typically has one. The stage is just an ordered container — all the actual rules and geometry hang off its segments.
#### Start order strategies
Different federations seed each stage differently. The strategy is **declarative on the stage**; the [[processor]] (or a Directus operation) materializes `entry_segment_starts` rows when a stage is opened. Real-world cases derived from [[rally-albania-regulations-2025]] §5.5–§5.10.
| Strategy | Behavior | Real-world example |
|---|---|---|
| `manual` | Org sets each entry's `start_position` directly. No computation. | Tirana 24h every leg; Rally Albania prologue. |
| `previous_stage_result` | Order by ascending SS time of the input stage (clean time; penalties excluded). | Rally Albania Stage 1 cars/SSV (input = prologue). |
| `previous_stage_clean_result` | Same as above, explicitly using clean SS time only. Penalties never affect seeding. | Rally Albania stages 2 → last (input = previous stage). |
| `inverse_top_n_then_natural` | Top *n* of input stage in **inverse** order, then *n+1*…last in natural order. | Rally Albania Stage 1 bikes/quads (params: `n=20`, source=prologue). |
| `inverse_of_overall` | Order by descending overall standings after the previous stage. | Rally Albania epilogue. |
Seeding input is always **clean SS time** (`stage_results.clean_time`) — penalties roll into overall total, never the next stage's grid (Rally Albania §5.8 explicit on this).
Strategies seed **per category** independently. Bikes/quads share a grid; cars share another; SSVs another (Rally Albania §2.8 + §5.10). The materialization step iterates each category's entries and assigns positions within that category.
Late-arrival reseeding stays operator-driven (Rally Albania §9.8, Tirana 24h §8.7) — Race Marshals adjust individual `entry_segment_starts.target_at` rows on the day, not via strategy.
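For illustration, the `inverse_top_n_then_natural` strategy could be materialized like this (a hypothetical helper; the real materialization runs in the processor, per category, against `stage_results`):

```typescript
// Seed a grid: top n of the input-stage result in inverse order, then the
// rest in natural order. Illustrative sketch of inverse_top_n_then_natural;
// entry ids stand in for full result rows.
function inverseTopNThenNatural(resultOrder: number[], n: number): number[] {
  const top = resultOrder.slice(0, n).reverse(); // n-th fastest starts first
  const rest = resultOrder.slice(n);             // n+1 … last keep natural order
  return [...top, ...rest];
}
```

With `n = 3` over a result order `[1, 2, 3, 4, 5]`, the grid becomes `[3, 2, 1, 4, 5]`.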
### `segments`
The atomic unit of rules within a stage. Each segment has a type that drives processor and validation behavior.
| Field | Notes |
|---|---|
| `stage_id` | FK |
| `sort_order` | integer |
| `type` | enum: `liaison` / `special-stage` / `parc-ferme` |
| `entry_geofence_id` | FK to `geofences` — where you arrive / get timed in |
| `exit_geofence_id?` | nullable — only `special-stage` segments have one (timed exit). Liaisons end implicitly when the next SS-start is hit; parc fermé has no exit. |
| `target_duration_seconds?` | nominal duration used to compute target arrival times for liaisons |
Concrete shape of a rally stage (per Rally Albania pattern):
```
[Stage Start] → Liaison → [SS1 Start] → SS1 → [SS1 Finish] →
Liaison → [SS2 Start] → SS2 → [SS2 Finish] →
Liaison → [Parc fermé]
```
The `[…]` markers are geofences shared between segments — the SS1-Start geofence is both the exit-trigger of the preceding liaison and the entry-trigger of SS1.
### `geofences`
Reusable spatial assets. Polygons stored as PostGIS geometry.
| Field | Notes |
|---|---|
| `event_id` | FK |
| `name` | display name |
| `kind` | enum: `stage-start` / `ss-start` / `ss-finish` / `parc-ferme` / `manual-checkpoint` |
| `geometry` | PostGIS polygon |
| `manual_verification` | bool — true = marshal-confirmed (CP), false = pure GPS detection |
| `retroactive` | bool, default `false` — whether geometry edits trigger recompute of past crossings |
"Checkpoints" are not a separate collection; they are geofences with `manual_verification = true`. In practice these usually coincide with SS-start / SS-finish geofences (matches rally conventions).
### `waypoints`
Auto-detected points along an SS track. Missing one = penalty.
| Field | Notes |
|---|---|
| `segment_id` | FK — must reference a `special-stage` segment |
| `location` | PostGIS point |
| `tolerance_meters` | radius for "passed within" detection |
| `sort_order` | integer |
### `speed_limit_zones`
Distinct from `geofences` because they carry SLZ-specific evaluation parameters. Polygons where a speed cap applies.
| Field | Notes |
|---|---|
| `segment_id` | FK |
| `geometry` | PostGIS polygon |
| `max_speed_kmh` | the cap |
| `evaluation_window_meters?` | null = whole-polygon peak; `2000` = re-evaluate every 2km of transit (per the Tirana 24h rulebook). |
| `retroactive` | bool, default `false` — whether geometry edits trigger recompute of past traversals |
## Penalty system
Penalty rules are stored as data, evaluated in code. The collection holds the *numbers* (per-bracket multipliers, per-miss penalties); the [[processor]] holds the *math* (one evaluator function per penalty type, in a registry).
### `penalty_formulas`
The collection. Single shape covering bracketed (SLZ) and flat (CP, WP, late-start) rules; bracket-specific fields are nullable.
| Field | Notes |
|---|---|
| `belongs_to_type` | enum: `event` / `stage` / `speed_limit_zone` / `geofence` — the scope this rule attaches to |
| `belongs_to_id` | FK to that scope |
| `type` | enum: `speed_limit_offence` / `waypoint_missing` / `checkpoint_missing` / `early_start` / `late_start` |
| `input` | descriptor of the variable being measured (e.g. `peak_overspeed_kmh`, `missed_count`, `seconds_late`) |
| `offence_min?` | bracket lower bound (km/h, seconds, etc.). Null for flat rules. |
| `offence_max?` | bracket upper bound. Null = open-ended. |
| `operator` | enum: `multiplication` / `addition` |
| `penalty` | the numeric value (sec/kmh, sec per miss, etc.) |
| `retroactive` | bool, default `true` — whether formula edits trigger recompute of past penalties |
| `enabled` | bool, default `true` |
Resolution at evaluation time: most-specific scope wins. The [[processor]] queries "what's the SLZ rule for this zone?" first against `speed_limit_zone`, falls back to `stage`, falls back to `event`.
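The most-specific-wins resolution reduces to a fallback chain over an in-memory index. A sketch with an assumed index shape (the doc only pins the `(belongs_to_type, belongs_to_id, type)` key, not the structure):

```typescript
// Resolve penalty rules for a violation: SLZ scope first, then stage, then
// event. Sketch of the fallback described above; index shape is an assumption.
type ScopeType = "speed_limit_zone" | "geofence" | "stage" | "event";
interface Rule { penalty: number; }

type RuleIndex = Map<string, Rule[]>; // key: `${scopeType}:${scopeId}:${type}`

function resolveRules(
  index: RuleIndex,
  type: string,
  scopes: [ScopeType, string][], // ordered most-specific → least-specific
): Rule[] {
  for (const [scopeType, scopeId] of scopes) {
    const rules = index.get(`${scopeType}:${scopeId}:${type}`);
    if (rules?.length) return rules; // most-specific scope wins
  }
  return [];
}
```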
### Bracketed rules: progressive slice-by-slice
For `speed_limit_offence`, multiple rows with the same `(belongs_to_type, belongs_to_id, type)` form a bracket table. Each bracket contributes only to the portion of the input within its range — same math as progressive income tax. For peak overspeed `P`:
```
penalty = 0
for each row in brackets ordered by offence_min:
if P < row.offence_min: continue
upper = min(P, row.offence_max ?? P)
slice = upper - row.offence_min + 1
penalty += slice * row.penalty
```
Worked example (Tirana 24h table, peak `P = 58` km/h overspeed):
| offence_min | offence_max | sec/kmh | slice | contribution |
|---|---|---|---|---|
| 1 | 10 | 5 | 10 | 50 |
| 11 | 20 | 10 | 10 | 100 |
| 21 | 30 | 30 | 10 | 300 |
| 31 | 40 | 90 | 10 | 900 |
| 41 | null | 240 | 18 | 4320 |
| | | | **Total** | **5670** |
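The bracket walk in runnable form, as a sketch matching the pseudocode above (row shape mirrors the `penalty_formulas` bracket fields; `walkBrackets` is the name the evaluator registry further down assumes):

```typescript
// Progressive bracket evaluation: each bracket penalizes only the slice of
// the input that falls inside its range. Runnable version of the pseudocode
// above; not the processor's actual source.
interface BracketRow { offence_min: number; offence_max: number | null; penalty: number; }

function walkBrackets(input: number, rows: BracketRow[]): number {
  let total = 0;
  for (const row of [...rows].sort((a, b) => a.offence_min - b.offence_min)) {
    if (input < row.offence_min) continue;
    const upper = row.offence_max === null ? input : Math.min(input, row.offence_max);
    const slice = upper - row.offence_min + 1;
    total += slice * row.penalty;
  }
  return total;
}
```

Fed the Tirana 24h table above with a peak of 58 km/h overspeed, it reproduces the 5670 s total.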
### Flat rules: one row, one number
For `waypoint_missing` and `checkpoint_missing`, a single row carries the per-miss value:
```jsonc
// missed waypoint: 60 min × count
{ "belongs_to_type": "event", "belongs_to_id": "<event-id>",
"type": "waypoint_missing", "penalty": 3600, "enabled": true }
// missed checkpoint: 120 min × count + worst_valid_time_in_category (added by evaluator)
{ "belongs_to_type": "event", "belongs_to_id": "<event-id>",
"type": "checkpoint_missing", "penalty": 7200, "enabled": true }
```
The "worst valid time in your category" addition for `checkpoint_missing` is **not** stored as a number. It's a runtime aggregate the [[processor]] computes from `entry_results` at stage close, and the evaluator function for that type adds it as part of its semantics. If a federation publishes a variant without that addition, that's a new `type` (`checkpoint_missing_flat`) with its own evaluator — not a flag on the row.
### Code side: the evaluator registry
The [[processor]] holds a registry mapping `type` → evaluator function. The evaluators contain no constants — all numbers come from `penalty_formulas` rows.
```ts
const evaluators = {
'speed_limit_offence': (peak, rules) => walkBrackets(peak, rules),
'waypoint_missing': (count, [rule]) => count * rule.penalty,
'checkpoint_missing': (count, [rule], ctx) => count * rule.penalty + ctx.worstValidTimeInCategory,
'late_start': (secondsLate, rules) => walkBrackets(secondsLate, rules),
// …
};
```
**The split:** numbers live in the database (rows in `penalty_formulas`); math shape lives in code (one function per `type`); runtime aggregates (peak speed, missed count, worst time) are computed by the [[processor]] and passed to the evaluator. Adding a new event with a different bracket table = pure data change, no deploy. Adding a fundamentally new kind of math = small code change (one new evaluator, one registry entry).
### Loading and reload
The [[processor]] queries enabled formulas in scope when an event becomes active or when it starts up:
```sql
SELECT * FROM penalty_formulas
WHERE belongs_to_id IN (
  SELECT $event_id
  UNION ALL SELECT id FROM stages WHERE event_id = $event_id
  UNION ALL SELECT id FROM geofences WHERE event_id = $event_id
  UNION ALL SELECT id FROM speed_limit_zones WHERE event_id = $event_id
)
AND enabled = true;
```
Indexes them in memory by `(belongs_to_type, belongs_to_id, type)` for O(1) lookup at violation time. Reload triggers: Directus webhook on formula edit (preferred), periodic refresh as fallback, manual "reload formulas" admin action.
### Retroactive recompute
`retroactive` lives on `penalty_formulas` (default `true` — math fixes usually apply across the field) and on `geofences` / `speed_limit_zones` (default `false` — physical crossings stand on their own merit). Per-edit override available so an organizer can flip the default for a specific change.
When a row is updated and `retroactive = true`:
1. A Directus Flow / hook enqueues a recompute job (Redis Stream `recompute:requests`).
2. The [[processor]] consumes the job, walks affected `entry_penalties`, recomputes each from snapshotted inputs against the new formula, updates the rows in place, logs `recomputed_at` / `recomputed_reason`.
3. Old values are preserved in Directus revisions for audit.
Two recompute kinds:
- **Formula change** — cheap. `entry_penalties.inputs` is snapshotted, so it's pure arithmetic. Seconds for thousands of rows.
- **Geometry change** — expensive. Crossing decisions themselves change; have to re-detect from raw positions in [[postgres-timescaledb]]. Stage-bounded. Defer to Phase 2.5 of the [[processor]].
## Per-entry timing and results
Written by the [[processor]] as cars drive (Phase 2 work — not part of Phase 1 of [[processor]]).
### `entry_segment_starts`
| Field | Notes |
|---|---|
| `entry_id` | FK |
| `segment_id` | FK |
| `start_position` | integer — seeded position within the entry's category for this stage. Output of the strategy. |
| `target_at` | `stage.starts_at + (start_position - 1) × stage.start_interval_seconds`, plus nominal liaison durations for sub-segment starts. Operator-overridable. |
| `manual_override` | bool, default `false`. Set to `true` when a Race Marshal hand-edits `target_at` (late arrival, etc.) — flags the row as not derivable from the strategy. |
Materialized by the [[processor]] (or a Directus flow) at stage-open time, scoped per category. Until then the rows do not exist.
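The `target_at` derivation is plain arithmetic; a sketch using the field names from the tables above (Date maths in milliseconds):

```typescript
// Compute an entry's target start time from its seeded position:
// stage.starts_at + (start_position - 1) × start_interval_seconds,
// plus nominal liaison durations for sub-segment starts.
function targetStartAt(
  stageStartsAt: Date,
  startPosition: number,
  startIntervalSeconds: number,
  liaisonOffsetSeconds = 0, // nominal liaison durations for sub-segment starts
): Date {
  const offset = (startPosition - 1) * startIntervalSeconds + liaisonOffsetSeconds;
  return new Date(stageStartsAt.getTime() + offset * 1000);
}
```

Position 5 with a 60 s interval starts 4 minutes after the stage's base time.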
### `entry_crossings`
The raw timeline of detected/recorded crossings. Append-only, written by the [[processor]] as positions stream in (or by marshals via Directus for manual checkpoints).
| Field | Notes |
|---|---|
| `entry_id` | FK |
| `geofence_id?` | one of `geofence_id` / `waypoint_id` set per row |
| `waypoint_id?` | |
| `crossed_at` | timestamp |
| `source` | enum: `gps` / `manual` |
### `entry_penalties`
Computed from `entry_crossings` + `penalty_formulas` by the [[processor]].
| Field | Notes |
|---|---|
| `entry_id` | FK |
| `segment_id?` | FK, when applicable |
| `rule_id` | FK to `penalty_formulas` (or to a "rule set" if grouped) |
| `seconds` | computed penalty |
| `reason` | free text + structured payload |
| `formula_snapshot` | jsonb — the rule rows used at evaluation time, so the row is self-explaining if formulas later change |
| `inputs` | jsonb — peak overspeed, missed count, etc. (whatever the evaluator consumed) |
| `evaluated_at` | timestamp |
| `recomputed_at?` | set if the row has been retroactively recalculated |
| `recomputed_reason?` | what triggered the recompute |
The `formula_snapshot` + `inputs` fields make recompute a pure data transformation — no need to re-derive anything from raw GPS.
### `stage_results`
Materialized leaderboard per stage. Recomputed when the stage closes or on protest.
| Field | Notes |
|---|---|
| `entry_id` | FK |
| `stage_id` | FK |
| `raw_time` | seconds — net time from start to finish |
| `penalty_seconds` | sum of `entry_penalties` for this stage |
| `final_time` | `raw_time + penalty_seconds` |
| `position` | rank within class |
## Position flagging (cross-plane operator workflow)
Devices emit faulty data — jumpy GPS, impossible coordinates, unrealistic speeds. Track operators need to exclude such points from calculations after the fact, without deleting them (the raw record stays available for audit).
### Mechanism
The positions hypertable in [[postgres-timescaledb]] carries a `faulty boolean DEFAULT false` column. The hypertable is **exposed as a Directus collection** (read + update) so operators can flip the flag through the admin UI like any other row. Position records are owned by the [[processor]] (write side, telemetry plane), but the `faulty` flag is exclusively a business-plane operator concern.
Permission scope: operators with role `race-director` (or a more specific `track-operator` role added to the `organization_users.role` enum if granularity is needed) get update access to `positions.faulty`, scoped via the same dynamic-filter Policy pattern — only positions whose `device_id` belongs to a device entered in an event of the operator's org.
### Effect on the processor
All [[processor]] read paths against positions filter `WHERE faulty = false`:
- Peak-speed evaluation inside SLZs.
- Geofence/waypoint crossing detection.
- Replay-based recompute.
Flagged positions are still in the hypertable, still in the live broadcast (the live channel doesn't filter — operators flag historically, after the fact). They simply no longer contribute to penalties and results.
### Recompute on flag change
When `faulty` flips on a position whose timestamp is within an already-evaluated window:
1. Directus webhook fires on `positions` update where `faulty` changed.
2. Webhook enqueues a recompute request on `recompute:requests` (Redis Stream).
3. The [[processor]] identifies `entry_penalties` whose evaluation window overlaps the position's timestamp and re-evaluates them.
4. Cost is mid-band: cheaper than full geometry replay (the window is bounded), but the inputs (peak speed, crossing detection) must be re-derived from raw positions — the snapshotted `entry_penalties.inputs` no longer reflect reality once a position they were derived from is excluded.
This is a third recompute kind alongside formula recompute (cheap, snapshot arithmetic) and geometry recompute (expensive, full replay).
## Decisions made
- **Vehicle is the racing unit, not the user.** In rally the vehicle gets the time; crew is sub-relation.
- **Vehicle ownership is not modeled.** Vehicles belong to orgs (m2m); who "owns" the vehicle in the real world doesn't affect timing or tracking, so it's left out of the schema.
- **Class belongs to the entry, not the vehicle.** A vehicle could race different classes in different events; class is a per-event property.
- **Teams are org-level, durable.** A team's per-event roster is derived from `entries`, not stored separately. No `crews` collection.
- **Classes are per-event.** `classes.event_id` FK; entries reference `class_id`.
- **Discipline drives validation.** `events.discipline` decides whether vehicle is required and which crew roles/device mount points are valid.
- **Org-user role lives on the junction.** `organization_users.role` is the per-tenant role driving Directus policies.
- **Permission policies via dynamic filters.** Directus 11 Policies, one per logical role (race-director, marshal, participant, …). Each policy's filter reaches through the row → org → `organization_users` to confirm the current user has that role in *that* org. A user gets all applicable policies attached at once. No org-switcher UI needed.
- **Stage = container, segment = atomic.** Stages just order their segments. Each segment has a type (`liaison` / `special-stage` / `parc-ferme`) that drives processor and validation behavior.
- **Checkpoints are typed geofences, not their own collection.** A "checkpoint" is a geofence with `manual_verification = true` (and usually a `kind` of `ss-start` / `ss-finish`).
- **Penalty rules are data, math is code.** The `penalty_formulas` collection holds all numeric values (bracket multipliers, per-miss penalties). The [[processor]] holds one evaluator function per penalty `type`. Adding new events / new bracket tables = data only.
- **Speed limit penalties are progressive (slice-by-slice).** Each bracket contributes only to the portion of the input within its range — same as income tax. `peak_overspeed × per-bracket-rate` summed across brackets the peak crossed.
- **`retroactive` defaults differ by what's edited.** Formulas default `true` (math fixes apply across the field). Geometry defaults `false` (physical crossings stand on their own). Per-edit override at save time.
- **`entry_penalties` snapshot inputs and rule rows.** Recompute is pure arithmetic from snapshots, not re-derivation from raw GPS — the rare exception being geometry retroactive changes.
- **Positions carry a `faulty` flag.** Operator-controlled, default `false`, set after the fact through Directus when a GPS reading is unrealistic. [[processor]] filters `WHERE faulty = false` on every read; flagging a position triggers a windowed recompute of affected penalties. The hypertable is exposed as a Directus collection for this workflow.
- **Start order is per-stage, declarative, strategy-driven.** Stages carry `start_order_strategy` + params, materialized into `entry_segment_starts` at stage-open time, scoped per category. Penalties never feed the next stage's grid; only clean SS time does. Operator overrides via `manual_override` for late arrivals.
- **Stages have a `role`.** `prologue` / `regular` / `epilogue`. Drives default strategy choice and lets the standings logic exclude the prologue from overall-time totals when a federation requires that. Most rallies = one prologue + N regulars; some end with an epilogue seeded by inverse-of-overall.
- **CP missing vs CP-late-past-closing are distinct event types** with the same penalty formula. Rally Albania §12.4 (missing) and §12.6 (arrived after closing hour) both pay "worst valid + 120 min" but trigger on different processor signals — one from "no crossing detected within stage window," one from "crossing detected but after `time_control_closed_at`." Both surface as `entry_penalties` rows with distinct `type`s sharing a formula row.
- **Tiebreaker is reverse-stage order.** Rally Albania §6.19: same total → best last-stage time wins; if still tied, walk back stage-by-stage. Pure SQL on `stage_results`; no schema impact.
- **Crews are per-entry, not reusable.** No `crews` collection. Each entry builds its `entry_crew` rows fresh; the SPA solves "register the usual Toyota crew" via a "copy crew from previous entry" action that clones rows. Saved crews would go stale fast (people swap roles, drop out); copy-from-previous covers the real case without that risk.
- **Non-bracket federation rules ship as code (Option A), one PR per rule shape.** When a federation publishes math that doesn't fit the bracket / flat-per-miss model (curves, multi-input aggregates, conditional rules), the response is to add a new evaluator function to the registry plus the fields it needs on `penalty_formulas`. Explicitly **not** building an in-database expression engine (Option B). Reason: expression engines bring sandboxing, validation, authoring UX, testing, versioning, and determinism obligations that dwarf the per-rule PR cost. PR-per-rule keeps the math reviewable, deterministic, testable, and versioned through normal git history. Stability over flexibility.
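The slice-by-slice (income-tax-style) bracket math from the decision above can be sketched as follows; the bracket boundaries and rates are invented for illustration, not taken from any federation table:

```typescript
// Progressive speed-limit penalty: each bracket charges only for the
// slice of the peak overspeed that falls inside its range.
interface Bracket { upToKmh: number; secondsPerKmh: number }

function overspeedPenalty(peakOverspeedKmh: number, brackets: Bracket[]): number {
  let seconds = 0;
  let floor = 0; // lower bound of the current bracket
  for (const b of brackets) {
    const slice = Math.min(peakOverspeedKmh, b.upToKmh) - floor;
    if (slice <= 0) break; // peak never reached this bracket
    seconds += slice * b.secondsPerKmh;
    floor = b.upToKmh;
  }
  return seconds;
}

// Illustrative table: 0-10 km/h over -> 1 s per km/h,
// 10-20 -> 3 s per km/h, beyond 20 -> 10 s per km/h.
const table: Bracket[] = [
  { upToKmh: 10, secondsPerKmh: 1 },
  { upToKmh: 20, secondsPerKmh: 3 },
  { upToKmh: Infinity, secondsPerKmh: 10 },
];
```

A 15 km/h peak thus pays 10 × 1 + 5 × 3 = 25 s: the first bracket in full, the second only for the slice the peak actually crossed.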
## Open questions
- **Geometry retroactivity engine** — the position-replay path for geofence/SLZ geometry edits is non-trivial. Defer to Phase 2.5 of [[processor]] once Phase 2 is shipping.
## Relationship to existing wiki
This schema lives in [[directus]] (the business plane). The [[processor]] writes telemetry data (positions hypertable) keyed by `device_id`; the join from telemetry back to "who was racing" is `device_id` → `entry_devices` → `entry` → everything else. See [[plane-separation]] for why the telemetry plane stays ignorant of this schema.
---
title: Processor WebSocket contract
type: synthesis
created: 2026-05-02
updated: 2026-05-02
sources: [gps-tracking-architecture, traccar-maps-architecture]
tags: [websocket, protocol, contract, telemetry-plane, decision]
---
# Processor WebSocket contract
The wire-level specification of the WebSocket endpoint that fans out live position updates from [[processor]] (or its eventual replacement gateway — see Implementation status) to [[react-spa]] clients. Both sides build against this contract; changes require a coordinated update on both sides.
This page is the protocol spec. The architectural rationale lives in [[live-channel-architecture]]; the consumer-side rendering pattern in [[maps-architecture]]; the inheritance from a working production reference in [[traccar-maps-architecture]].
## Implementation status
**Planned as `processor` Phase 1.5 — Live broadcast.** Six tasks in `trm/processor/.planning/phase-1-5-live-broadcast/`: WS server scaffold + heartbeat, cookie auth handshake, subscription registry & per-event authorization, broadcast consumer group & fan-out, snapshot-on-subscribe, integration test. Status ⬜ Not started; sequenced as 1.5.1 → 1.5.2 → 1.5.3 → (1.5.4 ‖ 1.5.5) → 1.5.6.
The endpoint is hosted *inside* the Processor process (as [[processor]] and [[live-channel-architecture]] specify). Lifting it into a separate `live-gateway` service is the documented escape hatch in [[live-channel-architecture]] §"Scale considerations" if sustained > 10k WS messages/sec demands it — not the starting point.
This contract is implementation-agnostic in the sense that the wire format wouldn't change if we ever did lift the endpoint out — only the host process would. SPA work can build against the contract independently of the Processor task sequence as long as it doesn't ship to stage before Phase 1.5 lands.
## Endpoint
```
wss://<one-public-origin>/processor/ws
```
Served behind the same reverse proxy that fronts [[directus]] and the [[react-spa]] static bundle. **Single origin is non-negotiable** — same-origin is what allows the auth cookie to flow with the WebSocket upgrade request (see Auth handshake below).
The path `/processor/ws` is illustrative; final path determined by the proxy routing rules. Whatever it is, the SPA reaches it as a relative URL, never a cross-origin URL.
## Transport
- **Protocol:** WebSocket (RFC 6455) over TLS at the edge. Internal hop from the proxy to the producer is plain WS on the `trm_default` Compose network.
- **Subprotocol:** none required. Future versions may add a `Sec-WebSocket-Protocol` of `trm.live.v1` if we need to negotiate versions; for now the path is the version.
- **Frame format:** text frames, JSON-encoded. No binary frames. (If we ever need to ship raw position bytes for a high-frequency optimisation, that's a v2 concern.)
- **Heartbeat:** the producer sends a ping every 30 s; the consumer responds. The consumer enforces its own liveness with a `setInterval` check: if more than 60 s have passed since the last message, it reconnects.
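The consumer-side liveness check might look like this sketch (the 10 s tick interval is an assumption; the contract only fixes the 60 s threshold):

```typescript
// Pure staleness check, separated from the timer so it can run on any
// schedule: stale means more than 60 s since the last message of any kind.
function isStale(lastMessageAtMs: number, nowMs: number): boolean {
  return nowMs - lastMessageAtMs > 60_000;
}

// Wire the check to a timer; the client updates lastMessageAt on every
// received frame and calls reconnect() when the connection goes quiet.
function watchLiveness(lastMessageAt: () => number, reconnect: () => void) {
  return setInterval(() => {
    if (isStale(lastMessageAt(), Date.now())) reconnect();
  }, 10_000);
}
```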
## Auth handshake
Cookie-based, same-origin, validated against [[directus]] once at connection time. The SPA uses the Directus SDK in session mode (see [[react-spa]] §"Auth pattern"); the producer is cookie-name-agnostic and just forwards whatever cookie header the upgrade carries.
```
1. Browser opens WebSocket to wss://<origin>/processor/ws.
   Same-origin → browser automatically attaches the httpOnly session cookie
   issued by Directus's /auth/login (session mode).
2. Producer reads the entire Cookie header from the upgrade request.
   GET /users/me to Directus, forwarding the header verbatim.
   200 → user identity (id, role, etc.) is bound to the connection.
   401/403 → close the WebSocket with code 4401 (unauthorized).
3. Connection is now authenticated. The producer holds (connectionId → user)
   in memory. No further per-message auth.
```
Implementation notes:
- **Cookie validation cache.** `/users/me` round-trip per connection is fine at pilot scale (≤500 viewers). At higher scale, cache the validation result for the connection's lifetime; on logout / session expiry the SPA reconnects, which re-validates.
- **No JWT in URL.** Don't pass tokens in query strings — they end up in proxy logs. Cookie is the only credential.
- **Why cookie not Authorization header.** Browsers don't let you set Authorization on a WebSocket upgrade. Cookies flow automatically. Same-origin is what makes this work.
- **Cookie-name-agnostic.** The producer never parses individual cookies; it forwards the whole header to `/users/me` and lets Directus identify the session. This keeps the producer working unchanged if Directus's cookie name or auth-mode default ever changes.
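A producer-side sketch of that handshake, assuming a Node-style global `fetch`; `authenticateUpgrade`, `closeCodeFor`, and the Directus base URL are hypothetical names, but the header-forwarded-verbatim behaviour is the contract:

```typescript
// Validate a WS upgrade by forwarding the raw Cookie header to Directus.
// The producer never parses individual cookies.
async function authenticateUpgrade(
  cookieHeader: string | undefined,
  directusUrl: string,
): Promise<{ id: string } | null> {
  if (!cookieHeader) return null;
  const res = await fetch(`${directusUrl}/users/me`, {
    headers: { cookie: cookieHeader }, // forwarded verbatim
  });
  if (!res.ok) return null;
  const body = await res.json();
  return { id: body.data.id }; // Directus wraps payloads in `data`
}

// Map the Directus response status to the close action from the handshake:
// 401/403 -> close the socket with 4401, anything else is not an auth failure.
function closeCodeFor(status: number): number | null {
  return status === 401 || status === 403 ? 4401 : null;
}
```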
## Subscription model
After authentication, the SPA subscribes to event-scoped topics. One connection can hold multiple subscriptions; per-event authorization is checked once at subscribe time.
### Topic format
```
event:<eventId>
```
`<eventId>` is the UUID of an `events` row. Authorization: the user must have a record in `organization_users` for the event's organization (any role). Phase 4 of [[directus]] (permissions) will tighten this; for now membership is enough.
Future topic shapes (not in v1):
- `device:<deviceId>` — single-device follow.
- `entry:<entryId>` — follow a specific competitor across stages.
- `org:<orgId>` — broad org-wide watch (admin-only).
The protocol is forward-compatible: any string-typed topic is valid; producer rejects unknown shapes with `error/unknown-topic`.
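A v1 topic parser consistent with that forward-compatibility rule might look like this sketch (the canonical 8-4-4-4-12 UUID regex is an assumption):

```typescript
// v1 recognises only `event:<uuid>`; any other string maps to
// error/unknown-topic rather than being rejected at the parse level.
const EVENT_TOPIC =
  /^event:([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})$/;

function parseTopic(
  topic: string,
): { kind: "event"; eventId: string } | { kind: "unknown" } {
  const m = EVENT_TOPIC.exec(topic);
  return m ? { kind: "event", eventId: m[1] } : { kind: "unknown" };
}
```

Adding `device:` or `entry:` topics later is a new branch here, not a wire-format change.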
### Subscribe
```json
// Client → Server
{
  "type": "subscribe",
  "topic": "event:ada60b3d-b29f-4017-b702-cd6b700f9f6c",
  "id": "client-correlation-id-1"
}
```
`id` is optional; if present, the server echoes it on the response so the client can correlate.
### Server response — subscribed
```json
// Server → Client
{
  "type": "subscribed",
  "topic": "event:ada60b3d-b29f-4017-b702-cd6b700f9f6c",
  "id": "client-correlation-id-1",
  "snapshot": [
    { "deviceId": "cbed320e...", "lat": 41.327, "lon": 19.819, "ts": 1714654800000, "speed": 42.3, "course": 187, "accuracy": 5.0, "attributes": {} },
    { "deviceId": "f6114c7e...", "lat": 41.328, "lon": 19.820, "ts": 1714654799000, "speed": 38.1, "course": 184, "accuracy": 4.5, "attributes": {} }
  ]
}
```
The snapshot is the **latest known position per device** registered to the event (via `entry_devices` → `entries` → `events`). Without it, the SPA opens to a blank map until devices report — which feels broken.
### Server response — error
```json
// Server → Client
{
  "type": "error",
  "topic": "event:ada60b3d-b29f-4017-b702-cd6b700f9f6c",
  "id": "client-correlation-id-1",
  "code": "forbidden",
  "message": "User does not belong to the event's organization."
}
```
Error codes (initial set; extensible):
| Code | Meaning |
|---|---|
| `forbidden` | User authenticated but not authorized for this topic. |
| `not-found` | Topic refers to a non-existent entity (event id has no row). |
| `unknown-topic` | Topic format not recognised. |
| `rate-limited` | Subscribe rate exceeded (Phase 3 hardening; reserved). |
### Streaming updates
After `subscribed`, the server pushes one message per position-of-interest:
```json
// Server → Client
{
  "type": "position",
  "topic": "event:ada60b3d-b29f-4017-b702-cd6b700f9f6c",
  "deviceId": "cbed320e-1e94-488a-93c3-41060fcb06bc",
  "lat": 41.32791,
  "lon": 19.81947,
  "ts": 1714654801000,
  "speed": 42.5,
  "course": 188,
  "accuracy": 5.0,
  "attributes": {}
}
```
Field semantics:
| Field | Type | Required | Notes |
|---|---|---|---|
| `type` | `"position"` | yes | Discriminator. |
| `topic` | string | yes | Echoes the subscription. Allows multiplexing on one connection. |
| `deviceId` | uuid | yes | The `devices.id` (not the IMEI). SPA looks up device → entry → vehicle/crew via TanStack Query against [[directus]]. |
| `lat` / `lon` | number (degrees, WGS84) | yes | GPS coordinates. **Coordinate order in JSON is `lat`/`lon`** (not `[lon,lat]` GeoJSON ordering — that conversion happens in the SPA). |
| `ts` | number (epoch milliseconds, UTC) | yes | Authoritative timestamp from the device's GPS fix. **Always use this, never `Date.now()` on the client.** |
| `speed` | number (km/h) | optional | Omitted if device reports speed=0 with invalid GPS fix (per [[teltonika]] convention). |
| `course` | number (degrees, 0=N, clockwise) | optional | Heading. Omitted if unknown. |
| `accuracy` | number (metres) | optional | Position accuracy radius for the [[react-spa]]'s accuracy-circle layer. |
| `attributes` | object | optional, default `{}` | The decoded IO bag. Phase 1 ships the raw IO map; Phase 2 of [[processor]] adds named attributes per [[io-element-bag]]. SPA must tolerate empty / unknown shapes. |
The producer should **omit fields rather than send `null`** for absent values. Reduces JSON size and removes ambiguity (null = "we don't know" vs missing = "device didn't report").
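One way a TypeScript producer gets the omit-not-null shape almost for free: `JSON.stringify` drops object keys whose value is `undefined`, so modelling absent fields as optional properties (never `null`) yields the right wire format. The interface below is a sketch of the field table above, not a published type:

```typescript
// Absent values are `undefined` (optional properties), never `null`;
// JSON.stringify omits undefined-valued keys from the output entirely.
interface PositionMsg {
  type: "position";
  topic: string;
  deviceId: string;
  lat: number;
  lon: number;
  ts: number;
  speed?: number;  // omitted when the fix is invalid
  course?: number; // omitted when heading is unknown
}

const msg: PositionMsg = {
  type: "position",
  topic: "event:ada60b3d-b29f-4017-b702-cd6b700f9f6c",
  deviceId: "cbed320e-1e94-488a-93c3-41060fcb06bc",
  lat: 41.327,
  lon: 19.819,
  ts: 1714654800000,
  speed: undefined, // invalid GPS fix -> field dropped from the wire
};
```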
### Unsubscribe
```json
// Client → Server
{
  "type": "unsubscribe",
  "topic": "event:ada60b3d-b29f-4017-b702-cd6b700f9f6c",
  "id": "client-correlation-id-2"
}
```
Server response:
```json
// Server → Client
{
  "type": "unsubscribed",
  "topic": "event:ada60b3d-b29f-4017-b702-cd6b700f9f6c",
  "id": "client-correlation-id-2"
}
```
The connection stays open with whatever other subscriptions are active. Closing the WebSocket is the cleanup-everything path.
## Reconnect semantics
The client reconnects on close (other than code 4401). Backoff: 1s, 2s, 4s, 8s, 16s, then 30s steady. Cap at 30s.
On reconnect, the client **must re-subscribe to all previously-active topics**. The server treats reconnect as a fresh connection; subscription state lives in memory only.
The server should accept reconnects from the same user without rate-limiting at pilot scale. Phase 3 may add a per-user concurrent-connection cap.
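The backoff schedule above reduces to a one-line sketch on the client:

```typescript
// 1s, 2s, 4s, 8s, 16s, then a 30s steady state (attempt is 0-based).
// After the socket reopens, the client replays all previously-active
// subscribe messages, since server-side subscription state is gone.
function backoffMs(attempt: number): number {
  return Math.min(1000 * 2 ** attempt, 30_000);
}
```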
## Multi-instance behaviour
When [[processor]] (or the gateway service) runs more than one replica:
- Each instance reads the [[redis-streams]] telemetry stream on **two consumer groups**:
- `processor` — the durable-write group (work-split: only one instance handles each record for the DB write).
- `live-broadcast-{instance_id}` — a per-instance fan-out group (every instance reads every record for fan-out).
- Connected clients are bound to one instance via the load balancer; that instance fans out to its own clients only. No cross-instance broadcasting needed.
- The reconnect is what handles instance failure — client reconnects, gets re-load-balanced to a healthy instance, re-subscribes.
This design is documented in [[live-channel-architecture]] §"Multi-instance Processor".
## Connection limits and back-pressure
Pilot-scale targets (subject to revision after first dogfood):
| Metric | Target |
|---|---|
| Concurrent connections per instance | 100 |
| Subscriptions per connection | 4 (one event + room for future per-device follow) |
| Position messages per second per connection | ≤ 500 (race start with 500 devices reporting at 1Hz) |
| End-to-end latency (Redis stream → client) | p95 < 500ms |
| Reconnect storm tolerance | 200 reconnects/sec for 5 seconds (race start surge) |
If a slow consumer can't drain its queue, the server **drops oldest position messages** for that connection (per-device; latest position is always preserved). Position data is always-fresh — backlog isn't valuable. Only `subscribed`/`unsubscribed`/`error` control messages are guaranteed delivery.
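The drop-oldest rule can be sketched as a per-device latest-position map rather than a FIFO queue (class and method names are illustrative; control messages would bypass this buffer entirely):

```typescript
// Per-connection buffer for a slow consumer: overwriting by deviceId is
// what "drop oldest, keep latest" means for position data.
class PositionBuffer {
  private latest = new Map<string, { ts: number; payload: string }>();

  push(deviceId: string, ts: number, payload: string): void {
    const cur = this.latest.get(deviceId);
    if (!cur || ts >= cur.ts) this.latest.set(deviceId, { ts, payload });
  }

  // Drain everything currently buffered, e.g. when the socket is writable.
  drain(): string[] {
    const out = [...this.latest.values()].map((v) => v.payload);
    this.latest.clear();
    return out;
  }
}
```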
## Versioning
This is `v1`. Breaking changes (renaming fields, changing semantics) require:
1. New endpoint path (`/processor/ws/v2`).
2. Update this synthesis page to document both versions.
3. Deprecation window: v1 stays online for ≥ one full event cycle after v2 lands.
Non-breaking additions (new optional fields, new message types, new error codes) ship in v1 without ceremony — both sides should ignore unknown fields and unknown `type` values.
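On the consumer side, that tolerance might be sketched as a handler that ignores rather than rejects unknown `type` values (hypothetical function, not SPA code):

```typescript
// Forward-compatible dispatch: unknown message types return false and
// are silently skipped, so new v1 message kinds never break old clients.
function handleMessage(raw: string, onPosition: (msg: unknown) => void): boolean {
  const msg = JSON.parse(raw);
  switch (msg.type) {
    case "position":
      onPosition(msg);
      return true;
    case "subscribed":
    case "unsubscribed":
    case "error":
      return true; // handled elsewhere in a real client
    default:
      return false; // unknown type: ignore, don't throw
  }
}
```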
## Open questions
- **Session expiry while connected.** Directus session cookies have a finite lifetime. The WebSocket connection's already-validated identity is unaffected for as long as the connection stays open — the producer authorised once at upgrade and doesn't re-check. If the session expires server-side, the SPA's next REST call (or its periodic `/users/me` ping, if added) will fail with 401, the SPA will redirect to login, and on re-login the SPA reconnects the WebSocket — which re-validates. Pilot answer: producer never re-validates mid-connection. Phase 3 hardening can revisit if real-world session durations make this feel wrong.
- **Device-to-event resolution snapshot freshness.** The snapshot includes "every device registered to the event"; that registration set may change while a client is subscribed. Initial answer: subscription holds the registration set captured at subscribe time; new entries added mid-event don't appear until the client reconnects. Acceptable for pilot.
- **Faulty-flag visibility.** When an operator flips a position's `faulty=true` flag in [[directus]], should the live channel emit a correction? Current answer: no — faulty flagging is post-hoc operator review, not a live concern. Live map shows whatever was streamed at the time. The recompute pipeline ([[processor]] faulty position handling) corrects derived data, not the live history.
- **Replay-mode endpoint.** Out of v1 scope. A future `event:<id>:replay` topic could stream historical positions at a chosen speed. Defer.
## Cross-references
- [[live-channel-architecture]] — architectural rationale and dual-channel design.
- [[processor]] — the entity nominally hosting this endpoint (subject to the Implementation status note above).
- [[react-spa]] — the consumer.
- [[maps-architecture]] — consumer-side throughput discipline (rAF coalescer) that this contract is consumed through.
- [[traccar-maps-architecture]] — the working production reference whose WS contract shape this draws from (with refinements for our needs).
- [[directus]] — auth source (cookie validator) and the data source for event/device/org metadata the SPA looks up alongside the live stream.