| title | type | created | updated | sources | tags |
|---|---|---|---|---|---|
| Processor WebSocket contract | synthesis | 2026-05-02 | 2026-05-02 | | |
Processor WebSocket contract
The wire-level specification of the WebSocket endpoint that fans live position updates from processor (or its eventual replacement gateway — see Implementation status) to react-spa clients. Both sides build against this contract; changes require a coordinated update on both sides.
This page is the protocol spec. The architectural rationale lives in live-channel-architecture; the consumer-side rendering pattern in maps-architecture; the inheritance from a working production reference in traccar-maps-architecture.
Implementation status
Planned as processor Phase 1.5 — Live broadcast. Six tasks in trm/processor/.planning/phase-1-5-live-broadcast/: WS server scaffold + heartbeat, cookie auth handshake, subscription registry & per-event authorization, broadcast consumer group & fan-out, snapshot-on-subscribe, integration test. Status ⬜ Not started; sequenced as 1.5.1 → 1.5.2 → 1.5.3 → (1.5.4 ‖ 1.5.5) → 1.5.6.
The endpoint is hosted inside the Processor process (as processor and live-channel-architecture specify). Lifting it into a separate live-gateway service is the documented escape hatch in live-channel-architecture §"Scale considerations" if sustained > 10k WS messages/sec demands it — not the starting point.
This contract is implementation-agnostic in the sense that the wire format wouldn't change if we ever did lift the endpoint out — only the host process would. SPA work can build against the contract independently of the Processor task sequence as long as it doesn't ship to stage before Phase 1.5 lands.
Endpoint
wss://<one-public-origin>/processor/ws
Served behind the same reverse proxy that fronts directus and the react-spa static bundle. Single origin is non-negotiable — same-origin is what allows the auth cookie to flow with the WebSocket upgrade request (see Auth handshake below).
The path /processor/ws is illustrative; final path determined by the proxy routing rules. Whatever it is, the SPA reaches it as a relative URL, never a cross-origin URL.
Transport
- Protocol: WebSocket (RFC 6455) over TLS at the edge. Internal hop from the proxy to the producer is plain WS on the `trm_default` Compose network.
- Subprotocol: none required. Future versions may add a `Sec-WebSocket-Protocol` of `trm.live.v1` if we need to negotiate versions; for now the path is the version.
- Frame format: text frames, JSON-encoded. No binary frames. (If we ever need to ship raw position bytes for a high-frequency optimisation, that's a v2 concern.)
- Heartbeat: the producer sends a ping every 30 s; the consumer responds. Consumer-side liveness is enforced by a `setInterval` check: time since last message > 60 s ⇒ reconnect.
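The consumer-side liveness rule can be sketched as a small monitor that a `setInterval` callback would poll. This is a minimal sketch, not the SPA's implementation: `LivenessMonitor` and its method names are hypothetical, and it assumes the heartbeat is observable as an inbound message (browser WebSocket APIs do not surface protocol-level ping frames to script, and the contract does not say which kind of ping the producer sends).

```typescript
// Threshold taken from the Transport section: reconnect after 60 s of silence.
const STALE_AFTER_MS = 60_000;

// Hypothetical helper: remembers when the last inbound frame arrived.
class LivenessMonitor {
  private lastMessageAt: number;

  constructor(now: number) {
    this.lastMessageAt = now;
  }

  // Call on every inbound frame, whether data or heartbeat.
  touch(now: number): void {
    this.lastMessageAt = now;
  }

  // True once the silence exceeds the threshold and a reconnect is due.
  isStale(now: number): boolean {
    return now - this.lastMessageAt > STALE_AFTER_MS;
  }
}
```

Passing `now` in explicitly keeps the rule testable without timers; a caller would supply `Date.now()` from both the message handler and the interval callback.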
Auth handshake
Cookie-based, same-origin, validated against directus once at connection time. The SPA uses the Directus SDK in session mode (see react-spa §"Auth pattern"); the producer is cookie-name-agnostic and just forwards whatever cookie header the upgrade carries.
1. Browser opens WebSocket to wss://<origin>/processor/ws.
Same-origin → browser automatically attaches the httpOnly session cookie
issued by Directus's /auth/login (session mode).
2. Producer reads the entire Cookie header from the upgrade request.
GET /users/me to Directus, forwarding the header verbatim.
200 → user identity (id, role, etc.) is bound to the connection.
401/403 → close the WebSocket with code 4401 (unauthorized).
3. Connection is now authenticated. The producer holds (connectionId → user)
in memory. No further per-message auth.
Implementation notes:
- Cookie validation cache. A `/users/me` round-trip per connection is fine at pilot scale (≤500 viewers). At higher scale, cache the validation result for the connection's lifetime; on logout / session expiry the SPA reconnects, which re-validates.
- No JWT in URL. Don't pass tokens in query strings, because they end up in proxy logs. Cookie is the only credential.
- Why cookie not Authorization header. Browsers don't let you set Authorization on a WebSocket upgrade. Cookies flow automatically. Same-origin is what makes this work.
- Cookie-name-agnostic. The producer never parses individual cookies; it forwards the whole header to `/users/me` and lets Directus identify the session. This keeps the producer working unchanged if Directus's cookie name or auth-mode default ever changes.
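The outcome of step 2 of the handshake can be sketched as a pure decision function over the Directus response. The names `ConnectionUser`, `UpgradeDecision` and `decideUpgrade` are hypothetical. The contract only defines the 200 and 401/403 cases; closing with 1011 (the RFC 6455 "internal error" code) on any other status is an assumption made here so the example is total.

```typescript
// Hypothetical shape of the identity bound to an authenticated connection.
interface ConnectionUser {
  id: string;
  role?: string;
}

type UpgradeDecision =
  | { action: "accept"; user: ConnectionUser }
  | { action: "close"; code: number; reason: string };

const CLOSE_UNAUTHORIZED = 4401; // from the handshake steps above

// Maps the /users/me response onto the handshake outcome.
function decideUpgrade(status: number, user: ConnectionUser | undefined): UpgradeDecision {
  if (status === 200 && user !== undefined) {
    return { action: "accept", user };
  }
  if (status === 401 || status === 403) {
    return { action: "close", code: CLOSE_UNAUTHORIZED, reason: "unauthorized" };
  }
  // Assumption: not covered by the contract. 1011 is not 4401, so a client
  // following the reconnect rules would retry with backoff.
  return { action: "close", code: 1011, reason: "auth backend unavailable" };
}
```

Keeping the decision separate from the HTTP call means the producer can forward the Cookie header verbatim in one place and unit-test the close-code mapping in another.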
Subscription model
After authentication, the SPA subscribes to event-scoped topics. One connection can hold multiple subscriptions; per-event authorization is checked once at subscribe time.
Topic format
event:<eventId>
<eventId> is the UUID of an events row. Authorization: the user must have a record in organization_users for the event's organization (any role). Phase 4 of directus (permissions) will tighten this; for now membership is enough.
Future topic shapes (not in v1):
- `device:<deviceId>`: single-device follow.
- `entry:<entryId>`: follow a specific competitor across stages.
- `org:<orgId>`: broad org-wide watch (admin-only).
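The protocol is forward-compatible: a topic is an opaque string on the wire, so new shapes can be added without changing the message format. The producer rejects any shape it does not recognise with an error message whose code is unknown-topic.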
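A producer-side topic check for v1 might look like the sketch below. `parseTopic` and `ParsedTopic` are hypothetical names, and treating a malformed UUID as `unknown-topic` (rather than `not-found`) is an assumption; the contract does not say which code applies.

```typescript
type ParsedTopic =
  | { ok: true; kind: "event"; eventId: string }
  | { ok: false; code: "unknown-topic" };

// Canonical 8-4-4-4-12 hex UUID, case-insensitive.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Accepts only the v1 shape event:<uuid>; everything else is unknown-topic.
function parseTopic(topic: string): ParsedTopic {
  const sep = topic.indexOf(":");
  if (sep === -1) {
    return { ok: false, code: "unknown-topic" };
  }
  const kind = topic.slice(0, sep);
  const id = topic.slice(sep + 1);
  if (kind === "event" && UUID_RE.test(id)) {
    return { ok: true, kind: "event", eventId: id };
  }
  // Future shapes (device:, entry:, org:) land here until they are implemented.
  return { ok: false, code: "unknown-topic" };
}
```

Authorization and existence checks (`forbidden`, `not-found`) would run after this step, since they need the parsed event id.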
Subscribe
// Client → Server
{
"type": "subscribe",
"topic": "event:ada60b3d-b29f-4017-b702-cd6b700f9f6c",
"id": "client-correlation-id-1"
}
id is optional; if present, the server echoes it on the response so the client can correlate.
Server response — subscribed
// Server → Client
{
"type": "subscribed",
"topic": "event:ada60b3d-b29f-4017-b702-cd6b700f9f6c",
"id": "client-correlation-id-1",
"snapshot": [
{ "deviceId": "cbed320e...", "lat": 41.327, "lon": 19.819, "ts": 1714654800000, "speed": 42.3, "course": 187, "accuracy": 5.0, "attributes": {} },
{ "deviceId": "f6114c7e...", "lat": 41.328, "lon": 19.820, "ts": 1714654799000, "speed": 38.1, "course": 184, "accuracy": 4.5, "attributes": {} }
]
}
The snapshot is the latest known position per device registered to the event (via entry_devices → entries → events). Without it, the SPA opens to a blank map until devices report, which feels broken.
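"Latest known position per device" reduces to keeping the highest `ts` per `deviceId`. The sketch below shows that reduction over an in-memory list; `DeviceFix` and `latestPerDevice` are hypothetical names, and where the producer actually reads positions from (a query, a cache) is not specified by this contract.

```typescript
// Minimal subset of the position fields needed for the reduction.
interface DeviceFix {
  deviceId: string;
  lat: number;
  lon: number;
  ts: number; // epoch milliseconds, UTC
}

// Keeps one entry per device: the one with the greatest device timestamp.
function latestPerDevice(fixes: DeviceFix[]): DeviceFix[] {
  const latest = new Map<string, DeviceFix>();
  for (const fix of fixes) {
    const seen = latest.get(fix.deviceId);
    if (seen === undefined || fix.ts > seen.ts) {
      latest.set(fix.deviceId, fix);
    }
  }
  return Array.from(latest.values());
}
```

Comparing on `ts` rather than arrival order means an out-of-order record cannot replace a newer fix in the snapshot.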
Server response — error
// Server → Client
{
"type": "error",
"topic": "event:ada60b3d-b29f-4017-b702-cd6b700f9f6c",
"id": "client-correlation-id-1",
"code": "forbidden",
"message": "User does not belong to the event's organization."
}
Error codes (initial set; extensible):
| Code | Meaning |
|---|---|
| `forbidden` | User authenticated but not authorized for this topic. |
| `not-found` | Topic refers to a non-existent entity (event id has no row). |
| `unknown-topic` | Topic format not recognised. |
| `rate-limited` | Subscribe rate exceeded (Phase 3 hardening; reserved). |
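On the client, the optional `id` lets a subscribe request be matched to its `subscribed` or `error` reply. A minimal sketch, assuming the client always sends an `id`; `PendingSubscribes` and the `sub-N` id format are hypothetical. `code` is typed as `string` because the error-code set is extensible.

```typescript
interface SubscribedReply {
  type: "subscribed";
  topic: string;
  id?: string;
  snapshot: unknown[];
}

interface ErrorReply {
  type: "error";
  topic: string;
  id?: string;
  code: string; // extensible set, so not a closed union
  message: string;
}

type SubscribeReply = SubscribedReply | ErrorReply;

class PendingSubscribes {
  private nextId = 1;
  private pending = new Map<string, string>(); // correlation id -> topic

  // Builds the subscribe frame and remembers its correlation id.
  request(topic: string): { type: "subscribe"; topic: string; id: string } {
    const id = `sub-${this.nextId++}`;
    this.pending.set(id, topic);
    return { type: "subscribe", topic, id };
  }

  // Matches a reply to a pending request and reports the outcome.
  settle(reply: SubscribeReply): "active" | "rejected" | "unmatched" {
    if (reply.id === undefined || !this.pending.delete(reply.id)) {
      return "unmatched";
    }
    return reply.type === "subscribed" ? "active" : "rejected";
  }
}
```

A caller would serialise the frame returned by `request` onto the socket and feed each parsed `subscribed` or `error` message to `settle`.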
Streaming updates
After subscribed, the server pushes one message per position-of-interest:
// Server → Client
{
"type": "position",
"topic": "event:ada60b3d-b29f-4017-b702-cd6b700f9f6c",
"deviceId": "cbed320e-1e94-488a-93c3-41060fcb06bc",
"lat": 41.32791,
"lon": 19.81947,
"ts": 1714654801000,
"speed": 42.5,
"course": 188,
"accuracy": 5.0,
"attributes": {}
}
Field semantics:
| Field | Type | Required | Notes |
|---|---|---|---|
| `type` | `"position"` | yes | Discriminator. |
| `topic` | string | yes | Echoes the subscription. Allows multiplexing on one connection. |
| `deviceId` | uuid | yes | The devices.id (not the IMEI). SPA looks up device → entry → vehicle/crew via TanStack Query against directus. |
| `lat` / `lon` | number (degrees, WGS84) | yes | GPS coordinates. Coordinate order in JSON is lat/lon (not [lon,lat] GeoJSON ordering; that conversion happens in the SPA). |
| `ts` | number (epoch milliseconds, UTC) | yes | Authoritative timestamp from the device's GPS fix. Always use this, never Date.now() on the client. |
| `speed` | number (km/h) | optional | Omitted if device reports speed=0 with invalid GPS fix (per teltonika convention). |
| `course` | number (degrees, 0=N, clockwise) | optional | Heading. Omitted if unknown. |
| `accuracy` | number (metres) | optional | Position accuracy radius for the react-spa's accuracy-circle layer. |
| `attributes` | object | optional, default `{}` | The decoded IO bag. Phase 1 ships the raw IO map; Phase 2 of processor adds named attributes per io-element-bag. SPA must tolerate empty / unknown shapes. |
The producer should omit fields rather than send null for absent values. Reduces JSON size and removes ambiguity (null = "we don't know" vs missing = "device didn't report").
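Two of the rules above, the lat/lon to GeoJSON reordering and omitting absent values instead of sending null, can be sketched as follows. `PositionMessage` mirrors the field table; `toLngLat` and `encodeFrame` are hypothetical helper names. Note that the replacer drops null-valued properties at every depth, including inside `attributes`, which is a simplification of this sketch.

```typescript
interface PositionMessage {
  type: "position";
  topic: string;
  deviceId: string;
  lat: number;
  lon: number;
  ts: number; // epoch milliseconds, UTC
  speed?: number; // km/h
  course?: number; // degrees, 0 = north, clockwise
  accuracy?: number; // metres
  attributes?: Record<string, unknown>;
}

// Consumer side: GeoJSON orders coordinates [longitude, latitude].
function toLngLat(msg: PositionMessage): [number, number] {
  return [msg.lon, msg.lat];
}

// Producer side: a replacer that returns undefined removes the property,
// so null readings are omitted from the frame rather than serialised.
function encodeFrame(frame: object): string {
  return JSON.stringify(frame, (_key, value) => (value === null ? undefined : value));
}
```

For example, `encodeFrame({ type: "position", speed: null, lat: 41.3 })` yields a frame with no `speed` key at all.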
Unsubscribe
// Client → Server
{
"type": "unsubscribe",
"topic": "event:ada60b3d-b29f-4017-b702-cd6b700f9f6c",
"id": "client-correlation-id-2"
}
Server response:
// Server → Client
{
"type": "unsubscribed",
"topic": "event:ada60b3d-b29f-4017-b702-cd6b700f9f6c",
"id": "client-correlation-id-2"
}
The connection stays open with whatever other subscriptions are active. Closing the WebSocket is the cleanup-everything path.
Reconnect semantics
The client reconnects on close (other than code 4401). Backoff: 1s, 2s, 4s, 8s, 16s, then 30s steady. Cap at 30s.
On reconnect, the client must re-subscribe to all previously-active topics. The server treats reconnect as a fresh connection; subscription state lives in memory only.
The server should accept reconnects from the same user without rate-limiting at pilot scale. Phase 3 may add a per-user concurrent-connection cap.
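The reconnect rules fit in three small functions: the backoff schedule, the 4401 exception, and the re-subscribe replay. This is a sketch with hypothetical names (`nextDelayMs`, `shouldReconnect`, `resubscribeFrames`); `attempt` is assumed to be a zero-based count of consecutive failed attempts, reset on a successful connection.

```typescript
const BACKOFF_BASE_MS = 1_000;
const BACKOFF_CAP_MS = 30_000;
const CLOSE_UNAUTHORIZED = 4401;

// attempt 0 -> 1 s, 1 -> 2 s, 2 -> 4 s, 3 -> 8 s, 4 -> 16 s, then 30 s steady.
function nextDelayMs(attempt: number): number {
  const doubled = BACKOFF_BASE_MS * 2 ** Math.max(0, attempt);
  return Math.min(doubled, BACKOFF_CAP_MS);
}

// 4401 means the session is invalid; retrying without a new login cannot succeed.
function shouldReconnect(closeCode: number): boolean {
  return closeCode !== CLOSE_UNAUTHORIZED;
}

// The server keeps no subscription state across connections, so every
// previously active topic has to be requested again.
function resubscribeFrames(activeTopics: string[]): { type: "subscribe"; topic: string }[] {
  return activeTopics.map((topic) => ({ type: "subscribe" as const, topic }));
}
```

Doubling and then clamping reproduces the listed schedule exactly, because the sixth step (32 s) is the first to exceed the 30 s cap.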
Multi-instance behaviour
When processor (or the gateway service) runs more than one replica:
- Each instance reads the redis-streams telemetry stream on two consumer groups:
  - `processor`: the durable-write group (work-split: only one instance handles each record for the DB write).
  - `live-broadcast-{instance_id}`: a per-instance fan-out group (every instance reads every record for fan-out).
- Connected clients are bound to one instance via the load balancer; that instance fans out to its own clients only. No cross-instance broadcasting needed.
- The reconnect is what handles instance failure — client reconnects, gets re-load-balanced to a healthy instance, re-subscribes.
This design is documented in live-channel-architecture §"Multi-instance Processor".
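The two-group arrangement works because Redis Streams delivers each entry to one consumer within a group, while separate groups each see the whole stream. A minimal naming sketch, with hypothetical helper names and an unspecified instance id format:

```typescript
// Shared by all replicas: each record is handled by exactly one of them.
const DURABLE_WRITE_GROUP = "processor";

// Unique per replica: each replica sees every record for fan-out.
function fanOutGroupFor(instanceId: string): string {
  return `live-broadcast-${instanceId}`;
}

// The pair of groups a single replica reads the telemetry stream through.
function consumerGroupsFor(instanceId: string): { durableWrite: string; fanOut: string } {
  return { durableWrite: DURABLE_WRITE_GROUP, fanOut: fanOutGroupFor(instanceId) };
}
```

Two replicas therefore share the durable-write group but never share a fan-out group, which is why no cross-instance broadcasting is needed.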
Connection limits and back-pressure
Pilot-scale targets (subject to revision after first dogfood):
| Metric | Target |
|---|---|
| Concurrent connections per instance | 100 |
| Subscriptions per connection | 4 (one event + room for future per-device follow) |
| Position messages per second per connection | ≤ 500 (race start with 500 devices reporting at 1Hz) |
| End-to-end latency (Redis stream → client) | p95 < 500ms |
| Reconnect storm tolerance | 200 reconnects/sec for 5 seconds (race start surge) |
If a slow consumer can't drain its queue, the server drops oldest position messages for that connection (per-device; latest position is always preserved). Position data is always-fresh — backlog isn't valuable. Only subscribed/unsubscribed/error control messages are guaranteed delivery.
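One way to realise this drop policy is a per-connection outbound queue that sheds superseded positions once a cap is exceeded. This is a sketch under assumptions: `OutboundQueue`, `maxPositions` and the `Outbound` shape are hypothetical, and the contract does not define a queue size. Control messages are never dropped, and the newest queued position for each device is kept even if that leaves the queue above the cap.

```typescript
type Outbound =
  | { kind: "control"; payload: string }
  | { kind: "position"; deviceId: string; payload: string };

class OutboundQueue {
  private items: Outbound[] = [];
  private readonly maxPositions: number;

  constructor(maxPositions: number) {
    this.maxPositions = maxPositions;
  }

  push(item: Outbound): void {
    this.items.push(item);
    if (item.kind === "position") {
      this.shed();
    }
  }

  // Hands the queued frames to the socket writer and empties the queue.
  drain(): Outbound[] {
    const out = this.items;
    this.items = [];
    return out;
  }

  // Drops the oldest superseded positions until the cap is met.
  private shed(): void {
    const newest = new Map<string, number>(); // deviceId -> index of newest fix
    let positions = 0;
    this.items.forEach((item, index) => {
      if (item.kind === "position") {
        newest.set(item.deviceId, index);
        positions += 1;
      }
    });
    let excess = positions - this.maxPositions;
    if (excess <= 0) {
      return;
    }
    this.items = this.items.filter((item, index) => {
      if (excess === 0 || item.kind !== "position") {
        return true; // control messages are guaranteed delivery
      }
      if (newest.get(item.deviceId) === index) {
        return true; // latest position per device is always preserved
      }
      excess -= 1;
      return false;
    });
  }
}
```

With a cap of 2, pushing A1, a control frame, A2, B1 and A3 leaves the control frame, B1 and A3 queued: the two older fixes for device A are shed.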
Versioning
This is v1. Breaking changes (renaming fields, changing semantics) require:
- New endpoint path (`/processor/ws/v2`).
- Update this synthesis page to document both versions.
- Deprecation window: v1 stays online for ≥ one full event cycle after v2 lands.
Non-breaking additions (new optional fields, new message types, new error codes) ship in v1 without ceremony — both sides should ignore unknown fields and unknown type values.
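The "ignore unknown `type` values" rule can be sketched as a classifier that a client runs before dispatching a frame. `classifyFrame` is a hypothetical name; ignoring frames that are not valid JSON objects is an assumption, since the contract only covers unknown fields and types.

```typescript
// Server-to-client message types defined in v1.
const KNOWN_TYPES = ["position", "subscribed", "unsubscribed", "error"] as const;
type KnownType = (typeof KNOWN_TYPES)[number];

// Returns the message type if it is one this client understands, else "ignored".
function classifyFrame(raw: string): KnownType | "ignored" {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return "ignored"; // assumption: malformed frames are skipped, not fatal
  }
  if (typeof parsed !== "object" || parsed === null) {
    return "ignored";
  }
  const candidate = (parsed as { type?: unknown }).type;
  const match = KNOWN_TYPES.find((known) => known === candidate);
  return match ?? "ignored";
}
```

Because unknown types fall through to `"ignored"` rather than throwing, a v1 client keeps working when the producer starts emitting a message type added later.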
Open questions
- Session expiry while connected. Directus session cookies have a finite lifetime. The WebSocket connection's already-validated identity is unaffected for as long as the connection stays open: the producer authorised once at upgrade and doesn't re-check. If the session expires server-side, the SPA's next REST call (or its periodic `/users/me` ping, if added) will fail with 401, the SPA will redirect to login, and on re-login the SPA reconnects the WebSocket, which re-validates. Pilot answer: producer never re-validates mid-connection. Phase 3 hardening can revisit if real-world session durations make this feel wrong.
- Device-to-event resolution snapshot freshness. The snapshot includes "every device registered to the event"; that registration set may change while a client is subscribed. Initial answer: subscription holds the registration set captured at subscribe time; new entries added mid-event don't appear until the client reconnects. Acceptable for pilot.
- Faulty-flag visibility. When an operator flips a position's `faulty=true` flag in directus, should the live channel emit a correction? Current answer: no. Faulty flagging is post-hoc operator review, not a live concern. Live map shows whatever was streamed at the time. The recompute pipeline (processor faulty position handling) corrects derived data, not the live history.
- Replay-mode endpoint. Out of v1 scope. A future `event:<id>:replay` topic could stream historical positions at a chosen speed. Defer.
Cross-references
- live-channel-architecture — architectural rationale and dual-channel design.
- processor — the entity nominally hosting this endpoint (subject to the Implementation status note above).
- react-spa — the consumer.
- maps-architecture — consumer-side throughput discipline (rAF coalescer) that this contract is consumed through.
- traccar-maps-architecture — the working production reference whose WS contract shape this draws from (with refinements for our needs).
- directus — auth source (cookie validator) and the data source for event/device/org metadata the SPA looks up alongside the live stream.