docs: update log and wiki entries for Phase 1.5 live broadcast implementation and incident resolution
@@ -2,7 +2,7 @@
|
||||
title: Live channel architecture
|
||||
type: concept
|
||||
created: 2026-05-01
|
||||
updated: 2026-05-01
|
||||
updated: 2026-05-03
|
||||
sources: []
|
||||
tags: [architecture, realtime, websocket, telemetry-plane, decision]
|
||||
---
|
||||
@@ -59,22 +59,28 @@ Each kind of data takes the path that fits it. No bridges, no extensions inside
|
||||
|
||||
## Authorization flow
|
||||
|
||||
The Processor's WebSocket endpoint validates connections through Directus, but never asks Directus per record.
|
||||
The Processor's WebSocket endpoint validates connections through Directus, but never asks Directus per record. The handshake is **cookie-based and same-origin** — see [[processor-ws-contract]] §"Auth handshake" for the wire-level spec.
|
||||
|
||||
```
|
||||
1. SPA opens wss://processor.../live with a Directus-issued JWT.
|
||||
2. Processor validates the JWT (round-trip to Directus's /users/me, or local
|
||||
verification with Directus's signing secret). Failure → close socket.
|
||||
3. SPA sends {type: 'subscribe', event_id: 42}.
|
||||
4. Processor calls Directus once: GET /items/events/42 with the user's token.
|
||||
200 → allow subscription, store {client → event_id} in memory.
|
||||
403 → reject subscription with a clear error.
|
||||
1. SPA opens wss://<origin>/ws-live (relative URL; same origin as Directus).
|
||||
Browser auto-attaches the httpOnly Directus session cookie.
|
||||
2. Processor reads the entire Cookie header from the upgrade request and
|
||||
forwards it to Directus GET /users/me.
|
||||
200 → bind the connection to (id, role).
|
||||
401/403 → close the socket with code 4401 (unauthorized).
|
||||
3. SPA sends {type: 'subscribe', topic: 'event:<uuid>'}.
|
||||
4. Processor checks the user's organization_users membership against the
|
||||
event's organization_id (one cached lookup per event).
|
||||
200 → store {client → topic}; reply with the latest-position snapshot.
|
||||
403 → reply with {type: 'error', code: 'forbidden'}.
|
||||
5. For every position arriving on Redis, match against in-memory subscriptions
|
||||
and push to matched clients. Zero Directus calls in the hot path.
|
||||
```
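Steps 1 and 2 of the cookie-based handshake can be sketched as a pure decision function. This is a minimal Python sketch under stated assumptions, not the Processor's actual code: `authenticate`, `Identity`, and the injected `fetch_me` callable are hypothetical names, the `{"data": {...}}` response envelope is assumed, and the handling of a missing cookie or of a non-auth failure (close code 1011) is an assumption that the flow does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional, Tuple

CLOSE_UNAUTHORIZED = 4401    # close code named in step 2
CLOSE_INTERNAL_ERROR = 1011  # assumption: standard WebSocket "internal error" code


@dataclass(frozen=True)
class Identity:
    user_id: str
    role: str


# Hypothetical transport hook: takes the raw Cookie header, performs
# GET /users/me against Directus, and returns (http_status, parsed_json_body).
FetchMe = Callable[[str], Tuple[int, Mapping]]


def authenticate(
    cookie_header: Optional[str], fetch_me: FetchMe
) -> Tuple[Optional[Identity], Optional[int]]:
    """Return (identity, None) on success or (None, close_code) on failure."""
    if not cookie_header:
        # Assumption: no cookie means no session, so skip the round-trip.
        return None, CLOSE_UNAUTHORIZED
    status, body = fetch_me(cookie_header)  # forward the entire header unchanged
    if status == 200:
        data = body.get("data", {})  # assumed response envelope
        return Identity(user_id=data["id"], role=data["role"]), None
    if status in (401, 403):
        return None, CLOSE_UNAUTHORIZED
    # Assumption: anything else (e.g. Directus unavailable) is not an auth verdict.
    return None, CLOSE_INTERNAL_ERROR
```

Keeping the HTTP call behind an injected callable makes the status-to-close-code mapping testable without a running Directus instance.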
|
||||
|
||||
Connection-time auth is amortized over session lifetime. Permission re-checks happen on subscription change, not on every record. The hot path is bounded by `O(positions × subscribed-clients-per-event)` and runs entirely on the Processor's event loop with in-memory state.
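The in-memory matching described here can be illustrated with a small registry keyed by topic. This is a hedged Python sketch, not the Processor's implementation: `SubscriptionRegistry`, its method names, and the use of plain string client IDs are all hypothetical. It only shows why the per-position cost depends on the number of subscribers to that one event rather than on the total number of connections.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Set


class SubscriptionRegistry:
    """Hypothetical in-memory map of topic -> subscribed client IDs."""

    def __init__(self) -> None:
        self._clients_by_topic: Dict[str, Set[str]] = defaultdict(set)
        self._topics_by_client: Dict[str, Set[str]] = defaultdict(set)

    def subscribe(self, client_id: str, topic: str) -> None:
        # Called only after the membership check has passed.
        self._clients_by_topic[topic].add(client_id)
        self._topics_by_client[client_id].add(topic)

    def drop_client(self, client_id: str) -> None:
        # Called on socket close so stale clients never receive pushes.
        for topic in self._topics_by_client.pop(client_id, set()):
            clients = self._clients_by_topic.get(topic)
            if clients is None:
                continue
            clients.discard(client_id)
            if not clients:
                del self._clients_by_topic[topic]

    def recipients(self, event_id: str) -> FrozenSet[str]:
        # Hot path: one dictionary lookup per position, no Directus call.
        return frozenset(self._clients_by_topic.get(f"event:{event_id}", ()))
```

For each position read from Redis, the caller would look up `recipients(event_id)` and write the payload to each returned client's socket.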
|
||||
|
||||
> Earlier revisions of this page described JWT-in-URL auth. That predated [[react-spa]]'s switch to Directus SDK session-mode auth (see log entry 2026-05-02 "Auth-mode wiki realignment"). The current implementation is cookie-based; tokens never appear in WebSocket URLs (which would land them in proxy logs).
|
||||
|
||||
## Failure modes
|
||||
|
||||
| Failure | Effect on durable storage | Effect on live channel |
|
||||
@@ -107,7 +113,7 @@ At pilot scale (≤500 devices per event, tens of viewers), the dominant costs a
|
||||
When this becomes wrong:
|
||||
|
||||
- Sustained > ~10k WebSocket messages/sec total → consider sharding the broadcast path or extracting to a dedicated gateway service.
|
||||
- Connection-time auth becomes a thundering herd at race start with thousands of viewers → cache JWT verification locally and shorten the Directus permission check via a token-with-scope pattern.
|
||||
- Connection-time auth becomes a thundering herd at race start with thousands of viewers → cache the `/users/me` validation result for the connection's lifetime and shorten the Directus permission check via a token-with-scope pattern. Pilot scale doesn't need this; revisit when measured.
|
||||
- Multi-data-center deployment → revisit the consumer-group fan-out strategy; per-region broadcast may be cleaner than global.
|
||||
|
||||
The escape hatch is well-defined: lift the WebSocket endpoint code out of the Processor into a standalone service that subscribes to the same `live-broadcast-*` consumer group. The Redis-stream-in / WebSocket-out contract doesn't change; only the host process does.
|
||||
@@ -116,7 +122,7 @@ The escape hatch is well-defined: lift the WebSocket endpoint code out of the Pr
|
||||
|
||||
- [[processor]] grows a public-facing WebSocket endpoint in addition to its existing Redis consumer and Postgres writer.
|
||||
- [[directus]] keeps its built-in WebSocket subscriptions for tables it writes to. Its real-time delivery section no longer claims to broadcast direct writes from [[processor]] — that's a documented mistake corrected in this revision.
|
||||
- [[react-spa]] connects to two WebSocket endpoints: Directus for admin/business updates, Processor for live position firehose. Same JWT-based auth on both. Consumer-side throughput discipline (rAF coalescing of incoming positions before reducer dispatch) is documented in [[maps-architecture]] — without it the per-message dispatch pattern observed in [[traccar-maps-architecture]] cascades through selectors and `setData` at every position arrival.
|
||||
- [[react-spa]] connects to two WebSocket endpoints: Directus at `/ws-business` for admin/business updates, Processor at `/ws-live` for live position firehose. Same-origin httpOnly Directus session cookie on both — no separate auth artifact for the live channel. Consumer-side throughput discipline (rAF coalescing of incoming positions before reducer dispatch) is documented in [[maps-architecture]] — without it the per-message dispatch pattern observed in [[traccar-maps-architecture]] cascades through selectors and `setData` at every position arrival.
|
||||
- The deploy stack publishes the Processor's WebSocket port (with TLS termination at a reverse proxy in front).
|
||||
|
||||
## Why not a single WebSocket endpoint
|
||||
@@ -130,6 +136,6 @@ Two endpoints, each serving the writes its plane manages, is the architecturally
|
||||
|
||||
## Open questions
|
||||
|
||||
- **JWT validation strategy.** Round-trip to Directus's `/users/me` (no shared secret, ~20ms per connection) vs. local verification with Directus's signing key (no round-trip, but a secret to share). Pilot can start with round-trip; revisit if connection rates climb.
|
||||
- **Auth caching strategy.** Currently every WebSocket connection round-trips to Directus's `/users/me` (~20ms over the internal network) to validate the forwarded session cookie. At pilot scale (≤500 viewers, low reconnect rate) this is trivial. Caching the validation per-connection-lifetime is the cheap optimisation; a stateless verification path (shared signing secret) is the heavier one. Defer until measurements demand it.
|
||||
- **Subscription model.** Per-event, per-stage, per-organization, or arbitrary filter expressions? The simplest pilot model is "subscribe to one event by ID"; extensions land when SPA UX demands them.
|
||||
- **Permission staleness.** If a user is removed from an organization mid-session, do their existing subscriptions silently keep delivering until reconnect? Either re-validate periodically, or accept "trust the session" for pilot.
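
To make the "re-validate periodically" option concrete, here is one possible shape for a periodic sweep. It is a Python sketch under assumptions, not a decided design: `sweep_stale`, the `validated_at` bookkeeping, the `max_age` threshold, and the injected `is_still_member` check are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Subscription = Tuple[str, str]  # (user_id, topic), e.g. ("u1", "event:<uuid>")


def sweep_stale(
    validated_at: Dict[Subscription, float],
    now: float,
    max_age: float,
    is_still_member: Callable[[str, str], bool],
) -> List[Subscription]:
    """Re-check subscriptions older than max_age; return the revoked ones."""
    revoked: List[Subscription] = []
    for sub, checked in list(validated_at.items()):
        if now - checked < max_age:
            continue  # validated recently enough, skip the lookup
        user_id, topic = sub
        if is_still_member(user_id, topic):
            validated_at[sub] = now  # refresh the timestamp
        else:
            del validated_at[sub]  # stop delivering to this subscription
            revoked.append(sub)
    return revoked
```

With this shape, the worst-case staleness is roughly `max_age` plus the sweep interval, which is the number to weigh against simply accepting "trust the session" for the pilot.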
|
||||
|
||||