Files

98 lines
7.3 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Phase 1.5 — Live broadcast
Implement the WebSocket endpoint that fans live position updates from the Processor to subscribed [[react-spa]] clients. Layered on top of Phase 1's throughput pipeline; logically between Phase 1 (throughput) and Phase 2 (domain logic). The wire spec is locked at `docs/wiki/synthesis/processor-ws-contract.md`.
## Why a separate phase
Phase 1's outcome was Redis → Postgres only; the WebSocket fan-out side of [[processor]] was wiki-canonical (`docs/wiki/concepts/live-channel-architecture.md`) but had no implementation task. Phase 2 is gated on Directus schema decisions and is a substantial domain-logic chunk; bundling the WebSocket work into it would couple two unrelated workstreams.
This phase is small, self-contained, and unblocks the [[react-spa]]'s live-map feature for the Rally Albania 2026 dogfood. It does **not** touch domain logic or the Phase 1 throughput path.
## Outcome statement
When Phase 1.5 is done:
- The Processor exposes a WebSocket endpoint (path TBD by the reverse proxy; same origin as [[directus]] and the SPA bundle so the auth cookie flows automatically).
- Connections authenticate via the Directus-issued cookie attached to the WebSocket upgrade request. Validation is a single `/users/me` round-trip to [[directus]] at connect time; the validated user identity is bound to the connection for its lifetime.
- Clients subscribe to `event:<eventId>` topics. Per-event authorization is checked once at subscribe time (does the user belong to the event's organization?). Multiple subscriptions per connection are supported.
- On `subscribed`, the server returns a snapshot of the latest known position for every device registered to the event (via `entry_devices``entries``events`). After the snapshot, position records stream as they arrive on Redis.
- A second consumer group `live-broadcast-{instance_id}` reads the same `telemetry:teltonika` stream as the durable-write group (`processor`), but per-instance — every Processor instance reads every record for its own connected clients. The durable-write path is unaffected.
- 30s server-side ping; client-side liveness check on 60s message-gap; backoff reconnect on close.
- All of this is covered by an end-to-end integration test (testcontainers Redis + Postgres + a Directus auth stub).
## Sequencing
```
1.5.1 WS server scaffold + heartbeat
└─→ 1.5.2 Cookie auth handshake
└─→ 1.5.3 Subscription registry & authorization
├─→ 1.5.4 Broadcast consumer group & fan-out
├─→ 1.5.5 Snapshot-on-subscribe
└─→ 1.5.6 Integration test (depends on 1.5.4 + 1.5.5)
```
1.5.4 and 1.5.5 can be developed in parallel after 1.5.3 lands.
## Files modified
This phase adds these to the existing `processor/` layout:
```
processor/
├── src/
│ ├── core/
│ │ └── ... (unchanged from Phase 1)
│ ├── live/
│ │ ├── server.ts # WS server, heartbeat, lifecycle
│ │ ├── auth.ts # cookie → /users/me → user identity
│ │ ├── registry.ts # subscriptions: connection→topics, topic→connections
│ │ ├── broadcast.ts # live-broadcast consumer group + fan-out loop
│ │ ├── snapshot.ts # latest-position-per-device query
│ │ └── protocol.ts # zod schemas for the wire format (subscribe/position/etc.)
│ ├── db/
│ │ └── ... (unchanged)
│ └── main.ts # wires the live server alongside the consumer
└── test/
├── live-server.test.ts # mocked: heartbeat, lifecycle, message routing
├── live-auth.test.ts # mocked Directus client
├── live-registry.test.ts # subscribe/unsubscribe semantics
├── live-snapshot.test.ts # query shape
└── live.integration.test.ts # end-to-end with testcontainers
```
## Tech stack additions
- **`ws`** — minimal, mature WebSocket server. Plays naturally with `http.createServer` (already used by Phase 1's metrics/health server).
- **No HTTP client library.** Node 22's global `fetch` is sufficient for the `/users/me` and `/items/events/<id>` calls to Directus.
- **`zod`** (already a Phase 1 dep) — runtime validation of inbound WS messages. Strict schemas; reject unknown fields.
No new test dependencies. `vitest` + `testcontainers` already cover what's needed.
## Non-negotiable design rules
These rules govern every task in this phase. Any deviation must be discussed and documented before code lands.
1. **Live work is isolated.** `src/live/` cannot import from `src/core/` and vice versa, with one exception: `src/db/pool.ts` is shared. The Phase 1 throughput pipeline must run unchanged whether or not the live server starts, and vice versa. Enforced by `import/no-restricted-paths` ESLint config.
2. **Authorization is checked once at subscribe time.** Never per record. The hot fan-out path is `O(records × subscribed-clients-per-event)` with zero Directus calls.
3. **Subscription state is in-memory.** No durable subscription store. Reconnect re-subscribes; instance failure means a brief gap and a reconnect.
4. **Always-fresh, not always-deliver.** If a slow consumer can't drain its queue, drop oldest position messages for that connection — latest-position-per-device is what matters. Control messages (`subscribed`/`unsubscribed`/`error`) are guaranteed.
5. **Single origin.** The endpoint is reachable only at the same origin as Directus and the SPA bundle. Cross-origin won't carry the cookie. The reverse-proxy config is responsible for the routing; the Processor binds to a port and trusts the proxy to forward correctly.
6. **No business logic.** This phase ships the protocol and the fan-out plumbing. Nothing in `src/live/` should know what an `entries.race_number` is or what a `class_id` means. Phase 2 may add domain-aware filtering (e.g. "subscribe to a specific class within an event") — out of scope here.
## Key design references (read before starting any task)
- `docs/wiki/synthesis/processor-ws-contract.md` — the wire spec. Authoritative.
- `docs/wiki/concepts/live-channel-architecture.md` — the architectural rationale; explains why this lives in the Processor at all.
- `docs/wiki/entities/processor.md` — the entity-level summary, including the multi-instance consumer-group split.
- `docs/wiki/entities/directus.md` — the auth source; explains how the cookie is issued and what `/users/me` returns.
- `docs/wiki/entities/react-spa.md` — the consumer; `Auth pattern` and `Real-time rendering` sections describe the SPA-side handshake and the rAF coalescer that shapes our delivery cadence.
## Acceptance for the phase as a whole
- [ ] All six task files done.
- [ ] `pnpm typecheck`, `pnpm lint`, `pnpm test` clean across the new code.
- [ ] `pnpm test:integration` runs the live-pipeline end-to-end test green.
- [ ] Manual smoke: with stage Directus + stage Processor + a `wscat` client carrying a valid cookie, can connect, subscribe to the Rally Albania 2026 event, see snapshot, see streamed positions when synthetic positions are published to Redis.
- [ ] No regressions in Phase 1's throughput tests; the durable-write path is unchanged.
- [ ] `docs/wiki/synthesis/processor-ws-contract.md` Implementation status section updated to reflect "implemented in Phase 1.5".