docs/log.md
julian f9b96efc6b Document directus deployment + internal-only network model
trm/directus Phase 1 image is on the registry; trm/deploy's
compose.yaml has been extended with a directus service block that
shares the existing postgres service with processor (different
tables, no contention). Bringing the architecture wiki up to date.

wiki/entities/directus.md updates:

- New "Deployment" section: links to the deploy compose, names the
  shared-Postgres model with processor, spells out the 5-step boot
  pipeline (db-init pre-schema → bootstrap → schema apply →
  db-init post-schema → start), notes first-boot (~60-90 s) vs
  warm-boot (~10 s) timing, points at deploy/README.md's first-deploy
  checklist.

- New "Network exposure" subsection: directus is internal-only on
  stage / prod (expose: 8055 not ports:). A reverse proxy on the
  host or attached to trm_default terminates TLS and forwards the
  public domain to http://directus:8055. The asymmetry with
  tcp-ingestion (which must host-publish for GPS devices) is named.
  The dev compose's deliberate divergence (host-publishes 8055 for
  local iteration) is noted.

- Schema management section: db-init split into pre-schema (db-init/)
  and post-schema (db-init-post/) phases. Post-schema landed because
  the composite UNIQUE constraints target Directus-managed tables
  that don't exist until schema apply runs. Both phases run via the
  same apply-db-init.sh with DB_INIT_DIR overridden between calls.

- Destructive-apply hazard callout: corrected entrypoint step
  reference (now step 3/5, not 2/4) after the bootstrap-before-apply
  reorder that landed during CI iterations.

log.md entry records the three CI iterations that surfaced three
distinct production-breaking bugs (port collision; ordering + silent
ERROR exit; ghost-collection apply conflict) — all caught by the
dry-run gate before reaching stage. Ghost-collection stripping is
now automated in scripts/schema-snapshot.sh so future captures
don't regress.
2026-05-02 12:20:13 +02:00

# Log
Chronological activity log. Append-only. Entry headers use the format `## [YYYY-MM-DD] <op> | <title>` so they can be grepped:
```
grep "^## \[" log.md | tail -10
```
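When the log needs to be processed programmatically rather than grepped, the same header format can be parsed with a regular expression. This is a minimal sketch; the function name `parse_headers` and the sample text are illustrative and not part of the wiki tooling.

```python
import re

# Matches headers of the form: ## [YYYY-MM-DD] <op> | <title>
HEADER_RE = re.compile(r"^## \[(\d{4}-\d{2}-\d{2})\] (\S+) \| (.+)$")


def parse_headers(text):
    """Return (date, op, title) tuples for every entry header in the text."""
    entries = []
    for line in text.splitlines():
        match = HEADER_RE.match(line)
        if match:
            entries.append(match.groups())
    return entries


# Illustrative sample, shaped like two entries from this log.
sample = (
    "## [2026-04-30] note | Wiki bootstrapped\n"
    "Created CLAUDE.md.\n"
    "## [2026-05-01] note | Stream-name canonicalization\n"
)
headers = parse_headers(sample)
```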
---
## [2026-04-30] note | Wiki bootstrapped
Created CLAUDE.md (schema + workflows), index.md (empty catalog), and this log. Wiki directory structure (wiki/sources, wiki/entities, wiki/concepts, wiki/synthesis) will be created on first ingest.
## [2026-04-30] ingest | gps-tracking-architecture.md + teltonika-ingestion-architecture.md
Ingested both initial architecture docs in one pass. Created:
- Source pages: [[gps-tracking-architecture]], [[teltonika-ingestion-architecture]].
- Entity pages: [[tcp-ingestion]], [[processor]], [[directus]], [[react-spa]], [[redis-streams]], [[postgres-timescaledb]], [[teltonika]].
- Concept pages: [[plane-separation]], [[protocol-adapter]], [[codec-dispatch]], [[position-record]], [[failure-domains]], [[phase-2-commands]].
- Updated index.md with all 15 new pages.
No contradictions to flag — the two docs are coherent (the Teltonika doc explicitly cites and respects the system architecture). Open follow-ups: TRM business domain not yet captured; per-model IO dictionary location TBD; Phase 2 timing unspecified.
## [2026-04-30] ingest | Teltonika Data Sending Protocols (official wiki)
Ingested the canonical Teltonika spec covering all codec families. New additions:
- Source page: [[teltonika-data-sending-protocols]].
- New concept: [[avl-data-format]] — byte-level reference for codecs 8/8E/16, including UDP envelope.
Updates to existing pages (no contradictions; refinements + additions):
- [[teltonika]] — added full codec table with hex IDs, Codec 15 (out of scope), Codec 14 ACK/nACK, packet size limits, UDP support note.
- [[codec-dispatch]] — corrected hex IDs, added directionality table covering codecs 8–15.
- [[position-record]] — concrete priority enum (0/1/2), two's-complement lat/lon note, Speed=0 means GPS invalid, Generation Type and NX section flagged.
- [[phase-2-commands]] — clarified Codec 12 vs 14 selection, added `nack` status for Codec 14 IMEI-mismatch (Type `0x11`); noted 13/15 are not part of the outbound design.
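The two's-complement lat/lon note on [[position-record]] can be illustrated with a short decoder. This is a sketch that assumes the coordinate arrives as a 4-byte big-endian signed integer scaled by 10^7, as described in the Teltonika spec; the function name `decode_coordinate` is hypothetical and not taken from the ingestion code.

```python
import struct

# Assumption: coordinates are degrees multiplied by 10^7.
COORD_SCALE = 10_000_000


def decode_coordinate(raw: bytes) -> float:
    """Decode a 4-byte big-endian two's-complement coordinate into degrees."""
    if len(raw) != 4:
        raise ValueError("coordinate must be exactly 4 bytes")
    # ">i" reads a big-endian signed 32-bit integer, so a set high bit
    # yields a negative value (southern or western hemisphere).
    (value,) = struct.unpack(">i", raw)
    return value / COORD_SCALE


one_degree = decode_coordinate(b"\x00\x98\x96\x80")      # 0x00989680 = 10_000_000
minus_one_degree = decode_coordinate(struct.pack(">i", -10_000_000))
```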
Cleanup: removed stale duplicate concept files from earlier passes (system-planes.md, protocol-adapter-pattern.md, codec-dispatch-registry.md) — superseded by plane-separation.md, protocol-adapter.md, codec-dispatch.md respectively. Fixed dangling [[protocol-adapter-pattern]] link in [[io-element-bag]].
Open questions surfaced by the canonical doc: Codec 16 Generation Type — promote to typed [[position-record]] field? Codec 8E NX values land as `Buffer` in `attributes`; needs explicit fixture coverage. SMS-based protocols (Codec 4 + binary SMS) probably out of scope but worth a deliberate decision.
## [2026-05-01] note | Stream-name canonicalization
Documented the canonical stream/key names in [[redis-streams]] — the wiki was previously silent on the actual `telemetry:teltonika` name, so anyone reading it had no way to find out what stream the services use. Added a "Stream and key naming" table covering the inbound telemetry stream, Phase 2 command streams, and registry/heartbeat keys. Also added the naming convention (`telemetry:{vendor}`) so future adapters fit predictably. Cross-referenced the actual stream name in [[processor]] and [[tcp-ingestion]] entities so each entity is self-contained but the convention has one canonical home.
Triggered by a stage-side bug where tcp-ingestion's compiled default (`telemetry:teltonika`) and processor's compiled default (`telemetry:t`) had drifted; pipeline ran with both services talking past each other for ~7 hours before symptoms surfaced. Fix landed in deploy stack (shared env var) and processor (default realigned). Wiki update closes the documentation loop.
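A sketch of the shared-env-var idea: both services resolve the stream name through the same variable and fall back to the `telemetry:{vendor}` convention. The variable name `TELEMETRY_STREAM` and the helper `stream_name` are assumptions for illustration; the deploy stack's actual variable name is not recorded in this log.

```python
import os

DEFAULT_VENDOR = "teltonika"


def stream_name(vendor: str = DEFAULT_VENDOR, env=os.environ) -> str:
    """Resolve the inbound telemetry stream name.

    TELEMETRY_STREAM is a hypothetical variable name. If it is set, it wins;
    otherwise the name follows the telemetry:{vendor} convention.
    """
    override = env.get("TELEMETRY_STREAM")
    if override:
        return override
    return f"telemetry:{vendor}"


# With no override, every service computes the same default.
default_stream = stream_name(env={})
overridden = stream_name(env={"TELEMETRY_STREAM": "telemetry:custom"})
```

Keeping the fallback in one shared helper, rather than compiling a literal into each service, is one way to prevent the kind of drift described above.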
## [2026-05-01] synthesis | Live channel architecture (corrects a wiki claim)
Researched Directus's WebSocket subscription mechanism via context7. Confirmed that subscriptions only fire for writes that go through Directus's `ItemsService` (REST/GraphQL/Admin UI mutations, not direct database INSERTs). The previous claim in [[directus]] — "When Processor writes a row, Directus broadcasts the change to subscribed clients" — was wrong.
Wrote [[live-channel-architecture]] documenting the corrected design: two WebSocket channels, each in its own plane. Processor exposes its own WebSocket endpoint for high-volume telemetry fan-out (auth via Directus-issued JWT, authorization delegated to Directus once at subscribe time). Directus's built-in WebSocket subscriptions cover business-plane events. Reasoning: preserves [[plane-separation]] and gives the gentlest failure mode (Directus down blocks only new authorizations, not the live firehose).
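The subscribe-time authorization model can be sketched as follows. This is an illustration of the idea only, not the Processor implementation: `LiveChannel` is a hypothetical class, and the `authorize` callable stands in for a call to Directus.

```python
class LiveChannel:
    """Fan-out hub that checks authorization once, at subscribe time."""

    def __init__(self, authorize):
        # authorize(token, topic) -> bool stands in for a call to Directus.
        self._authorize = authorize
        self._subscribers = {}  # topic -> list of delivery callbacks

    def subscribe(self, token, topic, deliver):
        """Register a delivery callback if the token may read the topic."""
        if not self._authorize(token, topic):
            raise PermissionError(f"not allowed to subscribe to {topic}")
        self._subscribers.setdefault(topic, []).append(deliver)

    def broadcast(self, topic, message):
        """Deliver a message to every existing subscriber of the topic."""
        # No authorization call here: existing subscribers keep receiving
        # even if the authorizer is unavailable.
        for deliver in self._subscribers.get(topic, []):
            deliver(message)
```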
Updated [[processor]] (added Live broadcast section, multi-instance consumer-group plumbing note), [[directus]] (corrected the real-time-delivery section), and index.md.
## [2026-05-01] synthesis | Directus schema — working draft
Captured the business-plane schema agreement reached during today's discussion as [[directus-schema-draft]]. Marked as a working draft, open for revision.
Shape: pseudo multi-tenant under `organizations`; users / teams / vehicles / devices are all m2m with orgs (durable catalog); events scoped to a single org; `entries` is the per-event timing unit with nullable `vehicle_id` (foot races) and nullable `team_id` (lone racers); `entry_crew` and `entry_devices` are junctions off entries (no separate `crews` collection — teams already provide durable group identity). Vehicle ownership intentionally soft (`owner_user_id?`, `owner_team_id?`), not enforced. Per-event `classes`. `events.discipline` drives validation. Per-org-per-user role lives on `organization_users.role`.
Open: `entries.status` enum, permission policy definitions per role, stages/timing records (Phase 2 processor), geofences (Phase 2 processor).
## [2026-05-01] synthesis | Schema draft — course definition + penalty system
Major expansion of [[directus-schema-draft]]. Added course definition (stages → segments → geofences/waypoints/SLZs) and the full penalty system. Vehicle ownership idea dropped (org-level only, no owner FKs). `entries.status` enum pinned with semantics. Permission policies confirmed as Directus 11 dynamic-filter Policies, one per logical role.
**Penalty system landed as: numbers in DB, math in code.** A `penalty_formulas` collection holds all values (bracket multipliers, per-miss penalties); the [[processor]] holds one evaluator per `type` in a registry. Speed limit penalties are progressive slice-by-slice (income-tax math, confirmed against the Tirana 24h rulebook): each bracket contributes only the portion of the peak overspeed within its range — `slice × rate` summed across all brackets the peak crossed. Worked example with peak=58 included in the doc.
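A minimal sketch of the slice-by-slice calculation. The bracket values below are hypothetical and are not taken from the Tirana 24h rulebook or from `penalty_formulas`; the function name and the tuple layout are also assumptions.

```python
def progressive_penalty(peak_overspeed, brackets):
    """Sum slice x rate over every bracket the peak overspeed reaches.

    brackets: (lower, upper, rate) tuples sorted ascending by lower bound,
    expressed as overspeed above the limit. upper may be None for an
    open-ended top bracket.
    """
    total = 0.0
    for lower, upper, rate in brackets:
        if peak_overspeed <= lower:
            break  # the peak never reached this bracket or any above it
        top = peak_overspeed if upper is None else min(peak_overspeed, upper)
        total += (top - lower) * rate  # only the slice inside this bracket
    return total


# Hypothetical brackets for illustration only.
EXAMPLE_BRACKETS = [
    (0, 10, 1.0),
    (10, 20, 2.0),
    (20, None, 5.0),
]

# Peak overspeed of 25: 10 x 1.0 + 10 x 2.0 + 5 x 5.0
example_total = progressive_penalty(25, EXAMPLE_BRACKETS)
```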
**Retroactive flag** lives on `penalty_formulas` (default `true`) and on `geofences` / `speed_limit_zones` (default `false`). Per-edit override at save time. Formula recomputes are cheap (snapshotted inputs on `entry_penalties` rows). Geometry recomputes are expensive (replay from positions hypertable) and deferred to Phase 2.5 of [[processor]].
**Other decisions:** checkpoints are typed geofences with `manual_verification=true`, not a separate collection. Stages are containers; segments (`liaison` / `special-stage` / `parc-ferme`) are the atomic rule unit. SLZs carry an `evaluation_window_meters` so the 2 km rule from real federations is data, not code.
Per-entry timing layer (`entry_segment_starts`, `entry_crossings`, `entry_penalties`) and results layer (`stage_results`) are the [[processor]] Phase 2 write target. Schema is laid out so Phase 1 (positions only) can ship without it.
## [2026-05-01] note | Faulty position flagging
Added a `faulty boolean DEFAULT false` column to the positions hypertable, controlled by track operators through [[directus]] (the hypertable is exposed as a Directus collection for read+update). [[processor]] filters `WHERE faulty = false` on every read of position data — peak-speed, crossing detection, replay-based recompute. Flagging triggers a windowed recompute of affected `entry_penalties`. Updated [[postgres-timescaledb]], [[position-record]] (storage shape vs. wire shape), [[processor]] (faulty position handling), and [[directus-schema-draft]] (cross-plane operator workflow + third recompute kind).
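The effect of the filter on a peak-speed read can be sketched with an in-memory SQLite table standing in for the positions hypertable. The column set, the `peak_speed` helper, and the sample rows are assumptions for illustration; SQLite stores the flag as 0/1, whereas the Postgres predicate is `faulty = false`.

```python
import sqlite3


def peak_speed(conn, device_id):
    """Return the highest speed for a device, ignoring rows flagged faulty."""
    row = conn.execute(
        # In Postgres the predicate would read: faulty = false
        "SELECT MAX(speed) FROM positions WHERE device_id = ? AND faulty = 0",
        (device_id,),
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE positions ("
    "device_id TEXT, ts INTEGER, speed REAL, faulty INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO positions (device_id, ts, speed) VALUES (?, ?, ?)",
    [("dev-1", 1, 40.0), ("dev-1", 2, 180.0), ("dev-1", 3, 55.0)],
)
peak_before = peak_speed(conn, "dev-1")

# An operator flags the implausible spike at ts=2 as faulty.
conn.execute("UPDATE positions SET faulty = 1 WHERE device_id = 'dev-1' AND ts = 2")
peak_after = peak_speed(conn, "dev-1")
```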
## [2026-05-01] synthesis | Schema draft — start-order strategies + secondary observations
Read two real-world rulebooks to pin the start-order question: Tirana 24h 2017 (static every leg) and Rally Albania 2025 (dynamic, several variants). Rally Albania's §5.5–5.10 settle it: start order is per-stage, declarative, and rule-driven. Stage 1 bikes invert the top 20 of the prologue; stages 2 onward seed from previous-stage **clean** SS time (penalties explicitly excluded); epilogue inverts overall standings; intervals are decided per stage.
Updates to [[directus-schema-draft]]:
- `stages` gains `role` (prologue/regular/epilogue), `start_interval_seconds`, `start_order_strategy`, `start_order_strategy_params`, `start_order_input_stage_id`.
- New "Start order strategies" subsection enumerating `manual` / `previous_stage_result` / `previous_stage_clean_result` / `inverse_top_n_then_natural` / `inverse_of_overall` with real-world mappings. Tirana 24h covered by `manual`; Rally Albania covered by the other four.
- `entry_segment_starts` adds `start_position` and `manual_override` (latter for late-arrival reseeding by Race Marshals — both rulebooks leave that operator-driven).
- Materialization is per-category (categories share grids independently per Rally Albania §2.8 + §5.10).
- Decisions list grows: stage roles, CP-missing vs CP-late-past-closing as distinct event types sharing a formula row, reverse-stage tiebreaker.
- Open questions shrink: dropped the start-interval question (now pinned) and the permission-policy-filters question (admin/deployment task, not architectural).
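One plausible reading of the `inverse_top_n_then_natural` strategy, sketched as a pure function: the top `n` of the input ranking start in reverse order and everyone else keeps their ranked order. The signature is an assumption; how `start_order_strategy_params` carries `n` is defined in the schema draft, not here.

```python
def inverse_top_n_then_natural(ranked_entries, n):
    """Reverse the first n entries of a ranking and keep the rest in order."""
    if n < 0:
        raise ValueError("n must be non-negative")
    top = ranked_entries[:n]
    rest = ranked_entries[n:]
    return list(reversed(top)) + list(rest)


# Prologue ranking A..E with n=3: C starts first, then B, A, D, E.
start_order = inverse_top_n_then_natural(["A", "B", "C", "D", "E"], 3)
```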
## [2026-05-01] ingest | Rally Albania 2025 — Race Rules and Regulations
Formal ingest of `raw/Regulations_2025.pdf` (Motorsport Club Albania, October 2024). Created [[rally-albania-regulations-2025]] as the canonical real-world reference for federation rule shapes — classes, start-order rules, penalty taxonomy, tracking requirements, timekeeping, protests. Section numbers preserved as `§X.Y` so the schema draft and future SPA work can cite precisely.
Wired the source into [[directus-schema-draft]] (added to `sources:` frontmatter; framing note near the top; inline citation at start-order strategies section). Most of the schema-relevant content was already absorbed into the draft during the prior synthesis step — this ingest formalizes the citation chain.
Open follow-ups flagged on the source page: §12.11 SLZ formula lives in the Supplementary Regulations (not the general regs), so we shouldn't hardcode a default; M-7 numbering bug (Veteran and Female driver share the code — likely a typo); neutralization zones (§8.12) not yet modeled in the schema.
Index updated: new source row. No new entity/concept pages created — the doc supports existing pages rather than introducing new domain objects.
## [2026-05-02] note | Directus deployment wired; entity page updated
`trm/directus` Phase 1 shipped its image to the registry and the `trm/deploy` `compose.yaml` was extended with a `directus` service block (sharing the existing `postgres` service with [[processor]]). Updated [[directus]] entity page to reflect operational reality:
- New "Deployment" section: links to the deploy compose, explains the shared-Postgres model with [[processor]], spells out the 5-step boot pipeline (db-init pre-schema → bootstrap → schema apply → db-init post-schema → start), notes first-boot vs warm-boot timing.
- Schema management section: db-init split into pre-schema (`db-init/`) and post-schema (`db-init-post/`) phases. Post-schema landed because the composite UNIQUE constraints target Directus-managed tables that don't exist until schema apply runs.
- Destructive-apply hazard callout: corrected entrypoint step reference (now step 3/5, not 2/4) after the bootstrap-before-apply reorder.
- New "Network exposure" subsection inside Deployment: directus is internal-only on stage / prod (`expose: 8055` not `ports:`). A reverse proxy (Traefik / Caddy / nginx) on the host or attached to `trm_default` terminates TLS and forwards the public domain to `http://directus:8055`. The asymmetry with [[tcp-ingestion]] (which must host-publish for GPS devices) is named, and the dev compose's deliberate divergence is noted.
Three CI iterations on the directus repo's first push exposed three distinct production-breaking bugs (port collision; bootstrap-before-apply ordering + silent ERROR exit; ghost-collection apply conflict). The dry-run gate caught all of them before the image touched stage. The "ghost-collection" stripping is now automated in `scripts/schema-snapshot.sh` so future captures don't regress.
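The boot order and the fail-fast behavior that guards against a silent ERROR exit can be sketched as below. This is an illustration, not the actual entrypoint script: `run_boot` and the `run_step` callable (which stands in for invoking a real step and returning its exit code) are hypothetical.

```python
# Step names as listed in the Deployment section of the entity page.
BOOT_STEPS = [
    "db-init pre-schema",
    "bootstrap",
    "schema apply",
    "db-init post-schema",
    "start",
]


def run_boot(steps, run_step):
    """Run steps in order and stop loudly at the first non-zero exit code."""
    completed = []
    for name in steps:
        code = run_step(name)
        if code != 0:
            # Raising here prevents later steps from running after a failure.
            raise RuntimeError(f"boot step {name!r} failed with exit code {code}")
        completed.append(name)
    return completed
```

In this ordering, schema apply is the third of five steps, which matches the corrected step reference in the destructive-apply callout.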