diff --git a/.env.example b/.env.example
index f27604b..c1b147c 100644
--- a/.env.example
+++ b/.env.example
@@ -51,6 +51,61 @@ PROCESSOR_TAG=main
 # Pending Entries List, which is undefined behaviour).
 PROCESSOR_INSTANCE_ID=processor-1
 
+# ---------------------------------------------------------------------
+# directus (business plane)
+# ---------------------------------------------------------------------
+
+# Image tag to pull. `main` auto-tracks the latest commit on the main branch.
+# In production, pin to a specific commit SHA for reproducibility.
+# Example: DIRECTUS_TAG=ef8bd91
+DIRECTUS_TAG=main
+
+# Note: directus is intentionally NOT host-published. The admin UI + API
+# listen on port 8055 inside the `trm_default` Compose network only,
+# reachable as `http://directus:8055` from a reverse proxy (Traefik /
+# Caddy / nginx) on the host or attached to the same network. Wire your
+# proxy to forward your public domain to that internal address; the
+# proxy handles TLS, auth headers, and any WAF / rate-limit policy.
+# For local dev (compose.dev.yaml in trm/directus) the dev compose
+# host-publishes 8055 directly — this prod stack does not.
+
+# REQUIRED. Instance identity key (any UUID) and JWT signing secret
+# (long random string). Generate fresh values per environment:
+#   DIRECTUS_KEY=$(uuidgen)
+#   DIRECTUS_SECRET=$(openssl rand -hex 64)
+# Two instances sharing these produce colliding tokens — never reuse
+# stage's KEY/SECRET in production. The compose defaults are obvious
+# placeholders and will fail on any meaningful KEY validation.
+DIRECTUS_KEY=REPLACE-ME-WITH-A-UUID
+DIRECTUS_SECRET=REPLACE-ME-WITH-A-LONG-RANDOM-STRING
+
+# First-boot admin user. Created automatically when directus_users is
+# empty at first boot; ignored on subsequent boots. Change the password
+# via the admin UI after first login (the password env var is NOT a
+# rotation mechanism — only the initial seed).
+DIRECTUS_ADMIN_EMAIL=admin@example.com
+DIRECTUS_ADMIN_PASSWORD=CHANGE-ON-FIRST-LOGIN
+
+# Public-facing URL used in password-reset emails, OAuth redirects, and
+# asset URLs. In real prod set to https://; the localhost
+# default is for first-deploy smoke testing only.
+DIRECTUS_PUBLIC_URL=http://localhost:8055
+
+# Optional toggles. Defaults disable cache and CORS. Enable per env:
+#   DIRECTUS_CACHE_ENABLED=true   (then configure CACHE_STORE etc. directly
+#                                  in compose.yaml — Directus has 20+
+#                                  cache-related env vars not exposed here)
+#   DIRECTUS_CORS_ENABLED=true
+#   DIRECTUS_CORS_ORIGIN=https://your-spa.example.com
+DIRECTUS_CACHE_ENABLED=false
+DIRECTUS_CORS_ENABLED=false
+DIRECTUS_CORS_ORIGIN=false
+
+# pino log style: json (structured, for log aggregators) | pretty (human-readable).
+# Defaults to json in compose.yaml — production-friendly. Set to `pretty`
+# for local debugging.
+LOG_STYLE=json
+
 # ---------------------------------------------------------------------
 # Shared
 # ---------------------------------------------------------------------
diff --git a/README.md b/README.md
index 8c702f2..ceadf46 100644
--- a/README.md
+++ b/README.md
@@ -18,16 +18,121 @@ Currently:
 
 - **redis** — telemetry queue + future Phase 2 connection registry. Internal-only, persisted via named volume.
 - **tcp-ingestion** — Teltonika telemetry TCP server. Image built by [`trm/tcp-ingestion`](https://git.dev.microservices.al/trm/tcp-ingestion)'s Gitea workflow.
+- **postgres** — PostgreSQL 16 + TimescaleDB + PostGIS via `timescale/timescaledb-ha`. Schema authority is split: the `positions` hypertable is owned by [`trm/processor`](https://git.dev.microservices.al/trm/processor)'s migration runner; everything else is owned by `trm/directus` via its snapshot YAML. Internal-only, persisted via named volume.
+- **processor** — consumes telemetry from Redis, writes to Postgres. Image built by [`trm/processor`](https://git.dev.microservices.al/trm/processor)'s Gitea workflow.
+- **directus** — business-plane API + admin UI + schema authority. Image built by [`trm/directus`](https://git.dev.microservices.al/trm/directus)'s Gitea workflow. Boot pipeline runs db-init pre-schema → bootstrap → schema-apply → db-init post-schema → start; first boot on a fresh DB takes ~60–90 s.
 
 Planned (will be added as they land):
 
-- **processor** — consumes telemetry from Redis, writes to PostgreSQL/Timescale.
-- **postgres** — with TimescaleDB extension.
-- **directus** — business-plane API and admin UI.
 - **react-spa** — front-end SPA (static bundle, served via reverse proxy).
 
 See `../docs/wiki/` for the full architecture.
 
+## First-deploy checklist
+
+Run through this once per environment before clicking deploy. It covers the security-critical secrets (which must NOT use the compose.yaml placeholder defaults) and the Portainer setup.
+
+### 1. Generate per-environment secrets
+
+These values are unique per environment — never reuse them across stage/prod, and never reuse them after a compromise. Run on any machine with `openssl` and `uuidgen` (Linux/macOS/WSL):
+
+```bash
+echo "POSTGRES_PASSWORD=$(openssl rand -base64 32 | tr -d '/+=')"
+echo "DIRECTUS_KEY=$(uuidgen)"
+echo "DIRECTUS_SECRET=$(openssl rand -hex 64)"
+echo "DIRECTUS_ADMIN_PASSWORD=$(openssl rand -base64 24 | tr -d '/+=')"
+```
+
+Keep the output somewhere safe (1Password, Vaultwarden, etc.) — you'll paste it into Portainer next, and you'll need `DIRECTUS_ADMIN_PASSWORD` again to log in for the first time.
+
+> The compose defaults for these (`trm-pilot-change-me`, `REPLACE-ME-WITH-A-UUID`, `REPLACE-ME-WITH-A-LONG-RANDOM-STRING`, `CHANGE-ON-FIRST-LOGIN`) are deliberately broken-looking. Anything still using them after deploy is a misconfiguration.
+
+### 2. Set Portainer stack environment variables
+
+Stack → **Environment variables** in Portainer's Add stack form. Required for first deploy:
+
+| Variable | Value |
+|---|---|
+| `POSTGRES_PASSWORD` | from step 1 |
+| `DIRECTUS_KEY` | from step 1 |
+| `DIRECTUS_SECRET` | from step 1 |
+| `DIRECTUS_ADMIN_EMAIL` | the email you'll log in with |
+| `DIRECTUS_ADMIN_PASSWORD` | from step 1 |
+| `DIRECTUS_PUBLIC_URL` | external URL of the Directus admin UI (e.g. `https://directus.stage.example.com`). Used in password-reset emails and OAuth redirects. |
+
+Recommended for any non-throwaway environment:
+
+| Variable | Value |
+|---|---|
+| `TCP_INGESTION_TAG` | a specific commit SHA (not `main`) for reproducibility |
+| `PROCESSOR_TAG` | same |
+| `DIRECTUS_TAG` | same |
+| `LOG_LEVEL` | `info` or `warn` for prod; `debug` only for active troubleshooting |
+| `LOG_STYLE` | `json` for log aggregators; default is already `json` |
+
+Optional:
+
+| Variable | When to set |
+|---|---|
+| `TCP_INGESTION_PORT` | Change if `5027` is already in use on the host. GPS devices need a real host port — this one is published. |
+| `DIRECTUS_CORS_ENABLED` / `DIRECTUS_CORS_ORIGIN` | Set to `true` and the SPA's origin URL once the React SPA is deployed. |
+
+> **Directus is internal-only by design.** It listens on `8055` inside the `trm_default` Compose network and is **not** published to the host. Wire a reverse proxy (Traefik / Caddy / nginx) on the host or attached to the network and forward your public domain to `http://directus:8055`. The proxy handles TLS, optional WAF / rate-limit, and any auth-header rewriting. Set `DIRECTUS_PUBLIC_URL` to the proxy-served URL (e.g. `https://directus.stage.example.com`) so password-reset emails and OAuth redirects work. The dev compose in `trm/directus` does host-publish `8055` for local iteration; this stage/prod stack deliberately does not.
+
+### 3. Authenticate the host to the Gitea registry
+
+First deploy only (Portainer's **Registries** UI is preferred over manual login):
+
+```bash
+docker login git.dev.microservices.al
+```
+
+### 4. Deploy the stack
+
+Stack → **Add stack** → **Repository** → fill in repo URL, branch, compose path → **Deploy the stack**.
+
+### 5. Watch the first boot
+
+Directus's first boot runs ~30–45 s of internal migrations on top of the project's own boot pipeline. Total time-to-healthy on a fresh DB is ~60–90 s. Tail the logs in Portainer → Stack → directus → Logs:
+
+Expected progression:
+
+```
+[entrypoint] step 1/5: db-init (pre-schema)
+[db-init] db-init complete: 3 applied, 0 skipped
+[entrypoint] step 2/5: directus bootstrap
+INFO: Initializing bootstrap...
+INFO: Installing Directus system tables...
+INFO: Running migrations...
+... ~80 internal migrations ...
+INFO: Setting up first admin role...
+INFO: Adding first admin user...
+INFO: Done
+[entrypoint] step 3/5: directus schema apply
+INFO: Snapshot applied successfully
+[entrypoint] step 4/5: db-init (post-schema)
+[db-init] db-init complete: 2 applied, 0 skipped
+[entrypoint] step 5/5: directus start (pm2-runtime)
+PM2 log: App [directus:0] online
+INFO: Server started at http://0.0.0.0:8055
+```
+
+The healthcheck flips to `healthy` once the server is serving (~5–10 s after the "Server started" log line).
+
+### 6. First login
+
+Browse to `${DIRECTUS_PUBLIC_URL}` (or `http://:8055` if you didn't put it behind a proxy). Log in with `DIRECTUS_ADMIN_EMAIL` + `DIRECTUS_ADMIN_PASSWORD`.
+
+**Change the admin password immediately** via the user's own profile in the admin UI. The `DIRECTUS_ADMIN_PASSWORD` env var is only the first-boot seed — changing it post-deploy has no effect on the running user. Same goes for `POSTGRES_PASSWORD`: it's baked into the persistent volume on first boot and must be rotated via `ALTER USER` inside psql, not by changing the env var.
+
+### 7. Verify the schema landed
+
+Admin UI → **Settings → Data Model**. You should see 12 user collections: `organizations`, `users` (built-in `directus_users` with custom fields), `organization_users`, `vehicles`, `organization_vehicles`, `devices`, `organization_devices`, `events`, `classes`, `entries`, `entry_crew`, `entry_devices`. If a collection is missing, the schema-apply step failed and the boot logs will say so.
+
+> **Schema-as-code reminder:** do NOT add or remove collections via the admin UI on this stage instance. Schema changes flow through `trm/directus` (admin-UI edit → `pnpm run schema:snapshot` → commit → CI dry-run → image rebuild → redeploy). Edits made directly here will be DROPPED on the next image rebuild — schema-apply enforces the committed snapshot. See `docs/wiki/entities/directus.md` "destructive-apply hazard."
+
+---
+
 ## Deploy via Portainer (Repository Stack)
 
 1. **Stack → Add stack → Repository** in Portainer.
@@ -72,9 +177,10 @@ To pin a specific build for production, set the relevant `*_TAG` variable in `.e
 ## Network model
 
 - One internal Compose network (`trm_default`).
-- Redis is **not** bound to a host port — only reachable from other services in the stack via service-name DNS (`redis://redis:6379`).
-- tcp-ingestion's TCP port (5027 by default) is bound to the host so devices can reach it.
-- Other Redis instances on the same host can keep using port 6379 freely; this stack does not collide with them.
+- **Redis**, **postgres**, **directus**, and **processor** are not bound to host ports — only reachable from other services in the stack via service-name DNS (`redis://redis:6379`, `postgres:5432`, `http://directus:8055`).
+- **tcp-ingestion**'s TCP port (`5027` by default) is bound to the host so GPS devices can reach it. This is the only host-published port in the stack.
+- **Directus admin UI / API access** goes through a reverse proxy (Traefik / Caddy / nginx) on the host or attached to the `trm_default` network. The proxy handles TLS, public-DNS routing, and any WAF / rate-limit / auth-header policy. Wire your proxy to forward your domain to `http://directus:8055`. The proxy itself is not part of this stack — add it as a sibling stack in Portainer or run it on the host.
+- Other Redis / Postgres instances on the same host can keep using their default ports freely; this stack does not collide with them. The default postgres-on-host you may already have running on `5432` is untouched — this stack's postgres is internal-only.
 
 ## Environment variables
 
diff --git a/compose.yaml b/compose.yaml
index c010259..b44fcec 100644
--- a/compose.yaml
+++ b/compose.yaml
@@ -113,9 +113,107 @@ services:
       LOG_LEVEL: ${LOG_LEVEL:-info}
     restart: unless-stopped
 
+  # -------------------------------------------------------------------
+  # directus — business-plane API, admin UI, and schema authority.
+  # Built by git.dev.microservices.al/trm/directus's Gitea workflow.
+  #
+  # Boot pipeline (5 steps; see trm/directus/entrypoint.sh):
+  #   1. db-init pre-schema      → positions hypertable + faulty column
+  #   2. directus bootstrap      → installs Directus system tables
+  #   3. directus schema apply   → applies snapshots/schema.yaml
+  #   4. db-init post-schema     → composite UNIQUE constraints
+  #   5. pm2-runtime start       → server up at :8055
+  #
+  # First-boot on a fresh DB takes ~60–90 s (Directus runs its own
+  # internal migrations during step 2). Subsequent boots are ~5 s as
+  # all steps no-op against the warm DB.
+  #
+  # Schema-as-code: collections + fields + relations live in the image
+  # (snapshots/schema.yaml + db-init/*.sql baked in at build time).
+  # Schema changes flow through the trm/directus repo + its CI dry-run
+  # gate, NOT through manual edits on this stage instance. Editing
+  # collections via the admin UI here will be DROPPED on the next image
+  # rebuild — schema-apply enforces the committed snapshot. See
+  # docs/wiki/entities/directus.md "destructive-apply hazard" callout.
+  # -------------------------------------------------------------------
+  directus:
+    image: git.dev.microservices.al/trm/directus:${DIRECTUS_TAG:-main}
+    depends_on:
+      postgres:
+        condition: service_healthy
+    expose:
+      # Internal-only. The admin UI + API are reachable from other services
+      # in the stack via service-name DNS (`http://directus:8055`). A reverse
+      # proxy (Traefik / Caddy / nginx) running on the host or attached to
+      # the `trm_default` network terminates TLS, applies its own auth /
+      # rate-limit / WAF rules, and forwards to this expose port.
+      #
+      # Why not host-publish 8055 directly: the admin UI is a privileged
+      # surface (full CRUD + permission policies + Flow execution). Direct
+      # exposure leaks an attack surface and forces TLS into a service that
+      # shouldn't care about it. tcp-ingestion is different (GPS devices
+      # connect directly so it must publish to the host); Directus is HTTP
+      # and belongs behind a proxy in any non-throwaway environment.
+      - '8055'
+    environment:
+      # ----- Database connection -----
+      DB_CLIENT: pg
+      DB_HOST: postgres
+      DB_PORT: 5432
+      DB_DATABASE: ${POSTGRES_DB:-trm}
+      DB_USER: ${POSTGRES_USER:-trm}
+      DB_PASSWORD: ${POSTGRES_PASSWORD:-trm-pilot-change-me}
+
+      # ----- Instance security — REQUIRED, must be unique per environment.
+      # KEY: any UUID. SECRET: long random string, e.g. `openssl rand -hex 64`.
+      # Two instances sharing the same KEY/SECRET produce colliding JWTs.
+      # Defaults below are placeholders — REPLACE in the Portainer stack env.
+      KEY: ${DIRECTUS_KEY:-REPLACE-ME-WITH-A-UUID}
+      SECRET: ${DIRECTUS_SECRET:-REPLACE-ME-WITH-A-LONG-RANDOM-STRING}
+
+      # ----- Admin bootstrap — only used on first init.
+      # If directus_users is empty at first boot, an admin user is created
+      # from these. Subsequent boots ignore them. Change the password via
+      # the admin UI after first login.
+      ADMIN_EMAIL: ${DIRECTUS_ADMIN_EMAIL:-admin@example.com}
+      ADMIN_PASSWORD: ${DIRECTUS_ADMIN_PASSWORD:-CHANGE-ON-FIRST-LOGIN}
+
+      # ----- Public-facing URL (used in emails, OAuth redirects, asset URLs).
+      # In real prod set to https://; default localhost is just
+      # for first-deploy smoke testing.
+      PUBLIC_URL: ${DIRECTUS_PUBLIC_URL:-http://localhost:8055}
+
+      # ----- Logging -----
+      LOG_LEVEL: ${LOG_LEVEL:-info}
+      LOG_STYLE: ${LOG_STYLE:-json}
+
+      # ----- WebSockets — required for the live channel architecture
+      # (Directus's WS subs cover business-plane events; processor's WS
+      # carries the telemetry firehose). See live-channel-architecture
+      # in the wiki.
+      WEBSOCKETS_ENABLED: 'true'
+
+      # ----- Cache / CORS — defaults disabled; enable per environment.
+      CACHE_ENABLED: ${DIRECTUS_CACHE_ENABLED:-false}
+      CORS_ENABLED: ${DIRECTUS_CORS_ENABLED:-false}
+      CORS_ORIGIN: ${DIRECTUS_CORS_ORIGIN:-false}
+    volumes:
+      # Persist admin-uploaded files across container restarts.
+      # snapshots/ + db-init/ are baked into the image, NOT mounted —
+      # that's the schema-as-code split.
+      - directus-uploads:/directus/uploads
+    restart: unless-stopped
+    healthcheck:
+      test: ['CMD-SHELL', 'wget -qO- http://localhost:8055/server/health || exit 1']
+      interval: 30s
+      timeout: 10s
+      # First boot includes Directus's internal migrations (~30–45 s on
+      # fresh DB). 120 s gives margin; warm boots become healthy in ~10 s.
+      start_period: 120s
+      retries: 3
+
   # -------------------------------------------------------------------
   # Future services land here:
-  #   - directus: business-plane API + admin UI
   #   - react-spa: front-end (static, served via nginx or Caddy)
   # See ../docs/wiki/ for the platform architecture.
   # -------------------------------------------------------------------
@@ -123,3 +221,4 @@ services:
 volumes:
   redis-data:
   postgres-data:
+  directus-uploads:
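
The README change above states that any secret still set to a compose placeholder after deploy is a misconfiguration. One way to catch that before deploying is a small pre-deploy check. The following is a minimal sketch in Python, not part of this change: `PLACEHOLDERS` and `find_placeholder_secrets` are hypothetical names, and the placeholder strings are copied from the `compose.yaml` defaults in this diff. Unset and empty variables are flagged as well, on the assumption that Compose's `${VAR:-default}` interpolation would fall back to the placeholder in both cases.

```python
# Hypothetical pre-deploy check; names are illustrative, not part of the trm repos.
# Placeholder values are copied from the compose.yaml defaults in this diff.
PLACEHOLDERS = {
    "POSTGRES_PASSWORD": "trm-pilot-change-me",
    "DIRECTUS_KEY": "REPLACE-ME-WITH-A-UUID",
    "DIRECTUS_SECRET": "REPLACE-ME-WITH-A-LONG-RANDOM-STRING",
    "DIRECTUS_ADMIN_PASSWORD": "CHANGE-ON-FIRST-LOGIN",
}


def find_placeholder_secrets(env):
    """Return the sorted names of secrets that are unset, empty, or still a placeholder.

    `env` is any mapping of variable name to value, for example os.environ
    or a dict parsed from the Portainer stack environment.
    """
    return sorted(
        name
        for name, placeholder in PLACEHOLDERS.items()
        # Unset or empty behaves like the placeholder under ${VAR:-default}.
        if env.get(name, "") in ("", placeholder)
    )


# Example: one real value, one placeholder, two variables left unset.
example_env = {
    "POSTGRES_PASSWORD": "s3cr3t",
    "DIRECTUS_KEY": "REPLACE-ME-WITH-A-UUID",
}
print(find_placeholder_secrets(example_env))
# → ['DIRECTUS_ADMIN_PASSWORD', 'DIRECTUS_KEY', 'DIRECTUS_SECRET']
```

A non-empty result could be treated as a reason to stop the deploy; passing `os.environ` as `env` would check the current shell instead of a hand-built dict.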