directus/.planning/phase-1-slice-1-schema/08-gitea-ci-dryrun.md
julian ec119af274 Fix CI port collision — remap throwaway Postgres to host port 15432
The runner host typically has another Postgres listening on 5432
(local dev stack, stage instance, etc.), which made the services:
postgres container fail at start with "port already allocated."

Remap the host-side port from 5432:5432 to 15432:5432. The service
container still listens on 5432 internally; only the runner host
binding changes. Dry-run's DB_PORT updated to 15432 to match.

--network host semantics preserved: DB_HOST=localhost reaches the
service on the runner's loopback at the new port.

Why we still need a Postgres container at all: the dry-run gate
applies db-init/*.sql migrations and the directus schema snapshot
against a real DB to catch breakage before pushing the image. No
Postgres = no validation = the gate is bypassed.

Inline comment in the workflow now explains the choice; task spec's
Done section captures the correction so future readers don't
re-discover this.
2026-05-02 10:38:26 +02:00


Task 1.8 — Gitea CI dry-run workflow

Phase: 1 (Slice 1 schema + deploy pipeline)
Status: Not started
Depends on: 1.7
Wiki refs: docs/wiki/entities/directus.md (Schema management section)

Goal

Build a Gitea Actions workflow that, on push to main (when relevant paths change), builds the image, spins up a throwaway Postgres + TimescaleDB in CI, runs the entrypoint flow as a dry-run to catch snapshot/migration breakage, and publishes the image to the registry only if the dry-run succeeds. It mirrors the processor and tcp-ingestion workflow shape.

Deliverables

  • .gitea/workflows/build.yml:
    name: Build directus image
    
    on:
      push:
        branches: [main]
        paths:
          - 'snapshots/**'
          - 'db-init/**'
          - 'extensions/**'
          - 'scripts/**'
          - 'entrypoint.sh'
          - 'Dockerfile'
          - '.gitea/workflows/build.yml'
      workflow_dispatch:
    
    jobs:
      build-and-publish:
        runs-on: ubuntu-22.04
        services:
          postgres:
            image: timescale/timescaledb-ha:pg16.6-ts2.17.2-all  # match compose.dev.yaml; :pg16-latest does NOT exist on Docker Hub
            env:
              POSTGRES_USER: directus
              POSTGRES_PASSWORD: directus
              POSTGRES_DB: directus
            ports: ['5432:5432']
            options: >-
              --health-cmd "pg_isready -U directus"
              --health-interval 5s
              --health-timeout 5s
              --health-retries 10
    
        steps:
          - uses: actions/checkout@v4
    
          - name: Build image
            run: docker build -t trm-directus:ci .
    
          - name: Dry-run boot against throwaway Postgres
            env:
              DB_HOST: postgres
              DB_PORT: 5432
              DB_USER: directus
              DB_PASSWORD: directus
              DB_DATABASE: directus
              KEY: ci-key-not-secret
              SECRET: ci-secret-not-secret
              ADMIN_EMAIL: ci@example.com
              ADMIN_PASSWORD: ci-password-not-secret
              PUBLIC_URL: http://localhost:8055
            run: |
              docker run --rm \
                -e DB_CLIENT=pg \
                -e DB_HOST=$DB_HOST -e DB_PORT=$DB_PORT \
                -e DB_USER=$DB_USER -e DB_PASSWORD=$DB_PASSWORD -e DB_DATABASE=$DB_DATABASE \
                -e KEY=$KEY -e SECRET=$SECRET \
                -e ADMIN_EMAIL=$ADMIN_EMAIL -e ADMIN_PASSWORD=$ADMIN_PASSWORD \
                -e PUBLIC_URL=$PUBLIC_URL \
                --network host \
                --entrypoint bash \
                trm-directus:ci \
                -c '/directus/scripts/apply-db-init.sh && /directus/scripts/schema-apply.sh && echo "dry-run ok"'
    
          - name: Login to Gitea registry
            uses: docker/login-action@v3
            with:
              registry: git.dev.microservices.al
              username: ${{ secrets.REGISTRY_USERNAME }}
              password: ${{ secrets.REGISTRY_PASSWORD }}
    
          - name: Tag and push
            run: |
              docker tag trm-directus:ci git.dev.microservices.al/trm/directus:main
              docker tag trm-directus:ci git.dev.microservices.al/trm/directus:${{ github.sha }}
              docker push git.dev.microservices.al/trm/directus:main
              docker push git.dev.microservices.al/trm/directus:${{ github.sha }}
    
          - name: Trigger Portainer redeploy (optional)
            if: secrets.PORTAINER_WEBHOOK_URL != ''
            run: curl -X POST "${{ secrets.PORTAINER_WEBHOOK_URL }}"
    

Specification

  • Dry-run runs the entrypoint scripts only, not directus start. Starting the server and waiting for it to serve is slow and unnecessary — the goal is to catch DDL / snapshot apply errors. Override the ENTRYPOINT and run the two scripts directly.
  • The service container is the throwaway Postgres, declared in a services: block (Gitea Actions uses GitHub-Actions-compatible syntax here). Use the pinned TimescaleDB image; a version mismatch with prod hides bugs.
  • Path filter on on.push.paths keeps CI quiet for unrelated repo changes (docs-only commits, etc.). Mirrors the processor workflow.
  • Two image tags published: :main (always points at latest main) and :<sha> (specific commit, immutable). The deploy stack can pin to either.
  • Portainer webhook is optional (gated by secret presence). If unset, no auto-deploy.
  • No integration tests in CI for Phase 1. The dry-run boot is the integration test — it proves the snapshot+db-init combination works against a fresh Postgres. Phase 5+ adds extension-specific tests as those land.
  • Required Gitea secrets:
    • REGISTRY_USERNAME, REGISTRY_PASSWORD — for the image push.
    • PORTAINER_WEBHOOK_URL — optional, for auto-deploy.
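The two-tag scheme described above can be sketched as a small shell helper. This is an illustration only, not the workflow's actual step: the function name image_refs is hypothetical, and the usage comment assumes the runner exposes the commit as GITHUB_SHA.

```shell
# image_refs REPOSITORY COMMIT_SHA
# Prints the two references the spec calls for, one per line:
#   REPOSITORY:main   (moves with every successful build of main)
#   REPOSITORY:SHA    (pins one specific commit)
# The function name is hypothetical; it only builds strings and runs no docker commands.
image_refs() {
  local repo="$1"
  local sha="$2"
  printf '%s:main\n' "$repo"
  printf '%s:%s\n' "$repo" "$sha"
}

# Possible use inside a workflow step (assumes GITHUB_SHA is set by the runner):
#   for ref in $(image_refs git.dev.microservices.al/trm/directus "$GITHUB_SHA"); do
#     docker tag trm-directus:ci "$ref" && docker push "$ref"
#   done
```

A deploy stack that wants reproducible rollbacks would pin the :<sha> reference; one that always wants the newest build would follow :main.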

Acceptance criteria

  • Workflow file is committed at .gitea/workflows/build.yml.
  • First push to main after this lands triggers the workflow.
  • Workflow steps in order: checkout → build → dry-run boot → registry login → tag/push → optional Portainer ping.
  • Dry-run step exits 0 with logs showing "db-init complete" and a successful schema apply. CI always starts from a fresh Postgres, so the snapshot is applied from scratch on every run; "schema apply: no changes" is expected only when the scripts are re-run against an already-migrated database. Verify the apply step works in both cases.
  • Intentionally break the snapshot (manually edit snapshots/schema.yaml to a malformed YAML) → workflow fails at the dry-run step → image is NOT pushed.
  • Intentionally break a migration (introduce SQL syntax error in db-init/) → workflow fails at the dry-run step → image is NOT pushed.
  • Push a docs-only change → workflow does NOT trigger.
  • Image pushed to registry under git.dev.microservices.al/trm/directus:main and :<sha>.
  • Portainer webhook fires if configured.

Risks / open questions

  • Gitea Actions services: syntax compatibility. Gitea's runner is mostly GitHub-Actions-compatible but has historically had quirks with the services: block (especially around image pulls from private registries). If the throwaway Postgres can't be brought up via services:, fall back to a docker run step that backgrounds the container and a wait-loop on pg_isready. Document the chosen approach.
  • Network access between job container and service container. --network host is the simplest solution if Gitea's runner allows it. If not, use the Docker network created by the runner and reference the service by name (postgres:5432).
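If the services: block turns out to be unusable, the fallback named in the first risk (a backgrounded docker run plus a wait-loop on pg_isready) might look roughly like the sketch below. The helper name wait_until_ready and the container name ci-postgres are hypothetical; the retry count and interval are arguments rather than fixed values.

```shell
# wait_until_ready RETRIES INTERVAL_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it exits 0 or RETRIES attempts have failed.
# Returns 0 on the first success, 1 if the retry budget is exhausted.
# The helper name is hypothetical.
wait_until_ready() {
  local retries="$1"
  local interval="$2"
  shift 2
  local attempt=1
  while [ "$attempt" -le "$retries" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep "$interval"
    attempt=$((attempt + 1))
  done
  return 1
}

# Possible use in a workflow step (container name ci-postgres is an assumption):
#   docker run -d --name ci-postgres -p 15432:5432 \
#     -e POSTGRES_USER=directus -e POSTGRES_PASSWORD=directus -e POSTGRES_DB=directus \
#     timescale/timescaledb-ha:pg16.6-ts2.17.2-all
#   wait_until_ready 20 5 docker exec ci-postgres pg_isready -U directus -d directus
```

Running pg_isready through docker exec avoids needing Postgres client tools on the runner itself; the trade-off is that it probes from inside the container, so it does not prove the host-side port mapping works.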

Done

Implementation landed (pending live trigger by first relevant commit). Workflow file at .gitea/workflows/build.yml. Statically validated; live trigger requires a push that touches one of the path-filtered locations.

Corrections folded in vs. the spec's draft YAML:

  1. DB_HOST=localhost, not DB_HOST=postgres. The spec's draft mixed --network host with service-name resolution; those are mutually exclusive. With --network host the docker-run container shares the runner host's network stack, so the service's published port (5432:5432 in the draft) is reachable as localhost:5432, not by the service name postgres. (Service-name resolution only works when the container joins the job network that the runner creates for the services: block.)
  2. --health-retries 20 instead of 10. The timescaledb-ha:*-all image runs more init work at startup than vanilla postgres and occasionally exceeds the 50s window on cold runner images. 20 retries × 5s = 100s margin.
  3. --health-cmd "pg_isready -U directus -d directus" with explicit -d. Spec had user only.
  4. curl -fsS -X POST for the Portainer webhook step. Bare curl -X POST returns 0 even on HTTP 4xx/5xx; -f makes a misconfigured webhook URL fail the step explicitly.
  5. Plain docker build, NOT docker/build-push-action@v5. The dry-run step needs the freshly-built image accessible to a subsequent docker run. build-push-action with the docker-container Buildx driver exports into a separate buildkitd cache that docker run cannot see — the run would fail with "image not found." Plain docker build keeps the image in the local Docker daemon.
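Correction 4 and the spec's "gated by secret presence" rule can be combined in one small helper. This is a sketch under assumptions, not the step as committed: the function name trigger_webhook and the skip message are hypothetical. The curl flags are standard: -f exits non-zero on an HTTP response of 400 or above, -s hides the progress meter, and -S still prints errors when -s is set.

```shell
# trigger_webhook URL
# Skips quietly when URL is empty (secret not configured); otherwise POSTs to it
# and lets curl's exit code decide whether the step passes.
# The function name and the skip message are hypothetical.
trigger_webhook() {
  local url="$1"
  if [ -z "$url" ]; then
    echo "webhook URL not set; skipping redeploy"
    return 0
  fi
  # -f: fail on HTTP >= 400, -s: no progress meter, -S: still show errors
  curl -fsS -X POST "$url"
}

# Possible use in a workflow step, with the secret passed in through the environment:
#   trigger_webhook "$PORTAINER_WEBHOOK_URL"
```

Without -f, curl would treat a 404 from a mistyped webhook URL as success, and the step would go green without any redeploy having happened.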

Deliberate divergences from processor/.gitea/workflows/build.yml:

| Aspect | Processor | Directus | Why |
|---|---|---|---|
| Build mechanism | docker/build-push-action@v5 | plain docker build | dry-run needs local-daemon access (above) |
| Buildx setup | yes | no | Buildx isolates the image; would defeat the dry-run |
| services: block | absent | present | Directus dry-run needs a live Postgres; processor mocks it |
| Node/pnpm setup | yes | no | No TS to compile in Phase 1 (Phase 5 adds this) |
| typecheck/lint/test | three steps | none | No extensions yet |
| Portainer webhook | unconditional | gated on secret presence | Spec requirement |
| runs-on | ubuntu-latest | ubuntu-22.04 | Pin to avoid floating-tag runner image breakage |

Acceptance criteria status:

Static (verified):

  • Workflow file at .gitea/workflows/build.yml.
  • Steps in correct order: checkout → build → dry-run → login → tag/push → optional Portainer.
  • Path filter excludes .planning/, README.md, compose.dev.yaml, package.json — docs-only commits won't trigger CI.
  • Workflow file itself is in the path-filter list (so changes to CI trigger CI).
  • Two image tags published (:main, :<sha>).
  • Required secrets identified: REGISTRY_USERNAME, REGISTRY_PASSWORD. Optional: PORTAINER_WEBHOOK_URL.
  • Dry-run command logic traced: env vars, network mode, entrypoint override, script chain all consistent.

Pending live trigger (will validate on first push that hits the path filter):

  • Workflow triggers on push.
  • Dry-run step exits 0 against a fresh Postgres + the committed snapshot (currently 105 KB, 13 collections).
  • Snapshot drift simulation: hand-edit snapshots/schema.yaml to malformed YAML → push → CI fails at dry-run → image NOT pushed.
  • Migration syntax error simulation: introduce broken db-init/006_*.sql → push → CI fails at dry-run → image NOT pushed.
  • Image actually published to git.dev.microservices.al/trm/directus:main after a clean run.
  • Portainer webhook fires if configured.

Operator action required before first run: in the Gitea repo at git.dev.microservices.al/trm/directus → Settings → Secrets, configure:

  • REGISTRY_USERNAME — Gitea user with write access to the container registry
  • REGISTRY_PASSWORD — password or PAT for that user
  • PORTAINER_WEBHOOK_URL (optional) — for auto-redeploy on push

Without REGISTRY_USERNAME / REGISTRY_PASSWORD the Login step fails with a clear auth error. Without PORTAINER_WEBHOOK_URL the Portainer step is skipped entirely.

Port-allocation correction (2026-05-02): the initial workflow used 5432:5432 for the throwaway-Postgres port mapping. On a self-hosted Gitea runner, the host typically has another Postgres on 5432 (dev stack, stage instance), causing the service container to fail at start with "port already allocated." Fixed by remapping to 15432:5432 (an alternate host port chosen to stay clear of 5432) and updating the dry-run's DB_PORT to 15432. The service container itself still listens on 5432 internally; only the host-side mapping changed. --network host semantics are preserved: DB_HOST=localhost reaches the service on the runner's loopback at :15432.
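A collision like this one can be checked for on the runner host before choosing a host-side port. The sketch below is a heuristic, not part of the workflow: the function names are hypothetical, it relies on bash's /dev/tcp redirection, and it only probes the loopback address, so a listener bound to some other specific interface would be missed.

```shell
# port_is_free PORT
# Returns 0 if nothing accepts a TCP connection on 127.0.0.1:PORT, 1 otherwise.
# Relies on bash's /dev/tcp pseudo-device; the subshell closes the socket on exit.
# Function names here are hypothetical.
port_is_free() {
  local port="$1"
  if (exec 3<>"/dev/tcp/127.0.0.1/${port}") 2>/dev/null; then
    return 1  # connection succeeded, so something is already listening
  fi
  return 0
}

# first_free_port PORT...
# Prints the first candidate port that looks free; returns 1 if none does.
first_free_port() {
  local port
  for port in "$@"; do
    if port_is_free "$port"; then
      echo "$port"
      return 0
    fi
  done
  return 1
}

# Possible use on the runner host when picking the host side of the mapping:
#   first_free_port 5432 15432 25432
```

On a host where a dev Postgres already holds 5432, the commented call would be expected to skip 5432 and print the next candidate that refuses connections.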