directus/.planning/phase-1-slice-1-schema/03-initial-migrations.md

# Task 1.3 — Initial migrations

**Phase:** 1 — Slice 1 schema + deploy pipeline
**Status:** ⬜ Not started
**Depends on:** 1.2
**Wiki refs:** `docs/wiki/entities/postgres-timescaledb.md`, `docs/wiki/concepts/position-record.md`, `docs/wiki/entities/processor.md` (Faulty position handling)

## Goal

Author the three Phase 1 migrations under `db-init/`: the TimescaleDB extension, the `positions` hypertable creation, and the `faulty boolean` column. Each is internally idempotent so that environments where they were applied ad-hoc (e.g. existing stage) absorb them as no-ops.

## Deliverables

- `db-init/001_extensions.sql`:
  ```sql
  CREATE EXTENSION IF NOT EXISTS timescaledb CASCADE;
  ```
- `db-init/002_positions_hypertable.sql`:
  ```sql
  CREATE TABLE IF NOT EXISTS positions (
    device_id   TEXT        NOT NULL,
    ts          TIMESTAMPTZ NOT NULL,
    latitude    DOUBLE PRECISION NOT NULL,
    longitude   DOUBLE PRECISION NOT NULL,
    altitude    DOUBLE PRECISION,
    angle       SMALLINT,
    speed       SMALLINT,
    satellites  SMALLINT,
    priority    SMALLINT,
    attributes  JSONB        NOT NULL DEFAULT '{}'::jsonb,
    PRIMARY KEY (device_id, ts)
  );

  -- Idempotent hypertable creation: if_not_exists => true
  SELECT create_hypertable(
    'positions', 'ts',
    chunk_time_interval => INTERVAL '7 days',
    if_not_exists => TRUE
  );

  CREATE INDEX IF NOT EXISTS positions_device_ts_idx
    ON positions (device_id, ts DESC);
  ```
- `db-init/003_faulty_column.sql`:
  ```sql
  ALTER TABLE positions
    ADD COLUMN IF NOT EXISTS faulty BOOLEAN NOT NULL DEFAULT FALSE;

  CREATE INDEX IF NOT EXISTS positions_faulty_idx
    ON positions (device_id, ts DESC) WHERE faulty = FALSE;
  ```
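
As a rough illustration of how the three files are meant to work together once applied, the sketch below shows an idempotent write and a faulty-filtered read. Both statements are hypothetical: the real queries live in the processor codebase and are not reproduced here, and the device id and values are placeholders.

```sql
-- Hypothetical writer statement (assumption, not the processor's actual SQL).
-- A redelivered record collides with the (device_id, ts) primary key and is skipped.
INSERT INTO positions (device_id, ts, latitude, longitude, speed, attributes)
VALUES ('demo-device-1', '2024-01-01T00:00:00Z', 52.5200, 13.4050, 42,
        '{"source": "example"}'::jsonb)
ON CONFLICT (device_id, ts) DO NOTHING;

-- Hypothetical hot-path read: latest non-faulty position for one device.
-- The predicate is written the same way as the one on positions_faulty_idx.
SELECT device_id, ts, latitude, longitude
FROM positions
WHERE device_id = 'demo-device-1'
  AND faulty = FALSE
ORDER BY ts DESC
LIMIT 1;
```

Running the `INSERT` twice should leave a single row, which is the property the natural key is chosen for.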

## Specification

- **Schema must match what `processor` writes.** Cross-check column names, types, and nullability against `docs/wiki/concepts/position-record.md` and the schema the `processor` writer actually targets (`processor/src/db/migrations/0001_positions.sql`). If any field differs, this task is **blocked** until [[directus-schema-draft]] and the processor's existing migration are reconciled: fix the divergence in the doc first, then resume this task.
- **`attributes` is `JSONB NOT NULL DEFAULT '{}'`** — never null, always an object. Keeps query plans simple.
- **`(device_id, ts)` primary key** — natural key, idempotent for the processor's `ON CONFLICT DO NOTHING` writer.
- **Chunk interval = 7 days.** Tunable later; 7 days is a reasonable default for hundreds of devices, each emitting several positions per second.
- **Faulty index is a partial index (`WHERE faulty = FALSE`).** It optimizes the [[processor]] hot-path read, which always filters faulty rows out. Operator queries that specifically select faulty rows use the broader `(device_id, ts DESC)` index.
- **`CASCADE` on `CREATE EXTENSION`** so that any dependent extensions install transparently. TimescaleDB has no required deps so CASCADE is a no-op for now, but harmless and future-proof.
- **`IF NOT EXISTS` must not hide schema drift.** The migrations are idempotent at the *DDL* level, but `IF NOT EXISTS` only checks that an object with that name exists. If a column type already differs from what the file declares, the migration silently passes and leaves stage in an inconsistent state. Add a final `DO $$ ... $$` block per file that asserts the table shape is what the migration intends:
  ```sql
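  -- end of 002_positions_hypertable.sql
  -- Fail the migration if positions.attributes is missing or has a different type.
  DO $$ BEGIN
    IF NOT EXISTS (
      SELECT 1 FROM information_schema.columns
      WHERE table_schema = current_schema()  -- ignore same-named tables in other schemas
        AND table_name = 'positions' AND column_name = 'attributes' AND data_type = 'jsonb'
    ) THEN
      RAISE EXCEPTION 'positions.attributes is not JSONB: schema drift';
    END IF;
  END $$;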
  ```
  One assertion per critical column shape. Catches the case where stage has the table but with subtly different types.
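
The single-column check above extends to several columns by looping over a list of expected shapes. The block below is a minimal sketch, not the required form: the choice of which columns count as critical is an assumption, and it assumes the migration runs with the target schema first in `search_path` so that `current_schema()` is where `positions` lives.

```sql
-- Sketch: assert type and nullability for a set of critical columns.
-- The column list is illustrative; adjust it to the agreed position record.
DO $$
DECLARE
  expected RECORD;
BEGIN
  FOR expected IN
    SELECT * FROM (VALUES
      ('device_id',  'text',                     'NO'),
      ('ts',         'timestamp with time zone', 'NO'),
      ('latitude',   'double precision',         'NO'),
      ('longitude',  'double precision',         'NO'),
      ('attributes', 'jsonb',                    'NO')
    ) AS t(column_name, data_type, is_nullable)
  LOOP
    -- One lookup per expected column; a missing or mismatched column aborts.
    IF NOT EXISTS (
      SELECT 1
      FROM information_schema.columns c
      WHERE c.table_schema = current_schema()
        AND c.table_name   = 'positions'
        AND c.column_name  = expected.column_name
        AND c.data_type    = expected.data_type
        AND c.is_nullable  = expected.is_nullable
    ) THEN
      RAISE EXCEPTION 'positions.% is not % (nullable=%): schema drift',
        expected.column_name, expected.data_type, expected.is_nullable;
    END IF;
  END LOOP;
END $$;
```

Because the block raises on the first mismatch, a drifted stage database stops the migration run instead of being recorded as applied.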

## Acceptance criteria

- [ ] Against a fresh Postgres + TimescaleDB image, `apply-db-init.sh` runs all three files cleanly.
- [ ] `\d positions` shows the expected columns (including `faulty`).
- [ ] `SELECT * FROM timescaledb_information.hypertables WHERE hypertable_name = 'positions';` returns one row.
- [ ] Both indexes (`positions_device_ts_idx`, `positions_faulty_idx`) exist (`\di+`).
- [ ] Re-running the script is a no-op (verified via `migrations_applied` table contents).
- [ ] Against a Postgres that *already* has `positions` from a prior ad-hoc run, the migration absorbs it as a no-op (provided the existing schema matches; otherwise the assertion blocks deploy).
- [ ] Cross-checked against `processor/src/db/migrations/0001_positions.sql` — column names, types, indexes match.
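
The `psql` meta-commands in the checklist have plain-SQL equivalents that are easier to run from a script. A minimal sketch, assuming TimescaleDB 2.x information views and that `positions` lives in the current schema:

```sql
-- Columns, roughly what \d positions reports.
SELECT column_name, data_type, is_nullable, column_default
FROM information_schema.columns
WHERE table_schema = current_schema()
  AND table_name = 'positions'
ORDER BY ordinal_position;

-- Hypertable registration: expect exactly one row.
SELECT hypertable_schema, hypertable_name, num_dimensions
FROM timescaledb_information.hypertables
WHERE hypertable_name = 'positions';

-- Both indexes on the parent table: expect two rows.
SELECT indexname, indexdef
FROM pg_indexes
WHERE schemaname = current_schema()
  AND tablename = 'positions'
  AND indexname IN ('positions_device_ts_idx', 'positions_faulty_idx');
```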

## Risks / open questions

- **Existing stage Postgres may have a slightly different schema.** Run `pg_dump --schema-only -t positions` on stage before this task lands and compare to the migration above. Reconcile differences in this file (or document them as known-divergent).
- **The hypertable may already exist on stage.** `create_hypertable` with `if_not_exists => TRUE` skips an existing hypertable instead of raising an error, but the call does not change that hypertable's chunk interval. If stage's chunk interval differs from `7 days`, that's a non-blocking divergence (functional, just suboptimal). Don't try to migrate it via SQL in this task; leave it as a follow-up.
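
To find out whether stage diverges on the chunk interval, the current setting can be read from TimescaleDB's information views. A minimal sketch, assuming TimescaleDB 2.x (the `timescaledb_information.dimensions` view and its `time_interval` column):

```sql
-- Read-only check: shows the interval used for newly created chunks.
SELECT hypertable_name, column_name, time_interval
FROM timescaledb_information.dimensions
WHERE hypertable_name = 'positions';
```

A `time_interval` other than `7 days` is the non-blocking divergence described above and can be recorded for the follow-up.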

## Done

(Fill in commit SHA + one-line note when this lands.)