Add Phase 1 and Phase 2 planning documents
ROADMAP plus granular task files per phase. Phase 1 (12 tasks + 1.13 device authority) covers Codec 8/8E/16 telemetry ingestion; Phase 2 (6 tasks) covers Codec 12/14 outbound commands; Phase 3 enumerates deferred items.
@@ -0,0 +1,57 @@
# Task 1.1 — Project scaffold

**Phase:** 1 — Inbound telemetry
**Status:** ⬜ Not started
**Depends on:** None
**Wiki refs:** `docs/wiki/sources/teltonika-ingestion-architecture.md` § Project location and layout

## Goal

Initialize the Node.js / TypeScript project with the directory layout from the wiki, install the agreed tooling, and produce a "hello world" `main.ts` that the rest of Phase 1 builds on.

## Deliverables

- `package.json` declaring:
  - Type: `"module"` (ESM only).
  - Engines: `"node": ">=22"`.
  - Scripts: `build`, `dev`, `start`, `test`, `test:watch`, `lint`, `format`, `typecheck`.
  - Dependencies (production): `ioredis`, `pino`, `prom-client`, `zod`; plus `pino-pretty`, loaded only when `NODE_ENV=development` (see task 1.3).
  - Dev dependencies: `typescript`, `@types/node`, `vitest`, `@vitest/coverage-v8`, `eslint`, `@typescript-eslint/parser`, `@typescript-eslint/eslint-plugin`, `prettier`, `tsx` (for `dev` watch).
- `tsconfig.json` with `strict: true`, `target: ES2022`, `module: NodeNext`, `moduleResolution: NodeNext`, `outDir: dist`, `rootDir: src`, `declaration: false`, `noUncheckedIndexedAccess: true`.
- `eslint.config.js` (flat config) with `@typescript-eslint/recommended-type-checked` plus a small project-specific allow-list.
- `.prettierrc` — 2-space indentation, single quotes; pick one semicolon style and stay consistent (keeping semicolons matches Node convention).
- `.gitignore` — `node_modules/`, `dist/`, `coverage/`, `.env`, `.env.local`, `*.log`.
- `.dockerignore` — same as `.gitignore` plus `.git/`, `.planning/`, `test/`, and `*.md` except `README.md`.
- Empty directories with `.gitkeep` files where Phase 1 will fill them in:
  - `src/core/`, `src/adapters/teltonika/codec/data/`, `src/adapters/teltonika/codec/command/`, `src/config/`, `src/observability/`
  - `test/fixtures/teltonika/codec8/`, `test/fixtures/teltonika/codec8e/`, `test/fixtures/teltonika/codec16/`
- `src/main.ts` — minimal stub: imports a logger (placeholder until task 1.3), prints "tcp-ingestion starting", and exits with code 0.
- `README.md` — short description pointing at `.planning/ROADMAP.md` for the work plan and at `../docs/wiki/` for the architectural specification.

## Specification

- **Package manager:** pnpm. Commit `pnpm-lock.yaml`. The Dockerfile in task 1.11 will use `pnpm fetch` for layer-cache friendliness.
- **Module style:** ESM throughout. No CJS interop hacks. All files use `import`/`export` and a `.js` suffix on relative imports, per Node ESM resolution rules.
- **TypeScript path style:** relative imports for now. No `paths` aliases — they add a bundler dependency at runtime that we don't want.
- **No bundler.** The build is `tsc` only. Runtime is plain Node consuming `dist/`. The Dockerfile will copy `dist/` and `node_modules/`.
- **Linting style:** configure ESLint to enforce `@typescript-eslint/no-floating-promises` and `@typescript-eslint/no-misused-promises` — both are critical in a TCP server, where unhandled promise rejections silently lose work.
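
A sketch of how the two promise rules could be layered into the flat config. This assumes typescript-eslint v8's `tseslint.config` helper and `projectService`; adjust to whatever config shape the task actually lands on:

```js
// eslint.config.js — sketch only; the real config also carries the
// project-specific allow-list described above.
import tseslint from 'typescript-eslint';

export default tseslint.config(
  ...tseslint.configs.recommendedTypeChecked,
  {
    languageOptions: {
      parserOptions: { projectService: true, tsconfigRootDir: import.meta.dirname },
    },
    rules: {
      '@typescript-eslint/no-floating-promises': 'error',
      '@typescript-eslint/no-misused-promises': 'error',
    },
  },
);
```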

## Acceptance criteria

- [ ] `pnpm install` succeeds with no warnings other than peer deps.
- [ ] `pnpm typecheck` succeeds on the empty project.
- [ ] `pnpm lint` succeeds.
- [ ] `pnpm build` produces `dist/main.js`.
- [ ] `pnpm start` runs the compiled output and prints the startup message.
- [ ] `pnpm test` runs (with no tests) and exits successfully.
- [ ] `pnpm dev` runs `main.ts` via `tsx` and prints the startup message.
- [ ] Repository builds reproducibly: deleting `node_modules` and `dist`, then `pnpm install --frozen-lockfile && pnpm build`, produces identical output.

## Risks / open questions

- Pinning Node 22 LTS vs 20 LTS: 22 is the current LTS in 2026 and has stable native fetch plus better worker-thread performance. Stay with 22 unless deployment infra forces 20.
- ESLint v9 flat config: ensure the version is compatible with `@typescript-eslint/*` v8+. If issues arise, fall back to legacy `.eslintrc.json` until upstream catches up.

## Done

(Fill in once complete: commit SHA, brief notes.)

@@ -0,0 +1,89 @@

# Task 1.2 — Core shell & framing types

**Phase:** 1 — Inbound telemetry
**Status:** ⬜ Not started
**Depends on:** 1.1
**Wiki refs:** `docs/wiki/concepts/protocol-adapter.md`, `docs/wiki/concepts/codec-dispatch.md`, `docs/wiki/concepts/position-record.md`

## Goal

Build the vendor-agnostic shell: TCP server bootstrap, per-socket session loop, and the type/registry definitions that adapters plug into. **No Teltonika-specific code in this task.**

## Deliverables

- `src/core/types.ts`:
  - `Position` type matching the [[position-record]] shape exactly.
  - `Adapter` interface: `{ name: string; ports: number[]; handleSession(socket: net.Socket, ctx: AdapterContext): Promise<void> }`.
  - `AdapterContext` interface: `{ publish: (p: Position) => Promise<void>; logger: Logger; metrics: Metrics }` — a narrow contract giving adapters what they need without leaking shell internals.
- `src/core/registry.ts`:
  - `AdapterRegistry` class (or simple module) holding `Map<port, Adapter>`. Methods: `register(adapter)`, `get(port)`.
  - This is the *adapter* registry. The *codec* registry (per-vendor, in Teltonika's case) is internal to the adapter — it lives in `src/adapters/teltonika/` (task 1.4).
- `src/core/session.ts`:
  - `runSession(socket, adapter, ctx)` that wraps `adapter.handleSession` with:
    - Initial socket configuration (`setNoDelay`, `setKeepAlive` with a sane delay, e.g. 60s).
    - Standard error handling: `error`, `close`, `end` events all logged at `debug` level with the connection's remote address.
    - A `try { await handleSession() } catch (e) { logger.warn(e) }` wrapper that ensures the socket is destroyed if the handler throws.
  - Crucially, `runSession` does *not* know about IMEI, framing, or codecs — those are entirely the adapter's business.
- `src/core/server.ts`:
  - `startServer(port, adapter, ctx)` returning a closable handle. Uses `net.createServer((socket) => runSession(socket, adapter, ctx))`.
  - Logs server bind, accept, and close events.
- `src/core/publish.ts`:
  - Stub `publishPosition(position)` returning `Promise<void>`. The real implementation lands in task 1.8. For now, it accepts a `Position` and logs at debug. The shape should already match what task 1.8 will produce so the `Adapter` types stabilize early.

## Specification

### Vendor-agnostic discipline (re-stated)

**`src/core/` must not import from `src/adapters/` — ever.** This is enforced by ESLint with `eslint-plugin-import`'s `no-restricted-paths` rule. Add the rule in this task; a violation should be a CI error.

```js
// in eslint.config.js
'import/no-restricted-paths': ['error', {
  zones: [{ target: './src/core', from: './src/adapters' }]
}]
```

Adapters can import from `core/`; the reverse is forbidden.

### TCP socket settings

- `socket.setNoDelay(true)` — disable Nagle so ACKs are not batched. We're sending small ACKs; latency matters more than packet count.
- `socket.setKeepAlive(true, 60_000)` — TCP keepalive with a 60s probe. Defends against idle NAT timeouts; safe because devices already retransmit on disconnect.
- No `socket.setTimeout()` at the shell level. The protocol does not specify per-frame timing; idle sockets are fine. Adapters can impose timeouts if their protocol demands them.

### Position type

Mirror [[position-record]] precisely:

```ts
export type Position = {
  device_id: string;
  timestamp: Date;
  latitude: number;
  longitude: number;
  altitude: number;
  angle: number; // 0–360
  speed: number; // km/h, 0 may mean "GPS invalid" — caller preserves verbatim
  satellites: number;
  priority: 0 | 1 | 2; // Low | High | Panic
  attributes: Record<string, number | bigint | Buffer>;
};
```

Use `Date`, not `number`, for `timestamp` — the value is a `Date` from the moment it leaves Ingestion; downstream is responsible for the serialization choice.

## Acceptance criteria

- [ ] `pnpm typecheck` and `pnpm lint` pass.
- [ ] `src/core/server.ts` can be imported and `startServer` returns a `net.Server` listening on a configurable port.
- [ ] A trivial test (in `test/core/server.test.ts`) starts a server with a stub adapter, opens a TCP client, and verifies the adapter's `handleSession` is invoked with a real socket.
- [ ] ESLint enforces the `no-restricted-paths` rule — verified by adding a temporary import-from-adapter into `src/core/server.ts`, confirming the lint error, then removing it.

## Risks / open questions

- The `AdapterContext` metrics interface is sketched but not fully specified until task 1.10. Make a minimal placeholder (`{ inc: (name, labels?) => void; observe: ... }`) and tighten it in 1.10.
- `Buffer` in `Position.attributes` requires JSON serialization handling at the publish boundary (task 1.8). Decide there: base64-encode buffers, or serialize via msgpack. Recommendation: base64 with a sentinel in the JSON, e.g. `{ "_b64": "..." }`. Defer the decision to task 1.8 and revisit if simpler options surface.
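
The wire format is deferred to task 1.8, so the following is only a sketch of the recommended sentinel shape; the `_b64` / `_bigint` key names are the suggestions above, not a decided format:

```ts
// Sketch only: one candidate sentinel encoding for Position.attributes.
// The real wire format is decided in task 1.8.
type AttrValue = number | bigint | Buffer;
type Encoded = number | { _bigint: string } | { _b64: string };

export function encodeAttributes(attrs: Record<string, AttrValue>): Record<string, Encoded> {
  const out: Record<string, Encoded> = {};
  for (const [k, v] of Object.entries(attrs)) {
    if (typeof v === 'bigint') out[k] = { _bigint: v.toString() };
    else if (Buffer.isBuffer(v)) out[k] = { _b64: v.toString('base64') };
    else out[k] = v;
  }
  return out;
}
```

Encoding before `JSON.stringify` (rather than inside a replacer) sidesteps the gotcha that `Buffer.prototype.toJSON` runs before a replacer ever sees the value.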

## Done

(Fill in once complete.)

@@ -0,0 +1,78 @@

# Task 1.3 — Configuration & logging

**Phase:** 1 — Inbound telemetry
**Status:** ⬜ Not started
**Depends on:** 1.1
**Wiki refs:** `docs/wiki/sources/gps-tracking-architecture.md` § Deployment topology, § Observability

## Goal

Provide a single source of truth for runtime configuration (env-var-driven, validated at startup, fail-fast on misconfiguration) and a structured JSON logger.

## Deliverables

- `src/config/load.ts`:
  - Exports `loadConfig(): Config` that parses `process.env` through a zod schema, returning a typed `Config` object. Throws with a clear error message on missing/malformed values.
  - All env vars are optional in dev (with sensible defaults) and required in production-like deployments. Use `NODE_ENV` to gate.
- `src/observability/logger.ts`:
  - Exports a configured `pino` logger. JSON output by default; pretty-printed via `pino-pretty` only when `NODE_ENV === 'development'` (lazy-loaded so it's not in the prod bundle).
  - Log level controlled by the `LOG_LEVEL` env var (default `info` in production, `debug` in development).
  - Adds `service: 'tcp-ingestion'` and `instance_id` (from the `INSTANCE_ID` env var, or a short UUID generated at startup) to every log line.

## Specification

### Config schema (zod)

```ts
import { randomUUID } from 'node:crypto';
import { z } from 'zod';

const ConfigSchema = z.object({
  NODE_ENV: z.enum(['development', 'test', 'production']).default('development'),
  INSTANCE_ID: z.string().min(1).default(() => `local-${randomUUID().slice(0, 8)}`),
  LOG_LEVEL: z.enum(['fatal', 'error', 'warn', 'info', 'debug', 'trace']).default('info'),

  // Vendor port bindings — extend as adapters are added.
  TELTONIKA_PORT: z.coerce.number().int().min(1).max(65535).default(5027),

  // Redis
  REDIS_URL: z.string().url(),
  REDIS_TELEMETRY_STREAM: z.string().min(1).default('telemetry:teltonika'),
  REDIS_STREAM_MAXLEN: z.coerce.number().int().min(0).default(1_000_000), // approximate cap

  // Observability
  METRICS_PORT: z.coerce.number().int().min(0).max(65535).default(9090),

  // Phase 2 (planned, not used in Phase 1)
  // COMMANDS_OUTBOUND_STREAM_PREFIX: z.string().default('commands:outbound'),
});

export type Config = z.infer<typeof ConfigSchema>;
```

The Phase 2 fields are commented out so they do not become runtime requirements before Phase 2 ships. Add them when Phase 2 is in flight.

### Logger conventions

- Always emit JSON in production (the pino default).
- Always include: `time`, `level`, `service`, `instance_id`, `msg`.
- Adapter log lines include `imei` when known; framing log lines include `codec_id` when applicable; CRC failures include `expected_crc`, `computed_crc`, `frame_length`.
- Use `logger.child({ imei })` to scope a logger per session, so subsequent log lines auto-include the IMEI.
- Never log raw frame payloads at info or above — they're large and may contain sensitive telemetry. At debug, truncate to the first/last 16 bytes.

### Failure mode

`loadConfig()` is called once in `main.ts`. If it throws, the process exits with a non-zero code and a single human-readable line listing the missing/invalid keys. **Do not fall back to silent defaults for required keys** — the operational habit we want is "missing config = process refuses to start," not "process starts and behaves weirdly later."
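
A minimal sketch of that fail-fast contract, independent of zod (the real `loadConfig()` uses the schema above; `requireKeys` is an illustrative name, not a planned export):

```ts
// Illustrative only: one throw that names every missing key at once,
// rather than failing on the first key and hiding the rest.
export function requireKeys(
  env: Record<string, string | undefined>,
  keys: string[],
): Record<string, string> {
  const missing = keys.filter((k) => !env[k]);
  if (missing.length > 0) {
    throw new Error(`invalid configuration, missing: ${missing.join(', ')}`);
  }
  return Object.fromEntries(keys.map((k) => [k, env[k] as string]));
}
```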

## Acceptance criteria

- [ ] Calling `loadConfig()` with `REDIS_URL` unset throws, and the error names `REDIS_URL` specifically.
- [ ] Calling `loadConfig()` with `NODE_ENV=development` and only `REDIS_URL` set returns a fully valid `Config` with sensible defaults for everything else.
- [ ] The logger emits JSON when `NODE_ENV=production` and pretty-printed text when `NODE_ENV=development`.
- [ ] `logger.child({ imei: '...' })` produces lines with `imei` included.

## Risks / open questions

- The `INSTANCE_ID` default is a random UUID per process start — fine for dev, but in production K8s/compose deployments, set it explicitly to a stable identifier (pod name, hostname, etc.). The Phase 2 connection registry depends on `INSTANCE_ID` being stable for the lifetime of the process; document this in the deployment notes (task 1.11).
- Log volume could be high under load. Pino is fast (100k+ lines/sec on modern hardware), but consider `useOnlyCustomLevels` or sampling for the busiest events (e.g. per-frame debug logs).

## Done

(Fill in once complete.)

@@ -0,0 +1,183 @@

# Task 1.4 — Teltonika framing layer

**Phase:** 1 — Inbound telemetry
**Status:** ⬜ Not started
**Depends on:** 1.2
**Wiki refs:** `docs/wiki/concepts/avl-data-format.md` (envelope, IMEI handshake), `docs/wiki/concepts/codec-dispatch.md`, `docs/wiki/sources/teltonika-data-sending-protocols.md`

## Goal

Implement the Teltonika **adapter shell**: IMEI handshake, AVL frame envelope read loop, CRC validation, and the codec dispatch registry. The codec parsers themselves are tasks 1.5–1.7; this task lays the framing rails they slot into.

## Deliverables

- `src/adapters/teltonika/index.ts` — the `Adapter` export consumed by `src/core/`. Wires the codec registry; exports `{ name: 'teltonika', ports: [config.TELTONIKA_PORT], handleSession }`.
- `src/adapters/teltonika/handshake.ts` — `readImeiHandshake(socket): Promise<string>` performs the 2-byte length + ASCII IMEI read and returns the IMEI string. **Does not write the accept/reject byte itself** — that decision is made by the session loop after consulting `DeviceAuthority` (see "`DeviceAuthority` seam" below). On malformed input, throws a typed `HandshakeError`.
- `src/adapters/teltonika/device-authority.ts` — defines the `DeviceAuthority` interface and ships an `AllowAllAuthority` default implementation. The opt-in Redis-backed authority lives in task 1.13.
- `src/adapters/teltonika/frame.ts` — `readNextFrame(socket, buffer): Promise<{ codecId: number; payload: Buffer; crcValid: boolean }>` plus a small `BufferedReader` class that handles partial-read accumulation across `socket.on('data')` events.
- `src/adapters/teltonika/crc.ts` — pure function `crc16Ibm(buf: Buffer): number`. Implements CRC-16/IBM (polynomial `0xA001`, the bit-reflected form of `0x8005`; initial value `0x0000`).
- `src/adapters/teltonika/codec/registry.ts` — internal-to-adapter codec registry: `Map<codecId, CodecDataHandler>`. Phase 1 registers handlers from `codec/data/`. Phase 2 will register from `codec/command/`.

## Specification

### IMEI handshake

```
Device → Server: [length 2B big-endian][IMEI bytes (ASCII, length B)]
Server → Device: 0x01 (accept) | 0x00 (reject)
```

Phase 1 default: **accept all syntactically valid IMEIs.** Authorization (whether a given IMEI is *expected* to be in the fleet) is a soft observability concern, not a hard gate, until task 1.13 adds the opt-in allow-list refresher. The handshake consults a `DeviceAuthority` interface for a `known | unknown` label that flows into metrics and logs but does **not** block the handshake by default.

Parse rules:
- Length must be ≤ 32 (Teltonika IMEIs are 15 ASCII digits; we allow some headroom).
- The IMEI body must match `/^\d{14,16}$/` after ASCII decode.
- Anything malformed: throw `HandshakeError`, log at `warn` with the offending bytes (truncated), destroy the socket, and never write `0x01`.
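
The validation half of those rules can be sketched as a pure function over an accumulated buffer. The real `readImeiHandshake` awaits bytes from the socket; `parseImei` is an illustrative internal split, not a planned export:

```ts
export class HandshakeError extends Error {}

// Validation rules from the parse rules above, as a pure function.
export function parseImei(buf: Buffer): string {
  if (buf.length < 2) throw new HandshakeError('short read on length field');
  const len = buf.readUInt16BE(0);
  if (len > 32) throw new HandshakeError(`implausible IMEI length ${len}`);
  if (buf.length < 2 + len) throw new HandshakeError('short read on IMEI body');
  const imei = buf.subarray(2, 2 + len).toString('ascii');
  if (!/^\d{14,16}$/.test(imei)) {
    throw new HandshakeError(`malformed IMEI ${JSON.stringify(imei)}`);
  }
  return imei;
}
```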

### `DeviceAuthority` seam

```ts
export interface DeviceAuthority {
  check(imei: string): Promise<'known' | 'unknown'>;
}

export class AllowAllAuthority implements DeviceAuthority {
  async check(): Promise<'known'> { return 'known'; }
}
```

Wire `DeviceAuthority` into the Teltonika adapter context. The default binding in `main.ts` is `new AllowAllAuthority()` — every IMEI is reported as `known` until a real authority is configured.

Behavior of the handshake:

```ts
const imei = await readImeiHandshake(socket);
const authority = ctx.deviceAuthority;
const knownLabel = await authority.check(imei).catch((err) => {
  ctx.logger.warn({ err, imei }, 'device authority check failed; defaulting to unknown');
  return 'unknown' as const;
});
ctx.metrics.handshake.inc({ result: 'accepted', known: knownLabel });
if (knownLabel === 'unknown' && config.STRICT_DEVICE_AUTH) {
  // Reject (rare; off by default)
  socket.write(Buffer.from([0x00]));
  ctx.logger.warn({ imei }, 'rejected unknown device under STRICT_DEVICE_AUTH');
  socket.destroy();
  return;
}
socket.write(Buffer.from([0x01]));
```

Three properties:

1. **Default behavior is unchanged from accept-all.** No business-plane dependency.
2. **Unknown devices are *visible*** via the `known` label on `teltonika_handshake_total` (see task 1.10).
3. **`STRICT_DEVICE_AUTH=true`** flips the policy to reject-unknowns. Off by default. When operators want this, they enable it; the code path is already there.

The real implementation of `DeviceAuthority` (Redis-backed, refreshed from a Directus-published allow-list) is task 1.13. Task 1.4 only ships the interface and the `AllowAllAuthority` default.

### AVL frame envelope

Per [[avl-data-format]]:

```
[Preamble 4B = 0x00000000]
[DataFieldLength 4B big-endian]
[CodecID 1B]
[N1 1B]
[AVL records — DataFieldLength minus 2 bytes for CodecID and N1, minus 1 byte for N2]
[N2 1B]
[CRC 4B]
```

Important framing rules:

- **DataFieldLength is NOT the size of the AVL records section** — it is the size from `CodecID` through `N2` inclusive. So the number of bytes to read after the length field is `DataFieldLength + 4` (the CRC).
- **CRC is computed from `CodecID` through `N2`** (the same span as `DataFieldLength`).
- **N1 must equal N2.** A mismatch is a malformation; treat it like a CRC failure (no ACK, log) but additionally **drop the connection** — N1≠N2 is structural, not transient.
- **The CRC field is 4 bytes**, but only the lower 2 contain the value; the upper 2 are zero. Read all 4; validate the lower 16 bits.

Pseudocode for the read loop:

```ts
const reader = new BufferedReader(socket);
while (!socket.destroyed) {
  const preamble = await reader.readExact(4);
  if (preamble.readUInt32BE() !== 0) {
    logger.warn({ imei }, 'invalid preamble; dropping connection');
    socket.destroy();
    return;
  }
  const length = (await reader.readExact(4)).readUInt32BE();
  if (length < 8 || length > MAX_AVL_PACKET_SIZE) {
    logger.warn({ imei, length }, 'implausible DataFieldLength; dropping connection');
    socket.destroy();
    return;
  }
  const body = await reader.readExact(length); // CodecID + N1 + records + N2
  const crcField = await reader.readExact(4);
  const expectedCrc = crcField.readUInt16BE(2); // lower 2 of 4
  const computedCrc = crc16Ibm(body);
  if (expectedCrc !== computedCrc) {
    metrics.frames.inc({ codec: codecLabel(body[0]), result: 'crc_fail' });
    logger.warn({ imei, expected_crc: expectedCrc, computed_crc: computedCrc }, 'CRC mismatch');
    continue; // do NOT ack; the device retransmits
  }
  const codecId = body[0];
  const handler = codecRegistry.get(codecId);
  if (!handler) {
    metrics.unknownCodec.inc({ codec_id: String(codecId) });
    logger.warn({ imei, codec_id: codecId, header: body.subarray(0, 16).toString('hex') }, 'unknown codec; dropping connection');
    socket.destroy();
    return;
  }
  const result = await handler.handle(body, ctx);
  // ACK: 4-byte big-endian count of records accepted
  const ack = Buffer.alloc(4);
  ack.writeUInt32BE(result.recordCount, 0);
  socket.write(ack);
}
```
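
`BufferedReader` itself can be sketched under the accumulator-plus-pending-read approach. `feed` is shown as a plain method so the class is testable without a socket; the real class would call it from `socket.on('data')`. This sketch supports one outstanding `readExact` at a time, which matches the strictly sequential read loop above:

```ts
// Sketch: partial-read accumulation across data events.
export class BufferedReader {
  private chunks: Buffer = Buffer.alloc(0);
  private pending: { n: number; resolve: (b: Buffer) => void } | null = null;

  // Called once per 'data' event with the new chunk.
  feed(chunk: Buffer): void {
    this.chunks = Buffer.concat([this.chunks, chunk]);
    this.drain();
  }

  // Resolves once n bytes have accumulated; one pending read at a time.
  readExact(n: number): Promise<Buffer> {
    return new Promise((resolve) => {
      this.pending = { n, resolve };
      this.drain();
    });
  }

  private drain(): void {
    if (this.pending && this.chunks.length >= this.pending.n) {
      const { n, resolve } = this.pending;
      this.pending = null;
      const out = this.chunks.subarray(0, n);
      this.chunks = this.chunks.subarray(n);
      resolve(out);
    }
  }
}
```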

### CRC-16/IBM

Polynomial `0xA001` (the bit-reflected form of `0x8005`), initial value `0x0000`, input processed LSB-first, no final XOR — the parameter set commonly known as CRC-16/ARC, which is what the Teltonika doc calls CRC-16/IBM. The implementation should be a tight loop with a precomputed lookup table, since protocol parsing is on the hot path.

Test against the canonical doc's worked example:
- Frame body `08010000016B40D8EA30010000000000000000000000000000000105021503010101425E0F01F10000601A014E000000000000000001` (codec 8, see `docs/raw/...`): expected CRC = `0x0000C7CF` (lower 16 bits = `0xC7CF`).
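
A table-driven sketch of those parameters. The fixture above is the authoritative test; the standard ARC check value `crc16Ibm("123456789") = 0xBB3D` is a handy second vector:

```ts
// CRC-16/IBM (ARC): poly 0xA001 (reflected 0x8005), init 0x0000,
// LSB-first, no final XOR. Table precomputed once at module load.
const TABLE = new Uint16Array(256);
for (let i = 0; i < 256; i++) {
  let c = i;
  for (let k = 0; k < 8; k++) c = c & 1 ? (c >>> 1) ^ 0xa001 : c >>> 1;
  TABLE[i] = c;
}

export function crc16Ibm(buf: Buffer): number {
  let crc = 0x0000;
  for (const byte of buf) crc = (crc >>> 8) ^ TABLE[(crc ^ byte) & 0xff];
  return crc;
}
```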

### MAX_AVL_PACKET_SIZE

A constant used to sanity-check `DataFieldLength`. Use `1300` to cover both fleet caps (512B for the FMB640 family, 1280B for others) with small headroom. Larger frames are malformed, and we drop the connection.

### Codec registry structure

```ts
export interface CodecDataHandler {
  codec_id: number;
  handle(
    body: Buffer, // CodecID + N1 + records + N2
    ctx: { imei: string; publish: (p: Position) => Promise<void>; logger: Logger; metrics: Metrics }
  ): Promise<{ recordCount: number }>;
}
```

`handle` skips the framing-level concerns (envelope, CRC, codec dispatch) — those happen above. Each codec parser receives the validated body and is responsible for parsing N1/N2 and the records themselves, producing `Position` records via `ctx.publish`.

## Acceptance criteria

- [ ] CRC-16/IBM matches the canonical Teltonika example byte-for-byte.
- [ ] `readImeiHandshake` returns a parsed IMEI for well-formed input without writing to the socket.
- [ ] `readImeiHandshake` rejects malformed input by throwing without writing anything.
- [ ] The session loop, after a successful handshake, consults `DeviceAuthority.check`, increments `teltonika_handshake_total{result, known}`, and writes `0x01` (or `0x00` under `STRICT_DEVICE_AUTH`).
- [ ] `AllowAllAuthority` always returns `'known'`; verified by a unit test.
- [ ] `STRICT_DEVICE_AUTH=true` causes an `unknown` device to receive `0x00` and have its socket destroyed; verified by an integration test with a stub authority.
- [ ] `BufferedReader.readExact(n)` correctly handles bytes arriving across multiple `data` events.
- [ ] `readNextFrame` correctly identifies a CRC mismatch without dropping the connection.
- [ ] `readNextFrame` drops the connection on an unknown codec ID and logs the structured warn line.
- [ ] All paths that write to the socket go through a single point of ACK emission, so Phase 2 can later interpose a write queue without rewriting framing code.

## Risks / open questions

- `BufferedReader` correctness is critical. Use a battle-tested approach — a queue of pending reads with a backing `Buffer.concat` accumulator. Alternatively, use Node's async iteration over the socket (`for await (const chunk of socket)`) if the ergonomics fit.
- The `await ctx.publish(p)` inside the handler is the boundary where Phase 1's "TCP handler never blocks on downstream" property is enforced. The publish must use a non-blocking strategy (fire-and-forget into a bounded queue, or a guarantee that the Redis publish is fast enough). Task 1.8 specifies the publish strategy; this task only needs to make the `await` semantically correct.

## Done

(Fill in once complete.)

@@ -0,0 +1,98 @@

# Task 1.5 — Codec 8 parser

**Phase:** 1 — Inbound telemetry
**Status:** ⬜ Not started
**Depends on:** 1.4, 1.9 (fixture infra)
**Wiki refs:** `docs/wiki/concepts/avl-data-format.md` § Codec 8, `docs/wiki/sources/teltonika-data-sending-protocols.md` § Codec 8

## Goal

Parse Codec 8 (`0x08`) AVL data bodies into `Position` records and publish them via `ctx.publish`.

## Deliverables

- `src/adapters/teltonika/codec/data/codec8.ts` exporting `codec8Handler: CodecDataHandler` with `codec_id: 0x08`.
- Helper functions in the same file (or in a sibling `gps-element.ts` if shared with Codecs 8E and 16):
  - `parseGpsElement(buf, offset): { value: GpsElement; nextOffset: number }`
  - `parseTimestamp(buf, offset): { value: Date; nextOffset: number }`
- Test file `test/codec8.test.ts` with at least the three fixtures from the canonical Teltonika example doc, plus a synthetic empty-IO fixture and a multi-record fixture.

## Specification

### AVL record layout (Codec 8)

```
[Timestamp 8B] [Priority 1B] [GPS Element 15B] [IO Element ...]
```

#### Timestamp

An 8-byte big-endian unsigned integer: milliseconds since the UNIX epoch, UTC. Convert to `Date` via `new Date(Number(buf.readBigUInt64BE(offset)))`. Read as `BigInt` to be explicit about the 64-bit width; the values stay well within `Number`'s safe integer range for any plausible date.
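
A sketch of the `parseTimestamp` helper named in the deliverables, following the read described above:

```ts
// 8B big-endian ms since the UNIX epoch → Date.
export function parseTimestamp(
  buf: Buffer,
  offset: number,
): { value: Date; nextOffset: number } {
  const ms = buf.readBigUInt64BE(offset);
  return { value: new Date(Number(ms)), nextOffset: offset + 8 };
}
```

The canonical doc example's timestamp bytes `0x0000016B40D8EA30` decode to the June 10, 2019 instant the fixtures reference.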

#### Priority

A 1-byte enum: `0` = Low, `1` = High, `2` = Panic. Decision: **accept any byte value and pass it through** — the Teltonika spec lists 0–2, but treating an unexpected priority as a parser failure would be hostile. If the value is in range, type-narrow to `0 | 1 | 2`; if it is > 2, log a `debug` line, record the priority as `2` (Panic, the most conservative choice), and increment `teltonika_priority_out_of_range_total`. Open question: confirm the right fallback with operations.

#### GPS Element (15 bytes)

```
[Longitude 4B][Latitude 4B][Altitude 2B][Angle 2B][Satellites 1B][Speed 2B]
```

- **Longitude / Latitude**: signed 32-bit big-endian integer (two's complement), divided by `1e7` to get decimal degrees. Negative values need no special handling: `buf.readInt32BE(offset) / 1e7` does the right thing because `readInt32BE` interprets the value as signed.
- **Altitude**: 2-byte signed big-endian, meters above sea level.
- **Angle**: 2-byte unsigned big-endian, degrees from north (0–360).
- **Satellites**: 1-byte unsigned.
- **Speed**: 2-byte unsigned, km/h. **Pass through verbatim** — `0x0000` may mean "GPS invalid," but that semantic decision belongs to the Processor.

#### IO Element (Codec 8 layout)

```
[Event IO ID 1B]
[N total 1B]
[N1 1B] then N1 × ([IO ID 1B][Value 1B])
[N2 1B] then N2 × ([IO ID 1B][Value 2B BE unsigned])
[N4 1B] then N4 × ([IO ID 1B][Value 4B BE unsigned])
[N8 1B] then N8 × ([IO ID 1B][Value 8B BE — store as bigint])
```

Iterate each section and write into `position.attributes`:

```ts
attributes[String(ioId)] = value;
```

Values:
- 1-byte → `number` (read with `readUInt8`)
- 2-byte → `number` (read with `readUInt16BE`)
- 4-byte → `number` (read with `readUInt32BE`)
- 8-byte → `bigint` (read with `readBigUInt64BE`)

**Do not decode signedness** for IO values. The spec is silent on per-IO signedness; downstream model-aware code in the Processor handles that. If a downstream interpretation needs a signed 4-byte value, it can compute `(unsigned > 0x7FFFFFFF) ? unsigned - 0x100000000 : unsigned` itself.
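
The four-section walk can be sketched as follows. `parseIoElement` is an illustrative name, and the event IO ID is returned separately here precisely because its storage key is still an open question:

```ts
// Codec 8 IO element walk: fixed-width sections N1/N2/N4/N8.
// The "N total" byte is read past but not used for iteration;
// each section carries its own count.
export function parseIoElement(
  buf: Buffer,
  offset: number,
): { eventIoId: number; attributes: Record<string, number | bigint>; nextOffset: number } {
  const attributes: Record<string, number | bigint> = {};
  const eventIoId = buf.readUInt8(offset);
  let pos = offset + 2; // past Event IO ID and N total
  for (const width of [1, 2, 4, 8] as const) {
    const count = buf.readUInt8(pos);
    pos += 1;
    for (let i = 0; i < count; i++) {
      const ioId = buf.readUInt8(pos);
      pos += 1;
      let value: number | bigint;
      if (width === 1) value = buf.readUInt8(pos);
      else if (width === 2) value = buf.readUInt16BE(pos);
      else if (width === 4) value = buf.readUInt32BE(pos);
      else value = buf.readBigUInt64BE(pos);
      pos += width;
      attributes[String(ioId)] = value;
    }
  }
  return { eventIoId, attributes, nextOffset: pos };
}
```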

The `Event IO ID` value is captured separately. Recommendation: store it under `attributes['__event']`, which keeps the `Position` shape stable and avoids adding a Codec-8-specific field (the alternative is a typed sibling field on `Position`).

> **Open question:** is `__event` the right key? Alternatives: `'_event_io_id'`, `'0'` (it's IO ID 0 in some interpretations, but that's a different "0"). Decide before merging task 1.5.

### Record loop

After reading N1 (the record count from the framing layer), loop N1 times, producing one `Position` per record. Validate that the cursor at the end lands exactly on the trailing N2 byte; a mismatch is a parser bug → throw a structured error including the offset.

## Acceptance criteria

- [ ] All three canonical doc examples (single record with all IO widths; single record with reduced IO; two records) parse to the expected `Position[]` byte-for-byte (verified via the fixture suite from task 1.9).
- [ ] CRC validation already happened upstream (task 1.4); this task does not re-check.
- [ ] The cursor-end-equals-N2 invariant holds for every fixture.
- [ ] `Position.timestamp` round-trips: `new Date(...).toISOString()` matches the doc example's `GMT: Monday, June 10, 2019, 10:04:46 AM` for the first fixture.
- [ ] All IO IDs from the fixture appear in `attributes` with the correct number/bigint types.

## Risks / open questions

- The `Event IO ID` field semantics. Storing it under `'__event'` keeps things flexible but adds a magic key. Discuss with the Processor implementer before settling.
- 8-byte values as `bigint` complicate JSON serialization. Task 1.8 (the publisher) must handle this — recommend serializing as a string with a sentinel, e.g. `"123n"` or `{ "_bigint": "123" }`. Keep the parser side clean (real `bigint`); push encoding to the publish boundary.

## Done

(Fill in once complete.)
|
||||
@@ -0,0 +1,81 @@
|
||||
# Task 1.6 — Codec 8 Extended parser

**Phase:** 1 — Inbound telemetry
**Status:** ⬜ Not started
**Depends on:** 1.4, 1.5 (shared GPS Element / timestamp helpers), 1.9
**Wiki refs:** `docs/wiki/concepts/avl-data-format.md` § Codec 8 Extended, `docs/wiki/sources/teltonika-data-sending-protocols.md` § Codec 8 Extended

## Goal

Parse Codec 8 Extended (`0x8E`) AVL data bodies into `Position` records, including the **NX variable-length IO section** that does not exist in Codecs 8 or 16.

## Deliverables

- `src/adapters/teltonika/codec/data/codec8e.ts` exporting `codec8eHandler: CodecDataHandler` with `codec_id: 0x8E`.
- Test file `test/codec8e.test.ts` with the canonical doc example plus at least two synthetic fixtures: one with NX entries, one with mixed N1/N2/N4/N8/NX.

## Specification

### Differences from Codec 8

| Field | Codec 8 | Codec 8 Extended |
|-------|---------|------------------|
| Codec ID | `0x08` | `0x8E` |
| Event IO ID width | 1B | 2B |
| N total / N* counts | 1B | **2B** |
| IO ID width | 1B | 2B |
| Value widths | 1/2/4/8B | 1/2/4/8B (same) |
| Variable-length IO (NX) | — | **Yes** |

The fixed AVL fields (timestamp, priority, 15B GPS element) are identical to Codec 8.

### IO Element layout (Codec 8E)

```
[Event IO ID 2B]
[N total 2B]
[N1 2B] then N1 × ([IO ID 2B][Value 1B])
[N2 2B] then N2 × ([IO ID 2B][Value 2B])
[N4 2B] then N4 × ([IO ID 2B][Value 4B])
[N8 2B] then N8 × ([IO ID 2B][Value 8B])
[NX 2B] then NX × ([IO ID 2B][Length 2B][Value <Length> bytes]) ← unique to 8E
```

### NX section — the load-bearing complication

The NX section is the most error-prone part of Codec 8E. Each entry self-describes:

- 2 bytes IO ID.
- 2 bytes length (unsigned big-endian).
- `length` bytes of raw value.

Store NX values as **`Buffer`** (not number/bigint) — they may be ICCID-class data, BLE sensor payloads, or similar binary content. The Processor decodes them per model.

```ts
attributes[String(ioId)] = buf.subarray(offset, offset + length); // zero-copy view; stored as Buffer
```

**Common bug:** misreading the length field's width or endianness. Verify with a fixture that has at least one NX entry whose length value spans both bytes (i.e. 256+ bytes).

**Common bug 2:** mishandling NX length 0. Permitted by the spec; treat as a 0-byte Buffer.
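The entry loop above can be sketched as follows (function name and attribute-map shape are illustrative, not the committed API):

```typescript
// Sketch of the NX (variable-length) section parse for Codec 8E.
// All reads are big-endian; a zero length is legal and yields an empty Buffer.
function parseNxSection(
  buf: Buffer,
  off: number,
  attributes: Record<string, number | bigint | Buffer>,
): number {
  const nx = buf.readUInt16BE(off); off += 2;       // NX entry count (2B)
  for (let i = 0; i < nx; i++) {
    const ioId = buf.readUInt16BE(off); off += 2;   // IO ID (2B)
    const len = buf.readUInt16BE(off); off += 2;    // value length (2B)
    attributes[String(ioId)] = buf.subarray(off, off + len); // zero-copy view
    off += len;
  }
  return off; // cursor position after the NX section
}
```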
### Cursor invariant

Same as Codec 8: after parsing all N records and the trailing N2 byte, the cursor must sit exactly at the end of the body. A mismatch is a parser bug; throw with offset details.

## Acceptance criteria

- [ ] Canonical doc example (one record with N1=1, N2=1, N4=1, N8=2, NX=0) parses correctly. Note: the doc's NX section count is `00 00`, so this fixture covers the "NX present but empty" path.
- [ ] At least one synthetic fixture has NX > 0 with mixed lengths (e.g. 1B, 8B, 64B values).
- [ ] At least one synthetic fixture has an NX entry with `length = 0`.
- [ ] At least one synthetic fixture has an NX entry with `length` requiring the full 16 bits (≥ 256B).
- [ ] All NX values land in `attributes` as `Buffer` instances; non-NX values land as `number` or `bigint` per width.

## Risks / open questions

- Maximum total record size remains 255 bytes per the spec. NX with large values can push against this — verify the per-record size guard.
- Memory pressure: storing many `Buffer` instances per record could add up. Use `Buffer.subarray` (zero-copy view) rather than `Buffer.from(slice)` (copy). Confirm that downstream consumers (the publisher, task 1.8) handle the view semantics correctly — they should be safe because the underlying frame buffer is held until publish completes.

## Done

(Fill in once complete.)

@@ -0,0 +1,84 @@
# Task 1.7 — Codec 16 parser

**Phase:** 1 — Inbound telemetry
**Status:** ⬜ Not started
**Depends on:** 1.4, 1.5 (shared helpers), 1.9
**Wiki refs:** `docs/wiki/concepts/avl-data-format.md` § Codec 16, `docs/wiki/sources/teltonika-data-sending-protocols.md` § Codec 16

## Goal

Parse Codec 16 (`0x10`) AVL data bodies into `Position` records, including the per-record **Generation Type** byte.

## Deliverables

- `src/adapters/teltonika/codec/data/codec16.ts` exporting `codec16Handler: CodecDataHandler` with `codec_id: 0x10`.
- Test file `test/codec16.test.ts` with the canonical doc example (multi-record) plus at least one synthetic fixture covering each Generation Type value.

## Specification

### Differences from Codec 8 / Codec 8E

| Field | Codec 8 | Codec 16 | Codec 8E (for contrast) |
|-------|---------|----------|-------------------------|
| Codec ID | `0x08` | `0x10` | `0x8E` |
| Event IO ID width | 1B | **2B** | 2B |
| Generation Type | — | **1B** | — |
| N total / N* counts | 1B | **1B** | 2B |
| IO ID width | 1B | **2B** | 2B |
| Value widths | 1/2/4/8B | 1/2/4/8B | 1/2/4/8B |
| Variable-length IO (NX) | — | — | Yes |

Codec 16 is a "mixed" layout: 2-byte IO IDs (like 8E) but 1-byte counts (like 8), plus the new Generation Type field. This is the trap — implementers who copy from Codec 8E will get the count widths wrong; implementers who copy from Codec 8 will get the IO ID widths wrong. Read the spec carefully and write fixture-driven tests first.

### IO Element layout (Codec 16)

```
[Event IO ID 2B]
[Generation Type 1B] ← unique to Codec 16
[N total 1B]
[N1 1B] then N1 × ([IO ID 2B][Value 1B])
[N2 1B] then N2 × ([IO ID 2B][Value 2B])
[N4 1B] then N4 × ([IO ID 2B][Value 4B])
[N8 1B] then N8 × ([IO ID 2B][Value 8B])
```

No NX section.

### Generation Type

1-byte enum:

| Value | Meaning |
|-------|---------|
| 0 | On Exit |
| 1 | On Entrance |
| 2 | On Both |
| 3 | Reserved |
| 4 | Hysteresis |
| 5 | On Change |
| 6 | Eventual |
| 7 | Periodical |

Storage decision: **store as `attributes['__generation_type']`** (consistent with the `__event` convention from task 1.5). Codec 8 and 8E omit this key entirely, so downstream code can pattern-match on its presence.

> **Open question (carried from task 1.5):** if we promote Generation Type to a typed `Position` field, then `__event` should also become typed. Recommendation: keep them in `attributes` for Phase 1; revisit when Processor-side modeling firms up. Flagged in [[position-record]] open questions.

### AVL ID range

Codec 16 (and 8E) supports IO IDs > 255. The parser treats this transparently — IO IDs are read as 2-byte unsigned values; nothing prevents `ioId = 1234`. Just confirm no fixture has an off-by-one assumption that breaks for IDs > 255.
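The mixed widths can be pinned down in a tiny header-read sketch (function name illustrative) — note the 2-byte event IO ID sitting next to 1-byte counts:

```typescript
// Codec 16 IO-element header: 2B event IO ID (like 8E), 1B generation type
// (unique to Codec 16), 1B N-total count (like Codec 8).
function readCodec16IoHeader(buf: Buffer, off: number) {
  const eventIoId = buf.readUInt16BE(off); off += 2;
  const generationType = buf.readUInt8(off); off += 1;
  const nTotal = buf.readUInt8(off); off += 1;
  return { eventIoId, generationType, nTotal, off };
}
```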
## Acceptance criteria

- [ ] Canonical doc example (two records, N1=2, N2=2, codec ID `0x10`, generation type `0x05`) parses correctly with both records' attributes populated.
- [ ] A synthetic fixture exists for each Generation Type 0–7 (eight fixtures total, or one fixture with eight records varying the field).
- [ ] At least one synthetic fixture has an IO ID > 255 to verify the 2-byte read.
- [ ] `attributes['__generation_type']` is set on every Codec 16 position; absent on Codec 8 / 8E positions.

## Risks / open questions

- The "mixed widths" trap is real. Pair-review (or have a second LLM agent review) the field-width table before declaring done. Fixture tests catch this if they're built carefully.
- Reserved value `3` for Generation Type: the spec says reserved. Decision: log at `debug` if observed; do not reject. We do not police reserved values that don't break parsing.

## Done

(Fill in once complete.)

@@ -0,0 +1,114 @@
# Task 1.8 — Redis Streams publisher & main wiring

**Phase:** 1 — Inbound telemetry
**Status:** ⬜ Not started
**Depends on:** 1.2, 1.3, 1.4, 1.5, 1.6, 1.7
**Wiki refs:** `docs/wiki/entities/redis-streams.md`, `docs/wiki/concepts/position-record.md`

## Goal

Implement the real `publishPosition` that writes `Position` records to a Redis Stream, then wire the entire Phase 1 pipeline together in `src/main.ts`.

## Deliverables

- `src/core/publish.ts` (replacing the stub from task 1.2):
  - `createPublisher(redis: Redis, config: Config, logger: Logger, metrics: Metrics): Publisher` factory.
  - `Publisher.publish(p: Position): Promise<void>` that serializes and `XADD`s.
  - Internal serialization helper `serializePosition(p: Position): Record<string, string>` returning the field-value pairs Redis expects.
- `src/main.ts` updated to:
  1. Load config (task 1.3).
  2. Build logger and metrics (tasks 1.3, 1.10).
  3. Connect to Redis with retry-on-startup logic.
  4. Build the publisher.
  5. Build the Teltonika adapter and register codec handlers.
  6. Start the TCP server.
  7. Start the metrics HTTP server (task 1.10).
  8. Install graceful shutdown (task 1.12 finalizes; stub here).

## Specification

### Stream record shape

`XADD telemetry:teltonika MAXLEN ~ <maxlen> * <fields>` where fields are flat key→string pairs (Redis Streams do not nest). Use a JSON-encoded `payload` field for simplicity:

```
1) ts        → ISO 8601 string (timestamp from the Position)
2) device_id → IMEI string
3) codec     → "8" | "8E" | "16" (the codec that produced this record — useful for downstream filtering)
4) payload   → JSON string of the full Position
```

The duplicated `device_id` and `ts` at the top level let downstream tools filter without parsing the JSON; `payload` is the source of truth.
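A sketch of `serializePosition` producing those four fields — the `Position` shape here is a minimal stand-in, and the inline bigint replacer is just enough to make the example self-contained:

```typescript
// Minimal stand-in for the real Position type.
interface PositionLike {
  device_id: string;
  timestamp: Date;
  attributes: Record<string, number | bigint>;
}

// Flat field→string pairs for XADD; the codec label is supplied by the caller.
function serializePosition(p: PositionLike, codec: '8' | '8E' | '16'): Record<string, string> {
  const bigintReplacer = (_k: string, v: unknown) =>
    typeof v === 'bigint' ? { __bigint: v.toString() } : v;
  return {
    ts: p.timestamp.toISOString(),
    device_id: p.device_id,
    codec,
    payload: JSON.stringify(p, bigintReplacer),
  };
}
```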
### JSON serialization

`Position.attributes` contains `number | bigint | Buffer`. Out of the box, `JSON.stringify` handles `number` but not `bigint` or `Buffer`. Implement a custom replacer. Note one trap: `JSON.stringify` invokes `toJSON()` *before* the replacer, so by the time the replacer runs, a `Buffer` has already become `{ type: 'Buffer', data: [...] }` (and a `Date` an ISO string). Read the original value off the holder (`this`) instead:

```ts
function replacer(this: Record<string, unknown>, key: string, value: unknown): unknown {
  const original = this[key]; // pre-toJSON value
  if (typeof original === 'bigint') return { __bigint: original.toString() };
  if (Buffer.isBuffer(original)) return { __buffer_b64: original.toString('base64') };
  if (original instanceof Date) return original.toISOString();
  return value;
}
```

Pass it as a regular function (not an arrow) so `this` is bound to the holder object.

The `__bigint` and `__buffer_b64` sentinels are decoded by the Processor (and any other consumer). Document this contract in the [[position-record]] page once landed.
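On the consumer side, a matching reviver might look like this — a sketch of the decode contract, not the Processor's actual code:

```typescript
// Decodes the { __bigint } / { __buffer_b64 } sentinels back into real values.
// JSON.parse revives bottom-up, so sentinel objects arrive fully parsed.
function sentinelReviver(_key: string, value: unknown): unknown {
  if (value !== null && typeof value === 'object') {
    const v = value as Record<string, unknown>;
    if (typeof v.__bigint === 'string') return BigInt(v.__bigint);
    if (typeof v.__buffer_b64 === 'string') return Buffer.from(v.__buffer_b64, 'base64');
  }
  return value;
}
```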
### `XADD` options

- `MAXLEN ~ <REDIS_STREAM_MAXLEN>` — approximate trimming, much cheaper than exact.
- `*` for an auto-generated message ID.
- Use a single connection (no pooling — `ioredis` pipelines commands on one connection automatically).

### Backpressure / non-blocking property

The TCP handler `await`s `ctx.publish(p)`. Two strategies:

**Option A: Direct `XADD` per record.** Simplest. Latency per publish is sub-millisecond on a healthy Redis. The risk: if Redis hangs, the TCP handler blocks → device sockets back up → Phase 1's "TCP handler never blocks" property is violated.

**Option B: Bounded in-memory queue + worker drain.** A `Promise`-based bounded queue (e.g. `p-queue` or hand-rolled). `publish()` resolves once the record is enqueued; a worker drains via `XADD`. If the queue is full, the worker has fallen behind catastrophically — at that point we have to choose: drop oldest, drop newest, or throw. Recommendation: drop newest with a structured error log + metric, because the device will retransmit (we won't ACK).

**Decision: Option B.** Specification:

- Queue capacity: 10,000 records (configurable via `PUBLISH_QUEUE_CAPACITY`).
- On overflow: do **not** publish; throw a typed `PublishOverflowError`. The framing layer (task 1.4) catches this and skips the ACK so the device retransmits.
- Worker concurrency: 1 (commands are already pipelined on the single `ioredis` connection; extra concurrency adds complexity without throughput).
- Metrics: `teltonika_publish_queue_depth` gauge, `teltonika_publish_overflow_total` counter.

The worker uses `XADD` with a per-call timeout (e.g. 2s) and exits the process on prolonged Redis unavailability; the orchestrator restarts the process.
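The overflow behavior can be sketched as a tiny bounded queue (error name from the spec above; the `XADD` worker and timeouts are omitted):

```typescript
// Thrown on enqueue when the queue is at capacity (Option B overflow path).
class PublishOverflowError extends Error {
  constructor() {
    super('publish queue full');
    this.name = 'PublishOverflowError';
  }
}

// Bounded FIFO with throw-on-overflow.
class BoundedQueue<T> {
  private readonly items: T[] = [];
  constructor(private readonly capacity: number) {}

  enqueue(item: T): void {
    if (this.items.length >= this.capacity) throw new PublishOverflowError();
    this.items.push(item);
  }

  dequeue(): T | undefined {
    return this.items.shift();
  }

  get depth(): number {
    return this.items.length; // exported as teltonika_publish_queue_depth
  }
}
```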
### `main.ts` skeleton

```ts
async function main() {
  const config = loadConfig();
  const logger = createLogger(config);
  const metrics = createMetrics();
  const redis = await connectRedis(config, logger);
  const publisher = createPublisher(redis, config, logger, metrics);
  const adapter = createTeltonikaAdapter({ publisher, logger, metrics });
  const server = startServer(config.TELTONIKA_PORT, adapter, {
    publish: (p) => publisher.publish(p), // wrap to preserve `this` on the publisher
    logger,
    metrics,
  });
  const metricsServer = startMetricsServer(config.METRICS_PORT, metrics);
  installGracefulShutdown({ server, metricsServer, redis, publisher, logger });
  logger.info({ port: config.TELTONIKA_PORT }, 'tcp-ingestion ready');
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
```
## Acceptance criteria

- [ ] Integration test: spin up a Redis (testcontainers or `redis-mock`), publish a known `Position`, `XREAD` it back, parse the JSON, and assert it equals the input (with `bigint` and `Buffer` round-tripped through the sentinel encoding).
- [ ] Overflow test: artificially block the worker, fill the queue, verify the next `publish()` rejects with `PublishOverflowError`, verify the metrics increment.
- [ ] Startup test: with a wrong `REDIS_URL`, the process logs a clear error and exits non-zero.
- [ ] End-to-end test: open a TCP client to the running server, send the canonical Codec 8 fixture, verify a Position lands on the Stream and the ACK comes back with `00 00 00 01`.

## Risks / open questions

- `redis-mock` does not implement Streams. Use testcontainers + a real Redis for integration tests.
- The bounded queue could cause backpressure concerns — discuss with the Processor team whether they prefer the device-retransmit path (overflow throw) or a soft drop with logging. Defaulting to retransmit because it's the safer correctness choice.

## Done

(Fill in once complete.)

@@ -0,0 +1,156 @@
# Task 1.9 — Fixture suite & testing strategy

**Phase:** 1 — Inbound telemetry
**Status:** ⬜ Not started
**Depends on:** 1.1
**Wiki refs:** `docs/wiki/sources/teltonika-ingestion-architecture.md` § 5.6, `docs/wiki/sources/teltonika-data-sending-protocols.md`

## Goal

Establish the fixture-based testing infrastructure and seed it with the canonical hex captures from the Teltonika documentation. **This is the only place where the parser's correctness is actually verified.** Bugs in binary protocol parsers are silent; tests are the defense.

## Deliverables

- `test/fixtures/teltonika/codec8/`, `test/fixtures/teltonika/codec8e/`, `test/fixtures/teltonika/codec16/` populated with at least:
  - 3 captures from the canonical Teltonika doc (one per codec, with full parsed expectations).
  - 1 synthetic edge case per codec (empty IO bag, max-size IO values, multi-record).
- Each fixture is a pair: `<name>.hex` (raw frame, hex-encoded with whitespace stripped) and `<name>.expected.json` (the expected `Position[]` after parsing).
- `test/fixtures/_loader.ts` — helpers:
  - `loadFixture(path): { hex: Buffer; expected: Position[] }`
  - `compareToExpected(actual: Position[], expected: Position[]): void` (deep-equals with a `bigint`/`Buffer`-aware comparator).
- A vitest test pattern that automatically picks up every fixture pair in a directory and generates a test per pair, so adding a new fixture file = a new test, no boilerplate.
- `test/fixtures/teltonika/README.md` documenting the format and how to add new captures.

## Specification

### Fixture format

`fixture-name.hex`:

```
000000000000003608010000016B40D8EA30
01000000000000000000000000000000010
5021503010101425E0F01F10000601A014E
0000000000000000010000C7CF
```

Whitespace and newlines are ignored. The loader strips `[^0-9a-fA-F]` and parses with `Buffer.from(hex, 'hex')`.
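That stripping rule is a one-liner (helper name illustrative):

```typescript
// Strip everything that isn't a hex digit, then decode.
function loadHexFixture(text: string): Buffer {
  return Buffer.from(text.replace(/[^0-9a-fA-F]/g, ''), 'hex');
}
```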
`fixture-name.expected.json`:

```json
{
  "positions": [
    {
      "device_id": "FIXTURE",
      "timestamp": "2019-06-10T10:04:46.000Z",
      "latitude": 0,
      "longitude": 0,
      "altitude": 0,
      "angle": 0,
      "speed": 0,
      "satellites": 0,
      "priority": 1,
      "attributes": {
        "21": 3,
        "1": 1,
        "66": 24079,
        "241": 24602,
        "78": "__bigint:0",
        "__event": 1
      }
    }
  ],
  "ack_record_count": 1
}
```

The `__bigint:` and `__buffer_b64:` prefixes are how the JSON file represents the special types. The loader decodes them into real `bigint` / `Buffer` instances before comparison.
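A sketch of that decoding step in the loader (helper name illustrative):

```typescript
// Decode the string-prefix sentinels used in *.expected.json values.
// Plain numbers and strings pass through untouched.
function decodeExpectedValue(v: unknown): unknown {
  if (typeof v === 'string') {
    if (v.startsWith('__bigint:')) return BigInt(v.slice('__bigint:'.length));
    if (v.startsWith('__buffer_b64:')) return Buffer.from(v.slice('__buffer_b64:'.length), 'base64');
  }
  return v;
}
```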
`device_id` in fixtures is a placeholder (`"FIXTURE"`) because the captures don't include the IMEI — the codec parsers receive the IMEI from the framing layer's session context, not from the body itself.

### Bootstrap fixtures (must be present at the end of this task)

From the canonical Teltonika doc (`docs/raw/Teltonika Data Sending Protocols - Teltonika Telematics Wiki.md`):

#### Codec 8
- `01-single-record-all-widths.hex`: 1st example — one record with N1=2, N2=1, N4=1, N8=1.
- `02-single-record-reduced.hex`: 2nd example — one record with N1=2, N2=1, N4=0, N8=0.
- `03-two-records.hex`: 3rd example — two records with minimal IO.

#### Codec 8 Extended
- `01-canonical.hex`: doc example — one record, N1=1, N2=1, N4=1, N8=2, NX=0.

#### Codec 16
- `01-canonical.hex`: doc example — two records with Generation Type `0x05`.

### Synthetic fixtures (must be present)

#### Codec 8
- `04-empty-io-bag.hex`: one record, N=0 (no IO elements). Smallest valid record.
- `05-multi-record-large.hex`: 10 records to exercise the loop and the N1==N2 invariant.

#### Codec 8 Extended
- `02-nx-mixed.hex`: one record with NX=3, lengths 1, 8, 64.
- `03-nx-zero-length.hex`: one record with one NX entry of length 0.
- `04-nx-large-length.hex`: one record with one NX entry of length 300+ (verifies the 16-bit length read).

#### Codec 16
- `02-each-generation-type.hex`: 8 records, one per Generation Type 0–7.
- `03-large-io-id.hex`: one record with an IO ID > 255 (e.g. `0x0400`).

### Test runner pattern

In `test/codec8.test.ts`, `codec8e.test.ts`, `codec16.test.ts`:
```ts
import { describe, it, expect } from 'vitest';
import type { Position } from '../src/core/position'; // illustrative path
import { loadFixturesFromDir, makeTestCtx } from './fixtures/_loader';
import { codec8Handler } from '../src/adapters/teltonika/codec/data/codec8';

describe('Codec 8 parser', () => {
  for (const fixture of loadFixturesFromDir('test/fixtures/teltonika/codec8')) {
    it(`parses ${fixture.name}`, async () => {
      const positions: Position[] = [];
      const ctx = makeTestCtx(positions); // collects published positions
      const result = await codec8Handler.handle(fixture.body, ctx);
      expect(positions).toEqual(fixture.expected.positions);
      expect(result.recordCount).toBe(fixture.expected.ack_record_count);
    });
  }
});
```

This pattern means **adding a new fixture file = a new test, automatically** — no editing the test file.
### CRC tests

A separate `test/crc.test.ts` covers `crc16Ibm` against:
- The canonical doc CRCs (each fixture's CRC computed over the body should match the trailing CRC bytes).
- A few hand-computed reference values (from online CRC-16/IBM calculators, recorded in the test).
- An empty buffer (`crc16Ibm(Buffer.alloc(0))` should return `0x0000`).

### Frame tests

`test/frame.test.ts`:
- IMEI handshake happy path.
- IMEI handshake with a malformed length.
- Frame envelope: bytes split across multiple `data` events.
- Frame envelope: the CRC-mismatch path returns the right outcome (no ACK, connection stays open).
- Frame envelope: an unknown codec ID drops the connection.
## Acceptance criteria

- [ ] All bootstrap and synthetic fixtures listed above are present.
- [ ] `pnpm test` runs all fixture tests and they pass.
- [ ] `pnpm test --coverage` reports ≥ 90% line coverage for `src/adapters/teltonika/codec/`.
- [ ] Adding a new fixture pair to a codec's fixtures directory automatically produces a new test (verified manually by adding a temporary fixture).
- [ ] The fixture README documents the format clearly enough that a new contributor can add a capture without reading the test code.

## Risks / open questions

- Where do real production captures come from? Until devices are streaming to a staging environment, we only have doc captures. Plan: record the first day of staging traffic into `tcpdump`-style captures, extract a few representative frames per device model, and contribute them as fixtures with the model name in the filename. This is a follow-up after staging deployment, not a Phase 1 blocker.
- Hex format vs binary `.bin` files: hex is reviewable in PRs and documented in the fixture README. Stick with hex.
- Confirming expected outputs: bootstrap fixtures' expected outputs come directly from the canonical doc's parsed tables. Synthetic fixture expectations are computed by hand and double-checked against the parser output once the parser is believed correct — which is circular if the parser is buggy. Mitigation: cross-check at least one synthetic fixture against an external Teltonika parser (e.g. the open-source [Traccar](https://github.com/traccar/traccar) project's Teltonika decoder) before declaring done.

## Done

(Fill in once complete.)

@@ -0,0 +1,84 @@
# Task 1.10 — Observability (Prometheus metrics)

**Phase:** 1 — Inbound telemetry
**Status:** ⬜ Not started
**Depends on:** 1.2, 1.3
**Wiki refs:** `docs/wiki/sources/teltonika-ingestion-architecture.md` § 7. Observability, `docs/wiki/sources/gps-tracking-architecture.md` § 7.4

## Goal

Expose Prometheus metrics over an HTTP endpoint so the platform's observability stack can scrape them. Metrics drive alerting (consumer lag, unknown codecs, CRC failures) and capacity planning (connection counts, frame rates).

## Deliverables

- `src/observability/metrics.ts`:
  - Exports `createMetrics(): Metrics` returning a typed wrapper around `prom-client` registries.
  - All metric definitions in one place, with explicit names/labels matching the wiki spec.
  - A `serializeMetrics(): Promise<string>` returning the standard Prometheus exposition format.
  - A `startMetricsServer(port, metrics): http.Server` that exposes `GET /metrics` and `GET /healthz`.
- Wiring updates: every place that should emit a metric (handshake outcome, frame outcome, publish queue depth, etc.) calls into the `Metrics` object.

## Specification

### Metric inventory (Phase 1)

Per `docs/wiki/sources/teltonika-ingestion-architecture.md` § 7:

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `teltonika_connections_active` | gauge | — | Currently open device sessions. |
| `teltonika_handshake_total` | counter | `result=accepted\|rejected\|malformed`, `known=known\|unknown` | IMEI handshake outcomes. The `known` label distinguishes IMEIs that the configured `DeviceAuthority` recognizes from those it does not. With the default `AllowAllAuthority`, `known` is always `known`. |
| `teltonika_device_authority_failures_total` | counter | — | Times a `DeviceAuthority.check` call threw or timed out. A non-zero rate indicates the allow-list refresher (task 1.13) is unhealthy. |
| `teltonika_frames_total` | counter | `codec=8\|8E\|16\|unknown`, `result=ok\|crc_fail\|truncated\|n_mismatch` | Frame-level outcomes. |
| `teltonika_records_published_total` | counter | `codec` | AVL records emitted to Redis. |
| `teltonika_parse_duration_seconds` | histogram | `codec` | Per-frame parse time. Buckets: `[0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1]` (seconds). |
| `teltonika_unknown_codec_total` | counter | `codec_id` (string of the offending byte) | **Canary** for codec coverage drift. |

Phase 1 also adds the publisher-related metrics from task 1.8:

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `teltonika_publish_queue_depth` | gauge | — | Current bounded-queue depth. |
| `teltonika_publish_overflow_total` | counter | — | Records dropped because the queue was full. |
| `teltonika_publish_duration_seconds` | histogram | — | `XADD` latency. |

Plus process-level:

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `nodejs_*` | various | — | Default Node.js process metrics (`prom-client` provides `collectDefaultMetrics()`). |

### Naming convention

- `teltonika_*` for adapter-specific metrics.
- `nodejs_*` for runtime metrics (default).
- No service prefix — the Prometheus scrape config adds the `service` and `instance` labels externally.

### Health and readiness

- `GET /healthz`: returns `200 OK` if the process is alive. (Liveness probe.)
- `GET /readyz`: returns `200 OK` if the Redis connection is healthy AND the TCP listener is bound; `503` otherwise. (Readiness probe.)
- Both endpoints return a tiny JSON body `{ "status": "ok" }` for diagnostic value.

### HTTP server

Use Node's `node:http` directly — no Express/Fastify dependency for two endpoints. Keep it minimal, ~30 lines.
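A sketch of that minimal server — `serializeMetrics` and `isReady` are stand-ins for the real wiring described in the deliverables:

```typescript
import http from 'node:http';

// Minimal metrics/health server on bare node:http (no framework).
function startMetricsServer(
  port: number,
  serializeMetrics: () => Promise<string>,
  isReady: () => boolean,
): http.Server {
  const server = http.createServer((req, res) => {
    if (req.url === '/metrics') {
      serializeMetrics().then((body) => {
        res.writeHead(200, { 'content-type': 'text/plain; version=0.0.4' });
        res.end(body);
      });
    } else if (req.url === '/healthz') {
      res.writeHead(200, { 'content-type': 'application/json' });
      res.end('{"status":"ok"}');
    } else if (req.url === '/readyz') {
      const ok = isReady();
      res.writeHead(ok ? 200 : 503, { 'content-type': 'application/json' });
      res.end(ok ? '{"status":"ok"}' : '{"status":"unavailable"}');
    } else {
      res.writeHead(404);
      res.end();
    }
  });
  return server.listen(port);
}
```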
## Acceptance criteria

- [ ] `curl http://localhost:9090/metrics` returns valid Prometheus exposition format with every metric in the inventory present (some at zero).
- [ ] After processing the canonical Codec 8 fixture, `teltonika_records_published_total{codec="8"}` increments by 1 and `teltonika_frames_total{codec="8",result="ok"}` increments by 1.
- [ ] Sending a packet with an unknown codec ID increments `teltonika_unknown_codec_total{codec_id="..."}`.
- [ ] After a handshake from an IMEI the configured `DeviceAuthority` returns `'unknown'` for, `teltonika_handshake_total{result="accepted",known="unknown"}` increments by 1.
- [ ] `GET /readyz` returns `503` while Redis is unreachable, then `200` once it reconnects.
- [ ] `prom-client` default metrics are exposed (Node version, GC, event loop lag).

## Risks / open questions

- Cardinality of the `codec_id` label on `teltonika_unknown_codec_total`: bounded by 256 possible byte values. Acceptable.
- Cardinality of `device_id` (IMEI) in metrics: **avoid**. Per-device metrics belong in logs/traces, not Prometheus, because the cardinality is unbounded. Phase 1 does not add per-IMEI labels anywhere. (This is a watch-out for future tasks.)
- Histogram buckets for `teltonika_parse_duration_seconds` are tuned for expected sub-millisecond parse times. Adjust based on real production data after the first week.

## Done

(Fill in once complete.)

@@ -0,0 +1,175 @@
# Task 1.11 — Dockerfile & Gitea workflow

**Phase:** 1 — Inbound telemetry
**Status:** ⬜ Not started
**Depends on:** 1.8 (so the service actually does something), 1.10 (metrics endpoint for the healthcheck)
**Wiki refs:** `docs/wiki/sources/gps-tracking-architecture.md` § 7.3 Deployment topology

## Goal

Produce a multi-stage Docker image and a Gitea Actions workflow that builds and pushes the image to the project's Gitea Container Registry on every push to `main` and every tag.

## Deliverables

- `Dockerfile` — multi-stage build (deps → build → runtime).
- `.dockerignore` — already created in task 1.1; verify it excludes `.planning/`, `test/`, and `dist/` (rebuilt in the image).
- `.gitea/workflows/build.yml` — Gitea Actions workflow.
- `compose.yaml` (alongside the Dockerfile) — example local stack with Redis for `pnpm docker:dev`. Useful for local testing of the full pipeline.
- Documentation updates in `README.md` covering: build, run locally, run via compose, CI behavior, image registry path.
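A sketch of that `compose.yaml` local stack — service names and env vars are illustrative, matching the config names used elsewhere in Phase 1:

```yaml
services:
  redis:
    image: redis:7-alpine
    ports: ["6379:6379"]
  tcp-ingestion:
    build: .
    environment:
      REDIS_URL: redis://redis:6379
      TELTONIKA_PORT: "5027"
      METRICS_PORT: "9090"
    ports: ["5027:5027", "9090:9090"]
    depends_on: [redis]
```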
## Specification

### Dockerfile

```dockerfile
# syntax=docker/dockerfile:1.7

# ---- deps stage: install with cache-friendly pnpm fetch ----
FROM node:22-alpine AS deps
WORKDIR /app
RUN corepack enable && corepack prepare pnpm@latest-9 --activate
COPY package.json pnpm-lock.yaml ./
RUN --mount=type=cache,id=pnpm-store,target=/root/.local/share/pnpm/store \
    pnpm fetch

# ---- build stage: compile TypeScript ----
FROM deps AS build
COPY . .
RUN --mount=type=cache,id=pnpm-store,target=/root/.local/share/pnpm/store \
    pnpm install --frozen-lockfile --offline
RUN pnpm build
RUN pnpm prune --prod

# ---- runtime: slim, non-root ----
FROM node:22-alpine AS runtime
WORKDIR /app
RUN addgroup -S app && adduser -S -G app app
COPY --from=build --chown=app:app /app/node_modules ./node_modules
COPY --from=build --chown=app:app /app/dist ./dist
COPY --from=build --chown=app:app /app/package.json ./package.json
USER app
EXPOSE 5027 9090
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
  CMD wget -qO- http://localhost:9090/readyz || exit 1
CMD ["node", "dist/main.js"]
```

Notes:
- `node:22-alpine` keeps the final image small (~100MB). If musl-related issues arise (rare with pure JS), fall back to `node:22-slim`.
- BuildKit cache mounts (`--mount=type=cache`) speed up rebuilds significantly; the Gitea runner must support BuildKit (it does by default with modern Docker).
- `pnpm prune --prod` strips dev dependencies before the runtime copy.
- The healthcheck hits `/readyz`, so the container reports unhealthy if Redis is unreachable.
### Gitea workflow
|
||||
|
||||
`.gitea/workflows/build.yml`:
|
||||
|
||||
```yaml
|
||||
name: build
|
||||
|
||||
on:
|
||||
push:
|
||||
branches: [main]
|
||||
tags: ['v*']
|
||||
pull_request:
|
||||
branches: [main]
|
||||
|
||||
jobs:
|
||||
test:
|
||||
runs-on: ubuntu-latest
|
||||
container: node:22-alpine
|
||||
services:
|
||||
redis:
|
||||
image: redis:7-alpine
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
- run: corepack enable && corepack prepare pnpm@latest-9 --activate
|
||||
- run: pnpm install --frozen-lockfile
|
||||
- run: pnpm typecheck
|
||||
- run: pnpm lint
|
||||
- run: pnpm test --coverage
|
||||
env:
|
||||
REDIS_URL: redis://redis:6379
|
||||
|
||||
build-and-push:
|
||||
needs: test
|
||||
if: gitea.event_name == 'push'
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
- uses: docker/setup-buildx-action@v3
|
||||
- uses: docker/login-action@v3
|
||||
with:
|
||||
registry: git.dev.microservices.al
|
||||
username: ${{ gitea.actor }}
|
||||
password: ${{ secrets.GITEA_TOKEN }}
|
||||
- id: meta
|
||||
uses: docker/metadata-action@v5
|
||||
with:
|
||||
images: git.dev.microservices.al/trm/tcp-ingestion
|
||||
tags: |
|
||||
type=ref,event=branch
|
||||
type=sha,prefix=,format=short
|
||||
type=semver,pattern={{version}}
|
||||
type=semver,pattern={{major}}.{{minor}}
|
||||
type=raw,value=latest,enable={{is_default_branch}}
|
||||
- uses: docker/build-push-action@v5
|
||||
with:
|
||||
context: .
|
||||
push: true
|
||||
tags: ${{ steps.meta.outputs.tags }}
|
||||
labels: ${{ steps.meta.outputs.labels }}
|
||||
cache-from: type=registry,ref=git.dev.microservices.al/trm/tcp-ingestion:buildcache
|
||||
cache-to: type=registry,ref=git.dev.microservices.al/trm/tcp-ingestion:buildcache,mode=max
|
||||
```

Tags produced:
- On push to `main`: `main`, `<short-sha>`, `latest`.
- On tag `v1.2.3`: `1.2.3` and `1.2` (the `semver` patterns strip the `v` prefix), plus `latest` if `{{is_default_branch}}` evaluates true for the tag push — that depends on the Gitea Actions semantics in your runner version; verify and adjust if necessary.
- On PR: tests run, no push.

`GITEA_TOKEN` is provided by Gitea Actions automatically (similar to `GITHUB_TOKEN` in GitHub Actions). It must have package-write scope; configure once in repo settings if the default scope is read-only.

### compose.yaml (local dev)

```yaml
services:
  redis:
    image: redis:7-alpine
    ports: ['6379:6379']
  ingestion:
    build: .
    depends_on: [redis]
    ports:
      - '5027:5027' # Teltonika TCP
      - '9090:9090' # metrics
    environment:
      NODE_ENV: production
      INSTANCE_ID: local-1
      REDIS_URL: redis://redis:6379
      LOG_LEVEL: debug
    restart: unless-stopped
```

### Deployment

Out of scope for this task: how the image is consumed in production (compose pull + restart? K8s? Watchtower?). Recommend a follow-up task once Phase 1 is functional, since the deployment substrate may not be fully decided yet. For now, the image is built and published; humans pull and run it manually.

## Acceptance criteria

- [ ] `docker build .` succeeds locally and produces an image under 200MB.
- [ ] `docker compose up` starts both Redis and the ingestion service; the service's `/healthz` and `/readyz` return 200.
- [ ] On push to `main`, the Gitea workflow runs tests, builds the image, and publishes it to the registry. The image is visible in the Gitea Packages UI.
- [ ] On a tag push, the image is also tagged with the version.
- [ ] On a PR, only the test job runs (no push).
- [ ] BuildKit cache reduces a rebuild-with-no-changes to under 30 seconds.

## Risks / open questions

- The exact Gitea Actions feature parity with GitHub Actions varies by runner version. If `docker/metadata-action@v5` doesn't work as expected, fall back to a hand-rolled tag generator using `git rev-parse --short HEAD`.
- `GITEA_TOKEN` permissions: confirm the default token can push to the registry. If not, switch to a dedicated `secrets.REGISTRY_TOKEN`.
- Architecture: build only `linux/amd64` for now. Multi-arch (`linux/arm64`) is a follow-up if anyone needs it for Apple Silicon dev.
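
The hand-rolled fallback mentioned in the first risk could be as small as a POSIX shell function that emulates the subset of `docker/metadata-action` tag rules this workflow relies on. The `tags_for_ref` name and the ref/sha arguments are illustrative, not part of the task spec:

```shell
#!/bin/sh
# Illustrative fallback tag generator. Emulates the tag rules used above:
# semver tags drop the leading "v", branch pushes get branch + short sha,
# and "latest" is only emitted for main and release tags.
tags_for_ref() {
  ref="$1"   # e.g. refs/tags/v1.2.3 or refs/heads/main
  sha="$2"   # short sha, e.g. from: git rev-parse --short HEAD
  case "$ref" in
    refs/tags/v*)
      ver="${ref#refs/tags/v}"      # v1.2.3 -> 1.2.3
      majmin="${ver%.*}"            # 1.2.3  -> 1.2
      printf '%s %s latest\n' "$ver" "$majmin"
      ;;
    refs/heads/main)
      printf 'main %s latest\n' "$sha"
      ;;
    *)
      printf '%s\n' "$sha"          # anything else: sha only, no latest
      ;;
  esac
}
```

The output is a space-separated tag list that a subsequent step can expand into repeated `-t registry/image:tag` arguments for `docker build`.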

## Done

(Fill in once complete.)

# Task 1.12 — Production hardening

**Phase:** 1 — Inbound telemetry
**Status:** ⬜ Not started
**Depends on:** 1.8, 1.10, 1.11
**Wiki refs:** `docs/wiki/concepts/failure-domains.md`

## Goal

Make the service safe for unattended production operation: graceful shutdown, robust error handling, structured logging discipline, sane defaults for resource limits, and operational documentation.

## Deliverables

- `src/core/lifecycle.ts` — `installGracefulShutdown({ ... })` that wires SIGTERM/SIGINT/SIGHUP to a coordinated shutdown.
- `src/core/errors.ts` — typed error classes (`HandshakeError`, `FrameError`, `PublishOverflowError`, `RedisUnavailableError`).
- Updates to `src/main.ts` to install error handlers and shutdown.
- `OPERATIONS.md` (or section in `README.md`) covering: env var reference, signals, log fields, metric meanings, common alert rules, troubleshooting.
- (Optional) `docs/runbook.md` for on-call: "what to do when X alert fires."

## Specification

### Graceful shutdown

On SIGTERM (deployment rolling update), SIGINT (Ctrl-C), or SIGHUP:

1. **Stop accepting new connections.** `server.close()` — existing sockets continue.
2. **Drain the publish queue.** Stop accepting new `publish()` calls; wait for the worker to flush queued records to Redis (with a timeout, e.g. 10s).
3. **Send a final goodbye on each open socket.** Optional: it is fine to just let TCP FIN close the sockets naturally; devices will reconnect to a new instance.
4. **Close Redis connection.**
5. **Exit cleanly with code 0.**

If shutdown takes longer than `SHUTDOWN_TIMEOUT_MS` (default 30s), log and exit with code 1 — the orchestrator will SIGKILL anyway, but exiting deliberately gives a cleaner signal.

```ts
export function installGracefulShutdown(handles: ShutdownHandles) {
  let shuttingDown = false;
  const shutdown = async (signal: string) => {
    if (shuttingDown) return;
    shuttingDown = true;
    handles.logger.info({ signal }, 'shutdown: starting');
    const deadline = setTimeout(() => {
      handles.logger.error({}, 'shutdown: timed out, forcing exit');
      process.exit(1);
    }, handles.timeoutMs ?? 30_000);
    try {
      await new Promise<void>((res) => handles.server.close(() => res()));
      await handles.publisher.drain(10_000);
      await handles.redis.quit();
      handles.metricsServer.close();
      clearTimeout(deadline);
      handles.logger.info({}, 'shutdown: clean exit');
      process.exit(0);
    } catch (err) {
      handles.logger.error({ err }, 'shutdown: error during drain');
      clearTimeout(deadline);
      process.exit(1);
    }
  };
  process.on('SIGTERM', () => void shutdown('SIGTERM'));
  process.on('SIGINT', () => void shutdown('SIGINT'));
  process.on('SIGHUP', () => void shutdown('SIGHUP'));
}
```

### Unhandled promise / uncaught exception

```ts
process.on('unhandledRejection', (reason) => {
  logger.fatal({ reason }, 'unhandledRejection');
  process.exit(1);
});
process.on('uncaughtException', (err) => {
  logger.fatal({ err }, 'uncaughtException');
  process.exit(1);
});
```

Crashing the process on either is the right move — the orchestrator restarts, devices reconnect, no harm done. The wrong move is to log and continue; that hides real bugs.

ESLint's `no-floating-promises` (added in task 1.1) is the first line of defense; these handlers are the safety net.

### Per-socket error handling

In the session loop:

- Errors from `BufferedReader` / `frame.ts` / codec parsers: log at `warn` with `imei`, drop the socket.
- Errors from `ctx.publish` (specifically `PublishOverflowError`): skip the ACK, continue reading. Device retransmits.
- Errors from `ctx.publish` (other, unexpected): log at `error`, drop the socket. Open question: should we crash the process? Recommendation: drop the socket only; let the publisher's own logic decide whether the underlying issue (e.g. Redis hang) warrants process exit.
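
That dispatch can be sketched as a pure classification step. `handleSessionError` is a hypothetical helper, not named in the deliverables; the error classes mirror the ones `src/core/errors.ts` is supposed to define:

```typescript
// Illustrative sketch of the per-socket error policy described above.
// The error classes stand in for the real ones from src/core/errors.ts.
class FrameError extends Error {}
class PublishOverflowError extends Error {}

type SessionAction = 'drop_socket' | 'skip_ack_continue';

function handleSessionError(err: unknown): SessionAction {
  if (err instanceof PublishOverflowError) {
    // Publish queue is full: skip the ACK so the device retransmits.
    return 'skip_ack_continue';
  }
  if (err instanceof FrameError) {
    // Malformed frame: caller logs at warn with imei, then drops the socket.
    return 'drop_socket';
  }
  // Unexpected error: drop the socket only; the publisher decides
  // whether the underlying condition warrants process exit.
  return 'drop_socket';
}
```

Keeping this as a pure function makes the policy unit-testable without sockets or Redis in the loop.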

### Resource limits

- **Max concurrent connections per instance:** soft cap via gauge alert (`teltonika_connections_active > 5000`). No hard cap in code — let the OS-level fd limit be the real ceiling.
- **Per-connection memory:** the `BufferedReader` buffer is bounded by `MAX_AVL_PACKET_SIZE` (~1.3KB) per session. With 5,000 connections, ~6.5MB of buffer state — fine.
- **Node heap:** set via `NODE_OPTIONS=--max-old-space-size=512` in the Dockerfile or compose. 512MB is plenty for this workload.

### Logging discipline (audit pass)

Before declaring this task done, walk through every `logger.*` call site and confirm:

- `info`: lifecycle events (startup, shutdown, server bound).
- `warn`: recoverable per-frame issues (CRC fail, malformed handshake), per-connection drops.
- `error`: per-publish failures, unexpected per-session errors.
- `fatal`: process-killing conditions (Redis unreachable for >X seconds, `unhandledRejection`).
- `debug`: per-frame parse details, per-record publish details.
- No `console.log` anywhere in production paths. If there are any, replace.

### OPERATIONS.md outline

```
# tcp-ingestion — Operations

## Configuration
[table of env vars from task 1.3]

## Signals
| Signal | Effect |
|--------|--------|
| SIGTERM | Graceful shutdown (drain publish queue, close connections, exit 0) |
| SIGINT | Same as SIGTERM |
| SIGHUP | Same as SIGTERM |

## Metrics
[table of metrics from task 1.10]

## Alerts (recommended)
- `teltonika_unknown_codec_total > 0` for 5 min: investigate codec coverage drift.
- `teltonika_publish_overflow_total > 0` for 1 min: Redis or downstream backed up.
- `rate(teltonika_frames_total{result="crc_fail"}[5m]) / rate(teltonika_frames_total[5m]) > 0.01`: high CRC error rate, suspect device firmware or line quality.
- `teltonika_connections_active{instance=...} == 0` for 10 min while peer instances have traffic: instance is silently broken; investigate.

## Troubleshooting
- "Devices not connecting" → check TCP_PORT firewall, /readyz response, Redis connectivity.
- "Records not appearing in Redis" → check publish queue depth metric, then Redis connectivity.
- "High CRC failures from one IMEI" → likely a firmware bug or bad cellular link; coordinate with device fleet ops.
```

## Acceptance criteria

- [ ] SIGTERM during steady-state traffic results in a clean exit with no data loss (verified by killing the process and confirming the publish queue drained, no `PublishOverflowError` in the last second of logs).
- [ ] SIGTERM under publish-queue-overflow conditions still exits within `SHUTDOWN_TIMEOUT_MS`.
- [ ] An `unhandledRejection` (intentionally injected via test) logs at fatal and exits non-zero.
- [ ] OPERATIONS.md is populated and accurate; an on-call engineer could read it cold and find the answer to "what does this metric mean."
- [ ] All log calls audited; no `console.log` in production paths.

## Risks / open questions

- The "drain publish queue with timeout" balance: too long blocks deployments; too short loses records on shutdown. Default 10s is a reasonable starting point; tune after real production data.
- Crashing on `unhandledRejection` is opinionated. Some teams prefer to log and continue. We choose crash because the alternative hides bugs and we have a fast restart path. Document the choice.

## Done

(Fill in once complete.)

# Task 1.13 — Device authority (Redis allow-list refresher)

**Phase:** 1 — Inbound telemetry
**Status:** ⬜ Not started (deferrable — can ship after the rest of Phase 1)
**Depends on:** 1.4 (DeviceAuthority seam), 1.10 (metrics)
**Wiki refs:** `docs/wiki/concepts/plane-separation.md`, `docs/wiki/entities/directus.md`, `docs/wiki/entities/redis-streams.md`

## Goal

Provide a real `DeviceAuthority` implementation that classifies an IMEI as `known` or `unknown` by consulting an allow-list **published from Directus into Redis** and cached in-memory in each Ingestion instance. This is the operational link between the business plane (where the source-of-truth `devices` collection lives) and the telemetry plane (where Ingestion makes its handshake decisions).

## Non-goals

- Not a security boundary. Real device security is network-level + downstream filtering. This list is a **soft signal** for observability and (optionally) a hard reject under `STRICT_DEVICE_AUTH`.
- Not a real-time check. The list is cached locally with periodic refresh; new device provisioning takes effect within the refresh interval.

## Deliverables

- `src/adapters/teltonika/redis-allow-list-authority.ts`:
  - `RedisAllowListAuthority` implementing `DeviceAuthority`.
  - In-memory `Set<string>` of allowed IMEIs.
  - Refresh worker that pulls from Redis on a configurable cadence.
  - `start()` runs an initial fetch to completion (so the cache is warm before the TCP listener accepts) and then starts the periodic refresh.
  - `stop()` halts the refresh ticker.
- `src/main.ts` updated:
  - Read `DEVICE_AUTHORITY_MODE` env var (`allow_all` | `redis_allow_list`, default `allow_all`).
  - Construct the appropriate authority and pass it into the adapter context.
- Documentation in `OPERATIONS.md` (task 1.12) — section "Device authority" describing the env vars, refresh cadence, and Directus contract.

## Specification

### Redis contract

The Ingestion side reads from a single Redis key. Two viable shapes; pick one and stick with it.

**Option 1: Redis Set.** Simple, idiomatic for membership checks.

```
SADD devices:allowed <imei1> <imei2> ...
SMEMBERS devices:allowed          # what the refresher reads
SISMEMBER devices:allowed <imei>  # what an on-demand check would do (we do not use this; we cache)
```

**Option 2: Redis Hash with metadata per device.** Useful if downstream wants more than membership (e.g. device model, firmware version, owner).

```
HSET devices:allowed <imei> '{"model":"FMB920","fw":"03.27"}'
HGETALL devices:allowed
```

**Recommendation: Option 1 (Set).** Membership is the only signal Ingestion uses; metadata belongs in Directus where it's queryable. If a future task needs metadata in Ingestion, switch to Option 2.

### Directus → Redis sync (out of scope for this task)

This task implements the **Ingestion-side reader**. The Directus-side publisher is a separate piece of work in the Directus repo:

- A `devices` collection in Directus with at least `imei`, `active` fields.
- A Directus Flow or hook that, on `items.create | items.update | items.delete` of `devices`, updates the Redis Set:
  - Active inserted/updated → `SADD devices:allowed <imei>`.
  - Deleted or `active=false` → `SREM devices:allowed <imei>`.
- A periodic full-resync (e.g. nightly cron) that snapshots the collection into Redis to recover from any drift: `DEL devices:allowed && SADD devices:allowed <imei1> ... <imeiN>`.

Document this contract in the Ingestion repo's `OPERATIONS.md` so on-call understands the dependency, but the implementation lives in Directus.
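
One caveat worth noting for the Directus-side implementer: a literal `DEL` followed by `SADD` leaves a brief window where the Set is empty or partial, and a refresher that reads during that window sees a shrunken list. A resync that writes into a temporary key and swaps it with `RENAME` avoids the window. The sketch below only builds the command sequence; the `resyncCommands` helper and the `:resync` temp-key suffix are illustrative, not part of the task spec:

```typescript
// Illustrative: build the command sequence for a window-free full resync.
// RENAME atomically replaces the live key, so readers see either the old
// snapshot or the new one, never a half-written set.
// Assumes imeis is non-empty (SADD with no members is an error).
function resyncCommands(key: string, imeis: string[]): string[][] {
  const tmp = `${key}:resync`;
  return [
    ['DEL', tmp],            // start from a clean temp key
    ['SADD', tmp, ...imeis], // load the full snapshot
    ['RENAME', tmp, key],    // atomic swap over the live key
  ];
}
```

With ioredis, the returned array could be fed to `redis.multi(...)` so the three commands execute as one transaction; that detail belongs in the Directus-side task.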

### Refresh strategy

```ts
class RedisAllowListAuthority implements DeviceAuthority {
  private cache = new Set<string>();
  private timer?: NodeJS.Timeout;

  constructor(
    private redis: Redis,
    private logger: Logger,
    private metrics: Metrics,
    private key: string = 'devices:allowed',
    private intervalMs: number = 30_000,
  ) {}

  async start(): Promise<void> {
    await this.refresh(); // initial load completes before the TCP listener is up
    this.timer = setInterval(() => {
      this.refresh().catch((err) => this.logger.warn({ err }, 'allow-list refresh failed'));
    }, this.intervalMs);
  }

  stop(): void {
    if (this.timer) clearInterval(this.timer);
  }

  async check(imei: string): Promise<'known' | 'unknown'> {
    return this.cache.has(imei) ? 'known' : 'unknown';
  }

  private async refresh(): Promise<void> {
    const start = process.hrtime.bigint();
    const members = await this.redis.smembers(this.key);
    this.cache = new Set(members);
    const ms = Number(process.hrtime.bigint() - start) / 1e6;
    this.metrics.allowListRefresh.observe(ms / 1000);
    this.metrics.allowListSize.set(this.cache.size);
    this.logger.debug({ size: this.cache.size, took_ms: ms }, 'allow-list refreshed');
  }
}
```

### Failure modes

- **Redis unavailable at startup.** `start()` throws → process exits non-zero → orchestrator restarts. Loud failure, easy to alert. Operators may opt to fall back to `allow_all` via env var change.
- **Redis unavailable mid-flight.** `refresh` fails; the cache stays at last-known-good. `check` keeps working off the stale cache. Log warn; metric for refresh failures. Eventually the cache is "stale forever" if Redis never recovers — that's fine because telemetry is still flowing.
- **Empty allow-list.** A bug or misconfiguration in Directus could publish an empty Set. The Ingestion side will then mark every device as `unknown`. With `STRICT_DEVICE_AUTH=false` (default), this is a visibility problem (alert-worthy) but not a service outage. With `STRICT_DEVICE_AUTH=true`, the entire fleet would be rejected — bad. Add a safety: refuse to apply a refresh result of size 0 unless `ALLOW_EMPTY_ALLOW_LIST=true` is set explicitly. Log error; keep the previous cache.
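
That safety fits in a small pure guard in front of the cache swap inside `refresh()`. The `applyRefresh` helper below is hypothetical, not named in the deliverables:

```typescript
// Hypothetical guard implementing the empty-refresh safety described above:
// a zero-size result is rejected (previous cache kept) unless explicitly
// allowed via ALLOW_EMPTY_ALLOW_LIST.
function applyRefresh(
  members: string[],
  previous: Set<string>,
  allowEmpty: boolean,
): { cache: Set<string>; rejected: boolean } {
  if (members.length === 0 && !allowEmpty) {
    // Keep last-known-good; the caller logs an error and increments
    // teltonika_allow_list_refresh_failures_total{reason="empty_rejected"}.
    return { cache: previous, rejected: true };
  }
  return { cache: new Set(members), rejected: false };
}
```

Keeping the guard pure makes the acceptance criterion "empty refresh rejected unless `ALLOW_EMPTY_ALLOW_LIST=true`" testable without Redis.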

### Configuration

Add to the env schema (task 1.3):

```ts
DEVICE_AUTHORITY_MODE: z.enum(['allow_all', 'redis_allow_list']).default('allow_all'),
DEVICE_ALLOW_LIST_KEY: z.string().default('devices:allowed'),
DEVICE_ALLOW_LIST_REFRESH_MS: z.coerce.number().int().min(1000).default(30_000),
// caution: z.coerce.boolean() treats any non-empty string (including "false")
// as true; a string-based preprocessor is safer for env vars
STRICT_DEVICE_AUTH: z.coerce.boolean().default(false),
ALLOW_EMPTY_ALLOW_LIST: z.coerce.boolean().default(false),
```

### Metrics

Add to task 1.10's inventory:

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `teltonika_allow_list_size` | gauge | — | Number of IMEIs in the local cache. Sudden drops are alert-worthy. |
| `teltonika_allow_list_refresh_duration_seconds` | histogram | — | Time to refresh from Redis. |
| `teltonika_allow_list_refresh_failures_total` | counter | `reason` | Refresh attempts that failed (network, empty-rejected, etc.). |

## Acceptance criteria

- [ ] With `DEVICE_AUTHORITY_MODE=allow_all`, behavior is identical to Phase 1 default — every IMEI is `known`.
- [ ] With `DEVICE_AUTHORITY_MODE=redis_allow_list` and a populated Redis Set, `check(imei)` returns `'known'` for members and `'unknown'` for non-members.
- [ ] Initial load happens before the TCP listener accepts connections.
- [ ] Refresh runs every `DEVICE_ALLOW_LIST_REFRESH_MS` and updates the cache.
- [ ] Empty allow-list refresh is rejected (cache preserved) unless `ALLOW_EMPTY_ALLOW_LIST=true`; metric increments with `reason=empty_rejected`.
- [ ] Mid-flight Redis outage does not crash the service; subsequent successful refresh restores the cache.
- [ ] `teltonika_allow_list_size` and `teltonika_allow_list_refresh_duration_seconds` appear in `/metrics`.
- [ ] `STRICT_DEVICE_AUTH=true` combined with `redis_allow_list` causes `0x00` rejection of unknown IMEIs (verified by integration test).

## Risks / open questions

- **Provisioning lag.** A newly added device waits up to `DEVICE_ALLOW_LIST_REFRESH_MS` before being recognized. Default 30s is fine for most ops; tune down to 5s if the team has a workflow where they provision and immediately expect the device to be `known`.
- **Cache size.** A Set of 100k IMEIs is ~6MB in memory — fine. At 1M+ devices, consider a Bloom filter + Redis fallback for misses, or split into shards. Not a near-term concern.
- **Drift between Directus and Redis.** Hooks-based sync can miss updates if Directus has an issue mid-write. The nightly full-resync cron mitigates. Discussed in the Directus-side task (out of repo scope here).
- **Should `STRICT_DEVICE_AUTH` be observable?** Yes — log at info on startup which mode the authority is in, so operators can verify config without reading env vars.

## Done

(Fill in once complete.)

# Phase 1 — Inbound telemetry

Implement a Node.js TCP server that ingests Teltonika telemetry over codecs 8, 8E, and 16; publishes normalized `Position` records to a Redis Stream; and ships with the operational baseline (Prometheus metrics, fixture-based tests, Dockerfile, Gitea CI/CD pipeline).

## Outcome statement

When Phase 1 is done:

- Devices in the deployed FMB/FMC/FMM/FMU fleet connect to a known TCP port, complete the IMEI handshake, and stream AVL frames.
- Every well-formed AVL record produces exactly one `Position` JSON entry on the `telemetry:teltonika` Redis Stream, with all GPS fields and IO element bag intact.
- CRC-mismatched frames are dropped (no ACK) so devices retransmit.
- Unknown-codec frames cause the connection to close with a structured `WARN` log entry; a Prometheus counter increments.
- **Device authority is observable but permissive by default** — every handshake is labeled `known` or `unknown` based on a configurable `DeviceAuthority`; the Phase 1 default `AllowAllAuthority` accepts everything, and an opt-in `RedisAllowListAuthority` (task 1.13) reads a Directus-published allow-list from Redis. Strict reject-on-unknown is gated behind a `STRICT_DEVICE_AUTH` flag.
- The service builds reproducibly via a Gitea Actions workflow, publishing a Docker image to the project's Gitea Container Registry, tagged by branch + git SHA.
- Tests cover every codec parser using hex captures sourced from the canonical Teltonika doc, with at least one synthetic edge-case fixture per codec.

## Sequencing

```
1.1 Project scaffold
├─→ 1.2 Core shell & framing types
│   ├─→ 1.3 Configuration & logging
│   ├─→ 1.4 Teltonika framing layer (incl. DeviceAuthority seam)
│   │   ├─→ 1.5 Codec 8 parser
│   │   ├─→ 1.6 Codec 8 Extended parser
│   │   └─→ 1.7 Codec 16 parser
│   └─→ 1.8 Redis publisher & main wiring
│       └─→ 1.10 Observability
│           ├─→ 1.11 Dockerfile & CI
│           │   └─→ 1.12 Production hardening
│           └─→ 1.13 Device authority (opt-in, deferrable)
└─→ 1.9 Fixture suite (cross-cutting; established alongside 1.5)
```

Tasks 1.5, 1.6, 1.7 can be done in parallel after 1.4 lands. Task 1.9 (fixture infrastructure) should land *with or before* 1.5 — it's the framework the codec tasks add to. Task 1.13 is the only Phase 1 task that can ship *after* the rest of Phase 1 is in production — `AllowAllAuthority` is functional from day one; the Redis allow-list lights up once the Directus-side publisher exists.

## Files modified

Phase 1 produces this layout in `tcp-ingestion/`:

```
tcp-ingestion/
├── .gitea/workflows/build.yml
├── src/
│   ├── core/
│   │   ├── types.ts
│   │   ├── publish.ts
│   │   ├── registry.ts
│   │   ├── session.ts
│   │   └── server.ts
│   ├── adapters/
│   │   └── teltonika/
│   │       ├── index.ts
│   │       ├── handshake.ts
│   │       ├── frame.ts
│   │       ├── crc.ts
│   │       ├── device-authority.ts            (interface + AllowAllAuthority)
│   │       ├── redis-allow-list-authority.ts  (task 1.13, opt-in)
│   │       └── codec/
│   │           ├── data/
│   │           │   ├── codec8.ts
│   │           │   ├── codec8e.ts
│   │           │   └── codec16.ts
│   │           └── command/                   (empty in Phase 1)
│   ├── config/load.ts
│   ├── observability/
│   │   ├── logger.ts
│   │   └── metrics.ts
│   └── main.ts
├── test/
│   ├── fixtures/teltonika/
│   │   ├── codec8/
│   │   ├── codec8e/
│   │   └── codec16/
│   ├── codec8.test.ts
│   ├── codec8e.test.ts
│   ├── codec16.test.ts
│   ├── crc.test.ts
│   └── frame.test.ts
├── Dockerfile
├── package.json
├── pnpm-lock.yaml
├── tsconfig.json
├── .dockerignore
├── .gitignore
├── .prettierrc
├── eslint.config.js
└── README.md
```

## Tech stack (decided)

- **Node.js 22 LTS**, ESM-only.
- **TypeScript 5.x** with `strict: true`.
- **pnpm** for dependency management (deterministic, fast, easy to add workspaces later if needed).
- **vitest** for tests.
- **pino** for structured logging.
- **prom-client** for Prometheus metrics.
- **ioredis** for Redis Streams.
- **zod** for environment-variable validation.

If an implementer wants to deviate, they must update the relevant task file first.