# Task 1.6 — Per-device in-memory state

**Phase:** 1 — Throughput pipeline
**Status:** 🟩 Done
**Depends on:** 1.2
**Wiki refs:** `docs/wiki/entities/processor.md` (§ State management)

## Goal

Maintain a bounded `Map` updated on every accepted Position. Phase 1 only stores trivial state — `last_position`, `last_seen`, `position_count_session` — but the structure is built so Phase 2 (geofence accumulators, time-since-last-checkpoint, etc.) can extend it cleanly.

## Deliverables

- `src/core/state.ts` exporting:
  - `createDeviceStateStore(config, logger): DeviceStateStore` — factory.
  - `DeviceStateStore` interface:
    - `update(position: Position): DeviceState` — applies the position, returns the new state. Touches LRU order.
    - `get(device_id: string): DeviceState | undefined` — read without touching LRU order. (Used for diagnostics; the hot path uses `update`.)
    - `size(): number` — for metrics.
    - `evictedTotal(): number` — for metrics.
- `test/state.test.ts` covering:
  - First update for a new device creates the entry; subsequent updates increment `position_count_session`.
  - LRU eviction: with cap=3, after 4 distinct devices, the least-recently-updated is evicted.
  - Eviction increments `evictedTotal()`.
  - `last_seen` reflects the position's `timestamp` (the device-reported time), not the wall clock at update time.
  - Out-of-order positions (a position with `timestamp` older than `last_seen`) are still applied (we don't drop them), but `last_seen` only advances forward — i.e. `last_seen = max(prev_last_seen, position.timestamp)`. Document the rationale.

## Specification

### LRU implementation

Use a plain `Map`. JavaScript `Map` preserves insertion order, and we exploit it: on every `update`, `delete` then `set` the entry — that bumps it to the most recent position in iteration order. When `size() > cap`, take `keys().next().value` (the oldest) and `delete` it. This is O(1) per update and avoids a third-party LRU dependency.
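The `delete`-then-`set` bump and cap eviction can be sketched as follows. This is a minimal illustration, not the deliverable: `Position` and `DeviceState` are simplified placeholders, and the real factory takes `config` and `logger` rather than a bare cap.

```ts
// Simplified placeholder shapes; the real types live in the Phase 1 codebase.
type Position = { device_id: string; timestamp: Date };

type DeviceState = {
  device_id: string;
  last_position: Position;
  last_seen: Date;
  position_count_session: number;
};

// Sketch of the Map-as-LRU trick: insertion order doubles as recency order.
function createDeviceStateStore(cap: number) {
  const states = new Map<string, DeviceState>();
  let evicted = 0;

  return {
    update(position: Position): DeviceState {
      const prev = states.get(position.device_id);
      const next: DeviceState = {
        device_id: position.device_id,
        last_position: position,
        // last_seen only advances: max(prev_last_seen, position.timestamp)
        last_seen:
          prev !== undefined && prev.last_seen > position.timestamp
            ? prev.last_seen
            : position.timestamp,
        position_count_session: (prev?.position_count_session ?? 0) + 1,
      };
      // delete + set moves the entry to the back of Map iteration order.
      states.delete(position.device_id);
      states.set(position.device_id, next);
      if (states.size > cap) {
        // First key in iteration order = least recently updated.
        const oldest = states.keys().next().value!;
        states.delete(oldest);
        evicted += 1;
      }
      return next;
    },
    // Plain read: does not touch recency order.
    get: (device_id: string) => states.get(device_id),
    size: () => states.size,
    evictedTotal: () => evicted,
  };
}
```

Note that `get` deliberately skips the delete+set bump, so diagnostic reads never change which entry is evicted next.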
**Do not** introduce `lru-cache` — the standard `Map` trick is sufficient for Phase 1's needs.

### Why `last_seen = max(...)`, not `last_seen = position.timestamp`

Devices buffer records when offline and replay them in bursts (we observed a 55-record buffer flush on stage). Within a single batch, timestamps may *decrease* between consecutive records if the device sorted them oddly. We want `last_seen` to mean "highest device timestamp seen so far for this device" — that's what downstream consumers want.

### What about restart?

On Processor restart, the in-memory state is empty. The first record from any device creates a fresh `DeviceState`. **Phase 1 accepts this** — it's a recovery path, not a hot path, and Phase 1 has no domain logic that would be wrong without rehydrated state.

Phase 3 (production hardening) adds rehydration: on first packet for an unknown device, query `positions WHERE device_id = $1 ORDER BY ts DESC LIMIT 1` to seed `last_position`. That's a Phase 3 task, not Phase 1.

### What state lives here, what doesn't

In Phase 1 the state is intentionally minimal:

```ts
type DeviceState = {
  device_id: string;
  last_position: Position;
  last_seen: Date;                // = max(prev, position.timestamp)
  position_count_session: number; // resets on restart
};
```

**Not in Phase 1:**

- Geofence membership (Phase 2)
- Distance accumulators (Phase 2)
- Time-in-stage (Phase 2)
- Anything that would be wrong if dropped on restart (Phase 3 + rehydration)

The interface is built to extend: Phase 2 may add fields, but the existing fields and method signatures should not change.

## Acceptance criteria

- [ ] `pnpm typecheck`, `pnpm lint`, `pnpm test` clean.
- [ ] LRU cap from `DEVICE_STATE_LRU_CAP` config is respected.
- [ ] `evictedTotal()` increments correctly under eviction.
- [ ] `last_seen` does not regress on out-of-order timestamps.

## Risks / open questions

- **Cap sizing.** Default `DEVICE_STATE_LRU_CAP=10000`.
  At 1 KB per state entry, that's 10 MB of resident memory — fine. Operators with unusually large fleets can raise it; the bound exists to prevent runaway growth from misbehaving devices flooding novel `device_id` values.
- **No mutex.** State is updated only from the consumer loop, which is single-threaded. If Phase 2 introduces parallel sinks, revisit with proper synchronization.

## Done

`src/core/state.ts` — LRU Map using the delete+set bump trick, `last_seen = max(prev, position.timestamp)` semantics, `evictedTotal()` counter.

`test/state.test.ts` — 14 tests covering new-device creation, session-counter increment, LRU eviction at cap, LRU re-touch, `evictedTotal()`, out-of-order timestamp handling (positions applied, `last_seen` not regressed), and `get`/`size`.

Landed in `68d3da3`.
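The burst-replay scenario behind the `last_seen = max(...)` rule can be shown in isolation. The `advance` helper below is hypothetical, for illustration only — it is not part of `src/core/state.ts` — and the three-record burst is a shortened stand-in for the 55-record flush observed on stage.

```ts
// Hypothetical one-line reducer expressing last_seen = max(prev, ts).
const advance = (prev: Date | undefined, ts: Date): Date =>
  prev !== undefined && prev > ts ? prev : ts;

// A buffered flush replayed with one out-of-order record (shortened stand-in).
const burst = [1_000, 3_000, 2_000].map((ms) => new Date(ms));

let lastSeen: Date | undefined;
for (const ts of burst) lastSeen = advance(lastSeen, ts);

// lastSeen ends at t=3000 — the highest device timestamp seen so far —
// even though the final record carried t=2000. Every record is still applied.
```

This is exactly the property the acceptance criterion "`last_seen` does not regress on out-of-order timestamps" checks.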