Realign processor stream-name default to telemetry:teltonika
Stage discovered the wrong default at runtime: tcp-ingestion's compiled default `REDIS_TELEMETRY_STREAM` is `telemetry:teltonika`, but processor's was `telemetry:t`, so the two services were talking past each other: tcp-ingestion publishing to one stream, processor reading another, empty one. The deploy stack now pins both to the same value via a shared env var, but the processor's compiled default should also match so local development and the integration test stay aligned with reality.

Changes:

- `src/config/load.ts`: default changed to `telemetry:teltonika`
- `.env.example`: same
- `test/config.test.ts`: default-value assertion updated
- planning docs (ROADMAP, phase-1 README, tasks 03/08/10, phase-3 README): occurrences of `telemetry:t` replaced with `telemetry:teltonika`

The deploy stack remains the single source of truth via the shared `REDIS_TELEMETRY_STREAM` env var; compiled defaults are belt-and-braces.
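A minimal sketch of what the realigned default could look like in `src/config/load.ts`; the `loadConfig` shape and the `telemetryStream` field name are illustrative assumptions, not the repo's actual API:

```typescript
// Hypothetical shape of the processor's config loader: the compiled
// default now matches tcp-ingestion's 'telemetry:teltonika', and the
// shared REDIS_TELEMETRY_STREAM env var still wins when set.
interface ProcessorConfig {
  telemetryStream: string;
}

function loadConfig(env: Record<string, string | undefined>): ProcessorConfig {
  return {
    // Deploy stack is the source of truth; this default is belt-and-braces.
    telemetryStream: env.REDIS_TELEMETRY_STREAM ?? 'telemetry:teltonika',
  };
}
```

With a default like this, the assertion in `test/config.test.ts` checks `telemetry:teltonika` rather than `telemetry:t`.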
@@ -11,7 +11,7 @@ When Phase 3 is done:
- **Graceful shutdown** with bounded in-flight drain: SIGTERM blocks new reads, awaits in-flight writes, ACKs anything still in PEL whose write succeeded, exits clean.
- **State rehydration on restart**: on first packet for an unknown device, the Processor queries Postgres for the device's `last_position` and seeds `DeviceState` accordingly. Phase 2 accumulators get the same treatment (e.g. last geofence membership comes from the last `timing_records` row).
- **`XAUTOCLAIM` for stuck pending entries**: at startup and on a cadence, the Processor claims entries that have been pending in another consumer's PEL for longer than `CLAIM_THRESHOLD_MS`. Lets a dead instance's work get picked up by survivors without manual intervention.
- - **Dead-letter stream for poison records**: records that fail to decode N times go to `telemetry:t:dlq` with the original payload + the error. Operators can inspect, fix, replay.
+ - **Dead-letter stream for poison records**: records that fail to decode N times go to `telemetry:teltonika:dlq` with the original payload + the error. Operators can inspect, fix, replay.
- **Multi-instance load split verified**: spinning up two Processor instances against the same consumer group splits the work evenly. End-to-end test in CI (or at least a manual playbook).
- **Migration safety with multiple instances**: Postgres advisory locks around the migration runner so two instances starting simultaneously don't race.
- **Uncaught exception / unhandled rejection handlers**: log, flush in-memory state to a panic dump file, exit with a code Portainer treats as restart-worthy.
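The dead-letter rule in the hunk above (after N failed decodes, the record goes to `telemetry:teltonika:dlq` with the original payload plus the error, and the main-stream entry is ACKed) can be sketched as pure routing logic. The names below (`routeFailedRecord`, `MAX_DECODE_ATTEMPTS`, the field layout) are hypothetical, not the repo's actual code:

```typescript
// Hypothetical poison-record router: below the attempt threshold the entry
// stays pending for redelivery; at or above it, the record is destined for
// the DLQ stream and the main-stream entry can be ACKed.
const DLQ_STREAM = 'telemetry:teltonika:dlq';
const MAX_DECODE_ATTEMPTS = 3; // illustrative value for N

interface DlqDecision {
  action: 'retry' | 'dead-letter';
  stream?: string;
  fields?: Record<string, string>;
}

function routeFailedRecord(
  entryId: string,
  payload: string,
  error: string,
  attempts: number,
): DlqDecision {
  if (attempts < MAX_DECODE_ATTEMPTS) {
    // Leave the entry in the PEL; redelivery or XAUTOCLAIM retries it.
    return { action: 'retry' };
  }
  return {
    action: 'dead-letter',
    stream: DLQ_STREAM,
    // What an operator needs in order to inspect, fix, and replay.
    fields: { entryId, payload, error },
  };
}
```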
@@ -24,7 +24,7 @@ When Phase 3 is done:
| 3.1 | Graceful shutdown — full | Replaces the Phase 1 stub. Drain budget configurable. Tested end-to-end |
| 3.2 | Per-device state rehydration on first-packet | Single `SELECT ... LIMIT 1` per cold device. Memoized by LRU |
| 3.3 | `XAUTOCLAIM` runner | Periodic + on-startup. Claims entries pending > `CLAIM_THRESHOLD_MS`. Re-runs the sink |
- | 3.4 | Dead-letter stream | After N failed decodes/writes, record goes to `telemetry:t:dlq`; original ACKed off the main stream |
+ | 3.4 | Dead-letter stream | After N failed decodes/writes, record goes to `telemetry:teltonika:dlq`; original ACKed off the main stream |
| 3.5 | Migration advisory lock | `pg_advisory_lock(<hash>)` around the migrate runner; two instances can start simultaneously without racing |
| 3.6 | Uncaught exception / unhandled rejection handlers | Log, flush, exit. Match `tcp-ingestion`'s eventual Phase 1 task 1.12 work when that lands |
| 3.7 | OPERATIONS.md | The runbook |
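Task 3.1's bounded drain (SIGTERM blocks new reads, awaits in-flight writes within a configurable budget, exits clean) could be sketched roughly as below; `drainInFlight`, `withTimeout`, and the budget parameter are assumptions for illustration, not the actual implementation:

```typescript
// Hypothetical bounded-drain helper: race the set of in-flight write
// promises against a configurable drain budget, so shutdown never hangs
// on a stuck write. Entries whose writes finished can then be ACKed.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T | 'timeout'> {
  return Promise.race([
    p,
    new Promise<'timeout'>((resolve) => setTimeout(() => resolve('timeout'), ms)),
  ]);
}

async function drainInFlight(
  inFlight: Set<Promise<void>>,
  budgetMs: number,
): Promise<'drained' | 'timeout'> {
  const all = Promise.all(inFlight).then(() => undefined);
  const result = await withTimeout(all, budgetMs);
  return result === 'timeout' ? 'timeout' : 'drained';
}
```

A SIGTERM handler would first stop issuing `XREADGROUP` calls, then call something like `drainInFlight(pendingWrites, budgetMs)`, ACK what completed, and exit.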