---
title: Redis Streams
type: entity
created: 2026-04-30
updated: 2026-04-30
sources: [gps-tracking-architecture, teltonika-ingestion-architecture]
tags: [infrastructure, telemetry-plane, queue]
---

# Redis Streams

The durable in-flight queue between [[tcp-ingestion]] and [[processor]]. Also the transport for Phase 2 outbound commands.

## What it provides

- **Buffering** — temporary slowness in [[processor]] does not push back on Ingestion sockets.
- **Replayability** — Streams retain messages, so a Processor crash does not lose telemetry; consumer-group offsets resume from the last position.
- **Horizontal scaling** — multiple Processor instances join a consumer group and split load across device IDs.
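
The three properties above map onto a small consumer-group loop. What follows is a minimal sketch, not the project's Processor code. It assumes a redis-py style client exposing `xreadgroup` and `xack`; the stream name `telemetry`, the group name `processors`, and the `handle` callback are hypothetical, since the source does not name them.

```python
from typing import Any, Callable, Dict

# Hypothetical names: the source does not specify the telemetry stream or group.
STREAM = "telemetry"
GROUP = "processors"


def consume_once(client: Any, consumer: str,
                 handle: Callable[[Dict], None],
                 start_id: str = ">", count: int = 100) -> int:
    """Read one batch for `consumer` in GROUP, handle it, and ack each entry.

    start_id=">" asks for entries never delivered to this group.
    start_id="0" re-reads this consumer's own pending entries (delivered
    but not yet acknowledged), which is how work resumes after a crash.
    Returns the number of entries acknowledged.
    """
    # redis-py reply shape: [[stream_name, [(entry_id, fields), ...]], ...]
    reply = client.xreadgroup(GROUP, consumer, {STREAM: start_id}, count=count)
    acked = 0
    for _stream, entries in reply or []:
        for entry_id, fields in entries:
            handle(fields)  # if this raises, the entry is not acked and stays pending
            client.xack(STREAM, GROUP, entry_id)
            acked += 1
    return acked


def run_after_restart(client: Any, consumer: str,
                      handle: Callable[[Dict], None]) -> int:
    """Drain this consumer's pending entries, then read one batch of new ones."""
    total = 0
    while True:
        drained = consume_once(client, consumer, handle, start_id="0")
        if drained == 0:
            break
        total += drained
    return total + consume_once(client, consumer, handle, start_id=">")
```

Each Processor instance would call this with its own consumer name; Redis delivers a given entry to only one consumer in the group, and anything left unacknowledged remains pending until it is acked or claimed.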

## Why Redis (and not Kafka/NATS)

Redis Streams is sufficient at current scale and adds minimal operational burden. NATS or Kafka are reasonable upgrades once **multi-region durability** or **very high throughput** become real concerns. Until then, Redis is the right choice.

## Phase 2 usage

Outbound commands ride on per-instance streams: `commands:outbound:{instance_id}`. Responses ride on `commands:responses`. Redis is the transport; the source of truth for commands is the Directus `commands` collection. See [[phase-2-commands]].

The connection registry (`connections:registry` hash) and per-instance heartbeats (`instance:heartbeat:{instance_id}` keys with `EX 90`) also live in Redis.
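
A rough sketch of how these keys could fit together when routing a command to the instance that holds a device's socket. It assumes a redis-py style client. Beyond what the text states, it also assumes that the registry hash maps a device ID to an instance ID and that a command entry carries `command_id` and `payload` fields; both are illustrative guesses, not the documented schema.

```python
from typing import Any, Optional

REGISTRY_KEY = "connections:registry"
HEARTBEAT_TTL_SECONDS = 90  # matches the `EX 90` on heartbeat keys


def outbound_stream(instance_id: str) -> str:
    return f"commands:outbound:{instance_id}"


def heartbeat_key(instance_id: str) -> str:
    return f"instance:heartbeat:{instance_id}"


def beat(client: Any, instance_id: str) -> None:
    """Refresh this instance's heartbeat; the key expires if beats stop."""
    client.set(heartbeat_key(instance_id), "1", ex=HEARTBEAT_TTL_SECONDS)


def route_command(client: Any, device_id: str,
                  command_id: str, payload: str) -> Optional[str]:
    """Enqueue a command on the owning instance's outbound stream.

    Returns the stream entry ID, or None when the device has no registered
    connection or the owning instance's heartbeat has expired.
    Assumption: registry hash field = device ID, value = instance ID.
    """
    instance_id = client.hget(REGISTRY_KEY, device_id)
    if instance_id is None:
        return None
    if isinstance(instance_id, bytes):  # redis-py returns bytes by default
        instance_id = instance_id.decode()
    if not client.exists(heartbeat_key(instance_id)):
        return None
    # Hypothetical field names; the real entry schema is not given in the source.
    return client.xadd(outbound_stream(instance_id),
                       {"command_id": command_id, "payload": payload})
```

A `None` result here only means nothing was enqueued; the command record itself would still live in the Directus `commands` collection, which remains the source of truth.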

## Failure mode

Streams are persisted; restart resumes from disk. Complete Redis loss is recoverable from device retransmits and Processor checkpointing. See [[failure-domains]].

## Operational note

**Consumer lag is the canary metric** for the entire telemetry pipeline. Observability dashboards should make it prominent.
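
One way to derive that metric is from `XINFO GROUPS`, which on Redis 7.0 and later reports a per-group `lag` (entries not yet delivered to the group) alongside `pending` (delivered but unacknowledged). A minimal sketch, assuming the reply has already been fetched as a list of dicts with string values (for example via redis-py's `xinfo_groups` with `decode_responses=True`); the threshold is a hypothetical alerting parameter, not a value from the source.

```python
from typing import Dict, Iterable, List, Optional


def group_backlog(group: Dict) -> Optional[int]:
    """Undelivered plus unacknowledged entries for one consumer group.

    Returns None when `lag` is unavailable: Redis reports it as null in
    some cases, and the field is absent before Redis 7.0.
    """
    lag = group.get("lag")
    if lag is None:
        return None
    return int(lag) + int(group.get("pending", 0))


def lagging_groups(groups: Iterable[Dict], threshold: int) -> List[str]:
    """Names of groups whose backlog exceeds `threshold` or cannot be measured."""
    flagged = []
    for group in groups:
        backlog = group_backlog(group)
        if backlog is None or backlog > threshold:
            flagged.append(group["name"])
    return flagged
```

Treating an unmeasurable backlog as a warning rather than as zero is a deliberate choice in this sketch: a dashboard that silently shows no lag would defeat the purpose of a canary.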

## Deployment

Internal-only container. Persistence enabled. Never exposed externally.