Fix metric wiring gaps audited against live processor output

Several Phase 1 metrics were registered in observability/metrics.ts but
either never wired at the call sites or wired with wrong counts. In
production, the logs showed 11 records ingested while the metrics
reported only 4. The fixes below align metric values with actual
hot-path activity.

Wiring gaps closed (consumer.ts):
- processor_consumer_reads_total{result=ok|empty|error}: was registered
  but never inc'd. Now fires on each XREADGROUP outcome.
- processor_consumer_records_total: was registered but never inc'd.
  Now fires once per XREADGROUP, with the entry count.
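
A minimal sketch of the consumer.ts wiring described above, not the actual call site. The `Metrics` shape and the `recordReadOutcome` helper are hypothetical, and skipping the records counter on an empty read is an assumption; the commit only says it fires once per XREADGROUP with the entry count.

```typescript
// Hypothetical minimal shape of the Metrics interface after this commit.
interface Metrics {
  inc(name: string, labels?: Record<string, string>, value?: number): void;
}

// Records the outcome of one XREADGROUP call. `result` is either the entries
// returned by the read (possibly none) or the error the read failed with.
function recordReadOutcome(
  metrics: Metrics,
  result: readonly unknown[] | Error,
): void {
  if (result instanceof Error) {
    metrics.inc('processor_consumer_reads_total', { result: 'error' });
    return;
  }
  if (result.length === 0) {
    metrics.inc('processor_consumer_reads_total', { result: 'empty' });
    return;
  }
  metrics.inc('processor_consumer_reads_total', { result: 'ok' });
  // One call per read, carrying the whole batch size rather than 1.
  metrics.inc('processor_consumer_records_total', undefined, result.length);
}
```

Passing the entry count as the third argument keeps the hot path to a single counter call per read, whatever the batch size.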

Counts corrected (writer.ts):
- processor_position_writes_total{status}: was inc'd unconditionally
  by 1 per chunk for each of inserted/duplicate. Now inc'd by the
  actual per-status count, and only when count > 0.
- processor_position_writes_total{status='failed'}: was inc'd by 1
  per failed chunk. Now inc'd by chunk.length so every failed record
  is counted.
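
The writer.ts counting rules above could look roughly like this. The `ChunkResult` type and both helper names are hypothetical and stand in for whatever the writer actually returns per chunk; only the metric name, the status labels, and the counting rules come from this commit.

```typescript
// Hypothetical minimal shape of the Metrics interface after this commit.
interface Metrics {
  inc(name: string, labels?: Record<string, string>, value?: number): void;
}

// Hypothetical per-chunk write result: rows inserted vs. rows skipped
// because they already existed.
interface ChunkResult {
  inserted: number;
  duplicate: number;
}

// Successful chunk: one increment per status, sized by the real row count,
// and skipped entirely when that status saw no rows.
function recordChunkResult(metrics: Metrics, result: ChunkResult): void {
  if (result.inserted > 0) {
    metrics.inc('processor_position_writes_total', { status: 'inserted' }, result.inserted);
  }
  if (result.duplicate > 0) {
    metrics.inc('processor_position_writes_total', { status: 'duplicate' }, result.duplicate);
  }
}

// Failed chunk: every record in the chunk counts as failed, not just the chunk.
function recordChunkFailure(metrics: Metrics, chunk: readonly unknown[]): void {
  metrics.inc('processor_position_writes_total', { status: 'failed' }, chunk.length);
}
```

With the old behaviour, a chunk of 5 rows that produced 5 inserts and 0 duplicates would have added 1 to both `inserted` and `duplicate`; under the rules above it adds 5 to `inserted` and leaves `duplicate` alone.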

Counts corrected (main.ts):
- processor_acks_total: was inc'd by 1 per non-empty batch. Now
  inc'd by ackIds.length so every ACK'd ID is counted.
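
To make the size of the old undercount concrete, the following sketch compares the two counting rules on made-up batches. The function names and the stream IDs are illustrative only and do not come from main.ts.

```typescript
// Old rule per the commit message: +1 for every non-empty batch.
function acksCountedPerBatch(batches: readonly (readonly string[])[]): number {
  return batches.filter((ackIds) => ackIds.length > 0).length;
}

// New rule: each batch contributes ackIds.length.
function acksCountedPerId(batches: readonly (readonly string[])[]): number {
  return batches.reduce((total, ackIds) => total + ackIds.length, 0);
}

// Three loop iterations: two IDs, nothing to ACK, then four IDs.
const batches: string[][] = [['1-0', '1-1'], [], ['2-0', '2-1', '2-2', '2-3']];

console.log(acksCountedPerBatch(batches)); // 2
console.log(acksCountedPerId(batches)); // 6
```

The per-batch rule only matches the per-ID rule when every batch holds exactly one ID, so the gap grows with batch size.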

Wiring gap closed (state.ts):
- processor_device_state_evictions_total: internal `evicted` counter
  existed but was never published to metrics. createDeviceStateStore
  now accepts a Metrics injection and inc's on each eviction.
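
A sketch of the injection pattern described above. `createBoundedStateStore` is a hypothetical stand-in for createDeviceStateStore, whose real signature is not shown in this commit, and the size-bounded, oldest-first eviction policy is an assumption; the source only says that the store evicts.

```typescript
// Hypothetical minimal shape of the Metrics interface after this commit.
interface Metrics {
  inc(name: string, labels?: Record<string, string>, value?: number): void;
}

interface StateStore<S> {
  get(deviceId: string): S | undefined;
  set(deviceId: string, state: S): void;
  evictedCount(): number;
}

// Keeps at most `maxEntries` devices. The injected Metrics is called at the
// same point where the internal `evicted` counter is bumped.
function createBoundedStateStore<S>(maxEntries: number, metrics: Metrics): StateStore<S> {
  const entries = new Map<string, S>();
  let evicted = 0;
  return {
    get: (deviceId) => entries.get(deviceId),
    set(deviceId, state) {
      // Re-inserting moves the key to the end of the Map's insertion order.
      entries.delete(deviceId);
      entries.set(deviceId, state);
      while (entries.size > maxEntries) {
        const oldest = entries.keys().next().value;
        if (oldest === undefined) break;
        entries.delete(oldest);
        evicted += 1;
        metrics.inc('processor_device_state_evictions_total');
      }
    },
    evictedCount: () => evicted,
  };
}
```

Taking `metrics` as a parameter rather than importing a singleton lets unit tests pass a fake and assert on eviction counts directly.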

Metrics interface extended (types.ts, metrics.ts):
- Metrics.inc gained an optional third `value` parameter (defaults to 1)
  for batched increments. dispatchInc passes it through to prom-client's
  Counter.inc(labels, value).

Tests updated to reflect the new third arg and the state.ts factory's
new metrics parameter. Total: 134 unit tests passing (no count change:
existing tests were adjusted and no new tests added; the real
verification is on stage, where the metrics are now meaningful again).

@@ -71,8 +71,8 @@ export function createMetrics(): Metrics & {
   collectDefaultMetrics({ register: internal.registry });
 
   const metricsImpl: Metrics & { serializeMetrics: () => Promise<string> } = {
-    inc(name: string, labels?: Record<string, string>): void {
-      dispatchInc(internal, name, labels);
+    inc(name: string, labels?: Record<string, string>, value?: number): void {
+      dispatchInc(internal, name, labels, value);
     },
 
     observe(name: string, value: number, labels?: Record<string, string>): void {
@@ -398,25 +398,27 @@ function dispatchInc(
   r: InternalRegistry,
   name: string,
   labels?: Record<string, string>,
+  value?: number,
 ): void {
+  const v = value ?? 1;
   switch (name) {
     case 'processor_consumer_reads_total':
-      r.consumerReadsTotal.inc(labels ?? {});
+      r.consumerReadsTotal.inc(labels ?? {}, v);
       break;
     case 'processor_consumer_records_total':
-      r.consumerRecordsTotal.inc();
+      r.consumerRecordsTotal.inc(v);
       break;
     case 'processor_decode_errors_total':
-      r.decodeErrorsTotal.inc();
+      r.decodeErrorsTotal.inc(v);
       break;
     case 'processor_position_writes_total':
-      r.positionWritesTotal.inc(labels ?? {});
+      r.positionWritesTotal.inc(labels ?? {}, v);
       break;
     case 'processor_acks_total':
-      r.acksTotal.inc();
+      r.acksTotal.inc(v);
       break;
     case 'processor_device_state_evictions_total':
-      r.deviceStateEvictionsTotal.inc();
+      r.deviceStateEvictionsTotal.inc(v);
       break;
     default:
// Unknown metric name — silently ignore. This preserves the contract