Architecture
Three components in series: telemetry source, local daemon, firmware. The source is the variable; the daemon and firmware are identical across both architectures.
[ Claude Code ]
|
v
[ telemetry source ] <--- A: OTEL + Prometheus
<--- B: ccusage + JSONL tail
|
v
[ claude-gauge daemon ] Python, FastAPI, GET /usage
|
v
[ ESP32 firmware ] polls /usage ~1 Hz
|
v
[ four needles + annunciator ]
Two architectures
Pick one. Both feed the same daemon interface, so the firmware never knows which backend is wired up.
| | A. OTEL-native | B. ccusage-sourced |
|---|---|---|
| Data source | Claude Code OTLP -> collector -> Prometheus | ccusage CLI + watchdog JSONL tail |
| External deps | Docker Compose stack (3 services) | Node + bunx ccusage |
| Deep-stats dashboard | Grafana dashboard 25052 for free | ccusage TUI, or build something |
| Short-window tach | Limited by scrape interval (15s) | Sub-second via JSONL tail |
| Operational weight | Moderate | Tiny |
| Homelab-enterprise fit | Strong | Weak |
| Survivability through Claude Code updates | High (OTEL schema is documented) | Medium (JSONL is implementation detail) |
See DataSources for the implementation-level specifics of each.
The daemon
A Python service that exposes a single stable HTTP contract. The firmware polls it at ~1 Hz.
GET /usage
-> {
  "rate_1m": <number>,                    // tokens/min, short window
  "window_5h_tokens": <number>,
  "window_5h_pct": <0..1>,
  "thinking_ratio": <0..1>,               // thinking tokens / output tokens
  "cache_hit_rate": <0..1>,
  "last_model": "opus"|"sonnet"|"haiku",
  "hot": <bool>,                          // tach above redline
  "warn": <bool>,                         // fuel or cache anomaly
  "stall": <bool>,                        // no telemetry for N minutes
  "idle": <bool>,                         // daemon up, no activity
  "updated_at": <iso8601>
}
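One way to pin this shape down on the daemon side is a small dataclass that serializes to the JSON above. This is a sketch rather than the project's actual code: the field names come from the contract, but the class name `UsagePayload`, the `clamp01` helper, and the decision to clamp ratios into 0..1 are assumptions made for illustration.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


def clamp01(x: float) -> float:
    """Clamp a ratio into the 0..1 range the contract promises (hypothetical helper)."""
    return max(0.0, min(1.0, float(x)))


@dataclass
class UsagePayload:
    # Field names mirror the GET /usage contract; the class itself is hypothetical.
    rate_1m: float
    window_5h_tokens: int
    window_5h_pct: float
    thinking_ratio: float
    cache_hit_rate: float
    last_model: str
    hot: bool
    warn: bool
    stall: bool
    idle: bool
    updated_at: str

    def __post_init__(self) -> None:
        # Assumption: out-of-range ratios are clamped so the firmware
        # never has to handle values outside 0..1.
        self.window_5h_pct = clamp01(self.window_5h_pct)
        self.thinking_ratio = clamp01(self.thinking_ratio)
        self.cache_hit_rate = clamp01(self.cache_hit_rate)

    def to_json(self) -> str:
        """Serialize to the flat JSON object the firmware polls."""
        return json.dumps(asdict(self))


# Example values are made up; 1.3 is deliberately over the ceiling.
payload = UsagePayload(
    rate_1m=1200.0, window_5h_tokens=250_000, window_5h_pct=1.3,
    thinking_ratio=0.4, cache_hit_rate=0.9, last_model="opus",
    hot=False, warn=False, stall=False, idle=False,
    updated_at=datetime.now(timezone.utc).isoformat(),
)
print(payload.to_json())
```

Keeping the payload flat and fully typed in one place makes it harder for either backend to drift from the contract by accident.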
Two implementations share this shape:
daemon_prom.py: PromQL queries against Prometheus (A)
daemon_ccusage.py: ccusage subprocess + watchdog tail (B)
The firmware is unaware. Swapping backends is a daemon-only change.
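The backend swap can be pictured as two classes behind one interface. The following is a minimal sketch under stated assumptions: the `UsageSource` protocol, the two stub classes, and `build_usage` are hypothetical names, and the stubs return canned numbers instead of querying Prometheus or ccusage. The real `daemon_prom.py` and `daemon_ccusage.py` may be organized quite differently.

```python
from typing import Protocol


class UsageSource(Protocol):
    """Hypothetical interface: anything that can report recent token usage."""

    def tokens_last_minute(self) -> float: ...
    def tokens_last_5h(self) -> int: ...


class PromSource:
    # Stand-in for architecture A; a real version would run PromQL queries.
    def tokens_last_minute(self) -> float:
        return 900.0

    def tokens_last_5h(self) -> int:
        return 120_000


class CcusageSource:
    # Stand-in for architecture B; a real version would call ccusage
    # and tail the JSONL logs.
    def tokens_last_minute(self) -> float:
        return 950.0

    def tokens_last_5h(self) -> int:
        return 121_500


def build_usage(source: UsageSource, ceiling_5h: int) -> dict:
    """Build the source-dependent part of the /usage shape."""
    tokens = source.tokens_last_5h()
    return {
        "rate_1m": source.tokens_last_minute(),
        "window_5h_tokens": tokens,
        "window_5h_pct": min(1.0, tokens / ceiling_5h),
    }


# Both backends yield the same keys, so the HTTP layer need not care which is used.
for src in (PromSource(), CcusageSource()):
    print(type(src).__name__, build_usage(src, ceiling_5h=500_000))
```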
Why this shape
The firmware is the slow part of the system. ESP32s are poor at parsing JSON at high frequency, they are not suited to querying Prometheus directly, and their HTTP stacks are brittle. All the reasoning therefore lives on the daemon side: the firmware receives a flat, pre-computed structure and only maps values (0-1000 scale -> stepper target).
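To make the mapping step concrete, here is the arithmetic written out in Python for consistency with the daemon; on the ESP32 the same integer math would be written in the firmware's own language. The function names and the `FULL_SWEEP_STEPS` value are assumptions chosen for illustration, not figures from the actual hardware.

```python
FULL_SWEEP_STEPS = 945  # assumption: illustrative step count for a full needle sweep


def to_scale(fraction: float) -> int:
    """Convert a 0..1 value from /usage into the 0-1000 integer scale."""
    clamped = max(0.0, min(1.0, fraction))
    return round(clamped * 1000)


def to_stepper_target(scale: int, full_sweep: int = FULL_SWEEP_STEPS) -> int:
    """Map a 0-1000 scale value onto a stepper position using integer math only."""
    scale = max(0, min(1000, scale))
    # Adding 500 before the integer division rounds to the nearest step.
    return (scale * full_sweep + 500) // 1000


# Half-scale input lands near the middle of the sweep.
print(to_stepper_target(to_scale(0.5)))
```

Integer-only arithmetic with explicit clamping keeps the firmware side free of floating point and of any decision logic.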
This also means the firmware contract is stable across architectures AND across future data-source changes. When Anthropic ships a native usage endpoint, the daemon swaps its input and the hardware does not know.
Ceilings and calibration
The daemon does not know the plan caps. The user configures local ceilings via environment variables:
CLAUDE_GAUGE_5H_CEILING tokens that count as 100% on 5h fuel
CLAUDE_GAUGE_TACH_REDLINE tokens/min that count as redline
CLAUDE_GAUGE_STALL_MINUTES minutes of silence before STALL lights
Both architectures consume the same variables. Calibrate by running the gauge for a week and comparing its readings to ccusage blocks or Claude's /usage output.
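As a rough sketch of how the daemon might turn these variables into the `window_5h_pct`, `hot`, and `stall` fields: the variable names are the ones listed above, while `load_ceilings`, `derive_flags`, and the fallback values are hypothetical placeholders, not recommended settings.

```python
import os
from datetime import datetime, timedelta, timezone


def load_ceilings(env=os.environ) -> dict:
    """Read the three ceilings; the defaults here are placeholders only."""
    return {
        "ceiling_5h": int(env.get("CLAUDE_GAUGE_5H_CEILING", "1000000")),
        "tach_redline": float(env.get("CLAUDE_GAUGE_TACH_REDLINE", "5000")),
        "stall_minutes": float(env.get("CLAUDE_GAUGE_STALL_MINUTES", "10")),
    }


def derive_flags(cfg, rate_1m, window_5h_tokens, last_event, now=None) -> dict:
    """Compute the ceiling-dependent fields of the /usage payload."""
    now = now or datetime.now(timezone.utc)
    silent_for = now - last_event
    return {
        # Fuel gauge: fraction of the configured 5h ceiling, capped at 1.
        "window_5h_pct": min(1.0, window_5h_tokens / cfg["ceiling_5h"]),
        # Tach above redline.
        "hot": rate_1m > cfg["tach_redline"],
        # No telemetry for longer than the configured number of minutes.
        "stall": silent_for > timedelta(minutes=cfg["stall_minutes"]),
    }


# Example with an explicit mapping instead of the real environment.
example_env = {
    "CLAUDE_GAUGE_5H_CEILING": "200000",
    "CLAUDE_GAUGE_TACH_REDLINE": "3000",
    "CLAUDE_GAUGE_STALL_MINUTES": "5",
}
cfg = load_ceilings(example_env)
now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
print(derive_flags(cfg, 3500.0, 50_000, now - timedelta(minutes=6), now))
```

After a calibration week, only the environment values need to change; both daemons would pick them up the same way.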