commit 461679dd6484d2403464f302febf981aae1cd02a Author: archeious Date: Fri Apr 17 19:30:38 2026 -0600 Initial wiki: six-page documentation set Pages: - Home: overview, cluster-at-a-glance table, page index, with Monitor Dash render embedded. - Architecture: two-architecture tradeoff (A: OTEL-native, B: ccusage-sourced), shared daemon contract (/usage JSON shape), calibration env vars. - DataSources: architecture A implementation (docker-compose, otel-collector-config, prometheus.yml, Claude Code env, claude_code_* metric schema, PromQL per gauge, daemon sketch); architecture B implementation (ccusage CLI vs MCP server, watchdog JSONL tail, daemon sketch); recommendation. - Hardware: four-gauge layout (5h fuel, tach, thinking ratio, cache hit), X27.168 steppers with SwitecX25 library, ESP32 DevKit wiring with concrete GPIO assignments, firmware structure, enclosure V1 notes, BOM (~$80-90). - Roadmap: eight phases (A daemon stdout through H character metrics), deferred list, out-of-scope list. - Ideas: four exotic enclosure variants with render (steampunk chronometer, retro VFD array, minimalist birch e-ink, bio-digital cybernetic cluster), metrics brainstorm organised in seven categories, hardware / software wild ideas, unlikely-but-noted. Images committed: monitor-dash.png referenced from Home and Hardware; exotic-dashes.png referenced from Ideas. diff --git a/Architecture.md b/Architecture.md new file mode 100644 index 0000000..932957b --- /dev/null +++ b/Architecture.md @@ -0,0 +1,97 @@ +# Architecture + +Three components in series: telemetry source, local daemon, firmware. +The source is the variable; the daemon and firmware are identical +across both architectures. + +``` +[ Claude Code ] + | + v +[ telemetry source ] <--- A: OTEL + Prometheus + <--- B: ccusage + JSONL tail + | + v +[ claude-gauge daemon ] Python, FastAPI, GET /usage + | + v +[ ESP32 firmware ] polls /usage ~1 Hz + | + v +[ four needles + annunciator ] +``` + +## Two architectures + +Pick one. 
Both feed the same daemon interface, so the firmware +never knows which backend is wired up. + +| | A. OTEL-native | B. ccusage-sourced | +|---|---|---| +| Data source | Claude Code OTLP -> collector -> Prometheus | `ccusage` CLI + watchdog JSONL tail | +| External deps | Docker Compose stack (3 services) | Node + `bunx ccusage` | +| Deep-stats dashboard | Grafana dashboard 25052 for free | `ccusage` TUI, or build something | +| Short-window tach | Limited by scrape interval (15s) | Sub-second via JSONL tail | +| Operational weight | Moderate | Tiny | +| Homelab-enterprise fit | Strong | Weak | +| Survivability through Claude Code updates | High (OTEL schema is documented) | Medium (JSONL is implementation detail) | + +See [DataSources](DataSources) for the implementation-level specifics +of each. + +## The daemon + +Python service that exposes a single stable HTTP contract. The +firmware polls it ~1 Hz. + +``` +GET /usage + -> { + "rate_1m": , tokens/min, short window + "window_5h_tokens": , + "window_5h_pct": <0..1>, + "thinking_ratio": <0..1>, thinking tokens / output tokens + "cache_hit_rate": <0..1>, + "last_model": "opus"|"sonnet"|"haiku", + "hot": , tach above redline + "warn": , fuel or cache anomaly + "stall": , no telemetry for N minutes + "idle": , daemon up, no activity + "updated_at": + } +``` + +Two implementations share this shape: + +* `daemon_prom.py` — PromQL queries against Prometheus (A) +* `daemon_ccusage.py` — ccusage subprocess + watchdog tail (B) + +The firmware is unaware. Swapping backends is a daemon-only change. + +## Why this shape + +The firmware is the slow part of the system. ESP32s are not good at +JSON parsing at high frequency, they do not roundtrip Prometheus, +and their HTTP stacks are brittle. Put all the reasoning on the +daemon side. The firmware receives a flat, pre-computed structure +and only does mapping (0-1000 scale -> stepper target). 
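A minimal sketch of that mapping step, written in Python for readability rather than as the Arduino C++ the firmware would actually use. The function name `to_step_target`, its parameters, and the step count in the usage note are assumptions for illustration, not part of the repository.

```python
def to_step_target(value: float, full_scale: float, max_steps: int) -> int:
    """Map a pre-computed daemon value onto a stepper target position.

    value      -- reading from /usage (e.g. window_5h_pct, or rate_1m)
    full_scale -- reading that should put the needle at the end of its sweep
    max_steps  -- steps in the movement's full sweep (hypothetical parameter)
    """
    if full_scale <= 0 or max_steps <= 0:
        raise ValueError("full_scale and max_steps must be positive")
    # Clamp first so a bad reading can never drive the needle past its stops.
    clamped = min(max(value, 0.0), full_scale)
    # Intermediate 0-1000 integer scale, as described in the text above.
    scaled = round(clamped / full_scale * 1000)
    # Integer arithmetic from here on, which is what the firmware side wants.
    return scaled * max_steps // 1000
```

For a fraction-style field, `to_step_target(0.5, 1.0, 945)` puts the needle at mid-sweep; for the tach, `full_scale` would be the configured redline. The 945 here is only an assumed sweep size; use whatever figure matches the movement.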
+ +This also means the firmware contract is stable across architectures +AND across future data-source changes. When Anthropic ships a native +usage endpoint, the daemon swaps its input and the hardware does not +know. + +## Ceilings and calibration + +The daemon does not know the plan caps. User configures local ceilings +via environment: + +``` +CLAUDE_GAUGE_5H_CEILING tokens that count as 100% on 5h fuel +CLAUDE_GAUGE_TACH_REDLINE tokens/min that count as redline +CLAUDE_GAUGE_STALL_MINUTES minutes of silence before STALL lights +``` + +Both architectures consume the same variables. Calibrate by running +for a week and comparing to `ccusage blocks` or Claude's `/usage` +output. diff --git a/DataSources.md b/DataSources.md new file mode 100644 index 0000000..e79cdc5 --- /dev/null +++ b/DataSources.md @@ -0,0 +1,333 @@ +# Data Sources + +Two architectures, implemented side by side. Pick one; both produce +the same `/usage` payload for the firmware. + +## A. OTEL-native + +Uses Claude Code's built-in OpenTelemetry support. Mirrors the +reference stack published in `anthropics/claude-code-monitoring-guide`. + +### Docker Compose + +Three containers: OTEL collector, Prometheus, Grafana. 
+ +```yaml +services: + otel-collector: + image: otel/opentelemetry-collector-contrib:latest + command: ["--config=/etc/otel-collector-config.yaml"] + volumes: + - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml + ports: + - "4317:4317" # OTLP gRPC + - "4318:4318" # OTLP HTTP + - "8889:8889" # Prometheus scrape + + prometheus: + image: prom/prometheus:latest + ports: ["9090:9090"] + volumes: + - ./prometheus.yml:/etc/prometheus/prometheus.yml + - prometheus_data:/prometheus + command: + - '--config.file=/etc/prometheus/prometheus.yml' + - '--storage.tsdb.path=/prometheus' + - '--storage.tsdb.retention.time=8d' + - '--web.enable-lifecycle' + + grafana: + image: grafana/grafana:latest + ports: ["3000:3000"] + environment: + - GF_SECURITY_ADMIN_PASSWORD=admin + +volumes: + prometheus_data: + grafana_data: +``` + +### OTEL collector config + +```yaml +receivers: + otlp: + protocols: + grpc: { endpoint: 0.0.0.0:4317 } + http: { endpoint: 0.0.0.0:4318 } + +processors: + batch: { timeout: 1s, send_batch_size: 1024 } + memory_limiter: { check_interval: 1s, limit_mib: 512 } + +exporters: + prometheus: + endpoint: "0.0.0.0:8889" + send_timestamps: true + metric_expiration: 192h # 8 days; needed for 7d queries + enable_open_metrics: true + +service: + pipelines: + metrics: + receivers: [otlp] + processors: [memory_limiter, batch] + exporters: [prometheus] +``` + +### Prometheus scrape config + +```yaml +global: + scrape_interval: 15s + evaluation_interval: 15s + +scrape_configs: + - job_name: 'otel-collector' + static_configs: + - targets: ['otel-collector:8889'] +``` + +### Claude Code environment + +Set in the shell Claude Code launches in. User profile or +`~/.claude/settings.json` managed settings both work. 
+ +```bash +export CLAUDE_CODE_ENABLE_TELEMETRY=1 +export OTEL_METRICS_EXPORTER=otlp +export OTEL_LOGS_EXPORTER=otlp +export OTEL_EXPORTER_OTLP_PROTOCOL=grpc +export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 +export OTEL_METRIC_EXPORT_INTERVAL=10000 # 10s for responsiveness +export OTEL_METRICS_INCLUDE_SESSION_ID=false +``` + +### Claude Code metrics (surfaced in Prometheus) + +After OTEL-to-Prom conversion: + +| Prometheus metric | Labels | +|---|---| +| `claude_code_token_usage_tokens_total` | `type` (input, output, cacheRead, cacheCreation), `model` | +| `claude_code_cost_usage_USD_total` | `model` | +| `claude_code_session_count_total` | | +| `claude_code_active_time_total_seconds_total` | `type` (user, cli) | +| `claude_code_lines_of_code_count_total` | `type` (added, removed) | +| `claude_code_commit_count_total` | | +| `claude_code_pull_request_count_total` | | +| `claude_code_code_edit_tool_decision_count_total` | `tool_name`, `decision`, `language` | + +Full event schema is in the [Claude Code monitoring docs](https://code.claude.com/docs/en/monitoring-usage). + +### PromQL for each gauge + +```promql +# Tokens/min (tach) +sum(rate(claude_code_token_usage_tokens_total[1m])) * 60 + +# 5h fuel +sum(increase(claude_code_token_usage_tokens_total[5h])) + +# Thinking/output ratio (temp gauge) +# Note: OTEL does not emit thinking tokens as a dedicated metric; +# derive via event stream (claude_code.api_request + prompt.id +# correlation) or fall back to a constant 0 until instrumented. 
+ +# Cache hit rate (boost gauge) + sum(rate(claude_code_token_usage_tokens_total{type="cacheRead"}[5m])) +/ sum(rate(claude_code_token_usage_tokens_total{type=~"input|cacheRead|cacheCreation"}[5m])) + +# Last model (approximation) +topk(1, claude_code_token_usage_tokens_total{type="output"}) +``` + +### Daemon (A) + +``` +src/claude_gauge/ + daemon_prom.py FastAPI, PromQL queries, /usage + config.py ceilings, URLs, thresholds + windows.py query builders and result parsing +``` + +```python +async def prom(q: str) -> float: + r = await client.get(f"{PROM}/api/v1/query", params={"query": q}) + data = r.json()["data"]["result"] + return float(data[0]["value"][1]) if data else 0.0 + +@app.get("/usage") +async def usage(): + rate_1m = await prom(Q_TACH) + win_5h = await prom(Q_5H) + cache = await prom(Q_CACHE) + ... +``` + +### Deep-stats dashboard + +Import **Grafana Labs dashboard 25052** ("Claude Code") against the +Prometheus data source. That is the deep-stats surface. No custom +web UI needed. + +### Pros and cons + +Pros: +* Uses the platform feature Anthropic ships +* Grafana dashboard is free +* Metric schema is documented and stable +* Generalises cleanly if other hosts also run Claude Code + +Cons: +* Scrape interval caps tach responsiveness at ~15s +* Three containers to run +* Requires env-var changes on every Claude Code launch surface + +If the 15s cap is annoying, bolt a JSONL tail from B on top for the +tach path only. Hybrid. See [Hardware](Hardware) for the firmware +contract that stays identical either way. + +--- + +## B. ccusage-sourced + +One process. No collector, no Prometheus, no Grafana. The daemon +subprocess-calls `ccusage` for the fuel gauges and tails JSONL +directly for the tach. + +### ccusage integration + +Two shapes, pick one. + +**B1 (simplest)**: periodic CLI subprocess. 
+ +```bash +npx ccusage@latest blocks --json # current 5h block +npx ccusage@latest daily --json # per-day aggregates +``` + +**B2 (persistent)**: ccusage MCP HTTP server. Exposes four tools +(`daily`, `monthly`, `session`, `blocks`) at `POST /` over +StreamableHTTP MCP transport. + +```bash +bunx @ccusage/mcp@latest --type http --port 8080 +``` + +B1 is the default. Switch to B2 only if you also want the MCP +surface available to other local agents. + +### Short-window tach via watchdog + +`ccusage` aggregates are too coarse for a responsive tach. The +daemon keeps a 60-second ring buffer by tailing +`~/.claude/projects/**/*.jsonl` directly. + +```python +from watchdog.observers import Observer +from watchdog.events import FileSystemEventHandler +from collections import deque +from pathlib import Path +import json, time + +class JsonlTail(FileSystemEventHandler): + def __init__(self, bus): + self.bus = bus + self.offsets = {} + + def on_modified(self, event): + p = Path(event.src_path) + if p.suffix != ".jsonl": + return + off = self.offsets.get(p, 0) + with p.open() as f: + f.seek(off) + for line in f: + try: + d = json.loads(line) + except json.JSONDecodeError: + continue + if d.get("type") == "assistant": + u = d["message"].get("usage", {}) + tokens = sum(u.get(k, 0) for k in ( + "input_tokens", "output_tokens", + "cache_read_input_tokens", + "cache_creation_input_tokens", + )) + thinking = extract_thinking_tokens(d) + model = d["message"].get("model", "") + self.bus.push(time.time(), tokens, thinking, model) + self.offsets[p] = f.tell() +``` + +### Daemon (B) + +``` +src/claude_gauge/ + daemon_ccusage.py FastAPI, ccusage subprocess, /usage + tail.py watchdog + RateBus + config.py +``` + +```python +async def ccusage(cmd): + proc = await asyncio.create_subprocess_exec( + "npx", "ccusage@latest", cmd, "--json", + stdout=asyncio.subprocess.PIPE, + ) + out, _ = await proc.communicate() + return json.loads(out) + +@app.get("/usage") +async def usage(): + rate = bus.rate_per_min() + blocks = 
await ccusage("blocks") + cur = next((b for b in blocks["blocks"] if b.get("isActive")), None) + w5 = cur["totalTokens"] if cur else 0 + return { + "rate_1m": rate, + "window_5h_tokens": w5, + "window_5h_pct": min(1.0, w5 / CEIL_5H), + "thinking_ratio": bus.thinking_ratio(), + "cache_hit_rate": bus.cache_hit_rate(), + "last_model": bus.last_model(), + "hot": rate > RED, + "warn": (w5 / CEIL_5H) > 0.8, + "stall": bus.silent_for_minutes() > STALL_MIN, + "idle": True, + } +``` + +Cache `ccusage` output with a 10s TTL so 1 Hz firmware polling does +not spawn a subprocess every request. + +### Pros and cons + +Pros: +* Single process, one dependency tree +* Sub-second tach out of the box +* No Docker, no collector, no env vars on the Claude Code side +* `ccusage` has solved the JSONL edge cases already + +Cons: +* No free Grafana dashboard; deep stats require the `ccusage` TUI or + a custom surface +* Node required on the runtime path +* JSONL format is an implementation detail; upstream changes can + break parsing + +--- + +## Recommendation + +**Architecture A** for homelab-as-enterprise framing. OTEL is the +platform feature, Prometheus integrates with the rest of the homelab +stack, and Grafana dashboard 25052 is free deep-stats. Scrape +interval cap is the only real concession; hybrid with the JSONL +tail from B if the tach feels sluggish after calibration. + +**Architecture B** if Prometheus is not already running and you want +a working needle sooner. Ship it, migrate to A later if OTEL becomes +useful for other things. + +Firmware and cluster are identical either way. diff --git a/Hardware.md b/Hardware.md new file mode 100644 index 0000000..c7f68b7 --- /dev/null +++ b/Hardware.md @@ -0,0 +1,190 @@ +# Hardware + +Four analog needles, five annunciator lamps, brushed-aluminium bezel, +mounted under the monitor. ESP32 polls the daemon's `/usage` endpoint +at ~1 Hz and updates the cluster. 
+ +![Monitor-mounted cluster](images/monitor-dash.png) + +## Cluster layout + +Four round gauges in a single bezel, reading left to right: + +| # | Label | Metric | Scale | +|---|---|---|---| +| 1 | 5H PLAN / Left fuel | % of 5h plan window used | 0 - 100% | +| 2 | TOKENS/MIN / Tach | Short-window burn rate | 0 - redline | +| 3 | THINKING/OUTPUT / Temp | Thinking vs output tokens | low - high, colour-coded zones | +| 4 | CACHE HIT / Boost | Cache read as fraction of input | 0 - 100% | + +Annunciator row below the gauges, left to right: + +| Lamp | Colour | Condition | +|---|---|---| +| MODEL | RGB | Red for Opus, amber for Sonnet, green for Haiku; reflects the most recent assistant message | +| HOT | Red | Tach above redline; optionally flashes | +| WARN | Amber | Fuel above 80% or cache-hit-rate below a floor | +| STALL | Blue | No telemetry for the configured silence window | +| IDLE | Green | Daemon reachable, no current activity (pulses softly) | + +The 7-day window is not on the cluster. It moves too slowly to warrant +a needle and lives on the deep-stats dashboard instead. + +## Movement + +**Switec X27.168** automotive stepper motor. + +* 315-degree sweep +* 945 steps across the sweep (1/3 degree per step) +* Direct drive, no microstepping required +* ~$8 per unit from Adafruit / AliExpress +* Used in car dashboards, so bezels and faces are off-the-shelf + +Sibling parts in the same family (X25, VID28, VID29, BKA30D-R5) all +work with the same driver library. The X27.168 has the longest sweep +and the most tutorial coverage, which is why it wins. + +## Driver library + +**`clearwater/SwitecX25`** Arduino C++ library. Despite the name it +covers X27.168 without modification. Uses four GPIO pins per +motor. No external driver IC strictly required for short wiring +runs; add ULN2003A darlington arrays if you want cleaner current +handling or longer wire runs. + +No maintained MicroPython port exists. 
Firmware is therefore Arduino +C++, not MicroPython. Not the original plan, but the right trade +given how mature the Arduino ecosystem is for this motor. + +## Board + +**ESP32 DevKit V1** (generic). + +* WiFi built in +* Enough GPIO for 3+ steppers (12 pins), 5 annunciator LEDs, one + reset button +* Arduino framework via PlatformIO +* ~$8 per board + +Alternatives: + +* Raspberry Pi Pico W with CircuitPython: fewer tutorials for the + stepper, but friendlier to iterate +* ESP32-S3 if you want USB-CDC for serial debugging without an + extra USB-to-UART chip + +Picking the classic ESP32 DevKit keeps part count low. + +## Wiring + +Concrete GPIO assignments for a four-gauge cluster. Adjust to suit +the specific board variant. + +``` +ESP32 DevKit V1 + +---------------- 5V -> stepper common (via regulator if needed) + +---------------- GND -> common ground + + Gauge 1 (5h PLAN) + GPIO 13, 14, 27, 26 -> X27.168 #1 coil pins + + Gauge 2 (Tach) + GPIO 25, 33, 32, 35 -> X27.168 #2 coil pins + + Gauge 3 (Thinking ratio) + GPIO 34, 39, 36, 22 -> X27.168 #3 coil pins + + Gauge 4 (Cache hit) + GPIO 23, 18, 19, 5 -> X27.168 #4 coil pins + + Annunciator LEDs (220R series resistors) + GPIO 16 (R), 17 (G), 21 (B) -> MODEL (common-cathode RGB) + GPIO 4 -> HOT (red, PWM for flash) + GPIO 2 -> WARN (amber) + GPIO 15 -> STALL (blue) + GPIO 12 -> IDLE (green, PWM for pulse) + + Reset button + GPIO 0 (boot button dual-use) or GPIO 35 with pull-up +``` + +GPIO 34, 35, 36 and 39 are input-only on the classic ESP32 and +cannot drive coils, so the Gauge 2 and Gauge 3 assignments above +need output-capable pins or an I/O expander before wiring. GPIO 35 +also appears twice (Gauge 2 and the reset-button option) and has no +internal pull-up, so the button would need an external resistor. + +Separate 5V rail for the steppers if you see brownouts when all four +move at once. The onboard regulator is marginal under combined motor +current. + +## Firmware structure + +``` +firmware/ + platformio.ini + src/ + main.cpp setup() + loop() + wifi.cpp connect + reconnect + gauge.cpp SwitecX25 wrapper; map 0-1000 to 0-945 steps + annunciator.cpp LED state machine + poll.cpp HTTP GET /usage every 1s + config.h daemon URL, thresholds +``` + +Poll loop: + +1. Every 1000 ms, HTTP GET `http://:8080/usage` +2. 
Parse JSON with ArduinoJson +3. For each gauge, set target step from scaled value +4. Update LED states from boolean flags and `last_model` +5. `gauge.update()` runs per-loop-tick; stepper moves toward target + non-blocking + +Firmware does not interpret: no PromQL, no ring buffers, no rolling +windows. That is all daemon work. Firmware maps pre-computed floats +to stepper positions and LED states. + +## Enclosure + +V1 target: + +* Brushed-aluminium bezel (3D-printed PLA + metallic spray, upgrade + later) +* Cream faces with a hairline burgundy redline zone (optional house + palette match with quartermaster) +* Hexagonal screws around each gauge bezel (visible in the render) +* Smoked acrylic covering the annunciator row so LEDs only appear + when lit +* Matte black mounting bracket clipped to the monitor bezel +* Roughly 200 mm wide by 95 mm tall + +Faces and bezels can be printed on a home 3D printer at V1 fidelity; +if the project earns a V2, commission a machinist for a proper +aluminium bezel. + +## Bill of materials (V1) + +| Item | Qty | ~Cost | +|---|---|---| +| Switec X27.168 stepper | 4 | $32 | +| ESP32 DevKit V1 | 1 | $8 | +| ULN2003A driver array | 1-2 | $2 | +| RGB LED (common cathode) | 1 | $1 | +| 5 mm LEDs (red, amber, blue, green, red flash) | 4 | $2 | +| 220R resistors | 10 | $1 | +| 5 V 2 A supply | 1 | $10 | +| PLA filament + spray paint | n/a | $10 | +| Hook-up wire, headers, JST connectors | | $10 | +| Smoked acrylic scrap | 1 | $5 | + +~$80-90 for V1 materials. 
+ +## Calibration + +After the cluster is physically assembled, run the daemon for a week +and compare: + +* Left fuel steady-state against `ccusage blocks` or Claude's + `/usage` output; tune `CLAUDE_GAUGE_5H_CEILING` +* Tach peaks against "Claude was really cooking" sessions; tune + `CLAUDE_GAUGE_TACH_REDLINE` +* Cache hit rate against a session you know was cache-warm vs a + cold session; confirm the needle is responsive + +Recalibrate when Anthropic changes plan limits or when your daily +workload shifts materially. diff --git a/Home.md b/Home.md new file mode 100644 index 0000000..699e52e --- /dev/null +++ b/Home.md @@ -0,0 +1,54 @@ +# claude-gauge + +Hardware instrument cluster displaying Claude Code session telemetry +in real time. Four analog needle gauges plus an annunciator row, +driven by an ESP32 polling a local Python daemon. The daemon reads +either Claude Code's native OpenTelemetry feed through Prometheus +(architecture A) or `ccusage` CLI aggregates with a direct JSONL +tail for the tach (architecture B). Firmware and cluster are +identical across both. + +Fighter-jet / race-car aesthetic. Physical-first. The deep stats +live in Grafana (A) or `ccusage`'s own surfaces (B); the cluster is +the ambient summary. + +![Monitor-mounted cluster with four gauges and annunciator row](images/monitor-dash.png) + +## Pages + +* [Architecture](Architecture) — two-architecture tradeoff, daemon + shape, shared HTTP surface +* [DataSources](DataSources) — Docker Compose + PromQL for A, + `ccusage` integration for B, Claude Code env vars +* [Hardware](Hardware) — cluster layout, X27.168 steppers, ESP32 + wiring, firmware structure, enclosure +* [Roadmap](Roadmap) — phases, shipped, deferred +* [Ideas](Ideas) — exotic variants, metrics brainstorm, parked + thoughts + +## At a glance + +Four round gauges in a brushed-aluminium bezel, mounted under the +monitor. Each drives from one primary metric. 
+ +| Gauge | Metric | Scale | +|---|---|---| +| 5H PLAN / Left fuel | % of 5h plan window used | 0 - 100% | +| TOKENS/MIN / Tach | Short-window burn rate | 0 - redline (configurable) | +| THINKING/OUTPUT / Temp | Ratio of thinking to output tokens | low - high (colour-coded) | +| CACHE HIT / Boost | Cache read as fraction of input | 0 - 100% | + +Annunciator row below: **MODEL** (RGB, colour-coded per model), +**HOT** (tach above redline), **WARN** (fuel or cache anomaly), +**STALL** (no telemetry for N minutes), **IDLE** (daemon reachable, +no activity). + +The 7-day window is deliberately absent from the physical cluster. +It moves too slowly to warrant a needle and lives on the deep-stats +dashboard (Grafana panel or `ccusage` TUI) instead. + +## Status + +Scaffolded. Phase A pending architecture decision. + +See the [Roadmap](Roadmap) for phase-by-phase plan. diff --git a/Ideas.md b/Ideas.md new file mode 100644 index 0000000..ea1eff4 --- /dev/null +++ b/Ideas.md @@ -0,0 +1,186 @@ +# Ideas + +Parked thoughts and exotic variants. Nothing here is committed; this +is the "capture the shiny object so we can return to the main task" +page. + +## Exotic variants + +Four directions the enclosure could take beyond the classic +brushed-aluminium dashboard. The daemon, firmware contract, and +annunciator semantics stay the same; only the physical shell changes. + +![Four exotic dash variants](images/exotic-dashes.png) + +### Steampunk automaton chronometer + +Wooden case with visible gears, brass piping, vertical glass tubes +as fluid-column gauges. The tach becomes a swinging pendulum; the +fuel gauges become liquid levels in tubes lit from below. Models +are indicated by a rotating brass disc behind a window. Victorian +scientific-instrument vibe. + +Build cost: medium-high. Requires woodworking and brass hardware +sourcing. Steppers hide inside the case; tube visuals driven by +RGB LED strips scaled to the fuel percentage. 
+ +### Retro-future VFD array + +Black panel with cyan/teal vacuum-fluorescent displays. Large VFD +readout for TOKENS/MIN, segmented bar graphs for the fuel windows, +a small 3D wireframe terrain display in one corner, animated to +reflect thinking ratio. Blade Runner, 80s mainframe, CRT TV from the future. + +Build cost: medium. VFD modules are available but not cheap; needs +a driver IC per display and a 5 V rail that can swing the filament +bias. All character-based, no moving needles. + +### Minimalist birch e-ink interface + +Birch plywood case with four e-ink round displays rendering +minimalist graphs. Matte, quiet, Nordic. Each display is a full +round e-ink rather than a stepper and needle. Less showy than the +classic dash, more desk-neighbour-friendly if you work in shared +space. + +Build cost: high per-unit; round e-ink panels run $30-60 each. +Firmware is more complex: per-panel rendering instead of stepper +control. Update rate is slow, which suits the fuel gauges more than +the tach. May need to fall back to an OLED for the tach specifically. + +### Bio-digital cybernetic cluster + +Organic bioluminescent tendrils in magenta/cyan, plasma-globe central +sphere for the tach, neural-network-like filaments for the fuels. +H. R. Giger meets a Philip K. Dick novel. + +Build cost: low (if done with EL wire and a plasma globe) to +ridiculous (if done properly with electroluminescent silicone and a +custom bioreactor substrate). Mostly a vibes build. Would be fun as +a Halloween-only seasonal swap-out. Not a daily driver. + +## Shortlist if building a second cluster + +1. VFD array for a secondary desk (aesthetic pairs well with a + terminal-heavy workflow) +2. Birch e-ink as a quiet hallway ambient display +3. Steampunk as a living-room "ambient AI" piece for non-technical + guests +4. Bio-digital as a conversation-starter for Halloween + +## Metrics brainstorm (for Phase G / H) + +Derivable from OTEL (A) or JSONL (B). 
Not on the primary cluster; +lands on Grafana panels or a custom web page later. + +### Cost and tokens + +* Cache hit rate and cache-savings dollar value +* Cost per session at published pricing +* Projected monthly spend at current burn rate +* Opus / Sonnet / Haiku token split +* Server tool use (web search / web fetch) counts +* Service tier distribution (standard vs priority) + +### Time and rhythm + +* Session count, duration distribution +* Time-of-day heatmap (circadian work pattern) +* Day-of-week heatmap +* Think-time vs work-time ratio +* Streak tracking (consecutive days used) +* All-nighter detector (session crossing 2am local) +* Longest continuous session + +### Work shape + +* Thinking-to-output ratio trend over time +* Stop-reason distribution (watch for rising `max_tokens`) +* Messages per session +* Tool calls per assistant response (parallelism indicator) +* User interrupt rate (sessions ending on cancel) +* Iteration count per task + +### Tool usage + +* Top tools by count (Bash, Edit, Read, Grep) +* Tool success vs failure rate +* Bash command distribution parsed by root executable +* File reads vs edits vs writes +* Hottest files across all sessions +* Agent / subagent counts (`isSidechain=true`), depth +* Web search / web fetch counts + +### Project and context + +* Tokens per project +* Time per project +* Project switching rate within a session +* Dormant project detector (no activity in N days) +* Languages touched, by file extension +* Last file edited per project (resume-where-you-left-off) + +### Friction and quality + +* User message length distribution (terse vs prose) +* Rough correction reflex count ("no", "wrong", "stop", "actually") +* Permission denial frequency +* Retry / regenerate patterns +* File-history-snapshot count per session + +### Character and fun + +* **Em-dash violator count** against the CLAUDE.md rule. 
Per-week + needle ("rule violations") with its own LED annunciator +* Emoji leakage count +* Most-used phrase by Claude in your transcripts +* Most-used phrase by you +* "Dude, chill" detector (explicit pushback events) +* Thank-you rate per session +* Silent sessions (ended without `/compact` or `/clear`) + +### Cross-system correlations + +* Git: commits produced per session, lines changed per token spent +* Forgejo: PRs opened, merged, closed per session +* Quartermaster: did long Claude sessions correlate with + budget-editing days +* Home automation: Claude usage vs espresso machine activations +* Fitbit / health data: Claude usage vs heart rate (do you actually + stress out when Claude is slow?) + +## Hardware wild ideas + +* **Audio feedback**: a quiet relay-click when the tach crosses + redline; a chime when a 5h block resets. Could be annoying; + prototype with speaker off first. +* **Tactile feedback**: a small vibration motor in the bezel that + pulses when STALL lights (useful if the cluster is out of your + peripheral vision). +* **Per-project LEDs**: a strip beside the cluster where the lit + LED indicates which project is currently active. Fun for homelab + archaeology. +* **Second cluster on the other monitor** showing a 24-hour sparkline + rather than live gauges. Historical companion. +* **Cluster-to-cluster sync**: if claude-gauge becomes popular enough + to build for friends, a single OTEL collector could drive multiple + clusters in multiple homes. Niche but fun. + +## Software wild ideas + +* Voice announcement on stall ("Claude appears to be thinking. 
Or + possibly dead.") +* "Sobriety score" needle that gauges how many consecutive tool + failures are happening +* Integration with a physical "commit" button on the cluster that + accepts pending changes with a satisfying mechanical click +* RGB LED strip behind the cluster that shifts colour based on + thinking-ratio (cool blue when cruising, deep red when grinding) + +## Unlikely but noted + +* Smartwatch complication showing 5h fuel. Nice for away-from-desk + awareness but duplicates what a phone notification already does. +* Discord bot that posts "claude-gauge is hot!" to a channel when + the tach crosses redline. Fun in a team setting, overkill solo. +* Tattoo. No. diff --git a/Roadmap.md b/Roadmap.md new file mode 100644 index 0000000..be06d58 --- /dev/null +++ b/Roadmap.md @@ -0,0 +1,104 @@ +# Roadmap + +One phase per issue. No scope bleed between phases. + +## Shipped + +| # | Title | Merged | +|---|---|---| +| - | Initial scaffold and PLAN.md | 2026-04-17 | +| 1 | Flesh out PLAN.md with two-architecture implementation detail | PR #2 open | + +## Phases + +### Phase A — daemon prints to stdout + +Tail the telemetry source (OTEL-Prom or ccusage), maintain the five +primary values in memory, print them to stdout every second. No HTTP, +no hardware, no dashboard. First proof the numbers land. + +Architecture A or B must be chosen before starting; the daemon +implementation differs. + +**Exit criteria**: Claude Code activity makes the values tick in a +terminal. + +### Phase B — HTTP endpoint + +Stand up FastAPI, expose `GET /usage` returning the documented JSON +shape. Verify with `curl` from another machine on the LAN. + +**Exit criteria**: `curl http://:8080/usage` returns valid JSON +with all fields populated. + +### Phase C — single needle + +Minimal ESP32 firmware. Polls `/usage`, drives one stepper (the tach) +and one LED (MODEL). Proves the hardware path end to end. + +**Exit criteria**: the tach needle moves in response to typing into +Claude Code. 
+ +### Phase D — full cluster + +Four steppers, full annunciator row. Wiring on a breadboard or +solder-perf. No enclosure yet. + +**Exit criteria**: all four gauges and all five lamps behave +according to their specifications in [Hardware](Hardware). + +### Phase E — calibration + +Run the cluster for a week of real use. Tune the three environment +variables against observed behaviour. Document chosen values in +this wiki. + +**Exit criteria**: the fuel gauge reads roughly what `ccusage blocks` +reports; the tach redline fires when you expect it to. + +### Phase F — enclosure V1 + +3D-printed bezel, cream faces, annunciator smoked-acrylic cover, +monitor-mount bracket. Permanent install under the monitor. + +**Exit criteria**: the cluster is a desk object you would show +someone, not a breadboard. + +### Phase G — dashboard + +Architecture A: import Grafana Labs dashboard 25052 against the +Prometheus data source. Link it from the wiki Home. + +Architecture B: decide whether to build a custom deep-stats page or +just live in the `ccusage` TUI. + +**Exit criteria**: a "look at everything" surface exists somewhere. + +### Phase H — character and cross-system metrics + +Em-dash counter, phrase extractor, git correlation, Quartermaster +correlation. Lowest priority, highest amusement. Ships as Grafana +panels (A) or as a small web page (B). + +## Deferred + +* True stream-time tach. Requires a Claude Code hook + (`SessionStart` / `Stop` / `PostToolUse`) pushing heartbeats. + Dramatic work for modest gain over the JSONL tail. +* Thinking-token metric via OTEL. Currently derivable only from + events + `prompt.id` correlation; simpler to extract from JSONL + content blocks directly in the tail. +* Hybrid (A + B) daemon where fuel gauges come from Prometheus and + tach comes from JSONL tail. Worth it only if Phase E shows the + 15 s scrape interval feels sluggish. +* Multi-machine cluster. 
If other homelab hosts start running Claude + Code, point them at the same OTEL collector (architecture A) and + the gauges become per-host aggregates. + +## Out of scope + +* MicroPython firmware. No maintained SwitecX25 port exists. Not + worth writing one. +* Battery operation. Always desk-powered. +* Alexa / voice interfaces. No. +* Cloud sync of gauge state. No. diff --git a/images/exotic-dashes.png b/images/exotic-dashes.png new file mode 100644 index 0000000..5259eb0 Binary files /dev/null and b/images/exotic-dashes.png differ diff --git a/images/monitor-dash.png b/images/monitor-dash.png new file mode 100644 index 0000000..462880c Binary files /dev/null and b/images/monitor-dash.png differ