Jeff Smith 08eaa63300 Flesh out PLAN.md with two-architecture implementation detail

Expand the planning document to implementation-ready detail. Both
viable data paths are specified independently so either can ship:

- Architecture A (OTEL-native): full docker-compose stack mirroring
  Anthropic's claude-code-monitoring-guide (otel-collector on 4317
  / 4318 / 8889, Prometheus on 9090, Grafana on 3000), Claude Code
  env vars, PromQL queries for each gauge, Python daemon sketch
  that queries Prometheus and serves /usage.

- Architecture B (ccusage-sourced): subprocess invocation of
  ccusage CLI for 5h blocks and 7d daily aggregates, watchdog
  JSONL tail for sub-second tach responsiveness, Python daemon
  sketch with a 60s RateBus ring buffer.

Hardware specified: X27.168 stepper motors driven by the Arduino
SwitecX25 library on ESP32 (Arduino C++ since no MicroPython port
exists), concrete GPIO pin assignments, ULN2003A driver notes,
annunciator LED wiring, enclosure notes.

Also captured: metric schema from Claude Code's OTEL docs,
prior-art review (ccusage / Claude-Code-Usage-Monitor / Grafana
Labs dashboards 25052 and 24993 / Anthropic's monitoring guide),
six-phase delivery plan, comparison table of A vs B, recommendation
(A for homelab-as-enterprise framing), and a metrics brainstorm
for later phases.

README updated to summarise both architectures and point at the
expanded plan.

Refs #1

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-17 19:18:59 -06:00


claude-gauge

Hardware instrument cluster displaying Claude Code session telemetry. Three analog needle gauges plus an annunciator row, driven by an ESP32 that polls a local daemon. The daemon reads either Claude Code's own OpenTelemetry feed or ccusage. Fighter-jet / race-car aesthetic. Physical-first.

Why

Watching tokens burn against the Max-plan windows is useful, but the same data also tells you when Claude is grinding, which model just ran, and how warm your cache is. A dial on the desk makes that ambient; no tab-switching required.

Prior art and the decision it implies

Software side is crowded. ccusage, Claude-Code-Usage-Monitor, haasonsaas/claude-usage-tracker, phuryn/claude-usage, multiple Grafana dashboards (Grafana Labs 25052 and 24993), and Anthropic's own claude-code-monitoring-guide repo all do the JSONL parsing and rolling-window math already. Claude Code ships with native OpenTelemetry support. The physical-gauge angle has no extant prior art.

Implication: do not rebuild the telemetry layer. Consume it. Spend the love on the hardware and the adapter that bridges it.

Instrument cluster (same in all architectures)

    +------------+   +--------------+   +------------+
    |   5h FUEL  |   |   TOKENS/MIN |   |  7d FUEL   |
    |   0 - 100% |   |  0 - redline |   |  0 - 100%  |
    +------------+   +--------------+   +------------+
    [OPUS] [SONNET] [HAIKU]   [HOT] [WARN] [STALL] [IDLE]

Gauge	Metric
Center tach	Tokens/min, rolling short window
Left fuel	% of 5h plan window used
Right fuel	% of 7d plan window used

Lamp	Condition
OPUS / SONNET / HAIKU	colour-coded model that emitted the most recent tokens
HOT	tach above redline
WARN	either fuel gauge above 80%
STALL	no telemetry in last N minutes
IDLE	daemon reachable, no activity
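
The lamp conditions above fit in one pure function. A minimal sketch, assuming a hypothetical `seconds_since_last` input (time since the last telemetry sample) and a hypothetical `stall_after_s` threshold standing in for the table's "N minutes"; neither name comes from the plan, and the redline default simply matches the daemon sketches.

```python
def lamp_states(rate_per_min: float, pct_5h: float, pct_7d: float,
                seconds_since_last: float, redline: float = 8000.0,
                stall_after_s: float = 120.0) -> dict[str, bool]:
    """Evaluate the HOT / WARN / STALL / IDLE conditions from the lamp table.

    pct_5h and pct_7d are fractions in 0.0 .. 1.0.
    stall_after_s is an assumed value for "N minutes" (here: 2 minutes).
    """
    return {
        "hot": rate_per_min > redline,              # tach above redline
        "warn": pct_5h > 0.8 or pct_7d > 0.8,       # either fuel gauge above 80%
        "stall": seconds_since_last >= stall_after_s,
        "idle": rate_per_min == 0,                  # reachable, nothing happening
    }
```

STALL and IDLE are not mutually exclusive in this reading: a quiet minute lights IDLE first, and STALL joins it once the silence passes the threshold.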

Two architectures

Pick one. Both feed the same firmware and cluster.

	A. OTEL-native	B. ccusage-sourced
Data source	Claude Code OTLP -> collector -> Prometheus	Local JSONL via `ccusage` CLI
External deps	Docker Compose stack (collector, Prometheus, Grafana)	Node + `npx ccusage`
Deep-stats dashboard	Grafana dashboard 25052 for free	Build nothing, ccusage has a TUI
Short-window tach	Limited by Prometheus scrape interval (15s)	Hybrid JSONL tail gives sub-second
Operational weight	Moderate (3 services)	Tiny (one subprocess)
Homelab-enterprise fit	Strong	Weak
Time to first needle	Day 2	Day 1
Survivability through Claude Code updates	High (OTEL schema is stable and documented)	Medium (JSONL layout is an implementation detail)

Both share the firmware, cluster, and enclosure. The daemon is the only thing that differs. The /usage HTTP shape is identical across A and B so the firmware never knows which backend is wired up.
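
The shared /usage shape can be pinned down as a type so both daemons are checked against the same contract. A sketch only: the field names are taken from the two daemon sketches in this plan, while `UsagePayload`, `REQUIRED_FIELDS`, and `missing_fields` are hypothetical names. `cache_hit_rate` is treated as optional because only the A sketch emits it.

```python
from typing import Optional, TypedDict


class UsagePayload(TypedDict, total=False):
    rate_1m: float             # tokens/min for the tach
    window_5h_tokens: float
    window_5h_pct: float       # 0.0 .. 1.0
    window_7d_tokens: float
    window_7d_pct: float       # 0.0 .. 1.0
    cache_hit_rate: float      # optional: only the A daemon sketch returns it
    hot: bool
    warn: bool
    stall: bool
    idle: bool
    last_model: Optional[str]


# Fields the firmware needs regardless of which backend is wired up.
REQUIRED_FIELDS = (
    "rate_1m", "window_5h_tokens", "window_5h_pct",
    "window_7d_tokens", "window_7d_pct",
    "hot", "warn", "stall", "idle", "last_model",
)


def missing_fields(payload: dict) -> list[str]:
    """Return the required field names absent from a /usage payload."""
    return [name for name in REQUIRED_FIELDS if name not in payload]
```

Running `missing_fields` against both daemons' responses in a test is a cheap way to keep A and B interchangeable.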

Architecture A: OTEL-native

Stack

Mirror Anthropic's reference stack (claude-code-monitoring-guide). Three containers.

# docker-compose.yml
services:
  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest
    command: ["--config=/etc/otel-collector-config.yaml"]
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "4317:4317"   # OTLP gRPC
      - "4318:4318"   # OTLP HTTP
      - "8889:8889"   # Prometheus scrape
    depends_on:
      - prometheus

  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention.time=8d'   # > 7d so increase() works
      - '--web.enable-lifecycle'

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
    volumes:
      - grafana_data:/var/lib/grafana
      - ./grafana/provisioning:/etc/grafana/provisioning
      - ./grafana/dashboards:/var/lib/grafana/dashboards
    depends_on:
      - prometheus

volumes:
  prometheus_data:
  grafana_data:

# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    timeout: 1s
    send_batch_size: 1024
  memory_limiter:
    check_interval: 1s
    limit_mib: 512

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"
    send_timestamps: true
    metric_expiration: 192h    # 8 days, covers 7d window
    enable_open_metrics: true

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [prometheus]

# prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'otel-collector'
    static_configs:
      - targets: ['otel-collector:8889']

Import Grafana Labs dashboard 25052 ("Claude Code") against the Prometheus data source. That is the deep-stats dashboard; no custom web UI needed.

Claude Code configuration

Set in the shell Claude Code runs in (user profile, systemd unit, or ~/.claude/settings.json managed settings):

export CLAUDE_CODE_ENABLE_TELEMETRY=1
export OTEL_METRICS_EXPORTER=otlp
export OTEL_LOGS_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_PROTOCOL=grpc
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
export OTEL_METRIC_EXPORT_INTERVAL=10000   # 10s for gauge responsiveness
export OTEL_METRICS_INCLUDE_SESSION_ID=false   # bound cardinality
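
Because these variables have to be set on every launch surface, a small check helps catch a shell that missed them. A sketch, assuming the four values above that the metrics path depends on; `EXPECTED` and `telemetry_env_problems` are hypothetical names, not part of Claude Code.

```python
import os

# Values copied from the export lines above (metrics path only).
EXPECTED = {
    "CLAUDE_CODE_ENABLE_TELEMETRY": "1",
    "OTEL_METRICS_EXPORTER": "otlp",
    "OTEL_EXPORTER_OTLP_PROTOCOL": "grpc",
    "OTEL_EXPORTER_OTLP_ENDPOINT": "http://localhost:4317",
}


def telemetry_env_problems(env=None) -> list[str]:
    """List the variables that are unset or differ from the expected value."""
    env = os.environ if env is None else env
    problems = []
    for name, want in EXPECTED.items():
        got = env.get(name)
        if got != want:
            problems.append(f"{name}: expected {want!r}, got {got!r}")
    return problems
```

Run it from the same shell, unit, or wrapper that launches Claude Code; an empty list means the collector should start receiving metrics.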

Metrics Claude Code emits (via OTEL, surfaced in Prometheus)

All prefixed claude_code_ after the OTEL-to-Prom conversion.

Prometheus metric	Labels
`claude_code_token_usage_tokens_total`	`type` (`input`/`output`/`cacheRead`/`cacheCreation`), `model`
`claude_code_cost_usage_USD_total`	`model`
`claude_code_session_count_total`
`claude_code_active_time_total_seconds_total`	`type` (`user`/`cli`)
`claude_code_lines_of_code_count_total`	`type` (`added`/`removed`)
`claude_code_commit_count_total`
`claude_code_pull_request_count_total`
`claude_code_code_edit_tool_decision_count_total`	`tool_name`, `decision`, `language`

Events (via OTEL logs) carry richer per-request context including prompt.id, duration_ms, speed (fast/normal), etc. Not needed for the primary gauges.

PromQL the daemon runs

# Tokens/min, short rolling window (tach)
sum(rate(claude_code_token_usage_tokens_total[1m])) * 60

# 5h window sum (left fuel)
sum(increase(claude_code_token_usage_tokens_total[5h]))

# 7d window sum (right fuel)
sum(increase(claude_code_token_usage_tokens_total[7d]))

# Cache hit rate (optional sub-gauge)
  sum(rate(claude_code_token_usage_tokens_total{type="cacheRead"}[5m]))
/ sum(rate(claude_code_token_usage_tokens_total{type=~"input|cacheRead|cacheCreation"}[5m]))

# Last model (approximation: model with the most output tokens in the last 5m)
topk(1, sum by (model) (increase(claude_code_token_usage_tokens_total{type="output"}[5m])))

# Cost estimates
sum(increase(claude_code_cost_usage_USD_total[5h]))
sum(increase(claude_code_cost_usage_USD_total[7d]))

# Stall detection (no tokens in last N minutes)
absent(rate(claude_code_token_usage_tokens_total[2m]) > 0)
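
The queries above differ mostly in their range window, so the builders in windows.py could be plain string functions. A sketch under that assumption; the function names are hypothetical, and the window arguments are ordinary PromQL durations.

```python
METRIC = "claude_code_token_usage_tokens_total"


def tach_query(window: str = "1m") -> str:
    """Tokens per minute over a short rolling window."""
    return f"sum(rate({METRIC}[{window}])) * 60"


def window_sum_query(window: str) -> str:
    """Total tokens over a plan window such as '5h' or '7d'."""
    return f"sum(increase({METRIC}[{window}]))"


def cache_hit_query(window: str = "5m") -> str:
    """Share of input-side tokens that were served from cache."""
    reads = f'sum(rate({METRIC}{{type="cacheRead"}}[{window}]))'
    total = f'sum(rate({METRIC}{{type=~"input|cacheRead|cacheCreation"}}[{window}]))'
    return f"{reads} / {total}"
```

Keeping the window as a parameter makes it easy to try a 2m tach during calibration without touching the daemon.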

Daemon (A)

Thin Python service. Queries Prometheus, transforms to /usage payload for the firmware.

src/claude_gauge/
  __init__.py
  daemon_prom.py       FastAPI app, PromQL queries, /usage endpoint
  config.py            Prometheus URL, ceilings, stall threshold
  windows.py           PromQL builders and result parsing
  calibration.py       Maps raw values to firmware-friendly 0-1000 scales

# daemon_prom.py sketch
import math
import os
import httpx
from fastapi import FastAPI

PROM = os.environ.get("CLAUDE_GAUGE_PROM_URL", "http://localhost:9090")
CEIL_5H = int(os.environ.get("CLAUDE_GAUGE_5H_CEILING", 500_000))
CEIL_7D = int(os.environ.get("CLAUDE_GAUGE_7D_CEILING", 3_000_000))
RED = int(os.environ.get("CLAUDE_GAUGE_TACH_REDLINE", 8000))  # tokens/min

app = FastAPI()
client = httpx.AsyncClient(timeout=5.0)

async def prom(q: str) -> float:
    """Run an instant query and return the first sample (0.0 if no series)."""
    r = await client.get(f"{PROM}/api/v1/query", params={"query": q})
    r.raise_for_status()
    data = r.json()["data"]["result"]
    value = float(data[0]["value"][1]) if data else 0.0
    # Prometheus returns NaN for 0/0 (e.g. cache hit rate with no traffic).
    # NaN is not valid JSON, so report it as 0.0.
    return value if math.isfinite(value) else 0.0

@app.get("/usage")
async def usage():
    rate_1m = await prom("sum(rate(claude_code_token_usage_tokens_total[1m])) * 60")
    win_5h  = await prom("sum(increase(claude_code_token_usage_tokens_total[5h]))")
    win_7d  = await prom("sum(increase(claude_code_token_usage_tokens_total[7d]))")
    cache   = await prom(
        'sum(rate(claude_code_token_usage_tokens_total{type="cacheRead"}[5m])) / '
        'sum(rate(claude_code_token_usage_tokens_total{type=~"input|cacheRead|cacheCreation"}[5m]))'
    )
    stalled = (await prom(
        'sum(rate(claude_code_token_usage_tokens_total[2m]))'
    )) == 0.0
    return {
        "rate_1m": rate_1m,
        "window_5h_tokens": win_5h,
        "window_5h_pct": min(1.0, win_5h / CEIL_5H),
        "window_7d_tokens": win_7d,
        "window_7d_pct": min(1.0, win_7d / CEIL_7D),
        "cache_hit_rate": cache,
        "hot": rate_1m > RED,
        "warn": (win_5h / CEIL_5H) > 0.8 or (win_7d / CEIL_7D) > 0.8,
        "stall": stalled,
        "idle": rate_1m == 0,   # daemon answered, no tokens in the last minute
        "last_model": await last_model(),
    }

last_model needs one extra query that picks the model label of the most recently incremented output-token series. Implementation detail; simplest is to run a small query loop on metric labels.
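
One way to do that, sketched under assumptions: ask Prometheus for per-model output-token increase over a recent window and take the largest. This picks the busiest recent model, which only approximates "most recent"; `LAST_MODEL_QUERY` and `pick_last_model` are hypothetical names, and the 5m window is a guess. The function takes the `data.result` list of an instant query.

```python
from typing import Optional

LAST_MODEL_QUERY = (
    "sum by (model) "
    '(increase(claude_code_token_usage_tokens_total{type="output"}[5m]))'
)


def pick_last_model(result: list[dict]) -> Optional[str]:
    """Return the model label with the largest positive value, or None."""
    best_model, best_value = None, 0.0
    for series in result:
        value = float(series["value"][1])   # samples arrive as [ts, "string"]
        if value > best_value:              # NaN compares False and is skipped
            best_model = series["metric"].get("model")
            best_value = value
    return best_model
```

Returning None when every series is flat lets the firmware leave all three model lamps dark rather than guessing.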

Dependencies (A)

# pyproject.toml additions
dependencies = [
    "fastapi>=0.136.0",
    "uvicorn[standard]>=0.44.0",
    "httpx>=0.28.1",
]

No SQLite, no watchdog, no ORM. Prometheus is the database.

Retention considerations

Collector metric_expiration: 192h keeps a metric visible for 8d after its last sample, so 7d increase() queries work even on intermittent sessions.
Prometheus --storage.tsdb.retention.time=8d keeps the samples long enough for the same 7d queries.
Grafana dashboard 25052 pulls from the same Prometheus.

Pros and cons of A

Pros:

Uses the platform feature Anthropic ships.
Grafana dashboard is free.
Metric schema is documented and stable.
Plays cleanly with any other homelab metrics already in Prometheus.
Architecture translates without changes when other machines run Claude Code too: point their OTLP endpoint at the same collector.

Cons:

Prometheus scrape interval caps tach responsiveness at ~15s.
Three containers to run.
Requires env-var changes on every Claude Code launch surface.

Tach responsiveness mitigation (A)

If the 15s cap bothers you, the daemon can keep a tiny JSONL-tail fallback just for the tach, with the same code shape as architecture B's tach component (described below). Fuel gauges and everything else come from Prometheus; the tach comes from a direct file tail. A clean hybrid. Only activate it if Phase C shows the needle feels sluggish.

Architecture B: ccusage-sourced

Stack

One process: ccusage as a long-lived subprocess or periodic shell call. No collector, no Prometheus, no Grafana. A hybrid watchdog tail handles the sub-second tach that ccusage's aggregate API can't.

[ Claude Code ]  ->  ~/.claude/projects/**/*.jsonl
                            |
                            +---+----------------+
                            |                    |
                            v                    v
                [ watchdog tail ]       [ ccusage CLI / MCP ]
                (short-window tach)     (5h blocks, 7d daily)
                            |                    |
                            +----------+---------+
                                       v
                            [ claude-gauge daemon ]
                                 GET /usage
                                       |
                                       v
                               ESP32 firmware

ccusage integration options

Two shapes work. Pick one, not both.

Option B1: periodic CLI subprocess (simplest)

npx ccusage@latest blocks --json        # current 5h block
npx ccusage@latest daily  --json        # per-day aggregates for 7d sum

Run every ~10s from the daemon. Parse JSON, fill the fuel gauges.

Option B2: ccusage MCP HTTP server (persistent)

bunx @ccusage/mcp@latest --type http --port 8080

Exposes a Hono app at POST / handling MCP StreamableHTTP requests. Four registered tools:

Tool	Description
`daily`	Usage grouped by date
`monthly`	Usage grouped by month
`session`	Usage grouped by conversation session
`blocks`	Usage grouped by 5-hour session billing blocks

Each tool accepts since, until, mode, timezone, locale and returns JSON in an MCP text content block.

Invoke as an MCP client from the daemon (mcp Python SDK) or as raw JSON-RPC to POST /.
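
For the raw JSON-RPC route, the request body follows the MCP `tools/call` method. A sketch of the body only, assuming the tool names in the table above; `tool_call_request` is a hypothetical helper. A Streamable HTTP server normally also expects an `initialize` exchange first and an `Accept` header covering both `application/json` and `text/event-stream`, neither of which is shown here.

```python
import json
from itertools import count

_ids = count(1)   # JSON-RPC request ids, unique per process


def tool_call_request(tool: str, **arguments) -> str:
    """Build a JSON-RPC 2.0 body that calls one MCP tool."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })
```

The mcp Python SDK handles the handshake and session headers for you, which is the main argument for using it over hand-built requests.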

Recommendation

B1. The CLI path is simpler, has fewer moving parts, and the performance hit of a subprocess call every 10s is negligible. Switch to B2 only if you also want the MCP surface exposed to other local agents (Claude Code can already consume ccusage's MCP).

Short-window tach via watchdog

ccusage aggregates are too coarse for the tach. The daemon keeps its own 60-second ring buffer by tailing JSONL directly.

from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler
from collections import deque
from pathlib import Path
import json, time

class JsonlTail(FileSystemEventHandler):
    def __init__(self, bus):
        self.bus = bus
        self.offsets: dict[Path, int] = {}

    def on_modified(self, event):
        p = Path(event.src_path)
        if event.is_directory or p.suffix != ".jsonl":
            return
        off = self.offsets.get(p, 0)
        # Binary mode keeps byte offsets exact and exposes half-written lines.
        with p.open("rb") as f:
            f.seek(off)
            for raw in f:
                if not raw.endswith(b"\n"):
                    break  # writer is mid-line; pick it up on the next event
                off += len(raw)
                try:
                    d = json.loads(raw)
                except ValueError:  # malformed JSON or bad UTF-8
                    continue
                if isinstance(d, dict) and d.get("type") == "assistant":
                    u = d.get("message", {}).get("usage", {})
                    tokens = sum(u.get(k, 0) for k in (
                        "input_tokens", "output_tokens",
                        "cache_read_input_tokens",
                        "cache_creation_input_tokens",
                    ))
                    model = d.get("message", {}).get("model", "")
                    self.bus.push(time.time(), tokens, model)
        self.offsets[p] = off

class RateBus:
    def __init__(self, window_s=60):
        self.window_s = window_s
        self.buf: deque[tuple[float, int, str]] = deque()

    def push(self, ts, tokens, model):
        self.buf.append((ts, tokens, model))
        self._evict()

    def _evict(self):
        cutoff = time.time() - self.window_s
        while self.buf and self.buf[0][0] < cutoff:
            self.buf.popleft()

    def rate_per_min(self):
        self._evict()
        # Scale the windowed sum to a per-minute rate (identity when window_s == 60).
        return sum(t for _, t, _ in self.buf) * 60 / self.window_s

    def last_model(self):
        return self.buf[-1][2] if self.buf else None
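
The handler still needs an observer to feed it events. A minimal sketch, assuming the JSONL lives under `~/.claude/projects` as in the diagram above; `projects_root` and `start_observer` are hypothetical names. In tail.py this would sit behind `start_watcher(bus)`, which builds the handler and passes it in.

```python
from pathlib import Path
from typing import Optional


def projects_root() -> Path:
    """Directory Claude Code writes session JSONL into (assumed default)."""
    return Path.home() / ".claude" / "projects"


def start_observer(handler, root: Optional[Path] = None):
    """Watch root recursively and dispatch file events to handler."""
    # Imported here so the module loads even where watchdog is not installed.
    from watchdog.observers import Observer

    observer = Observer()
    observer.schedule(handler, str(root or projects_root()), recursive=True)
    observer.start()   # runs in its own background thread
    return observer
```

Keep the returned observer around so the daemon can call `stop()` and `join()` on shutdown.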

Daemon (B)

src/claude_gauge/
  __init__.py
  daemon_ccusage.py    FastAPI app, ccusage subprocess calls, /usage
  tail.py              watchdog + RateBus for tach
  config.py
  calibration.py

# daemon_ccusage.py sketch
import asyncio, json, os, subprocess
from fastapi import FastAPI
from .tail import RateBus, start_watcher

CEIL_5H = int(os.environ.get("CLAUDE_GAUGE_5H_CEILING", 500_000))
CEIL_7D = int(os.environ.get("CLAUDE_GAUGE_7D_CEILING", 3_000_000))
RED = int(os.environ.get("CLAUDE_GAUGE_TACH_REDLINE", 8000))

bus = RateBus(window_s=60)
start_watcher(bus)   # background thread

app = FastAPI()

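async def ccusage(cmd: str) -> dict:
    """Run `npx ccusage@latest <cmd> --json` and return the parsed output."""
    proc = await asyncio.create_subprocess_exec(
        "npx", "ccusage@latest", cmd, "--json",
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    out, err = await proc.communicate()
    if proc.returncode != 0:
        raise RuntimeError(f"ccusage {cmd} failed: {err.decode(errors='replace').strip()}")
    return json.loads(out)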

async def current_5h_tokens() -> int:
    blocks = await ccusage("blocks")
    cur = next((b for b in blocks.get("blocks", []) if b.get("isActive")), None)
    return cur["totalTokens"] if cur else 0

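async def trailing_7d_tokens() -> int:
    daily = await ccusage("daily")
    # Sum the last 7 daily buckets. Caveat: days without usage may have no
    # bucket, in which case this reaches back further than 7 calendar days.
    rows = daily.get("daily", [])[-7:]
    return sum(r["totalTokens"] for r in rows)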

@app.get("/usage")
async def usage():
    rate = bus.rate_per_min()
    w5, w7 = await asyncio.gather(current_5h_tokens(), trailing_7d_tokens())
    return {
        "rate_1m": rate,
        "window_5h_tokens": w5,
        "window_5h_pct": min(1.0, w5 / CEIL_5H),
        "window_7d_tokens": w7,
        "window_7d_pct": min(1.0, w7 / CEIL_7D),
        "hot": rate > RED,
        "warn": (w5 / CEIL_5H) > 0.8 or (w7 / CEIL_7D) > 0.8,
        "stall": rate == 0 and not bus.buf,
        "idle": rate == 0,   # daemon answered, no tokens in the tach window
        "last_model": bus.last_model(),
    }
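
Slicing the last seven rows counts buckets, not days. If gaps matter, the 7d sum can filter on each row's date instead. A sketch that assumes every daily row carries an ISO `date` field (`YYYY-MM-DD`) next to `totalTokens`; check that against real `daily --json` output before relying on it. `trailing_tokens` is a hypothetical name.

```python
from datetime import date, timedelta


def trailing_tokens(rows: list[dict], today: date, days: int = 7) -> int:
    """Sum totalTokens for rows dated within the last `days` calendar days."""
    cutoff = today - timedelta(days=days - 1)   # window includes today
    total = 0
    for row in rows:
        day = date.fromisoformat(row["date"])   # assumed field and format
        if cutoff <= day <= today:
            total += row["totalTokens"]
    return total
```

This is a calendar-day window rather than a rolling 168 hours, which is as fine-grained as daily buckets allow.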

Cache ccusage blocks/daily output with a 10s TTL so the /usage endpoint stays cheap when the firmware polls at 1 Hz.
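
A minimal sketch of that cache, assuming an async fetch callable per key; `TTLCache` is a hypothetical name and the injectable `clock` exists only to make it testable. The per-key lock keeps concurrent polls from spawning several `npx` processes at once.

```python
import asyncio
import time


class TTLCache:
    """Cache async fetch results per key for ttl_s seconds."""

    def __init__(self, ttl_s: float = 10.0, clock=time.monotonic):
        self.ttl_s = ttl_s
        self.clock = clock
        self._store: dict[str, tuple[float, object]] = {}
        self._locks: dict[str, asyncio.Lock] = {}

    async def get(self, key: str, fetch):
        lock = self._locks.setdefault(key, asyncio.Lock())
        async with lock:   # one refresh per key at a time
            hit = self._store.get(key)
            if hit is not None and self.clock() - hit[0] < self.ttl_s:
                return hit[1]
            value = await fetch()
            self._store[key] = (self.clock(), value)
            return value
```

In the daemon this would wrap the subprocess call, for example `await cache.get("blocks", lambda: ccusage("blocks"))`.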

Dependencies (B)

dependencies = [
    "fastapi>=0.136.0",
    "uvicorn[standard]>=0.44.0",
    "watchdog>=5.0.0",
]

Node needs to be on the PATH for npx ccusage@latest. Pin a version in config rather than using @latest once the daemon is past Phase A.

Pros and cons of B

Pros:

Single process, one dependency tree.
Sub-second tach works out of the box via the watchdog tail.
No service stack, no Docker, no collector.
ccusage is actively maintained and has already solved the edge cases in JSONL parsing (missing fields, renamed formats, cache token math, cost per model).

Cons:

No free Grafana dashboard. If you want deep stats, either run ccusage interactively or build something.
Node on the runtime path.
JSONL format is an implementation detail; upstream changes could break parsing. ccusage tracks these but there's a lag window.
Does not generalise if other machines also run Claude Code; each one needs its own daemon.

Hardware (shared by A and B)

Movement

Switec X27.168 automotive stepper motor. 315-degree sweep in 1/3-degree steps, so 945 steps full-scale (the step count the SwitecX25 library examples use). ~$8 each. Used in car dashboards, so enclosures and bezels exist off the shelf.

Related cousins: X25, VID28, VID29, BKA30D-R5. The library supports all of them, but X27.168 has the longest sweep and the most available tutorials.

Driver

SwitecX25 Arduino library (clearwater/SwitecX25 on GitHub). Works for X27.168 despite the name. Drives 4 GPIO pins per motor. No external driver IC required for short wiring runs; use small transistor arrays (ULN2003A) if you want cleaner current handling.

No maintained MicroPython port exists. Firmware is Arduino C++ rather than MicroPython. Not the original plan, but the right trade.

Board

ESP32 DevKit (generic). WiFi, enough GPIO for 3 steppers (12 pins) plus 7 annunciator LEDs and a reset button. ~$8.

Alternative: Raspberry Pi Pico W. Less toolchain overhead if you prefer CircuitPython, but you'd still be hand-rolling the stepper driver.

Wiring sketch

ESP32 DevKit
  GPIO 13,14,27,26  -->  X27.168 #1 (left fuel)
  GPIO 25,33,32,23  -->  X27.168 #2 (tach)
  GPIO 22,2,12,15   -->  X27.168 #3 (right fuel; 2, 12, 15 are boot-strapping pins, check the board still boots with the motor attached)
  GPIO 21  -->  OPUS LED   (red)
  GPIO 19  -->  SONNET LED (amber)
  GPIO 18  -->  HAIKU LED  (green)
  GPIO 5   -->  HOT LED    (red, PWM for flashing)
  GPIO 17  -->  WARN LED   (amber)
  GPIO 16  -->  STALL LED  (blue)
  GPIO 4   -->  IDLE LED   (green, pulses while daemon reachable)
  GPIO 34  -->  tactile reset button (external pull-up; GPIO 34-39 are input-only with no internal pull-ups, so they cannot drive coils or LEDs)

220R resistors per LED. Use a separate 5V rail for the steppers if you see brownouts when all three move at once; ESP32's 3V3 rail is fine for signals but the motors pull more than the onboard regulator likes.
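
Pin maps are easy to get wrong on the ESP32: GPIO 34-39 are input-only and GPIO 6-11 are tied to the module's SPI flash. A small host-side check can catch that before anything is soldered. A sketch for the classic ESP32; `check_outputs` is a hypothetical helper and the pin sets should be confirmed against your board's datasheet.

```python
INPUT_ONLY = set(range(34, 40))   # GPIO 34-39: no output driver
FLASH_PINS = set(range(6, 12))    # GPIO 6-11: wired to SPI flash


def check_outputs(assignments: dict[str, list[int]]) -> list[str]:
    """Return problems found in a map of output name -> GPIO numbers."""
    problems = []
    owner: dict[int, str] = {}
    for name, pins in assignments.items():
        for pin in pins:
            if pin in INPUT_ONLY:
                problems.append(f"{name}: GPIO {pin} is input-only")
            if pin in FLASH_PINS:
                problems.append(f"{name}: GPIO {pin} is reserved for flash")
            if pin in owner:
                problems.append(f"{name}: GPIO {pin} already used by {owner[pin]}")
            owner[pin] = name
    return problems
```

Feed it every motor and LED assignment in one dict; an empty list means no pin is input-only, flash-reserved, or double-booked.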

Firmware structure

firmware/
  platformio.ini
  src/
    main.cpp            setup() + loop()
    wifi.cpp            connect + reconnect
    gauge.cpp           wraps SwitecX25; map pct 0..1 to 0..steps
    annunciator.cpp     LED state machine
    poll.cpp            HTTP GET /usage every 1s
    config.h            daemon URL, redline, thresholds

Poll loop:

Every 1000ms, GET http://<daemon>:8080/usage.
Parse JSON (ArduinoJson).
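Set gauge targets: tach.setTargetStep(constrain(map(rate_1m, 0, redline, 0, STEPS), 0, STEPS)), where STEPS is the motor's full-scale step count; likewise for fuels. map() does not clamp, so constrain() keeps an over-redline rate from driving the needle into the end stop.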
Update LED states from hot/warn/stall/idle/last_model.
gauge.update() runs the stepper every loop tick until it hits target.
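
Arduino's `map()` uses integer arithmetic and does not clamp its output, so the rate-to-step mapping is worth modelling on the host first. A Python reference model of what the firmware should compute, not firmware code; `rate_to_step` and its parameter names are hypothetical.

```python
def rate_to_step(rate_per_min: float, redline: float, full_scale_steps: int) -> int:
    """Map a token rate onto a needle position, clamped to the dial."""
    if redline <= 0:
        raise ValueError("redline must be positive")
    # Clamp first so spikes above redline pin the needle at full scale
    # instead of asking the stepper to go past its end stop.
    fraction = min(max(rate_per_min / redline, 0.0), 1.0)
    return int(fraction * full_scale_steps)
```

The same function covers the fuel gauges by passing the window percentage as the rate and 1.0 as the redline.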

Enclosure

Cream faces, hairline burgundy redline zone (matches quartermaster palette if you want the house look).
Brushed aluminium bezel; 3D-print + spray-paint is fine for V1.
Annunciator row behind smoked acrylic so the LEDs only show when lit.
Desk-size footprint: roughly 180mm wide x 90mm tall for the cluster.

Phasing

One phase per issue. No scope bleed between phases.

Phase	Deliverable	Architecture-agnostic?
A	Daemon prints five window values to stdout	No (A or B chosen before start)
B	`/usage` HTTP endpoint, reachable with curl or a browser from another box	No
C	ESP32 firmware driving ONE needle (tach) from the daemon	Yes
D	Three needles plus annunciator row	Yes
E	Calibration period: tune ceilings and redline against real use	Yes
F	Enclosure V1 (printed), cabling, permanent install	Yes
G	(If A) Grafana dashboard wired in; (if B) pick a deep-stats path or decline	Diverges
H	Character metrics and cross-system correlations (em-dash counter, git correlation, quartermaster correlation)	Yes

Do not attempt Phase D before Phase C. Hardware integration is where surprises land; start with one axis.

Recommendation (for Jeff's homelab)

Architecture A.

The homelab-as-enterprise framing is the deciding factor. OTEL is the platform feature, Prometheus is already the right tool, Grafana dashboard 25052 is a free deep-stats surface, and the architecture generalises if other machines start running Claude Code. The 15s scrape interval is the only real concession; if the tach feels sluggish after Phase E, bolt the JSONL tail from B on top for the tach path only. Hybrid.

If you don't already run Prometheus in the homelab, B gets you to a working needle sooner (Phase A ships same day). Migrate to A later if OTEL becomes useful for other things.

Either way, the firmware and cluster are identical. The architecture choice is only about what the daemon reads.

Metrics brainstorm (for later phases)

All derivable from OTEL (A) or from the JSONL directly (B). Not wired into the primary cluster; land in Phase G or a future Grafana panel.

Cost and tokens

Cache hit rate and cache-savings dollar value.
Cost per session at published pricing.
Projected monthly spend.
Opus / Sonnet / Haiku token split.
Server tool use (web search / web fetch) counts.

Time and rhythm

Session count, duration distribution, time-of-day heatmap.
Think-time (user idle) vs work-time (assistant active).
Streak tracking; all-nighter detector.

Work shape

Thinking-to-output ratio as a "cogitation index" gauge.
Stop-reason distribution (watch rising max_tokens).
Tool calls per assistant response (parallelism indicator).

Tool usage

Top tools by count. Bash-command root-executable distribution.
File reads vs edits vs writes per session.
Hottest files across all sessions.
Agent / subagent counts (isSidechain=true).

Project and context

Tokens per project, last-active timestamp, dormant-project detector.

Friction and quality

Permission denial frequency.
File-history-snapshot count per session.

Character

Em-dash violation counter against the CLAUDE.md rule.
Most-used phrase by Claude vs by the user.
Thank-you rate, "Dude, chill" detector.

Cross-system

Git correlation: commits produced, lines changed per token.
Quartermaster correlation: budget-editing days vs Claude load.

Next steps

Decide A or B. Default: A.
File Phase A as the first issue on archeious/claude-gauge.
If A: stand up the Compose stack, point Claude Code at it, and verify metrics reach Prometheus, either in its web UI on port 9090 or with a query against /api/v1/query.
If B: install ccusage, run blocks --json and daily --json by hand, paste the outputs somewhere durable for reference.
Ship Phase A. See the numbers tick in a terminal.
