Initial scaffold

Project skeleton for claude-gauge: a hardware instrument cluster and companion web dashboard driven by a local Python daemon that tails ~/.claude/projects/**/*.jsonl. Includes: - PLAN.md with cluster layout, dashboard sections, architecture, metrics brainstorm, and phasing. - README.md overview. - Python package scaffold (src/claude_gauge), pyproject.toml targeting Python 3.12+ and uv, MIT LICENSE, standard gitignore. No runtime code yet. Phase A (daemon MVP) will land under the first issue. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 19:01:30 -06:00 · 2026-04-17 19:01:30 -06:00 · a683828e6b
commit a683828e6b
8 changed files with 573 additions and 0 deletions
--- a/.gitignore
+++ b/.gitignore
@ -0,0 +1,177 @@
 # ---> Python
 # Byte-compiled / optimized / DLL files
 __pycache__/
 *.py[cod]
 *$py.class
 # C extensions
 *.so
 # Distribution / packaging
 .Python
 build/
 develop-eggs/
 dist/
 downloads/
 eggs/
 .eggs/
 lib/
 lib64/
 parts/
 sdist/
 var/
 wheels/
 share/python-wheels/
 *.egg-info/
 .installed.cfg
 *.egg
 MANIFEST
 # PyInstaller
 #  Usually these files are written by a python script from a template
 #  before PyInstaller builds the exe, so as to inject date/other infos into it.
 *.manifest
 *.spec
 # Installer logs
 pip-log.txt
 pip-delete-this-directory.txt
 # Unit test / coverage reports
 htmlcov/
 .tox/
 .nox/
 .coverage
 .coverage.*
 .cache
 nosetests.xml
 coverage.xml
 *.cover
 *.py,cover
 .hypothesis/
 .pytest_cache/
 cover/
 # Translations
 *.mo
 *.pot
 # Django stuff:
 *.log
 local_settings.py
 db.sqlite3
 db.sqlite3-journal
 quartermaster.db
 quartermaster.db-journal
 *.sqlite
 *.sqlite3
 backups/
 docs/wiki/
 # Brand source assets (optimised versions live in src/quartermaster/static/brand/)
 /logo.png
 *:Zone.Identifier
 # Flask stuff:
 instance/
 .webassets-cache
 # Scrapy stuff:
 .scrapy
 # Sphinx documentation
 docs/_build/
 # PyBuilder
 .pybuilder/
 target/
 # Jupyter Notebook
 .ipynb_checkpoints
 # IPython
 profile_default/
 ipython_config.py
 # pyenv
 #   For a library or package, you might want to ignore these files since the code is
 #   intended to run in multiple environments; otherwise, check them in:
 # .python-version
 # pipenv
 #   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
 #   However, in case of collaboration, if having platform-specific dependencies or dependencies
 #   having no cross-platform support, pipenv may install dependencies that don't work, or not
 #   install all needed dependencies.
 #Pipfile.lock
 # poetry
 #   Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
 #   This is especially recommended for binary packages to ensure reproducibility, and is more
 #   commonly ignored for libraries.
 #   https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
 #poetry.lock
 # pdm
 #   Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
 #pdm.lock
 #   pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
 #   in version control.
 #   https://pdm.fming.dev/latest/usage/project/#working-with-version-control
 .pdm.toml
 .pdm-python
 .pdm-build/
 # PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
 __pypackages__/
 # Celery stuff
 celerybeat-schedule
 celerybeat.pid
 # SageMath parsed files
 *.sage.py
 # Environments
 .env
 .venv
 env/
 venv/
 ENV/
 env.bak/
 venv.bak/
 # Spyder project settings
 .spyderproject
 .spyproject
 # Rope project settings
 .ropeproject
 # mkdocs documentation
 /site
 # mypy
 .mypy_cache/
 .dmypy.json
 dmypy.json
 # Pyre type checker
 .pyre/
 # pytype static type analyzer
 .pytype/
 # Cython debug symbols
 cython_debug/
 # PyCharm
 #  JetBrains specific template is maintained in a separate JetBrains.gitignore that can
 #  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
 #  and can be added to the global gitignore or merged into this file.  For a more nuclear
 #  option (not recommended) you can uncomment the following to ignore the entire idea folder.
 #.idea/
 # Superpowers brainstorm companion working dir
 .superpowers/
--- a/.python-version
+++ b/.python-version
@ -0,0 +1 @@
 3.12
--- a/9
+++ b/9
@ -0,0 +1,9 @@
 MIT License
 Copyright (c) 2026 archeious
 Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
 The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
--- a/PLAN.md
+++ b/PLAN.md
@ -0,0 +1,342 @@
 # claude-gauge
 Hardware instrument cluster plus companion web dashboard for Claude
 Code session telemetry. Primary surface is a physical cluster on the
 desk (fighter-jet / race-car aesthetic). Secondary surface is a
 local web dashboard for the deep stats.
 ## Problem
 There is no official programmatic API for Claude Max plan usage
 today. The `/usage` and `/status` slash commands in Claude Code are
 interactive-only. Open feature requests (claude-code issues #40395,
 #13585, #27217, #33978) track this but nothing has shipped as of
 2026-04-17.
 Rather than wait, drive everything from local Claude Code state by
 watching the session transcript JSONL files in
 `~/.claude/projects/`. Swap the data source later when Anthropic
 ships a real endpoint; the hardware, firmware, daemon API, and
 dashboard all stay the same.
 ## Instrument cluster (primary surface)
 Physical cluster on the desk. Three analog gauges across the top,
 annunciator row below. Backlit. Brushed aluminium bezel, black face,
 cream needles, burgundy redline zone (optional: match the
 quartermaster palette for a house aesthetic).
 ```
    +------------+   +--------------+   +------------+
    |   5h FUEL  |   |   TOKENS/MIN |   |  7d FUEL   |
    |   0 - 100% |   |  0 - redline |   |  0 - 100%  |
    +------------+   +--------------+   +------------+
    [OPUS] [SONNET] [HAIKU]   [HOT] [WARN] [STALL] [IDLE]
 ```
 ### Primary gauges
 | Gauge | Input | Feel |
 |---|---|---|
 | Center tach | tokens/min, short rolling window | Jumpy, fun, shows when Claude is cooking |
 | Left fuel | % of 5h plan window used | Slow, steady, tells you when to worry |
 | Right fuel | % of 7d plan window used | Slowest, long-arc view of the week |
 ### Annunciator row
 | Lamp | Condition |
 |---|---|
 | OPUS / SONNET / HAIKU | lights the model that wrote the most recent tokens |
 | HOT | flashes when tach crosses redline |
 | WARN | solid when either fuel gauge is above 80% |
 | STALL | lights after N minutes of JSONL silence (no activity) |
 | IDLE | green "power on, daemon reachable" indicator |
 Model lamps are colour-coded: Opus deep red, Sonnet amber, Haiku
 green. Visual cue for why the tach just spiked.
 ### Optional fourth gauge
 If physical real estate allows, a "temp" or "boost" sub-gauge:
 * **Cache hit rate** as a boost gauge. High cache read = cheap
  inference. Bragging-rights needle.
 * **Thinking-to-output ratio** as a temp gauge. High = Claude is
  grinding; low = cruising. Tells you when a task is hard.
 ## Companion web dashboard (secondary surface)
 Same daemon serves a browser dashboard at `http://<host>:<port>/`
 with the deep stats. Claude Code is already a terminal tool, so the
 dashboard lives alongside as a geek-out surface when you want to
 drill in past the three-needle summary.
 ### Dashboard sections
 1. **Overview**: the three gauges rendered in the browser, plus the
   annunciator row. Same data as the cluster, so a quick sanity
   check from any device on the LAN.
 2. **Rates and windows**: line charts for tokens/min over last 1h,
   6h, 24h. 5h and 7d window sums over the past month.
 3. **Cost**: dollar estimates derived from token counts and
   published per-model pricing. Today, week-to-date, month-to-date,
   projected month.
 4. **Models**: stacked time series by model (Opus / Sonnet / Haiku).
   Token split pie for the current 7d window.
 5. **Projects**: tokens per project, time per project, last-active
   timestamp. Sortable table.
 6. **Tools**: top tools by call count, success rate per tool,
   Bash command distribution (top commands by root executable).
 7. **Files**: hottest files by Read + Edit count, edit-to-read
   ratio per file.
 8. **Rhythm**: time-of-day heatmap, day-of-week heatmap, session
   duration distribution.
 9. **Raw events**: streaming tail of the latest parsed JSONL rows,
   for debugging and to watch the data land.
 Dashboard is read-only. Nothing the user does here mutates state.
 ## Architecture
 ```
 ~/.claude/projects/**/*.jsonl
          |
          v (watchdog / inotify, line-append events)
          |
  [ gauge daemon ]  <--- local, Python, long-running
          |
          +-- parse new line -> structured event
          |     { ts, session_id, project, model, role,
          |       input_tokens, output_tokens,
          |       cache_read, cache_creation,
          |       thinking_tokens, stop_reason,
          |       tool_use: [...], is_sidechain, request_id }
          |
          +-- in-memory ring buffer (last ~60 min of events)
          +-- periodic flush of events + rolling aggregates to SQLite
          |
          +-- computed windows (primary, cluster):
          |     rate_instant    last 30s, tokens/min
          |     rate_1m         rolling 60s, tokens/min
          |     rate_5m         rolling 5m, tokens/min
          |     window_5h       rolling 5h sum
          |     window_7d       rolling 7d sum
          |
          +-- computed aggregates (secondary, dashboard):
          |     cache hit rate, thinking ratio, cost estimate,
          |     per-model token split, per-project totals,
          |     tool call counts, file touch counts, rhythm grids
          |
          +-- derived flags:
          |     hot, warn, stall, idle, last_model, last_project
          |
          +-- HTTP endpoints:
          |     GET  /usage        primary cluster payload
          |     GET  /stats        deep aggregates for the dashboard
          |     GET  /stream       SSE of parsed events (raw tail)
          |     GET  /             dashboard UI
          |
          v (poll every ~1s)
     [ ESP32 / Pi Pico, on LAN ]
          |
          +-- drives three needles via PWM + low-pass filter
              (or servos if full-swing dials are easier)
          +-- drives annunciator LEDs over GPIO
          +-- optional small OLED behind the cluster for raw numbers
 ```
 ## Real-time honesty
 Claude Code writes a JSONL line **after** each assistant message
 completes, not during streaming. The cluster therefore updates
 per-message, not per-token:
 * Rapid back-and-forth with small messages: tach ticks steadily,
  feels alive.
 * Opus cranking a 40-second response: tach reads zero, zero, zero,
  then SPIKES at the end with the full message's token count.
 Still useful. The spike itself is informative ("that burn just hit
 8k tokens in 40 seconds"). STALL lamp covers the "is it still
 working or is the session idle" ambiguity.
 If true stream-time is wanted later, a Claude Code hook
 (`SessionStart` / `Stop` / `PostToolUse`) can push heartbeats into
 the daemon. Dramatically more work for modest gain. Park as v2.
 ## Calibration
 The daemon does not know Anthropic's real per-plan caps. User
 configures a local ceiling:
 ```
 CLAUDE_GAUGE_5H_CEILING=<tokens>
 CLAUDE_GAUGE_7D_CEILING=<tokens>
 CLAUDE_GAUGE_TACH_REDLINE=<tokens/min>
 ```
 Gauges read against those ceilings. Estimate from experience: run
 for a week at normal usage, note where `/usage` says you are vs
 where the dials read, adjust the config. Ship sensible defaults
 tunable per user.
 ## Data source migration path
 When Anthropic ships an official usage endpoint, the daemon replaces
 its input with that endpoint and drops the JSONL tailing. The HTTP
 shape it serves to the cluster stays the same. The dashboard gains
 accuracy but does not change structure. Zero hardware change, zero
 firmware change.
 ## Stack (proposed, not decided)
 **Daemon**: Python 3.12+, `uv`, `watchdog` for file tail, FastAPI
 for HTTP, SQLite for durable state, Jinja2 for dashboard templates,
 Alpine.js or HTMX for dashboard interactivity. Matches the house
 style (see quartermaster).
 **Firmware**: MicroPython on ESP32 or Pi Pico W for the wifi stack.
 C/Rust if MicroPython HTTP polling turns out wobbly at the required
 update rate.
 **Hardware**: three analog voltmeter movements (0-5V or similar)
 for the gauges, driven from PWM pins through low-pass filters, or
 tiny servos with printed needles. Annunciator row is discrete LEDs
 behind a smoked acrylic face.
 ## Metrics brainstorm (for later inspection)
 Everything on this list is parseable from the JSONL today. The MVP
 daemon only needs the primary-window stats; everything else lands
 under later issues. Capturing them here so they don't get forgotten.
 ### Cost and tokens
 * Total tokens by window (already primary)
 * Breakdown: `input` / `output` / `cache_read` / `cache_creation`
 * **Cache hit rate** = cache_read / (cache_read + cache_creation + input)
 * **Cache savings** in dollars (cache reads are 10% of normal input cost)
 * Cost per session at published per-model pricing
 * Cost per day / week / month; projected month
 * Opus / Sonnet / Haiku token split
 * Server tool use: `web_search_requests`, `web_fetch_requests` per day
 * Service tier distribution (standard vs priority)
 * Ephemeral-1h vs ephemeral-5m cache split
 ### Time and rhythm
 * Sessions per day / per week, rolling average
 * Session duration distribution
 * Time-of-day heatmap (circadian work pattern)
 * Day-of-week heatmap
 * Longest continuous session on record
 * Your think-time: gap between assistant-end and next user message
 * Claude's work-time: user message to assistant complete
 * Active vs idle ratio within sessions
 * Streak tracking (consecutive days used)
 * All-nighter detector (session crossing 2am local)
 ### Work shape
 * **Thinking tokens** counted from content blocks of type `thinking`
 * Thinking-to-output ratio ("cogitation index")
 * Stop-reason distribution (`end_turn` / `tool_use` / `max_tokens`);
  watch for rising `max_tokens` (responses getting cut off)
 * Messages per session
 * Tool calls per assistant response (parallelism indicator)
 * User interrupt rate (sessions ending on cancel)
 * Iteration count per task (assistant messages between two
  consecutive user prompts as a proxy)
 ### Tool usage
 * Top tools by count (Bash, Edit, Read, Grep, ...)
 * Tool success vs failure rate
 * Bash command distribution, parsed by root executable (git, python,
  uv, ls, ...)
 * File reads / edits / writes per session
 * Hottest files by touch count across all sessions
 * Agent / subagent counts (`isSidechain=true`), subagent depth
 * Web search / web fetch counts
 ### Project and context
 * Tokens per project (JSONL path encodes the project)
 * Time per project
 * Project switching rate within a session
 * Dormant project detector (no activity in N days)
 * Languages touched, derived from file extensions
 * Last file edited per project (resume-where-you-left-off)
 ### Friction and quality signals
 * User message length distribution (one-word directives vs prose)
 * Rough correction reflex count: user messages starting with "no",
  "wrong", "stop", "actually"
 * Permission denial frequency
 * Retry / regenerate patterns
 * File-history-snapshot count per session (checkpoint density)
 ### Character and fun
 * Em dash violation count in assistant text (per the CLAUDE.md rule,
  a needle that reads "rule-break events this week")
 * Emoji leakage count
 * Most-used phrase by Claude in your transcripts
 * Most-used phrase by you (captures your actual directive vocabulary)
 * "Dude, chill" detector: count of explicit pushback against Claude
 * Thank-you rate per session
 * Silent sessions (ended without `/compact` or `/clear`)
 ### Cross-system correlations (bigger, later)
 * Git: commits produced per session, lines changed per token spent
 * PRs / issues closed per session
 * Quartermaster: correlate long Claude sessions with budget-editing
  days (just for fun)
 ## Phasing
 ### Phase A — daemon MVP
 One issue. Tail JSONL, parse into structured events, maintain the
 five primary windows, print them to stdout every second. No HTTP,
 no hardware, no dashboard. Prove the numbers land correctly against
 Claude Code's own `/usage` output.
 ### Phase B — HTTP + dashboard overview
 Stand up FastAPI, expose `GET /usage` for the cluster and `GET /`
 for a dashboard overview page (three gauges rendered in SVG, same
 data as the cluster). No deep stats yet. First thing you can load
 in a browser.
 ### Phase C — firmware + single needle
 Minimal ESP32 / Pico firmware that polls `/usage` and drives one
 needle (the tach). Prove the hardware path end to end.
 ### Phase D — full cluster
 Two more needles + annunciator row. Enclosure prototype.
 ### Phase E — dashboard deep stats
 Cost panel, model panel, project panel, rhythm heatmaps, tool
 panel, file panel, raw-event tail. Pulls from the aggregate store
 the daemon has been building since Phase A.
 ### Phase F — character and cross-system
 Em-dash detector, phrase extractor, git correlation, quartermaster
 correlation. Lowest priority, highest amusement.
 ## Next steps
 1. Forgejo repo at `archeious/claude-gauge` (created alongside
   this plan).
 2. File Phase A as the first issue.
 3. Ship Phase A. See the five numbers tick up in a terminal while
   typing into Claude Code. First dopamine hit.
 4. File follow-up issues one phase at a time. No scope bleed
   between phases.
--- a/README.md
+++ b/README.md
@ -0,0 +1,19 @@
 # claude-gauge
 Hardware instrument cluster plus companion web dashboard for Claude
 Code session telemetry.
 A local Python daemon tails `~/.claude/projects/**/*.jsonl`, parses
 structured events into rolling windows and aggregate stats, and
 exposes them over HTTP. An ESP32 or Pi Pico drives a three-gauge
 analog cluster on the desk (tokens/min tach, 5h fuel, 7d fuel) with
 an annunciator row of model and warning lamps. A browser dashboard
 on the same port surfaces the deep stats.
 See [PLAN.md](PLAN.md) for architecture, the instrument cluster
 layout, the dashboard sections, the full metrics brainstorm, and
 the phasing plan.
 ## Status
 Scaffolded. Phase A (daemon MVP) is the first issue.
--- a/pyproject.toml
+++ b/pyproject.toml
@ -0,0 +1,22 @@
 [project]
 name = "claude-gauge"
 version = "0.1.0"
 description = "Instrument cluster and dashboard for Claude Code session telemetry."
 readme = "README.md"
 authors = [
    { name = "Jeff Smith", email = "jeff@unbiasedgeek.com" }
 ]
 requires-python = ">=3.12"
 dependencies = []
 [build-system]
 requires = ["hatchling"]
 build-backend = "hatchling.build"
 [dependency-groups]
 dev = [
    "pytest>=9.0.3",
 ]
 [tool.pytest.ini_options]
 testpaths = ["tests"]
--- a/src/claude_gauge/init.py
+++ b/src/claude_gauge/init.py
@ -0,0 +1,3 @@
 """claude-gauge: Claude Code session telemetry for an instrument cluster and dashboard."""
 __version__ = "0.1.0"
--- a/tests/init.py
+++ b/tests/init.py
		`@ -0,0 +1,3 @@`
							`"""claude-gauge: Claude Code session telemetry for an instrument cluster and dashboard."""`

							`__version__ = "0.1.0"`