# Healthz + Structured JSON Logs Implementation Plan > **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. **Goal:** Add `GET /healthz` endpoint and structured JSON logging to stdout so Quartermaster can satisfy the homelab platform contract (#25) for its first deploy. **Architecture:** One new logging module (`logging_config.py`) loads a JSON config (`src/quartermaster/logconfig.json`) and exposes an `AccessLogFilter` that enriches uvicorn access records with `event="http_request"` and request metadata. One new router module (`routes_health.py`) handles `/healthz`. Five seed events (`logger.info(..., extra={"event": ...})`) are sprinkled at the five most operationally interesting mutations so the Loki view is useful on day one. **Tech Stack:** Python 3.12, FastAPI, uvicorn, SQLAlchemy, pytest, `python-json-logger`. **Spec:** `docs/superpowers/specs/2026-04-19-healthz-and-json-logs-design.md` --- ## File Structure New files: | Path | Responsibility | |---|---| | `src/quartermaster/logconfig.json` | uvicorn/stdlib `dictConfig` — formatter, filter, handler, loggers | | `src/quartermaster/logging_config.py` | Loads `logconfig.json` into `LOG_CONFIG`; defines `AccessLogFilter` | | `src/quartermaster/routes_health.py` | `GET /healthz` router | | `tests/test_logging.py` | Formatter + filter unit tests + drift/load tests | | `tests/test_health.py` | Healthz endpoint tests (200 + 503) | Modified files: | Path | Change | |---|---| | `pyproject.toml` | Add `python-json-logger` to `[project].dependencies` | | `src/quartermaster/main.py` | Include the health router | | `src/quartermaster/service.py` | Add one `logger.info` seed call in `update_entry` | | `src/quartermaster/month_service.py` | Add four `logger.info` seed calls in `create_month`, `close_month`, `add_posting`, `delete_posting` | | `README.md` | Add "Logs" section; update Run block to mention `--log-config` | Ship logging first, then healthz. --- ## Task 1: Add `python-json-logger` dependency **Files:** - Modify: `pyproject.toml` - [ ] **Step 1: Edit `pyproject.toml`** Add `"python-json-logger>=3.2.0",` to the `dependencies` list. The resulting block should read: ```toml dependencies = [ "alembic>=1.18.4", "fastapi>=0.136.0", "jinja2>=3.1.6", "python-json-logger>=3.2.0", "python-multipart>=0.0.26", "sqlalchemy>=2.0.49", "uvicorn[standard]>=0.44.0", ] ``` - [ ] **Step 2: Sync the lock file** Run: `uv sync` Expected: `uv.lock` updated; no errors. - [ ] **Step 3: Confirm the import works** Run: `uv run python -c "from pythonjsonlogger.jsonlogger import JsonFormatter; print(JsonFormatter)"` Expected: prints `` and exits 0. - [ ] **Step 4: Commit** ```bash git add pyproject.toml uv.lock git commit -m "chore: add python-json-logger dependency (#27)" ``` --- ## Task 2: Add JSON log config + formatter test **Files:** - Create: `src/quartermaster/logconfig.json` - Create: `src/quartermaster/logging_config.py` - Test: `tests/test_logging.py` - [ ] **Step 1: Write the failing test** Create `tests/test_logging.py`: ```python from __future__ import annotations import io import json import logging from quartermaster.logging_config import LOG_CONFIG def _build_formatter(): """Instantiate the JSON formatter from LOG_CONFIG for direct testing.""" from pydoc import locate cfg = dict(LOG_CONFIG["formatters"]["json"]) factory_path = cfg.pop("()") factory = locate(factory_path) assert factory is not None, f"formatter factory not importable: {factory_path}" return factory(**cfg) def test_log_config_emits_json_with_required_fields(): formatter = _build_formatter() stream = io.StringIO() handler = logging.StreamHandler(stream) handler.setFormatter(formatter) logger = logging.getLogger("tests.logging_smoke") logger.handlers = [handler] logger.propagate = False logger.setLevel(logging.INFO) logger.info("smoke message", extra={"event": "smoke"}) line = stream.getvalue().strip() assert line, "formatter produced no output" payload = json.loads(line) assert payload["event"] == "smoke" assert payload["level"] == "INFO" assert payload["logger"] == "tests.logging_smoke" assert payload["message"] == "smoke message" assert "timestamp" in payload ``` - [ ] **Step 2: Run the test and verify it fails** Run: `uv run pytest tests/test_logging.py -v` Expected: FAIL with `ModuleNotFoundError: No module named 'quartermaster.logging_config'`. - [ ] **Step 3: Create `src/quartermaster/logconfig.json`** ```json { "version": 1, "disable_existing_loggers": false, "formatters": { "json": { "()": "pythonjsonlogger.jsonlogger.JsonFormatter", "format": "%(asctime)s %(levelname)s %(name)s %(message)s", "rename_fields": { "asctime": "timestamp", "levelname": "level", "name": "logger" } } }, "filters": { "access": { "()": "quartermaster.logging_config.AccessLogFilter" } }, "handlers": { "stdout": { "class": "logging.StreamHandler", "stream": "ext://sys.stdout", "formatter": "json" }, "access": { "class": "logging.StreamHandler", "stream": "ext://sys.stdout", "formatter": "json", "filters": ["access"] } }, "loggers": { "quartermaster": { "level": "INFO", "handlers": ["stdout"], "propagate": false }, "uvicorn": { "level": "INFO", "handlers": ["stdout"], "propagate": false }, "uvicorn.error": { "level": "INFO", "handlers": ["stdout"], "propagate": false }, "uvicorn.access": { "level": "INFO", "handlers": ["access"], "propagate": false } }, "root": { "level": "INFO", "handlers": ["stdout"] } } ``` - [ ] **Step 4: Create `src/quartermaster/logging_config.py`** ```python from __future__ import annotations import json import logging from pathlib import Path _CONFIG_PATH = Path(__file__).parent / "logconfig.json" LOG_CONFIG = json.loads(_CONFIG_PATH.read_text()) class AccessLogFilter(logging.Filter): """Enrich uvicorn access records with structured fields for Loki. Uvicorn emits access records whose ``args`` tuple is ``(client_addr, method, full_path, http_version, status_code)``. """ def filter(self, record: logging.LogRecord) -> bool: args = record.args if isinstance(args, tuple) and len(args) >= 5: record.event = "http_request" record.client_ip = args[0] record.method = args[1] record.path = args[2] record.status = args[4] return True ``` - [ ] **Step 5: Run the test and verify it passes** Run: `uv run pytest tests/test_logging.py -v` Expected: PASS. - [ ] **Step 6: Commit** ```bash git add src/quartermaster/logconfig.json src/quartermaster/logging_config.py tests/test_logging.py git commit -m "feat(logging): add JSON log config and formatter (#27)" ``` --- ## Task 3: AccessLogFilter enriches uvicorn access records **Files:** - Modify: `src/quartermaster/logging_config.py` (already has `AccessLogFilter` from Task 2 — this task adds the test to lock in behavior) - Test: `tests/test_logging.py` - [ ] **Step 1: Write the failing test** Append to `tests/test_logging.py`: ```python def test_access_log_filter_enriches_record(): from quartermaster.logging_config import AccessLogFilter record = logging.LogRecord( name="uvicorn.access", level=logging.INFO, pathname=__file__, lineno=0, msg='%s - "%s %s HTTP/%s" %d', args=("127.0.0.1:1234", "GET", "/healthz", "1.1", 200), exc_info=None, ) assert AccessLogFilter().filter(record) is True assert record.event == "http_request" assert record.method == "GET" assert record.path == "/healthz" assert record.status == 200 assert record.client_ip == "127.0.0.1:1234" def test_access_log_filter_passes_through_non_access_records(): """Non-access records (args tuple shape mismatch) are kept, unmodified.""" from quartermaster.logging_config import AccessLogFilter record = logging.LogRecord( name="quartermaster", level=logging.INFO, pathname=__file__, lineno=0, msg="regular app log", args=None, exc_info=None, ) assert AccessLogFilter().filter(record) is True assert not hasattr(record, "event") or record.event is None assert not hasattr(record, "method") or record.method is None ``` - [ ] **Step 2: Run the tests and verify they pass** Run: `uv run pytest tests/test_logging.py -v` Expected: PASS for all four tests (the two from Task 2 plus the two above). `AccessLogFilter` was already implemented in Task 2; these tests codify its contract. - [ ] **Step 3: Commit** ```bash git add tests/test_logging.py git commit -m "test(logging): cover AccessLogFilter contract (#27)" ``` --- ## Task 4: Wire uvicorn to the log config + update README **Files:** - Modify: `README.md` - [ ] **Step 1: Update the "Run" section of `README.md`** Replace the current Run block (`uv run uvicorn quartermaster.main:app --reload`) with: ````markdown ## Run ```sh uv run uvicorn quartermaster.main:app \ --log-config src/quartermaster/logconfig.json \ --reload ``` Open http://127.0.0.1:8000. ```` - [ ] **Step 2: Add a "Logs" section to `README.md`** Insert this section between "Run" and "Tests": ````markdown ## Logs Logs are JSON on stdout. Each line has `level` and `event` as queryable Loki labels (indexed by Promtail on the deploy host), plus arbitrary extra fields in the JSON body. Example LogQL queries in Grafana Explore (Loki data source): ``` {container="quartermaster"} | json {container="quartermaster", event="http_request", status=~"5.."} {container="quartermaster", event="month_closed"} | json | line_format "{{.path}} {{.message}}" ``` HTTP access logs appear as `event="http_request"` with `method`, `path`, `status`, and `client_ip` extras. Application events (`month_created`, `month_closed`, `template_entry_updated`, `posting_added`, `posting_deleted`) fire at the matching mutation sites. ```` - [ ] **Step 3: Manual smoke test** Run: `uv run uvicorn quartermaster.main:app --log-config src/quartermaster/logconfig.json --port 8765 &` Then in another shell: `curl -s http://127.0.0.1:8765/ > /dev/null` Then: `kill %1` Expected: uvicorn's stdout shows JSON lines like `{"timestamp": "...", "level": "INFO", "logger": "uvicorn.access", "message": "...", "event": "http_request", "client_ip": "127.0.0.1:...", "method": "GET", "path": "/", "status": 200}`. If the output is plain text, the `--log-config` path is wrong or the JSON file has a syntax error. - [ ] **Step 4: Commit** ```bash git add README.md git commit -m "docs: README Logs section and --log-config flag (#27)" ``` --- ## Task 5: Seed app events in `month_service.py` **Files:** - Modify: `src/quartermaster/month_service.py` - Test: `tests/test_logging.py` This task adds `logger.info(..., extra={"event": ...})` calls at four sites: `create_month`, `close_month`, `add_posting`, `delete_posting`. - [ ] **Step 1: Write the failing tests** Append to `tests/test_logging.py`: ```python import pytest from decimal import Decimal from datetime import date from quartermaster import month_service, service from quartermaster.models import Section def _make_debt_minimum(db, name="Loan", amount=Decimal("100.00")): from quartermaster.models import Entry entry = Entry(section=Section.debt_minimum, name=name, amount=amount) db.add(entry) db.commit() db.refresh(entry) return entry def test_create_month_logs_month_created_event(db, caplog): caplog.set_level(logging.INFO, logger="quartermaster") month_service.create_month(db, "2026-05") events = [r for r in caplog.records if getattr(r, "event", None) == "month_created"] assert len(events) == 1 assert events[0].levelname == "INFO" def test_close_month_logs_month_closed_event(db, caplog): month = month_service.create_month(db, "2026-06") # move through active → closed with a zero-applied budget (no entries yet) month_service.activate_month(db, month) caplog.clear() caplog.set_level(logging.INFO, logger="quartermaster") month_service.close_month(db, month) events = [r for r in caplog.records if getattr(r, "event", None) == "month_closed"] assert len(events) == 1 def test_add_posting_logs_posting_added_event(db, caplog): _make_debt_minimum(db) month = month_service.create_month(db, "2026-07") entry = month.entries[0] caplog.set_level(logging.INFO, logger="quartermaster") month_service.add_posting( db, month, entry.id, occurred_on=date(2026, 7, 15), amount=Decimal("25.00"), ) events = [r for r in caplog.records if getattr(r, "event", None) == "posting_added"] assert len(events) == 1 def test_delete_posting_logs_posting_deleted_event(db, caplog): _make_debt_minimum(db) month = month_service.create_month(db, "2026-08") entry = month.entries[0] posting = month_service.add_posting( db, month, entry.id, occurred_on=date(2026, 8, 15), amount=Decimal("10.00"), ) caplog.clear() caplog.set_level(logging.INFO, logger="quartermaster") month_service.delete_posting(db, month, posting.id) events = [r for r in caplog.records if getattr(r, "event", None) == "posting_deleted"] assert len(events) == 1 ``` - [ ] **Step 2: Run the tests and verify they fail** Run: `uv run pytest tests/test_logging.py -v -k "logs_"` Expected: FAIL — no records with the expected `event` values. - [ ] **Step 3: Add the module logger and four `logger.info` calls** At the top of `src/quartermaster/month_service.py`, after the existing imports, add: ```python import logging logger = logging.getLogger("quartermaster.month_service") ``` Inside `create_month`, immediately before the final `return month` (after `db.commit(); db.refresh(month)`): ```python logger.info( "created month snapshot", extra={"event": "month_created", "year_month": month.year_month}, ) return month ``` Inside `close_month`, immediately before the final `return month` (after `db.commit(); db.refresh(month)`): ```python logger.info( "closed month", extra={"event": "month_closed", "year_month": month.year_month}, ) return month ``` Inside `add_posting`, immediately before the final `return posting` (after `db.commit(); db.refresh(posting)`): ```python logger.info( "added posting", extra={ "event": "posting_added", "posting_id": posting.id, "month_entry_id": posting.month_entry_id, "amount": str(posting.amount), }, ) return posting ``` Inside `delete_posting` — look at the full function; it needs the log line at the end, after `db.commit()` and before the function returns. Here is the exact shape: ```python def delete_posting( db: Session, month: Month, posting_id: int ) -> MonthEntry | None: """Delete a posting; return the parent entry so callers can re-render its section.""" posting = get_posting(db, month, posting_id) if posting is None: return None entry = posting.entry db.delete(posting) db.commit() logger.info( "deleted posting", extra={"event": "posting_deleted", "posting_id": posting_id}, ) return entry ``` (The `posting_id` parameter is captured before the delete, so logging it after commit is safe.) - [ ] **Step 4: Run the tests and verify they pass** Run: `uv run pytest tests/test_logging.py -v -k "logs_"` Expected: PASS for all four seed-event tests. - [ ] **Step 5: Run the full test suite** Run: `uv run pytest` Expected: all 137 existing tests plus the 8 new logging tests pass. - [ ] **Step 6: Commit** ```bash git add src/quartermaster/month_service.py tests/test_logging.py git commit -m "feat(logging): seed month + posting events (#27)" ``` --- ## Task 6: Seed `template_entry_updated` event in `service.py` **Files:** - Modify: `src/quartermaster/service.py` - Test: `tests/test_logging.py` - [ ] **Step 1: Write the failing test** Append to `tests/test_logging.py`: ```python def test_update_entry_logs_template_entry_updated_event(db, caplog): from quartermaster.models import Entry entry = Entry(section=Section.fixed_bill, name="Power", amount=Decimal("50.00")) db.add(entry) db.commit() db.refresh(entry) caplog.set_level(logging.INFO, logger="quartermaster") service.update_entry(db, entry.id, amount=Decimal("55.00")) events = [ r for r in caplog.records if getattr(r, "event", None) == "template_entry_updated" ] assert len(events) == 1 assert events[0].entry_id == entry.id ``` - [ ] **Step 2: Run the test and verify it fails** Run: `uv run pytest tests/test_logging.py::test_update_entry_logs_template_entry_updated_event -v` Expected: FAIL. - [ ] **Step 3: Add the module logger and `logger.info` call** At the top of `src/quartermaster/service.py`, after the existing imports, add: ```python import logging logger = logging.getLogger("quartermaster.service") ``` Inside `update_entry`, immediately before the final `return entry` (after `db.commit(); db.refresh(entry)`): ```python logger.info( "updated template entry", extra={"event": "template_entry_updated", "entry_id": entry.id}, ) return entry ``` - [ ] **Step 4: Run the test and verify it passes** Run: `uv run pytest tests/test_logging.py::test_update_entry_logs_template_entry_updated_event -v` Expected: PASS. - [ ] **Step 5: Run the full test suite** Run: `uv run pytest` Expected: all tests pass. - [ ] **Step 6: Commit** ```bash git add src/quartermaster/service.py tests/test_logging.py git commit -m "feat(logging): seed template_entry_updated event (#27)" ``` --- ## Task 7: `/healthz` success path **Files:** - Create: `src/quartermaster/routes_health.py` - Modify: `src/quartermaster/main.py` - Test: `tests/test_health.py` - [ ] **Step 1: Write the failing test** Create `tests/test_health.py`: ```python from __future__ import annotations def test_healthz_returns_ok(client): response = client.get("/healthz") assert response.status_code == 200 assert response.json() == {"status": "ok"} ``` - [ ] **Step 2: Run the test and verify it fails** Run: `uv run pytest tests/test_health.py -v` Expected: FAIL with 404 (route not registered). - [ ] **Step 3: Create `src/quartermaster/routes_health.py`** ```python from __future__ import annotations import logging from fastapi import APIRouter, Depends from fastapi.responses import JSONResponse from sqlalchemy import text from sqlalchemy.orm import Session from quartermaster.db import get_session logger = logging.getLogger("quartermaster.health") router = APIRouter() @router.get("/healthz") def healthz(db: Session = Depends(get_session)) -> JSONResponse: try: db.execute(text("SELECT 1")) except Exception as exc: logger.warning( "healthz check failed", extra={"event": "healthz_failed", "error_class": type(exc).__name__}, ) return JSONResponse( status_code=503, content={"status": "error", "detail": type(exc).__name__}, ) return JSONResponse(status_code=200, content={"status": "ok"}) ``` - [ ] **Step 4: Wire the router into the FastAPI app** Edit `src/quartermaster/main.py`. Change the imports block and `create_app` to: ```python from __future__ import annotations from pathlib import Path from fastapi import FastAPI from fastapi.staticfiles import StaticFiles from quartermaster.routes import router from quartermaster.routes_health import router as health_router from quartermaster.routes_month import router as month_router STATIC_DIR = Path(__file__).parent / "static" def create_app() -> FastAPI: app = FastAPI(title="Quartermaster", version="0.1.0") app.mount("/static", StaticFiles(directory=str(STATIC_DIR)), name="static") app.include_router(health_router) app.include_router(router) app.include_router(month_router) return app app = create_app() ``` - [ ] **Step 5: Run the test and verify it passes** Run: `uv run pytest tests/test_health.py::test_healthz_returns_ok -v` Expected: PASS. - [ ] **Step 6: Commit** ```bash git add src/quartermaster/routes_health.py src/quartermaster/main.py tests/test_health.py git commit -m "feat(health): add /healthz endpoint (#26)" ``` --- ## Task 8: `/healthz` returns 503 when the DB check fails **Files:** - Test: `tests/test_health.py` - No production code changes — the exception handler from Task 7 already covers this. - [ ] **Step 1: Write the failing test** Append to `tests/test_health.py`: ```python from sqlalchemy.exc import OperationalError from quartermaster.db import get_session from quartermaster.main import create_app from fastapi.testclient import TestClient def test_healthz_returns_503_when_db_check_raises(): app = create_app() def broken_session(): class BrokenSession: def execute(self, *args, **kwargs): raise OperationalError("boom", None, Exception("boom")) def close(self): pass session = BrokenSession() try: yield session finally: session.close() app.dependency_overrides[get_session] = broken_session with TestClient(app) as client: response = client.get("/healthz") assert response.status_code == 503 body = response.json() assert body["status"] == "error" assert body["detail"] == "OperationalError" ``` - [ ] **Step 2: Run the test and verify it passes** Run: `uv run pytest tests/test_health.py -v` Expected: PASS for both tests. The exception handler in Task 7 already implements the 503 path; this test codifies the contract. - [ ] **Step 3: Run the full suite** Run: `uv run pytest` Expected: all tests pass (137 existing + 9 new logging tests + 2 healthz tests = 148). - [ ] **Step 4: Commit** ```bash git add tests/test_health.py git commit -m "test(health): cover 503 on DB failure (#26)" ``` --- ## Post-implementation verification After Task 8 completes: - [ ] **Full test suite** Run: `uv run pytest -v` Expected: 148 tests pass. - [ ] **End-to-end smoke test with the log config** Run in one shell: `uv run uvicorn quartermaster.main:app --log-config src/quartermaster/logconfig.json --port 8765 &` Then: `curl -s http://127.0.0.1:8765/healthz` Then: `kill %1` Expected: - `curl` prints `{"status":"ok"}`. - uvicorn stdout showed at least one JSON line containing `"event": "http_request"`, `"path": "/healthz"`, `"status": 200`. - [ ] **Close issues #26 and #27** Reference the merge commit on both issues. Leave #25 open. --- ## Dependencies and ordering - Tasks 1 → 2 → 3 → 4 must be sequential (each builds on the last). - Task 5 depends on Task 2 (needs LOG_CONFIG in place to test via caplog, though caplog itself does not require the config — independent in practice). - Task 6 is independent of 5 but logically belongs after. - Tasks 7 → 8 are sequential. - Tasks 5, 6, 7 could theoretically run in parallel after Task 4, but linear is cleaner.