quartermaster/docs/superpowers/plans/2026-04-19-healthz-and-json-logs.md
Jeff Smith ad21a86c83 docs: implementation plan for healthz + JSON logs (#26, #27)
Plan covers 8 TDD tasks landing the /healthz endpoint and structured
JSON logs behind the platform contract (#25).

Also: spec updated to use JSON (not YAML) as the log-config source —
avoids a pyyaml runtime dep and removes drift between the dict tests
apply and the file uvicorn reads.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 12:12:55 -06:00

23 KiB

Healthz + Structured JSON Logs Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Add GET /healthz endpoint and structured JSON logging to stdout so Quartermaster can satisfy the homelab platform contract (#25) for its first deploy.

Architecture: One new logging module (logging_config.py) loads a JSON config (src/quartermaster/logconfig.json) and exposes an AccessLogFilter that enriches uvicorn access records with event="http_request" and request metadata. One new router module (routes_health.py) handles /healthz. Five seed events (logger.info(..., extra={"event": ...})) are sprinkled at the five most operationally interesting mutations so the Loki view is useful on day one.

Tech Stack: Python 3.12, FastAPI, uvicorn, SQLAlchemy, pytest, python-json-logger.

Spec: docs/superpowers/specs/2026-04-19-healthz-and-json-logs-design.md


File Structure

New files:

Path Responsibility
src/quartermaster/logconfig.json uvicorn/stdlib dictConfig — formatter, filter, handler, loggers
src/quartermaster/logging_config.py Loads logconfig.json into LOG_CONFIG; defines AccessLogFilter
src/quartermaster/routes_health.py GET /healthz router
tests/test_logging.py Formatter + filter unit tests + drift/load tests
tests/test_health.py Healthz endpoint tests (200 + 503)

Modified files:

Path Change
pyproject.toml Add python-json-logger to [project].dependencies
src/quartermaster/main.py Include the health router
src/quartermaster/service.py Add one logger.info seed call in update_entry
src/quartermaster/month_service.py Add four logger.info seed calls in create_month, close_month, add_posting, delete_posting
README.md Add "Logs" section; update Run block to mention --log-config

Ship logging first, then healthz.


Task 1: Add python-json-logger dependency

Files:

  • Modify: pyproject.toml

  • Step 1: Edit pyproject.toml

Add "python-json-logger>=3.2.0", to the dependencies list. The resulting block should read:

dependencies = [
    "alembic>=1.18.4",
    "fastapi>=0.136.0",
    "jinja2>=3.1.6",
    "python-json-logger>=3.2.0",
    "python-multipart>=0.0.26",
    "sqlalchemy>=2.0.49",
    "uvicorn[standard]>=0.44.0",
]
  • Step 2: Sync the lock file

Run: uv sync Expected: uv.lock updated; no errors.

  • Step 3: Confirm the import works

Run: uv run python -c "from pythonjsonlogger.jsonlogger import JsonFormatter; print(JsonFormatter)" Expected: prints <class 'pythonjsonlogger.jsonlogger.JsonFormatter'> and exits 0.

  • Step 4: Commit
git add pyproject.toml uv.lock
git commit -m "chore: add python-json-logger dependency (#27)"

Task 2: Add JSON log config + formatter test

Files:

  • Create: src/quartermaster/logconfig.json

  • Create: src/quartermaster/logging_config.py

  • Test: tests/test_logging.py

  • Step 1: Write the failing test

Create tests/test_logging.py:

from __future__ import annotations

import io
import json
import logging

from quartermaster.logging_config import LOG_CONFIG


def _build_formatter():
    """Instantiate the JSON formatter from LOG_CONFIG for direct testing."""
    from pydoc import locate

    cfg = dict(LOG_CONFIG["formatters"]["json"])
    factory_path = cfg.pop("()")
    factory = locate(factory_path)
    assert factory is not None, f"formatter factory not importable: {factory_path}"
    return factory(**cfg)


def test_log_config_emits_json_with_required_fields():
    formatter = _build_formatter()
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(formatter)

    logger = logging.getLogger("tests.logging_smoke")
    logger.handlers = [handler]
    logger.propagate = False
    logger.setLevel(logging.INFO)

    logger.info("smoke message", extra={"event": "smoke"})

    line = stream.getvalue().strip()
    assert line, "formatter produced no output"
    payload = json.loads(line)

    assert payload["event"] == "smoke"
    assert payload["level"] == "INFO"
    assert payload["logger"] == "tests.logging_smoke"
    assert payload["message"] == "smoke message"
    assert "timestamp" in payload
  • Step 2: Run the test and verify it fails

Run: uv run pytest tests/test_logging.py -v Expected: FAIL with ModuleNotFoundError: No module named 'quartermaster.logging_config'.

  • Step 3: Create src/quartermaster/logconfig.json
{
  "version": 1,
  "disable_existing_loggers": false,
  "formatters": {
    "json": {
      "()": "pythonjsonlogger.jsonlogger.JsonFormatter",
      "format": "%(asctime)s %(levelname)s %(name)s %(message)s",
      "rename_fields": {
        "asctime": "timestamp",
        "levelname": "level",
        "name": "logger"
      }
    }
  },
  "filters": {
    "access": {
      "()": "quartermaster.logging_config.AccessLogFilter"
    }
  },
  "handlers": {
    "stdout": {
      "class": "logging.StreamHandler",
      "stream": "ext://sys.stdout",
      "formatter": "json"
    },
    "access": {
      "class": "logging.StreamHandler",
      "stream": "ext://sys.stdout",
      "formatter": "json",
      "filters": ["access"]
    }
  },
  "loggers": {
    "quartermaster": {
      "level": "INFO",
      "handlers": ["stdout"],
      "propagate": false
    },
    "uvicorn": {
      "level": "INFO",
      "handlers": ["stdout"],
      "propagate": false
    },
    "uvicorn.error": {
      "level": "INFO",
      "handlers": ["stdout"],
      "propagate": false
    },
    "uvicorn.access": {
      "level": "INFO",
      "handlers": ["access"],
      "propagate": false
    }
  },
  "root": {
    "level": "INFO",
    "handlers": ["stdout"]
  }
}
  • Step 4: Create src/quartermaster/logging_config.py
from __future__ import annotations

import json
import logging
from pathlib import Path

_CONFIG_PATH = Path(__file__).parent / "logconfig.json"
LOG_CONFIG = json.loads(_CONFIG_PATH.read_text())


class AccessLogFilter(logging.Filter):
    """Enrich uvicorn access records with structured fields for Loki.

    Uvicorn emits access records whose ``args`` tuple is
    ``(client_addr, method, full_path, http_version, status_code)``.
    """

    def filter(self, record: logging.LogRecord) -> bool:
        args = record.args
        if isinstance(args, tuple) and len(args) >= 5:
            record.event = "http_request"
            record.client_ip = args[0]
            record.method = args[1]
            record.path = args[2]
            record.status = args[4]
        return True
  • Step 5: Run the test and verify it passes

Run: uv run pytest tests/test_logging.py -v Expected: PASS.

  • Step 6: Commit
git add src/quartermaster/logconfig.json src/quartermaster/logging_config.py tests/test_logging.py
git commit -m "feat(logging): add JSON log config and formatter (#27)"

Task 3: AccessLogFilter enriches uvicorn access records

Files:

  • Modify: src/quartermaster/logging_config.py (already has AccessLogFilter from Task 2 — this task adds the test to lock in behavior)

  • Test: tests/test_logging.py

  • Step 1: Write the failing test

Append to tests/test_logging.py:

def test_access_log_filter_enriches_record():
    from quartermaster.logging_config import AccessLogFilter

    record = logging.LogRecord(
        name="uvicorn.access",
        level=logging.INFO,
        pathname=__file__,
        lineno=0,
        msg='%s - "%s %s HTTP/%s" %d',
        args=("127.0.0.1:1234", "GET", "/healthz", "1.1", 200),
        exc_info=None,
    )

    assert AccessLogFilter().filter(record) is True
    assert record.event == "http_request"
    assert record.method == "GET"
    assert record.path == "/healthz"
    assert record.status == 200
    assert record.client_ip == "127.0.0.1:1234"


def test_access_log_filter_passes_through_non_access_records():
    """Non-access records (args tuple shape mismatch) are kept, unmodified."""
    from quartermaster.logging_config import AccessLogFilter

    record = logging.LogRecord(
        name="quartermaster",
        level=logging.INFO,
        pathname=__file__,
        lineno=0,
        msg="regular app log",
        args=None,
        exc_info=None,
    )

    assert AccessLogFilter().filter(record) is True
    assert not hasattr(record, "event") or record.event is None
    assert not hasattr(record, "method") or record.method is None
  • Step 2: Run the tests and verify they pass

Run: uv run pytest tests/test_logging.py -v Expected: PASS for all four tests (the two from Task 2 plus the two above). AccessLogFilter was already implemented in Task 2; these tests codify its contract.

  • Step 3: Commit
git add tests/test_logging.py
git commit -m "test(logging): cover AccessLogFilter contract (#27)"

Task 4: Wire uvicorn to the log config + update README

Files:

  • Modify: README.md

  • Step 1: Update the "Run" section of README.md

Replace the current Run block (uv run uvicorn quartermaster.main:app --reload) with:

## Run

```sh
uv run uvicorn quartermaster.main:app \
    --log-config src/quartermaster/logconfig.json \
    --reload
```

Open http://127.0.0.1:8000.
  • Step 2: Add a "Logs" section to README.md

Insert this section between "Run" and "Tests":

## Logs

Logs are JSON on stdout. Each line has `level` and `event` as queryable
Loki labels (indexed by Promtail on the deploy host), plus arbitrary
extra fields in the JSON body.

Example LogQL queries in Grafana Explore (Loki data source):

```
{container="quartermaster"} | json
{container="quartermaster", event="http_request", status=~"5.."}
{container="quartermaster", event="month_closed"} | json | line_format "{{.path}} {{.message}}"
```

HTTP access logs appear as `event="http_request"` with `method`, `path`,
`status`, and `client_ip` extras. Application events (`month_created`,
`month_closed`, `template_entry_updated`, `posting_added`,
`posting_deleted`) fire at the matching mutation sites.
  • Step 3: Manual smoke test

Run: uv run uvicorn quartermaster.main:app --log-config src/quartermaster/logconfig.json --port 8765 & Then in another shell: curl -s http://127.0.0.1:8765/ > /dev/null Then: kill %1

Expected: uvicorn's stdout shows JSON lines like {"timestamp": "...", "level": "INFO", "logger": "uvicorn.access", "message": "...", "event": "http_request", "client_ip": "127.0.0.1:...", "method": "GET", "path": "/", "status": 200}.

If the output is plain text, the --log-config path is wrong or the JSON file has a syntax error.

  • Step 4: Commit
git add README.md
git commit -m "docs: README Logs section and --log-config flag (#27)"

Task 5: Seed app events in month_service.py

Files:

  • Modify: src/quartermaster/month_service.py
  • Test: tests/test_logging.py

This task adds logger.info(..., extra={"event": ...}) calls at four sites: create_month, close_month, add_posting, delete_posting.

  • Step 1: Write the failing tests

Append to tests/test_logging.py:

import pytest
from decimal import Decimal
from datetime import date

from quartermaster import month_service, service
from quartermaster.models import Section


def _make_debt_minimum(db, name="Loan", amount=Decimal("100.00")):
    from quartermaster.models import Entry
    entry = Entry(section=Section.debt_minimum, name=name, amount=amount)
    db.add(entry)
    db.commit()
    db.refresh(entry)
    return entry


def test_create_month_logs_month_created_event(db, caplog):
    caplog.set_level(logging.INFO, logger="quartermaster")
    month_service.create_month(db, "2026-05")

    events = [r for r in caplog.records if getattr(r, "event", None) == "month_created"]
    assert len(events) == 1
    assert events[0].levelname == "INFO"


def test_close_month_logs_month_closed_event(db, caplog):
    month = month_service.create_month(db, "2026-06")
    # move through active → closed with a zero-applied budget (no entries yet)
    month_service.activate_month(db, month)
    caplog.clear()
    caplog.set_level(logging.INFO, logger="quartermaster")

    month_service.close_month(db, month)

    events = [r for r in caplog.records if getattr(r, "event", None) == "month_closed"]
    assert len(events) == 1


def test_add_posting_logs_posting_added_event(db, caplog):
    _make_debt_minimum(db)
    month = month_service.create_month(db, "2026-07")
    entry = month.entries[0]
    caplog.set_level(logging.INFO, logger="quartermaster")

    month_service.add_posting(
        db, month, entry.id,
        occurred_on=date(2026, 7, 15),
        amount=Decimal("25.00"),
    )

    events = [r for r in caplog.records if getattr(r, "event", None) == "posting_added"]
    assert len(events) == 1


def test_delete_posting_logs_posting_deleted_event(db, caplog):
    _make_debt_minimum(db)
    month = month_service.create_month(db, "2026-08")
    entry = month.entries[0]
    posting = month_service.add_posting(
        db, month, entry.id,
        occurred_on=date(2026, 8, 15),
        amount=Decimal("10.00"),
    )
    caplog.clear()
    caplog.set_level(logging.INFO, logger="quartermaster")

    month_service.delete_posting(db, month, posting.id)

    events = [r for r in caplog.records if getattr(r, "event", None) == "posting_deleted"]
    assert len(events) == 1
  • Step 2: Run the tests and verify they fail

Run: uv run pytest tests/test_logging.py -v -k "logs_" Expected: FAIL — no records with the expected event values.

  • Step 3: Add the module logger and four logger.info calls

At the top of src/quartermaster/month_service.py, after the existing imports, add:

import logging

logger = logging.getLogger("quartermaster.month_service")

Inside create_month, immediately before the final return month (after db.commit(); db.refresh(month)):

logger.info(
    "created month snapshot",
    extra={"event": "month_created", "year_month": month.year_month},
)
return month

Inside close_month, immediately before the final return month (after db.commit(); db.refresh(month)):

logger.info(
    "closed month",
    extra={"event": "month_closed", "year_month": month.year_month},
)
return month

Inside add_posting, immediately before the final return posting (after db.commit(); db.refresh(posting)):

logger.info(
    "added posting",
    extra={
        "event": "posting_added",
        "posting_id": posting.id,
        "month_entry_id": posting.month_entry_id,
        "amount": str(posting.amount),
    },
)
return posting

Inside delete_posting — look at the full function; it needs the log line at the end, after db.commit() and before the function returns. Here is the exact shape:

def delete_posting(
    db: Session, month: Month, posting_id: int
) -> MonthEntry | None:
    """Delete a posting; return the parent entry so callers can re-render its section."""
    posting = get_posting(db, month, posting_id)
    if posting is None:
        return None
    entry = posting.entry
    db.delete(posting)
    db.commit()
    logger.info(
        "deleted posting",
        extra={"event": "posting_deleted", "posting_id": posting_id},
    )
    return entry

(The posting_id parameter is captured before the delete, so logging it after commit is safe.)

  • Step 4: Run the tests and verify they pass

Run: uv run pytest tests/test_logging.py -v -k "logs_" Expected: PASS for all four seed-event tests.

  • Step 5: Run the full test suite

Run: uv run pytest Expected: all 137 existing tests plus the 8 new logging tests pass.

  • Step 6: Commit
git add src/quartermaster/month_service.py tests/test_logging.py
git commit -m "feat(logging): seed month + posting events (#27)"

Task 6: Seed template_entry_updated event in service.py

Files:

  • Modify: src/quartermaster/service.py

  • Test: tests/test_logging.py

  • Step 1: Write the failing test

Append to tests/test_logging.py:

def test_update_entry_logs_template_entry_updated_event(db, caplog):
    from quartermaster.models import Entry
    entry = Entry(section=Section.fixed_bill, name="Power", amount=Decimal("50.00"))
    db.add(entry)
    db.commit()
    db.refresh(entry)

    caplog.set_level(logging.INFO, logger="quartermaster")

    service.update_entry(db, entry.id, amount=Decimal("55.00"))

    events = [
        r for r in caplog.records
        if getattr(r, "event", None) == "template_entry_updated"
    ]
    assert len(events) == 1
    assert events[0].entry_id == entry.id
  • Step 2: Run the test and verify it fails

Run: uv run pytest tests/test_logging.py::test_update_entry_logs_template_entry_updated_event -v Expected: FAIL.

  • Step 3: Add the module logger and logger.info call

At the top of src/quartermaster/service.py, after the existing imports, add:

import logging

logger = logging.getLogger("quartermaster.service")

Inside update_entry, immediately before the final return entry (after db.commit(); db.refresh(entry)):

logger.info(
    "updated template entry",
    extra={"event": "template_entry_updated", "entry_id": entry.id},
)
return entry
  • Step 4: Run the test and verify it passes

Run: uv run pytest tests/test_logging.py::test_update_entry_logs_template_entry_updated_event -v Expected: PASS.

  • Step 5: Run the full test suite

Run: uv run pytest Expected: all tests pass.

  • Step 6: Commit
git add src/quartermaster/service.py tests/test_logging.py
git commit -m "feat(logging): seed template_entry_updated event (#27)"

Task 7: /healthz success path

Files:

  • Create: src/quartermaster/routes_health.py

  • Modify: src/quartermaster/main.py

  • Test: tests/test_health.py

  • Step 1: Write the failing test

Create tests/test_health.py:

from __future__ import annotations


def test_healthz_returns_ok(client):
    response = client.get("/healthz")
    assert response.status_code == 200
    assert response.json() == {"status": "ok"}
  • Step 2: Run the test and verify it fails

Run: uv run pytest tests/test_health.py -v Expected: FAIL with 404 (route not registered).

  • Step 3: Create src/quartermaster/routes_health.py
from __future__ import annotations

import logging

from fastapi import APIRouter, Depends
from fastapi.responses import JSONResponse
from sqlalchemy import text
from sqlalchemy.orm import Session

from quartermaster.db import get_session

logger = logging.getLogger("quartermaster.health")

router = APIRouter()


@router.get("/healthz")
def healthz(db: Session = Depends(get_session)) -> JSONResponse:
    try:
        db.execute(text("SELECT 1"))
    except Exception as exc:
        logger.warning(
            "healthz check failed",
            extra={"event": "healthz_failed", "error_class": type(exc).__name__},
        )
        return JSONResponse(
            status_code=503,
            content={"status": "error", "detail": type(exc).__name__},
        )
    return JSONResponse(status_code=200, content={"status": "ok"})
  • Step 4: Wire the router into the FastAPI app

Edit src/quartermaster/main.py. Change the imports block and create_app to:

from __future__ import annotations

from pathlib import Path

from fastapi import FastAPI
from fastapi.staticfiles import StaticFiles

from quartermaster.routes import router
from quartermaster.routes_health import router as health_router
from quartermaster.routes_month import router as month_router

STATIC_DIR = Path(__file__).parent / "static"


def create_app() -> FastAPI:
    app = FastAPI(title="Quartermaster", version="0.1.0")
    app.mount("/static", StaticFiles(directory=str(STATIC_DIR)), name="static")
    app.include_router(health_router)
    app.include_router(router)
    app.include_router(month_router)
    return app


app = create_app()
  • Step 5: Run the test and verify it passes

Run: uv run pytest tests/test_health.py::test_healthz_returns_ok -v Expected: PASS.

  • Step 6: Commit
git add src/quartermaster/routes_health.py src/quartermaster/main.py tests/test_health.py
git commit -m "feat(health): add /healthz endpoint (#26)"

Task 8: /healthz returns 503 when the DB check fails

Files:

  • Test: tests/test_health.py

  • No production code changes — the exception handler from Task 7 already covers this.

  • Step 1: Write the failing test

Append to tests/test_health.py:

from sqlalchemy.exc import OperationalError

from quartermaster.db import get_session
from quartermaster.main import create_app
from fastapi.testclient import TestClient


def test_healthz_returns_503_when_db_check_raises():
    app = create_app()

    def broken_session():
        class BrokenSession:
            def execute(self, *args, **kwargs):
                raise OperationalError("boom", None, Exception("boom"))

            def close(self):
                pass

        session = BrokenSession()
        try:
            yield session
        finally:
            session.close()

    app.dependency_overrides[get_session] = broken_session

    with TestClient(app) as client:
        response = client.get("/healthz")

    assert response.status_code == 503
    body = response.json()
    assert body["status"] == "error"
    assert body["detail"] == "OperationalError"
  • Step 2: Run the test and verify it passes

Run: uv run pytest tests/test_health.py -v Expected: PASS for both tests. The exception handler in Task 7 already implements the 503 path; this test codifies the contract.

  • Step 3: Run the full suite

Run: uv run pytest Expected: all tests pass (137 existing + 9 new logging tests + 2 healthz tests = 148).

  • Step 4: Commit
git add tests/test_health.py
git commit -m "test(health): cover 503 on DB failure (#26)"

Post-implementation verification

After Task 8 completes:

  • Full test suite

Run: uv run pytest -v Expected: 148 tests pass.

  • End-to-end smoke test with the log config

Run in one shell: uv run uvicorn quartermaster.main:app --log-config src/quartermaster/logconfig.json --port 8765 & Then: curl -s http://127.0.0.1:8765/healthz Then: kill %1

Expected:

  • curl prints {"status":"ok"}.

  • uvicorn stdout showed at least one JSON line containing "event": "http_request", "path": "/healthz", "status": 200.

  • Close issues #26 and #27

Reference the merge commit on both issues. Leave #25 open.


Dependencies and ordering

  • Tasks 1 → 2 → 3 → 4 must be sequential (each builds on the last).
  • Task 5 depends on Task 2 (needs LOG_CONFIG in place to test via caplog, though caplog itself does not require the config — independent in practice).
  • Task 6 is independent of 5 but logically belongs after.
  • Tasks 7 → 8 are sequential.
  • Tasks 5, 6, 7 could theoretically run in parallel after Task 4, but linear is cleaner.