# Healthz + Structured JSON Logs Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Add `GET /healthz` endpoint and structured JSON logging to stdout so Quartermaster can satisfy the homelab platform contract (#25) for its first deploy.

**Architecture:** One new logging module (`logging_config.py`) loads a JSON config (`src/quartermaster/logconfig.json`) and exposes an `AccessLogFilter` that enriches uvicorn access records with `event="http_request"` and request metadata. One new router module (`routes_health.py`) handles `/healthz`. Five seed events (`logger.info(..., extra={"event": ...})`) are sprinkled at the five most operationally interesting mutations so the Loki view is useful on day one.

**Tech Stack:** Python 3.12, FastAPI, uvicorn, SQLAlchemy, pytest, `python-json-logger`.

**Spec:** `docs/superpowers/specs/2026-04-19-healthz-and-json-logs-design.md`

---

## File Structure

New files:

| Path | Responsibility |
|---|---|
| `src/quartermaster/logconfig.json` | uvicorn/stdlib `dictConfig` — formatter, filter, handler, loggers |
| `src/quartermaster/logging_config.py` | Loads `logconfig.json` into `LOG_CONFIG`; defines `AccessLogFilter` |
| `src/quartermaster/routes_health.py` | `GET /healthz` router |
| `tests/test_logging.py` | Formatter + filter unit tests + drift/load tests |
| `tests/test_health.py` | Healthz endpoint tests (200 + 503) |

Modified files:

| Path | Change |
|---|---|
| `pyproject.toml` | Add `python-json-logger` to `[project].dependencies` |
| `src/quartermaster/main.py` | Include the health router |
| `src/quartermaster/service.py` | Add one `logger.info` seed call in `update_entry` |
| `src/quartermaster/month_service.py` | Add four `logger.info` seed calls in `create_month`, `close_month`, `add_posting`, `delete_posting` |
| `README.md` | Add "Logs" section; update Run block to mention `--log-config` |

Ship logging first, then healthz.

---

## Task 1: Add `python-json-logger` dependency

**Files:**
- Modify: `pyproject.toml`

- [ ] **Step 1: Edit `pyproject.toml`**

Add `"python-json-logger>=3.2.0",` to the `dependencies` list. The resulting block should read:

```toml
dependencies = [
    "alembic>=1.18.4",
    "fastapi>=0.136.0",
    "jinja2>=3.1.6",
    "python-json-logger>=3.2.0",
    "python-multipart>=0.0.26",
    "sqlalchemy>=2.0.49",
    "uvicorn[standard]>=0.44.0",
]
```

- [ ] **Step 2: Sync the lock file**

Run: `uv sync`
Expected: `uv.lock` updated; no errors.

- [ ] **Step 3: Confirm the import works**

Run: `uv run python -c "from pythonjsonlogger.jsonlogger import JsonFormatter; print(JsonFormatter)"`
Expected: prints `<class 'pythonjsonlogger.jsonlogger.JsonFormatter'>` and exits 0.

- [ ] **Step 4: Commit**

```bash
git add pyproject.toml uv.lock
git commit -m "chore: add python-json-logger dependency (#27)"
```

---

## Task 2: Add JSON log config + formatter test

**Files:**
- Create: `src/quartermaster/logconfig.json`
- Create: `src/quartermaster/logging_config.py`
- Test: `tests/test_logging.py`

- [ ] **Step 1: Write the failing test**

Create `tests/test_logging.py`:

```python
from __future__ import annotations

import io
import json
import logging

from quartermaster.logging_config import LOG_CONFIG


def _build_formatter():
    """Instantiate the JSON formatter from LOG_CONFIG for direct testing."""
    from pydoc import locate

    cfg = dict(LOG_CONFIG["formatters"]["json"])
    factory_path = cfg.pop("()")
    factory = locate(factory_path)
    assert factory is not None, f"formatter factory not importable: {factory_path}"
    return factory(**cfg)


def test_log_config_emits_json_with_required_fields():
    formatter = _build_formatter()
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(formatter)

    logger = logging.getLogger("tests.logging_smoke")
    logger.handlers = [handler]
    logger.propagate = False
    logger.setLevel(logging.INFO)

    logger.info("smoke message", extra={"event": "smoke"})

    line = stream.getvalue().strip()
    assert line, "formatter produced no output"
    payload = json.loads(line)

    assert payload["event"] == "smoke"
    assert payload["level"] == "INFO"
    assert payload["logger"] == "tests.logging_smoke"
    assert payload["message"] == "smoke message"
    assert "timestamp" in payload
```

- [ ] **Step 2: Run the test and verify it fails**

Run: `uv run pytest tests/test_logging.py -v`
Expected: FAIL with `ModuleNotFoundError: No module named 'quartermaster.logging_config'`.

- [ ] **Step 3: Create `src/quartermaster/logconfig.json`**

```json
{
  "version": 1,
  "disable_existing_loggers": false,
  "formatters": {
    "json": {
      "()": "pythonjsonlogger.jsonlogger.JsonFormatter",
      "format": "%(asctime)s %(levelname)s %(name)s %(message)s",
      "rename_fields": {
        "asctime": "timestamp",
        "levelname": "level",
        "name": "logger"
      }
    }
  },
  "filters": {
    "access": {
      "()": "quartermaster.logging_config.AccessLogFilter"
    }
  },
  "handlers": {
    "stdout": {
      "class": "logging.StreamHandler",
      "stream": "ext://sys.stdout",
      "formatter": "json"
    },
    "access": {
      "class": "logging.StreamHandler",
      "stream": "ext://sys.stdout",
      "formatter": "json",
      "filters": ["access"]
    }
  },
  "loggers": {
    "quartermaster": {
      "level": "INFO",
      "handlers": ["stdout"],
      "propagate": false
    },
    "uvicorn": {
      "level": "INFO",
      "handlers": ["stdout"],
      "propagate": false
    },
    "uvicorn.error": {
      "level": "INFO",
      "handlers": ["stdout"],
      "propagate": false
    },
    "uvicorn.access": {
      "level": "INFO",
      "handlers": ["access"],
      "propagate": false
    }
  },
  "root": {
    "level": "INFO",
    "handlers": ["stdout"]
  }
}
```

- [ ] **Step 4: Create `src/quartermaster/logging_config.py`**

```python
from __future__ import annotations

import json
import logging
from pathlib import Path

_CONFIG_PATH = Path(__file__).parent / "logconfig.json"
LOG_CONFIG = json.loads(_CONFIG_PATH.read_text())


class AccessLogFilter(logging.Filter):
    """Enrich uvicorn access records with structured fields for Loki.

    Uvicorn emits access records whose ``args`` tuple is
    ``(client_addr, method, full_path, http_version, status_code)``.
    """

    def filter(self, record: logging.LogRecord) -> bool:
        args = record.args
        if isinstance(args, tuple) and len(args) >= 5:
            record.event = "http_request"
            record.client_ip = args[0]
            record.method = args[1]
            record.path = args[2]
            record.status = args[4]
        return True
```

- [ ] **Step 5: Run the test and verify it passes**

Run: `uv run pytest tests/test_logging.py -v`
Expected: PASS.

- [ ] **Step 6: Commit**

```bash
git add src/quartermaster/logconfig.json src/quartermaster/logging_config.py tests/test_logging.py
git commit -m "feat(logging): add JSON log config and formatter (#27)"
```

---

## Task 3: AccessLogFilter enriches uvicorn access records

**Files:**
- Modify: `src/quartermaster/logging_config.py` (already has `AccessLogFilter` from Task 2 — this task adds the test to lock in behavior)
- Test: `tests/test_logging.py`

- [ ] **Step 1: Write the failing test**

Append to `tests/test_logging.py`:

```python
def test_access_log_filter_enriches_record():
    from quartermaster.logging_config import AccessLogFilter

    record = logging.LogRecord(
        name="uvicorn.access",
        level=logging.INFO,
        pathname=__file__,
        lineno=0,
        msg='%s - "%s %s HTTP/%s" %d',
        args=("127.0.0.1:1234", "GET", "/healthz", "1.1", 200),
        exc_info=None,
    )

    assert AccessLogFilter().filter(record) is True
    assert record.event == "http_request"
    assert record.method == "GET"
    assert record.path == "/healthz"
    assert record.status == 200
    assert record.client_ip == "127.0.0.1:1234"


def test_access_log_filter_passes_through_non_access_records():
    """Non-access records (args tuple shape mismatch) are kept, unmodified."""
    from quartermaster.logging_config import AccessLogFilter

    record = logging.LogRecord(
        name="quartermaster",
        level=logging.INFO,
        pathname=__file__,
        lineno=0,
        msg="regular app log",
        args=None,
        exc_info=None,
    )

    assert AccessLogFilter().filter(record) is True
    assert not hasattr(record, "event") or record.event is None
    assert not hasattr(record, "method") or record.method is None
```

- [ ] **Step 2: Run the tests and verify they pass**

Run: `uv run pytest tests/test_logging.py -v`
Expected: PASS for all four tests (the two from Task 2 plus the two above). `AccessLogFilter` was already implemented in Task 2; these tests codify its contract.

- [ ] **Step 3: Commit**

```bash
git add tests/test_logging.py
git commit -m "test(logging): cover AccessLogFilter contract (#27)"
```

---

## Task 4: Wire uvicorn to the log config + update README

**Files:**
- Modify: `README.md`

- [ ] **Step 1: Update the "Run" section of `README.md`**

Replace the current Run block (`uv run uvicorn quartermaster.main:app --reload`) with:

````markdown
## Run

```sh
uv run uvicorn quartermaster.main:app \
    --log-config src/quartermaster/logconfig.json \
    --reload
```

Open http://127.0.0.1:8000.
````

- [ ] **Step 2: Add a "Logs" section to `README.md`**

Insert this section between "Run" and "Tests":

````markdown
## Logs

Logs are JSON on stdout. Each line has `level` and `event` as queryable
Loki labels (indexed by Promtail on the deploy host), plus arbitrary
extra fields in the JSON body.

Example LogQL queries in Grafana Explore (Loki data source):

```
{container="quartermaster"} | json
{container="quartermaster", event="http_request", status=~"5.."}
{container="quartermaster", event="month_closed"} | json | line_format "{{.path}} {{.message}}"
```

HTTP access logs appear as `event="http_request"` with `method`, `path`,
`status`, and `client_ip` extras. Application events (`month_created`,
`month_closed`, `template_entry_updated`, `posting_added`,
`posting_deleted`) fire at the matching mutation sites.
````

- [ ] **Step 3: Manual smoke test**

Run: `uv run uvicorn quartermaster.main:app --log-config src/quartermaster/logconfig.json --port 8765 &`
Then in another shell: `curl -s http://127.0.0.1:8765/ > /dev/null`
Then: `kill %1`

Expected: uvicorn's stdout shows JSON lines like
`{"timestamp": "...", "level": "INFO", "logger": "uvicorn.access", "message": "...", "event": "http_request", "client_ip": "127.0.0.1:...", "method": "GET", "path": "/", "status": 200}`.

If the output is plain text, the `--log-config` path is wrong or the JSON file has a syntax error.

- [ ] **Step 4: Commit**

```bash
git add README.md
git commit -m "docs: README Logs section and --log-config flag (#27)"
```

---

## Task 5: Seed app events in `month_service.py`

**Files:**
- Modify: `src/quartermaster/month_service.py`
- Test: `tests/test_logging.py`

This task adds `logger.info(..., extra={"event": ...})` calls at four sites: `create_month`, `close_month`, `add_posting`, `delete_posting`.

- [ ] **Step 1: Write the failing tests**

Append to `tests/test_logging.py`:

```python
import pytest
from decimal import Decimal
from datetime import date

from quartermaster import month_service, service
from quartermaster.models import Section


def _make_debt_minimum(db, name="Loan", amount=Decimal("100.00")):
    from quartermaster.models import Entry
    entry = Entry(section=Section.debt_minimum, name=name, amount=amount)
    db.add(entry)
    db.commit()
    db.refresh(entry)
    return entry


def test_create_month_logs_month_created_event(db, caplog):
    caplog.set_level(logging.INFO, logger="quartermaster")
    month_service.create_month(db, "2026-05")

    events = [r for r in caplog.records if getattr(r, "event", None) == "month_created"]
    assert len(events) == 1
    assert events[0].levelname == "INFO"


def test_close_month_logs_month_closed_event(db, caplog):
    month = month_service.create_month(db, "2026-06")
    # move through active → closed with a zero-applied budget (no entries yet)
    month_service.activate_month(db, month)
    caplog.clear()
    caplog.set_level(logging.INFO, logger="quartermaster")

    month_service.close_month(db, month)

    events = [r for r in caplog.records if getattr(r, "event", None) == "month_closed"]
    assert len(events) == 1


def test_add_posting_logs_posting_added_event(db, caplog):
    _make_debt_minimum(db)
    month = month_service.create_month(db, "2026-07")
    entry = month.entries[0]
    caplog.set_level(logging.INFO, logger="quartermaster")

    month_service.add_posting(
        db, month, entry.id,
        occurred_on=date(2026, 7, 15),
        amount=Decimal("25.00"),
    )

    events = [r for r in caplog.records if getattr(r, "event", None) == "posting_added"]
    assert len(events) == 1


def test_delete_posting_logs_posting_deleted_event(db, caplog):
    _make_debt_minimum(db)
    month = month_service.create_month(db, "2026-08")
    entry = month.entries[0]
    posting = month_service.add_posting(
        db, month, entry.id,
        occurred_on=date(2026, 8, 15),
        amount=Decimal("10.00"),
    )
    caplog.clear()
    caplog.set_level(logging.INFO, logger="quartermaster")

    month_service.delete_posting(db, month, posting.id)

    events = [r for r in caplog.records if getattr(r, "event", None) == "posting_deleted"]
    assert len(events) == 1
```

- [ ] **Step 2: Run the tests and verify they fail**

Run: `uv run pytest tests/test_logging.py -v -k "logs_"`
Expected: FAIL — no records with the expected `event` values.

- [ ] **Step 3: Add the module logger and four `logger.info` calls**

At the top of `src/quartermaster/month_service.py`, after the existing imports, add:

```python
import logging

logger = logging.getLogger("quartermaster.month_service")
```

Inside `create_month`, immediately before the final `return month` (after `db.commit(); db.refresh(month)`):

```python
logger.info(
    "created month snapshot",
    extra={"event": "month_created", "year_month": month.year_month},
)
return month
```

Inside `close_month`, immediately before the final `return month` (after `db.commit(); db.refresh(month)`):

```python
logger.info(
    "closed month",
    extra={"event": "month_closed", "year_month": month.year_month},
)
return month
```

Inside `add_posting`, immediately before the final `return posting` (after `db.commit(); db.refresh(posting)`):

```python
logger.info(
    "added posting",
    extra={
        "event": "posting_added",
        "posting_id": posting.id,
        "month_entry_id": posting.month_entry_id,
        "amount": str(posting.amount),
    },
)
return posting
```

Inside `delete_posting` — look at the full function; it needs the log line at the end, after `db.commit()` and before the function returns. Here is the exact shape:

```python
def delete_posting(
    db: Session, month: Month, posting_id: int
) -> MonthEntry | None:
    """Delete a posting; return the parent entry so callers can re-render its section."""
    posting = get_posting(db, month, posting_id)
    if posting is None:
        return None
    entry = posting.entry
    db.delete(posting)
    db.commit()
    logger.info(
        "deleted posting",
        extra={"event": "posting_deleted", "posting_id": posting_id},
    )
    return entry
```

(The `posting_id` parameter is captured before the delete, so logging it after commit is safe.)

- [ ] **Step 4: Run the tests and verify they pass**

Run: `uv run pytest tests/test_logging.py -v -k "logs_"`
Expected: PASS for all four seed-event tests.

- [ ] **Step 5: Run the full test suite**

Run: `uv run pytest`
Expected: all 137 existing tests plus the 8 new logging tests pass.

- [ ] **Step 6: Commit**

```bash
git add src/quartermaster/month_service.py tests/test_logging.py
git commit -m "feat(logging): seed month + posting events (#27)"
```

---

## Task 6: Seed `template_entry_updated` event in `service.py`

**Files:**
- Modify: `src/quartermaster/service.py`
- Test: `tests/test_logging.py`

- [ ] **Step 1: Write the failing test**

Append to `tests/test_logging.py`:

```python
def test_update_entry_logs_template_entry_updated_event(db, caplog):
    from quartermaster.models import Entry
    entry = Entry(section=Section.fixed_bill, name="Power", amount=Decimal("50.00"))
    db.add(entry)
    db.commit()
    db.refresh(entry)

    caplog.set_level(logging.INFO, logger="quartermaster")

    service.update_entry(db, entry.id, amount=Decimal("55.00"))

    events = [
        r for r in caplog.records
        if getattr(r, "event", None) == "template_entry_updated"
    ]
    assert len(events) == 1
    assert events[0].entry_id == entry.id
```

- [ ] **Step 2: Run the test and verify it fails**

Run: `uv run pytest tests/test_logging.py::test_update_entry_logs_template_entry_updated_event -v`
Expected: FAIL.

- [ ] **Step 3: Add the module logger and `logger.info` call**

At the top of `src/quartermaster/service.py`, after the existing imports, add:

```python
import logging

logger = logging.getLogger("quartermaster.service")
```

Inside `update_entry`, immediately before the final `return entry` (after `db.commit(); db.refresh(entry)`):

```python
logger.info(
    "updated template entry",
    extra={"event": "template_entry_updated", "entry_id": entry.id},
)
return entry
```

- [ ] **Step 4: Run the test and verify it passes**

Run: `uv run pytest tests/test_logging.py::test_update_entry_logs_template_entry_updated_event -v`
Expected: PASS.

- [ ] **Step 5: Run the full test suite**

Run: `uv run pytest`
Expected: all tests pass.

- [ ] **Step 6: Commit**

```bash
git add src/quartermaster/service.py tests/test_logging.py
git commit -m "feat(logging): seed template_entry_updated event (#27)"
```

---

## Task 7: `/healthz` success path

**Files:**
- Create: `src/quartermaster/routes_health.py`
- Modify: `src/quartermaster/main.py`
- Test: `tests/test_health.py`

- [ ] **Step 1: Write the failing test**

Create `tests/test_health.py`:

```python
from __future__ import annotations


def test_healthz_returns_ok(client):
    response = client.get("/healthz")
    assert response.status_code == 200
    assert response.json() == {"status": "ok"}
```

- [ ] **Step 2: Run the test and verify it fails**

Run: `uv run pytest tests/test_health.py -v`
Expected: FAIL with 404 (route not registered).

- [ ] **Step 3: Create `src/quartermaster/routes_health.py`**

```python
from __future__ import annotations

import logging

from fastapi import APIRouter, Depends
from fastapi.responses import JSONResponse
from sqlalchemy import text
from sqlalchemy.orm import Session

from quartermaster.db import get_session

logger = logging.getLogger("quartermaster.health")

router = APIRouter()


@router.get("/healthz")
def healthz(db: Session = Depends(get_session)) -> JSONResponse:
    try:
        db.execute(text("SELECT 1"))
    except Exception as exc:
        logger.warning(
            "healthz check failed",
            extra={"event": "healthz_failed", "error_class": type(exc).__name__},
        )
        return JSONResponse(
            status_code=503,
            content={"status": "error", "detail": type(exc).__name__},
        )
    return JSONResponse(status_code=200, content={"status": "ok"})
```

- [ ] **Step 4: Wire the router into the FastAPI app**

Edit `src/quartermaster/main.py`. Change the imports block and `create_app` to:

```python
from __future__ import annotations

from pathlib import Path

from fastapi import FastAPI
from fastapi.staticfiles import StaticFiles

from quartermaster.routes import router
from quartermaster.routes_health import router as health_router
from quartermaster.routes_month import router as month_router

STATIC_DIR = Path(__file__).parent / "static"


def create_app() -> FastAPI:
    app = FastAPI(title="Quartermaster", version="0.1.0")
    app.mount("/static", StaticFiles(directory=str(STATIC_DIR)), name="static")
    app.include_router(health_router)
    app.include_router(router)
    app.include_router(month_router)
    return app


app = create_app()
```

- [ ] **Step 5: Run the test and verify it passes**

Run: `uv run pytest tests/test_health.py::test_healthz_returns_ok -v`
Expected: PASS.

- [ ] **Step 6: Commit**

```bash
git add src/quartermaster/routes_health.py src/quartermaster/main.py tests/test_health.py
git commit -m "feat(health): add /healthz endpoint (#26)"
```

---

## Task 8: `/healthz` returns 503 when the DB check fails

**Files:**
- Test: `tests/test_health.py`
- No production code changes — the exception handler from Task 7 already covers this.

- [ ] **Step 1: Write the failing test**

Append to `tests/test_health.py`:

```python
from sqlalchemy.exc import OperationalError

from quartermaster.db import get_session
from quartermaster.main import create_app
from fastapi.testclient import TestClient


def test_healthz_returns_503_when_db_check_raises():
    app = create_app()

    def broken_session():
        class BrokenSession:
            def execute(self, *args, **kwargs):
                raise OperationalError("boom", None, Exception("boom"))

            def close(self):
                pass

        session = BrokenSession()
        try:
            yield session
        finally:
            session.close()

    app.dependency_overrides[get_session] = broken_session

    with TestClient(app) as client:
        response = client.get("/healthz")

    assert response.status_code == 503
    body = response.json()
    assert body["status"] == "error"
    assert body["detail"] == "OperationalError"
```

- [ ] **Step 2: Run the test and verify it passes**

Run: `uv run pytest tests/test_health.py -v`
Expected: PASS for both tests. The exception handler in Task 7 already implements the 503 path; this test codifies the contract.

- [ ] **Step 3: Run the full suite**

Run: `uv run pytest`
Expected: all tests pass (137 existing + 9 new logging tests + 2 healthz tests = 148).

- [ ] **Step 4: Commit**

```bash
git add tests/test_health.py
git commit -m "test(health): cover 503 on DB failure (#26)"
```

---

## Post-implementation verification

After Task 8 completes:

- [ ] **Full test suite**

Run: `uv run pytest -v`
Expected: 148 tests pass.

- [ ] **End-to-end smoke test with the log config**

Run in one shell: `uv run uvicorn quartermaster.main:app --log-config src/quartermaster/logconfig.json --port 8765 &`
Then: `curl -s http://127.0.0.1:8765/healthz`
Then: `kill %1`

Expected:
- `curl` prints `{"status":"ok"}`.
- uvicorn stdout showed at least one JSON line containing `"event": "http_request"`, `"path": "/healthz"`, `"status": 200`.

- [ ] **Close issues #26 and #27**

Reference the merge commit on both issues. Leave #25 open.

---

## Dependencies and ordering

- Tasks 1 → 2 → 3 → 4 must be sequential (each builds on the last).
- Task 5 depends on Task 2 (needs LOG_CONFIG in place to test via caplog, though caplog itself does not require the config — independent in practice).
- Task 6 is independent of 5 but logically belongs after.
- Tasks 7 → 8 are sequential.
- Tasks 5, 6, 7 could theoretically run in parallel after Task 4, but linear is cleaner.