5 Operations
claude-code edited this page 2026-04-19 18:35:25 -06:00
This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

Operations

Database safety rule

The SQLite database holds real user data. It is gitignored, so losing it means the data is gone.

Before any operation that interacts with the database at the schema or destructive level, run the backup script:

./scripts/backup-db.sh <reason>

That includes:

  • rm, mv, truncate, or otherwise replacing quartermaster.db
  • Ad-hoc sqlite3 sessions that might DROP, DELETE, or UPDATE
  • Running any script that mutates the DB outside the running app
  • Running the app in "smoke test" / "exercise the flow" mode (the app writes to the DB; use a throwaway DB for this)

Routine writes by the running web application on your normal data file are NOT covered by this rule. Those are expected behaviour.

Alembic runs scripts/backup-db.sh alembic automatically before every migration, downgrade, alembic current, or alembic revision --autogenerate. You do not need to call it manually for ordinary schema work.

Backup script

scripts/backup-db.sh [reason]
  • Uses sqlite3.Connection.backup (Python's online-backup API) so the copy is consistent even if the app is writing. No sqlite3 binary is required.
  • Resolves the DB path from QUARTERMASTER_DB_URL (sqlite:/// only) or defaults to ./quartermaster.db.
  • Writes to <dir-of-db>/backups/ by default. Override with QUARTERMASTER_BACKUP_DIR=/any/path.
  • Exits 0 when the DB file is missing (safe to call from hooks on a fresh checkout).
  • Exits 1 on a non-file sqlite URL (e.g. postgres://...) or an unsupported URL.

Filename format:

quartermaster-YYYYMMDD-HHMMSS-<slug>.db

The slug is derived from the optional reason argument, sanitised to [A-Za-z0-9-]. Default reason is manual.

Retention

Forever. The dataset is small (kilobytes per month). Prune manually from backups/ if the directory ever grows uncomfortable.

Restore

A backup file is a complete SQLite database. To restore:

  1. Stop the app (pkill -f uvicorn.*quartermaster or Ctrl-C).
  2. Back up the current state first in case you want to undo the restore: ./scripts/backup-db.sh pre-restore.
  3. Copy the chosen backup over the live file: cp backups/quartermaster-20260417-103000-alembic.db quartermaster.db.
  4. Restart the app. Refresh the browser.

Alembic version tracking travels with the data, so if the backup was made on an earlier schema you may need uv run alembic upgrade head after the restore (which will itself create a backup first).

Throwaway-DB pattern for development

Never smoke-test or experiment against the live data file. Use a tmp path:

export QUARTERMASTER_DB_URL=sqlite:////tmp/qm-dev.db
export QUARTERMASTER_BACKUP_DIR=/tmp/qm-dev-backups
uv run alembic upgrade head
uv run uvicorn quartermaster.main:app --reload
# when done
unset QUARTERMASTER_DB_URL QUARTERMASTER_BACKUP_DIR
rm -rf /tmp/qm-dev.db /tmp/qm-dev-backups

Running in production

Production is home-ctr-onyx at https://quartermaster.unbiasedgeek.com/. Dev is the uvicorn reload server at http://127.0.0.1:8000. The platform contract (PlatformContractQuartermaster) is the authoritative record of the deploy surface; the sections below cover the app-side affordances that feed into it.

Deploy

The deploy surface lives at the repo root:

File Purpose
Dockerfile python:3.12-slim-bookworm base, uv sync --no-dev --frozen, USER 1000:1000, EXPOSE 8000, HEALTHCHECK against /healthz.
docker/entrypoint.sh Runs alembic upgrade head (the backup hook fires automatically) then exec uvicorn with --proxy-headers --forwarded-allow-ips='*' --log-config src/quartermaster/logconfig.json.
compose.yml Single quartermaster service: /mnt/quartermaster:/data bind mount, QUARTERMASTER_DB_URL=sqlite:////data/quartermaster.db (four slashes — an absolute path), proxy-net external, 1 GB mem+memswap, json-file logging capped at 50 MB × 3, all twelve Traefik + required container labels from the platform contract.
.forgejo/workflows/deploy.yml On push to main: checkout → buildx → registry login → build + push → write .env + docker compose pull + up -d → healthz smoke.

Image tag flow

compose.yml references the image as …/quartermaster:${QUARTERMASTER_TAG:-latest}. The deploy workflow writes QUARTERMASTER_TAG=<git-sha> to a .env file next to the compose file, and docker compose auto-loads .env. Every deploy pins a specific SHA without editing the checked-in compose file.

No SSH in the workflow

The homelab runner lives on home-ctr-onyx itself with the host's Docker socket mounted, so docker compose pull && up -d from the runner manages the production container directly — no separate SSH hop from a runner elsewhere. This is the reason the workflow only needs two secrets (below).

Required secrets

Repo-scoped Forgejo Actions secrets on archeious/quartermaster:

  • REGISTRY_TOKENarcheious Forgejo personal access token with read:package + write:package. Used as the docker-login password against forgejo.labbity.unbiasedgeek.com. Generate via Forgejo → User Settings → Applications → Generate New Token.
  • QUARTERMASTER_SMOKE_PASSWORD — plaintext basic-auth password for the admin user. The bcrypt hash is stored platform-side (~/secrets on the operator workstation as QUARTERMASTER_BASICAUTH_HASH); the plaintext is delivered to the tenant out-of-band at provisioning. Used by the post-deploy curl -u admin:$QUARTERMASTER_SMOKE_PASSWORD …/healthz probe.

Rollback (manual, v1)

  1. Find the prior SHA you want to roll back to (git log or the Actions run history).
  2. SSH to home-ctr-onyx (or via whichever operator access you have).
  3. cd to the compose directory (the last deploy's checkout, or re-clone the repo).
  4. Write QUARTERMASTER_TAG=<prior-sha> to .env.
  5. docker compose up -d.

compose.yml is in the repo, so step 3 is at worst a git clone.

Health

GET /healthz — unauthenticated, returns:

  • 200 {"status":"ok"} when a trivial SELECT 1 through the SQLAlchemy session succeeds.
  • 503 {"status":"error","detail":"<ExceptionClassName>"} on any exception from the DB probe. The error class name is the only detail leaked (no message, no traceback) — enough for an operator to see what tripped the check from a curl without log access.

No auth on purpose: the Docker HEALTHCHECK runs inside the container and cannot carry credentials, and Traefik's basic-auth middleware is not applied to this route. Kept on a dedicated router (src/quartermaster/routes_health.py) so any future router-scoped auth on the main routers leaves it alone.

A failed probe also emits a structured warning log (event=healthz_failed, error_class=<cls>) for Loki.

Logs

Logs are JSON on stdout. The config lives at src/quartermaster/logconfig.json and is consumed both by Python (via the LOG_CONFIG dict loaded in src/quartermaster/logging_config.py) and by uvicorn CLI:

uv run uvicorn quartermaster.main:app \
    --log-config src/quartermaster/logconfig.json \
    --reload

Each log line has level and event as top-level JSON fields (Promtail on home-ctr-onyx extracts them as queryable Loki labels), plus arbitrary extras in the JSON body.

Access logs

Uvicorn access records are enriched by AccessLogFilter into:

{
  "timestamp": "...", "level": "INFO", "logger": "uvicorn.access",
  "event": "http_request", "method": "GET", "path": "/healthz",
  "status": 200, "client_ip": "10.0.0.42:54321",
  "message": "... - \"GET /healthz HTTP/1.1\" 200"
}

Application events

Five seed events fire at the most operationally interesting mutations:

Event Fires in Extras
month_created month_service.create_month year_month
month_closed month_service.close_month year_month
template_entry_updated service.update_entry entry_id
posting_added month_service.add_posting posting_id, month_entry_id, amount
posting_deleted month_service.delete_posting posting_id
healthz_failed routes_health.healthz (WARNING) error_class

Additional events can be added the same way — logger.info(msg, extra={"event": "...", ...}) on a logger under quartermaster.*.

Example LogQL queries

Grafana Explore, Loki data source:

{container="quartermaster"} | json
{container="quartermaster", event="http_request", status=~"5.."}
{container="quartermaster", event="month_closed"} | json | line_format "{{.year_month}} {{.message}}"

Dev ergonomics

Omit --log-config src/quartermaster/logconfig.json during local dev if you'd rather read logs in uvicorn's default human-readable format. Production must use the config so Promtail indexes properly.

Troubleshooting

"no such table" on first request

You forgot uv run alembic upgrade head after cloning. The backup hook will print "no database at ..., nothing to back up" the first time, which is expected.

"no such column: month_entry.notes" (or similar)

You pulled code with a new column but did not apply the migration. Run uv run alembic upgrade head. The backup hook backs up the live DB first, then the migration adds the missing column.

Alembic reports a revision it cannot locate

The DB is on a schema version whose migration file is not in the current checkout. Check out the branch that introduced that revision or downgrade the DB to a known-good revision before continuing.

Backup script exits 1 on a postgres-style URL

Intentional. The script is sqlite-specific. Switch the URL back to sqlite:///... or do the backup manually via your Postgres tooling.

/healthz returns 503

Inspect the logged event=healthz_failed record in Loki or stdout. error_class names the exception type; common causes are a misconfigured QUARTERMASTER_DB_URL, a DB file that got wiped or permissions-corrupted, or Alembic having failed on container start.

DeprecationWarning about pythonjsonlogger.jsonlogger

Fixed on main. The config now references pythonjsonlogger.json. If you see the warning, pull and re-run uv sync.

CSS, images, or other /static/... assets fail to load in prod

Rendered page has <link href="http://…/static/app.css"> (http, not https) or <link href="http://<internal-host>/static/app.css"> (the container's bind address instead of the public hostname). Starlette's url_for() is reading the scheme/host from the direct request rather than the Traefik-forwarded headers. Confirm docker/entrypoint.sh launches uvicorn with both --proxy-headers and --forwarded-allow-ips='*'. Missing either one lets mixed-content blocking or wrong hostnames slip through. Reproduce locally against any image with:

docker run -d --rm -p 18000:8000 -e QUARTERMASTER_DB_URL=... <image>
curl -sS -H 'Host: quartermaster.unbiasedgeek.com' \
       -H 'X-Forwarded-Proto: https' \
       http://127.0.0.1:18000/ | grep stylesheet

The href should be https://quartermaster.unbiasedgeek.com/static/....

Current schema

Applied migrations at time of writing:

Revision Description
f1ccdc4bc1bf initial schema (entry, debt_target)
03ebe3c07262 add month snapshot tables (month, month_entry, month_debt_target)
ec804bdf366d add notes column to entry and month_entry
a4ec4f8f6e9f add month lifecycle columns (state, activated_at, closed_at)
cc60e7f73a1c add posting ledger table, seed opening-balance postings, drop month_entry.applied

No schema change between cc60e7f73a1c and HEAD. After pulling new code, uv run alembic upgrade head walks the chain and the backup hook fires between each hop.