Table of Contents
- Operations
- Database safety rule
- Backup script
- Retention
- Restore
- Throwaway-DB pattern for development
- Running in production
- Deploy
- Health
- Logs
- Troubleshooting
- "no such table" on first request
- "no such column: month_entry.notes" (or similar)
- Alembic reports a revision it cannot locate
- Backup script exits 1 on a postgres-style URL
- /healthz returns 503
- DeprecationWarning about pythonjsonlogger.jsonlogger
- CSS, images, or other /static/... assets fail to load in prod
- Current schema
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
Operations
Database safety rule
The SQLite database holds real user data. It is gitignored, so losing it means the data is gone.
Before any operation that interacts with the database at the schema or destructive level, run the backup script:
./scripts/backup-db.sh <reason>
That includes:
rm,mv, truncate, or otherwise replacingquartermaster.db- Ad-hoc
sqlite3sessions that mightDROP,DELETE, orUPDATE - Running any script that mutates the DB outside the running app
- Running the app in "smoke test" / "exercise the flow" mode (the app writes to the DB; use a throwaway DB for this)
Routine writes by the running web application on your normal data file are NOT covered by this rule. Those are expected behaviour.
Alembic runs scripts/backup-db.sh alembic automatically before every
migration, downgrade, alembic current, or alembic revision --autogenerate. You do not need to call it manually for ordinary
schema work.
Backup script
scripts/backup-db.sh [reason]
- Uses
sqlite3.Connection.backup(Python's online-backup API) so the copy is consistent even if the app is writing. Nosqlite3binary is required. - Resolves the DB path from
QUARTERMASTER_DB_URL(sqlite:/// only) or defaults to./quartermaster.db. - Writes to
<dir-of-db>/backups/by default. Override withQUARTERMASTER_BACKUP_DIR=/any/path. - Exits 0 when the DB file is missing (safe to call from hooks on a fresh checkout).
- Exits 1 on a non-file sqlite URL (e.g.
postgres://...) or an unsupported URL.
Filename format:
quartermaster-YYYYMMDD-HHMMSS-<slug>.db
The slug is derived from the optional reason argument, sanitised to
[A-Za-z0-9-]. Default reason is manual.
Retention
Forever. The dataset is small (kilobytes per month). Prune manually
from backups/ if the directory ever grows uncomfortable.
Restore
A backup file is a complete SQLite database. To restore:
- Stop the app (
pkill -f uvicorn.*quartermasteror Ctrl-C). - Back up the current state first in case you want to undo the restore:
./scripts/backup-db.sh pre-restore. - Copy the chosen backup over the live file:
cp backups/quartermaster-20260417-103000-alembic.db quartermaster.db. - Restart the app. Refresh the browser.
Alembic version tracking travels with the data, so if the backup was
made on an earlier schema you may need uv run alembic upgrade head
after the restore (which will itself create a backup first).
Throwaway-DB pattern for development
Never smoke-test or experiment against the live data file. Use a tmp path:
export QUARTERMASTER_DB_URL=sqlite:////tmp/qm-dev.db
export QUARTERMASTER_BACKUP_DIR=/tmp/qm-dev-backups
uv run alembic upgrade head
uv run uvicorn quartermaster.main:app --reload
# when done
unset QUARTERMASTER_DB_URL QUARTERMASTER_BACKUP_DIR
rm -rf /tmp/qm-dev.db /tmp/qm-dev-backups
Running in production
Production is home-ctr-onyx at
https://quartermaster.unbiasedgeek.com/. Dev is the uvicorn reload
server at http://127.0.0.1:8000. The platform contract
(PlatformContractQuartermaster)
is the authoritative record of the deploy surface; the sections below
cover the app-side affordances that feed into it.
Deploy
The deploy surface lives at the repo root:
| File | Purpose |
|---|---|
Dockerfile |
python:3.12-slim-bookworm base, uv sync --no-dev --frozen, USER 1000:1000, EXPOSE 8000, HEALTHCHECK against /healthz. |
docker/entrypoint.sh |
Runs alembic upgrade head (the backup hook fires automatically) then exec uvicorn with --proxy-headers --forwarded-allow-ips='*' --log-config src/quartermaster/logconfig.json. |
compose.yml |
Single quartermaster service: /mnt/quartermaster:/data bind mount, QUARTERMASTER_DB_URL=sqlite:////data/quartermaster.db (four slashes — an absolute path), proxy-net external, 1 GB mem+memswap, json-file logging capped at 50 MB × 3, all twelve Traefik + required container labels from the platform contract. |
.forgejo/workflows/deploy.yml |
On push to main: checkout → buildx → registry login → build + push → write .env + docker compose pull + up -d → healthz smoke. |
Image tag flow
compose.yml references the image as
…/quartermaster:${QUARTERMASTER_TAG:-latest}. The deploy workflow
writes QUARTERMASTER_TAG=<git-sha> to a .env file next to the
compose file, and docker compose auto-loads .env. Every deploy
pins a specific SHA without editing the checked-in compose file.
No SSH in the workflow
The homelab runner lives on home-ctr-onyx itself with the host's
Docker socket mounted, so docker compose pull && up -d from the
runner manages the production container directly — no separate SSH
hop from a runner elsewhere. This is the reason the workflow only
needs two secrets (below).
Required secrets
Repo-scoped Forgejo Actions secrets on archeious/quartermaster:
REGISTRY_TOKEN—archeiousForgejo personal access token withread:package+write:package. Used as the docker-login password againstforgejo.labbity.unbiasedgeek.com. Generate via Forgejo → User Settings → Applications → Generate New Token.QUARTERMASTER_SMOKE_PASSWORD— plaintext basic-auth password for theadminuser. The bcrypt hash is stored platform-side (~/secretson the operator workstation asQUARTERMASTER_BASICAUTH_HASH); the plaintext is delivered to the tenant out-of-band at provisioning. Used by the post-deploycurl -u admin:$QUARTERMASTER_SMOKE_PASSWORD …/healthzprobe.
Rollback (manual, v1)
- Find the prior SHA you want to roll back to (
git logor the Actions run history). - SSH to home-ctr-onyx (or via whichever operator access you have).
cdto the compose directory (the last deploy's checkout, or re-clone the repo).- Write
QUARTERMASTER_TAG=<prior-sha>to.env. docker compose up -d.
compose.yml is in the repo, so step 3 is at worst a git clone.
Health
GET /healthz — unauthenticated, returns:
200 {"status":"ok"}when a trivialSELECT 1through the SQLAlchemy session succeeds.503 {"status":"error","detail":"<ExceptionClassName>"}on any exception from the DB probe. The error class name is the only detail leaked (no message, no traceback) — enough for an operator to see what tripped the check from acurlwithout log access.
No auth on purpose: the Docker HEALTHCHECK runs inside the container
and cannot carry credentials, and Traefik's basic-auth middleware is
not applied to this route. Kept on a dedicated router
(src/quartermaster/routes_health.py) so any future router-scoped auth
on the main routers leaves it alone.
A failed probe also emits a structured warning log (event=healthz_failed,
error_class=<cls>) for Loki.
Logs
Logs are JSON on stdout. The config lives at
src/quartermaster/logconfig.json and is consumed both by Python (via
the LOG_CONFIG dict loaded in src/quartermaster/logging_config.py)
and by uvicorn CLI:
uv run uvicorn quartermaster.main:app \
--log-config src/quartermaster/logconfig.json \
--reload
Each log line has level and event as top-level JSON fields
(Promtail on home-ctr-onyx extracts them as queryable Loki labels),
plus arbitrary extras in the JSON body.
Access logs
Uvicorn access records are enriched by AccessLogFilter into:
{
"timestamp": "...", "level": "INFO", "logger": "uvicorn.access",
"event": "http_request", "method": "GET", "path": "/healthz",
"status": 200, "client_ip": "10.0.0.42:54321",
"message": "... - \"GET /healthz HTTP/1.1\" 200"
}
Application events
Five seed events fire at the most operationally interesting mutations:
| Event | Fires in | Extras |
|---|---|---|
month_created |
month_service.create_month |
year_month |
month_closed |
month_service.close_month |
year_month |
template_entry_updated |
service.update_entry |
entry_id |
posting_added |
month_service.add_posting |
posting_id, month_entry_id, amount |
posting_deleted |
month_service.delete_posting |
posting_id |
healthz_failed |
routes_health.healthz (WARNING) |
error_class |
Additional events can be added the same way — logger.info(msg, extra={"event": "...", ...}) on a logger under quartermaster.*.
Example LogQL queries
Grafana Explore, Loki data source:
{container="quartermaster"} | json
{container="quartermaster", event="http_request", status=~"5.."}
{container="quartermaster", event="month_closed"} | json | line_format "{{.year_month}} {{.message}}"
Dev ergonomics
Omit --log-config src/quartermaster/logconfig.json during local dev
if you'd rather read logs in uvicorn's default human-readable format.
Production must use the config so Promtail indexes properly.
Troubleshooting
"no such table" on first request
You forgot uv run alembic upgrade head after cloning. The backup
hook will print "no database at ..., nothing to back up" the first
time, which is expected.
"no such column: month_entry.notes" (or similar)
You pulled code with a new column but did not apply the migration.
Run uv run alembic upgrade head. The backup hook backs up the live
DB first, then the migration adds the missing column.
Alembic reports a revision it cannot locate
The DB is on a schema version whose migration file is not in the current checkout. Check out the branch that introduced that revision or downgrade the DB to a known-good revision before continuing.
Backup script exits 1 on a postgres-style URL
Intentional. The script is sqlite-specific. Switch the URL back to sqlite:///... or do the backup manually via your Postgres tooling.
/healthz returns 503
Inspect the logged event=healthz_failed record in Loki or stdout.
error_class names the exception type; common causes are a
misconfigured QUARTERMASTER_DB_URL, a DB file that got wiped or
permissions-corrupted, or Alembic having failed on container start.
DeprecationWarning about pythonjsonlogger.jsonlogger
Fixed on main. The config now references pythonjsonlogger.json.
If you see the warning, pull and re-run uv sync.
CSS, images, or other /static/... assets fail to load in prod
Rendered page has <link href="http://…/static/app.css"> (http, not
https) or <link href="http://<internal-host>/static/app.css"> (the
container's bind address instead of the public hostname). Starlette's
url_for() is reading the scheme/host from the direct request rather
than the Traefik-forwarded headers. Confirm
docker/entrypoint.sh launches uvicorn with both
--proxy-headers and --forwarded-allow-ips='*'. Missing either one
lets mixed-content blocking or wrong hostnames slip through.
Reproduce locally against any image with:
docker run -d --rm -p 18000:8000 -e QUARTERMASTER_DB_URL=... <image>
curl -sS -H 'Host: quartermaster.unbiasedgeek.com' \
-H 'X-Forwarded-Proto: https' \
http://127.0.0.1:18000/ | grep stylesheet
The href should be https://quartermaster.unbiasedgeek.com/static/....
Current schema
Applied migrations at time of writing:
| Revision | Description |
|---|---|
f1ccdc4bc1bf |
initial schema (entry, debt_target) |
03ebe3c07262 |
add month snapshot tables (month, month_entry, month_debt_target) |
ec804bdf366d |
add notes column to entry and month_entry |
a4ec4f8f6e9f |
add month lifecycle columns (state, activated_at, closed_at) |
cc60e7f73a1c |
add posting ledger table, seed opening-balance postings, drop month_entry.applied |
No schema change between cc60e7f73a1c and HEAD. After pulling new
code, uv run alembic upgrade head walks the chain and the backup
hook fires between each hop.