docs(operations): add Deploy section for live production + proxy-headers troubleshooting

claude-code 2026-04-19 18:35:25 -06:00
parent 951978bdac
commit c1a4abe1f5

@ -72,7 +72,7 @@ A backup file is a complete SQLite database. To restore:
4. Restart the app. Refresh the browser.
Alembic version tracking travels with the data, so if the backup was
made on an earlier schema you may need to `uv run alembic upgrade head`
made on an earlier schema you may need `uv run alembic upgrade head`
after the restore (which will itself create a backup first).
## Throwaway-DB pattern for development
@ -90,14 +90,69 @@ unset QUARTERMASTER_DB_URL QUARTERMASTER_BACKUP_DIR
rm -rf /tmp/qm-dev.db /tmp/qm-dev-backups
```
## Running in "production"
## Running in production
Production is the homelab host home-ctr-onyx, containerised. Dev is
the uvicorn reload server at `http://127.0.0.1:8000`. The platform
contract ([PlatformContractQuartermaster](https://forgejo.labbity.unbiasedgeek.com/homelab/homelab-IaC/wiki/PlatformContractQuartermaster))
Production is home-ctr-onyx at
`https://quartermaster.unbiasedgeek.com/`. Dev is the uvicorn reload
server at `http://127.0.0.1:8000`. The platform contract
([PlatformContractQuartermaster](https://forgejo.labbity.unbiasedgeek.com/homelab/homelab-IaC/wiki/PlatformContractQuartermaster))
is the authoritative record of the deploy surface; the sections below
cover the app-side affordances that feed into it.
## Deploy
The deploy surface lives at the repo root:
| File | Purpose |
|---|---|
| `Dockerfile` | `python:3.12-slim-bookworm` base, `uv sync --no-dev --frozen`, `USER 1000:1000`, `EXPOSE 8000`, `HEALTHCHECK` against `/healthz`. |
| `docker/entrypoint.sh` | Runs `alembic upgrade head` (the backup hook fires automatically) then `exec uvicorn` with `--proxy-headers --forwarded-allow-ips='*' --log-config src/quartermaster/logconfig.json`. |
| `compose.yml` | Single `quartermaster` service: `/mnt/quartermaster:/data` bind mount, `QUARTERMASTER_DB_URL=sqlite:////data/quartermaster.db` (four slashes — an absolute path), `proxy-net` external, 1 GB mem+memswap, `json-file` logging capped at 50 MB × 3, all twelve Traefik + required container labels from the platform contract. |
| `.forgejo/workflows/deploy.yml` | On push to `main`: checkout → buildx → registry login → build + push → write `.env` + `docker compose pull` + `up -d` → healthz smoke. |
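
The four-slash note in the `compose.yml` row can be checked with a small shell sketch. It assumes the usual SQLAlchemy-style convention that everything after `sqlite:///` is the file path, so a fourth slash makes the path absolute. `sqlite_path` is a hypothetical helper for illustration only, not something in the repo.

```sh
# Hypothetical helper: strip the "sqlite:///" prefix to show which
# filesystem path a SQLite URL points at.
sqlite_path() {
  printf '%s\n' "${1#sqlite:///}"
}

sqlite_path 'sqlite:////data/quartermaster.db'   # prints /data/quartermaster.db (absolute)
sqlite_path 'sqlite:///quartermaster.db'         # prints quartermaster.db (relative to the working directory)
```

With only three slashes the database would land relative to the container's working directory rather than on the `/data` bind mount.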
### Image tag flow
`compose.yml` references the image as
`…/quartermaster:${QUARTERMASTER_TAG:-latest}`. The deploy workflow
writes `QUARTERMASTER_TAG=<git-sha>` to a `.env` file next to the
compose file, and `docker compose` auto-loads `.env`. Every deploy
pins a specific SHA without editing the checked-in compose file.
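
The `${QUARTERMASTER_TAG:-latest}` fallback behaves the same way in compose interpolation as it does in POSIX shell, so it can be illustrated without Docker. In this sketch `registry.example` is a placeholder for the elided registry prefix and `image_ref` is a hypothetical helper; neither comes from the repo.

```sh
# Hypothetical helper: build the image reference the way compose.yml's
# interpolation would, falling back to "latest" when the tag is unset.
image_ref() {
  printf '%s\n' "registry.example/quartermaster:${QUARTERMASTER_TAG:-latest}"
}

unset QUARTERMASTER_TAG
image_ref                          # prints registry.example/quartermaster:latest

QUARTERMASTER_TAG=c1a4abe1f5       # e.g. the value the workflow writes to .env
image_ref                          # prints registry.example/quartermaster:c1a4abe1f5
```

On the host, `docker compose config` run from the compose directory should show the resolved image reference after `.env` is loaded.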
### No SSH in the workflow
The `homelab` runner lives on home-ctr-onyx itself with the host's
Docker socket mounted, so `docker compose pull && docker compose up -d` from the
runner manages the production container directly — no separate SSH
hop from a runner elsewhere. This is the reason the workflow only
needs two secrets (below).
### Required secrets
Repo-scoped Forgejo Actions secrets on `archeious/quartermaster`:
* **`REGISTRY_TOKEN`** — `archeious` Forgejo personal access token
with `read:package` + `write:package`. Used as the docker-login
password against `forgejo.labbity.unbiasedgeek.com`. Generate via
Forgejo → User Settings → Applications → Generate New Token.
* **`QUARTERMASTER_SMOKE_PASSWORD`** — plaintext basic-auth password
for the `admin` user. The bcrypt hash is stored platform-side
(`~/secrets` on the operator workstation as
`QUARTERMASTER_BASICAUTH_HASH`); the plaintext is delivered to
the tenant out-of-band at provisioning. Used by the post-deploy
`curl -u admin:$QUARTERMASTER_SMOKE_PASSWORD …/healthz` probe.
### Rollback (manual, v1)
1. Find the prior SHA you want to roll back to (`git log` or the
Actions run history).
2. SSH to home-ctr-onyx (or use whichever operator access you have).
3. `cd` to the compose directory (the last deploy's checkout, or
re-clone the repo).
4. Write `QUARTERMASTER_TAG=<prior-sha>` to `.env`.
5. `docker compose up -d`.
`compose.yml` is in the repo, so step 3 is at worst a `git clone`.
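
Steps 3 to 5 could be wrapped in a small script. This is a sketch only: `pin_tag` and `rollback` are hypothetical names, and it assumes `.env` holds nothing but `QUARTERMASTER_TAG`, since the file is overwritten.

```sh
# Hypothetical helper: write the tag that compose.yml interpolates.
# Overwrites .env, on the assumption that it contains only this variable.
pin_tag() {
  compose_dir="$1"
  sha="$2"
  if [ -z "$sha" ]; then
    echo 'usage: pin_tag <compose-dir> <prior-sha>' >&2
    return 2
  fi
  printf 'QUARTERMASTER_TAG=%s\n' "$sha" > "$compose_dir/.env"
}

# Hypothetical wrapper for steps 3-5; docker runs only when this is called.
rollback() {
  pin_tag "$1" "$2" || return
  (cd "$1" && docker compose up -d)
}
```

Usage would look like `rollback /path/to/compose-dir <prior-sha>`, run on home-ctr-onyx.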
## Health
`GET /healthz` — unauthenticated, returns:
@ -166,7 +221,7 @@ extra={"event": "...", ...})` on a logger under `quartermaster.*`.
### Example LogQL queries
Grafana Explore, Loki data source, once the deploy is live:
Grafana Explore, Loki data source:
```
{container="quartermaster"} | json
@ -217,6 +272,27 @@ permissions-corrupted, or Alembic having failed on container start.
Fixed on main. The config now references `pythonjsonlogger.json`.
If you see the warning, pull and re-run `uv sync`.
### CSS, images, or other `/static/...` assets fail to load in prod
Rendered page has `<link href="http://…/static/app.css">` (http, not
https) or `<link href="http://<internal-host>/static/app.css">` (the
container's bind address instead of the public hostname). Starlette's
`url_for()` is reading the scheme/host from the direct request rather
than the Traefik-forwarded headers. Confirm
`docker/entrypoint.sh` launches uvicorn with **both**
`--proxy-headers` and `--forwarded-allow-ips='*'`. If either one is
missing, generated URLs carry `http://` or the internal hostname, and
the browser blocks them as mixed content or cannot reach them.
Reproduce locally against any image with:
```sh
docker run -d --rm -p 18000:8000 -e QUARTERMASTER_DB_URL=... <image>
curl -sS -H 'Host: quartermaster.unbiasedgeek.com' \
-H 'X-Forwarded-Proto: https' \
http://127.0.0.1:18000/ | grep stylesheet
```
The href should be `https://quartermaster.unbiasedgeek.com/static/...`.
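
To turn the manual `grep` into a pass/fail check, something like the following could be piped onto the `curl` output. It is a sketch under assumptions: `check_static_hrefs` is a hypothetical helper, and it expects each stylesheet `<link>` to sit on one line with a double-quoted `href`.

```sh
# Hypothetical helper: read rendered HTML on stdin and fail if any
# stylesheet link does not start with the expected public origin.
# Passes vacuously when the page has no stylesheet links at all.
check_static_hrefs() {
  origin="$1"
  # Keep stylesheet lines, then drop those already using the expected origin.
  bad=$(grep 'stylesheet' | grep -vF "href=\"${origin}/static/" || true)
  if [ -n "$bad" ]; then
    printf 'unexpected stylesheet href:\n%s\n' "$bad" >&2
    return 1
  fi
  return 0
}
```

For example, append `| check_static_hrefs https://quartermaster.unbiasedgeek.com` to the `curl` command above in place of `| grep stylesheet`.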
## Current schema
Applied migrations at time of writing: