Platform onboarding intake: info needed to draft the Quartermaster platform contract #25
The homelab platform team (homelab/homelab-IaC) is drafting a Platform Contract for Quartermaster: the document that records what the platform provisions, what Quartermaster is allowed to do on home-ctr-onyx, and how the two sides interoperate.
Reference (same shape we'll use for yours): Platform Contract: Archon
Please fill in the answers below (edit this issue body, or drop a comment with the numbered answers). Anything you leave blank, we'll default to the conservative / minimal option noted in brackets.
1. Identity & ownership
1.1. One-line description. What does Quartermaster do?
A household budget tracker: template → monthly snapshot → posting ledger, with a Primary Debt Target pointer and Planning / Active / Closed month lifecycle.
1.2. Primary contact / owning team.
Jeff Smith (jeff@unbiasedgeek.com), solo.
1.3. Repository URL (for registry path and deploy manifest location).
https://forgejo.labbity.unbiasedgeek.com/archeious/quartermaster
1.4. Tenant namespace prefix. Archon uses `archon-` for all containers / volumes / images.
- [x] `quartermaster-` (recommended)
- [ ] `qm-` (shorter)
- [ ] other: ...
1.5. Language / runtime stack (informs image base, healthcheck style, memory budget).
Python 3.12 · FastAPI + uvicorn · Jinja2 + HTMX · SQLite + Alembic · managed with `uv`.
2. Ingress (DNS + Traefik)
2.1. Hostname pattern.
- [ ] `quartermaster.labbity.unbiasedgeek.com`: internal/lab, wildcard cert, no per-host Cloudflare DNS
- [x] `quartermaster.unbiasedgeek.com`: product surface, own CNAME + per-host LE cert (like `archon-viewer`)
- [ ] Both (labbity canonical + unbiasedgeek.com redirect)
- [ ] Other: ...
2.2. Exposure.
- [x] Public (internet-reachable)
- [ ] LAN-only (Traefik middleware blocks non-10.0.0.0/24, like the MinIO console)
2.3. Container port the app listens on (the internal port Traefik should proxy to).
`8000` (uvicorn default).
2.4. Path routing. Does this app serve only `/`, or does it need path-based sub-routes (e.g. `/api` vs `/` split)?
Only `/`. No API/UI split.
2.5. Middleware needs. Any of the following?
- [x] Rate limiting — starter proposal: 10 req/s sustained / 30 burst per source IP (personal-use tool; also deters brute-force against the basic-auth gate). Defer to platform-team numbers if there's a consistent tenant default.
- [ ] Basic auth / forward auth
- [ ] www → apex redirect (like `unbiasedgeek.com`)
- [ ] None
3. Container runtime
3.1. Image location. Where does CI push the image?
- [x] `forgejo.labbity.unbiasedgeek.com/archeious/quartermaster/quartermaster:<tag>` (recommended)
- [ ] Public registry (Docker Hub / ghcr / other): ...
3.2. Tag strategy.
- [ ] Semver (`v0.1.0`), optional at milestones
- [x] Git SHA (primary for deploys)
- [ ] `latest` (not recommended for anything load-bearing)
3.3. Expected long-running containers on home-ctr-onyx (check all that apply):
- [x] One web container
- [ ] Background worker(s), how many: ...
- [ ] Scheduler / cron
- [ ] Other: ...
3.4. Resource ceiling. Aggregate memory budget Quartermaster can consume on home-ctr-onyx (Archon's is 24 GB). Default 4 GB unless you need more.
1 GB. FastAPI + SQLite for a single user; plenty of headroom at 1 GB.
3.5. Host port publishing. Do you need any host-port binds for LAN access? (Separate from Traefik ingress, which does not need a host port.)
- [x] No (recommended; use Traefik)
- [ ] Yes, specify desired number of ports: ... (we'll allocate a range like Archon's `30000-30999`)
3.6. Docker socket or privileged mode?
- [x] No (default)
- [ ] Yes, explain why: ...
3.7. Healthcheck plan. HTTP endpoint, CLI command, or none?
HTTP: `GET /healthz` returns 200 with a trivial DB ping. The endpoint does not exist yet; it will land in the repo before the first deploy cut.
4. Data & state
4.1. Postgres.
- [x] Not needed
- [ ] Dev DB only
- [ ] Dev + prod DBs (like Archon)
- [ ] Read-only viewer role in addition to owner role
Data store is SQLite.
4.2. Object storage (MinIO / S3).
- [x] Not needed
- [ ] One bucket
- [ ] Dev + prod buckets
- [ ] Estimated size at rest: ...
4.3. Host bind-mount path. Platform can create `/mnt/quartermaster/` on request for persistent host-side data that isn't a named volume. Needed?
- [ ] Not needed (named volumes only)
- [x] Needed, describe: holds the live SQLite file (`quartermaster.db`) and the sibling `backups/` directory. The repo-level safety rule in `CLAUDE.md` depends on `scripts/backup-db.sh` being able to write next to the DB, so the mount needs to cover both. Host-side visibility of `backups/` also means the standard onyx backup sweep can include it.
4.4. Named volumes. List any volumes you'll create (must use `quartermaster-` prefix).
None. All persistent state lives under the `/mnt/quartermaster/` bind mount.
5. Observability
5.1. Logs. Promtail on home-ctr-onyx auto-ships all Docker logs to Loki; JSON with `level` + `event` gets indexed. Will you emit structured JSON?
- [x] Yes (recommended): uvicorn `--log-config` + stdlib `logging` with a JSON formatter. Treated as launch work, not deferred.
- [ ] Plain text
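As a rough sketch of the 5.1 answer, a stdlib `logging` formatter that emits one JSON object per line with the `level` and `event` keys could look like the following. The class name `JsonFormatter`, the extra `ts` and `logger` keys, and the lower-cased level are assumptions for illustration, not something the platform is known to require.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each record as a single-line JSON object with `level` and `event` keys."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": self.formatTime(record, "%Y-%m-%dT%H:%M:%S%z"),
            "level": record.levelname.lower(),
            "event": record.getMessage(),
            "logger": record.name,
        }
        if record.exc_info:
            # Keep tracebacks inside the JSON object so each log line stays parseable.
            payload["exc"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def configure_logging(level: int = logging.INFO) -> None:
    """Send all root-logger output to stdout as JSON lines."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(JsonFormatter())
    root = logging.getLogger()
    root.handlers = [handler]
    root.setLevel(level)
```

The same formatter class could also be referenced from the logging config file passed to uvicorn via `--log-config`, so that access and error logs share the format.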
5.2. Metrics. Pushgateway is available on `archon-net`. Do you need Prometheus metrics?
- [x] Not needed
- [ ] Pull (scrape): container exposes `/metrics` on the container port
- [ ] Push (Pushgateway): for batch / cron jobs
- [ ] Metric prefix to reserve: `quartermaster_`
5.3. Grafana dashboards. Managed via the `grafana/grafana` Terraform provider. Want one stood up now, or later?
- [x] Not yet
- [ ] One dashboard at launch
5.4. Alerts. Alertmanager is wired up. Any alerts to define at launch (TLS expiry, container down, error rate)?
Three at launch: container-down, TLS cert expiry (matters because we're on a per-host LE cert, not the wildcard), and elevated 5xx rate (cheap signal given structured logs are landing).
6. CI/CD
6.1. Forgejo Actions runner.
`onyx-runner` on home-ctr-onyx, 10 concurrent jobs. Labels available: `homelab` (container mode, has Node.js), `self-hosted/onyx` (host mode, no Node.js). Which do you need?
- [x] `homelab` (for JS actions like `actions/checkout`)
- [ ] `self-hosted/onyx` (shell-only, direct host)
- [ ] Both
6.2. Deploy mechanism.
- [x] Compose file in the repo
- [ ] `docker run` wrapped in a systemd unit
- [ ] Forgejo Actions workflow does `docker run`
- [ ] Other: ...
6.3. Deploy on merge to default branch?
- [x] Yes. Alembic `upgrade head` runs on container start; the backup hook in `alembic/env.py` protects the DB.
- [ ] Manual trigger only
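The thread does not show the actual backup hook in `alembic/env.py` or `scripts/backup-db.sh`. As a hedged illustration of the idea in 6.3 (copy the database into the sibling `backups/` directory before migrations run), a stdlib-only sketch might look like this. The function name `backup_sqlite` and the timestamped file name are hypothetical.

```python
import sqlite3
from datetime import datetime, timezone
from pathlib import Path


def backup_sqlite(db_path: Path) -> Path:
    """Copy the live DB into a sibling backups/ directory and return the copy's path."""
    backup_dir = db_path.parent / "backups"
    backup_dir.mkdir(exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = backup_dir / f"{db_path.stem}-{stamp}.db"

    # Connection.backup() uses SQLite's online backup API, so the copy is
    # consistent even if another connection is writing at the same time.
    src = sqlite3.connect(db_path)
    dst = sqlite3.connect(target)
    try:
        src.backup(dst)
    finally:
        dst.close()
        src.close()
    return target
```

A migration entry point would call `backup_sqlite(...)` once before applying `upgrade head`. This is also why the bind mount has to cover both the database file and `backups/`.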
7. Secrets & auth
7.1. Runtime secrets needed (env vars at container start). List names only, not values.
None required inside the app. `QUARTERMASTER_DB_URL` points at the SQLite path on the bind mount; no API keys, no DB password. (Basic-auth credentials live in Traefik, not in Quartermaster; see 7.3.)
7.2. Where should they live?
- [ ] `~/secrets` on workstation (mirrors Archon pattern)
- [x] Forgejo Actions secrets on `archeious/quartermaster`
- [ ] Both (workstation for local ops, Actions for CI deploy)
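For the `QUARTERMASTER_DB_URL` setting from 7.1, one detail is easy to get wrong, assuming the app uses SQLAlchemy-style SQLite URLs (plausible alongside Alembic, but not stated in this thread). In that URL form three slashes introduce a relative path and four slashes an absolute one. The helper names and the default URL below are hypothetical.

```python
import os
from pathlib import Path


def sqlite_path_from_url(url: str) -> Path:
    """Extract the file path from a SQLAlchemy-style, file-based SQLite URL."""
    prefix = "sqlite:///"
    if not url.startswith(prefix):
        raise ValueError(f"not a file-based SQLite URL: {url!r}")
    # After the three-slash prefix, a leading "/" means an absolute path:
    #   sqlite:///data/quartermaster.db   -> data/quartermaster.db (relative to the CWD)
    #   sqlite:////data/quartermaster.db  -> /data/quartermaster.db (absolute)
    return Path(url[len(prefix):])


def db_path_from_env(default: str = "sqlite:///quartermaster.db") -> Path:
    """Resolve the DB file path from QUARTERMASTER_DB_URL (default is illustrative)."""
    return sqlite_path_from_url(os.environ.get("QUARTERMASTER_DB_URL", default))
```

If the database is meant to live at an absolute path inside the container, the four-slash form is the unambiguous choice. The three-slash form only reaches the same file when the process happens to start in `/`.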
7.3. User-facing auth (does the web app itself gate access)?
- [ ] No auth (fully public)
- [ ] App-level auth (handled inside Quartermaster)
- [x] Infrastructure-level (Traefik basic auth / forward auth / Cloudflare Access) — Traefik basic auth. App has no login; Traefik gates everything at the edge.
8. Lifecycle
8.1. Uptime expectation.
- [ ] Prototype — downtime OK, no SLO
- [x] Best-effort — survives host reboots, no on-call
- [ ] Product — outages matter, needs alerts
8.2. Expected launch window.
ASAP.
8.3. Anything else the platform should know (pending redesigns, scale plans, data sensitivity)?
Financial data, single user, not regulated. Publicly exposed, so basic auth (7.3) is load-bearing — brute-force rate limiting (2.5) and TLS expiry alerts (5.4) are tied to that decision.
Platform-team asks
Three items that need a platform-side call:
- `/mnt/quartermaster/` is acceptable given the SQLite + `backups/` layout.
Once this is filled in, homelab-IaC will:
- `PlatformContractQuartermaster.md` in the wiki and add it to the index
- `homelab/homelab-IaC` for the provisioning work (Cloudflare DNS if needed, CoreDNS hosts entry, host port range allocation, anything else)
Platform-side provisioning complete. Quartermaster is ready to deploy.
Contract: PlatformContractQuartermaster
Platform tracking: homelab-IaC#216 (4 PRs merged: #217 Cloudflare, #218 CoreDNS, #219 Traefik middlewares, #220 bind mount + backup)
What's live
- `quartermaster.unbiasedgeek.com` DNS: externally (`166.70.212.117`) and internally (`10.0.0.61` via split-DNS)
- `quartermaster-basicauth@file` (admin user, bcrypt) + `quartermaster-ratelimit@file` (10/30 per-IP) loaded on home-ctr-onyx, hot-reload verified
- `/mnt/quartermaster/`: `1000:1000`, mode `0750`, sibling `backups/` ready. Included in nightly restic snapshot to r720xd-1
- `HTTP/2 404` (no router for this host yet; your compose adds it). Cert is currently Traefik's self-signed default; it flips to a real Let's Encrypt cert automatically on first request once your router with `tls.certresolver=letsencrypt` is registered
What you do next (on `archeious/quartermaster`)
- `USER 1000:1000` (or map to UID 1000 via another mechanism) so it can write to the `/mnt/quartermaster/` bind mount.
- `/healthz` endpoint: return 200 with a trivial `SELECT 1` against the SQLite DB; needed for the cAdvisor container-down signal the contract references.
- `compose.yml`:
  - `forgejo.labbity.unbiasedgeek.com/archeious/quartermaster/quartermaster:<git-sha>`
  - `/mnt/quartermaster` → wherever the app writes (typical `/data`); set `QUARTERMASTER_DB_URL` to `sqlite:///data/quartermaster.db` (or equivalent)
  - `networks: [proxy-net]` declared as `external: true`
  - `middlewares=quartermaster-basicauth@file,quartermaster-ratelimit@file`
  - `HEALTHCHECK` hitting `/healthz`
  - `mem_limit: 1g`, `memswap_limit: 1g`, `restart: unless-stopped`, `logging.options.max-size: 50m` + `max-file: 3`
  - `tenant=quartermaster`, `project=quartermaster`, `managed_by=quartermaster`, `com.centurylinklabs.watchtower.enable=false`
- `homelab` label that builds, pushes to the Forgejo registry, SSHes to home-ctr-onyx, and runs `docker compose pull && docker compose up -d`.
- `upgrade head` on container start with the pre-upgrade DB backup hook in `alembic/env.py` (per your intake-form answer).
Basic-auth credentials
- `admin`
Reminders
- `/mnt/quartermaster/`, don't bypass the basic-auth router.
Closing this intake issue. If you hit a platform-side gap during deploy, file a fresh issue on `homelab/homelab-IaC`.
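The `/healthz` item in the next-steps list asks for a 200 backed by a trivial `SELECT 1`. A stdlib-only sketch of that check follows, leaving the FastAPI route wiring out. The function names are hypothetical, and returning 503 on failure is an assumption, since the thread only specifies the 200 case.

```python
import sqlite3


def db_ping(db_path: str, timeout: float = 1.0) -> bool:
    """Return True if `SELECT 1` succeeds against the SQLite file at db_path."""
    try:
        # mode=rw: fail instead of silently creating a missing database file.
        # Note: paths containing "?" or "#" would need URI escaping first.
        conn = sqlite3.connect(f"file:{db_path}?mode=rw", uri=True, timeout=timeout)
    except sqlite3.Error:
        return False
    try:
        return conn.execute("SELECT 1").fetchone() == (1,)
    except sqlite3.Error:
        return False
    finally:
        conn.close()


def healthz_status(db_path: str) -> int:
    """HTTP status a /healthz handler could return: 200 if healthy, else 503 (assumed)."""
    return 200 if db_ping(db_path) else 503
```

Opening with `mode=rw` matters on a bind mount: if `/mnt/quartermaster/` is missing or unwritable for UID 1000, the check fails where a default connection might create an empty database and report healthy.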