fix(cache): _save_investigations write is not atomic — crash mid-write corrupts shared UUID map #85
Problem
/tmp/luminos/investigations.json is shared state across every invocation on the machine: it's the map from absolute target path → UUID that lets luminos resume prior investigations. The write path at luminos_lib/cache.py:46-49 is non-atomic.

If the process crashes, is killed, or runs out of disk between open("w") (which truncates) and json.dump completing, the file is left truncated or empty. _load_investigations then sees a json.JSONDecodeError, returns {}, and every resumable investigation on this machine loses its UUID mapping. They all become unreachable: the cache dirs still exist under their old UUIDs, but luminos can't find them from the target path anymore.

A concurrent run can also clobber: two luminos invocations against different targets, if they race on load/modify/save, lose one of the two updates.

Fix
Standard atomic-replace pattern: write the JSON to a temp file in the same directory, then os.replace it over INVESTIGATIONS_PATH.
This addresses crash safety. Concurrency is not solved by this alone (two writers can still lose one side's update), but a lost update is far less likely than crash corruption in practice, and the fix is cheap.
If concurrency matters more later (probably when #39 MCP backend lands), a flock over INVESTIGATIONS_PATH around load + save is the next step.

Acceptance
- _save_investigations uses temp file + os.replace
- tests/test_cache.py simulates a mid-write crash and confirms the old file is intact
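One way the crash test could look, as a rough sketch. `save_atomic` is a hypothetical stand-in for the fixed _save_investigations, and the crash is approximated by patching json.dump to write a truncated prefix and then raise. A real test in tests/test_cache.py would import the function from luminos_lib.cache and would probably use pytest's tmp_path fixture rather than tempfile.

```python
import json
import os
import tempfile
from unittest import mock


def save_atomic(data, path):
    """Hypothetical stand-in for the fixed _save_investigations."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def partial_dump(obj, fp, **kwargs):
    """Simulated crash: emit a truncated JSON prefix, then fail."""
    fp.write('{"/some/target": "trunc')
    raise OSError("simulated crash mid-write")


def test_crash_mid_write_keeps_old_file():
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "investigations.json")
        old = {"/some/target": "old-uuid"}
        save_atomic(old, path)

        # Make the next write fail partway through serialization.
        with mock.patch("json.dump", side_effect=partial_dump):
            try:
                save_atomic({"/other/target": "new-uuid"}, path)
            except OSError:
                pass
            else:
                raise AssertionError("expected the simulated crash to propagate")

        # The old file must still parse and hold the old mapping.
        with open(path) as f:
            assert json.load(f) == old
        # The failed write should not leave a stray temp file behind.
        assert os.listdir(tmp_dir) == ["investigations.json"]
```

An exception is only an approximation of a hard kill: after SIGKILL or power loss the cleanup branch never runs, so a stale temp file may remain, but the target file is still the old complete version.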