marchwarden/docs/stress-tests/M3.3-runs/18-scope.log
Jeff Smith 13215d7ddb docs(stress-tests): M3.3 Phase A — calibration data collection
Issue #46 (Phase A only — Phase B human rating still pending, issue stays open).

Adds the data-collection half of the calibration milestone:

- scripts/calibration_runner.sh — runs 20 fixed balanced-depth queries
  across 4 categories (factual, comparative, contradiction-prone,
  scope-edge), 5 each, capturing per-run logs to docs/stress-tests/M3.3-runs/.
- scripts/calibration_collect.py — loads every persisted ResearchResult
  under ~/.marchwarden/traces/*.result.json and emits a markdown rating
  worksheet with one row per run. Recovers question text from each
  trace's start event and category from the run-log filename.
- docs/stress-tests/M3.3-rating-worksheet.md — 22 runs (20 calibration
  + caffeine smoke + M3.2 multi-axis), with empty actual_rating columns
  for the human-in-the-loop scoring step.
- docs/stress-tests/M3.3-runs/*.log — runtime logs from the calibration
  runner, kept as provenance. Gitignore updated with an exception
  carving stress-test logs out of the global *.log ignore.
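
A minimal sketch of the recovery step described for calibration_collect.py, not the script itself. It assumes run logs are named `<run number>-<category>.log` (as in `18-scope.log`) and that trace events are available as a list of dicts. The function names and worksheet column order are hypothetical.

```python
import re
from pathlib import Path

# Assumed filename shape: "<run number>-<category>.log", e.g. "18-scope.log".
_LOG_NAME = re.compile(r"^(?P<run>\d+)-(?P<category>[a-z-]+)\.log$")


def category_from_log_name(log_path):
    """Return (run_number, category) parsed from a run-log filename, or None."""
    match = _LOG_NAME.match(Path(log_path).name)
    if match is None:
        return None
    return int(match.group("run")), match.group("category")


def question_from_events(events):
    """Return the question carried by the first 'start' event, or None."""
    for event in events:
        if event.get("event") == "start":
            return event.get("question")
    return None


def worksheet_row(run, category, question, confidence):
    """Format one markdown table row; the last cell (actual_rating) stays empty."""
    safe_question = question.replace("|", "\\|")  # keep pipes from breaking the table
    return f"| {run} | {category} | {safe_question} | {confidence:.2f} |  |"
```

The trailing empty cell stands in for the actual_rating column that Phase B fills in by hand.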

Note: M3.1's 4 runs predate #54 (full result persistence) and so are
unrecoverable to the worksheet — only post-#54 runs have a result.json
sibling. 22 rateable runs is still within the milestone target of 20–30.

Phases B (human rating) and C (analysis + rubric + wiki update) follow
in a later session. This issue stays open until both are done.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 20:21:47 -06:00

Researching: What internal compensation bands does Goldman Sachs use for VPs in
2026?
{"question": "What internal compensation bands does Goldman Sachs use for VPs in 2026?", "depth": "balanced", "max_iterations": null, "token_budget": null, "event": "ask_started", "logger": "marchwarden.cli", "level": "info", "timestamp": "2026-04-09T02:15:05.792037Z"}
{"transport": "stdio", "server": "marchwarden-web-researcher", "event": "mcp_server_starting", "logger": "marchwarden.mcp", "level": "info", "timestamp": "2026-04-09T02:15:06.820624Z"}
{"event": "Processing request of type CallToolRequest", "logger": "mcp.server.lowlevel.server", "level": "info", "timestamp": "2026-04-09T02:15:06.829930Z"}
{"question": "What internal compensation bands does Goldman Sachs use for VPs in 2026?", "depth": "balanced", "max_iterations": 5, "token_budget": 20000, "model_id": "claude-sonnet-4-6", "event": "research_started", "trace_id": "716e548a-ceaf-4d18-8b47-ac35e3460b52", "researcher": "web", "logger": "marchwarden.researcher.web", "level": "info", "timestamp": "2026-04-09T02:15:06.876139Z"}
{"step": 1, "decision": "Beginning research: depth=balanced", "question": "What internal compensation bands does Goldman Sachs use for VPs in 2026?", "context": "", "max_iterations": 5, "token_budget": 20000, "event": "start", "trace_id": "716e548a-ceaf-4d18-8b47-ac35e3460b52", "researcher": "web", "logger": "marchwarden.researcher.trace", "level": "info", "timestamp": "2026-04-09T02:15:06.876453Z"}
{"step": 2, "decision": "Starting iteration 1/5", "tokens_so_far": 0, "event": "iteration_start", "trace_id": "716e548a-ceaf-4d18-8b47-ac35e3460b52", "researcher": "web", "logger": "marchwarden.researcher.trace", "level": "info", "timestamp": "2026-04-09T02:15:06.876542Z"}
{"step": 7, "decision": "Starting iteration 2/5", "tokens_so_far": 1108, "event": "iteration_start", "trace_id": "716e548a-ceaf-4d18-8b47-ac35e3460b52", "researcher": "web", "logger": "marchwarden.researcher.trace", "level": "info", "timestamp": "2026-04-09T02:15:14.969587Z"}
{"step": 12, "decision": "Starting iteration 3/5", "tokens_so_far": 5772, "event": "iteration_start", "trace_id": "716e548a-ceaf-4d18-8b47-ac35e3460b52", "researcher": "web", "logger": "marchwarden.researcher.trace", "level": "info", "timestamp": "2026-04-09T02:15:26.767509Z"}
{"step": 17, "decision": "Starting iteration 4/5", "tokens_so_far": 15029, "event": "iteration_start", "trace_id": "716e548a-ceaf-4d18-8b47-ac35e3460b52", "researcher": "web", "logger": "marchwarden.researcher.trace", "level": "info", "timestamp": "2026-04-09T02:15:32.149418Z"}
{"step": 22, "decision": "Token budget reached before iteration 5: 26452/20000", "event": "budget_exhausted", "trace_id": "716e548a-ceaf-4d18-8b47-ac35e3460b52", "researcher": "web", "logger": "marchwarden.researcher.trace", "level": "info", "timestamp": "2026-04-09T02:15:41.927200Z"}
{"step": 23, "decision": "Beginning synthesis of gathered evidence", "evidence_count": 31, "iterations_run": 4, "tokens_used": 26452, "event": "synthesis_start", "trace_id": "716e548a-ceaf-4d18-8b47-ac35e3460b52", "researcher": "web", "logger": "marchwarden.researcher.trace", "level": "info", "timestamp": "2026-04-09T02:15:41.927359Z"}
{"step": 24, "decision": "Parsed synthesis JSON successfully", "duration_ms": 65550, "event": "synthesis_complete", "trace_id": "716e548a-ceaf-4d18-8b47-ac35e3460b52", "researcher": "web", "logger": "marchwarden.researcher.trace", "level": "info", "timestamp": "2026-04-09T02:16:45.258119Z"}
{"step": 42, "decision": "Research complete", "confidence": 0.62, "citation_count": 10, "gap_count": 4, "discovery_count": 3, "total_duration_sec": 102.914, "event": "complete", "trace_id": "716e548a-ceaf-4d18-8b47-ac35e3460b52", "researcher": "web", "logger": "marchwarden.researcher.trace", "level": "info", "timestamp": "2026-04-09T02:16:45.259163Z"}
{"confidence": 0.62, "citations": 10, "gaps": 4, "discovery_events": 3, "tokens_used": 51829, "iterations_run": 4, "wall_time_sec": 98.38188624382019, "budget_exhausted": true, "event": "research_completed", "trace_id": "716e548a-ceaf-4d18-8b47-ac35e3460b52", "researcher": "web", "logger": "marchwarden.researcher.web", "level": "info", "timestamp": "2026-04-09T02:16:45.259280Z"}
{"error": "[Errno 13] Permission denied: '/home/micro/.marchwarden/costs.jsonl'", "event": "cost_ledger_write_failed", "trace_id": "716e548a-ceaf-4d18-8b47-ac35e3460b52", "researcher": "web", "logger": "marchwarden.researcher.web", "level": "warning", "timestamp": "2026-04-09T02:16:45.259714Z"}
{"event": "Processing request of type ListToolsRequest", "logger": "mcp.server.lowlevel.server", "level": "info", "timestamp": "2026-04-09T02:16:45.264223Z"}
{"trace_id": "716e548a-ceaf-4d18-8b47-ac35e3460b52", "confidence": 0.62, "citations": 10, "tokens_used": 51829, "wall_time_sec": 98.38188624382019, "event": "ask_completed", "logger": "marchwarden.cli", "level": "info", "timestamp": "2026-04-09T02:16:45.493130Z"}
╭─────────────────────────────────── Answer ───────────────────────────────────╮
│ Goldman Sachs does not publicly disclose formal internal compensation bands │
│ for VPs. Based on available evidence, the VP title at Goldman Sachs is a │
│ single, wide-band level (there are no officially published sub-bands like │
│ VP1/VP2/VP3 at Goldman, unlike some other banks). Compensation varies │
│ enormously depending on division (front office vs. middle/back office) and │
│ seniority within the band. Key data points for 2026: (1) Glassdoor reports a │
│ typical total pay range of $213,109–$391,379 (25th–75th percentile) across │
│ ~4,695 salary submissions, covering all VP roles firm-wide. (2) Levels.fyi │
│ reports a median total VP compensation of $144K, which likely skews toward │
│ tech/engineering roles. (3) 6figr reports an average of $297K (range │
│ $265K–$501K, top 10% up to $514K) based on 67 profiles. (4) For front-office │
│ Investment Banking VPs specifically, Glassdoor reports a much higher range │
│ of $480,547–$888,585 (25th–75th percentile) based on 14 salaries. (5) │
│ Industry benchmarks from Mergers & Inquisitions (2026 update) place │
│ front-office IB VP base salary at $250–$300K with total compensation of │
│ $525–$800K for NY-based roles. (6) Indeed reports an average of ~$145,324, │
│ consistent with a broad mix of roles. Community sources (Fishbowl) confirm │
│ the VP band is 'very wide' with no official internal sub-levels at Goldman; │
│ pay differentiation happens informally by group, skillset, and front vs. │
│ back office status. │
╰──────────────────────────────────────────────────────────────────────────────╯
Citations
┏━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┓
┃ # ┃ Title / Locator ┃ Excerpt ┃ Conf ┃
┡━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━┩
│ 1 │ Total salary range for │ The typical pay range is │ 0.85 │
│ │ Goldman Sachs Vice President │ between $213,109 (25th │ │
│ │ - Glassdoor │ percentile) and $391,379 (75th │ │
│ │ https://www.glassdoor.com/Sal │ percentile) annually. This is │ │
│ │ ary/Goldman-Sachs-Vice-Presid │ based on 4,695 salaries │ │
│ │ ent-Salaries-E2800_D_KO14,28. │ submitted by Goldman Sachs │ │
│ │ htm │ │ │
├─────┼───────────────────────────────┼────────────────────────────────┼───────┤
│ 2 │ Total salary range for │ The typical pay range is │ 0.85 │
│ │ Goldman Sachs Vice President │ between $220,674 (25th │ │
│ │ - Glassdoor │ percentile) and $411,924 (75th │ │
│ │ https://www.glassdoor.com/Sal │ percentile) annually. This is │ │
│ │ ary/Goldman-Sachs-V-P-Salarie │ based on 4,695 salaries │ │
│ │ s-E2800_D_KO14,17.htm │ submitted by Goldman Sachs │ │
├─────┼───────────────────────────────┼────────────────────────────────┼───────┤
│ 3 │ Goldman Sachs Vice President │ The median Vice President │ 0.75 │
│ │ Salary | $110K-$144K+ | │ compensation in United States │ │
│ │ Levels.fyi │ package at Goldman Sachs │ │
│ │ https://www.levels.fyi/compan │ totals $144K per year. View │ │
│ │ ies/goldman-sachs/salaries/vi │ the base salary, stock, and │ │
│ │ ce-president │ bonus breakdowns for Goldman │ │
│ │ │ Sachs's total compensation │ │
│ │ │ packages. Last updated: │ │
│ │ │ 4/6/2026 │ │
├─────┼───────────────────────────────┼────────────────────────────────┼───────┤
│ 4 │ Goldman Sachs Vice President │ Employees at Goldman Sachs as │ 0.70 │
│ │ Vp Salaries 2026 | │ Vice President Vp earn an │ │
│ │ $265k-$514k │ average of $297k, mostly │ │
│ │ https://6figr.com/us/salary/g │ ranging from $265k per year to │ │
│ │ oldman-sachs--vice-president- │ $501k per year based on 67 │ │
│ │ vp │ profiles. The top 10% │ │
├─────┼───────────────────────────────┼────────────────────────────────┼───────┤
│ 5 │ Goldman Sachs Investment │ The typical pay range is │ 0.65 │
│ │ Banking Vice President ... │ between $480,547 (25th │ │
│ │ https://www.glassdoor.com/Sal │ percentile) and $888,585 (75th │ │
│ │ ary/Goldman-Sachs-Investment- │ percentile) annually. This is │ │
│ │ Banking-Vice-President-Salari │ based on 14 salaries submitted │ │
│ │ es-E2800_D_KO14,47.htm │ by Goldman Sachs │ │
├─────┼───────────────────────────────┼────────────────────────────────┼───────┤
│ 6 │ Investment Banker Salary and │ Vice President (VP) | 28-40 | │ 0.88 │
│ │ Bonus Report: 2026 Update │ $250-$300K | $525-$800K | 3-4 │ │
│ │ https://mergersandinquisition │ years │ │
│ │ s.com/investment-banker-salar │ │ │
│ │ y/ │ NOTE: All numbers are pre-tax │ │
│ │ │ for New York-based │ │
│ │ │ front-office roles and include │ │
│ │ │ base salaries and year-end │ │
│ │ │ bonuses but not │ │
│ │ │ signing/relocation bonuses, │ │
│ │ │ stub bonuses, benefits, etc. │ │
├─────┼───────────────────────────────┼────────────────────────────────┼───────┤
│ 7 │ Vice President yearly │ Average Goldman Sachs Vice │ 0.70 │
│ │ salaries in the United States │ President yearly pay in the │ │
│ │ at Goldman Sachs │ United States is approximately │ │
│ │ https://www.indeed.com/cmp/Go │ $145,324, which is 9% below │ │
│ │ ldman-Sachs/salaries/Vice-Pre │ the national average. Salary │ │
│ │ sident │ estimated from │ │
├─────┼───────────────────────────────┼────────────────────────────────┼───────┤
│ 8 │ Are there internal levels/ │ Goldman VP band is very wide. │ 0.72 │
│ │ bands within the VP tit... | │ Promoted from associate and │ │
│ │ Fishbowl │ Next step md is difficult to │ │
│ │ https://www.fishbowlapp.com/p │ get. │ │
│ │ ost/are-there-internal-levels │ │ │
│ │ -bands-within-the-vp-title-at │ Yes, banks have different │ │
│ │ -goldman-sachs-fwiw-this-is-f │ bands depending on skillset, │ │
│ │ or-a-nonbusiness-internal-str │ group within the firm, front │ │
│ │ ategy-kind │ office vs back office, etc │ │
│ │ │ │ │
│ │ │ Not Goldman though. It's just │ │
│ │ │ VP │ │
├─────┼───────────────────────────────┼────────────────────────────────┼───────┤
│ 9 │ VP of FP&A at Goldman Sachs │ FP&A is middle office at │ 0.65 │
│ │ salary : r/FPandA - Reddit │ banks, they won't make │ │
│ │ https://www.reddit.com/r/FPan │ anywhere near $400k at VP │ │
│ │ dA/comments/1dgguz5/vp_of_fpa │ level. Front office VP │ │
│ │ _at_goldman_sachs_salary/ │ positions will all clear over │ │
│ │ │ $400k in a place │ │
├─────┼───────────────────────────────┼────────────────────────────────┼───────┤
│ 10 │ Goldman Sachs Vp Salaries │ 15 to 15 yrs. Base. $179k. │ 0.65 │
│ │ 2026 | $208k-$586k - │ Stocks / Yr. $21k. Bonus. │ │
│ │ 6figr.com │ $120k. Total Salary. $318k. │ │
│ │ https://6figr.com/us/salary/g │ Goldman Sachs Vp salary levels │ │
│ │ oldman-sachs--vp │ ranges from Vice President │ │
│ │ │ (Accountant) upto │ │
└─────┴───────────────────────────────┴────────────────────────────────┴───────┘
Gaps
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Category ┃ Topic ┃ Detail ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ source_not_found │ Official internal Goldman │ Goldman Sachs does not │
│ │ Sachs VP compensation │ publicly publish its │
│ │ bands │ internal compensation │
│ │ │ bands or grade │
│ │ │ structures. No │
│ │ │ authoritative internal │
│ │ │ HR documentation was │
│ │ │ found. All data is from │
│ │ │ third-party crowdsourced │
│ │ │ salary platforms. │
├───────────────────────┼───────────────────────────┼──────────────────────────┤
│ source_not_found │ VP sub-band breakdown │ Community sources │
│ │ (VP1/VP2/VP3 equivalents) │ explicitly state Goldman │
│ │ │ uses a single 'VP' title │
│ │ │ with no formal │
│ │ │ sub-levels, unlike some │
│ │ │ peers. No granular │
│ │ │ sub-band salary data │
│ │ │ exists in any source │
│ │ │ reviewed. │
├───────────────────────┼───────────────────────────┼──────────────────────────┤
│ scope_exceeded │ Non-US VP compensation │ Some sources (e.g., │
│ │ bands │ AmbitionBox) reference │
│ │ │ India-based VP salaries │
│ │ │ (₹49.4L–₹54.6L), but │
│ │ │ comprehensive │
│ │ │ international band data │
│ │ │ was not gathered. The │
│ │ │ question context appears │
│ │ │ US-focused. │
├───────────────────────┼───────────────────────────┼──────────────────────────┤
│ contradictory_sources │ Levels.fyi median │ Levels.fyi reports a │
│ │ discrepancy │ median of $144K while │
│ │ │ Glassdoor and 6figr │
│ │ │ report $213K–$411K │
│ │ │ ranges. Levels.fyi │
│ │ │ likely captures │
│ │ │ engineering/tech VPs who │
│ │ │ have different │
│ │ │ compensation structures │
│ │ │ and lower base pay than │
│ │ │ finance VPs. │
└───────────────────────┴───────────────────────────┴──────────────────────────┘
Discovery Events
┏━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ ┃ Suggested ┃ ┃ ┃
┃ Type ┃ Researcher ┃ Query ┃ Reason ┃
┡━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
│ contradiction │ database │ Goldman Sachs VP │ Large discrepancy │
│ │ │ total │ between │
│ │ │ compensation by │ Levels.fyi ($144K │
│ │ │ division 2025 │ median) and │
│ │ │ 2026 │ Glassdoor │
│ │ │ │ ($213K–$391K │
│ │ │ │ range) suggests │
│ │ │ │ the VP population │
│ │ │ │ is heterogeneous │
│ │ │ │ across tech and │
│ │ │ │ finance │
│ │ │ │ functions; │
│ │ │ │ further │
│ │ │ │ segmentation by │
│ │ │ │ division would │
│ │ │ │ resolve this. │
├──────────────────┼───────────────────┼───────────────────┼───────────────────┤
│ related_research │ null │ Goldman Sachs │ Understanding how │
│ │ │ internal grade │ Goldman's VP band │
│ │ │ structure VP │ maps to peer │
│ │ │ Director MD 2026 │ banks' grade │
│ │ │ │ systems would │
│ │ │ │ clarify the wide │
│ │ │ │ compensation │
│ │ │ │ range observed. │
├──────────────────┼───────────────────┼───────────────────┼───────────────────┤
│ related_research │ null │ Goldman Sachs │ Mergers & │
│ │ │ 2025 bonus pool │ Inquisitions │
│ │ │ VP payout by │ notes senior │
│ │ │ division │ bankers (VPs+) │
│ │ │ │ received │
│ │ │ │ disproportionate │
│ │ │ │ 2025 bonus │
│ │ │ │ increases; │
│ │ │ │ division-level │
│ │ │ │ data would │
│ │ │ │ sharpen the band │
│ │ │ │ picture. │
└──────────────────┴───────────────────┴───────────────────┴───────────────────┘
Open Questions
┏━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Priority ┃ Question ┃ Context ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ high │ Does Goldman Sachs use any │ Fishbowl community posts │
│ │ informal internal seniority │ confirm the VP band is wide and │
│ │ designations within the VP │ pay varies significantly, but │
│ │ title (e.g., junior VP vs. │ it is unclear whether informal │
│ │ senior VP) that affect │ tracking of seniority within │
│ │ compensation but are not │ the band drives structured pay │
│ │ publicly disclosed? │ steps. │
├──────────┼─────────────────────────────────┼─────────────────────────────────┤
│ high │ How did 2025 year-end bonuses │ Mergers & Inquisitions notes │
│ │ for Goldman Sachs VPs compare │ that VPs and Directors saw │
│ │ to the prior year, and were │ 10–15% total comp increases in │
│ │ front-office VPs │ 2025, but Goldman-specific │
│ │ disproportionate beneficiaries? │ figures were not isolated. │
├──────────┼─────────────────────────────────┼─────────────────────────────────┤
│ medium │ Why does Levels.fyi report a │ The discrepancy likely reflects │
│ │ $144K median for Goldman Sachs │ different user populations │
│ │ VPs when Glassdoor and 6figr │ (tech-focused on Levels.fyi vs. │
│ │ report ranges starting at │ finance-focused on │
│ │ $213K–$265K? │ Glassdoor/6figr), but this has │
│ │ │ not been confirmed. │
├──────────┼─────────────────────────────────┼─────────────────────────────────┤
│ medium │ What is the typical │ Fishbowl notes the VP band is │
│ │ time-in-grade for a Goldman │ wide and the step to MD is │
│ │ Sachs VP before promotion to │ difficult; Mergers & │
│ │ Managing Director, and does │ Inquisitions gives a 3–4 year │
│ │ longer tenure correlate with │ promotion window for VPs across │
│ │ meaningfully higher within-band │ large banks. │
│ │ pay? │ │
└──────────┴─────────────────────────────────┴─────────────────────────────────┘
╭───────────────────────────────── Confidence ─────────────────────────────────╮
│ Overall: 0.62 │
│ Corroborating sources: 8 │
│ Source authority: medium │
│ Contradiction detected: True │
│ Query specificity match: 0.55 │
│ Budget status: spent │
│ Recency: current │
╰──────────────────────────────────────────────────────────────────────────────╯
╭──────────────────────────────────── Cost ────────────────────────────────────╮
│ Tokens: 51829 │
│ Iterations: 4 │
│ Wall time: 98.38s │
│ Model: claude-sonnet-4-6 │
╰──────────────────────────────────────────────────────────────────────────────╯
trace_id: 716e548a-ceaf-4d18-8b47-ac35e3460b52
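
The structured lines in this log can be separated from the rendered tables and used to check how far the run went past its token budget. The sketch below is one way to do that, assuming only the `event`, `token_budget`, and `tokens_used` field names visible in the log above. The helper names are hypothetical and not part of marchwarden.

```python
import json


def parse_events(lines):
    """Keep only lines that decode to JSON objects; rendered tables are skipped."""
    events = []
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(record, dict):
            events.append(record)
    return events


def budget_overshoot(events):
    """Return tokens used at synthesis start minus the configured budget, or None."""
    budget = used = None
    for record in events:
        if record.get("event") == "research_started":
            budget = record.get("token_budget")
        elif record.get("event") == "synthesis_start":
            used = record.get("tokens_used")
    if budget is None or used is None:
        return None
    return used - budget
```

For this run, the `research_started` event carries a budget of 20000 and `synthesis_start` reports 26452 tokens used, so the function would return 6452.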