PM of the Future
The execution governor is gone. Capacity is infinite. The only bottleneck left is taste — and judgment.
The thesis
1:3 PM to engineer ratio. PMs fluent in APIs, security, and scalability. Focus is strategy, prototyping, and testing — not coordination. Not documentation. Not waiting.
A team running 13 times more experiments per quarter compounds its learning gap until it becomes insurmountable. That's not a prediction. It's already happening.
Most PMs were never actually bottlenecked by execution. They were bottlenecked by taste and judgment. Team capacity functioned as a governor that prevented bad ideas from shipping. Remove that governor and you discover who was driving and who was just steering.— Gemini Head of Product
Open role
We hire operators. They build product architecture and define API SLOs before writing a line of copy. They ship production code in Cursor or Claude Code on Monday and have an experiment running by Wednesday. They write their own eval suites in Braintrust. They read a LangSmith trace without asking for help. They reverse-engineer product features from business impact and customer expectations — fast. Above all, they have taste — the judgment to know what's worth shipping when capacity is infinite, and the discipline to kill what isn't.
"They own one number: number of bookings per day. The dashboard is the review. The throughput is the strategy."
A week in the life
Prototypes the rebooking agent v2 in Claude Code. Builds a working demo of the new disruption threshold logic — no PRD, no Figma review. Evidence first.
Writes 20 evals in Braintrust against last week's failure logs. Tags the failure modes — wrong carrier preference, missed hotel loyalty match. Defines what "passing" looks like before shipping.
Ships the experiment to 10% of traffic. All evals pass. Disruption watch active on the new cohort. No ceremony — a PR, a deploy, a Loom for context.
Reviews the eval deltas — one branch regressed on multi-currency bookings. Kills that branch. Opens LangSmith traces to find where the profile agent dropped the currency preference.
Talks to 3 users who hit the failure mode. Records with a phone. No UX researcher in the loop — they listen, they take notes, they write the next round of evals on Monday.
What they refuse to do
Each one was a governor on bad ideas. Remove the governor — find out if you have taste.
Prototype in Claude Code instead. Evidence precedes documentation. If you can't build a rough version in an afternoon, the idea isn't clear enough to document.
Throughput metrics are live. The dashboard is the review. If someone needs a quarterly meeting to understand how the product is performing, the product isn't instrumented correctly.
Real-time data kills the ritual. Bookings per day is on the screen right now. The ritual existed to compensate for latency in reporting — remove the latency, remove the ritual.
Strategy is 90-day bets, battle-tested and re-set. A roadmap that spans 12 months is a fiction dressed as a plan. Ship, measure, reset. Three times a year beats one plan across twelve months.
The stack
The tools in the stack are the tools of a builder — not a coordinator.
Prototyping features directly in the codebase. The PM builds the first version — not a spec for someone else to build.
Spinning up UI and marketing demos in hours. Ships a working front-end to test with users before writing a line of backend code.
Writing evals, catching hallucinations before they reach production. Evals are a product metric — built during development, not bolted on after launch.
Observability on agent traces. Knows where the AI burns budget and fails — reads traces without needing an ML engineer to interpret them.
Issues — not sprints. No velocity tracking. No burndown. Work exists or it doesn't. The only cycle that matters is prototype → ship → eval.
User research artifacts. Talks to three users on Friday, records it, reviews it Monday morning before writing the next round of evals. No UX research team in the loop.
Not in this stack: Jira · Confluence · Roadmunk · PowerPoint · quarterly planning decks
How they work with agents
The pricing agent shipped a 3% markup experiment overnight. Here's what happened next.
"This feature is right 90% of the time. Here's what happens the other 10%." — The PM communicates uncertainty clearly. It builds trust. It's also how you write the next round of evals.