Fund the Boring AI
Gartner’s own data reveals what works. It’s not what vendors are selling.
Twenty-eight percent.
That is the share of AI infrastructure projects that deliver full ROI, according to Gartner’s survey of 782 IT leaders. Not a typo. Not a pessimist’s reading. That is the number. One in five fails outright. More than half of the managers surveyed have already lived through at least one AI failure: quietly, without a press release, without a post-mortem.
The demos were flawless. The pilots were promising. And then the thing met your actual operations.
The board noticed. A Harris Poll found that 98% of tech leaders report rising pressure to show returns, and 71% of CIOs say their AI budgets face cuts or freezes if they can’t demonstrate results by the end of H1.
An NBER study of nearly 6,000 executives across the US, UK, Germany, and Australia found that 69% of their businesses are actively using AI. And yet 89% saw no change in productivity. More than 90% detected no impact on employment.
The spending wave crested. The patience window is closing.
But here is what the reckoning obscures, and it matters, because panic without a map is just noise.
Inside Gartner’s survey of 782 infrastructure and operations leaders is a pattern that reframes the failure numbers. AI doesn’t fail uniformly. It fails by type. The highest failure rates cluster around the most ambitious applications: auto-remediation, self-healing infrastructure, agent-led workflow management. These are the headline-generating moonshots that vendors pitch hardest in every demo you’ve ever sat through. Gartner’s “Too Big to Fail” report predicts that more than 70% of mainframe migration projects initiated in 2026 will fail, because the gap between what vendors promise for code transformation and what AI can actually deliver is widest precisely where the stakes are highest.
Meanwhile, 53% of infrastructure and operations leaders report genuine success in narrow, mature, unglamorous applications. IT service management. Cloud operations. Document classification. The work nobody puts in a case study.
This is not coincidence. It is the mechanism. AI performs best when the problem is tractable — well-defined inputs, verifiable outputs, no judgment calls living in people’s heads. The same frontier model class that fails at agent-led workflow orchestration is the one driving the 53% success rate in ticket triage and document classification. Same tools. Different problem shape. That is what the pattern points to: the selection pressure has flipped. Projects no longer fail because the technology is immature. They fail because the problem was never tractable to begin with.
The portable question is simple: does this problem have a ground truth you can check in under a minute? If yes, AI ships value. If no, you’re funding a demo.
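As a rough sketch, the ground-truth test can be written down as a portfolio filter. The project names come from the categories mentioned in this piece; the Project fields, the verification times, and the 60-second threshold are illustrative assumptions, not figures from Gartner’s data.

```python
from dataclasses import dataclass

# Assumed threshold, taken from the "under a minute" rule of thumb.
MAX_CHECK_SECONDS = 60


@dataclass
class Project:
    name: str
    has_ground_truth: bool    # is there a correct answer to compare against?
    seconds_to_verify: float  # assumed time for a person to check one output


def passes_ground_truth_test(project: Project) -> bool:
    """True only if an output can be checked against ground truth in under a minute."""
    return project.has_ground_truth and project.seconds_to_verify < MAX_CHECK_SECONDS


def triage(projects):
    """Split project names into those to fund and those to kill or pause."""
    fund = [p.name for p in projects if passes_ground_truth_test(p)]
    kill_or_pause = [p.name for p in projects if not passes_ground_truth_test(p)]
    return fund, kill_or_pause


# Hypothetical portfolio; the verification times are made up for illustration.
portfolio = [
    Project("ticket triage", True, 15),
    Project("document classification", True, 30),
    Project("agent-led workflow management", False, 1800),
]

fund, kill_or_pause = triage(portfolio)
print("fund:", fund)
print("kill or pause:", kill_or_pause)
```

The filter is deliberately crude. Its only job is to force an honest answer to two questions for every project: is there a ground truth, and how long does checking it take?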
If you sat in a vendor demo last year and thought that will never survive contact with our actual operations — the data has caught up with your instinct.
Klarna learned this publicly. The fintech made headlines in 2024 for replacing customer-service workers with AI. By early 2025 it was quietly rehiring humans, absorbing the particular awkwardness of bringing back people it had already let go. AWS CEO Matt Garman named the consequence plainly: “Gutting junior employees in favor of AI is a short-term play that could easily come back to bite employers.” Stop hiring juniors, he warned, and you destroy the pipeline to expertise. The rework costs don’t appear on the ROI dashboard. They appear in escalations, complaints, and the expertise that walked out the door.
A METR randomised controlled trial of 16 experienced open-source developers found they took 19% longer to complete tasks when using AI tools. These were not beginners but engineers working on codebases they knew. The same developers believed they’d been 20% faster. That perception gap is where budgets go to die. Call it the review tax: the cost of verifying an AI’s output scales with the problem’s ambiguity. On a tractable problem it takes seconds; on an ambitious one it swallows the task.
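To put the perception gap in minutes, here is a small worked example. The 19% and 20% figures are METR’s; the 100-minute baseline is a hypothetical round number, and reading “20% faster” as “20% less time” is an assumption of this sketch.

```python
BASELINE_MINUTES = 100.0   # hypothetical task time without AI tools
ACTUAL_SLOWDOWN = 0.19     # METR: tasks took 19% longer with AI tools
PERCEIVED_SPEEDUP = 0.20   # METR: developers believed they were 20% faster

# What the task actually took versus what it felt like it took.
actual_minutes = BASELINE_MINUTES * (1 + ACTUAL_SLOWDOWN)
perceived_minutes = BASELINE_MINUTES * (1 - PERCEIVED_SPEEDUP)
gap_minutes = actual_minutes - perceived_minutes

print(f"actual:    {actual_minutes:.0f} min")
print(f"perceived: {perceived_minutes:.0f} min")
print(f"gap:       {gap_minutes:.0f} min")
```

On those assumptions, a task that felt like 80 minutes really took about 119. The 39-minute difference is time nobody reports, because nobody noticed spending it.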
The boards are already responding. Forrester found that large organisations are deferring 25% of planned 2026 AI spending into 2027. Fewer than one in three decision-makers could connect AI investments to financial growth. Sharyn Leaver, Forrester’s chief research officer, named the inflection: “In 2026, the AI hype period ends.” The pressure to deliver real, measurable results is intensifying.
What survives is not more ambition. It is better aim.
This week’s question: which of your active AI projects can answer yes to the ground truth test? Kill or pause every one that can’t. Fund the ones that can, the projects you’d never put in a press release. That’s where the 53% lives.
Sources
Gartner I&O survey (782 IT leaders, Nov–Dec 2025) + Harris Poll/Dataiku CIO pressure data — The Register, Apr 7, 2026: https://www.theregister.com/2026/04/07/ai_returns_gartner/
NBER working paper w34836 (6,000 executives, US/UK/Germany/Australia, Feb 2026): https://www.nber.org/papers/w34836 — reported by The Register, Feb 18, 2026: https://www.theregister.com/2026/02/18/ai_productivity_survey/
Gartner “Too Big to Fail: Why Mainframe Exit Projects Are Likely to Fail in the Age of Generative AI” (Apr 2026) — The Register, Apr 15, 2026: https://www.theregister.com/2026/04/15/gartner_mainframe_exit_analysis/
The Klarna Effect + Matt Garman quote — Gary Marcus Substack, Aug 23, 2025:
METR, “Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity” (16 developers, 246 issues, Jul 10, 2025): https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/ — arXiv preprint: https://arxiv.org/abs/2507.09089
Forrester AI spending deferral + Sharyn Leaver quote — The Register, Oct 28, 2025: https://www.theregister.com/2025/10/28/forrester_ai_spending/



