GPU Capacity Is Tightening. Power Is the New Bottleneck.

The constraint on AI infrastructure is no longer compute. It is power, and the gap between who can solve it and who cannot is widening fast.

What Happened

Three signals from the same 24-hour news cycle tell a coherent and uncomfortable story for anyone sourcing GPU capacity today.

First, Moody's revised its hyperscaler capex forecast upward by $85 billion, putting 2026 spend at $785 billion and tracking toward $1 trillion by 2027. AWS, Azure, GCP, and Oracle Cloud (OCI) are collectively claiming every megawatt of buildable data center capacity they can reach. The consequence for everyone else: wait lists for H200 and B200 reserved instances, which already stretch for quarters, are not getting shorter.

Second, xAI made the problem visible in a single infrastructure decision. Rather than wait in a utility interconnection queue, the company deployed 19 natural gas turbines at its Colossus 2 campus in Southaven, Mississippi, using behind-the-meter generation (on-site power that bypasses the public grid entirely) to control its own timeline. This is not a quirk of xAI's culture. It is a rational response to a grid interconnection system that is failing AI-scale infrastructure.

Third, that failure is now quantified. Data from PJM, the grid operator covering the mid-Atlantic and Midwest US, shows that post-approval delays for AI data center projects now exceed the time spent in the interconnection queue itself. Projects are clearing regulatory approval and then stalling for years before a single watt reaches a GPU. In Texas, ERCOT has flagged that AI-driven load forecasts may be materially overstated, which introduces new uncertainty into permitting and power commitments for clients banking on Texas capacity. Meanwhile, Google locked up 500 MW of Texas solar in a 15-year power purchase agreement (PPA, a long-term electricity contract) with Linea Energy, tightening the renewable offtake supply available to any competing operator in that market.

On the supply side, one counter-signal deserves attention. GPU rental marketplaces are showing early signs of price compression, as expanding supply from neoclouds (specialized GPU cloud providers, an alternative to hyperscalers) makes spot pricing more transparent. This creates real procurement opportunities, but it also adds volatility that complicates any capacity plan built on spot exposure alone.

Why It Matters

The power bottleneck is structurally different from a GPU supply shortage. When Nvidia ships more H100s, H200s, or GB200s, the hardware constraint eases. When grid interconnection queues stretch years and post-approval delays compound on top of that, no amount of chip production closes the gap. Physical infrastructure, not silicon, is now the long-duration constraint.

This asymmetry hits different client types differently. Frontier labs with the balance sheets to do what xAI did (deploy behind-the-meter generation and build on their own power timeline) can insulate themselves. Fortune 500 enterprises rolling out AI infrastructure for the first time cannot. They are entering a market where the best-located, best-powered capacity is increasingly pre-committed to hyperscalers under decade-long agreements, or tied up by well-capitalized neocloud operators who moved early.

The neocloud sector is itself bifurcating. Operators with owned or contracted power, such as those building at scale in power-advantaged regions like the Nordic countries (as evidenced by Nscale's $790M financing for a Norway AI compute campus), are in a fundamentally different position than operators running on spot power at market rates. The former can offer durable reserved capacity. The latter are exposed to the same constraints squeezing everyone else.

For sovereign AI programs in both the US and EU, the power question is also a sovereignty question. Compute that depends on grid interconnection timelines measured in years is not a reliable foundation for national AI strategy.

What Clients Should Do

If you are a frontier lab or a well-funded AI application company planning a training cluster at the 5,000 to 20,000 GPU scale, the window for securing favorable reserved capacity terms on H200 or GB200 nodes is narrowing. Neocloud operators we work with are still offering 30 to 50 percent savings versus hyperscaler reserved instance pricing, with ramp times (the deployment timeline for capacity coming online) measured in weeks rather than quarters. But the operators with owned power, and therefore credible delivery guarantees, are seeing those terms tighten as demand accumulates.
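To make the 30 to 50 percent range concrete, the arithmetic can be sketched as below. This is a minimal illustration, not a quote: the baseline rate of $4.00 per GPU-hour is a placeholder assumption chosen only to make the numbers easy to read, and the function name and parameters are hypothetical. Only the savings range and the 5,000 GPU cluster size come from the text above.

```python
def reserved_cost_range(gpus, hours, baseline_rate, savings_pct=(30, 50)):
    """Return (baseline, cost_at_min_savings, cost_at_max_savings) in dollars.

    baseline_rate is the comparison price per GPU-hour; savings_pct is the
    (low, high) percentage discount range relative to that baseline.
    """
    baseline = gpus * hours * baseline_rate
    low_pct, high_pct = savings_pct
    cost_at_min_savings = baseline * (100 - low_pct) / 100
    cost_at_max_savings = baseline * (100 - high_pct) / 100
    return baseline, cost_at_min_savings, cost_at_max_savings


# ASSUMPTION: $4.00 per GPU-hour is a placeholder, not a market price.
PLACEHOLDER_RATE = 4.00
HOURS_IN_30_DAYS = 30 * 24

baseline, at_30, at_50 = reserved_cost_range(5_000, HOURS_IN_30_DAYS, PLACEHOLDER_RATE)
print(f"Baseline:          ${baseline:,.0f}")
print(f"With 30% savings:  ${at_30:,.0f}")
print(f"With 50% savings:  ${at_50:,.0f}")
```

Substituting a real quoted rate for the placeholder gives the monthly spread a given cluster is actually negotiating over.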

If you are a Fortune 500 enterprise in financial services, pharma, or manufacturing beginning your first serious AI infrastructure buildout, do not anchor your planning to hyperscaler availability. AWS, Azure, and GCP are the right answer for some workloads, but for dedicated GPU capacity at volume, neocloud operators consistently outperform on price, speed, and contract flexibility. The right architecture is a portfolio: hyperscaler for elastic burst, neocloud for reserved training and inference capacity, and Tier III colocation (Tier III is an Uptime Institute data center classification built around concurrent maintainability and commonly associated with 99.982% expected availability) at operators like Equinix, Digital Realty, CyrusOne, or QTS for teams that want to bring their own hardware.
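For a sense of what the 99.982% figure means in practice, the availability percentage can be converted into hours of downtime per year. The sketch below is plain arithmetic; the function name is hypothetical, and it assumes a 365-day year.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def annual_downtime_hours(availability_pct):
    """Convert an availability percentage into expected downtime hours per year."""
    return (100.0 - availability_pct) / 100.0 * HOURS_PER_YEAR


tier_iii_downtime = annual_downtime_hours(99.982)
print(f"99.982% availability allows about {tier_iii_downtime:.2f} hours of downtime per year")
```

At 99.982%, the allowance works out to roughly 1.6 hours per year, which is the number to compare against the cost of an interrupted training run.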

If you are a system integrator or consultancy sourcing for an enterprise client, the GPU rental price compression story is real but nuanced. Spot market pricing is becoming more transparent, which is useful for benchmarking. It is not a substitute for reserved capacity when a client has committed delivery timelines. Lock in reserved terms now and use spot data as leverage in negotiations.

For any client eyeing Texas or PJM-region colocation for new capacity, factor in the power delay data explicitly. Northern Virginia (NoVa), Dallas, Phoenix, Chicago, and Atlanta all have different power queue dynamics. Site selection without a current power availability analysis is planning on outdated assumptions.

How XIRR Advisors Can Help

XIRR Advisors sources two things: reserved GPU capacity from neocloud operators across the US and EU, and Tier III colocation space in major US markets. We do not broker hyperscalers. AWS, Azure, GCP, and OCI sell direct, and you do not need a broker for that. Where we add value is in the neocloud and colocation markets, where operator relationships, current availability data, and contract expertise move the needle on both price and speed.

Share your requirements (GPU type such as H100, H200, B200, GB200, or GB300, plus cluster size, region, and timing, or megawatts for colocation) and we will canvass the market and come back with a shortlist in 48 hours. Earlier conversations consistently yield better terms. Capacity that is available today may not be available in 60 days. Our fee is paid by the provider. Clients pay nothing. Reach us at contact@xirradvisors.com or DM @XIRRAdvisors.

References

[1] Data Center Dynamics: Moody's hyperscaler capex forecasts marked up by $85bn to close in on $1trn by 2027

[2] Data Center Dynamics: xAI deploys 19 natural gas turbines at Colossus 2 data center in Southaven, Mississippi

[3] Data Center Knowledge: Why AI data center projects face years of delays after approval

[4] Data Center Knowledge: ERCOT warns Texas AI power boom may not materialize

[5] Data Center Dynamics: Google inks 15-year 500MW solar PPA with Linea Energy in Texas

[6] Data Center Knowledge: GPU rental markets show signs of pricing compression

[7] Data Center Knowledge: Nordic banks back massive AI power buildout in Norway
