The $25B Memory Premium Inside Every Inference Token

Microsoft's depreciation bill hit $10.2 billion in the quarter ending March 31, a direct consequence of compounding hardware purchases against a six-year asset schedule.

The company spent $30.9 billion on capital equipment in its fiscal third quarter, roughly two-thirds of it on short-lived GPU and CPU assets. Over the nine months ending March 31, $80.1 billion went into infrastructure supporting an AI franchise running at $37 billion in annual recurring revenue.

The Memory Surcharge

Samsung and SK Hynix raised HBM3e supply prices roughly 20% for 2026, with Nvidia H200 production the primary driver. Each H200 requires six HBM3e stacks, and China export-control changes in late 2025 pulled forward demand that suppliers had not planned for.

Broader DRAM prices surged 90% in Q1 2026 versus Q4 2025. Microsoft attributed $25 billion of its $190 billion 2026 capex plan to memory and storage price inflation, not expanded capacity, per its April earnings call.

In other words, Microsoft is paying more than its 2026 budget modeled for the same planned capacity. That surcharge enters a depreciation schedule Microsoft has held at six years since July 2022, the start of FY2023, when CFO Amy Hood extended server useful life from four years. Every token served on Azure over that window carries a fraction of the 2026 memory premium in its cost basis.
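To put a rough number on that fraction, here is a minimal sketch of what the premium looks like as a recurring charge. It assumes the full $25 billion is capitalized, placed in service at once, and depreciated straight-line over six years with no salvage value. Those are simplifying assumptions, not Microsoft's disclosed treatment, and storage assets may sit on a different schedule.

```python
# Figures from the article. The straight-line, single in-service-date
# treatment is an illustrative assumption, not Microsoft's disclosed method.
MEMORY_PREMIUM_USD = 25e9
USEFUL_LIFE_YEARS = 6


def straight_line_charge(cost_usd: float, life_years: int) -> tuple[float, float]:
    """Return (annual, quarterly) depreciation for an asset with no salvage value."""
    annual = cost_usd / life_years
    return annual, annual / 4


annual, quarterly = straight_line_charge(MEMORY_PREMIUM_USD, USEFUL_LIFE_YEARS)
print(f"Annual charge:    ${annual / 1e9:.2f}B")
print(f"Quarterly charge: ${quarterly / 1e9:.2f}B")
```

Under those assumptions the premium works out to roughly $4.2 billion a year, or about $1 billion a quarter, for hardware that adds no capacity beyond what was already planned.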

OpenAI's Ledger

GPT-4o lists at $2.50 per million input tokens. Three years of competition drove the price down roughly 60%.

Documents reviewed by journalist Ed Zitron show OpenAI spent $3.648 billion on Azure inference in the quarter ending September 2025. In the same quarter, Microsoft received $411.2 million from OpenAI as its revenue share. The revenue-share agreement covers ChatGPT and OpenAI's API platform at 20%, so dividing $411.2 million by 0.20 implies at least $2.056 billion in covered revenue for the quarter. Inference spending exceeded that entire base by roughly $1.6 billion.
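The arithmetic behind those figures is short enough to check directly. The sketch below uses only the numbers reported above; the variable and function names are illustrative, and the result is a floor only if the 20% rate applies uniformly to all covered revenue.

```python
# Reported figures for the quarter ending September 2025.
REVENUE_SHARE_PAID_USD = 411.2e6     # paid to Microsoft
REVENUE_SHARE_RATE = 0.20            # rate on ChatGPT and API revenue
AZURE_INFERENCE_SPEND_USD = 3.648e9  # OpenAI's Azure inference bill


def implied_covered_revenue(share_paid_usd: float, rate: float) -> float:
    """Back out the revenue base from a revenue-share payment and its rate."""
    return share_paid_usd / rate


covered = implied_covered_revenue(REVENUE_SHARE_PAID_USD, REVENUE_SHARE_RATE)
gap = AZURE_INFERENCE_SPEND_USD - covered
cost_per_revenue_dollar = AZURE_INFERENCE_SPEND_USD / covered

print(f"Implied covered revenue: ${covered / 1e9:.3f}B")
print(f"Inference spend above revenue: ${gap / 1e9:.3f}B")
print(f"Inference cost per revenue dollar: ${cost_per_revenue_dollar:.2f}")
```

The gap comes to about $1.59 billion, and inference cost lands near $1.77 for every dollar of covered revenue.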

OpenAI's internal projections, from the same document set, show inference costs reaching $14.1 billion in 2026, against roughly $25 billion in projected revenue and a $14 billion net loss.

OpenAI's third-quarter 2025 numbers reveal who is funding the AI inference subsidy: $1.77 in Azure inference costs for every dollar of API and ChatGPT revenue, before salaries, training, or anything else on the ledger. A $2.50-per-million list price becomes hard to sustain when the memory costs that underpin it rise 20%.

OpenAI's 2026 inference projection of $14.1 billion was built before DRAM prices rose 90% in Q1. If inference costs track memory pricing even partially, that figure is conservative, and the projected $14 billion net loss may prove optimistic by the time 2026 closes.
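How much "partially" matters can be sketched with a simple pass-through model. Everything beyond the $14.1 billion base and the 20% and 90% price moves is hypothetical: the documents do not say what share of inference cost is memory, so the shares below are placeholders chosen only to show the shape of the sensitivity.

```python
# Base figure from the article's reporting on OpenAI's projections.
BASE_INFERENCE_COST_USD = 14.1e9

# Reported price moves: HBM3e supply pricing (+20%) and broader DRAM (+90%).
PRICE_INCREASES = {"HBM3e +20%": 0.20, "DRAM +90%": 0.90}

# HYPOTHETICAL: fraction of inference cost that scales with memory pricing.
# These values are placeholders, not reported data.
MEMORY_SHARES = (0.10, 0.25, 0.50)


def adjusted_inference_cost(base_usd: float, memory_share: float,
                            price_increase: float) -> float:
    """Scale only the memory-linked slice of cost by the price increase."""
    return base_usd * (1 + memory_share * price_increase)


for label, increase in PRICE_INCREASES.items():
    for share in MEMORY_SHARES:
        cost = adjusted_inference_cost(BASE_INFERENCE_COST_USD, share, increase)
        extra = cost - BASE_INFERENCE_COST_USD
        print(f"{label}, memory share {share:.0%}: "
              f"${cost / 1e9:.2f}B (+${extra / 1e9:.2f}B)")
```

If revenue and every other cost line held at the projected levels, each extra dollar of inference cost would add a dollar to the net loss. That is a strong assumption, since pricing, volume, and contract terms could all shift in response.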