Want to work on research like this? We are looking for experienced researchers to lead new projects on one of multiple expanding teams: jobs.lever.co/epoch-ai/de...
Posts by Epoch AI
In all, that's over 9 GW planned for Stargate in the US. It's unclear if that'll be achieved by 2029, as financing, procurement, and politics could all stand in the way. But the construction activity shows the project isn't merely an ambition.
Learn more: epoch.ai/blog/openai...
The Ohio site is the odd one out. While a small data center is planned there, the site is primarily a Foxconn plant for manufacturing data center equipment, potentially Nvidia servers. A small amount of land was recently cleared around the existing buildings.
Three slightly smaller sites are underway: one in Milam County, Texas, by SoftBank's SB Energy, one in Wisconsin, also built by Vantage, and one in Michigan, built by Related Digital. Each one will have an estimated 1.2-1.4 GW of total facility power.
Further west in New Mexico, STACK Infrastructure is bringing Stargate another 2.2 GW with the Project Jupiter campus. The site will have four massive buildings powered by two natural-gas "microgrids" designed to minimize impact on the local power grid.
However, Stargate is still targeting 2 GW in other locations. Construction of a 2 GW campus is in full swing across 1,200 acres (4.9 sq km) of Shackelford County, just across the county line from Abilene. This campus is being built by Vantage and will feature 10 buildings.
The most advanced Stargate site is in Abilene, Texas. With an estimated 0.6 GW already operational today and 1.2 GW expected by Q3 2026, it is ahead of all other sites. OpenAI recently withdrew from plans to expand the site to 2.1 GW. Microsoft has since claimed the extra 900 MW.
In 2025, OpenAI announced Stargate, a $500 billion data center initiative. We surveyed all 7 US sites and found visible development at each.
There's a long road ahead, but the project appears on track to reach 9+ GW by 2029—comparable to New York City's peak power demand. 🧵
You can explore the results yourself in our interactive visualization: switch between metrics, toggle data preparation and evaluation settings, and compare how all 8 candidate curves fit each dataset.
All four capability metrics weight math and programming tasks heavily. These are areas where correctness is easy to verify automatically, which makes them natural targets for RL.
Tasks with harder-to-verify outputs may not have sped up in the same way.
The one exception is an index we built aggregating WeirdML v2 tasks, where a single global linear trend fits best and we don't see signs of acceleration.
This holds up across several robustness checks: we vary how the data is prepared (e.g. treating each SOTA release as a point vs. interpolating daily values) as well as how fits are evaluated (e.g. comparing prediction accuracies for different forecasting horizons).
On ECI, Time Horizons, and our math capabilities index, the curve that makes the best predictions on a 6-month horizon is a pair of independent linear trends, with a break when reasoning models arrived.
We fit 8 candidate curves to 4 different AI capability metrics:
① ECI
② METR's 50% time horizon (on a log scale)
③ An index aggregating math benchmarks
④ An index based on the WeirdML v2 benchmark.
Then, we looked at which fitted curve best predicted future capabilities.
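The comparison step can be sketched on synthetic data. This is illustrative only: the break point, data, and function names are invented for the sketch, not Epoch's actual pipeline or candidate set.

```python
import numpy as np

# Illustrative sketch: score a single linear trend against two independent
# linear trends with a break, by out-of-sample prediction error.
rng = np.random.default_rng(0)

# Synthetic "capability" series whose slope increases at t = 3.0
# (standing in for the arrival of reasoning models).
t = np.linspace(0, 6, 60)
y = np.where(t < 3, 0.5 * t, 1.5 + 1.5 * (t - 3)) + rng.normal(0, 0.1, t.size)

def forecast_error(t, y, model, train_frac=0.7):
    """Fit on the early portion, measure RMSE on the held-out tail."""
    n = int(train_frac * t.size)
    pred = model(t[:n], y[:n], t[n:])
    return np.sqrt(np.mean((pred - y[n:]) ** 2))

def single_linear(t_tr, y_tr, t_te):
    # One global linear trend over all training data.
    a, b = np.polyfit(t_tr, y_tr, 1)
    return a * t_te + b

def broken_linear(t_tr, y_tr, t_te, t_break=3.0):
    # Fit the post-break segment independently and extrapolate with it
    # (a full version would fit the pre-break segment separately too).
    post = t_tr >= t_break
    a, b = np.polyfit(t_tr[post], y_tr[post], 1)
    return a * t_te + b

err_single = forecast_error(t, y, single_linear)
err_broken = forecast_error(t, y, broken_linear)
print(f"single linear RMSE: {err_single:.3f}")
print(f"broken linear RMSE: {err_broken:.3f}")
```

On data with a genuine slope change, the broken-trend model wins this comparison, which is the sense in which acceleration "fits best."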
Have AI capabilities accelerated?
On 3 out of the 4 AI capability metrics we investigated, we found strong evidence of acceleration, around when reasoning models emerged.
Claude remains far behind ChatGPT's ~30% share but is the only AI service in our survey to show a clear upward trend across this short time period.
Our survey doesn't shed light on *why* usage patterns shifted, but the timing of the jump coincides with a public dispute with the US government, as well as an increase in enterprise adoption.
According to our latest polls, Claude usage in the US rose by over 40% amid increased attention last month, but remains far behind ChatGPT.
Our point estimate would imply several million new weekly users in the United States.
Going forward, we will improve our processes and data pipelines so that they are more robust to human error.
You can find the refreshed data insight below.
epoch.ai/data-insigh...
This lowers our estimate of non-hyperscaler AI compute. Still, we know it’s large: Nvidia disclosed that 4-5 hyperscalers (sometimes excluding Meta) make up just half of its data center sales in recent quarters. See our methodology for more.
epoch.ai/data/ai-chi...
Erratum: yesterday we discovered that some of our chip owner estimates were stale. Oracle's Nvidia compute wasn't subtracted from "Other" as intended.
This inflated "Other" by ~1M H100e, 5% of the overall total. In our corrected figures, hyperscalers hold 71% of world AI compute.
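The stated figures imply a world total, via a quick back-of-the-envelope step (illustrative arithmetic using only the numbers in the post, not Epoch's underlying data):

```python
# If the ~1M H100e overcount was 5% of the overall total, the implied
# world total is roughly 20M H100e.
overcount_h100e = 1_000_000
overcount_share = 0.05
implied_total = overcount_h100e / overcount_share
print(f"implied world total: ~{implied_total / 1e6:.0f}M H100e")
```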
Interested in purchasing access? Reach out to math@epoch.ai
For more about the benchmark, visit our website.
epoch.ai/frontiermat...
OpenAI funded the creation of the original FrontierMath (Tiers 1-4), but Open Problems is developed independently and owned solely by Epoch. The pilot phase of the project was supported by a grant from Schmidt Sciences.
Any solution found with a verifier must be communicated to Epoch, and the problem author has joint publication rights with the party that found the solution.
Access to the verifiers is available for purchase by any party. We structure it this way to help fund the expansion of the benchmark. Our main cost is compensation to mathematicians, as the problems and verifiers are labor-intensive to formulate and implement.
FrontierMath: Open Problems is our benchmark of unsolved math research problems. Problems are designed so that, even though no solution is known today, potential solutions can be checked for accuracy by a bespoke computer program, called a verifier.
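The verifier idea can be shown with a toy example (not an actual FrontierMath problem or Epoch's verifier code): a checker can confirm a proposed answer even though nobody knows the answer in advance. Here the "open" instance is factoring a semiprime; the verifier only multiplies and checks.

```python
def verify_factorization(n: int, p: int, q: int) -> bool:
    """Accept (p, q) iff both are nontrivial factors with p * q == n."""
    return 1 < p < n and 1 < q < n and p * q == n

n = 1000003 * 999983                               # the "unsolved" instance
assert verify_factorization(n, 1000003, 999983)    # a correct solution passes
assert not verify_factorization(n, 1, n)           # a trivial "solution" fails
```

The point is asymmetry: finding the solution may be hard, but checking a candidate is cheap and mechanical.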
OpenAI has purchased access to the FrontierMath: Open Problems verifiers. This allows them to check the validity of solutions their models generate. Thread with details.