February 5, 2026
Analyst Report
Maia 200 Aims to Lower AI Inference Costs
- The new generation of Microsoft custom AI silicon offers better price/performance for large-scale inference.
- It’s technically competitive with similar offerings from Google and AWS.
- It’s most useful for customers running their own custom models.
Maia 200, Microsoft’s newest generation of custom AI silicon, is being deployed in a handful of Azure regions in the United States, delivering many of the performance benefits of GPUs at a lower cost. Like Google’s Tensor Processing Units (TPUs) and AWS’s Trainium and Inferentia, Maia 200 is a custom-designed chip optimized for the kinds of mathematical operations widely used in AI applications. Although not as fast as the highest-end NVIDIA GPUs, custom AI chips typically offer much better price/performance.
With Maia 200, Microsoft is focusing specifically on inference (the process of using AI models) rather than training (the process of building AI models). That focus should lower costs for organizations that build and run their own models.