Updated: December 28, 2024


Azure OpenAI Service Deployment Types

by Rob Sanfilippo

Before joining Directions on Microsoft, Rob worked at Microsoft for 14 years, where he designed technologies for Microsoft products and …

| Deployment Type | Description | Notes |
| --- | --- | --- |
| Global | Inference requests can be routed to any datacenter in the world where the LLM is supported, depending on available capacity. | Least expensive; however, it may not meet customer compliance and latency requirements. |
| Data Zone | Inference requests can be routed to any datacenter within a Microsoft-defined Data Zone. | Costs 10% more than Global; currently, there are only U.S. and Europe Data Zones. |
| Regional | Inference requests are routed to the Azure region where the customer has deployed the associated OpenAI resource. | Twice the cost of Global for hourly pricing under the Provisioned purchase model. |

The Azure OpenAI Service offers three deployment types. Customers can choose the most cost-effective type that meets their compliance and connectivity-latency requirements. The types differ in which Azure datacenters respond to the service's inference requests.

The Global type routes requests to any datacenter in the world where the required large language model (LLM) is supported, based on available capacity. Global is the least-expensive deployment type, but it provides no customer control over where request processing occurs.
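As a sketch of how the choice surfaces in practice, the deployment type is selected through the SKU name when a model deployment is created, for example with the Azure CLI. The resource group, account, and deployment names below are placeholders, and the exact model versions available will vary by subscription:

```shell
# Hypothetical resource names; assumes an existing Azure OpenAI resource.
# The deployment type maps to --sku-name:
#   GlobalStandard   -> Global
#   DataZoneStandard -> Data Zone
#   Standard         -> Regional
az cognitiveservices account deployment create \
  --resource-group my-rg \
  --name my-openai-account \
  --deployment-name gpt-4o-global \
  --model-name gpt-4o \
  --model-version "2024-08-06" \
  --model-format OpenAI \
  --sku-name GlobalStandard \
  --sku-capacity 1
```

Switching the same deployment to Data Zone or Regional routing is then a matter of choosing a different SKU (and accepting the corresponding price tier).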
