Updated: December 28, 2024
Azure OpenAI Service Deployment Types
| Deployment Type | Description | Notes |
| --- | --- | --- |
| Global | Inference requests can be routed to any datacenter in the world where the LLM is supported, depending on available capacity. | Least expensive; however, it may not meet customer compliance and latency requirements. |
| Data Zone | Inference requests can be routed to any datacenter within a Microsoft-defined Data Zone. | Costs 10% more than Global; currently, only U.S. and Europe Data Zones are available. |
| Regional | Inference requests are routed to the Azure region where the customer has deployed the associated OpenAI resource. | Twice the cost of Global for hourly pricing under the Provisioned purchase model. |
The Azure OpenAI service offers three deployment types. Customers can choose the most cost-effective type that meets their compliance and connectivity latency requirements. The types differ in which Azure datacenters respond to service inference requests.
The Global type routes requests to any datacenter in the world where the required large language model (LLM) is supported, based on available capacity. Global is the least-expensive deployment type, but it provides no customer control over where request processing occurs.
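As a sketch of how the choice surfaces in practice, the deployment type is selected when a model deployment is created, via the SKU name on the deployment. The resource group, resource, deployment, and model version names below are placeholders, and capacity and model availability vary by subscription; the SKU names shown ("GlobalStandard", "DataZoneStandard", "Standard") correspond to the Global, Data Zone, and Regional standard deployment types.

```shell
# Sketch: creating an Azure OpenAI model deployment with a chosen
# deployment type using the Azure CLI. All names are placeholders.
az cognitiveservices account deployment create \
  --resource-group my-rg \
  --name my-openai-resource \
  --deployment-name gpt-4o-global \
  --model-name gpt-4o \
  --model-version "2024-08-06" \
  --model-format OpenAI \
  --sku-name GlobalStandard \
  --sku-capacity 10

# Swapping --sku-name for DataZoneStandard or Standard would instead
# create a Data Zone or Regional deployment of the same model.
```

Because the type is fixed per deployment, a customer can run the same model under multiple deployment types side by side and route traffic according to each workload's compliance and latency needs.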