Updated: December 19, 2024 (May 13, 2024)
Analyst Report
PTU Quota and Commitment Example
The Provisioned Throughput Unit (PTU) purchase model for the Azure OpenAI service could reduce charges for customers by letting them purchase capacity commitments for 30-day intervals.
Azure OpenAI resources in a subscription contain deployments, which provide access to the large language models (LLMs) that drive natural language-based applications. Customers work with their Microsoft sales account teams to set quotas on the subscription for the number of PTU commitments that can be purchased for each LLM type. The illustration shows a customer with a GPT-4 quota of 500 PTUs and a GPT-3.5-Turbo quota of 200 PTUs. This customer can purchase up to those numbers of PTU commitments for each LLM type across all Azure OpenAI resources in the subscription.
The customer has purchased 200 PTUs for resource A, 300 PTUs for resource B, and 100 PTUs for resource C. Deployments within each resource can be assigned a combined total of PTUs up to the number of commitments in their parent resource and still be charged at the 30-day commitment rate.
It is possible to have more PTUs assigned to a resource’s deployments than the number of commitments purchased for the resource, in which case a substantially higher-priced hourly rate is charged for the extra PTUs. The customer shown here has 150 PTUs assigned to deployment 5, and its parent, resource C, has 100 commitments, so 50 of the PTUs are charged at the hourly rate instead of the 30-day commitment rate.
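The resource C arithmetic above can be sketched in a few lines of Python. This is only an illustration of the split between commitment-rate and hourly-rate PTUs: the function name `split_ptus` is hypothetical, no prices are modeled because the report does not state them, and the sketch assumes deployment 5 is the only deployment in resource C with PTUs assigned.

```python
def split_ptus(assigned_ptus, committed_ptus):
    """Split a resource's assigned PTUs into (commitment-rate, hourly-rate) counts.

    assigned_ptus: combined PTUs assigned to all deployments in the resource.
    committed_ptus: 30-day commitments purchased for the resource.
    """
    if assigned_ptus < 0 or committed_ptus < 0:
        raise ValueError("PTU counts must be non-negative")
    # PTUs covered by purchased commitments are billed at the commitment rate.
    at_commitment_rate = min(assigned_ptus, committed_ptus)
    # Any PTUs beyond the commitments fall back to the higher hourly rate.
    at_hourly_rate = assigned_ptus - at_commitment_rate
    return at_commitment_rate, at_hourly_rate


# Resource C from the example: 150 PTUs assigned (deployment 5), 100 commitments.
# Assumption: no other deployment in resource C has PTUs assigned.
committed, hourly = split_ptus(assigned_ptus=150, committed_ptus=100)
print(committed, hourly)  # 100 50
```

When a resource's assignments stay at or below its commitments, the second value is zero and nothing is billed at the hourly rate.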