OpenAI Foundry to offer dedicated compute for customers to run AI workloads


OpenAI is launching a developer platform known as Foundry to allow customers to run large artificial intelligence workloads based on its models.

“[Foundry allows] inference at scale with full control over the model configuration and performance profile,” documents posted to Twitter state.

It aims to deliver “static allocation” of compute capacity that is “designed for cutting-edge customers running larger workloads.”

Customers will be able to monitor their instances using the same tools and dashboards that OpenAI uses for its own work, including ChatGPT, DALL·E, and GPT-3.

“Coming soon, OpenAI will offer more robust fine-tuning options for our latest models,” the document says. “Foundry will be the platform for serving those models.”

The service offers an SLA of 99.5 percent uptime, plus on-call engineering support.
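A 99.5 percent uptime SLA translates into a concrete downtime budget. A quick sketch of that arithmetic (assuming a 30-day month for the monthly figure):

```python
# Downtime budget implied by a 99.5 percent uptime SLA.
ALLOWED_DOWNTIME = 1 - 0.995  # 0.5 percent of total time

hours_per_month = 30 * 24     # assumption: a 30-day month
hours_per_year = 365 * 24

monthly_budget = ALLOWED_DOWNTIME * hours_per_month
yearly_budget = ALLOWED_DOWNTIME * hours_per_year

print(f"per month: {monthly_budget:.1f} hours")  # 3.6 hours
print(f"per year:  {yearly_budget:.1f} hours")   # 43.8 hours
```

In other words, the SLA permits roughly three and a half hours of downtime in a typical month before OpenAI would be in breach.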

Running a lightweight version of GPT-3.5 will cost $26,000 a month on a three-month commitment, or $264,000 for a one-year commitment.

Model instance DV (8K max context) will cost $78,000 per month for three months, or $792,000 for the full year.
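The quoted prices imply a consistent discount for committing to a full year rather than paying the monthly rate for twelve months. A short sketch checking that, using only the figures above:

```python
# Implied annual-commitment discount from the quoted Foundry prices.
# Each entry: (monthly rate, one-year commitment total), in USD.
PRICES = {
    "GPT-3.5 (lightweight)": (26_000, 264_000),
    "DV (8K max context)": (78_000, 792_000),
}

for name, (monthly, annual) in PRICES.items():
    at_monthly_rate = monthly * 12
    discount = 1 - annual / at_monthly_rate
    print(f"{name}: ${at_monthly_rate:,} at the monthly rate "
          f"vs ${annual:,} annually ({discount:.1%} discount)")
```

Both tiers work out to the same roughly 15.4 percent saving for the one-year commitment, suggesting a uniform discount schedule across instance types.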

While the document does not disclose where that compute will be hosted, it is likely on Microsoft Azure.

Microsoft has invested more than $10 billion in the company and is its primary cloud provider. It has also built bespoke supercomputing systems for OpenAI.

The move comes after Amazon Web Services this week said that it would integrate Hugging Face’s AI software development hub into its cloud.

Source: datacenterdynamics.com

Picture: DCD/DALL·E 2