Microsoft announces in-house Arm CPU and AI accelerator chips, custom racks

Azure Cobalt 100 and Azure Maia 100

Microsoft has announced two in-house-designed chips – an Arm CPU and a dedicated AI accelerator.

The Microsoft Azure Cobalt CPU is designed for general workloads, with a focus on performance per watt. The Microsoft Azure Maia AI Accelerator, meanwhile, is optimized for artificial intelligence tasks and generative AI.

The company said that it would begin rolling out the chips to its data centers early next year, initially for internal services like Microsoft Copilot and Azure OpenAI Service. They will then be made available on Microsoft Azure more generally, but are unlikely to be sold individually.

“Microsoft is building the infrastructure to support AI innovation, and we are reimagining every aspect of our data centers to meet the needs of our customers,” said Scott Guthrie, EVP of Microsoft’s Cloud and AI Group.

“At the scale we operate, it’s important for us to optimize and integrate every layer of the infrastructure stack to maximize performance, diversify our supply chain, and give customers infrastructure choice.”

The Azure Cobalt 100 CPU will join rival Arm-based chips from Ampere on Microsoft Azure. It is already being used for internal Microsoft products such as Azure SQL servers and Microsoft Teams. The chip has 128 Arm Neoverse N2 cores on the Armv9 architecture and 12 channels of DDR5 memory, and is built on Arm’s Neoverse Genesis CSS (Compute Subsystem) platform.
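
For context, a rough, unofficial estimate of what those 12 DDR5 channels imply for peak memory bandwidth – the transfer rate below is an assumption, as Microsoft has not published one:

```python
# Back-of-the-envelope peak memory bandwidth implied by 12 DDR5 channels.
# The transfer rate is an assumption (DDR5-4800 is a common server speed);
# Microsoft has not published Cobalt 100's memory speed.

CHANNELS = 12                # per the announcement
TRANSFER_RATE_MTPS = 4_800   # assumed: DDR5-4800, in mega-transfers/second
BYTES_PER_TRANSFER = 8       # one 64-bit DDR5 channel moves 8 bytes/transfer

peak_gbps = CHANNELS * TRANSFER_RATE_MTPS * BYTES_PER_TRANSFER / 1_000
print(f"Theoretical peak: {peak_gbps:.1f} GB/s")  # ~460.8 GB/s
```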

As for Maia, it is built on TSMC’s 5nm node and packs 105 billion transistors on a monolithic die. The company claims performance of 1,600 Tflops of MXInt8 and 3,200 Tflops of MXFP4 – figures that best those of rivals Google’s TPUv5 and Amazon’s Trainium.

It has a memory bandwidth of 1.6 TBps – above Trainium’s, but below the TPUv5’s.
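
Taken together, those claimed figures imply a very high compute-to-bandwidth ratio. A minimal roofline-style sketch, using only the numbers quoted above, shows how much arithmetic a workload must do per byte of memory traffic to keep the chip busy:

```python
# Roofline-style break-even point for Maia 100, using only the claimed
# figures above: a workload needs at least this many operations per byte
# moved from memory to be compute-bound rather than bandwidth-bound.

peak_mxint8_tflops = 1600    # claimed MXInt8 throughput
peak_bandwidth_tbps = 1.6    # claimed memory bandwidth

break_even_ops_per_byte = peak_mxint8_tflops / peak_bandwidth_tbps
print(f"Break-even intensity: {break_even_ops_per_byte:.0f} ops/byte")  # 1000
```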

The chip will be deployed in a custom-designed rack and cluster known as Ares, SemiAnalysis reports. The servers do not use standard 19” or OCP (Open Compute Project) form factors and are reportedly “much wider.”

Ares will only be available in a liquid-cooled configuration, requiring some data centers to deploy water-to-air coolant distribution units (CDUs).

Microsoft said that each rack will have a ‘sidekick’ – cooling infrastructure mounted on the side of the system that circulates liquid to cold plates on the chips.

Each server features four Maia accelerators, with eight servers per rack.
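
From those figures, a quick, illustrative tally of what a single Ares rack adds up to on paper – raw silicon throughput only, ignoring interconnect and scaling efficiency:

```python
# Per-rack totals from the configuration described above. This is raw
# claimed silicon throughput; real workloads will scale less than linearly.

ACCELERATORS_PER_SERVER = 4
SERVERS_PER_RACK = 8
CLAIMED_MXINT8_TFLOPS = 1600   # per chip, as quoted above

chips = ACCELERATORS_PER_SERVER * SERVERS_PER_RACK
rack_pflops = chips * CLAIMED_MXINT8_TFLOPS / 1000
print(f"{chips} accelerators per rack, ~{rack_pflops:.1f} Pflops MXInt8")  # 32, 51.2
```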

Microsoft said that “OpenAI has provided feedback on Azure Maia,” which will help inform future Microsoft designs.

“Since first partnering with Microsoft, we’ve collaborated to co-design Azure’s AI infrastructure at every layer for our models and unprecedented training needs,” Sam Altman, CEO of OpenAI, said.

“We were excited when Microsoft first shared their designs for the Maia chip, and we’ve worked together to refine and test it with our models. Azure’s end-to-end AI architecture, now optimized down to the silicon with Maia, paves the way for training more capable models and making those models cheaper for our customers.”

However, OpenAI has reportedly considered developing its own chips out of a desire for a different design – likely one with higher memory bandwidth, which is key for large language model training.
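
A minimal sketch illustrates why bandwidth matters: the time to stream a model’s weights from memory once puts a hard floor on each pass over the model, so doubling bandwidth halves that floor. The model size and precision below are hypothetical illustrations, not any vendor’s actual figures:

```python
# Why bandwidth matters: reading every weight once per step sets a hard
# floor on step time. Model size and precision here are hypothetical
# illustrations, not Maia, OpenAI, or any vendor's actual figures.

PARAMS = 70e9          # assumed 70B-parameter model
BYTES_PER_PARAM = 2    # assumed 16-bit weights

def min_pass_seconds(bandwidth_tbps: float) -> float:
    """Lower bound: time to stream all weights once at a given bandwidth."""
    return PARAMS * BYTES_PER_PARAM / (bandwidth_tbps * 1e12)

for bw in (1.6, 3.2):  # e.g. a Maia-class figure vs. double the bandwidth
    print(f"{bw} TB/s -> {min_pass_seconds(bw) * 1000:.1f} ms floor per pass")
```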

Microsoft is currently developing second-generation versions of both chip lines.

Source: datacenterdynamics.com