Microsoft debuts the NDv6 GB300 cluster with more than 4,600 NVIDIA Blackwell Ultra GPUs: the new Azure supercomputer to power OpenAI's AI workloads
In a sector where each generation of hardware seems to last no longer than a passing fad, Microsoft has made a move again. The Redmond company has presented the new series of NDv6 GB300 virtual machines, powered by an engineering colossus: an industrial-scale production cluster built on NVIDIA GB300 NVL72 systems, designed for the most demanding AI inference workloads.
Behind this announcement lies more than technical specifications. There is a declaration of intent: Azure does not want to be just a cloud that runs artificial intelligence, but the infrastructure where the future of AI is defined.
A leap in scale that redefines what “cloud” means
The new NDv6 GB300 cluster is, in practice, a distributed supercomputer capable of processing models that previously required separate or dedicated infrastructure. With more than 4,600 NVIDIA Blackwell Ultra GPUs connected via the NVIDIA Quantum-X800 InfiniBand network, the system takes the concept of "hyperscale" into territory that, until recently, was occupied only by government laboratories and research centers.
Nidhi Chappell, corporate vice president of AI Infrastructure at Azure, summed it up with a calculated phrase: “It’s not just about raw power; it’s about optimizing every layer of the modern data center for AI.”
That phrase contains the key to what Microsoft has achieved: rethinking hardware, memory, networking, and cooling as a single organism.
Inside the engine: 72 GPUs per rack, liquid cooling and huge memory
The heart of every NDv6 GB300 virtual machine is the NVIDIA GB300 NVL72 system, a liquid-cooled rack-scale unit of staggering density. Each module integrates 72 Blackwell Ultra GPUs and 36 Grace CPUs, all interconnected through fifth-generation NVLink Switch.
The result is a virtual machine with 37 TB of fast memory and 1.44 exaflops of FP4 Tensor Core performance, figures that just three years ago would have seemed like science fiction. This unified memory space is what allows training and running multimodal reasoning AI models (the same ones that OpenAI, Azure’s main partner, works on) without latency or data fragmentation slowing the process.
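The per-rack figures above can be sanity-checked with some back-of-the-envelope arithmetic. This is an illustrative sketch, not official NVIDIA or Microsoft data: the rack-level totals come from the article, and the per-GPU breakdown is a simple division that lumps the Grace CPUs' memory share in with the GPUs.

```python
# Back-of-the-envelope check of the quoted per-rack figures.
# Rack-level totals are the article's numbers; per-GPU values are illustrative,
# since the 37 TB of "fast memory" spans both GPU HBM and Grace CPU memory.

GPUS_PER_RACK = 72
RACK_FAST_MEMORY_TB = 37       # unified fast memory per GB300 NVL72 rack
RACK_FP4_EXAFLOPS = 1.44       # FP4 Tensor Core performance per rack

memory_per_gpu_gb = RACK_FAST_MEMORY_TB * 1000 / GPUS_PER_RACK
fp4_petaflops_per_gpu = RACK_FP4_EXAFLOPS * 1000 / GPUS_PER_RACK

print(f"~{memory_per_gpu_gb:.0f} GB of fast memory per GPU slice")
print(f"~{fp4_petaflops_per_gpu:.0f} PFLOPS of FP4 per GPU")
```

Dividing through gives roughly 514 GB of fast memory and 20 PFLOPS of FP4 per GPU, consistent with a rack behaving as one large accelerator rather than 72 isolated devices.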
Blackwell Ultra, NVIDIA’s jewel for 2025, is not designed for brute force alone. Its architecture incorporates NVFP4 precision formats, which reduce power consumption and double throughput compared to previous generations, as well as NVIDIA Dynamo, a technology that optimizes dynamic inference for reasoning models.
In the latest MLPerf Inference v5.1 benchmarks, this combination achieved up to five times the performance of the Hopper architecture, setting new records on models such as DeepSeek-R1 (671B parameters) and Llama 3.1 405B.
A network up to the challenge: NVLink and Quantum-X800
All this power would be useless without a network capable of moving data at the same speed. Microsoft and NVIDIA have designed a two-tier interconnection architecture: vertical, within each rack, and horizontal, between racks.
Inside each rack, the NVLink Switch system offers 130 TB/s of direct bandwidth among the 72 GPUs, making the rack act as a single accelerator with shared memory. Between racks, the Quantum-X800 InfiniBand network connects the more than 4,600 chips at 800 Gb/s per GPU, supported by Quantum-X800 switches and ConnectX-8 SuperNIC cards.
This combination allows an AI model with hundreds of billions of parameters to behave as if it were running on a single gigantic machine. Additionally, the network uses active telemetry and adaptive congestion control, along with NVIDIA’s SHARP v4 protocol, which performs reduction operations and aggregates data directly in the network hardware, multiplying efficiency in training and inference.
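A rough sketch puts the two tiers of the fabric in perspective. The inputs are the article's figures, with two stated assumptions: "more than 4,600 GPUs" is taken as a 4,600 lower bound, and the per-GPU scale-out figure is read as 800 Gb/s (the Quantum-X800 port rate), not gigabytes.

```python
# Rough aggregate-bandwidth estimate for the two-tier fabric described above.
# Assumptions: 4,600 GPUs (the article's lower bound), 72 GPUs per NVL72 rack,
# and 800 Gb/s of InfiniBand scale-out bandwidth per GPU.

TOTAL_GPUS = 4600
GPUS_PER_RACK = 72
NVLINK_TBPS_PER_RACK = 130       # TB/s inside one rack (scale-up tier)
IB_GBPS_PER_GPU = 800            # Gb/s per GPU between racks (scale-out tier)

racks = TOTAL_GPUS // GPUS_PER_RACK
intra_rack_total_tbps = racks * NVLINK_TBPS_PER_RACK
inter_rack_total_tbps = TOTAL_GPUS * IB_GBPS_PER_GPU / 8 / 1000  # Gb/s -> TB/s

print(f"~{racks} racks, ~{intra_rack_total_tbps} TB/s aggregate NVLink")
print(f"~{inter_rack_total_tbps:.0f} TB/s aggregate InfiniBand injection")
```

The estimate (~63 racks, on the order of 8,000 TB/s of aggregate NVLink against ~460 TB/s of cross-rack InfiniBand) illustrates why the design keeps the densest traffic inside each rack and reserves the InfiniBand tier for inter-rack collectives.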

A years-long collaboration between giants
The Azure announcement has not come out of nowhere. Microsoft and NVIDIA have been working in tandem for several years to build the infrastructure that supports the advances of OpenAI and other companies training next-generation models.
This NDv6 GB300 cluster is the result of that collaboration, but also a strategic national investment: it reinforces the position of the United States as the epicenter of advanced AI development and reduces dependence on foreign infrastructure.
According to sources close to the project, building this system required reviewing every layer of the Azure data center, from the design of liquid cooling and electrical distribution, to a new orchestration and storage software stack adapted to the scale of current models.
What’s next: the cloud as a global supercomputer
Microsoft’s goal is not only to offer a new type of virtual machine, but to turn Azure into a network of modular supercomputers capable of growing without limits. The NDv6 GB300 is the first visible block of that plan.
As the company expands its fleet to hundreds of thousands of Blackwell Ultra GPUs, the impact will be twofold: it will offer customers like OpenAI a platform of unprecedented performance while democratizing access to extreme-scale computing for startups, universities, and developers.
With this release, Microsoft is not only building the infrastructure that will power the next wave of AI models. It is drawing the skeleton of the digital future: a cloud that is no longer a remote service, but a shared global supercomputer.
