Xeon 6+ ClearWater Forest: double density and 3D CPU with 288 cores

Intel ClearWater Forest, architecture of the new Xeon 6+

In this new generation of Xeon processors, called Intel Xeon 6+, Intel applies its newest manufacturing and packaging processes to build its densest Xeon to date, with twice as many cores per processor, all of them Darkmont efficiency cores.

Sample of the complete Intel Xeon 6+ before its IHS is installed, showing the 12 Darkmont compute tiles, stacked in 3D, that add up to 288 cores

Sierra Forest, the previous generation, offered up to 144 efficiency cores per CPU, in configurations of up to two CPUs per motherboard. The Intel® Xeon® 6780E, based on Sierra Forest, tripled the efficiency of previous solutions, with up to 5 times the data delivery capacity. It was built from a single Intel 3 compute tile and Intel 7 I/O tiles.

Intel Xeon 6+ ClearWater Forest Darkmont Compute Elements Wafer

The ClearWater Forest design and architecture doubles performance while keeping the same energy profile, with 1.9 times more bandwidth. This is possible thanks to Intel 18A as the transistor manufacturing process and to Foveros Direct 3D technology, which stacks tiles on top of each other in Intel’s most advanced three-dimensional design to date and is already operating at full capacity at two Intel Foundry manufacturing plants, in Arizona and Oregon.

With the new Darkmont efficiency core, Intel doubles the density of these CPUs to up to 288 E-cores and nearly doubles memory bandwidth, with 12 channels of DDR5 memory at speeds of up to 8000 MT/s.
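
The "almost double the bandwidth" claim can be sanity-checked with a little arithmetic. The sketch below computes theoretical peak DRAM bandwidth only, and it rests on two assumptions that are not in the article: that each DDR5 channel moves 8 bytes per transfer, and that the previous-generation baseline is 8 channels of DDR5-6400.

```python
# Theoretical peak DRAM bandwidth = channels * transfers per second * bytes per transfer.
# Assumption (not from the article): each DDR5 channel moves 8 bytes per transfer.

def peak_bandwidth_gbs(channels: int, mega_transfers: int, bytes_per_transfer: int = 8) -> float:
    """Return theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    return channels * mega_transfers * 1e6 * bytes_per_transfer / 1e9

# Figures quoted in the article for Xeon 6+.
xeon6_plus = peak_bandwidth_gbs(channels=12, mega_transfers=8000)

# Assumed baseline for the previous generation: 8 channels of DDR5-6400.
baseline = peak_bandwidth_gbs(channels=8, mega_transfers=6400)

print(f"Xeon 6+ peak:          {xeon6_plus:.1f} GB/s")
print(f"Assumed baseline peak: {baseline:.1f} GB/s")
print(f"Ratio:                 {xeon6_plus / baseline:.3f}x")
```

Under those assumptions the ratio comes out just under 1.9x, in line with the figure Intel quotes. Sustained bandwidth in real workloads will be lower than this theoretical peak.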

Unlike Sierra Forest, the new architecture is built from five different components. The “I/O Tiles”, “EMIB” and “Compute Tiles” are kept, while “Base Tiles” and “Foveros Direct 3D” are added to stack components in three dimensions. The design therefore moves from two manufacturing processes, Intel 7 and Intel 3, to a more complex mix of three: Intel 18A, Intel 3 and Intel 7.

EMIB remains the basis for communication between the different tiles, at least between the active base tiles and the I/O tiles, which are still built on Intel 7. The compute tiles, however, go from a single one to 12 (Intel 18A), placed on top of 3 active tiles (Intel 3) that act as a base. The details follow.

Intel 7

  • It is an enhanced 10nm process node, optimized to deliver higher performance per watt.
  • It is used in high-performance and energy-efficient products such as server CPUs. Intel first used it in servers with Sapphire Rapids.
  • It allows higher frequencies and improvements in transistor density compared to previous generations.

Intel 3

  • Evolution of Intel 7, with improvements in EUV (extreme ultraviolet) photolithography.
  • The Sierra Forest processors (Xeon 6) were the first Intel processors built on this process, which is now used for the base tiles that hold the cache in the Intel Xeon 6+.

Intel 18A

  • It is a next-generation process node, debuting in Intel ClearWater Forest (Xeon 6+) and in the Intel Panther Lake processors for low-power, high-efficiency mobile and edge devices.
  • It introduces technologies such as RibbonFET (a new gate-all-around transistor) and PowerVia (power delivery through the back of the chip).

Building on Intel 18A allows the transistor to keep shrinking, with the gate completely surrounding the silicon ribbons. This increases density and reduces signal and energy loss, especially in “off” states. With PowerVia, power delivery is moved to the back side of the chip and signal routing stays on the front side, which reduces materials and congestion and improves signal quality and performance.

Foveros Direct 3D is the packaging technology that allows Intel to place the new compute tiles on top of the base tiles. It does so through direct copper-to-copper bonding at a pitch of only 9 micrometers, with low resistance and a remarkable energy efficiency of 0.05 picojoules per bit.
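
To put the 0.05 picojoules per bit figure in perspective, the following sketch converts it into watts for a given amount of traffic across the 3D bond. The traffic levels are hypothetical values chosen only for illustration; the article does not state how much data actually crosses the interface.

```python
# Power spent moving data across a die-to-die link:
#   watts = bits per second * joules per bit
PICOJOULE = 1e-12  # joules

def link_power_watts(bytes_per_second: float, picojoules_per_bit: float = 0.05) -> float:
    """Return the power in watts needed to sustain the given traffic."""
    bits_per_second = bytes_per_second * 8
    return bits_per_second * picojoules_per_bit * PICOJOULE

# Hypothetical traffic levels in TB/s (not figures from the article).
for terabytes_per_second in (0.5, 1.0, 2.0):
    watts = link_power_watts(terabytes_per_second * 1e12)
    print(f"{terabytes_per_second:.1f} TB/s -> {watts:.2f} W")
```

At this energy cost, even traffic on the order of a terabyte per second stays well under one watt, which is why stacking cache and compute vertically is attractive from a power standpoint.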

This design allows remarkable scaling of the processor’s components, combining an unprecedented core density with extensive connectivity. Each I/O tile, and remember that there are two in each processor, adds 8 accelerators, 48 PCI Express 5.0 lanes, 32 CXL 2.0 lanes and 96 UPI 2.0 lanes. Intel reuses this I/O design from Granite Rapids, in a package of heterogeneous components that work together thanks to EMIB communication between them.
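
Since there are two I/O tiles per processor, the per-tile figures can be doubled to get package totals. This is a minimal sketch of that arithmetic; the dictionary keys are illustrative labels, and UPI is left out because the specification list later in the article counts it in different units.

```python
# Per-I/O-tile resources quoted in the article.
IO_TILES_PER_CPU = 2
PER_TILE = {
    "accelerators": 8,
    "PCIe 5.0 lanes": 48,
    "CXL 2.0 lanes": 32,
}

# Package totals are the per-tile counts times the number of I/O tiles.
totals = {name: count * IO_TILES_PER_CPU for name, count in PER_TILE.items()}

for name, total in totals.items():
    print(f"{name}: {total} per CPU")
```

The results (16 accelerators, 96 PCIe 5.0 lanes, 64 CXL 2.0 lanes) match the "up to" values in the specification list.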

In the new “base” tiles we find 192 MB of cache per base tile (there are three in total), or 48 MB for each compute tile. Each base tile also holds a DDR5 memory controller with four channels. This makes a total of 576 MB of cache, five times more than the previous generation, and 12 channels of DDR5 memory. Once again, EMIB is the glue that allows communication between these tiles.

On top of each base tile sit four compute tiles, each with six Darkmont E-core modules of four cores each. That gives 24 cores per tile, with 4 MB of L2 cache per module, or 1 MB per core. The total is up to 288 cores per processor, double the figure of the previous generation with the same energy profile.
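
The tile hierarchy described above is easy to verify by multiplying it out. The sketch below models it as a small data class; the class and field names are illustrative, and it takes 4 MB of L2 per four-core module, as the specification list states.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Topology:
    """Tile hierarchy of a fully populated Xeon 6+ package, per the article."""
    base_tiles: int = 3
    compute_tiles_per_base: int = 4
    modules_per_compute_tile: int = 6
    cores_per_module: int = 4
    l2_mb_per_module: int = 4
    llc_mb_per_base_tile: int = 192
    ddr5_channels_per_base_tile: int = 4

    @property
    def compute_tiles(self) -> int:
        return self.base_tiles * self.compute_tiles_per_base

    @property
    def modules(self) -> int:
        return self.compute_tiles * self.modules_per_compute_tile

    @property
    def cores(self) -> int:
        return self.modules * self.cores_per_module

    @property
    def l2_mb(self) -> int:
        return self.modules * self.l2_mb_per_module

    @property
    def llc_mb(self) -> int:
        return self.base_tiles * self.llc_mb_per_base_tile

    @property
    def llc_mb_per_compute_tile(self) -> int:
        return self.llc_mb_per_base_tile // self.compute_tiles_per_base

    @property
    def ddr5_channels(self) -> int:
        return self.base_tiles * self.ddr5_channels_per_base_tile

cwf = Topology()
print(f"compute tiles: {cwf.compute_tiles}")
print(f"cores:         {cwf.cores}")
print(f"L2 cache:      {cwf.l2_mb} MB")
print(f"last-level:    {cwf.llc_mb} MB ({cwf.llc_mb_per_compute_tile} MB per compute tile)")
print(f"DDR5 channels: {cwf.ddr5_channels}")
```

Every headline number in the article (12 compute tiles, 288 cores, 288 MB of L2, 576 MB of last-level cache, 12 memory channels) falls out of the same three-level hierarchy.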

The units that make up Darkmont offer a substantial advantage over the previous generation, the already powerful Crestmont. The gain is large enough to suggest where new generations of server processors, with ever more specialized accelerators, are heading: toward maximizing connectivity and performance per watt.

Feature             Crestmont (Sierra Forest)   Darkmont (Clearwater Forest)
Branch predictor    -                           Wider & deeper
Decode              6-wide                      9-wide
Allocate            6-wide                      8-wide
uOp queue           64 entries                  96 entries
ROB window          256 entries                 416 entries
Dispatch            17 ports                    26 ports
Scalar ALUs         4 ALUs                      8 ALUs
Vector FMAs         2x128b                      4x128b
Address generation  2 AGUs                      4 AGUs
L2 bandwidth        64B/cycle                   128B/cycle
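
To see how large each step is, the numeric rows of the comparison can be turned into Darkmont-to-Crestmont ratios. This is a small sketch using only the figures listed above; the branch predictor row is omitted because it has no number attached, and the vector row counts 128-bit FMA units.

```python
# (Crestmont, Darkmont) values taken from the comparison above.
COMPARISON = {
    "Decode width": (6, 9),
    "Allocate width": (6, 8),
    "uOp queue entries": (64, 96),
    "ROB entries": (256, 416),
    "Dispatch ports": (17, 26),
    "Scalar ALUs": (4, 8),
    "128b vector FMAs": (2, 4),
    "AGUs": (2, 4),
    "L2 bytes per cycle": (64, 128),
}

# Ratio of the new core's resources to the old core's, per feature.
ratios = {name: darkmont / crestmont for name, (crestmont, darkmont) in COMPARISON.items()}

for name, ratio in ratios.items():
    crestmont, darkmont = COMPARISON[name]
    print(f"{name:<20} {crestmont:>4} -> {darkmont:<4} {ratio:.2f}x")
```

The front end and out-of-order window grow by roughly a third to two thirds, while the execution resources and L2 bandwidth double outright.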

It offers around 50% more, or better, in almost every important area of the core, including a new “branch predictor” unit, more decode capacity, more ports, twice as many arithmetic units, twice the vector computing capacity and an L2 cache with twice the bandwidth. The generational leap is simply impressive, and we will also find these cores in the new Panther Lake for laptops and edge devices.

They use the same platform as the Xeon 6900P, the family built on high-performance P-cores.

With these numbers, Intel claims a 1.9x performance improvement over the Intel® Xeon® 6780E processor, a 23% improvement in performance per watt and an 8:1 consolidation ratio with respect to five-year-old processors (each rack based on Intel Xeon 6+ could replace 8 racks of five-year-old processors).
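
As a rough illustration of what an 8:1 consolidation ratio means in practice, the sketch below estimates how many new racks would replace an existing fleet. The fleet sizes are hypothetical, and the calculation simply takes Intel's claimed ratio at face value; real results depend on the workload.

```python
import math

def racks_after_consolidation(old_racks: int, ratio: int = 8) -> int:
    """Racks needed if each new rack replaces `ratio` old ones (rounded up)."""
    return math.ceil(old_racks / ratio)

# Hypothetical fleet sizes, for illustration only.
for old_racks in (8, 40, 100):
    new_racks = racks_after_consolidation(old_racks)
    print(f"{old_racks} old racks -> {new_racks} new racks")
```

The rounding up matters for small fleets: a site with 9 old racks would still need 2 new ones.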

Twice as many cores, twice as many Intel accelerators, more PCI Express lanes, support for faster memory, five times more L3 cache and up to 288 MB of L2 cache with double the bandwidth. Commercial configurations promise the same 330W power profile as the previous generation, although many commercial details are still to be announced.

  • Platform
    • Sockets: 1S – 2S (Xeon 6900P compatible)
    • Maximum TDP: 300 to 500W per CPU
  • Compute and Memory
    • Cores: up to 288 efficiency cores
    • L2 cache: up to 288MB (up to 4MB per cluster)
    • Last-level cache: 576MB
    • Memory: 12 channels DDR5 8000MT/s
  • Interconnections
    • Intel® UPI: up to 6 UPI 2.0 links (up to 24 GT/s per lane)
    • PCI Express: up to 96 PCIe 5.0 lanes (x16, x8, x4, x2)
    • Compute Express Link: up to 64 CXL 2.0 lanes
  • Security and Efficiency
    • Security: Intel Software Guard Extensions (Intel SGX), Intel Trust Domain Extensions (Intel TDX)
    • Power Management: Intel Application Energy Telemetry (Intel AET), Intel Turbo Rate Limiter
  • Acceleration
    • Acceleration: Intel Advanced Vector Extensions 2 (VNNI/INT8)
    • Integrated accelerators (up to 16 accelerators):
      • 4x Intel QuickAssist Technology
      • 4x Intel Dynamic Load Balancer
      • 4x Intel Data Streaming Accelerator
      • 4x Intel In-memory Analytics Accelerator

These processors will arrive during 2026, with operating frequencies that should not be far from current ones. The improvement here comes from the optimizations in the new-generation cores and from the greater density of cores and cache, made possible by Intel’s complex and clever three-dimensional packaging, which combines elements from three different manufacturing processes.