NVIDIA Demand Reduction Factors / Competition between NVIDIA and Microsoft feels like it is intensifying. These two companies are amazing… and Samsung Electronics has to compete in the midst of it


[MSFT’s own network chip will be released next week]

Microsoft is currently building the largest infrastructure humanity has ever seen. It may sound exaggerated, but the annual spending on megaprojects such as national rail networks, dams, or space programs like the Apollo Moon landings pales in comparison to Microsoft's more than $50 billion of annual spending on data centers in 2024 and beyond. This infrastructure build is aimed squarely at accelerating the path to AGI and bringing generative AI intelligence to every aspect of life, from productivity applications to leisure.

While most of the AI infrastructure will be based on Nvidia’s GPUs in the medium term, there is considerable effort to diversify to both other silicon suppliers and internally developed silicon. We detailed Microsoft’s ambitious plans for AMD MI300 in January and, more recently, MI300X orders for next year. In addition to accelerators, there are important requirements for 800G PAM4 optics, coherent optics, cabling, cooling, CPUs, storage, DRAM, and various other server components.

Today, we want to dig into Microsoft’s internal silicon efforts. Today’s Azure Ignite event has two major silicon announcements: the Cobalt 100 CPU and the Maia 100 AI accelerator (also known as Athena or M100). Microsoft’s system-level approach is prominent enough that we will also cover rack-level design, networking (Azure Boost & hollow core fiber), and security for the Maia 100. We will get into Microsoft’s long-term plans, including Maia 100 volumes and how its AI silicon, including next-generation chips, stacks up against AMD MI300X, Nvidia H100/H200/B100, Google’s TPUv5, and Amazon’s Trainium/Inferentia2. We will also share what we have heard about GPT-3.5 and GPT-4 model performance on the Maia 100.

Although Microsoft is currently behind Google and Amazon in deploying custom silicon in its data centers, you should know that it has a long history of silicon projects. For example, did you know that Microsoft developed a custom CPU called E2 with a custom instruction set architecture that uses Explicit Data Graph Execution (EDGE)? They even ported Windows to this ISA! Microsoft has historically worked with AMD on semi-custom gaming console chips, and it is now expanding that partnership to custom Arm-based Windows PC chips. Microsoft has also developed multiple generations of internally designed root-of-trust chips found across all the servers installed in its data centers.

Then there is Microsoft’s Project Catapult, which targets search, AI, and networking. Early Project Catapult was based entirely on off-the-shelf FPGAs, but Microsoft eventually signed a deal with Intel for a custom FPGA. That FPGA was intended primarily for Bing, but it had to be abandoned due to Intel’s execution issues. Bing still relies heavily on FPGAs, in contrast to Google Search, which is primarily accelerated by TPUs.

As part of today’s announcement, Microsoft is also unveiling Azure Boost, a network adapter that pairs an external FPGA-based 200G DPU with an internally designed ASIC. It offloads many hypervisor, host, network, and storage-related tasks, but for some reason Azure instances using Azure Boost must still give up host CPU cores for infrastructure-related tasks. This differs from Amazon’s Nitro, which leaves all host CPU cores available for VMs.
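To illustrate why that host-core reservation matters for instance economics, here is a minimal sketch. All numbers (per-socket core count, cores reserved for the infrastructure stack, SMT factor) are hypothetical placeholders, not disclosed Microsoft or AWS figures:

```python
# Hypothetical illustration: reserving host cores for infrastructure tasks
# reduces the vCPUs that can be sold to customer VMs.
# Every number below is a placeholder, not a disclosed figure.

def sellable_vcpus(physical_cores: int, reserved_cores: int, smt: int = 2) -> int:
    """vCPUs available to guest VMs after reserving cores for the host/infra stack."""
    return (physical_cores - reserved_cores) * smt

azure_boost_style = sellable_vcpus(physical_cores=96, reserved_cores=8)  # cores still set aside
nitro_style = sellable_vcpus(physical_cores=96, reserved_cores=0)        # full offload to the DPU

print(f"Azure Boost-style host: {azure_boost_style} sellable vCPUs")
print(f"Nitro-style host:       {nitro_style} sellable vCPUs")
print(f"Capacity given up:      {nitro_style - azure_boost_style} vCPUs per host")
```

Under these placeholder assumptions, the host reservation costs a double-digit number of sellable vCPUs per server, which compounds across a fleet.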

Azure Cobalt 100 CPU
The Azure Cobalt 100 CPU is Microsoft’s second Arm-based CPU deployed to its cloud. It is already used for internal Microsoft products such as Azure SQL Server and Microsoft Teams. The first Arm-based CPU Microsoft deployed was a Neoverse N1-based CPU purchased from Ampere Computing. The Cobalt 100 CPU evolves from that design, providing 128 Armv9 Neoverse N2 cores and 12 channels of DDR5. Neoverse N2 offers 40% higher performance than Neoverse N1.
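A back-of-the-envelope sketch of the socket-level uplift those figures imply. Only the 128-core count and the roughly 40% per-core gain come from the announcement; the 80-core baseline for the prior Neoverse N1-based Ampere part is our assumption for illustration:

```python
# Rough socket-level throughput estimate: Cobalt 100 vs. the prior N1-based part.
# 128 cores and ~1.4x per-core uplift are cited above; the 80-core baseline is assumed.

n1_cores, n2_cores = 80, 128      # baseline core count is an assumption, not confirmed
per_core_uplift = 1.40            # Neoverse N2 vs. N1, per the figure cited above

socket_uplift = (n2_cores / n1_cores) * per_core_uplift
print(f"Estimated socket-level throughput uplift: ~{socket_uplift:.1f}x")  # ~2.2x
```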

Cobalt 100 is primarily based on Arm’s Neoverse Genesis Compute Subsystem (CSS) platform. This Arm product makes it much faster, easier, and cheaper to develop a competent Arm-based CPU, a departure from Arm’s classic business model of only licensing IP.

Arm supplies a proven, pre-integrated design block with much of the design work already completed. We discussed this new business model in detail here.

For Cobalt 100, Microsoft takes two Genesis compute subsystems and combines them into a single CPU.
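In other words, assuming each Genesis CSS instance tops out at 64 Neoverse N2 cores (our reading of Arm's CSS configuration, not something Microsoft has stated):

```python
# Sketch of how two Genesis CSS compute subsystems compose into one Cobalt 100 socket.
# The 64-cores-per-CSS figure is an assumption for illustration.
cores_per_css = 64
css_instances = 2
total_cores = cores_per_css * css_instances
assert total_cores == 128  # matches the 128-core figure cited above
print(f"{css_instances} x {cores_per_css}-core Genesis CSS = {total_cores} cores per socket")
```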

This is similar to Alibaba’s Yitian 710 CPU, which is also based on Neoverse N2 and was profiled here by Chips & Cheese.

Arm has previously boasted that a hyperscaler took only 13 months from project start to working silicon. Given that the only Genesis CSS customers we know of are Alibaba and Microsoft, and Alibaba was first to market, it is possible Arm is talking about Microsoft on the slide below. It is also possible that Google’s Arm-based CPU uses Genesis CSS as well.

Azure Maia 100 (Athena)
Microsoft’s long-awaited AI accelerator is finally here. They are the last of the big 4 US hyperscalers (Amazon, Google, Meta, Microsoft) to unveil their product. With that said, Maia 100 isn’t a slouch. We will compare its performance / TCO versus AMD’s MI300X, Nvidia’s H100/H200/B100, Google’s TPUv5, and Amazon’s Trainium/Inferentia2.

The bulk of this piece is below. It will include the full specifications, network setup and topology, rack design, volume ramp, performance, power consumption, design partners, and more. There are some very unique aspects of this chip that we think ML researchers, infrastructure folks, silicon design teams, and investors should be made aware of.
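Those comparisons ultimately come down to performance per total cost of ownership. As a minimal sketch of that framing, with every input below a hypothetical placeholder rather than a figure from the analysis:

```python
# Minimal performance-per-TCO framing for accelerator comparisons.
# All inputs are placeholders; a real analysis would use measured throughput,
# street pricing, and actual datacenter power costs.

def perf_per_tco(throughput_tokens_s: float, capex_usd: float, lifetime_years: float,
                 power_kw: float, usd_per_kwh: float) -> float:
    """Tokens per second delivered per dollar of total cost of ownership per year."""
    capex_per_year = capex_usd / lifetime_years
    opex_per_year = power_kw * usd_per_kwh * 24 * 365
    return throughput_tokens_s / (capex_per_year + opex_per_year)

# Two hypothetical accelerators, purely to show the mechanics of the comparison.
accel_a = perf_per_tco(throughput_tokens_s=3000, capex_usd=30000, lifetime_years=4,
                       power_kw=0.7, usd_per_kwh=0.08)
accel_b = perf_per_tco(throughput_tokens_s=2200, capex_usd=15000, lifetime_years=4,
                       power_kw=0.6, usd_per_kwh=0.08)
print(f"Accelerator A: {accel_a:.3f} tokens/s per $/yr")
print(f"Accelerator B: {accel_b:.3f} tokens/s per $/yr")
```

The cheaper, slower part can still win on this metric, which is why raw specs alone do not settle the Maia 100 versus merchant-silicon question.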

