Nvidia unveils Vera Rubin "super chip" and a platform for humanoid robots.

At the GTC 2025 developer conference, Nvidia introduced a series of next-generation hardware and software products focused on scaling AI computing power.

Nvidia's annual GTC (GPU Technology Conference) took place in San Jose, California. This year the event ran from March 17 to 21, with the keynote delivered by CEO Jensen Huang on March 18, introducing the company's new flagship products.

Blackwell Ultra GB300 GPU.

Expected to launch later this year, Blackwell Ultra GB300 is an upgrade of the 2024 Blackwell AI GPU. It retains the same 20 petaflops of AI compute as its predecessor but increases HBM3e memory from 192 GB to 288 GB. Compared with the H100 launched in 2022, Blackwell Ultra delivers 1.5 times the AI inference performance and can process 1,000 tokens per second, roughly 10 times the throughput of that three-year-old chip.

Blackwell Ultra GB300 AI chip.
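As a quick sanity check, the spec figures quoted above can be related to each other with simple arithmetic (illustrative only; all numbers come from Nvidia's claims as reported here):

```python
# Rough arithmetic on the Blackwell Ultra figures quoted above.
hbm_blackwell_gb = 192     # 2024 Blackwell HBM3e capacity (GB)
hbm_ultra_gb = 288         # Blackwell Ultra HBM3e capacity (GB)
tokens_per_s_ultra = 1000  # claimed Blackwell Ultra throughput
speedup_vs_h100 = 10       # claimed throughput multiple over H100

memory_uplift = hbm_ultra_gb / hbm_blackwell_gb       # 1.5x more memory
tokens_per_s_h100 = tokens_per_s_ultra / speedup_vs_h100  # implied H100 rate

print(f"Memory uplift: {memory_uplift:.2f}x")
print(f"Implied H100 throughput: {tokens_per_s_h100:.0f} tokens/s")
```

The memory uplift works out to exactly 1.5x, and the quoted 10x speedup implies an H100 baseline of about 100 tokens per second.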

“AI has come a long way. AI reasoning and AI agents require exponentially more compute performance,” Huang said. “We designed Blackwell Ultra to address that. It’s the only general-purpose platform that can do pre-trained, post-trained, and top-of-the-line AI inference.”

Nvidia will sell the new chip in clusters, with 72 Blackwell Ultra GPUs and 36 Nvidia Grace CPUs, and it can connect to the Nvidia DGX Cloud private cloud platform, which it describes as “a fully managed, end-to-end cloud that optimizes performance with software, services, and expertise for AI workloads.”

Nvidia also offers a smaller system called the B300 NVL16. Compared to the previous Hopper generation, the chip offers 11 times faster inference on large language models, 7 times more compute, and 4 times more memory.

According to Nvidia, partners who have placed orders for Blackwell Ultra include Cisco, Dell, HP, Lenovo and Supermicro. The company has not announced the price of the product.

The "Super Chip" Vera Rubin.

Vera Rubin, which Huang presented as Nvidia's next AI chip architecture, will be released in 2026, while the Rubin Ultra version could follow in 2027.

Nvidia CEO Jensen Huang introduces Vera Rubin on stage at the event.

Vera Rubin will reportedly deliver up to 50 petaflops of performance, while Rubin Ultra is a combination of two Vera Rubin models, delivering 100 petaflops. While each Rubin processor combines two GPUs into a single chip, Rubin Ultra combines four GPUs. These systems are said to take AI inference “to the next level.”

A Rubin Ultra NVL576 cluster will deliver 14 times the performance of a Blackwell Ultra cluster. According to Nvidia, the Rubin Ultra NVL576 will be housed in a new liquid-cooled server rack design called the Kyber Rack.
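The Rubin numbers above fit together with simple arithmetic (a hedged back-of-envelope; the assumption that the "576" in NVL576 counts GPU dies, four per Rubin Ultra package, is an inference from the naming, not an Nvidia statement in this article):

```python
# Back-of-envelope on the Rubin figures above (illustrative assumptions).
rubin_pf = 50              # petaflops per Rubin processor (two GPU dies)
ultra_pf = 2 * rubin_pf    # Rubin Ultra pairs two Rubin processors (four dies)

gpus_in_nvl576 = 576       # GPU dies implied by the NVL576 rack name
dies_per_package = 4       # assumed: four GPU dies per Rubin Ultra package
packages = gpus_in_nvl576 // dies_per_package

print(f"Rubin Ultra: {ultra_pf} petaflops")
print(f"NVL576 rack: {packages} Rubin Ultra packages")
```

Under these assumptions, two 50-petaflop Rubin processors give Rubin Ultra its quoted 100 petaflops, and an NVL576 rack would hold 144 such packages.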

Beyond Vera Rubin, Nvidia also previewed a successor architecture called Feynman, expected in 2028. However, the company has not yet announced details about it.

DGX AI Personal Computer.

DGX comes in two versions, DGX Spark and DGX Station, which Nvidia bills as "desktop AI supercomputers". Built on Nvidia's Grace Blackwell chips, the machines are designed to let developers run large AI inference models locally instead of relying on large data-center systems. According to Reuters, the line is seen as a direct challenge to high-end PCs, particularly Apple's top-tier Macs.

The DGX Station desktop motherboard integrates Nvidia's Blackwell Ultra.

DGX Spark delivers up to 1,000 trillion operations per second (1,000 AI TOPS), supporting AI fine-tuning and inference with the latest AI reasoning models, including the Nvidia Cosmos Reason physical-AI model and the Nvidia GR00T N1 robot foundation model.

The more powerful DGX Station, with a massive 784 GB of memory, accelerates large-scale training and inference workloads. Its ConnectX-8 SuperNIC is optimized for hyperscale AI computing: with up to 800 Gb/s of network bandwidth, it can link multiple DGX Stations together for fast, efficient multi-node inference.
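To put those two numbers in perspective, a rough estimate (illustrative only, ignoring protocol overhead and assuming the theoretical line rate) of how long it would take to stream the DGX Station's full memory over one ConnectX-8 link:

```python
# Rough estimate: time to stream DGX Station's 784 GB of memory
# over an 800 Gb/s ConnectX-8 link at theoretical line rate,
# ignoring protocol overhead (illustrative arithmetic only).
memory_gb = 784    # DGX Station memory, gigabytes
link_gbps = 800    # ConnectX-8 line rate, gigabits per second

seconds = memory_gb * 8 / link_gbps  # gigabytes -> gigabits, then divide
print(f"Full-memory transfer: about {seconds:.2f} s")
```

At line rate, moving the entire 784 GB would take just under eight seconds, which suggests why such bandwidth matters when spreading inference across several stations.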

DGX Spark and DGX Station are manufactured by Asus, Boxx, Dell, HP, Lambda and Supermicro, and are available for pre-order and shipping later this year, but pricing has not been announced.

Spectrum-X and Quantum-X Silicon Photonic Network Chips.

Nvidia’s new pair of silicon photonic networking chips will enable “AI factories” like data centers to connect millions of GPUs across multiple locations while dramatically reducing power consumption. Spectrum-X accelerates AI networking performance by 1.6x compared to traditional Ethernet, while Quantum-X is the world’s first end-to-end, high-performance 800 Gb/s networking chip designed for large-scale AI.

Quantum-X Chip.

The new platforms, which pair Spectrum-4 Ethernet switches with BlueField-3 SuperNICs, deliver the highest performance for AI, machine learning, and natural language processing, as well as a wide range of industrial applications, according to Nvidia. Quantum-X is expected to be available later this year, with Spectrum-X following in 2026.

Dynamo Software.

Nvidia Dynamo is an open-source, low-latency, modular inference platform for serving generative AI models in distributed environments. It scales large GPU inference workloads seamlessly while intelligently routing requests, optimizing memory management, and transferring data efficiently. The goal is to accelerate reasoning-style inference, where an AI model "thinks" through multiple steps to answer a question rather than producing a single-shot answer.

Dynamo supports all popular large language model (LLM) inference and optimization frameworks and serves models such as DeepSeek's DeepSeek-R1 and Meta's Llama. The software is free to download.
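The routing idea behind such a platform can be sketched in a toy form: separating the compute-heavy "prefill" of a prompt from the memory-heavy token-by-token "decode" and assigning each to its own worker pool. All names below are hypothetical; this is a conceptual sketch of disaggregated serving, not Dynamo's actual API:

```python
# Toy sketch of disaggregated inference routing (conceptual only).
# Prefill workers process the whole prompt once; decode workers then
# generate output tokens step by step using the resulting KV cache.
from collections import deque

class Router:
    def __init__(self, prefill_workers, decode_workers):
        self.prefill = deque(prefill_workers)  # round-robin pool
        self.decode = deque(decode_workers)    # round-robin pool

    def dispatch(self, prompt):
        # 1) Pick a prefill worker and advance the round-robin queue.
        p = self.prefill[0]; self.prefill.rotate(-1)
        kv_cache = f"kv({prompt})"  # stand-in for the real KV cache
        # 2) Pick a decode worker for step-by-step token generation.
        d = self.decode[0]; self.decode.rotate(-1)
        return {"prefill_worker": p, "decode_worker": d, "kv": kv_cache}

router = Router(["pf0", "pf1"], ["dc0", "dc1", "dc2"])
print(router.dispatch("What is 2+2?"))
```

Splitting the two phases lets each pool be sized and scheduled for its own bottleneck (compute for prefill, memory bandwidth for decode), which is the general motivation behind low-latency distributed serving.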

Nvidia Isaac GR00T N1.

GR00T N1 is a model Nvidia designed for humanoid robots, billed as "the world's first open humanoid robot platform". It uses a dual-system architecture that lets robots think both fast and slow, much like reasoning AI models: "System 1" is a fast-thinking action model, akin to human reflexes or intuition, while "System 2" is a slow-thinking model for deliberate, methodical decisions.

Nvidia CEO Jensen Huang stands next to the Blue robot running the GR00T N1 on stage at the event.

The "skeleton" for GR00T N1 includes Newton, an open-source physics engine built for robotics and developed with Google DeepMind and Disney Research. The model readily generalizes across common tasks such as grasping, moving objects with one or both hands, and passing objects from one hand to the other, and it can perform multi-step tasks that require long context and combine skills like material handling, packaging, and inspection.

Humanoid robot developers with early access to GR00T N1 include Agility Robotics, Boston Dynamics, Mentee Robotics, and NEURA Robotics. Isaac GR00T N1 is expected to be generally available later this year.
