Nvidia GTC 2025 – Top Takeaways on AI Infrastructure, GPUs, and Robotics

At Nvidia GTC 2025, Jensen Huang spoke for almost two hours during the keynote and covered a wide range of topics. He had to address a mixed audience of investors, partners, and developers. Overall, I thought Jensen laid out a clear vision of how Nvidia will navigate the rapidly evolving AI landscape.
- It comes as no surprise that Nvidia wants to "own" the AI stack along with its major ecosystem partners. Today the full AI stack spans "chip to software," but it will clearly keep moving "up" toward the application layer in different ways. The full AI stack also includes multi-cloud and edge capabilities. Effectively, no matter where the data reside and where the outputs go, Nvidia wants to be there. If Nvidia had its way, I think it would push into data infrastructure through major acquisitions, but that would probably upset some of its partners today.
- It is also not surprising that Nvidia further emphasized its roadmap from Blackwell to Blackwell Ultra to Vera Rubin and Rubin Ultra. These major refreshes and upgrades are key to Nvidia's core revenue streams.
- Another key takeaway from the roadmap is that we should see even faster and cheaper models across the major model categories. For AI developers, the focus will shift from model cost to model performance in real-world applications (which is different from benchmark performance) AND to deployment paradigms that deliver the best user experience and meet the requirements of a particular use case.
- One of the most interesting parts of GTC for me was the discussion of robotics. It's clear that Nvidia wants to be the "AI stack for robotics." Jensen Huang actually walked through the pipeline in his quick discussion of the capabilities:
- AI-generated synthetic data to simulate real-world cases (compute intensive, and a very hard data problem)
- Simulation environments to train and test reinforcement learning and robotic systems through "digital twins" (compute intensive once we layer on the actual simulations); a minimal sketch of this loop follows the list below
- Nvidia's foundation model GR00T N1 is an "ensemble" of two models that do two different things in sequence: a vision-language model (i.e., a multimodal model) that takes in the environment, turns it into prompts, and reasons about what to do; and a second model, a diffusion transformer, that takes in the robot's state (i.e., the robot's features and characteristics) and turns it into a set of feasible actions informed by the reasoning done before (see the second sketch below).
- The model has been pre-trained, and AI developers can fine-tune it with custom datasets.
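To make the data-and-simulation stage concrete, here is a minimal Python sketch of a domain-randomized rollout loop. Everything in it is a hypothetical placeholder (TwinEnv, its parameters, the observation format); Nvidia's actual tooling here is Omniverse and Isaac Sim, and this only shows the shape of the idea: randomize the digital twin, roll a policy through it, and bank the trajectories as synthetic training data.

```python
import random

# Hypothetical stand-in for a randomized "digital twin"; this is not
# Isaac Sim's API, just the shape of a domain-randomized rollout loop.
class TwinEnv:
    def __init__(self, friction: float, lighting: float, max_steps: int = 200):
        self.friction = friction      # randomized physics parameter
        self.lighting = lighting      # randomized visual parameter
        self.max_steps = max_steps
        self.t = 0

    def reset(self) -> list[float]:
        self.t = 0
        return [0.0, 0.0]             # placeholder observation

    def step(self, action: list[float]):
        self.t += 1
        done = self.t >= self.max_steps
        return [0.0, 0.0], random.random(), done  # obs, reward, done

def collect_synthetic_rollouts(policy, episodes: int = 100) -> list[tuple]:
    """Generate training data by rolling a policy through many randomized twins."""
    data = []
    for _ in range(episodes):
        # Randomize each twin so the policy sees many variants of the world.
        env = TwinEnv(friction=random.uniform(0.2, 1.0),
                      lighting=random.uniform(0.1, 1.0))
        obs, done = env.reset(), False
        while not done:
            action = policy(obs)
            obs_next, reward, done = env.step(action)
            data.append((obs, action, reward, obs_next))
            obs = obs_next
    return data
```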
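And here is a minimal sketch of the dual-model policy itself, again with entirely hypothetical names and signatures (none of this is Nvidia's GR00T N1 API): a vision-language model reasons over camera frames and an instruction, and a diffusion transformer turns the robot's state plus that plan into a feasible chunk of actions.

```python
from dataclasses import dataclass, field

# Hypothetical placeholders throughout; only the shape of the two-model design.

@dataclass
class RobotState:
    joint_positions: list[float] = field(default_factory=lambda: [0.0] * 7)
    gripper_open: bool = True     # the robot's "features and characteristics"

class VisionLanguageModel:
    """Stands in for the multimodal reasoning model."""
    def reason(self, camera_frames: list[bytes], instruction: str) -> str:
        # Perceive the environment and produce a high-level plan as text.
        return "move to the red cup; grasp it; place it on the tray"

class ActionDiffusionTransformer:
    """Stands in for the diffusion-transformer action model."""
    def denoise_actions(self, state: RobotState, plan: str) -> list[list[float]]:
        # Iteratively denoise a chunk of continuous motor commands,
        # conditioned on the robot state and the VLM's plan.
        return [[0.0] * len(state.joint_positions)]

def control_step(vlm: VisionLanguageModel,
                 actor: ActionDiffusionTransformer,
                 frames: list[bytes],
                 instruction: str,
                 state: RobotState) -> list[list[float]]:
    plan = vlm.reason(frames, instruction)        # slow, deliberate reasoning
    return actor.denoise_actions(state, plan)     # fast, reactive motor output
```

The design choice worth noting is the split between slow, deliberate reasoning and fast, continuous control; fine-tuning with custom datasets, per the bullet above, would presumably mean updating these models' weights on your own demonstration data.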
It's not surprising that Nvidia is focused on robotics. Beyond the market opportunity and the secular trends of declining birth rates and labor shortages, robots are data hungry, compute intensive, often task-category specific, and just really hard to train, build, and maintain at a high level of performance. This falls exactly in Nvidia's sweet spot. Now, how to design the tasks, simulate, test, evaluate, and deploy so that humans and robots can co-exist is another story, and it opens new innovation opportunities.
If we think we are moving too furiously into agents that orchestrate LLMs to perform tasks, consider that we are also marching full speed into building robots.
Stay apace!