At ICRA 2024, Nvidia showcased a suite of research aimed at closing the sim-to-real gap for humanoid robots.

The work spans eight papers that cover navigation, grasping, perception, and planning.

By training policies in high-fidelity simulators, researchers can accumulate thousands of hours of experience in a fraction of the wall-clock time.

However, policies transferred to physical hardware often degrade because of dynamics mismatches and sensor noise.

To address this, Nvidia introduced Compass, which combines imitation learning with reinforcement learning across varied robot morphologies.

Compass-trained agents achieved an 80% success rate on navigation benchmarks, far exceeding imitation-only baselines.
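The article does not say how Compass blends its two learning signals. One generic way to combine imitation learning with reinforcement learning is a weighted sum of a behavior-cloning loss and a policy-gradient loss. The sketch below illustrates only that generic pattern for a discrete-action policy; the function names and the `il_weight` parameter are hypothetical and are not taken from Compass.

```python
import math


def softmax(logits):
    """Convert raw action scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def imitation_loss(logits, expert_action):
    """Cross-entropy between the policy and the expert's chosen action."""
    return -math.log(softmax(logits)[expert_action])


def policy_gradient_loss(logits, taken_action, advantage):
    """REINFORCE-style surrogate: raise the log-probability of actions
    that did better than expected (positive advantage)."""
    return -advantage * math.log(softmax(logits)[taken_action])


def combined_loss(logits, expert_action, taken_action, advantage, il_weight=0.5):
    """Weighted blend of the two signals; il_weight is an assumed knob."""
    il = imitation_loss(logits, expert_action)
    rl = policy_gradient_loss(logits, taken_action, advantage)
    return il_weight * il + (1.0 - il_weight) * rl


# Example: two actions, uniform policy, expert prefers action 0.
loss = combined_loss([0.0, 0.0], expert_action=0, taken_action=1, advantage=1.0)
```

Setting `il_weight` to 1.0 recovers pure imitation, while 0.0 recovers pure reinforcement learning, so the same function can express an imitation-only baseline.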

For manipulation, the Grasp-MPC framework uses reactive visual-proprioceptive feedback to adjust grasps on the fly.

Trained on simulated objects, Grasp-MPC raised grasp success from 41% to 75% on a physical manipulator.
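To make "adjusting on the fly" concrete, here is a toy receding-horizon loop in one dimension. At every step the controller re-reads the object position (standing in for vision) and the gripper position (standing in for proprioception), scores a set of candidate velocities over a short horizon, and applies only the first step of the best one. This is a generic model-predictive control sketch under simplified assumptions, not the Grasp-MPC implementation; every name and constant is hypothetical.

```python
# Candidate velocity commands, from -0.1 to 0.1 in steps of 0.005 (assumed units).
CANDIDATES = [i * 0.005 for i in range(-20, 21)]


def plan_step(gripper, target, horizon=3):
    """Return the candidate velocity with the lowest predicted tracking cost."""
    def cost(velocity):
        pos = gripper
        total = 0.0
        for _ in range(horizon):
            pos += velocity               # trivial dynamics model: pos += v
            total += (pos - target) ** 2  # penalize distance to the object
        return total
    return min(CANDIDATES, key=cost)


def run_reactive_grasp(gripper, object_positions):
    """Replan at every step using the latest observed object position."""
    trace = []
    for target in object_positions:
        gripper += plan_step(gripper, target)  # apply only the first action
        trace.append(gripper)
    return trace


# The object sits at 0.3, then shifts to 0.5 while the gripper is approaching.
trace = run_reactive_grasp(0.0, [0.3] * 5 + [0.5] * 15)
```

Because the plan is recomputed from fresh observations at each step, the gripper follows the object when it moves mid-reach instead of completing a stale trajectory.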

Sparr layers a lightweight residual controller, trained on a small set of real-world trajectories, on top of a simulation-trained policy to correct systematic bias.

This two-layer approach boosted task success by roughly 38% over pure sim-to-real transfer.
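The residual idea can be sketched in a few lines: keep the simulation-trained policy fixed and fit a small correction to the gap between its output and the actions observed on real hardware. The example below uses a one-dimensional linear least-squares fit as the residual. It is a minimal illustration of the general technique, not Sparr's architecture; `base_policy`, `fit_residual`, and the data are all hypothetical.

```python
def base_policy(obs):
    """Stand-in for a frozen sim-trained policy (hypothetical)."""
    return 2.0 * obs


def fit_residual(observations, real_actions):
    """Least-squares fit of residual(obs) = a * obs + b to the gap
    between real-world actions and the base policy's output."""
    errors = [act - base_policy(o) for o, act in zip(observations, real_actions)]
    n = len(observations)
    mean_o = sum(observations) / n
    mean_e = sum(errors) / n
    var = sum((o - mean_o) ** 2 for o in observations)
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(observations, errors))
    a = cov / var if var > 0 else 0.0
    b = mean_e - a * mean_o
    return lambda obs: a * obs + b


def corrected_policy(obs, residual):
    """Two-layer controller: frozen base action plus learned correction."""
    return base_policy(obs) + residual(obs)


# A handful of real samples in which the hardware needs 2.2 * obs + 0.3.
real_obs = [0.0, 1.0, 2.0, 3.0]
real_act = [2.2 * o + 0.3 for o in real_obs]
residual = fit_residual(real_obs, real_act)
action = corrected_policy(5.0, residual)
```

Only the two residual parameters are learned from real data, which is why a small set of trajectories can be enough to remove a consistent bias.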

Peek couples a convolutional backbone with a vision-language model to produce semantic annotations that improve perception robustness.

Together, these methods illustrate a simulation-first, adaptation-light strategy that could accelerate humanoid robot deployment.