operator22th/awesome-world-models-for-robots

awesome-world-models-for-robots

overview

  • World Models
  • arxiv 2024, 11, Understanding World or Predicting Future? A Comprehensive Survey of World Models. Paper.

benchmark

  • arxiv 2024, 03, HumanoidBench: Simulated Humanoid Benchmark for Whole-Body Locomotion and Manipulation. Paper. Website. 15 whole-body manipulation and 12 locomotion tasks. This repo contains the code for environments and training.

dataset

  • AgiBot World Website. 1 million+ trajectories from 100 robots.
  • LeRobotDataset Website. Models, datasets, and tools for real-world robotics in PyTorch.
  • 1xgpt Website.
  • OXE Paper.

models

  • Cosmos Website. Paper. Autoregressive Video2World/Text2World foundation models.

toolbox

  • Menagerie. Website. Robot models for the MuJoCo physics engine. The system identification toolbox has not been released (as of 2025.01).
  • MuJoCo Playground. Website. Paper. Training environments in MJX. Humanoid locomotion, quadruped locomotion, and manipulation (most robot arms and hands) tasks are included.

papers

  • arxiv 2025, 03, Multi-Stage Manipulation with Demonstration-Augmented Reward, Policy, and World Model Learning. Paper. Website.
  • ICRA 2024, MoDem-V2: Visuo-Motor World Models for Real-World Robot Manipulation. Paper.
  • CoRL 2024, Multi-Task Interactive Robot Fleet Learning with Visual World Models. Paper. Visual world model for anomaly detection.
  • CVPR 2023, Affordances from Human Videos as a Versatile Representation for Robotics. Paper. Predicts contact points and trajectory waypoints, then uses them for downstream tasks (suitable for different learning paradigms).
  • RSS 2023, Structured World Models from Human Videos. Paper. Robot arm manipulation tasks. World models with a structured action space design.
  • RSS 2024, HRP: Human Affordances for Robotic Pre-Training. Paper.
  • arxiv 2025, 02, Strengthening Generative Robot Policies through Predictive World Modeling. Paper. Strengthens imitation learning with a world model.
  • arxiv 2024, 11, DINO-WM: World Models on Pre-trained Visual Features enable Zero-shot Planning. Website. World model for MPC. DINOv2 for representation.
  • CoRL 2022, DayDreamer: World Models for Physical Robot Learning. Paper.
  • CoRL 2023 (Oral), Finetuning Offline World Models in the Real World. Website. Paper. Offline pretraining and online finetuning of world models. Robot arm manipulation tasks.
  • arxiv 2025, 01, Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics. Paper. MBPO sim2real using world models. Quadruped locomotion tasks.
  • ICLR 2024 (Outstanding Paper), UniSim: Learning Interactive Real-World Simulators Website.
  • ICLR 2024, Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation Website.

workshop

  • ICML 2024, Multi-modal Foundation Model meets Embodied AI Website.
  • ICLR 2025, Generative Models for Robot Learning. Website.
  • ICLR 2025, World Models. Website.

related: World Models

  • Leo Fan's List. Website.
  • arxiv 2024, 05, Hierarchical World Models as Visual Whole-Body Humanoid Controllers. Website.
  • ICML 2024, Offline Transition Modeling via Contrastive Energy Learning. Code.
  • ICML 2024, 3D-VLA: A 3D Vision-Language-Action Generative World Model. Paper.
  • ICML 2024 (Oral), Genie: Generative Interactive Environments. Paper.
  • ICML 2024 (Oral), Learning to Model the World with Language. Paper. Website.
  • 2024, 12, Genie2 Blog.

related: LLM as WM

  • ICLR 2025, Monte Carlo Planning with Large Language Model for Text-Based Games. Paper.
  • arxiv 2024, AgentGym: Evolving Large Language Model-based Agents across Diverse Environments. Paper. Code.
  • NeurIPS 2023, Language Models Meet World Models: Embodied Experiences Enhance Language Models. Paper. Openreview.
  • NeurIPS 2023, Large Language Models as Commonsense Knowledge for Large-Scale Task Planning. Website. Paper.
  • NeurIPS 2023, ChessGPT: Bridging Policy Learning and Language Modeling. Paper. Code.

related: Transfer Learning

  • arxiv 2022, 01, Transferability in Deep Learning: A Survey. Paper.

related: Robotics & Foundation models

  • RSS 2024, OK-Robot: What Really Matters in Integrating Open-Knowledge Models for Robotics. Paper.
  • CoRL 2023 (Oral), VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models. Website.
  • ICLR 2024, Zero-Shot Robotic Manipulation with Pretrained Image-Editing Diffusion Models. Paper.
  • ICRA 2025, WildLMa: Long Horizon Loco-Manipulation in the Wild. Website.
  • arxiv 2024, 12, NaVILA: Legged Robot Vision-Language-Action Model for Navigation. Website.
  • arxiv 2024, 10, GenSim2: Scaling Robot Data Generation with Multi-modal and Reasoning LLMs. Paper.

related: Robotics & Vision-based RL

  • CoRL 2022 (Oral), Deep Whole-Body Control: Learning a Unified Policy for Manipulation and Locomotion. Paper.
  • CoRL 2022 (Oral), Legged Locomotion in Challenging Terrains using Egocentric Vision. Paper.
  • ICML 2023 (Oral), Efficient RL via Disentangled Environment and Agent Representations. Website.
  • CoRL 2022, VideoDex: Learning Dexterity from Internet Videos. Website.
  • CVPR 2022, Coupling Vision and Proprioception for Navigation of Legged Robots. Paper.
  • CoRL 2024, Continuously Improving Mobile Manipulation with Autonomous Real-World RL. Paper. Mobile Manipulation.
  • RSS 2023, Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials. Paper.
  • CoRL 2024, Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance. Website.

related: Robotics & Visual representations

  • NeurIPS 2024, DynaMo: In-Domain Dynamics Pretraining for Visuo-Motor Control. Paper.
  • RSS 2024, HRP: Human Affordances for Robotic Pre-Training. Paper.
  • ICML 2023 (Oral), Efficient RL via Disentangled Environment and Agent Representations. Website.
  • CVPR 2023, Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture. Paper.
  • ICML 2022, On Pre-Training for Visuo-Motor Control: Revisiting a Learning-from-Scratch Baseline. Paper.
  • IROS 2023, Visual Reinforcement Learning with Self-Supervised 3D Representations. Paper.

related: Generative models for Decision-Making

  • arxiv 2025, 02, History-Guided Video Diffusion. Website. Paper.
  • arxiv 2025, 01, Inference-Time Alignment in Diffusion Models with Reward-Guided Generation: Tutorial and Review. Paper.
  • arxiv 2024, 05, Bridging Model-Based Optimization and Generative Modeling via Conservative Fine-Tuning of Diffusion Models. Paper.
  • NeurIPS 2024, Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion. Website.
  • ICML 2022, Learning Iterative Reasoning through Energy Minimization. Paper.
  • ICRA 2023, NoMaD: Goal Masked Diffusion Policies for Navigation and Exploration. Paper.
  • ICML 2024, Video as the New Language for Real-World Decision Making. Paper.

related: Generative simulation

  • arxiv 2024, 06, RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots. Paper.

related: RL in the Real World

  • arxiv 2021, 02, NeoRL: A Near Real-World Benchmark for Offline Reinforcement Learning. Paper. Website.
