
🤖 Awesome Embodied Robotics and Agent

This is a curated list of research on embodied robotics and agents built with Vision-Language Models (VLMs) and Large Language Models (LLMs), maintained by haonan.

Watch this repository for the latest updates, and feel free to open pull requests if you find interesting papers!

News 🔥

[2025/10/30] 🎉 Our survey paper "A Survey on Efficient Vision-Language-Action Models" [arXiv] has been released!
[2025/04/23] Added π-0.5, a lightweight and modular framework designed to integrate perception, control, and learning directly within physical systems.
[2025/03/18] Added some popular vision-language-action (VLA) models. 🦾
[2024/06/28] Created a new section on agent self-evolution research. 🤖
[2024/06/07] Added Mobile-Agent-v2, a mobile device operation assistant that navigates effectively via multi-agent collaboration. 🚀
[2024/05/13] Added "Learning Interactive Real-World Simulators", winner of an outstanding paper award at ICLR 2024. 🥇
[2024/04/24] Added "A Survey on Self-Evolution of Large Language Models", a systematic survey on self-evolution in LLMs! 💥
[2024/04/16] Added some CVPR 2024 papers.
[2024/04/15] Added MetaGPT, accepted for oral presentation (top 1.2%) at ICLR 2024 and ranked #1 in the LLM-based Agent category. 🚀
[2024/03/13] Added CRADLE, an interesting paper exploring LLM-based agents in Red Dead Redemption II! 🎮

Development of Embodied Robotics and Benchmarks

Figure 1. The Organization of Our Survey. We systematically categorize efficient VLAs into three core pillars: (1) Efficient Model Design, encompassing efficient architectures and model compression techniques, (2) Efficient Training, covering efficient pre-training and post-training strategies, and (3) Efficient Data Collection, including efficient data collection and augmentation methods. The framework also reviews VLA foundations, key applications, challenges, and future directions, establishing the groundwork for advancing scalable embodied intelligence.

Table of Contents 🍃

Methods

Survey

Vision-Language-Action Model

Self-Evolving Agents

Advanced Agent Applications

LLMs with RL or World Model

Planning and Manipulation or Pretraining

Multi-Agent Learning and Coordination

Vision and Language Navigation

Detection

  • DetGPT: Detect What You Need via Reasoning [arXiv 2023]
    Renjie Pi, Jiahui Gao, Shizhe Diao, Rui Pan, Hanze Dong, Jipeng Zhang, Lewei Yao, Jianhua Han, Hang Xu, Lingpeng Kong, Tong Zhang
    The Hong Kong University of Science and Technology; The University of Hong Kong; Shanghai Jiao Tong University

3D Grounding

Interactive Embodied Learning

Rearrangement

Benchmark

Simulator

Others

Acknowledgments

Thanks to everyone who has contributed to this repository! Special thanks to those who submitted PRs with solid work; your efforts make this project better and stronger. 🚀✨
