Read this in other languages: English, 中文.
A lightweight version of OpenVLA, built on the LLaVA-NeXT framework.
Features:
1. The main model is only 0.5B parameters, using Qwen2-0.5B.
2. Does not use RLDS-format datasets; fine-tuning is done with mLLM-format data instead.
3. Does not occupy the LLM's inherent tokens; actions are represented by 271 additional tokens instead (see the sketch after this list).
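For intuition, here is a minimal sketch of how the extra action tokens could encode a continuous action. The README only states that 271 additional tokens are used; the uniform binning over [-1, 1] and the token names `<ACT_0>` … `<ACT_270>` below are assumptions for illustration, not the repository's confirmed scheme.

```python
# Illustrative sketch only: the 271-token count comes from the README, but the
# normalization range, binning, and token names are assumptions.
import numpy as np

NUM_ACTION_TOKENS = 271  # number of additional action tokens stated in the README
ACTION_TOKENS = [f"<ACT_{i}>" for i in range(NUM_ACTION_TOKENS)]  # hypothetical names

def encode_action(action, low=-1.0, high=1.0):
    """Map each continuous action dimension to one of the added action tokens
    via uniform binning over [low, high] (assumed normalization range)."""
    action = np.clip(np.asarray(action, dtype=np.float32), low, high)
    bins = np.round((action - low) / (high - low) * (NUM_ACTION_TOKENS - 1)).astype(int)
    return [ACTION_TOKENS[b] for b in bins]

# Example: a 7-DoF end-effector action (dx, dy, dz, droll, dpitch, dyaw, gripper)
print(encode_action([0.1, -0.2, 0.05, 0.0, 0.0, 0.3, 1.0]))
# -> ['<ACT_148>', '<ACT_108>', '<ACT_142>', '<ACT_135>', '<ACT_135>', '<ACT_176>', '<ACT_270>']
```

Because the action tokens are newly added to the vocabulary, the base language model's original tokens stay untouched.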
Follow the steps below to install the OpenVLA-Qwen lightweight version.
Clone the repository to your local machine:
git clone https://github.com/IP127000/openVLA-Qwen2-0.5B.git
cd openVLA-Qwen2-0.5B
conda create -n vla-qwen python=3.10 -y
conda activate vla-qwen
pip install --upgrade pip
pip install -e ".[train]"
From the repository root (openVLA-Qwen2-0.5B), download the LLaVA-OneVision base model:
mkdir models
cd models
git clone https://huggingface.co/lmms-lab/llava-onevision-qwen2-0.5b-ov
In scripts/train/finetune_ov_vla.sh, point the fine-tuning script at this checkpoint and at your data:
PREV_STAGE_CHECKPOINT="models/llava-onevision-qwen2-0.5b-ov"
--data_path scripts/train/vla.yaml \
--image_folder /images \
In scripts/train/vla.yaml, set json_path to your converted training data (an example record is sketched below):
json_path: /root/data/vla_llava.json
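For illustration, one record in the training JSON might look like the sketch below. The `image`/`conversations` schema follows the common LLaVA conversation format, and the action-token strings follow the encoding sketch above; both are assumptions, so check the repository's data-conversion scripts for the exact schema.

```python
# Illustrative sketch: writes a single training record in a LLaVA-style
# conversation schema. Field names, the sample id, the image path, and the
# action tokens are assumptions for illustration only.
import json

record = {
    "id": "episode_0001_step_0042",            # hypothetical sample id
    "image": "episode_0001/step_0042.jpg",     # relative to --image_folder (/images)
    "conversations": [
        {"from": "human",
         "value": "<image>\nWhat action should the robot take to pick up the red block?"},
        {"from": "gpt",
         "value": "<ACT_148> <ACT_108> <ACT_142> <ACT_135> <ACT_135> <ACT_176> <ACT_270>"},
    ],
}

# The README's vla.yaml points json_path at /root/data/vla_llava.json;
# write the list of records to that location (a local file is used here).
with open("vla_llava.json", "w") as f:
    json.dump([record], f, indent=2)
```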
Then launch fine-tuning:
cd /openVLA-Qwen2-0.5B
./scripts/train/finetune_ov_vla.sh
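At inference time the fine-tuned model emits action tokens as text. A hedged sketch of turning generated tokens back into continuous values, mirroring the assumed encoding above (again, not the repository's confirmed scheme):

```python
# Illustrative sketch: parse <ACT_i> tokens out of generated text and map each
# bin index back into the assumed [-1, 1] range used in the encoding sketch.
import re

NUM_ACTION_TOKENS = 271
LOW, HIGH = -1.0, 1.0

def tokens_to_action(generated_text):
    """Extract action tokens from the model output and return continuous values."""
    idxs = [int(i) for i in re.findall(r"<ACT_(\d+)>", generated_text)]
    return [LOW + i / (NUM_ACTION_TOKENS - 1) * (HIGH - LOW) for i in idxs]

print(tokens_to_action("<ACT_148> <ACT_108> <ACT_142>"))  # -> [~0.096, -0.2, ~0.052]
```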
This project makes use of the following open-source projects, and I would like to express my gratitude to their creators:
- OpenVLA - For providing the core structure and functionality that inspired OpenVLA-Qwen.
- LLaVA_OpenVLA - For contributions to data preprocessing techniques.