¹Harbin Institute of Technology, Shenzhen; ²Huawei Noah’s Ark Lab
*Corresponding author
Annual Meeting of the Association for Computational Linguistics (ACL) 2025
🔥 Details will be released. Stay tuned 🍻 👍
- [05/2025] Project Page released.
- [05/2025] arXiv paper released.
- [05/2025] Code released.
This is the GitHub repository for GUI-explorer: Autonomous Exploration and Mining of Transition-aware Knowledge for GUI Agent. GUI-explorer combines two key components: (1) Autonomous Exploration of Function-Aware Trajectory; (2) Unsupervised Mining of Transition-Aware Knowledge.
The overview of the proposed GUI-explorer:
git clone https://github.com/JiuTian-VL/GUI-explorer.git
cd GUI-explorer
mkdir knowledge_base
cd knowledge_base
wget https://github.com/JiuTian-VL/GUI-explorer/releases/download/knowledge_base/knowledge_data.pkl

cd GUI-explorer
conda create -n GUI_explorer python=3.12 -y
conda activate GUI_explorer
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
pip install -r requirements.txt

Duplicate .env.example and rename it to .env. Then, in the .env file, fill in your OPENAI_API_KEY.
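Before starting the servers, it can help to confirm that the key was actually filled in. The following is a minimal sketch, assuming `.env` uses plain `KEY=value` lines; the helper name `read_env_value` is hypothetical and not part of the repository.

```python
from pathlib import Path


def read_env_value(env_path, key):
    """Return the value assigned to `key` in a simple KEY=value file, or None."""
    for raw_line in Path(env_path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            # Drop optional surrounding quotes.
            return value.strip().strip("\"'")
    return None


if __name__ == "__main__":
    env_file = Path(".env")
    if not env_file.exists():
        print(".env not found; copy .env.example first")
    elif not read_env_value(env_file, "OPENAI_API_KEY"):
        print("OPENAI_API_KEY is missing or empty in .env")
    else:
        print("OPENAI_API_KEY is set")
```

Run it from the repository root so that the relative `.env` path resolves.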
# Open a new shell window and run
cd GUI-explorer
conda activate GUI_explorer
python -m utils.embedding_pipeline
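The retrieval server should only be started once the embedding pipeline is ready. If the pipeline exposes a TCP endpoint (an assumption; the address it listens on is not stated here), a small polling helper can automate the wait. This is a sketch, not part of the repository: `wait_for_port` is a hypothetical helper, and the host and port must be supplied from your own configuration.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=0.5):
    """Poll until a TCP connection to (host, port) succeeds or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not reachable yet; wait briefly and retry.
            time.sleep(interval)
    return False
```

Call it with the host and port used by your embedding pipeline, and launch `python -m utils.retrieval` only after it returns `True`.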
# Open a new shell window and run (wait for embedding_pipeline to finish starting up first)
cd GUI-explorer
conda activate GUI_explorer
python -m utils.retrieval

# After preparing the API servers
cd GUI-explorer
conda activate GUI_explorer
python exploration_and_mining.py -device_serial emulator-5554 -max_branching_factor 10 -max_exploration_steps 30 -max_exploration_depth 5 -package_name net.osmand
# After the knowledge_base is updated, restart `python -m utils.retrieval` to load the new knowledge_base

device_serial can be obtained by running adb devices. (If that does not work, follow the Setup section in this tutorial.)
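To pick a serial programmatically, the output of `adb devices` can be parsed. The following is a minimal sketch, assuming the usual `serial<TAB>state` lines that follow the header; `parse_adb_devices` and `list_connected_devices` are hypothetical helpers, not part of the repository.

```python
import subprocess


def parse_adb_devices(output):
    """Return serials of devices whose state is 'device' (connected and authorized)."""
    serials = []
    for line in output.splitlines():
        parts = line.split()
        # Device lines look like "emulator-5554\tdevice"; the header line and
        # adb daemon messages do not have "device" as their second field.
        if len(parts) >= 2 and parts[1] == "device":
            serials.append(parts[0])
    return serials


def list_connected_devices():
    """Run `adb devices` and return the serials that are ready to use."""
    result = subprocess.run(
        ["adb", "devices"], capture_output=True, text=True, check=True
    )
    return parse_adb_devices(result.stdout)
```

Devices reported as `unauthorized` or `offline` are skipped on purpose, since they cannot be driven until they are authorized or reconnected.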
package_name can be obtained from the app's link on the app store. For example, in https://play.google.com/store/apps/details?id=net.osmand, net.osmand is the package_name for this app.
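The package name can also be extracted from the store link in code and passed to the exploration script. This is a minimal sketch: `package_name_from_url` and `build_exploration_command` are hypothetical helpers, and the flag values simply repeat the ones from the example command above.

```python
from urllib.parse import parse_qs, urlparse


def package_name_from_url(url):
    """Return the `id` query parameter of a Google Play link, or None if absent."""
    values = parse_qs(urlparse(url).query).get("id")
    return values[0] if values else None


def build_exploration_command(package_name, device_serial):
    """Assemble the exploration command using the flag values shown above."""
    return [
        "python", "exploration_and_mining.py",
        "-device_serial", device_serial,
        "-max_branching_factor", "10",
        "-max_exploration_steps", "30",
        "-max_exploration_depth", "5",
        "-package_name", package_name,
    ]


url = "https://play.google.com/store/apps/details?id=net.osmand"
print(package_name_from_url(url))  # net.osmand
```

The returned list can be handed to `subprocess.run` from the repository root once the API servers are running.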
# After preparing the API servers
# Connect an Android device to this computer and make sure you can see it in `adb devices`.
# Open a new shell window and run
cd GUI-explorer
conda activate GUI_explorer
python -m demo.demo_web_backend
# Open a new shell window and run
cd GUI-explorer
conda activate GUI_explorer
python -m demo.demo_agent_backend
# Open a new shell window and run
cd GUI-explorer/demo/demo_web_frontend
pnpm install
pnpm run dev

Open http://localhost:5173 in your browser.
You should be able to see something like this:
Table 1: Main Result of GUI-explorer on SPA-Bench single-app English Level 3 tasks.

Table 2: Main Result of GUI-explorer on AndroidWorld tasks.

Table 3: Main Result of GUI-explorer on GUI-KRB.

| Instruction | Video |
|---|---|
| Open Google Chrome and search for today's weather in Shenzhen. Carefully observe the screen and record the current weather conditions. Then, in Markor, create a note named "today.md" and write the temperature read from the webpage into it. | Multi-Apps-Task-1.mp4 |
| Get the search results for stay tonight near 'wembley stadium' for 1 adult. Add one result to wishlist. Confirm that this item is in the wishlist. | Single-App-Task-1.mp4 |
If you find this work useful for your research, please kindly cite our paper:
@inproceedings{xie2025gui,
title={GUI-explorer: Autonomous Exploration and Mining of Transition-aware Knowledge for GUI Agent},
author={Bin Xie and Rui Shao and Gongwei Chen and Kaiwen Zhou and Yinchuan Li and Jie Liu and Min Zhang and Liqiang Nie},
booktitle={Annual Meeting of the Association for Computational Linguistics (ACL)},
year={2025}
}


