This repository was archived by the owner on Apr 7, 2026. It is now read-only.

iLearn-Lab/ACL25-GUI-explorer


GUI-explorer: Autonomous Exploration and Mining of Transition-aware Knowledge for GUI Agent

1Harbin Institute of Technology, Shenzhen, 2Huawei Noah’s Ark Lab
*Corresponding author

Annual Meeting of the Association for Computational Linguistics (ACL) 2025

[Paper] [Code] [Project Page]

🔥 Details will be released. Stay tuned 🍻 👍

If you find this work useful for your research, please kindly cite our paper and star our repo.

Updates

Introduction

This is the GitHub repository for GUI-explorer: Autonomous Exploration and Mining of Transition-aware Knowledge for GUI Agent. GUI-explorer combines two key components: (1) Autonomous Exploration of Function-Aware Trajectory; (2) Unsupervised Mining of Transition-Aware Knowledge.

The overview of the proposed GUI-explorer:

Installation

Download

git clone https://github.com/JiuTian-VL/GUI-explorer.git
cd GUI-explorer
mkdir knowledge_base
cd knowledge_base
wget https://github.com/JiuTian-VL/GUI-explorer/releases/download/knowledge_base/knowledge_data.pkl

Environment

# Run from the directory that contains the cloned GUI-explorer folder
cd GUI-explorer
conda create -n GUI_explorer python=3.12 -y
conda activate GUI_explorer
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
pip install -r requirements.txt

Duplicate .env.example and rename it to .env. Then, in the .env file, fill in your OPENAI_API_KEY.
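
Before starting the servers, it can help to confirm that the key is actually present in .env. The following is a minimal sketch using only the Python standard library. The helpers `read_env_file` and `has_openai_key` are hypothetical and not part of this repository, and the sketch assumes the file contains plain `KEY=VALUE` lines.

```python
from pathlib import Path


def read_env_file(path: str = ".env") -> dict:
    """Parse simple KEY=VALUE lines (hypothetical helper, not part of GUI-explorer)."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything without an '=' sign.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_openai_key(path: str = ".env") -> bool:
    """Return True when OPENAI_API_KEY is set to a non-empty value."""
    return bool(read_env_file(path).get("OPENAI_API_KEY"))
```

Calling `has_openai_key()` from the repository root returns `False` if the key line is missing or left empty.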

Usage

Prepare API servers

# Open a new shell window and run
cd GUI-explorer
conda activate GUI_explorer
python -m utils.embedding_pipeline

# Open a new shell window and run (wait until embedding_pipeline has finished starting up)
cd GUI-explorer
conda activate GUI_explorer
python -m utils.retrieval 
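
The retrieval server has to wait for the embedding pipeline, so a small readiness check can replace watching the first shell by hand. This is only a sketch under an assumption: it presumes the embedding pipeline listens on a TCP port, which this README does not state. The helper `wait_for_port` is hypothetical, and the host and port must be taken from your own configuration.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 0.5) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or the timeout elapses.

    Hypothetical helper; the port used by embedding_pipeline is an assumption
    and has to be looked up in your local setup.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

A script could call `wait_for_port(host, port)` with the values from your setup and start `python -m utils.retrieval` only once it returns `True`.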

Exploration

# After preparing the API servers
cd GUI-explorer
conda activate GUI_explorer
python exploration_and_mining.py -device_serial emulator-5554 -max_branching_factor 10 -max_exploration_steps 30 -max_exploration_depth 5 -package_name net.osmand
# After the knowledge_base has been updated, restart `python -m utils.retrieval` so that it loads the new knowledge_base

device_serial can be obtained by running adb devices. If no device is listed, follow the Setup section in this tutorial.
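
To pick the serial programmatically, the output of adb devices can be parsed. The sketch below assumes the usual output layout, in which each connected device appears as a serial followed by its state, and keeps only entries in the `device` state. The function names are hypothetical and not part of this repository.

```python
import subprocess


def parse_adb_devices(output: str) -> list:
    """Return the serials of entries whose state is 'device' (online and authorized)."""
    serials = []
    for line in output.splitlines():
        parts = line.split()
        # Header and daemon-startup lines never have 'device' as the second field.
        if len(parts) >= 2 and parts[1] == "device":
            serials.append(parts[0])
    return serials


def list_connected_devices() -> list:
    """Run `adb devices` and parse its output; requires adb on PATH."""
    result = subprocess.run(["adb", "devices"], capture_output=True, text=True, check=True)
    return parse_adb_devices(result.stdout)
```

Entries reported as `unauthorized` or `offline` are skipped on purpose, because such a device cannot be driven until its state changes to `device`.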

package_name can be obtained from the app's link on the app store. For example, in https://play.google.com/store/apps/details?id=net.osmand, net.osmand is the package_name for this app.
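
The same lookup can be done in code by reading the `id` query parameter from the store link. This is a minimal sketch with the standard library; the function name `package_name_from_play_url` is hypothetical.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def package_name_from_play_url(url: str) -> Optional[str]:
    """Return the value of the `id` query parameter, or None if it is absent."""
    query = parse_qs(urlparse(url).query)
    ids = query.get("id")
    return ids[0] if ids else None


if __name__ == "__main__":
    print(package_name_from_play_url("https://play.google.com/store/apps/details?id=net.osmand"))
    # prints: net.osmand
```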

Demo

# After preparing the API servers
# Connect an Android device to this computer and make sure you can see it in `adb devices`.
# Open a new shell window and run
cd GUI-explorer
conda activate GUI_explorer
python -m demo.demo_web_backend

# Open a new shell window and run
cd GUI-explorer
conda activate GUI_explorer
python -m demo.demo_agent_backend

# Open a new shell window and run
cd GUI-explorer/demo/demo_web_frontend
pnpm install
pnpm run dev

Open http://localhost:5173 in your browser.

You should be able to see something like this:

[Image: web-demo]

Evaluation Results

Table 1: Main results of GUI-explorer on SPA-Bench single-app English Level 3 tasks.

Table 2: Main results of GUI-explorer on AndroidWorld tasks.

Table 3: Main results of GUI-explorer on GUI-KRB.

Showcases

Instruction: Open Google Chrome and search for today's weather in Shenzhen. Carefully observe the screen and record the current weather conditions. Then, in Markor, create a note named "today.md" and write the temperature read from the webpage into it.
Video: Multi-Apps-Task-1.mp4

Instruction: Get the search results for stay tonight near 'wembley stadium' for 1 adult. Add one result to wishlist. Confirm that this item is in the wishlist.
Video: Single-App-Task-1.mp4

More Examples

[Image: compare]

Citation

If you find this work useful for your research, please kindly cite our paper:

@inproceedings{xie2025gui,
    title={GUI-explorer: Autonomous Exploration and Mining of Transition-aware Knowledge for GUI Agent}, 
    author={Bin Xie and Rui Shao and Gongwei Chen and Kaiwen Zhou and Yinchuan Li and Jie Liu and Min Zhang and Liqiang Nie},
    booktitle={Annual Meeting of the Association for Computational Linguistics (ACL)},
    year={2025}
}
