G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation

Tianxing Chen^*, Yao Mu^{* †}, Zhixuan Liang^*, Zanxin Chen, Shijia Peng, Qiangyu Chen, Mingkun Xu, Ruizhen Hu, Hongyuan Zhang, Xuelong Li, Ping Luo^†.

Project Page | PDF | arXiv

📚 Overview

We present G3Flow, a novel approach that leverages foundation models to generate and maintain 3D semantic flow for enhanced robotic manipulation.

🛠️ Installation

See INSTALLATION.md for installation instructions. Installation takes about 30 minutes.

🧑🏻‍💻 Usage

1. Collect Expert Data

This step collects expert data on RoboTwin. For each task, it collects 100 sets of data, including point cloud and RGBD data.

Supported values for ${task_name}: bottle_adjust_T, bottle_adjust_G, diverse_bottles_pick_G, shoe_place_T, shoe_place_G, shoes_place_T, shoes_place_G, tool_adjust_T, tool_adjust_G.

cd RoboTwin_Benchmark
bash run_task.sh ${task_name} ${gpu_id}
cd ..
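
To collect data for several tasks in one go, the commands above can be wrapped in a loop. The following is a minimal sketch, not part of the repository: `is_known_task`, `collect_all`, and the `DRY_RUN` variable are hypothetical names introduced here, and it assumes the script is sourced from the repository root and that `run_task.sh` takes the task name and GPU id as shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): collect expert data for several tasks.
# The task names are copied from the list of supported tasks above.
TASKS="bottle_adjust_T bottle_adjust_G diverse_bottles_pick_G shoe_place_T shoe_place_G shoes_place_T shoes_place_G tool_adjust_T tool_adjust_G"

# is_known_task NAME: succeed only if NAME is one of the supported task names.
is_known_task() {
    local t
    for t in $TASKS; do
        if [ "$t" = "$1" ]; then
            return 0
        fi
    done
    return 1
}

# collect_all GPU_ID TASK...: run data collection for each TASK on GPU_ID.
# With DRY_RUN=1 the commands are printed instead of executed.
collect_all() {
    local gpu_id="$1"
    shift
    local task
    for task in "$@"; do
        if ! is_known_task "$task"; then
            echo "unknown task: $task" >&2
            return 1
        fi
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "bash run_task.sh $task $gpu_id"
        else
            # The subshell keeps the working directory unchanged for the caller.
            (cd RoboTwin_Benchmark && bash run_task.sh "$task" "$gpu_id") || return 1
        fi
    done
}
```

For example, `DRY_RUN=1 collect_all 0 bottle_adjust_T shoe_place_G` prints the two collection commands without running them, which is a cheap way to catch a misspelled task name before a long run.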

2. Process Data

This step processes the raw data to obtain the G3Flow data for each timestep and fits a PCA model. The n_components parameter is the target dimensionality of the PCA dimensionality reduction.

bash process_data.sh ${task_name} ${expert_data_num} ${n_components} ${gpu_id}

The processed data will be stored in the G3FlowDP/data directory, and the obtained PCA model will be stored in the G3FlowDP/PCA_model directory.

3. Train G3Flow-based Policy

bash train.sh ${task_name} ${expert_data_num} ${n_components} ${seed} ${gpu_id}

4. Evaluate G3Flow-based Policy

bash eval.sh ${task_name} ${expert_data_num} ${n_components} ${seed} ${gpu_id}
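
Steps 2 to 4 share the same task name, expert data count, and n_components, and training and evaluation additionally share a seed. The sketch below chains the three scripts so these arguments stay consistent. It is not part of the repository: `run_step`, `run_pipeline`, and `DRY_RUN` are hypothetical names, and it assumes the scripts accept exactly the arguments shown above and are run from the repository root.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of the repository): run steps 2-4 with consistent arguments.

# run_step CMD...: print the command when DRY_RUN=1, otherwise execute it.
run_step() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# run_pipeline TASK_NAME EXPERT_DATA_NUM N_COMPONENTS SEED GPU_ID
run_pipeline() {
    if [ "$#" -ne 5 ]; then
        echo "usage: run_pipeline task_name expert_data_num n_components seed gpu_id" >&2
        return 2
    fi
    local task="$1"
    local num="$2"
    local ncomp="$3"
    local seed="$4"
    local gpu="$5"
    # Each step runs only if the previous one succeeded, so training
    # is never started on data that failed to process.
    run_step bash process_data.sh "$task" "$num" "$ncomp" "$gpu" &&
    run_step bash train.sh "$task" "$num" "$ncomp" "$seed" "$gpu" &&
    run_step bash eval.sh "$task" "$num" "$ncomp" "$seed" "$gpu"
}
```

For example, `DRY_RUN=1 run_pipeline shoe_place_T 100 5 0 0` prints the three commands in order. The values 5 (n_components) and 0 (seed and GPU id) are illustrative only and are not recommended settings from the authors.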

👍 Citation

If you find our work useful, please consider citing:

@InProceedings{Chen_2025_CVPR,
    author    = {Chen, Tianxing and Mu, Yao and Liang, Zhixuan and Chen, Zanxin and Peng, Shijia and Chen, Qiangyu and Xu, Mingkun and Hu, Ruizhen and Zhang, Hongyuan and Li, Xuelong and Luo, Ping},
    title     = {G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {1735-1744}
}

😺 Acknowledgement

Our code is built upon Diffusion Policy, FoundationPose, Grounded-SAM, and DP3. We thank the authors for open-sourcing their code and for their contributions to the community.

Contact Tianxing Chen if you have any questions or suggestions.

🏷️ License

This repository is released under the MIT license. See LICENSE for additional details.
