# PhysGen3D: Crafting a Miniature Interactive World from a Single Image

CVPR, 2025
Boyuan Chen · Hanxiao Jiang · Shaowei Liu · Saurabh Gupta · Yunzhu Li · Hao Zhao · Shenlong Wang




This repository contains the implementation for the paper *PhysGen3D: Crafting a Miniature Interactive World from a Single Image*, CVPR 2025. In this paper, we present a novel framework that transforms a single image into an amodal, camera-centric, interactive 3D scene.

## Overview


## Structure

The top-level folders are existing third-party packages used in the project. The `engine` folder contains the core of taichi-elements.

Run `perception.py` to run the perception part.

Run `ball_sim.py` or `mpm_sim.py` to run several demos of the MPM method.

## Installation

```bash
conda create -y -n phys python=3.10
conda activate phys
git clone --recurse-submodules [email protected]:by-luckk/PhysGen3D.git
cd PhysGen3D
bash env_install/env_install.sh
bash env_install/download_pretrained.sh
```

## Usage

The examples below are provided for the demo images in `data/img/`. `teddy.jpg` can be substituted with any other image; `${name}` is the name of the image.
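For clarity, `${name}` is simply the image filename without its extension (the teddy demo writes to `outputs/teddy`, for example). A minimal sketch of how you might derive it; `scene_name` is a hypothetical helper, not part of the repository:

```python
from pathlib import Path

def scene_name(image_path: str) -> str:
    """Derive the ${name} placeholder used in this README from an image path."""
    # e.g. "data/img/teddy.jpg" -> "teddy"
    return Path(image_path).stem

print(scene_name("data/img/teddy.jpg"))  # teddy
```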

### Run the perception part

```bash
python perception.py --input_image data/img/teddy.jpg --text_prompt teddy
```

- The text prompt describes the object(s) you want to move. It is a single word, or multiple words separated by `.`, like `cat.dog`.
- Outputs are saved in `outputs/${name}` as follows:

  ```
  ${name}/
  ├── depth                    # Depth point cloud
  ├── images                   # Multiview object images
  ├── inpaint                  # Background inpainting
  ├── mask                     # Object masks
  ├── meshes                   # Mesh reconstruction
  ├── object                   # Object registration results
  ├── grounded_sam_output.jpg
  ├── raw_image.jpg
  └── transform.json           # Geometries
  ```
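The multi-object prompt format above can be produced with a small helper. This is a sketch; `join_prompt` is a hypothetical name, not a function in this repository, and only the `.`-separated convention comes from the docs above:

```python
def join_prompt(objects: list[str]) -> str:
    """Join object names into a "."-separated text prompt, e.g. cat.dog."""
    # join_prompt is a hypothetical helper, not part of this repository.
    return ".".join(obj.strip() for obj in objects)

print(join_prompt(["cat", "dog"]))  # cat.dog
```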

### Run the simulation part

```bash
python simulation.py --config data/sim/teddy.yaml
```

- You can set all the physical parameters manually or have them inferred automatically using GPT-4o.
- `Velocities` is the initial velocity of the object(s), given as a 1D or 2D array: `[Vx, Vy, Vz]` for a single object, or `[[Vx1, Vy1, Vz1], [Vx2, Vy2, Vz2]]` for multiple objects.
- The outputs are saved in the `sim_result/sim_result_${time}` folder.
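A sketch of how the two velocity formats above could be normalized into one per-object list; `normalize_velocities` is a hypothetical helper, not part of the repository:

```python
def normalize_velocities(v):
    """Normalize [Vx, Vy, Vz] or [[Vx1, ...], [Vx2, ...]] to a list of 3-vectors."""
    # normalize_velocities is a hypothetical helper, not part of this repository.
    if v and isinstance(v[0], (int, float)):
        return [list(v)]                    # 1D: a single object
    return [list(row) for row in v]         # 2D: one row per object

print(normalize_velocities([1.0, 0.0, -2.0]))      # [[1.0, 0.0, -2.0]]
print(normalize_velocities([[1, 0, 0], [0, 0, 1]]))  # [[1, 0, 0], [0, 0, 1]]
```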

### Run the rendering part

```bash
python rendering.py \
    -i ./sim_result/sim_result_${time} \
    --path outputs/teddy \
    --env data/hdr/teddy.exr \
    -b 0 \
    -e 100 \
    -f \
    -s 1 \
    -o render_result/1 \
    -M 460 \
    -p 20 \
    --shutter-time 0.0
```

- In `run_mitsuba.sh`, put your simulation results folder `sim_result/sim_result_${time}` after `-i`.
- The command above runs the teddy bear demo by default. For other demos, put the perception result `outputs/${name}` after `--path` and the environment light file (e.g. `data/hdr/teddy.exr`) after `--env`.
- The outputs and the final video are saved in the `render_result` folder.
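If you script many renders, the command above can be assembled programmatically. A minimal sketch; `render_cmd` is a hypothetical wrapper, and the flag values simply mirror the example invocation:

```python
import subprocess

def render_cmd(sim_dir, scene, env_map, out_dir="render_result/1"):
    """Build the rendering.py command line for a given scene."""
    # render_cmd is a hypothetical convenience wrapper, not part of
    # this repository; the flags mirror the example invocation above.
    return [
        "python", "rendering.py",
        "-i", sim_dir,
        "--path", f"outputs/{scene}",
        "--env", env_map,
        "-b", "0", "-e", "100", "-f", "-s", "1",
        "-o", out_dir, "-M", "460", "-p", "20",
        "--shutter-time", "0.0",
    ]

cmd = render_cmd("./sim_result/sim_result_2025", "teddy", "data/hdr/teddy.exr")
print(" ".join(cmd))
# subprocess.run(cmd, check=True)  # uncomment to launch the render
```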

### Prepare your own image
