This repo is built on top of pytorch-lightning, hydra, the huggingface libraries, and others.
This code repo serves as the backend of the Unreal Engine demo. To run the full UE demo, one can additionally follow these instructions to download and set up the frontend.
Note that our non-commercial license also applies to this UE demo. Please contact [email protected] for commercial use.
This project requires two SMPL-X model files for different purposes:
- Standard SMPL-X model (for motion generation):
  - File: models/SMPLX/neutral/SMPLX_neutral.npz
  - Source: SMPL-X Official Website
  - Purpose: Core motion generation and body model computations
  - Required for: All training and inference operations
- Slim SMPL-X model (for visualization/rendering):
  - File: models/SMPLX/SMPLX_neutral_array_f32_slim.npz
  - Source: Cannot be provided directly; use this script to process the downloaded SMPL-X checkpoint.
  - Purpose: Optimized for headless rendering with gloss-rs/smpl-rs
  - Required for: Video rendering and visualization
Model registry structure:
model-registry/
└── models/
└── SMPLX/
├── neutral/
│ └── SMPLX_neutral.npz # Standard model (download required)
└── SMPLX_neutral_array_f32_slim.npz # Slim model (for rendering)
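As a quick sanity check, the following minimal sketch (assuming the layout above; MODEL_REGISTRY_PATH is the environment variable from the template below) verifies that both files are present and readable with numpy:
import os
import numpy as np
registry = os.environ.get("MODEL_REGISTRY_PATH", "model-registry")
for rel in ("models/SMPLX/neutral/SMPLX_neutral.npz",
            "models/SMPLX/SMPLX_neutral_array_f32_slim.npz"):
    path = os.path.join(registry, rel)
    data = np.load(path)                          # .npz archives open lazily
    print(path, "->", len(data.files), "arrays")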
When using python-dotenv, set up your data paths by copying and editing the environment template:
cp .env.example .env
# Edit .env file with your specific paths:
# AMASS_DATA_PATH=/your/path/to/AMASS/data
# MODEL_REGISTRY_PATH=/your/path/to/model-registry
If poetry is not installed yet, you can first install or update it via
poetry self update
See more info at Poetry's main page.
When using the project for the first time, enter the project root directory and run
poetry install
You can specify the installation path if needed.
To modify the dependencies, e.g. to install another specific library or update some versions, use
poetry add xxx=v1.xx
Poetry will check version compatibilities automatically, and lock the versions in poetry.lock.
The pre-trained checkpoints can be downloaded here. We suggest putting them in logs/. For example, the base models are saved in logs/motion_diffuser_ar/runs, and the action generation models can be saved in logs/motion_diffuser_ar_action/runs.
The AMASS dataset can be downloaded here.
The motion in the raw AMASS dataset is Z-up, whereas PRIMAL uses the Y-up world coordinate. One can use primal/utils/cvt_amass_to_yup.py to perform the conversion. Afterwards, its path can be set in .env.example (see Sec. 0.2).
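For intuition, below is a minimal conceptual sketch of the Z-up-to-Y-up conversion (a -90° rotation about the x-axis applied to the root orientation and translation). It is not the actual cvt_amass_to_yup.py script, and the parameter layout is an assumption:
import numpy as np
from scipy.spatial.transform import Rotation as R
def zup_to_yup(global_orient_aa: np.ndarray, transl: np.ndarray):
    """Hypothetical helper: global_orient_aa is (T, 3) axis-angle, transl is (T, 3)."""
    R_fix = R.from_euler("x", -90, degrees=True)   # maps the old +z axis to the new +y axis
    new_orient = (R_fix * R.from_rotvec(global_orient_aa)).as_rotvec()
    new_transl = transl @ R_fix.as_matrix().T      # rotate the root translations
    return new_orient, new_transl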
We leverage gradio to show how our methods work.
The base model is pretrained on AMASS and can produce perpetual motions given an initial state. The generation uses an autoregressive diffusion approach with overlapping sliding windows (sketched after this list), where each iteration:
- Canonicalizes motion to a local coordinate frame
- Embeds the motion seed as conditioning
- Denoises via reverse diffusion to generate a motion primitive
- Post-processes to transform back to world coordinates
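A high-level sketch of this loop is given below; the callables (canonicalize, embed_seed, denoise, to_world) are hypothetical placeholders for the corresponding steps in the repo and are passed in only for illustration:
import numpy as np
def rollout(seed, canonicalize, embed_seed, denoise, to_world,
            num_windows: int, seed_len: int) -> np.ndarray:
    """seed: (seed_len, D) initial frames; returns the stitched world-frame motion."""
    frames = [seed]
    for _ in range(num_windows):
        local_seed, transform = canonicalize(seed)  # 1) canonicalize to a local frame
        cond = embed_seed(local_seed)               # 2) embed the motion seed as conditioning
        primitive = denoise(cond)                   # 3) reverse diffusion -> motion primitive
        primitive = to_world(primitive, transform)  # 4) back to world coordinates
        frames.append(primitive[seed_len:])         # keep only the newly generated frames
        seed = primitive[-seed_len:]                # the last frames seed the next window
    return np.concatenate(frames, axis=0)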
You can control the motion through:
- Classifier-based guidance: Control movement direction and facing orientation (see the sketch after this list)
- Velocity perturbations: Trigger specific actions (left_kick, right_kick, run_forward, flip_back, roll_forward)
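For intuition, here is a generic sketch of classifier-based guidance as commonly used in motion diffusion, not necessarily the exact formulation in this repo; the assumption that the first three channels hold the pelvis position is hypothetical:
import torch
def guide_step(x_t: torch.Tensor, goal_dir: torch.Tensor, weight: float) -> torch.Tensor:
    """x_t: (B, T, D) noisy motion; goal_dir: (B, 3) unit target direction."""
    x_t = x_t.detach().requires_grad_(True)
    pelvis = x_t[..., :3]                          # hypothetical: first 3 dims = pelvis position
    vel = pelvis[:, 1:] - pelvis[:, :-1]           # finite-difference velocity
    goal = goal_dir[:, None, :].expand_as(vel)
    loss = (1.0 - torch.cosine_similarity(vel, goal, dim=-1)).mean()  # deviation from goal direction
    grad = torch.autograd.grad(loss, x_t)[0]
    return (x_t - weight * grad).detach()          # nudge the sample toward the goal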
Run the gradio demo with only the base model in your terminal:
poetry run python demos/ARDiffusion_gradio.py logs/motion_diffuser_ar/runs/silu_anchor
The results, i.e. a set of .pkl, .mp4, and .smpl files, are saved to outputs/gradio_ARDiffusion. One can change the output folder inside the gradio demo file.
One can directly drag the .smpl file into our platform for visualization.
Since the two adapted models are run in a similar way, here we only show the action generation model as an example. This model is obtained by adapting the base model to personalized motion data captured with MoCapade. Given an initial state and an action label, it generates arbitrarily long motions belonging to that action class.
- Action embeddings: Discrete action labels (learned during training) condition the diffusion process
- Classifier-free guidance: Adjustable strength of action control via the guidance_weight_action parameter (see the sketch after this list)
- Velocity perturbations: Random noise to diversify action execution
- Optional trajectory control: Can combine action generation with movement/facing direction guidance
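The classifier-free guidance follows the standard formulation; the sketch below uses a hypothetical model signature and only illustrates how guidance_weight_action blends the conditional and unconditional predictions:
import torch
def cfg_predict(model, x_t, t, action_emb, guidance_weight_action: float):
    eps_uncond = model(x_t, t, action=None)        # unconditional prediction
    eps_cond = model(x_t, t, action=action_emb)    # action-conditioned prediction
    # larger guidance_weight_action pushes the sample harder toward the action class
    return eps_uncond + guidance_weight_action * (eps_cond - eps_uncond)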
The action labels depend on your personalized data. Our checkpoint uses ['jump', 'kick', 'punch', 'run', 'walk'].
You are free to modify this in your own scenarios.
Run this demo with:
poetry run python demos/ARDiffusionAction_gradio.py logs/motion_diffuser_ar_action/runs/ours
The results are automatically saved to outputs/gradio_ARDiffusionAction.
All scripts are saved in scripts/ and experiments in experiments/.
One can either run the .sh files of the inference workflow, or run individual .py files.
The following are examples of motion generation. Reactions to perturbations and classifier-based control follow the same gen|render|eval steps.
See run_principledAction.sh and run_principledControl.sh for more details.
The .sh file wraps python files with default settings. The format is
bash scripts/run_motionrealism.sh [mode] [checkpointfolder]
The script will then run the corresponding Python scripts with default settings. One needs to run the "gen" mode first. For example,
bash scripts/run_motionrealism.sh gen logs/motion_diffuser_ar/runs/silu_anchor
Afterwards, run
bash scripts/run_motionrealism.sh render logs/motion_diffuser_ar/runs/silu_anchor
to render the videos, or
bash scripts/run_motionrealism.sh eval logs/motion_diffuser_ar/runs/silu_anchor
to perform quantitative evaluation.
For more control over parameters, you can run individual Python files:
Step 1: generate motions. Run e.g.
poetry run python experiments/gen_Motion.py --expdir logs/motion_diffuser_ar/runs/silu_anchor --ckptidx 29999 --use_ema --dataset SFU --use_reproj_kpts --use_inertialization --frame_skip 400
The results are automatically saved into outputs/MotionRealism/{checkpoint_path}.
Step 2: quantitative evaluation. Run the following script, and the results will be printed in the terminal
python experiments/eval_generation.py outputs/MotionRealism/silu_anchor/SFU_ema-True_reproj-True_inertial-True.pkl
Step 3: render the videos. The videos will be saved next to the .pkl file. See primal/rendering/render.py for its options.
python primal/rendering/render.py outputs/MotionRealism/silu_anchor/SFU_ema-True_reproj-True_inertial-True.pkl false
The .sh file wraps python files with default settings. The format is
bash scripts/run_ARDiffusionAction.sh [mode] [checkpointfolder]
The script will run the corresponding Python scripts with default settings. One needs to run the "gen" mode first. For example,
bash scripts/run_ARDiffusionAction.sh gen logs/motion_diffuser_ar_action/runs/ours
Afterwards, run
bash scripts/run_ARDiffusionAction.sh render logs/motion_diffuser_ar_action/runs/ours
to render the videos, or
bash scripts/run_ARDiffusionAction.sh eval logs/motion_diffuser_ar_action/runs/ours
to perform quantitative evaluation.
For more control over parameters, you can run individual Python files:
Step 1: generate motions. Run
python experiments/gen_Motion_ARDiffusionAction.py --expdir logs/motion_diffuser_ar_action/runs/ours --ckptidx 99 --use_ema --dataset SFU --use_reproj_kpts --use_inertialization --frame_skip 400 --action run
The results are automatically saved into outputs/ARDiffusionAction/{checkpointfoldername}.
Step 2: quantitative evaluation. Run the following script, and the results will be printed in the terminal
python experiments/eval_generation.py outputs/ARDiffusionAction/ours/SFU_ema-True_reproj-True_inertial-True_action-run.pkl
Step 3: render the videos. The videos will be saved next to the .pkl file. See primal/rendering/render.py for its options.
python primal/rendering/render.py outputs/ARDiffusionAction/ours/SFU_ema-True_reproj-True_inertial-True_action-run.pkl false
One can directly type
python scripts/train.py --config-name=train_diffusion task_name=motion_diffuser_ar
It will load the default configurations in primal/configs/train_diffusion.yaml, and save the checkpoint to logs/motion_diffuser_ar/runs/<year>-<month>-<date>-<hour>-<minute>-<second>. This folder is exactly the {checkpoint_path} for testing.
Detailed default settings of data, model, etc. are in the respective config folders. For example, - data: amass_smplx in primal/configs/train_diffusion.yaml corresponds to primal/configs/data/amass_smplx.yaml.
To keep the default settings consistent, it is suggested to specify non-default settings on the command line, e.g.
python scripts/train.py task_name=motion_diffuser_ar data.batch_size=256 data.framerate=30 data.seq_len=16 model._target_=primal.models.motion_diffuser.ARDiffusion model.cfg.scheduler.num_train_timesteps=50 trainer.max_epochs=30000 model.cfg.use_metric_velocity=true data.subsets=['ACCAD','BMLmovi','BMLrub','CMU','DFaust','EKUT','Eyes_Japan_Dataset','GRAB','HDM05','KIT','MoSh','PosePrior','SFU','SOMA','SSM','TCDHands','TotalCapture','Transitions']
Note that the pretraining phase will take days until you can get good motions. To monitor the training process, you can use tensorboard by typing in the terminal
tensorboard --logdir={checkpoint_path}
In this work, we consider two avatar-related tasks: spatial target reaching and semantic action generation.
Adaptation approaches:
According to the paper, the three approaches are finetuning, OmniControlNet, and ours, which correspond to:
- finetune: Fine-tune all model parameters
- controlnet1: OmniControl-style ControlNet architecture
- controlnet2: Our proposed ControlNet approach (recommended)
Specify the approach via +model.cfg.network.controltype=[adaptation_approach].
Important: Always specify the pretrained base model via e.g. '+finetune_from_exp=logs/motion_diffuser_ar/runs/silu_anchor'. Otherwise, the model will train from scratch.
This adaptation enables motion generation towards a specific 3D target location (or 2D target on the xz-plane). To perform finetuning, run e.g.:
python scripts/train.py --config-name=train_diffusion task_name=motion_diffuser_ar_spatial data.batch_size=256 trainer.max_epochs=100 model._target_=primal.models.motion_diffuser.ARDiffusionSpatial model.cfg.network.type=transformerInContext '+model.cfg.network.controltype=controlnet2' '+model.cfg.goal_type=2D' '+finetune_from_exp=logs/motion_diffuser_ar/runs/silu_anchor'
The checkpoint will be saved to the task_name, e.g. logs/motion_diffuser_ar_spatial.
One can change the settings like learning rate, batch size, goal_type, etc. accordingly.
The base model can be quickly adapted to some user captured motion sequences. In the following, we share what we do to create a personalized motion model.
Step 1: Capture videos. We use a cellphone to capture videos of specific martial arts actions, i.e. ['jump', 'kick', 'punch', 'run', 'walk']. You can capture a single long video containing all actions, or capture them individually. Of course, the action labels depend on the scenario.
Step 2: Estimate body motions with MoCapade.
Trim the videos to 5-7 seconds so that the server can process them. Upload the videos, and then download the .smpl files.
Create your dataset directory with all .smpl files in a single folder:
your_dataset_folder/
├── jump_001.smpl
├── jump_002.smpl
├── kick_001.smpl
├── kick_002.smpl
├── punch_001.smpl
├── run_001.smpl
├── walk_001.smpl
└── ...
According to the action labels, name each individual .smpl file as [action]_[idx].smpl, such as jump_003.smpl and kick_004.smpl. This naming is necessary, since the action labels are directly retrieved from the filenames using os.path.basename(file).split('.')[0].split('_')[0], as illustrated below.
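For example, applying the rule above to a hypothetical file jump_003.smpl yields:
import os
fname = "/path/to/your_dataset_folder/jump_003.smpl"   # hypothetical path
action = os.path.basename(fname).split('.')[0].split('_')[0]
print(action)  # -> "jump"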
Set the path to your dataset folder in your environment variables or directly in the config:
- Environment variable: CUSTOMIZED_ACTION_PATH=/path/to/your_dataset_folder
- Or edit primal/configs/data/customized_action_mc.yaml and set path: /path/to/your_dataset_folder
The system automatically:
- Extracts action labels from filenames (e.g., "jump" from "jump_001.smpl")
- Segments each .smpl file into fixed-length sequences (default: 16 frames)
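As an illustration only (a minimal sketch, not the repo's dataloader), the segmentation step amounts to something like:
import numpy as np
def segment(motion: np.ndarray, seq_len: int = 16) -> np.ndarray:
    """motion: (T, D) array of frames; returns (N, seq_len, D), dropping the remainder."""
    n = motion.shape[0] // seq_len
    return motion[: n * seq_len].reshape(n, seq_len, motion.shape[1])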
Run the following command as an example. The settings can of course be changed.
python primal/scripts/train.py task_name=motion_diffuser_ar_action data=customized_action_mc data.path={your_own_dataset} data.batch_size=16 trainer.max_epochs=1000 '+model.cfg.network.controltype=controlnet2' '+finetune_from_exp=logs/motion_diffuser_ar/runs/silu_anchor'
The checkpoint will be saved to logs/motion_diffuser_ar_action.
Use gradio demo to verify the effectiveness of your trained model.
See the license file for more details. Please cite the following work if it helps. Many thanks.
@inproceedings{primal:iccv:2025,
author = {Zhang, Yan and Feng, Yao and Cseke, Alpár and Saini, Nitin and Bajandas, Nathan and Heron, Nicolas and Black, Michael J.},
title = {{PRIMAL:} Physically Reactive and Interactive Motor Model for Avatar Learning},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = oct,
year = {2025}
}