Performance Misaligned

I tested the provided SiT-XL/2 checkpoint using the commands specified in the README.
With cfg=1.5: `torchrun --nnodes=1 --nproc_per_node=8 sample_ddp.py ODE --cfg-scale 1.5 --model SiT-XL/2 --num-fid-samples 50000`
Without cfg: `torchrun --nnodes=1 --nproc_per_node=8 sample_ddp.py ODE --model SiT-XL/2 --num-fid-samples 50000`

In my testing, the FID for cfg=1.0 (w/o cfg) is 9.5 (I ran it several times and every run gave FID > 9.3). But in the paper, SiT-XL/2 7M (w/o cfg) achieves FID=8.3. What could cause this gap when sampling from the pretrained SiT models without cfg? With cfg=1.5 my result is close to the paper's.

| Model | FID (this repo) | FID (paper) |
|------------|------------|------------|
| SiT-XL/2 (w/o cfg) | 9.5 | 8.3 |
| SiT-XL/2 (w/ cfg=1.5) | 2.08 | 2.06 |
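
On treating cfg=1.0 as "w/o cfg": a minimal sketch of why the two should coincide, assuming the standard classifier-free guidance combination `uncond + scale * (cond - uncond)`. Whether `sample_ddp.py` applies guidance in exactly this form is an assumption here, and `apply_cfg` and the numbers below are hypothetical, chosen only for illustration.

```python
def apply_cfg(cond, uncond, scale):
    """Combine conditional and unconditional outputs with classifier-free guidance.

    Assumes the standard formulation: uncond + scale * (cond - uncond).
    """
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


# Hypothetical model outputs for one sample (not real SiT predictions).
cond = [0.5, -1.0, 2.0]
uncond = [0.25, 0.0, 1.0]

no_guidance = apply_cfg(cond, uncond, 1.0)  # reduces to the conditional output
guided = apply_cfg(cond, uncond, 1.5)       # pushed away from the unconditional output

print(no_guidance)
print(guided)
```

Under this assumption, a scale of 1.0 returns the conditional prediction unchanged, so a gap that appears only without cfg would not come from the guidance step itself.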

Performance Misaligned #31
