Performance Misaligned #31

@guyuchao

I tested the provided SiT-XL/2 checkpoint using the commands specified in the README:
with cfg=1.5: torchrun --nnodes=1 --nproc_per_node=8 sample_ddp.py ODE --cfg-scale 1.5 --model SiT-XL/2 --num-fid-samples 50000
without cfg: torchrun --nnodes=1 --nproc_per_node=8 sample_ddp.py ODE --model SiT-XL/2 --num-fid-samples 50000

In my tests, the FID without cfg (cfg=1.0) is 9.5 (I ran it several times and every run gave > 9.3). However, the paper reports FID=8.3 for SiT-XL/2 7M without cfg. What could cause this mismatch when sampling from the pretrained SiT model without cfg?

| Model | FID (this repo) | FID (paper) |
| --- | --- | --- |
| SiT-XL/2 (w/o cfg) | 9.5 | 8.3 |
| SiT-XL/2 (w/ cfg=1.5) | 2.08 | 2.06 |
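
To put the two gaps on the same scale, here is a small sketch that computes the relative difference between the FID measured with this repo and the FID reported in the paper. The numbers are copied from the table above; the helper name `relative_gap` is only illustrative and is not part of the repo.

```python
# FID values copied from the table above: (this repo, paper).
results = {
    "SiT-XL/2 (w/o cfg)": (9.5, 8.3),
    "SiT-XL/2 (w/ cfg=1.5)": (2.08, 2.06),
}


def relative_gap(measured: float, reported: float) -> float:
    """Return (measured - reported) / reported as a percentage."""
    return 100.0 * (measured - reported) / reported


# Relative gap per setting, keyed by the same row labels as the table.
gaps = {name: relative_gap(m, r) for name, (m, r) in results.items()}

for name, gap in gaps.items():
    print(f"{name}: {gap:+.1f}% relative to the paper")
```

The gap without cfg is roughly 14%, while the gap with cfg=1.5 is roughly 1%, so the mismatch appears to be specific to the setting without cfg.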
