[Inductor][SDPA] test_sdpa_rewriter_12 broken on A2/A16 GPU #135407

@eqy

🐛 Describe the bug

One of the pattern-matching tests, test_sdpa_rewriter_12, fails on A2/A16 GPUs and likely on A10 as well (untested). This isn't particularly urgent, but I would like to learn how to find out which part of Inductor fails to pattern match in cases like this. Since I suspect this path uses pre-compiled FA kernels, my guess is that either a hidden constraint (one not in the pattern definition itself) isn't being met somewhere, or a runtime failure is being caught and suppressed.

For working configs it seems to match pattern 11 (not 12):

def _sfdp_pattern_11(query, key, value, inv_scale):

Is there a way to get the Inductor logs that would be relevant here?
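One possible starting point, as a rough sketch rather than a confirmed recipe: rerun only the failing test with the `TORCH_LOGS` environment variable set to `+inductor`, which raises Inductor's loggers to debug level. The test file path below is an assumption about where the test lives in a source checkout, and the helper names (`inductor_debug_env`, `build_command`, `rerun_with_logs`) are made up for illustration. Whether the debug output actually shows why the pattern was rejected is not guaranteed.

```python
import os
import subprocess
import sys

# Assumption: location of the test inside a PyTorch source checkout.
TEST_FILE = "test/inductor/test_fused_attention.py"
TEST_NAME = "test_sdpa_rewriter_12"


def inductor_debug_env(base=None, log_spec="+inductor"):
    """Return a copy of the environment with verbose Inductor logging enabled."""
    env = dict(os.environ if base is None else base)
    # A leading "+" asks torch's logging system for DEBUG-level output
    # from the named component.
    env["TORCH_LOGS"] = log_spec
    return env


def build_command(test_file=TEST_FILE, test_name=TEST_NAME):
    """Build a command line that runs only the one failing test."""
    return [sys.executable, test_file, "-k", test_name]


def rerun_with_logs():
    """Run the test in a subprocess; log output goes to stderr."""
    result = subprocess.run(build_command(), env=inductor_debug_env())
    return result.returncode
```

Calling `rerun_with_logs()` from the root of the checkout on an affected GPU should then print the Inductor debug output alongside the test failure.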

CC @drisspg

Versions

~week old main branch

cc @ptrblck @msaroufim @chauhang @penguinwu @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @aakhundov @ezyang @drisspg @mikaylagawarecki

Labels: module: cuda, module: inductor, module: sdpa, oncall: pt2, triaged
