☂️ MPS support for large tensors #149325

@malfet

Description

🐛 Describe the bug

Random ATen operations that use the MPS backend can fail with Error: total bytes of NDArray > 2**32, or with similar errors, because they expect either the tensor size to be less than 4 GB or the total number of elements to be indexable by a 32-bit index.
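
Both limits can be checked from a shape and an element size alone, without allocating anything. The following is a minimal sketch in plain Python. The helper name `exceeds_32bit_limits` is hypothetical, and the 2**32 thresholds are an assumption taken from the error message above; ops that use signed 32-bit indices may already fail at 2**31 elements.

```python
import math

# Assumed threshold, taken from the "total bytes of NDArray > 2**32" message.
# Ops that index with signed 32-bit integers may fail earlier, at 2**31.
LIMIT_32BIT = 2**32


def exceeds_32bit_limits(shape, itemsize):
    """Return (too_many_bytes, too_many_elements) for a dense tensor.

    shape    -- iterable of dimension sizes
    itemsize -- bytes per element (e.g. 4 for float32)
    """
    numel = math.prod(shape)
    nbytes = numel * itemsize
    return nbytes > LIMIT_32BIT, numel > LIMIT_32BIT


# float32 tensor slightly over 4 GiB: byte limit exceeded, element count still fine.
print(exceeds_32bit_limits((1024, 1024, 1025), 4))
# int8 tensor with more than 2**32 elements: both limits exceeded.
print(exceeds_32bit_limits((65536, 65537), 1))
```

A check like this only flags candidate shapes; whether a given op actually fails for them still has to be confirmed on the MPS backend.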

This is an umbrella issue to track those (should be searchable by the labels module: mps + module: 64-bit).

We need to figure out some tooling to detect broken ops and a way to run tests, as allocating large tensors on the machines we have right now is simply not going to work.
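
To give a rough sense of why allocation-based tests are impractical, the sketch below estimates how much memory a single dense tensor needs before its element count no longer fits in a 32-bit index. The helper name `min_gib_to_exceed_numel` is hypothetical and the 2**32 threshold is an assumption; the per-dtype element sizes are the standard ones.

```python
# Bytes per element for a few common dtypes.
ITEMSIZE = {"int8": 1, "float16": 2, "float32": 4}


def min_gib_to_exceed_numel(dtype, limit=2**32):
    """GiB needed by the smallest dense tensor with more than `limit` elements."""
    return (limit + 1) * ITEMSIZE[dtype] / 2**30


for name in ITEMSIZE:
    print(f"{name}: {min_gib_to_exceed_numel(name):.2f} GiB")
```

Under these assumptions a float32 input alone needs a little over 16 GiB, before counting any output or temporary buffers the op allocates.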

Examples of existing issues:

Versions

CI

cc @kulinseth @albanD @DenisVieriu97 @jhavukainen

Metadata

Assignees

No one assigned

    Labels

    module: 64-bit - Problems related to incorrectly using 32-bit integers when 64-bit is needed (e.g., 8G tensors)
    module: crash - Problem manifests as a hard crash, as opposed to a RuntimeError
    module: mps - Related to Apple Metal Performance Shaders framework
    triaged - This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
