Fix: avoid hooking `cuMemGetInfo_v2` to prevent memory conflict. #80

yangshiqi · 2025-06-11T07:22:18Z

The current code returns the user-configured limit directly as `*total`, without checking it against the actual hardware capacity.

After the fix, even if `CUDA_DEVICE_MEMORY_LIMIT=15360MB` is set, the reported total will not exceed the memory actually available on the T4 card (about 14917 MB), which avoids the "No execution plan worked!" error.
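
For illustration, the clamping described above could look roughly like the sketch below. This is not the diff from this PR: `clamp_reported_total` and its parameter names are hypothetical, and the sketch assumes the configured limit and the physical total are already available in the same unit.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not taken from this PR's diff): pick the value to
 * report as *total. Both arguments must use the same unit (bytes or MiB).
 */
static uint64_t clamp_reported_total(uint64_t configured_limit,
                                     uint64_t physical_total)
{
    /* Never report more memory than the device physically provides. */
    if (configured_limit > physical_total) {
        return physical_total;
    }
    return configured_limit;
}
```

With the numbers from the description (in MiB), a limit of 15360 on a card with about 14917 available would be reported as 14917, while a limit below the physical total is passed through unchanged.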

…total and limit memory. Signed-off-by: yangshiqi <[email protected]>

hami-robott · 2025-06-11T07:34:51Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: yangshiqi

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Fix: avoid hooking cuMemGetInfo_v2 to prevent memory conflict with …

fcfc20d

…total and limit memory. Signed-off-by: yangshiqi <[email protected]>

yangshiqi closed this Jun 11, 2025

hami-robott bot added the dco-signoff: yes label Jun 11, 2025

hami-robott bot added the size/S label Jun 11, 2025
