🐛 Bug
The CUDA API expects a `void**` for option values in functions like `cuModuleLoadDataEx`. The documentation is unclear about what each entry should be, but according to other sources (see the references below) each entry should simply be the option value cast to a `void*`, not a pointer to that value.

Hence this line (in `torch/csrc/jit/codegen/cuda/executor_utils.cpp`) is incorrect:

```cpp
option_vals.emplace_back(&jit_opt_level);
```
I've seen this in one of the PyTorch tests (see below) where I get:
======================================================================
ERROR: test_unary_ops (test_jit_cuda_fuser.TestCudaFuser)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/tmp/install_pt/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 827, in wrapper
method(*args, **kwargs)
File "/dev/shm/s3248973-EasyBuild/PyTorch/1.7.1/fosscuda-2020b/pytorch-1.7.1/test/test_jit_cuda_fuser.py", line 369, in test_unary_ops
self._unary_test_helper(op)
File "/dev/shm/s3248973-EasyBuild/PyTorch/1.7.1/fosscuda-2020b/pytorch-1.7.1/test/test_jit_cuda_fuser.py", line 328, in _unary_test_helper
jit_o = t_jit(x, 2.0)
File "/tmp/install_pt/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 126, in prof_func_call
return prof_callable(func_call, *args, **kwargs)
File "/tmp/install_pt/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 123, in prof_callable
return callable(*args, **kwargs)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
RuntimeError: CUDA driver error: a PTX JIT compilation failed
To verify, I added the following code to `torch/csrc/jit/codegen/cuda/executor_utils.cpp` above the call to `cuModuleLoadDataEx`:

```cpp
// Request a JIT error log from the driver.
options.push_back(CU_JIT_ERROR_LOG_BUFFER);
options.push_back(CU_JIT_ERROR_LOG_BUFFER_SIZE_BYTES);
std::string errors(8000, '\0');
option_vals.push_back((void*) errors.data());  // buffer address (a real pointer)
option_vals.push_back((void*) errors.size());  // buffer size, passed by value
```
When printing this string on failure I got:
ptxas fatal : 32-bit integer value (3849789140) out of range
This is exactly the pointer to `jit_opt_level`, which confirms the above.
PS: It is likely a good idea to always request the JIT error log buffer in PyTorch and report its contents on failure.
References:
- https://stackoverflow.com/a/17070844/1930508
- https://github.com/HongjianLi/cuda/blob/dd52fd563558667315de3fecea3559ac6ba2a89a/vectorAdd/vectorAdd.cpp#L74
- https://github.com/MentorEmbedded/nvptx-tools/blob/59e0b755e3ab085a3a348bd001bad4f010fd9c00/nvptx-run.c#L77-L88
To Reproduce
Steps to reproduce the behavior:
python test_jit_cuda_fuser_legacy.py -k test_unary_ops
Environment
- PyTorch Version (e.g., 1.0): 1.7.1, master
cc @gmagogsfm