🐛 Describe the bug
A heap-buffer-overflow can be triggered in torch.quantized_max_pool2d via the Python API, similar to the issue reported for the C++ API (#116254).
When that issue's C++ snippet is converted to Python, the same input parameters do not trigger the overflow:
import torch
print(torch.__version__, flush=True)
base = torch.randn(1, 1)
q_tensor = torch.quantize_per_tensor(base, scale=0.1, zero_point=10, dtype=torch.qint32)
torch.quantized_max_pool2d(
    q_tensor,
    kernel_size=(0, 0),
    stride=None,
    padding=0,
    dilation=(0, 0),
    ceil_mode=True,
)
Output:
2.9.0a0+git002e594
Traceback (most recent call last):
File "/home/chenxu/test1/test1.py", line 5, in <module>
torch.quantized_max_pool2d(
RuntimeError: Expected dilation >= 1
Here the dilation=(0, 0) argument is rejected by an up-front check before the vulnerable code is reached. But this code:
import torch
print(torch.__version__, flush=True)
q_tensor = torch.quantize_per_tensor(
    torch.zeros((2, 2), dtype=torch.float),
    scale=0.10832346361038446,
    zero_point=65,
    dtype=torch.qint8,
)
# The splatted call below is equivalent to
# torch.quantized_max_pool2d(q_tensor, [-576936867]),
# i.e. a single-element, negative kernel_size.
input = [[q_tensor, [-576936867]], {}, [], {}]
torch.quantized_max_pool2d(*input[0], **input[1])
will output:
2.9.0a0+git002e594
=================================================================
==26777==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x502000944918 at pc 0x7fffd2346c9e bp 0x7fffffffd0b0 sp 0x7fffffffd0a8
READ of size 8 at 0x502000944918 thread T0
#0 0x7fffd2346c9d in at::native::quantized_max_pool2d(at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool)::$_0::operator()() const::'lambda'()::operator()() const /home/chenxu/pytorch/pytorch/aten/src/ATen/native/quantized/cpu/Pooling.cpp:638:3
#1 0x7fffd2346c9d in at::native::quantized_max_pool2d(at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool)::$_0::operator()() const /home/chenxu/pytorch/pytorch/aten/src/ATen/native/quantized/cpu/Pooling.cpp:638:3
#2 0x7fffd233af9d in at::native::quantized_max_pool2d(at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool) /home/chenxu/pytorch/pytorch/aten/src/ATen/native/quantized/cpu/Pooling.cpp:638:3
#3 0x7fffd5f05867 in at::(anonymous namespace)::(anonymous namespace)::wrapper_QuantizedCPU__quantized_max_pool2d(at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool) /home/chenxu/pytorch/pytorch/build/aten/src/ATen/RegisterQuantizedCPU_0.cpp:666:10
#4 0x7fffd5f05867 in c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool), &at::(anonymous namespace)::(anonymous namespace)::wrapper_QuantizedCPU__quantized_max_pool2d(at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool)>, at::Tensor, c10::guts::typelist::typelist<at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool>>::operator()(at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool) /home/chenxu/pytorch/pytorch/aten/src/ATen/core/boxing/impl/WrapFunctionIntoFunctor.h:17:12
#5 0x7fffd5f05867 in c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool), &at::(anonymous namespace)::(anonymous namespace)::wrapper_QuantizedCPU__quantized_max_pool2d(at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool)>, at::Tensor, c10::guts::typelist::typelist<at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool>>, at::Tensor (at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool) /home/chenxu/pytorch/pytorch/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:578:12
#6 0x7fffd5f05867 in std::decay<c10::guts::infer_function_traits<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool), &at::(anonymous namespace)::(anonymous namespace)::wrapper_QuantizedCPU__quantized_max_pool2d(at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool)>, at::Tensor, c10::guts::typelist::typelist<at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool>>>::type::return_type>::type c10::impl::call_functor_with_args_from_stack_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool), &at::(anonymous namespace)::(anonymous namespace)::wrapper_QuantizedCPU__quantized_max_pool2d(at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool)>, at::Tensor, c10::guts::typelist::typelist<at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool>>, false, 0ul, 1ul, 2ul, 3ul, 4ul, 5ul, at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool>(c10::OperatorKernel*, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue>>*, std::integer_sequence<unsigned long, 0ul, 1ul, 2ul, 3ul, 4ul, 5ul>, c10::guts::typelist::typelist<at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool>*) /home/chenxu/pytorch/pytorch/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:642:10
#7 0x7fffd5f05867 in std::decay<c10::guts::infer_function_traits<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool), &at::(anonymous namespace)::(anonymous namespace)::wrapper_QuantizedCPU__quantized_max_pool2d(at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool)>, at::Tensor, c10::guts::typelist::typelist<at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool>>>::type::return_type>::type c10::impl::call_functor_with_args_from_stack<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool), &at::(anonymous namespace)::(anonymous namespace)::wrapper_QuantizedCPU__quantized_max_pool2d(at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool)>, at::Tensor, c10::guts::typelist::typelist<at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool>>, false>(c10::OperatorKernel*, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue>>*) /home/chenxu/pytorch/pytorch/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:666:10
#8 0x7fffd5f05867 in c10::impl::make_boxed_from_unboxed_functor<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool), &at::(anonymous namespace)::(anonymous namespace)::wrapper_QuantizedCPU__quantized_max_pool2d(at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool)>, at::Tensor, c10::guts::typelist::typelist<at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool>>, false>::call(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue>>*) /home/chenxu/pytorch/pytorch/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:765:28
#9 0x7fffdd39f3bc in c10::BoxedKernel::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue>>*) const /home/chenxu/pytorch/pytorch/aten/src/ATen/core/boxing/BoxedKernel_impl.h:48:3
#10 0x7fffdd39f3bc in c10::KernelFunction::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue>>*) const /home/chenxu/pytorch/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:94:22
#11 0x7fffdd39f3bc in c10::Dispatcher::redispatchBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue>>*) const /home/chenxu/pytorch/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:935:17
#12 0x7fffdd39f3bc in c10::OperatorHandle::redispatchBoxed(c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue>>*) const /home/chenxu/pytorch/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:550:34
#13 0x7fffdd39f3bc in torch::autograd::autogradNotImplementedFallbackImpl(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue>>*) /home/chenxu/pytorch/pytorch/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:362:8
#14 0x7fffdd39f3bc in void c10::BoxedKernel::make_boxed_function<&torch::autograd::autogradNotImplementedFallbackImpl(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue>>*)>(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue>>*) /home/chenxu/pytorch/pytorch/aten/src/ATen/core/boxing/BoxedKernel_impl.h:30:3
#15 0x7fffd2a27226 in c10::BoxedKernel::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue>>*) const /home/chenxu/pytorch/pytorch/aten/src/ATen/core/boxing/BoxedKernel_impl.h:48:3
#16 0x7fffd2a27226 in c10::impl::BoxedKernelWrapper<at::Tensor (at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool), void>::call(c10::BoxedKernel const&, c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool) /home/chenxu/pytorch/pytorch/aten/src/ATen/core/boxing/impl/boxing.h:247:23
#17 0x7fffd2c8bbb5 in at::Tensor c10::KernelFunction::call<at::Tensor, at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool>(c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool) const /home/chenxu/pytorch/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:183:10
#18 0x7fffd2c8bbb5 in at::Tensor c10::Dispatcher::call<at::Tensor, at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool>(c10::TypedOperatorHandle<at::Tensor (at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool)> const&, at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool) const /home/chenxu/pytorch/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:827:26
#19 0x7fffd2c8bbb5 in c10::TypedOperatorHandle<at::Tensor (at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool)>::call(at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool) const /home/chenxu/pytorch/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:615:41
#20 0x7fffd2c8bbb5 in at::_ops::quantized_max_pool2d::call(at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool) /home/chenxu/pytorch/pytorch/build/aten/src/ATen/Operators_1.cpp:3453:15
#21 0x7ffff1f55fcf in at::quantized_max_pool2d(at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool) /home/chenxu/pytorch/pytorch/build/aten/src/ATen/ops/quantized_max_pool2d.h:28:12
#22 0x7ffff1f55fcf in torch::autograd::THPVariable_quantized_max_pool2d(_object*, _object*, _object*)::$_0::operator()(at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool) const /home/chenxu/pytorch/pytorch/torch/csrc/autograd/generated/python_torch_functions_2.cpp:5275:12
#23 0x7ffff1f55fcf in torch::autograd::THPVariable_quantized_max_pool2d(_object*, _object*, _object*) /home/chenxu/pytorch/pytorch/torch/csrc/autograd/generated/python_torch_functions_2.cpp:5277:15
#24 0x528db6 in cfunction_call /usr/local/src/conda/python-3.11.13/Objects/methodobject.c:542:18
#25 0x54351c in _PyObject_Call /usr/local/src/conda/python-3.11.13/Objects/call.c:343:19
#26 0x54351c in PyObject_Call /usr/local/src/conda/python-3.11.13/Objects/call.c:355:12
#27 0x519c68 in do_call_core /usr/local/src/conda/python-3.11.13/Python/ceval.c:7321:9
#28 0x519c68 in _PyEval_EvalFrameDefault /usr/local/src/conda/python-3.11.13/Python/ceval.c:5376:22
#29 0x5cd0a9 in _PyEval_EvalFrame /usr/local/src/conda/python-3.11.13/Include/internal/pycore_ceval.h:73:16
#30 0x5cd0a9 in _PyEval_Vector /usr/local/src/conda/python-3.11.13/Python/ceval.c:6434:24
#31 0x5cc77e in PyEval_EvalCode /usr/local/src/conda/python-3.11.13/Python/ceval.c:1148:21
#32 0x5ed556 in run_eval_code_obj /usr/local/src/conda/python-3.11.13/Python/pythonrun.c:1741:9
#33 0x5e907f in run_mod /usr/local/src/conda/python-3.11.13/Python/pythonrun.c:1762:19
#34 0x5fde71 in pyrun_file /usr/local/src/conda/python-3.11.13/Python/pythonrun.c:1657:15
#35 0x5fd28e in _PyRun_SimpleFileObject /usr/local/src/conda/python-3.11.13/Python/pythonrun.c:440:13
#36 0x5fcfb2 in _PyRun_AnyFileObject /usr/local/src/conda/python-3.11.13/Python/pythonrun.c:79:15
#37 0x5f7dad in pymain_run_file_obj /usr/local/src/conda/python-3.11.13/Modules/main.c:360:15
#38 0x5f7dad in pymain_run_file /usr/local/src/conda/python-3.11.13/Modules/main.c:379:15
#39 0x5f7dad in pymain_run_python /usr/local/src/conda/python-3.11.13/Modules/main.c:605:21
#40 0x5f7dad in Py_RunMain /usr/local/src/conda/python-3.11.13/Modules/main.c:684:5
#41 0x5bcdf8 in Py_BytesMain /usr/local/src/conda/python-3.11.13/Modules/main.c:738:12
#42 0x7ffff71a4d8f (/lib/x86_64-linux-gnu/libc.so.6+0x29d8f) (BuildId: d5197096f709801829b118af1b7cf6631efa2dcd)
#43 0x7ffff71a4e3f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e3f) (BuildId: d5197096f709801829b118af1b7cf6631efa2dcd)
#44 0x5bcc42 in _start (/root/anaconda3/envs/pt-nightly/bin/python3.11+0x5bcc42)
0x502000944918 is located 0 bytes after 8-byte region [0x502000944910,0x502000944918)
allocated by thread T0 here:
#0 0x7ffff75a816d in operator new(unsigned long) (/usr/lib/llvm-18/lib/clang/18/lib/linux/libclang_rt.asan-x86_64.so+0x10616d) (BuildId: 4cd39e6608b20f2f5a148a941cd434e0cadcd3dc)
#1 0x7fffd038f55f in __gnu_cxx::new_allocator<long>::allocate(unsigned long, void const*) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:127:27
#2 0x7fffd038f55f in std::allocator_traits<std::allocator<long>>::allocate(std::allocator<long>&, unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:464:20
#3 0x7fffd038f55f in std::_Vector_base<long, std::allocator<long>>::_M_allocate(unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:346:20
#4 0x7fffd038f55f in std::vector<long, std::allocator<long>>::reserve(unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/vector.tcc:78:22
#5 0x7fffd038f55f in std::vector<long, std::allocator<long>> c10::generic_to<long>(c10::IValue, c10::_fake_type<std::vector<long, std::allocator<long>>>) /home/chenxu/pytorch/pytorch/aten/src/ATen/core/ivalue_inl.h:1785:10
SUMMARY: AddressSanitizer: heap-buffer-overflow /home/chenxu/pytorch/pytorch/aten/src/ATen/native/quantized/cpu/Pooling.cpp:638:3 in at::native::quantized_max_pool2d(at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool)::$_0::operator()() const::'lambda'()::operator()() const
Shadow bytes around the buggy address:
0x502000944680: fa fa fd fa fa fa fd fa fa fa fd fa fa fa fd fa
0x502000944700: fa fa fd fa fa fa fd fa fa fa fd fa fa fa fd fa
0x502000944780: fa fa fd fa fa fa fd fa fa fa fd fd fa fa 00 fa
0x502000944800: fa fa 00 00 fa fa 00 00 fa fa 00 fa fa fa 00 00
0x502000944880: fa fa 00 00 fa fa fd fd fa fa 00 fa fa fa 00 fa
=>0x502000944900: fa fa 00[fa]fa fa 00 00 fa fa 00 00 fa fa fa fa
0x502000944980: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x502000944a00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x502000944a80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x502000944b00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x502000944b80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==26777==ABORTING
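Judging from the allocation trace, the overflowing 8-byte region is a single-element std::vector<long> produced while unboxing one of the int-list arguments (c10::generic_to<long>), and the read lands exactly one element past its end. So the code around Pooling.cpp:638 presumably indexes element [1] of a size-1 parameter list without first validating its length. Until length/positivity checks land in the kernel, the call can be guarded on the Python side; below is a minimal sketch (safe_quantized_max_pool2d is a hypothetical helper, not a PyTorch API) that rejects both repro inputs above before they reach the C++ kernel:

import torch

def safe_quantized_max_pool2d(qx, kernel_size, stride=None,
                              padding=0, dilation=1, ceil_mode=False):
    # Hypothetical user-side guard: normalize int -> (int, int) and
    # reject wrong-length or non-positive values before the C++
    # kernel ever sees them.
    def as_pair(value, name, minimum):
        pair = (value, value) if isinstance(value, int) else tuple(value)
        if len(pair) != 2 or any(v < minimum for v in pair):
            raise ValueError(f"{name} must be two ints >= {minimum}, got {value!r}")
        return pair

    kernel_size = as_pair(kernel_size, "kernel_size", 1)
    # As with the regular max pool, stride falls back to kernel_size.
    stride = kernel_size if stride is None else as_pair(stride, "stride", 1)
    padding = as_pair(padding, "padding", 0)
    dilation = as_pair(dilation, "dilation", 1)
    return torch.quantized_max_pool2d(qx, kernel_size, stride,
                                      padding, dilation, ceil_mode)

# Both repro calls are now rejected with a Python-level ValueError:
#   safe_quantized_max_pool2d(q_tensor, kernel_size=(0, 0), dilation=(0, 0))
#   safe_quantized_max_pool2d(q_tensor, [-576936867])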
Versions
Collecting environment information...
PyTorch version: N/A
Is debug build: N/A
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.5 LTS (x86_64)
GCC version: (conda-forge gcc 15.1.0-5) 15.1.0
Clang version: 18.1.8 (++20240731024944+3b5b5c1ec4a3-1~exp1~20240731145000.144)
CMake version: version 4.1.0
Libc version: glibc-2.35
Python version: 3.11.13 (main, Jun 5 2025, 13:12:00) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.4.0-152-generic-x86_64-with-glibc2.35
Is CUDA available: N/A
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Could not collect
Is XPU available: N/A
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: N/A
CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) Silver 4216 CPU @ 2.10GHz
CPU family: 6
Model: 85
Thread(s) per core: 1
Core(s) per socket: 32
Socket(s): 1
Stepping: 7
BogoMIPS: 4190.15
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq vmx ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves arat umip pku ospke avx512_vnni md_clear arch_capabilities
Virtualization: VT-x
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 1 MiB (32 instances)
L1i cache: 1 MiB (32 instances)
L2 cache: 128 MiB (32 instances)
L3 cache: 16 MiB (1 instance)
NUMA node(s): 1
NUMA node0 CPU(s): 0-31
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Vulnerability Retbleed: Mitigation; Enhanced IBRS
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Enhanced IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Mitigation; TSX disabled
Versions of relevant libraries:
[pip3] intel-cmplr-lib-ur==2025.2.1
[pip3] intel-openmp==2025.2.1
[pip3] mkl-include==2025.2.0
[pip3] mkl-static==2025.2.0
[pip3] numpy==2.3.2
[pip3] optree==0.17.0
[pip3] tbb==2022.2.0
[pip3] tbb-devel==2022.2.0
[pip3] tcmlib==1.4.0
[pip3] torch==2.9.0a0+git002e594
[pip3] umf==0.11.0
[conda] intel-cmplr-lib-ur 2025.2.1 pypi_0 pypi
[conda] intel-openmp 2025.2.1 pypi_0 pypi
[conda] mkl-include 2025.2.0 pypi_0 pypi
[conda] mkl-static 2025.2.0 pypi_0 pypi
[conda] numpy 2.3.2 pypi_0 pypi
[conda] optree 0.17.0 pypi_0 pypi
[conda] tbb 2022.2.0 pypi_0 pypi
[conda] tbb-devel 2022.2.0 pypi_0 pypi
[conda] tcmlib 1.4.0 pypi_0 pypi
[conda] torch 2.9.0a0+git002e594 pypi_0 pypi
[conda] umf 0.11.0 pypi_0 pypi
cc @jerryzh168 @jianyuh @raghuramank100 @jamesr66a @vkuzo @jgong5 @Xia-Weiwen @leslie-fang-intel