[BuildSpeed] generated/TraceType_[01].cpp takes 15 min to compile with clang-17 #163853

@malfet

🐛 Describe the bug

Observed while running ninja torch_cpu on an M4 Pro:

% time /usr/bin/c++ -DAT_BUILD_ARM_VEC256_WITH_SLEEF -DAT_PER_OPERATOR_HEADERS -DCAFFE2_BUILD_MAIN_LIB -DCPUINFO_SUPPORTED_PLATFORM=1 -DENABLE_IPC_FABRIC -DFMT_HEADER_ONLY=1 -DFXDIV_USE_INLINE_ASSEMBLY=0 -DHAVE_MMAP=1 -DHAVE_SHM_OPEN=1 -DHAVE_SHM_UNLINK=1 -DKINETO_NAMESPACE=libkineto -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DNNP_CONVOLUTION_ONLY=0 -DNNP_INFERENCE_ONLY=0 -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DUSE_C10D_GLOO -DUSE_EXTERNAL_MZCRC -DUSE_RPC -DUSE_TENSORPIPE -DXNN_LOG_LEVEL=0 -D_FILE_OFFSET_BITS=64 -Dtorch_cpu_EXPORTS -I/Users/malfet/git/pytorch/pytorch/build/aten/src -I/Users/malfet/git/pytorch/pytorch/aten/src -I/Users/malfet/git/pytorch/pytorch/build -I/Users/malfet/git/pytorch/pytorch -I/Users/malfet/git/pytorch/pytorch/nlohmann -I/Users/malfet/git/pytorch/pytorch/moodycamel -I/Users/malfet/git/pytorch/pytorch/torch/csrc/api -I/Users/malfet/git/pytorch/pytorch/torch/csrc/api/include -I/Users/malfet/git/pytorch/pytorch/caffe2/aten/src/TH -I/Users/malfet/git/pytorch/pytorch/build/caffe2/aten/src/TH -I/Users/malfet/git/pytorch/pytorch/build/caffe2/aten/src -I/Users/malfet/git/pytorch/pytorch/build/caffe2/../aten/src -I/Users/malfet/git/pytorch/pytorch/torch/csrc -I/Users/malfet/git/pytorch/pytorch/torch/headeronly -I/Users/malfet/git/pytorch/pytorch/third_party/miniz-3.0.2 -I/Users/malfet/git/pytorch/pytorch/third_party/kineto/libkineto/include -I/Users/malfet/git/pytorch/pytorch/third_party/kineto/libkineto/src -I/Users/malfet/git/pytorch/pytorch/third_party/cpp-httplib -I/Users/malfet/git/pytorch/pytorch/aten/src/ATen/.. -I/Users/malfet/git/pytorch/pytorch/third_party/FXdiv/include -I/Users/malfet/git/pytorch/pytorch/c10/.. 
-I/Users/malfet/git/pytorch/pytorch/third_party/pthreadpool/include -I/Users/malfet/git/pytorch/pytorch/third_party/cpuinfo/include -I/Users/malfet/git/pytorch/pytorch/aten/src/ATen/native/quantized/cpu/qnnpack/include -I/Users/malfet/git/pytorch/pytorch/aten/src/ATen/native/quantized/cpu/qnnpack/src -I/Users/malfet/git/pytorch/pytorch/aten/src/ATen/native/quantized/cpu/qnnpack/deps/clog/include -I/Users/malfet/git/pytorch/pytorch/third_party/NNPACK/include -I/Users/malfet/git/pytorch/pytorch/third_party/FP16/include -I/Users/malfet/git/pytorch/pytorch/third_party/tensorpipe -I/Users/malfet/git/pytorch/pytorch/build/third_party/tensorpipe -I/Users/malfet/git/pytorch/pytorch/third_party/tensorpipe/third_party/libnop/include -I/Users/malfet/git/pytorch/pytorch/third_party/kleidiai -I/Users/malfet/git/pytorch/pytorch/third_party/fmt/include -I/Users/malfet/git/pytorch/pytorch/third_party/onnx -I/Users/malfet/git/pytorch/pytorch/build/third_party/onnx -I/Users/malfet/git/pytorch/pytorch/third_party/flatbuffers/include -F/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/System/Library/Frameworks -isystem /Users/malfet/git/pytorch/pytorch/build/third_party/gloo -isystem /Users/malfet/git/pytorch/pytorch/cmake/../third_party/gloo -isystem /Users/malfet/git/pytorch/pytorch/cmake/../third_party/tensorpipe/third_party/libuv/include -isystem /Users/malfet/git/pytorch/pytorch/cmake/../third_party/googletest/googlemock/include -isystem /Users/malfet/git/pytorch/pytorch/cmake/../third_party/googletest/googletest/include -isystem /Users/malfet/git/pytorch/pytorch/third_party/protobuf/src -isystem /Users/malfet/git/pytorch/pytorch/third_party/XNNPACK/include -isystem /Users/malfet/git/pytorch/pytorch/cmake/../third_party/eigen -isystem /Users/malfet/git/pytorch/pytorch/INTERFACE -isystem /Users/malfet/git/pytorch/pytorch/third_party/nlohmann/include -isystem /Users/malfet/git/pytorch/pytorch/third_party/concurrentqueue -isystem 
/Users/malfet/git/pytorch/pytorch/build/include  -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOROCTRACER -DLIBKINETO_NOXPUPTI=ON -DUSE_PYTORCH_QNNPACK -DAT_BUILD_ARM_VEC256_WITH_SLEEF -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -DC10_NODEPRECATED -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=braced-scalar-init -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wvla-extension -Wsuggest-override -Wnewline-eof -Winconsistent-missing-override -Winconsistent-missing-destructor-override -Wno-pass-failed -Wno-error=old-style-cast -Wconstant-conversion -Qunused-arguments -faligned-new -fno-math-errno -fno-trapping-math -Werror=format -DUSE_MPS -Wno-missing-braces -O3 -DNDEBUG -DNDEBUG -std=gnu++17 -arch arm64 -fPIC -fcolor-diagnostics -DTORCH_USE_LIBUV -DCAFFE2_USE_GLOO -D__NEON__ -Wall -Wextra -Wdeprecated -Wunused -Wno-unused-parameter -Wno-missing-field-initializers -Wno-array-bounds -Wno-unknown-pragmas -Wno-strict-overflow -Wno-strict-aliasing -Wextra-semi -Wmove -fvisibility=hidden -O2 -Wmissing-prototypes -Werror=missing-prototypes -Xpreprocessor -fopenmp -I/opt/homebrew/opt/libomp/include -Wno-missing-prototypes -Wno-error=missing-prototypes -o caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/generated/TraceType_0.cpp.o -c /Users/malfet/git/pytorch/pytorch/torch/csrc/autograd/generated/TraceType_0.cpp

/usr/bin/c++ -DAT_BUILD_ARM_VEC256_WITH_SLEEF -DAT_PER_OPERATOR_HEADERS        214.62s user 9.84s system 99% cpu 3:46.18 total
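
To check whether other translation units in the same `ninja torch_cpu` run are similarly slow, the build log can be ranked by wall-clock duration. The following is a minimal sketch, not part of the original measurement. It assumes the `.ninja_log` layout of tab-separated fields (start ms, end ms, mtime, output path, command hash) and a log located at `build/.ninja_log`; the function name `slowest_targets` is hypothetical.

```python
from typing import List, Tuple


def slowest_targets(log_text: str, top: int = 5) -> List[Tuple[float, str]]:
    """Return (seconds, output) pairs for the longest-running build steps."""
    durations = {}
    for line in log_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and the "# ninja log vN" header
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # not a well-formed entry under the assumed layout
        start_ms, end_ms, output = int(fields[0]), int(fields[1]), fields[3]
        # A later entry for the same output replaces an earlier one.
        durations[output] = (end_ms - start_ms) / 1000.0
    ranked = sorted(((sec, out) for out, sec in durations.items()), reverse=True)
    return ranked[:top]


if __name__ == "__main__":
    import sys

    # Assumed default location of the log; pass another path as argv[1].
    path = sys.argv[1] if len(sys.argv) > 1 else "build/.ninja_log"
    with open(path) as f:
        for seconds, output in slowest_targets(f.read()):
            print(f"{seconds:8.1f}s  {output}")
```

Note that the durations are wall-clock times taken while other jobs run in parallel, so they are only a rough guide; a standalone `time` invocation like the one above remains the more reliable number for a single file.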

Versions

CI

cc @jerryzh168 @seemethere

Labels: module: build, module: performance, triaged
