[ONNX] Refactor op level debugging #97494
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/97494
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 8867c75
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Fixes #97728 Fixes microsoft/onnxscript#393

Provide op_level_debug in the exporter, which creates randomized torch.Tensor inputs, based on the real shapes from FakeTensorProp, for both the torch ops and the ONNX symbolic functions. The test differs from [op_correctness_test.py](https://github.com/microsoft/onnx-script/blob/main/onnxscript/tests/function_libs/torch_aten/ops_correctness_test.py) in that op_level_debug generates real tensors based on the fake tensors in the model.

Limitations:
1. trace_only onnxscript functions are not supported due to the lack of param_schema.
2. Ops with dim/indices (INT64) inputs are not supported, as they need information (shape) from other input args.
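To make the mechanism concrete, here is a minimal sketch (not the exporter's actual implementation) of how a fake tensor's propagated shape can be turned into a random real tensor for op-level comparison; the helper name and the integer value range are assumptions:

```python
import torch

def random_tensor_from_fake(fake: torch.Tensor) -> torch.Tensor:
    """Create a random real tensor matching a fake tensor's shape and dtype."""
    shape = tuple(int(d) for d in fake.shape)  # assumes static (non-symbolic) dims
    if fake.dtype == torch.bool:
        return torch.randint(0, 2, shape).bool()
    if not fake.dtype.is_floating_point:
        # Integer inputs (e.g. dim/indices) need valid bounds, which generally
        # depend on the shapes of other args -- the limitation noted above.
        return torch.randint(0, 2, shape, dtype=fake.dtype)
    return torch.randn(shape, dtype=fake.dtype)
```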
Had CI fail on: Not sure if it's the machine or the test...
Fixes #97728 Fixes microsoft/onnxscript#393

Provide op_level_debug in the exporter, which creates randomized torch.Tensor inputs, based on the real shapes from FakeTensorProp, for both the torch ops and the ONNX symbolic functions. The PR leverages the Transformer class to create a new fx.Graph, but shares the same Module with the original one to save memory. The test differs from [op_correctness_test.py](https://github.com/microsoft/onnx-script/blob/main/onnxscript/tests/function_libs/torch_aten/ops_correctness_test.py) in that op_level_debug generates real tensors based on the fake tensors in the model.

Limitations:
1. Some trace_only functions are not supported due to the lack of param_schema, which leads to args/kwargs being wrongly split and to ndarray wrapping. (WARNINGS in SARIF)
2. Ops with dim/indices (INT64) inputs are not supported, as they need information (shape) from other input args. (WARNINGS in SARIF)
3. sym_size and built-in ops are not supported.
4. op_level_debug only labels results in SARIF; it doesn't stop the exporter.
5. Introduces an ONNX-owned FakeTensorProp that supports int/float/bool.
6. Parametrizes op_level_debug and dynamic_shapes in the FX tests.
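For context, this is roughly the torch.fx.Transformer pattern the description refers to, as a minimal sketch: transform() rebuilds the graph node by node while the resulting GraphModule reuses the original module's parameters and buffers rather than copying them. The subclass name and the metadata being copied are illustrative:

```python
import torch.fx

class MetadataPreservingTransformer(torch.fx.Transformer):
    def run_node(self, n: torch.fx.Node):
        result = super().run_node(n)
        # Transformer returns fx.Proxy objects wrapping nodes of the *new*
        # graph; carry the original node's metadata over to the new node.
        if isinstance(result, torch.fx.Proxy):
            result.node.meta = n.meta.copy()
        return result

# new_gm has a fresh fx.Graph but shares parameters/buffers with `gm`:
# new_gm = MetadataPreservingTransformer(gm).transform()
```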
Most likely because of ORT. This may be useful: microsoft/onnxscript#602
    DynamicSliceExportMod(),
    (x,),
    additional_test_inputs=[(y,)],
)

def test_mutation(self):
Please do me a favor and skip this one for dynamic too #98622 🤣
Sure!
Hmmm, the one reported is actually not dynamic. I will skip the whole test in this PR, and put skip_ORT_version on it in the next PR.
@pytorchbot merge
Merge started
Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team
Fixes pytorch#97728 Fixes pytorch#98622 Fixes microsoft/onnxscript#393

Provide op_level_debug in the exporter, which creates randomized torch.Tensor inputs, based on the real shapes from FakeTensorProp, for both the torch ops and the ONNX symbolic functions. The PR leverages the Transformer class to create a new fx.Graph, but shares the same Module with the original one to save memory. The test differs from [op_correctness_test.py](https://github.com/microsoft/onnx-script/blob/main/onnxscript/tests/function_libs/torch_aten/ops_correctness_test.py) in that op_level_debug generates real tensors based on the fake tensors in the model.

Limitations:
1. Some trace_only functions are not supported due to the lack of param_schema, which leads to args/kwargs being wrongly split and to ndarray wrapping. (WARNINGS in SARIF)
2. Ops with dim/indices (INT64) inputs are not supported, as they need information (shape) from other input args. (WARNINGS in SARIF)
3. sym_size and built-in ops are not supported.
4. op_level_debug only labels results in SARIF; it doesn't stop the exporter.
5. Introduces an ONNX-owned FakeTensorProp that supports int/float/bool.
6. Parametrizes op_level_debug and dynamic_shapes in the FX tests.

Pull Request resolved: pytorch#97494
Approved by: https://github.com/justinchuby, https://github.com/BowenBao
" typing.Sequence[int], torch.Tensor], as [None, None]:" | ||
) | ||
def test_shufflenet_v2_dynamic_axes(self): | ||
model = torchvision.models.shufflenet_v2_x0_5(pretrained=False) |
How does this pass CI ... Seems torchvision is not imported from anywhere?
Isn't this skipped?
🫢
no worries, I'm adding a skip decorator for torchvision if uninstalled in the other PR
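A minimal sketch of such a skip decorator, assuming unittest (the decorator name is hypothetical; the actual PR may structure it differently):

```python
import unittest

try:
    import torchvision  # noqa: F401

    HAS_TORCHVISION = True
except ImportError:
    HAS_TORCHVISION = False

# Apply to any test that builds torchvision models:
skip_if_no_torchvision = unittest.skipIf(
    not HAS_TORCHVISION, "torchvision is not installed"
)
```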
@@ -40,6 +40,22 @@ def export_fx_to_onnx(
    # ONNX does not support views and mutations.
    # Remove them since ONNX inference does not need them.
    module = passes.RemoveInputMutation(module).run(*fx_module_args)
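As an illustration of what such a pass does (a hedged sketch, not the actual RemoveInputMutation implementation), one can drop dangling in-place copies back into input placeholders, since ONNX inference has no use for them:

```python
import torch
import torch.fx

def remove_input_mutation(gm: torch.fx.GraphModule) -> torch.fx.GraphModule:
    placeholders = {n for n in gm.graph.nodes if n.op == "placeholder"}
    # aten.copy_(input, src) mutates an input in place; if nothing consumes
    # its result, the node can be erased without changing the graph outputs.
    to_erase = [
        node
        for node in gm.graph.nodes
        if node.op == "call_function"
        and node.target is torch.ops.aten.copy_.default
        and node.args
        and node.args[0] in placeholders
        and not node.users
    ]
    for node in to_erase:
        gm.graph.erase_node(node)
    gm.graph.lint()
    gm.recompile()
    return gm
```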
🤣
…he original gm (#98760)

From #97494 (comment), the passes should modify the gm in place, but before this PR, `ShapeInferenceWithFakeTensor` used Transform.transform() to make a copy of the gm and relied on the assumption that the topological order of the two graphs is the same. This PR addresses the issue by saving another metavalue, `static_shape`, into the gm for op_level_debug, instead of overwriting `val`.

Pull Request resolved: #98760
Approved by: https://github.com/BowenBao
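A minimal sketch of the in-place metadata update the fix describes, assuming torch.fx (the traversal is illustrative; the `static_shape` key follows the PR description):

```python
import torch
import torch.fx

def record_static_shapes(gm: torch.fx.GraphModule) -> None:
    # Mutate node.meta in place on the original gm -- no graph copy, so there
    # is no reliance on two graphs sharing the same topological order.
    for node in gm.graph.nodes:
        val = node.meta.get("val")
        if isinstance(val, torch.Tensor):  # FakeTensor subclasses torch.Tensor
            # Keep "val" intact; store the concrete shape under its own key
            # for op_level_debug to consume.
            node.meta["static_shape"] = tuple(int(d) for d in val.shape)
```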
ghstack-source-id: 350aa06 Pull Request resolved: pytorch/pytorch#97494
Stack from ghstack (oldest at bottom):
Fixes #97728
Fixes #98622
Fixes microsoft/onnxscript#393
Provide op_level_debug in the exporter, which creates randomized torch.Tensor inputs, based on the real shapes from FakeTensorProp, for both the torch ops and the ONNX symbolic functions. The PR leverages the Transformer class to create a new fx.Graph, but shares the same Module with the original one to save memory.

The test differs from op_correctness_test.py in that op_level_debug generates real tensors based on the fake tensors in the model.

Limitations:
1. Some trace_only functions are not supported due to the lack of param_schema, which leads to args/kwargs being wrongly split and to ndarray wrapping. (WARNINGS in SARIF)
2. Ops with dim/indices (INT64) inputs are not supported, as they need information (shape) from other input args. (WARNINGS in SARIF)
3. sym_size and built-in ops are not supported.
4. op_level_debug only labels results in SARIF; it doesn't stop the exporter.
5. Introduces an ONNX-owned FakeTensorProp that supports int/float/bool.
6. Parametrizes op_level_debug and dynamic_shapes in the FX tests.