[ONNX] Refactor op level debugging #97494
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/97494
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 8867c75
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Fixes #97728 Fixes microsoft/onnxscript#393

Provide op_level_debug in the exporter, which creates randomized torch.Tensor inputs, based on the real shapes from FakeTensorProp, for both the torch ops and the ONNX symbolic functions. The test differs from [op_correctness_test.py](https://github.com/microsoft/onnx-script/blob/main/onnxscript/tests/function_libs/torch_aten/ops_correctness_test.py) in that op_level_debug generates real tensors based on the fake tensors in the model.

Limitations:
1. trace_only onnxscript functions are not supported due to the lack of param_schema.
2. Ops with dim/indices (INT64) inputs are not supported, as they need information (shape) from other input args.
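To make the mechanism concrete, here is a minimal sketch (not the exporter's actual implementation) of how a fake tensor's propagated shape can be turned into a random real tensor for op-level comparison; the helper name and the integer value range are assumptions:

```python
import torch

def random_tensor_from_fake(fake: torch.Tensor) -> torch.Tensor:
    """Create a random real tensor matching a fake tensor's shape and dtype."""
    shape = tuple(int(d) for d in fake.shape)  # assumes static (non-symbolic) dims
    if fake.dtype == torch.bool:
        return torch.randint(0, 2, shape).bool()
    if not fake.dtype.is_floating_point:
        # Integer inputs (e.g. dim/indices) need valid bounds, which generally
        # depend on the shapes of other args -- the limitation noted above.
        return torch.randint(0, 2, shape, dtype=fake.dtype)
    return torch.randn(shape, dtype=fake.dtype)
```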
Had CI fail on: Not sure if it's the machine or the test...
Fixes #97728 Fixes microsoft/onnxscript#393

Provide op_level_debug in the exporter, which creates randomized torch.Tensor inputs, based on the real shapes from FakeTensorProp, for both the torch ops and the ONNX symbolic functions. The PR leverages the Transformer class to create a new fx.Graph, but shares the same Module with the original one to save memory. The test differs from [op_correctness_test.py](https://github.com/microsoft/onnx-script/blob/main/onnxscript/tests/function_libs/torch_aten/ops_correctness_test.py) in that op_level_debug generates real tensors based on the fake tensors in the model.

Limitations:
1. Some trace_only functions are not supported due to the lack of param_schema, which leads to args/kwargs being wrongly split and to ndarray wrapping. (WARNINGS in SARIF)
2. Ops with dim/indices (INT64) inputs are not supported, as they need information (shape) from other input args. (WARNINGS in SARIF)
3. sym_size and built-in ops are not supported.
4. op_level_debug only labels results in SARIF; it doesn't stop the exporter.
5. Introduces an ONNX-owned FakeTensorProp that supports int/float/bool.
6. Parametrizes op_level_debug and dynamic_shapes in the FX tests.
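For context, this is roughly the torch.fx.Transformer pattern the description refers to, as a minimal sketch: transform() rebuilds the graph node by node while the resulting GraphModule reuses the original module's parameters and buffers rather than copying them. The subclass name and the metadata being copied are illustrative:

```python
import torch.fx

class MetadataPreservingTransformer(torch.fx.Transformer):
    def run_node(self, n: torch.fx.Node):
        result = super().run_node(n)
        # Transformer returns fx.Proxy objects wrapping nodes of the *new*
        # graph; carry the original node's metadata over to the new node.
        if isinstance(result, torch.fx.Proxy):
            result.node.meta = n.meta.copy()
        return result

# new_gm has a fresh fx.Graph but shares parameters/buffers with `gm`:
# new_gm = MetadataPreservingTransformer(gm).transform()
```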
Most likely because of ORT. This may be useful: microsoft/onnxscript#602
    DynamicSliceExportMod(),
    (x,),
    additional_test_inputs=[(y,)],
)

def test_mutation(self):
Please do me a favor and skip this one for dynamic too #98622 🤣
Sure!
Hmmm, the one reported is actually not dynamic. I will skip the whole test in this PR, and put skip_ORT_version on it in the next PR.
@pytorchbot merge
Merge started
Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team
Fixes pytorch#97728 Fixes pytorch#98622 Fixes microsoft/onnxscript#393

Provide op_level_debug in the exporter, which creates randomized torch.Tensor inputs, based on the real shapes from FakeTensorProp, for both the torch ops and the ONNX symbolic functions. The PR leverages the Transformer class to create a new fx.Graph, but shares the same Module with the original one to save memory. The test differs from [op_correctness_test.py](https://github.com/microsoft/onnx-script/blob/main/onnxscript/tests/function_libs/torch_aten/ops_correctness_test.py) in that op_level_debug generates real tensors based on the fake tensors in the model.

Limitations:
1. Some trace_only functions are not supported due to the lack of param_schema, which leads to args/kwargs being wrongly split and to ndarray wrapping. (WARNINGS in SARIF)
2. Ops with dim/indices (INT64) inputs are not supported, as they need information (shape) from other input args. (WARNINGS in SARIF)
3. sym_size and built-in ops are not supported.
4. op_level_debug only labels results in SARIF; it doesn't stop the exporter.
5. Introduces an ONNX-owned FakeTensorProp that supports int/float/bool.
6. Parametrizes op_level_debug and dynamic_shapes in the FX tests.

Pull Request resolved: pytorch#97494
Approved by: https://github.com/justinchuby, https://github.com/BowenBao
" typing.Sequence[int], torch.Tensor], as [None, None]:" | ||
) | ||
def test_shufflenet_v2_dynamic_axes(self): | ||
model = torchvision.models.shufflenet_v2_x0_5(pretrained=False) |
How does this pass CI ... Seems torchvision is not imported from anywhere?
Isn't this skipped?
🫢
no worries, I'm adding a skip decorator for torchvision if uninstalled in the other PR
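A minimal sketch of such a skip decorator, assuming unittest (the decorator name is hypothetical; the actual PR may structure it differently):

```python
import unittest

try:
    import torchvision  # noqa: F401

    HAS_TORCHVISION = True
except ImportError:
    HAS_TORCHVISION = False

# Apply to any test that builds torchvision models:
skip_if_no_torchvision = unittest.skipIf(
    not HAS_TORCHVISION, "torchvision is not installed"
)
```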
@@ -40,6 +40,22 @@ def export_fx_to_onnx(
    # ONNX does not support views and mutations.
    # Remove them since ONNX inference does not need them.
    module = passes.RemoveInputMutation(module).run(*fx_module_args)
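As an illustration of what such a pass does (a hedged sketch, not the actual RemoveInputMutation implementation), one can drop dangling in-place copies back into input placeholders, since ONNX inference has no use for them:

```python
import torch
import torch.fx

def remove_input_mutation(gm: torch.fx.GraphModule) -> torch.fx.GraphModule:
    placeholders = {n for n in gm.graph.nodes if n.op == "placeholder"}
    # aten.copy_(input, src) mutates an input in place; if nothing consumes
    # its result, the node can be erased without changing the graph outputs.
    to_erase = [
        node
        for node in gm.graph.nodes
        if node.op == "call_function"
        and node.target is torch.ops.aten.copy_.default
        and node.args
        and node.args[0] in placeholders
        and not node.users
    ]
    for node in to_erase:
        gm.graph.erase_node(node)
    gm.graph.lint()
    gm.recompile()
    return gm
```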
🤣
…he original gm (#98760)

From #97494 (comment), the passes should modify the gm in place, but before this PR, `ShapeInferenceWithFakeTensor` used Transform.transform() to make a copy of the gm and relied on the assumption that the topological order of the two graphs is the same. This PR addresses the issue by saving another metavalue, `static_shape`, into the gm for op_level_debug, instead of overwriting `val`.

Pull Request resolved: #98760
Approved by: https://github.com/BowenBao
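A minimal sketch of the in-place metadata update the fix describes, assuming torch.fx (the traversal is illustrative; the `static_shape` key follows the PR description):

```python
import torch
import torch.fx

def record_static_shapes(gm: torch.fx.GraphModule) -> None:
    # Mutate node.meta in place on the original gm -- no graph copy, so there
    # is no reliance on two graphs sharing the same topological order.
    for node in gm.graph.nodes:
        val = node.meta.get("val")
        if isinstance(val, torch.Tensor):  # FakeTensor subclasses torch.Tensor
            # Keep "val" intact; store the concrete shape under its own key
            # for op_level_debug to consume.
            node.meta["static_shape"] = tuple(int(d) for d in val.shape)
```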
ghstack-source-id: 350aa06 Pull Request resolved: pytorch/pytorch#97494
Stack from ghstack (oldest at bottom):
Fixes #97728
Fixes #98622
Fixes microsoft/onnxscript#393
Provide op_level_debug in the exporter, which creates randomized torch.Tensor inputs, based on the real shapes from FakeTensorProp, for both the torch ops and the ONNX symbolic functions. The PR leverages the Transformer class to create a new fx.Graph, but shares the same Module with the original one to save memory.

The test differs from op_correctness_test.py in that op_level_debug generates real tensors based on the fake tensors in the model.

Limitations:
1. Some trace_only functions are not supported due to the lack of param_schema, which leads to args/kwargs being wrongly split and to ndarray wrapping. (WARNINGS in SARIF)
2. Ops with dim/indices (INT64) inputs are not supported, as they need information (shape) from other input args. (WARNINGS in SARIF)
3. sym_size and built-in ops are not supported.
4. op_level_debug only labels results in SARIF; it doesn't stop the exporter.
5. Introduces an ONNX-owned FakeTensorProp that supports int/float/bool.
6. Parametrizes op_level_debug and dynamic_shapes in the FX tests.