Labels
module: dispatch (DispatchStub, Type, void pointer table, c10 dispatch), triaged (this issue has been looked at by a team member and triaged/prioritized into an appropriate module)
Description
https://gist.github.com/zou3519/b987e00a82c7e184b8896a5df7b0bfa9
Benchmarking two cases:
- torch.ops.mylib.foo: an operator with an Autograd key that takes unboxed inputs but a CPU key that boxes (via a return to Python)
- torch.ops.mylib.foo_cpp: an operator whose Autograd key and CPU key (in C++) both take unboxed inputs
num_tensors 5
2.7380013465881348 # clone
13.052228927612305 # foo
8.257509231567383 # foo_cpp
NB: Each operator has an Autograd key that accepts unboxed inputs, to emulate how built-in PyTorch operators work. If I delete the autograd registration for both operators, they both go through the boxed fallback, which brings the numbers much closer together (both around 8). It looks like a single unboxing isn't bad, but a boxing is.
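The shape of the overhead can be illustrated outside PyTorch with a toy dispatcher. This is a hedged, pure-Python sketch (all names here are hypothetical, not PyTorch APIs): the "unboxed" path passes arguments straight to the kernel, while the "boxed" path packs them into a generic stack and unpacks them at the kernel boundary, which is roughly what the boxing CPU key above has to do on every call.

```python
import timeit

def kernel(a, b):
    # Stand-in for the CPU kernel: trivial work, so call overhead dominates.
    return a + b

def unboxed_call(a, b):
    # Unboxed path: typed arguments flow directly into the kernel.
    return kernel(a, b)

def boxed_call(*args):
    # Boxed path: arguments are first packed into a generic "stack"
    # (here just a list), then unpacked again at the kernel boundary.
    stack = list(args)
    return kernel(*stack)

if __name__ == "__main__":
    n = 1_000_000
    unboxed = timeit.timeit(lambda: unboxed_call(1, 2), number=n)
    boxed = timeit.timeit(lambda: boxed_call(1, 2), number=n)
    print(f"unboxed: {unboxed:.3f}s  boxed: {boxed:.3f}s")
```

Both paths compute the same result; only the calling convention differs, so any timing gap between them is pure boxing/unboxing overhead, analogous to the gap between foo and foo_cpp above.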