Tags: zhuhong61/pytorch
[wip] test_python_ir_utils_graph fixes pytorch#73474 fixes pytorch#73473 [ghstack-poisoned]
Update on "[quant][gpu][core] Implemented quantized add operator using cudnn" Summary: This PR implements the quantized add operator using cudnn operations. Also added a corresponding test function in test_quantized_op.py. Ideally, we should merge this function with the cpu variant, but for now, we will keep it separate until cudnn v8 is in the default build. Other factors also complicate the merge as cudnn quantized add is currently only supported for int8 symmetrically quantized tensors. Test Plan: In pytorch main dir, execute ``` python test/test_quantization.py TestQuantizedOps.test_qadd_relu_cudnn ``` Differential Revision: [D35009111](https://our.internmc.facebook.com/intern/diff/D35009111) [ghstack-poisoned]
Update on "[BC-breaking] Use ScatterGatherKernel for scatter_reduce (CPU-only)" Update signature of `scatter_reduce_` to match `scatter_/scatter_add_` `Tensor.scatter_reduce_(int64 dim, Tensor index, Tensor src, str reduce)` - Add new reduction options in ScatterGatherKernel.cpp and update `scatter_reduce` to call into the cpu kernel for `scatter.reduce` - `scatter_reduce` now has the same shape constraints as `scatter_` and `scatter_add_` - Migrate `test/test_torch.py:test_scatter_reduce` to `test/test_scatter_gather_ops.py` [ghstack-poisoned]
[wip] test_script_pack_padded_sequence Fixes pytorch#34213 [ghstack-poisoned]
Merge remote-tracking branch 'upstream/viable/strict' into batched-csr
Merge remote-tracking branch 'upstream/viable/strict' into batched-csr
Merge remote-tracking branch 'upstream/viable/strict' into batched-csr
Add a unit test for launching hierarchical SGD by post-localSGD optimizer
fix bugs, optimize index sort/count for the case nnz > size