Description
MNN: 2.9.3
API: C++
I encountered a problem: the Slice op does not respect the data layout (NC4HW4) for models converted from the Caffe framework.
Here is a simple Caffe prototxt:
```
layer {
  type: "Input"
  name: "input"
  top: "input"
  input_param {
    shape {
      dim: 1
      dim: 8
      dim: 6
      dim: 6
    }
  }
}
layer {
  type: "Slice"
  name: "slice"
  bottom: "input"
  top: "output1"
  top: "output2"
  top: "output3"
  top: "output4"
  slice_param {
    slice_point: 1
    slice_point: 5
    slice_point: 7
  }
}
```
It slices a 1x8x6x6 tensor into four tensors: 1x1x6x6, 1x4x6x6, 1x2x6x6 and 1x1x6x6.
After converting the model with the MNNConvert tool, it turned out that the 4th channel of the second slice contains wrong values.
After debugging the MNN code, I found that this likely happens because the data format for Caffe models is NC4HW4: the input data is packed into two 1x4x6x6 chunks, so the second slice should take three channels from the first chunk and its fourth channel from the second chunk.
However, the final method that performs the data copy, MNNTranspose32Bit, does not account for the C4 chunking.
For this particular model I tried to add a quick hack after this line:
```cpp
if (i == 3)
{
    si = srcO + 143;
}
```
to force proceeding to the next C4 chunk, and the results are then as expected.
Interestingly, the temporary output Tensor in CPURaster uses NCHW format, which is converted to NC4HW4 only as the last step. Perhaps the conversion of the input tensor to NCHW layout before slicing was forgotten.
Also, I tried to convert an ONNX model that does the same thing, and it works correctly. However, debugging shows that it uses a different tensor layout internally: NCHW.
I attach the sample models (slice_caffe.mnn - converted from Caffe, slice_onnx.mnn - converted from ONNX) and sample code (inference.cpp) that can be used to reproduce the bug (it creates a 1x8x6x6 tensor filled with values 0 to 287, passes it through the slice model, and checks that the outputs match the input):
slice_caffe.mnn.zip
slice_onnx.mnn.zip
inference.cpp.zip