MLBEDSW-11137: Depthwise Maxpool reduce over Height

William Isaksson requested to merge MLBEDSW-11137 into main

Drastically improves the performance of Transformer SoftMax ops on Ethos-U85 by inserting a Transpose before the SoftMax Reduce-Max whenever doing so yields a transposed output depth of at least 16.

This improves memory access patterns, since more operators can read or write in the more burst-efficient 'Brick' format. It also lets the MaxPool operate on a channel axis that is not larger than 1, which makes it much more computationally efficient.
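As a rough illustration of the idea only (not the code in this change), the sketch below shows in plain Python that a Reduce-Max over the last axis gives the same result as a Reduce-Max over height on the transposed tensor, together with a viability check using the depth threshold of 16 quoted above. The 2-D layout and the names `transpose_is_viable`, `reduce_max_last_axis`, `reduce_max_over_height` and `MIN_TRANSPOSED_DEPTH` are assumptions made for this example.

```python
# Hypothetical sketch: names and the 2-D layout are illustrative assumptions,
# not taken from the compiler sources.
MIN_TRANSPOSED_DEPTH = 16  # threshold quoted in the description


def transpose_is_viable(rows: int) -> bool:
    """After the transpose, the original row count becomes the depth."""
    return rows >= MIN_TRANSPOSED_DEPTH


def transpose(x):
    """Swap the two axes of a list-of-lists matrix."""
    return [list(col) for col in zip(*x)]


def reduce_max_last_axis(x):
    """Reference behaviour: maximum along the last axis of x[rows][cols]."""
    return [max(row) for row in x]


def reduce_max_over_height(x_t):
    """Maximum down the height axis of x_t[height][depth], one per depth lane."""
    return [max(col) for col in zip(*x_t)]


# 16 rows by 5 columns, so the transposed depth is 16.
x = [[(r * 7 + c * 3) % 11 for c in range(5)] for r in range(16)]

direct = reduce_max_last_axis(x)
via_transpose = reduce_max_over_height(transpose(x))

assert transpose_is_viable(len(x))
assert direct == via_transpose
```

The two results match element for element, so the transpose changes only the layout the reduction runs over, not its output.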

Change-Id: I67c9a1ffc48396f1dca6187fe2842812940ad09e
Signed-off-by: William Isaksson <william.isaksson@arm.com>
