
Add wider variants of Advanced SIMD FP16 and FP32 MatMul

Jakub Sujak requested to merge jakub/neon_fp16 into main

Add 6x32 block size variant of Advanced SIMD FP16 MatMul, increased from the original 6x16 variant.

Add 6x16 block size variant of Advanced SIMD FP32 MatMul, increased from the original 6x8 variant.
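
For readers unfamiliar with the term, the following is a minimal reference sketch of what a 6x16 block size means: each call to the micro-kernel produces a tile of up to 6 output rows by 16 output columns. It is plain Python for illustration only, not the Advanced SIMD implementation in this merge request. The names `matmul_tile` and `matmul_blocked` are hypothetical.

```python
def matmul_tile(lhs, rhs, m0, n0, mr=6, nr=16):
    """Compute one output tile of at most mr x nr elements, starting at (m0, n0).

    lhs is an M x K matrix and rhs is a K x N matrix, both lists of rows.
    Hypothetical reference helper, not code from the merge request.
    """
    m, k, n = len(lhs), len(rhs), len(rhs[0])
    rows = min(mr, m - m0)  # edge tiles may be shorter than mr
    cols = min(nr, n - n0)  # edge tiles may be narrower than nr
    acc = [[0.0] * cols for _ in range(rows)]  # accumulators for the tile
    for p in range(k):
        for i in range(rows):
            a = lhs[m0 + i][p]
            for j in range(cols):
                acc[i][j] += a * rhs[p][n0 + j]
    return acc


def matmul_blocked(lhs, rhs, mr=6, nr=16):
    """Compute the full product by stepping over the output in mr x nr tiles."""
    m, n = len(lhs), len(rhs[0])
    out = [[0.0] * n for _ in range(m)]
    for m0 in range(0, m, mr):
        for n0 in range(0, n, nr):
            tile = matmul_tile(lhs, rhs, m0, n0, mr, nr)
            for i, row in enumerate(tile):
                out[m0 + i][n0:n0 + len(row)] = row
    return out
```

A wider tile reuses each loaded LHS value across more output columns, which is the usual motivation for increasing the block width.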

These are the maximum viable block sizes for these kernels.
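
One way to see why these sizes are a plausible upper bound is to count vector registers. The sketch below assumes the kernel keeps the whole output tile in 128-bit accumulator registers, which this description does not state explicitly. AArch64 Advanced SIMD provides 32 such registers (v0 to v31). The function name `accumulator_registers` is hypothetical.

```python
VECTOR_REGISTERS = 32  # AArch64 Advanced SIMD register file: v0..v31
VECTOR_BITS = 128      # width of each Advanced SIMD register


def accumulator_registers(rows, cols, element_bits):
    """Registers needed to hold a rows x cols tile of accumulators.

    Assumption: one 128-bit register per group of lanes in each row.
    """
    lanes = VECTOR_BITS // element_bits  # 8 lanes for FP16, 4 for FP32
    vectors_per_row = -(-cols // lanes)  # ceiling division
    return rows * vectors_per_row


if __name__ == "__main__":
    variants = [
        ("FP16 6x16 (original)", 6, 16, 16),
        ("FP16 6x32 (new)", 6, 32, 16),
        ("FP32 6x8 (original)", 6, 8, 32),
        ("FP32 6x16 (new)", 6, 16, 32),
        ("FP16 6x64 (hypothetical)", 6, 64, 16),
    ]
    for name, rows, cols, bits in variants:
        used = accumulator_registers(rows, cols, bits)
        fits = "fits" if used < VECTOR_REGISTERS else "does not fit"
        print(f"{name}: {used} accumulator registers, {fits}")
```

Under that assumption, both new variants need 24 accumulator registers, leaving 8 of the 32 for LHS and RHS operands. Doubling the width again would need 48, which exceeds the register file.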

Add a variant of the kernel optimized for the Arm® Cortex®-A55 processor.

Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>

Edited by Jakub Sujak
