Add micro-kernel to compute FP16 GEMV

Jakub Sujak requested to merge jakub/fp16_gemv into main
  • Compute the general matrix-vector multiplication (GEMV) of an FP16 LHS with an FP16 RHS, accumulating into an FP16 output. The RHS is packed so that the FP16 weights and biases are stored together.

  • Optimized for Arm® Neon™ using MLA instructions.

  • Add accompanying tests.
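
For readers unfamiliar with packed-RHS GEMV, the idea described above can be sketched in portable scalar C. This is an illustration under stated assumptions, not the micro-kernel added by this merge request: `float` stands in for FP16 so the code builds on any host, and the block width `NR`, the function names `pack_rhs`, `packed_rhs_size` and `gemv_packed`, and the exact packed layout (biases first, then weights, per block of `NR` columns) are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical block width: number of output columns packed together. */
#define NR 4

/* Number of elements in the packed RHS: per block, NR biases + k rows of NR weights. */
static size_t packed_rhs_size(size_t n, size_t k) {
    return ((n + NR - 1) / NR) * NR * (k + 1);
}

/* Pack bias[n] and row-major rhs[k][n] into blocks of NR columns.
 * Assumed layout per block: NR biases, then k rows of NR weights.
 * Columns past n are zero-padded. */
static void pack_rhs(size_t n, size_t k, const float *rhs, const float *bias,
                     float *packed) {
    assert(rhs != NULL && bias != NULL && packed != NULL);
    for (size_t n0 = 0; n0 < n; n0 += NR) {
        for (size_t j = 0; j < NR; ++j) {
            *packed++ = (n0 + j < n) ? bias[n0 + j] : 0.0f;
        }
        for (size_t p = 0; p < k; ++p) {
            for (size_t j = 0; j < NR; ++j) {
                *packed++ = (n0 + j < n) ? rhs[p * n + n0 + j] : 0.0f;
            }
        }
    }
}

/* dst[n] = bias[n] + sum over p of lhs[p] * rhs[p][n], reading the packed RHS. */
static void gemv_packed(size_t n, size_t k, const float *lhs,
                        const float *packed, float *dst) {
    assert(lhs != NULL && packed != NULL && dst != NULL);
    for (size_t n0 = 0; n0 < n; n0 += NR) {
        float acc[NR];
        /* Accumulators start from the biases stored at the head of the block. */
        for (size_t j = 0; j < NR; ++j) {
            acc[j] = *packed++;
        }
        /* One multiply-accumulate per lane for each LHS element. */
        for (size_t p = 0; p < k; ++p) {
            const float a = lhs[p];
            for (size_t j = 0; j < NR; ++j) {
                acc[j] += a * *packed++;
            }
        }
        /* Store only the valid columns of a partial last block. */
        for (size_t j = 0; j < NR && n0 + j < n; ++j) {
            dst[n0 + j] = acc[j];
        }
    }
}
```

Storing the biases next to the weights means the kernel reads a single contiguous stream per column block, and the inner lane loop is the part a Neon implementation would map onto vector multiply-accumulate instructions.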

Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>
