Add micro-kernel to compute FP16 GEMV (!79) · Kleidi / KleidiAI

Jakub Sujak requested to merge jakub/fp16_gemv into main Jul 31, 2024

Add a micro-kernel that computes the general matrix-vector (GEMV) multiplication of an FP16 LHS and an FP16 RHS, accumulating into an FP16 output. The RHS is packed so that the FP16 weights and biases are stored together.
The micro-kernel is optimized for Arm® Neon™ using multiply-accumulate (MLA) instructions.
Add accompanying tests.
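
For readers unfamiliar with the packed-RHS idea, here is a minimal scalar sketch of the computation described above. It is not the micro-kernel from this merge request. The names `half_t`, `MAX_NR`, `pack_rhs` and `gemv_packed`, and the packing layout (per block of `nr` output columns: `nr` biases followed by `k` rows of `nr` weights) are assumptions made for illustration, and `float` stands in for FP16 so the code compiles on any host.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: float stands in for FP16 so the sketch builds anywhere. */
typedef float half_t;

/* Hypothetical upper bound on the column-block width. */
#define MAX_NR 16

/*
 * Pack a row-major k x n RHS together with n biases.
 * Assumed layout, per block of nr columns: nr biases, then k rows of
 * nr weights. Columns past n are zero-padded.
 * packed must hold ceil(n / nr) * nr * (k + 1) elements.
 */
void pack_rhs(size_t n, size_t k, size_t nr,
              const half_t *rhs, const half_t *bias, half_t *packed)
{
    assert(nr > 0);
    for (size_t n0 = 0; n0 < n; n0 += nr) {
        for (size_t j = 0; j < nr; ++j) {
            *packed++ = (n0 + j < n) ? bias[n0 + j] : (half_t)0;
        }
        for (size_t kk = 0; kk < k; ++kk) {
            for (size_t j = 0; j < nr; ++j) {
                *packed++ = (n0 + j < n) ? rhs[kk * n + n0 + j] : (half_t)0;
            }
        }
    }
}

/*
 * dst[j] = bias[j] + sum over kk of lhs[kk] * rhs[kk][j], reading the
 * packed buffer sequentially. The accumulators start from the packed
 * biases, so no separate bias pass is needed.
 */
void gemv_packed(size_t n, size_t k, size_t nr,
                 const half_t *lhs, const half_t *packed, half_t *dst)
{
    assert(nr > 0 && nr <= MAX_NR);
    for (size_t n0 = 0; n0 < n; n0 += nr) {
        half_t acc[MAX_NR];
        for (size_t j = 0; j < nr; ++j) {
            acc[j] = *packed++;              /* initialise with bias */
        }
        for (size_t kk = 0; kk < k; ++kk) {
            for (size_t j = 0; j < nr; ++j) {
                acc[j] += lhs[kk] * (*packed++); /* multiply-accumulate */
            }
        }
        for (size_t j = 0; j < nr && n0 + j < n; ++j) {
            dst[n0 + j] = acc[j];            /* skip padded columns */
        }
    }
}
```

The inner `acc[j] += lhs[kk] * ...` loop is the scalar counterpart of a vector multiply-accumulate: a vectorized kernel would typically keep `acc` in registers and process several columns per instruction. Storing biases next to the weights lets the kernel walk one contiguous buffer.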

Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>