Add micro-kernel to compute FP16 GEMV
-
Compute the general matrix-vector (GEMV) multiplication of an FP16 LHS and an FP16 RHS, accumulating into an FP16 output. The packed RHS stores the FP16 weights and biases together.
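For reviewers, the computation can be sketched as a scalar reference, under stated assumptions. This is not the micro-kernel added here: the function name `gemv_packed_ref` and the packed layout (n bias values followed by k rows of n weights) are hypothetical, and `float` stands in for FP16 so the sketch builds with any C compiler.

```c
#include <stddef.h>

/*
 * Scalar reference sketch of GEMV with bias folded into the RHS buffer.
 * Assumed (hypothetical) packed layout:
 *   rhs_packed[0 .. n)        : one bias value per output column
 *   rhs_packed[n .. n + k*n)  : weights, row-major, k rows of n columns
 * float is used as a portable stand-in for FP16.
 */
static void gemv_packed_ref(size_t k, size_t n, const float *lhs,
                            const float *rhs_packed, float *dst) {
    const float *bias = rhs_packed;
    const float *weights = rhs_packed + n;

    /* Start each output element from its bias. */
    for (size_t col = 0; col < n; ++col) {
        dst[col] = bias[col];
    }

    /* Multiply-accumulate one LHS element against one weight row at a time. */
    for (size_t row = 0; row < k; ++row) {
        for (size_t col = 0; col < n; ++col) {
            dst[col] += lhs[row] * weights[row * n + col];
        }
    }
}
```

The inner loop is the multiply-accumulate step that a vectorized kernel would perform on several output columns at once.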
-
Optimized for Arm® Neon™ using MLA instructions.
-
Add accompanying tests.
Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>