Fix out-of-bounds reads in the LHS packing function
- Rewrite the optimized path for kr = 16
- The out-of-bounds check is no longer required because the optimized path now covers only the in-bounds portion of the matrix
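
For reviewers, the idea behind the two points above can be sketched roughly as follows. This is a simplified scalar illustration, not the code in this change: `pack_row_sketch` and `KR` are hypothetical names, the data is assumed to be int8, and the zero-padding of the last block is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { KR = 16 }; /* block width along K, mirroring the kr = 16 case */

/* Hypothetical helper (not the real packing function): packs one row of
 * k bytes into KR-wide blocks and zero-pads the last block.
 * Returns the number of bytes written to dst. */
static size_t pack_row_sketch(const int8_t* src, size_t k, int8_t* dst) {
    assert(src != NULL && dst != NULL);

    const size_t full_blocks = k / KR; /* blocks that lie entirely inside the row */
    const size_t tail = k % KR;        /* leftover elements, 0..KR-1 */
    size_t written = 0;

    /* Optimized path: each iteration reads exactly KR in-bounds bytes,
     * so no per-block bounds check is needed. */
    for (size_t b = 0; b < full_blocks; ++b) {
        memcpy(dst + written, src + b * KR, KR);
        written += KR;
    }

    /* Remainder: read only the bytes that exist, then pad with zeros
     * (padding value is an assumption of this sketch). */
    if (tail != 0) {
        memcpy(dst + written, src + full_blocks * KR, tail);
        memset(dst + written + tail, 0, KR - tail);
        written += KR;
    }
    return written;
}
```

With k = 20, for example, the loop copies one full 16-byte block and the remainder step copies the last 4 bytes plus 12 bytes of padding, so nothing past `src + 20` is ever read.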
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>