Perform clamp after calculation

There is an issue with
kai_matmul_clamp_f32_qai8dxp1vlx4_qsi8cxp4vlx4_1vlx4vl_sme_mopa_asm
where it applies clamping before scaling, which results in precision
issues for certain clamp values. Apply the clamp after the scaling
calculation instead.

Signed-off-by: Emil Ohlsson <emil.ohlsson@arm.com>