- Dec 02, 2024
-
-
Signed-off-by:
Gian Marco Iodice <gianmarco.iodice@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
- GEMM and GEMV Micro-kernels to compute the matrix multiplication of dynamically quantized symmetric signed 8-bit integer with per-block quantization (QSI8D32) LHS matrix and quantized symmetric 4-bit signed integer with per-block quantization (QSI4C32) RHS matrix and the accumulation of the result into a single-precision (F32) output, optimized for SME2 technology. Signed-off-by:
Gian Marco Iodice <gianmarco.iodice@arm.com> Signed-off-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Reviewed-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
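To make the data formats named in the description above more concrete, here is a minimal C sketch of symmetric per-block quantization to signed 8-bit integers, followed by a block dot product accumulated into F32. The block length of 32, the scale rule max|x| / 127, and every function name here are assumptions made for illustration. The sketch does not reproduce the KleidiAI packing layout or the SME2 kernels.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed block length for this sketch: one scale per 32 values. */
#define EXAMPLE_BLOCK_LEN 32

/* Hypothetical helper: round to nearest, halves away from zero. */
static int32_t example_round(float x) {
    return (int32_t)(x >= 0.0f ? x + 0.5f : x - 0.5f);
}

/* Quantize one block of floats to int8 with a symmetric scale (no zero
 * point). Returns the scale, so that src[i] ~= dst[i] * scale. */
static float example_quantize_block_s8(const float* src, int8_t* dst) {
    assert(src != NULL && dst != NULL);
    float max_abs = 0.0f;
    for (size_t i = 0; i < EXAMPLE_BLOCK_LEN; ++i) {
        const float a = src[i] < 0.0f ? -src[i] : src[i];
        if (a > max_abs) max_abs = a;
    }
    const float scale = max_abs / 127.0f;
    const float inv_scale = scale > 0.0f ? 1.0f / scale : 0.0f;
    for (size_t i = 0; i < EXAMPLE_BLOCK_LEN; ++i) {
        float v = src[i] * inv_scale;
        if (v > 127.0f) v = 127.0f;   /* clamp before converting to int */
        if (v < -127.0f) v = -127.0f;
        dst[i] = (int8_t)example_round(v);
    }
    return scale;
}

/* Dot product of one LHS block (int8) with one RHS block whose values are
 * assumed to be already unpacked from 4 bits into int8, range [-8, 7].
 * Products are accumulated in int32 and then scaled into a float. */
static float example_block_dot_f32(const int8_t* lhs, float lhs_scale,
                                   const int8_t* rhs, float rhs_scale) {
    assert(lhs != NULL && rhs != NULL);
    int32_t acc = 0;
    for (size_t i = 0; i < EXAMPLE_BLOCK_LEN; ++i) {
        acc += (int32_t)lhs[i] * (int32_t)rhs[i];
    }
    return (float)acc * lhs_scale * rhs_scale;
}
```

A full matrix multiplication would repeat the block dot product along the K dimension and sum the per-block float results for each output element.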
-
Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
- Nov 29, 2024
-
-
Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Anton Bondarenko <anton.bondarenko@arm.com>
-
Signed-off-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
Anton Bondarenko authored
Signed-off-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
Signed-off-by:
Jens Elofsson <jens.elofsson@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
Suhail M authored
* Add the SME Int8 GEMM set of microkernels:
  - LHS packing kernel.
  - Non-transposed RHS packing kernel.
  - Main kernel.
* Update the test framework to support static int8 GEMM.
Resolves: KLEIDIAI-171, KLEIDIAI-235, KLEIDIAI-39
Signed-off-by:
Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com> Signed-off-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Reviewed-by:
Mohammed Suhail Munshi <mohammedsuhail.munshi@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
Anton Bondarenko authored
The current solution for SME microkernel support is to use SME opcodes and SVE instructions in streaming mode. A precondition for the compiler to understand SVE instructions is compiling with -march=...+sve+sve2. However, this also allows the compiler to generate its own SVE instructions for normal C/C++ code, which might cause an illegal instruction exception on CPUs where SME is implemented without SVE. In this case we want to disable the use of compiler-generated SVE instructions. Test that no SVE instructions are emitted by using an FVP with SVE support disabled. Signed-off-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
Signed-off-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
RHS packing is required. LHS packing is not required. Signed-off-by:
Felix Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Reviewed-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
Jakub Sujak authored
Regenerate the SME2 GEMV micro-kernel assembly so that it is contained within the SMSTART/SMSTOP boundary, preventing illegal instruction faults when attempting to execute streaming SVE code on a system without SVE support. Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
This commit adds:
* A bf16 x bf16 = fp16 matmul microkernel with 8x12 output block size
* LHS/RHS packing functions that pack and convert the inputs from fp16 to bf16
* Corresponding tests, modifications to the testing framework, and reference implementation
Signed-off-by:
Gunes Bayir <gunes.bayir@arm.com> Reviewed-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Jakub Sujak <jakub.sujak@arm.com> Reviewed-by:
Gunes Bayir <gunes.bayir@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
Felix Johnny Thomasmathibalan authored
Affected micro kernel: FP16 GEMM, SME2 Signed-off-by:
Felix Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
- Nov 28, 2024
-
-
Jens Elofsson authored
Signed-off-by:
Jens Elofsson <jens.elofsson@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
- Add GeMM-like micro-kernels
- Add GeMV-like micro-kernels
Signed-off-by:
Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
This change fixes a minor copy-paste error in the kernel interfaces. Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
- Nov 27, 2024
-
-
Signed-off-by:
Michael Kozlov <michael.kozlov@arm.com> Signed-off-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Anton Bondarenko <anton.bondarenko@arm.com>
-
- Nov 21, 2024
-
-
Add fp16 kernels for LHS and RHS packing, and for matmul. Also add related unit tests for these kernels, and extend the Matmul unit tests to support calling fp16 kernels. Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Reviewed-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
- Nov 20, 2024
-
-
Emil Ohlsson authored
KleidiAI is intended to target certain build environments, which means that it should be buildable using CMake version 3.16. Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
- Nov 19, 2024
-
-
* Round off the odd strides for the int4 RHS by padding with 0s Signed-off-by:
Anitha Raj <anitha.raj@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
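As a rough illustration of the padding described in this entry, the following C sketch packs signed 4-bit values two per byte and rounds an odd row length up to a whole number of bytes by zero-filling the unused nibble. The function names, the low-nibble-first layout, and the two's-complement nibble encoding are assumptions made for this example and are not taken from the KleidiAI sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: bytes needed for one row of k int4 values.
 * Two values share a byte, so an odd k is rounded up to the next byte. */
static size_t example_int4_row_stride(size_t k) {
    return (k + 1) / 2;
}

/* Hypothetical helper: pack k signed 4-bit values (range [-8, 7], stored
 * one per int8_t) into dst. Assumed layout: even index in the low nibble,
 * odd index in the high nibble. If k is odd, the last high nibble stays 0. */
static void example_pack_int4_row(const int8_t* src, size_t k, uint8_t* dst) {
    const size_t stride = example_int4_row_stride(k);
    memset(dst, 0, stride); /* zero padding for any unused nibble */
    for (size_t i = 0; i < k; ++i) {
        assert(src[i] >= -8 && src[i] <= 7);
        const uint8_t nibble = (uint8_t)((uint8_t)src[i] & 0x0F);
        if (i % 2 == 0) {
            dst[i / 2] |= nibble;
        } else {
            dst[i / 2] |= (uint8_t)(nibble << 4);
        }
    }
}
```

With this layout a row of 5 values occupies 3 bytes, and the high nibble of the third byte is the zero padding.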
-
- Nov 18, 2024
-
-
Alias the existing nxk/kxn packing parameter structs to a new one, to keep a consistent interface for the packing function(s) within a microkernel folder. Signed-off-by:
Michael Kozlov <michael.kozlov@arm.com> Reviewed-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Michael Kozlov <michael.kozlov@arm.com> Approved-by:
Gian Marco Iodice <gianmarco.iodice@arm.com>
-
Move the data generation in the unit tests to after feature support has been confirmed. Signed-off-by:
Jens Elofsson <jens.elofsson@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
- Nov 14, 2024
-
-
One of the f32 kernel tests incorrectly uses matmul functionality for both main and RHS support functionality. This commit addresses this issue. Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
Add generic transpose function, use it for non-transposed (kxn) RHS packing tests. Signed-off-by:
Anitha Raj <anitha.raj@arm.com> Signed-off-by:
Michael Kozlov <michael.kozlov@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
Anton Bondarenko authored
Signed-off-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
- Nov 13, 2024
-
-
Emil Ohlsson authored
This change addresses testing issues related to RHS and LHS packing tests. One issue relates to reference data comparison, which is addressed by adding proper rounding for comparisons. LHS packing tests did not correctly pass rolling parameters back to the packing kernel. RHS packing had issues with blocking, which affected portioned testing. Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
Emil Ohlsson authored
This change makes some readability improvements to the testing framework, which allow printing of rectangles and intermediate values for easy dumping while debugging. Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
- Nov 11, 2024
-
-
Anton Bondarenko authored
10 minutes should be enough for regular jobs. A job with an external dependency has a bigger timeout of 1 hour for better robustness. Signed-off-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
- Nov 06, 2024
-
-
Signed-off-by:
Jens Elofsson <jens.elofsson@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
The `CMAKE_SOURCE_DIR` variable always refers to the top-level source directory of the project being configured by CMake. This causes issues for CMake projects that fetch KleidiAI using `FetchContent()`, because KleidiAI's build then incorrectly assumes that its dependencies reside in that project's top-level directory rather than in KleidiAI's source tree. Resolve this issue by using the `CMAKE_CURRENT_SOURCE_DIR` variable, so that paths are relative to the KleidiAI project. Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
- Nov 04, 2024
-
-
Jakub Sujak authored
Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
Jakub Sujak authored
Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Reviewed-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
Now that the major third-party components are in the repository, there is no need to download them anymore. Signed-off-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
- Add variant to examples
- Add unit test for variant
Signed-off-by:
Michael Kozlov <michael.kozlov@arm.com> Approved-by:
Gian Marco Iodice <gianmarco.iodice@arm.com>
-
Local third-party components provide better clarity about the KleidiAI library's external dependencies. Commands used to get the files:
LICENSES/BSD-3-Clause.txt -> reuse download BSD-3-Clause
third_party/benchmark-v1.8.4.zip -> wget https://github.com/google/benchmark/archive/refs/tags/v1.8.4.zip
third_party/googletest-v.1.14.0.zip -> wget https://github.com/google/googletest/archive/refs/tags/v1.14.0.zip
Signed-off-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
Anton Bondarenko authored
It's preferred to keep the original license text untouched, so the common project code style should not apply to it. Signed-off-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
This helps distinguish it from the other SME GEMV kernel. Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Anton Bondarenko <anton.bondarenko@arm.com>
-
Felix Johnny Thomasmathibalan authored
This reverts commit 6b3c6fad. Signed-off-by:
Felix Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Approved-by:
Anton Bondarenko <anton.bondarenko@arm.com>
-
Anton Bondarenko authored
It's preferred to keep the original license text untouched, so the common project code style should not apply to it. Signed-off-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-