llama.cpp with KleidiAI cannot run llama3.2 1B successfully
llama.cpp with KleidiAI cannot run llama3.2 1B successfully on a smartphone. I ran:

$ ./llama-cli -m ../llm_models/llama3.2/llama-3.2-1B-q4_0.gguf -p "Write a code in C for bubble sorting" -n 400 -t 1
and got the following output:

llm_load_print_meta: rope scaling = linear
llm_load_print_meta: freq_base_train = 500000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_ctx_orig_yarn = 131072
llm_load_print_meta: rope_finetuned = unknown
llm_load_print_meta: ssm_d_conv = 0
llm_load_print_meta: ssm_d_inner = 0
llm_load_print_meta: ssm_d_state = 0
llm_load_print_meta: ssm_dt_rank = 0
llm_load_print_meta: model type = ?B
llm_load_print_meta: model ftype = Q4_0
llm_load_print_meta: model params = 1.24 B
llm_load_print_meta: model size = 663.16 MiB (4.50 BPW)
llm_load_print_meta: general.name = Llama 3.2 1b Instruct Bnb 4bit
llm_load_print_meta: BOS token = 128000 '<|begin_of_text|>'
llm_load_print_meta: EOS token = 128009 '<|eot_id|>'
llm_load_print_meta: PAD token = 128004 '<|finetune_right_pad_id|>'
llm_load_print_meta: LF token = 128 'Ä'
llm_load_print_meta: EOT token = 128009 '<|eot_id|>'
llm_load_tensors: ggml ctx size = 0.07 MiB
llama_model_load: error loading model: done_getting_tensors: wrong number of tensors; expected 147, got 146
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model '../llm_models/llama3.2/llama-3.2-1B-q4_0.gguf'
main: error: unable to load model
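
To help narrow down the tensor-count mismatch, the number of tensors the file itself declares can be read from the GGUF header without llama.cpp. The following is only a minimal sketch, assuming the GGUF version 2 or 3 header layout (4-byte magic "GGUF", uint32 version, uint64 tensor count, uint64 metadata key/value count, all little-endian). `read_gguf_header` is a hypothetical helper written for this check and is not part of llama.cpp or KleidiAI.

```python
import struct


def read_gguf_header(path):
    """Return (version, tensor_count, kv_count) from a GGUF file header.

    Assumes the GGUF v2/v3 layout: magic, uint32 version,
    uint64 tensor count, uint64 metadata KV count (little-endian).
    """
    with open(path, "rb") as f:
        header = f.read(24)
    if len(header) < 24 or header[:4] != b"GGUF":
        raise ValueError("not a GGUF file (bad magic or truncated header)")
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:24])
    if version < 2:
        # GGUF v1 used 32-bit counts, so the fields above would be misread.
        raise ValueError(f"unsupported GGUF version: {version}")
    return version, tensor_count, kv_count


# Example (path taken from the command in this report):
# print(read_gguf_header("../llm_models/llama3.2/llama-3.2-1B-q4_0.gguf"))
```

Running it against the model file would show which of the two counts in the error message (147 or 146) matches what the file declares, which may help tell a model-file problem apart from a loader problem.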