ingFang SC", "Hiragino Sans GB", "Microsoft YaHei UI", "Microsoft YaHei", Arial, sans-serif;font-size: 1.2em;font-weight: bold;display: table;margin-right: auto;margin-bottom: 1em;margin-left: auto;padding-right: 1em;padding-left: 1em;border-bottom: 2px solid rgb(15, 76, 129);color: rgb(63, 63, 63);">Qwen2模型Text2SQL微调ingFang SC", "Hiragino Sans GB", "Microsoft YaHei UI", "Microsoft YaHei", Arial, sans-serif;margin: 1.5em 8px;letter-spacing: 0.1em;color: rgb(63, 63, 63);">这里,模型我们选择Qwen2-1.5B模型,微调框架使用Llama-Factory。ingFang SC", "Hiragino Sans GB", "Microsoft YaHei UI", "Microsoft YaHei", Arial, sans-serif;font-size: 1.2em;font-weight: bold;display: table;margin: 4em auto 2em;padding-right: 0.2em;padding-left: 0.2em;background: rgb(15, 76, 129);color: rgb(255, 255, 255);">Qwen2-1.5B微调ingFang SC", "Hiragino Sans GB", "Microsoft YaHei UI", "Microsoft YaHei", Arial, sans-serif;font-size: 1.1em;font-weight: bold;margin-top: 2em;margin-right: 8px;margin-bottom: 0.75em;padding-left: 8px;border-left: 3px solid rgb(15, 76, 129);color: rgb(63, 63, 63);">准备python环境ingFang SC", "Hiragino Sans GB", "Microsoft YaHei UI", "Microsoft YaHei", Arial, sans-serif;overflow-x: auto;border-radius: 8px;padding: 1em;margin: 10px 8px;">condacreate--namellama_factorypython=3.11 condaactivatellama_factoryingFang SC", "Hiragino Sans GB", "Microsoft YaHei UI", "Microsoft YaHei", Arial, sans-serif;font-size: 1.1em;font-weight: bold;margin-top: 2em;margin-right: 8px;margin-bottom: 0.75em;padding-left: 8px;border-left: 3px solid rgb(15, 76, 129);color: rgb(63, 63, 63);">部署llama-factoryingFang SC", "Hiragino Sans GB", "Microsoft YaHei UI", "Microsoft YaHei", Arial, sans-serif;overflow-x: auto;border-radius: 8px;padding: 1em;margin: 10px 8px;">gitclone--depth1https://github.com/hiyouga/LLaMA-Factory.git cdLLaMA-Factory pip3install-e".[torch,metrics]" #如果要在Windows平台上开启量化LoRA(QLoRA),需要安装预编译的bitsandbytes库 pip3installhttps://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.41.2.post2-py3-none-win_amd64.whl #安装pytorch condainstallpytorchtorchvisiontorchaudiopytorch-cuda=11.8-cpytorch-cnvidia #如启动报错,出现ImportError:cannotimportname'get_full_repo_name'from'huggingface_hub',需安装chardet pip3installchardetingFang SC", "Hiragino Sans GB", "Microsoft YaHei UI", "Microsoft YaHei", Arial, sans-serif;margin: 1.5em 8px;letter-spacing: 0.1em;color: rgb(63, 63, 63);">执行python src/webui.py启动: ingFang SC", "Hiragino Sans GB", "Microsoft YaHei UI", "Microsoft YaHei", Arial, sans-serif;font-size: 1.1em;font-weight: bold;margin-top: 2em;margin-right: 8px;margin-bottom: 0.75em;padding-left: 8px;border-left: 3px solid rgb(15, 76, 129);color: rgb(63, 63, 63);">Text2SQL微调ingFang SC", "Hiragino Sans GB", "Microsoft YaHei UI", "Microsoft YaHei", Arial, sans-serif;margin: 1.5em 8px;letter-spacing: 0.1em;color: rgb(63, 63, 63);">(1)准备样本数据集,如:{ "db_id":"department_management", 
"instruction":"IwantyoutoactasaSQLterminalinfrontofanexampledatabase,youneedonlytoreturnthesqlcommandtome.Belowisaninstructionthatdescribesatask,Writearesponsethatappropriatelycompletestherequest.\n\"\n##Instruction:\ndepartment_managementcontainstablessuchasdepartment,head,management.TabledepartmenthascolumnssuchasDepartment_ID,Name,Creation,Ranking,Budget_in_Billions,Num_Employees.Department_IDistheprimarykey.\nTableheadhascolumnssuchashead_ID,name,born_state,age.head_IDistheprimarykey.\nTablemanagementhascolumnssuchasdepartment_ID,head_ID,temporary_acting.department_IDistheprimarykey.\nThehead_IDofmanagementistheforeignkeyofhead_IDofhead.\nThedepartment_IDofmanagementistheforeignkeyofDepartment_IDofdepartment.\n\n", "input":"###Input:\nHowmanyheadsofthedepartmentsareolderthan56?\n\n###Response:", "output":"SELECTcount(*)FROMheadWHEREage>56", "history":[] }
(2) Add the dataset. Copy the dataset JSON files into the LLaMA-Factory/data directory and add the following entries to dataset_info.json:

"text2sql_train": {
  "file_name": "text2sql_train.json",
  "columns": {
    "prompt": "instruction",
    "query": "input",
    "response": "output",
    "history": "history"
  }
},
"text2sql_dev": {
  "file_name": "text2sql_dev.json",
  "columns": {
    "prompt": "instruction",
    "query": "input",
    "response": "output",
    "history": "history"
  }
}
(3) Configure the parameters on the LLaMA-Factory WebUI page. The previewed command is shown below (adjust the parameters to your own setup):

llamafactory-cli train `
    --stage sft `
    --do_train True `
    --model_name_or_path D:\\LLM\\Qwen2-1.5B-Instruct `
    --preprocessing_num_workers 16 `
    --finetuning_type lora `
    --quantization_method bitsandbytes `
    --template qwen `
    --flash_attn auto `
    --dataset_dir D:\\python_project\\LLaMA-Factory\\data `
    --dataset text2sql_train `
    --cutoff_len 1024 `
    --learning_rate 0.0001 `
    --num_train_epochs 2.0 `
    --max_samples 100000 `
    --per_device_train_batch_size 2 `
    --gradient_accumulation_steps 4 `
    --lr_scheduler_type cosine `
    --max_grad_norm 1.0 `
    --logging_steps 20 `
    --save_steps 500 `
    --warmup_steps 0 `
    --optim adamw_torch `
    --packing False `
    --report_to none `
    --output_dir saves\Qwen2-1.5B-Chat\lora\train_2024-07-19-19-45-59 `
    --bf16 True `
    --plot_loss True `
    --ddp_timeout 180000000 `
    --include_num_input_tokens_seen True `
    --lora_rank 8 `
    --lora_alpha 16 `
    --lora_dropout 0 `
    --lora_target all
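With per_device_train_batch_size 2 and gradient_accumulation_steps 4, each optimizer step effectively sees 8 samples per GPU. A quick sanity check of the implied step count (single-GPU assumption; the dataset size below is a placeholder, not a value from this article):

# Rough arithmetic for the training arguments above
per_device_train_batch_size = 2
gradient_accumulation_steps = 4
num_train_epochs = 2.0
num_train_samples = 7000            # placeholder: plug in your own training-set size

effective_batch = per_device_train_batch_size * gradient_accumulation_steps   # 8
steps_per_epoch = -(-num_train_samples // effective_batch)                    # ceiling division -> 875
total_steps = int(steps_per_epoch * num_train_epochs)                         # 1750
print(effective_batch, steps_per_epoch, total_steps)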
(4) After fine-tuning, run inference evaluation on the test set:
{ "predict_bleu-4":88.43791015473889, "predict_rouge-1":92.31425483558995, "predict_rouge-2":85.43010570599614, "predict_rouge-l":89.06327794970986, "predict_runtime":1027.4111, "predict_samples_per_second":1.006, "predict_steps_per_second":0.503 }
Judging from the metrics above, the model performs quite well; you can set the various parameters yourself based on your own sample data. Here is how to read the evaluation metrics above (a small computation sketch follows this list):
1. predict_bleu-4:
* BLEU (Bilingual Evaluation Understudy) is a metric commonly used to evaluate machine-translation quality.
* BLEU-4 is the 4-gram BLEU score; it measures the degree of n-gram overlap between the model's output and the reference text, with n up to 4.
* Higher values mean the generated text is more similar to the reference; the maximum is 100.
2. predict_rouge-1 and predict_rouge-2:
* ROUGE (Recall-Oriented Understudy for Gisting Evaluation) is a metric for evaluating automatic summarization and text-generation models.
* ROUGE-1 is the unigram ROUGE score and ROUGE-2 is the bigram ROUGE score; they measure the overlap of single words and of two-word sequences between the generated text and the reference text, respectively.
* Higher values mean the generated text is more similar to the reference; the maximum is 100.
3. predict_rouge-l:
* ROUGE-L measures the overlap of the longest common subsequence (LCS) between the generated text and the reference text.
* Higher values mean the generated text is more similar to the reference; the maximum is 100.
4. predict_runtime:
* Prediction runtime: the total time the model took to generate the batch of samples.
* Usually measured in seconds.
5. predict_samples_per_second:
* Samples generated per second, i.e. how many samples the model can produce each second.
* Typically used to assess inference speed.
6. predict_steps_per_second:
* Steps executed per second, i.e. how many steps the model can run each second.
* For generative models this generally means the number of generation operations per second.
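For intuition, here is a minimal sketch of how BLEU-4 and ROUGE scores can be computed for a single prediction using the nltk and rouge-score packages. This only illustrates what the numbers measure; it is not the exact implementation LLaMA-Factory uses for its predict_* metrics:

# pip install nltk rouge-score   (illustrative only)
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from rouge_score import rouge_scorer

reference  = "SELECT count(*) FROM head WHERE age > 56"
prediction = "SELECT count(*) FROM head WHERE age > 56"

# BLEU-4: overlap of 1- to 4-grams between prediction and reference, scaled to 0-100
bleu4 = sentence_bleu(
    [reference.split()], prediction.split(),
    weights=(0.25, 0.25, 0.25, 0.25),
    smoothing_function=SmoothingFunction().method3,
) * 100

# ROUGE-1 / ROUGE-2 / ROUGE-L: unigram, bigram and longest-common-subsequence F1, scaled to 0-100
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"])
rouge = {k: v.fmeasure * 100 for k, v in scorer.score(reference, prediction).items()}

print(f"BLEU-4 = {bleu4:.2f}", rouge)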
(5) If the evaluation meets your fine-tuning expectations, you can export the fine-tuned model directly (remember to select the checkpoint produced by fine-tuning):
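The export tab in the WebUI merges the LoRA checkpoint into the base model and writes safetensors. If you prefer to do the same from a script, a minimal sketch using peft is shown below; the paths are placeholders for your own base model, checkpoint, and output directory:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_path    = "D:/LLM/Qwen2-1.5B-Instruct"                               # base model
adapter_path = "saves/Qwen2-1.5B-Chat/lora/train_2024-07-19-19-45-59"     # LoRA checkpoint
export_path  = "output_model/model2"                                      # merged model output

base  = AutoModelForCausalLM.from_pretrained(base_path, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base, adapter_path)

# Fold the LoRA weights into the base weights and save a plain safetensors model
merged = model.merge_and_unload()
merged.save_pretrained(export_path, safe_serialization=True)
AutoTokenizer.from_pretrained(base_path).save_pretrained(export_path)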
GGUF model

The model exported after fine-tuning above is in safetensors format; we can use llama.cpp to convert the safetensors model into a GGUF model. The recommended way to deploy llama.cpp here is the w64devkit + make approach.

Install make

Download: https://gnuwin32.sourceforge.net/packages/make.htm
Click the download link in the box, install it, then add the installation path to the PATH environment variable. In a terminal, run the following command and the Make version info should appear:

(llama_factory) D:\LLM\llama.cpp>make -v
GNU Make 3.81
Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.

This program built for i386-pc-mingw32
Install MinGW

Download the archive and add mingw64/bin to the PATH environment variable.
Install w64devkit

Download: https://github.com/skeeto/w64devkit/releases
Download w64devkit-fortran-1.23.0.zip and extract it. Note: the path must not contain Chinese characters.

Build llama.cpp

Download: https://github.com/ggerganov/llama.cpp
Open the llama.cpp directory with the w64devkit.exe we just downloaded, then run make; once it succeeds you will get a set of .exe files.

~ $ cd D:
D:/ $ cd /LLM/llama.cpp
D:/LLM/llama.cpp $ make
After make succeeds, install the Python dependencies:

conda create --name llama_cpp python=3.11
conda activate llama_cpp
pip3 install -r requirements.txt
Convert to GGUF FP16 format

# model_path/mymodel is the actual path of our model
python convert_hf_to_gguf.py model_path/mymodel
运行日志: (llama_cpp)D:\LLM\llama.cpp>pythonconvert_hf_to_gguf.pyD:/python_project/LLaMA-Factory/output_model/model2 INFO:hf-to-gguf oadingmodel:model2 INFO:gguf.gguf_writer:gguf:ThisGGUFfileisforLittleEndianonly INFO:hf-to-gguf:Exportingmodel... INFO:hf-to-gguf:gguf:loadingmodelweightmapfrom'model.safetensors.index.json' INFO:hf-to-gguf:gguf:loadingmodelpart'model-00001-of-00002.safetensors' INFO:hf-to-gguf:token_embd.weight,torch.bfloat16-->F16,shape={1536,151936} INFO:hf-to-gguf:blk.0.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.0.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.0.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.0.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.0.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.0.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.0.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.0.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.0.attn_q.bias,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.0.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.0.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.0.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.1.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.1.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.1.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.1.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.1.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.1.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.1.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.1.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.1.attn_q.bias,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.1.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.1.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.1.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.10.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.10.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.10.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.10.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.10.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.10.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.10.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.10.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.10.attn_q.bias,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.10.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.10.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.10.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.11.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.11.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.11.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.11.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.11.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} 
INFO:hf-to-gguf:blk.11.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.11.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.11.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.11.attn_q.bias,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.11.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.11.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.11.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.12.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.12.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.12.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.12.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.12.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.12.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.12.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.12.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.12.attn_q.bias,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.12.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.12.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.12.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.13.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.13.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.13.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.13.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.13.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.13.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.13.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.13.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.13.attn_q.bias,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.13.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.13.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.13.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.14.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.14.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.14.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.14.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.14.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.14.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.14.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.14.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.14.attn_q.bias,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.14.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.14.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.14.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.15.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.15.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.15.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.15.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} 
INFO:hf-to-gguf:blk.15.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.15.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.15.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.15.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.15.attn_q.bias,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.15.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.15.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.15.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.16.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.16.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.16.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.16.attn_q.bias,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.16.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.16.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.16.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.2.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.2.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.2.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.2.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.2.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.2.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.2.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.2.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.2.attn_q.bias,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.2.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.2.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.2.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.3.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.3.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.3.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.3.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.3.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.3.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.3.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.3.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.3.attn_q.bias,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.3.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.3.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.3.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.4.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.4.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.4.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.4.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.4.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.4.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.4.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.4.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.4.attn_q.bias,torch.bfloat16-->F32,shape={1536} 
INFO:hf-to-gguf:blk.4.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.4.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.4.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.5.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.5.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.5.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.5.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.5.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.5.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.5.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.5.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.5.attn_q.bias,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.5.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.5.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.5.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.6.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.6.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.6.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.6.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.6.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.6.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.6.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.6.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.6.attn_q.bias,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.6.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.6.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.6.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.7.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.7.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.7.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.7.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.7.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.7.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.7.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.7.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.7.attn_q.bias,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.7.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.7.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.7.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.8.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.8.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.8.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.8.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.8.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.8.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.8.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.8.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.8.attn_q.bias,torch.bfloat16-->F32,shape={1536} 
INFO:hf-to-gguf:blk.8.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.8.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.8.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.9.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.9.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.9.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.9.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.9.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.9.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.9.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.9.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.9.attn_q.bias,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.9.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.9.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.9.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:gguf:loadingmodelpart'model-00002-of-00002.safetensors' INFO:hf-to-gguf:blk.16.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.16.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.16.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.16.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.16.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.17.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.17.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.17.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.17.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.17.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.17.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.17.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.17.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.17.attn_q.bias,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.17.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.17.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.17.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.18.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.18.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.18.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.18.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.18.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.18.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.18.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.18.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.18.attn_q.bias,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.18.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.18.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.18.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.19.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.19.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} 
INFO:hf-to-gguf:blk.19.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.19.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.19.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.19.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.19.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.19.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.19.attn_q.bias,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.19.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.19.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.19.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.20.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.20.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.20.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.20.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.20.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.20.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.20.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.20.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.20.attn_q.bias,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.20.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.20.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.20.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.21.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.21.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.21.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.21.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.21.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.21.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.21.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.21.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.21.attn_q.bias,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.21.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.21.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.21.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.22.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.22.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.22.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.22.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.22.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.22.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.22.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.22.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.22.attn_q.bias,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.22.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.22.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.22.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.23.attn_norm.weight,torch.bfloat16-->F32,shape={1536} 
INFO:hf-to-gguf:blk.23.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.23.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.23.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.23.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.23.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.23.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.23.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.23.attn_q.bias,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.23.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.23.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.23.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.24.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.24.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.24.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.24.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.24.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.24.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.24.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.24.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.24.attn_q.bias,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.24.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.24.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.24.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.25.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.25.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.25.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.25.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.25.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.25.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.25.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.25.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.25.attn_q.bias,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.25.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.25.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.25.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.26.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.26.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.26.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.26.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.26.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.26.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.26.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.26.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.26.attn_q.bias,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.26.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.26.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.26.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} 
INFO:hf-to-gguf:blk.27.attn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.27.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536} INFO:hf-to-gguf:blk.27.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.27.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960} INFO:hf-to-gguf:blk.27.ffn_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.27.attn_k.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.27.attn_k.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf:blk.27.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.27.attn_q.bias,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:blk.27.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536} INFO:hf-to-gguf:blk.27.attn_v.bias,torch.bfloat16-->F32,shape={256} INFO:hf-to-gguf:blk.27.attn_v.weight,torch.bfloat16-->F16,shape={1536,256} INFO:hf-to-gguf utput_norm.weight,torch.bfloat16-->F32,shape={1536} INFO:hf-to-gguf:Setmetamodel INFO:hf-to-gguf:Setmodelparameters INFO:hf-to-gguf:gguf:contextlength=32768 INFO:hf-to-gguf:gguf:embeddinglength=1536 INFO:hf-to-gguf:gguf:feedforwardlength=8960 INFO:hf-to-gguf:gguf:headcount=12 INFO:hf-to-gguf:gguf:key-valueheadcount=2 INFO:hf-to-gguf:gguf:ropetheta=1000000.0 INFO:hf-to-gguf:gguf:rmsnormepsilon=1e-06 INFO:hf-to-gguf:gguf:filetype=1 INFO:hf-to-gguf:Setmodeltokenizer Specialtokenshavebeenaddedinthevocabulary,makesuretheassociatedwordembeddingsarefine-tunedortrained. INFO:gguf.vocab:Adding151387merge(s). INFO:gguf.vocab:Settingspecialtokentypeeosto151645 INFO:gguf.vocab:Settingspecialtokentypepadto151643 INFO:gguf.vocab:Settingspecialtokentypebosto151643 INFO:gguf.vocab:Settingchat_templateto{%setsystem_message='Youareahelpfulassistant.'%}{%ifmessages[0]['role']=='system'%}{%setsystem_message=messages[0]['content']%}{%endif%}{%ifsystem_messageisdefined%}{{'<|im_start|>system '+system_message+'<|im_end|> '}}{%endif%}{%formessageinmessages%}{%setcontent=message['content']%}{%ifmessage['role']=='user'%}{{'<|im_start|>user '+content+'<|im_end|> <|im_start|>assistant '}}{%elifmessage['role']=='assistant'%}{{content+'<|im_end|>'+' '}}{%endif%}{%endfor%} INFO:hf-to-gguf:Setmodelquantizationversion INFO:gguf.gguf_writer:Writingthefollowingfiles: INFO:gguf.gguf_writer :\Llm\Qwen2-1.5B-Instruct-F16.gguf:n_tensors=338,total_size=3.1G Writing:100%|██████████████████████████████████████████████████████████████████|3.09G/3.09G[00:14<00:00,218Mbyte/s] INFO:hf-to-gguf:ModelsuccessfullyexportedtoD:\Llm\Qwen2-1.5B-Instruct-F16.gguf
Model quantization

# model_path/mymodel-F16.gguf is the FP16 model we just produced,
# model_path/mymodel_Q4_k_M.gguf is the quantized model to be produced,
# and Q4_K_M selects the Q4_K_M quantization method
llama-quantize.exe model_path/mymodel-F16.gguf model_path/mymodel_Q4_k_M.gguf Q4_K_M
运行日志: (llama_cpp)D:\LLM\llama.cpp>llama-quantize.exeD:/Llm/Qwen2-1.5B-Instruct-F16.ggufD:/Llm/Qwen2-1.5B-Instruct_Q4_k_M.ggufQ4_K_M main:build=0(unknown) main:builtwithcc(GCC)14.1.0forx86_64-w64-mingw32 main:quantizing'D:/Llm/Qwen2-1.5B-Instruct-F16.gguf'to'D:/Llm/Qwen2-1.5B-Instruct_Q4_k_M.gguf'asQ4_K_M llama_model_loader:loadedmetadatawith25key-valuepairsand338tensorsfromD:/Llm/Qwen2-1.5B-Instruct-F16.gguf(versionGGUFV3(latest)) llama_model_loader umpingmetadatakeys/values.Note:KVoverridesdonotapplyinthisoutput. llama_model_loader:-kv0:general.architecturestr=qwen2 llama_model_loader:-kv1:general.typestr=model llama_model_loader:-kv2:general.namestr=D:\\LLM\\Qwen21.5BInstruct llama_model_loader:-kv3:general.finetunestr=Instruct llama_model_loader:-kv4:general.basenamestr=D:\\LLM\\Qwen2 llama_model_loader:-kv5:general.size_labelstr=1.5B llama_model_loader:-kv6:qwen2.block_countu32=28 llama_model_loader:-kv7:qwen2.context_lengthu32=32768 llama_model_loader:-kv8:qwen2.embedding_lengthu32=1536 llama_model_loader:-kv9:qwen2.feed_forward_lengthu32=8960 llama_model_loader:-kv10:qwen2.attention.head_countu32=12 llama_model_loader:-kv11:qwen2.attention.head_count_kvu32=2 llama_model_loader:-kv12:qwen2.rope.freq_basef32=1000000.000000 llama_model_loader:-kv13:qwen2.attention.layer_norm_rms_epsilonf32=0.000001 llama_model_loader:-kv14:general.file_typeu32=1 llama_model_loader:-kv15:tokenizer.ggml.modelstr=gpt2 llama_model_loader:-kv16:tokenizer.ggml.prestr=qwen2 llama_model_loader:-kv17:tokenizer.ggml.tokensarr[str,151936]=["!","\"","#","$","%","&","'",... llama_model_loader:-kv18:tokenizer.ggml.token_typearr[i32,151936]=[1,1,1,1,1,1,1,1,1,1,1,1,... llama_model_loader:-kv19:tokenizer.ggml.mergesarr[str,151387]=["臓臓","臓臓臓臓","in","臓t",... llama_model_loader:-kv20:tokenizer.ggml.eos_token_idu32=151645 llama_model_loader:-kv21:tokenizer.ggml.padding_token_idu32=151643 llama_model_loader:-kv22:tokenizer.ggml.bos_token_idu32=151643 llama_model_loader:-kv23:tokenizer.chat_templatestr={%setsystem_message='Youareahe... 
llama_model_loader:-kv24:general.quantization_versionu32=2 llama_model_loader:-typef32:141tensors llama_model_loader:-typef16:197tensors [1/338]token_embd.weight-[1536,151936,1,1],type=f16,convertingtoq6_K..size=445.12MiB->182.57MiB [2/338]blk.0.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [3/338]blk.0.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq6_K..size=26.25MiB->10.77MiB [4/338]blk.0.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [5/338]blk.0.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [6/338]blk.0.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [7/338]blk.0.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB [8/338]blk.0.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB [9/338]blk.0.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [10/338]blk.0.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB [11/338]blk.0.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [12/338]blk.0.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB [13/338]blk.0.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq6_K..size=0.75MiB->0.31MiB [14/338]blk.1.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [15/338]blk.1.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq6_K..size=26.25MiB->10.77MiB [16/338]blk.1.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [17/338]blk.1.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [18/338]blk.1.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [19/338]blk.1.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB [20/338]blk.1.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB [21/338]blk.1.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [22/338]blk.1.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB [23/338]blk.1.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [24/338]blk.1.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB [25/338]blk.1.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq6_K..size=0.75MiB->0.31MiB [26/338]blk.10.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [27/338]blk.10.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq6_K..size=26.25MiB->10.77MiB [28/338]blk.10.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [29/338]blk.10.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [30/338]blk.10.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [31/338]blk.10.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB [32/338]blk.10.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB [33/338]blk.10.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [34/338]blk.10.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB [35/338]blk.10.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [36/338]blk.10.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB [37/338]blk.10.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq6_K..size=0.75MiB->0.31MiB [38/338]blk.11.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [39/338]blk.11.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [40/338]blk.11.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [41/338]blk.11.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB 
[42/338]blk.11.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [43/338]blk.11.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB [44/338]blk.11.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB [45/338]blk.11.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [46/338]blk.11.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB [47/338]blk.11.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [48/338]blk.11.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB [49/338]blk.11.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB [50/338]blk.12.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [51/338]blk.12.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [52/338]blk.12.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [53/338]blk.12.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [54/338]blk.12.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [55/338]blk.12.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB [56/338]blk.12.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB [57/338]blk.12.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [58/338]blk.12.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB [59/338]blk.12.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [60/338]blk.12.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB [61/338]blk.12.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB [62/338]blk.13.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [63/338]blk.13.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq6_K..size=26.25MiB->10.77MiB [64/338]blk.13.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [65/338]blk.13.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [66/338]blk.13.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [67/338]blk.13.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB [68/338]blk.13.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB [69/338]blk.13.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [70/338]blk.13.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB [71/338]blk.13.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [72/338]blk.13.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB [73/338]blk.13.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq6_K..size=0.75MiB->0.31MiB [74/338]blk.14.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [75/338]blk.14.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [76/338]blk.14.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [77/338]blk.14.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [78/338]blk.14.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [79/338]blk.14.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB [80/338]blk.14.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB [81/338]blk.14.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [82/338]blk.14.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB [83/338]blk.14.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [84/338]blk.14.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB 
[85/338]blk.14.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB [86/338]blk.15.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [87/338]blk.15.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [88/338]blk.15.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [89/338]blk.15.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [90/338]blk.15.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [91/338]blk.15.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB [92/338]blk.15.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB [93/338]blk.15.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [94/338]blk.15.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB [95/338]blk.15.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [96/338]blk.15.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB [97/338]blk.15.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB [98/338]blk.16.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB [99/338]blk.16.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB [100/338]blk.16.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [101/338]blk.16.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB [102/338]blk.16.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [103/338]blk.16.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB [104/338]blk.16.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq6_K..size=0.75MiB->0.31MiB [105/338]blk.2.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [106/338]blk.2.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq6_K..size=26.25MiB->10.77MiB [107/338]blk.2.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [108/338]blk.2.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [109/338]blk.2.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [110/338]blk.2.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB [111/338]blk.2.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB [112/338]blk.2.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [113/338]blk.2.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB [114/338]blk.2.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [115/338]blk.2.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB [116/338]blk.2.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB [117/338]blk.3.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [118/338]blk.3.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [119/338]blk.3.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [120/338]blk.3.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [121/338]blk.3.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [122/338]blk.3.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB [123/338]blk.3.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB [124/338]blk.3.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [125/338]blk.3.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB [126/338]blk.3.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [127/338]blk.3.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB 
[128/338]blk.3.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB [129/338]blk.4.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [130/338]blk.4.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [131/338]blk.4.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [132/338]blk.4.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [133/338]blk.4.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [134/338]blk.4.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB [135/338]blk.4.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB [136/338]blk.4.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [137/338]blk.4.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB [138/338]blk.4.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [139/338]blk.4.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB [140/338]blk.4.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq6_K..size=0.75MiB->0.31MiB [141/338]blk.5.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [142/338]blk.5.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq6_K..size=26.25MiB->10.77MiB [143/338]blk.5.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [144/338]blk.5.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [145/338]blk.5.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [146/338]blk.5.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB [147/338]blk.5.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB [148/338]blk.5.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [149/338]blk.5.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB [150/338]blk.5.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [151/338]blk.5.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB [152/338]blk.5.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB [153/338]blk.6.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [154/338]blk.6.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [155/338]blk.6.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [156/338]blk.6.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [157/338]blk.6.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [158/338]blk.6.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB [159/338]blk.6.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB [160/338]blk.6.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [161/338]blk.6.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB [162/338]blk.6.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB [163/338]blk.6.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB [164/338]blk.6.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB [165/338]blk.7.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [166/338]blk.7.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [167/338]blk.7.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [168/338]blk.7.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB [169/338]blk.7.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB [170/338]blk.7.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB 
[171/338] blk.7.attn_k.weight      - [ 1536,  256,    1,    1], type = f16, converting to q4_K .. size = 0.75 MiB -> 0.21 MiB
[172/338] blk.7.attn_output.weight - [ 1536, 1536,    1,    1], type = f16, converting to q4_K .. size = 4.50 MiB -> 1.27 MiB
[173/338] blk.7.attn_q.bias        - [ 1536,    1,    1,    1], type = f32, size = 0.006 MB
[174/338] blk.7.attn_q.weight      - [ 1536, 1536,    1,    1], type = f16, converting to q4_K .. size = 4.50 MiB -> 1.27 MiB
[175/338] blk.7.attn_v.bias        - [  256,    1,    1,    1], type = f32, size = 0.001 MB
[176/338] blk.7.attn_v.weight      - [ 1536,  256,    1,    1], type = f16, converting to q6_K .. size = 0.75 MiB -> 0.31 MiB
... (entries for the remaining blocks follow the same pattern: f32 norm and bias tensors are kept as-is, most f16 weights are converted to q4_K, and some ffn_down / attn_v weights to q6_K) ...
[337/338] blk.27.attn_v.weight     - [ 1536,  256,    1,    1], type = f16, converting to q6_K .. size = 0.75 MiB -> 0.31 MiB
[338/338] output_norm.weight       - [ 1536,    1,    1,    1], type = f32, size = 0.006 MB
llama_model_quantize_internal: model size = 2944.68 MB
llama_model_quantize_internal: quant size =  934.69 MB
main: quantize time = 14228.85 ms
main:    total time = 14228.85 ms
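The log above is emitted by llama.cpp's quantization tool as it converts each tensor of the f16 GGUF file into 4-bit (Q4_K_M) form. For reference, a typical invocation is sketched below; the file names are placeholders, and depending on the llama.cpp build the binary may be called quantize or llama-quantize:

# Quantize the f16 GGUF exported earlier into a Q4_K_M model
# (paths are illustrative; adjust them to your own output files)
./llama-quantize ./Qwen2-1.5B-Instruct_f16.gguf ./Qwen2-1.5B-Instruct_Q4_k_M.gguf Q4_K_M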
Finally, this produces the quantized model file Qwen2-1.5B-Instruct_Q4_k_M.gguf.
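As a quick sanity check, the quantized GGUF file can be loaded directly with llama.cpp for inference. The sketch below is illustrative only: it assumes the llama-cli binary has been built (older builds name it main), and the prompt should follow the same instruction / schema / question format used in the fine-tuning samples:

# PROMPT should contain the same Text2SQL instruction, table schema description
# and natural-language question format that was used for the training samples
PROMPT="<instruction + table schema + natural-language question>"

./llama-cli -m ./Qwen2-1.5B-Instruct_Q4_k_M.gguf \
    -p "$PROMPT" \
    -n 128 --temp 0

Alternatively, the same GGUF file can be imported into Ollama (via a Modelfile whose FROM line points at the file) if you prefer to serve it behind a local API instead of calling the binary directly.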