conda create --name llama_factory python=3.11
conda activate llama_factory
Deploy LLaMA-Factory
git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip3 install -e ".[torch,metrics]"
# To enable quantized LoRA (QLoRA) on Windows, install the pre-built bitsandbytes wheel
pip3 install https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.41.2.post2-py3-none-win_amd64.whl
# Install PyTorch
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
# If startup fails with ImportError: cannot import name 'get_full_repo_name' from 'huggingface_hub', install chardet
pip3 install chardet
Run python src/webui.py to start the web UI.
(1) Prepare the dataset. Each Text2SQL sample is a JSON record in the following format:
{
"db_id":"department_management",
"instruction":"IwantyoutoactasaSQLterminalinfrontofanexampledatabase,youneedonlytoreturnthesqlcommandtome.Belowisaninstructionthatdescribesatask,Writearesponsethatappropriatelycompletestherequest.\n\"\n##Instruction:\ndepartment_managementcontainstablessuchasdepartment,head,management.TabledepartmenthascolumnssuchasDepartment_ID,Name,Creation,Ranking,Budget_in_Billions,Num_Employees.Department_IDistheprimarykey.\nTableheadhascolumnssuchashead_ID,name,born_state,age.head_IDistheprimarykey.\nTablemanagementhascolumnssuchasdepartment_ID,head_ID,temporary_acting.department_IDistheprimarykey.\nThehead_IDofmanagementistheforeignkeyofhead_IDofhead.\nThedepartment_IDofmanagementistheforeignkeyofDepartment_IDofdepartment.\n\n",
"input":"###Input:\nHowmanyheadsofthedepartmentsareolderthan56?\n\n###Response:",
"output":"SELECTcount(*)FROMheadWHEREage>56",
"history":[]
}
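Before registering the files, it can help to sanity-check them. Below is a minimal sketch, assuming text2sql_train.json sits in the current directory and is a JSON array of records in the format shown above (the path and the SELECT check are only illustrative):

import json

REQUIRED_KEYS = {"instruction", "input", "output", "history"}

# Hypothetical path; point this at your actual dataset file.
with open("text2sql_train.json", encoding="utf-8") as f:
    records = json.load(f)

for i, rec in enumerate(records):
    missing = REQUIRED_KEYS - rec.keys()
    if missing:
        raise ValueError(f"record {i} is missing fields: {missing}")
    if not rec["output"].strip().upper().startswith("SELECT"):
        # Not an error: some queries may legitimately start with other keywords.
        print(f"record {i}: output does not look like a SELECT statement")

print(f"{len(records)} records look structurally valid")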
"text2sql_train":{
"file_name":"text2sql_train.json",
"columns":{
"prompt":"instruction",
"query":"input",
"response":"output",
"history":"history"
}
},
"text2sql_dev":{
"file_name":"text2sql_dev.json",
"columns":{
"prompt":"instruction",
"query":"input",
"response":"output",
"history":"history"
}
}
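The two entries can also be merged into dataset_info.json programmatically instead of editing it by hand. A small convenience sketch, assuming the data directory used in the training command below (adjust the path to your own checkout); it simply updates the JSON file in place:

import json
from pathlib import Path

# Assumed location of the LLaMA-Factory data directory; adjust to your own path.
data_dir = Path(r"D:\python_project\LLaMA-Factory\data")
info_path = data_dir / "dataset_info.json"

columns = {"prompt": "instruction", "query": "input",
           "response": "output", "history": "history"}
new_entries = {
    "text2sql_train": {"file_name": "text2sql_train.json", "columns": columns},
    "text2sql_dev": {"file_name": "text2sql_dev.json", "columns": columns},
}

info = json.loads(info_path.read_text(encoding="utf-8"))
info.update(new_entries)
info_path.write_text(json.dumps(info, ensure_ascii=False, indent=2), encoding="utf-8")
print("registered:", ", ".join(new_entries))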
(3) Configure the parameters on the LLaMA-Factory web page. The preview command looks like this (adjust the parameters to your own setup):
llamafactory-cli train `
--stage sft `
--do_train True `
--model_name_or_path D:\\LLM\\Qwen2-1.5B-Instruct `
--preprocessing_num_workers 16 `
--finetuning_type lora `
--quantization_method bitsandbytes `
--template qwen `
--flash_attn auto `
--dataset_dir D:\\python_project\\LLaMA-Factory\\data `
--dataset text2sql_train `
--cutoff_len 1024 `
--learning_rate 0.0001 `
--num_train_epochs 2.0 `
--max_samples 100000 `
--per_device_train_batch_size 2 `
--gradient_accumulation_steps 4 `
--lr_scheduler_type cosine `
--max_grad_norm 1.0 `
--logging_steps 20 `
--save_steps 500 `
--warmup_steps 0 `
--optim adamw_torch `
--packing False `
--report_to none `
--output_dir saves\Qwen2-1.5B-Chat\lora\train_2024-07-19-19-45-59 `
--bf16 True `
--plot_loss True `
--ddp_timeout 180000000 `
--include_num_input_tokens_seen True `
--lora_rank 8 `
--lora_alpha 16 `
--lora_dropout 0 `
--lora_target all
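As a quick sanity check on what these flags imply for the training schedule: with per_device_train_batch_size=2 and gradient_accumulation_steps=4, one optimizer step consumes 8 samples per GPU. A short sketch (single GPU assumed; the dataset size is a hypothetical placeholder):

import math

# Values taken from the command above; n_samples is an assumption for illustration.
per_device_batch = 2
grad_accum = 4
num_gpus = 1            # assumed single-GPU run
epochs = 2.0
n_samples = 7000        # hypothetical dataset size

effective_batch = per_device_batch * grad_accum * num_gpus   # 8 sequences per optimizer step
steps_per_epoch = math.ceil(n_samples / effective_batch)
total_steps = int(steps_per_epoch * epochs)

print(f"effective batch size: {effective_batch}")
print(f"optimizer steps/epoch: {steps_per_epoch}, total: {total_steps}")
# With logging_steps=20 and save_steps=500 you get a log line every
# 20 * 8 = 160 samples and a checkpoint every 500 * 8 = 4000 samples.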
(4) After fine-tuning, run inference and evaluation on the test set:
{
"predict_bleu-4":88.43791015473889,
"predict_rouge-1":92.31425483558995,
"predict_rouge-2":85.43010570599614,
"predict_rouge-l":89.06327794970986,
"predict_runtime":1027.4111,
"predict_samples_per_second":1.006,
"predict_steps_per_second":0.503
}
Judging from these numbers, the fine-tuned model performs quite well; you can set the parameters yourself based on your own sample data. Here is how to read the evaluation metrics above (a small scoring sketch follows the list):
1. predict_bleu-4:
* BLEU (Bilingual Evaluation Understudy) is a metric commonly used to evaluate machine-translation quality.
* BLEU-4 is the 4-gram BLEU score; it measures how well the n-grams (up to n = 4) of the generated text match those of the reference text.
* Higher means the generated text is closer to the reference; the maximum is 100.
2. predict_rouge-1 and predict_rouge-2:
* ROUGE (Recall-Oriented Understudy for Gisting Evaluation) is a metric for evaluating automatic summarization and text-generation models.
* ROUGE-1 is the unigram ROUGE score and ROUGE-2 the bigram ROUGE score; they measure the overlap of single words and two-word sequences between the generated text and the reference text.
* Higher means the generated text is closer to the reference; the maximum is 100.
3. predict_rouge-l:
* ROUGE-L measures the longest common subsequence (LCS) shared by the generated text and the reference text.
* Higher means the generated text is closer to the reference; the maximum is 100.
4. predict_runtime:
* Prediction runtime: the total time the model spent generating the batch of samples.
* The unit is usually seconds.
5. predict_samples_per_second:
* The number of samples the model generates per second.
* Commonly used to evaluate the model's inference speed.
6. predict_steps_per_second:
* The number of steps the model executes per second.
* For a generative model, this generally means the number of generation operations performed per second.
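If you want to reproduce this kind of scoring on your own predictions, the sketch below uses the nltk and rouge-score packages purely as stand-ins; they are assumptions for illustration, not necessarily what LLaMA-Factory computes internally, and the generated_predictions.jsonl file name with its label/predict fields follows LLaMA-Factory's usual predict output but may differ in your version:

import json
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from rouge_score import rouge_scorer

# Assumed output file and field names; adjust to whatever your predict run actually wrote.
pairs = []
with open("generated_predictions.jsonl", encoding="utf-8") as f:
    for line in f:
        rec = json.loads(line)
        pairs.append((rec["label"], rec["predict"]))

smooth = SmoothingFunction().method3
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"])

bleu4 = rouge1 = rouge2 = rougel = 0.0
for ref, hyp in pairs:
    # Corpus averages of sentence-level scores, scaled to 0-100 like the report above.
    bleu4 += sentence_bleu([ref.split()], hyp.split(),
                           weights=(0.25, 0.25, 0.25, 0.25),
                           smoothing_function=smooth)
    scores = scorer.score(ref, hyp)
    rouge1 += scores["rouge1"].fmeasure
    rouge2 += scores["rouge2"].fmeasure
    rougel += scores["rougeL"].fmeasure

n = len(pairs)
print(f"BLEU-4  : {100 * bleu4 / n:.2f}")
print(f"ROUGE-1 : {100 * rouge1 / n:.2f}")
print(f"ROUGE-2 : {100 * rouge2 / n:.2f}")
print(f"ROUGE-L : {100 * rougel / n:.2f}")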
(5) Since the evaluation meets the fine-tuning expectations, we can export the fine-tuned model directly (be sure to select the checkpoint produced by fine-tuning).
The export above produces a safetensors model; we can use llama.cpp to convert it into a GGUF model. The recommended way to build llama.cpp on Windows is the w64devkit + make approach.
Download make from: https://gnuwin32.sourceforge.net/packages/make.htm
Click the download link, install it, and add the install path to the PATH environment variable. Running the following command in a terminal should then print the Make version information:
(llama_factory) D:\LLM\llama.cpp>make -v
GNU Make 3.81
Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
This program built for i386-pc-mingw32
Download the source files and add mingw64/bin to the PATH environment variable.
Download w64devkit from: https://github.com/skeeto/w64devkit/releases
Download w64devkit-fortran-1.23.0.zip and extract it. Note: the path must not contain Chinese characters.
Download llama.cpp from: https://github.com/ggerganov/llama.cpp
Open the llama.cpp directory with the w64devkit.exe we just downloaded, then run make. Once it succeeds you get a pile of .exe files.
~$ cd D:
D:/$ cd /LLM/llama.cpp
D:/LLM/llama.cpp$ make
After make succeeds, install the Python dependencies:
conda create --name llama_cpp python=3.11
conda activate llama_cpp
pip3 install -r requirements.txt
# model_path/mymodel is the actual path to our model
python convert_hf_to_gguf.py model_path/mymodel
Run log:
(llama_cpp) D:\LLM\llama.cpp>python convert_hf_to_gguf.py D:/python_project/LLaMA-Factory/output_model/model2
INFO:hf-to-gguf:Loading model: model2
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Exporting model...
INFO:hf-to-gguf:gguf: loading model weight map from 'model.safetensors.index.json'
INFO:hf-to-gguf:gguf: loading model part 'model-00001-of-00002.safetensors'
INFO:hf-to-gguf:token_embd.weight,torch.bfloat16-->F16,shape={1536,151936}
INFO:hf-to-gguf:blk.0.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.0.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.0.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.0.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.0.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.0.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.0.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.0.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.0.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.0.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.0.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.0.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.1.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.1.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.1.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.1.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.1.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.1.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.1.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.1.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.1.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.1.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.1.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.1.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.10.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.10.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.10.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.10.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.10.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.10.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.10.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.10.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.10.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.10.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.10.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.10.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.11.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.11.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.11.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.11.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.11.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.11.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.11.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.11.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.11.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.11.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.11.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.11.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.12.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.12.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.12.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.12.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.12.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.12.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.12.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.12.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.12.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.12.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.12.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.12.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.13.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.13.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.13.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.13.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.13.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.13.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.13.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.13.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.13.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.13.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.13.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.13.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.14.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.14.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.14.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.14.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.14.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.14.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.14.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.14.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.14.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.14.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.14.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.14.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.15.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.15.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.15.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.15.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.15.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.15.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.15.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.15.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.15.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.15.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.15.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.15.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.16.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.16.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.16.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.16.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.16.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.16.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.16.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.2.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.2.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.2.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.2.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.2.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.2.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.2.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.2.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.2.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.2.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.2.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.2.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.3.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.3.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.3.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.3.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.3.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.3.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.3.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.3.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.3.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.3.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.3.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.3.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.4.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.4.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.4.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.4.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.4.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.4.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.4.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.4.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.4.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.4.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.4.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.4.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.5.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.5.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.5.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.5.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.5.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.5.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.5.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.5.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.5.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.5.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.5.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.5.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.6.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.6.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.6.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.6.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.6.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.6.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.6.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.6.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.6.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.6.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.6.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.6.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.7.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.7.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.7.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.7.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.7.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.7.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.7.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.7.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.7.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.7.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.7.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.7.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.8.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.8.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.8.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.8.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.8.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.8.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.8.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.8.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.8.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.8.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.8.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.8.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.9.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.9.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.9.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.9.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.9.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.9.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.9.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.9.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.9.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.9.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.9.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.9.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:gguf:loadingmodelpart'model-00002-of-00002.safetensors'
INFO:hf-to-gguf:blk.16.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.16.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.16.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.16.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.16.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.17.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.17.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.17.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.17.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.17.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.17.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.17.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.17.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.17.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.17.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.17.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.17.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.18.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.18.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.18.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.18.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.18.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.18.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.18.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.18.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.18.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.18.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.18.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.18.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.19.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.19.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.19.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.19.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.19.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.19.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.19.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.19.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.19.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.19.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.19.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.19.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.20.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.20.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.20.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.20.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.20.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.20.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.20.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.20.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.20.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.20.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.20.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.20.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.21.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.21.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.21.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.21.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.21.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.21.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.21.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.21.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.21.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.21.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.21.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.21.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.22.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.22.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.22.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.22.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.22.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.22.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.22.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.22.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.22.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.22.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.22.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.22.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.23.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.23.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.23.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.23.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.23.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.23.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.23.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.23.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.23.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.23.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.23.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.23.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.24.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.24.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.24.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.24.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.24.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.24.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.24.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.24.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.24.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.24.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.24.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.24.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.25.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.25.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.25.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.25.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.25.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.25.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.25.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.25.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.25.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.25.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.25.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.25.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.26.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.26.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.26.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.26.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.26.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.26.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.26.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.26.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.26.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.26.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.26.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.26.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.27.attn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.27.ffn_down.weight,torch.bfloat16-->F16,shape={8960,1536}
INFO:hf-to-gguf:blk.27.ffn_gate.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.27.ffn_up.weight,torch.bfloat16-->F16,shape={1536,8960}
INFO:hf-to-gguf:blk.27.ffn_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.27.attn_k.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.27.attn_k.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:blk.27.attn_output.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.27.attn_q.bias,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:blk.27.attn_q.weight,torch.bfloat16-->F16,shape={1536,1536}
INFO:hf-to-gguf:blk.27.attn_v.bias,torch.bfloat16-->F32,shape={256}
INFO:hf-to-gguf:blk.27.attn_v.weight,torch.bfloat16-->F16,shape={1536,256}
INFO:hf-to-gguf:output_norm.weight,torch.bfloat16-->F32,shape={1536}
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:gguf: context length = 32768
INFO:hf-to-gguf:gguf: embedding length = 1536
INFO:hf-to-gguf:gguf: feed forward length = 8960
INFO:hf-to-gguf:gguf: head count = 12
INFO:hf-to-gguf:gguf: key-value head count = 2
INFO:hf-to-gguf:gguf: rope theta = 1000000.0
INFO:hf-to-gguf:gguf: rms norm epsilon = 1e-06
INFO:hf-to-gguf:gguf: file type = 1
INFO:hf-to-gguf:Set model tokenizer
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
INFO:gguf.vocab:Adding 151387 merge(s).
INFO:gguf.vocab:Setting special token type eos to 151645
INFO:gguf.vocab:Setting special token type pad to 151643
INFO:gguf.vocab:Setting special token type bos to 151643
INFO:gguf.vocab:Setting chat_template to {% set system_message = 'You are a helpful assistant.' %}{% if messages[0]['role'] == 'system' %}{% set system_message = messages[0]['content'] %}{% endif %}{% if system_message is defined %}{{ '<|im_start|>system
' + system_message + '<|im_end|>
' }}{% endif %}{% for message in messages %}{% set content = message['content'] %}{% if message['role'] == 'user' %}{{ '<|im_start|>user
' + content + '<|im_end|>
<|im_start|>assistant
' }}{% elif message['role'] == 'assistant' %}{{ content + '<|im_end|>' + '
' }}{% endif %}{% endfor %}
INFO:hf-to-gguf:Set model quantization version
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:D:\Llm\Qwen2-1.5B-Instruct-F16.gguf: n_tensors = 338, total_size = 3.1G
Writing: 100%|██████████████████████████████████████████████████████████████████| 3.09G/3.09G [00:14<00:00, 218Mbyte/s]
INFO:hf-to-gguf:Model successfully exported to D:\Llm\Qwen2-1.5B-Instruct-F16.gguf
# model_path/mymodel-F16.gguf is the FP16 model we just obtained, model_path/mymodel_Q4_k_M.gguf is the quantized model to produce, and Q4_K_M selects the Q4_K_M quantization method
llama-quantize.exe model_path/mymodel-F16.gguf model_path/mymodel_Q4_k_M.gguf Q4_K_M
Run log:
(llama_cpp) D:\LLM\llama.cpp>llama-quantize.exe D:/Llm/Qwen2-1.5B-Instruct-F16.gguf D:/Llm/Qwen2-1.5B-Instruct_Q4_k_M.gguf Q4_K_M
main: build = 0 (unknown)
main: built with cc (GCC) 14.1.0 for x86_64-w64-mingw32
main: quantizing 'D:/Llm/Qwen2-1.5B-Instruct-F16.gguf' to 'D:/Llm/Qwen2-1.5B-Instruct_Q4_k_M.gguf' as Q4_K_M
llama_model_loader: loaded meta data with 25 key-value pairs and 338 tensors from D:/Llm/Qwen2-1.5B-Instruct-F16.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv  0: general.architecture str = qwen2
llama_model_loader: - kv  1: general.type str = model
llama_model_loader: - kv  2: general.name str = D:\\LLM\\Qwen21.5BInstruct
llama_model_loader: - kv  3: general.finetune str = Instruct
llama_model_loader: - kv  4: general.basename str = D:\\LLM\\Qwen2
llama_model_loader: - kv  5: general.size_label str = 1.5B
llama_model_loader: - kv  6: qwen2.block_count u32 = 28
llama_model_loader: - kv  7: qwen2.context_length u32 = 32768
llama_model_loader: - kv  8: qwen2.embedding_length u32 = 1536
llama_model_loader: - kv  9: qwen2.feed_forward_length u32 = 8960
llama_model_loader: - kv 10: qwen2.attention.head_count u32 = 12
llama_model_loader: - kv 11: qwen2.attention.head_count_kv u32 = 2
llama_model_loader: - kv 12: qwen2.rope.freq_base f32 = 1000000.000000
llama_model_loader: - kv 13: qwen2.attention.layer_norm_rms_epsilon f32 = 0.000001
llama_model_loader: - kv 14: general.file_type u32 = 1
llama_model_loader: - kv 15: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 16: tokenizer.ggml.pre str = qwen2
llama_model_loader: - kv 17: tokenizer.ggml.tokens arr[str,151936] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 18: tokenizer.ggml.token_type arr[i32,151936] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 19: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t", ...
llama_model_loader: - kv 20: tokenizer.ggml.eos_token_id u32 = 151645
llama_model_loader: - kv 21: tokenizer.ggml.padding_token_id u32 = 151643
llama_model_loader: - kv 22: tokenizer.ggml.bos_token_id u32 = 151643
llama_model_loader: - kv 23: tokenizer.chat_template str = {% set system_message = 'You are a he...
llama_model_loader: - kv 24: general.quantization_version u32 = 2
llama_model_loader: - type f32: 141 tensors
llama_model_loader: - type f16: 197 tensors
[1/338]token_embd.weight-[1536,151936,1,1],type=f16,convertingtoq6_K..size=445.12MiB->182.57MiB
[2/338]blk.0.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[3/338]blk.0.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq6_K..size=26.25MiB->10.77MiB
[4/338]blk.0.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[5/338]blk.0.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[6/338]blk.0.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[7/338]blk.0.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[8/338]blk.0.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[9/338]blk.0.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[10/338]blk.0.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[11/338]blk.0.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[12/338]blk.0.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[13/338]blk.0.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq6_K..size=0.75MiB->0.31MiB
[14/338]blk.1.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[15/338]blk.1.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq6_K..size=26.25MiB->10.77MiB
[16/338]blk.1.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[17/338]blk.1.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[18/338]blk.1.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[19/338]blk.1.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[20/338]blk.1.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[21/338]blk.1.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[22/338]blk.1.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[23/338]blk.1.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[24/338]blk.1.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[25/338]blk.1.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq6_K..size=0.75MiB->0.31MiB
[26/338]blk.10.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[27/338]blk.10.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq6_K..size=26.25MiB->10.77MiB
[28/338]blk.10.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[29/338]blk.10.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[30/338]blk.10.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[31/338]blk.10.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[32/338]blk.10.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[33/338]blk.10.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[34/338]blk.10.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[35/338]blk.10.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[36/338]blk.10.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[37/338]blk.10.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq6_K..size=0.75MiB->0.31MiB
[38/338]blk.11.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[39/338]blk.11.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[40/338]blk.11.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[41/338]blk.11.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[42/338]blk.11.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[43/338]blk.11.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[44/338]blk.11.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[45/338]blk.11.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[46/338]blk.11.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[47/338]blk.11.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[48/338]blk.11.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[49/338]blk.11.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[50/338]blk.12.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[51/338]blk.12.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[52/338]blk.12.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[53/338]blk.12.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[54/338]blk.12.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[55/338]blk.12.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[56/338]blk.12.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[57/338]blk.12.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[58/338]blk.12.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[59/338]blk.12.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[60/338]blk.12.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[61/338]blk.12.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[62/338]blk.13.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[63/338]blk.13.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq6_K..size=26.25MiB->10.77MiB
[64/338]blk.13.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[65/338]blk.13.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[66/338]blk.13.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[67/338]blk.13.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[68/338]blk.13.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[69/338]blk.13.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[70/338]blk.13.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[71/338]blk.13.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[72/338]blk.13.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[73/338]blk.13.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq6_K..size=0.75MiB->0.31MiB
[74/338]blk.14.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[75/338]blk.14.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[76/338]blk.14.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[77/338]blk.14.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[78/338]blk.14.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[79/338]blk.14.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[80/338]blk.14.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[81/338]blk.14.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[82/338]blk.14.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[83/338]blk.14.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[84/338]blk.14.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[85/338]blk.14.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[86/338]blk.15.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[87/338]blk.15.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[88/338]blk.15.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[89/338]blk.15.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[90/338]blk.15.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[91/338]blk.15.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[92/338]blk.15.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[93/338]blk.15.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[94/338]blk.15.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[95/338]blk.15.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[96/338]blk.15.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[97/338]blk.15.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[98/338]blk.16.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[99/338]blk.16.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[100/338]blk.16.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[101/338]blk.16.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[102/338]blk.16.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[103/338]blk.16.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[104/338]blk.16.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq6_K..size=0.75MiB->0.31MiB
[105/338]blk.2.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[106/338]blk.2.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq6_K..size=26.25MiB->10.77MiB
[107/338]blk.2.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[108/338]blk.2.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[109/338]blk.2.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[110/338]blk.2.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[111/338]blk.2.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[112/338]blk.2.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[113/338]blk.2.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[114/338]blk.2.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[115/338]blk.2.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[116/338]blk.2.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[117/338]blk.3.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[118/338]blk.3.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[119/338]blk.3.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[120/338]blk.3.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[121/338]blk.3.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[122/338]blk.3.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[123/338]blk.3.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[124/338]blk.3.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[125/338]blk.3.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[126/338]blk.3.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[127/338]blk.3.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[128/338]blk.3.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[129/338]blk.4.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[130/338]blk.4.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[131/338]blk.4.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[132/338]blk.4.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[133/338]blk.4.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[134/338]blk.4.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[135/338]blk.4.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[136/338]blk.4.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[137/338]blk.4.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[138/338]blk.4.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[139/338]blk.4.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[140/338]blk.4.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq6_K..size=0.75MiB->0.31MiB
[141/338]blk.5.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[142/338]blk.5.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq6_K..size=26.25MiB->10.77MiB
[143/338]blk.5.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[144/338]blk.5.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[145/338]blk.5.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[146/338]blk.5.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[147/338]blk.5.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[148/338]blk.5.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[149/338]blk.5.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[150/338]blk.5.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[151/338]blk.5.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[152/338]blk.5.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[153/338]blk.6.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[154/338]blk.6.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[155/338]blk.6.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[156/338]blk.6.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[157/338]blk.6.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[158/338]blk.6.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[159/338]blk.6.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[160/338]blk.6.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[161/338]blk.6.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[162/338]blk.6.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[163/338]blk.6.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[164/338]blk.6.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[165/338]blk.7.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[166/338]blk.7.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[167/338]blk.7.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[168/338]blk.7.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[169/338]blk.7.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[170/338]blk.7.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[171/338]blk.7.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[172/338]blk.7.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[173/338]blk.7.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[174/338]blk.7.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[175/338]blk.7.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[176/338]blk.7.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq6_K..size=0.75MiB->0.31MiB
[177/338]blk.8.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[178/338]blk.8.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq6_K..size=26.25MiB->10.77MiB
[179/338]blk.8.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[180/338]blk.8.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[181/338]blk.8.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[182/338]blk.8.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[183/338]blk.8.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[184/338]blk.8.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[185/338]blk.8.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[186/338]blk.8.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[187/338]blk.8.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[188/338]blk.8.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[189/338]blk.9.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[190/338]blk.9.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[191/338]blk.9.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[192/338]blk.9.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[193/338]blk.9.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[194/338]blk.9.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[195/338]blk.9.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[196/338]blk.9.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[197/338]blk.9.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[198/338]blk.9.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[199/338]blk.9.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[200/338]blk.9.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[201/338]blk.16.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[202/338]blk.16.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[203/338]blk.16.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[204/338]blk.16.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[205/338]blk.16.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[206/338]blk.17.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[207/338]blk.17.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq6_K..size=26.25MiB->10.77MiB
[208/338]blk.17.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[209/338]blk.17.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[210/338]blk.17.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[211/338]blk.17.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[212/338]blk.17.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[213/338]blk.17.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[214/338]blk.17.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[215/338]blk.17.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[216/338]blk.17.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[217/338]blk.17.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq6_K..size=0.75MiB->0.31MiB
[218/338]blk.18.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[219/338]blk.18.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[220/338]blk.18.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[221/338]blk.18.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[222/338]blk.18.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[223/338]blk.18.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[224/338]blk.18.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[225/338]blk.18.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[226/338]blk.18.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[227/338]blk.18.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[228/338]blk.18.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[229/338]blk.18.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[230/338]blk.19.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[231/338]blk.19.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[232/338]blk.19.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[233/338]blk.19.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[234/338]blk.19.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[235/338]blk.19.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[236/338]blk.19.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[237/338]blk.19.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[238/338]blk.19.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[239/338]blk.19.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[240/338]blk.19.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[241/338]blk.19.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[242/338]blk.20.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[243/338]blk.20.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq6_K..size=26.25MiB->10.77MiB
[244/338]blk.20.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[245/338]blk.20.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[246/338]blk.20.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[247/338]blk.20.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[248/338]blk.20.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[249/338]blk.20.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[250/338]blk.20.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[251/338]blk.20.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[252/338]blk.20.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[253/338]blk.20.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq6_K..size=0.75MiB->0.31MiB
[254/338]blk.21.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[255/338]blk.21.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[256/338]blk.21.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[257/338]blk.21.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[258/338]blk.21.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[259/338]blk.21.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[260/338]blk.21.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[261/338]blk.21.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[262/338]blk.21.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[263/338]blk.21.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[264/338]blk.21.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[265/338]blk.21.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[266/338]blk.22.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[267/338]blk.22.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[268/338]blk.22.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[269/338]blk.22.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[270/338]blk.22.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[271/338]blk.22.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[272/338]blk.22.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[273/338]blk.22.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[274/338]blk.22.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[275/338]blk.22.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[276/338]blk.22.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[277/338]blk.22.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[278/338]blk.23.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[279/338]blk.23.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq6_K..size=26.25MiB->10.77MiB
[280/338]blk.23.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[281/338]blk.23.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[282/338]blk.23.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[283/338]blk.23.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[284/338]blk.23.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[285/338]blk.23.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[286/338]blk.23.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[287/338]blk.23.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[288/338]blk.23.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[289/338]blk.23.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq6_K..size=0.75MiB->0.31MiB
[290/338]blk.24.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[291/338]blk.24.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq6_K..size=26.25MiB->10.77MiB
[292/338]blk.24.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[293/338]blk.24.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[294/338]blk.24.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[295/338]blk.24.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[296/338]blk.24.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[297/338]blk.24.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[298/338]blk.24.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[299/338]blk.24.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[300/338]blk.24.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[301/338]blk.24.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq6_K..size=0.75MiB->0.31MiB
[302/338]blk.25.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[303/338]blk.25.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq6_K..size=26.25MiB->10.77MiB
[304/338]blk.25.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[305/338]blk.25.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[306/338]blk.25.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[307/338]blk.25.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[308/338]blk.25.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[309/338]blk.25.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[310/338]blk.25.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[311/338]blk.25.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[312/338]blk.25.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[313/338]blk.25.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq6_K..size=0.75MiB->0.31MiB
[314/338]blk.26.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[315/338]blk.26.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq6_K..size=26.25MiB->10.77MiB
[316/338]blk.26.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[317/338]blk.26.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[318/338]blk.26.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[319/338]blk.26.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[320/338]blk.26.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[321/338]blk.26.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[322/338]blk.26.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[323/338]blk.26.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[324/338]blk.26.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[325/338]blk.26.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq6_K..size=0.75MiB->0.31MiB
[326/338]blk.27.attn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[327/338]blk.27.ffn_down.weight-[8960,1536,1,1],type=f16,convertingtoq6_K..size=26.25MiB->10.77MiB
[328/338]blk.27.ffn_gate.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[329/338]blk.27.ffn_up.weight-[1536,8960,1,1],type=f16,convertingtoq4_K..size=26.25MiB->7.38MiB
[330/338]blk.27.ffn_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
[331/338]blk.27.attn_k.bias-[256,1,1,1],type=f32,size=0.001MB
[332/338]blk.27.attn_k.weight-[1536,256,1,1],type=f16,convertingtoq4_K..size=0.75MiB->0.21MiB
[333/338]blk.27.attn_output.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[334/338]blk.27.attn_q.bias-[1536,1,1,1],type=f32,size=0.006MB
[335/338]blk.27.attn_q.weight-[1536,1536,1,1],type=f16,convertingtoq4_K..size=4.50MiB->1.27MiB
[336/338]blk.27.attn_v.bias-[256,1,1,1],type=f32,size=0.001MB
[337/338]blk.27.attn_v.weight-[1536,256,1,1],type=f16,convertingtoq6_K..size=0.75MiB->0.31MiB
[338/338]output_norm.weight-[1536,1,1,1],type=f32,size=0.006MB
llama_model_quantize_internal: model size = 2944.68 MB
llama_model_quantize_internal: quant size = 934.69 MB
main: quantize time = 14228.85 ms
main: total time = 14228.85 ms
Finally, we obtain the Qwen2-1.5B-Instruct_Q4_k_M.gguf model.
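To smoke-test the quantized file, one option is the llama-cpp-python bindings (an extra pip install llama-cpp-python, not part of the steps above; the model path and generation parameters are placeholders):

from llama_cpp import Llama

# Placeholder path to the quantized model produced above.
llm = Llama(model_path=r"D:/Llm/Qwen2-1.5B-Instruct_Q4_k_M.gguf",
            n_ctx=2048, verbose=False)

messages = [
    {"role": "system", "content": "I want you to act as a SQL terminal. Return only the SQL command."},
    {"role": "user", "content": "How many heads of the departments are older than 56?"},
]
out = llm.create_chat_completion(messages=messages, max_tokens=128, temperature=0)
print(out["choices"][0]["message"]["content"])

If the reply comes back as a bare SQL statement like the training samples, the export, conversion, and quantization round-trip worked.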