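Fine-tuning is one of the most effective ways to strengthen a large model's native capabilities.
With fine-tuning, we can adjust the tone and style of a model's answers, improve its reasoning and Agent capabilities, and even inject specialized domain knowledge.
This walkthrough uses the medical domain as an example and gives DeepSeek a targeted upgrade.
The goal is to optimize answer style and inject knowledge at the same time, so that through fine-tuning the model picks up the professional reasoning process for complex medical questions and diagnoses diseases more accurately.
The same fine-tuning workflow applies to DeepSeek R1 models of any size and any precision.
After fine-tuning, the resulting weights can be exported as a GGUF model in one step and plugged into mainstream inference tools such as Ollama and vLLM for chat.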
Hardware requirements: the minimal reproduction in this open class needs only about 7 GB of GPU memory and half an hour of runtime to complete and show a fine-tuning effect.
Transferring the training workflow: the efficient fine-tuning workflow presented here for the DeepSeek R1 model transfers to any DeepSeek R1 distilled model and any CoT dataset, and even to efficient fine-tuning of the DeepSeek R1 model itself.
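Course materials: all slides, code, training data, and model weights before and after fine-tuning are provided with the open class. Reply "777" in the account backend to get them for free.
Reference materials: additional reference materials are included with the open class to support learning.
Let's get started!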
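from unsloth import FastLanguageModel
Try running inference on the Llama-based model with unsloth:
max_seq_length = 2048
dtype = None
load_in_4bit = False
*Note: if GPU memory is insufficient, set load_in_4bit = True to run the 4-bit quantized version.
With INT4 quantization, inference on the 8B model needs only about 7 GB of GPU memory. At this point, model is the DeepSeek R1 8B distilled model loaded with these settings through FastLanguageModel.from_pretrained, and tokenizer is its tokenizer.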
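To make the 4-bit trade-off concrete, here is a rough back-of-the-envelope sketch of how much memory the weights alone occupy at different precisions. The parameter count of 8e9 is an assumed round figure for an "8B" model, and the helper name `weight_memory_gib` is made up for this illustration; activations, the KV cache, and framework overhead are deliberately not counted.

```python
def weight_memory_gib(n_params: float, bits_per_param: int) -> float:
    """Memory needed to hold the weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


# Assumption: a round 8e9 parameters for an "8B" model.
N_PARAMS = 8e9

for label, bits in [("16-bit", 16), ("4-bit", 4)]:
    print(f"{label}: {weight_memory_gib(N_PARAMS, bits):.1f} GiB")
```

Under these assumptions the 4-bit weights come to a bit under 4 GiB. The remainder of the roughly 7 GB quoted above would then go to everything this estimate leaves out, such as activations and the generation cache.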
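Switch the model to inference mode:
FastLanguageModel.for_inference(model)
Now we can chat with the model:
question = "请问如何证明根号2是无理数?"  # "How do you prove that the square root of 2 is irrational?"
First, use the tokenizer to convert the input question into token indices:
inputs = tokenizer([question], return_tensors="pt").to("cuda")
Then pass inputs to the model to generate a reply:
outputs = model.generate(
    input_ids=inputs.input_ids,
    max_new_tokens=1200,
    use_cache=True,
)
The reply is also a sequence of token indices, so the tokenizer is needed again to convert it back to text:
response = tokenizer.batch_decode(outputs)
response
print(response[0])
That completes the unsloth inference workflow.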
Setting up the Q&A template
prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context.
Write a response that appropriately completes the request.
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.

### Instruction:
You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning.
Please answer the following medical question.

### Question:
{}

### Response:
<think>{}"""
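As a small illustration of how the template is used at inference time, the sketch below fills the two `{}` slots with a question and an empty string, so the prompt ends right after the opening `<think>` tag and the model continues from there. `TEMPLATE` is an abbreviated stand-in for `prompt_style`, and the question is a hypothetical example, not one from the dataset.

```python
# Abbreviated stand-in for prompt_style (assumption: same slot layout).
TEMPLATE = """### Instruction:
Please answer the following medical question.

### Question:
{}

### Response:
<think>{}"""

# Hypothetical question, used only to show the mechanics.
question = "What is the first-line treatment for anaphylaxis?"

# First slot: the question. Second slot: left empty at inference time,
# so generation starts inside the <think> block.
prompt = TEMPLATE.format(question, "")
print(prompt)
```

Leaving the second slot empty is what lets one template shape serve both inference (open-ended reasoning) and, with the slot filled, training.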
Next, we take a few questions from the medical-o1-reasoning-SFT dataset and look at how the model answers them in its initial state.
question_1 = "A 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings, what would cystometry most likely reveal about her residual volume and detrusor contractions?"
question_2 = "Given a patient who experiences sudden-onset chest pain radiating to the neck and left arm, with a past medical history of hypercholesterolemia and coronary artery disease, elevated troponin I levels, and tachycardia, what is the most likely coronary artery involved based on this presentation?"
Q&A test
inputs1 = tokenizer([prompt_style.format(question_1, "")], return_tensors="pt").to("cuda")
outputs1 = model.generate(
    input_ids=inputs1.input_ids,
    max_new_tokens=1200,
    use_cache=True,
)
response1 = tokenizer.batch_decode(outputs1)
print(response1[0].split("### Response:")[1])
The reply is as follows:
<think>
Okay, so I'm trying to figure out this medical question. Let's break it down step by step. The patient is a 61-year-old woman with a history of involuntary urine loss during activities like coughing or sneezing, but she doesn't leak at night. She's undergoing a gynecological exam and a Q-tip test. The question is about what cystometry would show regarding her residual volume and detrusor contractions.
First, I know that involuntary urine loss during activities like sneezing or coughing is often related to stress urinary incontinence. Stress incontinence usually happens when the urethral muscles aren't strong enough to prevent the bladder from leaking when there's increased pressure, like from coughing.
Now, the Q-tip test. From what I remember, the Q-tip is a urethral catheter used to measure urethral pressure profiles. It's often used to assess urethral function. A positive Q-tip test result, where the urethral pressure is lower than the intravesicular pressure during the Valsalva maneuver, is associated with intrinsic sphincteric deficiency, which is a type of stress incontinence.
Since the patient has a history of involuntary leakage during activities but not at night, it's more likely stress incontinence rather than something like urgency incontinence, which would also involve nighttime leakage. So, if the Q-tip test is positive, it points towards intrinsic sphincteric deficiency.
Now, moving to cystometry. Cystometry is a test that measures how the bladder responds to filling and the contractions of the detrusor muscle. It can show if there's an overactive bladder (OAB), which causes urgency and frequency, or if there's underactive detrusor, which can lead to retention.
In this case, the patient's main issue is stress incontinence, which is more about the inability to hold urine under increased pressure. Cystometry would look at the detrusor contractions. If the detrusor is underactive, it won't contract strongly enough to empty the bladder, leading to residual volume. But if it's overactive, it might contract too strongly, causing urgency.
Given that the patient has a history of stress incontinence and a positive Q-tip test suggesting intrinsic sphincteric deficiency, I think the cystometry would show that the detrusor contractions are normal. The issue isn't with the detrusor muscle's ability to contract but rather the inability to seal the urethra to hold the pressure. So, the residual volume might be normal unless there's significant retention, but the key finding would be the detrusor contractions being normal, not overactive.
Wait, but could there be residual volume? If the patient has some urine left in the bladder after urination, that's residual volume. But without symptoms of retention, like a full bladder or difficulty starting urination, it's less likely. The main issue is the incontinence during activities, so the detrusor contractions are normal, and the residual volume is within normal limits unless there's another condition.
So, putting it all together, the cystometry would likely show normal detrusor contractions and normal residual volume. The issue is more on the sphincteric side, not the detrusor.
</think>
Based on the analysis of the patient's history and the Q-tip test results, the cystometry would most likely reveal normal detrusor contractions and normal residual volume. The primary issue appears to be stress urinary incontinence due to intrinsic sphincteric deficiency, as indicated by the positive Q-tip test. This condition typically affects the urethral sphincter's ability to prevent leakage under increased pressure, rather than the detrusor muscle's contractility. Therefore, the detrusor contractions are not overactive, and the residual volume is within normal limits.<|end▁of▁sentence|>
Reference answer:
Cystometry in this case of stress urinary incontinence would most likely reveal a normal post-void residual volume, as stress incontinence typically does not involve issues with bladder emptying. Additionally, since stress urinary incontinence is primarily related to physical exertion and not an overactive bladder (OAB), you would not expect to see any involuntary detrusor contractions during the test.
The test for question 2 is as follows:
inputs2 = tokenizer([prompt_style.format(question_2, "")], return_tensors="pt").to("cuda")
outputs2 = model.generate(
    input_ids=inputs2.input_ids,
    max_new_tokens=1200,
    use_cache=True,
)
response2 = tokenizer.batch_decode(outputs2)
print(response2[0].split("### Response:")[1])
<think>
Okay, so I have a patient presenting with sudden-onset chest pain that's radiating to the neck and left arm. That makes me think of a heart attack because those symptoms are classic—angina or myocardial infarction. The left arm pain, neck pain, and sometimes jaw or back pain can all be related to coronary artery issues.
Looking at the past medical history, the patient has hypercholesterolemia, which is high cholesterol, and coronary artery disease. Those are both risk factors for atherosclerosis, which can lead to blockages in the coronary arteries. The elevated troponin I levels are a big clue because troponin is a cardiac enzyme released when the heart muscle is damaged, which is a sign of a heart attack.
Also, the patient is experiencing tachycardia, which means their heart is beating faster than usual. In a heart attack, the heart might beat faster as it tries to pump blood to compensate for the blocked artery.
Now, considering the coronary arteries, the left main coronary artery supplies blood to the entire left side of the heart, including the left ventricle, which is a large muscle that's crucial for pumping blood. If there's a blockage here, it can lead to a more severe heart attack because the left ventricle is so vital.
The right coronary artery supplies the right ventricle and the inferior wall of the left ventricle. Blockages here are possible too, but the left main is more commonly associated with the symptoms described, especially when troponin is elevated.
So putting it all together, the most likely coronary artery involved is the left main coronary artery. The combination of the patient's history, the elevated troponin, and the typical chest pain radiation points to this artery being the culprit.
</think>
The most likely coronary artery involved in this presentation is the **left main coronary artery (LMCA)**.
**Explanation:**
- **Symptoms:** The patient's sudden chest pain radiating to the neck and left arm, along with elevated troponin levels, suggests an acute coronary syndrome, likely a myocardial infarction (heart attack).
- **Past Medical History:** History of hypercholesterolemia and coronary artery disease are risk factors for atherosclerosis, which can lead to blockages in the coronary arteries.
- **Tachycardia:** Increased heart rate may occur as the heart compensates for reduced blood flow to the heart muscle.
- **Coronary Artery Consideration:** The left main coronary artery supplies the left ventricle, a large muscle that is crucial for cardiac function. Blockages in the LMCA can lead to more severe and life-threatening heart attacks compared to blockages in the right coronary artery, which typically supply less critical areas.
Thus, the combination of symptoms, elevated troponin, and the patient's history strongly points to the **left main coronary artery** as the most likely culprit.<|end▁of▁sentence|>
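Reference answer:
Based on the patient's presentation of sudden chest pain radiating to the neck and left arm, together with a history of hypercholesterolemia and coronary artery disease, elevated troponin, and tachycardia, the clinical picture strongly suggests involvement of the left anterior descending (LAD) artery. This artery is commonly the culprit in such presentations because it supplies a large portion of the heart. The combination of radiating pain and elevated troponin indicates myocardial damage, which makes the LAD the most likely culprit artery. However, without further diagnostic tests such as an ECG, a definitive diagnosis still awaits confirmation.
As we can see, in its original state the model can reason and produce a reply, but the reasoning in the first answer does not actually follow medical standards.
The second question is simply answered incorrectly.
So in its initial state, the model does not perform well on questions from the medical-o1-reasoning-SFT dataset.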
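When comparing model output against a reference answer, it can help to separate the reasoning inside `<think>...</think>` from the final answer. The sketch below wraps the split on `### Response:` used above in a small helper. The function name `split_reasoning` is made up for this illustration, and the default end-of-text string is copied from the output shown in this article; in practice `tokenizer.eos_token` would be passed in instead.

```python
def split_reasoning(decoded, marker="### Response:", eos="<|end▁of▁sentence|>"):
    """Return (reasoning, answer) from one decoded generation.

    Hypothetical helper: assumes the decoded text contains the prompt
    followed by '<think>reasoning</think>answer' and an optional EOS.
    """
    _, found, tail = decoded.partition(marker)
    if not found:
        raise ValueError(f"marker {marker!r} not found in decoded text")
    tail = tail.replace(eos, "").strip()
    reasoning, closed, answer = tail.partition("</think>")
    reasoning = reasoning.replace("<think>", "", 1).strip()
    if not closed:
        # Generation was cut off before the reasoning finished.
        return reasoning, ""
    return reasoning, answer.strip()


# Toy decoded string standing in for response1[0].
decoded = (
    "### Question:\nQ\n\n### Response:\n"
    "<think>step one</think>Final answer.<|end▁of▁sentence|>"
)
reasoning, answer = split_reasoning(decoded)
print("reasoning:", reasoning)
print("answer:", answer)
```

Returning an empty answer when `</think>` is missing makes truncated generations (for example, when `max_new_tokens` is hit) easy to spot.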
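Next we fine-tune the model and test how it answers afterwards.
For this dataset, we can fine-tune on a subset of the original data, or on all of it with several passes.
For most fine-tuning experiments it makes sense to start from a minimal viable experiment: fine-tune on a small amount of data first and observe the effect.
If fine-tuning runs smoothly and shows an effect, then consider scaling up to more data.
Here we download the medical-o1-reasoning-SFT dataset directly from Hugging Face.
Setting up the proxy environment
Because network access to Hugging Face is restricted, the network environment needs to be configured before downloading the dataset.
On an AutoDL server, academic acceleration can be enabled as follows, so that Hugging Face is reachable and the dataset can be downloaded: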
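import subprocess
import os

# Source the AutoDL proxy script in a bash subshell and print the proxy variables it sets.
result = subprocess.run('bash -c "source /etc/network_turbo && env | grep proxy"', shell=True, capture_output=True, text=True)
output = result.stdout
# Copy each VAR=value line into this process's environment.
for line in output.splitlines():
    if '=' in line:
        var, value = line.split('=', 1)
        os.environ[var] = value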
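The parsing step in that snippet can be tried in isolation, without an AutoDL machine. The sketch below applies the same `VAR=value` parsing to a sample string; the function name `parse_env_lines` and the proxy addresses are hypothetical and only stand in for whatever `env | grep proxy` prints on a real server.

```python
def parse_env_lines(output: str) -> dict:
    """Parse 'VAR=value' lines into a dict, skipping lines without '='."""
    env = {}
    for line in output.splitlines():
        if "=" in line:
            # Split on the first '=' only, since values may contain '='.
            var, value = line.split("=", 1)
            env[var] = value
    return env


# Hypothetical sample of what `env | grep proxy` might print.
sample = (
    "http_proxy=http://127.0.0.1:7890\n"
    "https_proxy=http://127.0.0.1:7890?a=b\n"
    "not a variable\n"
)
print(parse_env_lines(sample))
```

Splitting with a maximum of one split matters here: a proxy URL with a query string would otherwise be cut at its own `=`.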
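Downloading the dataset
Next, use datasets to download the dataset:
!pip install datasets
import os
from datasets import load_dataset
Extract the token that marks the end of generated text:
EOS_TOKEN = tokenizer.eos_token
tokenizer.eos_token
'<|end▁of▁sentence|>'
Then define a function that reshapes each medical-o1-reasoning-SFT example: it fills the question, the Complex_CoT column, and the Response column into the training template train_prompt_style (defined in the full fine-tuning section below) and appends the end-of-text token:
def formatting_prompts_func(examples):
    inputs = examples["Question"]
    cots = examples["Complex_CoT"]
    outputs = examples["Response"]
    texts = []
    for input, cot, output in zip(inputs, cots, outputs):
        # Template slots: question, chain of thought, final response.
        text = train_prompt_style.format(input, cot, output) + EOS_TOKEN
        texts.append(text)
    return {"text": texts}
For the minimal viable experiment, downloading only 500 examples is enough to see an effect: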
dataset = load_dataset("FreedomIntelligence/medical-o1-reasoning-SFT", "en", split="train[0:500]", trust_remote_code=True)
Using the latest cached version of the dataset since FreedomIntelligence/medical-o1-reasoning-SFT couldn't be found on the Hugging Face Hub
Found the latest cached dataset configuration 'en' at /root/.cache/huggingface/datasets/FreedomIntelligence___medical-o1-reasoning-sft/en/0.0.0/4c9573e7de1e8660b88158db2efa7c7204bbd269 (last modified on Wed Feb 5 01:06:32 2025).
dataset[0]
{'Question': 'A 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings, what would cystometry most likely reveal about her residual volume and detrusor contractions?',
 'Complex_CoT': "Okay, let's think about this step by step. There's a 61-year-old woman here who's been dealing with involuntary urine leakages whenever she's doing something that ups her abdominal pressure like coughing or sneezing. This sounds a lot like stress urinary incontinence to me. Now, it's interesting that she doesn't have any issues at night; she isn't experiencing leakage while sleeping. This likely means her bladder's ability to hold urine is fine when she isn't under physical stress. Hmm, that's a clue that we're dealing with something related to pressure rather than a bladder muscle problem.\n\nThe fact that she underwent a Q-tip test is intriguing too. This test is usually done to assess urethral mobility. In stress incontinence, a Q-tip might move significantly, showing urethral hypermobility. This kind of movement often means there's a weakness in the support structures that should help keep the urethra closed during increases in abdominal pressure. So, that's aligning well with stress incontinence.\n\nNow, let's think about what would happen during cystometry. Since stress incontinence isn't usually about sudden bladder contractions, I wouldn't expect to see involuntary detrusor contractions during this test. Her bladder isn't spasming or anything; it's more about the support structure failing under stress. Plus, she likely empties her bladder completely because stress incontinence doesn't typically involve incomplete emptying. So, her residual volume should be pretty normal.\n\nAll in all, it seems like if they do a cystometry on her, it will likely show a normal residual volume and no involuntary contractions. Yup, I think that makes sense given her symptoms and the typical presentations of stress urinary incontinence.",
 'Response': 'Cystometry in this case of stress urinary incontinence would most likely reveal a normal post-void residual volume, as stress incontinence typically does not involve issues with bladder emptying. Additionally, since stress urinary incontinence is primarily related to physical exertion and not an overactive bladder, you would not expect to see any involuntary detrusor contractions during the test.'}
Then apply the formatting:
dataset=dataset.map(formatting_prompts_func,batched=True,)
The dataset now takes the following form:
dataset["text"][0]
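To see what the formatting function produces without downloading anything, the sketch below runs the same logic on a toy batch. The batch mimics the column-wise dict that a batched `map` passes in. The template here is an abbreviated stand-in for `train_prompt_style`, the EOS string is copied from the tokenizer output shown earlier, and the row contents are placeholders.

```python
# Abbreviated stand-in for train_prompt_style (assumption: three slots for
# question, chain of thought, and final response).
train_prompt_style = "### Question:\n{}\n\n### Response:\n<think>\n{}\n</think>\n{}"

# In the article this value comes from tokenizer.eos_token.
EOS_TOKEN = "<|end▁of▁sentence|>"


def formatting_prompts_func(examples):
    """Turn a batch of columns into a batch of training strings."""
    texts = []
    for question, cot, response in zip(
        examples["Question"], examples["Complex_CoT"], examples["Response"]
    ):
        texts.append(train_prompt_style.format(question, cot, response) + EOS_TOKEN)
    return {"text": texts}


# Toy batch: each key maps to a list with one entry per row.
batch = {
    "Question": ["Q1", "Q2"],
    "Complex_CoT": ["reasoning 1", "reasoning 2"],
    "Response": ["A1", "A2"],
}
out = formatting_prompts_func(batch)
print(out["text"][0])
```

Each training string ends with the EOS token, which is what teaches the model to stop after the final answer instead of generating indefinitely.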
Where the dataset is saved
By default the dataset is saved in the .cache folder under the home directory.
The model can then be put into fine-tuning mode:
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    use_gradient_checkpointing="unsloth",  # True or "unsloth" for very long context
    random_state=3407,
    use_rslora=False,
    loftq_config=None,
)
Unsloth 2025.1.8 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.
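To get a feel for how small the LoRA update is, the sketch below counts the trainable parameters that r=16 adapters add to the seven target modules across 32 layers. A rank-r adapter on a linear layer of shape d_in by d_out adds r × (d_in + d_out) parameters. The layer dimensions (hidden size 4096, MLP size 14336, key/value projection width 1024) are assumptions about the Llama-8B architecture and are not stated in this article; only r=16, the module names, and the 32 layers come from the text above.

```python
# Assumed Llama-8B dimensions (not given in the article).
HIDDEN = 4096
INTERMEDIATE = 14336
KV_DIM = 1024  # grouped-query attention: narrower key/value projections

# From the article: rank and number of patched layers.
R = 16
N_LAYERS = 32

# (d_in, d_out) for each LoRA target module.
shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """A is (r x d_in) and B is (d_out x r), so r * (d_in + d_out) in total."""
    return r * (d_in + d_out)


per_layer = sum(lora_params(d_in, d_out, R) for d_in, d_out in shapes.values())
total = per_layer * N_LAYERS
print(f"per layer: {per_layer:,}")
print(f"total:     {total:,}")
print(f"share of 8e9 base parameters: {total / 8e9:.2%}")
```

Under these assumptions the adapters hold roughly 42 million parameters, about half a percent of the base model, which is why the run fits on a single modest GPU.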
Then import the required libraries:
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported
Create the supervised fine-tuning trainer:
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    dataset_num_proc=2,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        # Use num_train_epochs = 1, warmup_ratio for full training runs!
        warmup_steps=5,
        max_steps=60,
        learning_rate=2e-4,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        logging_steps=10,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=3407,
        output_dir="outputs",
    ),
)
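This code uses SFTTrainer for supervised fine-tuning (SFT) and applies to model fine-tuning in the transformers and Unsloth ecosystem:
1. Importing the libraries
SFTTrainer (from the trl library):
trl (Transformer Reinforcement Learning) is a Hugging Face library that provides supervised fine-tuning (SFT) and reinforcement learning (RLHF) functionality.
SFTTrainer is mainly used for supervised fine-tuning and works with low-rank adaptation methods such as LoRA.
TrainingArguments (from the transformers library):
This class defines the training hyperparameters, such as batch size, learning rate, optimizer, and number of training steps.
is_bfloat16_supported() (from unsloth):
This function checks whether the current GPU supports bfloat16 (BF16) and returns True if it does, False otherwise.
bfloat16 is a more efficient numeric format that performs better on newer GPUs such as the NVIDIA A100/H100.
2. Initializing SFTTrainer for fine-tuning
[Tables: SFTTrainer parameters; TrainingArguments parameters]
Then start fine-tuning: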
trainer_stats=trainer.train()
Tracking run with wandb version 0.19.5
Run data is saved locally in /root/autodl-tmp/models/wandb/run-20250205_004957-k0dz6rg7
Syncing run outputs to Weights & Biases (docs)
View project at https://wandb.ai/2323365771-ff/huggingface
View run at https://wandb.ai/2323365771-ff/huggingface/runs/k0dz6rg7
The wandb dashboard now shows the following:
trainer_stats
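A quick piece of arithmetic shows how much data this 60-step run actually covers. With a per-device batch of 2 and 4 gradient-accumulation steps, each optimizer step consumes 8 examples. The sketch assumes a single GPU (the article does not state the device count), and `samples_seen` is a hypothetical helper name.

```python
def samples_seen(per_device_batch: int, grad_accum: int, steps: int, n_devices: int = 1) -> int:
    """Examples consumed after `steps` optimizer steps."""
    return per_device_batch * grad_accum * n_devices * steps


# Values from the TrainingArguments above; n_devices=1 is an assumption.
effective_batch = 2 * 4
seen = samples_seen(per_device_batch=2, grad_accum=4, steps=60)

print("effective batch size:", effective_batch)
print("examples seen in 60 steps:", seen)
print("fraction of the 500-example subset:", seen / 500)
```

So under that assumption the minimal experiment amounts to slightly less than one pass over the 500 examples, which is consistent with treating it as a smoke test rather than a full training run.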
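Note that after fine-tuning finishes, unsloth automatically updates the model weights (in the cache), so the fine-tuned model can be called directly without manually merging weights:
FastLanguageModel.for_inference(model)
inputs = tokenizer([prompt_style.format(question_1, "")], return_tensors="pt").to("cuda")
outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=1200,
    use_cache=True,
)
response = tokenizer.batch_decode(outputs)
print(response[0].split("### Response:")[1])
The first question is now answered in a more standard way, and correctly. The second question is still answered incorrectly, so a larger fine-tuning run is worth considering.
Before that, though, we can save the small-scale fine-tuned model locally.
At this point the locally saved model weights are in the outputs folder.
The model weights can then be merged with the following code:
new_model_local = "DeepSeek-R1-Medical-COT-Tiny"
model.save_pretrained(new_model_local)
tokenizer.save_pretrained(new_model_local)
model.save_pretrained_merged(new_model_local, tokenizer, save_method="merged_16bit",)
Once saving finishes, the model appears in the current folder.
It can then be pushed to Hugging Face, saved as a GGUF file, and used for inference.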
3. Full efficient fine-tuning experiment
Next we fine-tune on the entire dataset to improve the result further.
train_prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context.
Write a response that appropriately completes the request.
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.

### Instruction:
You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning.
Please answer the following medical question.

### Question:
{}

### Response:
<think>
{}
</think>
{}"""
EOS_TOKEN = tokenizer.eos_token  # Must add EOS_TOKEN

def formatting_prompts_func(examples):
    inputs = examples["Question"]
    cots = examples["Complex_CoT"]
    outputs = examples["Response"]
    texts = []
    for input, cot, output in zip(inputs, cots, outputs):
        text = train_prompt_style.format(input, cot, output) + EOS_TOKEN
        texts.append(text)
    return {"text": texts}
Now load the full dataset:
dataset = load_dataset("FreedomIntelligence/medical-o1-reasoning-SFT", "en", split="train", trust_remote_code=True)
dataset = dataset.map(formatting_prompts_func, batched=True,)
dataset["text"][0]
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    use_gradient_checkpointing="unsloth",  # True or "unsloth" for very long context
    random_state=3407,
    use_rslora=False,
    loftq_config=None,
)
Here the number of epochs is set to 3, so the dataset is traversed 3 times:
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    dataset_num_proc=2,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=3,
        warmup_steps=5,
        # max_steps=60,
        learning_rate=2e-4,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        logging_steps=10,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=3407,
        output_dir="outputs",
    ),
)
trainer_stats=trainer.train()
Training takes about 15 hours in total. Testing with the two questions again, both now get good answers:
Output for question 1:
<think>
Alright, let's think this through step by step. We've got a 61-year-old woman who's been dealing with involuntary urine loss whenever she does something like coughing or sneezing. That sounds like stress urinary incontinence, which usually means there's some kind of leakage when there's increased pressure in the abdomen.
Now, the fact that she doesn't have any leakage at night is interesting. That's a big hint because it suggests that the problem isn't about bladder control or retention, since there's no issue when lying down.
Now, let's consider the Q-tip test. This test is a clever way to check for urethral mobility. If the Q-tip moves a lot, it means the urethra is mobile, which is a typical sign of stress incontinence. So, we're probably looking at a condition where the urethra isn't staying closed under pressure.
Now, if we were to do a cystometry on her, we'd be checking a few things. First, there's the residual volume. Since she's not leaking urine at night, her bladder should be empty, so we'd expect the residual volume to be normal. Then, there's the detrusor contractions. Stress incontinence doesn't usually involve the detrusor muscle contracting abnormally. So, we'd expect normal or maybe slightly increased contractions, but nothing out of the ordinary.
So, putting it all together, if we do the cystometry, we'd likely see a normal residual volume and normal detrusor contractions. That fits with everything we know about her symptoms and the Q-tip test results. Everything lines up with stress urinary incontinence, confirming that our conclusion makes sense.
</think>
Based on the information provided, the 61-year-old woman is likely experiencing stress urinary incontinence, as suggested by the involuntary urine loss during activities like coughing or sneezing, coupled with the fact that she doesn't experience leakage at night. During a cystometry exam, we would expect the following findings:
1. **Residual Volume:** This would be normal. Since she does not experience leakage at night, her bladder is likely empty at bedtime, indicating a normal post-void residual volume.
2. **Detrusor Contractions:** These would also be normal or possibly slightly increased. Stress urinary incontinence typically does not involve abnormal detrusor contractions, as the issue is more related to the closure mechanism of the urethra under increased abdominal pressure rather than bladder control or contractions.
Thus, cystometry would most likely reveal a normal residual volume and normal detrusor contractions, consistent with stress urinary incontinence.<|end▁of▁sentence|>
Output for question 2:
<think>
Okay, let's think about this. The patient is having sudden chest pain that's shooting to their neck and left arm. That's pretty classic for something like a myocardial infarction, right? I mean, the pain spreading like that usually points to a big problem in the heart's supply.
Now, considering the heart's coronary arteries, I know the left anterior descending artery, or LAD, is a big player. It's like the main highway for blood to reach the front wall of the heart. If there's a blockage there, it can definitely cause pain that radiates to the neck and arm.
Then, there's the right coronary artery, or RCA, which supplies the right side of the heart and can affect the inferior wall of the heart. But wait, the pain pattern here seems to be more on the left side, so maybe the RCA is less likely.
The patient has hypercholesterolemia and coronary artery disease. These conditions put them at risk for atherosclerosis, which can lead to blockages in the coronary arteries. The LAD is commonly involved in such scenarios, especially when the pain spreads to the neck and arm.
Also, the elevated troponin I levels and tachycardia are strong signals that something serious is happening in the heart. These are usually seen in myocardial infarctions. Given the pain pattern and the patient's risk factors, the LAD seems like the most likely culprit here.
So, when I put all this together, it really seems like the left anterior descending artery is the most likely artery involved in this situation. It just fits with the classic presentation of anterior myocardial infarction. Yeah, I'm pretty confident about that.
</think>
Based on the presentation of sudden-onset chest pain radiating to the neck and left arm, along with the patient's history of hypercholesterolemia and coronary artery disease, the most likely coronary artery involved is the left anterior descending (LAD) artery. This artery supplies the front wall of the heart, and a blockage here can cause the classic symptoms described. The elevated troponin I levels and tachycardia further support the likelihood of a myocardial infarction, with the LAD being a common site for such events.<|end▁of▁sentence|>
Finally, save the model weights:
new_model_local = "DeepSeek-R1-Medical-COT"
model.save_pretrained(new_model_local)
tokenizer.save_pretrained(new_model_local)
model.save_pretrained_merged(new_model_local, tokenizer, save_method="merged_16bit",)
That completes this fine-tuning run. Give it a try yourself.