01. Overview
02. Training Efficiency and Performance
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "cerebras/Llama3-DocChat-1.0-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")

# System prompt: the assistant must answer from the supplied context and say when it cannot
system = "This is a chat between a user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions based on the context. The assistant should also indicate when the answer cannot be found in the context."
instruction = "Please give a full and complete answer for the question."

document = """
# Cerebras Wafer-Scale Cluster

Exa-scale performance, single device simplicity

## AI Supercomputers

Condor Galaxy (CG), the supercomputer built by G42 and Cerebras, is the simplest and fastest way to build AI models in the cloud. With over 16 ExaFLOPs of AI compute, Condor Galaxy trains the most demanding models in hours rather than days. The terabyte scale MemoryX system natively accommodates 100 billion+ parameter models, making large scale training simple and efficient.

| Cluster | ExaFLOPs | Systems  | Memory |
| ------- | -------- | -------- | ------ |
| CG1     | 4        | 64 CS-2s | 82 TB  |
| CG2     | 4        | 64 CS-2s | 82 TB  |
| CG3     | 8        | 64 CS-3s | 108 TB |
"""

question = "How many total CS systems does Condor Galaxy 1, 2, and 3 have combined, and how many flops does this correspond to?"

# Wrap the retrieved document in <context> tags, the format the model expects
user_turn = f"""<context>
{document}
</context>
{instruction} {question}"""

messages = [
    {"role": "system", "content": system},
    {"role": "user", "content": user_turn}
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Stop on either the generic EOS token or Llama 3's end-of-turn token
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
)

# Decode only the newly generated tokens, dropping the prompt
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
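The answer the model is expected to ground in the table can be verified with simple arithmetic over the values from the document string (a minimal check, independent of the model):

```python
# Per-cluster (CS systems, ExaFLOPs) values from the Condor Galaxy table
clusters = {"CG1": (64, 4), "CG2": (64, 4), "CG3": (64, 8)}

total_systems = sum(s for s, _ in clusters.values())
total_exaflops = sum(f for _, f in clusters.values())

print(total_systems, total_exaflops)  # prints: 192 16
```

So a correct grounded answer is 192 CS systems and 16 ExaFLOPs combined.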
03. Open-Source Commitment
04. Benchmark Comparison
05. Challenges and Future Outlook