ingFang SC", Cambria, Cochin, Georgia, Times, "Times New Roman", serif;font-size: 15.4px;font-weight: bold;display: table;margin: 4em auto 2em;padding: 4px 1em;background: rgb(0, 152, 116);color: rgb(255, 255, 255);border-radius: 999px;">先决条件ingFang SC", Cambria, Cochin, Georgia, Times, "Times New Roman", serif;padding-left: 1em;list-style: circle;color: rgb(63, 63, 63);" class="list-paddingleft-1"> •安装ollama和llama3模型,参看超越GPT-3.5!Llama3个人电脑本地部署教程 •安装python3.9 •安装langchain用于协调LLM •安装weaviate-client用于向量数据库 ingFang SC", Cambria, Cochin, Georgia, Times, "Times New Roman", serif;overflow-x: auto;border-radius: 8px;margin: 10px 8px;">pip3installlangchainweaviate-clientingFang SC", Cambria, Cochin, Georgia, Times, "Times New Roman", serif;font-size: 15.4px;font-weight: bold;display: table;margin: 4em auto 2em;padding: 4px 1em;background: rgb(0, 152, 116);color: rgb(255, 255, 255);border-radius: 999px;">RAG实践ingFang SC", Cambria, Cochin, Georgia, Times, "Times New Roman", serif;margin: 1.5em 8px;letter-spacing: 0.1em;color: rgb(63, 63, 63);">RAG需要从向量数据库检索上下文然后输入LLM进行生成,因此需要提前将文本数据向量化并存储到向量数据库。主要步骤如下: ingFang SC", Cambria, Cochin, Georgia, Times, "Times New Roman", serif;padding-left: 1em;color: rgb(63, 63, 63);" class="list-paddingleft-1">1.准备文本资料 2.将文本分块 3.嵌入以及存储块到向量数据库 ingFang SC", Cambria, Cochin, Georgia, Times, "Times New Roman", serif;margin: 1.5em 8px;letter-spacing: 0.1em;color: rgb(63, 63, 63);">新建一个python3项目以及index.py文件,导入需要用到的模块:ingFang SC", Cambria, Cochin, Georgia, Times, "Times New Roman", serif;overflow-x: auto;border-radius: 8px;margin: 10px 8px;">fromlangchain_community.document_loadersimportTextLoader#文本加载器 fromlangchain.text_splitterimportCharacterTextSplitter#文本分块器 fromlangchain_community.embeddingsimportOllamaEmbeddings#Ollama向量嵌入器 importweaviate#向量数据库 fromweaviate.embeddedimportEmbeddedOptions#向量嵌入选项 fromlangchain.promptsimportChatPromptTemplate#聊天提示模板 fromlangchain_community.chat_modelsimportChatOllama#ChatOllma聊天模型 fromlangchain.schema.runnableimportRunnablePassthrough fromlangchain.schema.output_parserimportStrOutputParser#输出解析器 fromlangchain_community.vectorstoresimportWeaviate#向量数据库 importrequestsingFang SC", Cambria, Cochin, Georgia, Times, "Times New Roman", serif;font-size: 14px;font-weight: bold;margin-top: 2em;margin-right: 8px;margin-bottom: 0.75em;padding-left: 8px;border-left: 3px solid rgb(0, 152, 116);color: rgb(63, 63, 63);">下载&加载语料ingFang SC", Cambria, Cochin, Georgia, Times, "Times New Roman", serif;margin: 1.5em 8px;letter-spacing: 0.1em;color: rgb(63, 63, 63);">这里使用拜登总统2022年的国情咨文作为示例。文件链接https://raw.githubusercontent.com/langchain-ai/langchain/master/docs/docs/modules/state_of_the_union.txt。langchain提供了多个文档加载器,这里我们使用`TextLoaders`即可。#下载文件 url="https://raw.githubusercontent.com/langchain-ai/langchain/master/docs/docs/modules/state_of_the_union.txt" res=requests.get(url) withopen("state_of_the_union.txt","w")asf: f.write(res.text) #加载文件 loader=TextLoader('./state_of_the_union.txt') documents=loader.load()
### Split the corpus into chunks

The raw document is too large for the LLM's context window, so it has to be split into chunks the LLM can handle. LangChain provides many built-in text splitters; here we use `CharacterTextSplitter`:

```python
text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = text_splitter.split_documents(documents)
```
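To see what `chunk_size=500` and `chunk_overlap=50` actually produce, you can inspect the result; a small sketch:

```python
# Inspect the split: each chunk is itself a Document
print(len(chunks))                  # number of chunks produced
print(len(chunks[0].page_content))  # length of the first chunk
print(chunks[0].page_content[:120]) # preview the start of the first chunk
```

Note that `CharacterTextSplitter` splits on a separator (paragraph breaks by default), so an individual chunk can exceed `chunk_size` when a single paragraph is longer than the limit.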
### Embed and store in the vector database

To make the chunks searchable, we generate an embedding vector for each one and store the chunks together with their vectors. Here Ollama with llama3 produces the embeddings, and an embedded Weaviate instance stores them.

```python
client = weaviate.Client(
    embedded_options=EmbeddedOptions()
)

print("store vector")
vectorstore = Weaviate.from_documents(
    client=client,
    documents=chunks,
    embedding=OllamaEmbeddings(model="llama3"),
    by_text=False
)
```
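Before wiring up the full chain, you can query the vector store directly to confirm that embedding and storage worked. A quick sketch using LangChain's generic `similarity_search` vector-store API (the query string is just an example):

```python
# Direct similarity search against the vector store (no LLM involved yet)
docs = vectorstore.similarity_search("What did the president say about COVID-19?", k=2)
for doc in docs:
    print(doc.page_content[:150], "\n---")
```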
检索 & 增强向量数据库加载数据后,可以作为检索器,通过用户查询和嵌入向量之间的语义相似性获取数据,然后使用一个固定的聊天模板即可。 #检索器 retriever=vectorstore.as_retriever() #LLM提示模板 template="""Youareanassistantforquestion-answeringtasks. Usethefollowingpiecesofretrievedcontexttoanswerthequestion. Ifyoudon'tknowtheanswer,justsaythatyoudon'tknow. Usethreesentencesmaximumandkeeptheanswerconcise. Question:{question} Context:{context} Answer: """ prompt=ChatPromptTemplate.from_template(template)
### Generate

Finally, combine the retriever, the prompt template, and the LLM into a RAG chain.

```python
llm = ChatOllama(model="llama3", temperature=0)  # low temperature keeps answers grounded
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}  # retrieved context + raw question
    | prompt
    | llm
    | StrOutputParser()
)

# Run the query & generate
query = "What did the president mainly say?"
print(rag_chain.invoke(query))
```
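`invoke` waits for the complete answer. Since Ollama generates token by token, you can also stream the response as it is produced; a sketch using the standard runnable streaming interface:

```python
# Stream the answer incrementally instead of waiting for the full response
for token in rag_chain.stream(query):
    print(token, end="", flush=True)
print()
```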
In the example above I asked the LLM what the president mainly said, and it answered:

> The president mainly talked about continuing efforts to combat COVID-19, including vaccination rates and measures to prepare for new variants. They also discussed investments in workers, communities, and law enforcement, with a focus on fairness and justice. The tone was hopeful and emphasized the importance of taking action to improve Americans' lives.

That is a reasonable answer: the LLM drew on the input corpus to reply about COVID-19 as well as workers and communities. LangChain supports many LLM backends, so readers who want to can swap in a model hosted by OpenAI.
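Swapping in an OpenAI model only changes the `llm` line. A hedged sketch, assuming `pip install langchain-openai` and an `OPENAI_API_KEY` environment variable (the model name is illustrative):

```python
# Alternative: use an OpenAI-hosted chat model instead of local llama3
from langchain_openai import ChatOpenAI  # requires: pip install langchain-openai

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)  # reads OPENAI_API_KEY from env
# The rest of the chain stays the same:
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
```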
Readers can substitute their own corpus to build a private knowledge-retrieval system. The complete code from this post:

```python
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.embeddings import OllamaEmbeddings
import weaviate
from weaviate.embedded import EmbeddedOptions
from langchain.prompts import ChatPromptTemplate
from langchain_community.chat_models import ChatOllama
from langchain.schema.runnable import RunnablePassthrough
from langchain.schema.output_parser import StrOutputParser
from langchain_community.vectorstores import Weaviate
import requests

# Download the data
url = "https://raw.githubusercontent.com/langchain-ai/langchain/master/docs/docs/modules/state_of_the_union.txt"
res = requests.get(url)
with open("state_of_the_union.txt", "w") as f:
    f.write(res.text)

# Load the data
loader = TextLoader('./state_of_the_union.txt')
documents = loader.load()

# Split the text into chunks
text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = text_splitter.split_documents(documents)

# Initialize the vector database and embed the chunks
client = weaviate.Client(
    embedded_options=EmbeddedOptions()
)
vectorstore = Weaviate.from_documents(
    client=client,
    documents=chunks,
    embedding=OllamaEmbeddings(model="llama3"),
    by_text=False
)

# Retriever
retriever = vectorstore.as_retriever()

# LLM prompt template
template = """You are an assistant for question-answering tasks.
Use the following pieces of retrieved context to answer the question.
If you don't know the answer, just say that you don't know.
Use three sentences maximum and keep the answer concise.
Question: {question}
Context: {context}
Answer:
"""
prompt = ChatPromptTemplate.from_template(template)

llm = ChatOllama(model="llama3", temperature=0)
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# Run the query & generate
query = "What did the president mainly say?"
print(rag_chain.invoke(query))
```