RAG Development Series
• What is RAG (Retrieval-Augmented Generation)?
• Getting started with RAG development in 6 lines of code
• Building a private RAG on ollama in 9 lines of code
This post shows how to save the vectors to a database with DuckDB and add a UI, turning the example into a RAG application you can actually use (though it is still a prototype).

There are many vector databases to choose from; their relative merits are not discussed here.

Main Text

Install the packages

    pip install duckdb llama-index-vector-stores-duckdb

Because LlamaIndex has already wrapped the integration for you, bringing in DuckDB only takes two additional lines of code.

Code

    from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings, StorageContext
    from llama_index.llms.ollama import Ollama
    from llama_index.embeddings.ollama import OllamaEmbedding
    from llama_index.vector_stores.duckdb import DuckDBVectorStore

    # Specify the LLM
    Settings.llm = Ollama(model="wizardlm2:7b-q5_K_M", request_timeout=60.0)
    # Specify the embedding model
    Settings.embed_model = OllamaEmbedding(model_name="snowflake-arctic-embed:latest")

    # The two added lines: create the DuckDB vector store and wrap it in a
    # storage context (assumed here to use the default, in-memory settings)
    vector_store = DuckDBVectorStore()
    storage_context = StorageContext.from_defaults(vector_store=vector_store)

    # The rest of the code is the same, except that the index receives the storage context
    documents = SimpleDirectoryReader("./data").load_data()
    index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
    chat_engine = index.as_chat_engine(chat_mode="condense_question", verbose=True)
    print(chat_engine.chat("DuckDB的VSS扩展主要功能,reply in Chinese"))

Adding a UI

There are many UI frameworks to choose from, such as streamlit, gradio, and nicegui; this post walks through a streamlit implementation.

    import os
    import streamlit as st
    from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
    from llama_index.vector_stores.duckdb import DuckDBVectorStore
    from llama_index.core import StorageContext
    from llama_index.llms.ollama import Ollama
    from llama_index.embeddings.ollama import OllamaEmbedding


    @st.cache_resource
    def init_model():
        """Configure the LLM and embedding model; return the embedding dimension."""
        Settings.llm = Ollama(model="wizardlm2:7b-q5_K_M", request_timeout=300.0)
        Settings.embed_model = OllamaEmbedding(model_name="snowflake-arctic-embed:latest")
        # Embed a sample string to find out the vector size the store must use
        embed_dim = len(Settings.embed_model.get_query_embedding("hello"))
        return embed_dim


    @st.cache_resource
    def init_index(rebuild=False):
        embed_dim = init_model()
        if rebuild:
            documents = SimpleDirectoryReader("./data").load_data()
            # Delete the old database file only if it exists, so the first run does not fail
            db_path = os.path.join("duckdb", "rag.db")
            if os.path.exists(db_path):
                os.remove(db_path)
            vector_store = DuckDBVectorStore(embed_dim=embed_dim, database_name="rag.db", persist_dir="duckdb")
            storage_context = StorageContext.from_defaults(vector_store=vector_store)
            index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
        else:
            # Reuse the vectors already persisted in duckdb/rag.db
            vector_store = DuckDBVectorStore(embed_dim=embed_dim, database_name="rag.db", persist_dir="duckdb")
            index = VectorStoreIndex.from_vector_store(vector_store=vector_store)
        return index


    @st.cache_resource
    def init_engine():
        index = init_index(rebuild=True)
        chat_engine = index.as_chat_engine(chat_mode="condense_question", verbose=True)
        return chat_engine
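To make it clearer what the vector store contributes, and why the code above passes embed_dim to it, here is a minimal pure-Python sketch of embedding-based retrieval. It is a conceptual stand-in under stated assumptions, not how DuckDBVectorStore is implemented: the class TinyVectorStore, its add and query methods, and the toy 3-dimensional vectors are all hypothetical, and a real store keeps the rows in a database instead of a Python list.

```python
import math
from dataclasses import dataclass, field


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class TinyVectorStore:
    """Hypothetical in-memory store; a real one persists its rows in a database."""
    embed_dim: int
    rows: list = field(default_factory=list)

    def add(self, text, vector):
        # Every stored vector must have the dimension of the embedding model
        if len(vector) != self.embed_dim:
            raise ValueError(f"expected {self.embed_dim} dims, got {len(vector)}")
        self.rows.append((text, list(vector)))

    def query(self, vector, top_k=2):
        # Score every row against the query vector and keep the best top_k texts
        scored = [(cosine_similarity(vector, v), t) for t, v in self.rows]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [t for _, t in scored[:top_k]]


store = TinyVectorStore(embed_dim=3)
store.add("DuckDB VSS extension", [0.9, 0.1, 0.0])
store.add("Streamlit chat UI", [0.0, 0.2, 0.9])
store.add("HNSW index", [0.8, 0.3, 0.1])
results = store.query([1.0, 0.0, 0.0], top_k=2)
print(results)
```

The retrieved texts are what a RAG pipeline would hand to the LLM as context. Scanning every row is fine for a toy example; real vector stores typically add an index so that queries do not need a full scan.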
The Streamlit entry script imports init_engine from the code above, which therefore needs to be saved as rag.py:

    import streamlit as st
    from rag import init_engine


    def main():
        if "messages" not in st.session_state.keys():  # Initialize the chat messages history
            st.session_state.messages = [
                {"role": "assistant", "content": "I am ragbot!"}
            ]
        # print(chat_engine.chat("DuckDB的VSS扩展主要功能,reply in Chinese"))

        if "chat_engine" not in st.session_state.keys():  # Initialize the chat engine
            st.session_state.chat_engine = init_engine()

        # Prompt for user input and save to chat history
        if prompt := st.chat_input("Your question"):
            st.session_state.messages.append({"role": "user", "content": prompt})

        for message in st.session_state.messages:  # Display the prior chat messages
            with st.chat_message(message["role"]):
                st.write(message["content"])

        # If last message is not from assistant, generate a new response
        if st.session_state.messages[-1]["role"] != "assistant":
            with st.chat_message("assistant"):
                with st.spinner("Thinking..."):
                    response = st.session_state.chat_engine.chat(prompt)
                    st.write(response.response)
                    message = {"role": "assistant", "content": response.response}
                    # Add response to message history
                    st.session_state.messages.append(message)


    if __name__ == "__main__":
        main()
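Streamlit re-executes the whole script on every interaction, which is why the script keeps the history and the engine in st.session_state and only generates a reply when the newest message is not from the assistant. The control flow can be sketched without Streamlit as below. This is a simulation under stated assumptions: the function rerun, the plain dict standing in for st.session_state, and the EchoEngine class are hypothetical and exist only for illustration.

```python
from types import SimpleNamespace


class EchoEngine:
    """Hypothetical stand-in for the chat engine; it only echoes the prompt."""

    def chat(self, prompt):
        return SimpleNamespace(response=f"echo: {prompt}")


def rerun(session_state, user_input=None):
    """Simulate one top-to-bottom execution of the UI script."""
    # First run only: seed the history and create the engine once
    if "messages" not in session_state:
        session_state["messages"] = [{"role": "assistant", "content": "I am ragbot!"}]
    if "chat_engine" not in session_state:
        session_state["chat_engine"] = EchoEngine()

    # A submitted prompt is appended before the history would be rendered
    if user_input:
        session_state["messages"].append({"role": "user", "content": user_input})

    # Answer only when the newest message is still waiting for a reply
    if session_state["messages"][-1]["role"] != "assistant":
        reply = session_state["chat_engine"].chat(user_input).response
        session_state["messages"].append({"role": "assistant", "content": reply})
    return session_state["messages"]


state = {}             # survives across reruns, like st.session_state
rerun(state)           # page load: greeting only
rerun(state, "hello")  # the user submits a question
rerun(state)           # unrelated rerun: nothing new is generated
print([m["content"] for m in state["messages"]])
```

Because the state outlives each run, the third call adds nothing: the last message already comes from the assistant, so no second answer is generated.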
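A screenshot of the result is attached.

Install the required packages with

    pip install llama-index-embeddings-ollama llama-index-llms-ollama llama-index-readers-file llama-index-vector-stores-duckdb duckdb streamlit

or use requirements.txt. The code is on GitHub[1].

Conclusion

So far this is only a prototype; there is still a lot of work to do.

References

[1] GitHub: https://github.com/alitrack/rag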