链载Ai

Title: RAG Development Series

Author: 链载Ai    Time: 2025-12-2 09:51

RAG Development Series

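Today's post shows how to use DuckDB to persist vectors in a database and how to add a UI, turning the project into a RAG application you can actually use (still only a prototype, of course).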

There are many vector databases to choose from; their relative merits are not discussed here.

Main Text

Installing the packages

pip install duckdb llama-index-vector-stores-duckdb

Because LlamaIndex already wraps DuckDB for you, bringing it in takes only two additional lines of code.

Code

from llama_index.core import Settings, SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.llms.ollama import Ollama
from llama_index.vector_stores.duckdb import DuckDBVectorStore

# Specify the LLM
Settings.llm = Ollama(model="wizardlm2:7b-q5_K_M", request_timeout=60.0)
# Specify the embedding model
Settings.embed_model = OllamaEmbedding(model_name="snowflake-arctic-embed:latest")

documents = SimpleDirectoryReader("./data").load_data()
# The two DuckDB lines: create the vector store and wrap it in a storage context
# (same wiring as in the Streamlit version further down)
vector_store = DuckDBVectorStore(database_name="rag.db", persist_dir="duckdb")
storage_context = StorageContext.from_defaults(vector_store=vector_store)
# The rest of the code is the same as before, apart from passing storage_context
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
chat_engine = index.as_chat_engine(chat_mode="condense_question", verbose=True)
print(chat_engine.chat("DuckDB的VSS扩展主要功能,reply in Chinese"))
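For readers new to vector stores, here is a purely conceptual sketch of what a store such as DuckDB is asked to do: keep (text, vector) rows and return the rows whose vectors are closest to a query vector. It uses plain Python and made-up three-dimensional vectors. The names cosine_similarity and top_k and the sample rows are assumptions for illustration only, and this is not how DuckDBVectorStore is implemented.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, rows, k=2):
    """Return the texts of the k rows most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Hypothetical "table" of stored chunks with toy embeddings.
rows = [
    ("DuckDB VSS extension", [0.9, 0.1, 0.0]),
    ("Streamlit chat UI", [0.0, 0.2, 0.9]),
    ("HNSW index", [0.8, 0.3, 0.1]),
]

print(top_k([1.0, 0.0, 0.0], rows, k=2))
# ['DuckDB VSS extension', 'HNSW index']
```

In the real application the vectors come from the embedding model and the similarity search runs inside the database rather than in a Python loop.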

Adding a UI

There are many UI frameworks to choose from, such as streamlit, gradio, and nicegui; today's post shows a streamlit implementation.

# rag.py
import shutil

import streamlit as st
from llama_index.core import Settings, SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.llms.ollama import Ollama
from llama_index.vector_stores.duckdb import DuckDBVectorStore


@st.cache_resource
def init_model():
    """Configure the LLM and the embedding model; return the embedding dimension."""
    Settings.llm = Ollama(model="wizardlm2:7b-q5_K_M", request_timeout=300.0)
    Settings.embed_model = OllamaEmbedding(model_name="snowflake-arctic-embed:latest")
    # Embed a short probe string to find out the vector dimension
    embed_dim = len(Settings.embed_model.get_query_embedding("hello"))
    return embed_dim


@st.cache_resource
def init_index(rebuild=False):
    embed_dim = init_model()
    if rebuild:
        documents = SimpleDirectoryReader("./data").load_data()
        # Remove any previous database; unlike os.remove/os.removedirs this
        # does not fail on the first run, when the directory does not exist yet
        shutil.rmtree("duckdb", ignore_errors=True)

        vector_store = DuckDBVectorStore(embed_dim=embed_dim, database_name="rag.db", persist_dir="duckdb")
        storage_context = StorageContext.from_defaults(vector_store=vector_store)
        index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
    else:
        # Reuse the vectors already persisted in duckdb/rag.db
        vector_store = DuckDBVectorStore(embed_dim=embed_dim, database_name="rag.db", persist_dir="duckdb")
        index = VectorStoreIndex.from_vector_store(vector_store=vector_store)
    return index


@st.cache_resource
def init_engine():
    index = init_index(rebuild=True)
    chat_engine = index.as_chat_engine(chat_mode="condense_question", verbose=True)
    return chat_engine
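Note that init_engine() passes rebuild=True, so the documents are embedded again whenever the engine is first created. One possible refinement, which is not part of the original code, is to rebuild only when no database file exists yet. The helper name needs_rebuild is hypothetical; its defaults simply mirror the "duckdb" directory and "rag.db" file name used above.

```python
import tempfile
from pathlib import Path


def needs_rebuild(persist_dir: str = "duckdb", database_name: str = "rag.db") -> bool:
    """Return True when no persisted database file exists yet."""
    return not (Path(persist_dir) / database_name).is_file()


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    before = needs_rebuild(tmp)       # no rag.db yet
    (Path(tmp) / "rag.db").touch()    # pretend the index was persisted
    after = needs_rebuild(tmp)        # file is now present

print(before, after)
# True False
```

With such a helper, the call could become init_index(rebuild=needs_rebuild()). A file check says nothing about whether the files under ./data have changed since the last build, so a manual rebuild option would still be useful.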
# Streamlit entry point; imports init_engine from rag.py above
import streamlit as st
from rag import init_engine


def main():
    if "messages" not in st.session_state:  # Initialize the chat message history
        st.session_state.messages = [
            {"role": "assistant", "content": "I am ragbot!"}
        ]

    if "chat_engine" not in st.session_state:  # Initialize the chat engine
        st.session_state.chat_engine = init_engine()

    # Prompt for user input and save it to the chat history
    if prompt := st.chat_input("Your question"):
        st.session_state.messages.append({"role": "user", "content": prompt})

    for message in st.session_state.messages:  # Display the prior chat messages
        with st.chat_message(message["role"]):
            st.write(message["content"])

    # If the last message is not from the assistant, generate a new response
    if st.session_state.messages[-1]["role"] != "assistant":
        with st.chat_message("assistant"):
            with st.spinner("Thinking..."):
                response = st.session_state.chat_engine.chat(prompt)
                st.write(response.response)
                message = {"role": "assistant", "content": response.response}
                # Add the response to the message history
                st.session_state.messages.append(message)


if __name__ == "__main__":
    main()
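The control flow of main() is easier to follow once you remember that Streamlit reruns the whole script on every interaction, while st.session_state survives between reruns. The sketch below imitates that with a plain dict and no Streamlit at all. EchoEngine and run_once are hypothetical stand-ins for illustration; the real chat engine returns an object with a .response attribute rather than a string.

```python
from typing import Optional


class EchoEngine:
    """Hypothetical stand-in for the chat engine; it only echoes the prompt."""

    def chat(self, prompt: str) -> str:
        return f"echo: {prompt}"


def run_once(state: dict, prompt: Optional[str]) -> None:
    """Simulate one rerun of main() against a state dict that persists."""
    if "messages" not in state:
        state["messages"] = [{"role": "assistant", "content": "I am ragbot!"}]
    if "chat_engine" not in state:
        state["chat_engine"] = EchoEngine()
    if prompt:
        state["messages"].append({"role": "user", "content": prompt})
    # Answer only when the newest message is still waiting for a reply.
    if state["messages"][-1]["role"] != "assistant":
        reply = state["chat_engine"].chat(prompt)
        state["messages"].append({"role": "assistant", "content": reply})


state = {}
run_once(state, None)                 # first page load: greeting only
run_once(state, "What does VSS do?")  # user submits a question
run_once(state, None)                 # unrelated rerun: nothing new is added
print([m["role"] for m in state["messages"]])
# ['assistant', 'user', 'assistant']
```

The check on the last message's role is what keeps a rerun without new input from producing a duplicate answer.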

A screenshot of the result is attached.

Required packages

pip install llama-index-embeddings-ollama llama-index-llms-ollama llama-index-readers-file llama-index-vector-stores-duckdb duckdb streamlit

Alternatively, use the requirements.txt file; the code is on GitHub[1].

Conclusion

So far this is only a prototype, and a lot of work remains, such as:

References

[1] GitHub: https://github.com/alitrack/rag






