## Preface

I have been digging into RAG lately. Although RAG is criticized for many shortcomings, it is likely to remain the key technique for extending an Agent's knowledge base over the next three years, and its problems will be patched or resolved one by one, much as the current best answer to the problems of MAS is building personalized context engineering.

This post documents the process of building an Agentic RAG system that combines LangChain (document processing), LangGraph (agent flow control), Google Gemini (the LLM), and the Qdrant vector database. The resulting question-answering system can automatically decide whether to rewrite the query, retrieve more content, or answer directly.

MAS stands for Multi-Agent System. RAG is a technique that gives an LLM access to data outside its training corpus, or the ability to retrieve data in real time, with the goal of making the LLM smarter. I recommend getting familiar with the 5W1H of these two terms before reading on; earlier posts on this public account cover both MAS and RAG.

## Code Architecture Overview

I recommend reading this post alongside the code. GitHub repo: https://github.com/KatnissStoa/RAG_blog_chat.git
Core features of the repository:

- Document loading: WebBaseLoader fetches the HTML content from a blog URL
- Text chunking: RecursiveCharacterTextSplitter splits long articles into semantic chunks, recursing by paragraph / sentence / word
- Embedding: Google Gemini's embedding-001 model turns each text chunk into a vector
- Vector storage: Qdrant stores the embeddings and supports fast, efficient semantic similarity search
- Agentic query handling: an agent built with LangGraph dynamically decides whether to rewrite the query or keep retrieving
- Relevance grading: Gemini automatically scores retrieved results and filters out low-quality content
- User interface: an interactive web UI built with Streamlit that accepts a blog link and questions

## Code Walkthrough

The UI is built with Streamlit. It asks the user for three things — the Qdrant host URL, the Qdrant API key, and the Google Gemini API key — since these resources must be in place before the RAG features can be used.

```python
def set_sidebar():
    """Setup sidebar for API keys and configuration."""
    with st.sidebar:
        st.subheader("API Configuration")
        qdrant_host = st.text_input("Enter your Qdrant Host URL:", type="password")
        qdrant_api_key = st.text_input("Enter your Qdrant API key:", type="password")
        gemini_api_key = st.text_input("Enter your Gemini API key:", type="password")

        if st.button("Done"):
            if qdrant_host and qdrant_api_key and gemini_api_key:
                st.session_state.qdrant_host = qdrant_host
                st.session_state.qdrant_api_key = qdrant_api_key
                st.session_state.gemini_api_key = gemini_api_key
                st.success("API keys saved!")
            else:
                st.warning("Please fill all API fields")
```
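In the repo, the loading → chunking → embedding steps from the feature list above are handled by LangChain components. As a minimal, dependency-free sketch of the core idea behind RecursiveCharacterTextSplitter — try the coarsest separator first, recurse to finer ones only for pieces that are still too long, then greedily merge small pieces back up toward the chunk size — the splitting step could look like this (the function here is illustrative, not the library's API):

```python
def recursive_split(text, separators=("\n\n", "\n", " "), chunk_size=100):
    """Split `text` with the coarsest separator first; recurse with finer
    separators only for pieces still longer than `chunk_size`, then greedily
    merge adjacent small pieces back up toward `chunk_size`."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # No separators left: hard character cut as a last resort.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    pieces = []
    for part in text.split(sep):
        if len(part) <= chunk_size:
            pieces.append(part)
        else:
            pieces.extend(recursive_split(part, finer, chunk_size))

    # Greedily merge neighbours so chunks approach, but never exceed, chunk_size.
    chunks, current = [], ""
    for part in pieces:
        if not part:
            continue
        candidate = (current + sep + part) if current else part
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            chunks.append(current)
            current = part
    if current:
        chunks.append(current)
    return chunks
```

The real splitter also supports overlap between chunks and length functions other than `len`, but the recurse-then-merge shape is the same.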
`grade_documents` asks Gemini to judge the relevance between the user's question and the retrieved documents. The LLM may answer only yes or no. On yes, control passes to `generate`, which hands the question and the documents to the LLM to write a concise, accurate answer for the user. On no, the flow goes to `rewrite`: before retrieving again, the LLM rewrites the original question into a more precise, more searchable form.

```python
def grade_documents(state):
    """Grade retrieved documents for relevance to the user question."""
    # LLM
    model = ChatGoogleGenerativeAI(
        api_key=st.session_state.gemini_api_key,
        temperature=0,
        model="gemini-2.0-flash",
        streaming=True,
    )

    # LLM with tool and validation
    llm_with_tool = model.with_structured_output(grade)

    # Prompt
    prompt = PromptTemplate(
        template="""You are a grader assessing relevance of a retrieved document to a user question. \n
        Here is the retrieved document: \n\n {context} \n\n
        Here is the user question: {question} \n
        If the document contains keyword(s) or semantic meaning related to the user question, grade it as relevant. \n
        Give a binary score 'yes' or 'no' score to indicate whether the document is relevant to the question.""",
        input_variables=["context", "question"],
    )

    # Chain
    chain = prompt | llm_with_tool

    messages = state["messages"]
    last_message = messages[-1]

    question = messages[0].content
    docs = last_message.content

    scored_result = chain.invoke({"question": question, "context": docs})

    score = scored_result.binary_score

    if score == "yes":
        print("---DECISION: DOCS RELEVANT---")
        return "generate"
    else:
        print("---DECISION: DOCS NOT RELEVANT---")
        print(score)
        return "rewrite"
```
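The `grade` class passed to `with_structured_output` is not shown in the excerpt above. In LangChain, such structured-output schemas are typically declared as Pydantic models, so it presumably looks roughly like this (a sketch of the likely shape, not the repo's exact code):

```python
from pydantic import BaseModel, Field


class grade(BaseModel):
    """Binary relevance score for a retrieved document."""

    binary_score: str = Field(description="Relevance score: 'yes' or 'no'")
```

With a schema like this, `chain.invoke(...)` returns a `grade` instance, which is why the caller can read `scored_result.binary_score` directly instead of parsing free-form text.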
The `main` function runs the whole flow described above. It first checks the Qdrant host and key and the Gemini API key, then fetches the full blog content, splits it into chunks tagged with UUIDs, converts them to vectors with `embedding_model`, connects to Qdrant, and stores the chunks — successful steps are shown in green, failures in red. Once the user submits a query, the LLM checks whether the database holds similar content. If it does, the question and the retrieved content are combined into a concise answer displayed on the page; if not, the question is rewritten and retrieval runs again. The question is rewritten at most once — if nothing relevant is found, the flow ends rather than looping forever, saving tokens and time.

```python
def main():
    set_sidebar()

    # Check if API keys are set
    if not all([st.session_state.qdrant_host, st.session_state.qdrant_api_key, st.session_state.gemini_api_key]):
        st.warning("Please configure your API keys in the sidebar first")
        return

    # Initialize components
    embedding_model, client, db = initialize_components()
    if not all([embedding_model, client, db]):
        return

    # Initialize retriever and tools
    retriever = db.as_retriever(search_type="similarity", search_kwargs={"k": 5})
    retriever_tool = create_retriever_tool(
        retriever,
        "retrieve_blog_posts",
        "Search and return information about blog posts on LLMs, LLM agents, prompt engineering, and adversarial attacks on LLMs.",
    )
    tools = [retriever_tool]

    # URL input section
    url = st.text_input(
        ":link: Paste the blog link:",
        placeholder="e.g., https://lilianweng.github.io/posts/2023-06-23-agent/"
    )
    if st.button("Enter URL"):
        if url:
            with st.spinner("Processing documents..."):
                if add_documents_to_qdrant(url, db):
                    st.success("Documents added successfully!")
                else:
                    st.error("Failed to add documents")
        else:
            st.warning("Please enter a URL")

    # Query section
    graph = get_graph(retriever_tool)
    query = st.text_area(
        ":bulb: Enter your query about the blog post:",
        placeholder="e.g., What does Lilian Weng say about the types of agent memory?"
    )

    if st.button("Submit Query"):
        if not query:
            st.warning("Please enter a query")
            return

        inputs = {"messages": [HumanMessage(content=query)]}
        with st.spinner("Generating response..."):
            try:
                response = generate_message(graph, inputs)
                st.write(response)
            except Exception as e:
                st.error(f"Error generating response: {str(e)}")

    st.markdown("---")
    st.write("Built with :blue-background[LangChain] | :blue-background[LangGraph] by [Charan](https://www.linkedin.com/in/codewithcharan/)")


if __name__ == "__main__":
    main()
```
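In control-flow terms, the "rewrite at most once, then stop" policy described above is a bounded retry loop. Stripped of LangGraph and the LLM, the policy can be sketched like this — `retrieve`, `grade`, `rewrite`, and `generate` here are stand-in callables, not the repo's implementations:

```python
def answer_with_bounded_rewrite(question, retrieve, grade, rewrite, generate,
                                max_rewrites=1):
    """Retrieve then grade; on a 'no' grade, rewrite the question and retry,
    but at most `max_rewrites` times, so the loop always terminates."""
    rewrites = 0
    while True:
        docs = retrieve(question)
        if grade(question, docs) == "yes":
            return generate(question, docs)
        if rewrites >= max_rewrites:
            return None  # give up: nothing relevant found
        question = rewrite(question)
        rewrites += 1
```

With `max_rewrites=1`, the LLM is called for at most two retrieval rounds per query, which is exactly what caps token and time spend when the vector store simply has no relevant content.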