|
Retrieval-Augmented Generation (RAG) is one of the ways to put large language models into production: it works around an LLM's limitations by supplying it with up-to-date data as context. A typical use case is a chatbot built on a company's own documents and knowledge base, which lets employees search internal documentation quickly. RAG also suits domain-specific GenAI applications in areas such as healthcare, finance, and legal services. Naive RAG handles simple questions well but struggles with complex tasks. This article looks at the limitations of Naive RAG and shows how an agentic approach can make a RAG system more capable and more practical.

01. Limitations of Naive RAG

Naive RAG works well for simple questions such as:
- "What are Tesla's main risk factors?" (based on Tesla's 2021 10-K report)
- "What features does Milvus 2.4 have?" (based on the Milvus 2.4 release notes)

Its limitations show up with more complex questions:
- Summarization questions: for example, "Give me a summary of the company's 10-K annual report." Naive RAG struggles to produce a comprehensive summary without losing important information.
- Comparison questions: for example, "What are the differences between Milvus 2.4 and Milvus 2.3?" Naive RAG struggles to compare multiple documents effectively.
- Structured analysis and semantic search: for example, "Tell me the risk factors of the best-performing ride-hailing company in the US." Naive RAG does poorly when a question combines semantic search with structured analysis.
- General multi-part questions: for example, "Tell me the arguments for X in article A, then the arguments for Y in article B, make a table following our internal style guide, and then draw your own conclusion from these facts." Naive RAG struggles with multi-step, multi-part tasks.

02. Why Naive RAG has these pain points

- Single pass: Naive RAG usually processes a query in one shot and lacks multi-step reasoning.
- No query understanding or planning: Naive RAG cannot analyze how complex a query is, and it cannot plan tasks.
- No tool use: Naive RAG cannot call external tools or APIs to help complete a task.
- No reflection or error correction: Naive RAG cannot improve itself based on feedback.
- No memory (stateless): Naive RAG does not remember the conversation history, so it cannot keep context consistent across a multi-turn conversation.
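To make the single-pass, stateless behavior concrete, here is a minimal sketch of a Naive RAG pipeline in plain Python. The word-overlap retriever, the `generate` stub, and the sample chunks are all illustrative assumptions that stand in for a vector store and an LLM call; none of them come from the article's own code.

```python
import re
from typing import List

# Illustrative chunks only, not real release-note or 10-K text.
CHUNKS = [
    "Milvus 2.4 adds hybrid search and sparse vectors.",
    "Milvus 2.3 adds GPU support and an upsert API.",
    "Tesla lists supply chain disruption as a risk factor.",
]


def tokenize(text: str) -> set:
    """Lowercase the text and return its tokens (keeps versions like '2.4' whole)."""
    return set(re.findall(r"[a-z0-9]+(?:\.[0-9]+)?", text.lower()))


def retrieve(query: str, chunks: List[str], top_k: int = 1) -> List[str]:
    """Rank chunks by word overlap with the query (a stand-in for vector search)."""
    q = tokenize(query)
    ranked = sorted(chunks, key=lambda c: len(q & tokenize(c)), reverse=True)
    return ranked[:top_k]


def generate(query: str, context: List[str]) -> str:
    """Placeholder for the LLM call: returns the prompt that would be sent."""
    return "Context:\n" + "\n".join(context) + "\nQuestion: " + query


def naive_rag(query: str) -> str:
    # One retrieval, one generation: no planning, no tools, no memory.
    return generate(query, retrieve(query, CHUNKS))


simple = naive_rag("What features does Milvus 2.4 have?")
compare = naive_rag("What is the difference between Milvus 2.4 and Milvus 2.3?")
```

With `top_k=1`, the simple question gets the right chunk, while the comparison question receives only one of the two chunks it needs. A larger `top_k` helps in this toy case, but the pipeline still has no way to decide that the question calls for two separate lookups.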
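03. From RAG to Agentic RAG

To overcome the limitations of Naive RAG, we can introduce an agentic approach that makes the RAG system smarter and more flexible.
- Routing: the simplest form of agentic reasoning. Given a user query and a set of choices, the system outputs a subset of those choices and routes the query to the appropriate processing module.
- Tool use: call external tools or APIs to help complete a task. For example, call a weather API to get the latest weather information.
- Query planning: break a query into sub-queries that can be processed in parallel. Each sub-query can be run against any set of RAG pipelines, which improves both efficiency and accuracy.
- Reflection: use feedback to improve the agent's execution and reduce errors. The feedback can come from the LLM itself.
- Memory: in addition to the current query, feed the conversation history into the RAG pipeline so that context stays consistent across a multi-turn conversation.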
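As a rough illustration of routing and query planning, the sketch below uses keyword rules in plain Python. In an agentic system the LLM would make these decisions; the tool functions, the `route` and `decompose` helpers, and the keyword list are hypothetical stand-ins chosen for this example.

```python
import re
from typing import Callable, Dict, List


def summary_tool(query: str) -> str:
    """Hypothetical summarization pipeline."""
    return f"[summary] {query}"


def vector_tool(query: str) -> str:
    """Hypothetical vector-search pipeline."""
    return f"[vector] {query}"


TOOLS: Dict[str, Callable[[str], str]] = {
    "summary": summary_tool,
    "vector": vector_tool,
}

SUMMARY_HINTS = ("summary", "summarize", "overview", "what's new")


def route(query: str) -> str:
    """Pick a tool name for the query (keyword rules instead of an LLM)."""
    lowered = query.lower()
    return "summary" if any(h in lowered for h in SUMMARY_HINTS) else "vector"


def decompose(query: str) -> List[str]:
    """Split a comparison 'between A and B' into one sub-query per subject."""
    match = re.search(r"between (.+?) and (.+?)\??$", query, re.IGNORECASE)
    if not match:
        return [query]
    return [f"What's new in {subject.strip()}?" for subject in match.groups()]


def answer(query: str) -> List[str]:
    """Decompose the query, route each sub-query, and collect the results."""
    return [TOOLS[route(sub)](sub) for sub in decompose(query)]


results = answer("What is the difference between Milvus 2.3 and Milvus 2.4?")
```

The sub-queries here are independent of each other, so a real system could run them in parallel before merging the results into one answer.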
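04. Hands-on

We build an Agentic RAG example on top of Milvus and LlamaIndex. First, we split the Milvus 2.3 and 2.4 release notes with LlamaIndex's SentenceWindowNodeParser and load the resulting nodes into Milvus.

    # Import paths assume llama-index >= 0.10 with llama-index-vector-stores-milvus installed.
    from llama_index.core import StorageContext, VectorStoreIndex
    from llama_index.core.node_parser import SentenceWindowNodeParser
    from llama_index.vector_stores.milvus import MilvusVectorStore

    # Each sentence becomes a node; its metadata keeps a window of
    # 3 sentences on either side.
    node_parser = SentenceWindowNodeParser.from_defaults(
        window_size=3,
        window_metadata_key="window",
        original_text_metadata_key="original_text",
    )

    # Extract nodes from documents (`documents` holds the release notes loaded beforehand)
    nodes = node_parser.get_nodes_from_documents(documents)

    vector_store = MilvusVectorStore(
        dim=1536,
        uri="http://localhost:19530",
        collection_name="agentic_rag",
        overwrite=True,
        enable_sparse=False,  # likely needs to be True for the "hybrid" query mode used later
        hybrid_ranker="RRFRanker",
        hybrid_ranker_params={"k": 60},
    )

    storage_context = StorageContext.from_defaults(vector_store=vector_store)

    index = VectorStoreIndex(nodes, storage_context=storage_context)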
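To show what the sentence-window step produces, here is a plain-Python sketch of the idea: each sentence becomes a node, and its metadata keeps the neighboring sentences so that the wider context can be handed to the LLM after retrieval. This illustrates the concept and is not LlamaIndex's implementation; the regex sentence splitter and the `sentence_window_nodes` helper are assumptions made for the example.

```python
import re
from typing import Dict, List


def sentence_window_nodes(text: str, window_size: int = 3) -> List[Dict[str, str]]:
    """Return one node per sentence, each with a window of surrounding sentences."""
    # Naive splitter: break after ., ! or ? when followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    nodes = []
    for i, sentence in enumerate(sentences):
        start = max(0, i - window_size)
        end = min(len(sentences), i + window_size + 1)
        nodes.append({
            "text": sentence,                            # what gets embedded
            "window": " ".join(sentences[start:end]),    # context given to the LLM
            "original_text": sentence,
        })
    return nodes


sample = "A one. B two. C three. D four. E five."
nodes = sentence_window_nodes(sample, window_size=1)
```

Embedding single sentences keeps retrieval precise, while the stored window restores enough surrounding text for the answer to be grounded in context.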
Next, we define two agent tools: a vector query tool and a summary tool. The vector query tool uses Milvus hybrid search. The summary tool uses LlamaIndex's SummaryIndex to summarize the document chunks.

    from typing import List, Optional

    from llama_index.core import SummaryIndex
    from llama_index.core.postprocessor import MetadataReplacementPostProcessor
    from llama_index.core.tools import FunctionTool, QueryEngineTool

    # Assumed from context: `vector_index` is the VectorStoreIndex built in the
    # previous step, `name` is a short document id such as "milvus_2_3", and
    # BGERerankFunction is a reranking postprocessor imported or defined elsewhere.

    def vector_query(
        query: str,
        page_numbers: Optional[List[int]] = None
    ) -> str:
        # The target key defaults to `window` to match the node_parser's default
        postproc = MetadataReplacementPostProcessor(
            target_metadata_key="window"
        )
        # BAAI/bge-reranker-base is a cross-encoder model
        # link: https://huggingface.co/BAAI/bge-reranker-base
        rerank = BGERerankFunction(
            top_n=3,
            model_name="BAAI/bge-reranker-base",
            device="cpu"
        )

        # The QueryEngine class is equipped with the generator and facilitates the retrieval and generation steps
        query_engine = vector_index.as_query_engine(
            similarity_top_k=3,
            vector_store_query_mode="hybrid",  # Supported since Milvus 2.4; use "default" for earlier versions
            node_postprocessors=[postproc, rerank],
        )

        response = query_engine.query(query)
        return response

    vector_query_tool = FunctionTool.from_defaults(
        name=f"vector_tool_{name}",
        fn=vector_query
    )

    summary_index = SummaryIndex(nodes)
    summary_query_engine = summary_index.as_query_engine(
        response_mode="tree_summarize",
        use_async=True,
    )
    summary_tool = QueryEngineTool.from_defaults(
        name=f"summary_tool_{name}",
        query_engine=summary_query_engine,
        description=(
            f"Useful for summarization questions related to {name}"
        ),
    )
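The `RRFRanker` setting with `k=60` in the vector store configuration refers to reciprocal rank fusion, which merges several ranked result lists by summing `1 / (k + rank)` for each document. Below is a minimal sketch of that arithmetic in plain Python, assuming ranks start at 1 and using made-up document ids. Milvus performs the fusion on the server side, so this is only meant to show how the scores combine.

```python
from collections import defaultdict
from typing import Dict, List


def rrf_fuse(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists of document ids with reciprocal rank fusion."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda d: scores[d], reverse=True)


# Hypothetical results from a dense search and a sparse search.
dense = ["doc_a", "doc_b", "doc_c"]
sparse = ["doc_b", "doc_d", "doc_a"]
fused = rrf_fuse([dense, sparse])
```

Documents that rank well in both lists rise to the top, and a larger `k` flattens the gap between neighboring ranks.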
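Finally, we use LlamaIndex's ReActAgentWorker to combine the vector query tool and the summary tool. Together with ReAct's reflection, routing, and tool-use capabilities, this gives us Agentic RAG.

    # `agent` is assumed to be an agent runner wrapping a ReActAgentWorker built from the tools above
    response = agent.query(
        "Tell me about the difference between milvus 2.3 and milvus 2.4?"
    )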
Below is the agent's reasoning trace for this question, showing the concrete ReAct steps of thought, action, and observation.

    Thought: The user is asking about the differences between Milvus 2.3 and Milvus 2.4. I need to gather information about both versions to provide a comprehensive comparison.
    Action: vector_tool_milvus_2_3
    Action Input: {'query': "What's new in Milvus 2.3?"}
    Thought: It appears there is a persistent issue with the tool input. I will try using a different tool to gather the necessary information.
    Action: summary_tool_milvus_2_3
    Action Input: {'input': "What's new in Milvus 2.3?"}
    Observation: Milvus 2.3 introduces several new features and enhancements, including GPU support, Arm64 support, a refactored QueryNode, and a merged IndexCoord and DataCoord for simplified deployment. It also includes a new NATS-based message queue, upsert API, range search, count statement, cosine metrics, raw vectors in search returns, ScaNN index support, iterator functionality, JSON_CONTAINS expressions, and CDC support. Additionally, there are performance improvements such as MMap for capacity increase, optimized data filtering, growing index, and increased resource usage in multi-core environments. The release also features a new load balancer, dynamic configuration support, tracing support, reorganized error codes, and upgraded tools like Birdwatcher and Attu.
    Thought: I have gathered information about the new features and enhancements in Milvus 2.3. Now, I need to gather similar information for Milvus 2.4 to compare the two versions.
    Action: summary_tool_milvus_2_4
    Action Input: {'input': "What's new in Milvus 2.4?"}
    Observation: Milvus 2.4 introduces several new features and improvements. Key highlights include:
    ...
    These updates aim to enhance system performance, improve observability, and streamline the codebase for increased simplicity.
    Thought: I now have detailed information about the new features and enhancements in both Milvus 2.3 and Milvus 2.4. I can compare the two versions based on these details.
The final answer not only lists the features of Milvus 2.3 and Milvus 2.4 in detail but also summarizes and compares them.

    Answer: Milvus 2.3 and Milvus 2.4 both introduce significant new features and improvements, but they focus on different aspects of the system. Here are the key differences:

    ### Milvus 2.3:
    1. **GPU and Arm64 Support**: Introduces GPU support and Arm64 support.
    2. **Refactored QueryNode**: Simplifies deployment by merging IndexCoord and DataCoord.
    3. **New Message Queue**: Implements a NATS-based message queue.
    4. **New APIs and Features**: Includes upsert API, range search, count statement, cosine metrics, raw vectors in search returns, ScaNN index support, iterator functionality, JSON_CONTAINS expressions, and CDC support.
    5. **Performance Improvements**: Features MMap for capacity increase, optimized data filtering, growing index, and increased resource usage in multi-core environments.
    6. **Load Balancer and Dynamic Configuration**: Introduces a new load balancer and dynamic configuration support.
    7. **Tool Upgrades**: Upgrades tools like Birdwatcher and Attu.

    ### Milvus 2.4:
    1. **New GPU Index - CAGRA**: Introduces a state-of-the-art GPU-based graph index for significant performance boosts.
    2. **Multi-vector and Hybrid Search**: Supports storing vector embeddings from multiple models and conducting hybrid searches.
    3. **Sparse Vectors**: Supports efficient approximate semantic nearest neighbor search for keyword interpretation and analysis.
    4. **Grouping Search**: Allows aggregation of search results by specific scalar fields.
    5. **Inverted Index and Fuzzy Match**: Improves scalar query performance and supports fuzzy matches.
    6. **New Data Types**: Supports Float16 and BFloat16 vector data types.
    7. **L0 Segment**: Enhances performance of delete and upsert operations by recording deleted data.
    8. **Refactored BulkInsert**: Improves logic for importing multiple files in a single request.
    9. **Memory-mapped Storage**: Optimizes memory usage by mapping file content into memory.
    10. **Milvus-CDC**: Improves stability for capturing and synchronizing incremental data.

    ### Summary:
    - **Milvus 2.3** focuses on broadening hardware support, enhancing deployment simplicity, and introducing various new APIs and performance improvements.
    - **Milvus 2.4** emphasizes advanced search capabilities, new data types, improved memory management, and further performance optimizations.

    Both versions aim to enhance system performance and usability, but Milvus 2.4 introduces more advanced search features and optimizations compared to Milvus 2.3.
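The thought, action, observation cycle in the trace can be sketched as a small loop. The version below is a toy in plain Python: a scripted `policy` function plays the role of the LLM, and the two tools return canned strings. All names here are hypothetical, and LlamaIndex's ReAct agent is considerably more involved.

```python
from typing import List, Optional, Tuple

# Canned tools standing in for the real summary tools.
TOOLS = {
    "summary_tool_milvus_2_3": lambda q: "2.3: GPU support, upsert API",
    "summary_tool_milvus_2_4": lambda q: "2.4: hybrid search, sparse vectors",
}

Step = Tuple[str, Optional[str]]  # (thought, tool name to call or None to finish)


def policy(observations: List[str]) -> Step:
    """Scripted stand-in for the LLM: choose the next step from what is known."""
    if len(observations) == 0:
        return "I need information about Milvus 2.3.", "summary_tool_milvus_2_3"
    if len(observations) == 1:
        return "Now I need information about Milvus 2.4.", "summary_tool_milvus_2_4"
    return "I can compare the two versions now.", None


def react(question: str, max_steps: int = 5) -> Tuple[str, List[str]]:
    """Run the thought/action/observation loop until the policy stops."""
    observations: List[str] = []
    trace: List[str] = []
    for _ in range(max_steps):
        thought, action = policy(observations)
        trace.append(f"Thought: {thought}")
        if action is None:
            # No further tool call: answer from the collected observations.
            return " | ".join(observations), trace
        trace.append(f"Action: {action}")
        observation = TOOLS[action](question)
        trace.append(f"Observation: {observation}")
        observations.append(observation)
    return "No answer within the step limit.", trace


answer, trace = react("Difference between Milvus 2.3 and 2.4?")
```

The step limit matters in practice: without it, an agent that never decides it has enough information would loop indefinitely.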
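05. Conclusion

The move from Naive RAG to Agentic RAG is an important step forward for retrieval-augmented generation. An agentic approach can substantially improve how a RAG system handles complex tasks, making it smarter and more flexible and widening its range of practical applications. We can expect more applications built on Agentic RAG to push large-model technology forward. |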