ReAct 框架的代码实现工作原理

显示全部楼层

React Agent 是如何工作的？

“Reason Only”（向内求索）：COT 类型的 Prompt，希望模型一步一步思考去解决问题。这类型 Prompt 非常擅长逻辑推理，能将问题拆解成一个一个小问题，逐一解决。但是没有灵活运用外部工具和外部知识的能力。向内求索型人格：像一个思维严谨，思考问题很深入，但是不够开放，不喜欢接触新知识的人。
“Act Only”（向外探索）：工具使用 Prompt，非常擅长借助外部工具和外部知识去解决问题，但是逻辑推理能力不强。向外探索人格：像一个思维开放，非常喜欢探索新知识，尝试新工具，但是思想深度不够的人。
“ReAct”(内外兼修) ：就是将“Reason Only”（向内求索）和“Act Only”（向外探索）这两者相结合，

将严谨思维和探索的心态结合起来，优势互补，成为一个内外兼修的人。

直观的理解就是，ReAct 就是结合了的推理能力 + 工具使用能力/外部知识能力，利用大模型去解决难问题。

在 ReAct Prompting 中，使用 LLMs 以交错的方式生成推理轨迹和特定于任务的操作，从而实现两者之间更大的协同作用：

推理跟踪帮助模型诱导、跟踪和更新行动计划以及处理异常。
执行操作与外部源（例如知识库或环境）进行交互，以获取相关信息。

接下来深入研究 ReAct 框架的代码实现的工作原理，总共分三步。

第一步：

定义 Prompt 模版

TOOL_DESC="""{name_for_model}:Callthistooltointeractwiththe{name_for_human}API.Whatisthe{name_for_human}APIusefulfor?{description_for_model}Parameters:{parameters}FormattheargumentsasaJSONobject."""

REACT_PROMPT="""Answerthefollowingquestionsasbestyoucan.Youhaveaccesstothefollowingtools:

{tool_descs}

Usethefollowingformat:

Question:theinputquestionyoumustanswer
Thought:youshouldalwaysthinkaboutwhattodo
Action:theactiontotake,shouldbeoneof[{tool_names}]
ActionInput:theinputtotheaction
Observation:theresultoftheaction
...(thisThought/Action/ActionInput/Observationcanberepeatedzeroormoretimes)
Thought:Inowknowthefinalanswer
FinalAnswer:thefinalanswertotheoriginalinputquestion

Begin!

Question:{query}"""

工具构建函数

defbuild_planning_prompt(TOOLS,query):
tool_descs=[]
tool_names=[]
forinfoinTOOLS:
tool_descs.append(
TOOL_DESC.format(
name_for_model=info['name_for_model'],
name_for_human=info['name_for_human'],
description_for_model=info['description_for_model'],
parameters=json.dumps(
info['parameters'],ensure_ascii=False),
)
)
tool_names.append(info['name_for_model'])
tool_descs='\n\n'.join(tool_descs)
tool_names=','.join(tool_names)

prompt=REACT_PROMPT.format(tool_descs=tool_descs,tool_names=tool_names,query=query)
returnprompt

prompt_1=build_planning_prompt(TOOLS[0:1],query="加拿大2023年人口统计数字是多少？")
print(prompt_1)

调用函数查看 response 结果

Answerthefollowingquestionsasbestyoucan.Youhaveaccesstothefollowingtools:

Search:CallthistooltointeractwiththegooglesearchAPI.WhatisthegooglesearchAPIusefulfor?usefulforwhenyouneedtoanswerquestionsaboutcurrentevents.Parameters:[{"name":"query","type":"string","description":"searchqueryofgoogle","required":true}]FormattheargumentsasaJSONobject.

Usethefollowingformat:

Question:theinputquestionyoumustanswer
Thought:youshouldalwaysthinkaboutwhattodo
Action:theactiontotake,shouldbeoneof[Search]
ActionInput:theinputtotheaction
Observation:theresultoftheaction
...(thisThought/Action/ActionInput/Observationcanberepeatedzeroormoretimes)
Thought:Inowknowthefinalanswer
FinalAnswer:thefinalanswertotheoriginalinputquestion

Begin!

Question:加拿大2023年人口统计数字是多少？

初始化模型

#国内连hugginface网络不好，这段代码可能需要多重试
checkpoint="Qwen/Qwen-7B-Chat"
TOKENIZER=AutoTokenizer.from_pretrained(checkpoint,trust_remote_code=True)
MODEL=AutoModelForCausalLM.from_pretrained(checkpoint,device_map="auto",trust_remote_code=True).eval()
MODEL.generation_config=GenerationConfig.from_pretrained(checkpoint,trust_remote_code=True)
MODEL.generation_config.do_sample=False#greedy

设置 "Observation" 为 stop word

stop=["Observation:","Observation:\n"]
react_stop_words_tokens=[TOKENIZER.encode(stop_)forstop_instop]
response_1,_=MODEL.chat(TOKENIZER,prompt_1,history=None,stop_words_ids=react_stop_words_tokens)
print(response_1)

模型在预测到要生成的下一个词是 "Observation" 时马上停止生成，此时这个 prompt 后会生成如下的结果

Thought:我应该使用搜索工具帮助我完成任务。search api能完成搜索的任务。
Action:Search
ActionInput:{"query":"加拿大2023年人口统计数字"}
Observation:

第二步：

从输出中解析需要使用的工具和入参

defparse_latest_plugin_call(text:str)->Tuple[str,str]:
i=text.rfind('\nAction:')
j=text.rfind('\nActionInput:')
k=text.rfind('\nObservation:')
if0<=i<j:#Ifthetexthas`Action`and`Actioninput`,
ifk<j:#butdoesnotcontain`Observation`,
#thenitislikelythat`Observation`isommitedbytheLLM,
#becausetheoutputtextmayhavediscardedthestopword.
text=text.rstrip()+'\nObservation:'#Additback.
k=text.rfind('\nObservation:')
if0<=i<j<k:
plugin_name=text[i+len('\nAction:'):j].strip()
plugin_args=text[j+len('\nActionInput:'):k].strip()
returnplugin_name,plugin_args
return'',''

调用对应工具

defuse_api(tools,response):
use_toolname,action_input=parse_latest_plugin_call(response)
ifuse_toolname=="":
return"notoolfounds"

used_tool_meta=list(filter(lambdax:x["name_for_model"]==use_toolname,tools))
iflen(used_tool_meta)==0:
return"notoolfounds"

api_output=used_tool_meta[0]["tool_api"](action_input)
returnapi_output

api_output=use_api(TOOLS,response_1)
print(api_output)

工具返回的结果

根据加拿大统计局预测，加拿大人口今天（2023年6月16日）预计将超过4000万。联邦统计局使用模型来实时估计加拿大的人口，该计数模型预计加拿大人口将在北美东部时间今天下午3点前达到4000万。加拿大的人口增长率目前为2.7％。

第三步：

根据工具返回结果继续调用模型

拼接上述返回答案，形成新的prompt，并获得生成最终结果

prompt_2=prompt_1+response_1+''+api_output
stop=["Observation:","Observation:\n"]
react_stop_words_tokens=[TOKENIZER.encode(stop_)forstop_instop]
response_2,_=MODEL.chat(TOKENIZER,prompt_2,history=None,stop_words_ids=react_stop_words_tokens)
print(prompt_2,response_2)

续写，生成最终答案

按照 ReAct 格式继续生成 Thought 和 Final Answer，得到最终答案。

Answerthefollowingquestionsasbestyoucan.Youhaveaccesstothefollowingtools:

Search:CallthistooltointeractwiththegooglesearchAPI.WhatisthegooglesearchAPIusefulfor?usefulforwhenyouneedtoanswerquestionsaboutcurrentevents.Parameters:[{"name":"query","type":"string","description":"searchqueryofgoogle","required":true}]FormattheargumentsasaJSONobject.

Usethefollowingformat:

Question:theinputquestionyoumustanswer
Thought:youshouldalwaysthinkaboutwhattodo
Action:theactiontotake,shouldbeoneof[Search]
ActionInput:theinputtotheaction
Observation:theresultoftheaction
...(thisThought/Action/ActionInput/Observationcanberepeatedzeroormoretimes)
Thought:Inowknowthefinalanswer
FinalAnswer:thefinalanswertotheoriginalinputquestion

Begin!

Question:加拿大2023年人口统计数字是多少？Thought:我应该使用搜索工具帮助我完成任务。search api能完成搜索的任务。
Action:Search
ActionInput:{"query":"加拿大2023年人口统计数字"}
Observation:根据加拿大统计局预测，加拿大人口今天（2023年6月16日）预计将超过4000万。联邦统计局使用模型来实时估计加拿大的人口，该计数模型预计加拿大人口将在北美东部时间今天下午3点前达到4000万。加拿大的人口增长率目前为2.7％。
Thought:Inowknowthefinalanswer.
FinalAnswer:加拿大2023年人口统计数字预计为4000万。

总结

defmain(query,choose_tools):
#组织prompt
prompt=build_planning_prompt(choose_tools,query)
#设置stopword，模型停止生成
stop=["Observation:","Observation:\n"]
react_stop_words_tokens=[TOKENIZER.encode(stop_)forstop_instop]
response,_=MODEL.chat(TOKENIZER,prompt,history=None,stop_words_ids=react_stop_words_tokens)

while"FinalAnswer:"notinresponse:#出现finalAnswer时结束
api_output=use_api(choose_tools,response)#抽取入参并执行api
api_output=str(api_output)#部分api工具返回结果非字符串格式需进行转化后输出
if"notoolfounds"==api_output:
break
print("\033[32m"+response+"\033[0m"+"\033[34m"+''+api_output+"\033[0m")
prompt=prompt+response+''+api_output#合并api输出
response,_=MODEL.chat(TOKENIZER,prompt,history=None,stop_words_ids=react_stop_words_tokens)#继续生成

print("\033[32m"+response+"\033[0m")

核心思想是利用了大模型的续写能力，按照 ReAct 的步骤进行续写。
设置停止词，当生成与停止词一样的输出时，模型停止生成。
进行工具和参数解析，并调用工具。
将工具调用结果进行 ReAct 格式进行拼装，调用模型继续进行续写生成。
直到遇见 Final Answer: ，获得最终答案。
使用一次工具，至少要调用 2 次大模型。