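An Agent is an intelligent entity that can autonomously understand, plan, decide, and carry out complex tasks; an autonomous Agent is a program driven by artificial intelligence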

*链载Ai* · *Posted 2025-11-30 21:17:15*

What is an Agent?

The word Agent comes from the Latin agere, meaning "to do". In the LLM context, an Agent can be understood as an intelligent entity that can autonomously understand, plan and decide, and carry out complex tasks.

An Agent is not an upgraded ChatGPT: it does not just tell you how to do something, it does it for you. If Copilot sits in the passenger seat, the Agent sits in the driver's seat.

Autonomous Agents are programs driven by artificial intelligence. Given a goal, they can create tasks on their own, complete them, create new tasks, reprioritize the task list, complete the new top-priority task, and keep looping until the goal is reached.

The most intuitive formula: Agent = LLM + Planning + Feedback + Tool use
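The create/complete/reprioritize loop described above can be sketched in a few lines. This is only an illustration, not the implementation of any particular framework: `execute` and `propose_new_tasks` are hypothetical stand-ins for LLM calls, and the task names and priorities are made up.

```python
import heapq


def execute(task):
    # Hypothetical stand-in for "have the LLM complete this task".
    return f"done: {task}"


def propose_new_tasks(task, result):
    # Hypothetical stand-in for "ask the LLM which follow-up tasks are needed".
    follow_ups = {"research topic": [(1, "write outline")],
                  "write outline": [(1, "write draft")]}
    return follow_ups.get(task, [])


def run_agent(goal_task, max_steps=10):
    queue = [(0, goal_task)]              # (priority, task); lower number runs first
    completed = []
    while queue and len(completed) < max_steps:
        _, task = heapq.heappop(queue)    # take the current top-priority task
        result = execute(task)            # complete it
        completed.append(task)
        for item in propose_new_tasks(task, result):
            heapq.heappush(queue, item)   # the heap keeps the list prioritized
    return completed                      # an empty queue means nothing is left to do


print(run_agent("research topic"))
```

The `max_steps` cap is a design choice of the sketch: a loop that creates its own tasks needs some upper bound so that it cannot run forever.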

Agent decision flow

Perception → Planning → Action

•Perception is the Agent's ability to collect information from the environment and extract relevant knowledge from it.
•Planning is the decision process the Agent goes through in pursuit of a goal.
•Action is what the Agent does based on the environment and the plan.

The Agent first perceives: it collects information from the environment and extracts the relevant knowledge. It then plans: it makes decisions in order to reach a goal. Finally it acts, taking concrete actions based on the environment and the plan. The policy is the core decision that drives the Agent's actions, and each action in turn provides the observations for the next round of perception, forming an autonomous closed learning loop.
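As a toy illustration of this closed loop, here is a minimal sketch in which the "environment" is just a temperature value and the goal is a target temperature. Everything in it (the dictionary layout, the function names, the one-degree step) is an assumption made for the example.

```python
def perceive(env):
    # Perception: extract the relevant fact from the raw environment.
    return env["temperature"]


def plan(temperature, target):
    # Planning: decide what to do in order to reach the goal.
    if temperature < target:
        return "heat"
    if temperature > target:
        return "cool"
    return "stop"


def act(env, action):
    # Action: change the environment; the next perception sees the change.
    if action == "heat":
        env["temperature"] += 1
    elif action == "cool":
        env["temperature"] -= 1


def run(env, target, max_steps=20):
    history = []
    for _ in range(max_steps):
        action = plan(perceive(env), target)
        history.append(action)
        if action == "stop":
            break
        act(env, action)
    return history


print(run({"temperature": 18}, target=21))
```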

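How do people get things done?

At work we often use the PDCA model. With PDCA we can break a task down: make a plan, carry out the plan, check the results, then fold what worked into the standard and leave what did not work for the next cycle. So far, this is a very successful distillation of how people complete a task efficiently.

How can an LLM do the work in place of a person?

To have an LLM do the work in place of a person, we can follow the PDCA model: plan, execute, evaluate, and reflect.

Planning (Plan) -> decompose the task: the Agent's brain splits a large task into smaller, manageable subtasks, which works well for handling large, complex tasks effectively and controllably.

Execution (Do) -> use tools: the Agent learns to call external APIs when the model's internal knowledge is not enough (for example, information that did not exist at pre-training time and cannot be added afterwards because the model weights are fixed), such as fetching real-time information, running code, or accessing proprietary knowledge bases. This is a typical platform-plus-tools scenario, and it calls for an ecosystem mindset: we build the platform and a few essential tools, then actively attract other vendors to contribute more components and tools, forming an ecosystem.

Evaluation (Check) -> confirm the result: after a task runs normally, the Agent must be able to judge whether the output meets the goal. When something goes wrong, it must be able to classify the failure (severity level), locate it (which subtask produced the error), and analyze its cause (what led to the failure).

Reflection (Adjust) -> re-plan based on the evaluation: the Agent must end the task promptly once the output meets the goal, which is the most critical part of the whole process, and it should also do attribution analysis to summarize the main factors behind the result. When a failure occurs or the output does not meet the goal, the Agent must propose a response, re-plan, and start another cycle.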
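The four capabilities can be wired together roughly as follows. This is a minimal sketch under stated assumptions: the fixed three-way split in `plan`, the simulated failure in `do`, and the report format in `check` are all hypothetical; in a real Agent each step would be backed by an LLM call or a tool.

```python
def plan(goal):
    # Plan: split the goal into smaller subtasks (hypothetical fixed split).
    return [f"{goal} / part {i}" for i in (1, 2, 3)]


def do(subtask, attempt):
    # Do: hypothetical executor; "part 2" fails on its first attempt.
    if subtask.endswith("part 2") and attempt == 0:
        raise RuntimeError("tool timeout")
    return f"result of {subtask}"


def check(subtask, outcome):
    # Check: record whether the subtask succeeded, and if not, where and why.
    if isinstance(outcome, Exception):
        return {"ok": False, "subtask": subtask, "cause": str(outcome)}
    return {"ok": True, "subtask": subtask}


def run_pdca(goal, max_rounds=3):
    pending = plan(goal)
    log = []
    for attempt in range(max_rounds):
        failed = []
        for subtask in pending:
            try:
                outcome = do(subtask, attempt)
            except RuntimeError as exc:
                outcome = exc
            report = check(subtask, outcome)
            log.append(report)
            if not report["ok"]:
                failed.append(subtask)
        if not failed:        # Adjust: the output meets the goal, so stop here
            return True, log
        pending = failed      # Adjust: re-plan only what failed and loop again
    return False, log


ok, log = run_pdca("report")
print(ok, len(log))
```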

Now let's look at a few concrete examples.

Letting the LLM get the current time

First, we define a tool that returns the current time:

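import datetime

from langchain.tools import Tool


def get_time(input=""):
    # Tool passes a single string argument; it is not needed here.
    return datetime.datetime.now()


# Define the tool that returns the current time
time_tool = Tool(
    name='get current time',
    func=get_time,
    description="Used to get the current time. input should be 'time'"
)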

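•name: the tool's name.
•func: the function that implements the tool.
•description: the tool's description. It must be accurate: this text is added to the LLM's prompt, and if it is inaccurate the LLM may not call the tool correctly.

Let's print LangChain's built-in prompt and take a look: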

Respond to the human as helpfully and accurately as possible. You have access to the following tools:

get current time: Used to get the current time. input should be 'time', args: {{'tool_input': {{'type': 'string'}}}}

Use a json blob to specify a tool by providing an action key (tool name) and an action_input key (tool input).

Valid "action" values: "Final Answer" or get current time

Provide only ONE action per $JSON_BLOB, as shown:

```
{{
  "action": $TOOL_NAME,
  "action_input": $INPUT
}}
```

Follow this format:

Question: input question to answer
Thought: consider previous and subsequent steps
Action:
```
$JSON_BLOB
```
Observation: action result
... (repeat Thought/Action/Observation N times)
Thought: I know what to respond
Action:
```
{{
  "action": "Final Answer",
  "action_input": "Final response to human"
}}
```

Begin! Reminder to ALWAYS respond with a valid json blob of a single action. Use tools if necessary. Respond directly if appropriate. Format is Action:```$JSON_BLOB```then Observation:.
Thought:

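From the prompt above we can see that the time tool we defined has been wrapped into it, and that LangChain has also generated a fragment constraining the format of the input parameter: args: {{'tool_input': {{'type': 'string'}}}}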
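To make the wrapping concrete, the sketch below renders a tool list into the same "name: description, args: ..." lines. It is a simplified stand-in, not LangChain's code: `ToolSpec` and `render_tools` are hypothetical names, and the default `args` dict simply copies the fragment shown in the prompt. The doubled braces in the printed template are presumably format-string escapes, so the sketch prints single braces.

```python
from dataclasses import dataclass, field


@dataclass
class ToolSpec:
    # Hypothetical container mirroring the name/description pair of a tool.
    name: str
    description: str
    args: dict = field(default_factory=lambda: {"tool_input": {"type": "string"}})


def render_tools(tools):
    # One line per tool: "name: description, args: {...}"
    lines = [f"{t.name}: {t.description}, args: {t.args}" for t in tools]
    names = ", ".join(t.name for t in tools)
    return "\n".join(lines), names


tools = [ToolSpec("get current time",
                  "Used to get the current time. input should be 'time'")]
tool_block, tool_names = render_tools(tools)
print(tool_block)
print(f'Valid "action" values: "Final Answer" or {tool_names}')
```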

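Let's continue:

Use a json blob to specify a tool by providing an action key (tool name) and an action_input key (tool input).

Valid "action" values: "Final Answer" or get current time

Provide only ONE action per $JSON_BLOB, as shown:

```
{{
  "action": $TOOL_NAME,
  "action_input": $INPUT
}}
```

This part of the prompt requires the action generated by the LLM to be a JSON blob containing two keys, action and action_input, which hold the tool name and the tool input respectively. It also gives an example.

In addition, the valid actions include not only get current time but also an extra one: Final Answer.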
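On the receiving side, something has to pull that JSON blob out of the model's reply and check the action name. The following is a minimal sketch of such a parser, written for this article and not taken from LangChain; `parse_action` and `FENCE` are hypothetical names.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built at runtime to keep this listing readable


def parse_action(llm_output, valid_actions):
    # Prefer a fenced blob; otherwise try to read the whole reply as JSON.
    pattern = re.escape(FENCE) + r"(?:json)?\s*(\{.*?\})\s*" + re.escape(FENCE)
    match = re.search(pattern, llm_output, re.DOTALL)
    blob = json.loads(match.group(1) if match else llm_output)
    action = blob["action"]
    if action not in valid_actions:
        raise ValueError(f"unknown action: {action!r}")
    return action, blob.get("action_input")


reply = (f'Thought: I need the time\nAction:\n{FENCE}\n'
         f'{{"action": "get current time", "action_input": "time"}}\n{FENCE}')
print(parse_action(reply, {"Final Answer", "get current time"}))
```

Rejecting unknown action names early gives the caller a clear error to feed back to the model instead of a failed tool lookup later on.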

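Let's try a real question:

question = "What time is it now?"

result = agent.run(question)
print(result)

Output:

The current time is 11:12:01 on January 2, 2024.

For comparison, here is the output without the tool:

I cannot answer this question because I do not have real-time access to the actual time or date. I provide information based on my training data.

As we can see, without the tool the LLM has no way of knowing the current time.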

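To make it easier to see how the Agent works, I printed the log of the intermediate steps:

Thought: I need to use a tool to get the current time
Action:
```
{
  "action": "get current time",
  "action_input": {
    "type": "string"
  }
}
```

Observation: 2024-01-02 11:44:16.900356

I now know the current time
Action:
```
{
  "action": "Final Answer",
  "action_input": "The current time is 11:44:16 on January 2, 2024."
}
```

First, the LLM thinks about which tool to call, decides on get current time, and supplies the type of the input parameter.

Next, the result of running the tool comes back as the observation: Observation: 2024-01-02 11:44:16.900356

Finally, now that the LLM knows the answer, it emits one more action, Final Answer, to output the answer.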
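The three steps in this log can be reproduced with a very small driver loop. The sketch below replaces the real model with a scripted function so it runs without an LLM; `scripted_llm`, `run_agent`, the scratchpad format, and the fixed timestamp are assumptions made for illustration and do not reflect LangChain's internals.

```python
import datetime
import json


def get_time(_input=""):
    return datetime.datetime(2024, 1, 2, 11, 44, 16)  # fixed so the run is repeatable


TOOLS = {"get current time": get_time}


def scripted_llm(scratchpad):
    # Hypothetical stand-in for the model: ask for the tool first, then answer.
    if "Observation:" not in scratchpad:
        return json.dumps({"action": "get current time", "action_input": "time"})
    observed = scratchpad.rsplit("Observation: ", 1)[1].strip()
    return json.dumps({"action": "Final Answer",
                       "action_input": f"The current time is {observed}."})


def run_agent(question, llm, max_iterations=5):
    scratchpad = f"Question: {question}\n"
    for _ in range(max_iterations):
        step = json.loads(llm(scratchpad))
        if step["action"] == "Final Answer":   # not a real tool: it ends the loop
            return step["action_input"]
        observation = TOOLS[step["action"]](step["action_input"])
        scratchpad += f"Action: {step['action']}\nObservation: {observation}\n"
    return "Stopped: iteration limit reached"


print(run_agent("What time is it now?", scripted_llm))
```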

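Giving the LLM a calculator

LangChain ships with many built-in tools that can be loaded with the load_tools function. This time we will not define our own tool; let's try one of LangChain's built-in tools.

tools = load_tools(tool_names=["llm-math"], llm=llm)

tools.append(time_tool)

Here is how llm-math is defined: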

def _get_llm_math(llm: BaseLanguageModel) -> BaseTool:
    return Tool(
        name="Calculator",
        description="Useful for when you need to answer questions about math.",
        func=LLMMathChain.from_llm(llm=llm).run,
        coroutine=LLMMathChain.from_llm(llm=llm).arun,
    )
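The built-in tool delegates to LLMMathChain, whose internals are not shown here. As a rough, hypothetical alternative for the "evaluate an arithmetic expression" half of the job, a calculator function can be written with the standard library alone. The sketch below is not LangChain code; `evaluate` and `OPS` are names made up for the example, and it deliberately accepts only numbers and basic operators.

```python
import ast
import operator

OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv,
       ast.Pow: operator.pow, ast.USub: operator.neg}


def evaluate(expression):
    # Walk the parsed expression and allow only numbers and basic operators.
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expression, mode="eval").body)


print(evaluate("789 * 324353"))
```

Because such a function takes one string and returns a value, it could be wrapped in a Tool in the same way as get_time.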

Let's look at the prompt now:

Respond to the human as helpfully and accurately as possible. You have access to the following tools:

Calculator: Useful for when you need to answer questions about math., args: {{'tool_input': {{'type': 'string'}}}}
get current time: Used to get the current time. input should be 'now', args: {{'tool_input': {{'type': 'string'}}}}

Use a json blob to specify a tool by providing an action key (tool name) and an action_input key (tool input).

Valid "action" values: "Final Answer" or Calculator, get current time

Provide only ONE action per $JSON_BLOB, as shown:

```
{{
  "action": $TOOL_NAME,
  "action_input": $INPUT
}}
```

Follow this format:

Question: input question to answer
Thought: consider previous and subsequent steps
Action:
```
$JSON_BLOB
```
Observation: action result
... (repeat Thought/Action/Observation N times)
Thought: I know what to respond
Action:
```
{{
  "action": "Final Answer",
  "action_input": "Final response to human"
}}
```

Begin! Reminder to ALWAYS respond with a valid json blob of a single action. Use tools if necessary. Respond directly if appropriate. Format is Action:```$JSON_BLOB```then Observation:.
Thought:

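Compared with the previous example, the prompt has one more tool entry, named Calculator: Calculator: Useful for when you need to answer questions about math., args: {{'tool_input': {{'type': 'string'}}}}

In practice, that is just one more tool name and tool description.

Let's try it:

question = "What is 789*324353?"

result = agent.run(question)
print(result)

Output:

255914517

For comparison, here is the output without the tool:

789 * 324353 = 324353 * (700 + 80 + 9) = 324353 * 700 + 324353 * 80 + 324353 * 9 = 227047100 + 25948240 + 2921177 = 252995340 + 2921177 = 255916517

Without the tool the model did not reach the correct answer (the slip is in 324353 * 9, which is 2919177, not 2921177), but at least it knew to break the problem down. Note that I was using qwen-72b-chat-int4; a smaller model would not necessarily do this well.

Here is the output of baichuan2-13b-chat:

789 multiplied by 324353 equals 259553427.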
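The decomposition quoted above can be checked mechanically. The short script below recomputes each partial product; the claimed values are copied from the model's tool-free answer, and the variable names are only for illustration.

```python
a, b = 324353, 789
# Partial products exactly as quoted in the model's tool-free answer
parts = {700: 227047100, 80: 25948240, 9: 2921177}

for factor, claimed in parts.items():
    actual = a * factor
    status = "ok" if actual == claimed else f"wrong, should be {actual}"
    print(f"{a} * {factor} = {claimed}  ({status})")

print("model total:", sum(parts.values()))
print("exact total:", a * b)
```

Only the last partial product is off, by 2000, which is exactly the gap between the model's total and the Calculator's result.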

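Letting the LLM get real-time weather

Define the tool:

Download the China-City-List-latest.csv file from https://github.com/qwd/LocationList/blob/master/China-City-List-latest.csv

A QWeather (和风天气) API key is required; register at https://dev.qweather.com to get one (search online for the details).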

def getLocationId(city):
    # The lookup is keyed on the Chinese city name column (Location_Name_ZH),
    # so the city must be passed in Chinese, e.g. "杭州".
    d = collections.defaultdict(str)
    try:
        df = pd.read_csv("./data/datasets/virus/China-City-List-latest.csv", encoding='utf-8')
    except Exception as e:
        print(e)
        return ''  # without the city list there is nothing to look up
    for i, row in df.iterrows():
        d[row['Location_Name_ZH']] = row['Location_ID']
    return d[city] if city in d else ''


def get_weather(location):
    key = "your QWeather API key"
    location_id = getLocationId(location)
    if not location_id:
        return "No such city"
    base_url = 'https://devapi.qweather.com/v7/weather/now?'
    params = {'location': location_id, 'key': key, 'lang': 'zh'}
    response = requests.get(base_url, params=params)
    data = response.json()
    if data["code"] != "200":
        return "No weather data for this city"
    return get_weather_info(data)


def get_weather_info(info):
    if info["code"] != "200":
        return "No weather data for this city"
    # result = f'Weather now {info["hourly"][0]["text"]}, temperature {info["hourly"][0]["temp"]} degrees, weather in 24 hours {info["hourly"][-1]["text"]}, temperature {info["hourly"][-1]["temp"]} degrees.'

    result = f"""
    Current weather: {info["now"]["text"]}
    Temperature: {info["now"]["temp"]} degrees Celsius
    Wind direction: {info["now"]["windDir"]}
    Wind scale: {info["now"]["windScale"]}
    Wind speed: {info["now"]["windSpeed"]} km/h
    """

    return result


weather_tool = Tool(
    name='get current weather',
    func=get_weather,
    description="Used to get the local weather. The input should be a city name"
)
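One detail worth noting is that getLocationId re-reads the whole city list on every call. A possible refinement, sketched here with the standard library only, is to parse the list once and cache the resulting dictionary. The two sample rows and the Location_Name_EN column are assumptions made so the example is self-contained; the real file should be loaded from disk instead of `SAMPLE_CSV`.

```python
import csv
import io
from functools import lru_cache

# Illustrative rows only; the real file has more columns and many more cities.
SAMPLE_CSV = """Location_ID,Location_Name_EN,Location_Name_ZH
101010100,Beijing,北京
101210101,Hangzhou,杭州
"""


@lru_cache(maxsize=1)
def load_city_index():
    # Parse the list once; later calls reuse the cached dictionary.
    reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
    return {row["Location_Name_ZH"]: row["Location_ID"] for row in reader}


def get_location_id(city):
    # Returns '' for unknown cities, matching the behaviour of getLocationId.
    return load_city_index().get(city, "")


print(get_location_id("杭州"))
```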

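Let's try it:

question = "Can I wear short sleeves in Hangzhou today?"

result = agent.run(question)
print(result)

Output:

Short sleeves are not recommended. Hangzhou has haze today and the temperature is 10 degrees Celsius.

For comparison, here is the output without the tool:

As a language model, I cannot access real-time weather information. Please check the current weather in Hangzhou yourself and decide whether to wear short sleeves based on the temperature and your own constitution.

All of the tool functions above take a single input parameter. Next, let's see how to handle a tool with several input parameters.

Tools with multiple input parameters

Define the tool: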

class FutureWeatherInput(BaseModel):
    location: str = Field(description="City name")
    date: str = Field(description="Date, format: yyyy-mm-dd, e.g. 2021-11-15")


def get_future_weather(location, date):
    key = "your QWeather API key"
    location_id = getLocationId(location)
    if not location_id:
        return "No such city"
    base_url = 'https://devapi.qweather.com/v7/weather/7d?'
    params = {'location': location_id, 'key': key, 'lang': 'zh'}
    response = requests.get(base_url, params=params)
    data = response.json()
    if data["code"] != "200":
        return "No weather data for this city"

    # Index the daily forecasts by date (yyyy-mm-dd)
    result = {}
    daily = data["daily"]
    for item in daily:
        fxDate = item["fxDate"]

        weather_text = f"""
        Weather: {item["textDay"]}
        High: {item["tempMax"]} degrees Celsius
        Low: {item["tempMin"]} degrees Celsius
        Wind direction: {item["windDirDay"]}
        Wind scale: {item["windScaleDay"]}
        Wind speed: {item["windSpeedDay"]} km/h
        """
        result[fxDate] = weather_text

    # Avoid a KeyError when the requested date is outside the forecast window
    return result.get(date, "No forecast available for this date")


future_weather_tool = StructuredTool(
    name='get future weather',
    func=get_future_weather,
    description="Used to get the local weather for today and the next six days.",
    args_schema=FutureWeatherInput
)

When a tool needs multiple input parameters, we no longer use the Tool class but the StructuredTool class. Its definition is as follows (it can be found in the LangChain source):

class StructuredTool(BaseTool):
    """Tool that can operate on any number of inputs."""

    description: str = ""
    args_schema: Type[BaseModel] = Field(..., description="The tool schema.")
    """The input arguments' schema."""
    func: Optional[Callable[..., Any]]
    """The function to run when the tool is called."""
    coroutine: Optional[Callable[..., Awaitable[Any]]] = None
    """The asynchronous version of the function."""

The input is constrained through pydantic's BaseModel. A description for each input parameter is also necessary, because these descriptions are passed into the prompt as well:

Calculator: Useful for when you need to answer questions about math., args: {{'tool_input': {{'type': 'string'}}}}
get current time: Used to get the current time. input should be 'now'. When you need the date for today, tomorrow, the day after tomorrow, and so on, call this function to get today's date, args: {{'tool_input': {{'type': 'string'}}}}
get current weather: Used to get the local weather for the current day. The input should be a city name, args: {{'tool_input': {{'type': 'string'}}}}
get future weather: Used to get the local weather for today and the next six days., args: {{'location': {{'title': 'Location', 'description': 'City name', 'type': 'string'}}, 'date': {{'title': 'Date', 'description': 'Date, format: yyyy-mm-dd, e.g. 2021-11-15', 'type': 'string'}}}}

Use a json blob to specify a tool by providing an action key (tool name) and an action_input key (tool input).

Valid "action" values: "Final Answer" or Calculator, get current time, get current weather, get future weather
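With a multi-argument tool, the action_input produced by the LLM is a JSON object and not a single string, and it has to be checked and unpacked into keyword arguments. The sketch below shows that step with plain dictionaries so it runs without pydantic; `SCHEMA`, `validate_future_weather_input`, and `fake_future_weather` are hypothetical names, and the stub does not call the weather API.

```python
import datetime

SCHEMA = {  # hypothetical plain-dict stand-in for FutureWeatherInput
    "location": {"type": str, "description": "City name"},
    "date": {"type": str, "description": "Date, format: yyyy-mm-dd"},
}


def validate_future_weather_input(action_input, schema=SCHEMA):
    missing = [k for k in schema if k not in action_input]
    if missing:
        raise ValueError(f"missing arguments: {missing}")
    for key, spec in schema.items():
        if not isinstance(action_input[key], spec["type"]):
            raise TypeError(f"{key} must be {spec['type'].__name__}")
    datetime.date.fromisoformat(action_input["date"])  # rejects malformed dates
    return {k: action_input[k] for k in schema}        # drop unexpected keys


def fake_future_weather(location, date):
    # Hypothetical stub so the example runs without the weather API.
    return f"{location} on {date}: sunny"


action_input = {"location": "杭州", "date": "2024-01-03"}
print(fake_future_weather(**validate_future_weather_input(action_input)))
```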

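Let's try it:

question = "What is today's date? I plan to travel to Hangzhou tomorrow; can I wear short sleeves?"

result = agent.run(question)
print(result)

Output:

The forecast for Hangzhou tomorrow is sunny, with a high of 13 degrees Celsius and a low of 2 degrees Celsius. It is advisable to bring some warm clothing.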
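Answering this question takes two tool calls: one to learn today's date, and one to fetch the forecast for tomorrow in yyyy-mm-dd form. In the run above the LLM works out tomorrow's date itself. If that step proves unreliable, the date arithmetic could be done deterministically, for example with a small helper like the hypothetical `relative_date` below (the fixed date is only there to make the example repeatable).

```python
import datetime


def relative_date(today, offset_days):
    # Return the yyyy-mm-dd string that the forecast tool expects.
    return (today + datetime.timedelta(days=offset_days)).isoformat()


today = datetime.date(2024, 1, 2)   # would come from the time tool in a real run
print(relative_date(today, 1))      # tomorrow
```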

Letting the LLM search the web

Define the tool:

def get_internet_content(query):
    params = {
        "engine": "baidu",
        "q": query,
        "api_key": "your SerpApi key"
    }

    search = BaiduSearch(params)
    # Use the snippet of the first organic result as the answer context
    result = search.get_json()["organic_results"][0]["snippet"]
    return result

baidu_search_tool = Tool(
    name='Baidu Search',
    func=get_internet_content,
    description="Used to get current news and events from the internet. The input should be a search query"
)

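You need to register for a SerpApi key yourself at https://serpapi.com/

Let's try it:

question = "When will the Xiaomi SU7 be released?"

result = agent.run(question)
print(result)

Output:

The Xiaomi SU7 is expected to enter mass production and go on sale in the first half of 2024.

The Agent can answer this question because we used Baidu search to retrieve information about the Xiaomi SU7 release date, and the LLM then summarized an answer from that information. It is like attaching an external knowledge base, except that the knowledge base is no longer our local database but Baidu search.

By now you will have noticed that different tools are really just different functions. To adapt an Agent to your own business scenario, you only need to swap in functions or APIs from your own business.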
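The point that a tool is just a function can be made concrete with a tiny registry. The sketch below is framework-free and purely illustrative: the `tool` decorator, `TOOL_REGISTRY`, and the order-status function with its fake data are all hypothetical.

```python
TOOL_REGISTRY = {}


def tool(name, description):
    # Decorator that registers a plain function as a tool.
    def register(func):
        TOOL_REGISTRY[name] = {"func": func, "description": description}
        return func
    return register


@tool("order status", "Look up the status of an order. The input should be an order ID")
def order_status(order_id):
    # Hypothetical business function; a real one would call an internal API.
    fake_db = {"A100": "shipped", "A101": "processing"}
    return fake_db.get(order_id, "unknown order")


def call_tool(name, tool_input):
    return TOOL_REGISTRY[name]["func"](tool_input)


print(call_tool("order status", "A100"))
```

Swapping in a different business capability then means registering another function with an accurate description; the loop that drives the Agent does not change.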

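All of the examples above used qwen-72b-chat-int4 as the LLM. I also compared baichuan2-13b-chat, yi-34b-chat, and qwen-14b-chat: baichuan2-13b-chat was the worst and basically could not work out how to call a tool, yi-34b-chat was weaker than qwen-14b-chat, and qwen-72b-chat-int4 was the best. My guess is that the qwen models were trained on a dedicated tool-calling dataset, which is why they do better than the others. The team has also open-sourced a tool-calling dataset for large models: MSAgent-Bench.

Complete code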

import collections
import random
import requests
import datetime
import pandas as pd
from langchain.tools import Tool, StructuredTool
from langchain.agents import initialize_agent
from langchain.chat_models import ChatOpenAI
from langchain.agents import load_tools
from langchain.agents import AgentType

from pydantic import BaseModel, Field
from serpapi.baidu_search import BaiduSearch


def getLocationId(city):
    # The lookup is keyed on the Chinese city name column (Location_Name_ZH).
    d = collections.defaultdict(str)
    try:
        df = pd.read_csv("./data/datasets/virus/China-City-List-latest.csv", encoding='utf-8')
    except Exception as e:
        print(e)
        return ''  # without the city list there is nothing to look up
    for i, row in df.iterrows():
        d[row['Location_Name_ZH']] = row['Location_ID']
    return d[city] if city in d else ''


def get_weather(location):
    key = "your QWeather API key"
    location_id = getLocationId(location)
    if not location_id:
        return "No such city"
    base_url = 'https://devapi.qweather.com/v7/weather/now?'
    params = {'location': location_id, 'key': key, 'lang': 'zh'}
    response = requests.get(base_url, params=params)
    data = response.json()
    if data["code"] != "200":
        return "No weather data for this city"
    return get_weather_info(data)


class FutureWeatherInput(BaseModel):
    location: str = Field(description="City name")
    date: str = Field(description="Date, format: yyyy-mm-dd, e.g. 2021-11-15")


def get_future_weather(location, date):
    key = "your QWeather API key"
    location_id = getLocationId(location)
    if not location_id:
        return "No such city"
    base_url = 'https://devapi.qweather.com/v7/weather/7d?'
    params = {'location': location_id, 'key': key, 'lang': 'zh'}
    response = requests.get(base_url, params=params)
    data = response.json()
    if data["code"] != "200":
        return "No weather data for this city"

    # Index the daily forecasts by date (yyyy-mm-dd)
    result = {}
    daily = data["daily"]
    for item in daily:
        fxDate = item["fxDate"]

        weather_text = f"""
        Weather: {item["textDay"]}
        High: {item["tempMax"]} degrees Celsius
        Low: {item["tempMin"]} degrees Celsius
        Wind direction: {item["windDirDay"]}
        Wind scale: {item["windScaleDay"]}
        Wind speed: {item["windSpeedDay"]} km/h
        """
        result[fxDate] = weather_text

    # Avoid a KeyError when the requested date is outside the forecast window
    return result.get(date, "No forecast available for this date")




def get_weather_info(info):
    if info["code"] != "200":
        return "No weather data for this city"
    # result = f'Weather now {info["hourly"][0]["text"]}, temperature {info["hourly"][0]["temp"]} degrees, weather in 24 hours {info["hourly"][-1]["text"]}, temperature {info["hourly"][-1]["temp"]} degrees.'

    result = f"""
    Current weather: {info["now"]["text"]}
    Temperature: {info["now"]["temp"]} degrees Celsius
    Wind direction: {info["now"]["windDir"]}
    Wind scale: {info["now"]["windScale"]}
    Wind speed: {info["now"]["windSpeed"]} km/h
    """
    return result


def get_internet_content(query):
    params = {
        "engine": "baidu",
        "q": query,
        "api_key": "your SerpApi key"
    }

    search = BaiduSearch(params)
    # Use the snippet of the first organic result as the answer context
    result = search.get_json()["organic_results"][0]["snippet"]
    return result



def test_agent_example():
    model = "Qwen-72B-Chat-Int4"
    api_key = "EMPTY"
    base_url = "http://localhost:8000/v1"

    llm = ChatOpenAI(model=model, temperature=0, api_key=api_key, base_url=base_url)

    # Quick check of the weather function; the city list is keyed by Chinese names
    print(get_weather("北京"))

    def get_time(input=""):
        return datetime.datetime.now()

    # Define the tool that returns the current time
    time_tool = Tool(
        name='get current time',
        func=get_time,
        description="Used to get the current time. input should be 'now'. When you need the date for today, tomorrow, the day after tomorrow, and so on, call this function to get today's date"
    )

    weather_tool = Tool(
        name='get current weather',
        func=get_weather,
        description="Used to get the local weather for the current day. The input should be a city name"
    )

    future_weather_tool = StructuredTool(
        name='get future weather',
        func=get_future_weather,
        description="Used to get the local weather for today and the next six days.",
        args_schema=FutureWeatherInput
    )

    tools = load_tools(tool_names=["llm-math"], llm=llm)

    tools.extend([time_tool, weather_tool, future_weather_tool])

    # Create the agent
    agent = initialize_agent(
        agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,
        tools=tools,
        llm=llm,
        verbose=True,
        max_iterations=5,
        handle_parsing_errors=True
    )

    # Print the system prompt template that LangChain builds for this agent
    print(agent.agent.llm_chain.prompt[0].prompt.template)

    question = "What is today's date? I plan to travel to Hangzhou tomorrow; can I wear short sleeves?"

    result = agent.run(question)
    print("----" * 20)
    print(result)

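Summary

1. The tool description is very important. If it is poorly written, the agent cannot tell in which situations it should call the tool.

2. The descriptions of the input parameters are very important. To get the LLM to produce inputs in a given format, you can include a few few-shot examples.

3. An agent is essentially still prompt engineering, and it depends heavily on the LLM's parameter count. Small models fail to follow the prompt and to produce input parameters in the required format, so the tool functions cannot be called properly.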
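When a model does produce malformed output, one common mitigation is to feed the parse error back and ask again, within a fixed attempt limit (the complete code above passes handle_parsing_errors=True and max_iterations=5 for a similar purpose). The sketch below is a generic illustration of that idea and makes no claim about how LangChain implements those options; `call_with_retries` and the scripted replies are hypothetical.

```python
import json


def call_with_retries(llm, prompt, max_attempts=3):
    # Ask again, quoting the parse error, until the reply is a usable JSON blob.
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        reply = llm(prompt + feedback)
        try:
            blob = json.loads(reply)
            return {"action": blob["action"],
                    "action_input": blob["action_input"]}, attempt
        except (json.JSONDecodeError, KeyError, TypeError) as exc:
            feedback = (f"\nYour last reply could not be parsed ({exc!r}). "
                        "Reply with one JSON blob only.")
    raise ValueError("no parseable reply within the attempt limit")


# Scripted replies standing in for a model that gets it right on the second try
replies = iter(['Sure! The action is get current time.',
                '{"action": "get current time", "action_input": "now"}'])
result, attempts = call_with_retries(lambda _prompt: next(replies),
                                     "Question: what time is it?")
print(result, attempts)
```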