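AIOS is designed as an LLM agent operating system (OS): it treats the LLM as the brain of the OS (an "OS with a soul") and is aimed at AGI. From a practical standpoint, though, even if you set the OS ambitions aside, it works quite well as a multi-agent framework.
Figure 1: A motivating example showing that an agent (e.g., a travel agent) needs both LLM-level and OS-level resources and functions to complete a task.
Agent Scheduler
Its main job is to manage requests from agents efficiently so that the large language model (LLM) is used as effectively as possible. The agent scheduler can apply different scheduling strategies, such as first-in-first-out (FIFO) or Round Robin, to decide the order in which agent tasks are executed.
Figure 3: Illustration of the agent scheduler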
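To make the difference between the two scheduling strategies concrete, here is a minimal simulation. It is only a sketch, not AIOS code: the `Request` class, the `tokens_needed` field, and the `time_slice` parameter are assumptions introduced for illustration, and "work" is measured in tokens to keep the example simple.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    agent: str          # hypothetical agent name
    tokens_needed: int  # hypothetical amount of generation work, in tokens


def fifo(requests):
    """Run each request to completion, in arrival order."""
    return [(r.agent, r.tokens_needed) for r in requests]


def round_robin(requests, time_slice):
    """Give each request at most `time_slice` tokens per turn, then requeue it."""
    queue = deque((r.agent, r.tokens_needed) for r in requests)
    schedule = []
    while queue:
        agent, remaining = queue.popleft()
        step = min(time_slice, remaining)
        schedule.append((agent, step))
        if remaining > step:
            # Not finished yet: go to the back of the queue with the rest.
            queue.append((agent, remaining - step))
    return schedule


requests = [Request("TravelAgent", 5), Request("MathAgent", 2), Request("RecAgent", 3)]
print(fifo(requests))
print(round_robin(requests, time_slice=2))
```

Under FIFO the short MathAgent request has to wait for all of TravelAgent's work; under Round Robin it finishes after its first turn, at the cost of suspending and resuming the longer requests.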
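Context Manager
The context manager is the key module that handles context information and state during LLM generation. Its main functions are context snapshot and restoration, and context window management.
Context snapshot and restoration lets the system save the state of the current generation process when the scheduler suspends an agent request, even if the LLM has not finished generating the response. Once resources become available again, the system resumes generation from the saved state, so a temporary suspension does not lose progress and resources are used more efficiently.
Context window management handles long contexts that may exceed what the LLM can process. Using basic text summarization and expansion techniques, the context manager manages the context window, improving the LLM's ability to process and understand large amounts of context while preserving the integrity and relevance of the information.
Figure 4: Context snapshot and restoration, using beam search (beam width = 1) as an example search algorithm to illustrate the generative decoding process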
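The snapshot and restoration idea can be sketched with a toy decoder. Everything here is an assumption made for illustration: `ContextManager`, `toy_next_token`, `generate`, and the `budget` parameter are hypothetical names, and the "model" just upper-cases the prompt token by token in place of a real greedy decoding step (beam width 1).

```python
class ContextManager:
    """Toy context manager: keeps partial generations keyed by request id."""

    def __init__(self):
        self._snapshots = {}

    def snapshot(self, request_id, generated):
        # Copy the list so later decoding steps cannot mutate the saved state.
        self._snapshots[request_id] = list(generated)

    def restore(self, request_id):
        # Return the saved tokens (and forget them), or start from scratch.
        return self._snapshots.pop(request_id, [])


def toy_next_token(prompt_tokens, generated):
    """Stand-in for one greedy decoding step; returns None when finished."""
    if len(generated) >= len(prompt_tokens):
        return None
    return prompt_tokens[len(generated)].upper()


def generate(request_id, prompt_tokens, ctx, budget):
    """Decode at most `budget` tokens, resuming from a snapshot if one exists."""
    generated = ctx.restore(request_id)
    for _ in range(budget):
        token = toy_next_token(prompt_tokens, generated)
        if token is None:
            break
        generated.append(token)
    finished = toy_next_token(prompt_tokens, generated) is None
    if not finished:
        ctx.snapshot(request_id, generated)  # suspended: save progress
    return generated, finished


ctx = ContextManager()
prompt = "plan a trip to paris".split()
first = generate("req-1", prompt, ctx, budget=2)
second = generate("req-1", prompt, ctx, budget=2)
third = generate("req-1", prompt, ctx, budget=2)
print(first, second, third, sep="\n")
```

Each call picks up exactly where the previous one stopped, so no token is generated twice. A real implementation would have to save model-side decoding state rather than a plain token list.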
Tool Manager
Figure 5: Tools managed in AIOS. The last column shows the input and output formats each tool requires.
https://github.com/agiresearch/AIOS/tree/main/src
{
  "name": "MathAgent",
  "description": "You are an expert who is good at solving mathematical problems, given a mathematical problem, you need to break down this problem into smaller sub-problems. Solve a part of the problem step by step with explanations and finally build up to the final solution."
}

{
  "name": "NarrativeAgent",
  "description": "You are an expert who is good at writing novels, given a theme or background, you need to write a short story with a well-developed plot and characters, develop different sections of the story, such as introduction, rising action, climax, and conclusion."
}

{
  "name": "RecAgent",
  "description": "You are an expert who is good at recommending restaurants or hotels for users, given a request, you need to first determine the right recommendation direction and then provide the recommendation lists."
}

{
  "name": "TravelAgent",
  "description": [
    "You are a proficient planner. ",
    "Based on the provided information and query, please give me a detailed plan, including specifics such as flight numbers (e.g., F0123456), restaurant names, and accommodation names. ",
    "Note that all the information in your plan should be derived from the provided data. ",
    "You must adhere to the format given in the example. Additionally, all details should align with common sense. ",
    "The symbol '-' indicates that information is unnecessary. ",
    "For example, in the provided sample, you do not need to plan after returning to the departure city. ",
    "When you travel to two cities in one day, you should note it in the 'Current City' section as in the example (i.e., from A to B)."
  ],
  "flow": [
    "Step1:::Process:::Based on the input query, determine the duration, departure city, and destination.:::next::step2",
    "Step2:::Decision:::Is the destination a state or a city?:::city::step4:::state::step3",
    "Step3:::Process:::Select a city as the new destination city from the destination state:::next::step4",
    "Step4:::Process:::Estimate the cost of taking a taxi from departure city to the destination city.:::next::Step5",
    "Step5:::Process:::Estimate the cost of self-driving from departure city to the destination city.:::next::Step6",
    "Step6:::Process:::Estimate the cost of taking a flight on the start date from departure city to the destination city.:::next::Step7",
    "Step7:::Decision:::Is there a reasonable transportation based on the results of taxi, self-driving and flight cost?:::yes::Step8:::no::Step3",
    "Step8:::Process:::Record the most reasonable transportation method from departure city to the first destination city. Move to the first destination city.:::next::Step9",
    "Step9:::Process:::Record an unvisited restaurant for today's breakfast at current city:::next::Step10",
    "Step10:::Process:::Record an unvisited restaurant for today's lunch at current city:::next::Step11",
    "Step11:::Process:::Record an unvisited restaurant for today's dinner at current city:::next::Step12",
    "Step12:::Process:::Record an unvisited attraction for today's plan at current city:::next::Step13",
    "Step13:::Decision:::Is today the last day of the trip?:::yes::Step14:::no::Step19",
    "Step14:::Process:::Estimate the cost of taking a taxi from current city to the departure city.:::next::Step15",
    "Step15:::Process:::Estimate the cost of self-driving from current city to the departure city.:::next::Step16",
    "Step16:::Process:::Estimate the cost of taking a flight on the last date from current city to the departure city.:::next::Step17",
    "Step17:::Process:::Record the most reasonable transportation method from current city to the departure city.:::next::Step18",
    "Step18:::Terminal:::Output all the plans in json.:::",
    "Step19:::Process:::Find a reasonable accommodation at current city.:::next::Step20",
    "Step20:::Decision:::Is there a reasonable accommodation at current city?:::yes::Step21:::no::Step3",
    "Step21:::Process:::Record the accommodation at current city. Start planning the next day. Now, what is the date today?:::next::Step22",
    "Step22:::Decision:::Is today the third day of the trip?:::no::Step23:::yes::Step24",
    "Step23:::Decision:::Is today the fifth day of the trip?:::no::Step9:::yes::Step24",
    "Step24:::Process:::Select an unvisited city as the new destination city from the destination state.:::next::step4"
  ],
  "tool_info": ["Available tools: ", "google_search"]
}

import time

from src.agents.agent_process import AgentProcess

# CustomizedThread is defined elsewhere in the AIOS source and is not shown in
# this excerpt; as used below, its join() returns the result of the target.


class BaseAgent:
    def get_response(self, prompt, temperature=0.0):
        # Wrap the prompt in an AgentProcess and hand it to the scheduler queue.
        agent_process = AgentProcess(self.agent_name, prompt, temperature)
        agent_process.set_created_time(time.time())
        self.agent_process_queue.put(agent_process)

        # Wait in a separate thread until the scheduler has produced a response.
        thread = CustomizedThread(target=self.listen, args=(agent_process,))
        thread.start()
        result = thread.join()

        waiting_time = agent_process.get_start_time() - agent_process.get_created_time()
        turnaround_time = agent_process.get_end_time() - agent_process.get_created_time()
        result = result.replace("\n", "")
        return result, waiting_time, turnaround_time

    def check_tool_use(self, prompt, tool_info, temperature=0.):
        prompt = (
            f'You are allowed to use the following tools: \n\n```{tool_info}```\n\n'
            f'Do you think the response ```{prompt}``` calls any tool?\n'
            f'Only answer "Yes" or "No".'
        )
        while True:
            # get_response returns (text, waiting_time, turnaround_time).
            response, _, _ = self.get_response(prompt, temperature)
            temperature += .5
            print(f'Tool use check: {response}')
            if 'yes' in response.lower():
                return True
            if 'no' in response.lower():
                return False
            print(f'Temperature: {temperature}')
            if temperature > 2:
                break
        print('No valid format output when calling "Tool use check".')
        # No usable answer was produced: treat it as "no tool call".
        return False

    def get_prompt(self, tool_info, flow_ptr, task_description, cur_progress):
        progress_str = '\n'.join(cur_progress)
        prompt = (
            f'{tool_info}\n\nCurrent Progress:\n{progress_str}\n\n'
            f'Task description: {task_description}\n\n'
            f'Question: {flow_ptr.get_instruction()}\n\n'
            f'Only answer the current instruction and do not be verbose.'
        )
        return prompt

    def get_tool_arg(self, prompt, tool_info, selected_tool):
        prompt = (
            f'{tool_info}\n\n'
            f'You attempt to use the tool ```{selected_tool}```. '
            f'What is the input argument to call tool for this step: ```{prompt}```? '
            f'Respond "None" if no arguments are needed for this tool. '
            f'Separate by comma if there are multiple arguments. Do not be verbose!'
        )
        # Only the response text is needed here; the timings are discarded.
        response, _, _ = self.get_response(prompt)
        print(f'Parameters: {response}')
        return response

    def get_final_result(self, prompt):
        prompt = (
            f"Given the interaction history: {prompt}, "
            f"give the answer to the task input and don't be verbose!"
        )
        final_result, waiting_time, turnaround_time = self.get_response(prompt)
        # str.replace returns a new string, so the result must be reassigned.
        final_result = final_result.replace("\n", "")
        return final_result, waiting_time, turnaround_time
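Each entry in the TravelAgent "flow" list packs a step name, a step type, an instruction, and zero or more branches into one string. The sketch below shows one way such an entry could be parsed, assuming the layout `name:::type:::instruction:::label::target`. `FlowStep` and `parse_step` are hypothetical names, not the AIOS implementation, and step names are lower-cased only because the configuration mixes "step4" and "Step5".

```python
from dataclasses import dataclass, field


@dataclass
class FlowStep:
    name: str
    kind: str                 # "Process", "Decision" or "Terminal"
    instruction: str
    branches: dict = field(default_factory=dict)  # branch label -> next step


def parse_step(entry):
    """Split one flow entry into its name, type, instruction and branches."""
    parts = entry.split(":::")
    name, kind, instruction = parts[0], parts[1], parts[2]
    branches = {}
    for part in parts[3:]:
        if part:  # a Terminal step ends with ":::" and has no branches
            label, target = part.split("::")
            branches[label] = target.lower()
    return FlowStep(name.lower(), kind, instruction, branches)


decision = parse_step(
    "Step2:::Decision:::Is the destination a state or a city?"
    ":::city::step4:::state::step3"
)
terminal = parse_step("Step18:::Terminal:::Output all the plans in json.:::")
print(decision)
print(terminal)
```

With the steps parsed into a dictionary keyed by name, an agent can follow the flow by looking up the branch label the LLM answers with (for example "city" or "state") at each Decision step.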
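{"model_type": "causal_lm", "open_sourced": true, "model_name": "google/gemma-2b-it"}

import logging
import time
from threading import Thread

from src.agents.agent_process import AgentProcess

# Assumption: the excerpt does not show where `logger` is defined, so a
# standard-library logger stands in for it here.
logger = logging.getLogger(__name__)


class BaseScheduler:
    def __init__(self, llm):
        self.active = False  # start/stop the scheduler
        self.thread = Thread(target=self.run)
        self.llm = llm

    def run(self):
        # Subclasses implement the actual scheduling policy here.
        pass

    def start(self):
        """start the scheduler"""
        self.active = True
        self.thread.start()

    def stop(self):
        """stop the scheduler"""
        self.active = False
        self.thread.join()

    def execute_request(self, agent_process: AgentProcess):
        # Record status and timestamps around the LLM call so that waiting
        # and turnaround times can be computed by the agent afterwards.
        agent_process.set_status("Executing")
        logger.info(f"[{agent_process.agent_name}] is executing.")
        agent_process.set_start_time(time.time())
        response = self.llm.address_request(agent_process.prompt)
        agent_process.set_response(response)
        agent_process.set_end_time(time.time())
        agent_process.set_status("Done")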
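The base scheduler leaves run() empty. As a rough idea of what a FIFO policy could look like on top of the same start/stop pattern, here is a self-contained sketch. `EchoLLM`, `FIFOScheduler`, and the `responses` list are assumptions made for illustration; plain prompt strings stand in for AgentProcess objects, and the real AIOS scheduler may differ.

```python
import queue
from threading import Thread


class EchoLLM:
    """Stand-in for the real LLM wrapper; only address_request is assumed."""

    def address_request(self, prompt):
        return f"echo: {prompt}"


class FIFOScheduler:
    def __init__(self, llm, request_queue):
        self.active = False
        self.llm = llm
        self.request_queue = request_queue
        self.thread = Thread(target=self.run)
        self.responses = []

    def run(self):
        # Serve requests strictly in arrival order until stop() is called.
        while self.active:
            try:
                prompt = self.request_queue.get(timeout=0.05)
            except queue.Empty:
                continue  # nothing queued yet; check `active` again
            self.responses.append(self.llm.address_request(prompt))
            self.request_queue.task_done()

    def start(self):
        self.active = True
        self.thread.start()

    def stop(self):
        self.active = False
        self.thread.join()


requests = queue.Queue()
scheduler = FIFOScheduler(EchoLLM(), requests)
scheduler.start()
for prompt in ["first", "second", "third"]:
    requests.put(prompt)
requests.join()   # blocks until every queued prompt has been processed
scheduler.stop()
print(scheduler.responses)
```

The short timeout on get() keeps the loop responsive to stop(), so the worker thread never blocks forever on an empty queue.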
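memory/storage: I won't go into detail on short-term and long-term memory, since the implementation is simple. Short-term memory is stored and retrieved in memory through a dict; long-term memory is persisted and retrieved through a database or files.
tool: Eight tools are implemented, such as arXiv for papers and search (Bing/Google). Each tool's implementation mainly covers the API endpoint URL, parameter configuration, execution, and result parsing.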
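The two kinds of memory can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the AIOS code: the class names, the per-agent key layout, and the choice of a single JSON file for long-term storage are all hypothetical.

```python
import json
import os
import tempfile


class ShortTermMemory:
    """In-process memory: a dict of dicts keyed by agent name."""

    def __init__(self):
        self._store = {}

    def save(self, agent_name, key, value):
        self._store.setdefault(agent_name, {})[key] = value

    def load(self, agent_name, key, default=None):
        return self._store.get(agent_name, {}).get(key, default)


class LongTermMemory:
    """File-backed memory: the same layout, persisted as one JSON file."""

    def __init__(self, path):
        self.path = path

    def _read_all(self):
        if not os.path.exists(self.path):
            return {}
        with open(self.path, encoding="utf-8") as f:
            return json.load(f)

    def save(self, agent_name, key, value):
        data = self._read_all()
        data.setdefault(agent_name, {})[key] = value
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(data, f)

    def load(self, agent_name, key, default=None):
        return self._read_all().get(agent_name, {}).get(key, default)


short = ShortTermMemory()
short.save("TravelAgent", "destination", "Paris")

path = os.path.join(tempfile.mkdtemp(), "memory.json")
LongTermMemory(path).save("TravelAgent", "home_city", "Boston")
# A fresh instance reads the same file, so the value outlives the first object.
reloaded = LongTermMemory(path).load("TravelAgent", "home_city")
print(short.load("TravelAgent", "destination"), reloaded)
```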
from typing import List

import requests

# BaseTool and get_from_env come from the AIOS source tree; their imports are
# not shown in this excerpt.


class BingSearch(BaseTool):
    """Bing Search Tool, refactored from langchain.

    In order to set this up, follow instructions at:
    https://levelup.gitconnected.com/api-tutorial-how-to-use-bing-web-search-api-in-python-4165d5592a7e
    """

    def __init__(self):
        super().__init__()
        self.url = "https://api.bing.microsoft.com/v7.0/search"  # temporarily
        self.bing_subscription_key = get_from_env("BING_SUBSCRIPTION_KEY")
        self.k: int = 10  # topk searched results
        # search_kwargs: dict

    def _bing_search_results(self, search_term: str, count: int) -> List[dict]:
        headers = {"Ocp-Apim-Subscription-Key": self.bing_subscription_key}
        params = {
            "q": search_term,
            "count": count,
            "textDecorations": True,
            "textFormat": "HTML",
            # **self.search_kwargs,
        }
        response = requests.get(
            self.url,  # the endpoint configured in __init__
            headers=headers,
            params=params,  # type: ignore
        )
        response.raise_for_status()
        search_results = response.json()
        if "webPages" in search_results:
            return search_results["webPages"]["value"]
        return []

    def run(self, query: str) -> str:
        """Run query through BingSearch and parse result."""
        response = self._bing_search_results(query, count=self.k)
        result = self.parse_result(response)
        return result

    def parse_result(self, response):
        # Concatenate the snippet of every hit into a single string.
        snippets = []
        if len(response) == 0:
            return "No good Bing Search Result was found"
        for result in response:
            snippets.append(result["snippet"])
        return "".join(snippets)
From Agents to Multimodal Agents to Multimodal Multi-Agent Systems: Development and Case Studies (12,000 characters, 20+ references, 27 figures)
AIOS: LLM Agent Operating System https://arxiv.org/pdf/2403.18243.pdf