|
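Agentic systems are being widely studied and applied as general-purpose tools. Solving complex problems usually requires compound agentic systems built from multiple components, and hand-designed solutions are eventually replaced by learned solutions that are more efficient. This motivates a new research area, Automated Design of Agentic Systems (ADAS, open source), whose goal is to automatically create powerful agentic system designs. By defining the entire agentic system in code and letting a "meta agent" automatically discover new agents, an ADAS algorithm can in theory discover any possible building block and agentic system.
Figure: Overview of Meta Agent Search and examples of discovered agents. The meta agent is instructed to iteratively program new agents, test their performance on tasks, add them to an archive of discovered agents, and use this archive to inform the meta agent in subsequent iterations. Three example agents from three runs are shown; all names were generated by the meta agent.
Automated Design of Agentic Systems: definition and goals of ADAS
ADAS aims to automatically invent new building blocks and design powerful agentic systems. An agentic system uses foundation models (FMs) as modules and completes tasks through planning, tool use, and multi-step iterative processing.
The three key components of ADAS
Figure: The three key components of Automated Design of Agentic Systems (ADAS). The search space determines which agentic systems can be represented in ADAS. The search algorithm specifies how the ADAS method explores the search space. The evaluation function defines how candidate agents are evaluated against target objectives such as performance.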
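Search Space: defines which agentic systems can be represented in ADAS. For example, some work mutates only the agent's text prompts while other components, such as the control flow, stay fixed.
Search Algorithm: specifies how the ADAS method explores the search space. Because the search space is usually very large or even unbounded, the exploration-exploitation trade-off has to be considered.
Evaluation Function: depending on the application of the ADAS algorithm, different objectives may be optimized, such as performance, cost, latency, or agent safety. The evaluation function defines how candidate agents are evaluated on these objectives.
Extensive experiments across domains including coding, science, and math show that the algorithm can progressively invent agents with novel designs that substantially outperform state-of-the-art hand-designed agents.
Figure: Results of Meta Agent Search on the ARC challenge. (a) Meta Agent Search progressively discovers high-performing agents based on an ever-growing archive of previous discoveries. Each agent is evaluated five times, and the median accuracy and 95% bootstrap confidence interval on a held-out test set are reported. (b) Visualization of the best agent discovered by Meta Agent Search on the ARC challenge.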
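The iterative procedure behind these results (propose a new agent, evaluate it, add it to the archive, condition the next proposal on the archive) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the `Agent` class, the threshold-classifier "design", and the `propose` stub that stands in for the meta agent. In ADAS itself the meta agent is a foundation model that writes new agent code, not a random mutation.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple

Task = Tuple[int, bool]  # (input, expected answer)


@dataclass
class Agent:
    """Toy agent whose whole 'design' is one integer threshold (hypothetical)."""
    name: str
    threshold: int
    score: float = 0.0

    def forward(self, x: int) -> bool:
        return x >= self.threshold


def evaluate(agent: Agent, tasks: List[Task]) -> float:
    """Evaluation function: fraction of tasks answered correctly."""
    return sum(agent.forward(x) == y for x, y in tasks) / len(tasks)


def propose(archive: List[Agent], rng: random.Random) -> Agent:
    """Stand-in for the meta agent: perturb the best agent found so far."""
    parent = max(archive, key=lambda a: a.score)
    threshold = parent.threshold + rng.choice([-1, 1])
    return Agent(name=f"agent_{len(archive)}", threshold=threshold)


def meta_agent_search(tasks: List[Task], n_iterations: int, seed: int = 0) -> List[Agent]:
    """Search algorithm: grow an archive, one evaluated candidate per iteration."""
    rng = random.Random(seed)
    baseline = Agent(name="baseline", threshold=0)
    baseline.score = evaluate(baseline, tasks)
    archive = [baseline]  # the archive is initialized with a baseline
    for _ in range(n_iterations):
        candidate = propose(archive, rng)
        candidate.score = evaluate(candidate, tasks)
        archive.append(candidate)  # every candidate is kept, good or bad
    return archive
```

Calling `meta_agent_search([(x, x >= 5) for x in range(10)], n_iterations=20)` returns an archive of 21 agents. The search space here is a single integer; the point of the sketch is only the loop structure and the role of the archive.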
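Figure: An example task from the ARC challenge. Given input-output grid examples, the AI system is asked to learn the transformation rule and then apply the learned rule to the test grid to predict the final answer.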
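To make the task format concrete, here is a minimal sketch of the "learn a rule from examples, then apply it" setup. The grid encoding (lists of integers) and the three candidate rules are assumptions chosen for illustration; real ARC tasks involve far more varied transformations, and the agents in the article use foundation models rather than a fixed rule list.

```python
from typing import Callable, Dict, List, Optional, Tuple

Grid = List[List[int]]  # each cell is an integer color code (assumed encoding)


def flip_horizontal(grid: Grid) -> Grid:
    return [list(reversed(row)) for row in grid]


def flip_vertical(grid: Grid) -> Grid:
    return [list(row) for row in reversed(grid)]


def transpose(grid: Grid) -> Grid:
    return [list(col) for col in zip(*grid)]


# Hypothetical, tiny hypothesis space of transformation rules.
CANDIDATE_RULES: Dict[str, Callable[[Grid], Grid]] = {
    "flip_horizontal": flip_horizontal,
    "flip_vertical": flip_vertical,
    "transpose": transpose,
}


def infer_rule(examples: List[Tuple[Grid, Grid]]) -> Optional[str]:
    """Return the first rule consistent with every input-output example."""
    for name, rule in CANDIDATE_RULES.items():
        if all(rule(inp) == out for inp, out in examples):
            return name
    return None


def solve(examples: List[Tuple[Grid, Grid]], test_input: Grid) -> Optional[Grid]:
    """Apply the inferred rule to the test grid, or None if no rule fits."""
    name = infer_rule(examples)
    return None if name is None else CANDIDATE_RULES[name](test_input)
```

An answer counts as correct only if the predicted grid matches the expected grid exactly, which is why the check inside `infer_rule` uses strict equality.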
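Performance comparison between Meta Agent Search and state-of-the-art hand-designed agents across multiple domains. The agents discovered by Meta Agent Search outperform the baselines in every domain. Test accuracy and 95% bootstrap confidence intervals on held-out test sets are reported. The search is run independently for each domain.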
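The reported intervals are bootstrap confidence intervals over test outcomes. The article does not say which bootstrap variant is used, so the sketch below assumes the simple percentile method on per-example correctness (1 for correct, 0 for wrong). The function name `bootstrap_ci` and its parameters are illustrative.

```python
import random
from typing import List, Tuple


def bootstrap_ci(
    outcomes: List[int],
    n_resamples: int = 10_000,
    confidence: float = 0.95,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Return (accuracy, lower, upper) using a percentile bootstrap.

    Assumption: the percentile method; other bootstrap variants exist.
    """
    rng = random.Random(seed)
    n = len(outcomes)
    # Resample the test set with replacement and record each resample's accuracy.
    means = sorted(sum(rng.choices(outcomes, k=n)) / n for _ in range(n_resamples))
    alpha = (1.0 - confidence) / 2.0
    lower = means[int(alpha * n_resamples)]
    upper = means[min(n_resamples - 1, int((1.0 - alpha) * n_resamples))]
    return sum(outcomes) / n, lower, upper
```

For example, `bootstrap_ci([1] * 70 + [0] * 30)` returns the point accuracy 0.7 together with lower and upper bounds around it; the exact bounds depend on the seed and the number of resamples.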
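Performance when transferring the top agents from MGSM to other math domains. The agents discovered by Meta Agent Search consistently outperform the baselines across different math domains. We report test accuracy and 95% bootstrap confidence intervals. The names of the top agents were generated by Meta Agent Search.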
You are a helpful assistant. Make sure to return in a WELL-FORMED JSON object.
The following prompt instructs the meta agent to design new agents based on the archive of previously discovered agents.
You are an expert machine learning researcher testing various agentic systems. Your objective is to design building blocks such as prompts and control flows within these systems to solve complex tasks. Your aim is to design an optimal agent performing well on [Brief Description of the Domain].
[Framework Code]
[Output Instructions and Examples]
[Discovered Agent Archive] (initialized with baselines, updated at every iteration)
# Your task
You are deeply familiar with prompting techniques and the agent works from the literature. Your goal is to maximize the specified performance metrics by proposing interestingly new agents. Observe the discovered agents carefully and think about what insights, lessons, or stepping stones can be learned from them. Be creative when thinking about the next interesting agent to try. You are encouraged to draw inspiration from related agent papers or academic papers from other research areas. Use the knowledge from the archive and inspiration from academic literature to propose the next interesting agentic system design. THINK OUTSIDE THE BOX.
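The bracketed placeholders in this prompt are filled in before each iteration. A minimal sketch of that assembly step follows, assuming the archive is a list of dictionaries with the "thought", "name", and "code" keys and is serialized as JSON. The template is abbreviated, and `build_meta_prompt` is a hypothetical helper, not a function from the ADAS repository.

```python
import json
from typing import Dict, List

# Abbreviated template: only part of the opening and the four placeholders
# are kept; the "# Your task" section of the prompt is omitted here.
PROMPT_TEMPLATE = """You are an expert machine learning researcher testing various agentic systems.
Your aim is to design an optimal agent performing well on [Brief Description of the Domain].

[Framework Code]

[Output Instructions and Examples]

[Discovered Agent Archive]
"""


def build_meta_prompt(
    domain: str,
    framework_code: str,
    output_instructions: str,
    archive: List[Dict[str, str]],
) -> str:
    """Fill the placeholders; the archive is re-serialized at every iteration."""
    replacements = {
        "[Brief Description of the Domain]": domain,
        "[Framework Code]": framework_code,
        "[Output Instructions and Examples]": output_instructions,
        # Assumption: the archive is shown to the meta agent as a JSON list.
        "[Discovered Agent Archive]": json.dumps(archive, indent=2, ensure_ascii=False),
    }
    prompt = PROMPT_TEMPLATE
    for placeholder, value in replacements.items():
        prompt = prompt.replace(placeholder, value)
    return prompt
```

Because the archive grows by one entry per iteration, the prompt grows with it, which is worth keeping in mind for context-length limits.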
The following prompt instructs and formats the meta agent's output. It collects and presents some common mistakes the meta agent may make, which is effective in improving the quality of the generated code.
# Output Instruction and Example:
The first key should be ("thought"), and it should capture your thought process for designing the next function. In the "thought" section, first reason about what the next interesting agent to try should be, then describe your reasoning and the overall concept behind the agent design, and finally detail the implementation steps. The second key ("name") corresponds to the name of your next agent architecture. Finally, the last key ("code") corresponds to the exact "forward()" function in Python code that you would like to try. You must write COMPLETE CODE in "code": Your code will be part of the entire project, so please implement complete, reliable, reusable code snippets.
Here is an example of the output format for the next agent:
{"thought": "**Insights:** Your insights on what should be the next interesting agent. **Overall Idea:** your reasoning and the overall concept behind the agent design. **Implementation:** describe the implementation step by step.", "name": "Name of your proposed agent", "code": "def forward(self, taskInfo): # Your code here"}
## WRONG Implementation examples:
[Examples of potential mistakes the meta agent may make in implementation]
After the meta agent's first response, two rounds of self-reflection are performed to make the generated agent novel and error-free.
[Generated Agent from Previous Iteration]
Carefully review the proposed new architecture and reflect on the following points:
1. **Interestingness**: Assess whether your proposed architecture is interesting or innovative compared to existing methods in the archive. If you determine that the proposed architecture is not interesting, suggest a new architecture that addresses these shortcomings.
- Make sure to check the difference between the proposed architecture and previous attempts.
- Compare the proposal and the architectures in the archive CAREFULLY, including their actual differences in the implementation.
- Decide whether the current architecture is innovative.
- USE CRITICAL THINKING!
2. **Implementation Mistakes**: Identify any mistakes you may have made in the implementation. Review the code carefully, debug any issues you find, and provide a corrected version. REMEMBER checking "## WRONG Implementation examples" in the prompt.
3. **Improvement**: Based on the proposed architecture, suggest improvements in the detailed implementation that could increase its performance or effectiveness. In this step, focus on refining and optimizing the existing implementation without altering the overall design framework, except if you want to propose a different architecture if the current is not interesting.
- Observe carefully about whether the implementation is actually doing what it is supposed to do.
- Check if there is redundant code or unnecessary steps in the implementation. Replace them with effective implementation.
- Try to avoid the implementation being too similar to the previous agent.
And then, you need to improve or revise the implementation, or implement the new proposed architecture based on the reflection.
Your response should be organized as follows:
"reflection": Provide your thoughts on the interestingness of the architecture, identify any mistakes in the implementation, and suggest improvements.
"thought": Revise your previous proposal or propose a new architecture if necessary, using the same format as the example response.
"name": Provide a name for the revised or new architecture. (Don't put words like "new" or "improved" in the name.)
"code": Provide the corrected code or an improved implementation. Make sure you actually implement your fix and improvement in this code.
Using the tips in "## WRONG Implementation examples" section, further revise the code.
Your response should be organized as follows:
Include your updated reflections in the "reflection". Repeat the previous "thought" and "name". Update the corrected version of the code in the "code" section.
When an error occurs while executing the generated code, the meta agent reflects on it and the code is re-run. If the error persists, this process is repeated up to five times. The following prompt is used for self-reflection on any runtime error:
Error during evaluation:
[Runtime errors]
Carefully consider where you went wrong in your latest implementation. Using insights from previous attempts, try to debug the current code to implement the same thought. Repeat your previous thought in "thought", and put your thinking for debugging in "debug_thought".
Automated Design of Agentic Systems
https://arxiv.org/pdf/2408.08435
https://github.com/ShengranHu/ADAS
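The runtime-error handling described in the article (reflect, re-run, repeat up to five times) could be sketched as follows. This is an illustration under stated assumptions, not the repository's implementation: `run_with_debugging`, `load_forward`, and the `reflect` callback (which would call the meta agent with the runtime-error prompt) are hypothetical names, and "up to five times" is read here as at most five reflection rounds after the first run.

```python
from typing import Callable, Dict, Optional, Tuple

MAX_DEBUG_ROUNDS = 5  # assumption: five reflection rounds after the first run


def load_forward(code: str) -> Callable:
    """Turn the "code" string into a callable forward(self, taskInfo).

    Caution: exec runs arbitrary generated code; isolate it in practice.
    """
    namespace: Dict[str, object] = {}
    exec(code, namespace)
    return namespace["forward"]  # type: ignore[return-value]


def run_with_debugging(
    solution: Dict[str, str],
    evaluate: Callable[[Callable], float],
    reflect: Callable[[Dict[str, str], str], Dict[str, str]],
    max_rounds: int = MAX_DEBUG_ROUNDS,
) -> Tuple[Optional[float], Dict[str, str]]:
    """Evaluate a generated agent, asking for a fix after each runtime error."""
    for round_idx in range(max_rounds + 1):
        try:
            forward = load_forward(solution["code"])
            return evaluate(forward), solution
        except Exception as err:  # any failure while loading or running the agent
            if round_idx == max_rounds:
                break  # give up: the error persisted through every round
            feedback = f"Error during evaluation:\n{err!r}"
            # `reflect` stands in for the meta agent call that returns a
            # revised solution (same "thought", new "code", "debug_thought").
            solution = reflect(solution, feedback)
    return None, solution
```

Returning `None` for the score lets the caller discard an agent that never ran successfully instead of adding it to the archive.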
|