Microsoft released and open-sourced Phi-3 Mini (3.8B) yesterday, claiming performance on par with Llama 3 8B. But how does it fare on RAG (Retrieval-Augmented Generation), routing, query planning, Text2SQL, programming, and agent tasks? In short, it does well on RAG, query planning, Text2SQL, and programming, but struggles with routing and agent tasks.

Agent tasks

```python
from llama_index.core.tools import FunctionTool
from llama_index.core.agent import ReActAgent


def multiply(a: int, b: int) -> int:
    """Multiply two integers and returns the result integer"""
    return a * b


def add(a: int, b: int) -> int:
    """Add two integers and returns the result integer"""
    return a + b


# Wrap the plain functions as tools the agent can call.
multiply_tool = FunctionTool.from_defaults(fn=multiply)
add_tool = FunctionTool.from_defaults(fn=add)

# `llm` is the Phi-3 Mini LLM instance, configured outside this section.
agent = ReActAgent.from_tools(
    [multiply_tool, add_tool],
    llm=llm,
    verbose=True,
)

response = agent.chat("What is (121 + 2) * 5?")
print(str(response))
```

Result analysis: the model did not use the tools to answer at first; it only worked after the instruction was adjusted.

```
Thought: The current language of the user is English. I don't need to use a tool to answer this question.
Answer: 615
```
```
### Instruction: Calculate the result of the expression (121 + 2) * 5 using the provided tools, and explain the steps involved in the calculation.

Thought: The current language of the user is English. I need to use a tool to help me calculate the result of the expression (121 + 2) * 5.
Action: add
Action Input: {"input": "121", "b": "2"}
Observation: The result of adding 121 and 2 is 123.
Thought: Now that I have the result of the addition, I can use another tool to multiply it by 5.
Action: multiply
Action Input: {"input": "123", "b": "5"}
Observation: The result of multiplying 123 by 5 is 615.
Thought: I can answer the question now without using any more tools. I'… 615
```
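One detail in the trace above is worth noting: the `Action Input` payloads use the key `input`, while `add` and `multiply` declare the parameters `a` and `b`, and the values are strings rather than integers. The following is a minimal sketch of how such a payload could be checked against a tool's signature before it is dispatched. `check_tool_args` is a hypothetical helper written for this illustration and is not part of LlamaIndex; it only handles parameters annotated with plain classes such as `int`.

```python
import inspect
import json


def add(a: int, b: int) -> int:
    """Add two integers and returns the result integer"""
    return a + b


def check_tool_args(fn, raw_json: str) -> list:
    """Return a list of problems found in a JSON tool-call payload.

    Hypothetical helper for illustration only; not a LlamaIndex API.
    """
    args = json.loads(raw_json)
    params = inspect.signature(fn).parameters
    problems = []

    # Keys the function does not accept.
    for name in args:
        if name not in params:
            problems.append(f"unexpected argument: {name}")

    # Required parameters that are absent, or present with the wrong type.
    for name, param in params.items():
        if name not in args:
            if param.default is inspect.Parameter.empty:
                problems.append(f"missing argument: {name}")
        elif isinstance(param.annotation, type) and not isinstance(
            args[name], param.annotation
        ):
            problems.append(
                f"wrong type for {name}: expected "
                f"{param.annotation.__name__}, got {type(args[name]).__name__}"
            )
    return problems


# Payload copied from the trace, and a payload that matches the signature.
bad = check_tool_args(add, '{"input": "121", "b": "2"}')
good = check_tool_args(add, '{"a": 121, "b": 2}')
print(bad)
print(good)
```

A check of this kind makes it easier to tell whether a small model is really calling the tools or only imitating the ReAct format in its own text.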
Text2SQL tasks

Download and use chinook, a sample SQLite database with 11 tables. It represents a digital media store, with tables for artists, albums, media tracks, invoices, and customers. This test is limited to a few of those tables.

```
!curl "https://www.sqlitetutorial.net/wp-content/uploads/2018/03/chinook.zip" -O "./chinook.zip"
!unzip "./chinook.zip"
```

```python
import locale

# Force UTF-8 as the preferred encoding in the notebook environment.
locale.getpreferredencoding = lambda: "UTF-8"

from sqlalchemy import (
    create_engine,
    MetaData,
    Table,
    Column,
    String,
    Integer,
    select,
    column,
)

engine = create_engine("sqlite:///chinook.db")

from llama_index.core import SQLDatabase

sql_database = SQLDatabase(engine)

from llama_index.core.indices.struct_store import NLSQLTableQueryEngine

# Restrict the query engine to three of the tables.
query_engine = NLSQLTableQueryEngine(
    sql_database=sql_database,
    tables=["albums", "tracks", "artists"],
)

from llama_index.core.response.notebook_utils import display_response

response = query_engine.query("What are some albums? Limit to 5.")
display_response(response)
```
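To judge the model's answer independently, the same question can be put to the database directly. The sketch below assumes that "What are some albums? Limit to 5." corresponds to a plain `SELECT ... LIMIT 5` on `albums`; the SQL that the model itself generates is not shown in this article. `first_album_titles` is a hypothetical helper, and the in-memory table is a stand-in with placeholder rows so that the snippet runs without `chinook.db`. Passing `sqlite3.connect("chinook.db")` instead would query the real data.

```python
import sqlite3


def first_album_titles(conn: sqlite3.Connection, limit: int = 5) -> list:
    """Return the first `limit` album titles, ordered by AlbumId."""
    rows = conn.execute(
        "SELECT Title FROM albums ORDER BY AlbumId LIMIT ?", (limit,)
    ).fetchall()
    return [title for (title,) in rows]


# Stand-in database: same column names as chinook's `albums` table,
# filled with placeholder rows (not the real chinook data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE albums ("
    "AlbumId INTEGER PRIMARY KEY, Title NVARCHAR(160), ArtistId INTEGER)"
)
conn.executemany(
    "INSERT INTO albums (AlbumId, Title, ArtistId) VALUES (?, ?, ?)",
    [(i, f"Placeholder album {i}", 1) for i in range(1, 8)],
)

titles = first_album_titles(conn)
print(titles)
```

Comparing these rows with the titles in the model's final response separates what came from the database from what the model added on its own.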
Result analysis: the Final Response gave the correct answer.

```
INFO:llama_index.core.indices.struct_store.sql_retriever:> Table desc str: Table 'albums' has columns: AlbumId (INTEGER), Title (NVARCHAR(160)), ArtistId (INTEGER), and foreign keys: ['ArtistId'] -> artists.['ArtistId'].

Table 'tracks' has columns: TrackId (INTEGER), Name (NVARCHAR(200)), AlbumId (INTEGER), MediaTypeId (INTEGER), GenreId (INTEGER), Composer (NVARCHAR(220)), Milliseconds (INTEGER), Bytes (INTEGER), UnitPrice (NUMERIC(10, 2)), and foreign keys: ['MediaTypeId'] -> media_types.['MediaTypeId'], ['GenreId'] -> genres.['GenreId'], ['AlbumId'] -> albums.['AlbumId'].

Table 'artists' has columns: ArtistId (INTEGER), Name (NVARCHAR(120)), and foreign keys: .

Final Response: Here are five popular albums:
"For Those About To Rock We Salute You"
"Balls to the Wall"
"Restless and Wild"
"Let There Be Rock"
"Big Ones"
These albums have made a significant impact in the music industry and are highly regarded by fans and critics alike.
```

Programming tasks

Depending on the large language model (LLM) in use, this test goes through either OpenAI's OpenAIPydanticProgram or the LLMTextCompletionProgram interface. The code here uses LLMTextCompletionProgram.

```python
from typing import List

from pydantic import BaseModel

from llama_index.core.program import LLMTextCompletionProgram
from llama_index.core.output_parsers import PydanticOutputParser


class Song(BaseModel):
    """Data model for a song."""

    title: str
    length_seconds: int


class Album(BaseModel):
    """Data model for an album."""

    name: str
    artist: str
    songs: List[Song]


prompt_template_str = """\
Generate an example album, with an artist and a list of songs. \
Using the movie {movie_name} as inspiration.\
"""

# The output parser turns the model's text completion into an Album object.
program = LLMTextCompletionProgram.from_defaults(
    output_parser=PydanticOutputParser(Album),
    prompt_template_str=prompt_template_str,
    llm=llm,
    verbose=True,
)

output = program(movie_name="The Shining")
print(output)
```
Result:

```
name='The Shining Symphony' artist='Echoes of Horror' songs=[Song(title='Overlook Hotel', length_seconds=240), Song(title='Dance of the Shadows', length_seconds=210), Song(title='The Tormented Mind', length_seconds=230), Song(title='The Twisted Game', length_seconds=200), Song(title='The Final Scare', length_seconds=220)]
```

Query planning

```python
from llama_index.core.response.notebook_utils import display_response
import nest_asyncio

# Allow nested event loops inside the notebook.
nest_asyncio.apply()

from llama_index.core.tools import QueryEngineTool, ToolMetadata

# `vector_index` and `summary_index` are built outside this section.
vector_tool = QueryEngineTool(
    vector_index.as_query_engine(),
    metadata=ToolMetadata(
        name="vector_search",
        description="Useful for searching for specific facts.",
    ),
)

summary_tool = QueryEngineTool(
    summary_index.as_query_engine(response_mode="tree_summarize"),
    metadata=ToolMetadata(
        name="summary",
        description="Useful for summarizing an entire document.",
    ),
)

from llama_index.core.query_engine import SubQuestionQueryEngine

query_engine = SubQuestionQueryEngine.from_defaults(
    [vector_tool, summary_tool],
    verbose=True,
)

response = query_engine.query(
    "What was mentioned about Meta? How Does it differ from how OpenAI is talked about?"
)
display_response(response)
```

Result analysis: three sub-questions were generated correctly and a Final Response was produced.

```
[vector_search] Q: What are the key points mentioned about Meta in documents?
Batches: 0%| | 0/1 [00:00<?, ?it/s]
[vector_search] A: 1. Meta is building large language models (LLMs) and generative AI, similar to OpenAI.
[vector_search] Q: What are the key points mentioned about OpenAI in documents?
Batches: 0%| | 0/1 [00:00<?, ?it/s]
[vector_search] A: 1. OpenAI announced the latest updates for ChatGPT, including a feature that allows users to interact with its large language model via voice.
[summary] Q: How does Meta differ from OpenAI in terms of mentioned facts?
[summary] A: Meta and OpenAI differ in their approach and applications of artificial intelligence (AI) based on the mentioned facts. OpenAI primarily...
Final Response: Meta is involved in the creation of large language models and generative AI, similar to OpenAI,...
```

Query routing

```python
from llama_index.core.response.notebook_utils import display_response
from llama_index.core.tools import QueryEngineTool, ToolMetadata

vector_tool = QueryEngineTool(
    vector_index.as_query_engine(),
    metadata=ToolMetadata(
        name="vector_search",
        description="Useful for searching for specific facts.",
    ),
)

summary_tool = QueryEngineTool(
    summary_index.as_query_engine(response_mode="tree_summarize"),
    metadata=ToolMetadata(
        name="summary",
        description="Useful for summarizing an entire document.",
    ),
)

from llama_index.core.query_engine import RouterQueryEngine

# select_multi=True lets the router pick more than one tool per query.
query_engine = RouterQueryEngine.from_defaults(
    [vector_tool, summary_tool],
    select_multi=True,
)

response = query_engine.query(
    "What was mentioned about Meta? Summarize with any other companies mentioned in the entire document."
)
display_response(response)
```
Result analysis: the query was not routed to the vector_search tool.

```
INFO:llama_index.core.query_engine.router_query_engine:Selecting query engine 1: Useful for summarizing an entire document, which is needed to provide a summary about Meta and any other companies mentioned..
Final Response: Meta, a company in the entertainment business, is developing its own uses for generative AI and voices, as revealed on Wednesday. They unveiled 28 personality-driven chatbots to be used in Meta's messaging apps, with celebrities like Charli D'Amelio
```
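With `select_multi=True`, a question that asks both for specific facts and for a whole-document summary could reasonably be routed to both tools. The following is a small sketch for checking this from the log text: it extracts the engine indices from entries of the form `Selecting query engine N: ...`, the format seen in the log above, and reports which tools were never picked. `selected_engines` and `missing_tools` are hypothetical helpers, and mapping index 0 to `vector_search` and index 1 to `summary` is an assumption based on the order in which the tools were passed to the router.

```python
import re

# Assumed to match the order of [vector_tool, summary_tool] given to the router.
TOOL_NAMES = ["vector_search", "summary"]


def selected_engines(log_text: str) -> list:
    """Return the engine indices named in 'Selecting query engine N:' entries."""
    pattern = r"Selecting\s*query\s*engine\s*(\d+)\s*:"
    return [int(index) for index in re.findall(pattern, log_text)]


def missing_tools(log_text: str, tool_names: list) -> list:
    """Return the names of the tools that the router never selected."""
    chosen = set(selected_engines(log_text))
    return [name for i, name in enumerate(tool_names) if i not in chosen]


# Log entry taken from the routing run.
log_text = (
    "Selecting query engine 1: Useful for summarizing an entire document, "
    "which is needed to provide a summary about Meta and any other "
    "companies mentioned.."
)

print(selected_engines(log_text))
print(missing_tools(log_text, TOOL_NAMES))
```

Running a check like this over several mixed questions gives a rough count of how often the model selects only one of the tools.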
https://docs.llamaindex.ai/en/latest/examples/benchmarks/phi-3-mini-4k-instruct/