{'query':'What did the president say about Ketanji Brown Jackson',
'result':" The president speaks highly of Ketanji Brown Jackson, stating that she is one of the nation's top legal minds, and will continue the legacy of excellence of Justice Breyer. The president also mentions that he worked with her family and that she comes from a family of public school educators and police officers. Since her nomination, she has received support from various groups, including the Fraternal Order of Police and judges from both major political parties. \n\nWould you like me to extract another sentence from the provided text? "}
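bge-reranker (Base/Large)
These models come from the Beijing Academy of Artificial Intelligence (BAAI) and are open source (Apache 2.0 license). They are Transformer-based, work like cross-encoders, and are designed specifically for reranking tasks. They are available in different sizes, such as Base and Large.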
compressed_docs = compression_retriever.invoke("What is the plan for the economy?")
pretty_print_docs(compressed_docs)
Output:
Document 1:
More infrastructure and innovation in America. More goods moving faster and cheaper in America. More jobs where you can earn a good living in America. And instead of relying on foreign supply chains, let’s make it in America. Economists call it “increasing the productive capacity of our economy.” I call it building a better America. My plan to fight inflation will lower your costs and lower the deficit.
Second – cut energy costs for families an average of $500 a year by combatting climate change.
Let’s provide investments and tax credits to weatherize your homes and businesses to be energy efficient and you get a tax credit; double America’s clean energy production in solar, wind, and so much more; lower the price of electric vehicles, saving you another $80 a month because you’ll never have to pay at the gas pump again.
Look at cars. Last year, there weren’t enough semiconductors to make all the cars that people wanted to buy. And guess what, prices of automobiles went up. So—we have a choice. One way to fight inflation is to drive down wages and make Americans poorer. I have a better plan to fight inflation. Lower your costs, not your wages. Make more cars and semiconductors in America. More infrastructure and innovation in America. More goods moving faster and cheaper in America.
Voyage Rerank
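Voyage AI offers proprietary neural network models (for example rerank-2 and rerank-2-lite) that are accessed through an API. These models are most likely finely tuned, advanced cross-encoders aimed at the highest possible relevance scoring.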
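import os

from langchain.retrievers import ContextualCompressionRetriever
from langchain_openai import OpenAI
from langchain_voyageai import VoyageAIRerank

# `retriever` and `pretty_print_docs` are defined earlier in this article.
llm = OpenAI(temperature=0)

# Keep only the 3 highest-scoring documents after reranking.
compressor = VoyageAIRerank(
    model="rerank-lite-1",
    voyageai_api_key=os.environ["VOYAGE_API_KEY"],
    top_k=3,
)
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor, base_retriever=retriever
)
compressed_docs = compression_retriever.invoke(
    "What did the president say about Ketanji Jackson Brown"
)
pretty_print_docs(compressed_docs)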
Output:
Document 1:
One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.
So let’s not abandon our streets. Or choose between safety and equal justice. Let’s come together to protect our communities, restore trust, and hold law enforcement accountable. That’s why the Justice Department required body cameras, banned chokeholds, and restricted no-knock warrants for its officers.
I spoke with their families and told them that we are forever in debt for their sacrifice, and we will carry on their mission to restore the trust and safety every community deserves.
I’ve worked on these issues a long time.
I know what works: Investing in crime prevention and community police officers who’ll walk the beat, who’ll know the neighborhood, and who can restore trust and safety.
So let’s not abandon our streets. Or choose between safety and equal justice.
Jina Reranker
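Jina provides reranking solutions that include neural models such as Jina Reranker v2 and Jina-ColBERT. Jina Reranker v2 is most likely a cross-encoder-style model. Jina-ColBERT implements the ColBERT architecture on top of Jina's base model (covered in more detail below).
Key features: Jina offers affordable, high-performing options. Its standout feature is that Jina-ColBERT can handle very long documents, supporting context lengths of up to 8,000 tokens. This reduces the need to split long texts into many chunks. Open-source components are also part of the Jina ecosystem.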
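To make the effect of a long context window concrete, here is a minimal sketch that counts how many windows are needed to cover one document. The 6,000-token document, the 512-token window, and the 64-token overlap are illustrative assumptions, and `chunks_needed` is a hypothetical helper, not part of any Jina library.

```python
import math


def chunks_needed(num_tokens: int, max_context: int, overlap: int = 0) -> int:
    """Return how many windows of size max_context are needed to cover num_tokens."""
    if num_tokens <= 0:
        return 0
    if num_tokens <= max_context:
        return 1
    stride = max_context - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than max_context")
    # The first window covers max_context tokens; each later window adds `stride` new tokens.
    return 1 + math.ceil((num_tokens - max_context) / stride)


doc_tokens = 6000  # hypothetical document length, in tokens

# Assumed short-context reranker: 512-token window with a 64-token overlap.
short_ctx = chunks_needed(doc_tokens, max_context=512, overlap=64)

# Long-context reranker: the 8,000-token limit quoted above.
long_ctx = chunks_needed(doc_tokens, max_context=8000)

print(f"short context: {short_ctx} chunks, long context: {long_ctx} chunk(s)")
```

With a long window the whole document is scored in one pass, so no relevance signal is lost at chunk boundaries.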
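from langchain.retrievers import ContextualCompressionRetriever
from langchain_community.document_compressors import JinaRerank

# JinaRerank reads the API key from the JINA_API_KEY environment variable.
# `retriever` and `pretty_print_docs` are defined earlier in this article.
compressor = JinaRerank()
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor, base_retriever=retriever
)
# invoke() replaces the deprecated get_relevant_documents().
compressed_docs = compression_retriever.invoke(
    "What did the president say about Ketanji Jackson Brown"
)
pretty_print_docs(compressed_docs)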
Output:
Document 1:
So let’s not abandon our streets. Or choose between safety and equal justice. Let’s come together to protect our communities, restore trust, and hold law enforcement accountable. That’s why the Justice Department required body cameras, banned chokeholds, and restricted no-knock warrants for its officers.
I spoke with their families and told them that we are forever in debt for their sacrifice, and we will carry on their mission to restore the trust and safety every community deserves. I’ve worked on these issues a long time. I know what works: Investing in crime prevention and community police officers who’ll walk the beat, who’ll know the neighborhood, and who can restore trust and safety. So let’s not abandon our streets. Or choose between safety and equal justice.
Document(page_content='In June 1985, Miyazaki, Takahata, Tokuma and Suzuki founded the animation production company Studio Ghibli, with funding from Tokuma Shoten. Studio Ghibli\'s first film, Laputa: Castle in the Sky (1986), employed the same production crew of Nausicaä. Miyazaki\'s designs for the film\'s setting were inspired by Greek architecture and "European urbanistic templates". Some of the architecture in the film was also inspired by a Welsh mining town; Miyazaki witnessed the mining strike upon his first', metadata={'relevance_score': 26.5194149017334})
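Ideal use cases: applications that need fast reranking on resource-constrained hardware (such as CPUs or edge devices), high-volume search systems where latency is critical, and projects that want a simple, "better than nothing" reranking step with minimal complexity.
Example code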
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import FlashrankRerank
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(temperature=0)

compressor = FlashrankRerank()
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor, base_retriever=retriever
)
compressed_docs = compression_retriever.invoke(
    "What did the president say about Ketanji Jackson Brown"
)
print([doc.metadata["id"] for doc in compressed_docs])
pretty_print_docs(compressed_docs)
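This snippet uses the FlashrankRerank compressor inside a ContextualCompressionRetriever to improve the relevance of the retrieved documents. It reranks the documents fetched by the base retriever (the `retriever` variable) according to their relevance to the query "What did the president say about Ketanji Jackson Brown". Finally, it prints the document IDs followed by the compressed, reranked documents.
Output: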
[0, 5, 3]
Document 1:
One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.
----------------------------------------------------------------------------------------------------
Document 2:
He met the Ukrainian people. From President Zelenskyy to every Ukrainian, their fearlessness, their courage, their determination, inspires the world. Groups of citizens blocking tanks with their bodies. Everyone from students to retirees teachers turned soldiers defending their homeland. In this struggle as President Zelenskyy said in his speech to the European Parliament “Light will win over darkness.” The Ukrainian Ambassador to the United States is here tonight.
----------------------------------------------------------------------------------------------------
Document 3:
And tonight, I’m announcing that the Justice Department will name a chief prosecutor for pandemic fraud. By the end of this year, the deficit will be down to less than half what it was before I took office. The only president ever to cut the deficit by more than one trillion dollars in a single year. Lowering your costs also means demanding more competition. I’m a capitalist, but capitalism without competition isn’t capitalism. It’s exploitation—and it drives up prices.
The output shows that the retrieved chunks are reranked according to their relevance to the query.
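The retrieve-then-rerank flow used in the FlashRank example can be sketched in plain Python. This is only an illustration of the control flow, not LangChain's or FlashRank's implementation: `token_overlap_score` is a toy stand-in for a real reranking model, and `rerank` and the sample passages are hypothetical.

```python
from typing import Callable, List, Tuple


def token_overlap_score(query: str, text: str) -> float:
    """Toy relevance score: fraction of query words that also appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & text_words) / len(query_words)


def rerank(
    query: str,
    docs: List[str],
    score_fn: Callable[[str, str], float] = token_overlap_score,
    top_n: int = 3,
) -> List[Tuple[int, float]]:
    """Score every candidate against the query and keep the top_n (doc_id, score) pairs."""
    scored = [(doc_id, score_fn(query, text)) for doc_id, text in enumerate(docs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Candidates as a base retriever might return them (hypothetical passages).
candidates = [
    "The deficit will be down by the end of this year.",
    "I nominated Judge Ketanji Brown Jackson to the Supreme Court.",
    "Ketanji Brown Jackson is one of our nation's top legal minds.",
    "Make more cars and semiconductors in America.",
]
query = "what did the president say about ketanji brown jackson"

top = rerank(query, candidates, top_n=2)
ids = [doc_id for doc_id, _ in top]
print(ids)
```

The printed list plays the same role as the document ID list in the output above: the positions of the original candidates, ordered by the reranker's score.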
from mxbai_rerank import MxbaiRerankV2

# Load the model, here we use our base sized model
model = MxbaiRerankV2("mixedbread-ai/mxbai-rerank-base-v2")

# Example query and documents
query = "Who wrote To Kill a Mockingbird?"
documents = [
    "To Kill a Mockingbird is a novel by Harper Lee published in 1960. It was immediately successful, winning the Pulitzer Prize, and has become a classic of modern American literature.",
    "The novel Moby-Dick was written by Herman Melville and first published in 1851. It is considered a masterpiece of American literature and deals with complex themes of obsession, revenge, and the conflict between good and evil.",
    "Harper Lee, an American novelist widely known for her novel To Kill a Mockingbird, was born in 1926 in Monroeville, Alabama. She received the Pulitzer Prize for Fiction in 1961.",
    "Jane Austen was an English novelist known primarily for her six major novels, which interpret, critique and comment upon the British landed gentry at the end of the 18th century.",
    "The Harry Potter series, which consists of seven fantasy novels written by British author J.K. Rowling, is among the most popular and critically acclaimed books of the modern era.",
    "The Great Gatsby, a novel written by American author F. Scott Fitzgerald, was published in 1925. The story is set in the Jazz Age and follows the life of millionaire Jay Gatsby and his pursuit of Daisy Buchanan.",
]

# Calculate the scores
results = model.rank(query, documents)
print(results)
Output:
[RankResult(index=0, score=9.847987174987793, document='To Kill a Mockingbird is a novel by Harper Lee published in 1960. It was immediately successful, winning the Pulitzer Prize, and has become a classic of modern American literature.'),
RankResult(index=2, score=8.258672714233398, document='Harper Lee, an American novelist widely known for her novel To Kill a Mockingbird, was born in 1926 in Monroeville, Alabama. She received the Pulitzer Prize for Fiction in 1961.'),
RankResult(index=3, score=3.579845428466797, document='Jane Austen was an English novelist known primarily for her six major novels, which interpret, critique and comment upon the British landed gentry at the end of the 18th century.'),
RankResult(index=4, score=2.716982841491699, document='The Harry Potter series, which consists of seven fantasy novels written by British author J.K. Rowling, is among the most popular and critically acclaimed books of the modern era.'),
RankResult(index=1, score=2.233165740966797, document='The novel Moby-Dick was written by Herman Melville and first published in 1851. It is considered a masterpiece of American literature and deals with complex themes of obsession, revenge, and the conflict between good and evil.'),
RankResult(index=5, score=1.8150043487548828, document='The Great Gatsby, a novel written by American author F. Scott Fitzgerald, was published in 1925. The story is set in the Jazz Age and follows the life of millionaire Jay Gatsby and his pursuit of Daisy Buchanan.')]
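Each result above carries the position of the document in the original list (`index`) and a relevance `score`, so a typical next step is to keep only the strongest matches before passing them to an LLM. The sketch below assumes a stand-in `RankedItem` dataclass that mirrors the fields shown in the output (it is not the library's own class), reuses the scores printed above in rounded form, and applies an arbitrary cut-off of 5.0 chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RankedItem:
    """Stand-in for one rerank result: original position, score, and text."""
    index: int
    score: float
    document: str


def select_context(results: List[RankedItem], min_score: float, top_k: int) -> List[RankedItem]:
    """Keep results scoring at least min_score, best first, capped at top_k."""
    kept = [r for r in results if r.score >= min_score]
    kept.sort(key=lambda r: r.score, reverse=True)
    return kept[:top_k]


# Scores rounded from the output above; document text shortened for readability.
results = [
    RankedItem(0, 9.85, "To Kill a Mockingbird is a novel by Harper Lee ..."),
    RankedItem(2, 8.26, "Harper Lee, an American novelist ..."),
    RankedItem(3, 3.58, "Jane Austen was an English novelist ..."),
    RankedItem(4, 2.72, "The Harry Potter series ..."),
    RankedItem(1, 2.23, "The novel Moby-Dick was written by Herman Melville ..."),
    RankedItem(5, 1.82, "The Great Gatsby, a novel written by ..."),
]

selected = select_context(results, min_score=5.0, top_k=3)
selected_indices = [r.index for r in selected]
print(selected_indices)
```

A score threshold is only meaningful per model, since different rerankers produce scores on different scales; a fixed `top_k` is the safer default when the scale is unknown.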