FlashRank is an ultra-lightweight and ultra-fast Python library designed to add reranking to existing search and retrieval pipelines. It is based on state-of-the-art (SoTA) cross-encoders.
This notebook shows how to use FlashrankRerank within the LangChain framework, applying reranking to improve the quality of search and retrieval results. It provides practical code examples for integrating FlashRank into a LangChain pipeline, focusing on reranking retrieved documents efficiently and at scale.
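Before wiring FlashRank into LangChain, it helps to see what reranking does in isolation: a cross-encoder scores each (query, document) pair jointly, and the candidates are reordered by that score. The sketch below illustrates only the control flow; `overlap_score` is a toy word-overlap stand-in for a real cross-encoder model, not part of FlashRank.

```python
def rerank(query, docs, score_fn, top_k=3):
    """Score each (query, doc) pair and return the top_k docs by score."""
    scored = [(score_fn(query, doc), doc) for doc in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def overlap_score(query, doc):
    """Toy relevance score: word overlap between query and document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


docs = [
    "Word2Vec learns dense word embeddings",
    "FAISS is a vector similarity search library",
    "Embeddings map words to vectors and Word2Vec is one method",
]
print(rerank("Word2Vec word embeddings", docs, overlap_score, top_k=2))
```

A real reranker such as FlashRank replaces `overlap_score` with a cross-encoder that reads the query and document together, which is slower per pair than bi-encoder retrieval but far more accurate, hence the retrieve-then-rerank pattern used below.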
def pretty_print_docs(docs):
    print(
        f"\n{'-' * 100}\n".join(
            [
                f"Document {i+1}:\n\n{d.page_content}\nMetadata: {d.metadata}"
                for i, d in enumerate(docs)
            ]
        )
    )
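As a quick sanity check, `pretty_print_docs` only needs objects that expose `page_content` and `metadata` attributes, which LangChain's `Document` provides. A self-contained sketch with a minimal stand-in class (the helper is repeated here so the snippet runs on its own):

```python
from dataclasses import dataclass, field


# Same helper as above, repeated so this snippet runs standalone.
def pretty_print_docs(docs):
    print(
        f"\n{'-' * 100}\n".join(
            f"Document {i+1}:\n\n{d.page_content}\nMetadata: {d.metadata}"
            for i, d in enumerate(docs)
        )
    )


# Minimal stand-in for langchain_core.documents.Document.
@dataclass
class Doc:
    page_content: str
    metadata: dict = field(default_factory=dict)


pretty_print_docs([
    Doc("Word2Vec maps words to dense vectors.", {"id": 0}),
    Doc("FAISS supports fast similarity search.", {"id": 1}),
])
```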
FlashrankRerank
Load data for a simple example and create a retriever.
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
# Load the documents
documents = TextLoader("./data/appendix-keywords.txt").load()
# Initialize the text splitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)
# Split the documents
texts = text_splitter.split_documents(documents)
# Add a unique ID to each text
for idx, text in enumerate(texts):
    text.metadata["id"] = idx
# Initialize the retriever
retriever = FAISS.from_documents(
    texts, OpenAIEmbeddings()
).as_retriever(search_kwargs={"k": 10})
# query
query = "Tell me about Word2Vec"
# Search for documents
docs = retriever.invoke(query)
# Print the retrieved documents
pretty_print_docs(docs)
Now, let's wrap the base retriever with a ContextualCompressionRetriever and use FlashrankRerank as the compressor.
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import FlashrankRerank
from langchain_openai import ChatOpenAI
# Initialize the LLM
llm = ChatOpenAI(temperature=0)
# Initialize FlashrankRerank
compressor = FlashrankRerank(model="ms-marco-MultiBERT-L-12")
# Initialize the ContextualCompressionRetriever
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor, base_retriever=retriever
)
# Search for compressed documents
compressed_docs = compression_retriever.invoke(
    "Tell me about Word2Vec."
)
# Print the document IDs
print([doc.metadata["id"] for doc in compressed_docs])
Compare the results after the reranker is applied.
# Print the results of document compressions
pretty_print_docs(compressed_docs)