
Cross Encoder Reranker



  • Author:

  • Peer Review:

  • Proofread :

  • This is a part of

Overview

The Cross Encoder Reranker is a technique designed to enhance the performance of Retrieval-Augmented Generation (RAG) systems. This guide explains how to implement a reranker using Hugging Face's Cross Encoders to refine the ranking of retrieved documents, promoting those most relevant to a query.

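Table of Contents

  • Overview
  • Environment Setup
  • Key Features and Mechanism
  • Practical Applications
  • Implementation
  • Key Advantages of Reranker
  • Document Count Settings for Reranker
  • Trade-offs When Using a Reranker

References

  • Hugging Face Cross Encoders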


Environment Setup

[Note]

  • langchain-opentutorial is a package that provides easy-to-use environment setup helpers, useful functions, and utilities for tutorials.

# Set environment variables
from langchain_opentutorial import set_env

set_env(
    {
        "OPENAI_API_KEY": "",
    }
)

Alternatively, you can set OPENAI_API_KEY in a .env file and load it.

[Note] This is not necessary if you've already set OPENAI_API_KEY in previous steps.

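# Configuration file to manage API keys as environment variables
from dotenv import load_dotenv

# Load API key information
load_dotenv(override=True)

If langchain-opentutorial is not installed yet, run the following in its own notebook cell before the imports above:

%%capture --no-stderr
%pip install langchain-opentutorial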

Key Features and Mechanism

Purpose

  • Re-rank retrieved documents so that the results most relevant to the query come first.

Structure

  • Accepts both the query and document as a single input pair, enabling joint processing.

Mechanism

  • Single Input Pair : Processes the query and document as a combined input to output a relevance score directly.

  • Self-Attention Mechanism : Uses self-attention to jointly analyze the query and document, effectively capturing their semantic relationship.

Advantages

  • Higher Accuracy : Provides more precise relevance scores than comparing separately computed query and document embeddings.

  • Deep Contextual Analysis : Explores semantic nuances between query and document.

Limitations

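  • High Computational Costs : Each query-document pair requires its own forward pass through the model, so reranking time grows with the number of candidates.

  • Scalability Issues : Relevance scores depend on the query and cannot be precomputed, so scoring a large document collection directly is impractical without a first-stage retriever or other optimization.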
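To make the cost concrete, here is a back-of-the-envelope estimate. It is a sketch only: the batch size and per-batch latency are made-up illustrative numbers, not measurements of any particular model or hardware.

```python
import math


def rerank_latency_ms(num_candidates, ms_per_batch, batch_size):
    """Estimate reranking latency for one query.

    Assumes the model scores `batch_size` pairs per forward pass and that
    each pass takes `ms_per_batch` milliseconds (both hypothetical values).
    """
    num_batches = math.ceil(num_candidates / batch_size)
    return num_batches * ms_per_batch


# Reranking 10 retrieved candidates vs. scoring a 100,000-document corpus directly.
top_k_cost = rerank_latency_ms(num_candidates=10, ms_per_batch=40, batch_size=16)
full_corpus_cost = rerank_latency_ms(num_candidates=100_000, ms_per_batch=40, batch_size=16)

print(f"top-10 rerank: {top_k_cost} ms")
print(f"full corpus:   {full_corpus_cost / 1000:.0f} s")
```

Under these assumed numbers, reranking a short candidate list fits in a single batch, while scoring the whole corpus for every query would take minutes. This is why a cross encoder is normally applied only after a cheaper retrieval step.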

Practical Applications

  • A Bi-Encoder quickly retrieves candidate documents by computing lightweight similarity scores.

  • A Cross Encoder refines these results by deeply analyzing the semantic relationship between the query and the retrieved documents.
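The two-stage pattern can be sketched in plain Python. This is a toy illustration only: `embed`, `bi_encoder_score`, and `cross_encoder_score` are hypothetical stand-ins (bag-of-words cosine similarity and a word-order-aware overlap count), not real neural models. They only mimic the split between a cheap first pass and a more expensive pairwise pass.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (toy assumption)."""
    return text.lower().split()


def embed(text):
    """Stage-1 stand-in: encode a text on its own as a bag-of-words vector."""
    return Counter(tokenize(text))


def bi_encoder_score(query_vec, doc_vec):
    """Cosine similarity between two independently computed vectors."""
    dot = sum(query_vec[t] * doc_vec[t] for t in query_vec)
    norm = math.sqrt(sum(v * v for v in query_vec.values())) * math.sqrt(
        sum(v * v for v in doc_vec.values())
    )
    return dot / norm if norm else 0.0


def cross_encoder_score(query, doc):
    """Stage-2 stand-in: reads query and document together.

    Counts query bigrams that also occur in the document, so word order
    matters, which the bag-of-words score cannot see.
    """
    q, d = tokenize(query), tokenize(doc)
    doc_bigrams = set(zip(d, d[1:]))
    return sum(1 for bigram in zip(q, q[1:]) if bigram in doc_bigrams)


def retrieve_then_rerank(query, docs, k=3, top_n=1):
    # Document vectors do not depend on the query, so they can be precomputed.
    doc_vecs = [embed(d) for d in docs]
    q_vec = embed(query)
    candidates = sorted(
        range(len(docs)),
        key=lambda i: bi_encoder_score(q_vec, doc_vecs[i]),
        reverse=True,
    )[:k]
    # Only the k candidates go through the pairwise scorer.
    reranked = sorted(
        candidates, key=lambda i: cross_encoder_score(query, docs[i]), reverse=True
    )
    return [docs[i] for i in reranked[:top_n]]


docs = [
    "vector word maps",
    "word vector models map words to vectors",
    "sql is a query language",
]
first_stage_only = retrieve_then_rerank("word vector", docs, k=1, top_n=1)
result = retrieve_then_rerank("word vector", docs, k=2, top_n=1)
print(first_stage_only)
print(result)
```

In this toy data the first stage alone prefers the short document that merely contains both words, while the pairwise stage promotes the document where the words appear together in the query's order.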

Implementation

  • Use Hugging Face cross encoders or BAAI/bge-reranker models.

  • Easily integrate with frameworks like LangChain through the CrossEncoderReranker component.

Key Advantages of Reranker

  • Precise Similarity Scoring : Delivers highly accurate measurements of relevance between the query and documents.

  • Semantic Depth : Analyzes deeper semantic relationships, uncovering nuances in query-document interactions.

  • Refined Search Quality : Improves the relevance and quality of the retrieved documents.

  • RAG System Boost : Enhances the performance of Retrieval-Augmented Generation (RAG) systems by refining input relevance.

  • Seamless Integration : Easily adaptable to various workflows and compatible with multiple frameworks.

  • Model Versatility : Offers flexibility with a wide range of pre-trained models for tailored use cases.

Document Count Settings for Reranker

  • Reranking is generally performed on the top 5–10 documents retrieved during the initial search.

  • The ideal number of documents for reranking should be determined through experimentation and evaluation, as it depends on the dataset characteristics and computational resources available.
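One way to run that experiment is to sweep the number of candidates passed to the reranker and track a quality metric alongside the cost. The sketch below uses hit rate on a tiny hand-made evaluation set. The data, the `hit_rate_at` helper, and the candidate counts are all illustrative assumptions.

```python
def hit_rate_at(k, ranked_ids_per_query, relevant_id_per_query):
    """Fraction of queries whose relevant document appears in the top-k candidates."""
    hits = sum(
        1
        for ranked, relevant in zip(ranked_ids_per_query, relevant_id_per_query)
        if relevant in ranked[:k]
    )
    return hits / len(relevant_id_per_query)


# Hypothetical first-stage rankings (document ids, best first) for four queries.
ranked_ids_per_query = [
    ["d1", "d7", "d3", "d9", "d2"],
    ["d4", "d2", "d8", "d1", "d6"],
    ["d5", "d3", "d9", "d7", "d1"],
    ["d6", "d8", "d2", "d4", "d3"],
]
# Hypothetical ground truth: the one relevant document for each query.
relevant_id_per_query = ["d1", "d8", "d1", "d5"]

sweep = {k: hit_rate_at(k, ranked_ids_per_query, relevant_id_per_query) for k in (1, 3, 5)}
for k, rate in sweep.items():
    print(f"k={k}: hit rate {rate:.2f} (pairs to score per query: {k})")
```

A reranker can only promote documents that the first stage retrieved, so the candidate count caps the recall it can reach. A reasonable stopping point is where the hit rate plateaus or the latency budget runs out.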

Trade-offs When Using a Reranker

  • Accuracy vs Processing Time : Striking a balance between achieving higher accuracy and minimizing processing time.

  • Performance Improvement vs Computational Cost : Weighing the benefits of improved performance against the additional computational resources required.

  • Search Speed vs Relevance Accuracy : Managing the trade-off between faster retrieval and maintaining high relevance in results.

  • System Requirements : Ensuring the system meets the necessary hardware and software requirements to support reranking.

  • Dataset Characteristics : Considering the scale, diversity, and specific attributes of the dataset to optimize reranker performance.

Explaining the Implementation of Cross Encoder Reranker with a Simple Example

# Helper function to format and print document content
def pretty_print_docs(docs):
    # Print each document in the list with a separator between them
    print(
        f"\n{'-' * 100}\n".join(  # Separator line for better readability
            [f"Document {i+1}:\n\n" + d.page_content for i, d in enumerate(docs)]  # Format: Document number + content
        )
    )

from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load documents
documents = TextLoader("./data/appendix-keywords.txt").load()

# Configure text splitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)

# Split documents into chunks
texts = text_splitter.split_documents(documents)

# Set up the embedding model
embeddings_model = HuggingFaceEmbeddings(
    model_name="sentence-transformers/msmarco-distilbert-dot-v5",
    model_kwargs={"tokenizer_kwargs": {"clean_up_tokenization_spaces": False}},
)

# Create FAISS index from documents and set up retriever
retriever = FAISS.from_documents(texts, embeddings_model).as_retriever(
    search_kwargs={"k": 10}
)

# Define the query
query = "Can you tell me about Word2Vec?"

# Execute the query and retrieve results
docs = retriever.invoke(query)

# Display the retrieved documents
pretty_print_docs(docs)

    Document 1:
    
    Word2Vec
    Definition: Word2Vec is a technique in NLP that maps words to a vector space, representing their semantic relationships based on context.
    Example: In a Word2Vec model, "king" and "queen" are represented by vectors located close to each other.
    Related Keywords: Natural Language Processing (NLP), Embedding, Semantic Similarity
    ----------------------------------------------------------------------------------------------------
    Document 2:
    
    Token
    Definition: A token refers to a smaller unit of text obtained by splitting a larger piece of text. It can be a word, phrase, or sentence.
    Example: The sentence "I go to school" can be tokenized into "I," "go," "to," and "school."
    Related Keywords: Tokenization, Natural Language Processing (NLP), Syntax Analysis
    ----------------------------------------------------------------------------------------------------
    Document 3:
    
    Example: A customer information table in a relational database is an example of structured data.
    Related Keywords: Database, Data Analysis, Data Modeling
    ----------------------------------------------------------------------------------------------------
    Document 4:
    
    Schema
    Definition: A schema defines the structure of a database or file, detailing how data is organized and stored.
    Example: A relational database schema specifies column names, data types, and key constraints.
    Related Keywords: Database, Data Modeling, Data Management
    ----------------------------------------------------------------------------------------------------
    Document 5:
    
    Keyword Search
    Definition: Keyword search involves finding information based on user-inputted keywords, commonly used in search engines and database systems.
    Example: Searching 
    When a user searches for "coffee shops in Seoul," the system returns a list of relevant coffee shops.
    Related Keywords: Search Engine, Data Search, Information Retrieval
    ----------------------------------------------------------------------------------------------------
    Document 6:
    
    TF-IDF (Term Frequency-Inverse Document Frequency)
    Definition: TF-IDF is a statistical measure used to evaluate the importance of a word within a document by considering its frequency and rarity across a corpus.
    Example: Words with high TF-IDF values are often unique and critical for understanding the document.
    Related Keywords: Natural Language Processing (NLP), Information Retrieval, Data Mining
    ----------------------------------------------------------------------------------------------------
    Document 7:
    
    SQL
    Definition: SQL (Structured Query Language) is a programming language for managing data in databases. 
    It allows you to perform various operations such as querying, updating, inserting, and deleting data.
    Example: SELECT * FROM users WHERE age > 18; retrieves information about users aged above 18.
    Related Keywords: Database, Query, Data Management
    ----------------------------------------------------------------------------------------------------
    Document 8:
    
    Open Source
    Definition: Open source software allows its source code to be freely used, modified, and distributed, fostering collaboration and innovation.
    Example: The Linux operating system is a well-known open source project.
    Related Keywords: Software Development, Community, Technical Collaboration
    Structured Data
    Definition: Structured data is organized according to a specific format or schema, making it easy to search and analyze.
    ----------------------------------------------------------------------------------------------------
    Document 9:
    
    Semantic Search
    Definition: Semantic search is a search technique that understands the meaning of a user's query beyond simple keyword matching, returning results that are contextually relevant.
    Example: If a user searches for "planets in the solar system," the system provides information about planets like Jupiter and Mars.
    Related Keywords: Natural Language Processing (NLP), Search Algorithms, Data Mining
    ----------------------------------------------------------------------------------------------------
    Document 10:
    
    GPT (Generative Pretrained Transformer)
    Definition: GPT is a generative language model pre-trained on vast datasets, capable of performing various text-based tasks. It generates natural and coherent text based on input.
    Example: A chatbot generating detailed answers to user queries is powered by GPT models.
    Related Keywords: Natural Language Processing (NLP), Text Generation, Deep Learning

Now, let's wrap the base retriever (the retriever object created above) with a ContextualCompressionRetriever. The CrossEncoderReranker uses HuggingFaceCrossEncoder to re-rank the retrieved results and keeps only the top_n highest-scoring documents.

from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import CrossEncoderReranker
from langchain_community.cross_encoders import HuggingFaceCrossEncoder

# Initialize the model
model = HuggingFaceCrossEncoder(model_name="BAAI/bge-reranker-v2-m3")

# Select the top 3 documents
compressor = CrossEncoderReranker(model=model, top_n=3)

# Initialize the contextual compression retriever
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor, base_retriever=retriever
)

# Retrieve compressed documents
compressed_docs = compression_retriever.invoke("Can you tell me about Word2Vec?")

# Display the documents
pretty_print_docs(compressed_docs)

    Document 1:
    
    Word2Vec
    Definition: Word2Vec is a technique in NLP that maps words to a vector space, representing their semantic relationships based on context.
    Example: In a Word2Vec model, "king" and "queen" are represented by vectors located close to each other.
    Related Keywords: Natural Language Processing (NLP), Embedding, Semantic Similarity
    ----------------------------------------------------------------------------------------------------
    Document 2:
    
    Open Source
    Definition: Open source software allows its source code to be freely used, modified, and distributed, fostering collaboration and innovation.
    Example: The Linux operating system is a well-known open source project.
    Related Keywords: Software Development, Community, Technical Collaboration
    Structured Data
    Definition: Structured data is organized according to a specific format or schema, making it easy to search and analyze.
    ----------------------------------------------------------------------------------------------------
    Document 3:
    
    TF-IDF (Term Frequency-Inverse Document Frequency)
    Definition: TF-IDF is a statistical measure used to evaluate the importance of a word within a document by considering its frequency and rarity across a corpus.
    Example: Words with high TF-IDF values are often unique and critical for understanding the document.
    Related Keywords: Natural Language Processing (NLP), Information Retrieval, Data Mining
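
The reranked list above does not show the scores that produced it. One way to see them is to score each (query, document) pair yourself and sort, which is roughly what a cross-encoder reranker does internally. The sketch below assumes only that the model object exposes a `score(pairs)` method returning one float per pair, as `HuggingFaceCrossEncoder` does. `FakeCrossEncoder`, `Doc`, and `rerank_with_scores` are hypothetical names introduced so the example runs without downloading a model.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Minimal stand-in for a LangChain Document (only page_content is used)."""

    page_content: str


class FakeCrossEncoder:
    """Hypothetical scorer: counts query words that occur in the document."""

    def score(self, text_pairs):
        return [
            float(sum(word in doc.lower().split() for word in query.lower().split()))
            for query, doc in text_pairs
        ]


def rerank_with_scores(model, query, docs, top_n=3):
    """Score every (query, document) pair, then keep the top_n highest-scoring docs."""
    scores = model.score([(query, d.page_content) for d in docs])
    ranked = sorted(zip(docs, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]


docs = [
    Doc("SQL is a language for managing data"),
    Doc("Word2Vec maps words to a vector space"),
    Doc("A token is a unit of text"),
]
ranked = rerank_with_scores(FakeCrossEncoder(), "tell me about word2vec", docs, top_n=2)
for doc, score in ranked:
    print(f"{score:.1f}  {doc.page_content}")
```

With the tutorial's objects, the same helper could be called with the `HuggingFaceCrossEncoder` instance and the documents returned by `retriever.invoke(query)`. Printing the scores is useful when a reranked result looks surprising.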

Set up the environment. You may refer to Environment Setup for more details.

You can check out langchain-opentutorial for more details.

Multilingual Support BGE Reranker: bge-reranker-v2-m3

Contributors: JeongHo Shin, JaeJun Shim
Part of: LangChain Open Tutorial