Contextual Compression Retriever
Author: JoonHo Kim
Peer Review:
Proofread: jishin86
This is part of the LangChain Open Tutorial.
Overview
The ContextualCompressionRetriever in LangChain is a powerful tool designed to optimize the retrieval process by compressing retrieved documents based on context. This retriever is particularly useful in scenarios where large amounts of data need to be summarized or filtered dynamically, ensuring that only the most relevant information is passed to subsequent processing steps.
Key features of the ContextualCompressionRetriever include:
Context-Aware Compression: Documents are compressed based on the specific context or query, ensuring relevance and reducing redundancy.
Flexible Integration: Works seamlessly with other LangChain components, making it easy to integrate into existing pipelines.
Customizable Compression: Allows for the use of different compression techniques, including summary models and embedding-based methods, to tailor the retrieval process to your needs.
The ContextualCompressionRetriever is particularly suited for applications like:
Summarizing large datasets for Q&A systems.
Enhancing chatbot performance by providing concise and relevant responses.
Improving efficiency in document-heavy tasks like legal analysis or academic research.
By using this retriever, developers can significantly reduce computational overhead and improve the quality of information presented to end-users.

Environment Setup
Set up the environment. You may refer to Environment Setup for more details.
[Note]
langchain-opentutorial is a package that provides a set of easy-to-use environment setup helpers, useful functions, and utilities for tutorials. You can check out langchain-opentutorial for more details.
You can alternatively set OPENAI_API_KEY in .env file and load it.
[Note] This is not necessary if you've already set OPENAI_API_KEY in previous steps.
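Loading the key from a .env file can be sketched as follows. This assumes the python-dotenv package is installed and a .env file containing OPENAI_API_KEY exists in the working directory.

```python
# Load OPENAI_API_KEY from a .env file in the working directory.
# Assumes the python-dotenv package is installed.
from dotenv import load_dotenv
import os

load_dotenv(override=True)
assert os.environ.get("OPENAI_API_KEY"), "OPENAI_API_KEY is not set"
```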
The following function is used to display documents in a visually appealing format.
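A minimal sketch of such a helper is shown below; the name `pretty_print_docs` is illustrative, and any function that prints each document's `page_content` with a separator works.

```python
# A simple helper to print retrieved documents with visual separators.
# The name `pretty_print_docs` is illustrative, not fixed by the tutorial.
def pretty_print_docs(docs):
    print(
        f"\n{'-' * 100}\n".join(
            f"Document {i + 1}:\n\n{d.page_content}" for i, d in enumerate(docs)
        )
    )
```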
Basic Retriever Configuration
Let's start by initializing a simple vector store retriever and saving text documents in chunks. When a sample question is asked, you can see that the retriever returns one or two relevant documents along with a few irrelevant ones.
We will follow these steps to build the retriever:
Generate a loader to load the text file using TextLoader
Generate text chunks using CharacterTextSplitter, splitting the text into chunks of 300 characters with no overlap
Generate a vector store using FAISS and convert it to a retriever
Query the retriever to find relevant documents
Print the relevant documents
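The steps above can be sketched as follows. The file path and the embedding model are illustrative assumptions; an OpenAI API key must be configured.

```python
# A sketch of the basic retriever setup. The file path and embedding
# model are assumptions, not fixed by the tutorial.
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter

# 1. Load the text file
loader = TextLoader("./data/appendix-keywords.txt")

# 2. Split the text into 300-character chunks with no overlap
text_splitter = CharacterTextSplitter(chunk_size=300, chunk_overlap=0)
texts = loader.load_and_split(text_splitter)

# 3. Build a FAISS vector store and convert it to a retriever
retriever = FAISS.from_documents(texts, OpenAIEmbeddings()).as_retriever()

# 4. Query the retriever for relevant documents
docs = retriever.invoke("What is semantic search?")

# 5. Print the relevant documents
for i, doc in enumerate(docs, 1):
    print(f"Document {i}:\n{doc.page_content}\n")
```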
Contextual Compression
A DocumentCompressor created with LLMChainExtractor is applied to the base retriever, producing a ContextualCompressionRetriever.
ContextualCompressionRetriever will compress the documents by removing irrelevant information and focusing on the most relevant information.
Let's see how the retriever works before and after applying ContextualCompressionRetriever.
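A minimal sketch of wrapping the base retriever is shown below; it assumes `retriever` was built in the previous step, and the model name is an assumption.

```python
# A sketch of contextual compression with LLMChainExtractor.
# Assumes `retriever` from the earlier setup and a configured OpenAI key.
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # model name is an assumption
compressor = LLMChainExtractor.from_llm(llm)

compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=retriever,
)

# After compression, only query-relevant excerpts of each document remain.
compressed_docs = compression_retriever.invoke("What is semantic search?")
```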
Document Filtering Using LLM
LLMChainFilter
LLMChainFilter is a simpler yet powerful compressor that uses an LLM chain to decide which documents to filter and which to return from the initially retrieved documents.
This filter selectively returns documents without altering (compressing) their content.
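A sketch of this filter follows; it assumes `llm` and `retriever` from the earlier steps. Note that unlike LLMChainExtractor, documents are kept or dropped whole, never rewritten.

```python
# A sketch of LLMChainFilter: the LLM decides which retrieved documents
# to keep; kept documents pass through unmodified.
# Assumes `llm` and `retriever` from the earlier steps.
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainFilter

llm_filter = LLMChainFilter.from_llm(llm)
filter_retriever = ContextualCompressionRetriever(
    base_compressor=llm_filter,
    base_retriever=retriever,
)
filtered_docs = filter_retriever.invoke("What is semantic search?")
```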
EmbeddingsFilter
Performing additional LLM calls for each retrieved document is costly and slow.
The EmbeddingsFilter provides a more affordable and faster option by embedding both the documents and the query, returning only those documents with embeddings that are sufficiently similar to the query.
This allows for maintaining the relevance of search results while saving on computational costs and time.
The process involves compressing and retrieving relevant documents using EmbeddingsFilter and ContextualCompressionRetriever.
The EmbeddingsFilter is used to filter documents that exceed a specified similarity threshold (0.86).
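This can be sketched as follows, assuming `retriever` from the earlier setup; the embedding model is an assumption.

```python
# A sketch of EmbeddingsFilter with the 0.86 similarity threshold
# mentioned above. Assumes `retriever` from the earlier setup.
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import EmbeddingsFilter
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()
embeddings_filter = EmbeddingsFilter(
    embeddings=embeddings,
    similarity_threshold=0.86,
)

embeddings_retriever = ContextualCompressionRetriever(
    base_compressor=embeddings_filter,
    base_retriever=retriever,
)
# No LLM calls are made per document, so this is cheaper and faster.
docs = embeddings_retriever.invoke("What is semantic search?")
```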
Creating a Pipeline (Compressor + Document Converter)
Using DocumentCompressorPipeline, multiple compressors can be sequentially combined.
You can add BaseDocumentTransformer to the pipeline along with the Compressor, which performs transformations on the document set without performing contextual compression.
For example, TextSplitter can be used as a document transformer to split documents into smaller pieces, while EmbeddingsRedundantFilter can be used to filter out duplicate documents based on the embedding similarity between documents (by default, considering documents with a similarity of 0.95 or higher as duplicates).
Below, we first split the documents into smaller chunks, then remove duplicate documents, and filter based on relevance to the query to create a compressor pipeline.
While initializing the ContextualCompressionRetriever, we use pipeline_compressor as the base_compressor and retriever as the base_retriever.
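The pipeline described above can be sketched as follows; it assumes `embeddings` and `retriever` from the earlier steps.

```python
# A sketch of the compressor pipeline: split into smaller chunks,
# drop near-duplicates, then filter by relevance to the query.
# Assumes `embeddings` and `retriever` from the earlier steps.
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import (
    DocumentCompressorPipeline,
    EmbeddingsFilter,
)
from langchain_community.document_transformers import EmbeddingsRedundantFilter
from langchain_text_splitters import CharacterTextSplitter

splitter = CharacterTextSplitter(chunk_size=300, chunk_overlap=0)
# Default similarity threshold for duplicates is 0.95
redundant_filter = EmbeddingsRedundantFilter(embeddings=embeddings)
relevant_filter = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.86)

pipeline_compressor = DocumentCompressorPipeline(
    transformers=[splitter, redundant_filter, relevant_filter]
)

compression_retriever = ContextualCompressionRetriever(
    base_compressor=pipeline_compressor,
    base_retriever=retriever,
)
docs = compression_retriever.invoke("What is semantic search?")
```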