LangChain OpenTutorial
  • 🦜️🔗 The LangChain Open Tutorial for Everyone
  • 01-Basic
    • Getting Started on Windows
    • 02-Getting-Started-Mac
    • OpenAI API Key Generation and Testing Guide
    • LangSmith Tracking Setup
    • Using the OpenAI API (GPT-4o Multimodal)
    • Basic Example: Prompt+Model+OutputParser
    • LCEL Interface
    • Runnable
  • 02-Prompt
    • Prompt Template
    • Few-Shot Templates
    • LangChain Hub
    • Personal Prompts for LangChain
    • Prompt Caching
  • 03-OutputParser
    • PydanticOutputParser
    • PydanticOutputParser
    • CommaSeparatedListOutputParser
    • Structured Output Parser
    • JsonOutputParser
    • PandasDataFrameOutputParser
    • DatetimeOutputParser
    • EnumOutputParser
    • Output Fixing Parser
  • 04-Model
    • Using Various LLM Models
    • Chat Models
    • Caching
    • Caching VLLM
    • Model Serialization
    • Check Token Usage
    • Google Generative AI
    • Huggingface Endpoints
    • HuggingFace Local
    • HuggingFace Pipeline
    • ChatOllama
    • GPT4ALL
    • Video Q&A LLM (Gemini)
  • 05-Memory
    • ConversationBufferMemory
    • ConversationBufferWindowMemory
    • ConversationTokenBufferMemory
    • ConversationEntityMemory
    • ConversationKGMemory
    • ConversationSummaryMemory
    • VectorStoreRetrieverMemory
    • LCEL (Remembering Conversation History): Adding Memory
    • Memory Using SQLite
    • Conversation With History
  • 06-DocumentLoader
    • Document & Document Loader
    • PDF Loader
    • WebBaseLoader
    • CSV Loader
    • Excel File Loading in LangChain
    • Microsoft Word(doc, docx) With Langchain
    • Microsoft PowerPoint
    • TXT Loader
    • JSON
    • Arxiv Loader
    • UpstageDocumentParseLoader
    • LlamaParse
    • HWP (Hangeul) Loader
  • 07-TextSplitter
    • Character Text Splitter
    • 02. RecursiveCharacterTextSplitter
    • Text Splitting Methods in NLP
    • TokenTextSplitter
    • SemanticChunker
    • Split code with Langchain
    • MarkdownHeaderTextSplitter
    • HTMLHeaderTextSplitter
    • RecursiveJsonSplitter
  • 08-Embedding
    • OpenAI Embeddings
    • CacheBackedEmbeddings
    • HuggingFace Embeddings
    • Upstage
    • Ollama Embeddings With Langchain
    • LlamaCpp Embeddings With Langchain
    • GPT4ALL
    • Multimodal Embeddings With Langchain
  • 09-VectorStore
    • Vector Stores
    • Chroma
    • Faiss
    • Pinecone
    • Qdrant
    • Elasticsearch
    • MongoDB Atlas
    • PGVector
    • Neo4j
    • Weaviate
    • Faiss
    • {VectorStore Name}
  • 10-Retriever
    • VectorStore-backed Retriever
    • Contextual Compression Retriever
    • Ensemble Retriever
    • Long Context Reorder
    • Parent Document Retriever
    • MultiQueryRetriever
    • MultiVectorRetriever
    • Self-querying
    • TimeWeightedVectorStoreRetriever
    • TimeWeightedVectorStoreRetriever
    • Kiwi BM25 Retriever
    • Ensemble Retriever with Convex Combination (CC)
  • 11-Reranker
    • Cross Encoder Reranker
    • JinaReranker
    • FlashRank Reranker
  • 12-RAG
    • Understanding the basic structure of RAG
    • RAG Basic WebBaseLoader
    • Exploring RAG in LangChain
    • RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval
    • Conversation-With-History
    • Translation
    • Multi Modal RAG
  • 13-LangChain-Expression-Language
    • RunnablePassthrough
    • Inspect Runnables
    • RunnableLambda
    • Routing
    • Runnable Parallel
    • Configure-Runtime-Chain-Components
    • Creating Runnable objects with chain decorator
    • RunnableWithMessageHistory
    • Generator
    • Binding
    • Fallbacks
    • RunnableRetry
    • WithListeners
    • How to stream runnables
  • 14-Chains
    • Summarization
    • SQL
    • Structured Output Chain
    • StructuredDataChat
  • 15-Agent
    • Tools
    • Bind Tools
    • Tool Calling Agent
    • Tool Calling Agent with More LLM Models
    • Iteration-human-in-the-loop
    • Agentic RAG
    • CSV/Excel Analysis Agent
    • Agent-with-Toolkits-File-Management
    • Make Report Using RAG, Web searching, Image generation Agent
    • TwoAgentDebateWithTools
    • React Agent
  • 16-Evaluations
    • Generate synthetic test dataset (with RAGAS)
    • Evaluation using RAGAS
    • HF-Upload
    • LangSmith-Dataset
    • LLM-as-Judge
    • Embedding-based Evaluator(embedding_distance)
    • LangSmith Custom LLM Evaluation
    • Heuristic Evaluation
    • Compare experiment evaluations
    • Summary Evaluators
    • Groundedness Evaluation
    • Pairwise Evaluation
    • LangSmith Repeat Evaluation
    • LangSmith Online Evaluation
    • LangFuse Online Evaluation
  • 17-LangGraph
    • 01-Core-Features
      • Understanding Common Python Syntax Used in LangGraph
      • Title
      • Building a Basic Chatbot with LangGraph
      • Building an Agent with LangGraph
      • Agent with Memory
      • LangGraph Streaming Outputs
      • Human-in-the-loop
      • LangGraph Manual State Update
      • Asking Humans for Help: Customizing State in LangGraph
      • DeleteMessages
      • DeleteMessages
      • LangGraph ToolNode
      • LangGraph ToolNode
      • Branch Creation for Parallel Node Execution
      • Conversation Summaries with LangGraph
      • Conversation Summaries with LangGraph
      • LangGrpah Subgraph
      • How to transform the input and output of a subgraph
      • LangGraph Streaming Mode
      • Errors
      • A Long-Term Memory Agent
    • 02-Structures
      • LangGraph-Building-Graphs
      • Naive RAG
      • Add Groundedness Check
      • Adding a Web Search Module
      • LangGraph-Add-Query-Rewrite
      • Agentic RAG
      • Adaptive RAG
      • Multi-Agent Structures (1)
      • Multi Agent Structures (2)
    • 03-Use-Cases
      • LangGraph Agent Simulation
      • Meta Prompt Generator based on User Requirements
      • CRAG: Corrective RAG
      • Plan-and-Execute
      • Multi Agent Collaboration Network
      • Multi Agent Collaboration Network
      • Multi-Agent Supervisor
      • 08-LangGraph-Hierarchical-Multi-Agent-Teams
      • 08-LangGraph-Hierarchical-Multi-Agent-Teams
      • SQL-Agent
      • 10-LangGraph-Research-Assistant
      • LangGraph Code Assistant
      • Deploy on LangGraph Cloud
      • Tree of Thoughts (ToT)
      • Ollama Deep Researcher (Deepseek-R1)
      • Functional API
      • Reflection in LangGraph
  • 19-Cookbook
    • 01-SQL
      • TextToSQL
      • SpeechToSQL
    • 02-RecommendationSystem
      • ResumeRecommendationReview
    • 03-GraphDB
      • Movie QA System with Graph Database
      • 05-TitanicQASystem
      • Real-Time GraphRAG QA
    • 04-GraphRAG
      • Academic Search System
      • Academic QA System with GraphRAG
    • 05-AIMemoryManagementSystem
      • ConversationMemoryManagementSystem
    • 06-Multimodal
      • Multimodal RAG
      • Shopping QnA
    • 07-Agent
      • 14-MoARAG
      • CoT Based Smart Web Search
      • 16-MultiAgentShoppingMallSystem
      • Agent-Based Dynamic Slot Filling
      • Code Debugging System
      • New Employee Onboarding Chatbot
      • 20-LangGraphStudio-MultiAgent
      • Multi-Agent Scheduler System
    • 08-Serving
      • FastAPI Serving
      • Sending Requests to Remote Graph Server
      • Building a Agent API with LangServe: Integrating Currency Exchange and Trip Planning
    • 08-SyntheticDataset
      • Synthetic Dataset Generation using RAG
    • 09-Monitoring
      • Langfuse Selfhosting
Powered by GitBook
On this page
  • Overview
  • Table of Contents
  • References
  • Environment Setup
  • Building a Vector Database
  • Usage
  • How to Use the LCEL Chain
  1. 10-Retriever

MultiQueryRetriever

PreviousParent Document RetrieverNextMultiVectorRetriever

Last updated 28 days ago

  • Author:

  • Peer Review:

  • Proofread :

  • This is a part of

Overview

MultiQueryRetriever offers a thoughtful approach to improving distance-based vector database retrieval by generating diverse queries with the help of an LLM.

This method simplifies the retrieval process, minimizes the need for manual prompt adjustments, and aims to provide more nuanced and comprehensive results.

  • Understanding Distance-Based Vector Search Distance-based vector search is a technique that identifies documents with embeddings similar to a query embedding based on their 'distance' in a high-dimensional space. However, subtle variations in query details or embedding representations can occasionally make it challenging to fully capture the intended meaning, which might affect the search results.

  • Streamlined Prompt Tuning MultiQueryRetriever reduces the complexity of prompt tuning by utilizing an LLM to automatically generate multiple queries from different perspectives for a single input. This helps minimize the effort required for manual adjustments or prompt engineering.

  • Broader Document Retrieval Each generated query is used to perform a search, and the unique documents retrieved from all queries are combined. This approach helps uncover a wider range of potentially relevant documents, increasing the chances of retrieving valuable information.

  • Improved Search Robustness By exploring a question from multiple perspectives through diverse queries, MultiQueryRetriever addresses some of the limitations of distance-based searches. This approach can better account for nuanced differences and deeper meanings in the data, leading to more contextually relevant and well-rounded results.

Table of Contents

References


Environment Setup

[Note]

  • langchain-opentutorial is a package that provides a set of easy-to-use environment setup, useful functions and utilities for tutorials.

%%capture --no-stderr
%pip install langchain-opentutorial
# Install required packages
from langchain_opentutorial import package

package.install(
    [
        "langchain",
        "langchain_core",
        "langchain_openai",
    ],
    verbose=False,
    upgrade=False,
)
# Set environment variables
from langchain_opentutorial import set_env

set_env(
    {
        "LANGCHAIN_TRACING_V2": "true",
        "LANGCHAIN_ENDPOINT": "https://api.smith.langchain.com",
        "LANGCHAIN_PROJECT": "06-Multi-Query-Retriever",
    }
)
Environment variables have been set successfully.

Alternatively, environment variables can also be set using a .env file.

[Note]

  • This is not necessary if you've already set the environment variables in the previous step.

# Configuration file to manage API keys as environment variables
from dotenv import load_dotenv

# Load API key information
load_dotenv()
True

Building a Vector Database

Vector databases enable efficient retrieval of relevant documents by embedding text data into a high-dimensional vector space.

This example demonstrates creating a simple vector database using LangChain, which involves loading and splitting a document, generating embeddings with OpenAI, and performing a search query to retrieve contextually relevant information.

# Build a sample vector DB
from langchain_community.document_loaders import WebBaseLoader
from langchain.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load a blog post
loader = WebBaseLoader(
    "https://python.langchain.com/docs/introduction/", encoding="utf-8"
)

# Split documents
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
docs = loader.load_and_split(text_splitter)

# Define embedding
openai_embedding = OpenAIEmbeddings()

# Create the vector DB
db = FAISS.from_documents(docs, openai_embedding)

# Create a retriever
retriever = db.as_retriever()

# Document search
query = "Please explain the key features and architecture of the LangChain framework."
relevant_docs = retriever.invoke(query)

# Print the number of retrieved documents
print(f"Number of retrieved documents: {len(relevant_docs)}")

# Print each document with its number
for idx, doc in enumerate(relevant_docs, start=1):
    print(f"Document #{idx}:\n{doc.page_content}\n{'-'*40}")
Number of retrieved documents: 4
    Document #1:
    noteThese docs focus on the Python LangChain library. Head here for docs on the JavaScript LangChain library.
    Architecture​
    The LangChain framework consists of multiple open-source libraries. Read more in the
    Architecture page.
    ----------------------------------------
    Document #2:
    LangChain is a framework for developing applications powered by large language models (LLMs).
    LangChain simplifies every stage of the LLM application lifecycle:
    ----------------------------------------
    Document #3:
    However, these guides will help you quickly accomplish common tasks using chat models,
    vector stores, and other common LangChain components.
    Check out LangGraph-specific how-tos here.
    Conceptual guide​
    Introductions to all the key parts of LangChain you’ll need to know! Here you'll find high level explanations of all LangChain concepts.
    For a deeper dive into LangGraph concepts, check out this page.
    Integrations​
    ----------------------------------------
    Document #4:
    langgraph: Orchestration framework for combining LangChain components into production-ready applications with persistence, streaming, and other key features. See LangGraph documentation.
    ----------------------------------------

Usage

Simply specify the LLM to be used in MultiQueryRetriever and pass the query, and the retriever will handle the rest.

from langchain.retrievers.multi_query import MultiQueryRetriever
from langchain_openai import ChatOpenAI


# Initialize the ChatOpenAI language model with temperature set to 0.
llm = ChatOpenAI(temperature=0, model="gpt-4o-mini")

multiquery_retriever = MultiQueryRetriever.from_llm(  # Initialize the MultiQueryRetriever using the language model.
    # Pass the vector database retriever and the language model.
    retriever=db.as_retriever(),
    llm=llm,
)

Below is code that you can run to debug the intermediate process of generating multiple queries.

First, we retrieve the langchain.retrievers.multi_query logger.

This is done using the logging.getLogger method. Then, we set the logger's log level to INFO, so that only log messages at the INFO level or above are printed.

# Logging settings for the query
import logging

logging.basicConfig()
logging.getLogger("langchain.retrievers.multi_query").setLevel(logging.INFO)

This code uses the invoke method of the retriever_from_llm object to search for documents relevant to the given question.

The retrieved documents are stored in the variable relevant_docs, and checking the length of this variable lets you see how many relevant documents were found.

Through this process, you can effectively locate information related to the user's question and assess how much of it is available.

# Define the question
question = "Please explain the key features and architecture of the LangChain framework."
# Document search
relevant_docs = multiquery_retriever.invoke(question)

# Return the number of unique documents retrieved.
print(
    f"===============\nNumber of retrieved documents: {len(relevant_docs)}",
    end="\n===============\n",
)

# Print the content of the retrieved documents.
print(relevant_docs[0].page_content)
INFO:langchain.retrievers.multi_query:Generated queries: ['What are the main components and architectural design of the LangChain framework?', 'Can you describe the essential characteristics and structure of the LangChain framework?', 'What are the significant features and the underlying architecture of the LangChain framework?']
===============
Number of retrieved documents: 6
===============
noteThese docs focus on the Python LangChain library. Head here for docs on the JavaScript LangChain library.
Architecture​
The LangChain framework consists of multiple open-source libraries. Read more in the
Architecture page.

How to Use the LCEL Chain

  • Define a custom prompt, then create a Chain with that prompt.

  • When the Chain receives a user question (in the following example), it generates 5 questions, and returns the 5 generated questions separated by '\n'.

from langchain_core.runnables import RunnablePassthrough
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Define the prompt template (written to generate 5 questions)
prompt = PromptTemplate.from_template(
    """You are an AI language model assistant. 
Your task is to generate five different versions of the given user question to retrieve relevant documents from a vector database. 
By generating multiple perspectives on the user question, your goal is to help the user overcome some of the limitations of the distance-based similarity search. 
Your response should be a list of values separated by new lines, eg: `foo\nbar\nbaz\n`

#ORIGINAL QUESTION: 
{question}

#Answer in English:
"""
)

# Create an instance of the language model.
llm = ChatOpenAI(temperature=0, model="gpt-4o-mini")

# Create the LLMChain.
custom_multiquery_chain = (
    {"question": RunnablePassthrough()} | prompt | llm | StrOutputParser()
)

# Define the question.
question = "Please explain the key features and architecture of the LangChain framework."

# Execute the chain and check the generated multiple queries.
multi_queries = custom_multiquery_chain.invoke(question)
# Check the result (5 generated questions)
print(multi_queries)
What are the main components and structure of the LangChain framework?  
    Can you describe the architecture and essential characteristics of LangChain?  
    What are the significant features and design elements of the LangChain framework?  
    How is the LangChain framework structured, and what are its key functionalities?  
    Could you provide an overview of the LangChain framework's architecture and its primary features?  

You can pass the previously created Chain to the MultiQueryRetriever to perform retrieval.

multiquery_retriever = MultiQueryRetriever.from_llm(
    llm=custom_multiquery_chain, retriever=db.as_retriever()
)

Use the MultiQueryRetriever to search documents and check the results.

# Result
relevant_docs = multiquery_retriever.invoke(question)

# Return the number of unique documents retrieved.
print(
    f"===============\nNumber of retrieved documents: {len(relevant_docs)}",
    end="\n===============\n",
)

# Print the content of the retrieved documents.
print(relevant_docs[0].page_content)
INFO:langchain.retrievers.multi_query:Generated queries: ['What are the main characteristics and structure of the LangChain framework?  ', 'Can you describe the essential features and design of the LangChain framework?  ', 'Could you provide an overview of the key components and architecture of the LangChain framework?  ', 'What are the fundamental aspects and architectural elements of the LangChain framework?  ', 'Please outline the primary features and framework architecture of LangChain.']
===============
Number of retrieved documents: 5
===============
LangChain is a framework for developing applications powered by large language models (LLMs).
LangChain simplifies every stage of the LLM application lifecycle:

Set up the environment. You may refer to for more details.

You can checkout the for more details.

LangChain Documentation: How to use the MultiQueryRetriever
Environment Setup
langchain-opentutorial
hong-seongmin
Juni Lee
LangChain OpenTutorial
Overview
Environment Setup
Building a Vector Database
Usage
How to Use the LCEL Chain