Qdrant
Author: HyeonJong Moon, Pupba
Peer Review: liniar, hellohotkey, Sohyeon Yim
This is a part of LangChain Open Tutorial
Overview
This tutorial covers how to use Qdrant with LangChain.
Qdrant is a high-performance, open-source vector database that stands out with advanced filtering, payload indexing, and native support for hybrid (vector + keyword) search.
This tutorial walks you through CRUD operations with Qdrant: storing, updating, and deleting documents, as well as performing similarity-based retrieval.
Environment Setup
Set up the environment. You may refer to Environment Setup for more details.
[Note]
langchain-opentutorial is a package that provides easy-to-use environment setup, useful functions, and utilities for tutorials. You can check out langchain-opentutorial for more details.
You can alternatively set API keys such as OPENAI_API_KEY in a .env file and load them.
[Note] This is not necessary if you've already set the required API keys in previous steps.
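As a minimal sketch of the .env option above: the snippet assumes the python-dotenv package and a file named .env in the working directory, and the helper name load_api_keys is hypothetical. It loads the file when the package is available and reports which required keys are still unset.

```python
import os

try:
    # Assumption: python-dotenv is installed; the tutorial may load keys differently.
    from dotenv import load_dotenv
except ImportError:
    load_dotenv = None


def load_api_keys(required=("OPENAI_API_KEY",), dotenv_path=".env"):
    """Load variables from a .env file (if possible) and return the missing keys."""
    if load_dotenv is not None:
        # override=False keeps values that are already set in the environment.
        load_dotenv(dotenv_path=dotenv_path, override=False)
    return [name for name in required if not os.environ.get(name)]


missing = load_api_keys()
print("Missing keys:", missing)
```

An empty list means every required key is available to the later steps.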
What is Qdrant?

Qdrant is an open-source vector database and similarity search engine built in Rust, designed to handle high-dimensional vector data efficiently.
It provides a production-ready service with a user-friendly API for storing, searching, and managing vectors along with additional payload data.
Key Features
High Performance: Built in Rust for speed and reliability, handling billions of vectors with low latency.
Advanced Filtering: Supports complex filtering with JSON payloads, enabling precise searches based on metadata.
Hybrid Search: Combines vector similarity with keyword-based filtering for enhanced search capabilities.
Scalable Deployment: Offers cloud-native scalability with options for on-premise, cloud, and hybrid deployments.
Multi-language Support: Provides client libraries for Python, JavaScript/TypeScript, Go, and more.
Prepare Data
This section guides you through the data preparation process.
This section includes the following components:
Data Introduction
Preprocess Data
Data Introduction
In this tutorial, we will use the fairy tale 📗 The Little Prince as our data.
This material complies with the Apache 2.0 license.
The data is used in text (.txt) format, converted from the original PDF.
You can view the data at the link below.
Preprocess Data
In this tutorial section, we will preprocess the text data from The Little Prince and convert it into a list of LangChain Document objects with metadata.
Each document chunk will include a title field in the metadata, extracted from the first line of each section.
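The preprocessing step described above could be sketched as follows. This is an illustration under stated assumptions, not the tutorial's actual code: sections are assumed to be separated by blank lines, Doc is a hypothetical stand-in that mirrors the page_content and metadata fields of a LangChain Document, and the sample text is a placeholder rather than the real book.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Doc:
    """Hypothetical stand-in mirroring LangChain Document's two fields."""
    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


def split_into_docs(text: str) -> List[Doc]:
    """Split on blank lines; the first line of each section becomes its title."""
    docs = []
    for section in text.split("\n\n"):
        lines = [line.strip() for line in section.splitlines() if line.strip()]
        if not lines:
            continue  # skip empty sections caused by extra blank lines
        title = lines[0]
        body = " ".join(lines[1:]) or title  # keep title-only sections non-empty
        docs.append(Doc(page_content=body, metadata={"title": title}))
    return docs


# Placeholder text, not taken from the book.
sample = "Section A\nfirst line\nsecond line\n\nSection B\nonly line"
docs = split_into_docs(sample)
print([d.metadata["title"] for d in docs])
```

With the real data, the same function would be applied to the contents of the .txt file, and Doc would be replaced by the LangChain Document class.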
Setting up Qdrant
This part walks you through the initial setup of Qdrant.
This section includes the following components:
Load Embedding Model
Load Qdrant Client
Load Embedding Model
In this section, you'll learn how to load an embedding model.
This tutorial uses an OpenAI API key to load the model.
💡 If you prefer to use another embedding model, see the instructions below.
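For readers swapping in another model, it may help to see the interface an embedding model is expected to expose. LangChain embedding classes provide embed_documents and embed_query. The class below is a hypothetical, deterministic toy (HashEmbeddings is not a real library class, and its vectors carry no semantic meaning) that only illustrates that shape for offline experiments.

```python
import hashlib
import math
from typing import List


class HashEmbeddings:
    """Toy embedder: hashes tokens into a fixed number of buckets."""

    def __init__(self, dim: int = 8):
        self.dim = dim

    def _embed(self, text: str) -> List[float]:
        vec = [0.0] * self.dim
        for token in text.lower().split():
            digest = hashlib.sha256(token.encode("utf-8")).digest()
            vec[digest[0] % self.dim] += 1.0  # count tokens per bucket
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # avoid dividing by zero
        return [v / norm for v in vec]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [self._embed(t) for t in texts]

    def embed_query(self, text: str) -> List[float]:
        return self._embed(text)


embedding = HashEmbeddings(dim=8)
print(len(embedding.embed_query("little prince")))
```

Any real replacement model should return vectors of a fixed dimension from both methods, since the collection's vector size must match it.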
Load Qdrant Client
In this section, we'll show you how to load the database client object using the Python SDK for Qdrant.
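A hedged sketch of client loading follows. It assumes the qdrant-client package; the environment variable names QDRANT_URL and QDRANT_API_KEY and both helper names are assumptions chosen for illustration. When no URL is configured, it falls back to the client's local in-memory mode.

```python
import os
from typing import Dict, Mapping, Optional


def qdrant_client_kwargs(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build keyword arguments for QdrantClient from environment-style settings."""
    env = os.environ if env is None else env
    url = env.get("QDRANT_URL")  # assumed variable name
    if not url:
        # No server configured: use the local in-memory mode.
        return {"location": ":memory:"}
    kwargs = {"url": url}
    api_key = env.get("QDRANT_API_KEY")  # assumed variable name
    if api_key:
        kwargs["api_key"] = api_key
    return kwargs


def load_client(env: Optional[Mapping[str, str]] = None):
    """Create the client; imported lazily so the helper above works without the SDK."""
    from qdrant_client import QdrantClient

    return QdrantClient(**qdrant_client_kwargs(env))


print(qdrant_client_kwargs({}))
```

Keeping the settings logic separate from client creation makes it easy to check the configuration before opening a connection.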
Document Manager
For the LangChain-OpenTutorial, we have implemented a custom set of CRUD functionalities for VectorDBs.
The following operations are included:
upsert: Update existing documents or insert if they don’t exist
upsert_parallel: Perform upserts in parallel for large-scale data
similarity_search: Search for similar documents based on embeddings
delete: Remove documents based on filter conditions
Each of these features is implemented as class methods specific to each VectorDB.
In this tutorial, you'll learn how to use these methods to interact with your VectorDB.
We plan to continuously expand the functionality by adding more common operations in the future.
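The shared operations can be summarized as an abstract interface. The sketch below is an assumption about how such a base class might look: the parameter names and defaults follow the argument lists in this tutorial, and the search method is named search because the as_retriever section refers to self.search, but the actual base class in the tutorial package may differ.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterable, List, Optional


class DocumentManager(ABC):
    """Sketch of a CRUD interface that each VectorDB helper would implement."""

    @abstractmethod
    def upsert(self, texts: Iterable[str], metadatas: Optional[List[Dict]] = None,
               ids: Optional[List[str]] = None, **kwargs: Any) -> None:
        """Update existing documents or insert them if they don't exist."""

    @abstractmethod
    def upsert_parallel(self, texts: Iterable[str], metadatas: Optional[List[Dict]] = None,
                        ids: Optional[List[str]] = None, batch_size: int = 32,
                        workers: int = 10, **kwargs: Any) -> None:
        """Upsert in batches using parallel workers."""

    @abstractmethod
    def search(self, query: str, k: int = 10, **kwargs: Any) -> List[Any]:
        """Return the k documents most similar to the query."""

    @abstractmethod
    def delete(self, ids: Optional[List[str]] = None,
               filters: Optional[Dict] = None, **kwargs: Any) -> None:
        """Delete documents by ID or by filter conditions."""
```

A concrete helper for a given VectorDB would subclass this and implement all four methods; Python refuses to instantiate the class until it does.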
Create Instance
First, create an instance of the Qdrant helper class to use its CRUD functionalities.
This class is initialized with the Qdrant Python SDK client instance and the embedding model instance, both of which were defined in the previous section.
Now you can use the following CRUD operations with the crud_manager instance.
This instance allows you to easily manage documents in Qdrant.
Upsert Document
Update existing documents or insert if they don’t exist
✅ Args
texts: Iterable[str] – List of text contents to be inserted/updated.
metadatas: Optional[List[Dict]] – List of metadata dictionaries for each text (optional).
ids: Optional[List[str]] – Custom IDs for the documents. If not provided, IDs will be auto-generated.
**kwargs: Extra arguments for the underlying vector store.
🔄 Return
None
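To make the upsert semantics concrete, here is a minimal sketch that uses a plain dictionary as a hypothetical stand-in for a Qdrant collection. It does not call Qdrant or compute embeddings; it only shows that writing with an existing ID replaces the entry, while omitting IDs generates new ones. UUID strings are used because Qdrant point IDs are unsigned integers or UUIDs.

```python
import uuid
from typing import Dict, Iterable, List, Optional

store: Dict[str, Dict] = {}  # hypothetical in-memory stand-in for a collection


def upsert(texts: Iterable[str], metadatas: Optional[List[Dict]] = None,
           ids: Optional[List[str]] = None) -> None:
    texts = list(texts)
    metadatas = metadatas or [{} for _ in texts]
    ids = ids or [str(uuid.uuid4()) for _ in texts]  # auto-generate missing IDs
    if not (len(texts) == len(metadatas) == len(ids)):
        raise ValueError("texts, metadatas, and ids must have the same length")
    for doc_id, text, meta in zip(ids, texts, metadatas):
        # Assigning by key inserts a new entry or overwrites an existing one.
        store[doc_id] = {"text": text, "metadata": meta}


fixed_id = str(uuid.uuid5(uuid.NAMESPACE_DNS, "doc-1"))  # deterministic example ID
upsert(["first version"], ids=[fixed_id])
upsert(["second version"], ids=[fixed_id])  # same ID: replaces the entry
upsert(["another document"])                # no ID: one is generated
print(len(store), store[fixed_id]["text"])
```

Deriving IDs deterministically from a stable key, as with uuid5 here, is one way to make repeated ingestion runs idempotent.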
Upsert Parallel
Perform upsert in parallel for large-scale data
✅ Args
texts: Iterable[str] – List of text contents to be inserted/updated.
metadatas: Optional[List[Dict]] – List of metadata dictionaries for each text (optional).
ids: Optional[List[str]] – Custom IDs for the documents. If not provided, IDs will be auto-generated.
batch_size: int – Number of documents per batch (default: 32).
workers: int – Number of parallel workers (default: 10).
**kwargs: Extra arguments for the underlying vector store.
🔄 Return
None
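The batching-plus-workers idea can be sketched with the standard library alone. This is an assumption about the general approach, not the tutorial's implementation: it uses a thread pool, a dictionary as a stand-in for the collection, and a hypothetical upsert_batch function in place of a real upsert request.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterator, List, Sequence, Tuple

store: Dict[str, str] = {}  # hypothetical stand-in for a collection
lock = threading.Lock()


def batched(items: Sequence, batch_size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def upsert_batch(pairs: Sequence[Tuple[str, str]]) -> None:
    """Stand-in for one upsert request carrying a whole batch."""
    with lock:
        store.update(pairs)


def upsert_parallel(texts: List[str], ids: List[str],
                    batch_size: int = 32, workers: int = 10) -> None:
    pairs = list(zip(ids, texts))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for all batches and re-raises any worker exception.
        list(pool.map(upsert_batch, batched(pairs, batch_size)))


texts = [f"chunk {i}" for i in range(100)]
ids = [str(i) for i in range(100)]
upsert_parallel(texts, ids, batch_size=32, workers=4)
print(len(store))
```

Threads suit this pattern because each batch mostly waits on network and embedding calls; a larger batch_size means fewer requests, while more workers means more requests in flight at once.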
Similarity Search
Search for similar documents based on embeddings.
This method uses "cosine similarity".
✅ Args
query: str – The text query for similarity search.
k: int – Number of top results to return (default: 10).
**kwargs: Additional search options (e.g., filters).
🔄 Return
results: List[Document] – A list of LangChain Document objects ranked by similarity.
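Cosine similarity scores two vectors by the cosine of the angle between them: the dot product divided by the product of their lengths. The sketch below ranks a few hand-written vectors against a query; in practice Qdrant performs this ranking server-side with an index, so the brute-force loop is only for illustration, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0  # treat zero vectors as dissimilar


def top_k(query: Sequence[float], vectors: List[Sequence[float]],
          k: int = 10) -> List[Tuple[int, float]]:
    """Return (index, score) pairs for the k most similar vectors."""
    scored = [(i, cosine_similarity(query, v)) for i, v in enumerate(vectors)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(top_k([1.0, 0.0], vectors, k=2))
```

The vector pointing in the same direction as the query ranks first, the diagonal one second, and the orthogonal one is cut off by k=2.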
as_retriever
The as_retriever() method creates a LangChain-compatible retriever wrapper.
This function allows a DocumentManager class to return a retriever object by wrapping the internal search() method, while staying lightweight and independent from full LangChain VectorStore dependencies.
The retriever obtained through this function is compatible with existing LangChain retrievers and can be used in LangChain pipelines (e.g., RetrievalQA, ConversationalRetrievalChain, Tool).
✅ Args
search_fn: Callable - The function used to retrieve relevant documents. Typically this is self.search from a DocumentManager instance.
search_kwargs: Optional[Dict] - A dictionary of keyword arguments passed to search_fn, such as k for top-K results or metadata filters.
🔄 Return
LightCustomRetriever: BaseRetriever - A lightweight LangChain-compatible retriever that internally uses the given search_fn and search_kwargs.
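The wrapping idea can be shown without LangChain installed. The class below is a plain-Python stand-in, not the tutorial's LightCustomRetriever (which subclasses BaseRetriever): it stores a search function and its keyword arguments and forwards each query to it. The names LightRetrieverSketch and fake_search, and the tiny corpus, are hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional


class LightRetrieverSketch:
    """Forwards queries to search_fn with the stored keyword arguments."""

    def __init__(self, search_fn: Callable[..., List[Any]],
                 search_kwargs: Optional[Dict[str, Any]] = None):
        self.search_fn = search_fn
        self.search_kwargs = search_kwargs or {}

    def invoke(self, query: str) -> List[Any]:
        # Mirrors the invoke() entry point used by LangChain retrievers.
        return self.search_fn(query, **self.search_kwargs)


def fake_search(query: str, k: int = 10) -> List[str]:
    """Hypothetical substring search standing in for a vector search."""
    corpus = ["the fox and the prince", "the rose", "the prince and the rose"]
    return [text for text in corpus if query in text][:k]


retriever = LightRetrieverSketch(fake_search, {"k": 1})
print(retriever.invoke("prince"))
```

Because the wrapper only depends on a callable, the same pattern works for any helper class that exposes a search method with a compatible signature.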
Delete Document
Delete documents based on filter conditions
✅ Args
ids: Optional[List[str]] – List of document IDs to delete. If None, deletion is based on filter.
filters: Optional[Dict] – Dictionary specifying filter conditions (e.g., metadata match).
**kwargs: Any additional parameters.
🔄 Return
None
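The two deletion paths can be sketched against a dictionary that stands in for a collection's metadata. This is an illustration under assumptions: filters are treated as exact-match conditions on metadata keys, and a call with neither ids nor filters does nothing, which may differ from the tutorial's class.

```python
from typing import Any, Dict, List, Optional

# Hypothetical stand-in: document ID -> metadata.
store: Dict[str, Dict[str, Any]] = {
    "1": {"title": "A"},
    "2": {"title": "B"},
    "3": {"title": "A"},
}


def delete(ids: Optional[List[str]] = None,
           filters: Optional[Dict[str, Any]] = None) -> None:
    if ids is not None:
        for doc_id in ids:
            store.pop(doc_id, None)  # ignore IDs that are not present
        return
    if filters:
        # Collect matches first so the dict is not modified while iterating.
        matches = [doc_id for doc_id, meta in store.items()
                   if all(meta.get(key) == value for key, value in filters.items())]
        for doc_id in matches:
            del store[doc_id]


delete(ids=["1"])
delete(filters={"title": "A"})
print(sorted(store))
```

After both calls only document "2" remains: the first call removes by ID and the second removes every remaining document whose title matches the filter.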