This tutorial covers how to use the Chroma multimodal vector store with LangChain.
We extend the ChromaDB class from the previous tutorial with a multimodal feature, and then use the new class to build an example multimodal search engine.
# Set environment variables
from langchain_opentutorial import set_env
set_env(
    {
        "OPENAI_API_KEY": "",
        "LANGCHAIN_API_KEY": "",
        "LANGCHAIN_TRACING_V2": "true",
        "LANGCHAIN_ENDPOINT": "https://api.smith.langchain.com",
        "LANGCHAIN_PROJECT": "Chroma With Langchain",  # Set this to the same value as the title
        "HUGGINGFACEHUB_API_TOKEN": "",
    }
)
Environment variables have been set successfully.
You can alternatively set API keys such as OPENAI_API_KEY in a .env file and load them.
[Note] This is not necessary if you've already set the required API keys in previous steps.
# Load API keys from .env file
from dotenv import load_dotenv
load_dotenv(override=True)
True
Multimodal Search
Chroma supports multimodal collections, which means it can handle and store embeddings from different types of data, such as text, images, audio, or even video.
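Before building the custom class, it helps to see what a multimodal collection looks like at the Chroma level. The sketch below is illustrative only: the client path and collection name are placeholders, and it assumes OpenCLIP (via the open-clip-torch package) as the shared text/image embedding model.
# A minimal sketch of a Chroma multimodal collection (names and paths are hypothetical)
import chromadb
from chromadb.utils.embedding_functions import OpenCLIPEmbeddingFunction
from chromadb.utils.data_loaders import ImageLoader

client = chromadb.PersistentClient(path="./chroma_multimodal_db")  # hypothetical path
collection = client.get_or_create_collection(
    name="multimodal_animals",                       # hypothetical collection name
    embedding_function=OpenCLIPEmbeddingFunction(),  # embeds both text and images with OpenCLIP
    data_loader=ImageLoader(),                       # reads images from file paths (URIs)
    metadata={"hnsw:space": "cosine"},               # cosine distance, so similarity = 1 - distance
)
Because text and images are embedded into the same vector space, a single collection can be queried with either a text prompt or an image.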
from datasets import load_dataset
dataset = load_dataset("Pupba/animal-180", split="train")
# Take the first 50 samples
images = dataset[:50]["png"]
image_paths = [save_temp_gen_url(img) for img in images]
metas = dataset[:50]["json"]
prompts = [data["prompt"] for data in metas]
categories = [data["category"] for data in metas]
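save_temp_gen_url is a helper from the companion code that is not shown in this section; it is assumed to write each PIL image to a temporary PNG file and return the file path. A minimal sketch of that assumption, followed by a hedged example of adding the saved images and their metadata to the collection sketched above:
import tempfile
from PIL import Image


def save_temp_gen_url(image: Image.Image) -> str:
    """Save a PIL image to a temporary PNG file and return its path (assumed behavior)."""
    temp_file = tempfile.NamedTemporaryFile(delete=False, suffix=".png")
    image.save(temp_file, format="PNG")
    temp_file.close()
    return temp_file.name


# Add the images and their metadata to the collection; embedding happens via OpenCLIP.
collection.add(
    ids=[f"img_{i}" for i in range(len(image_paths))],
    uris=image_paths,  # the ImageLoader reads and embeds these files
    metadatas=[
        {"category": category, "prompt": prompt}
        for category, prompt in zip(categories, prompts)
    ],
)
Inspecting the first saved sample shows its temporary file path together with the prompt and category from the metadata: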
Image Path: C:\Users\Jung\AppData\Local\Temp\tmp9dt0pak8.png
Prompt: a fluffy white rabbit sitting in a grassy meadow, soft sunlight illuminating its fur, highly detailed, 8k resolution.
Category: rabbit
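Text Query Search
Search for images using a natural-language text query. The results below come from such a query; a minimal sketch of it, assuming the collection populated above and cosine distance (the query string is illustrative):
results = collection.query(
    query_texts=["an elephant walking in the grass"],  # hypothetical query text
    n_results=2,
    include=["metadatas", "distances"],
)

for rank, (meta, dist) in enumerate(
    zip(results["metadatas"][0], results["distances"][0]), start=1
):
    print(f"Rank[{rank}]")
    print(f"Category: {meta['category']}")
    print(f"Prompt: {meta['prompt']}")
    print(f"Cosine Similarity Score: {1 - dist:.3f}")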
Rank[1]
Category: elephant
Prompt: an elephant walking through tall grass, golden sunlight reflecting off its skin, highly detailed, natural lighting, ultra-realistic.
Cosine Similarity Score: 0.310
Rank[2]
Category: elephant
Prompt: an elephant roaring in the early morning light, mist in the background, highly detailed, ultra-realistic, 8k resolution.
Cosine Similarity Score: 0.308
Image Query Search
Search for images that are similar to a given query image.
# Helpers for loading a query image from a URL
import io
import tempfile

import requests
from PIL import Image


def load_image_from_url(url: str, resolution: int = 512) -> Image.Image:
    """
    Load an image from a URL and return it as a PIL Image object.

    Args:
        url (str): The URL of the image.
        resolution (int): The width and height (in pixels) to resize the image to.

    Returns:
        Image.Image: The loaded PIL Image object.
    """
    response = requests.get(url)
    response.raise_for_status()  # Raise an error for failed requests
    image = Image.open(io.BytesIO(response.content))
    image = image.resize((resolution, resolution), resample=Image.Resampling.LANCZOS)
    return image


def save_image_to_tempfile(url: str) -> str:
    """
    Download an image from a URL and save it to a temporary file.

    Args:
        url (str): The URL of the image.

    Returns:
        str: The file path to the saved image.
    """
    response = requests.get(url)
    # Raise an error for failed requests
    response.raise_for_status()
    # Create a temporary file
    temp_file = tempfile.NamedTemporaryFile(delete=False, suffix=".jpg")
    temp_file.write(response.content)
    # Close the file so other processes can access it
    temp_file.close()
    return temp_file.name
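With these helpers in place, an image query works the same way as a text query: the query image is embedded with OpenCLIP and compared against the stored image embeddings. A minimal sketch, assuming the collection populated above; the query URL is purely illustrative:
import numpy as np

query_url = "https://example.com/query_rabbit.jpg"  # hypothetical URL
query_image = load_image_from_url(query_url)

results = collection.query(
    query_images=[np.array(query_image)],  # the query image is embedded with OpenCLIP
    n_results=2,
    include=["metadatas", "distances"],
)

for rank, (meta, dist) in enumerate(
    zip(results["metadatas"][0], results["distances"][0]), start=1
):
    print(f"Rank[{rank}]")
    print(f"Category: {meta['category']}")
    print(f"Prompt: {meta['prompt']}")
    print(f"Cosine Similarity Score: {1 - dist:.3f}")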
Rank[1]
Category: rabbit
Prompt: a rabbit sitting on a stone wall, looking at the camera, soft natural lighting, highly detailed, ultra-realistic.
Cosine Similarity Score: 0.913
Rank[2]
Category: rabbit
Prompt: a rabbit standing on its hind legs, looking at the camera, soft golden lighting, highly detailed, 8k resolution.
Cosine Similarity Score: 0.885