ChatOllama


  • Author: Ivy Bae

  • Peer Review : HyeonJong Moon, sunworl

  • Proofread : frimer

  • This is a part of LangChain Open Tutorial

Overview

This tutorial covers how Ollama can be used to run open-source large language models, such as Llama 3.3, locally.

  • Ollama Library: a list of supported models

There are two ways to use Ollama models:

  • the ollama command in the terminal

  • the ChatOllama class of LangChain

ChatOllama lets you compose a chain of a prompt, an LLM, and an output parser, and run the model just like other chat models such as ChatOpenAI.

  • The output format can be string, JSON, etc.

  • It also supports multimodal models.

Table of Contents

  • Overview

  • Environment Setup

  • Install

  • Model

  • Output format: JSON

  • Multimodal support

References

  • Ollama: https://ollama.com

  • Ollama Model File: https://github.com/ollama/ollama/blob/main/docs/modelfile.md

  • ChatOllama: https://python.langchain.com/docs/integrations/chat/ollama/

Environment Setup

Set up the environment. You may refer to Environment Setup for more details.

[Note]

  • langchain-opentutorial is a package that provides a set of easy-to-use environment setup tools, along with useful functions and utilities for tutorials.

%%capture --no-stderr
!pip install langchain-opentutorial
# Install required packages
from langchain_opentutorial import package

package.install(
    [
        "langsmith",
        "langchain_core",
        "langchain-ollama",
        "langchain_community",
    ],
    verbose=False,
    upgrade=False,
)
# Set environment variables
from langchain_opentutorial import set_env

set_env(
    {
        "LANGCHAIN_API_KEY": "",
        "LANGCHAIN_TRACING_V2": "true",
        "LANGCHAIN_ENDPOINT": "https://api.smith.langchain.com",
        "LANGCHAIN_PROJECT": "09-Ollama",
    }
)
Environment variables have been set successfully.

You can alternatively set API keys such as LANGCHAIN_API_KEY in a .env file and load them.

[Note] This is not necessary if you've already set the required API keys in previous steps.
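
For reference, a minimal sketch of such a .env file (example values, matching the variables set above):

LANGCHAIN_API_KEY=your-langchain-api-key
LANGCHAIN_TRACING_V2=true
LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
LANGCHAIN_PROJECT=09-Ollama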

# Load API keys from .env file
from dotenv import load_dotenv

load_dotenv(override=True)
True

Install

Install Ollama and download models we'll use in this tutorial.

Ollama Download

Download and install Ollama, which is available for macOS, Linux, and Windows. You can check out the installation guide for more details.

After a successful installation, you can run the ollama command in your terminal.

Model Download using Ollama

Among the available ollama commands, pull downloads a model from a registry into your local environment.

Use ollama pull <name-of-model> command to get the model.

For example:

  • ollama pull llama3.2 : 3B parameters model (default)

  • ollama pull llama3.2:1b : 1B parameters model

The default tag of the model will be downloaded to one of the paths below.

  • Mac: ~/.ollama/models

  • Linux/WSL: /usr/share/ollama/.ollama/models

Use ollama list to view all the models you’ve downloaded.

Chat with a model directly from the command line using:

ollama run <name-of-model>

Once the session starts, you can send messages to the model or use the available slash commands.
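
For example, you can run Ollama commands from a notebook cell just like the pip install cell above; a minimal sketch, assuming Ollama is installed and on your PATH:

# Download the 1B-parameter model used later in this tutorial
!ollama pull llama3.2:1b

# View all models you've downloaded so far
!ollama list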

Model

Check the configuration information in the model's Modelfile, then run the model with ChatOllama.

Modelfile

A Modelfile is the blueprint for creating and sharing models with Ollama. You can find more information in the Ollama Modelfile documentation.

Hugging Face also supports downloading open models (.gguf extension), from which you can create a Modelfile to define your own custom model.

In this tutorial, there are two ways to view the Modelfile of llama3.2:1b:

  • Option 1: view the template from the model's tags page

  • Option 2: use the ollama show command to print the Modelfile, as sketched below

Either way, you can check the prompt template configuration.
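
A minimal sketch of Option 2 from a notebook cell (the --modelfile flag prints a local model's Modelfile):

# Print the Modelfile of the downloaded model (assumes llama3.2:1b has been pulled)
!ollama show llama3.2:1b --modelfile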

Ollama model

All local models are available at localhost:11434.
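
ChatOllama connects to this local server by default. If your Ollama server runs on a different host or port, you can point ChatOllama at it explicitly via the base_url parameter; a minimal sketch (the value shown is the default):

from langchain_ollama import ChatOllama

# base_url points at the Ollama server; http://localhost:11434 is the default
llm = ChatOllama(
    model="llama3.2:1b",
    base_url="http://localhost:11434",
)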

from langchain_ollama import ChatOllama
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOllama(model="llama3.2:1b")

prompt = ChatPromptTemplate.from_template("Provide a brief explanation of this {topic}")

# Chaining
chain = prompt | llm | StrOutputParser()

response = chain.stream({"topic": "deep learning"})
from langchain_core.messages import AIMessageChunk

# Stream the response token by token; with StrOutputParser the chunks arrive
# as strings, but handle AIMessageChunk as well for robustness
for token in response:
    if isinstance(token, AIMessageChunk):
        print(token.content, end="", flush=True)
    elif isinstance(token, str):
        print(token, end="", flush=True)
Deep learning is a subfield of machine learning that involves the use of artificial neural networks (ANNs) to analyze and interpret data. ANNs are modeled after the human brain, with layers of interconnected nodes or "neurons" that process and transmit information.
    
    In traditional machine learning, algorithms like linear regression and decision trees are used to solve problems. However, these methods can be inflexible and prone to overfitting, where the model becomes too specialized to the training data and fails to generalize well to new, unseen data.
    
    Deep learning addresses this limitation by using multiple layers of ANNs with different types of nodes (e.g., sigmoid, ReLU, or tanh) that work together to learn complex patterns in the data. The key characteristics of deep learning models include:
    
    1. **Hierarchical structure**: Deep models have multiple layers, each with its own set of nodes and activation functions.
    2. **Non-linearity**: Deep networks use non-linear activation functions, such as ReLU or tanh, which introduce non-linearity into the model.
    3. **Regularization**: Regularization techniques, like dropout or L1/L2 regularization, are used to prevent overfitting by randomly dropping out nodes or adding a penalty term to the loss function.
    
    Deep learning has been widely adopted for tasks such as image recognition, natural language processing, speech recognition, and predictive modeling. It's particularly useful when dealing with high-dimensional data, complex relationships between variables, or noisy data.

Asynchronous streaming is also possible through the same chain created above.

  • astream() : asynchronous streaming

async for chunks in chain.astream({"topic": "Google"}):
    print(chunks, end="", flush=True)
You're referring to Google. Google is an American multinational technology company that specializes in internet-related services and products. It was founded on September 4, 1998, by Larry Page and Sergey Brin while they were Ph.D. students at Stanford University.
    
    Google's primary mission is to organize the world's information and make it universally accessible and useful. The company's search engine, which is now one of the most widely used search engines in the world, was initially designed to provide hyperlinked acronyms (or "knolinks") for Web pages, but over time has evolved into a full-fledged search engine that can index and retrieve information from the entire web.
    
    Google's other notable products and services include:
    
    * Gmail: an email service
    * Google Maps: a mapping and navigation service
    * Google Drive: a cloud storage service
    * Google Docs: a word processing and document editing service
    * YouTube: a video-sharing platform
    * Chrome: a web browser
    
    Google is known for its innovative and user-friendly products, as well as its commitment to improving the online experience. It has become one of the most valuable companies in the world, with a market capitalization of over $1 trillion.
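
Beyond stream() and astream(), the chain exposes the other standard Runnable methods, such as invoke() for a single synchronous call and batch() for a list of inputs; a minimal sketch:

# Single synchronous call; returns the complete string at once
answer = chain.invoke({"topic": "LangChain"})
print(answer[:200])

# Run the chain over several inputs; returns a list of strings
answers = chain.batch([{"topic": "RAG"}, {"topic": "embeddings"}])
print(len(answers))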

Output format: JSON

Use the latest version of Ollama and set the output format to json.

Local models must be downloaded before they can be used.

  • ollama pull gemma2

from langchain_ollama import ChatOllama

llm = ChatOllama(
    model="gemma2",
    format="json",
    temperature=0,
)

The model outputs its response in JSON format even when the prompt does not explicitly include an instruction such as "respond in JSON format".

prompt = "Tell me 10 European travel destinations. key: `places`."

response = llm.invoke(prompt)
print(response.content)
{
      "places": [
        "Paris, France",
        "Rome, Italy",
        "London, England",
        "Barcelona, Spain",
        "Amsterdam, Netherlands",
        "Prague, Czech Republic",
        "Vienna, Austria",
        "Berlin, Germany",
        "Dublin, Ireland",
        "Budapest, Hungary"
      ]
    }
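
Since the model returns the JSON as a string in response.content, you can parse it into a regular Python object with the standard library (a JsonOutputParser appended to a chain would work as well):

import json

# Parse the JSON string returned by the model into a Python dict
data = json.loads(response.content)
print(data["places"][:3])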

Multimodal support

Ollama supports multimodal LLMs such as bakllava and llava. You can use tags to explore the full set of available versions of models like llava.

Download the multimodal LLMs:

  • ollama pull llava:7b

  • ollama pull bakllava

[Note] Update Ollama to the latest version that supports multimodal models.

The following cell provides two helper functions:

  • convert_to_base64 : converts a PIL image to a Base64-encoded string

  • plt_img_base64 : embeds a Base64-encoded string in HTML to display the image

Example usage:

  • Open an image from the specified file path as a PIL image and assign it to pil_image.

  • Use the convert_to_base64 function to convert pil_image to a Base64 encoded string.

  • Use the plt_img_base64 function to display the Base64-encoded string as an image.

import base64
from io import BytesIO

from IPython.display import HTML, display
from PIL import Image


def convert_to_base64(pil_image):
    # Encode a PIL image as a Base64 JPEG string
    # (RGBA images must be converted to RGB before saving as JPEG)
    buffered = BytesIO()
    pil_image.save(buffered, format="JPEG")
    img_str = base64.b64encode(buffered.getvalue()).decode("utf-8")
    return img_str


def plt_img_base64(img_base64):
    # Render a Base64-encoded image inline using an HTML <img> tag
    image_html = f'<img src="data:image/jpeg;base64,{img_base64}" />'
    display(HTML(image_html))


file_path = "./assets/09-Ollama-flow-explanation-06.png"  # Jeju island beach image
pil_image = Image.open(file_path)

image_b64 = convert_to_base64(pil_image)

plt_img_base64(image_b64)
  • prompt_func : takes image and text data as input and converts them into a HumanMessage.

    • image : Base64 encoded JPEG format.

    • text : plain text.

from langchain_core.messages import HumanMessage


def prompt_func(data):
    text = data["text"]
    image = data["image"]

    # Image content block: a Base64-encoded JPEG passed as a data URL
    image_part = {
        "type": "image_url",
        "image_url": f"data:image/jpeg;base64,{image}",
    }

    content_parts = []

    # Text content block: the plain-text question
    text_part = {"type": "text", "text": text}

    content_parts.append(image_part)
    content_parts.append(text_part)

    return [HumanMessage(content=content_parts)]

Call the chain.invoke method to pass an image and a text query and generate an answer.

  • ChatOllama : uses a multimodal LLM, such as llava

  • StrOutputParser : parses the output of the LLM into a string

  • chain : pipelines prompt_func, llm, and StrOutputParser together

from langchain_core.output_parsers import StrOutputParser
from langchain_ollama import ChatOllama


llm = ChatOllama(model="llava:7b", temperature=0)

chain = prompt_func | llm | StrOutputParser()

query_chain = chain.invoke(
    {"text": "Describe a picture in bullet points", "image": image_b64}
)

print(query_chain)
 - The image shows a picturesque tropical beach scene.
    - In the foreground, there is a rocky shore with clear blue water and white foam from waves breaking on the rocks.
    - A small island or landmass is visible in the background, surrounded by the ocean.
    - The sky is clear and blue, suggesting good weather conditions.
    - There are no people visible in the image.
    - The overall style of the image is a natural landscape photograph with vibrant colors and clear details. 
