
GPT4ALL


  • Author: Yoonji Oh

  • Peer Review: Joseph, Normalist-K

  • Proofread: frimer

  • This is a part of LangChain Open Tutorial

Overview

In this tutorial, we’re exploring GPT4ALL together! From picking the perfect model for your hardware to running it on your own, we’ll walk you through the process step by step.

Ready? Let’s dive in and have some fun along the way!

Table of Contents

  • Overview
  • Environment Setup
  • Installation
  • What is GPT4ALL
  • Choosing a Model
  • Downloading a Model
  • Running GPT4ALL Models
  • Creating a Prompt and Checking the Result
  • Summary

References

  • GPT4All Python SDK
  • GPT4ALL docs


Environment Setup

Set up the environment. You may refer to Environment Setup for more details.

[Note]

  • langchain-opentutorial is a package that provides a set of easy-to-use environment setup, useful functions, and utilities for tutorials. You can check out the langchain-opentutorial documentation for more details.

%%capture --no-stderr
!pip install langchain-opentutorial

# Install required packages
from langchain_opentutorial import package

package.install(
    [
        "langsmith",
        "langchain",
        "langchain_core",
        "langchain_community",
    ],
    verbose=False,
    upgrade=False,
)

# Set environment variables
from langchain_opentutorial import set_env

set_env(
    {
        "OPENAI_API_KEY": "",
        "LANGCHAIN_API_KEY": "",
        "LANGCHAIN_TRACING_V2": "true",
        "LANGCHAIN_ENDPOINT": "https://api.smith.langchain.com",
        "LANGCHAIN_PROJECT": "GPT4ALL",
    }
)
Environment variables have been set successfully.

You can also create and use a .env file in the root directory as shown below.

from dotenv import load_dotenv

load_dotenv()
True
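
If you don't have a .env file yet, here is a minimal sketch (our addition, not part of the original tutorial) that writes one with placeholder values. The key names mirror the set_env call above; substitute your real keys before use.

# A sketch: write a .env file with placeholder values (replace them with real keys).
env_content = """OPENAI_API_KEY=your-openai-api-key
LANGCHAIN_API_KEY=your-langchain-api-key
LANGCHAIN_TRACING_V2=true
LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
LANGCHAIN_PROJECT=GPT4ALL
"""

with open(".env", "w") as f:
    f.write(env_content)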

Installation

Ready to get started with gpt4all? Let’s make sure you’ve got everything set up! We’ll guide you through installing the package using pip or poetry. Don’t worry, it’s easy and quick.


Install the Python Package

You can install gpt4all using pip or poetry, depending on your preferred package manager. Here’s how:

1. Installation using pip

If you’re using pip, run the following command in your terminal:

!pip install -qU gpt4all

2. Installation using poetry

Prefer poetry? No problem! Here’s how to install gpt4all using poetry:

Step 1: Add gpt4all to your project. Run this command to add the package to your pyproject.toml:

!poetry add gpt4all

Step 2: Install dependencies. If the package is already added but not installed, simply run:

!poetry install

Poetry will sync your environment and install all required dependencies.

What is GPT4ALL

GitHub: nomic-ai/gpt4all is an open-source chatbot ecosystem trained on a large amount of data, including code and chat-style conversations.

In this example, we will explain how to interact with the GPT4All model using LangChain.

Choosing a Model

It's the most crucial decision-making time. Before diving into writing code, it's time to decide which model to use. Below, we explore popular models and help you choose the right one based on GPT4All's Python documentation.

Model Selection Criteria

Model Name                             | Filesize | RAM Required | Parameters  | Quantization | Developer               | License         | MD5 Sum (Unique Hash)
-------------------------------------- | -------- | ------------ | ----------- | ------------ | ----------------------- | --------------- | --------------------------------
Meta-Llama-3-8B-Instruct.Q4_0.gguf     | 4.66 GB  | 8 GB         | 8 Billion   | q4_0         | Meta                    | Llama 3 License | c87ad09e1e4c8f9c35a5fcef52b6f1c9
Nous-Hermes-2-Mistral-7B-DPO.Q4_0.gguf | 4.11 GB  | 8 GB         | 7 Billion   | q4_0         | Mistral & Nous Research | Apache 2.0      | a5f6b4eabd3992da4d7fb7f020f921eb
Phi-3-mini-4k-instruct.Q4_0.gguf       | 2.18 GB  | 4 GB         | 3.8 Billion | q4_0         | Microsoft               | MIT             | f8347badde9bfc2efbe89124d78ddaf5
orca-mini-3b-gguf2-q4_0.gguf           | 1.98 GB  | 4 GB         | 3 Billion   | q4_0         | Microsoft               | CC-BY-NC-SA-4.0 | 0e769317b90ac30d6e09486d61fefa26
gpt4all-13b-snoozy-q4_0.gguf           | 7.37 GB  | 16 GB        | 13 Billion  | q4_0         | Nomic AI                | GPL             | 40388eb2f8d16bb5d08c96fdfaac6b2c

Based on Use Case

Choose your model depending on the tasks you plan to perform:

  • Lightweight tasks (e.g., simple conversation): orca-mini-3b-gguf2-q4_0.gguf or Phi-3-mini-4k-instruct.Q4_0.gguf.

  • Moderate tasks (e.g., summarization or grammar correction): Meta-Llama-3-8B-Instruct.Q4_0.gguf or Nous-Hermes-2-Mistral-7B-DPO.Q4_0.gguf.

  • Advanced tasks (e.g., long text generation, research): gpt4all-13b-snoozy-q4_0.gguf.

Based on System Specifications

Select a model based on your available hardware (a small helper sketch follows this list):

  • For 4GB RAM or less, use orca-mini-3b-gguf2-q4_0.gguf or Phi-3-mini-4k-instruct.Q4_0.gguf.

  • For 8GB RAM or more, use Meta-Llama-3-8B-Instruct.Q4_0.gguf or Nous-Hermes-2-Mistral-7B-DPO.Q4_0.gguf.

  • For 16GB RAM or more, use gpt4all-13b-snoozy-q4_0.gguf.
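
The following helper is a minimal sketch of that guidance, not part of the original tutorial. It assumes psutil is available (pip install psutil) and simply maps total system RAM to a model file from the table above.

import psutil  # assumption: psutil is installed; it is not a dependency of this tutorial

def pick_model() -> str:
    """Pick a model file from the table above based on total system RAM."""
    ram_gb = psutil.virtual_memory().total / 1e9
    if ram_gb >= 16:
        return "gpt4all-13b-snoozy-q4_0.gguf"
    if ram_gb >= 8:
        return "Meta-Llama-3-8B-Instruct.Q4_0.gguf"
    return "Phi-3-mini-4k-instruct.Q4_0.gguf"

print(pick_model())  # prints the suggestion matching your hardware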

[NOTE]

  • GGML: CPU-friendly and low memory usage.

  • GGUF: Latest format with GPU acceleration support.

  • q4_0 Quantization: Efficient for both CPU and GPU workloads, with reduced memory requirements (see the size sketch below).
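
As a rough sanity check on the table's file sizes, you can estimate a quantized model's size from its parameter count. This sketch assumes q4_0 stores roughly 4.5 bits per weight once block scales are included; that figure is an approximation on our part, not an exact specification.

# Estimate file size in GB: parameters * bits-per-weight / 8 bits-per-byte.
def estimated_size_gb(n_params_billion: float, bits_per_weight: float = 4.5) -> float:
    return n_params_billion * 1e9 * bits_per_weight / 8 / 1e9

print(f"8B model:   ~{estimated_size_gb(8):.2f} GB")    # table lists 4.66 GB
print(f"3.8B model: ~{estimated_size_gb(3.8):.2f} GB")  # table lists 2.18 GB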

Downloading a Model

In this tutorial, we will be using Microsoft's Phi-3-Mini-4K-Instruct model.

  1. Download the Model: Visit HuggingFace to download the required model (2.39 GB).

  2. Load Models in Python: After downloading the model, create a folder named models and place the downloaded file in that folder.

  • Assign the local file path (e.g., Phi-3-mini-4k-instruct-q4.gguf) to the local_path variable.

local_path = "./models/Phi-3-mini-4k-instruct-q4.gguf"  # Replace with your desired local file path.

  • You can replace this path with any local file path you prefer; a quick sanity check follows below.
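
Before moving on, an optional sanity check (our addition, not from the original tutorial) confirms the file actually exists at local_path, so that loading fails early with a clear message instead of a cryptic error.

# Verify the model file exists before handing it to GPT4All.
from pathlib import Path

model_file = Path(local_path)
if not model_file.exists():
    raise FileNotFoundError(f"Model not found at {model_file.resolve()}; download it first.")
print(f"Found model: {model_file.name} ({model_file.stat().st_size / 1e9:.2f} GB)")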

Running GPT4ALL Models

GPT4All is a powerful large-scale language model, similar to GPT-3, designed to support a variety of natural language processing tasks.

This module allows you to easily load the GPT4All model and perform inference seamlessly. Use the GPT4All Python SDK to load and run your selected model in your project.

In the following example, we demonstrate how to load the GPT4All model and use it to answer a question with a custom prompt and inference pipeline.

from langchain_core.prompts.chat import ChatPromptTemplate
from langchain_community.llms.gpt4all import GPT4All
from langchain_core.output_parsers.string import StrOutputParser
from langchain_core.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

[NOTE]

Due to structural changes, in version 0.3.13, you need to replace from langchain.prompts import ChatPromptTemplate with from langchain_core.prompts import ChatPromptTemplate.
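
If your notebook needs to run on both older and newer LangChain releases, a minimal compatibility sketch (our suggestion, not part of the original tutorial) is to try the new import path first and fall back to the legacy one:

# Prefer the langchain_core path; fall back to the legacy path on old releases.
try:
    from langchain_core.prompts import ChatPromptTemplate
except ImportError:
    from langchain.prompts import ChatPromptTemplate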

Creating a Prompt and Checking the Result

template = """
<s>A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.</s>
<s>Human: {question}</s>
<s>Assistant:
"""

prompt = ChatPromptTemplate.from_template(template)

result = prompt.invoke({"question": "where is the capital of United States?"})

print(result.messages[0].content)
    A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
    Human: where is the capital of United States?
    Assistant:
    

The ChatPromptTemplate is responsible for creating prompt templates in LangChain and dynamically substituting variables. Even without the invoke() method, you can use the class's template methods to generate prompts; in this case, the template can be returned as a string using the format() method. In a nutshell, the invoke() method is great for chain-based tasks, while the format() method is perfect for returning simple strings.

# Using format() instead of invoke()
result = prompt.format(question="where is the capital of United States?")
print(result)
    Human: A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
    Human: where is the capital of United States?
    Assistant:

You might notice that Human: is automatically added to the output. If you'd like to avoid this behavior, you can use LangChain's PromptTemplate class instead of ChatPromptTemplate. The PromptTemplate class doesn't add any extra prefixes like this.

from langchain_core.prompts.prompt import PromptTemplate

prompt = PromptTemplate.from_template(template)
formatted_prompt = prompt.format(question="Where is the capital of the United States?")
print(formatted_prompt)
    A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
    Human: Where is the capital of the United States?
    Assistant:

We'll be using invoke() for chain-based tasks, so go ahead and forget about the format() method for now! 😉

Using Chains to Display Results in Real-Time

# Prompt
prompt = ChatPromptTemplate.from_template(
    """
<s>A chat between a user and an AI assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.</s>
<s>Human: {question}</s>
<s>Assistant:"""
)

# GPT4All Language Model Initialization
# Specify the path to the GPT4All model file in model
model = GPT4All(
    model=local_path,
    n_threads=8,  # Number of threads to use
    backend="gpu",  # GPU configuration
    streaming=True,  # Streaming configuration
    callbacks=[StreamingStdOutCallbackHandler()],  # Callback configuration
)

# Create the chain
chain = prompt | model | StrOutputParser()

# Execute the query
response = chain.invoke({"question": "where is the capital of United States?"})
    The capital of the United States is Washington, D.C., which stands for District of Columbia. It was established by the Constitution along with a federal district that would serve as the nation's seat of government and be under its exclusive jurisdiction. The city itself lies on the east bank of the Potomac River near its fall point where it empties into Chesapeake Bay, but Washington is not part of any U.S. state; instead, it has a special federal district status as outlined in Article I, Section 8 of the Constitution and further defined by the Residence Act of 1790 signed by President George Washington. Washington D.C.'s location was chosen to be near the nation's capital at that time—Philadelphia, Pennsylvania—and it also holds symbolic significance as a neutral ground for both northern and southern states during their early years in America. The city is home to many iconic landmarks such as the U.S. Capitol Building where Congress meets, the White House (the residence of the President), Supreme Court buildings, numerous museums like the Smithsonian Institution's National Museum of American History or Natural History and Air & Space, among others

Summary

Today, we explored GPT4ALL together! We didn't just run models; we took part in the decision-making process, from selecting a model to suit our needs to choosing the right methods based on our desired outcomes or execution direction. Along the way, we compared the performance of popular models and even ran the code ourselves.

Next time, we'll dive into Video Q&A LLM (Gemini). Until then, try running today's code with different models and see how they perform. See you soon! 😊
