A Long-Term Memory Agent
Author: byoon
Peer Review:
Proofread: hong-seongmin
This is part of the LangChain Open Tutorial
Overview
This tutorial explains how to implement an agent with long-term memory capabilities using LangGraph. The agent can store, retrieve, and use memories to enhance its interactions with users.
Inspired by Research
The concept of long-term memory in agents is inspired by papers like MemGPT and is based on our own work. The agent extracts memories from chat interactions and stores them in a database.
Memory Representation
In this tutorial, "memory" is represented in two ways:
Text Information: A piece of text generated by the agent.
Structured Information: Knowledge about entities extracted by the agent, represented as (subject, predicate, object) knowledge triples. This structured memory can be read or queried semantically to provide personalized context during interactions with users.
Key Idea: Shared Memory Across Conversations
The key idea is that by saving memories, the agent persists information about users that is shared across multiple conversations (threads). This is different from the memory of a single conversation, which LangGraph's persistence already provides.

By using long-term memory, the agent can provide more personalized and context-aware responses across different conversations.
Table of Contents
References
Environment Setup
Set up the environment. You may refer to Environment Setup for more details.
[Note]
langchain-opentutorial is a package that provides a set of easy-to-use environment setup helpers, useful functions, and utilities for tutorials. You can check out langchain-opentutorial for more details.
Define vectorstore for memories
We will define the vectorstore where our memories will be stored and retrieved based on the conversation context. Memories will be stored as embeddings and later looked up when needed. In this tutorial, we will use an in-memory vectorstore to manage and store our memories efficiently.
Define tools
Next, we will define our memory tools. We need one tool to store memories and another tool to search for the most relevant memory based on the conversation context.
Additionally, let's give our agent the ability to search the web using Tavily.
Define state, nodes and edges
Our graph state consists of two channels:
Messages: Stores the chat history to maintain conversation context.
Recall Memories: Holds contextual memories that are retrieved before calling the agent and included in the system prompt.
This structure ensures relevant information is available for better responses.
The purpose of each function is as follows:
agent(): Generates a response using GPT-4o with recalled memories and tool integration.
load_memories(): Retrieves relevant past memories based on the conversation history.
route_tools(): Determines whether to use tools or end the conversation based on the last message.
Build the graph
Our agent graph will be very similar to a simple ReAct agent. The only key difference is that we add a node to load memories before calling the agent for the first time.

Run the agent
Let's run the agent for the first time and tell it some information about the user.
Note: we're specifying user_id to save memories for a given user.
You can see that the agent saved the memory about the user's name. Let's add some more information about the user.
Now we can use the saved information about our user on a different thread. Let's try it out:
Notice that the agent loads the most relevant memories before responding. In our case, it suggests dinner recommendations based on both food preferences and location.
Finally, let's use the search tool along with the conversation history and memory to find the location of a pizzeria.
Adding structured memories
So far, we have stored memories as simple text, like "John loves pizza". This format works well when saving memories in a vector database.
However, if your application would benefit from other storage options, such as a graph database, we can modify the system to store memories in a more structured way.
Below, we update the save_recall_memory tool to accept a list of "knowledge triples" (3-part structures with a subject, predicate, and object), which can be stored in a knowledge graph. The model will then generate these structured representations when using its tools.
For now, we continue using the same vector database, but save_recall_memory and search_recall_memories can be further modified to work with a graph database.
At this stage, we only need to update the save_recall_memory tool.
We can then compile the graph exactly as before:
As before, the memories generated in one thread can be accessed from another thread for the same user:
For illustrative purposes we can visualize the knowledge graph extracted by the model:
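One way to draw the graph is with networkx and matplotlib, treating each triple as a labeled directed edge. The triples below are hypothetical stand-ins for records fetched from the memory store:

```python
import matplotlib
matplotlib.use("Agg")  # headless-safe backend
import matplotlib.pyplot as plt
import networkx as nx

# Hypothetical triples standing in for records from the memory store.
triples = [
    {"subject": "John", "predicate": "loves", "object_": "pizza"},
    {"subject": "John", "predicate": "lives in", "object_": "Seoul"},
]

# Each triple becomes a directed edge labeled with its predicate.
G = nx.DiGraph()
for t in triples:
    G.add_edge(t["subject"], t["object_"], label=t["predicate"])

pos = nx.spring_layout(G, seed=42)
nx.draw(G, pos, with_labels=True, node_color="skyblue", node_size=1500)
nx.draw_networkx_edge_labels(G, pos, edge_labels=nx.get_edge_attributes(G, "label"))
plt.savefig("knowledge_graph.png")
```
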
