Conversation Summaries with LangGraph
Author: Junseong Kim
Peer Review:
This is part of the LangChain Open Tutorial
Overview
One of the most common use cases of conversation persistence is keeping track of a conversation's history. By summarizing and referencing past messages, we can maintain essential context without overloading the system with the entire conversation. This becomes especially important for long conversations, where sending the full history as context increases computational cost and can introduce inaccuracies.
In this tutorial, we will explore how to summarize a conversation and integrate that summary into a new conversation state while removing older messages. This approach helps manage the conversation length within a limited context window, preventing inadvertent increases in cost or inference time.
Key Steps:
Detect if a conversation is too long (e.g., based on the number of messages).
If it exceeds a threshold, summarize the conversation so far.
Remove older messages and store only the summary (plus the most recent messages).
This tutorial will guide you through setting up a conversation flow that automatically summarizes older messages and retains only the recent conversation turns and the summary.

Table of Contents
Overview
Environment Setup
Checking the Conversation Length
Summarizing and Managing the Conversation
Building Graph Workflow
Running Workflow
Environment Setup
Set up the environment. You may refer to Environment Setup for more details.
[Note]
langchain-opentutorial is a package that provides easy-to-use environment setup, useful functions, and utilities for these tutorials. You can check out langchain-opentutorial for more details.
You can alternatively set OPENAI_API_KEY in .env file and load it.
[Note] This is not necessary if you've already set OPENAI_API_KEY in previous steps.
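For example, a minimal way to load the key from a .env file (assuming the python-dotenv package is installed):

```python
# Load environment variables (including OPENAI_API_KEY) from a local .env file.
# Assumes python-dotenv is installed: pip install python-dotenv
from dotenv import load_dotenv

load_dotenv(override=True)
```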
Checking the Conversation Length
Below, we set up our custom State to store both messages and summaries. We'll also define a helper function in a separate cell to determine if the conversation has exceeded a certain length.
The State class extends MessagesState, holding messages and summary.
We initialize a ChatOpenAI model with the name gpt-4o-mini.
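A minimal sketch of that setup, based on the description above (field names follow the text; adjust as needed):

```python
from langchain_openai import ChatOpenAI
from langgraph.graph import MessagesState


class State(MessagesState):
    """Conversation state: MessagesState already provides `messages`."""

    summary: str  # running summary of the older part of the conversation


# Chat model used by the graph nodes.
model = ChatOpenAI(model="gpt-4o-mini")
```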
Here is a separate code cell for the should_continue function, which checks if the conversation has more than 6 messages.
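A sketch of that routing function, assuming the summarization node is named summarize_conversation (as described in the next section):

```python
from langgraph.graph import END


def should_continue(state: State) -> str:
    """Route to the summarization node once the conversation has more than 6 messages."""
    messages = state["messages"]
    if len(messages) > 6:
        return "summarize_conversation"
    # Otherwise end the turn without summarizing.
    return END
```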
Summarizing and Managing the Conversation
When the conversation exceeds the threshold, we summarize it and remove older messages to preserve only a recent segment and the overall summary. We also create a node (ask_llm) to handle new messages, optionally including the existing summary.
ask_llm checks if a summary exists and, if so, prepends it as a SystemMessage.
summarize_conversation either creates or extends a summary and removes older messages.
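A sketch of both nodes under those assumptions; the prompt wording and the "keep the last two messages" cutoff are illustrative, following the behavior described later in this tutorial:

```python
from langchain_core.messages import HumanMessage, RemoveMessage, SystemMessage


def ask_llm(state: State):
    """Call the model, prepending the existing summary (if any) as a SystemMessage."""
    summary = state.get("summary", "")
    if summary:
        system_message = f"Summary of the conversation so far: {summary}"
        messages = [SystemMessage(content=system_message)] + state["messages"]
    else:
        messages = state["messages"]
    response = model.invoke(messages)
    return {"messages": [response]}


def summarize_conversation(state: State):
    """Create or extend the summary, then drop all but the two most recent messages."""
    summary = state.get("summary", "")
    if summary:
        summary_message = (
            f"This is the summary of the conversation so far: {summary}\n\n"
            "Extend the summary by taking into account the new messages above:"
        )
    else:
        summary_message = "Create a summary of the conversation above:"

    messages = state["messages"] + [HumanMessage(content=summary_message)]
    response = model.invoke(messages)

    # Remove all but the two most recent messages from the stored state.
    delete_messages = [RemoveMessage(id=m.id) for m in state["messages"][:-2]]
    return {"summary": response.content, "messages": delete_messages}
```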
Building Graph Workflow
Here, we construct a StateGraph, add our nodes, and compile the application. We also use visualize_graph(app) to see how the workflow is structured.
Use StateGraph to define nodes and edges.
Use visualize_graph(app) to see the workflow.
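A sketch of the graph wiring, assuming the node names conversation and summarize_conversation and an in-memory checkpointer:

```python
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, START, END

workflow = StateGraph(State)

# Nodes: the chat node and the summarization node.
workflow.add_node("conversation", ask_llm)
workflow.add_node("summarize_conversation", summarize_conversation)

# Always start with the chat node, then route based on conversation length.
workflow.add_edge(START, "conversation")
workflow.add_conditional_edges("conversation", should_continue)
workflow.add_edge("summarize_conversation", END)

# Compile with a checkpointer so state persists across turns of the same thread.
memory = MemorySaver()
app = workflow.compile(checkpointer=memory)
```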
Below is an example code snippet that visualizes the current workflow graph. We define custom node styles for the graph and then use IPython utilities to display the rendered image inline. The NodeStyles data class customizes fill colors, stroke styles, and overall appearance of the nodes.
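A hedged sketch of that cell: visualize_graph and NodeStyles are helpers from the langchain-opentutorial package (the import path is an assumption), with a plain LangGraph rendering as a fallback.

```python
from IPython.display import Image, display

try:
    # Helper from langchain-opentutorial (import path assumed).
    from langchain_opentutorial.graphs import visualize_graph

    visualize_graph(app)
except ImportError:
    # Fallback: render the compiled graph with LangGraph's built-in Mermaid export.
    display(Image(app.get_graph().draw_mermaid_png()))
```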

Running Workflow
You can interact with the application by sending messages. Once the conversation exceeds six messages, it is automatically summarized and shortened.
The app.stream method handles streaming of messages and triggers the nodes accordingly.
Check the internal state with app.get_state(config) to see how messages and summaries are updated.
Here is a helper function, print_update, which prints updates in real time during streaming.
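A sketch of such a helper; the exact formatting in the original cell may differ:

```python
def print_update(update: dict) -> None:
    """Print each node's streamed updates: new messages and, if present, the summary."""
    for values in update.values():
        for message in values.get("messages", []):
            message.pretty_print()
        if "summary" in values:
            print(values["summary"])
```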
Below is a single code cell demonstrating how we handle user messages and process them through streaming mode. We import HumanMessage, configure the session with a thread ID, and send three user messages in sequence. After each message, the updates are streamed and printed in real time.
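For example (the thread ID and message contents are illustrative):

```python
from langchain_core.messages import HumanMessage

# Each thread_id identifies an independent, persisted conversation.
config = {"configurable": {"thread_id": "1"}}

for text in [
    "Hi! Nice to meet you. My name is Teddy.",
    "What's my name?",
    "I'm a big fan of the LA Dodgers.",
]:
    input_message = HumanMessage(content=text)
    input_message.pretty_print()
    for event in app.stream({"messages": [input_message]}, config, stream_mode="updates"):
        print_update(event)
```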
So far, no summary has been generated because there are only six messages. Once the conversation exceeds six messages, summarization will be triggered.
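You can verify this by inspecting the checkpointed state:

```python
# Inspect the stored state for this thread: six messages, no summary yet.
values = app.get_state(config).values
print("summary:", values.get("summary", "<none>"))
print("message count:", len(values["messages"]))
```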
You can see that there are only six messages in the list, which is why no summary has been created yet. Let's send another message to exceed that threshold.
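For example (again with an illustrative message):

```python
# A fourth user message pushes the history past six messages,
# so should_continue routes to summarize_conversation.
input_message = HumanMessage(content="Do you remember which team I'm a fan of?")
input_message.pretty_print()
for event in app.stream({"messages": [input_message]}, config, stream_mode="updates"):
    print_update(event)
```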
Because the conversation now has more than six messages, summarization will be triggered.
During this process, old messages are removed, and only the summary plus the last two messages remain in the conversation state.
As you can see, the summary has been added, and the older messages have been replaced by RemoveMessage actions. Only the most recent messages and the newly created summary remain.
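Inspecting the state again shows the stored summary and the shortened history:

```python
values = app.get_state(config).values
print("summary:", values["summary"])
print("remaining messages:", len(values["messages"]))  # only the last two remain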
You can now continue the conversation, and despite only having the last two messages visible, the system still retains the overall context through the summary.
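For example:

```python
# The older turns are gone, but the summary lets the model answer this correctly.
input_message = HumanMessage(content="What's my name, and which team do I like?")
input_message.pretty_print()
for event in app.stream({"messages": [input_message]}, config, stream_mode="updates"):
    print_update(event)
```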
Even though the older messages were removed from the conversation history, the summary contains the essential context. The model can respond accurately based on the stored summary.