LangGraph Streaming Mode
Author: Yejin Park
Proofread: Chaeyoon Kim
This is part of the LangChain Open Tutorial.
Overview
This tutorial demonstrates LangGraph's streaming capabilities by building an AI news search system.
It covers three key streaming modes: values, updates, and messages, each serving different output monitoring needs.
The tutorial also explores advanced features including subgraphs and tag-based filtering for enhanced control over real-time AI outputs.
Table of Contents
- Overview
- Environment Setup
- Introduction to Streaming Modes
- Defining the Graph
- Step-by-step output of a node
- Advanced Streaming Features
- Working with Subgraphs
Environment Setup
Setting up your environment is the first step. See the Environment Setup guide for more details.
[Note]
The langchain-opentutorial package provides easy-to-use environment setup guidance, useful functions, and utilities for these tutorials.
Check out the langchain-opentutorial for more details.
You can set API keys in a .env file or set them manually.
[Note] If you’re not using the .env file, no worries! Just enter the keys directly in the cell below, and you’re good to go.
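A minimal sketch of the setup cell, assuming the python-dotenv package is installed:

```python
# Load API keys from a .env file if one exists.
from dotenv import load_dotenv

load_dotenv(override=True)

# Or set the keys manually instead of using .env:
# import os
# os.environ["OPENAI_API_KEY"] = "sk-..."
```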
Introduction to Streaming Modes
LangGraph supports multiple streaming modes. The main ones are:
- values: streams back the values of the graph, i.e. the full state of the graph after each node is called.
- updates: streams back updates to the graph, i.e. the update to the state after each node is called.
- messages: streams LLM tokens from nodes as they are produced.
Defining the Graph
We'll create a simple agent that can search news and process the results.
First, we'll define a class to fetch Google News search results:
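Below is a minimal sketch of such a class, using only the standard library to query the Google News RSS feed; the tutorial's actual implementation may differ in details.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET


class GoogleNews:
    """Fetch Google News RSS search results as a list of dicts."""

    def search_by_keyword(self, keyword: str, k: int = 5) -> list[dict]:
        # Build the RSS search URL for the keyword.
        query = urllib.parse.quote(keyword)
        url = f"https://news.google.com/rss/search?q={query}"
        with urllib.request.urlopen(url) as resp:
            root = ET.fromstring(resp.read())
        # Keep the first k items, returning title and link for each.
        items = root.findall(".//item")[:k]
        return [
            {"title": item.findtext("title"), "url": item.findtext("link")}
            for item in items
        ]
```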
Now that we have our news fetching functionality, let's build the graph structure using LangGraph.
We'll create states, define tools, and establish the connections between different components:
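A sketch of the graph, assuming a ChatOpenAI model and the GoogleNews helper above; node and tool names such as `chatbot` and `search_keyword` are illustrative.

```python
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, MessagesState, START
from langgraph.prebuilt import ToolNode, tools_condition


@tool
def search_keyword(query: str) -> list[dict]:
    """Look up the latest news for the given keyword."""
    return GoogleNews().search_by_keyword(query, k=5)


tools = [search_keyword]
llm = ChatOpenAI(model="gpt-4o-mini")
llm_with_tools = llm.bind_tools(tools)


def chatbot(state: MessagesState):
    # Call the LLM and append its answer to the message history.
    return {"messages": [llm_with_tools.invoke(state["messages"])]}


builder = StateGraph(MessagesState)
builder.add_node("chatbot", chatbot)
builder.add_node("tools", ToolNode(tools))
builder.add_edge(START, "chatbot")
# Route to the tool node when the LLM requested a tool call, else end.
builder.add_conditional_edges("chatbot", tools_condition)
builder.add_edge("tools", "chatbot")
graph = builder.compile()
```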
Visualize the graph.
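For example, in a notebook environment:

```python
from IPython.display import Image, display

# Render the compiled graph as a Mermaid diagram.
display(Image(graph.get_graph().draw_mermaid_png()))
```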

Step-by-step output of a node
Streaming mode
- values: outputs the current state value at each step
- updates: outputs only the state updates at each step (default)
- messages: outputs messages at each step
Streaming here does not mean token-by-token streaming of LLM output, but rather step-by-step output.
Values Mode (stream_mode="values")
The values mode streams the complete state after each node execution.
Each chunk is the graph's full state as a dictionary; iterating over its items yields (key, value) pairs:
- key: a key of the State
- value: the value stored under that key
Synchronous Streaming
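A sketch of the synchronous loop; the question and `graph` come from the earlier sketches.

```python
inputs = {"messages": [("user", "Search for the latest AI news")]}

for chunk in graph.stream(inputs, stream_mode="values"):
    # Each chunk is the full state; show the newest message under each key.
    for state_key, state_value in chunk.items():
        print(f"{state_key}: {state_value[-1]}")
```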
Asynchronous Streaming
The astream() method runs the graph through asynchronous stream processing and yields chunks in values mode.
It uses an async for statement to consume the stream.
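A sketch of the async variant; run it inside an event loop (notebook cells support top-level await).

```python
async for chunk in graph.astream(inputs, stream_mode="values"):
    for state_key, state_value in chunk.items():
        print(f"{state_key}: {state_value[-1]}")
```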
If you only want to see the final result, do the following:
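```python
# Keep overwriting with the latest chunk; the last one is the final state.
final_state = None
async for chunk in graph.astream(inputs, stream_mode="values"):
    final_state = chunk
print(final_state["messages"][-1].content)
```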
Updates Mode (stream_mode="updates")
The updates mode streams only the changes to the state after each node execution.
The output is a dictionary with the node names as keys and the updated values as values.
Each chunk is a dictionary; iterating over its items yields (key, value) pairs:
- key: the name of the node
- value: that node's output for the step, itself a dictionary with one or more key-value pairs
Synchronous Streaming
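A sketch of the synchronous loop, assuming every node in this graph returns a "messages" update (which holds for the graph above).

```python
for chunk in graph.stream(inputs, stream_mode="updates"):
    for node, value in chunk.items():
        # Show which node ran and the newest message it produced.
        print(f"[{node}] {value['messages'][-1]}")
```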
Asynchronous Streaming
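The async variant only swaps the loop:

```python
async for chunk in graph.astream(inputs, stream_mode="updates"):
    for node, value in chunk.items():
        print(f"[{node}] {value['messages'][-1]}")
```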
Messages Mode (stream_mode="messages")
The messages mode streams individual messages from each node.
Each chunk is a tuple with two elements:
- chunk_msg: the real-time output message
- metadata: information about the node that produced it
Synchronous Streaming
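A sketch of the synchronous loop, printing LLM tokens as they arrive:

```python
from langchain_core.messages import AIMessageChunk

for chunk_msg, metadata in graph.stream(inputs, stream_mode="messages"):
    # Only print token chunks from the chat model, not tool messages.
    if isinstance(chunk_msg, AIMessageChunk):
        print(chunk_msg.content, end="", flush=True)
```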
Asynchronous Streaming
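And the async variant:

```python
async for chunk_msg, metadata in graph.astream(inputs, stream_mode="messages"):
    if isinstance(chunk_msg, AIMessageChunk):
        print(chunk_msg.content, end="", flush=True)
```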
Advanced Streaming Features
Streaming output to a specific node
To stream the output of a specific node only, use stream_mode="messages".
With stream_mode="messages", each chunk arrives as a (chunk_msg, metadata) tuple:
- chunk_msg: the real-time output message
- metadata: the node information
You can use metadata["langgraph_node"] to output only messages from a specific node.
Printing metadata shows the node information available for filtering, as in the sketch below.
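```python
# Print only the tokens produced by the "chatbot" node
# (the node name comes from the graph sketch above).
for chunk_msg, metadata in graph.stream(inputs, stream_mode="messages"):
    if metadata["langgraph_node"] == "chatbot" and chunk_msg.content:
        print(chunk_msg.content, end="", flush=True)
```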

Filtering with Tags
If your LLM's output comes from multiple places, you may want to show messages from only a specific node.
In that case, you can add tags to select only the nodes whose output you want.
Here's how to add tags to your LLM. Tags can be added as a list.
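For example (a sketch assuming ChatOpenAI; the tag name follows this tutorial):

```python
from langchain_openai import ChatOpenAI

# Attach a tag so this model's events can be identified when filtering.
llm = ChatOpenAI(model="gpt-4o-mini").with_config(tags=["WANT_TO_STREAM"])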
This lets you filter events more precisely, keeping only those produced by that model.
The example below outputs only if the WANT_TO_STREAM tag is present.
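```python
# Keep only token events that carry the WANT_TO_STREAM tag.
async for event in graph.astream_events(inputs, version="v2"):
    if event["event"] == "on_chat_model_stream" and "WANT_TO_STREAM" in event.get(
        "tags", []
    ):
        print(event["data"]["chunk"].content, end="", flush=True)
```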
Tool Call Streaming
- AIMessageChunk: a message output in real time, in token units.
- tool_call_chunks: tool-call chunks. If tool_call_chunks is present, the tool call is output cumulatively across chunks. (Tool tokens are identified by checking this property.)
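A sketch of accumulating tool-call chunks; message chunks support addition, so the tool call builds up as chunks arrive.

```python
from langchain_core.messages import AIMessageChunk

first = True
for chunk_msg, metadata in graph.stream(inputs, stream_mode="messages"):
    if isinstance(chunk_msg, AIMessageChunk):
        # Accumulate chunks; AIMessageChunk supports "+".
        if first:
            gathered = chunk_msg
            first = False
        else:
            gathered = gathered + chunk_msg
        if chunk_msg.tool_call_chunks:
            # Print the cumulative tool call assembled so far.
            print(gathered.tool_calls)
        else:
            print(chunk_msg.content, end="", flush=True)
```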
Working with Subgraphs
In this section, we'll learn how to structure a graph using subgraphs.
A subgraph is a graph used as a node inside another (parent) graph, letting you define and reuse parts of a graph.
Example Flow
The subgraph reuses the existing ability to search for the latest news.
The parent graph adds a step that generates social media posts based on the retrieved news, as sketched below.
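A sketch of the parent graph, reusing `builder` and `llm` from the earlier sketches; node names and the prompt are illustrative.

```python
from langgraph.graph import StateGraph, MessagesState, START, END

subgraph = builder.compile()  # the news-search graph defined earlier


def write_sns_post(state: MessagesState):
    # Illustrative prompt; the tutorial's actual wording may differ.
    prompt = "Write a short social media post based on the news above."
    return {"messages": [llm.invoke(state["messages"] + [("user", prompt)])]}


parent_builder = StateGraph(MessagesState)
parent_builder.add_node("search_news", subgraph)  # compiled graph as a node
parent_builder.add_node("sns_post", write_sns_post)
parent_builder.add_edge(START, "search_news")
parent_builder.add_edge("search_news", "sns_post")
parent_builder.add_edge("sns_post", END)
parent_graph = parent_builder.compile()
```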
We can visualize the subgraph flow using the xray option.
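```python
from IPython.display import Image, display

# xray=True expands subgraphs in the rendered diagram.
display(Image(parent_graph.get_graph(xray=True).draw_mermaid_png()))
```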

Now let's see how our graph processes and outputs the results when searching for AI news.
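```python
# Without subgraphs=True, only the parent graph's steps appear.
inputs = {"messages": [("user", "Search for the latest AI news")]}

for chunk in parent_graph.stream(inputs, stream_mode="updates"):
    for node, value in chunk.items():
        print(f"[{node}] {value['messages'][-1]}")
```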
When including Subgraphs output
You can also include the subgraphs' own output by passing subgraphs=True.
The output will be in the form (namespace, chunk).
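```python
for namespace, chunk in parent_graph.stream(
    inputs, stream_mode="updates", subgraphs=True
):
    for node, value in chunk.items():
        # An empty namespace means the parent graph; otherwise it names
        # the subgraph node path, e.g. ("search_news:<id>",).
        print(f"namespace={namespace} node={node}")
```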
Streaming LLM Output Token by Token Inside Subgraphs
When consuming astream_events(), each event's kind (i.e., event["event"]) indicates the type of event.
See the LangChain astream_events() reference for all event types.
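A sketch: astream_events() surfaces events from the parent graph and its subgraphs, so LLM tokens can be printed as they are produced.

```python
async for event in parent_graph.astream_events(inputs, version="v2"):
    kind = event["event"]  # the event type
    if kind == "on_chat_model_stream":
        print(event["data"]["chunk"].content, end="", flush=True)
```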
For streaming output of only specific tags
ONLY_STREAM_TAGS lets you list only the tags whose output you want to stream.
Here we see that "WANT_TO_STREAM2" is excluded from the output and only "WANT_TO_STREAM" is output.
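A sketch of the filtering loop; ONLY_STREAM_TAGS is a plain Python list defined in the tutorial, not a LangGraph option.

```python
ONLY_STREAM_TAGS = ["WANT_TO_STREAM"]  # "WANT_TO_STREAM2" is deliberately omitted

async for event in parent_graph.astream_events(inputs, version="v2"):
    # Keep the event only when it carries at least one allowed tag.
    if event["event"] == "on_chat_model_stream" and any(
        tag in ONLY_STREAM_TAGS for tag in event.get("tags", [])
    ):
        print(event["data"]["chunk"].content, end="", flush=True)
```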