Structured Output Parser
Author: Yoolim Han
Proofread: BokyungisaGod
This is a part of LangChain Open Tutorial
Overview
The StructuredOutputParser is a valuable tool for formatting Large Language Model (LLM) responses into dictionary structures, enabling the return of multiple fields as key/value pairs.
While Pydantic and JSON parsers offer robust capabilities, the StructuredOutputParser is particularly effective for less capable models, such as local models with fewer parameters, which may struggle with the stricter output formats that advanced models like GPT or Claude handle easily.
By utilizing the StructuredOutputParser, developers can maintain data integrity and consistency across various LLM applications, even when operating with models that have reduced parameter counts.
Environment Setup
Set up the environment. You may refer to Environment Setup for more details.
[Note]
langchain-opentutorial is a package that provides easy-to-use environment setup along with useful functions and utilities for tutorials. You can check out langchain-opentutorial for more details.
%%capture --no-stderr
%pip install langchain-opentutorial
# Install required packages
from langchain_opentutorial import package
package.install(
    [
        "langsmith",
        "langchain",
        "langchain_openai",
        "langchain_community",
    ],
    verbose=False,
    upgrade=False,
)
# Set environment variables
from langchain_opentutorial import set_env
set_env(
    {
        "OPENAI_API_KEY": "",
        "LANGCHAIN_API_KEY": "",
        "LANGCHAIN_TRACING_V2": "true",
        "LANGCHAIN_ENDPOINT": "https://api.smith.langchain.com",
        "LANGCHAIN_PROJECT": "03-StructuredOutputParser",
    }
)
Environment variables have been set successfully.
You can alternatively set OPENAI_API_KEY in a .env file and load it.
[Note] This is not necessary if you've already set OPENAI_API_KEY in previous steps.
from dotenv import load_dotenv
load_dotenv(override=True)
False
Implementing the StructuredOutputParser
Using ResponseSchema with StructuredOutputParser
Define a response schema using the ResponseSchema class to include the answer to the user's question and a description of the source (website) used.
Initialize StructuredOutputParser with response_schemas to structure the output according to the defined response schema.
[Note]
When using local models, Pydantic parsers may frequently fail to work properly. In such cases, StructuredOutputParser can be a good alternative.
from langchain.output_parsers import ResponseSchema, StructuredOutputParser
# Response to the user's question
response_schemas = [
    ResponseSchema(name="answer", description="Answer to the user's question"),
    ResponseSchema(
        name="source",
        description="The `source` used to answer the user's question, which should be a `website URL`.",
    ),
]
# Initialize the structured output parser based on the response schemas
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)
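Under the hood, the parser instructs the model to reply with its key/value pairs inside a fenced JSON code block, then extracts and parses that block. The following is a simplified, dependency-free sketch of that parsing step (illustrative only, not LangChain's actual implementation):

```python
import json
import re


def parse_structured(text: str) -> dict:
    # Three backticks, built indirectly so this snippet stays fence-safe.
    fence = chr(96) * 3
    # Pull the JSON payload out of a fenced ```json block if one is present;
    # otherwise try to parse the whole string as JSON.
    match = re.search(fence + r"json\s*(.*?)" + fence, text, re.DOTALL)
    payload = match.group(1) if match else text
    return json.loads(payload)


fence = chr(96) * 3
raw_reply = f'{fence}json\n{{"answer": "11 players", "source": "https://example.com"}}\n{fence}'
parse_structured(raw_reply)
# → {'answer': '11 players', 'source': 'https://example.com'}
```

This is why StructuredOutputParser works well with weaker models: emitting a small JSON object in a code block is an easier task than satisfying a full Pydantic schema.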
Embedding Response Schemas into Prompts
Create a PromptTemplate to format user questions and embed parsing instructions for structured outputs.
from langchain_core.prompts import PromptTemplate
# Parse the format instructions.
format_instructions = output_parser.get_format_instructions()
prompt = PromptTemplate(
    # Set up the template to answer the user's question as well as possible.
    template="answer the user's question as well as possible.\n{format_instructions}\n{question}",
    # Use 'question' as the input variable.
    input_variables=["question"],
    # Use 'format_instructions' as a partial variable.
    partial_variables={"format_instructions": format_instructions},
)
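The partial_variables mechanism binds format_instructions once when the template is constructed, so each invocation only needs to supply question. A plain-Python sketch of that two-stage substitution (the instruction text here is a hypothetical placeholder, not the real rendered instructions):

```python
# Two-stage substitution, mirroring PromptTemplate's partial variables.
template = "answer the user's question as well as possible.\n{format_instructions}\n{question}"

# Stage 1: bind the partial variable once (hypothetical instruction text).
partially_bound = template.replace(
    "{format_instructions}",
    "Return a JSON object with keys `answer` and `source`.",
)

# Stage 2: each call supplies only the remaining variable.
prompt_text = partially_bound.format(question="What is the largest desert in the world?")
print(prompt_text)
```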
Integrating with ChatOpenAI and Running the Chain
Combine the PromptTemplate, ChatOpenAI model, and StructuredOutputParser into a chain. Finally, run the chain with a specific question to produce results.
from langchain_openai import ChatOpenAI
model = ChatOpenAI(temperature=0) # Initialize the ChatOpenAI model
chain = prompt | model | output_parser # Connect the prompt, model, and output parser
# Ask the question, "What is the largest desert in the world?"
chain.invoke({"question": "What is the largest desert in the world?"})
{'answer': 'The largest desert in the world is the Antarctic Desert.',
'source': 'https://www.worldatlas.com/articles/what-is-the-largest-desert-in-the-world.html'}
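Even with format instructions, a weaker model may occasionally omit a field, so it can be worth checking the parsed dictionary against the schema before using it. A minimal sketch (the key names match the response_schemas defined above; the helper itself is hypothetical, not part of LangChain):

```python
def validate_keys(result: dict, required=("answer", "source")) -> dict:
    # Raise if any schema key is missing from the parsed output.
    missing = [key for key in required if key not in result]
    if missing:
        raise ValueError(f"parsed output is missing keys: {missing}")
    return result


validate_keys({"answer": "The Antarctic Desert", "source": "https://example.com"})
# returns the dict unchanged when all required keys are present
```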
Using Streamed Outputs
Use the chain.stream method to receive a streaming response to the question, "How many players are on a soccer team?"
for s in chain.stream({"question": "How many players are on a soccer team?"}):
    # Stream the output
    print(s)
{'answer': 'A standard soccer team consists of 11 players on the field at a time.', 'source': 'https://www.fifa.com/who-we-are/news/what-are-the-rules-of-football-2040008'}