The StructuredOutputParser is a valuable tool for formatting Large Language Model (LLM) responses into dictionary structures, enabling the return of multiple fields as key/value pairs.
While Pydantic and JSON parsers offer more robust schema enforcement, the StructuredOutputParser is particularly effective for less capable models, such as local models with fewer parameters, which often struggle to satisfy the stricter formats that advanced models like GPT or Claude handle easily.
By using the StructuredOutputParser, developers can maintain data integrity and consistency across LLM applications, even when working with smaller models.
Environment Setup
[Note]
langchain-opentutorial is a package that provides easy-to-use environment setup, along with useful functions and utilities, for these tutorials.
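A typical installation step, assuming the package is published on PyPI (the exact set of companion packages may vary by tutorial):
%pip install -qU langchain-opentutorial langchain langchain-openai python-dotenv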
You can alternatively set OPENAI_API_KEY in a .env file and load it.
[Note] This is not necessary if you've already set OPENAI_API_KEY in previous steps.
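For reference, a .env file is a plain-text file in your project root. A minimal example follows; the key value shown is a placeholder, not a real key:
OPENAI_API_KEY=sk-...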
from dotenv import load_dotenv
# Load environment variables (such as OPENAI_API_KEY) from a .env file;
# returns True if a .env file was found and loaded
load_dotenv(override=True)
False
Implementing the StructuredOutputParser
Using ResponseSchema with StructuredOutputParser
Define a response schema using the ResponseSchema class to include the answer to the user's question and a description of the source (website) used.
Initialize StructuredOutputParser with response_schemas to structure the output according to the defined response schema.
[Note]
When using local models, Pydantic-based parsers often fail to produce valid structured output. In such cases, the StructuredOutputParser is a good alternative (a sketch of this setup appears at the end of this tutorial).
from langchain.output_parsers import ResponseSchema, StructuredOutputParser

# Define the response schemas: the answer to the user's question,
# plus the source (website URL) used to produce it
response_schemas = [
    ResponseSchema(name="answer", description="Answer to the user's question"),
    ResponseSchema(
        name="source",
        description="The `source` used to answer the user's question, which should be a `website URL`.",
    ),
]
# Initialize the structured output parser based on the response schemas
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)
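To see what the parser will ask the model to produce, print its format instructions. The exact wording varies by LangChain version, but it instructs the model to return a markdown ```json code block containing the answer and source keys:
# Inspect the instructions the parser injects into the prompt
print(output_parser.get_format_instructions())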
Embedding Response Schemas into Prompts
Create a PromptTemplate to format user questions and embed parsing instructions for structured outputs.
from langchain_core.prompts import PromptTemplate

# Retrieve the parser's format instructions
format_instructions = output_parser.get_format_instructions()
prompt = PromptTemplate(
    # Set up the template to answer the user's question as well as possible
    template="answer the user's question as well as possible.\n{format_instructions}\n{question}",
    # Use 'question' as the input variable
    input_variables=["question"],
    # Inject 'format_instructions' as a partial variable
    partial_variables={"format_instructions": format_instructions},
)
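To verify what the model will actually receive, you can render the template with a sample question (the question below is only an illustration):
# Render the final prompt text with the format instructions filled in
print(prompt.format(question="What is the largest desert in the world?"))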
Integrating with ChatOpenAI and Running the Chain
Combine the PromptTemplate, ChatOpenAI model, and StructuredOutputParser into a chain. Finally, run the chain with a specific question to produce results.
from langchain_openai import ChatOpenAI
model = ChatOpenAI(temperature=0) # Initialize the ChatOpenAI model
chain = prompt | model | output_parser # Connect the prompt, model, and output parser
# Ask the question, "What is the largest desert in the world?"
chain.invoke({"question": "What is the largest desert in the world?"})
{'answer': 'The largest desert in the world is the Antarctic Desert.',
'source': 'https://www.worldatlas.com/articles/what-is-the-largest-desert-in-the-world.html'}
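Under the hood, the parser extracts the ```json code block from the raw model text and converts it into a Python dictionary. You can also call it directly on a raw string; the string below is a hand-written example, not actual model output:
# Parse a raw model response into a dictionary
raw_output = '```json\n{"answer": "The Antarctic Desert.", "source": "https://example.com"}\n```'
output_parser.parse(raw_output)
# -> {'answer': 'The Antarctic Desert.', 'source': 'https://example.com'}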
Using Streamed Outputs
Use the chain.stream method to receive a streaming response to the question, "How many players are on a soccer team?"
for s in chain.stream({"question": "How many players are on a soccer team?"}):
    # Stream the output
    print(s)
{'answer': 'A standard soccer team consists of 11 players on the field at a time.', 'source': 'https://www.fifa.com/who-we-are/news/what-are-the-rules-of-football-2040008'}
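As noted earlier, this parser is a good fit for smaller local models. The following is a minimal sketch of the same chain running against a local model; it assumes the langchain-ollama package is installed and an Ollama server is running with the llama3.1 model pulled (both are assumptions, not part of this tutorial's setup):
from langchain_ollama import ChatOllama

# Swap in a local model; the rest of the chain is unchanged
local_model = ChatOllama(model="llama3.1", temperature=0)  # assumed local model name
local_chain = prompt | local_model | output_parser
local_chain.invoke({"question": "What is the largest desert in the world?"})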