CommaSeparatedListOutputParser

Author: Junseong Kim
Peer Review: Teddy Lee, stsr1284, brian604
Proofread: BokyungisaGod
This is part of the LangChain Open Tutorial

Overview

The CommaSeparatedListOutputParser is a specialized output parser in LangChain designed for generating structured outputs in the form of comma-separated lists.

It converts the model's text response into a Python list, which is useful whenever you need a flat collection of values such as data points, names, or items. Because the parser also supplies the formatting instructions for the prompt, the output format stays consistent from call to call and downstream code can work with a list instead of raw text.

This tutorial demonstrates how to use the CommaSeparatedListOutputParser to:

Set up and initialize the parser for generating comma-separated lists
Integrate it with a prompt template and language model
Process structured outputs iteratively using streaming mechanisms

Environment Setup

Set up the environment. You may refer to Environment Setup for more details.

[Note]

langchain-opentutorial is a package that provides easy-to-use environment setup helpers, functions, and utilities for these tutorials.
You can check out langchain-opentutorial for more details.

%%capture --no-stderr
%pip install langchain-opentutorial

# Install required packages
from langchain_opentutorial import package

package.install(
    [
        "langsmith",
        "langchain",
        "langchain_openai",
        "langchain_community",
    ],
    verbose=False,
    upgrade=False,
)

# Set environment variables
from langchain_opentutorial import set_env

set_env(
    {
        "OPENAI_API_KEY": "",
        "LANGCHAIN_API_KEY": "",
        "LANGCHAIN_TRACING_V2": "true",
        "LANGCHAIN_ENDPOINT": "https://api.smith.langchain.com",
        "LANGCHAIN_PROJECT": "02-CommaSeparatedListOutputParser",
    }
)

Environment variables have been set successfully.

Alternatively, you can set OPENAI_API_KEY in a .env file and load it.

[Note] This is not necessary if you've already set OPENAI_API_KEY in previous steps.

from dotenv import load_dotenv

load_dotenv()

True

Implementing the `CommaSeparatedListOutputParser`

If you need to generate outputs in the form of a comma-separated list, the CommaSeparatedListOutputParser from LangChain simplifies the process. Below is a step-by-step implementation:

1. Importing Required Modules

Start by importing the necessary modules and initializing the CommaSeparatedListOutputParser. Retrieve the formatting instructions from the parser to guide the output structure.

from langchain_core.output_parsers import CommaSeparatedListOutputParser

# Initialize the output parser
output_parser = CommaSeparatedListOutputParser()

# Retrieve format instructions for the output parser
format_instructions = output_parser.get_format_instructions()
print(format_instructions)

Your response should be a list of comma separated values, eg: `foo, bar, baz` or `foo,bar,baz`
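To make the target format concrete, here is a minimal standard-library sketch of turning such a comma-separated reply into a Python list. The helper name `split_comma_list` is hypothetical, and this is not LangChain's own implementation; it only illustrates the kind of transformation the parser applies to the model's text.

```python
import csv
from io import StringIO


def split_comma_list(text: str) -> list[str]:
    """Split a comma-separated string into a list of trimmed items.

    Hypothetical helper for illustration; not part of LangChain.
    """
    # csv.reader respects quoted items; skipinitialspace drops the blank after each comma
    reader = csv.reader(StringIO(text), skipinitialspace=True)
    return [item.strip() for row in reader for item in row]


# Both spellings from the format instructions produce the same list
print(split_comma_list("foo, bar, baz"))
print(split_comma_list("foo,bar,baz"))
```

Using the `csv` module rather than a bare `str.split(",")` is a design choice in this sketch: it keeps a quoted item such as `"Seoul, Korea"` together as one value.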

2. Creating the Prompt Template

Define a PromptTemplate that dynamically generates a list of items. The placeholder subject will be replaced with the desired topic during execution.

from langchain_core.prompts import PromptTemplate

# Define the prompt template
prompt = PromptTemplate(
    template="List five {subject}.\n{format_instructions}",
    input_variables=["subject"],  # 'subject' will be dynamically replaced
    partial_variables={
        "format_instructions": format_instructions
    },  # Use parser's format instructions
)
print(prompt)

input_variables=['subject'] input_types={} partial_variables={'format_instructions': 'Your response should be a list of comma separated values, eg: `foo, bar, baz` or `foo,bar,baz`'} template='List five {subject}.\n{format_instructions}'
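The printed template shows two placeholders: `format_instructions` is already fixed through partial_variables, while `subject` is supplied at run time. As a rough illustration of the text the model ends up receiving, the sketch below fills the same template string with plain Python `str.format`. It is an approximation of the substitution, not a call into LangChain, and the `rendered` variable name is only for this example.

```python
# Plain-Python sketch of how the two placeholders get filled
template = "List five {subject}.\n{format_instructions}"

# Value copied from the parser output printed in step 1
format_instructions = (
    "Your response should be a list of comma separated values, "
    "eg: `foo, bar, baz` or `foo,bar,baz`"
)

# 'subject' is the only value that changes between calls
rendered = template.format(
    subject="famous landmarks in South Korea",
    format_instructions=format_instructions,
)
print(rendered)
```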

3. Integrating with `ChatOpenAI` and Running the Chain

Combine the PromptTemplate, ChatOpenAI model, and CommaSeparatedListOutputParser into a chain. Finally, run the chain with a specific subject to produce results.

from langchain_openai import ChatOpenAI

# Initialize the ChatOpenAI model
model = ChatOpenAI(temperature=0)

# Combine the prompt, model, and output parser into a chain
chain = prompt | model | output_parser

# Run the chain with a specific subject
result = chain.invoke({"subject": "famous landmarks in South Korea"})
print(result)

['Gyeongbokgung Palace', 'N Seoul Tower', 'Bukchon Hanok Village', 'Seongsan Ilchulbong Peak', 'Haeundae Beach']
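Conceptually, the chain passes the output of each step to the next: the input dict becomes prompt text, the prompt text becomes a model reply, and the reply becomes a list. The toy sketch below mimics that data flow with ordinary functions so it runs without an API key. Every name in it (`pipe`, `render_prompt`, `fake_model`, `parse_list`) is hypothetical, the model reply is canned, and LangChain's `|` operator builds a richer runnable object than this simple composition.

```python
from functools import reduce
from typing import Any, Callable


def pipe(*steps: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Compose steps left to right, so the output of one feeds the next."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


def render_prompt(inputs: dict) -> str:
    # Stand-in for the prompt template step
    return "List five {subject}.".format(**inputs)


def fake_model(prompt_text: str) -> str:
    # Canned reply standing in for a real LLM call (assumption for illustration)
    return "Gyeongbokgung Palace, N Seoul Tower, Bukchon Hanok Village"


def parse_list(reply: str) -> list[str]:
    # Stand-in for the output parser step
    return [item.strip() for item in reply.split(",")]


toy_chain = pipe(render_prompt, fake_model, parse_list)
toy_result = toy_chain({"subject": "famous landmarks in South Korea"})
print(toy_result)
```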

4. Accessing Data with Python Indexing

Because the CommaSeparatedListOutputParser returns a standard Python list, you can access individual elements with ordinary indexing.

# Accessing specific elements using Python indexing
print("First Landmark:", result[0])
print("Second Landmark:", result[1])
print("Last Landmark:", result[-1])

First Landmark: Gyeongbokgung Palace
Second Landmark: N Seoul Tower
Last Landmark: Haeundae Beach
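Since the result is an ordinary list, slicing and iteration work as well as single-index access. The short sketch below uses a hard-coded copy of the sample output above, so it runs without calling the model. The length check at the end is a defensive habit suggested here, on the assumption that a model may return fewer items than the prompt asks for.

```python
# Sample list copied from the chain output shown above
landmarks = [
    "Gyeongbokgung Palace",
    "N Seoul Tower",
    "Bukchon Hanok Village",
    "Seongsan Ilchulbong Peak",
    "Haeundae Beach",
]

# Slicing returns a sub-list instead of a single item
top_three = landmarks[:3]
print(top_three)

# enumerate pairs each item with its position, starting at 1
for position, name in enumerate(landmarks, start=1):
    print(f"{position}. {name}")

# Guard against a shorter-than-expected reply before indexing
sixth = landmarks[5] if len(landmarks) > 5 else None
print(sixth)
```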

Using Streamed Outputs

For larger outputs or real-time feedback, you can process the results using the stream method. Instead of waiting for the full response, the chain yields each item as soon as it is complete, and each yielded value is itself a one-element list.

# Iterate through the streamed output for a subject
for output in chain.stream({"subject": "famous landmarks in South Korea"}):
    print(output)

['Gyeongbokgung Palace']
['N Seoul Tower']
['Bukchon Hanok Village']
['Seongsan Ilchulbong Peak']
['Haeundae Beach']
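The output above shows one single-item list per streamed step. The sketch below illustrates, under simplifying assumptions, how such behavior can arise from partial text chunks: buffer the incoming text, emit an item whenever a comma completes it, and flush the remainder at the end. The `stream_items` helper and the chunk data are hypothetical and are not LangChain's implementation. The loop also shows how to rebuild the full list from the streamed pieces with `extend`.

```python
from typing import Iterable, Iterator


def stream_items(chunks: Iterable[str]) -> Iterator[list[str]]:
    """Yield each list item as a one-element list once it is complete.

    Hypothetical helper for illustration; not LangChain's implementation.
    """
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # Everything before the last comma is complete; keep the tail buffered
        *complete, buffer = buffer.split(",")
        for item in complete:
            if item.strip():
                yield [item.strip()]
    # Flush whatever remains after the stream ends
    if buffer.strip():
        yield [buffer.strip()]


# Simulated token chunks, as a model might emit them (made-up data)
fake_chunks = ["Gyeong", "bokgung Palace, N Se", "oul Tower, ", "Haeundae Beach"]

collected: list[str] = []
for output in stream_items(fake_chunks):
    print(output)
    collected.extend(output)  # rebuild the full list from the streamed pieces

print(collected)
```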
