JSON
Let's look at how to load files with the .json extension using a loader.
Author: leebeanbin
Peer Review : syshin0116, Teddy Lee
Proofread : JaeJun Shim
This is a part of LangChain Open Tutorial
Overview
This tutorial demonstrates how to use LangChain's JSONLoader to load and process JSON files. We'll explore how to extract specific data from structured JSON files using jq-style queries.
Table of Contents
When you want to extract values under the content field within the message key of JSON data, you can easily do this using JSONLoader as shown below.
References
https://python.langchain.com/docs/how_to/document_loader_json/
Environment Setup
Set up the environment. You may refer to Environment Setup for more details.
[Note]
langchain-opentutorialis a package that provides a set of easy-to-use environment setup, useful functions and utilities for tutorials.You can check out the
langchain-opentutorialfor more details.
You can alternatively set OPENAI_API_KEY in .env file and load it.
[Note] This is not necessary if you've already set OPENAI_API_KEY in previous steps.
Generate JSON Data
If you want to generate JSON data, you can use the following code.
The case of loading JSON data is as follows when you want to load your own JSON data.
JSONLoader
JSONLoaderWhen you want to extract values under the content field within the message key of JSON data, you can easily do this using JSONLoader as shown below.
Basic Usage
This usage shows off how to execute load JSON and print what I get from
Loading Each Person as a Separate Document
We can load each person object from people.json as an individual document using the jq_schema=".people[]"
Using content_key within jq_schema
content_key within jq_schemaTo load documents from a JSON file using content_key within the jq_schema, set is_content_key_jq_parsable=True. Ensure that content_key is compatible and can be parsed using the jq_schema.
Extracting Metadata from people.json
people.jsonLet's define a metadata_func to extract relevant information like name, age, and city from each person object.
Understanding JSON Query Syntax
Let's explore the basic syntax of jq-style queries used in JSONLoader:
Basic Selectors
.: Current object.key: Access specific key in object.[]: Iterate over array elements
Pipe Operator
|: Pass result of left expression as input to right expression
Object Construction
{key: value}: Create new object
Example JSON:
Common Query Patterns:
.people[]: Access each array element.people[].name: Get all names.people[] | {name: .name}: Create new object with name.people[] | {name, email: .contact.email}: Extract nested data
[Note]
Always use
text_content=Falsewhen working with complex JSON dataThis ensures proper handling of non-string values (objects, arrays, numbers)
Advanced Queries
Here are examples of extracting specific information using different jq schemas:
These examples demonstrate the flexibility of jq queries in fetching data in various ways.
Last updated