Naive RAG


Overview

In this chapter, Section 02 (Naive-RAG) through Section 05 (Add-Query-Rewrite) are not independent sections; together they cover a single topic.

We'll build a basic RAG pipeline in this section and extend it step by step in the sections that follow.


Environment Setup

Setting up your environment is the first step. See the Environment Setup guide for more details.

[Note]

The langchain-opentutorial is a package of easy-to-use environment setup guidance, useful functions and utilities for tutorials. Check out the langchain-opentutorial for more details.

You can set API keys in a .env file or set them manually.

[Note] If you're not using the .env file, no worries! Just enter the keys directly in the cell below, and you're good to go.
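If you set keys manually, a minimal sketch using only the standard library might look like the following. The helper name `set_env_if_missing`, the variable name `OPENAI_API_KEY`, and the placeholder value are assumptions for illustration; they are not taken from langchain-opentutorial.

```python
import os


def set_env_if_missing(values: dict[str, str]) -> list[str]:
    """Set each environment variable unless it is already defined.

    Returns the names that were newly set, so values that already exist
    (for example, ones loaded from a .env file) are never overwritten.
    """
    newly_set = []
    for name, value in values.items():
        if name not in os.environ:
            os.environ[name] = value
            newly_set.append(name)
    return newly_set


# Assumed key name and placeholder value: replace with your own key.
set_env_if_missing({"OPENAI_API_KEY": "<your-api-key>"})
```

Because existing variables are left untouched, this cell is safe to run whether or not a .env file has already been loaded.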

Procedure

This section builds Naive RAG, the most basic RAG system, which consists of two steps: Retrieve and Generate.

You can see the structure in the image below.

Creating a Basic PDF-Based Retrieval Chain

This section creates a Retrieval Chain based on a PDF document. It is the simplest structure of a Retrieval Chain.

In LangGraph, Retrievers and Chains are created separately. This allows detailed processing at each node.

First, use the pdf_retriever to fetch search results.

You can control how many documents are retrieved by changing the self_k argument in the pdf.py file.

Pass the search result as context to the chain.
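The two steps above (retrieve, then pass the results as context) can be sketched without any external dependencies. The toy word-overlap retriever, its `k` parameter, and the sample documents are stand-ins for illustration only; they are not the tutorial's pdf_retriever, which searches the PDF.

```python
def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents sharing the most words with the question."""
    query_words = set(question.lower().split())

    def overlap(doc: str) -> int:
        return len(query_words & set(doc.lower().split()))

    # sorted() is stable, so documents with equal scores keep their order.
    return sorted(documents, key=overlap, reverse=True)[:k]


def build_prompt(question: str, context_docs: list[str]) -> str:
    """Join the retrieved documents into a context block for the chain."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical sample data standing in for PDF chunks.
docs = [
    "LangGraph builds stateful graphs of nodes and edges.",
    "A retriever returns documents relevant to a query.",
    "Bananas are rich in potassium.",
]
question = "What does a retriever return?"
top_docs = retrieve(question, docs, k=1)
prompt = build_prompt(question, top_docs)
```

Raising `k` here plays the same role as raising the number of retrieved documents in the real retriever: more context reaches the chain, at the cost of a longer prompt.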

Defining State

State defines the data shared among the nodes of the graph.

Typically, the TypedDict format is used.
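As a minimal sketch, a state for a two-step RAG graph could be declared as follows. The class name `GraphState` and the field names `question`, `context`, and `answer` are assumptions chosen for illustration.

```python
from typing import TypedDict


class GraphState(TypedDict):
    question: str  # the user's question
    context: str   # text returned by the retrieve step
    answer: str    # text produced by the generate step


# A TypedDict is an ordinary dict at runtime; the class only adds type hints.
state: GraphState = {"question": "What is Naive RAG?", "context": "", "answer": ""}
```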

Defining Nodes

Nodes: Functions that handle each stage of the workflow, typically implemented as Python functions. Their inputs and outputs are State values.

[Note]

  • A node takes a State as input, performs its defined logic, and returns an updated State.
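The following sketch shows what such node functions might look like, with the retriever and the LLM replaced by stubs. The names `retrieve_node` and `generate_node` and the state fields are hypothetical. Each function returns only the keys it changed; here the updates are merged by hand, which is the job LangGraph performs when the functions run inside a graph.

```python
from typing import TypedDict


class GraphState(TypedDict, total=False):
    question: str
    context: str
    answer: str


def retrieve_node(state: GraphState) -> GraphState:
    """Retrieve step: look up context for the question (stubbed here)."""
    context = f"(documents matching: {state['question']})"
    return {"context": context}


def generate_node(state: GraphState) -> GraphState:
    """Generate step: produce an answer from the context (stubbed here)."""
    answer = f"Answer based on {state['context']}"
    return {"answer": answer}


# Run the two steps by hand, merging each update into the state.
state: GraphState = {"question": "What is Naive RAG?"}
state = {**state, **retrieve_node(state)}
state = {**state, **generate_node(state)}
```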

Creating the Graph

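Edges: Python functions that determine the next Node to execute based on the current State.

There are two kinds: general (normal) edges, which always lead to the same next node, and conditional edges, which choose the next node based on the State.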

Visualize the compiled graph.


Executing the Graph

  • The config parameter provides the configuration information necessary for graph execution.

  • recursion_limit: Sets the maximum number of steps the graph may take in one execution; exceeding it raises an error instead of looping indefinitely.

  • inputs: Provides the input data for the graph execution.

The stream_graph function below streams the output of specific nodes only.

This makes it easy to check the streaming output of a specific node.
