Understanding the basic structure of RAG


Overview

1. Pre-processing - Steps 1 to 4


The pre-processing stage has four steps that load, split, embed, and store documents in a vector database (Vector DB).

  • Step 1 (Document Load): Load the document content.

  • Step 2 (Text Split): Split the document into chunks according to chosen criteria, such as chunk size and overlap.

  • Step 3 (Embedding): Generate embeddings for the chunks and prepare them for storage.

  • Step 4 (Vector DB Storage): Store the embedded chunks in the vector database.
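The four pre-processing steps can be sketched with the Python standard library alone. This is a minimal illustration, not the tutorial's actual code: the `split_text` and `embed` helpers, the chunk sizes, and the sample text are all assumptions made for the example, and the bag-of-words "embedding" is only a toy stand-in for a real embedding model.

```python
import math
from collections import Counter


def split_text(text: str, chunk_size: int = 40, chunk_overlap: int = 10) -> list[str]:
    """Split text into fixed-size character chunks that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current chunk already reaches the end of the text
    return chunks


def embed(text: str) -> dict[str, float]:
    """Toy embedding (assumption): L2-normalized word counts, not a learned model."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {word: c / norm for word, c in counts.items()}


# Step 1: "load" a document (a literal string stands in for a PDF loader).
document = (
    "The pre-processing stage loads, splits, embeds, and stores documents "
    "in a vector database. Each chunk is embedded before it is stored."
)

# Step 2: split the document into overlapping chunks.
chunks = split_text(document, chunk_size=40, chunk_overlap=10)

# Steps 3 and 4: embed each chunk and keep (chunk, vector) pairs in memory.
vector_store = [(chunk, embed(chunk)) for chunk in chunks]

print(f"{len(chunks)} chunks stored")
```

The overlap means the end of one chunk is repeated at the start of the next, so a sentence cut at a chunk boundary still appears whole in at least one chunk when it is shorter than the overlap.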

2. RAG Execution (Runtime) - Steps 5 to 8

  • Step 5 (Retriever): Define a retriever that fetches results from the database for an input query. Retrievers rely on search algorithms and fall into two categories, dense and sparse:

    • Dense: Similarity-based search.

    • Sparse: Keyword-based search.

  • Step 6 (Prompt): Create the prompt used to run RAG. The context in the prompt holds the content retrieved from the document. Through prompt engineering, you can also specify the format of the answer.

  • Step 7 (LLM): Define the language model (e.g., GPT-3.5, GPT-4, Claude).

  • Step 8 (Chain): Create a chain that connects the prompt, the LLM, and the output.
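The difference between the two retriever categories in Step 5 can be illustrated with a small standard-library sketch. Everything here is an assumption made for the example: the sample chunks, the helper names, and in particular the character-trigram vectors, which only stand in for the learned embeddings a real dense retriever would use.

```python
import math
from collections import Counter

# Hypothetical chunks standing in for the contents of a vector store.
CHUNKS = [
    "AI policy in Europe focuses on trust and regulation",
    "Manufacturing uses AI for predictive maintenance",
    "Urban mobility benefits from traffic forecasting",
]


def trigrams(text):
    """Return all overlapping 3-character substrings of the lowercased text."""
    text = text.lower()
    return [text[i:i + 3] for i in range(len(text) - 2)]


# Fixed vocabulary so that every text maps to a vector of the same length.
VOCAB = sorted({t for chunk in CHUNKS for t in trigrams(chunk)})


def to_vector(text):
    """Toy dense vector (assumption): trigram counts over the fixed vocabulary."""
    counts = Counter(trigrams(text))
    return [float(counts[t]) for t in VOCAB]


def cosine(a, b):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def dense_search(query, k=1):
    """Similarity-based search: rank chunks by cosine similarity to the query."""
    q = to_vector(query)
    ranked = sorted(CHUNKS, key=lambda c: cosine(q, to_vector(c)), reverse=True)
    return ranked[:k]


def sparse_search(query, k=1):
    """Keyword-based search: rank chunks by the number of exact shared words."""
    q_words = set(query.lower().split())
    scored = [(len(q_words & set(c.lower().split())), c) for c in CHUNKS]
    hits = [(s, c) for s, c in scored if s > 0]
    hits.sort(key=lambda sc: sc[0], reverse=True)
    return [c for _, c in hits[:k]]


print(sparse_search("regulations"))  # no chunk contains the exact word
print(dense_search("regulations"))   # similarity still finds the closest chunk
```

In this sketch the keyword search returns nothing for "regulations" because no chunk contains that exact word, while the similarity search still ranks the chunk mentioning "regulation" first. Which behavior is preferable depends on the documents and queries at hand.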

Document Used for Practice

A European Approach to Artificial Intelligence - A Policy Perspective

  • Author: EIT Digital and 5 EIT KICs (EIT Manufacturing, EIT Urban Mobility, EIT Health, EIT Climate-KIC, EIT Digital)

  • Link: https://eit.europa.eu/sites/default/files/eit-digital-artificial-intelligence-report.pdf

  • File Name: A European Approach to Artificial Intelligence - A Policy Perspective.pdf

Please copy the downloaded file to the data folder for practice.

Environment Setup

Set up the environment. You may refer to Environment Setup for more details.

[Note]

  • langchain-opentutorial is a package that provides easy-to-use environment setup helpers, functions, and utilities for the tutorials.

  • You can check out langchain-opentutorial for more details.

Set the API key.
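One common way to handle the key is to read it from an environment variable and fail early when it is missing. The sketch below assumes the variable is named `OPENAI_API_KEY`; that name and the `require_api_key` helper are assumptions for illustration, so adjust them to the provider and setup you actually use.

```python
import os


def require_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Return the key stored in the environment variable `name`, or raise if unset."""
    key = os.environ.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {name} environment variable before running the pipeline."
        )
    return key


# Usage (uncomment once the variable is set in your shell or .env loader):
# api_key = require_api_key()
```

Failing at startup keeps a missing key from surfacing later as a less obvious error in the middle of the pipeline.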

RAG Basic Pipeline

Below is the skeleton code for understanding the basic structure of RAG (Retrieval-Augmented Generation).

The content of each module can be adjusted to fit specific scenarios, allowing for iterative improvement of the structure to suit the documents.

(Different options or new techniques can be applied at each step.)

Print the content of the page.

Check the metadata.
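As a rough picture of what these two inspection steps look at, the sketch below models a loaded page as text plus a metadata dictionary. The `Page` class is a hypothetical stand-in rather than the loader's real return type, the page text is placeholder content, and the `source` and `page` metadata keys are assumptions about what a PDF loader typically records.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """Hypothetical stand-in for one loaded page: its text plus descriptive metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


# Assumed location: the practice PDF copied into the data folder.
SOURCE = "data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf"

# Placeholder pages; a real loader would fill these from the PDF.
pages = [
    Page("Placeholder text for page 1.", {"source": SOURCE, "page": 0}),
    Page("Placeholder text for page 2.", {"source": SOURCE, "page": 1}),
]

# Print the content of one page.
print(pages[1].page_content)

# Check the metadata attached to the same page.
print(pages[1].metadata)
```

Metadata of this kind is what lets a final answer point back to the file and page a retrieved chunk came from.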

Send a query to the retriever and check the resulting chunks.

Input a query (question) into the created chain and execute it.
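The flow from query to answer can be sketched as a plain composition of functions: retrieve context, fill the prompt, call the model, and parse the output. This is only an illustration under stated assumptions: the prompt wording, the word-overlap `retrieve` helper, and especially `fake_llm`, which echoes the first context line instead of calling a real model, are all hypothetical.

```python
PROMPT_TEMPLATE = (
    "Answer the question using only the context below.\n"
    "If the context does not contain the answer, say you don't know.\n\n"
    "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)


def retrieve(question, chunks, k=2):
    """Hypothetical retriever: rank chunks by words shared with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(
        chunks,
        key=lambda c: len(q_words & set(c.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Fill the template with the retrieved context and the user question."""
    return PROMPT_TEMPLATE.format(
        context="\n".join(context_chunks), question=question
    )


def fake_llm(prompt):
    """Stand-in for a real model call (assumption): echo the first context line."""
    context = prompt.split("Context:\n", 1)[1].split("\n\nQuestion:", 1)[0]
    lines = context.splitlines()
    return {"content": lines[0] if lines else ""}


def parse_output(response):
    """Extract the answer text from the model response."""
    return response["content"].strip()


def rag_chain(question, chunks):
    """Connect retriever, prompt, model, and output parser in order."""
    context = retrieve(question, chunks)
    prompt = build_prompt(question, context)
    return parse_output(fake_llm(prompt))


sample_chunks = [
    "The retriever fetches chunks from the vector store",
    "The prompt combines context and question",
]
print(rag_chain("What does the retriever do", sample_chunks))
```

Because each stage is a separate function, any one of them can be swapped out (a different retriever, prompt, or model) without touching the others, which mirrors the idea that each module of the pipeline can be adjusted independently.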

Complete Code

This is the combined code that integrates steps 1 through 8.
