Agentic RAG


Overview

Agentic RAG extends traditional Retrieval-Augmented Generation (RAG) systems by incorporating an agent-based approach to information retrieval and response generation. Rather than performing a single fixed retrieval step, the agent can choose among tools for more intelligent information processing. These tools include Tavily Search for accessing up-to-date information, Python code execution capabilities, and custom function implementations, all integrated within the LangChain framework to provide a comprehensive solution for information processing and generation tasks.

This tutorial demonstrates how to build a document retrieval system that uses a FAISS vector store for effective PDF document processing and searching. Using a chapter from "An Introduction to Ethics in Robotics and AI" as an example document, we'll explore how to integrate document loaders, text splitters, vector stores, and OpenAI embeddings to create a practical Agentic RAG system. The implementation showcases how a retriever tool can be combined with other LangChain components to create a robust document search and response generation pipeline.

Environment Setup

Set up the environment. You may refer to Environment Setup for more details.

[Note]

  • langchain-opentutorial is a package that provides easy-to-use environment setup, along with useful functions and utilities for these tutorials.

  • See the langchain-opentutorial repository for more details.

LangChain provides built-in tools that make it easy to use the Tavily search engine as a tool in your applications.

To use Tavily Search, you'll need to obtain an API key.

Sign up on the Tavily website to get your Tavily Search API key.

You can alternatively set API keys in a .env file and load it.

[Note] This is not necessary if you've already set API keys in previous steps.
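If you are loading keys from a .env file, a minimal sketch looks like the following. It assumes the python-dotenv package is installed and that your .env file defines OPENAI_API_KEY and TAVILY_API_KEY (the key names for OpenAI and Tavily):

```python
# Load API keys from a .env file into the process environment.
# Assumes the python-dotenv package and a .env file in the working directory.
import os

from dotenv import load_dotenv

load_dotenv(override=True)

# Check that the keys this tutorial needs are now available.
for key in ("OPENAI_API_KEY", "TAVILY_API_KEY"):
    if not os.getenv(key):
        print(f"Warning: {key} is not set")
```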

Configuring Tools

This is the foundational stage of setting up the tools the agent will use. We implement a web search tool using the Tavily Search API and a PDF document retrieval tool. These tools enable the agent to effectively search and utilize information from various sources, selecting the appropriate tool based on the context.

The web search tool utilizes the Tavily Search API to retrieve real-time information from the web. It returns up to 6 results ranked by relevance, with each result containing a URL and content snippet.
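The search tool can be sketched with LangChain's built-in Tavily integration. This assumes the langchain-community package is installed and TAVILY_API_KEY is set in the environment:

```python
# A sketch of the web search tool using LangChain's Tavily integration.
# Requires the langchain-community package and a TAVILY_API_KEY env variable.
from langchain_community.tools.tavily_search import TavilySearchResults

# Return up to 6 results, each with a URL and a content snippet.
search = TavilySearchResults(max_results=6)

# Example call (requires a valid API key):
# results = search.invoke("latest developments in AI ethics")
```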

This tutorial demonstrates how to build a PDF search tool that leverages vector databases for efficient document retrieval. The system divides PDF documents into manageable chunks and utilizes OpenAI embeddings for text vectorization alongside FAISS for fast similarity searching.

For this tutorial, we'll work with a sample document from the academic text "An Introduction to Ethics in Robotics and AI" (2021). This comprehensive book explores fundamental concepts including AI definitions, machine learning principles, robotics fundamentals, and the current limitations of AI technology.

  • Title: What Is AI?

  • Authors:

    • Christoph Bartneck (University of Canterbury)

    • Christoph Lütge (Technical University of Munich)

To begin, please place the PDF file in your data directory.
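The PDF retrieval tool described above can be sketched as follows. This assumes the langchain, langchain-community, langchain-openai, and faiss-cpu packages, an OPENAI_API_KEY in the environment, and a hypothetical file path data/What-Is-AI.pdf (substitute the actual filename you placed in your data directory):

```python
# A sketch of the PDF retrieval tool: load, chunk, embed, index, wrap as a tool.
from langchain.tools.retriever import create_retriever_tool
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# 1. Load the PDF and split it into manageable chunks.
loader = PyPDFLoader("data/What-Is-AI.pdf")  # hypothetical path
split_docs = loader.load_and_split(
    RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
)

# 2. Embed the chunks and index them with FAISS for fast similarity search.
vector_store = FAISS.from_documents(split_docs, OpenAIEmbeddings())
retriever = vector_store.as_retriever()

# 3. Wrap the retriever as a tool the agent can call by name.
retriever_tool = create_retriever_tool(
    retriever,
    name="pdf_search",
    description=(
        "Search the 'What Is AI?' chapter PDF. Use this for questions about "
        "AI definitions, machine learning principles, or robotics."
    ),
)
```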

Combining Tools

We combine multiple tools into a single list, allowing the agent to select and use the appropriate tool based on the context. This enables flexible switching between web search and document retrieval.
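Concretely, this is just a Python list. The sketch below assumes the `search` and `retriever_tool` objects built in the previous steps:

```python
# Combine the tools into a single list the agent can choose from.
# Assumes `search` (Tavily) and `retriever_tool` (PDF) from the previous steps.
tools = [search, retriever_tool]
```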

Building the Agent

This is the core stage of building the agent. We initialize a Large Language Model (LLM) and set up prompt templates that enable the agent to effectively utilize tools. The agent is configured to combine PDF search and web search capabilities, allowing it to find answers from various information sources. Specifically, we use create_tool_calling_agent to create an agent with tool-using capabilities and set up its execution environment with AgentExecutor.

Note: We set verbose=False to suppress intermediate step outputs from the agent executor.
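A sketch of this step, assuming an OPENAI_API_KEY in the environment and the `tools` list from the previous step; the model name and system message here are illustrative choices, not fixed by the tutorial:

```python
# A sketch of building the tool-calling agent and its executor.
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # model is an assumption

# The prompt needs a chat-history slot and the tool-calling scratchpad.
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Use `pdf_search` for questions "
            "about the PDF document and web search for current information.",
        ),
        MessagesPlaceholder("chat_history", optional=True),
        ("human", "{input}"),
        MessagesPlaceholder("agent_scratchpad"),
    ]
)

agent = create_tool_calling_agent(llm, tools, prompt)

# verbose=False suppresses the intermediate step outputs.
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=False)
```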

Implementing Chat History

This stage implements conversation-history management. We implement a session-based chat history store that allows the agent to remember and reference previous conversations. Using ChatMessageHistory, we maintain an independent conversation history for each session, and through RunnableWithMessageHistory, we enable the agent to understand conversation context and maintain natural dialogue flow. This allows users to ask follow-up questions naturally based on previous interactions.
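The session store and wrapper can be sketched as follows, assuming the `agent_executor` built in the previous step:

```python
# A sketch of session-based chat history around the agent executor.
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

store = {}  # session_id -> ChatMessageHistory


def get_session_history(session_id: str) -> ChatMessageHistory:
    # Create an independent history for each session on first use.
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]


agent_with_history = RunnableWithMessageHistory(
    agent_executor,
    get_session_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)

# Usage — each session_id keeps its own conversation:
# agent_with_history.invoke(
#     {"input": "What is AI?"},
#     config={"configurable": {"session_id": "abc123"}},
# )
```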

Running Examples

Finally, we run the implemented agent and examine its results. Using streaming output, we can observe the agent's thought process and results in real time. Through various examples, we showcase the agent's core functionalities, including PDF document search, web search, independent session management across conversations, and response restructuring. The process_response function helps structure the agent's responses, clearly showing tool usage and results in an organized manner.
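Streaming can be sketched as below, assuming the `agent_with_history` runnable from the previous step. An AgentExecutor's stream yields chunks keyed by "actions" (tool calls), "steps" (tool results), and "output" (the final answer); the inline printing here stands in for the tutorial's process_response helper:

```python
# A sketch of streaming the agent's intermediate steps and final answer.
config = {"configurable": {"session_id": "session-1"}}

for chunk in agent_with_history.stream(
    {"input": "Summarize the limitations of AI described in the PDF."},
    config=config,
):
    if "actions" in chunk:  # the agent decided to call a tool
        for action in chunk["actions"]:
            print(f"[Tool call] {action.tool} -> {action.tool_input}")
    elif "steps" in chunk:  # a tool returned an observation
        for step in chunk["steps"]:
            print(f"[Tool result] {str(step.observation)[:200]}")
    elif "output" in chunk:  # the final answer
        print(f"[Answer] {chunk['output']}")
```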
