Naive RAG
Last updated
Last updated
Author:
Design:
Peer Review:
Proofread :
This is a part of
In this chapter, Section 02(Naive-RAG) through Section 05(Add-Query-Rewrite) is not an independent section, but cover one topic.
We'll make a basic RAG on this section, and make more complicated RAG as sections goes by.
[Note]
%%capture --no-stderr
%pip install langchain-opentutorial
# Install required packages
from langchain_opentutorial import package
package.install(
[
"langsmith",
"langchain",
"langchain_core",
"langchain_community",
"langgraph",
"typing",
"langchain-opentutorial"
],
verbose=False,
upgrade=False,
)
You can set API keys in a .env
file or set them manually.
[Note] If youβre not using the .env
file, no worries! Just enter the keys directly in the cell below, and youβre good to go.
from dotenv import load_dotenv
from langchain_opentutorial import set_env
# Attempt to load environment variables from a .env file; if unsuccessful, set them manually.
if not load_dotenv():
set_env(
{
"OPENAI_API_KEY": "",
"LANGCHAIN_API_KEY": "",
"LANGCHAIN_TRACING_V2": "true",
"LANGCHAIN_ENDPOINT": "https://api.smith.langchain.com",
"LANGCHAIN_PROJECT": "02-LangGraph-Naive-RAG", # set the project name same as the title
}
)
Perform Naive RAG
, the basic RAG system which has 2 progress, Retrieve and Generate .
You can see the structure in the image below.
This section creates a Retrieval Chain based on a PDF document . It is the simplest structure of a Retrieval Chain.
In LangGraph
, Retrievers and Chains are created separately. This allows detailed processing for each node.
from rag.pdf import PDFRetrievalChain
# Load the PDF document
pdf = PDFRetrievalChain(["data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf"]).create_chain()
# Create retriever and chain
pdf_retriever = pdf.retriever
pdf_chain = pdf.chain
First, use the pdf_retriever
to fetch search results.
You can control the quantity to retrieve, by changing self_k
argument in pdf.py
file.
search_result = pdf_retriever.invoke(
"Where has the application of AI in healthcare been confined to so far?"
)
search_result
[Document(id='76d10387-a006-4d7c-8bd1-c316831d6094', metadata={'source': 'data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf', 'file_path': 'data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf', 'page': 14, 'total_pages': 24, 'CreationDate': "D:20200922223534+02'00'", 'Creator': 'Adobe InDesign 15.1 (Macintosh)', 'ModDate': "D:20200922223544+02'00'", 'Producer': 'Adobe PDF Library 15.0', 'Trapped': 'False'}, page_content='activities. So far, however, AI applications in healthcare have been potential. Specific healthcare training should be provided to data\nconfined to administrative tasks (i.e., Natural Language Processing scientists working in hospitals so that they can better understand'),
Document(id='f4f78deb-c8d9-4c84-a3e4-a91b0477413d', metadata={'source': 'data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf', 'file_path': 'data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf', 'page': 14, 'total_pages': 24, 'CreationDate': "D:20200922223534+02'00'", 'Creator': 'Adobe InDesign 15.1 (Macintosh)', 'ModDate': "D:20200922223544+02'00'", 'Producer': 'Adobe PDF Library 15.0', 'Trapped': 'False'}, page_content='are great, as more use of AI in research and development could\nHealthcare is arguably the sector where AI could make the lead to a more personalised healthcare based on patientsβ data.'),
Document(id='e393ec31-5738-4454-aab0-bf1f33ec1d18', metadata={'source': 'data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf', 'file_path': 'data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf', 'page': 10, 'total_pages': 24, 'CreationDate': "D:20200922223534+02'00'", 'Creator': 'Adobe InDesign 15.1 (Macintosh)', 'ModDate': "D:20200922223544+02'00'", 'Producer': 'Adobe PDF Library 15.0', 'Trapped': 'False'}, page_content='intermediate / professional users (i.e., healthcare professionals). the safety of employees. The key application of AI is certainly in\nThis is a matter of privacy and personal data protection, of building predictive maintenance. Yet, the more radical transformation of'),
Document(id='693acf63-9fbf-4e71-9128-71b230dcd17f', metadata={'source': 'data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf', 'file_path': 'data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf', 'page': 15, 'total_pages': 24, 'CreationDate': "D:20200922223534+02'00'", 'Creator': 'Adobe InDesign 15.1 (Macintosh)', 'ModDate': "D:20200922223544+02'00'", 'Producer': 'Adobe PDF Library 15.0', 'Trapped': 'False'}, page_content='Remote sible, as AI solutions can increasingly divert patients ning healthcare professionals, starting from the simple\nhealthcare to appropriate solutions for their specific symptoms tasks and diagnostic appointments.\nand underlying conditions.\n16'),
Document(id='764ba481-f984-47e9-b86b-98250eda7ad1', metadata={'source': 'data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf', 'file_path': 'data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf', 'page': 14, 'total_pages': 24, 'CreationDate': "D:20200922223534+02'00'", 'Creator': 'Adobe InDesign 15.1 (Macintosh)', 'ModDate': "D:20200922223544+02'00'", 'Producer': 'Adobe PDF Library 15.0', 'Trapped': 'False'}, page_content='same. The Covid-19 crisis has shown how strained our National\nHealthcare Systems are, and AI solutions could help meet the cur- AI in the healthcare faces organisational and skill challenges. One'),
Document(id='c0f68d31-3ebf-45f7-bf6c-cef790cb6f92', metadata={'source': 'data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf', 'file_path': 'data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf', 'page': 3, 'total_pages': 24, 'CreationDate': "D:20200922223534+02'00'", 'Creator': 'Adobe InDesign 15.1 (Macintosh)', 'ModDate': "D:20200922223544+02'00'", 'Producer': 'Adobe PDF Library 15.0', 'Trapped': 'False'}, page_content='advanced robots, autonomous cars, drones or Internet of Things place, a recent EIT Health Report envisages more in healthcare in\napplications)β. Broad AI definitions cover several technologies, in- the near future, such as remote monitoring, AI-powered alerting'),
Document(id='8594b8a3-aef5-44ae-93aa-7f1d5afeaaa3', metadata={'source': 'data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf', 'file_path': 'data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf', 'page': 21, 'total_pages': 24, 'CreationDate': "D:20200922223534+02'00'", 'Creator': 'Adobe InDesign 15.1 (Macintosh)', 'ModDate': "D:20200922223544+02'00'", 'Producer': 'Adobe PDF Library 15.0', 'Trapped': 'False'}, page_content='EIT Health and McKinsey & Company, (2020), Transforming healthcare with AI. Impact Scherer, M. (2016). Regulating Artificial Intelligence Systems: Risks, Challenges, Compe-'),
Document(id='98d019bc-1a7e-4273-8eac-eff09d7bf756', metadata={'source': 'data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf', 'file_path': 'data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf', 'page': 15, 'total_pages': 24, 'CreationDate': "D:20200922223534+02'00'", 'Creator': 'Adobe InDesign 15.1 (Macintosh)', 'ModDate': "D:20200922223544+02'00'", 'Producer': 'Adobe PDF Library 15.0', 'Trapped': 'False'}, page_content='very sensitive. An extensive use to feed AI tools can the use of patientβs data in the hospitals that deploy\nHealth data raise many concerns. Data ownership is also an issue AI-powered applications. The patients should be aware'),
Document(id='b63de5ec-3554-488f-a4c0-1975c0c9e39f', metadata={'source': 'data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf', 'file_path': 'data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf', 'page': 14, 'total_pages': 24, 'CreationDate': "D:20200922223534+02'00'", 'Creator': 'Adobe InDesign 15.1 (Macintosh)', 'ModDate': "D:20200922223544+02'00'", 'Producer': 'Adobe PDF Library 15.0', 'Trapped': 'False'}, page_content='to extract information from clinical notes or predictive scheduling healthcare practitioners needs. In addition, at the regulatory le-\nof the visits) and diagnostic (machine and deep learning applied to vel it is important that new AI regulation is harmonised with other'),
Document(id='7c4c1675-87ad-47ba-8df1-39e8036cf6c2', metadata={'source': 'data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf', 'file_path': 'data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf', 'page': 14, 'total_pages': 24, 'CreationDate': "D:20200922223534+02'00'", 'Creator': 'Adobe InDesign 15.1 (Macintosh)', 'ModDate': "D:20200922223544+02'00'", 'Producer': 'Adobe PDF Library 15.0', 'Trapped': 'False'}, page_content='greatest impact in addressing societal challenges. Given rising de- A second challenge is that of finding a common language and un-\nmands and costs, AI could help doing more and better with the derstanding between data experts and healthcare professionals.')]
Pass the search result as context to the chain.
# Generate an answer based on the search results
answer = pdf_chain.invoke(
{
"question": "Where has the application of AI in healthcare been confined to so far?",
"context": search_result,
"chat_history": [],
}
)
print(answer)
The application of AI in healthcare has so far been confined primarily to administrative tasks, such as Natural Language Processing for extracting information from clinical notes and predictive scheduling. There is potential for more extensive use in research and development, but significant applications remain limited.
**Source**
- data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf (page 14)
State
defines the shared state among the nodes and another nodes.
Typically, the TypedDict
format is used.
from typing import Annotated, TypedDict
from langgraph.graph.message import add_messages
# Define GraphState
class GraphState(TypedDict):
question: Annotated[str, "Question"] # Question
context: Annotated[str, "Context"] # Search results from the document
answer: Annotated[str, "Answer"] # Answer
messages: Annotated[list, add_messages] # Messages (accumulated list)
Nodes
: These are nodes that handle each stage, typically implemented as Python functions. Inputs and outputs are the State values.
[ Note ]
A State
is taken as input, performs the defined logic, and returns an updated State
.
from rag.utils import format_docs
from langchain_opentutorial.messages import messages_to_history
# Node that retrieve document
def retrieve_document(state: GraphState) -> GraphState:
# Get the question from the state
latest_question = state["question"]
# Search the document to find relevant sections
retrieved_docs = pdf_retriever.invoke(latest_question)
# Format the retrieved documents (to input into a prompt)
retrieved_docs = format_docs(retrieved_docs)
# Save the retrieved documents in the key named context
return {"context": retrieved_docs}
# Node that generate answer
def llm_answer(state: GraphState) -> GraphState:
# Get the question from the state
latest_question = state["question"]
# Get the retrieved documents from the state
context = state["context"]
# Call the chain to generate an answer
response = pdf_chain.invoke(
{
"question": latest_question,
"context": context,
"chat_history": messages_to_history(state["messages"]),
}
)
# Save the generated answer and (user question, answer) messages in the state
return {
"answer": response,
"messages": [("user", latest_question), ("assistant", response)],
}
Edges
: Python functions that determine the next Node
to execute based on the current State
.
There can be general edges and conditional edges.
from langgraph.graph import END, StateGraph
from langgraph.checkpoint.memory import MemorySaver
# Create a graph
workflow = StateGraph(GraphState)
# Define nodes
workflow.add_node("retrieve", retrieve_document)
workflow.add_node("llm_answer", llm_answer)
# Define edges
workflow.add_edge("retrieve", "llm_answer") # Retrieval -> Answer generation
workflow.add_edge("llm_answer", END) # Answer generation -> End
# Set entry point for the graph
workflow.set_entry_point("retrieve")
# Set up a checkpointer
memory = MemorySaver()
# Compile the graph
app = workflow.compile(checkpointer=memory)
Visualize the compiled graph.
from langchain_opentutorial.graphs import visualize_graph
visualize_graph(app)
The config
parameter provides configuration informations necessary for graph execution.
recursion_limit
: Sets the maximum recursion depth for graph execution.
inputs
: Provides the input data for the graph execution.
The stream_graph
function below streams only specific nodes.
You can easily check the streaming output of a specific node .
from langchain_core.runnables import RunnableConfig
from langchain_opentutorial.messages import invoke_graph, stream_graph, random_uuid
# Configure settings (recursion limit, thread_id)
config = RunnableConfig(recursion_limit=20, configurable={"thread_id": random_uuid()})
# Question input
inputs = GraphState(
question="Where has the application of AI in healthcare been confined to so far?"
)
# Execute the graph
invoke_graph(app, inputs, config)
==================================================
π Node: retrieve π
- - - - - - - - - - - - - - - - - - - - - - - - -
context:
activities. So far, however, AI applications in healthcare have been potential. Specific healthcare training should be provided to data
confined to administrative tasks (i.e., Natural Language Processing scientists working in hospitals so that they can better understanddata/A European Approach to Artificial Intelligence - A Policy Perspective.pdf15
are great, as more use of AI in research and development could
Healthcare is arguably the sector where AI could make the lead to a more personalised healthcare based on patientsβ data.data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf15
intermediate / professional users (i.e., healthcare professionals). the safety of employees. The key application of AI is certainly in
This is a matter of privacy and personal data protection, of building predictive maintenance. Yet, the more radical transformation ofdata/A European Approach to Artificial Intelligence - A Policy Perspective.pdf11
Remote sible, as AI solutions can increasingly divert patients ning healthcare professionals, starting from the simple
healthcare to appropriate solutions for their specific symptoms tasks and diagnostic appointments.
and underlying conditions.
16data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf16
same. The Covid-19 crisis has shown how strained our National
Healthcare Systems are, and AI solutions could help meet the cur- AI in the healthcare faces organisational and skill challenges. Onedata/A European Approach to Artificial Intelligence - A Policy Perspective.pdf15
advanced robots, autonomous cars, drones or Internet of Things place, a recent EIT Health Report envisages more in healthcare in
applications)β. Broad AI definitions cover several technologies, in- the near future, such as remote monitoring, AI-powered alertingdata/A European Approach to Artificial Intelligence - A Policy Perspective.pdf4
EIT Health and McKinsey & Company, (2020), Transforming healthcare with AI. Impact Scherer, M. (2016). Regulating Artificial Intelligence Systems: Risks, Challenges, Compe-data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf22
very sensitive. An extensive use to feed AI tools can the use of patientβs data in the hospitals that deploy
Health data raise many concerns. Data ownership is also an issue AI-powered applications. The patients should be awaredata/A European Approach to Artificial Intelligence - A Policy Perspective.pdf16
to extract information from clinical notes or predictive scheduling healthcare practitioners needs. In addition, at the regulatory le-
of the visits) and diagnostic (machine and deep learning applied to vel it is important that new AI regulation is harmonised with otherdata/A European Approach to Artificial Intelligence - A Policy Perspective.pdf15
greatest impact in addressing societal challenges. Given rising de- A second challenge is that of finding a common language and un-
mands and costs, AI could help doing more and better with the derstanding between data experts and healthcare professionals.data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf15
==================================================
==================================================
π Node: llm_answer π
- - - - - - - - - - - - - - - - - - - - - - - - -
answer:
The application of AI in healthcare has so far been confined primarily to administrative tasks, such as Natural Language Processing for extracting information from clinical notes and predictive scheduling.
**Source**
- data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf (page 15)
('user', 'Where has the application of AI in healthcare been confined to so far?')
('assistant', 'The application of AI in healthcare has so far been confined primarily to administrative tasks, such as Natural Language Processing for extracting information from clinical notes and predictive scheduling. \n\n**Source**\n- data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf (page 15)')
==================================================
# Stream the graph outputs
stream_graph(app, inputs, config)
==================================================
π Node: llm_answer π
- - - - - - - - - - - - - - - - - - - - - - - - -
The application of AI in healthcare has so far been confined primarily to administrative tasks, such as Natural Language Processing for extracting information from clinical notes and predictive scheduling.
**Source**
- data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf (page 15)
outputs = app.get_state(config).values
print(f'Question: {outputs["question"]}')
print("===" * 20)
print(f'Answer:\n{outputs["answer"]}')
Question: Where has the application of AI in healthcare been confined to so far?
============================================================
Answer:
The application of AI in healthcare has so far been confined primarily to administrative tasks, such as Natural Language Processing for extracting information from clinical notes and predictive scheduling.
**Source**
- data/A European Approach to Artificial Intelligence - A Policy Perspective.pdf (page 15)
Setting up your environment is the first step. See the guide for more details.
The langchain-opentutorial is a package of easy-to-use environment setup guidance, useful functions and utilities for tutorials. Check out the for more details.