SQL-Agent
Author: Jinu Cho
Peer Review:
Proofread : Chaeyoon Kim
This is a part of LangChain Open Tutorial
Overview
In this tutorial, we will build an agent step-by-step that can answer questions

Key Topics :
Database : Set up an SQLite database and load the Chinook sample database.
Utility Functions : Define utility functions for implementing the agent.
Tool Definitions : Define tools for interacting with the database.
Workflow Definition : Define the workflow (graph) for the agent.
Graph Visualization : Visualize the defined workflow graph.
Agent Execution : Execute the agent and review the results.
Evaluation : Evaluate the agent and compare its performance.
Table of Contents
References
Environment Setup
Setting up your environment is the first step. See the Environment Setup guide for more details.
[Note]
The langchain-opentutorial is a package of easy-to-use environment setup guidance, useful functions and utilities for tutorials.
Check out the langchain-opentutorial for more details.
You can alternatively set OPENAI_API_KEY in .env file and load it.
[Note] This is not necessary if you've already set OPENAI_API_KEY in previous steps.
Agent for Interacting with SQL Databases
The workflow of an agent that executes SQL queries is as follows:
Understanding the Database Schema : Retrieve a list of available tables.
Selecting Relevant Tables : Identify tables related to the question.
Fetching DDL : Obtain the schema definitions (DDL) of the selected tables.
Generating Queries : Construct SQL queries based on the question and DDL information.
Query Validation : Use an LLM to review for common errors and refine the query.
Query Execution and Error Handling : Execute the query on the database engine and handle errors to ensure successful execution.
Generating Responses : Provide the final answer based on the query results.
Set Up the Database
In this tutorial, we will create an SQLite database. SQLite is a lightweight database that is easy to set up and use.
For this tutorial, we will load the Chinook sample database, which represents a digital media store.
You can find more information about the Chinook database here.
First, download the Chinook database to use for the exercises in this tutorial.
Using the Chinook Database
Next, we will create an SQLDatabase tool using the downloaded Chinook database and execute a sample query, "SELECT * FROM Artist LIMIT 5;" .
Utility Functions
Let's define several utility functions to assist in implementing the agent.
In particular, wrap ToolNode to include error handling and the capability to relay errors back to the agent.
SQL Query Execution Tools
To enable the agent to interact with the database, we define several tools.
Defined Tools:
list_tables_tool: Retrieves a list of available tables in the database.get_schema_tool: Fetches the DDL (Data Definition Language) of a table.db_query_tool: Executes queries and retrieves results or returns error messages if the query fails.
Note:
DDL (Data Definition Language) refers to SQL commands used to define or modify the structure and schema of a database. It includes commands to create, alter, and delete database objects like tables, indexes, and views.
Key DDL Commands:
CREATE : Creates database objects.
Example: CREATE TABLE users (id INT, name VARCHAR(100));
ALTER : Modifies existing database objects.
Example: ALTER TABLE users ADD COLUMN email VARCHAR(100);
DROP : Deletes database objects.
Example: DROP TABLE users;
SQLDatabaseToolkit Tools
The following tools are provided to interact with the SQL database:
QuerySQLDatabaseToolFunction : Executes SQL queries and returns results.
Input : Valid SQL query.
Output : Database results or error messages.
Error Handling :
Rewrites and retries the query if errors occur.
Resolves "unknown column" errors by checking table fields using
sql_db_schema.
InfoSQLDatabaseToolFunction : Retrieves table schemas and sample data.
Input : Comma-separated list of table names.
Usage Example : table1, table2, table3
Note : Check table existence using
sql_db_list_tablesbefore using this tool.
ListSQLDatabaseToolFunction : Retrieves a list of tables in the database.
QuerySQLCheckerToolFunction : Validates SQL queries before execution.
Validation Checks :
Use of NULL values and NOT IN .
Suitability of UNION vs UNION ALL .
Correct range settings for BETWEEN .
Data type consistency.
Proper quoting of identifiers.
Correct number of function arguments.
Proper type casting.
Accuracy of join columns.
Features : Uses a GPT-4o model for validation.
Below is an example of executing list_tables_tool and get_schema_tool .
Next, define db_query_tool .
In db_query_tool case, it executes queries against the database and returns the results.
If an error occurs, it returns an error message.
When executed successfully.
When an error occurs.
This will produce an error due to the incorrect SQL syntax ("LIMITS" instead of "LIMIT")
SQL Query Checker
Next, we will use an LLM to check for common mistakes in SQL queries.
Strictly speaking, this is not a tool, but it will be added as a node in the workflow later.
Verify the Fix for Incorrect Query
An incorrect query was executed to check if the issue is properly handled and fixed.
Remark :
The query used LIMITS instead of LIMIT .
The input query contained an error, but the generated query was successfully corrected.
Correction: LIMITS → LIMIT
Graph Definition
Define the workflow for the agent.
The agent first forcefully invokes the list_tables_tool to retrieve the available tables from the database. After this, it follows the steps mentioned earlier in the tutorial.

Excution the Agent
Execute the agent to interact with an SQL database through the entire process.
The agent retrieves information from the database based on user queries, generates and executes SQL queries, and returns the results.
Evaluating an SQL Agent Using LangSmith Evaluator
Now, let's evaluate the SQL Agent's responses to queries. To do this, we'll first create a dataset specifically designed for evaluation purposes.
Next, we define an evaluator and proceed with the evaluation process.
For this, we use LLM-as-judge as the evaluator. The prompt employed will be the default one provided by the evaluation prompt hub.
However, for more precise evaluations, it is recommended to fine-tune the prompt to suit your specific needs.
Next, we define a function to predict the SQL query response generated by our agent.
Define the prompt and the evaluator (LLM-as-judge) to evaluate the SQL query response.
Now, we perform the evaluation and review the results.
The evaluation results can be reviewed individually at the generated URL.

Last updated