Tool Calling Agent


Overview

This tutorial covers tool calling in LangChain, which allows a model to detect when one or more tools should be called and what inputs to pass to them.

When making API calls, you can define tools and intelligently guide the model to generate structured objects, such as JSON, containing arguments for calling these tools.

The goal of the tools API is to provide more reliable generation of valid and useful tool calls beyond what standard text completion or chat APIs can achieve.

By combining this structured output with the ability to bind multiple tools to a tool-calling chat model, and letting the model choose which ones to call, you can create agents that iteratively call tools and process the results until the query is resolved.

This is a more generalized version of the OpenAI tools agent, which was designed specifically for OpenAI's particular tool-calling style.

This agent uses LangChain's ToolCall interface to support a broader spectrum of provider implementations beyond OpenAI, including Anthropic, Google Gemini, and Mistral.



Environment Setup

Set up the environment. You may refer to Environment Setup for more details.

[Note]

  • langchain-opentutorial is a package that provides easy-to-use environment setup, along with useful functions and utilities for tutorials.

  • You can check out langchain-opentutorial for more details.

Alternatively, you can set OPENAI_API_KEY in a .env file and load it.

[Note] This is not necessary if you've already set OPENAI_API_KEY in previous steps.

Creating Tools

LangChain allows you to define custom tools that your agents can interact with. You can create tools for searching news or executing Python code.

The @tool decorator is used to create tools:

  • TavilySearchResults is a tool for searching news.

  • PythonREPL is a tool for executing Python code.

Constructing an Agent Prompt

  • chat_history: This variable stores the conversation history if your agent supports multi-turn conversations. (Otherwise, you can omit it.)

  • agent_scratchpad: This variable serves as temporary storage for the agent's intermediate steps (tool calls and their results).

  • input: This variable represents the user's input.

Creating Agent

Define an agent using the create_tool_calling_agent function.

AgentExecutor

The AgentExecutor is a class for managing an agent that uses tools.

Key properties

  • agent: the underlying agent responsible for creating plans and determining actions at each step of the execution loop.

  • tools: a list containing all the valid tools that the agent is authorized to use.

  • return_intermediate_steps: a boolean flag that determines whether to return the intermediate steps the agent took along with the final output.

  • max_iterations: the maximum number of steps the agent can take before the execution loop is terminated.

  • max_execution_time: the maximum amount of time the execution loop is allowed to run.

  • early_stopping_method: the method for handling situations where the agent does not return an AgentFinish. ("force" or "generate")

    • "force" : returns a string indicating that the execution loop was stopped due to reaching the time or iteration limit.

    • "generate" : calls the agent's LLM chain once to generate a final answer based on the previous steps taken.

  • handle_parsing_errors : how to handle parsing errors. (You can set True, False, or provide a custom error-handling function.)

  • trim_intermediate_steps : how to trim intermediate steps. (You can set -1 to keep all steps, or provide a custom trimming function.)

Key methods

  1. invoke : Executes the agent.

  2. stream : Streams the steps required to reach the final output.

Key features

  1. Tool validation : Ensures that the tools are compatible with the agent.

  2. Execution control : Sets maximum iteration and execution time limits to manage agent behavior.

  3. Error handling : Offers various processing options for output parsing errors.

  4. Intermediate step management : Allows trimming intermediate steps, or returning them for debugging.

  5. Asynchronous support : Supports asynchronous execution and streaming of results.

  5. Asynchronous support : Supports asynchronous execution and streaming of results.

Optimization tips

  • Set appropriate values for max_iterations and max_execution_time to manage execution time.

  • Use trim_intermediate_steps to optimize memory usage.

  • For complex tasks, use the stream method to monitor step-by-step results.

Checking step-by-step results using Stream output

We will use the stream() method of AgentExecutor to stream the intermediate steps of the agent.

The output of stream() alternates between (Action, Observation) pairs, and finally ends with the agent's answer if the goal is achieved.

The flow looks like the following:

  1. Action output

  2. Observation output

  3. Action output

  4. Observation output

... (Continue until the goal is achieved) ...

Then, the agent concludes with a final answer once its goal is achieved.

The following table summarizes the content you'll encounter in the output:

| Output | Description |
| --- | --- |
| Action | actions: the AgentAction (or a subclass) being taken. messages: chat messages corresponding to the action call. |
| Observation | steps: a record of the agent's work, including the current action and its observation. messages: chat messages containing the results of the function calls (i.e., observations). |
| Final Answer | output: the AgentFinish signal. messages: chat messages containing the final output. |
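The branching over these three keys can be sketched as a small helper (the function name is ours, not part of LangChain; it takes any AgentExecutor built as above):

```python
def stream_agent_steps(agent_executor, query: str) -> None:
    """Print each (Action, Observation) pair and the final answer."""
    for chunk in agent_executor.stream({"input": query}):
        if "actions" in chunk:  # Action: tool call the agent decided on
            for action in chunk["actions"]:
                print(f"Tool: {action.tool}, Input: {action.tool_input}")
        elif "steps" in chunk:  # Observation: result returned by the tool
            for step in chunk["steps"]:
                print(f"Observation: {step.observation}")
        elif "output" in chunk:  # Final Answer
            print(f"Final answer: {chunk['output']}")
```

For example, `stream_agent_steps(agent_executor, "Search for the latest AI news.")` prints the action/observation pairs as they happen, then the final answer.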

Customizing intermediate step output using user-defined functions

You can define the following three functions to customize the intermediate step output:

  • tool_callback: This function handles the output generated by tool calls.

  • observation_callback: This function deals with the observation data output.

  • result_callback: This function allows you to handle the final answer output.

Here's an example callback function that demonstrates how to clean up the intermediate steps of the Agent.

This callback function can be useful when presenting intermediate steps to users in an application like Streamlit.
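One way to wire the three callbacks together is a small dispatcher over the stream chunks. The `process_stream` helper below is hypothetical (not part of LangChain), and the callback bodies are illustrative:

```python
from typing import Any, Callable

def process_stream(
    chunks,
    tool_callback: Callable[[Any], None],
    observation_callback: Callable[[Any], None],
    result_callback: Callable[[str], None],
) -> None:
    """Dispatch each stream() chunk to the matching user-defined callback."""
    for chunk in chunks:
        if "actions" in chunk:
            for action in chunk["actions"]:
                tool_callback(action)
        elif "steps" in chunk:
            for step in chunk["steps"]:
                observation_callback(step.observation)
        elif "output" in chunk:
            result_callback(chunk["output"])

# Example callbacks that clean up the raw output for display.
def tool_callback(action) -> None:
    print(f"[Tool call] {action.tool}({action.tool_input})")

def observation_callback(observation) -> None:
    print(f"[Observation] {observation}")

def result_callback(result: str) -> None:
    print(f"[Answer] {result}")
```

You would then call `process_stream(agent_executor.stream({"input": query}), tool_callback, observation_callback, result_callback)`, replacing the print statements with whatever rendering your application (e.g., Streamlit) needs.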

Check the response process of your Agent in streaming mode.


Modify the callback functions and apply them.

Check the output: the values produced by your callback functions are reflected, so the intermediate content appears in its modified form.

Communicating with the Agent Using Previous Conversation History

To remember past conversations, you can wrap the AgentExecutor with RunnableWithMessageHistory.

For more details on RunnableWithMessageHistory, please refer to the link below.

Reference
