Caching

Overview

LangChain provides an optional caching layer for LLMs.

This is useful for two reasons:

  • When requesting the same completions multiple times, it can reduce the number of API calls to the LLM provider and thus save costs.

  • By reducing the number of API calls to the LLM provider, it can improve the running time of the application.

In this tutorial, we will use OpenAI's gpt-4o-mini model and two kinds of cache, InMemoryCache and SQLiteCache. At the end of each section we will compare wall times before and after caching.

Table of Contents

  • Overview

  • Environment Setup

  • InMemoryCache

  • SQLiteCache


Environment Setup

Set up the environment. You may refer to Environment Setup for more details.

[Note]

  • langchain-opentutorial is a package that provides easy-to-use environment setup, useful functions, and utilities for tutorials.

  • You can check out langchain-opentutorial for more details.
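
A minimal setup sketch is shown below. It assumes the packages are installed with pip and that the OpenAI key is exposed through the OPENAI_API_KEY environment variable; the actual tutorial may instead use helpers from langchain-opentutorial, and the package list may differ.

```python
# Install the packages used below (package names assumed; adjust as needed).
# pip install -qU langchain-core langchain-openai langchain-community

import getpass
import os

# ChatOpenAI reads the key from OPENAI_API_KEY; prompt for it if it is not set yet.
if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")
```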

InMemoryCache

First, cache the answer to the same question using InMemoryCache.
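
The sketch below builds a simple prompt-plus-model chain and registers an InMemoryCache globally. The prompt wording and the example question about South Korea are placeholders, not necessarily the tutorial's exact text.

```python
import time

from langchain_core.caches import InMemoryCache
from langchain_core.globals import set_llm_cache
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

# Build a simple chain: a prompt template piped into gpt-4o-mini.
llm = ChatOpenAI(model="gpt-4o-mini")
prompt = PromptTemplate.from_template(
    "Describe {country} in about 200 characters."  # placeholder prompt
)
chain = prompt | llm

# Register an in-memory cache for every LLM call in this process.
set_llm_cache(InMemoryCache())

# First call: nothing is cached yet, so this request goes to the OpenAI API.
start = time.perf_counter()
response = chain.invoke({"country": "South Korea"})
print(response.content)
print(f"Wall time: {time.perf_counter() - start:.2f} s")
```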

Now we invoke the chain with the same question.
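
Reusing the chain from the previous step, the identical input can be answered from the cache, so the wall time typically drops from seconds to milliseconds:

```python
# Second call with exactly the same input: the answer is returned from the
# in-memory cache, so no API request is made and the wall time drops sharply.
start = time.perf_counter()
response = chain.invoke({"country": "South Korea"})
print(response.content)
print(f"Wall time: {time.perf_counter() - start:.4f} s")
```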

Note that if we set a new InMemoryCache, the existing cache is cleared and the wall time increases again.
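
For example, continuing the same sketch:

```python
# Registering a fresh InMemoryCache discards everything stored in the old one.
set_llm_cache(InMemoryCache())

# The same question now misses the new, empty cache and calls the API again.
start = time.perf_counter()
response = chain.invoke({"country": "South Korea"})
print(f"Wall time: {time.perf_counter() - start:.2f} s")
```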

SQLiteCache

Now, we cache the answer to the same question using SQLiteCache.
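
A sketch of this step, assuming the database file is created under a local cache/ directory and the chain from the InMemoryCache section is reused:

```python
import os
import time

from langchain_community.cache import SQLiteCache
from langchain_core.globals import set_llm_cache

# Store cached responses in a SQLite file so they survive process restarts.
os.makedirs("cache", exist_ok=True)
set_llm_cache(SQLiteCache(database_path="cache/llm_cache.db"))

# First call: the database is empty, so this goes to the API and the result is stored.
start = time.perf_counter()
response = chain.invoke({"country": "South Korea"})
print(response.content)
print(f"Wall time: {time.perf_counter() - start:.2f} s")
```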

Now we invoke the chain with the same question.
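
Again reusing the chain, the identical input is now served from the SQLite database:

```python
# Same input again: the answer now comes from the SQLite cache on disk.
start = time.perf_counter()
response = chain.invoke({"country": "South Korea"})
print(response.content)
print(f"Wall time: {time.perf_counter() - start:.4f} s")
```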

Note that if we use SQLiteCache, setting the cache again does not delete the stored cache, because the entries are persisted in the database file on disk.
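
For example, registering a new SQLiteCache that points at the same database file still finds the previously stored responses (continuing the sketch above):

```python
# Pointing a new SQLiteCache at the same database file reuses the stored rows,
# so the previously cached answer is still returned quickly.
set_llm_cache(SQLiteCache(database_path="cache/llm_cache.db"))

start = time.perf_counter()
response = chain.invoke({"country": "South Korea"})
print(f"Wall time: {time.perf_counter() - start:.4f} s")
```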
