Code Debugging System


Overview

In this tutorial, we'll build an AI-powered Python code debugging system using LangGraph. This system automates the debugging process by executing code, analyzing errors, suggesting fixes, and validating corrections.

  • Code Execution: Run Python code and capture any errors

  • Error Analysis: Use AI to analyze the error and identify the cause

  • Code Correction: Generate fixed code and unit tests

  • Validation: Verify the corrected code works properly

Table of Contents

  • Overview

  • Environment Setup

  • Basic Components

  • Building LangGraph Workflow

  • References


Environment Setup

Setting up your environment is the first step. See the Environment Setup guide for more details.

[Note]

  • The langchain-opentutorial package provides easy-to-use environment setup guidance, along with useful functions and utilities for these tutorials.

  • Check out the langchain-opentutorial for more details.

You can alternatively set API keys such as OPENAI_API_KEY in a .env file and load them.

[Note] This is not necessary if you've already set the required API keys in previous steps.
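If you do use a .env file, a minimal sketch of loading it with the python-dotenv package (assuming the file sits in the working directory) looks like this:

```python
# Load API keys (e.g., OPENAI_API_KEY) from a local .env file.
from dotenv import load_dotenv

load_dotenv(override=True)  # returns True when a .env file was found and loaded
```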

Basic Components

Our debugging system is built upon three fundamental utility components that handle code execution, command management, and AI response parsing, plus the LLM components that drive analysis and correction.

  • Code Executor: This component provides safe Python code execution in an isolated environment, capturing outputs and errors. It serves as the foundation for both initial code testing and validation of corrections.

  • Command Executor: This component manages system-level operations, particularly package installations. It automatically detects the environment (UV/pip) and ensures proper command execution across different Python environments.

  • Code Block Parser: The CodeBlockParser is a custom LangChain output parser that processes AI model responses. By extending BaseOutputParser, it extracts and categorizes markdown code blocks into structured data, making it seamlessly integrable with LangChain's LCEL pipelines. This parser is crucial for handling AI-generated code corrections and test cases.

  • LLM Components: Handles intelligent code analysis and correction through:

    • Chain configuration using ChatOpenAI and custom prompts

    • Structured prompt template for error analysis, code correction, and test generation

These components work together to provide the essential functionality needed by our LangGraph-based debugging workflow.

Code Executor

First, the Code Executor runs Python code and captures any errors that occur during execution.

The execute_code function safely executes Python code in a temporary environment and captures comprehensive execution results including output and error messages. It returns a structured result object that serves as the foundation for our debugging workflow's analysis and correction steps.
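The actual implementation is not shown in this excerpt; a minimal sketch of such a function might look like the following, where the ExecutionResult fields success, output, and error are assumptions based on how the result is used later:

```python
import subprocess
import sys
import tempfile
from dataclasses import dataclass


@dataclass
class ExecutionResult:
    success: bool
    output: str
    error: str


def execute_code(code: str, timeout: int = 30) -> ExecutionResult:
    """Run Python code from a temporary file and capture stdout/stderr."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name

    proc = subprocess.run(
        [sys.executable, path],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return ExecutionResult(
        success=proc.returncode == 0,
        output=proc.stdout,
        error=proc.stderr,
    )
```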

Let's test our debugging system with a Python code sample that contains multiple issues including syntax errors, type errors, and PEP 8 violations.

This student grade processing code demonstrates common programming mistakes such as missing colons, mixed data types, and naming convention violations that our system will detect and fix.
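The original sample is not included in this excerpt; a hypothetical stand-in with the same classes of problems, stored as a string so it can be handed to execute_code, could look like this:

```python
# Hypothetical buggy sample: it deliberately contains a missing colon,
# mixed data types in the score list, and a lowercase class name.
sample_code = '''
class student:                          # PEP 8: class names should be capitalized
    def __init__(self, name, scores)    # syntax error: missing colon
        self.name = name
        self.scores = scores

    def average(self):
        return sum(self.scores) / len(self.scores)

s = student("Alice", [90, "85", 78])    # type error: mixed int and str scores
print(s.average())
'''

result = execute_code(sample_code)
print(result.success)   # False: the missing colon triggers a SyntaxError
print(result.error)
```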

Command Executor

The CommandExecutor provides a unified interface for executing system commands and installing packages across different Python environments. It automatically detects whether to use UV or pip package manager and adapts its behavior accordingly using the Strategy pattern. The system consists of a base executor class with specialized implementations for UV and pip environments, ensuring consistent command execution regardless of the environment setup.
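The class layout is not reproduced here; a simplified sketch of the Strategy pattern it describes (class and method names are illustrative) might look like this:

```python
import shutil
import subprocess
import sys


class BaseCommandExecutor:
    """Strategy interface: build and run the install command for an environment."""

    def install_command(self, packages: list[str]) -> list[str]:
        raise NotImplementedError

    def install(self, packages: list[str]) -> None:
        subprocess.run(self.install_command(packages), check=True)


class PipExecutor(BaseCommandExecutor):
    def install_command(self, packages: list[str]) -> list[str]:
        return [sys.executable, "-m", "pip", "install", *packages]


class UVExecutor(BaseCommandExecutor):
    def install_command(self, packages: list[str]) -> list[str]:
        return ["uv", "pip", "install", *packages]


def get_command_executor() -> BaseCommandExecutor:
    """Pick the strategy based on whether the uv binary is available."""
    return UVExecutor() if shutil.which("uv") else PipExecutor()
```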

Let's test our CommandExecutor by installing required packages (matplotlib and pytest) using the appropriate package manager for our environment.
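Using the get_command_executor helper from the sketch above, this test boils down to:

```python
executor = get_command_executor()           # UVExecutor if uv is on PATH, else PipExecutor
executor.install(["matplotlib", "pytest"])  # runs the appropriate install command
```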

Code Block Parser

The CodeBlockParser is a custom LangChain output parser that processes AI model responses, extracting and categorizing markdown code blocks into structured data. By extending BaseOutputParser, it extracts code blocks with specific tags (corrected for fixed code, tests for test code) from the AI's response text. This parser enables seamless integration with LangChain's LCEL pipelines while providing a structured way to handle different types of code blocks in our debugging workflow.
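A simplified sketch of such a parser follows; the result fields corrected, tests, and bash mirror how the parsed object is used below, but the real implementation may differ:

```python
import re
from dataclasses import dataclass, field

from langchain_core.output_parsers import BaseOutputParser


@dataclass
class ParsedCodeBlocks:
    corrected: list[str] = field(default_factory=list)
    tests: list[str] = field(default_factory=list)
    bash: list[str] = field(default_factory=list)


FENCE = "`" * 3  # the literal ``` fence, written this way to keep the listing readable


class CodeBlockParser(BaseOutputParser[ParsedCodeBlocks]):
    """Extract tagged markdown code blocks from an LLM response and group them by tag."""

    def parse(self, text: str) -> ParsedCodeBlocks:
        result = ParsedCodeBlocks()
        pattern = re.compile(FENCE + r"(\w+)\s*\n(.*?)" + FENCE, re.DOTALL)
        for tag, code in pattern.findall(text):
            if tag == "corrected":
                result.corrected.append(code)
            elif tag == "tests":
                result.tests.append(code)
            elif tag == "bash":
                result.bash.append(code)
        return result
```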

To test the parser, we feed it a sample LLM response (sample_output_from_llm) that contains tagged code blocks followed by a plain-text recommendations section. The tail of that sample is reproduced below; we then invoke the parser on it and print the extracted blocks.

Recommendations

  1. Code Quality (PEP 8):

    • Ensure class names are capitalized (e.g., Student instead of student).

    • Use consistent naming conventions and spacing in the code.

  2. Runtime Considerations:

    • Consider edge cases like empty score lists before performing operations that assume non-empty inputs.

  3. Package Dependencies:

    • For this simple program, no external packages are required. However, if you expand the functionality (e.g., using a database or web framework), make sure to manage dependencies with a tool like pip and maintain a requirements.txt file.

By following these recommendations, the code will be more robust, maintainable, and user-friendly.

code_block_parser = CodeBlockParser()
code_block_parser_result = code_block_parser.invoke(sample_output_from_llm)

corrected = "\n".join(code_block_parser_result.corrected)

print("------- Fixed Code -------")
print(corrected)

print("\n------- Bash Commands -------")
print(code_block_parser_result.bash)

LLM Components

Finally, the LLM chain ties these components together. The prompt template asks the model to analyze the failing code and its error, return corrected code and unit tests in tagged code blocks, list any packages to install, and close with recommendations covering:

  • Code quality (PEP 8)

  • Runtime considerations

The model is instructed to format its response with clear sections and tagged code blocks, matching the sample output parsed above. The chain is then composed with LCEL:

chain = prompt | model | CodeBlockParser()
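Spelled out more fully, the chain configuration might look like the sketch below. The model name and the exact prompt wording are assumptions, since only the tail of the original prompt survives in this excerpt:

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini", temperature=0)

prompt = ChatPromptTemplate.from_template(
    """You are a Python debugging assistant.

The following code failed to run:

{code}

Error message:

{error}

Analyze the cause of the error, then provide:
- the fixed code in a code block tagged `corrected`
- unit tests in a code block tagged `tests`
- any packages to install in a code block tagged `bash`
- recommendations on code quality (PEP 8) and runtime considerations

Format your response with clear sections and tagged code blocks."""
)

chain = prompt | model | CodeBlockParser()
```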

Building LangGraph Workflow

In this section, we'll build an automated debugging workflow using LangGraph. This workflow will systematically process code, identify errors, suggest fixes, and validate the corrections. Let's break down the key components:

  • Execute Code: First, run the code and capture any errors

  • Analyze Errors: If errors occur, use AI to analyze them and suggest fixes

  • Install Dependencies: Install any required packages

  • Decide Next Step: Either end the process if successful, or retry debugging

The workflow continues until either:

  • The code runs successfully without errors

  • The maximum number of debug attempts (3) is reached

Let's implement each component and see how they work together in our debugging system.

Define the State

The AgentState serves as the central data structure tracking progress through the debugging workflow. This stateful approach enables iterative improvements by preserving critical debugging context between cycles.

Key State Components:

  1. Original Code: Preserves the initial code before any modifications

  2. Code: Current version of the code being debugged

  3. Error: Captured runtime error message (if any)

  4. Dependencies: List of required packages identified by AI analysis

  5. Execution Result: Detailed outcome of code execution attempts (success status, raw output/error)

  6. Debug Attempts: Counter tracking retries (max 3 attempts)

  7. Fix History: List of debugging attempts with corresponding errors and fixes

The state evolves through each workflow iteration, preserving the debugging history while allowing progressive code improvements.
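A minimal sketch of this state as a TypedDict (field names are illustrative and may differ from the tutorial's actual definition):

```python
from typing import Optional, TypedDict


class AgentState(TypedDict):
    original_code: str                # code as originally submitted
    code: str                         # current (possibly corrected) version
    error: Optional[str]              # last captured error message, if any
    dependencies: list[str]           # packages the analysis step says to install
    execution_result: Optional[dict]  # success flag plus raw output/error
    debug_attempts: int               # number of fix cycles so far (max 3)
    fix_history: list[dict]           # (error, fix) records for each attempt
```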

Define the Nodes

The workflow consists of specialized nodes that form the debugging pipeline:

  1. Execute Code Node (execute_code_node):

    • Runs the current code version

    • Captures execution results (success status, output/error)

    • Updates state with error message if failure occurs

  2. Analyze Error Node (analyze_error_node):

    • Activates only when errors exist

    • Uses LLM to diagnose errors and generate:

      • Corrected code versions

      • Required dependencies

      • Validation test cases

  3. Install Dependencies Node (install_deps_node):

    • Processes AI-identified package requirements

    • Executes system commands using environment-aware executor

    • Supports both pip and UV package managers

  4. Decision Node (decide_next_step):

    • Evaluates workflow progress:

      • Ends on successful execution

      • Limits to 3 debug cycles

      • Triggers retries otherwise

    • Acts as workflow control gate

Nodes maintain single responsibility while sharing state through the AgentState structure, enabling modular debugging steps.
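Hedged sketches of these nodes, reusing the helpers from the earlier sketches (execute_code, chain, get_command_executor, AgentState); the decision logic is shown here as a routing function consumed by the conditional edge:

```python
def execute_code_node(state: AgentState) -> AgentState:
    """Run the current code version and record the outcome in the state."""
    result = execute_code(state["code"])
    return {
        **state,
        "execution_result": {"success": result.success, "output": result.output},
        "error": None if result.success else result.error,
    }


def analyze_error_node(state: AgentState) -> AgentState:
    """Ask the LLM chain for a fix and merge the corrected code into the state."""
    parsed = chain.invoke({"code": state["code"], "error": state["error"]})
    corrected = "\n".join(parsed.corrected) or state["code"]
    # Naive package extraction from lines such as "pip install matplotlib pytest"
    packages = [
        tok
        for line in parsed.bash
        for tok in line.split()
        if tok not in {"pip", "uv", "python", "-m", "install"}
    ]
    return {
        **state,
        "code": corrected,
        "dependencies": packages,
        "debug_attempts": state["debug_attempts"] + 1,
        "fix_history": state["fix_history"] + [{"error": state["error"], "fix": corrected}],
    }


def install_deps_node(state: AgentState) -> AgentState:
    """Install AI-identified packages with the environment-aware executor."""
    if state["dependencies"]:
        get_command_executor().install(state["dependencies"])
    return state


def decide_next_step(state: AgentState) -> str:
    """Routing function for the conditional edge: stop, give up, or retry."""
    if state["execution_result"] and state["execution_result"]["success"]:
        return "success"
    if state["debug_attempts"] >= 3:
        return "max_attempts_reached"
    return "retry"
```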

Define the Edges

The edge configuration defines the workflow's logical flow and decision points:

  1. Entry Point:

    • Workflow starts with code execution (execute node)

    • builder.set_entry_point("execute")

  2. Sequential Flow:

    • Execute → Analyze: Code execution results flow to error analysis

    • Analyze → Install: Identified dependencies proceed to installation

    • Install → Execute: After setup, proceed to validation testing

  3. Conditional Branching:

    • Post-test evaluation determines next step:

      • Success: Terminate workflow (END)

      • Failure: Restart cycle if under 3 attempts (retry → execute)

      • Max Attempts: Terminate after 3 retries (max_attempts_reached → END)

    • Implemented via add_conditional_edges()

This edge configuration creates a robust debugging loop that automatically handles retries while ensuring the process eventually terminates, either through successful debugging or reaching the attempt limit.
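One plausible way to express this wiring with LangGraph, assuming the node functions and AgentState sketched earlier (the exact conditional mapping is an assumption based on the descriptions above):

```python
from langgraph.graph import END, StateGraph

builder = StateGraph(AgentState)

# Register the debugging nodes
builder.add_node("execute", execute_code_node)
builder.add_node("analyze", analyze_error_node)
builder.add_node("install", install_deps_node)

# Entry point: start by running the code
builder.set_entry_point("execute")

# After each execution, decide whether to stop or keep debugging
builder.add_conditional_edges(
    "execute",
    decide_next_step,
    {
        "success": END,               # code ran cleanly: stop
        "max_attempts_reached": END,  # give up after three attempts
        "retry": "analyze",           # otherwise go through analyze -> install -> execute again
    },
)

# Fixed edges of the correction cycle
builder.add_edge("analyze", "install")
builder.add_edge("install", "execute")
```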

[Visualization of the compiled debugging workflow graph]

Run the Workflow

Let's compile our builder graph and run it with the sample code we created earlier to see our debugging system in action.
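A sketch of that final step, assuming the builder, node functions, and buggy sample_code string from the earlier sketches; the initial state values are assumptions:

```python
graph = builder.compile()

initial_state: AgentState = {
    "original_code": sample_code,
    "code": sample_code,
    "error": None,
    "dependencies": [],
    "execution_result": None,
    "debug_attempts": 0,
    "fix_history": [],
}

final_state = graph.invoke(initial_state)

print("Succeeded:", final_state["execution_result"]["success"])
print("Attempts:", final_state["debug_attempts"])
print("\n------- Final Code -------")
print(final_state["code"])
```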
