Code Debugging System
Author: HeeWung Song(Dan)
Peer Review: Jongcheol Kim, Heeah Kim
Proofread: Q0211
This is a part of the LangChain Open Tutorial.
Overview
In this tutorial, we'll build an AI-powered Python code debugging system using LangGraph. This system automates the debugging process by executing code, analyzing errors, suggesting fixes, and validating corrections.
Code Execution: Run Python code and capture any errors
Error Analysis: Use AI to analyze the error and identify the cause
Code Correction: Generate fixed code and unit tests
Validation: Verify the corrected code works properly
Table of Contents
Overview
Environment Setup
Basic Components
Building LangGraph Workflow
Run the Workflow
Environment Setup
Setting up your environment is the first step. See the Environment Setup guide for more details.
[Note]
The langchain-opentutorial package provides easy-to-use environment setup guidance, useful functions, and utilities for tutorials. Check out langchain-opentutorial for more details.
You can alternatively set API keys such as OPENAI_API_KEY in a .env file and load them.
[Note] This is not necessary if you've already set the required API keys in previous steps.
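For example, with the python-dotenv package, this is a minimal sketch that assumes a .env file exists in the working directory:

```python
# Load environment variables such as OPENAI_API_KEY from a local .env file
from dotenv import load_dotenv

load_dotenv(override=True)
```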
Basic Components
Our debugging system is built upon three fundamental utility components that handle code execution, command management, and AI response parsing.
Code Executor: This component provides safe Python code execution in an isolated environment, capturing outputs and errors. It serves as the foundation for both initial code testing and validation of corrections.
Command Executor: This component manages system-level operations, particularly package installations. It automatically detects the environment (UV/pip) and ensures proper command execution across different Python environments.
Code Block Parser: The CodeBlockParser is a custom LangChain output parser that processes AI model responses. By extending BaseOutputParser, it extracts and categorizes markdown code blocks into structured data, making it seamlessly integrable with LangChain's LCEL pipelines. This parser is crucial for handling AI-generated code corrections and test cases.
LLM Components: Handles intelligent code analysis and correction through:
Chain configuration using ChatOpenAI and custom prompts
A structured prompt template for error analysis, code correction, and test generation
These components work together to provide the essential functionality needed by our LangGraph-based debugging workflow.
Code Executor
First, the Code Executor runs Python code and captures any errors that occur during execution.
The execute_code function safely executes Python code in a temporary environment and captures comprehensive execution results including output and error messages. It returns a structured result object that serves as the foundation for our debugging workflow's analysis and correction steps.
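The tutorial's implementation isn't reproduced here, so the following is only a minimal sketch of what execute_code might look like; the ExecutionResult fields and the timeout value are assumptions:

```python
import subprocess
import sys
import tempfile
from dataclasses import dataclass

@dataclass
class ExecutionResult:
    # Illustrative result structure: success flag, captured stdout, captured stderr
    success: bool
    output: str
    error: str

def execute_code(code: str, timeout: int = 30) -> ExecutionResult:
    """Write the code to a temporary file and run it in an isolated subprocess."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run(
            [sys.executable, path],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        return ExecutionResult(
            success=proc.returncode == 0,
            output=proc.stdout,
            error=proc.stderr,
        )
    except subprocess.TimeoutExpired:
        return ExecutionResult(success=False, output="", error="Execution timed out")
```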
Let's test our debugging system with a Python code sample that contains multiple issues including syntax errors, type errors, and PEP 8 violations.
This student grade processing code demonstrates common programming mistakes such as missing colons, mixed data types, and naming convention violations that our system will detect and fix.
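For illustration, a buggy sample in that spirit might look like the snippet below; the name sample_code and the specific mistakes are assumptions rather than the tutorial's exact sample:

```python
# Intentionally broken student-grade code: lowercase class name, missing colon,
# and mixed data types in the scores list
sample_code = '''
class student                      # missing colon and non-PEP 8 class name
    def __init__(self, name, scores):
        self.name = name
        self.scores = scores

    def average(self):
        total = 0
        for s in self.scores:
            total = total + s      # fails when scores mix ints and strings
        return total / len(self.scores)

students = [student("Alice", [90, "85", 77])]
for st in students:
    print(st.name, st.average())
'''
```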
Command Executor
The CommandExecutor provides a unified interface for executing system commands and installing packages across different Python environments. It automatically detects whether to use UV or pip package manager and adapts its behavior accordingly using the Strategy pattern. The system consists of a base executor class with specialized implementations for UV and pip environments, ensuring consistent command execution regardless of the environment setup.
Let's test our CommandExecutor by installing required packages (matplotlib and pytest) using the appropriate package manager for our environment.
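A rough sketch of the strategy-pattern idea, with class and function names assumed rather than taken from the tutorial:

```python
import shutil
import subprocess
import sys

class BaseCommandExecutor:
    """Base strategy: run a command and return (success, combined output)."""

    def run(self, command: list[str]) -> tuple[bool, str]:
        proc = subprocess.run(command, capture_output=True, text=True)
        return proc.returncode == 0, proc.stdout + proc.stderr

    def install(self, packages: list[str]) -> tuple[bool, str]:
        raise NotImplementedError

class PipCommandExecutor(BaseCommandExecutor):
    def install(self, packages: list[str]) -> tuple[bool, str]:
        return self.run([sys.executable, "-m", "pip", "install", *packages])

class UvCommandExecutor(BaseCommandExecutor):
    def install(self, packages: list[str]) -> tuple[bool, str]:
        return self.run(["uv", "pip", "install", *packages])

def get_command_executor() -> BaseCommandExecutor:
    """Pick the UV strategy when the uv binary is available, otherwise fall back to pip."""
    return UvCommandExecutor() if shutil.which("uv") else PipCommandExecutor()

# Usage: install the packages needed later in the tutorial
executor = get_command_executor()
ok, log = executor.install(["matplotlib", "pytest"])
print("installed:", ok)
```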
Code Block Parser
The CodeBlockParser is a custom LangChain output parser that processes AI model responses, extracting and categorizing markdown code blocks into structured data. By extending BaseOutputParser, it extracts code blocks with specific tags (corrected for fixed code, tests for test code) from the AI's response text. This parser enables seamless integration with LangChain's LCEL pipelines while providing a structured way to handle different types of code blocks in our debugging workflow.
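A simplified sketch of such a parser; the regex, result fields, and tag handling follow the description above, but the tutorial's real implementation may differ:

```python
import re
from dataclasses import dataclass, field
from langchain_core.output_parsers import BaseOutputParser

@dataclass
class CodeBlocks:
    # Parsed code blocks grouped by their fence tag
    corrected: list = field(default_factory=list)
    tests: list = field(default_factory=list)
    bash: list = field(default_factory=list)

class CodeBlockParser(BaseOutputParser[CodeBlocks]):
    """Extract fenced code blocks tagged corrected, tests, or bash from LLM output."""

    def parse(self, text: str) -> CodeBlocks:
        result = CodeBlocks()
        # Capture the tag after the opening fence and the block body
        for tag, body in re.findall(r"`{3}(\w+)\n(.*?)`{3}", text, re.DOTALL):
            if tag == "corrected":
                result.corrected.append(body.strip())
            elif tag == "tests":
                result.tests.append(body.strip())
            elif tag == "bash":
                result.bash.append(body.strip())
        return result
```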
To test the parser, we use a sample LLM response string, sample_output_from_llm, which contains tagged code blocks followed by a recommendations section like the one below:

Recommendations
Code Quality (PEP 8):
Ensure class names are capitalized (e.g., Student instead of student). Use consistent naming conventions and spacing in the code.
Runtime Considerations:
Consider edge cases like empty score lists before performing operations that assume non-empty inputs.
Package Dependencies:
For this simple program, no external packages are required. However, if you expand the functionality (e.g., using a database or web framework), make sure to manage dependencies with a tool like pip and maintain a requirements.txt file.
By following these recommendations, the code will be more robust, maintainable, and user-friendly.
Running the parser on this sample extracts the tagged blocks:

code_block_parser = CodeBlockParser()
code_block_parser_result = code_block_parser.invoke(sample_output_from_llm)

corrected = "\n".join(code_block_parser_result.corrected)
print("------- Fixed Code -------")
print(corrected)

print("\n------- Bash Commands -------")
print(code_block_parser_result.bash)
The LLM component pairs a ChatOpenAI model with a structured prompt that asks for the corrected code, unit tests, any install commands, and recommendations covering:

Code quality (PEP 8)
Runtime considerations

The prompt ends by instructing the model to format its response with clear sections and tagged code blocks, and the pieces are composed into an LCEL chain:

chain = prompt | model | CodeBlockParser()
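A minimal sketch of how this chain might be assembled; the model choice and the prompt wording are assumptions that only approximate the tutorial's actual prompt:

```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

# Model and temperature are assumptions, not the tutorial's exact settings
model = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# Abbreviated stand-in for the tutorial's prompt; it requests the same tagged sections
prompt = ChatPromptTemplate.from_template(
    """You are a Python debugging assistant.

Code:
{code}

Error:
{error}

1. Explain the cause of the error.
2. Provide the fixed code in a fenced block tagged corrected.
3. Provide unit tests in a fenced block tagged tests.
4. List any packages to install in a fenced block tagged bash.
5. Add recommendations covering:
   - Code quality (PEP 8)
   - Runtime considerations

Format your response with clear sections and tagged code blocks as shown above."""
)

chain = prompt | model | CodeBlockParser()
```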
Building LangGraph Workflow
In this section, we'll build an automated debugging workflow using LangGraph. This workflow will systematically process code, identify errors, suggest fixes, and validate the corrections. Let's break down the key components:
Execute Code: First, run the code and capture any errors
Analyze Errors: If errors occur, use AI to analyze them and suggest fixes
Install Dependencies: Install any required packages
Decide Next Step: Either end the process if successful, or retry debugging
The workflow continues until either:
The code runs successfully without errors
The maximum number of debug attempts (3) is reached
Let's implement each component and see how they work together in our debugging system.
Define the State
The AgentState serves as the central data structure tracking progress through the debugging workflow. This stateful approach enables iterative improvements by preserving critical debugging context between cycles.
Key State Components:
Original Code: Preserves the initial code before any modifications
Code: Current version of the code being debugged
Error: Captured runtime error message (if any)
Dependencies: List of required packages identified by AI analysis
Execution Result: Detailed outcome of code execution attempts (success status, raw output/error)
Debug Attempts: Counter tracking retries (max 3 attempts)
Fix History: List of debugging attempts with corresponding errors and fixes
The state evolves through each workflow iteration, preserving the debugging history while allowing progressive code improvements.
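A sketch of how AgentState might be declared as a TypedDict; the field names mirror the list above, but the exact types are assumptions:

```python
from typing import TypedDict, Optional

class FixAttempt(TypedDict):
    # One entry per debug cycle: the error seen and the fix that was applied
    error: str
    fixed_code: str

class AgentState(TypedDict):
    original_code: str             # code as originally submitted
    code: str                      # current (possibly corrected) version
    error: Optional[str]           # last captured error message, if any
    dependencies: list[str]        # packages the LLM says are required
    execution_result: dict         # success flag plus raw output/error
    debug_attempts: int            # retry counter, capped at 3
    fix_history: list[FixAttempt]  # record of previous attempts
```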
Define the Nodes
The workflow consists of four specialized nodes that form the debugging pipeline:

Execute Code Node (execute_code_node):
Runs the current version of the code
Captures execution results (success status, output/error)
Updates the state with the error message if the run fails

Analyze Error Node (analyze_error_node):
Activates only when errors exist
Uses the LLM to diagnose the error and generate:
Corrected code versions
Required dependencies
Validation test cases

Install Dependencies Node (install_deps_node):
Processes AI-identified package requirements
Executes system commands using the environment-aware executor
Supports both pip and UV package managers

Decision Node (decide_next_step):
Evaluates workflow progress:
Ends on successful execution
Limits the run to 3 debug cycles
Triggers a retry otherwise
Acts as the workflow's control gate
Nodes maintain single responsibility while sharing state through the AgentState structure, enabling modular debugging steps.
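A condensed sketch of what these node functions might look like; the helper names (execute_code, chain, get_command_executor) refer to the sketches above, and the exact bodies are assumptions:

```python
def execute_code_node(state: AgentState) -> dict:
    # Run the current code version and record the outcome
    result = execute_code(state["code"])
    return {
        "execution_result": {
            "success": result.success,
            "output": result.output,
            "error": result.error,
        },
        "error": None if result.success else result.error,
    }

def analyze_error_node(state: AgentState) -> dict:
    # Ask the LLM chain for a fix; parsed.corrected / parsed.bash come from CodeBlockParser
    parsed = chain.invoke({"code": state["code"], "error": state["error"]})
    fixed_code = "\n".join(parsed.corrected) or state["code"]
    return {
        "code": fixed_code,
        "dependencies": parsed.bash,
        "debug_attempts": state["debug_attempts"] + 1,
        "fix_history": state["fix_history"]
        + [{"error": state["error"], "fixed_code": fixed_code}],
    }

def install_deps_node(state: AgentState) -> dict:
    # Naively pull package names out of any suggested install commands
    packages = [
        token
        for command in state["dependencies"]
        for token in command.split()
        if token not in ("pip", "uv", "python", "-m", "install")
    ]
    if packages:
        get_command_executor().install(packages)
    return {}

def decide_next_step(state: AgentState) -> str:
    # Routing function used by the conditional edge in the next section
    if state["execution_result"]["success"]:
        return "end"
    if state["debug_attempts"] >= 3:
        return "max_attempts_reached"
    return "retry"
```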
Define the Edges
The edge configuration defines the workflow's logical flow and decision points:
Entry Point:
Workflow starts with code execution (execute node)
builder.set_entry_point("execute")
Sequential Flow:
Execute → Analyze: Code execution results flow to error analysis
Analyze → Install: Identified dependencies proceed to installation
Install → Execute: After setup, proceed to validation testing
Conditional Branching:
Post-test evaluation determines next step:
Success: Terminate workflow (END)
Failure: Restart cycle if under 3 attempts (retry → execute)
Max Attempts: Terminate after 3 retries (max_attempts_reached → END)
Implemented via add_conditional_edges()
This edge configuration creates a robust debugging loop that automatically handles retries while ensuring the process eventually terminates, either through successful debugging or reaching the attempt limit.
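One plausible way to wire these edges with LangGraph's StateGraph, consistent with the node sketches above; the tutorial's actual mapping of branch names to targets may differ:

```python
from langgraph.graph import StateGraph, END

builder = StateGraph(AgentState)
builder.add_node("execute", execute_code_node)
builder.add_node("analyze", analyze_error_node)
builder.add_node("install", install_deps_node)

# Entry point: start by running the submitted code
builder.set_entry_point("execute")

# After each execution, decide whether to finish, give up, or keep debugging
builder.add_conditional_edges(
    "execute",
    decide_next_step,
    {
        "end": END,                   # code ran successfully
        "max_attempts_reached": END,  # three debug cycles exhausted
        "retry": "analyze",           # analyze the failure and try another fix
    },
)

# One debug cycle: analyze the error, install dependencies, re-execute
builder.add_edge("analyze", "install")
builder.add_edge("install", "execute")
```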

Run the Workflow
Let's compile our builder graph and run it with the sample code we created earlier to see our debugging system in action.
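A sketch of compiling and invoking the graph, assuming the AgentState keys and the sample_code name used in the earlier sketches:

```python
# Compile the graph into a runnable app and debug the buggy sample
app = builder.compile()

initial_state = {
    "original_code": sample_code,
    "code": sample_code,
    "error": None,
    "dependencies": [],
    "execution_result": {},
    "debug_attempts": 0,
    "fix_history": [],
}

final_state = app.invoke(initial_state)

print("Success:", final_state["execution_result"]["success"])
print("Debug attempts:", final_state["debug_attempts"])
print("Final code:\n", final_state["code"])
```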