Code Generation Agent Pattern: Read-Generate-Test-Fix Loop #174

@yasha-dev1

Overview

The Code Generation Agent pattern enables autonomous code creation, editing, and debugging through a structured Read → Generate → Test → Fix loop. This pattern powers modern AI coding assistants like Claude Code, GitHub Copilot, and OpenCode. The agent integrates with LSP (Language Server Protocol) for code intelligence, file systems for context, and test runners for validation.

How It Works

Core Loop

1. READ: Understand codebase context
   ↓
2. GENERATE: Create/edit code based on requirements
   ↓
3. TEST: Run tests to validate changes
   ↓
4. FIX: If tests fail, debug and regenerate
   ↓
Repeat until tests pass or the iteration limit is reached
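Stripped of framework details, the loop above can be sketched in plain Python. All four phase callables are injected placeholders that mirror the phase names, not a real API:

```python
def code_agent_loop(task, read_context, generate, run_tests, fix, max_iterations=5):
    """Minimal Read-Generate-Test-Fix skeleton (illustrative, framework-agnostic)."""
    context = read_context(task)                  # 1. READ
    code = generate(task, context)                # 2. GENERATE
    for iteration in range(1, max_iterations + 1):
        result = run_tests(code)                  # 3. TEST
        if result["passed"]:
            return {"code": code, "iterations": iteration}
        code = fix(code, result["errors"])        # 4. FIX
    raise RuntimeError("tests still failing after max_iterations")
```

Injecting the callables keeps the control flow testable in isolation; the concrete tool implementations come later in this proposal.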

Flow Example

User: "Add a function to calculate Fibonacci numbers with memoization"

READ PHASE:
  → Scan codebase structure
  → Find relevant files (math_utils.py)
  → Read existing code patterns
  → Check LSP for type definitions

GENERATE PHASE:
  → Create fibonacci() function
  → Add memoization decorator
  → Add type hints
  → Generate docstring
  → Write to math_utils.py

TEST PHASE:
  → Run pytest tests/test_math_utils.py
  → Result: 1 test failed
  → Error: "Expected fibonacci(10) = 55, got 89"

FIX PHASE:
  → Analyze error
  → Identify bug (wrong base case)
  → Regenerate with fix
  → Write corrected code

TEST PHASE (iteration 2):
  → Run pytest again
  → Result: All tests passed ✓

COMPLETE: Return generated code + test results
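Assuming the bug was the off-by-one base case implied above (89 is the value one step past fibonacci(10) = 55), the corrected code after iteration 2 might look like:

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # memoization, as the task requested
def fibonacci(n: int) -> int:
    """Return the n-th Fibonacci number (fibonacci(0) == 0)."""
    if n < 2:  # corrected base cases: fib(0) = 0, fib(1) = 1
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)
```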

Reference Implementations

Proposed PyWorkflow Implementation

import subprocess

from pyworkflow import workflow, step, agent
from pyworkflow.agents import CodeAgent, LSPClient, TestRunner

# Define code tools
@step()
async def read_file(file_path: str) -> str:
    """Read file contents."""
    with open(file_path) as f:
        return f.read()

@step()
async def write_file(file_path: str, content: str) -> bool:
    """Write content to file."""
    with open(file_path, "w") as f:
        f.write(content)
    return True

@step()
async def run_tests(test_path: str | None = None) -> dict:
    """Run tests and return results."""
    result = subprocess.run(
        ["pytest", test_path or "tests/", "-v"],
        capture_output=True,
        text=True
    )
    return {
        "passed": result.returncode == 0,
        "stdout": result.stdout,
        "stderr": result.stderr
    }

@step()
async def get_lsp_info(file_path: str, position: dict) -> dict:
    """Get LSP information (type hints, references, etc.)."""
    lsp_client = LSPClient()
    return await lsp_client.get_hover_info(file_path, position)

# Create code generation agent
@agent(
    pattern="code_generation",
    model="claude-sonnet-4-5-20250929",
    tools=[read_file, write_file, run_tests, get_lsp_info],
    enable_testing=True,
    max_fix_iterations=5,
    auto_format=True,  # Run formatter after code generation
    lsp_enabled=True,  # Use LSP for code intelligence
)
async def coding_agent(task: str, codebase_path: str):
    """
    AI coding agent with test-driven development.
    
    The agent:
    1. Reads codebase context
    2. Generates or edits code
    3. Runs tests to validate
    4. Fixes issues if tests fail
    5. Iterates until tests pass
    """
    pass

# Use the agent
result = await coding_agent.run(
    task="Add a function to calculate factorial with type hints and tests",
    codebase_path="/path/to/project"
)

print(result.files_modified)    # ["math_utils.py", "tests/test_math.py"]
print(result.tests_passed)      # True
print(result.iterations)        # 2 (initial + 1 fix)
print(result.code_changes)      # Diff of changes

Advanced: Multi-File Refactoring

@agent(
    pattern="code_generation",
    model="claude-opus-4-6",  # More powerful for complex refactoring
    tools=[read_file, write_file, run_tests, search_codebase, get_lsp_references],
    enable_planning=True,      # Plan changes before executing
    enable_testing=True,
    require_approval=True,     # Human approval for multi-file changes
)
async def refactoring_agent(task: str):
    """
    Advanced refactoring agent with planning.
    
    1. Analyzes codebase
    2. Generates refactoring plan
    3. Requests human approval
    4. Executes changes
    5. Runs full test suite
    """
    pass

# Use with planning
result = await refactoring_agent.run(
    "Refactor authentication module to use JWT tokens instead of sessions"
)
# → Agent generates plan
# → Workflow suspends for approval via hook()
# → Human approves
# → Agent executes refactoring
# → Tests validate changes

Event Types

Code generation agents record these events:

  1. AGENT_STARTED - Agent begins execution

    {
      "run_id": "abc123",
      "task": "Add factorial function",
      "codebase_path": "/path/to/project",
      "enable_testing": true
    }
  2. AGENT_CODEBASE_SCAN - Agent scans codebase for context

    {
      "total_files": 45,
      "relevant_files": ["math_utils.py", "tests/test_math.py"],
      "lsp_initialized": true
    }
  3. AGENT_CODE_GENERATION_STARTED - Begin code generation

    {
      "iteration": 1,
      "target_files": ["math_utils.py"],
      "operation": "create_function"  # or "edit_function", "refactor", "delete"
    }
  4. AGENT_CODE_GENERATED - Code generation completes

    {
      "iteration": 1,
      "file_path": "math_utils.py",
      "lines_added": 15,
      "lines_deleted": 0,
      "diff": "...",
      "tokens_generated": 450
    }
  5. AGENT_FILE_WRITTEN - File written to disk

    {
      "file_path": "math_utils.py",
      "content_hash": "abc123...",
      "formatted": true  # If auto-format enabled
    }
  6. AGENT_TESTS_STARTED - Test execution begins

    {
      "iteration": 1,
      "test_command": "pytest tests/test_math.py -v"
    }
  7. AGENT_TESTS_COMPLETED - Test execution finishes

    {
      "iteration": 1,
      "passed": false,
      "total_tests": 5,
      "passed_tests": 4,
      "failed_tests": 1,
      "error_output": "AssertionError: Expected 120, got 720"
    }
  8. AGENT_DEBUGGING - Agent analyzes test failures

    {
      "iteration": 1,
      "error": "AssertionError: Expected 120, got 720",
      "diagnosis": "Incorrect base case in recursive function",
      "fix_strategy": "Change base case from n <= 0 to n <= 1"
    }
  9. AGENT_FIXING - Agent regenerates code with fix

    {
      "iteration": 2,
      "fix_applied": "Updated base case condition",
      "file_path": "math_utils.py"
    }
  10. AGENT_ALL_TESTS_PASSED - All tests passed

    {
      "final_iteration": 2,
      "total_tests": 5,
      "passed_tests": 5,
      "total_fix_iterations": 1
    }
  11. AGENT_COMPLETED / AGENT_FAILED
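Because every phase is recorded, a run can be summarized (or audited) from its event log alone. A minimal sketch, assuming events arrive as dicts carrying the proposed type names; the record shape here is illustrative, not an existing API:

```python
def summarize_run(events: list[dict]) -> dict:
    """Summarize a code-generation run from its event log.

    Each event is assumed to be {"type": <name>, "data": <payload>} using
    the event names proposed above.
    """
    summary = {"fix_iterations": 0, "tests_passed": False, "files": set()}
    for event in events:
        if event["type"] == "AGENT_FIXING":
            summary["fix_iterations"] += 1
        elif event["type"] == "AGENT_ALL_TESTS_PASSED":
            summary["tests_passed"] = True
        elif event["type"] == "AGENT_FILE_WRITTEN":
            summary["files"].add(event["data"]["file_path"])
    return summary
```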

Implementation Details

Codebase Scanner

class CodebaseScanner:
    async def scan(self, codebase_path: str, task: str) -> dict:
        """Scan codebase and identify relevant files."""
        
        # Find all code files
        all_files = await self._find_code_files(codebase_path)
        
        # Use LLM to identify relevant files for task
        prompt = f"""
        Task: {task}
        
        Files in codebase:
        {json.dumps(all_files, indent=2)}
        
        Which files are most relevant for this task?
        Respond with JSON array of file paths.
        """
        
        response = await self.llm.generate(messages=[{"role": "user", "content": prompt}])
        relevant_files = json.loads(response.text)
        
        # Read relevant files
        file_contents = {}
        for file_path in relevant_files:
            file_contents[file_path] = await read_file(file_path)
        
        await ctx.record_event(EventType.AGENT_CODEBASE_SCAN, {
            "total_files": len(all_files),
            "relevant_files": relevant_files,
            "lsp_initialized": self.lsp_enabled
        })
        
        return {
            "relevant_files": relevant_files,
            "file_contents": file_contents
        }

Code Generator

class CodeGenerator:
    async def generate(self, task: str, context: dict) -> dict:
        """Generate code based on task and codebase context."""
        
        system_prompt = """
        You are an expert software engineer. Generate clean, well-documented code.
        Follow these guidelines:
        - Use type hints
        - Write docstrings
        - Follow existing code style
        - Add error handling
        - Write tests
        """
        
        user_prompt = f"""
        Task: {task}
        
        Existing code context:
        {self._format_context(context)}
        
        Generate the necessary code changes.
        Respond with JSON:
        {{
          "file_path": "path/to/file.py",
          "operation": "create_function|edit_function|refactor|delete",
          "code": "...",
          "explanation": "..."
        }}
        """
        
        response = await self.llm.generate(
            model=self.model,
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_prompt}
            ]
        )
        
        code_change = json.loads(response.text)
        
        await ctx.record_event(EventType.AGENT_CODE_GENERATED, {
            "file_path": code_change["file_path"],
            "operation": code_change["operation"],
            "lines_added": code_change["code"].count("\n"),
            "tokens_generated": response.usage.total_tokens
        })
        
        return code_change
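One practical detail the sketch above glosses over: models often wrap JSON replies in a Markdown code fence, so a bare json.loads(response.text) can fail. A small hardening helper (hypothetical name, a sketch):

```python
import json
import re

# Matches an opening fence (optionally tagged "json") or a closing fence.
_FENCE = re.compile(r"^\s*`{3}(?:json)?\s*|\s*`{3}\s*$", re.MULTILINE)

def extract_json(text: str) -> dict:
    """Parse JSON from an LLM reply, tolerating a surrounding code fence."""
    return json.loads(_FENCE.sub("", text).strip())
```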

Test Runner Integration

class TestRunner:
    async def run_and_validate(self, test_path: str | None = None) -> dict:
        """Run tests and parse results."""
        
        await ctx.record_event(EventType.AGENT_TESTS_STARTED, {
            "test_command": f"pytest {test_path or 'tests/'} -v"
        })
        
        # Execute tests as PyWorkflow step
        result = await run_tests(test_path)
        
        # Parse output
        passed = result["passed"]
        test_info = self._parse_pytest_output(result["stdout"])
        
        await ctx.record_event(EventType.AGENT_TESTS_COMPLETED, {
            "passed": passed,
            "total_tests": test_info["total"],
            "passed_tests": test_info["passed"],
            "failed_tests": test_info["failed"],
            "error_output": result["stderr"] if not passed else None
        })
        
        return {
            "passed": passed,
            "test_info": test_info,
            "error_output": result["stderr"]
        }
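The _parse_pytest_output helper is left unspecified above. A best-effort version can scrape counts from pytest's final summary line; for robust results a machine-readable report (e.g. pytest's junit-xml output) is preferable to scraping stdout:

```python
import re

def parse_pytest_output(stdout: str) -> dict:
    """Best-effort parse of pytest's summary line,
    e.g. '==== 1 failed, 4 passed in 0.12s ===='."""
    counts = {"passed": 0, "failed": 0}
    for number, outcome in re.findall(r"(\d+) (passed|failed)", stdout):
        counts[outcome] = int(number)
    counts["total"] = counts["passed"] + counts["failed"]
    return counts
```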

Debugging & Fixing Logic

class CodeFixer:
    async def fix(self, code: str, test_error: str, iteration: int) -> str:
        """Analyze test failure and generate fix."""
        
        prompt = f"""
        The following code failed tests:
        
        ```python
        {code}
        ```
        
        Test error:
        {test_error}
        
        Analyze the error and provide a fixed version of the code.
        Explain the bug and your fix.
        
        Respond with JSON:
        {{
          "diagnosis": "Explanation of the bug",
          "fix_strategy": "How you'll fix it",
          "fixed_code": "..."
        }}
        """
        
        response = await self.llm.generate(messages=[{"role": "user", "content": prompt}])
        fix_data = json.loads(response.text)
        
        await ctx.record_event(EventType.AGENT_DEBUGGING, {
            "iteration": iteration,
            "error": test_error,
            "diagnosis": fix_data["diagnosis"],
            "fix_strategy": fix_data["fix_strategy"]
        })
        
        await ctx.record_event(EventType.AGENT_FIXING, {
            "iteration": iteration + 1,
            "fix_applied": fix_data["fix_strategy"]
        })
        
        return fix_data["fixed_code"]

LSP Integration

class LSPClient:
    """Language Server Protocol client for code intelligence."""
    
    async def initialize(self, workspace_path: str, language: str):
        """Initialize LSP server for the workspace."""
        # Start language server (pyright, rust-analyzer, etc.)
        self.server = await start_lsp_server(language, workspace_path)
    
    async def get_type_info(self, file_path: str, position: dict) -> dict:
        """Get type information at cursor position."""
        return await self.server.text_document_hover({
            "textDocument": {"uri": f"file://{file_path}"},
            "position": position
        })
    
    async def find_references(self, file_path: str, position: dict) -> list[dict]:
        """Find all references to symbol."""
        return await self.server.text_document_references({
            "textDocument": {"uri": f"file://{file_path}"},
            "position": position
        })
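Under the hood, LSP messages are JSON-RPC payloads framed with a Content-Length header, a blank line (CRLF), and a UTF-8 body, and LSP positions are zero-based line/character pairs. A minimal framing helper, illustrative and not part of the proposed LSPClient:

```python
import json

def frame_lsp_message(payload: dict) -> bytes:
    """Frame a JSON-RPC message per the LSP base protocol:
    'Content-Length: N\\r\\n\\r\\n' followed by the UTF-8 JSON body."""
    body = json.dumps(payload).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body
```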

Main Agent Loop

class CodeGenerationAgent:
    async def run(self, task: str, codebase_path: str):
        """Execute code generation with test-driven development."""
        
        # 1. Scan codebase
        context = await self.scanner.scan(codebase_path, task)
        
        iteration = 1
        test_result = {}  # stays empty if testing is disabled
        
        while iteration <= self.max_fix_iterations:
            # 2. Generate code
            if iteration == 1:
                code_change = await self.generator.generate(task, context)
            else:
                # Fix existing code
                code_change["code"] = await self.fixer.fix(
                    code_change["code"],
                    test_result["error_output"],
                    iteration - 1
                )
            
            # 3. Write code to file
            await write_file(code_change["file_path"], code_change["code"])
            
            if self.auto_format:
                await self._format_file(code_change["file_path"])
            
            # 4. Run tests
            if self.enable_testing:
                test_result = await self.test_runner.run_and_validate()
                
                if test_result["passed"]:
                    await ctx.record_event(EventType.AGENT_ALL_TESTS_PASSED, {
                        "final_iteration": iteration,
                        "total_tests": test_result["test_info"]["total"]
                    })
                    break
            else:
                # No testing - assume success
                break
            
            iteration += 1
        
        return {
            "files_modified": [code_change["file_path"]],
            "tests_passed": test_result.get("passed", True),
            "iterations": iteration,
            "code_changes": code_change
        }

Trade-offs

Pros

  • Autonomous: Generates, tests, and fixes code without human intervention
  • Test-driven: Validates changes with automated tests
  • Code intelligence: LSP integration for type checking, references
  • Iterative improvement: Automatically fixes bugs until tests pass
  • Scalable: Can handle multi-file changes and refactoring
  • Transparent: Event log shows full read-generate-test-fix loop

Cons

  • Complexity: Many moving parts (LSP, test runner, file system)
  • Unpredictable: May generate incorrect code even after iterations
  • Test dependency: Requires good test coverage to be effective
  • Resource intensive: Multiple LLM calls per iteration
  • Security risk: Writes code to file system (needs sandboxing)
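The sandboxing concern above can be partially addressed at the tool level by confining writes to a project root. A minimal sketch with a hypothetical safe_write helper; real isolation also needs OS-level controls such as containers or seccomp:

```python
from pathlib import Path

def safe_write(sandbox_root: str, relative_path: str, content: str) -> Path:
    """Write only inside sandbox_root, rejecting path-traversal attempts."""
    root = Path(sandbox_root).resolve()
    target = (root / relative_path).resolve()  # normalizes any '..' segments
    if not target.is_relative_to(root):       # Python 3.9+
        raise PermissionError(f"write outside sandbox: {relative_path}")
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content)
    return target
```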

When to Use Code Generation Agent

Use Code Generation Agent when:

  • Automated code generation is a core feature
  • You have a comprehensive test suite
  • Security/sandboxing is in place
  • You need autonomous bug fixing

Don't use when:

  • Simple one-off code generation
  • No test coverage
  • High-stakes production code without review
  • Security constraints prevent file system access

Implementation Checklist

  • Create pyworkflow/agents/code_generation.py with CodeGenerationAgent class
  • Implement CodebaseScanner for context gathering
  • Implement CodeGenerator for code generation
  • Implement TestRunner for test execution
  • Implement CodeFixer for debugging and fixes
  • Integrate LSP client for code intelligence
  • Add file system tools (read_file, write_file) with sandboxing
  • Add event types: AGENT_CODE_GENERATED, AGENT_TESTS_*, AGENT_DEBUGGING
  • Create @agent(pattern="code_generation") decorator
  • Add max_fix_iterations safeguard
  • Support multiple languages (Python, TypeScript, Rust, Go)
  • Add auto-formatting support (black, prettier, rustfmt)
  • Add tests with mock file system
  • Document code generation agent in examples/
  • Add security guidelines (sandboxing, file permissions)

Metadata

Labels: agents (AI Agent module, pyworkflow_agents), enhancement (New feature or request), feature (Feature to be implemented)