Experiment Lifecycle

Overview

Each experiment follows a lifecycle from solution selection through code generation, execution, and optional debugging.

Git Branch Model

Each experiment runs in its own git branch, inheriting code from its parent experiment. This enables:

Isolation: Each experiment starts from a clean parent state
Inheritance: Child experiments build on parent solutions
Versioning: All code changes are tracked

Experiment Session

The ExperimentSession manages a single experiment:

class ExperimentSession:
    def __init__(self, main_repo, session_folder, coding_agent_config, 
                 parent_branch_name, branch_name):
        # Clone repo to isolated folder
        self.repo = git.Repo.clone_from(main_repo, session_folder)
        
        # Checkout parent branch (inherit code)
        self.repo.git.checkout(parent_branch_name)
        
        # Create new experiment branch
        self.repo.git.checkout('-b', branch_name)
        
        # Initialize coding agent
        self.coding_agent = CodingAgentFactory.create(coding_agent_config)

Implementation Flow

1. Implement Solution

def implement_solution(self, solution, context, session):
    prompt = f"""
    Implement <solution> for <problem>.
    
    <problem>{context.problem}</problem>
    <solution>{solution}</solution>
    <knowledge>{context.kg_code_results}</knowledge>
    
    Requirements:
    - Write clean, functional code
    - Output in Experiment Output Data Directory: ./output_data_{branch}
    """
    
    session.generate_code(prompt)
    return self.problem_handler.run(session.session_folder)

2. Debug Loop

If the code fails, the debug loop attempts to fix it:

def _implement_n_debug(self, solution, context, code_debug_tries, branch_name):
    session = self.workspace.create_experiment_session(branch_name, parent_branch)
    result = self.implement_solution(solution, context, session)
    
    for i in range(code_debug_tries):
        if result.run_had_error and result.continue_debugging:
            result = self.debug_solution(solution, context, result.error_details, session)
        else:
            break
    
    self.workspace.finalize_session(session)
    return result

3. Debug Solution

def debug_solution(self, solution, context, error, session):
    prompt = f"""
    Debug the implementation.
    
    <problem>{context.problem}</problem>
    <solution>{solution}</solution>
    <error>{error}</error>
    
    Requirements:
    - Fix errors directly, no try/except workarounds
    - Don't change solution logic
    """
    
    session.generate_code(prompt, debug_mode=True)
    return self.problem_handler.run(session.session_folder)

Result Structure

@dataclass
class ExperimentResult:
    node_id: int          # Tree node ID
    solution: str         # Solution description
    score: float          # Evaluation score
    branch_name: str      # Git branch
    had_error: bool       # Whether it failed
    error_message: str    # Error details
    output: str           # Execution output
    feedbacks: str        # Analysis feedback

Session Finalization

After an experiment completes:

Commit changes to the experiment branch
Push to origin (available for child experiments)
Record cost from coding agent
Cleanup session folder

def finalize_session(self, session):
    cost = session.get_cumulative_cost()
    self.previous_sessions_cost += cost
    session.close_session()  # commit, push, cleanup

Getting Started

Core Concepts

Benchmarks

Components

Deployment

Experiment Lifecycle

Overview

Git Branch Model

Experiment Session

Implementation Flow

1. Implement Solution

2. Debug Loop

3. Debug Solution

Result Structure

Session Finalization

Getting Started

Core Concepts

Benchmarks

Components

Deployment

​Overview

​Git Branch Model

​Experiment Session

​Implementation Flow

​1. Implement Solution

​2. Debug Loop

​3. Debug Solution

​Result Structure

​Session Finalization

Overview

Git Branch Model

Experiment Session

Implementation Flow

1. Implement Solution

2. Debug Loop

3. Debug Solution

Result Structure

Session Finalization