
Usage
CLI Options
| Option | Description | Default |
|---|---|---|
| `-p, --problem` | Problem ID (e.g., ahc039) | Required |
| `-i, --iterations` | Max experiment iterations | 14 |
| `-m, --mode` | Config mode | ALE_CONFIGS |
| `-d, --coding-agent` | Coding agent | From config |
| `--list` | List all problems | - |
| `--lite` | List lite problems | - |
| `--list-agents` | List coding agents | - |
Available Problems
ahc008, ahc011, ahc015, ahc016, ahc024, ahc025, ahc026, ahc027, ahc039, ahc046
ALE-Bench uses the `benchmark_tree_search` strategy, which relies on the handler’s built-in evaluation via `handler.run()`. This differs from `kapso.evolve()`, which uses agent-built evaluation.
Output Structure
The agent generates:
Evaluation
The evaluation process works as follows:
- Code Submission: The `main.cpp` file is read from the experiment workspace
- Docker Evaluation: Code is sent to `ale_bench.public_eval()`, which compiles and runs it in an isolated Docker container
- Test Execution: The solution runs against all test cases with strict time limits
- Validation: Each test case must return `ACCEPTED` with a non-zero score
- Score Stabilization: If all tests pass, the solution runs 4 additional times and the scores are averaged for stability
- Final Ranking: Private evaluation compares the result against the original contest participants
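The validation and score-stabilization rules can be illustrated with a small sketch. This is not the handler's code: `CaseResult`, `all_cases_valid`, and `mean_score` are hypothetical names, and treating an empty result list as invalid is an assumption. The text does not say whether the initial run is included in the average, so `mean_score` simply averages whatever runs it is given.

```cpp
#include <string>
#include <vector>

// Hypothetical record for one test case (field names are assumptions).
struct CaseResult {
    std::string judge_result;  // e.g. "ACCEPTED"
    long long score;
};

// Validation rule from the text: every case must be ACCEPTED with a non-zero score.
// Assumption: an empty result list counts as invalid.
bool all_cases_valid(const std::vector<CaseResult>& cases) {
    if (cases.empty()) return false;
    for (const CaseResult& c : cases) {
        if (c.judge_result != "ACCEPTED" || c.score == 0) return false;
    }
    return true;
}

// Score stabilization: average the scores of the repeated runs.
double mean_score(const std::vector<double>& run_scores) {
    if (run_scores.empty()) return 0.0;
    double sum = 0.0;
    for (double s : run_scores) sum += s;
    return sum / static_cast<double>(run_scores.size());
}
```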
Code Requirements
Generated C++ must:
- Be time-aware (limit: `time_limit` - 100 ms, with the margin reserved for I/O)
- Handle all input constraints
- Use efficient algorithms and data structures
- Include compiler optimization pragmas if helpful
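One way to meet the time-awareness requirement is a small deadline helper that the main search loop polls. The sketch below is only an illustration: `Deadline` and `run_until` are hypothetical names, the 100 ms margin is taken from the list above, and polling the clock every 256 iterations is an arbitrary choice to keep timing overhead low.

```cpp
#include <chrono>

// Hypothetical helper: tracks elapsed wall-clock time against a budget.
class Deadline {
public:
    // time_limit_ms: the problem's time limit; margin_ms: safety margin for I/O.
    explicit Deadline(double time_limit_ms, double margin_ms = 100.0)
        : start_(std::chrono::steady_clock::now()),
          budget_ms_(time_limit_ms - margin_ms) {}

    double elapsed_ms() const {
        using namespace std::chrono;
        return duration<double, std::milli>(steady_clock::now() - start_).count();
    }

    bool expired() const { return elapsed_ms() >= budget_ms_; }

    double budget_ms() const { return budget_ms_; }

private:
    std::chrono::steady_clock::time_point start_;
    double budget_ms_;
};

// Calls `step` repeatedly until the deadline passes and returns the number
// of completed iterations. The clock is read only every 256 iterations.
template <class Step>
long long run_until(const Deadline& deadline, Step step) {
    long long iters = 0;
    while (true) {
        if ((iters & 255) == 0 && deadline.expired()) break;
        step();
        ++iters;
    }
    return iters;
}
```

In a solution, the `Deadline` would typically be constructed at the very start of `main`, so that input parsing counts against the budget, and `step` would perform one local-search move.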
Built-in Domain Knowledge
The handler includes tips for common algorithms:
Simulated Annealing
- Design good state representation
- Balance small and large moves
- Avoid recomputation in legality checks
- Keep regret mechanism for constrained problems
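As a rough illustration of several of these tips, the sketch below anneals a toy problem: reorder an array to minimize the sum of absolute differences between neighbors. The objective, the temperature schedule, and all names (`full_cost`, `touched_cost`, `anneal`, `SaResult`) are assumptions made for the example, not part of the handler. It mixes a small move (adjacent swap) with a large move (random swap) and evaluates each move from the touched edges only rather than recomputing the full cost. A fixed iteration count and seed are used for reproducibility; a real submission would more likely drive the temperature from elapsed time.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdlib>
#include <random>
#include <vector>

// Toy objective (assumption): sum of |a[i] - a[i+1]| over neighbors.
long long full_cost(const std::vector<int>& a) {
    long long c = 0;
    for (std::size_t i = 0; i + 1 < a.size(); ++i) c += std::abs(a[i] - a[i + 1]);
    return c;
}

// Cost of the (at most four) edges touching positions i and j.
// Edge e joins positions e and e+1; duplicates are skipped so that
// adjacent swaps are not double counted.
long long touched_cost(const std::vector<int>& a, int i, int j) {
    const int n = static_cast<int>(a.size());
    const int edges[4] = {i - 1, i, j - 1, j};
    long long c = 0;
    for (int k = 0; k < 4; ++k) {
        const int e = edges[k];
        if (e < 0 || e + 1 >= n) continue;
        bool seen = false;
        for (int m = 0; m < k; ++m) seen = seen || (edges[m] == e);
        if (!seen) c += std::abs(a[e] - a[e + 1]);
    }
    return c;
}

struct SaResult {
    std::vector<int> best;         // best state seen
    long long best_cost;
    long long final_cost;          // cost tracked incrementally via deltas
    std::vector<int> final_state;  // state at the end of the run
};

SaResult anneal(std::vector<int> a, int iterations, unsigned seed) {
    std::mt19937 rng(seed);
    const int n = static_cast<int>(a.size());
    std::uniform_int_distribution<int> pos(0, n - 1);
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    const double t_start = 50.0, t_end = 0.1;  // hypothetical schedule

    long long cost = full_cost(a);
    SaResult res{a, cost, cost, a};
    for (int it = 0; it < iterations; ++it) {
        // Geometric cooling from t_start to t_end.
        const double t = t_start * std::pow(t_end / t_start, double(it) / iterations);
        const int i = pos(rng);
        int j;
        if (unit(rng) < 0.5) j = std::min(i + 1, n - 1);  // small move
        else j = pos(rng);                                // large move
        if (i == j) continue;
        const long long before = touched_cost(a, i, j);
        std::swap(a[i], a[j]);
        const long long delta = touched_cost(a, i, j) - before;
        if (delta <= 0 || unit(rng) < std::exp(-delta / t)) {
            cost += delta;  // accept
            if (cost < res.best_cost) { res.best_cost = cost; res.best = a; }
        } else {
            std::swap(a[i], a[j]);  // reject: undo the move
        }
    }
    res.final_cost = cost;
    res.final_state = a;
    return res;
}
```

Comparing `full_cost(res.final_state)` against `res.final_cost` is a cheap way to check that the incremental deltas are correct while developing a new move.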
Beam / Random Search
- Balance diversity and quality in beams
- Fast-stop bad solutions
- Use strong heuristic scoring
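A compact way to see these tips together is a beam search on a toy 0/1 knapsack, sketched below. The problem, the scoring rule (accumulated value, ties broken by lower weight), and the names `Item`, `State`, and `beam_knapsack` are assumptions chosen for illustration. Over-capacity states are dropped immediately (fast-stop), and only the best state per distinct weight is kept so the beam does not fill up with near-duplicates (diversity).

```cpp
#include <algorithm>
#include <cstddef>
#include <unordered_map>
#include <vector>

struct Item { int weight; int value; };
struct State { int weight; int value; };

// Beam search over skip/take decisions, one item per layer.
// Assumes beam_width >= 1 and non-negative weights and values.
int beam_knapsack(const std::vector<Item>& items, int capacity, std::size_t beam_width) {
    std::vector<State> beam = {{0, 0}};
    for (const Item& item : items) {
        // Diversity: keep only the best value for each distinct weight.
        std::unordered_map<int, int> best;  // weight -> best value
        auto offer = [&](int w, int v) {
            auto it = best.find(w);
            if (it == best.end()) best.emplace(w, v);
            else it->second = std::max(it->second, v);
        };
        for (const State& s : beam) {
            offer(s.weight, s.value);  // skip the item
            const int w = s.weight + item.weight;
            if (w <= capacity) offer(w, s.value + item.value);  // take it, else prune
        }
        std::vector<State> next;
        next.reserve(best.size());
        for (const auto& kv : best) next.push_back({kv.first, kv.second});
        // Heuristic score: higher value first, lighter state on ties.
        std::sort(next.begin(), next.end(), [](const State& a, const State& b) {
            if (a.value != b.value) return a.value > b.value;
            return a.weight < b.weight;
        });
        if (next.size() > beam_width) next.resize(beam_width);
        beam.swap(next);
    }
    int result = 0;
    for (const State& s : beam) result = std::max(result, s.value);
    return result;
}
```

With a beam width of 1 this degenerates into a greedy search that can commit to a heavy early item; widening the beam trades run time for the chance to recover from such choices.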
Random Simulation
- Define strong heuristic scoring
- Consider average and std of scores
- Balance greedy vs long-horizon moves
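The "average and std" tip can be sketched as follows: accumulate rollout scores per candidate move with a running mean and standard deviation, then rank candidates by a risk-adjusted key. `RunningStats`, `pick_move`, and the `mean - risk * std` rule are hypothetical choices for this example (the population standard deviation is used), not something the handler prescribes.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Online mean and variance (Welford's method), so rollout scores
// do not need to be stored.
struct RunningStats {
    long long n = 0;
    double mean = 0.0;
    double m2 = 0.0;  // sum of squared deviations from the mean

    void add(double x) {
        ++n;
        const double d = x - mean;
        mean += d / static_cast<double>(n);
        m2 += d * (x - mean);
    }

    // Population standard deviation (divides by n).
    double stddev() const {
        return n > 0 ? std::sqrt(m2 / static_cast<double>(n)) : 0.0;
    }
};

// rollout_scores[m] holds the simulated scores observed after playing move m.
// Returns the index of the move maximizing mean - risk * std.
std::size_t pick_move(const std::vector<std::vector<double>>& rollout_scores, double risk) {
    std::size_t best = 0;
    double best_key = -std::numeric_limits<double>::infinity();
    for (std::size_t m = 0; m < rollout_scores.size(); ++m) {
        RunningStats stats;
        for (double s : rollout_scores[m]) stats.add(s);
        const double key = stats.mean - risk * stats.stddev();
        if (key > best_key) { best_key = key; best = m; }
    }
    return best;
}
```

With `risk = 0` the choice is purely by average score; a positive `risk` prefers moves whose rollouts are more consistent, and a negative value would favor high-variance gambles.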