Usage
CLI Options
| Option | Description | Default |
|---|---|---|
| -p, --problem | Problem ID (e.g., ahc039) | Required |
| -i, --iterations | Max experiment iterations | 14 |
| -m, --mode | Config mode | ALE_CONFIGS |
| -d, --coding-agent | Coding agent | From config |
| --list | List all problems | - |
| --lite | List lite problems | - |
| --list-agents | List coding agents | - |
Available Problems
| Problem | Contestants | Scoring |
|---|---|---|
| ahc008 | 824 | Maximize |
| ahc011 | 926 | Maximize |
| ahc015 | 779 | Maximize |
| ahc016 | 1047 | Maximize |
| ahc024 | 664 | Maximize |
| ahc025 | 879 | Minimize |
| ahc026 | 740 | Maximize |
| ahc027 | 999 | Minimize |
| ahc039 | 683 | Maximize |
| ahc046 | 939 | Maximize |
Output Structure
The agent generates:
Evaluation
Code Requirements
Generated C++ must:
- Be time-aware (limit: time_limit - 100ms for I/O)
- Handle all input constraints
- Use efficient algorithms and data structures
- Include compiler optimization pragmas if helpful
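The time-awareness requirement above could be met with a small deadline helper. This is a minimal sketch, not the handler's own code: `Deadline`, `run_until`, and the polling interval of 256 iterations are hypothetical names and choices, and the pragma is GCC-specific and optional.

```cpp
#include <cassert>
#include <chrono>

// Optional and GCC-specific; other compilers may ignore it or warn about it.
#pragma GCC optimize("O3")

// Hypothetical helper: tracks wall-clock time against (time_limit - I/O margin).
class Deadline {
    using Clock = std::chrono::steady_clock;

public:
    // time_limit_ms: the problem's time limit.
    // io_margin_ms: reserve for I/O (100 ms, following the requirement above).
    explicit Deadline(double time_limit_ms, double io_margin_ms = 100.0)
        : start_(Clock::now()), budget_ms_(time_limit_ms - io_margin_ms) {
        assert(budget_ms_ > 0.0);
    }

    double elapsed_ms() const {
        return std::chrono::duration<double, std::milli>(Clock::now() - start_).count();
    }
    bool expired() const { return elapsed_ms() >= budget_ms_; }
    double budget_ms() const { return budget_ms_; }

private:
    Clock::time_point start_;
    double budget_ms_;
};

// Calls `step` repeatedly until the deadline passes. The clock is polled only
// every 256 iterations so that timing checks do not dominate cheap steps.
template <class Step>
long long run_until(const Deadline& deadline, Step step) {
    long long iterations = 0;
    while (true) {
        if ((iterations & 255) == 0 && deadline.expired()) break;
        step();
        ++iterations;
    }
    return iterations;
}
```

A solver would construct `Deadline` as early as possible in `main` (before reading input) and pass its improvement step to `run_until`. If a single step is expensive, polling the clock on every iteration is the safer choice.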
Built-in Domain Knowledge
The handler includes tips for common algorithms:
Simulated Annealing
- Design good state representation
- Balance small and large moves
- Avoid recomputation in legality checks
- Keep regret mechanism for constrained problems
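Several of these tips can be illustrated on a toy problem. The sketch below anneals a two-way number partition (minimize the difference between the two group sums). The problem, the function name `anneal_partition`, the temperature schedule, and the 20% large-move rate are all assumptions chosen for illustration, not taken from the handler. It uses a fixed iteration count to stay deterministic, where a contest solution would normally drive the schedule from elapsed time. It does not attempt a regret mechanism.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdlib>
#include <random>
#include <vector>

struct PartitionResult {
    std::vector<char> side;  // side[i] is 0 or 1
    long long diff;          // |sum(side 0) - sum(side 1)| of the best state seen
};

// Assumes non-negative values in `a`.
PartitionResult anneal_partition(const std::vector<long long>& a, int iterations,
                                 std::uint32_t seed = 12345) {
    assert(!a.empty() && iterations > 0);
    const int n = static_cast<int>(a.size());
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> pick(0, n - 1);
    std::uniform_real_distribution<double> unit(0.0, 1.0);

    // State: one side flag per item plus the running signed difference,
    // so a move can be scored in O(1) without re-summing the groups.
    std::vector<char> side(n, 0);
    long long signed_diff = 0;
    for (long long x : a) signed_diff += x;

    std::vector<char> best_side = side;
    long long best = std::llabs(signed_diff);

    const double t_start = std::max(1.0, static_cast<double>(*std::max_element(a.begin(), a.end())));
    const double t_end = 0.1;
    auto flip_delta = [&](int k) { return side[k] == 0 ? -2 * a[k] : 2 * a[k]; };

    for (int it = 0; it < iterations; ++it) {
        const double progress = static_cast<double>(it) / iterations;
        const double temp = t_start * std::pow(t_end / t_start, progress);

        // Small move: flip one item. Large move (20%): flip a second item too.
        const int i = pick(rng);
        int j = (unit(rng) < 0.2) ? pick(rng) : -1;
        if (j == i) j = -1;

        const long long next = signed_diff + flip_delta(i) + (j >= 0 ? flip_delta(j) : 0);
        const long long delta = std::llabs(next) - std::llabs(signed_diff);

        // Always accept improvements; accept worsening moves with Metropolis probability.
        if (delta <= 0 || unit(rng) < std::exp(-static_cast<double>(delta) / temp)) {
            side[i] ^= 1;
            if (j >= 0) side[j] ^= 1;
            signed_diff = next;
            if (std::llabs(signed_diff) < best) {
                best = std::llabs(signed_diff);
                best_side = side;
            }
        }
    }
    return {best_side, best};
}
```

For example, `anneal_partition({3, 1, 4, 2, 2}, 20000)` searches for a split of the five values into two groups with equal sums. The running `signed_diff` is what keeps each move O(1); the same idea applies to legality checks in constrained problems.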
Beam / Random Search
- Balance diversity and quality in beams
- Fast-stop bad solutions
- Use strong heuristic scoring
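As a rough illustration of the beam tips, the sketch below runs a beam search over a 0/1 knapsack. The problem and the names `State` and `beam_knapsack` are assumptions made for this example only. Infeasible states are dropped immediately (fast-stop), states with the same weight are collapsed to the best one (diversity), and the remaining states are ranked by value (quality). Here the heuristic score is simply the accumulated value; a real problem would usually need a stronger estimate.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct State {
    int weight = 0;
    int value = 0;
};

// Assumes non-negative weights. Returns the best value found, which is not
// guaranteed to be optimal when `width` is small.
int beam_knapsack(const std::vector<int>& w, const std::vector<int>& v, int cap,
                  std::size_t width) {
    assert(w.size() == v.size() && width > 0 && cap >= 0);
    std::vector<State> beam{State{}};

    for (std::size_t i = 0; i < w.size(); ++i) {
        std::vector<State> next;
        next.reserve(beam.size() * 2);
        for (const State& s : beam) {
            next.push_back(s);  // skip item i
            // Fast-stop: never expand a state that already violates the capacity.
            if (s.weight + w[i] <= cap) {
                next.push_back(State{s.weight + w[i], s.value + v[i]});
            }
        }

        // Diversity: keep only the best-valued state for each distinct weight.
        std::sort(next.begin(), next.end(), [](const State& a, const State& b) {
            if (a.weight != b.weight) return a.weight < b.weight;
            return a.value > b.value;
        });
        next.erase(std::unique(next.begin(), next.end(),
                               [](const State& a, const State& b) { return a.weight == b.weight; }),
                   next.end());

        // Quality: keep the top `width` states by value.
        std::sort(next.begin(), next.end(),
                  [](const State& a, const State& b) { return a.value > b.value; });
        if (next.size() > width) next.resize(width);
        beam = std::move(next);
    }
    return beam.front().value;
}
```

For instance, `beam_knapsack({2, 3, 4, 5}, {3, 4, 5, 6}, 5, 4)` keeps at most four states per item. Widening the beam trades running time for a better chance of retaining the state that leads to the best final answer.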
Random Simulation
- Define strong heuristic scoring
- Consider average and std of scores
- Balance greedy vs long-horizon moves
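One way to consider both the average and the spread of simulated scores is to accumulate them with Welford's online algorithm and rank candidate moves by a risk-adjusted value. The sketch below assumes a maximization problem; `ScoreStats`, `evaluate_move`, and the `mean - k * stddev` ranking are hypothetical choices for illustration, not the handler's method.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Running mean and population standard deviation (Welford's algorithm).
struct ScoreStats {
    long long count = 0;
    double mean = 0.0;
    double m2 = 0.0;  // sum of squared deviations from the current mean

    void add(double x) {
        ++count;
        const double delta = x - mean;
        mean += delta / static_cast<double>(count);
        m2 += delta * (x - mean);
    }
    double stddev() const {
        return count > 0 ? std::sqrt(m2 / static_cast<double>(count)) : 0.0;
    }
    // Larger k penalizes high-variance moves more strongly.
    double risk_adjusted(double k) const { return mean - k * stddev(); }
};

// Runs `rollouts` random simulations of one candidate move.
// `simulate` is any callable taking std::mt19937& and returning a score.
template <class Simulate>
ScoreStats evaluate_move(Simulate simulate, int rollouts, std::mt19937& rng) {
    assert(rollouts > 0);
    ScoreStats stats;
    for (int r = 0; r < rollouts; ++r) stats.add(simulate(rng));
    return stats;
}
```

A solver could call `evaluate_move` once per candidate move and choose the move with the highest `risk_adjusted(k)`. Setting `k = 0` ranks by mean alone, while a negative `k` would favor high-variance moves when a lucky outcome matters more than consistency.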
Key Differences from MLE-Bench
| Aspect | MLE-Bench | ALE-Bench |
|---|---|---|
| Language | Python | C++ (cpp23) |
| Main file | main.py | main.cpp |
| Debug mode | --debug flag | N/A |
| Evaluation | CSV grading | Docker tests |
| Stop condition | Medal achieved | Never (fixed iterations) |
| Knowledge graph | Recommended | Disabled by default |