Installation

Core Installation

Install from PyPI (recommended)

pip install leeroo-kapso

Or install from source (for development or wiki data)

git clone https://github.com/leeroo-ai/kapso.git
cd kapso

# Pull Git LFS files (wiki knowledge data)
git lfs install
git lfs pull

# Create conda environment (recommended)
conda create -n kapso_conda python=3.12
conda activate kapso_conda

# Install in development mode
pip install -e .

This repository uses Git LFS for large files in data/wikis_batch_top100/. If you cloned the repository before installing Git LFS, run git lfs install and git lfs pull from the repository root to fetch the files.
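To confirm that the LFS files were actually fetched, one option is to look for leftover pointer stubs. The sketch below relies only on the standard Git LFS pointer format (a small text file whose first line begins with `version https://git-lfs.github.com/spec/v1`). The helper names `is_lfs_pointer` and `unpulled_files` are hypothetical and not part of Kapso.

```python
from pathlib import Path

# Every Git LFS pointer file starts with this line (Git LFS pointer format).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: Path) -> bool:
    """Return True if `path` is still an un-pulled Git LFS pointer stub."""
    with path.open("rb") as fh:
        return fh.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def unpulled_files(root: str) -> list[Path]:
    """List files under `root` that are still LFS pointers (empty if root is missing)."""
    base = Path(root)
    if not base.is_dir():
        return []
    return sorted(p for p in base.rglob("*") if p.is_file() and is_lfs_pointer(p))


if __name__ == "__main__":
    # Directory name taken from the note above; run from the repository root.
    leftovers = unpulled_files("data/wikis_batch_top100")
    print(f"{len(leftovers)} file(s) still need `git lfs pull`")
```

A non-zero count suggests that git lfs pull has not been run yet, or did not complete.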

Configure API keys

Create a .env file in the project root:

# Required for most operations
OPENAI_API_KEY=your-openai-api-key

# For Gemini coding agent
GOOGLE_API_KEY=your-google-api-key

# For Claude Code coding agent
ANTHROPIC_API_KEY=your-anthropic-api-key

# For Leeroopedia MCP (curated ML/AI knowledge — sign up at leeroopedia.com)
LEEROOPEDIA_API_KEY=your-leeroopedia-api-key
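A quick way to catch a missing or placeholder key before running Kapso is to parse the .env file yourself. This is a minimal sketch using only the standard library; it assumes plain KEY=value lines as shown above and does not reflect how Kapso itself loads the file. The function names are hypothetical.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(values: dict[str, str], required=("OPENAI_API_KEY",)) -> list[str]:
    """Return required keys that are absent, empty, or still a `your-...` placeholder."""
    return [k for k in required if not values.get(k) or values[k].startswith("your-")]


if __name__ == "__main__":
    env_path = Path(".env")
    values = parse_env(env_path.read_text()) if env_path.exists() else {}
    print("Missing:", missing_keys(values) or "none")
```

Pass a longer `required` tuple (for example adding ANTHROPIC_API_KEY) if you plan to use an agent that needs its own key.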

Coding Agent Setup

Kapso supports multiple coding agents. Install the ones you plan to use:

Aider (Default)
Claude Code
Gemini
OpenHands

Aider is installed automatically with pip install -e . and needs no additional setup. It uses git-centric, diff-based editing.

# Verify installation
aider --version

Claude Code requires Node.js and the Claude Code CLI from Anthropic:

# Install Claude Code CLI
npm install -g @anthropic-ai/claude-code

# Verify installation
claude --version

Set ANTHROPIC_API_KEY in your .env file.

Gemini uses the Google AI SDK, which is installed with the core dependencies.

Set GOOGLE_API_KEY in your .env file.

OpenHands has conflicting dependencies with aider-chat. Use a separate conda environment.

# Create separate environment
conda create -n openhands_env python=3.12
conda activate openhands_env

# Install OpenHands
pip install openhands-ai litellm

Do NOT install OpenHands in the same environment as Kapso.
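The two CLI checks above (aider --version and claude --version) can be combined into one script. The sketch below only tests whether each executable is on PATH, using the command names from this section. It does not cover Gemini (an SDK, not a CLI) or OpenHands (which lives in a separate environment). `available_agents` is a hypothetical helper.

```python
import shutil

# Executable names taken from the verification commands in this section.
AGENT_COMMANDS = {"Aider": "aider", "Claude Code": "claude"}


def available_agents(which=shutil.which) -> dict[str, bool]:
    """Map each agent name to whether its CLI is found on PATH."""
    return {name: which(cmd) is not None for name, cmd in AGENT_COMMANDS.items()}


if __name__ == "__main__":
    for name, found in available_agents().items():
        print(f"{name}: {'found' if found else 'not found'}")
```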

Leeroopedia MCP (Optional)

Connect Kapso to Leeroopedia — a curated knowledge base of 1000+ ML/AI frameworks. Kapso agents use it during ideation and implementation to search docs, build plans, verify code, diagnose failures, and look up hyperparameter defaults.

pip install leeroopedia-mcp

Add your API key to .env:

LEEROOPEDIA_API_KEY=kpsk_your_key_here

See the Leeroopedia MCP docs for Claude Code and Cursor setup.

Benchmark Installation

MLE-Bench
ALE-Bench

MLE-Bench provides Kaggle competition problems.

Prerequisites:

Git LFS (sudo apt-get install git-lfs or brew install git-lfs)

Installation:

# Clone and install MLE-Bench
git clone https://github.com/openai/mle-bench.git
cd mle-bench
git lfs install
git lfs fetch --all
git lfs pull
pip install -e .
cd ..

# Install MLE-specific dependencies
pip install -r benchmarks/mle/requirements.txt

Verify:

PYTHONPATH=. python -m benchmarks.mle.runner --list

ALE-Bench provides AtCoder algorithmic optimization problems.

Prerequisites:

Docker
libcairo2-dev (sudo apt-get install -y libcairo2-dev)

Installation:

# Clone and install ALE-Bench
git clone https://github.com/SakanaAI/ALE-Bench.git
cd ALE-Bench
pip install .
pip install ".[eval]"

# Build Docker container for evaluation
bash ./scripts/docker_build_202301.sh $(id -u) $(id -g)
cd ..

Verify:

PYTHONPATH=. python -m benchmarks.ale.runner --list

Infrastructure Setup

Kapso uses Docker containers for its knowledge graph infrastructure:

Weaviate: Vector database for semantic search
Neo4j: Graph database for relationships
MediaWiki (optional): Web UI for browsing wiki pages

Quick Start (Recommended)

# Start all infrastructure
./scripts/start_infra.sh

# Stop infrastructure (data preserved)
./scripts/stop_infra.sh

# Stop and wipe all data
./scripts/stop_infra.sh --volumes

Default Service URLs

Service	URL	Credentials
MediaWiki	http://localhost:8090	`admin` / `adminpass123`
Neo4j Browser	http://localhost:7474	`neo4j` / `password`
Weaviate	http://localhost:8080	Anonymous (no auth)
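After starting the infrastructure, a short reachability check against the URLs in the table can save a debugging round trip. This is a sketch rather than an official health check. For Weaviate it uses the readiness endpoint `/v1/.well-known/ready` from Weaviate's REST API instead of the bare base URL. `is_up` is a hypothetical helper.

```python
from http.client import HTTPException
from urllib.request import urlopen

# Base URLs from the table above; the Weaviate path is its readiness endpoint.
SERVICES = {
    "MediaWiki": "http://localhost:8090",
    "Neo4j Browser": "http://localhost:7474",
    "Weaviate": "http://localhost:8080/v1/.well-known/ready",
}


def is_up(url: str, timeout: float = 2.0, fetch=urlopen) -> bool:
    """Return True if `url` answers with a 2xx/3xx status within `timeout` seconds."""
    try:
        with fetch(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (OSError, HTTPException):
        # Covers refused connections, timeouts, URLError and HTTP 4xx/5xx errors.
        return False


if __name__ == "__main__":
    for name, url in SERVICES.items():
        print(f"{name:14} {'up' if is_up(url) else 'DOWN'}  {url}")
```

Containers can take a while to become ready after they start, so a DOWN result right after start_infra.sh may only mean the service is still booting.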

Manual Setup

If you prefer to start services individually:

Docker Compose (All Services)

docker compose -f services/infrastructure/docker-compose.yml up -d

This starts:

Weaviate (port 8080) - Vector database for embeddings
Neo4j (ports 7474, 7687) - Graph database for relationships
MediaWiki (port 8090) - Web UI for browsing wiki pages
MariaDB - Backend database for MediaWiki

Individual Containers

Weaviate (vector DB):

docker run -d --name weaviate \
    -p 8080:8080 -p 50051:50051 \
    -e AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED=true \
    -e PERSISTENCE_DATA_PATH='/var/lib/weaviate' \
    semitechnologies/weaviate:1.27.0

Neo4j (graph DB):

docker run -d --name neo4j \
    --restart unless-stopped \
    -p 7474:7474 -p 7687:7687 \
    -e NEO4J_AUTH=neo4j/password \
    neo4j:5.18.0

Configure Environment

Add infrastructure settings to your .env:

# Neo4j connection
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=password

# Weaviate connection
WEAVIATE_URL=http://localhost:8080

# MediaWiki (optional)
MW_BASE=http://localhost:8090
MW_USER=admin
MW_PASS=adminpass123

Knowledge Graph Indexing

After infrastructure is running, index your wiki pages:

Using Kapso API (Recommended)

from kapso.kapso import Kapso

# Initialize Kapso
kapso = Kapso(config_path="src/config.yaml")

# Index wiki pages (one-time operation)
kapso.index_kg(
    wiki_dir="data/wikis",
    save_to="data/indexes/my_knowledge.index",
)

# Load existing index on subsequent runs
kapso = Kapso(
    config_path="src/config.yaml",
    kg_index="data/indexes/my_knowledge.index",
)
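The "index once, load afterwards" pattern above can be wrapped in a small helper so the same script works on first and later runs. This sketch assumes that index_kg writes the file given as save_to, which the example implies but does not state. `load_or_build` is a hypothetical name, and the Kapso class is passed in as a parameter so the helper stays importable without Kapso installed.

```python
from pathlib import Path


def load_or_build(kapso_cls, config_path: str, wiki_dir: str, index_path: str):
    """Build the index on first use; afterwards load the saved .index file."""
    if not Path(index_path).exists():
        # One-time indexing, mirroring the first call shown above.
        kapso_cls(config_path=config_path).index_kg(wiki_dir=wiki_dir, save_to=index_path)
    # Load the existing index, as on subsequent runs.
    return kapso_cls(config_path=config_path, kg_index=index_path)


# Usage (requires Kapso installed and the infrastructure running):
#   from kapso.kapso import Kapso
#   kapso = load_or_build(Kapso, "src/config.yaml", "data/wikis",
#                         "data/indexes/my_knowledge.index")
```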

Index File Format

The .index file is a JSON reference to the indexed data:

{
  "version": "1.0",
  "created_at": "2025-01-15T10:30:00Z",
  "data_source": "data/wikis_llm_finetuning",
  "search_backend": "kg_graph_search",
  "backend_refs": {
    "weaviate_collection": "KapsoWiki",
    "embedding_model": "text-embedding-3-large"
  },
  "page_count": 99
}
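Because the .index file is plain JSON, it can be inspected with the standard library. The sketch below checks for the top-level keys that appear in the example above. Treating all of them as mandatory is an assumption made for illustration, and `read_index` is a hypothetical helper, not a Kapso API.

```python
import json

# Top-level keys that appear in the example above (assumed, not confirmed, to be required).
EXPECTED_KEYS = {
    "version", "created_at", "data_source",
    "search_backend", "backend_refs", "page_count",
}


def read_index(text: str) -> dict:
    """Parse .index JSON and raise ValueError if an expected key is missing."""
    data = json.loads(text)
    missing = EXPECTED_KEYS - data.keys()
    if missing:
        raise ValueError(f"index file is missing keys: {sorted(missing)}")
    return data


if __name__ == "__main__":
    example = json.dumps({
        "version": "1.0",
        "created_at": "2025-01-15T10:30:00Z",
        "data_source": "data/wikis_llm_finetuning",
        "search_backend": "kg_graph_search",
        "backend_refs": {"weaviate_collection": "KapsoWiki"},
        "page_count": 99,
    })
    info = read_index(example)
    print(info["search_backend"], info["page_count"])
```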

KG Search Backends

Backend	Data Format	Storage	Use Case
`kg_graph_search`	Wiki pages (.md/.mediawiki)	Weaviate + Neo4j	Semantic search with LLM reranking
`kg_llm_navigation`	JSON (nodes/edges)	Neo4j only	LLM-guided graph navigation
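The storage column above determines which containers must be running for each backend. As an illustration, the mapping can be written as a lookup. The backend names come from the table, while `required_services` is a hypothetical helper.

```python
# Storage requirements copied from the table above.
BACKEND_STORAGE = {
    "kg_graph_search": {"weaviate", "neo4j"},
    "kg_llm_navigation": {"neo4j"},
}


def required_services(backend: str) -> set[str]:
    """Return the services a search backend needs, or raise ValueError if unknown."""
    try:
        return set(BACKEND_STORAGE[backend])
    except KeyError:
        raise ValueError(f"unknown search backend: {backend!r}") from None


if __name__ == "__main__":
    for name in BACKEND_STORAGE:
        print(name, "->", sorted(required_services(name)))
```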

Environment Variables Reference

Variable	Required	Default	Description
`OPENAI_API_KEY`	Yes	-	OpenAI API key (also for embeddings)
`GOOGLE_API_KEY`	No	-	Google API key for Gemini
`ANTHROPIC_API_KEY`	No	-	Anthropic API key for Claude
`LEEROOPEDIA_API_KEY`	No	-	Leeroopedia API key (sign up at leeroopedia.com)
`NEO4J_URI`	No	`bolt://localhost:7687`	Neo4j connection URI
`NEO4J_USER`	No	`neo4j`	Neo4j username
`NEO4J_PASSWORD`	No	`password`	Neo4j password
`WEAVIATE_URL`	No	`http://localhost:8080`	Weaviate server URL
`MW_BASE`	No	`http://localhost:8090`	MediaWiki base URL
`MW_USER`	No	`admin`	MediaWiki username
`MW_PASS`	No	`adminpass123`	MediaWiki password
`CUDA_DEVICE`	No	`0`	GPU device for ML training
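In your own scripts, the infrastructure variables and their defaults from the table can be resolved as follows. This is a sketch of the fallback behavior the table describes, not Kapso's actual configuration code. The `InfraSettings` class is hypothetical.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class InfraSettings:
    neo4j_uri: str
    neo4j_user: str
    neo4j_password: str
    weaviate_url: str
    mw_base: str

    @classmethod
    def from_env(cls, env=None) -> "InfraSettings":
        """Read settings from `env` (default: os.environ), using the table's defaults."""
        env = os.environ if env is None else env
        return cls(
            neo4j_uri=env.get("NEO4J_URI", "bolt://localhost:7687"),
            neo4j_user=env.get("NEO4J_USER", "neo4j"),
            neo4j_password=env.get("NEO4J_PASSWORD", "password"),
            weaviate_url=env.get("WEAVIATE_URL", "http://localhost:8080"),
            mw_base=env.get("MW_BASE", "http://localhost:8090"),
        )


if __name__ == "__main__":
    settings = InfraSettings.from_env()
    print(settings.neo4j_uri, settings.weaviate_url, settings.mw_base)
```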

Verify Installation

# Check core installation
python -c "from kapso.kapso import Kapso; print('Kapso OK')"

# Check orchestrator
python -c "from kapso.execution.orchestrator import OrchestratorAgent; print('Orchestrator OK')"

# Check knowledge search
python -c "from kapso.knowledge_base.search import KnowledgeSearchFactory; print('Knowledge Search OK')"

# Check MLE-Bench (if installed)
python -c "import mlebench; print('MLE-Bench OK')"

# Check ALE-Bench (if installed)
python -c "import ale_bench; print('ALE-Bench OK')"

# Check Neo4j driver
python -c "from neo4j import GraphDatabase; print('Neo4j driver OK')"

# Check Weaviate client
python -c "import weaviate; print('Weaviate client OK')"
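The one-liners above can also be folded into a single script. The sketch below is a lighter check than those commands: it only asks whether each module can be located, using the module names from this section, and does not import the classes. `is_importable` is a hypothetical helper.

```python
import importlib.util

# Module names taken from the commands above; the last two are optional benchmarks.
MODULES = ["kapso.kapso", "neo4j", "weaviate", "mlebench", "ale_bench"]


def is_importable(name: str) -> bool:
    """True if `name` can be located (for dotted names the parent package is imported)."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A missing parent package raises ModuleNotFoundError, a subclass of ImportError.
        return False


if __name__ == "__main__":
    for name in MODULES:
        print(f"{name:12} {'OK' if is_importable(name) else 'missing'}")
```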

Troubleshooting

Services won't start

# Check Docker logs
docker compose -f services/infrastructure/docker-compose.yml logs

# Check individual service
docker logs weaviate
docker logs neo4j

MediaWiki issues

# View MediaWiki logs
docker compose -f services/infrastructure/docker-compose.yml logs wiki

# Full reset (deletes all data)
./scripts/stop_infra.sh --volumes
./scripts/start_infra.sh

Port conflicts

If a default port is already in use, change the host-side port mapping in services/infrastructure/docker-compose.yml (or in your docker run command), then update the matching URL in .env.
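To see which default port is taken before starting the containers, a small socket probe is enough. The sketch below uses the host ports listed under Manual Setup. `port_in_use` is a hypothetical helper, and it only detects listeners reachable on the loopback address.

```python
import socket

# Host ports used by the default setup (from the Manual Setup list).
DEFAULT_PORTS = {"Weaviate": 8080, "Neo4j HTTP": 7474, "Neo4j Bolt": 7687, "MediaWiki": 8090}


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when the TCP connection succeeds.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for name, port in DEFAULT_PORTS.items():
        print(f"{name:11} {port}: {'in use' if port_in_use(port) else 'free'}")
```

Run it while the Kapso containers are stopped. Otherwise the ports will show as in use by Kapso itself.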
