Overview

The knowledge system provides domain-specific guidance during problem-solving. It steers the agent toward proven approaches and away from common pitfalls by learning from repositories, research, and past experiments.

Key Concepts

Knowledge Graph (KG)

The KG stores domain knowledge in a structured format:
  • Storage: Weaviate (vector embeddings) + Neo4j (graph structure)
  • Schema: 5 page types organized as a directed acyclic graph (DAG)
  • Content: Wiki pages with overviews, content, and connections
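
As a rough mental model, each wiki page can be treated as a record carrying its type, prose, and outgoing connections. The sketch below is illustrative only; the WikiPage class and its fields are assumptions, not Kapso's actual schema.

# Hypothetical sketch of a wiki page record; the class name and fields are assumptions.
from dataclasses import dataclass, field

@dataclass
class WikiPage:
    title: str                  # e.g. "QLoRA Fine-tuning"
    page_type: str              # one of the 5 page types (see "Wiki Page Types" below)
    overview: str               # short summary of the page
    content: str                # full page body
    connections: list[str] = field(default_factory=list)  # titles of linked pages

page = WikiPage(
    title="QLoRA Fine-tuning",
    page_type="Workflow",
    overview="End-to-end recipe for 4-bit quantized LoRA fine-tuning.",
    content="...",
    connections=["Low Rank Adaptation", "TRL_SFTTrainer"],
)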

Knowledge Pipeline

A two-stage process for learning from sources:
  1. Stage 1 (Ingestors): Extract WikiPages from sources
  2. Stage 2 (Merger): Intelligently merge the extracted pages into the KG

Hybrid Retrieval

Retrieval combines three techniques (see the sketch after this list):
  • Semantic search: Vector similarity in Weaviate
  • Graph traversal: Connected pages from Neo4j
  • LLM reranking: Relevance scoring
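
To make the three steps concrete, here is a tiny self-contained sketch of hybrid retrieval over an in-memory toy graph. In the real system the embeddings live in Weaviate, the connections in Neo4j, and the final reranking is done by an LLM; the data, scoring, and function names below are illustrative, not Kapso's implementation.

# Toy sketch of hybrid retrieval; data and names are illustrative only.
pages = {
    "QLoRA Fine-tuning":    {"embedding": [0.9, 0.1], "links": ["Low Rank Adaptation"]},
    "Low Rank Adaptation":  {"embedding": [0.8, 0.3], "links": []},
    "Learning_Rate_Tuning": {"embedding": [0.2, 0.9], "links": []},
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = sum(x * x for x in a) ** 0.5 * sum(y * y for y in b) ** 0.5
    return dot / norm

def hybrid_retrieve(query_embedding, k=2):
    # 1. Semantic search: top-k pages by vector similarity (Weaviate's role)
    ranked = sorted(pages, key=lambda t: cosine(pages[t]["embedding"], query_embedding),
                    reverse=True)
    seeds = ranked[:k]
    # 2. Graph traversal: pull in pages connected to the seeds (Neo4j's role)
    candidates = list(dict.fromkeys(seeds + [n for t in seeds for n in pages[t]["links"]]))
    # 3. LLM reranking: an LLM would score relevance here; similarity order is kept
    return candidates

print(hybrid_retrieve([0.9, 0.2]))  # ['QLoRA Fine-tuning', 'Low Rank Adaptation']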

Using Knowledge in Kapso

Option 1: Pre-indexed KG

from src.kapso import Kapso

# One-time: Index wiki pages
kapso = Kapso()
kapso.index_kg(
    wiki_dir="data/wikis_llm_finetuning",
    save_to="data/indexes/llm_finetuning.index",
)

# Every time: Load existing index
kapso = Kapso(kg_index="data/indexes/llm_finetuning.index")
solution = kapso.evolve(goal="Fine-tune LLaMA with QLoRA")

Option 2: Learn from Sources

from src.kapso import Kapso, Source

kapso = Kapso()

# Learn from a repository
kapso.learn(
    Source.Repo("https://github.com/huggingface/transformers"),
    wiki_dir="data/wikis",
)

# Learn from web research
research = kapso.research("QLoRA best practices", mode="idea")
kapso.learn(research, wiki_dir="data/wikis")

Option 3: Research as Context

from src.kapso import Kapso

kapso = Kapso()

# Research without persisting to KG
research = kapso.research(
    "unsloth FastLanguageModel example",
    mode="implementation",
    depth="deep",
)

# Use research as context
solution = kapso.evolve(
    goal="Fine-tune with Unsloth + LoRA",
    additional_context=research.to_context_string(),
)

Wiki Page Types

The KG uses 5 page types organized as a Top-Down DAG:
| Type | Role | Example |
|------|------|---------|
| Workflow | The Recipe - ordered sequence of steps | "QLoRA Fine-tuning" |
| Principle | The Theory - library-agnostic concepts | "Low Rank Adaptation" |
| Implementation | The Code - concrete API reference | "TRL_SFTTrainer" |
| Environment | The Context - hardware/dependencies | "CUDA_11_Environment" |
| Heuristic | The Wisdom - tips and optimizations | "Learning_Rate_Tuning" |
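
The "top-down" direction can be read as higher-level pages linking to the lower-level pages they build on. The edge list below is a hypothetical illustration using the example page names from the table, not the actual connection schema.

# Illustrative only: the five page types and hypothetical top-down edges,
# assuming higher-level pages point at the lower-level pages they rely on.
PAGE_TYPES = ["Workflow", "Principle", "Implementation", "Environment", "Heuristic"]

example_edges = [
    ("QLoRA Fine-tuning", "Low Rank Adaptation"),   # Workflow -> Principle
    ("QLoRA Fine-tuning", "TRL_SFTTrainer"),        # Workflow -> Implementation
    ("TRL_SFTTrainer", "CUDA_11_Environment"),      # Implementation -> Environment
    ("QLoRA Fine-tuning", "Learning_Rate_Tuning"),  # Workflow -> Heuristic
]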

Connection Schema

Search Backends

| Backend | Data Format | Storage | Use Case |
|---------|-------------|---------|----------|
| kg_graph_search | Wiki pages (.md) | Weaviate + Neo4j | Semantic search with reranking |
| kg_llm_navigation | JSON (nodes/edges) | Neo4j only | LLM-guided graph navigation |

Infrastructure Requirements

Both backends require database infrastructure:

# Start Weaviate and Neo4j
./scripts/start_infra.sh

# Stop infrastructure
./scripts/stop_infra.sh
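
To confirm both databases are actually up before indexing, a quick connectivity check from Python can help. This is an optional sanity check, not part of Kapso: it assumes the databases are on their default local ports, that the official weaviate-client (v4) and neo4j driver packages are installed, and that the Neo4j credentials shown are placeholders.

# Optional sanity check (not part of Kapso): verify both databases are reachable.
import weaviate
from neo4j import GraphDatabase

client = weaviate.connect_to_local()        # Weaviate on http://localhost:8080 by default
print("Weaviate ready:", client.is_ready())
client.close()

driver = GraphDatabase.driver("bolt://localhost:7687",       # default Bolt port
                              auth=("neo4j", "password"))    # placeholder credentials
driver.verify_connectivity()                # raises if Neo4j is unreachable
print("Neo4j ready")
driver.close()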

Next Steps