Overview

Each deployment strategy is designed for specific use cases. The system can auto-select the best strategy or you can specify one explicitly.

LOCAL

Run as a local Python process. Fastest for development and testing.

Features

  • No containerization overhead
  • Direct file system access
  • Easiest debugging
  • No additional setup required

Usage

deployed_program = kapso.deploy(solution, strategy=DeployStrategy.LOCAL)
result = deployed_program.run({"data_path": "./data.csv"})

How It Works

  1. Adapter creates a run.py entry point
  2. Runner imports and calls the main function
  3. Results returned directly
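
The steps above can be sketched as a hypothetical run.py entry point (the module and callable names here are illustrative values taken from the strategy config; the adapter's actual output may differ):

```python
import importlib
import json
import sys

def resolve(module: str, callable_name: str):
    """Import the configured module and return the named function."""
    return getattr(importlib.import_module(module), callable_name)

def main() -> None:
    # "main" and "predict" would come from the local strategy config below
    fn = resolve("main", "predict")
    # Inputs arrive as a JSON payload on argv; results go to stdout
    inputs = json.loads(sys.argv[1]) if len(sys.argv) > 1 else {}
    print(json.dumps(fn(inputs)))

if __name__ == "__main__":
    main()
```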

Configuration

# In strategy config
local:
  module: "main"      # Module to import
  callable: "predict" # Function to call
  timeout: 300        # Execution timeout

DOCKER

Run in an isolated Docker container.

Features

  • Environment isolation
  • Reproducible builds
  • Easy dependency management
  • Cross-platform compatibility

Prerequisites

  • Docker installed and running

Usage

deployed_program = kapso.deploy(solution, strategy=DeployStrategy.DOCKER)
result = deployed_program.run({"data_path": "/data/input.csv"})

How It Works

  1. Adapter creates Dockerfile and docker-compose.yml
  2. Runner builds and starts container
  3. Inputs/outputs mapped via volumes

Dockerfile Template

FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "run.py"]
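
Alongside the Dockerfile, the adapter also generates a docker-compose.yml. A hypothetical sketch (the service and volume names are illustrative, not the adapter's actual output):

```yaml
services:
  kapso:
    build: .
    volumes:
      - ./data:/data          # inputs/outputs mapped between host and container
    environment:
      - PYTHONUNBUFFERED=1    # stream logs immediately
```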

MODAL

Deploy to Modal.com for serverless GPU compute.

Features

  • Serverless scaling
  • GPU support (A100, H100)
  • Pay-per-use pricing
  • Fast cold starts

Prerequisites

  • Modal account and CLI
  • modal package installed

pip install modal
modal token new

Usage

deployed_program = kapso.deploy(solution, strategy=DeployStrategy.MODAL)
result = deployed_program.run({"prompt": "Hello, world!"})

How It Works

  1. Adapter creates modal_app.py with Modal decorators
  2. Runner deploys to Modal
  3. HTTP endpoint for inference

Modal App Template

import modal

app = modal.App("kapso-deployment")

@app.function(gpu="A100")
def predict(inputs):
    # Placeholder: load the model and run inference on the inputs
    result = {"output": inputs}  # replace with the real model call
    return result

BENTOML

Deploy with BentoML for production ML serving.

Features

  • Model versioning
  • Automatic batching
  • Prometheus metrics
  • Kubernetes-ready

Prerequisites

  • bentoml package installed

pip install bentoml

Usage

deployed_program = kapso.deploy(solution, strategy=DeployStrategy.BENTOML)
result = deployed_program.run({"features": [1, 2, 3]})

How It Works

  1. Adapter creates service.py with BentoML decorators
  2. Runner builds and serves the Bento
  3. HTTP API for inference

BentoML Service Template

import bentoml

@bentoml.service
class KapsoService:
    @bentoml.api
    def predict(self, inputs: dict) -> dict:
        return self.model.predict(inputs)

LANGGRAPH

Deploy as a LangGraph agent workflow.

Features

  • Stateful agent execution
  • Tool integration
  • Streaming support
  • Checkpoint persistence

Prerequisites

  • langgraph package installed

pip install langgraph

Usage

deployed_program = kapso.deploy(solution, strategy=DeployStrategy.LANGGRAPH)
result = deployed_program.run({"query": "What is 2+2?"})

How It Works

  1. Adapter creates graph.py with LangGraph nodes
  2. Runner compiles and runs the graph
  3. Streaming or batch execution

AUTO Selection

When DeployStrategy.AUTO is used, the selector analyzes the code:
deployed_program = kapso.deploy(solution, strategy=DeployStrategy.AUTO)
# System chooses based on:
# - Code complexity
# - Resource requirements (GPU, memory)
# - Dependencies
# - Target environment
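
As an illustration only (this is a hypothetical heuristic, not Kapso's actual selection logic), a selector over those factors might look like:

```python
def select_strategy(needs_gpu: bool, complex_deps: bool, production: bool) -> str:
    """Illustrative heuristic mirroring the factors listed above."""
    if needs_gpu:
        return "MODAL"    # serverless GPU compute
    if production:
        return "BENTOML"  # production ML serving
    if complex_deps:
        return "DOCKER"   # environment isolation
    return "LOCAL"        # fastest path for simple scripts
```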

Selection Criteria

Factor            LOCAL  DOCKER  MODAL  BENTOML
Simple script       ✓
Complex deps                ✓       ✓       ✓
GPU required                        ✓       ✓
Production ML                       ✓       ✓
Isolation needed            ✓       ✓       ✓

Strategy Comparison

Strategy    Startup  Cost         Isolation  GPU                Production
LOCAL       Fast     Free         None       Local              No
DOCKER      Medium   Free         Full       Via nvidia-docker  Yes
MODAL       Medium   Pay-per-use  Full       Yes                Yes
BENTOML     Medium   Self-hosted  Full       Yes                Yes
LANGGRAPH   Fast     Depends     None       Via backend        Yes

Run Interface

Each strategy defines how to call the deployed code:
# Returned by adapter
run_interface = {
    "type": "function",     # or "http", "grpc"
    "module": "main",       # Python module
    "callable": "predict",  # Function name
    "endpoint": None,       # HTTP endpoint (if applicable)
}
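
A hedged sketch of how a runner might dispatch on this interface (only the "function" branch is fleshed out; the function name `call_deployed` is illustrative, not part of the actual API):

```python
import importlib

def call_deployed(run_interface: dict, inputs: dict):
    """Dispatch a run according to the interface type (illustrative)."""
    if run_interface["type"] == "function":
        # Import the module and call the configured function directly
        module = importlib.import_module(run_interface["module"])
        fn = getattr(module, run_interface["callable"])
        return fn(inputs)
    if run_interface["type"] in ("http", "grpc"):
        # A remote runner would send inputs to run_interface["endpoint"]
        raise NotImplementedError(run_interface["type"])
    raise ValueError(f"unknown interface type: {run_interface['type']}")
```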

Environment Variables

Pass environment variables to deployments:
deployed_program = kapso.deploy(
    solution,
    strategy=DeployStrategy.DOCKER,
    env_vars={
        "API_KEY": "...",
        "MODEL_PATH": "/models/v1",
    },
)
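
Inside the deployed code, these variables are then read from the process environment as usual; a minimal sketch (the `load_config` helper is illustrative):

```python
import os

def load_config() -> dict:
    """Read deployment configuration from environment variables."""
    return {
        "api_key": os.environ.get("API_KEY"),                     # set via env_vars
        "model_path": os.environ.get("MODEL_PATH", "/models/v1"), # with a default
    }
```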