
Overview

The deployment system uses a plugin architecture. Adding a new strategy requires:
  1. Creating a directory in src/deployment/strategies/
  2. Adding configuration and instruction files
  3. Implementing a Runner class
No changes to core code are needed - strategies are auto-discovered at runtime.
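For intuition, auto-discovery can be as simple as scanning the strategies directory for config.yaml files and registering each one by its directory name. The sketch below is illustrative only; it assumes a plain PyYAML load and is not the actual StrategyRegistry implementation.

```python
# Illustrative sketch of directory-based auto-discovery.
# Assumption: this is NOT the real StrategyRegistry code, just the idea behind it.
from pathlib import Path
import yaml


def discover_strategies(root: str = "src/deployment/strategies") -> dict:
    """Map strategy name -> parsed config.yaml for every strategy directory."""
    strategies = {}
    for config_file in Path(root).glob("*/config.yaml"):
        with open(config_file) as f:
            config = yaml.safe_load(f)
        strategies[config_file.parent.name] = config  # directory name = strategy name
    return strategies
```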

Quick Start

Here’s what you’ll create:
src/deployment/strategies/mycloud/
├── config.yaml              # Strategy configuration
├── selector_instruction.md  # When to choose this strategy
├── adapter_instruction.md   # How to adapt code
├── runner.py                # Runtime execution class
└── __init__.py              # Exports

Step 1: Create the Directory

mkdir -p src/deployment/strategies/mycloud
The directory name becomes the strategy name (lowercase).

Step 2: Create config.yaml

This file defines your strategy’s configuration.
# src/deployment/strategies/mycloud/config.yaml

# Strategy identification
name: mycloud
provider: mycloud          # Cloud provider name (or null for local)
interface: http            # "function", "http", or custom

# Runner class name (must exist in runner.py)
runner_class: MyCloudRunner

# Default resource requirements
default_resources:
  gpu: T4
  memory: 16Gi
  cpu: 2

# Default run interface (passed to runner constructor)
run_interface:
  type: http
  predict_path: /predict
  # Add any strategy-specific defaults here

Configuration Fields

| Field | Required | Description |
|-------|----------|-------------|
| `name` | Yes | Strategy name (matches directory name) |
| `provider` | No | Cloud provider identifier |
| `interface` | Yes | Interface type: `function`, `http`, or custom |
| `runner_class` | Yes | Class name to import from `runner.py` |
| `default_resources` | No | Default resource requirements |
| `run_interface` | No | Default parameters for runner constructor |
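While writing config.yaml, a quick way to catch typos is to load it and confirm the required fields are present. This is a minimal sanity-check sketch assuming PyYAML; it is not the registry's own validation logic.

```python
# Minimal config.yaml sanity check (assumes PyYAML; not the registry's validator).
import yaml

REQUIRED_FIELDS = {"name", "interface", "runner_class"}

with open("src/deployment/strategies/mycloud/config.yaml") as f:
    config = yaml.safe_load(f)

missing = REQUIRED_FIELDS - config.keys()
if missing:
    raise ValueError(f"config.yaml is missing required fields: {sorted(missing)}")
print(f"OK: strategy '{config['name']}' uses runner class {config['runner_class']}")
```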

Step 3: Create selector_instruction.md

This file tells the SelectorAgent when to choose your strategy.
# src/deployment/strategies/mycloud/selector_instruction.md

# MyCloud

## Summary
Deploy to MyCloud for managed GPU instances with auto-scaling.

## Best For
- GPU workloads requiring specific hardware
- Production deployments with SLA requirements
- Workloads that need managed Kubernetes infrastructure
- Multi-region deployment

## Not For
- Quick local testing (use local)
- Simple scripts without GPU needs (use local)
- Serverless workloads (use modal)
- Agent workflows (use langgraph)

## Resources
Requires resource specification:
- gpu: T4, A10G, A100
- memory: 8Gi, 16Gi, 32Gi
- cpu: 1, 2, 4

Default: gpu=T4, memory=16Gi, cpu=2

## Interface
http (REST API endpoint)

## Provider
mycloud

Structure Guidelines

| Section | Purpose |
|---------|---------|
| Summary | One-line description (used in strategy listings) |
| Best For | Use cases where this strategy excels |
| Not For | Situations to avoid (suggests alternatives) |
| Resources | Resource options and defaults |
| Interface | Interface type |
| Provider | Cloud provider name |
The SelectorAgent reads ALL selector instructions and presents them to the LLM to make a choice.
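Conceptually, the selection step boils down to concatenating every strategy's selector instruction into one prompt and asking the LLM to pick a name. The sketch below is only a rough illustration; the real SelectorAgent prompt format is an assumption here.

```python
# Rough sketch of combining selector instructions into one selection prompt.
# Assumption: the actual SelectorAgent prompt format may differ.
def build_selection_prompt(instructions: dict, task_description: str) -> str:
    """instructions maps strategy name -> selector_instruction.md content."""
    blocks = [f"### {name}\n{text}" for name, text in sorted(instructions.items())]
    return (
        "Choose the best deployment strategy for the task below.\n\n"
        f"Task:\n{task_description}\n\n"
        "Available strategies:\n\n" + "\n\n".join(blocks) +
        "\n\nReply with the strategy name only."
    )
```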

Step 4: Create adapter_instruction.md

This file tells the AdapterAgent how to transform code for your strategy.
# src/deployment/strategies/mycloud/adapter_instruction.md

# MyCloud Deployment Instructions

Deploy the solution to MyCloud platform.

## DEPLOY COMMAND

```bash
mycloud deploy --app-name my-solution --gpu T4
```

Run this command after creating the deployment files. Parse the output to get the deployment URL.

## RUN INTERFACE

After successful deployment, output:
<run_interface>{"type": "http", "predict_path": "/predict"}</run_interface>
<endpoint_url>https://my-solution.mycloud.com</endpoint_url>

## CRITICAL: VERIFY DEPLOYMENT

Do NOT just create files. You MUST:
  1. Create all deployment files
  2. Run the DEPLOY COMMAND
  3. Verify deployment succeeds
  4. Extract and report the endpoint URL
If deployment fails, debug and retry.

## Required Structure

solution/
├── main.py              # Entry point with predict() function
├── requirements.txt     # Dependencies
├── mycloud.yaml         # MyCloud configuration
└── ...

## MyCloud Configuration (mycloud.yaml)

Create this file with deployment settings:
# mycloud.yaml
name: my-solution
runtime: python3.10
gpu: T4
memory: 16Gi
replicas: 1

entrypoint:
  module: main
  function: predict

health_check:
  path: /health
  interval: 30s

## Entry Point (main.py)

Ensure main.py has this structure:
def predict(inputs: dict) -> dict:
    """
    Main prediction function.
    
    Args:
        inputs: Input dictionary
        
    Returns:
        Result dictionary
    """
    # Your logic here
    result = process(inputs)
    return {"status": "success", "output": result}


def health():
    """Health check endpoint."""
    return {"status": "healthy"}

## Testing Before Deployment

# Test locally first
python -c "from main import predict; print(predict({'test': True}))"

# Then deploy
mycloud deploy --app-name my-solution

Key Sections

| Section | Purpose |
|---------|---------|
| `DEPLOY COMMAND` | The command the coding agent should run |
| `RUN INTERFACE` | JSON format for runner configuration |
| `Required Structure` | Files that must exist |
| `Configuration` | Strategy-specific config file templates |
| `Entry Point` | Standard `main.py` structure |

Output Format

The adapter looks for these XML tags in the agent's output:

```xml
<!-- Run interface (REQUIRED) - tells the runner how to execute -->
<run_interface>{"type": "http", "predict_path": "/predict"}</run_interface>

<!-- Endpoint URL (for HTTP strategies) -->
<endpoint_url>https://deployed-url.com</endpoint_url>
```
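If you want to test your instruction file by hand, these tags can be pulled out of an agent transcript with a simple regex. The sketch below is a hedged approximation; the adapter's real parser may be stricter.

```python
# Sketch: extract <run_interface> and <endpoint_url> tags from agent output.
# Assumption: the adapter's actual parsing logic may differ from this regex approach.
import json
import re


def parse_adapter_output(text: str) -> dict:
    result = {}
    run_iface = re.search(r"<run_interface>(.*?)</run_interface>", text, re.DOTALL)
    if run_iface:
        result["run_interface"] = json.loads(run_iface.group(1))
    endpoint = re.search(r"<endpoint_url>(.*?)</endpoint_url>", text, re.DOTALL)
    if endpoint:
        result["endpoint"] = endpoint.group(1).strip()
    return result
```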

Step 5: Create runner.py

The Runner class handles actual execution.
# src/deployment/strategies/mycloud/runner.py

"""
MyCloud Runner

Executes software by making HTTP requests to MyCloud endpoints.
"""

import requests
from typing import Any, Dict, List, Union

from src.deployment.strategies.base import Runner


class MyCloudRunner(Runner):
    """
    Runner for MyCloud deployments.
    
    Makes HTTP requests to the deployed MyCloud endpoint.
    """
    
    def __init__(
        self,
        endpoint: str = None,
        predict_path: str = "/predict",
        code_path: str = None,
        timeout: int = 300,
        **kwargs,  # Accept extra params from run_interface
    ):
        """
        Initialize the MyCloud runner.
        
        Args:
            endpoint: The deployed endpoint URL
            predict_path: Path for prediction endpoint
            code_path: Path to code (for reference)
            timeout: Request timeout in seconds
            **kwargs: Additional parameters (ignored)
        """
        self.endpoint = endpoint
        self.predict_path = predict_path
        self.code_path = code_path
        self.timeout = timeout
        self._logs: List[str] = []
        
        if endpoint:
            self._logs.append(f"Initialized with endpoint: {endpoint}")
        else:
            self._logs.append("Warning: No endpoint provided")
    
    def run(self, inputs: Union[Dict, str, bytes]) -> Any:
        """
        Execute by making HTTP POST to the endpoint.
        
        Args:
            inputs: Input data for the prediction
            
        Returns:
            Response from the endpoint
        """
        if not self.endpoint:
            return {
                "error": "No endpoint configured",
                "instructions": [
                    "1. Deploy to MyCloud first",
                    f"2. Run: mycloud deploy --app-name <name>",
                    "3. Use the returned endpoint URL",
                ],
            }
        
        url = f"{self.endpoint.rstrip('/')}{self.predict_path}"
        self._logs.append(f"POST {url}")
        
        try:
            # Convert inputs to JSON-serializable format
            if isinstance(inputs, bytes):
                inputs = inputs.decode('utf-8')
            if isinstance(inputs, str):
                inputs = {"input": inputs}
            
            response = requests.post(
                url,
                json=inputs,
                timeout=self.timeout,
                headers={"Content-Type": "application/json"}
            )
            
            response.raise_for_status()
            result = response.json()
            
            self._logs.append(f"Response: {response.status_code}")
            return result
            
        except requests.exceptions.Timeout:
            self._logs.append(f"Timeout after {self.timeout}s")
            return {"error": f"Request timed out after {self.timeout}s"}
            
        except requests.exceptions.RequestException as e:
            self._logs.append(f"Request error: {e}")
            return {"error": str(e)}
    
    def stop(self) -> None:
        """
        Stop the runner.
        
        For HTTP-based runners, this is usually a no-op.
        Override if your cloud provider supports stopping instances.
        """
        self._logs.append("Stopped")
        # Optionally: call mycloud API to stop the instance
    
    def is_healthy(self) -> bool:
        """
        Check if the endpoint is healthy.
        
        Returns:
            True if endpoint responds, False otherwise
        """
        if not self.endpoint:
            return False
        
        try:
            health_url = f"{self.endpoint.rstrip('/')}/health"
            response = requests.get(health_url, timeout=5)
            return response.status_code == 200
        except Exception:
            return False
    
    def get_logs(self) -> str:
        """Get runner logs."""
        return "\n".join(self._logs)

Runner Interface

Your runner must implement these methods:
| Method | Required | Description |
|--------|----------|-------------|
| `run(inputs)` | Yes | Execute with inputs, return the result |
| `stop()` | Yes | Clean up resources |
| `is_healthy()` | Yes | Check if the runner is ready |
| `get_logs()` | No | Return a log string (default: empty) |
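For orientation, the base class you import from src/deployment/strategies/base.py presumably looks roughly like the sketch below; this shape is inferred from the table above, and the actual file is the source of truth.

```python
# Approximate shape of the Runner base class, inferred from the method table above.
# Assumption: check src/deployment/strategies/base.py for the authoritative definition.
from abc import ABC, abstractmethod
from typing import Any, Dict, Union


class Runner(ABC):
    @abstractmethod
    def run(self, inputs: Union[Dict, str, bytes]) -> Any:
        """Execute with inputs and return the result."""

    @abstractmethod
    def stop(self) -> None:
        """Clean up resources."""

    @abstractmethod
    def is_healthy(self) -> bool:
        """Return True if the runner is ready to serve requests."""

    def get_logs(self) -> str:
        """Return accumulated logs (optional; defaults to empty)."""
        return ""
```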

Constructor Parameters

The runner constructor receives parameters from run_interface:
# From config.yaml
run_interface:
  type: http
  predict_path: /predict

# Plus from adaptation output
# <run_interface>{"type": "http", "endpoint": "https://..."}</run_interface>

# Plus from factory
# code_path, timeout

# All merged and passed to __init__
runner = MyCloudRunner(
    type="http",           # From config
    predict_path="/predict",  # From config
    endpoint="https://...",   # From adaptation
    code_path="/path/to/code",  # From factory
    timeout=300,              # From factory
)
Always use **kwargs to accept extra parameters gracefully.
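In other words, the factory effectively builds one kwargs dict from the three sources and splats it into the constructor. The merge below is a simplified sketch of that idea, not the factory's actual code; later sources are assumed to override earlier ones.

```python
# Simplified sketch of how constructor kwargs could be assembled.
# Assumption: the real factory may merge these sources in a different order or way.
config_defaults = {"type": "http", "predict_path": "/predict"}   # from config.yaml
adaptation_output = {"type": "http", "endpoint": "https://..."}  # from <run_interface>
factory_params = {"code_path": "/path/to/code", "timeout": 300}  # from the factory

kwargs = {**config_defaults, **adaptation_output, **factory_params}
runner = MyCloudRunner(**kwargs)  # **kwargs in __init__ absorbs anything unused
```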

Step 6: Create __init__.py

Export your runner class:
# src/deployment/strategies/mycloud/__init__.py

from src.deployment.strategies.mycloud.runner import MyCloudRunner

__all__ = ["MyCloudRunner"]

Step 7: Test Your Strategy

Verify Discovery

from src.deployment.strategies import StrategyRegistry

registry = StrategyRegistry.get()
print(registry.list_strategies())
# Should include 'mycloud'

print(registry.get_selector_instruction('mycloud'))
# Should print your selector_instruction.md content

Test Deployment

from src.deployment import DeploymentFactory, DeployStrategy, DeployConfig

# DeployStrategy enum is auto-generated
print(DeployStrategy.MYCLOUD)  # Should work

config = DeployConfig(
    solution=my_solution,
    timeout=300,
)

software = DeploymentFactory.create(DeployStrategy.MYCLOUD, config)
result = software.run({"test": True})
print(result)

Complete Example: AWS Lambda Strategy

Here’s a complete example for AWS Lambda:

config.yaml

name: lambda
provider: aws
interface: function
runner_class: LambdaRunner

default_resources:
  memory: 512
  timeout: 30

run_interface:
  type: lambda
  invoke_type: RequestResponse

selector_instruction.md

# Lambda

## Summary
Deploy as AWS Lambda function for serverless execution.

## Best For
- Event-driven workloads
- Short-running tasks (<15 min)
- Infrequent invocations
- AWS ecosystem integration

## Not For
- Long-running tasks (use docker)
- GPU workloads (use modal)
- Stateful applications (use langgraph)

## Resources
- memory: 128-10240 MB
- timeout: 1-900 seconds

Default: memory=512, timeout=30

## Interface
function (AWS SDK invoke)

## Provider
aws

adapter_instruction.md

# Lambda Deployment Instructions

## DEPLOY COMMAND

```bash
# Package and deploy
zip -r function.zip . -x "*.git*"
aws lambda create-function \
  --function-name my-solution \
  --runtime python3.10 \
  --handler main.handler \
  --zip-file fileb://function.zip \
  --role arn:aws:iam::ACCOUNT:role/lambda-role
```

## RUN INTERFACE

<run_interface>{"type": "lambda", "function_name": "my-solution"}</run_interface>

## Required Structure

solution/
├── main.py           # With handler(event, context) function
└── requirements.txt

## Entry Point

# main.py
def handler(event, context):
    return predict(event)

def predict(inputs: dict) -> dict:
    result = process(inputs)
    return {"statusCode": 200, "body": result}

runner.py

```python
import json
import boto3
from typing import Any, Dict, Union
from src.deployment.strategies.base import Runner


class LambdaRunner(Runner):
    def __init__(
        self,
        function_name: str = None,
        invoke_type: str = "RequestResponse",
        code_path: str = None,
        **kwargs,
    ):
        self.function_name = function_name
        self.invoke_type = invoke_type
        self.code_path = code_path
        self._client = boto3.client('lambda')
        self._logs = []
    
    def run(self, inputs: Union[Dict, str, bytes]) -> Any:
        if not self.function_name:
            return {"error": "No function_name configured"}
        
        payload = json.dumps(inputs) if isinstance(inputs, dict) else inputs
        
        response = self._client.invoke(
            FunctionName=self.function_name,
            InvocationType=self.invoke_type,
            Payload=payload,
        )
        
        result = json.loads(response['Payload'].read())
        return result
    
    def stop(self) -> None:
        pass  # Lambda is serverless
    
    def is_healthy(self) -> bool:
        try:
            self._client.get_function(FunctionName=self.function_name)
            return True
        except Exception:
            return False
```

Best Practices

1. Accept Extra Parameters

Always use **kwargs in your runner constructor:
def __init__(self, endpoint: str, **kwargs):
    # **kwargs catches any unexpected parameters without raising TypeError
    self.endpoint = endpoint

2. Provide Clear Error Messages

When things go wrong, tell users how to fix it:
if not self.endpoint:
    return {
        "error": "Endpoint not configured",
        "instructions": [
            "1. Deploy first: mycloud deploy",
            "2. Set the endpoint URL",
        ],
    }

3. Include Health Checks

Implement meaningful health checks:
def is_healthy(self) -> bool:
    try:
        response = requests.get(f"{self.endpoint}/health", timeout=5)
        return response.status_code == 200
    except Exception:
        return False

4. Log Important Events

Use the logs for debugging:
self._logs.append(f"Calling {url} with {inputs}")
self._logs.append(f"Response: {response.status_code}")

5. Handle Timeouts Gracefully

try:
    response = requests.post(url, timeout=self.timeout)
except requests.exceptions.Timeout:
    return {"error": f"Timed out after {self.timeout}s"}

Troubleshooting

Strategy Not Discovered

Check that:
  • Directory is in src/deployment/strategies/
  • Both selector_instruction.md and adapter_instruction.md exist
  • config.yaml has runner_class field
# Debug discovery
from src.deployment.strategies.base import StrategyRegistry
StrategyRegistry.reset()  # Clear cache
registry = StrategyRegistry.get()
print(registry._strategies)  # See what was discovered

Runner Class Not Found

Check that:
  • runner_class in config.yaml matches class name in runner.py
  • Class is properly defined and inherits from Runner
# Test import manually
from src.deployment.strategies.mycloud.runner import MyCloudRunner
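You can also cross-check that the class named in config.yaml resolves and actually subclasses Runner. This is a quick diagnostic sketch assuming PyYAML; adjust the paths for your own strategy.

```python
# Cross-check that runner_class in config.yaml resolves to a Runner subclass.
# Assumption: uses PyYAML and the mycloud example paths from this guide.
import importlib
import yaml

from src.deployment.strategies.base import Runner

with open("src/deployment/strategies/mycloud/config.yaml") as f:
    config = yaml.safe_load(f)

module = importlib.import_module("src.deployment.strategies.mycloud.runner")
cls = getattr(module, config["runner_class"])
assert issubclass(cls, Runner), f"{cls.__name__} does not inherit from Runner"
print(f"OK: {cls.__name__} found and inherits from Runner")
```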

Adaptation Fails

Check your adapter_instruction.md:
  • Deploy command is correct
  • Output format includes <run_interface> tags
  • Entry point structure is clear

Next Steps