
AI Harnesses

Cub supports multiple AI coding CLI tools through harnesses. A harness is a wrapper around an AI coding assistant's CLI that provides a consistent interface for task execution, token tracking, and streaming output.

What is a Harness?

A harness adapts an AI coding assistant's CLI to work with cub's autonomous loop. It handles:

  • Invocation: Running the AI with system and task prompts
  • Streaming: Real-time output as the AI generates responses
  • Token tracking: Monitoring usage for budget management
  • Auto mode: Enabling autonomous operation without user prompts
  • Hooks: Event interception for guardrails and custom behavior (v0.24+)
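
Conceptually, a harness is a small adapter interface. The sketch below is illustrative only (these are not cub's actual class definitions), assuming a result type carrying output text and token usage:

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class HarnessResult:
    output: str       # final response text from the AI
    tokens_used: int  # reported or estimated token usage


class Harness:
    """Hypothetical adapter around one AI coding CLI."""

    name = "example"

    def run(self, system_prompt: str, task_prompt: str) -> HarnessResult:
        """Invoke the AI with the given prompts and return the collected result."""
        raise NotImplementedError

    def stream(self, system_prompt: str, task_prompt: str) -> Iterator[str]:
        """Yield output chunks as the AI generates them, when supported."""
        raise NotImplementedError
```

Each concrete harness subclasses this shape and maps the two prompts onto its CLI's flags.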

Supported Harnesses

Harness             Name        CLI Binary  Documentation
Claude Code (SDK)   claude      claude      github.com/anthropics/claude-code
Claude Code (CLI)   claude-cli  claude      Shell-out mode for compatibility
Codex               codex       codex       github.com/openai/codex
Gemini              gemini      gemini      github.com/google/gemini-cli
OpenCode            opencode    opencode    github.com/sst/opencode

Claude Code SDK vs CLI

The default claude harness uses the Claude Agent SDK for full hook support and better integration. Use claude-cli explicitly for the shell-out approach (simpler deployment, no SDK dependencies).

Capability Matrix

Different harnesses have different features. Cub adapts its behavior based on what each harness supports.

Capability       Claude (SDK)  Claude (CLI)  Codex  Gemini  OpenCode
streaming        ✅            ✅            ✅     ❌      ✅
token_reporting  ✅            ✅            ❌     ❌*     ✅
system_prompt    ✅            ✅            ❌     ❌      ❌
auto_mode        ✅            ✅            ✅     ✅      ✅
json_output      ✅            ✅            ✅     ❌      ✅
model_selection  ✅            ✅            ✅     ✅      ❌
hooks            ✅            ❌            ❌     ❌      ❌
custom_tools     ✅            ❌            ❌     ❌      ❌
sessions         ✅            ❌            ❌     ❌      ❌

*Gemini uses character-based estimation (~4 chars/token)
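
The character-based estimate can be sketched as a small helper (a rough approximation for illustration, not Gemini's own accounting):

```python
def estimate_tokens(text: str, chars_per_token: int = 4) -> int:
    """Rough token estimate for harnesses that report no usage (~4 chars/token)."""
    # Round up so short non-empty strings still count as at least one token.
    return max(1, -(-len(text) // chars_per_token))


print(estimate_tokens("hello world"))  # 11 chars -> 3 tokens
```

Such estimates are good enough for coarse budget tracking but should not be treated as billing-accurate.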

Capability Descriptions

streaming
Real-time output streaming as the AI generates responses. When available, cub shows live progress with the --stream flag.
token_reporting
Accurate token usage reporting for budget tracking. When unavailable, cub estimates usage or skips budget tracking.
system_prompt
Support for a separate system prompt (keeps instructions distinct from the task). When unavailable, cub concatenates system and task prompts with a --- separator.
auto_mode
Autonomous operation without user confirmation prompts. All harnesses must support this for unattended execution.
json_output
Structured JSON response format for programmatic parsing. Enables reliable token extraction and result parsing.
model_selection
Runtime model selection via CLI flag. Enables task labels like model:haiku to select specific models.
hooks (v0.24+)
Event interception for guardrails and custom behavior. Enables blocking dangerous commands, logging tool usage, and implementing circuit breakers. See Hooks System.
custom_tools (v0.24+)
Register custom tools that the AI can invoke. Enables extending the AI's capabilities with project-specific functionality.
sessions (v0.24+)
Stateful conversation sessions. Enables multi-turn interactions and context preservation across tool calls.

Harness Selection

Cub selects a harness using this priority order:

  1. CLI flag: --harness claude
  2. Environment variable: HARNESS=claude
  3. Config priority array: harness.priority in config file
  4. Default order: claude > opencode > codex > gemini

Auto-Detection

When no harness is specified, cub auto-detects by checking which CLI binaries are available on your system:

# Cub will use the first available harness
cub run --once

Explicit Selection

Specify a harness explicitly for consistent behavior:

# Via CLI flag
cub run --harness claude

# Via environment variable
HARNESS=claude cub run

Configuration

Configure harness behavior in .cub.json or your global config:

{
  "harness": {
    "priority": ["claude", "opencode", "codex", "gemini"]
  }
}

Cub tries each harness in order until it finds one that is installed.

Per-Task Model Selection

Use task labels to select models for specific tasks:

# Add model label to a task
bd label <task-id> model:haiku

When cub runs a task with a model: label, it passes the model to the harness (if supported).

How Cub Adapts

Cub adjusts its behavior based on harness capabilities:

Scenario              Adaptation
No streaming          Output appears after completion instead of in real time
No token_reporting    Budget tracking uses estimates or is disabled
No system_prompt      System prompt is concatenated with the task prompt
No json_output        Raw text output is parsed as-is
No model_selection    model: task labels are ignored
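
In code, these fallbacks amount to simple capability checks. The mapping below is an illustrative sketch keyed on the capability names from the matrix, not cub's internal logic:

```python
def plan_adaptations(caps: dict[str, bool]) -> list[str]:
    """Map missing capabilities to their documented fallback behaviors."""
    fallbacks = {
        "streaming": "show output after completion",
        "token_reporting": "estimate usage or disable budget tracking",
        "system_prompt": "concatenate system and task prompts",
        "json_output": "parse raw text output as-is",
        "model_selection": "ignore model: task labels",
    }
    # Any capability that is absent or False triggers its fallback.
    return [note for cap, note in fallbacks.items() if not caps.get(cap, False)]
```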

Querying Capabilities

Python API

from cub.core.harness import get_backend

# Get current harness capabilities
backend = get_backend()
caps = backend.capabilities

if caps.streaming:
    print("Streaming available")

if caps.token_reporting:
    print("Accurate token tracking enabled")

Shell (Bash Backend)

source lib/harness.sh

# Check if current harness supports a capability
if harness_supports "streaming"; then
    echo "Streaming available"
fi

# Get all capabilities as JSON
harness_get_capabilities_json
# {"harness":"claude","streaming":true,"token_reporting":true,...}

Choosing a Harness

Choose Claude Code (SDK) (the default, recommended) if:

  • You need hooks for guardrails and circuit breakers
  • You want custom tools and stateful sessions
  • Budget tracking and streaming are important
  • You want the most full-featured experience

Choose Claude Code (CLI) if:

  • You need the previous shell-out behavior
  • You're troubleshooting SDK integration issues
  • Your environment doesn't support the Claude Agent SDK

Choose Codex if:

  • You prefer OpenAI models
  • Streaming is important but token tracking is not critical

Choose Gemini if:

  • You want to use Google's models
  • Basic autonomous operation is sufficient

Choose OpenCode if:

  • You need streaming and token tracking
  • You don't need separate system prompts

Async Harness Interface (v0.24+)

Starting in v0.24, all harnesses use an async interface internally. This enables:

  • Non-blocking execution: Other tasks can run while waiting for AI responses
  • Better streaming: True async generators for real-time output
  • Hook integration: Hooks can intercept and modify behavior asynchronously

The async interface is transparent to users: cub run handles the async execution automatically.

Python API

from cub.core.harness import get_async_backend, detect_async_harness

# Auto-detect best available harness
harness_name = detect_async_harness()

# Get the async backend
backend = get_async_backend(harness_name)

# Check capabilities
if backend.supports_feature("hooks"):
    print("Hooks available for guardrails")

# Run a task (async)
async def run():
    result = await backend.run_task(task_input)
    print(f"Completed with {result.usage.total_tokens} tokens")