AI Harnesses¶
Cub supports multiple AI coding CLI tools called harnesses. A harness is a wrapper around an AI coding assistant that provides a consistent interface for task execution, token tracking, and streaming output.
What is a Harness?¶
A harness adapts an AI coding assistant's CLI to work with cub's autonomous loop. It handles:
- Invocation: Running the AI with system and task prompts
- Streaming: Real-time output as the AI generates responses
- Token tracking: Monitoring usage for budget management
- Auto mode: Enabling autonomous operation without user prompts
- Hooks: Event interception for guardrails and custom behavior (v0.24+)
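The responsibilities above can be summarized as an interface sketch (hypothetical names; cub's real classes live in `cub.core.harness` and may differ):

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Capabilities:
    """Feature flags cub inspects to adapt its behavior."""
    streaming: bool = False
    token_reporting: bool = False
    auto_mode: bool = True


class Harness(Protocol):
    """Minimal contract: invoke the AI CLI and expose its capabilities."""
    name: str
    capabilities: Capabilities

    def run_task(self, system_prompt: str, task_prompt: str) -> str:
        """Run the AI with the given prompts and return its final output."""
        ...
```

Any wrapper that satisfies this shape can plug into the autonomous loop; the capability flags tell cub which features it can rely on.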
Supported Harnesses¶
| Harness | Name | CLI Binary | Documentation |
|---|---|---|---|
| Claude Code (SDK) | claude | claude | github.com/anthropics/claude-code |
| Claude Code (CLI) | claude-cli | claude | Shell-out mode for compatibility |
| Codex | codex | codex | github.com/openai/codex |
| Gemini | gemini | gemini | github.com/google/gemini-cli |
| OpenCode | opencode | opencode | github.com/sst/opencode |
Claude Code SDK vs CLI
The default claude harness uses the Claude Agent SDK for full hook support and better integration. Use claude-cli explicitly for the shell-out approach (simpler deployment, no SDK dependencies).
Capability Matrix¶
Different harnesses have different features. Cub adapts its behavior based on what each harness supports.
| Capability | Claude (SDK) | Claude (CLI) | Codex | Gemini | OpenCode |
|---|---|---|---|---|---|
| streaming | ✓ | ✓ | ✓ | ✗ | ✓ |
| token_reporting | ✓ | ✓ | ✗ | ✗* | ✓ |
| system_prompt | ✓ | ✓ | ✓ | ✓ | ✗ |
| auto_mode | ✓ | ✓ | ✓ | ✓ | ✓ |
| json_output | ✓ | ✓ | ✓ | ✗ | ✓ |
| model_selection | ✓ | ✓ | ✓ | ✓ | ✓ |
| hooks | ✓ | ✗ | ✗ | ✗ | ✗ |
| custom_tools | ✓ | ✗ | ✗ | ✗ | ✗ |
| sessions | ✓ | ✗ | ✗ | ✗ | ✗ |

*Gemini uses character-based estimation (~4 chars/token)
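The character-based estimate mentioned in the footnote can be sketched as a one-liner (`estimate_tokens` is a hypothetical helper, not a cub function):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate at ~4 characters per token, the fallback
    used when a harness cannot report exact usage."""
    return len(text) // 4
```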
Capability Descriptions¶
- streaming: Real-time output streaming as the AI generates responses. When available, cub shows live progress with the `--stream` flag.
- token_reporting: Accurate token usage reporting for budget tracking. When unavailable, cub estimates usage or skips budget tracking.
- system_prompt: Support for a separate system prompt (keeps instructions distinct from the task). When unavailable, cub concatenates system and task prompts with a `---` separator.
- auto_mode: Autonomous operation without user confirmation prompts. All harnesses must support this for unattended execution.
- json_output: Structured JSON response format for programmatic parsing. Enables reliable token extraction and result parsing.
- model_selection: Runtime model selection via CLI flag. Enables task labels like `model:haiku` to select specific models.
- hooks (v0.24+): Event interception for guardrails and custom behavior. Enables blocking dangerous commands, logging tool usage, and implementing circuit breakers. See Hooks System.
- custom_tools (v0.24+): Register custom tools that the AI can invoke. Enables extending the AI's capabilities with project-specific functionality.
- sessions (v0.24+): Stateful conversation sessions. Enables multi-turn interactions and context preservation across tool calls.
Harness Selection¶
Cub selects a harness using this priority order:
1. CLI flag: `--harness claude`
2. Environment variable: `HARNESS=claude`
3. Config priority array: `harness.priority` in config file
4. Default order: claude > opencode > codex > gemini
Auto-Detection¶
When no harness is specified, cub auto-detects by checking which CLI binaries are available on your system:
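Conceptually, the detection loop looks like this (a sketch of the idea, not cub's actual implementation):

```shell
# Walk the default priority order and pick the first CLI found on PATH.
for bin in claude opencode codex gemini; do
  if command -v "$bin" >/dev/null 2>&1; then
    echo "detected harness: $bin"
    break
  fi
done
```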
Explicit Selection¶
Specify a harness explicitly for consistent behavior:
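For example, using the flag and environment variable from the selection priority list (the exact `cub run` invocation shape is an assumption):

```shell
# Pin the harness with the CLI flag (highest priority)...
cub run --harness claude

# ...or with the environment variable, e.g. for a whole CI job.
HARNESS=codex cub run
```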
Configuration¶
Configure harness behavior in `.cub.json` or your global config. Cub tries each harness listed in `harness.priority`, in order, until it finds one that is installed.
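A minimal sketch of the relevant `.cub.json` fragment (only the `harness.priority` key is documented here; the surrounding shape is an assumption):

```json
{
  "harness": {
    "priority": ["claude", "opencode", "codex", "gemini"]
  }
}
```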
Per-Task Model Selection¶
Use task labels to select models for specific tasks. When cub runs a task with a `model:` label, it passes the requested model to the harness (if the harness supports model selection; otherwise the label is ignored).
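For illustration, a task carrying such a label might look like this (the task-file shape and field names are assumptions; only the `model:haiku` label format comes from this page):

```json
{
  "id": "42",
  "title": "Summarize release notes",
  "labels": ["model:haiku"]
}
```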
How Cub Adapts¶
Cub adjusts its behavior based on harness capabilities:
| Scenario | Adaptation |
|---|---|
| No streaming | Output appears after completion instead of real-time |
| No token_reporting | Budget tracking uses estimates or is disabled |
| No system_prompt | System prompt concatenated with task prompt |
| No json_output | Raw text output parsed as-is |
| No model_selection | model: task labels are ignored |
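The system-prompt fallback in the table can be sketched as follows (a minimal illustration; `build_prompt` is our name, not cub's API):

```python
def build_prompt(system_prompt: str, task_prompt: str, supports_system_prompt: bool):
    """Return (system, task) prompts adapted to the harness: a harness
    without separate system-prompt support receives one combined prompt
    joined by a `---` separator, as described above."""
    if supports_system_prompt:
        return system_prompt, task_prompt
    return None, f"{system_prompt}\n---\n{task_prompt}"
```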
Querying Capabilities¶
Python API¶
```python
from cub.core.harness import get_backend, get_capabilities

# Get current harness capabilities
backend = get_backend()
caps = backend.capabilities

if caps.streaming:
    print("Streaming available")

if caps.token_reporting:
    print("Accurate token tracking enabled")
```
Shell (Bash Backend)¶
```bash
source lib/harness.sh

# Check if current harness supports a capability
if harness_supports "streaming"; then
    echo "Streaming available"
fi

# Get all capabilities as JSON
harness_get_capabilities_json
# {"harness":"claude","streaming":true,"token_reporting":true,...}
```
Choosing a Harness¶
Choose Claude Code (SDK) (default, recommended) if:
- You need hooks for guardrails and circuit breakers
- You want custom tools and stateful sessions
- Budget tracking and streaming are important
- You want the most full-featured experience
Choose Claude Code (CLI) if:
- You need the previous shell-out behavior
- You're troubleshooting SDK integration issues
- Your environment doesn't support the Claude Agent SDK
Choose Codex if:
- You prefer OpenAI models
- Streaming is important but token tracking is not critical
Choose Gemini if:
- You want to use Google's models
- Basic autonomous operation is sufficient
Choose OpenCode if:
- You need streaming and token tracking
- You don't need separate system prompts
Async Harness Interface (v0.24+)¶
Starting in v0.24, all harnesses use an async interface internally. This enables:
- Non-blocking execution: Other tasks can run while waiting for AI responses
- Better streaming: True async generators for real-time output
- Hook integration: Hooks can intercept and modify behavior asynchronously
The async interface is transparent to users: `cub run` handles the async execution automatically.
Python API¶
```python
import asyncio

from cub.core.harness import get_async_backend, detect_async_harness

# Auto-detect best available harness
harness_name = detect_async_harness()

# Get the async backend
backend = get_async_backend(harness_name)

# Check capabilities
if backend.supports_feature("hooks"):
    print("Hooks available for guardrails")

# Run a task (async)
async def run():
    result = await backend.run_task(task_input)
    print(f"Completed with {result.usage.total_tokens} tokens")

asyncio.run(run())
```