AI Harnesses¶
Cub supports multiple AI coding CLI tools called harnesses. A harness is a wrapper around an AI coding assistant that provides a consistent interface for task execution, token tracking, and streaming output.
What is a Harness?¶
A harness adapts an AI coding assistant's CLI to work with cub's autonomous loop. It handles:
- Invocation: Running the AI with system and task prompts
- Streaming: Real-time output as the AI generates responses
- Token tracking: Monitoring usage for budget management
- Auto mode: Enabling autonomous operation without user prompts
- Hooks: Event interception for guardrails and custom behavior (v0.24+)
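The responsibilities above can be summarized as an interface sketch (hypothetical names; cub's real classes live in `cub.core.harness` and may differ):

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Capabilities:
    """Feature flags cub inspects to adapt its behavior."""
    streaming: bool = False
    token_reporting: bool = False
    auto_mode: bool = True


class Harness(Protocol):
    """Minimal contract: invoke the AI CLI and expose its capabilities."""
    name: str
    capabilities: Capabilities

    def run_task(self, system_prompt: str, task_prompt: str) -> str:
        """Run the AI with the given prompts and return its final output."""
        ...
```

Any wrapper that satisfies this shape can plug into the autonomous loop; the capability flags tell cub which features it can rely on.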
Supported Harnesses¶
| Harness | Name | CLI Binary | Documentation |
|---|---|---|---|
| Claude Code (SDK) | claude | claude | github.com/anthropics/claude-code |
| Claude Code (CLI) | claude-cli | claude | Shell-out mode for compatibility |
| Codex | codex | codex | github.com/openai/codex |
| Gemini | gemini | gemini | github.com/google/gemini-cli |
| OpenCode | opencode | opencode | github.com/sst/opencode |
Claude Code SDK vs CLI
The default claude harness uses the Claude Agent SDK for full hook support and better integration. Use claude-cli explicitly for the shell-out approach (simpler deployment, no SDK dependencies).
Capability Matrix¶
Different harnesses have different features. Cub adapts its behavior based on what each harness supports.
| Capability | Claude (SDK) | Claude (CLI) | Codex | Gemini | OpenCode |
|---|---|---|---|---|---|
| streaming | ✓ | ✓ | ✓ | ✗ | ✓ |
| token_reporting | ✓ | ✓ | ✗ | ✗* | ✓ |
| system_prompt | ✓ | ✓ | ✓ | ✓ | ✗ |
| auto_mode | ✓ | ✓ | ✓ | ✓ | ✓ |
| json_output | ✓ | ✓ | ✓ | ✗ | ✓ |
| model_selection | ✓ | ✓ | ✓ | ✓ | ✓ |
| hooks | ✓ | ✗ | ✗ | ✗ | ✗ |
| custom_tools | ✓ | ✗ | ✗ | ✗ | ✗ |
| sessions | ✓ | ✗ | ✗ | ✗ | ✗ |

*Gemini uses character-based estimation (~4 chars/token)
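The character-based estimate mentioned in the footnote can be sketched as a one-liner (`estimate_tokens` is a hypothetical helper, not a cub function):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate at ~4 characters per token, the fallback
    used when a harness cannot report exact usage."""
    return len(text) // 4
```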
Capability Descriptions¶
- streaming: Real-time output streaming as the AI generates responses. When available, cub shows live progress with the `--stream` flag.
- token_reporting: Accurate token usage reporting for budget tracking. When unavailable, cub estimates usage or skips budget tracking.
- system_prompt: Support for a separate system prompt (keeps instructions distinct from the task). When unavailable, cub concatenates system and task prompts with a `---` separator.
- auto_mode: Autonomous operation without user confirmation prompts. All harnesses must support this for unattended execution.
- json_output: Structured JSON response format for programmatic parsing. Enables reliable token extraction and result parsing.
- model_selection: Runtime model selection via CLI flag. Enables task labels like `model:haiku` to select specific models.
- hooks (v0.24+): Event interception for guardrails and custom behavior. Enables blocking dangerous commands, logging tool usage, and implementing circuit breakers. See Hooks System.
- custom_tools (v0.24+): Register custom tools that the AI can invoke. Enables extending the AI's capabilities with project-specific functionality.
- sessions (v0.24+): Stateful conversation sessions. Enables multi-turn interactions and context preservation across tool calls.
Harness Selection¶
Cub selects a harness using this priority order:
1. CLI flag: `--harness claude`
2. Environment variable: `HARNESS=claude`
3. Config priority array: `harness.priority` in config file
4. Default order: claude > opencode > codex > gemini
Auto-Detection¶
When no harness is specified, cub auto-detects by checking which CLI binaries are available on your system:
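Conceptually, the detection loop looks like this (a sketch of the idea, not cub's actual implementation):

```shell
# Walk the default priority order and pick the first CLI found on PATH.
for bin in claude opencode codex gemini; do
  if command -v "$bin" >/dev/null 2>&1; then
    echo "detected harness: $bin"
    break
  fi
done
```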
Explicit Selection¶
Specify a harness explicitly for consistent behavior:
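For example, using the flag and environment variable from the selection priority list (the exact `cub run` invocation shape is an assumption):

```shell
# Pin the harness with the CLI flag (highest priority)...
cub run --harness claude

# ...or with the environment variable, e.g. for a whole CI job.
HARNESS=codex cub run
```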
Configuration¶
Configure harness behavior in `.cub.json` or your global config. Cub tries each harness listed in `harness.priority`, in order, until it finds one that is installed.
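A minimal sketch of the relevant `.cub.json` fragment (only the `harness.priority` key is documented here; the surrounding shape is an assumption):

```json
{
  "harness": {
    "priority": ["claude", "opencode", "codex", "gemini"]
  }
}
```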
Per-Task Model Selection¶
Use task labels to select models for specific tasks. When cub runs a task with a `model:` label, it passes the requested model to the harness (if the harness supports model selection; otherwise the label is ignored).
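For illustration, a task carrying such a label might look like this (the task-file shape and field names are assumptions; only the `model:haiku` label format comes from this page):

```json
{
  "id": "42",
  "title": "Summarize release notes",
  "labels": ["model:haiku"]
}
```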
How Cub Adapts¶
Cub adjusts its behavior based on harness capabilities:
| Scenario | Adaptation |
|---|---|
| No streaming | Output appears after completion instead of real-time |
| No token_reporting | Budget tracking uses estimates or is disabled |
| No system_prompt | System prompt concatenated with task prompt |
| No json_output | Raw text output parsed as-is |
| No model_selection | model: task labels are ignored |
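The system-prompt fallback in the table can be sketched as follows (a minimal illustration; `build_prompt` is our name, not cub's API):

```python
def build_prompt(system_prompt: str, task_prompt: str, supports_system_prompt: bool):
    """Return (system, task) prompts adapted to the harness: a harness
    without separate system-prompt support receives one combined prompt
    joined by a `---` separator, as described above."""
    if supports_system_prompt:
        return system_prompt, task_prompt
    return None, f"{system_prompt}\n---\n{task_prompt}"
```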
Querying Capabilities¶
Python API¶
```python
from cub.core.harness import get_backend, get_capabilities

# Get current harness capabilities
backend = get_backend()
caps = backend.capabilities

if caps.streaming:
    print("Streaming available")

if caps.token_reporting:
    print("Accurate token tracking enabled")
```
Shell (Bash Backend)¶
```bash
source lib/harness.sh

# Check if current harness supports a capability
if harness_supports "streaming"; then
    echo "Streaming available"
fi

# Get all capabilities as JSON
harness_get_capabilities_json
# {"harness":"claude","streaming":true,"token_reporting":true,...}
```
Choosing a Harness¶
Choose Claude Code (SDK) (default, recommended) if:
- You need hooks for guardrails and circuit breakers
- You want custom tools and stateful sessions
- Budget tracking and streaming are important
- You want the most full-featured experience
Choose Claude Code (CLI) if:
- You need the previous shell-out behavior
- You're troubleshooting SDK integration issues
- Your environment doesn't support the Claude Agent SDK
Choose Codex if:
- You prefer OpenAI models
- Streaming is important but token tracking is not critical
Choose Gemini if:
- You want to use Google's models
- Basic autonomous operation is sufficient
Choose OpenCode if:
- You need streaming and token tracking
- You don't need separate system prompts
Async Harness Interface (v0.24+)¶
Starting in v0.24, all harnesses use an async interface internally. This enables:
- Non-blocking execution: Other tasks can run while waiting for AI responses
- Better streaming: True async generators for real-time output
- Hook integration: Hooks can intercept and modify behavior asynchronously
The async interface is transparent to users: `cub run` handles the async execution automatically.
Python API¶
```python
import asyncio

from cub.core.harness import get_async_backend, detect_async_harness

# Auto-detect best available harness
harness_name = detect_async_harness()

# Get the async backend
backend = get_async_backend(harness_name)

# Check capabilities
if backend.supports_feature("hooks"):
    print("Hooks available for guardrails")

# Run a task (async)
async def run():
    result = await backend.run_task(task_input)
    print(f"Completed with {result.usage.total_tokens} tokens")

asyncio.run(run())
```