kage-research/research.md
2026-04-09 00:39:52 +00:00

Research: Agent Frameworks for Programmatic/Headless Usage

Summary

This research evaluates seven agent frameworks/tools for programmatic/headless usage: Hermes, OpenCode, Pi, OpenClaw, LangChain Agents, Claude Code, and Codex. The evaluation focuses on headless operation, resource usage, session management, agent lifecycle, data persistence, customizability, and integration complexity. For the user's use case (replacing hermes + opencode with something better for local dev and cloud production), the top recommendations are:

  • Pi (agent-core): Best for pure programmatic control with excellent TypeScript SDK, event-driven architecture, and lightweight footprint
  • Claude Code: Best for production-grade headless operation with structured output, CI/CD integration, and official SDK support
  • LangChain: Best for flexibility and customization if the user wants full control over the agent loop
  • OpenCode: Strong option if they want to stick with a similar architecture but need better SDK

Comparison Matrix

| Criteria | Hermes | OpenCode | Pi (agent-core) | OpenClaw | LangChain Agents | Claude Code | Codex |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Headless/Programmatic | Python lib (AIAgent) | SDK + server mode | Full TypeScript SDK | Gateway WS API | create_agent() (Python) | -p flag + SDK | CLI only |
| Resource Usage | ~500MB+ (Python) | ~200-400MB (Go) | ~50-100MB (TS core) | ~500MB+ (Node) | ~100-300MB (Python) | ~200-400MB (Node) | ~200-300MB (Rust) |
| Multi-agent Support | Subagents/spawn | Multiple sessions | Multiple instances | Multi-agent routing | Via LangGraph | Multiple sessions | Single agent |
| Session Management | SQLite-based | Session API | In-memory + custom | Gateway sessions | Manual state | --resume flag | Session-based |
| Data Persistence | SQLite + pluggable memory | File-based | Custom (you control) | SQLite + gateway | You implement | File-based | File-based |
| Customizability | High (skills, tools, prompts) | High (tools, prompts) | High (tools, middleware) | High (skills, MCP) | Very high | Medium (plugins, hooks) | Low |
| Plug-and-Play | Easy (pip install) | Easy (npm) | Easy (npm) | Moderate | Moderate | Easy | Easy |
| LLM Flexibility | 200+ via OpenRouter | Any (provider-agnostic) | Any (multi-provider) | Any (multi-provider) | Any | Anthropic-first | OpenAI-first |

Per-Tool Deep Dives

1. Hermes Agent (NousResearch/hermes-agent)

Repository: https://github.com/NousResearch/hermes-agent (30.7K stars)

Headless / Programmatic API

Yes - Python Library

Hermes can be imported and used as a Python library:

from run_agent import AIAgent

agent = AIAgent(
    model="anthropic/claude-sonnet-4",
    quiet_mode=True,
)
response = agent.chat("What is the capital of France?")

For full conversation control:

result = agent.run_conversation(
    user_message="Search for recent Python features",
    task_id="my-task-1",
)
# Returns: final_response, messages, task_id

CLI Headless: Also supports -p flag via OpenClaw migration path.

Resource Usage

  • Memory: ~500MB+ (Python runtime)
  • CPU: Moderate (depends on model)
  • Multi-agent: Supports subagents via sessions_spawn tool
  • Batch: batch_runner.py for parallel processing

Session Management

  • SQLite-based session storage (configurable location)
  • Pluggable memory providers (v0.7.0+) - built-in, Honcho, or custom
  • Conversation history preserved across sessions
  • FTS5 search for cross-session recall
  • Multi-turn conversations via conversation_history parameter
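
The FTS5 cross-session recall can be pictured with a stdlib sketch. This is not Hermes's actual schema — just the underlying pattern of a full-text index over session messages, using the FTS5 support bundled with Python's sqlite3 module:

```python
import sqlite3

# In-memory stand-in for a session store; Hermes's real schema differs.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE messages USING fts5(session_id, role, content)")
db.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [
        ("task-1", "user", "Search for recent Python features"),
        ("task-1", "assistant", "Python 3.13 adds an experimental JIT."),
        ("task-2", "user", "What is the capital of France?"),
    ],
)

# Cross-session recall: rank full-text matches across all stored sessions.
rows = db.execute(
    "SELECT session_id, content FROM messages WHERE messages MATCH ? ORDER BY rank",
    ("python",),
).fetchall()
```

Only the two task-1 rows match here; a real recall tool would surface those snippets back into the current session's context.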

Agent Lifecycle

  1. Initialize: AIAgent(model=, quiet_mode=)
  2. Run: chat() or run_conversation()
  3. Terminate: Automatic cleanup; resources released on conversation end

Key options:

  • max_iterations: 90 default (configurable)
  • enabled_toolsets / disabled_toolsets: Control available tools
  • skip_memory / skip_context_files: Stateless mode for APIs

Data Persistence

  • SQLite: Session data stored in ~/.hermes/
  • Memory: Pluggable providers (built-in, Honcho, vector stores)
  • Trajectories: JSONL format for training data (save_trajectories=True)
  • API Server: Shared SessionDB for Open WebUI integration
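
JSONL trajectories hold one JSON record per line, which makes them easy to stream into training pipelines. A minimal reader/writer sketch — the field names are illustrative, not Hermes's exact trajectory schema:

```python
import json
import tempfile
from pathlib import Path

def write_trajectories(path, records):
    """Append trajectory records, one JSON object per line (JSONL)."""
    with open(path, "a", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")

def read_trajectories(path):
    """Stream records back one at a time without loading the whole file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)

path = Path(tempfile.mkdtemp()) / "trajectories.jsonl"
write_trajectories(path, [
    {"task_id": "my-task-1", "messages": [{"role": "user", "content": "hi"}]},
    {"task_id": "my-task-2", "messages": [{"role": "user", "content": "bye"}]},
])
records = list(read_trajectories(path))
```

Append mode plus line-at-a-time reads is what makes JSONL suitable for long-running collection: records land incrementally and consumers never need the whole file in memory.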

Customizability

  • Skills: Procedural memory via SKILL.md files
  • Tools: Custom tool registration
  • Prompts: ephemeral_system_prompt for dynamic prompts
  • MCP: Model Context Protocol support
  • Platform hints: platform param for Discord, Telegram, etc.

Performance/Intelligence

  • Self-improving: Agent creates skills from experience
  • Memory persistence: Learns across sessions
  • Credential pooling: Multiple API keys with rotation
  • Compression: Context compression to prevent overflow
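
Credential pooling is essentially round-robin rotation with failover. A generic sketch of the idea (not Hermes's implementation — class and method names are illustrative):

```python
from itertools import cycle

class KeyPool:
    """Rotate through API keys, skipping any that have been marked bad."""

    def __init__(self, keys):
        self._bad = set()
        self._cycle = cycle(keys)
        self._size = len(keys)

    def next_key(self):
        # At most one full pass over the pool before giving up.
        for _ in range(self._size):
            key = next(self._cycle)
            if key not in self._bad:
                return key
        raise RuntimeError("all keys exhausted")

    def mark_bad(self, key):
        """Call on a 401/429 so the key is skipped on future rotations."""
        self._bad.add(key)

pool = KeyPool(["key-a", "key-b", "key-c"])
first = pool.next_key()   # rotation starts at "key-a"
pool.mark_bad("key-b")    # subsequent calls alternate between key-a and key-c
```

In practice the mark_bad call would be wired to rate-limit or auth errors from the provider, with keys re-admitted after a cooldown.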

Integration Example (FastAPI)

from fastapi import FastAPI
from pydantic import BaseModel
from run_agent import AIAgent

app = FastAPI()

class ChatRequest(BaseModel):
    message: str
    model: str = "anthropic/claude-sonnet-4"

@app.post("/chat")
def chat(request: ChatRequest):
    # agent.chat() is blocking, so use a sync handler: FastAPI runs it
    # in a threadpool instead of stalling the event loop.
    agent = AIAgent(
        model=request.model,
        quiet_mode=True,
        skip_context_files=True,  # stateless: no project context files
        skip_memory=True,         # stateless: no cross-session memory
    )
    return {"response": agent.chat(request.message)}

2. OpenCode (anomalyco/opencode)

Repository: https://github.com/anomalyco/opencode (138.9K stars, but this is the frontend repo - the actual agent is https://github.com/opencode-ai/opencode with 11.8K stars)

Headless / Programmatic API

Yes - SDK + Server Mode

Server Mode:

opencode serve [--port 4096] [--hostname "127.0.0.1"]

SDK:

import { createOpencode } from "@opencode-ai/sdk"

const { client } = await createOpencode()
// Or client-only:
const client = createOpencodeClient({ baseUrl: "http://localhost:4096" })

Resource Usage

  • Memory: ~200-400MB (Go runtime)
  • Architecture: Client/server - TUI is just one client
  • Multi-agent: Multiple sessions supported

Session Management

  • Full Session API:
    • session.create(), session.list(), session.get()
    • session.prompt() - send prompts
    • session.abort() - cancel running sessions
    • session.summarize() - compress context

Agent Lifecycle

  1. Start server: opencode serve
  2. Create session: client.session.create()
  3. Prompt: client.session.prompt()
  4. Terminate: Server stays running; sessions are disposable

Data Persistence

  • File-based configuration (opencode.json)
  • Sessions stored in server memory (configurable)

Customizability

  • Tools: Custom tool definitions
  • Prompts: Custom system prompts
  • Structured Output: JSON Schema support
  • Provider-agnostic: Any model via configuration

Structured Output Example

const result = await client.session.prompt({
  path: { id: sessionId },
  body: {
    parts: [{ type: "text", text: "Research Anthropic" }],
    format: {
      type: "json_schema",
      schema: {
        type: "object",
        properties: {
          company: { type: "string" },
          founded: { type: "number" },
        },
        required: ["company", "founded"],
      },
    },
  },
});

3. Pi (badlogic/pi-mono)

Repository: https://github.com/badlogic/pi-mono (33.1K stars)

This is the actual agent runtime that Feynman uses.

Headless / Programmatic API

Yes - Full TypeScript SDK

import { Agent } from "@mariozechner/pi-agent-core";
import { getModel } from "@mariozechner/pi-ai";

const agent = new Agent({
  initialState: {
    systemPrompt: "You are a helpful assistant.",
    model: getModel("anthropic", "claude-sonnet-4-20250514"),
  },
});

agent.subscribe((event) => {
  if (event.type === "message_update" && event.assistantMessageEvent.type === "text_delta") {
    process.stdout.write(event.assistantMessageEvent.delta);
  }
});

await agent.prompt("Hello!");

Resource Usage

  • Memory: ~50-100MB for core agent (very lightweight)
  • CPU: Minimal (just orchestration)
  • Multi-agent: Create multiple Agent instances
  • Dependencies: Requires @mariozechner/pi-ai for LLM calls

Session Management

  • In-memory by default - you control persistence
  • Messages array in agent state
  • Custom state schema via TypeScript interfaces
  • Session ID for provider caching

Agent Lifecycle

  1. Create: new Agent({ initialState })
  2. Prompt: agent.prompt() or agent.continue()
  3. Events: Subscribe to agent_start, turn_start, message_update, etc.
  4. Terminate: agent.reset() or let go out of scope

Key options:

  • transformContext: Prune/compress messages
  • convertToLlm: Filter custom message types
  • beforeToolCall / afterToolCall: Hooks for tool execution
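
A transformContext-style hook is just a pure function over the message list. The shape below is a language-neutral sketch in Python (Pi's actual hook is TypeScript and receives Pi's own message objects):

```python
def prune_context(messages, max_turns=10):
    """Keep the system prompt plus only the most recent turns.

    Illustrative pruning strategy only — real implementations often
    summarize dropped turns rather than discarding them outright.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-max_turns:]

history = [{"role": "system", "content": "You are helpful."}]
history += [{"role": "user", "content": f"msg {i}"} for i in range(20)]
pruned = prune_context(history, max_turns=5)  # system prompt + last 5 messages
```

Because the hook is pure, it composes cleanly: token-budget truncation, summarization, or filtering of custom message types can be chained as successive transforms.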

Data Persistence

  • You control: Implement persistence via middleware
  • State is mutable: agent.state.messages = newMessages
  • No built-in storage: Freedom to implement as needed
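
Since state is mutable and storage is up to you, persistence can be as small as snapshotting the messages array. A sketch assuming JSON-serializable messages (Pi's real message objects may need a custom encoder; the function names are illustrative):

```python
import json
import tempfile
from pathlib import Path

def save_state(path, messages):
    """Snapshot the agent's messages array to disk."""
    Path(path).write_text(json.dumps(messages), encoding="utf-8")

def load_state(path):
    """Restore a previous messages array, or start fresh if none exists."""
    p = Path(path)
    return json.loads(p.read_text(encoding="utf-8")) if p.exists() else []

store = Path(tempfile.mkdtemp()) / "session.json"
state = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi there."},
]
save_state(store, state)       # e.g. after each turn, or in a tool hook
restored = load_state(store)   # feed back into initialState on restart
```

Swapping the file for SQLite, Redis, or a cloud KV store is a local change to these two functions, which is the upside of Pi leaving persistence to the caller.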

Customizability

  • Tools: AgentTool with Typebox schemas
  • Middleware: @dynamic_prompt, @wrap_tool_call decorators
  • Message types: Custom via declaration merging
  • Thinking budgets: Configurable per provider

Low-Level API

import { agentLoop, agentLoopContinue } from "@mariozechner/pi-agent-core";

for await (const event of agentLoop([userMessage], context, config)) {
  console.log(event.type);
}

4. OpenClaw (openclaw/openclaw)

Repository: https://github.com/openclaw/openclaw (351.9K stars)

Headless / Programmatic API

Yes - Gateway WebSocket API

OpenClaw has an extensive Gateway WS API:

openclaw gateway --port 18789 --verbose

# Send a message
openclaw message send --to +1234567890 --message "Hello"

# Agent command
openclaw agent --message "Ship checklist" --thinking high

Resource Usage

  • Memory: ~500MB+ (Node.js runtime)
  • Multi-agent: Multi-agent routing via Gateway

Session Management

  • Gateway Sessions: Main session + group isolation
  • Session tools: sessions_list, sessions_history, sessions_send
  • SQLite-based storage

Agent Lifecycle

  1. Start Gateway: openclaw gateway
  2. Connect: WebSocket to ws://127.0.0.1:18789
  3. Message: Send via CLI or API
  4. Persistence: Sessions saved to SQLite

Data Persistence

  • SQLite: Gateway session storage
  • Workspace: ~/.openclaw/workspace
  • Skills: ~/.openclaw/workspace/skills/<skill>/SKILL.md

Customizability

  • Skills: Full skill system (ClawHub registry)
  • MCP: Model Context Protocol support
  • Channels: 20+ messaging platforms

5. LangChain Agents (langchain-ai/langchain)

Repository: https://github.com/langchain-ai/langchain

Headless / Programmatic API

Yes - Full Python API

from langchain.agents import create_agent

agent = create_agent("openai:gpt-5", tools=tools)
result = agent.invoke({"messages": [{"role": "user", "content": "Hello"}]})

Resource Usage

  • Memory: ~100-300MB (Python)
  • Flexible: Your code controls resource allocation
  • Multi-agent: Via LangGraph subgraphs

Session Management

  • Manual: You manage message history in state
  • Custom state: Extend AgentState TypedDict
  • Memory integration: Optional short-term/long-term memory
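
Manual session management means the caller threads the messages list through every invoke. A sketch with a stubbed agent standing in for a real create_agent result (the stub and its echo behavior are illustrative):

```python
def stub_agent_invoke(state):
    """Stand-in for agent.invoke(): echoes the last user message."""
    last = state["messages"][-1]["content"]
    reply = {"role": "assistant", "content": f"You said: {last}"}
    return {"messages": state["messages"] + [reply]}

# The caller owns the history; each turn appends and re-invokes.
state = {"messages": []}
for user_text in ["Hello", "What did I just say?"]:
    state["messages"].append({"role": "user", "content": user_text})
    state = stub_agent_invoke(state)   # real code: agent.invoke(state)
```

After the loop, state holds all four messages (two user turns, two replies). To persist a session, serialize this dict between requests — or hand the problem to a LangGraph checkpointer.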

Agent Lifecycle

  1. Create: create_agent(model, tools, system_prompt)
  2. Invoke: agent.invoke({"messages": [...]})
  3. Stream: agent.stream() for real-time events

Data Persistence

  • You implement: Full control via middleware
  • Optional memory: LangChain memory modules

Customizability

  • Very high: Middleware, tools, prompts, dynamic everything
  • ReAct pattern: Built-in reasoning + acting loop
  • ToolStrategy / ProviderStrategy: Structured output

6. Claude Code (anthropics/claude-code)

Repository: https://github.com/anthropics/claude-code

Headless / Programmatic API

Yes - Agent SDK + CLI

CLI Headless:

claude -p "Find and fix the bug in auth.py" --allowedTools "Read,Edit,Bash"
claude --bare -p "Summarize" --allowedTools "Read"

SDK (Python/TypeScript):

# Python package: claude-agent-sdk (not the base anthropic client)
import anyio
from claude_agent_sdk import query

async def main():
    # query() streams messages from a headless Claude Code run
    async for message in query(prompt="Fix the bug in auth.py"):
        print(message)

anyio.run(main)

Resource Usage

  • Memory: ~200-400MB (Node.js)
  • Structured output: JSON with --output-format json
  • Streaming: --output-format stream-json
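
With --output-format json, claude -p prints a single JSON object on stdout that downstream tooling can parse. The sample payload below shows the general shape; field names like result, session_id, and is_error are my assumption of the format — verify against your installed version:

```python
import json

# Sample stdout from `claude -p ... --output-format json` (shape assumed):
raw = '{"result": "Fixed the bug in auth.py", "session_id": "abc-123", "is_error": false}'

payload = json.loads(raw)
if payload.get("is_error"):
    raise SystemExit("agent run failed")

answer = payload["result"]          # the agent's final response text
session = payload["session_id"]     # pass back via --resume for follow-ups
```

In CI this pairs naturally with subprocess.run(..., capture_output=True): parse stdout, branch on the error flag, and stash the session id if a later job needs to resume.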

Session Management

  • Session ID: --resume <session-id>
  • Continue: --continue for follow-up
  • Persistence: File-based in ~/.claude/

Agent Lifecycle

  1. Run: claude -p "task"
  2. Continue: claude -p "more" --continue
  3. Resume: claude --resume <session-id>

Customizability

  • Hooks: Pre/post tool use
  • Plugins: Custom commands and agents
  • MCP: Model Context Protocol
  • Settings: JSON config files

7. Codex (openai/codex)

Repository: https://github.com/openai/codex

Headless / Programmatic API

CLI Only - No official programmatic API

npm install -g @openai/codex
codex "Write a function to sort a list"

Resource Usage

  • Memory: ~200-300MB (Rust binary)
  • Lightweight: Minimal footprint

Session Management

  • Limited: Basic session support
  • No SDK: Not designed for programmatic control

Customizability

  • Low: No official extension API
  • Provider-locked: OpenAI-first

Recommendations for User's Use Case

Primary Recommendation: Pi (agent-core)

Why:

  • Lightest weight (~50-100MB)
  • Full programmatic control via TypeScript
  • Event-driven architecture perfect for custom integration
  • Feynman already uses it - seamless replacement
  • You control persistence - perfect for cloud production

Best for: User wants fine-grained control, lightweight footprint, TypeScript ecosystem

Secondary: Claude Code

Why:

  • Production-grade headless mode
  • Structured output support
  • Official SDK (Python/TypeScript)
  • CI/CD integration built-in
  • bare mode for consistent CI runs

Best for: Production cloud deployment with structured requirements

Alternative: LangChain

Why:

  • Maximum flexibility
  • Any LLM provider
  • Rich ecosystem
  • Full control over agent loop

Best for: User wants to build custom agent behavior from scratch


Sources

Why These Sources

  • Official repositories and documentation
  • Recent updates (2025-2026)
  • Direct technical details from source
  • Code examples for integration

Gaps & Limitations

Not Fully Covered

  1. Benchmark data: No comprehensive benchmarks comparing agent performance across tools
  2. OpenCode internal architecture: Client/server details somewhat opaque
  3. Exact resource numbers: Estimates based on typical Python/Node.js/Go runtime sizes
  4. OpenClaw detailed SDK: Very large project; deep programmatic details require more investigation
  5. Codex SDK: Currently CLI-only with no programmatic API

Suggested Next Steps

  1. Test Pi locally: Install @mariozechner/pi-agent-core and verify headless operation
  2. Test Claude Code: Try claude -p --bare for CI use case
  3. OpenCode server test: Run opencode serve and test SDK integration
  4. Hermes Python lib: Test the programmatic API for comparison

For Cloud Production

  • Consider Pi for lightweight containers
  • Consider Claude Code for structured output requirements
  • Pi is provider-agnostic; Claude Code is Anthropic-first (with Bedrock/Vertex deployment options), so weigh provider lock-in before committing