# Research: Agent Frameworks for Programmatic/Headless Usage

## Summary

This research evaluates seven agent frameworks/tools for programmatic/headless usage: Hermes, OpenCode, Pi, OpenClaw, LangChain Agents, Claude Code, and Codex. The evaluation focuses on headless operation, resource usage, session management, agent lifecycle, data persistence, customizability, and integration complexity.

**For the user's use case (replacing hermes + opencode with something better for local dev and cloud production)**, the top recommendations are:

- **Pi (agent-core)**: Best for pure programmatic control, with an excellent TypeScript SDK, event-driven architecture, and lightweight footprint
- **Claude Code**: Best for production-grade headless operation, with structured output, CI/CD integration, and official SDK support
- **LangChain**: Best for flexibility and customization if the user wants full control over the agent loop
- **OpenCode**: Strong option if they want to stick with a similar architecture but need a better SDK

---

## Comparison Matrix

| Criteria | Hermes | OpenCode | Pi (agent-core) | OpenClaw | LangChain Agents | Claude Code | Codex |
|----------|--------|----------|-----------------|----------|------------------|-------------|-------|
| **Headless/Programmatic** | ✅ Python lib (`AIAgent`) | ✅ SDK + server mode | ✅ Full TypeScript SDK | ✅ Gateway WS API | ✅ `create_agent()` Python | ✅ `-p` flag + SDK | ❌ CLI only |
| **Resource Usage** | ~500MB+ (Python) | ~200-400MB (Go) | ~50-100MB (TS core) | ~500MB+ (Node) | ~100-300MB (Python) | ~200-400MB (Node) | ~200-300MB (Rust) |
| **Multi-agent Support** | ✅ Subagents/spawn | ✅ Multiple sessions | ✅ Multiple instances | ✅ Multi-agent routing | ✅ Via LangGraph | ✅ Multiple sessions | ❌ Single agent |
| **Session Management** | SQLite-based | Session API | In-memory + custom | Gateway sessions | Manual state | `--resume` flag | Session-based |
| **Data Persistence** | SQLite + pluggable memory | File-based | Custom (you control) | SQLite + gateway | You implement | File-based | File-based |
| **Customizability** | High (skills, tools, prompts) | High (tools, prompts) | High (tools, middleware) | High (skills, MCP) | Very high | Medium (plugins, hooks) | Low |
| **Plug-and-Play** | Easy (pip install) | Easy (npm) | Easy (npm) | Moderate | Moderate | Easy | Easy |
| **LLM Flexibility** | 200+ via OpenRouter | Any (provider-agnostic) | Any (multi-provider) | Any (multi-provider) | Any | Anthropic-first | OpenAI-first |

---

## Per-Tool Deep Dives

### 1. Hermes Agent (NousResearch/hermes-agent)

**Repository**: https://github.com/NousResearch/hermes-agent (30.7K stars)

#### Headless / Programmatic API

✅ **Yes - Python Library**

Hermes can be imported and used as a Python library:

```python
from run_agent import AIAgent

agent = AIAgent(
    model="anthropic/claude-sonnet-4",
    quiet_mode=True,
)
response = agent.chat("What is the capital of France?")
```

For full conversation control:

```python
result = agent.run_conversation(
    user_message="Search for recent Python features",
    task_id="my-task-1",
)
# Returns: final_response, messages, task_id
```

**CLI Headless**: Also supports the `-p` flag via the OpenClaw migration path.

#### Resource Usage

- **Memory**: ~500MB+ (Python runtime)
- **CPU**: Moderate (depends on model)
- **Multi-agent**: Supports subagents via the `sessions_spawn` tool
- **Batch**: `batch_runner.py` for parallel processing

#### Session Management

- **SQLite-based** session storage (configurable location)
- **Pluggable memory providers** (v0.7.0+) - built-in, Honcho, or custom
- **Conversation history** preserved across sessions
- **FTS5 search** for cross-session recall
- Multi-turn conversations via the `conversation_history` parameter

#### Agent Lifecycle

1. **Initialize**: `AIAgent(model=, quiet_mode=)`
2. **Run**: `chat()` or `run_conversation()`
3. **Terminate**: Automatic cleanup; resources released on conversation end
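The SQLite-backed, FTS5-searchable session storage described under Session Management boils down to a small amount of stdlib code. A minimal sketch of the pattern (the schema here is hypothetical, for illustration only - Hermes' actual tables differ):

```python
import sqlite3

# Hypothetical schema for illustration; not Hermes' actual storage layout.
con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE messages USING fts5(task_id, role, content)")

def save_message(task_id: str, role: str, content: str) -> None:
    con.execute("INSERT INTO messages VALUES (?, ?, ?)", (task_id, role, content))

def search(query: str) -> list:
    # FTS5 MATCH gives ranked full-text search across all sessions,
    # which is what cross-session recall needs.
    return con.execute(
        "SELECT task_id, content FROM messages WHERE messages MATCH ? ORDER BY rank",
        (query,),
    ).fetchall()

save_message("task-1", "user", "How do I configure OpenRouter credentials?")
save_message("task-2", "user", "Summarize the Paris trip notes")
print(search("credentials"))
```

FTS5 ships with the SQLite bundled in modern CPython builds, so no extra dependency is needed for this recall pattern.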
**Key options**:

- `max_iterations`: 90 by default (configurable)
- `enabled_toolsets` / `disabled_toolsets`: Control available tools
- `skip_memory` / `skip_context_files`: Stateless mode for APIs

#### Data Persistence

- **SQLite**: Session data stored in `~/.hermes/`
- **Memory**: Pluggable providers (built-in, Honcho, vector stores)
- **Trajectories**: JSONL format for training data (`save_trajectories=True`)
- **API Server**: Shared SessionDB for Open WebUI integration

#### Customizability

- **Skills**: Procedural memory via `SKILL.md` files
- **Tools**: Custom tool registration
- **Prompts**: `ephemeral_system_prompt` for dynamic prompts
- **MCP**: Model Context Protocol support
- **Platform hints**: `platform` param for Discord, Telegram, etc.

#### Performance/Intelligence

- **Self-improving**: The agent creates skills from experience
- **Memory persistence**: Learns across sessions
- **Credential pooling**: Multiple API keys with rotation
- **Compression**: Context compression to prevent overflow

#### Integration Example (FastAPI)

```python
from fastapi import FastAPI
from pydantic import BaseModel
from run_agent import AIAgent

app = FastAPI()

class ChatRequest(BaseModel):
    message: str
    model: str = "anthropic/claude-sonnet-4"

@app.post("/chat")
async def chat(request: ChatRequest):
    agent = AIAgent(
        model=request.model,
        quiet_mode=True,
        skip_context_files=True,
        skip_memory=True,
    )
    return {"response": agent.chat(request.message)}
```
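The parallel processing that `batch_runner.py` provides (mentioned under Resource Usage) follows a simple fan-out pattern. A minimal sketch, with a stub standing in for `AIAgent.chat` - threads are appropriate here because agent runs are I/O-bound network calls:

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for AIAgent.chat; a real run would hit an LLM API.
def run_task(prompt: str) -> str:
    return f"answer for: {prompt}"

def run_batch(prompts: list, workers: int = 4) -> list:
    # Fan tasks out across a thread pool; map preserves input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_task, prompts))

results = run_batch(["task A", "task B", "task C"])
print(results)
```

This is the shape of the pattern, not Hermes' actual implementation; the real runner also has to handle per-task errors and rate limits.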
---

### 2. OpenCode (anomalyco/opencode)

**Repository**: https://github.com/anomalyco/opencode (138.9K stars, but this is the frontend repo - the actual agent is https://github.com/opencode-ai/opencode with 11.8K stars)

#### Headless / Programmatic API

✅ **Yes - SDK + Server Mode**

**Server mode**:

```bash
opencode serve [--port 4096] [--hostname "127.0.0.1"]
```

**SDK**:

```typescript
import { createOpencode } from "@opencode-ai/sdk"

// Spawn a server and get a connected client:
const { client } = await createOpencode()

// Or client-only, against an already-running server:
// const client = createOpencodeClient({ baseUrl: "http://localhost:4096" })
```

#### Resource Usage

- **Memory**: ~200-400MB (Go runtime)
- **Architecture**: Client/server - the TUI is just one client
- **Multi-agent**: Multiple sessions supported

#### Session Management

Full **Session API**:

- `session.create()`, `session.list()`, `session.get()`
- `session.prompt()` - send prompts
- `session.abort()` - cancel running sessions
- `session.summarize()` - compress context

#### Agent Lifecycle

1. **Start server**: `opencode serve`
2. **Create session**: `client.session.create()`
3. **Prompt**: `client.session.prompt()`
4. **Terminate**: The server stays running; sessions are disposable

#### Data Persistence

- File-based configuration (`opencode.json`)
- Sessions stored in server memory (configurable)

#### Customizability

- **Tools**: Custom tool definitions
- **Prompts**: Custom system prompts
- **Structured output**: JSON Schema support
- **Provider-agnostic**: Any model via configuration

#### Structured Output Example

```typescript
const result = await client.session.prompt({
  path: { id: sessionId },
  body: {
    parts: [{ type: "text", text: "Research Anthropic" }],
    format: {
      type: "json_schema",
      schema: {
        type: "object",
        properties: {
          company: { type: "string" },
          founded: { type: "number" },
        },
        required: ["company", "founded"],
      },
    },
  },
});
```
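On the receiving side, a structured-output reply still needs to be parsed and checked before use. A minimal sketch of that step - a hand-rolled checker covering only the tiny JSON Schema subset used in the example above; a real service would use a full JSON Schema validator:

```python
import json

# The schema from the structured-output example above.
SCHEMA = {
    "type": "object",
    "properties": {"company": {"type": "string"}, "founded": {"type": "number"}},
    "required": ["company", "founded"],
}

# Map JSON Schema type names to Python runtime types.
TYPES = {"string": str, "number": (int, float), "object": dict}

def validate(payload: str, schema: dict) -> dict:
    """Parse a model reply and check required keys and basic types."""
    data = json.loads(payload)
    if not isinstance(data, TYPES[schema["type"]]):
        raise ValueError("wrong top-level type")
    for key in schema.get("required", []):
        if key not in data:
            raise ValueError(f"missing required key: {key}")
    for key, sub in schema.get("properties", {}).items():
        if key in data and not isinstance(data[key], TYPES[sub["type"]]):
            raise ValueError(f"wrong type for {key}")
    return data

reply = '{"company": "Anthropic", "founded": 2021}'
print(validate(reply, SCHEMA))
```

The point of requiring a schema server-side is exactly that this client-side check becomes a formality rather than a guessing game.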
---

### 3. Pi (badlogic/pi-mono)

**Repository**: https://github.com/badlogic/pi-mono (33.1K stars)

**This is the actual agent runtime that Feynman uses.**

#### Headless / Programmatic API

✅ **Yes - Full TypeScript SDK**

```typescript
import { Agent } from "@mariozechner/pi-agent-core";
import { getModel } from "@mariozechner/pi-ai";

const agent = new Agent({
  initialState: {
    systemPrompt: "You are a helpful assistant.",
    model: getModel("anthropic", "claude-sonnet-4-20250514"),
  },
});

agent.subscribe((event) => {
  if (event.type === "message_update" && event.assistantMessageEvent.type === "text_delta") {
    process.stdout.write(event.assistantMessageEvent.delta);
  }
});

await agent.prompt("Hello!");
```

#### Resource Usage

- **Memory**: ~50-100MB for the core agent (very lightweight)
- **CPU**: Minimal (just orchestration)
- **Multi-agent**: Create multiple `Agent` instances
- **Dependencies**: Requires `@mariozechner/pi-ai` for LLM calls

#### Session Management

- **In-memory** by default - you control persistence
- **Messages array** in agent state
- **Custom state schema** via TypeScript interfaces
- **Session ID** for provider caching

#### Agent Lifecycle

1. **Create**: `new Agent({ initialState })`
2. **Prompt**: `agent.prompt()` or `agent.continue()`
3. **Events**: Subscribe to `agent_start`, `turn_start`, `message_update`, etc.
4. **Terminate**: `agent.reset()` or let the instance go out of scope
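The `subscribe()` API above is a plain observer pattern over lifecycle events. A language-agnostic sketch of the same shape (in Python for brevity; `MiniAgent` and its behavior are illustrative stand-ins, not pi's API - only the event names match the list above):

```python
from dataclasses import dataclass, field

@dataclass
class Event:
    type: str
    payload: str = ""

@dataclass
class MiniAgent:
    """Toy stand-in for pi's Agent: holds state, emits lifecycle events."""
    messages: list = field(default_factory=list)
    _subscribers: list = field(default_factory=list)

    def subscribe(self, fn) -> None:
        self._subscribers.append(fn)

    def _emit(self, event: Event) -> None:
        for fn in self._subscribers:
            fn(event)

    def prompt(self, text: str) -> None:
        self._emit(Event("agent_start"))
        self.messages.append({"role": "user", "content": text})
        # A real agent would stream model output here as message_update events.
        self._emit(Event("message_update", payload="(assistant reply)"))
        self._emit(Event("agent_end"))

agent = MiniAgent()
seen = []
agent.subscribe(lambda e: seen.append(e.type))
agent.prompt("Hello!")
print(seen)
```

Because subscribers see every event, streaming UIs, loggers, and persistence layers can all hang off the same hook without the agent knowing about them.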
**Key options**:

- `transformContext`: Prune/compress messages
- `convertToLlm`: Filter custom message types
- `beforeToolCall` / `afterToolCall`: Hooks for tool execution

#### Data Persistence

- **You control it**: Implement persistence via middleware
- **State is mutable**: `agent.state.messages = newMessages`
- **No built-in storage**: Freedom to implement as needed

#### Customizability

- **Tools**: `AgentTool` with Typebox schemas
- **Middleware**: `@dynamic_prompt`, `@wrap_tool_call` decorators
- **Message types**: Custom via declaration merging
- **Thinking budgets**: Configurable per provider

#### Low-Level API

```typescript
import { agentLoop, agentLoopContinue } from "@mariozechner/pi-agent-core";

for await (const event of agentLoop([userMessage], context, config)) {
  console.log(event.type);
}
```

---

### 4. OpenClaw (openclaw/openclaw)

**Repository**: https://github.com/openclaw/openclaw (351.9K stars)

#### Headless / Programmatic API

✅ **Yes - Gateway WebSocket API**

OpenClaw has an extensive Gateway WS API:

```bash
openclaw gateway --port 18789 --verbose

# Send a message
openclaw message send --to +1234567890 --message "Hello"

# Agent command
openclaw agent --message "Ship checklist" --thinking high
```

#### Resource Usage

- **Memory**: ~500MB+ (Node.js runtime)
- **Multi-agent**: Multi-agent routing via the Gateway

#### Session Management

- **Gateway sessions**: Main session + group isolation
- **Session tools**: `sessions_list`, `sessions_history`, `sessions_send`
- **SQLite-based** storage

#### Agent Lifecycle

1. **Start Gateway**: `openclaw gateway`
2. **Connect**: WebSocket to `ws://127.0.0.1:18789`
3. **Message**: Send via CLI or API
4. **Persistence**: Sessions saved to SQLite
#### Data Persistence

- **SQLite**: Gateway session storage
- **Workspace**: `~/.openclaw/workspace`
- **Skills**: `~/.openclaw/workspace/skills//SKILL.md`

#### Customizability

- **Skills**: Full skill system (ClawHub registry)
- **MCP**: Model Context Protocol support
- **Channels**: 20+ messaging platforms

---

### 5. LangChain Agents (langchain-ai/langchain)

**Repository**: https://github.com/langchain-ai/langchain

#### Headless / Programmatic API

✅ **Yes - Full Python API**

```python
from langchain.agents import create_agent

agent = create_agent("openai:gpt-5", tools=tools)
result = agent.invoke({"messages": [{"role": "user", "content": "Hello"}]})
```

#### Resource Usage

- **Memory**: ~100-300MB (Python)
- **Flexible**: Your code controls resource allocation
- **Multi-agent**: Via LangGraph subgraphs

#### Session Management

- **Manual**: You manage message history in state
- **Custom state**: Extend the `AgentState` TypedDict
- **Memory integration**: Optional short-term/long-term memory

#### Agent Lifecycle

1. **Create**: `create_agent(model, tools, system_prompt)`
2. **Invoke**: `agent.invoke({"messages": [...]})`
3. **Stream**: `agent.stream()` for real-time events

#### Data Persistence

- **You implement it**: Full control via middleware
- **Optional memory**: LangChain memory modules

#### Customizability

- **Very high**: Middleware, tools, prompts, dynamic everything
- **ReAct pattern**: Built-in reasoning + acting loop
- **ToolStrategy** / **ProviderStrategy**: Structured output
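Because LangChain leaves message history to the caller, a "session" is just a list you thread through each invocation. A minimal sketch of that manual-state pattern, with a stub standing in for the actual `agent.invoke` call:

```python
# Stub model call; a real app would call agent.invoke({"messages": ...}) here.
def fake_model(messages: list) -> dict:
    last = messages[-1]["content"]
    return {"role": "assistant", "content": f"echo: {last}"}

class Session:
    """Carries message history across turns - the part LangChain leaves to you."""
    def __init__(self) -> None:
        self.messages: list = []

    def turn(self, text: str) -> str:
        self.messages.append({"role": "user", "content": text})
        reply = fake_model(self.messages)
        self.messages.append(reply)
        return reply["content"]

s = Session()
s.turn("first question")
s.turn("follow-up")
print(len(s.messages))  # 4: two user turns, two assistant replies
```

Persisting a session then reduces to serializing `s.messages` wherever you like (file, SQLite, Redis), which is exactly the flexibility - and the burden - the "You implement" cell in the matrix refers to.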
---

### 6. Claude Code (anthropics/claude-code)

**Repository**: https://github.com/anthropics/claude-code

#### Headless / Programmatic API

✅ **Yes - Agent SDK + CLI**

**CLI headless**:

```bash
claude -p "Find and fix the bug in auth.py" --allowedTools "Read,Edit,Bash"
claude --bare -p "Summarize" --allowedTools "Read"
```

**SDK** (Python/TypeScript), sketched here with the Python Agent SDK's `query()` entry point (the `claude_agent_sdk` package):

```python
import asyncio
from claude_agent_sdk import query

async def main():
    async for message in query(prompt="Fix the bug in auth.py"):
        print(message)

asyncio.run(main())
```

#### Resource Usage

- **Memory**: ~200-400MB (Node.js)
- **Structured output**: JSON via `--output-format json`
- **Streaming**: `--output-format stream-json`

#### Session Management

- **Session ID**: `--resume <session-id>`
- **Continue**: `--continue` for follow-ups
- **Persistence**: File-based in `~/.claude/`

#### Agent Lifecycle

1. **Run**: `claude -p "task"`
2. **Continue**: `claude -p "more" --continue`
3. **Resume**: `claude --resume <session-id>`

#### Customizability

- **Hooks**: Pre/post tool use
- **Plugins**: Custom commands and agents
- **MCP**: Model Context Protocol
- **Settings**: JSON config files
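For CI, the headless flags above compose into a single command whose JSON output can be parsed downstream. A small sketch that only assembles the argv, so it can be tested without the CLI installed; the actual invocation is left commented:

```python
import json        # used in the commented-out invocation below
import subprocess  # used in the commented-out invocation below

def build_claude_cmd(prompt: str, allowed_tools: list) -> list:
    """Assemble a headless invocation from the flags documented above."""
    return [
        "claude", "-p", prompt,
        "--allowedTools", ",".join(allowed_tools),
        "--output-format", "json",
    ]

cmd = build_claude_cmd("Find and fix the bug in auth.py", ["Read", "Edit", "Bash"])
print(cmd)

# To actually run it (requires the claude CLI on PATH):
# out = subprocess.run(cmd, capture_output=True, text=True, check=True)
# result = json.loads(out.stdout)
```

Keeping command construction as a pure function makes the CI wrapper trivially unit-testable, which matters more than usual when the wrapped tool bills per invocation.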
---

### 7. Codex (openai/codex)

**Repository**: https://github.com/openai/codex

#### Headless / Programmatic API

❌ **CLI only - no official programmatic API**

```bash
npm install -g @openai/codex
codex "Write a function to sort a list"
```

#### Resource Usage

- **Memory**: ~200-300MB (Rust binary)
- **Lightweight**: Minimal footprint

#### Session Management

- **Limited**: Basic session support
- **No SDK**: Not designed for programmatic control

#### Customizability

- **Low**: No official extension API
- **Provider-locked**: OpenAI-first

---

## Recommendations for User's Use Case

### Primary Recommendation: Pi (agent-core)

**Why**:

- Lightest weight (~50-100MB)
- Full programmatic control via TypeScript
- Event-driven architecture, perfect for custom integration
- Feynman already uses it - a seamless replacement
- You control persistence - well suited to cloud production

**Best for**: Fine-grained control, a lightweight footprint, and the TypeScript ecosystem

### Secondary: Claude Code

**Why**:

- Production-grade headless mode
- Structured output support
- Official SDK (Python/TypeScript)
- CI/CD integration built in
- `bare` mode for consistent CI runs

**Best for**: Production cloud deployment with structured-output requirements

### Alternative: LangChain

**Why**:

- Maximum flexibility
- Any LLM provider
- Rich ecosystem
- Full control over the agent loop

**Best for**: Building custom agent behavior from scratch

---

## Sources

### Primary Sources (Kept)

- **Hermes Agent**: https://github.com/NousResearch/hermes-agent - Python library docs, v0.7.0 release notes
- **OpenCode SDK**: https://opencode.ai/docs/sdk/ - Full TypeScript SDK documentation
- **Pi agent-core**: https://github.com/badlogic/pi-mono/tree/main/packages/agent - Complete TypeScript API
- **Claude Code Headless**: https://code.claude.com/docs/en/headless - Official headless documentation
- **LangChain Agents**: https://docs.langchain.com/oss/python/langchain/agents - Official agents documentation
- **OpenClaw**: https://github.com/openclaw/openclaw - Gateway architecture
- **Codex**: https://github.com/openai/codex - CLI tool

### Why These Sources

- Official repositories and documentation
- Recent updates (2025-2026)
- Direct technical details from source
- Code examples for integration

---

## Gaps & Limitations

### Not Fully Covered

1. **Benchmark data**: No comprehensive benchmarks comparing agent performance across tools
2. **OpenCode internal architecture**: Client/server details are somewhat opaque
3. **Exact resource numbers**: Estimates based on typical Python/Node.js/Go runtime sizes
4. **OpenClaw detailed SDK**: Very large project; deep programmatic details require more investigation
5. **Codex SDK**: Currently CLI-only with no programmatic API

### Suggested Next Steps

1. **Test Pi locally**: Install `@mariozechner/pi-agent-core` and verify headless operation
2. **Test Claude Code**: Try `claude -p --bare` for the CI use case
3. **OpenCode server test**: Run `opencode serve` and test SDK integration
4. **Hermes Python lib**: Test the programmatic API for comparison

### For Cloud Production

- Consider **Pi** for lightweight containers
- Consider **Claude Code** for structured-output requirements
- Pi supports any LLM provider; Claude Code is Anthropic-first (see the comparison matrix)