Research: Agent Frameworks for Programmatic/Headless Usage
Summary
This research evaluates seven agent frameworks/tools for programmatic/headless usage: Hermes, OpenCode, Pi, OpenClaw, LangChain Agents, Claude Code, and Codex. The evaluation focuses on headless operation, resource usage, session management, agent lifecycle, data persistence, customizability, and integration complexity. For the user's use case (replacing hermes + opencode with something better for local dev and cloud production), the top recommendations are:
- Pi (agent-core): Best for pure programmatic control with excellent TypeScript SDK, event-driven architecture, and lightweight footprint
- Claude Code: Best for production-grade headless operation with structured output, CI/CD integration, and official SDK support
- LangChain: Best for flexibility and customization if the user wants full control over the agent loop
- OpenCode: Strong option if they want to stick with a similar architecture but need better SDK
Comparison Matrix
| Criteria | Hermes | OpenCode | Pi (agent-core) | OpenClaw | LangChain Agents | Claude Code | Codex |
|---|---|---|---|---|---|---|---|
| Headless/Programmatic | ✅ Python lib (`AIAgent`) | ✅ SDK + server mode | ✅ Full TypeScript SDK | ✅ Gateway WS API | ✅ `create_agent()` Python | ✅ `-p` flag + SDK | ❌ CLI only |
| Resource Usage | ~500MB+ (Python) | ~200-400MB (Go) | ~50-100MB (TS core) | ~500MB+ (Node) | ~100-300MB (Python) | ~200-400MB (Node) | ~200-300MB (Rust) |
| Multi-agent Support | ✅ Subagents/spawn | ✅ Multiple sessions | ✅ Multiple instances | ✅ Multi-agent routing | ✅ Via LangGraph | ✅ Multiple sessions | ❌ Single agent |
| Session Management | SQLite-based | Session API | In-memory + custom | Gateway sessions | Manual state | `--resume` flag | Session-based |
| Data Persistence | SQLite + pluggable memory | File-based | Custom (you control) | SQLite + gateway | You implement | File-based | File-based |
| Customizability | High (skills, tools, prompts) | High (tools, prompts) | High (tools, middleware) | High (skills, MCP) | Very high | Medium (plugins, hooks) | Low |
| Plug-and-Play | Easy (pip install) | Easy (npm) | Easy (npm) | Moderate | Moderate | Easy | Easy |
| LLM Flexibility | 200+ via OpenRouter | Any (provider-agnostic) | Any (multi-provider) | Any (multi-provider) | Any | Anthropic-first | OpenAI-first |
Per-Tool Deep Dives
1. Hermes Agent (NousResearch/hermes-agent)
Repository: https://github.com/NousResearch/hermes-agent (30.7K stars)
Headless / Programmatic API
✅ Yes - Python Library
Hermes can be imported and used as a Python library:
```python
from run_agent import AIAgent

agent = AIAgent(
    model="anthropic/claude-sonnet-4",
    quiet_mode=True,
)
response = agent.chat("What is the capital of France?")
```
For full conversation control:
```python
result = agent.run_conversation(
    user_message="Search for recent Python features",
    task_id="my-task-1",
)
# Returns: final_response, messages, task_id
```
CLI Headless: Also supports a `-p` flag via the OpenClaw migration path.
Resource Usage
- Memory: ~500MB+ (Python runtime)
- CPU: Moderate (depends on model)
- Multi-agent: Supports subagents via the `sessions_spawn` tool
- Batch: `batch_runner.py` for parallel processing
Session Management
- SQLite-based session storage (configurable location)
- Pluggable memory providers (v0.7.0+) - built-in, Honcho, or custom
- Conversation history preserved across sessions
- FTS5 search for cross-session recall
- Multi-turn conversations via the `conversation_history` parameter
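The SQLite-plus-FTS5 recall pattern described above can be approximated with Python's standard library alone. This is an illustrative sketch only; the table schema is hypothetical, not Hermes's actual one:

```python
# Approximate sketch of SQLite + FTS5 cross-session recall.
# The schema here is illustrative, not Hermes's actual one.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE messages USING fts5(session_id, role, content)")
conn.execute("INSERT INTO messages VALUES ('task-1', 'user', 'Search for recent Python features')")
conn.execute("INSERT INTO messages VALUES ('task-2', 'assistant', 'Pattern matching landed in 3.10')")

# Full-text search across every stored session
rows = conn.execute(
    "SELECT session_id, content FROM messages WHERE messages MATCH 'Python'"
).fetchall()
print(rows)
```

Because FTS5 indexes every column, the same query surfaces matches regardless of which session produced them, which is the essence of cross-session recall.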
Agent Lifecycle
- Initialize: `AIAgent(model=..., quiet_mode=...)`
- Run: `chat()` or `run_conversation()`
- Terminate: Automatic cleanup; resources released on conversation end
Key options:
- `max_iterations`: 90 by default (configurable)
- `enabled_toolsets` / `disabled_toolsets`: Control available tools
- `skip_memory` / `skip_context_files`: Stateless mode for APIs
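The loop that `max_iterations` caps can be sketched as follows. `call_llm` and `run_tool` are hypothetical stubs standing in for the model and tool layer, not Hermes APIs:

```python
# Sketch of a bounded tool-use loop of the kind max_iterations caps.
# call_llm and run_tool are hypothetical stubs, not Hermes APIs.
def call_llm(messages):
    # Stub: request one tool call, then answer once a tool result is present.
    if any(m["role"] == "tool" for m in messages):
        return {"content": "Paris", "tool_call": None}
    return {"content": "", "tool_call": "lookup_capital"}

def run_tool(name):
    return "France -> Paris"  # stubbed tool result

def run_conversation(user_message, max_iterations=90):
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_iterations):
        reply = call_llm(messages)
        if reply["tool_call"] is None:
            return reply["content"]  # final answer ends the loop early
        messages.append({"role": "tool", "content": run_tool(reply["tool_call"])})
    raise RuntimeError("max_iterations exceeded")

print(run_conversation("What is the capital of France?"))
```

The cap exists as a safety valve: a well-behaved run returns early, while a looping agent is cut off after the configured number of iterations.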
Data Persistence
- SQLite: Session data stored in `~/.hermes/`
- Memory: Pluggable providers (built-in, Honcho, vector stores)
- Trajectories: JSONL format for training data (`save_trajectories=True`)
- API Server: Shared SessionDB for Open WebUI integration
Customizability
- Skills: Procedural memory via `SKILL.md` files
- Tools: Custom tool registration
- Prompts: `ephemeral_system_prompt` for dynamic prompts
- MCP: Model Context Protocol support
- Platform hints: `platform` param for Discord, Telegram, etc.
Performance/Intelligence
- Self-improving: Agent creates skills from experience
- Memory persistence: Learns across sessions
- Credential pooling: Multiple API keys with rotation
- Compression: Context compression to prevent overflow
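At its core, credential pooling is round-robin key selection. A minimal illustration of the rotation idea (not Hermes's implementation):

```python
# Credential pooling: rotate through multiple API keys round-robin.
# Illustrative only, not Hermes's implementation.
import itertools

keys = ["key-a", "key-b", "key-c"]
pool = itertools.cycle(keys)
used = [next(pool) for _ in range(5)]
print(used)
```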
Integration Example (FastAPI)
```python
from fastapi import FastAPI
from pydantic import BaseModel
from run_agent import AIAgent

app = FastAPI()

class ChatRequest(BaseModel):
    message: str
    model: str = "anthropic/claude-sonnet-4"

@app.post("/chat")
async def chat(request: ChatRequest):
    agent = AIAgent(
        model=request.model,
        quiet_mode=True,
        skip_context_files=True,
        skip_memory=True,
    )
    return {"response": agent.chat(request.message)}
```
2. OpenCode (anomalyco/opencode)
Repository: https://github.com/anomalyco/opencode (138.9K stars; note that this is the frontend repo, while the actual agent lives at https://github.com/opencode-ai/opencode with 11.8K stars)
Headless / Programmatic API
✅ Yes - SDK + Server Mode
Server Mode:
```bash
opencode serve [--port 4096] [--hostname "127.0.0.1"]
```
SDK:
```typescript
import { createOpencode, createOpencodeClient } from "@opencode-ai/sdk"

// Spawn a server and get a connected client:
const { client } = await createOpencode()

// Or connect to an already-running server:
// const client = createOpencodeClient({ baseUrl: "http://localhost:4096" })
```
Resource Usage
- Memory: ~200-400MB (Go runtime)
- Architecture: Client/server - TUI is just one client
- Multi-agent: Multiple sessions supported
Session Management
- Full Session API: `session.create()`, `session.list()`, `session.get()`
- `session.prompt()`: send prompts
- `session.abort()`: cancel running sessions
- `session.summarize()`: compress context
Agent Lifecycle
- Start server: `opencode serve`
- Create session: `client.session.create()`
- Prompt: `client.session.prompt()`
- Terminate: Server stays running; sessions are disposable
Data Persistence
- File-based configuration (`opencode.json`)
- Sessions stored in server memory (configurable)
Customizability
- Tools: Custom tool definitions
- Prompts: Custom system prompts
- Structured Output: JSON Schema support
- Provider-agnostic: Any model via configuration
Structured Output Example
```typescript
const result = await client.session.prompt({
  path: { id: sessionId },
  body: {
    parts: [{ type: "text", text: "Research Anthropic" }],
    format: {
      type: "json_schema",
      schema: {
        type: "object",
        properties: {
          company: { type: "string" },
          founded: { type: "number" },
        },
        required: ["company", "founded"],
      },
    },
  },
});
```
3. Pi (badlogic/pi-mono)
Repository: https://github.com/badlogic/pi-mono (33.1K stars)
This is the actual agent runtime that Feynman uses.
Headless / Programmatic API
✅ Yes - Full TypeScript SDK
```typescript
import { Agent } from "@mariozechner/pi-agent-core";
import { getModel } from "@mariozechner/pi-ai";

const agent = new Agent({
  initialState: {
    systemPrompt: "You are a helpful assistant.",
    model: getModel("anthropic", "claude-sonnet-4-20250514"),
  },
});

agent.subscribe((event) => {
  if (event.type === "message_update" && event.assistantMessageEvent.type === "text_delta") {
    process.stdout.write(event.assistantMessageEvent.delta);
  }
});

await agent.prompt("Hello!");
```
Resource Usage
- Memory: ~50-100MB for core agent (very lightweight)
- CPU: Minimal (just orchestration)
- Multi-agent: Create multiple `Agent` instances
- Dependencies: Requires `@mariozechner/pi-ai` for LLM calls
Session Management
- In-memory by default - you control persistence
- Messages array in agent state
- Custom state schema via TypeScript interfaces
- Session ID for provider caching
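Since Pi keeps session state in memory and leaves persistence to the caller, saving and restoring a session reduces to serializing the state object. A generic serialize-and-rehydrate sketch (shown in Python for brevity; this is the pattern, not the pi-agent-core API):

```python
# Pi keeps session state in memory; saving/restoring it is up to you.
# Generic serialize-and-rehydrate pattern, not the pi-agent-core API.
import json, os, tempfile

state = {
    "systemPrompt": "You are a helpful assistant.",
    "messages": [
        {"role": "user", "content": "Hello!"},
        {"role": "assistant", "content": "Hi there."},
    ],
}

path = os.path.join(tempfile.mkdtemp(), "session.json")
with open(path, "w") as f:
    json.dump(state, f)  # persist after each turn

with open(path) as f:
    restored = json.load(f)  # rehydrate on the next process start
print(restored["messages"][-1]["content"])
```

In production the same pattern maps onto whatever store the deployment uses (Postgres, Redis, object storage) with the agent none the wiser.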
Agent Lifecycle
- Create: `new Agent({ initialState })`
- Prompt: `agent.prompt()` or `agent.continue()`
- Events: Subscribe to `agent_start`, `turn_start`, `message_update`, etc.
- Terminate: `agent.reset()` or let it go out of scope
Key options:
- `transformContext`: Prune/compress messages
- `convertToLlm`: Filter custom message types
- `beforeToolCall` / `afterToolCall`: Hooks for tool execution
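The kind of pruning `transformContext` enables can be illustrated generically: keep the system prompt plus the most recent N messages. A language-neutral sketch (Python here; the real hook operates on pi-agent-core's message types):

```python
# Context pruning in the spirit of a transformContext hook: keep the
# system prompt plus the most recent N messages. Generic sketch only.
def transform_context(messages, keep_last=4):
    head = [m for m in messages if m["role"] == "system"]
    tail = [m for m in messages if m["role"] != "system"][-keep_last:]
    return head + tail

msgs = [{"role": "system", "content": "You are helpful."}] + [
    {"role": "user", "content": f"turn {i}"} for i in range(10)
]
pruned = transform_context(msgs)
print(len(pruned))  # one system message plus the four most recent turns
```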
Data Persistence
- You control: Implement persistence via middleware
- State is mutable: `agent.state.messages = newMessages`
- No built-in storage: Freedom to implement as needed
Customizability
- Tools: `AgentTool` with TypeBox schemas
- Middleware: `@dynamic_prompt`, `@wrap_tool_call` decorators
- Message types: Custom via declaration merging
- Thinking budgets: Configurable per provider
Low-Level API
```typescript
import { agentLoop, agentLoopContinue } from "@mariozechner/pi-agent-core";

for await (const event of agentLoop([userMessage], context, config)) {
  console.log(event.type);
}
```
4. OpenClaw (openclaw/openclaw)
Repository: https://github.com/openclaw/openclaw (351.9K stars)
Headless / Programmatic API
✅ Yes - Gateway WebSocket API
OpenClaw has an extensive Gateway WS API:
```bash
openclaw gateway --port 18789 --verbose

# Send a message
openclaw message send --to +1234567890 --message "Hello"

# Agent command
openclaw agent --message "Ship checklist" --thinking high
```
Resource Usage
- Memory: ~500MB+ (Node.js runtime)
- Multi-agent: Multi-agent routing via Gateway
Session Management
- Gateway Sessions: Main session + group isolation
- Session tools: `sessions_list`, `sessions_history`, `sessions_send`
- SQLite-based storage
Agent Lifecycle
- Start Gateway: `openclaw gateway`
- Connect: WebSocket to `ws://127.0.0.1:18789`
- Message: Send via CLI or API
- Persistence: Sessions saved to SQLite
Data Persistence
- SQLite: Gateway session storage
- Workspace: `~/.openclaw/workspace`
- Skills: `~/.openclaw/workspace/skills/<skill>/SKILL.md`
Customizability
- Skills: Full skill system (ClawHub registry)
- MCP: Model Context Protocol support
- Channels: 20+ messaging platforms
5. LangChain Agents (langchain-ai/langchain)
Repository: https://github.com/langchain-ai/langchain
Headless / Programmatic API
✅ Yes - Full Python API
```python
from langchain.agents import create_agent

agent = create_agent("openai:gpt-5", tools=tools)
result = agent.invoke({"messages": [{"role": "user", "content": "Hello"}]})
```
Resource Usage
- Memory: ~100-300MB (Python)
- Flexible: Your code controls resource allocation
- Multi-agent: Via LangGraph subgraphs
Session Management
- Manual: You manage message history in state
- Custom state: Extend the `AgentState` TypedDict
- Memory integration: Optional short-term/long-term memory
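Manual history management means threading the returned messages into the next call yourself. A sketch of that loop, with a stub standing in for the agent that `create_agent()` would return:

```python
# Manual message-history management: carry the returned messages forward
# into the next invoke() call. StubAgent is a stand-in, not a LangChain class.
class StubAgent:
    def invoke(self, state):
        user_turns = sum(1 for m in state["messages"] if m["role"] == "user")
        reply = {"role": "assistant", "content": f"reply #{user_turns}"}
        return {"messages": state["messages"] + [reply]}

agent = StubAgent()
state = {"messages": [{"role": "user", "content": "Hello"}]}
state = agent.invoke(state)  # turn 1
state["messages"].append({"role": "user", "content": "Tell me more"})
state = agent.invoke(state)  # turn 2 sees the full history
print(state["messages"][-1]["content"])
```

The upside of this explicitness is that the same state dict can be stored anywhere between turns, which is what makes the framework persistence-agnostic.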
Agent Lifecycle
- Create: `create_agent(model, tools, system_prompt)`
- Invoke: `agent.invoke({"messages": [...]})`
- Stream: `agent.stream()` for real-time events
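Consuming a stream reduces to iterating events and accumulating deltas. A stubbed sketch of the pattern (the `stream()` function and event shape here are illustrative, not LangChain's actual API):

```python
# Streaming pattern: iterate incremental events and accumulate deltas.
# stream() and its event shape are stubs, not LangChain's actual API.
def stream(prompt):
    for tok in ["Hel", "lo", "!"]:
        yield {"type": "token", "delta": tok}
    yield {"type": "done"}

chunks = []
for event in stream("Hello"):
    if event["type"] == "token":
        chunks.append(event["delta"])
print("".join(chunks))
```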
Data Persistence
- You implement: Full control via middleware
- Optional memory: LangChain memory modules
Customizability
- Very high: Middleware, tools, prompts, dynamic everything
- ReAct pattern: Built-in reasoning + acting loop
- ToolStrategy / ProviderStrategy: Structured output
6. Claude Code (anthropics/claude-code)
Repository: https://github.com/anthropics/claude-code
Headless / Programmatic API
✅ Yes - Agent SDK + CLI
CLI Headless:
```bash
claude -p "Find and fix the bug in auth.py" --allowedTools "Read,Edit,Bash"
claude --bare -p "Summarize" --allowedTools "Read"
```
SDK (Python/TypeScript). A minimal Python sketch using the `claude-agent-sdk` package, whose `query()` entry point streams the agent's messages as it works:

```python
import asyncio
from claude_agent_sdk import query

async def main():
    # query() yields messages as the agent works through the task
    async for message in query(prompt="Fix the bug in auth.py"):
        print(message)

asyncio.run(main())
```
Resource Usage
- Memory: ~200-400MB (Node.js)
- Structured output: JSON with `--output-format json`
- Streaming: `--output-format stream-json`
Session Management
- Session ID: `--resume <session-id>`
- Continue: `--continue` for follow-ups
- Persistence: File-based in `~/.claude/`
Agent Lifecycle
- Run: `claude -p "task"`
- Continue: `claude -p "more" --continue`
- Resume: `claude --resume <session-id>`
Customizability
- Hooks: Pre/post tool use
- Plugins: Custom commands and agents
- MCP: Model Context Protocol
- Settings: JSON config files
7. Codex (openai/codex)
Repository: https://github.com/openai/codex
Headless / Programmatic API
❌ CLI Only - No official programmatic API
```bash
npm install -g @openai/codex
codex "Write a function to sort a list"
```
Resource Usage
- Memory: ~200-300MB (Rust binary)
- Lightweight: Minimal footprint
Session Management
- Limited: Basic session support
- No SDK: Not designed for programmatic control
Customizability
- Low: No official extension API
- Provider-locked: OpenAI-first
Recommendations for User's Use Case
Primary Recommendation: Pi (agent-core)
Why:
- Lightest weight (~50-100MB)
- Full programmatic control via TypeScript
- Event-driven architecture perfect for custom integration
- Feynman already uses it - seamless replacement
- You control persistence - perfect for cloud production
Best for: User wants fine-grained control, lightweight footprint, TypeScript ecosystem
Secondary: Claude Code
Why:
- Production-grade headless mode
- Structured output support
- Official SDK (Python/TypeScript)
- CI/CD integration built-in
- `--bare` mode for consistent CI runs
Best for: Production cloud deployment with structured requirements
Alternative: LangChain
Why:
- Maximum flexibility
- Any LLM provider
- Rich ecosystem
- Full control over agent loop
Best for: User wants to build custom agent behavior from scratch
Sources
Primary Sources (Kept)
- Hermes Agent: https://github.com/NousResearch/hermes-agent - Python library docs, v0.7.0 release notes
- OpenCode SDK: https://opencode.ai/docs/sdk/ - Full TypeScript SDK documentation
- Pi agent-core: https://github.com/badlogic/pi-mono/tree/main/packages/agent - Complete TypeScript API
- Claude Code Headless: https://code.claude.com/docs/en/headless - Official headless documentation
- LangChain Agents: https://docs.langchain.com/oss/python/langchain/agents - Official agents documentation
- OpenClaw: https://github.com/openclaw/openclaw - Gateway architecture
- Codex: https://github.com/openai/codex - CLI tool
Why These Sources
- Official repositories and documentation
- Recent updates (2025-2026)
- Direct technical details from source
- Code examples for integration
Gaps & Limitations
Not Fully Covered
- Benchmark data: No comprehensive benchmarks comparing agent performance across tools
- OpenCode internal architecture: Client/server details somewhat opaque
- Exact resource numbers: Estimates based on typical Python/Node.js/Go runtime sizes
- OpenClaw detailed SDK: Very large project; deep programmatic details require more investigation
- Codex SDK: Currently CLI-only with no programmatic API
Suggested Next Steps
- Test Pi locally: Install `@mariozechner/pi-agent-core` and verify headless operation
- Test Claude Code: Try `claude -p --bare` for the CI use case
- OpenCode server test: Run `opencode serve` and test SDK integration
- Hermes Python lib: Test the programmatic API for comparison
For Cloud Production
- Consider Pi for lightweight containers
- Consider Claude Code for structured output requirements
- Neither forces hard lock-in: Pi is provider-agnostic, and Claude Code, while Anthropic-first, can also run against Bedrock or Vertex AI