# Research: Agent Frameworks for Programmatic/Headless Usage

## Summary

This research evaluates seven agent frameworks/tools for programmatic/headless usage: Hermes, OpenCode, Pi, OpenClaw, LangChain Agents, Claude Code, and Codex. The evaluation focuses on headless operation, resource usage, session management, agent lifecycle, data persistence, customizability, and integration complexity. **For the user's use case (replacing hermes + opencode with something better for local dev and cloud production)**, the top recommendations are:

- **Pi (agent-core)**: Best for pure programmatic control, with an excellent TypeScript SDK, event-driven architecture, and lightweight footprint
- **Claude Code**: Best for production-grade headless operation, with structured output, CI/CD integration, and official SDK support
- **LangChain**: Best for flexibility and customization if the user wants full control over the agent loop
- **OpenCode**: Strong option if the user wants to keep a similar architecture but needs a better SDK

---

## Comparison Matrix

| Criteria | Hermes | OpenCode | Pi (agent-core) | OpenClaw | LangChain Agents | Claude Code | Codex |
|----------|--------|----------|-----------------|----------|------------------|-------------|-------|
| **Headless/Programmatic** | ✅ Python lib (`AIAgent`) | ✅ SDK + server mode | ✅ Full TypeScript SDK | ✅ Gateway WS API | ✅ `create_agent()` Python | ✅ `-p` flag + SDK | ❌ CLI only |
| **Resource Usage** | ~500MB+ (Python) | ~200-400MB (Go) | ~50-100MB (TS core) | ~500MB+ (Node) | ~100-300MB (Python) | ~200-400MB (Node) | ~200-300MB (Rust) |
| **Multi-agent Support** | ✅ Subagents/spawn | ✅ Multiple sessions | ✅ Multiple instances | ✅ Multi-agent routing | ✅ Via LangGraph | ✅ Multiple sessions | ❌ Single agent |
| **Session Management** | SQLite-based | Session API | In-memory + custom | Gateway sessions | Manual state | `--resume` flag | Session-based |
| **Data Persistence** | SQLite + pluggable memory | File-based | Custom (you control) | SQLite + gateway | You implement | File-based | File-based |
| **Customizability** | High (skills, tools, prompts) | High (tools, prompts) | High (tools, middleware) | High (skills, MCP) | Very high | Medium (plugins, hooks) | Low |
| **Plug-and-Play** | Easy (pip install) | Easy (npm) | Easy (npm) | Moderate | Moderate | Easy | Easy |
| **LLM Flexibility** | 200+ via OpenRouter | Any (provider-agnostic) | Any (multi-provider) | Any (multi-provider) | Any | Anthropic-first | OpenAI-first |

---

## Per-Tool Deep Dives

### 1. Hermes Agent (NousResearch/hermes-agent)

**Repository**: https://github.com/NousResearch/hermes-agent (30.7K stars)

#### Headless / Programmatic API

✅ **Yes - Python Library**

Hermes can be imported and used as a Python library:

```python
from run_agent import AIAgent

agent = AIAgent(
    model="anthropic/claude-sonnet-4",
    quiet_mode=True,
)
response = agent.chat("What is the capital of France?")
```

For full conversation control:

```python
result = agent.run_conversation(
    user_message="Search for recent Python features",
    task_id="my-task-1",
)
# Returns: final_response, messages, task_id
```

**CLI Headless**: Also supports a `-p` flag via the OpenClaw migration path.

#### Resource Usage

- **Memory**: ~500MB+ (Python runtime)
- **CPU**: Moderate (depends on model)
- **Multi-agent**: Supports subagents via the `sessions_spawn` tool
- **Batch**: `batch_runner.py` for parallel processing

#### Session Management

- **SQLite-based** session storage (configurable location)
- **Pluggable memory providers** (v0.7.0+): built-in, Honcho, or custom
- **Conversation history** preserved across sessions
- **FTS5 search** for cross-session recall
- Multi-turn conversations via the `conversation_history` parameter

#### Agent Lifecycle

1. **Initialize**: `AIAgent(model=, quiet_mode=)`
2. **Run**: `chat()` or `run_conversation()`
3. **Terminate**: Automatic cleanup; resources released on conversation end

**Key options**:

- `max_iterations`: 90 by default (configurable)
- `enabled_toolsets` / `disabled_toolsets`: Control available tools
- `skip_memory` / `skip_context_files`: Stateless mode for APIs

#### Data Persistence

- **SQLite**: Session data stored in `~/.hermes/`
- **Memory**: Pluggable providers (built-in, Honcho, vector stores)
- **Trajectories**: JSONL format for training data (`save_trajectories=True`)
- **API Server**: Shared SessionDB for Open WebUI integration
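
Because trajectories are plain newline-delimited JSON, the standard library is enough to write and re-read them for post-processing; a minimal sketch (the record fields are illustrative, not Hermes' actual schema):

```python
import json
import os
import tempfile

# Hypothetical trajectory records; the keys are illustrative, not Hermes' schema.
records = [
    {"task_id": "my-task-1", "role": "user", "content": "Search for recent Python features"},
    {"task_id": "my-task-1", "role": "assistant", "content": "Here is what I found..."},
]

path = os.path.join(tempfile.mkdtemp(), "trajectories.jsonl")
with open(path, "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")  # one JSON object per line

with open(path) as f:
    loaded = [json.loads(line) for line in f]
# loaded round-trips to the same list of dicts as records
```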

#### Customizability

- **Skills**: Procedural memory via `SKILL.md` files
- **Tools**: Custom tool registration
- **Prompts**: `ephemeral_system_prompt` for dynamic prompts
- **MCP**: Model Context Protocol support
- **Platform hints**: `platform` param for Discord, Telegram, etc.

#### Performance/Intelligence

- **Self-improving**: Agent creates skills from experience
- **Memory persistence**: Learns across sessions
- **Credential pooling**: Multiple API keys with rotation
- **Compression**: Context compression to prevent overflow

#### Integration Example (FastAPI)

```python
from fastapi import FastAPI
from pydantic import BaseModel

from run_agent import AIAgent

app = FastAPI()

class ChatRequest(BaseModel):
    message: str
    model: str = "anthropic/claude-sonnet-4"

@app.post("/chat")
async def chat(request: ChatRequest):
    # Stateless per-request agent: skip memory and context files.
    agent = AIAgent(
        model=request.model,
        quiet_mode=True,
        skip_context_files=True,
        skip_memory=True,
    )
    # Note: agent.chat() is synchronous; for production, consider running it
    # in a threadpool so it does not block the event loop.
    return {"response": agent.chat(request.message)}
```

---

### 2. OpenCode (anomalyco/opencode)

**Repository**: https://github.com/anomalyco/opencode (138.9K stars, but this is the frontend repo; the actual agent is https://github.com/opencode-ai/opencode with 11.8K stars)

#### Headless / Programmatic API

✅ **Yes - SDK + Server Mode**

**Server Mode**:

```bash
opencode serve [--port 4096] [--hostname "127.0.0.1"]
```

**SDK**:

```typescript
import { createOpencode, createOpencodeClient } from "@opencode-ai/sdk"

// Spawn a server and get a connected client:
const { client } = await createOpencode()

// Or connect client-only to an already-running server:
const standalone = createOpencodeClient({ baseUrl: "http://localhost:4096" })
```

#### Resource Usage

- **Memory**: ~200-400MB (Go runtime)
- **Architecture**: Client/server - the TUI is just one client
- **Multi-agent**: Multiple sessions supported

#### Session Management

- Full **Session API**:
  - `session.create()`, `session.list()`, `session.get()`
  - `session.prompt()` - send prompts
  - `session.abort()` - cancel running sessions
  - `session.summarize()` - compress context

#### Agent Lifecycle

1. **Start server**: `opencode serve`
2. **Create session**: `client.session.create()`
3. **Prompt**: `client.session.prompt()`
4. **Terminate**: The server stays running; sessions are disposable

#### Data Persistence

- File-based configuration (`opencode.json`)
- Sessions stored in server memory (configurable)

#### Customizability

- **Tools**: Custom tool definitions
- **Prompts**: Custom system prompts
- **Structured Output**: JSON Schema support
- **Provider-agnostic**: Any model via configuration

#### Structured Output Example

```typescript
const result = await client.session.prompt({
  path: { id: sessionId },
  body: {
    parts: [{ type: "text", text: "Research Anthropic" }],
    format: {
      type: "json_schema",
      schema: {
        type: "object",
        properties: {
          company: { type: "string" },
          founded: { type: "number" },
        },
        required: ["company", "founded"],
      },
    },
  },
});
```

---

### 3. Pi (badlogic/pi-mono)

**Repository**: https://github.com/badlogic/pi-mono (33.1K stars)

**This is the actual agent runtime that Feynman uses.**

#### Headless / Programmatic API

✅ **Yes - Full TypeScript SDK**

```typescript
import { Agent } from "@mariozechner/pi-agent-core";
import { getModel } from "@mariozechner/pi-ai";

const agent = new Agent({
  initialState: {
    systemPrompt: "You are a helpful assistant.",
    model: getModel("anthropic", "claude-sonnet-4-20250514"),
  },
});

agent.subscribe((event) => {
  if (event.type === "message_update" && event.assistantMessageEvent.type === "text_delta") {
    process.stdout.write(event.assistantMessageEvent.delta);
  }
});

await agent.prompt("Hello!");
```

#### Resource Usage

- **Memory**: ~50-100MB for the core agent (very lightweight)
- **CPU**: Minimal (just orchestration)
- **Multi-agent**: Create multiple `Agent` instances
- **Dependencies**: Requires `@mariozechner/pi-ai` for LLM calls

#### Session Management

- **In-memory** by default; you control persistence
- **Messages array** in agent state
- **Custom state schema** via TypeScript interfaces
- **Session ID** for provider caching

#### Agent Lifecycle

1. **Create**: `new Agent({ initialState })`
2. **Prompt**: `agent.prompt()` or `agent.continue()`
3. **Events**: Subscribe to `agent_start`, `turn_start`, `message_update`, etc.
4. **Terminate**: `agent.reset()`, or let the instance go out of scope

**Key options**:

- `transformContext`: Prune/compress messages
- `convertToLlm`: Filter custom message types
- `beforeToolCall` / `afterToolCall`: Hooks for tool execution
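
A common `transformContext` strategy is keep-system-plus-recent pruning; the pattern itself is language-independent, so here it is as a self-contained Python sketch (the message shape is illustrative, not Pi's actual types):

```python
def prune_context(messages, keep_last=20):
    """Keep the system prompt plus the most recent messages (illustrative)."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-keep_last:]

# One system message plus 50 user turns:
history = [{"role": "system", "content": "You are helpful."}] + [
    {"role": "user", "content": f"msg {i}"} for i in range(50)
]
pruned = prune_context(history, keep_last=20)
# pruned keeps the system message and the 20 most recent turns
```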

#### Data Persistence

- **You control**: Implement persistence via middleware
- **State is mutable**: `agent.state.messages = newMessages`
- **No built-in storage**: Freedom to implement as needed

#### Customizability

- **Tools**: `AgentTool` with Typebox schemas
- **Middleware**: `@dynamic_prompt`, `@wrap_tool_call` decorators
- **Message types**: Custom via declaration merging
- **Thinking budgets**: Configurable per provider

#### Low-Level API

```typescript
import { agentLoop, agentLoopContinue } from "@mariozechner/pi-agent-core";

for await (const event of agentLoop([userMessage], context, config)) {
  console.log(event.type);
}
```

---

### 4. OpenClaw (openclaw/openclaw)

**Repository**: https://github.com/openclaw/openclaw (351.9K stars)

#### Headless / Programmatic API

✅ **Yes - Gateway WebSocket API**

OpenClaw has an extensive Gateway WS API:

```bash
openclaw gateway --port 18789 --verbose

# Send a message
openclaw message send --to +1234567890 --message "Hello"

# Agent command
openclaw agent --message "Ship checklist" --thinking high
```

#### Resource Usage

- **Memory**: ~500MB+ (Node.js runtime)
- **Multi-agent**: Multi-agent routing via the Gateway

#### Session Management

- **Gateway sessions**: Main session + group isolation
- **Session tools**: `sessions_list`, `sessions_history`, `sessions_send`
- **SQLite-based** storage

#### Agent Lifecycle

1. **Start Gateway**: `openclaw gateway`
2. **Connect**: WebSocket to `ws://127.0.0.1:18789`
3. **Message**: Send via CLI or API
4. **Persistence**: Sessions saved to SQLite

#### Data Persistence

- **SQLite**: Gateway session storage
- **Workspace**: `~/.openclaw/workspace`
- **Skills**: `~/.openclaw/workspace/skills/<skill>/SKILL.md`

#### Customizability

- **Skills**: Full skill system (ClawHub registry)
- **MCP**: Model Context Protocol support
- **Channels**: 20+ messaging platforms

---

### 5. LangChain Agents (langchain-ai/langchain)

**Repository**: https://github.com/langchain-ai/langchain

#### Headless / Programmatic API

✅ **Yes - Full Python API**

```python
from langchain.agents import create_agent

agent = create_agent("openai:gpt-5", tools=tools)
result = agent.invoke({"messages": [{"role": "user", "content": "Hello"}]})
```

#### Resource Usage

- **Memory**: ~100-300MB (Python)
- **Flexible**: Your code controls resource allocation
- **Multi-agent**: Via LangGraph subgraphs

#### Session Management

- **Manual**: You manage message history in state
- **Custom state**: Extend the `AgentState` TypedDict
- **Memory integration**: Optional short-term/long-term memory

#### Agent Lifecycle

1. **Create**: `create_agent(model, tools, system_prompt)`
2. **Invoke**: `agent.invoke({"messages": [...]})`
3. **Stream**: `agent.stream()` for real-time events

#### Data Persistence

- **You implement**: Full control via middleware
- **Optional memory**: LangChain memory modules

#### Customizability

- **Very high**: Middleware, tools, prompts, dynamic everything
- **ReAct pattern**: Built-in reasoning + acting loop
- **ToolStrategy** / **ProviderStrategy**: Structured output
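
Since LangChain also lets you take over the loop entirely, it helps to see how small the core ReAct cycle is; a self-contained Python sketch with a stubbed model and one hypothetical tool (nothing here is LangChain API):

```python
# Minimal hand-rolled ReAct-style loop, showing what create_agent() manages
# for you. The stubbed model and the `add` tool are illustrative only.
def fake_model(messages):
    # A real model decides between a tool call and a final answer.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "add", "args": {"a": 2, "b": 3}}
    return {"final": f"The sum is {messages[-1]['content']}"}

tools = {"add": lambda a, b: a + b}

def run_agent(user_message, max_iterations=5):
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_iterations):
        decision = fake_model(messages)
        if "final" in decision:
            return decision["final"]
        # Execute the requested tool and feed the observation back.
        result = tools[decision["tool"]](**decision["args"])
        messages.append({"role": "tool", "content": result})
    raise RuntimeError("max_iterations exceeded")

answer = run_agent("What is 2 + 3?")
# answer == "The sum is 5"
```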

---

### 6. Claude Code (anthropics/claude-code)

**Repository**: https://github.com/anthropics/claude-code

#### Headless / Programmatic API

✅ **Yes - Agent SDK + CLI**

**CLI Headless**:

```bash
claude -p "Find and fix the bug in auth.py" --allowedTools "Read,Edit,Bash"
claude --bare -p "Summarize" --allowedTools "Read"
```

**SDK** (Python shown; a TypeScript SDK is also available):

```python
import asyncio

# The Python Agent SDK ships as the `claude-agent-sdk` package; option names
# here follow its docs but should be verified against your installed version.
from claude_agent_sdk import query, ClaudeAgentOptions

async def main():
    async for message in query(
        prompt="Fix the bug in auth.py",
        options=ClaudeAgentOptions(allowed_tools=["Read", "Edit", "Bash"]),
    ):
        print(message)

asyncio.run(main())
```

#### Resource Usage

- **Memory**: ~200-400MB (Node.js)
- **Structured output**: JSON via `--output-format json`
- **Streaming**: `--output-format stream-json`
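
`--output-format json` makes the CLI scriptable from any language; a sketch of parsing the envelope (the sample payload is fabricated for illustration, and the exact fields vary by version, so check your own output):

```python
import json

# Fabricated sample of `claude -p ... --output-format json` output.
raw = '{"result": "Bug fixed in auth.py", "session_id": "abc-123"}'

payload = json.loads(raw)
answer = payload["result"]          # the agent's final text
session_id = payload["session_id"]  # can be fed back to --resume
```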

#### Session Management

- **Session ID**: `--resume <session-id>`
- **Continue**: `--continue` for follow-up
- **Persistence**: File-based in `~/.claude/`

#### Agent Lifecycle

1. **Run**: `claude -p "task"`
2. **Continue**: `claude -p "more" --continue`
3. **Resume**: `claude --resume <session-id>`

#### Customizability

- **Hooks**: Pre/post tool use
- **Plugins**: Custom commands and agents
- **MCP**: Model Context Protocol
- **Settings**: JSON config files
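
Project settings live in JSON files such as `.claude/settings.json`; a sketch that generates one programmatically (the `permissions.allow` shape follows Claude Code's documented settings, but verify against your version):

```python
import json
import os
import tempfile

# Sketch: generate a project-level settings file for headless/CI runs.
# The permissions.allow rule format is an assumption to verify.
settings = {
    "permissions": {
        "allow": ["Read", "Edit", "Bash(git diff:*)"],
    }
}

project_dir = tempfile.mkdtemp()
settings_path = os.path.join(project_dir, ".claude", "settings.json")
os.makedirs(os.path.dirname(settings_path), exist_ok=True)
with open(settings_path, "w") as f:
    json.dump(settings, f, indent=2)
```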

---

### 7. Codex (openai/codex)

**Repository**: https://github.com/openai/codex

#### Headless / Programmatic API

❌ **CLI only - no official programmatic API**

```bash
npm install -g @openai/codex
codex "Write a function to sort a list"
```

#### Resource Usage

- **Memory**: ~200-300MB (Rust binary)
- **Lightweight**: Minimal footprint

#### Session Management

- **Limited**: Basic session support
- **No SDK**: Not designed for programmatic control

#### Customizability

- **Low**: No official extension API
- **Provider-locked**: OpenAI-first

---

## Recommendations for User's Use Case

### Primary Recommendation: Pi (agent-core)

**Why**:

- Lightest weight (~50-100MB)
- Full programmatic control via TypeScript
- Event-driven architecture, well suited to custom integration
- Feynman already uses it, making it a seamless replacement
- You control persistence, which suits cloud production

**Best for**: Fine-grained control, a lightweight footprint, and the TypeScript ecosystem

### Secondary: Claude Code

**Why**:

- Production-grade headless mode
- Structured output support
- Official SDK (Python/TypeScript)
- Built-in CI/CD integration
- `--bare` mode for consistent CI runs

**Best for**: Production cloud deployment with structured-output requirements

### Alternative: LangChain

**Why**:

- Maximum flexibility
- Any LLM provider
- Rich ecosystem
- Full control over the agent loop

**Best for**: Building custom agent behavior from scratch

---

## Sources

### Primary Sources

- **Hermes Agent**: https://github.com/NousResearch/hermes-agent - Python library docs, v0.7.0 release notes
- **OpenCode SDK**: https://opencode.ai/docs/sdk/ - Full TypeScript SDK documentation
- **Pi agent-core**: https://github.com/badlogic/pi-mono/tree/main/packages/agent - Complete TypeScript API
- **Claude Code Headless**: https://code.claude.com/docs/en/headless - Official headless documentation
- **LangChain Agents**: https://docs.langchain.com/oss/python/langchain/agents - Official agents documentation
- **OpenClaw**: https://github.com/openclaw/openclaw - Gateway architecture
- **Codex**: https://github.com/openai/codex - CLI tool

### Why These Sources

- Official repositories and documentation
- Recent updates (2025-2026)
- Direct technical details from the source
- Code examples for integration

---

## Gaps & Limitations

### Not Fully Covered

1. **Benchmark data**: No comprehensive benchmarks comparing agent performance across tools
2. **OpenCode internal architecture**: Client/server details are somewhat opaque
3. **Exact resource numbers**: Estimates based on typical Python/Node.js/Go runtime sizes
4. **OpenClaw detailed SDK**: Very large project; deep programmatic details require more investigation
5. **Codex SDK**: Currently CLI-only, with no programmatic API

### Suggested Next Steps

1. **Test Pi locally**: Install `@mariozechner/pi-agent-core` and verify headless operation
2. **Test Claude Code**: Try `claude --bare -p` for the CI use case
3. **OpenCode server test**: Run `opencode serve` and test SDK integration
4. **Hermes Python lib**: Test the programmatic API for comparison

### For Cloud Production

- Consider **Pi** for lightweight containers
- Consider **Claude Code** for structured-output requirements
- Pi supports any LLM provider; Claude Code is Anthropic-first, so weigh provider lock-in accordingly