Research: Agent Frameworks for Programmatic/Headless Usage
Summary
This research evaluates seven agent frameworks/tools for programmatic/headless usage: Hermes, OpenCode, Pi, OpenClaw, LangChain Agents, Claude Code, and Codex. The evaluation focuses on headless operation, resource usage, session management, agent lifecycle, data persistence, customizability, and integration complexity. For the user's use case (replacing hermes + opencode with something better for local dev and cloud production), the top recommendations are:
- Pi (agent-core): Best for pure programmatic control with excellent TypeScript SDK, event-driven architecture, and lightweight footprint
- Claude Code: Best for production-grade headless operation with structured output, CI/CD integration, and official SDK support
- LangChain: Best for flexibility and customization if the user wants full control over the agent loop
- OpenCode: Strong option if they want to stick with a similar architecture but need better SDK
Comparison Matrix
| Criteria | Hermes | OpenCode | Pi (agent-core) | OpenClaw | LangChain Agents | Claude Code | Codex |
|---|---|---|---|---|---|---|---|
| Headless/Programmatic | ✅ Python lib (`AIAgent`) | ✅ SDK + server mode | ✅ Full TypeScript SDK | ✅ Gateway WS API | ✅ `create_agent()` Python | ✅ `-p` flag + SDK | ❌ CLI only |
| Resource Usage | ~500MB+ (Python) | ~200-400MB (Go) | ~50-100MB (TS core) | ~500MB+ (Node) | ~100-300MB (Python) | ~200-400MB (Node) | ~200-300MB (Rust) |
| Multi-agent Support | ✅ Subagents/spawn | ✅ Multiple sessions | ✅ Multiple instances | ✅ Multi-agent routing | ✅ Via LangGraph | ✅ Multiple sessions | ❌ Single agent |
| Session Management | SQLite-based | Session API | In-memory + custom | Gateway sessions | Manual state | `--resume` flag | Session-based |
| Data Persistence | SQLite + pluggable memory | File-based | Custom (you control) | SQLite + gateway | You implement | File-based | File-based |
| Customizability | High (skills, tools, prompts) | High (tools, prompts) | High (tools, middleware) | High (skills, MCP) | Very high | Medium (plugins, hooks) | Low |
| Plug-and-Play | Easy (pip install) | Easy (npm) | Easy (npm) | Moderate | Moderate | Easy | Easy |
| LLM Flexibility | 200+ via OpenRouter | Any (provider-agnostic) | Any (multi-provider) | Any (multi-provider) | Any | Anthropic-first | OpenAI-first |
Per-Tool Deep Dives
1. Hermes Agent (NousResearch/hermes-agent)
Repository: https://github.com/NousResearch/hermes-agent (30.7K stars)
Headless / Programmatic API
✅ Yes - Python Library
Hermes can be imported and used as a Python library:
```python
from run_agent import AIAgent

agent = AIAgent(
    model="anthropic/claude-sonnet-4",
    quiet_mode=True,
)
response = agent.chat("What is the capital of France?")
```
For full conversation control:
```python
result = agent.run_conversation(
    user_message="Search for recent Python features",
    task_id="my-task-1",
)
# Returns: final_response, messages, task_id
```
CLI Headless: Also supports a `-p` flag via the OpenClaw migration path.
Resource Usage
- Memory: ~500MB+ (Python runtime)
- CPU: Moderate (depends on model)
- Multi-agent: Supports subagents via the `sessions_spawn` tool
- Batch: `batch_runner.py` for parallel processing
Session Management
- SQLite-based session storage (configurable location)
- Pluggable memory providers (v0.7.0+) - built-in, Honcho, or custom
- Conversation history preserved across sessions
- FTS5 search for cross-session recall
- Multi-turn conversations via the `conversation_history` parameter
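The SQLite-plus-FTS5 recall pattern described above can be approximated with Python's standard library alone. This is an illustrative sketch only; the table schema is hypothetical, not Hermes's actual one:

```python
# Approximate sketch of SQLite + FTS5 cross-session recall.
# The schema here is illustrative, not Hermes's actual one.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE messages USING fts5(session_id, role, content)")
conn.execute("INSERT INTO messages VALUES ('task-1', 'user', 'Search for recent Python features')")
conn.execute("INSERT INTO messages VALUES ('task-2', 'assistant', 'Pattern matching landed in 3.10')")

# Full-text search across every stored session
rows = conn.execute(
    "SELECT session_id, content FROM messages WHERE messages MATCH 'Python'"
).fetchall()
print(rows)
```

Because FTS5 indexes every column, the same query surfaces matches regardless of which session produced them, which is the essence of cross-session recall.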
Agent Lifecycle
- Initialize: `AIAgent(model=..., quiet_mode=...)`
- Run: `chat()` or `run_conversation()`
- Terminate: Automatic cleanup; resources released on conversation end
Key options:
- `max_iterations`: 90 by default (configurable)
- `enabled_toolsets` / `disabled_toolsets`: Control available tools
- `skip_memory` / `skip_context_files`: Stateless mode for APIs
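The loop that `max_iterations` caps can be sketched as follows. `call_llm` and `run_tool` are hypothetical stubs standing in for the model and tool layer, not Hermes APIs:

```python
# Sketch of a bounded tool-use loop of the kind max_iterations caps.
# call_llm and run_tool are hypothetical stubs, not Hermes APIs.
def call_llm(messages):
    # Stub: request one tool call, then answer once a tool result is present.
    if any(m["role"] == "tool" for m in messages):
        return {"content": "Paris", "tool_call": None}
    return {"content": "", "tool_call": "lookup_capital"}

def run_tool(name):
    return "France -> Paris"  # stubbed tool result

def run_conversation(user_message, max_iterations=90):
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_iterations):
        reply = call_llm(messages)
        if reply["tool_call"] is None:
            return reply["content"]  # final answer ends the loop early
        messages.append({"role": "tool", "content": run_tool(reply["tool_call"])})
    raise RuntimeError("max_iterations exceeded")

print(run_conversation("What is the capital of France?"))
```

The cap exists as a safety valve: a well-behaved run returns early, while a looping agent is cut off after the configured number of iterations.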
Data Persistence
- SQLite: Session data stored in `~/.hermes/`
- Memory: Pluggable providers (built-in, Honcho, vector stores)
- Trajectories: JSONL format for training data (`save_trajectories=True`)
- API Server: Shared SessionDB for Open WebUI integration
Customizability
- Skills: Procedural memory via `SKILL.md` files
- Tools: Custom tool registration
- Prompts: `ephemeral_system_prompt` for dynamic prompts
- MCP: Model Context Protocol support
- Platform hints: `platform` param for Discord, Telegram, etc.
Performance/Intelligence
- Self-improving: Agent creates skills from experience
- Memory persistence: Learns across sessions
- Credential pooling: Multiple API keys with rotation
- Compression: Context compression to prevent overflow
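At its core, credential pooling is round-robin key selection. A minimal illustration of the rotation idea (not Hermes's implementation):

```python
# Credential pooling: rotate through multiple API keys round-robin.
# Illustrative only, not Hermes's implementation.
import itertools

keys = ["key-a", "key-b", "key-c"]
pool = itertools.cycle(keys)
used = [next(pool) for _ in range(5)]
print(used)
```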
Integration Example (FastAPI)
```python
from fastapi import FastAPI
from pydantic import BaseModel
from run_agent import AIAgent

app = FastAPI()

class ChatRequest(BaseModel):
    message: str
    model: str = "anthropic/claude-sonnet-4"

@app.post("/chat")
async def chat(request: ChatRequest):
    agent = AIAgent(
        model=request.model,
        quiet_mode=True,
        skip_context_files=True,
        skip_memory=True,
    )
    return {"response": agent.chat(request.message)}
```
2. OpenCode (anomalyco/opencode)
Repository: https://github.com/anomalyco/opencode (138.9K stars; note that this is the frontend repo, while the actual agent lives at https://github.com/opencode-ai/opencode with 11.8K stars)
Headless / Programmatic API
✅ Yes - SDK + Server Mode
Server Mode:
```bash
opencode serve [--port 4096] [--hostname "127.0.0.1"]
```
SDK:
```typescript
import { createOpencode, createOpencodeClient } from "@opencode-ai/sdk"

// Spawn a server and get a connected client:
const { client } = await createOpencode()

// Or connect to an already-running server:
// const client = createOpencodeClient({ baseUrl: "http://localhost:4096" })
```
Resource Usage
- Memory: ~200-400MB (Go runtime)
- Architecture: Client/server - TUI is just one client
- Multi-agent: Multiple sessions supported
Session Management
- Full Session API: `session.create()`, `session.list()`, `session.get()`
- `session.prompt()`: send prompts
- `session.abort()`: cancel running sessions
- `session.summarize()`: compress context
Agent Lifecycle
- Start server: `opencode serve`
- Create session: `client.session.create()`
- Prompt: `client.session.prompt()`
- Terminate: Server stays running; sessions are disposable
Data Persistence
- File-based configuration (`opencode.json`)
- Sessions stored in server memory (configurable)
Customizability
- Tools: Custom tool definitions
- Prompts: Custom system prompts
- Structured Output: JSON Schema support
- Provider-agnostic: Any model via configuration
Structured Output Example
```typescript
const result = await client.session.prompt({
  path: { id: sessionId },
  body: {
    parts: [{ type: "text", text: "Research Anthropic" }],
    format: {
      type: "json_schema",
      schema: {
        type: "object",
        properties: {
          company: { type: "string" },
          founded: { type: "number" },
        },
        required: ["company", "founded"],
      },
    },
  },
});
```
3. Pi (badlogic/pi-mono)
Repository: https://github.com/badlogic/pi-mono (33.1K stars)
This is the actual agent runtime that Feynman uses.
Headless / Programmatic API
✅ Yes - Full TypeScript SDK
```typescript
import { Agent } from "@mariozechner/pi-agent-core";
import { getModel } from "@mariozechner/pi-ai";

const agent = new Agent({
  initialState: {
    systemPrompt: "You are a helpful assistant.",
    model: getModel("anthropic", "claude-sonnet-4-20250514"),
  },
});

agent.subscribe((event) => {
  if (event.type === "message_update" && event.assistantMessageEvent.type === "text_delta") {
    process.stdout.write(event.assistantMessageEvent.delta);
  }
});

await agent.prompt("Hello!");
```
Resource Usage
- Memory: ~50-100MB for core agent (very lightweight)
- CPU: Minimal (just orchestration)
- Multi-agent: Create multiple `Agent` instances
- Dependencies: Requires `@mariozechner/pi-ai` for LLM calls
Session Management
- In-memory by default - you control persistence
- Messages array in agent state
- Custom state schema via TypeScript interfaces
- Session ID for provider caching
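Since Pi keeps session state in memory and leaves persistence to the caller, saving and restoring a session reduces to serializing the state object. A generic serialize-and-rehydrate sketch (shown in Python for brevity; this is the pattern, not the pi-agent-core API):

```python
# Pi keeps session state in memory; saving/restoring it is up to you.
# Generic serialize-and-rehydrate pattern, not the pi-agent-core API.
import json, os, tempfile

state = {
    "systemPrompt": "You are a helpful assistant.",
    "messages": [
        {"role": "user", "content": "Hello!"},
        {"role": "assistant", "content": "Hi there."},
    ],
}

path = os.path.join(tempfile.mkdtemp(), "session.json")
with open(path, "w") as f:
    json.dump(state, f)  # persist after each turn

with open(path) as f:
    restored = json.load(f)  # rehydrate on the next process start
print(restored["messages"][-1]["content"])
```

In production the same pattern maps onto whatever store the deployment uses (Postgres, Redis, object storage) with the agent none the wiser.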
Agent Lifecycle
- Create: `new Agent({ initialState })`
- Prompt: `agent.prompt()` or `agent.continue()`
- Events: Subscribe to `agent_start`, `turn_start`, `message_update`, etc.
- Terminate: `agent.reset()` or let it go out of scope
Key options:
- `transformContext`: Prune/compress messages
- `convertToLlm`: Filter custom message types
- `beforeToolCall` / `afterToolCall`: Hooks for tool execution
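The kind of pruning `transformContext` enables can be illustrated generically: keep the system prompt plus the most recent N messages. A language-neutral sketch (Python here; the real hook operates on pi-agent-core's message types):

```python
# Context pruning in the spirit of a transformContext hook: keep the
# system prompt plus the most recent N messages. Generic sketch only.
def transform_context(messages, keep_last=4):
    head = [m for m in messages if m["role"] == "system"]
    tail = [m for m in messages if m["role"] != "system"][-keep_last:]
    return head + tail

msgs = [{"role": "system", "content": "You are helpful."}] + [
    {"role": "user", "content": f"turn {i}"} for i in range(10)
]
pruned = transform_context(msgs)
print(len(pruned))  # one system message plus the four most recent turns
```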
Data Persistence
- You control: Implement persistence via middleware
- State is mutable: `agent.state.messages = newMessages`
- No built-in storage: Freedom to implement as needed
Customizability
- Tools: `AgentTool` with TypeBox schemas
- Middleware: `@dynamic_prompt`, `@wrap_tool_call` decorators
- Message types: Custom via declaration merging
- Thinking budgets: Configurable per provider
Low-Level API
```typescript
import { agentLoop, agentLoopContinue } from "@mariozechner/pi-agent-core";

for await (const event of agentLoop([userMessage], context, config)) {
  console.log(event.type);
}
```
4. OpenClaw (openclaw/openclaw)
Repository: https://github.com/openclaw/openclaw (351.9K stars)
Headless / Programmatic API
✅ Yes - Gateway WebSocket API
OpenClaw has an extensive Gateway WS API:
```bash
openclaw gateway --port 18789 --verbose

# Send a message
openclaw message send --to +1234567890 --message "Hello"

# Agent command
openclaw agent --message "Ship checklist" --thinking high
```
Resource Usage
- Memory: ~500MB+ (Node.js runtime)
- Multi-agent: Multi-agent routing via Gateway
Session Management
- Gateway Sessions: Main session + group isolation
- Session tools: `sessions_list`, `sessions_history`, `sessions_send`
- SQLite-based storage
Agent Lifecycle
- Start Gateway: `openclaw gateway`
- Connect: WebSocket to `ws://127.0.0.1:18789`
- Message: Send via CLI or API
- Persistence: Sessions saved to SQLite
Data Persistence
- SQLite: Gateway session storage
- Workspace: `~/.openclaw/workspace`
- Skills: `~/.openclaw/workspace/skills/<skill>/SKILL.md`
Customizability
- Skills: Full skill system (ClawHub registry)
- MCP: Model Context Protocol support
- Channels: 20+ messaging platforms
5. LangChain Agents (langchain-ai/langchain)
Repository: https://github.com/langchain-ai/langchain
Headless / Programmatic API
✅ Yes - Full Python API
```python
from langchain.agents import create_agent

agent = create_agent("openai:gpt-5", tools=tools)
result = agent.invoke({"messages": [{"role": "user", "content": "Hello"}]})
```
Resource Usage
- Memory: ~100-300MB (Python)
- Flexible: Your code controls resource allocation
- Multi-agent: Via LangGraph subgraphs
Session Management
- Manual: You manage message history in state
- Custom state: Extend the `AgentState` TypedDict
- Memory integration: Optional short-term/long-term memory
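Manual history management means threading the returned messages into the next call yourself. A sketch of that loop, with a stub standing in for the agent that `create_agent()` would return:

```python
# Manual message-history management: carry the returned messages forward
# into the next invoke() call. StubAgent is a stand-in, not a LangChain class.
class StubAgent:
    def invoke(self, state):
        user_turns = sum(1 for m in state["messages"] if m["role"] == "user")
        reply = {"role": "assistant", "content": f"reply #{user_turns}"}
        return {"messages": state["messages"] + [reply]}

agent = StubAgent()
state = {"messages": [{"role": "user", "content": "Hello"}]}
state = agent.invoke(state)  # turn 1
state["messages"].append({"role": "user", "content": "Tell me more"})
state = agent.invoke(state)  # turn 2 sees the full history
print(state["messages"][-1]["content"])
```

The upside of this explicitness is that the same state dict can be stored anywhere between turns, which is what makes the framework persistence-agnostic.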
Agent Lifecycle
- Create: `create_agent(model, tools, system_prompt)`
- Invoke: `agent.invoke({"messages": [...]})`
- Stream: `agent.stream()` for real-time events
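Consuming a stream reduces to iterating events and accumulating deltas. A stubbed sketch of the pattern (the `stream()` function and event shape here are illustrative, not LangChain's actual API):

```python
# Streaming pattern: iterate incremental events and accumulate deltas.
# stream() and its event shape are stubs, not LangChain's actual API.
def stream(prompt):
    for tok in ["Hel", "lo", "!"]:
        yield {"type": "token", "delta": tok}
    yield {"type": "done"}

chunks = []
for event in stream("Hello"):
    if event["type"] == "token":
        chunks.append(event["delta"])
print("".join(chunks))
```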
Data Persistence
- You implement: Full control via middleware
- Optional memory: LangChain memory modules
Customizability
- Very high: Middleware, tools, prompts, dynamic everything
- ReAct pattern: Built-in reasoning + acting loop
- ToolStrategy / ProviderStrategy: Structured output
6. Claude Code (anthropics/claude-code)
Repository: https://github.com/anthropics/claude-code
Headless / Programmatic API
✅ Yes - Agent SDK + CLI
CLI Headless:
```bash
claude -p "Find and fix the bug in auth.py" --allowedTools "Read,Edit,Bash"
claude --bare -p "Summarize" --allowedTools "Read"
```
SDK (Python/TypeScript). A minimal Python sketch using the `claude-agent-sdk` package, whose `query()` entry point streams the agent's messages as it works:

```python
import asyncio
from claude_agent_sdk import query

async def main():
    # query() yields messages as the agent works through the task
    async for message in query(prompt="Fix the bug in auth.py"):
        print(message)

asyncio.run(main())
```
Resource Usage
- Memory: ~200-400MB (Node.js)
- Structured output: JSON with `--output-format json`
- Streaming: `--output-format stream-json`
Session Management
- Session ID: `--resume <session-id>`
- Continue: `--continue` for follow-ups
- Persistence: File-based in `~/.claude/`
Agent Lifecycle
- Run: `claude -p "task"`
- Continue: `claude -p "more" --continue`
- Resume: `claude --resume <session-id>`
Customizability
- Hooks: Pre/post tool use
- Plugins: Custom commands and agents
- MCP: Model Context Protocol
- Settings: JSON config files
7. Codex (openai/codex)
Repository: https://github.com/openai/codex
Headless / Programmatic API
❌ CLI Only - No official programmatic API
```bash
npm install -g @openai/codex
codex "Write a function to sort a list"
```
Resource Usage
- Memory: ~200-300MB (Rust binary)
- Lightweight: Minimal footprint
Session Management
- Limited: Basic session support
- No SDK: Not designed for programmatic control
Customizability
- Low: No official extension API
- Provider-locked: OpenAI-first
Recommendations for User's Use Case
Primary Recommendation: Pi (agent-core)
Why:
- Lightest weight (~50-100MB)
- Full programmatic control via TypeScript
- Event-driven architecture perfect for custom integration
- Feynman already uses it - seamless replacement
- You control persistence - perfect for cloud production
Best for: User wants fine-grained control, lightweight footprint, TypeScript ecosystem
Secondary: Claude Code
Why:
- Production-grade headless mode
- Structured output support
- Official SDK (Python/TypeScript)
- CI/CD integration built-in
- `--bare` mode for consistent CI runs
Best for: Production cloud deployment with structured requirements
Alternative: LangChain
Why:
- Maximum flexibility
- Any LLM provider
- Rich ecosystem
- Full control over agent loop
Best for: User wants to build custom agent behavior from scratch
Sources
Primary Sources (Kept)
- Hermes Agent: https://github.com/NousResearch/hermes-agent - Python library docs, v0.7.0 release notes
- OpenCode SDK: https://opencode.ai/docs/sdk/ - Full TypeScript SDK documentation
- Pi agent-core: https://github.com/badlogic/pi-mono/tree/main/packages/agent - Complete TypeScript API
- Claude Code Headless: https://code.claude.com/docs/en/headless - Official headless documentation
- LangChain Agents: https://docs.langchain.com/oss/python/langchain/agents - Official agents documentation
- OpenClaw: https://github.com/openclaw/openclaw - Gateway architecture
- Codex: https://github.com/openai/codex - CLI tool
Why These Sources
- Official repositories and documentation
- Recent updates (2025-2026)
- Direct technical details from source
- Code examples for integration
Gaps & Limitations
Not Fully Covered
- Benchmark data: No comprehensive benchmarks comparing agent performance across tools
- OpenCode internal architecture: Client/server details somewhat opaque
- Exact resource numbers: Estimates based on typical Python/Node.js/Go runtime sizes
- OpenClaw detailed SDK: Very large project; deep programmatic details require more investigation
- Codex SDK: Currently CLI-only with no programmatic API
Suggested Next Steps
- Test Pi locally: Install `@mariozechner/pi-agent-core` and verify headless operation
- Test Claude Code: Try `claude -p --bare` for the CI use case
- OpenCode server test: Run `opencode serve` and test SDK integration
- Hermes Python lib: Test the programmatic API for comparison
For Cloud Production
- Consider Pi for lightweight containers
- Consider Claude Code for structured output requirements
- Neither forces hard lock-in: Pi is provider-agnostic, and Claude Code, while Anthropic-first, can also run against Bedrock or Vertex AI