kage-research/pi-integration-research.md
2026-04-09 00:39:52 +00:00

Deep Research: Pi (agent-core) Integration for Kugetsu

Executive Summary

This document outlines the research and implementation plan for replacing OpenCode with Pi (agent-core) in the Kugetsu orchestration system. The goal is to reduce memory usage, eliminate session poisoning (context leakage), and improve reliability while maintaining the parallel execution workflow.


1. Current System Analysis

1.1 Current Architecture

┌─────────────────────────────────────────────────────────────────┐
│                          Current Setup                          │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  User (Telegram) ──► Hermes (gateway) ──► Kugetsu (orchestrate) │
│                                │                                │
│                                ▼                                │
│                         ┌─────────────┐                         │
│                         │  OpenCode   │ (Agent)                 │
│                         │  (340MB/ea) │                         │
│                         └─────────────┘                         │
│                                │                                │
│              ┌─────────────────┴─────────────────┐              │
│              ▼                                   ▼              │
│       ┌────────────┐                      ┌────────────┐        │
│       │ Shadow 1   │                      │ Shadow 2   │        │
│       │ (Worktree) │                      │ (Worktree) │        │
│       └────────────┘                      └────────────┘        │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

1.2 Identified Problems

| Problem | Cause | Impact |
| --- | --- | --- |
| Session Poisoning | Context from Agent A bleeds into Agent B | Wrong task execution, confused agents |
| High Memory | ~340MB per OpenCode instance | Max 5 concurrent agents on 4GB RAM |
| Silent Crashes | Process dies without PR/commit | Lost work, no recovery |
| No Structured Output | OpenCode lacks JSON output | Hard to integrate with Hermes |

2. Pi (agent-core) Deep Dive

2.1 Overview

Repository: https://github.com/badlogic/pi-mono
Package: @mariozechner/pi-agent-core
Language: TypeScript
Memory Footprint: ~50-100MB (core only)

2.2 Architecture

Pi is designed as a minimal, extensible agent runtime. Unlike OpenCode or Hermes, it doesn't include:

  • Built-in sub-agent spawning
  • TUI (terminal UI)
  • Session persistence (you control this)
  • MCP support (intentionally)

This is actually beneficial for Kugetsu because:

  • You control exactly how shadows are managed
  • No opinionated session isolation to fight against
  • Full control over context management

2.3 Core API

import { Agent } from "@mariozechner/pi-agent-core";
import { getModel } from "@mariozechner/pi-ai";

const agent = new Agent({
  initialState: {
    systemPrompt: "You are a coding agent.",
    model: getModel("anthropic", "claude-sonnet-4-20250514"),
    tools: [myTool],
    messages: [],
  },
  convertToLlm: (msgs) => msgs.filter(m => 
    ["user", "assistant", "toolResult"].includes(m.role)
  ),
});

// Stream events
agent.subscribe((event) => {
  console.log(event.type);
});

await agent.prompt("Fix the bug in auth.py");

2.4 Key Features for Kugetsu

Event-Driven Architecture

Pi emits rich events for UI integration:

  • agent_start / agent_end
  • turn_start / turn_end
  • message_start / message_update / message_end
  • tool_execution_start / tool_execution_update / tool_execution_end

This is critical for headless UX: you can reconstruct TUI-like behavior by subscribing to these events.

Tool Execution Control

// Block dangerous tools
beforeToolCall: async ({ toolCall, args }) => {
  if (toolCall.name === "bash" && args.command.includes("rm -rf")) {
    return { block: true, reason: "Dangerous command blocked" };
  }
}

// Audit tool results
afterToolCall: async ({ toolCall, result }) => {
  console.log(`Tool ${toolCall.name} executed:`, result);
  return { details: { ...result.details, audited: true } };
}

Context Management

transformContext: async (messages, signal) => {
  // Prune old messages
  if (estimateTokens(messages) > MAX_TOKENS) {
    return pruneOldMessages(messages);
  }
  // Inject external context
  return injectContext(messages);
}
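Pi does not ship `estimateTokens`, `pruneOldMessages`, or `injectContext`; in the snippet above they are placeholders for your own logic. A minimal sketch of the first two, assuming a rough 4-characters-per-token heuristic and oldest-first pruning:

```typescript
type Msg = { role: string; content: string };

// Rough heuristic: ~4 characters per token for English prose/code.
function estimateTokens(messages: Msg[]): number {
  const chars = messages.reduce((sum, m) => sum + m.content.length, 0);
  return Math.ceil(chars / 4);
}

// Drop the oldest messages until the estimate fits the budget,
// always keeping at least the most recent message.
function pruneOldMessages(messages: Msg[], maxTokens: number): Msg[] {
  const pruned = [...messages];
  while (pruned.length > 1 && estimateTokens(pruned) > maxTokens) {
    pruned.shift();
  }
  return pruned;
}
```

A real implementation would use the model's tokenizer and keep tool calls paired with their results; this only illustrates the shape of the `transformContext` contract.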

Steering & Follow-up

// Interrupt agent while running
agent.steer({
  role: "user",
  content: "Stop! Do this instead.",
});

// Queue work after agent finishes
agent.followUp({
  role: "user", 
  content: "Also summarize the result.",
});

3. Integration Design

3.1 Proposed Architecture

┌─────────────────────────────────────────────────────────────────┐
│                         Proposed Setup                          │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  User (Telegram) ──► Hermes (gateway) ──► Kugetsu-Pi (orch)     │
│                                │                                │
│                                ▼                                │
│                     ┌─────────────────────┐                     │
│                     │   Shadow Manager    │                     │
│                     │   (New Component)   │                     │
│                     └─────────────────────┘                     │
│                                │                                │
│           ┌────────────────────┼────────────────────┐           │
│           ▼                    ▼                    ▼           │
│     ┌───────────┐        ┌───────────┐        ┌───────────┐     │
│     │ Shadow 1  │        │ Shadow 2  │        │ Shadow N  │     │
│     │ (Pi Agent)│        │ (Pi Agent)│        │ (Pi Agent)│     │
│     │  ~80MB    │        │  ~80MB    │        │  ~80MB    │     │
│     └───────────┘        └───────────┘        └───────────┘     │
│           │                    │                    │           │
│           ▼                    ▼                    ▼           │
│     ┌───────────┐        ┌───────────┐        ┌───────────┐     │
│     │ Worktree 1│        │ Worktree 2│        │ Worktree N│     │
│     └───────────┘        └───────────┘        └───────────┘     │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

3.2 Shadow Manager Component

The Shadow Manager replaces Kugetsu's OpenCode wrapper with Pi-native logic:

interface ShadowManager {
  // Create a new shadow (sub-agent)
  spawnShadow(config: ShadowConfig): Promise<Shadow>;
  
  // Get existing shadow
  getShadow(id: string): Shadow | undefined;
  
  // List all active shadows
  listShadows(): Shadow[];
  
  // Terminate shadow
  terminateShadow(id: string): Promise<void>;
  
  // Resource management
  getResourceUsage(): ResourceStats;
}

interface Shadow {
  id: string;
  agent: Agent;
  worktree: Worktree;
  state: ShadowState;
  createdAt: Date;
  
  prompt(message: string): Promise<AgentEvent[]>;
  continue(): Promise<AgentEvent[]>;
  abort(): void;
}

3.3 Session Isolation (Fixing Context Poisoning)

The key to preventing session poisoning is strict context boundaries:

class Shadow {
  readonly id: string;
  private agent: Agent;
  
  constructor(config: ShadowConfig) {
    this.id = config.id;
    this.agent = new Agent({
      initialState: {
        systemPrompt: config.systemPrompt,
        model: config.model,
        tools: config.tools,
        messages: [],  // Start empty
      },
      convertToLlm: (msgs) => this.filterAndConvert(msgs),
    });
  }
  
  private filterAndConvert(messages: AgentMessage[]): Message[] {
    // STRICT: Only this shadow's messages
    const myMessages = messages.filter(m => 
      m._shadowId === this.id  // Tag each message with shadow ID
    );
    
    return myMessages.map(m => ({
      role: m.role,
      content: m.content,
    }));
  }
  
  async prompt(message: string): Promise<AgentEvent[]> {
    // Inject shadow ID into message
    const myMessage: AgentMessage = {
      role: "user",
      content: message,
      timestamp: Date.now(),
      _shadowId: this.id,  // Tag with shadow ID
    };
    
    return this.agent.prompt(myMessage);
  }
}

Why This Works:

  • Each message is tagged with its shadow ID
  • convertToLlm filters to only that shadow's messages
  • No cross-contamination possible
  • Even if agent state is shared, LLM only sees isolated context
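The tag-and-filter rule can be checked on its own. A minimal sketch (message shape simplified; `_shadowId` is the custom tag introduced above, not a Pi field):

```typescript
type TaggedMessage = { role: string; content: string; _shadowId: string };

// Reduce a shared message store to a single shadow's view, mirroring
// the filterAndConvert logic above: filter by tag, then strip the tag
// before anything reaches the LLM.
function viewForShadow(
  messages: TaggedMessage[],
  shadowId: string,
): { role: string; content: string }[] {
  return messages
    .filter((m) => m._shadowId === shadowId)
    .map(({ role, content }) => ({ role, content }));
}
```

Even if two shadows wrote into the same array, each shadow's LLM call would only ever see its own tagged messages.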

4. Resource Benchmarks

4.1 Estimated Memory Usage

| Component | OpenCode (Current) | Pi (Proposed) | Savings |
| --- | --- | --- | --- |
| Agent Core | ~340MB | ~80MB | 76% |
| Node.js Runtime | (included) | ~100MB | - |
| Tools/Extensions | Varies | Minimal | - |
| Per Shadow | ~340MB | ~80-100MB | ~70% |

4.2 Capacity Planning

Based on 4GB RAM, 2 CPU cores:

| Scenario | OpenCode | Pi | Improvement |
| --- | --- | --- | --- |
| Max Concurrent | 5 | 15-20 | 3-4x |
| CPU Bound | 5 (contention) | 8-10 | 60-100% |
| Memory Bound | 5 | 40+ | 8x |
Conservative Estimate: 10-15 concurrent shadows with Pi vs 5 with OpenCode

4.3 Scaling Model

Memory Budget: 4GB
Reserve: 512MB (system)
Available: 3.5GB

Pi Shadow: ~80MB base + ~20MB tools/context
Safe limit: 3.5GB / 100MB = 35 shadows

Recommended: 15-20 shadows (leaves headroom)
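The budget arithmetic above reduces to a one-line capacity function (the numbers are this document's estimates, not measurements):

```typescript
// Capacity estimate: (total RAM - system reserve) / per-shadow footprint.
function maxShadows(totalMb: number, reserveMb: number, perShadowMb: number): number {
  return Math.floor((totalMb - reserveMb) / perShadowMb);
}
```

With the budget above, `maxShadows(4096, 512, 100)` gives 35; the 15-20 recommendation keeps roughly half of that as headroom.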

4.4 Scaling Beyond 4GB

| RAM | Recommended Shadows | Notes |
| --- | --- | --- |
| 4GB | 15-20 | Target |
| 8GB | 35-45 | Smooth scaling |
| 16GB | 80-100 | High concurrency |
| 32GB | 180-200 | Dedicated workload |

5. Headless UX Patterns

5.1 The TUI Gap

You mentioned headless lacks "TUI qualities", specifically:

"TUI handles prompt better... if it ends right away with question or any blocker, it just feels not right"

Pi addresses this through its event-driven architecture.

5.2 Prompt Handling in Headless

TUI Pattern: Agent stops → User sees prompt → User responds → Agent continues

Pi Headless Pattern:

class HeadlessUX {
  private pendingPrompts: Map<string, PromptHandler> = new Map();
  
  subscribeToAgent(agent: Agent) {
    agent.subscribe(async (event) => {
      switch (event.type) {
        case "turn_end": {
          // Check if agent is waiting for input
          const isWaiting = await this.checkForPendingPrompt(event);
          if (isWaiting) {
            // Queue for user response via Hermes
            await this.escalateToUser(event);
          }
          break;
        }
          
        case "tool_execution_start":
          // Log what's happening
          this.log(`${event.toolName} starting...`);
          break;
          
        case "tool_execution_end":
          this.log(`${event.toolName} completed`);
          break;
      }
    });
  }
  
  private async checkForPendingPrompt(event: TurnEndEvent): Promise<boolean> {
    // Analyze if agent is blocked waiting for:
    // - Clarification
    // - Confirmation  
    // - Missing information
    // This can be inferred from:
    // - Tool results asking questions
    // - Assistant message content patterns
    // - Custom "prompt" tool results
    return false; // Implement based on your needs
  }
  
  private async escalateToUser(event: TurnEndEvent) {
    // Send to Hermes/Telegram
    await hermes.sendMessage({
      chat_id: this.userId,
      text: `Agent needs input: ${extractQuestion(event)}`,
      keyboard: generateKeyboard(event),
    });
  }
}

5.3 Rich Event Streaming

Reconstruct TUI-like output:

async function streamToTelegram(agent: Agent, chatId: string) {
  const messageBuilder = new TelegramMessageBuilder(chatId);
  
  agent.subscribe(async (event) => {
    switch (event.type) {
      case "turn_start":
        messageBuilder.startTyping();
        break;
        
      case "message_update":
        if (event.assistantMessageEvent.type === "text_delta") {
          messageBuilder.append(event.assistantMessageEvent.delta);
        }
        if (event.assistantMessageEvent.type === "thinking_delta") {
          messageBuilder.setThinking(event.assistantMessageEvent.thinking);
        }
        break;
        
      case "tool_execution_start":
        messageBuilder.appendCode(`🔧 Running ${event.toolName}...`);
        break;
        
      case "tool_execution_end":
        if (event.isError) {
          messageBuilder.append(`❌ Error: ${event.result}`);
        } else {
          messageBuilder.append(`✅ ${event.toolName} done`);
        }
        break;
        
      case "agent_end":
        await messageBuilder.send();
        break;
    }
  });
  
  await agent.prompt(userMessage);
}

5.4 Thinking Time

Pi supports configurable thinking levels:

thinkingBudgets: {
  minimal: 128,
  low: 512,
  medium: 1024,
  high: 2048,
}

In headless, you can expose this as a parameter:

/think high /solve complex-problem
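The `/think` prefix is this document's proposed convention, not a Pi feature, so the headless gateway has to parse it before dispatching. A sketch:

```typescript
type ThinkingLevel = "minimal" | "low" | "medium" | "high";

// Split an optional "/think <level>" prefix off an incoming command,
// returning the requested level (if any) and the remaining prompt text.
function parseThinkPrefix(input: string): { level: ThinkingLevel | null; rest: string } {
  const match = input.match(/^\/think\s+(minimal|low|medium|high)\s+([\s\S]*)$/);
  if (!match) return { level: null, rest: input };
  return { level: match[1] as ThinkingLevel, rest: match[2] };
}
```

The returned level would map onto the `thinkingBudgets` table above; messages without the prefix fall through to a default budget.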

6. Error Handling & Recovery

6.1 Crash Recovery

Where OpenCode "suddenly dies" without a trace, Pi's observability enables checkpoint-based recovery:

class Shadow {
  private checkpointInterval: NodeJS.Timeout;
  
  constructor(config: ShadowConfig) {
    // Save state every 30 seconds
    this.checkpointInterval = setInterval(() => {
      this.saveCheckpoint();
    }, 30_000);
    
    this.agent.subscribe(async (event) => {
      if (event.type === "agent_end") {
        // Successful completion - stop checkpointing and clean up
        clearInterval(this.checkpointInterval);
        this.clearCheckpoint();
      }
    });
  }
  
  private saveCheckpoint() {
    const state = {
      messages: this.agent.state.messages,
      id: this.id,
      timestamp: Date.now(),
    };
    fs.writeFileSync(
      `checkpoints/${this.id}.json`,
      JSON.stringify(state)
    );
  }
  
  static async recover(checkpointId: string): Promise<Shadow> {
    const state = JSON.parse(
      fs.readFileSync(`checkpoints/${checkpointId}.json`, "utf-8")
    );
    
    const shadow = new Shadow({ /* config */ });
    shadow.agent.state.messages = state.messages;
    return shadow;
  }
}

6.2 Tool Execution Safety

// Build the tool set per worktree so the safety checks close over it
function makeSafeTools(worktree: Worktree): AgentTool[] {
  return [
    {
      name: "read",
      label: "Read File",
      description: "Read file contents",
      parameters: Type.Object({ path: Type.String() }),
      execute: async (id, params) => {
        // Path validation: refuse reads outside the shadow's worktree
        if (!isSafePath(params.path, worktree.path)) {
          throw new Error("Path outside worktree");
        }
        return { content: [{ text: await fs.promises.readFile(params.path, "utf-8") }] };
      },
    },
    {
      name: "bash",
      label: "Run Command",
      description: "Run shell command",
      parameters: Type.Object({ command: Type.String() }),
      execute: async (id, params, signal) => {
        // Command allowlist: check the binary, not just the prefix
        const allowed = ["git", "npm", "npx", "pnpm", "make"];
        const binary = params.command.split(/\s+/)[0];
        if (!allowed.includes(binary)) {
          throw new Error("Command not allowed");
        }
        
        // Execute in the shadow's worktree
        return execInWorktree(params.command, worktree, signal);
      },
    },
  ];
}
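`isSafePath` is referenced but never defined in this document; a minimal sketch using `path.resolve` (assumes POSIX-style paths and does not account for symlinks):

```typescript
import * as path from "node:path";

// Resolve the candidate against the worktree root and verify the result
// stays inside it. Catches "../" escapes and absolute paths outside the
// root; following symlinks would additionally require fs.realpath.
function isSafePath(candidate: string, worktreeRoot: string): boolean {
  const root = path.resolve(worktreeRoot);
  const resolved = path.resolve(root, candidate);
  return resolved === root || resolved.startsWith(root + path.sep);
}
```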

7. Implementation Roadmap

Phase 1: Core Integration (Week 1-2)

  • Install @mariozechner/pi-agent-core and @mariozechner/pi-ai
  • Create basic Shadow class with isolated context
  • Implement tool registry (read, write, edit, bash)
  • Connect Hermes message format to Pi prompt

Phase 2: Session Management (Week 2-3)

  • Implement Shadow Manager
  • Worktree creation/cleanup per shadow
  • Checkpoint/save state logic
  • Graceful shutdown handling

Phase 3: Parallel Orchestration (Week 3-4)

  • Task queue with concurrency limits
  • Resource monitoring (memory, CPU)
  • Auto-scale based on load
  • Shadow pool for reuse

Phase 4: UX Enhancement (Week 4-5)

  • Event streaming to Telegram
  • Thinking time configuration
  • Prompt escalation flow
  • Progress indicators

Phase 5: Production Hardening (Week 5-6)

  • Error recovery patterns
  • Logging and observability
  • Rate limiting
  • Security hardening

8. Open Questions

| Question | Notes |
| --- | --- |
| PM Agent location | Run as separate Pi instance or part of Shadow Manager? |
| Message history | Store in Hermes context or Shadow Manager state? |
| Cross-shadow communication | How should PM Agent talk to Coding Agents? |
| Memory monitoring | Use cgroup stats or Node.js process.memoryUsage()? |
| Checkpoint storage | File-based, Redis, or database? |
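On the memory-monitoring question, Node's built-in `process.memoryUsage()` is the zero-dependency option for per-process stats (cgroup stats would be needed for container-wide accounting). A sketch:

```typescript
// Sample the current process's resident set size and flag
// when it crosses a configured budget (in megabytes).
function sampleMemory(budgetMb: number): { rssMb: number; overBudget: boolean } {
  const rssMb = Math.round(process.memoryUsage().rss / (1024 * 1024));
  return { rssMb, overBudget: rssMb > budgetMb };
}
```

Since shadows run in the orchestrator process in this design, this measures them all together; per-shadow accounting would require separate child processes or heap instrumentation.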

9. Recommendations

  1. Start with Pi + Kugetsu (keep Kugetsu, swap OpenCode)
     • Lower risk, proven orchestration layer
     • Focus on Shadow isolation first
  2. Implement strict context tagging to prevent session poisoning
     • Each message carries a shadow ID
     • convertToLlm filters by shadow ID
  3. Target 10-15 concurrent shadows on 4GB RAM
     • Conservative estimate: 10
     • Monitor and adjust
  4. Expose thinking levels in headless for complex tasks
     • /think high prefix for deep reasoning
  5. Build checkpointing early for crash recovery


Appendix: Code Examples

A.1 Minimal Shadow Implementation

import { Agent } from "@mariozechner/pi-agent-core";
import { getModel } from "@mariozechner/pi-ai";

interface ShadowConfig {
  id: string;
  systemPrompt: string;
  model: string;
  worktreePath: string;
  tools: AgentTool[];
}

class Shadow {
  public readonly agent: Agent;
  public readonly id: string;
  public readonly worktreePath: string;
  
  constructor(config: ShadowConfig) {
    this.id = config.id;
    this.worktreePath = config.worktreePath;
    
    this.agent = new Agent({
      initialState: {
        systemPrompt: config.systemPrompt,
        model: getModel("anthropic", config.model),
        tools: config.tools,
        messages: [],
      },
      convertToLlm: (msgs) => {
        // Strict: only user, assistant, toolResult roles
        return msgs
          .filter(m => ["user", "assistant", "toolResult"].includes(m.role))
          .map(m => ({ role: m.role, content: m.content }));
      },
    });
  }
  
  async prompt(message: string) {
    return this.agent.prompt(message);
  }
  
  abort() {
    this.agent.abort();
  }
}

A.2 Shadow Manager with Queue

class ShadowManager {
  private shadows: Map<string, Shadow> = new Map();
  private queue: AsyncQueue<PromptRequest>;
  private maxConcurrent: number;
  private activeCount = 0;
  
  constructor(maxConcurrent = 10) {
    this.maxConcurrent = maxConcurrent;
    this.queue = new AsyncQueue({
      concurrency: maxConcurrent,
      processor: (req) => this.processRequest(req),
    });
  }
  
  async submitRequest(request: PromptRequest) {
    return this.queue.enqueue(request);
  }
  
  private async processRequest(req: PromptRequest): Promise<AgentEvent[]> {
    // Check if shadow exists
    let shadow = this.shadows.get(req.shadowId);
    
    if (!shadow) {
      // Create new shadow
      shadow = new Shadow({
        id: req.shadowId,
        systemPrompt: req.systemPrompt,
        model: req.model,
        worktreePath: req.worktreePath,
        tools: req.tools,
      });
      this.shadows.set(req.shadowId, shadow);
    }
    
    this.activeCount++;
    try {
      return await shadow.prompt(req.message);
    } finally {
      this.activeCount--;
    }
  }
  
  getStats() {
    return {
      active: this.activeCount,
      queued: this.queue.size,
      totalShadows: this.shadows.size,
      maxConcurrent: this.maxConcurrent,
    };
  }
}
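The `AsyncQueue` above is assumed rather than shown; a minimal sketch matching its usage in `ShadowManager` (constructor takes `concurrency` and `processor`, exposes `enqueue` and `size`):

```typescript
// Minimal async queue: runs enqueued items through `processor`
// with at most `concurrency` in flight at once.
class AsyncQueue<T, R = unknown> {
  private pending: Array<() => void> = [];
  private running = 0;
  size = 0; // items waiting, not yet started

  constructor(
    private opts: { concurrency: number; processor: (item: T) => Promise<R> },
  ) {}

  enqueue(item: T): Promise<R> {
    return new Promise<R>((resolve, reject) => {
      this.size++;
      const run = async () => {
        this.size--;
        this.running++;
        try {
          resolve(await this.opts.processor(item));
        } catch (err) {
          reject(err);
        } finally {
          this.running--;
          // A slot just freed: start the next waiting item, if any
          this.pending.shift()?.();
        }
      };
      if (this.running < this.opts.concurrency) run();
      else this.pending.push(run);
    });
  }
}
```

A production version would add timeouts, cancellation via AbortSignal, and per-shadow fairness; this is only the core backpressure mechanism.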