From 202d8ccfbb806ad47a78819bd09071e7822e1286 Mon Sep 17 00:00:00 2001
From: shokollm <270575765+shokollm@users.noreply.github.com>
Date: Mon, 30 Mar 2026 11:46:12 +0000
Subject: [PATCH] docs: add Phase 3 chat architecture and overview documentation

- Add docs/kugetsu-chat.md:
  - Model B architecture (separate Chat/PM agents)
  - Session types (chat-agent, pm-agent, pm-agent-{repo}, issue sessions)
  - Hybrid message routing
  - PM Agent modes (notify/silent)
  - Context management (local + Gitea fetch on-demand)
  - Example flows
- Add docs/kugetsu.md:
  - Overview of kugetsu system
  - Quick start guide
  - Links to all documentation
- Update docs/kugetsu-architecture.md:
  - Add Phase 3 architecture section
  - Update success criteria
  - Add Phase 3 design decisions
- Add docs/telegram-setup.md:
  - BotFather bot creation guide
  - Security notes
- Remove ssh-keygen.sh (not needed)
---
 docs/kugetsu-architecture.md | 101 +++++++++++----
 docs/kugetsu-chat.md         | 240 +++++++++++++++++++++++++++++++++++
 docs/kugetsu.md              | 111 ++++++++++++++++
 docs/telegram-setup.md       |  96 ++++++++++++++
 4 files changed, 525 insertions(+), 23 deletions(-)
 create mode 100644 docs/kugetsu-chat.md
 create mode 100644 docs/kugetsu.md
 create mode 100644 docs/telegram-setup.md

diff --git a/docs/kugetsu-architecture.md b/docs/kugetsu-architecture.md
index 390857b..ddcc21e 100644
--- a/docs/kugetsu-architecture.md
+++ b/docs/kugetsu-architecture.md
@@ -1,8 +1,10 @@
 # Kugetsu Architecture
 
-**Date:** 2025-03-27
+**Date:** 2026-03-30
 **Status:** In Progress
 
+> **Note:** This document describes the overall Kugetsu architecture. For Phase 3 (Chat) specific details, see [kugetsu-chat.md](kugetsu-chat.md).
+
 ## 1. Overview
 
 ### 1.1 Background: The Name
@@ -90,6 +92,34 @@ Your focus shifts from doing to overseeing — reviewing PRs, approving plans, m
 └─────────────────────────────────────────────────────────────────┘
 ```
 
+### 2.1.1 Phase 3: Chat Interface (Telegram)
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│ Human (Phone) │
+│ Telegram App │
+└─────────────────────────────────────────────────────────────────┘
+ │
+ Telegram Protocol
+ ▼
+┌─────────────────────────────────────────────────────────────────┐
+│ Hermes (Chat Agent Gateway - Phase 3) │
+│ - Receives Telegram messages │
+│ - Natural language interpretation │
+│ - Routes to appropriate agent │
+└─────────────────────────────────────────────────────────────────┘
+ │
+ ┌───────────┴───────────┐
+ │ │
+ ▼ ▼
+ ┌─────────────────┐ ┌─────────────────────────┐
+ │ Chat Agent │ │ PM Agent │
+ │ (casual chat) │◄───►│ (task coordination) │
+ └─────────────────┘ └─────────────────────────┘
+```
+
+See [kugetsu-chat.md](kugetsu-chat.md) for full Phase 3 architecture.
+
 ### 2.2 Agent Types
 
 #### PM Agent (Project Manager)
@@ -289,32 +319,47 @@ When a Coding Agent starts, it:
 
 ## 6. PoC Scope & Success Criteria
 
-### 6.1 Initial PoC Setup
+### 6.1 Phases Summary
 
-- **1 Repository**
-- **1 PM Agent**
-- **Multiple Coding Agents** (up to machine capacity)
-- **Tools**: Hermes (primary), OpenClaw (secondary/test)
+| Phase | Status | Description |
+|-------|--------|-------------|
+| Phase 1 | ✅ Complete | SSH + Tailscale remote access |
+| Phase 1b | ✅ Complete | Tailscale VPN setup |
+| Phase 2 | 📋 Planned | API Interface |
+| Phase 3 | 📋 Planned | Chat Integration (Telegram) |
+| Phase 4 | 📋 Planned | Web Dashboard |
 
-### 6.2 Research Goals
+### 6.2 Current Implementation
 
-| Item | Description |
-|------|-------------|
-| Parallel capacity | How many Coding Agents can run simultaneously on one machine? |
-| Hermes limit | Can we bypass or modify Hermes's 3-task hard limit? |
-| OpenClaw compatibility | Does the architecture work with OpenClaw as well? |
-| Communication patterns | What works, what fails, what needs refinement? |
+- **1 Repository** (kugetsu)
+- **Session Manager**: kugetsu CLI
+- **Agent Framework**: opencode
+- **Access**: SSH + Tailscale (Phase 1)
+- **Communication Hub**: Gitea Issues/PRs
 
-### 6.3 Success Criteria
+### 6.3 Research Goals
 
+| Item | Description | Status |
+|------|-------------|--------|
+| Parallel capacity | How many Coding Agents can run simultaneously on one machine? | Pending |
+| Session management | Does kugetsu properly manage opencode sessions? | ✅ Working |
+| Remote access | Does SSH + Tailscale enable remote work? | ✅ Working |
+| Chat interface | Can Hermes bridge Telegram for mobile UX? | Planned (Phase 3) |
+
+### 6.4 Success Criteria
+
+- [x] kugetsu CLI manages sessions properly
+- [x] Remote access via SSH works
+- [x] Remote access via Tailscale works
 - [ ] PM successfully splits and assigns tasks
 - [ ] Multiple Coding Agents work in parallel
 - [ ] Coding Agents follow guidelines and create valid PRs
 - [ ] PM merges PRs to release branch
 - [ ] Human approves final merge
 - [ ] System handles at least 3 parallel agents
+- [ ] Telegram chat interface for mobile UX
 
-### 6.4 Out of Scope (Phase 1)
+### 6.5 Future Phases
 
 - Multiple PMs coordinating
 - Distributed/multi-machine setup
@@ -327,14 +372,23 @@ When a Coding Agent starts, it:
 
 ### 7.1 Active Research
 
-| Item | Question |
-|------|----------|
-| **Hermes 3-task limit** | Where does this come from? Can it be configured or bypassed? |
-| **OpenClaw parity** | Will the same architecture work with OpenClaw? |
-| **Failure recovery** | What's the best strategy for agent crashes/restarts? |
-| **Context management** | How do agents maintain context across long tasks? |
+| Item | Question | Phase |
+|------|----------|-------|
+| **Hermes 3-task limit** | Where does this come from? Can it be configured or bypassed? | Future |
+| **OpenClaw parity** | Will the same architecture work with OpenClaw? | Future |
+| **Failure recovery** | What's the best strategy for agent crashes/restarts? | All |
+| **Context management** | How do agents maintain context across long tasks? | All |
 
-### 7.2 Design Decisions Pending
+### 7.2 Phase 3 Design Decisions
+
+| Item | Question | Status |
+|------|---------|--------|
+| **Chat Agent implementation** | Hermes as chat agent or separate Telegram bot? | Hermes (Model A/B hybrid) |
+| **PM Agent location** | Separate opencode session or Hermes mode? | Separate session (Model B) |
+| **Session timeout** | How long until inactive sessions are paused? | Pending |
+| **Message history** | Store in Hermes context or external database? | Pending |
+
+### 7.3 Design Decisions Pending
 
 | Item | Question |
 |------|----------|
@@ -360,4 +414,5 @@ When a Coding Agent starts, it:
 
 ## Status History
 
-- 2025-03-27: Initial architecture draft
+- 2026-03-30: Added Phase 3 architecture notes, updated status
+- 2026-03-27: Initial architecture draft
diff --git a/docs/kugetsu-chat.md b/docs/kugetsu-chat.md
new file mode 100644
index 0000000..c6ca50e
--- /dev/null
+++ b/docs/kugetsu-chat.md
@@ -0,0 +1,240 @@
+# Kugetsu Chat Architecture (Phase 3)
+
+**Status:** Planned (Not Yet Implemented)
+**Related Issue:** #19
+
+## Overview
+
+Phase 3 adds Telegram chat interface for mobile/phone UX. Users can interact with their agent team via natural language from any device with Telegram.
+
+## Architecture: Model B (Separate Agents)
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│ User (Phone) │
+│ Telegram App │
+└─────────────────────────────────────────────────────────────────┘
+ │
+ │ Telegram Protocol
+ ▼
+┌─────────────────────────────────────────────────────────────────┐
+│ Hermes (Chat Agent Gateway) │
+│ - Receives messages from Telegram │
+│ - Interprets natural language │
+│ - Routes to appropriate agent session │
+│ - Maintains conversation context │
+└─────────────────────────────────────────────────────────────────┘
+ │
+ ┌─────────────────┴─────────────────┐
+ │ │
+ ▼ ▼
+┌─────────────────────────┐ ┌─────────────────────────────┐
+│ Chat Agent Session │ │ PM Agent Session │
+│ (opencode session) │ │ (opencode session) │
+│ │ │ │
+│ Session ID: chat-agent │ │ Session ID: pm-agent │
+│ │ │ │
+│ - Handles casual chat │ │ - Coordinates tasks │
+│ - Clears context on │◄────────┼─── PM questions to user │
+│ unrelated messages │ │ │
+│ - Short interactions │ │ - Delegates to Dev Agents │
+└─────────────────────────┘ │ - Long-running work │
+ └─────────────────────────────┘
+ │
+ ▼
+ ┌─────────────────────────────────────────┐
+ │ Dev Agent Sessions │
+ │ (opencode sessions via kugetsu) │
+ │ │
+ │ Session IDs: │
+ │ - issue-1-pr │
+ │ - issue-2-research │
+ │ - fix-issue-3 │
+ │ - ... │
+ │ │
+ │ - Work autonomously │
+ │ - Output to Gitea │
+ │ - One issue per session │
+ └─────────────────────────────────────────┘
+ │
+ ▼
+ ┌─────────────────────────────────────────┐
+ │ Gitea │
+ │ Issues, PRs, Comments │
+ │ (Permanent audit trail) │
+ └─────────────────────────────────────────┘
+```
+
+## Session Types
+
+| Session | kugetsu Session ID | Purpose | Lifespan |
+|---------|---------------------|---------|----------|
+| Chat Agent | `chat-agent` | User conversation (Hermes) | Persistent |
+| PM Agent | `pm-agent` | Task coordination | Persistent |
+| PM Agent (repo-specific) | `pm-agent-{repo-name}` | Extends base PM for specific repo | Optional scaling |
+| Dev Agent | `issue-{n}-{type}` | Issue work | Until issue resolved |
+
+### PM Agent Hierarchy
+
+- **Base PM**: `pm-agent` - Generic 1-way/1-door agent
+- **Repo-specific PM**: `pm-agent-{repo-name}` - Extends base PM for specific repo (optional scaling)
+
+## Message Routing (Hybrid - Option 3)
+
+### Routing Rules
+
+| User Message | Route To | Response |
+|--------------|----------|----------|
+| Casual chat | Chat Agent | Direct response |
+| Task request | PM Agent | Task created or clarification needed |
+| Status query | PM Agent | Current status |
+| "PM, be silent" | PM Agent | Mode changed to silent |
+| "PM, notify me" | PM Agent | Mode changed to notify |
+| Clarification | PM → Chat → User | PM asks via Hermes |
+
+### Example Flows
+
+#### Flow 1: Simple Task Request
+
+```
+User: "create a test file for issue #5"
+ │
+ ▼
+Hermes (Chat Gateway)
+ │ Routes to PM
+ ▼
+PM Agent
+ │ Sees clear task
+ │ Creates kugetsu session: kugetsu start github.com/user/repo#5 "create test"
+ ▼
+Dev Agent (issue-5-pr session)
+ │ Does work
+ │ Posts PR to Gitea
+ ▼
+PM Agent
+ │ Task done
+ │ Checks: PM mode = notify?
+ ▼
+Hermes (Chat Gateway)
+ │ "Issue #5 is done! PR created."
+ ▼
+User (Telegram)
+```
+
+#### Flow 2: Task with Clarification
+
+```
+User: "improve the thing"
+ │
+ ▼
+Hermes (Chat Gateway)
+ │ Routes to PM
+ ▼
+PM Agent
+ │ Unclear - what thing? which repo?
+ │ PM sends clarification request
+ ▼
+Hermes (Chat Gateway)
+ │ "Which project did you mean? github.com/user/project or git.fbrns.co/team/core?"
+ ▼
+User (Telegram): "git.fbrns.co/team/core"
+ │
+ ▼
+Hermes (Chat Gateway)
+ │ PM receives clarification
+ │ PM proceeds with task
+ ▼
+...continues as Flow 1...
+```
+
+#### Flow 3: Silent Mode
+
+```
+User: "work on issue #7 silently"
+ │
+ ▼
+Hermes (Chat Gateway)
+ │ Routes to PM
+ ▼
+PM Agent
+ │ Sets mode = silent
+ │ "Okay, I will work silently. Check Gitea for progress."
+ ▼
+...PM works in background...
+ │
+ ▼
+User checks Gitea directly
+ │ Sees PR, comments, progress
+ │
+User: "status"
+ │
+ ▼
+Hermes → PM
+ │ PM responds with status
+ ▼
+User
+```
+
+## PM Agent Modes
+
+| Mode | Behavior | Trigger |
+|------|----------|---------|
+| **Notify** (default) | PM sends completion message | `pm notify` or default |
+| **Silent** | PM works quietly | `pm silent` or `pm be quiet` |
+
+## Implementation Notes
+
+### Hermes as Gateway
+
+Hermes handles:
+- Telegram message reception
+- Natural language interpretation
+- Session routing
+- Response formatting
+
+### opencode Sessions
+
+Each agent runs in its own opencode session via kugetsu:
+- Sessions persist across interactions
+- kugetsu manages session lifecycle
+- Each session has isolated context
+
+### Gitea Integration
+
+All agent work outputs to Gitea:
+- Issue comments for progress
+- PRs for code changes
+- Permanent audit trail
+
+### Context Management
+
+#### Storage
+- **Primary**: Kugetsu session file (local JSON)
+- **Extension**: Gitea comments (fetched on-demand)
+
+#### Fetch Triggers
+| Trigger | When |
+|---------|------|
+| **No context** | Initial load - PM fetches relevant issue/PR comments |
+| **Explicit request** | Agent decides to fetch more context |
+| **Insufficient** | Local context not helpful - like initial case |
+
+#### Context Merge Strategy
+- **Default**: Append new context to existing
+- **Threshold**: Summarize + replace at 40% of model context window (dynamic based on model)
+
+---
+
+## Open Questions
+
+1. **Telegram API vs Bot API**: Use long polling (Bot API) or MTProto (user session)?
+2. **Session timeout**: How long until inactive sessions are paused?
+3. **Message history**: Store in Hermes context or external database?
+
+---
+
+## Related Documentation
+
+- [Telegram Setup Guide](telegram-setup.md)
+- [kugetsu Architecture](kugetsu-architecture.md)
+- [Subagent Workflow](SUBAGENT_WORKFLOW.md)
\ No newline at end of file
diff --git a/docs/kugetsu.md b/docs/kugetsu.md
new file mode 100644
index 0000000..6cf2497
--- /dev/null
+++ b/docs/kugetsu.md
@@ -0,0 +1,111 @@
+# Kugetsu
+
+**Status:** In Development
+
+Kugetsu is an agent orchestration system that enables parallel task execution across multiple repositories through a hierarchical multi-agent architecture.
+
+## Quick Overview
+
+```
+Human (Executive)
+    └── PM Agent (Task Coordinator)
+            ├── Dev Agent A → Issue 1 → PR
+            ├── Dev Agent B → Issue 2 → PR
+            └── Dev Agent C → Issue 3 → PR
+```
+
+Your focus shifts from doing to overseeing — reviewing PRs, approving plans, managing priorities.
+
+## Core Components
+
+| Component | Implementation | Purpose |
+|-----------|---------------|---------|
+| **Session Manager** | `kugetsu` CLI | Manages opencode sessions |
+| **Chat Interface** | Hermes + Telegram | Mobile UX (Phase 3) |
+| **PM Agent** | opencode session | Task coordination |
+| **Dev Agents** | opencode sessions | Execute tasks |
+| **Communication Hub** | Gitea | Issues, PRs, Comments |
+
+## Session Architecture
+
+| Session | kugetsu ID | Purpose |
+|---------|-------------|---------|
+| Base Session | `base` | Initial TUI session for forking |
+| PM Agent | `pm-agent` | Task coordination |
+| Repo PM | `pm-agent-{repo}` | Repo-specific PM (optional) |
+| Dev Agent | `issue-{n}` | Per-issue work |
+
+## Current Capabilities
+
+### Phase 1: Remote Access ✅
+- SSH access to container
+- Tailscale VPN for cross-network access
+- See [docs/kugetsu-setup.md](kugetsu-setup.md)
+
+### Phase 2: API Interface 📋
+- Planned: REST/CLI API for task assignment
+- Status polling
+- Webhook support
+
+### Phase 3: Chat Integration 📋
+- Telegram bot for mobile UX
+- Natural language interaction
+- See [docs/kugetsu-chat.md](kugetsu-chat.md)
+
+### Phase 4: Web Dashboard 📋
+- Visual task board
+- Agent status monitoring
+- Read-only dashboards
+
+## Installation
+
+```bash
+# Clone repository
+git clone https://git.fbrns.co/shoko/kugetsu.git
+
+# Install kugetsu
+bash kugetsu/skills/kugetsu/scripts/kugetsu-install.sh
+
+# Setup SSH (optional)
+bash kugetsu/skills/kugetsu/scripts/sshd-setup.sh
+
+# Setup Tailscale (optional)
+bash kugetsu/skills/kugetsu/scripts/tailscale-setup.sh
+```
+
+## Quick Start
+
+```bash
+# Initialize base session (requires TTY)
+kugetsu init
+
+# Start work on issue
+kugetsu start github.com/user/repo#14 "fix bug"
+
+# Continue later
+kugetsu continue github.com/user/repo#14 "add tests"
+
+# List sessions
+kugetsu list
+```
+
+## Documentation
+
+| Document | Purpose |
+|----------|---------|
+| [kugetsu-architecture.md](kugetsu-architecture.md) | Detailed architecture |
+| [kugetsu-chat.md](kugetsu-chat.md) | Phase 3 chat design |
+| [kugetsu-setup.md](kugetsu-setup.md) | Setup guides |
+| [telegram-setup.md](telegram-setup.md) | Telegram bot setup |
+| [SUBAGENT_WORKFLOW.md](SUBAGENT_WORKFLOW.md) | Subagent execution |
+
+## Priority Model
+
+| Priority | Type |
+|----------|------|
+| 1 | Security |
+| 2 | Bugs |
+| 3 | Features |
+| 4 | Research |
+
+Within each type: Critical > High > Medium > Low
\ No newline at end of file
diff --git a/docs/telegram-setup.md b/docs/telegram-setup.md
new file mode 100644
index 0000000..fee0264
--- /dev/null
+++ b/docs/telegram-setup.md
@@ -0,0 +1,96 @@
+# Telegram Bot Setup Guide
+
+This guide covers creating and configuring a Telegram bot for kugetsu Phase 3 (Chat Integration).
+
+## Create a Telegram Bot
+
+### Step 1: Start BotFather
+
+1. Open Telegram and search for **@BotFather**
+2. Click **Start** to begin
+
+### Step 2: Create New Bot
+
+Send the command:
+```
+/newbot
+```
+
+BotFather will ask for:
+1. **Name** - A human-readable name (e.g., "Kugetsu Bot")
+2. **Username** - Must end in `bot` (e.g., `kugetsu_agent_bot`)
+
+### Step 3: Save Your Token
+
+BotFather will give you a token like:
+```
+1234567890:ABCdefGHIjklMNOpqrSTUvwxyz123456789
+```
+
+**⚠️ Keep this token secret!** It allows access to your bot.
+
+### Step 4: Set Bot Description (Optional)
+
+```
+/setdescription
+```
+Enter a description like: "Kugetsu Chat Agent - Interact with your agent via Telegram"
+
+### Step 5: Set Bot Picture (Optional)
+
+```
+/setuserpic
+```
+Upload a profile picture for the bot.
+
+---
+
+## Configure Hermes for Telegram
+
+*(This section will be expanded when Phase 3 implementation begins)*
+
+### Required Environment Variables
+
+```bash
+TELEGRAM_BOT_TOKEN="your-bot-token-here"
+TELEGRAM_API_ID="your-api-id" # From https://my.telegram.org
+TELEGRAM_API_HASH="your-api-hash" # From https://my.telegram.org
+```
+
+### Hermes Configuration
+
+```yaml
+# hermes/config.yaml
+telegram:
+  enabled: true
+  bot_token: ${TELEGRAM_BOT_TOKEN}
+```
+
+---
+
+## Security Notes
+
+- **Never commit bot tokens** to version control
+- Use environment variables or secrets management
+- Rotate tokens if compromised: `/revoke` in BotFather
+
+---
+
+## Troubleshooting
+
+### Bot Not Responding
+
+1. Check bot token is correct
+2. Verify Hermes is running and connected
+3. Check bot has not been blocked by user
+
+### "Bot was blocked by the user"
+
+The user has blocked your bot. They need to unblock it or start a new chat.
+
+---
+
+## See Also
+
+- [Phase 3: Chat Integration (Issue #19)](../issues/19)
+- [kugetsu Chat Architecture](kugetsu-chat.md)
\ No newline at end of file
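
The environment-variable and security guidance in docs/telegram-setup.md (keep the token out of version control, read it from `TELEGRAM_BOT_TOKEN`) can be illustrated with a minimal sketch. The helper names `load_bot_token` and `redact` are hypothetical and are not Hermes or kugetsu APIs, and the token pattern is an assumption based only on the shape of the BotFather example token shown in the guide.

```python
import os
import re

# Assumption: tokens look like "<numeric bot id>:<secret>", matching the
# BotFather example in the guide. This pattern is illustrative only.
TOKEN_PATTERN = re.compile(r"^\d+:[A-Za-z0-9_-]+$")


def load_bot_token(env=None):
    """Return the token from TELEGRAM_BOT_TOKEN, or raise if missing or malformed."""
    env = os.environ if env is None else env
    token = env.get("TELEGRAM_BOT_TOKEN", "").strip()
    if not token:
        raise RuntimeError("TELEGRAM_BOT_TOKEN is not set")
    if not TOKEN_PATTERN.match(token):
        raise RuntimeError("TELEGRAM_BOT_TOKEN does not look like a BotFather token")
    return token


def redact(token):
    """Keep only the numeric bot id so the value can appear in logs."""
    bot_id, _, _secret = token.partition(":")
    return f"{bot_id}:***"
```

Passing a plain dict as `env` keeps the check testable without touching the real environment; logging `redact(token)` rather than the token itself follows the "keep this token secret" note.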