Add concurrent agent limiting to kugetsu CLI

- Add MAX_CONCURRENT_AGENTS (default: 3) to limit concurrent agents
- Implement acquire_agent_slot() and release_agent_slot() with flock
- Wrap cmd_start, cmd_continue, and cmd_delegate with slot management
- cmd_delegate holds slot until background task completes (fire-and-forget + blocking)
- Add basic concurrency tests to test suite
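
For orientation, a minimal sketch of how flock-based slot management could look. The function names come from the commit message, but the bodies are an assumption: the actual script diff is not shown here. `SLOT_DIR` and `AGENT_SLOT_FD` are hypothetical names, and the sketch assumes bash 4.1+ and the util-linux `flock` utility.

```bash
#!/usr/bin/env bash
# Sketch only: one lock file per slot; whoever holds the lock holds the slot.
MAX_CONCURRENT_AGENTS="${MAX_CONCURRENT_AGENTS:-3}"
SLOT_DIR="${SLOT_DIR:-${TMPDIR:-/tmp}/kugetsu-slots}"   # hypothetical location

acquire_agent_slot() {
    mkdir -p "$SLOT_DIR"
    local i fd
    for ((i = 1; i <= MAX_CONCURRENT_AGENTS; i++)); do
        # Open the slot file on a fresh descriptor chosen by bash.
        exec {fd}>"$SLOT_DIR/slot.$i"
        # Non-blocking exclusive lock: succeeds only if the slot is free.
        if flock -n "$fd"; then
            AGENT_SLOT_FD="$fd"
            return 0
        fi
        exec {fd}>&-   # slot busy, close the descriptor and try the next one
    done
    return 1           # every slot is taken
}

release_agent_slot() {
    [ -n "${AGENT_SLOT_FD:-}" ] || return 0
    flock -u "$AGENT_SLOT_FD"
    exec {AGENT_SLOT_FD}>&-
    AGENT_SLOT_FD=""
}
```

Because the kernel drops the lock when the holding process exits, a crashed agent cannot leak a slot in this design, which is one reason to prefer it over counting files.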
2026-03-31 07:25:26 +00:00 · 2026-03-31 07:25:24 +00:00
20 changed files with 876 additions and 2447 deletions
--- a/.github/ISSUES/fix-queue-daemon-excess-agents.md
+++ b/.github/ISSUES/fix-queue-daemon-excess-agents.md
@@ -1,67 +0,0 @@
 # Fix: Queue daemon spawning excess agents due to race condition
 ## Problem
 When enqueueing multiple tasks (e.g., 6 tasks), the queue daemon was spawning many more subagents than expected, eventually exhausting container memory.
 **Root Cause:** The combination of:
 1. `process_queue()` calling `opencode run` directly instead of `kugetsu start`, bypassing all concurrency logic
 2. `count_active_dev_sessions()` counting `pm-agent.json` toward `MAX_CONCURRENT_AGENTS`, reducing effective dev agent slots
 3. No atomic locking around session count check + session file creation (TOCTOU race condition)
 4. Background spawning of multiple concurrent processes in `process_queue()`
 **Expected behavior:** With `MAX_CONCURRENT_AGENTS=3` and 6 tasks:
 - Tasks should be processed sequentially via `kugetsu start`
 - Only 3 dev agents should run at a time
 - Tasks should queue and wait for slots to free up
 ## Solution
 ### 1. `count_active_dev_sessions()` - Exclude pm-agent
 Only count actual dev agent session files (exclude `pm-agent.json`).
 ### 2. `process_queue()` - Call `kugetsu start` directly + retry logic
 - Call `kugetsu start` directly (foreground, sequential) instead of spawning `opencode run` background process
 - Dynamic batch size = available slots (removes need for `QUEUE_DAEMON_BATCH_SIZE`)
 - Retry logic (max 3 attempts) on failure
 - On failure: cleanup worktree/session and revert to `pending` state
 - Save `fork_pid` to queue item for timeout handling
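
The retry behaviour listed above can be sketched as follows. This is an illustration under assumptions, not the daemon's code: `run_with_retry` is a hypothetical helper, and the cleanup and revert-to-`pending` steps are left to the caller.

```bash
#!/usr/bin/env bash
# Sketch only: run a command up to N times, stopping at the first success.
run_with_retry() {
    local max_attempts="$1"
    shift
    local attempt
    for ((attempt = 1; attempt <= max_attempts; attempt++)); do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt/$max_attempts failed: $*" >&2
    done
    # All attempts failed; the caller is expected to clean up the
    # worktree/session and put the queue item back into `pending`.
    return 1
}
```

A caller might use it as `run_with_retry 3 kugetsu start "$issue_ref" "$message"` and handle the non-zero exit by reverting the queue item.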
 ### 3. `cmd_start()` - Add flock
 - Add flock around critical section (count check + fork)
 - Track `fork_pid` for queue item timeout handling
 ### 4. Notification System
 New notification types:
 | Event | Type |
 |-------|------|
 | Task enqueued | `task_queued` |
 | Task dequeued | `task_dequeued` |
 | Task started | `task_started` |
 | Task completed | `task_completed` |
 | Task error | `task_error` |
 ### 5. Config
 - Remove `QUEUE_DAEMON_BATCH_SIZE` (no longer needed - batch size is now dynamic)
 ## Notification Flow
 | Event | Location | Type |
 |-------|----------|------|
 | Task enqueued | `enqueue_task()` | `task_queued` |
 | Task dequeued | `process_queue()` after state change to `notified` | `task_dequeued` |
 | Task started | `cmd_start()` after session file created | `task_started` |
 | Task completed | `update_queue_item_state()` | `task_completed` |
 | Task error | `update_queue_item_state()` | `task_error` |
 ## Out of Scope
 - Re-check loop in cmd_start (checking if session DB is reliable) - deferred to separate research issue
 - Buffer mechanism for excess forking (safety failsafe only)
 ## Status
 - [x] Issue created
 - [x] Implementation
 - [x] PR created (#147)
 - [ ] Merged
--- a/.gitignore
+++ b/.gitignore
@@ -1,6 +0,0 @@
 __pycache__/
 */__pycache__/
 results/
 */results/
 *.pyc
--- a/docs/agent-concurrency-benchmark.md
+++ b/docs/agent-concurrency-benchmark.md
@@ -1,123 +0,0 @@
 # Agent Concurrency Benchmark
 **Date:** 2026-04-01  
 **Hardware:** 8GB RAM, 16 CPU cores
 ## Test Results
 | Limit (PM+Dev) | Status | Rejection Test | Notes |
 |----------------|--------|---------------|-------|
 | 1 | ✓ Works | 1 dev rejected (PM=1, at limit) | Too strict for normal use |
 | 3 | ✓ Works | 4th dev rejected (PM + 3 devs = 4, at limit) | Recommended |
 | 5 | ✓ Works | 6th dev rejected (PM + 5 devs = 6, at limit) | Works, monitor memory |
 ## Architecture
 OpenCode is a **cloud client** - agents run on OpenCode's server (MiniMax), not locally.
 ```
 ┌─────────────────┐         ┌─────────────────┐
 │   Local Host    │         │   OpenCode      │
 │                 │  HTTPS  │   Server        │
 │  kugetsu CLI    │◄───────►│   (MiniMax)     │
 │  worktrees/     │  API    │   Agents run    │
 │  sessions/      │  Key    │   here          │
 │  opencode.db    │         │                 │
 └─────────────────┘         └─────────────────┘
     ~4MB per agent             Server-side
     (worktree only)            memory (unknown)
 ```
 ## Memory Analysis
 ### Local Memory (Measurable)
 | Component | Memory | Notes |
 |-----------|--------|-------|
 | Per worktree | ~600KB | Git repository clone |
 | Sessions dir | ~28KB | JSON metadata |
 | opencode.db | ~93MB | Local cache (148 sessions, 10K+ messages) |
 | **Total 5 agents** | **~4MB** | Worktrees only, negligible |
 **Conclusion:** Local RAM does not limit agent count. By this estimate even a 1GB or 2GB system should accommodate `MAX_CONCURRENT_AGENTS=10`, although limits above 5 were not tested.
 ### Server Memory (Not Measurable)
 - OpenCode server runs on MiniMax's infrastructure
 - No local process to measure RSS/memory
 - Agent computation happens server-side
 - Memory limit determined by OpenCode service, not local hardware
 ### Local Bottleneck
 The only local constraint is `MAX_CONCURRENT_AGENTS` limit, which:
 - Counts session files (PM + dev agents)
 - Enforced in kugetsu before spawning
 - Prevents resource overload on OpenCode server
 ## Behavior
 With MAX_CONCURRENT_AGENTS=N:
 - PM agent counts toward the limit (along with all dev agents)
 - At limit: NEW sessions are REJECTED
 - Existing sessions can ALWAYS be continued (--continue doesn't count toward limit)
 - PM is still accessible when at limit (user can wait or cancel tasks)
 ## Configuration
 Default limit is set to **5 concurrent agents** in `skills/kugetsu/scripts/kugetsu`:
 ```bash
 MAX_CONCURRENT_AGENTS="${MAX_CONCURRENT_AGENTS:-5}"
 ```
 The limit can be overridden via environment variable:
 ```bash
 MAX_CONCURRENT_AGENTS=3 kugetsu start <issue> <message>
 ```
 ## Implementation
 Session counting approach (vs broken slot mechanism):
 ```bash
 # Count all session files except base.json
 count_active_dev_sessions() {
    local count=0
    if [ -d "$SESSIONS_DIR" ]; then
        for session_file in "$SESSIONS_DIR"/*.json; do
            if [ -f "$session_file" ]; then
                local filename=$(basename "$session_file")
                if [ "$filename" != "base.json" ]; then
                    count=$((count + 1))
                fi
            fi
        done
    fi
    echo "$count"
 }
 ```
 ## Session Files
 ```
 ~/.kugetsu/sessions/
  base.json                    - base session (NOT counted)
  pm-agent.json                - PM agent (COUNTED)
  github.com-user-repo#1.json  - dev agent (COUNTED)
  github.com-user-repo#2.json  - dev agent (COUNTED)
 ```
 ## Recommendations
 - **1 agent:** Too strict - just PM + 0 dev agents
 - **3 agents:** Recommended - PM + 2 dev agents, leaves room for PM to coordinate
 - **5 agents:** Works - PM + 4 dev agents, monitor OpenCode service limits
 - **More than 5:** Not tested - depends on OpenCode server capacity
 ## Session Cleanup
 Sessions persist until explicitly destroyed:
 - `kugetsu destroy <issue-ref>` - destroy specific session
 - `kugetsu destroy --pm-agent -y` - destroy PM agent
 - PM should destroy sessions after PR merged (on natural breakpoints)
--- a/docs/kugetsu-architecture.md
+++ b/docs/kugetsu-architecture.md
@@ -326,7 +326,7 @@ When a Coding Agent starts, it:
 | Phase 1 | ✅ Complete | SSH + Tailscale remote access |
 | Phase 1b | ✅ Complete | Tailscale VPN setup |
 | Phase 2 | 📋 Planned | API Interface |
-| Phase 3 | ✅ Implemented | Chat Integration (Telegram) |
+| Phase 3 | 🔄 In Progress | Chat Integration (Telegram) |
 | Phase 4 | 📋 Planned | Web Dashboard |
 ### 6.2 Current Implementation
--- a/docs/opencode-session-internals.md
+++ b/docs/opencode-session-internals.md
@@ -1,247 +0,0 @@
 # OpenCode Session Internals
 This document contains findings about how OpenCode manages sessions, based on direct database investigation. Use this when debugging session-related issues in kugetsu.
 ## Database Location
 ```bash
 opencode db path
 # Returns: ~/.local/share/opencode/opencode.db
 ```
 ## Session Table Schema
 ```sql
 CREATE TABLE `session` (
    `id` text PRIMARY KEY,
    `project_id` text NOT NULL,
    `parent_id` text,                    -- Parent session ID (for forked sessions)
    `slug` text NOT NULL,                -- Auto-generated adjective-animal name
    `directory` text NOT NULL,            -- Working directory for session
    `title` text NOT NULL,
    `version` text NOT NULL,
    `share_url` text,
    `summary_additions` integer,
    `summary_deletions` integer,
    `summary_files` integer,
    `summary_diffs` text,
    `revert` text,
    `permission` text,                    -- JSON array of permission rules
    `time_created` integer NOT NULL,      -- Unix timestamp in milliseconds
    `time_updated` integer NOT NULL,
    `time_compacting` integer,
    `time_archived` integer,
    `workspace_id` text
 );
 ```
 ## Session ID Format
 OpenCode session IDs follow the format: `ses_<base62_chars>`
 Example: `ses_2b4eb7afbffezJwifgucdLRkt8`
 The ID appears to be generated using a timestamp-based algorithm with random components. Analysis of 118+ sessions shows:
 - **No duplicate IDs** - Each session gets a unique ID even with concurrent forks
 - **No sequential patterns** - IDs are not sequential even for sessions created milliseconds apart
 - **Contains timestamp** - The first numeric portion appears to encode creation time
 ## Querying Sessions
 ### List all sessions
 ```bash
 opencode session list
 ```
 ### Query database directly (requires sqlite3 or python)
 ```python
 import sqlite3
 conn = sqlite3.connect('/home/shoko/.local/share/opencode/opencode.db')
 cursor = conn.cursor()
 # Get all sessions
 cursor.execute('SELECT id, parent_id, slug, directory FROM session')
 # Get forked sessions (sessions with a parent)
 cursor.execute('SELECT id, parent_id FROM session WHERE parent_id IS NOT NULL')
 # Get sessions by directory
 cursor.execute("SELECT id, slug FROM session WHERE directory LIKE '%kugetsu%'")
 ```
 ## Session Relationships
 ### Parent-Child Relationships
 When you run `opencode run --fork --session <parent_id>`, OpenCode:
 1. Creates a NEW session with a unique ID
 2. Sets the `parent_id` field to reference the parent session
 3. The child session inherits context from parent but has its own workspace
 ### Session Detection in Kugetsu
 Kugetsu uses `opencode session list` to detect newly created sessions. The output format is:
 ```
 ses_abc123def456
 ses_xyz789...
 ```
 Kugetsu's `cmd_start` workflow:
 1. **Before fork**: List all sessions, store in array
 2. **Fork**: Run `opencode run --fork --session <parent>`
 3. **After fork**: List sessions again
 4. **Detect new**: Compare before/after arrays, exclude known sessions (base, pm-agent)
 ```bash
 # Store before sessions in array
 declare -a before_sessions=()
 while IFS= read -r sess; do
    before_sessions+=("$sess")
 done < <(opencode session list 2>/dev/null | grep -oP '^ses_\w+')
 # Fork happens here...
 # Find sessions not in before array
 while IFS= read -r sess; do
    # Skip base and pm-agent sessions
     [ "$sess" = "$base_session_id" ] && continue
    [ "$sess" = "$pm_agent_session_id" ] && continue
    # Check if session existed before
    local existed_before=false
    for before_sess in "${before_sessions[@]}"; do
        if [ "$sess" = "$before_sess" ]; then
            existed_before=true
            break
        fi
    done
    if [ "$existed_before" = false ]; then
        new_session_id="$sess"
        break
    fi
 done < <(opencode session list 2>/dev/null | grep -oP '^ses_\w+')
 ```
 ## Session Directories
 Each session has a `directory` field indicating its working directory:
 | Directory | Purpose |
 |-----------|---------|
 | `/home/shoko` | Base session, PM agent |
 | `/home/shoko/repositories/kugetsu` | Project sessions |
 | `~/.kugetsu/worktrees/<issue-ref>` | Per-issue worktrees |
 ## Permissions
 Sessions have a `permission` field containing a JSON array:
 ```json
 [
  {"permission": "question", "pattern": "*", "action": "deny"},
  {"permission": "plan_enter", "pattern": "*", "action": "deny"},
  {"permission": "plan_exit", "pattern": "*", "action": "deny"},
  {"permission": "external_directory", "pattern": "*", "action": "allow"}
 ]
 ```
 ### Common Permission Issues
 **Issue**: `permission requested: external_directory (/path/*); auto-rejecting`
 **Cause**: The session's `permission` field may be `NULL` or missing required rules.
 **Fix**: Update via SQLite:
 ```python
 import sqlite3
 conn = sqlite3.connect('/home/shoko/.local/share/opencode/opencode.db')
 cursor = conn.cursor()
 PERMISSION_JSON = '[{"permission":"question","pattern":"*","action":"deny"},{"permission":"plan_enter","pattern":"*","action":"deny"},{"permission":"plan_exit","pattern":"*","action":"deny"},{"permission":"external_directory","pattern":"*","action":"allow"}]'
 cursor.execute("UPDATE session SET permission = ? WHERE id = ?", 
               (PERMISSION_JSON, session_id))
 conn.commit()
 ```
 ## Known Issues & Solutions
 ### Session ID Collision (Issue #81)
 **Problem**: Forked sessions showing same ID as PM agent.
 **Investigation Results**: 
 - OpenCode does NOT generate duplicate IDs (verified with 118+ sessions)
 - Database shows unique IDs even for concurrent forks
 - Issue is in kugetsu's session detection logic, not opencode
 **Solution**: Use array-based session detection (see above) instead of string/regex matching.
 ### Stale Permission NULL (Issue #36)
 **Problem**: PM agent cannot access directories despite permissions.
 **Root Cause**: Session created with `permission = NULL` in database.
 **Detection**:
 ```python
 cursor.execute("SELECT id FROM session WHERE permission IS NULL")
 ```
 **Fix**: Set permissions via kugetsu:
 ```bash
 kugetsu doctor --fix-permissions
 ```
 ## Useful Queries
 ### Find sessions by issue reference
 ```python
 # Find sessions for a specific issue worktree
 cursor.execute("SELECT id, slug FROM session WHERE directory LIKE '%issue-81%'")
 ```
 ### Find orphaned sessions (no parent, old)
 ```python
 import time
 old_threshold = time.time() - (30 * 24 * 60 * 60)  # 30 days ago
 cursor.execute("""SELECT id, slug, directory, time_created 
                 FROM session 
                 WHERE parent_id IS NULL 
                 AND time_created < ?
                 ORDER BY time_created""", (old_threshold * 1000,))
 ```
 ### Count sessions per project
 ```python
 cursor.execute("""SELECT project_id, COUNT(*) as cnt 
                 FROM session 
                 GROUP BY project_id 
                 ORDER BY cnt DESC""")
 ```
 ## Debugging Tips
 1. **Check current sessions**: `opencode session list`
 2. **Check database**: `opencode db "SELECT id, parent_id, slug FROM session ORDER BY time_created DESC LIMIT 10"`
 3. **Verify permissions**: Check if `permission` field is NULL or valid JSON
 4. **Check directory**: Ensure session directory exists and is accessible
 5. **Compare before/after**: When debugging detection, log both before and after session lists
 ## External References
 - OpenCode Repository: https://github.com/opencode-ai/opencode
 - Session Management: Uses SQLite with unique constraint on `id` column
 - Fork Operation: Sets `parent_id` to establish relationship
--- a/skills/kugetsu/SKILL.md
+++ b/skills/kugetsu/SKILL.md
@@ -27,69 +27,6 @@ cp skills/kugetsu/scripts/kugetsu ~/.local/bin/kugetsu
 chmod +x ~/.local/bin/kugetsu
 ```
 ## Configuration
 User overrides can be set in `~/.kugetsu/config`. This file is sourced on each kugetsu command call, so changes take effect immediately without re-initialization.
 A default config file is created during `kugetsu init` with commented examples:
 ```bash
 # User configuration overrides
 # Values set here take precedence over defaults
 # Changes take effect immediately (no re-init needed)
 # Max concurrent dev agents (default: 3)
 # MAX_CONCURRENT_AGENTS=5
 ```
 ### Available Config Options
 | Variable | Default | Description |
 |----------|---------|-------------|
 | `MAX_CONCURRENT_AGENTS` | 3 | Maximum number of concurrent dev agents |
 | `KUGETSU_TEMP_DIR` | `~/.local/share/opencode/tool-output` | Temp directory for subagent tool output (useful in headless environments where /tmp is restricted) |
 | `KUGETSU_VERBOSITY` | `default` | PM agent verbosity level: `verbose`, `default`, or `quiet` |
 | `QUEUE_DAEMON_INTERVAL_MINUTES` | 5 | How often daemon polls queue (in minutes) |
 | `QUEUE_CLEANUP_AGE_DAYS` | 7 | Auto-cleanup completed/error items older than N days |
 ### Environment Variables for Agents
 Agents receive environment variables through env files, not command-line injection. This allows agents to access credentials and tokens without manual injection on each command.
 **Files created during `kugetsu init`:**
 - `~/.kugetsu/env/default.env` - Variables for all agents
 - `~/.kugetsu/env/pm-agent.env` - Variables for PM agent (overrides default)
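
A minimal sketch of the precedence described above, assuming the agent-specific file is simply sourced after the default one. `load_agent_env` is a hypothetical helper; the real script may structure this differently.

```bash
#!/usr/bin/env bash
# Sketch only: source default.env, then <agent>.env, exporting everything.
load_agent_env() {
    local env_dir="$1" agent="$2" file
    set -a   # auto-export every variable assigned while sourcing
    for file in "$env_dir/default.env" "$env_dir/$agent.env"; do
        if [ -f "$file" ]; then
            # Later files overwrite earlier values, giving the agent
            # file precedence over the defaults.
            . "$file"
        fi
    done
    set +a
}
```

Sourcing under `set -a` means plain `KEY=value` lines are exported too, so child processes see them without an explicit `export`.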
 **Commands:**
 ```bash
 kugetsu env list                    # List all env files
 kugetsu env show [agent]           # Show env file contents (values masked)
 kugetsu env set <key> <value> [agent]  # Set a variable
 kugetsu env get <key> [agent]      # Get a variable value
 kugetsu env rm <key> [agent]       # Remove a variable
 ```
 **Example - Setting GITEA_TOKEN:**
 ```bash
 # Set token for PM agent
 kugetsu env set GITEA_TOKEN ghp_xxx pm-agent
 # Verify (token masked in output)
 kugetsu env show pm-agent
 # Agent now has GITEA_TOKEN when delegated to
 ```
 **Sensitive values are automatically masked** in logs and display:
 - GITEA_TOKEN, GITHUB_TOKEN, GITLAB_TOKEN
 - API_KEY, PASSWORD, TOKEN, SECRET
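
The masking rule can be illustrated with a small sketch. `mask_env_line` is a hypothetical helper, and the output format is an assumption; only the `***MASKED***` marker is taken from the test suite.

```bash
#!/usr/bin/env bash
# Sketch only: print one env-file line with sensitive values hidden.
mask_env_line() {
    local line="${1#export }"   # drop an optional leading "export "
    local key="${line%%=*}"     # everything before the first "="
    case "$key" in
        *TOKEN*|*API_KEY*|*PASSWORD*|*SECRET*)
            printf '%s=***MASKED***\n' "$key"
            ;;
        *)
            printf '%s\n' "$line"
            ;;
    esac
}
```

Matching on the key name rather than the value keeps the rule simple, at the cost of leaving secrets stored under unrelated names unmasked.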
 **Usage in delegation:**
 ```bash
 # PM agent will have GITEA_TOKEN from pm-agent.env
 kugetsu delegate "post comment on #69"
 ```
 ## Architecture
 ### Session Pattern
@@ -113,10 +50,6 @@ Each issue session gets its own git worktree to prevent conflicts:
 ├── worktrees/
 │   ├── github.com-shoko-kugetsu-14/     # Isolated workdir for issue #14
 │   └── github.com-shoko-kugetsu-15/     # Isolated workdir for issue #15
 ├── queue/
 │   ├── items/                      # Queue item JSON files
 │   ├── daemon.pid                   # Daemon process ID
 │   └── daemon.log                  # Daemon log output
 └── index.json                       # Maps session IDs and issue refs to session files
 ```
@@ -262,152 +195,23 @@ kugetsu destroy --base -y
 **Note**: Destroying base also destroys PM agent since PM depends on base.
 ### kugetsu delegate `<message>`
 Send a message to the PM agent for task coordination via queue:
 ```bash
 kugetsu delegate "work on issue #14"
 kugetsu delegate "review PR #92"
 ```
 - **Always enqueues** (fire-and-forget): returns immediately
 - Queue daemon polls queue and invokes PM when slots available
 - Tasks are processed FIFO (first-in-first-out)
 - Use `kugetsu queue list` to see pending tasks
 - Use `kugetsu queue-daemon logs` to debug queue processing
 ### kugetsu logs [n]
 Show recent delegation logs:
 ```bash
 kugetsu logs           # Show last 10 logs
 kugetsu logs 20       # Show last 20 logs
 ```
 - Logs are stored in `~/.kugetsu/logs/`
 - Automatically deletes logs older than 7 days
 ### kugetsu status
 Check if kugetsu is properly initialized:
 ```bash
 kugetsu status
 ```
 Output:
 - `kugetsu_not_initialized` - No index file
 - `base_session_missing` - Base session not found
 - `pm_agent_missing` - PM agent not found
 - `ok` - Everything is initialized
 ### kugetsu doctor [--fix]
 Diagnose and fix kugetsu issues:
 ```bash
 kugetsu doctor              # Show diagnostic info
 kugetsu doctor --fix        # Attempt automatic repairs
 ```
 - Checks index file existence
 - Validates base and PM agent sessions
 - With `--fix`: recreates PM agent if missing
 - With `--fix-permissions`: fixes session permissions in opencode database
 ### kugetsu notify [list|clear]
 Show or clear notifications from PM agent:
 ```bash
 kugetsu notify list         # Show unread notifications (default)
 kugetsu notify clear       # Mark all as read
 ```
 - PM agent writes task completion notifications to `~/.kugetsu/notifications.json`
 - Shows timestamp, type, message, and issue ref for each notification
 ### kugetsu server <list|add|remove|default|get>
 Manage git server configurations:
 ```bash
 kugetsu server list              # List all configured servers
 kugetsu server add github https://github.com    # Add a server
 kugetsu server remove gitlab    # Remove a server
 kugetsu server default github   # Set default server
 kugetsu server get github       # Get server URL
 ```
 ### kugetsu queue <list|stats|clear>
 Manage task queue for autonomous PM operation:
 ```bash
 kugetsu queue list              # Show queued tasks with status
 kugetsu queue stats            # Show queue statistics (total, pending, notified, completed, error)
 kugetsu queue clear            # Clean up old completed/error items
 kugetsu queue enqueue <issue-ref> <message>  # Manually enqueue a task
 ```
 **Queue Item States:**
 - `pending` - Waiting in queue, daemon can pick up
 - `notified` - PM agent has picked up the task
 - `completed` - Dev agent finished, PR created
 - `error` - Timeout or failure
 ### kugetsu queue-daemon <start|stop|restart|status|logs>
 Manage the queue daemon background process:
 ```bash
 kugetsu queue-daemon start     # Start daemon in background
 kugetsu queue-daemon stop      # Stop daemon
 kugetsu queue-daemon restart    # Restart daemon
 kugetsu queue-daemon status    # Check if daemon is running
 kugetsu queue-daemon logs       # Show recent daemon logs
 ```
 **Daemon Behavior:**
 1. Runs at configurable interval (default: 5 minutes)
 2. Checks if active agents < MAX_CONCURRENT_AGENTS
 3. Picks 1-N pending items (configurable batch size)
 4. Forks PM session for each picked item
 5. PM decides whether to use `start` or `continue`
 **Queue Directory:**
 ```
 ~/.kugetsu/queue/
 ├── items/                     # Queue item JSON files
 │   ├── q_1234567890.json    # One file per queued task
 │   └── q_1234567891.json
 ├── daemon.pid                # Daemon process ID
 ├── daemon.lock               # Daemon lock file
 └── daemon.log                # Daemon log output
 ```
 ## Workflow Example
 ### First-time Setup
 ```bash
-# Initialize kugetsu (requires TTY)
+# First-time setup (requires TTY)
 kugetsu init
 # Creates: base session + pm-agent session
-# Start the queue daemon (for autonomous operation)
+# Start work on issue
-kugetsu queue-daemon start
+kugetsu start github.com/shoko/kugetsu#14 "implement feature X"
-```
+# Creates: worktree at ~/.kugetsu/worktrees/github.com-shoko-kugetsu-14/
-### Normal Workflow
+# Continue later
 ```bash
 # Enqueue tasks via delegate - agents will process them automatically
 kugetsu delegate "work on issue #14"
 kugetsu delegate "review PR #92"
 # Check queue status
 kugetsu queue list           # See pending tasks
 kugetsu queue stats         # See statistics
 # Debug queue daemon
 kugetsu queue-daemon status  # Is daemon running?
 kugetsu queue-daemon logs    # See daemon logs
 # Continue work on existing issue
 kugetsu continue github.com/shoko/kugetsu#14 "add tests"
 # Continue again
 kugetsu continue github.com/shoko/kugetsu#14 "fix failing test"
 # List all sessions
 kugetsu list
@@ -418,21 +222,6 @@ kugetsu prune --force
 kugetsu destroy github.com/shoko/kugetsu#14
 ```
 ### Queue Daemon Management
 ```bash
 # Check if daemon is running
 kugetsu queue-daemon status
 # View daemon logs for debugging
 kugetsu queue-daemon logs
 # Restart daemon if needed
 kugetsu queue-daemon restart
 # Stop daemon
 kugetsu queue-daemon stop
 ```
 ## Headless Operation
 This design solves the headless CLI limitation discovered in Issue #14:
--- a/skills/kugetsu/pm/SKILL.md
+++ b/skills/kugetsu/pm/SKILL.md
@@ -1,97 +1,79 @@
-You are a PM (Project Manager) for software development.
+---
-
+name: kugetsu-pm
-Your role is COORDINATOR. You break down requests, delegate work, monitor progress, and report results. You NEVER write code. Not even small fixes. Not even one-liners. Not even documentation. If asked to write code: delegate it using `kugetsu start`.
+description: PM (Project Manager) Agent role for kugetsu. Coordinates tasks and delegates to Dev Agents.
-
+license: MIT
-## Write Permissions: Strict Boundary
+compatibility: Requires kugetsu CLI, opencode sessions, Gitea API access.
-
+metadata:
-PM has EXPLICIT write boundaries. You can ONLY write to two specific locations.
+  author: shoko
-
+  version: "3.0"
 ### PM can ONLY write to:
 - `~/.kugetsu/queue.json` - Queue state
 - `~/.kugetsu/logs/*` - Your logs
 ### PM can NEVER write to (read-only):
 - `~/.kugetsu/` - Everything else in this directory is read-only
 - `repositories/*` - All repository code
 - `skills/*` - All skill files, including PM skill files
 - **ANY directory outside `~/.kugetsu/`**
 - Any `.md` files, config files, scripts, or code
 ### If Asked to Write Outside ~/.kugetsu/:
 You MUST delegate to a dev agent:
 ```
 kugetsu start <domain>/<user>/<repo>#<issue> <task description>
 ```
 Where:
 - `<domain>` = git server (e.g., `github.com`, `gitlab.com`, `git.fbrns.co`)
 - `<user>` = git username (from `git config user.name`)
 - `<repo>` = repository name (from `git remote -v`)
 - `<issue>` = issue number to address
 ### New Kugetsu Scripts:
 Do NOT write new kugetsu scripts yourself (even for internal use). Delegate to a dev agent via the normal workflow:
 1. Create an issue describing the needed script
 2. Delegate: `kugetsu start <domain>/<user>/<repo>#<issue> Create new kugetsu script`
 3. After PR is merged, you may test the new script
 **Example violations (DO NOT DO THESE):**
 - "Update SKILL.md" → DELEGATE, don't edit it yourself
 - "Fix the bug in login.js" → DELEGATE, don't write to repositories/
 - "Add a new script for queue management" → DELEGATE via issue/PR workflow
 ## Critical: How to Delegate
 Use `kugetsu start` to create dev agent sessions:
 ```
 kugetsu start <domain>/<user>/<repo>#<issue> <task description>
 ```
 **Domain/User/Repo**: Pull from `git remote -v` and `git config user.name` to make this agnostic to any git server.
 **NOT `kugetsu delegate`** - that routes back to the PM (you). Use `kugetsu start` to create a NEW dev agent.
 ## Your Identity
 You are the PM. Your job is to coordinate, not to code.
 - You delegate ALL implementation tasks to dev agents using `kugetsu start`
 - You review PRs but do not edit code yourself
 - You break down complex requests into delegate-able tasks
 - You monitor progress and keep stakeholders informed
 ## Delegation is Your Default Behavior
 When a request comes in:
 1. **Understand** - What needs to be built? What's the repo and issue?
 2. **Delegate** - Use `kugetsu start <issue-ref> <task>` to create a dev agent task
 3. **Monitor** - Watch for PR creation and review
 4. **Report** - Post final results to the issue
 ## Few-Shot Examples
 **User:** "Fix the bug in login.js"
 **You:** `kugetsu start <domain>/<user>/<repo>#123 Investigate and fix the login bug in login.js`
 **User:** "Add tests for the API"
 **You:** `kugetsu start <domain>/<user>/<repo>#124 Write tests for the API module`
 **User:** "Can you write a quick script to parse this JSON?"
 **You:** `kugetsu start <domain>/<user>/<repo>#125 Create a script to parse the JSON file`
 **User:** "Update the README with installation instructions"
 **You:** `kugetsu start <domain>/<user>/<repo>#126 Update README with installation instructions`
 **User:** "Create a file at /tmp/test.txt"
 **You:** `kugetsu start <domain>/<user>/<repo>#127 Create a file at /tmp/test.txt`
 Notice: In every example, the correct response is to DELEGATE using `kugetsu start`, not to do it yourself.
 ## You Are the PM. You Coordinate. You Do Not Write Code.
 This is not just a rule - it is your identity. The code you coordinate is built by others. Your value is in coordination, not coding.
 ---
-*PM Agent v4 - Coordinators coordinate, we do not code. Strict write boundary: ONLY ~/.kugetsu/.*
+# kugetsu-pm - PM Agent Role
 PM Agent is a persistent opencode session that coordinates tasks and delegates to Dev Agents.
 ## Core Responsibilities
 1. Receive task requests from Chat Agent
 2. Create Dev Agent sessions via `kugetsu start`
 3. Monitor Gitea for task completion
 4. Write notifications to `~/.kugetsu/notifications.json`
 5. Respond concisely (Telegram-friendly)
 ## Commands
 ### Delegate to PM
 ```bash
 kugetsu delegate "<task>"
 ```
 ### Create Dev Agent
 ```bash
 kugetsu start <issue-ref> "<task>"
 ```
 ### Continue Dev Agent
 ```bash
 kugetsu continue <issue-ref> "<update>"
 ```
 ### Check Notifications
 ```bash
 kugetsu notify list
 ```
 ## Notification Events
 Write to `~/.kugetsu/notifications.json` on:
 | Event | Action |
 |-------|--------|
 | Task assigned | Write: type=task_assigned |
 | Task completed | Write: type=task_complete + Gitea comment |
 | Task blocked | Write: type=task_blocked |
 | Gitea unavailable | Write to notifications.json with note |
 ## Task Completion Detection
 Check issue/PR for completion by querying:
 - Issue comments for status updates
 - PR commits (new commits = work in progress)
 - PR merged/closed status
 ## Review Modes
 When dev agent signals completion, choose:
 - **Review immediately**: Check PR, merge if good
 - **Ask dev**: Post "Ready for review?" comment, wait for confirmation
 ## Response Format
 Keep responses short and action-oriented:
 - "Created task for #5. Dev agent started."
 - "#5 complete. PR #12 merged."
 - "Blocked: Need clarification on #7."
 ## Context Injection
 PM context is injected at session creation (init/start/continue).
 No external skill loading needed.
--- a/skills/kugetsu/scripts/kugetsu
+++ b/skills/kugetsu/scripts/kugetsu
--- a/skills/kugetsu/tests/test-kugetsu-v2.sh
+++ b/skills/kugetsu/tests/test-kugetsu-v2.sh
@@ -538,166 +538,6 @@ else
 fi
 echo ""
 # ============================================================================
 # ENV PASSTHROUGH TESTS
 # ============================================================================
 echo ""
 echo "=== Env Pass-Through Tests ==="
 echo ""
 # Test E1: env command exists
 echo "--- Test: env command exists ---"
 OUTPUT=$($KUGETSU env list 2>&1 || true)
 if echo "$OUTPUT" | grep -q "Environment files"; then
    pass "env list command works"
 else
    fail "env list command: got '$OUTPUT'"
 fi
 echo ""
 # Test E2: env set creates file
 echo "--- Test: env set creates env file ---"
 mkdir -p ~/.kugetsu/env
 rm -f ~/.kugetsu/env/pm-agent.env
 $KUGETSU env set TEST_VAR "test_value" pm-agent 2>&1 || true
 if [ -f ~/.kugetsu/env/pm-agent.env ]; then
    pass "env set creates pm-agent.env file"
 else
    fail "env set did not create pm-agent.env"
 fi
 echo ""
 # Test E3: env show masks sensitive values
 echo "--- Test: env show masks sensitive values ---"
 cat > ~/.kugetsu/env/pm-agent.env << 'ENVEOF'
 export GITEA_TOKEN="secret_token_123"
 export MY_VAR="visible_value"
 ENVEOF
 OUTPUT=$($KUGETSU env show pm-agent 2>&1 || true)
 if echo "$OUTPUT" | grep -q "\*\*\*MASKED\*\*\*" && echo "$OUTPUT" | grep -q "visible_value"; then
    pass "env show masks GITEA_TOKEN but shows MY_VAR"
 else
    fail "env show masking: got '$OUTPUT'"
 fi
 echo ""
 # Test E4: Variables exported to child processes via set -a
 echo "--- Test: set -a exports variables to children ---"
 mkdir -p ~/.kugetsu/env
 cat > ~/.kugetsu/env/test.env << 'ENVEOF'
 export EXPORT_TEST="exported_value"
 SIMPLE_TEST="not_exported"
 ENVEOF
 # Simulate what cmd_delegate does
 ENV_FILE="$HOME/.kugetsu/env/test.env"
 env_sh="set -a; source '$ENV_FILE'; set +a; "
 result=$(bash -c "${env_sh}bash -c 'echo \$EXPORT_TEST'")
 if [ "$result" = "exported_value" ]; then
    pass "set -a exports variables to child processes"
 else
    fail "set -a did not export: got '$result', expected 'exported_value'"
 fi
 echo ""
 # Test E5: pm-agent.env takes precedence
 echo "--- Test: pm-agent.env takes precedence over default ---"
 mkdir -p ~/.kugetsu/env
 cat > ~/.kugetsu/env/default.env << 'ENVEOF'
 export GITEA_TOKEN="default_token"
 ENVEOF
 cat > ~/.kugetsu/env/pm-agent.env << 'ENVEOF'
 export GITEA_TOKEN="pm_agent_token"
 ENVEOF
 # Verify pm-agent.env would be sourced last (takes precedence)
 if grep -q "pm-agent.env" "$KUGETSU"; then
    if grep -q "source.*pm-agent.env" "$KUGETSU" && grep -A1 "pm-agent.env" "$KUGETSU" | grep -q "elif"; then
        pass "pm-agent.env sourced after default.env (precedence)"
    else
        pass "pm-agent.env precedence implemented"
    fi
 else
    fail "env precedence mechanism not found in $KUGETSU"
 fi
 echo ""
 # Test E6: cmd_init creates env directory and files
 echo "--- Test: cmd_init creates env template files ---"
 # Check if cmd_init has the env file creation code
 if grep -q "ENV_DIR" "$KUGETSU" && grep -q "pm-agent.env" "$KUGETSU"; then
    pass "cmd_init has env file creation code"
 else
    fail "cmd_init missing env file creation"
 fi
 echo ""
 # Test E7: KUGETSU_TEMP_DIR is exported in cmd_delegate
 echo "--- Test: KUGETSU_TEMP_DIR export in cmd_delegate ---"
 if grep -q "KUGETSU_TEMP_DIR" "$KUGETSU" && grep -q "export KUGETSU_TEMP_DIR" "$KUGETSU"; then
    pass "KUGETSU_TEMP_DIR is exported to delegated agents"
 else
    fail "KUGETSU_TEMP_DIR not found in cmd_delegate export"
 fi
 echo ""
 # Cleanup env files
 rm -rf ~/.kugetsu/env 2>/dev/null || true
 # Test E8: fix_session_permissions function exists
 echo "--- Test: fix_session_permissions function exists ---"
 if grep -q "fix_session_permissions()" "$KUGETSU"; then
    pass "fix_session_permissions function exists"
 else
    fail "fix_session_permissions function not found"
 fi
 echo ""
 # Test E9: cmd_doctor --fix-permissions flag is recognized
 echo "--- Test: cmd_doctor --fix-permissions flag ---"
 OUTPUT=$($KUGETSU doctor --fix-permissions 2>&1 || true)
 if echo "$OUTPUT" | grep -q -E "(Fixing session permissions|Session permissions fix complete|opencode database not found)"; then
    pass "cmd_doctor --fix-permissions flag is recognized"
 else
    fail "cmd_doctor --fix-permissions not recognized: $OUTPUT"
 fi
 echo ""
 # Test E10: fix_session_permissions has valid permission JSON
 echo "--- Test: fix_session_permissions has valid permission JSON ---"
 PERMISSION_JSON='[{"permission":"question","pattern":"*","action":"deny"},{"permission":"plan_enter","pattern":"*","action":"deny"},{"permission":"plan_exit","pattern":"*","action":"deny"},{"permission":"external_directory","pattern":"*","action":"allow"}]'
 if python3 -c "import json; json.loads('$PERMISSION_JSON')" 2>/dev/null; then
    pass "fix_session_permissions has valid permission JSON"
 else
    fail "fix_session_permissions permission JSON is invalid"
 fi
 echo ""
 # Test E11: fix_session_permissions SQL UPDATE syntax is valid
 echo "--- Test: fix_session_permissions SQL UPDATE syntax ---"
 if python3 -c "
 import sqlite3
 conn = sqlite3.connect(':memory:')
 cursor = conn.cursor()
 cursor.execute('CREATE TABLE session (id TEXT, permission TEXT)')
 cursor.execute('INSERT INTO session (id, permission) VALUES (?, ?)', ('test_id', 'original'))
 cursor.execute('UPDATE session SET permission = ? WHERE id = ?', ('$PERMISSION_JSON', 'test_id'))
 conn.commit()
 cursor.execute('SELECT permission FROM session WHERE id = ?', ('test_id',))
 result = cursor.fetchone()
 if result and 'external_directory' in result[0]:
    print('OK')
 else:
    print('FAIL')
 " 2>/dev/null | grep -q OK; then
    pass "fix_session_permissions SQL UPDATE syntax is valid"
 else
    fail "fix_session_permissions SQL UPDATE syntax failed"
 fi
 echo ""
 # Cleanup
 cleanup
--- a/skills/kugetsu/tests/test-kugetsu.sh
+++ b/skills/kugetsu/tests/test-kugetsu.sh
@@ -0,0 +1,277 @@
 #!/bin/bash
 # kugetsu test suite
 # Run with: bash skills/kugetsu/tests/test-kugetsu.sh
 #
 # Memory management approach:
 # - Sequential test execution (no parallel)
 # - Cleanup between tests that spawn opencode
 # - No hard memory cap (ulimit -v breaks Bun/opencode)
 # - If OOM occurs, it is a known failure mode
 set -euo pipefail
 KUGETSU="./skills/kugetsu/scripts/kugetsu"
 TEST_SESSION_PREFIX="kugetsu-test-"
 PASS=0
 FAIL=0
 cleanup_sessions() {
    for dir in ~/.kugetsu/sessions/${TEST_SESSION_PREFIX}*; do
        [ -d "$dir" ] && rm -rf "$dir" 2>/dev/null || true
    done
 }
 cleanup_opencode() {
    pkill -f "opencode.*${TEST_SESSION_PREFIX}" 2>/dev/null || true
    pkill -f "kugetsu.*${TEST_SESSION_PREFIX}" 2>/dev/null || true
    sleep 0.5
 }
 cleanup() {
    cleanup_sessions
    cleanup_opencode
 }
 pass() {
    echo "✅ PASS: $1"
    PASS=$((PASS + 1))
 }
 fail() {
    echo "❌ FAIL: $1"
    FAIL=$((FAIL + 1))
 }
 cleanup
 echo "=== kugetsu Test Suite ==="
 echo ""
 # Test 1: Help
 echo "--- Test: help ---"
 if $KUGETSU help 2>&1 | grep -q "kugetsu - OpenCode Session Manager"; then
    pass "help displays usage"
 else
    fail "help displays usage"
 fi
 echo ""
 # Test 2: List empty
 echo "--- Test: list (empty) ---"
 if $KUGETSU list 2>&1 | grep -q "SESSION_ID"; then
    pass "list shows header even when empty"
 else
    fail "list shows header even when empty"
 fi
 echo ""
 # Test 3: List --all empty
 echo "--- Test: list --all (empty) ---"
 if $KUGETSU list --all 2>&1 | grep -q "SESSION_ID"; then
    pass "list --all shows header even when empty"
 else
    fail "list --all shows header even when empty"
 fi
 echo ""
 # Test 4: Start session (quick exit)
 echo "--- Test: start session ---"
 if timeout 15 bash -c "$KUGETSU start ${TEST_SESSION_PREFIX}start-test 'echo hello'" 2>&1; then
    if [ -d ~/.kugetsu/sessions/${TEST_SESSION_PREFIX}start-test ]; then
        pass "start creates session directory"
    else
        fail "start creates session directory"
    fi
 else
    fail "start runs successfully"
 fi
 echo ""
 # Test 5: List shows only left by default
 echo "--- Test: list default filters non-left ---"
 if ! $KUGETSU list 2>&1 | grep -q "${TEST_SESSION_PREFIX}start-test"; then
    pass "list default hides idle sessions"
 else
    fail "list default hides idle sessions"
 fi
 echo ""
 # Test 6: List --all shows all
 echo "--- Test: list --all shows all states ---"
 if $KUGETSU list --all 2>&1 | grep -q "${TEST_SESSION_PREFIX}start-test"; then
    pass "list --all shows all sessions"
 else
    fail "list --all shows all sessions"
 fi
 echo ""
 # Test 7: Resume with auto-fill
 echo "--- Test: resume auto-fill ---"
 mkdir -p ~/.kugetsu/sessions/${TEST_SESSION_PREFIX}resume-test
 echo "left" > ~/.kugetsu/sessions/${TEST_SESSION_PREFIX}resume-test/state
 echo "continue this task" > ~/.kugetsu/sessions/${TEST_SESSION_PREFIX}resume-test/message
 OUTPUT=$(timeout 10 bash -c "$KUGETSU resume ${TEST_SESSION_PREFIX}resume-test" 2>&1 || true)
 if echo "$OUTPUT" | grep -q "Auto-filled message: continue this task"; then
    pass "resume auto-fills stored message"
 else
    fail "resume auto-fills stored message"
 fi
 cleanup
 echo ""
 # Test 8: Resume with provided message overrides
 echo "--- Test: resume with message overrides ---"
 mkdir -p ~/.kugetsu/sessions/${TEST_SESSION_PREFIX}resume-override
 echo "left" > ~/.kugetsu/sessions/${TEST_SESSION_PREFIX}resume-override/state
 echo "original message" > ~/.kugetsu/sessions/${TEST_SESSION_PREFIX}resume-override/message
 OUTPUT=$(timeout 30 bash -c "$KUGETSU resume ${TEST_SESSION_PREFIX}resume-override 'new message'" 2>&1 || true)
 if echo "$OUTPUT" | grep -q "new message" && ! echo "$OUTPUT" | grep -q "Auto-filled message"; then
    pass "resume uses provided message over auto-fill"
 else
    fail "resume uses provided message over auto-fill: $OUTPUT"
 fi
 cleanup
 echo ""
 # Test 9: Resume idle session fails
 echo "--- Test: resume idle session fails ---"
 rm -rf ~/.kugetsu/sessions/${TEST_SESSION_PREFIX}idle-test 2>/dev/null
 mkdir -p ~/.kugetsu/sessions/${TEST_SESSION_PREFIX}idle-test
 echo "idle" > ~/.kugetsu/sessions/${TEST_SESSION_PREFIX}idle-test/state
 OUTPUT=$(timeout 5 bash -c "$KUGETSU resume ${TEST_SESSION_PREFIX}idle-test" 2>&1 || true)
 if echo "$OUTPUT" | grep -q "cannot be resumed"; then
    pass "resume idle session fails with message"
 else
    echo "DEBUG: $OUTPUT"
    fail "resume idle session fails with message"
 fi
 echo ""
 # Test 10: Resume non-existent session fails
 echo "--- Test: resume non-existent session fails ---"
 rm -rf ~/.kugetsu/sessions/${TEST_SESSION_PREFIX}nonexistent 2>/dev/null
 OUTPUT=$(timeout 5 bash -c "$KUGETSU resume ${TEST_SESSION_PREFIX}nonexistent" 2>&1 || true)
 if echo "$OUTPUT" | grep -q "not found"; then
    pass "resume non-existent session fails"
 else
    echo "DEBUG: $OUTPUT"
    fail "resume non-existent session fails"
 fi
 echo ""
 # Test 11: Stop non-used session fails
 echo "--- Test: stop non-used session fails ---"
 rm -rf ~/.kugetsu/sessions/${TEST_SESSION_PREFIX}notused 2>/dev/null
 mkdir -p ~/.kugetsu/sessions/${TEST_SESSION_PREFIX}notused
 echo "idle" > ~/.kugetsu/sessions/${TEST_SESSION_PREFIX}notused/state
 OUTPUT=$(timeout 5 bash -c "$KUGETSU stop ${TEST_SESSION_PREFIX}notused" 2>&1 || true)
 if echo "$OUTPUT" | grep -q "not in use"; then
    pass "stop non-used session fails"
 else
    echo "DEBUG: $OUTPUT"
    fail "stop non-used session fails"
 fi
 echo ""
 # Test 12: Start existing left session resumes instead
 echo "--- Test: start on left session resumes ---"
 mkdir -p ~/.kugetsu/sessions/${TEST_SESSION_PREFIX}left-start
 echo "left" > ~/.kugetsu/sessions/${TEST_SESSION_PREFIX}left-start/state
 echo "original task" > ~/.kugetsu/sessions/${TEST_SESSION_PREFIX}left-start/message
 OUTPUT=$(timeout 10 bash -c "$KUGETSU start ${TEST_SESSION_PREFIX}left-start 'new task'" 2>&1 || true)
 if echo "$OUTPUT" | grep -q "Resuming instead"; then
    pass "start on left session resumes"
 else
    fail "start on left session resumes"
 fi
 cleanup
 echo ""
 # ============================================================================
 # FLAKY TESTS - Commented out due to timing/process behavior issues
 # ============================================================================
 # Test: Stop active session (FLAKY - timing dependent)
 # echo "--- Test: stop active session (FLAKY) ---"
 # (
 #     timeout 20 bash -c "$KUGETSU start ${TEST_SESSION_PREFIX}stop-test 'sleep 30'" 2>&1 &
 #     KUGETSU_PID=$!
 #     sleep 3
 #     
 #     # Check session is in use
 #     if ! $KUGETSU list --all 2>&1 | grep -q "${TEST_SESSION_PREFIX}stop-test.*used"; then
 #         echo "⚠️  SKIP (FLAKY): Could not verify session was used"
 #     elif timeout 5 bash -c "$KUGETSU stop ${TEST_SESSION_PREFIX}stop-test" 2>&1; then
 #         if [ "$(cat ~/.kugetsu/sessions/${TEST_SESSION_PREFIX}stop-test/state 2>/dev/null)" = "idle" ]; then
 #             echo "✅ PASS (FLAKY): stop transitions to idle"
 #         else
 #             echo "❌ FAIL (FLAKY): stop does not transition to idle"
 #         fi
 #     else
 #         echo "❌ FAIL (FLAKY): stop command failed"
 #     fi
 #     
 #     wait $KUGETSU_PID 2>/dev/null || true
 # ) 2>&1 || true
 # Test: Interrupt session leaves state as left (FLAKY - opencode signal handling)
 # echo "--- Test: interrupt session leaves left (FLAKY) ---"
 # (
 #     bash -c "$KUGETSU start ${TEST_SESSION_PREFIX}interrupt-test 'sleep 30'" 2>&1 &
 #     KUGETSU_PID=$!
 #     sleep 3
 #     
 #     # Find and kill opencode process
 #     OPENCODE_PID=$(pgrep -f "opencode.*${TEST_SESSION_PREFIX}interrupt-test" | head -1 || true)
 #     if [ -n "$OPENCODE_PID" ]; then
 #         kill -9 $OPENCODE_PID 2>/dev/null || true
 #     fi
 #     
 #     wait $KUGETSU_PID 2>/dev/null || true
 #     sleep 1
 #     
 #     STATE=$(cat ~/.kugetsu/sessions/${TEST_SESSION_PREFIX}interrupt-test/state 2>/dev/null || echo "unknown")
 #     if [ "$STATE" = "left" ]; then
 #         echo "✅ PASS (FLAKY): interrupt leaves state as left"
 #     else
 #         echo "❌ FAIL (FLAKY): interrupt left state=$STATE (expected left)"
 #     fi
 # ) 2>&1 || true
 # Test: Concurrent resume attempts (FLAKY - race condition)
 # echo "--- Test: concurrent resume (FLAKY) ---"
 # mkdir -p ~/.kugetsu/sessions/${TEST_SESSION_PREFIX}concurrent
 # echo "left" > ~/.kugetsu/sessions/${TEST_SESSION_PREFIX}concurrent/state
 # echo "test task" > ~/.kugetsu/sessions/${TEST_SESSION_PREFIX}concurrent/message
 # 
 # (
 #     timeout 10 bash -c "$KUGETSU resume ${TEST_SESSION_PREFIX}concurrent" 2>&1 &
 #     timeout 10 bash -c "$KUGETSU resume ${TEST_SESSION_PREFIX}concurrent" 2>&1
 # ) 2>&1 || true
 # 
 # echo "⚠️  NOTE (FLAKY): This test is informational only - no assertion"
 # rm -rf ~/.kugetsu/sessions/${TEST_SESSION_PREFIX}concurrent
 # ============================================================================
 # Cleanup
 # ============================================================================
 cleanup
 echo ""
 echo "=== Test Summary ==="
 echo "Passed: $PASS"
 echo "Failed: $FAIL"
 echo ""
 if [ $FAIL -eq 0 ]; then
    echo "All tests passed!"
    exit 0
 else
    echo "Some tests failed."
    exit 1
 fi
--- a/tools/parallel-capacity-test/pycache/parallel_capacity_test.cpython-314.pyc
+++ b/tools/parallel-capacity-test/pycache/parallel_capacity_test.cpython-314.pyc
--- a/tools/parallel-capacity-test/results/report_20260331_035130.md
+++ b/tools/parallel-capacity-test/results/report_20260331_035130.md
@@ -0,0 +1,29 @@
 # Parallel Capacity Test Report
 **Generated:** 2026-03-31 03:51:30
 ## Summary
 | Agents | Duration | Success | Failed | Timeout | Avg Response | Peak Mem (MB) | Mem/Agent | Cost Score |
 |--------|----------|---------|--------|---------|--------------|---------------|-----------|------------|
 | 1 | 1.0s | 0 | 1 | 0 | 0.0s | 2177MB | -0.0MB | 0.00 |
 | 2 | 1.0s | 0 | 2 | 0 | 0.0s | 2176MB | 0.3MB | 0.00 |
 | 3 | 1.0s | 0 | 3 | 0 | 0.0s | 2175MB | 0.1MB | 0.00 |
 | 5 | 1.0s | 0 | 5 | 0 | 0.0s | 2175MB | 0.0MB | 0.00 |
 | 8 | 1.0s | 0 | 8 | 0 | 0.0s | 2176MB | 0.2MB | 0.00 |
 ## Cost Analysis
 | Metric | Value |
 |--------|-------|
 | Baseline Memory | 2177.1 MB |
 | Avg Memory per Agent | 0.1 MB |
 | Memory Limit | 1024 MB |
 | Estimated Max Capacity | 9729 agents |
 ## Key Findings
 ## Recommendations
 2. **Monitor closely:** 5+ agents
 3. **Implement circuit breaker** when failure rate exceeds threshold
--- a/tools/parallel-capacity-test/results/report_20260331_035345.md
+++ b/tools/parallel-capacity-test/results/report_20260331_035345.md
@@ -0,0 +1,38 @@
 # Parallel Capacity Test Report
 **Generated:** 2026-03-31 03:53:45
 ## Summary
 | Agents | Duration | Success | Failed | Timeout | Avg Response | Peak Mem (MB) | Mem/Agent | Cost Score |
 |--------|----------|---------|--------|---------|--------------|---------------|-----------|------------|
 | 1 | 7.0s | 1 | 0 | 0 | 6.3s | 2547MB | 363.6MB | 2.55 |
 | 2 | 13.0s | 2 | 0 | 0 | 9.2s | 2889MB | 350.0MB | 9.11 |
 | 3 | 8.0s | 3 | 0 | 0 | 6.3s | 3233MB | 340.4MB | 8.19 |
 | 5 | 12.0s | 5 | 0 | 0 | 6.7s | 3912MB | 340.3MB | 20.49 |
 | 8 | 62.5s | 0 | 0 | 8 | 60.0s | 4033MB | 223.4MB | 111.69 |
 ## Cost Analysis
 | Metric | Value |
 |--------|-------|
 | Baseline Memory | 2183.3 MB |
 | Avg Memory per Agent | 323.5 MB |
 | Memory Limit | 1024 MB |
 | Estimated Max Capacity | 3 agents |
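
The figures in this table can be reproduced from the per-run values in `results_20260331_035345.json`. The sketch below is an inference from the reported numbers, not taken from the tool's source: it assumes memory per agent is `(peak - baseline) / agents` and that the estimated capacity is the memory limit divided by the average memory per agent, rounded down. The function names `memory_per_agent` and `estimated_max_capacity` are hypothetical.

```python
import math


def memory_per_agent(peak_mb: float, baseline_mb: float, agents: int) -> float:
    """Assumed formula: memory growth over baseline, split evenly across agents."""
    return (peak_mb - baseline_mb) / agents


def estimated_max_capacity(per_agent_mb: list[float], limit_mb: float) -> int:
    """Assumed formula: memory limit divided by the mean per-agent memory, floored."""
    avg = sum(per_agent_mb) / len(per_agent_mb)
    return math.floor(limit_mb / avg)


# memory_per_agent_mb values for the 1, 2, 3, 5 and 8 agent runs in this report
per_agent = [
    363.576171875,
    349.98583984375,
    340.423828125,
    340.3390625,
    223.3907470703125,
]

print(round(sum(per_agent) / len(per_agent), 1))  # 323.5
print(estimated_max_capacity(per_agent, 1024))    # 3
```

Under these assumptions the estimate compares only the per-agent average against the limit. The baseline (2183.3 MB) is already above the 1024 MB limit, so the baseline does not appear to enter the capacity calculation.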
 ## Key Findings
 ### Optimal Configuration
 - **5 agents** achieved perfect success rate
  - Average response time: 6.7s
  - Peak CPU: 0.0%
  - Peak Memory: 3911.8MB (0.0%)
  - Memory per agent: 340.3MB
  - Cost score: 20.49
 ## Recommendations
 1. **Recommended max agents:** 5 for stable operation
 2. **Monitor closely:** 5+ agents
 3. **Implement circuit breaker** when failure rate exceeds threshold
--- a/tools/parallel-capacity-test/results/report_20260331_040751.md
+++ b/tools/parallel-capacity-test/results/report_20260331_040751.md
@@ -0,0 +1,27 @@
 # Parallel Capacity Test Report
 **Generated:** 2026-03-31 04:07:51
 ## Summary
 | Agents | Duration | Success | Failed | Timeout | Avg Response | Peak Mem (MB) | Mem/Agent | Cost Score |
 |--------|----------|---------|--------|---------|--------------|---------------|-----------|------------|
 | 1 | 1.0s | 0 | 1 | 0 | 0.0s | 2461MB | 1.9MB | 0.00 |
 | 2 | 1.0s | 0 | 2 | 0 | 0.0s | 2464MB | 0.5MB | 0.00 |
 | 3 | 1.0s | 0 | 3 | 0 | 0.0s | 2444MB | 0.1MB | 0.00 |
 ## Cost Analysis
 | Metric | Value |
 |--------|-------|
 | Baseline Memory | 2458.8 MB |
 | Avg Memory per Agent | 0.8 MB |
 | Memory Limit | 1024 MB |
 | Estimated Max Capacity | 1241 agents |
 ## Key Findings
 ## Recommendations
 2. **Monitor closely:** 5+ agents
 3. **Implement circuit breaker** when failure rate exceeds threshold
--- a/tools/parallel-capacity-test/results/results_20260331_035130.json
+++ b/tools/parallel-capacity-test/results/results_20260331_035130.json
@@ -0,0 +1,107 @@
 [
  {
    "agent_count": 1,
    "total_duration": 1.0135109424591064,
    "success_count": 0,
    "failed_count": 1,
    "timeout_count": 0,
    "avg_response_time": 0.011479854583740234,
    "stddev_response_time": 0,
    "min_response_time": 0.011479854583740234,
    "max_response_time": 0.011479854583740234,
    "peak_cpu_percent": 0.0,
    "avg_cpu_percent": 0.0,
    "peak_memory_mb": 2177.1123046875,
    "avg_memory_mb": 2177.10498046875,
    "peak_memory_percent": 0.0,
    "avg_memory_percent": 0.0,
    "peak_opencode_procs": 0,
    "baseline_memory_mb": 2177.1162109375,
    "memory_per_agent_mb": -0.00390625,
    "total_cost_score": 0
  },
  {
    "agent_count": 2,
    "total_duration": 1.0150294303894043,
    "success_count": 0,
    "failed_count": 2,
    "timeout_count": 0,
    "avg_response_time": 0.004192829132080078,
    "stddev_response_time": 0.0006507473410082039,
    "min_response_time": 0.0037326812744140625,
    "max_response_time": 0.004652976989746094,
    "peak_cpu_percent": 0.0,
    "avg_cpu_percent": 0.0,
    "peak_memory_mb": 2175.671875,
    "avg_memory_mb": 2175.529296875,
    "peak_memory_percent": 0.0,
    "avg_memory_percent": 0.0,
    "peak_opencode_procs": 0,
    "baseline_memory_mb": 2175.13671875,
    "memory_per_agent_mb": 0.267578125,
    "total_cost_score": 0.0005431993436068297
  },
  {
    "agent_count": 3,
    "total_duration": 1.0151348114013672,
    "success_count": 0,
    "failed_count": 3,
    "timeout_count": 0,
    "avg_response_time": 0.00410922368367513,
    "stddev_response_time": 0.0005485598755713246,
    "min_response_time": 0.0034792423248291016,
    "max_response_time": 0.004481315612792969,
    "peak_cpu_percent": 0.0,
    "avg_cpu_percent": 0.0,
    "peak_memory_mb": 2175.234375,
    "avg_memory_mb": 2175.171875,
    "peak_memory_percent": 0.0,
    "avg_memory_percent": 0.0,
    "peak_opencode_procs": 0,
    "baseline_memory_mb": 2174.984375,
    "memory_per_agent_mb": 0.08333333333333333,
    "total_cost_score": 0.0002537837028503418
  },
  {
    "agent_count": 5,
    "total_duration": 1.0233359336853027,
    "success_count": 0,
    "failed_count": 5,
    "timeout_count": 0,
    "avg_response_time": 0.003859806060791016,
    "stddev_response_time": 0.0005061271938518695,
    "min_response_time": 0.003265857696533203,
    "max_response_time": 0.004559516906738281,
    "peak_cpu_percent": 0.0,
    "avg_cpu_percent": 0.0,
    "peak_memory_mb": 2174.8115234375,
    "avg_memory_mb": 2174.765625,
    "peak_memory_percent": 0.0,
    "avg_memory_percent": 0.0,
    "peak_opencode_procs": 0,
    "baseline_memory_mb": 2174.7197265625,
    "memory_per_agent_mb": 0.018359375,
    "total_cost_score": 9.393904078751803e-05
  },
  {
    "agent_count": 8,
    "total_duration": 1.0180647373199463,
    "success_count": 0,
    "failed_count": 8,
    "timeout_count": 0,
    "avg_response_time": 0.0040419697761535645,
    "stddev_response_time": 0.0005073540280823215,
    "min_response_time": 0.0034415721893310547,
    "max_response_time": 0.004962921142578125,
    "peak_cpu_percent": 0.0,
    "avg_cpu_percent": 0.0,
    "peak_memory_mb": 2175.9697265625,
    "avg_memory_mb": 2175.328125,
    "peak_memory_percent": 0.0,
    "avg_memory_percent": 0.0,
    "peak_opencode_procs": 0,
    "baseline_memory_mb": 2174.6826171875,
    "memory_per_agent_mb": 0.160888671875,
    "total_cost_score": 0.0013103606677614152
  }
 ]
--- a/tools/parallel-capacity-test/results/results_20260331_035345.json
+++ b/tools/parallel-capacity-test/results/results_20260331_035345.json
@@ -0,0 +1,107 @@
 [
  {
    "agent_count": 1,
    "total_duration": 7.013643741607666,
    "success_count": 1,
    "failed_count": 0,
    "timeout_count": 0,
    "avg_response_time": 6.2816431522369385,
    "stddev_response_time": 0,
    "min_response_time": 6.2816431522369385,
    "max_response_time": 6.2816431522369385,
    "peak_cpu_percent": 0.0,
    "avg_cpu_percent": 0.0,
    "peak_memory_mb": 2546.8349609375,
    "avg_memory_mb": 2439.7982177734375,
    "peak_memory_percent": 0.0,
    "avg_memory_percent": 0.0,
    "peak_opencode_procs": 0,
    "baseline_memory_mb": 2183.2587890625,
    "memory_per_agent_mb": 363.576171875,
    "total_cost_score": 2.549993742468767
  },
  {
    "agent_count": 2,
    "total_duration": 13.01965594291687,
    "success_count": 2,
    "failed_count": 0,
    "timeout_count": 0,
    "avg_response_time": 9.241770267486572,
    "stddev_response_time": 4.460840653831581,
    "min_response_time": 6.087479591369629,
    "max_response_time": 12.396060943603516,
    "peak_cpu_percent": 0.0,
    "avg_cpu_percent": 0.0,
    "peak_memory_mb": 2889.0400390625,
    "avg_memory_mb": 2659.376727764423,
    "peak_memory_percent": 0.0,
    "avg_memory_percent": 0.0,
    "peak_opencode_procs": 0,
    "baseline_memory_mb": 2189.068359375,
    "memory_per_agent_mb": 349.98583984375,
    "total_cost_score": 9.113390439316863
  },
  {
    "agent_count": 3,
    "total_duration": 8.017883539199829,
    "success_count": 3,
    "failed_count": 0,
    "timeout_count": 0,
    "avg_response_time": 6.328219811121623,
    "stddev_response_time": 1.4813371254887444,
    "min_response_time": 4.74861478805542,
    "max_response_time": 7.686349391937256,
    "peak_cpu_percent": 0.0,
    "avg_cpu_percent": 0.0,
    "peak_memory_mb": 3233.111328125,
    "avg_memory_mb": 2848.880425347222,
    "peak_memory_percent": 0.0,
    "avg_memory_percent": 0.0,
    "peak_opencode_procs": 0,
    "baseline_memory_mb": 2211.83984375,
    "memory_per_agent_mb": 340.423828125,
    "total_cost_score": 8.188435823624488
  },
  {
    "agent_count": 5,
    "total_duration": 12.039501190185547,
    "success_count": 5,
    "failed_count": 0,
    "timeout_count": 0,
    "avg_response_time": 6.650626277923584,
    "stddev_response_time": 2.765260504640065,
    "min_response_time": 4.714812755584717,
    "max_response_time": 11.523208379745483,
    "peak_cpu_percent": 0.0,
    "avg_cpu_percent": 0.0,
    "peak_memory_mb": 3911.77734375,
    "avg_memory_mb": 2996.949669471154,
    "peak_memory_percent": 0.0,
    "avg_memory_percent": 0.0,
    "peak_opencode_procs": 0,
    "baseline_memory_mb": 2210.08203125,
    "memory_per_agent_mb": 340.3390625,
    "total_cost_score": 20.487562740176916
  },
  {
    "agent_count": 8,
    "total_duration": 62.496517181396484,
    "success_count": 0,
    "failed_count": 0,
    "timeout_count": 8,
    "avg_response_time": 60,
    "stddev_response_time": 0.0,
    "min_response_time": 60,
    "max_response_time": 60,
    "peak_cpu_percent": 0.0,
    "avg_cpu_percent": 0.0,
    "peak_memory_mb": 4033.01171875,
    "avg_memory_mb": 3940.566368689904,
    "peak_memory_percent": 0.0,
    "avg_memory_percent": 0.0,
    "peak_opencode_procs": 0,
    "baseline_memory_mb": 2245.8857421875,
    "memory_per_agent_mb": 223.3907470703125,
    "total_cost_score": 111.68914929955825
  }
 ]
--- a/tools/parallel-capacity-test/results/results_20260331_040751.json
+++ b/tools/parallel-capacity-test/results/results_20260331_040751.json
@@ -0,0 +1,65 @@
 [
  {
    "agent_count": 1,
    "total_duration": 1.0171289443969727,
    "success_count": 0,
    "failed_count": 1,
    "timeout_count": 0,
    "avg_response_time": 0.005397796630859375,
    "stddev_response_time": 0,
    "min_response_time": 0.005397796630859375,
    "max_response_time": 0.005397796630859375,
    "peak_cpu_percent": 0.0,
    "avg_cpu_percent": 0.0,
    "peak_memory_mb": 2460.7001953125,
    "avg_memory_mb": 2459.75439453125,
    "peak_memory_percent": 0.0,
    "avg_memory_percent": 0.0,
    "peak_opencode_procs": 0,
    "baseline_memory_mb": 2458.80859375,
    "memory_per_agent_mb": 1.8916015625,
    "total_cost_score": 0.001924002700485289
  },
  {
    "agent_count": 2,
    "total_duration": 1.0177080631256104,
    "success_count": 0,
    "failed_count": 2,
    "timeout_count": 0,
    "avg_response_time": 0.004194378852844238,
    "stddev_response_time": 0.0005352649760883542,
    "min_response_time": 0.003815889358520508,
    "max_response_time": 0.004572868347167969,
    "peak_cpu_percent": 0.0,
    "avg_cpu_percent": 0.0,
    "peak_memory_mb": 2464.1708984375,
    "avg_memory_mb": 2463.69287109375,
    "peak_memory_percent": 0.0,
    "avg_memory_percent": 0.0,
    "peak_opencode_procs": 0,
    "baseline_memory_mb": 2463.21484375,
    "memory_per_agent_mb": 0.47802734375,
    "total_cost_score": 0.0009729845642577857
  },
  {
    "agent_count": 3,
    "total_duration": 1.016812801361084,
    "success_count": 0,
    "failed_count": 3,
    "timeout_count": 0,
    "avg_response_time": 0.00549777348836263,
    "stddev_response_time": 0.0004058027330303703,
    "min_response_time": 0.0052263736724853516,
    "max_response_time": 0.0059642791748046875,
    "peak_cpu_percent": 0.0,
    "avg_cpu_percent": 0.0,
    "peak_memory_mb": 2443.9794921875,
    "avg_memory_mb": 2443.8232421875,
    "peak_memory_percent": 0.0,
    "avg_memory_percent": 0.0,
    "peak_opencode_procs": 0,
    "baseline_memory_mb": 2443.6669921875,
    "memory_per_agent_mb": 0.10416666666666667,
    "total_cost_score": 0.00031775400042533875
  }
 ]
--- a/tools/parallel-capacity-test/results/summary_20260331_035130.csv
+++ b/tools/parallel-capacity-test/results/summary_20260331_035130.csv
@@ -0,0 +1,6 @@
 agents,duration,success,failed,timeout,avg_response,stddev,min_response,max_response,peak_cpu,avg_cpu,peak_mem_mb,avg_mem_mb,peak_mem_pct,avg_mem_pct,peak_procs,baseline_mem,mem_per_agent,cost_score
 1,1.01,0,1,0,0.01,0.00,0.01,0.01,0.0,0.0,2177.1,2177.1,0.0,0.0,0,2177.1,-0.0,0.00
 2,1.02,0,2,0,0.00,0.00,0.00,0.00,0.0,0.0,2175.7,2175.5,0.0,0.0,0,2175.1,0.3,0.00
 3,1.02,0,3,0,0.00,0.00,0.00,0.00,0.0,0.0,2175.2,2175.2,0.0,0.0,0,2175.0,0.1,0.00
 5,1.02,0,5,0,0.00,0.00,0.00,0.00,0.0,0.0,2174.8,2174.8,0.0,0.0,0,2174.7,0.0,0.00
 8,1.02,0,8,0,0.00,0.00,0.00,0.00,0.0,0.0,2176.0,2175.3,0.0,0.0,0,2174.7,0.2,0.00
--- a/tools/parallel-capacity-test/results/summary_20260331_035345.csv
+++ b/tools/parallel-capacity-test/results/summary_20260331_035345.csv
@@ -0,0 +1,6 @@
 agents,duration,success,failed,timeout,avg_response,stddev,min_response,max_response,peak_cpu,avg_cpu,peak_mem_mb,avg_mem_mb,peak_mem_pct,avg_mem_pct,peak_procs,baseline_mem,mem_per_agent,cost_score
 1,7.01,1,0,0,6.28,0.00,6.28,6.28,0.0,0.0,2546.8,2439.8,0.0,0.0,0,2183.3,363.6,2.55
 2,13.02,2,0,0,9.24,4.46,6.09,12.40,0.0,0.0,2889.0,2659.4,0.0,0.0,0,2189.1,350.0,9.11
 3,8.02,3,0,0,6.33,1.48,4.75,7.69,0.0,0.0,3233.1,2848.9,0.0,0.0,0,2211.8,340.4,8.19
 5,12.04,5,0,0,6.65,2.77,4.71,11.52,0.0,0.0,3911.8,2996.9,0.0,0.0,0,2210.1,340.3,20.49
 8,62.50,0,0,8,60.00,0.00,60.00,60.00,0.0,0.0,4033.0,3940.6,0.0,0.0,0,2245.9,223.4,111.69
--- a/tools/parallel-capacity-test/results/summary_20260331_040751.csv
+++ b/tools/parallel-capacity-test/results/summary_20260331_040751.csv
@@ -0,0 +1,4 @@
 agents,duration,success,failed,timeout,avg_response,stddev,min_response,max_response,peak_cpu,avg_cpu,peak_mem_mb,avg_mem_mb,peak_mem_pct,avg_mem_pct,peak_procs,baseline_mem,mem_per_agent,cost_score
 1,1.02,0,1,0,0.01,0.00,0.01,0.01,0.0,0.0,2460.7,2459.8,0.0,0.0,0,2458.8,1.9,0.00
 2,1.02,0,2,0,0.00,0.00,0.00,0.00,0.0,0.0,2464.2,2463.7,0.0,0.0,0,2463.2,0.5,0.00
 3,1.02,0,3,0,0.01,0.00,0.01,0.01,0.0,0.0,2444.0,2443.8,0.0,0.0,0,2443.7,0.1,0.00