Bug: MAX_CONCURRENT_AGENTS limit not enforced #63

Closed
opened 2026-04-01 04:04:04 +02:00 by shoko · 4 comments
Owner

Bug

The MAX_CONCURRENT_AGENTS limit is not enforced because slots are released immediately after forking, not after the child process completes.

Root Cause

In cmd_start():

  1. acquire_agent_slot - Slot acquired
  2. opencode run --fork ... - FORKS child, parent returns IMMEDIATELY
  3. release_agent_slot - Slot released BEFORE child finishes
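
The sequence above can be reproduced without opencode. What follows is a minimal sketch, not the kugetsu code: `fake_fork` is a hypothetical stand-in for `opencode run --fork`, and the `acquire_agent_slot` / `release_agent_slot` bodies are simplified to an in-memory counter purely for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for `opencode run --fork`:
# starts background work and returns to the caller at once.
fake_fork() {
    sleep 2 >/dev/null 2>&1 &   # the "agent" keeps running
    agent_pid=$!
}

# Simplified slot store (assumption: the real one is not shown in this issue).
slots_in_use=0
acquire_agent_slot() { slots_in_use=$((slots_in_use + 1)); }
release_agent_slot() { slots_in_use=$((slots_in_use - 1)); }

acquire_agent_slot    # 1. slot acquired
fake_fork             # 2. returns immediately
release_agent_slot    # 3. slot released while the agent is still alive

if kill -0 "$agent_pid" 2>/dev/null; then
    agent_running=yes
else
    agent_running=no
fi
echo "slots in use: $slots_in_use, agent still running: $agent_running"

kill "$agent_pid" 2>/dev/null || true   # clean up the stand-in agent
```

The slot counter is back at zero while the stand-in agent is still running, which is why the limit never triggers.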

Impact

MAX_CONCURRENT_AGENTS limit is non-functional. Any number of agents can run.

shoko added the critical label 2026-04-01 04:04:14 +02:00
Author
Owner

Issue with This Fix

The fix waits for opencode run --fork to complete, but opencode run --fork returns immediately after FORKING the child process - it does NOT wait for the agent session to finish.

Tested: 6 kugetsu start commands run back to back all succeeded, because each one returned in ~1 sec and released its slot before the next one started.

The Real Issue

opencode run --fork forks a child process for the agent, then immediately returns. wait $child_pid only waits for the parent opencode to finish creating the session, not for the actual agent to finish.
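
The same effect can be shown in isolation. This is a sketch under stated assumptions: `launcher` is a hypothetical function that mimics a command which detaches its real work and exits, and the 3-second `sleep` stands in for an agent session.

```bash
#!/usr/bin/env bash
# Hypothetical launcher standing in for `opencode run --fork`:
# it detaches the long-running work, then exits right away.
launcher() {
    sleep 3 >/dev/null 2>&1 &   # stand-in agent, outlives the launcher
}

start=$SECONDS
launcher &             # run the launcher as a child process
child_pid=$!
wait "$child_pid"      # returns when the launcher exits, not the agent
elapsed=$((SECONDS - start))

echo "wait returned after ${elapsed}s; the stand-in agent sleeps for 3s"
```

`wait` only tracks the process it was given. Work that process detached before exiting is invisible to it, so the measured time stays well under the agent's runtime.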

Options

  1. Remove --fork so opencode waits for session completion
  2. Track actual session count - check opencode session list before starting new one
  3. Keep slot but rate limit based on time

Option 2 is best - count actual active sessions instead of using slots.

Author
Owner

Update: pgrep approach won't work

Tested pgrep opencode - it finds 0 matching processes even with active sessions. Reason:

  • opencode run --fork creates session on opencode SERVER
  • Client process exits immediately after session creation
  • No persistent local process to count

opencode session list has no filtering

Options found:

  • --max-count N - limit to N most recent
  • --format json|table - change output format

There is no filter for state or activity: all sessions are listed, with no way to distinguish active from idle or stale ones.

Conclusion

Option C (PM cleanup) is the only viable approach.

PM should destroy sessions at natural breakpoints:

  1. After PR merged (task truly complete)
  2. On user explicit request (explicit cleanup)

This avoids false positives and doesn't add latency, since the user is already waiting for a response at these points.

Recommendation: Close this PR and implement Option C instead - add session cleanup to PM workflow.

Author
Owner

Session-Counting Approach (Recommended Fix)

The slot mechanism doesn't work because opencode run --fork returns immediately after forking, not after the child completes. We need a different approach.

New Proposal: Session Counting

Count actual sessions from ~/.kugetsu/sessions/:

  • Exclude base session
  • Exclude PM session
  • Remaining = active agent sessions

Session Files

~/.kugetsu/sessions/
  base.json                    - base session (EXCLUDE)
  pm-agent.json                - PM session (EXCLUDE)
  github.com-shoko-kugetsu-49.json  - dev agent session 1 (COUNT)
  github.com-shoko-kugetsu-57.json  - dev agent session 2 (COUNT)
  github.com-shoko-kugetsu-60.json  - dev agent session 3 (COUNT)

Count Logic

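# Count dev agent sessions: every *.json except base.json and pm-agent.json.
# (An issue-*.json glob would match none of the session files listed above.)
active_count=$(find ~/.kugetsu/sessions -maxdepth 1 -name '*.json' \
    ! -name base.json ! -name pm-agent.json 2>/dev/null | wc -l)
if [ "$active_count" -ge "$MAX_CONCURRENT_AGENTS" ]; then
    echo "Error: Max concurrent agents ($MAX_CONCURRENT_AGENTS) reached" >&2
    exit 1
fi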

Rules

  1. NEW session creation - check count, reject if at limit
  2. Existing session continue - ALWAYS allowed (--continue doesn't count toward limit)
  3. PM cleanup - destroy session when task done (PR merged or user request)

Flow Example (limit=3)

  1. kugetsu init → base + PM sessions created
  2. User: "fix issue #49" → PM spawns dev session #49 (count=1)
  3. User: "fix issue #57" → PM spawns dev session #57 (count=2)
  4. User: "fix issue #60" → PM spawns dev session #60 (count=3)
  5. User: "fix issue #56" → REJECTED (count=3 >= limit)
  6. Dev #49 done → PM merges PR → PM destroys session #49 (count=2)
  7. User: "fix issue #56" → PM spawns dev session #56 (count=3)
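
The flow above can be simulated against a throwaway directory. This is only a sketch: `count_sessions`, `start_session`, and `destroy_session` are hypothetical helpers, session files are created with `touch` instead of real opencode sessions, and a temp dir is used in place of ~/.kugetsu/sessions.

```bash
#!/usr/bin/env bash
# Assumption: a temp dir replaces ~/.kugetsu/sessions for this simulation.
SESSIONS_DIR=$(mktemp -d)
MAX_CONCURRENT_AGENTS=3
log=""

# Count every *.json except the base and PM session files.
count_sessions() {
    find "$SESSIONS_DIR" -maxdepth 1 -name '*.json' \
        ! -name base.json ! -name pm-agent.json | wc -l | tr -d ' '
}

# $1 = issue number. Rejects when the limit is already reached.
start_session() {
    if [ "$(count_sessions)" -ge "$MAX_CONCURRENT_AGENTS" ]; then
        log="$log $1:rejected"
        return 1
    fi
    touch "$SESSIONS_DIR/github.com-shoko-kugetsu-$1.json"
    log="$log $1:started"
}

# $1 = issue number. Stands in for PM cleanup after a merge.
destroy_session() {
    rm -f "$SESSIONS_DIR/github.com-shoko-kugetsu-$1.json"
}

touch "$SESSIONS_DIR/base.json" "$SESSIONS_DIR/pm-agent.json"  # step 1
start_session 49             # step 2
start_session 57             # step 3
start_session 60             # step 4
start_session 56 || true     # step 5: at limit
destroy_session 49           # step 6: PR merged, session destroyed
start_session 56             # step 7: a slot is free again

final_count=$(count_sessions)
echo "$log (count=$final_count)"
rm -rf "$SESSIONS_DIR"
```

The limit only recovers when a session file is removed, so this approach depends on the PM cleanup step running reliably.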

Benefits

  • Reliable: counts actual sessions, not broken slot mechanism
  • Simple: just count files in sessions directory
  • No false positives: existing sessions can always continue
  • No process tracking needed: sessions persist on opencode server

This properly enforces MAX_CONCURRENT_AGENTS while preserving --fork functionality.

Should I implement this?

Author
Owner

Fix Implemented: Session-Counting Approach

PR created with session-counting implementation:

PR: #65 (https://git.fbrns.co/shoko/kugetsu/pulls/65)

Changes Made

  1. Added count_active_dev_sessions() - counts actual session files in ~/.kugetsu/sessions/, excluding base.json and pm-agent.json

  2. Modified cmd_start() - checks session count before creating new session, rejects if at limit

  3. Removed wait $child_pid - it had no effect, since opencode run --fork returns immediately rather than when the child completes

  4. Modified cmd_continue() - no longer counts toward limit (existing sessions can always continue via --continue)
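
For readers without access to the PR, here is one way a function like count_active_dev_sessions() could look. It is a sketch based only on the description above and may differ from the code in the PR. The `KUGETSU_SESSIONS_DIR` override is a hypothetical addition that lets the example run against a temp directory.

```bash
#!/usr/bin/env bash
# Sketch only: the real function in the PR may differ.
# KUGETSU_SESSIONS_DIR is a hypothetical override for testing.
count_active_dev_sessions() {
    local dir="${KUGETSU_SESSIONS_DIR:-$HOME/.kugetsu/sessions}"
    local count=0 f
    for f in "$dir"/*.json; do
        [ -e "$f" ] || continue             # glob matched nothing
        case "${f##*/}" in
            base.json|pm-agent.json) ;;     # never counted
            *) count=$((count + 1)) ;;
        esac
    done
    echo "$count"
}

# Usage: check the limit against a throwaway sessions directory.
KUGETSU_SESSIONS_DIR=$(mktemp -d)
touch "$KUGETSU_SESSIONS_DIR"/{base,pm-agent,a,b,c,d}.json
MAX_CONCURRENT_AGENTS=3

active=$(count_active_dev_sessions)
if [ "$active" -ge "$MAX_CONCURRENT_AGENTS" ]; then
    verdict=reject
else
    verdict=allow
fi
echo "Active sessions: $active, MAX: $MAX_CONCURRENT_AGENTS, verdict: $verdict"
rm -rf "$KUGETSU_SESSIONS_DIR"
```

A shell loop with `case` avoids parsing `ls` output and treats a missing or empty directory as zero sessions.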

Rules

  1. NEW session creation - check count, reject if at limit
  2. Existing session continue - ALWAYS allowed (--continue does not count toward limit)
  3. PM cleanup - destroy session when task done

Testing

With MAX=3 and 6 existing sessions:

Active sessions: 6
MAX: 3
Would REJECT new session (at limit)

This properly enforces MAX_CONCURRENT_AGENTS while preserving --fork functionality.

shoko closed this issue 2026-04-01 07:40:00 +02:00

Reference: shoko/kugetsu#63