Design: Implement parallel agent capacity limits and queueing #37

Closed
opened 2026-03-31 06:01:57 +02:00 by shoko · 1 comment
Owner

Context

Testing revealed that running 8+ parallel opencode agents causes timeouts due to resource contention. Current implementation has no built-in limits.

Questions to Resolve

  1. What happens when max agents is reached?

    • Reject new requests?
    • Block waiting for slot?
    • Queue for later execution?
  2. Where does the queue live?

    • In-memory (lost on restart)?
    • Persistent (file/database)?
    • Hermes/Python layer?
    • External (Redis, SQLite)?
  3. Queue ordering?

    • FIFO (first come first served)?
    • Priority-based?
    • By issue severity?
  4. Queue timeout?

    • How long to wait in queue?
    • Auto-expire queued requests?
  5. Backpressure signaling?

    • How does PM agent know limits are reached?
    • Should PM stop delegating when near limit?

Current Behavior

  • No limits enforced
  • 8+ agents cause resource contention and timeouts
  • Recommended max: 5 agents based on testing

Implementation Options

Option A: Simple Semaphore

  • In-memory counter with max concurrent limit
  • New agents blocked until slot available
  • No persistence, no queue
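A minimal sketch of Option A using Python's `threading.Semaphore`; the limit of 5 comes from the testing above, while `run_agent` and the `execute` callable are hypothetical names, not existing project API:

```python
import threading

MAX_AGENTS = 5  # recommended cap from testing

agent_slots = threading.Semaphore(MAX_AGENTS)

def run_agent(task, execute):
    # Blocks the caller until one of the MAX_AGENTS slots is free,
    # runs the task, then releases the slot on exit.
    # No persistence, no inspectable queue -- exactly Option A's trade-off.
    with agent_slots:
        return execute(task)
```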

Option B: Queue with Persistence

  • Requests queued in SQLite/Redis
  • PM queries queue position
  • Human can inspect/manage queue
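Option B could be sketched with the stdlib `sqlite3` module; the schema and helper names here are assumptions for illustration, not the project's API. `rowid` ordering gives FIFO for free, and a human (or the PM) can query position at any time:

```python
import sqlite3

def open_queue(path=":memory:"):
    # One-table FIFO queue; the INTEGER PRIMARY KEY preserves arrival order.
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS queue (id INTEGER PRIMARY KEY, issue TEXT)")
    return db

def enqueue(db, issue):
    db.execute("INSERT INTO queue (issue) VALUES (?)", (issue,))
    db.commit()

def dequeue(db):
    # Pop the oldest entry (FIFO); returns None when the queue is empty.
    row = db.execute("SELECT id, issue FROM queue ORDER BY id LIMIT 1").fetchone()
    if row is None:
        return None
    db.execute("DELETE FROM queue WHERE id = ?", (row[0],))
    db.commit()
    return row[1]

def position(db, issue):
    # 1-based queue position, queryable by the PM agent or a human.
    rows = [r[0] for r in db.execute("SELECT issue FROM queue ORDER BY id")]
    return rows.index(issue) + 1
```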

Option C: PM Agent Self-Regulation

  • PM agent tracks active agents count
  • PM decides to delay delegation when near limit
  • No explicit queue, uses PM memory

Out of Scope for This Issue

  • Actual implementation (separate issues/PRs)
  • Specific queue backend choice
  • Priority implementation details

Related

  • Issue #3 (parallel capacity testing - completed)
  • Parallel capacity test tool: tools/parallel-capacity-test/
Author
Owner

Feedback

Great design issue! Here is my analysis based on the kugetsu architecture:

Recommended Approach: Option C (PM Agent Self-Regulation) + Minimal Queue

Why not pure Option A (Semaphore):

  • No visibility into queue state
  • PM agent can't inform user about wait times
  • Lost on restart (but an agent restart means a new context anyway)

Why not full Option B (Persistent Queue):

  • Overkill for Phase 3/initial Telegram UX
  • Redis/SQLite adds complexity
  • FIFO may not match priority needs

Proposed Hybrid Design

1. PM Agent tracks active count in memory

  • PM session stores: active_agents: [session_ids]
  • PM updates count on delegation and completion
  • Simple, no external dependency

2. User-facing queue via kugetsu index

```json
{
  "base": "ses_abc",
  "pm_agent": "ses_pm",
  "active": ["issue-14.json", "issue-15.json"],
  "queued": ["issue-16.json"]
}
```
  • Persisted to ~/.kugetsu/index.json
  • kugetsu CLI can show: kugetsu queue
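Assuming the index shape above, a small helper could implement the delegate/complete bookkeeping; the function names and the `max_agents` default are illustrative, not existing kugetsu API:

```python
def delegate(index, issue, max_agents=5):
    # Activate the issue if a slot is free; otherwise append to the
    # FIFO "queued" list and report the 1-based position to the user.
    if len(index["active"]) < max_agents:
        index["active"].append(issue)
        return "active"
    index["queued"].append(issue)
    return f"queued, position {len(index['queued'])}"

def complete(index, issue):
    # Free the slot and promote the next queued issue, if any.
    index["active"].remove(issue)
    if index["queued"]:
        index["active"].append(index["queued"].pop(0))
```

The same dict would be serialized to `~/.kugetsu/index.json` after each mutation so `kugetsu queue` can read it back.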

3. Queue behavior:

| Scenario | Behavior |
|----------|----------|
| Max reached (e.g., 5) | New request queued, PM tells user "queued, position N" |
| Slot frees up | PM picks next from queue |
| Queue timeout | PM notifies user "task expired after 24h" |
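The 24h-expiry behavior could be implemented by sweeping enqueue timestamps; this is a sketch with an assumed data shape (a list of issue/timestamp pairs), not the project's actual queue format:

```python
import time

QUEUE_TIMEOUT = 24 * 3600  # 24h, matching the scenario above

def expire_stale(queued, now=None, timeout=QUEUE_TIMEOUT):
    # queued: list of (issue, enqueued_at) pairs.
    # Returns (kept, expired_issue_names); the PM would notify the
    # user for each expired entry.
    now = time.time() if now is None else now
    kept = [(i, t) for i, t in queued if now - t < timeout]
    expired = [i for i, t in queued if now - t >= timeout]
    return kept, expired
```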

4. Queue ordering:

  • FIFO for now
  • Priority can come later (Phase 2 API)

Where to Implement

| Layer | What |
|-------|------|
| kugetsu CLI | Track active count, manage queued list, persist to index.json |
| PM Agent | Check capacity before delegation, update on completion |
| Telegram UX | PM responds with queue position |

Capacity Limit

Based on testing, 5 agents seems reasonable. But this should be:

  • Configurable via ~/.kugetsu/config.json
  • Default: 3 (safer for resource-constrained containers)
  • Tunable based on container RAM/CPU
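Loading the cap from `~/.kugetsu/config.json` with the safer default of 3 might look like the following; the `max_agents` key name is an assumption:

```python
import json
from pathlib import Path

DEFAULT_MAX_AGENTS = 3  # safer default for resource-constrained containers

def load_max_agents(config_path="~/.kugetsu/config.json"):
    # Fall back to the default when the file or the key is missing,
    # so a fresh install works without any configuration.
    path = Path(config_path).expanduser()
    if not path.exists():
        return DEFAULT_MAX_AGENTS
    config = json.loads(path.read_text())
    return int(config.get("max_agents", DEFAULT_MAX_AGENTS))
```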

Integration with Phase 3

For Telegram UX, the PM agent needs to:

  1. Check kugetsu queue before delegating
  2. If full, add to queue and tell user
  3. When slot frees, pick from queue and notify user

Questions

  1. Should we implement queue timeout with notification?
  2. Should PM agent auto-retry queued tasks or wait for user confirmation?
  3. Do we need "cancel queued task" functionality?
shoko closed this issue 2026-03-31 14:48:53 +02:00

Reference: shoko/kugetsu#37