Compare commits
1 Commits
fix/issue-
...
ce4116bcb1
| Author | SHA1 | Date | |
|---|---|---|---|
| | ce4116bcb1 | | |
67
.github/ISSUES/fix-queue-daemon-excess-agents.md
vendored
@@ -1,67 +0,0 @@
-# Fix: Queue daemon spawning excess agents due to race condition
-
-## Problem
-
-When enqueueing multiple tasks (e.g., 6 tasks), the queue daemon was spawning many more subagents than expected, eventually exhausting container memory.
-
-**Root Cause:** The combination of:
-1. `process_queue()` calling `opencode run` directly instead of `kugetsu start`, bypassing all concurrency logic
-2. `count_active_dev_sessions()` counting `pm-agent.json` toward `MAX_CONCURRENT_AGENTS`, reducing effective dev agent slots
-3. No atomic locking around session count check + session file creation (TOCTOU race condition)
-4. Background spawning of multiple concurrent processes in `process_queue()`
-
-**Expected behavior:** With `MAX_CONCURRENT_AGENTS=3` and 6 tasks:
-- Tasks should be processed sequentially via `kugetsu start`
-- Only 3 dev agents should run at a time
-- Tasks should queue and wait for slots to free up
-
-## Solution
-
-### 1. `count_active_dev_sessions()` - Exclude pm-agent
-Only count actual dev agent session files (exclude `pm-agent.json`).
-
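The counting rule in section 1 can be illustrated with a minimal shell sketch. It assumes one JSON file per session in a single directory; the directory argument and the default path `~/.kugetsu/sessions` are hypothetical and are not taken from the kugetsu source. Any other non-dev file kept in that directory would need the same exclusion.

```shell
#!/usr/bin/env bash
# Sketch only: the sessions directory layout is an assumption, not kugetsu's real one.

# Print the number of dev agent session files, ignoring the PM agent's file.
count_active_dev_sessions() {
    local sessions_dir="${1:-$HOME/.kugetsu/sessions}"   # hypothetical default path
    local count=0 file
    for file in "$sessions_dir"/*.json; do
        [ -e "$file" ] || continue                       # glob matched nothing
        case "$(basename "$file")" in
            pm-agent.json) continue ;;                   # PM agent does not use a dev slot
        esac
        count=$((count + 1))
    done
    echo "$count"
}
```

With `MAX_CONCURRENT_AGENTS=3`, excluding `pm-agent.json` leaves all three slots for dev agents instead of two.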
-### 2. `process_queue()` - Call `kugetsu start` directly + retry logic
-- Call `kugetsu start` directly (foreground, sequential) instead of spawning `opencode run` background process
-- Dynamic batch size = available slots (removes need for `QUEUE_DAEMON_BATCH_SIZE`)
-- Retry logic (max 3 attempts) on failure
-- On failure: cleanup worktree/session and revert to `pending` state
-- Save `fork_pid` to queue item for timeout handling
-
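The dynamic batch size and the retry rule from section 2 could look roughly like the sketch below. `available_slots`, `run_with_retry`, and `MAX_ATTEMPTS` are hypothetical names chosen for illustration; the actual `process_queue()` is not shown in this diff.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are assumptions, not part of kugetsu.

MAX_ATTEMPTS=3   # "max 3 attempts" from the issue text

# Batch size = free slots, never negative.
# Usage: available_slots <max_concurrent> <active_count>
available_slots() {
    local max="$1" active="$2"
    local free=$((max - active))
    [ "$free" -gt 0 ] || free=0
    echo "$free"
}

# Run a command in the foreground, retrying until it succeeds
# or MAX_ATTEMPTS runs have failed.
run_with_retry() {
    local attempt=1
    while [ "$attempt" -le "$MAX_ATTEMPTS" ]; do
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
    done
    return 1   # caller cleans up and reverts the item to pending
}
```

A caller might run `run_with_retry kugetsu start "$issue_ref" "$message"` for each of the first `available_slots` pending items and, on a non-zero status, clean up the worktree and session and set the item back to `pending`.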
-### 3. `cmd_start()` - Add flock
-- Add flock around critical section (count check + fork)
-- Track `fork_pid` for queue item timeout handling
-
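The critical section described in section 3 might be sketched as follows, assuming `flock(1)` from util-linux is available. `reserve_slot`, the lock file name, and the session file layout are hypothetical. The sketch only claims a slot by creating the session file; the real `cmd_start()` would also fork the agent and record `fork_pid` while holding the lock.

```shell
#!/usr/bin/env bash
# Sketch only: function name, lock file, and directory layout are assumptions.

MAX_CONCURRENT_AGENTS="${MAX_CONCURRENT_AGENTS:-3}"

# Atomically check the dev session count and claim a slot.
# Usage: reserve_slot <sessions_dir> <session_name>
# Returns 0 if the slot was claimed, 2 if no slot is free.
reserve_slot() {
    local sessions_dir="$1" session_name="$2"
    local lock_file="$sessions_dir/.start.lock"
    local count=0 f
    (
        flock -x 9 || exit 1                     # wait for exclusive ownership
        for f in "$sessions_dir"/*.json; do
            [ -e "$f" ] || continue
            [ "$(basename "$f")" = "pm-agent.json" ] && continue
            count=$((count + 1))
        done
        if [ "$count" -ge "$MAX_CONCURRENT_AGENTS" ]; then
            exit 2                               # limit reached, nothing created
        fi
        : > "$sessions_dir/$session_name.json"   # claim the slot before unlocking
    ) 9>"$lock_file"
}
```

Because the count and the file creation happen under one lock, two concurrent callers cannot both observe the same free slot, which is the TOCTOU window named in the root cause list.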
-### 4. Notification System
-New notification types:
-| Event | Type |
-|-------|------|
-| Task enqueued | `task_queued` |
-| Task dequeued | `task_dequeued` |
-| Task started | `task_started` |
-| Task completed | `task_completed` |
-| Task error | `task_error` |
-
-### 5. Config
-- Remove `QUEUE_DAEMON_BATCH_SIZE` (no longer needed - batch size is now dynamic)
-
-## Notification Flow
-
-| Event | Location | Type |
-|-------|----------|------|
-| Task enqueued | `enqueue_task()` | `task_queued` |
-| Task dequeued | `process_queue()` after state change to `notified` | `task_dequeued` |
-| Task started | `cmd_start()` after session file created | `task_started` |
-| Task completed | `update_queue_item_state()` | `task_completed` |
-| Task error | `update_queue_item_state()` | `task_error` |
-
-## Out of Scope
-
-- Re-check loop in cmd_start (checking if session DB is reliable) - deferred to separate research issue
-- Buffer mechanism for excess forking (safety failsafe only)
-
-## Status
-
-- [x] Issue created
-- [x] Implementation
-- [x] PR created (#147)
-- [ ] Merged
1
.kugetsu-worktrees/git.fbrns.co-shoko-jigaido-2
Submodule
Submodule .kugetsu-worktrees/git.fbrns.co-shoko-jigaido-2 added at 332d7fc60a
@@ -49,8 +49,6 @@ A default config file is created during `kugetsu init` with commented examples:
 | `MAX_CONCURRENT_AGENTS` | 3 | Maximum number of concurrent dev agents |
 | `KUGETSU_TEMP_DIR` | `~/.local/share/opencode/tool-output` | Temp directory for subagent tool output (useful in headless environments where /tmp is restricted) |
 | `KUGETSU_VERBOSITY` | `default` | PM agent verbosity level: `verbose`, `default`, or `quiet` |
-| `QUEUE_DAEMON_INTERVAL_MINUTES` | 5 | How often daemon polls queue (in minutes) |
-| `QUEUE_CLEANUP_AGE_DAYS` | 7 | Auto-cleanup completed/error items older than N days |
 
 ### Environment Variables for Agents
 
@@ -113,10 +111,6 @@ Each issue session gets its own git worktree to prevent conflicts:
 ├── worktrees/
 │ ├── github.com-shoko-kugetsu-14/ # Isolated workdir for issue #14
 │ └── github.com-shoko-kugetsu-15/ # Isolated workdir for issue #15
-├── queue/
-│ ├── items/ # Queue item JSON files
-│ ├── daemon.pid # Daemon process ID
-│ └── daemon.log # Daemon log output
 └── index.json # Maps session IDs and issue refs to session files
 ```
 
@@ -264,17 +258,16 @@ kugetsu destroy --base -y
 
 ### kugetsu delegate `<message>`
 
-Send a message to the PM agent for task coordination via queue:
+Send a message to the PM agent for task coordination (fire-and-forget):
 ```bash
 kugetsu delegate "work on issue #14"
 kugetsu delegate "review PR #92"
 ```
 
-- **Always enqueues** (fire-and-forget): returns immediately
-- Queue daemon polls queue and invokes PM when slots available
-- Tasks are processed FIFO (first-in-first-out)
-- Use `kugetsu queue list` to see pending tasks
-- Use `kugetsu queue-daemon logs` to debug queue processing
+- Non-blocking: returns immediately, runs in background
+- PM agent processes the message asynchronously
+- Uses `KUGETSU_VERBOSITY` env var to control PM agent output verbosity
+- Log output stored in `~/.kugetsu/logs/delegate-<timestamp>.log`
 
 ### kugetsu logs [n]
 
@@ -335,79 +328,35 @@ kugetsu server default github # Set default server
 kugetsu server get github # Get server URL
 ```
 
-### kugetsu queue <list|stats|clear>
+### kugetsu queue <list|enqueue|dequeue|clear>
 
 Manage task queue for autonomous PM operation:
 ```bash
-kugetsu queue list # Show queued tasks with status
-kugetsu queue stats # Show queue statistics (total, pending, notified, completed, error)
-kugetsu queue clear # Clean up old completed/error items
-kugetsu queue enqueue <issue-ref> <message> # Manually enqueue a task
+kugetsu queue list # Show queued tasks
+kugetsu queue enqueue "task" # Add task to queue
+kugetsu queue dequeue # Remove next task from queue
+kugetsu queue clear # Clear all queued tasks
 ```
 
-**Queue Item States:**
-- `pending` - Waiting in queue, daemon can pick up
-- `notified` - PM agent has picked up the task
-- `completed` - Dev agent finished, PR created
-- `error` - Timeout or failure
-
-### kugetsu queue-daemon <start|stop|restart|status|logs>
-
-Manage the queue daemon background process:
-```bash
-kugetsu queue-daemon start # Start daemon in background
-kugetsu queue-daemon stop # Stop daemon
-kugetsu queue-daemon restart # Restart daemon
-kugetsu queue-daemon status # Check if daemon is running
-kugetsu queue-daemon logs # Show recent daemon logs
-```
-
-**Daemon Behavior:**
-1. Runs at configurable interval (default: 5 minutes)
-2. Checks if active agents < MAX_CONCURRENT_AGENTS
-3. Picks 1-N pending items (configurable batch size)
-4. Forks PM session for each picked item
-5. PM decides whether to use `start` or `continue`
-
-**Queue Directory:**
-```
-~/.kugetsu/queue/
-├── items/ # Queue item JSON files
-│ ├── q_1234567890.json # One file per queued task
-│ └── q_1234567891.json
-├── daemon.pid # Daemon process ID
-├── daemon.lock # Daemon lock file
-└── daemon.log # Daemon log output
-```
+- Queue stored in `~/.kugetsu/queue.json`
 
 ## Workflow Example
 
-### First-time Setup
 ```bash
-# Initialize kugetsu (requires TTY)
+# First-time setup (requires TTY)
 kugetsu init
+# Creates: base session + pm-agent session
 
-# Start the queue daemon (for autonomous operation)
-kugetsu queue-daemon start
-```
+# Start work on issue
+kugetsu start github.com/shoko/kugetsu#14 "implement feature X"
+# Creates: worktree at ~/.kugetsu/worktrees/github.com-shoko-kugetsu-14/
 
-### Normal Workflow
-```bash
-# Enqueue tasks via delegate - agents will process them automatically
-kugetsu delegate "work on issue #14"
-kugetsu delegate "review PR #92"
-
-# Check queue status
-kugetsu queue list # See pending tasks
-kugetsu queue stats # See statistics
-
-# Debug queue daemon
-kugetsu queue-daemon status # Is daemon running?
-kugetsu queue-daemon logs # See daemon logs
-
-# Continue work on existing issue
+# Continue later
 kugetsu continue github.com/shoko/kugetsu#14 "add tests"
 
+# Continue again
+kugetsu continue github.com/shoko/kugetsu#14 "fix failing test"
+
 # List all sessions
 kugetsu list
 
@@ -418,21 +367,6 @@ kugetsu prune --force
 kugetsu destroy github.com/shoko/kugetsu#14
 ```
 
-### Queue Daemon Management
-```bash
-# Check if daemon is running
-kugetsu queue-daemon status
-
-# View daemon logs for debugging
-kugetsu queue-daemon logs
-
-# Restart daemon if needed
-kugetsu queue-daemon restart
-
-# Stop daemon
-kugetsu queue-daemon stop
-```
-
 ## Headless Operation
 
 This design solves the headless CLI limitation discovered in Issue #14:
File diff suppressed because it is too large