Add parallel capacity test tool for Hermes/OpenCode #5

Merged
shoko merged 7 commits from fix/issue-3-parallel-test into main 2026-03-31 06:28:58 +02:00
Owner

Summary

Adds a parallel capacity test tool to stress test Hermes/OpenCode and find the practical limit of parallel agent execution.

What was added

Files

  • tools/parallel-capacity-test/run_test.sh - Bash script for running tests
  • tools/parallel-capacity-test/parallel_capacity_test.py - Python tool with detailed metrics
  • tools/parallel-capacity-test/README.md - Documentation

Capabilities

  • Spawns N concurrent opencode run agents
  • Measures response time, CPU usage, and memory usage
  • Ramps up from 1 to higher agent counts (configurable)
  • Identifies failure points and performance degradation
  • Generates JSON, CSV, and Markdown reports
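
The ramp-up described above can be sketched roughly as follows. This is a minimal illustration, not the code in parallel_capacity_test.py: the function names `run_one`, `run_level`, and `ramp` are hypothetical, and a stand-in Python command replaces the real `opencode run` call so the sketch runs anywhere.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def run_one(cmd, timeout):
    """Run one command; return (status, elapsed_seconds)."""
    start = time.monotonic()
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
        status = "success" if proc.returncode == 0 else "failed"
    except subprocess.TimeoutExpired:
        status = "timeout"
    return status, time.monotonic() - start


def run_level(cmd, agents, timeout):
    """Run `agents` copies of cmd concurrently and tally the outcomes."""
    with ThreadPoolExecutor(max_workers=agents) as pool:
        results = list(pool.map(lambda _: run_one(cmd, timeout), range(agents)))
    tally = {"success": 0, "failed": 0, "timeout": 0}
    for status, _ in results:
        tally[status] += 1
    tally["avg_response"] = sum(elapsed for _, elapsed in results) / agents
    return tally


def ramp(cmd, levels, timeout):
    """Run every agent count in `levels`; also report the first level with a non-success."""
    report = {n: run_level(cmd, n, timeout) for n in levels}
    first_failure = next((n for n in levels if report[n]["success"] < n), None)
    return report, first_failure


if __name__ == "__main__":
    # Stand-in command (assumption); the real tool would launch `opencode run`.
    stand_in = [sys.executable, "-c", "print('PARALLEL_TEST_OK')"]
    report, first_failure = ramp(stand_in, [1, 2, 3], timeout=30)
    for n, tally in report.items():
        print(n, tally)
    print("first failing level:", first_failure)
```

Threads are enough here because each worker only waits on a child process; the actual load comes from the spawned commands.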

Usage

cd tools/parallel-capacity-test
./parallel_capacity_test.py --quick
# or
./parallel_capacity_test.py --agents 15 --timeout 120

Fixes #3

shoko added 1 commit 2026-03-27 11:31:02 +01:00
This tool tests the practical limits of parallel agent execution
by spawning N concurrent opencode run tasks and measuring:
- Response time
- CPU and memory usage
- Success/failure rates

Includes both bash (run_test.sh) and Python (parallel_capacity_test.py)
implementations with full metrics collection and reporting.

Fixes #3
First-time contributor

Can you run the test with different parameters and post the results here?

Author
Owner

Parallel Capacity Test Results

Ran the parallel capacity test tool with different agent counts (timeout: 60s).

Results Summary

| Agent Count | Success | Failed | Timeout |
|-------------|---------|--------|---------|
| 1 agent     | 0       | 1      | 0       |
| 2 agents    | 0       | 2      | 0       |
| 3 agents    | 0       | 3      | 0       |
| 5 agents    | 0       | 5      | 0       |

Observations

  • All tests resulted in failures (0 success) across all agent counts tested
  • No timeouts occurred within the 60-second window
  • The test appears to fail consistently regardless of agent count
  • This suggests a fundamental issue with the test setup rather than a capacity limit

Test Environment

  • Tool: parallel_capacity_test.py
  • Timeout: 60 seconds per test
  • Date: 2026-03-27

Note: psutil was not available, so resource monitoring was limited.

Author
Owner

Debug Report: Parallel Capacity Test Tool Failing All Tests

Problem

All tests return 0 success regardless of agent count (1, 2, 3, 5 all fail).

Root Cause Found

The script parallel_capacity_test.py uses the wrong command-line option for specifying the working directory.

Bug location: line 138, in the _run_single_agent() method:

['opencode', 'run', task, '--workdir', workdir]

Issue: opencode run does NOT have a --workdir option. It uses --dir instead.

When --workdir is passed, opencode treats it as an unrecognized option and shows the help message, causing the subprocess to return exit code 1 and no useful output.

Evidence

With correct --dir option (works):

$ opencode run "Respond with exactly: PARALLEL_TEST_OK" --dir /tmp/test_opencode
> build · MiniMax-M2.7
PARALLEL_TEST_OK

With incorrect --workdir option (fails):

$ opencode run "Respond with exactly: PARALLEL_TEST_OK" --workdir /tmp/test_wrong
# Shows help message and exits with code 1

Fix Applied

Changed line 138 in parallel_capacity_test.py from:

['opencode', 'run', task, '--workdir', workdir],

to:

['opencode', 'run', task, '--dir', workdir],
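
A failure like this is easier to spot when the exit code and the first lines of output are reported instead of a bare "failed". The sketch below shows one way to do that. The helper names `build_command` and `check_result` are hypothetical and not taken from parallel_capacity_test.py; only the `opencode run <task> --dir <workdir>` shape and the `PARALLEL_TEST_OK` marker come from the evidence above.

```python
import subprocess
import sys

EXPECTED_MARKER = "PARALLEL_TEST_OK"  # marker string used in the manual test above


def build_command(task, workdir):
    """Build the opencode invocation with --dir (not --workdir)."""
    return ["opencode", "run", task, "--dir", workdir]


def check_result(proc, marker=EXPECTED_MARKER):
    """Return (ok, detail); on failure, detail holds the exit code and output head."""
    if proc.returncode == 0 and marker in proc.stdout:
        return True, ""
    head = (proc.stderr or proc.stdout).strip().splitlines()[:3]
    return False, f"exit code {proc.returncode}: " + " | ".join(head)


if __name__ == "__main__":
    # Stand-in child (assumption) that mimics a CLI printing usage text and exiting 1.
    fake_cli = [sys.executable, "-c",
                "import sys; print('Usage: demo [options]', file=sys.stderr); sys.exit(1)"]
    proc = subprocess.run(fake_cli, capture_output=True, text=True)
    print(check_result(proc))
```

With a check like this, the original `--workdir` bug would have surfaced as usage text in the failure detail on the first run.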

Verification

After fix, test now passes:

[RESULT] 1 agents: 1 success, 0 failed, 0 timeout

Additional Notes

  • psutil is not installed, causing resource monitoring warnings (non-critical)
First-time contributor

Yes, we should debug it. What are the options other than psutil? I'd prefer something built-in, so that no additional installation is needed if possible. If there is no such approach, then we should tell users that we depend on an external tool and that they need to install it before using ours. What do you think?
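
One built-in option on Unix is Python's standard `resource` module, which reports CPU time and peak resident memory for child processes without any extra install. The sketch below is an illustration under that assumption, not what the tool currently does, and `measure` is a hypothetical helper name. The module is not available on Windows, and the figures are process-wide, so per-agent attribution under concurrency would need another source such as reading `/proc/<pid>/status` on Linux.

```python
import resource
import subprocess
import sys
import time


def measure(cmd, timeout=60):
    """Run cmd; return exit code, wall time, CPU time, and peak child RSS (Unix only)."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    wall = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # CPU time consumed by children that finished during this call.
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return {
        "returncode": proc.returncode,
        "wall_s": wall,
        "cpu_s": cpu,
        # High-water mark over every child waited for so far, not only this one.
        "peak_rss_mb": after.ru_maxrss / divisor,
    }


if __name__ == "__main__":
    print(measure([sys.executable, "-c", "sum(range(10**6))"]))
```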

shoko added 1 commit 2026-03-27 12:54:22 +01:00
First-time contributor

Alright, can you now post the test results using different parameters?

shoko added 1 commit 2026-03-31 05:29:08 +02:00
shoko added 2 commits 2026-03-31 05:32:15 +02:00
First-time contributor

In the parallel capacity test, can we use the kugetsu CLI instead of calling the opencode CLI directly? I want to be sure we are testing our own tool, so we know exactly how many parallel agents we can run at a time: how much a single agent costs, and how the cost grows as we scale up. Let's say the most memory we can use is 1 GB. What do you think?

shoko added 1 commit 2026-03-31 05:47:48 +02:00
Author
Owner

Parallel Capacity Test Results (Kugetsu Mode)

I first tested with the kugetsu CLI, but every test failed because kugetsu init requires a TTY. The results below are from opencode mode instead:

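| Agents | Duration | Success | Failed | Timeout | Avg Response | Mem/Agent | Cost Score |
|--------|----------|---------|--------|---------|--------------|-----------|------------|
| 1      | 7.0s     | 1       | 0      | 0       | 6.3s         | 363.6MB   | 2.55       |
| 2      | 13.0s    | 2       | 0      | 0       | 9.2s         | 350.0MB   | 9.11       |
| 3      | 8.0s     | 3       | 0      | 0       | 6.3s         | 340.4MB   | 8.19       |
| 5      | 12.0s    | 5       | 0      | 0       | 6.7s         | 340.3MB   | 20.49      |
| 8      | 62.5s    | 0       | 0      | 8       | 60.0s        | 223.4MB   | 111.69     |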

Key Findings

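  • Highest fully successful level: 5 agents (100% success rate)
  • Memory per agent: ~340MB (under 1GB each; combined usage at 5 agents is about 1.7GB)
  • 8 agents: all 8 timed out at 60s (likely resource contention); 6 and 7 agents were not tested
  • Estimated max capacity: ~3-5 agents before degradation, with the failure point somewhere between 5 and 8 agents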

Notes

  1. Kugetsu mode requires a TTY for kugetsu init, so it cannot run headless
  2. For a true kugetsu stress test, kugetsu init has to be run first, and the test after that
  3. A memory limit of 1GB per agent is generous, since actual usage is ~340MB/agent; if the 1GB is meant as a total budget instead, only 2-3 agents fit
  4. Cost score = (memory_delta * duration) / 1000
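
The cost score and the memory budget can be checked with a few lines of arithmetic. The sketch below assumes that memory_delta is the combined memory of all agents in MB (agents × Mem/Agent) and that duration is in seconds. That reading is inferred from the table rather than stated by the tool, and it reproduces the reported scores to within rounding. The 1024MB total budget is likewise an assumption about what the 1 GB limit means, and both function names are hypothetical.

```python
def cost_score(memory_delta_mb, duration_s):
    """Cost score as defined in note 4: (memory_delta * duration) / 1000."""
    return memory_delta_mb * duration_s / 1000


def agents_within_budget(mem_per_agent_mb, budget_mb):
    """Largest whole number of agents whose combined memory fits the budget."""
    return int(budget_mb // mem_per_agent_mb)


# (agents, duration_s, mem_per_agent_mb) rows copied from the table above.
ROWS = [(1, 7.0, 363.6), (2, 13.0, 350.0), (3, 8.0, 340.4), (5, 12.0, 340.3), (8, 62.5, 223.4)]

if __name__ == "__main__":
    for agents, duration, mem in ROWS:
        total_mb = agents * mem  # assumption: memory_delta is the combined usage in MB
        print(agents, round(total_mb, 1), round(cost_score(total_mb, duration), 2))
    # Assumption: the 1 GB limit is a total budget of 1024 MB.
    print("agents within 1024 MB:", agents_within_budget(340.4, 1024))
```

Under these assumptions the score grows with both agent count and wall time, which is why the 8-agent run, where every agent hit the 60s timeout, scores far higher than the others.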
shoko added 1 commit 2026-03-31 06:02:12 +02:00
han approved these changes 2026-03-31 06:28:11 +02:00
han left a comment
First-time contributor

lgtm

shoko merged commit a3c24e53b9 into main 2026-03-31 06:28:58 +02:00