Add GitHub Actions workflow for benchmark load tests with baseline comparison #801
Copilot wants to merge 10 commits into
Co-authored-by: juliusknorr <3404133+juliusknorr@users.noreply.github.com>
🚀 Benchmark Load Test Results
These results show the performance characteristics of the websocket server under load.
Details
📊 50 concurrent users
{
"concurrency": 50,
"cpu": {
"average": 31.091304347826078,
"peak": 132
},
"memory": {
"averageRssMb": 216.54976222826087,
"peakRssMb": 223.71484375
},
"network": {
"bytesIn": 0,
"bytesOut": 0
},
"loadSummary": {
"serverUrl": "http://127.0.0.1:3002",
"roomId": "benchmark-room",
"concurrency": 50,
"activeSenders": 15,
"activeRatio": 0.3,
"durationSeconds": 45,
"updateRate": 3,
"bytesSent": 457958,
"bytesReceived": 28560682,
"messagesSent": 2697,
"messagesReceived": 72316,
"averageJoinDelayMs": 109.18,
"droppedConnections": 0
}
}
📊 100 concurrent users
{
"concurrency": 100,
"cpu": {
"average": 63.76382978723404,
"peak": 145
},
"memory": {
"averageRssMb": 418.6961436170213,
"peakRssMb": 498.17578125
},
"network": {
"bytesIn": 0,
"bytesOut": 0
},
"loadSummary": {
"serverUrl": "http://127.0.0.1:3002",
"roomId": "benchmark-room",
"concurrency": 100,
"activeSenders": 30,
"activeRatio": 0.3,
"durationSeconds": 45,
"updateRate": 3,
"bytesSent": 917154,
"bytesReceived": 103672940,
"messagesSent": 5374,
"messagesReceived": 182705,
"averageJoinDelayMs": 239.39,
"droppedConnections": 0
}
}
Note: These benchmarks run with reduced load (50, 100 users) for CI efficiency. For full benchmarks, see the README.
@copilot I'd like to store a baseline of the output in the repo and when posting the pull request comment show the diffed values.
Implemented in commit a042309. The workflow now:
Example output:
| Concurrent Users | Avg CPU | Peak CPU | Avg Memory (MB) | Peak Memory (MB) | Dropped Connections |
|------------------|---------|----------|-----------------|------------------|--------------------|
| 50 | 11.50% (🔴+1.03%) | 50.00% (🔴+2.30%) | 250.00 (🔴+7.23) | 270.00 (🔴+6.61) | 0 (⚪0) |
| 100 | 18.50% (🟢-1.37%) | 52.00% (🟢-3.60%) | 350.00 (🟢-13.73) | 365.00 (🟢-5.77) | 0 (⚪0) |
Falls back to regular display if baseline is not found.
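The delta cells in the example table above could be produced by something like the following. This is a minimal sketch, not the workflow's actual code: `formatWithDelta` and its options are hypothetical names, and it assumes that lower is better for every metric (an increase is marked 🔴, a decrease 🟢), which is what the example rows suggest.

```javascript
// Hypothetical helper: render a metric value plus its delta against a baseline.
// Assumption: an increase counts as a regression for every metric shown.
function formatWithDelta(current, baseline, { decimals = 2, unit = "" } = {}) {
  const value = `${current.toFixed(decimals)}${unit}`;

  // No baseline available: fall back to the regular display.
  if (typeof baseline !== "number") {
    return value;
  }

  // Round first so that tiny floating-point noise is reported as "no change".
  const rounded = Number((current - baseline).toFixed(decimals));
  if (rounded === 0) {
    return `${value} (⚪0)`;
  }

  const icon = rounded > 0 ? "🔴" : "🟢";
  const sign = rounded > 0 ? "+" : "-";
  return `${value} (${icon}${sign}${Math.abs(rounded).toFixed(decimals)}${unit})`;
}

// Baseline numbers below are illustrative, back-calculated from the example table.
console.log(formatWithDelta(11.5, 10.47, { unit: "%" }));
console.log(formatWithDelta(350, 363.73));
console.log(formatWithDelta(0, 0, { decimals: 0 }));
console.log(formatWithDelta(250, undefined));
```

Rounding the delta before choosing the indicator keeps a difference of, say, 0.001 from showing up as a red or green change of "0.00".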
🚀 Benchmark Load Test Results
Comparison with baseline from
Details
📊 50 concurrent users
{
"concurrency": 50,
"cpu": {
"average": 0,
"peak": 0
},
"memory": {
"averageRssMb": 223.9958444148936,
"peakRssMb": 245.3359375
},
"network": {
"bytesIn": 0,
"bytesOut": 0
},
"loadSummary": {
"serverUrl": "http://127.0.0.1:3002",
"roomId": "benchmark-room",
"concurrency": 50,
"activeSenders": 15,
"activeRatio": 0.3,
"durationSeconds": 45,
"updateRate": 3,
"bytesSent": 457512,
"bytesReceived": 34522513,
"messagesSent": 2694,
"messagesReceived": 104054,
"averageJoinDelayMs": 142.28,
"droppedConnections": 0
}
}
📊 100 concurrent users
{
"concurrency": 100,
"cpu": {
"average": 0,
"peak": 0
},
"memory": {
"averageRssMb": 327.6195977393617,
"peakRssMb": 369.85546875
},
"network": {
"bytesIn": 0,
"bytesOut": 0
},
"loadSummary": {
"serverUrl": "http://127.0.0.1:3002",
"roomId": "benchmark-room",
"concurrency": 100,
"activeSenders": 30,
"activeRatio": 0.3,
"durationSeconds": 45,
"updateRate": 3,
"bytesSent": 915782,
"bytesReceived": 103712476,
"messagesSent": 5366,
"messagesReceived": 250862,
"averageJoinDelayMs": 317.19,
"droppedConnections": 0
}
}
Note: These benchmarks run with reduced load (50, 100 users) for CI efficiency. For full benchmarks, see the README.
🚀 Benchmark Load Test Results
Comparison with baseline from
Details
📊 50 concurrent users
{
"concurrency": 50,
"cpu": {
"average": 0,
"peak": 0
},
"memory": {
"averageRssMb": 167.21017323369566,
"peakRssMb": 179.203125
},
"network": {
"bytesIn": 0,
"bytesOut": 0
},
"loadSummary": {
"serverUrl": "http://127.0.0.1:3002",
"roomId": "benchmark-room",
"concurrency": 50,
"activeSenders": 15,
"activeRatio": 0.3,
"durationSeconds": 45,
"updateRate": 3,
"bytesSent": 456453,
"bytesReceived": 25743126,
"messagesSent": 2688,
"messagesReceived": 68888,
"averageJoinDelayMs": 259.12,
"droppedConnections": 0
}
}
📊 100 concurrent users
{
"concurrency": 100,
"cpu": {
"average": 0,
"peak": 0
},
"memory": {
"averageRssMb": 281.1586277173913,
"peakRssMb": 307.07421875
},
"network": {
"bytesIn": 0,
"bytesOut": 0
},
"loadSummary": {
"serverUrl": "http://127.0.0.1:3002",
"roomId": "benchmark-room",
"concurrency": 100,
"activeSenders": 30,
"activeRatio": 0.3,
"durationSeconds": 45,
"updateRate": 3,
"bytesSent": 908445,
"bytesReceived": 70074578,
"messagesSent": 5323,
"messagesReceived": 109385,
"averageJoinDelayMs": 607.58,
"droppedConnections": 0
}
}
Note: These benchmarks run with reduced load (50, 100 users) for CI efficiency. For full benchmarks, see the README.
@copilot Those values should also be included in the table and diff "bytesSent": 908445,
Implemented in commit de03398. Added four new columns to the results table:
Example output:
| Concurrent Users | Avg CPU | Peak CPU | Avg Memory (MB) | Peak Memory (MB) | Bytes Sent | Bytes Received | Messages Sent | Messages Received | Dropped Connections |
|------------------|---------|----------|-----------------|------------------|------------|----------------|---------------|-------------------|--------------------|
| 50 | 11.50% (🔴+1.03%) | 50.00% (🔴+2.30%) | 250.00 (🔴+7.23) | 270.00 (🔴+6.61) | 908,445 (🔴+759,123) | 70,074,578 (🔴+46,579,353) | 5,323 | 109,385 | 0 (⚪0) |
Network bytes show diffs compared to baseline, while message counts are new metrics not in the baseline.
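To illustrate how the network and message cells might be rendered, here is a small sketch. The helper `formatCount` is a hypothetical name, and the baseline values are back-calculated from the deltas in the example row purely for illustration. A metric missing from the baseline is printed without a diff, matching the message-count columns above.

```javascript
// Hypothetical helper: format an integer metric with thousand separators,
// adding a delta only when the baseline contains that metric.
function formatCount(current, baseline) {
  const value = current.toLocaleString("en-US");
  if (typeof baseline !== "number") {
    return value; // metric is not in the baseline: show the plain value
  }
  const delta = current - baseline;
  if (delta === 0) {
    return `${value} (⚪0)`;
  }
  const icon = delta > 0 ? "🔴" : "🟢";
  const sign = delta > 0 ? "+" : "-";
  return `${value} (${icon}${sign}${Math.abs(delta).toLocaleString("en-US")})`;
}

// Current values are taken from the example row above.
const current = {
  bytesSent: 908445,
  bytesReceived: 70074578,
  messagesSent: 5323,
  messagesReceived: 109385,
};

// Illustrative baseline: only byte counters are present, message counts are absent.
const baseline = { bytesSent: 149322, bytesReceived: 23495225 };

const columns = ["bytesSent", "bytesReceived", "messagesSent", "messagesReceived"];
const cells = columns.map((key) => formatCount(current[key], baseline[key]));
const row = `| ${cells.join(" | ")} |`;
console.log(row);
```

Passing an explicit locale to `toLocaleString` keeps the separators stable regardless of the runner's locale settings.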
Closing as I was just experimenting. I don't think the baseline comparison is useful yet, but we could think about other metrics to compare.
Automates load testing on PRs touching websocket server or benchmark code. Posts formatted results with baseline comparison as PR comments.
Implementation
Triggers on PRs touching websocket_server/, tools/benchmarks/, or the workflow file; manual runs via workflow_dispatch
Compares results against tools/benchmarks/results.json and shows deltas with color indicators (🟢 improvement, 🔴 regression, ⚪ no change)
Example Output
The workflow produces PR comments with baseline comparison including network and message metrics:
Network byte metrics show diffs compared to baseline, while message counts are new metrics not in the baseline. Large numbers are formatted with thousand separators for readability.
Uses the existing tools/benchmarks/runBenchmarks.mjs script, with extraction via sed and validation via Node.js JSON.parse.
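The extract-then-validate step can be sketched in plain JavaScript. Note that the workflow itself uses sed for extraction; the version below is a stand-in that does both steps in Node.js. It assumes the benchmark output contains a single top-level JSON object surrounded by log lines, and `extractJson` is a hypothetical name.

```javascript
// Hypothetical stand-in for the sed extraction plus JSON.parse validation.
// Assumption: the output holds exactly one top-level JSON object, and the
// surrounding log lines contain no "{" before it or "}" after it.
function extractJson(output) {
  const start = output.indexOf("{");
  const end = output.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("no JSON object found in benchmark output");
  }
  // JSON.parse throws a SyntaxError if the extracted text is not valid JSON.
  return JSON.parse(output.slice(start, end + 1));
}

// Example input shaped like the result blocks in this thread (values shortened).
const sampleOutput = [
  "Starting benchmark...",
  "{",
  '  "concurrency": 50,',
  '  "memory": { "averageRssMb": 216.55, "peakRssMb": 223.71 }',
  "}",
  "Done.",
].join("\n");

const result = extractJson(sampleOutput);
console.log(result.concurrency, result.memory.peakRssMb);
```

Failing loudly on malformed output means a broken benchmark run is reported as an error instead of being posted as an empty or partial comment.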