Set minimum NUM_MACHINES=6 for special.functional on AIX to prevent OOM #6990
annaibm wants to merge 1 commit into adoptium:master
Is this tested?
@pshipton, yes, this was tested via Grinder run https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/59332/ on
Perhaps we should look at the tests that produce so much output and see about reducing it.
Or am I misunderstanding why the OOM occurs?
My understanding is that the OOM occurs in resultsSum.pl, which reads the entire
This is a very good initiative. There are several other ways to improve MBCS tests, and we should likely make a plan to address all of them. Related: #5161
Pull request overview
Updates the Jenkins dynamic parallelization logic to ensure special.functional jobs on AIX are split into multiple parallel lists, avoiding Perl OOM during resultsSummary processing on lower-memory AIX workers.
Changes:
- Enforce a minimum `NUM_MACHINES=6` when generating the parallel list for `TARGET == "special.functional"` on AIX.
- Regenerate `parallelList.mk` with the higher minimum when the initial computed `NUM_LIST` is below 6.
- Add a targeted log message referencing the motivating issue for traceability.
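The clamping rule in the list above can be sketched roughly as follows. This is an illustrative Python model only, not the pipeline code from the PR: the function name `effective_num_machines`, its parameters, and the platform string `ppc64_aix` are assumptions made for the example, and the log message wording is invented.

```python
# Hypothetical sketch of the "minimum number of parallel lists" rule.
# Names and the platform string are assumptions, not taken from the PR diff.

MIN_AIX_SPECIAL_FUNCTIONAL = 6  # minimum stated in the PR title


def effective_num_machines(target, platform, computed_num_list,
                           minimum=MIN_AIX_SPECIAL_FUNCTIONAL):
    """Return how many parallel lists should be generated.

    If the target is special.functional on an AIX platform and the
    initially computed count is below the minimum, the minimum is used
    instead (the point at which the real pipeline would regenerate
    parallelList.mk). Otherwise the computed count is returned unchanged.
    """
    is_aix = "aix" in platform.lower()
    if target == "special.functional" and is_aix and computed_num_list < minimum:
        # Targeted log line so the override is traceable in the build log.
        print(f"{target} on AIX: raising NUM_MACHINES from "
              f"{computed_num_list} to {minimum} to avoid OOM in result processing")
        return minimum
    return computed_num_list


if __name__ == "__main__":
    # An unsplit job (1 list) on AIX is forced up to the minimum.
    print(effective_num_machines("special.functional", "ppc64_aix", 1))
    # A job that is already split more finely is left alone.
    print(effective_num_machines("special.functional", "ppc64_aix", 8))
```

Note that the override only ever raises the count: a computed value at or above the minimum, another target, or a non-AIX platform all pass through unchanged.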
AIX machines (e.g., paix820) with limited RAM (8GB) run into OOM errors in resultsSummary when special.functional runs as a single unsplit job. Adding a minimum of 6 parallel lists ensures the job is always split, preventing Perl OOM failures during test result processing.
related: https://github.ibm.com/runtimes/automation/issues/921
Signed-off-by: Anna Babu Palathingal <anna.bp@ibm.com>
ae9ab71 to 313119a
Not sure we'll still need this change; I'm setting it to draft.