This one is ready to go.
Hi Steve, thanks for making these changes! We unfortunately have to work on a resubmission of the conference paper, which is due pretty soon. We will review and approve the changes after we finish the resubmission on April 10.
Sure, no problem. Let me know if you need/want help with the paper. |
Okay guys, this PR is still waiting for some kind of action. If we can get it approved and merged before Friday, then I think we have a chance of getting through a weekly regression this weekend, for the first time since November maybe :) Thanks! |
This is part of my ongoing effort to make weekly regressions more useful and more manageable.
Instead of trying to run the entire suite of weekly apps in a single 2-day chunk, this new code runs it in sequential chunks, where each chunk is a restartable group, e.g. glb_tests, glb_tests_RV, etc. It still takes 2 days for the entire run, but a single failure at the end no longer requires a whole new 2-day run from the beginning. Also, when/if a group fails, the test continues on to run the remaining groups. This way, you can optionally restart the failed step/group even as the remaining groups continue to run.
Examples:
In terms of being more manageable, weekly runs now bypass the weird byzantine regress-metahooks/regression-steps mechanism in favor of a much simpler weekly.yml driver. The new driver gets loaded as soon as pipeline.yml recognizes that we are doing a weekly run and not the normal aha1-9 regressions.
And also I took this opportunity to simplify and optimize the way we do E64_supported_test checks.
Summary of changes
New files
- generate-weekly-pipeline.sh
- weekly.yml: much simpler full regression pipeline, generated with help from generate-weekly-pipeline script
Changed files
- pipeline.yml: new "Launch Weekly Run" step lets us swap weekly.yml in place of normal aha9 regressions
- app: added new --subgroup option to run a single config group standalone
- repress.py: --group(s) option, e.g. "--groups glb_tests,resnet_tests"
- repress_info.py: fixed method summarize_and_print_info(), which was supposed to use the timing table read-only, but oops no
- tests.py and regress_util.py: E64_supported_test group properties do not belong with dynamically-loaded executable app groups and directives, so this fixes that.
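For anyone skimming, here is a rough sketch of the run-in-groups idea described at the top of this PR. It is not the code in this PR: parse_groups, run_groups, and the fake runner are hypothetical names made up for illustration, and the real work is done by weekly.yml and the regression scripts. The sketch only shows the intended behavior: groups run one after another, a failing group does not stop the ones after it, and the failed groups are reported so each can be restarted on its own.

```python
def parse_groups(value):
    """Split a comma-separated group list such as 'glb_tests,resnet_tests'."""
    return [name.strip() for name in value.split(",") if name.strip()]


def run_groups(groups, run_one):
    """Run each group in order and keep going after a failure.

    run_one is any callable that takes a group name and returns True on
    success. Returns the names of the groups that failed, in run order,
    so that each one can be rerun by itself later.
    """
    failed = []
    for group in groups:
        ok = run_one(group)
        print(f"{group}: {'PASS' if ok else 'FAIL'}")
        if not ok:
            failed.append(group)
    return failed


def fake_runner(group):
    """Stand-in for a real group run (assumption): pretend glb_tests_RV fails."""
    return group != "glb_tests_RV"


failed_groups = run_groups(
    parse_groups("glb_tests,glb_tests_RV,resnet_tests"), fake_runner
)
print("groups to restart:", ",".join(failed_groups) or "none")
```

With the fake runner above, resnet_tests still runs after glb_tests_RV fails, and only glb_tests_RV ends up in the restart list.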