[feat][CGS][application-manager] add intelligent queue selection based on yarn resource usage#986
Merged
casionone merged 9 commits into dev-1.18.2-webank on Apr 13, 2026
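…eption (#964)
* #AI commit# Development phase: fix bug where sr task retry causes an init_sql loading exception
* #AI commit# Development phase: fix bug where sr task retry causes an init_sql loading exception
* #AI commit# Development phase: fix bug where sr task retry causes an init_sql loading exception
* #AI commit# Fix:
* Extend the coverage of the task retry switch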
…t queue selection - Translate all Chinese log messages to English for consistency - Update comments and documentation to English - No functional changes, only log message translation
… in smart queue selection" This reverts commit 47fb4e6.
v-kkhuang added a commit that referenced this pull request on Apr 16, 2026
…d on yarn resource usage (#986)
* [fix][CGS][engineconn] fix sr task retry causing init_sql loading exception (#964)
* #AI commit# Development phase: fix bug where sr task retry causes an init_sql loading exception
* #AI commit# Development phase: fix bug where sr task retry causes an init_sql loading exception
* #AI commit# Development phase: fix bug where sr task retry causes an init_sql loading exception
* #AI commit# Fix:
* Extend the coverage of the task retry switch
* #AI commit# Development phase: Spark supports secondary queue selection
* #AI commit# Development phase: optimize secondary queue logic
* #AI commit# Development phase: optimize the default value of wds.linkis.rm.secondary.yarnqueue.enable
* #AI commit# Development phase: refactor: translate Chinese logs to English in smart queue selection
  - Translate all Chinese log messages to English for consistency
  - Update comments and documentation to English
  - No functional changes, only log message translation
* Revert "#AI commit# Development phase: refactor: translate Chinese logs to English in smart queue selection" This reverts commit 47fb4e6.
* #AI commit# Development phase: improve English log output
* #AI commit# Development phase: improve English log output
What is the purpose of the change
Background/Problem:
Currently, Linkis uses a fixed queue configuration for Yarn jobs. This can lead to inefficient resource utilization when some queues are heavily loaded while others have available capacity. The system needs the ability to automatically select the optimal queue based on real-time resource usage.
Purpose of Change:
This PR adds intelligent queue selection that monitors Yarn queue resource usage in real time and automatically selects a queue based on configurable thresholds. When the secondary queue has available capacity (usage below the threshold), jobs are directed there; otherwise, they use the primary queue.
Value/Impact:
After the change, Linkis can optimize resource utilization across multiple queues, reduce job wait times, and improve overall cluster efficiency. The system provides configurable thresholds, engine/creator filtering, and automatic fallback to ensure stability.
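The selection rule described above can be sketched roughly as follows. This is a minimal illustration and not the code merged in this PR: the class `SecondaryQueueSelector`, its constructor parameters, and the `usageLookup` callback are hypothetical names, and the threshold is assumed to be a fraction between 0 and 1. The only name taken from the commit log is the switch `wds.linkis.rm.secondary.yarnqueue.enable`, which the `enabled` flag is meant to mirror.

```java
import java.util.Set;
import java.util.function.ToDoubleFunction;

// Hypothetical sketch of the queue selection rule; names are illustrative
// and are not taken from the Linkis code base.
final class SecondaryQueueSelector {
    private final boolean enabled;              // mirrors wds.linkis.rm.secondary.yarnqueue.enable
    private final double usageThreshold;        // assumed: fraction of queue capacity in [0, 1]
    private final Set<String> allowedEngines;   // assumed engine filter, e.g. "spark"
    private final Set<String> allowedCreators;  // assumed creator filter, e.g. "IDE"

    SecondaryQueueSelector(boolean enabled, double usageThreshold,
                           Set<String> allowedEngines, Set<String> allowedCreators) {
        this.enabled = enabled;
        this.usageThreshold = usageThreshold;
        this.allowedEngines = allowedEngines;
        this.allowedCreators = allowedCreators;
    }

    // Returns the queue a job should be submitted to.
    // usageLookup stands in for whatever call reports a queue's current usage.
    String select(String primaryQueue, String secondaryQueue,
                  String engineType, String creator,
                  ToDoubleFunction<String> usageLookup) {
        // Feature switched off or no secondary queue configured: keep the primary queue.
        if (!enabled || secondaryQueue == null || secondaryQueue.isEmpty()) {
            return primaryQueue;
        }
        // Engine/creator filtering: only listed combinations may be redirected.
        if (engineType == null || creator == null
                || !allowedEngines.contains(engineType)
                || !allowedCreators.contains(creator)) {
            return primaryQueue;
        }
        try {
            double usage = usageLookup.applyAsDouble(secondaryQueue);
            // Redirect only when usage is strictly below the threshold.
            return usage < usageThreshold ? secondaryQueue : primaryQueue;
        } catch (RuntimeException e) {
            // Automatic fallback: any failure while reading usage keeps the primary queue.
            return primaryQueue;
        }
    }
}
```

In this sketch every uncertain path (switch off, filter miss, lookup failure) resolves to the primary queue, so enabling the feature cannot leave a job without a queue.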
Related issues/PRs
Related issues: close apache#5415
Related PR: none
Brief change log
Checklist