Add DevOps Agent Operator for automated workload failure investigation with AWS DevOps Agent #36
Open
cawcaw253 wants to merge 1 commit into aws-samples:main from
- Detect pod failures with 5-layer priority logic
- Collect pod/node diagnostics via SSM
- Export to S3 and CloudWatch Logs, and trigger the AWS DevOps Agent webhook
- Include investigation runbooks and tests
Summary
This PR adds a Kubernetes operator that automatically detects EKS pod failures, collects comprehensive diagnostics, and integrates with AWS DevOps Agent for automated investigation.
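The commit message mentions detecting pod failures with 5-layer priority logic, but this page does not list the layers. As a rough illustration only, the sketch below shows one way a priority-ordered check could be written against the `status` block of a Pod object as returned by the Kubernetes API. The five layers, their order, and the function name `classify_pod_failure` are assumptions made for this example, not the operator's actual implementation.

```python
from typing import Optional

# Assumed set of container waiting reasons treated as failures (illustrative).
WAITING_FAILURE_REASONS = {
    "CrashLoopBackOff",
    "ImagePullBackOff",
    "ErrImagePull",
    "CreateContainerConfigError",
}


def classify_pod_failure(pod: dict) -> Optional[str]:
    """Return the highest-priority failure reason for a pod, or None.

    The layers below are hypothetical; they are checked in order and the
    first match wins, which is what makes the logic priority-based.
    """
    status = pod.get("status") or {}
    containers = (status.get("initContainerStatuses") or []) + (
        status.get("containerStatuses") or []
    )

    # Layer 1 (assumed): a container was terminated for exceeding memory.
    for cs in containers:
        for key in ("state", "lastState"):
            terminated = (cs.get(key) or {}).get("terminated")
            if terminated and terminated.get("reason") == "OOMKilled":
                return "OOMKilled"

    # Layer 2 (assumed): a container is stuck in a known bad waiting state.
    for cs in containers:
        waiting = (cs.get("state") or {}).get("waiting")
        if waiting and waiting.get("reason") in WAITING_FAILURE_REASONS:
            return waiting["reason"]

    # Layer 3 (assumed): the kubelet evicted the pod.
    if status.get("phase") == "Failed" and status.get("reason") == "Evicted":
        return "Evicted"

    # Layer 4 (assumed): the scheduler could not place the pod.
    for cond in status.get("conditions") or []:
        if (
            cond.get("type") == "PodScheduled"
            and cond.get("status") == "False"
            and cond.get("reason") == "Unschedulable"
        ):
            return "Unschedulable"

    # Layer 5 (assumed): any other pod in the Failed phase.
    if status.get("phase") == "Failed":
        return "Failed"

    return None
```

With this ordering, a pod whose container is currently in `CrashLoopBackOff` but was last terminated with `OOMKilled` is reported as `OOMKilled`, since the more specific cause is checked first.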
Features
Files Added
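Summary
Adds a Kubernetes Operator that automatically detects EKS Pod failures, collects comprehensive diagnostic data, and integrates with AWS DevOps Agent to support automated failure investigation.
Key Features
Files Added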