🚀 Enterprise-Grade EKS GitOps Platform

Production-grade Kubernetes platform on AWS — Full GitOps, Multi-Stack Observability, AWS-Native CI/CD, and Infrastructure as Code across 5 technology layers.

📸 Platform in Action

🔍 Observability — ELK Stack (789 Live Log Events)

📊 Kibana Observability Overview

🔄 ArgoCD GitOps Auto-Sync

📡 AWS CloudWatch Metrics Explorer

🌱 Elastic Beanstalk Docker Deployment

📈 Grafana — Live Kubernetes Metrics

📋 Grafana — All Dashboards (15+)

🔬 Grafana — Namespace-Level Metrics

🖥️ AWS CloudWatch Container Insights

🏗️ Architecture

┌──────────────────────────────────────────────────────────────────┐
│                     Developer Workflow                            │
│                                                                   │
│   git push ──► GitHub Actions ──► ECR ──► ArgoCD ──► EKS        │
└──────────────────────────────────────────────────────────────────┘
                                │
              ┌─────────────────┼──────────────────┐
              ▼                 ▼                  ▼
       ┌────────────┐   ┌────────────┐   ┌─────────────┐
       │  GitHub    │   │ CodeBuild  │   │   ArgoCD    │
       │  Actions   │   │  + ECR     │   │   GitOps    │
       └────────────┘   └────────────┘   └─────────────┘
                                               │
                         ┌─────────────────────▼──────────────────────┐
                         │              AWS EKS v1.31                  │
                         │   ┌──────────┐  ┌──────────┐  ┌─────────┐ │
                         │   │  ArgoCD  │  │   App    │  │  Helm   │ │
                         │   │ v3.3.6   │  │ (nginx)  │  │ Charts  │ │
                         │   └──────────┘  └──────────┘  └─────────┘ │
                         │   ┌──────────┐  ┌──────────┐  ┌─────────┐ │
                         │   │   ELK    │  │Prometheus│  │CloudWtch│ │
                         │   │  Stack   │  │+ Grafana │  │ Agent   │ │
                         │   └──────────┘  └──────────┘  └─────────┘ │
                         │         3 nodes (t3.small) — ap-south-1    │
                         └────────────────────────────────────────────┘
                                               │
               ┌───────────────────────────────┼──────────────────────┐
               ▼                               ▼                      ▼
       ┌──────────────┐               ┌──────────────┐      ┌──────────────┐
       │  CloudWatch  │               │  CloudTrail  │      │  S3 + ECR    │
       │  Insights    │               │  Audit Logs  │      │  Artifacts   │
       │  332 metrics │               │  Multi-region│      │  Terraform   │
       └──────────────┘               └──────────────┘      └──────────────┘

✅ What Was Built — Sprint by Sprint

Sprint 1 — Foundation Infrastructure

VPC — Custom VPC with public/private subnets, NAT gateway via CloudFormation
EKS v1.31 — Multi-node Kubernetes cluster provisioned with AWS CDK
ALB Controller — AWS Load Balancer Controller deployed via Helm
Ansible — Node hardening playbook for security compliance

Sprint 2 — GitOps & CI/CD

GitHub Actions — 3-stage pipeline (build → test → deploy) with Docker + ECR integration
ArgoCD v3.3.6 — Full GitOps: auto-syncs Helm charts from GitHub to EKS on every push
Helm — All workloads packaged as Helm charts with configurable values.yaml
Self-healing: ArgoCD auto-reverts any manual changes to match Git state
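
The self-healing behaviour described above can be pictured as a reconcile loop: compare the state declared in Git with the live cluster state and revert any drift. The sketch below is a toy Python model of that idea, not ArgoCD's implementation. The `reconcile` function and the dictionary-based state are assumptions made purely for illustration.

```python
def reconcile(desired: dict, live: dict) -> tuple[dict, list[str]]:
    """Return the healed state plus a log of the corrections applied.

    Illustrative only: states are plain {resource_name: spec} dicts,
    not real Kubernetes objects.
    """
    actions = []
    healed = dict(live)
    # Anything declared in Git must exist in the cluster with the same spec.
    for name, spec in desired.items():
        if name not in live:
            actions.append(f"create {name}")
            healed[name] = spec
        elif live[name] != spec:
            actions.append(f"revert {name}")
            healed[name] = spec
    # Anything not declared in Git is removed (pruning is an optional
    # setting in ArgoCD; it is included here for completeness).
    for name in live:
        if name not in desired:
            actions.append(f"prune {name}")
            del healed[name]
    return healed, actions


# Hypothetical example: someone scaled the deployment by hand.
git_state = {"deployment/nginx": {"replicas": 2}}
cluster_state = {"deployment/nginx": {"replicas": 5}}
healed, actions = reconcile(git_state, cluster_state)
print(actions)  # ['revert deployment/nginx']
```

In ArgoCD itself this behaviour is switched on through the Application's automated sync policy (`selfHeal`, and optionally `prune`) rather than written by hand.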

Sprint 3 — Full Observability Stack

Elasticsearch + Kibana 8.5 — Centralized log storage and visualization (789 live log events)
Filebeat — DaemonSet shipping container logs from all pods across all namespaces
Prometheus + Grafana — Metrics collection with 15+ pre-built Kubernetes dashboards
CloudWatch Container Insights — Pod-level CPU/memory/network (332 metrics flowing)

Sprint 4 — AWS Native Services

CloudTrail — Multi-region audit trail with log delivery to S3
CodeBuild — Docker image build pipeline pushing to ECR
Elastic Beanstalk — Docker app deployed from ECR (Green health confirmed)
EKS Control Plane Logging — API server, audit, authenticator logs enabled

Sprint 5 — IaC & Additional CI/CD

Terraform — S3 artifact bucket with versioning, encryption, and resource tagging
GitLab CI — .gitlab-ci.yml pipeline with build, test, deploy stages

🛠️ Complete Tech Stack

☁️ Cloud & Infrastructure

Tool	Version	Usage
AWS EKS	v1.31	Managed Kubernetes cluster
AWS CDK	v2	EKS cluster provisioning
CloudFormation	—	VPC and networking
Terraform	v1.11	S3 artifact storage (IaC)
AWS ECR	—	Docker image registry
Elastic Beanstalk	AL2023	PaaS Docker deployment
CloudTrail	—	Multi-region audit logging
CloudWatch	—	Container Insights (332 metrics)

🔄 CI/CD & GitOps

Tool	Version	Usage
GitHub Actions	—	CI/CD pipeline (3 stages)
ArgoCD	v3.3.6	GitOps auto-sync from GitHub
CodeBuild	—	AWS-native Docker builds
GitLab CI	—	Alternative pipeline
Helm	v3.14	Kubernetes package management

📊 Observability

Tool	Version	Usage
Elasticsearch	8.5.1	Log storage and indexing
Kibana	8.5.1	Log visualization (789 events)
Filebeat	8.5.1	Log shipping DaemonSet
Prometheus	v0.90.1	Metrics scraping
Grafana	v10.4.1	15+ pre-built dashboards

🔒 Security & Networking

Tool	Usage
Ansible	Node hardening playbook
ALB Controller	AWS Load Balancer ingress
IAM / IRSA	Service account roles
VPC	Custom networking with subnets
Security Groups	Fine-grained port access

🔑 Real-World Problems Solved

These are actual issues encountered and resolved — not tutorial steps:

Problem	Root Cause	Solution
Pods stuck in Pending	Single t3.small node hit its 11-pod limit (VPC CNI IP address cap)	Scaled node group from 1 to 3 nodes
ALB not provisioning	Missing `DescribeListenerAttributes` IAM permission	Added inline policy to ALB controller role
Kibana `.security` index missing	Elasticsearch restarted before bootstrap completed	Manually regenerated Kibana service token via ES API
CloudWatch `MissingEndpoint` error	Agent missing region config	Patched `cwagentconfig` with explicit region
Git push rejected (685 MB file)	Terraform provider binary committed to Git	Used `git filter-branch` to rewrite history
Kibana auth failure after restart	Stale service account token	Deleted and recreated token via `/_security/service/` API
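
The 11-pod ceiling in the first row follows from how the AWS VPC CNI hands out pod IPs: with default settings (no prefix delegation), a node can run `ENIs * (IPv4 addresses per ENI - 1) + 2` pods. Here is a small sketch of that arithmetic, assuming the published t3.small limits of 3 network interfaces with 4 IPv4 addresses each. The function name is illustrative.

```python
def max_pods(enis: int, ipv4_per_eni: int) -> int:
    # One address on each ENI belongs to the ENI itself, so it cannot be
    # given to a pod. The +2 is commonly explained as the host-network
    # pods (aws-node and kube-proxy), which need no pod IP.
    return enis * (ipv4_per_eni - 1) + 2


per_node = max_pods(enis=3, ipv4_per_eni=4)  # t3.small
print(per_node)      # 11
print(per_node * 3)  # 33 across the 3-node group
```

With three nodes the ceiling rises to 33, which leaves room for the 25+ pods listed under Platform Metrics.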

📁 Repository Structure

eks-gitops-platform/
├── .github/workflows/        # GitHub Actions CI/CD pipeline
├── .gitlab-ci.yml            # GitLab CI pipeline (build/test/deploy)
├── helm/eks-gitops-app/      # Application Helm chart
│   ├── Chart.yaml
│   ├── values.yaml
│   └── templates/deployment.yaml
├── terraform/main.tf         # S3 bucket with versioning (Terraform)
├── argocd-app.yaml           # ArgoCD Application manifest
├── buildspec.yml             # AWS CodeBuild specification
├── Dockerfile                # Docker image definition
├── Dockerrun.aws.json        # Elastic Beanstalk Docker config
├── elasticsearch-values.yaml # Elasticsearch Helm overrides
├── kibana-values.yaml        # Kibana Helm overrides
├── filebeat-values.yaml      # Filebeat Helm overrides
├── prometheus-values.yaml    # Prometheus + Grafana overrides
├── assets/                   # Portfolio GIFs and screenshots
└── README.md

📊 Platform Metrics

Metric	Value
EKS Version	v1.31.13-eks-ecaa3a6
Total Namespaces	7
Total Pods Running	25+
Kibana Log Events	789 (last 15 min)
Log Rate	44 events/minute
CloudWatch Metrics	332
Cluster CPU	54% utilized
Cluster Memory	85% utilized
Grafana Dashboards	15+ pre-built
CI/CD Stages	3 (build / test / deploy)

🎯 Interview Talking Points

"Tell me about your Kubernetes experience"

I built a multi-node EKS v1.31 cluster from scratch using AWS CDK, deployed 25+ pods across 7 namespaces, and resolved real production issues — pod scheduling failures, IAM permission gaps, and resource constraints.

"How have you implemented GitOps?"

I deployed ArgoCD v3.3.6 with automated sync — any git push auto-deploys to EKS with self-healing. The entire cluster state is declared in Git. I also wrote a GitLab CI pipeline as an alternative to GitHub Actions.

"What observability tools have you used?"

I built a dual observability stack: ELK (789 live log events at 44/min) and Prometheus + Grafana (15+ dashboards). I also enabled CloudWatch Container Insights, with 332 metrics flowing from all namespaces.

"Have you used Terraform?"

Yes — Terraform for S3, CDK for EKS, CloudFormation for VPC. I understand multiple IaC tools and when to use each.

"Tell me about a problem you solved"

Kibana kept failing after Elasticsearch restarts because the `.security` index didn't bootstrap in time. I diagnosed it from the pod logs, traced it to a stale service account token, and fixed it by calling the Elasticsearch security API to delete and regenerate the token. That's the kind of real debugging experience this project gave me.
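
For reference, the token fix goes through Elasticsearch's service-account credential endpoints under `/_security/service/`. The sketch below only builds the two requests (delete the stale token, then create a fresh one) and does not send them. It assumes the built-in `elastic/kibana` service account. The host name, token name, and the helper function are placeholders, and authentication headers are omitted.

```python
import urllib.request


def service_token_request(base_url: str, token_name: str,
                          method: str) -> urllib.request.Request:
    """Build (but do not send) a service-account token API request."""
    path = f"/_security/service/elastic/kibana/credential/token/{token_name}"
    return urllib.request.Request(base_url.rstrip("/") + path, method=method)


# Hypothetical endpoint and token name, for illustration only.
base = "https://elasticsearch.example.internal:9200"
delete_req = service_token_request(base, "kibana-token", "DELETE")
create_req = service_token_request(base, "kibana-token", "POST")

print(delete_req.get_method(), delete_req.full_url)
print(create_req.get_method(), create_req.full_url)
```

The response to the `POST` contains the new token value, which Kibana then needs in its configuration in place of the stale one.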

👤 Author

Vivek Bommalla

Every component deployed, debugged, and verified on real AWS infrastructure.
Not a tutorial follow-along — real problems encountered and solved.
