IDP Portal

A custom Internal Developer Portal combining GitOps-managed AWS infrastructure with an AI-powered multi-agent system for full-stack developer self-service with policy guardrails.

Acknowledgments and References

This project draws architectural inspiration from the following open-source projects by the CNOE (Cloud Native Operational Excellence) community:

CNOE AWS Reference Implementation (docs) - The GitOps Bridge pattern used in this project (Terraform outputs -> K8s Secret annotations -> ArgoCD ApplicationSet cluster generator -> Helm values) is adapted from the CNOE reference architecture. The two-stage bootstrap approach (ArgoCD + ESO first, then ArgoCD manages everything) and the App-of-Apps ApplicationSet pattern also follow CNOE's design. CNOE is licensed under Apache License 2.0.
CAIPE - AI Platform Engineering - The multi-agent AI architecture in this project is inspired by CAIPE's approach to applying AI agents to platform engineering tasks. The concepts of an LLM-based supervisor agent orchestrating specialized sub-agents, the use of the Model Context Protocol (MCP) for tool registration, and the Agent-to-Agent (A2A) protocol for inter-agent communication are informed by CAIPE's design. CAIPE is licensed under Apache License 2.0.
Policy Agent - The Go-based OPA/Rego policy validation agent integrated into this project is based on prior work by the author.

What is original to this project:

The complete FastAPI backend implementation with 12 specialized AI agents (GitHub, ArgoCD, Flux, Jira, Slack, PagerDuty, Backstage, Kafka, Vault, Rancher, Policy, Backstage)
The Next.js 15 portal UI with 12 pages, chat interface, and self-service wizard
The self-service provisioning engine with multi-step workflows and policy gates
The Keycloak identity broker pattern with Okta/ForgeRock overlays
The specific Terraform module compositions for VPC, EKS, IAM, Vault, Kafka, and GitOps Bridge
All Helm chart configurations and values for the 14 platform addons
The CI/CD pipeline designs including the policy-gate workflow
The OPA/Rego policy rules across 5 domains (kubernetes, kafka, terraform, cicd, gitops)
The integration of the Policy Agent in 4 modes (K8s webhook, AI chat, CI/CD gate, Backstage templates)

Architecture Overview

Text-based architecture diagram

 Developers                                    Platform Engineers
     |                                                |
     v                                                v
 +-----------+                                +--------------+
 | Portal UI |  Next.js 15 / React 19        | Terraform    |
 | (SPA)     |  Tailwind + shadcn/ui         | Infrastructure|
 +-----+-----+                                +------+-------+
       |                                              |
       | HTTPS + Keycloak OIDC JWT                    | HCL
       |                                              v
 +-----v-----------------------------------------+  +------------------+
 |              AI Backend (FastAPI)               |  | AWS              |
 |                                                 |  |  EKS Cluster     |
 |  +-------------------------------------------+ |  |  VPC / Subnets   |
 |  |        Supervisor Agent (LangGraph)        | |  |  IAM Pod Identity|
 |  |  Intent classification -> Agent routing    | |  |  KMS (Vault)     |
 |  +-----+-----+-----+-----+-----+-----+------+ |  |  Secrets Manager |
 |        |     |     |     |     |     |        | |  +--------+---------+
 |  +-----v-+ +-v---+ +-v--++ +--v--+ +v----+   | |           |
 |  |GitHub | |Argo | |Flux | |Kafka| |Vault|   | |           | GitOps Bridge
 |  |Agent  | |CD   | |CD  | |Agent| |Agent|   | |           | (K8s Secret
 |  +-------+ |Agent| |Agnt| +-----+ +-----+   | |           |  annotations)
 |  +-------+ +-----+ +----+ +-----+ +-----+   | |           |
 |  |Jira   | |Slack | |PgDy| |Back| |Ranch|   | |           v
 |  |Agent  | |Agent | |Agnt| |stge| |er   |   | |  +--------+---------+
 |  +-------+ +------+ +----+ +----+ +-----+   | |  | ArgoCD           |
 |                                               | |  | ApplicationSet   |
 |  +----------+ +---------+ +----------------+ | |  | (cluster generator|
 |  |Policy    | |MCP      | |Self-Service    | | |  |  reads annotations|
 |  |Agent(Go) | |Server   | |Engine          | | |  |  -> Helm values)  |
 |  +----------+ +---------+ +----------------+ | |  +------------------+
 |                                               | |
 |  SSE Streaming | A2A Protocol | RAG/pgvector  | |  Platform Addons (14):
 +-----------------------+------------------------+ |  ArgoCD, Flux, Backstage
                         |                          |  Keycloak, Vault, Kafka
                         | HTTP/gRPC                |  Rancher, Cert-Manager
                         v                          |  External-DNS, External-
              +----------+-----------+              |  Secrets, Ingress-NGINX
              | Platform Services    |              |  OPA Gatekeeper,
              |  PostgreSQL+pgvector |              |  Policy Agent, AI Platform
              |  Redis               |
              |  Keycloak            |
              |  Vault               |
              |  Kafka (Redpanda)    |
              +----------------------+

System Architecture

 +----------------------------------------------------------------------+
 |                         EXTERNAL SERVICES                             |
 |  GitHub API   Jira API   Slack API   PagerDuty API   Anthropic API   |
 +-------+----------+----------+----------+----------------+-----------+
         |          |          |          |                |
 +-------v----------v----------v----------v----------------v-----------+
 |                                                                      |
 |                        AI BACKEND (FastAPI)                          |
 |                                                                      |
 |  +----------------------------------------------------------------+ |
 |  |                  API Layer (/api/v1)                            | |
 |  |  POST /chat/stream (SSE)  |  GET /agents  |  POST /self-service| |
 |  +--------------+--------------------------------------------+---+ |
 |                 |                                             |     |
 |  +--------------v--------------+   +--------------------------v--+ |
 |  |   Supervisor Orchestrator   |   |   Self-Service Engine       | |
 |  |   (LangGraph StateGraph)    |   |                             | |
 |  |                             |   |  Workflows:                 | |
 |  |  1. Classify intent         |   |   create_service            | |
 |  |  2. Select agent(s)         |   |   create_kafka_topic        | |
 |  |  3. Execute tools           |   |   provision_database        | |
 |  |  4. Synthesize response     |   |   create_repo               | |
 |  |  5. Stream via SSE          |   |                             | |
 |  +--------------+--------------+   |  Each workflow chains:      | |
 |                 |                   |   validate -> create ->     | |
 |  +--------------v--------------+   |   deploy -> register ->     | |
 |  |    11 AI Sub-Agents         |   |   notify                    | |
 |  |                             |   +-----------------------------+ |
 |  |  github   argocd   flux    |                                    |
 |  |  jira     slack    pager   |   +-----------------------------+  |
 |  |  kafka    vault    rancher |   |  Services                   |  |
 |  |  backstage                 |   |   LLM (Claude/OpenAI)       |  |
 |  |                            |   |   Auth (Keycloak JWT)       |  |
 |  |  Each agent:               |   |   Database (pg+pgvector)    |  |
 |  |   BaseAgent ABC            |   |   Redis (cache/sessions)    |  |
 |  |   get_card() -> AgentCard  |   +-----------------------------+  |
 |  |   get_tools() -> MCP tools |                                    |
 |  |   invoke(task) -> result   |   +-----------------------------+  |
 |  +-----------------------------+   |  Protocols                 |  |
 |                                    |   MCP (tool registry)      |  |
 |  +-----------------------------+   |   A2A (agent-to-agent)     |  |
 |  |  Policy Agent (bridge)      |   +-----------------------------+  |
 |  |   Calls Go binary via HTTP  |                                   |
 |  |   validate / generate / fix |                                   |
 |  +--------------+--------------+                                   |
 +-----------------|--------------------------------------------------+
                   |
 +-----------------v--------------------------------------------------+
 |                     POLICY AGENT (Go)                               |
 |                                                                     |
 |  +--------------+  +--------------+  +---------------------------+ |
 |  |  CLI         |  |  HTTP API    |  |  K8s Admission Webhook    | |
 |  |  validate    |  |  /validate   |  |  ValidatingWebhookConfig  | |
 |  |  generate    |  |  /generate   |  |  Intercepts create/update | |
 |  |  fix         |  |  /fix        |  |  on Deployments,          | |
 |  |  list        |  |  /policies   |  |  KafkaTopics, etc.        | |
 |  +------+-------+  +------+------+  +-------------+-------------+ |
 |         |                 |                        |               |
 |  +------v-----------------v------------------------v-------------+ |
 |  |                  OPA Engine (Rego)                             | |
 |  |  policies/kubernetes/deployment.rego                           | |
 |  |  policies/kafka/topic.rego                                     | |
 |  |  policies/terraform/security.rego                              | |
 |  |  policies/cicd/pipeline.rego                                   | |
 |  |  policies/gitops/argocd.rego                                   | |
 |  +---------------------------+-----------------------------------+ |
 |                              |                                     |
 |  +---------------------------v-----------------------------------+ |
 |  |              Claude AI Client                                  | |
 |  |  GenerateConfig() - produce compliant YAML from requirements   | |
 |  |  FixViolations()  - auto-remediate policy violations           | |
 |  +----------------------------------------------------------------+ |
 +---------------------------------------------------------------------+

AI Multi-Agent Architecture

Text-based agent architecture diagram

  User: "Create a new microservice called order-api with a Kafka topic"
                                    |
                                    v
                         +---------------------+
                         |  Supervisor Agent    |
                         |  (LangGraph)         |
                         |                      |
                         |  System prompt with   |
                         |  agent descriptions   |
                         |  -> LLM classifies    |
                         |     intent            |
                         +---------+-----------+
                                   |
                    +--------------+--------------+
                    |              |              |
              +-----v-----+ +----v-----+ +-----v-----+
              |  Policy    | |  GitHub   | |  Kafka     |
              |  Agent     | |  Agent    | |  Agent     |
              |            | |           | |            |
              | validate   | | create    | | create     |
              | generated  | | repository| | topic via  |
              | configs    | | from      | | Strimzi    |
              | against    | | golden    | | KafkaTopic |
              | Rego       | | path      | | CRD        |
              | policies   | | template  | |            |
              +-----+------+ +----+-----+ +-----+-----+
                    |              |              |
                    +--------------+--------------+
                                   |
                         +---------v-----------+
                         |  ArgoCD Agent        |
                         |  Create Application  |
                         |  for GitOps deploy   |
                         +---------+-----------+
                                   |
                         +---------v-----------+
                         |  Backstage Agent     |
                         |  Register in catalog |
                         +---------+-----------+
                                   |
                         +---------v-----------+
                         |  Slack Agent         |
                         |  Notify #platform    |
                         +---------+-----------+
                                   |
                         +---------v-----------+
                         |  Supervisor          |
                         |  Synthesize final    |
                         |  response + SSE      |
                         |  stream to UI        |
                         +---------------------+


  Agent Inventory:
  +-------------+--------------------------------------+--------------+
  | Agent       | Tools                                | Integrates   |
  |-------------+--------------------------------------+--------------|
  | github      | create_repo, create_pr, list_repos   | GitHub API   |
  |             | create_issue, search_code            |              |
  | argocd      | sync_app, get_status, list_apps      | ArgoCD API   |
  |             | rollback, get_history                |              |
  | flux        | reconcile, suspend, resume           | kubectl      |
  |             | list_kustomizations, get_source      |              |
  | jira        | create_issue, search, update         | Jira REST v3 |
  |             | get_sprint, add_comment              |              |
  | slack       | send_message, create_channel         | Slack API    |
  |             | post_incident_update                 |              |
  | pagerduty   | list_incidents, acknowledge          | PagerDuty v2 |
  |             | resolve, get_oncall, trigger          |              |
  | backstage   | list_entities, get_entity            | Backstage API|
  |             | trigger_template, search              |              |
  | kubernetes  | list_pods, get_pod_status, list_svcs  | kubectl      |
  |             | list_namespaces, get_logs, scale, evts|              |
  | kafka       | create_topic, list_topics            | Strimzi CRDs |
  |             | describe, update_config, delete       | via kubectl  |
  | vault       | read_secret, write_secret            | Vault HTTP   |
  |             | list, create_policy, enable_engine    |              |
  | rancher     | list_clusters, get_status            | Rancher v3   |
  |             | scale_nodepool, get_events            |              |
  | policy      | validate_config, generate_config     | Go Policy    |
  |             | fix_violations, list_policies         | Agent HTTP   |
  +-------------+--------------------------------------+--------------+

GitOps Bridge Pattern

Text-based GitOps Bridge diagram

  PHASE 1: Terraform provisions infrastructure and passes metadata

  +------------------------------------------------------------------+
  |                    Terraform                                      |
  |                                                                   |
  |  modules/vpc ------> VPC ID, Subnet IDs                          |
  |  modules/eks ------> Cluster endpoint, OIDC provider             |
  |  modules/iam ------> IAM Role ARNs (ESO, DNS, Cert-Mgr, Vault)  |
  |  modules/vault ----> KMS key, DynamoDB table                     |
  |  modules/secrets --> SM secret ARNs                              |
  |                                                                   |
  |  All outputs assembled in locals.tf --> gitops_bridge_metadata   |
  |                                                                   |
  |  modules/gitops-bridge --> Creates K8s Secret in argocd namespace|
  +------------------------------+-----------------------------------+
                                 |
                                 v
  +------------------------------------------------------------------+
  |  K8s Secret: "cluster-metadata" (namespace: argocd)              |
  |                                                                   |
  |  labels:                                                          |
  |    argocd.argoproj.io/secret-type: cluster                       |
  |                                                                   |
  |  annotations:  (GitOps Bridge metadata)                          |
  |    cluster_name: idpportal-dev                                   |
  |    region: us-west-2                                              |
  |    environment: dev                                               |
  |    vpc_id: vpc-0abc123                                            |
  |    iam_role_arn_external_secrets: arn:aws:iam::...:role/...      |
  |    iam_role_arn_external_dns: arn:aws:iam::...:role/...          |
  |    iam_role_arn_cert_manager: arn:aws:iam::...:role/...          |
  |    vault_addr: http://vault.vault.svc:8200                       |
  |    kafka_bootstrap_servers: kafka:9092                            |
  |    gitops_repo_url: https://github.com/org/idpportal.git         |
  |    domain_name: dev.idp.example.com                              |
  +------------------------------+-----------------------------------+
                                 |
  PHASE 2: ArgoCD ApplicationSet reads annotations as Helm values
                                 |
                                 v
  +------------------------------------------------------------------+
  |  ApplicationSet (packages/addons/templates/applicationset.yaml)  |
  |                                                                   |
  |  generators:                                                      |
  |    - clusters:                                                    |
  |        selector:                                                  |
  |          matchLabels:                                             |
  |            argocd.argoproj.io/secret-type: cluster               |
  |                                                                   |
  |  For each addon (14 total):                                      |
  |    template:                                                      |
  |      source:                                                      |
  |        path: packages/<addon-name>                               |
  |        helm:                                                      |
  |          parameters:                                              |
  |            - cluster.name = {{ annotations.cluster_name }}       |
  |            - cluster.region = {{ annotations.region }}            |
  |            - serviceAccount.role = {{ annotations.iam_role }}    |
  |      destination:                                                 |
  |        namespace: <addon-namespace>                              |
  |      syncPolicy:                                                  |
  |        automated: { prune: true, selfHeal: true }                |
  +------------------------------+-----------------------------------+
                                 |
                                 v
  +------------------------------------------------------------------+
  |  ArgoCD generates an Application per addon:                      |
  |                                                                   |
  |  argo-cd          --> argocd namespace                            |
  |  flux-system      --> flux-system namespace                      |
  |  backstage        --> backstage namespace                        |
  |  keycloak         --> keycloak namespace                         |
  |  vault            --> vault namespace                             |
  |  kafka            --> kafka namespace                             |
  |  rancher          --> cattle-system namespace                    |
  |  cert-manager     --> cert-manager namespace                     |
  |  external-dns     --> external-dns namespace                     |
  |  external-secrets --> external-secrets namespace                 |
  |  ingress-nginx    --> ingress-nginx namespace                    |
  |  opa-gatekeeper   --> gatekeeper-system namespace                |
  |  policy-agent     --> policy-agent namespace                     |
  |  ai-platform      --> idpportal namespace                        |
  +------------------------------------------------------------------+

Policy-as-Code Architecture

Text-based Policy architecture diagram

  Policy Agent integrates in 4 modes:

  +-------------------------------------------------------------------+
  |                                                                    |
  |  MODE 1: K8s Admission Webhook                                    |
  |  +----------+    +-----------------+    +------------------------+ |
  |  | kubectl   |    | K8s API Server  |    | Policy Agent Webhook   | |
  |  | apply     |--->| ValidatingWH    |--->| POST /webhook/validate | |
  |  | deploy.yml|    | Configuration   |    | OPA evaluates Rego     | |
  |  +----------+    +-----------------+    | Returns allow/deny     | |
  |                                          +------------------------+ |
  |                                                                    |
  |  MODE 2: AI Chat Agent                                            |
  |  +----------+    +-----------------+    +------------------------+ |
  |  | Developer |    | Supervisor      |    | Policy Sub-Agent       | |
  |  | "validate |--->| routes to       |--->| calls Go agent HTTP    | |
  |  |  my yaml" |    | policy agent    |    | returns violations     | |
  |  +----------+    +-----------------+    +------------------------+ |
  |                                                                    |
  |  MODE 3: CI/CD Gate                                               |
  |  +----------+    +-----------------+    +------------------------+ |
  |  | GitHub PR |    | policy-gate.yml |    | policy-agent CLI       | |
  |  | with K8s  |--->| GitHub Action   |--->| validate --domain k8s  | |
  |  | configs   |    | on PR trigger   |    | --file deploy.yaml     | |
  |  +----------+    +-----------------+    | exit 1 on violations   | |
  |                                          +------------------------+ |
  |                                                                    |
  |  MODE 4: Self-Service Templates                                   |
  |  +----------+    +-----------------+    +------------------------+ |
  |  | Backstage |    | Scaffolder      |    | Policy Agent HTTP      | |
  |  | Template  |--->| validates       |--->| POST /validate         | |
  |  | Wizard    |    | before commit   |    | blocks non-compliant   | |
  |  +----------+    +-----------------+    +------------------------+ |
  |                                                                    |
  +-------------------------------------------------------------------+

  Policy Domains:
  +-------------+----------------------------------------------------+
  | Domain      | Example Rules                                      |
  |-------------+----------------------------------------------------|
  | kubernetes  | require security context, resource limits,         |
  |             | no privileged, no root, no :latest tag, probes     |
  | kafka       | min 3 partitions, min 3 replicas, max retention,  |
  |             | naming convention, require retention with delete   |
  | terraform   | S3 encryption, no public S3/RDS, no open ingress, |
  |             | RDS encryption+backups, EKS secrets encryption     |
  | cicd        | no pipe-to-bash, pin action versions,             |
  |             | restrict PR triggers, no secret logging, timeouts  |
  | gitops      | require self-heal, no HEAD revision, pin chart    |
  |             | versions, require namespace, health checks         |
  +-------------+----------------------------------------------------+

Identity and Auth Architecture

Text-based Identity architecture diagram

  +------------------------------------------------------------------+
  |                   Identity Flow                                   |
  |                                                                   |
  |  Developer                                                        |
  |     |                                                             |
  |     |  1. Login request                                          |
  |     v                                                             |
  |  +--------------------------+                                    |
  |  |  Portal UI (Next.js)     |                                    |
  |  |  next-auth v5            |                                    |
  |  |  Keycloak OIDC Provider  |                                    |
  |  +------------+-------------+                                    |
  |               |  2. Redirect to Keycloak                         |
  |               v                                                   |
  |  +----------------------------------------------------+         |
  |  |              Keycloak (Front Door)                   |         |
  |  |                                                      |         |
  |  |  Realm: idpportal                                    |         |
  |  |  Clients: idpportal-ui, backstage, argocd, vault    |         |
  |  |  Roles: platform-admin, developer, viewer            |         |
  |  |  Groups: platform-admins, developers, viewers        |         |
  |  |                                                      |         |
  |  |  +----------------------------------------------+   |         |
  |  |  |  Identity Broker (optional)                   |   |         |
  |  |  |                                               |   |         |
  |  |  |  +---------------+  +-----------------+      |   |         |
  |  |  |  | Okta Broker    |  | ForgeRock       |      |   |         |
  |  |  |  | OIDC           |  | Broker OIDC     |      |   |         |
  |  |  |  |                |  |                 |      |   |         |
  |  |  |  | Enterprise SSO |  | Enterprise SSO  |      |   |         |
  |  |  |  +---------------+  +-----------------+      |   |         |
  |  |  +----------------------------------------------+   |         |
  |  +------------+---------------------------------------+         |
  |               |  3. JWT (access_token + roles)                   |
  |               v                                                   |
  |  +--------------------------+                                    |
  |  |  FastAPI Backend          |                                    |
  |  |  JWT verification         |                                    |
  |  |  Keycloak public key      |                                    |
  |  |  Role -> Permission map   |                                    |
  |  |                            |                                    |
  |  |  platform-admin -> all     |                                    |
  |  |  developer -> read + exec  |                                    |
  |  |  viewer -> read only       |                                    |
  |  +--------------------------+                                    |
  |                                                                   |
  |  SSO across all platform services:                               |
  |  Portal UI, Backstage, ArgoCD, Vault, Rancher                    |
  |  (all share the same Keycloak realm)                             |
  +------------------------------------------------------------------+

Self-Service Flow

Text-based Self-Service flow diagram

  Developer: "I need a new microservice with a Kafka topic"
      |
      v
  +-------------------------------------------------------------+
  |  Self-Service Hub (/self-service)                           |
  |                                                              |
  |  +------------+ +------------+ +------------+ +----------+ |
  |  |Microservice| |Kafka Topic | | Database   | |S3 Bucket | |
  |  | Template   | | Template   | | Template   | | Template | |
  |  +------+-----+ +------------+ +------------+ +----------+ |
  +---------|---------------------------------------------------+
            |  Select template, fill parameters
            v
  +-------------------------------------------------------------+
  |  Step 1: VALIDATE                                           |
  |  Policy Agent validates generated configs against Rego      |
  |  -> kubernetes/deployment.rego                              |
  |  -> kafka/topic.rego                                        |
  |  Result: PASS / FAIL with violations                        |
  +-----------------------------+-------------------------------+
                                | PASS
                                v
  +-------------------------------------------------------------+
  |  Step 2: CREATE REPOSITORY                                  |
  |  GitHub Agent creates repo from golden path template        |
  |  -> Dockerfile, CI pipeline, Helm chart, catalog-info.yaml |
  +-----------------------------+-------------------------------+
                                v
  +-------------------------------------------------------------+
  |  Step 3: PROVISION RESOURCES                                |
  |  Kafka Agent creates KafkaTopic CR via Strimzi              |
  |  Vault Agent stores credentials                             |
  +-----------------------------+-------------------------------+
                                v
  +-------------------------------------------------------------+
  |  Step 4: DEPLOY via GitOps                                  |
  |  ArgoCD Agent creates Application (or Flux Kustomization)  |
  |  -> Automated sync + self-heal enabled                      |
  +-----------------------------+-------------------------------+
                                v
  +-------------------------------------------------------------+
  |  Step 5: REGISTER                                           |
  |  Backstage Agent registers service in catalog               |
  +-----------------------------+-------------------------------+
                                v
  +-------------------------------------------------------------+
  |  Step 6: NOTIFY                                             |
  |  Slack Agent posts to #platform-updates                     |
  |  Jira Agent creates tracking ticket                         |
  +-------------------------------------------------------------+

Tech Stack

Layer	Technology	Purpose
Cloud	AWS (EKS, VPC, IAM, KMS, SM, Route53)	Infrastructure platform
IaC	Terraform + terraform-aws-modules	Infrastructure provisioning
Container Orchestration	Kubernetes (EKS 1.31)	Workload runtime
GitOps	Argo CD + Flux CD	Continuous deployment (teams choose)
Developer Portal	Backstage	Service catalog + scaffolder templates
Identity	Keycloak 26 + Okta/ForgeRock brokers	SSO across all services
Secrets	HashiCorp Vault + AWS Secrets Manager	App secrets + bootstrap secrets
Streaming	Apache Kafka (Strimzi operator)	Event streaming + topic self-service
Multi-cluster	Rancher	Kubernetes fleet management
Policy	OPA/Rego + custom Go Policy Agent	Policy-as-code validation
AI Backend	Python 3.12, FastAPI, LangGraph	Multi-agent orchestration
AI Models	Claude (Anthropic) / GPT (OpenAI)	LLM for agent reasoning
AI Protocols	MCP + A2A	Tool registry + agent communication
Portal UI	Next.js 15, React 19, Tailwind, shadcn/ui	Developer-facing SPA
State	Zustand	Client-side state management
Auth (UI)	next-auth v5	Keycloak OIDC integration
Database	PostgreSQL 16 + pgvector	Data + RAG embeddings
Cache	Redis 7	Sessions + caching
DNS	External DNS	Automatic Route53 records
TLS	cert-manager + Let's Encrypt	Automated TLS certificates
Ingress	NGINX Ingress Controller	HTTP routing + NLB
Admission	OPA Gatekeeper + Policy Agent webhook	Runtime policy enforcement
CI/CD	GitHub Actions (7 workflows)	Build, test, deploy, policy gate

Project Structure

idpportal/
├── config.yaml                          # Central environment configuration
├── Makefile                             # Top-level build/dev commands
├── docker-compose.yaml                  # Local dev (8 services)
├── docker-compose.dev.yaml              # Hot-reload overrides
├── .env.example                         # All environment variables
│
├── infrastructure/                      # Terraform (AWS)
│   ├── main.tf                          # Module composition
│   ├── locals.tf                        # GitOps Bridge metadata assembly
│   ├── variables.tf / outputs.tf
│   ├── providers.tf / versions.tf / backend.tf
│   ├── modules/
│   │   ├── vpc/                         # VPC + subnets (3 AZ)
│   │   ├── eks/                         # EKS 1.31 + managed node groups
│   │   ├── iam/                         # Pod Identity roles (ESO, DNS, Cert)
│   │   ├── secrets/                     # AWS Secrets Manager (bootstrap)
│   │   ├── vault/                       # KMS auto-unseal + DynamoDB backend
│   │   ├── kafka/                       # Kafka namespace prep
│   │   └── gitops-bridge/               # K8s Secret with TF metadata
│   └── envs/
│       ├── dev.tfvars
│       ├── staging.tfvars
│       └── prod.tfvars
│
├── packages/                            # Helm charts (14 addons via ArgoCD)
│   ├── addons/                          # Root ApplicationSet (App of Apps)
│   ├── argo-cd/                         # ArgoCD + Keycloak OIDC SSO
│   ├── flux-system/                     # Flux controllers + GitRepository
│   ├── backstage/                       # Backstage + GitHub App + Keycloak
│   ├── keycloak/                        # Keycloak + IdP broker overlays
│   │   └── overlays/
│   │       ├── okta-broker.yaml
│   │       └── forgerock-broker.yaml
│   ├── vault/                           # Vault HA + Secrets Operator
│   ├── kafka/                           # Strimzi operator + Kafka cluster
│   ├── rancher/                         # Rancher server
│   ├── cert-manager/                    # TLS + ClusterIssuer
│   ├── external-dns/                    # Route53 DNS automation
│   ├── external-secrets/                # AWS SM -> K8s Secrets
│   ├── ingress-nginx/                   # NLB ingress
│   ├── opa-gatekeeper/                  # Policy enforcement + templates
│   ├── policy-agent/                    # Go webhook + ValidatingWH config
│   └── ai-platform/                     # Backend + UI deployment + Ingress
│
├── policies/                            # OPA/Rego policies (5 domains)
│   ├── kubernetes/deployment.rego
│   ├── kafka/topic.rego
│   ├── terraform/security.rego
│   ├── cicd/pipeline.rego
│   └── gitops/argocd.rego
│
├── backend/                             # Python FastAPI (AI backend)
│   ├── app/
│   │   ├── main.py                      # Entry point + lifespan
│   │   ├── config.py                    # Pydantic settings
│   │   ├── api/v1/                      # REST endpoints
│   │   │   ├── chat.py                  # Chat (sync + SSE streaming)
│   │   │   ├── agents.py               # Agent management
│   │   │   ├── selfservice.py          # Self-service provisioning
│   │   │   └── health.py               # Health/readiness
│   │   ├── agents/                      # 13 AI agents
│   │   │   ├── base.py                  # BaseAgent ABC + AgentCard
│   │   │   ├── registry.py             # Agent discovery
│   │   │   ├── supervisor.py           # LangGraph orchestrator
│   │   │   ├── github/                  # GitHub (repos, PRs, issues)
│   │   │   ├── argocd/                 # ArgoCD (sync, rollback)
│   │   │   ├── flux/                   # Flux (reconcile, suspend)
│   │   │   ├── jira/                   # Jira (issues, sprints)
│   │   │   ├── slack/                  # Slack (messages, channels)
│   │   │   ├── pagerduty/             # PagerDuty (incidents)
│   │   │   ├── backstage/             # Backstage (catalog)
│   │   │   ├── kubernetes/             # Kubernetes (pods, services, logs, scale)
│   │   │   ├── kafka/                  # Kafka (topics via Strimzi)
│   │   │   ├── vault/                  # Vault (secrets CRUD)
│   │   │   ├── rancher/               # Rancher (multi-cluster mgmt)
│   │   │   └── policy/                # Policy (validate/generate/fix)
│   │   └── services/
│   │       ├── llm.py                   # Claude / OpenAI factory
│   │       ├── auth.py                  # Keycloak JWT verification
│   │       └── database.py             # Async PostgreSQL + pgvector
│   ├── pyproject.toml
│   └── Dockerfile
│
├── policy-agent/                        # Go Policy Agent
│   ├── cmd/
│   │   ├── policy-agent/main.go         # CLI (validate, generate, fix)
│   │   └── webhook/main.go              # HTTP server + K8s webhook
│   ├── internal/
│   │   ├── policy/engine.go             # OPA engine (load + eval Rego)
│   │   ├── validator/validator.go       # Domain validation
│   │   ├── ai/claude.go                 # Claude API (gen + remediate)
│   │   └── webhook/handler.go           # Admission + HTTP handlers
│   ├── go.mod
│   └── Dockerfile
│
├── ui/                                  # Next.js 15 Portal UI
│   ├── src/
│   │   ├── app/                         # 12 pages (App Router)
│   │   │   ├── page.tsx                 # Dashboard
│   │   │   ├── chat/                    # AI chat interface
│   │   │   ├── catalog/                 # Service catalog
│   │   │   ├── deployments/             # ArgoCD + Flux view
│   │   │   ├── incidents/               # PagerDuty incidents
│   │   │   ├── kafka/                   # Kafka topics
│   │   │   ├── policies/                # Compliance dashboard
│   │   │   ├── secrets/                 # Vault browser
│   │   │   ├── self-service/            # Template wizard
│   │   │   ├── clusters/                # Rancher clusters
│   │   │   ├── knowledge/               # RAG search
│   │   │   └── settings/                # User prefs
│   │   ├── components/
│   │   │   ├── chat/                    # ChatWindow, MessageBubble, etc.
│   │   │   └── layout/                  # Sidebar, Header
│   │   ├── stores/chat.ts               # Zustand (SSE streaming)
│   │   └── lib/                         # API client, auth, utils
│   ├── package.json
│   └── Dockerfile
│
├── templates/                           # Backstage scaffolder templates
│   ├── microservice/template.yaml       # Golden path microservice
│   ├── kafka-topic/template.yaml        # Self-service Kafka topic
│   └── database/template.yaml           # Provision managed database
│
├── scripts/
│   └── bootstrap.sh                     # 2-stage EKS bootstrap
│
└── .github/workflows/                   # CI/CD (7 pipelines)
    ├── ci-backend.yaml                  # Ruff lint + pytest
    ├── ci-ui.yaml                       # ESLint + Next.js build
    ├── ci-policy-agent.yaml             # Go vet + test + build
    ├── policy-gate.yaml                 # Validate configs on PR
    ├── terraform-plan.yaml              # Plan on PR + comment
    ├── terraform-apply.yaml             # Apply on merge to main
    └── release.yaml                     # Docker build + ECR push

Quick Start

Prerequisites

Docker & Docker Compose
Python 3.12+ with uv
Node.js 20+
Go 1.22+
Terraform 1.7+ (for infrastructure)
kubectl + AWS CLI (for EKS deployment)

Local Development

# 1. Clone and configure
cp .env.example .env
# Edit .env with your API keys (at minimum: ANTHROPIC_API_KEY)

# 2. Start all services (8 containers)
docker compose up -d

# 3. Access the portal
#   UI:        http://localhost:3000
#   Backend:   http://localhost:8000/docs  (Swagger)
#   Keycloak:  http://localhost:8080       (admin/admin)
#   Vault:     http://localhost:8200       (token: dev-root-token)

Development with Hot Reload

# Start with hot-reload on backend + UI
docker compose -f docker-compose.yaml -f docker-compose.dev.yaml up

# Or run backend/UI natively:
cd backend && uv run uvicorn app.main:app --reload --port 8000
cd ui && npm run dev

Policy Agent

# Validate a Kubernetes config
cd policy-agent
go run ./cmd/policy-agent validate \
  --domain kubernetes \
  --file examples/deployment.yaml \
  --policies-dir ../policies

# Start webhook server
go run ./cmd/webhook --port 8443 --policies-dir ../policies

Deployment

EKS Bootstrap (Production)

# 1. Provision infrastructure
cd infrastructure
terraform init -backend-config=envs/dev.tfvars
terraform plan -var-file=envs/dev.tfvars
terraform apply -var-file=envs/dev.tfvars

# 2. Configure kubectl
aws eks update-kubeconfig --name idpportal-dev --region us-west-2

# 3. Bootstrap platform
#    Installs ArgoCD + ESO, then ArgoCD manages all 14 addons
./scripts/bootstrap.sh

# 4. ArgoCD auto-syncs everything via ApplicationSet

Environments

Environment	Cluster	VPC CIDR	Nodes
dev	idpportal-dev	10.0.0.0/16	3x m6i.xlarge
staging	idpportal-staging	10.1.0.0/16	3x m6i.xlarge
prod	idpportal-prod	10.2.0.0/16	5x m6i.2xlarge

CI/CD Pipelines

Workflow	Trigger	What it does
`ci-backend`	Push/PR to `backend/**`	Ruff lint + pytest with Postgres/Redis
`ci-ui`	Push/PR to `ui/**`	ESLint + Next.js build
`ci-policy-agent`	Push/PR to `policy-agent/**`	go vet + go test + build
`policy-gate`	PR with `.yaml`/`.tf`	Validates configs against Rego policies
`terraform-plan`	PR to `infrastructure/**`	Terraform plan + PR comment
`terraform-apply`	Merge to main `infrastructure/**`	Terraform apply
`release`	Git tag `v*`	Docker build, ECR push, Helm values update

Self-Service Capabilities

Developers can self-serve through the portal UI or AI chat:

Template	What it provisions
Microservice	GitHub repo + CI pipeline + ArgoCD/Flux deploy + catalog registration
Kafka Topic	Strimzi KafkaTopic CR with policy validation
Database	RDS instance + credentials in Vault
API Service	REST API with OpenAPI spec + API gateway config
S3 Bucket	Encrypted S3 bucket with IAM policy
Worker Service	Background job processor with SQS/Kafka consumer

All provisioning is policy-gated: configs are validated against OPA/Rego policies before any resources are created.

Implementation Guide

For the complete step-by-step execution guide, see IMPLEMENTATION.md.

It covers everything needed to get this platform running:

Local Development — Docker Compose, hot reload, policy agent CLI
AWS Infrastructure — Terraform state backend, VPC/EKS provisioning
EKS Bootstrap — ArgoCD + ESO install, ApplicationSet sync
Platform Services — Keycloak, Vault, Kafka, AWS Secrets Manager
Application Deployment — Docker build, ECR push, Helm values, webhook setup
CI/CD Pipelines — GitHub Actions, OIDC for AWS, release workflow
Self-Service — Backstage templates, AI chat provisioning
Production Hardening — Network policies, observability, backups
Verification Checklist — Complete post-deploy validation
Troubleshooting — Common issues and fixes

References

CNOE - Cloud Native Operational Excellence
CNOE AWS Reference Implementation - GitOps Bridge pattern, App-of-Apps, bootstrap strategy
CAIPE - AI Platform Engineering - Multi-agent AI architecture concepts
GitOps Bridge - Terraform-to-ArgoCD metadata bridge
terraform-aws-modules - VPC, EKS, IAM Terraform modules
Backstage - Developer portal and software catalog
Strimzi - Kubernetes-native Kafka operator
OPA / Rego - Policy-as-code framework
LangGraph - Multi-agent orchestration framework
Model Context Protocol (MCP) - AI tool registry standard

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

IDP Portal

Acknowledgments and References

Table of Contents

Architecture Overview

System Architecture

AI Multi-Agent Architecture

GitOps Bridge Pattern

Policy-as-Code Architecture

Identity and Auth Architecture

Self-Service Flow

Tech Stack

Project Structure

Quick Start

Prerequisites

Local Development

Development with Hot Reload

Policy Agent

Deployment

EKS Bootstrap (Production)

Environments

CI/CD Pipelines

Self-Service Capabilities

Implementation Guide

References

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
.github/workflows		.github/workflows
backend		backend
docs/images		docs/images
infrastructure		infrastructure
packages		packages
policies		policies
policy-agent		policy-agent
scripts		scripts
templates		templates
ui		ui
.editorconfig		.editorconfig
.env.example		.env.example
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
IMPLEMENTATION.md		IMPLEMENTATION.md
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
config.yaml		config.yaml
docker-compose.dev.yaml		docker-compose.dev.yaml
docker-compose.yaml		docker-compose.yaml

Folders and files

Latest commit

History

Repository files navigation

IDP Portal

Acknowledgments and References

Table of Contents

Architecture Overview

System Architecture

AI Multi-Agent Architecture

GitOps Bridge Pattern

Policy-as-Code Architecture

Identity and Auth Architecture

Self-Service Flow

Tech Stack

Project Structure

Quick Start

Prerequisites

Local Development

Development with Hot Reload

Policy Agent

Deployment

EKS Bootstrap (Production)

Environments

CI/CD Pipelines

Self-Service Capabilities

Implementation Guide

References

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages