Merged
52 changes: 2 additions & 50 deletions .github/workflows/ci.yml
@@ -40,8 +40,6 @@ jobs:
- name: Run unit tests
run: |
go test -v -race -coverprofile=coverage.out \
./internal/operator/identifier/... \
./internal/operator/cniconfig/... \
./pkg/common/util/...

- name: Upload coverage
@@ -50,35 +48,6 @@ jobs:
files: coverage.out
fail_ci_if_error: false

test-operator:
name: Operator Tests
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6

- uses: actions/setup-go@v5
with:
go-version: ${{ env.GO_VERSION }}

- name: Generate manifests
run: make manifests generate

- name: Install envtest
run: go install sigs.k8s.io/controller-runtime/tools/setup-envtest@latest

- name: Run operator tests
run: |
KUBEBUILDER_ASSETS=$(setup-envtest use 1.31.0 -p path)
export KUBEBUILDER_ASSETS
echo "KUBEBUILDER_ASSETS=$KUBEBUILDER_ASSETS"
go test -v -race -coverprofile=coverage-operator.out ./internal/operator/...

- name: Upload coverage
uses: codecov/codecov-action@v6
with:
files: coverage-operator.out
fail_ci_if_error: false

build:
name: Build
runs-on: ubuntu-latest
@@ -89,9 +58,6 @@ jobs:
with:
go-version: ${{ env.GO_VERSION }}

- name: Generate manifests
run: make manifests generate

- name: Build binary
run: make build

@@ -122,7 +88,7 @@ jobs:
test-e2e:
name: E2E Tests
runs-on: ubuntu-latest
needs: [lint, test-unit, test-operator, build]
needs: [lint, test-unit, build]
if: github.ref == 'refs/heads/main' || startsWith(github.ref, 'refs/tags/')
steps:
- uses: actions/checkout@v6
@@ -149,23 +115,9 @@ jobs:
- name: Install CRDs
run: make install

- name: Deploy operator
run: |
make deploy IMG=galactic:e2e
kubectl -n galactic-system wait --for=condition=available deployment/galactic-controller-manager --timeout=120s

- name: Verify CRDs
run: |
kubectl get crd vpcs.galactic.datumapis.com
kubectl get crd vpcattachments.galactic.datumapis.com

- name: Create test resources
run: |
kubectl apply -f config/samples/galactic_v1alpha_vpc.yaml
kubectl apply -f config/samples/galactic_v1alpha_vpcattachment.yaml
sleep 5
kubectl get vpc
kubectl get vpcattachment
kubectl get crd network-attachment-definitions.k8s.cni.cncf.io

- name: Cleanup
if: always()
9 changes: 0 additions & 9 deletions .github/workflows/release.yml
@@ -24,9 +24,6 @@ jobs:
with:
go-version: '1.24'

- name: Generate manifests
run: make manifests generate

- name: Log in to Container Registry
uses: docker/login-action@v4
with:
@@ -64,13 +61,7 @@ jobs:
GIT_TREE_STATE=clean
BUILD_DATE=${{ github.event.repository.updated_at }}

- name: Generate install manifest
run: |
make build-installer IMG=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.ref_name }}

- name: Create Release
uses: softprops/action-gh-release@v3
with:
generate_release_notes: true
files: |
dist/install.yaml
4 changes: 0 additions & 4 deletions .golangci.yml
@@ -7,7 +7,6 @@ linters:
- copyloopvar
- dupl
- errcheck
- ginkgolinter
- goconst
- gocyclo
- govet
@@ -29,9 +28,6 @@ linters:
exclusions:
generated: lax
rules:
- linters:
- lll
path: api/*
- linters:
- dupl
- lll
45 changes: 17 additions & 28 deletions AGENTS.md
@@ -2,65 +2,54 @@

## Purpose & Architecture

Galactic is a Kubernetes network operator that gives pods declarative multi-cloud VPC connectivity using SRv6 and kernel VRF isolation. Users create two CRDs (`VPC`, `VPCAttachment`) and annotate pods; Galactic handles all routing. The control plane is a Go operator (`internal/operator/controller/`) that assigns identifiers and generates Multus `NetworkAttachmentDefinition` resources. A DaemonSet agent (`internal/agent/srv6/`) manages kernel SRv6 routes and VRFs per node. The CNI plugin (`internal/cni/`) runs in-process with the agent, registering container endpoints via gRPC. BGP is used as the control plane for distributing SRv6 routes between agents.
Galactic is the SRv6 data plane for multi-cloud VPC networking. It consists of a DaemonSet agent (`internal/agent/srv6/`) that manages kernel SRv6 routes and VRFs per node, and a CNI plugin (`internal/cni/`) that registers container endpoints with the agent via gRPC. VPC and VPCAttachment CRD management lives in a separate operator project; Galactic receives pre-populated identifiers through the CNI config and acts on them. BGP is used as the control plane for distributing SRv6 routes between agents.

**Data flow:** VPC/VPCAttachment CRDs → operator assigns 48-bit/16-bit hex IDs → NetworkAttachmentDefinition written → pod annotation triggers webhook (adds Multus annotation) → CNI runs → gRPC registers endpoint with agent → agent manages SRv6 ingress routes locally → BGP distributes SRv6 routes between agents.
**Data flow:** CNI invoked with pre-populated VPC/VPCAttachment identifiers → gRPC registers endpoint with agent → agent manages SRv6 ingress routes locally → BGP distributes SRv6 routes between agents.
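
The entry into this flow is the decision whether the binary was invoked as a CNI plugin, which this document says is made from the `CNI_COMMAND` environment variable. A minimal sketch of that check follows. The function name `runMode` and the returned strings are hypothetical and are not taken from the Galactic source; only the use of `CNI_COMMAND` comes from the text.

```go
package main

import (
	"fmt"
	"os"
)

// runMode reports how the binary should behave. The lookup function is
// injected so the decision can be tested without touching the real
// environment; in production it would be os.LookupEnv.
// NOTE: runMode and its return values are illustrative assumptions.
func runMode(lookup func(string) (string, bool)) string {
	// A CNI runtime sets CNI_COMMAND (for example ADD or DEL) before
	// executing the plugin binary.
	if cmd, ok := lookup("CNI_COMMAND"); ok && cmd != "" {
		return "cni:" + cmd
	}
	// Otherwise fall through to the regular CLI subcommands.
	return "cli"
}

func main() {
	fmt.Println(runMode(os.LookupEnv))
}
```

Injecting the lookup function keeps the check pure, so it can be exercised in a unit test without setting process-wide environment variables.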

**Non-obvious decisions:**
- VPC identifiers are 48-bit hex; VPCAttachment identifiers are 16-bit hex. These are embedded into IPv6 SRv6 endpoint addresses for deterministic route lookups.
- VPC identifiers are 48-bit hex; VPCAttachment identifiers are 16-bit hex. These are embedded into IPv6 SRv6 endpoint addresses for deterministic route lookups. Both are supplied by an external operator via the CNI config.
- Identifiers are also Base62-encoded for interface naming (VRF: `vrfX-Y`, veth host side: `galX-Y`) to keep kernel interface name length within limits.
- The binary auto-detects CNI mode via the `CNI_COMMAND` env var; otherwise runs as a Cobra CLI with `operator`, `agent`, `cni`, `version` subcommands.
- The binary auto-detects CNI mode via the `CNI_COMMAND` env var; otherwise runs as a Cobra CLI with `agent`, `cni`, and `version` subcommands.

## Tech Stack

- **Go 1.24** (toolchain 1.24.2) — operator, agent, CNI plugin
- **controller-runtime v0.21 / k8s v1.33** — operator framework
- **Multus CNI** — multi-network for pods; Galactic generates NADs automatically
- **Go 1.24** (toolchain 1.24.2) — agent and CNI plugin
- **Multus CNI** — multi-network for pods; NAD generation is handled by the external operator
- **gRPC + protobuf** — CNI-to-agent local communication (`pkg/proto/local/`)
- **SRv6 + netlink** — kernel-level routing; `github.com/vishvananda/netlink`
- **BGP** — control plane for SRv6 route distribution between agents
- **Ginkgo/Gomega** — Go BDD-style tests
- **controller-gen v0.18 / kustomize v5.6** — code + manifest generation (managed by Makefile, vendored to `bin/`)
- **BGP** — control plane for SRv6 route distribution between agents (in progress)

## Development Workflow

```
make build # produces bin/galactic
make test # gen + fmt + vet + unit tests with coverage
make test # fmt + vet + unit tests with coverage
make lint # golangci-lint; lint-fix applies safe auto-fixes
make run-operator # run operator against current kubeconfig
make run-agent # run agent (requires root / CAP_NET_ADMIN)
make test-e2e # requires Kind; setup-test-e2e creates the cluster
make manifests # regenerate CRDs + RBAC from Go types (run after API changes)
make generate # regenerate DeepCopy methods (run after API type changes)
```

**Before every PR:** `make lint test`.

**Envtest binaries** are downloaded to `bin/` by `make setup-envtest`. CI pins Kubernetes 1.31 for controller tests.

## Code Standards

See [CONVENTIONS.md](CONVENTIONS.md) for the full, prescriptive coding standards covering Go naming, error handling, testing patterns, API type conventions, code generation, linting, and commit messages.
See [CONVENTIONS.md](CONVENTIONS.md) for the full, prescriptive coding standards covering Go naming, error handling, testing patterns, linting, and commit messages.

Summary:
- Go: `gofmt`/`goimports` enforced; golangci-lint with `errcheck`, `staticcheck`, `govet`, `revive`, `gocyclo`, `dupl`, `unused` (see `.golangci.yml`). `lll` excluded from `api/` and `internal/`.
- Generated files (`zz_generated.deepcopy.go`, CRD YAMLs) are committed; regenerate with `make generate manifests` after type changes. Never hand-edit them.
- Kubebuilder marker annotations (`+kubebuilder:rbac`, `+kubebuilder:object:root`, etc.) drive code + manifest generation — keep them accurate.
- Go: `gofmt`/`goimports` enforced; golangci-lint with `errcheck`, `staticcheck`, `govet`, `revive`, `gocyclo`, `dupl`, `unused` (see `.golangci.yml`). `lll` excluded from `internal/`.
- Generated protobuf files (`*.pb.go`, `*_grpc.pb.go`) are committed; never hand-edit them.

## Current State

- **Known debt:** e2e tests only run on `main`/release branches (not PRs), so regressions in cluster behavior can merge undetected. Unit coverage exists for `identifier`, `cniconfig`, and `pkg/common/util`; controller reconciler logic has envtest coverage but agent/CNI paths do not.
- **Known debt:** Agent and CNI kernel-path code (`internal/agent/srv6/`, `internal/cni/`) has no unit coverage; these paths are best covered by integration or e2e tests. Only `pkg/common/util` has unit test coverage.
- **In flux:** The SRv6 route management (`internal/agent/srv6/`) and VRF utilities (`pkg/common/vrf/`) are the least tested and most likely to change as multi-cloud routing matures. BGP integration is in progress.

## New Developer Entry Points

1. Run `make build` to verify toolchain; run `make test` to confirm envtest and unit tests pass.
2. Read `pkg/apis/v1alpha/vpc_types.go` and `vpcattachment_types.go` — the CRD types are the core abstraction.
3. Trace `internal/operator/controller/vpcattachment_controller.go` — it wires operator reconciliation to Multus NAD generation.
4. Read `internal/cni/cni.go` (cmdAdd/cmdDel) to understand the container attach path.
5. `config/samples/` has working VPC, VPCAttachment, and annotated Pod examples.
1. Run `make build` to verify toolchain; run `make test` to confirm unit tests pass.
2. Read `internal/cni/cni.go` (cmdAdd/cmdDel) to understand the container attach path.
3. Read `internal/agent/srv6/srv6.go` to understand the agent entry point and how it manages SRv6 routes and VRFs.
4. Read `pkg/proto/local/local.go` to understand the gRPC interface between the CNI and the agent.
5. Explore `pkg/common/` for shared utilities (VRF management, sysctl helpers, CNI types).

**Likely trip-ups:**
- `make run-agent` requires elevated privileges (netlink, VRF, SRv6 operations need `CAP_NET_ADMIN`).
- After modifying API types, you must run both `make generate` and `make manifests` or CRD YAML and DeepCopy will be out of sync.
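
One way to catch the privilege trip-up early is a preflight check for `CAP_NET_ADMIN` before the agent touches netlink. The sketch below is not part of the Galactic source; the helper names `hasCap` and `effectiveCaps` are hypothetical. It relies on the Linux convention that `/proc/self/status` exposes the effective capability set as a hex mask on the `CapEff:` line, and that `CAP_NET_ADMIN` is bit 12.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// capNetAdmin is the bit number of CAP_NET_ADMIN in linux/capability.h.
const capNetAdmin = 12

// hasCap reports whether the given bit is set in a hex capability mask
// such as the value of the CapEff line in /proc/<pid>/status.
func hasCap(capEffHex string, bit uint) (bool, error) {
	mask, err := strconv.ParseUint(strings.TrimSpace(capEffHex), 16, 64)
	if err != nil {
		return false, fmt.Errorf("parse capability mask: %w", err)
	}
	return mask&(1<<bit) != 0, nil
}

// effectiveCaps returns the hex mask from the CapEff line of statusPath.
func effectiveCaps(statusPath string) (string, error) {
	f, err := os.Open(statusPath)
	if err != nil {
		return "", err
	}
	defer f.Close()

	sc := bufio.NewScanner(f)
	for sc.Scan() {
		if rest, ok := strings.CutPrefix(sc.Text(), "CapEff:"); ok {
			return strings.TrimSpace(rest), nil
		}
	}
	if err := sc.Err(); err != nil {
		return "", err
	}
	return "", fmt.Errorf("CapEff not found in %s", statusPath)
}

func main() {
	mask, err := effectiveCaps("/proc/self/status")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read capabilities:", err)
		os.Exit(1)
	}
	ok, err := hasCap(mask, capNetAdmin)
	if err != nil || !ok {
		fmt.Fprintln(os.Stderr, "CAP_NET_ADMIN is missing; netlink, VRF, and SRv6 operations will fail")
		os.Exit(1)
	}
	fmt.Println("CAP_NET_ADMIN present")
}
```

Failing fast with a clear message is easier to diagnose than a permission error from the first netlink call.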