Skip to content

Add unvendored repository support with precise dependency diff analysis#71

Open
bts-bastion wants to merge 8 commits into
digitalocean:masterfrom
bts-bastion:bts/go-unvendored
Open

Add unvendored repository support with precise dependency diff analysis#71
bts-bastion wants to merge 8 commits into
digitalocean:masterfrom
bts-bastion:bts/go-unvendored

Conversation

@bts-bastion
Copy link
Copy Markdown

Problem

GTA (Go Test Auto) was designed for vendored repositories. In an unvendored repo, two things break:

  1. Crashes: packages.Load returns packages with errors for external dependencies that aren't
    vendored locally. These errors propagate through graph construction and PackageFromImport,
    causing GTA to fail outright.

  2. Nuclear go.mod handling: When go.mod changes, GTA marks every package in the dependency
    graph as dirty. In a vendored repo this rarely happens (vendor changes show up as file diffs
    instead). In an unvendored repo, routine dependency bumps trigger a full rebuild of the entire
    module — completely negating GTA's value as a targeted analysis tool.

Approach

The fix is delivered in two phases across 7 atomic commits, each independently reviewable and
test-green.

Phase 1: Don't crash (commits 1-4)

Four defensive changes make GTA survive unvendored repos without changing its analysis behavior:

  • Skip errored packages in graph construction (packager.go): When packages.Load returns a
    package with pkg.Errors (e.g., an external dep whose source isn't local), record its module info
    but don't add it to the forward/reverse graphs. This prevents broken nodes from polluting
    traversal.

  • Tolerate failures (gta.go): In ChangedPackages(), when the reverse
    graph contains an external package that can't be resolved, continue instead of returning an
    error. For vendored repos this path is never hit (all graph packages resolve).

  • Filter non-local edges during traversal (gta.go): The traverse() function in
    markedPackages() now skips edges to external packages via a localPackageChecker type assertion.
    This prevents the graph walk from chasing into external dependency trees that don't exist locally.

  • isLocalPackage helper (packager.go): Determines whether an import path belongs to a main
    module by checking against modulesNamesByDir. In GOPATH mode (where modulesNamesByDir is
    empty), all packages are treated as local — preserving existing behavior exactly.

All four changes use type assertions on unexported interfaces, not additions to the public
Packager interface. Custom Packager implementations are completely unaffected — the type
assertion simply fails and the code falls through to existing behavior.

Phase 2: Precise dependency diff (commits 5-7)

Replace the nuclear "mark all packages when go.mod changes" with targeted analysis:

How diffGoMod works

diffGoMod(oldData, newData) parses both go.mod files using modfile.Parse (with ParseLax
fallback for non-standard version strings) and compares:

  • Require directives: Added, removed, or version-changed dependencies
  • Replace directives: Added, removed, or target-changed replacements

It returns a []ModuleChange identifying exactly which dependency module paths changed.

How diffGoSum catches transitive changes

This is the key insight for correctness. go.mod only lists direct dependencies. A transitive
dependency three levels deep can change version without go.mod changing at all — but go.sum
will change, because it records the cryptographic hash of every module in the build graph.

diffGoSum(oldData, newData) does a line-level symmetric diff of the two go.sum files. Each
go.sum line is module version hash. Any line present in one file but not the other means that
module changed. The function extracts the module path from each differing line and returns the
deduplicated set.

This makes the detection intentionally overprotective: if any transitive dependency changes
its resolved version or hash, go.sum will reflect it, and GTA will mark the local packages that
import that module. It may mark slightly more than strictly necessary (e.g., if a transitive dep
changed but only affects a code path your local package doesn't use), but it will never miss a
change that could affect your build
. The bias is toward safety — false positives (extra test runs)
over false negatives (missed regressions).

How the pieces connect at the call site

When go.mod or go.sum appears in the diff, markedPackages() now:

  1. Retrieves old content via BaseFileReader (the git differ reads git show <base>:<path>)
    or via SetBaseGoMod/SetBaseGoSum options (for file-differ mode without git).

  2. Diffs both files: diffGoMod for direct dependency changes, diffGoSum for transitive.
    Results are merged and deduplicated.

  3. Maps modules to local importers: LocalImportersOf(changedModPaths) scans the forward
    dependency graph to find which local packages import any package under the changed module paths.
    Only those packages are marked dirty.

  4. Falls back to nuclear when base content is unavailable (custom Differ without
    BaseFileReader, no SetBaseGoMod option). This preserves existing behavior for any setup
    that can't provide the old file content.

What about go.work?

go.work changes continue to use the nuclear option. Workspace structural changes (adding/removing
modules from the workspace) warrant full re-evaluation since they can fundamentally change which
packages are visible and how they resolve.

Interface compatibility

This PR adds zero methods to any exported interface:

New type Scope Discovery
localPackageChecker unexported interface type assertion in traverse()
localImporterFinder unexported interface type assertion in markedPackages()
BaseFileReader exported interface type assertion on Differ at call site

BaseFileReader is a standalone exported interface, not an extension of Differ. The git differ
satisfies it; the file differ does not. Call sites use if reader, ok := g.differ.(BaseFileReader).

packageContext (the concrete type behind Packager) implements all three. External Packager
implementations don't need to change — the type assertions simply fail and existing code paths run.

Correctness argument

The detection is sound (no false negatives) because:

  • Every direct dependency change appears in go.mod → caught by diffGoMod
  • Every transitive dependency change appears in go.sum → caught by diffGoSum
  • Every changed module is mapped to local importers via the forward graph → caught by LocalImportersOf
  • When base content is unavailable, we fall back to marking everything → no change can slip through

The detection may produce false positives (overprotective) because:

  • go.sum line changes don't distinguish which specific packages within a module changed
  • A module path match marks the local package even if it only imports an unaffected sub-package
  • diffGoSum treats hash changes the same as version changes

This is the right tradeoff: running a few extra tests is cheap; missing a broken dependency is not.

Test plan

  • go test -v -count=1 -race ./... — all pass, no races
  • 7 new unit test functions with 40+ table-driven cases across packager_test.go, gomod_test.go, gta_test.go
  • 5 new integration tests covering: precise detection, nuclear fallback, go.sum transitive detection, go.work nuclear preservation, SetBaseGoMod option
  • All 11 pre-existing test files pass unmodified (backward compatibility)
  • GOPATH-mode tests pass (the _/GOPATH subtests in TestGTA_ChangedPackages)

Detect and parse go.work files to resolve packages across all workspace
modules, enabling cross-module dependency tracking in CI pipelines.
Add -no-workspace flag to opt out. Mark all packages dirty when go.work
or go.mod changes.

Co-authored by: Blue Thunder Somogyi
These methods enable distinguishing local (main module) packages from
external dependencies and mapping changed dependency modules to the
local packages that import them. Foundation for unvendored repo support.

Co-authored by: Blue Thunder Somogyi
When packages.Load returns packages with errors (common for external
dependencies in unvendored repos), skip them when building the
forward/reverse dependency graphs. Module info is still recorded.

Co-authored by: Blue Thunder Somogyi
When a package in the dependency graph cannot be resolved (e.g., an
external dependency in an unvendored repo), skip it instead of
returning an error from ChangedPackages(). This prevents GTA from
crashing on unvendored repositories.

Co-authored by: Blue Thunder Somogyi
When traversing the dependency graph in markedPackages(), skip edges
to external (non-local) packages. Uses a type assertion so custom
Packager implementations retain the existing traversal behavior.
In GOPATH mode (no modules), all packages are treated as local.

Co-authored by: Blue Thunder Somogyi
New file gomod.go provides diffGoMod() and diffGoSum() for precise
dependency change detection. diffGoMod compares Require and Replace
directives using modfile.Parse with ParseLax fallback. diffGoSum does
line-level comparison to catch transitive dependency changes.

Co-authored by: Blue Thunder Somogyi
BaseFileReader enables reading file content at the git merge-base,
used to retrieve old go.mod/go.sum for dependency diff analysis.
SetBaseGoMod and SetBaseGoSum options provide the same capability
for file-differ mode (no git access).

Co-authored by: Blue Thunder Somogyi
When go.mod or go.sum changes in an unvendored repo, GTA now parses
the diff to identify exactly which dependency modules changed, then
marks only the local packages that import those modules. This replaces
the previous "mark all packages" approach which negated GTA's value.

Falls back to mark-all when base file content is unavailable (e.g.,
custom Differ without BaseFileReader, no SetBaseGoMod option).

go.work changes continue to use mark-all since workspace structural
changes warrant full re-evaluation.

Co-authored by: Blue Thunder Somogyi
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant