Skip to content

Use less memory in processing JSON#82

Merged
mason-sharp merged 1 commit intomainfrom
task/ace-minor-leaks
Feb 23, 2026
Merged

Use less memory in processing JSON#82
mason-sharp merged 1 commit intomainfrom
task/ace-minor-leaks

Conversation

@mason-sharp
Copy link
Member

@mason-sharp mason-sharp commented Feb 23, 2026

  • Replace json.MarshalIndent + os.WriteFile with json.NewEncoder streaming directly to the file, avoiding a second full copy of the diff report in memory at write time.

  • Replace defer conn.Close()/pool.Close() inside loops with explicit Close() at end of each iteration, preventing all connections from staying open until the function returns.

@coderabbitai
Copy link

coderabbitai bot commented Feb 23, 2026

📝 Walkthrough

Walkthrough

Resource cleanup in table_diff.go moves from deferred closures to explicit closes during per-node processing. utils.go switches from in-memory JSON marshalling to streaming JSON encoding when writing files, changing memory and timing characteristics without altering public APIs.

Changes

Cohort / File(s) Summary
Resource Management Refactoring
internal/consistency/diff/table_diff.go
Replaced deferred conn.Close() and pool.Close() usage with explicit closes executed after per-node processing, changing when connections/pools are released within loops.
JSON Output Optimization
pkg/common/utils.go
Replaced json.MarshalIndent + os.WriteFile with os.Create + json.NewEncoder (with SetIndent) and explicit file close handling to stream formatted JSON to disk instead of building a full in-memory byte slice.

Poem

🐇 I hopped through loops and closed each gate,
Freed every socket, finished every wait.
I speak in streams to save the heap,
Printed tidy JSON before I sleep.
A little rabbit, tidy and elate.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Use less memory in processing JSON' directly relates to the main change of replacing json.MarshalIndent + os.WriteFile with streaming json.NewEncoder, which is the primary memory optimization.
Description check ✅ Passed The pull request description accurately describes the changeset, detailing both the JSON streaming optimization and the explicit resource closure changes.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch task/ace-minor-leaks

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/consistency/diff/table_diff.go (1)

872-933: ⚠️ Potential issue | 🟠 Major

Connection leak on error paths in RunChecks loop

Replacing defer conn.Close() with an explicit conn.Close() at the end of the iteration means every early return in the loop body after a successful auth.GetClusterNodeConnection will leak the connection. Specifically, the following paths skip line 933:

  • GetColumns error / empty columns (lines ~879–883)
  • GetPrimaryKey error / empty key (lines ~886–891)
  • Schema mismatch (lines ~898–900)
  • GetColumnTypes error (lines ~906–908)
  • CheckUserPrivileges error / missing TableSelect privilege (lines ~918–925)

The previous defer conn.Close() was safe on error paths — it deferred cleanup until function return. The right approach to close at end of iteration while also covering all error paths is to scope the connection inside an IIFE:

🔧 Proposed fix: IIFE to scope `defer conn.Close()` per iteration
 	for _, nodeInfo := range t.ClusterNodes {
 		hostname, _ := nodeInfo["Name"].(string)
 		hostIP, _ := nodeInfo["PublicIP"].(string)
 		user, _ := nodeInfo["DBUser"].(string)
 
 		port, ok := nodeInfo["Port"].(string)
 		if !ok {
 			port = "5432"
 		}
 
 		if !utils.Contains(t.NodeList, hostname) {
 			continue
 		}
 
-		conn, err := auth.GetClusterNodeConnection(t.Ctx, nodeInfo, t.connOpts())
-		if err != nil {
-			return fmt.Errorf("failed to connect to node %s: %w", hostname, err)
-		}
-
-		currCols, err := queries.GetColumns(t.Ctx, conn, schema, table)
-		// ... (all existing logic) ...
-		hostMap[hostIP+":"+port] = hostname
-
-		if t.TableFilter != "" {
-			logger.Info("Applying table filter for diff: %s", t.TableFilter)
-		}
-
-		conn.Close()
-	}
+		var colTypes map[string]string
+		colTypesKey := fmt.Sprintf("%s:%s", hostIP, port)
+		if nodeErr := func() error {
+			conn, err := auth.GetClusterNodeConnection(t.Ctx, nodeInfo, t.connOpts())
+			if err != nil {
+				return fmt.Errorf("failed to connect to node %s: %w", hostname, err)
+			}
+			defer conn.Close()
+
+			// ... (all existing logic using conn) ...
+			return nil
+		}(); nodeErr != nil {
+			return nodeErr
+		}
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/consistency/diff/table_diff.go` around lines 872 - 933, The loop in
RunChecks creates a DB connection via auth.GetClusterNodeConnection but calls
conn.Close() only at the loop end, leaking connections on early returns (e.g.,
after queries.GetColumns, queries.GetPrimaryKey, schema mismatch,
queries.GetColumnTypes, queries.CheckUserPrivileges). Fix by scoping
per-iteration work in an anonymous function (IIFE) immediately after creating
conn so you can use defer conn.Close() to guarantee cleanup on all return paths;
move the logic that reads currCols/currKey/colTypes, sets t.ColTypes and
hostMap, checks privileges, logs filters, etc., into that IIFE and keep
auth.GetClusterNodeConnection and defer conn.Close() inside it.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/common/utils.go`:
- Line 1229: The current use of enc.Encode(diffResult) writes a trailing newline
which changes prior behavior (json.MarshalIndent + os.WriteFile) and may break
byte-exact comparisons; update the write so the file contains the exact JSON
bytes without the extra '\n' — either replace enc.Encode(diffResult) with
json.MarshalIndent(diffResult, "", "  ") and write the result using os.WriteFile
(preserving previous indentation and no trailing newline), or encode into a
buffer then trim a single trailing newline before writing; refer to
enc.Encode(diffResult), json.MarshalIndent, os.WriteFile and the diffResult
variable to locate and modify the code.
- Line 1225: Replace the silent defer f.Close() with an explicit close after the
encode/write completes: remove defer f.Close(), perform the encode/write (e.g.,
encoder.Encode or file write call), then call if err := f.Close(); err != nil {
return fmt.Errorf("closing file: %w", err) } (or log/propagate the error
consistent with surrounding code). This ensures errors from f.Close() (flush
failures like disk full) are detected rather than silently ignored; search for
the defer f.Close() occurrence and the adjacent encode/write call to implement
the change.

---

Outside diff comments:
In `@internal/consistency/diff/table_diff.go`:
- Around line 872-933: The loop in RunChecks creates a DB connection via
auth.GetClusterNodeConnection but calls conn.Close() only at the loop end,
leaking connections on early returns (e.g., after queries.GetColumns,
queries.GetPrimaryKey, schema mismatch, queries.GetColumnTypes,
queries.CheckUserPrivileges). Fix by scoping per-iteration work in an anonymous
function (IIFE) immediately after creating conn so you can use defer
conn.Close() to guarantee cleanup on all return paths; move the logic that reads
currCols/currKey/colTypes, sets t.ColTypes and hostMap, checks privileges, logs
filters, etc., into that IIFE and keep auth.GetClusterNodeConnection and defer
conn.Close() inside it.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6997687 and 8ec1bda.

📒 Files selected for processing (2)
  • internal/consistency/diff/table_diff.go
  • pkg/common/utils.go

return "", "", fmt.Errorf("failed to write diffs file: %w", err)
enc := json.NewEncoder(f)
enc.SetIndent("", " ")
if err := enc.Encode(diffResult); err != nil {
Copy link

@coderabbitai coderabbitai bot Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

json.Encoder.Encode appends a trailing newline — behavioral change

enc.Encode(v) always writes a \n after the JSON payload; the previous json.MarshalIndent + os.WriteFile did not. Any downstream consumer or test that reads the file and compares its exact byte content will be affected. This is likely harmless for JSON parsers, but worth confirming no tooling does a byte-exact check on the diff output.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/common/utils.go` at line 1229, The current use of enc.Encode(diffResult)
writes a trailing newline which changes prior behavior (json.MarshalIndent +
os.WriteFile) and may break byte-exact comparisons; update the write so the file
contains the exact JSON bytes without the extra '\n' — either replace
enc.Encode(diffResult) with json.MarshalIndent(diffResult, "", "  ") and write
the result using os.WriteFile (preserving previous indentation and no trailing
newline), or encode into a buffer then trim a single trailing newline before
writing; refer to enc.Encode(diffResult), json.MarshalIndent, os.WriteFile and
the diffResult variable to locate and modify the code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is ok. It is just at the end of the file, everything else remains the same.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mason-sharp, understood! Thanks for confirming that the trailing newline is acceptable for the JSON diff output files. The streaming approach still achieves the memory optimization goal while keeping the output functionally equivalent for consumers.

…disk

- Replace defer conn.Close()/pool.Close() inside loops with explicit
  Close() at end of each iteration, preventing all connections from
  staying open until the function returns.

- Replace json.MarshalIndent + os.WriteFile with json.NewEncoder
  streaming directly to the file, avoiding a second full copy of the
  diff report in memory at write time.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@mason-sharp mason-sharp merged commit 85a7e0e into main Feb 23, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant