Skip to content

Fix memory leak and dead no-op wait in Lance streaming path#8

Closed
Copilot wants to merge 2 commits intotsafin/lance_streamfrom
copilot/sub-pr-7
Closed

Fix memory leak and dead no-op wait in Lance streaming path#8
Copilot wants to merge 2 commits intotsafin/lance_streamfrom
copilot/sub-pr-7

Conversation

Copy link

Copilot AI commented Feb 23, 2026

Two bugs in the Lance streaming write path identified during code review.

Changes

  • Memory leak on error in start_stream() (Rust)libc::free() was called after the ? early-return on ArrowArrayStreamReader::from_raw() failure, leaving the malloc'd stream_ptr unreachable since C++ had already transferred ownership. Fixed by unconditionally freeing before propagating the error:
// Before: free is skipped on from_raw failure
let reader = unsafe { ArrowArrayStreamReader::from_raw(stream_ptr) }
    .map_err(|e| ...)?;
unsafe { libc::free(stream_ptr as *mut c_void) };

// After: always freed regardless of from_raw outcome
let result = unsafe { ArrowArrayStreamReader::from_raw(stream_ptr) };
unsafe { libc::free(stream_ptr as *mut c_void) };
let reader = result.map_err(|e| ...)?;
  • Dead no-op else branch in StreamState::push() (C++) — the else path (queue has space) called not_full_cv_.wait() with a predicate guaranteed satisfied immediately, since the mutex is held continuously from the top of push(). This was a no-op that obscured the backpressure logic. Removed the branch entirely.

💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.

…e::push()

Co-authored-by: tsafin <4956130+tsafin@users.noreply.github.com>
Copilot AI changed the title [WIP] Optimize Lance streaming with cardinality sampling and pre-computed stats Fix memory leak and dead no-op wait in Lance streaming path Feb 23, 2026
Copilot AI requested a review from tsafin February 23, 2026 23:18
@tsafin tsafin marked this pull request as ready for review February 24, 2026 20:51
@tsafin
Copy link
Owner

tsafin commented Feb 24, 2026

@claude review

@tsafin
Copy link
Owner

tsafin commented Feb 24, 2026

@codex review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ee95869d03

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

let reader = unsafe { ArrowArrayStreamReader::from_raw(stream_ptr) }
.map_err(|e| format!("Failed to import ArrowArrayStream: {}", e))?;
let result = unsafe { ArrowArrayStreamReader::from_raw(stream_ptr) };
unsafe { libc::free(stream_ptr as *mut c_void) };

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Keep stream ownership single on import failure

start_stream now unconditionally calls libc::free(stream_ptr) before propagating from_raw errors, but the C++ caller still frees the same pointer when lance_writer_start_stream returns non-zero (src/writers/lance_writer.cpp:259-262). If ArrowArrayStreamReader::from_raw returns Err (e.g., invalid exported stream), this path double-frees the same allocation and can corrupt the heap or abort the process.

Useful? React with 👍 / 👎.

tsafin added a commit that referenced this pull request Feb 24, 2026
Two bugs identified by copilot-swe-agent in PR #8:

- lib.rs: fix memory leak on ArrowArrayStreamReader::from_raw() failure.
  The original code called libc::free() after the ? early-return, so the
  malloc'd ArrowArrayStream struct leaked if from_raw() failed. Fix:
  free unconditionally before propagating the error.

- lance_writer.cpp: remove dead no-op else branch in StreamState::push().
  The else path (queue not full) called not_full_cv_.wait() with a
  predicate already satisfied, making it a guaranteed immediate return.
  Removing it clarifies the backpressure logic.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@tsafin
Copy link
Owner

tsafin commented Feb 24, 2026

Both fixes incorporated into tsafin/lance_stream as commit f5142c6. Thank you for the review — the memory leak on from_raw() failure and the dead else branch are now fixed directly in our working branch.

@tsafin tsafin closed this Feb 24, 2026
tsafin added a commit that referenced this pull request Feb 24, 2026
Two bugs identified by copilot-swe-agent in PR #8:

- lib.rs: fix memory leak on ArrowArrayStreamReader::from_raw() failure.
  The original code called libc::free() after the ? early-return, so the
  malloc'd ArrowArrayStream struct leaked if from_raw() failed. Fix:
  free unconditionally before propagating the error.

- lance_writer.cpp: remove dead no-op else branch in StreamState::push().
  The else path (queue not full) called not_full_cv_.wait() with a
  predicate already satisfied, making it a guaranteed immediate return.
  Removing it clarifies the backpressure logic.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants