Update to MLX 0.30.6 by robert-johansson · Pull Request #6 · frost-beta/node-mlx

robert-johansson · 2026-02-24T16:07:06Z

Summary

Bump MLX submodule from 0.25.0 to 0.30.6
Add ki::Type specialization for mlx::core::SmallVector (MLX >= 0.26 uses SmallVector for Shape)
Update API call sites for breaking changes: std::vector<int> → mx::Shape, new output_padding params in conv_transpose, extra arg in scaled_dot_product_attention
Split large ki::Set registration calls to stay within template parameter limits
Wrap mx::metal::device_info to handle its new std::unordered_map return type

Tested on macOS with Apple Silicon (M4). All existing functionality works.

🤖 Generated with Claude Code

Bump MLX submodule from v0.25.0 to v0.30.6 and fix all API changes:

- Add SmallVector<T> kizunapi type specialization (Shape changed from std::vector<int> to SmallVector in MLX >= 0.26)
- Add PutIntoShape helper, keep PutIntoVector for std::vector<int> uses
- Update FFT wrapper function pointer types for Shape parameter
- Add output_padding parameter to conv_transpose1d/2d/3d
- Add sinks parameter to scaled_dot_product_attention calls
- Move device_info from metal:: to gpu:: namespace
- Split large ki::Set calls to stay within template argument limits

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Update deps/mlx with fix for compile_fuse broadcast split_one bug that caused "unordered_map::at: key not found" on compiled functions with ~100+ operations. This is an upstream MLX bug (v0.29.4+). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Update MLX submodule with improved compile_fuse fix that preserves the broadcast fusion optimization while fixing the aliasing bug that caused unordered_map::at crashes on large computation graphs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Points deps/mlx to ml-explore/mlx main (c8536f52) which includes the merged compile_fuse broadcast split fix from PR #3166, plus newer upstream fixes (Metal event leak, conv3d overflow, fence sync). Replaces the local branch commits (65cefdef, a6d40e4a) which are now superseded by the upstream merge. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Update MLX submodule to include native lgamma/digamma kernels and add Node.js bindings for both operations. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
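When checking the new bindings against known values, a CPU reference can help. The sketch below is not the MLX kernel: it relies on the standard library's std::lgamma and approximates digamma as the central-difference derivative of lgamma. The helper name digamma_reference and the step size h are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical CPU reference, not MLX's Metal kernel.
// digamma(x) is the derivative of lgamma(x); approximate it with a
// central difference. Only meaningful for x > h.
inline double digamma_reference(double x, double h = 1e-5) {
  assert(x > h);
  return (std::lgamma(x + h) - std::lgamma(x - h)) / (2.0 * h);
}
```

Comparing a binding's output against such a reference with a loose tolerance is usually enough to catch sign or recursion mistakes in a kernel.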

- Update deps/mlx submodule URL to robert-johansson/mlx (genmlx branch) with lgamma, digamma, bessel_i0e, bessel_i1e ops
- Add besselI0e/besselI1e bindings in ops.cc and type declarations

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Report external memory (min 1MB per array) via napi_adjust_external_memory so the JS GC knows about Metal GPU buffer pressure. This makes GC run earlier, reducing the chance of hitting Metal's 499K allocation limit.

- Point kizunapi submodule to robert-johansson fork with ExternalMemorySize trait
- Specialize ExternalMemorySize for mx::array (1MB minimum cost)
- Add napi_adjust_external_memory calls in Tidy and Dispose paths

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
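The accounting described above can be sketched without N-API. This is a minimal illustration, assuming the reported cost is the larger of the array's byte size and the 1MB floor; ExternalCost and ExternalMemoryLedger are hypothetical names, and the ledger merely stands in for the running total that napi_adjust_external_memory keeps inside the JS engine.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed floor, taken from the commit message: each array is reported
// as at least 1 MB regardless of its actual size.
constexpr int64_t kMinExternalCost = 1 << 20;

// Cost reported to the GC for an array holding `nbytes` bytes.
inline int64_t ExternalCost(size_t nbytes) {
  return std::max<int64_t>(static_cast<int64_t>(nbytes), kMinExternalCost);
}

// Hypothetical stand-in for the engine-side total. A real binding would
// pass +cost / -cost to napi_adjust_external_memory instead.
struct ExternalMemoryLedger {
  int64_t total = 0;
  void OnCreate(size_t nbytes) { total += ExternalCost(nbytes); }
  void OnDispose(size_t nbytes) { total -= ExternalCost(nbytes); }
};
```

The point of the floor is that a tiny array still holds a GPU allocation, so reporting only its byte size would understate the pressure. Every increment needs a matching decrement on the dispose path, otherwise the total drifts upward.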

Adds a native function that bypasses the deferred N-API finalizer queue by synchronously walking the wrapper registry and freeing arrays whose JS wrappers have been GC'd. This is critical for synchronous inference loops where the event loop never yields and deferred finalizers never run.

Includes kizunapi changes:

- CollectDeadWrappers<T>() in InstanceData
- ExternalMemorySize reporting on AllowPassByValue path
- Double-free guard in finalizer callbacks

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
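The sweep and the double-free guard can be illustrated with a toy registry. This is a sketch under stated assumptions, not kizunapi's code: Entry, Registry, CollectDead and Finalize are hypothetical names, and an int stands in for the native array.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical registry entry: the native object plus a flag that is set
// once the JS wrapper has been garbage collected.
struct Entry {
  int* native = nullptr;
  bool wrapper_dead = false;
};

using Registry = std::unordered_map<int, Entry>;

// Walk the registry synchronously and free natives whose wrappers are dead.
// Erasing the entry is what keeps a late finalizer from freeing it again.
inline size_t CollectDead(Registry& registry) {
  size_t freed = 0;
  for (auto it = registry.begin(); it != registry.end();) {
    if (it->second.wrapper_dead) {
      delete it->second.native;
      it = registry.erase(it);
      ++freed;
    } else {
      ++it;
    }
  }
  return freed;
}

// Deferred finalizer: does nothing if the sweep already handled this id.
inline bool Finalize(Registry& registry, int id) {
  auto it = registry.find(id);
  if (it == registry.end()) return false;  // double-free guard
  delete it->second.native;
  registry.erase(it);
  return true;
}
```

Because both paths consult the same registry, it does not matter whether the sweep or the finalizer reaches an object first; whichever runs second finds no entry and returns.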

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Kernel .h changes now take effect via JIT source string regeneration without needing to manually delete .air/.metallib files. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ed calls

The Tidy function captured `auto& top = g_tidy_arrays.top()` and passed `[&top]` to the AwaitFunction lambda. If the lambda executed after the stack was modified (async Promise path, or nested tidy calls), `top` became a dangling reference → segfault at address 0x5.

Fix: move the set off the stack inside cpp_then (at execution time, not capture time). Use a shared_ptr<bool> flag to coordinate between cpp_then and cpp_finally so the stack is popped exactly once: cpp_then pops on success, cpp_finally pops only on error (if cpp_then didn't run).

Verified: nested tidy (3 levels), 218K-call stress test, GenMLX test suite (165/165 gen_clj_compat).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
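The pop-exactly-once coordination can be sketched in isolation. This is an illustration of the pattern, assuming a long-lived stack of scopes; Callbacks, MakeTidyCallbacks and ArraySet are hypothetical names, and plain std::function objects stand in for the cpp_then / cpp_finally callbacks.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <set>
#include <stack>
#include <utility>

using ArraySet = std::set<int>;  // stand-in for the set of tracked arrays

struct Callbacks {
  std::function<ArraySet()> then;  // runs only on success
  std::function<void()> finally_;  // runs afterwards in every case
};

// Build the callbacks without capturing a reference to scopes.top().
// The stack itself is assumed to outlive both callbacks (a global in the
// original); its top element is looked up only when `then` executes.
inline Callbacks MakeTidyCallbacks(std::stack<ArraySet>& scopes) {
  auto popped = std::make_shared<bool>(false);
  Callbacks cb;
  cb.then = [&scopes, popped]() {
    ArraySet owned = std::move(scopes.top());  // move the set off the stack
    scopes.pop();
    *popped = true;
    return owned;
  };
  cb.finally_ = [&scopes, popped]() {
    if (!*popped) {  // error path: `then` never ran
      scopes.pop();
      *popped = true;
    }
  };
  return cb;
}
```

The shared flag is copied into both closures, so it stays valid however long the callbacks are deferred, which is exactly what the captured reference to the top element could not guarantee.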

…eference

ExternalMemorySize::Get(a) was called on array pointers before checking if the pointer was still valid. If JS GC had already finalized the array (calling TypeBridge::Finalize → delete), the pointer was dangling.

Fix: check GetWrapper/DeleteWrapper first. Only access the pointer if the wrapper map confirms it's still alive (states 1 or 3, not state 2).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

When the JS function passed to valueAndGrad threw during tracing, the error was silently swallowed. The traced lambda returned an empty vector, MLX's value_and_grad continued with garbage, and TreeUnflatten returned a stale tracer Symbol instead of a concrete mx.array. No error was ever propagated to the caller.

Fix: track callback failure with a flag. After value_and_grad_func returns, check the flag and throw instead of proceeding with invalid results.

Reproducer (before fix):

    const vg = mx.valueAndGrad((w, x) => { throw new Error('oops'); });
    const [v, g] = vg(mx.array([1]), mx.array([2]));
    // v.constructor.name was 'Symbol'; should have thrown

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
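The flag-based fix follows a general pattern that can be shown without MLX or N-API. In this sketch RunTransform is a hypothetical stand-in for the transform that invokes the traced function, and CallChecked, Values and Fn are illustrative names rather than anything from node-mlx.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

using Values = std::vector<double>;
using Fn = std::function<Values(const Values&)>;

// Hypothetical stand-in for a transform that calls the traced function
// and is not prepared for an exception to cross it.
inline Values RunTransform(const Fn& traced, const Values& inputs) {
  return traced(inputs);
}

// Wrap the user callback: record a failure while tracing, then surface it
// once the transform has returned instead of using the empty result.
inline Values CallChecked(const Fn& user_fn, const Values& inputs) {
  bool failed = false;
  std::string message;
  Fn traced = [&](const Values& xs) -> Values {
    try {
      return user_fn(xs);
    } catch (const std::exception& e) {
      failed = true;
      message = e.what();
      return {};  // placeholder only; never handed to the caller
    }
  };
  Values out = RunTransform(traced, inputs);
  if (failed) throw std::runtime_error(message);
  return out;
}
```

Returning the placeholder is still needed so the transform can unwind normally; the flag is what prevents that placeholder from being mistaken for a real result.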

- Update deps/mlx to genmlx-rebased branch which includes:
  - 53 upstream commits (teardown fix, split-K matmul, etc.)
  - Library cleaner: Metal shader pipelines are released when compiled functions are erased from the compile cache
  - Custom ops (lgamma, digamma, bessel, vmap floor_divide fix)
- Export mx.detail.compile_clear_cache as compileClearCache in JS bindings, allowing explicit cleanup of all compiled function caches and their associated Metal resources.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Root cause: after mx.eval(), each array retains shared_ptr references to its inputs through the computation graph. Under Bun/JSC, the GC is non-deterministic and finalizers are deferred to the event loop. In synchronous code (which nbb/ClojureScript is), finalizers never fire, so Metal buffers accumulate monotonically: num_resources grew from 26 to 18,000+ in 60 seconds, eventually hitting the macOS 499K limit.

Fix: call array.detach() on evaluated arrays in Eval(). This severs the graph links (primitive + inputs), allowing parent arrays and their Metal buffers to be freed immediately. Safe because node-mlx manages gradients via separate valueAndGrad/grad transforms that trace their own graphs; the forward graph is never reused after eval.

Also:

- Expose getNumResources/getResourceLimit for Metal buffer monitoring
- Move SweepDeadArrays to shared header for cross-file access
- Update MLX submodule with resource tracking API

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
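The retention problem and the effect of detaching can be reproduced with plain shared_ptr. This is a toy model under stated assumptions, not MLX's array class: Node and Add are hypothetical, and the value is computed eagerly here whereas MLX evaluates lazily.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical graph node: holds its result plus strong references to the
// inputs that produced it.
struct Node {
  double value = 0.0;
  std::vector<std::shared_ptr<Node>> inputs;

  // Drop the links to the inputs; the computed value is kept.
  void detach() { inputs.clear(); }
};

// Build an output node that keeps both operands alive through `inputs`.
inline std::shared_ptr<Node> Add(const std::shared_ptr<Node>& a,
                                 const std::shared_ptr<Node>& b) {
  auto out = std::make_shared<Node>();
  out->value = a->value + b->value;
  out->inputs = {a, b};
  return out;
}
```

As long as the output node lives, every ancestor stays reachable even after the caller drops its own handles. Clearing the input list after evaluation lets the ancestors be destroyed as soon as nothing else refers to them, without waiting for a finalizer.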

Wraps mx::searchsorted in node-mlx NAPI bindings. TypeScript declaration added. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Robert Johansson and others added 15 commits February 22, 2026 21:08

Expose lgamma and digamma ops from MLX

e4aeb03

Update MLX submodule to include native lgamma/digamma kernels and add Node.js bindings for both operations. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add bun.lock and .cache/ to .gitignore

a189102

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Update MLX submodule: fix lgamma Metal kernel recursion bug

0fa8757

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Enable MLX_METAL_JIT=ON for runtime kernel compilation

fa13d95

Kernel .h changes now take effect via JIT source string regeneration without needing to manually delete .air/.metallib files. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

robert-johansson force-pushed the main branch from 0af732e to 24b674d Compare March 22, 2026 18:07

robert-johansson force-pushed the main branch from 9285eec to 6b43370 Compare March 22, 2026 19:34

Add searchsorted bindings for MLX searchsorted operation

d377618

Wraps mx::searchsorted in node-mlx NAPI bindings. TypeScript declaration added. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

robert-johansson wants to merge 17 commits into frost-beta:main from robert-johansson:main
