The goal of the Mooncake.jl project is to produce an AD package written entirely in Julia that improves on ForwardDiff.jl, ReverseDiff.jl, and Zygote.jl in several ways.
Please refer to the docs for more info.
Important
Mooncake.jl is maintained primarily by academic researchers at grant-funded
institutions, with correspondingly limited capacity for triage and review. In the spirit of
long-lived projects such as R and TeX, we favour correctness, stability,
and tightly scoped fixes over open-ended expansion.
Contributions are most welcome when they concern reproducible defects: incorrect results, unexpected failures, or behaviour at odds with the documented scope. Feature requests, redesign proposals, and debugging queries lacking a minimal reproducible example sit outside what we can reasonably support, as do requests for rules beyond Julia Base, or for behaviour noted on the known limitations page; such issues will generally be closed.
Organisations relying on Mooncake.jl commercially are warmly encouraged to contribute, whether through focused code contributions, financial support, or both.
Check that you're running a version of Julia that Mooncake.jl supports.
See the SUPPORT_POLICY.md file for more info.
There are several ways to interact with Mooncake.jl. To interact directly with Mooncake.jl, use Mooncake's native API, which allows reuse of prepared caches for repeated gradient and Hessian evaluations:
import Mooncake as MC
f(x) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2 # Rosenbrock
x = [1.2, 1.2]
# Reverse mode
grad_cache = MC.prepare_gradient_cache(f, x);
val, grad = MC.value_and_gradient!!(grad_cache, f, x)
# Forward mode
fwd_cache = MC.prepare_derivative_cache(f, x);
val_fwd, grad_fwd = MC.value_and_gradient!!(fwd_cache, f, x)
# Hessian
hess_cache = MC.prepare_hessian_cache(f, x);
val, grad, H = MC.value_gradient_and_hessian!!(hess_cache, f, x)
# val : f(x)
# grad : ∇f(x) (length-n vector)
# H : ∇²f(x) (n×n matrix)You should expect that MC.prepare_*_cache take a little time to run, but that subsequent gradient and hessian calls using the prepared caches are fast. For details, see the interface docs.
