Motivation
Suppose that for a set of parameters $x$, the equation $F(x, y) = 0$ defines $y(x)$ implicitly. E.g. $x$ could be the parameters of a problem that we approximate numerically, and $y$ the parameters of an approximation that we obtain numerically (rootfinding etc.). Given data $d$, the likelihood is defined as $\ell(d \mid x, y)$.
In principle, one could of course solve from scratch for the $y$ that belongs to each $x$. But this may be expensive and brittle, and if
$$
x_2 = x_1 + \Delta
$$
then
$$
\hat{y}_2 = y_1 + \frac{\partial y}{\partial x} \Delta
$$
with $y_1 = y(x_1)$ already known and the derivative evaluated at $x_1$, would be a good initial guess for $y_2 = y(x_2)$.
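For reference, the derivative $\partial y / \partial x$ used in this guess does not require differentiating through the solver. Assuming $F$ is continuously differentiable and $\partial F / \partial y$ is invertible at $(x_1, y_1)$ (assumptions not stated above), differentiating $F(x, y(x)) = 0$ with respect to $x$ gives
$$
\frac{\partial F}{\partial x} + \frac{\partial F}{\partial y} \frac{\partial y}{\partial x} = 0
\quad \Longrightarrow \quad
\frac{\partial y}{\partial x} = -\left( \frac{\partial F}{\partial y} \right)^{-1} \frac{\partial F}{\partial x}
$$
by the implicit function theorem.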
Ideally, "users" like Turing.jl and DynamicHMC.jl should be able to ignore the details of these things and just carry on doing HMC/NUTS/etc with minimal changes.
Proposal: allow coordinates to be opaque
I propose an addition to the API composed of 3 functions, with the following fallbacks:
lift(ℓ, x::AbstractVector) = x
unlift(ℓ, x::AbstractVector) = x
translate(ℓ, x::AbstractVector, Δ::AbstractVector) = x .+ Δ
Specifically,
- "users" would call `lift` when generating random points for starting Markov chains, and in similar situations. Otherwise they would use `translate`,
- similarly, `unlift` would be called when coordinates are needed (e.g. turn statistics),
- leapfrog and RWMH steps would use `translate`,
- otherwise the result of `lift` and the `x` arguments of `logdensity`, `logdensity_and_gradient`, `translate`, `unlift` are allowed to be opaque objects, not an `::AbstractVector` of real numbers. Nevertheless, `logdensity_and_gradient` should provide a valid gradient of `x -> logdensity(ℓ, lift(ℓ, x))`, but how that is done is up to the implementation of `ℓ`.
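To make the division of labor concrete, here is a minimal sketch of how the three functions might look for a toy implicit problem, together with an RWMH step that treats the lifted point as opaque. Everything in it is hypothetical and for illustration only: the `ImplicitProblem` and `Lifted` types, the equation $F(x, y) = y^3 + y - x$, the helper `solve_y`, the toy `logdensity`, and the `rwmh_step` / `rwmh` functions are not part of any existing package, and the functions are defined locally rather than extending a real API.

```julia
using Random

# Hypothetical toy problem: y(x) is defined implicitly by F(x, y) = y^3 + y - x = 0,
# and the log density depends on x only through y(x).
struct ImplicitProblem end

# Opaque lifted point: the coordinates x plus the cached solution y(x).
struct Lifted
    x::Vector{Float64}
    y::Float64
end

F(x, y) = y^3 + y - x
dFdy(y) = 3y^2 + 1          # strictly positive, so y(x) is unique
const dFdx = -1.0

# Newton iteration for F(x, y) = 0 in y, started from the guess y0.
function solve_y(x, y0; tol = 1e-12, maxiter = 50)
    y = y0
    for _ in 1:maxiter
        dy = F(x, y) / dFdy(y)
        y -= dy
        abs(dy) < tol && return y
    end
    error("Newton iteration did not converge")
end

# Cold start: no previous solution is available, so start Newton from 0.
lift(::ImplicitProblem, x::AbstractVector) = Lifted(float.(x), solve_y(x[1], 0.0))
unlift(::ImplicitProblem, p::Lifted) = p.x

# Warm start: predict y at the new x to first order, then correct with Newton.
function translate(::ImplicitProblem, p::Lifted, Δ::AbstractVector)
    x2 = p.x .+ Δ
    dydx = -dFdx / dFdy(p.y)                      # implicit function theorem
    Lifted(x2, solve_y(x2[1], p.y + dydx * Δ[1]))
end

# Toy log density: standard normal in y(x), up to an additive constant.
logdensity(::ImplicitProblem, p::Lifted) = -p.y^2 / 2

# One RWMH step; the sampler only touches `p` through translate and logdensity.
function rwmh_step(rng, ℓ, p, σ, n)
    q = translate(ℓ, p, σ .* randn(rng, n))
    log(rand(rng)) < logdensity(ℓ, q) - logdensity(ℓ, p) ? q : p
end

# Run a short chain from x0 and return the final coordinates.
function rwmh(rng, ℓ, x0, σ, steps)
    p = lift(ℓ, x0)
    for _ in 1:steps
        p = rwmh_step(rng, ℓ, p, σ, length(x0))
    end
    unlift(ℓ, p)
end
```

A call such as `rwmh(MersenneTwister(1), ImplicitProblem(), [2.0], 0.5, 100)` runs the chain without the sampler ever inspecting `Lifted`. With the fallbacks, the same `rwmh_step` would work unchanged on plain vectors, which is the point of keeping the coordinates opaque.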
Bikeshedding names is appreciated 😉, also alternative API suggestions.
How this meshes with AD
This is a bit tricky and I don't yet have a good API in mind. Related work is in