Mathematical abstraction in UM-Bridge
=====================================

In this section, we will describe UM-Bridge's interface mathematically.

Let :math:`\mathbf{F}` denote the numerical model that maps the model input vector :math:`\boldsymbol{\theta}`
to the output vector :math:`\mathbf{F}(\boldsymbol{\theta})`. We will use bold font to
indicate vectors. Note that both inputs and outputs are required to be a list of lists in the actual
implementation. For a list of :math:`d` input vectors, each with :math:`n` dimensions, we have

.. math::

   \mathbf{F}\, : \,
   \mathbb{R}^{n \times d}
   \;\longrightarrow\;
   \mathbb{R}^{m \times d}.

The arguments ``inWrt`` and ``outWrt`` in functions involving derivatives allow the user to select the
particular indices (out of the :math:`d` indices) with respect to which the derivative should be
evaluated. This will be clarified in the respective sections below.

Additionally, there may be an objective function :math:`L = L(\mathbf{F}(\boldsymbol{\theta}))`.

UM-Bridge allows the following four operations.

Model Evaluation
================

This is simply the so-called forward map that takes an element from the list of input vectors,
:math:`\boldsymbol{\theta} = (\theta_1, \ldots, \theta_n) \in \mathbb{R}^n`, and returns the model output,
:math:`\mathbf{F}(\boldsymbol{\theta}) = (F(\boldsymbol{\theta})_1, \ldots, F(\boldsymbol{\theta})_m) \in \mathbb{R}^m`.

For a collection of :math:`d` input vectors, each of dimension :math:`n`, this follows the definition as before:

.. math::

   \mathbf{F} : \mathbb{R}^{n \times d} \; \longrightarrow \; \mathbb{R}^{m \times d}.

In practice, both inputs and outputs are expected as lists of lists.
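
As a concrete illustration, the following sketch evaluates a hypothetical model
:math:`\mathbf{F} : \mathbb{R}^2 \to \mathbb{R}^3` (so :math:`n = 2`, :math:`m = 3`) on a list of
:math:`d = 2` input vectors. The model itself is invented for illustration and is not part of UM-Bridge.

```python
# Hypothetical model F(theta) = (theta_1 + theta_2, theta_1 * theta_2, theta_1^2),
# mapping R^2 -> R^3, i.e. n = 2 and m = 3.
def F(theta):
    t1, t2 = theta
    return [t1 + t2, t1 * t2, t1 ** 2]

# A list of d = 2 input vectors, each of dimension n = 2 (a list of lists).
inputs = [[1.0, 2.0], [3.0, 4.0]]

# Evaluating the model entry-wise yields a list of d = 2 output vectors,
# each of dimension m = 3.
outputs = [F(theta) for theta in inputs]
print(outputs)  # [[3.0, 2.0, 1.0], [7.0, 12.0, 9.0]]
```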


Gradient of the objective function
==================================

The gradient function evaluates the sensitivity of the scalar objective. Using the chain rule:

.. math::
   :name: eq:1

   \nabla_{\boldsymbol{\theta}}L
   = \left(\frac{\partial \mathbf{F}}{\partial \boldsymbol{\theta}}\right)^{\!\top}
   \boldsymbol{\lambda},
   \qquad
   \boldsymbol{\lambda} = \frac{\partial L}{\partial \mathbf{F}},

where :math:`\boldsymbol{\lambda}` is known as the sensitivity vector and
:math:`\dfrac{\partial \mathbf{F}}{\partial \boldsymbol{\theta}}` is the Jacobian of the
forward map.

Since the inputs and outputs are lists of lists, there are multiple possible choices: we can select a
specific vector from the input list (:math:`\boldsymbol{\theta}_i \in \mathbb{R}^n`) and from the output
list (:math:`\mathbf{F}_j \in \mathbb{R}^m`). These indices are chosen using ``inWrt`` and ``outWrt``,
respectively, in the implementation.

So :ref:`(1) <eq:1>` becomes

.. math::

   \nabla_{\boldsymbol{\theta}_i}L
   = \left( \dfrac{\partial \mathbf{F}_j}{\partial \boldsymbol{\theta}_i} \right)^{\!\top}
   \boldsymbol{\lambda}_j,
   \qquad
   \boldsymbol{\lambda}_j = \dfrac{\partial L}{\partial \mathbf{F}_j},

where :math:`\boldsymbol{\lambda}_j` is the ``sens`` argument in the code.

The output of this operation is a vector because we are essentially doing a matrix-vector product.
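
The chain rule above can be checked with a short numpy sketch. The model and its analytic Jacobian
below are hypothetical stand-ins, not part of the UM-Bridge API.

```python
import numpy as np

# Hypothetical model F: R^2 -> R^3 and its analytic Jacobian J[k, l] = dF_k/dtheta_l.
def F(theta):
    t1, t2 = theta
    return np.array([t1 + t2, t1 * t2, t1 ** 2])

def jacobian(theta):
    t1, t2 = theta
    return np.array([[1.0, 1.0],
                     [t2,  t1],
                     [2.0 * t1, 0.0]])   # shape m x n = 3 x 2

theta = np.array([1.0, 2.0])

# Sensitivity vector lambda = dL/dF; e.g. for the objective L = F_1 it is e_1.
sens = np.array([1.0, 0.0, 0.0])

# Gradient of L with respect to theta: (dF/dtheta)^T lambda, an n-vector.
grad = jacobian(theta).T @ sens
print(grad)  # [1. 1.]
```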

Applying Jacobian to a vector
=============================

The apply Jacobian function evaluates the product of the transpose of the model's Jacobian, :math:`J^{\top}`, and a
vector :math:`\mathbf{v}` of the user's choice (``vec``). The Jacobian of a vector-valued function
is given by

.. math::

   J =
   \frac{\partial \mathbf{F}}{\partial \boldsymbol{\theta}} =
   \left[
   \begin{array}{ccc}
   \dfrac{\partial \mathbf{F}}{\partial \theta_1} & \cdots & \dfrac{\partial \mathbf{F}}{\partial \theta_n}
   \end{array}
   \right] =
   \begin{pmatrix}
   \dfrac{\partial F_{1}}{\partial \theta_{1}} & \cdots &
   \dfrac{\partial F_{1}}{\partial \theta_{n}} \\[12pt]
   \vdots & \ddots & \vdots \\[4pt]
   \dfrac{\partial F_{m}}{\partial \theta_{1}} & \cdots &
   \dfrac{\partial F_{m}}{\partial \theta_{n}}
   \end{pmatrix}
   \in \mathbb{R}^{m \times n}.


For a chosen :math:`\mathbf{v} \in \mathbb{R}^{m}`, this is simply

.. math::

   \texttt{output}
   = J^{\!\top}\,\mathbf{v}
   = \left( \dfrac{\partial \mathbf{F}}{\partial \boldsymbol{\theta}} \right)^{\!\top} \mathbf{v}.

Additionally, we can use this to express the gradient function by setting
:math:`\mathbf{v} = \boldsymbol{\lambda}`, as mentioned before.

However, as before, we can choose an index each from the input and output lists to construct the
Jacobian block :math:`J_{ji} = \frac{\partial \mathbf{F}_j}{\partial \boldsymbol{\theta}_i}`. The
output of this action is then

.. math::

   \texttt{output}
   = J_{ji}^{\!\top}\,\mathbf{v}
   = \left( \dfrac{\partial \mathbf{F}_j}{\partial \boldsymbol{\theta}_i} \right)^{\!\top} \mathbf{v},

where the :math:`i^{th}` and :math:`j^{th}` indices correspond to ``inWrt`` and ``outWrt``, respectively.
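
A minimal numpy sketch of this action, reusing the hypothetical :math:`3 \times 2` Jacobian from the
gradient example (``vec`` plays the role of :math:`\mathbf{v}`):

```python
import numpy as np

# Hypothetical Jacobian of F: R^2 -> R^3 evaluated at theta = (1, 2),
# J[k, l] = dF_k/dtheta_l, shape m x n = 3 x 2.
J = np.array([[1.0, 1.0],
              [2.0, 1.0],
              [2.0, 0.0]])

vec = np.array([1.0, 1.0, 1.0])   # a chosen v in R^m

# Action of the transposed Jacobian on vec: an n-vector.
output = J.T @ vec
print(output)  # [5. 2.]
```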

Applying Hessian to a vector
============================

The apply Hessian action is a combination of the previous two sections: the action is still a matrix-vector product, but
the matrix is the Hessian of an objective function. The Hessian, :math:`H`, is given by

.. math::

   H =
   \frac{\partial^2 L}{\partial \boldsymbol{\theta}\,\partial \boldsymbol{\theta}}
   = \frac{\partial}{\partial \boldsymbol{\theta}}
   \left(
   \frac{\partial \mathbf{F}}{\partial \boldsymbol{\theta}}
   \right)^{\!\top}
   \boldsymbol{\lambda} =
   \begin{bmatrix}
   \dfrac{\partial^2 L}{\partial \theta_1^2} & \dfrac{\partial^2 L}{\partial \theta_1 \partial \theta_2} & \cdots & \dfrac{\partial^2 L}{\partial \theta_1 \partial \theta_n} \\[18pt]
   \dfrac{\partial^2 L}{\partial \theta_2 \partial \theta_1} & \dfrac{\partial^2 L}{\partial \theta_2^2} & \cdots & \dfrac{\partial^2 L}{\partial \theta_2 \partial \theta_n} \\[18pt]
   \vdots & \vdots & \ddots & \vdots \\[6pt]
   \dfrac{\partial^2 L}{\partial \theta_n \partial \theta_1} & \dfrac{\partial^2 L}{\partial \theta_n \partial \theta_2} & \cdots & \dfrac{\partial^2 L}{\partial \theta_n^2}
   \end{bmatrix},

where :math:`L` is the objective function and :math:`\boldsymbol{\lambda}` is the sensitivity vector as defined previously.

So the product of :math:`H` and the chosen vector (of size :math:`n`) can be written as

.. math::

   H\,\mathbf{v}
   = \dfrac{\partial^2 L}{\partial \boldsymbol{\theta}\,\partial \boldsymbol{\theta}}\,\mathbf{v} =
   \left[\dfrac{\partial}{\partial \boldsymbol{\theta}}
   \left(
   \dfrac{\partial \mathbf{F}}{\partial \boldsymbol{\theta}}
   \right)^{\!\top}
   \boldsymbol{\lambda}\right]\,\mathbf{v}.

As in the apply Jacobian action, we can select certain indices from the lists of lists to construct the Hessian.
Since :math:`H` contains the second derivative of :math:`L`, we require two indices from the input,
``inWrt1`` and ``inWrt2``, as well as an output index ``outWrt``. The output of this action is

.. math::

   \texttt{output} =
   \left( \dfrac{\partial}{\partial \boldsymbol{\theta}_i}
   \left[ \left( \dfrac{\partial \mathbf{F}_k}{\partial \boldsymbol{\theta}_j} \right)^{\!\top} \boldsymbol{\lambda}_k \right] \right)
   \mathbf{v}.
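
As an illustration, the Hessian action can be approximated by a finite difference of the gradient map.
The model :math:`\mathbf{F}` and the objective :math:`L(\boldsymbol{\theta}) = \tfrac{1}{2}\lVert \mathbf{F}(\boldsymbol{\theta}) \rVert^2`
below are hypothetical, chosen only so that :math:`\boldsymbol{\lambda} = \mathbf{F}(\boldsymbol{\theta})`.

```python
import numpy as np

# Hypothetical model F: R^2 -> R^3 and objective L = 0.5 * ||F(theta)||^2,
# so that lambda = dL/dF = F(theta).
def F(theta):
    t1, t2 = theta
    return np.array([t1 + t2, t1 * t2, t1 ** 2])

def grad_L(theta):
    t1, t2 = theta
    J = np.array([[1.0, 1.0],
                  [t2,  t1],
                  [2.0 * t1, 0.0]])   # m x n Jacobian dF/dtheta
    return J.T @ F(theta)             # (dF/dtheta)^T lambda

# Hessian-vector product approximated by a forward difference of the gradient:
# H v ~= (grad L(theta + eps v) - grad L(theta)) / eps.
def apply_hessian(theta, v, eps=1e-6):
    return (grad_L(theta + eps * v) - grad_L(theta)) / eps

theta = np.array([1.0, 2.0])
v = np.array([1.0, 0.0])
print(apply_hessian(theta, v))  # close to the analytic value [11, 5]
```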

