Add residual-based posterior predictive diagnostics for individual-level antibody trajectories

### Background and motivation

In recent discussions, we identified a need to quantify how well posterior predicted trajectories match observed antibody measurements at the individual level, instead of relying on subjective visual comparisons (“it looks better”).

The package already provides `plot_predicted_curve()` to overlay:

- observed data points,

- posterior-predicted median curves with 95% credible intervals, and

- optionally, all posterior sample curves.

However, there is currently no built-in way to compute residual-based metrics that summarize the distance between observed measurements and model predictions. These metrics are useful because they:

- provide an objective summary of fit for each individual and biomarker,

- can be reported in Results/Supplemental tables,

- support routine posterior predictive evaluation in a modern Bayesian workflow,

- are general and can be applied to any dataset fit using `run_mod()`.

The goal is to compute residuals using the posterior median predicted trajectory evaluated at the observed timepoints (not on a dense grid), and then summarize residual magnitude across observations.

------------------------------------------------

### Scope of work

Add utilities that:

**1. Extract posterior predicted values at observed timepoints**

- Use the same posterior draws and the same within-host model function used in `plot_predicted_curve()` (via `ab()`).

- Evaluate predictions at the dataset’s observed visit times for each ID and antigen–isotype.

- Compute posterior summaries at each observed timepoint:

    - median prediction (required)

    - 2.5% and 97.5% quantiles (optional, for context)
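
A minimal base-R sketch of this step, for discussion only. `curve_fun()` is a hypothetical stand-in for the within-host model function (the real `ab()` signature is not reproduced here), and the draws, parameter names, and visit times are toy values, not package data. The point is that each posterior draw is evaluated at the observed times only, and the summaries are then taken across draws.

```r
# Hypothetical stand-in for the within-host model; NOT the package's ab().
# A simple rise-then-decay curve, used only to make the sketch runnable.
curve_fun <- function(t, y0, y1, t1, alpha) {
  ifelse(t <= t1,
         y0 + (y1 - y0) * t / t1,        # linear rise to the peak at t1
         y1 * exp(-alpha * (t - t1)))    # exponential decay after the peak
}

# Toy posterior draws (one row per draw); parameter names are illustrative.
set.seed(1)
n_draws <- 200
draws <- data.frame(
  y0    = rlnorm(n_draws, log(1), 0.1),
  y1    = rlnorm(n_draws, log(100), 0.1),
  t1    = rlnorm(n_draws, log(10), 0.1),
  alpha = rlnorm(n_draws, log(0.01), 0.1)
)

# Observed visit times for one id x antigen_iso (toy values)
obs_times <- c(0, 14, 90, 180)

# Matrix of predictions: rows = posterior draws, columns = observed timepoints
pred <- vapply(
  obs_times,
  function(t) curve_fun(t, draws$y0, draws$y1, draws$t1, draws$alpha),
  numeric(n_draws)
)

# Posterior summaries at each observed timepoint
pred_summary <- data.frame(
  t        = obs_times,
  pred_med = apply(pred, 2, median),
  pred_lo  = apply(pred, 2, quantile, probs = 0.025),
  pred_hi  = apply(pred, 2, quantile, probs = 0.975)
)
```

Keeping the full draws-by-timepoints matrix around (rather than only the median) would let the same helper serve both the plotting function and the residual function.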

**2. Compute residuals between observed and predicted values**

At each observed timepoint, compute residuals relative to the posterior median prediction:

- raw residual: `obs - pred_med`

- absolute residual: `abs(obs - pred_med)`

- squared residual: `(obs - pred_med)^2`

Support a `scale` option to compute residuals on:

- **original scale** (e.g., ELISA units or MFI)

- **log scale** (recommended when the model is fit on log measurements), i.e. `log(obs) - log(pred_med)`, with explicit handling of non-positive values (rejected or flagged, because their logarithm is not finite).
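
One possible shape for the residual computation, as a hedged sketch. The helper name `pointwise_residuals()`, its arguments, and the choice to flag non-positive values as `NA` with a warning (rather than stopping with an error) are assumptions for illustration, not a settled design.

```r
# Hypothetical helper; name, arguments, and NA-flagging policy are illustrative.
pointwise_residuals <- function(obs, pred_med, scale = c("original", "log")) {
  scale <- match.arg(scale)
  stopifnot(length(obs) == length(pred_med))

  if (scale == "log") {
    # Take logs only where the value is strictly positive; otherwise NA
    safe_log <- function(x) {
      out <- rep(NA_real_, length(x))
      ok <- !is.na(x) & x > 0
      out[ok] <- log(x[ok])
      out
    }
    bad <- (!is.na(obs) & obs <= 0) | (!is.na(pred_med) & pred_med <= 0)
    if (any(bad)) {
      warning(sum(bad), " observation(s) with non-positive values set to NA")
    }
    obs <- safe_log(obs)
    pred_med <- safe_log(pred_med)
  }

  res <- obs - pred_med
  data.frame(
    residual     = res,        # raw residual
    abs_residual = abs(res),   # absolute residual
    sq_residual  = res^2       # squared residual
  )
}

# Example: log-scale residuals for two toy observations
pointwise_residuals(obs = c(10, 100), pred_med = c(20, 50), scale = "log")
```

Flagging with `NA` keeps the output at one row per observation, which makes the dimensional-consistency check straightforward; a stricter variant could call `stop()` instead.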

**3. Provide summary metrics by individual and biomarker**

Return tidy summaries that can be used directly in reports/tables, such as:

- `MAE` (mean absolute error)

- `RMSE` (root mean squared error)

- `SSE` (sum of squared errors)

- `n_obs` (number of observations used)

Support multiple summary levels:

- pointwise residual output (one row per observation)

- per `id × antigen_iso`

- per `antigen_iso` (aggregated across IDs)

- overall (single summary)
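
The summary levels above could be produced by one grouping helper, sketched here in base R. `summarize_residuals()` and its `by` argument are hypothetical names, the input table is toy data, and dropping non-finite residuals before summarizing is an assumption about how missing values should be handled.

```r
# Toy pointwise residual table (one row per observation); values are made up.
pointwise <- data.frame(
  id          = c("a", "a", "a", "b", "b", "b"),
  antigen_iso = c("ag1_IgG", "ag1_IgG", "ag1_IgA",
                  "ag1_IgG", "ag1_IgG", "ag1_IgA"),
  residual    = c(1, -3, 2, NA, 4, -2)
)

# Hypothetical helper: summarize residuals within groups defined by `by`.
summarize_residuals <- function(df, by = character(0)) {
  # Assumption: rows with NA or non-finite residuals are excluded from n_obs
  df <- df[is.finite(df$residual), , drop = FALSE]

  metrics <- function(d) {
    data.frame(
      MAE   = mean(abs(d$residual)),
      RMSE  = sqrt(mean(d$residual^2)),
      SSE   = sum(d$residual^2),
      n_obs = nrow(d)
    )
  }

  if (length(by) == 0) return(metrics(df))   # overall (single summary)

  groups <- split(df, df[by], drop = TRUE)   # one data frame per group
  keys <- do.call(rbind, lapply(groups, function(d) d[1, by, drop = FALSE]))
  out <- cbind(keys, do.call(rbind, lapply(groups, metrics)))
  rownames(out) <- NULL
  out
}

by_id_iso <- summarize_residuals(pointwise, by = c("id", "antigen_iso"))
by_iso    <- summarize_residuals(pointwise, by = "antigen_iso")
overall   <- summarize_residuals(pointwise)
```

Each result is a plain data frame with one row per group, so it can be passed directly to a table-formatting function for reporting.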

**4. Integrate cleanly with the existing plotting workflow**

Do not overload plotting with diagnostics. The workflow should be:

- `plot_predicted_curve()` for visualization

- a separate residual-metric function for quantitative evaluation

However, to avoid duplicated computation, implement a shared internal helper that both functions can reuse to produce posterior predictions.

**5. Provide documentation + examples using package data**

Use `serodynamics::nepal_sees` and `serodynamics::nepal_sees_jags_output` to demonstrate:

- computing residual metrics for one ID and one biomarker,

- computing residual summaries across multiple IDs (faceted use-case),

- returning a tidy table for reporting.

-----------------------------------------------

### Required additions

- **New exported function(s)** (suggested names; final naming can be adjusted to package conventions)

    - `compute_residual_metrics()`: computes pointwise residuals and/or summarized metrics.

    - (Optional internal helper) `predict_posterior_at_times()`: shared logic extracted from `plot_predicted_curve()` to compute posterior median predictions at arbitrary times.

- **Unit tests** (use `serodynamics::nepal_sees` + `serodynamics::nepal_sees_jags_output`)

    - Dimensional consistency:

        - residual rows match the number of observed rows after filtering by `ids` and `antigen_iso`

    - Output validity:

        - no missing columns

        - summaries return expected fields (`MAE`, `RMSE`, `SSE`, `n_obs`)

    - Robust handling:

        - works with single ID and multiple IDs

        - gracefully handles missing values and non-finite predictions

        - log-scale residuals reject or flag non-positive values

- **roxygen2 documentation**

    - Full docs for the new residual function(s)

    - Examples in the external example file format

    - Explicitly state expected dataset columns and supported scales

- **Tutorial / vignette update**

    - Add a short section: “Evaluating individual-level fit using residual metrics”

    - Demonstrate:

        1. `plot_predicted_curve()` overlay

        2. `compute_residual_metrics()` summary table output

    - Emphasize this as a recommended posterior predictive diagnostic step after fitting models with `run_mod()`.
