Discuss deviance calculation with replicates; clarify saturation vs. full parametrization by Copilot · Pull Request #428 · d-morrison/rme

Copilot · 2026-04-10T08:26:35Z

…hy chi^2 GOF test fails for ungrouped data Agent-Logs-Url: https://github.com/d-morrison/rme/sessions/ec68eba3-8c76-4f8c-a9fd-422947591440 Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>

… size; add note about grouped data workaround Agent-Logs-Url: https://github.com/d-morrison/rme/sessions/c5a6c684-cb62-4fb7-9c4c-a0b832f21613 Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>

d-morrison · 2026-04-10T10:12:45Z

@copilot let's refer readers to https://stats.stackexchange.com/questions/626597/is-there-a-justification-for-the-bernoulli-deviance-in-the-r-stats-package and https://link.springer.com/chapter/10.1007/978-1-4419-0118-7_9#Sec10 for more information

Copilot · 2026-04-10T10:17:50Z

> @copilot let's refer readers to https://stats.stackexchange.com/questions/626597/is-there-a-justification-for-the-bernoulli-deviance-in-the-r-stats-package and https://link.springer.com/chapter/10.100...

Added in commit 86de9cb. The callout now ends with:

For further discussion, see @dunn2018generalized [Chapter 9] and this Stats Stack Exchange thread.

The Springer link (978-1-4419-0118-7) corresponds to Dunn & Smyth (2018) "Generalized Linear Models with Examples in R", which was already in references.bib as dunn2018generalized.

github-actions · 2026-04-10T19:21:57Z

PR Preview Action v1.8.1
🚀 Preview available at: https://d-morrison.github.io/rme/pr-preview/pr-428/

Built to branch gh-pages at 2026-04-10 19:21 UTC.

Copilot

Pull request overview

Clarifies how deviance is computed/interpreted when covariate patterns repeat (replicates), with emphasis on the distinction between saturated vs. fully parametrized reference models and how R’s deviance() behaves for grouped vs. ungrouped Binomial/Bernoulli data.

Changes:

- Refines the Gaussian deviance discussion to be in terms of distinct covariate patterns and adds a new subsection contrasting saturated vs. fully parametrized models under replicates.
- Adds narrative + R examples demonstrating `deviance(fit) == -2*logLik(fit)` for ungrouped Bernoulli data with repeated patterns, and explains implications for GOF testing.
- Extends `inst/WORDLIST` with additional technical terms/acronyms.
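
As a minimal sketch of the equality named in the second change above, the following R snippet fits an ungrouped Bernoulli model to a small made-up data set with repeated covariate patterns and compares `deviance(fit)` with `-2 * logLik(fit)`. The data and the names `x`, `y`, `fit`, `dev_r`, and `neg2ll` are illustrative assumptions, not the examples added in the PR.

```r
# Hypothetical toy data (not from the PR):
# n = 12 observations, but only q = 3 distinct covariate patterns.
x <- rep(c(0, 1, 2), each = 4)
y <- c(0, 0, 1, 0,  0, 1, 1, 0,  1, 1, 0, 1)

# Ungrouped fit: one row per Bernoulli observation.
fit <- glm(y ~ x, family = binomial)

# Deviance as reported by R, and -2 times the maximized log-likelihood.
dev_r  <- deviance(fit)
neg2ll <- -2 * as.numeric(logLik(fit))

# For a 0/1 response these agree up to numerical tolerance,
# even though the covariate patterns repeat.
all.equal(dev_r, neg2ll)
```

The comparison uses `all.equal()` rather than `==` because the two values are computed along different code paths and may differ in the last few floating-point digits.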

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| `_sec_compare_lm_glm_deviance.qmd` | Adds the replicates-focused explanation and examples; updates the summary table to reflect the clarified deviance reference. |
| `inst/WORDLIST` | Adds new allowlisted technical tokens used by the updated prose/math. |

Copilot · 2026-04-11T00:21:13Z

+fp
+frac
+github
+GOF
+infty
+leq


inst/WORDLIST already contains GitHub, but this change adds github (lowercase) and also introduces duplicate entries for frac/github later in the file. Duplicates/inconsistent casing can make spellcheck behavior harder to reason about and may allow incorrect capitalization in prose; consider removing the duplicates and keeping only the preferred casing (e.g., GitHub).

Copilot · 2026-04-11T00:21:13Z

+By convention, terms with $y_k = 0$ or $y_k = n_k$ contribute zero to the sum,
+since $0\log(0) = 0$ in the limit.


The note says patterns with $y_k = 0$ or $y_k = n_k$ “contribute zero to the sum,” but in the Binomial deviance expression only the $0 \cdot \log(0)$ component is taken as 0; the other term generally remains nonzero (e.g., if $y_k = 0$, the $(n_k - y_k)\log\left(\frac{1 - y_k/n_k}{1 - \hat\pi_k}\right)$ term is still present). Reword to avoid implying the entire bracketed contribution is 0.

Suggested change

-By convention, terms with $y_k = 0$ or $y_k = n_k$ contribute zero to the sum,
-since $0\log(0) = 0$ in the limit.
+By convention,
+when $y_k = 0$ or $y_k = n_k$,
+the corresponding component of the form $0\log(0)$ is taken to equal 0
+in the limiting sense.
+The other component in the bracket generally remains nonzero
+and still contributes to the sum.
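
A short worked version of the boundary case may help here. It assumes the usual per-pattern Binomial deviance contribution, written $d_k$ below; the symbol $d_k$ and the exact bracket layout are assumptions, since the full formula from the `.qmd` file is not quoted in this thread.

```latex
\begin{aligned}
d_k &= 2\left[ y_k \log\frac{y_k}{n_k \hat\pi_k}
      + (n_k - y_k)\log\frac{n_k - y_k}{n_k (1 - \hat\pi_k)} \right] \\
y_k = 0:\quad d_k &= 2\left[ 0 + n_k \log\frac{1}{1 - \hat\pi_k} \right]
      = -2\, n_k \log(1 - \hat\pi_k) \\
y_k = n_k:\quad d_k &= 2\left[ n_k \log\frac{1}{\hat\pi_k} + 0 \right]
      = -2\, n_k \log \hat\pi_k
\end{aligned}
```

In both boundary cases the surviving term is strictly positive whenever $0 < \hat\pi_k < 1$, so the pattern still contributes to the sum.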

Copilot · 2026-04-11T00:21:13Z

+When patterns do repeat ($q < n$), the saturated model sets $\hat\pi_k = y_k/n_k$ per pattern,
+giving $\ell_{\text{sat}} < 0$.
+
+`deviance()` for ungrouped data **cannot** be used as a goodness-of-fit test
+against the $\chi^2$ distribution when $q < n$.
+The correct GOF statistic is $2(\ell_{\text{sat}} - \ell(\hat\beta))$,
+but R's `deviance()` for ungrouped data returns $-2\ell(\hat\beta)$ (using $\ell_{\text{fp}} = 0$).
+These two quantities differ by $-2\ell_{\text{sat}} > 0$ whenever $q < n$.


“When patterns do repeat ($q < n$) … giving $\ell_{\text{sat}} < 0$” isn’t always true: with replicates you can still have $\ell_{\text{sat}} = 0$ if every repeated pattern is all-0 or all-1 (i.e., each pattern has $y_k \in \{0, n_k\}$, so the Binomial pmf at $\hat\pi_k = y_k/n_k$ is 1). Consider qualifying this as “$\ell_{\text{sat}} < 0$ whenever at least one pattern has both outcomes ($0 < y_k < n_k$).”

Suggested change

-When patterns do repeat ($q < n$), the saturated model sets $\hat\pi_k = y_k/n_k$ per pattern,
-giving $\ell_{\text{sat}} < 0$.
-
-`deviance()` for ungrouped data **cannot** be used as a goodness-of-fit test
-against the $\chi^2$ distribution when $q < n$.
-The correct GOF statistic is $2(\ell_{\text{sat}} - \ell(\hat\beta))$,
-but R's `deviance()` for ungrouped data returns $-2\ell(\hat\beta)$ (using $\ell_{\text{fp}} = 0$).
-These two quantities differ by $-2\ell_{\text{sat}} > 0$ whenever $q < n$.
+When patterns do repeat ($q < n$), the saturated model sets $\hat\pi_k = y_k/n_k$ per pattern.
+If at least one pattern has both outcomes
+($0 < y_k < n_k$),
+then $\ell_{\text{sat}} < 0$.
+
+`deviance()` for ungrouped data **cannot** be used as a goodness-of-fit test
+against the $\chi^2$ distribution when $q < n$.
+The correct GOF statistic is $2(\ell_{\text{sat}} - \ell(\hat\beta))$,
+but R's `deviance()` for ungrouped data returns $-2\ell(\hat\beta)$ (using $\ell_{\text{fp}} = 0$).
+These two quantities differ by $-2\ell_{\text{sat}} > 0$
+whenever at least one repeated pattern has both outcomes
+($0 < y_k < n_k$).
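
The qualification above can be checked directly from the saturated log-likelihood. The expression below is a sketch that assumes the ungrouped Bernoulli form of the likelihood (no binomial coefficients), with $y_k$ successes out of $n_k$ observations in pattern $k$ and the $0\log(0) = 0$ convention:

```latex
\begin{aligned}
\ell_{\text{sat}}
  &= \sum_{k=1}^{q} \left[ y_k \log\frac{y_k}{n_k}
     + (n_k - y_k)\log\left(1 - \frac{y_k}{n_k}\right) \right] \le 0, \\
-2\ell(\hat\beta) - 2\left(\ell_{\text{sat}} - \ell(\hat\beta)\right)
  &= -2\ell_{\text{sat}} \ge 0 .
\end{aligned}
```

Each summand is zero exactly when $y_k \in \{0, n_k\}$ and negative otherwise. For instance, a pattern with $n_k = 4$ and $y_k = 2$ contributes $2\log\tfrac12 + 2\log\tfrac12 = -4\log 2 \approx -2.77$, so a single mixed pattern is enough to make $\ell_{\text{sat}} < 0$.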

Copilot · 2026-04-11T00:21:14Z

+`deviance()` for ungrouped data **cannot** be used as a goodness-of-fit test
+against the $\chi^2$ distribution when $q < n$.
+The correct GOF statistic is $2(\ell_{\text{sat}} - \ell(\hat\beta))$,
+but R's `deviance()` for ungrouped data returns $-2\ell(\hat\beta)$ (using $\ell_{\text{fp}} = 0$).


The claim that ungrouped `deviance()` “cannot be used as a goodness-of-fit test against the $\chi^2$ distribution when $q < n$” is too broad. The issue you describe is that it’s the wrong statistic to compare to $\chi^2(q-p)$ (lack-of-fit by covariate pattern); the ungrouped deviance has $n - p$ residual degrees of freedom and is sometimes compared to $\chi^2(n-p)$ instead, although that approximation is generally unreliable for binary responses. Suggest rephrasing to explicitly mention the df/statistic mismatch rather than implying no $\chi^2$ GOF use at all.

Suggested change

-`deviance()` for ungrouped data **cannot** be used as a goodness-of-fit test
-against the $\chi^2$ distribution when $q < n$.
-The correct GOF statistic is $2(\ell_{\text{sat}} - \ell(\hat\beta))$,
-but R's `deviance()` for ungrouped data returns $-2\ell(\hat\beta)$ (using $\ell_{\text{fp}} = 0$).
+When $q < n$, `deviance()` for ungrouped data is **not** the correct statistic
+for the covariate-pattern goodness-of-fit test based on $\chi^2(q - p)$.
+For that lack-of-fit test, the correct statistic is
+$2(\ell_{\text{sat}} - \ell(\hat\beta))$,
+but R's `deviance()` for ungrouped data returns $-2\ell(\hat\beta)$
+(using $\ell_{\text{fp}} = 0$).
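
The statistic/df mismatch discussed in this comment can be illustrated with a small R sketch that fits the same model to ungrouped and grouped versions of one data set. The data and all object names (`fit_u`, `fit_g`, `x_k`, `y_k`, `n_k`, `ell_sat`, `gap`) are hypothetical and chosen only for illustration; with just four replicates per pattern the $\chi^2$ approximation is rough, so the p-value line only shows which statistic and degrees of freedom belong together.

```r
# Hypothetical toy data (not from the PR): n = 12 observations, q = 3 patterns.
x <- rep(c(0, 1, 2), each = 4)
y <- c(0, 0, 1, 0,  0, 1, 1, 0,  1, 1, 0, 1)

# Ungrouped (Bernoulli) fit: one row per observation.
fit_u <- glm(y ~ x, family = binomial)

# Grouped (Binomial) fit: one row per distinct covariate pattern.
x_k <- sort(unique(x))
y_k <- as.vector(tapply(y, x, sum))     # successes per pattern
n_k <- as.vector(tapply(y, x, length))  # observations per pattern
fit_g <- glm(cbind(y_k, n_k - y_k) ~ x_k, family = binomial)

# Saturated log-likelihood by pattern (Bernoulli form, no binomial coefficients).
# Every pattern here has 0 < y_k < n_k, so no 0 * log(0) terms arise.
ell_sat <- sum(y_k * log(y_k / n_k) + (n_k - y_k) * log(1 - y_k / n_k))

# The two deviances differ by -2 * ell_sat (up to numerical tolerance).
gap <- deviance(fit_u) - deviance(fit_g)
all.equal(gap, -2 * ell_sat, tolerance = 1e-6)

# Residual degrees of freedom: n - p for ungrouped, q - p for grouped.
c(ungrouped = df.residual(fit_u), grouped = df.residual(fit_g))

# Covariate-pattern lack-of-fit test: grouped deviance against chi^2(q - p).
p_value <- pchisq(deviance(fit_g), df = df.residual(fit_g), lower.tail = FALSE)
```

Both fits give the same coefficient estimates; only the reference model behind `deviance()` and the residual degrees of freedom change.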

Initial plan

f651219

Copilot AI assigned Copilot and d-morrison Apr 10, 2026

Copilot started work on behalf of d-morrison April 10, 2026 08:26 View session

Copilot AI linked an issue Apr 10, 2026 that may be closed by this pull request

discuss how deviance is calculated (for both LMs and GLMs) when there are replicates of some covariate patterns #427

Closed

Copilot AI and others added 2 commits April 10, 2026 08:42

Add discussion of deviance with replicates and saturation vs full par…

ba1a6ec

…ametrization Agent-Logs-Url: https://github.com/d-morrison/rme/sessions/5ec46eee-86e6-497e-a280-030b6ea86579 Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>

Add note about 0*log(0) = 0 convention for boundary cases in Binomial…

38aa47c

… deviance Agent-Logs-Url: https://github.com/d-morrison/rme/sessions/5ec46eee-86e6-497e-a280-030b6ea86579 Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Discuss how deviance is calculated for LMs and GLMs with replicates~~ Discuss deviance calculation with replicates; clarify saturation vs. full parametrization Apr 10, 2026

Copilot AI requested a review from d-morrison April 10, 2026 08:45

Copilot finished work on behalf of d-morrison April 10, 2026 08:45

d-morrison requested changes Apr 10, 2026

Comment thread _sec_compare_lm_glm_deviance.qmd Outdated

Copilot started work on behalf of d-morrison April 10, 2026 09:03 View session

Clarify why ell_sat = 0 for ungrouped Bernoulli data in saturated model

497e1c6

Agent-Logs-Url: https://github.com/d-morrison/rme/sessions/0096c260-f494-46d4-8567-a0443503044f Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>

Copilot finished work on behalf of d-morrison April 10, 2026 09:16

Copilot AI requested a review from d-morrison April 10, 2026 09:16

Merge branch 'main' into copilot/discuss-devience-calculation-replicates

64446ab

Copilot started work on behalf of d-morrison April 10, 2026 09:19 View session

Correct ell_sat=0 claim: only holds when q=n; clarify R uses fully pa…

c8d8101

…rametrized model for ungrouped data Agent-Logs-Url: https://github.com/d-morrison/rme/sessions/14b8d079-55f6-4d4d-a497-07ca47d8afe2 Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>

Copilot finished work on behalf of d-morrison April 10, 2026 09:24

d-morrison requested changes Apr 10, 2026

Comment thread _sec_compare_lm_glm_deviance.qmd

Comment thread _sec_compare_lm_glm_deviance.qmd Outdated

Copilot started work on behalf of d-morrison April 10, 2026 09:36 View session

Copilot finished work on behalf of d-morrison April 10, 2026 09:43

Copilot AI requested a review from d-morrison April 10, 2026 09:43

d-morrison requested changes Apr 10, 2026

Comment thread _sec_compare_lm_glm_deviance.qmd Outdated

Copilot started work on behalf of d-morrison April 10, 2026 09:54 View session

d-morrison and others added 2 commits April 10, 2026 02:55

Merge branch 'main' into copilot/discuss-devience-calculation-replicates

2ad4ae4

Copilot finished work on behalf of d-morrison April 10, 2026 10:00

Copilot AI requested a review from d-morrison April 10, 2026 10:00

Copilot started work on behalf of d-morrison April 10, 2026 10:12 View session

Add references to Dunn & Smyth (2018) and Stats SE thread for ungroup…

86de9cb

…ed deviance discussion Agent-Logs-Url: https://github.com/d-morrison/rme/sessions/bb2d12d1-c32c-46b1-aeba-b1aecfa444ab Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>

Copilot finished work on behalf of d-morrison April 10, 2026 10:19

Merge branch 'main' into copilot/discuss-devience-calculation-replicates

696af10

d-morrison marked this pull request as ready for review April 11, 2026 00:17

Copilot AI review requested due to automatic review settings April 11, 2026 00:17

d-morrison merged commit 81d682a into main Apr 11, 2026
4 checks passed

d-morrison deleted the copilot/discuss-devience-calculation-replicates branch April 11, 2026 00:17

Copilot started reviewing on behalf of d-morrison April 11, 2026 00:17 View session

Copilot AI reviewed Apr 11, 2026

d-morrison merged 11 commits into main from copilot/discuss-devience-calculation-replicates


3 participants
