256 get cox pairwise df strata in formulas by jszczypinski · Pull Request #258 · insightsengineering/crane

jszczypinski · 2026-05-27T08:53:54Z

Currently, get_cox_pairwise_df() does not accept formulas with a strata() term, which is a quite common use case.
Also, formulas in R should not use namespace prefixes (pkg::fun()); see:
https://stat.ethz.ch/pipermail/r-devel/2025-April/083989.html

Closes #256

Pre-review Checklist (if an item does not apply, mark it as complete)

All GitHub Action workflows pass with a ✅
PR branch has pulled the most recent updates from master branch: usethis::pr_merge_main()
If a bug was fixed, a unit test was added.
Code coverage is suitable for any new functions/features (generally, 100% coverage for new code): devtools::test_coverage()
Request a reviewer

Reviewer Checklist (if an item does not apply, mark it as complete)

If a bug was fixed, a unit test was added.
Run pkgdown::build_site(). Check the R console for errors, and review the rendered website.
Code coverage is suitable for any new functions/features: devtools::test_coverage()

When the branch is ready to be merged:

Update NEWS.md with the changes from this pull request under the heading "# cards (development version)". If there is an issue associated with the pull request, reference it in parentheses at the end of the update (see NEWS.md for examples).
All GitHub Action workflows pass with a ✅
Approve Pull Request
Merge the PR. Please use "Squash and merge" or "Rebase and merge".

Signed-off-by: Jan Szczypiński <jan.szczypinski@gmail.com>

github-actions · 2026-05-27T09:40:36Z

Unit Tests Summary

1 files 240 suites 3m 13s ⏱️
240 tests 240 ✅ 0 💤 0 ❌
712 runs 712 ✅ 0 💤 0 ❌

Results for commit 3049777.

♻️ This comment has been updated with latest results.

github-actions · 2026-05-27T09:40:42Z

Unit Test Performance Difference

Test Suite                   Status  Time on `main`  ±Time  ±Tests  ±Skipped  ±Failures  ±Errors
---------------------------  ------  --------------  -----  ------  --------  ---------  -------
check_formula_for_namespace  👶                      +0.00  +2      0         0          0
modify_header_rm_md          💔      0.09            +1.20  0       0         0          0

Additional test case details

Test Suite                   Status  Time on `main`  ±Time  Test Case
---------------------------  ------  --------------  -----  --------------------------------------------------------------------------
check_formula_for_namespace  👶                      +0.00  .check_formula_for_namespace_aborts_when_a_namespace_is_present
check_formula_for_namespace  👶                      +0.00  .check_formula_for_namespace_correctly_parses_long_formulas
check_formula_for_namespace  👶                      +0.16  .check_formula_for_namespace_passes_valid_formulas
get_cox_pairwise_df          👶                      +0.00  get_cox_pairwise_df_handles_robust_TRUE_correctly_via_...
get_cox_pairwise_df          👶                      +0.00  get_cox_pairwise_df_respects_conf.int_passed_via_...
get_cox_pairwise_df          👶                      +0.01  get_cox_pairwise_df_works_for_formula_with_complex_strata_
get_cox_pairwise_df          👶                      +0.00  get_cox_pairwise_df_works_for_formula_with_covariates_via_likelihood_ratio
modify_header_rm_md          💔      0.09            +1.20  strip_md_bold_works_with_gtsummary_table
modify_zero_recode           💚      1.56            -1.43  modify_zero_recode_works

Results for commit a94398a

♻️ This comment has been updated with latest results.

github-actions · 2026-05-27T09:40:53Z

Code Coverage Summary

Filename                                 Stmts    Miss  Cover    Missing
-------------------------------------  -------  ------  -------  ------------------------------------------------------------------------------------------------
R/add_blank_rows.R                          63       0  100.00%
R/add_difference_row.R                     101       0  100.00%
R/add_forest_utils.R                        97      10  89.69%   76-79, 94-100
R/add_forest.R                             139       0  100.00%
R/add_hierarchical_count_row.R              33       0  100.00%
R/adjust_stat_columns_wrap.R                29       1  96.55%   53
R/annotate_gg_km.R                         135       0  100.00%
R/annotate_gg_pkc.R                         92       0  100.00%
R/annotate_gg.R                             81       0  100.00%
R/ard_tabulate_abnormal_by_baseline.R       65       0  100.00%
R/crane-package.R                            2       2  0.00%    26-27
R/deprecated.R                              21      21  0.00%    15-51
R/df_add_poolings.R                         41       0  100.00%
R/get_cox_pairwise_df.R                    149      42  71.81%   140-147, 172-178, 301-306, 347-354, 363-384
R/gg_km_utils.R                             35      14  60.00%   18-35
R/gg_km.R                                  143      37  74.13%   55-58, 75, 102, 176-181, 184-187, 197-199, 204-205, 239-241, 248-251, 255, 266-270, 283, 285-287
R/gg_lineplot.R                             94       0  100.00%
R/gg_mmrm_lineplot.R                       102       1  99.02%   106
R/gg_pkc_lineplot.R                         98       0  100.00%
R/gg_utils.R                               221       0  100.00%
R/label_roche.R                             72       0  100.00%
R/modify_header_rm_md.R                     18       2  88.89%   35-36
R/modify_zero_recode.R                      20       1  95.00%   64
R/reverse_difference_ci.R                   33       0  100.00%
R/tbl_baseline_chg.R                       188       0  100.00%
R/tbl_coxph.R                               90       1  98.89%   219
R/tbl_hierarchical_incidence_rate.R        209       0  100.00%
R/tbl_hierarchical_rate_and_count.R        339      13  96.17%   343, 425, 446-456
R/tbl_hierarchical_rate_by_grade.R         317       3  99.05%   169-171
R/tbl_listing.R                             35       0  100.00%
R/tbl_mmrm.R                               254       1  99.61%   393
R/tbl_null_report.R                          9       0  100.00%
R/tbl_rmpt.R                               157      12  92.36%   297-302, 314-319
R/tbl_roche_subgroups.R                    155       0  100.00%
R/tbl_roche_summary.R                       64       0  100.00%
R/tbl_shift.R                              116       0  100.00%
R/tbl_survfit_quantiles.R                  132       1  99.24%   295
R/tbl_survfit_times.R                       92       0  100.00%
R/tbl_with_pools.R                          64       0  100.00%
R/theme_gtsummary_roche.R                   84       1  98.81%   61
R/utils.R                                   42       0  100.00%
TOTAL                                     4231     163  96.15%

Diff against main

Filename                   Stmts    Miss  Cover
-----------------------  -------  ------  --------
R/get_cox_pairwise_df.R      +53     +36  -21.94%
R/utils.R                     +6       0  +100.00%
TOTAL                        +59     +36  -0.81%

Results for commit: 3049777

Minimum allowed coverage is 80%

♻️ This comment has been updated with latest results

Melkiades · 2026-05-27T11:11:09Z

 #' \itemize{
 #'   \item `HR`: The Hazard Ratio formatted to two decimal places.
-#'   \item `95% CI`: The 95\% confidence interval as `"(lower, upper)"`.
+#'   \item `95% CI`: The 95% Wald confidence interval as `"(lower, upper)"`.


Can you change the method if you want? If not, we need to be upfront about it.

By default, coxph() calculates the variance using the standard Fisher information matrix, and the resulting CIs extracted via summary() or confint() are Wald confidence intervals.

If you want to change the underlying variance method to calculate Robust (Sandwich) Confidence Intervals, you need to use the robust = TRUE argument inside your model call.

So we could add a robust argument and additionally specify the confidence level in another argument?

However, these would be two additional arguments in the function call. What is your take on this?

Could we use ... to pass these additional arguments consistently to each method?

I have added the code to do that.
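To make the Wald versus robust (sandwich) distinction in this thread concrete, here is a minimal sketch. It assumes the survival package and its bundled `lung` data; `fit_hr()` and its `conf_level` argument are hypothetical names for illustration, not the actual signature of get_cox_pairwise_df().

```r
library(survival)

# Hypothetical wrapper (not the PR's code): forwards `...` to coxph()
# and reports the hazard ratio with a Wald confidence interval.
fit_hr <- function(formula, data, conf_level = 0.95, ...) {
  fit <- coxph(formula, data = data, ...)
  # confint() builds Wald limits from coef(fit) and vcov(fit);
  # with robust = TRUE, vcov(fit) is the sandwich estimate.
  ci <- exp(confint(fit, level = conf_level))
  data.frame(
    term  = names(coef(fit)),
    hr    = unname(exp(coef(fit))),
    lower = unname(ci[, 1]),
    upper = unname(ci[, 2])
  )
}

f <- Surv(time, status) ~ sex
default_ci <- fit_hr(f, lung)                 # model-based variance
robust_ci  <- fit_hr(f, lung, robust = TRUE)  # sandwich variance
ci_90      <- fit_hr(f, lung, conf_level = 0.90)
```

The point estimate is the same in all three calls; only the interval limits change with the variance estimator or the confidence level.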

Melkiades · 2026-05-27T11:12:46Z

+  formula_str <- paste(deparse(model_formula), collapse = " ")
+  if (grepl("\\b[a-zA-Z][a-zA-Z0-9.]*::", formula_str)) {
+    cli::cli_abort(
+      "{.arg model_formula} must be specified without namespace.",
+      call = get_cli_abort_call()
+    )
+  }
+


I got the same from AI in another PR (avot01). I wonder if we should make a utility for it.

I will put it in import-standalone-checks.R

It's called check_formula_for_namespace()

That is not the place for this. see header:

# Standalone file: do not edit by hand
# Source: https://github.com/insightsengineering/standalone/blob/HEAD/R/standalone-checks.R
# Generated by: usethis::use_standalone("insightsengineering/standalone", "checks")
# ----------------------------------------------------------------------

Good call. I have put it in utils.R
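For readers following the thread, a base-R sketch of such a check might look like the following. The name `check_formula_no_namespace()` is a hypothetical stand-in for the helper added in this PR, and it calls stop() rather than cli::cli_abort() so that it runs without extra dependencies; the regular expression is the one from the diff above.

```r
# Hypothetical stand-in for the PR's helper, using only base R.
check_formula_no_namespace <- function(formula) {
  # deparse() may split long formulas over several strings, so collapse them
  formula_str <- paste(deparse(formula), collapse = " ")
  if (grepl("\\b[a-zA-Z][a-zA-Z0-9.]*::", formula_str)) {
    stop("`model_formula` must be specified without namespace.", call. = FALSE)
  }
  invisible(formula)
}

# Building a formula does not evaluate its terms, so no data are needed here.
ok <- check_formula_no_namespace(y ~ arm + strata(site))
bad <- tryCatch(
  check_formula_no_namespace(y ~ arm + survival::strata(site)),
  error = function(e) conditionMessage(e)
)

# Alternative that inspects the parsed formula instead of its text
has_namespace <- function(formula) any(c("::", ":::") %in% all.names(formula))
```

The `all.names()` variant avoids pattern matching on deparsed text, at the cost of not reporting which term carried the prefix.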

Melkiades · 2026-05-27T13:13:03Z

+#' Check if formula contains namespace
+#' @keywords internal
+#' @noRd
+check_formula_for_namespace <- function(formula) {


I think we need a test for this directly (to add into utils.R)

Added a test for it

Melkiades · 2026-05-27T13:13:51Z

@@ -40,18 +38,20 @@ test_that(
    expect_no_error(
      suppressWarnings(


I would aim at removing all suppressWarnings (I may also have added some tbh)

I removed all from get_cox_pairwise_df()

jszczypinski · 2026-05-27T13:22:57Z

+  if (test == "log-rank") {
+    # --- 1. Standard Log-Rank via survival (Matches SAS & rtables) ---
+    sdiff <- survival::survdiff(formula, data = data)
+
+    # Calculate the asymptotic p-value using the chi-square statistic
+    # Degrees of freedom is (number of groups - 1)
+    p_value <- stats::pchisq(sdiff$chisq, df = length(sdiff$n) - 1, lower.tail = FALSE)
+  } else if (test == "likelihood-ratio") {
+    # --- 2. Likelihood-Ratio Test via survival ---
+    # Fit the full Cox model
+    fit_full <- survival::coxph(formula, data = data, ties = ties)
+
+    # Safely create the reduced formula by dropping the arm variable
+    reduced_formula <- stats::update(formula, paste(". ~ . -", arm))
+
+    # Fit the reduced model explicitly to avoid drop1 environment scope errors
+    fit_reduced <- survival::coxph(reduced_formula, data = data, ties = ties)


I have reverted to survival for computing the log-rank p-value, since coin uses a permutation test for this, which resulted in a discrepancy between crane and tern results.
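As an illustration of the log-rank branch in the diff above, the following sketch runs a stratified log-rank test with survival::survdiff(). The `veteran` data set and the chosen formula are assumptions made for this example only; they do not come from the PR.

```r
library(survival)

# Treatment comparison, stratified by tumour cell type (illustrative choice)
f <- Surv(time, status) ~ trt + strata(celltype)
sdiff <- survdiff(f, data = veteran)

# Asymptotic chi-square p-value; degrees of freedom = number of groups - 1
df <- length(sdiff$n) - 1
p_value <- pchisq(sdiff$chisq, df = df, lower.tail = FALSE)
```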

We need to talk about the LRT engine change (survreg → coxph)

Just flagging this because I think it deserves a note in the PR description / NEWS.

The old LRT used survreg(dist = "exponential") to match SAS's PROC LIFETEST LR test. The new code uses nested coxph() anova (Cox partial-likelihood LRT, equivalent to SAS PROC PHREG). The switch was necessary because survreg() errors on strata() terms.

However, we could preserve both: use the original survreg exponential LRT when the formula has no strata() terms (keeping backward compatibility and SAS PROC LIFETEST alignment), and fall back to the Cox LRT only when strata() is present. This way existing non-stratified results stay unchanged, and stratified formulas get a valid test. Both approaches have SAS equivalents — they just come from different procedures.

Either way, worth a note in NEWS since test = "likelihood-ratio" behavior is changing for at least the stratified case.
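To make the Cox partial-likelihood LRT discussed in this comment concrete, here is a minimal sketch of the nested-model comparison with a strata() term. It assumes the survival package's `veteran` data; the formula, the `karno` covariate, and the variable names are illustrative and are not taken from the PR.

```r
library(survival)

arm <- "trt"
full_formula <- Surv(time, status) ~ trt + karno + strata(celltype)

# Drop the arm term while keeping covariates and strata
reduced_formula <- update(full_formula, paste(". ~ . -", arm))

fit_full    <- coxph(full_formula, data = veteran)
fit_reduced <- coxph(reduced_formula, data = veteran)

# Likelihood-ratio statistic from the final partial log-likelihoods
lr_stat <- 2 * (fit_full$loglik[2] - fit_reduced$loglik[2])
df <- length(coef(fit_full)) - length(coef(fit_reduced))
p_value <- pchisq(lr_stat, df = df, lower.tail = FALSE)
```

Calling anova(fit_reduced, fit_full) should report the same comparison; the manual version is shown only to expose the arithmetic.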

jszczypinski added 15 commits May 25, 2026 13:38

Adding discrete ties method to get_cox_pairwise_df()

5841abc

Small fix to work with gg_km

82e5563

style

93da9de

Merge branch 'main' into 193_add_discrete_ties_get_cox_pairwise_df

c760f77

Applying reviewer comments

e66f180

docs

59447a4

Add functionality to allow for strata() in model formula

6900964

style

ed7124b

Update functions that depend on get_cox_pairwise_df

3e62d54

Updating examples

33fa118

more docs

71b2210

Comments about Wald CIs

32c059c

Merge branch 'main' into 193_add_discrete_ties_get_cox_pairwise_df

510f066

docs again

0da95c4

Remove ties = discrete

51b1c81

jszczypinski requested review from Melkiades and llrs-roche May 27, 2026 08:54

jszczypinski commented May 27, 2026

View reviewed changes

Comment thread R/get_cox_pairwise_df.R Outdated

Apply suggestion from @jszczypinski

2d09123

Signed-off-by: Jan Szczypiński <jan.szczypinski@gmail.com>

jszczypinski commented May 27, 2026

View reviewed changes

Comment thread R/get_cox_pairwise_df.R Outdated

Apply suggestion from @jszczypinski

e8ff931

Signed-off-by: Jan Szczypiński <jan.szczypinski@gmail.com>

jszczypinski commented May 27, 2026

View reviewed changes

Comment thread tests/testthat/test-get_cox_pairwise_df.R Outdated

Apply suggestion from @jszczypinski

f482bfa

Signed-off-by: Jan Szczypiński <jan.szczypinski@gmail.com>

jszczypinski self-assigned this May 27, 2026

jszczypinski marked this pull request as ready for review May 27, 2026 09:33

jszczypinski added 2 commits May 27, 2026 12:29

Changes to cox ph p value calculation

39054ae

Docs + style

822e50b

jszczypinski commented May 27, 2026

View reviewed changes

Comment thread R/get_cox_pairwise_df.R Outdated

remove duplicated helper

5614503

Melkiades reviewed May 27, 2026

View reviewed changes

Comment thread R/annotate_gg_km.R Outdated

Melkiades reviewed May 27, 2026

View reviewed changes

jszczypinski added 2 commits May 27, 2026 14:17

Changes based on a review

b51db83

Update check_formula_for_namespace()

fc73f3f

Melkiades reviewed May 27, 2026

View reviewed changes

Use survival for logrank, add ... to get_cox_pairwise_df

b948396

jszczypinski commented May 27, 2026

View reviewed changes

Add a test for formula check + remove supressWarnings

3049777


3 participants
