Medication refactoring for CRAN #19

Open
rafdoodle wants to merge 11 commits into dev from medication-refactoring

Collaborator

rafdoodle commented Mar 16, 2026

Hello Doug,

Hope you are doing well. I have completely refactored how CHMS medications are recoded, which addresses CRAN-blocking issues #13 and #15. Here is what has been done:

Issue #13 — Circular/self-referential medication metadata

The old medication variables (any_htn_med2, diab_med2) were referenced by downstream functions in R/blood-pressure.R and R/diabetes.R using the old naming convention. These have been replaced throughout:

  • any_htn_med2 → any_htn_med and diab_med2 → diab_med in all HTN and diabetes functions (derive_diabetes_status(), derive_hypertension(), derive_hypertension_adj(), derive_hypertension_control(), derive_hypertension_control_adj()) and their man/ docs
  • inst/extdata/variable-details.csv updated with clean, non-circular entries for any_htn_med and diab_med across three database tiers:
    • cycle1_meds / cycle2_meds: DerivedVar:: rows derived from 80 atc_*/mhr_* columns via Func::is_*_cycles1to2
    • cycle3_meds to cycle6_meds: DerivedVar:: rows derived from meucatc / npi_25b via Func::is_*
    • cycle1 to cycle6: copy passthrough once the column has been merged into the main cycle data by recode_meds_*()
  • inst/extdata/variables.csv updated to match; data/variable_details.rda and data/variables.rda regenerated from the updated CSVs

Issue #15 — No aggregation wrapper for cycles 3-6

Cycles 3-6 store medications in long format (one row per medication per respondent). Previously users had to manually recode, aggregate, and join in three separate steps. Three new wrapper functions now handle the full pipeline:

recode_meds_cycles1to2(data, meds_data, variables, ...)
Accepts the main cycle data frame and the wide-format medication data frame (up to 80 atc_*/mhr_* column pairs). Normalises uppercase column names, recodes via rec_with_table(), converts to numeric, and left-joins medication columns back into data. Returns the enriched main data frame.
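To make those steps concrete, here is a minimal base R sketch of the name normalisation, numeric conversion, and left join. It is not the code in R/medications.R: the helper merge_wide_meds(), the toy data, and the choice of clinicid as the join key are assumptions for illustration, and the rec_with_table() recoding step is left out entirely.

```r
# Hypothetical sketch: normalise names, coerce to numeric, left-join on clinicid.
# The recoding step that the PR delegates to rec_with_table() is not shown.
merge_wide_meds <- function(data, meds_data, by = "clinicid") {
  # Normalise uppercase column names so both frames share one convention
  names(meds_data) <- tolower(names(meds_data))

  # Convert every non-key column to numeric (unparseable values become NA)
  value_cols <- setdiff(names(meds_data), by)
  meds_data[value_cols] <- lapply(
    meds_data[value_cols],
    function(x) suppressWarnings(as.numeric(as.character(x)))
  )

  # Left join: keep every respondent in `data`, even without a medication row
  merged <- merge(data, meds_data, by = by, all.x = TRUE, sort = FALSE)
  merged <- merged[order(merged[[by]]), , drop = FALSE]
  rownames(merged) <- NULL
  merged
}

# Toy inputs (made-up values, not CHMS data)
cycle1 <- data.frame(clinicid = c(1, 2, 3), age = c(40, 55, 62))
cycle1_meds <- data.frame(
  CLINICID = c(1, 3),
  ANY_HTN_MED = c("1", "0"),
  DIAB_MED = c("0", "1")
)

result <- merge_wide_meds(cycle1, cycle1_meds)
print(result)
```

In this sketch, respondent 2 has no medication record and so gets NA in both medication columns, which is the point of a left join rather than an inner join.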

recode_meds_cycles3to6(data, meds_data, variables, ...)
Accepts the main cycle data frame and the long-format medication data frame. Recodes via rec_with_table(), aggregates to one row per respondent via aggregate_meds_by_person(), and left-joins into data. Returns the enriched main data frame.

aggregate_meds_by_person(data, variables, by)
Collapses long-format medication data to one row per respondent using max(). All-NA rows become tagged_na("b").
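A rough sketch of that collapse in base R follows. The helper names collapse_by_person() and max_or_na_b() and the toy data are hypothetical, and two assumptions are baked in: "all-NA" is read as "every value for that respondent is NA", and max() is taken over the non-missing values only. haven::tagged_na("b") is used when haven is installed; otherwise a plain NA stands in.

```r
# Hypothetical sketch of collapsing long-format medication flags to one row per person.
# Use haven's tagged NA when available; otherwise fall back to a plain NA.
na_b <- if (requireNamespace("haven", quietly = TRUE)) {
  haven::tagged_na("b")
} else {
  NA_real_
}

# max() over the observed values; a respondent with no observed value gets na_b
max_or_na_b <- function(x) {
  if (all(is.na(x))) na_b else max(x, na.rm = TRUE)
}

collapse_by_person <- function(data, variables, by = "clinicid") {
  ids <- unique(data[[by]])
  out <- data.frame(ids)
  names(out) <- by
  for (v in variables) {
    out[[v]] <- vapply(
      ids,
      function(id) max_or_na_b(data[[v]][data[[by]] == id]),
      numeric(1)
    )
  }
  out
}

# Toy long-format input: one row per medication per respondent (made-up values)
long_meds <- data.frame(
  clinicid    = c(1, 1, 2, 3, 3),
  any_htn_med = c(0, 1, NA, 0, 0),
  diab_med    = c(0, 0, NA, 1, NA)
)

person_level <- collapse_by_person(long_meds, c("any_htn_med", "diab_med"))
print(person_level)
```

Using max() means a respondent is flagged as soon as any one of their medications is flagged, which suits 0/1 indicator columns.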

Both recode functions share the same call signature, so the workflow is identical regardless of cycle:

# Cycles 1-2
cycle1 <- recode_meds_cycles1to2(cycle1, cycle1_meds, c("any_htn_med", "diab_med"))

# Cycles 3-6
cycle3 <- recode_meds_cycles3to6(cycle3, cycle3_meds, c("any_htn_med", "diab_med"))

Additional fixes

  • recode_after_meds(): fixed two issues. (1) Medication rows are now filtered out of variable_details before rec_with_table() is called, so pre-computed medication columns are passed through via the copy entries rather than re-derived. (2) clinicid was being silently dropped: it was not in the variables argument passed to rec_with_table(), so it was not returned in the output. It is now restored via dplyr::bind_cols(data[, by, drop = FALSE], recoded).
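The clinicid fix can be illustrated with a small base R sketch. recode_only() is a hypothetical stand-in for a recode step that returns only the requested variables, restore_keys() is an illustrative name, and cbind() is swapped in for dplyr::bind_cols() so the example has no dependencies. Like bind_cols(), it binds by position, so it assumes the recoded rows are still in the same order as the input.

```r
# Hypothetical stand-in for a recode step that returns only the requested variables
recode_only <- function(data, variables) {
  data[, variables, drop = FALSE]
}

# Put the key column(s) back in front of the recoded output
restore_keys <- function(data, recoded, by = "clinicid") {
  # drop = FALSE keeps the key subset a data frame even when `by` is one column
  cbind(data[, by, drop = FALSE], recoded)
}

# Toy input (made-up values)
cycle3 <- data.frame(
  clinicid    = c(101, 102),
  any_htn_med = c(1, 0),
  sbp         = c(128, 141)
)

recoded <- recode_only(cycle3, c("any_htn_med", "sbp"))  # clinicid is lost here
fixed   <- restore_keys(cycle3, recoded)                 # clinicid is back
print(names(fixed))
```

Without drop = FALSE, selecting a single column returns a bare vector, and the key would lose its column name when bound.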

Changes

| File | What changed |
| --- | --- |
| inst/extdata/variable-details.csv | Replaced any_htn_med2/diab_med2 entries with DerivedVar rows for _meds databases and copy rows for main cycle databases |
| inst/extdata/variables.csv | Updated variable entries to match |
| data/variable_details.rda, data/variables.rda | Regenerated from updated CSVs |
| R/blood-pressure.R | any_htn_med2 → any_htn_med throughout all HTN functions |
| R/diabetes.R | diab_med2 → diab_med throughout diabetes functions |
| R/medications.R | New classification and wrapper functions; recode_after_meds() clinicid fix; ordered cycles 1-2 → cycles 3-6 |
| man/ | Regenerated docs for all modified and new functions |
| vignettes/recoding_medications.qmd | Full rewrite: removed outdated sections, added section 3 (workflow) and section 4 (advanced is_* usage); all chunks now execute; library() calls consolidated into setup chunk |
| tests/testthat/test-medications.R | Tests for all new functions; clinicid presence assertion added; ordered cycles 1-2 → cycles 3-6 |

Test plan

  • devtools::check() passes with 0 errors, 0 warnings, 0 notes
  • recode_meds_cycles1to2() merges medication columns into main data frame
  • recode_meds_cycles3to6() aggregates and merges medication columns into main data frame
  • recode_after_meds() output contains clinicid
  • recode_after_meds() does not re-derive medication variables
  • Downstream HTN and diabetes functions accept any_htn_med / diab_med without error

Please let me know what you think.

Sincerely,
Rafidul
