
Chronic conditions#167

Closed
caitlink12 wants to merge 39 commits into feature/v3.0.0-validation-infrastructure from chron-cond


@caitlink12

Added ICES survey cycles to the existing chronic condition variables CCC_071, CCC_091, CCC_101, CCC_111, CCC_121, CCC_151, and CCC_280 in variables.csv and variable_details.csv.

included ICES survey cycles to existing CCC_071 variable and updated suffix from _i to _m
included ICES survey cycles to existing CCC_091 variable and updated suffixes from _i to _m (note: changes may be null with changes made in resp-cond branch)
included ICES survey cycles to existing CCC_101 variable and updated suffixes from _i to _m
included ICES survey cycles to existing CCC_111 variable and updated suffixes from _i to _m
included ICES survey cycles to existing CCC_121 variable and updated suffixes from _i to _m
included ICES survey cycles to existing CCC_151 variable and updated suffixes from _i to _m
included ICES survey cycles to existing CCC_280 variable and updated suffixes from _i to _m
Collaborator

@rafdoodle rafdoodle left a comment


CCC_ Variable Metadata Review — databaseStart / variableStart Updates

✅ Variables Updated (variables.csv + variable_details.csv)

The following CCC_ variables had their databaseStart and variableStart extended to include the full range of master file (_m) databases (where applicable), mirroring their existing public file (_p) coverage. Changes were applied to both inst/extdata/variables.csv and inst/extdata/variable_details.csv:

| Variable | Description |
| --- | --- |
| CCC_031 | Asthma |
| CCC_035 | Asthma: had symptoms/attacks |
| CCC_036 | Asthma: took medication |
| CCC_051 | Arthritis/Rheumatism |
| CCC_061 | Back problems |
| CCC_071 | Hypertension |
| CCC_081 | Migraine headache |
| CCC_091 | COPD/Emphysema/Bronchitis |
| CCC_101 | Diabetes |
| CCC_105 | Diabetes: currently takes insulin |
| CCC_10A | Diabetes diagnosed when pregnant |
| CCC_10B | Diabetes diagnosed: other than when pregnant |
| CCC_10C | Diabetes diagnosed: when started with insulin |
| CCC_111 | Epilepsy (already complete) |
| CCC_121 | Heart disease |
| CCC_131 | Active cancer |
| CCC_151 | Stroke |
| CCC_181 | Alzheimer's disease or other dementia |
| CCC_280 | Mood disorder |
| number_conditions | Number of chronic conditions (minor formatting fix only) |

⚠️ Variables Needing Review (variables.csv + variable_details.csv)

The following variables have not yet been updated. In both variables.csv and variable_details.csv, their databaseStart and variableStart may be missing master file (_m) databases and may be inconsistent in their PUMF (_p) database coverage. They should therefore be verified against the CCHS data dictionaries.

| Variable | Description | Notes |
| --- | --- | --- |
| CCC_041 | Fibromyalgia | Only cchs2010_m listed |
| CCC_072 | Hypertension diagnosis | Starts at cchs2005_p; only cchs2009_m, cchs2010_m, cchs2012_m |
| CCC_073 | Hypertension medication | Starts at cchs2005_p; only cchs2009_m, cchs2010_m, cchs2012_m |
| CCC_073A | Hypertension: pregnant at first diagnosis | Starts at cchs2009_2010_p; only cchs2009_m, cchs2010_m, cchs2012_m |
| CCC_073B | Hypertension: diagnosed outside pregnancy | Starts at cchs2009_2010_p; only cchs2009_m, cchs2010_m, cchs2012_m |
| CCC_075 | Blood cholesterol | No _m entries |
| CCC_102_A | Age of diabetes diagnosis (categorical, 2003) | Only cchs2009_m, cchs2010_m, cchs2012_m |
| CCC_102_B | Age of diabetes diagnosis (categorical, 2005+) | Starts at cchs2005_p; only cchs2009_m, cchs2010_m, cchs2012_m |
| CCC_102_cont | Age of diabetes diagnosis (continuous) | Starts at cchs2003_p; only cchs2009_m, cchs2010_m, cchs2012_m |
| CCC_106 | Diabetes: pills to control blood sugar | Starts at cchs2005_p; only cchs2009_m, cchs2010_m, cchs2012_m |
| CCC_141 | Stomach or intestinal ulcers | No cchs2015_2016_p/cchs2017_2018_p; only cchs2009_m, cchs2010_m, cchs2012_m |
| CCC_161 | Urinary incontinence | No cchs2015_2016_p/cchs2017_2018_p; only cchs2009_m, cchs2010_m, cchs2012_m |
| CCC_171 | Bowel disorder | No cchs2015_2016_p/cchs2017_2018_p; only cchs2009_m, cchs2010_m, cchs2012_m |
| CCC_17A | Type of bowel disorder | Starts at cchs2005_p; no cchs2015_2016_p/cchs2017_2018_p; only cchs2009_m, cchs2010_m, cchs2012_m |
| CCC_251 | Chronic fatigue syndrome | Only cchs2010_m |
| CCC_261 | Multiple chemical sensitivities | Only cchs2010_m |
| CCC_290 | Anxiety disorder | Starts at cchs2003_p; only cchs2009_m, cchs2010_m, cchs2012_m |
| CCC_31A | Cancer diagnosis | Starts at cchs2005_p; only cchs2009_m, cchs2010_m, cchs2012_m |
| CCC_91A | Chronic bronchitis | No _m entries |
| CCC_91E | Emphysema | No _m entries |
| CCC_91F | COPD | No _m entries |
| COPD_Emph_der | COPD/Emphysema (derived) | Only cchs2009_m, cchs2010_m, cchs2012_m; see note below |
| resp_condition | Respiratory condition (derived) | Only cchs2009_m, cchs2010_m, cchs2012_m; see note below |
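
A coverage check of this kind can be scripted. The sketch below is illustrative Python, not cchsflow code: the helper names (`split_databases`, `summarize_coverage`, `missing_master`) are hypothetical, it assumes databaseStart is a comma-separated string of names ending in `_p` or `_m`, and the example values are made up for illustration. The list of master databases a variable should cover must still come from the CCHS data dictionaries.

```python
def split_databases(database_start):
    """Split a comma-separated databaseStart string into clean names."""
    return [name.strip() for name in database_start.split(",") if name.strip()]


def summarize_coverage(database_start):
    """Group database names by file-type suffix (_p = PUMF, _m = master)."""
    coverage = {"p": [], "m": [], "other": []}
    for name in split_databases(database_start):
        suffix = name.rsplit("_", 1)[-1]
        key = suffix if suffix in ("p", "m") else "other"
        coverage[key].append(name)
    return coverage


def missing_master(database_start, expected_master):
    """Return expected master databases absent from databaseStart."""
    present = set(summarize_coverage(database_start)["m"])
    return [name for name in expected_master if name not in present]


# Illustrative values only; the real expected list comes from the data dictionary.
example_start = "cchs2001_p, cchs2003_p, cchs2010_m"
example_expected = ["cchs2009_m", "cchs2010_m", "cchs2012_m"]
print(missing_master(example_start, example_expected))
```

Running the same check over every CCC_ row in both CSV files would give a list to compare against the notes in the table above.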

Note on COPD_Emph_der and resp_condition

The functions in R/respiratory-condition.R that produce COPD_Emph_der and resp_condition do not currently account for master file (_m) databases as they use DHHGAGE_cont as a parameter. The databaseStart/variableStart metadata for these derived variables should be updated in both variables.csv and variable_details.csv once the underlying source variables (CCC_91E, CCC_91F, CCC_091) have been reviewed, and the R functions should be updated to handle _m data accordingly.

Convert cchs2009_s, cchs2010_s, cchs2012_s to _m equivalents and
replace _NA::a/_NA::b with _NAa/_NAb in dummyVariable field across
all CCC_* variables (114 databaseStart + 102 dummyVariable changes).
@DougManuel
Contributor

Review: PR #167 (Chronic conditions)

Reviewed 19 CCC variables (expanded from original 7 after Rafi's update) plus number_conditions derived variable. L6 integration testing passed for all PUMF cycles. Source variable names verified against MCP metadata database.

Full review details in CEP-012 (ceps/cep-012-chronic-conditions/).

Fixes applied (commit bd90746)

  1. _s → _m in databaseStart (114 rows across all CCC_* variables): Converted cchs2009_s, cchs2010_s, cchs2012_s to single-year master equivalents.
  2. _NA::a → _NAa / _NA::b → _NAb in dummyVariable (102 rows): Colons are invalid in identifiers.
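
For reference, the two substitutions can be expressed as a pair of string transforms. This is an illustrative Python sketch, not the code used in commit bd90746: the function names are hypothetical, the regular expression assumes single-year names of the form `cchsYYYY_s`, and the dummyVariable example value is made up.

```python
import re

# Assumed pattern: a single-year database name such as cchs2009_s.
# Two-year PUMF names like cchs2009_2010_p do not match.
_S_SUFFIX = re.compile(r"\b(cchs\d{4})_s\b")


def normalize_database_start(value):
    """Rewrite cchsYYYY_s entries as cchsYYYY_m; leave other names alone."""
    return _S_SUFFIX.sub(r"\1_m", value)


def normalize_dummy_variable(value):
    """Drop the colons from _NA::a / _NA::b so the name is a valid identifier."""
    return value.replace("_NA::a", "_NAa").replace("_NA::b", "_NAb")


print(normalize_database_start("cchs2009_2010_p, cchs2009_s, cchs2010_s"))
print(normalize_dummy_variable("CCC_071_cat2_NA::a"))
```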

L6 integration results

All 19 variables pass rec_with_table(). Expected gaps:

  • CCC_091 (COPD): MISS for 2005 and 2007-2008 (question split into 91A/91E/91F those cycles)
  • CCC_111 (epilepsy): MISS from 2007+ (dropped from CCHS)
  • CCC_181 (dementia): MISS all cycles (master-only, never in PUMF)
  • CCC_280 (mood disorder): MISS for 2001 (introduced 2003)

No step changes at era boundaries for CCC_091 prevalence.

Respiratory conditions (CCC_091 family)

Three questionnaire eras documented:

  • 2001-2003: 91A (bronchitis, standalone) + 91B (COPD/emphysema combined)
  • 2005-2008: 91A + 91E (emphysema) + 91F (COPD) as three separate questions
  • 2009+: Recombined as CCC_091 "Do you have COPD (eg bronchitis, emphysema)?" → CCC_030 in 2015+

Harmonized CCC_091 correctly maps to 91B (2001-2003), 91F (2005-2008 master), CCC_091 (2009-2014), CCC_030 (2015+).

CCC_181 (Alzheimer's/dementia)

Confirmed master-only (suppressed from PUMF for disclosure control). Universe changed: 18+ (2001-2008) → 35+ (2009-2014) → 41+ (2015+). 2015+ rename CCC_181 → CCC_145 correctly mapped.

Pre-existing issues (not introduced by this PR)

  • CCC_151 ccsh2009_2010_m typo (swapped s/h in databaseStart)
  • number_conditions requires full feeder chain including DHHGAGE_cont (documented in roxygen)

Recommendation

PR is good to merge. Close and delete the branch after merge.

@StaceyFisher
Contributor

StaceyFisher commented Mar 11, 2026

For COPD/emphysema/chronic bronchitis (CCC_091):

As mentioned above, there are three 'eras' for this concept:
2001-2003: 91A (bronchitis, standalone) + 91B (COPD/emphysema combined)
2005-2008: 91A + 91E (emphysema) + 91F (COPD) as three separate questions
2009+: Recombined as CCC_091 "Do you have COPD (eg bronchitis, emphysema)?" → CCC_030 in 2015+

However, in 2009/10/11, CCC_091 is labelled 'Has COPD', but the question actually asks "Do you have chronic bronchitis, emphysema or chronic obstructive pulmonary disease or COPD?". Therefore, all three of these conditions need to be combined in the cycles in which they are separated.

1.1 (2000/01): CCCA_91A (CB) or CCC_91B (E + COPD)
2.1 (2003) : CCCC_91A (CB) or CCC_91B (E + COPD)
3.1 (2005) : CCCE_91A (CB) or CCCE_91E (E) or CCCE_91F (COPD)
4.1 (2007/08): CCC_91A (CB) or CCC_91E (E) or CCC_91F (COPD)
2009 - 2014: CCC_091 (CB or E or COPD)
2015 - 2023: CCC_030 (CB or E or COPD)
Abbreviations: Chronic bronchitis (CB); Emphysema (E); COPD

Of note, the documentation also mentions CCCDCPD, a derived variable in 4.1 that would also work, but I don't see this variable in the ICES data dictionary.

@StaceyFisher StaceyFisher requested a review from rafdoodle March 11, 2026 18:35
Collaborator

@rafdoodle rafdoodle left a comment


Hello Stacey,

Your summary of the three eras is exactly right. Here's what I implemented in cchsflow to harmonize across them:

CCC_091 (harmonized "any chronic respiratory condition" flag):

  • 2001/2003: CCC_091_fun1(CCC_91A, CCC_91B) — returns 1 if either chronic bronchitis (91A) or COPD/emphysema (91B) is positive
  • 2005/2007-08: CCC_091_fun2(CCC_91A, CCC_91E, CCC_91F) — returns 1 if any of CB (91A), emphysema (91E), or COPD (91F) is positive
  • 2009–2014: direct recode of CCC_091 (already combined in the survey)
  • 2015–2018: maps CCC_030 → CCC_091 (already handled via cycle-specific variableStart)
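
The era logic above can be sketched as a lookup plus an "any positive" rule. This is illustrative Python rather than the package's R functions: `ERA_SOURCES`, `any_positive`, and `harmonized_ccc_091` are hypothetical names, the 1 = yes / 2 = no coding is assumed, and the treatment of missing responses (return nothing unless every source is an explicit "no") is an assumption, not a description of CCC_091_fun1 or CCC_091_fun2.

```python
YES, NO = 1, 2  # assumed CCHS-style yes/no coding

# Source variables per questionnaire era, as listed in the thread above.
ERA_SOURCES = {
    "2001-2003": ("CCC_91A", "CCC_91B"),
    "2005-2008": ("CCC_91A", "CCC_91E", "CCC_91F"),
    "2009-2014": ("CCC_091",),
    "2015-2018": ("CCC_030",),
}


def any_positive(*responses):
    """Return YES if any response is YES, NO if all are NO, else None."""
    if any(r == YES for r in responses):
        return YES
    if responses and all(r == NO for r in responses):
        return NO
    return None  # assumption: anything else is treated as missing


def harmonized_ccc_091(era, record):
    """Combine the era-appropriate source variables from one respondent record."""
    responses = [record.get(name) for name in ERA_SOURCES[era]]
    return any_positive(*responses)


# A 2005-2008 respondent reporting emphysema only is flagged as positive.
print(harmonized_ccc_091("2005-2008", {"CCC_91A": 2, "CCC_91E": 1, "CCC_91F": 2}))
```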

COPD_Emph_der (age-stratified derived variable):

This variable classifies respondents as: (1) age ≥35 with condition, (2) age <35 with condition, or (3) no condition. We updated it to properly cover all cycles including the annual/master file split, using the era-appropriate source variables:

  • 2001/2003: COPD_Emph_der_fun2(age, CCC_91B) — uses CCC_91B (COPD/emphysema combined)
  • 2005/2007-08: COPD_Emph_der_fun1(age, CCC_91E, CCC_91F) — uses emphysema + COPD separately
  • 2009+: COPD_Emph_der_fun2(age, CCC_091) — uses the harmonized CCC_091

Note that COPD_Emph_der intentionally excludes standalone chronic bronchitis (CCC_91A) for the 2001–2008 eras, consistent with its focus on COPD/emphysema specifically. CCC_091 is the appropriate variable if you want all three conditions combined.
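
The three-way classification can be sketched as follows. This is illustrative Python, not the R code behind COPD_Emph_der_fun1 or COPD_Emph_der_fun2: `copd_emph_der` is a hypothetical name, the 1 = yes / 2 = no coding is assumed, and returning nothing when age or a response is missing is an assumption.

```python
YES, NO = 1, 2  # assumed CCHS-style yes/no coding


def copd_emph_der(age, *conditions):
    """Classify a respondent by age and COPD/emphysema status.

    Returns 1 (age >= 35 with condition), 2 (age < 35 with condition),
    3 (no condition), or None when the inputs cannot be classified.
    """
    if age is None or not conditions:
        return None
    if any(c == YES for c in conditions):
        return 1 if age >= 35 else 2
    if all(c == NO for c in conditions):
        return 3
    return None  # assumption: a missing response leaves the result missing


# 2005-2008 style call with separate emphysema and COPD responses.
print(copd_emph_der(40, 2, 1))
# 2001-2003 or 2009+ style call with a single combined response.
print(copd_emph_der(30, 1))
```

Passing the conditions as a variable-length argument lets one function stand in for both the one-source and two-source cases; chronic bronchitis (CCC_91A) is simply never passed in.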

Didn't see CCCDCPD in the 613 data dictionary anywhere.

Please let me know what you think. More chronic condition variables still need fixing.

@rafdoodle rafdoodle requested a review from StaceyFisher March 13, 2026 17:35
@DougManuel
Contributor

Note from v3-smoking branch review (2026-03-13):

check_recode_blocks() — a new validation function added on the v3-smoking branch — flags COPD_Emph_der as having a recStart collision across two DerivedVar:: blocks. After the Func:: router rows are excluded, two recode blocks remain with recStart = N/A across different variableStart values (DerivedVar::[DHHGAGE_cont, CCC_091] and DerivedVar::[DHHGAGE_cont, CCC_91E, CCC_91F, CCC_091]). This is likely a pre-existing structural issue but should be reviewed before merging.
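
To make the reported collision concrete, the check can be approximated as below. This is an illustrative Python sketch and not the actual check_recode_blocks() implementation: `find_recstart_collisions` is a hypothetical name, rows are modelled as plain dictionaries, and it assumes router rows are the ones whose recEnd starts with `Func::`.

```python
from collections import defaultdict


def find_recstart_collisions(rows):
    """Flag recStart values shared by more than one variableStart block.

    Rows whose recEnd starts with "Func::" are skipped (assumed to be
    router rows). Returns {(variable, recStart): [variableStart, ...]}.
    """
    seen = defaultdict(set)
    for row in rows:
        if str(row.get("recEnd", "")).startswith("Func::"):
            continue
        seen[(row["variable"], row["recStart"])].add(row["variableStart"])
    return {key: sorted(starts) for key, starts in seen.items() if len(starts) > 1}


# Made-up rows shaped like the COPD_Emph_der case described above.
rows = [
    {"variable": "COPD_Emph_der", "recStart": "N/A", "recEnd": "Func::COPD_Emph_der_fun2",
     "variableStart": "DerivedVar::[DHHGAGE_cont, CCC_091]"},
    {"variable": "COPD_Emph_der", "recStart": "N/A", "recEnd": "1",
     "variableStart": "DerivedVar::[DHHGAGE_cont, CCC_091]"},
    {"variable": "COPD_Emph_der", "recStart": "N/A", "recEnd": "1",
     "variableStart": "DerivedVar::[DHHGAGE_cont, CCC_91E, CCC_91F, CCC_091]"},
]
print(find_recstart_collisions(rows))
```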

rafdoodle and others added 7 commits March 18, 2026 14:16
updated existing CCC_041 variable with ICES survey cycles for 2001-2018
updated existing CCC_072 variable with ICES survey cycles from 2005-2014
updated existing CCC_073 with ICES survey cycles for 2005-2018
caitlink12 and others added 20 commits April 1, 2026 11:42
updated existing CCC_073A variable with ICES survey cycles from 2009-2014
updated existing CCC_073B variable with ICES survey cycles for 2009-2014
updated existing CCC_075 variable with ICES survey cycles for 2015-2018
updated existing CCC_102_A variable with ICES survey cycles for 2001-2018. Only continuous values available for ICES survey cycles for this variable
updated existing CCC_102_B variable with ICES survey cycles for 2001-2018. Only continuous versions of this variable were available for ICES survey cycles
updated existing CCC_102_cont variable with ICES survey cycles for 2001-2018
updated existing CCC_106 variable with ICES survey cycles for 2005-2018
updated existing CCC_141 with ICES survey cycles for 2001-2018
reorder categories so 'Yes' category (coded as 1) is first to stay consistent with other yes/no coding
updated existing CCC_161 variable with ICES survey cycles for 2001-2018
updated existing CCC_171 and CCC_17A variables with ICES survey cycles from 2001-2018 and 2005-2018 respectively
reordered CCC_17A to have category 1 at the top of the CCC_17A variable listing
updated existing CCC_251 variable with ICES survey cycles for 2001, 2003, 2005, 2015/2016, 2017/2018
updated existing CCC_261 variable with ICES survey cycles for 2001, 2003, 2005, 2015/2016, 2017/2018
updated existing CCC_290 variable with ICES survey cycles for 2003-2018
updated existing CCC_31A variable with ICES survey cycles for 2005-2018
…v; fixed cycle order in databaseStart and variableStart
…ster and PUMF, and both pre-2005 and 2005+ categorical CCCG102 for PUMF only
rafdoodle added a commit that referenced this pull request Apr 3, 2026
@rafdoodle
Collaborator

Changes manually merged to v3 via commit 96f627b. Will close this PR now.
