Add files via upload #1

Open: neurouroboros wants to merge 2 commits into GP2code:main from neurouroboros:SW
@neurouroboros

I have added some additional medication variables and created a draft medications list, based on the unique medication names found in the datasets I have harmonised for GP2, which can be used to assign the medications variable from the medication name. Lietsel mentioned that Datatecnica is generating a list of medication classes using UKB data, and I believe another team is working on something similar within GP2. I am not sure whether the aim there is a comprehensive mapping for free-text medication names or a project-focused mapping that includes only a subset of medications. We should combine efforts if possible; otherwise, my list can be used and added to in the interim for data harmonisation until it is replaced by a comprehensive mapping being developed in those projects.

I corrected unterminated strings in the values for most PD RFQ-U variables and dx_criteria_application. I also changed all instances of "Don’t Know" (mostly in PD RFQ-U and some MERQ variables) to "Unknown", to be consistent with the rest of GP2 data.
@neurouroboros
Author

I corrected unterminated strings in the values for most PD RFQ-U variables and dx_criteria_application. I also changed all instances of "Don’t Know" (mostly in PD RFQ-U and some MERQ variables) to "Unknown", to be consistent with other GP2 data.

@lietsel-jones
Collaborator

We can do this once a medications interest/project group in GP2 is formed; one is currently pending. Once we receive the dictionary they propose, we can compare it and modify where needed.

@neurouroboros
Author

GP2_Data_Dictionary_ver1.1-3.csv

My previous pull request regarding the unterminated strings for MERQ has not been addressed yet. I also corrected some other issues that I have been patching in the cohort harmonisation notebooks (in section 3B, which autopopulates the coding sheet). Please see below for the list of changes made to the data dictionary.

Summary of data dictionary corrections

1. Values-string fixes

dx_criteria_application

Fixed a malformed categorical Values string.

Before

["Clinically established PD", "Probable PD", "Other", 'Unknown"]

After

["Clinically established PD", "Probable PD", "Other", "Unknown"]
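
One way to catch strings like this before they reach downstream code is to try parsing each Values entry as a JSON array. This is a minimal sketch, assuming the Values column holds JSON-style lists of strings; the helper name `parse_values` is hypothetical and not part of the GP2 tooling.

```python
import json


def parse_values(raw):
    """Return the list encoded in a Values string, or None if it is malformed.

    Hypothetical helper: assumes Values entries are JSON-style arrays of strings.
    """
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return None
    # Only accept a list of strings; anything else is treated as malformed.
    if isinstance(parsed, list) and all(isinstance(v, str) for v in parsed):
        return parsed
    return None


# The "Before" and "After" strings for dx_criteria_application, copied from above.
before = '["Clinically established PD", "Probable PD", "Other", \'Unknown"]'
after = '["Clinically established PD", "Probable PD", "Other", "Unknown"]'

print(parse_values(before))  # None, because the last item has mismatched quotes
print(parse_values(after))   # the four categories as a Python list
```

Running a check of this kind over every row of the dictionary would flag the remaining malformed entries in one pass, rather than patching them notebook by notebook.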

2. Numeric-rule fixes

Added or corrected numeric validation rules so they are valid downstream expressions using y.

smell_test_num_smells

After

(y == 12)

smell_test_num_correct

After

(y >= 0) & (y <= 12)

smell_test_score

After

(y >= 0) & (y <= 12)

smell_test_score_best

After

(y >= 0) & (y <= 12)

pm_RIN

Changed from a non-evaluable text/range-style entry to a valid numeric rule.

After

(y >= 0) & (y <= 10)

pm_PH

Changed from a non-evaluable text/range-style entry to a valid numeric rule.

After

(y >= 0) & (y <= 14)

brain_weight

Changed from a unit label to a valid numeric rule.

After

(y >= 0) & (y <= 3000)
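
To illustrate what "valid downstream expressions using y" could mean in practice, the sketch below evaluates the corrected rules against single values. It assumes a rule is a Python expression in which `y` is bound to the observed value and which must return a boolean; how the harmonisation notebooks actually evaluate rules is not shown in this pull request, and the function names and the sample entries `"grams"` and `"0-10"` are hypothetical.

```python
# Corrected rules, copied from the list above.
RULES = {
    "smell_test_num_smells": "(y == 12)",
    "smell_test_num_correct": "(y >= 0) & (y <= 12)",
    "smell_test_score": "(y >= 0) & (y <= 12)",
    "smell_test_score_best": "(y >= 0) & (y <= 12)",
    "pm_RIN": "(y >= 0) & (y <= 10)",
    "pm_PH": "(y >= 0) & (y <= 14)",
    "brain_weight": "(y >= 0) & (y <= 3000)",
}


def evaluate_rule(rule, value):
    """Evaluate a rule string with `y` bound to one numeric value.

    eval is used only because the rules come from a trusted, version-controlled
    dictionary; builtins are disabled as an extra precaution.
    """
    result = eval(rule, {"__builtins__": {}}, {"y": value})
    if not isinstance(result, bool):
        raise ValueError(f"rule did not return a boolean: {rule!r}")
    return result


def is_evaluable(rule):
    """Return True if the rule can be evaluated to a boolean for y = 0."""
    try:
        evaluate_rule(rule, 0)
    except Exception:
        return False
    return True


print(evaluate_rule(RULES["pm_PH"], 7.2))          # True
print(evaluate_rule(RULES["brain_weight"], 3500))  # False
print(is_evaluable("grams"))  # False: a unit label (hypothetical) is not a rule
print(is_evaluable("0-10"))   # False: a range-style entry (hypothetical) is not boolean
```

The boolean check is a deliberate design choice: a range-style entry such as `0-10` is syntactically valid Python but evaluates to a number, so it would otherwise slip through unnoticed.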

3. MERQ yes/no values standardisation

Standardised malformed categorical Values strings for the MERQ yes/no response set to:

["Yes", "No", "Don't know", "Prefer not to answer"]

Affected items:

  • m_merq_job_pesticide
  • m_merq_job_solvent
  • m_merq_job_weld
  • m_merq_smoking_ever
  • m_merq_smoking_continue
  • m_merq_caffeine_ever
  • m_merq_caffeine_continue
  • m_merq_head_injury_ever
  • m_merq_alcohol_ever
  • m_merq_alcohol_continue
  • m_merq_diabetes_dx
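
The standardisation above could be applied programmatically along these lines. This is a sketch only: it assumes the dictionary rows are available as Python dicts with a `Values` column, and the item-name column `Item`, the function name, and the malformed sample string are hypothetical.

```python
import json

# Standard MERQ yes/no response set, as listed above.
MERQ_YES_NO = ["Yes", "No", "Don't know", "Prefer not to answer"]

# Affected items, as listed above.
AFFECTED = {
    "m_merq_job_pesticide", "m_merq_job_solvent", "m_merq_job_weld",
    "m_merq_smoking_ever", "m_merq_smoking_continue",
    "m_merq_caffeine_ever", "m_merq_caffeine_continue",
    "m_merq_head_injury_ever", "m_merq_alcohol_ever",
    "m_merq_alcohol_continue", "m_merq_diabetes_dx",
}


def standardise_merq(rows, item_col="Item"):
    """Return copies of the rows with Values replaced for the affected items.

    `item_col` is an assumed column name; adjust it to the real dictionary header.
    """
    fixed = []
    for row in rows:
        row = dict(row)  # copy so the input rows are left unchanged
        if row.get(item_col) in AFFECTED:
            row["Values"] = json.dumps(MERQ_YES_NO)
        fixed.append(row)
    return fixed


rows = [
    # Hypothetical malformed entry (missing closing quote after "Don't know").
    {"Item": "m_merq_job_weld",
     "Values": '["Yes", "No", "Don\'t know, "Prefer not to answer"]'},
    # Unaffected item: left as it is.
    {"Item": "dx_criteria_application",
     "Values": '["Clinically established PD", "Probable PD", "Other", "Unknown"]'},
]

for row in standardise_merq(rows):
    print(row["Item"], row["Values"])
```

Writing the value set with `json.dumps` rather than by hand keeps the quoting consistent across all eleven items.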

4. Row-level correction

path_sn_neuronal_loss

Corrected both ItemType and Values.

Changes

  • ItemType set to: string
  • Values changed to: ["None", "Mild", "Moderate", "Severe"]

This removes unsupported categories that were causing downstream issues.
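
A quick way to spot unsupported categories in a cohort column is to compare the observed values with the allowed list. The sketch below assumes the allowed categories are the corrected Values for path_sn_neuronal_loss; the function name and the sample data, including "Not assessed", are hypothetical.

```python
import json

# Corrected Values for path_sn_neuronal_loss, copied from above.
ALLOWED = json.loads('["None", "Mild", "Moderate", "Severe"]')


def unsupported_values(observed, allowed=ALLOWED):
    """Return observed values absent from the allowed list, in first-seen order."""
    found = []
    for value in observed:
        if value not in allowed and value not in found:
            found.append(value)
    return found


# Hypothetical cohort column; "Not assessed" is not an allowed category.
sample = ["Mild", "Severe", "Not assessed", "None", "Not assessed"]
print(unsupported_values(sample))  # ['Not assessed']
```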

5. Minor formatting normalisation

upd23c1_min_since_last_levodopa

Applied whitespace-only cleanup to the numeric rule.

Before

(y>=0) 

After

(y>=0)
