eastgenomics/eggd_pandora

eggd_pandora

What does this app do?

eggd_pandora is a variant sharing system that takes variant data from a source and submits it to a variant database. It can work in two ways:

  1. Take a case in OpenCGA and submit the variants that have been interpreted for the proband to the DECIPHER database
  2. Take a csv file of interpreted variants and submit these variants to the ClinVar database (typically this csv is made by the variant workbook parser script that gets interpreted variants from an Excel workbook; an example csv is here).

What inputs are required for this app to run?

  • --running_mode: (string) mode that eggd_pandora should run in:
    • "decipher" - pull from OpenCGA and push to DECIPHER
    • "clinvar" - take in a variant csv and submit all the variants in it to ClinVar
    • "get_clinvar_accession" - take a file of ClinVar submission IDs and retrieve the accession ID for each.

DECIPHER

  • --decipher_api_keys: (file) DNAnexus link to JSON file containing the client key and user key to access the API for the DECIPHER project to which the variants should be submitted
  • --opencga_config: (file) DNAnexus link to JSON file containing the user and password for OpenCGA
  • --opencga_case_id: (string) the case ID on OpenCGA for the case that is to be submitted to DECIPHER
  • --opencga_study_name: (string) the name of the OpenCGA study containing the case that is to be submitted to DECIPHER
  • --decipher_submitter_id: (int) the DECIPHER account ID of the submitter

ClinVar

  • --variant_csv: (file) Variant .csv file with data that should be converted to a JSON
  • --clinvar_api_key: (file) File containing ClinVar API key
  • --clinvar_testing: (bool) whether to use the ClinVar test endpoint (True) or the live endpoint (False)

ClinVar accession

  • --clinvar_api_key: (file) File containing ClinVar API key
  • --submission_ids_file: (file) File containing ClinVar submission IDs. Example format:
    Local_ID    ClinVar_Submission_ID
    uid_1707758278575948778 SUB14474946
    uid_1707745813424903959 SUB14471647
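
The example format above can be read with a few lines of Python. This is a minimal sketch, not the app's own parser: the function name `read_submission_ids` is hypothetical, and it assumes the two columns are separated by whitespace (spaces or tabs) and that the first row is the header shown above.

```python
import io


def read_submission_ids(handle):
    """Return {local_id: submission_id} from an open text file.

    Assumes (not confirmed by the app's source) a header row followed by
    two whitespace-separated columns per line.
    """
    rows = [line.split() for line in handle if line.strip()]
    if not rows:
        return {}
    header, *records = rows
    if header != ["Local_ID", "ClinVar_Submission_ID"]:
        raise ValueError(f"unexpected header: {header}")
    mapping = {}
    for record in records:
        if len(record) != 2:
            raise ValueError(f"expected 2 columns, got: {record}")
        local_id, submission_id = record
        mapping[local_id] = submission_id
    return mapping


# Usage with the example rows from this README.
example = io.StringIO(
    "Local_ID    ClinVar_Submission_ID\n"
    "uid_1707758278575948778 SUB14474946\n"
    "uid_1707745813424903959 SUB14471647\n"
)
print(read_submission_ids(example))
```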

How does this app work?

This app runs the script pandora.sh, which can run both DECIPHER and ClinVar variant submissions.

In "decipher" running mode, pandora.sh will run pull_from_opencga.py, which extracts the necessary information for the case to be submitted to DECIPHER and outputs it in a JSON called case_phenotype_and_variant_data.json. This JSON is then passed to push_to_decipher.py, which reformats this information and submits it to DECIPHER.

In "clinvar" running mode, pandora.sh will run pull_from_csv.py to extract the information needed for submission to ClinVar from a csv of variant data and export a JSON for each variant. These variant JSONs are then passed to push_to_clinvar.py, which submits the variant data to ClinVar. push_to_clinvar.py then runs get_clinvar_accession.py, which queries the ClinVar API to retrieve the accession ID for that submission. The script queries the API every five minutes and quits as soon as it retrieves an accession ID. If ClinVar has not generated an accession ID within an hour, the script stops trying and quits.

In "get_clinvar_accession" running mode, pandora.sh will run get_clinvar_accession.py, which queries the ClinVar API to retrieve the accession ID for the submission IDs in the input file. The script queries the API every five minutes and quits as soon as it retrieves an accession ID. If ClinVar has not generated an accession ID within an hour, the script stops trying and quits.
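
The polling behaviour described above (query every five minutes, give up after an hour) can be sketched as follows. This is an illustration under stated assumptions rather than the code in get_clinvar_accession.py: `poll_for_accession` and `fetch_accession` are hypothetical names, and `fetch_accession` stands in for whatever call the script makes to the ClinVar API, returning `None` while no accession ID is available.

```python
import time


def poll_for_accession(fetch_accession, submission_id,
                       interval_s=300, timeout_s=3600, sleep=time.sleep):
    """Call fetch_accession(submission_id) until it returns an ID.

    Waits interval_s seconds between attempts and returns None once
    timeout_s seconds of waiting have elapsed without a result.
    """
    waited = 0
    while True:
        accession = fetch_accession(submission_id)
        if accession is not None:
            return accession
        if waited >= timeout_s:
            return None
        sleep(interval_s)
        waited += interval_s


# Usage with a stand-in fetcher that succeeds on the third attempt.
# The returned value is a placeholder, not a real accession ID.
attempts = []


def fake_fetch(submission_id):
    attempts.append(submission_id)
    return "PLACEHOLDER_ACCESSION" if len(attempts) == 3 else None


result = poll_for_accession(fake_fetch, "SUB14474946", sleep=lambda s: None)
print(result, len(attempts))
```

Passing `sleep` as a parameter keeps the loop testable without real waiting; in a real run the default `time.sleep` would apply.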

What does this app output?

In DECIPHER mode:
This app creates a DECIPHER patient record for a case in OpenCGA and adds HPO phenotype terms and interpreted variants. The eggd_pandora app will create a new proband patient record in DECIPHER if the patient does not already exist, or add the variants to an existing patient. The app outputs a link to the new or updated patient record.

In ClinVar mode:
Adds the variants in the input csv to ClinVar. Outputs a tsv with the local ID and the ClinVar accession ID for each variant.

In ClinVar accession mode:
Outputs a tsv with the local ID and the ClinVar accession ID for each variant.

App notes and variant input limitations

  • eggd_pandora for DECIPHER only works for SNVs, indels, insertions and deletions.
  • eggd_pandora for DECIPHER assumes that patients that have their sex recorded in OpenCGA as "male" are 46XY and as "female" are 46XX.
  • DECIPHER only accepts sequence variants less than 100 bp in length.
  • DECIPHER only accepts variants on build GRCh38.

This app was made by East GLH
