langbnj/alphasync

alphasync

AlphaSync (https://alphasync.stjude.org) is an updated AlphaFold structure database synchronised with UniProt.

AlphaSync predicts new structures to stay up to date with the latest UniProt release. It also enhances all structures with residue-level data such as solvent accessibility and atom-level non-covalent contacts. A preprint manuscript describing AlphaSync is available at https://www.biorxiv.org/content/10.1101/2025.03.12.642845v1.

Please note

This repository provides the structure prediction and processing pipeline behind AlphaSync. It is not yet optimised for local deployment. It currently requires a local SQL server to be set up and is optimised for an LSF HPC environment with a local Singularity container of AlphaFold 2.3.2. We have tentative plans for a more portable containerised version in future.

Initial setup

  • Install Lahuta (https://bisejdiu.github.io/lahuta), which is required for residue-residue contacts. Lahuta is not yet publicly released, but should be soon.
  • Install Python packages (see individual scripts for imports)
  • Install DSSP to make sure mkdssp is available (https://github.com/PDB-REDO/dssp)
  • Update blang_mysql.py with SQL connection details
  • Create the tables defined in sql/sql_create_statements.sql, then import the .sql files
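
Part of the checklist above can be verified before the first run. The following is a minimal sketch and is not part of the repository: the helper name `preflight` and its `which` parameter are hypothetical. It only checks that `mkdssp` is on the PATH and that the two files named above exist relative to the repository root. It does not check Lahuta, the Python packages, or the SQL connection.

```python
import shutil
from pathlib import Path

# Files named in the setup steps, relative to the repository root.
REQUIRED_FILES = ("blang_mysql.py", "sql/sql_create_statements.sql")


def preflight(repo_root=".", which=shutil.which):
    """Return a list of problems found; an empty list means all checks passed.

    Hypothetical helper: `which` is injectable so the check can be
    exercised on a machine without DSSP installed.
    """
    problems = []
    # DSSP provides the mkdssp executable.
    if which("mkdssp") is None:
        problems.append("mkdssp not found on PATH (install DSSP)")
    root = Path(repo_root)
    for rel in REQUIRED_FILES:
        if not (root / rel).is_file():
            problems.append(f"missing file: {rel}")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print(problem)
```

Run from the repository root, the script prints one line per failed check and prints nothing when all checks pass.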

To update

  • Run run.py
    • Downloads structures from AlphaFold Protein Structure Database (AFDB) via FTP and GCS
    • Parses structures
    • Calculates RSA/dihedrals/contacts
    • Maps sequences to structures
  • Run run.py -alphasync
    • Refreshes protein sequence and proteome data from UniProt REST API and FTP
    • Submits AlphaFold structure prediction jobs as needed
    • Parses structures
    • Calculates RSA/dihedrals/contacts
    • Maps sequences to structures
    • Can then migrate alphasync_compact SQL tables to web server (code available on request)
    • Repeat for new UniProt releases
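
The two passes above can be scripted in sequence. This is a minimal sketch under stated assumptions: it assumes `run.py` is started with the current Python interpreter from the repository root, and the names `update_commands` and `run_update` are hypothetical. Only the `-alphasync` flag is taken from the steps above.

```python
import subprocess
import sys


def update_commands(python=sys.executable):
    """Return the two command lines for one update cycle, in order."""
    return [
        [python, "run.py"],                # pass 1: AFDB structures
        [python, "run.py", "-alphasync"],  # pass 2: UniProt refresh and new predictions
    ]


def run_update(dry_run=True):
    """Run both passes in order, stopping at the first failure.

    With dry_run=True the commands are only printed, not executed.
    """
    for cmd in update_commands():
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_update(dry_run=True)
```

The dry-run default only prints the commands. Whether the second pass waits for submitted prediction jobs to finish is not stated here, so a wrapper like this may need to be run again once those jobs complete.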

Acknowledgements

The code in input/alphasync/alphafold_tools is modified slightly from https://github.com/google-deepmind/alphafold, licensed under the Apache 2.0 license. The main change is a split into CPU- and GPU-based steps for more efficient parallelisation, similar to AlphaFold 3.
