class ErfanZohrabi:
    def __init__(self):
        self.role = "MSc Bioinformatics @ University of Bologna"
        self.background = "BSc Cellular & Molecular Biology (GPA: 17.87/20)"
        self.focus = ["LLMs in Genomics", "Deep Learning for Biology",
                      "Protein Language Models", "Multi-Omics Integration"]
        self.passion = "Decoding life's algorithms through AI & computation"

    def current_research(self):
        return {
            "genomics"  : "DNA/Protein sequence classification with Deep Learning",
            "llm_bio"   : "Large Language Models applied to biological sequences",
            "omics"     : "Single-cell & multi-omics data integration",
            "ai_safety" : "Trustworthy AI in clinical genomics"
        }

| Domain | Technologies |
|---|---|
| Foundation Models | Transformers · BERT/GPT Architectures · Protein Language Models (PLM) |
| Genomic LLMs | DNABERT · AlphaFold · ESM · ProtTrans |
| Deep Learning | PyTorch · TensorFlow · CNNs · RNNs · Attention Mechanisms |
| Generative AI | VAEs · GANs · Diffusion Models · Sequence Generation |
| Graph Neural Nets | GNNs for PPI Networks · Molecular Graphs (CS224W Stanford) |
| NLP for Biology | Biomedical Text Mining · Drug Discovery · Sequence Tokenization |

┌──────────────────────────────────────────────────────────────────────────┐
│                        BIOINFORMATICS SKILL TREE                         │
├──────────────────────┬──────────────────────┬────────────────────────────┤
│  SEQUENCE ANALYSIS   │  STRUCTURAL BIO      │  OMICS & SYSTEMS           │
│  ◈ DNA Classification│  ◈ Protein Folding   │  ◈ scRNA-seq Analysis      │
│  ◈ Sequence Alignment│  ◈ AlphaFold/PLM     │  ◈ Multi-Omics Integration │
│  ◈ Promoter Analysis │  ◈ Signal Peptides   │  ◈ Genomics + Proteomics   │
│  ◈ Methylation (ILL) │  ◈ HMM Profiles      │  ◈ Transcriptomics         │
│  ◈ Variant Calling   │  ◈ Domain Annotation │  ◈ Epigenomics             │
├──────────────────────┼──────────────────────┼────────────────────────────┤
│  CLINICAL GENOMICS   │  TOOLS & PIPELINES   │  ML FOR BIO                │
│  ◈ Pathogenicity Pred│  ◈ MEGA11            │  ◈ Random Forest           │
│  ◈ ClinVar Analysis  │  ◈ Biopython         │  ◈ SVM / KNN               │
│  ◈ PolyPhen / SIFT   │  ◈ BLAST / HMMER     │  ◈ Neural Networks         │
│  ◈ Cancer Genomics   │  ◈ Illumina Arrays   │  ◈ PSO Optimization        │
│  ◈ KCNB1 Variants    │  ◈ Jupyter / RStudio │  ◈ LOOCV / Cross-Val       │
└──────────────────────┴──────────────────────┴────────────────────────────┘


| Project | Description |
|---|---|
| Breast Cancer Prediction | ML and DL classification of promoter DNA sequences for breast cancer prediction. Compared KNN, SVM (RBF), neural networks, and AdaBoost; the PSO-optimized SVM reached 96.3% accuracy. |
| ML for Genetic Variant Prediction | Random Forest model trained on ClinVar data to classify KCNB1 gene variants as pathogenic or benign. Benchmarked against the in-silico tools PolyPhen and SIFT. |
| Illumina Infinium Array | Statistical analysis in R of fluorescence intensity data and methylation status from Illumina Infinium arrays. Covers probe characteristics, beta values, and differential methylation. |
| Protein Sequence ML Model | Predictive modeling of signal peptides in protein sequences using ML. Signal peptides are key to understanding protein secretion and subcellular localization. |
| Lab of Bioinformatics Project | Built a profile Hidden Markov Model (pHMM) for the Kunitz-type protease inhibitor domain, a structural bioinformatics exercise using HMMER and multiple sequence alignment (MSA). |
| Personal Research Website | Personal portfolio built with HTML/CSS/JS showcasing research, experience, and projects in bioinformatics, AI, and computational biology. |

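
The beta values mentioned in the Illumina Infinium Array project can be illustrated with a short sketch. The project itself was done in R; this Python version is only an illustration, not the project's code. It assumes the commonly used definition beta = M / (M + U + offset) with an offset of 100, and the function names (`beta_value`, `m_value`, `delta_beta`) and the example intensities are hypothetical.

```python
import math


def beta_value(methylated, unmethylated, offset=100.0):
    """Beta value for one probe: M / (M + U + offset).

    The offset (assumed default: 100) stabilizes the ratio when both
    intensities are low.
    """
    if methylated < 0 or unmethylated < 0:
        raise ValueError("intensities must be non-negative")
    return methylated / (methylated + unmethylated + offset)


def m_value(beta):
    """Logit-transformed beta value: log2(beta / (1 - beta))."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return math.log2(beta / (1.0 - beta))


def delta_beta(group_a, group_b):
    """Difference in mean beta value between two sample groups."""
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    return mean_a - mean_b


# Hypothetical intensities for a single probe (not real array data).
beta = beta_value(methylated=900.0, unmethylated=0.0)
print(round(beta, 3))  # 0.9
```

A difference in mean beta between two groups, as computed by `delta_beta`, is one simple effect-size measure for differential methylation. A real analysis would add a statistical test and multiple-testing correction.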
| Year | Title | Journal |
|---|---|---|
| 2022 | Applications of Python Programming in Bioinformatics (Biopython) | Journal of Ghin |
| 2021 | Cancer Cell Cycle in Breast & Testicular Cancer | Journal of Ghin |
| 2020 | Targeted Drug Delivery for Cancer Treatment | Journal of Ghin |

🔬 ACTIVE RESEARCH AREAS
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🧬 Large Language Models (LLMs) for DNA/RNA sequence analysis
🤖 Protein Language Models (ESM, ProtTrans, AlphaFold integration)
🧠 Deep Reinforcement Learning for DNA sequence alignment
🕸️ Graph Neural Networks in Computational Biology
✂️ AI + CRISPR: smart gene-editing target identification
🔬 Single-cell & Spatial Transcriptomics with Deep Learning
📊 Multi-Omics Data Integration (Genomics + Proteomics + Transcriptomics)
🔒 Trustworthy & Interpretable AI for Clinical Genomics
🧪 Generative Models for Protein Sequence Design
🧲 Biomedical Text Mining for Drug Discovery
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

| Degree | Institution | Focus |
|---|---|---|
| MSc Bioinformatics | University of Bologna | ML, Deep Learning in Genomics, Multi-Omics, Structural Bio |
| BSc Cellular & Molecular Biology | University of Damghan | GPA: 17.87/20 · Genetics, Biostatistics, Programming |