ErfanZohrabi/README.md

🌌 About Me

class ErfanZohrabi:
    def __init__(self):
        self.role       = "MSc Bioinformatics @ University of Bologna"
        self.background = "BSc Cellular & Molecular Biology (GPA: 17.87/20)"
        self.focus      = ["LLMs in Genomics", "Deep Learning for Biology",
                           "Protein Language Models", "Multi-Omics Integration"]
        self.passion    = "Decoding life's algorithms through AI & computation"

    def current_research(self):
        return {
            "genomics"   : "DNA/Protein sequence classification with Deep Learning",
            "llm_bio"    : "Large Language Models applied to biological sequences",
            "omics"      : "Single-cell & multi-omics data integration",
            "ai_safety"  : "Trustworthy AI in clinical genomics"
        }

🤖 AI & LLM Skills

| Domain | Technologies |
|---|---|
| Foundation Models | Transformers · BERT/GPT Architectures · Protein Language Models (PLM) |
| Genomic LLMs | DNABERT · AlphaFold · ESM · ProtTrans |
| Deep Learning | PyTorch · TensorFlow · CNNs · RNNs · Attention Mechanisms |
| Generative AI | VAEs · GANs · Diffusion Models · Sequence Generation |
| Graph Neural Nets | GNNs for PPI Networks · Molecular Graphs (CS224W Stanford) |
| NLP for Biology | Biomedical Text Mining · Drug Discovery · Sequence Tokenization |

🎓 AI Certifications

GenAI + LLMs · Deep Learning · Generative Models · Graph ML · Python Research


🧬 Bioinformatics Expertise

┌────────────────────────────────────────────────────────────────────────┐
│                     BIOINFORMATICS SKILL TREE                          │
├─────────────────────┬──────────────────────┬───────────────────────────┤
│   SEQUENCE ANALYSIS │   STRUCTURAL BIO     │   OMICS & SYSTEMS         │
│ ● DNA Classification│ ● Protein Folding    │ ● scRNA-seq Analysis      │
│ ● Sequence Alignment│ ● AlphaFold/PLM      │ ● Multi-Omics Integration │
│ ● Promoter Analysis │ ● Signal Peptides    │ ● Genomics + Proteomics   │
│ ● Methylation (ILL) │ ● HMM Profiles       │ ● Transcriptomics         │
│ ● Variant Calling   │ ● Domain Annotation  │ ● Epigenomics             │
├─────────────────────┼──────────────────────┼───────────────────────────┤
│   CLINICAL GENOMICS │   TOOLS & PIPELINES  │   ML FOR BIO              │
│ ● Pathogenicity Pred│ ● MEGA11             │ ● Random Forest           │
│ ● ClinVar Analysis  │ ● Biopython          │ ● SVM / KNN               │
│ ● PolyPhen / SIFT   │ ● BLAST / HMMER      │ ● Neural Networks         │
│ ● Cancer Genomics   │ ● Illumina Arrays    │ ● PSO Optimization        │
│ ● KCNB1 Variants    │ ● Jupyter / RStudio  │ ● LOOCV / Cross-Val       │
└─────────────────────┴──────────────────────┴───────────────────────────┘

🚀 Featured Projects

🦠 DNA Sequence Classification

Breast Cancer Prediction

Repo Python Jupyter

ML and DL classification of promoter DNA sequences for breast cancer prediction. Compared KNN, SVM (RBF kernel), neural networks, and AdaBoost; the PSO-optimized SVM reached 96.3% accuracy.

SVM Neural Networks PSO KNN AdaBoost
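
As a rough illustration of how DNA sequences can be fed to a classifier such as KNN, the sketch below turns each sequence into normalized k-mer frequencies and applies a small nearest-neighbour vote. The k-mer encoding, the function names `kmer_vector` and `knn_predict`, and the toy sequences are all assumptions made for this example; the project's actual features and models may differ.

```python
from collections import Counter
from itertools import product
import math


def kmer_vector(seq, k=2):
    """Return normalized k-mer frequencies of a DNA sequence in a fixed order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers) or 1  # avoid division by zero
    return [counts[m] / total for m in kmers]


def knn_predict(train, query, n_neighbors=3):
    """Majority vote among the n_neighbors closest training vectors."""
    ranked = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in ranked[:n_neighbors])
    return votes.most_common(1)[0][0]


# Toy, made-up sequences and labels; not data from the project.
toy = [
    ("GCGCGCGGCGCC", "class_a"),
    ("GGCCGCGCGGCG", "class_a"),
    ("ATATTTAATATA", "class_b"),
    ("TTAATATATAAT", "class_b"),
]
train = [(kmer_vector(s), y) for s, y in toy]
prediction = knn_predict(train, kmer_vector("GCGGCGCGCCGC"))
```

Here the GC-rich query lands next to the two GC-rich training sequences, so the vote returns `class_a`.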

🧠 KCNB1 Variant Pathogenicity

ML for Genetic Variant Prediction

Repo Python

Random Forest model on ClinVar data to classify KCNB1 gene variants as pathogenic or benign. Benchmarked against PolyPhen and SIFT in-silico tools.

Random Forest LOOCV ClinVar PolyPhen SIFT
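
The LOOCV tag refers to leave-one-out cross-validation: each sample is held out once, the model is trained on the rest, and the held-out prediction is scored. The sketch below shows only that loop. The 1-nearest-neighbour functions `fit_1nn` and `predict_1nn` are hypothetical stand-ins for the project's Random Forest, and the one-feature data is made up.

```python
import math


def fit_1nn(X, y):
    """Hypothetical stand-in model: a 1-nearest-neighbour 'fit' just stores the data."""
    return list(zip(X, y))


def predict_1nn(model, x):
    """Return the label of the stored sample closest to x (Euclidean distance)."""
    return min(model, key=lambda item: math.dist(item[0], x))[1]


def loocv_accuracy(X, y, fit, predict):
    """Hold out each sample once, train on the rest, score the held-out sample."""
    correct = 0
    for i in range(len(X)):
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        correct += predict(model, X[i]) == y[i]
    return correct / len(X)


# Made-up one-feature toy data; not ClinVar records.
X = [[0.1], [0.2], [0.3], [0.8], [0.9], [1.0]]
y = ["benign"] * 3 + ["pathogenic"] * 3
accuracy = loocv_accuracy(X, y, fit_1nn, predict_1nn)
```

Because `fit` and `predict` are passed in as arguments, the same loop could wrap any other model with the same two-function shape.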

🔬 DNA Methylation Analysis

Illumina Infinium Array

Repo R

Statistical analysis of fluorescence intensity data and methylation status from Illumina Infinium arrays in R. Covers probe characteristics, beta values, and differential methylation.

R Epigenomics Statistical Modeling Illumina
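
For readers unfamiliar with beta values: on Infinium arrays they are commonly computed from the methylated (M) and unmethylated (U) probe intensities as M / (M + U + offset), with an offset of 100 as the usual default. The project itself is written in R; the Python sketch below only illustrates the arithmetic. The function names and the `alpha=1` default for the M-value are assumptions for this example.

```python
import math


def beta_value(meth, unmeth, offset=100):
    """Beta = M / (M + U + offset); negative intensities are clipped to 0."""
    meth, unmeth = max(meth, 0), max(unmeth, 0)
    return meth / (meth + unmeth + offset)


def m_value(meth, unmeth, alpha=1):
    """M-value = log2((M + alpha) / (U + alpha)), a log-ratio alternative to beta."""
    return math.log2((max(meth, 0) + alpha) / (max(unmeth, 0) + alpha))


# Made-up intensities for a single probe.
beta = beta_value(900, 0)   # strongly methylated probe
m = m_value(7, 0)
```

Beta values lie between 0 (unmethylated) and 1 (fully methylated), while M-values are unbounded and often preferred for statistical testing.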

🧬 Signal Peptide Prediction

Protein Sequence ML Model

Repo Python

ML-based prediction of signal peptides in protein sequences. Signal peptides are key to understanding protein secretion and subcellular localization.

Protein ML Signal Peptides Sequence Analysis

πŸ—οΈ Profile HMM β€” Kunitz Domain

Lab of Bioinformatics Project

Repo Python

Built a profile Hidden Markov Model (pHMM) for the Kunitz-type protease inhibitor domain using HMMER and a multiple sequence alignment (MSA), as a rigorous structural bioinformatics exercise.

HMM HMMER MSA Domain Annotation

🌐 Bioinformatics CV Portfolio

Personal Research Website

Live HTML

Personal portfolio built with HTML/CSS/JS showcasing research, experience, and projects in bioinformatics, AI, and computational biology.

HTML CSS JavaScript GitHub Pages


🛠️ Tech Stack

💻 Programming Languages

Python R JavaScript SQL PHP

🤖 AI & Machine Learning

PyTorch TensorFlow scikit-learn Transformers NumPy Pandas Matplotlib

🧬 Bioinformatics Tools

Biopython RStudio MEGA11 HMMER BLAST Scanpy

🔧 Dev Tools

Git Jupyter PyCharm LaTeX


📊 GitHub Stats


📜 Publications

| Year | Title | Journal |
|---|---|---|
| 2022 | Applications of Python Programming in Bioinformatics (Biopython) | Journal of Ghin |
| 2021 | Cancer Cell Cycle in Breast & Testicular Cancer | Journal of Ghin |
| 2020 | Targeted Drug Delivery for Cancer Treatment | Journal of Ghin |

🔭 Research Interests

🔬 ACTIVE RESEARCH AREAS
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🧬  Large Language Models (LLMs) for DNA/RNA sequence analysis
🔤  Protein Language Models (ESM, ProtTrans, AlphaFold integration)
🧠  Deep Reinforcement Learning for DNA sequence alignment
🕸️  Graph Neural Networks in Computational Biology
⚗️  AI + CRISPR: smart gene-editing target identification
🔬  Single-cell & Spatial Transcriptomics with Deep Learning
🌐  Multi-Omics Data Integration (Genomics + Proteomics + Transcriptomics)
🔒  Trustworthy & Interpretable AI for Clinical Genomics
🧪  Generative Models for Protein Sequence Design
🧲  Biomedical Text Mining for Drug Discovery
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

🎓 Education

| Degree | Institution | Focus |
|---|---|---|
| MSc Bioinformatics | University of Bologna | ML, Deep Learning in Genomics, Multi-Omics, Structural Bio |
| BSc Cellular & Molecular Biology | University of Damghan | GPA: 17.87/20 · Genetics, Biostatistics, Programming |

"Decoding the language of life, one sequence at a time."

Pinned

  1. Bio_search (Public)

    BioSearch is a comprehensive web application that allows researchers, students, and biologists to search across multiple biological databases simultaneously. This tool simplifies the process of fin…

    Python 3

  2. Prediction-of-Protein-Protein-Interactions-Algorithm- (Public)

    Development of a Machine Learning Algorithm for the Prediction of Protein-Protein Interactions Using Python

    Python

  3. Evaluate-DNA-methylation (Public)

    Evaluates DNA methylation using data from Illumina Infinium array chips

    R

  4. Spymot (Public)

    A comprehensive protein analysis platform that combines motif detection with 3D structure validation using AlphaFold2 confidence scores. Designed for cancer biology research, drug discovery, and fu…

    Python 3

  5. Deep-Learning-for-Sexism-Detection-in-Tweets (Public)

    Sexism detection in tweets (EXIST 2023): a comparative deep-learning project evaluating LSTM vs Transformer models, with saved PyTorch checkpoints and re

    Jupyter Notebook

  6. LLM_Prompting (Public)

    Prompting experiments comparing zero-shot vs few-shot LLMs (Mistral 7B, TinyLlama) for multi-class sexism detection, with full evaluation plots and prediction logs.

    Jupyter Notebook
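
To make the zero-shot versus few-shot distinction in LLM_Prompting concrete, here is a minimal sketch of how the two prompt styles could be assembled. A zero-shot prompt carries only the instruction and the input; a few-shot prompt also includes labeled examples. The `build_prompt` helper, the label names, and the example tweets are hypothetical and are not taken from the repository.

```python
def build_prompt(text, labels, examples=()):
    """Build a classification prompt; an empty `examples` gives a zero-shot prompt."""
    parts = ["Classify the tweet into exactly one of these labels: "
             + ", ".join(labels) + "."]
    # Each (tweet, label) pair becomes one in-context demonstration.
    for ex_text, ex_label in examples:
        parts.append(f"Tweet: {ex_text}\nLabel: {ex_label}")
    # The final block leaves the label open for the model to complete.
    parts.append(f"Tweet: {text}\nLabel:")
    return "\n\n".join(parts)


# Hypothetical labels and demonstrations, for illustration only.
labels = ["label_a", "label_b", "label_c"]
demos = [("first example tweet", "label_a"), ("second example tweet", "label_c")]

zero_shot = build_prompt("tweet to classify", labels)
few_shot = build_prompt("tweet to classify", labels, demos)
```

Both prompts end with an open `Label:` line, so the same decoding and evaluation code can be reused across the two settings.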