Bioinformatics & Cheminformatics Project

This project demonstrates the use of Python and RDKit for drug discovery-related data analysis.
The workflow collects molecular datasets from ChEMBL, cleans and curates them, applies filters such as Lipinski’s Rule of Five, calculates molecular descriptors, and generates plots of the results.
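As a rough illustration of the cleaning and curation step, the sketch below drops incomplete records and removes duplicate structures with pandas. The column names (`molecule_chembl_id`, `canonical_smiles`, `standard_value`) follow common ChEMBL activity fields but are an assumption about what the notebooks use, the `curate` helper is hypothetical, and the rows are placeholder data, not real ChEMBL records.

```python
import pandas as pd


def curate(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows missing a structure or activity value, then de-duplicate by SMILES."""
    cleaned = df.dropna(subset=["canonical_smiles", "standard_value"])
    # Keep the first record seen for each structure
    cleaned = cleaned.drop_duplicates(subset="canonical_smiles", keep="first")
    return cleaned.reset_index(drop=True)


# Placeholder records standing in for a ChEMBL export (IDs and values are made up)
raw = pd.DataFrame(
    {
        "molecule_chembl_id": ["M1", "M2", "M3", "M4", "M5"],
        "canonical_smiles": ["CCO", "CCO", None, "c1ccccc1", "CC(=O)O"],
        "standard_value": [120.0, 95.0, 40.0, None, 300.0],
    }
)

curated = curate(raw)
print(curated)
```

Keeping the first record per structure is only one possible policy; averaging replicate measurements is another option, depending on the analysis.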


📂 Project Contents

  • Project work.ipynb
    Main workflow notebook. Includes:

    • Data collection from ChEMBL
    • Data cleaning and duplicate removal
    • Lipinski’s Rule of Five analysis
    • Descriptor calculation
    • Basic visualizations with RDKit and matplotlib
  • aromatase.ipynb
    Focused case study on compounds related to aromatase inhibitors:

    • Data preparation
    • Rule of Five and descriptor screening
    • Plotting molecular property distributions
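The descriptor calculation and Rule of Five screening listed above can be sketched roughly as follows. This is a minimal example, not the notebooks' actual code: the function names are hypothetical, and the cutoff of at most one violation is a common convention that the notebooks may or may not follow. The thresholds are the standard ones: molecular weight ≤ 500 Da, LogP ≤ 5, hydrogen-bond donors ≤ 5, hydrogen-bond acceptors ≤ 10.

```python
def lipinski_violations(mw, logp, hbd, hba):
    """Count how many of the four Rule of Five limits a compound exceeds."""
    checks = [mw > 500, logp > 5, hbd > 5, hba > 10]
    return sum(checks)


def describe(smiles):
    """Return the four Rule of Five descriptors for a SMILES string, or None if it does not parse."""
    # Imported here so lipinski_violations stays usable without RDKit installed
    from rdkit import Chem
    from rdkit.Chem import Descriptors, Lipinski

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    return {
        "mw": Descriptors.MolWt(mol),         # molecular weight
        "logp": Descriptors.MolLogP(mol),     # Crippen LogP estimate
        "hbd": Lipinski.NumHDonors(mol),      # hydrogen-bond donors
        "hba": Lipinski.NumHAcceptors(mol),   # hydrogen-bond acceptors
    }


def passes_rule_of_five(smiles, max_violations=1):
    """True if the molecule parses and exceeds at most `max_violations` limits."""
    props = describe(smiles)
    if props is None:
        return False
    return lipinski_violations(**props) <= max_violations
```

For example, `passes_rule_of_five("CC(=O)Oc1ccccc1C(=O)O")` (aspirin) returns `True`, since aspirin exceeds none of the four limits.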

🛠️ Tools & Libraries Used

  • Python
  • RDKit
  • pandas
  • matplotlib
  • Jupyter Notebook
  • ChEMBL (data source)

🚀 How to Run

  1. Clone this repository:

    git clone https://github.com/mike3119/https-github.com-mike3119-my-project.git

  2. Navigate into the folder:

    cd https-github.com-mike3119-my-project

  3. Install dependencies:

    pip install rdkit pandas matplotlib

  4. Open the notebooks:

    jupyter notebook


📊 Example Outputs

Some of the results you can expect:

  • Distribution plots of molecular weight, LogP, and other drug-likeness properties
  • Filtering of compounds based on Lipinski’s Rule of Five
  • Data tables of curated compounds for further analysis
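A distribution plot like the ones mentioned above can be produced with a few lines of matplotlib. The sketch below is illustrative only: `plot_distribution` is a hypothetical helper, and the molecular weights are made-up values, not results from the notebooks.

```python
import matplotlib.pyplot as plt


def plot_distribution(values, label, bins=20, ax=None):
    """Draw a histogram of one molecular property and return the Axes."""
    if ax is None:
        _, ax = plt.subplots(figsize=(5, 3.5))
    ax.hist(values, bins=bins, edgecolor="black")
    ax.set_xlabel(label)
    ax.set_ylabel("Number of compounds")
    ax.set_title(f"Distribution of {label}")
    return ax


# Illustrative molecular weights in Da (not output from the notebooks)
mol_weights = [180.2, 194.2, 151.2, 206.3, 285.3, 310.4, 267.4, 296.4]
ax = plot_distribution(mol_weights, "Molecular weight (Da)", bins=5)
ax.figure.tight_layout()
```

Passing an existing `ax` makes it possible to place several properties (for example molecular weight and LogP) side by side in one figure.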


🎯 Project Goals

  • Learn and apply bioinformatics/cheminformatics techniques for drug discovery.
  • Explore the power of ChEMBL as a molecular dataset resource.
  • Practice molecular analysis with RDKit in Python.
  • Build a foundation for advanced drug design projects.


👤 Author

Michael Hemen Akosu

hemenmonterakosu@gmail.com

B.Sc. Chemistry

PGD in Drug Analysis, Pharmaceutical Chemistry (University of Ibadan)

Open to internships, collaborations, and learning opportunities in bioinformatics and cheminformatics.