CodeNinjaSarthak/Mri-Tumor-Classification-Benchmark

Brain Tumor MRI Classification

Research code supporting our work on multi-class brain tumour classification from MRI scans using deep convolutional neural networks.

Overview

This repository trains and evaluates five CNN architectures for 4-class brain tumour classification:

| Model | Type |
|---|---|
| Xception | Transfer learning (ImageNet) |
| EfficientNetB4 | Transfer learning (ImageNet) |
| ResNet-50 | Transfer learning (ImageNet) |
| Custom Residual CNN | Trained from scratch, dilated convolutions |
| Dilated CNN | Trained from scratch, sequential dilated convolutions |

Classes: glioma · meningioma · notumor · pituitary

Dataset: Brain Tumor MRI Dataset
Author: Masoud Nickparvar (Kaggle)
License: CC BY-SA 4.0

This repository does not redistribute the dataset.
Users must download it directly from Kaggle and comply with the dataset’s license terms.

Project Structure

Mri-Tumor-Classification-Benchmark/
│
├── notebooks/
│   ├── 00_original_notebook.ipynb     # Original monolithic notebook (reference)
│   ├── 01_data_exploration.ipynb      # EDA: class distribution, sample images
│   └── 02_train_and_evaluate.ipynb    # Model training + evaluation pipeline
│
├── src/                               # Importable Python modules
│   ├── __init__.py
│   ├── data_utils.py                  # Dataset loading & Keras generators
│   ├── models.py                      # Model factory functions
│   └── visualization.py              # Plotting helpers
│
├── configs/
│   └── config.yaml                    # All hyperparameters & paths (edit here)
│
├── data/
│   └── README.md                      # Dataset download instructions
│
├── figures/                           # Saved training curves & confusion matrices
│
├── outputs/                           # Model weights (gitignored, created at runtime)
│
├── environment.yml                    # Conda environment spec
├── requirements.txt                   # Pip requirements
├── LICENSE                            # Apache 2.0
└── CONTRIBUTING.md
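
Because the notebooks live in notebooks/ while the modules live in src/, a notebook needs the repository root on its import path before `import src` can work. The README does not say how the notebooks do this, so the helper below is only a minimal sketch; the function name `add_repo_root_to_path` is hypothetical and not part of the repository.

```python
import sys
from pathlib import Path


def add_repo_root_to_path(start="."):
    """Walk upwards from `start` until a folder containing src/__init__.py is found.

    Hypothetical helper: puts that folder on sys.path and returns it.
    """
    current = Path(start).resolve()
    for candidate in (current, *current.parents):
        if (candidate / "src" / "__init__.py").is_file():
            root = str(candidate)
            if root not in sys.path:
                sys.path.insert(0, root)  # make the `src` package importable
            return candidate
    raise FileNotFoundError(f"no src/__init__.py found at or above {current}")
```

Called from a notebook inside notebooks/, this should make the package shown in the tree above importable, for example `from src import data_utils`.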

Quick Start

1. Clone

git clone https://github.com/CodeNinjaSarthak/Mri-Tumor-Classification-Benchmark.git
cd Mri-Tumor-Classification-Benchmark

2. Create environment

# Option A — conda (recommended)
conda env create -f environment.yml
conda activate brain-tumor-mri

# Option B — pip + venv
python -m venv .venv
source .venv/bin/activate          # Windows: .venv\Scripts\activate
pip install -r requirements.txt

3. Download the dataset

Follow the instructions in data/README.md.

TL;DR (Kaggle API):

pip install kaggle
# Place ~/.kaggle/kaggle.json (from kaggle.com → Settings → Create New Token)
kaggle datasets download -d masoudnickparvar/brain-tumor-mri-dataset \
    -p data/ --unzip
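
After the download it can help to confirm that the expected folders exist before editing the config. The sketch below counts images per class in each split. The split names (Training, Testing) and class names come from this README; the list of image suffixes and the function name `count_images` are assumptions, and the dataset root is left as a parameter because the folder the archive unpacks into depends on how it is extracted.

```python
from pathlib import Path

CLASSES = ("glioma", "meningioma", "notumor", "pituitary")
SPLITS = ("Training", "Testing")
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumption: adjust to the files you actually have


def count_images(dataset_root):
    """Return {split: {class_name: image_count}}; raise if a class folder is missing."""
    root = Path(dataset_root)
    counts = {}
    for split in SPLITS:
        counts[split] = {}
        for cls in CLASSES:
            class_dir = root / split / cls
            if not class_dir.is_dir():
                raise FileNotFoundError(f"missing class folder: {class_dir}")
            counts[split][cls] = sum(
                1
                for p in class_dir.iterdir()
                if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts
```

Pass whichever directory contains Training/ and Testing/ on your machine; a FileNotFoundError usually means the root is one level too high or too low.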

4. Configure paths

Open configs/config.yaml and update the data: section:

data:
  training_dir: "data/brain-tumor-mri-dataset/Training"
  testing_dir:  "data/brain-tumor-mri-dataset/Testing"

On Kaggle the defaults (/kaggle/input/...) are correct as-is.
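
A wrong path in the config tends to surface only when a generator is first created, so it can be worth validating the data: section up front. The sketch below assumes the YAML file has already been parsed into a plain dict (for example with PyYAML's `yaml.safe_load`, if PyYAML is installed). The keys `training_dir` and `testing_dir` come from the snippet above; the function name `resolve_data_dirs` is hypothetical.

```python
from pathlib import Path


def resolve_data_dirs(config, base_dir="."):
    """Return absolute-or-base-relative paths for data.training_dir and data.testing_dir.

    `config` is the parsed contents of configs/config.yaml as a dict.
    Relative paths are interpreted relative to `base_dir` (assumed to be the repo root).
    """
    data_cfg = config.get("data") or {}
    resolved = {}
    for key in ("training_dir", "testing_dir"):
        if key not in data_cfg:
            raise KeyError(f"config is missing data.{key}")
        path = Path(data_cfg[key])
        if not path.is_absolute():
            path = Path(base_dir) / path
        if not path.is_dir():
            raise FileNotFoundError(f"data.{key} does not point to a directory: {path}")
        resolved[key] = path
    return resolved
```

Failing early with the offending key in the message is a design choice of this sketch; the repository's own loader may behave differently.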

5. Run the notebooks

jupyter notebook

Execute in order:

| Step | Notebook |
|---|---|
| 1 | notebooks/01_data_exploration.ipynb |
| 2 | notebooks/02_train_and_evaluate.ipynb |

Reproducing Results

All random seeds that TensorFlow exposes are set to 42. Note that full bit-for-bit reproducibility on GPU is not guaranteed by TensorFlow without additional environment variables; see the TF determinism guide for details.

To enable strict determinism, add the following at the top of the training notebook, before any other TensorFlow calls:

import os
os.environ["TF_DETERMINISTIC_OPS"] = "1"          # set before TensorFlow is imported

import tensorflow as tf
tf.config.experimental.enable_op_determinism()    # requires a recent TensorFlow 2.x release
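
For reference, seeding typically covers Python's own generator and NumPy as well as TensorFlow. The helper below is a minimal sketch of that idea, not the repository's seeding code; the name `seed_everything` is hypothetical, and NumPy and TensorFlow are seeded only if they can be imported.

```python
import random


def seed_everything(seed=42):
    """Seed Python's RNG, plus NumPy and TensorFlow when they are installed."""
    random.seed(seed)

    try:
        import numpy as np
    except ImportError:
        pass  # NumPy not available; nothing to seed
    else:
        np.random.seed(seed)

    try:
        import tensorflow as tf
    except ImportError:
        pass  # TensorFlow not available; nothing to seed
    else:
        tf.random.set_seed(seed)  # global TensorFlow seed
```

Seeding alone makes random draws repeatable; it does not make GPU kernels deterministic, which is what the determinism switch above is for.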

Key Design Decisions

| Decision | Rationale |
|---|---|
| All hyperparameters in configs/config.yaml | Single place to change epochs, LR, batch size; avoids notebook sprawl |
| EarlyStopping(restore_best_weights=True) + ReduceLROnPlateau | Prevents overfitting, adapts LR when training plateaus |
| 80 / 20 stratified split via ImageDataGenerator(validation_split=0.2) | Reproducible split with fixed seed=42 |
| Models deleted after evaluation (del model, history) | Frees GPU VRAM between sequential model runs on memory-limited hardware |
| Transfer-learning backbones fully fine-tuned (no frozen layers) | MRI imagery differs sufficiently from ImageNet that full fine-tuning yields better results than feature extraction alone |
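
To make the 80 / 20 split concrete, the plain-Python sketch below holds out a fixed fraction of every class, so each class keeps roughly the same train/validation ratio. It illustrates the idea only and is an assumption about, not a copy of, what ImageDataGenerator does internally; the function name `split_per_class` and the file names in the usage note are hypothetical.

```python
def split_per_class(files_by_class, validation_split=0.2):
    """Split each class's file list into (train, validation) by position.

    Files are sorted first so the result does not depend on listing order,
    then the leading `validation_split` fraction of each class is held out.
    """
    if not 0.0 < validation_split < 1.0:
        raise ValueError("validation_split must be between 0 and 1")
    train, val = {}, {}
    for cls, files in files_by_class.items():
        ordered = sorted(files)
        cut = int(validation_split * len(ordered))  # number of held-out files
        val[cls] = ordered[:cut]
        train[cls] = ordered[cut:]
    return train, val
```

For example, a class with ten files named img_0.jpg to img_9.jpg ends up with two validation files and eight training files, and the two sets never overlap.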

Notes on Architecture Naming

The model labelled "U-Net" in earlier versions of this codebase is a sequential dilated CNN, not the encoder-decoder with skip connections described in Ronneberger et al. (2015). It has been renamed to "Dilated CNN" in the refactored notebooks for clarity. The scientific implementation (weights, dilation rates, layer counts) is unchanged.
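
One way to see what a sequential stack of dilated convolutions provides is to compute its receptive field: for stride-1 layers it grows by (kernel size - 1) × dilation rate per layer. The sketch below applies that formula. This README does not list the model's kernel sizes or dilation rates, so the values in the usage note are hypothetical and chosen purely for illustration.

```python
def receptive_field(kernel_sizes, dilation_rates):
    """Receptive field (in pixels, along one axis) of stacked stride-1 convolutions."""
    if len(kernel_sizes) != len(dilation_rates):
        raise ValueError("need one dilation rate per layer")
    field = 1  # a single pixel before any convolution is applied
    for kernel, dilation in zip(kernel_sizes, dilation_rates):
        field += (kernel - 1) * dilation
    return field
```

With three 3×3 layers, hypothetical dilation rates of 1, 2 and 4 give a receptive field of 15 pixels, compared with 7 pixels for the same stack without dilation.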


Data Usage Notice

This repository contains no medical data. All experiments were conducted using publicly available datasets.

License

This code is released under the Apache 2.0 License. The dataset is distributed under CC BY-SA 4.0 — see the Kaggle page for full terms.
