Water Quality Prediction using Deep Learning Neural Networks (CPCB)

This project utilizes Deep Learning Neural Networks to predict the Water Quality Index (WQI) and Water Quality Classification using environmental monitoring data provided by the Central Pollution Control Board (CPCB), India.

📋 Table of Contents

Project Overview
Dataset Description
Workflow
Installation and Setup
Predictive Models
Evaluation Metrics
Results
License
Contact

🔍 Project Overview

Access to clean water is a fundamental human necessity. However, water quality varies widely due to environmental, geographical, and human-induced factors. This project aims to accurately predict water quality metrics from chemical and physical parameters across various locations in India (2019-2022).

By leveraging Deep Learning, we provide two distinct predictive functionalities:

Regression Analysis: Predicting the numerical Water Quality Index (WQI).
Multi-class Classification: Categorizing samples into qualitative labels (e.g., Excellent, Good, Poor, Unsuitable).

📊 Dataset Description

The dataset contains chemical and physical measurements of water samples collected from wells across India.

Features:

Geographical: Well_ID, State, District, Block, Village, Latitude, Longitude.
Temporal: Year (2019, 2020, 2021, 2022).
Indicators: pH, Electrical Conductivity (EC), Carbonates (CO3), Bicarbonates (HCO3), Chlorides (Cl), Sulfates (SO4), Nitrates (NO3), Total Hardness (TH), Calcium (Ca), Magnesium (Mg), Sodium (Na), Potassium (K), Fluoride (F), Total Dissolved Solids (TDS).

Targets:

WQI: Continuous numerical value.
Water Quality Classification: Categorical (Excellent, Good, Poor, Very Poor yet Drinkable, Unsuitable for Drinking).

⚙️ Workflow

The following diagram illustrates the data processing and modeling pipeline:

```mermaid
graph TD
    A[Data Acquisition: CPCB Water Quality Dataset] --> B[Data Preprocessing]
    B --> B1[Locate Header & Clean Garbage Text]
    B1 --> B2[Handle Missing Values: Median Filling]
    B2 --> B3[Feature Type Conversion: Numeric Coercion]
    B3 --> B4[Target Definition: WQI & Classification]

    B4 --> C[Data Splitting: Train/Test]
    C --> D[Feature Scaling: StandardScaler]

    D --> E1[Deep Learning Regression Model]
    D --> E2[Deep Learning Classification Model]

    E1 --> F1[WQI Prediction]
    E2 --> F2[Water Quality Category Labeling]

    F1 --> G1[Evaluation: R2 Score, MAE]
    F2 --> G2[Evaluation: Accuracy, F1-Score]

    G1 --> H[Model Finalization]
    G2 --> H
```

(The workflow source is also available in Flow/workflow.mmd)
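
The preprocessing stages of the workflow can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the small `raw` frame, its values, and the three chosen columns are made up for the example, and pandas and scikit-learn are assumed to be installed. Numeric coercion is applied before median filling here so that the median is computed on numeric columns, which may differ from the order used in the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Tiny synthetic frame standing in for the CPCB export (all values are made up).
raw = pd.DataFrame({
    "pH":  ["7.1", "6.8", "BDL", "7.4", "8.0", "7.2"],
    "EC":  ["450", None, "610", "530", "700", "480"],
    "TDS": ["290", "310", "400", "n/a", "455", "300"],
    "WQI": [42.0, 55.5, 71.2, 48.9, 90.3, 44.1],
})
feature_cols = ["pH", "EC", "TDS"]  # hypothetical subset of the indicators

# Numeric coercion: entries that cannot be parsed ("BDL", "n/a") become NaN.
features = raw[feature_cols].apply(pd.to_numeric, errors="coerce")

# Median filling, column by column.
features = features.fillna(features.median())

# Train/test split on features and the WQI target.
X_train, X_test, y_train, y_test = train_test_split(
    features.to_numpy(), raw["WQI"].to_numpy(), test_size=0.33, random_state=42
)

# Fit the scaler on the training split only, then reuse it on the test split.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Fitting `StandardScaler` on the training split alone keeps test-set statistics out of the model inputs, which matches the split-then-scale order shown in the diagram.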

🚀 Installation and Setup

To run this project locally, ensure you have Python 3.10+ installed.

Clone the repository:

```
git clone https://github.com/SANJAI-s0/WQI-WQP_using_DL_Neural_Network.git
cd WQI-WQP_using_DL_Neural_Network
```

Install dependencies:
```
pip install -r requirements.txt
```
Run the analysis: Open the Jupyter Notebook to view the full pipeline and metrics:
```
jupyter notebook Water_Quality_Prediction.ipynb
```

🧠 Predictive Models

The project implements two separate Deep Neural Networks (DNNs) using Keras/TensorFlow:

1. Regression Model (WQI)

Architecture: Sequential API with multiple Dense layers (64 -> 32 -> 16 -> 1).
Optimizer: Adam.
Loss Function: Mean Squared Error (MSE).
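
A network with this layer layout could be written along the following lines. This is a sketch under assumptions rather than the notebook's exact code: the helper name `build_regressor` is hypothetical, ReLU hidden activations are assumed (the README states them only for the classifier), and the input width of 14 assumes that the 14 chemical indicators are the model inputs.

```python
import numpy as np
from tensorflow import keras


def build_regressor(n_features: int) -> keras.Model:
    """Dense 64 -> 32 -> 16 -> 1 network compiled with Adam and MSE."""
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        keras.layers.Dense(64, activation="relu"),  # assumed activation
        keras.layers.Dense(32, activation="relu"),
        keras.layers.Dense(16, activation="relu"),
        keras.layers.Dense(1),  # linear output for the continuous WQI value
    ])
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])
    return model


# 14 inputs is an assumption: one per chemical indicator in the dataset.
regressor = build_regressor(n_features=14)
wqi_pred = regressor(np.zeros((2, 14), dtype="float32"))  # untrained forward pass
```

Training would then be a call such as `regressor.fit(X_train_scaled, y_train, ...)` on the scaled features, with epochs and batch size chosen in the notebook.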

2. Classification Model (Category)

Architecture: Sequential API (64 -> 32 -> 16 -> output_classes).
Activation: ReLU for hidden layers, Softmax for the output layer.
Loss Function: Sparse Categorical Crossentropy.
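
Sparse categorical crossentropy expects integer class ids rather than one-hot vectors, so the category labels need an integer encoding. The sketch below illustrates that together with the softmax output. The helper name `build_classifier`, the label order in `CLASSES`, the Adam optimizer, and the 14-feature input are assumptions for the example, not details confirmed by the notebook.

```python
import numpy as np
from tensorflow import keras

# Assumed label order; the notebook may encode the categories differently.
CLASSES = [
    "Excellent", "Good", "Poor",
    "Very Poor yet Drinkable", "Unsuitable for Drinking",
]
class_to_index = {name: i for i, name in enumerate(CLASSES)}


def build_classifier(n_features: int, n_classes: int) -> keras.Model:
    """Dense 64 -> 32 -> 16 -> n_classes network with a softmax output."""
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(32, activation="relu"),
        keras.layers.Dense(16, activation="relu"),
        keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",  # assumed; the README names Adam only for regression
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


# Integer-encoded targets (not one-hot), as the sparse loss requires.
labels = ["Good", "Unsuitable for Drinking", "Excellent"]
y = np.array([class_to_index[name] for name in labels])

classifier = build_classifier(n_features=14, n_classes=len(CLASSES))
probs = classifier.predict(np.zeros((3, 14), dtype="float32"), verbose=0)

# Each row of probs sums to 1; argmax maps it back to a category name.
predicted = [CLASSES[i] for i in probs.argmax(axis=1)]
```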

📈 Evaluation Metrics

The models are evaluated based on the following:

Regression: $R^2$ Score (Coefficient of Determination) and Mean Absolute Error (MAE).
Classification: Accuracy Score and Weighted F1-Score.
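
For reference, the four metrics can be computed from their definitions as in the sketch below. The function names and the toy arrays are illustrative only; a project like this would more likely call the equivalent functions in `sklearn.metrics`.

```python
import numpy as np


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def mae(y_true, y_pred):
    """Mean absolute error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's support."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    total = 0.0
    for label in np.unique(y_true):
        tp = np.sum((y_pred == label) & (y_true == label))
        fp = np.sum((y_pred == label) & (y_true != label))
        fn = np.sum((y_pred != label) & (y_true == label))
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * np.sum(y_true == label)
    return float(total / len(y_true))


# Toy values, not project results.
wqi_true, wqi_hat = [3.0, 5.0, 7.0, 9.0], [2.5, 5.0, 7.5, 9.0]
cls_true, cls_hat = [0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0]

print(r2(wqi_true, wqi_hat), mae(wqi_true, wqi_hat))
print(accuracy(cls_true, cls_hat), weighted_f1(cls_true, cls_hat))
```

The weighted average matters when the categories are imbalanced: classes with more samples contribute proportionally more to the final F1 score.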

🏆 Results

Final scores for both models ($R^2$ and MAE for regression; accuracy and weighted F1 for classification) are reported in the Water_Quality_Prediction.ipynb notebook, together with detailed confusion matrices and loss curves.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

📧 Contact

Sanjai - GitHub Profile

Project Link: https://github.com/SANJAI-s0/WQI-WQP_using_DL_Neural_Network
