About

This repository contains a workshop for an academic data mining course. It consists of a practical assignment that explores both supervised and unsupervised learning in depth.

Supervised learning:

The objectives covered are:

Visualizing the dataset
Using the naive Bayes model and learning its principles
Implementing a method that splits a dataset into training and test sets (a manual implementation of sklearn's train_test_split function)
Training the model with different training set sizes
Calculating errors and scores in each case
Cross validation
Using the Random Forest model
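
As an illustration of the manual split objective above, here is a minimal sketch in plain Python. It is not the code from the notebook: the function name `manual_train_test_split`, its parameters, and the rounding rule for the test set size are assumptions made for this example, and it does not claim to reproduce sklearn's exact behavior.

```python
import random


def manual_train_test_split(X, y, test_size=0.25, seed=None):
    """Shuffle the sample indices, then split X and y into train and test parts.

    Hypothetical helper: the name, defaults, and rounding are assumptions.
    """
    if len(X) != len(y):
        raise ValueError("X and y must have the same length")
    if not 0.0 < test_size < 1.0:
        raise ValueError("test_size must be strictly between 0 and 1")

    # Shuffle indices rather than the data so X and y stay aligned.
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)

    # Assumption: round to the nearest sample count, keeping at least one test sample.
    n_test = max(1, round(len(X) * test_size))
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


# Example usage with toy data: 10 samples, 20% held out for testing.
features = [[i] for i in range(10)]
targets = [i * 10 for i in range(10)]
X_train, X_test, y_train, y_test = manual_train_test_split(
    features, targets, test_size=0.2, seed=0
)
```

Calling the function with several `test_size` values is one way to train the model on different training set sizes and compare the resulting scores.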

You can find the notebook here: https://github.com/BenrhayemRacem/GL4_TP_DATA_MINING/tree/supervised_learning

Unsupervised learning:

The objectives covered are:

Visualizing the dataset
Using the k-means model and learning its principles
Calculating the silhouette score
Drawing the dendrogram with the hierarchical agglomerative clustering (HAC) algorithm
Using Principal Component Analysis (PCA)
Using agglomerative clustering (AGNES) and drawing its dendrogram
Comparing the HAC and agglomerative clustering results with k-means using a crosstab
Implementing a manual DIANA (DIvisive ANAlysis) approach based on k-means
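
To make the silhouette score objective above more concrete, here is a minimal sketch that computes it by hand in plain Python with Euclidean distance. It is not taken from the notebook; the function name `silhouette_score_manual` and the toy data are assumptions made for this example. For each point, `a` is the mean distance to the other points of its own cluster, `b` is the smallest mean distance to the points of any other cluster, and the point's score is `(b - a) / max(a, b)`.

```python
import math


def silhouette_score_manual(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance).

    Hypothetical helper written for illustration only.
    """
    # Group the points by cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the silhouette score needs at least 2 clusters")

    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # by convention, a singleton cluster contributes 0

        # a: mean distance to the other members of the same cluster
        # (the distance from the point to itself is 0, hence len(own) - 1).
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)

        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(point, q) for q in other) / len(other)
            for other_label, other in clusters.items()
            if other_label != label
        )

        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Example usage: two well-separated pairs of points.
toy_points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good_score = silhouette_score_manual(toy_points, [0, 0, 1, 1])
bad_score = silhouette_score_manual(toy_points, [0, 1, 0, 1])
```

A score close to 1 suggests compact, well-separated clusters, while a negative score suggests that points sit closer to another cluster than to their own. Comparing the score across several values of k is a common way to choose the number of clusters for k-means.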

You can find the notebook here: https://github.com/BenrhayemRacem/GL4_TP_DATA_MINING/tree/unsupervised_learning
