
Machine Learning Notebooks

This repository contains a collection of Jupyter notebooks for machine learning (aprendizaje automático) algorithms and techniques, with examples and practical implementations. The notebooks are exercises from the Artificial Intelligence 2024 course at the UIB (Universitat de les Illes Balears), taught by Miquel Miró Nicolau, Gabriel Moyà Alcover, and Dr. Javier Varona Gómez of the XAI (Explainable Artificial Intelligence) research group.

Contents

1. Introduction to Machine Learning

1_ML_i_Perceptró.ipynb: Introduction to Machine Learning concepts and Perceptron implementation with practical activities.
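As a rough illustration of the perceptron update rule covered in that notebook, here is a minimal NumPy sketch (the function names and toy data are invented for this example, not taken from the notebook):

```python
import numpy as np

def perceptron_fit(X, y, epochs=20, lr=0.1):
    """Train weights (plus bias) with the classic perceptron rule.
    Labels in y must be -1/+1."""
    w = np.zeros(X.shape[1] + 1)                 # last entry is the bias
    Xb = np.hstack([X, np.ones((len(X), 1))])    # append a bias column
    for _ in range(epochs):
        for xi, yi in zip(Xb, y):
            if yi * (w @ xi) <= 0:               # misclassified: nudge boundary
                w += lr * yi * xi
    return w

def perceptron_predict(X, w):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return np.where(Xb @ w >= 0, 1, -1)

# Linearly separable toy data (AND-like labels)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, -1, -1, 1])
w = perceptron_fit(X, y)
print(perceptron_predict(X, w))   # recovers the training labels
```

Because the data is linearly separable, the perceptron convergence theorem guarantees the loop stops making updates after finitely many passes.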

2. Regression

2_Regr_Pràctica.ipynb and 2_Regressió_i_correlació.ipynb: Regression techniques, including data exploration, correlation matrices, and model implementation, with an accompanying practice notebook.
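A minimal sketch of the correlation-then-fit workflow, using NumPy on invented toy data (not the notebooks' dataset):

```python
import numpy as np

# Toy data: y depends linearly on x, plus a little noise
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 3.0 * x + 2.0 + rng.normal(0, 0.5, size=x.size)

# Pearson correlation between the two variables
r = np.corrcoef(x, y)[0, 1]

# Least-squares line fit: y ≈ slope * x + intercept
slope, intercept = np.polyfit(x, y, deg=1)
print(f"r = {r:.3f}, slope = {slope:.2f}, intercept = {intercept:.2f}")
```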

3. Logistic Regression

3_Regressió_Logística_i_K-Fold.ipynb and 3_RegrLog_Pràctica.ipynb: Logistic regression with K-Fold cross-validation and practical exercises, including model training, evaluation with accuracy metrics, and confusion matrix visualization.
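A hedged sketch of the same combination, logistic regression scored with 5-fold cross-validation, using scikit-learn on a synthetic dataset (the dataset and parameters are illustrative, not taken from the notebooks):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic binary classification problem
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# 5-fold CV: each fold is held out once while the rest is used for training
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)
print(scores.mean())   # mean accuracy across the five folds
```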

4. Support Vector Machines

4_SVM.ipynb and 4_SVM_Pràctica.ipynb: Support Vector Machine implementation with both linear and non-linear kernels, visualization of decision boundaries, and hyperparameter tuning using cross-validation. Includes practical exercises comparing SVM with other classification models (Perceptron, Logistic Regression).
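An illustrative comparison of linear and non-linear kernels, assuming scikit-learn; the dataset and the `gamma` value are invented for this sketch:

```python
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Two interleaving half-moons: not linearly separable
X, y = make_moons(n_samples=200, noise=0.15, random_state=0)

linear = SVC(kernel="linear").fit(X, y)
rbf = SVC(kernel="rbf", gamma=2.0).fit(X, y)

print("linear:", linear.score(X, y))   # capped by the straight boundary
print("rbf:   ", rbf.score(X, y))      # curved boundary fits the moons
```

On data like this, the RBF kernel's curved decision boundary typically separates the classes far better than any straight line can.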

5. Data Cleaning

5_Neteja_de_dades_i_DT.ipynb: Data cleaning techniques (handling missing values, categorical data encoding, feature scaling, and noise reduction), together with decision trees (DT).
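A small pandas sketch of the first three cleaning steps (the DataFrame and column names are invented for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "age":   [25, None, 47, 31],
    "city":  ["Palma", "Inca", "Palma", "Manacor"],
    "score": [0.5, 0.8, None, 0.9],
})

# 1. Fill missing numeric values with the column mean
df["age"] = df["age"].fillna(df["age"].mean())
df["score"] = df["score"].fillna(df["score"].mean())

# 2. One-hot encode the categorical column
df = pd.get_dummies(df, columns=["city"])

# 3. Min-max scale the numeric columns to [0, 1]
for col in ["age", "score"]:
    df[col] = (df[col] - df[col].min()) / (df[col].max() - df[col].min())

print(df.head())
```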

Assignment

ML_assignment.ipynb: Final course assignment applying multiple machine learning algorithms to the forest cover type dataset. The notebook includes data preprocessing (resampling for class balance and PCA for dimensionality reduction), hyperparameter optimization for various models (Perceptron, Logistic Regression, SVM, Decision Tree, Random Forest), and comprehensive model evaluation using confusion matrices and classification metrics.

Key Techniques Covered

Classification Methods

  • Perceptron: Simple neural network implementation for linear classification problems
  • Logistic Regression: Probabilistic classification for binary and multi-class problems
  • Support Vector Machines: Classification with both linear and non-linear kernels for optimal decision boundaries
  • Decision Trees: Tree-based classification with conditional branching
  • Random Forest: Ensemble method combining multiple decision trees for improved performance

Cross-Validation

K-Fold cross-validation is implemented in multiple notebooks to evaluate model performance. The technique divides the dataset into k subsets and uses each subset for testing while training on the remaining data.
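The splitting scheme described above can be sketched without any library; this is an illustrative implementation, not the notebooks' code:

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    fold_size = n_samples // k
    for i in range(k):
        start = i * fold_size
        # the last fold absorbs any remainder
        end = start + fold_size if i < k - 1 else n_samples
        test_idx = indices[start:end]
        train_idx = indices[:start] + indices[end:]
        yield train_idx, test_idx

for train, test in k_fold_indices(10, 3):
    print(len(train), len(test))
```

In practice the data is usually shuffled first (as `sklearn.model_selection.KFold(shuffle=True)` does) so that folds are not biased by the row order.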


Classification Metrics

  • Accuracy scoring
  • Confusion matrices
  • Classification reports
  • Precision, recall, and F1-score
  • ROC curves and AUC analysis
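For the binary case, the first four of these metrics follow directly from the confusion-matrix counts; a NumPy sketch with invented labels:

```python
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

# Confusion-matrix cells for the positive class
tp = np.sum((y_true == 1) & (y_pred == 1))
fp = np.sum((y_true == 0) & (y_pred == 1))
fn = np.sum((y_true == 1) & (y_pred == 0))
tn = np.sum((y_true == 0) & (y_pred == 0))

accuracy  = (tp + tn) / len(y_true)
precision = tp / (tp + fp)          # of predicted positives, how many are right
recall    = tp / (tp + fn)          # of actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)
print(accuracy, precision, recall, f1)
```

`sklearn.metrics.classification_report` produces the same quantities per class, which is what the notebooks' classification reports show.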

Parameter Optimization

  • Grid search for hyperparameter tuning
  • Custom product dictionary for parameter combinations
  • Cross-validation based optimization
  • Regularization parameter selection
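One common way to build the "custom product dictionary" of parameter combinations is `itertools.product`; this is an illustrative sketch, not the notebooks' implementation:

```python
from itertools import product

param_grid = {
    "C": [0.1, 1.0, 10.0],
    "kernel": ["linear", "rbf"],
}

# Expand the grid into one dict per parameter combination
combinations = [dict(zip(param_grid, values))
                for values in product(*param_grid.values())]

for combo in combinations:
    print(combo)   # e.g. {'C': 0.1, 'kernel': 'linear'}
```

Each resulting dict can be unpacked directly into a model constructor (e.g. `SVC(**combo)`) and scored with cross-validation, which is essentially what `sklearn.model_selection.GridSearchCV` automates.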

Data Preprocessing

  • Handling missing values
  • Categorical data encoding
  • Feature scaling and normalization
  • Dimensionality reduction using PCA
  • Class balancing and resampling techniques
  • Noise reduction methods
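A minimal sketch of the PCA step via the SVD, on invented data; real projects (and likely the notebooks) would use `sklearn.decomposition.PCA`:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X[:, 1] = 2 * X[:, 0]          # make one feature redundant

# Center the data, then take the top-k right singular vectors as components
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 2
X_reduced = Xc @ Vt[:k].T      # project onto the first k principal components

print(X_reduced.shape)         # 5 features reduced to 2
```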

Visualization Techniques

  • Decision boundary visualization
  • Feature correlation heatmaps
  • Model performance comparison plots
  • Learning curves
  • Hyperparameter effect visualization

Languages

The notebooks contain comments and explanations in Catalan.
