Papagiannopoulos/Santander_Customer_Transaction_Prediction

Which customers will make a specific transaction in the future?

Aim: Identify which customers will make a specific transaction in the future

Project Summary

This project addresses the second most popular Kaggle competition, where I scored in the top 8% of the global ranking.
It is an imbalanced classification problem on anonymized data.
I performed an in-depth EDA and data manipulation, including preprocessing and feature engineering, to understand the underlying data patterns, and built predictive models such as LightGBM and logistic regression.
To become more familiar with Shiny, all visualizations were additionally created using R Shiny.

Project Structure

Table of Contents

  1. ๐Ÿ” Dataset - Data description & source
  2. ๐Ÿงน Data Processing - Cleaning and feature engineering
  3. ๐Ÿ” Reproducibility - Reproducibility steps
  4. ๐Ÿ“ฆ Requirements - Install dependencies

Dataset

The dataset is anonymized and contains 200 numeric feature variables, the binary "target" column, and a string "ID_code" column.
The task is to predict the value of the "target" column in the test set.

  • Download source: the data provided for this competition have the same structure as the real data that Santander has available to solve this problem.
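
As a quick sanity check of the structure described above, the following is a minimal sketch in base R. The helper name `check_structure` and the synthetic `toy` data frame are illustrative assumptions, not part of the repo; only the "ID_code" and "target" column names come from the description.

```r
# Summarise a training data frame against the structure described above:
# every column other than ID_code and target is treated as a feature.
check_structure <- function(df) {
  feature_cols <- setdiff(names(df), c("ID_code", "target"))
  list(
    n_features       = length(feature_cols),
    all_numeric      = all(vapply(df[feature_cols], is.numeric, logical(1))),
    target_is_binary = all(df$target %in% c(0, 1)),
    positive_rate    = mean(df$target == 1)  # share of positive cases
  )
}

# Tiny synthetic stand-in (made-up values, NOT competition data):
# 10 rows, 200 random numeric features, 1 positive case out of 10.
set.seed(1)
toy <- data.frame(
  ID_code = paste0("id_", 1:10),
  target  = rep(c(0, 1), times = c(9, 1)),
  matrix(rnorm(10 * 200), nrow = 10)
)
str(check_structure(toy))
```

With the downloaded training file in the working directory (file name assumed here), the same check would be `check_structure(read.csv("train.csv"))`. A low `positive_rate` is what makes the problem imbalanced.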

Evaluation

Submissions are evaluated on the area under the ROC curve (AUC) between the predicted probability and the observed target.
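
To make the metric concrete, here is a minimal sketch of a rank-based (Mann-Whitney) AUC estimate in base R. The function name `auc_rank` and the toy vectors are illustrative assumptions; this is not necessarily how the project or Kaggle computes the score.

```r
# Rank-based estimate of the area under the ROC curve.
# `labels` is a 0/1 vector, `scores` holds the predicted probabilities.
auc_rank <- function(labels, scores) {
  stopifnot(length(labels) == length(scores), all(labels %in% c(0, 1)))
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  stopifnot(n_pos > 0, n_neg > 0)
  r <- rank(scores)  # average ranks, so tied scores count as half
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy example with made-up values (not competition data)
labels <- c(0, 0, 1, 1)
scores <- c(0.1, 0.4, 0.35, 0.8)
auc_rank(labels, scores)  # 0.75
```

The value is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, which is why it depends only on the ordering of the predictions and not on their scale.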

๐Ÿ” Reproducibility

  1. Clone the repo.
    Run the following commands in a terminal:
    git clone 'https://github.com/Papagiannopoulos/Santander_Customer_Transaction_Prediction.git'
    cd 'Santander_Customer_Transaction_Prediction'
  2. Download the data and add them to your cloned repo
  3. Run requirements.R
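
From an R session, the last step and the app launch can be sketched as follows. It assumes the working directory is the root of the cloned repo and that `requirements.R` and `app.R` sit at that root; the helper `run_if_present` is hypothetical and only guards against a wrong working directory.

```r
# Run `action` on `path` only if the file exists; otherwise explain why not.
run_if_present <- function(path, action) {
  if (!file.exists(path)) {
    message("Not found: ", path, " (run this from the root of the cloned repo)")
    return(invisible(FALSE))
  }
  action(path)
  invisible(TRUE)
}

# Install the required libraries (file location is an assumption)
run_if_present("requirements.R", source)

# Launch the Shiny app only in an interactive session, since runApp() blocks
if (interactive()) {
  run_if_present("app.R", function(p) shiny::runApp(p))
}
```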

Requirements

  • Before running the Shiny app (app.R), run the requirements script (requirements.R) to install all necessary libraries
  • R version: 4.4.1 (2024-06-14 ucrt)

About

Second most popular Kaggle competition (closed), where I scored in the top 8% of the global ranking. Pending finalization!
