IdanKanat/ETL-Project__BigDataEngineering

ETL-Project - BigDataEngineering

This project was a major part of the "Big Data Engineering" course I took at TAU’s Faculty of Engineering during the 3rd year of my studies (2025).

The project was the 1st assignment of the course. Together with my 2 partners, I built a complete ETL pipeline for the Olist online orders dataset (Brazil). The task included:

  • Performing a detailed exploratory data analysis (EDA), in particular handling null values, duplicates, and logical inconsistencies.

  • Designing and constructing a data warehouse from scratch using a star schema, applying denormalization and design pattern principles.
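The two steps above can be illustrated with a minimal sketch. It is not the code from the notebook: it uses only the Python standard library (`sqlite3` as a stand-in warehouse), a handful of made-up rows, and hypothetical table names (`dim_customer`, `dim_date`, `fact_orders`). The column names are assumed to mirror the public Olist CSVs, and the cleaning rules (drop rows with null keys, drop duplicate orders, drop orders delivered before they were purchased) are just one plausible reading of "null values, duplicates, and logical inconsistencies".

```python
import sqlite3
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Made-up sample rows; column names are assumed to follow the Olist CSVs.
O1 = {"order_id": "o1", "customer_id": "c1", "customer_state": "SP",
      "order_purchase_timestamp": "2018-01-02 10:00:00",
      "order_delivered_customer_date": "2018-01-05 09:00:00", "price": 50.0}
RAW_ORDERS = [
    O1,
    dict(O1),  # exact duplicate of o1
    {"order_id": "o2", "customer_id": None, "customer_state": "MG",  # null key
     "order_purchase_timestamp": "2018-01-03 12:00:00",
     "order_delivered_customer_date": "2018-01-08 12:00:00", "price": 10.0},
    {"order_id": "o3", "customer_id": "c2", "customer_state": "RJ",
     "order_purchase_timestamp": "2018-02-10 10:00:00",
     "order_delivered_customer_date": "2018-02-01 10:00:00",  # before purchase
     "price": 30.0},
    {"order_id": "o4", "customer_id": "c2", "customer_state": "RJ",
     "order_purchase_timestamp": "2018-03-01 08:30:00",
     "order_delivered_customer_date": None, "price": 20.0},  # not delivered yet
]


def clean_orders(rows):
    """Drop rows with null keys, repeated order_ids, or delivery before purchase."""
    seen, cleaned = set(), []
    for row in rows:
        if not row["order_id"] or not row["customer_id"]:
            continue  # null value in a key column
        if row["order_id"] in seen:
            continue  # duplicate order
        purchased = datetime.strptime(row["order_purchase_timestamp"], FMT)
        delivered = row["order_delivered_customer_date"]
        if delivered and datetime.strptime(delivered, FMT) < purchased:
            continue  # logical inconsistency
        seen.add(row["order_id"])
        cleaned.append(row)
    return cleaned


def build_star_schema(conn, rows):
    """Load cleaned rows into one fact table surrounded by two dimensions."""
    conn.executescript("""
        CREATE TABLE dim_customer (
            customer_key   INTEGER PRIMARY KEY,
            customer_id    TEXT UNIQUE,
            customer_state TEXT   -- denormalized: no separate geography table
        );
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY,  -- YYYYMMDD
            year INTEGER, month INTEGER, day INTEGER
        );
        CREATE TABLE fact_orders (
            order_id     TEXT PRIMARY KEY,
            customer_key INTEGER REFERENCES dim_customer(customer_key),
            date_key     INTEGER REFERENCES dim_date(date_key),
            price        REAL
        );
    """)
    for row in rows:
        d = datetime.strptime(row["order_purchase_timestamp"], FMT)
        date_key = d.year * 10000 + d.month * 100 + d.day
        conn.execute(
            "INSERT OR IGNORE INTO dim_customer (customer_id, customer_state) "
            "VALUES (?, ?)", (row["customer_id"], row["customer_state"]))
        conn.execute("INSERT OR IGNORE INTO dim_date VALUES (?, ?, ?, ?)",
                     (date_key, d.year, d.month, d.day))
        customer_key = conn.execute(
            "SELECT customer_key FROM dim_customer WHERE customer_id = ?",
            (row["customer_id"],)).fetchone()[0]
        conn.execute("INSERT INTO fact_orders VALUES (?, ?, ?, ?)",
                     (row["order_id"], customer_key, date_key, row["price"]))
    conn.commit()


cleaned = clean_orders(RAW_ORDERS)
conn = sqlite3.connect(":memory:")
build_star_schema(conn, cleaned)

# A typical star-schema query: join the fact table to a dimension and aggregate.
revenue_by_state = conn.execute("""
    SELECT c.customer_state, SUM(f.price)
    FROM fact_orders AS f JOIN dim_customer AS c USING (customer_key)
    GROUP BY c.customer_state ORDER BY c.customer_state
""").fetchall()
print(revenue_by_state)
```

Keeping `customer_state` directly on `dim_customer` is the kind of denormalization a star schema favors: analytical queries need a single join from the fact table instead of a chain of lookups.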

Full details are available in the notebook file (IPYNB) and the Final Report (PDF) in this repo, written in Hebrew.

Link to ALL the CSV files used in the project

The project's final grade: 93.
