Shanzita/RiceDatathon2026

Data Cleaning and Pre-processing as well as Decision Tree models

Setup Instructions

Without uv

  1. Create a .venv in the src directory
  2. Run this command to install the dependencies
py -m pip install -r requirements.txt
  3. Move the cleaned datasets from the KNNFeatureCreation Colab notebook into the data folder
  4. Run main.py

With uv package manager

  1. Run this to move into the src directory
cd src
  2. Run this command to install the dependencies
uv sync
  3. Run this command to run main.py
uv run main.py

Usage Instructions

The trained models are stored in the models dictionary:
rf_post: A random forest model trained on the top 20 most important features. It is used to predict Post-Covid RevPAR.
xgb_post: An XGBoost model trained on the entire Post-Covid dataset. It is used to predict Post-Covid RevPAR.
rf_pre: A random forest model trained on the top 20 most important features. It is used to predict Pre-Covid RevPAR.
xgb_pre: An XGBoost model trained on the entire Pre-Covid dataset. It is used to predict Pre-Covid RevPAR.

The rf models are trained on only the top 20 features. Before calling them, trim the features dataframe to those same columns; otherwise prediction will fail.

IPYNB Notebooks

Run the notebooks below in Google Colab, in the order listed, after running clean_and_preprocess() from cleaning_pre_processing_and_trees/main.py. Before running them, upload the following files to Colab: the cleaned train and test data as train.csv and test.csv; the original drive time data master_panel_drv10.csv, master_panel_drv15.csv, and master_panel_drv30.csv; and the original scoring data scoring.csv.

  1. Data Exploration.ipynb
  2. KNNFeatureCreation.ipynb
  3. NeuralNet.ipynb
  4. Ensembling.ipynb
  5. Visualizations2.ipynb
  6. Visualizations.ipynb
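Before opening the first notebook, it can help to confirm that all six input files are in place. The sketch below checks for them by name. The function name missing_inputs and the default directory (the current working directory) are assumptions; adjust the path to wherever the files were uploaded in Colab.

```python
from pathlib import Path

# File names taken from the instructions above.
REQUIRED_FILES = [
    "train.csv",
    "test.csv",
    "master_panel_drv10.csv",
    "master_panel_drv15.csv",
    "master_panel_drv30.csv",
    "scoring.csv",
]


def missing_inputs(data_dir="."):
    """Return the required file names that are not present in data_dir."""
    base = Path(data_dir)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_inputs()
    if missing:
        print("Missing input files:", ", ".join(missing))
    else:
        print("All input files found.")
```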

Predictions for scoring.csv

filled_scoring.csv contains the ensemble model's predictions for scoring.csv.
