- Create a `.venv` virtual environment in the `src` directory
- Install the dependencies: `py -m pip install -r requirements.txt`
- Move the cleaned datasets from the KNNFeatureCreation Colab notebook into the `data` folder
- Run the `main.py` file
- Move into the `src` directory: `cd src`
- Install the dependencies: `uv sync`
- Run the `main.py` file: `uv run main.py`
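`uv sync` resolves the environment from the project's `pyproject.toml`. A minimal sketch of what such a file might contain (the project name and the dependency list are assumptions inferred from the models described below, not taken from the repository):

```toml
[project]
name = "revpar-models"            # placeholder project name
version = "0.1.0"
requires-python = ">=3.10"
# Assumed dependencies, inferred from the random forest / XGBoost models below
dependencies = [
    "pandas",
    "scikit-learn",
    "xgboost",
]
```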
Models are stored in the `models` dictionary:
- `rf_post`: a random forest model trained on the top 20 most important features; used to predict post-COVID RevPAR.
- `xgb_post`: an XGBoost model trained on the entire post-COVID dataset; used to predict post-COVID RevPAR.
- `rf_pre`: a random forest model trained on the top 20 most important features; used to predict pre-COVID RevPAR.
- `xgb_pre`: an XGBoost model trained on the entire pre-COVID dataset; used to predict pre-COVID RevPAR.

Note that the random forest models are trained only on the top 20 features, so trim the features DataFrame to those columns before calling them.
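A minimal sketch of the trimming step, using a throwaway random forest on synthetic data (the column names and the `top_20` list are placeholders; in the project the list comes from the feature-importance step):

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
all_cols = [f"feat_{i}" for i in range(30)]      # placeholder feature names
X = pd.DataFrame(rng.normal(size=(100, 30)), columns=all_cols)
y = rng.normal(size=100)

top_20 = all_cols[:20]                           # stand-in for the real top-20 list

# The rf models were fit on the top 20 features only, so fit/predict on that subset
rf = RandomForestRegressor(n_estimators=10, random_state=0).fit(X[top_20], y)
models = {"rf_post": rf}                         # mirrors the models dictionary

preds = models["rf_post"].predict(X[top_20])     # trim before predicting
print(preds.shape)                               # (100,)
```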
The notebooks below should be run in Google Colab, in the following order, after running `clean_and_preprocess()` from `cleaning_pre_processing_and_trees/main.py`. Before running them, upload to the Colab working directory: the cleaned train and test data as `train.csv` and `test.csv`; the original drive-time data as `master_panel_drv10.csv`, `master_panel_drv15.csv`, and `master_panel_drv30.csv`; and the original scoring data as `scoring.csv`.
1. Data Exploration.ipynb
2. KNNFeatureCreation.ipynb
3. NeuralNet.ipynb
4. Ensembling.ipynb
5. Visualizations2.ipynb
6. Visualizations.ipynb
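As a quick sanity check before running the notebooks, something like the following (file names taken from this README) can confirm the uploads are in place in the Colab working directory:

```python
from pathlib import Path

# Files the notebooks expect to find, per this README
expected = [
    "train.csv", "test.csv",
    "master_panel_drv10.csv", "master_panel_drv15.csv", "master_panel_drv30.csv",
    "scoring.csv",
]
missing = [name for name in expected if not Path(name).exists()]
print("missing:", missing)   # an empty list means everything is uploaded
```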
`filled_scoring.csv` contains the ensemble model's predictions for `scoring.csv`.