morsoletodev/MortalityPrediction-ML

DataSUS Machine Learning Pipeline

Python | Black | Uses the Cookiecutter Data Science project template | License: GPL v3

Installation & Usage

The docker compose command builds and runs the pipeline.

docker compose up [--no-deps] <command>

The available commands are listed below:

  • acquire: Downloads the datasets and merges them into a single file.
  • process: Adjusts the format and schema of the acquired files.
  • train_model: Creates a Splink model.
  • linkage: Performs the linkage between datasets.
  • jupyter: Jupyter server used to execute the ML notebooks. Check EnsembleModels for more information.
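
For scripted runs, the commands above can be assembled programmatically. The following is a minimal Python sketch, assuming docker compose is available on the PATH and that the service names match the list above; compose_command and run_service are hypothetical helpers for illustration, not part of this repository.

```python
import subprocess

# Service names taken from the list of available commands.
SERVICES = ("acquire", "process", "train_model", "linkage", "jupyter")


def compose_command(service, no_deps=False):
    """Build the argv list for `docker compose up` on one service."""
    if service not in SERVICES:
        raise ValueError(f"unknown service: {service!r}")
    cmd = ["docker", "compose", "up"]
    if no_deps:
        # --no-deps tells docker compose not to start the services
        # that this one depends on.
        cmd.append("--no-deps")
    cmd.append(service)
    return cmd


def run_service(service, no_deps=False):
    """Run one service; raises CalledProcessError if docker compose exits non-zero."""
    subprocess.run(compose_command(service, no_deps), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try.
    print(" ".join(compose_command("acquire")))
```

Building the command as a list rather than a single string avoids shell quoting issues, and rejecting unknown names catches typos before docker compose is invoked.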

The process, train_model and linkage services each depend on the previous one (e.g. process depends on acquire), so by default docker compose starts those dependencies as well. Use the --no-deps flag to run one of these services on its own.
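
To make the effect of --no-deps concrete, here is a small Python model of which services end up being started. It is only an illustration: the DEPENDS_ON mapping is an assumption derived from the description above (each stage depends on the previous one), not a copy of the repository's compose file, and services_started is a hypothetical helper.

```python
# Assumed dependency chain: each stage depends on the previous one.
DEPENDS_ON = {
    "acquire": [],
    "process": ["acquire"],
    "train_model": ["process"],
    "linkage": ["train_model"],
}


def services_started(service, no_deps=False):
    """Return the services that would be started, dependencies first."""
    if no_deps:
        # With --no-deps only the requested service is started.
        return [service]

    order = []

    def visit(name):
        # Visit dependencies before the service itself.
        for dep in DEPENDS_ON[name]:
            visit(dep)
        if name not in order:
            order.append(name)

    visit(service)
    return order


print(services_started("linkage"))
# ['acquire', 'process', 'train_model', 'linkage']
print(services_started("linkage", no_deps=True))
# ['linkage']
```

In this model, requesting linkage without the flag pulls in the whole chain, which is why --no-deps is useful when the earlier outputs already exist.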

License

GNU General Public License v3 (GPL v3)

About

Hybrid Data Pipeline & Ensemble Optimization Framework. Includes a Splink-powered entity matching engine and a systematic ML exploratory research pipeline for estimator/sampler pairing and hyperparameter tuning.
