GitHub - vivadata/diamonds: MLOps course-long ("fil rouge") project

Diamonds: from notebook to package

This repository is a teaching project where you refactor a Jupyter notebook into a reusable Python package (diamonds).

Prerequisites

Python: installed and managed with pyenv
Virtual environments: managed with pyenv-virtualenv
direnv: installed and enabled in your shell

1. Clone the repository

Fork this repository and clone your fork. Then create a new branch for your productionizing-ml work.

gh repo fork vivadata/diamonds
git clone git@github.com:<your-username>/diamonds.git
cd diamonds
# Create a new branch and switch to it
git checkout -b <your-username>-productionizing-ml

2. Create and activate a virtual environment (pyenv-virtualenv)

Create a new virtual environment for this project.
```
pyenv virtualenv 3.11 diamonds
```
Tell this directory to use that virtual environment.
```
pyenv local diamonds
```
Check that Python now points to the virtualenv.
```
which python
python --version
```
Install the package in development (editable) mode. The -e flag links the installed package to the source tree, so changes to the code take effect without reinstalling.
```
pip install -e .
```
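As a quick sanity check, the short Python sketch below reports whether the interpreter is running inside a virtual environment and which file a package would be imported from. The helper names `in_virtualenv` and `package_location` are illustrative and not part of the repository, and the sketch assumes the editable install exposes the package under the name `diamonds`.

```python
import importlib.util
import sys
from typing import Optional


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a virtual environment sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def package_location(name: str) -> Optional[str]:
    """Return the file a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    print("virtualenv active:", in_virtualenv())
    # Assumption: the package is importable as "diamonds" after `pip install -e .`.
    print("diamonds imported from:", package_location("diamonds"))
```

With an editable install, the reported location should normally sit inside your checkout rather than inside site-packages.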

3. Configure direnv

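Create an .envrc file at the root of the project. Its dotenv directive tells direnv to load the variables defined in .env each time you cd into the directory; the virtualenv itself is selected by the .python-version file that pyenv local wrote in step 2. direnv reports an error if .env is missing, so create that file first.
```
echo 'dotenv' > .envrc
```
Allow direnv to load the file. This is needed once, and again whenever .envrc changes.
```
direnv allow
```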
Leave and re-enter the project directory and confirm that the virtualenv is automatically activated.
```
cd ..
cd diamonds
which python
```

4. Next steps

Explore the notebook: open notebooks/Exploration.ipynb.
Identify responsibilities:
- data loading and cleaning,
- feature engineering,
- model training and evaluation,
- prediction.
Refactor into the package:
- move data-related code into src/diamonds/data.py,
- move model-related code into src/diamonds/model.py,
- centralize constants/paths in src/diamonds/params.py,
- implement model saving/loading in src/diamonds/registry.py.
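One possible shape for the params and registry modules is sketched below. The constant and function names (`MODELS_DIR`, `MODEL_FILENAME`, `save_model`, `load_model`) are assumptions made for illustration, and the standard-library pickle module is used only to keep the sketch dependency-free; the actual project may prefer another serializer such as joblib.

```python
import pickle
from pathlib import Path
from typing import Any

# --- would live in src/diamonds/params.py (names are assumptions) ---
MODELS_DIR = Path("models")    # the models directory at the repository root
MODEL_FILENAME = "model.pkl"   # hypothetical default file name


# --- would live in src/diamonds/registry.py ---
def save_model(model: Any, models_dir: Path = MODELS_DIR,
               filename: str = MODEL_FILENAME) -> Path:
    """Serialize a fitted model to disk and return the path written."""
    models_dir.mkdir(parents=True, exist_ok=True)
    path = models_dir / filename
    with path.open("wb") as f:
        pickle.dump(model, f)
    return path


def load_model(models_dir: Path = MODELS_DIR,
               filename: str = MODEL_FILENAME) -> Any:
    """Load a previously saved model; raises FileNotFoundError if absent."""
    path = models_dir / filename
    with path.open("rb") as f:
        return pickle.load(f)
```

Keeping the paths in one place means the training script and any prediction code agree on where the model lives. Note that pickle files should only be loaded from sources you trust.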
Train the model:
- Update the script src/diamonds/train.py so that it trains the model and saves it in the models directory.
- Run python -m src.diamonds.train from the project root to execute it.
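A minimal sketch of what such a training script could look like follows. It is not the repository's implementation: `MeanPriceModel` is a placeholder standing in for whatever estimator the notebook uses, the prices are made-up toy values, and the file name `model.pkl` is an assumption. In the real package, the data and the saving logic would come from the data and registry modules instead of being inlined.

```python
import pickle
import statistics
from pathlib import Path


class MeanPriceModel:
    """Placeholder model: always predicts the mean of the training prices."""

    def fit(self, prices):
        self.mean_ = statistics.fmean(prices)
        return self

    def predict(self, n_rows):
        return [self.mean_] * n_rows


def train(prices, models_dir: Path) -> Path:
    """Fit the placeholder model and save it under models_dir."""
    model = MeanPriceModel().fit(prices)
    models_dir.mkdir(parents=True, exist_ok=True)
    path = models_dir / "model.pkl"  # hypothetical file name
    with path.open("wb") as f:
        pickle.dump(model, f)
    return path


def main() -> None:
    toy_prices = [100.0, 200.0, 300.0]  # made-up values, not real data
    path = train(toy_prices, Path("models"))
    print(f"model saved to {path}")


# `python -m <module>` runs the module with __name__ set to "__main__",
# so this guard makes the file both importable and runnable.
if __name__ == "__main__":
    main()
```

Keeping the work in `train()` and `main()` rather than at module level lets tests import the function without triggering a training run.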
