First, download the data from Kaggle (https://www.kaggle.com/c/titanic), or create your own data folder containing your data files.
pip install dvc
dvc init
dvc remote add localStorage /tmp/titanic-storage
(Add the -d option to make this the default remote; otherwise pass -r localStorage to dvc push and dvc pull.)
dvc add data
dvc push
git add data.dvc
You can now check that the actual data files have been copied to the remote we created in the configuration chapter:
ls -R /tmp/titanic-storage
You can then check that everything works by deleting your data folder and retrieving it from the remote:
rm -rf data
dvc pull
dvc repro pipeline/Dvcfile
git add .
git commit -m '...'
git tag -a '...' -m ''
dvc push
dvc metrics show -T
python __main__.py
dvc gc
dvc checkout