dvc is Git for data science projects. It brings your Git-tracked code, your data (stored on S3 or GCP) and your dependencies together into a single reproducible environment.
This project is distributed under the Apache License, version 2.0 (see the LICENSE file in the project root).
By submitting a pull request to this project, you agree to license your contribution under the Apache License, version 2.0.
dvc supports both AWS and Google Cloud; you will need an account with at least one of them.
To see your available buckets, run either TODO or gsutil ls.
To setup your google cloud credentials: TODO
To test they are set up correctly, TODO
mkdir t
cd !$
git init .
dvc init
Info. Directories data/, cache/ and state/ were created
Info. File .gitignore was created
Info. Directory cache was added to .gitignore file
Info. [Git] A new commit 34687b2 was made in the current branch. Added files:
Info. [Git] A .gitignore
Info. [Git] A dvc.conf
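To confirm that the initialization produced what the log above reports, a small check like the following can help. It is only a sketch: check_dvc_layout is a hypothetical helper, not a dvc command, and it assumes the data/, cache/ and state/ directories and the .gitignore and dvc.conf files are created in the directory where dvc init was run.

```shell
# Hypothetical helper (not part of dvc): check the layout reported by `dvc init`.
# Assumption: directories and files live directly under the given root.
check_dvc_layout() {
    root="${1:-.}"
    missing=0
    # Directories named in the "Info. Directories ... were created" line.
    for d in data cache state; do
        [ -d "$root/$d" ] || { echo "missing directory: $d"; missing=1; }
    done
    # Files named in the "[Git] A ..." lines.
    for f in .gitignore dvc.conf; do
        [ -f "$root/$f" ] || { echo "missing file: $f"; missing=1; }
    done
    if [ "$missing" -eq 0 ]; then
        echo "layout ok"
    fi
    return "$missing"
}
```

Run check_dvc_layout . from the repository root right after dvc init; a non-zero exit status means at least one expected entry was not found.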
Create a virtualenv, for example at ~/.environments/dvc, by running mkdir -p ~/.environments/dvc; virtualenv ~/.environments/dvc, or use virtualenvwrapper.
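If you set up the environment from a script, it can be convenient to skip creation when the virtualenv already exists. The following is a minimal sketch under that assumption; ensure_env is a hypothetical helper name, and it treats the presence of bin/activate as the sign that an environment is already in place.

```shell
# Hypothetical helper: create the virtualenv only if it is not there yet.
ensure_env() {
    env_dir="$1"
    # Assumption: an existing virtualenv always contains bin/activate (macOS/Linux layout).
    if [ -f "$env_dir/bin/activate" ]; then
        echo "exists: $env_dir"
    else
        mkdir -p "$env_dir" && virtualenv "$env_dir" && echo "created: $env_dir"
    fi
}
```

For the location used in this README, call it as ensure_env ~/.environments/dvc before activating the environment.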
# if you use virtualenvwrapper
workon dvc
# otherwise:
source ~/.environments/dvc/bin/activate
pip install -r requirements.txt
# happy coding!
# creates dist/dvc
./build_osx.sh