forked from iterative/dvc

⚡️Data & models versioning for ML projects, make them shareable and reproducible

Data version control

Git for data science projects. It combines your Git-tracked code, your data (stored in S3 or GCP) and their dependencies into a single reproducible environment.

Copyright

This project is distributed under the Apache License, Version 2.0 (see the LICENSE file in the project root).

By submitting a pull request for this project, you agree to license your contribution to this project under the Apache License, Version 2.0.

Setup / Configuration

dvc supports both AWS and Google Cloud; you will need an account with at least one of them.

To see your available buckets, run either TODO or gsutil ls.

Google Cloud Setup

To set up your Google Cloud credentials: TODO

To test that they are set up correctly: TODO

Usage

mkdir t
cd t
git init .
dvc init

Info. Directories data/, cache/ and state/ were created
Info. File .gitignore was created
Info. Directory cache was added to .gitignore file
Info. [Git] A new commit 34687b2 was made in the current branch. Added files:
Info. [Git]	A  .gitignore
Info. [Git]	A  dvc.conf
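
To confirm that initialization left the repository in the state the log above describes, a small check can be sketched in shell. The function name check_dvc_layout is hypothetical and not part of dvc. It assumes that the .gitignore entry contains the word "cache", since the exact line dvc writes is not shown above.

```shell
# Hypothetical helper (not part of dvc): verify the layout that the
# `dvc init` log reports creating. Takes the repository root as $1.
check_dvc_layout() {
    root="${1:-.}"
    status=0

    # Directories the log says were created.
    for dir in data cache state; do
        if [ ! -d "$root/$dir" ]; then
            echo "missing directory: $dir"
            status=1
        fi
    done

    # Files the log says were added to the initial Git commit.
    for file in .gitignore dvc.conf; do
        if [ ! -f "$root/$file" ]; then
            echo "missing file: $file"
            status=1
        fi
    done

    # Assumption: the ignore entry contains the word "cache".
    if [ -f "$root/.gitignore" ] && ! grep -q 'cache' "$root/.gitignore"; then
        echo "cache is not listed in .gitignore"
        status=1
    fi

    return "$status"
}
```

Running check_dvc_layout . from inside the new repository prints nothing and returns 0 when everything is in place. Otherwise it prints one line per missing item and returns 1.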



Development

Create a virtualenv, for example at ~/.environments/dvc, by running mkdir -p ~/.environments/dvc; virtualenv ~/.environments/dvc. Alternatively, use virtualenvwrapper.

# if you use virtualenvwrapper
workon dvc

# otherwise:
source ~/.environments/dvc/bin/activate

pip install -r requirements.txt

# happy coding!
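
Before installing the requirements, it can help to confirm that a virtualenv is actually active, so that packages do not end up in the system Python. A minimal sketch follows. The function name in_virtualenv is hypothetical, and it relies on the VIRTUAL_ENV variable that both workon and the activate script export.

```shell
# Hypothetical helper: succeed only when a virtualenv is active.
in_virtualenv() {
    # Both `workon` and `source <env>/bin/activate` export VIRTUAL_ENV.
    [ -n "${VIRTUAL_ENV:-}" ]
}
```

For example, in_virtualenv && pip install -r requirements.txt runs the install only when an environment is active.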

Building

OSX

# creates dist/dvc
./build_osx.sh
