sampathweb/quilt

Note: this is the documentation for Quilt 3. For Quilt 2 see here and here.

Overview

Quilt is a collaboration tool for creating, managing, and sharing datasets in S3. Quilt users transform raw, messy data in S3 buckets into immutable datasets: reusable, trusted building blocks that are easy to version, test, share, and catalog. Working with datasets in Quilt speeds up model creation, accelerates experimentation, reduces downtime, and increases the productivity of data science teams.

Collaborate in S3

  • Quilt adds search, content preview, versioning, and a Python API to any S3 bucket
  • Every file in Quilt is versioned and searchable
  • Quilt is for data scientists, data engineers, and data-driven teams

Use cases

  • Collaborate - get everyone on the same page by pointing them all to the same immutable data version
  • Experiment faster - blob storage is schemaless and scalable, so iterations are quick
  • Recover, rollback, and reproduce with immutable packages
  • Understand what's in S3 - plaintext and faceted search over S3

Key features

  • Browse and search any S3 bucket
  • Preview images, Jupyter notebooks, Vega visualizations - without downloading
  • Read/write Python objects to and from S3
  • Immutable versions for objects, immutable packages for collections of objects
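
To make the idea of an immutable package more concrete, here is a minimal sketch in plain Python. It is not Quilt's implementation and does not use the Quilt API. It assumes that a package can be modeled as a manifest mapping logical keys to content hashes, with a single hash over the manifest acting as the version identifier. The names `build_manifest`, `top_hash`, and `entries` are hypothetical and chosen only for illustration.

```python
import hashlib
import json
from typing import Dict


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(objects: Dict[str, bytes]) -> dict:
    """Build an illustrative package manifest (hypothetical format).

    Each logical key maps to the hash and size of its content. The
    manifest itself is hashed in a canonical form, so any change to
    any object yields a different top-level identifier.
    """
    entries = {
        key: {"sha256": sha256_hex(data), "size": len(data)}
        for key, data in objects.items()
    }
    # Canonical JSON: sorted keys and fixed separators, so the hash
    # does not depend on the order in which objects were supplied.
    canonical = json.dumps(entries, sort_keys=True, separators=(",", ":"))
    return {"top_hash": sha256_hex(canonical.encode("utf-8")), "entries": entries}


# Two versions of the same package that differ in one file.
v1 = build_manifest({"data/train.csv": b"a,b\n1,2\n", "README.md": b"# demo\n"})
v2 = build_manifest({"data/train.csv": b"a,b\n1,3\n", "README.md": b"# demo\n"})

# Same contents as v1, supplied in a different order.
v1_again = build_manifest({"README.md": b"# demo\n", "data/train.csv": b"a,b\n1,2\n"})
```

Under these assumptions, pointing collaborators at `v1["top_hash"]` pins them to exactly one set of file contents: editing any file produces a new identifier instead of changing what the old one refers to, which is what makes rollback and reproduction possible.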

Components

  • /catalog (JavaScript) - Search, browse, and preview your data in S3
  • /api/python - Read, write, and annotate Python objects in S3

About

Quilt is a versioned data portal for AWS
