- Raleigh, NC
- innowhite.com
deequ Public
Forked from awslabs/deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Scala Apache License 2.0 Updated Mar 2, 2020
Basic_SQL_Flow Public
Basic_SQL_Flow is a workflow-based tool for SQL statements.
HTML Updated Feb 13, 2019
dataq_tests Public
This repo consists of automated test scripts for Dataq.
Python Updated Oct 4, 2018
incubator-livy Public
Forked from apache/incubator-livy
Mirror of Apache Livy (Incubating)
Scala Apache License 2.0 Updated Sep 26, 2018
spark-hello-wrold Public
A hello world example to demonstrate a test pipeline
Scala Updated Sep 22, 2018
azkaban Public
Forked from azkaban/azkaban
Azkaban workflow manager.
Java Apache License 2.0 Updated Aug 25, 2018
testcontainers-scala Public
Forked from testcontainers/testcontainers-scala
Docker containers for testing in Scala
azure-cosmosdb-spark Public
Forked from Azure/azure-cosmosdb-spark
This project provides a client library that allows Azure Cosmos DB to act as an input source or output sink for Spark jobs.
Scala MIT License Updated Mar 29, 2018
docker-spark Public
Forked from sequenceiq/docker-spark
Shell Apache License 2.0 Updated Aug 21, 2017
LearningSpark Public
Forked from spirom/LearningSpark
Scala examples for learning to use Spark
Scala MIT License Updated Jul 16, 2017
Spark.TableStatsExample Public
Forked from tmalaska/Spark.TableStatsExample
Simple Spark example of generating table stats for use in data quality checks
Scala Apache License 2.0 Updated Apr 28, 2017
drunken-data-quality Public
Forked from FRosner/drunken-data-quality
Spark package for checking data quality
Scala Apache License 2.0 Updated Apr 12, 2017
spark Public
Forked from apache/spark
Mirror of Apache Spark
Scala Apache License 2.0 Updated Mar 18, 2017