- Nielsen
- Oldsmar, FL
- http://www.nielsen.com/
Starred repositories
MapReduce performance testing using teragen and terasort
Terraform module to create Amazon Elastic Kubernetes (EKS) resources 🇺🇦
Terragrunt is a flexible orchestration tool that allows Infrastructure as Code written in OpenTofu/Terraform to scale.
Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared …
a pyenv plugin to manage virtualenv (a.k.a. python-virtualenv)
Includes notes on using Apache Spark in general, notes on using Spark for Physics, how to run TPCDS on PySpark, how to create histograms with Spark, tools for performance testing CPUs, Jupyter note…
🍺 The missing package manager for macOS (or Linux)
A command-line tool for launching Apache Spark clusters.
Upserts, Deletes And Incremental Processing on Big Data.
All the things about TPC-DS in Apache Spark
Use the TPC-DS benchmark to test Spark SQL performance
Essential Spark extensions and helper methods ✨😲
A curated list of awesome Apache Spark packages and resources.
A series of DAGs/Workflows to help maintain the operation of Airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
PowerMock is a Java framework that allows you to unit test code normally regarded as untestable.
JVM Profiler Sending Metrics to Kafka, Console Output or Custom Reporter
Official s3cmd repo -- Command line tool for managing S3 compatible storage services (including Amazon S3 and CloudFront).
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Spark: The Definitive Guide's Code Repository
scopt / scopt
Forked from jstrachan/scopt
command line options parsing for Scala