Stars
This project downloads and stores the daily SBI forex rates in a CSV file, enabling you to access historical rates easily.
🌐 The Internet OS! Free, Open-Source, and Self-Hostable.
Cruise-control is the first of its kind to fully automate the dynamic workload rebalance and self-healing of a Kafka cluster. It provides great value to Kafka users by simplifying the operation of …
Code examples and docker environment for Spark
Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for test, P…
A curated list of awesome Apache Spark packages and resources.
Spark Gotchas. A subjective compilation of the Apache Spark tips and tricks
Generate big TPC-DS datasets with Databricks
Notes talking about the design and implementation of Apache Spark
Super-fast drop-in replacement for the in-memory key-value store Redis, made in Golang
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Opinionated RAG for integrating GenAI in your apps 🧠 Focus on your product rather than the RAG. Easy integration in existing products with customisation! Any LLM: GPT4, Groq, Llama. Any Vectorstore: …
A tool for catching binary incompatibility in Scala
Examples for High Performance Spark
PG_DIPLOMA_IN_DATA_SCIENCE_IIIT-B_&_UPGRAD
Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, Du…
Source code for Twitter's Recommendation Algorithm
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
System design interview for IT companies
The Patterns of Scalable, Reliable, and Performant Large-Scale Systems
⚡ Lightning-Talk Style Demo of Istio and OpenCensus⚡
This repository contains the notebooks and presentations we use for our Databricks Tech Talks
Find Aadhaar cards thanks to Google
The Java implementation of Apache Dubbo. An RPC and microservice framework.