Lists (27)
airflow
bi
blockchain
Causality
Data Engineering
dbt
DevTools
frontend
graph
hardware
IoT
k8s
LLM & GENAI
Marketing
ML
Observability
Ops
productivity
Record Linkage
redshift
Sec
serverless
snowflake
stack
streaming
trading
visuals
Starred repositories
Fluss is a streaming storage built for real-time analytics.
A lightweight Python-based tool for extracting and analyzing data column lineage for dbt projects
GitHub Action to download JSON artifacts from a dbt Cloud CI job triggered by a pull request.
Installer for DataKitchen's Open Source Data Observability Products. Data breaks. Servers break. Your toolchain breaks. Ensure your team is the first to know and the first to solve with visibility …
A fully static distributed library system powered by IPFS, SQLite and GitHub
Realtime database, runs anywhere. Install Fireproof in your front-end app or edge function, and sync data via any backend.
Apache Polaris, the interoperable, open source catalog for Apache Iceberg
Metabase driver and plugin for Materialize
LakeSoul is an end-to-end, realtime and cloud native Lakehouse framework with fast data ingestion, concurrent update and incremental data analytics on cloud storages for both BI and AI applications.
Gen-AI Chat for Teams - Think ChatGPT if it had access to your team's unique knowledge.
A portable Pythonic Data Catalog API powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to your big data workloads.
A high performance caching library for Java
AWS Kinesis Flink App processing a real time streaming input that writes the output in different file formats to S3
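A big data knowledge repository covering data warehouse modeling, real-time computing, big data, data middle platforms, system design, Java, algorithms, and more.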
This repository contains resources for technical coding interviews.
CLI tool to bulk migrate tables from one catalog to another without a data copy
Universal solution for geospatial data tailored to data lakehouse systems for the first time in the industry
Access to Spanish Statistical Office's Household Income Distribution Atlas data at municipality, district and census tract levels
This Guidance demonstrates a robust approach to incrementally export and maintain a centralized data repository reflecting ongoing changes in a distributed database.
This repository contains the Markdown source and tools for the Quix developer documentation.
Lakekeeper: A Rust native Iceberg REST Catalog
A self-hosted dashboard that puts all your feeds in one place
Repository for Advanced Flink Application Patterns series