😴
On a slow life break, occasionally attentive

Sponsoring

@mgmeyers

Highlights

  • Pro

Organizations

@barcampbangkok @scalajp @hspec @bkkhack @go-kafka @feast-dev

Stars

Data 🪣

Data tools, data infrastructure, data visualization… probably a bit of big bucket…
27 repositories

lakeFS - Data version control for your data lake | Git for data

Go 4,576 369 Updated Mar 15, 2025

Nessie: Transactional Catalog for Data Lakes with Git-like semantics

Java 1,158 147 Updated Mar 15, 2025

A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.

Python 2,052 93 Updated Sep 21, 2024

DuckDB is an analytical in-process SQL database management system

C++ 27,594 2,165 Updated Mar 14, 2025

A time-series database for high-performance real-time analytics packaged as a Postgres extension

C 18,567 916 Updated Mar 14, 2025

Voilà turns Jupyter notebooks into standalone web applications

Python 5,618 510 Updated Mar 3, 2025

A data visualization and analytics component, especially well-suited for large and/or streaming datasets.

C++ 8,970 1,210 Updated Mar 14, 2025

🦜🔗 Build context-aware reasoning applications

Jupyter Notebook 103,369 16,723 Updated Mar 16, 2025

Making data lake work for time series

Python 1,157 59 Updated Aug 21, 2024

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 36,004 6,104 Updated Mar 15, 2025

Dataframes powered by a multithreaded, vectorized query engine, written in Rust

Rust 32,385 2,118 Updated Mar 15, 2025

Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster.

Scala 286 67 Updated Feb 24, 2025

𝗗𝗮𝘁𝗮, 𝗔𝗻𝗮𝗹𝘆𝘁𝗶𝗰𝘀 & 𝗔𝗜. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://databend.com

Rust 8,256 770 Updated Mar 15, 2025

Typesafe wrapper for Apache Spark DataFrame API

Scala 141 9 Updated Oct 23, 2024

🔥🔥🔥AI-driven database tool and SQL client, The hottest GUI client, supporting MySQL, Oracle, PostgreSQL, DB2, SQL Server, SQLite, H2, ClickHouse, and more.

Java 20,660 2,273 Updated Mar 5, 2025

Conduit streams data between data stores. Kafka Connect replacement. No JVM required.

Go 438 49 Updated Mar 15, 2025

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team co…

TypeScript 6,268 1,173 Updated Mar 15, 2025

EventStoreDB, the event-native database. Designed for Event Sourcing, Event-Driven, and Microservices architectures

C# 5,437 661 Updated Mar 15, 2025

Concurrent and multi-stage data ingestion and data processing with Elixir

Elixir 2,495 163 Updated Mar 5, 2025

A library that provides useful extensions to Apache Spark and PySpark.

Scala 219 28 Updated Mar 14, 2025

A generative AI extension for JupyterLab

Python 3,487 375 Updated Mar 14, 2025

A code-first agent framework for seamlessly planning and executing data analytics tasks.

Python 5,590 715 Updated Mar 14, 2025

🦔 PostHog provides open-source web & product analytics, session recording, feature flagging and A/B testing that you can self-host. Get started - free.

Python 24,975 1,545 Updated Mar 16, 2025

Latency Tester for Apache Cassandra

Rust 184 23 Updated Feb 17, 2025

Developing Spark with sbt

Scala 6 5 Updated Jan 23, 2021

Business intelligence as code: build fast, interactive data visualizations in SQL and markdown

JavaScript 4,954 241 Updated Mar 14, 2025

A lightweight data processing framework built on DuckDB and 3FS.

Python 4,235 354 Updated Mar 5, 2025