👕 An open-source course that will teach you how to build and deploy a real-time personalized recommender for H&M fashion articles.

Python 117 26 Updated Dec 12, 2024

Embeddable stream processing engine based on Apache DataFusion

Rust 288 7 Updated Dec 10, 2024

📚 Parameterize, execute, and analyze notebooks

Python 6,033 431 Updated Oct 5, 2024
Rust 1 2 Updated Oct 18, 2024
Rust 37 14 Updated Dec 16, 2024

🥈 Silver Medal Solution to Kaggle H&M Personalized Fashion Recommendations

Jupyter Notebook 62 16 Updated May 20, 2022

📊 Cube — Universal semantic layer platform for AI, BI, spreadsheets, and embedded analytics

Rust 18,068 1,779 Updated Dec 17, 2024

The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance …

Java 9,381 1,867 Updated Dec 16, 2024

A native Rust library for Delta Lake, with bindings into Python

Rust 1 1 Updated Oct 11, 2024

Incremental ML learning in the real-world

Python 61 9 Updated Aug 17, 2024

An analysis of cycling counts in Auckland in relation to the weather

Jupyter Notebook 89 56 Updated Feb 27, 2023

data load tool (dlt) is an open source Python library that makes data loading easy 🛠️

Python 2,810 183 Updated Dec 16, 2024

A highly efficient daemon for streaming data from Kafka into Delta Lake

Rust 376 82 Updated Dec 12, 2024

A clean, extensible Python project template

Makefile 2 1 Updated Jul 29, 2024

A write-audit-publish implementation on a data lake without the JVM

Python 42 2 Updated Aug 12, 2024

Apache Polaris, the interoperable, open source catalog for Apache Iceberg

Python 1,220 147 Updated Dec 16, 2024

Debug your GitHub Actions via SSH by using tmate to get access to the runner system itself.

JavaScript 3,001 297 Updated Nov 17, 2024

Visual Data Transformation with Python Code Generation. Low-Code Python-based ETL.

TypeScript 938 45 Updated Dec 10, 2024

A native Delta implementation for integration with any query engine

Rust 160 49 Updated Dec 14, 2024

O'Reilly book - Building Machine Learning Systems with a feature store: batch, real-time, and LLMs

Jupyter Notebook 20 99 Updated Dec 12, 2024

This work aims to re-engineer the Hadoop Distributed File System (HDFS) so that it can be 1) highly available, and 2) horizontally scalable. This is achieved by replacing the central master server …

Java 2 Updated Jan 2, 2012

The native Rust implementation for Apache Hudi, with Python API bindings.

Rust 167 33 Updated Dec 13, 2024

λFS: an elastic, high-performance, serverless-function-based metadata service for large-scale distributed file systems (ACM ASPLOS'23)

Java 11 2 Updated Feb 12, 2024

AI Toolkit for Healthcare Imaging

Python 5,965 1,102 Updated Dec 10, 2024

Database connectivity API standard and libraries for Apache Arrow

C# 389 99 Updated Dec 17, 2024

Check for data drift between two OpenAI multi-turn chat jsonl files.

Jupyter Notebook 37 6 Updated Apr 11, 2024

Project bike sharing predictor

Jupyter Notebook 64 13 Updated Nov 28, 2024

Data Processing/Feature Calculation Engine for real-time AI/ML

Python 39 4 Updated Dec 15, 2024

The Feldera Incremental Computation Engine

Rust 843 49 Updated Dec 16, 2024

Curate better data for LLMs

Python 986 93 Updated Mar 19, 2024