- Hopsworks
- Stockholm
- @jim_dowling
- in/jim-dowling-206a98
Stars
👕 An open-source course that will teach you how to build and deploy a real-time personalized recommender for H&M fashion articles.
Embeddable stream processing engine based on Apache DataFusion
📚 Parameterize, execute, and analyze notebooks
🥈 Silver Medal Solution to Kaggle H&M Personalized Fashion Recommendations
📊 Cube — Universal semantic layer platform for AI, BI, spreadsheets, and embedded analytics
The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance …
Silemo / delta-rs
Forked from delta-io/delta-rs
A native Rust library for Delta Lake, with bindings into Python
Incremental ML learning in the real-world
An analysis of cycling counts in Auckland in relation to the weather
data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
A highly efficient daemon for streaming data from Kafka into Delta Lake
A clean, extensible Python project template
A write-audit-publish implementation on a data lake without the JVM
Apache Polaris, the interoperable, open source catalog for Apache Iceberg
Debug your GitHub Actions via SSH by using tmate to get access to the runner system itself.
Visual Data Transformation with Python Code Generation. Low-Code Python-based ETL.
A native Delta implementation for integration with any query engine
O'Reilly book - Building Machine Learning Systems with a feature store: batch, real-time, and LLMs
This work aims to re-engineer the Hadoop Distributed File System (HDFS) so that it can be 1) highly available, and 2) horizontally scalable. This is achieved by replacing the central master server …
The native Rust implementation for Apache Hudi, with Python API bindings.
λFS: an elastic, high-performance, serverless-function-based metadata service for large-scale distributed file systems (ACM ASPLOS'23)
Database connectivity API standard and libraries for Apache Arrow
Check for data drift between two OpenAI multi-turn chat jsonl files.
Project bike sharing predictor
Data Processing/Feature Calculation Engine for real-time AI/ML