Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, Du…

Rust 4,169 254 Updated Feb 11, 2025

A collection of awesome things related to the AWS Cloud Development Kit (CDK)

2,027 196 Updated Mar 12, 2024

A Python library to support running data quality rules while the Spark job is running ⚡

Python 171 44 Updated Jan 24, 2025

Source code for Twitter's Recommendation Algorithm

Python 10,184 2,223 Updated Jul 10, 2024

Snippets and templates representing common Customer Success patterns

HCL 243 28 Updated Jan 1, 2025

Explanations of key concepts in ML

7,448 587 Updated Feb 11, 2025

A curated and opinionated list of resources for Chief Technology Officers, with the emphasis on startups

26,549 1,581 Updated Mar 24, 2024

Dask on ECS Fargate

Shell 14 11 Updated Sep 23, 2019

An example of an ETL pipeline that lays out generic DE processes. This is now out of date but still provides useful information

Python 27 14 Updated Apr 22, 2022

Apache Spark - A unified analytics engine for large-scale data processing

Scala 40,508 28,490 Updated Feb 11, 2025

AI and Machine Learning with Kubeflow, Amazon EKS, and SageMaker

Jupyter Notebook 3,373 1,083 Updated Jul 31, 2024

📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.

27,690 3,725 Updated Jul 18, 2024

😎 A curated list of awesome MLOps tools

Python 4,296 595 Updated Nov 29, 2024

Architecture decision record (ADR) examples for software planning, IT leadership, and template documentation

12,782 2,498 Updated Feb 7, 2025

Example code that launches a docker container on AWS Fargate from AWS Lambda

Makefile 18 5 Updated Dec 24, 2017

Testing out tracing's new valuable work.

Rust 1 Updated Feb 16, 2022

This repository has settings files to host Redash and a Redash bot on AWS. You can launch a production-ready Redash within 5 minutes.

40 12 Updated May 24, 2020

Merlion: A Machine Learning Framework for Time Series Intelligence

Python 3,522 311 Updated Jun 20, 2024

This repository demonstrates SQL-based UPDATEs, DELETEs, and INSERTs directly in the data lake using Amazon S3, AWS Glue, and Delta Lake.

Python 16 10 Updated Aug 25, 2021

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

Python 19,589 2,748 Updated Feb 5, 2025

Simple CLI for storing and retrieving SSH keys from AWS Secrets Manager

Rust 1 Updated May 31, 2021

Utilities for the awesome window manager

Lua 857 49 Updated Dec 18, 2024

📊 Path to a free self-taught education in Data Science!

19,414 3,518 Updated Nov 21, 2024

🎓 Path to a free self-taught education in Computer Science!

175,234 22,213 Updated Feb 1, 2025

CLI to manage emails

Rust 4,592 128 Updated Jan 27, 2025

A list about Apache Kafka

581 163 Updated Feb 9, 2024

The Little Book of Rust Macros

Rust 891 96 Updated Nov 30, 2022

Shared resource dispatcher

Rust 242 66 Updated May 15, 2024

Adding support for the Rust language to the Linux kernel.

C 4,059 440 Updated Feb 9, 2025