cakiem8x

Van Nguyen cakiem8x

26 followers · 84 following

ThuDo JSC
Hanoi
http://thudomutimedia.vn

Starred repositories

8 results for source starred repositories written in Scala

apache / spark

Apache Spark - A unified analytics engine for large-scale data processing

Scala · 40,415 stars · 28,472 forks · Updated Jan 29, 2025

delta-io / delta

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs

Scala · 7,780 stars · 1,754 forks · Updated Jan 29, 2025

microsoft / SynapseML

Simple and Distributed Machine Learning

Scala · 5,091 stars · 836 forks · Updated Jan 10, 2025

JohnSnowLabs / spark-nlp

State of the Art Natural Language Processing

Scala · 3,906 stars · 719 forks · Updated Jan 29, 2025

databricks / LearningSparkV2

This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]

Scala · 1,238 stars · 750 forks · Updated Jan 28, 2025

high-performance-spark / high-performance-spark-examples

Examples for High Performance Spark

Scala · 506 stars · 234 forks · Updated Nov 3, 2024

shafiab / HashtagCashtag

My Insight Data Engineering Fellowship project. I implemented a big data processing pipeline based on lambda architecture, that aggregates Twitter and US stock market data for user sentiment anal…

Scala · 501 stars · 128 forks · Updated Aug 24, 2022