Stars
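This project once reached first place worldwide; a collection of highlights is at the very bottom of this page. A complete, polished print edition, 《编程之法:面试和算法心得》 (The Method of Programming: Interview and Algorithm Insights), is on sale at JD.com and Dangdang.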
TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.
Apache Ambari simplifies provisioning, managing, and monitoring of Apache Hadoop clusters.
Machine Learning, Deep Learning, PostgreSQL, Distributed Systems, Node.js, Golang
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K…
REST job server for Apache Spark
Microsecond messaging that stores everything to disk
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Apache Spark - A unified analytics engine for large-scale data processing
ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
An Open Source Machine Learning Framework for Everyone
Distributed Scheduled Job Framework
CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing, coreference, sentiment analysis, etc.
The Scala 2.8 Collections API main types as dot graphs (GraphViz)
The Java implementation of Apache Dubbo, an RPC and microservice framework.
Scalatron, a multi-player programming game in which coders pit bot programs (written in Scala) against each other
A platform to build and run apps that are elastic, agile, and resilient. SDK, libraries, and hosted environments.
Notes on the design and implementation of Apache Spark
A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and bat…
PredictionIO, a machine learning server for developers and ML engineers.