    Repositories list

    • Byzer (formerly MLSQL): A low-code, open-source programming language for data pipelines, analytics, and AI.
      JavaScript
      Apache License 2.0
      548 · 0 · 0 · 0 · Updated Apr 29, 2022
    • neo4j

      Public
      Graphs for Everyone
      Java
      Other
      2.4k · 0 · 0 · 0 · Updated Apr 11, 2022
    • gpdb

      Public
      Greenplum Database - Massively Parallel PostgreSQL for Analytics. An open-source massively parallel data platform for analytics, machine learning and AI.
      C
      Apache License 2.0
      1.9k · 0 · 0 · 0 · Updated Apr 7, 2022
    • alluxio

      Public
      Alluxio, data orchestration for analytics and machine learning in the cloud
      Java
      Apache License 2.0
      2.9k · 0 · 0 · 0 · Updated Apr 3, 2022
    • dbeaver

      Public
      Free universal database tool and SQL client
      Java
      Apache License 2.0
      3.5k · 0 · 0 · 0 · Updated Apr 1, 2022
    • flume

      Public
      Mirror of Apache Flume
      Java
      Apache License 2.0
      1.6k · 0 · 0 · 0 · Updated Mar 31, 2022
    • Read Delta tables without any Spark
      Python
      Apache License 2.0
      14 · 0 · 0 · 0 · Updated Mar 10, 2022
    • Pentaho Data Integration (ETL), a.k.a. Kettle
      Java
      Apache License 2.0
      3.5k · 0 · 0 · 0 · Updated Mar 9, 2022
    • hadoop

      Public
      Apache Hadoop
      Java
      Apache License 2.0
      8.9k · 0 · 0 · 0 · Updated Mar 9, 2022
    • impala

      Public
      Apache Impala
      C++
      Apache License 2.0
      509 · 0 · 0 · 0 · Updated Mar 9, 2022
    • hbase

      Public
      Apache HBase
      Java
      Apache License 2.0
      3.3k · 0 · 0 · 0 · Updated Mar 9, 2022
    • DataX

      Public
      DataX is the open-source version of Alibaba Cloud DataWorks Data Integration.
      Java
      Other
      5.5k · 0 · 0 · 0 · Updated Mar 8, 2022
    • ClickHouse® is a free analytics DBMS for big data
      C++
      Apache License 2.0
      7k · 0 · 0 · 0 · Updated Mar 8, 2022
    • kudu

      Public
      Mirror of Apache Kudu
      C++
      Apache License 2.0
      653 · 0 · 0 · 0 · Updated Mar 4, 2022
    • TabPy

      Public
      Execute Python code on the fly and display results in Tableau visualizations.
      Python
      MIT License
      601 · 0 · 0 · 0 · Updated Mar 3, 2022
    • Apache Doris (Incubating) is an MPP-based interactive SQL data warehouse for reporting and analysis.
      C++
      Apache License 2.0
      3.3k · 0 · 0 · 0 · Updated Mar 2, 2022
    • hudi

      Public
      Upserts, Deletes And Incremental Processing on Big Data.
      Java
      Apache License 2.0
      2.4k · 0 · 0 · 0 · Updated Mar 2, 2022
    • iceberg

      Public
      Apache Iceberg
      Java
      Apache License 2.0
      2.3k · 0 · 0 · 0 · Updated Mar 2, 2022
    • flink

      Public
      Apache Flink
      Java
      Apache License 2.0
      13k · 0 · 0 · 0 · Updated Mar 2, 2022
    • spark

      Public
      Apache Spark - A unified analytics engine for large-scale data processing
      Scala
      Apache License 2.0
      28k · 0 · 0 · 0 · Updated Mar 2, 2022
    • presto

      Public
      The official home of the Presto distributed SQL query engine for big data
      Java
      Apache License 2.0
      5.4k · 0 · 0 · 0 · Updated Mar 2, 2022
    • kylin

      Public
      Apache Kylin
      Java
      Apache License 2.0
      1.5k · 0 · 0 · 0 · Updated Mar 1, 2022
    • A collection of Power BI samples for developer use.
      JavaScript
      MIT License
      1.5k · 0 · 0 · 0 · Updated Feb 27, 2022
    • Apache Parquet
      Java
      Apache License 2.0
      1.4k · 0 · 0 · 0 · Updated Feb 25, 2022
    • oozie

      Public
      Mirror of Apache Oozie
      Java
      Apache License 2.0
      475 · 0 · 0 · 0 · Updated Feb 22, 2022