@ScaleUnlimited

Scale Unlimited

Popular repositories

  1. flink-crawler Public

    Continuous scalable web crawler built on top of Flink and crawler-commons

    Java 51 18

  2. cascading.solr Public

    Cascading scheme for Solr

    Java 27 13

  3. cascading.utils Public

    Utilities for Cascading

    Java 22 12

  4. cascading.avro Public

    Forked from clizzin/cascading.avro

    Cascading Scheme for the Apache Avro data serialization format

    Java 19 25

  5. cascading.simpledb Public

    Cascading Tap & Scheme for Amazon's SimpleDB

    Java 12 1

  6. wikipedia-ngrams Public

    Code to split/parse Wikipedia XML dump

    Java 12 4

Repositories

Showing 10 of 32 repositories
  • cascading.utils Public

    Utilities for Cascading

    Java 22 12 19 5 Updated Jul 1, 2022
  • cascading.solr Public

    Cascading scheme for Solr

    Java 27 13 9 7 Updated Jul 1, 2022
  • pinot Public Forked from apache/pinot

    Apache Pinot (Incubating) - A realtime distributed OLAP datastore

    Java 0 Apache-2.0 1,378 0 0 Updated Jun 8, 2022
  • text-similarity Public

    Source code for blog post series on text features for similarity calculation

    Java 11 1 0 3 Updated May 12, 2021
  • flink-crawler-ccdemo Public

    Demo of using flink-crawler to extract pages from Common Crawl for a target language

    Java 0 Apache-2.0 0 3 0 Updated Apr 8, 2019
  • flink-crawler Public

    Continuous scalable web crawler built on top of Flink and crawler-commons

    Java 51 Apache-2.0 18 27 0 Updated Apr 8, 2019
  • flink-utils Public

    Utilities for use with Flink

    Java 0 Apache-2.0 0 0 0 Updated Mar 14, 2019
  • cascading Public Forked from Cascading/cascading

    Cascading is a feature-rich API for defining and executing complex, fault-tolerant data processing workflows on various cluster computing platforms. Please see https://github.com/cwensel/cascading for access to all WIP branches.

    Java 0 227 0 0 Updated Nov 29, 2018
  • flink-streaming-kmeans Public

    Simple implementation of KMeans clustering on Flink, using iterations

    Java 10 Apache-2.0 1 7 0 Updated Nov 15, 2018
  • fastText Public Forked from facebookresearch/fastText

    Library for fast text representation and classification.

    HTML 0 4,842 0 0 Updated Jul 16, 2018
