Skip to content

dgryski/go-fastquantiles

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

77 Commits
 
 
 
 
 
 

Repository files navigation

Approximate streaming quantiles

NOTE: THIS CODE IS CURRENTLY BROKEN

For a fast streaming quantile implementation, please see http://github.com/dgryski/go-gk


My goal is to implement:

"A Fast Algorithm for Approximate Quantiles in High Speed Data Streams" (Qi Zhang and Wei Wang, 2007).

http://www.cs.unc.edu/~zhangq/PUBLICATIONS/SSDBM07-Qi%20Zhang-FastQStream.pdf

Course notes:
http://www.mathcs.emory.edu/~cheung/Courses/584-StreamDB/Syllabus/08-Quantile/Zhang.html
http://www.mathcs.emory.edu/~cheung/Courses/584-StreamDB/Syllabus/08-Quantile/Zhang2.html

Zhang/Wang is based on the algorithm in
"Space-Efficient Online Computation of Quantile Summaries" (Greenwald, Khanna, 2001)
http://infolab.stanford.edu/~datar/courses/cs361a/papers/quantiles.pdf

So I decided to start with GK.  There's a good overview at
http://papercruncher.com/2010/03/02/stream-algorithms-order-statistics/

and more detailed notes at
http://www.mathcs.emory.edu/~cheung/Courses/584-StreamDB/Syllabus/08-Quantile/Greenwald.html
http://www.mathcs.emory.edu/~cheung/Courses/584-StreamDB/Syllabus/08-Quantile/Greenwald2.html

The MERGE algorithm is taken from
"Continuously Maintaining Quantile Summaries of the Most Recent N Elements over a Data Stream"
(Lin, H. Lu, J. Xu, and J.X. Yu, 2004)
http://www.cs.ubc.ca/~xujian/paper/quant.pdf

About

approximate streaming quantiles

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages