forked from argusdusty/Ferret
-
Notifications
You must be signed in to change notification settings - Fork 0
yinqunjun/Ferret
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
Ferret: An optimized substring search engine written in Go. Makes use of a combination of an Inverted Index and a Suffix Array to allow log-time lookups with a relatively small memory footprint. Also incorporates error-correction (Levenshtein distance 1) and simple Unicode-to-ASCII conversion. Allows for arbitrary sorting functions Allows you to map arbitrary data to your results, and quickly update this data. License: CC BY-SA 3.0 (http://creativecommons.org/licenses/by-sa/3.0/) You are free to share and remix this work, as long as you attribute the original author, and share it under a similar license Author: Mark Canning To Install: go get github.com/argusdusty/Ferret To Update: go get -u github.com/argusdusty/Ferret To Use: import "github.com/argusdusty/Ferret" To try the examples: go build example.go && go build dictionaryexample.go Check out example.go and dictionaryexample.go for example usage. The code is meant to be as fast as possible for a substring dictionary search, and as such is best suited for large dictionaries with >1,000,000 total characters. I've timed 10s initialization for 3.5 million characters (350,000 entries) on a modern CPU, and 15us search time (4000us with error-correction). Uses linear memory (approximately 10 times the size of the dictionary) Searches performed in log time with the number of characters in the dictionary. Sorted searches can be slow, taking ~linear time with the number of matches, rather than linear time with the results limit.
About
An optimized substring search engine written in Go
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published