Skip to content

Latest commit

 

History

History
64 lines (38 loc) · 3.97 KB

README.md

File metadata and controls

64 lines (38 loc) · 3.97 KB

String Algorithms

From Wikipedia: a string-searching algorithm invented by Alfred V. Aho and Margaret J. Corasick in 1975.[1] It is a kind of dictionary-matching algorithm that locates elements of a finite set of strings (the "dictionary") within an input text. It matches all strings simultaneously.

From Wikipedia: The Burrows–Wheeler transform (BWT, also called block-sorting compression) rearranges a character string into runs of similar characters. This is useful for compression, since it tends to be easy to compress a string that has runs of repeated characters by techniques such as move-to-front transform and run-length encoding. More importantly, the transformation is reversible, without needing to store any additional data except the position of the first original character. The BWT is thus a "free" method of improving the efficiency of text compression algorithms, costing only some extra computation.

Properties

  • Worst-case performance O(n)

From Wikipedia: searches for occurrences of a "word" W within a main "text string" S by employing the observation that when a mismatch occurs, the word itself embodies sufficient information to determine where the next match could begin, thus bypassing re-examination of previously matched characters. Knuth Morris Pratt search runs in linear time in the length of W and S.

Properties

  • Case performance O(s + w)
  • Case space complexity O(w)

From Wikipedia: find a longest palindrome in a string in linear time.

Properties

  • Worst-case time complexity is O(n)
  • Worst-case space complexity is O(n)

From Wikipedia: a string-searching algorithm created by Richard M. Karp and Michael O. Rabin that uses hashing to find an exact match of a pattern string in a text.

From Wikipedia: In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different. In other words, it measures the minimum number of substitutions required to change one string into the other, or the minimum number of errors that could have transformed one string into the other. In a more general context, the Hamming distance is one of several string metrics for measuring the edit distance between two sequences. It is named after the American mathematician Richard Hamming.

From IBM: The REVERSE function accepts a character expression as its argument, and returns a string of the same length, but with the ordinal positions of every logical character reversed.

This algorithm finds instances of a text pattern within a larger text in linear time. Let the text length be n and pattern be m, then the total time to compute is O(m + n) with linear space complexity. The Z-algorithm is identical to the Knuth Morris Pratt algorithm in time and space complexity, but serves as a simpler example. In this algorithm, we construct a Z array.

Properties

  • Case-performance = O(m + n)
  • Case space complexity O(w)