Clone wiki

cl-string-match / Home


CL-STRING-MATCH Quickdocs aims at providing robust implementations of string matching algorithms. These algorithms are also called "substring search" or "subsequence search" algorithms.

Currently it provides implementations of the following string matching algorithms:

  • Brute-force (also known as naïve algorithm)
  • Boyer-Moore (with mismatched character heuristic)
  • Boyer-Moore-Horspool algorithm
  • Knuth-Morris-Pratt algorithm
  • Rabin-Karp algorithm
  • Aho-Corasick algorithm (with finite set of patterns)
  • A simple backtracking regular expressions engine re similar to that of the Lua programming language. At the moment it significantly underperforms compared to the CL-PPCRE.

Some string processing algorithms are also implemented:

  • Simple (naїve) suffix tree construction algorithm
  • Ukkonen's suffix tree construction algorithm

Data structures:

  • Prefix trie
  • Suffix tree

Some algorithms (Brute-force, Boyer-Moore-Horspool) have parametric implementations making it possible to declare specific implementations for application-specific custom data types and data structures.


Additional resources

This Wiki is created to provide additional documentation on the cl-string-search library and additional information on the topic of string search/string matching. Hopefuly you will find it useful.