Monday, September 1, 2014

a bit of data structure: Suffix Array

Suffix array is simply the array of suffixes of a string sorted to their lexicographical order. Easiest way to construct suffix array will have a running time of \(O(N^2\log{N})\).

Pseudo-code for suffix array construction in \(O(N\text{log}^2N)\) time complexity (not the best but not bad):
Let \(A\) be the string
1. Set \(N = \text{length}(A)\)
2. For i from 0 to \(N-1\):
    2.1 Set P(0, i) = position of \(A_i\) in A that is sorted by its individual character
3. Set cnt = 1
4. For k from 0 to \(\lceil\text{log}N \rceil\):
    4.1 For i from 0 to N-1:
          4.1.1 Set L(i) = ( P(k-1, i), P(k-1, i+cnt), i )
    4.2 Sort L(i)
    4.3 Compute P(k, i) using L(i)
    4.4 cnt = cnt * 2