Tuesday, July 15, 2014

a bit of data structure: Segment Tree

This data structure is used to efficiently support queries which request for the index of the minimum element in a given range.

The idea is using a divide and conquer: given a range \((i, \ldots, j)\), and then we find the index to the minimum element for \((i, \ldots, \frac{i+j}{2})\) and \((\frac{i+j}{2}+1, \ldots, j)\) seperately, and finally combine the result by choosing the lower element of the two indexes as the minimum index for \(i, \ldots, j\).

The tree is built on a one dimensional array just like Binary Heap and it requires \(O(4N)\) space. To process a query, we traverse down the tree to find the sub-interval which lies totally inside the query range.

Using Segment Tree we can maintain a dynamically changing information on a static data structure as update operations can be implemented efficiently. Furthermore, to increase its performance, we can opt to have a lazy propagation of updates, by maintaining an extra array that mark whether a segment have to be updated or not.

Here is a quick implementation in C/C++ :

vector<int> stree(4*N + 1,0);
vector<int> arr(N + 1,0);

int build(int i, int j, int k)
 if(i == j)
  stree[k] = i;
  return stree[k];
 int mid = (i+j)/2;
 int left=-1,right=-1;
 left = build(i,mid,2*k);
 right = build(mid+1,j,2*k+1);
 if(arr[left] > arr[right])
  stree[k] = right;
  stree[k] = left;
 return stree[k];

int rmq(int p, int i, int j, int L, int R)
 //invalid query
 if(i > R || j < L) return -1;
 if(i <= L && R <= j) return stree[p];
 int mid = (L+R)/2;
 int left = rmq(2*p, i, j, L, mid);
 int right = rmq(2*p + 1, i, j, mid+1, R);
 if (left == -1) return right;
 if (right == -1) return left;
 return (arr[left] < arr[right]) ? left : right;