Thursday, July 10, 2014

a bit of data structure: Fenwick Tree (Binary Indexed Tree)

Problem Statement:
Given a set of values \(i \in \{ 1, \ldots, n \} \) and set of frequencies \(f_i\):
1. find the cumulative frequency from i to j
2. support updating frequencies

Data Structure: Fenwick Tree (also known as Binary Indexed Tree)

Construction:
Every number can be represented the sum of powers of 2, which is the basis of the binary representation of numbers. Define LSB[i] as the least significant bit of i (the last bit 1 in the binary representation of i). For example:
1. LSB[3] = LSB[11] = 0
2. LSB[14] = LSB[1110] = 1
3. LSB[52] = LSB[110100] = 2

We define FTree[i] as the sum of frequencies from \( i - 2^{\text{LSB[i]}} + 1, \ldots, i\). This allows an iterative representation of Cumulative Frequency from 1 to i CF[i] in terms of FTree, for example:

1. CF[10001101] = FTree[10001100] + FTree[10001000] + FTree[10000000]
2. CF[1101] = FTree[1100] + FTree[1000]

As such, CF[i] can be found in \(O(\log{N})\) time, where \(N\) is the length of the binary representation of i.

In case we update the entry in FTree[i] by a value val, we iteratively add val to FTree[i + \(2^\text{LSB[i]}\)], hence updating all responsibility chain in FTree in \(O(\log{N})\) time as well.

The pseudocode for construction of Fenwick Tree:

LSB(i):
1. Return (i & ~i)

Build-FTree(array of frequency F):
1. let FTree[] be an array, FTree[0] = 0.
2. For each i from 1 to n = F.length
    2.1 Set FTree[i] = sum of F[k] for k from (i - LSB(i) + 1) to i
3. Return FTree

Find-CF(FTree, i):
1. Set cumfreq = 0
2. While i != 0:
    2.1 Set cumfreq = cumfreqFTree[i]
    2.2 Set i = i - LSB(i)
3. Retrun cumfreq

Update-FTree(FTree, i, val):
1. While i not greater than n
    1.1 Set FTree[i] = FTree[i] + val
    1.2 Set i = i + LSB(i)

The implementation is very sweet and simple :D



vector<int> ftree(11,0);
int maxval;

int lsb(int i) 
{
 return i & (-i);
}

void update(int i, int val)
{
 while(i <= maxval)
 {
  ftree[i] += val;
  i = i + lsb(i);
 }
}

int cumfreq(int i)
{
 int tot = 0;
 while( i > 0)
 {
  tot += ftree[i];
  i = i - lsb(i);
 }
 return tot;
}