## Monday, June 6, 2016

### Data Structure: Fenwick Tree and Range Update

Something led me to read up on Fenwick Tree (a.k.a Binary Indexed Tree). There are lots of great tutorials that explain what this data structure is, and how it is useful. A google search is all you need. I would like to write up some short discussion on the tree, and also about a technique that is used to allow this tree to support range updates.

1. What's the motivation behind this data structure?
Being the friend of Segment Tree, Fenwick Tree is designed to answer ranged queries efficiently. Historically it is used mostly for calculating cumulative frequencies, or in other words, the sum of values in range [i, j]. It actually can only handle queries in the form [1, j], but in the case of cumulative frequency, one can first query [1, j], and then query [1, i-1], and subtract the result for a desired effect.

2. What's it look like?
Using the array A = [1, 4, -3, 5, 6, 1, 3, -1, 0, 2], a Fenwick tree looks like this:

i       1  2   3  4  5  6  7   8  9 10
A       1, 4, -3, 5, 6, 1, 3, -1, 0, 2
fenwick 1, 5, -3, 7, 6, 7, 3, 16, 0, 2

2D representation:
16
+---------+----+----+
7         |    |
+----+----+         |    |
5    |              7    |              2
+----+    |         +----+    |         +----+
1    4   -3    5    6    1    3   -1    0    2
0001 0010 0011 0100 0101 0110 0111 1000 1001 1010


3. Where is the tree?
It was weird that it is kind of difficult for me to think about Fenwick Tree as a tree, despite its name. After some contemplation, I think the parent-child relationship of a Fenwick Tree can be described as follows:
1) A node with value v with binary representation A1B where B is made up of zero bits (i.e. the 1 in A1B denotes the rightmost bit 1 in v) is the parent of A01000[...]0, A01100[...]0, A01110[...]0, and so on until A01111[...]1. Hence the parent of node v can be computed as v + (v & -v). So more concretely, v is a child of v + (v & -v).
2) If v = A1B (where B is zero bits), then that node covers the sum from A0B+1 to A1B. This is the motivating construction of fenwick tree, it allows us to compute a range query in O(log n) time (where log n is the length of the bits representation of v). To see this, suppose that v = A1B1C where B and C are zero bits (hence the 1s in v represents the two rightmost 1 bits of v). Then we know that A1B1C covers the value from [A1B0C+1] to A1B1C. All we need to find is now the total sum from 1 to A1B0C(where B0C is all 0 bits, which is hence a recursive operation! ). By 'peeling' off the rightmost 1 bit one by one, we can compute the total sum from 1 to v. This peeling operation is represented as v - (v & -v) (we can also see this as jumping to the left sibling of node v).

4. Code for Range query and Update:
Range query [1,v] is hence supported as follows:
rmq(v):
while (v) {
rmq += fenwick[v]; // or rmq = min(fenwick[v], rmq) for RMQ
v -= v & -v;
}
On the other hand, update with incremental value val is represented as follow:
update(v, val):
while (v <= length) {
fenwick[tree] += val; // or fenwick[v] = min(fenwick[v], val).
v += v & v;
}
Notice that we only support 'incremental changes', like adding/subtracting some delta, or by performing monotonic minimisation/maximisation.

5. Range update?
Notice also that the update procedure above only handle one entry at a time. As it turns out, there is a way to provide a range update for cumulative frequency by manipulation 2 Fenwick trees at a time.
Firstly, suppose that we have two arrays mult and offset. When we perform a range update on [3, 5] with value +2 for example, we set the values in mult as follows:
mult  : [ 0  0  2  2  2  0  0]
While offset is set as follows:
offset: [ 0  0  4  4  4 -6 -6]
Of course if we have performed that range update on our cumulative array, it will look like this:
sum   : [ 0  0  2  4  6  6  6]
How does that help? When we want to compute the value of the cumulative frequence at index i after the update, we calculate sum[i] =  i * mult[i] - offset. Hence for example:
at i = 3, sum[3] = 3 * 2 - 4 = 2.
at i = 5, sum[5] = 5 * 2 - 4 = 6.
at i = 6, sum[6] = 6 * 0 - (-6) = 6.
Hence we can recover the value of sum[i] from both mult[i] and offset[i]. Now, if both mult and offset are fenwick trees, we can perform the updates as follows: mult.update(3, 2) and then mult.update(6, -2). And we can perform the following to offset: offset.update(3, 4) and offset.update(6, -10). Then sum[i] can be computed as mult.rmq(i) * i - offset.rmq(i).

More generally, we have:
update(left, right, val) {
mult.update(left, val);
mult.update(right+1, -val);
offset.update(left, (left-1)*val);
offset.update(right+1, -right*val);
}

rmq(i) {
return mult.rmq(i) * i - offset.rmq(i)
}