Sunday, June 2, 2019

Merging heaps in O(log N)?

Can you merge two heaps in O(log N)? Turns out if they are leftist heaps, you can!
But what is a leftist heap, and how is it different from binary heap?

Recall that binary heap is both complete and balanced. Hence it looks something like this:
               1
             /   \
            2     3
           / \   /
          5   6 7
             Fig 1
It's clear that it supports O(log N) operations.

In contrast, this is what leftist heap looks like:
               1
              / \
             3   2
            /   /
           5   7
          /
         6
             Fig 2
It's neither complete nor balanced. Yet, it supports O(log N) operations too!

How does it work?


Idea behind Leftist Heap


Each node on a leftist heap has a 'rank'. It is defined as the shortest distance to its empty child. Hence in the example tree above (Fig 2), the rank of node '1' is 2, while the rank of node '3' is 1. By definition, the rank of an empty (NULL) node is 0.

Each node maintains the following property:
1. value of current node is smaller than those of its children (the usual min-heap property)
2. the rank of its left child is greater than or equal to the rank of its right child (the leftist property!)

The leftist property has a consequence: the rank of a node is = 1 + the rank of its right child. Do you see it?

So, how does the leftist property help us support O(log N) insertion and deletion operations? Introducing, the "merge" operation!

Merging Leftist Heap


As it turns out, we can merge 2 leftist heap efficiently:
1. First we compare the min values of the leftist heaps. Let's call the one with a smaller min value as A, and the other as B.
2. We recursively merge B with the right child of A, call the result M.
3. Then we attach M as the new right child of A.
4. Wait, what if M has a larger rank than the left child of A? Well, we simply swap them! Hence, we maintain leftist property of heap A as such.

Here is an illustration:

               1         MERGE WITH           4
              / \                            / \
             3   2                          7   5
            /   /                          /
           5   7                          12
          /
         6
              A                               B

Here we merge B to the right child of A. Let's focus on those:

          2    MERGE WITH         4
         /                       / \
        7                       7   5
                               /
                              12

In this case, we merge heap '4' with right child of '2'. Since the right child is
empty, we simply attach '4' to it:

                2
               / \
              7   4
                 / \
                7   5
               /
              12

But since the left child of '2' has a rank of 1 while its right child has a rank of 2, we must swap the children:
                2
               / \
              4   7
             / \
            7   5
           /
          12
               M
Next we attach this to the right child of the original heap A:
               1 
              / \ 
             3   2
            /   / \
           5   4   7
          /   / \
         6   7   5
            /
           12
Finally, since the left child of '1' has rank 1 while its right child has rank 2, we must do a final swap and we get the result of the merge as:

                1 
              /   \ 
             2     3
            / \   /
           4  7  5 
          / \   /
         7  5  6
        /
       12

The time complexity of merge operation is O(rank of heap). (Here, rank of heap is the rank of its root node)

But, what is the upper bound of the rank, in relation to the size of the tree? It turns out, for a leftist heap with rank R, the number of nodes in the heap is at least \(2^R-1\). This means \(2^R-1\) <= N or R <= log (N+1), or R = O(log N).

Proof: By induction on S(R) = minimum size of leftist heap with rank R.
Base case: S(1). The smallest possible such leftist heap is a leaf node, hence S(1) = 1 = \(2^1 -1\).
Inductive case: Say \(S(k) = 2^k-1\) for all k <= R. What is S(R+1)?
Consider the right subtree of the root. This is a leftist heap with rank R. Hence smallest possible size for this subtree is S(R) = \(2^R-1\).
How about the left subtree of the root? It can have rank >= R. The smallest possible size is therefore S(R) = \(2^R-1\) when the rank = R.
Hence in total, S(R+1) must be \((2^R-1) + (2^R-1) + 1 = 2^{R+1}-1\), proving the inductive case.

Hence time complexity of merge operation is O(log N).


Insertion and Deletion on Leftist Heap


Insertion can be done as merging the existing heap with a new heap with 1 node. Deletion of the root of a heap is done by merge its left and right children.

Hence these operations are O(log N)!

Here is the full implementation:

#include <cstdio>
#include <vector>
#include <cstdlib>
#include <queue>
#include <cassert>
using namespace std;

// NOTE: This is an implementation of leftist heap for educational purposes
// only. It does not deal with any memory management, and it pays 0 respect
// for software eng practices. Read and use at your own risk.

// Leftist heap node definition.
struct Node {
  int value;
  int rank;
  Node *left;
  Node *right;
};

// Private functions.
Node *create_leaf_node(int value) {
  return new Node{value, 1, nullptr, nullptr};
}

int get_rank(Node *node) {
  if (!node) return 0;
  return node->rank;
}

// Creates a new leftist heap.
Node *create_node(int value, Node *heap_a, Node *heap_b) {
  if (get_rank(heap_a) >= get_rank(heap_b)) {
    return new Node{value, get_rank(heap_b)+1, heap_a, heap_b};
  }
  return new Node{value, get_rank(heap_a)+1, heap_b, heap_a};
}

// Merges leftist heap 'first' with 'second' and returns the pointer to the
// merged heap. Destroys both 'first' and 'second'.
Node *merge(Node *first, Node *second) {
  if (!first) {
    return second;
  }
  if (!second) {
    return first;
  }
  if (first->value > second->value) {
    return merge(second, first);
  }
  return create_node(first->value, first->left, merge(first->right, second));
}

// Debug interface.
void print(Node *heap) {
  if (!heap) {
    printf("E");
    return;
  }
  printf("(%d[%d]", heap->value, heap->rank);
  print(heap->left);
  print(heap->right);
  printf(")");
}

// Public interfaces.
Node *push(Node *heap, int value) {
  return merge(heap, create_leaf_node(value));
}

int top(Node *heap) {
  return heap->value;
}

Node *pop(Node *heap) {
  return merge(heap->left, heap->right);
}

int main() {
  int N = 500000;
  Node *heap = nullptr;
  std::priority_queue<int> pq;
  for (int i = 0; i < N; ++i) {
    int v = rand() % N;
    pq.push(-v);
    heap = push(heap, v);
  }
  while (N-- > 0) {
    assert(top(heap) == -pq.top());
    heap = pop(heap);
    pq.pop();
  }
  return 0;
}

15 comments:

  1. Thanks for the blog!
    How about a blog on Segment Tree questions that involve storing a BST in each node. Explanation with an example is really appreciated. :)

    ReplyDelete
  2. Thank you for your post. This is excellent information. It is amazing and wonderful to visit your site.
    best event management in chennai
    event management services in chennai

    ReplyDelete
  3. "Great explanation! The concept of merging heaps in O(log N) has always been tricky, but this makes it easier to grasp."
    Warehouse Storage rack Delhi
    mezzanine floor

    ReplyDelete
  4. "Can you elaborate on the scenarios where merging heaps is particularly beneficial in real-world applications?"
    mobile compactor in delhi
    fifo flow rack delhi

    ReplyDelete
  5. "This post clarified the process of merging heaps, but could you add a practical code snippet for better understanding?"
    heavy duty rack manufacturer delhi
    Multi tier rack

    ReplyDelete
  6. "I appreciate the clarity in your explanation. How does this approach compare to other heap operations in terms of efficiency?"
    Fabric Roll racks Supplier
    Warehouse mezzanine floor manufacturer

    ReplyDelete
  7. "Is there a specific type of heap (binary, Fibonacci, etc.) where merging in O(log N) is most efficient?"
    Slotted angle rack manufacturer
    Modular Mezzanine floor manufacturer

    ReplyDelete
  8. "How does merging heaps in O(log N) help in optimizing algorithms like Dijkstra's or Prim's?"
    pallet rack supplier
    Industrial Storage Rack hyderabad

    ReplyDelete
  9. "Well-written article! Could you include edge cases to consider while implementing this method?"
    Pallet Rack delhi
    Heavy Duty Rack Delhi

    ReplyDelete
  10. "This was insightful! Does merging heaps in O(log N) require any specific preprocessing of the heaps?"
    warehouse storage rack supplier
    Shrink Packing machine India

    ReplyDelete
  11. "Your explanation is clear, but could you dive deeper into the mathematical proof behind the O(log N) complexity?"
    Shrink wrapping machine delhi
    Box Wrapping machine manufacturer

    ReplyDelete
  12. "The use of diagrams or animations could make the merging process even clearer for visual learners."
    Strapping machine manufacturer
    Invest in Brands

    ReplyDelete
  13. "This article is very informative! How does the performance of this method vary with extremely large datasets?"
    Dust Collector
    Warehouse Storage rack Supplier

    ReplyDelete