a bit of cs: Disjoint Set Union-Find

Showing posts with label Disjoint Set Union-Find. Show all posts

Saturday, July 23, 2016

Codeforces Round #360 (Div. 1) - D. Dividing Kingdom II

Problem Statement:
http://codeforces.com/contest/687/problem/D

Summary:
n < 1000 vertices, m = O(n^2) weighted edges (numbered from 1 to m), and q < 1000 queries (L, R) to consider all edges number L to R, group the endpoints of the edges into two groups 0 or 1, and minimize the weight of the edge with the largest weight having its endpoints in the same group.

Codeforces 441D - Valera and Swaps

Problem Statement:
441D - Valera and Swaps

Solution:
This problem is somewhat tedious in a sense that we need accuracy in swapping indexes and content of arrays.

The idea to solve this problem is from the (might be routine, but not so obvious on first try) observation that the minimum number of swaps required to place every element to its correct position is equal to the total number of elements in the "misplaced chain" minus one. What I mean by the misplaced chain is: Let a[i] be the elements in the sequence, and p[i] be the index of element i in array a[i]. If we start from position cur = i, and trace down all elements cur = p[cur], we will get a chain of misplaced elements. There will be several disjoint chains, and hence we can total up the total minimum number of swaps necessary to restore each element in each chain to its correct position, call this sum S. We can compare S and m, which leads us to three cases:

Codeforces Round #286 (Div. 1) Problem D - Mr. Kitayuta's Colorful Graph

Problem Statement:
506D - Mr. Kitayuta's Colorful Graph

Solution:
The problem can be divided into (although not mutually exclusive) parts, first one is to effectively build the connectivity information in O(N), and the second part is to manage the queries. I am not so sure about the complexity analysis for the query handling part, but the following works.

First part: building the connectivity information using disjoint set union. Here is a simple yet powerful idea I learnt from the submissions made by clever gods who had solved the problem during the contest. We start out with an empty graph G. Then we incrementally add a new edge e. This edge has a color c and has two end points u and v. Before we add the edge and connect those vertices, we examine first whether each of u and v belong to any edges with color c. Let's say u indeed has such edge, call one of them e'. Since the addition of e to u will lead to e' and e being connected, we can simply represent this connection using a disjoint set union! So upon addition of e to u, e and e' are in the same set. Notice that we only need one of such edge e', since we have the invariant that every other e' incident to u will already be in the same disjoint set as e'! This is a very powerful observation, and this simplify the construction of the disjoint set union structure to O(N) overall.

Codeforces 455C - Civilization

Problem Statement:
455C - Civilization

Solution:
An nice problem, but my implementation is a bit messy :P

The most important observation that greatly simplify the problem:
If node u has the property such that for every v reachable from u, \(max_{v \text{ reachable from } u} \{ d(u,v) \} \) is minimum, then u must be on the longest path in that connected component. Furthermore, u must be on the middle of that path.

Firstly, why do we want to find such node u? We can think a node with that kind of property as the "midpoint" or "center" of the whole tree, where d(u,v) will be analogous to the "radius" of the tree. So when we have to merge to connected component, all we need to do is to connect the two "centers" of the connected components, and it is guaranteed that the resulting connected component will have the minimum longest path possible.

Codeforces 466E - Information Graph

Problem Statement:
466E - Information Graph

Solution:
The main idea is to build the whole relationship graph before answering the queries of type 3. We can make use of the fact that if we run DFS on the root of a graph and maintain the time information when we first enter and leave a certain node, we can check whether u is a parent of v on the DFS tree in O(1) by comparing the time information. While building the information graph, we also use a disjoint set union structure to maintain information about the parents of each nodes in the connected components currently being built. Finally, a node x receives a packet with index i if and only if x is in between the vertex u that received that packet and the vertex v that was the parent of u when it received the packet. After we build the graph and the disjoint set union structures, we can serve each queries of type 3 in O(1).

a bit of data structure: Disjoint Set Find-Union

Problem Statement:
How do we represent disjoint sets in a manner such that it supports these two functions efficiently:
1. union(A,B): given disjoint sets \(A\) and \(B\), return \(C = A\cup B\)
2. same(u,v): given 2 element \(u\) and \(v\), check whether they are in the same set.

To efficiently support a data structure of disjoint sets, we need a clever construct to represent the disjoint sets. How? Here we'll discuss the standard way of doing it.

Let's say we have \(N\) elements indexed as \(1,2,3,\ldots , N\). Initially, all elements are disjoint to each other. We have a notion of parent, expressed as an array parent[\(1 \ldots N] such that parent[i] points to the parent of element i. At first, every element is a parent to itself since they are disjoint. We call a procedure to initialize this condition as init():

init():
input: none
output: initialize the disjoint set data structure


for i in 1 to N:
   parent[i] = i

The union of 2 elements will result in parent of the first element pointing to the second element. As this procedure proceeds, a forest of trees will result, where each tree is a representation of a set. Now imagine if we want to check if 2 element is indeed inside the same tree, that means we need to traverse up the parent chain of each element until we find an element which parent points to itself (hence the root of the tree) and finally compare whether the two roots are the same. This traversal may take \(O(\log{N})\) each time (the height of the tree). Hence we need to compress this tree such that subsequent check of the root of a tree will result in \(O(1)\) average. Call this procedure find(u) as described below:

find(u):
input: an element
output: the root of the disjoint set it is in


if u != parent[u]:
    parent[u] = find(parent[u])
return parent[u]

Here we check whether \(u\) is root, if so then we return \(u\). Otherwise, we need to update the parent of u to point to the root of the tree, and return it. Hence it updates the parent-child relationship in the tree such that the tree is compressed to a flatter tree.

Finally we can describe the two main functions above in an efficient manner:

union(u,v):
input: 2 elements u,v
output: union of the set containing u and the set containing v

parent[find(u)] = find(v)




same(u,v):
input: 2 elements u,v
output: is u and v in the same set?


return find(u) == find(v)

Here is a minimalist implementation of the data structure:

vector<int> p(100);
vector<int> r(100);
int init() { for (int i=0;i<100;++i) p[i]=i, r[i] = 0; }
int find_set(int i) { return (p[i] == i) ? i : (p[i] = find_set(p[i])); }
bool same_set(int i, int j) { return find_set(i) == find_set(j); }
void union_set(int i, int j) {
 if ( same_set(i,j) ) return;
 int x = find_set(i);
 int y = find_set(j);
 if(r[x] > r[y]) {
  p[y] = x;
 } else {
  p[x] = y;
  if (r[x] == r[y]) ++r[y];
 }
}

a bit of cs

Saturday, July 23, 2016

Codeforces Round #360 (Div. 1) - D. Dividing Kingdom II

Saturday, January 31, 2015

Codeforces 441D - Valera and Swaps

Monday, January 19, 2015

Codeforces Round #286 (Div. 1) Problem D - Mr. Kitayuta's Colorful Graph

Thursday, January 8, 2015

Codeforces 455C - Civilization

Sunday, January 4, 2015

Codeforces 466E - Information Graph

Sunday, June 22, 2014

a bit of data structure: Disjoint Set Find-Union