Sunday, June 29, 2014

a bit of graph: Dijkstra's Shortest Path Algorithm

Problem Statement:
Given a undirected graph with non-negative weighted edges, a starting point s, and a destination t, find the path with minimum weight.

Dijkstra's Algorithm:
A greedy algorithm can be devised to solve this problem. Suppose we already have a minimum path using several nodes. Then the intuitive thing to do is to add a node to the last node in our path such that its weight is the lowest. This is basically the intuition behind Dijkstra's algorithm, which resembles Prim's algorithm for Minimum Spanning Tree. However, there are a few important details which ensure the correctness of the algorithm:

Definition:
s: starting point
t: destination
d[i]: array of distances from s
w(u,v): weigh of edge u to v

1. Initially, we have an array of distances d[i] which values are all infinity. This means that we do not know yet whether the node i is reachable from our starting point. However, we do know that s is reachable (since it's our starting point anyway) so d[s] = 0 and d[i] = INF for i != s.

2. For an edge with nodes u and v at its end points, an important operation called relax(u,v) is performed as follows: if d[u] > d[v] + w(u,v) then update d[u] = d[v] + w(u,v) (we can also say that the edge u-v is relaxed).

3. Let S be the set of nodes in which the distance from s has already known (that is, if u in S, then d[u] is guaranteed to be the total weight of the minimum path between s and u). At first, S = {s}. Then for each element in S, we examine its edges and perform the relax(u,v) operation.

4. After we have iterated through all element in S, we choose the node that is not yet inside S which has the smallest d[i], and add it in S (hence the greedy choice!). If the node happens to be t, then we've found d[t] which is the total weight of the minimum path from s (which is how we defined S earlier). If the node is not t, then repeat 3 until we cannot add anymore nodes into S in which case t is not reachable from s.

It remains to prove that in step 4, the node that we added to S indeed has the property that d[i] is the value of its minimum weight path from s, and the proof can be very interesting :D.

The implementation of this algorithm is usually done using a heap data structure such as priority queue to choose the minimum node in step 4, although a normal array can also be used if the number of nodes are not too big. Below is a sample implementation in C++:


typedef pair<int, int> pii;

struct nd{
 int v; //neighboring vertex
 int w; //weight of edge u-v
};

vector<vector<nd> > G;
vector<int> p; //parent attr
vector<int> d; //dist attr

void init(int N){//no of nodes, no of edges
 G = vector<vector<nd> >(N+1);
 p = vector<int>(N+1);
 d = vector<int>(N+1);
 for(int i=0;i<N+1;++i)
  p[i]=-1, d[i]=MAX;
}

void djikstra(int s, int f){ //starting pt
 priority_queue<pii> q;
 d[s]=0;
 p[s]=-1;
 q.push(make_pair(d[s] ,s));
 while(!q.empty()){
  pii cur = q.top();
  q.pop();
  int u = cur.second;
  int nE = G[u].size();
  for(int i=0;i<nE;++i){
   int v = G[u][i].v;
   int w = G[u][i].w;
   if(d[v] > d[u] + w){
    d[v] = d[u]+w;
    p[v] = u;
    q.push(make_pair(-d[v],v));
   }
  }
 }
}