## Wednesday, October 22, 2014

### SPOJ - MSTRING

Problem Statement (the problem statement is quite ambiguous):
SPOJ - MSTRING

Summary:
Given two strings S and V, find the shortest subsequence of S that is not a subsequence of V.
( |S|, |V| <= 1000 )

Solution:
This is an interesting question in a sense that usually problems will ask for the longest subsequence. Here we are asked to find the shortest uncommon subsequence, the direct opposite of that, and surprisingly dynamic programming can also solve the problem equally well :D but first we need a few tricks.

We establish the relationship between the sub problems and the problem at hand. Let L(i, j) be the length of the shortest subsequence that is in S[1..i] that is not in V[1..j]. Here we have the following relationship:
L(i, j) =
1. If letter S[i] is nowhere to be found in V[1..j], then L(i,j) = 1.
2. Otherwise, we have two case:
2.1. S[i] is in the shortest subsequence. We find k, the most immediate index in V[1..j] where V[k] == S[i]. Then L(i, j) = 1 + L(i-1, k-1).
2.2 S[i] is not in the shortest subsequence. Then L(i, j) = L(i-1, j).
We pick whichever is lower.

Direct implementation of the above will lead to a $$O(N^3)$$ algorithm, which most likely be too slow. Therefore, we also need another table to store the index k for each cases of i and j, therefore we can push down the complexity of the problem to $$O(N^2)$$.

Implementation:

string s, v;
int N, M;
int next[1003][1003], dp[1003][1003];
int main(){
cin >> s >> v;
N = s.size();
M = v.size();
for(int i=0;i<N;++i){
int prev = -1;
for(int j=0;j<M;++j){
if(s[i] == v[j]) prev = j;
next[i+1][j+1] = prev;
}
}
for(int i=1;i<=N;++i){
dp[i][0] = 1;
}
for(int i=0;i<=M;++i){
dp[0][i] = INF;
}
for(int i=1;i<=N;++i){
for(int j=1;j<=M;++j){
if(next[i][j] == -1) dp[i][j] = 1;
else {
dp[i][j] = min(1+dp[i-1][next[i][j]], dp[i-1][j]);
}
}
}
printf("%d\n", dp[N][M]);
return 0;
}