## Monday, October 27, 2014

### Follow Up: Codeforces 275 Div. 2 Problem E / Div. 1 Problem C

Problem Statement:
482C - Game of String

Solution:
The most excruciating problem I've done so far in my life.. haha
Anyway, I was not able to come up with a purely $$O(L 2^L)$$ solution, but there is such solutions, and I am still trying to understand the thought process. The one that I managed to get it accepted is actually $$O(NL 2^L)$$ but the N term is very small thanks to __builtin_popcount() function.

While solving the problem, I learnt a new bitmasking technique which I will describe now. Firstly, given a set of questions, we want to know whether there are several strings sharing the same answer to that questions. As such we have to build a table BM of all questions, which has $$2^20$$ entry. Then we check, for each pair of strings, we have a bitmask in which the bit i is set if and only if both string share a common character at position i. The bitmask is the index on table BM. Then we set both bits i and j on BM[bitmask] to indicate that question bitmask is satisfied by i and j.

Then, we observe that if bit i is on in bitmask, then if we turn off that bit, we get bitmask', and consequently, BM[bitmask'] must at least have the corresponding bits ON in BM[bitmask] since less questions can be definitely be satisfied by at least as many strings.

Then from here we need an efficient way to calculate the expectation. Let E(bitmask) be the expected number of questions needed, given that we already have asked questions that is indicated as ON bits in bitmask. Then we have:
E(bitmask) = sum of { ((1-P) * E(bitmask') + P) * prob } for all bitmask'
Where bitmask' is bitmask with an additional ON bit (in other words, with an additional question not yet asked), prob is the probability of choosing that question, and P is the probability that the string that is being guessed is actually determinable with question set bitmask'.

To find P, we need to find D, the initial number of strings satisfying bitmask (which is the number of ON bits in BM[bitmask]), then we find keep track of how many bits in BM[bitmask] (call this C) that is switched to zero bit in BM[bitmask']. P is C divided by D. We can do this by doing some bit manipulations. Here the role of __builtin_popcount() becomes really important to calculate the number of one bits in BM[bitmask] and BM[bitmask'].

Implementation:

#include <iostream>
#include <cstdio>
#include <algorithm>
#include <string>
#include <utility>
using namespace std;

string s[51];
double dp[1050000];
int b[1050000];
long long bm[1050000];
int N, M;

void prep(){
for(int i=0;i<(1<<M);++i)
bm[i] = 0;
for(int i=0;i<N;++i){
for(int j=i+1;j<N;++j){
int cur = 0;
for(int k=0;k<M;++k){
if(s[i][k] == s[j][k]){
cur |= 1<<k;
}
}
bm[cur] |= 1LL<<i;
bm[cur] |= 1LL<<j;
}
}
for(int i=(1<<M)-1;i>=0;--i){
for(int j=0;j<M;++j){
bm[i & (~(1<<j))] |= bm[i];
}
}
}

int main(){
cin >> N;
for(int i=0;i<N;++i){
cin >> s[i];
}
M = s[0].size();
int lim = 1<<M;
prep();
b[0] = 0;
for(int i=0;i<lim;++i){
b[i] = b[i/2] + i%2;
}
double cnt, prob, p, tot;
long long cur;
dp[lim-1] = 0;
for(int i=lim-2;i>=0;--i){
tot = 0;
p = 1 / ((double)M - b[i]);
dp[i] = 0;
tot = (double)__builtin_popcount(bm[i]);
tot += (double)__builtin_popcount(bm[i]>>32);
if(bm[i] == 0){
dp[i] = 0;
continue;
}
for(int j=0;j<M;++j){
if(!(i & (1<<j))){
cnt = 0;
cur = bm[i ^ (1<<j)] | ~bm[i];
cur = ~cur;
cnt = __builtin_popcount(cur);
cnt += __builtin_popcount(cur>>32);
prob = cnt/tot;
dp[i] += ((dp[i ^ (1<<j)] + 1) * (1-prob) + prob) * p;
}
}
}
printf("%.12lf\n", dp[0]);
return 0;
}