Well, today I spent a significant amount of time teaching my Mac to distinguish 0 from 1. The motivation was mostly to check my own understanding of logistic regression.
The logistic regression model for two classes can be summarised as follows:
Let \(P(Y=1 | X=x) = h_w(x)\), where \(h_w(x) = g(w^Tx)\) and \(g(z) = \frac{1}{1+e^{-z}}\). The function \(g(z)\) is called the sigmoid function; it maps any real value to the interval between 0 and 1, which makes it a natural choice for modelling a probability, and it has several other properties (smoothness, a simple derivative) that are convenient for this purpose. Since we are assuming only two classes, we naturally have \(P(Y=0|X=x) = 1-h_w(x)\).
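The model above is only a couple of lines of code. Here is a minimal sketch in Python with NumPy (the weight and feature values are made up for illustration):

```python
import numpy as np

def sigmoid(z):
    """The sigmoid g(z) = 1 / (1 + e^{-z}), mapping reals to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def h(w, x):
    """The hypothesis h_w(x) = g(w^T x), i.e. P(Y=1 | X=x) under the model."""
    return sigmoid(np.dot(w, x))

# Example: with w = [1, -2] and x = [0.5, 0.25], w^T x = 0,
# so the model is maximally uncertain: P(Y=1 | X=x) = 0.5.
w = np.array([1.0, -2.0])
x = np.array([0.5, 0.25])
p = h(w, x)
```

And \(P(Y=0|X=x)\) is simply `1 - h(w, x)`.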