Softmax Regression can be regarded as a feedforward layer with a softmax activation function. It is extremely popular in the deep learning community for multi-class classification, and it is a direct generalization of Logistic Regression. The output for each sample is a probability distribution \(p(y|x)\) over the labels \(y\).
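To make this concrete, here is a minimal NumPy sketch of the standard softmax function (the function name and example scores are illustrative, not part of the layer code below):

```python
import numpy as np

def softmax(z):
    # Subtract the row-wise max before exponentiating for numerical stability;
    # this does not change the result because softmax is shift-invariant.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# One sample with raw scores for three classes.
scores = np.array([[2.0, 1.0, 0.1]])
probs = softmax(scores)
# Each row of `probs` is non-negative and sums to 1, so it can be read as p(y|x).
```

Note that the row with the largest raw score also gets the largest probability, which is why predicting with `argmax` over the outputs (as in `predict` below) is equivalent to picking the most likely label.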

We usually train a Softmax Regression layer by minimizing the categorical cross-entropy cost, which measures the cross entropy between the predicted output distribution and the target distribution.
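A minimal NumPy sketch of this cost, assuming one-hot target vectors (the function name and the small epsilon guard are illustrative choices, not taken from the original code):

```python
import numpy as np

def categorical_cross_entropy(probs, targets):
    # probs:   (number of cases, number of classes), rows sum to 1
    # targets: (number of cases, number of classes), one-hot rows
    eps = 1e-12  # guard against log(0)
    return -np.mean(np.sum(targets * np.log(probs + eps), axis=1))
```

A perfectly confident correct prediction gives a cost near 0, while a uniform prediction over two classes gives \(-\log(0.5) \approx 0.693\), so lower cost means the predicted distribution is closer to the target.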

class SoftmaxLayer(Layer):
    """Softmax Layer"""
    def __init__(self, **kwargs):
        super(SoftmaxLayer, self).__init__(**kwargs)

    def apply(self, X):
        return nnfuns.softmax(self.apply_lin(X))

    def predict(self, X_out):
        """Predict labels

        Parameters
        ----------
        X_out : matrix
            layer outputs, the size is (number of cases, number of classes)

        Returns
        -------
        Y_pred : vector
            predicted labels, the size is (number of cases)
        """

        return T.argmax(X_out, axis=1)

    def error(self, X_out, Y):
        """Mis-classification error

        Parameters
        ----------
        X_out : matrix
            layer outputs, the size is (number of cases, number of classes)
        Y : vector
            correct labels, the size is (number of cases)

        Returns
        -------
        error : scalar
            fraction of cases where the predicted label differs from the true label.
        """

        return T.mean(T.neq(self.predict(X_out), Y))
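Since `predict` and `error` build symbolic Theano expressions, their behavior is easiest to see in a plain NumPy mirror (a sketch of the same logic, not the layer's actual code path):

```python
import numpy as np

def predict(X_out):
    # Mirrors T.argmax(X_out, axis=1): pick the most likely class per case.
    return np.argmax(X_out, axis=1)

def error(X_out, Y):
    # Mirrors T.mean(T.neq(...)): the fraction of mis-classified cases.
    return np.mean(predict(X_out) != Y)

probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6]])
labels = np.array([0, 2])
# Both cases are classified correctly here, so error(probs, labels) is 0.0.
```

Because `argmax` ignores the actual probability values, `error` only reports the mis-classification rate; the categorical cross-entropy cost above is what drives training.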