Matrix Operations: Softmax Implementation
Hello, I get a different result when I compute the softmax of a matrix according to the definition at https://en.wikipedia.org/wiki/Softmax_function.

I wrote my own softmax function in Python, and for the example that you provided I get:
import numpy as np

A = [[1.0, 2.0, 3.0],
     [4.0, 1.0, 2.0],
     [3.0, 4.0, 1.0]]

def softmax(x, axis=-1):
    # Subtract the max along the axis for numerical stability
    y = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return y / np.sum(y, axis=axis, keepdims=True)

softmax(np.array(A))
array([[ 0.09003057,  0.24472847,  0.66524096],
       [ 0.84379473,  0.04201007,  0.1141952 ],
       [ 0.25949646,  0.70538451,  0.03511903]])
Here is the same example using Blaze:
#include <iostream>
#include <blaze/Math.h>

int main()
{
   blaze::StaticMatrix<double, 3UL, 3UL> A{ { 1.0, 2.0, 3.0 }
                                          , { 4.0, 1.0, 2.0 }
                                          , { 3.0, 4.0, 1.0 } };

   blaze::StaticMatrix<double, 3UL, 3UL> B;
   B = blaze::softmax(A);

   std::cout << B << "\n";

   return 0;
}
// ( 0.0157764  0.0428847  0.116573  )
// ( 0.316878   0.0157764  0.0428847 )
// ( 0.116573   0.316878   0.0157764 )
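Note that these nine entries together sum to 1, i.e. Blaze normalizes over the whole matrix rather than per row. A quick check (appended to the program above):

std::cout << blaze::sum( B ) << "\n";  // prints 1: the total over all nine entries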
This matches the result shown at https://bitbucket.org/blaze-lib/blaze/wiki/Matrix%20Operations#!softmax

I therefore also checked Tensorflow's result:
import tensorflow as tf
import numpy as np

a = tf.constant(np.array([[1.0, 2.0, 3.0],
                          [4.0, 1.0, 2.0],
                          [3.0, 4.0, 1.0]]))

with tf.Session() as s:
    print(s.run(tf.nn.softmax(a)))
[[ 0.09003057  0.24472847  0.66524096]
 [ 0.84379473  0.04201007  0.1141952 ]
 [ 0.25949646  0.70538451  0.03511903]]
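The two outputs differ only in the normalization. For the first row (1, 2, 3), the exponentials are roughly (2.718, 7.389, 20.086), which sum to about 30.19; dividing by 30.19 gives (0.0900, 0.2447, 0.6652), the NumPy/Tensorflow values. Dividing instead by the sum of all nine exponentials (about 172.3) gives (0.0158, 0.0429, 0.1166), the Blaze values.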
I think that for a matrix the denominator should be a vector (one sum per row), not a single scalar as in the vector overload:
template< typename VT    // Type of the vector
        , bool TF >      // Transpose flag
VT softmax( const blaze::Vector<VT,TF>& v )
{
   VT tmp( exp( ~v ) );               // element-wise exponential
   const auto scalar( sum( ~tmp ) );  // single scalar sum
   tmp /= scalar;
   return tmp;
}
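For what it's worth, applying this vector overload to a single row does produce the expected values (the expected output in the comment is taken from the Tensorflow result above):

blaze::StaticVector<double, 3UL, blaze::rowVector> v{ 1.0, 2.0, 3.0 };
std::cout << blaze::softmax( v ) << "\n";  // 0.0900306  0.244728  0.665241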
Would you please explain what I am missing?
Comments (3)

-

Hello @taless474 , it looks like Blaze is computing softmax across all elements of its matrix argument, while your code and Tensorflow compute it row-wise.

From blaze/math/dense/DenseMatrix.h:
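(The quoted overload presumably mirrors the vector overload shown above, normalizing by a single scalar sum over all elements; a sketch along those lines, not the verbatim Blaze source:)

template< typename MT    // Type of the dense matrix
        , bool SO >      // Storage order
auto softmax( const DenseMatrix<MT,SO>& dm )
{
   auto tmp( evaluate( exp( ~dm ) ) );
   const auto scalar( sum( ~tmp ) );  // one scalar sum over ALL elements
   tmp /= scalar;
   return tmp;
}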
- changed status to wontfix
Hi Bita!
Thanks for raising this issue. @mkotlyar is correct, the Blaze softmax() function computes the result from all matrix elements. Thus the sum of all elements of the resulting matrix is 1, as defined in the first paragraph at Wikipedia. From my point of view, Wikipedia doesn't seem to provide a formal definition of how to deal with matrices or higher-dimensional data structures. Unfortunately, there is also no example of a softmax() evaluation on a matrix.

The Blaze implementation tries to provide you with all options: you can compute softmax() from all matrix elements (i.e. just call softmax() on the matrix), or compute softmax() row- or column-wise (i.e. call softmax() on each row or column of the matrix). For instance:

StaticMatrix<double,3UL,3UL> A{ { 1.0, 2.0, 3.0 }
                              , { 4.0, 1.0, 2.0 }
                              , { 3.0, 4.0, 1.0 } };

StaticMatrix<double,3UL,3UL> B;
B = softmax( A );  // Computing softmax for the complete matrix, sum( B ) == 1

StaticMatrix<double,3UL,3UL> C;
for( size_t i=0UL; i<3UL; ++i ) {
   row( C, i ) = softmax( row( A, i ) );  // Computing softmax row-wise, sum( C ) == 3
}

StaticMatrix<double,3UL,3UL> D;
for( size_t i=0UL; i<3UL; ++i ) {
   column( D, i ) = softmax( column( A, i ) );  // Computing softmax column-wise, sum( D ) == 3
}
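Printing the row-wise result C should reproduce the Tensorflow/NumPy values from above (expected values shown as comments):

std::cout << C << "\n";
// ( 0.0900306  0.244728   0.665241  )
// ( 0.843795   0.0420101  0.114195  )
// ( 0.259496   0.705385   0.0351190 )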
However, I agree that Blaze should provide a more convenient way to get the same results as Tensorflow. Thus we will extend Blaze with an overload for softmax(), which allows computing a row- or column-wise softmax() (see issue #218). Until we have finished the implementation, you can use the following softmax() overload:

namespace blaze {

template< bool RF        // Reduction flag
        , typename MT    // Type of the dense matrix
        , bool SO >      // Storage order
auto softmax( const DenseMatrix<MT,SO>& dm )
{
   auto tmp( evaluate( exp( ~dm ) ) );

   if( RF == rowwise ) {
      for( size_t i=0UL; i<tmp.rows(); ++i ) {
         auto r = row( tmp, i, unchecked );
         const auto scalar( sum( r ) );
         r /= scalar;
      }
   }
   else {
      for( size_t j=0UL; j<tmp.columns(); ++j ) {
         auto c = column( tmp, j, unchecked );
         const auto scalar( sum( c ) );
         c /= scalar;
      }
   }

   return tmp;
}

} // namespace blaze
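A remark on this workaround: the unchecked flag passed to row() and column() skips the runtime bounds checks when creating the views, which is safe here because the loop indices are bounded by tmp.rows() and tmp.columns().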
Example:

StaticMatrix<double, 3UL, 3UL> B;
B = softmax<rowwise>( A );  // row-wise softmax of the matrix A from above; each row of B sums to 1
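The column-wise variant should work analogously (assuming the blaze::columnwise flag, which the overload above falls back to for RF != rowwise):

StaticMatrix<double, 3UL, 3UL> E;
E = softmax<columnwise>( A );  // each column of E sums to 1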
Please note, though, that this implementation hasn't been tested yet! I hope this helps,
Best regards,
Klaus!
-

reporter: Thank you for your comprehensive response. I will follow issue #218 and use the code you provided.

Bita