Thursday 2 September 2021

TensorFlow issue with softmax

I have a TensorFlow multiclass classifier that produces nan or inf when computing probabilities with tf.nn.softmax. See the following snippet (logits has shape batch_size x 6, since I have 6 classes and the output is one-hot encoded; batch_size is 1024).

logits = tf.debugging.check_numerics(logits, message='bad logits', name=None)
probabilities = tf.nn.softmax(logits=logits, name='Softmax')
probabilities = tf.debugging.check_numerics(probabilities, message='bad probabilities', name=None)

The classifier fails on the last statement because check_numerics finds nan or inf in probabilities. The logits are clean; otherwise the first statement would have failed.

From what I have read about tf.nn.softmax, it can handle very large and very small values in logits. I verified this in interactive mode:

>>> with tf.Session() as s:
...   a = tf.constant([[1000, 10], [-100, -200], [3, 4.0]])
...   sm = tf.nn.softmax(logits=a, name='Softmax')
...   print(a.eval())
...   print(sm.eval())
...
[[1000.   10.]
 [-100. -200.]
 [   3.    4.]]
[[1.         0.        ]
 [1.         0.        ]
 [0.26894143 0.7310586 ]]
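
The session output above is consistent with the usual max-subtraction trick for softmax. The following is a minimal pure-Python sketch of that trick, not TensorFlow's actual kernel. The names naive_softmax and stable_softmax are hypothetical and used only for illustration, and the arithmetic is float64 rather than the float32 used in the graph.

```python
import math


def naive_softmax(row):
    # Exponentiate directly; math.exp raises OverflowError for large inputs.
    exps = [math.exp(x) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def stable_softmax(row):
    # Subtract the row maximum first, so the largest exponent is exp(0) = 1.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


# Same rows as in the interactive session above.
rows = [[1000.0, 10.0], [-100.0, -200.0], [3.0, 4.0]]
for row in rows:
    print(stable_softmax(row))

try:
    naive_softmax([1000.0, 10.0])
except OverflowError as err:
    print("naive softmax overflowed:", err)
```

With the shift applied, finite logits cannot overflow the exponentials, and the denominator is at least 1 because the maximum entry always contributes exp(0).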

I then tried clipping the values in logits, and the whole thing now works. See the modified snippet below.

logits = tf.debugging.check_numerics(logits, message='logits', name=None)
safe_logits = tf.clip_by_value(logits, -15.0, 15.0)
probabilities = tf.nn.softmax(logits=safe_logits, name='Softmax')
probabilities = tf.debugging.check_numerics(probabilities, message='bad probabilities', name=None)

In the second statement, I clip the values in logits to the range [-15, 15], and that somehow prevents nan/inf in the softmax computation. So I was able to fix the issue at hand.
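
To make the effect of the clip concrete, here is a small arithmetic sketch. It assumes 6 classes and logits clipped to [-c, c]. The smallest probability softmax can then produce occurs when one logit is -c and the other five are +c, which gives 1 / (1 + 5 * exp(2c)). The helper min_clipped_probability is hypothetical, not part of TensorFlow, and the calculation is float64, so it only approximates the float32 graph.

```python
import math


def min_clipped_probability(clip, num_classes=6):
    # Worst case: one logit sits at -clip while all other classes sit at +clip.
    return 1.0 / (1.0 + (num_classes - 1) * math.exp(2.0 * clip))


for clip in (15.0, 20.0):
    p = min_clipped_probability(clip)
    print(f"clip=+/-{clip:g}: smallest probability ~ {p:.3e}, log ~ {math.log(p):.2f}")
```

Both bounds are far above the smallest positive normal float32 value (about 1.2e-38), so this bound alone does not show where the nan/inf comes from.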

However, I still don't understand why this clipping works. (I should mention that clipping to [-20, 20] does not work: the model still fails with nan or inf in probabilities.)

Could someone help me understand why this is the case?

I am using TensorFlow 1.15.0, running on a 64-bit instance.


