I'm working through the TensorFlow tutorial found here:
Basically, my input is a [28x28] matrix (image) that I flatten to a [1x784] vector.
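To be concrete, the flattening step I mean is roughly the following. This is a minimal sketch with NumPy; the random array is only a stand-in for a real image and is not the tutorial's data.

```python
import numpy as np

# Hypothetical stand-in for one 28x28 grayscale image with integer pixels in 0..255.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)

# Flatten the 28x28 matrix into a single row of 784 values.
flat = image.reshape(1, -1)
print(flat.shape)  # (1, 784)
```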
The tutorial then says:
We scale these values to a range of 0 to 1 before feeding to the neural network model. For this, cast the datatype of the image components from an integer to a float, and divide by 255.
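As I understand it, that scaling step amounts to something like the sketch below. The three pixel values are made up for illustration, and the tutorial's own code may differ.

```python
import numpy as np

# Three made-up pixel intensities stored as unsigned 8-bit integers.
pixels = np.array([[0, 51, 255]], dtype=np.uint8)

# Cast to float first, then divide by the maximum possible intensity (255).
scaled = pixels.astype(np.float32) / 255.0

print(scaled.dtype)                # float32
print(scaled.min(), scaled.max())  # smallest and largest value, now within [0, 1]
```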
My question is: why do we need to normalize in this case? My understanding is that normalization is needed when features are on different scales, because otherwise the output of the model is distorted. But here every pixel value ranges from 0 to 255, so all features are already on the same scale.
I went ahead and ran it both ways: with normalization I get an accuracy of over 85%, whereas without normalization my accuracy falls to 10%.