TensorflowJs comes with some built-in functions that allow you to generate random values. For example:
import * as tf from "@tensorflow/tfjs"
// Creating a tensor with 5 values sampled
// from a uniform distribution
const randomValues = tf.randomUniform([5])
// Printing the tensor
randomValues.print()
// will output something like this
// [0.0003758, 0.1491586, 0.2266536, 0.0614096, 0.1920560]
Actually, TensorflowJs has a full utility package for random values.
But why? Why would we need to generate these random values?
Well, because these random values will later become the weights between the neurons of a network.
But still ... why? Why do we need to start with random weights?
For example, why can't we just start with the weights set to zero?
Yeah ... all weights set to zero will not be an option.
The total input of a neuron is calculated as the weighted sum of all the individual inputs:
const totalInput = sumAll(weight[i] * input[i])
If every weight is zero, the total input is always zero, no matter what the inputs are. The network then stays in a constant state of zero inputs for all its neurons, and it does not learn.
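To make the weighted sum concrete, here is a minimal plain JavaScript sketch. The function name `totalInput` and the sample values are made up for illustration and are not part of TensorflowJs. It shows that zero weights erase every difference between inputs.

```javascript
// Weighted sum of all the individual inputs of a single neuron.
function totalInput(weights, inputs) {
  return inputs.reduce((sum, input, i) => sum + weights[i] * input, 0);
}

// Two very different (made-up) input samples.
const sampleA = [0.5, -1.2, 3.0];
const sampleB = [10, 20, 30];

// With all weights at zero, both samples collapse to the same total input.
const zeroWeights = [0, 0, 0];
const zeroA = totalInput(zeroWeights, sampleA);
const zeroB = totalInput(zeroWeights, sampleB);
console.log(zeroA, zeroB);

// With non-zero weights, the samples can be told apart again.
const someWeights = [0.2, 0.4, 0.1];
console.log(totalInput(someWeights, sampleA), totalInput(someWeights, sampleB));
```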
Something similar will happen for any constant initialization value. Every neuron in the network will compute the same output, which results in the same weight/parameter update. Doing this will defeat the purpose of having multiple neurons.
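The symmetry problem can be seen in a tiny hand-written network. The sketch below assumes 2 inputs, 2 hidden sigmoid neurons, 1 linear output and a squared error loss; the function name `hiddenGradients` and all the numbers are illustrative assumptions. It computes the gradient for each hidden neuron's weights, once with constant weights and once with distinct ones.

```javascript
const sigmoid = (z) => 1 / (1 + Math.exp(-z));

// Gradients of the loss 0.5 * (prediction - target)^2 with respect to
// the weights of each hidden neuron (one array of gradients per neuron).
function hiddenGradients(hiddenWeights, outputWeights, inputs, target) {
  // Forward pass: output of every hidden neuron, then the prediction.
  const hiddenOut = hiddenWeights.map((w) =>
    sigmoid(w[0] * inputs[0] + w[1] * inputs[1])
  );
  const prediction = hiddenOut.reduce(
    (sum, h, j) => sum + outputWeights[j] * h,
    0
  );
  const error = prediction - target;

  // Backward pass: chain rule through the output weight and the sigmoid.
  return hiddenWeights.map((_, j) => {
    const delta = error * outputWeights[j] * hiddenOut[j] * (1 - hiddenOut[j]);
    return inputs.map((x) => delta * x);
  });
}

const inputs = [1.0, 2.0];

// Every weight set to the same constant: both neurons get identical updates.
const constantGrads = hiddenGradients([[0.5, 0.5], [0.5, 0.5]], [0.5, 0.5], inputs, 1);

// Distinct starting weights: each neuron gets its own update.
const distinctGrads = hiddenGradients([[0.1, 0.7], [0.4, 0.2]], [0.3, 0.9], inputs, 1);

console.log(constantGrads);
console.log(distinctGrads);
```

With the constant weights, the two gradient arrays are identical, so the two neurons stay copies of each other after every update.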
The reason why we want random initialization values for the TensorflowJs weights is to break the symmetry: neurons that start with different weights receive different updates, so each one can learn something different.
Change determines learning. If a neural network keeps its state constant during all the epochs, it does not change, and therefore it does not learn.
The process of training a neural network so that it makes “reasonable” predictions involves tweaking the weights of the neurons multiple times so that the rate of success becomes better and better.
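As a rough illustration of "tweaking the weights multiple times", the sketch below trains a single weight with plain gradient descent. The target rule `y = 2 * x`, the learning rate and the number of epochs are all assumptions chosen for the example, and a real network has many more weights.

```javascript
// Made-up training data following the rule y = 2 * x.
const samples = [[1, 2], [2, 4], [3, 6]];

// Mean squared error of the one-weight model "prediction = w * x".
const meanSquaredError = (w) =>
  samples.reduce((sum, [x, y]) => sum + (w * x - y) ** 2, 0) / samples.length;

let weight = Math.random(); // random starting weight in [0, 1)
const learningRate = 0.05;
const initialError = meanSquaredError(weight);

for (let epoch = 0; epoch < 100; epoch++) {
  // Derivative of the mean squared error with respect to the weight.
  const gradient =
    samples.reduce((sum, [x, y]) => sum + 2 * (weight * x - y) * x, 0) /
    samples.length;
  // Tweak the weight a little in the direction that lowers the error.
  weight -= learningRate * gradient;
}

const finalError = meanSquaredError(weight);
console.log(initialError, finalError, weight);
```

Each pass nudges the weight toward 2, and the error shrinks along the way.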
What are good initialization values for weights in neural networks?
The weights need to have initial values. If they are not a constant value, how should you choose them?
A good starting point is to initialize the weights with small random numbers drawn from a uniform distribution, somewhere between 0 and 1. Hence the need for the random functions in TensorflowJs.
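As a rough sketch of what such an initialization does, the plain JavaScript helper below draws each weight from a uniform distribution over a given range using `Math.random`. The name `uniformWeights` and its parameters are hypothetical and not part of TensorflowJs; in TensorflowJs itself, `tf.randomUniform` accepts a shape plus optional minimum and maximum bounds.

```javascript
// Returns `count` weights sampled uniformly from [min, max).
// Hypothetical helper for illustration only.
function uniformWeights(count, min = 0, max = 1) {
  return Array.from({ length: count }, () => min + Math.random() * (max - min));
}

// Five starting weights between 0 and 1.
const initialWeights = uniformWeights(5);
console.log(initialWeights);

// A narrower range, if even smaller starting values are wanted.
const smallerWeights = uniformWeights(5, 0, 0.1);
console.log(smallerWeights);
```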
The reason why the values need to be uniform (close to each other, but not equal) is to make sure that each input has a chance to be meaningful. In the weighted sum of the inputs, if the weight of one input is far too big, that input makes the other inputs not matter; if it is far too small, that input will not matter at all.
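A quick way to see this is to measure how much each input contributes to the weighted sum. The sketch below uses made-up weights and inputs, and the helper name `contributionShares` is an assumption for the example.

```javascript
// Fraction of the total weighted sum that comes from each input.
// Assumes all products are positive so the shares add up to 1.
function contributionShares(weights, inputs) {
  const contributions = inputs.map((input, i) => weights[i] * input);
  const total = contributions.reduce((sum, c) => sum + c, 0);
  return contributions.map((c) => c / total);
}

const sameInputs = [1, 1, 1];

// Weights close to each other: every input has a say.
const balancedShares = contributionShares([0.3, 0.5, 0.4], sameInputs);

// One weight far too big: the second input drowns out the others.
const skewedShares = contributionShares([0.3, 500, 0.4], sameInputs);

console.log(balancedShares);
console.log(skewedShares);
```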
Also, when the weights are initialized with large numbers, the weighted sum grows large as well. An activation function such as the sigmoid then maps that value to almost exactly 1, where the curve is nearly flat. The gradient there is close to zero, so gradient descent changes the weights very slowly and the learning process suffers greatly.
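The slowdown can be checked numerically. The sketch below compares the slope of the sigmoid for a weighted sum produced by small weights and by large ones; the inputs and weights are made up for the example.

```javascript
const sigmoid = (z) => 1 / (1 + Math.exp(-z));

// Derivative of the sigmoid: this factor scales every weight update.
const sigmoidSlope = (z) => sigmoid(z) * (1 - sigmoid(z));

const weightedSum = (weights, inputs) =>
  inputs.reduce((sum, input, i) => sum + weights[i] * input, 0);

const inputs = [1, 2, 3];

// Small weights keep the sum in the region where the sigmoid is steep.
const smallSum = weightedSum([0.1, 0.2, 0.1], inputs);
// Large weights push the sum far to the right, where the sigmoid is flat.
const largeSum = weightedSum([2, 3, 2], inputs);

const slopeSmall = sigmoidSlope(smallSum);
const slopeLarge = sigmoidSlope(largeSum);

console.log(smallSum, sigmoid(smallSum), slopeSmall);
console.log(largeSum, sigmoid(largeSum), slopeLarge);
```

The slope for the large weights is several orders of magnitude smaller, which is why each training step barely moves them.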