Why are biases excluded from the l2_regularization term?
Issue #133
new
Why is l2_regularization = tf.reduce_sum(tf.square(W_fc))
and not l2_regularization = tf.reduce_sum(tf.square(W_fc)) + tf.reduce_sum(tf.square(b_fc))
here?
Comments (2)
reporter
My sense from the literature, tutorials, etc. is that biases are usually excluded from regularization.
https://stats.stackexchange.com/questions/153605/no-regularisation-term-for-bias-unit-in-neural-network
On a related note, I'm planning to add another flag for the type of regularization (L1 or L2).
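To make the difference concrete, here is a minimal sketch in plain Python (not TensorFlow, so it runs without dependencies). The helper name `penalty`, the `reg_type` argument, and the sample values are assumptions made up for illustration; they are not the project's actual code or flag name. It computes the penalty on the weights only and, for comparison, with the biases added in.

```python
def penalty(weights, reg_type="l2"):
    """Regularization penalty over a 2-D weight matrix given as nested lists.

    reg_type is a hypothetical switch: "l2" sums squares, "l1" sums absolute values.
    """
    flat = [w for row in weights for w in row]
    if reg_type == "l2":
        return sum(w * w for w in flat)
    if reg_type == "l1":
        return sum(abs(w) for w in flat)
    raise ValueError("reg_type must be 'l1' or 'l2'")


# Made-up values standing in for W_fc and b_fc.
W_fc = [[0.5, -1.0], [2.0, 0.0]]
b_fc = [10.0, -10.0]

# Weights only, as in the current l2_regularization term.
l2_weights_only = penalty(W_fc, "l2")
l1_weights_only = penalty(W_fc, "l1")

# The alternative from the question: biases included in the L2 term.
l2_with_bias = l2_weights_only + sum(b * b for b in b_fc)

print(l2_weights_only, l1_weights_only, l2_with_bias)
```

With these values the biases dominate the second form of the penalty, so the optimizer would be pushed to shrink them even though they only shift the output. In TensorFlow, the L1 counterpart of the existing term would presumably be tf.reduce_sum(tf.abs(W_fc)).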