Hi,
Thanks for your code. It is very well structured.
Are you sure that the first three equations in the backpropagation (237-239) are correct? I think the weight derivatives (dSdwW) should be computed using the derivatives of NetOutputAct and CECSquashAct.
Moreover, for the NetOutputAct derivative, did you forget to use the derivative of the softmax?
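To make the softmax point concrete, here is a minimal sketch of what I mean. All function and variable names below are mine, not taken from your code, and I am only guessing at the two possible error functions. It shows that the softmax Jacobian is needed for a squared-error loss, whereas for a cross-entropy loss the gradient with respect to the pre-activations collapses to `y - t`, which might be why the derivative does not appear explicitly.

```python
import math

def softmax(z):
    # Subtract the max before exponentiating for numerical stability.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_jacobian(y):
    # J[i][j] = dy_i/dz_j = y_i * (delta_ij - y_j)
    n = len(y)
    return [[y[i] * ((1.0 if i == j else 0.0) - y[j]) for j in range(n)]
            for i in range(n)]

def grad_wrt_logits(y, dL_dy):
    # Chain rule: dL/dz_j = sum_i dL/dy_i * dy_i/dz_j
    J = softmax_jacobian(y)
    n = len(y)
    return [sum(dL_dy[i] * J[i][j] for i in range(n)) for j in range(n)]

# Hypothetical pre-activations and one-hot target, for illustration only.
z = [1.0, 2.0, 0.5]
t = [0.0, 1.0, 0.0]
y = softmax(z)

# Squared error, L = 0.5 * sum((y - t)^2): the Jacobian does not cancel.
grad_mse = grad_wrt_logits(y, [yi - ti for yi, ti in zip(y, t)])

# Cross-entropy, L = -sum(t * log(y)): the result equals y - t.
grad_ce = grad_wrt_logits(y, [-ti / yi for yi, ti in zip(y, t)])
```

So if the error is cross-entropy, using `y - t` directly would be fine; otherwise I would expect the Jacobian term to show up in the NetOutputAct derivative.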
Best
Tessa