Gaussian GRBM initialization #71
Conversation
@jquetzalcoatl IIRC, Hinton's recommendation pertains to zero-one-valued RBMs (bipartite with hidden units). Would it make sense to translate the
@kevinchern The REM reference is for spin models, i.e., {-1, 1} variables. Ultimately, the initialization pertains to whether the model is ergodic; in this sense, the support only sets an energy offset. I believe the main motivation for initializing with 0.01 in Hinton's guide is to start in a paramagnetic phase, which ties in nicely with the REM/SK spin-glass models.
Co-authored-by: Kevin Chern <32395608+kevinchern@users.noreply.github.com>
Added release note.
Tests are failing, but otherwise LGTM. Thanks for the much-needed PR, @jquetzalcoatl!!
@VolodyaCO offered to take a look at the tests
Any updates on this?
The reason for this test failing is very strange. Essentially, it is making sure that the DVAE `forward` (which does encode -> latent to discrete -> decode) matches the explicit sequence encode -> `latent_to_discrete` -> decode, i.e., this is a pretty simple unit test:

```python
expected_latents = self.encoders[n_latent_dims](self.data)
expected_discretes = self.dvaes[n_latent_dims].latent_to_discrete(
    expected_latents, n_samples
)
expected_reconstructed_x = self.decoders[n_latent_dims](expected_discretes)
latents, discretes, reconstructed_x = self.dvaes[n_latent_dims].forward(
    x=self.data, n_samples=n_samples
)
assert torch.equal(reconstructed_x, expected_reconstructed_x)
assert torch.equal(discretes, expected_discretes)
assert torch.equal(latents, expected_latents)
```

Moreover, the fixtures are built as

```python
self.encoders = {i: Encoder(i) for i in latent_dims_list}
self.decoders = {i: Decoder(latent_features, input_features) for i in latent_dims_list}
self.dvaes = {i: DVAE(self.encoders[i], self.decoders[i]) for i in latent_dims_list}
```

So even if the encoders/decoders are updated in other tests (because of training), there should be permanent tracking of the encoders/decoders in the dvaes.
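The shared-reference behavior this comment relies on can be illustrated with a minimal sketch (the `Encoder`/`DVAE` classes below are simplified stand-ins, not the project's actual classes): Python dicts store references, not copies, so a DVAE built from `encoders[i]` keeps seeing any later updates to that encoder.

```python
# Simplified stand-in classes to illustrate shared references.
class Encoder:
    def __init__(self):
        self.weight = 0.0  # placeholder for trainable state

class DVAE:
    def __init__(self, encoder):
        self.encoder = encoder  # stores a reference, not a copy

latent_dims_list = (2, 4)
encoders = {i: Encoder() for i in latent_dims_list}
dvaes = {i: DVAE(encoders[i]) for i in latent_dims_list}

encoders[2].weight = 1.5        # "train" the standalone encoder
print(dvaes[2].encoder.weight)  # prints 1.5 -- same object, so the DVAE sees the update
```

This is why the test's `expected_*` path and the `forward` path should agree: both go through the very same encoder and decoder objects.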
Found the issue and fixed it in a PR to @jquetzalcoatl's repo: jquetzalcoatl#1. Please approve, Javi; this will update the current PR and resolve the issue. It took me a while to track down the error!
Fix failing forward method unit tests
VolodyaCO left a comment
I have definitely had to manually change the initialisation of GRBM weights whenever I use the GRBM. Thanks for this PR. I think it looks good to merge.
GRBM weights and biases initialization set to Gaussian N(0, 1/N), where N is the number of nodes.
Hinton's guide suggests 0.01 as the standard deviation; see https://www.cs.toronto.edu/~hinton/absps/guideTR.pdf
Moreover, making the Gaussian depend on the number of nodes keeps the energy extensive and initializes the GRBM in a paramagnetic phase, similar to that described in the Random Energy Model paper:
https://journals.aps.org/prb/abstract/10.1103/PhysRevB.24.2613
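A minimal sketch of this initialization scheme (function and parameter names here are hypothetical, not the PR's actual code), assuming N(0, 1/N) means variance 1/N over the total node count:

```python
import torch

def init_grbm_params(num_visible: int, num_hidden: int):
    """Draw GRBM weights and biases from N(0, 1/N), N = total number of nodes.

    Illustrative sketch only; the actual PR code may differ in naming and shape
    conventions.
    """
    n_nodes = num_visible + num_hidden
    std = (1.0 / n_nodes) ** 0.5  # variance 1/N  ->  std 1/sqrt(N)
    weights = torch.randn(num_visible, num_hidden) * std
    visible_bias = torch.randn(num_visible) * std
    hidden_bias = torch.randn(num_hidden) * std
    return weights, visible_bias, hidden_bias
```

Scaling the standard deviation as 1/sqrt(N) (rather than the fixed 0.01 of Hinton's guide) is what keeps the initial energy extensive as the model grows.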
See #48