Understanding question - what value to take of the estimator while evaluating? #18

First, thanks for this great work and implementation. I want to use it in my own work.
I have a basic question about the implementation.
Assume I have fixed embeddings (size 512) with many samples (about 2 million).
 
I saw in the examples that the MI values change throughout the optimization, and moreover that the values have high variance but extremely good MSE.

As I understand it, I will use all 2 million samples to train the CLUB estimator. When is the best time to read off the MI estimate? Is it best to monitor the loss until it stops changing, or is there another measure you would suggest? And what portion of the 2 million samples would you use to evaluate the true MI: all of them? Would you then take the MSE over all the samples?

Second question, regarding the architecture of the hidden layer and the network: do you have any suggestions for the case where I have two variables with 512 dimensions each?

The last question is about the robustness of the optimizer. Let's assume I change the two vectors over time, optimizing them for a different task, and want to measure the MI again afterwards. Would you re-initialize the optimizer to measure the MI for the modified vectors, or reuse the last one that was trained?
Thanks!
  
