[Some questions about implementation] #4
Hi, I'm Junmo Cho.
I've read the paper, which was very interesting. Sorry to take up your time, but while running the code I ran into a few questions.
- Does the minus sign on the `binary_cross_entropy` between `img` and `pred_img` come from modeling the reward distribution as a Bernoulli? My understanding is that each pixel y of `img` (the ground-truth target, with values 0 or 1) is scored under Ber(y | pi) = pi^y * (1 - pi)^(1 - y), where pi is the corresponding pixel of `pred_img`. Please correct me if my understanding is wrong.
- Why do we divide both `logprobs` and the reward by `steps` (the length of the GFN generation sequence, 16 here) when computing the TB loss? I thought `logprobs` is already the log of the product of P_F(s_i | s_{i-1}) from i=1 to n, as in the paper.
- Also, why is there no backward-policy term in the TB loss? Are we assuming a uniform backward policy and absorbing it into logZ?
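For concreteness, here is a minimal sketch of how I currently read these three points. The function names, the per-step division, and the treatment of the backward policy are my own guesses about what the code is doing, not the repo's actual implementation:

```python
import math

def bernoulli_log_likelihood(img, pred_img):
    """My reading of question 1: summing log Ber(y | pi) over pixels,
    i.e. y*log(pi) + (1-y)*log(1-pi), which is exactly the negative
    binary cross entropy (hence the minus sign on the BCE)."""
    return sum(y * math.log(p) + (1 - y) * math.log(1 - p)
               for y, p in zip(img, pred_img))

def tb_loss(log_z, logprobs_f, log_reward, steps):
    """My reading of questions 2 and 3 (a guess, not the repo's code):
    - `logprobs_f` is the summed log P_F over the trajectory, and both it
      and the log-reward get divided by `steps` (16), which rescales the
      objective per step.
    - there is no explicit log P_B term; if P_B is uniform its summed
      log-prob would be a constant that could be absorbed into log_z."""
    return (log_z + logprobs_f / steps - log_reward / steps) ** 2

# Tiny worked example with made-up numbers:
log_r = bernoulli_log_likelihood([1, 0, 1], [0.9, 0.2, 0.7])
loss = tb_loss(log_z=0.5, logprobs_f=-4.0, log_reward=log_r, steps=16)
```

Is this roughly what the implementation intends, or am I misreading one of the terms?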
I would be grateful for any answers. Thanks!