Hi, thanks for the great work on this project!
I'm currently training the denoising diffusion network. During training, the model reports two losses: base loss and mask loss. I’ve noticed that while the base loss steadily decreases, the mask loss seems to increase over time. Is this behavior expected?