Thank you for sharing your excellent work on "AdaIR: Adaptive All-in-One Image Restoration via Frequency Mining and Modulation". I have a question regarding the evaluation process described in the paper.
I tested the provided models (adair3d.ckpt and adair5d.ckpt) and noticed that the PSNR and SSIM results I obtained differ significantly from the values in the paper's table. For example, when evaluating the pretrained models on the Rain100L/SOTS datasets, I found a notable gap between my results and those reported in the paper. (I used the calculate functions in BasicSR for evaluation.)
For 3 tasks:
For 5 tasks:
For the released test results, I observed that the Rain100L test images appear to have their edges cropped by one pixel. Specifically, the original image size is (481, 321), but the test image is (480, 320). Even after aligning the images to account for this discrepancy, the evaluation metrics are still approximately 0.14 lower than those reported.
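For reference, the alignment step can be sketched roughly as follows. This is a simplified NumPy illustration, not the BasicSR code itself: the helper names `crop_to_common` and `psnr` are hypothetical, cropping from the top-left corner is an assumption (the paper does not state which edge is dropped), and PSNR here is computed over all RGB channels with an 8-bit range.

```python
import numpy as np


def crop_to_common(a, b):
    """Crop two H x W x C arrays to their shared top-left region.

    Assumption: the extra row/column sits at the bottom/right edge.
    """
    h = min(a.shape[0], b.shape[0])
    w = min(a.shape[1], b.shape[1])
    return a[:h, :w], b[:h, :w]


def psnr(a, b, max_val=255.0):
    """Plain PSNR over all pixels and channels, for 8-bit images by default."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    mse = np.mean((a - b) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)


# Synthetic stand-ins for a 481 x 321 ground truth and a 480 x 320 output.
gt = np.full((321, 481, 3), 10, dtype=np.uint8)
out = np.full((320, 480, 3), 11, dtype=np.uint8)

gt_c, out_c = crop_to_common(gt, out)
print(gt_c.shape, out_c.shape)
print(psnr(gt_c, out_c))
```

If the reported numbers were computed differently (for example on the Y channel only, or with a border crop), that alone could explain a gap of this size, which is why I am asking about the exact protocol.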
How exactly are the PSNR and SSIM values computed for the 3-task and 5-task settings?
Are there any specific pre-processing or evaluation details (e.g., cropping strategy, alignment procedures) that might affect the final metrics?
Thank you very much for your time and assistance!