Thanks for your great work and code!
Regarding the training data, did you randomly crop 512*512 patches and then extract the textual descriptions from those 512*512 crops?
If that's the case, do all the training images need to be preprocessed and saved in advance?
Approximately how many 512*512 training samples are there?