Loss plateaus after 200 epochs #11

Thank you for your interesting work. I tried to use your method on my custom dataset. The loss drops from 2.2 to 0.2 in fewer than 200 epochs, then refuses to go down further. Did you encounter this problem? What can I do to overcome it?

Comments
Since I'm not sure which dataset you're using, I can't tell whether there is a problem with the training itself or whether the difficulty of the upstream task on your data is what keeps the loss from decreasing. Perhaps you could also try the original MAE to see whether its loss decreases on your dataset.
I trained on the RGB images from the SUN RGB-D dataset, which has over 5,000 training images. I tried the original MAE and the result is the same. It seems the training process is stuck in a local minimum. How many epochs did you train for, and is there anything special about your learning rate scheduler?
The situation you describe, where the model gets stuck in a local optimum, is indeed possible, but I can't offer specific advice on how to address it. Typically I would try different loss functions, optimizers, and so on, but there is no guarantee that resolves the problem. Another possibility is that the upstream task is inherently difficult on your dataset, so the loss simply stops decreasing. The configuration I used can be found directly in the open-source project code; I didn't use any special settings.
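For reference, the original MAE recipe uses AdamW with a linear warmup followed by half-cycle cosine decay; since the configuration here is said to match the open-source code, something along these lines is likely what was used. Below is a minimal PyTorch sketch of that schedule. The hyperparameter values (base_lr, warmup_epochs, total_epochs) are illustrative assumptions, not values confirmed in this thread:

```python
import math

# Illustrative values, not taken from this repository.
base_lr = 1.5e-4
warmup_epochs = 40
total_epochs = 400

def adjust_learning_rate(optimizer, epoch):
    """Linear warmup for warmup_epochs, then half-cycle cosine decay."""
    if epoch < warmup_epochs:
        lr = base_lr * epoch / warmup_epochs
    else:
        progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
        lr = base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
    for param_group in optimizer.param_groups:
        param_group["lr"] = lr
    return lr

# Typical pairing, following the MAE paper's settings:
# optimizer = torch.optim.AdamW(model.parameters(), lr=base_lr,
#                               betas=(0.9, 0.95), weight_decay=0.05)
```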
How small do you expect your loss to be for a good reconstruction?
The final loss obtained by Swin MAE is not exactly the same across datasets, but for the two datasets I tried it was roughly between 0.002 and 0.003. You can see the loss curves for each experiment in the paper.
Now I know what the problem is. I see that you did not use normalization or the RandomResizedCrop transform as in the original MAE. When I don't use normalization, the loss gets close to the values you report. Do you have any comment on the effect of those transformations?
As mentioned here: if your goal is to reconstruct a good-looking image, use unnormalized pixels; if your goal is to fine-tune for a downstream recognition task, use normalized pixels. Did you fine-tune the downstream task using normalized or unnormalized pixels?
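For context, the quoted advice refers to the norm_pix_loss option in the original MAE, where each target patch is normalized by its own mean and variance before the MSE is computed, which changes the scale of the reported loss. Here is a minimal sketch of that loss, assuming MAE-style (batch, num_patches, patch_dim) tensors; the function name and shapes are illustrative, not this repository's API:

```python
import torch

def mae_reconstruction_loss(pred, target, mask, norm_pix_loss=True):
    """MSE over masked patches only.

    pred, target: (B, num_patches, patch_dim) patchified pixels
    mask:         (B, num_patches), 1 = masked patch to reconstruct
    """
    if norm_pix_loss:
        # Normalize each target patch by its own statistics, as in MAE.
        mean = target.mean(dim=-1, keepdim=True)
        var = target.var(dim=-1, keepdim=True)
        target = (target - mean) / (var + 1e-6).sqrt()
    loss = (pred - target) ** 2
    loss = loss.mean(dim=-1)                 # per-patch mean squared error
    return (loss * mask).sum() / mask.sum()  # average over masked patches
```

With norm_pix_loss off, the loss is plain pixel MSE, which is why unnormalized targets tend to yield the small absolute values (around 0.002 to 0.003) mentioned above.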
MAE does not rely on data augmentation as much as contrastive learning does, and I believe RandomResizedCrop destroys the integrity of medical images. Therefore, RandomResizedCrop is not used in the experiments.
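To make the difference concrete, here is a torchvision sketch contrasting the original MAE pretraining augmentation with a plain resize pipeline like the one described above. The image size and normalization statistics are illustrative assumptions, not values taken from this repository:

```python
from torchvision import transforms

# Original MAE pretraining augmentation (for reference):
mae_transform = transforms.Compose([
    transforms.RandomResizedCrop(
        224, scale=(0.2, 1.0),
        interpolation=transforms.InterpolationMode.BICUBIC),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],   # ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])

# A plain resize pipeline, matching the stated choice of skipping
# RandomResizedCrop; normalization is also omitted here, since the
# thread notes the reported losses were obtained on unnormalized pixels.
plain_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
```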