Loss plateaus after 200 epochs #11

Thank you for your interesting work. I tried to use your method on my custom dataset. The loss drops from 2.2 to 0.2 in fewer than 200 epochs, then refuses to go down further. Did you encounter this problem? What can I do to overcome it?

Comments
Since I'm not sure which dataset you're using, I can't tell whether there is a problem with the training itself or whether the difficulty of the upstream task on your data is what keeps the loss from decreasing. Perhaps you could also try the original MAE to see whether its loss decreases on your dataset.
I trained on the RGB images from the SUN RGB-D dataset, which has over 5,000 training images. I tried the original MAE and the result is the same. It seems the training process is stuck in a local minimum. How many epochs did you train for, and is there anything special about your learning rate scheduler?
The situation you describe, where the model gets stuck in a local optimum, is indeed possible, but I can't offer specific advice on how to address it. Typically I would try different loss functions, optimizers, and so on, but there is no guarantee that resolves the problem. Another possibility is that the upstream task is inherently difficult on your dataset, so the loss simply stops decreasing. The configuration I used can be found directly in the open-source project code; I didn't use any special settings.
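For reference, the original MAE recipe uses AdamW with a linear warmup followed by half-cycle cosine decay; since the configuration here is said to match the open-source code, something along these lines is likely what was used. Below is a minimal PyTorch sketch of that schedule. The hyperparameter values (base_lr, warmup_epochs, total_epochs) are illustrative assumptions, not values confirmed in this thread:

```python
import math

# Illustrative values, not taken from this repository.
base_lr = 1.5e-4
warmup_epochs = 40
total_epochs = 400

def adjust_learning_rate(optimizer, epoch):
    """Linear warmup for warmup_epochs, then half-cycle cosine decay."""
    if epoch < warmup_epochs:
        lr = base_lr * epoch / warmup_epochs
    else:
        progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
        lr = base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
    for param_group in optimizer.param_groups:
        param_group["lr"] = lr
    return lr

# Typical pairing, following the MAE paper's settings:
# optimizer = torch.optim.AdamW(model.parameters(), lr=base_lr,
#                               betas=(0.9, 0.95), weight_decay=0.05)
```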
How small do you expect your loss to be for a good reconstruction?
The final loss obtained by Swin MAE is not exactly the same across datasets, but for the two datasets I tried it was roughly between 0.002 and 0.003. You can see the loss curves for each experiment in the paper.
Now I know what the problem is. I see that you did not use normalization or the RandomResizedCrop transform as in the original MAE. When I don't use normalization, the loss gets close to the values you report. Do you have any comment on the effect of those transformations?
As mentioned here: if your goal is to reconstruct a good-looking image, use unnormalized pixels; if your goal is to fine-tune for a downstream recognition task, use normalized pixels. Did you fine-tune the downstream task using normalized or unnormalized pixels?
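For context, the quoted advice refers to the norm_pix_loss option in the original MAE, where each target patch is normalized by its own mean and variance before the MSE is computed, which changes the scale of the reported loss. Here is a minimal sketch of that loss, assuming MAE-style (batch, num_patches, patch_dim) tensors; the function name and shapes are illustrative, not this repository's API:

```python
import torch

def mae_reconstruction_loss(pred, target, mask, norm_pix_loss=True):
    """MSE over masked patches only.

    pred, target: (B, num_patches, patch_dim) patchified pixels
    mask:         (B, num_patches), 1 = masked patch to reconstruct
    """
    if norm_pix_loss:
        # Normalize each target patch by its own statistics, as in MAE.
        mean = target.mean(dim=-1, keepdim=True)
        var = target.var(dim=-1, keepdim=True)
        target = (target - mean) / (var + 1e-6).sqrt()
    loss = (pred - target) ** 2
    loss = loss.mean(dim=-1)                 # per-patch mean squared error
    return (loss * mask).sum() / mask.sum()  # average over masked patches
```

With norm_pix_loss off, the loss is plain pixel MSE, which is why unnormalized targets tend to yield the small absolute values (around 0.002 to 0.003) mentioned above.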
MAE does not rely on data augmentation as much as contrastive learning does, and I believe RandomResizedCrop destroys the integrity of medical images. Therefore, RandomResizedCrop is not used in the experiments.
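To make the difference concrete, here is a torchvision sketch contrasting the original MAE pretraining augmentation with a plain resize pipeline like the one described above. The image size and normalization statistics are illustrative assumptions, not values taken from this repository:

```python
from torchvision import transforms

# Original MAE pretraining augmentation (for reference):
mae_transform = transforms.Compose([
    transforms.RandomResizedCrop(
        224, scale=(0.2, 1.0),
        interpolation=transforms.InterpolationMode.BICUBIC),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],   # ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])

# A plain resize pipeline, matching the stated choice of skipping
# RandomResizedCrop; normalization is also omitted here, since the
# thread notes the reported losses were obtained on unnormalized pixels.
plain_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
```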