Reproduce the Potsdam results #39
Comments
For the record, I'm seeing exactly the same problem -- I can replicate the STEGO results with the already-trained model, but when I train it myself I get a lower accuracy for the cluster probe than the paper reports.
I attempted to train on cocostuff to get a successful training run and see what the graphs looked like (#23 (comment)). Even with this, though, I could not successfully tune the Potsdam hyperparameters. I then turned to a Bayesian hyperparameter optimizer, SigOpt, and ran it for about 100 trials, tuning the various positive and negative hyperparameters while optimizing only cluster mIoU. Strictly, I should have had it optimize linear accuracy/mIoU and cluster accuracy/mIoU all together, but for simplicity I chose just cluster mIoU. It came up with these hyperparameter values for the Potsdam dataset:

Parameters:

Unfortunately, even with this, I still could not replicate the Potsdam results listed in the paper:

At this point, I think something more fundamental is broken in STEGO related to Potsdam, perhaps a bug in the dataset code or elsewhere.
Thanks for replicating this @BradNeuberg; this might be related to the specifics of your distributed training setup. How many workers do you use, and are you using the same batch size? These models were trained on a single GPU, so this might have affected training.
I am using Google Cloud, with an n1-standard-8 machine type (8 CPU cores) and a V100 GPU. Since I have 8 CPU cores, I could potentially set num_workers to 8; however, I consistently get out-of-memory errors at about epoch 22 if I do, so I've set num_workers to 1, which gets rid of them. My batch size is 32. I'm using only a single machine and a single GPU for training.
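The trade-off described above, more loader workers for throughput versus host-memory pressure from each worker's prefetched batches, can be captured in a small heuristic. This is a minimal stdlib sketch; the function name and the cap are illustrative, not anything from the STEGO repo.

```python
import os
from typing import Optional

def pick_num_workers(requested: Optional[int] = None, cap: int = 4) -> int:
    """Choose a DataLoader worker count (illustrative heuristic, not STEGO's).

    Each extra worker prefetches its own batches on the host, so on a
    RAM-limited box (e.g. an n1-standard-8) fewer workers can avoid host
    OOM even though more CPU cores are available.
    """
    cores = os.cpu_count() or 1
    if requested is not None:
        # Never exceed the physical core count, and allow 0 (main-process loading).
        return max(0, min(requested, cores))
    return min(cap, cores)
```

With batch size 32, each additional worker buffers whole batches of full-resolution crops, which is consistent with 8 workers exhausting host memory around epoch 22 while 1 worker trains stably, just more slowly.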
Hi @BradNeuberg, could you show an example of how you used the Bayesian hyperparameter optimizer SigOpt to tune the hyperparameters of the STEGO model?
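The general shape of such a tuning run is a suggest/observe loop over a search space. Here is a minimal, self-contained sketch: the parameter names and ranges are assumptions (stand-ins for STEGO's positive/negative loss weights), `train_and_eval` is a placeholder for an actual training run returning cluster mIoU, and plain random sampling stands in for SigOpt's Bayesian optimizer so the loop is runnable.

```python
import random

# Assumed search space; the exact parameter names and bounds are illustrative,
# not the values SigOpt actually explored.
SEARCH_SPACE = {
    "pos_intra_weight": (0.1, 1.0),
    "pos_inter_weight": (0.1, 1.0),
    "neg_inter_weight": (0.1, 1.0),
}

def train_and_eval(params):
    """Placeholder: train STEGO with `params` and return cluster mIoU.

    Stand-in objective so the loop executes; replace with a real training
    run plus eval_segmentation.py.
    """
    return sum(params.values()) / len(params)

def tune(budget=100, seed=0):
    """Suggest/observe loop in the shape of a SigOpt experiment, with
    random sampling in place of its Bayesian suggestions."""
    rng = random.Random(seed)
    best_params, best_miou = None, float("-inf")
    for _ in range(budget):
        params = {k: rng.uniform(lo, hi) for k, (lo, hi) in SEARCH_SPACE.items()}
        miou = train_and_eval(params)
        if miou > best_miou:
            best_params, best_miou = params, miou
    return best_params, best_miou
```

A real SigOpt run replaces the random draw with the service's suggested assignments and reports each observed mIoU back to it, but the structure of the ~100-trial loop is the same.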
How can the Potsdam replication problem be resolved?
@mhamilton723, could you share the hyperparameters for Potsdam?
Hi folks, |
@Cemm23333 you can find them here: https://arxiv.org/abs/2304.07314 |
Could you help me reproduce the results on the Potsdam dataset? I trained STEGO with the same configuration used for `potsdam_test.ckpt` and then evaluated the model using `eval_segmentation.py`, but the clustering Accuracy and IoU are low. Using `potsdam_test.ckpt`, I got:

but, using my checkpoint, I got:

The results with the linear probe look good, but not the ones with the cluster probe. Could you help me figure out what could make the difference?

Here is the configuration I used to train STEGO:
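One thing worth noting about the cluster-probe numbers: unlike the linear probe, the cluster probe's Accuracy/mIoU depend on first matching unsupervised cluster IDs to ground-truth classes. A minimal stdlib sketch of that matching over a confusion matrix follows; the repo's evaluation uses a Hungarian-style assignment for this, and brute force over permutations (fine for Potsdam's small class count) is shown here as an equivalent stand-in.

```python
from itertools import permutations

def best_cluster_accuracy(confusion):
    """Given confusion[c][k] = pixels of true class c assigned to cluster k,
    find the one-to-one cluster-to-class matching that maximizes accuracy.

    Brute force over permutations; a Hungarian assignment gives the same
    result in polynomial time for larger class counts.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    best = 0
    for perm in permutations(range(n)):
        hits = sum(confusion[c][perm[c]] for c in range(n))
        best = max(best, hits)
    return best / total
```

A low cluster score alongside a good linear probe usually means the learned clusters do not align one-to-one with the label set, rather than that the features themselves are poor, which is consistent with the pattern reported above.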