hello!! I trained the model with batchsize of 2048 to epoch_13_13999.bin, and then changed batchsize to 1024 for training epoch14, and the accuracy decreased from 56% to 55%. c #253

leo23ui · 2024-12-12T03:46:32Z

Hello! I trained the model with a batch size of 2048 up to the checkpoint epoch_13_13999.bin, then changed the batch size to 1024 to train epoch 14, and the accuracy dropped from 56% to 55%. Could you please give me some advice on how to solve this problem? Thanks! After changing the batch size, I also modified the following code.

Modify the code in train.py as follows:

#if num_feed_images >= all_num_feed_images:
#    break
if batch_count >= num_batches_per_epoch:
    break

Change the code in main.py as follows:

#start_iter = checkpoint['iter_in_epoch'] + 1
start_iter = checkpoint['iter_in_epoch']*2 + 1
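
For reference, here is a minimal sketch of converting a resume position between batch sizes by counting samples rather than iterations. It is not code from the repository: `rescale_start_iter` is a hypothetical helper, and it assumes that `iter_in_epoch` is the zero-based index of the last completed iteration (which the original `+ 1` suggests but does not confirm). Under that assumption, the checkpoint at iteration 13999 has completed 14000 iterations, and the matching start position at half the batch size is 28000, one more than `13999*2 + 1` gives.

```python
def rescale_start_iter(iter_in_epoch: int, old_batch_size: int, new_batch_size: int) -> int:
    """Hypothetical helper: map a resume position from one batch size to another.

    Assumes iter_in_epoch is the zero-based index of the last completed
    iteration at old_batch_size (an assumption, not confirmed by the source).
    """
    completed_iters = iter_in_epoch + 1
    samples_seen = completed_iters * old_batch_size
    if samples_seen % new_batch_size != 0:
        # The old position does not fall on a batch boundary at the new size.
        raise ValueError("samples seen is not a multiple of the new batch size")
    # First iteration (zero-based) that has not been processed yet.
    return samples_seen // new_batch_size


start_iter = rescale_start_iter(13999, old_batch_size=2048, new_batch_size=1024)
print(start_iter)  # 28000
```

Counting in samples keeps the data position consistent regardless of how the batch size changes, as long as the data order itself does not depend on the batch size.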

wkcn · 2024-12-14T00:10:40Z

It may be related to the optimizer state (the first/second moment in AdamW).
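
One possible reading of this suggestion, sketched as a toy model rather than the repository's training code: Adam-style optimizers keep a running average of squared gradients, and the scale of that average depends on mini-batch noise. The function `second_moment_ema` below is hypothetical and assumes a zero-mean true gradient with unit-variance per-sample noise, so the mini-batch gradient has variance `1 / batch_size`. Both runs use the same noise seed so that only the batch size differs.

```python
import math
import random


def second_moment_ema(batch_size, steps=5000, beta2=0.999, seed=0):
    """Toy model of the second-moment estimate kept by Adam-style optimizers.

    Assumption: each mini-batch gradient is drawn from N(0, 1 / batch_size),
    i.e. zero true gradient and unit-variance per-sample noise.
    """
    rng = random.Random(seed)
    sigma = 1.0 / math.sqrt(batch_size)
    v = 0.0
    for _ in range(steps):
        g = rng.gauss(0.0, sigma)               # noisy mini-batch gradient
        v = beta2 * v + (1.0 - beta2) * g * g   # EMA of squared gradients
    return v / (1.0 - beta2 ** steps)           # bias correction, as in Adam


v_2048 = second_moment_ema(2048)
v_1024 = second_moment_ema(1024)
print(v_1024 / v_2048)  # close to 2: halving the batch doubles the noise variance
```

In this toy setting, a second-moment estimate accumulated at batch size 2048 is about half of what batch size 1024 would produce, so an optimizer state restored from the checkpoint would be mismatched with the new gradients until the running averages adapt. Whether this explains the accuracy drop reported above is not established by the thread.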
