
Details for finetuning with higher resolution. #5

Closed
LSanghyeok opened this issue Nov 6, 2024 · 1 comment

@LSanghyeok
Hi, thank you for the fantastic work!

I'm particularly interested in the detailed configuration behind the higher-resolution fine-tuning results reported in the main table of the paper.
Could you share the exact settings used, in particular the lr, min_lr, the use of the cosine scheduler, and the rest of the training protocol?
Additionally, if possible, could you share the scripts needed to reproduce these results?

I look forward to your response and appreciate any details you can share.

Thanks.

@ysj9909 (Owner) commented Nov 28, 2024

Thank you so much for your interest in our research!
First, I sincerely apologize for the delayed response; things have been busy. Please find the detailed training setup and log below for your reference.

  • res. 512 x 512
    args
    {
    "batch_size": 256,
    "epochs": 30,
    "model": "SHViT-S4",
    "input_size": 512,
    "model_ema": true,
    "model_ema_decay": 0.99996,
    "model_ema_force_cpu": false,
    "opt": "adamw",
    "opt_eps": 1e-08,
    "opt_betas": null,
    "clip_grad": 0.02,
    "clip_mode": "agc",
    "momentum": 0.9,
    "weight_decay": 1e-08,
    "sched": "cosine",
    "lr": 4e-05,
    "lr_noise": null,
    "lr_noise_pct": 0.67,
    "lr_noise_std": 1.0,
    "warmup_lr": 2e-08,
    "min_lr": 2e-07,
    "decay_epochs": 30,
    "warmup_epochs": 5,
    "cooldown_epochs": 10,
    "patience_epochs": 10,
    "decay_rate": 0.1,
    "ThreeAugment": false,
    "color_jitter": 0.4,
    "aa": "rand-m9-mstd0.5-inc1",
    "smoothing": 0.1,
    "train_interpolation": "bicubic",
    "repeated_aug": true,
    "reprob": 0.25,
    "remode": "pixel",
    "recount": 1,
    "resplit": false,
    "mixup": 0.8,
    "cutmix": 1.0,
    "cutmix_minmax": null,
    "mixup_prob": 1.0,
    "mixup_switch_prob": 0.5,
    "mixup_mode": "batch",
    "teacher_model": "regnety_160",
    "teacher_path": "https://dl.fbaipublicfiles.com/deit/regnety_160-a5fe301d.pth",
    "distillation_type": "none",
    "distillation_alpha": 0.5,
    "distillation_tau": 1.0,
    "finetune": "outputs/SHViT/s4-384x384.pth",
    "set_bn_eval": false,
    "data_path": "datasets/imagenet-1k",
    "data_set": "IMNET",
    "inat_category": "name",
    "output_dir": "outputs",
    "device": "cuda",
    "seed": 0,
    "resume": "",
    "start_epoch": 0,
    "eval": false,
    "dist_eval": true,
    "num_workers": 10,
    "pin_mem": true,
    "world_size": 4,
    "dist_url": "env://",
    "save_freq": 1,
    "rank": 0,
    "gpu": 0,
    "distributed": true,
    "dist_backend": "nccl",
    "nb_classes": 1000
    }
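For reference, the args above can be mapped back to a launch command. This is a hypothetical reconstruction, not the exact script: the flag spellings are assumptions (argparse dests use underscores; the CLI flags in DeiT-style repos typically use dashes), and `world_size: 4` implies 4 GPUs.

```shell
# Hypothetical launch command reconstructed from the args JSON above;
# check SHViT's main.py for the exact flag names before running.
python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py \
    --model SHViT-S4 \
    --input-size 512 \
    --finetune outputs/SHViT/s4-384x384.pth \
    --data-path datasets/imagenet-1k \
    --batch-size 256 \
    --epochs 30 \
    --lr 4e-5 --min-lr 2e-7 --warmup-lr 2e-8 --warmup-epochs 5 \
    --weight-decay 1e-8 \
    --clip-grad 0.02 --clip-mode agc \
    --distillation-type none \
    --output_dir outputs
```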

log
{"train_lr": 1.9999999999999825e-08, "train_loss": 3.0021165267740795, "test_loss": 0.9942393930572452, "test_acc1": 80.78200282287598, "test_acc5": 95.52200252990723, "epoch": 0, "n_parameters": 16588484}
{"train_lr": 1.9999999999999825e-08, "train_loss": 2.9967277955522924, "test_loss": 1.165703083529617, "test_acc1": 80.66200249145508, "test_acc5": 95.39800263977051, "epoch": 1, "n_parameters": 16588484}
{"train_lr": 8.015999999999748e-06, "train_loss": 2.9077237610050815, "test_loss": 1.0041656787648345, "test_acc1": 81.44000264648437, "test_acc5": 95.78600263122559, "epoch": 2, "n_parameters": 16588484}
{"train_lr": 1.60119999999996e-05, "train_loss": 2.8780302244315235, "test_loss": 1.0053500862735691, "test_acc1": 81.56800271606446, "test_acc5": 95.79600253356934, "epoch": 3, "n_parameters": 16588484}
{"train_lr": 2.4007999999999662e-05, "train_loss": 2.8711931062973948, "test_loss": 0.9884235037095619, "test_acc1": 81.65000243835449, "test_acc5": 95.8120026763916, "epoch": 4, "n_parameters": 16588484}
{"train_lr": 3.200400000000007e-05, "train_loss": 2.855403150562093, "test_loss": 1.0687555201125867, "test_acc1": 81.55600258666992, "test_acc5": 95.75200251464844, "epoch": 5, "n_parameters": 16588484}
{"train_lr": 3.7333905535309283e-05, "train_loss": 2.8514407510332447, "test_loss": 1.0417560077074803, "test_acc1": 81.62800270812988, "test_acc5": 95.82800245544433, "epoch": 6, "n_parameters": 16588484}
{"train_lr": 3.61994381880623e-05, "train_loss": 2.8451906923862764, "test_loss": 1.079871901960084, "test_acc1": 81.49000242919922, "test_acc5": 95.71800255371093, "epoch": 7, "n_parameters": 16588484}
{"train_lr": 3.488858202699983e-05, "train_loss": 2.83527819277476, "test_loss": 0.9795589370257927, "test_acc1": 81.70600269592285, "test_acc5": 95.87000263732911, "epoch": 8, "n_parameters": 16588484}
{"train_lr": 3.341569906654207e-05, "train_loss": 2.829995608110603, "test_loss": 0.9492181566628543, "test_acc1": 81.85600226745605, "test_acc5": 95.90800284606934, "epoch": 9, "n_parameters": 16588484}
{"train_lr": 3.1796926520619977e-05, "train_loss": 2.8244817256451036, "test_loss": 0.9702437578728704, "test_acc1": 81.89600251342773, "test_acc5": 95.87400284057617, "epoch": 10, "n_parameters": 16588484}
{"train_lr": 3.004999999999962e-05, "train_loss": 2.8213165434096736, "test_loss": 1.0603628957813436, "test_acc1": 81.61000227355957, "test_acc5": 95.79200217041016, "epoch": 11, "n_parameters": 16588484}
{"train_lr": 2.819405919720878e-05, "train_loss": 2.815701360849263, "test_loss": 0.9939798803040476, "test_acc1": 81.8520023590088, "test_acc5": 95.89200260009765, "epoch": 12, "n_parameters": 16588484}
{"train_lr": 2.6249438188061264e-05, "train_loss": 2.8113755780777674, "test_loss": 0.9455332597999861, "test_acc1": 81.79000258056641, "test_acc5": 95.92800265869141, "epoch": 13, "n_parameters": 16588484}
{"train_lr": 2.4237442647273045e-05, "train_loss": 2.806338689024214, "test_loss": 0.998022968118841, "test_acc1": 81.79600298645019, "test_acc5": 95.92200233947754, "epoch": 14, "n_parameters": 16588484}
{"train_lr": 2.218011641902657e-05, "train_loss": 2.8068453557580875, "test_loss": 0.9343066102627552, "test_acc1": 81.86600245666504, "test_acc5": 95.92000251464843, "epoch": 15, "n_parameters": 16588484}
{"train_lr": 2.0099999999999614e-05, "train_loss": 2.8078854170015197, "test_loss": 1.044697282892285, "test_acc1": 81.65200227355957, "test_acc5": 95.86000262573242, "epoch": 16, "n_parameters": 16588484}
{"train_lr": 1.801988358097341e-05, "train_loss": 2.811426327716437, "test_loss": 1.069516186461304, "test_acc1": 81.77600200622558, "test_acc5": 95.89000268432618, "epoch": 17, "n_parameters": 16588484}
{"train_lr": 1.5962557352726692e-05, "train_loss": 2.8039920227609567, "test_loss": 1.032561695485404, "test_acc1": 81.73600232421875, "test_acc5": 95.84000257263183, "epoch": 18, "n_parameters": 16588484}
{"train_lr": 1.3950561811939017e-05, "train_loss": 2.806251465726337, "test_loss": 1.0380705184105672, "test_acc1": 81.72400244323731, "test_acc5": 95.9020025994873, "epoch": 19, "n_parameters": 16588484}
{"train_lr": 1.2005940802791846e-05, "train_loss": 2.790250376736422, "test_loss": 1.069025012128281, "test_acc1": 81.6460025946045, "test_acc5": 95.83400252807617, "epoch": 20, "n_parameters": 16588484}
{"train_lr": 1.0150000000000303e-05, "train_loss": 2.7953243468591062, "test_loss": 0.954402582211928, "test_acc1": 81.94200269470215, "test_acc5": 95.93400261230468, "epoch": 21, "n_parameters": 16588484}
{"train_lr": 8.403073479379566e-06, "train_loss": 2.788672054056927, "test_loss": 0.9823438281362707, "test_acc1": 81.98000222045899, "test_acc5": 95.94000272338867, "epoch": 22, "n_parameters": 16588484}
{"train_lr": 6.784300933458732e-06, "train_loss": 2.7942356936317934, "test_loss": 1.0218670982303042, "test_acc1": 81.8860025, "test_acc5": 95.90800295715331, "epoch": 23, "n_parameters": 16588484}
{"train_lr": 5.311417972999943e-06, "train_loss": 2.799621383634974, "test_loss": 1.0153561487342373, "test_acc1": 81.85600248779296, "test_acc5": 95.91600272338867, "epoch": 24, "n_parameters": 16588484}
{"train_lr": 4.000561811938627e-06, "train_loss": 2.8025113808975326, "test_loss": 1.0201955302195116, "test_acc1": 81.75200270935059, "test_acc5": 95.838002913208, "epoch": 25, "n_parameters": 16588484}
{"train_lr": 2.866094464689694e-06, "train_loss": 2.7992377952991916, "test_loss": 1.0058673367355808, "test_acc1": 81.91800265808105, "test_acc5": 95.93600252807617, "epoch": 26, "n_parameters": 16588484}
{"train_lr": 1.920445392912236e-06, "train_loss": 2.791052138538574, "test_loss": 0.9959358189142111, "test_acc1": 81.85200229248046, "test_acc5": 95.96200274291992, "epoch": 27, "n_parameters": 16588484}
{"train_lr": 1.173975325726429e-06, "train_loss": 2.789799641481311, "test_loss": 1.082988210699775, "test_acc1": 81.64400270507812, "test_acc5": 95.76600254760743, "epoch": 28, "n_parameters": 16588484}
{"train_lr": 6.348627453972665e-07, "train_loss": 2.7947734272022613, "test_loss": 1.0316823188102606, "test_acc1": 81.71000265869141, "test_acc5": 95.80600269836425, "epoch": 29, "n_parameters": 16588484}

@ysj9909 closed this as completed Nov 28, 2024