Nice work! Thank you very much for your contribution to the AI safety community!
I noticed a weird phenomenon when training the topology generator. The code for training the topology generator is:
```python
toponet.train()
for _ in tqdm(range(args.gtn_epochs), desc="training topology generator"):
    optimizer_topo.zero_grad()
    # generate new adj_list by dr.data['adj_list']
    for gid in pset:
        SendtoCUDA(gid, [init_As, Ainputs, topomasks])  # only send the used graph items to cuda
        rst_bkdA = toponet(
            Ainputs[gid], topomasks[gid], topo_thrd, cuda, args.topo_activation, 'topo')
        # rst_bkdA = recover_mask(nodenums[gid], topomasks[gid], 'topo')
        # bkd_dr.data['adj_list'][gid] = torch.add(rst_bkdA, init_As[gid])
        bkd_dr.data['adj_list'][gid] = torch.add(
            rst_bkdA[:nodenums[gid], :nodenums[gid]], init_As[gid])  # only current position in cuda
        SendtoCPU(gid, [init_As, Ainputs, topomasks])
    loss = forwarding(args, bkd_dr, model, allset, criterion)
    loss.backward()
    optimizer_topo.step()
    torch.cuda.empty_cache()
toponet.eval()
```
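For context, `optimizer_topo` is created from the topology generator's parameters earlier in the script; a rough sketch of that setup (a paraphrase, not the exact code from the repo, and the learning-rate argument name is my assumption) is:

```python
import torch

# Paraphrased setup, not the exact repo code; args.gtn_lr is an assumed name.
# The point is that optimizer_topo is built from toponet.parameters(), so any
# non-zero gradient on those parameters should move them at step().
optimizer_topo = torch.optim.Adam(toponet.parameters(), lr=args.gtn_lr)
```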
When I check the parameters of the topology generator before and after training using the following snippet:
```python
import copy

old_toponet = copy.deepcopy(toponet)

toponet.train()
for _ in tqdm(range(args.gtn_epochs), desc="training topology generator"):
    optimizer_topo.zero_grad()
    # generate new adj_list by dr.data['adj_list']
    for gid in pset:
        SendtoCUDA(gid, [init_As, Ainputs, topomasks])  # only send the used graph items to cuda
        rst_bkdA = toponet(
            Ainputs[gid], topomasks[gid], topo_thrd, cuda, args.topo_activation, 'topo')
        # rst_bkdA = recover_mask(nodenums[gid], topomasks[gid], 'topo')
        # bkd_dr.data['adj_list'][gid] = torch.add(rst_bkdA, init_As[gid])
        bkd_dr.data['adj_list'][gid] = torch.add(
            rst_bkdA[:nodenums[gid], :nodenums[gid]], init_As[gid])  # only current position in cuda
        SendtoCPU(gid, [init_As, Ainputs, topomasks])
    loss = forwarding(args, bkd_dr, model, allset, criterion)
    loss.backward()
    optimizer_topo.step()
    torch.cuda.empty_cache()
toponet.eval()

new_toponet = copy.deepcopy(toponet)

old_state_dict = old_toponet.state_dict()
new_state_dict = new_toponet.state_dict()
for name in old_state_dict:
    param_diff = new_state_dict[name] - old_state_dict[name]
    print(torch.mean(param_diff))
```
I found that there is no difference in the parameters after training. The log is as follows:
```
N nodes avg/std/min/max: 15.69/13.69/2/95
N edges avg/std/min/max: 16.20/15.01/1/103
Node degree avg/std/min/max: 2.06/0.84/0/6
Node features dim: 4
N classes: 2
Classes: [0 1]
Class 0: 400 samples
Class 1: 1600 samples
train 1000, test 1000
Train Epoch: 1 Loss: 0.3501 (avg: 0.6249) sec/iter: 0.09
Train Epoch: 2 Loss: 0.4396 (avg: 0.4671) sec/iter: 0.04
Train Epoch: 3 Loss: 0.3962 (avg: 0.4762) sec/iter: 0.05
Train Epoch: 4 Loss: 0.2415 (avg: 0.4725) sec/iter: 0.04
Train Epoch: 5 Loss: 0.3413 (avg: 0.4318) sec/iter: 0.05
Test set (epoch 5): Average loss: 0.3149, Accuracy: 936/1000 (93.60%) sec/iter: 0.04
Train Epoch: 6 Loss: 0.1591 (avg: 0.4509) sec/iter: 0.05
Train Epoch: 7 Loss: 0.2189 (avg: 0.4338) sec/iter: 0.05
Train Epoch: 8 Loss: 0.3262 (avg: 0.4374) sec/iter: 0.05
Train Epoch: 9 Loss: 0.4319 (avg: 0.4283) sec/iter: 0.05
Train Epoch: 10 Loss: 0.2932 (avg: 0.4221) sec/iter: 0.04
Test set (epoch 10): Average loss: 0.2969, Accuracy: 949/1000 (94.90%) sec/iter: 0.04
Train Epoch: 11 Loss: 0.3764 (avg: 0.4185) sec/iter: 0.04
Train Epoch: 12 Loss: 0.3095 (avg: 0.4208) sec/iter: 0.05
Train Epoch: 13 Loss: 0.2180 (avg: 0.3867) sec/iter: 0.04
Train Epoch: 14 Loss: 0.3225 (avg: 0.3997) sec/iter: 0.05
Train Epoch: 15 Loss: 0.2932 (avg: 0.4269) sec/iter: 0.04
Test set (epoch 15): Average loss: 0.2962, Accuracy: 953/1000 (95.30%) sec/iter: 0.04
Train Epoch: 16 Loss: 0.2085 (avg: 0.3804) sec/iter: 0.04
Train Epoch: 17 Loss: 0.3577 (avg: 0.4243) sec/iter: 0.04
Train Epoch: 18 Loss: 0.2417 (avg: 0.3843) sec/iter: 0.04
Train Epoch: 19 Loss: 0.2875 (avg: 0.3822) sec/iter: 0.04
Train Epoch: 20 Loss: 0.2741 (avg: 0.3581) sec/iter: 0.05
Test set (epoch 20): Average loss: 0.2789, Accuracy: 955/1000 (95.50%) sec/iter: 0.04
initializing trigger...: 100%|██████████| 500/500 [00:00<00:00, 4332.37it/s]
initializing trigger...: 100%|██████████| 100/100 [00:00<00:00, 19990.96it/s]
Resampling step 0, bi-level optimization step 0
training topology generator: 100%|██████████| 20/20 [02:09<00:00, 6.46s/it]
tensor(0., device='cuda:0')
tensor(0., device='cuda:0')
tensor(0., device='cuda:0')
tensor(0., device='cuda:0')
tensor(0., device='cuda:0')
tensor(0., device='cuda:0')
```
Could you give me some suggestions about this problem? Thank you very much for any replies! :)
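In case it is useful for debugging, a minimal sketch of an additional check (reusing the variable names from the snippet above, inserted right after `loss.backward()` in the loop) would be to print whether the generator's parameters receive any gradient at all:

```python
# Sketch only, using the same names as the training loop above.
# If every gradient is None or exactly zero, the loss does not depend on
# toponet's parameters in a differentiable way (e.g. a hard threshold in the
# generator's forward pass would break the computation graph).
loss.backward()
for name, param in toponet.named_parameters():
    if param.grad is None:
        print(name, "-> grad is None")
    else:
        print(name, "-> mean |grad|:", param.grad.abs().mean().item())
optimizer_topo.step()
```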
I noticed the same issue -- toponet does not update between epochs. Also hoping for suggestions on this!