
Bugs when running 'StageB_ldm_finetune.py' #12

Open
lyh1028 opened this issue Apr 10, 2023 · 7 comments

Comments


lyh1028 commented Apr 10, 2023

Thank you for your excellent work. Could you please help me with the questions below?
It is strange that the first time I ran this file it worked fine. However, when I repeated the operation, it threw an error like:
Traceback (most recent call last):
File "code/stageB_ldm_finetune.py", line 245, in
main(config)
File "code/stageB_ldm_finetune.py", line 163, in main
generative_model.finetune(trainer, fmri_latents_dataset_train, fmri_latents_dataset_test,
File "/public1/home/ungradu/home/gra02/lyh_test/mind-vis/mind-vis-main/code/dc_ldm/ldm_for_fmri.py", line 103, in finetune
trainers.fit(self.model, dataloader, val_dataloaders=test_loader)
File "/public1/home/ungradu/home/gra02/anaconda3/envs/mind-vis/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 770, in fit
self._call_and_handle_interrupt(
File "/public1/home/ungradu/home/gra02/anaconda3/envs/mind-vis/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 721, in _call_and_handle_interrupt
return self.strategy.launcher.launch(trainer_fn, *args, trainer=self, **kwargs)
File "/public1/home/ungradu/home/gra02/anaconda3/envs/mind-vis/lib/python3.8/site-packages/pytorch_lightning/strategies/launchers/spawn.py", line 78, in launch
mp.spawn(
File "/public1/home/ungradu/home/gra02/anaconda3/envs/mind-vis/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 240, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "/public1/home/ungradu/home/gra02/anaconda3/envs/mind-vis/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 189, in start_processes
process.start()
File "/public1/home/ungradu/home/gra02/anaconda3/envs/mind-vis/lib/python3.8/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/public1/home/ungradu/home/gra02/anaconda3/envs/mind-vis/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
return Popen(process_obj)
File "/public1/home/ungradu/home/gra02/anaconda3/envs/mind-vis/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in init
super().init(process_obj)
File "/public1/home/ungradu/home/gra02/anaconda3/envs/mind-vis/lib/python3.8/multiprocessing/popen_fork.py", line 19, in init
self._launch(process_obj)
File "/public1/home/ungradu/home/gra02/anaconda3/envs/mind-vis/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
reduction.dump(process_obj, fp)
File "/public1/home/ungradu/home/gra02/anaconda3/envs/mind-vis/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'TorchHistory.add_log_parameters_hook.<locals>.<lambda>'

P.S.
Before the error there is also a UserWarning about wandb, but I guess it is not the cause of this problem.
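For context, a minimal self-contained sketch (not taken from the mind-vis code; make_local_hook and module_level_hook are made up for illustration) of why this kind of AttributeError appears: the DDP spawn launcher pickles objects in order to send them to worker processes, and a function defined locally inside another function (such as the parameter-logging hook that wandb attaches) cannot be pickled, while a module-level function can.

# Illustration only: why "spawn" multiprocessing fails on locally defined hooks.
import pickle

def make_local_hook():
    # A function defined inside another function cannot be pickled,
    # because pickle stores functions by their module-level name.
    def hook(grad):
        return grad
    return hook

def module_level_hook(grad):
    # A module-level function pickles fine (stored by reference).
    return grad

try:
    pickle.dumps(make_local_hook())
except AttributeError as err:
    # AttributeError: Can't pickle local object 'make_local_hook.<locals>.hook'
    print("local hook:", err)

print("module-level hook pickles to", len(pickle.dumps(module_level_hook)), "bytes")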


lyh1028 commented Apr 12, 2023

I have solved this problem by using only one GPU while training.

@bottle0228

Hello,

I have also encountered this error. May I ask how you specifically modified the code to resolve this error?

I would greatly appreciate your help.


lyh1028 commented May 29, 2023

Hello,

I have also encountered this error. May I ask how you specifically modified the code to resolve this error?

I would greatly appreciate your help.

My server parallelizes training across multiple GPUs by default, which causes some problems (I have forgotten the specific reason, lol). The simple fix is to specify the GPU at runtime, for example: CUDA_VISIBLE_DEVICES=1 (your GPU id) python StageB_ldm_finetune.py
However, this makes the finetuning process very slow. I think it takes about 3 days on one RTX 3090 GPU.
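For anyone hitting the same thing, a minimal sketch of the two single-GPU workarounds described above (the exact Trainer arguments used in stageB_ldm_finetune.py may differ; gpus=1 is the pytorch_lightning 1.6.x style):

# Sketch of the single-GPU workaround; adapt to how stageB_ldm_finetune.py builds its Trainer.
import os
import pytorch_lightning as pl

# Option 1: hide all but one GPU before CUDA is initialised, equivalent to
#   CUDA_VISIBLE_DEVICES=1 python code/stageB_ldm_finetune.py
os.environ["CUDA_VISIBLE_DEVICES"] = "1"  # your GPU id

# Option 2: ask Lightning for a single device so it never launches the
# multi-process (DDP spawn) strategy that triggers the pickling error.
trainer = pl.Trainer(gpus=1, max_epochs=1)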

@bottle0228

Thank you very much for your help. I have successfully solved this problem, but as you said, it does run very slowly.

@tejastake

Hi there, can you tell me which version of pytorch_lightning you have used?
Can you please guide me about my error?
My error is:
File "/Major_with_stage3/eval_metrics.py", line 119, in n_way_top_k_acc
acc = accuracy(pred_picked.unsqueeze(0), torch.tensor([0], device=pred.device),
TypeError: accuracy() missing 1 required positional argument: 'task'

@bottle0228

hi there can you tell me which version of pytorch_lightning you have used. can you please guide me about my error. my error is File "/Major_with_stage3/eval_metrics.py", line 119, in n_way_top_k_acc acc = accuracy(pred_picked.unsqueeze(0), torch.tensor([0], device=pred.device), TypeError: accuracy() missing 1 required positional argument: 'task'

Hello, the version of pytorch_lightning I'm using is 1.6.5.

@JoyMei

JoyMei commented Oct 26, 2023

hi there can you tell me which version of pytorch_lightning you have used. can you please guide me about my error. my error is File "/Major_with_stage3/eval_metrics.py", line 119, in n_way_top_k_acc acc = accuracy(pred_picked.unsqueeze(0), torch.tensor([0], device=pred.device), TypeError: accuracy() missing 1 required positional argument: 'task'

This is due to the update of the torchmetrics API; it has already been resolved in #13.
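For reference, a sketch of what the newer torchmetrics API expects (the num_classes/top_k values here are placeholders; see eval_metrics.py and #13 for the actual ones):

# Newer torchmetrics requires an explicit `task` argument for accuracy().
import torch
from torchmetrics.functional import accuracy

pred_picked = torch.rand(50)   # placeholder class scores for one trial
target = torch.tensor([0])     # ground-truth index

# Old call (raises: accuracy() missing 1 required positional argument: 'task'):
# acc = accuracy(pred_picked.unsqueeze(0), target, top_k=1)

# New call: declare the task type and number of classes explicitly.
acc = accuracy(pred_picked.unsqueeze(0), target,
               task="multiclass", num_classes=50, top_k=1)
print(acc)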
