Can't load Whisper Model #23

Open
renatobrusarosco opened this issue Dec 9, 2023 · 4 comments

@renatobrusarosco
Hi there,

I don't have a lot of programming knowledge, and I'm struggling with your tool. I tried to use it in Jupyter under Anaconda. I installed all the required libraries, but when I attempted to run the final code, I encountered the same error message multiple times:

Python >= 3.10
Using cache found in C:\Users\renat/.cache\torch\hub\snakers4_silero-vad_master
Using Demucs
Using standard Whisper
LOADING: medium GPU:0 BS: 2
100%|█████████████████████████████████████| 1.42G/1.42G [00:25<00:00, 59.9MiB/s]
Can't load Whisper model: STD/medium

Below you can find more information:

RuntimeError Traceback (most recent call last)
File ~\Downloads\Jupyter\WhisperHallu\transcribeHallu.py:110, in loadModel(gpu, modelSize)
109 print("LOADING: "+modelSize+" GPU:"+gpu+" BS: "+str(beam_size))
--> 110 model = whisper.load_model(modelSize,device=torch.device("cuda:"+gpu)) #May be "cpu"
111 elif whisperFound == "SM4T":

File ~\AppData\Roaming\Python\Python311\site-packages\whisper\__init__.py:146, in load_model(name, device, download_root, in_memory)
143 with (
144 io.BytesIO(checkpoint_file) if in_memory else open(checkpoint_file, "rb")
145 ) as fp:
--> 146 checkpoint = torch.load(fp, map_location=device)
147 del checkpoint_file

File ~\AppData\Roaming\Python\Python311\site-packages\torch\serialization.py:1014, in load(f, map_location, pickle_module, weights_only, mmap, **pickle_load_args)
1013 raise pickle.UnpicklingError(UNSAFE_MESSAGE + str(e)) from None
-> 1014 return _load(opened_zipfile,
1015 map_location,
1016 pickle_module,
1017 overall_storage=overall_storage,
1018 **pickle_load_args)
1019 if mmap:

File ~\AppData\Roaming\Python\Python311\site-packages\torch\serialization.py:1422, in _load(zip_file, map_location, pickle_module, pickle_file, overall_storage, **pickle_load_args)
1421 unpickler.persistent_load = persistent_load
-> 1422 result = unpickler.load()
1424 torch._utils._validate_loaded_sparse_tensors()

File ~\AppData\Roaming\Python\Python311\site-packages\torch\serialization.py:1392, in _load.<locals>.persistent_load(saved_id)
1391 nbytes = numel * torch._utils._element_size(dtype)
-> 1392 typed_storage = load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
1394 return typed_storage

File ~\AppData\Roaming\Python\Python311\site-packages\torch\serialization.py:1366, in _load.<locals>.load_tensor(dtype, numel, key, location)
1363 # TODO: Once we decide to break serialization FC, we can
1364 # stop wrapping with TypedStorage
1365 typed_storage = torch.storage.TypedStorage(
-> 1366 wrap_storage=restore_location(storage, location),
1367 dtype=dtype,
1368 _internal=True)
1370 if typed_storage._data_ptr() != 0:

File ~\AppData\Roaming\Python\Python311\site-packages\torch\serialization.py:1299, in _get_restore_location.<locals>.restore_location(storage, location)
1298 def restore_location(storage, location):
-> 1299 return default_restore_location(storage, str(map_location))

File ~\AppData\Roaming\Python\Python311\site-packages\torch\serialization.py:381, in default_restore_location(storage, location)
380 for _, _, fn in _package_registry:
--> 381 result = fn(storage, location)
382 if result is not None:

File ~\AppData\Roaming\Python\Python311\site-packages\torch\serialization.py:274, in _cuda_deserialize(obj, location)
273 if location.startswith('cuda'):
--> 274 device = validate_cuda_device(location)
275 if getattr(obj, "_torch_load_uninitialized", False):

File ~\AppData\Roaming\Python\Python311\site-packages\torch\serialization.py:258, in validate_cuda_device(location)
257 if not torch.cuda.is_available():
--> 258 raise RuntimeError('Attempting to deserialize object on a CUDA '
259 'device but torch.cuda.is_available() is False. '
260 'If you are running on a CPU-only machine, '
261 'please use torch.load with map_location=torch.device(\'cpu\') '
262 'to map your storages to the CPU.')
263 device_count = torch.cuda.device_count()

RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

During handling of the above exception, another exception occurred:

SystemExit Traceback (most recent call last)
[... skipping hidden 1 frame]

Cell In[3], line 26
15 #Example
16 #lng="uk"
17 #prompt= "Whisper, Ok. "
(...)
23 # +"Ok, Whisper. "
24 #path="/path/to/your/uk/sound/file"
---> 26 loadModel("0")
27 result = transcribePrompt(path=path, lng=lng, prompt=prompt)

File ~\Downloads\Jupyter\WhisperHallu\transcribeHallu.py:117, in loadModel(gpu, modelSize)
116 print("Can't load Whisper model: "+whisperFound+"/"+modelSize)
--> 117 sys.exit(-1)

SystemExit: -1

During handling of the above exception, another exception occurred:

AttributeError Traceback (most recent call last)
[... skipping hidden 1 frame]

File ~\anaconda3\Lib\site-packages\IPython\core\interactiveshell.py:2097, in InteractiveShell.showtraceback(self, exc_tuple, filename, tb_offset, exception_only, running_compiled_code)
2094 if exception_only:
2095 stb = ['An exception has occurred, use %tb to see '
2096 'the full traceback.\n']
-> 2097 stb.extend(self.InteractiveTB.get_exception_only(etype,
2098 value))
2099 else:
2101 def contains_exceptiongroup(val):

File ~\anaconda3\Lib\site-packages\IPython\core\ultratb.py:710, in ListTB.get_exception_only(self, etype, value)
702 def get_exception_only(self, etype, value):
703 """Only print the exception type and message, without a traceback.
704
705 Parameters
(...)
708 value : exception value
709 """
--> 710 return ListTB.structured_traceback(self, etype, value)

File ~\anaconda3\Lib\site-packages\IPython\core\ultratb.py:568, in ListTB.structured_traceback(self, etype, evalue, etb, tb_offset, context)
565 chained_exc_ids.add(id(exception[1]))
566 chained_exceptions_tb_offset = 0
567 out_list = (
--> 568 self.structured_traceback(
569 etype,
570 evalue,
571 (etb, chained_exc_ids), # type: ignore
572 chained_exceptions_tb_offset,
573 context,
574 )
575 + chained_exception_message
576 + out_list)
578 return out_list

File ~\anaconda3\Lib\site-packages\IPython\core\ultratb.py:1435, in AutoFormattedTB.structured_traceback(self, etype, evalue, etb, tb_offset, number_of_lines_of_context)
1433 else:
1434 self.tb = etb
-> 1435 return FormattedTB.structured_traceback(
1436 self, etype, evalue, etb, tb_offset, number_of_lines_of_context
1437 )

File ~\anaconda3\Lib\site-packages\IPython\core\ultratb.py:1326, in FormattedTB.structured_traceback(self, etype, value, tb, tb_offset, number_of_lines_of_context)
1323 mode = self.mode
1324 if mode in self.verbose_modes:
1325 # Verbose modes need a full traceback
-> 1326 return VerboseTB.structured_traceback(
1327 self, etype, value, tb, tb_offset, number_of_lines_of_context
1328 )
1329 elif mode == 'Minimal':
1330 return ListTB.get_exception_only(self, etype, value)

File ~\anaconda3\Lib\site-packages\IPython\core\ultratb.py:1173, in VerboseTB.structured_traceback(self, etype, evalue, etb, tb_offset, number_of_lines_of_context)
1164 def structured_traceback(
1165 self,
1166 etype: type,
(...)
1170 number_of_lines_of_context: int = 5,
1171 ):
1172 """Return a nice text document describing the traceback."""
-> 1173 formatted_exception = self.format_exception_as_a_whole(etype, evalue, etb, number_of_lines_of_context,
1174 tb_offset)
1176 colors = self.Colors # just a shorthand + quicker name lookup
1177 colorsnormal = colors.Normal # used a lot

File ~\anaconda3\Lib\site-packages\IPython\core\ultratb.py:1063, in VerboseTB.format_exception_as_a_whole(self, etype, evalue, etb, number_of_lines_of_context, tb_offset)
1060 assert isinstance(tb_offset, int)
1061 head = self.prepare_header(str(etype), self.long_header)
1062 records = (
-> 1063 self.get_records(etb, number_of_lines_of_context, tb_offset) if etb else []
1064 )
1066 frames = []
1067 skipped = 0

File ~\anaconda3\Lib\site-packages\IPython\core\ultratb.py:1131, in VerboseTB.get_records(self, etb, number_of_lines_of_context, tb_offset)
1129 while cf is not None:
1130 try:
-> 1131 mod = inspect.getmodule(cf.tb_frame)
1132 if mod is not None:
1133 mod_name = mod.__name__

AttributeError: 'tuple' object has no attribute 'tb_frame'

From my research, the issue seems to be GPU-related. Here are my PC specifications:
AMD Ryzen 7 6800H with Radeon Graphics 3.20 GHz
16.0 GB RAM
NVIDIA GeForce RTX 3070 Ti
Thank you in advance.

@EtienneAb3d
Owner

@renatobrusarosco

It seems your problem is here:

RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

You need a graphics card with CUDA installed.

Try to get Whisper working on its own first; then WhisperHallu should also work.
https://github.com/openai/whisper
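
As a quick sanity check (a minimal sketch, not part of WhisperHallu or Whisper itself), you can verify that your PyTorch build actually sees the GPU before loading any model:

import torch

# False here is exactly what triggers the error above: either the
# installed PyTorch build is CPU-only, or no working CUDA driver is present.
print(torch.cuda.is_available())

if torch.cuda.is_available():
    # Name of GPU 0, the "cuda:0" device that loadModel("0") targets.
    print(torch.cuda.get_device_name(0))

If this prints False on a machine with an NVIDIA card, the usual fix is reinstalling PyTorch with a CUDA-enabled wheel from https://pytorch.org.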

@renatobrusarosco
Author

I have an NVIDIA GeForce RTX 3070 Ti. After researching a bit, I found out that I needed to install CUDA in a specific way, and that problem was solved. However, another issue has arisen: when I run the final code, I always receive the message below (I have the latest version of ffmpeg installed). Could you help me? Thank you.

RuntimeError Traceback (most recent call last)
Cell In[19], line 27
15 # Example
16 # lng = "uk"
17 # prompt = "Whisper, Ok. "
(...)
23 # "Ok, Whisper. "
24 # path = "/path/to/your/uk/sound/file"
26 loadModel("0")
---> 27 result = transcribePrompt(path=path, lng=lng, prompt=prompt)

File ~\Downloads\Jupyter\WhisperHalluEnv\WhisperHallu\transcribeHallu.py:201, in transcribePrompt(path, lng, prompt, lngInput, isMusic, addSRT, truncDuration, maxDuration)
199 print("PROMPT="+prompt,flush=True)
200 opts = dict(language=lng,initial_prompt=prompt)
--> 201 return transcribeOpts(path, opts,lngInput,isMusic=isMusic,addSRT=addSRT,truncDuration=truncDuration,maxDuration=maxDuration)

File ~\Downloads\Jupyter\WhisperHalluEnv\WhisperHallu\transcribeHallu.py:265, in transcribeOpts(path, opts, lngInput, isMusic, onlySRT, addSRT, truncDuration, maxDuration)
260 pathDemucs=pathIn+".vocals.wav" #demucsDir+"/htdemucs/"+os.path.splitext(os.path.basename(pathIn))[0]+"/vocals.wav"
261 #Demucs seems complex, using CLI cmd for now
262 #aCmd = "python -m demucs --two-stems=vocals -d "+device+":"+cudaIdx+" --out "+demucsDir+" "+pathIn
263 #print("CMD: "+aCmd)
264 #os.system(aCmd)
--> 265 demucs_audio(pathIn=pathIn,model=modelDemucs,device="cuda:"+cudaIdx,pathVocals=pathDemucs,pathOther=pathIn+".other.wav")
266 print("T=",(time.time()-startTime))
267 print("PATH="+pathDemucs,flush=True)

File ~\Downloads\Jupyter\WhisperHalluEnv\WhisperHallu\demucsWrapper.py:43, in demucs_audio(pathIn, model, device, pathVocals, pathOther)
41 source_idx=model.sources.index(name)
42 source=result[0, source_idx].mean(0)
---> 43 torchaudio.save(pathIn+"."+name+".wav", source[None], model.samplerate)

File ~\AppData\Roaming\Python\Python311\site-packages\torchaudio\_backend\utils.py:311, in get_save_func.<locals>.save(uri, src, sample_rate, channels_first, format, encoding, bits_per_sample, buffer_size, backend, compression)
223 def save(
224 uri: Union[BinaryIO, str, os.PathLike],
225 src: torch.Tensor,
(...)
233 compression: Optional[Union[CodecConfig, float, int]] = None,
234 ):
235 """Save audio data to file.
236
237 Note:
(...)
309
310 """
--> 311 backend = dispatcher(uri, format, backend)
312 return backend.save(
313 uri, src, sample_rate, channels_first, format, encoding, bits_per_sample, buffer_size, compression
314 )

File ~\AppData\Roaming\Python\Python311\site-packages\torchaudio\_backend\utils.py:221, in get_save_func.<locals>.dispatcher(uri, format, backend_name)
219 if backend.can_encode(uri, format):
220 return backend
--> 221 raise RuntimeError(f"Couldn't find appropriate backend to handle uri {uri} and format {format}.")

RuntimeError: Couldn't find appropriate backend to handle uri data/KatyPerry-Firework.mp3.WAV.wav.drums.wav and format None.

@EtienneAb3d
Owner

@renatobrusarosco

First, check that this file exists and is not empty at the moment the error occurs:
data/KatyPerry-Firework.mp3.WAV.wav.drums.wav

Each processing step adds a suffix to the original file path and creates a specific LOG file.
Check each of these LOG files to see whether the problem becomes clearer at one step or another.
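
For illustration only (the path is taken verbatim from the traceback above; the diagnosis of the usual cause of this torchaudio error is an assumption), a minimal check of both points might look like this:

import os
import torchaudio

# Path taken verbatim from the traceback above.
path = "data/KatyPerry-Firework.mp3.WAV.wav.drums.wav"
print(os.path.exists(path),
      os.path.getsize(path) if os.path.exists(path) else "missing")

# "Couldn't find appropriate backend" typically means torchaudio has no
# I/O backend it can dispatch torchaudio.save() to on this machine.
print(torchaudio.list_audio_backends())  # an empty list means no backend

If the backend list is empty, installing one (for example, pip install soundfile) is a common fix, though that is a guess rather than something confirmed in this thread.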

@gmmarc

gmmarc commented Jan 5, 2024

For people facing this issue without a GPU, here's how you can switch to CPU.

In https://github.com/EtienneAb3d/WhisperHallu/blob/main/transcribeHallu.py#L110, set device to cpu:

model = whisper.load_model(modelSize,device=torch.device("cpu"))

Same thing in https://github.com/EtienneAb3d/WhisperHallu/blob/main/transcribeHallu.py#L265:

demucs_audio(pathIn=pathIn,model=modelDemucs,device="cpu",pathVocals=pathDemucs,pathOther=pathIn+".other.wav")

You will need PyTorch compiled for CPU. I did:

pip uninstall torch
pip install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu

It will take some time, but it works.

Tested on macOS Sonoma 14.1 M2
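
As a final check (a sketch, not part of the original comment), you can confirm the CPU-only build is the one actually being imported:

import torch

print(torch.__version__)          # Linux/Windows CPU wheels usually end in "+cpu"
print(torch.cuda.is_available())  # should be False on a CPU-only build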
