Use ASpIRE Chain Model (By Dan Povey) #50
Comments
@adx349 Try setting the Take a look at this thread for more details |
I tried that too, and it did not work. My guess is that the ASpIRE model uses a BLSTM, which is not supported in this online decoder. |
@fanskyer @adx349 I actually think it is an issue with the new Kaldi looped decoding not working properly. If you roll back Kaldi to commit bcc71b67d489a1766922c9caf2a54306755f1861 and gst-kaldi-nnet2-online to commit 63b2cfd, the ASpIRE model works. You will still need to set nnet-mode to 3, acoustic-scale to 1, and frame-subsampling-factor to 3. |
Were you able to get this working? I tried rolling back to 63b2cfd and setting those options in my config. No luck, it just returns Here's my config: https://gist.github.com/maxhawkins/24edbd87be0aa1601da5034acc27d7ee I'm using the ASpIRE chain model from kaldi-asr.org with an HCLG.fst created using the documentation. |
Never mind. I was using the client incorrectly. When I converted my wav file to raw PCM it started working fine. For anyone who encounters this in the future, here are the steps I took:
python kaldigstserver/master_server.py --port=8888 &
env GST_PLUGIN_PATH=.. python kaldigstserver/worker.py -u ws://localhost:8888/worker/ws/speech -c worker.yaml &
sox audio.wav -r 8000 -e signed -b 16 -c 1 -t raw audio.raw remix 1
python kaldigstserver/client.py -r 16000 audio.raw |
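The two rates in the commands above look inconsistent but measure different things: sox's `-r 8000` is a sample rate in Hz, while `client.py`'s `-r` is, as far as its help text indicates, a send rate in bytes per second. For 16-bit mono audio at 8 kHz that works out to 16000. A minimal sketch of the arithmetic (the function name `pcm_byte_rate` is made up for illustration and is not part of the server code):

```python
def pcm_byte_rate(sample_rate_hz, bits_per_sample=16, channels=1):
    """Bytes per second of headerless (raw) PCM audio."""
    if bits_per_sample % 8 != 0:
        raise ValueError("bits_per_sample must be a multiple of 8")
    return sample_rate_hz * (bits_per_sample // 8) * channels


# Matches the sox flags above: -r 8000 -b 16 -c 1
print(pcm_byte_rate(8000))  # 16000
```

If you convert to a different sample rate or channel count, the value passed to the client would need to change accordingly.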
Just an update on this: I did some testing on my side, and the ASpIRE model will work with the latest commits if frame-subsampling-factor is set to 1 instead of 3. This seems to be necessary for the most recent "looped decoding" implementation in Kaldi. However, the accuracy appears to be worse than with both repositories rolled back to the commits mentioned above. |
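To summarize the two combinations reported so far, here is a small sketch that returns the decoder overrides for each case. The property names (`nnet-mode`, `acoustic-scale`, `frame-subsampling-factor`) and their values are the ones mentioned in this thread. The function names, the `rolled_back` flag, and the idea of merging into a `decoder` mapping loaded from worker.yaml are assumptions for illustration, so adjust them to however your config is actually structured.

```python
def aspire_decoder_overrides(rolled_back):
    """Settings reported in this thread for the ASpIRE chain model.

    rolled_back=True:  Kaldi at bcc71b67d489a1766922c9caf2a54306755f1861
                       and gst-kaldi-nnet2-online at 63b2cfd.
    rolled_back=False: latest commits of both (looped decoding).
    """
    return {
        # Assumed to stay the same in both cases; the thread only
        # reports a change to frame-subsampling-factor.
        "nnet-mode": 3,
        "acoustic-scale": 1.0,
        # Reported: 3 works on the rolled-back commits, 1 on the latest ones.
        "frame-subsampling-factor": 3 if rolled_back else 1,
    }


def apply_overrides(config, overrides):
    """Return a copy of config with overrides merged into its 'decoder' mapping.

    Hypothetical helper: assumes config is a dict with an optional
    'decoder' section, e.g. as loaded from a YAML file.
    """
    merged = dict(config)
    merged["decoder"] = {**config.get("decoder", {}), **overrides}
    return merged


if __name__ == "__main__":
    base = {"decoder": {"model": "final.mdl"}}  # placeholder path
    print(apply_overrides(base, aspire_decoder_overrides(rolled_back=False)))
```

Keeping the overrides separate from the base config makes it easy to switch between the two setups when comparing accuracy.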
Thanks, I'll give that a shot. I'm also seeing some errors with word-level alignment (subtle drift, noticeable on long recordings) with the ASpIRE model at 63b2cfd, but I think that's a separate issue. I'll keep troubleshooting and file another bug if I can't resolve it. |
It works for me, but it keeps outputting "mhm" every few seconds, which TEDLIUM didn't do. Has anyone experienced the same issue? |
I've had that issue before. Usually it means your settings are wrong. Check the |
Thank you for your work on kaldi, it is very helpful for me.
I was wondering what changes I have to make to use the latest ASpIRE chain model.
I tried setting nnet-mode=3 and replacing the fst, mdl, and conf files with those from the new model, but it is not giving me any output.
What do you think the issue is?