Skip to content

This is the unofficial implementation of MFNet, from paper''a Mask Free Neural Network for Monaural Speech Enhancement''

Notifications You must be signed in to change notification settings

taotaowang97479/MFNet-SpeechEnhancement

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

19 Commits
 
 
 
 
 
 
 
 

Repository files navigation

MFNet

This is the unofficial implementation of MFNet, from paper''a Mask Free Neural Network for Monaural Speech Enhancement''

arxiv:https://arxiv.org/abs/2306.04286

I appreciate the guidance and assistance from the author. After the correction following our discussion:
1.The initial learning rate is 3e-4, correcting the value from 0.0034 in the paper.
2.The features input to the network are compressed spectra, i.e., input = sign(stdct) * sqrt(abs(stdct)).
3.DCT transformation without normalization.
I put the key code of STDCT, which may be useful for you.

Result

The model used the code modified by the author in the issues of his repo, ioyy900205/MFNet#1

Experiments were conducted on the Voicebank+Demand, and the training parameters were set as follows:

  • The warm-up strategy mentioned in the paper was not used
  • Initial learning rate: 3e-4
  • Batch size: 2
  • Training for 100 epochs
  • Cosine annealing with a period of 100 epochs
  • Minimum value during cosine annealing: 1e-5

Performance of MFNet on the Voicebank+Demand test set:

PESQ STOI CSIG CBAK COVL SSNR
Noisy 1.97 92.1 3.35 2.44 2.63 1.73
MFNet 3.05 94.6 4.19 3.55 3.63 9.79

The training process is as follows:

MFNet_result

About

This is the unofficial implementation of MFNet, from paper''a Mask Free Neural Network for Monaural Speech Enhancement''

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages