DeepXi: Residual Bidirectional Long Short-Term Memory (ResBLSTM) Network A Priori SNR estimator

DeepXi (where the Greek letter 'xi' or ξ is pronounced /zaɪ/) is a residual bidirectional long short-term memory (ResBLSTM) network a priori SNR estimator that was proposed in [1]. Its estimate can be used by minimum mean-square error (MMSE) approaches such as the MMSE short-time spectral amplitude (MMSE-STSA) estimator, the MMSE log-spectral amplitude (MMSE-LSA) estimator, and the Wiener filter (WF) approach. It can also be used to estimate the ideal ratio mask (IRM) and the ideal binary mask (IBM). DeepXi is implemented in TensorFlow and is trained to estimate the a priori SNR for single-channel noisy speech with a sampling frequency of 16 kHz.
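To illustrate how an a priori SNR estimate feeds the approaches listed above, the following is a minimal sketch of the textbook Wiener filter gain and of mask estimates derived from it. It is not taken from deepxi.py: the function names, the IRM exponent of 0.5, and the 0 dB IBM threshold are assumptions chosen for illustration, and ξ is assumed to be on a linear (not dB) scale.

```python
import math


def db_to_linear(xi_db):
    """Convert an a priori SNR from dB to a linear power ratio."""
    return 10.0 ** (xi_db / 10.0)


def wiener_gain(xi):
    """Textbook Wiener filter gain for a linear-scale a priori SNR xi >= 0."""
    return xi / (1.0 + xi)


def ideal_ratio_mask(xi, beta=0.5):
    """IRM estimate from xi. The exponent beta=0.5 is an assumed, common choice."""
    return (xi / (1.0 + xi)) ** beta


def ideal_binary_mask(xi, threshold_db=0.0):
    """IBM estimate: 1.0 where the SNR exceeds the threshold (assumed 0 dB), else 0.0."""
    return 1.0 if xi > db_to_linear(threshold_db) else 0.0


if __name__ == "__main__":
    # Evaluate each gain for a few example a priori SNR values (in dB).
    for xi_db in (-10.0, 0.0, 10.0):
        xi = db_to_linear(xi_db)
        print(xi_db, wiener_gain(xi), ideal_ratio_mask(xi), ideal_binary_mask(xi))
```

Each gain or mask value would be applied per time-frequency component to the noisy speech magnitude spectrum. The MMSE-STSA and MMSE-LSA gains additionally depend on the a posteriori SNR and are omitted here.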

Prerequisites

TensorFlow (installed in a virtual environment)
Python3
MATLAB

Installation

It is recommended to use a virtual environment.

git clone https://github.com/anicolson/DeepXi.git
cd DeepXi
pip install -r requirements.txt

Download the Model

A trained model can be downloaded from here. Unzip it and place it in the model directory. The model was trained on speech with a sampling rate of 16 kHz.

How to Perform Speech Enhancement

Run the script (python3 deepxi.py) inside the virtual environment in which TensorFlow is installed. The script offers several inference options and can also perform training if required.
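Because the model is trained for single-channel speech sampled at 16 kHz, it can help to check the input files before running the script. The sketch below is a hypothetical helper, not part of the repository: check_wav and check_directory are illustrative names, and it relies on the standard-library wave module, which reads only uncompressed PCM .wav files.

```python
import wave
from pathlib import Path

EXPECTED_RATE_HZ = 16000  # sampling frequency the model was trained with
EXPECTED_CHANNELS = 1     # single-channel (mono) speech


def check_wav(path):
    """Return a list of problems found for one .wav file (empty if none)."""
    problems = []
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
    if rate != EXPECTED_RATE_HZ:
        problems.append(f"sampling rate is {rate} Hz, expected {EXPECTED_RATE_HZ} Hz")
    if channels != EXPECTED_CHANNELS:
        problems.append(f"{channels} channels, expected {EXPECTED_CHANNELS}")
    return problems


def check_directory(directory="noisy_speech"):
    """Map each .wav file name in the directory to its list of problems."""
    return {p.name: check_wav(p) for p in sorted(Path(directory).glob("*.wav"))}


if __name__ == "__main__":
    # Report every file that does not match the expected format.
    for name, problems in check_directory().items():
        if problems:
            print(f"{name}: {'; '.join(problems)}")
```

Files that are reported would need to be resampled or downmixed before enhancement.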

Directory Description

Directory	Description
lib	Functions for deepxi.py.
model	The directory for the model (the model must be downloaded).
noisy_speech	Noisy speech. Place noisy speech .wav files to be enhanced here.
output	DeepXi outputs, including the enhanced speech .wav output files.
stats	Statistics of a sample from the training set. The mean and standard deviation of the a priori SNR for the sample are used to compute the training target.
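The stats entry above says that the mean and standard deviation of the a priori SNR are used to compute the training target, but it does not give the mapping. One plausible reading, sketched below, is to pass the a priori SNR in dB through a normal cumulative distribution function so that the target lies in [0, 1]. The choice of the normal CDF, the dB scale, and the function names are all assumptions made for illustration, not details confirmed by this README.

```python
from statistics import NormalDist


def snr_db_to_target(xi_db, mu, sigma):
    """Map an a priori SNR in dB to [0, 1] with a normal CDF (assumed mapping).

    mu and sigma stand for the sample mean and standard deviation in dB.
    """
    return NormalDist(mu, sigma).cdf(xi_db)


def target_to_snr_db(target, mu, sigma, eps=1e-6):
    """Invert the assumed mapping; the target is clipped away from 0 and 1."""
    clipped = min(max(target, eps), 1.0 - eps)
    return NormalDist(mu, sigma).inv_cdf(clipped)


if __name__ == "__main__":
    mu, sigma = -5.0, 12.0  # made-up statistics, for demonstration only
    target = snr_db_to_target(3.0, mu, sigma)
    print(target, target_to_snr_db(target, mu, sigma))
```

Under this assumption, a network output in [0, 1] would be converted back to an a priori SNR estimate in dB with target_to_snr_db at inference time.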

References

[1] A. Nicolson and K. K. Paliwal, "Deep Learning For Minimum Mean-Square Error Approaches to Speech Enhancement", Submitted to Speech Communication.
