forked from jacoxu/ASAM

This is the code&dataset for our paper [Modeling Attention and Memory for Auditory Selection in a Cocktail Party Environment. AAAI 2018]


luyun2hit/ASAM

 
 


=======================================================================

Our demo code is implemented in Keras (written in Python, with Theano as the backend).

Usage:
$ python main_run.py
or run it in the background from a terminal:
$ bash run.sh

Notice:
(1). To avoid Keras version mismatches, we include a fork of Keras version 1.2.2 in this project.
(2). We use the MATLAB version of BSS_eval to evaluate NSDR.
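
For readers who want a quick sanity check without MATLAB, the sketch below illustrates the idea behind NSDR: the SDR of the separated signal minus the SDR of the unprocessed mixture, both measured against the clean target. This is only a simplified illustration, not the BSS_eval computation: it uses a plain energy-ratio SDR with no distortion-filter projection, so its numbers will not match BSS_eval. The function names `simple_sdr` and `simple_nsdr` are hypothetical and are not part of this repository.

```python
import numpy as np


def simple_sdr(reference, estimate, eps=1e-12):
    """Plain energy-ratio SDR in dB (assumption: NOT the BSS_eval definition)."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    distortion = reference - estimate
    # eps keeps the ratio finite when either energy is zero.
    return 10.0 * np.log10((np.sum(reference ** 2) + eps)
                           / (np.sum(distortion ** 2) + eps))


def simple_nsdr(reference, estimate, mixture):
    """SDR improvement of the estimate over the raw mixture, in dB."""
    return simple_sdr(reference, estimate) - simple_sdr(reference, mixture)


# Toy usage with synthetic signals (hypothetical data, not WSJ0).
t = np.linspace(0.0, 1.0, 8000, endpoint=False)
target = np.sin(2 * np.pi * 220 * t)
interferer = 0.5 * np.sin(2 * np.pi * 330 * t)
mixture = target + interferer
estimate = target + 0.1 * interferer  # pretend 90% of the interferer amplitude was removed
print("NSDR (dB):", simple_nsdr(target, estimate, mixture))
```

The results reported for this project come from the MATLAB BSS_eval toolbox, so use that toolbox for any comparison with the paper.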

Figure 1: Auditory Attention

Figure 1: Two specific attention tasks for auditory selection in a three speech mixture environment. One is top-down task-specific attention, and the other is bottom-up stimulus-driven attention.

Figure 2: Framework

Figure 2: An illustration of our Auditory Selection with Attention and Memory (ASAM). (a): The overall architecture of the proposed ASAM. (b): The life-long memory module, which memorizes prior knowledge. In the top-down attention scenario, the dashed boxes and arrow are used only during training and are removed at evaluation time.

Figure 3: Attention Heat Map

Figure 3: Effects of attention with different amounts of stimulus on one male-female mixture sample from WSJ0. (a) shows the SIR (Signal-to-Interference Ratio), SAR (Signal-to-Artifacts Ratio) and NSDR results; (b)-(d) are the auditory stimuli, with magnitudes divided by the maximum magnitude; (e) is the mixture input spectrogram; (i) is the target spectrogram; (f)-(h) are the attention maps based on the corresponding auditory stimuli; and (j)-(l) are the corresponding predictions with their NSDR performances.
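
The caption above mentions that the stimulus magnitudes in panels (b)-(d) are divided by the maximum magnitude. A minimal sketch of that kind of peak normalization is shown below, assuming the magnitudes are held in a NumPy array. The helper name `normalize_by_max` and the `eps` guard are hypothetical and may differ from what the plotting code in this repository actually does.

```python
import numpy as np


def normalize_by_max(magnitude, eps=1e-8):
    """Scale a non-negative magnitude array so that its peak is 1.0."""
    magnitude = np.asarray(magnitude, dtype=np.float64)
    peak = magnitude.max()
    # Guard against an all-zero (silent) input to avoid division by zero.
    return magnitude / max(peak, eps)


# Toy magnitude "spectrogram" (hypothetical values).
spec = np.array([[1.0, 2.0], [4.0, 0.0]])
print(normalize_by_max(spec))
```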

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
