# ASAM

**Repository Path**: sharpmddr/ASAM

## Basic Information

- **Project Name**: ASAM
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2023-12-05
- **Last Updated**: 2023-12-05

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

### AAAI2018-[Modeling Attention and Memory for Auditory Selection in a Cocktail Party Environment](https://github.com/jacoxu/ASAM/blob/master/AAAI2018-Modeling%20Attention%20and%20Memory%20for%20Auditory%20Selection%20in%20a%20Cocktail%20Party%20Environment.pdf)     


Our demo code is implemented in Keras (written in Python, with Theano as the backend).

Usage:    
`$ python main_run.py`    
or run it in the background from a terminal:    
`$ bash run.sh`    


Notice:    
(1). To avoid a version mismatch with [Keras](https://keras.io/), we bundle a copy of Keras version 1.2.2 in this project.    
(2). We use the MATLAB version of BSS_eval to evaluate NSDR (Normalized Signal-to-Distortion Ratio).    
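
As a rough illustration of what NSDR measures, here is a minimal Python sketch. It is not the BSS_eval implementation used in this project: BSS_eval decomposes the estimate into target, interference and artifact components, whereas this sketch assumes a simplified SDR (energy of the reference over energy of the residual error). The function names `sdr_db` and `nsdr_db` and the toy signals are hypothetical and only serve the example.

```python
import numpy as np


def sdr_db(reference, estimate, eps=1e-12):
    """Simplified SDR in dB (assumption: plain reference-to-error energy ratio,
    not the BSS_eval decomposition)."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    error = reference - estimate
    return 10.0 * np.log10((np.sum(reference ** 2) + eps) / (np.sum(error ** 2) + eps))


def nsdr_db(reference, estimate, mixture):
    """NSDR: SDR of the estimate minus SDR of the unprocessed mixture."""
    return sdr_db(reference, estimate) - sdr_db(reference, mixture)


# Toy signals (hypothetical): a target tone plus an interfering tone.
t = np.linspace(0.0, 1.0, 8000, endpoint=False)
target = np.sin(2 * np.pi * 440 * t)
interferer = np.sin(2 * np.pi * 220 * t)
mixture = target + interferer
# Pretend a separation model reduced the interferer amplitude to 10%.
estimate = target + 0.1 * interferer

print("NSDR (dB):", nsdr_db(target, estimate, mixture))
```

In this toy case the interferer amplitude drops by a factor of ten, so the sketch reports an improvement of about 20 dB over the mixture. Values computed with the MATLAB BSS_eval toolbox will generally differ from this simplified measure.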


![Figure 1: Auditory Attention](http://wx3.sinaimg.cn/mw690/697b070fly1fo99vmp5njj20v10cw0vt.jpg)

Figure 1: Two specific attention tasks for auditory selection in a three-speech mixture environment. One is top-down task-specific attention, and the other is bottom-up stimulus-driven attention.

![Figure 2: Framework](http://wx3.sinaimg.cn/mw690/697b070fly1fo99vpoevkj21970hnwiq.jpg)

Figure 2: An illustration of our Auditory Selection with Attention and Memory (ASAM). (a): The overall architecture of the proposed ASAM. (b): The life-long memory module, which memorizes prior knowledge. In the top-down attention scenario, the dashed boxes and arrow are used only during training and are removed at evaluation time.

![Figure 3: Attention Heat Map](http://wx3.sinaimg.cn/mw690/697b070fly1fo99vsoeg1j21kw0xh4qq.jpg)

Figure 3: Effects of attention with different amounts of stimulus on one male-female mixture sample from WSJ0. (a) shows the SIR (Signal-to-Interference Ratio), SAR (Signal-to-Artifacts Ratio) and NSDR results; (b)-(d) are the auditory stimuli, with magnitudes divided by the maximum magnitude; (e) is the mixture input spectrogram; (i) is the target spectrogram; (f)-(h) are the attention maps based on the corresponding auditory stimuli; and (j)-(l) are the corresponding predictions with their NSDR performance.
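
The magnitude scaling mentioned for panels (b)-(d) can be sketched as follows. This is only an illustration of dividing a magnitude spectrogram by its maximum value; the function name `normalize_by_max` and the random input are hypothetical, and the plotting code actually used for the figure is not shown in this README.

```python
import numpy as np


def normalize_by_max(magnitude, eps=1e-12):
    """Divide a magnitude spectrogram by its maximum so values lie in [0, 1].
    eps guards against division by zero for an all-zero input."""
    magnitude = np.asarray(magnitude, dtype=np.float64)
    return magnitude / (np.max(magnitude) + eps)


# Hypothetical stand-in for a magnitude spectrogram (frequency bins x frames).
spec = np.abs(np.random.RandomState(0).randn(4, 6))
scaled = normalize_by_max(spec)
print(scaled.shape, scaled.max())
```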


![](https://camo.githubusercontent.com/0e32abe541a386cbaf8370777b4b55c35d31657d/68747470733a2f2f692e6372656174697665636f6d6d6f6e732e6f72672f6c2f62792d6e632f342e302f38387833312e706e67)    

This work is licensed under a [Creative Commons Attribution-NonCommercial 4.0 International License](http://creativecommons.org/licenses/by-nc/4.0/).