Paper_Reading

This repository records the papers I have read, along with my reading notes.

[0] Multimodal Machine Learning: A Survey and Taxonomy | note
[1] Looking to Listen at the Cocktail Party: A Speaker-Independent Audio-Visual Model for Speech Separation | note
[2] Audio-Visual Speech Enhancement Using Multimodal Deep Convolutional Neural Networks | note
[3] Audio-Visual Speech Separation and Dereverberation with a Two-Stage Multimodal Network | note (in progress)
[4] Deep Latent Space Learning for Cross-modal Mapping of Audio and Visual Signals
[5] Contextual Audio-Visual Switching For Speech Enhancement in Real-World Environments
[6] The Conversation: Deep Audio-Visual Speech Enhancement | note
[7] Audio-Visual Speech Enhancement using Hierarchical Extreme Learning Machine
[8] AV Speech Enhancement Challenge using a Real Noisy Corpus
[9] Audio-visual Speech Enhancement Using Conditional Variational Auto-Encoder | note
[10] CochleaNet: A Robust Language-independent Audio-Visual Model for Speech Enhancement | note
[11] Tutorial on Variational Autoencoders | note (in progress)
[12] Visual Speech Enhancement | note
[13] Mixture of Inference Networks for VAE-based Audio-visual Speech Enhancement
[14] The Sound of Pixels
[15] Seeing Through Noise: Visually Driven Speaker Separation and Enhancement | note
[16] Audiovisual Speech Source Separation: An overview of key methodologies | note
[17] Using Visual Speech Information in Masking Methods for Audio Speaker Separation | note (in progress)
[18] Time Domain Audio Visual Speech Separation | note
[19] Audio-Visual Scene Analysis with Self-Supervised Multisensory Features
[20] Supervised Speech Separation Based on Deep Learning: An Overview
[21] Multimodal Model-Agnostic Meta-Learning via Task-Aware Modulation
[22] Deep clustering: Discriminative embeddings for segmentation and separation
[23] My lips are concealed: Audio-visual speech enhancement through obstructions | note
[24] Multimodal SpeakerBeam: Single Channel Target Speech Extraction with Audio-Visual Speaker Clues | note
[25] On Training Targets and Objective Functions for Deep-Learning-Based Audio-Visual Speech Enhancement
[26] Effects of Lombard Reflex on the Performance of Deep-Learning-Based Audio-Visual Speech Enhancement Systems
[27] Face Landmark-based Speaker-Independent Audio-Visual Speech Enhancement in Multi-Talker Environments
[28] Deep-Learning-Based Audio-Visual Speech Enhancement in Presence of Lombard Effect
