JerryPeng21cuhk/mFVAE


mixture factorized variational auto-encoder for speaker verification

This is a PyTorch implementation of mFVAE, based on the paper: mixture factorization auto-encoder for unsupervised hierarchical deep factorization of speech signal. Note that we apply the reparameterization trick to the posteriors generated by both the frame tokenizer and the utterance embedder.
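
As a rough illustration of what reparameterizing the two posteriors could look like, the sketch below draws a Gaussian sample for an utterance-level embedding and a Gumbel-softmax relaxed sample for a frame-level categorical posterior. The pairing of Gumbel-softmax with the frame tokenizer is an assumption, and every name here (`sample_gaussian`, `sample_gumbel_softmax`, `tau`) is hypothetical rather than taken from this repository. The code uses only the Python standard library to show the sampling formulas; in the actual model the same expressions would be applied to PyTorch tensors so that gradients can flow through the samples.

```python
import math
import random


def sample_gaussian(mu, log_var, rng):
    """Reparameterized draw z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def sample_gumbel_softmax(logits, tau, rng):
    """Relaxed one-hot draw: softmax((logits + Gumbel noise) / tau)."""
    noisy = []
    for logit in logits:
        # Keep u strictly inside (0, 1) so both logarithms are defined.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        # Gumbel(0, 1) noise is -log(-log(u)) for u ~ Uniform(0, 1).
        noisy.append((logit - math.log(-math.log(u))) / tau)
    # Numerically stable softmax over the perturbed logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
# Hypothetical utterance-level posterior parameters (3-dimensional embedding).
embedding = sample_gaussian([0.0, 1.0, -1.0], [0.0, -2.0, -4.0], rng)
# Hypothetical frame-level posterior over 4 mixture components.
frame_weights = sample_gumbel_softmax([2.0, 0.5, 0.1, -1.0], tau=0.5, rng=rng)
```

In both functions the randomness comes from a parameter-free noise source, and the posterior parameters enter only through deterministic arithmetic. A lower `tau` pushes `frame_weights` closer to a one-hot vector.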

Here is an online demo of the embeddings extracted from the embedder of mFVAE. It shows that the embeddings carry speaker information. Because the aim of mFVAE/mFAE is to factorize linguistic and paralinguistic information, the embeddings also contain channel distortion and background noise.

Dependencies

  • Python 3.7
  • PyTorch 1.1.0
  • Kaldi
  • PyKaldi
  • kaldi_io
  • GPUtil
  • NumPy, datetime, argparse, pprint

Usage

Download Dataset

  1. Download and unzip the audio files from http://www.robots.ox.ac.uk/~vgg/data/voxceleb/vox1.html
  2. Create a directory named voxceleb1 with two subdirectories named train and test. Move the dev data to the train directory and the test data to the test directory.
  3. Download the list of trial pairs for verification (http://www.robots.ox.ac.uk/~vgg/data/voxceleb/meta/veri_test.txt) and move it to the voxceleb1 directory.
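
Once the trial list is in place, it can be read with a few lines of Python. The sketch below assumes each line of veri_test.txt holds a 0/1 label followed by two wav paths separated by whitespace; the `Trial` type, the `parse_trials` helper, and the example paths are hypothetical and are not part of this repository.

```python
from typing import Iterable, List, NamedTuple


class Trial(NamedTuple):
    is_target: bool   # True when both utterances come from the same speaker
    enroll_wav: str
    test_wav: str


def parse_trials(lines: Iterable[str]) -> List[Trial]:
    """Parse '<label> <wav1> <wav2>' lines, skipping blank ones."""
    trials = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if len(fields) != 3 or fields[0] not in ("0", "1"):
            raise ValueError(f"unexpected trial line: {line!r}")
        trials.append(Trial(fields[0] == "1", fields[1], fields[2]))
    return trials


# Made-up lines in the assumed format, not real entries from the list.
example_lines = [
    "1 spkA/session1/00001.wav spkA/session2/00001.wav",
    "",
    "0 spkA/session1/00001.wav spkB/session1/00003.wav",
]
trials = parse_trials(example_lines)
```

Passing an open file handle for voxceleb1/veri_test.txt in place of `example_lines` would parse the downloaded list the same way, provided the format assumption holds.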

Revise files in this repository

  1. Go to the voxceleb-mfvae directory.
  2. Link the Kaldi utils directory: ln -fsr "your path to kaldi-trunk/egs/sre08/v1/utils" utils
  3. Set root_data_dir in run.sh to the location of your data.
  4. Run the recipe: bash run.sh --stage 0
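
Before launching run.sh, it may help to confirm that the data layout and the utils link are in place. The sketch below is a hypothetical pre-flight check, not part of this repository. It assumes root_data_dir points at the voxceleb1 directory described above; adjust it if run.sh expects the parent directory instead.

```python
import os
import tempfile


def check_layout(root_data_dir, recipe_dir="."):
    """Return a list of problems; an empty list means the layout looks right."""
    problems = []
    for sub in ("train", "test"):
        if not os.path.isdir(os.path.join(root_data_dir, sub)):
            problems.append(f"missing directory: {sub}")
    if not os.path.isfile(os.path.join(root_data_dir, "veri_test.txt")):
        problems.append("missing file: veri_test.txt")
    # os.path.isdir follows symlinks, so a dangling utils link is reported too.
    if not os.path.isdir(os.path.join(recipe_dir, "utils")):
        problems.append("utils is missing or a broken link")
    return problems


# Self-contained demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = os.path.join(tmp, "voxceleb1")
    os.makedirs(os.path.join(root, "train"))
    before = check_layout(root, recipe_dir=tmp)   # incomplete layout
    os.makedirs(os.path.join(root, "test"))
    open(os.path.join(root, "veri_test.txt"), "w").close()
    os.makedirs(os.path.join(tmp, "utils"))
    after = check_layout(root, recipe_dir=tmp)    # complete layout
```

Calling `check_layout` with your own root_data_dir from inside the voxceleb-mfvae directory returns the list of anything still missing.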
