Title: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model
Authors: Zhen Ye, Peiwen Sun, Jiahe Lei, Hongzhan Lin, Xu Tan, Zheqi Dai, Qiuqiang Kong, Jianyi Chen, Jiahao Pan, Qifeng Liu, Yike Guo*, Wei Xue*
Speech ckpts download link
General audio ckpts [coming soon]
Inference:
python inference.py
Distributed launch (1 node, 8 processes per node):
torchrun --nnodes=1 --nproc-per-node=8 main_launch_vqdp.py
We would like to extend special thanks to the authors of Uniaudio and DAC, since our code base is mainly borrowed from their projects.