This repository provides an audio codec model that can be used to build multi-modal LLMs (text and audio modalities). The paper and further details will be released as soon as possible.
Step 1: Download the codec checkpoint:
wget https://huggingface.co/Dongchao/UniAudio/resolve/main/llm3_codec_uni.pth
Step 2: Download LLaMA 2 7B following the instructions at https://github.com/meta-llama/llama-recipes/tree/main
Step 3: Refer to infer.py and run:
python infer.py
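If you prefer to call the codec from your own script rather than through infer.py, the sketch below shows the general pattern: load the checkpoint with torch, then encode a waveform into discrete tokens and decode it back. The model class name and the encode/decode methods are placeholders (assumptions), not this repo's confirmed API; check infer.py for the actual entry point.

```python
# Minimal sketch of using the codec checkpoint from your own code.
# The model class and encode/decode calls below are placeholders,
# not the repo's confirmed API -- see infer.py for the real usage.
import torch
import torchaudio

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the checkpoint downloaded in Step 1.
state_dict = torch.load("llm3_codec_uni.pth", map_location=device)

# Hypothetical usage once the codec model class from this repo is imported:
# model = LLMCodec.from_config("config.yaml")
# model.load_state_dict(state_dict)
# model = model.to(device).eval()

# Read a waveform to feed the codec.
wav, sr = torchaudio.load("example.wav")

# Hypothetical encode/decode round trip:
# with torch.no_grad():
#     tokens = model.encode(wav.to(device))   # discrete audio tokens for the LLM
#     recon = model.decode(tokens)            # waveform reconstructed from tokens
# torchaudio.save("recon.wav", recon.cpu(), sr)
```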
In the following, we give a simple demonstration of how to use it.
CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node 1 --master_port=10645 infer_code/eval_accent_understanding_v2.py \
--batch_size 1 \
--max_seq_len 2048 \
--num_workers 0 \
--output_type "next_token_prediction" \
--audio_path "the path of audio folder" \
--file_path tsv/acc_9way_1_shot.scp \
--vq_config_path config.yaml \
--output_dir log_eval_few_shot/7B_output \
--llama_model_path llama_inference/llama-2-7b \
--induction 1 \
    --codec_ckpt "llm-codec.pth"
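Before running, replace the --audio_path, --llama_model_path, and --codec_ckpt arguments with the path to your audio folder, the LLaMA 2 7B weights from Step 2, and the codec checkpoint downloaded in Step 1, respectively.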
Please refer to the demos folder to listen to the generated audio.
Related repositories:
https://github.com/descriptinc/descript-audio-codec
https://github.com/yangdongchao/AcademiCodec
https://github.com/hubertsiuzdak/snac
https://github.com/Meta-Llama/llama-recipes