FlowDec

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

FlowDec (ICLR 2025) is a full-band audio codec for general audio sampled at 48 kHz that combines non-adversarial codec training with a stochastic postfilter based on a novel conditional flow matching method.

Demo

See our demo page here.

News

2025/03/03: First version released.

Installation

Create a new virtual environment (we recommend Python 3.10) and run

pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cu126

Replace cu126 in the index URL with the tag that matches your local CUDA version (for example, cu118 for CUDA 11.8).
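
As a small illustration of how the index URL relates to the CUDA version, the sketch below maps a version string to the corresponding URL. The helper name pytorch_index_url is hypothetical and not part of this repository. Note that PyTorch only publishes wheels for selected CUDA versions, so the resulting URL is not guaranteed to exist for every input.

```python
def pytorch_index_url(cuda_version: str) -> str:
    """Map a CUDA version string such as "12.6" to a PyTorch wheel index URL.

    Hypothetical helper for illustration only; it does not check whether
    PyTorch actually publishes wheels for the given CUDA version.
    """
    major, minor = cuda_version.split(".")[:2]
    return f"https://download.pytorch.org/whl/cu{major}{minor}"


# Example: the URL used in the install command above.
print(pytorch_index_url("12.6"))
```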

Checkpoints

You can find the checkpoints for FlowDec-75m and FlowDec-25s, as well as the weights for the underlying NDAC codecs NDAC-75 and NDAC-25, here.

Inference

Please check out the notebook demo.ipynb for how to run inference using the pretrained checkpoints.

Training

We use Hydra for model configuration and training. For training config files, see the config/ folder.

Data preparation

NOTE: We do not provide training, validation, or test datasets here, so all training configurations in config/ use the dummy datamodule config config/datamodule/example.yaml. To actually train FlowDec, you should first run your own dataset(s) through a pre-trained underlying codec, save the codec outputs as .wav files, and store the paired paths in a text file. You can, for instance, use our pre-trained NDAC variants; see the "Inference" section for how to run them.

The expected input format for FlowDec datasets is a text file with one comma-separated pair of paths per line, the clean file first and the corresponding codec output second, e.g.:

/clean_path/file1.wav,/codec_output_path/file1.wav
/clean_path/file2.wav,/codec_output_path/file2.wav
[...]

You need three files in this format, train.txt, validation.txt and test.txt. Then adapt the datamodule config file to use these three .txt files instead of the dummy file.
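
The filelists can be generated with a short script. The following is a minimal sketch, not part of this repository, that assumes the codec outputs mirror the directory layout and file names of the clean files. The function names, the split fractions, and the fixed seed are illustrative choices. Paths that contain commas would break the format and are not handled.

```python
import random
from pathlib import Path


def build_pairs(clean_dir, codec_dir):
    """Pair each clean .wav with the codec output at the same relative path."""
    clean_dir, codec_dir = Path(clean_dir), Path(codec_dir)
    pairs = []
    for clean in sorted(clean_dir.rglob("*.wav")):
        decoded = codec_dir / clean.relative_to(clean_dir)
        if decoded.is_file():  # skip clean files without a codec output
            pairs.append((str(clean), str(decoded)))
    return pairs


def split_pairs(pairs, val_fraction=0.05, test_fraction=0.05, seed=0):
    """Shuffle deterministically, then split into train/validation/test."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    n_test = int(len(shuffled) * test_fraction)
    return {
        "validation": shuffled[:n_val],
        "test": shuffled[n_val:n_val + n_test],
        "train": shuffled[n_val + n_test:],
    }


def write_filelists(splits, out_dir):
    """Write one '<clean>,<codec output>' line per pair into <split>.txt."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, pairs in splits.items():
        lines = [f"{clean},{decoded}\n" for clean, decoded in pairs]
        (out_dir / f"{name}.txt").write_text("".join(lines))
```

For example, write_filelists(split_pairs(build_pairs("/clean_path", "/codec_output_path")), "filelists") would create filelists/train.txt, filelists/validation.txt and filelists/test.txt.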

Running training

After adapting the datamodule config, you can start training, for example with:

python train.py --config-name flowdec_75m
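
Before launching a long run, it can help to verify that each filelist is well-formed and that every referenced file exists. The sketch below is an assumption-laden illustration and not part of this repository: check_filelist is a hypothetical helper, and whether the FlowDec datamodule itself tolerates blank lines is not known, so the checker simply skips them.

```python
from pathlib import Path


def check_filelist(filelist):
    """Return a list of human-readable problems found in a paired filelist."""
    problems = []
    text = Path(filelist).read_text()
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # blank lines are skipped by this checker
        parts = line.split(",")
        if len(parts) != 2:
            problems.append(f"line {lineno}: expected 2 paths, got {len(parts)}")
            continue
        for path in parts:
            if not Path(path).is_file():
                problems.append(f"line {lineno}: missing file {path}")
    return problems
```

An empty return value means that every line holds exactly two paths and both files exist.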

Frequency-dependent sigma_y

To determine the frequency-dependent sigma_y automatically (see Section 3.5 of our paper), you can use the helper script scripts/estimate_flowdec_params.py. This script also implements the heuristic for a global sigma_y discussed in Appendix A.1 of the paper.

Citation

If you use our models, methods, or any derivatives thereof, please cite our paper:

@inproceedings{
    welker2025flowdec,
    title={{FlowDec}: A flow-based full-band general audio codec with high perceptual quality},
    author={Simon Welker and Matthew Le and Ricky T. Q. Chen and Wei-Ning Hsu and Timo Gerkmann and Alexander Richard and Yi-Chiao Wu},
    booktitle={The Thirteenth International Conference on Learning Representations},
    year={2025},
    url={https://openreview.net/forum?id=uxDFlPGRLX}
}

License

The majority of FlowDec is licensed under CC-BY-NC; however, portions of the project are available under separate license terms: conditional-flow-matching, sgmse, BioinfoMachineLearning, audiotools, and descript-audio-code are licensed under MIT, and NCSN++ is licensed under Apache 2.0.

