Fluency Scorer

Introduction

This is my implementation of a speech fluency assessment model. The idea comes from the paper "An ASR-Free Fluency Scoring Approach with Self-Supervised Learning" (Wei Liu, Kaiqi Fu, Xiaohai Tian, Shuju Shi, Wei Li, Zejun Ma, Tan Lee), presented at ICASSP 2023.

This implementation is unofficial, and there may be bugs that I have missed.

The repo will be completed as soon as possible.

Overview of Model Structure

The main structure of this repo is shown here:

Data

The SpeechOcean762 dataset used in this work is an open dataset licensed under CC BY 4.0. If you have downloaded speechocean762 yourself, fill in your directory path in prep_data/run.sh.

Directions for The Programs

The Input Features and Labels

The input generation programs are in prep_data. Just run the shell script in that directory.

cd prep_data
./run.sh

The labels are the fluency scores in speechocean762.
The acoustic features are extracted with Wav2vec_large, with a feature dimension of 1024.
The feature and label files are collected in data.
The cluster model is trained by train_kmeans.py and saved in exp/kmeans; it is used later in fluency-scoring training.
kmeans_metric.py can be used to inspect the performance of the k-means clustering.
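
To make the clustering step concrete, here is a minimal pure-Python sketch of k-means (Lloyd's algorithm) on toy 2-dimensional vectors. It is not the code in train_kmeans.py: the real features are 1024-dimensional Wav2vec frames, and the function names `assign` and `kmeans`, the toy data, and the first-k initialisation are all assumptions made for illustration.

```python
from math import dist


def assign(frames, centers):
    """Return, for each frame, the index of its nearest center (the cluster ID)."""
    return [min(range(len(centers)), key=lambda k: dist(f, centers[k])) for f in frames]


def kmeans(frames, n_clusters, n_iters=10):
    """Plain Lloyd's algorithm, initialised with the first n_clusters frames."""
    centers = [list(f) for f in frames[:n_clusters]]
    for _ in range(n_iters):
        ids = assign(frames, centers)
        for k in range(n_clusters):
            members = [f for f, i in zip(frames, ids) if i == k]
            if members:  # keep the old center if a cluster becomes empty
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return centers


# Toy "frame features" (assumed data): two well-separated groups of 2-D points.
frames = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0), (1.0, 0.0), (11.0, 10.0)]
centers = kmeans(frames, n_clusters=2)
cluster_ids = assign(frames, centers)
print(cluster_ids)
```

Each frame ends up with one integer cluster ID, which is the kind of per-frame index the cluster_id feature refers to.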

【Note】: Replacing the k-means predictions with forced-alignment results

You can run the following program if you want to use forced-alignment results in place of the cluster IDs.

python3 gen_ctc_force_align.py

If you choose this as the source of the cluster IDs, you need to update run.sh: set **cluster_pred=False**
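
The idea of using alignment output in place of cluster IDs can be sketched as follows: each aligned segment is expanded into one ID per acoustic frame, so the ID sequence lines up with the feature sequence. This is an illustrative sketch only, not the logic of gen_ctc_force_align.py; the segment format `(unit_id, start_sec, end_sec)`, the function name `segments_to_frame_ids`, the padding ID, and the 20 ms frame shift are assumptions.

```python
def segments_to_frame_ids(segments, n_frames, frame_shift=0.02, pad_id=0):
    """Expand (unit_id, start_sec, end_sec) segments into one ID per frame."""
    frame_ids = [pad_id] * n_frames  # frames not covered by any segment keep pad_id
    for unit_id, start, end in segments:
        first = round(start / frame_shift)
        last = min(round(end / frame_shift), n_frames)
        for t in range(first, last):
            frame_ids[t] = unit_id
    return frame_ids


# Hypothetical alignment: unit 5 spans 0.00-0.06 s, unit 9 spans 0.06-0.10 s.
segments = [(5, 0.00, 0.06), (9, 0.06, 0.10)]
frame_ids = segments_to_frame_ids(segments, n_frames=6)
print(frame_ids)
```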

Train Models for Fluency Scorer

Version without the cluster_id feature:

./noclu_run.sh

Version with the cluster_id feature:

./run.sh

Results And Performance

| Models | Utterance fluency PCC |
| --- | --- |
| GOPT (Librispeech) | 0.756 |
| Proposed paper | 0.795 |
| FluScorer+cluster_idx | 0.753 |
| Flu_TFR+cluster_idx | 0.790 |
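
The PCC reported above is the Pearson correlation coefficient between predicted and human-labelled utterance fluency scores. A minimal sketch of that metric is given below; the function name `pearson_cc` and the score values are made up for illustration and are not results or code from this repo.

```python
from math import sqrt


def pearson_cc(predicted, reference):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(predicted, reference))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    return cov / sqrt(var_p * var_r)


# Made-up scores for illustration only.
predicted = [1.0, 2.0, 3.0, 4.0]
reference = [2.0, 4.0, 6.0, 9.0]
pcc = pearson_cc(predicted, reference)
print(round(pcc, 3))
```

A value close to 1 means the predicted scores rank and scale utterances much like the human labels do.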

