This project contains the code for the paper "Towards Building Voice-based Conversational Recommender Systems: Datasets, Potential Solutions, and Prospects". In this project, we provide two voice-based conversational recommender system (VCRS) datasets in the e-commerce and movie domains.
You can download the datasets from Google Drive. The datasets consist of two parts: `coat.tar.gz` and `ml-1m.tar.gz`.
Each data file is an MP3 file whose name follows the pattern `diaidxx_uidxx_iidxx_xx_xx_xx.mp3`.
For example, the file `diaid21_uid249_iid35_20-30_men_251.mp3` breaks down as follows (a parsing sketch is given after the list):
- `diaid21`: corresponds to dialogue 21 in the text-based conversation dataset
- `uid249`: user id is 249
- `iid35`: item id is 35
- `20-30`: user's age is between 20 and 30
- `men`: user's gender is male
- `251`: corresponds to speaker p251 in the VCTK dataset
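For convenience, the filename fields can be recovered programmatically. Below is a minimal sketch; the helper name and regex are ours, not part of the released code:

```python
import re

# Hypothetical helper (not part of the released code): split a dataset
# filename into its six fields.
FILENAME_RE = re.compile(
    r"diaid(?P<diaid>\d+)_uid(?P<uid>\d+)_iid(?P<iid>\d+)"
    r"_(?P<age>[\d-]+)_(?P<gender>[a-z]+)_(?P<speaker>\d+)\.mp3"
)

def parse_filename(name):
    m = FILENAME_RE.fullmatch(name)
    if m is None:
        raise ValueError(f"unexpected filename: {name}")
    return m.groupdict()

print(parse_filename("diaid21_uid249_iid35_20-30_men_251.mp3"))
# -> {'diaid': '21', 'uid': '249', 'iid': '35',
#     'age': '20-30', 'gender': 'men', 'speaker': '251'}
```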
Speaker information for the VCTK dataset can be found here.
Here we provide a demo of a data file (i.e., `diaid21_uid249_iid35_20-30_men_251.mp3`) that contains the text and audio dialogue between the user and the agent.
demov1.mp4
Note that since we currently only explore the impact of speech on VCRSs from the user's perspective, only the user's speech is included in the provided dataset. If you want the complete dialogue audio, you can generate it with the code we provide.
We propose to extract explicit semantic features from the voice data and then incorporate them into the recommendation model in a two-phase fusion manner.
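As an illustration only (the exact architecture is described in the paper and code), a two-phase fusion could look like the following PyTorch sketch, where `voice_feat` stands for the extracted semantic voice features and all dimensions are placeholders:

```python
import torch
import torch.nn as nn

# Hypothetical sketch of two-phase fusion (not the paper's exact model):
# phase 1 fuses voice features into the user representation before scoring;
# phase 2 fuses a voice-conditioned bias into the final prediction.
class TwoPhaseFusionRec(nn.Module):
    def __init__(self, n_users, n_items, emb_dim=32, voice_dim=16):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, emb_dim)
        self.item_emb = nn.Embedding(n_items, emb_dim)
        self.voice_proj = nn.Linear(voice_dim, emb_dim)  # phase 1: feature-level fusion
        self.late_head = nn.Linear(voice_dim, 1)         # phase 2: score-level fusion

    def forward(self, uid, iid, voice_feat):
        u = self.user_emb(uid) + self.voice_proj(voice_feat)   # phase 1
        score = (u * self.item_emb(iid)).sum(-1)
        return score + self.late_head(voice_feat).squeeze(-1)  # phase 2

model = TwoPhaseFusionRec(n_users=290, n_items=300)
uid, iid = torch.tensor([249]), torch.tensor([35])
voice = torch.randn(1, 16)  # placeholder for extracted semantic voice features
print(model(uid, iid, voice))
```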
Please refer to here for how to run the code.
Our VCRSs dataset creation task includes four steps: (1) backbone dataset selection; (2) text-based conversation generation; (3) voice-based conversation generation; and (4) quality evaluation.
We choose Coat and ML-1M as our backbone datasets. We use user-item interactions and item features to simulate a text-based conversation between the user and the agent for recommendation, and use user features to assign proper speakers.
Please refer to here for how to generate the text-based conversation; the code is in the `./Dialogue/` directory. A simplified simulation sketch follows below.
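To give a feel for step (2), a heavily simplified, template-based simulation (our own illustration; the actual generation logic lives in `./Dialogue/`) might look like:

```python
# Illustrative only: turn one user-item interaction plus item features
# into a short templated recommendation dialogue.
def simulate_dialogue(user_id, item, history):
    turns = [
        ("user", f"Hi! I'm looking for something like {history[-1]}."),
        ("agent", f"Do you prefer the {item['category']} style?"),
        ("user", "Yes, that sounds good."),
        ("agent", f"Then I would recommend {item['name']} for you."),
    ]
    return {"uid": user_id, "turns": turns}

item = {"name": "item 35", "category": "casual"}
dialogue = simulate_dialogue(249, item, history=["item 12"])
for speaker, text in dialogue["turns"]:
    print(f"{speaker}: {text}")
```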
Please refer to here for how to generate the voice-based conversation.
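Step (3) maps each user to a plausible VCTK speaker by age and gender before synthesis. The speaker pool below is a hypothetical sketch (the real speaker table and the VITS synthesis call are in the repository):

```python
# Hypothetical speaker pool: VCTK speaker IDs bucketed by (gender, age band).
# Only the ("men", "20-30") -> p251 pairing is taken from the example filename;
# the other entries are placeholders.
SPEAKER_POOL = {
    ("men", "20-30"): ["251", "252"],
    ("women", "20-30"): ["231", "233"],
}

def assign_speaker(gender, age_band, uid):
    pool = SPEAKER_POOL[(gender, age_band)]
    return pool[uid % len(pool)]  # deterministic choice per user

speaker = assign_speaker("men", "20-30", uid=249)
print(f"user 249 -> VCTK speaker p{speaker}")
# The chosen speaker ID would then be passed to the TTS model (VITS)
# to synthesize each user utterance.
```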
We adopt the fine-grained evaluation of dialogue (FED) metric to measure the quality of the generated text-based conversation.
```bash
pip install -r requirements.txt
cd ./Evaluate/
python evaluate.py --dataset='xxx'
```

where `xxx` is `coat` or `ml-1m`. All results are saved in the `./res/` directory.
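For intuition, FED scores a dialogue by how likely a pre-trained dialogue model (DialoGPT in the original FED paper) finds hand-written positive versus negative follow-up utterances. A minimal probe along those lines (not the exact script in `./Evaluate/`, and simplified to score the whole sequence) could be:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

def follow_up_loss(context, follow_up):
    # Average NLL of context + canned follow-up under DialoGPT
    # (a simplification of FED's per-follow-up scoring).
    ids = tok.encode(context + tok.eos_token + follow_up + tok.eos_token,
                     return_tensors="pt")
    with torch.no_grad():
        return model(ids, labels=ids).loss.item()

# FED-style probe: a good dialogue should make the positive follow-up
# more likely (lower loss) than the negative one.
ctx = "Hi! Can you recommend a sci-fi movie for me?"
pos = "Wow that is really interesting."
neg = "That is really boring."
print(follow_up_loss(ctx, pos) < follow_up_loss(ctx, neg))
```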
- Convert text to audio using VITS, a SOTA end-to-end text-to-speech (TTS) model.
- Improve code efficiency with conv_rec_sys.
- Evaluate text-based conversations with FED.