SpeechCollector

SpeechCollector is an open-source full-stack web app for coordinated collection of paired text-speech datasets. It streamlines the process of collecting voice recordings from multiple users by automatically partitioning and allocating text samples, managing recording sessions, and organizing the resulting paired data.

SpeechCollector's tech stack and supported features are as follows:

Frontend: React (Vite)
Backend: Express.js
Database: PostgreSQL
Storage Options:
- Local
- AWS S3
- Cloudflare R2
Authentication: Basic user management where an admin sets plaintext passwords for new users.

Installation & Configuration

After cloning the repository, install all dependencies using the pnpm package manager:

cd speech-collector
pnpm i

Required: Database Setup

SpeechCollector requires a PostgreSQL database to store application data including:

- User accounts and credentials
- Topic and task information
- Recording metadata and relationships
- User submission tracking

📘 First-time PostgreSQL user?
See our Database Setup Guide for detailed installation and setup instructions.

Configure Database Credentials
Create or edit the .env file in your project root:

PG_HOST=localhost
PG_PORT=5432
PG_DATABASE=speechcollector
PG_USER=yourusername
PG_PASSWORD=yourpassword
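
As a rough illustration of how these variables might be consumed, the sketch below maps them to a connection settings object and fails early when one is missing. The helper name buildPgConfig is hypothetical and is not taken from the SpeechCollector codebase; the returned field names (host, port, database, user, password) follow the shape accepted by the node-postgres client, which is assumed here rather than confirmed as the project's driver.

```javascript
// Hypothetical helper (not part of SpeechCollector): maps the PG_* variables
// from .env to a connection settings object.
function buildPgConfig(env) {
  const required = ["PG_HOST", "PG_PORT", "PG_DATABASE", "PG_USER", "PG_PASSWORD"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing database settings: ${missing.join(", ")}`);
  }

  // Environment variables are always strings, so the port is parsed explicitly.
  const port = Number.parseInt(env.PG_PORT, 10);
  if (Number.isNaN(port)) {
    throw new Error(`PG_PORT must be a number, got "${env.PG_PORT}"`);
  }

  return {
    host: env.PG_HOST,
    port,
    database: env.PG_DATABASE,
    user: env.PG_USER,
    password: env.PG_PASSWORD,
  };
}

// Example using the sample values from the .env snippet above.
const pgConfig = buildPgConfig({
  PG_HOST: "localhost",
  PG_PORT: "5432",
  PG_DATABASE: "speechcollector",
  PG_USER: "yourusername",
  PG_PASSWORD: "yourpassword",
});
console.log(pgConfig.host, pgConfig.port);
```

In a running app, process.env would be passed in place of the literal object.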

Initialize Database Tables
For testing with the Spoken-SQuAD dataset:

node scripts/spoken-squad/push.js scripts/spoken-squad/spoken_train-v1.1.json

This script creates the necessary tables and populates them with the training partition of the Spoken-SQuAD dataset. To support other database structures, we provide a Task Provider class that you can edit.

Audio File Storage Options & Setup

SpeechCollector offers three options for storing audio files: Local, AWS S3, and Cloudflare R2. Edit the .env file according to your storage choice, as described below.

Option 1 - Local Storage: Saves collected sound files to a user-defined local folder.

- Set the STORAGE environment variable to local
- Set the SOUND_RECORDINGS_PATH environment variable to the desired local path

Option 2 - AWS S3 Storage: Saves collected sound files to an AWS S3 bucket. This option requires an AWS account, configured AWS credentials, and an initialized public or private AWS S3 bucket.

- Set the STORAGE environment variable to aws-s3
- Set the AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION, and AWS_BUCKET_NAME environment variables

Option 3 - R2 Storage: Saves collected sound files to a Cloudflare R2 bucket. This option requires a Cloudflare account, configured R2 API credentials (an access key ID and secret access key), and an initialized public or private Cloudflare R2 bucket.

- Set the STORAGE environment variable to r2
- Set the CF_R2_ACCESS_KEY_ID, CF_R2_SECRET_ACCESS_KEY, CF_R2_ENDPOINT, and CF_R2_BUCKET_NAME environment variables
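
The three options above can be summarized as a small lookup from the STORAGE value to the variables it needs. The sketch below is only an illustration of that mapping: the variable names come from this README, while the constant REQUIRED_BY_STORAGE and the function missingStorageVars are hypothetical and not part of SpeechCollector.

```javascript
// Variables each STORAGE value needs, as listed in the three options above.
const REQUIRED_BY_STORAGE = {
  local: ["SOUND_RECORDINGS_PATH"],
  "aws-s3": ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION", "AWS_BUCKET_NAME"],
  r2: ["CF_R2_ACCESS_KEY_ID", "CF_R2_SECRET_ACCESS_KEY", "CF_R2_ENDPOINT", "CF_R2_BUCKET_NAME"],
};

// Hypothetical check: returns the names of required variables that are unset
// for the chosen storage backend, or throws if STORAGE itself is invalid.
function missingStorageVars(env) {
  if (!Object.hasOwn(REQUIRED_BY_STORAGE, String(env.STORAGE))) {
    const allowed = Object.keys(REQUIRED_BY_STORAGE).join(", ");
    throw new Error(`STORAGE must be one of: ${allowed}`);
  }
  return REQUIRED_BY_STORAGE[env.STORAGE].filter((name) => !env[name]);
}

// A complete local configuration has nothing missing.
console.log(missingStorageVars({ STORAGE: "local", SOUND_RECORDINGS_PATH: "./recordings" }));

// An incomplete R2 configuration reports the variables still to be set.
console.log(missingStorageVars({ STORAGE: "r2", CF_R2_BUCKET_NAME: "my-bucket" }));
```

Running a check like this at startup, with process.env as the argument, would surface a misconfigured backend before any recording is uploaded.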

Run SpeechCollector

Ensure your environment variables are set and the database is populated. Run the development server with:

pnpm dev

To build the project for production:

pnpm build

To run the build:

pnpm serve

Both pnpm dev and pnpm serve start the backend on port 8000 and the frontend on port 5173. You can access the app at http://localhost:5173.

User Management & Onboarding Flow

1. Adding New Users

Before users can access the system, an admin must first create their account using one of these methods:

Option A - Using PostgreSQL (Before Starting the App):

INSERT INTO users (username, password) 
VALUES ('newuser', 'userpassword');

Option B - Using API Endpoint (Once App is Running):

curl -X POST http://localhost:8000/api/add-user \
-H "Content-Type: application/json" \
-d '{
  "username": "newuser",
  "password": "userpassword"
}'
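
For admins who script account creation, the same request can be issued from Node.js. The sketch below mirrors the curl command above (same URL, method, header, and JSON body). The helper names buildAddUserRequest and addUser are hypothetical, and because the endpoint's response format is not documented here, the raw response is returned without parsing. It assumes Node.js 18 or later, where fetch is available globally.

```javascript
// Hypothetical helper: builds the same POST request as the curl command above.
function buildAddUserRequest(baseUrl, username, password) {
  return {
    url: new URL("/api/add-user", baseUrl).toString(),
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ username, password }),
    },
  };
}

// Sends the request using the global fetch (Node.js 18+).
// The response body format is not documented, so the raw response is returned.
async function addUser(baseUrl, username, password) {
  const { url, options } = buildAddUserRequest(baseUrl, username, password);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`add-user failed with HTTP ${response.status}`);
  }
  return response;
}

// Building the request does not touch the network, so it is safe to inspect.
const request = buildAddUserRequest("http://localhost:8000", "newuser", "userpassword");
console.log(request.url);
```

With the backend running, calling addUser("http://localhost:8000", "newuser", "userpassword") would send the request.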

2. User First Login & Metadata Collection

When a user first logs in with their assigned username and password, they will be presented with a metadata collection form requesting:

- Name
- Last name
- Email
- Gender
- Birth date
- Other configurable fields

The metadata form fields, both required and optional, can be customized through the infoFormConfig.json file.

3. Regular Usage

After completing the metadata form, users can:

- Access their assigned tasks
- Record audio for their tasks
- Submit recordings
- Continue until they reach their topic limit (if MAX_TOPICS_PER_USER is set) or until no tasks remain
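
The stopping condition in the last item can be sketched as a small predicate. This is an illustration of the rule as described above, not the app's actual TaskProvider logic: the function canStartNewTopic and its parameters are hypothetical, and treating an unset or empty MAX_TOPICS_PER_USER as "no limit" is an assumption.

```javascript
// Hypothetical rule: a user may start another topic while tasks remain and
// the optional per-user topic limit has not been reached.
function canStartNewTopic(topicsCompleted, remainingTasks, maxTopicsPerUser) {
  if (remainingTasks <= 0) {
    return false; // nothing left to record
  }
  // Assumption: an unset limit (undefined or empty string) means no cap.
  if (maxTopicsPerUser === undefined || maxTopicsPerUser === "") {
    return true;
  }
  const limit = Number(maxTopicsPerUser);
  if (!Number.isInteger(limit) || limit < 0) {
    throw new Error(`MAX_TOPICS_PER_USER must be a non-negative integer, got "${maxTopicsPerUser}"`);
  }
  return topicsCompleted < limit;
}

// Environment values are strings, so the limit is passed through as-is.
console.log(canStartNewTopic(2, 10, "3"));
console.log(canStartNewTopic(3, 10, "3"));
console.log(canStartNewTopic(3, 10, undefined));
```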

Database Structure and Task Management Logic

SpeechCollector uses a PostgreSQL database with tables for Users, Topics, and Tasks. The app includes a TaskProvider system that can be customized for different dataset structures.

📘 Want to customize the database structure?
See our Database Structure Guide for detailed information about the schema and how to adapt it for your needs.

License

This project is licensed under the MIT license.

From neuralwork with ❤️
