Provides speech audio segmentation and synchronisation for audio transcriptions

Alveo/alveo-transcriber-services

alveo-transcriber-services

This repository provides additional services for the alveo-transcriber web application: transcription storage, SSAD audio segmentation and access to ASR engines.

Config

  1. If you deploy this service anywhere other than a local address, generate an SSL certificate for it so that browsers do not raise mixed content errors.
  2. Build the transcriber with an environment.ts whose ALVEO_SERVICES_URL points to this service.

Running

  1. Install the requirements with pip; using a Python virtual environment is recommended.
  2. Optionally enable debug mode with export FLASK_DEBUG=1; otherwise you must set a DATABASE_URI environment variable (see application/config.py).
  3. For ASR support, consider setting up Google Cloud credentials; see ./application/config.py for more information.
  4. If the database hasn't been initialised yet, initialise it with flask app init_db
  5. Start the application with export FLASK_APP=application && python -m flask run
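
Step 2 implies that debug mode does not need DATABASE_URI, while a non-debug run does. A minimal sketch of that kind of fallback logic is shown below. It is not the contents of application/config.py: the function name resolve_database_uri and the SQLite fallback path are assumptions made purely for illustration.

```python
import os


def resolve_database_uri(environ=None):
    """Pick a database URI from environment variables (hypothetical helper)."""
    environ = os.environ if environ is None else environ

    # An explicitly configured URI always wins.
    uri = environ.get("DATABASE_URI")
    if uri:
        return uri

    # Assumption: debug mode falls back to a local SQLite file.
    if environ.get("FLASK_DEBUG") == "1":
        return "sqlite:///debug.db"

    # Outside debug mode there is nothing sensible to fall back to.
    raise RuntimeError("Set DATABASE_URI, or enable debug with FLASK_DEBUG=1")
```

Failing early with a clear message, rather than silently picking a default, makes a missing variable obvious at startup instead of at the first database query.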

Unit tests

Set the environment variables for the modules you want to test. Tests for unconfigured modules are skipped.

  • Alveo: export ALVEO_API_KEY=<YOUR ALVEO API KEY>

When ready, run the unit tests with python tests.py
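
Skipping tests for an unconfigured module can be done with the standard unittest skip decorators. The sketch below illustrates the idea and is not taken from tests.py; the class name AlveoModuleTests and its test method are hypothetical.

```python
import os
import unittest

# Read the key once; None means the Alveo module is unconfigured.
ALVEO_API_KEY = os.environ.get("ALVEO_API_KEY")


@unittest.skipUnless(ALVEO_API_KEY, "ALVEO_API_KEY is not set")
class AlveoModuleTests(unittest.TestCase):  # hypothetical test case
    def test_api_key_is_configured(self):
        # Placeholder check; real tests would exercise the Alveo handlers.
        self.assertTrue(ALVEO_API_KEY)
```

With the variable unset, the test runner reports the whole class as skipped rather than failed, which matches the behaviour described above.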

Examples

See examples.

Writing a module

transcriber-services is intended to be as modular as possible. To support a service of your choosing, you write handlers for it. The included Alveo module demonstrates how to register the authentication, storage and segmentation handlers. To enable or disable a module, edit its entry in DOMAIN_HANDLERS in the config file.
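
The registration pattern can be pictured roughly as follows. This is a sketch under stated assumptions, not the project's API: the DomainHandler class, its register and get methods, and the shape of DOMAIN_HANDLERS shown here are all hypothetical. Refer to the Alveo module and the config file for the real interface.

```python
class DomainHandler:
    """Collects the callables a module provides for one domain (hypothetical)."""

    # The three handler kinds named in this README.
    KINDS = ("authentication", "storage", "segmentation")

    def __init__(self, domain):
        self.domain = domain
        self._handlers = {}

    def register(self, kind, func):
        # Reject kinds the service would not know how to call.
        if kind not in self.KINDS:
            raise ValueError(f"unknown handler kind: {kind}")
        self._handlers[kind] = func

    def get(self, kind):
        return self._handlers[kind]


# Assumed config shape: domain name -> handler; remove an entry to disable it.
alveo = DomainHandler("alveo")
alveo.register("authentication", lambda api_key: bool(api_key))
DOMAIN_HANDLERS = {"alveo": alveo}
```

Keeping the registry as plain data means a module can be switched off by deleting one entry, without touching the code that dispatches requests.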

Deployment with Dokku

The application is deployed using dokku. The following configuration is required on the dokku host:

$ dokku apps:create segmenter

Be sure to set up a database so that Dokku provides the DATABASE_URL environment variable. Supported types are sqlite3, Postgres, MariaDB and MySQL.
Now you can push the repository to the dokku host using git:

$ git remote add dokku dokku@<YOUR DOKKU HOST>:segmenter
$ git push dokku master

Optionally, add and set a domain for the application. Do this before creating an SSL certificate. For example:

$ dokku domains:add segmenter segmenter.apps.alveo.edu.au
$ dokku domains:set segmenter segmenter.apps.alveo.edu.au

Pushing the repository should build the environment and start the application. Next, set up an SSL certificate on the dokku host:

$ dokku letsencrypt segmenter

Finally, build the database if it hasn't been built already.

$ dokku run segmenter flask app init_db
