Web Voice Processor

Made in Vancouver, Canada by Picovoice

A library for real-time voice processing in web browsers.

Browser compatibility

All modern browsers (Chrome/Edge/Opera, Firefox, Safari) are supported, including on mobile. Internet Explorer is not supported.

Using the Web Audio API requires a secure context (an HTTPS connection). The one exception is localhost, which is allowed for local development.

This library includes a utility function, browserCompatibilityCheck, which performs feature detection on the current browser and returns an object indicating the browser's capabilities.

ESM:

import { browserCompatibilityCheck } from '@picovoice/web-voice-processor';
browserCompatibilityCheck();

IIFE:

window.WebVoiceProcessor.browserCompatibilityCheck();

Browser features

  • '_picovoice': whether all Picovoice requirements are met
  • 'AudioWorklet': not currently used; intended for future use
  • 'isSecureContext': required for microphone permission on non-localhost origins
  • 'mediaDevices': basis for microphone enumeration and access
  • 'WebAssembly': required for all Picovoice engines
  • 'webKitGetUserMedia': legacy predecessor to getUserMedia
  • 'Worker': required for downsampling and for all engine processing
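As a rough illustration of how the returned object might be consumed, the sketch below reports which features are unavailable. The helper name missingFeatures and the REQUIRED_FEATURES list are assumptions made for this example, not part of the library; the '_picovoice' key already summarizes whether all Picovoice requirements are met.

```javascript
// Hypothetical helper, not part of the library.
// REQUIRED_FEATURES is an assumption drawn from the feature list above.
const REQUIRED_FEATURES = ['mediaDevices', 'WebAssembly', 'Worker'];

// Returns the names of required features that the capability object
// reports as missing or false.
function missingFeatures(capabilities) {
  return REQUIRED_FEATURES.filter((name) => !capabilities[name]);
}

// Hand-written capability object for illustration; in a page, pass the
// result of browserCompatibilityCheck() instead.
const example = {
  _picovoice: false,
  mediaDevices: true,
  WebAssembly: true,
  Worker: false,
};
console.log(missingFeatures(example)); // reports 'Worker' as missing
```

Listing the missing features by name, rather than checking only '_picovoice', makes it easier to show the user a specific reason when the browser is unsupported.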

AudioWorklet & Safari

This library does not use the modern AudioWorklet due to lack of support in Safari and Safari Mobile.

Installation

npm install @picovoice/web-voice-processor

(or)

yarn add @picovoice/web-voice-processor

How to use

Via ES Modules (Create React App, Angular, Webpack, etc.)

import { WebVoiceProcessor } from '@picovoice/web-voice-processor';

Via HTML script tag

Add the following to your HTML:

<script src="@picovoice/web-voice-processor/dist/iife/index.js"></script>

The IIFE version of the library adds WebVoiceProcessor to the window global scope.

Start listening

Start the WebVoiceProcessor with its async static factory method, init:

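let engines = []; // list of voice processing web workers (see below)
// IIFE form shown: the class is reached through the window.WebVoiceProcessor global.
// With the ESM import above, call WebVoiceProcessor.init(...) directly.
let handle = await WebVoiceProcessor.WebVoiceProcessor.init({
  engines: engines,
});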

init is async because it requests microphone access through the Web Audio API. The promise is rejected if the user refuses permission, if no suitable device is found, and so on, so your calling code should handle rejection. When the promise resolves, the WebVoiceProcessor instance is ready.
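One way to handle rejection is to wrap the call in try/catch. The sketch below is a minimal illustration: tryInit is a hypothetical helper, and its init parameter stands in for the library's init factory so that the example runs on its own.

```javascript
// Hypothetical wrapper, not part of the library. `init` is any function
// that takes an options object and returns a promise for a handle.
async function tryInit(init, engines) {
  try {
    const handle = await init({ engines });
    return { handle, error: null };
  } catch (error) {
    // Reached when the promise rejects, for example when microphone
    // permission is refused or no suitable device is found.
    return { handle: null, error };
  }
}
```

In a page, you would pass a function that forwards its options to the library's init method, then show a message to the user when error is not null.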

engines is an array of voice processing Web Workers implementing the following interface within their onmessage method:

onmessage = function (e) {
    switch (e.data.command) {

        ...

        case 'process':
            process(e.data.inputFrame);
            break;

        ...

    }
};

where e.data.inputFrame is an Int16Array of 512 audio samples.
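To make the interface concrete, here is a minimal sketch of an engine that computes the root-mean-square amplitude of each frame. It is not one of the Picovoice engines: frameRms, handleMessage, and the { command: 'rms', value } reply shape are all hypothetical, and only the 'process' command is handled.

```javascript
// Root-mean-square amplitude of one frame of 16-bit samples.
function frameRms(inputFrame) {
  let sumSquares = 0;
  for (let i = 0; i < inputFrame.length; i++) {
    sumSquares += inputFrame[i] * inputFrame[i];
  }
  return Math.sqrt(sumSquares / inputFrame.length);
}

// Message handler following the interface above. Commands other than
// 'process' are outside the scope of this sketch and are ignored.
function handleMessage(e) {
  switch (e.data.command) {
    case 'process':
      return frameRms(e.data.inputFrame);
    default:
      return undefined;
  }
}

// Inside a Web Worker, install the handler and post results to the page.
// The reply message shape is hypothetical.
if (typeof WorkerGlobalScope !== 'undefined' && self instanceof WorkerGlobalScope) {
  self.onmessage = (e) => {
    const value = handleMessage(e);
    if (value !== undefined) {
      self.postMessage({ command: 'rms', value });
    }
  };
}
```

Keeping the per-frame computation in a plain function (frameRms) separate from the message plumbing makes it possible to exercise the logic outside a worker.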

To initialize a new WebVoiceProcessor without immediately starting to listen, include start: false in the options object passed to init, then call start() on the instance when ready.

const handle = await WebVoiceProcessor.WebVoiceProcessor.init({
  engines: engines,
  start: false,
});
handle.start();

Stop listening

Pause and resume processing (the microphone and Web Audio context remain active):

handle.pause();
handle.resume();

Close the microphone MediaStream and release resources:

await handle.release();

This method is async because it closes the AudioContext internally.
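Because release is async, it can be convenient to guarantee that it runs even when later code throws. The following sketch shows one possible pattern; withHandle is a hypothetical helper, and it assumes only that the handle exposes an async release method as described above.

```javascript
// Hypothetical helper, not part of the library: runs `work` with the
// handle and always releases the handle afterwards, even if `work` throws.
async function withHandle(handle, work) {
  try {
    return await work(handle);
  } finally {
    // Waits for the microphone and audio resources to be released.
    await handle.release();
  }
}
```

The finally block ensures the microphone is not left open when an error interrupts the calling code.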

Build from source

Use yarn or npm to build WebVoiceProcessor:

yarn
yarn build

(or)

npm install
npm run-script build

The build script outputs minified and non-minified versions of the IIFE and ESM formats to the dist folder. It also outputs the TypeScript type definitions.