Made in Vancouver, Canada by Picovoice
A library for real-time voice processing in web browsers.
- Uses the Web Audio API to access microphone audio.
- Leverages Web Workers to move compute-intensive tasks off the main thread.
- Converts the microphone sampling rate to 16 kHz, the de facto standard for voice processing engines.
- Provides a flexible interface to pass in arbitrary voice processing workers.
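To make the 16 kHz conversion concrete, the arithmetic can be sketched as below. The helper names are hypothetical and not part of the library; the 512-sample frame length is the one this document gives for the worker interface. The sketch only shows the sample-count and timing arithmetic, not how the library resamples.

```javascript
// Hypothetical helpers, not part of @picovoice/web-voice-processor.

// Duration of one frame in milliseconds at the given sample rate.
function frameDurationMs(frameLength, sampleRate) {
  return (frameLength * 1000) / sampleRate;
}

// Number of microphone samples that correspond to one output frame
// when converting from inputRate to outputRate.
function inputSamplesPerFrame(inputRate, outputRate, frameLength) {
  return (frameLength * inputRate) / outputRate;
}

// A 512-sample frame at 16 kHz covers 32 ms of audio.
const durationMs = frameDurationMs(512, 16000);

// With a 48 kHz microphone, 1536 input samples map to one 512-sample frame.
const samplesAt48k = inputSamplesPerFrame(48000, 16000, 512);
```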
All modern browsers (Chrome/Edge/Opera, Firefox, Safari) are supported, including on mobile. Internet Explorer is not supported.
Using the Web Audio API to access the microphone requires a secure context (HTTPS connection), with the exception of localhost for local development.
This library includes the utility function browserCompatibilityCheck, which performs feature detection on the current browser and returns an object indicating browser capabilities.
ESM:
import { browserCompatibilityCheck } from '@picovoice/web-voice-processor';
browserCompatibilityCheck();
IIFE:
window.WebVoiceProcessor.browserCompatibilityCheck();
The returned object reports the following browser features:
- '_picovoice' : whether all Picovoice requirements are met
- 'AudioWorklet' (not currently used; intended for the future)
- 'isSecureContext' (required for microphone permission for non-localhost)
- 'mediaDevices' (basis for microphone enumeration / access)
- 'WebAssembly' (required for all Picovoice engines)
- 'webKitGetUserMedia' (legacy predecessor to getUserMedia)
- 'Worker' (required for downsampling and for all engine processing)
This library does not use the modern AudioWorklet due to lack of support in Safari and Safari Mobile.
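The capability object can be turned into a go/no-go decision before requesting the microphone. The following is a minimal sketch: describeCompatibility is a hypothetical helper, and it assumes the object is a plain map from the feature names listed above to booleans. The choice of which keys to report as missing follows the "required" notes in the list and is an assumption, not the library's own logic.

```javascript
// Hypothetical helper, not part of @picovoice/web-voice-processor.
// Assumes `capabilities` maps the feature names listed above to booleans.
function describeCompatibility(capabilities) {
  // Features the list above describes as required or as the basis for access.
  const required = ['isSecureContext', 'mediaDevices', 'WebAssembly', 'Worker'];
  const missing = required.filter((key) => !capabilities[key]);
  return {
    // '_picovoice' summarises whether all Picovoice requirements are met.
    supported: Boolean(capabilities._picovoice),
    missing: missing,
  };
}

// In a browser this would be fed the result of browserCompatibilityCheck();
// a hand-written object is used here so the sketch runs anywhere.
const report = describeCompatibility({
  _picovoice: false,
  AudioWorklet: true,
  isSecureContext: true,
  mediaDevices: true,
  WebAssembly: true,
  webKitGetUserMedia: false,
  Worker: false,
});
// report.supported is false and report.missing is ['Worker'].
```

Listing the missing features separately makes it possible to show the user a specific message rather than a generic "unsupported browser" notice.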
npm install @picovoice/web-voice-processor
(or)
yarn add @picovoice/web-voice-processor
import { WebVoiceProcessor } from '@picovoice/web-voice-processor';
Add the following to your HTML:
<script src="@picovoice/web-voice-processor/dist/iife/index.js"></script>
The IIFE version of the library adds WebVoiceProcessor to the window global scope.
Start up the WebVoiceProcessor with the init async static factory method:
let engines = []; // list of voice processing web workers (see below)
let handle = await WebVoiceProcessor.WebVoiceProcessor.init({
  engines: engines,
});
This is async due to its Web Audio API microphone request. The promise will be rejected if the user refuses permission, no suitable devices are found, etc. Your calling code should anticipate the possibility of rejection. When the promise resolves, the WebVoiceProcessor instance is ready.
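One way to anticipate rejection is to translate the error into a message for the user. The sketch below assumes that the rejection value is the error raised by the browser's getUserMedia call, whose standard names include NotAllowedError (permission refused) and NotFoundError (no matching device); whether this library passes that error through unchanged is an assumption. describeInitError is a hypothetical helper.

```javascript
// Hypothetical helper, not part of @picovoice/web-voice-processor.
// Maps a rejection from init() to a short user-facing message.
function describeInitError(error) {
  const name = error && error.name;
  switch (name) {
    case 'NotAllowedError':
      return 'Microphone permission was denied.';
    case 'NotFoundError':
      return 'No microphone was found.';
    default: {
      // Fall back to whatever detail the error carries.
      const detail = error && error.message ? error.message : String(error);
      return 'Voice processing could not start: ' + detail;
    }
  }
}

// Intended use in a browser (not executed here):
//   WebVoiceProcessor.init({ engines: engines })
//     .then((handle) => { /* ready */ })
//     .catch((error) => { console.error(describeInitError(error)); });
```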
engines is an array of voice processing Web Workers implementing the following interface within their onmessage method:
onmessage = function (e) {
  switch (e.data.command) {
    ...
    case 'process':
      process(e.data.inputFrame);
      break;
    ...
  }
};
where e.data.inputFrame is an Int16Array of 512 audio samples.
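As an illustration of the interface above, here is a minimal sketch of a worker that measures the loudness of each frame instead of running a voice engine. Only the 'process' command and the inputFrame field come from this document; the frameRms helper and the 'level' reply message are hypothetical.

```javascript
// Root-mean-square level of one frame, normalised to the range 0..1.
// Int16 samples span -32768..32767, so dividing by 32768 normalises them.
function frameRms(inputFrame) {
  let sumSquares = 0;
  for (const sample of inputFrame) {
    const normalised = sample / 32768;
    sumSquares += normalised * normalised;
  }
  return Math.sqrt(sumSquares / inputFrame.length);
}

// Inside a Web Worker, globalThis is the worker's global scope.
globalThis.onmessage = function (e) {
  switch (e.data.command) {
    case 'process':
      // 'level' is a hypothetical reply format chosen for this sketch.
      postMessage({ command: 'level', rms: frameRms(e.data.inputFrame) });
      break;
    default:
      // Commands this sketch does not handle are ignored.
      break;
  }
};
```

A worker like this could be passed in the engines array alongside real engines, for example to drive a microphone level meter.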
If you wish to initialize a new WebVoiceProcessor without immediately starting to listen, include start: false in the init options object argument; then call start() on the instance when ready.
const handle = await WebVoiceProcessor.WebVoiceProcessor.init({
  engines: engines,
  start: false,
});
handle.start();
Pause/Resume processing (microphone and Web Audio context will still be active):
handle.pause();
handle.resume();
Close the microphone MediaStream and release resources:
handle.release();
This method is async because it closes the AudioContext internally.
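Because release() is async, cleanup code may be triggered from more than one place (for example a stop button and a page-unload handler). A minimal sketch of a guard that triggers only one release is shown below; makeReleaser is a hypothetical helper, and it assumes only that the handle exposes the release() method described above.

```javascript
// Hypothetical helper, not part of @picovoice/web-voice-processor.
// Wraps a handle so that release() is invoked at most once; every caller
// receives the same promise and can await it.
function makeReleaser(handle) {
  let pending = null;
  return function releaseOnce() {
    if (pending === null) {
      pending = Promise.resolve(handle.release());
    }
    return pending;
  };
}

// Intended use in a browser (not executed here):
//   const releaseOnce = makeReleaser(handle);
//   stopButton.addEventListener('click', () => releaseOnce());
```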
Use yarn or npm to build WebVoiceProcessor:
yarn
yarn build
(or)
npm install
npm run-script build
The build script outputs minified and non-minified versions of the IIFE and ESM formats to the dist folder. It also outputs the TypeScript type definitions.