Initial commit
chiel committed Aug 3, 2018
0 parents commit 98398e9
Showing 51 changed files with 9,405 additions and 0 deletions.
5 changes: 5 additions & 0 deletions .gitignore
@@ -0,0 +1,5 @@
.idea/
build/
node_modules/
npm-debug.log

1 change: 1 addition & 0 deletions .npmignore
@@ -0,0 +1 @@
examples/
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2018 Snirpo

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
123 changes: 123 additions & 0 deletions README.md
@@ -0,0 +1,123 @@
# node-vad

This is a stripped-down version of the [voixen-vad](https://github.com/voixen/voixen-vad) library. Thank you very much!

WebRTC-based Voice Activity Detection library

Voice Activity Detection based on the method used in the upcoming [WebRTC](http://www.webrtc.org) HTML5 standard.
Extracted from [Chromium](https://chromium.googlesource.com/external/webrtc/+/branch-heads/43/webrtc/common_audio/vad/) for
stand-alone use as a library.

Supported sample rates are:
- 8000Hz
- 16000Hz*
- 32000Hz
- 48000Hz

*recommended sample rate for best performance/accuracy tradeoff

## Installation

## API

#### new VAD(mode)

Create a new `VAD` object using the given mode.

#### .processAudio(samples, samplerate, callback)

Analyse the given samples (`Buffer` object containing 16-bit signed little-endian values) and report the detected voice
event via `callback` and event.

#### .processAudioFloat(samples, samplerate, callback)

Analyse the given samples (`Buffer` object containing 32-bit normalized float values) and report the detected voice
event via `callback` and event.
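
As a rough sketch of what "normalized float" means here: `processAudio` converts 16-bit input by dividing each sample by 32768 before handing it to the native code, so `processAudioFloat` presumably expects values in roughly the -1.0 to 1.0 range. The helper name `int16ToFloat32` below is hypothetical and not part of this library.

```javascript
// Hypothetical helper (not part of node-vad): converts 16-bit signed
// little-endian PCM into 32-bit little-endian floats scaled by 1/32768.
function int16ToFloat32(pcm) {
    const sampleCount = Math.floor(pcm.length / 2); // ignore a trailing odd byte
    const out = Buffer.alloc(sampleCount * 4);      // 4 bytes per float sample
    for (let i = 0; i < sampleCount; i++) {
        out.writeFloatLE(pcm.readInt16LE(i * 2) / 32768, i * 4);
    }
    return out;
}

// Two samples: half of full scale, and the most negative 16-bit value.
const pcm = Buffer.alloc(4);
pcm.writeInt16LE(16384, 0);
pcm.writeInt16LE(-32768, 2);

const floats = int16ToFloat32(pcm);
console.log(floats.readFloatLE(0), floats.readFloatLE(4)); // 0.5 -1
```

A buffer produced this way could then be passed to `processAudioFloat` in place of the raw 16-bit data.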

### Event codes

Event codes are passed to the `processAudio` callback and to event handlers subscribed to the general
'event'-event.

#### VAD.Event.EVENT_ERROR

Constant for voice detection errors. Passed to 'error' event handlers.

#### VAD.Event.EVENT_SILENCE

Constant for voice detection results with no detected voices.
Passed to 'silence' event handlers.

#### VAD.Event.EVENT_VOICE

Constant for voice detection results with detected voice.
Passed to 'voice' event handlers.

#### VAD.Event.EVENT_NOISE

Constant for voice detection results with detected noise.
Not implemented yet
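
For logging, the numeric event codes above can be mapped back to their names. The sketch below copies the values from `VAD.Event` in `lib/vad.js` so that it runs without the native addon; the helper `eventName` is hypothetical and not part of this library.

```javascript
// Values mirror VAD.Event in lib/vad.js; copied here so the sketch
// runs on its own. With the library installed, use VAD.Event instead.
const Event = Object.freeze({
    EVENT_ERROR: -1,
    EVENT_SILENCE: 0,
    EVENT_VOICE: 1,
    EVENT_NOISE: 2
});

// Hypothetical helper: reverse lookup from a numeric code to its constant name.
function eventName(code) {
    const name = Object.keys(Event).find(key => Event[key] === code);
    return name === undefined ? 'UNKNOWN(' + code + ')' : name;
}

console.log(eventName(1)); // EVENT_VOICE
console.log(eventName(7)); // UNKNOWN(7)
```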

### Available VAD Modes

These constants can be used as the `mode` parameter of the `VAD` constructor to
configure the VAD algorithm.

#### VAD.Mode.MODE_NORMAL

Constant for normal voice detection mode. Suitable for high bitrate, low-noise data.
May classify noise as voice, too. The default value if `mode` is omitted in the constructor.

#### VAD.Mode.MODE_LOW_BITRATE

Detection mode optimised for low-bitrate audio.

#### VAD.Mode.MODE_AGGRESSIVE

Detection mode best suited for somewhat noisy, lower quality audio.

#### VAD.Mode.MODE_VERY_AGGRESSIVE

Detection mode with lowest miss-rate. Works well for most inputs.
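
If the mode comes from configuration (a command-line flag or environment variable, for example), it has to be translated into one of the numeric constants above. A minimal sketch, assuming short names such as `aggressive`; the helper `resolveMode` is hypothetical, and the values are copied from `VAD.Mode` in `lib/vad.js` so the snippet runs on its own.

```javascript
// Values mirror VAD.Mode in lib/vad.js; copied here so the sketch
// runs without the native addon.
const Mode = Object.freeze({
    MODE_NORMAL: 0,
    MODE_LOW_BITRATE: 1,
    MODE_AGGRESSIVE: 2,
    MODE_VERY_AGGRESSIVE: 3
});

// Hypothetical helper: turns a short name like "very_aggressive"
// into the matching numeric mode, or throws for unknown names.
function resolveMode(name) {
    const key = 'MODE_' + String(name).toUpperCase();
    if (!Object.prototype.hasOwnProperty.call(Mode, key)) {
        throw new Error('Unknown VAD mode: ' + name);
    }
    return Mode[key];
}

console.log(resolveMode('aggressive')); // 2
```

The result can be passed straight to `new VAD(...)`.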

## Notes

The library is designed with input streams in mind: sample buffers fed to `processAudio` should be
rather short (36ms to 144ms, depending on your needs) and the sample rate no higher than 32kHz. Sample rates higher
than 16kHz provide no benefit to the VAD algorithm, as human voice patterns center around 4000 to 6000Hz. Minding the
Nyquist frequency, this yields sample rates between 8000 and 12000Hz for best results.
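
To make the chunk durations concrete, the sketch below works out how many bytes a chunk of a given length occupies and splits a larger buffer accordingly. It assumes 16-bit mono PCM (the README does not state the channel count); `chunkBytes` and `splitIntoChunks` are hypothetical helpers, not part of this library.

```javascript
// Hypothetical helper: size in bytes of one chunk of 16-bit mono PCM.
function chunkBytes(sampleRate, durationMs) {
    const samples = Math.round(sampleRate * durationMs / 1000);
    return samples * 2; // 2 bytes per 16-bit sample
}

// Hypothetical helper: splits a PCM buffer into chunks of `size` bytes.
// The last chunk may be shorter than `size`.
function splitIntoChunks(pcm, size) {
    const chunks = [];
    for (let offset = 0; offset < pcm.length; offset += size) {
        chunks.push(pcm.subarray(offset, offset + size));
    }
    return chunks;
}

console.log(chunkBytes(16000, 36));  // 1152
console.log(chunkBytes(16000, 144)); // 4608
```

When reading from a file, passing the chunk size as the `highWaterMark` option of `fs.createReadStream` is one way to get chunks of about this length.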

## Example

See examples folder for a working example with a sample audio file.

```javascript
const VAD = require('node-vad');
const fs = require('fs');

const vad = new VAD(VAD.Mode.MODE_NORMAL);

const stream = fs.createReadStream("demo_pcm_s16_16000.raw");
stream.on("data", chunk => {
    vad.processAudio(chunk, 16000, (err, res) => {
        switch (res) {
            case VAD.Event.EVENT_ERROR:
                console.log("EVENT_ERROR");
                break;
            case VAD.Event.EVENT_NOISE:
                console.log("EVENT_NOISE");
                break;
            case VAD.Event.EVENT_SILENCE:
                console.log("EVENT_SILENCE");
                break;
            case VAD.Event.EVENT_VOICE:
                console.log("EVENT_VOICE");
                break;
        }
    })
});
```
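
The callback style above can also be wrapped in a Promise. This is only a sketch: it assumes the callback is invoked as `(err, res)`, as in the example, and uses a stand-in object instead of a real `VAD` instance so it runs without the native addon. `processAudioAsync` and `fakeVad` are hypothetical names.

```javascript
// Hypothetical wrapper: resolves with the event code, rejects on error.
// Assumes the callback signature (err, res) shown in the example above.
function processAudioAsync(vad, samples, sampleRate) {
    return new Promise((resolve, reject) => {
        vad.processAudio(samples, sampleRate, (err, res) => {
            if (err) {
                reject(err);
            } else {
                resolve(res);
            }
        });
    });
}

// Stand-in for a real VAD instance; always reports 1 (EVENT_VOICE).
const fakeVad = {
    processAudio: (samples, sampleRate, callback) => setImmediate(callback, null, 1)
};

processAudioAsync(fakeVad, Buffer.alloc(1152), 16000)
    .then(res => console.log(res)); // 1
```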

## License

[MIT](LICENSE)
26 changes: 26 additions & 0 deletions binding.gyp
@@ -0,0 +1,26 @@
{
  'targets': [
    {
      'target_name': 'vad',
      'product_extension': 'node',
      'type': 'shared_library',
      'defines': [],
      'include_dirs': ["<!(node -e \"require('nan')\")", "./src"],
      'sources': [
        'src/simplevad.c',
        'src/vad_bindings.cc'
      ],
      'dependencies': [
        './vendor/webrtc_vad/webrtc_vad.gyp:webrtc_vad'
      ],
      'conditions': [
        ['OS=="mac"', {
          "xcode_settings": {
            "MACOSX_DEPLOYMENT_TARGET": "10.9",
            "CLANG_CXX_LIBRARY": "libc++"
          }
        }]
      ]
    }
  ]
}
Binary file added examples/demo_pcm_s16_16000.raw
Binary file not shown.
24 changes: 24 additions & 0 deletions examples/voice.js
@@ -0,0 +1,24 @@
const VAD = require('../index.js');
const fs = require('fs');

const vad = new VAD(VAD.Mode.MODE_NORMAL);

const stream = fs.createReadStream("demo_pcm_s16_16000.raw");
stream.on("data", chunk => {
    vad.processAudio(chunk, 16000, (err, res) => {
        switch (res) {
            case VAD.Event.EVENT_ERROR:
                console.log("EVENT_ERROR");
                break;
            case VAD.Event.EVENT_NOISE:
                console.log("EVENT_NOISE");
                break;
            case VAD.Event.EVENT_SILENCE:
                console.log("EVENT_SILENCE");
                break;
            case VAD.Event.EVENT_VOICE:
                console.log("EVENT_VOICE");
                break;
        }
    })
});
3 changes: 3 additions & 0 deletions index.js
@@ -0,0 +1,3 @@
const VAD = require('./lib/vad');

module.exports = VAD;
64 changes: 64 additions & 0 deletions lib/vad.js
@@ -0,0 +1,64 @@
const binding = require('bindings')('vad');
const Buffer = require('buffer').Buffer;

class VAD {

    // `mode` defaults to MODE_NORMAL when omitted, as documented in the README.
    constructor(mode = VAD.Mode.MODE_NORMAL) {
        // Calling vad_alloc with null reports the size of the native VAD state.
        const size = binding.vad_alloc(null);
        if (size.error) {
            throw new Error('Failed to get VAD size')
        }

        this._vad = Buffer.alloc(size.size);
        if (!binding.vad_alloc(this._vad)) {
            throw new Error('Failed to allocate VAD')
        }

        if (!binding.vad_init(this._vad)) {
            throw new Error('Failed to initialise VAD')
        }

        if (typeof mode === 'number' &&
            mode >= VAD.Mode.MODE_NORMAL && mode <= VAD.Mode.MODE_VERY_AGGRESSIVE) {
            binding.vad_setmode(this._vad, mode);
        } else {
            throw new Error('Invalid mode settings')
        }
    }

    // expects 16 bit signed audio
    processAudio(buffer, rate, callback) {
        binding.vad_processAudio(this._vad, VAD.toFloatBuffer(buffer), rate, callback);
    }

    // expects 32 bit normalized float audio
    processAudioFloat(buffer, rate, callback) {
        binding.vad_processAudio(this._vad, buffer, rate, callback);
    }

    // TODO: Not very efficient...
    static toFloatBuffer(buffer) {
        // Each 2-byte sample becomes a 4-byte float.
        const floatData = Buffer.alloc(buffer.length * 2);
        // Stop before a trailing odd byte so readInt16LE never reads past the end.
        for (let i = 0; i + 1 < buffer.length; i += 2) {
            const intVal = buffer.readInt16LE(i);
            const floatVal = intVal / 32768.0;
            floatData.writeFloatLE(floatVal, i * 2);
        }
        return floatData;
    }
}

VAD.Event = Object.freeze({
    EVENT_ERROR: -1,
    EVENT_SILENCE: 0,
    EVENT_VOICE: 1,
    EVENT_NOISE: 2
});

VAD.Mode = Object.freeze({
    MODE_NORMAL: 0,
    MODE_LOW_BITRATE: 1,
    MODE_AGGRESSIVE: 2,
    MODE_VERY_AGGRESSIVE: 3
});

module.exports = VAD;
18 changes: 18 additions & 0 deletions package-lock.json


28 changes: 28 additions & 0 deletions package.json
@@ -0,0 +1,28 @@
{
  "name": "node-vad",
  "author": {
    "name": "Snirpo"
  },
  "description": "WebRTC-based Voice Activity Detection library",
  "version": "1.0.0",
  "scripts": {
    "install": "node-gyp rebuild"
  },
  "main": "./index.js",
  "license": "MIT",
  "engines": {
    "node": ">=6.14.3"
  },
  "dependencies": {
    "bindings": "1.2.1",
    "nan": "^2.5.0"
  },
  "gypfile": true,
  "maintainers": [
    {
      "name": "snirpo",
      "email": "[email protected]"
    }
  ],
  "directories": {}
}