DeepSpeech is an open-source, embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers.

C++ 25,838 4,006 Updated Sep 3, 2024

common-voice / common-voice

Common Voice is part of Mozilla's initiative to help teach machines how real people speak.

TypeScript 3,331 844 Updated Feb 13, 2025

google / uis-rnn

This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.

Python 1,569 320 Updated Sep 25, 2024

taylorlu / Speaker-Diarization

Speaker diarization with uis-rnn, using speaker embeddings from vgg-speaker-recognition.

Python 479 120 Updated Jul 1, 2021

mdangschat / ctc-asr

End-to-end trained speech recognition system, based on RNNs and the connectionist temporal classification (CTC) cost function.

Python 122 36 Updated Apr 15, 2020

aimacode / aima-python

Python implementation of algorithms from Russell and Norvig's "Artificial Intelligence - A Modern Approach"

Jupyter Notebook 8,205 3,863 Updated Aug 4, 2024

zzw922cn / awesome-speech-recognition-speech-synthesis-papers

Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)

3,014 515 Updated Oct 19, 2023

cs109Alabs / lab_files

Jupyter Notebook 16 76 Updated Nov 18, 2016

ktnmoo

Stars

openai / whisper

stat88 / content-sp21

prob140 / materials-sp21

kkroening / ffmpeg-python

DemisEom / SpecAugment

zcaceres / fastai-audio

facebookresearch / xR-EgoPose

facebookresearch / VideoPose3D

facebookresearch / DeepFocus

facebookresearch / DeepFovea