Robust Speech Recognition via Large-Scale Weak Supervision

Python 76,134 9,097 Updated Jan 4, 2025

All public course material for STAT 88 used in Spring 2021

Jupyter Notebook 1 Updated Apr 27, 2021
Jupyter Notebook 1 2 Updated Apr 27, 2021

Python bindings for FFmpeg - with complex filtering support

Python 10,292 898 Updated Aug 4, 2024

An implementation of SpecAugment, introduced by Google Brain, with TensorFlow and PyTorch

Python 646 136 Updated Apr 5, 2022

collaborative audio module for fast.ai

Jupyter Notebook 98 21 Updated Jun 26, 2019

New egocentric synthetic dataset for egocentric 3D human pose estimation

Python 61 12 Updated Jul 22, 2023

Efficient 3D human pose estimation in video using 2D keypoint trajectories

Python 3,786 761 Updated Dec 10, 2022

DeepFocus: Learned Image Synthesis for Computational Displays

Python 410 67 Updated Feb 18, 2022

Neural Reconstruction for Foveated Rendering and Video Compression using Learned Statistics of Natural Videos

PureBasic 388 55 Updated Aug 12, 2021

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

C++ 25,838 4,006 Updated Sep 3, 2024

Common Voice is part of Mozilla's initiative to help teach machines how real people speak.

TypeScript 3,331 844 Updated Feb 13, 2025

This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.

Python 1,569 320 Updated Sep 25, 2024

speaker diarization by uis-rnn and speaker embedding by vgg-speaker-recognition

Python 479 120 Updated Jul 1, 2021

End-to-end trained speech recognition system, based on RNNs and the connectionist temporal classification (CTC) cost function.

Python 122 36 Updated Apr 15, 2020

Python implementation of algorithms from Russell And Norvig's "Artificial Intelligence - A Modern Approach"

Jupyter Notebook 8,205 3,863 Updated Aug 4, 2024

Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)

3,014 515 Updated Oct 19, 2023
Jupyter Notebook 16 76 Updated Nov 18, 2016