Poetics Inc. @smartmedical-jp
- Tokyo, Japan
- https://sapphosound.com
- https://orcid.org/0000-0003-2645-736X
Lists (32)
Android
Art
ASR
Audio
BNN (Bayesian Neural Network)
C++
Cloud
Dataset
DL
dotnet
F#
FFXIV
Game
Generative
Godot
Graphics
iOS
LLM
ML
Music
NLP
OS
RAG
Recommendation
RL
Rust
Security
Study
System
Vision
Visualization/UI
Web
Stars
🧠 MCP server implementing RAT (Retrieval Augmented Thinking): combines DeepSeek's reasoning with GPT-4/Claude/Mistral responses, maintaining conversation context between interactions.
Kaldi-compatible online and offline feature extraction with PyTorch, supporting CUDA, batch processing, chunk processing, and autograd; provides C++ and Python APIs.
This library provides common speech features for ASR including MFCCs and filterbank energies.
Interactive graphing library for .NET programming languages.
An open-source RAG-based tool for chatting with your documents.
a Frontier Japanese Speech Generation net
A dataset of 222 digital musical scores aligned with 1068 performances (more than 92 hours) of Western classical piano music.
🤗 smolagents: a barebones library for agents. Agents write Python code to call tools and orchestrate other agents.
Python tool for converting files and office documents to Markdown.
A bot that naturally joins people's conversations in Discord channels.
[CVPR 2025] Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
Unofficial Chzzk Android TV app.
A Korean spacing correction model that aims for fast speed and moderate accuracy.
Trained neural networks and requisite information and data for rnnoise-nu
ASCII generator (image to text, image to image, video to video)
Open-source platform for extracting structured data from documents using AI.
This repository contains the samples for Syncfusion WPF UI Controls and File Format libraries and the guide to use them.
This is the official implementation of reverberant speech to room impulse response estimator
Manipulate audio with a simple and easy high level interface
Flexible audio loudness meter in Python with implementation of ITU-R BS.1770-4 loudness algorithm
Examples of how to use TPM APIs for basic use cases
OSS implementation of the TCG TPM2 Software Stack (TSS2)
Convert PlantUML embedded in Markdown to an image with Pandoc and output it to HTML
Noise reduction in python using spectral gating (speech, bioacoustics, audio, time-domain signals)