Gender recognition by voice and speech analysis

R 351 102 Updated Jan 16, 2023

Enjoy the magic of Diffusion models!

Python 6,644 607 Updated Dec 17, 2024

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

Jupyter Notebook 8,300 1,135 Updated Nov 13, 2024

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization

Python 1,319 109 Updated Dec 13, 2024

Voiceprint recognition

Jupyter Notebook 11 2 Updated Dec 4, 2023

This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, CAM++, etc. More models may be supported in the future. At the …

Python 848 127 Updated Nov 19, 2024

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 6,945 747 Updated Dec 17, 2024

Animate a given image with animatediff and controlnet

Python 126 6 Updated Jan 2, 2024

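Extracts hard-coded subtitles from videos and generates srt files. No third-party API is required; text recognition runs locally. A deep-learning-based video subtitle extraction framework that includes subtitle region detection and subtitle content extraction. A GUI tool for extracting hard-coded subtitles (hardsub) from videos and generating srt files.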

Python 6,301 686 Updated Oct 29, 2024

A simple screen parsing tool towards pure vision based GUI agent

Jupyter Notebook 5,188 403 Updated Dec 13, 2024

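🎉 Collects and organizes links to publicly shared documents from Feishu and similar platforms, addressing the lack of an official global search so that knowledge keeps circulating. A list of cool, beautiful, and interesting Feishu docs.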

83 6 Updated Aug 21, 2023

Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization

JavaScript 3,607 319 Updated Nov 30, 2024

Running speech to text model (whisper.cpp) in Unity3d on your local machine.

C# 445 100 Updated Dec 8, 2024

Robust Speech Recognition via Large-Scale Weak Supervision

Python 72,927 8,701 Updated Dec 1, 2024

"LightRAG: Simple and Fast Retrieval-Augmented Generation"

Python 11,646 1,530 Updated Dec 17, 2024
Python 73 10 Updated Aug 13, 2024

AI Native Data App Development framework with AWEL(Agentic Workflow Expression Language) and Agents

Python 13,985 1,886 Updated Dec 17, 2024

Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.

Shell 10,945 676 Updated Dec 4, 2024

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Python 14,785 1,195 Updated Dec 12, 2024

[ACL 2024] IEPile: A Large-Scale Information Extraction Corpus

Python 176 17 Updated Dec 17, 2024

DSPy: The framework for programming—not prompting—language models

Python 20,245 1,533 Updated Dec 17, 2024

A modular graph-based Retrieval-Augmented Generation (RAG) system

Python 1 1 Updated Jul 18, 2024

A phoneme extractor tool for a free lipsync workflow in Unity. This is not made by me; it is made by rmemr.

36 5 Updated Mar 9, 2023

Demo project for GDMP plugin.

GDScript 17 4 Updated Dec 10, 2024

A cloud drive (network disk) system based on Django 3

JavaScript 38 8 Updated Apr 21, 2024

The container platform tailored for Kubernetes multi-cloud, datacenter, and edge management ⎈ 🖥 ☁️

Go 15,320 2,166 Updated Dec 16, 2024

LlamaIndex is a data framework for your LLM applications

Python 37,380 5,360 Updated Dec 16, 2024

The official Meta Llama 3 GitHub site

Python 27,526 3,138 Updated Aug 12, 2024

📷 EasyPhoto | Your Smart AI Photo Generator.

Python 5,004 402 Updated Jul 10, 2024

Create agents that monitor and act on your behalf. Your agents are standing by!

Ruby 43,961 3,808 Updated Dec 16, 2024