Lists (20)
art_with_ai
computer-vision-models
Deep dive paper
deep fakes
expense-tracker-app
face-attribute-analysis
face-detect-repos
face-spoof-detection
Face spoof dataset, algorithms and research papers
genAI, LLM & LVM
learning
mac app
mixture of experts
ml-ds-interview
number-plate-ocr
Reinforcement Learning
speech-analysis
text to speech (TTS)
vision - Paper, Arch, Repos,Demo
voice spoof detection
yolo-projects
Stars
Semantic segmentation models with 500+ pretrained convolutional and transformer-based backbones.
Image augmentation for machine learning experiments.
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and…
An MIT-licensed implementation of YOLOv9, YOLOv7, YOLO-RD
Make websites accessible for AI agents
💸 An app created to help users manage a budget and purchases
Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Jobs_Applier_AI_Agent_AIHawk aims to ease the job hunt by automating the job application process. Utilizing artificial intelligence, it enables users to apply for multiple jobs in a tailored way.
High-resolution models for human tasks.
Efficient Triton Kernels for LLM Training
A project where the license plate number is extracted from an image of a vehicle using object detection and character recognition techniques.
DeepSeek-VL: Towards Real-World Vision-Language Understanding
Virtual whiteboard for sketching hand-drawn like diagrams
The most powerful and modular diffusion model GUI, API and backend with a graph/nodes interface.
BAE-NET: A LOW COMPLEXITY AND HIGH FIDELITY BANDWIDTH-ADAPTIVE NEURAL NETWORK FOR SPEECH SUPER-RESOLUTION
Versatile audio super resolution (any -> 48kHz) with AudioSR.
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch
The official implementation of HierSpeech++
Official implementation of FaceXFormer: A Unified Transformer for Facial Analysis
Implementation of Vision Mamba from the paper: "Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model". It's 2.8x faster than DeiT and saves 86.8% GPU memory wh…
High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean.
Data-Driven Evaluation for LLM-Powered Applications
Mixture of experts on a convolutional neural network using Keras and CIFAR-10