An extension that lets the AI take the wheel, allowing it to use the mouse and keyboard, recognize UI elements, and prompt itself :3...now also act as a research assistant

Python 99 1 Updated Oct 22, 2024

arogozhnikov / einops

Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)

Python 8,582 354 Updated Dec 16, 2024

amorehead / alphafold3-pytorch-lightning-hydra

Forked from lucidrains/alphafold3-pytorch

Implementation of AlphaFold 3 in PyTorch Lightning + Hydra

Python 32 7 Updated Oct 4, 2024

lucidrains / alphafold3-pytorch

Implementation of AlphaFold 3 from Google DeepMind in PyTorch

Python 1,289 155 Updated Dec 3, 2024

xjdr-alt / entropix

Entropy Based Sampling and Parallel CoT Decoding

Python 3,168 319 Updated Nov 13, 2024

Adamdad / kat

Kolmogorov-Arnold Transformer: A PyTorch Implementation with CUDA kernel

Python 628 36 Updated Oct 8, 2024

PKU-YuanGroup / MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models

Python 2,024 126 Updated Dec 3, 2024

2noise / ChatTTS

A generative speech model for daily dialogue.

Python 33,030 3,584 Updated Dec 3, 2024

yl4579 / StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 5,040 426 Updated Aug 10, 2024

lifeiteng / naturalspeech3_facodec

FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3

Python 175 11 Updated Apr 20, 2024

lifeiteng / Aligner-SUPERB

Speech-To-Text forced-alignment Speech processing Universal PERformance Benchmark

Python 25 2 Updated Jul 14, 2024

MahmoudAshraf97 / ctc-forced-aligner

Text to speech alignment using CTC forced alignment

Python 175 35 Updated Oct 30, 2024

QwenLM / Qwen2-VL

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python 3,651 227 Updated Dec 4, 2024

ahwillia / affinewarp

An implementation of piecewise linear time warping for multi-dimensional time series alignment

Python 172 38 Updated Aug 15, 2024

LAION-AI / natural_voice_assistant

Python 467 44 Updated May 27, 2024

baaivision / DIVA

Diffusion Feedback Helps CLIP See Better

Python 228 12 Updated Aug 24, 2024

Zejun-Yang / AniPortrait

AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation

Python 4,732 586 Updated Jul 2, 2024

KwaiVGI / LivePortrait

Bring portraits to life!

Python 13,364 1,423 Updated Nov 12, 2024

neeek2303 / EMOPortraits

Official implementation of EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars

Jupyter Notebook 328 18 Updated Oct 6, 2024

buildFast10x / Nextjs-Boilerplate

We aim to build the most powerful Next.js boilerplate, so people don't have to write code for the same features again and again.

TypeScript 70 3 Updated Apr 5, 2024

johndpope / VASA-1-hack

Using Claude Sonnet 3.5 to forward (reverse) engineer code from VASA white paper - WIP - (this is for La Raza 🎷)

Python 249 30 Updated Nov 9, 2024

TerryPei / EfficientVMamba

Code Implementation of EfficientVMamba

Python 189 7 Updated Apr 16, 2024

kyegomez / VisionMamba

Implementation of Vision Mamba from the paper: "Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model" It's 2.8x faster than DeiT and saves 86.8% GPU memory wh…

Python 408 19 Updated Nov 25, 2024

YuHengsss / MSVMamba

[NeurIPS2024] Multi-Scale VMamba: Hierarchy in Hierarchy Visual State Space Model

Python 54 3 Updated Sep 29, 2024

ms-dot-k / Multi-head-Visual-Audio-Memory

PyTorch implementation of "Distinguishing Homophenes using Multi-Head Visual-Audio Memory" (AAAI2022)

Python 25 5 Updated Mar 9, 2024

hanif-rt

Stars

Ucas-HaoranWei / GOT-OCR2.0

srush / Triton-Puzzles

liutaocode / talking-face-arxiv-daily

illuin-tech / colpali

lucasnewman / f5-tts-mlx

RandomInternetPreson / Lucid_Autonomy