- University of Rochester
- Rochester, NY
- masumhasan.com
- @masum_mc2
Stars
Toolkit for linearizing PDFs for LLM datasets/training
Enhance-A-Video: Better Generated Video for Free
The official implementation of the paper "Human Motion Diffusion as a Generative Prior"
🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/mEkkMXFG
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
Open source Claude Artifacts – built with Llama 3.1 405B
Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation
The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation" and [TPAMI'23] "ViTPose++: Vision Transformer for Generic Body Pose Estimation"
Create Customized Software using Natural Language Idea (through LLM-powered Multi-Agent Collaboration)
Welcome to the Llama Cookbook! This is your go to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We also show you how to solve end to end problems using Llama mode…
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
A high-performance inference system for large language models, designed for production environments.
AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI
🐢 Open-Source Evaluation & Testing for AI & LLM systems
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
A new one shot face swap approach for image and video domains
🎓 Path to a free self-taught education in Computer Science!
[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821
A dataset containing human-human knowledge-grounded open-domain conversations.
Building a Simple Chatbot from Scratch in Python (using NLTK)
Examples and guides for using the OpenAI API
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, and Cyrillic.
A reference containing Styles and Keywords that you can use with MidJourney AI. There are also pages showing resolution comparison, image weights, and much more!
ML-driven tongue animation (CVPR'22)