Stars
SQUEEZED ATTENTION: Accelerating Long Prompt LLM Inference
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
[EMNLP 2024 Demo] TinyAgent: Function Calling at the Edge!
This repository includes some of the presentations and tutorials I have made
[ACL 2024] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement
[ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling
[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
A curated list of practical guide resources for LLMs (LLMs Tree, Examples, Papers)
UpTrain is an open-source unified platform to evaluate and improve Generative AI applications. We provide grades for 20+ preconfigured checks (covering language, code, embedding use-cases), perform…
Gradient-based adaptive sampling algorithms for self-supervising PINNs
Fork of seldridge/rocket-rocc-examples with tests for a systolic array based matmul accelerator
A modular, automatable, tunable mapper for accelerator programming
Aqueduct is no longer being maintained. Aqueduct allows you to run LLM and ML workloads on any cloud infrastructure.
[NeurIPS'22] Squeezeformer: An Efficient Transformer for Automatic Speech Recognition
A simple tool to update bib entries with their official information (e.g., DBLP or the ACL anthology).
GLioblastoma Image Analysis for integrating brain tumor growth models with medical imaging
A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research, and we are continuously improving the project. Welcome to PR the works (p…
Perform data science on data that remains in someone else's server
Large datasets for conversational AI
A list of publicly available audio data that anyone can download for ASR or other speech activities
[ICASSP'22] Integer-only Zero-shot Quantization for Efficient Speech Recognition
[ICML'21 Oral] I-BERT: Integer-only BERT Quantization
Facebook AI Research's Automatic Speech Recognition Toolkit
torch-optimizer -- collection of optimizers for PyTorch
Tutorial notebooks for hls4ml
Quantization library for PyTorch. Supports low-precision and mixed-precision quantization, with hardware implementation through TVM.