Neural Magic
Neural Magic helps developers accelerate machine learning performance using automated model sparsification techniques and inference technologies.
Repositories
- upstream-transformers Public Forked from huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
- nm-vllm-certs Public
General information, model certifications, and benchmarks for nm-vllm enterprise distributions
- compressed-tensors Public
A safetensors extension to efficiently store sparse quantized tensors on disk
- quant_kernel_benchmarks Public
Benchmarking code for running quantized kernels from vLLM and other libraries
- lm-evaluation-harness Public Forked from EleutherAI/lm-evaluation-harness
A framework for few-shot evaluation of language models.
- flash-attention Public Forked from vllm-project/flash-attention
Fast and memory-efficient exact attention