Stars
Advanced Quantization Algorithm for LLMs/VLMs.
Easy and lightning-fast training of 🤗 Transformers on the Habana Gaudi processor (HPU)
Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.
Accessible large language models via k-bit quantization for PyTorch.
🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy-to-use hardware optimization tools
Examples for using ONNX Runtime for machine learning inferencing.
⚡ Build your chatbot within minutes on your favorite device; apply SOTA compression techniques to LLMs; run LLMs efficiently on Intel platforms ⚡
Common utilities for ONNX converters
ONNXMLTools enables conversion of models to ONNX
Olive: Simplify ML Model Finetuning, Conversion, Quantization, and Optimization for CPUs, GPUs and NPUs.
Tensors and Dynamic neural networks in Python with strong GPU acceleration
A JIT assembler for x86/x64 architectures supporting MMX, SSE (1-4), AVX (1-2, 512), FPU, APX, and AVX10.2
Open standard for machine learning interoperability
Intel® AI Reference Models: Intel optimizations for running deep learning workloads on Intel® Xeon® Scalable processors and Intel® Data Center GPUs
ONNX Runtime: cross-platform, high-performance ML inferencing and training accelerator
Inference of quantization-aware trained networks using TensorRT
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
Reference implementations of MLPerf™ inference benchmarks
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
Pre-trained Deep Learning models and demos (high quality and extremely fast)