Stars
Generative AI extensions for onnxruntime
Source code examples from the Parallel Forall Blog
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
A JIT assembler for x86/x64 architectures supporting MMX, SSE (1-4), AVX (1-2, 512), FPU, APX, and AVX10.2
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
The Torch-MLIR project aims to provide first-class support from the PyTorch ecosystem to the MLIR ecosystem.
A retargetable MLIR-based machine learning compiler and runtime toolkit.
Cross-platform, customizable ML solutions for live and streaming media.
ncnn is a high-performance neural network inference framework optimized for the mobile platform
MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
Quantized Neural Network PACKage - mobile-optimized implementation of quantized neural network operators
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Optimized primitives for collective multi-GPU communication
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
[ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl
A natural language modeling framework based on PyTorch
TensorFlow code and pre-trained models for BERT
Convert TensorFlow, Keras, TensorFlow.js, and TFLite models to ONNX
An Open Source Machine Learning Framework for Everyone
Code samples for my book "Neural Networks and Deep Learning"
Open standard for machine learning interoperability