-
pytorch Public
Forked from pytorch/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Python Other Updated Oct 1, 2024 -
onediff Public
Forked from siliconflow/onediff
OneDiff: An out-of-the-box acceleration library for diffusion models.
Python Apache License 2.0 Updated Jul 11, 2024 -
oneflow Public
Forked from Oneflow-Inc/oneflow
OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.
C++ Apache License 2.0 Updated Jul 11, 2024 -
stable-fast Public
Forked from chengzeyi/stable-fast
Best inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.
Python MIT License Updated May 9, 2024 -
openpose Public
Forked from CMU-Perceptual-Computing-Lab/openpose
OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation
C++ Other Updated Dec 18, 2023 -
tvm Public
Forked from apache/tvm
Open deep learning compiler stack for cpu, gpu and specialized accelerators
Python Apache License 2.0 Updated Sep 24, 2023 -
llvm-project Public
Forked from llvm/llvm-project
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
Other Updated Sep 24, 2023 -
web-stable-diffusion Public
Forked from mlc-ai/web-stable-diffusion
Bringing stable diffusion models to web browsers. Everything runs inside the browser with no server support.
Jupyter Notebook Apache License 2.0 Updated Sep 5, 2023 -
kernl Public
Forked from ELS-RD/kernl
Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.
Jupyter Notebook Apache License 2.0 Updated Aug 21, 2023 -
CuAssembler Public
Forked from OpenPPL/CuAssembler
An unofficial cuda assembler, for all generations of SASS, hopefully :)
Python MIT License Updated Mar 20, 2023 -
YOLOv3-model-pruning Public
Forked from Lam1360/YOLOv3-model-pruning
Model pruning for YOLOv3: on the Oxford Hand dataset, the parameter count drops by 80%, FLOPs drop by 70%, inference speed reaches 200% of the original, and mAP stays essentially unchanged.
Python MIT License Updated Jul 3, 2019 -
cutlass Public
Forked from NVIDIA/cutlass
CUDA Templates for Linear Algebra Subroutines
C++ BSD 3-Clause "New" or "Revised" License Updated Dec 19, 2018 -
caffe Public
Forked from pmgysel/caffe
Ristretto: Caffe-based approximation of convolutional neural networks.
C++ Other Updated Apr 16, 2018 -
Stochastic-Quantization Public
Forked from dongyp13/Stochastic-Quantization
Training Low-bits DNNs with Stochastic Quantization
Jupyter Notebook Updated Aug 4, 2017