keilsmart
  • pytorch Public

    Forked from pytorch/pytorch

    Tensors and Dynamic neural networks in Python with strong GPU acceleration

    Python Other Updated Oct 1, 2024
  • onediff Public

    Forked from siliconflow/onediff

    OneDiff: An out-of-the-box acceleration library for diffusion models.

    Python Apache License 2.0 Updated Jul 11, 2024
  • oneflow Public

    Forked from Oneflow-Inc/oneflow

    OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.

    C++ Apache License 2.0 Updated Jul 11, 2024
  • Best inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.

    Python MIT License Updated May 9, 2024
  • OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation

    C++ Other Updated Dec 18, 2023
  • rocMLIR Public

    Forked from ROCm/rocMLIR
    Updated Sep 25, 2023
  • tvm Public

    Forked from apache/tvm

    Open deep learning compiler stack for cpu, gpu and specialized accelerators

    Python Apache License 2.0 Updated Sep 24, 2023
  • llvm-project Public

    Forked from llvm/llvm-project

    The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.

    Other Updated Sep 24, 2023
  • Bringing stable diffusion models to web browsers. Everything runs inside the browser with no server support.

    Jupyter Notebook Apache License 2.0 Updated Sep 5, 2023
  • kernl Public

    Forked from ELS-RD/kernl

    Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.

    Jupyter Notebook Apache License 2.0 Updated Aug 21, 2023
  • CuAssembler Public

    Forked from OpenPPL/CuAssembler

    An unofficial cuda assembler, for all generations of SASS, hopefully :)

    Python MIT License Updated Mar 20, 2023
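  • Model pruning for YOLOv3: on the Oxford Hand dataset, the parameter count is reduced by 80%, FLOPs drop by 70%, inference speed reaches 200% of the original, and mAP stays essentially unchanged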

    Python MIT License Updated Jul 3, 2019
  • cutlass Public

    Forked from NVIDIA/cutlass

    CUDA Templates for Linear Algebra Subroutines

    C++ BSD 3-Clause "New" or "Revised" License Updated Dec 19, 2018
  • caffe Public

    Forked from pmgysel/caffe

    Ristretto: Caffe-based approximation of convolutional neural networks.

    C++ Other Updated Apr 16, 2018
  • Training Low-bits DNNs with Stochastic Quantization

    Jupyter Notebook Updated Aug 4, 2017