Stars
Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.
This is the implementation of the paper [Optimus: Towards Optimal Layer-Fusion on Deep Learning Processors].
Module that automatically maximizes the utilization of GPU resources in a Kubernetes cluster through real-time dynamic partitioning and elastic quotas - effortless optimization at its finest!
《Machine Learning Systems: Design and Implementation》- Chinese Version
《代码随想录》LeetCode problem-solving guide: a recommended order for 200 classic problems, 600k words of detailed illustrated explanations, video breakdowns of tricky points, 50+ mind maps, with solutions in C++, Java, Python, Go, JavaScript, and more. No more getting lost while learning algorithms! 🔥🔥 Take a look, you'll wish you had found it sooner! 🚀
A scheduler for spatial DNN accelerators that generates high-performance schedules in one shot using mixed integer programming (MIP)
A template project for beginning new Chisel work
pku-liang / TensorLib
Forked from kirliavc/tensorlib
A Spatial Accelerator Generation Framework for Tensor Algebra.
IC implementation of Systolic Array for TPU
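A TPU-style systolic array maps matrix multiplication onto a grid of processing elements (PEs) with operands streamed in skewed across cycles. The behavior can be sketched in software; this is a minimal, illustrative simulation of an output-stationary array (each PE accumulates one element of C), not the repository's actual RTL:

```python
# Sketch of an output-stationary systolic array computing C = A x B.
# PE(i, j) owns output C[i][j]; row i of A is delayed by i cycles and
# column j of B by j cycles, so the matching operand pair meets at the
# PE at cycle t = i + j + s. Names here are illustrative assumptions.

def systolic_matmul(A, B):
    n, k = len(A), len(A[0])
    m = len(B[0])
    C = [[0] * m for _ in range(n)]
    # Run until the last skewed operand has reached PE(n-1, m-1).
    for t in range(k + n + m - 2):
        for i in range(n):
            for j in range(m):
                s = t - i - j  # index of the operand pair arriving this cycle
                if 0 <= s < k:
                    C[i][j] += A[i][s] * B[s][j]
    return C

print(systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

In hardware the inner double loop runs fully in parallel, one multiply-accumulate per PE per cycle, which is what makes the structure attractive for TPU-like designs.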
Accelerates a linear equation system solver on the DE1-SoC development board
Vitis AI is Xilinx’s development stack for AI inference on Xilinx hardware platforms, including both edge devices and Alveo cards.
Transformer related optimization, including BERT, GPT
Count the MACs / FLOPs of your PyTorch model.
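Such counters typically hook each layer's forward pass and apply a closed-form cost per layer type. The arithmetic itself is simple; here is a back-of-the-envelope sketch from shapes alone (helper names are illustrative, not the tool's API):

```python
# MAC counts from layer shapes. A real counter reads these shapes from
# the live PyTorch model via forward hooks; the formulas are the same.

def conv2d_macs(c_in, c_out, kh, kw, h_out, w_out):
    # Each output element costs c_in * kh * kw multiply-accumulates.
    return c_out * h_out * w_out * c_in * kh * kw

def linear_macs(in_features, out_features):
    return out_features * in_features

# Example: a 3x3 conv, 64 -> 128 channels, 56x56 output feature map.
macs = conv2d_macs(64, 128, 3, 3, 56, 56)
print(macs)      # 231211008
print(2 * macs)  # FLOPs, counting the multiply and the add separately
```

The factor-of-two convention between MACs and FLOPs differs across tools, which is a common source of mismatched numbers in papers.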
Model summary in PyTorch similar to `model.summary()` in Keras
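The core of a Keras-style summary is a per-layer table of parameter counts plus a total. A toy version built from static shape specs (the layer list and helper names are illustrative assumptions, not the library's interface):

```python
# Per-layer parameter counts and a total, mimicking the information a
# Keras-style summary prints. Layer specs below are made up for the demo.

def conv2d_params(c_in, c_out, kh, kw, bias=True):
    return c_out * c_in * kh * kw + (c_out if bias else 0)

def linear_params(in_f, out_f, bias=True):
    return out_f * in_f + (out_f if bias else 0)

layers = [
    ("conv1", conv2d_params(3, 16, 3, 3)),
    ("conv2", conv2d_params(16, 32, 3, 3)),
    ("fc", linear_params(32 * 8 * 8, 10)),
]

total = 0
for name, n in layers:
    total += n
    print(f"{name:<8}{n:>10,}")
print(f"{'total':<8}{total:>10,}")
```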
A compiler from an AI model to an RTL (Verilog) accelerator on FPGA hardware, with automatic design space exploration.
Intermediate Language (IL) for Hardware Accelerator Generators
Tengine is a lightweight, high-performance, modular inference engine for embedded devices
Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm
A flexible and efficient deep neural network (DNN) compiler that generates high-performance executables from a DNN model description.