Stars
Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.
This is the implementation of the paper [Optimus: Towards Optimal Layer-Fusion on Deep Learning Processors].
Module that automatically maximizes the utilization of GPU resources in a Kubernetes cluster through real-time dynamic partitioning and elastic quotas - effortless optimization at its finest!
《Machine Learning Systems: Design and Implementation》- Chinese Version
《代码随想录》LeetCode problem-solving guide: a recommended order for 200 classic problems, 600k words of detailed illustrated explanations, video breakdowns of tricky points, 50+ mind maps, with solutions in C++, Java, Python, Go, JavaScript, and more. No more getting lost while learning algorithms! 🔥🔥 Take a look, you'll wish you had found it sooner! 🚀
A scheduler for spatial DNN accelerators that generates high-performance schedules in one shot using mixed integer programming (MIP)
A template project for beginning new Chisel work
pku-liang / TensorLib
Forked from kirliavc/tensorlib
A Spatial Accelerator Generation Framework for Tensor Algebra.
IC implementation of Systolic Array for TPU
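A TPU-style systolic array maps matrix multiplication onto a grid of processing elements (PEs) with operands streamed in skewed across cycles. The behavior can be sketched in software; this is a minimal, illustrative simulation of an output-stationary array (each PE accumulates one element of C), not the repository's actual RTL:

```python
# Sketch of an output-stationary systolic array computing C = A x B.
# PE(i, j) owns output C[i][j]; row i of A is delayed by i cycles and
# column j of B by j cycles, so the matching operand pair meets at the
# PE at cycle t = i + j + s. Names here are illustrative assumptions.

def systolic_matmul(A, B):
    n, k = len(A), len(A[0])
    m = len(B[0])
    C = [[0] * m for _ in range(n)]
    # Run until the last skewed operand has reached PE(n-1, m-1).
    for t in range(k + n + m - 2):
        for i in range(n):
            for j in range(m):
                s = t - i - j  # index of the operand pair arriving this cycle
                if 0 <= s < k:
                    C[i][j] += A[i][s] * B[s][j]
    return C

print(systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

In hardware the inner double loop runs fully in parallel, one multiply-accumulate per PE per cycle, which is what makes the structure attractive for TPU-like designs.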
Accelerates a linear equation system solver on the DE1-SoC development board
Vitis AI is Xilinx’s development stack for AI inference on Xilinx hardware platforms, including both edge devices and Alveo cards.
Transformer related optimization, including BERT, GPT
Count the MACs / FLOPs of your PyTorch model.
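Such counters typically hook each layer's forward pass and apply a closed-form cost per layer type. The arithmetic itself is simple; here is a back-of-the-envelope sketch from shapes alone (helper names are illustrative, not the tool's API):

```python
# MAC counts from layer shapes. A real counter reads these shapes from
# the live PyTorch model via forward hooks; the formulas are the same.

def conv2d_macs(c_in, c_out, kh, kw, h_out, w_out):
    # Each output element costs c_in * kh * kw multiply-accumulates.
    return c_out * h_out * w_out * c_in * kh * kw

def linear_macs(in_features, out_features):
    return out_features * in_features

# Example: a 3x3 conv, 64 -> 128 channels, 56x56 output feature map.
macs = conv2d_macs(64, 128, 3, 3, 56, 56)
print(macs)      # 231211008
print(2 * macs)  # FLOPs, counting the multiply and the add separately
```

The factor-of-two convention between MACs and FLOPs differs across tools, which is a common source of mismatched numbers in papers.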
Model summary in PyTorch similar to `model.summary()` in Keras
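The core of a Keras-style summary is a per-layer table of parameter counts plus a total. A toy version built from static shape specs (the layer list and helper names are illustrative assumptions, not the library's interface):

```python
# Per-layer parameter counts and a total, mimicking the information a
# Keras-style summary prints. Layer specs below are made up for the demo.

def conv2d_params(c_in, c_out, kh, kw, bias=True):
    return c_out * c_in * kh * kw + (c_out if bias else 0)

def linear_params(in_f, out_f, bias=True):
    return out_f * in_f + (out_f if bias else 0)

layers = [
    ("conv1", conv2d_params(3, 16, 3, 3)),
    ("conv2", conv2d_params(16, 32, 3, 3)),
    ("fc", linear_params(32 * 8 * 8, 10)),
]

total = 0
for name, n in layers:
    total += n
    print(f"{name:<8}{n:>10,}")
print(f"{'total':<8}{total:>10,}")
```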
A compiler from an AI model to an RTL (Verilog) accelerator on FPGA hardware, with automatic design space exploration.
Intermediate Language (IL) for Hardware Accelerator Generators
Tengine is a lightweight, high-performance, modular inference engine for embedded devices
Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm
A flexible and efficient deep neural network (DNN) compiler that generates high-performance executables from a DNN model description.