charlifu (AMD)


A framework for few-shot evaluation of language models.

Python 7,581 2,038 Updated Jan 28, 2025

Stretching GPU performance for GEMMs and tensor contractions.

Python 231 154 Updated Jan 28, 2025

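Xiaohongshu note and comment crawler, Douyin video and comment crawler, Kuaishou video and comment crawler, Bilibili video and comment crawler, Weibo post and comment crawler, Baidu Tieba post and comment-reply crawler, Zhihu Q&A article and comment crawler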

Python 19,448 5,867 Updated Jan 20, 2025

A validation and profiling tool for AI infrastructure

Python 289 60 Updated Jan 8, 2025

Dissecting NVIDIA GPU Architecture

Cuda 84 27 Updated Jul 11, 2022

ROB size testing utility

C++ 141 14 Updated Dec 19, 2021

IREE plugin repository for the AMD AIE accelerator

MLIR 73 30 Updated Jan 28, 2025

A cheatsheet of modern C++ language and library features.

20,028 2,126 Updated Oct 15, 2024

Graph Neural Network Library for PyTorch

Python 21,804 3,742 Updated Jan 28, 2025

Library for specialized dense and sparse matrix operations, and deep learning primitives.

C 859 187 Updated Jan 28, 2025

An MLIR-based toolchain for AMD AI Engine-enabled devices.

MLIR 327 98 Updated Jan 28, 2025

METIS - Serial Graph Partitioning and Fill-reducing Matrix Ordering

C 753 150 Updated Oct 27, 2023

A high-performance, zero-overhead, extensible Python compiler using LLVM

C++ 15,322 525 Updated Jan 26, 2025

A JIT assembler for x86/x64 architectures supporting MMX, SSE (1-4), AVX (1-2, 512), FPU, APX, and AVX10.2

C++ 2,079 276 Updated Jan 24, 2025

A list of awesome GNN systems.

Python 299 27 Updated Jan 29, 2025

Python package built to ease deep learning on graph, on top of existing DL frameworks.

Python 13,683 3,027 Updated Jan 26, 2025

Resources on the GraphBLAS standard for graph algorithms in the language of linear algebra

191 11 Updated Oct 22, 2024

The Combinatorial BLAS (CombBLAS) is an extensible distributed-memory parallel graph library offering a small but powerful set of linear algebra primitives specifically targeting graph analytics.

C++ 70 23 Updated Dec 9, 2024

ParMETIS - Parallel Graph Partitioning and Fill-reducing Matrix Ordering

C 125 46 Updated Dec 8, 2023

PyTorch-Based Fast and Efficient Processing for Various Machine Learning Applications with Diverse Sparsity

Cuda 100 27 Updated Jan 20, 2025

Machine learning compiler based on MLIR for Sophgo TPU.

C++ 649 164 Updated Jan 21, 2025

🌟 Wiki of OI / ICPC for everyone. (An online strategy guide for a certain massive game, featuring dazzling arithmetic magic)

TypeScript 22,005 4,078 Updated Jan 28, 2025

PyTorch Extension Library of Optimized Autograd Sparse Matrix Operations

Python 1,032 150 Updated Jan 10, 2025

Stack trace visualizer

Perl 17,713 1,994 Updated Oct 20, 2024

C++ 2 Updated Jun 11, 2022

A microbenchmark support library

C++ 9,216 1,646 Updated Jan 24, 2025

Low-latency machine code generation

C++ 4,026 512 Updated Jan 22, 2025

A list of awesome compiler projects and papers for tensor computation and deep learning.

2,472 307 Updated Oct 19, 2024

Conversions to MLIR EmitC

C++ 126 23 Updated Dec 12, 2024