
Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 8,920 637 Updated Jan 16, 2025

AlphaFold 3 inference pipeline.

Python 5,864 709 Updated Jan 16, 2025

The repository has collected a batch of noteworthy MLSys bloggers (Algorithms/Systems)

HTML 138 1 Updated Jan 5, 2025

Puzzles for learning Triton, playable with minimal environment configuration!

Python 203 10 Updated Dec 3, 2024

xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism

Python 1,157 95 Updated Jan 18, 2025

Xiao's CUDA Optimization Guide [Active Adding New Contents]

258 18 Updated Nov 8, 2022
Python 1,243 177 Updated Nov 20, 2024

Simplified Chinese translation of the well-known CMake tutorial Modern CMake. Chinese Gitbook: https://modern-cmake-cn.github.io/Modern-CMake-zh_CN/ GitHub Pages: https://modern…

CMake 758 93 Updated Aug 6, 2024

Stepwise optimizations of DGEMM on CPU, eventually reaching performance faster than Intel MKL, even under multithreading.

C 121 25 Updated Feb 3, 2022

Zhihu extension built on vscode.

TypeScript 882 79 Updated Jun 27, 2023

🔥 A cross-platform build utility based on Lua

Lua 10,535 815 Updated Jan 18, 2025

Parallel Prefix Sum (Scan) with CUDA

Cuda 18 3 Updated Jun 22, 2024

A complete CMake tutorial. The tutorial consists of a series of step-by-step tasks that introduce CMake and show how to accomplish specific goals.

CMake 270 36 Updated Mar 29, 2021

Development repository for the Triton language and compiler

C++ 14,060 1,715 Updated Jan 18, 2025

LLM notes, including model inference, transformer model structure, and llm framework code analysis notes.

Python 393 34 Updated Jan 14, 2025
Python 96 20 Updated Aug 26, 2024

Chinese NLP solutions (large models, data, models, training, inference)

Jupyter Notebook 3,151 382 Updated Jan 2, 2025

Multimodal Transformers are Hierarchical Modal-wise Heterogeneous Graphs

Jupyter Notebook 1 Updated Jan 6, 2025

Material for gpu-mode lectures

Jupyter Notebook 3,504 353 Updated Jan 6, 2025

C++ implementation of Stanford CS149 assignment 4

C++ 1 Updated Mar 4, 2023

c++

C++ 1 Updated Mar 22, 2023

C++ implementation of Stanford CS149 assignment 1

C++ 12 Updated Feb 19, 2023

Learning materials for Stanford CS149 : Parallel Computing

C 195 28 Updated Jul 31, 2021

Intel® Implicit SPMD Program Compiler

C++ 2,572 317 Updated Jan 16, 2025
C++ 3 Updated Nov 21, 2024

Stanford CS149 -- Assignment 1

C++ 77 82 Updated Oct 2, 2024

Official implementation of paper 'Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models'.

Python 38 2 Updated Nov 27, 2024

RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RN…

Python 13,012 882 Updated Jan 10, 2025

[NeurIPS'24 Spotlight] Speeds up long-context LLM inference by computing attention with approximate, dynamic sparsity, which reduces inference latency by up to 10x for pre-filling on an A100 whil…

Python 878 41 Updated Dec 28, 2024

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python 85,952 23,138 Updated Jan 18, 2025