📰 Must-read papers and blogs on Speculative Decoding ⚡️

587 27 Updated Feb 11, 2025

A self-learning tutorial for CUDA High Performance Programming.

JavaScript 353 40 Updated Dec 17, 2024

Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)

Python 222 25 Updated Oct 25, 2024

Jetson Benchmark

Python 378 72 Updated Jun 20, 2024

Material for gpu-mode lectures

Jupyter Notebook 3,684 371 Updated Feb 9, 2025

⏰ Collaboratively track deadlines of conferences recommended by CCF (Website, Python CLI, WeChat Applet) / If you find it useful, please star this project, thanks~

Vue 6,860 471 Updated Feb 12, 2025

Training materials associated with NVIDIA's CUDA Training Series (www.olcf.ornl.gov/cuda-training-series/)

Cuda 687 252 Updated Aug 19, 2024

LLM inference in C/C++

C++ 74,015 10,686 Updated Feb 12, 2025

📖 A curated list of Awesome LLM/VLM Inference Papers with code: WINT8/4, Flash-Attention, Paged-Attention, Parallelism, etc. 🎉🎉

3,395 235 Updated Jan 31, 2025

Multiple NVIDIA GPUs or Apple Silicon for Large Language Model Inference?

Jupyter Notebook 1,369 51 Updated May 13, 2024

Tips for Writing a Research Paper using LaTeX

TeX 3,360 383 Updated May 4, 2023

Official Code for "SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression"

Python 113 10 Updated Jan 24, 2025

PyTorch implementation of Language model compression with weighted low-rank factorization

Python 8 2 Updated Jun 28, 2023

Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.

650 38 Updated Aug 3, 2024

A Langchain email agent that responds to incoming email. Email service with AWS SES.

JavaScript 10 4 Updated May 18, 2023

😎 Awesome list of tools and projects with the awesome LangChain framework

7,917 553 Updated Jan 27, 2025

🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.

1,916 94 Updated Jan 26, 2025

Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval.

Python 2,086 204 Updated Feb 11, 2025

✨✨Latest Advances on Multimodal Large Language Models

13,817 890 Updated Feb 11, 2025

Firefly: a large-model training tool that supports training Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models

Python 6,111 552 Updated Oct 24, 2024

System for AI Education Resource.

Python 3,824 478 Updated Oct 25, 2024

Homework solutions for CMU 10-414/714 – Deep Learning Systems: Algorithms and Implementation

Python 43 4 Updated Dec 12, 2022

The full minitorch student suite.

Python 2,003 437 Updated Aug 17, 2024

Function graph tracer for C/C++/Rust/Python

C 3,123 479 Updated Jan 20, 2025

Specs for new networking hardware offloads.

C 32 4 Updated Feb 5, 2025

The XQUIC library, released by Alibaba, is a cross-platform implementation of the QUIC and HTTP/3 protocols.

C 1,743 340 Updated Jan 28, 2025

Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.

Python 11,084 1,589 Updated Jan 19, 2025