University of Virginia
- Virginia
https://elio-yang.github.io/
Starred repositories
Batch-convert PPT files to PDF files with Automator on macOS
A low-latency & high-throughput serving engine for LLMs
A high-throughput and memory-efficient inference and serving engine for LLMs
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Practice on CIFAR-100 (ResNet, DenseNet, VGG, GoogleNet, InceptionV3, InceptionV4, Inception-ResNetv2, Xception, Resnet In Resnet, ResNext, ShuffleNet, ShuffleNetv2, MobileNet, MobileNetv2, SqueezeNet…
ASCII generator (image to text, image to image, video to video)
gem5-nvmain hybrid simulator supporting simulation of DRAM-NVM hybrid memory system
SHMA: Software-managed Caching for Hybrid DRAM/NVM Memory Architectures, implemented with zsim and nvmain hybrid simulators
Transforming Graphs for Efficient Irregular Graph Processing on GPUs
zxhero / gem5-CXL
Forked from gem5/gem5. This is a read-only mirror of the gem5 simulator. The upstream repository is stored in https://gem5.googlesource.com; code reviews should be submitted to https://gem5-review.googlesource.com/. The…
Open-source Framework for HPCA2024 paper: Gemini: Mapping and Architecture Co-exploration for Large-scale DNN Chiplet Accelerators
A Python package that extends official PyTorch to improve performance on Intel platforms
Evaluation code for confidential virtual machines (AMD SEV-SNP / Intel TDX)
GPGPU-Sim with the Turing WMMA API enabled, plus benchmark results. Undergraduate study at Yonsei Univ.
llama3.cuda is a pure C/CUDA implementation of the Llama 3 model.
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
Source code for the paper "Encrypted Image Classification with Low Memory Footprint using Fully Homomorphic Encryption"
A reading list for homomorphic encryption
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
Materials about Privacy-Preserving Machine Learning