JupyterLab for AI in Docker! Anaconda and PyTorch GPU supported.

Python 2 1 Updated Dec 9, 2024

AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning (ICLR 2023).

Python 281 28 Updated Jun 1, 2023

Reasoning in Large Language Models: Papers and Resources, including Chain-of-Thought and OpenAI o1 🍓

2,074 119 Updated Dec 11, 2024

Papers and resources on Controllable Generation using Diffusion Models, including ControlNet, DreamBooth, IP-Adapter.

401 22 Updated Sep 29, 2024

An implementation of local windowed attention for language modeling

Python 392 41 Updated Sep 6, 2024

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Python 6,714 372 Updated Jul 11, 2024

Extend existing LLMs way beyond the original training length with constant memory usage, without retraining

Python 683 40 Updated Apr 10, 2024

LLM101n: Let's build a Storyteller

30,492 1,671 Updated Aug 1, 2024

Diffusion Reading Group at EleutherAI

Jupyter Notebook 314 17 Updated Aug 8, 2023

Fast Hadamard transform in CUDA, with a PyTorch interface

C 119 17 Updated May 24, 2024

A banded matrix library for Python.

Python 26 9 Updated Feb 5, 2020

Efficient GPU kernels for block-sparse matrix multiplication and convolution

Cuda 1,030 202 Updated Jun 8, 2023

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, with automatic mixed precision (including fp8) and easy-to-configure FSDP and DeepSpeed support

Python 8,053 989 Updated Dec 13, 2024

Official repository of Agent Attention (ECCV2024)

Python 551 37 Updated Nov 17, 2024

Transformer based on a variant of attention that has linear complexity with respect to sequence length

Python 707 67 Updated May 5, 2024

Awesome list for LLM quantization

Python 134 10 Updated Dec 12, 2024

Code repo for the paper "SpinQuant: LLM quantization with learned rotations"

Python 183 17 Updated Nov 11, 2024

Samples for CUDA developers that demonstrate features in the CUDA Toolkit

C 6,583 1,859 Updated Jul 26, 2024

A list of papers, blogs, datasets and software in the field of lifelong/continual machine learning

280 44 Updated Mar 6, 2021

A PyTorch implementation of the Transformer model in "Attention is All You Need".

Python 8,932 1,987 Updated Apr 16, 2024

Code for the article "What if Neural Networks had SVDs?", to be presented as a spotlight paper at NeurIPS 2020.

Python 70 10 Updated Jul 25, 2024

Linux Kernel for Surface Devices

Shell 5,284 226 Updated Dec 11, 2024

CUDA Templates for Linear Algebra Subroutines

C++ 5,822 1,005 Updated Dec 11, 2024

Fast and memory-efficient exact attention

Python 14,621 1,371 Updated Dec 13, 2024
Python 10 Updated Oct 20, 2023

A learning rate range test implementation in PyTorch

Python 929 120 Updated Dec 1, 2024

A new regularization technique that stochastically freezes the layers of deep neural networks.

Python 4 Updated Jan 6, 2021

An implementation of Knowledge distillation for segmentation, to train a small (student) UNet from a larger (teacher) UNet thereby reducing the size of the network while achieving performance simil…

Python 51 13 Updated May 7, 2020

Optimization with orthogonal constraints and on general manifolds

Python 126 21 Updated Jul 13, 2020
TeX 5 Updated Dec 8, 2024