Stars
Accelerating Diffusion Transformers with Token-wise Feature Caching
A Python library for transferring PyTorch tensors between CPU and NVMe
A very simple and barebones tensor decomposition library for CP decomposition a.k.a. PARAFAC a.k.a. TCA
Stateful load balancer custom-tailored for llama.cpp 🏓🦙
Making large AI models cheaper, faster and more accessible
Open-Sora: Democratizing Efficient Video Production for All
jax-triton contains integrations between JAX and OpenAI Triton
Zero-copy MPI communication of JAX arrays, for turbo-charged HPC applications in Python ⚡
RUDOLPH: One Hyper-Tasking Transformer that can be as creative as DALL-E and GPT-3 and as smart as CLIP
Efficient LLM Inference over Long Sequences
PyTorch bindings for CUTLASS grouped GEMM for MoE.
fanshiqing / grouped_gemm
Forked from tgale96/grouped_gemm. PyTorch bindings for CUTLASS grouped GEMM.
A tool for bandwidth measurements on NVIDIA GPUs.
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more (a short sketch follows this list)
Library for reading and processing ML training data.
A language for fast, portable data-parallel computation
Best practices & guides on how to write distributed PyTorch training code
A native PyTorch library for large model training
A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models
PyTorch library for cost-effective, fast and easy serving of MoE models.
Development repository for the Triton language and compiler
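A minimal sketch (not taken from any of the repositories above) of the composable transformations named in the JAX entry: `jax.grad` differentiates a plain Python+NumPy-style function, `jax.vmap` vectorizes it over a batch, and `jax.jit` compiles the composition for GPU/TPU. The loss function and variable names are illustrative, not from any starred project.

```python
import jax
import jax.numpy as jnp

# A plain Python + NumPy-style scalar loss (illustrative).
def loss(w, x, y):
    pred = jnp.dot(x, w)
    return jnp.mean((pred - y) ** 2)

# Differentiate: gradient of the loss with respect to w.
grad_loss = jax.grad(loss)

# Vectorize: map the per-example gradient over a batch of (x, y) pairs.
per_example_grads = jax.vmap(grad_loss, in_axes=(None, 0, 0))

# JIT-compile the composed transformation for CPU/GPU/TPU.
fast_grads = jax.jit(per_example_grads)

w = jnp.ones(3)
xs = jnp.arange(12.0).reshape(4, 3)   # batch of 4 examples
ys = jnp.array([1.0, 2.0, 3.0, 4.0])
print(fast_grads(w, xs, ys).shape)    # (4, 3): one gradient per example
```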