yejingxin

8 followers · 10 following

Achievements

Lists (1)

Terraform

Stars

GoogleCloudPlatform / container-engine-accelerators

Collection of tools and examples for managing Accelerated workloads in Kubernetes Engine

Go 218 154 Updated Dec 12, 2024

NVIDIA / multi-gpu-programming-models

Examples demonstrating available options to program multiple GPUs in a single node or a cluster

Cuda 573 112 Updated Oct 30, 2024

NVIDIA / TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilizatio…

Python 2,015 334 Updated Dec 14, 2024

gkroiz / ray-on-gpu

Python 2 Updated Nov 4, 2024

volcengine / veScale

A PyTorch Native LLM Training Framework

Python 678 34 Updated Aug 25, 2024

NVIDIA / nvidia-resiliency-ext

NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the effective training time by minimizing the downtime due to fa…

Python 53 4 Updated Nov 27, 2024

siboehm / ShallowSpeed

Small scale distributed training of sequential deep learning models, built on Numpy and MPI.

Python 109 4 Updated Oct 19, 2023

facebookresearch / HolisticTraceAnalysis

A library to analyze PyTorch traces.

Python 312 45 Updated Dec 3, 2024

NVIDIA / Megatron-LM

Ongoing research training transformer models at scale

Python 10,807 2,416 Updated Dec 14, 2024

alibaba / Megatron-LLaMA

Forked from NVIDIA/Megatron-LM

Best practice for training LLaMA models in Megatron-LM

Python 634 53 Updated Jan 2, 2024

GoogleCloudPlatform / nvidia-nemo-on-gke

Training NVIDIA NeMo Megatron Large Language Model (LLM) using NeMo Framework on Google Kubernetes Engine

HCL 12 5 Updated Nov 19, 2024

AI-Hypercomputer / jetstream-pytorch

PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference

Python 42 15 Updated Dec 14, 2024

microsoft / msccl

Microsoft Collective Communication Library

C++ 325 30 Updated Sep 20, 2023

davidmrau / mixture-of-experts

PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538

Python 1,007 105 Updated Apr 19, 2024

BBuf / how-to-optim-algorithm-in-cuda

How to optimize some algorithms in CUDA.

Cuda 1,726 142 Updated Dec 12, 2024

gpu-mode / lectures

Material for gpu-mode lectures

Jupyter Notebook 3,170 325 Updated Dec 3, 2024

EleutherAI / gpt-neox

An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries

Python 6,993 1,022 Updated Dec 10, 2024

mlabonne / llm-course

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Jupyter Notebook 39,976 4,240 Updated Jul 28, 2024

character-ai / MuKoe

Python 52 7 Updated Apr 23, 2024

GoogleCloudPlatform / ramble

A multi-platform experimentation framework written in python.

Python 42 28 Updated Dec 13, 2024

GoogleCloudPlatform / ml-testing-accelerators

Testing framework for Deep Learning models (Tensorflow and PyTorch) on Google Cloud hardware accelerators (TPU and GPU)

Jsonnet 64 59 Updated Nov 22, 2024

AI-Hypercomputer / maxdiffusion

Python 165 17 Updated Dec 13, 2024

NVIDIA / JAX-Toolbox

JAX-Toolbox

Jupyter Notebook 268 50 Updated Dec 14, 2024

databricks / megablocks

Python 1,225 175 Updated Nov 20, 2024

stanford-crfm / levanter

Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax

Python 526 85 Updated Dec 12, 2024

SHI-Labs / NATTEN

Neighborhood Attention Extension. Bringing attention to a neighborhood near you!

Cuda 382 31 Updated Dec 2, 2024

spacelift-io-blog-posts / Blog-Technical-Content

Technical content from the Spacelift blog articles.

HCL 51 50 Updated Oct 18, 2023

remusao / remusao.github.io

My personal blog

CSS 5 3 Updated Nov 24, 2024

lucidrains / flash-attention-jax

Implementation of Flash Attention in Jax

Python 201 23 Updated Mar 1, 2024

ray-project / kuberay

A toolkit to run Ray applications on Kubernetes

Go 1,330 421 Updated Dec 14, 2024