Stars
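AISystem mainly covers AI systems, including full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks.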
NTSocks: An ultra-low latency and compatible PCIe interconnect for rack-scale disaggregation.
Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training
microsoft / Megatron-DeepSpeed
Forked from NVIDIA/Megatron-LM
Ongoing research training transformer language models at scale, including: BERT & GPT-2
Artifacts for ATC '22 paper "Faster Software Packet Processing on FPGA NICs with eBPF Program Warping"
OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Train transformer language models with reinforcement learning.
Awesome-LLM: a curated list of Large Language Model resources
A comprehensive, fast, pure-Python memcached client.
MoonGen is a fully scriptable high-speed packet generator built on DPDK and LuaJIT. It can saturate a 10 Gbit/s connection with 64 byte packets on a single CPU core while executing user-provided Lu…
Tartan: Evaluating Modern GPU Interconnect via a Multi-GPU Benchmark Suite
Modifications to GEM5 for running kernel-bypass networking (DPDK).
l-nic / chipyard
Forked from ucb-bar/chipyard
An Agile Chisel-Based SoC Design Framework
Ongoing research training transformer models at scale
ucsdsysnet / corundum
Forked from corundum/corundum
Open source FPGA-based NIC and platform for in-network compute
A cross-platform ChatGPT/Gemini UI (Web / PWA / Linux / Win / MacOS). Get your own cross-platform ChatGPT/Gemini/Claude LLM app with one click.