Stars
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
Fully open reproduction of DeepSeek-R1
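A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and are cheaper to train, including base models, domain-specific fine-tunes and applications, datasets, and tutorials.
Chinese translation of the LLMs-from-scratch project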
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
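A Logseq-to-Anki syncing plugin with superpowers - image occlusion, card direction, incremental cards, and a lot more.
Solution code in Python, Java, and C++ for 《剑指 Offer》 (Coding Interviews); companion code repository for the LeetBook 《图解算法数据结构》 (Illustrated Algorithms and Data Structures)
《Hello 算法》 (Hello Algo): a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart. Simplified and Traditional Chinese editions are updated in sync; English version ongoing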
An annotated implementation of the Transformer paper.
🎼 A structured Markdown engine that supports Go and JavaScript.
A privacy-first, self-hosted, fully open-source personal knowledge management tool, written in TypeScript and Go.
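Requires only basic Python: build a large language model from scratch, constructing GLM4, Llama3, and RWKV6 step by step to understand in depth how large models work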
Lightweight version of MAPPO to help you quickly migrate to your local environment.
Notebooks for the O'Reilly book "Learning Ray"
A PyTorch implementation of the multi-agent deep deterministic policy gradient (MADDPG) algorithm
A suite of hands-on training materials that show how to scale CV, NLP, and time-series forecasting workloads with Ray.
PyTorch implementations of popular off-policy multi-agent reinforcement learning algorithms, including QMix, VDN, MADDPG, and MATD3.
This is the official implementation of Multi-Agent PPO (MAPPO).
A high-performance, scalable MindSpore reinforcement learning framework.