  • Shanghai Jiao Tong University
  • Shanghai


Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).

HTML 183 38 Updated Dec 16, 2024

A TinyStories LM with SAEs and transcoders

Python 11 1 Updated Jan 2, 2025

CS229 Solution (summer 2019, 2020).

HTML 13 4 Updated Dec 30, 2023

An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)

Python 3,430 984 Updated Apr 24, 2024

Implementation of the Decision-Pretrained Transformer (DPT) from the paper Supervised Pretraining Can Learn In-Context Reinforcement Learning.

Python 57 9 Updated May 28, 2024

Curated list of datasets and tools for post-training.

2,741 235 Updated Jan 29, 2025

🐙 OctoPack: Instruction Tuning Code Large Language Models

Jupyter Notebook 453 28 Updated Feb 5, 2025

Data and Code for Program of Thoughts (TMLR 2023)

Python 260 22 Updated May 15, 2024

Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization'

Python 183 16 Updated Dec 2, 2024

MambaFormer in-context learning experiments and implementation for https://arxiv.org/abs/2402.04248

Python 36 Updated Jun 18, 2024

Mamba SSM architecture

Python 14,071 1,226 Updated Jan 18, 2025

Simulation code for paper "Training Dynamics of Multi-Head Softmax Attention for In-Context Learning: Emergence, Convergence, and Optimality"

Python 4 1 Updated Oct 27, 2024
Python 7 1 Updated Mar 18, 2024

Ongoing research training transformer models at scale

Python 11,556 2,589 Updated Feb 26, 2025

Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.

Python 2,488 467 Updated Apr 29, 2024

Paper list of multi-agent reinforcement learning (MARL)

4,212 742 Updated Oct 17, 2024

Rainbow is all you need! A step-by-step tutorial from DQN to Rainbow

Jupyter Notebook 1,913 342 Updated Nov 10, 2024

CS285 Homework

Jupyter Notebook 26 13 Updated Dec 20, 2020

All notes and materials for the CS229: Machine Learning course by Stanford University

Jupyter Notebook 2,188 885 Updated Feb 14, 2025
C++ 2 Updated Aug 12, 2023

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 42,108 5,155 Updated Feb 25, 2025

[ICML 2021] DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning | DouDizhu AI

Python 4,222 607 Updated Jun 26, 2024

Hung-yi Lee's Spring 2022 Machine Learning course, including slides and assignments.

Jupyter Notebook 142 40 Updated Sep 5, 2022

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)

Python 5,080 502 Updated Feb 22, 2025

The repository is for safe reinforcement learning baselines.

Jupyter Notebook 590 85 Updated Jan 27, 2025

💼 another CV template for your job application, yet powered by Typst and more

Typst 521 39 Updated Feb 22, 2025

An elegant \LaTeX\ résumé template. Mainland China mirror: https://gods.coding.net/p/resume/git

TeX 9,600 2,654 Updated Mar 15, 2024

🦜🔗 Build context-aware reasoning applications

Jupyter Notebook 101,641 16,484 Updated Feb 26, 2025

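Provides a practical interaction interface for LLMs such as GPT/GLM, specially optimized for reading, polishing, and writing papers. Modular design; supports custom shortcut buttons & function plugins, project analysis & self-translation for Python, C++, and other languages, PDF/LaTeX paper translation & summarization, parallel queries to multiple LLMs, and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, m…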

Python 67,706 8,302 Updated Feb 21, 2025