yancong222

Yan Cong yancong222

10 followers · 26 following

Lists (6)

Starred repositories

106 stars written in Python

huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Python 140,198 28,114 Updated Feb 26, 2025

scikit-learn / scikit-learn

scikit-learn: machine learning in Python

Python 61,236 25,630 Updated Feb 25, 2025

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 39,401 5,900 Updated Feb 26, 2025

deepspeedai / DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 37,032 4,261 Updated Feb 26, 2025

explosion / spaCy

💫 Industrial-strength Natural Language Processing (NLP) in Python

Python 30,999 4,452 Updated Feb 3, 2025

openai / openai-python

The official Python library for the OpenAI API

Python 24,782 3,605 Updated Feb 26, 2025

sebastianruder / NLP-progress

Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

Python 22,792 3,621 Updated Jul 28, 2024

huggingface / datasets

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

Python 19,686 2,768 Updated Feb 20, 2025

reddit-archive / reddit

historical code from reddit.com

Python 16,860 2,870 Updated Oct 17, 2017

openai / evals

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

Python 15,565 2,669 Updated Dec 18, 2024

m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 14,117 1,525 Updated Feb 23, 2025

stas00 / ml-engineering

Machine Learning Engineering Open Book

Python 12,933 791 Updated Feb 23, 2025

Embedding / Chinese-Word-Vectors

100+ Chinese Word Vectors 上百种预训练中文词向量

Python 11,928 2,323 Updated Oct 30, 2023

andrewyng / aisuite

Simple, unified interface to multiple Generative AI providers

Python 11,464 1,107 Updated Feb 18, 2025

lucidrains / DALLE2-pytorch

Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch

Python 11,224 1,090 Updated May 11, 2024

RUCAIBox / LLMSurvey

The official GitHub page for the survey paper "A Survey of Large Language Models".

Python 11,066 861 Updated Aug 20, 2024

OptimalScale / LMFlow

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.

Python 8,359 833 Updated Feb 22, 2025

EleutherAI / gpt-neo

An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.

Python 8,276 962 Updated Feb 25, 2022

stanfordnlp / stanza

Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages

Python 7,384 898 Updated Feb 25, 2025

EleutherAI / gpt-neox

An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries

Python 7,110 1,046 Updated Feb 18, 2025

facebookresearch / metaseq

Repo for external large-scale work

Python 6,517 727 Updated Apr 27, 2024

google-research / text-to-text-transfer-transformer

Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

Python 6,273 762 Updated Feb 25, 2025

zihangdai / xlnet

XLNet: Generalized Autoregressive Pretraining for Language Understanding

Python 6,184 1,177 Updated May 28, 2023

pytorch / torchtune

PyTorch native post-training library

Python 4,923 544 Updated Feb 25, 2025

wookayin / gpustat

📊 A simple command-line utility for querying and monitoring GPU status

Python 4,135 285 Updated Aug 8, 2024

xjdr-alt / entropix

Entropy Based Sampling and Parallel CoT Decoding

Python 3,321 319 Updated Nov 13, 2024

google / BIG-bench

Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models

Python 2,970 598 Updated Jul 19, 2024

JasonKessler / scattertext

Beautiful visualizations of how language differs among document types.

Python 2,284 292 Updated Sep 23, 2024

stanford-crfm / helm

Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image …

Python 2,066 278 Updated Feb 26, 2025

psychopy / psychopy

For running psychology and neuroscience experiments

Python 1,746 928 Updated Feb 24, 2025

Lists (6)

DSMs for Chinese

LLMs

Medical-NLP

Miscellaneous

SLS-NLP

Stats-teaching

Starred repositories

bea-workshop

psycholinguistics