Stars

PTM

42 repositories

An original implementation of "MetaICL: Learning to Learn In Context" by Sewon Min, Mike Lewis, Luke Zettlemoyer and Hannaneh Hajishirzi

Python 259 36 Updated Apr 15, 2023

A very comprehensive parallel corpus of Classical Chinese (ancient texts) and modern Chinese

Python 1,274 293 Updated Apr 21, 2024

PERT: Pre-training BERT with Permuted Language Model

357 24 Updated Mar 29, 2023

KdConv: A Chinese Multi-domain Dialogue Dataset Towards Multi-turn Knowledge-driven Conversation

Python 471 61 Updated May 8, 2023

Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.

Python 25,030 4,515 Updated Aug 18, 2024

Making large AI models cheaper, faster and more accessible

Python 40,373 4,464 Updated Feb 24, 2025

This repo contains code and instructions for baselines in the VLUE benchmark.

Python 41 3 Updated Jul 16, 2022

clueai toolkit: build the custom API you need with 3 lines of code in 3 minutes!

Python 231 31 Updated Apr 29, 2023

Repository of MatchFormer

Python 179 11 Updated Oct 7, 2022

Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models

Python 5,114 488 Updated Feb 15, 2025

Mengzi Pretrained Models

533 63 Updated Nov 29, 2022

Self-hosted, lightweight server and website monitoring and O&M tool

Go 8,252 1,398 Updated Feb 24, 2025

LERT: A Linguistically-motivated Pre-trained Language Model

Python 203 15 Updated Mar 29, 2023

MiniRBT (a series of small Chinese pre-trained models)

Python 264 17 Updated Apr 5, 2023

Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any Hugging Face text dataset.

Python 93 8 Updated Feb 9, 2023

SR based on LLMs.

Python 95 19 Updated Nov 13, 2022

Open-source pre-training implementation of Google's LaMDA in PyTorch. Adding RLHF similar to ChatGPT.

Python 471 74 Updated Feb 24, 2024

Examples and guides for using the OpenAI API

MDX 61,996 9,990 Updated Feb 20, 2025

A Chinese version of AI Dungeon that directly uses OpenAI's ChatGPT API as the storytelling model.

Python 1,391 141 Updated Mar 27, 2023

A Chinese version of AI Dungeon, fine-tuned from Qingyuan CPM (清源CPM)

Python 237 30 Updated Dec 7, 2022

⚡️ Python client for the unofficial ChatGPT API with auto token regeneration, conversation tracking, proxy support and more.

Python 4,217 442 Updated Jan 5, 2023

[NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining

Python 118 13 Updated Jul 25, 2023

bert-base-chinese example

Jupyter Notebook 877 233 Updated Aug 7, 2023

Code for CPM-2 Pre-Train

Python 158 26 Updated Mar 18, 2023

tiktoken is a fast BPE tokeniser for use with OpenAI's models.

Python 13,512 953 Updated Feb 14, 2025

Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM

Python 7,762 678 Updated Feb 15, 2025

ChatYuan: Large Language Model for Dialogue in Chinese and English

Python 1,896 183 Updated Jun 16, 2023

CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Generation

Python 485 72 Updated Dec 30, 2022

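MNBVC (Massive Never-ending BT Vast Chinese corpus): a very large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) data. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.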

3,719 260 Updated Feb 18, 2025