Stars
A quick guide (especially) for trending instruction finetuning datasets
Get up and running with Llama 3.3, Mistral, Gemma 2, and other large language models.
"LightRAG: Simple and Fast Retrieval-Augmented Generation"
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.
One-click Chinese data augmentation package; NLP data augmentation, BERT data augmentation, EDA: pip install nlpcda
Benchmark for Multi-Scenario-Recommendation.
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
A survey of the techniques and papers on CTR prediction and recommender systems in industry and academia, with 25 commonly used models implemented in PyTorch for reuse (2023).
A collection of notebooks/recipes showcasing some fun and effective ways of using Claude.
Universal cross-platform tokenizers binding to HF and sentencepiece
GLM-4 series: Open Multilingual Multimodal Chat LMs
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
MLGB is a library of 50+ deep models for CTR prediction and recommender systems, written in TensorFlow and PyTorch.
A configurable, tunable, and reproducible library for CTR prediction https://fuxictr.github.io
A simple local web interface that uses ChatTTS to synthesize text into speech, and also exposes an API for external use.
A generative speech model for daily dialogue.
Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks
Code for the piccolo embedding model from SenseTime
Curated list of ChatGPT prompts from the top-rated GPTs in the GPTs Store. Prompt engineering, prompt attacks, and prompt protection. Advanced prompt engineering papers.
A framework for prompt tuning using Intent-based Prompt Calibration
Computational geometry and spatial indexing on the sphere
Practice reimplementations of classic recommender-system models in PyTorch, such as MF, FM, DeepConn, MMOE, PLE, DeepFM, NFM, DCN, AFM, AutoInt, ONN, FiBiNET, DCN-v2, AFN, DCAP, and others