# LLM4REC

## Overview

1. LLMs enhance Recommendation
   - Feature Engineering
     - data augmentation
       - generate open-world knowledge for users/items
       - generate interaction data
     - data condensation
     - feature selection
     - feature imputation
   - Feature Encoder
     - encode text information
     - encode ID information
2. LLMs as Recommenders
   - prompt learning
   - instruction tuning
   - reinforcement learning
   - knowledge distillation
   - Pipeline Controller
     - pipeline design
     - CoT, ToT, SI
     - incremental learning
3. Other Related Work
   - Self-distillation in LLMs
   - DPO in LLMs
   - LLM4CTR

## 1. LLMs enhance Recommendation

### Feature Engineering

| Title | Model | Venue | Motivation |
| --- | --- | --- | --- |
| Bridging the User-side Knowledge Gap in Knowledge-aware Recommendations with Large Language Models | CIKGRec | AAAI'25 | Structures the user-side world knowledge inside LLMs to enhance knowledge-aware, graph-based recommendation. |
| Towards Open-World Recommendation with Knowledge Augmentation from Large Language Models | KAR | RecSys'24 | Uses the open-world knowledge of LLMs to enrich user and item information. |
| A First Look at LLM-Powered Generative News Recommendation | ONCE (GENRE+DIRE) | arXiv'23 | Uses open-source LLMs as feature encoders; for closed-source LLMs, enriches the training data via prompting. |
| LLMRec: Large Language Models with Graph Augmentation for Recommendation | LLMRec | WSDM'24 | Uses an LLM for graph data augmentation, selecting liked and disliked items from the item candidates. |
| Integrating Large Language Models into Recommendation via Mutual Augmentation and Adaptive Aggregation | Llama4Rec | arXiv'24 | Combines mutual augmentation (covering both data augmentation and prompt augmentation) with adaptive aggregation. |
| Data-efficient Fine-tuning for LLM-based Recommendation | DEALRec | SIGIR'24 | Designs an influence score and an effort score to distill the fine-tuning data for LLM-based recommendation, keeping the most influential samples. |
| Distillation is All You Need for Practically Using Different Pre-trained Recommendation Models | PRM-KD | arXiv'24 | Uses several types of pre-trained recommendation models as teachers and extracts in-batch negative item scores for joint knowledge distillation. |
| CoRAL: Collaborative Retrieval-Augmented Large Language Models Improve Long-tail Recommendation | CoRAL | KDD'24 | Uses reinforcement learning to inject collaborative information into the LLM via prompts, improving long-tail recommendation performance. |
| Harnessing Large Language Models for Text-Rich Sequential Recommendation | - | WWW'24 | Tackles data compression for LLM-based recommendation: splits the user's interaction history into segments, summarizes each segment with an LLM, then designs a prompt that combines the summarized preferences, recent interactions, and candidate items. |
| Large Language Models Enhanced Collaborative Filtering | LLM-CF | CIKM'24 | Distills the world knowledge and reasoning capabilities of LLMs into collaborative filtering via in-context learning and chain-of-thought. |
| Optimization Methods for Personalizing Large Language Models through Retrieval Augmentation | - | SIGIR'24 | LLMs do not tailor their outputs to a user's background and historical preferences; combines reinforcement learning and knowledge distillation to select the personal information that best augments the LLM. |
| Large Language Models for Next Point-of-Interest Recommendation | - | SIGIR'24 | Existing next-POI methods focus on short trajectories and cold-start users (few interactions, short trajectories) and under-exploit the rich LBSN data; the natural-language understanding of LLMs can handle all types of LBSN data and make better use of contextual information. |

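To make the data-augmentation line concrete, below is a minimal sketch of KAR-style knowledge augmentation: an LLM is prompted for open-world knowledge about a user, and the generated text becomes an extra feature for the downstream recommender. `call_llm`, the prompt wording, and the example history are illustrative stand-ins, not any paper's actual template.

```python
# Minimal sketch: LLM-generated user-side knowledge as an extra feature.
from typing import List

def call_llm(prompt: str) -> str:
    """Stub: replace with a real LLM call (API client or local model)."""
    return "The user appears to favor character-driven sci-fi ..."

def build_user_knowledge_prompt(history: List[str]) -> str:
    items = "; ".join(history)
    return (
        f"Here are the movies a user watched recently: {items}. "
        "Briefly describe this user's likely preferences "
        "(genres, themes, eras) in 2-3 sentences."
    )

history = ["Blade Runner", "Arrival", "Her"]
user_knowledge = call_llm(build_user_knowledge_prompt(history))
# The generated text is then encoded (e.g., with a text encoder) and fed to
# the downstream recommender as an additional user feature.
print(user_knowledge)
```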
### Feature Encoder

| Title | Model | Venue | Motivation |
| --- | --- | --- | --- |
| U-BERT: Pre-training user representations for improved recommendation | U-BERT | AAAI'21 | Early work that mainly uses BERT to encode review text. |
| Towards universal sequence representation learning for recommender systems | UniSRec | KDD'22 | Encodes item text with BERT and applies parametric whitening. |
| Learning vector-quantized item representation for transferable sequential recommenders | VQ-Rec | WWW'23 | First maps item text into a discrete index vector (the "item code"), then looks up a code embedding table with these indices to build the encoding. |
| Recommender Systems with Generative Retrieval | TIGER | NeurIPS'23 | Encodes items into semantically meaningful item IDs and directly predicts candidate IDs, performing end-to-end generative retrieval. |
| Representation Learning with Large Language Models for Recommendation | RLMRec | WWW'24 | Aligns LLM-encoded semantic features with the collaborative features of conventional models via two contrastive-learning objectives. |
| ReLLa: Retrieval-enhanced large language models for lifelong sequential behavior comprehension in recommendation | ReLLa | WWW'24 | For CTR, LLMs handle long sequences poorly; selects the items from the long history most similar to the target item to form the input sequence, with item embeddings built by the LLM from text. |
| Breaking the Length Barrier: LLM-Enhanced CTR Prediction in Long Textual User Behaviors | BAHE | SIGIR'24 (short) | LLM inference over long sequences is expensive; freezes the LLM's shallow layers and pre-computes their features for atomic behaviors, which are later fetched by table lookup. |
| Large Language Models Augmented Rating Prediction in Recommender System | LLM-TRSR | ICASSP'24 | Ensembles the outputs of an LLM recommender and a conventional recommender. |
| Enhancing Content-based Recommendation via Large Language Model | LOID | CIKM'24 (short) | Content semantics may differ across domains; exploits both LLM and conventional RS information, proposing a paradigm that aligns ID and content information by using ID embeddings as keys to extract information from the text-embedding sequence. |
| Aligning Large Language Models with Recommendation Knowledge | - | arXiv'24 | Transfers recommendation-domain knowledge, e.g. MIM and BPR, to the LLM via prompts. |
| The Elephant in the Room: Rethinking the Usage of Pre-trained Language Model in Sequential Recommendation | - | RecSys'24 | Most attention-layer parameters of pre-trained LMs for sequential recommendation go unused, indicating heavy redundancy; uses the LLM-learned item embeddings to initialize SASRec, which is then trained as usual. |
| Demystifying Embedding Spaces using Large Language Models | - | ICLR'24 | Uses an LLM to interpret item embedding spaces, including items never seen in the training data. |

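Most encoders in this table follow the same recipe: a frozen LM turns item text into a vector, and a small trainable head maps it into the recommender's space. Below is a minimal sketch in the spirit of UniSRec, with the paper's parametric whitening simplified to a learnable linear layer; the model choice and dimensions are assumptions.

```python
# Minimal sketch: frozen BERT [CLS] embeddings + a trainable projection.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased").eval()

item_texts = ["Title: Dune. Genre: sci-fi. Author: Frank Herbert."]
batch = tokenizer(item_texts, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():                                # BERT stays frozen
    cls = encoder(**batch).last_hidden_state[:, 0]   # (B, 768) [CLS] vectors

# Simplified stand-in for parametric whitening: a learnable linear map,
# trained jointly with the downstream recommender.
whiten = torch.nn.Linear(768, 64)
item_emb = whiten(cls)                               # (B, 64) item embeddings
```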
## 2. LLMs as Recommenders

### Scoring/Ranking

| Title | Model | Venue | Motivation |
| --- | --- | --- | --- |
| Recommendation as language processing (RLP): A unified pretrain, personalized prompt & predict paradigm (P5) | P5 | RecSys'22 | Designs multiple prompts for different tasks and re-pretrains on recommendation datasets, ultimately targeting zero-shot recommendation. |
| Text Is All You Need: Learning Language Representations for Sequential Recommendation | RecFormer | KDD'23 | Flattens item key-value pairs into sentence-like prompts and trains a Longformer to output representations of the user's interaction sequence (interests), then applies contrastive learning for the final recommendation. |
| Recommendation as instruction following: A large language model empowered recommendation approach | InstructRec | arXiv'23 | Uses instruction tuning, formatting active user instructions and passive interaction signals into instructions that guide the LLM through multi-task recommendation scenarios. |
| A bi-step grounding paradigm for large language models in recommendation systems | BIGRec | arXiv'23 | Targets the grounding problem with instruction tuning, "grounding the language space to the recommendation space". |
| A Multi-facet Paradigm to Bridge Large Language Model and Recommendation | TransRec | arXiv'23 | For item indexing, treats ID, title, and attributes as facets of an item; for generation grounding, intersects the generated identifiers with the identifiers of every in-corpus item to select items. |
| CoLLM: Integrating Collaborative Embeddings into Large Language Models for Recommendation | CoLLM | arXiv'23 | Places collaborative information captured by a conventional model into the LLM's prompt and maps it into the final embedding space. |
| LlamaRec: Two-Stage Recommendation using Large Language Models for Ranking | LlamaRec | CIKM'23 | Generating recommendations with an LLM is costly at inference time and still requires grounding; LlamaRec uses a verbalizer to convert the LM-head output (scores over all tokens) into ranking scores for the candidate items. |
| Large language models are zero-shot rankers for recommender systems | - | arXiv'23 | Uses an LLM to rank the candidate item set zero-shot. |
| Language models as recommender systems: Evaluations and limitations | LMRecSys | NeurIPS'21 | Uses prompt tuning, splitting the target item into multiple tokens; the LM outputs a distribution per token, from which the recommendation is made. |
| Prompt learning for news recommendation | Prompt4NR | SIGIR'23 | Designs discrete, continuous, and hybrid prompt templates with their corresponding answer spaces, and uses prompt ensembling to combine the best-performing templates. |
| Prompt distillation for efficient LLM-based recommendation | POD | CIKM'23 | Learns continuous prompts used as prefixes via prompt learning, distilling the information of discrete prompts into continuous ones. |
| Large Language Models as Zero-Shot Conversational Recommenders | - | CIKM'23 | An empirical study of representative LLMs on conversational recommendation in the zero-shot setting. |
| Leveraging Large Language Models (LLMs) to Empower Training-Free Dataset Condensation for Content-Based Recommendation | - | arXiv'23 | Condenses recommendation data: prompts the LLM to compress item information and extract user preferences, clusters users and selects the top-m by distance, then generates interaction data. |
| Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agents | - | arXiv'23 | Uses GPT models for text ranking and distills the GPT-generated annotations into a smaller model. |
| LLaRA: Aligning Large Language Models with Sequential Recommenders | LLaRA | arXiv'23 | Uses hybrid item representations in the prompt: textual representations combined with representations learned by a conventional model. |
| Collaborative Contextualization: Bridging the Gap between Collaborative Filtering and Pre-trained Language Model | CollabContext | arXiv'23 | Performs bidirectional distillation between LLM-learned textual representations and conventional-model representations. |
| Adapting Large Language Models by Integrating Collaborative Semantics for Recommendation | LC-Rec | ICDE'24 | For item indexing, designs a semantic mapping that assigns items meaningful, non-conflicting IDs, and proposes a series of specially designed tuning tasks that push the LLM to deeply integrate language and collaborative-filtering semantics. |
| Collaborative Large Language Model for Recommender Systems | CLLM4Rec | WWW'24 | To narrow the gap between natural-language and recommendation semantics, extends the vocabulary so that each user and item is bound to a unique token, and trains the embeddings of these added tokens with collaborative signals. |
| Play to Your Strengths: Collaborative Intelligence of Conventional Recommender Models and Large Language Models | - | arXiv'24 | For CTR: LLM inference is slow, and conventional and LLM recommenders excel on different data, so samples are routed between the two, with the conventional RS's low-confidence samples handed to the LLM. |
| GPT4Rec: A generative framework for personalized recommendation and user interests interpretation | GPT4Rec | arXiv'23 | Uses GPT-2 to generate queries from the interaction history and retrieves items with BM25. |
| Unsupervised large Language Model Alignment for Information Retrieval via Contrastive Feedback | - | SIGIR'24 | LLM responses fail to capture the differences between documents with similar content; designs a group-wise method to produce feedback signals and combines unsupervised learning with reinforcement learning so the LLM produces context-specific responses. |
| RDRec: Rationale Distillation for LLM-based Recommendation | RDRec | arXiv'24 | Existing LLM-based recommenders rarely model the rationale behind user interactions; prompts an LLM to extract user preferences and item attributes from reviews, then distills them into a small LM. |

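As a concrete example of the grounding tricks above, here is a minimal sketch of a LlamaRec-style verbalizer: rather than decoding free text, it reads the LM head's logits at the index tokens ("A", "B", "C", ...) that label the candidates and ranks by those scores. The token IDs and logits below are dummies; a real setup would take both from the model's tokenizer and forward pass.

```python
# Minimal sketch: verbalizer-based ranking from LM-head logits.
import torch

candidates = ["item_42", "item_7", "item_19"]
index_token_ids = [317, 347, 327]   # dummy ids for the tokens "A", "B", "C"

# Dummy LM-head output at the answer position; a real setup would read the
# logits at the last position of the ranking prompt's forward pass.
vocab_logits = torch.randn(32_000)

scores = vocab_logits[index_token_ids]            # one score per candidate
order = scores.argsort(descending=True).tolist()
print([candidates[i] for i in order])             # candidates by verbalizer score
```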
### Pipeline Controller

| Title | Model | Venue | Motivation |
| --- | --- | --- | --- |
| RecMind: Large language model powered agent for recommendation | RecMind | arXiv'23 | An LLM-powered recommendation agent that can reason, interact, and memorize, providing precise personalized recommendations. |
| Can Small Language Models be Good Reasoners for Sequential Recommendation? | SLIM | WWW'24 | Distills the step-by-step reasoning ability of a large LLM into a small one. |
| Preliminary Study on Incremental Learning for Large Language Model-based Recommender Systems | - | arXiv'23 | Finds experimentally that neither full retraining nor fine-tuning significantly improves LLM4Rec under incremental learning; designs a long-term LoRA (frozen) and a short-term LoRA (hot) to capture long- and short-term user preferences, respectively. |
| Scaling Law of Large Sequential Recommendation Models | - | arXiv'23 | Finds experimentally that scaling up model size greatly improves performance on challenging recommendation tasks such as cold start, robustness, and long-term preference. |

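To illustrate the distillation idea behind SLIM, here is a minimal sketch of building reasoning-distillation data: a large teacher LLM writes a step-by-step rationale over a user's history, and the (prompt, rationale) pairs become supervised fine-tuning data for a small student LM. `teacher_llm` and the prompt wording are placeholders, not the paper's template.

```python
# Minimal sketch: collecting CoT rationales from a teacher for distillation.
def teacher_llm(prompt: str) -> str:
    """Stub: replace with a call to a strong teacher LLM."""
    return "Step 1: the user watched three space operas ..."

def cot_prompt(history):
    return (
        f"User history: {', '.join(history)}.\n"
        "Reason step by step about what the user may like next, "
        "then name a category."
    )

histories = [["Dune", "Foundation", "Hyperion"]]
distill_data = [
    {"prompt": cot_prompt(h), "completion": teacher_llm(cot_prompt(h))}
    for h in histories
]
# `distill_data` then serves as supervised fine-tuning data for the student LM.
```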
## 3. Other Related Work

### Self-distillation in LLMs

| Title | Model | Venue | Motivation |
| --- | --- | --- | --- |
| SELF-INSTRUCT: Aligning Language Models with Self-Generated Instructions | SELF-INSTRUCT | arXiv'22 | Human-written instruction data for instruction tuning is costly, limited in diversity, and hard to generalize to broad scenarios; lets the LLM generate the instructions itself. |
| Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision | SELF-ALIGN | NeurIPS'23 | Human-annotated SFT and RLHF are costly and uneven in reliability and diversity; combines principle-driven reasoning with the generative power of LLMs to self-align an AI agent under minimal human supervision. |
| RLCD: Reinforcement Learning from Contrastive Distillation for LM Alignment | RLCD | ICLR'24 | Aligns a language model to principles expressed in natural language without human feedback: builds preference pairs from model outputs, one from a positive prompt that encourages following a given principle and one from a negative prompt that encourages violating it. |
| Impossible Distillation for Paraphrasing and Summarization: How to Make High-quality Lemonade out of Small, Low-quality Models | - | arXiv'23 | Extracts a high-quality dataset and model from a low-quality teacher (one that cannot itself perform the target task); the student LM is then further refined via self-distillation, training on its own high-quality data. |
| Large Language Models Can Self-Improve | - | arXiv'23 | Fine-tuning LLMs requires large amounts of supervised data, whereas human reflection needs no external input; lets the LLM "reflect" on unlabeled data, using chain-of-thought prompting and self-consistency to produce high-confidence answers. |
| Reinforced Self-Training (ReST) for Language Modeling | ReST | arXiv'23 | RLHF improves LLMs by aligning them with human preferences, but its online training strategy is expensive when handling new samples; offline RL addresses the cost, yet its quality depends heavily on the dataset, so a high-quality offline dataset is needed for effectiveness. |
| Self-Rewarding Language Models | - | arXiv'24 | RLHF reward models trained on human preferences are bounded by human-level performance, and once frozen they cannot improve during LLM training; lets the LLM provide and improve its own rewards during training. |
| Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data | Baize | arXiv'23 | Strong chat models such as ChatGPT are access-restricted (API only); to bring an open-source model close to ChatGPT, high-quality training data is needed, so a multi-turn chat corpus is generated automatically by having ChatGPT converse with itself; also proposes self-distillation with feedback to further improve Baize using ChatGPT feedback. |
| STaR: Self-Taught Reasoner Bootstrapping Reasoning With Reasoning | STaR | arXiv'22 | Chain-of-thought improves LLMs on complex reasoning, but existing methods either require massive CoT data (expensive) or use only a little (losing reasoning ability); has the LLM learn from its own generated rationales, correcting those that lead to wrong answers. |
| Self-Distillation Bridges Distribution Gap in Language Model Fine-Tuning | - | arXiv'24 | Fine-tuning an LLM for a specific task must balance task performance against general instruction following; has the LLM rewrite the task-specific responses to narrow the gap between the two distributions. |

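Several of these papers share the same bootstrap skeleton, which the SELF-INSTRUCT-style sketch below makes explicit: sample seed instructions, ask the model for a new one, filter near-duplicates, and grow the pool. `llm` and `too_similar` are stubs (SELF-INSTRUCT itself filters by ROUGE-L overlap); the loop structure is the point, not the details.

```python
# Minimal sketch: self-generated instruction bootstrapping.
import random

def llm(prompt: str) -> str:
    """Stub: replace with the model being bootstrapped."""
    return "Write a haiku about autumn."

def too_similar(new: str, pool: list[str]) -> bool:
    """Stub: SELF-INSTRUCT uses ROUGE-L overlap; any similarity check works."""
    return new in pool

pool = ["Summarize the following article.", "Translate this sentence to French."]
for _ in range(100):                           # generation rounds
    seeds = random.sample(pool, k=min(2, len(pool)))
    prompt = ("Here are some task instructions:\n- " + "\n- ".join(seeds)
              + "\nCome up with one new, different task instruction:")
    candidate = llm(prompt).strip()
    if candidate and not too_similar(candidate, pool):
        pool.append(candidate)                 # kept instructions later receive
                                               # LLM-written input/output pairs
                                               # for fine-tuning
```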
### Direct Preference Optimization (DPO) in LLMs

| Title | Model | Venue | Motivation/Loss Function |
| --- | --- | --- | --- |
| Direct Preference Optimization: Your Language Model is Secretly a Reward Model | DPO | NeurIPS'23 | Drops the reward-model stage of RLHF and optimizes the policy directly on preference data. |
| Statistical Rejection Sampling Improves Preference Optimization | RSO | ICLR'24 | Observes that DPO's preference data are not sampled from the optimal policy; introduces an explicit reward model and statistical rejection sampling so that data generated from the SFT model approximate the optimal policy's distribution. |
| KTO: Model Alignment as Prospect Theoretic Optimization | KTO | arXiv'24 | Recasts DPO as optimization over individually labeled data rather than preference pairs. |
| Curry-DPO: Enhancing Alignment using Curriculum Learning & Ranked Preferences | Curry-DPO | arXiv'24 | For multiple responses to the same prompt, builds pairwise data according to the reward gap and trains with curriculum learning from easy to hard. |
| LiPO: Listwise Preference Optimization through Learning-to-Rank | LiPO | arXiv'24 | Modifies the DPO loss to optimize listwise data directly. |
| ULMA: Unified Language Model Alignment with Human Demonstration and Point-wise Preference | ULMA | arXiv'23 | Modifies the DPO loss to optimize pointwise data directly. |
| Reinforcement Learning from Human Feedback with Active Queries | ADPO | arXiv'24 | An active-learning paradigm that discards preference pairs with small reward gaps. |
| RS-DPO: A Hybrid Rejection Sampling and Direct Preference Optimization Method for Alignment of Large Language Models | RS-DPO | arXiv'24 | Introduces an explicit reward model with statistical rejection sampling, removing pairs with small reward gaps to improve sample efficiency. |
| Direct Preference Optimization with an Offset | ODPO | arXiv'24 | Introduces an offset that encodes how strongly the preferred response is preferred over the rejected one. |
| BRAIN: Bayesian Reward-conditioned Amortized INference for natural language generation from feedback | BRAIN | arXiv'24 | Re-introduces a reward model to represent the degree of preference between chosen and rejected responses. |
| D2PO: Discriminator-Guided DPO with Response Evaluation Models | D2PO | arXiv'24 | An online training scheme that also trains a reward model, iteratively generating new samples from the current policy and the reward model during training. |
| Learn Your Reference Model for Real Good Alignment | TR-DPO | arXiv'24 | Updates the reference model during training, via either soft or hard updates. |
| sDPO: Don't Use Your Data All at Once | sDPO | arXiv'24 | Uses the training data in stages and updates the reference model as training proceeds. |
| Direct Language Model Alignment from Online AI Feedback | OAIF | arXiv'24 | Uses a stronger model to generate new preference pairs during training. |
| A General Theoretical Paradigm to Understand Learning from Human Preferences | IPO | PMLR'24 | Adds a regularization term to the DPO loss to avoid rapid overfitting during training. |
| Provably Robust DPO: Aligning Language Models with Noisy Feedback | rDPO | arXiv'24 | Modifies the DPO loss so it is robust to random flips of the preference labels. |
| Zephyr: Direct Distillation of LM Alignment | Zephyr | arXiv'23 | Uses a large model (GPT-4) to generate preference data, then fine-tunes a 7B model with DPO. |

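For reference, the objective that most of the variants above modify is the original DPO loss, where $y_w$ and $y_l$ are the preferred and rejected responses, $\pi_{\mathrm{ref}}$ is the frozen reference policy (usually the SFT model), and $\beta$ scales the implicit KL penalty:

$$
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\pi_{\mathrm{ref}}) =
-\mathbb{E}_{(x,y_w,y_l)\sim\mathcal{D}}
\left[\log\sigma\!\left(
\beta\log\frac{\pi_\theta(y_w\mid x)}{\pi_{\mathrm{ref}}(y_w\mid x)}
-\beta\log\frac{\pi_\theta(y_l\mid x)}{\pi_{\mathrm{ref}}(y_l\mid x)}
\right)\right]
$$

A direct PyTorch transcription, taking summed per-sequence log-probabilities (one per batch element) from the policy and the frozen reference model:

```python
# The DPO loss, computed from per-sequence log-probabilities.
import torch.nn.functional as F

def dpo_loss(pi_logp_w, pi_logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Names match the formula above: w = chosen, l = rejected."""
    logits = beta * ((pi_logp_w - ref_logp_w) - (pi_logp_l - ref_logp_l))
    return -F.logsigmoid(logits).mean()
```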
### LLM4CTR

| Title | Model | Venue | Description |
| --- | --- | --- | --- |
| CTR-BERT: Cost-effective knowledge distillation for billion-parameter teacher models | CTR-BERT | NeurIPS WS'21 | A cost-effective knowledge-distillation method for billion-parameter teacher models. |
| DCAF-BERT: A Distilled Cachable Adaptable Factorized Model For Improved Ads CTR Prediction | DCAF-BERT | WWW'22 | A distilled, cacheable, adaptable factorized model for improved ads CTR prediction. |
| Learning Supplementary NLP Features for CTR Prediction in Sponsored Search | - | KDD'22 | Learns supplementary NLP features for CTR prediction in sponsored search. |
| Practice on Effectively Extracting NLP Features for Click-Through Rate Prediction | - | CIKM'23 | Practical lessons on effectively extracting NLP features for CTR prediction. |
| BERT4CTR: An Efficient Framework to Combine Pre-trained Language Model with Non-textual Features for CTR Prediction | BERT4CTR | KDD'23 | An efficient framework that combines a pre-trained language model with non-textual features for CTR prediction. |
| M6-Rec: Generative pretrained language models are open-ended recommender systems | M6-Rec | arXiv'22 | Uses generative pre-trained language models as open-ended recommender systems. |
| CTRL: Connect tabular and language model for CTR prediction | CTRL | arXiv'23 | Connects tabular data and language models for CTR prediction. |
| FLIP: Towards Fine-grained Alignment between ID-based Models and Pretrained Language Models for CTR Prediction | FLIP | arXiv'23 | Pursues fine-grained alignment between ID-based models and pre-trained language models for CTR prediction. |
| TBIN: Modeling Long Textual Behavior Data for CTR Prediction | TBIN | arXiv'23 | Models long textual behavior data for CTR prediction. |
| An Unified Search and Recommendation Foundation Model for Cold-Start Scenario | - | CIKM'23 | A unified search and recommendation foundation model for cold-start scenarios. |
| A Unified Framework for Multi-Domain CTR Prediction via Large Language Models | - | arXiv'23 | A unified framework for multi-domain CTR prediction via large language models. |
| UFIN: Universal Feature Interaction Network for Multi-Domain Click-Through Rate Prediction | UFIN | arXiv'23 | A universal feature-interaction network for multi-domain CTR prediction. |
| ClickPrompt: CTR Models are Strong Prompt Generators for Adapting Language Models to CTR Prediction | ClickPrompt | WWW'24 | Uses CTR models as strong prompt generators to adapt language models to CTR prediction. |
| PRINT: Personalized Relevance Incentive Network for CTR Prediction in Sponsored Search | PRINT | WWW'24 | A personalized relevance incentive network for CTR prediction in sponsored search. |
| Breaking the Length Barrier: LLM-Enhanced CTR Prediction in Long Textual User Behaviors | - | arXiv'24 | LLM-enhanced CTR prediction over long textual user behaviors. |
| KELLMRec: Knowledge-Enhanced Large Language Models for Recommendation | KELLMRec | arXiv'24 | Knowledge-enhanced large language models for recommendation. |
| Enhancing sequential recommendation via LLM-based semantic embedding learning | - | WWW'24 | Enhances sequential recommendation via LLM-based semantic embedding learning. |
| Heterogeneous knowledge fusion: A novel approach for personalized recommendation via LLM | - | RecSys'23 | Fuses heterogeneous knowledge for personalized recommendation via LLMs. |
| Play to Your Strengths: Collaborative Intelligence of Conventional Recommender Models and Large Language Models | - | arXiv'24 | Exploits the collaborative intelligence of conventional recommender models and LLMs, playing to each one's strengths. |
| Generative Explore-Exploit: Training-free Optimization of Generative Recommender Systems using LLM Optimizers | - | arXiv'24 | Training-free optimization of generative recommender systems using LLM optimizers, implementing an explore-exploit strategy. |

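A common denominator of many LLM4CTR designs (e.g., BERT4CTR or CTRL above) is fusing an LM-derived text embedding with conventional ID embeddings inside a small prediction head. The sketch below shows only that fusion pattern; the dimensions, names, and plain-concatenation design are assumptions, not any single paper's architecture.

```python
# Minimal sketch: fusing a frozen-LM text embedding with ID embeddings for CTR.
import torch
import torch.nn as nn

class TextAwareCTR(nn.Module):
    def __init__(self, n_users, n_items, id_dim=32, text_dim=768):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, id_dim)
        self.item_emb = nn.Embedding(n_items, id_dim)
        self.mlp = nn.Sequential(
            nn.Linear(2 * id_dim + text_dim, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, user_ids, item_ids, item_text_emb):
        # item_text_emb: pre-computed by a (frozen) language model.
        x = torch.cat(
            [self.user_emb(user_ids), self.item_emb(item_ids), item_text_emb],
            dim=-1,
        )
        return torch.sigmoid(self.mlp(x)).squeeze(-1)   # click probability

model = TextAwareCTR(n_users=1000, n_items=5000)
p_click = model(torch.tensor([3]), torch.tensor([42]), torch.randn(1, 768))
```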
### Feature Selection

| Title | Model | Venue | Description |
| --- | --- | --- | --- |
| ICE-SEARCH: A Language Model-Driven Feature Selection Approach | ICE-SEARCH | arXiv'24 | A language-model-driven feature-selection method. |
| Large Language Model Pruning | - | arXiv'24 | Model pruning for large language models. |
| Dynamic and Adaptive Feature Generation with LLM | - | arXiv'24 | Uses LLMs for dynamic and adaptive feature generation. |

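As a rough illustration of the LLM-driven feature-selection idea (in the spirit of ICE-SEARCH, though not its actual procedure), the sketch below asks an LLM to shortlist candidate features, with the shortlist to be validated by retraining a downstream model. `call_llm`, the prompt, and the feature names are all hypothetical.

```python
# Minimal sketch: prompting an LLM to shortlist predictive features.
def call_llm(prompt: str) -> str:
    """Stub: replace with a real LLM call."""
    return "age, avg_session_length, last_purchase_days"

features = ["age", "zip_code", "avg_session_length",
            "browser_version", "last_purchase_days"]
prompt = (
    "Task: predict whether a user clicks a product ad.\n"
    f"Candidate features: {', '.join(features)}.\n"
    "Return a comma-separated shortlist of the most predictive features."
)
selected = [f.strip() for f in call_llm(prompt).split(",") if f.strip() in features]
# `selected` would then be validated by retraining a downstream CTR model.
```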