aounon/llm-rank-optimizer

Manipulating Large Language Models to Increase Product Visibility

This repository contains accompanying code for the paper titled Manipulating Large Language Models to Increase Product Visibility.

Introduction

Large language models (LLMs) are being used to search product catalogs and provide users with personalized recommendations tailored to their specific query. In this work, we investigate whether LLM recommendations can be manipulated to enhance a product’s visibility. We demonstrate that adding a strategic text sequence (STS) to a target product’s information page can significantly increase its likelihood of being listed as the LLM’s top recommendation. We develop a framework to optimize the STS to increase the target product's rank in the LLM's recommendation while being robust to variations in the order of the products in the LLM's input.

We use a catalog of fictitious coffee machines and analyze the effect of the STS on two target products: one that seldom appears in the LLM’s recommendations and another that usually ranks second. We observe that the strategic text sequence significantly enhances the visibility of both products by increasing their chances of appearing as the top recommendation.

This Repository

Generating STS: The file rank_opt.py contains the main script for generating the strategic text sequences. It uses the list of products in data/coffee_machines.jsonl as the catalog and optimizes the STS to maximize the probability that the target product is ranked first. The following is an example command for running this script:

python rank_opt.py --results_dir [path/to/save/results] --target_product_idx [num] --num_iter [num] --test_iter [num] --random_order --mode [self or transfer]

Options:

  1. --results_dir: To specify the location to save the outputs of the script, such as the STS of the target product.

  2. --target_product_idx: To specify the index of the target product in the list of products in data/coffee_machines.jsonl.

  3. --num_iter: Number of iterations of the optimization algorithm.

  4. --test_iter: Interval to test the STS.

  5. --random_order: To optimize the STS to tolerate variations in the product order.

  6. --mode: Mode in which to generate the STS:

    a. self: Optimize and test STS on the same LLM (applicable to open-access LLMs like Llama)

    b. transfer: Optimize to transfer to a different LLM (applicable for API-access models like GPT-3.5), e.g., Optimize using Llama and Vicuna, and test on GPT-3.5.

rank_opt.py generates the STS for the target product and saves plots of the target loss and the rank of the target product in the results directory. See the bash scripts self.sh and transfer.sh for usage of the above options.
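The --random_order option concerns robustness to the order in which products appear in the LLM's input. As a rough illustration of what varying the order means, the sketch below shuffles a catalog and tracks where the target product ends up. The function name shuffled_catalog and its arguments are hypothetical and are not taken from rank_opt.py.

```python
import random


def shuffled_catalog(products, target_idx, rng):
    """Return a shuffled copy of the catalog and the target's new position.

    Hypothetical helper for illustration; not the repository's implementation.
    """
    order = list(range(len(products)))
    rng.shuffle(order)  # permute the indices, leaving the input list untouched
    shuffled = [products[i] for i in order]
    return shuffled, order.index(target_idx)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so that repeated runs give the same order
    catalog = ["machine A", "machine B", "machine C", "machine D"]
    shuffled, pos = shuffled_catalog(catalog, target_idx=2, rng=rng)
    print(shuffled, "target is now at position", pos)
```

Drawing a fresh permutation at each optimization step is one plausible way to avoid fitting the STS to a single fixed ordering.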

coffee_machines.jsonl in data contains a catalog of ten fictitious coffee machines listed in increasing order of price.
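A JSONL file stores one JSON object per line, so the catalog can be read with the standard library alone. The following is a minimal sketch; load_catalog is a hypothetical helper, and the field names inside each product record are not assumed here.

```python
import json
from pathlib import Path


def load_catalog(path):
    """Read one JSON object per non-empty line and return them as a list."""
    products = []
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                products.append(json.loads(line))
    return products


if __name__ == "__main__":
    catalog = load_catalog("data/coffee_machines.jsonl")
    # --target_product_idx refers to a position in this list.
    print(len(catalog), "products loaded")
```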

Evaluating STS: evaluate.py evaluates the STS generated by rank_opt.py. We obtain product recommendations from an LLM with and without the STS in the target product's description in the catalog, then compare the rank of the target product in the LLM's recommendation in the two scenarios. We repeat this experiment several times to quantify the advantage obtained from using the STS. The following is an example command for running the evaluation script:

python evaluate.py --model_path [LLM for STS evaluation] --prod_idx [num] --sts_dir [path/to/STS] --num_iter [num] --prod_ord [random or fixed]

Options:

  1. --model_path: Path to the LLM to use for STS evaluation.

  2. --prod_idx: Target product index.

  3. --sts_dir: Path to STS to evaluate. Same as --results_dir for rank_opt.py.

  4. --num_iter: To specify the number of evaluations.

  5. --prod_ord: To specify the product order in the LLM's input.
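The evaluation compares the target product's rank in the LLM's recommendation with and without the STS. One simple way to read a rank off a free-text recommendation is to order products by where their names are first mentioned. The sketch below takes that approach as an assumption; product_rank is a hypothetical function, and evaluate.py may determine ranks differently.

```python
def product_rank(response, product_names, target_name):
    """Rank the target by order of first mention in the response.

    Assumption for illustration: a product mentioned earlier is ranked higher.
    Returns len(product_names) + 1 if the target is never mentioned.
    """
    positions = {}
    for name in product_names:
        idx = response.find(name)  # -1 when the name does not occur
        if idx != -1:
            positions[name] = idx
    if target_name not in positions:
        return len(product_names) + 1
    ordered = sorted(positions, key=positions.get)
    return ordered.index(target_name) + 1
```

This sketch does not handle product names that are substrings of one another; such catalogs would need stricter matching.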

Plotting Results: plot_dist.py plots the distribution of the target product's rank before and after STS insertion. It also plots the advantage obtained by using the STS (% of times the target product ranks higher).
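The two quantities described above can be computed from paired lists of ranks. The sketch below assumes that each trial yields one rank without the STS and one with it, and that "ranks higher" means a strictly smaller rank number. Both function names are hypothetical and are not taken from plot_dist.py.

```python
from collections import Counter


def rank_distribution(ranks):
    """Return the fraction of trials in which each observed rank occurred."""
    counts = Counter(ranks)
    total = len(ranks)
    return {rank: counts[rank] / total for rank in sorted(counts)}


def advantage(ranks_without, ranks_with):
    """Percentage of paired trials where the rank with the STS is strictly better."""
    pairs = list(zip(ranks_without, ranks_with))
    better = sum(1 for before, after in pairs if after < before)
    return 100.0 * better / len(pairs)


if __name__ == "__main__":
    # Made-up ranks, for illustration only.
    without_sts = [3, 2, 5, 1]
    with_sts = [1, 2, 1, 2]
    print(rank_distribution(with_sts))
    print(advantage(without_sts, with_sts))
```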

See scripts eval_self.sh and eval_transfer.sh for usage of evaluate.py and plot_dist.py.

System Requirements: The strategic text sequences were optimized using NVIDIA A100 GPUs with 80GB memory. When run in transfer mode, rank_opt.py requires access to GPUs. All the above scripts need to be run in a Conda environment created as per the instructions below.

Installation

Follow the instructions below to set up the environment for the experiments.

  1. Install Anaconda:
  2. Set up conda environment llm-rank with required packages:
    conda env create -f env.yml
    
  3. Activate environment:
    conda activate llm-rank
    

Manually Build Environment (Optional)

If setting up the environment using env.yml does not work, manually build an environment with the required packages using the following steps:

  1. Create Conda Environment with Python:
    conda create -n [env] python=3.10
    
  2. Activate environment:
    conda activate [env]
    
  3. Install PyTorch with CUDA from: https://pytorch.org/
    conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
    
  4. Install transformers from Huggingface:
    conda install -c huggingface transformers
    
  5. Install accelerate:
    conda install -c conda-forge accelerate
    
  6. Install scikit-learn (required for training safety classifiers):
    conda install -c anaconda scikit-learn
    
  7. Install seaborn:
    conda install anaconda::seaborn
    
  8. Install termcolor:
    conda install -c conda-forge termcolor
    
  9. Install the OpenAI Python package:
    conda install conda-forge::openai
    
