- SemStamp and k-SemStamp now both support multi-GPU implementations.
- Added more instructions to clarify usage.
- Integrated the data repository with the original data from the paper.
To cite:
@inproceedings{hou-etal-2023-semstamp,
title = "SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation",
author = "Hou, Abe Bohan* and
Zhang, Jingyu* and
He, Tianxing* and
Chuang, Yung-Sung and
Wang, Hongwei and
Shen, Lingfeng and
Van Durme, Benjamin and
Khashabi, Daniel and
Tsvetkov, Yulia",
booktitle = "Annual Conference of the North American Chapter of the Association for Computational Linguistics",
year = "2023",
url = "https://arxiv.org/abs/2310.03991",
}
@inproceedings{hou-etal-2024-k,
title = "k-{S}em{S}tamp: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text",
author = "Hou, Abe and
Zhang, Jingyu and
Wang, Yichen and
Khashabi, Daniel and
He, Tianxing",
editor = "Ku, Lun-Wei and
Martins, Andre and
Srikumar, Vivek",
booktitle = "Findings of the Association for Computational Linguistics: ACL 2024",
month = aug,
year = "2024",
address = "Bangkok, Thailand",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2024.findings-acl.98",
doi = "10.18653/v1/2024.findings-acl.98",
pages = "1706--1715",
}
Recursive clone:
git clone --recurse-submodules https://github.com/bohanhou14/SemStamp.git
(Recommended) create a virtual environment, and then:
pip install -r requirements.txt
(MANDATORY) install the NLTK punkt tokenizer:
python install_punkt.py
This is the repo for SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation (accepted to NAACL 2024) and k-SemStamp: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text (accepted to ACL 2024 Findings).
SemStamp is a semantic watermark for Large Language Model (LLM) text generation that makes generated text detectable. SemStamp uses Locality-Sensitive Hashing (LSH) to partition the high-dimensional sentence-embedding space and generates sentences whose LSH signatures follow a pseudo-randomly controlled sequence. At detection time, the algorithm computes the LSH signatures of the input sentences, checks whether they form the expected pseudo-random sequence, and applies a z-test on that pseudo-randomness to decide whether the text is watermarked. k-SemStamp is a simple yet effective variant of SemStamp with a similar setup that instead partitions the embedding space with k-means clustering.
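For intuition, the z-test reduces to a one-proportion test on how many sentences land in valid regions. Below is a minimal sketch; the valid-region ratio `gamma = 0.25` is illustrative only, and the actual statistic is computed in detection.py:

```python
import math

def detection_z_score(n_valid: int, n_total: int, gamma: float = 0.25) -> float:
    """Under the null hypothesis (non-watermarked text), each sentence falls
    in a valid region with probability gamma; a large z suggests a watermark."""
    return (n_valid - gamma * n_total) / math.sqrt(n_total * gamma * (1 - gamma))

print(detection_z_score(14, 20))  # ~4.65: far above typical thresholds
```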
For instance, to reproduce the Pegasus paraphrase results:
1. Load the human subset:
python load_c4.py
2. Run detection (MUST be in a GPU environment), for example:
python detection.py semstamp-data/c4-semstamp-pegasus-parrot/semstamp-pegasus-bigram=False --detection_mode lsh --sp_dim 3 --embedder AbeHou/SemStamp-c4-sbert --human_text semstamp-data/original-c4-texts
python detection.py semstamp-data/c4-semstamp-pegasus-parrot/semstamp-pegasus-bigram=True --detection_mode lsh --sp_dim 3 --embedder AbeHou/SemStamp-c4-sbert --human_text semstamp-data/original-c4-texts
python detection.py semstamp-data/c4-ksemstamp-pegasus/bigram=False --detection_mode kmeans --sp_dim 8 --embedder AbeHou/SemStamp-c4-sbert --cc_path centroids/c4-cluster_8_centers.pt --human_text semstamp-data/original-c4-texts
3. Feel free to run detection on other data from semstamp-data.
Please read the following guides.
The high-level pipeline of SemStamp is outlined below:
- Fine-tune a robust sentence embedder so that semantically similar sentences are encoded with embeddings of high cosine similarity.
- LSH partitions the embedding space by fixing random hyperplanes and assigning each vector a signature based on the signs of its dot products with those hyperplanes (see the sketch after this list).
- Given an input sentence $s_1$, repeatedly generate candidate sentences $s_t$, $t = 2, \dots$, until $\mathrm{LSH}(s_t) \in \mathrm{valid}(s_{t-1})$, stopping when max_new_tokens is reached. The valid mask is controlled by the LSH signature of $s_{t-1}$.
- Attempt to remove the sentence watermark through sentence-level paraphrasing.
- At detection time, check whether $\mathrm{LSH}(s_t) \in \mathrm{valid}(s_{t-1})$ holds for $t = 2, \dots$
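For intuition, here is a minimal NumPy sketch of such a cosine-LSH signature. The fixed seed stands in for the repo's own seeding scheme; the real implementation lives in the sampling utilities:

```python
import numpy as np

def lsh_signature(embedding: np.ndarray, hyperplanes: np.ndarray) -> int:
    """Read the signs of the dot products with the fixed hyperplanes as the
    bits of an integer signature in [0, 2**sp_dim)."""
    bits = (hyperplanes @ embedding > 0).astype(int)
    return int("".join(map(str, bits)), 2)

sp_dim, emb_dim = 3, 768                  # 3 hyperplanes -> 2**3 = 8 regions
rng = np.random.default_rng(0)            # fixed seed: hyperplanes must be shared
hyperplanes = rng.standard_normal((sp_dim, emb_dim))
print(lsh_signature(rng.standard_normal(emb_dim), hyperplanes))
```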
1. Create the data/ directory and load the C4 data, which also creates the human subset for evaluation:
python load_c4.py
2. (Optional) Fine-tune the sentence embedder, or use a fine-tuned sentence embedder from AbeHou/SemStamp-booksum-sbert or AbeHou/SemStamp-c4-sbert.
- Fine-tuning procedure:
# 1. Build a smaller Hugging Face dataset from c4-train with a 'text' column (recommended size: 8k), saved with the .save_to_disk() API
python build_subset.py data/c4-train --n 8000
# 2. paraphrase
python paraphrase_gen.py data/c4-train-8000
# 3. fine-tune
python finetune_embedder.py --model_name_or_path all-mpnet-base-v2 \
--dataset_path data/c4-train-8000-pegasus-bigram=False-threshold=0.0 \
--output_dir $OUTPUT_DIR --learning_rate 4e-5 --warmup_steps 50 \
--max_seq_length 64 --num_train_epochs 3 --logging_steps 10 \
--evaluation_strategy epoch --save_strategy epoch \
--remove_unused_columns False --delta 0.8 --do_train --overwrite_output_dir
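The objective behind --delta 0.8 is contrastive: paraphrase pairs should be markedly more similar than non-paraphrase pairs. Below is a minimal sketch of one such margin loss, as an illustration of the idea only; see finetune_embedder.py for the actual objective:

```python
import torch
import torch.nn.functional as F

def margin_contrastive_loss(anchor, paraphrase, negative, delta=0.8):
    """Hinge loss: push cos(anchor, paraphrase) to exceed
    cos(anchor, negative) by at least the margin delta."""
    pos = F.cosine_similarity(anchor, paraphrase, dim=-1)
    neg = F.cosine_similarity(anchor, negative, dim=-1)
    return torch.clamp(delta - pos + neg, min=0.0).mean()

# Toy batch of 4 embeddings of dimension 768
a, p, n = (torch.randn(4, 768) for _ in range(3))
print(margin_contrastive_loss(a, p, n))
```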
3. Produce SemStamp generations:
# 1. Build a smaller Hugging Face dataset from c4-val with a 'text' column (e.g. 1k texts), saved with the .save_to_disk() API
python build_subset.py data/c4-val --n 1000
# 2. sample
python sampling.py data/c4-val-1000 --model AbeHou/opt-1.3b-semstamp \
--embedder OUTPUT_DIR_TO_YOUR_EMBEDDER --sp_mode lsh \
--sp_dim 3 --delta 0.02
# note: it's recommended to use AbeHou/opt-1.3b-semstamp, which is fine-tuned with cross-entropy loss
# to favor generations of shorter average sentence length,
# so that the effect of watermarks is more pronounced.
# 3. paraphrase
python paraphrase_gen.py PATH_TO_GENERATED_DATA --model_path AbeHou/opt-1.3b-semstamp
# 4. detection
python detection.py PATH_TO_PARAPHRASED_DATA --detection_mode lsh --sp_dim 3 --embedder OUTPUT_DIR_TO_YOUR_EMBEDDER
Note that if you generate on a GPU, you must also run detection on a GPU so that the random seed is consistent. You are free to change the value of delta for your own robustness/speed tradeoff: a higher delta means stricter rejection, and thus more robustness but slower sampling; a lower delta is the other way around.
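Putting the pieces together, sampling.py is conceptually a sentence-level rejection sampler. Below is a self-contained toy sketch: random vectors stand in for LM-generated sentence embeddings, and valid_regions is a stand-in for the repo's pseudo-random valid-mask derivation:

```python
import numpy as np

rng = np.random.default_rng(0)
planes = rng.standard_normal((3, 768))             # fixed LSH hyperplanes

def lsh_sig(v):                                    # as in the earlier sketch
    return int("".join(map(str, (planes @ v > 0).astype(int))), 2)

def valid_regions(prev_sig, gamma=0.25, n_regions=8):
    # Toy stand-in: derive a pseudo-random valid mask of gamma * n_regions
    # regions, seeded by the previous sentence's signature.
    r = np.random.default_rng(prev_sig)
    return set(r.choice(n_regions, size=int(gamma * n_regions), replace=False).tolist())

def next_sentence_embedding():
    # Toy stand-in for decoding a sentence from the LM and embedding it.
    return rng.standard_normal(768)

prev = next_sentence_embedding()
for _ in range(5):                                 # generate 5 watermarked "sentences"
    mask = valid_regions(lsh_sig(prev))
    while True:                                    # reject until the region is valid
        cand = next_sentence_embedding()
        if lsh_sig(cand) in mask:
            break
    prev = cand
```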
- Encode a corpus of texts from a specific domain and fit k-means clusters on the training embeddings.
- Given an input sentence $s_1$, repeatedly generate candidate sentences $s_t$, $t = 2, \dots$, until $c(s_t) \in \mathrm{valid}(s_{t-1})$, stopping when max_new_tokens is reached. $c(s_t)$ returns the index of the cluster closest to $s_t$; the valid mask is controlled by $c(s_{t-1})$ (sketched below).
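A minimal sketch of the cluster-assignment function $c(\cdot)$. Cosine similarity is assumed here; check sampling_kmeans_utils.py for the actual metric and centroid file format:

```python
import torch
import torch.nn.functional as F

def closest_cluster(embedding: torch.Tensor, centers: torch.Tensor) -> int:
    """c(s_t): index of the k-means centroid most similar to the embedding."""
    sims = F.cosine_similarity(centers, embedding.unsqueeze(0), dim=-1)
    return int(sims.argmax())

centers = torch.randn(8, 768)   # stand-in for a loaded centroid file such as cc.pt
print(closest_cluster(torch.randn(768), centers))
```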
The detection procedure is analogous to SemStamp, except that cluster assignments $c(s_t)$ take the place of LSH signatures. Steps 1 and 2 (data loading and embedder fine-tuning) are the same as for SemStamp.
3. Generate sentence embeddings and train the k-means clusters (multi-GPU supported):
python sampling_kmeans_utils.py data/c4-train AbeHou/SemStamp-c4-sbert 8
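For reference, a single-GPU sketch of what this step computes, assuming sentence-transformers and scikit-learn are installed; the actual script shards the encoding across GPUs and its output format may differ:

```python
import torch
from datasets import load_from_disk
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

texts = load_from_disk("data/c4-train")["text"]            # domain corpus
embedder = SentenceTransformer("AbeHou/SemStamp-c4-sbert")
# The real script likely embeds individual sentences rather than full texts.
embeddings = embedder.encode(texts, normalize_embeddings=True)
kmeans = KMeans(n_clusters=8, random_state=0).fit(embeddings)  # 8 = --sp_dim
torch.save(torch.tensor(kmeans.cluster_centers_), "data/c4-train/cc.pt")
```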
4. Produce k-SemStamp generations:
python sampling.py data/c4-val-1000 --model AbeHou/opt-1.3b-semstamp --embedder AbeHou/SemStamp-c4-sbert --sp_mode kmeans --sp_dim 8 --delta 0.02 \
--cc_path data/c4-train/cc.pt
5. Run detection:
python detection.py PATH_TO_YOUR_GENERATION --detection_mode kmeans --sp_dim 8 --embedder OUTPUT_DIR_TO_YOUR_EMBEDDER --cc_path PATH_TO_YOUR_KMEANS_CLUSTERS
Example usage to replicate the results in the k-SemStamp paper:
python detection.py semstamp-data/c4-ksemstamp-pegasus/bigram=False --detection_mode kmeans --sp_dim 8 --embedder AbeHou/SemStamp-c4-sbert --cc_path centroids/c4-cluster_8_centers.pt