This is the official repo for SimMark: A Robust Sentence-Level Similarity-Based Watermarking Algorithm for Large Language Models (accepted to EMNLP 2025).
The widespread adoption of large language models (LLMs) necessitates reliable methods to detect LLM-generated text. We introduce SimMark, a robust sentence-level watermarking algorithm that makes LLMs' outputs traceable without requiring access to model internals, making it compatible with both open and API-based LLMs. By leveraging the similarity of semantic sentence embeddings combined with rejection sampling to embed detectable statistical patterns imperceptible to humans, and employing a soft counting mechanism, SimMark achieves robustness against paraphrasing attacks. Experimental results demonstrate that SimMark sets a new benchmark for robust watermarking of LLM-generated content, surpassing prior sentence-level watermarking techniques in robustness, sampling efficiency, and applicability across diverse domains, all while maintaining text quality and fluency.
Figure: A high-level overview of the SimMark algorithm. The input text is divided into individual sentences. Top: generation, where each newly generated sentence is accepted via rejection sampling. Bottom: detection (+ paraphrase attack), where paraphrased versions of watermarked sentences are also detected.
Table: Performance of different algorithms across datasets and paraphrasers, evaluated using ROC-AUC ↑ / TP@FP=1% ↑ / TP@FP=5% ↑, reported from left to right (↑: higher is better). In each column, bold values indicate the best performance for a given dataset and metric, while underlined values denote the second-best. SimMark consistently outperforms or is on par with other state-of-the-art methods across datasets and paraphrasers, and it is the best on average.
Figure: Detection performance of different watermarking methods under various paraphrasing attacks, measured by TP@FP=1% ↑ and averaged across all three datasets (RealNews, BookSum, Reddit-TIFU). Each axis corresponds to a specific paraphrasing attack method (e.g., Pegasus-Bigram); higher values are better.
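For intuition, the generation step (rejection sampling on the similarity of consecutive sentence embeddings) can be sketched in a few lines of Python. This is a simplified illustration, not the repo's actual implementation; `generate_sentence` and `embed` are hypothetical stand-ins for an LLM sampler and a sentence embedder.

```python
import math

def cosine_sim(u, v):
    # Plain cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv)

def sample_watermarked_sentence(generate_sentence, embed, prev_sentence,
                                a=0.68, b=0.76, max_tries=100):
    """Resample candidate sentences until the similarity between the previous
    sentence's embedding and the candidate's embedding lies in [a, b]."""
    prev_emb = embed(prev_sentence)
    candidate = None
    for _ in range(max_tries):
        candidate = generate_sentence(prev_sentence)
        if a <= cosine_sim(prev_emb, embed(candidate)) <= b:
            return candidate  # similarity falls in the valid interval
    return candidate  # fall back to the last sample if none qualified
```

Because accepted sentences are ordinary model samples, the pattern is imperceptible to readers; only the interval test at detection time reveals it.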
- Clone this repository:
```
git clone https://github.com/DabiriAghdam/SimMark.git
```
- Create a virtual environment (python3.10 recommended), and then install dependencies:
```
pip3 install -r requirements.txt
```
- Then run the following commands to download the necessary data:
```
python3 load_punkt_tab.py
python3 load_c4.py
python3 load_booksum.py
python3 load_tifu.py
```
- Without paraphrasing:
  - Cosine-SimMark:
    ```
    python3 detection.py watermarked/c4/c4-cosine --human_text human/c4 --mode cosine --a 0.68 --b 0.76
    ```
  - Euclidean-SimMark:
    ```
    python3 detection.py watermarked/c4/c4-euclidean-pca --human_text human/c4 --mode euclidean --a 0.28 --b 0.36 --use_pca
    ```
- With the Pegasus paraphraser:
  - Cosine-SimMark:
    ```
    python3 detection.py watermarked/c4/c4-cosine-pegasus --human_text human/c4 --mode cosine --a 0.68 --b 0.76
    ```
  - Euclidean-SimMark:
    ```
    python3 detection.py watermarked/c4/c4-euclidean-pca-pegasus --human_text human/c4 --mode euclidean --a 0.28 --b 0.36 --use_pca
    ```
- Data from additional paraphrasers can be used similarly by changing the dataset path.
For other datasets, simply replace the dataset paths and make sure to set the correct values for --human_text, --mode, --a, and --b, adding the --use_pca flag where necessary.
- For example, for the BookSum dataset with the Pegasus paraphraser:
  - Cosine-SimMark:
    ```
    python3 detection.py watermarked/booksum/booksum-cosine-pegasus --human_text human/booksum --mode cosine --a 0.68 --b 0.76
    ```
  - Euclidean-SimMark:
    ```
    python3 detection.py watermarked/booksum/booksum-euclidean-pegasus --human_text human/booksum --mode euclidean --a 0.4 --b 0.55
    ```
The key parameters for detection.py are summarized below (for more details, see the code itself):
- Data path: See the watermarked folder.
- Mode (--mode): 'cosine' or 'euclidean'
- Predefined Intervals ([a, b]):
- RealNews dataset (human_text should be human/c4):
- Cosine Similarity: [0.68, 0.76]
- Euclidean Distance: [0.28, 0.36] (must add --use_pca flag)
- BookSum dataset (human_text should be human/booksum):
- Cosine Similarity: [0.68, 0.76]
- Euclidean Distance: [0.4, 0.55] (DO NOT add --use_pca flag for this dataset)
- Reddit-TIFU dataset (human_text should be human/tifu):
- Cosine Similarity: [0.68, 0.76]
- Euclidean Distance: [0.28, 0.36] (must add --use_pca flag)
- Soft Count Decay Factor (--K): 250 is the default.
- LLM (--model_path): 'AbeHou/opt-1.3b-semstamp' is the default.
- Embedding Model (--embedder): 'hkunlp/instructor-large' is the default.
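For scripting convenience, the predefined intervals listed above can be collected into a small lookup table. This dict is a sketch, not part of the repo; the keys mirror the human_text folder names.

```python
# Predefined [a, b] intervals per dataset and mode, as listed above.
# `use_pca` records whether the --use_pca flag must be passed.
INTERVALS = {
    "c4":      {"cosine":    {"a": 0.68, "b": 0.76, "use_pca": False},
                "euclidean": {"a": 0.28, "b": 0.36, "use_pca": True}},
    "booksum": {"cosine":    {"a": 0.68, "b": 0.76, "use_pca": False},
                "euclidean": {"a": 0.4,  "b": 0.55, "use_pca": False}},
    "tifu":    {"cosine":    {"a": 0.68, "b": 0.76, "use_pca": False},
                "euclidean": {"a": 0.28, "b": 0.36, "use_pca": True}},
}
```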
For further details on hyperparameter selection, refer to the paper.
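For intuition, the detection side (soft counting of consecutive-sentence similarities that fall inside [a, b], followed by a significance test) can be sketched as below. This is a simplified illustration, not the repo's exact statistic: the exponential decay form and the one-proportion z-test are assumptions here; see the paper for the precise formulation.

```python
import math

def soft_count(sims, a=0.68, b=0.76, K=250.0):
    """Soft count of similarities inside [a, b]: full credit inside the
    interval, exponentially decaying partial credit for near-misses
    (K plays the role of the decay factor)."""
    total = 0.0
    for s in sims:
        if a <= s <= b:
            total += 1.0                        # full credit inside [a, b]
        else:
            dist = min(abs(s - a), abs(s - b))  # distance to the interval
            total += math.exp(-K * dist)        # partial credit for near-misses
    return total

def z_score(count, n, p0):
    """One-proportion z-test: how surprising is `count` hits out of `n` pairs
    if human text lands in [a, b] with probability p0 by chance?"""
    return (count - n * p0) / math.sqrt(n * p0 * (1 - p0))
```

The soft credit for near-misses is what buys robustness to paraphrasing: a paraphrased sentence whose similarity drifts just outside the interval still contributes to the detection statistic.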
For text quality evaluation, you can run the following command, for instance for Euclidean-SimMark on the BookSum dataset:
```
python3 eval_quality.py --dataset_name watermarked/booksum/booksum-euclidean --human_ref_name human/booksum
```
- (Optional) If you want, you can re-train a PCA model by running the following (first ensure the necessary data is loaded using load_c4.py):
```
python3 train_PCA.py --num_components 16
```
- Generate a smaller subset of the RealNews dataset (for example, a subset of size n = 1000):
```
python3 build_subset.py data/c4-val --n 1000
```
- Then, to generate watermarked data, run the following with the appropriate parameters for human_text, mode, a, b, and output:
  - Cosine-SimMark:
    ```
    python3 sampling.py data/c4-val-1000 --output c4-cosine-new --a 0.68 --b 0.76 --mode cosine
    ```
  - Euclidean-SimMark:
    ```
    python3 sampling.py data/c4-val-1000 --output c4-euclidean-new --a 0.28 --b 0.36 --mode euclidean --use_pca
    ```
- (Optional) If you want to apply paraphrasing, you can run the following for the Pegasus-bigram paraphraser, for example with 25 beams:
```
python3 paraphrase_gen.py c4-cosine-new --paraphraser pegasus-bigram --num_beams 25
```
You can replace pegasus-bigram with other available paraphrasers.
- You can then detect watermarks (before and after paraphrasing):
```
python3 detection.py c4-cosine-new --human_text human/c4 --mode cosine --a 0.68 --b 0.76
python3 detection.py c4-cosine-new-pegasus-bigram=True-num_beams=25-threshold=0.1 --human_text human/c4 --mode cosine --a 0.68 --b 0.76
```
Some of the code was partially adapted from the following works (original GitHub repo: https://github.com/bohanhou14/semstamp):
- "SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation" (https://arxiv.org/abs/2310.03991)
- "k-SemStamp: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text" (https://arxiv.org/abs/2402.11399)
If you found this repository helpful, please don't forget to cite our paper:
```
@misc{dabiriaghdam2025simmarkrobustsentencelevelsimilaritybased,
      title={SimMark: A Robust Sentence-Level Similarity-Based Watermarking Algorithm for Large Language Models},
      author={Amirhossein Dabiriaghdam and Lele Wang},
      year={2025},
      eprint={2502.02787},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.02787},
}
```
If you have any questions, please feel free to contact amirhossein@ece.ubc.ca.



