
SPOT: Text Source Prediction From Originality Score Thresholding

Yvinec Edouard, Kasser Gabriel. arXiv 2024

[Paper]    
Applications Model Architecture Security Training Techniques Uncategorized

The wide adoption of large language models (LLMs) has unlocked new applications and new social risks. Popular countermeasures aimed at detecting misinformation usually involve domain-specific models trained to assess the relevance of a given piece of information. Instead of evaluating the validity of the information itself, we propose to investigate LLM-generated text from the perspective of trust. In this study, we define trust as the ability to determine whether an input text was generated by an LLM or by a human. To this end, we design SPOT, an efficient method that classifies the source of any standalone text input based on an originality score. This score is derived from the predictions of a given LLM, which is used to detect other LLMs. We empirically demonstrate the robustness of the method to the architecture, training data, evaluation data, task, and compression of modern LLMs.
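The abstract does not spell out how the originality score is computed. A minimal sketch of the thresholding idea, under the assumption that the score is something like the mean negative log-likelihood that a detector LLM assigns to the input tokens (LLM-generated text tends to be more predictable, hence lower score); the function names, the score definition, and the threshold value are all hypothetical, not the paper's actual formulation:

```python
import math

def originality_score(token_probs):
    """Hypothetical originality score: mean negative log-likelihood
    of the input tokens under a detector LLM. token_probs holds the
    probability the detector assigned to each observed token."""
    return sum(-math.log(p) for p in token_probs) / len(token_probs)

def predict_source(token_probs, threshold=2.0):
    """Threshold the originality score to predict the text source.
    Highly predictable text (low score) is flagged as LLM-generated;
    less predictable text (high score) as human-written.
    The threshold value here is an illustrative placeholder."""
    return "llm" if originality_score(token_probs) < threshold else "human"

# Toy usage: a detector LLM is very confident on machine-like text,
# less confident on human-like text.
print(predict_source([0.9, 0.9, 0.9, 0.9, 0.9]))       # confident -> "llm"
print(predict_source([0.05, 0.05, 0.05, 0.05, 0.05]))  # uncertain -> "human"
```

In a real setting, `token_probs` would come from a forward pass of the detector model over the input text; the sketch only illustrates the score-then-threshold classification step.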

Similar Work