HC3 Plus: A Semantic-invariant Human ChatGPT Comparison Corpus

Su Zhenpeng, Wu Xing, Zhou Wei, Ma Guangyuan, Hu Songlin. arXiv 2023

Applications GPT Model Architecture Reinforcement Learning

ChatGPT has garnered significant interest due to its impressive performance; however, there is growing concern about its potential risks, particularly in the detection of AI-generated content (AIGC), which is often challenging for untrained individuals to identify. Current datasets used for detecting ChatGPT-generated text primarily focus on question-answering tasks, often overlooking tasks with semantic-invariant properties, such as summarization, translation, and paraphrasing. In this paper, we demonstrate that detecting model-generated text in semantic-invariant tasks is more challenging. To address this gap, we introduce a more extensive and comprehensive dataset that incorporates a wider range of tasks than previous work, including those with semantic-invariant properties.
