Disclosure of Invention
Therefore, the embodiments of the invention provide a method and a system for detecting machine-generated text, which address the low accuracy of detection results in the prior art.
In order to achieve the above object, the present invention provides the following technical solutions:
A first aspect of the invention provides a method for detecting machine-generated text, comprising disturbance difference detection, statistical detection and detection result summarization, wherein:
the disturbance difference detection performs synonym substitution in the target sample to generate replacement samples, calculates the generation probabilities of the target sample and the replacement samples under the target generation model, and marks the target sample as machine-generated text when the difference exceeds a preset significance threshold;
the statistical detection calculates, based on the target generation model, the prediction probability, the probability rank and the information entropy of the word at the current position given the preceding text, and combines these three statistical indices to judge the source of the text;
the detection result summarization collects and integrates the results of the disturbance difference detection and the statistical detection, and outputs a comprehensive detection result.
Preferably, the disturbance difference detection comprises a text disturbance processing step and a difference detection calculation step, wherein the text disturbance processing step takes a target sample and a vocabulary as input and outputs a replacement sample set; the difference detection calculation step takes the target sample, the replacement sample set and a target generation model as input and outputs a disturbance difference detection result;
the text disturbance processing comprises: obtaining word vector representations from a pre-provided general vocabulary; characterizing the similarity between words by the cosine similarity of their word vectors; for a given word, taking the word in the vocabulary whose vector has the highest cosine similarity to that of the given word as its synonym; and randomly selecting k words in the target sample and replacing each of them with its synonym to obtain k replacement samples, which form the replacement sample set;
the difference detection calculation applies small perturbations to the text and compares the generation probability of the perturbed samples with that of the target sample, thereby detecting machine-generated text.
Preferably, the target sample is perturbed by the synonym substitution of the text disturbance processing step, a regularized probability difference is calculated by the following steps, and the text is determined to be machine-generated when the difference exceeds a preset threshold:
(1) Calculating a generation probability mean value of the replacement sample:
(2) Calculating a probability difference value:
(3) Probability difference regularization:
(4) Comparison of significance differences:
if the regularized difference exceeds ε, the text is determined to be machine-generated;
wherein x is the target sample, X is the replacement sample set, p_θ is the target generation model, and ε is the significance difference threshold.
Preferably, the statistical detection comprises two stages, statistical detection index calculation and comprehensive index construction; the statistical detection index calculation stage takes a vocabulary, a target sample and a target generation model as input and outputs statistical detection indices; the comprehensive index construction stage takes the statistical detection indices as input and outputs a statistical detection result;
the statistical detection index calculation judges whether the text is machine-generated by detecting whether the current word lies at the head of the prediction distribution;
the comprehensive index construction builds three test indices separately and then combines them:
a first test index, wherein I(x) is an indicator function that equals 1 when x is true and 0 when x is false, α_1 is a probability reference value, and β_1 is a first proportional threshold;
a second test index, wherein α_2 is a ranking reference value and β_2 is a second proportional threshold;
a third test index, wherein α_3 is an information entropy reference value and β_3 is a third proportional threshold; the arithmetic mean T of the three test indices is taken as the comprehensive index, and if T < 0, the target sample is marked as machine-generated.
Preferably, the method for detecting machine-generated text further comprises a sample restoration step, in which the detection method is used to identify the paraphrased content in machine-generated text and restore it to obtain the original text, specifically:
taking the initialized current sample as the target sample, sequentially performing the disturbance difference detection, statistical detection and detection result summarization steps to obtain a comprehensive detection result;
if the result is human-written, generating the L-th word with the target generation model and appending the generated word to the current sample; if the result is machine-generated, appending the L-th word of the paraphrased sample to the current sample;
if the length of the current sample equals the length of the paraphrased sample, outputting the current sample as the restored original text.
The invention also provides a system for detecting machine-generated text, comprising a disturbance difference detection unit, a statistical detection unit and a detection result summarization unit, wherein:
the disturbance difference detection unit is used for performing synonym substitution in the target sample to generate replacement samples, calculating the generation probabilities of the target sample and the replacement samples under the target generation model, and marking the target sample as machine-generated text when the difference exceeds a preset significance threshold;
the statistical detection unit is used for calculating, based on the target generation model, the prediction probability, the probability rank and the information entropy of the word at the current position given the preceding text, and combining these three statistical indices to judge the source of the text;
the detection result summarization unit is used for collecting and integrating the results of the disturbance difference detection and the statistical detection, and outputting a comprehensive detection result.
Preferably, the disturbance difference detection unit comprises a text disturbance processing module and a difference detection calculation module, wherein the input of the text disturbance processing module is a target sample and a word list, and a replacement sample set is output; the input of the difference detection calculation module is a target sample, a replacement sample set and a target generation model, and a disturbance difference detection result is output;
the text disturbance processing module obtains word vector representations from a pre-provided general vocabulary and characterizes the similarity between words by the cosine similarity of their word vectors; for a given word, the word in the vocabulary whose vector has the highest cosine similarity to that of the given word is taken as its synonym; k words are then randomly selected in the target sample and each is replaced with its synonym to obtain k replacement samples, which form the replacement sample set;
the difference detection calculation module applies small perturbations to the text and compares the generation probability of the perturbed samples with that of the target sample, thereby detecting machine-generated text.
Preferably, the statistical detection unit comprises a statistical detection index calculation module and a comprehensive index construction module; the input of the statistical detection index calculation module is a word list, a target sample and a target generation model, and the statistical detection index is output; the input of the comprehensive index construction module is a statistical detection index, and a statistical detection result is output;
the statistical detection index calculation module judges whether the text is generated by a machine by detecting whether the current word is positioned at the head of the distribution;
the comprehensive index construction module is used for constructing three test indices separately and then combining them:
a first test index, wherein I(x) is an indicator function that equals 1 when x is true and 0 when x is false, α_1 is a probability reference value, and β_1 is a first proportional threshold;
a second test index, wherein α_2 is a ranking reference value and β_2 is a second proportional threshold;
a third test index, wherein α_3 is an information entropy reference value and β_3 is a third proportional threshold; the arithmetic mean T of the three test indices is taken as the comprehensive index, and if T < 0, the target sample is marked as machine-generated.
The present invention also provides an electronic device including: at least one processor and at least one memory;
the memory is used for storing one or more program instructions;
the processor is configured to execute one or more program instructions to perform the machine-generated text detection method described above.
The present invention also provides a computer readable storage medium having one or more program instructions embodied therein for performing the machine-generated text detection method described above.
The embodiment of the invention has the following advantages:
The invention provides a method for detecting machine-generated text. First, the target sample is perturbed by synonym substitution and the difference in generation probability is calculated, realizing disturbance difference detection. Next, the prediction probability, the prediction probability rank and the information entropy of the prediction distribution are calculated for the sample under the target generation model, and statistical detection is performed on these three statistical indices. Finally, the results of the two detection methods are considered together and a comprehensive detection result is output. In addition, the invention proposes, on the basis of this detection method, a detection method for paraphrased machine-generated text, which restores the text while identifying the paraphrased content. The invention thus detects machine-generated text, can be used to address the misuse of text generated by current language models, and provides a feasible and effective scheme for the relevant authorities to supervise text sources.
Detailed Description
Other advantages and effects of the present invention will become apparent to those skilled in the art from the following detailed description. The embodiments described are some, but not all, embodiments of the invention. All other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of the invention.
To address the detection of machine-generated text, the invention provides a detection method that examines a text passage by two methods, disturbance difference detection and statistical detection, and judges from both results whether the text is machine-generated. Disturbance difference detection applies synonym substitution to the text and exploits a characteristic property of machine-generated text by calculating the probability difference of the replacement samples; statistical detection involves three statistical calculations and judges from the probability distribution whether the sample is machine-generated. In addition, the invention provides a detection scheme for paraphrased generated text, which identifies the paraphrased parts of the text and restores the text. Compared with a general machine-generated text detection model, the method requires no supervised training and exploits the specific characteristics of machine-generated text, so it addresses the detection problem in a more targeted way.
The embodiment of the invention provides a method for detecting machine-generated text, comprising the following steps:
disturbance difference detection: performing synonym substitution in the target sample to generate replacement samples, calculating the generation probabilities of the target sample and the replacement samples under the target generation model, and marking the target sample as machine-generated text when the difference exceeds a preset significance threshold;
statistical detection: calculating, based on the target generation model, the prediction probability, the probability rank and the information entropy of the word at the current position given the preceding text, and combining these three statistical indices to judge the source of the text;
detection result summarization: collecting and integrating the results of the disturbance difference detection and the statistical detection, and outputting a comprehensive detection result.
For paraphrased generated text, the method further includes sample restoration.
Referring to fig. 1, the method for detecting machine-generated text provided by the embodiment of the invention specifically includes:
Step S01, disturbance difference detection: performing synonym substitution in the target sample to generate replacement samples, calculating the generation probabilities of the target sample and the replacement samples under the target generation model, and marking the target sample as machine-generated text when the difference exceeds a preset significance threshold;
Step S02, statistical detection: calculating, based on the target generation model, the prediction probability, the probability rank and the information entropy of the word at the current position given the preceding text, and combining these three statistical indices to judge the source of the text;
Step S03, detection result summarization: collecting and integrating the results of the disturbance difference detection and the statistical detection, and outputting a comprehensive detection result.
In particular, for paraphrased generated text, sample restoration uses the above detection method to identify the paraphrased content in the generated text and restore the original text.
The detection method of the machine-generated text provided by the embodiment of the invention comprises disturbance difference detection, statistical detection, and result summarization and output. For paraphrased generated text, the method further includes sample restoration.
In the embodiment of the invention, the disturbance difference detection comprises two operations of text disturbance processing and difference detection calculation. The input of the text disturbance processing stage is a target sample and a vocabulary, and a replacement sample set is output. The input of the difference detection calculation stage is a target sample, a replacement sample set and a target generation model, and a disturbance difference detection result is output.
A specific flow of disturbance difference detection is shown in fig. 2.
Text disturbance processing: word vector representations are first obtained from a pre-provided general vocabulary. The similarity between words is characterized by the cosine similarity of their word vectors; for a given word, the word in the vocabulary whose vector has the highest cosine similarity to that of the given word is taken as its synonym. Then k words are randomly selected in the target sample and each is replaced with its synonym, giving k replacement samples that form the replacement sample set. The replacement sample set is denoted X = {x_1, ..., x_k}, where x_i (i = 1, ..., k) is a replacement sample.
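The text disturbance processing described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the dictionary of word vectors and the `seed` parameter are hypothetical, the word itself is excluded when searching for its nearest neighbour (an assumption, since a word always has cosine similarity 1 with itself), and each replacement sample is assumed to contain exactly one substitution.

```python
import math
import random

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def nearest_synonym(word, vectors):
    """Vocabulary word (other than `word`) with the highest cosine similarity."""
    target = vectors[word]
    best, best_sim = None, -2.0
    for cand, vec in vectors.items():
        if cand == word:
            continue  # assumption: a word is not its own synonym
        sim = cosine(target, vec)
        if sim > best_sim:
            best, best_sim = cand, sim
    return best

def build_replacement_set(sample, vectors, k, seed=0):
    """Pick up to k random positions; return one replacement sample per position."""
    rng = random.Random(seed)
    candidates = [i for i, w in enumerate(sample) if w in vectors]
    positions = rng.sample(candidates, min(k, len(candidates)))
    replacements = []
    for pos in positions:
        new_sample = list(sample)
        new_sample[pos] = nearest_synonym(sample[pos], vectors)
        replacements.append(new_sample)
    return replacements
```

With a toy vocabulary such as `{"a": [1, 0], "b": [0.9, 0.1], "c": [0, 1], "d": [0.1, 0.9]}`, the nearest neighbour of "a" is "b" and that of "c" is "d".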
Difference detection calculation: machine-generated text has the characteristic that a sample drawn from the generation model p_θ usually lies in a region of negative curvature of the log-probability function of p_θ. Therefore, machine-generated text can be detected by applying a small perturbation to the text and comparing the generation probability of the perturbed samples with that of the target sample.
The invention perturbs the target sample by the synonym substitution of the text disturbance processing stage, calculates a regularized probability difference by the following steps, and marks the text as machine-generated when the difference exceeds a preset significance threshold. The invention adopts a significance difference threshold ε = 0.3. The target generation model may be a common autoregressive language model such as GPT, T5 or LLaMA.
Input: target sample x, replacement sample set X, target generation model p_θ, significance difference threshold ε.
Calculating a generation probability mean value of the replacement sample:
calculating a probability difference value:
probability difference regularization:
comparison of significance differences:
If the regularized difference exceeds ε, output true; otherwise output false.
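The four steps above can be sketched as follows, under stated assumptions. The sketch assumes that generation probabilities are supplied as log-probabilities already computed by the target generation model, and that regularization means dividing the difference by the standard deviation of the replacement-sample log-probabilities; both choices are assumptions made for illustration, and the function name is hypothetical.

```python
import statistics

def disturbance_difference_detect(logp_target, logp_replacements, epsilon=0.3):
    """Return (is_machine_generated, regularized_difference).

    logp_target: log-probability of the target sample under the model.
    logp_replacements: log-probabilities of the k replacement samples.
    """
    # Step 1: mean generation (log-)probability of the replacement samples.
    mean_rep = statistics.mean(logp_replacements)
    # Step 2: probability difference between target and replacements.
    diff = logp_target - mean_rep
    # Step 3: regularize by the spread of the replacement scores
    # (assumption: population standard deviation, floored to avoid division by zero).
    spread = max(statistics.pstdev(logp_replacements), 1e-12)
    score = diff / spread
    # Step 4: compare with the significance difference threshold.
    return score > epsilon, score
```

For example, a target log-probability of -10 with replacement log-probabilities of -12, -13 and -11 gives a positive regularized difference well above 0.3, so the sample would be flagged under these assumptions.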
In the embodiment of the invention, the statistical detection comprises two operations of statistical detection index calculation and comprehensive index construction. The input of the statistical detection index calculation stage is a word list, a target sample and a target generation model, and the statistical detection index is output. The input of the comprehensive index construction stage is a statistical detection index, and a statistical detection result is output.
A specific flow of statistical detection is shown in fig. 3.
Statistical detection index calculation: since machine-generated text is typically sampled from the head of the probability distribution, whether a text is machine-generated can be judged by detecting whether the current word lies at the head of the distribution. Let x_1, ..., x_{j-1} be the first j-1 words of the target sample and x_j the j-th word; the word at the j-th position is treated as a random variable, and D denotes the set of words in the vocabulary. Based on the target generation model p_θ, the following three statistical detection indices are calculated:
the prediction probability p_j of the j-th word, p_j = p_θ(x_j | x_1, ..., x_{j-1});
the prediction probability rank r_j of the j-th word, i.e. the rank of p_j within the set of prediction probabilities of all words in D;
the information entropy e_j of the prediction distribution of the j-th word, e_j = -Σ_{w∈D} p_θ(w | x_1, ..., x_{j-1}) log p_θ(w | x_1, ..., x_{j-1}).
Here p_j and r_j are used to check whether the j-th word of the text comes from the head of the probability distribution of the target generation model, and e_j is used to check how confident the prediction of the current word x_j is given the preceding text x_{1:j-1}; j = 1, ..., n, where n is the total number of words in the target sample.
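As an illustration of the three statistical detection indices, the following sketch computes them for a single position from a next-word distribution. The distribution is passed in as a plain dictionary, which stands in for the output of the target generation model; the function name, the convention that rank 1 is the most probable word, and the use of the natural logarithm are assumptions.

```python
import math

def position_statistics(dist, word):
    """Return (p_j, r_j, e_j) for `word` under the next-word distribution `dist`.

    dist: mapping from each vocabulary word to its predicted probability.
    """
    # Prediction probability of the observed word.
    p = dist[word]
    # Rank of that probability among all vocabulary words (1 = most probable).
    rank = 1 + sum(1 for q in dist.values() if q > p)
    # Shannon entropy of the whole prediction distribution (natural log).
    entropy = -sum(q * math.log(q) for q in dist.values() if q > 0)
    return p, rank, entropy
```

For the distribution `{"the": 0.5, "a": 0.25, "an": 0.25}` and the observed word "a", this gives a probability of 0.25, a rank of 2 and an entropy of 1.5·ln 2.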
Comprehensive index construction: three test indices are constructed separately and then combined.
First test index: the closer the prediction probabilities of the words in the target sample are to 1, the more likely the sample is machine-generated. The first test index is calculated accordingly, where I(x) is an indicator function that equals 1 when x is true and 0 when x is false, α_1 is the probability reference value and β_1 is the first proportional threshold; the invention takes α_1 = 0.1, β_1 = 0.5.
Second test index: the higher the prediction probabilities of the words in the target sample rank, the more likely the sample is machine-generated. The second test index is calculated accordingly, where α_2 is the ranking reference value and β_2 is the second proportional threshold; the invention takes α_2 = 100, β_2 = 0.5.
Third test index: the greater the information entropy of the word prediction distributions in the target sample, the greater the uncertainty of the prediction and the less likely the sample is machine-generated. The third test index is calculated accordingly, where α_3 is the information entropy reference value and β_3 is the third proportional threshold; the invention takes α_3 = 0.1, β_3 = 0.1. Considering the three test indices together, their arithmetic mean T is taken as the comprehensive index; if T < 0, the target sample is marked as machine-generated.
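One possible reading of the comprehensive index is sketched below. The exact form of each test index is an assumption: each index is taken to be the gap between a proportional threshold β and the fraction of positions satisfying an indicator condition against the reference value α, signed so that a negative value points to machine generation. The raw entropy is compared with α_3 directly, which is also an assumption. Only the parameter values and the rule T < 0 come from the description.

```python
def composite_index(p, r, e, a1=0.1, b1=0.5, a2=100, b2=0.5, a3=0.1, b3=0.1):
    """Arithmetic mean T of three assumed test indices; T < 0 suggests machine text.

    p, r, e: per-position prediction probabilities, ranks and entropies.
    """
    n = len(p)
    # Assumed T1: negative when more than b1 of the words have probability >= a1.
    t1 = b1 - sum(pj >= a1 for pj in p) / n
    # Assumed T2: negative when more than b2 of the words rank within the top a2.
    t2 = b2 - sum(rj <= a2 for rj in r) / n
    # Assumed T3: negative when fewer than b3 of the words have entropy >= a3.
    t3 = sum(ej >= a3 for ej in e) / n - b3
    return (t1 + t2 + t3) / 3

def statistical_detect(p, r, e):
    """True if the composite index marks the sample as machine-generated."""
    return composite_index(p, r, e) < 0
```

Under these assumed forms, a sample whose words all have high probability, top ranks and low entropy yields T = (-0.5 - 0.5 - 0.1) / 3, which is negative.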
Detection result summarization collects and integrates the results of the disturbance difference detection and the statistical detection and gives the final judgment on whether the target sample is machine-generated. This step takes the disturbance difference detection result and the statistical detection result as input and outputs the comprehensive detection result. If the target sample fails either the disturbance difference detection or the statistical detection (that is, either result marks the target sample as machine-generated), the comprehensive detection result is machine-generated; otherwise, the comprehensive detection result is human-written.
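The summarization rule is a logical OR of the two detector outputs. A minimal sketch, with hypothetical function and label names:

```python
def summarize_detection(disturbance_flags_machine, statistics_flags_machine):
    """Combine the two detector outputs into the comprehensive detection result."""
    # Either detector marking the sample as machine-generated is sufficient.
    if disturbance_flags_machine or statistics_flags_machine:
        return "machine-generated"
    return "human-written"
```

A sample is reported as human-written only when both detectors pass it.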
Sample restoration identifies the paraphrased parts of machine-generated text and restores the original text.
The specific flow is shown in fig. 4.
The sample restoration step takes the paraphrased sample and the target generation model as input and outputs the original sample. Let x* denote the paraphrased sample, x*_L the L-th word of x* (L = 1, ..., m), and p_θ the target generation model. The paraphrased sample is checked word by word using the text detection method, as follows:
Input: paraphrased sample x*, target generation model p_θ.
Initialize L = 2 and initialize the original sample s with the first word of x*.
(1) If L ≤ m:
taking s as the target sample, sequentially perform the disturbance difference detection, statistical detection and detection result summarization steps to obtain a comprehensive detection result;
if the comprehensive detection result is machine-generated, go to step (2);
if the comprehensive detection result is human-written, go to step (3);
otherwise, go to step (6).
(2) Append the L-th word of the paraphrased sample to the end of string s, then go to step (5).
(3) Generate the L-th word using the target generation model, then go to step (4).
(4) Append the generated word to the end of string s, then go to step (5).
(5) L ← L + 1; go to step (1).
(6) Output the original sample s.
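The control flow of steps (1) to (6) can be sketched as a simple loop. The detector and the generator are passed in as callables, since both depend on the target generation model; the function and parameter names are hypothetical. The sketch assumes that s starts as the first word of the paraphrased sample and that step (2) appends the L-th word of the paraphrased sample.

```python
def restore_sample(paraphrased, detect, generate_next):
    """Word-by-word restoration following steps (1) to (6).

    paraphrased: list of words of the paraphrased sample.
    detect: callable taking the current word list and returning
            "machine-generated" or "human-written".
    generate_next: callable taking the current word list and returning
                   the next word proposed by the target generation model.
    """
    m = len(paraphrased)
    s = [paraphrased[0]]  # assumption: initialize with the first word
    L = 2
    while L <= m:                              # step (1)
        if detect(s) == "machine-generated":
            s.append(paraphrased[L - 1])       # step (2): keep the existing word
        else:
            s.append(generate_next(s))         # steps (3)-(4): use a model word
        L += 1                                 # step (5)
    return s                                   # step (6)
```

The loop always returns a sample of the same length as the paraphrased input, because exactly one word is appended per iteration.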
Referring to Fig. 5, an embodiment of the present invention also provides a detection system 500 for machine-generated text, the system including a disturbance difference detection unit 510, a statistical detection unit 520 and a detection result summarization unit 530, wherein:
the disturbance difference detection unit 510 is configured to perform synonym substitution in the target sample to generate replacement samples, calculate the generation probabilities of the target sample and the replacement samples under the target generation model, and mark the target sample as machine-generated text when the difference exceeds a preset significance threshold;
the statistical detection unit 520 is configured to calculate, based on the target generation model, the prediction probability, the probability rank and the information entropy of the word at the current position given the preceding text, and to combine these three statistical indices to judge the source of the text;
the detection result summarization unit 530 is configured to collect and integrate the results of the disturbance difference detection and the statistical detection, and to output a comprehensive detection result.
In the detection system of the machine-generated text, a disturbance difference detection unit 510 comprises a text disturbance processing module 51a and a difference detection calculation module 51b, wherein the input of the text disturbance processing module 51a is a target sample and a vocabulary, and a replacement sample set is output; the input of the difference detection calculation module 51b is a target sample, a replacement sample set and a target generation model, and a disturbance difference detection result is output;
the text disturbance processing module 51a obtains word vector representations from a pre-provided general vocabulary and characterizes the similarity between words by the cosine similarity of their word vectors; for a given word, the word in the vocabulary whose vector has the highest cosine similarity to that of the given word is taken as its synonym; k words are then randomly selected in the target sample and each is replaced with its synonym to obtain k replacement samples, which form the replacement sample set;
the difference detection calculation module 51b applies small perturbations to the text and compares the generation probability of the perturbed samples with that of the target sample, thereby detecting machine-generated text.
In the machine-generated text detection system, the statistical detection unit 520 includes a statistical detection index calculation module 52a and a comprehensive index construction module 52b; the input of the statistical detection index calculation module is a word list, a target sample and a target generation model, and the statistical detection index is output; the input of the comprehensive index construction module is a statistical detection index, and a statistical detection result is output;
the statistical detection index calculation module 52a determines whether the text is machine-generated by detecting whether the current word is located at the head of the distribution;
the comprehensive index construction module 52b is configured to construct three test indexes respectively, and then integrate the test indexes:
a first test index, wherein I(x) is an indicator function that equals 1 when x is true and 0 when x is false, α_1 is a probability reference value, and β_1 is a first proportional threshold;
a second test index, wherein α_2 is a ranking reference value and β_2 is a second proportional threshold;
a third test index, wherein α_3 is an information entropy reference value and β_3 is a third proportional threshold; the arithmetic mean T of the three test indices is taken as the comprehensive index, and if T < 0, the target sample is marked as machine-generated.
The embodiment of the invention provides electronic equipment, which comprises: at least one processor and at least one memory;
the memory is used for storing one or more program instructions;
the processor is configured to execute one or more program instructions to perform the detection method as described above.
Embodiments of the present invention also provide a computer readable storage medium having one or more program instructions embodied therein for performing the foregoing detection method.
Technical effects
In the method for detecting machine-generated text according to the invention, the target sample is first perturbed by synonym substitution and the difference in generation probability is calculated, realizing disturbance difference detection. Next, the prediction probability, the prediction probability rank and the information entropy of the prediction distribution are calculated for the sample under the target generation model, and statistical detection is performed on these three statistical indices. Finally, the results of the two detection methods are considered together and a comprehensive detection result is output. In addition, the invention proposes, on the basis of this detection method, a detection method for paraphrased machine-generated text, which restores the text while identifying the paraphrased content. The invention thus detects machine-generated text, can be used to address the misuse of text generated by current language models, and provides a feasible and effective scheme for the relevant authorities to supervise text sources.
While the invention has been described in detail in the foregoing general description and specific examples, it will be apparent to those skilled in the art that modifications and improvements can be made thereto. Accordingly, such modifications or improvements may be made without departing from the spirit of the invention and are intended to be within the scope of the invention as claimed.