CN110211571B - Sentence fault detection method, sentence fault detection device and computer readable storage medium - Google Patents


Info

Publication number
CN110211571B
CN110211571B (application CN201910343889.6A)
Authority
CN
China
Prior art keywords
sentence
preset
target sentence
words
likelihood probability
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910343889.6A
Other languages
Chinese (zh)
Other versions
CN110211571A (en)
Inventor
张勇
马骏
王少军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201910343889.6A priority Critical patent/CN110211571B/en
Priority to PCT/CN2019/102191 priority patent/WO2020215550A1/en
Publication of CN110211571A publication Critical patent/CN110211571A/en
Application granted granted Critical
Publication of CN110211571B publication Critical patent/CN110211571B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/063: Training (creation of reference templates; training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice)
    • G10L15/08: Speech classification or search
    • G10L25/51: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00, specially adapted for comparison or discrimination
    • G10L15/26: Speech to text systems
    • G10L2015/0631: Creating reference templates; Clustering
    • G10L2015/0633: Creating reference templates; Clustering using lexical or orthographic knowledge sources
    • G10L2015/088: Word spotting
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Signal Processing (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to the technical field of speech semantics and discloses a method for detecting erroneous sentences, which comprises the following steps: acquiring a target sentence; identifying the i words that compose the target sentence; sequentially inputting the i words, in their order within the target sentence, into a pre-trained language model, and calculating the confusion degree and/or log-likelihood probability of the target sentence through the language model; and judging the target sentence to be an erroneous sentence when its confusion degree is greater than a preset confusion degree and/or its log-likelihood probability is smaller than a preset log-likelihood probability. The invention also provides a sentence error detection device and a computer readable storage medium. The invention can identify whether a sentence is erroneous.

Description

Sentence fault detection method, sentence fault detection device and computer readable storage medium
Technical Field
The present invention relates to the field of speech and semantic technologies, and in particular to a sentence error detection method, a sentence error detection device, and a computer readable storage medium.
Background
With the development of technology, automatic speech recognition (Automatic Speech Recognition, ASR), a technology for converting human speech into text, is increasingly widely used. In practical applications of ASR, substitution, insertion, or deletion errors are unavoidable in the recognition result, owing to background noise or to the speaker's pronunciation, such as dialect, accent, fast speech, or idiomatic habits. These recognition errors can produce sentences with disordered words, mismatched collocations, unclear semantics, or flawed logic, i.e., erroneous sentences. Such sentences are not only difficult to understand and analyze, but also create great difficulties for subsequent natural language processing (Natural Language Processing, NLP) applications. Besides sentences obtained through ASR, sentences entered manually into a computer may also contain errors. Recognizing whether a sentence is correct therefore has practical value and necessity.
Disclosure of Invention
The invention provides a sentence error detection method, a sentence error detection device, and a computer readable storage medium, whose main purpose is to identify whether a sentence is erroneous.
To achieve the above object, the present invention provides a sentence error detection method, the method comprising:
acquiring a target sentence obtained through automatic speech recognition;
acquiring the i-th text segment contained in the target sentence, and judging whether a word matching the i-th text segment exists in a preset dictionary, wherein the initial value of i is 1 and i is a positive integer;
if no word matching the i-th text segment exists in the preset dictionary, adjusting the number of characters in the i-th text segment, and judging again whether a matching word exists in the preset dictionary;
if a word matching the i-th text segment exists in the preset dictionary, determining the i-th text segment to be the i-th word of the target sentence, letting i = i + 1, acquiring the next text segment contained in the target sentence, and judging whether a matching word exists in the preset dictionary;
when the total number of characters in the i words equals the total number of characters in the target sentence, determining that the target sentence is composed of the i words;
sequentially inputting the i words, in their order within the target sentence, into a pre-trained language model, and calculating the confusion degree and/or the log-likelihood probability of the target sentence through the language model;
and judging the target sentence to be an erroneous sentence when the confusion degree of the target sentence is greater than a preset confusion degree and/or the log-likelihood probability of the target sentence is smaller than a preset log-likelihood probability.
Optionally, sequentially inputting the i words into the pre-trained language model in their order within the target sentence comprises:
judging whether any preset keywords exist among the i words;
if preset keywords exist among the i words, sequentially inputting the words among the i words other than the preset keywords, in their order within the target sentence, into the pre-trained language model.
Optionally, before judging the target sentence to be an erroneous sentence when its confusion degree is greater than a preset confusion degree and/or its log-likelihood probability is smaller than a preset log-likelihood probability, the method further comprises:
determining the preset confusion degree and/or the preset log-likelihood probability;
wherein determining the preset confusion degree and/or the preset log-likelihood probability specifically comprises:
obtaining training samples used to train the language model, the training samples comprising positive samples and negative samples;
obtaining the confusion degree and the log-likelihood probability of the positive samples; and
obtaining the confusion degree and the log-likelihood probability of the negative samples;
obtaining a confusion degree histogram from the confusion degrees of the positive and negative samples, and obtaining the preset confusion degree from the confusion degree histogram; and
obtaining a log-likelihood probability histogram from the log-likelihood probabilities of the positive and negative samples, and obtaining the preset log-likelihood probability from the log-likelihood probability histogram.
Optionally, the language model is a deep learning language model or a statistics-based language model.
Optionally, the method further comprises:
if the target sentence is an erroneous sentence, sending an erroneous-sentence reminder message.
In addition, to achieve the above object, the present invention also provides a sentence error detection apparatus, the apparatus comprising a memory and a processor, wherein the memory stores a sentence error detection program runnable on the processor, and the program, when executed by the processor, implements the following steps:
acquiring a target sentence obtained through automatic speech recognition;
acquiring the i-th text segment contained in the target sentence, and judging whether a word matching the i-th text segment exists in a preset dictionary, wherein the initial value of i is 1 and i is a positive integer;
if no word matching the i-th text segment exists in the preset dictionary, adjusting the number of characters in the i-th text segment, and judging again whether a matching word exists in the preset dictionary;
if a word matching the i-th text segment exists in the preset dictionary, determining the i-th text segment to be the i-th word of the target sentence, letting i = i + 1, acquiring the next text segment contained in the target sentence, and judging whether a matching word exists in the preset dictionary;
when the total number of characters in the i words equals the total number of characters in the target sentence, determining that the target sentence is composed of the i words;
sequentially inputting the i words, in their order within the target sentence, into a pre-trained language model, and calculating the confusion degree and/or the log-likelihood probability of the target sentence through the language model;
and judging the target sentence to be an erroneous sentence when the confusion degree of the target sentence is greater than a preset confusion degree and/or the log-likelihood probability of the target sentence is smaller than a preset log-likelihood probability.
Optionally, sequentially inputting the i words into the pre-trained language model in their order within the target sentence comprises:
judging whether any preset keywords exist among the i words;
if preset keywords exist among the i words, sequentially inputting the words among the i words other than the preset keywords, in their order within the target sentence, into the pre-trained language model.
Optionally, when executed by the processor, the sentence error detection program further implements the following steps:
before judging the target sentence to be an erroneous sentence when its confusion degree is greater than a preset confusion degree and/or its log-likelihood probability is smaller than a preset log-likelihood probability, determining the preset confusion degree and/or the preset log-likelihood probability;
wherein determining the preset confusion degree and/or the preset log-likelihood probability specifically comprises:
obtaining training samples used to train the language model, the training samples comprising positive samples and negative samples;
obtaining the confusion degree and the log-likelihood probability of the positive samples; and
obtaining the confusion degree and the log-likelihood probability of the negative samples;
obtaining a confusion degree histogram from the confusion degrees of the positive and negative samples, and obtaining the preset confusion degree from the confusion degree histogram; and
obtaining a log-likelihood probability histogram from the log-likelihood probabilities of the positive and negative samples, and obtaining the preset log-likelihood probability from the log-likelihood probability histogram.
Optionally, the language model is a deep learning language model or a statistics-based language model.
Optionally, when executed by the processor, the sentence error detection program further implements the following step:
if the target sentence is an erroneous sentence, sending an erroneous-sentence reminder message.
In addition, to achieve the above object, the present invention further provides a computer-readable storage medium having a sentence error detection program stored thereon, the program being executable by one or more processors to implement the steps of the sentence error detection method described above.
The sentence error detection method, device, and computer readable storage medium provided by the invention acquire a target sentence obtained through automatic speech recognition; acquire the i-th text segment contained in the target sentence and judge whether a word matching the i-th text segment exists in a preset dictionary, wherein the initial value of i is 1 and i is a positive integer; if no matching word exists in the preset dictionary, adjust the number of characters in the i-th text segment and judge again whether a matching word exists; if a matching word exists, determine the i-th text segment to be the i-th word of the target sentence, let i = i + 1, acquire the next text segment, and repeat the judgment; when the total number of characters in the i words equals the total number of characters in the target sentence, determine that the target sentence is composed of the i words; sequentially input the i words, in their order within the target sentence, into a pre-trained language model, and calculate the confusion degree and/or log-likelihood probability of the target sentence through the language model; and judge the target sentence to be an erroneous sentence when its confusion degree is greater than the preset confusion degree and/or its log-likelihood probability is smaller than the preset log-likelihood probability, thereby identifying whether the sentence is erroneous.
Drawings
FIG. 1 is a flowchart illustrating a sentence error detection method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the internal structure of a sentence error detection apparatus according to an embodiment of the present invention;
FIG. 3 is a schematic block diagram of the sentence error detection program in the sentence error detection apparatus according to an embodiment of the present invention.
The objects, functional features, and advantages of the present invention will be further described with reference to the accompanying drawings and in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The invention provides a sentence error detection method. Referring to FIG. 1, a flow chart of the sentence error detection method according to an embodiment of the invention is shown. The method may be performed by an apparatus, which may be implemented by software and/or hardware.
In this embodiment, the method for detecting a sentence error includes:
step S10, obtaining a target sentence obtained through an automatic voice recognition technology.
In this embodiment, the target sentence obtained by the automatic speech recognition (automatic speech recognition, ASR) technology may be one sentence or a plurality of sentences, and each sentence may be a long sentence or a short sentence. In other embodiments, the target statement may also be a statement entered through other pathways.
Step S20: acquiring the i-th text segment contained in the target sentence, and judging whether a word matching the i-th text segment exists in a preset dictionary, wherein the initial value of i is 1 and i is a positive integer.
The i-th text segment contained in the target sentence may be acquired sequentially from left to right (i.e., front to back) or from right to left (i.e., back to front).
The number of characters in each acquired text segment may be the same or different.
Preferably, the number of characters in the i-th text segment equals the number of characters in the longest word in the preset dictionary.
Step S30: if no word matching the i-th text segment exists in the preset dictionary, adjusting the number of characters in the i-th text segment, and judging again whether a matching word exists in the preset dictionary.
Step S40: if a word matching the i-th text segment exists in the preset dictionary, determining the i-th text segment to be the i-th word of the target sentence, letting i = i + 1, acquiring the next text segment contained in the target sentence, and judging whether a matching word exists in the preset dictionary.
Step S50: when the total number of characters in the i words equals the total number of characters in the target sentence, determining that the target sentence is composed of the i words.
For example, consider the sentence 我喜欢秋天 ("I like autumn"). If the longest word in the dictionary has 3 characters, the first 3 characters 我喜欢 are taken as the first segment and matched against the words in the dictionary. If the match fails, the segment is shortened by one character to 我喜 and matched again; if that match also fails, 我 ("I") is directly determined to be a single-character word. Next, 喜欢秋 is matched against the dictionary; if the match fails, it is shortened by one character and 喜欢 is matched against the dictionary; if the match succeeds, 喜欢 ("like") is determined to be a word. Similarly, 秋天 is matched against the dictionary; if the match succeeds, 秋天 ("autumn") is determined to be a word. After this word segmentation, the sentence is found to consist of the three words 我, 喜欢, and 秋天.
Through the above steps, the words contained in the target sentence can be identified rapidly, facilitating fast sentence error detection.
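Steps S20 to S50 amount to a forward maximum-matching word segmenter. The following is a minimal sketch of that procedure, not the patent's implementation; the function name, the toy dictionary, and the 3-character window are illustrative assumptions.

```python
def max_match_segment(sentence, dictionary, max_word_len=3):
    """Forward maximum matching: at each position take the longest dictionary
    word, shrinking the window one character at a time on a miss (step S30);
    an unmatched single character is kept as a word of its own."""
    words = []
    pos = 0
    while pos < len(sentence):
        # Start from the length of the longest dictionary word (step S20).
        for length in range(min(max_word_len, len(sentence) - pos), 0, -1):
            segment = sentence[pos:pos + length]
            if length == 1 or segment in dictionary:
                words.append(segment)   # step S40: accept the i-th word
                pos += length
                break
    return words                        # step S50: all characters consumed

# The example from the description: 我喜欢秋天 ("I like autumn")
dictionary = {"喜欢", "秋天"}
print(max_match_segment("我喜欢秋天", dictionary))  # ['我', '喜欢', '秋天']
```

A real dictionary would set `max_word_len` to the length of its longest entry, matching the "preferably" note above.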
Step S60: sequentially inputting the i words, in their order within the target sentence, into a pre-trained language model, and calculating the confusion degree and/or the log-likelihood probability of the target sentence through the language model.
The language model may be a deep learning language model, e.g., a feedforward neural network or a recurrent neural network (RNN), or a statistics-based language model, e.g., an N-gram model.
In this embodiment, a training set may be formed from correct word sequences (i.e., correct sentences), so that the training set consists of positive samples. The selected language model is then trained on the sentences in the training set to obtain the parameters of the language model, and the trained language model can estimate the probability that a given word sequence occurs.
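For a statistics-based model of the kind mentioned above, training reduces to counting n-grams over the positive samples. The sketch below uses bigrams with add-one smoothing; the function names and the smoothing choice are illustrative assumptions, not details from the patent.

```python
from collections import Counter

def train_bigram(corpus):
    """Count unigram histories and bigrams over a training set of correct
    sentences, padding each sentence with start/end markers."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence + ["</s>"]
        unigrams.update(tokens[:-1])              # histories only
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def bigram_prob(unigrams, bigrams, prev, word, vocab_size):
    # Add-one smoothing keeps a small non-zero probability for word pairs
    # never seen in training.
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)

uni, bi = train_bigram([["我", "喜欢", "秋天"], ["我", "喜欢", "夏天"]])
print(bigram_prob(uni, bi, "我", "喜欢", vocab_size=6))  # 0.375
```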
From a statistical point of view, a sentence s in natural language may be composed of an arbitrary word string (several words in some order), but for most such strings the probability of occurrence P(s) is small. For example, assume the following sentences s1 and s2:
s1: I just ate dinner
s2: just me has eaten dinner
Obviously, s1 is a correct sentence and s2 is an erroneous one, so the probability of occurrence of sentence s1 is greater than that of sentence s2, i.e., P(s1) > P(s2).
If a sentence s is composed of m words, the probability P(W1, W2, …, Wm) of the sentence s is:
P(W1, W2, …, Wm) = P(W1)P(W2|W1)P(W3|W1, W2)…P(Wm|W1, W2, …, Wm-1)
The factors P(W1), P(W2|W1), P(W3|W1, W2), etc. in the above formula can be calculated with the pre-trained language model, after which the probability P(W1, W2, …, Wm) of the sentence s follows. Since each probability is a value in the range [0, 1], multiplying many probability values yields a very small number and introduces numerical error, so we take the logarithm instead, obtaining the log-likelihood probability logprob:
logprob = log(P(W1, W2, …, Wm))
logprob characterizes the likelihood that the sentence occurs: the larger the logprob, the more likely the sentence is to occur; the smaller the logprob, the less likely.
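This computation can be sketched as summing base-10 logs of the conditional probabilities, which avoids the underflow of multiplying many small values. The toy probability table below is invented purely for illustration and stands in for a trained language model.

```python
import math

def sentence_logprob(words, cond_prob):
    """logprob = log P(W1..Wm) = sum over k of log P(Wk | W1..Wk-1)."""
    return sum(math.log10(cond_prob(words[:k], w)) for k, w in enumerate(words))

# Toy conditional model: only the previous word is used as history, and a
# tiny floor probability is returned for unseen pairs.
table = {("", "I"): 0.2, ("I", "just"): 0.3,
         ("just", "ate"): 0.25, ("ate", "dinner"): 0.4}
cond = lambda history, w: table.get((history[-1] if history else "", w), 1e-6)

print(round(sentence_logprob(["I", "just", "ate", "dinner"], cond), 4))  # -2.2218
```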
Another quantity that characterizes the probability of occurrence of a sentence is the confusion degree ppl (perplexity), defined as the geometric mean of the inverse of the sentence's word probabilities:
ppl = 10^(-logprob / (m - OOVs + 1))
where m is the number of words in the sentence and OOVs is the number of out-of-vocabulary words (words outside the dictionary). The smaller the ppl, the more likely the sentence is to occur; the larger the ppl, the less likely.
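Under that definition, ppl follows mechanically from logprob. The sketch below interprets the denominator as the word count with +1 for the end-of-sentence token, the convention used by common language-modeling toolkits; this interpretation is an assumption about the patent's formula.

```python
def perplexity(logprob, num_words, oovs=0):
    """ppl = 10^(-logprob / (m - OOVs + 1)), with logprob in base 10,
    m the number of words, and OOVs the out-of-vocabulary count."""
    return 10 ** (-logprob / (num_words - oovs + 1))

# E.g., a 4-word sentence with a base-10 logprob of -2.2218:
print(round(perplexity(-2.2218, 4), 2))  # 2.78
```

A logprob of 0 (probability 1) gives a ppl of 1, the minimum possible value, which matches the rule that smaller ppl means a more likely sentence.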
Further, in another embodiment of the present invention, sequentially inputting the i words into the pre-trained language model in their order within the target sentence comprises:
judging whether any preset keywords exist among the i words;
if preset keywords exist among the i words, sequentially inputting the words among the i words other than the preset keywords, in their order within the target sentence, into the pre-trained language model.
In this embodiment, the preset keywords may be stop words, for example greetings and polite expressions such as "hello" or "thank you".
The preset keywords may also be interjections, for example words such as "ah", "oh", or "wow".
The preset keywords may likewise be other types of words that do not affect the semantics of the sentence, and may be configured in advance according to actual requirements.
In this embodiment, removing the preset keywords does not affect the accuracy of the detection result, but simplifies the target sentence and improves the detection speed.
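The filtering step above can be sketched as follows; the keyword set is a hypothetical configuration, since the patent leaves the actual list to the application.

```python
# Hypothetical preset keywords: stop words and interjections that carry
# no sentence-level meaning.
PRESET_KEYWORDS = {"hello", "thank you", "ah", "oh", "wow"}

def strip_preset_keywords(words, keywords=PRESET_KEYWORDS):
    """Remove preset keywords while preserving the order of the remaining
    words, since the language model consumes them in sentence order."""
    return [w for w in words if w not in keywords]

print(strip_preset_keywords(["hello", "I", "just", "ate", "dinner", "oh"]))
# ['I', 'just', 'ate', 'dinner']
```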
Step S70: judging the target sentence to be an erroneous sentence when the confusion degree of the target sentence is greater than the preset confusion degree and/or the log-likelihood probability of the target sentence is smaller than the preset log-likelihood probability.
In this embodiment, an erroneous sentence is an incorrect sentence, including ungrammatical sentences and word strings that do not form a sentence.
The preset log-likelihood probability may be a preset value used to judge whether the log-likelihood probability of the target sentence is high or low: when the log-likelihood probability of the target sentence is higher than the preset value, it is considered high and the target sentence is judged to be a correct sentence; when it is lower than the preset value, it is considered low and the target sentence is judged to be an erroneous sentence.
Similarly, the preset confusion degree may be a preset value used to judge whether the confusion degree of the target sentence is high or low: when the confusion degree of the target sentence is higher than the preset value, it is considered high and the target sentence is judged to be an erroneous sentence; when it is lower than the preset value, it is considered low and the target sentence is judged to be a correct sentence.
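The decision rule of step S70 can be sketched as below, using as defaults the example thresholds (preset ppl 250, preset logprob -2.5) that the histogram analysis later in this description arrives at.

```python
def is_erroneous(ppl, logprob, preset_ppl=250.0, preset_logprob=-2.5):
    """Judge the target sentence erroneous when its confusion degree exceeds
    the preset confusion degree and/or its log-likelihood probability falls
    below the preset log-likelihood probability (step S70)."""
    return ppl > preset_ppl or logprob < preset_logprob

print(is_erroneous(ppl=300.0, logprob=-2.0))  # True  (confusion degree too high)
print(is_erroneous(ppl=120.0, logprob=-1.8))  # False (judged a correct sentence)
```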
In a possible embodiment, before judging the target sentence to be an erroneous sentence when its confusion degree is greater than a preset confusion degree and/or its log-likelihood probability is smaller than a preset log-likelihood probability, the method further comprises:
determining the preset confusion degree and/or the preset log-likelihood probability.
Determining the preset confusion degree and/or the preset log-likelihood probability specifically comprises:
obtaining training samples used to train the language model, the training samples comprising positive samples and negative samples;
obtaining the confusion degree and the log-likelihood probability of the positive samples; and
obtaining the confusion degree and the log-likelihood probability of the negative samples;
obtaining a confusion degree histogram from the confusion degrees of the positive and negative samples, and obtaining the preset confusion degree from the confusion degree histogram; and
obtaining a log-likelihood probability histogram from the log-likelihood probabilities of the positive and negative samples, and obtaining the preset log-likelihood probability from the log-likelihood probability histogram.
In this embodiment, the preset confusion degree is determined from a confusion degree histogram of the training samples, and the preset log-likelihood probability from a log-likelihood probability histogram of the training samples.
The confusion degree histograms of the training samples comprise a correct-sentence histogram and an erroneous-sentence histogram calculated from the training samples (i.e., a training set composed of sentences recognized by ASR); they reflect the confusion degree distributions of correct and of erroneous sentences.
In this embodiment, the preset confusion degree may be determined from the confusion degree distribution of correct sentences and that of erroneous sentences. It is then determined whether the confusion degree of the target sentence falls within the range typical of correct sentences or of erroneous sentences, and the target sentence is judged correct or erroneous accordingly.
Specifically, the confusion degrees of the correct sentences and of the erroneous sentences in the training samples may be calculated, their distributions determined, and from these a confusion degree threshold distinguishing correct from erroneous sentences, i.e., the preset confusion degree, derived.
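One way to derive such a threshold from the two distributions is to pick the cut point that best separates them. The sketch below runs on toy samples invented for illustration; real thresholds come from full training-set histograms like the ones tabulated below.

```python
def choose_threshold(correct_ppls, wrong_ppls, candidates):
    """Pick the candidate threshold that best separates correct sentences
    (ppl below the threshold) from erroneous ones (ppl at or above it)."""
    def score(t):
        below = sum(p < t for p in correct_ppls) / len(correct_ppls)
        above = sum(p >= t for p in wrong_ppls) / len(wrong_ppls)
        return below + above  # fraction correctly placed on each side
    return max(candidates, key=score)

correct = [40, 80, 120, 150, 220]       # toy correct-sentence ppl values
wrong = [260, 400, 700, 1200, 2400]     # toy erroneous-sentence ppl values
print(choose_threshold(correct, wrong, candidates=[100, 250, 500]))  # 250
```

The same selection applies to logprob thresholds, with the inequality directions reversed.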
For example, the confusion degree (ppl) distribution of the correct sentences is:

ppl interval     Count   Percentage
[2000, +∞)           5       0.125%
[1500, 2000)         1       0.025%
[1000, 1500)         5       0.125%
[750, 1000)         11       0.276%
[500, 750)          28       0.702%
[250, 500)         217       5.441%
[100, 250)        1241       31.18%
[0, 100)          2480      62.187%
The confusion degree (ppl) distribution of the erroneous sentences is:

ppl interval     Count   Percentage
[2000, +∞)          86       5.923%
[1500, 2000)        43       2.961%
[1000, 1500)       111       7.645%
[750, 1000)        107       7.369%
[500, 750)         204       14.05%
[250, 500)         501      34.504%
[100, 250)         345       23.76%
[0, 100)            55       3.788%
From the above histograms it can be seen that, among the correct sentences, 93.367% have a ppl smaller than 250, while among the erroneous sentences, 72.452% have a ppl greater than 250.
A ppl of 250 therefore distinguishes sentences well, so the preset confusion degree (preset ppl) is determined to be 250. In this embodiment, when the ppl of the target sentence is greater than the preset ppl, the target sentence is judged to be an erroneous sentence.
Likewise, the preset log-likelihood probability (logprob) is obtained in a manner similar to the preset confusion degree. The logprob histograms of the training samples comprise a correct-sentence histogram and an erroneous-sentence histogram calculated from the training samples; they reflect the log-likelihood probability distributions of correct and of erroneous sentences.
For example, the log likelihood probability (logprob) distribution of correct sentences is:

    logprob interval    Count    Percentage
    (-∞, -4.0)              1      0.0251%
    [-4.0, -3.5)            0      0%
    [-3.5, -3.0)           14      0.351%
    [-3.0, -2.5)          122      3.0591%
    [-2.5, -2.0)         1371     34.378%
    [-2.0, -1.5)         1740     43.631%
    [-1.5, -1.0)          673     16.876%
    [-1.0, 0)              67      1.68%
The log likelihood probability (logprob) distribution of erroneous sentences is:

    logprob interval    Count    Percentage
    (-∞, -4.0)              8      0.551%
    [-4.0, -3.5)           31      2.135%
    [-3.5, -3.0)          200     13.774%
    [-3.0, -2.5)          656     45.179%
    [-2.5, -2.0)          502     34.573%
    [-2.0, -1.5)           52      3.581%
    [-1.5, -1.0)            3      0.207%
    [-1.0, 0)               0      0%
The histograms show that 96.566% of correct sentences have logprob greater than -2.5, while 61.639% of erroneous sentences have logprob less than -2.5.
A threshold of logprob = -2.5 therefore separates the two classes well, so the preset logprob is determined to be -2.5. In this embodiment, when the logprob of the target sentence is smaller than the preset logprob, the target sentence is judged to be an erroneous sentence.
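Putting the two preset thresholds together, the final decision rule can be sketched as follows. This is a minimal illustration: the function and constant names are assumptions, and only the threshold values 250 and -2.5 come from the distributions discussed above.

```python
PRESET_PPL = 250.0       # preset confusion degree from the ppl histograms
PRESET_LOGPROB = -2.5    # preset log likelihood probability from the logprob histograms

def is_erroneous(ppl: float, logprob: float) -> bool:
    """Judge a sentence erroneous when its ppl exceeds the preset ppl
    and/or its logprob falls below the preset logprob."""
    return ppl > PRESET_PPL or logprob < PRESET_LOGPROB

print(is_erroneous(ppl=320.0, logprob=-2.8))  # True
print(is_erroneous(ppl=80.0, logprob=-1.7))   # False
```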
The erroneous sentence detection method provided by this embodiment acquires a target sentence obtained through automatic speech recognition; acquires the ith text segment of the target sentence and judges whether the preset dictionary contains a word matching it, where the initial value of i is 1 and i is a positive integer; if no matching word exists, adjusts the number of characters in the ith segment and judges again; if a matching word exists, takes the segment as the ith word of the target sentence, sets i = i + 1, and repeats the matching for the next segment; when the total number of characters in the i words equals the total number of characters in the target sentence, determines that the target sentence consists of those i words; inputs the i words, in their order in the target sentence, into a pre-trained language model and calculates the confusion degree and/or log likelihood probability of the target sentence through the model; and judges the target sentence to be an erroneous sentence when its confusion degree is greater than the preset confusion degree and/or its log likelihood probability is smaller than the preset log likelihood probability, thereby identifying whether a sentence is erroneous.
Further, in another embodiment of the method of the present invention, the method further comprises the steps of:
If the target sentence is an erroneous sentence, an erroneous-sentence reminder message is sent.
When the target sentence is judged to be erroneous, a prompt can be sent to the user. For example, when a speech-converted sentence is to undergo further natural language processing and is judged erroneous, the user can be reminded through a pop-up error message on the display device that the target sentence is erroneous.
In this embodiment, by sending the reminder, the user can quickly learn whether erroneous sentences exist and which ones they are, and then carry out subsequent operations.
The invention also provides an erroneous sentence detection device. Referring to fig. 2, an internal structure diagram of a sentence detection device according to an embodiment of the invention is shown.
In this embodiment, the sentence detection device 1 may be a PC (Personal Computer), or a terminal device such as a smart phone, a tablet computer, or a portable computer. The sentence detection device 1 comprises at least a memory 11, a processor 12, a communication bus 13, and a network interface 14.
The memory 11 includes at least one type of readable storage medium including flash memory, a hard disk, a multimedia card, a card memory (e.g., SD or DX memory, etc.), a magnetic memory, a magnetic disk, an optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the sentence detection device 1, for example a hard disk of the sentence detection device 1. The memory 11 may also be an external storage device of the sentence detection apparatus 1 in other embodiments, for example, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the sentence detection apparatus 1. Further, the memory 11 may also include both an internal storage unit and an external storage device of the sentence detection apparatus 1. The memory 11 may be used not only for storing application software installed in the sentence detection apparatus 1 and various types of data, for example, codes of the sentence detection program 01, but also for temporarily storing data that has been output or is to be output.
The processor 12 may in some embodiments be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor or other data processing chip for executing program code or processing data stored in the memory 11, such as executing the sentence detection program 01, etc.
The communication bus 13 is used to enable connection communication between these components.
The network interface 14 may optionally comprise a standard wired interface, a wireless interface (e.g. WI-FI interface), typically used to establish a communication connection between the apparatus 1 and other electronic devices.
Optionally, the device 1 may further comprise a user interface, which may include a display (Display), an input unit such as a keyboard (Keyboard), and optionally a standard wired interface and a wireless interface. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an Organic Light-Emitting Diode (OLED) touch display, or the like. The display may also be referred to as a display screen or display unit, and is used for displaying information processed in the sentence detection device 1 and for displaying a visual user interface.
Fig. 2 shows only the sentence detection device 1 with the components 11-14 and the sentence detection program 01. Those skilled in the art will understand that the structure shown in fig. 2 does not limit the sentence detection device 1, which may include fewer or more components than shown, combine certain components, or arrange the components differently.
In the embodiment of the apparatus 1 shown in fig. 2, the memory 11 stores a sentence detection program 01; the processor 12 performs the following steps when executing the sentence detection program 01 stored in the memory 11:
and acquiring a target sentence obtained through automatic speech recognition technology.
In this embodiment, the target sentence obtained through automatic speech recognition (ASR) may be one sentence or several sentences, and each sentence may be long or short. In other embodiments, the target sentence may also be a sentence input through other channels.
And acquiring an ith text contained in the target sentence, and judging whether a word matched with the ith text exists in a preset dictionary, wherein the initial value of i is 1, and i is a positive integer.
When the ith text included in the target sentence is acquired, the ith text may be acquired sequentially in the left-to-right order (i.e. front-to-back), or may be acquired sequentially in the right-to-left order (i.e. back-to-front).
The number of words in each piece of text acquired may be the same or different.
Preferably, the number of characters in the ith text segment is the same as the number of characters in the longest word in the preset dictionary.
If the words matched with the ith section of characters do not exist in the preset dictionary, the word number of the ith section of characters is adjusted, and whether the words matched with the ith section of characters exist in the preset dictionary is judged.
If the words matched with the ith word segment exist in the preset dictionary, determining the ith word segment as the ith word segment of the target sentence, enabling i=i+1, acquiring the ith word segment contained in the target sentence, and judging whether the words matched with the ith word segment exist in the preset dictionary.
And when the total word number of the i words is the same as the total word number of the target sentence, determining that the target sentence consists of the i words.
For example, take the sentence "I love autumn". If the longest word in the dictionary has 3 characters, first take the first 3 characters of the sentence as the first segment and match it against the dictionary. If the match fails, drop one character from the end of the segment and match again; if that also fails, the first character "I" is taken as a single word. Matching then continues in "love autumn" in the same way: if the 3-character segment finds no match, it is shortened by one character, and when "love" matches a word in the dictionary it is taken as a word; likewise, when "autumn" matches a dictionary word it is taken as a word. After this word segmentation, the sentence is found to consist of the three words "I", "love" and "autumn".
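The segmentation loop described above is essentially forward maximum matching. A minimal sketch follows; the function name, dictionary, and the toy sentence (single letters standing in for the Chinese characters of the example) are illustrative assumptions, not the patent's actual dictionary.

```python
def segment(sentence: str, dictionary: set) -> list:
    """Forward maximum matching: try the longest candidate first and
    shrink it one character at a time until a dictionary word (or a
    lone character) is found."""
    max_len = max(len(w) for w in dictionary)  # length of the longest dictionary word
    words, i = [], 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + size]
            if size == 1 or candidate in dictionary:
                words.append(candidate)  # unmatched single characters stand alone
                i += size
                break
    return words

# Toy stand-in for "I love autumn": characters A..E with dictionary
# words "BC" and "DE".
print(segment("ABCDE", {"BC", "DE"}))  # ['A', 'BC', 'DE']
```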
Through the above steps, the words contained in the target sentence can be quickly identified, which facilitates fast erroneous sentence detection.
And sequentially inputting the i words into a pre-trained language model according to the sequence in the target sentence, and calculating the confusion degree and/or the log likelihood probability of the target sentence through the language model.
The language model may be a deep-learning language model, such as a feedforward neural network or a recurrent neural network (RNN), or a statistics-based language model, such as an N-gram model.
In this embodiment, a training set may be formed from correct word sequences (i.e., correct sentences); this training set constitutes the positive samples. The selected language model is then trained on the sentences in the training set to obtain its parameters, and the trained language model can estimate the probability that a given word sequence occurs.
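As a toy illustration of such training, a minimal bigram model with add-one smoothing can be built from a handful of correct sentences. The corpus, the smoothing choice, and the function names below are illustrative assumptions, standing in for whatever pre-trained language model is actually used.

```python
from collections import Counter

def train_bigram(corpus):
    """Count unigrams and bigrams over the training sentences and return
    a smoothed conditional-probability function P(word | prev)."""
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus:
        words = ["<s>"] + sent.split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    vocab = len(unigrams)

    def prob(prev, word):
        # Add-one smoothing keeps unseen pairs at a small nonzero probability.
        return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab)

    return prob

prob = train_bigram(["I love autumn", "I love spring"])
# A bigram seen in training scores higher than an unseen one.
print(prob("I", "love") > prob("love", "I"))  # True
```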
From a statistical point of view, a sentence s in natural language may be any word string (several words arranged in some order), but for most such strings the probability of occurrence P(s) is small. For example, consider the following sentences s1 and s2:
s1: I have just eaten dinner
s2: Just dinner have I eaten
Clearly, s1 is a correct sentence and s2 an erroneous one, so the probability of s1 occurring is greater than that of s2, i.e., P(s1) > P(s2).
If a sentence s consists of m words, its probability P(W1, W2, …, Wm) is given by the chain rule:
P(W1,W2,…,Wm) = P(W1)P(W2|W1)P(W3|W1,W2)…P(Wm|W1,W2,…,Wm-1)
P(W1), P(W2|W1), P(W3|W1,W2), and so on can be computed with the pre-trained language model, and from these values the probability P(W1, W2, …, Wm) of sentence s follows. Since each probability lies in [0,1], the product of many of them is an extremely small number and prone to numerical error, so we take its logarithm to obtain the log likelihood probability logprob:
logprob = log(P(W1,W2,…,Wm))
logprob characterizes how likely a sentence is to occur: the larger the logprob, the more likely the sentence; the smaller the logprob, the less likely.
Another parameter characterizing how likely a sentence is to occur is ppl (the confusion degree, i.e., perplexity), defined as the geometric mean of the reciprocal per-word probability:
ppl = 10^(-logprob/(m-OOVs+1))
where m is the number of words in the sentence and OOVs is the number of out-of-vocabulary words (words outside the dictionary); the +1 can be understood as accounting for the end-of-sentence token. The smaller the ppl, the more likely the sentence is to occur; the larger the ppl, the less likely.
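Both quantities can be computed directly from the per-word conditional probabilities. In the sketch below the probabilities are invented for illustration (a real language model would supply them), logarithms are base 10 to match the 10^ in the ppl formula, and the denominator is taken as (number of words - OOVs + 1), an SRILM-style convention assumed here.

```python
import math

def score(word_probs, oovs=0):
    """Return (logprob, ppl) for a sentence given the conditional
    probabilities P(Wk | W1..Wk-1) of its in-vocabulary words."""
    logprob = sum(math.log10(p) for p in word_probs)  # log10 P(W1,...,Wm)
    m = len(word_probs) + oovs                        # total words in the sentence
    ppl = 10 ** (-logprob / (m - oovs + 1))           # assumed SRILM-style denominator
    return logprob, ppl

lp, ppl = score([0.1, 0.2, 0.05])
print(round(lp, 3), round(ppl, 2))  # -3.0 5.62
```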
Further, in another embodiment of the present invention, the sequentially inputting the i words into the pre-trained language model according to the order in the target sentence includes:
judging whether preset keywords exist in the i words or not;
if the i words have preset keywords, words, except the preset keywords, in the i words are sequentially input into a pre-trained language model according to the sequence in the target sentence.
In this embodiment, the preset keyword may be a stop word, for example greeting or filler words such as "hello", "thank you", and "you".
The preset keyword may also be an interjection, for example words such as "ah", "oh", and "wow".
The preset keywords can be other types of words which do not affect the semantics of the sentences, and can be preset according to actual requirements.
In this embodiment, removing the preset keywords simplifies the target sentence and increases detection speed without affecting the accuracy of the detection result.
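Removing preset keywords before scoring can be sketched as a simple filter; the keyword list below is an illustrative assumption, standing in for the stop words and interjections mentioned above.

```python
# Illustrative preset keywords; a real deployment would configure these
# according to actual requirements, as the text notes.
PRESET_KEYWORDS = {"hello", "thanks", "ah", "wow"}

def strip_keywords(words):
    """Drop preset keywords that do not affect sentence semantics before
    the remaining words are fed to the language model."""
    return [w for w in words if w not in PRESET_KEYWORDS]

print(strip_keywords(["ah", "I", "love", "autumn"]))  # ['I', 'love', 'autumn']
```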
And judging the target sentence as a wrong sentence when the confusion degree of the target sentence is larger than a preset confusion degree and/or the log likelihood probability of the target sentence is smaller than a preset log likelihood probability.
In this embodiment, an erroneous sentence is an ill-formed sentence, including a grammatically defective sentence or a word string that does not form a sentence at all.
The preset log likelihood probability may be a preset value for deciding whether the log likelihood probability of the target sentence is high or low: when the target sentence's log likelihood probability is above the preset value, it is high and the target sentence is judged to be a correct sentence; when it is below the preset value, it is low and the target sentence is judged to be an erroneous sentence.
Similarly, the preset confusion degree may be a preset value for deciding whether the confusion degree of the target sentence is high or low: when the target sentence's confusion degree is above the preset value, it is high and the target sentence is judged to be an erroneous sentence; when it is below the preset value, it is low and the target sentence is judged to be a correct sentence.
In a possible embodiment, when the confusion degree of the target sentence is greater than a preset confusion degree and/or the log likelihood probability of the target sentence is less than a preset log likelihood probability, before the target sentence is judged to be a wrong sentence, the following steps are further implemented:
determining the preset confusion degree and/or determining the preset log likelihood probability.
The determining the preset confusion degree and/or the determining the preset log likelihood probability specifically comprises:
obtaining a training sample for training the language model, wherein the training sample comprises a positive sample and a negative sample;
obtaining the confusion degree of the positive sample and the log likelihood probability of the positive sample; and
obtaining the confusion degree of the negative sample and the log likelihood probability of the negative sample;
obtaining a confusion degree histogram according to the confusion degree of the positive sample and the confusion degree of the negative sample, and obtaining the preset confusion degree through the confusion degree histogram; and
and acquiring a log-likelihood probability histogram according to the log-likelihood probability of the positive sample and the log-likelihood probability of the negative sample, and acquiring the preset log-likelihood probability through the log-likelihood probability histogram.
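The threshold-selection steps above can be sketched as follows: score the positive and negative samples, then pick the candidate edge that best separates them. The sample data and the separation criterion (sum of the two shares) are illustrative assumptions, not specified by the patent.

```python
def choose_threshold(pos_scores, neg_scores, edges):
    """Pick the edge t maximizing (share of positive samples below t) +
    (share of negative samples at or above t). For ppl, positives are
    correct sentences (low ppl) and negatives erroneous ones (high ppl)."""
    best_t, best_sep = None, -1.0
    for t in edges:
        pos_below = sum(s < t for s in pos_scores) / len(pos_scores)
        neg_above = sum(s >= t for s in neg_scores) / len(neg_scores)
        if pos_below + neg_above > best_sep:
            best_t, best_sep = t, pos_below + neg_above
    return best_t

pos = [80, 90, 120, 150, 200, 230]     # illustrative ppl of correct sentences
neg = [300, 400, 260, 900, 150, 2000]  # illustrative ppl of erroneous sentences
print(choose_threshold(pos, neg, edges=[100, 250, 500]))  # 250
```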
In this embodiment, the preset confusion degree is determined by a confusion degree histogram of a training sample, and the preset log likelihood probability is determined by a log likelihood probability histogram of the training sample.
The confusion degree histograms of the training sample comprise a correct-sentence histogram and an erroneous-sentence histogram calculated from the training sample (i.e., a training set composed of sentences recognized by ASR); they reflect the confusion degree distributions of correct sentences and of erroneous sentences.
In this embodiment, the preset confusion degree may be determined from the confusion degree distribution of correct sentences and that of erroneous sentences. Whether the confusion degree of the target sentence falls within the range typical of correct sentences or of erroneous sentences then decides whether the target sentence is judged correct or erroneous.
Specifically, the confusion degree of each correct sentence and of each erroneous sentence in the training sample may be calculated, the two distributions determined from them, and a confusion degree threshold that separates correct sentences from erroneous ones, namely the preset confusion degree, chosen accordingly.
For example, the confusion degree (ppl) distribution of correct sentences is:

    ppl interval     Count    Percentage
    [2000, +∞)           5      0.125%
    [1500, 2000)         1      0.025%
    [1000, 1500)         5      0.125%
    [750, 1000)         11      0.276%
    [500, 750)          28      0.702%
    [250, 500)         217      5.441%
    [100, 250)        1241     31.18%
    [0, 100)          2480     62.187%
The confusion degree (ppl) distribution of erroneous sentences is:

    ppl interval     Count    Percentage
    [2000, +∞)          86      5.923%
    [1500, 2000)        43      2.961%
    [1000, 1500)       111      7.645%
    [750, 1000)        107      7.369%
    [500, 750)         204     14.05%
    [250, 500)         501     34.504%
    [100, 250)         345     23.76%
    [0, 100)            55      3.788%
The histograms show that 93.367% of correct sentences have ppl less than 250, while 72.452% of erroneous sentences have ppl greater than 250.
A threshold of ppl = 250 therefore separates the two classes well, so the preset confusion degree (preset ppl) is determined to be 250. In this embodiment, when the ppl of the target sentence is greater than the preset ppl, the target sentence is judged to be an erroneous sentence.
Likewise, the preset log likelihood probability (logprob) is obtained in the same way as the preset confusion degree. The logprob histograms of the training sample comprise a correct-sentence histogram and an erroneous-sentence histogram calculated from the training sample; they reflect the log likelihood probability distributions of correct sentences and of erroneous sentences.
For example, the log likelihood probability (logprob) distribution of correct sentences is:

    logprob interval    Count    Percentage
    (-∞, -4.0)              1      0.0251%
    [-4.0, -3.5)            0      0%
    [-3.5, -3.0)           14      0.351%
    [-3.0, -2.5)          122      3.0591%
    [-2.5, -2.0)         1371     34.378%
    [-2.0, -1.5)         1740     43.631%
    [-1.5, -1.0)          673     16.876%
    [-1.0, 0)              67      1.68%
The log likelihood probability (logprob) distribution of erroneous sentences is:

    logprob interval    Count    Percentage
    (-∞, -4.0)              8      0.551%
    [-4.0, -3.5)           31      2.135%
    [-3.5, -3.0)          200     13.774%
    [-3.0, -2.5)          656     45.179%
    [-2.5, -2.0)          502     34.573%
    [-2.0, -1.5)           52      3.581%
    [-1.5, -1.0)            3      0.207%
    [-1.0, 0)               0      0%
The histograms show that 96.566% of correct sentences have logprob greater than -2.5, while 61.639% of erroneous sentences have logprob less than -2.5.
A threshold of logprob = -2.5 therefore separates the two classes well, so the preset logprob is determined to be -2.5. In this embodiment, when the logprob of the target sentence is smaller than the preset logprob, the target sentence is judged to be an erroneous sentence.
The erroneous sentence detection device provided by this embodiment acquires a target sentence obtained through automatic speech recognition; acquires the ith text segment of the target sentence and judges whether the preset dictionary contains a word matching it, where the initial value of i is 1 and i is a positive integer; if no matching word exists, adjusts the number of characters in the ith segment and judges again; if a matching word exists, takes the segment as the ith word of the target sentence, sets i = i + 1, and repeats the matching for the next segment; when the total number of characters in the i words equals the total number of characters in the target sentence, determines that the target sentence consists of those i words; inputs the i words, in their order in the target sentence, into a pre-trained language model and calculates the confusion degree and/or log likelihood probability of the target sentence through the model; and judges the target sentence to be an erroneous sentence when its confusion degree is greater than the preset confusion degree and/or its log likelihood probability is smaller than the preset log likelihood probability, thereby identifying whether a sentence is erroneous.
Further, in another embodiment of the apparatus of the present invention, the sentence detection program may be further invoked by the processor to implement the following steps:
If the target sentence is an erroneous sentence, an erroneous-sentence reminder message is sent.
When the target sentence is judged to be erroneous, a prompt can be sent to the user. For example, when a speech-converted sentence is to undergo further natural language processing and is judged erroneous, the user can be reminded through a pop-up error message on the display device that the target sentence is erroneous.
In this embodiment, by sending the reminder, the user can quickly learn whether erroneous sentences exist and which ones they are, and then carry out subsequent operations.
Alternatively, in other embodiments, the sentence detection program may be divided into one or more modules stored in the memory 11 and executed by one or more processors (the processor 12 in this embodiment) to implement the present invention. A module here is a series of computer program instruction segments capable of performing a specific function, used to describe the execution of the sentence detection program in the sentence detection device.
For example, referring to fig. 3, a schematic diagram of the program modules of the sentence detection program in an embodiment of the sentence detection device according to the present invention is shown. By way of example, the sentence detection program may be divided into a first obtaining module 10, a second obtaining module 20, an adjusting module 30, a first judging module 40, a determining module 50, a calculating module 60 and a second judging module 70:
The first acquisition module 10 is configured to: acquire a target sentence obtained through automatic speech recognition technology;
the second acquisition module 20 is configured to: acquiring an ith text contained in the target sentence, and judging whether a word matched with the ith text exists in a preset dictionary, wherein the initial value of i is 1, and i is a positive integer;
the adjustment module 30 is configured to: if the words matched with the ith section of characters do not exist in the preset dictionary, adjusting the word number of the ith section of characters, and judging whether the words matched with the ith section of characters exist in the preset dictionary;
the first judging module 40 is configured to: if the words matched with the ith word segment exist in the preset dictionary, determining the ith word segment as the ith word segment of the target sentence, enabling i=i+1, acquiring the ith word segment contained in the target sentence, and judging whether the words matched with the ith word segment exist in the preset dictionary;
the determining module 50 is configured to: and when the total word number of the i words is the same as the total word number of the target sentence, determining that the target sentence consists of the i words.
The calculation module 60 is configured to: and sequentially inputting the i words into a pre-trained language model according to the sequence in the target sentence, and calculating the confusion degree and/or the log likelihood probability of the target sentence through the language model.
The second judging module 70 is configured to: and judging the target sentence as a wrong sentence when the confusion degree of the target sentence is larger than a preset confusion degree and/or the log likelihood probability of the target sentence is smaller than a preset log likelihood probability.
The functions or operation steps implemented when the program modules, such as the first acquiring module 10, the second acquiring module 20, the adjusting module 30, the first judging module 40, the determining module 50, the calculating module 60, and the second judging module 70, are substantially the same as those of the foregoing embodiments, and are not repeated herein.
In addition, an embodiment of the present invention further proposes a computer-readable storage medium having stored thereon a sentence detection program executable by one or more processors to implement the following operations:
acquiring a target sentence obtained through automatic speech recognition technology;
acquiring an ith text contained in the target sentence, and judging whether a word matched with the ith text exists in a preset dictionary, wherein the initial value of i is 1, and i is a positive integer;
if the words matched with the ith section of characters do not exist in the preset dictionary, adjusting the word number of the ith section of characters, and judging whether the words matched with the ith section of characters exist in the preset dictionary;
If the words matched with the ith word segment exist in the preset dictionary, determining the ith word segment as the ith word segment of the target sentence, enabling i=i+1, acquiring the ith word segment contained in the target sentence, and judging whether the words matched with the ith word segment exist in the preset dictionary;
when the total word number of i words is the same as the total word number of the target sentence, determining that the target sentence consists of the i words;
sequentially inputting the i words into a pre-trained language model according to the sequence in the target sentence, and calculating the confusion degree and/or log likelihood probability of the target sentence through the language model;
and judging the target sentence as a wrong sentence when the confusion degree of the target sentence is larger than a preset confusion degree and/or the log likelihood probability of the target sentence is smaller than a preset log likelihood probability.
The specific implementation of the computer-readable storage medium of the present invention is substantially the same as that of the erroneous sentence detection device and method described above, and will not be repeated here.
It should be noted that the serial numbers of the above embodiments are for description only and do not indicate the relative merit of the embodiments. The terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such process, apparatus, article, or method. Without further limitation, an element preceded by "comprising a …" does not exclude the presence of other identical elements in the process, apparatus, article, or method that comprises it.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) as described above, comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method according to the embodiments of the present invention.
The foregoing description is only of the preferred embodiments of the present invention, and is not intended to limit the scope of the invention, but rather is intended to cover any equivalents of the structures or equivalent processes disclosed herein or in the alternative, which may be employed directly or indirectly in other related arts.

Claims (10)

1. A method for detecting a sentence, the method comprising:
acquiring a target sentence obtained through automatic speech recognition technology;
Acquiring an ith text contained in the target sentence, and judging whether a word matched with the ith text exists in a preset dictionary, wherein the initial value of i is 1, and i is a positive integer;
if the words matched with the ith section of characters do not exist in the preset dictionary, adjusting the word number of the ith section of characters, and judging whether the words matched with the ith section of characters exist in the preset dictionary;
if the words matched with the ith word segment exist in the preset dictionary, determining the ith word segment as the ith word segment of the target sentence, enabling i=i+1, acquiring the ith word segment contained in the target sentence, and judging whether the words matched with the ith word segment exist in the preset dictionary;
when the total word number of i words is the same as the total word number of the target sentence, determining that the target sentence consists of the i words;
sequentially inputting the i words into a pre-trained language model according to the sequence in the target sentence, and calculating the confusion degree and/or log likelihood probability of the target sentence through the language model;
and judging the target sentence as a wrong sentence when the confusion degree of the target sentence is larger than a preset confusion degree and/or the log likelihood probability of the target sentence is smaller than a preset log likelihood probability.
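The segmentation loop in claim 1 is essentially forward maximum matching against the preset dictionary: try the longest candidate segment first, shrink until a dictionary word matches, then advance. A minimal sketch, assuming an illustrative dictionary, maximum word length, and sample sentences (none taken from the patent):

```python
# Forward-maximum-matching sketch of the segmentation steps in claim 1.
# PRESET_DICTIONARY and MAX_WORD_LEN are illustrative assumptions.

PRESET_DICTIONARY = {"machine", "learning", "model", "sentence", "error"}
MAX_WORD_LEN = 8  # longest candidate segment to try first


def segment(sentence, dictionary=PRESET_DICTIONARY):
    """Split `sentence` into dictionary words; return None if it cannot be done."""
    words = []
    pos = 0
    while pos < len(sentence):
        # Start from the longest candidate and shrink, mirroring the claim's
        # "adjusting the number of characters in the i-th text segment".
        for length in range(min(MAX_WORD_LEN, len(sentence) - pos), 0, -1):
            candidate = sentence[pos:pos + length]
            if candidate in dictionary:
                words.append(candidate)  # the i-th word of the target sentence
                pos += length
                break
        else:
            return None  # no dictionary word matches at this position
    # Here the character count of the i words equals the sentence length,
    # so the sentence consists of the i words (claim 1's stopping condition).
    return words
```

Note that greedy longest-first matching can fail on inputs where a shorter first match would have allowed a full segmentation; the claim's character-count adjustment covers only the shrink step, and this sketch shares that limitation.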
2. The method of claim 1, wherein sequentially inputting the i words into the pre-trained language model in their order in the target sentence comprises:
judging whether a preset keyword exists among the i words;
if a preset keyword exists among the i words, sequentially inputting the words among the i words other than the preset keyword into the pre-trained language model in their order in the target sentence.
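The keyword exclusion of claim 2 amounts to an order-preserving filter applied before language-model scoring. A small sketch; the keyword set below is an illustrative assumption (the patent does not enumerate the preset keywords):

```python
# Order-preserving removal of preset keywords before scoring (claim 2).
# PRESET_KEYWORDS is an illustrative assumption, e.g. filler tokens that
# should not influence the language-model score.

PRESET_KEYWORDS = {"uh", "um"}


def filter_keywords(words, keywords=PRESET_KEYWORDS):
    """Return the words, in their original sentence order, with preset keywords removed."""
    return [w for w in words if w not in keywords]
```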
3. The method for detecting an erroneous sentence according to claim 1 or 2, further comprising, before judging the target sentence to be an erroneous sentence when the perplexity of the target sentence is greater than a preset perplexity and/or the log-likelihood probability of the target sentence is less than a preset log-likelihood probability:
determining the preset perplexity and/or the preset log-likelihood probability;
wherein determining the preset perplexity and/or the preset log-likelihood probability specifically comprises:
obtaining training samples used to train the language model, the training samples comprising positive samples and negative samples;
obtaining the perplexity and the log-likelihood probability of the positive samples; and
obtaining the perplexity and the log-likelihood probability of the negative samples;
obtaining a perplexity histogram from the perplexities of the positive samples and the negative samples, and obtaining the preset perplexity from the perplexity histogram; and
obtaining a log-likelihood probability histogram from the log-likelihood probabilities of the positive samples and the negative samples, and obtaining the preset log-likelihood probability from the log-likelihood probability histogram.
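The histogram step in claim 3 amounts to choosing a cut point between the score distributions of positive (well-formed) and negative (erroneous) samples. A hedged sketch, under the assumption that the preset value is the candidate cut minimizing total misclassification — the patent reads the threshold off the histograms but does not fix a specific selection rule:

```python
# Threshold selection from positive/negative score populations (claim 3).
# The selection rule (minimum total misclassification over candidate cuts)
# and the sample values in the test are illustrative assumptions.

def choose_threshold(positive_scores, negative_scores):
    """Pick the cut point that best separates the two score populations.

    For perplexity, sentences scoring above the returned threshold are
    flagged, matching the claim's "greater than a preset perplexity" test.
    """
    candidates = sorted(set(positive_scores) | set(negative_scores))
    best_cut, best_errors = None, float("inf")
    for cut in candidates:
        # positives above the cut are false alarms; negatives at or below it are misses
        errors = (sum(p > cut for p in positive_scores)
                  + sum(n <= cut for n in negative_scores))
        if errors < best_errors:
            best_cut, best_errors = cut, errors
    return best_cut
```

For the log-likelihood probability, where erroneous sentences score lower rather than higher, the two comparison directions would be mirrored.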
4. The method of claim 2, wherein the language model is a deep-learning language model or a statistics-based language model.
5. The method for detecting an erroneous sentence according to claim 1 or 2, further comprising:
if the target sentence is an erroneous sentence, sending an erroneous-sentence reminder message.
6. An apparatus for detecting an erroneous sentence, the apparatus comprising a memory and a processor, the memory storing an erroneous-sentence detection program operable on the processor, wherein the program, when executed by the processor, implements the following steps:
acquiring a target sentence obtained by automatic speech recognition;
acquiring the i-th text segment contained in the target sentence, and judging whether a word matching the i-th text segment exists in a preset dictionary, wherein the initial value of i is 1 and i is a positive integer;
if no word matching the i-th text segment exists in the preset dictionary, adjusting the number of characters in the i-th text segment and again judging whether a word matching the i-th text segment exists in the preset dictionary;
if a word matching the i-th text segment exists in the preset dictionary, determining the i-th text segment as the i-th word of the target sentence, setting i = i + 1, acquiring the i-th text segment contained in the target sentence, and judging whether a word matching the i-th text segment exists in the preset dictionary;
when the total number of characters in the i words equals the total number of characters in the target sentence, determining that the target sentence consists of the i words;
sequentially inputting the i words, in their order in the target sentence, into a pre-trained language model, and calculating the perplexity and/or log-likelihood probability of the target sentence through the language model;
and judging the target sentence to be an erroneous sentence when the perplexity of the target sentence is greater than a preset perplexity and/or the log-likelihood probability of the target sentence is less than a preset log-likelihood probability.
7. The erroneous-sentence detection apparatus of claim 6, wherein sequentially inputting the i words into the pre-trained language model in their order in the target sentence comprises:
judging whether a preset keyword exists among the i words;
if a preset keyword exists among the i words, sequentially inputting the words among the i words other than the preset keyword into the pre-trained language model in their order in the target sentence.
8. The erroneous-sentence detection apparatus according to claim 6 or 7, wherein the erroneous-sentence detection program, when executed by the processor, further implements the following steps:
before judging the target sentence to be an erroneous sentence when the perplexity of the target sentence is greater than a preset perplexity and/or the log-likelihood probability of the target sentence is less than a preset log-likelihood probability, determining the preset perplexity and/or the preset log-likelihood probability;
wherein determining the preset perplexity and/or the preset log-likelihood probability specifically comprises:
obtaining training samples used to train the language model, the training samples comprising positive samples and negative samples;
obtaining the perplexity and the log-likelihood probability of the positive samples; and
obtaining the perplexity and the log-likelihood probability of the negative samples;
obtaining a perplexity histogram from the perplexities of the positive samples and the negative samples, and obtaining the preset perplexity from the perplexity histogram; and
obtaining a log-likelihood probability histogram from the log-likelihood probabilities of the positive samples and the negative samples, and obtaining the preset log-likelihood probability from the log-likelihood probability histogram.
9. The erroneous-sentence detection apparatus according to claim 6 or 7, wherein the erroneous-sentence detection program, when executed by the processor, further implements the following step:
if the target sentence is an erroneous sentence, sending an erroneous-sentence reminder message.
10. A computer-readable storage medium having stored thereon an erroneous-sentence detection program executable by one or more processors to implement the steps of the erroneous-sentence detection method of any one of claims 1 to 5.
CN201910343889.6A 2019-04-26 2019-04-26 Sentence fault detection method, sentence fault detection device and computer readable storage medium Active CN110211571B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910343889.6A CN110211571B (en) 2019-04-26 2019-04-26 Sentence fault detection method, sentence fault detection device and computer readable storage medium
PCT/CN2019/102191 WO2020215550A1 (en) 2019-04-26 2019-08-23 Wrong sentence detection method and apparatus, and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910343889.6A CN110211571B (en) 2019-04-26 2019-04-26 Sentence fault detection method, sentence fault detection device and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN110211571A CN110211571A (en) 2019-09-06
CN110211571B true CN110211571B (en) 2023-05-26

Family

ID=67786422

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910343889.6A Active CN110211571B (en) 2019-04-26 2019-04-26 Sentence fault detection method, sentence fault detection device and computer readable storage medium

Country Status (2)

Country Link
CN (1) CN110211571B (en)
WO (1) WO2020215550A1 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110852087B (en) * 2019-09-23 2022-02-22 腾讯科技(深圳)有限公司 Chinese error correction method and device, storage medium and electronic device
CN110765996B (en) * 2019-10-21 2022-07-29 北京百度网讯科技有限公司 Text information processing method and device
WO2021138898A1 (en) * 2020-01-10 2021-07-15 深圳市欢太科技有限公司 Speech recognition result detection method and apparatus, and storage medium
CN112380855B (en) * 2020-11-20 2024-03-08 北京百度网讯科技有限公司 Method for determining statement smoothness, method and device for determining probability prediction model
CN112863499B (en) * 2021-01-13 2023-01-24 北京小米松果电子有限公司 Speech recognition method and device, storage medium
CN113096667A (en) * 2021-04-19 2021-07-09 上海云绅智能科技有限公司 Wrongly-written character recognition detection method and system
CN115062148B (en) * 2022-06-23 2023-06-20 广东国义信息科技有限公司 Risk control method based on database

Citations (7)

Publication number Priority date Publication date Assignee Title
US6064959A (en) * 1997-03-28 2000-05-16 Dragon Systems, Inc. Error correction in speech recognition
JP2014089247A (en) * 2012-10-29 2014-05-15 Nippon Telegr & Teleph Corp <Ntt> Identification language model learning device, identification language model learning method, and program
CN105244029A (en) * 2015-08-28 2016-01-13 科大讯飞股份有限公司 Voice recognition post-processing method and system
CN107741928A (en) * 2017-10-13 2018-02-27 四川长虹电器股份有限公司 A kind of method to text error correction after speech recognition based on field identification
CN108255857A (en) * 2016-12-29 2018-07-06 北京国双科技有限公司 A kind of sentence detection method and device
CN108766437A (en) * 2018-05-31 2018-11-06 平安科技(深圳)有限公司 Audio recognition method, device, computer equipment and storage medium
CN108959250A (en) * 2018-06-27 2018-12-07 众安信息技术服务有限公司 A kind of error correction method and its system based on language model and word feature

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
JP2001069543A (en) * 1999-08-26 2001-03-16 Nec Corp Radio paging system
CN101295293B (en) * 2007-04-29 2010-06-02 摩托罗拉公司 Automatic error correction method for input character string of ideographic character
US9653071B2 (en) * 2014-02-08 2017-05-16 Honda Motor Co., Ltd. Method and system for the correction-centric detection of critical speech recognition errors in spoken short messages


Also Published As

Publication number Publication date
CN110211571A (en) 2019-09-06
WO2020215550A1 (en) 2020-10-29

Similar Documents

Publication Publication Date Title
CN110211571B (en) Sentence fault detection method, sentence fault detection device and computer readable storage medium
CN108292500B (en) Apparatus and method for end-of-sentence detection using grammar consistency
WO2020224213A1 (en) Sentence intent identification method, device, and computer readable storage medium
JP5901001B1 (en) Method and device for acoustic language model training
US11043213B2 (en) System and method for detection and correction of incorrectly pronounced words
CN109284503B (en) Translation statement ending judgment method and system
CN114416943B (en) Training method and device for dialogue model, electronic equipment and storage medium
CN110866095A (en) Text similarity determination method and related equipment
CN113657098B (en) Text error correction method, device, equipment and storage medium
CN112364658A (en) Translation and voice recognition method, device and equipment
CN112434520A (en) Named entity recognition method and device and readable storage medium
CN110826301A (en) Punctuation mark adding method, system, mobile terminal and storage medium
CN114386399A (en) Text error correction method and device
CN112257470A (en) Model training method and device, computer equipment and readable storage medium
CN115858776B (en) Variant text classification recognition method, system, storage medium and electronic equipment
CN111460811A (en) Crowdsourcing task answer verification method and device, computer equipment and storage medium
CN107656627B (en) Information input method and device
CN108304366B (en) Hypernym detection method and device
CN113177406B (en) Text processing method, text processing device, electronic equipment and computer readable medium
CN110413983B (en) Method and device for identifying name
CN108021918B (en) Character recognition method and device
CN110866390B (en) Method and device for recognizing Chinese grammar error, computer equipment and storage medium
CN110929749A (en) Text recognition method, text recognition device, text recognition medium and electronic equipment
CN113254658B (en) Text information processing method, system, medium, and apparatus
CN112185346B (en) Multilingual voice keyword detection and model generation method and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant