CN107844481B

CN107844481B - Text recognition error detection method and device

Info

Publication number: CN107844481B
Application number: CN201711167410.5A
Authority: CN
Inventors: 刘俊华; 魏思; 胡国平; 柳林; 王建社; 方昕; 李永超; 孟廷
Original assignee: Xinjiang Iflytek Information Technology Co ltd
Current assignee: Xinjiang Shenggu Rongchuang Digital Industry Development Co ltd
Priority date: 2017-11-21
Filing date: 2017-11-21
Publication date: 2019-09-13
Anticipated expiration: 2037-11-21
Also published as: CN107844481A

Abstract

The embodiment of the invention provides a method and a device for identifying text and detecting errors, belonging to the technical field of language processing. The method comprises the following steps: acquiring a recognition confidence coefficient, a translation confidence coefficient and a context confidence coefficient of each word in the recognized text; and fusing the recognition confidence, the translation confidence and the context confidence of each participle in the recognized text to obtain a comprehensive confidence score of each participle in the recognized text, and taking the participle with the comprehensive confidence score smaller than a preset threshold value as an error word in the recognized text. Because the credibility of each participle as a recognition result can be reversely determined based on the translation confidence coefficient, and the credibility of each participle as the recognition result in the context before or after translation can be determined based on the context confidence coefficient, when the translation confidence coefficient, the context confidence coefficient and the recognition confidence coefficient are combined to detect errors of each participle in the recognition text, the error detection basis is more diversified, and the error detection accuracy can be improved.

Description

Identify text error-detecting method and device

Technical field

The present embodiments relate to language processing techniques field, more particularly, to a kind of identification text error-detecting method and Device.

Background technique

Currently, language communication becomes the important topic that different racial groups are faced in mutual exchange.Traditional Interpretative system is usually to accompany interpretation, alternately interpretation and simultaneous interpretation etc. using artificial, to solve the problems, such as IMPAIRED VERBAL COMMUNICATION, But it is limited to shortage of manpower and cost limitation, is unable to satisfy the demand that ordinary people carries out communication exchange.And voiced translation technology Development useful supplement is made that conventional translation mode, provide another approach for the daily communication exchange of ordinary people, and Cost and timeliness etc. have more advantage.Voiced translation includes these three steps of speech recognition, machine translation and speech synthesis, Due to the mistake introduced in speech recognition link, will have a direct impact on subsequent translation accuracy, thus how to identification text into Row error detection is the key that promote translation accuracy.

A kind of identification Method for text detection is provided in the related technology, and this method is mainly based upon each word in identification text Recognition confidence judge identify text in each word it is whether correct.By the identification confidence for being then based on each word in identification process It spends to carry out error detection, error detection foundation is more single, so that error detection accuracy is lower.

Summary of the invention

To solve the above-mentioned problems, the embodiment of the present invention provides one kind and overcomes the above problem or at least be partially solved State the identification text error-detecting method and device of problem.

According to a first aspect of the embodiments of the present invention, a kind of identification text error-detecting method is provided, this method comprises:

Obtain recognition confidence, degree of translation confidence and the context confidence level of each participle in identification text, degree of translation confidence It is that the translation accuracy based on target word each in target language text is obtained, context confidence level is based in identification text The contextual feature of each participle is obtained, and target language text is obtained after translating to identification text；

The recognition confidence, degree of translation confidence and the context confidence level that identify each participle in text are merged, with Comprehensive confidence level score value is less than the participle of preset threshold as knowledge by the synthesis confidence level score value of each participle into identification text Erroneous words in other text.

Method provided in an embodiment of the present invention is set by obtaining the recognition confidence of each participle, translation in identification text Reliability and context confidence level will identify that the recognition confidence of each participle, degree of translation confidence and context confidence level carry out in text Comprehensive confidence level score value is less than default threshold to obtain the synthesis confidence level score value of each participle in identification text by weighted sum The participle of value is as the erroneous words in identification text.Due to can reversely determine each participle as identification knot based on degree of translation confidence The credibility of fruit, and can be determined based on context confidence level in context of each participle before translation or after translation as identification knot The credibility of fruit, thus in combining translation confidence level, context confidence level and recognition confidence to each point in identification text When word carries out error detection, error detection foundation is more polynary, and error detection accuracy can be improved.

The possible implementation of with reference to first aspect the first, in the second possible implementation, object language Text will identify that text input to translation encoding and decoding Recognition with Recurrent Neural Network exports to obtain；Correspondingly, it obtains in identification text The degree of translation confidence of each participle, comprising:

Output feature and each participle based on decoding layer in translation encoding and decoding Recognition with Recurrent Neural Network are in identification text Coding characteristic obtains the translation contribution degree of each participle, before the coding characteristic of each participle is for indicating each participle translation Context；

The translation contribution degree of each participle is normalized, each participle corresponding normalization attention weight system is obtained Number；

The corresponding normalization attention weight coefficient of each participle and the translation accuracy of each target word are weighted Summation, obtains the degree of translation confidence of each participle.

The possible implementation of second with reference to first aspect obtains each in the third possible implementation Segment the coding characteristic in identification text, comprising:

The corresponding forward coding feature of term vector of each participle and anti-is obtained by translating encoding and decoding Recognition with Recurrent Neural Network To coding characteristic, the corresponding forward coding feature of each participle is spliced with phase-reversal coding feature, each participle is obtained and exists Identify the coding characteristic in text.

The possible implementation of with reference to first aspect the first, in the fourth possible implementation, each participle Contextual feature include decoding feature, the decoding feature of each participle is used to indicate for each context segmented after translating；Phase Ying Di obtains the contextual feature of each participle in identification text, comprising:

By decoding layer in the corresponding normalization attention weight coefficient of each participle and translation encoding and decoding Recognition with Recurrent Neural Network Output feature be weighted summation, obtain the decoding feature of each participle.

The 4th kind of possible implementation with reference to first aspect, in a fifth possible implementation, each participle Contextual feature further include at least one of following three kinds of data data, below three kinds of data be respectively each participle word to The coding characteristic and the corresponding type of theme of identification text of amount, each participle in identification text.

The possible implementation of with reference to first aspect the first obtains identification in a sixth possible implementation The context confidence level of each participle in text, comprising:

The contextual feature for identifying each participle in text is input to context confidence calculations model, output obtains each point The context confidence level of word, context confidence calculations model be based on training identification text contextual feature to default computation model into It is obtained after row training.

The possible implementation of with reference to first aspect the first obtains identification in the 7th kind of possible implementation The recognition confidence of each participle in text, comprising:

Obtain the posterior probability of each participle in identification text, and the recognition confidence as each participle；Alternatively,

The acoustics score value, language model score value, posterior probability and the duration that identify each participle in text are input to Recognition confidence computation model exports the recognition confidence of each participle.

According to a second aspect of the embodiments of the present invention, a kind of identification text Error Detection Unit is provided, which includes:

Module is obtained, for obtaining recognition confidence, degree of translation confidence and the context confidence of each participle in identification text Degree, degree of translation confidence is that the translation accuracy based on target word each in target language text is obtained, and context confidence level is Contextual feature based on each participle in identification text is obtained, and target language text is obtained after translating to identification text It arrives；

Detection module, for that will identify the recognition confidence of each participle in text, degree of translation confidence and context confidence level It is merged, to obtain the synthesis confidence level score value of each participle in identification text, comprehensive confidence level score value is less than default threshold The participle of value is as the erroneous words in identification text.

According to a third aspect of the embodiments of the present invention, a kind of identification text error-detecting facility is provided, comprising:

At least one processor；And

At least one processor being connect with processor communication, in which:

Memory is stored with the program instruction that can be executed by processor, and the instruction of processor caller is able to carry out first party Text error-detecting method is identified in the various possible implementations in face provided by any possible implementation.

According to the fourth aspect of the invention, a kind of non-transient computer readable storage medium, non-transient computer are provided Readable storage medium storing program for executing stores computer instruction, and computer instruction makes the various possible implementations of computer execution first aspect In text error-detecting method is identified provided by any possible implementation.

It should be understood that above general description and following detailed description be it is exemplary and explanatory, can not Limit the embodiment of the present invention.

Detailed description of the invention

Fig. 1 is a kind of flow diagram of identification text error-detecting method of the embodiment of the present invention；

Fig. 2 is a kind of flow diagram of identification text error-detecting method of the embodiment of the present invention；

Fig. 3 is a kind of structural schematic diagram of translation encoding and decoding Recognition with Recurrent Neural Network of the embodiment of the present invention；

Fig. 4 is a kind of block diagram of identification text Error Detection Unit of the embodiment of the present invention；

Fig. 5 is a kind of block diagram of identification text error-detecting facility of the embodiment of the present invention.

Specific embodiment

With reference to the accompanying drawings and examples, the specific embodiment of the embodiment of the present invention is described in further detail.With Lower embodiment is not limited to the range of the embodiment of the present invention for illustrating the embodiment of the present invention.

Voiced translation refers to the process of the voice signal that the voice signal of original language is automatically translated into object language.Voice Translation generally comprises three speech recognition, machine translation and speech synthesis chief components.Specifically, in given original language When voice signal, the identification text of original language is obtained by speech recognition system first, will be known secondly by machine translation system Other text translates into target language text, and target language text is synthesized to the language of object language finally by speech synthesis system Sound signal.Since desirable level has not been reached yet in each component part under current technological conditions, will lead in some cases The translation result of speech translation system output is wrong, the mistake especially introduced in speech recognition link, after will have a direct impact on Continuous translation accuracy, so that how to carry out error detection to identification text is the key that promote translation accuracy.

For said circumstances, the embodiment of the invention provides a kind of identification text error-detecting methods.This method can be used for voice Scene is translated, that is, first passes through speech recognition and obtains identification text, after being translated to identification text, based on translation result to knowledge Other text carries out error detection.Referring to Fig. 1, this method comprises:

101, the recognition confidence of each participle, degree of translation confidence and context confidence level, translation in identification text is obtained to set Reliability is that the translation accuracy based on target word each in target language text is obtained, and context confidence level is based on identification text Contextual feature of each participle is obtained in this, and target language text is obtained after translating to identification text；

102, the recognition confidence, degree of translation confidence and the context confidence level that identify each participle in text are merged, To obtain the synthesis confidence level score value of each participle in identification text, the participle that comprehensive confidence level score value is less than preset threshold is made For the erroneous words in identification text.

Before executing above-mentioned steps 101, the voice signal that audio collection module receives original language can be first passed through, then to source The voice signal of language carries out speech recognition and obtains identification text.It wherein, can be by identification text when being translated to identification text Originally it is input to translation encoding and decoding Recognition with Recurrent Neural Network, so that output obtains target language text.Identification text can be expressed as x= (x₁,x₂,x₃,…,x_m), x_iIndicate i-th of participle.The recognition confidence of each participle is for indicating each participle as identification knot The degree of translation confidence of the credibility of fruit, each participle reversely determines each participle as identification for indicating based on translation result As a result credibility, the context confidence level of each participle is for indicating each participle as recognition result under current context Credibility.Wherein, the recognition confidence of i-th of participle can indicate C_rec(x_i), degree of translation confidence is represented by C_trans(x_i), Context confidence level is represented by C_contex(x_i).In the synthesis confidence level score value for obtaining i-th of participle, weighted sum can be passed through Or the modes such as non-linear fusion, the present invention is not especially limit this.When by the way of weighted sum, it can lead to Following formula is crossed to calculate:

C(x_i)=w_recC_rec(x_i)+w_transC_trans(x_i)+w_context(x_i)

In above-mentioned formula, C (x_i) it is i-th of synthesis confidence level score value segmented.w_recFor the weight of recognition confidence, w_transFor the weight of degree of translation confidence, w_contextFor the weight of context confidence level.Above-mentioned every kind of weight can according to practical application or Experimental result determines that the present invention is not especially limit this.

In the synthesis confidence level score value C (x that i-th of participle is calculated_i) after, it can be by the synthesis confidence level of i-th of participle Score value C (x_i) be compared with preset threshold T.If the synthesis confidence level score value C (x of i-th of participle_i) be less than preset threshold T, then It can determine that i-th of participle is erroneous words.If the synthesis confidence level score value C (x of i-th of participle_i) be not less than preset threshold T, then may be used Determine that i-th of participle is correct word.

Method provided in an embodiment of the present invention is set by obtaining the recognition confidence of each participle, translation in identification text Reliability and context confidence level will identify that the recognition confidence of each participle, degree of translation confidence and context confidence level carry out in text Comprehensive confidence level score value is less than preset threshold to obtain the synthesis confidence level score value of each participle in identification text by fusion It segments as the erroneous words in identification text.Due to can reversely determine each participle as recognition result based on degree of translation confidence Credibility, and can be determined based on context confidence level in context of each participle before translation or after translation as recognition result Credibility, thus combining translation confidence level, context confidence level and recognition confidence to identification text in it is each segment into When row error detection, error detection foundation is more polynary, and error detection accuracy can be improved.

Content based on the above embodiment, as a kind of alternative embodiment, the embodiment of the invention also provides a kind of acquisitions The method for identifying the recognition confidence of each participle in text, including but not limited to: obtaining in identification text after each participle Test probability, and the recognition confidence as each participle；Alternatively, the acoustics score value, the language mould that will identify each participle in text Type score value, posterior probability and duration are input to recognition confidence computation model, export the recognition confidence of each participle.

Wherein, the first above-mentioned mode for obtaining recognition confidence can be indicated by following formula:

C_rec(x_i)=P_s(x_i)

In above-mentioned formula, C_rec(x_i) indicate i-th of recognition confidence segmented, P_s(x_i) indicate i-th after segmenting Test probability.

In the mode that above-mentioned second obtains recognition confidence, a large amount of training voice signals can be first collected in advance and carry out language Sound identification obtains the text of training identification accordingly.Determine whether each participle identifies correct and same in each training identification text When to it is each participle be labeled, such as identify correctly participle can be labeled as 1, identify mistake participle can be labeled as 0.It will train Acoustics score value, language model score value, posterior probability and the duration of each participle are input to recognition confidence in identification text Computation model is updated according to parameter of the annotation results of each participle to recognition confidence computation model, until preceding primary The variable quantity of model parameter is less than default change threshold between renewal process and a rear renewal process.At this point, updating terminates simultaneously Recognition confidence computation model can be obtained.It, can be by the acoustics score value of each participle, language mould specifically when calculating recognition confidence Type score value, posterior probability and duration are input in recognition confidence computation model, so that it is correct to export each participle identification Probability, and will the obtained probability of output as the recognition confidence of each participle.

Content based on the above embodiment, as a kind of alternative embodiment, the embodiment of the invention also provides a kind of acquisitions The method for identifying the degree of translation confidence of each participle in text.Referring to fig. 2, comprising:

201, the output feature and each participle based on decoding layer in translation encoding and decoding Recognition with Recurrent Neural Network are in identification text In coding characteristic, obtain the translation contribution degree of each participle, the coding characteristic of each participle is for indicating that each participle is translated Preceding context；

202, the translation contribution degree of each participle is normalized, obtains the corresponding normalization attention power of each participle Weight coefficient；

203, the translation accuracy of each corresponding normalization attention weight coefficient of participle and each target word is carried out Weighted sum obtains the degree of translation confidence of each participle.

Wherein, translation encoding and decoding Recognition with Recurrent Neural Network may include coding module (Encode), attention computing module (Attention) and decoder module (Decode).As shown in figure 3, Fig. 3 is based on Recognition with Recurrent Neural Network (Recurrent Neural Networks, RNN) and Attention mechanism translation encoding and decoding Recognition with Recurrent Neural Network structural schematic diagram.When It so, can also be using based on gating cycle unit (Gated in addition to the translation encoding and decoding Recognition with Recurrent Neural Network based on RNN Recurrent Unit, GRU) translation encoding and decoding Recognition with Recurrent Neural Network, or be based on shot and long term memory network (Long Short Term Memory, LSTM) translation encoding and decoding Recognition with Recurrent Neural Network, the present invention is not especially limit this.

In Fig. 3, text x=(x is identified₁,x₂,x₃,…,x_m) it is the input quantity for translating encoding and decoding Recognition with Recurrent Neural Network, y =(y₁,y₂,y₃,…,y_n) it is the output quantity for translating encoding and decoding Recognition with Recurrent Neural Network.Identify the length of text and target language text Degree is m and n, x respectively_iIndicate i-th of participle, y_jIndicate j-th of target word.

Coding module is used to obtain coding characteristic of each participle in identification text.Accordingly, as a kind of optional reality Example is applied, obtains each participle in the method for identifying the coding characteristic in text the embodiment of the invention also provides a kind of, including but It is not limited to: obtaining the corresponding forward coding feature of term vector of each participle and reversed by translating encoding and decoding Recognition with Recurrent Neural Network The corresponding forward coding feature of each participle is spliced with phase-reversal coding feature, obtains each participle and knowing by coding characteristic Coding characteristic in other text.

I-th is segmented, before executing the above process, can first obtain the term vector e of i-th of participle_i.Wherein, to Word2vec can be used in quantization method, and the present invention is not especially limit this.I-th of acquisition segment term vector it Afterwards, i-th of participle is obtained by forward coding Recognition with Recurrent Neural Network based on the term vector and sees forward direction under history lexical information Coding characteristic f_i, i-th of participle is obtained by phase-reversal coding Recognition with Recurrent Neural Network based on the term vector and sees the following lexical information Under phase-reversal coding feature b_i.I-th of corresponding forward coding feature of participle is spliced with phase-reversal coding feature, obtains the Coding characteristic h of the i participle in identification text_i。

Each participle is being obtained after the coding characteristic in identification text, attention computing module can be based on translation encoding and decoding The coding characteristic of the output feature of decoding layer and each participle in identification text, obtains each participle in Recognition with Recurrent Neural Network Translate contribution degree.Wherein, the translation contribution degree for obtaining each participle in identification text can refer to following formula and calculate:

In above-mentioned formula, α_jiIndicate the percentage contribution that i-th of participle is played when translation obtains j-th of target word, Namely translation contribution degree, a (h_i,s_j-1) it is to rely on the coding characteristic h of i-th of coding module participle_iWith a period of time on decoder module Carve the output feature s of decoding Recognition with Recurrent Neural Network_j-1Function.Wherein, which can be there are many implementation, and such as feedover mind Through network function, the present invention is not especially limit this.

For j-th of target word, decoder module is used for coding result and attention based on each participle in identification text The output of computing module is as a result, generate j-th of target word in target language text by decoding layer in decoding Recognition with Recurrent Neural Network Output feature s_j.Wherein, the translation accuracy of j-th of target word after translation is P (y_i)。

Since the translation accuracy of target word is somewhat dependent upon the identification accuracy segmented in identification text, from And it is based on reversed mechanism, the translation accuracy P (y of each target word can be passed through_i), calculate whether each participle in identification text is known Incorrect confidence level namely degree of translation confidence, specific calculating process can refer to following formula:

In above-mentioned formula, C_trans(x_i) it is i-th of degree of translation confidence segmented.β_jiIndicate to translation contribution degree ( Translation i-th of participle played percentage contribution when obtaining j-th of target word) be normalized after, i-th of participle is corresponding to return One changes attention weight coefficient.

Wherein, it is normalized in the translation contribution degree to each participle, obtains the corresponding normalization of each participle and pay attention to When power weight coefficient, following calculation formula can be used:

Content based on the above embodiment can get the recognition confidence of each participle and translation in identification text and set Reliability.As a kind of alternative embodiment, the embodiment of the invention also provides a kind of contexts for obtaining each participle in identification text The method of confidence level, including but not limited to: the contextual feature for identifying each participle in text is input to context confidence calculations Model, output obtain the context confidence level of each participle.

Wherein, context confidence calculations model is that the contextual feature based on training identification text carries out default computation model It is obtained after training.Specifically, a large amount of voice signals can be collected in advance and carry out speech recognition to obtain training identification text, determined Whether each participle identifies correct in each training identification text, and is labeled simultaneously to each participle, such as identifies correct Participle can be labeled as 1, identify that the participle of mistake can be labeled as 0.Using translation encoding and decoding Recognition with Recurrent Neural Network to training identification text This is translated, and the contextual feature of each participle in training identification text is extracted.By the language of each participle in training identification text Input quantity of the border feature as context confidence calculations model, calculating each participle in training identification text is to identify correct word Probability.If the probability being calculated is greater than preset threshold value, then it is assumed that the participle is to identify correct word.According to the participle Annotation results the parameter of context confidence calculations model is updated.It repeats the above process, until a preceding renewal process The variable quantity of model parameter is less than default change threshold between a rear renewal process.At this point, updating terminates and language can be obtained Border confidence calculations model, to be used for subsequent calculating context confidence level.

Before executing calculating context confidence level, the contextual feature of each participle can be first obtained.Based on the above embodiment Content, as a kind of alternative embodiment, the contextual feature of each participle may include decoding feature, and the decoding feature of each participle is used Context after indicating to translate for each participle.Correspondingly, the embodiment of the invention also provides in a kind of acquisition identification text The method of the decoding feature of each participle, including but not limited to: by the corresponding normalization attention weight coefficient of each participle and The output feature of decoding layer is weighted summation in translation encoding and decoding Recognition with Recurrent Neural Network, obtains the decoding feature of each participle. Specific calculating process can refer to following formula:

In above-mentioned formula, d_iThe decoding feature segmented for i-th.

Content based on the above embodiment, as a kind of alternative embodiment, the contextual feature of each participle is in addition to including solution Except code feature, it may also include at least one of following three kinds of data data, three kinds of data are respectively each participle below The coding characteristic and the corresponding type of theme of identification text of term vector, each participle in identification text.

Wherein it is possible to preset several type of subject, and it is based on preset type of subject, according to theme division side Method determines the type of theme of current identification text.Specifically, document subject matter can be used and generate model (Latent Dirichlet Allocation, LDA) etc. Unsupervised clusterings method obtain currently identify text type of theme, the embodiment of the present invention is to this It is not especially limited.

In addition, identification text can be also modified according to testing result due to after carrying out error detection to identification text, and It can be translated again after amendment, so that translation accuracy and interactive experience can be improved.

It should be noted that above-mentioned all alternative embodiments, can form optional implementation of the invention using any combination Example, this is no longer going to repeat them.

Content based on the above embodiment, the embodiment of the invention provides a kind of identification text Error Detection Unit, identification texts This Error Detection Unit is used to execute the identification text error-detecting method in above method embodiment.Referring to fig. 4, which includes:

Module 401 is obtained, recognition confidence, degree of translation confidence and the context for obtaining each participle in identification text are set Reliability, degree of translation confidence are that the translation accuracy based on target word each in target language text is obtained, context confidence level It is that the contextual feature based on each participle in identification text is obtained, target language text is after translating to identification text It obtains；

Error detection module 402, for that will identify the recognition confidence of each participle in text, degree of translation confidence and context confidence Degree is merged, and to obtain the synthesis confidence level score value of each participle in identification text, comprehensive confidence level score value is less than default The participle of threshold value is as the erroneous words in identification text.

As a kind of alternative embodiment, target language text is that text input will be identified to translation encoding and decoding circulation nerve net What network exported；Correspondingly, module 401 is obtained, comprising:

Acquiring unit, for based on translation encoding and decoding Recognition with Recurrent Neural Network in decoding layer output feature and it is each participle exist It identifies the coding characteristic in text, obtains the translation contribution degree of each participle, the coding characteristic of each participle is for indicating each Context before participle translation；

Normalization unit is normalized for the translation contribution degree to each participle, and each participle of acquisition is corresponding to return One changes attention weight coefficient；

Computing unit, for the translation of each corresponding normalization attention weight coefficient of participle and each target word is quasi- Exactness is weighted summation, obtains the degree of translation confidence of each participle.

As a kind of alternative embodiment, acquiring unit, for obtaining each point by translating encoding and decoding Recognition with Recurrent Neural Network The corresponding forward coding feature of the term vector of word and phase-reversal coding feature, by the corresponding forward coding feature of each participle and reversely Coding characteristic is spliced, and coding characteristic of each participle in identification text is obtained.

As a kind of alternative embodiment, the contextual feature of each participle includes decoding feature, the decoding feature of each participle For indicating for the context after each participle translation；Correspondingly, module 401 is obtained, for segmenting corresponding normalizing for each The output feature for changing decoding layer in attention weight coefficient and translation encoding and decoding Recognition with Recurrent Neural Network is weighted summation, obtains every The decoding feature of one participle.

As a kind of alternative embodiment, the contextual feature of each participle further includes at least one of following three kinds of data number According to three kinds of data are respectively the coding characteristic and identification of the term vector, each participle of each participle in identification text below The corresponding type of theme of text.

As a kind of alternative embodiment, module 401 is obtained, for that will identify that the contextual feature of each participle in text inputs To context confidence calculations model, output obtains the context confidence level of each participle, and context confidence calculations model is based on instruction What the contextual feature of white silk identification text obtained after being trained to default computation model.

As a kind of alternative embodiment, module 401 is obtained, for obtaining the posterior probability of each participle in identification text, And the recognition confidence as each participle；Alternatively, will identify the acoustics score value of each participle in text, language model score value, Posterior probability and duration are input to recognition confidence computation model, export the recognition confidence of each participle.

Device provided in an embodiment of the present invention is set by obtaining the recognition confidence of each participle, translation in identification text Reliability and context confidence level will identify that the recognition confidence of each participle, degree of translation confidence and context confidence level carry out in text Comprehensive confidence level score value is less than preset threshold to obtain the synthesis confidence level score value of each participle in identification text by fusion It segments as the erroneous words in identification text.Due to can reversely determine each participle as recognition result based on degree of translation confidence Credibility, and can be determined based on context confidence level in context of each participle before translation or after translation as recognition result Credibility, thus combining translation confidence level, context confidence level and recognition confidence to identification text in it is each segment into When row error detection, error detection foundation is more polynary, and error detection accuracy can be improved.

The embodiment of the invention provides a kind of identification text error-detecting facilities.Referring to Fig. 5, which includes: processor (processor) 501, memory (memory) 502 and bus 503；

Wherein, processor 501 and memory 502 complete mutual communication by bus 503 respectively；

Processor 501 is used to call the program instruction in memory 502, and text is identified provided by above-described embodiment to execute This error-detecting method, for example, obtain recognition confidence, degree of translation confidence and the context confidence of each participle in identification text Degree, degree of translation confidence is that the translation accuracy based on target word each in target language text is obtained, and context confidence level is Contextual feature based on each participle in identification text is obtained, and target language text is obtained after translating to identification text It arrives；The recognition confidence, degree of translation confidence and the context confidence level that identify each participle in text are merged, to be known Comprehensive confidence level score value is less than the participle of preset threshold as identification text by the synthesis confidence level score value of each participle in other text Erroneous words in this.

The embodiment of the present invention provides a kind of non-transient computer readable storage medium, the non-transient computer readable storage medium Matter stores computer instruction, which makes computer execute identification text error-detecting method provided by above-described embodiment, For example, obtain recognition confidence, degree of translation confidence and the context confidence level of each participle in identification text, degree of translation confidence It is that the translation accuracy based on target word each in target language text is obtained, context confidence level is based in identification text The contextual feature of each participle is obtained, and target language text is obtained after translating to identification text；It will identification text Recognition confidence, degree of translation confidence and context confidence level of each participle are merged in this, each in identification text to obtain Comprehensive confidence level score value is less than the participle of preset threshold as the mistake in identification text by the synthesis confidence level score value of participle Word.

Those of ordinary skill in the art will appreciate that: realize that all or part of the steps of above method embodiment can pass through The relevant hardware of program instruction is completed, and program above-mentioned can be stored in a computer readable storage medium, the program When being executed, step including the steps of the foregoing method embodiments is executed；And storage medium above-mentioned includes: ROM, RAM, magnetic disk or light The various media that can store program code such as disk.

The embodiments such as identification text error-detecting facility described above are only schematical, wherein saying as separation unit Bright unit may or may not be physically separated, and component shown as a unit can be or can not also It is physical unit, it can it is in one place, or may be distributed over multiple network units.It can be according to actual need Some or all of the modules therein is selected to achieve the purpose of the solution of this embodiment.Those of ordinary skill in the art are not paying Out in the case where creative labor, it can understand and implement.

Through the above description of the embodiments, those skilled in the art can be understood that each embodiment can It realizes by means of software and necessary general hardware platform, naturally it is also possible to pass through hardware.Based on this understanding, on Stating technical solution, substantially the part that contributes to existing technology can be embodied in the form of software products in other words, should Computer software product may be stored in a computer readable storage medium, such as ROM/RAM, magnetic disk, CD, including several fingers It enables and using so that a computer equipment (can be personal computer, server or the network equipment etc.) executes each implementation Certain Part Methods of example or embodiment.

Finally, the present processes are only preferable embodiment, it is not intended to limit the protection model of the embodiment of the present invention It encloses.With within principle, any modification, equivalent replacement, improvement and so on should be included in all spirit in the embodiment of the present invention Within the protection scope of the embodiment of the present invention.

Claims

1. a kind of identification text error-detecting method characterized by comprising

Recognition confidence, degree of translation confidence and the context confidence level of each participle in the identification text are obtained, the translation is set Reliability is that the translation accuracy based on target word each in target language text is obtained, and the context confidence level is based on institute The contextual feature for stating each participle in identification text is obtained, and the target language text is turned over to the identification text It is obtained after translating；

Recognition confidence, degree of translation confidence and the context confidence level of each participle in the identification text are merged, with Into the identification text, the synthesis confidence level score value of each participle, the participle that comprehensive confidence level score value is less than preset threshold is made For the erroneous words in the identification text.

2. the method according to claim 1, wherein the target language text is by the identification text input It is exported to translation encoding and decoding Recognition with Recurrent Neural Network；Correspondingly, the translation for obtaining each participle in identification text is set Reliability, comprising:

Output feature and each participle based on decoding layer in the translation encoding and decoding Recognition with Recurrent Neural Network are in the identification text In coding characteristic, obtain the translation contribution degree of each participle, the coding characteristic of each participle is for indicating that each participle is translated Preceding context；

The translation contribution degree of each participle is normalized, the corresponding normalization attention weight coefficient of each participle is obtained；

The translation accuracy of each corresponding normalization attention weight coefficient of participle and each target word is weighted summation, Obtain the degree of translation confidence of each participle.

3. according to the method described in claim 2, it is characterized in that, the volume for obtaining each participle in the identification text Code feature, comprising:

The corresponding forward coding feature of term vector of each participle and anti-is obtained by the translation encoding and decoding Recognition with Recurrent Neural Network To coding characteristic, the corresponding forward coding feature of each participle is spliced with phase-reversal coding feature, each participle is obtained and exists Coding characteristic in the identification text.

4. according to the method described in claim 2, it is characterized in that, the contextual feature of each participle include decoding feature, it is each The decoding feature of participle is used to indicate for the context after each participle translation；Correspondingly, described to obtain in the identification text The contextual feature of each participle, comprising:

By decoding layer in each corresponding normalization attention weight coefficient of participle and the translation encoding and decoding Recognition with Recurrent Neural Network Output feature be weighted summation, obtain the decoding feature of each participle.

5. according to the method described in claim 4, it is characterized in that, the contextual feature of each participle further includes following three kinds of data At least one of data, following three kinds of data be respectively that the term vector of each participle, each participle are literary in the identification Coding characteristic and the corresponding type of theme of the identification text in this.

6. the method according to claim 1, wherein the context confidence for obtaining each participle in identification text Degree, comprising:

The contextual feature for identifying each participle in text is input to context confidence calculations model, output obtains each participle Context confidence level, the context confidence calculations model be based on training identification text contextual feature to default computation model into It is obtained after row training.

7. the method according to claim 1, wherein the identification confidence for obtaining each participle in identification text Degree, comprising:

Obtain the posterior probability of each participle in the identification text, and the recognition confidence as each participle；Alternatively,

Acoustics score value, language model score value, posterior probability and the duration of each participle in the identification text are input to Recognition confidence computation model exports the recognition confidence of each participle.

8. a kind of identification text Error Detection Unit characterized by comprising

Module is obtained, for obtaining recognition confidence, degree of translation confidence and the context confidence of each participle in the identification text Degree, the degree of translation confidence is that the translation accuracy based on target word each in target language text is obtained, the context Confidence level is that the contextual feature based on each participle in the identification text is obtained, and the target language text is to described What identification text obtained after being translated；

Detection module, for by it is described identification text in each participle recognition confidence, degree of translation confidence and context confidence level It is merged, to obtain the synthesis confidence level score value of each participle in the identification text, comprehensive confidence level score value is less than pre- If the participle of threshold value is as the erroneous words in the identification text.

9. a kind of identification text error-detecting facility characterized by comprising

At least one processor；And

At least one processor being connect with the processor communication, in which:

The memory is stored with the program instruction that can be executed by the processor, and the processor calls described program to instruct energy Enough methods executed as described in claim 1 to 7 is any.

10. a kind of non-transient computer readable storage medium, which is characterized in that the non-transient computer readable storage medium is deposited Computer instruction is stored up, the computer instruction makes the computer execute the method as described in claim 1 to 7 is any.