CN107844481B - Text recognition error detection method and device - Google Patents
- Publication number
- CN107844481B · CN201711167410A
- Authority
- CN
- China
- Prior art keywords
- word segment
- confidence
- translation
- text
- recognized text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/42—Data-driven translation
- G06F40/44—Statistical methods, e.g. probability models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Probability & Statistics with Applications (AREA)
- Machine Translation (AREA)
Abstract
An embodiment of the invention provides a method and device for detecting errors in recognized text, belonging to the technical field of language processing. The method comprises: obtaining a recognition confidence, a translation confidence, and a context confidence for each segmented word in the recognized text; fusing the three confidences of each segmented word to obtain a composite confidence score for each segmented word; and treating any segmented word whose composite confidence score is below a preset threshold as an erroneous word in the recognized text. Because the translation confidence allows the credibility of each segmented word as a recognition result to be inferred in reverse from the translation, and the context confidence measures that credibility in the word's context before or after translation, combining the translation confidence and context confidence with the recognition confidence gives the error detection more diverse evidence and can improve its accuracy.
Description
Technical field
Embodiments of the present invention relate to the field of language processing, and in particular to a method and device for detecting errors in recognized text.
Background
Communication across languages is an important challenge that different communities face when they interact. Traditional interpretation is usually performed by people, as escort interpretation, consecutive interpretation, or simultaneous interpretation, but the shortage and cost of skilled interpreters mean that manual interpretation cannot meet the everyday communication needs of ordinary people. The development of speech translation technology usefully supplements these traditional modes, offers another route for everyday communication, and has advantages in cost and timeliness. Speech translation comprises three steps: speech recognition, machine translation, and speech synthesis. Errors introduced during speech recognition directly affect the accuracy of the subsequent translation, so detecting errors in the recognized text is key to improving translation accuracy.
The related art provides a method for detecting errors in recognized text that judges whether each word in the recognized text is correct mainly from the word's recognition confidence alone. Because error detection then rests on a single source of evidence, its accuracy is low.
Summary of the invention
To solve the above problems, embodiments of the present invention provide a method and device for detecting errors in recognized text that overcome, or at least partially solve, the problems described above.
According to a first aspect of the embodiments of the present invention, a method for detecting errors in recognized text is provided. The method comprises:
obtaining a recognition confidence, a translation confidence, and a context confidence for each segmented word in the recognized text, where the translation confidence is obtained from the translation accuracy of each target word in the target-language text, the context confidence is obtained from the contextual features of each segmented word in the recognized text, and the target-language text is obtained by translating the recognized text; and
fusing the recognition confidence, translation confidence, and context confidence of each segmented word to obtain a composite confidence score for each segmented word, and treating any segmented word whose composite confidence score is below a preset threshold as an erroneous word in the recognized text.
In the method provided by this embodiment of the invention, the recognition confidence, translation confidence, and context confidence of each segmented word in the recognized text are obtained and fused, for example by weighted summation, into a composite confidence score, and any segmented word whose composite score is below a preset threshold is treated as an erroneous word in the recognized text. Because the translation confidence allows the credibility of each segmented word as a recognition result to be inferred in reverse from the translation, and the context confidence measures that credibility in the word's context before or after translation, combining the translation confidence and context confidence with the recognition confidence gives the error detection more diverse evidence and can improve its accuracy.
With reference to the first possible implementation of the first aspect, in a second possible implementation, the target-language text is the output obtained by feeding the recognized text into an encoder-decoder translation recurrent neural network. Correspondingly, obtaining the translation confidence of each segmented word in the recognized text comprises:
obtaining the translation contribution of each segmented word from the output features of the decoding layer of the translation network and the coding feature of each segmented word in the recognized text, where the coding feature of a segmented word represents its context before translation;
normalizing the translation contributions to obtain a normalized attention weight coefficient for each segmented word; and
computing, for each segmented word, the weighted sum of its normalized attention weight coefficients and the translation accuracies of the target words, which gives its translation confidence.
With reference to the second possible implementation of the first aspect, in a third possible implementation, obtaining the coding feature of each segmented word in the recognized text comprises: obtaining, through the encoder-decoder translation recurrent neural network, a forward coding feature and a backward coding feature for the word vector of each segmented word, and splicing the two to obtain the coding feature of that segmented word in the recognized text.
With reference to the first possible implementation of the first aspect, in a fourth possible implementation, the contextual features of each segmented word include a decoding feature, which represents the word's context after translation. Correspondingly, obtaining the contextual features of each segmented word in the recognized text comprises: computing the weighted sum of each segmented word's normalized attention weight coefficients and the output features of the decoding layer of the translation network, which gives the word's decoding feature.
With reference to the fourth possible implementation of the first aspect, in a fifth possible implementation, the contextual features of each segmented word further include at least one of the following three kinds of data: the word vector of the segmented word, the coding feature of the segmented word in the recognized text, and the topic type of the recognized text.
With reference to the first possible implementation of the first aspect, in a sixth possible implementation, obtaining the context confidence of each segmented word in the recognized text comprises: feeding the contextual features of each segmented word into a context confidence computation model, whose output is the context confidence of that word; the context confidence computation model is obtained by training a preset computation model on the contextual features of training recognized texts.
With reference to the first possible implementation of the first aspect, in a seventh possible implementation, obtaining the recognition confidence of each segmented word in the recognized text comprises: taking the posterior probability of each segmented word as its recognition confidence; or feeding the acoustic score, language-model score, posterior probability, and duration of each segmented word into a recognition confidence computation model, whose output is the recognition confidence of that word.
According to a second aspect of the embodiments of the present invention, a device for detecting errors in recognized text is provided. The device comprises:
an acquisition module, configured to obtain the recognition confidence, translation confidence, and context confidence of each segmented word in the recognized text, where the translation confidence is obtained from the translation accuracy of each target word in the target-language text, the context confidence is obtained from the contextual features of each segmented word in the recognized text, and the target-language text is obtained by translating the recognized text; and
a detection module, configured to fuse the recognition confidence, translation confidence, and context confidence of each segmented word to obtain a composite confidence score for each segmented word, and to treat any segmented word whose composite confidence score is below a preset threshold as an erroneous word in the recognized text.
According to a third aspect of the embodiments of the present invention, equipment for detecting errors in recognized text is provided, comprising:
at least one processor; and
at least one memory communicatively connected to the processor, where the memory stores program instructions executable by the processor, and the processor, by calling those program instructions, can perform the error detection method for recognized text provided by any possible implementation of the first aspect.
According to a fourth aspect of the present invention, a non-transitory computer-readable storage medium is provided. The storage medium stores computer instructions that cause a computer to perform the error detection method for recognized text provided by any possible implementation of the first aspect.
It should be understood that the general description above and the detailed description below are exemplary and explanatory only and do not limit the embodiments of the present invention.
Brief description of the drawings
Fig. 1 is a flow diagram of a method for detecting errors in recognized text according to an embodiment of the present invention;
Fig. 2 is a flow diagram of a method for detecting errors in recognized text according to an embodiment of the present invention;
Fig. 3 is a structural diagram of an encoder-decoder translation recurrent neural network according to an embodiment of the present invention;
Fig. 4 is a block diagram of a device for detecting errors in recognized text according to an embodiment of the present invention;
Fig. 5 is a block diagram of equipment for detecting errors in recognized text according to an embodiment of the present invention.
Specific embodiments
The specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings and examples. The following embodiments illustrate the embodiments of the invention but do not limit their scope.
Speech translation is the process of automatically translating a speech signal in a source language into a speech signal in a target language. It generally comprises three main components: speech recognition, machine translation, and speech synthesis. Specifically, given a speech signal in the source language, a speech recognition system first produces the recognized text of the source language; a machine translation system then translates the recognized text into target-language text; and finally a speech synthesis system synthesizes the target-language text into a speech signal in the target language. Under current technical conditions none of these components has reached an ideal level, so in some cases the translation output of a speech translation system is wrong. Errors introduced during speech recognition in particular directly affect the accuracy of the subsequent translation, so detecting errors in the recognized text is key to improving translation accuracy.
In view of the above, an embodiment of the present invention provides a method for detecting errors in recognized text. The method can be used in a speech translation scenario: speech recognition first produces the recognized text, the recognized text is translated, and errors in the recognized text are then detected based on the translation result. Referring to Fig. 1, the method comprises:
101: obtaining a recognition confidence, a translation confidence, and a context confidence for each segmented word in the recognized text, where the translation confidence is obtained from the translation accuracy of each target word in the target-language text, the context confidence is obtained from the contextual features of each segmented word in the recognized text, and the target-language text is obtained by translating the recognized text; and
102: fusing the recognition confidence, translation confidence, and context confidence of each segmented word to obtain a composite confidence score for each segmented word, and treating any segmented word whose composite confidence score is below a preset threshold as an erroneous word in the recognized text.
Before step 101 is performed, an audio acquisition module can receive the speech signal of the source language, and speech recognition can then be performed on that signal to obtain the recognized text. When the recognized text is translated, it can be fed into an encoder-decoder translation recurrent neural network, whose output is the target-language text. The recognized text can be written as x = (x_1, x_2, x_3, …, x_m), where x_i denotes the i-th segmented word. The recognition confidence of a segmented word represents its credibility as a recognition result; its translation confidence represents that credibility as inferred in reverse from the translation result; and its context confidence represents its credibility as a recognition result in the current context. The recognition confidence of the i-th segmented word can be written C_rec(x_i), its translation confidence C_trans(x_i), and its context confidence C_context(x_i). The composite confidence score of the i-th segmented word can be obtained by weighted summation, nonlinear fusion, or other means, which the present invention does not specifically limit. When weighted summation is used, the score can be computed as:
C(x_i) = w_rec · C_rec(x_i) + w_trans · C_trans(x_i) + w_context · C_context(x_i)
In the formula above, C(x_i) is the composite confidence score of the i-th segmented word, and w_rec, w_trans, and w_context are the weights of the recognition confidence, translation confidence, and context confidence respectively. Each weight can be determined from practical application or experimental results, which the present invention does not specifically limit.
After the composite confidence score C(x_i) of the i-th segmented word is computed, it can be compared with a preset threshold T. If C(x_i) is less than T, the i-th segmented word can be determined to be an erroneous word; if C(x_i) is not less than T, it can be determined to be a correct word.
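The fusion and threshold comparison described above can be sketched as follows. The weights, the threshold T, and the example scores are illustrative assumptions, not values prescribed by the patent.

```python
def composite_confidence(c_rec, c_trans, c_context,
                         w_rec=0.4, w_trans=0.3, w_context=0.3):
    # Weighted sum of the three per-word confidences (assumed weight values).
    return w_rec * c_rec + w_trans * c_trans + w_context * c_context

def detect_errors(words, confidences, threshold=0.5):
    # A word whose composite score C(x_i) falls below T is flagged as erroneous.
    return [w for w, c in zip(words, confidences)
            if composite_confidence(*c) < threshold]

# Example: only the second word scores low under all three signals.
words = ["hello", "wrold", "there"]
scores = [(0.9, 0.8, 0.85), (0.3, 0.2, 0.4), (0.95, 0.9, 0.9)]
print(detect_errors(words, scores))  # -> ['wrold']
```

The weights sum to 1 here so the composite score stays in [0, 1], which keeps a single threshold T interpretable across utterances.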
In the method provided by this embodiment, the recognition confidence, translation confidence, and context confidence of each segmented word in the recognized text are obtained and fused into a composite confidence score, and any segmented word whose composite score is below a preset threshold is treated as an erroneous word in the recognized text. Because the translation confidence allows the credibility of each segmented word as a recognition result to be inferred in reverse, and the context confidence measures that credibility in the word's context before or after translation, combining the translation confidence and context confidence with the recognition confidence gives the error detection more diverse evidence and can improve its accuracy.
Based on the above, as an optional embodiment, an embodiment of the invention further provides a method for obtaining the recognition confidence of each segmented word in the recognized text, including but not limited to: taking the posterior probability of each segmented word as its recognition confidence; or feeding the acoustic score, language-model score, posterior probability, and duration of each segmented word into a recognition confidence computation model and taking the model's output as the recognition confidence. The first way of obtaining the recognition confidence can be expressed as:
C_rec(x_i) = P_s(x_i)
In the formula above, C_rec(x_i) denotes the recognition confidence of the i-th segmented word and P_s(x_i) denotes its posterior probability.
In the second way of obtaining the recognition confidence, a large number of training speech signals can first be collected and recognized to obtain corresponding training recognized texts. For each training recognized text it is determined whether each segmented word was recognized correctly, and each segmented word is labeled accordingly; for example, a correctly recognized word can be labeled 1 and an incorrectly recognized word 0. The acoustic score, language-model score, posterior probability, and duration of each segmented word in the training recognized texts are fed into the recognition confidence computation model, and the model's parameters are updated according to the labels until the change in the parameters between two successive updates is smaller than a preset change threshold, at which point the updating ends and the trained recognition confidence computation model is obtained. When the recognition confidence is then computed, the acoustic score, language-model score, posterior probability, and duration of each segmented word are fed into the model, which outputs the probability that the word was recognized correctly; that probability is taken as the word's recognition confidence.
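The two ways of obtaining a recognition confidence might be sketched as follows. The patent does not specify the form of the recognition confidence computation model; a logistic regression over the four features stands in for it here, and the parameter values are made up for illustration.

```python
import math

def recognition_confidence(posterior):
    # Way 1: use the word's posterior probability directly.
    return posterior

def recognition_confidence_model(acoustic, lm, posterior, duration, weights, bias):
    # Way 2 (sketch): a trained model maps the four per-word features to the
    # probability that the word was recognized correctly. Logistic regression
    # is an assumed stand-in for the unspecified model.
    z = (weights[0] * acoustic + weights[1] * lm
         + weights[2] * posterior + weights[3] * duration + bias)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical trained parameters:
w, b = [0.5, 0.5, 2.0, 0.1], -1.0
print(round(recognition_confidence_model(0.8, 0.6, 0.9, 0.3, w, b), 3))
```

In practice the four features would be taken from the recognizer's lattice or n-best output for each segmented word.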
Based on the above, as an optional embodiment, an embodiment of the invention further provides a method for obtaining the translation confidence of each segmented word in the recognized text. Referring to Fig. 2, the method comprises:
201: obtaining the translation contribution of each segmented word from the output features of the decoding layer of the encoder-decoder translation recurrent neural network and the coding feature of each segmented word in the recognized text, where the coding feature of a segmented word represents its context before translation;
202: normalizing the translation contributions to obtain a normalized attention weight coefficient for each segmented word; and
203: computing, for each segmented word, the weighted sum of its normalized attention weight coefficients and the translation accuracies of the target words, which gives its translation confidence.
The encoder-decoder translation recurrent neural network can include an encoding module (Encode), an attention computing module (Attention), and a decoding module (Decode). Fig. 3 shows the structure of an encoder-decoder translation network based on recurrent neural networks (Recurrent Neural Networks, RNN) and the attention mechanism. Of course, besides the RNN-based network, a translation network based on gated recurrent units (Gated Recurrent Unit, GRU) or on long short-term memory networks (Long Short-Term Memory, LSTM) can also be used, which the present invention does not specifically limit.
In Fig. 3, the recognized text x = (x_1, x_2, x_3, …, x_m) is the input of the translation network and y = (y_1, y_2, y_3, …, y_n) is its output. The lengths of the recognized text and the target-language text are m and n respectively; x_i denotes the i-th segmented word and y_j denotes the j-th target word.
The encoding module is used to obtain the coding feature of each segmented word in the recognized text. Accordingly, as an optional embodiment, an embodiment of the invention further provides a method for obtaining these coding features, including but not limited to: obtaining, through the encoder-decoder translation recurrent neural network, a forward coding feature and a backward coding feature for the word vector of each segmented word, and splicing the two to obtain the coding feature of that word in the recognized text.
For the i-th segmented word, its word vector e_i is obtained first; word2vec, among other methods, can be used for this, which the present invention does not specifically limit. From the word vector, a forward-encoding recurrent neural network produces the forward coding feature f_i, which sees the historical lexical information of the i-th segmented word, and a backward-encoding recurrent neural network produces the backward coding feature b_i, which sees its future lexical information. Splicing f_i and b_i gives the coding feature h_i of the i-th segmented word in the recognized text.
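The forward/backward encoding and splicing can be illustrated with a toy sketch. The tanh update below is only a stand-in for a trained GRU/LSTM cell; the point is the data flow: a forward pass giving f_i, a backward pass giving b_i, and concatenation into h_i.

```python
import math

def rnn_step(state, x, w_s=0.5, w_x=0.5):
    # Minimal stand-in for one recurrent cell update (a real encoder would use
    # trained GRU/LSTM weights; this elementwise tanh only shows the recursion).
    return [math.tanh(w_s * s + w_x * xi) for s, xi in zip(state, x)]

def bidirectional_encode(embeddings):
    dim = len(embeddings[0])
    fwd, bwd = [], []
    state = [0.0] * dim
    for e in embeddings:            # forward pass: f_i sees history
        state = rnn_step(state, e)
        fwd.append(state)
    state = [0.0] * dim
    for e in reversed(embeddings):  # backward pass: b_i sees the future
        state = rnn_step(state, e)
        bwd.append(state)
    bwd.reverse()
    # Splice: coding feature h_i is the concatenation [f_i ; b_i].
    return [f + b for f, b in zip(fwd, bwd)]

# Three segmented words with toy 2-dimensional word vectors:
h = bidirectional_encode([[0.1, 0.2], [0.3, 0.1], [0.2, 0.4]])
print(len(h), len(h[0]))  # 3 coding features, each of dimension 4
```

Note that h_i has twice the dimension of f_i, since the two directions are concatenated rather than summed.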
After the coding feature of each segmented word in the recognized text is obtained, the attention computing module can obtain the translation contribution of each segmented word from the output features of the decoding layer of the translation network and the coding features. The translation contribution of each segmented word in the recognized text can be computed as:
α_ji = a(h_i, s_{j-1})
In the formula above, α_ji denotes the contribution of the i-th segmented word to the translation of the j-th target word, i.e. its translation contribution, and a(h_i, s_{j-1}) is a function of the coding feature h_i of the i-th segmented word from the encoding module and the output feature s_{j-1} of the decoding recurrent neural network at the previous step. This function can be implemented in many ways, for example as a feedforward neural network, which the present invention does not specifically limit.
For the j-th target word, the decoding module generates the output feature s_j of the j-th target word in the target-language text through the decoding layer of the decoding recurrent neural network, based on the coding results of the segmented words in the recognized text and the output of the attention computing module. The translation accuracy of the j-th target word after translation is P(y_j).
Since the translation accuracy of a target word depends to some extent on whether the segmented words in the recognized text were recognized accurately, a reverse mechanism can use the translation accuracies P(y_j) of the target words to compute the confidence that each segmented word in the recognized text was recognized correctly, that is, its translation confidence:
C_trans(x_i) = Σ_{j=1}^{n} β_ji · P(y_j)
In the formula above, C_trans(x_i) is the translation confidence of the i-th segmented word, and β_ji is the normalized attention weight coefficient of the i-th segmented word, obtained by normalizing the translation contributions (the contribution α_ji of the i-th segmented word to the translation of the j-th target word).
The normalization of the translation contributions that yields the normalized attention weight coefficients can use the following formula:
β_ji = exp(α_ji) / Σ_{k=1}^{m} exp(α_jk)
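The normalization and the reverse translation-confidence computation can be sketched as follows, with made-up contribution scores and translation accuracies standing in for the network's actual outputs:

```python
import math

def normalize_attention(alpha_j):
    # Softmax over source positions: beta_ji = exp(alpha_ji) / sum_k exp(alpha_jk)
    exps = [math.exp(a) for a in alpha_j]
    total = sum(exps)
    return [e / total for e in exps]

def translation_confidence(alpha, target_accuracy):
    # alpha[j][i]: contribution of source word i to target word j.
    # C_trans(x_i) = sum_j beta_ji * P(y_j)
    beta = [normalize_attention(row) for row in alpha]
    n, m = len(alpha), len(alpha[0])
    return [sum(beta[j][i] * target_accuracy[j] for j in range(n))
            for i in range(m)]

# Two target words, three source words; toy scores and accuracies:
alpha = [[2.0, 0.1, 0.1], [0.1, 0.1, 2.0]]
p_y = [0.9, 0.4]
print([round(c, 3) for c in translation_confidence(alpha, p_y)])
```

In this example the first source word attends mostly to the well-translated target word (P = 0.9) and therefore receives a higher translation confidence than the third, which attends to the poorly translated one (P = 0.4).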
Based on the above, the recognition confidence and translation confidence of each segmented word in the recognized text can be obtained. As an optional embodiment, an embodiment of the invention further provides a method for obtaining the context confidence of each segmented word in the recognized text, including but not limited to: feeding the contextual features of each segmented word in the recognized text into a context confidence computation model, whose output is the context confidence of each word.
The context confidence computation model is obtained by training a preset computation model on the contextual features of training recognized texts. Specifically, a large number of speech signals can be collected in advance and recognized to obtain training recognized texts. For each training recognized text it is determined whether each segmented word was recognized correctly, and each word is labeled accordingly; for example, a correctly recognized word can be labeled 1 and an incorrectly recognized word 0. The training recognized texts are translated with the encoder-decoder translation recurrent neural network, and the contextual features of each segmented word in the training recognized texts are extracted. With those contextual features as input, the model computes the probability that each segmented word in the training recognized text is a correctly recognized word; if the computed probability exceeds a preset threshold, the word is considered correctly recognized. The model's parameters are then updated according to the word's label. This process is repeated until the change in the model parameters between two successive updates is smaller than a preset change threshold, at which point the updating ends and the context confidence computation model is obtained for subsequent computation of context confidences.
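The training loop described above, updating parameters until the change between two successive updates falls below a preset threshold, can be sketched with a simple logistic model standing in for the unspecified context confidence computation model; the learning rate, tolerance, and toy data are all assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_context_model(features, labels, lr=0.5, tol=1e-4, max_iter=5000):
    # Gradient updates on a logistic model (an assumed stand-in), stopping when
    # the parameter change between two successive updates drops below `tol`.
    dim = len(features[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(max_iter):
        gw, gb = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for k in range(dim):
                gw[k] += err * x[k]
            gb += err
        step = [lr * g / len(features) for g in gw]
        w = [wi - s for wi, s in zip(w, step)]
        b -= lr * gb / len(features)
        if max(abs(s) for s in step) < tol:  # change below preset threshold
            break
    return w, b

# Toy data: one contextual feature separating correct (1) from wrong (0) words.
feats, labs = [[2.0], [1.5], [-1.0], [-2.0]], [1, 1, 0, 0]
w, b = train_context_model(feats, labs)
print(round(sigmoid(w[0] * 2.0 + b), 2), round(sigmoid(w[0] * -2.0 + b), 2))
```

At inference time the trained model's output probability is used directly as C_context(x_i).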
Before the context confidence is computed, the contextual feature of each word segment can first be obtained. Based on the content of the above embodiments, as an optional embodiment, the contextual feature of each word segment may include a decoding feature, which represents the post-translation context of that word segment. Correspondingly, an embodiment of the present invention also provides a method for obtaining the decoding feature of each word segment in the recognized text, including but not limited to: computing a weighted sum of each word segment's normalized attention weight coefficients and the output features of the decoding layer of the translation encoder-decoder recurrent neural network, which yields the decoding feature of each word segment.
The specific computation can refer to the following formula:

d_i = Σ_j α_{i,j} · s_j

In the above formula, d_i is the decoding feature of the i-th word segment, α_{i,j} is the i-th word segment's normalized attention weight coefficient for the j-th target position, and s_j is the output feature of the decoding layer at the j-th target position.
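Under this formula, the decoding feature is simply an attention-weighted sum of the decoder-layer outputs; a minimal sketch (function and variable names are illustrative):

```python
def decoding_feature(attn_weights_i, decoder_outputs):
    """Weighted sum of decoder-layer output features.

    attn_weights_i:  normalized attention weights alpha_{i,j} linking
                     source segment i to each target position j
    decoder_outputs: output feature vector s_j of the decoding layer at
                     each target position j
    Returns d_i, the decoding feature of source word segment i.
    """
    dim = len(decoder_outputs[0])
    d_i = [0.0] * dim
    for a, s in zip(attn_weights_i, decoder_outputs):
        for k in range(dim):
            d_i[k] += a * s[k]   # accumulate alpha_{i,j} * s_j
    return d_i
```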
Based on the content of the above embodiments, as an optional embodiment, besides the decoding feature, the contextual feature of each word segment may also include at least one of the following three kinds of data: the word vector of the word segment, the encoding feature of the word segment in the recognized text, and the topic type of the recognized text.
Here, several topic types may be preset, and based on the preset topic types, the topic type of the current recognized text is determined according to a topic-partitioning method. Specifically, an unsupervised clustering method such as the document topic generation model Latent Dirichlet Allocation (LDA) can be used to obtain the topic type of the current recognized text, which is not specifically limited in the embodiments of the present invention.
In the method provided by the embodiments of the present invention, the recognition confidence, translation confidence, and context confidence of each word segment in the recognized text are obtained; these three confidences are fused to obtain a comprehensive confidence score for each word segment, and any word segment whose comprehensive confidence score is less than a preset threshold is taken as an error word in the recognized text. Since the translation confidence allows the credibility of each word segment as a recognition result to be determined in reverse, and the context confidence allows that credibility to be determined from the context of each word segment before or after translation, error detection that combines the translation confidence, context confidence, and recognition confidence of each word segment rests on more diverse evidence, and detection accuracy can be improved.
In addition, after error detection is performed on the recognized text, the recognized text can be corrected according to the detection result and then translated again, so that translation accuracy and the interactive experience can be improved.
It should be noted that all of the above optional embodiments may be combined arbitrarily to form further optional embodiments of the present invention, which are not described here again.
Based on the content of the above embodiments, an embodiment of the present invention provides a recognized-text error detection device, which is used to execute the recognized-text error detection method in the above method embodiments. Referring to Fig. 4, the device includes:

an obtaining module 401, configured to obtain the recognition confidence, translation confidence, and context confidence of each word segment in the recognized text, where the translation confidence is obtained based on the translation accuracy of each target word in the target-language text, the context confidence is obtained based on the contextual feature of each word segment in the recognized text, and the target-language text is obtained by translating the recognized text; and
an error detection module 402, configured to fuse the recognition confidence, translation confidence, and context confidence of each word segment in the recognized text to obtain a comprehensive confidence score for each word segment, and to take any word segment whose comprehensive confidence score is less than a preset threshold as an error word in the recognized text.
As an optional embodiment, the target-language text is output by inputting the recognized text into a translation encoder-decoder recurrent neural network. Correspondingly, the obtaining module 401 includes:

an acquiring unit, configured to obtain the translation contribution of each word segment based on the output features of the decoding layer of the translation encoder-decoder recurrent neural network and the encoding feature of each word segment in the recognized text, where the encoding feature of a word segment represents its pre-translation context;

a normalization unit, configured to normalize the translation contribution of each word segment to obtain the normalized attention weight coefficients corresponding to each word segment; and

a computing unit, configured to compute a weighted sum of each word segment's normalized attention weight coefficients and the translation accuracy of each target word, obtaining the translation confidence of each word segment.
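The computing unit's weighted sum can be sketched as follows: the translation confidence of a source word segment is the attention-weighted sum of the target words' translation accuracies (names are illustrative):

```python
def translation_confidence(attn_weights_i, target_word_accuracy):
    """Translation confidence of source word segment i.

    attn_weights_i:       normalized attention weights alpha_{i,j} of
                          segment i over the target positions j
    target_word_accuracy: translation accuracy of each target word j
    Returns sum_j alpha_{i,j} * accuracy_j.
    """
    return sum(a * acc for a, acc in zip(attn_weights_i,
                                         target_word_accuracy))
```

A segment whose attention mass lands on poorly translated target words thus receives a low translation confidence, which is what lets the recognition error be detected "in reverse" from the translation.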
As an optional embodiment, the acquiring unit is configured to obtain, through the translation encoder-decoder recurrent neural network, the forward encoding feature and backward encoding feature corresponding to the word vector of each word segment, and to splice the forward and backward encoding features of each word segment to obtain its encoding feature in the recognized text.
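The splicing of forward and backward encoding features can be sketched as follows, with a generic recurrent step function standing in for the encoder cell (an illustrative assumption; a real encoder would use a trained GRU/LSTM cell):

```python
def bidirectional_encoding(embeddings, step):
    """Concatenate forward and backward encoding features per segment.

    embeddings: word vector of each word segment of the recognized text
    step:       one recurrent step h' = step(h, x), standing in for the
                recurrent cell of the translation encoder (assumption)
    Returns one spliced feature vector per word segment.
    """
    def run(seq):
        h = [0.0] * len(seq[0])
        states = []
        for x in seq:
            h = step(h, x)
            states.append(h)
        return states

    fwd = run(embeddings)                                   # left-to-right
    bwd = list(reversed(run(list(reversed(embeddings)))))   # right-to-left
    # splice forward and backward features of each segment
    return [f + b for f, b in zip(fwd, bwd)]
```

With this layout, the forward half of a segment's feature summarizes its left context and the backward half its right context, matching the "pre-translation context" role described above.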
As an optional embodiment, the contextual feature of each word segment includes a decoding feature, which represents the post-translation context of that word segment. Correspondingly, the obtaining module 401 is configured to compute a weighted sum of each word segment's normalized attention weight coefficients and the output features of the decoding layer of the translation encoder-decoder recurrent neural network, obtaining the decoding feature of each word segment.
As an optional embodiment, the contextual feature of each word segment further includes at least one of the following three kinds of data: the word vector of the word segment, the encoding feature of the word segment in the recognized text, and the topic type of the recognized text.
As an optional embodiment, the obtaining module 401 is configured to input the contextual feature of each word segment in the recognized text into a context confidence computation model, which outputs the context confidence of each word segment; the context confidence computation model is obtained by training a default computation model on the contextual features of training recognized texts.
As an optional embodiment, the obtaining module 401 is configured to obtain the posterior probability of each word segment in the recognized text and use it as the recognition confidence of that word segment; or, alternatively, to input the acoustic score, language-model score, posterior probability, and duration of each word segment in the recognized text into a recognition confidence computation model, which outputs the recognition confidence of each word segment.
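The second branch — feeding the four per-segment inputs into a recognition confidence computation model — can be sketched with a linear model squashed to (0, 1); the weights and bias are illustrative assumptions standing in for a trained model:

```python
import math

def recognition_confidence(acoustic, lm_score, posterior, duration,
                           weights=(0.25, 0.25, 0.4, 0.1), bias=0.0):
    """Sketch of a recognition-confidence computation model.

    The acoustic score, language-model score, posterior probability,
    and duration of one word segment are combined by a linear model
    and mapped into (0, 1); the weights and bias are illustrative.
    """
    z = (weights[0] * acoustic + weights[1] * lm_score +
         weights[2] * posterior + weights[3] * duration + bias)
    return 1.0 / (1.0 + math.exp(-z))
```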
In the device provided by the embodiments of the present invention, the recognition confidence, translation confidence, and context confidence of each word segment in the recognized text are obtained; these three confidences are fused to obtain a comprehensive confidence score for each word segment, and any word segment whose comprehensive confidence score is less than a preset threshold is taken as an error word in the recognized text. Since the translation confidence allows the credibility of each word segment as a recognition result to be determined in reverse, and the context confidence allows that credibility to be determined from the context of each word segment before or after translation, error detection that combines the translation confidence, context confidence, and recognition confidence of each word segment rests on more diverse evidence, and detection accuracy can be improved.
In addition, after error detection is performed on the recognized text, the recognized text can be corrected according to the detection result and then translated again, so that translation accuracy and the interactive experience can be improved.
An embodiment of the present invention provides a recognized-text error detection apparatus. Referring to Fig. 5, the apparatus includes a processor 501, a memory 502, and a bus 503, where the processor 501 and the memory 502 communicate with each other through the bus 503.
The processor 501 is configured to call the program instructions in the memory 502 to execute the recognized-text error detection method provided by the above embodiments, for example: obtaining the recognition confidence, translation confidence, and context confidence of each word segment in the recognized text, where the translation confidence is obtained based on the translation accuracy of each target word in the target-language text, the context confidence is obtained based on the contextual feature of each word segment in the recognized text, and the target-language text is obtained by translating the recognized text; and fusing the recognition confidence, translation confidence, and context confidence of each word segment in the recognized text to obtain a comprehensive confidence score for each word segment, taking any word segment whose comprehensive confidence score is less than a preset threshold as an error word in the recognized text.
An embodiment of the present invention provides a non-transitory computer-readable storage medium storing computer instructions that cause a computer to execute the recognized-text error detection method provided by the above embodiments, for example: obtaining the recognition confidence, translation confidence, and context confidence of each word segment in the recognized text, where the translation confidence is obtained based on the translation accuracy of each target word in the target-language text, the context confidence is obtained based on the contextual feature of each word segment in the recognized text, and the target-language text is obtained by translating the recognized text; and fusing the recognition confidence, translation confidence, and context confidence of each word segment in the recognized text to obtain a comprehensive confidence score for each word segment, taking any word segment whose comprehensive confidence score is less than a preset threshold as an error word in the recognized text.
Those of ordinary skill in the art will appreciate that all or part of the steps of the above method embodiments may be implemented by hardware driven by program instructions. The aforementioned program may be stored in a computer-readable storage medium; when executed, it performs the steps of the above method embodiments. The aforementioned storage medium includes ROM, RAM, magnetic disks, optical discs, and other media capable of storing program code.
The embodiments such as the recognized-text error detection apparatus described above are merely illustrative. Units described as separate components may or may not be physically separated, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of a given embodiment, which those of ordinary skill in the art can understand and implement without creative effort.
Through the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by software together with a necessary general-purpose hardware platform, or of course by hardware. Based on this understanding, the essence of the above technical solutions, or the part that contributes over the existing technology, can be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium such as ROM/RAM, a magnetic disk, or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute some or all of the methods of each embodiment.
Finally, the above methods are only preferred embodiments and are not intended to limit the protection scope of the embodiments of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the embodiments of the present invention shall be included within their protection scope.
Claims (10)
1. A recognized-text error detection method, comprising:

obtaining the recognition confidence, translation confidence, and context confidence of each word segment in the recognized text, wherein the translation confidence is obtained based on the translation accuracy of each target word in a target-language text, the context confidence is obtained based on the contextual feature of each word segment in the recognized text, and the target-language text is obtained by translating the recognized text; and

fusing the recognition confidence, translation confidence, and context confidence of each word segment in the recognized text to obtain a comprehensive confidence score for each word segment, and taking any word segment whose comprehensive confidence score is less than a preset threshold as an error word in the recognized text.
2. The method according to claim 1, wherein the target-language text is output by inputting the recognized text into a translation encoder-decoder recurrent neural network; and correspondingly, obtaining the translation confidence of each word segment in the recognized text comprises:

obtaining the translation contribution of each word segment based on the output features of the decoding layer of the translation encoder-decoder recurrent neural network and the encoding feature of each word segment in the recognized text, wherein the encoding feature of each word segment represents its pre-translation context;

normalizing the translation contribution of each word segment to obtain the normalized attention weight coefficients corresponding to each word segment; and

computing a weighted sum of each word segment's normalized attention weight coefficients and the translation accuracy of each target word to obtain the translation confidence of each word segment.
3. The method according to claim 2, wherein obtaining the encoding feature of each word segment in the recognized text comprises:

obtaining, through the translation encoder-decoder recurrent neural network, the forward encoding feature and backward encoding feature corresponding to the word vector of each word segment, and splicing the forward and backward encoding features of each word segment to obtain the encoding feature of each word segment in the recognized text.
4. The method according to claim 2, wherein the contextual feature of each word segment comprises a decoding feature representing the post-translation context of that word segment; and correspondingly, obtaining the contextual feature of each word segment in the recognized text comprises:

computing a weighted sum of each word segment's normalized attention weight coefficients and the output features of the decoding layer of the translation encoder-decoder recurrent neural network to obtain the decoding feature of each word segment.
5. The method according to claim 4, wherein the contextual feature of each word segment further comprises at least one of the following three kinds of data: the word vector of the word segment, the encoding feature of the word segment in the recognized text, and the topic type of the recognized text.
6. The method according to claim 1, wherein obtaining the context confidence of each word segment in the recognized text comprises:

inputting the contextual feature of each word segment in the recognized text into a context confidence computation model, which outputs the context confidence of each word segment, wherein the context confidence computation model is obtained by training a default computation model on the contextual features of training recognized texts.
7. The method according to claim 1, wherein obtaining the recognition confidence of each word segment in the recognized text comprises:

obtaining the posterior probability of each word segment in the recognized text and using it as the recognition confidence of that word segment; or

inputting the acoustic score, language-model score, posterior probability, and duration of each word segment in the recognized text into a recognition confidence computation model, which outputs the recognition confidence of each word segment.
8. A recognized-text error detection device, comprising:

an obtaining module, configured to obtain the recognition confidence, translation confidence, and context confidence of each word segment in the recognized text, wherein the translation confidence is obtained based on the translation accuracy of each target word in a target-language text, the context confidence is obtained based on the contextual feature of each word segment in the recognized text, and the target-language text is obtained by translating the recognized text; and

a detection module, configured to fuse the recognition confidence, translation confidence, and context confidence of each word segment in the recognized text to obtain a comprehensive confidence score for each word segment, and to take any word segment whose comprehensive confidence score is less than a preset threshold as an error word in the recognized text.
9. A recognized-text error detection apparatus, comprising:

at least one processor; and

at least one memory communicatively connected to the processor, wherein:

the memory stores program instructions executable by the processor, and the processor calls the program instructions to execute the method according to any one of claims 1 to 7.
10. A non-transitory computer-readable storage medium storing computer instructions that cause a computer to execute the method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711167410.5A CN107844481B (en) | 2017-11-21 | 2017-11-21 | Text recognition error detection method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711167410.5A CN107844481B (en) | 2017-11-21 | 2017-11-21 | Text recognition error detection method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107844481A CN107844481A (en) | 2018-03-27 |
CN107844481B true CN107844481B (en) | 2019-09-13 |
Family
ID=61679894
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711167410.5A Active CN107844481B (en) | 2017-11-21 | 2017-11-21 | Text recognition error detection method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107844481B (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108647207B (en) * | 2018-05-08 | 2022-04-05 | 上海携程国际旅行社有限公司 | Natural language correction method, system, device and storage medium |
CN108428447B (en) * | 2018-06-19 | 2021-02-02 | 科大讯飞股份有限公司 | Voice intention recognition method and device |
CN109086266B (en) * | 2018-07-02 | 2021-09-14 | 昆明理工大学 | Error detection and correction method for text-shaped near characters |
CN108985289A (en) * | 2018-07-18 | 2018-12-11 | 百度在线网络技术(北京)有限公司 | Messy code detection method and device |
CN110047488B (en) * | 2019-03-01 | 2022-04-12 | 北京彩云环太平洋科技有限公司 | Voice translation method, device, equipment and control equipment |
CN111368531B (en) * | 2020-03-09 | 2023-04-14 | 腾讯科技(深圳)有限公司 | Translation text processing method and device, computer equipment and storage medium |
CN111626118A (en) * | 2020-04-23 | 2020-09-04 | 平安科技(深圳)有限公司 | Text error correction method and device, electronic equipment and computer readable storage medium |
CN111627446A (en) * | 2020-05-29 | 2020-09-04 | 国网浙江省电力有限公司信息通信分公司 | Communication conference system based on intelligent voice recognition technology |
CN113409792B (en) * | 2021-06-22 | 2024-02-13 | 中国科学技术大学 | Voice recognition method and related equipment thereof |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105183848A (en) * | 2015-09-07 | 2015-12-23 | 百度在线网络技术(北京)有限公司 | Human-computer chatting method and device based on artificial intelligence |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101655837B (en) * | 2009-09-08 | 2010-10-13 | 北京邮电大学 | Method for detecting and correcting error on text after voice recognition |
CN103154936B (en) * | 2010-09-24 | 2016-01-06 | 新加坡国立大学 | For the method and system of robotization text correction |
US10586556B2 (en) * | 2013-06-28 | 2020-03-10 | International Business Machines Corporation | Real-time speech analysis and method using speech recognition and comparison with standard pronunciation |
CN105244029B (en) * | 2015-08-28 | 2019-02-26 | 安徽科大讯飞医疗信息技术有限公司 | Voice recognition post-processing method and system |
- 2017
- 2017-11-21 CN CN201711167410.5A patent/CN107844481B/en active Active
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105183848A (en) * | 2015-09-07 | 2015-12-23 | 百度在线网络技术(北京)有限公司 | Human-computer chatting method and device based on artificial intelligence |
Also Published As
Publication number | Publication date |
---|---|
CN107844481A (en) | 2018-03-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107844481B (en) | Text recognition error detection method and device | |
CN107977356B (en) | Method and device for correcting recognized text | |
CN108710704B (en) | Method and device for determining conversation state, electronic equipment and storage medium | |
CN111985239A (en) | Entity identification method and device, electronic equipment and storage medium | |
CN112528637B (en) | Text processing model training method, device, computer equipment and storage medium | |
CN113609965B (en) | Training method and device of character recognition model, storage medium and electronic equipment | |
CN110399472B (en) | Interview question prompting method and device, computer equipment and storage medium | |
CN114021582B (en) | Spoken language understanding method, device, equipment and storage medium combined with voice information | |
CN110427629A (en) | Semi-supervised text simplified model training method and system | |
CN112257437A (en) | Voice recognition error correction method and device, electronic equipment and storage medium | |
CN116662552A (en) | Financial text data classification method, device, terminal equipment and medium | |
CN111563161B (en) | Statement identification method, statement identification device and intelligent equipment | |
CN116341651A (en) | Entity recognition model training method and device, electronic equipment and storage medium | |
CN115457982A (en) | Pre-training optimization method, device, equipment and medium of emotion prediction model | |
CN116227603A (en) | Event reasoning task processing method, device and medium | |
CN116702765A (en) | Event extraction method and device and electronic equipment | |
CN110909174A (en) | Knowledge graph-based method for improving entity link in simple question answering | |
CN112818688B (en) | Text processing method, device, equipment and storage medium | |
CN113160801B (en) | Speech recognition method, device and computer readable storage medium | |
CN113849634B (en) | Method for improving interpretability of depth model recommendation scheme | |
CN115527520A (en) | Anomaly detection method, device, electronic equipment and computer readable storage medium | |
CN115587358A (en) | Binary code similarity detection method and device and storage medium | |
CN112767928A (en) | Voice understanding method, device, equipment and medium | |
CN115081459B (en) | Spoken language text generation method, device, equipment and storage medium | |
CN114818644B (en) | Text template generation method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP03 | Change of name, title or address |
Address after: 830000 room 615-5, building 1, science building, No. 458, Northwest Road, saybak District, Urumqi, Xinjiang Uygur Autonomous Region Patentee after: Xinjiang Shenggu rongchuang Digital Industry Development Co.,Ltd. Address before: 830002 room 529, 5th floor, science building, 458 Northwest Road, shayibak District, Urumqi, Xinjiang Uygur Autonomous Region Patentee before: XINJIANG IFLYTEK INFORMATION TECHNOLOGY CO.,LTD. |
|