CN104050160B - Spoken language translation method and apparatus combining machine translation with human translation - Google Patents
Spoken language translation method and apparatus combining machine translation with human translation
- Publication number: CN104050160B
- Application number: CN201410090457.6A
- Authority
- CN
- China
- Prior art keywords
- translation
- target sentence
- sentence
- machine
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Machine Translation (AREA)
Abstract
The present invention proposes a spoken language translation method and apparatus combining machine translation with human translation. The method comprises: recognizing a continuous speech passage and segmenting it at sentence boundaries to obtain input text in units of sentences; searching a database with the input text for a corresponding target sentence and, if one is found, outputting it directly as speech; otherwise translating the input text by machine translation to obtain a target sentence and assigning the target sentence a confidence score; obtaining a target sentence by human translation of the input text; evaluating the quality of the human translation against the machine translation confidence score; and synthesizing the better-rated target sentence into speech output with an adjustable prosody.
Description
Technical field
The present invention relates to the field of automatic computer information processing, and in particular to a spoken language translation method and apparatus combining machine translation with human translation.
Background art
Research and development of automatic translation systems has been under way for over fifty years; in fact, applying computers to speech translation has been explored since the electronic computer was born in the 1940s. Since the late 1980s, researchers have devoted themselves to speech-to-speech translation technology. So-called speech translation means enabling a computer to translate speech in one language into speech in another language. Its basic vision is to let the computer act, as a human interpreter does, as the intermediary between speakers of different languages. Because speakers generally use the spoken language of daily life, people hope that a machine translation system can accept and translate arbitrary spoken utterances, and with the rapid development and improvement of speech recognition and spoken language analysis technology, this hope is no longer a boundless fantasy. Speech translation is therefore often also called spoken language translation (Spoken Language Translation, SLT).
Spoken language translation involves several disciplines and technologies, including automatic speech recognition, machine translation and speech synthesis, and is thus of major research significance. In the early 1990s, along with the continuous improvement of speech recognition technology, spoken language translation gradually flourished. With the continuous development of its core technologies, spoken language translation is no longer confined to high-quality translation within a fixed domain, but instead takes communication between different languages as its goal. Research in spoken language translation emphasizes the following aspects: 1) mining dialogue structure in interactive speech translation; 2) studying how spoken language differs from written language; 3) system performance and robustness. In recent years, with the establishment of statistical frameworks in speech recognition, more and more systems in machine translation and spoken language translation have been modeled with statistical methods. Traditional spoken translation systems, limited by the technology of their time, could only be applied in restricted speech environments, and the major challenge for current technology is how to handle real-life oral communication. Concrete application scenarios range from comprehensive information services for international conferences (including sports events) to travel information consultation; from a social and economic perspective, spoken language translation (SLT) is therefore highly relevant to our globalized world.
Spoken language is the main medium of people's daily communication; the conciseness of its expression and the breadth of its use are attracting increasing attention. Developing machine translation systems that process spoken content, convenient to use and deployable on portable embedded platforms, is thus an important part of building practical machine translation systems for people. However, certain characteristics of spoken language make the development of practical spoken translation systems difficult. The main difficulties of spoken language translation are: 1) in real scenarios such as spoken dialogue and Internet chat, input sentences often lack standard form, making it hard to capture their underlying grammatical structure, so that statistical translation results are stiff and unstable; 2) statistical machine translation is data-driven and depends for its existence on bilingual data resources, yet in current data collections bilingual corpora for spoken language (Chinese-English) are quite scarce, so at present spoken translation systems that rely entirely on statistical methods cannot fully meet the broad demands of daily life; 3) unlike text translation, spoken language translation mainly serves communication between speakers of different languages, so its real-time requirements are high, and how to optimize the translation workflow in a network environment is the key to improving user experience.
Summary of the invention
The present invention aims to provide a spoken language translation method combining machine translation with human translation, which can solve the problems of stiff results and poor stability that arise when spoken translation is performed by machine alone, and which, by continuously accumulating a large-scale sentence-level spoken translation database, improves the degree of automation of translation and accelerates the commercialization of spoken translation systems.
According to one aspect of the present invention, a spoken language translation method combining machine translation with human translation is provided, comprising:
Step 1, recognizing a continuous speech passage and segmenting it at sentence boundaries to obtain input text in units of sentences;
Step 2, searching a database with the input text for a corresponding target sentence; if one is found, outputting the target sentence directly as speech, otherwise proceeding to step 3;
Step 3, translating the input text by machine translation to obtain a target sentence, and assigning the target sentence a confidence score;
Step 4, obtaining a target sentence by human translation of the input text;
Step 5, evaluating the quality of the human translation against the machine translation confidence score, and synthesizing the better-rated target sentence into speech output with an adjustable prosody.
According to another aspect of the present invention, a spoken language translation apparatus combining machine translation with human translation is disclosed, comprising:
a speech recognition and fragmentation module, which recognizes a continuous speech passage and segments it at sentence boundaries to obtain input text in units of sentences;
a template retrieval and substitution module, which searches a database with the input text for a corresponding target sentence; if one is found, the target sentence is output directly as speech, otherwise control passes to the first translation module;
a machine translation module based on hierarchical phrases, which translates the input text by machine translation to obtain a target sentence and assigns the target sentence a confidence score;
an intelligent human crowdsourcing translation module, which obtains a target sentence by human translation of the input text;
a quality assessment module, which evaluates the quality of the human translation against the machine translation confidence score and decides the final translation result;
a speech synthesis output module, which synthesizes the final translation result decided by the quality assessment module into speech output with an adjustable prosody.
In the scheme proposed above, the results of the different translation methods are delivered to the user uniformly by way of speech synthesis, meeting the real-time demands of spoken communication. To convey the emotion of the source-language speaker vividly, the method recognizes the user's emotion and presents it to the target-language user by annotation or synthesis, improving the experience in the user's application scenario.
By segmenting passages into sentence units, the present invention enables high-quality human translations to be stored and effectively reused, overcoming the problem that traditional free human translations are difficult for statistical translation methods to exploit. Through speech synthesis, the present invention stably renders the results of the different translation methods as speech carrying emotion, which well solves the problem of real-time communication in speech translation. The present invention can solve the spoken translation problems of massive numbers of mobile users in scenarios such as daily life, tourism, study and simple dialogue, accelerating the marketization of automatic spoken translation technology, and is thus of very great practical significance.
Brief description of the drawings
Fig. 1 is a flowchart of the spoken language translation method combining machine translation with human translation proposed by the present invention;
Fig. 2 is a schematic diagram of the present invention performing sentence segmentation on the character string "ABCDEFGH";
Fig. 3 is the user growth curve envisioned for the coming two years in the present invention;
Fig. 4 is a schematic flowchart of the translation confidence assessment method in the present invention;
Fig. 5 is a schematic diagram of the spoken language translation apparatus combining machine translation with human translation proposed by the present invention.
Specific embodiments
To make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to specific embodiments and the accompanying drawings.
Referring to Fig. 1, the flow of the spoken language translation method combining machine translation with crowdsourced human translation according to the present invention is as follows:
S11: speech in the source language is input through a terminal device; the continuous input speech is segmented, with automatic speech recognition and automatic insertion of punctuation at its core, to obtain the sentence-level input text required for translation;
S12: after each text segment enters the system, the translation database is searched first; if the match succeeds, the retrieved translation text is output and the flow jumps to step S17;
S13: if no corresponding data can be retrieved, the intelligent management module is started, which decides whether to enable human translation by computing the complexity of the input sentence in combination with the user's tier, while also sending the data to the machine translation module;
S14: a statistical machine translation method based on hierarchical phrases is applied, and translation confidence is estimated by fusing parameters trained with forced alignment;
S15: if human translation is enabled, crowdsourced human translation is used, with translators completing the text translation sentence by sentence on visual terminals in the background;
S16: the machine translation result and the human translation result are comprehensively assessed according to translation confidence, and the better translation text is output;
S17: the output translation text is synthesized into speech of consistent quality, which can convey the user's emotion vividly and well satisfies the practical demand for face-to-face real-time translation.
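The flow S11 to S17 above can be sketched as a small dispatch function. This is only an illustrative sketch: `translate_sentence`, the dictionary cache, the `mt_engine` callable, the `human_pool` callable and the 0.7 confidence threshold are hypothetical stand-ins for the modules described here, not the patented implementation.

```python
def translate_sentence(sentence, database, mt_engine, human_pool, threshold=0.7):
    """Return one translation for a segmented input sentence.

    database: dict-like cache of curated translations (step S12).
    mt_engine: callable -> (translation, confidence) (step S14).
    human_pool: callable -> translation, or None if no translators (step S15).
    """
    cached = database.get(sentence)
    if cached is not None:
        return cached                       # S12 hit: straight to synthesis (S17)
    mt_text, mt_conf = mt_engine(sentence)  # S14: MT with a confidence score
    if mt_conf < threshold and human_pool is not None:
        return human_pool(sentence)         # S13/S15/S16: low confidence -> human
    return mt_text

# Toy usage with stub components
db = {"你好": "Hello"}
mt = lambda s: (s.upper(), 0.9)             # stand-in MT engine, always confident
print(translate_sentence("你好", db, mt, None))
print(translate_sentence("abc", db, mt, None))
```

In a deployed system the three components would be the database retrieval, machine translation and crowdsourcing modules described in the following subsections.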
The above method steps are described in detail below by way of embodiments.
Fragmentation of continuous speech in step S11
In online continuous speech recognition, speech flows into the recognition module without interruption. Although feature transformation and the per-frame posterior probabilities can be computed in real time, the decoder and memory cannot bear overly long speech data. Segmentation of the speech stream is therefore extremely important in online continuous speech recognition: it not only guarantees stable operation of the system but also has a vital effect on recognition performance. Traditional online continuous speech recognition segments according to the length of silences in the speech. Such segmentation works reasonably well on read or news speech, but its effect drops sharply on natural speech, because in natural speaking a pause in the middle of a sentence is a common phenomenon, and silence-based segmentation will break the sentence wherever a pause occurs. Both speech fragments around such a pause are then prone to recognition errors. Moreover, in natural speaking, emotional fluctuation can cause many sentences to be spoken in one continuous stream without a single pause in between; silence-based segmentation has no way of detecting such boundaries, which in turn harms recognition.
Our online continuous speech recognition uses a segmentation method whose core criterion is recognition accuracy, so that segmentation is carried out under the precondition of not losing accuracy. Such a segmentation method can automatically judge sentence boundaries in natural speech.
Fig. 2 shows a schematic diagram of the present invention performing sentence segmentation on the character string "ABCDEFGH". As shown in Fig. 2, suppose a stretch of speech is searched according to the acoustic model and language model scores and the result so far is ABCDEFG. The next character is H, and as shown in the figure there are two cases to consider when deciding whether to segment here.
In the first case we segment here, i.e. take the upper path in the figure, and the language model probability is:
P1 = Ph · P(</s> | w(n-x+2)...wn) · P(w1 = H | <s>)      (1)
where w = w1w2...wn denotes a character string of length n and x denotes the order of the x-gram language model. In this example the character string is ABCDEFG, of length 7. Ph is the history probability of the recognized string ABCDEFG, which is the common starting point of both cases. P(</s> | w(n-x+2)...wn) is the probability that the last character G of the recognized string ends a sentence, and P(w1 = H | <s>) is the probability that the first undetermined character H begins a new sentence.
In the second case we do not segment here, i.e. take the lower path in the figure, and the language model probability is:
P2 = Ph · P(H | w(n-x+2)...wn)      (2)
where Ph is the same as in formula (1) and P(H | w(n-x+2)...wn) is the probability that character H follows character G.
Comparing the language model probabilities of the two cases: if P1 > P2, we judge that the string should be segmented here and a new sentence begins from H; otherwise the sentence is not yet over, H follows G to form the new string ABCDEFGH, and the same procedure is then applied to the next character I. This method not only excludes the false alarms caused by pauses within a sentence, but also avoids the misses in continuous speech without pauses, so that recognition accuracy is maximized. Moreover, this segmentation method does not depend on silence, and thus also avoids the erroneous segmentation caused by inaccurate silence detection.
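The comparison of the two hypotheses can be illustrated with a toy bigram model. Since the shared history probability Ph appears in both P1 and P2, it cancels and only the differing terms need to be compared; the `lm` dictionary and its probability values below are invented purely for illustration.

```python
import math

def should_split(lm, history, next_char, bos="<s>", eos="</s>"):
    """Decide whether to insert a sentence boundary before next_char.

    lm maps (context, token) -> probability. Toy bigram model: the context
    is the single previous token. The shared history probability Ph cancels,
    so only the terms that differ between the two hypotheses are compared.
    """
    prev = history[-1]
    # Hypothesis 1 (split here):  ... G </s>  <s> H ...
    log_p1 = (math.log(lm.get((prev, eos), 1e-9))
              + math.log(lm.get((bos, next_char), 1e-9)))
    # Hypothesis 2 (no split):    ... G H ...
    log_p2 = math.log(lm.get((prev, next_char), 1e-9))
    return log_p1 > log_p2

# Toy model where "G H" is an unlikely continuation, so a boundary wins
lm = {("G", "</s>"): 0.4, ("<s>", "H"): 0.3, ("G", "H"): 0.01}
print(should_split(lm, "ABCDEFG", "H"))
```

A real system would use the x-gram context w(n-x+2)...wn rather than a single previous token, but the decision rule is the same.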
Large-scale translation database retrieval in step S12
The translation database stores high-quality translation results produced by human translation or human proofreading. The database undertakes two tasks in the system:
After a text segment has undergone fragmentation, the translation database is searched first; if the match succeeds, the result is returned immediately. Matching here is realized by computing the similarity between sentences. Sentence similarity refers to the degree of semantic match between two sentences, a real value in [0, 1]; the larger the value, the more similar the two sentences. When the value is 1, the two sentences are semantically identical; the smaller the value, the less similar the two sentences; when the value is 0, the two sentences are semantically entirely different. Here we mainly compute sentence similarity with a TF-IDF method based on the vector space model, fused with semantic information. Because spoken language lacks standard grammatical structure, the syntactic-semantic structure of the input text is difficult to capture; the semantic information used here therefore mainly consists of shallow semantic information such as time expressions, numerals and named entities, on the basis of which substitutions are made to generate translation templates matched against the input text.
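A minimal sketch of the bag-of-words TF-IDF similarity described above (cosine similarity over TF-IDF vectors, without the shallow semantic features added in the invention) might look as follows; the smoothed IDF formula and the toy corpus are assumptions of this sketch.

```python
import math
from collections import Counter

def tfidf_vectors(sentences):
    """Build TF-IDF vectors over a tiny corpus (bag-of-words sketch)."""
    docs = [Counter(s.split()) for s in sentences]
    n = len(docs)
    df = Counter(w for d in docs for w in d)            # document frequency
    idf = {w: math.log((1 + n) / (1 + df[w])) + 1 for w in df}  # smoothed IDF
    return [{w: tf * idf[w] for w, tf in d.items()} for d in docs]

def cosine(a, b):
    """Cosine similarity, in [0, 1] for non-negative TF-IDF vectors."""
    dot = sum(a[w] * b.get(w, 0.0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

corpus = ["where is the station", "where is the hotel", "good morning"]
vecs = tfidf_vectors(corpus)
print(round(cosine(vecs[0], vecs[1]), 2))   # similar queries: high score
print(round(cosine(vecs[0], vecs[2]), 2))   # no shared words: 0.0
```

A retrieval hit would then be declared when the best similarity against the database exceeds some threshold.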
Since a statistical machine translation system is corpus-driven, its performance is inseparable from the corpora fed into it. In particular, what our translation database collects is real-time, accurate online corpus material; for the machine translation system, as this data continuously grows in scale and quality, system performance will keep improving.
Intelligent coordination in step S13
If the input sentence is not retrieved in the translation database, the system starts the intelligent coordination module, which decides, according to the actual user's category and the estimated complexity of the input text, whether the translation should be completed manually by a translator.
The users mentioned above are divided, by the fees they pay, into three categories: free users, ordinary users (who pay a small fee), and advanced users (who place higher demands on the accuracy of the system).
Fig. 3 illustrates the user growth curve envisioned for the system over the coming two years. As can be seen from Fig. 3, in the early phase of development the system mainly takes in free users while the intelligent translation platform is built and improved; as the number of users grows, combined with the platform's own capacity for optimization and updating, more and more paying users will be attracted. In addition, the intelligent management module also controls the addition of the better translation results to the database, ensuring that the database is continuously updated. Once this material accumulates to a certain scale, part of the corpus can be added to the statistical translation system to update the translation model and language model, so that the whole translation system keeps improving with the injection of real-time corpora. Therefore, as the scale of data processed by the system grows, the proportion of translation tasks that the automatic modules such as database retrieval and machine translation can complete increases steadily, which to a great extent reduces the cost of human translation and improves system efficiency.
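The routing decision of the intelligent coordination module can be illustrated as follows. The complexity measure, the tier names and the numeric limits are all invented for the sketch; the patent does not specify the actual formula.

```python
# Illustrative per-tier complexity limits; advanced users get human help sooner.
AUTO_LIMIT = {"free": 10, "ordinary": 16, "advanced": 4}

def sentence_complexity(sentence):
    """Crude complexity proxy: token count plus lexical variety."""
    tokens = sentence.split()
    return len(tokens) + len(set(tokens))

def route(sentence, user_tier):
    """Return 'human' or 'machine' for a sentence missed by the database."""
    limit = AUTO_LIMIT.get(user_tier, 10)
    return "human" if sentence_complexity(sentence) > limit else "machine"

print(route("hello there", "free"))
print(route("could you recommend a quiet hotel near the station", "advanced"))
```

Even when the router chooses "human", the sentence would still be sent to the machine translation module in parallel, as step S13 describes.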
In step S14, a statistical machine translation method based on hierarchical phrases is applied, and translation confidence is assessed by fusing parameters trained with forced alignment.
For spoken language translation we adopt a hierarchical phrase-based statistical machine translation method based on synchronous context-free grammar (synchronous context-free grammar, SCFG). The basic idea is to extract massive numbers of aligned phrase fragments from large-scale bilingual corpora and "memorize" them on a storage medium as the knowledge source. For the input text produced by speech recognition, an efficient search algorithm matches phrase fragments and combines them into the translated sentence. In addition, a confidence computation method is established that fuses parameters trained with a forced alignment model, studying chiefly the translation uncertainty introduced by the translation probability features, and finally generating a confidence score for each target-language translation result.
The hierarchical phrase-based statistical machine translation model is built on synchronous context-free grammar (synchronous context-free grammar, SCFG) and belongs to the category of formal grammar. A hierarchical phrase-based statistical translation system has the general characteristics of statistical machine translation based on formal grammar: it borrows the structure of a formal grammar so that the translation process becomes hierarchical and structured. It can therefore turn long-range reordering within a sentence into local reordering, and can introduce variables into the tree structure to solve problems such as generalization. Meanwhile, compared with statistical translation systems that introduce linguistic syntax, formal grammar has the advantages of requiring no additional syntactic analysis resources and being fully compatible with phrase-based systems. In particular, in real scenarios such as spoken dialogue and Internet chat, input sentences are often nonstandard and differ considerably from true written sentences, which greatly degrades the performance of syntactic analysis. If a statistical machine translation system based on linguistic grammar is built in such an environment, its performance will suffer badly from the low accuracy of syntactic analysis. A statistical machine translation system based on formal grammar, however, is not much affected in such a language environment, because in this respect the formal grammar model still retains the advantages of the phrase-based model. The hierarchical phrase translation method with confidence-scored translation output is described below:
1) Hierarchical phrase translation
Under the hierarchical phrase-based translation paradigm, the task of the decoder is to find the optimal target-language string ê satisfying
ê = argmax_D P(D)
where e denotes the target-language string and D denotes the derivation process that generates the target string through a series of rule applications. That is, we find a derivation D of the synchronous grammar that generates the source-language string; the target-language string produced at the target side of the synchronous grammar in D is the final translation. It should be noted that what is found here is the derivation of maximum probability rather than the target-language string of maximum probability, because multiple derivations may generate the same target-language string, and finding the most probable target-language string is computationally more expensive.
According to the log-linear model, the probability P(D) can be regarded as a log-linear combination of multiple features:
P(D) ∝ Π_i Φ_i(D)^λi
where Φ_i and λ_i are the i-th feature function and its corresponding weight. Apart from the language model feature PLM, the other features used can be written as products over rules:
Φ_i(D) = Π over rules X → <γ, α> used in D of φ_i(X → <γ, α>)
where X → <γ, α> denotes the synchronous context-free grammar rules used in the derivation of the hierarchical phrase system, so P(D) can be rewritten as:
P(D) ∝ PLM(e)^λLM · Π_{i≠LM} Π over X → <γ, α> in D of φ_i(X → <γ, α>)^λi
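Since maximizing a product of weighted features is equivalent to maximizing the sum of weighted log-features, a derivation can be scored as a log-linear sum. The feature names, probability values and weights below are invented for illustration only.

```python
import math

def log_linear_score(features, weights):
    """Score a derivation as sum_i lambda_i * log Phi_i
    (the log of the weighted product form above)."""
    return sum(weights[name] * math.log(value) for name, value in features.items())

# Hypothetical feature values for two competing derivations: rule translation
# probability, language model probability, and a length feature.
weights = {"p_trans": 1.0, "p_lm": 0.8, "p_len": 0.3}
derivs = {
    "d1": {"p_trans": 0.02, "p_lm": 0.001,  "p_len": 0.5},
    "d2": {"p_trans": 0.05, "p_lm": 0.0002, "p_len": 0.5},
}
best = max(derivs, key=lambda d: log_linear_score(derivs[d], weights))
print(best)   # the LM weight lets the weaker translation prob win or lose
```

The decoder's job is to search the space of derivations for the one maximizing this score, which is what the CYK-style algorithm below does.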
Traditional statistical translation systems such as Pharaoh use linear decoding algorithms like beam search or A*; although they can incorporate a simple reordering model and an N-gram language model, the word order of their translation results is always barely satisfactory. The front end of a hierarchical phrase-based translation system, however, incorporates the structural information of trees, and the back-end decoding algorithm must accordingly use tree parsing algorithms. The present invention therefore borrows from chart-based syntactic analysis and implements a CYK-style decoder.
The CYK algorithm is an improved bottom-up shift-reduce algorithm. Since a huge number of hypotheses can be generated during decoding, we employ a stack structure and apply different pruning strategies as necessary during the search to avoid exploring every possibility. An AddEdge function is established for every edge to be entered into the chart (chart): it first checks whether the edge passes threshold (threshold) pruning; if so, it then checks whether the edge should be recombined (recombination); finally it checks whether the edge passes histogram (histogram) pruning. Only edges that pass all three pruning checks are finally stored into the chart.
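The three checks of the AddEdge function can be sketched as follows; the scores, state signatures and the threshold/histogram limits are illustrative, and a real decoder would store full hypotheses rather than plain tuples.

```python
def add_edge(cell, edge, best_score, threshold=5.0, histogram=3):
    """Insert a hypothesis edge into one chart cell, mimicking the three
    pruning checks described above (all numbers are illustrative).

    edge: (score, signature, payload); higher score is better.
    """
    score, sig, _ = edge
    # 1. threshold (beam) pruning against the best score seen in this cell
    if score < best_score - threshold:
        return cell
    # 2. recombination: keep only the best edge per state signature
    if sig in cell and cell[sig][0] >= score:
        return cell
    cell[sig] = edge
    # 3. histogram pruning: cap the number of edges kept in the cell
    if len(cell) > histogram:
        worst = min(cell, key=lambda s: cell[s][0])
        del cell[worst]
    return cell

cell = {}
for e in [(-1.0, "A", "x"), (-9.0, "B", "y"), (-1.5, "A", "z"), (-2.0, "C", "w")]:
    add_edge(cell, e, best_score=-1.0)
print(sorted(cell))   # "B" failed the beam, the second "A" lost recombination
```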
2) Translation confidence assessment fused with forced-alignment parameter training
In the ordinary sense, confidence is a measure for evaluating the probability that something is correct; it reflects the reliability of a particular event occurring. In machine translation, it denotes the assessment of the translation result for a given input under the condition that no reference answer is available in advance. In our system we use not only information external to the translation system, such as the perplexity and length of the source/target language, but also fuse various kinds of information from the translation process itself, better simulating a human's scoring of the translation result.
Fig. 4 shows a schematic flowchart of the translation confidence assessment method in the present invention. As shown in Fig. 4, in the confidence assessment method based on forced-alignment parameter training, the parameters of the model are first trained with forced alignment techniques, retaining information such as the probabilities arising in the alignment process; meanwhile, knowledge such as the perplexity and lengths of the source string S and target string T is incorporated into the confidence assessment as features. Through machine learning, the various kinds of information can be fused into a unified framework that approximates the target function as closely as possible. In the present system we use support vector regression (Support Vector Regression, SVR) as the machine learning tool, because the objective function of SVR can well approximate our target: simulating human scoring of translation results.
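Assuming scikit-learn is available, the SVR-based confidence regressor can be sketched as below. The three features (mean translation probability, source-side perplexity, length ratio) match the kinds of knowledge named above, but the training data, adequacy targets and hyperparameters are invented for the sketch.

```python
import numpy as np
from sklearn.svm import SVR

# Each row holds three illustrative features for one translated sentence:
# [mean translation probability, source-side perplexity, target/source length ratio].
# The adequacy targets in y (human scores in [0, 1]) are invented for the sketch.
X = np.array([[0.8,  50, 1.0], [0.2, 300, 0.4], [0.6,  90, 0.9],
              [0.3, 250, 0.5], [0.7,  70, 1.1], [0.1, 400, 0.3]])
y = np.array([0.9, 0.2, 0.7, 0.3, 0.8, 0.1])

model = SVR(kernel="rbf", C=1.0, epsilon=0.05)
model.fit(X, y)

# Score a new machine translation: a higher prediction means a more
# trustworthy output, which step S16 compares against the human result.
print(round(model.predict([[0.75, 60, 1.0]])[0], 2))
```

In practice the features would be standardized and the regressor trained on a much larger set of human-scored translations.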
Crowdsourced human translation in step S15:
1) Crowdsourced human translation
Crowdsourcing (Crowdsourcing) refers to the practice by which a company outsources tasks to a nonspecific crowd over the network on a free, voluntary basis. It is an Internet-based model of socialized production that can be embedded into every link of business operation, forming a new type of organization. The concept was proposed by the well-known American IT magazine Wired and has been called another important business concept in the industry after the "long tail theory". In organizing human translation, this invention adopts the crowdsourcing model, organizing passers-by with a certain ability in spoken translation and using collective intelligence to create a new social undertaking that benefits everyone.
Although the concept of crowdsourced translation has been raised, and websites such as Yeeyan, Hupu and Guokr run article and book translation communities in the crowdsourcing pattern, the main translation objects of these community-type online media are text messages circulating in real time on the network, such as political news and sports events; suitable source texts are generally found by senior forum members and website staff, who manage the grading and division of labour among the "translation team" members. The crowdsourced translation in this invention, by contrast, is oriented to the mutually exchanged spoken information of daily life, which has moreover undergone fragmentation. Compared with the crowdsourced translation of written texts such as news, the spoken crowdsourced translation proposed by this invention therefore has the following advantages:
The content expressed in spoken messages serves communication in real life, and therefore calls for more "humanized" translation results. Crowdsourced translation can harness the strength of the crowd, not only reducing the costs of development, operation and manpower, but also bringing the translation results closer to users' needs.
The content expressed in spoken messages touches on many aspects of people's food, clothing, housing and travel; using crowdsourced translation here can gather more amateur translation talent and give full play to the advantage of collective intelligence.
On the visual terminal, what the translator faces is the text message after fragmentation. Generally, translators are good at offering suggestions for well-posed questions and providing translation answers for targeted questions, so this arrangement is more conducive to real-time, efficient translation by the translator.
By loading the fragmented corpus onto mobile terminals (iPad, mobile phone) for the translator to translate and edit, and finally letting the translator input the chosen translation result into the terminal by voice, the purpose of real-time translation is realized.
2) interpreter terminal
By enabling the translator to complete sentence-by-sentence text translation in a background visualization terminal, the invention lowers the difficulty of the translator's work and reduces the cost of human translation while still meeting the demand for real-time translation. This is mainly reflected in the following aspects:
The fragmented text is displayed directly in the background visualization terminal, so the translator no longer faces long, continuous speech but fragmented, visualized text messages. On the one hand, because the fragmented text takes the sentence as its unit, the translator no longer needs to remember large amounts of information or even whole paragraphs, which reduces the demand on the translator's ability to summarize. On the other hand, the input information is converted into visualized text, and the translator can quickly and conveniently translate the visible text message into the target language. This working mode relaxes the demands on the translator's various abilities, so that "amateur" translation talents can also join quickly, greatly increasing the applicable scope of the system.
The words that need human translation are highlighted by setting different states for the current task. For example, different colors can be used in the terminal to indicate whether the data has been retrieved, whether it has been machine translated, and whether human translation is needed. When human translation is needed, highlighting can distinguish that state from the others. This method of highlighting particular states allows the translator to conveniently and quickly complete the sentence-by-sentence translation of the text, lowering the translator's difficulty and reducing the cost of human translation.
Finally, thanks to our excellent speech recognition system, the translator can choose to enter the translation result into the terminal by voice, and its embedded speech recognition module automatically converts the speech into text. There is no longer any need to type the translation result character by character, which reduces manual working time and improves work efficiency.
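The task-state highlighting described above can be sketched as a simple mapping from task states to display colors. The state names, colors, and helper functions below are purely illustrative assumptions, not details given in the patent:

```python
from enum import Enum

class TaskState(Enum):
    RETRIEVED = "green"            # found in the translation database
    MACHINE_TRANSLATED = "blue"    # machine translation result available
    NEEDS_HUMAN = "red"            # highlighted for the human translator

def highlight(sentences, state_of):
    """Pair each sentence with the display color of its current state,
    so sentences needing human translation stand out at a glance."""
    return [(s, state_of(s).value) for s in sentences]
```

A terminal UI would then render each sentence in its assigned color, letting the translator scan straight to the red items.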
In step S16, the method of comprehensively assessing the automatic translation and the human translation is as follows:
After the automatic translation and the human translation are completed, all results are input into this module for comprehensive quality assessment and output. The module uses the confidence score of the automatic translation together with the grade of each translator, and draws on the BLEU computation method from automatic translation evaluation, to comprehensively assess the results of the different translation methods; the better result is returned to the speech synthesis module.
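The sentence-level BLEU computation that the assessment module draws on can be sketched as follows. The add-one smoothing and the capping of n-gram order at the candidate length are implementation choices made here for a minimal working example, not details specified by the patent:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def sentence_bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU: clipped n-gram precision (add-one smoothed)
    times a brevity penalty for candidates shorter than the reference."""
    c, r = len(candidate), len(reference)
    orders = min(max_n, c)
    if orders == 0:
        return 0.0
    log_prec = 0.0
    for n in range(1, orders + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        clipped = sum(min(cnt, ref[g]) for g, cnt in cand.items())
        total = sum(cand.values())
        log_prec += math.log((clipped + 1) / (total + 1)) / orders
    bp = 1.0 if c > r else math.exp(1 - r / c)  # brevity penalty
    return bp * math.exp(log_prec)
```

An exact match scores 1.0; partial overlaps score lower, which lets the module rank competing machine and human outputs against a reference.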
Fig. 5 shows a simulation diagram of the oral translation unit, proposed by the present invention, in which machine translation and human translation are blended. As shown in Fig. 5:
S11: The speech of the source language is input through a terminal device (such as a mobile phone); the continuous input speech is segmented into sentences, with automatic speech recognition and automatic addition of punctuation marks at its core, to obtain the input text, in units of sentences, required for translation;
S12: After each text corpus enters the system, retrieval is performed first in the translation database; if matching succeeds, the retrieved translation text is output, and after quality assessment the speech synthesis result of the translation text is output directly;
S13: If no corresponding data can be retrieved, the intelligent management module is started, which decides whether to enable human translation by computing the complexity of the input sentence and taking the user's level into account, while the data is also fed into the machine translation module;
S14: A statistical machine translation method based on hierarchical phrases is used, fusing the parameter training results of forced alignment to estimate translation confidence;
S15: If it is decided to enable human translation, the method of crowdsourced human translation is adopted, and a background visualization terminal (an iPad or other computer) enables the translator to complete the sentence-by-sentence translation of the text;
S16: The machine translation result and the human translation result are comprehensively assessed according to translation confidence, and the better translation text result is output;
S17: The output translation text is synthesized into speech output of stable quality; this spoken output can well meet the practical demand for face-to-face real-time translation.
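The control flow of steps S11 to S17 can be sketched as follows. All helper objects (`db`, `mt`, `crowd`, and so on) are illustrative stand-ins supplied by the caller, not APIs defined by the patent; the input is assumed to be already segmented into sentences (the S11 output):

```python
def translate_utterance(sentences, db, mt, crowd, needs_human, pick_best, tts):
    """Route each segmented sentence through retrieval, machine translation,
    optional crowdsourced translation, assessment, and speech synthesis."""
    outputs = []
    for sent in sentences:
        hit = db.get(sent)                       # S12: translation-database lookup
        if hit is not None:
            outputs.append(tts(hit))
            continue
        candidates = [mt(sent)]                  # S14: (translation, confidence)
        if needs_human(sent):                    # S13: complexity-based routing
            candidates.append(crowd(sent))       # S15: crowdsourced translation
        best = pick_best(candidates)             # S16: comprehensive assessment
        outputs.append(tts(best))                # S17: speech synthesis
    return outputs
```

A database hit short-circuits the pipeline, which matches the design goal of raising the retrieval share as the translation database grows.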
The invention also proposes a speech recognition and oral translation conversion device, which includes:
A speech recognition and fragmentation module: performs automatic speech recognition and automatic punctuation on continuous speech, forming text to be translated in units of sentences;
A template retrieval and replacement module for a large-scale translation database: uses a TF-IDF method, based on the vector space model and fused with semantic information, to complete large-scale database retrieval of oral translation example sentences, and uses shallow syntactic information to complete partial-content replacement of the translation data, strengthening the database's generalization capability;
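The TF-IDF retrieval over the vector space model can be sketched as below; the semantic-information fusion and the shallow-syntax replacement step are omitted here for brevity, and the +1 idf offset is an implementation choice of this sketch:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build TF-IDF vectors for tokenized database sentences."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}  # +1 keeps shared terms nonzero
    return [{t: c * idf[t] for t, c in Counter(d).items()} for d in docs], idf

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def retrieve(query_tokens, docs):
    """Return the index of the database sentence most similar to the query."""
    vecs, idf = tfidf_vectors(docs)
    q = {t: c * idf.get(t, 0.0) for t, c in Counter(query_tokens).items()}
    return max(range(len(docs)), key=lambda i: cosine(q, vecs[i]))
```

The highest-scoring example sentence's stored translation would then be returned as the target sentence, subject to the similarity exceeding a matching threshold.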
An intelligent management module: its main function is to decide whether to enable human translation by computing the complexity of the input sentence and taking the user's level into account, while the data is also fed into the machine translation module. Meanwhile, it continuously collects and organizes the results of crowdsourced human translation, forming a high-quality translation database to provide efficient retrieval services and to supply high-quality bilingual corpora for updating and optimizing the machine translation model. As the translation database accumulates and the machine translation system is gradually perfected, the proportion of data retrieval and machine translation is stepped up, realizing dynamic updating and optimization of the whole translation system;
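The complexity-based routing decision performed by the intelligent management module might look like the following heuristic. The complexity formula, the thresholds, and the use of out-of-vocabulary ratio are purely illustrative assumptions; the patent does not specify how complexity is computed:

```python
def needs_human_translation(tokens, vocab, user_level, max_len=12, oov_limit=0.2):
    """Hypothetical routing heuristic: long sentences or sentences with many
    out-of-vocabulary words are considered complex and routed to human
    translation; a higher user_level tolerates more machine translation."""
    oov_ratio = sum(1 for t in tokens if t not in vocab) / max(len(tokens), 1)
    complexity = len(tokens) / max_len + oov_ratio / oov_limit
    return complexity > user_level
```

Whatever the exact formula, the key design point is that the same input is also sent to the machine translation module, so human translation supplements rather than replaces it.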
A machine translation module based on hierarchical phrases: comprising three parts, a translation model, a language model, and a decoder. Its basic idea is to extract massive aligned phrase fragments from a large-scale bilingual corpus and "memorize" them on a storage medium as a knowledge source. For a Chinese sentence keyed in by the user, an efficient search algorithm matches phrase fragments and combines them into the English sentence to be translated. On top of this, the invention also establishes a confidence computation method that fuses parameter training based on forced alignment, mainly studying the translation uncertainty introduced by translation probability features, and finally generates a confidence score for the translation result of each target-language sentence;
An intelligent crowdsourced human translation module: a crowdsourcing approach is used to attract translation talents who have oral language ability and are willing to turn it into market value, forming a service-oriented community that gathers such talents and providing instant human translation services through subcontracting on an intelligent platform. The fragmented corpus is delivered to a mobile terminal (an iPad or mobile phone) for the translator to translate, edit, evaluate, and return. The visualization terminal that enables the translator to complete sentence-by-sentence text translation in real time includes: the fragmented text is displayed directly in the background visualization terminal, so the translator no longer directly faces continuous speech and text, which to a great extent reduces the specific demands on the translator's listening, memory, and summarizing abilities; the words that need human translation are highlighted by setting different states for the current task, so that the translator can quickly and conveniently complete the sentence-by-sentence translation of the text, lowering the translator's difficulty and reducing the cost of human translation; finally, after translation is completed, the translator no longer needs to type in the text word by word, but can choose to complete voice input through the terminal device, whose embedded speech recognition module automatically converts the speech into text and submits it to the system, greatly improving the translator's work efficiency;
A quality assessment module: comprehensively judges each result by analyzing the database retrieval result and the confidence score provided by the machine translation device, together with the crowdsourced human translation result and multi-aspect feedback information such as translation time and translator grade, adapted to the user's grade, and gives the final translation result of the judgment;
A speech synthesis module: outputs the translation result by means of speech synthesis. To convey the source-language user's emotion, the recognition method can identify the user's emotion and present it to the target-language user by means of annotation or synthesis, improving the user experience in the application scenario.
The specific embodiments described above further explain the purpose, technical solutions, and beneficial effects of the present invention in detail. It should be understood that the above are only specific embodiments of the present invention and are not intended to limit the present invention; any modification, equivalent substitution, improvement, and the like made within the spirit and principles of the present invention shall be included within the scope of protection of the present invention.
Claims (9)
1. An oral translation method in which machine translation and human translation are blended, characterized by comprising:
Step 1: recognizing a continuous voice paragraph and performing punctuation segmentation on it to obtain input text in units of sentences;
Step 2: performing a database search according to the input text to find whether there is a corresponding target sentence; if so, directly outputting the target sentence as speech; otherwise, going to step 3;
Step 3: translating the input text using machine translation to obtain a target sentence, and scoring the confidence of the target sentence;
Step 4: obtaining a target sentence by human translation of the input text;
Step 5: assessing the quality of the human translation according to the translation confidence of the machine translation, and using a speech synthesis method to generate prosody-adjustable speech output from the better-assessed target translation sentence;
wherein in step 3 a machine translation system using a statistical method based on hierarchical phrases outputs a target sentence carrying a confidence measure, the detailed process of which comprises:
extracting massive aligned phrase fragments from a bilingual corpus and "memorizing" them on a storage medium as a knowledge source; for the input text produced by speech recognition, using a search algorithm to match phrase fragments and combine them into the target sentence to be translated; and fusing a confidence computation method based on forced-alignment model parameter training to generate a confidence score for each target sentence.
2. The method according to claim 1, characterized in that performing punctuation segmentation in step 1 to obtain input text in units of sentences specifically comprises:
segmenting the continuous speech, input in units of voice paragraphs, using prosody as the principal feature, combined with automatic speech recognition and automatic addition of punctuation marks, under the precondition of not losing recognition accuracy.
3. The method according to claim 1, characterized in that step 2 specifically comprises:
for each input text, computing the similarity between the input text and the sentences in the database using a TF-IDF method based on the vector space model and fused with semantic information, thereby obtaining the target sentence.
4. The method according to claim 1, characterized in that the following judgment is also needed before executing step 4: computing the complexity of the input text and, taking the user's category into account, determining whether human translation is needed.
5. The method as described in any one of claims 1-3, characterized in that in step 1 whether to break the sentence is judged according to the recognition probabilities of different language models, wherein the language model recognition probability of a recognized character string is the product of its historical probability, the probability that the last character of the recognized character string is the tail of a sentence, and the probability that the first unrecognized character string starts a sentence.
6. the method for claim 1, it is characterised in that the statistical translation based on level phrase is referred specifically to from source language
Speech string finds out the derivation of maximum probability in multiple derivation generating process of target strings, by the derivation corresponding target
The result output gone here and there as machine translation.
7. the method for claim 1, it is characterised in that by source in support vector machine just translation process in step 3
The puzzled degree of language strings and target strings, length fusion study, and then confidence level marking is carried out to final object statement.
8. An oral translation device in which machine translation and human translation are blended, characterized by comprising:
a speech recognition and fragmentation module, which recognizes a continuous voice paragraph and performs punctuation segmentation on it to obtain input text in units of sentences;
a template retrieval and replacement module, which performs a database search according to the input text to find whether there is a corresponding target sentence, directly outputting the target sentence as speech if there is one, and otherwise entering the first translation module;
a machine translation module based on hierarchical phrases, which translates the input text using machine translation to obtain a target sentence and scores the confidence of the target sentence;
an intelligent crowdsourced human translation module, which obtains a target sentence by human translation of the input text;
a quality assessment module, which assesses the quality of the human translation according to the translation confidence of the machine translation and gives the final translation result of the judgment;
a speech synthesis output module, which uses a speech synthesis method to generate prosody-adjustable speech output from the final translation result determined by the quality assessment module;
wherein the machine translation module based on hierarchical phrases obtains the target sentence as follows:
extracting massive aligned phrase fragments from a bilingual corpus and "memorizing" them on a storage medium as a knowledge source; for the input text produced by speech recognition, using a search algorithm to match phrase fragments and combine them into the target sentence to be translated; and fusing a confidence computation method based on forced-alignment model parameter training to generate a confidence score for each target sentence.
9. The device as claimed in claim 8, characterized by further comprising:
an intelligent management module, which decides whether to enable human translation by computing the complexity of the input sentence and taking the user's category into account, while sending the data into the machine translation module based on hierarchical phrases.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410090457.6A CN104050160B (en) | 2014-03-12 | 2014-03-12 | Interpreter's method and apparatus that a kind of machine is blended with human translation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410090457.6A CN104050160B (en) | 2014-03-12 | 2014-03-12 | Interpreter's method and apparatus that a kind of machine is blended with human translation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104050160A CN104050160A (en) | 2014-09-17 |
CN104050160B true CN104050160B (en) | 2017-04-05 |
Family
ID=51503013
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410090457.6A Active CN104050160B (en) | 2014-03-12 | 2014-03-12 | Interpreter's method and apparatus that a kind of machine is blended with human translation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104050160B (en) |
Families Citing this family (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104601834B (en) * | 2014-12-19 | 2017-03-22 | 国家电网公司 | Multilingual automatic speech calling and answering device and using method thereof |
CN104731775B (en) * | 2015-02-26 | 2017-11-14 | 北京捷通华声语音技术有限公司 | The method and apparatus that a kind of spoken language is converted to written word |
CN105389305B (en) * | 2015-10-30 | 2019-01-01 | 北京奇艺世纪科技有限公司 | A kind of text recognition method and device |
CN106649283B (en) * | 2015-11-02 | 2020-04-28 | 姚珍强 | Translation device and method based on natural conversation mode for mobile equipment |
CN105761201B (en) * | 2016-02-02 | 2019-03-22 | 山东大学 | A kind of method of text in translation picture |
CN108255857B (en) * | 2016-12-29 | 2021-10-15 | 北京国双科技有限公司 | Statement detection method and device |
CN107632980B (en) * | 2017-08-03 | 2020-10-27 | 北京搜狗科技发展有限公司 | Voice translation method and device for voice translation |
WO2019071607A1 (en) | 2017-10-09 | 2019-04-18 | 华为技术有限公司 | Voice information processing method and device, and terminal |
CN108009138A (en) * | 2017-12-25 | 2018-05-08 | 中译语通科技(青岛)有限公司 | A kind of interactive system of corpus crowdsourcing alignment |
CN108573694B (en) * | 2018-02-01 | 2022-01-28 | 北京百度网讯科技有限公司 | Artificial intelligence based corpus expansion and speech synthesis system construction method and device |
CN108962228B (en) * | 2018-07-16 | 2022-03-15 | 北京百度网讯科技有限公司 | Model training method and device |
CN109460558B (en) * | 2018-12-06 | 2023-04-21 | 云知声(上海)智能科技有限公司 | Effect judging method of voice translation system |
CN109754808B (en) * | 2018-12-13 | 2024-02-13 | 平安科技(深圳)有限公司 | Method, device, computer equipment and storage medium for converting voice into text |
CN109670727B (en) * | 2018-12-30 | 2023-06-23 | 湖南网数科技有限公司 | Crowd-sourcing-based word segmentation annotation quality evaluation system and evaluation method |
CN110175337B (en) * | 2019-05-29 | 2023-06-23 | 科大讯飞股份有限公司 | Text display method and device |
CN110516063A (en) * | 2019-07-11 | 2019-11-29 | 网宿科技股份有限公司 | A kind of update method of service system, electronic equipment and readable storage medium storing program for executing |
CN111178098B (en) * | 2019-12-31 | 2023-09-12 | 苏州大学 | Text translation method, device, equipment and computer readable storage medium |
CN114846543A (en) * | 2020-01-10 | 2022-08-02 | 深圳市欢太科技有限公司 | Voice recognition result detection method and device and storage medium |
CN111444730A (en) * | 2020-03-27 | 2020-07-24 | 新疆大学 | Data enhancement Weihan machine translation system training method and device based on Transformer model |
CN111581373B (en) * | 2020-05-11 | 2021-06-01 | 武林强 | Language self-help learning method and system based on conversation |
CN114205665B (en) | 2020-06-09 | 2023-05-09 | 抖音视界有限公司 | Information processing method, device, electronic equipment and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101739867A (en) * | 2008-11-19 | 2010-06-16 | 中国科学院自动化研究所 | Method for scoring interpretation quality by using computer |
CN102214166A (en) * | 2010-04-06 | 2011-10-12 | 三星电子(中国)研发中心 | Machine translation system and machine translation method based on syntactic analysis and hierarchical model |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9070363B2 (en) * | 2007-10-26 | 2015-06-30 | Facebook, Inc. | Speech translation with back-channeling cues |
- 2014-03-12: Application CN201410090457.6A filed; granted as patent CN104050160B/en (status: Active)
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101739867A (en) * | 2008-11-19 | 2010-06-16 | 中国科学院自动化研究所 | Method for scoring interpretation quality by using computer |
CN102214166A (en) * | 2010-04-06 | 2011-10-12 | 三星电子(中国)研发中心 | Machine translation system and machine translation method based on syntactic analysis and hierarchical model |
Non-Patent Citations (3)
Title |
---|
Research on an Object-Oriented Classification Pattern Library for the Interactive Hybrid-Strategy Machine Translation System (IHSMTS); Hu Zengjian; China Master's Theses Full-text Database, Information Science and Technology; 2007-02-15 (No. 2); abstract, section 2.3.2.6, Fig. 2.3.3 *
An Interactive Spoken Language Translation Method with Human-Machine Cooperation; Liu Peng; Journal of Chinese Information Processing; May 2009; Vol. 23, No. 3; pp. 58-64 *
Design, Implementation and System Evaluation of a Speech Translation Dictionary; Cheng Jie; China Master's Theses Full-text Database, Information Science and Technology; 2005-01-15 (No. 1); sections 2.3.1 and 2.3.2, Figs. 2.1, 2.2 *
Also Published As
Publication number | Publication date |
---|---|
CN104050160A (en) | 2014-09-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104050160B (en) | Interpreter's method and apparatus that a kind of machine is blended with human translation | |
CN107330011B (en) | The recognition methods of the name entity of more strategy fusions and device | |
US11113323B2 (en) | Answer selection using a compare-aggregate model with language model and condensed similarity information from latent clustering | |
CN108984529A (en) | Real-time court's trial speech recognition automatic error correction method, storage medium and computing device | |
CN111090727B (en) | Language conversion processing method and device and dialect voice interaction system | |
CN109949799B (en) | Semantic parsing method and system | |
CN109857846B (en) | Method and device for matching user question and knowledge point | |
CN110083710A (en) | It is a kind of that generation method is defined based on Recognition with Recurrent Neural Network and the word of latent variable structure | |
CN114116994A (en) | Welcome robot dialogue method | |
CN111709242A (en) | Chinese punctuation mark adding method based on named entity recognition | |
CN112784696A (en) | Lip language identification method, device, equipment and storage medium based on image identification | |
CN110287482A (en) | Semi-automation participle corpus labeling training device | |
CN112016320A (en) | English punctuation adding method, system and equipment based on data enhancement | |
CN109033073B (en) | Text inclusion recognition method and device based on vocabulary dependency triple | |
CN114676255A (en) | Text processing method, device, equipment, storage medium and computer program product | |
CN111046148A (en) | Intelligent interaction system and intelligent customer service robot | |
CN113468891A (en) | Text processing method and device | |
Neubig et al. | A summary of the first workshop on language technology for language documentation and revitalization | |
CN113326367B (en) | Task type dialogue method and system based on end-to-end text generation | |
CN111553157A (en) | Entity replacement-based dialog intention identification method | |
CN112084788B (en) | Automatic labeling method and system for implicit emotion tendencies of image captions | |
CN114281948A (en) | Summary determination method and related equipment thereof | |
CN114564967A (en) | Semantic annotation and semantic dependency analysis method and device for emotion semantics | |
CN114462428A (en) | Translation evaluation method and system, electronic device and readable storage medium | |
CN112085985A (en) | Automatic student answer scoring method for English examination translation questions |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |