Detailed Description of the Embodiments
To make the purpose, technical solution and advantages of the utility model clearer, the utility model is described in further detail below with reference to the accompanying drawings.
For brevity and clarity, the solution of the utility model is illustrated below through several representative embodiments. The many details in the embodiments serve only to aid understanding of the solution. It is evident, however, that the technical solution of the utility model is not limited to these details when implemented. To avoid unnecessarily obscuring the solution of the utility model, some embodiments are not described in detail and only an outline is given. Hereinafter, "including" means "including but not limited to", and "according to ..." means "at least according to ..., but not limited to only according to ...". In keeping with Chinese usage, when the number of a component is not specified below, the component may be one or more, that is, at least one.
Fig. 1 is a functional block diagram of the simultaneous interpretation mobile phone according to the utility model. A simultaneous interpretation mobile phone is a mobile phone that can perform simultaneous interpretation.
In Fig. 1, the simultaneous interpretation mobile phone includes a display screen, a loudspeaker and a phone mainboard. A language setting unit, a voice collecting unit and a communication unit are arranged on the phone mainboard, wherein:
The language setting unit is configured to set a source language type and a target language type;
The voice collecting unit is configured to collect the source language speech data to be translated;
The communication unit is configured to transmit the source language speech data to be translated and the target language type to the cloud, and to receive from the cloud either cloud text translation data or cloud speech translation data, each obtained by translating the source language speech data to be translated into the target language type;
The display screen is configured to display the cloud text translation data;
The loudspeaker is configured to play the cloud speech translation data.
For example, the user sets the source language type to Chinese and the target language type to English through the language setting unit. The voice collecting unit collects the speech data to be translated (in Chinese) uttered by the user. The communication unit transmits the speech data to be translated and the target language type (i.e. English) to the cloud, and receives from the cloud speech translation data and/or text translation data obtained by translating the speech data into the target language type. The speech translation data is English speech and the text translation data is English text; both are English translations of the Chinese speech data to be translated.
There may also be more than one target language type. For example, the user sets the target language types to English and Italian through the language setting unit. The voice collecting unit collects the speech data to be translated (in Chinese) uttered by the user; the communication unit transmits the speech data to be translated and the target language types (i.e. English and Italian) to the cloud, and receives from the cloud speech translation data and/or text translation data obtained by translating the speech data into the target language types. The speech translation data includes English speech and Italian speech, which are the English and Italian translations, respectively, of the Chinese speech data to be translated. The text translation data includes English text and Italian text, which are likewise the English and Italian translations, respectively, of the Chinese speech data to be translated.
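The data flow described above can be sketched as follows. This is only an illustrative sketch, not the implementation of the utility model: the names TranslationRequest, TranslationResponse and handle_utterance, the language codes, and the callback-style interfaces for the cloud, the display screen and the loudspeaker are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class TranslationRequest:
    audio: bytes             # speech captured by the voice collecting unit
    source_lang: str         # source language type, e.g. "zh" (hypothetical code)
    target_langs: List[str]  # one or more target language types, e.g. ["en", "it"]


@dataclass
class TranslationResponse:
    text: Dict[str, str] = field(default_factory=dict)      # target language -> translated text
    speech: Dict[str, bytes] = field(default_factory=dict)  # target language -> synthesized audio


def handle_utterance(
    request: TranslationRequest,
    send_to_cloud: Callable[[TranslationRequest], TranslationResponse],
    show: Callable[[str, str], None],
    play: Callable[[str, bytes], None],
) -> TranslationResponse:
    """Send one utterance to the cloud and route the results to screen and loudspeaker."""
    response = send_to_cloud(request)
    for lang in request.target_langs:
        if lang in response.text:
            show(lang, response.text[lang])    # display screen shows text translation data
        if lang in response.speech:
            play(lang, response.speech[lang])  # loudspeaker plays speech translation data
    return response
```

Because the cloud may return text, speech, or both, each target language is checked separately for each kind of result.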
Specifically, the communication unit may be implemented as a Bluetooth communication module, an infrared communication module, a wireless mobile communication module, and so on.
In one embodiment, the simultaneous interpretation mobile phone further includes:
A power supply, configured to supply electrical power to the language setting unit, the voice collecting unit, the communication unit, the display screen and the loudspeaker. Specifically, the power supply may be implemented as any of various lithium batteries (for example, a lithium metal battery or a lithium-ion battery).
In one embodiment, the simultaneous interpretation mobile phone further includes: a power switch, configured to switch the power supply on or off.
In one embodiment, the simultaneous interpretation mobile phone further includes:
A local translation unit, configured to, when it is determined that there is no communication connection between the communication unit and the cloud, recognize the source language speech data to be translated as text, translate the text into local text translation data in the target language type, and synthesize the local text translation data into local speech translation data;
The display screen is further configured to display the local text translation data;
The loudspeaker is further configured to play the local speech translation data.
Here, the data is processed locally by embedded speech recognition software, a translation engine and speech synthesis software working together. Because the simultaneous interpretation mobile phone sometimes cannot connect to a network, it is preferably provided with a small translation engine, so that automatic speech recognition (ASR), translation and text-to-speech (TTS) playback are performed on the terminal rather than in the cloud.
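The fallback from cloud to local processing can be sketched as below. This is a sketch under stated assumptions: the function names, the is_connected check and the three local stages passed in as callables are hypothetical and stand in for whatever embedded ASR, translation and TTS software is actually used.

```python
from typing import Callable, Tuple

# A pipeline takes (audio, source language, target language) and
# returns (translated text, translated speech).
Pipeline = Callable[[bytes, str, str], Tuple[str, bytes]]


def make_local_pipeline(recognize, translate, synthesize) -> Pipeline:
    """Chain ASR -> translation -> TTS into one on-device pipeline."""
    def run(audio: bytes, source_lang: str, target_lang: str) -> Tuple[str, bytes]:
        text = recognize(audio, source_lang)                     # embedded speech recognition
        translated = translate(text, source_lang, target_lang)   # small local translation engine
        return translated, synthesize(translated, target_lang)   # speech synthesis
    return run


def translate_utterance(audio, source_lang, target_lang,
                        is_connected: Callable[[], bool],
                        cloud: Pipeline, local: Pipeline) -> Tuple[str, bytes]:
    """Use the cloud when a connection exists, otherwise the local translation unit."""
    pipeline = cloud if is_connected() else local
    return pipeline(audio, source_lang, target_lang)
```

Both paths return the same pair of text and speech, so the display screen and the loudspeaker do not need to know which path produced the result.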
In one embodiment, the simultaneous interpretation mobile phone further includes:
A cache, configured to store the cloud text translation data, the source language speech data to be translated, the cloud speech translation data, the local text translation data, the local speech translation data, and common speech data in the target language type.
Once common speech data in the target language type has been stored in the cache, the voice playing unit can still obtain it from the cache and play it even if the communication unit temporarily cannot establish a communication connection with the cloud, thereby meeting the user's basic needs. For example, the common speech data may contain key information such as the address of the user's hotel, the user's nationality and the user's blood type.
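One simple way to picture such a cache is a lookup table keyed by target language and phrase. The sketch below is an assumption for illustration only; the class name CommonPhraseCache, the phrase identifiers and the in-memory dictionary are not taken from the utility model.

```python
from typing import Dict, Optional, Tuple


class CommonPhraseCache:
    """In-memory store of pre-translated common speech, keyed by target language and phrase id."""

    def __init__(self) -> None:
        self._audio: Dict[Tuple[str, str], bytes] = {}

    def store(self, target_lang: str, phrase_id: str, audio: bytes) -> None:
        # Saved while a connection is available, e.g. the hotel address spoken in English.
        self._audio[(target_lang, phrase_id)] = audio

    def lookup(self, target_lang: str, phrase_id: str) -> Optional[bytes]:
        # Returns None when nothing was cached for this language and phrase.
        return self._audio.get((target_lang, phrase_id))
```

When the cloud is unreachable, the phone can call lookup and play whatever audio it returns.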
In one embodiment, the simultaneous interpretation mobile phone further includes: a wireless network access unit, configured to provide a wireless network access service. The simultaneous interpretation mobile phone can therefore also provide a mobile Wi-Fi service based on the wireless network access unit, which makes it well suited to people travelling abroad. The communication unit of the simultaneous interpretation mobile phone can access the wireless network provided by the wireless network access unit and thereby establish an Internet connection with the cloud.
In one embodiment, the communication unit is further configured to receive voice notification data or text notification data from other simultaneous interpretation mobile phones; the display screen is further configured to display the text notification data; and the loudspeaker is further configured to play the voice notification data.
For example, the other simultaneous interpretation mobile phone may be a phone of the same type carried by a tour guide. By receiving and playing voice notification data from the tour guide, the user can promptly obtain the information the tour guide provides, such as introductions to scenic spots, the meeting place and time, and so on.
In one embodiment, a geographical location information acquiring unit is further arranged on the phone mainboard and is configured to obtain the geographical location information of the simultaneous interpretation mobile phone; the loudspeaker is further configured to play pre-stored voice alert data when the geographical location information of the simultaneous interpretation mobile phone meets a predetermined alert condition.
For example, when it is determined from the geographical location information that the distance from the tour guide exceeds 300 meters, the loudspeaker automatically plays the pre-stored voice alert data.
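The 300-meter condition can be checked, for instance, with the haversine great-circle distance between the phone and the tour guide. The text does not say how the distance is computed, so the formula choice, the function names and the (latitude, longitude) tuple format below are assumptions.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters


def distance_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two points given in degrees (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def should_alert(phone, guide, threshold_m: float = 300.0) -> bool:
    """True when the phone is farther from the guide than the alert threshold."""
    return distance_m(phone[0], phone[1], guide[0], guide[1]) > threshold_m
```

The threshold is a parameter so that other predetermined alert conditions can reuse the same check.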
In one embodiment, the communication unit is further configured to upload the geographical location information of the simultaneous interpretation mobile phone to the cloud;
A human-operator line unit is further arranged on the phone mainboard and is configured to place a voice call to the cloud when triggered, and to establish voice communication with a human agent assigned by the cloud based on the geographical location information.
Based on the geographical location information of the simultaneous interpretation mobile phone, the translator system of the human-operator line automatically assigns an online human agent who is familiar with the scene and with the needs of local merchants, such as cosmetics shops, watch shops, health product shops, baby product shops, golf shops and Michelin restaurants.
In one embodiment, the loudspeaker is further configured to play pre-stored text or voice content data when the geographical location information of the simultaneous interpretation mobile phone meets a predetermined content playing condition. This function lets tourists automatically hear guide content at the appropriate place without the tour guide having to speak, for example introductions to scenic spots, prompts and advertising content.
In one embodiment, the simultaneous interpretation mobile phone further has a Bluetooth module, through which it establishes a Bluetooth connection with a portable microphone. The portable microphone is then used to record the source language speech data to be translated. The communication unit is configured to transmit the source language speech data recorded by the portable microphone and the target language type set through the language setting unit to the cloud, and to receive from the cloud either cloud text translation data or cloud speech translation data, each obtained by translating the source language speech data to be translated into the target language type; the display screen is configured to display the cloud text translation data; and the loudspeaker is configured to play the cloud speech translation data. Working together over the Bluetooth connection, the simultaneous interpretation mobile phone and the portable microphone enable cross-language conversation.
Fig. 2 is a structural diagram of the simultaneous interpretation mobile phone according to the utility model.
In Fig. 2, the elements of the simultaneous interpretation mobile phone include:
A volume knob 1 of the phone;
A power switch 2 of the phone;
A simultaneous interpretation indicator light 3, which shows green during normal operation;
A button 4 for requesting error correction by customer service;
A button 5 which, when triggered, requests live online interpretation of the conversation by customer service (the human-operator line button);
A button 6 which, when triggered, lets the user press and hold to speak (the other party's language) without first turning the phone on (the power turns on automatically);
A button 7 which, when triggered, lets the user press and hold to speak (the user's own language) without first turning the phone on (the power turns on automatically);
A button 8 which, when triggered, replays the translation result;
A phone mainboard 9;
A display screen 10;
A central processing module 11 on the mainboard.
The process of using the device includes:
(1) Turn on the power switch and set the other party's language and the user's own language, for example: other party, English; own side, Chinese;
(2) At any time, press and hold the "press and hold to speak" button for the other party's language or the user's own language and speak into the phone's built-in or external microphone. Even if the phone is not powered on, this function is activated immediately and automatically;
(3) The simultaneous interpretation indicator light turns green, indicating that the simultaneous interpretation mobile phone and the system have started working normally;
(4) Release the "press and hold to speak" button; the alef simultaneous interpretation system translates the speech into the other language and plays it through a clear, high-volume loudspeaker;
(5) Press and hold "request customer service" to ask customer service for live interpretation of the conversation;
(6) If a translation error is found, press "error correction" to have customer service correct it manually;
(7) If the speech was not heard clearly, press and hold "replay" to replay it.
Specifically, the user first turns on the power switch, and the simultaneous interpretation mobile phone connects to the network automatically. The user presses the corresponding button to select the languages to be interpreted (for example, Chinese and English) and holds the phone close to the speaker. When the speaker has finished, the user releases the "press and hold to speak" button and the speech data is sent to the cloud. The cloud converts the speech to text and translates it into the other party's language. The translated speech is returned from the cloud to the simultaneous interpretation mobile phone and played automatically through the loudspeaker (delay < 1 s). By switching the interpreting language, a barrier-free bilingual conversation can be carried out. If necessary, the user can press the human-operator line button; professional customer service staff then provide manual service and return the correct interpretation within 10 seconds. This data is stored automatically in the background.
The interpreting-language switch and the press-and-hold-to-speak switch are positioned where most people's fingers naturally rest, so that they are easy to operate; other, rarely used buttons can be placed at suitable positions on the upper side. The width of the simultaneous interpretation mobile phone may be 4 to 5 cm, so that it is easy to hold in one hand; its length is 18 cm or more when extended and within 10 cm when folded, so that it is easy to carry.
Through the human-operator button, the question-and-answer function is extended into an all-round intelligent device serving each typical travel scenario, providing the user with comprehensive offline services for overseas travel, such as transport, shopping and accommodation. In short: turn on the power switch, and the phone connects to the network automatically; select the language, press and hold to speak, and hold the phone close to the speaker; when the speaker has finished, release the "press and hold to speak" button; the device automatically translates the speech into the other party's language and plays it; if the user needs it, an intelligent three-way call can be made through the human-operator line. This data is stored automatically in the background.
For the American market, pressing a button switches between Chinese and English speech, and in noisy environments the consumer can enter text instead, which is interpreted into speech and text. If the translation is inaccurate, the consumer presses the manual error-correction button; after a professional interpreter has translated the content, the correct result is sent again. The system records the manual correction and learns from it. The consumer only needs to enter into the system the shop he or she wants to visit.
The following describes how the cloud performs the translation.
In the utility model, based on the core technology of computer semantic recognition, the cloud can help a computer identify the precise meaning behind information more intelligently. By processing information in depth and at multiple levels, the computer not only reads its code but also identifies the intention the information is meant to express, so that it communicates with people more intelligently and more naturally. Preferably, the cloud mainly uses the technique of meta-language linear structure plus keywords (i.e. language blocks), and accurately extracts the real intention of the information from the linear structure and keywords of the language. A sentence to be analyzed comprises a linear structure and keywords (i.e. language blocks). The key to semantic recognition is identifying the linear structure of the sentence. The meaning of language is hidden in the linear structure of the sentence, which is equivalent to the constant of the language. Semantics, and even meaning and thought, are all hidden in the linear structure of the sentence, so intention can be identified by analyzing that structure. Keywords are equivalent to the variables of the language. When the corresponding parts (i.e. the variables) are replaced, the semantics are largely preserved, and an accurate retrieval or translation result can be obtained.
Moreover, structural analysis can accurately identify semantics in both bilingual and monolingual settings.
By carrying out linear structure plus keyword analysis sentence by sentence on a vast body of documents, a sufficient set of sentence linear structures and keywords (i.e. language blocks) can be obtained.
For example: 1. Rural tourism as an important component of China's tourism industry and an important support for promoting tourism development. (Example 1); 2. China's economy as an important component of the world economy and an important support for promoting global financial stability. (Example 2). Analysis of these two examples shows the following:
"Rural tourism", "China's tourism industry" and "tourism development" are equivalent to the variables of Example 1, because when the corresponding parts (i.e. the variables) are replaced, the semantics are largely preserved. And "x as an important component of x and an important support for promoting x" (where x denotes a blank) is equivalent to the linear structure of Example 1, that is, the constant of the language, because the meaning of the language is hidden in this linear structure.
Similarly, "China's economy", "the world economy" and "global financial stability" are equivalent to the variables of Example 2, because when the corresponding parts (i.e. the variables) are replaced, the semantics are largely preserved. And "x as an important component of x and an important support for promoting x" (where x denotes a blank) is equivalent to the linear structure of Example 2, that is, the constant of the language, because the meaning of the language is hidden in this linear structure.
It can be seen that the two examples have the same linear structure and differ only in their variables. "x as an important component of x and an important support for promoting x" (where x denotes a blank) can therefore be defined as one linear structure, and "rural tourism", "China's tourism industry", "tourism development", "China's economy", "the world economy" and "global financial stability" can be defined as keywords (i.e. language blocks).
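The relationship between a sentence, its language blocks and its linear structure can be illustrated with a small sketch. It assumes that the language blocks are already known and simply replaces each of them with the blank marker x; the function name linear_structure and the longest-first replacement order are assumptions made here for illustration, not the extraction method of the utility model.

```python
from typing import Iterable


def linear_structure(sentence: str, language_blocks: Iterable[str], slot: str = "x") -> str:
    """Replace every known language block with a blank marker, leaving the linear structure."""
    result = sentence
    # Longest blocks first, so that a short block never splits a longer one that contains it.
    for block in sorted(language_blocks, key=len, reverse=True):
        result = result.replace(block, slot)
    return result
```

Applied to English renderings of Examples 1 and 2 with their respective language blocks, the function yields the same string for both sentences, which is what it means for them to share one linear structure.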
Some common proper nouns and/or gerunds can be treated as variables, but variables are not limited to proper nouns and/or gerunds. In some cases a variable may also be a common phrase or even a long sentence.
In addition, when variables and linear structures are determined, the division may not be unique. For the division with the fewest variables, the corresponding linear structure is called the minimal linear structure. In general, the fewer the variables, the richer the information expressed by the corresponding linear structure, and the more accurate the information retrieved.
Another example:
1. Avatar craze sweeps China. (Example 3); 2. Stock speculation craze sweeps the world. (Example 4)
Analysis of these two examples shows that "Avatar" and "China" are equivalent to the variables of Example 3, because when the corresponding parts (i.e. the variables) are replaced, the semantics are largely preserved. And "x craze sweeps x" (where x denotes a blank) is equivalent to the linear structure of Example 3, that is, the constant of the language, because the meaning of the language is hidden in this linear structure.
Similarly, "stock speculation" and "the world" are equivalent to the variables of Example 4, because when the corresponding parts (i.e. the variables) are replaced, the semantics are largely preserved. And "x craze sweeps x" (where x denotes a blank) is equivalent to the linear structure of Example 4, that is, the constant of the language, because the meaning of the language is hidden in this linear structure.
It can be seen that the two examples have the same linear structure and differ only in their variables. "x craze sweeps x" (where x denotes a blank) can therefore be defined as one linear structure, and "Avatar", "China", "stock speculation" and "the world" can be defined as keywords (i.e. language blocks).
Another example:
1. They call on the European Commission to treat the Market Economy Treatment (MET) application of Chinese enterprises objectively and fairly. (Example 5); 2. FIFA calls on Ireland to treat the result of the World Cup qualifying match with the French team objectively and fairly. (Example 6); 3. The international community calls on the Six-Party Talks to treat the Korean issue objectively and fairly. (Example 7); 4. China calls on the Japanese government to treat the historical issues of World War II objectively and fairly. (Example 8)
Analysis of these four examples shows the following:
"They", "the European Commission" and "the Market Economy Treatment (MET) application of Chinese enterprises" are equivalent to the variables of Example 5, because when the corresponding parts (i.e. the variables) are replaced, the semantics are largely preserved. And "x calls on x to treat x objectively and fairly" (where x denotes a blank) is equivalent to the linear structure of Example 5, that is, the constant of the language, because the meaning of the language is hidden in this linear structure.
Similarly, "FIFA", "Ireland" and "the result of the World Cup qualifying match with the French team" are equivalent to the variables of Example 6, because when the corresponding parts (i.e. the variables) are replaced, the semantics are largely preserved. And "x calls on x to treat x objectively and fairly" (where x denotes a blank) is equivalent to the linear structure of Example 6, that is, the constant of the language, because the meaning of the language is hidden in this linear structure.
Similarly, "the international community", "the Six-Party Talks" and "the Korean issue" are equivalent to the variables of Example 7, because when the corresponding parts (i.e. the variables) are replaced, the semantics are largely preserved. And "x calls on x to treat x objectively and fairly" (where x denotes a blank) is equivalent to the linear structure of Example 7, that is, the constant of the language, because the meaning of the language is hidden in this linear structure.
Similarly, "China", "the Japanese government" and "the historical issues of World War II" are equivalent to the variables of Example 8, because when the corresponding parts (i.e. the variables) are replaced, the semantics are largely preserved. And "x calls on x to treat x objectively and fairly" (where x denotes a blank) is equivalent to the linear structure of Example 8, that is, the constant of the language, because the meaning of the language is hidden in this linear structure.
It can be seen that the four examples have the same linear structure and differ only in their variables. "x calls on x to treat x objectively and fairly" (where x denotes a blank) can therefore be defined as one linear structure, and "they", "the European Commission", "the Market Economy Treatment (MET) application of Chinese enterprises", "FIFA", "Ireland", "the result of the World Cup qualifying match with the French team", "the international community", "the Six-Party Talks", "the Korean issue", "China", "the Japanese government" and "the historical issues of World War II" can be defined as keywords (i.e. language blocks).
Based on the above analysis, by segmenting a large number of documents (including web documents, blogs, textbooks, various electronic documents and so on) in this way, a sufficient linear structure library and keyword (i.e. language block) library can be obtained.
The translation method used by the cloud, based on semantic recognition, is described in detail below.
First: the passage-level text is segmented into character strings at punctuation marks, and language linear structures and language blocks are extracted from the resulting character strings.
Here, the passage-level text (for example, an article or an editorial) is first segmented at punctuation marks into several character strings, and the language linear structures and language blocks are then extracted from these character strings in turn (for the specific extraction steps, see the analysis of the preceding examples).
"Passage-level" here does not imply any specific limit on the number of words. In essence, as long as there are words, and the sentences formed from these words are meaningful, the words can be regarded as constituting passage-level text.
More specifically, the passage-level text can be segmented into character strings at punctuation marks such as the period, question mark, exclamation mark, comma, enumeration comma, semicolon, colon, quotation marks, brackets, dash, ellipsis, emphasis mark, hyphen, interpunct, title marks, proper-name mark, annotation mark, concealment mark, lacuna mark, slash, identification mark, substitution mark, leader dots and/or arrow. For example, the text between any two punctuation marks can be extracted as a character string (at the start of an article, only one punctuation mark is needed).
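Segmentation at punctuation marks can be sketched with a regular expression. The set of marks below is a small assumed subset of the list above, chosen for illustration, and the name split_into_strings is hypothetical.

```python
import re
from typing import List

# Assumed subset of Chinese and ASCII punctuation marks used as cut points.
PUNCTUATION = "。？！，、；：（）《》….?!,;:()"
_CUT = re.compile("[" + re.escape(PUNCTUATION) + "]+")


def split_into_strings(text: str) -> List[str]:
    """Cut passage-level text at punctuation marks and drop empty pieces."""
    return [piece.strip() for piece in _CUT.split(text) if piece.strip()]
```

Runs of adjacent punctuation marks are treated as a single cut point, and the text before the first mark becomes the first character string, matching the remark that only one mark is needed at the start of an article.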
When determining keywords (language blocks), a passage-based local substring statistics table (a hash table) can be used as a temporary auxiliary dictionary. That is, a substring that appears in the temporary auxiliary dictionary can be determined to be a language block. However, a substring that does not appear in the local substring statistics table may also be determined to be a language block. A segmentation path tree based on multi-path planning can also be used as the segmentation model: character encodings such as English (ASCII), Simplified Chinese (GBK/GB18030) and Traditional Chinese (BIG5 for Taiwan, China; BIG5-HKSCS for Hong Kong) are first uniformly converted to UTF-8, segmentation is then performed, and language blocks are extracted on the basis of multiple correct segmentation results.
After the language blocks have been extracted, the remaining part is the linear structure.
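Two of the steps above lend themselves to a short sketch: converting legacy encodings to UTF-8, and building a local substring statistics table from the strings of one passage. The function names, the substring length limits and the rule that a substring seen at least twice is a candidate language block are assumptions for illustration; the text does not state how the table is filled or thresholded.

```python
from collections import Counter
from typing import Iterable, Set


def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode bytes from a legacy encoding (e.g. "gbk", "big5") and re-encode as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


def substring_table(strings: Iterable[str], min_len: int = 2, max_len: int = 4) -> Counter:
    """Count every substring of min_len..max_len characters across the passage."""
    table: Counter = Counter()
    for s in strings:
        for n in range(min_len, max_len + 1):
            for i in range(len(s) - n + 1):
                table[s[i:i + n]] += 1
    return table


def candidate_blocks(strings: Iterable[str], min_count: int = 2) -> Set[str]:
    """Substrings that recur in the passage (assumed heuristic for language block candidates)."""
    return {sub for sub, count in substring_table(strings).items() if count >= min_count}
```

A Counter is itself a hash table, so it plays the role of the temporary auxiliary dictionary mentioned above.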
Then: the extracted language linear structures and language blocks are each arranged.
Here, the arranging specifically includes:
For each qualifying language block, the document number, paragraph, sentence number, word order number, HTML information and so on of the place where the language block occurs are compressed into one structure and put into the living document where the language block is located. A language block can be any character string and mainly falls into the following categories: dictionary entries, proper names, words inside proper names, phrases and collocations of all kinds, n-grams, consecutive stopwords, word+number combinations, arbitrary ASCII strings, postal codes, telephone numbers and so on.
Likewise, for each qualifying language linear structure, the document number, paragraph, sentence number, word order number, HTML information and so on of the place where the language linear structure occurs can be compressed into one structure and put into the living document where the language linear structure is located.
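Compressing the position information into one structure can be pictured as packing a fixed-width binary record. The field widths, the one-byte HTML flag field and the names Hit, pack_hit and unpack_hit below are assumptions; the text does not specify the layout.

```python
import struct
from typing import NamedTuple


class Hit(NamedTuple):
    doc_id: int      # document number
    paragraph: int   # paragraph number within the document
    sentence: int    # sentence number within the paragraph
    position: int    # word order number within the sentence
    html_flags: int  # assumed one-byte bit field for HTML information


# Little-endian with no padding: 4 + 2 + 2 + 2 + 1 = 11 bytes per occurrence.
_HIT = struct.Struct("<IHHHB")


def pack_hit(hit: Hit) -> bytes:
    """Compress one occurrence of a language block or linear structure into 11 bytes."""
    return _HIT.pack(*hit)


def unpack_hit(data: bytes) -> Hit:
    """Restore the occurrence from its packed form."""
    return Hit(*_HIT.unpack(data))
```

The same record shape can serve both language blocks and language linear structures, since both are located by the same kind of position information.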
Then: a language linear structure sub-index and a language block sub-index are created, and the two sub-indexes are merged to form a whole index.
Here, all language block index terms in memory are written to the language block vocabulary file; the inverted hits are merged and written to the inv_lists file; and the association information between the two is written to the dictionary file. These three files constitute one complete, independent index run, namely the language block sub-index.
Likewise, all linear structure index terms in memory are written to the linear structure vocabulary file; the inverted hits are merged and written to the inv_lists file; and the association information between the two is written to the linear structure dictionary file. These three files constitute one complete, independent index run, namely the linear structure sub-index.
Finally, the language linear structure sub-index and the language block sub-index are merged to form the whole index.
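The two sub-indexes and their merger can be sketched in memory as inverted indexes that map each term to the places where it occurs. This sketch leaves out the vocabulary, inv_lists and dictionary files described above; the function names and the (document number, sentence number) posting format are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Posting = Tuple[int, int]  # (document number, sentence number)


def build_subindexes(analyzed: Iterable[Tuple[int, int, str, List[str]]]):
    """Build one inverted index for linear structures and one for language blocks.

    Each item of `analyzed` is (document number, sentence number, structure, blocks).
    """
    structure_index: Dict[str, List[Posting]] = defaultdict(list)
    block_index: Dict[str, List[Posting]] = defaultdict(list)
    for doc_id, sentence_no, structure, blocks in analyzed:
        structure_index[structure].append((doc_id, sentence_no))
        for block in blocks:
            block_index[block].append((doc_id, sentence_no))
    return dict(structure_index), dict(block_index)


def merge_subindexes(structure_index, block_index) -> Dict[Tuple[str, str], List[Posting]]:
    """Merge both sub-indexes into one whole index keyed by (kind, term)."""
    whole: Dict[Tuple[str, str], List[Posting]] = {}
    for term, postings in structure_index.items():
        whole[("structure", term)] = sorted(postings)
    for term, postings in block_index.items():
        whole[("block", term)] = sorted(postings)
    return whole
```

Keeping the kind in the key means a string that happens to be both a language block and a linear structure does not collide in the whole index.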
Finally: language linear structures and language blocks are extracted from the user's retrieval input character string, and, according to the whole index, information matching the language linear structures and language blocks extracted from the user's retrieval input is fed back to the user.
Here, the linear structure and language blocks are first extracted from the user's retrieval input character string. For example, if the user enters "I really like eating the big apples grown in Yantai.", the language blocks "I" and "the big apples grown in Yantai" and the linear structure "x really like eating x" (where x denotes a blank) are extracted. The whole index is then searched for information matching the linear structure "x really like eating x" and the language blocks "I" and "the big apples grown in Yantai", and the results are presented to the user in descending order of matching degree.
In one embodiment, the more characters the language linear structure extracted from the user's retrieval input has in common with a language linear structure in the whole index, the higher the matching degree is considered to be.
In one embodiment, a language linear structure repetition weight and a language block repetition weight may also be preset;
a first overlap index between the language linear structure extracted from the user's retrieval input and a language linear structure in the whole index is calculated based on the language linear structure repetition weight, and a second overlap index between the language block extracted from the user's retrieval input and a language block in the whole index is calculated based on the language block repetition weight;
the higher the sum of the first overlap index and the second overlap index, the higher the matching degree.
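One way to read the weighted matching described above is sketched below. The text does not give the formula for the overlap indices, so everything here is an assumption: the overlap is taken to be the number of shared characters, the weights 0.6 and 0.4 are arbitrary, and the names `shared_chars`, `matching_degree` and `rank` are hypothetical.

```python
from collections import Counter

def shared_chars(a, b):
    """Number of characters two strings have in common (multiset overlap)."""
    return sum((Counter(a) & Counter(b)).values())

def matching_degree(query, entry, structure_weight=0.6, block_weight=0.4):
    """Sum of the weighted structure overlap and the weighted block overlap.

    query and entry are (linear_structure, [language_blocks]) pairs.
    The weights are illustrative values, not taken from the source.
    """
    q_structure, q_blocks = query
    e_structure, e_blocks = entry
    first_overlap = structure_weight * shared_chars(q_structure, e_structure)
    # Each query block is scored against its best-matching index block.
    second_overlap = block_weight * sum(
        max((shared_chars(qb, eb) for eb in e_blocks), default=0)
        for qb in q_blocks
    )
    return first_overlap + second_overlap

def rank(query, entries):
    """Order index entries by matching degree, highest first."""
    return sorted(entries, key=lambda e: matching_degree(query, e), reverse=True)

query = ("x likes x", ["I", "apples"])
exact = ("x likes x", ["I", "apples"])
other = ("x hates x", ["he", "pears"])
print(rank(query, [other, exact])[0] == exact)  # True
```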
Feeding back to the user the information that matches the language linear structures and language blocks extracted from the user's retrieval input may specifically include:
retrieving the language linear structure and the language blocks of the input character string in the whole index, so as to determine the language linear structure in the whole index that corresponds to the language linear structure of the input character string, and to determine the language blocks in the whole index that correspond to the language blocks of the input character string;
feeding back to the user the information associated with the corresponding language linear structure and the corresponding language blocks in the whole index.
The flow of the present invention can be applied in a variety of practical applications, such as information retrieval and multilingual translation.
When applied to multilingual translation, assume that the user's retrieval input character string is expressed in a first language. In this case, the language linear structure and language blocks expressed in the first language are extracted from the user's retrieval input character string; the language linear structure and language blocks expressed in a second language that correspond to those expressed in the first language are then determined;
according to the whole index, information that matches the language linear structure and language blocks expressed in the second language, and that is likewise expressed in the second language, is fed back to the user. Here, the first language may be Chinese and the second language English, Japanese, Korean, Arabic, Spanish, Portuguese, French, Russian, etc. Alternatively, the first language may be English, Japanese, Korean, Arabic, Spanish, Portuguese, French, Russian, etc., and the second language Chinese.
Example: the user wishes to translate the Chinese sentence meaning "I will go to Shanghai" into English.
In this case, the retrieval input character string entered by the user is "I will go to Shanghai", expressed in Chinese. First, the language linear structure expressed in Chinese (i.e., "x will go to x", where x is a blank) and the language blocks expressed in Chinese (I, Shanghai) are extracted from the user's retrieval input character string. Then the language linear structure expressed in English that corresponds to the Chinese language linear structure (i.e., "x want to go to x") is determined, and the language blocks expressed in English that correspond to the Chinese language blocks (i.e., I, Shanghai) are determined. Finally, the language blocks and the linear structure are combined into the translated sentence "I want to go to Shanghai", which is presented to the user.
Further, information that matches the linear structure (x want to go to x) and the language blocks (I, Shanghai) and is expressed in the second language may also be fed back to the user according to the whole index, making it easier for the user to retrieve English information related to "I want to go to Shanghai".
In the above process, a high-performance single-pass in-memory inversion algorithm is applied by way of example, and no temporary disk files are generated. Therefore, before the memory contents are output, the system incurs no file I/O overhead other than for MAP data. Moreover, index terms need not be numbered, and no sorting is performed on index terms (whether numbers or in-memory string pointers). In addition, the method uses all available free physical memory for inversion. These properties give the inversion method excellent time and space efficiency and allow it to support a family of efficient methods for dynamic index merging and index updating. An inverted index with these characteristics is also well suited to distributed processing.
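The properties listed above (one pass, no temporary files, no term numbering, no sorting of terms) can be illustrated with a hash-based in-memory inverted index. This is a generic sketch, not the algorithm of the utility model; the function name and the posting format `(doc_id, position)` are assumptions.

```python
from collections import defaultdict

def build_inverted_index(documents):
    """Build term -> postings in a single pass, entirely in memory.

    documents is a list of term lists. Terms are used directly as hash keys,
    so they are never numbered or sorted, and postings come out in document
    order because documents are read in order.
    """
    postings = defaultdict(list)
    for doc_id, terms in enumerate(documents):
        for position, term in enumerate(terms):
            postings[term].append((doc_id, position))
    return postings

index = build_inverted_index([
    ["x", "likes", "apples"],
    ["x", "likes", "Shanghai"],
])
print(index["likes"])     # [(0, 1), (1, 1)]
print(index["Shanghai"])  # [(1, 2)]
```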
Another key feature of the above process is that its lookup data structure has a caching function, which makes it possible to support an index vocabulary (i.e., vocabulary files) of almost arbitrary size. The vocabulary files themselves are kept on disk, and the number of index entries they can hold is unlimited (on a 64-bit file system) and can reach several hundred million. With the caching function, on an x64 server with 4 to 6 GB of memory, the index vocabulary can achieve query performance close to that of a cluster query system comprising several servers of the same or higher configuration.
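The text does not say how the caching function is built. As a hedged illustration, the sketch below puts a least-recently-used cache in front of a disk lookup; the class name `CachedVocabulary`, the `load_entry` callable that stands in for reading the on-disk vocabulary file, and the capacity value are all assumptions.

```python
from collections import OrderedDict

class CachedVocabulary:
    """LRU cache in front of a slow (for example on-disk) vocabulary lookup."""

    def __init__(self, load_entry, capacity=100_000):
        self._load_entry = load_entry  # hypothetical disk reader: term -> entry
        self._capacity = capacity
        self._cache = OrderedDict()

    def lookup(self, term):
        if term in self._cache:
            self._cache.move_to_end(term)    # mark as most recently used
            return self._cache[term]
        entry = self._load_entry(term)       # cache miss: go to disk
        self._cache[term] = entry
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)  # evict the least recently used
        return entry

disk_reads = []

def fake_disk(term):
    """Stand-in for a disk read; records each access."""
    disk_reads.append(term)
    return len(term)

vocab = CachedVocabulary(fake_disk, capacity=2)
vocab.lookup("apple")
vocab.lookup("apple")  # served from memory, no second disk read
print(disk_reads)      # ['apple']
```

Only the hot part of the vocabulary stays in memory, so the size of the on-disk file is bounded by the file system rather than by RAM.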
Moreover, an index term can be any character string. The main term categories are: dictionary entries, proper names, words within proper names, phrases and collocations of all kinds, n-grams, consecutive stopwords, word + numeral combinations, arbitrary ASCII strings, postal codes, telephone numbers, etc.
Based on the above detailed description, an embodiment of the present invention also proposes a portable simultaneous interpretation system.
Fig. 3 is a structural diagram of the portable simultaneous interpretation system of the utility model.
As shown in Fig. 3, the system includes an information collection apparatus 301, a data storage device 302, a natural language processing device 303, an index storage device 304 and a retrieval service device 305, wherein:
the information collection apparatus 301 is used for scanning and probing the Internet and crawling information on the Internet;
the data storage device 302 is used for storing the Internet information crawled by the information collection apparatus, and preferably provides fast location and lookup of the Internet information;
the natural language processing device 303 is used for segmenting the chapter-level text stored in the data storage device 302 into character strings using symbols, and extracting language linear structures and language blocks from the segmented character strings; for organizing the extracted language linear structures and language blocks respectively; and for creating a language linear structure subindex and a language block subindex and merging the two subindexes to form the whole index;
the index storage device 304 is used for storing the whole index generated by the natural language processing device 303;
the retrieval service device 305 is used for converting the voice data to be translated, provided by the simultaneous interpretation mobile phone shown in Fig. 1, into a retrieval input character string, extracting language linear structures and language blocks from the retrieval input character string, and, according to the whole index stored in the index storage device, feeding back to the user information that matches the language linear structures and language blocks extracted from the user's retrieval input.
The information collection apparatus 301 may further receive information upload services (for example, news resources) provided by newspapers and periodicals, broadcasting and television, and various media members.
Moreover, the retrieval service device 305 may let ordinary users query news free of charge, and open premium services to professional users after registration and payment.
Preferably, the natural language processing device 303 is used for segmenting the chapter-level text into character strings according to the full stop, question mark, exclamation mark, comma, enumeration comma, semicolon, colon, quotation marks, brackets, dash, ellipsis, emphasis mark, hyphen, interpunct, book title marks, proper noun mark, annotation mark, concealment mark, missing-character mark, slash, identification mark, substitution mark, continuation dots and arrow mark.
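Punctuation-based segmentation of this kind can be sketched with a regular expression. The delimiter set below covers only part of the list above and is an assumption, as are the names `DELIMITERS` and `split_text`; a real implementation would also need care with cases such as decimal points.

```python
import re

# Illustrative subset of the delimiters listed above (Chinese and ASCII forms).
# \u2014 is the dash, \u2026 the ellipsis, \u00b7 the interpunct.
DELIMITERS = "。？！，、；：“”‘’（）《》\u2014\u2026\u00b7/" + ".?!,;:\"'()"
_SPLIT_PATTERN = re.compile("[" + re.escape(DELIMITERS) + "]+")

def split_text(text):
    """Cut chapter-level text into character strings at punctuation marks."""
    pieces = _SPLIT_PATTERN.split(text)
    return [p.strip() for p in pieces if p.strip()]

print(split_text("我要去上海。你呢？I agree, thanks!"))
# ['我要去上海', '你呢', 'I agree', 'thanks']
```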
Preferably, the natural language processing device 303 is used for taking a chapter-based local substring statistics table as a temporary auxiliary dictionary and a segmentation path tree based on multi-path planning as the segmentation model, and uniformly converting the character encoding of the chapter-level text to the UTF-8 encoding format; and for segmenting the chapter-level text, after conversion to UTF-8, into character strings using symbols.
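The encoding unification step can be sketched as follows. The source does not say how the original encoding of a text is determined; this sketch simply tries a list of candidate encodings in order, which is an assumption, and such guessing can misidentify text because some legacy encodings accept almost any byte sequence.

```python
def to_utf8(raw, candidate_encodings=("utf-8", "gb18030", "big5")):
    """Re-encode raw bytes as UTF-8, trying candidate source encodings in order.

    The candidate list is illustrative; where the true encoding is known
    (for example from an HTTP header) it should be used directly.
    """
    for encoding in candidate_encodings:
        try:
            return raw.decode(encoding).encode("utf-8")
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the input")

legacy_bytes = "上海".encode("gb18030")
print(to_utf8(legacy_bytes) == "上海".encode("utf-8"))  # True
```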
Moreover, the retrieval service device 305 may be used for feeding back to the user, in descending order of matching degree, the information that matches the language linear structures and language blocks extracted from the user's retrieval input.
In one embodiment, the retrieval service device 305 is used for feeding back to the user, in descending order of the matching degree of the language linear structures and language blocks, the information that matches the language linear structures and language blocks extracted from the user's retrieval input. Preferably, the more characters the language linear structure extracted from the user's retrieval input shares with a language linear structure in the whole index, the higher the matching degree.
In one embodiment, the retrieval service device 305 is further used for presetting a language linear structure repetition weight and a language block repetition weight; calculating, based on the language linear structure repetition weight, a first overlap index between the language linear structure extracted from the user's retrieval input and a language linear structure in the whole index; and calculating, based on the language block repetition weight, a second overlap index between the language block extracted from the user's retrieval input and a language block in the whole index; wherein the higher the sum of the first overlap index and the second overlap index, the higher the matching degree.
In one embodiment, the retrieval service device 305 is used for retrieving the language linear structure and the language blocks of the input character string in the whole index, so as to determine the language linear structure in the whole index that corresponds to the language linear structure of the input character string, and to determine the language blocks in the whole index that correspond to the language blocks of the input character string; and for feeding back to the user the information associated with the corresponding language linear structure and the corresponding language blocks in the whole index.
In one embodiment, the retrieval service device 305 is used for extracting, from the user's retrieval input character string, the language linear structure and language blocks expressed in a first language; determining the language linear structure and language blocks expressed in a second language that correspond to those expressed in the first language; and, according to the whole index, feeding back to the user information that matches the language linear structure and language blocks expressed in the second language and that is likewise expressed in the second language.
Optionally, the first language is English, Japanese, Korean, Arabic, Spanish, Portuguese, French, Russian, etc., and the second language is Chinese. The first language may also be Chinese, with the second language being English, Japanese, Korean, Arabic, Spanish, Portuguese, French, Russian, etc.
In summary, the language setting unit is used for setting the source language type and the target language type; the voice collecting unit is used for acquiring the source language voice data to be translated; the communication unit is used for transmitting the source language voice data to be translated and the target language type to the cloud, and for receiving from the cloud either cloud text translation data, translated from the source language voice data and conforming to the target language type, or cloud voice translation data, translated from the source language voice data and conforming to the target language type; the display unit is used for displaying the cloud text translation data, or the cloud voice translation data is played. Therefore, voice translation can be realized with a portable simultaneous interpretation mobile phone, which can significantly reduce cost.
Moreover, the simultaneous interpretation mobile phone may also include a wireless network access unit so that the user can access the Internet. The simultaneous interpretation mobile phone can also receive various kinds of information, and its location can be monitored. When no network connection is available, the simultaneous interpretation mobile phone can still provide pre-stored common voice data for the user's convenience.
The above are only preferred embodiments of the utility model and are not intended to limit its scope of protection. Any modification, equivalent replacement or improvement made within the spirit and principles of the utility model shall fall within the scope of protection of the utility model.