CN101556745A - Device and method for providing information - Google Patents

Publication number
CN101556745A
Authority
CN
China
Prior art keywords
information
mistake
key element
language message
wrong
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2008100911610A
Other languages
Chinese (zh)
Inventor
刘宏建
周泉
永松健司
布社辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hitachi Ltd
Original Assignee
Hitachi Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi Ltd
Priority to CNA2008100911610A
Publication of CN101556745A

Abstract

The invention relates to a device and a method for providing information. The device comprises a storage unit for storing element misrecognition information of language information (information on how an element of the language information, such as its pronunciation or translation, is wrongly identified); a receiving unit for receiving text input by a user; and a processing unit for extracting, from the storage unit, the element misrecognition information whose language information corresponds to language information in the received text, and using it as the element misrecognition information of the language information in the received text. The device and the method can thus provide the user, on the basis of the received text, with the element misrecognition information of the language information in that text.

Description

Device and method for providing information
Technical field
The present invention relates to a device and a method for providing information.
Background art
Languages can be studied with electronic devices such as electronic dictionaries, PDAs and language learning machines. With these devices, the correct pronunciation of sentences and words can be heard from a loudspeaker, and the pinyin of each word can be shown on a display screen at the same time. In this way, the user can obtain useful information such as the correct pronunciation of each word or the pauses between prosodic words in a sentence.
In practice, many words are easily mispronounced. Although a device such as an electronic dictionary can supply the correct pronunciation, it rarely leaves a deep impression on the user. In addition, a sentence may contain several words that are easily mispronounced, some of which are more error-prone than others. In this case, the user would like to be reminded according to how easily each word is mispronounced.
In addition, many languages contain a large number of polyphonic characters and polyphonic words, whose pronunciation differs greatly from one context to another. In this case, too, the user needs to be given the necessary reminders.
Furthermore, because users from different countries have different pronunciation habits, a word that is easy to pronounce correctly for users from one country may be very difficult for users from another. This often happens among users with different language backgrounds. The same phenomenon also commonly occurs between different regions of one country: because a country may have several dialects, a word that is easy to pronounce correctly in one region may be very difficult for users from another region. In these cases, users from different countries and regions need to be given the necessary reminders.
In language learning, it is not only pronunciation that causes difficulty: users with different language backgrounds also easily mistake the meaning of certain words or sentences. This often occurs where the meaning of a word is easily misinterpreted. In similar languages such as Chinese and Japanese in particular, many words are identical or similar in written form but may have entirely different meanings.
Against this background, Japanese Patent Laid-Open No. 2001-249679 (理工学振興会, "Foreign language self-study system") displays the result of speech recognition by means of a speech recognition device and a sound analysis device. By judging the sound features of the user's language background and of the language to be learned, it can draw the user's attention to difficult pronunciations; it also judges the correctness of a pronunciation by comparing the actual pronunciation with the content of the speech recognition. Although this method does involve emphasis, its processing is very complicated.
Summary of the invention
To eliminate the above problems of existing language learning devices, the object of the present invention is to provide a device and a method for providing information that, based on received text, provide the user with the element misrecognition information of the language information in that text, that is, information on how an element of the language information (such as its pronunciation or translation) tends to be misrecognized.
To achieve this object, a device for providing information according to the present invention comprises:
a storage unit for storing element misrecognition information of language information;
a receiving unit for receiving text input by a user; and
a processing unit for extracting, from the storage unit, the element misrecognition information whose language information corresponds to language information in the received text, as the element misrecognition information of the language information in the received text.
To achieve this object, a method for providing information according to the present invention comprises:
storing element misrecognition information of language information;
receiving text input by a user; and
extracting, from the stored element misrecognition information, the element misrecognition information whose language information corresponds to language information in the received text, as the element misrecognition information of the language information in the received text.
As can be seen from the above, the device and method for providing information of the present invention need neither perform speech recognition nor compare speech recognition results; they can provide the user with the element misrecognition information of the language information in the received text on the basis of the received text alone.
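The three steps of the method (store, receive, extract) can be sketched as a plain lookup. This is only an illustration under stated assumptions: the class name `InformationProvider`, its method names, and the use of English glosses as keys are hypothetical, and the received text is assumed to have been split into items already.

```python
from collections import defaultdict


class InformationProvider:
    """Hypothetical stand-in for the device: storage, receiving and processing."""

    def __init__(self):
        # Storage unit: language information -> stored misrecognition information.
        self._storage = defaultdict(list)

    def store(self, language_info, misrecognition):
        """Step 1: store element misrecognition information."""
        self._storage[language_info].append(misrecognition)

    def provide(self, received_items):
        """Steps 2 and 3: take the received items and extract matching entries."""
        return {
            item: list(self._storage[item])
            for item in received_items
            if item in self._storage  # membership test does not create new keys
        }


provider = InformationProvider()
provider.store("lion", "Si1")
provider.store("lion", "Si4")
provider.store("four", "Shi4")
print(provider.provide(["lion", "and", "four"]))
# {'lion': ['Si1', 'Si4'], 'four': ['Shi4']}
```

No speech input is involved at any point; the lookup depends only on the received text, which is the property the summary emphasizes.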
Brief description of the drawings
Fig. 1 is a schematic diagram of a device for providing information according to one embodiment of the invention.
Fig. 2 is a schematic diagram of the storage unit of one embodiment of the invention.
Figs. 3a-3c show examples of the general word mispronunciation corpus, the individual-user word mispronunciation corpus and the language-background-specific word mispronunciation corpus of one embodiment of the invention, respectively.
Figs. 4a-4c show examples of the general word mistranslation corpus, the individual-user word mistranslation corpus and the language-background-specific word mistranslation corpus of one embodiment of the invention, respectively.
Figs. 5a-5c show examples of the general sentence mistranslation corpus, the individual-user sentence mistranslation corpus and the language-background-specific sentence mistranslation corpus of one embodiment of the invention, respectively.
Fig. 6 is a structural diagram of the processing unit of one embodiment of the invention.
Fig. 7 shows a first example of the emphasis information of one embodiment of the invention.
Fig. 8 shows a second example of the emphasis information of one embodiment of the invention.
Fig. 9 shows a third example of the emphasis information of one embodiment of the invention.
Detailed description of the embodiments
Fig. 1 is a schematic diagram of a device for providing information according to one embodiment of the invention. As shown in Fig. 1, the device 10 for providing information of this embodiment comprises a receiving unit 101, a storage unit 102, a processing unit 103, a display unit 104 and a sound unit 105.
The receiving unit 101 receives the text input by the user, the language category information of the user's language background, user identification information and function selection information. The text may comprise one or more words or one or more sentences; words and sentences are both language information, but language information is not limited to words and sentences. The user's language background may be the user's mother tongue and/or another language the user is familiar with, for example a foreign language or a dialect. The function selection information indicates which function the user wants the device 10 to perform.
The storage unit 102 stores, without being limited to, the element misrecognition information of language information, the language category information of the language background of the people who made each misrecognition or the identification information of the user who made it, and the frequency information corresponding to the element misrecognition information.
The elements of language information include its pronunciation, its translation and so on, and the element misrecognition information of language information accordingly includes pronunciation misrecognition information, translation misrecognition information and so on. The pronunciation misrecognition information includes the mispronunciation information of words; for example, "si1" (where 1 denotes the high level tone) is mispronunciation information for the Chinese word for "lion". The translation misrecognition information includes the mistranslation information of words and of sentences. For example, the correct Japanese translation of the Chinese word for "automobile" is "自動車", but some people mistakenly translate it as the Japanese word "汽車"; in that case the Japanese translation "汽車" is mistranslation information for the Chinese word for "automobile".
In this embodiment, the element misrecognition information stored in the storage unit 102 is obtained by collecting statistics over a large population of people with various language backgrounds. The frequency information corresponding to a piece of element misrecognition information is the frequency with which people make that misrecognition.
The storage unit 102 may also store a dictionary containing the correct element information of language information, including the correct pronunciation of words, the correct translation of words, the correct translation of sentences and so on.
Based on what is stored in the storage unit 102 (the element misrecognition information, the language category information of the language background of the people who made each misrecognition or the identification information of the user who made it, the corresponding frequency information, and the dictionary), the processing unit 103 generates emphasis information for the received text that is tailored to the user's language background, together with the correct element information of the language information in the received text.
The display unit 104 and the sound unit 105 form the output unit of the device 10, which outputs the emphasis information and the correct information of the received text to the user. The display unit 104 displays visual information, for example the incorrect pinyin spelling of a word, the mistranslation information of a word, the correct translation of a word, the mistranslation information of a sentence and the correct translation of a sentence. The sound unit 105 outputs audio information, for example the incorrect pronunciation and the correct pronunciation of a word.
Fig. 2 is a schematic diagram of the storage unit of one embodiment of the invention. As shown in Fig. 2, the storage unit 102 comprises a word mispronunciation corpus, a word mistranslation corpus, a sentence mistranslation corpus and a dictionary.
The word mispronunciation corpus further comprises a general word mispronunciation corpus 301, an individual-user word mispronunciation corpus 302 and a language-background-specific word mispronunciation corpus 303. The general word mispronunciation corpus 301 stores the mispronunciation information of words that people in general tend to mispronounce, obtained from statistics over people of various language backgrounds, together with the corresponding frequency information. The individual-user word mispronunciation corpus 302 stores the mispronunciation information of words that an individual user tends to mispronounce, obtained from statistics on that user, together with the identification information of the user who made each mispronunciation and the corresponding frequency information. The language-background-specific word mispronunciation corpus 303 stores the mispronunciation information of words that people of a given language background tend to mispronounce, obtained from statistics over people of different language backgrounds, together with the language category information of the language background of the people who made each mispronunciation and the corresponding frequency information.
The word mistranslation corpus further comprises a general word mistranslation corpus 304, an individual-user word mistranslation corpus 305 and a language-background-specific word mistranslation corpus 306. The general word mistranslation corpus 304 stores the mistranslation information of words that people in general tend to mistranslate, obtained from statistics over people of various language backgrounds, together with the corresponding frequency information. The individual-user word mistranslation corpus 305 stores the mistranslation information of words that an individual user tends to mistranslate, obtained from statistics on that user, together with the identification information of the user who made each mistranslation and the corresponding frequency information. The language-background-specific word mistranslation corpus 306 stores the mistranslation information of words that people of a given language background tend to mistranslate, obtained from statistics over people of different language backgrounds, together with the language category information of the language background of the people who made each mistranslation and the corresponding frequency information.
The sentence mistranslation corpus further comprises a general sentence mistranslation corpus 307, an individual-user sentence mistranslation corpus 308 and a language-background-specific sentence mistranslation corpus 309. The general sentence mistranslation corpus 307 stores the mistranslation information of sentences that people in general tend to mistranslate, obtained from statistics over people of various language backgrounds, together with the corresponding frequency information. The individual-user sentence mistranslation corpus 308 stores the mistranslation information of sentences that an individual user tends to mistranslate, obtained from statistics on that user, together with the identification information of the user who made each mistranslation and the corresponding frequency information. The language-background-specific sentence mistranslation corpus 309 stores the mistranslation information of sentences that people of a given language background tend to mistranslate, obtained from statistics over people of different language backgrounds, together with the language category information of the language background of the people who made each mistranslation and the corresponding frequency information.
The dictionary 310 stores the correct pronunciation and correct translation of words and the correct translation of sentences. In addition, a tone sandhi dictionary 311 stores the tone-change rules needed to apply tone changes to words.
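The nine corpora share one record shape: an item (word or sentence), the error made, a frequency, and, depending on the corpus, a user identifier or a language category. A minimal sketch of that shape follows; the names `Scope`, `CorpusEntry` and `applies_to` are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Scope(Enum):
    GENERAL = "general"                  # e.g. corpora 301, 304, 307
    INDIVIDUAL = "individual user"       # e.g. corpora 302, 305, 308
    BACKGROUND = "language background"   # e.g. corpora 303, 306, 309


@dataclass(frozen=True)
class CorpusEntry:
    item: str                        # the word or sentence
    error: str                       # the mispronunciation or mistranslation
    frequency: int                   # how often the error was observed
    user_id: Optional[str] = None    # only used by individual-user corpora
    language: Optional[str] = None   # only used by language-background corpora


def applies_to(entry, scope, item, user_id=None, languages=()):
    """Return True if the entry should be extracted for this item and user."""
    if entry.item != item:
        return False
    if scope is Scope.INDIVIDUAL:
        return entry.user_id == user_id
    if scope is Scope.BACKGROUND:
        return entry.language in languages
    return True  # general corpora match on the item alone


entry = CorpusEntry("lion", "Si4", 68, language="Japanese")
print(applies_to(entry, Scope.BACKGROUND, "lion", languages=("Japanese",)))  # True
```

Keeping the optional fields on a single record type is a design convenience for the sketch; separate tables per corpus would serve equally well.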
Figs. 3a-3c show examples of the general word mispronunciation corpus, the individual-user word mispronunciation corpus and the language-background-specific word mispronunciation corpus of one embodiment of the invention, respectively.
In the example of the general word mispronunciation corpus 301 in Fig. 3a, statistics over people of various language backgrounds (for example Japanese, Korean, Chinese, the Sichuan dialect of Chinese, English, French, German and Russian) show that people in general tend to mispronounce the Chinese word for "lion" as "Si1", with a frequency of 99 for the "Si1" error, and also as "Si4" (where 4 denotes the falling tone), with a frequency of 50. People in general tend to mispronounce the word for "four" as "Shi4", with a frequency of 85, and the word for "stone" as "Si2Tou1" (where 2 denotes the rising tone), with a frequency of 73.
In the example of the individual-user word mispronunciation corpus 302 in Fig. 3b, the user Yamada tends to mispronounce the word for "lion" as "Si1", with a frequency of 50 for this error, and also as "Si4", with a frequency of 3; Yamada tends to mispronounce the word for "four" as "Shi4", with a frequency of 1. The user Sakai tends to mispronounce the word for "stone" as "Si2Tou1", with a frequency of 2.
In the example of the language-background-specific word mispronunciation corpus 303 in Fig. 3c, statistics over people with a Japanese background show that they tend to mispronounce the word for "lion" as "Si1", with a frequency of 22, and also as "Si4", with a frequency of 68; they tend to mispronounce the word for "four" as "Shi4", with a frequency of 45.
Figs. 4a-4c show examples of the general word mistranslation corpus, the individual-user word mistranslation corpus and the language-background-specific word mistranslation corpus of one embodiment of the invention, respectively.
In the example of the general word mistranslation corpus 304 in Fig. 4a, statistics over people of various language backgrounds (for example Japanese, Korean, Chinese, English, French, German and Russian) show that people in general tend to mistranslate the Chinese word for "automobile" as the Japanese word "汽車", with a frequency of 76, and the Chinese word for "walk" as the Japanese word "走る", with a frequency of 44.
In the example of the individual-user word mistranslation corpus 305 in Fig. 4b, the user Yamada tends to mistranslate the Chinese word for "automobile" as the Japanese word "汽車", with a frequency of 7, and the Chinese word for "walk" as the Japanese word "走る", with a frequency of 5.
In the example of the language-background-specific word mistranslation corpus 306 in Fig. 4c, statistics over people with a Japanese background show that they tend to mistranslate the Chinese word for "automobile" as the Japanese word "汽車", with a frequency of 66, and the Chinese word for "walk" as the Japanese word "走る", with a frequency of 89.
Figs. 5a-5c show examples of the general sentence mistranslation corpus, the individual-user sentence mistranslation corpus and the language-background-specific sentence mistranslation corpus of one embodiment of the invention, respectively.
In the example of the general sentence mistranslation corpus 307 in Fig. 5a, statistics over people of various language backgrounds (for example Japanese, Korean, Chinese, English, French, German and Russian) show that people in general tend to mistranslate the Chinese sentence meaning "finished with great difficulty" as the Japanese sentence "や The く", with a frequency of 43.
In the example of the individual-user sentence mistranslation corpus 308 in Fig. 5b, the user Yamada tends to mistranslate the Chinese sentence meaning "finished with great difficulty" as the Japanese sentence "や The く", with a frequency of 3.
In the example of the language-background-specific sentence mistranslation corpus 309 in Fig. 5c, statistics over people with a Japanese background show that they tend to mistranslate the Chinese sentence meaning "finished with great difficulty" as the Japanese sentence "や The く", with a frequency of 25.
Fig. 6 is a structural diagram of the processing unit of one embodiment of the invention. As shown in Fig. 6, the processing unit 103 comprises a text analysis unit 201, an emphasis information generation unit 202 and a correct information generation unit 203.
The text analysis unit 201 performs text processing on the received text to obtain its language information, which here comprises words and sentences. The words obtained are then subjected to processing such as disambiguation, phonetic annotation and tone change. The information needed for tone-change processing is stored in the tone sandhi dictionary 311 of the storage unit 102.
The emphasis information generation unit 202 first determines, from the function selection information received by the receiving unit 101, which of the following functions the user has selected: the word mispronunciation information function, the word mistranslation information function or the sentence mistranslation information function.
Then, according to the selected function and to the language category information of the user's language background and the user identification information received by the receiving unit 101, the emphasis information generation unit 202 generates, from the word mispronunciation corpus, the word mistranslation corpus or the sentence mistranslation corpus, the word mispronunciation information, word mistranslation information or sentence mistranslation information for the received text that is tailored to the user's language background.
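The choice of corpora by selected function amounts to a small lookup table. The sketch below uses the reference numerals 301 to 309 from the description as stand-ins for the corpora themselves; the enum and function names are hypothetical.

```python
from enum import Enum


class Function(Enum):
    WORD_MISPRONUNCIATION = "word mispronunciation information function"
    WORD_MISTRANSLATION = "word mistranslation information function"
    SENTENCE_MISTRANSLATION = "sentence mistranslation information function"


# Reference numerals of the corpora consulted for each function.
CORPORA_BY_FUNCTION = {
    Function.WORD_MISPRONUNCIATION: {"general": 301, "individual": 302, "background": 303},
    Function.WORD_MISTRANSLATION: {"general": 304, "individual": 305, "background": 306},
    Function.SENTENCE_MISTRANSLATION: {"general": 307, "individual": 308, "background": 309},
}


def select_corpora(function):
    """Return the three corpora used for the selected function."""
    return CORPORA_BY_FUNCTION[function]


print(select_corpora(Function.WORD_MISTRANSLATION))
# {'general': 304, 'individual': 305, 'background': 306}
```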
Specifically, when the user selects the word mispronunciation information function, the emphasis information generation unit 202 proceeds as follows for each word W in the received text. First, it extracts (or obtains, or retrieves) from the general word mispronunciation corpus of the storage unit 102 the mispronunciation information whose word corresponds to W, together with the corresponding frequency information; from the language-background-specific word mispronunciation corpus, the mispronunciation information whose word corresponds to W and whose language category information corresponds to the received language category information of the user's language background, together with the corresponding frequency information; and from the individual-user word mispronunciation corpus, the mispronunciation information whose word corresponds to W and whose user identification information corresponds to the received user identification information, together with the corresponding frequency information. Next, for each extracted piece of mispronunciation information, it divides the extracted frequency information, summed over the corpora from which it was extracted, by the sum of the frequency information of all mispronunciation information in the three corpora (general, individual-user and language-background-specific), thereby calculating the probability information of that mispronunciation information. Finally, it sorts the extracted mispronunciation information in descending order of the calculated probability information; the sorted mispronunciation information is the emphasis information for the word W in the received text.
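The extract, normalize and sort procedure for one word can be sketched as follows, using the frequencies quoted for Figs. 3a-3c. Several things here are assumptions made for illustration: the function name `emphasis_information`, the tuple layouts, the English glosses used as word keys, and the user identifiers written as "Yamada" and "Sakai". The denominator is taken over every entry of the three corpora, which is the reading that reproduces the figure of 498 in the first example.

```python
from collections import defaultdict

# (word, mispronunciation, frequency), as quoted for Fig. 3a
GENERAL = [("lion", "Si1", 99), ("lion", "Si4", 50),
           ("four", "Shi4", 85), ("stone", "Si2Tou1", 73)]
# (user, word, mispronunciation, frequency), as quoted for Fig. 3b
INDIVIDUAL = [("Yamada", "lion", "Si1", 50), ("Yamada", "lion", "Si4", 3),
              ("Yamada", "four", "Shi4", 1), ("Sakai", "stone", "Si2Tou1", 2)]
# (language, word, mispronunciation, frequency), as quoted for Fig. 3c
BACKGROUND = [("Japanese", "lion", "Si1", 22), ("Japanese", "lion", "Si4", 68),
              ("Japanese", "four", "Shi4", 45)]


def emphasis_information(word, user_id, languages):
    """Return [(mispronunciation, probability)] for one word, most likely first."""
    # Denominator: every frequency stored in the three corpora.
    total = sum(row[-1] for corpus in (GENERAL, INDIVIDUAL, BACKGROUND) for row in corpus)
    counts = defaultdict(int)
    for w, error, freq in GENERAL:
        if w == word:
            counts[error] += freq
    for user, w, error, freq in INDIVIDUAL:
        if w == word and user == user_id:
            counts[error] += freq
    for language, w, error, freq in BACKGROUND:
        if w == word and language in languages:
            counts[error] += freq
    ranked = [(error, count / total) for error, count in counts.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


for error, p in emphasis_information("lion", "Yamada", {"Japanese", "Sichuan dialect"}):
    print(error, round(p, 3))
# Si1 0.343
# Si4 0.243
```

Because the denominator is shared by all words, the probabilities of different words are directly comparable, which would allow the most error-prone word in a sentence to be emphasized most strongly.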
Similarly, when the user selects the word mistranslation information function, the emphasis information generation unit 202 proceeds as follows for each word W in the received text. First, it extracts from the general word mistranslation corpus of the storage unit 102 the mistranslation information whose word corresponds to W, together with the corresponding frequency information; from the language-background-specific word mistranslation corpus, the mistranslation information whose word corresponds to W and whose language category information corresponds to the received language category information of the user's language background, together with the corresponding frequency information; and from the individual-user word mistranslation corpus, the mistranslation information whose word corresponds to W and whose user identification information corresponds to the received user identification information, together with the corresponding frequency information. Next, for each extracted piece of mistranslation information, it divides the extracted frequency information, summed over the corpora from which it was extracted, by the sum of the frequency information of all mistranslation information in the three corpora (general, individual-user and language-background-specific), thereby calculating the probability information of that mistranslation information. Finally, it sorts the extracted mistranslation information in descending order of the calculated probability information; the sorted mistranslation information is the emphasis information for the word W in the received text.
Likewise, when the user selects the sentence mistranslation information function, the emphasis information generation unit 202 proceeds as follows for each sentence S in the received text. First, it extracts from the general sentence mistranslation corpus of the storage unit 102 the mistranslation information whose sentence corresponds to S, together with the corresponding frequency information; from the language-background-specific sentence mistranslation corpus, the mistranslation information whose sentence corresponds to S and whose language category information corresponds to the received language category information of the user's language background, together with the corresponding frequency information; and from the individual-user sentence mistranslation corpus, the mistranslation information whose sentence corresponds to S and whose user identification information corresponds to the received user identification information, together with the corresponding frequency information. Next, for each extracted piece of mistranslation information, it divides the extracted frequency information, summed over the corpora from which it was extracted, by the sum of the frequency information of all mistranslation information in the three corpora (general, individual-user and language-background-specific), thereby calculating the probability information of that mistranslation information. Finally, it sorts the extracted mistranslation information in descending order of the calculated probability information; the sorted mistranslation information is the emphasis information for the sentence S in the received text.
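The calculation shared by the three functions can be written compactly. The symbols are introduced here for illustration only and do not appear in the original text: x is a word or sentence in the received text, e is one of its extracted misrecognitions, u is the received user identifier, L is the set of received language categories, and f_G, f_U and f_B are the frequencies stored in the general, individual-user and language-background-specific corpora of the selected function.

```latex
P(e \mid x) =
  \frac{f_G(x, e) + f_U(x, e, u) + \sum_{\ell \in L} f_B(x, e, \ell)}
       {\sum_{(x', e')} f_G(x', e')
        + \sum_{(x', e', u')} f_U(x', e', u')
        + \sum_{(x', e', \ell')} f_B(x', e', \ell')}
```

Taking the denominator over every entry of the three corpora, rather than over the extracted entries only, is one reading of the text. It agrees with the worked figures of the first example, where the numerator for "Si1" is 99+50+22 and the denominator is 498.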
Fig. 7 shows first example of the information of emphasizing of one embodiment of the invention.Wherein, in this first example, suppose that storage unit 102 stored that the general word shown in Fig. 3 a-3c misreads corpus 301, the individual subscriber word misreads corpus 302 and language-specific background word misreads corpus 303, and dictionary.
As shown in Figure 7, in this first example, the text that the receiving element 101 of device 10 receives is " lion and 40 ", the pairing language category information of the user's who receives language setting is Japanese and China-Sichuan dialect, the user totem information that receives is " hillside plot ", and the function selecting information that receives is " word misreads informational function ".
Because the received function selection information is "word misreading information function", the text analysis unit 201 in the processing unit 103 of device 10 performs text processing on the received text and obtains the following words of the received text: lion, son, and, four, ten, lion, 40.
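This section does not state how the text analysis unit 201 segments the text. The word lists given in the examples are consistent with emitting every single character first and then every multi-character dictionary word found in the text. A minimal Python sketch under that assumption follows; the function name `analyze`, the `max_len` limit and the set-based dictionary are hypothetical.

```python
from typing import List, Set


def analyze(text: str, dictionary: Set[str], max_len: int = 4) -> List[str]:
    """Return the single characters of `text`, then its dictionary words.

    Assumption: a "word" is any substring of 2..max_len characters that
    appears in the dictionary; the source does not specify the rule.
    """
    # Every character is treated as a candidate unit on its own.
    units = list(text)
    # Then scan every substring of length 2..max_len for dictionary hits.
    for start in range(len(text)):
        longest = min(len(text), start + max_len)
        for end in range(start + 2, longest + 1):
            candidate = text[start:end]
            if candidate in dictionary:
                units.append(candidate)
    return units
```

Each returned unit can then be looked up in the misreading or mistranslation corpora.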
Because the received function selection information is "word misreading information function", the emphasis information generation unit 202 in the processing unit 103 of device 10 extracts the following from storage unit 102. From the general word misreading corpus 301, it extracts the misreading information "Si1" of the word "lion" and its corresponding frequency information "99", the misreading information "Si4" of the word "lion" and its corresponding frequency information "50", and the misreading information "Shi4" of the word "four" and its corresponding frequency information "85". From the individual user word misreading corpus 302, according to the received user identification information "hillside plot", it extracts the misreading information "Si1" of the word "lion" and its corresponding frequency information "50", the misreading information "Si4" of the word "lion" and its corresponding frequency information "3", and the misreading information "Shi4" of the word "four" and its corresponding frequency information "1". From the language-specific background word misreading corpus 303, according to the language category information corresponding to the received user language setting, namely Japanese and the Sichuan dialect of Chinese, it extracts the misreading information "Si1" of the word "lion" and its corresponding frequency information "22", the misreading information "Si4" of the word "lion" and its corresponding frequency information "68", and the misreading information "Shi4" of the word "four" and its corresponding frequency information "45". Then, the emphasis information generation unit 202 calculates the probability information of the misreading information "Si1" of the word "lion" as (99+50+22)/(99+50+85+73+50+3+1+2+22+68+45) = 171/498 = 0.343, the probability information of the misreading information "Si4" of the word "lion" as (50+3+68)/(99+50+85+73+50+3+1+2+22+68+45) = 121/498 = 0.243, and the probability information of the misreading information "Shi4" of the word "four" as (85+1+45)/(99+50+85+73+50+3+1+2+22+68+45) = 131/498 = 0.263. Then, the emphasis information generation unit 202 sorts the misreading information "Si1" and "Si4" of the word "lion" in descending order of the calculated probability information, so that the misreading information "Si1" of the word "lion" is placed before the misreading information "Si4" of the word "lion". Because the word "four" has only one item of misreading information, namely "Shi4", the misreading information "Shi4" is placed first among the misreading information of the word "four".
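The arithmetic of this first example can be checked with a short Python sketch. The frequencies are the ones quoted in the text, listed in corpus order 301, 302, 303. The two values 73 and 2 appear only in the denominator; this section does not say which entries they belong to, so they are kept in a separate list with the hypothetical name `other_entries`. The variable and function names are likewise illustrative.

```python
from fractions import Fraction

# Frequencies quoted in the example, in corpus order 301, 302, 303.
extracted = {
    ("lion", "Si1"): [99, 50, 22],
    ("lion", "Si4"): [50, 3, 68],
    ("four", "Shi4"): [85, 1, 45],
}
# Frequencies that occur only in the denominator of the example; the
# entries they belong to are not identified in this section.
other_entries = [73, 2]

# Denominator: every frequency that appears in the three corpora.
total = sum(sum(freqs) for freqs in extracted.values()) + sum(other_entries)

# Probability of each misreading: its summed frequency over the total.
probabilities = {
    key: Fraction(sum(freqs), total) for key, freqs in extracted.items()
}


def ranked(word):
    """Misreadings of `word`, sorted by descending probability."""
    items = [
        (reading, p) for (w, reading), p in probabilities.items() if w == word
    ]
    return sorted(items, key=lambda item: item[1], reverse=True)


for word in ("lion", "four"):
    for reading, p in ranked(word):
        print(word, reading, f"{float(p):.3f}")
```

Exact fractions are used so that the comparison between readings is not affected by rounding; the values are rounded to three decimals only when printed.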
The correct information generation unit 203 in the processing unit 103 extracts the correct pronunciation information of the words of the received text from storage unit 102.
Display unit 104 displays the received text 701, the language category information corresponding to the received user language setting together with the function selection information 702, the correct pronunciation information 703 of the words of the received text, and the misreading information 704, namely "Si1" and "Si4" of the word "lion" and "Shi4" of the word "four".
Voice unit 105 outputs, as sound, the correct pronunciation information of the words of the received text, the pronunciations of the misreading information "Si1" and "Si4" of the word "lion", and the pronunciation of the misreading information "Shi4" of the word "four".
Fig. 8 shows a second example of the emphasis information of one embodiment of the invention. In this second example, it is assumed that storage unit 102 stores the general word mistranslation corpus 304, the individual user word mistranslation corpus 305 and the language-specific background word mistranslation corpus 306 shown in Figs. 4a-4c, as well as a dictionary.
As shown in Fig. 8, in this second example the text received by the receiving unit 101 of device 10 is the Chinese word "automobile", the language category information corresponding to the received user language setting is Japanese, the received user identification information is "hillside plot", and the received function selection information is "word mistranslation information function".
Because the received function selection information is "word mistranslation information function", the text analysis unit 201 in the processing unit 103 of device 10 performs text processing on the received text and obtains the following words of the received text: vapour, car, automobile.
Because the received function selection information is "word mistranslation information function", the emphasis information generation unit 202 in the processing unit 103 of device 10 extracts the following from storage unit 102. From the general word mistranslation corpus 304, it extracts the mistranslation information of the Chinese word "automobile", namely the Japanese word "automobile", and its corresponding frequency information "76". From the individual user word mistranslation corpus 305, according to the received user identification information "hillside plot", it extracts the mistranslation information of the Chinese word "automobile", namely the Japanese word "automobile", and its corresponding frequency information "7". From the language-specific background word mistranslation corpus 306, according to the language category information corresponding to the received user language setting, namely Japanese, it extracts the mistranslation information of the Chinese word "automobile", namely the Japanese word "automobile", and its corresponding frequency information "66". Then, the emphasis information generation unit 202 calculates the probability information of the mistranslation information Japanese word "automobile" of the Chinese word "automobile" as (76+7+66)/(76+44+7+5+66+89) = 149/287 = 0.52. Then, the emphasis information generation unit 202 sorts the mistranslation information of the Chinese word "automobile" in descending order of the calculated probability information. Because the Chinese word "automobile" has only one item of mistranslation information, namely the Japanese word "automobile", that item is placed first among the mistranslation information of the Chinese word "automobile".
The correct information generation unit 203 in the processing unit 103 extracts the correct translation information of the word of the received text from storage unit 102, namely the Japanese word "自動車".
Display unit 104 displays the received text 801, the language category information corresponding to the received user language setting together with the function selection information 802, the correct translation information of the word "automobile" of the received text, namely the Japanese word "自動車" 803, and the mistranslation information, namely the Japanese word "automobile" 804.
Fig. 9 shows a third example of the emphasis information of one embodiment of the invention. In this third example, it is assumed that storage unit 102 stores the general sentence mistranslation corpus 307, the individual user sentence mistranslation corpus 308 and the language-specific background sentence mistranslation corpus 309 shown in Figs. 5a-5c, as well as a dictionary.
As shown in Fig. 9, in this third example the text received by the receiving unit 101 of device 10 is the Chinese sentence "with great difficulty do and be over.", the language category information corresponding to the received user language setting is Japanese, the received user identification information is "hillside plot", and the received function selection information is "sentence mistranslation information function".
Because the received function selection information is "sentence mistranslation information function", the text analysis unit 201 in the processing unit 103 of device 10 performs text processing on the received text and obtains the following sentence of the received text: "with great difficulty do and be over.".
Because the received function selection information is "sentence mistranslation information function", the emphasis information generation unit 202 in the processing unit 103 of device 10 extracts the following from storage unit 102. From the general sentence mistranslation corpus 307, it extracts the mistranslation information of the Chinese sentence "with great difficulty do and be over.", namely the Japanese sentence "や The く", and its corresponding frequency information "43". From the individual user sentence mistranslation corpus 308, according to the received user identification information "hillside plot", it extracts the mistranslation information of the Chinese sentence "with great difficulty do and be over.", namely the Japanese sentence "や The く", and its corresponding frequency information "3". From the language-specific background sentence mistranslation corpus 309, according to the language category information corresponding to the received user language setting, namely Japanese, it extracts the mistranslation information of the Chinese sentence "with great difficulty do and be over.", namely the Japanese sentence "や The く", and its corresponding frequency information "25". Then, the emphasis information generation unit 202 calculates the probability information of the mistranslation information Japanese sentence "や The く" of the Chinese sentence "with great difficulty do and be over." as (43+3+25)/(43+3+25) = 71/71 = 1. Then, the emphasis information generation unit 202 sorts the mistranslation information of the Chinese sentence "with great difficulty do and be over." in descending order of the calculated probability information. Because this Chinese sentence has only one item of mistranslation information, namely the Japanese sentence "や The く", that item is placed first among the mistranslation information of the Chinese sentence "with great difficulty do and be over.".
The correct information generation unit 203 in the processing unit 103 extracts from storage unit 102 the correct translation information of the sentence "with great difficulty do and be over." of the received text, namely the Japanese sentence "や つ と finishes".
Display unit 104 displays the received text 901, the language category information corresponding to the received user language setting together with the function selection information 902, the correct translation information of the sentence "with great difficulty do and be over." of the received text, namely the Japanese sentence "や つ と finishes" 903, and the mistranslation information, namely the Japanese sentence "や The く" 904.
Those skilled in the art will appreciate that, although in the above embodiments storage unit 102 comprises three corpora, namely the word misreading corpus, the word mistranslation corpus and the sentence mistranslation corpus, the present invention is not limited thereto. In other embodiments of the invention, storage unit 102 may comprise only one or two of the word misreading corpus, the word mistranslation corpus and the sentence mistranslation corpus, may further comprise corpora of other element wrongly identified information of words and sentences, or may further comprise corpora of element wrongly identified information of other language information.
In addition, those skilled in the art will appreciate that, although in the above embodiments each of the word misreading corpus, the word mistranslation corpus and the sentence mistranslation corpus further comprises three corpora, the present invention is not limited thereto. In other embodiments of the invention, each of the word misreading corpus, the word mistranslation corpus and the sentence mistranslation corpus may further comprise only one or two of those corpora.
In addition, those skilled in the art will appreciate that the device for providing information of the present invention may be implemented in electronic devices such as electronic dictionaries, PDAs and language learning machines, and may also be implemented in a computer network environment or a wireless communication network environment. When it is implemented in a computer network environment or a wireless communication network environment, receiving unit 101, storage unit 102 and processing unit 103 are implemented in a server, and display unit 104 and voice unit 105 are implemented on a terminal acting as a client, such as a computer or a mobile terminal.
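The server and client split described above can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the source does not specify a wire format, so JSON is used here, and the names `Request`, `handle_request`, `render` and the payload fields are hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Request:
    """What the receiving unit accepts from the terminal (hypothetical)."""
    text: str
    languages: List[str]
    user_id: str
    function: str


def handle_request(raw: str, process: Callable[[Request], dict]) -> str:
    """Server side: receiving unit decodes, processing unit does the work.

    `process` stands in for the processing unit together with the
    storage unit; it returns a dict that can be serialized to JSON.
    """
    request = Request(**json.loads(raw))
    return json.dumps(process(request), ensure_ascii=False)


def render(raw_response: str) -> List[str]:
    """Client side: turn the server response into lines for the display."""
    payload = json.loads(raw_response)
    return [
        f"{item['error']} ({item['probability']:.3f})"
        for item in payload["errors"]
    ]
```

In this arrangement the terminal only needs to send the user's input and present the returned emphasis information, while the corpora stay on the server.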
Those skilled in the art will appreciate that various modifications and changes may be made to the device and method for providing information disclosed in the present invention without departing from the essence of the invention; therefore, the scope of protection of the present invention is defined by the claims.

Claims (21)

1. A device for providing information, comprising:
a storage unit for storing element wrongly identified information of language information;
a receiving unit for receiving text input by a user; and
a processing unit for extracting, from the storage unit, the element wrongly identified information whose language information corresponds to the language information in the received text, as the element wrongly identified information of the language information in the received text.
2. The device as claimed in claim 1, wherein
the stored element wrongly identified information of language information comprises element wrongly identified information of language information that people of different language backgrounds wrongly identify;
the storage unit further stores language category information corresponding to the element wrongly identified information of language information that people of the different language backgrounds wrongly identify;
the receiving unit further receives language category information corresponding to a language setting of the user; and
the processing unit extracts, according to the language information in the received text and the received language category information, the corresponding element wrongly identified information from the stored element wrongly identified information of language information that people of different language backgrounds wrongly identify, as the element wrongly identified information of the language information in the received text.
3. The device as claimed in claim 1, wherein
the stored element wrongly identified information of language information comprises element wrongly identified information of language information that an individual user wrongly identifies;
the storage unit further stores user identification information corresponding to the element wrongly identified information of language information that the individual user wrongly identifies;
the receiving unit further receives identification information of the user; and
the processing unit extracts, according to the language information in the received text and the received user identification information, the corresponding element wrongly identified information from the stored element wrongly identified information of language information that an individual user wrongly identifies, as the element wrongly identified information of the language information in the received text.
4. The device as claimed in claim 2 or 3, wherein
the stored element wrongly identified information of language information further comprises element wrongly identified information of language information that people in general wrongly identify; and
the processing unit obtains, according to the language information in the received text, the corresponding element wrongly identified information from the stored element wrongly identified information of language information that people in general wrongly identify, and takes it, together with the extracted element wrongly identified information, as the element wrongly identified information of the language information in the received text.
5. The device as claimed in claim 2, wherein
the stored element wrongly identified information of language information further comprises element wrongly identified information of language information that an individual user wrongly identifies;
the storage unit further stores user identification information corresponding to the element wrongly identified information of language information that the individual user wrongly identifies;
the receiving unit further receives identification information of the user; and
the processing unit obtains, according to the language information in the received text and the received user identification information, the corresponding element wrongly identified information from the stored element wrongly identified information of language information that an individual user wrongly identifies, and takes it, together with the extracted element wrongly identified information, as the element wrongly identified information of the language information in the received text.
6. The device as claimed in claim 5, wherein
the stored element wrongly identified information of language information further comprises element wrongly identified information of language information that people in general wrongly identify; and
the processing unit retrieves, according to the language information in the received text, the corresponding element wrongly identified information from the stored element wrongly identified information of language information that people in general wrongly identify, and takes it, together with the obtained and the extracted element wrongly identified information, as the element wrongly identified information of the language information in the received text.
7. The device as claimed in any one of claims 1-6, wherein
the element wrongly identified information of the language information is at least one of misreading information of a word, mistranslation information of a word and mistranslation information of a sentence.
8. The device as claimed in any one of claims 1-6, further comprising:
an output unit for outputting to the user the element wrongly identified information of the language information in the received text.
9. The device as claimed in claim 8, wherein the output unit further comprises:
a display unit for displaying to the user the visual information in the element wrongly identified information of the language information in the received text; and
a voice unit for outputting to the user the sound information in the element wrongly identified information of the language information in the received text.
10. The device as claimed in any one of claims 1-6, wherein
before extracting the element wrongly identified information of the language information in the received text, the processing unit first performs text processing on the received text to obtain the language information in the received text.
11. The device as claimed in any one of claims 1-6, wherein
the stored element wrongly identified information of language information is obtained by statistics.
12. A method for providing information, comprising:
storing element wrongly identified information of language information;
receiving text input by a user; and
extracting, from the stored element wrongly identified information of language information, the element wrongly identified information whose language information corresponds to the language information in the received text, as the element wrongly identified information of the language information in the received text.
13. The method as claimed in claim 12, wherein the stored element wrongly identified information of language information comprises element wrongly identified information of language information that people of different language backgrounds wrongly identify, the method further comprising:
storing language category information corresponding to the element wrongly identified information of language information that people of the different language backgrounds wrongly identify;
receiving language category information corresponding to a language setting of the user; and
extracting, according to the language information in the received text and the received language category information, the corresponding element wrongly identified information from the stored element wrongly identified information of language information that people of different language backgrounds wrongly identify, as the element wrongly identified information of the language information in the received text.
14. The method as claimed in claim 12, wherein the stored element wrongly identified information of language information comprises element wrongly identified information of language information that an individual user wrongly identifies, the method further comprising:
storing user identification information corresponding to the element wrongly identified information of language information that the individual user wrongly identifies;
receiving identification information of the user; and
extracting, according to the language information in the received text and the received user identification information, the corresponding element wrongly identified information from the stored element wrongly identified information of language information that an individual user wrongly identifies, as the element wrongly identified information of the language information in the received text.
15. The method as claimed in claim 13 or 14, wherein the stored element wrongly identified information of language information further comprises element wrongly identified information of language information that people in general wrongly identify, the method further comprising:
obtaining, according to the language information in the received text, the corresponding element wrongly identified information from the stored element wrongly identified information of language information that people in general wrongly identify, and taking it, together with the extracted element wrongly identified information, as the element wrongly identified information of the language information in the received text.
16. The method as claimed in claim 13, wherein the stored element wrongly identified information of language information further comprises element wrongly identified information of language information that an individual user wrongly identifies, the method further comprising:
storing user identification information corresponding to the element wrongly identified information of language information that the individual user wrongly identifies;
receiving identification information of the user; and
obtaining, according to the language information in the received text and the received user identification information, the corresponding element wrongly identified information from the stored element wrongly identified information of language information that an individual user wrongly identifies, and taking it, together with the extracted element wrongly identified information, as the element wrongly identified information of the language information in the received text.
17. The method as claimed in claim 16, wherein the stored element wrongly identified information of language information further comprises element wrongly identified information of language information that people in general wrongly identify, the method further comprising:
retrieving, according to the language information in the received text, the corresponding element wrongly identified information from the stored element wrongly identified information of language information that people in general wrongly identify, and taking it, together with the obtained and the extracted element wrongly identified information, as the element wrongly identified information of the language information in the received text.
18. The method as claimed in any one of claims 12-17, wherein
the element wrongly identified information of the language information is at least one of misreading information of a word, mistranslation information of a word and mistranslation information of a sentence.
19. The method as claimed in any one of claims 12-17, further comprising:
outputting to the user the element wrongly identified information of the language information in the received text.
20. The method as claimed in any one of claims 12-17, further comprising:
before extracting the element wrongly identified information of the language information in the received text, first performing text processing on the received text to obtain the language information in the received text.
21. The method as claimed in any one of claims 12-17, wherein
the stored element wrongly identified information of language information is obtained by statistics.
CNA2008100911610A 2008-04-07 2008-04-07 Device and method for providing information Pending CN101556745A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNA2008100911610A CN101556745A (en) 2008-04-07 2008-04-07 Device and method for providing information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNA2008100911610A CN101556745A (en) 2008-04-07 2008-04-07 Device and method for providing information

Publications (1)

Publication Number Publication Date
CN101556745A true CN101556745A (en) 2009-10-14

Family

ID=41174847

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA2008100911610A Pending CN101556745A (en) 2008-04-07 2008-04-07 Device and method for providing information

Country Status (1)

Country Link
CN (1) CN101556745A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001249679A (en) * 2000-03-03 2001-09-14 Rikogaku Shinkokai Foreign language self-study system
US20030204396A1 (en) * 2001-02-01 2003-10-30 Yumi Wakita Sentence recognition device, sentence recognition method, program, and medium
CN1484173A (en) * 2003-08-10 2004-03-24 卢小林 Method for correcting Chinese word misspelling based on Chinese character shape
CN1937002A (en) * 2006-07-27 2007-03-28 中山名人电脑科技有限公司 Intelligent man-machine dialogue system and realizing method
CN101079189A (en) * 2007-03-20 2007-11-28 无敌科技(西安)有限公司 Chinese pronunciation correction listening-writing study method and system

Similar Documents

Publication Publication Date Title
CN108287858B (en) Semantic extraction method and device for natural language
CN107291783B (en) Semantic matching method and intelligent equipment
CN1942875B (en) Dialogue supporting apparatus
CN1742273A (en) Multimodal speech-to-speech language translation and display
JP2006190006A5 (en)
CN100592385C (en) Method and system for performing speech recognition on multi-language name
CN108399157B (en) Dynamic extraction method of entity and attribute relationship, server and readable storage medium
CN110808032A (en) Voice recognition method and device, computer equipment and storage medium
KR101130276B1 (en) System and method for interpreting sign language
CN113255331B (en) Text error correction method, device and storage medium
CN112559725A (en) Text matching method, device, terminal and storage medium
CN113012683A (en) Speech recognition method and device, equipment and computer readable storage medium
Gayathri et al. Sign language recognition for deaf and dumb people using android environment
KR20160138613A (en) Method for auto interpreting using emoticon and apparatus using the same
CN107251137A (en) Improve method, device and the computer readable recording medium storing program for performing of the set of at least one semantic primitive using voice
KR20200044176A (en) System and Method for Korean POS Taging Using the Concatenation of Jamo and Sylable Embeding
CN101526857B (en) Method for inputting characters into information processing equipment
US10102203B2 (en) Method for writing a foreign language in a pseudo language phonetically resembling native language of the speaker
JP2019082981A (en) Inter-different language communication assisting device and system
CN101556745A (en) Device and method for providing information
CN113268981A (en) Information processing method and device and electronic equipment
CN112560431A (en) Method, apparatus, device, storage medium, and computer program product for generating test question tutoring information
KR101543024B1 (en) Method and Apparatus for Translating Word based on Pronunciation
KR101777141B1 (en) Apparatus and method for inputting chinese and foreign languages based on hun min jeong eum using korean input keyboard
CN111858840A (en) Intention identification method and device based on concept graph

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20091014