CN102479208B - Method for searching, converting, and translating web page information using Chinese phonetic codes - Google Patents

Method for searching, converting, and translating web page information using Chinese phonetic codes

Publication number
CN102479208B
Authority
CN
China
Prior art keywords
chinese
webpage
phonetics codes
word
character
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201010564052.3A
Other languages
Chinese (zh)
Other versions
CN102479208A (en)
Inventor
苗玉水
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
QINGHAI HANLA INFORMATION TECHNOLOGY CO., LTD.
Original Assignee
Qinghai Hanla Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qinghai Hanla Information Technology Co Ltd
Priority to CN201010564052.3A
Publication of CN102479208A
Application granted
Publication of CN102479208B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

The present invention is a method for searching, converting, and translating web page information using Chinese phonetic codes, and belongs to the technical field of computer website information processing. With this method, Chinese web pages found by search that are written in Chinese characters or in the pinyin of the "Scheme for the Chinese Phonetic Alphabet" can easily be converted into Chinese web pages expressed in Chinese phonetic codes. The method also supports two-way translation between Chinese and foreign-language web page text and synthesized Chinese speech output of web pages. When searching for information, the user may either enter characters or use Chinese speech input. Once a web page expresses its Chinese information in Chinese phonetic codes, it can be displayed and run in a purely Western code system. The invention thus greatly facilitates the translation and conversion of web page information on computers and embedded computer systems.

Description

Method for searching, converting, and translating web page information using Chinese phonetic codes
I. Technical Field
The present invention is a method for searching, converting, and translating web page information using Chinese phonetic codes. It can be used in a computer or embedded computer system (hereinafter "computer" or "computer system") and belongs to the field of computer website information processing technology.
II. Background Art
Since the 1940s, the rapid development of computers has set off a third technological revolution worldwide, centered on the electronic computer. It frees people from heavy mental labor and has opened a new era in the liberation of the human mind.
As is well known, computers process information by processing symbols, in particular the 128 ASCII characters. Because the 26 Latin letters are included in the 128-character ASCII code set, countries that use alphabetic writing based on the 26 Latin letters, with English as the typical example, have been able to carry out the current technological revolution smoothly and to benefit from rapid economic development.
In China, Chinese is written with ideographic square Chinese characters or with pinyin. As is well known, the internal computer codes of Chinese characters lie outside the 128-character ASCII code set. The pinyin of the "Scheme for the Chinese Phonetic Alphabet" (hereinafter "Pinyin") also has several shortcomings for computer information processing. First, its spellings are too long. Second, its five tones (including the neutral tone) are not written as letters, and the tone marks lie outside the ASCII range. Third, the initial, final, and tone of a Chinese syllable are not arranged in the one-dimensional, left-to-right linear order that computer processing expects; instead the tone mark is stacked above the letters. Fourth, without the help of non-alphabetic syllable-separator symbols, the syllables of a word written together are easily confused with one another, so that syllable boundaries become ambiguous. All of this hinders computer processing of Chinese information.
Because of these shortcomings, neither Chinese characters nor Pinyin can serve as a purely alphabetic script. At present, all Chinese web pages can only be written in Chinese characters or Pinyin, and because neither is 100% compatible with ASCII, such pages cannot be displayed or run in a purely Western code system. Solving this problem first requires a coding technique that spells Chinese using only the 26 letters. Second, because Chinese characters are currently the main means of expressing Chinese information and Pinyin is a supplementary means, the large and constantly growing body of web pages written in Chinese characters and Pinyin must also be made displayable and runnable in a purely Western code system. This requires a method by which computer application software automatically converts such pages into Chinese information expressed with the 26 Latin letters, or translates them into foreign-language web page text, chiefly English. If required, the resulting page can also be read aloud in Mandarin, in the voice of a particular Chinese speaker, in a Chinese dialect, in a minority language, or in a foreign language, chiefly English.
Many experts and scholars have studied and explored the problem of expressing Chinese information with the 26 Latin letters. Chinese, however, is a tonal language, which makes it unusual. A scheme that uses only the 26 Latin letters must encode 22 initials (including the zero initial), 38 finals, and 5 tones (including the neutral tone). In addition, so that any number of syllables can be written together without their boundaries becoming ambiguous, each syllable must also carry an implicit syllable separator. These requirements make the technical problem very difficult, which is the basic reason it has long gone without an effective solution.
Because no method of expressing Chinese information with the 26 Latin letters had been invented, no one could build on such a method to produce Chinese web pages expressed with the 26 Latin letters, to have computer application software automatically convert pages into that form or translate them into foreign-language text (chiefly English), or to have such pages read aloud when required.
Web page translation software now on the market, namely "Google Kingsoft PowerWord", released jointly by Kingsoft and Google, and the software with a web page translation function released by "one searches", has done much to advance Internet translation technology. However, all web page translation, including these two products, translates between foreign languages and Chinese information expressed in Chinese characters or Pinyin. Such information is not 100% compatible with the ASCII code system and cannot be displayed or run on a computer with a purely Western ASCII code system.
III. Summary of the Invention
The object of the invention is to provide a new method that reversibly converts the Chinese characters and Pinyin in a web page into a word-based Chinese phonetic code that uses only the 26 Latin letters. Combined with a Chinese or foreign-language speech synthesis module, a two-way machine translation module between Chinese phonetic codes and foreign languages, and a Chinese speech recognition module, this method addresses the problems described above: the display, translation, speech synthesis, and speech recognition of Chinese information in web pages under a purely Western code system.
Specifically, the Chinese phonetic code used by the method takes only the 26 Latin letters as code elements and takes the word as its unit, with the syllables of a word written together. The initial, final, and tone of the Pinyin of each syllable in the word are first converted to code letters and then arranged in the order "initial code + medial code + final code + tone code, which doubles as the syllable separator".
Because the 26 Latin letters fall within the 128-character ASCII code set, once the Chinese characters or Pinyin traditionally used to express Chinese information are converted to this phonetic code, all hardware and software resources in the world that handle Western codes, including web browsers, can be used to display and process the Chinese information it expresses.
IV. Detailed Description
Specific embodiments of the invention are further described below with reference to examples.
(1) The initial, final, and tone of each syllable of the Chinese phonetic code are encoded as follows:
Note: the symbols in parentheses are Pinyin symbols from the "Scheme for the Chinese Phonetic Alphabet" (hereinafter "Pinyin symbols"). The letters outside parentheses are the code symbols this scheme uses for the initial, final, and tone of each syllable. The comparison tables below are hereinafter called the code table.
1. The initial code uses letters largely identical to the Pinyin initials, in the following form:
b:(b)  p:(p)  m:(m)  f:(f)  d:(d)  t:(t)
n:(n)  l:(l)  g:(g)  k:(k)  h:(h)
j:(zh),(j)  q:(ch),(q)  x:(sh),(x)  r:(r)
z:(z)  c:(c)  s:(s)  y:(y)  w:(w)
2. The Pinyin medial (ü) is represented by one of the 26 Latin letters. For example, the medial code may take the following form:
i:(i)  u:(u)  y:(ü)
3. In the final code, the simple vowel (ü) is represented by one of the 26 Latin letters, and the other simple vowels use the same letters as Pinyin. The compound finals of Pinyin may be encoded with any consonant letter. For example, the Pinyin finals may be encoded with the following letters:
a:(a)  o:(o)  e:(e)  i:(i)  u:(u)  y:(ü)
k:(ao)  c:(ai)  s:(an)  x:(ou)  w:(ei)  n:(en)
z:(ua)  l:(uo)  b:(ang)  d:(ong)  p:(eng)
q:(ing)  g:(ng)  er:(er) (er is not split into initial and final)
r:(i) [only in combination with (zh), (ch), (sh)]
4. In the tone code, the consonant letter v, which Chinese does not use, represents the third tone (∨) of Pinyin; the other tones are represented by vowel letters. For example, the Pinyin tones may be encoded with the following letters:
a:(-) first tone (high level)  e:(/) second tone (rising)  v:(∨) third tone  u:(\) fourth tone (falling)
o:(unmarked) neutral tone
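As a rough illustration of the code table, the following Python sketch encodes syllables that have already been split into initial, medial, final, and tone. The tables are transcribed from the text; the function names, the use of 0 for the neutral tone, and the way each syllable is pre-split (for example, hui as medial u plus final i, inferred from the example code "huiu") are assumptions made for this sketch and are not part of the patent.

```python
# Code tables transcribed from the text; everything else here is a hypothetical sketch.
INITIALS = {
    "": "",  # zero initial
    "b": "b", "p": "p", "m": "m", "f": "f", "d": "d", "t": "t",
    "n": "n", "l": "l", "g": "g", "k": "k", "h": "h",
    "zh": "j", "j": "j", "ch": "q", "q": "q", "sh": "x", "x": "x", "r": "r",
    "z": "z", "c": "c", "s": "s", "y": "y", "w": "w",
}
MEDIALS = {"": "", "i": "i", "u": "u", "ü": "y"}
FINALS = {
    "a": "a", "o": "o", "e": "e", "i": "i", "u": "u", "ü": "y",
    "ao": "k", "ai": "c", "an": "s", "ou": "x", "ei": "w", "en": "n",
    "ua": "z", "uo": "l", "ang": "b", "ong": "d", "eng": "p",
    "ing": "q", "ng": "g", "er": "er",
}
TONES = {1: "a", 2: "e", 3: "v", 4: "u", 0: "o"}  # 0 = neutral tone (assumed convention)


def encode_syllable(initial, medial, final, tone):
    """Return initial code + medial code + final code + tone code."""
    if final == "i" and initial in ("zh", "ch", "sh"):
        final_code = "r"  # the (i) that combines only with zh, ch, sh
    else:
        final_code = FINALS[final]
    return INITIALS[initial] + MEDIALS[medial] + final_code + TONES[tone]


def encode_word(syllables):
    """Write the syllable codes of one word together without spaces."""
    return "".join(encode_syllable(*s) for s in syllables)


print(encode_word([("w", "", "o", 3), ("m", "", "en", 0)]))   # wǒmen
print(encode_word([("sh", "", "i", 3), ("y", "", "ong", 4)]))  # shǐyòng
```

Splitting a raw Pinyin string into these four parts is left out on purpose, since the text does not spell out that step.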
(2) Chinese information is represented in the phonetic code defined above as follows:
The word is the unit, and a single Chinese character is treated as one syllable. Each syllable of a word is encoded from its Pinyin in the "Scheme for the Chinese Phonetic Alphabet", in the order "initial code + medial code + final code + tone code, which doubles as the syllable separator". The syllables of one word are written together without spaces, and words are separated by spaces. When Chinese information is in the phonetic-code state, its six kinds of stop marks, seven kinds of other punctuation marks, and the hyphen take the same forms as in English.
Because a standalone Chinese character is treated as one syllable, the method encodes a character exactly as it encodes a single Chinese syllable. A word code is obtained by writing the syllable codes of the word together. A group of several words is called a phrase, and a phrase is encoded in the same way as a Chinese sentence. Since phrases and sentences are made up of words, their codes are built from word codes, and no separate coding scheme is needed for them. When Chinese information is represented word by word, the reader generally does not need to choose among homophones in order to understand a whole sentence or text. In principle, a sentence that is unambiguous when heard is also unambiguous when expressed in code.
Some examples follow of two-way conversion by the method between the word-based Chinese characters or Pinyin in a web page and the Chinese phonetic code:
To convert a web page whose Chinese information is written in Chinese characters or Pinyin into one written in Chinese phonetic codes, the computer system first locates the page's source file and converts the displayable Chinese content. Taking as an example page source stored as a text file with the extension ".html" or ".htm", the system calls the two-way conversion module between Chinese characters or Pinyin and the phonetic code, and converts all Chinese characters or Pinyin in the source text, except Chinese characters used as file names or as Chinese font names, into phonetic codes. For example, consider converting the following page source, a text file with the extension ".html":
<html>
<head>
<title>test</title>
</head>
<body>
<b>
我们会使用汉语拉丁文。
</b>
</body>
</html>
1. Converting the Chinese characters and Pinyin in a web page to Chinese phonetic codes:
(1) For Chinese characters that need conversion, first convert them to the corresponding Pinyin by looking up the "Chinese character and Pinyin comparison table" stored in advance in the computer system.
For example, in the page source above, the sentence 我们会使用汉语拉丁文。 ("We can use Chinese in Latin letters.") is Chinese-character text that the page displays, so it needs conversion. Converted to Pinyin, it becomes:
wǒmenhuìshǐyònghànyǔlādīngwěn。
Pinyin obtained from Chinese characters, or Pinyin that was in the page originally, is then converted to the following phonetic code string by means of the comparison table between Pinyin and the phonetic code table:
wov mno huiu xrv ydu hsu yyv laa dqa wnv. (syllables separated by spaces)
or wovmnohuiuxrvyduhsuyyvlaadqawnv. (syllables not separated by spaces)
(Once the reader is proficient, the neutral-tone letter o in mno may be omitted where this causes no confusion between syllables; the same applies throughout.)
For clarity, the letters representing tones are underlined here. The tone letter in the phonetic code also serves as the syllable separator. In actual phonetic code the tone letters are not underlined; a reader familiar with the code can easily pick out each syllable in the letter string by means of the tone letter that doubles as the syllable separator.
(2) Segment the phonetic code string into words to complete the phonetic code conversion.
By looking up the dictionary of segmented Chinese phonetic code words stored in advance in the computer system, write the syllables of each word together and separate words with spaces. This gives the phonetic code finally needed:
wovmno huiu xrvydu hsuyyv laadqawnv.
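The dictionary-based segmentation step might be sketched as follows. The text only says that a pre-stored dictionary of segmented phonetic-code words is consulted; the forward longest-match strategy, the `max_len` limit, and the tiny example dictionary are assumptions chosen for illustration.

```python
def segment(syllables, dictionary, max_len=4):
    """Group syllable codes into words by forward longest match (assumed strategy)."""
    words, i = [], 0
    while i < len(syllables):
        # Try the longest candidate first; a single syllable is always accepted.
        for n in range(min(max_len, len(syllables) - i), 0, -1):
            candidate = "".join(syllables[i:i + n])
            if n == 1 or candidate in dictionary:
                words.append(candidate)
                i += n
                break
    return " ".join(words)


# Hypothetical dictionary containing only the words of the running example.
WORDS = {"wovmno", "huiu", "xrvydu", "hsuyyv", "laadqawnv"}
syllables = "wov mno huiu xrv ydu hsu yyv laa dqa wnv".split()
print(segment(syllables, WORDS))
```

Punctuation is assumed to have been set aside before segmentation and reattached afterwards.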
(3) Finally, put these phonetic codes in the positions that the converted Chinese characters originally occupied in the web page. This completes the conversion of the page's Chinese characters into phonetic codes. Where the page originally displayed the Chinese characters 我们会使用汉语拉丁文。, it now displays wovmno huiu xrvydu hsuyyv laadqawnv. For convenience of description, content displayed in the web page is underlined in this text (likewise throughout).
The page source after conversion is:
<html>
<head>
<title>test</title>
</head>
<body>
<b>
wovmno huiu xrvydu hsuyyv laadqawnv.
</b>
</body>
</html>
In addition, Chinese characters in attributes of the page source such as type=button value=(Chinese characters) are displayed and therefore also need conversion. Chinese characters in comments in the page source may be converted or left as they are, because they are not displayed. In general, apart from file names and Chinese font names, the content that needs conversion is the Chinese-character text that lies between <body> and </body> and outside any <> tag, which in the example above becomes "wovmno huiu xrvydu hsuyyv laadqawnv."
When a Chinese-character web page is converted to a phonetic-code page, the English text, English letters, Arabic numerals, Western punctuation marks, and hyphens already in the page need no conversion and are kept as they are.
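One way to apply the rule "convert only displayed text between <body> and </body>, leave tags alone" is sketched below with Python's standard `html.parser` module. The class name, the `convert_page` helper, and the lookup-based `convert` callback are hypothetical, the Chinese sentence is reconstructed from the Pinyin given in the text, and attribute values such as button labels are not handled in this sketch.

```python
from html.parser import HTMLParser


class BodyTextConverter(HTMLParser):
    """Re-emit a page unchanged except for text nodes inside <body>."""

    def __init__(self, convert):
        super().__init__(convert_charrefs=False)
        self.convert = convert
        self.in_body = False
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        self.out.append(self.get_starttag_text())  # original tag text, untouched

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(self.convert(data) if self.in_body else data)

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")  # comments are not displayed; keep as is

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def convert_page(source, convert):
    parser = BodyTextConverter(convert)
    parser.feed(source)
    parser.close()
    return "".join(parser.out)


# Hypothetical stand-in for the conversion module: a one-entry lookup table.
TABLE = {"我们会使用汉语拉丁文。": "wovmno huiu xrvydu hsuyyv laadqawnv."}

def convert(text):
    for hanzi, code in TABLE.items():
        text = text.replace(hanzi, code)
    return text

page = "<html><head><title>test</title></head><body><b>\n我们会使用汉语拉丁文。\n</b></body></html>"
print(convert_page(page, convert))
```

Keeping the tags byte-for-byte identical is the point of the design: only text nodes pass through the converter, so file names and font names inside tags are untouched.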
2. Converting Chinese phonetic codes to Chinese characters and Pinyin:
Likewise, to convert a web page whose Chinese information is written in phonetic codes into one written in Chinese characters or Pinyin, the computer system first locates the page's source file and converts the displayable phonetic-code content. Taking page source stored as a text file with the extension ".html" or ".htm" as an example, the system calls the two-way conversion module between the phonetic code and Chinese characters or Pinyin, and converts all displayed phonetic codes into Chinese characters or Pinyin. If required, phonetic codes used as file names can also be converted to Chinese characters. Whether a page was converted to phonetic code or was written in it originally, the code has features of its own, so conversion can proceed once text has been judged to be phonetic code. Under the coding rules, every syllable ends with a tone code, and the tone code is one of the letters a (first tone), e (second tone), v (third tone), u (fourth tone), or o (neutral tone). Each tone code is preceded by an initial code and a final code, and sometimes a medial code as well. Because a compound final is represented by a single consonant letter, a phonetic-code syllable has 2 to 3 letters before its tone code. Allowing for zero-initial syllables and for a margin of safety, the lower limit is relaxed to 1 letter and the upper limit to 4 letters. That is, if a word ends in one of a, e, v, u, o and, counting from right to left and excluding the final tone code, one of a, e, v, u, o recurs every 1 to 4 letters, the word can largely be taken to be a phonetic-code word. When every word in a sentence or paragraph shows this feature, it is almost certain that these are phonetic-code words, or a sentence or paragraph made up of them. For example, in "wovmno huiu xrvydu hsuyyv laadqawnv." the underlined letters are the tone marks of the syllables, and all follow this pattern. The string is therefore a phonetic-code sentence, not a sentence in English or another foreign language, and is phonetic code that needs conversion.
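The detection rule just described (every syllable is 1 to 4 letters followed by one of a, e, v, u, o) can be approximated with a regular expression. This is a sketch of that heuristic only; the function names are hypothetical, and the rule is deliberately loose, so short English words such as "the" also pass at word level.

```python
import re

# One or more syllables, each 1 to 4 letters followed by a tone letter.
CODE_WORD = re.compile(r"(?:[a-z]{1,4}[aevuo])+")


def looks_like_code_word(word):
    """Word-level check: can the word be cut into '1-4 letters + tone letter' chunks?"""
    return CODE_WORD.fullmatch(word) is not None


def looks_like_code_text(text):
    """Sentence-level check: every alphabetic word must pass the word-level test."""
    words = re.findall(r"[A-Za-z]+", text)
    return bool(words) and all(looks_like_code_word(w) for w in words)


print(looks_like_code_text("wovmno huiu xrvydu hsuyyv laadqawnv."))
print(looks_like_code_text("We can use Latin letters."))
```

Requiring all words of a sentence to pass, as the text suggests, is what keeps isolated false positives from triggering a conversion.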
Once a string of the 26 Latin letters in the page has been confirmed to be Chinese information in phonetic code, it can easily be converted to Chinese characters and Pinyin by looking up the word-based comparison tables, stored in advance in the computer system, between phonetic codes and Chinese characters and between phonetic codes and Pinyin. For example, suppose the page displays the following Chinese information in phonetic code:
wovmno huiu xrvydu hsuyyv laadqawnv.
For wovmno, looking up the comparison table of initial, medial, final, and tone codes against Pinyin, or a comparison table of phonetic-code syllables or words against Pinyin syllables or words generated from it, yields wǒmen; the word-based Chinese characters are then found from wǒmen. Once a correspondence between a word's phonetic code and its Chinese characters has been established by way of its Pinyin, later conversions no longer need to pass through the Pinyin: the phonetic code can be mapped directly to the Chinese characters. For example, wovmno can be converted to wǒmen, and wǒmen to 我们 ("we"), so wovmno and 我们 are directly linked. When needed, conversion can bypass the Pinyin wǒmen and go directly and reversibly between wovmno and 我们. In the same way, the remaining phonetic-code words "huiu", "xrvydu", "hsuyyv", and "laadqawnv" can be converted to the Chinese characters 会 ("can"), 使用 ("use"), 汉语 ("Chinese"), and 拉丁文 ("Latin script"), or to the Pinyin "huì", "shǐyòng", "hànyǔ", and "lādīngwěn". This gives the Chinese-character sentence corresponding to the phonetic-code sentence "wovmno huiu xrvydu hsuyyv laadqawnv.":
"我们会使用汉语拉丁文。"
or the corresponding Pinyin sentence:
“wǒmenhuìshǐyònghànyǔlādīngwěn。”
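The word-level lookup described above might look like the sketch below. The table is hypothetical and holds only the five words of the running example; the Chinese characters are reconstructed from the Pinyin in the text, and the Pinyin spellings are copied from the text as given.

```python
# Hypothetical word-level table: phonetic code -> (Pinyin, Chinese characters).
CODE_TABLE = {
    "wovmno": ("wǒmen", "我们"),
    "huiu": ("huì", "会"),
    "xrvydu": ("shǐyòng", "使用"),
    "hsuyyv": ("hànyǔ", "汉语"),
    "laadqawnv": ("lādīngwěn", "拉丁文"),
}
# Reverse table, so conversion is direct in both directions without going through Pinyin.
HANZI_TO_CODE = {hanzi: code for code, (_, hanzi) in CODE_TABLE.items()}


def to_pinyin(code_words):
    return " ".join(CODE_TABLE[w][0] for w in code_words)


def to_hanzi(code_words):
    return "".join(CODE_TABLE[w][1] for w in code_words)


words = "wovmno huiu xrvydu hsuyyv laadqawnv".split()
print(to_hanzi(words))
print(to_pinyin(words))
```

A real table would need to hold several candidates per code word so that homonyms can be resolved from context.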
When homonyms are encountered, the word-based Chinese characters can be chosen after disambiguation by means such as the lexical, syntactic, and contextual relations of Chinese and statistical regularities. For example: "Mailbags were loaded on the ysvlune." and "Crude oil was loaded on the ysvlune." From the context, "ysvlune" in the first sentence means a cruise ship and "ysvlune" in the second means an oil tanker, so the two sentences can be converted to "Mailbags were loaded on the cruise ship" and "Crude oil was loaded on the oil tanker" respectively. Other words are handled by analogy.
For example, suppose the source of a page originally written in phonetic code is:
<html>
<head>
<title>test</title>
</head>
<body>
<b>
wovmno huiu xrvydu hsuyyv laadqawnv.
</b>
</body>
</html>
The title bar on the computer screen then shows "test", and the text area shows:
wovmno huiu xrvydu hsuyyv laadqawnv.
After the conversion described above, the source of the page that displays Pinyin is:
<html>
<head>
<title>test</title>
</head>
<body>
<b>
wǒmenhuìshǐyònghànyǔlādīngwěn。
</b>
</body>
</html>
The title bar on the computer screen then shows "test", and the text area shows:
wǒmenhuìshǐyònghànyǔlādīngwěn。
After the conversion described above, the source of the page that displays Chinese characters is:
<html>
<head>
<title>test</title>
</head>
<body>
<b>
我们会使用汉语拉丁文。
</b>
</body>
</html>
The title bar on the computer screen then shows "test", and the text area shows:
我们会使用汉语拉丁文。
When phonetic codes are converted to Chinese characters and Pinyin as above, the punctuation marks also change from their English forms to the corresponding Chinese forms. For example, the full stop "." in the phonetic-code sentence above becomes the full stop "。" in the Chinese-character and Pinyin sentences.
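The punctuation switch can be sketched with a simple translation table. Only the pair "." and "。" comes from the text; the other pairs and the function names are assumptions added for illustration.

```python
# "." <-> "。" is from the text; the remaining pairs are assumed for this sketch.
ASCII_TO_CHINESE = {".": "。", ",": "，", "?": "？", "!": "！", ";": "；", ":": "："}
CHINESE_TO_ASCII = {chinese: ascii_ for ascii_, chinese in ASCII_TO_CHINESE.items()}


def to_chinese_punctuation(text):
    return text.translate(str.maketrans(ASCII_TO_CHINESE))


def to_ascii_punctuation(text):
    return text.translate(str.maketrans(CHINESE_TO_ASCII))


print(to_chinese_punctuation("我们会使用汉语拉丁文."))
print(to_ascii_punctuation("我们会使用汉语拉丁文。"))
```

A character-by-character mapping like this would also rewrite periods inside numbers or URLs, so a fuller version would need to restrict it to converted text.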
Chinese characters used as file names in a web page sometimes need to be converted to phonetic codes so that the page can run on a purely Western (also called pure ASCII) computer system. After conversion, the files originally named with Chinese characters must be copied and stored in the proper place, such as a specified folder on a given server or on the local machine; otherwise the computer system may fail to find the file whose name has been converted to phonetic code.
For Chinese characters in a Chinese font name, when that font name does not exist in the Western code system, it can be changed to a specified, reasonably similar Western font name or to the default Western font name.
When the phonetic codes in a web page need to be converted to speech, the corresponding speech can be output by looking up the comparison tables, stored in advance in the computer system, between phonetic codes and the speech synthesis files for syllables, words, or paragraphs.
Some examples of converting phonetic codes to speech follow.
For example, suppose the page displays this Chinese information in phonetic code:
wovmno huiu xrvydu hsuyyv laadqawnv.
Expressed in Chinese characters, it means:
"我们会使用汉语拉丁文。"
To synthesize Chinese speech from Chinese information in phonetic code, one of the following three methods can generally be used as required:
1. Speech synthesis by looking up the comparison table between phonetic codes and syllable-level Chinese speech files:
Looking up the comparison table, stored in advance in the computer system, between phonetic codes and syllable-level Chinese speech files yields the audio file of the Chinese speech for each code syllable. (For ease of description, each audio file is written here as "Pinyin of the syllable.wav". In practice there are no Pinyin symbols; the file is simply an audio file, stored in advance in the computer system and playable by some sound-playing software, that holds the Chinese speech of the corresponding syllable.)
wov (wǒ.wav) mno (men.wav) huiu (huì.wav) xrv (shǐ.wav) ydu (yòng.wav) hsu (hàn.wav) yyv (yǔ.wav) laa (lā.wav) dqa (dīng.wav) wnv (wěn.wav).
The audio files found for these syllables are played one after another by the sound-playing software. A longer interval is used between words than between the syllables of one word, so that the output sounds closer to reading word by word and better matches listening habits.
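The syllable-level playback order, with a longer pause between words than between syllables, can be sketched as a playlist builder. The file naming scheme (code syllable plus ".wav") and the pause lengths are hypothetical; actual playback is left to whatever sound-playing software is available.

```python
def build_playlist(words, syllable_gap_ms=40, word_gap_ms=200):
    """Return (audio file, pause after it in ms) pairs for words given as syllable lists."""
    playlist = []
    for word in words:
        for k, syllable in enumerate(word):
            is_last_in_word = k == len(word) - 1
            pause = word_gap_ms if is_last_in_word else syllable_gap_ms
            playlist.append((f"{syllable}.wav", pause))  # hypothetical file name
    return playlist


sentence = [["wov", "mno"], ["huiu"], ["xrv", "ydu"], ["hsu", "yyv"], ["laa", "dqa", "wnv"]]
for filename, pause in build_playlist(sentence):
    print(filename, pause)
```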
2. Speech synthesis by looking up the comparison table between word-level phonetic codes and word-level Chinese speech files:
Looking up the comparison table, stored in advance in the computer system, between word-level phonetic codes and word-level speech files yields the pre-stored audio file of the Chinese speech for each code word. (For ease of description, each word-level audio file is written here as "Pinyin of the word.wav". In practice there are no Pinyin symbols; the file is simply an audio file, stored in advance in the computer system and playable by some sound-playing software, that holds the Chinese speech of the corresponding word.)
wovmno (wǒmen.wav) huiu (huì.wav) xrvydu (shǐyòng.wav) hsuyyv (hànyǔ.wav) laadqawnv (lādīngwěn.wav).
The word-level audio files found are played one after another by the sound-playing software. A longer interval is used between words than between the syllables of one word, so that the output sounds closer to reading word by word and better matches listening habits.
3. carry out phonetic synthesis by looking into Chinese phonetics codes string with the maximum coupling paragraph Chinese speech composite document table of comparisonsMethod:
The method adopts maximum matching method, by look into be stored in advance in computer system taking maximum paragraph as unitChinese phonetics codes string and the paragraph Chinese speech composite document table of comparisons are exported corresponding Chinese speech. Such as storing by looking in advanceMaximum paragraph in computer system is: " we can use wovmnohuiuxrvydu " and " hsuyyvlaadqawnvChinese character and latin literary composition " Chinese speech is synthetic is so undertaken by mode below:
wovmno huiu xrvydu (wǒmenhuìshǐyòng.wav) hsuyyv laadqawnv (hànyǔlādīngwěn.wav).
(For convenience, each paragraph-level audio file is written here as "pinyin of the paragraph.wav". The real file name need not contain pinyin symbols; it is simply a file stored in advance in the computer system that an audio player can play and that holds the Chinese speech for that paragraph.)
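A minimal sketch of the maximum-matching lookup follows, assuming a greedy forward strategy over space-separated code words. The table `PARAGRAPH_AUDIO` and the function `max_match_audio` are hypothetical names introduced for illustration, and the extra single-word entry is added only to show that shorter entries can coexist with longer ones.

```python
# Hypothetical paragraph-level table; keys are space-separated phonetic code words.
PARAGRAPH_AUDIO = {
    "wovmno huiu xrvydu": "wǒmenhuìshǐyòng.wav",
    "hsuyyv laadqawnv": "hànyǔlādīngwěn.wav",
    "wovmno": "wǒmen.wav",  # shorter entry; it loses to the longer match
}


def max_match_audio(text, table=PARAGRAPH_AUDIO):
    """Greedy forward maximum matching: always take the longest stored run of words."""
    words = text.rstrip(".").split()
    files, i = [], 0
    while i < len(words):
        for j in range(len(words), i, -1):  # try the longest candidate first
            key = " ".join(words[i:j])
            if key in table:
                files.append(table[key])
                i = j
                break
        else:
            raise KeyError(f"no stored paragraph starts at {words[i]!r}")
    return files


files = max_match_audio("wovmno huiu xrvydu hsuyyv laadqawnv.")
```

Trying the longest candidate first is what lets the three-word entry win over the single-word entry for "wovmno".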
By analogy, if in the three cases above the Chinese speech files for syllables, words, and paragraphs are replaced with speech files of a particular Chinese speaker, a Chinese dialect, or a minority language, the computer system synthesizes the speech of that speaker, dialect, or minority language, respectively.
Of the three speech synthesis methods above, the first needs the least storage space for speech files and the third needs the most.
Sometimes, to make proofreading a web page easier, the punctuation marks and line-break hyphens in a Chinese phonetic code web page need to be read aloud, which requires speech synthesis for them. To keep Chinese information expressed in Chinese phonetic codes 100% compatible with ASCII, we specify that the punctuation marks and the line-break hyphen in a Chinese phonetic code web page are identical to the English ones. During synthesis it is enough to retrieve the pre-stored audio file for each punctuation mark or line-break hyphen and play it with the audio player, for example:
Six stop marks: full stop "." (jùhào.wav), question mark "?" (wènhào.wav), exclamation mark "!" (gǎntànhào.wav), comma "," (dòuhào.wav), colon ":" (màohào.wav), semicolon ";" (fēnhào.wav).
Seven marker symbols: quotation marks "" (yǐnhào.wav), brackets () (kuòhào.wav), dash "-" (pòzhéhào.wav), ellipsis ... (shěnglüèhào.wav), emphasis mark . (zhuózhònghào.wav), title marks (()) (shūmínghào.wav), separation dot . (jiàngéhào.wav).
Line-break hyphen: "-" (yíhánghào.wav).
The six stop marks, seven marker symbols, and the line-break hyphen listed above are the same as in English. The ".wav" file in parentheses is the speech file used to read the mark aloud. When it is a Chinese speech file, the mark is read out in Chinese; when it is a speech file of a particular Chinese speaker, a Chinese dialect, or a minority language, the mark is read out in that speaker's voice, dialect, or language, respectively.
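As a small illustration of reading punctuation aloud during proofreading, the sketch below maps the six stop marks to their audio files. The names `PUNCT_AUDIO` and `punctuation_files` are hypothetical, and the dash and line-break hyphen are left out on purpose because both are written "-" and the text does not say how to tell them apart.

```python
# Hypothetical table for the six stop marks; file names follow the list above.
PUNCT_AUDIO = {
    ".": "jùhào.wav",
    "?": "wènhào.wav",
    "!": "gǎntànhào.wav",
    ",": "dòuhào.wav",
    ":": "màohào.wav",
    ";": "fēnhào.wav",
}


def punctuation_files(text, table=PUNCT_AUDIO):
    """Return the audio files for the stop marks in `text`, in reading order."""
    return [table[ch] for ch in text if ch in table]


marks = punctuation_files("wovmno, huiu!")
```

Swapping in a different table (a dialect or a particular speaker) changes the voice without changing the lookup.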
When a web page carries Chinese information in Chinese characters or pinyin, a conversion module first converts the characters or pinyin into Chinese phonetic codes, and the conversion into speech (Chinese, a particular Chinese speaker, a Chinese dialect, a minority language, and so on) is then carried out as described above.
When foreign-language text in a web page, mainly English, needs to be converted into speech, an existing speech synthesis module for foreign languages, mainly English, can be used to read aloud the foreign-language text displayed in the page.
When Chinese information in Chinese phonetic codes in a web page needs to be converted into a foreign language, mainly English, or foreign-language text in the page needs to be converted into Chinese phonetic codes, the bidirectional conversion module between Chinese phonetic codes and the foreign language can be called to perform the conversion in either direction.
When an English web page is to be converted into Chinese characters, Chinese phonetic codes, or speech, the computer must first determine, in the page source, whether each piece of English is content that will be displayed or an HTML statement symbol. HTML statement symbols must not be converted; displayed English content is converted.
Because the statements used to write a web page, such as HTML statements, are themselves written in "English plus special characters and symbols", the conversion must first distinguish the markup symbols of the page language from the content that will be displayed; only displayed content is converted. To prevent errors, all English HTML keywords and tag characters used by web pages can be stored in a table, for example: <html></html>, <head></head>, <title></title>, <body></body>, <b></b>, and so on. When the computer scans a string of English characters that might need conversion, it first looks the string up in this table. Only English symbols or symbol strings not found in the table are converted; otherwise the original English characters or symbol string are kept unchanged.
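The separation of markup from displayed content can be sketched as follows. Note that this sketch swaps in Python's standard `html.parser` module in place of the keyword table described above; the class `DisplayedTextConverter`, the function `convert_page`, and the use of `str.upper` as a stand-in for a real conversion are all hypothetical. Script and style contents are not treated specially here.

```python
from html.parser import HTMLParser


class DisplayedTextConverter(HTMLParser):
    """Rebuild a page, passing only displayed text through `convert`."""

    def __init__(self, convert):
        super().__init__(convert_charrefs=False)
        self.convert = convert
        self.out = []

    def handle_starttag(self, tag, attrs):
        self.out.append(self.get_starttag_text())  # markup is kept verbatim

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(self.convert(data))  # only displayed text is converted

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def convert_page(source, convert):
    parser = DisplayedTextConverter(convert)
    parser.feed(source)
    parser.close()
    return "".join(parser.out)


# "body" appears both as a tag name and as displayed text; only the text changes.
result = convert_page("<html><body><b>body</b> text</body></html>", str.upper)
```

A parser-based split avoids having to keep the keyword table complete, at the cost of depending on the parser's view of the page.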
Below are some examples of bidirectional translation between Chinese and English using the method of the invention:
Suppose the Chinese information in Chinese phonetic codes displayed in the web page is: wovmno mwvtisa xrvydu laadqawnv. By calling the bidirectional translation module between Chinese phonetic codes and a foreign language, mainly English, it can be translated as follows:
1. wovmno mwvtisa xrvydu laadqawnv. (Chinese information in Chinese phonetic codes)
We use Latin every day. (the same Chinese information in Chinese characters, rendered here in English)
a) Look up the pre-stored Chinese dictionary annotated with parts of speech and build the part-of-speech string of the sentence (the part of speech is given in parentheses; the same applies below):
wovmno (personal pronoun 1) + mwvtisa (time noun 1) + xrvydu (verb 1) + laadqawnv (noun 2).
we (personal pronoun 1) + every day (time noun 1) + use (verb 1) + Latin (noun 2).
b) Using the part-of-speech string obtained above, look up the pre-stored table to find the matching Chinese sentence pattern:
(a sentence pattern is the string of parts of speech together with the grammatical role each word plays; the same applies below)
wovmno (personal pronoun 1 as subject) + mwvtisa (time noun 1 as time adverbial) + xrvydu (verb 1 as predicate) + laadqawnv (noun 2 as object)
we (personal pronoun 1 as subject) + every day (time noun 1 as time adverbial) + use (verb 1 as predicate) + Latin (noun 2 as object)
c) Using the Chinese sentence pattern obtained above, look up the pre-stored table to find the corresponding English sentence pattern:
wovmno (personal pronoun 1 as subject) + xrvydu (verb 1 as predicate) + laadqawnv (noun 2 as object) + mwvtisa (time noun 1 as time adverbial)
we (personal pronoun 1 as subject) + use (verb 1 as predicate) + Latin (noun 2 as object) + every day (time noun 1 as time adverbial)
At this point, looking up the pre-stored Chinese-English dictionary to convert the meaning of each word or phrase, and outputting the result in the order of this sentence pattern, already completes the translation from Chinese into English. To show that this machine translation process is bidirectional, we carry the conversion further below:
d) Using the English sentence pattern obtained above, look up the pre-stored table to find the part-of-speech string that matches the parts of speech of the corresponding English words or phrases (this string can also be extracted from the target-language sentence pattern already obtained; the same applies below):
wovmno (personal pronoun 1) + xrvydu (verb 1) + laadqawnv (noun 2) + mwvtisa (time noun 1).
we (personal pronoun 1) + use (verb 1) + Latin (noun 2) + every day (time noun 1).
e) Look up the pre-stored Chinese-English dictionary to convert the meaning of each word or phrase, and output the result in the order of the English sentence pattern obtained above:
we (personal pronoun 1) use (verb 1) Latin (noun 2) every day (time noun 1).
We use Latin every day.
This completes the translation from Chinese into English. Besides going from a to e, the same method can be used to go back from e to a, which converts the English back into Chinese. This shows that the method of the invention can realize machine translation and that the process is bidirectional.
For example, take the English sentence "We use Latin every day." obtained above and apply steps similar to those used for translating Chinese into English, retracing the Chinese-to-English path backwards from e to a and to sentence 1. We obtain the following steps:
1. "We use Latin every day." (the English sentence obtained by translation)
e) Look up the pre-stored English dictionary annotated with the parts of speech of words or phrases and build the part-of-speech string:
we (personal pronoun 1) + use (verb 1) + Latin (noun 1) + every day (time noun 2).
d) Using the part-of-speech string obtained above, look up the pre-stored table to find the English sentence pattern:
we (personal pronoun 1 as subject) + use (verb 1 as predicate) + Latin (noun 1 as object) + every day (time noun 2 as time adverbial)
c) Using the English sentence pattern obtained above, look up the pre-stored table to find the corresponding Chinese sentence pattern:
we (personal pronoun 1 as subject) + every day (time noun 2 as time adverbial) + use (verb 1 as predicate) + Latin (noun 1 as object)
At this point, looking up the pre-stored bidirectional Chinese-English dictionary to convert the meaning of each word or phrase, and outputting the result in the order of this sentence pattern, already completes the translation from English into Chinese. To show that this machine translation process is bidirectional, we carry the conversion further below:
b) Using the Chinese sentence pattern obtained above, look up the pre-stored table to find the part-of-speech string that matches the parts of speech of the corresponding Chinese words or phrases:
we (personal pronoun 1) + every day (time noun 2) + use (verb 1) + Latin (noun 1)
a) Look up the pre-stored bidirectional Chinese-English dictionary to convert the meaning of each word or phrase, and output the result in the order of the Chinese sentence pattern obtained above:
we (personal pronoun 1) every day (time noun 2) use (verb 1) Latin (noun 1).
We use Latin every day. (in Chinese characters)
Finally, this can also be converted into Chinese information in Chinese phonetic codes, which brings us back to the original sentence 1:
1. wovmno mwvtisa xrvydu laadqawnv.
Thus, by reversing the Chinese-to-English process, we recover the Chinese sentence that we originally gave the system to translate into English, which shows that this machine translation method is bidirectional and reversible. Complex sentences can be translated in both directions in the same way.
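The forward direction of the walkthrough above (steps a to e) can be sketched as a chain of table lookups. All tables here are toy, hypothetical stand-ins for the pre-stored dictionaries and sentence-pattern tables, holding just the single example sentence; the names `POS`, `ZH_PATTERNS`, `EN_ORDER`, `ZH_EN`, and `translate` are introduced only for illustration.

```python
# Step a: dictionary annotated with parts of speech (toy, hypothetical).
POS = {"wovmno": "pronoun", "mwvtisa": "time noun",
       "xrvydu": "verb", "laadqawnv": "noun"}

# Step b: part-of-speech string -> Chinese sentence pattern (role of each word).
ZH_PATTERNS = {
    ("pronoun", "time noun", "verb", "noun"):
        ("subject", "time adverbial", "predicate", "object"),
}

# Step c: Chinese sentence pattern -> English sentence pattern (role order).
EN_ORDER = {
    ("subject", "time adverbial", "predicate", "object"):
        ("subject", "predicate", "object", "time adverbial"),
}

# Steps d and e: Chinese-English dictionary for words and phrases.
ZH_EN = {"wovmno": "we", "mwvtisa": "every day",
         "xrvydu": "use", "laadqawnv": "Latin"}


def translate(sentence):
    words = sentence.rstrip(".").split()
    pos = tuple(POS[w] for w in words)        # a) part-of-speech string
    roles = ZH_PATTERNS[pos]                  # b) Chinese sentence pattern
    target = EN_ORDER[roles]                  # c) English sentence pattern
    by_role = dict(zip(roles, words))
    english = " ".join(ZH_EN[by_role[r]] for r in target)  # d), e)
    return english[0].upper() + english[1:] + "."


english = translate("wovmno mwvtisa xrvydu laadqawnv.")
```

Because every step is a table lookup, the reverse direction would use the same structure with the tables inverted, which is the bidirectionality the text describes.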
When the web page displays Chinese information in Chinese characters, the characters can first be converted into Chinese phonetic codes by the method described earlier and then translated by the steps above. Likewise, Chinese phonetic codes obtained by translation can be converted into Chinese characters by the method described earlier when needed.
When web pages are obtained through a search engine, the keyword entered can be Chinese characters, pinyin, Chinese phonetic codes, a foreign language, or Chinese speech;
When the input is Chinese information in Chinese characters or pinyin, or foreign-language information, the characters, pinyin, or foreign-language text can be used directly as the search engine keyword. Alternatively, they can first be converted into Chinese phonetic codes by the method described earlier and the resulting codes used as the keyword. Conversely, Chinese phonetic codes typed into the keyword input box can first be converted into Chinese characters, pinyin, or a foreign language, and the result used as the keyword;
When the cursor is in the search engine's keyword input box and the keyword is entered by Chinese speech, the computer system calls the Chinese speech recognition module, which first converts the speech into Chinese characters or Chinese phonetic codes. The result is then used as the search keyword, or is first converted into a foreign language by the method described earlier and then used as the keyword;
The Chinese speech recognition module can be a conventional one. The word-level Chinese characters it produces are used directly as the search keyword, or are first converted into Chinese phonetic codes or a foreign language by the method described earlier and then used as the keyword;
When a Chinese phonetic code speech recognition module is used, the module takes the Chinese syllable as its recognition unit. By searching the pre-stored table that maps Chinese syllable sound templates to Chinese syllable phonetic codes, it identifies the phonetic code of each matched syllable; continuous speech input yields a continuous string of syllable phonetic codes. This string is segmented into words by looking up the dictionary pre-stored in the computer. Where several segmentations are possible, the choice is made using lexical, syntactic, and contextual relations and statistical regularities. In the segmented output, the syllables of one word are written together and words are separated by spaces.
Below are some examples of speech recognition of Chinese speech using the method of the invention:
Chinese speech recognition into Chinese phonetic codes:
For example, with the cursor in the search engine's keyword input box, we read the following Chinese sentence aloud:
"We can use Chinese Latin script."
(1) By searching the pre-stored table that maps Chinese syllable sound templates to Chinese syllable phonetic codes, the matching syllable phonetic code string is identified:
wov mno huiu xrv ydu hsu yyv laa dqa wnv. (with spaces between syllables)
or wovmnohuiuxrvyduhsuyyvlaadqawnv. (without spaces between syllables)
(Once the user is proficient, the neutral-tone symbol o at the end of mno can be omitted when this causes no confusion; the same applies above and below.)
For clarity, the letters that represent tones are underlined here. The tone letter in a phonetic code also serves as the syllable separator. In actual phonetic codes the tone letters are not underlined; once the reader is familiar with the codes, the tone letters that double as syllable separators are easy to pick out.
This completes a pure speech recognition process in which the complexity of the system is independent of the size of its dictionary.
If the Chinese speech carries a dialect accent, or is itself a Chinese dialect, then as long as the syllables of the dialect correspond in some way to Chinese syllables, a similar method applies. By searching a pre-stored table that maps the sound templates of the accented Chinese, or of the dialect syllables that correspond to Chinese syllables, to Chinese syllable phonetic codes, the matching syllable phonetic code string is identified. This realizes Chinese phonetic code recognition of the accented Chinese or dialect, and thus its conversion into Chinese phonetic codes.
(2) Segment the Chinese phonetic code string into words, completing the conversion into word-level phonetic codes.
By searching the pre-stored dictionary of segmented Chinese phonetic code words, the syllables of each word are written together and words are separated by spaces, which gives the Chinese phonetic codes we finally need:
wovmno huiu xrvydu hsuyyv laadqawnv.
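Step (2) can be sketched as dictionary-based segmentation over the recognized syllable codes. The sketch assumes forward maximum matching as the segmentation strategy, which the text does not prescribe for this step, and it does not attempt the context-based and statistical disambiguation mentioned earlier. The names `WORDS`, `MAX_SYLLABLES`, and `segment` are hypothetical.

```python
# Hypothetical dictionary of word-level Chinese phonetic codes.
WORDS = {"wovmno", "huiu", "xrvydu", "hsuyyv", "laadqawnv"}
MAX_SYLLABLES = 4  # assumed upper bound on syllables per word


def segment(syllables, dictionary=WORDS, max_len=MAX_SYLLABLES):
    """Join syllable codes into words: longest dictionary match first."""
    words, i = [], 0
    while i < len(syllables):
        for j in range(min(len(syllables), i + max_len), i, -1):
            candidate = "".join(syllables[i:j])
            # A single syllable is always accepted as a fallback.
            if candidate in dictionary or j == i + 1:
                words.append(candidate)
                i = j
                break
    return " ".join(words)


syllables = ["wov", "mno", "huiu", "xrv", "ydu", "hsu", "yyv", "laa", "dqa", "wnv"]
segmented = segment(syllables)
```

The single-syllable fallback keeps the loop from stalling on syllables that are missing from the dictionary.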
Once the Chinese phonetic codes are obtained, they can be converted further, when needed, into word-level Chinese characters, pinyin, or a foreign language by the methods described earlier.
It should be emphasized that this conversion into word-level Chinese characters, pinyin, or a foreign language has no necessary connection with the speech recognition system; the standard conversion module can run independently of it.
Any web page found by the above methods can, as required, have all or part of its content, and the paths or files of its hyperlinks, converted into specified content and specified hyperlink paths or files, for example:
<html>
<head>
<title>test</title>
</head>
<body>
Click <a href=1.html>Chinese</a>
</body>
</html>
The web page now displays: "Click Chinese", where "Chinese" is the hyperlink text (the same applies below). After "Chinese" is clicked, the page jumps automatically to the content of the page represented by the file 1.html.
Suppose the two Chinese words shown above as "Click" and "Chinese" are converted into Chinese phonetic codes according to the code table, using the character-to-phonetic-code method described earlier, which gives "disvjia" and "hsuyyv", and "1.html" is changed to "2.html". Clicking "hsuyyv" then makes the page jump automatically to the content of the page represented by the file 2.html. To achieve this, it is enough to convert the source of the page above into the following source:
<html>
<head>
<title>test</title>
</head>
<body>
disvjia <a href=2.html>hsuyyv</a>
</body>
</html>
The web page now displays: disvjia hsuyyv
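The page rewrite in this example can be sketched as follows: displayed words are replaced through a lookup table and hyperlink targets through a second table, while all other markup is left untouched. The tables `TEXT_TABLE` and `LINK_TABLE` and the function `convert_source` are hypothetical; "Click" and "Chinese" stand in for the Chinese-character words of the original page, and the simple regular expressions are an assumption that only suits small, well-formed pages like this one.

```python
import re

TEXT_TABLE = {"Click": "disvjia", "Chinese": "hsuyyv"}  # displayed word -> phonetic code
LINK_TABLE = {"1.html": "2.html"}                       # original target -> converted page

TAG = re.compile(r"(<[^>]+>)")
ANCHOR = re.compile(r"<a href=([^\s>]+)>")
WORD = re.compile(r"[A-Za-z]+")


def convert_source(source):
    out = []
    for part in TAG.split(source):
        if part.startswith("<"):
            # Markup: only the hyperlink target may change.
            m = ANCHOR.fullmatch(part)
            if m and m.group(1) in LINK_TABLE:
                part = f"<a href={LINK_TABLE[m.group(1)]}>"
        else:
            # Displayed text: convert the words found in the table.
            part = WORD.sub(lambda w: TEXT_TABLE.get(w.group(0), w.group(0)), part)
        out.append(part)
    return "".join(out)


converted = convert_source("<body>\nClick <a href=1.html>Chinese</a>\n</body>")
```

Words and targets that are missing from the tables pass through unchanged, so a partial conversion of the page is possible.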
In the above methods of information search through a search engine, using Chinese characters, pinyin, Chinese phonetic codes, a foreign language, or Chinese speech as the keyword, the search results can, according to a preset setting, be converted by the methods described earlier into Chinese characters, pinyin, Chinese phonetic codes, a foreign language, or Chinese speech before being output.
When a web page is obtained not through a search engine but by other means, for example a page in Chinese characters, pinyin, Chinese phonetic codes, or a foreign language obtained through a web browser, the methods described earlier can likewise be applied. After conversion by the corresponding module and method, the page is output in the information category preset by the system. This category can be, but is not limited to, Chinese characters, pinyin, Chinese phonetic codes, a foreign language, speech of a particular Chinese speaker, Chinese dialect speech, minority language speech, Chinese speech, or foreign-language speech.
The methods of web page conversion and translation above were described using text files with the suffixes ".html" and ".hml" as examples. In fact, web page source files in any other valid format that a browser can interpret, including those used in embedded systems, can be converted and translated by the same or similar methods, so that the displayed content of all kinds of web pages can be converted and translated.

Claims (2)

1. A method for searching, converting, and translating various web page information using Chinese phonetic codes, characterized by comprising the following steps:
Step A:
One. The initial, final, and tone of each syllable in the Chinese phonetic codes are encoded as follows:
The symbols in parentheses are the pinyin symbols of the "Scheme for the Chinese Phonetic Alphabet"; the letters outside parentheses are the coding symbols this scheme uses for the initial, final, and tone of each syllable of the Chinese phonetic codes;
(1) Initial codes: except that (zh) is encoded as j, (ch) as q, and (sh) as x, the initials use exactly the same letters as the initials of the Scheme for the Chinese Phonetic Alphabet. The specific initial codes are:
b:(b) p:(p) m:(m) f:(f) d:(d) t:(t)
n:(n) l:(l) g:(g) k:(k) h:(h)
j:(j) q:(q) x:(x) r:(r)
z:(z) c:(c) s:(s) y:(y) w:(w)
(2) Medial codes: the pinyin letter (ü) is represented by one of the 26 Latin letters. The medial codes are:
i:(i) u:(u) y:(ü)
(3) Final codes: among the simple finals, (ü) is represented by one of the 26 Latin letters and the others use the same letters as pinyin; the compound finals of pinyin are encoded with consonant letters. The finals of pinyin are encoded with the following letters:
a:(a) o:(o) e:(e) i:(i) u:(u) y:(ü)
k:(ao) c:(ai) s:(an) x:(ou) w:(ei) n:(en)
z:(ua) l:(uo) b:(ang) d:(ong) p:(eng)
q:(ing) g:(ng) er:(er)
r:(i)
(4) Tone codes: except that the consonant letter v, which is not used as a Chinese initial, represents the third tone (∨) of pinyin, the tones of Chinese are represented by vowel letters. The tones of pinyin are encoded with the following letters:
a:(-) first tone (high level) e:(/) second tone (rising) v:(∨) third tone u:(\) fourth tone (falling) o:(unmarked) neutral tone
Two. Chinese information is represented with the Chinese phonetic codes defined above as follows:
Taking the word as the unit (a single Chinese character is regarded here as one syllable), each syllable of the word is encoded, according to its pinyin in the "Scheme for the Chinese Phonetic Alphabet", in the order "initial code + medial code + final code + tone code, which doubles as the syllable separator". The syllables of one word are written together without spaces, and the codes of different words are separated by spaces. When Chinese information is in the Chinese phonetic code state, its six stop marks, seven marker symbols, and line-break hyphen take the same form as in English;
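The coding order in Step A can be illustrated with a short sketch built from the tables above. It assumes the caller has already split each pinyin syllable into initial, medial, final, and tone number; that splitting, the names `encode_syllable` and `encode_word`, and the label "-i" for the final listed as r:(i) are all assumptions made here. Reading r:(i) as the vowel written "i" after zh, ch, sh, r, z, c, s is an interpretation, supported by the document's example xrvydu for shǐyòng.

```python
# Initials: only zh, ch, sh change; all others keep their pinyin letter.
INITIALS = {"zh": "j", "ch": "q", "sh": "x"}
MEDIALS = {"": "", "i": "i", "u": "u", "ü": "y"}
FINALS = {
    "a": "a", "o": "o", "e": "e", "i": "i", "u": "u", "ü": "y",
    "ao": "k", "ai": "c", "an": "s", "ou": "x", "ei": "w", "en": "n",
    "ua": "z", "uo": "l", "ang": "b", "ong": "d", "eng": "p",
    "ing": "q", "ng": "g", "er": "er",
    "-i": "r",  # assumed label for the final listed as r:(i)
}
TONES = {1: "a", 2: "e", 3: "v", 4: "u", 0: "o"}  # 0 = neutral tone


def encode_syllable(initial, medial, final, tone):
    """initial code + medial code + final code + tone code."""
    return INITIALS.get(initial, initial) + MEDIALS[medial] + FINALS[final] + TONES[tone]


def encode_word(syllables):
    """Syllables of one word are written together without spaces."""
    return "".join(encode_syllable(*s) for s in syllables)


women = encode_word([("w", "", "o", 3), ("m", "", "en", 0)])    # wǒmen
shiyong = encode_word([("sh", "", "-i", 3), ("y", "", "ong", 4)])  # shǐyòng
meitian = encode_word([("m", "", "ei", 3), ("t", "i", "an", 1)])   # měitiān
```

Because every syllable ends in exactly one tone letter, the tone code also marks the syllable boundary, as the step describes.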
Step B:
During information search, an existing conventional search engine is used as the basis. Either Chinese characters, pinyin, Chinese phonetic codes, or a foreign language are entered directly into the search engine's keyword input box as the search keyword; or the Chinese characters, pinyin, Chinese phonetic codes, foreign language, or Chinese speech entered into the keyword input box are first converted by the corresponding conversion module into a preset information category and the search is then performed. The information retrieved can be output in the system default or a preset information category, the information categories comprising Chinese characters, pinyin, Chinese phonetic codes, foreign language, speech of a particular Chinese speaker, Chinese dialect speech, minority language speech, Chinese speech, or foreign-language speech;
When a web page carrying Chinese information in Chinese characters or pinyin is converted into a web page carrying Chinese information in Chinese phonetic codes, the computer system first finds the source file of the page and converts the Chinese content, in characters or pinyin, that the page will display. By calling the bidirectional conversion module between Chinese characters or pinyin and Chinese phonetic codes, all Chinese characters or pinyin that will be displayed are converted into Chinese phonetic codes at their original positions in the page. The characters to be converted are all Chinese characters except those used as file names and those used as Chinese font names;
When a Chinese character web page is converted into a Chinese phonetic code web page, the English text, English letters, Arabic numerals, Western punctuation marks, and line-break hyphens already in the page need no conversion and are kept as they are;
For Chinese characters used as file names in a web page, so that the page can be displayed and run in a pure Western code (also called pure ASCII) computer system, the file names must be converted into Chinese phonetic codes. After conversion, the file that originally had the Chinese character name must be copied and stored in a suitable location, such as a specified folder on a given server or on the local machine, so that the computer system can find the file under its Chinese phonetic code name;
For Chinese font names, when the Chinese font name does not exist in the Western code (ASCII) system, the computer either automatically replaces it with a reasonably close Western font name that is preset and stored in the computer, or uses the computer's preset default Western font name;
When Chinese phonetic codes in a web page are converted into pinyin, the system either looks up the code table in Step A, or looks up a pre-stored table, generated from the code table in Step A, that maps syllable-level or word-level Chinese phonetic codes to syllable-level or word-level pinyin. After matching, the corresponding pinyin is found and replaces the converted Chinese phonetic codes at their positions in the original page;
When Chinese phonetic codes in a web page are converted into Chinese characters, they are either first converted into word-level pinyin and then into word-level Chinese characters, or looked up directly in a pre-stored table that maps word-level Chinese phonetic codes to Chinese characters. After matching, the corresponding characters are found and replace the converted Chinese phonetic codes at their positions in the original page;
When homophones are encountered, they are first disambiguated using lexical, syntactic, and contextual relations and statistical regularities, and the word-level Chinese characters are then selected;
When Chinese phonetic codes are converted into Chinese characters or pinyin, the punctuation marks also change from the state identical to English into the corresponding Chinese punctuation marks;
In the time that the Chinese phonetics codes in webpage converts voice to, adopt and look into the Chinese language being stored in advance in computer system respectivelyTone code and syllable, word, the paragraph phonetic synthesis file table of comparisons are exported corresponding voice;
When the phonetic synthesis file of Chinese phonetics codes or Chinese phonetics codes string being distinguished to corresponding syllable, word or paragraph respectivelyWhile changing the phonetic synthesis file of Chinese particular person, Chinese dialect, minority language into, be stored in advance in computer by looking intoChinese phonetics codes or the syllable of Chinese phonetics codes string and respectively corresponding Chinese particular person, Chinese dialect, minority language,The phonetic synthesis file table of comparisons of word or paragraph, exports respectively corresponding Chinese particular person, Chinese dialect, minority languageVoice;
When the punctuation marks and line-break hyphens in a Chinese phonetics code webpage are synthesized into speech, the corresponding sound files for punctuation marks and line-break hyphens, stored in advance in the computer, are simply retrieved and played with audio playback software;
When the speech-synthesis files for these punctuation marks and line-break hyphens are those of a specific Chinese speaker, a Chinese dialect, or a minority language, the sound read aloud for each punctuation mark or line-break hyphen is that of the specific speaker, dialect, or minority language;
When foreign-language text in a webpage, including English, is converted into speech, an existing foreign-language speech-synthesis module (including one for English) is used to read aloud the foreign-language text displayed in the webpage;
When Chinese information expressed in Chinese phonetics codes in a webpage is converted into a foreign language, including English, or foreign-language text in a webpage, including English, is converted into Chinese information expressed in Chinese phonetics codes, the computer first distinguishes which displayed content is Chinese phonetics code and which is foreign-language text; it then calls the bidirectional conversion module between Chinese phonetics codes and foreign languages, stored in advance in the computer, and performs the conversion in place: the Chinese information expressed in phonetics codes is replaced by foreign-language text at the position of the converted codes, or the foreign-language text is replaced by Chinese information expressed in phonetics codes at the position of the converted text;
When the cursor rests in the keyword input box of a search engine and the keyword to be searched is entered by Chinese speech, the computer system calls the Chinese speech recognition module, which first converts the input Chinese speech, whether standard, dialect-accented, or dialect, into Chinese characters or Chinese phonetics codes; the resulting Chinese characters or Chinese phonetics codes are then used as the search engine keyword for a web search, or are first converted into a foreign language by the methods described above and then used as the search engine keyword for a web search;
For all webpages found by the above methods, all or part of the content of the original webpage and the paths or files of its hyperlinks are changed, as required, to specified content and to the paths or files of specified hyperlinks.
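The text-conversion steps of claim 1 (table lookup, homophone discrimination, and punctuation conversion) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the claim does not give the actual format of Chinese phonetics codes or the statistical model, so the space-separated pinyin-like strings, the table `CODE_TO_WORDS`, the counts in `BIGRAM_COUNTS`, and the function names are all hypothetical.

```python
import re

# Hypothetical word-level lookup table; one code may map to several homophones.
CODE_TO_WORDS = {
    "wo": ["我"],
    "zai": ["在", "再"],
    "jia": ["家"],
    "jian": ["见"],
}

# Hypothetical co-occurrence counts standing in for the "statistical regularities".
BIGRAM_COUNTS = {
    ("我", "在"): 9,
    ("我", "再"): 1,
    ("在", "家"): 7,
    ("再", "见"): 8,
}

# English-style punctuation mapped to the corresponding Chinese punctuation.
PUNCT = {",": "，", ".": "。", "?": "？", "!": "！", ";": "；", ":": "："}


def pick_word(candidates, prev_word, next_candidates):
    """Choose the homophone that best fits its left and right neighbours."""
    def score(word):
        left = BIGRAM_COUNTS.get((prev_word, word), 0)
        right = max(
            (BIGRAM_COUNTS.get((word, nxt), 0) for nxt in next_candidates),
            default=0,
        )
        return left + right
    return max(candidates, key=score)


def codes_to_hanzi(text):
    """Replace each known code with a Chinese word and convert punctuation."""
    # Split into alphabetic codes and single non-letter, non-space characters.
    tokens = re.findall(r"[A-Za-z]+|[^\sA-Za-z]", text)
    out = []
    prev_word = None
    for i, tok in enumerate(tokens):
        if tok in CODE_TO_WORDS:
            nxt = tokens[i + 1] if i + 1 < len(tokens) else None
            next_candidates = CODE_TO_WORDS.get(nxt, [])
            prev_word = pick_word(CODE_TO_WORDS[tok], prev_word, next_candidates)
            out.append(prev_word)
        else:
            # Unknown tokens are kept as they are; punctuation is converted.
            out.append(PUNCT.get(tok, tok))
            prev_word = None
    return "".join(out)


print(codes_to_hanzi("wo zai jia, zai jian."))
```

In this toy table the code `zai` resolves to 在 after 我 and to 再 before 见, which is the kind of context-based choice the homophone step describes. A real system would need a far larger lexicon and a proper language model.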
2. The Chinese phonetics codes method for search, conversion, and translation of various webpage information as claimed in claim 1, further characterized in that: when a webpage is obtained not through a search engine but by other means, including various web browsers, the resulting webpage, which expresses information in Chinese characters, Hanyu Pinyin, Chinese phonetics codes, or a foreign language, is converted by the corresponding modules and methods, including the methods of step A and step B of claim 1, and is then output as a webpage in the information category preset by the system; these categories comprise Chinese characters, Hanyu Pinyin, Chinese phonetics codes, foreign language, speech of a specific Chinese speaker, Chinese dialect speech, minority-language speech, Chinese speech, and foreign-language speech.
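Claim 2 amounts to routing a page from whatever category it arrives in to the category preset by the system. One possible way to organize that routing is a registry of converters keyed by source and target category. The sketch below is only an assumption about structure: the category names, the `register` decorator, `convert_page`, and the one-entry placeholder tables are hypothetical and stand in for the conversion modules the claims refer to.

```python
from typing import Callable, Dict, Tuple

Converter = Callable[[str], str]

# Registry keyed by (source category, target category); names are hypothetical.
CONVERTERS: Dict[Tuple[str, str], Converter] = {}


def register(source: str, target: str):
    """Decorator that records a converter for one source/target pair."""
    def decorator(func: Converter) -> Converter:
        CONVERTERS[(source, target)] = func
        return func
    return decorator


@register("phonetic_code", "hanzi")
def code_to_hanzi(text: str) -> str:
    # Placeholder table standing in for the pre-stored code/word lookup table.
    table = {"ni hao": "你好"}
    return table.get(text, text)


@register("hanzi", "phonetic_code")
def hanzi_to_code(text: str) -> str:
    # Placeholder table for the reverse direction.
    table = {"你好": "ni hao"}
    return table.get(text, text)


def convert_page(text: str, source: str, target: str) -> str:
    """Convert page text to the preset target category, if a converter exists."""
    if source == target:
        return text
    try:
        converter = CONVERTERS[(source, target)]
    except KeyError:
        raise ValueError(f"no converter registered for {source} -> {target}") from None
    return converter(text)


print(convert_page("ni hao", "phonetic_code", "hanzi"))
```

Keeping the converters in a table means a new output category, such as dialect speech, could be added by registering one more function without touching the routing code.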
CN201010564052.3A 2010-11-26 2010-11-26 The various webpage information search transition translation of Chinese phonetics codes method Expired - Fee Related CN102479208B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010564052.3A CN102479208B (en) 2010-11-26 2010-11-26 The various webpage information search transition translation of Chinese phonetics codes method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201010564052.3A CN102479208B (en) 2010-11-26 2010-11-26 The various webpage information search transition translation of Chinese phonetics codes method

Publications (2)

Publication Number Publication Date
CN102479208A CN102479208A (en) 2012-05-30
CN102479208B true CN102479208B (en) 2016-05-25

Family

ID=46091856

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010564052.3A Expired - Fee Related CN102479208B (en) 2010-11-26 2010-11-26 The various webpage information search transition translation of Chinese phonetics codes method

Country Status (1)

Country Link
CN (1) CN102479208B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103514144B (en) * 2012-06-29 2017-03-01 三菱电机株式会社 Service manual creating device

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103853705A (en) * 2012-11-28 2014-06-11 上海能感物联网有限公司 Real-time voice subtitle translation method of Chinese voice and foreign language voice of computer
CN103853704A (en) * 2012-11-28 2014-06-11 上海能感物联网有限公司 Method for automatically adding Chinese and foreign subtitles to foreign language voiced video data of computer
CN103853709A (en) * 2012-12-08 2014-06-11 上海能感物联网有限公司 Method for automatically adding Chinese/foreign language subtitles for Chinese voiced image materials by computer
CN103854648A (en) * 2012-12-08 2014-06-11 上海能感物联网有限公司 Chinese and foreign language voiced image data bidirectional reversible voice converting and subtitle labeling method
CN103853708A (en) * 2012-12-08 2014-06-11 上海能感物联网有限公司 Method for automatically adding Chinese subtitles for Chinese voiced image materials by computer
CN103905743A (en) * 2012-12-30 2014-07-02 上海能感物联网有限公司 Phonotape and videotape recording and broadcasting method for automatic and real-time Chinese subtitles labeling with Chinese language
CN103902529A (en) * 2012-12-30 2014-07-02 上海能感物联网有限公司 Audio-video recording and broadcasting method capable of automatically annotating with Chinese and foreign language subtitles for foreign languages
CN103902530A (en) * 2012-12-30 2014-07-02 上海能感物联网有限公司 Audio and video recording and broadcasting method for automatically annotating Chinese and foreign language subtitles in Chinese in real time
CN103902531A (en) * 2012-12-30 2014-07-02 上海能感物联网有限公司 Audio and video recording and broadcasting method for Chinese and foreign language automatic real-time voice translation and subtitle annotation
CN104020840B (en) * 2013-03-03 2019-01-11 上海能感物联网有限公司 The method that foreign language text is remotely controlled computer program operation
CN103279190B (en) * 2013-06-16 2016-01-13 青海汉拉信息科技股份有限公司 Chinese language text calls the device that computer program runs
CN103279362A (en) * 2013-06-19 2013-09-04 江苏华音信息科技有限公司 Device for remotely controlling operation of computer programs through foreign language texts
CN103279363B (en) * 2013-06-19 2016-12-28 青海汉拉信息科技股份有限公司 The device that Chinese speech remote control computer program is run
CN103279463A (en) * 2013-06-19 2013-09-04 江苏华音信息科技有限公司 Device for performing real-time voice subtitle translation on Chinese voice and foreign language voice through computer
CN104239364A (en) * 2013-06-24 2014-12-24 上海能感物联网有限公司 Remote guiding machine information querying method through foreign language texts
CN104698999A (en) * 2013-12-05 2015-06-10 上海能感物联网有限公司 Controller device for robot under foreign language natural language text filed control
CN107391461B (en) * 2017-07-14 2020-07-31 中央民族大学 Tibetan language code encoding method and device and Tibetan language code decoding method and device
CN107451129B (en) * 2017-08-08 2020-09-25 传神语联网网络科技股份有限公司 Method and system for judging and translating irregular words or irregular short sentences
CN111161706A (en) * 2018-10-22 2020-05-15 阿里巴巴集团控股有限公司 Interaction method, device, equipment and system
CN110059168A (en) * 2019-01-23 2019-07-26 艾肯特公司 The method that man-machine interactive system based on natural intelligence is trained

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5475767A (en) * 1989-12-30 1995-12-12 Du; Bingchan Method of inputting Chinese characters using the holo-information code for Chinese characters and keyboard therefor
CN101118539A (en) * 2006-08-01 2008-02-06 苗玉水 Modern Chinese information holographic Latinizing Chinese voice code representation
CN101123089A (en) * 2006-08-08 2008-02-13 苗玉水 Voice mixing method for Chinese voice code

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05266069A (en) * 1992-03-23 1993-10-15 Nec Corp Two-way machine translation system between chinese and japanese languages

Also Published As

Publication number Publication date
CN102479208A (en) 2012-05-30

Similar Documents

Publication Publication Date Title
CN102479208B (en) The various webpage information search transition translation of Chinese phonetics codes method
CN102902660B (en) Chinese phonetics codes spelling and Mixed Pinyin Chinese holographic information processing method
CN101118541B (en) Chinese-voice-code voice recognizing method
Baker Glossary of corpus linguistics
Schultz et al. Multilingual speech processing
CN101131689B (en) Bidirectional mechanical translation method for sentence pattern conversion between Chinese language and foreign language
Masmoudi et al. Arabic transliteration of romanized tunisian dialect text: A preliminary investigation
CN101118540A (en) Chinese characters phonetic and Chinese voice code bidirectional reversible transform method
CN101123089B (en) Voice mixing method for Chinese voice code
Josan et al. A Punjabi to Hindi machine transliteration system
Richmond et al. On generating Combilex pronunciations via morphological analysis
Aswani et al. A hybrid approach to align sentences and words in English-Hindi parallel corpora
Yaseen et al. Building Annotated Written and Spoken Arabic LRs in NEMLAR Project.
CN101727195B (en) Various information input method of Chinese phonetics codes
CN103854648A (en) Chinese and foreign language voiced image data bidirectional reversible voice converting and subtitle labeling method
Gugliotta et al. Tarc: Incrementally and semi-automatically collecting a tunisian arabish corpus
CN103164398A (en) Chinese-Uygur language electronic dictionary and automatic translating Chinese-Uygur language method thereof
Liesenfeld et al. Building and curating conversational corpora for diversity-aware language science and technology
CN103164396A (en) Chinese-Uygur language-Kazakh-Kirgiz language electronic dictionary and automatic translating Chinese-Uygur language-Kazakh-Kirgiz language method thereof
CN103164395A (en) Chinese-Kirgiz language electronic dictionary and automatic translating Chinese-Kirgiz language method thereof
CN103853705A (en) Real-time voice subtitle translation method of Chinese voice and foreign language voice of computer
CN103853709A (en) Method for automatically adding Chinese/foreign language subtitles for Chinese voiced image materials by computer
Dhore et al. Issues in hindi to english and marathi to english machine transliteration of named entities
Anto et al. Text to speech synthesis system for English to Malayalam translation
Carletta et al. The NITE XML Toolkit meets the ICSI Meeting Corpus: Import, annotation, and browsing

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C41 Transfer of patent application or patent right or utility model
TA01 Transfer of patent application right

Effective date of registration: 20160420

Address after: Room 510, Incubator Building, No. 26 Jingsi Road, Qinghai Biotechnology Industrial Park, Xining, Qinghai Province 810003

Applicant after: QINGHAI HANLA INFORMATION TECHNOLOGY CO., LTD.

Address before: Room 105, No. 44, Kongjiang Village, Yangpu District, Shanghai 200093

Applicant before: Miao Yushui

C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160525

Termination date: 20201126