CN102902660B - Chinese phonetics codes spelling and Mixed Pinyin Chinese holographic information processing method - Google Patents

Info

Publication number
CN102902660B
Authority
CN
China
Prior art keywords
chinese
word
phonetics codes
language
syllable
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201110212394.3A
Other languages
Chinese (zh)
Other versions
CN102902660A (en
Inventor
苗玉水
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
QINGHAI HANLA INFORMATION TECHNOLOGY CO., LTD.
Original Assignee
Qinghai Hanla Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qinghai Hanla Information Technology Co Ltd filed Critical Qinghai Hanla Information Technology Co Ltd
Priority to CN201110212394.3A priority Critical patent/CN102902660B/en
Publication of CN102902660A publication Critical patent/CN102902660A/en
Application granted granted Critical
Publication of CN102902660B publication Critical patent/CN102902660B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

The present invention is a method for holographic processing of Chinese information using full-spelling and mixed-spelling Chinese phonetic codes, and belongs to the technical field of computer processing of Chinese characters and Chinese-language information. It processes Chinese information holographically, word by word, using the 26 Latin letters and nothing else, and is 100% compatible with ASCII. The invention can be widely applied to Chinese information processing, book publishing, Chinese teaching, rural literacy campaigns, teaching Chinese as a foreign language, Chinese syllable synthesis and recognition, the display of Chinese information in computer files of various formats and in web pages, information search, programming in Chinese, and fields such as the marking of Internet domain names and trademarks that carry a Chinese meaning. The Chinese phonetic codes adopted by the invention can be used directly to express Chinese information, which is a great convenience for people who cannot read Chinese characters, or are not accustomed to them, when they learn, understand, master, and express Chinese information and standard Mandarin.

Description

Chinese phonetics codes spelling and Mixed Pinyin Chinese holographic information processing method
One, technical field
The technology of this application is a method for holographic processing of Chinese information using full-spelling and mixed-spelling Chinese phonetic codes; it belongs to the technical field of computer processing of Chinese characters and Chinese-language information. It processes Chinese information holographically, word by word, using the 26 Latin letters and nothing else, and is 100% compatible with ASCII. The method covers the following operations on holographic Chinese information: handwriting input with a stylus that recognizes Western-language writing, typing on a standard English keyboard, other forms of information input, typesetting, printing, storage, display, communication, information transmission, speech recognition, speech synthesis, intelligent Chinese word segmentation, machine translation, information search, representation and display in various computer file formats and web pages, forming network domain names by adding the prefixes and suffixes of legal domain names for logging in to the corresponding websites, programming with Chinese characters and words, and the marking of trademarks. It is a method used on general-purpose computers or embedded computer systems, referred to below simply as computers or computer systems.
The Chinese phonetic code adopted by the method has two spellings, full spelling and mixed spelling. Because Chinese characters and the full, simplified, and mixed spellings of the Chinese phonetic code can be converted into one another through a code table, any technical solution or example described below for Chinese characters or for one of the three spellings necessarily also applies to the other forms, and this is not repeated at each point. For convenience, the full, simplified, and mixed spellings are referred to collectively below as the Chinese phonetic code where appropriate.
Two, background technology
Since the 1940s, the rapid development of computers has set off a third technological revolution centered on the electronic computer. It has freed people from heavy mental labor and opened a new era in the liberation of the human mind.
As is well known, computers process character information mainly by handling the 128 ASCII characters. Because the 26 Latin letters belong to the ASCII character set, countries that use alphabetic writing built on those 26 letters, with English as the typical case, have been able to carry out the current technological revolution smoothly and to benefit from rapid economic growth. Before the First World War, only 60 countries spelled their national language with the 26 Latin letters; after the Second World War, the number of countries that spell their mother tongue with the 26 letters reached 120. This reflects the value orientation of most countries on our planet on this question.
Because China long failed to invent a technology that expresses Chinese information holographically using the 26 letters and nothing else, China, unlike the great majority of countries, still records its language in square ideographic characters, which are not an alphabetic script. This causes great trouble for computer processing of Chinese and of Chinese characters. In 1958 the Central People's Government of China promulgated the "Scheme for the Chinese Phonetic Alphabet" (Hanyu Pinyin) as an auxiliary means of expressing Chinese information. Owing to the historical limitations of the time, however, the Scheme falls short of what modern computers require of an information processing technology in the following respects: first, its spellings are too long; second, its five tones are not expressed as letters and lie outside the ASCII range; third, the initial, final, and tone of a syllable are not arranged in the left-to-right, one-dimensional linear order that suits computer processing, since the tone is stacked above the final; fourth, without the help of a non-alphabetic syllable separator, syllables written together word by word are easily confused with one another, producing mixed readings. All of these make it hard for computers to process Chinese information.
An ideal phonetic code for Chinese syllables, convenient for computer processing of Chinese information, would satisfy the following conditions. First, each syllable contains the complete information on initial, final, and tone. Second, when any number of syllables are written together, the syllables cannot be confused with one another and no mixed reading arises. Third, the whole code uses a left-to-right, one-dimensional linear arrangement of the 26 Latin letters, so that it is 100% compatible with ASCII and convenient for computer processing. Fourth, the whole code converts easily to and from Hanyu Pinyin, Chinese speech, and Chinese characters, word by word. Fifth, the code itself can express Chinese information directly, without being converted to characters, Pinyin, or speech, and people can readily spell it out as standard Mandarin pronunciation and so understand the meaning it expresses.
Many experts and scholars have studied and explored this problem. Chinese, however, is a tonal language of a rather special kind. To encode its 22 initials (including one zero initial), 38 finals, and 5 tones (including the neutral tone) using the 26 Latin letters and nothing else, and at the same time to keep syllables from being confused after any number of them are written together, each syllable must also carry an implicit, alphabetized syllable separator. This makes the technical problem very difficult, and may be the basic reason why it has long gone without an effective solution. Reportedly, among the historical Chinese script-reform schemes and in the current Microsoft Pinyin, the only comparable approach represents the tone of a syllable with an Arabic numeral, writing each syllable as "initial spelling + final spelling + tone digit". This is an advance over Hanyu Pinyin, which writes the tone above the final, and it remedies one of the deficiencies of the Scheme listed above, namely that the initial, final, and tone of a syllable are not arranged in a left-to-right, one-dimensional linear order convenient for computer processing. The other main deficiencies of the Scheme remain unsolved. From the standpoint of coding technology, the essence of the matter is that no one had invented a technique for encoding the 22 initials (including one zero initial), 38 finals, and 5 tones (including the neutral tone) of Chinese using 26 code elements and nothing else, in particular the 26 Latin letters. Still less had anyone invented the technique, made necessary by using only the 26 Latin letters as code elements, for separating the syllables of a word once they are written together, or carried out computer processing of Chinese information with a phonetic code built on such techniques.
Three, summary of the invention
The object of the invention is to provide an entirely new coding, based on the pronunciation characteristics of Chinese, that encodes the initial, medial, final, and tone of every Chinese syllable scientifically and rationally using the 26 Latin letters and nothing else. In full spelling, written word by word, each syllable is encoded in the order "initial identical to Hanyu Pinyin + medial identical to Hanyu Pinyin + final identical to Hanyu Pinyin + tone code doubling as syllable separator". In simplified spelling, each syllable is encoded in the order "initial code + medial code + final code + tone code doubling as syllable separator". Syllables are written together word by word in order to carry out stylus handwriting input, typing on a standard English keyboard, other information input, typesetting, printing, storage, display, communication, information transmission, speech recognition, speech synthesis, intelligent Chinese word segmentation, machine translation, information search, representation and display in various computer file formats and web pages, forming network domain names by adding the prefixes and suffixes of legal domain names for logging in to the corresponding websites, programming with Chinese characters and words, the marking of trademarks, and so on. Chinese information is thereby processed directly in this code, which overcomes the deficiencies described above.
As is well known, Chinese expresses and conveys information through words, the smallest meaningful units of the language that can be used freely. A Chinese word consists of one or more syllables (a syllable generally corresponds to one Chinese character, so a character used on its own can be regarded as a monosyllabic word), and every syllable, however complex, is made up of three parts: initial, final, and tone. The invention uses the internationally common 26 Latin letters to encode all the initials, medials, finals, and tones of the "Scheme for the Chinese Phonetic Alphabet" according to a same-sound, same-shape rule, and writes them in sequence. Then, following the word-formation rules of Chinese, the syllables of a word are written continuously with no space between them, which completes the writing of a Chinese word. Each of these steps applies equally to typing on a standard English keyboard, typesetting, printing, storage, display, communication, and transmission. The computer processing of Chinese information by the various methods listed above can then be carried out on the basis of Chinese words (including monosyllabic characters) handled in this way.
For example, the method of the invention is used to represent the following Chinese words:
When the words above are written one after another in the word order that expresses the Chinese meaning, with a space between words (and likewise typed on a standard English keyboard, typeset, printed, stored, displayed, communicated, or transmitted), they express a Chinese sentence. The sentence can be expressed in the following four ways:
1. Expressed directly in the full-spelling Chinese phonetic code of the method:
Wovmenohuiushivyonguhanuyyvlaadingawenv.
2. Expressed directly in the simplified-spelling Chinese phonetic code of the method:
Wovmnohuiuxrvyduhcuyyvlaadqawnv.
3. Expressed in Hanyu Pinyin:
Wǒmenhuìshǐyònghànyǔlādīngwěn。
4. Expressed in Chinese characters:
We can use the Chinese Latin script.
In the same way, every Chinese word can be written, typed on a standard English keyboard, typeset, printed, stored, displayed, communicated, and transmitted, and on the basis of these words so can any Chinese information we wish to process. From the four ways of expressing the same sentence above, we can also see the following:
The coding of the invention corresponds one to one with the "Scheme for the Chinese Phonetic Alphabet" (see the table below comparing the codes with Pinyin). Since Pinyin can express Chinese information directly without Chinese characters, Chinese words written, typed on a standard English keyboard, typeset, printed, stored, displayed, communicated, and transmitted by the method of the invention can likewise express Chinese information directly without characters. Moreover, Chinese information expressed word by word in Pinyin, read together with its context, corresponds essentially one to one with the Chinese characters of the same words; by the same chain of reasoning, the words of the invention also have this one-to-one correspondence with the characters. When word codes of the invention express Chinese information directly, punctuation is used as in English and with the same meanings. The word codes can therefore express the information of Chinese characters and words directly, without characters, in a Western-code state (the 26 Latin letters) that is 100% compatible with ASCII, and need to be converted to the corresponding characters, Pinyin, or speech only when required. This shows that the word codes of the invention are holographic and reversible. Unlike Chinese characters or Pinyin, the phonetic code of the invention is 100% compatible with ASCII, so all Western-language software and hardware can process Chinese information expressed in it without modification. This is where the invention achieves significant technical progress over all other existing ways of expressing Chinese information.
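The reversibility described above can be illustrated for the full-spelling form, in which each syllable is a Pinyin spelling followed by one tone letter. The Python sketch below is an illustration under stated assumptions, not the patent's procedure: `SYLLABLES` is a tiny hypothetical inventory (a real one would list every Pinyin syllable), the tone letters a, e, v, u, o are those of the tone-code table given later in the description, and numbering the neutral tone as 5 is a convention chosen here. The function returns every possible split of a word code, so more than one result would indicate an ambiguous code.

```python
# Tone letters from the tone-code table; 5 for the neutral tone is an assumption.
TONE_LETTERS = {"a": 1, "e": 2, "v": 3, "u": 4, "o": 5}

# Hypothetical, deliberately tiny syllable inventory for the demonstration.
SYLLABLES = ("wo", "me", "men", "hu", "hui", "la", "ding", "wen")


def split_word(code, syllables=SYLLABLES):
    """Return all splits of a full-spelling word code into (pinyin, tone) pairs."""
    code = code.lower()
    results = []

    def walk(pos, acc):
        if pos == len(code):
            results.append(list(acc))
            return
        for syl in syllables:
            end = pos + len(syl)
            # A syllable must be followed immediately by its tone letter.
            if code.startswith(syl, pos) and end < len(code) and code[end] in TONE_LETTERS:
                acc.append((syl, TONE_LETTERS[code[end]]))
                walk(end + 1, acc)
                acc.pop()

    walk(0, [])
    return results


print(split_word("Wovmeno"))   # [[('wo', 3), ('men', 5)]]
```

In this example "me" is rejected because the letter after it, "n", is not a tone letter, which is how the tone code acts as a syllable separator.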
The invention is easy to learn and can be widely applied to computer processing of Chinese and of Chinese characters. It lays a foundation for Chinese reading machines, dictation machines, and foreign-language translation machines. Because it encodes with the 26 letters, every processor in the world that can handle the 26 letters can handle Chinese information expressed in the Chinese phonetic code of the invention. Through continued improvement and popularization in practice, the word codes written, typed on a standard English keyboard, typeset, printed, stored, displayed, communicated, and transmitted by the method may also develop into an alphabetic script for Chinese, so that Chinese information can be processed as conveniently as English information is processed in English.
Four, embodiment
The specific embodiments of the invention are described further below with reference to examples.
(1) Coding of the initial, final, and tone of each Chinese syllable:
In simplified spelling, the initial, final, and tone of each syllable of the Chinese phonetic code are encoded as follows.
Note: symbols in parentheses are Hanyu Pinyin symbols; letters without parentheses are the codes adopted for the initial, final, and tone of each Chinese syllable. The comparison tables of initial, final, and tone codes below are referred to collectively as the code table.
(1). Every initial of the phonetic code is represented by a single Latin letter; for example, the following consonant letters are used as initial codes:
(2). Medials are represented by Latin letters from among the 26. The letter y represents the Pinyin single vowel and medial (ü); the remaining single vowels and medials use the same symbols as in Pinyin. The medial codes include:
i: (i)   u: (u)   y: (ü)
(3). Apart from some compound finals that contain a medial, each remaining compound final is represented in simplified spelling by a single Latin letter, which may be a consonant letter. The final codes used in simplified spelling include:
er: (er) (without initial and final) (typed on a Western-language keyboard as the two keys E and R)
r: (i) [combines only with (zh), (ch), and (sh)]
(4). The tone code of the phonetic code uses five Latin letters, namely the following four letters plus v, a letter that Chinese does not otherwise use:
a: (-) first tone (high level)   e: (/) second tone (rising)   v: (∨) third tone   u: (\) fourth tone (falling)   o: (unmarked) neutral tone
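The tone-code table above can be sketched as a lookup from Pinyin tone diacritics to tone letters. The Python example below is an illustration, not the patent's implementation: the function name `encode_marked_syllable` is an assumption, and it relies on Unicode decomposition (`unicodedata.normalize`) to separate each tone mark from its vowel. The letter ü is mapped to y as stated in the medial table.

```python
import unicodedata

# Combining tone marks mapped to the tone letters of the table above.
TONE_MARKS = {
    "\u0304": "a",  # macron, first tone
    "\u0301": "e",  # acute accent, second tone
    "\u030c": "v",  # caron, third tone
    "\u0300": "u",  # grave accent, fourth tone
}
NEUTRAL = "o"       # an unmarked syllable carries the neutral tone


def encode_marked_syllable(syllable):
    """Turn one tone-marked Pinyin syllable into its full-spelling code."""
    tone = NEUTRAL
    base = []
    for ch in unicodedata.normalize("NFD", syllable.lower()):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            base.append(ch)
    # Recompose so that u + diaeresis becomes a single ü, then map ü to y.
    plain = unicodedata.normalize("NFC", "".join(base)).replace("\u00fc", "y")
    return plain + tone


print(encode_marked_syllable("w\u01d2"))  # wǒ  -> wov
print(encode_marked_syllable("men"))      # men -> meno
```

The sketch handles one syllable at a time; splitting running Pinyin text into syllables is a separate problem that it does not attempt.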
(2) Holographic representation of Chinese information using the above coding:
The word is the unit, and a single Chinese character is treated here as a monosyllabic word. Each syllable follows the Pinyin of that syllable in the "Scheme for the Chinese Phonetic Alphabet". In full spelling, the initial, medial, and final are written exactly as in the Scheme, except that ü may be represented by a single Latin letter such as y; the difference from the Scheme is that the tone is represented by a single Latin letter, the tone code, which doubles as the syllable separator. That is, each syllable is encoded in the order "initial identical to Hanyu Pinyin + medial identical to Hanyu Pinyin + final identical to Hanyu Pinyin + tone code doubling as syllable separator". In simplified spelling, each syllable is encoded in the order "initial code + medial code + final code + tone code doubling as syllable separator". In both full and simplified spelling, the syllables of one word are written together without spaces, and words are separated from one another by a space. When a word is formed, its syllables may all be in full spelling or all in simplified spelling, or any syllable may be written in either form as needed, so that some syllables of a word are simplified and others are full. This mixed form is called the mixed spelling of the Chinese phonetic code. Full spelling, simplified spelling, and mixed spelling are referred to collectively as the Chinese phonetic code, or simply the phonetic code;
Full-spelling or mixed-spelling Chinese phonetic code can be typeset, printed, stored, displayed, communicated, and transmitted on its own, or side by side with Chinese characters, Hanyu Pinyin, foreign languages, or minority languages;
When Chinese information is in the full-spelling or mixed-spelling phonetic code state, punctuation is used as in English;
Because a Chinese character used on its own is treated as a monosyllabic word, the phonetic code of a character is written, typed on a standard English keyboard, typeset, printed, stored, displayed, communicated, and transmitted in the same way as that of a word. A group of several words is called a phrase, and a phrase is represented in the same way as a sentence. When a whole sentence or text represents Chinese information word by word, homophones generally do not need to be selected in order to understand it: in principle, a sentence that is unambiguous when heard remains unambiguous after it has been written, typed on a standard English keyboard, typeset, printed, stored, displayed, communicated, or transmitted.
Listed below are some Chinese word codes represented by the method of the invention, with the corresponding Chinese characters and the corresponding Hanyu Pinyin. (Pinyin is in parentheses; the word code of the method and the corresponding characters are not.)
With these words separated from one another by spaces and written continuously in Chinese word order (and likewise typed on a standard English keyboard, typeset, printed, stored, displayed, communicated, or transmitted), a Chinese phrase or sentence can be represented. Because phrases are represented in the same way as sentences, only one Chinese sentence is shown here:
Wovmnohuiuxrvydulaadqawnv. (Chinese information represented in simplified-spelling Chinese phonetic code)
We can use Latin. (Chinese information represented in Chinese characters)
wǒmenhuìshǐyònglādīngwěn。 (Chinese information represented in Hanyu Pinyin)
Wovmenohuiushivyongulaadingawenv. (Chinese information represented in full-spelling Chinese phonetic code)
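The full-spelling line of this example can be reproduced with a short encoder. The Python sketch below is an illustration under assumptions: the function names are hypothetical, tones are given as numbers with 5 standing for the neutral tone, and the division of the sentence into four words is an assumption made here, since the sentence is printed above without spaces.

```python
# Tone number -> tone letter, following the tone-code table (5 = neutral tone).
TONE_CODE = {1: "a", 2: "e", 3: "v", 4: "u", 5: "o"}


def encode_syllable(pinyin, tone):
    """Pinyin spelling (with ü written as y) followed by the tone letter."""
    return pinyin.lower().replace("\u00fc", "y") + TONE_CODE[tone]


def encode_word(syllables):
    """Syllables of one word are written together without spaces."""
    return "".join(encode_syllable(p, t) for p, t in syllables)


def encode_sentence(words):
    """Words are separated by spaces; punctuation follows English usage."""
    text = " ".join(encode_word(w) for w in words)
    return text[:1].upper() + text[1:] + "."


# Assumed word division of the example sentence.
sentence = [
    [("wo", 3), ("men", 5)],
    [("hui", 4)],
    [("shi", 3), ("yong", 4)],
    [("la", 1), ("ding", 1), ("wen", 3)],
]

print(encode_sentence(sentence))
# Wovmeno huiu shivyongu laadingawenv.
```

Removing the spaces from this output gives the full-spelling string shown in the example. The simplified-spelling line is not reproduced, because it depends on the initial and final code tables.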
By analogy, any polysyllabic Chinese word can be written, typed on a standard English keyboard, typeset, printed, stored, displayed, communicated, and transmitted by the above method, and on the basis of such words any Chinese information can be represented, so that Chinese characters and Chinese information can be processed conveniently. Because the full, simplified, and mixed spellings of the Chinese phonetic code convert easily in both directions, the examples below generally use simplified spelling for brevity. Any step or method that holds for simplified spelling holds equally for full or mixed spelling; this is stated here once and not repeated.
(3) Intelligent Chinese word segmentation uses the following steps and methods:
(1) Intelligent Chinese word segmentation is a method for segmenting, on a computer or an embedded mobile device, Chinese-character text and phonetic text that corresponds one to one with the "Scheme for the Chinese Phonetic Alphabet". It rests mainly on a new analysis of Chinese grammar whose morphology, syntax, and word formation are broadly consistent with English grammar. The main features of this grammar are as follows. In morphology, the parts of speech are noun, pronoun, numeral-classifier, adverb, adjective, verb, preposition, conjunction, modal particle, and onomatopoeia. In syntax, the sentence elements are subject, predicate, object, predicative, appositive, attribute, adverbial, and complement. Composite sentences are divided into coordinate sentences and main-subordinate sentences; the latter are further divided into subject clauses, object clauses, predicative clauses, appositive clauses, attributive clauses, and adverbial clauses. The tenses of the Chinese predicate are past, present, future, and past future. The aspects of the Chinese predicate are simple, progressive, perfect, and perfect progressive. A passive voice and a subjunctive mood are established for the Chinese predicate verb. In word formation, words are built mainly by adding a prefix, infix, suffix, or prefix-suffix pair to a root, and by compounding roots;
The first-level dictionary holds, by category: specific nouns that are not a single character or syllable, pronouns, numeral-classifiers, some adverbs, prepositions, conjunctions, modal particles, and onomatopoeia; the marker words of coordinate sentences and of each kind of subordinate clause; the marker words of verb tense and aspect, the passive voice, and the subjunctive mood; and word-forming prefix-suffix pairs. The second-level dictionary holds, by category, the main four-character expressions and set phrases, monosyllabic words, adjectives, verbs, and the other nouns and adverbs not included in the first-level dictionary. The third-level dictionary holds, by category, the word-forming prefixes, infixes, suffixes, and roots of Chinese;
Segmentation always makes use of the breakpoints of a sentence or character string. Matching and cutting proceed from both sides of a breakpoint over the characters or syllables to be segmented. Every successfully matched word is separated by spaces and marked in the background as matched; once all words have been cut, the marks are removed and the original font format is restored;
Breakpoints are formed at the following positions: the beginning of a sentence, the end of a sentence, punctuation marks, Arabic numerals expressing quantities or sequence numbers, special characters (pi characters), spaces already present between characters or syllables, and the breakpoints left by segmentation with the previous dictionary level;
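The breakpoint positions listed above can be approximated with a regular expression. The Python sketch below is a simplification and not the patent's implementation: the pattern treats whitespace, digits, and any non-word character (punctuation and other symbols) as breakpoints, and the function name `split_at_breakpoints` is an assumption. Breakpoints produced by an earlier dictionary pass are not modeled.

```python
import re

# Whitespace and digits, or runs of punctuation and other symbols.
BREAKPOINT = re.compile(r"[\s\d]+|[^\w\s]+")


def split_at_breakpoints(text):
    """Return the runs of characters or letters that lie between breakpoints."""
    return [run for run in BREAKPOINT.split(text) if run]


print(split_at_breakpoints("Wovmeno huiu, 2 laadingawenv."))
# ['Wovmeno', 'huiu', 'laadingawenv']
```

The same function works on Chinese-character text, because Python's `\w` matches Chinese characters while Chinese punctuation falls under the non-word branch of the pattern. Each returned run is then a candidate for dictionary matching from its two ends.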
In the first step of segmentation, the whole text to be segmented is scanned against the words and prefix-suffix pairs of the first-level dictionary. Characters or syllables that match are cut out as a word. When a prefix-suffix pair matches, everything from the prefix through the suffix, boundaries included, is cut out as one word. When there is more than one possible match, the one that leaves the fewest isolated characters or syllables prevails;
After segmentation with the one-level dictionary, groups of four, two, three and one still-unmatched Chinese characters or syllables are taken in turn from the left and right sides of each breakpoint and matched against the words of the secondary dictionary. If a group matches, and forward and reverse matching from the two sides of the breakpoint give the same result for the same text, the match is accepted. If the two results differ, the one that produces the fewest isolated Chinese characters or syllables is accepted;
After segmentation with the secondary dictionary, the Chinese characters or syllables that are still unmatched are compared with the third-level dictionary to decide whether each is a prefix, suffix, infix or root. A prefix absorbs the one isolated Chinese character or syllable after it to form a word; if it is followed instead by two Chinese characters or syllables that have already matched, it is combined with them and cut as a three-character word. A suffix absorbs the one isolated Chinese character or syllable before it to form a word; if it is preceded instead by two matched Chinese characters or syllables, it is combined with them and cut as a three-character word. An infix absorbs one character or syllable on each side to form a word; if this leaves a single isolated, unmatched Chinese character or syllable immediately before or after, that character or syllable is also absorbed into the word formed by the infix. A word formed in this way generally contains no more than four Chinese characters or syllables. A root is handled with the prefix, suffix or infix method according to whether a character or syllable can be added before it, after it, or on both sides. When a word cut by these methods occurs at least twice in total in different sentences of the same document, the system automatically stores it in the secondary dictionary;
If, after segmentation with the above three dictionaries, the sentence still contains a string of unmatched Chinese characters or syllables, or a string of more than three consecutive isolated Chinese characters or syllables that matched only individually, the string is combined into one word and cut. When a word cut in this way occurs at least twice in total in different sentences of the same document, the system stores it in the one-level dictionary, either automatically or after manual confirmation, according to its settings;
The final segmentation result and the checking rules can also be modified manually. New words produced by manual intervention are, after manual confirmation, classified by their features and stored in the one-level or secondary dictionary. Words in every dictionary can also be added or deleted manually. Within each category the words are ordered by frequency, with high-frequency words first. When a certain threshold is reached, the system can, after manual confirmation, promote a word from the secondary dictionary to the one-level dictionary, or demote a word from the one-level dictionary to the secondary dictionary. We call this intelligent segmentation step the word segmentation module;
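The secondary-dictionary stage described above, matching from both sides and preferring the result with the fewest isolated characters, can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the invention: the function names, the treatment of a span between two breakpoints as a plain string, the tie-break in favour of the forward result, and the toy lexicon of Latin letters standing in for Chinese characters are all hypothetical.

```python
# Lengths are tried in the order given in the text: four, two, three, one.
TRY_LENGTHS = (4, 2, 3, 1)


def forward_match(span, lexicon):
    """Cut `span` from its left end; unmatched units are left as single pieces."""
    pieces, i = [], 0
    while i < len(span):
        for n in TRY_LENGTHS:
            piece = span[i:i + n]
            if len(piece) == n and piece in lexicon:
                break
        else:
            piece = span[i]  # nothing matched: keep one isolated unit
        pieces.append(piece)
        i += len(piece)
    return pieces


def backward_match(span, lexicon):
    """Cut `span` from its right end; the result is returned in reading order."""
    pieces, j = [], len(span)
    while j > 0:
        for n in TRY_LENGTHS:
            piece = span[max(0, j - n):j]
            if len(piece) == n and piece in lexicon:
                break
        else:
            piece = span[j - 1]
        pieces.append(piece)
        j -= len(piece)
    return pieces[::-1]


def count_isolated(pieces, lexicon):
    """Assumption: an isolated unit is a single unit that is not itself a word."""
    return sum(1 for p in pieces if len(p) == 1 and p not in lexicon)


def segment(span, lexicon):
    """Accept the result if both directions agree, else the one with fewer isolated units."""
    fwd = forward_match(span, lexicon)
    bwd = backward_match(span, lexicon)
    if fwd == bwd:
        return fwd
    # Tie-break in favour of the forward result is an assumption of this sketch.
    return min((fwd, bwd), key=lambda pieces: count_isolated(pieces, lexicon))


toy_lexicon = {"ab", "abc", "cd", "de"}
print(segment("abcde", toy_lexicon))  # ['abc', 'de']
```

In this toy case the forward pass leaves the isolated unit "e", while the backward pass leaves none, so the backward result is accepted.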
Because the tone letters of the Chinese phonetics codes used in the examples of the present invention also act as syllable separators, syllables never become confused with one another, no matter how many syllables of the phonetic codes making up a sentence are written together. By means of this separating effect of the tone, we can still distinguish every syllable of Chinese speech accurately. For example, distinguishing the above sentence of Chinese phonetics codes syllable by syllable, we obtain:
“wovmnohuiuxrvyduhsuyyvlaadqawnv.”
With the same segmentation method as above, we can segment the phonetic code string "wovmnohuiuxrvyduhcuyyvlaadqawnv." into words; the resulting simplicity-form Chinese phonetics codes are:
“wovmnohuiuxrvyduhcuyyvlaadqawnv.”
The corresponding word-segmented text in the Chinese phonetic alphabet of the "Scheme for the Chinese Phonetic Alphabet" is:
“Wǒmenhuìshǐyònghànyǔlādīngwěn。”
The corresponding word-segmented Chinese character text is:
" we can use Chinese character and latin literary composition.”
The corresponding word-segmented spelling-form Chinese phonetics codes are:
“wovmenohuiushivyonguhanuyyvlaadingawenv.”
In the same way, we can complete the word segmentation of all Chinese character texts and of any Chinese phonetic alphabet text whose syllables are not confused with one another, including texts in phonetic codes that have a one-to-one relationship with the Chinese phonetic alphabet of the "Scheme for the Chinese Phonetic Alphabet".
The specific implementation of the present invention is further described below with reference to embodiments.
In the embodiment of the present invention, the one-level dictionary can include the following feature words:
Pronouns that replace the names of persons or things, such as: we, you, they, they, they etc.;
Words that form reflexive pronouns, such as: oneself, I, etc.;
Words that refer to things, such as: this, that, this, that, these, those etc.;
Words that refer to manner or state, such as: so, so (as adverbs), like this, like that, etc.;
Words that refer to time, such as: at this moment, at that time, etc.;
Words that refer to place, such as: here, there, here, there etc.;
Interrogative pronouns in Chinese, such as: what, what, which, which etc.;
Indefinite pronouns in Chinese, such as: some, some, have, some, have people, all, all, any, other, many, various, each, each, often kind, etc.;
Words that mark tense in Chinese, such as: ... ..., once ... to cross, always ... etc.;
Note that word strings written in a form such as "... ..." are matched in pairs: once the first word of the pair has been found, the word string counts as matched as soon as the second word is found, no matter how many characters separate them, and the two words are marked and cut separately. This applies throughout this document and is not repeated.
Form the word of passive voice common sentences, such as: be ... by etc.;
Form the subjunctive word of Chinese predicate verb, such as: if ... ... for a long time ... if ... mistake ... for a long time ... if ... ... just ..., just in case ... ... just ... etc.;
The contact verb of Chinese, such as: can be regarded as, equal, seem, become etc.;
The contact verb be made up of " sense organ verb+get up ", such as: seem, look, sound, sound etc.;
Represent that there is the Chinese modal verb of certain ability, such as: can, can etc.;
Expressing possibility property, conjecture property, suspection, the word of the tone such as not affirm, such as: may, perhaps, perhaps, can etc.;
Represent allow the tone word, such as: can etc.;
Represent objectively need word, such as: must, have to, should, should, needs etc.;
To express willingness, be determined, ensure, dare wait the conventional modal verb of psychological condition, such as: be ready, be determined, certain etc.;
Represent the Chinese auxiliary verb of tense, such as:, once etc.;
Represent the word of negative, such as: do not have etc.;
Represent the word of the certainly tone, such as: really, certain etc.;
The adverbial word of the expression time of Chinese, such as: at once, at once, immediately, then, then, finally, always etc.;
The adverb of place of the expression of Chinese, such as: everywhere, everywhere, everywhere, everywhere etc.;
The degree adverb of the expression of Chinese, such as: a little, especially, more, very, etc.;
The proterties adverbial word of the expression of Chinese, such as: perhaps, simply, wilfully, specially, suddenly, be happy to, be convenient to etc.;
Represent the word of adverbial word comparative degree, such as: ratio ... more (or comparison) etc.;
Represent the five-star word of adverbial word, such as: ... in ... ..., the most etc.;
Prepositions expressing time, place and direction, such as: since, towards, when ... time etc.;
Represent the preposition of object, such as: for etc.;
Represent the preposition of object, means, mode, such as: in order to, be so that, according to, according to, in line with, etc.;
Represent get rid of preposition, such as: except, remove, except etc.
Represent the preposition of reason, such as: due to, because etc.;
(The conjunctions listed below can also serve as the correlative words that connect the clauses of a complex sentence, and the relation between the clauses is the same as the relation expressed by the conjunction. Because they are the same set of words, the corresponding clause correlatives are not listed again below.)
Represent the conjunction of Chinese coordination, such as: on the one hand ... on the one hand, both ... be again, not ... but etc.;
Represent that Chinese is along connecing the conjunction of relation, such as: so then, then etc.;
Represent the conjunction of Chinese progressive relationship, such as: not only (not only, not only, not only) ... and, even, especially, not only ... on the contrary etc.;
Represent the conjunction of Chinese choice relation, such as: or ... or, be not ... be exactly or ... or and its ... be not so good as etc.;
Represent Chinese causal conjunction, such as: thus, therefore so, so etc.;
Represent the conjunction of Chinese turning relation, such as: but but but but etc.;
Represent the conjunction of Chinese time subordinate relation, such as: proper ... time, by the time ... (time) until ... (time), by the time ... after, (until) ... in the past, whenever ... (time) etc.;
Represent the conjunction of Chinese reason subordinate relation, such as: because ... so, due to ... since therefore ... just etc.;
Represent the conjunction of Chinese object subordinate relation, such as: in order to so that, so as in order to avoid, allow etc. well;
Represent the conjunction of Chinese result subordinate relation, such as: so that, result, to cause etc.;
Represent the conjunction of Chinese hypothesis subordinate relation, such as: if ... if just ... even if so ... also even ... also etc.;
Represent the conjunction of Chinese condition subordinate relation, such as: only have ... just as long as ... unless just ... not, no matter ... all, no matter ... also (Zong), no matter ... total etc.;
Represent the conjunction of Chinese concession subordinate relation, such as: although ... (but, but but) although ... but etc.;
Represent the conjunction of Chinese mode subordinate relation, such as: seem ... generally, seem ... equally, resemble ... like etc.;
Represent that Chinese compares the conjunction of subordinate relation, such as: surpass, be not so good as, just like and ..., more ... more etc.;
Represent the conjunction of Chinese place subordinate relation, such as: where ... where etc.;
Refer to the proper noun of the title that specific people, things, place or mechanism are proprietary, such as: Mao Zedong, Shanghai, State Council etc.;
Chinese for representing the word of mark, such as: ... point it ... etc.;
Chinese for representing the word of decimal, such as: 0. 0 ... etc.;
Chinese represents the word of approximate number, such as: " left and right " etc.;
For representing the ordinal number of order in Chinese, such as: " the ... number " etc.;
Compound classifier in Chinese, such as: sortie, person-time, km, hour, kilowatt-hour etc.;
Chinese interjections, such as: my God, aha, heartily, etc. They are generally followed by a punctuation mark.
The simple onomatopoeia of Chinese, such as: ouch, smack one's lips, sting slip, cough up, creak, chuckle, thud, bubble, rumble, with cry, clip-clop, rustlingly, rustlingly, thump, thump, sound of snorting or fizzing, ding-dong, jingle, clank, rumble, flash, thinkling sound, murmuring gurgling, sough, rustle, rustle, toot, when when, father-in-law drone, crying of a child, hullabaloo, cough up crash, crash, watchman's wooden clapper watchman's wooden clapper watchman's wooden clapper, rub-a-dub rub-a-dub rub-a-dub, hem and haw, the squeak sound of reading aloud oh, squeak squeak oh, Pi crack, chirp, etc.General below with words such as " " " " " sound ".
Modal particles expressing the indicative mood, such as: that's all, only, etc. They are generally followed by a comma or a full stop.
Represent Chinese adjective comparative degree word, such as: more ..., compare ... ... a bit ... some, ratio ... more ... some etc.;
One of highest: ... ... ... very much etc.;
Represent identical: with ... the same ... etc.;
Represent multiple: compare ... high ... doubly, compare ... many ... doubly, compare ... good ... doubly etc.;
When representing an equation degree than the opposing party's height: ratio ... more ... some (a bit), ratio ... more ... a bit etc.;
When not needing maybe to say comparison other, the comparative degree adjective of employing, such as: compare ... etc.;
Paired prefixes and suffixes, such as: can ... property, easily ... property, etc.;
Common prepositions that cause inversion in Chinese, such as: connect ... all, connect ... also, for ... ... etc.;
Emphasize that object is the preposition of the sentence of morphological pattern, such as: ... give etc.;
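The paired word strings written with "..." in the list above are matched as pairs regardless of how many characters lie between the two parts. A minimal sketch of that pairing rule follows. The function name and the Latin stand-in words are hypothetical, and the sketch assumes the text passed in is a single sentence, so that a pair cannot span a sentence boundary.

```python
def match_pair(text, first, second):
    """Return the (start, end) spans of `first` and of the next `second` after it.

    Returns None unless both parts are found in that order. The two parts are
    reported separately because they are marked and cut as two words.
    """
    i = text.find(first)
    if i == -1:
        return None
    j = text.find(second, i + len(first))
    if j == -1:
        return None
    return (i, i + len(first)), (j, j + len(second))


# "ONCE" and "OVER" are placeholders for the two halves of a paired feature word.
print(match_pair("xxONCEyyyyOVERzz", "ONCE", "OVER"))  # ((2, 6), (10, 14))
```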
The secondary dictionary lists the main four-character words and set phrases of Chinese, all monosyllabic words, adjectives, verbs, and the other nouns and adverbs not included in the one-level dictionary. These words are common and fixed but very numerous, for example: great, glorious, work, go hunting, see, student, teacher, very, in, year, month, day, two, 1, 2, etc. Apart from the words already listed in the one-level dictionary, all the words of the latest "Chinese verb" can be stored in this dictionary. At the same time, the word-formation rules for the flexibly applied forms of Chinese adjectives, numeral-classifier compounds, verbs and so on should be recognized in the secondary dictionary, and the flexibly applied forms that can be listed should be listed as fully as possible, in order to improve the accuracy of segmentation with the secondary dictionary. Such as:
Flexible use of adjectives: "A + in" is converted into an adverb of the form "A + in", where A is a single syllable with adjectival meaning. For example, "brave" and "happy" combined with "in" form the two adverbs "daring to" and "being happy to". In other words, when matching this kind of word, "in" acts as a suffix: when it is preceded by an unmatched single syllable with adjectival meaning, that syllable is joined with "in" to form one word.
A single syllable A with adjectival meaning can be reduplicated into an adverb of the form AA. For example, "fast" (adjective) and "white" (adjective) become "speedily" (adverb) and "in vain" (adverb) respectively.
In addition, Chinese adjectives have reduplicated forms that are relevant to word segmentation. The main reduplicated forms are AA, ABB and AABB.
In the AA form, a monosyllabic adjective is reduplicated to express a heightened degree. For example, long, high, white and fat become, after reduplication: long, high, in vain, fat. The second syllable is read in the high and level tone. After AA reduplication, the degree described by the Chinese adjective is deeper.
Examples of ABB reduplication: bright, bright rolling; after reduplication they become, respectively: brightly lit, gleaming.
Examples of AABB reduplication: clean, happy, affectionate; after reduplication they become, respectively: neat and tidy, sweet very sweet, be affectionate. After all of the above reduplications, the degree described by the Chinese adjective is somewhat heightened compared with the original.
Flexible use of numeral-classifier compounds: Chinese numeral-classifier compounds and classifiers can be reduplicated, and after reduplication they carry the meaning of "each" or "many". Examples of AA reduplication of classifiers: all, rule, all over all over, time time etc.; examples of ABB reduplication of numeral-classifier compounds: several crowds of, one by one, several rows of etc.
Flexible use of verbs: a monosyllabic verb is converted into an adverbial expression of the form "A A and" or "A A", where A is a monosyllabic verb such as "crying": while crying, cry and cry.
The ABB reduplicated verb form is used for emphasis, for example: "help" becomes "doing me a favour".
In other words, a considerable part of the flexibly applied forms above are of the AA, ABB or AABB type. Using these rules, a string of the "AA", "ABB" or "AABB" type can be cut as one word of that form. Likewise, the forms "A A and" and "A A" can each be cut as one word. There is in fact also an ABAB form, but it is still cut into two words of the form AB, so from the point of view of word segmentation the ABAB form is not discussed further here.
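The reduplication rules above lend themselves to a simple pattern test. The following is a minimal sketch, assuming each character or syllable is one element of a sequence; the function names are hypothetical and Latin letters stand in for Chinese characters.

```python
def reduplication_pattern(units):
    """Classify a short sequence of characters or syllables as AA, ABB, AABB or ABAB."""
    n = len(units)
    if n == 2 and units[0] == units[1]:
        return "AA"
    if n == 3 and units[1] == units[2] and units[0] != units[1]:
        return "ABB"
    if n == 4 and units[0] == units[1] and units[2] == units[3] and units[0] != units[2]:
        return "AABB"
    if n == 4 and units[0] == units[2] and units[1] == units[3] and units[0] != units[1]:
        return "ABAB"
    return None


def cut_reduplication(units):
    """AA, ABB and AABB are cut as one word; ABAB is cut into two AB words."""
    pattern = reduplication_pattern(units)
    if pattern in ("AA", "ABB", "AABB"):
        return ["".join(units)]
    if pattern == "ABAB":
        return ["".join(units[:2]), "".join(units[2:])]
    return None  # not a reduplicated form; leave it to the other rules


print(cut_reduplication("xxyy"))  # ['xxyy']
print(cut_reduplication("xyxy"))  # ['xy', 'xy']
```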
The third-level dictionary lists the word-forming prefixes, infixes, suffixes and roots of Chinese. These affixes have a strong word-forming ability; when segmentation with the preceding dictionaries fails, the affixes and roots in this dictionary are used to decide the segmentation. Examples of prefixes: little ..., old ..., Ah ...; examples of suffixes: ... person ... ... youngster; examples of infixes: ... or not ... inner ... ... seven or eight ... ... three ... four. Infixes generally form idioms. Roots such as "machine" and "street" can form, respectively: lathe, take advantage of the occasion, airport, street, facing the street, T-shaped road junction, etc. A root can generally stand either at the front or at the back of the word it forms. Apart from four-character idioms, most words formed from roots are two-character words and a minority are three-character words; five-character words essentially do not occur and are not considered here. The third-level dictionary includes the nearly 4000 Chinese characters that can serve as roots listed in the "conventional word-building dictionary" published in March 1984 by the spoken and written languages research institute of the Renmin University of China. As the language develops, roots not yet in the third-level dictionary can be added as required.
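The simplest case of the affix rules, where a prefix absorbs the isolated unit after it and a suffix absorbs the isolated unit before it, might be sketched as below. This is a simplified illustration only: it ignores infixes, roots and the three-character case, and the function name, the (text, matched) data layout and the toy affix sets are all assumptions.

```python
def absorb_affixes(units, prefixes, suffixes):
    """Merge unmatched affixes with a neighbouring isolated unit.

    `units` is a list of (text, matched) pairs left after the first two
    dictionary levels; an isolated unit is an unmatched single character.
    """
    out, i = [], 0
    while i < len(units):
        text, matched = units[i]
        nxt = units[i + 1] if i + 1 < len(units) else None
        # Prefix: absorb the isolated unit that follows it.
        if (not matched and text in prefixes
                and nxt is not None and not nxt[1] and len(nxt[0]) == 1):
            out.append((text + nxt[0], True))
            i += 2
            continue
        # Suffix: absorb the isolated unit that precedes it.
        if (not matched and text in suffixes
                and out and not out[-1][1] and len(out[-1][0]) == 1):
            prev_text, _ = out.pop()
            out.append((prev_text + text, True))
            i += 1
            continue
        out.append((text, matched))
        i += 1
    return out


# "p" plays the role of a prefix, "s" of a suffix, "wd" of an already matched word.
units = [("p", False), ("x", False), ("wd", True), ("y", False), ("s", False)]
print(absorb_affixes(units, prefixes={"p"}, suffixes={"s"}))
# [('px', True), ('wd', True), ('ys', True)]
```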
Word segmentation using the above dictionaries and method proceeds as follows:
Original sentence:
I tells you, and Gu Landan is male is the treasure daughter of princess's grave chieftain, if you do not lose no time a Gu Landan, male searching is returned, and I just looks for you to do accounts! Lose no time after finding to report to chieftain himself.
Word segmentation result using the one-level dictionary:
I tells you, and Gu Landan is male is the treasure daughter of princess's grave chieftain, if you do not lose no time a Gu Landan, male searching is returned, and I just looks for you to do accounts! Lose no time after finding to report to chieftain himself.
Word segmentation result using the secondary dictionary:
I tells you, and Gu Landan is male is the treasure daughter of princess's grave chieftain, if you do not lose no time a Gu Landan, male searching is returned, and I just looks for you to do accounts! Lose no time after finding to report to chieftain himself.
(assuming that "losing no time" is not listed in the secondary dictionary)
Word segmentation result using the third-level dictionary:
I tells you, and Gu Landan is male is the treasure daughter of princess's grave chieftain, if you do not lose no time a Gu Landan, male searching is returned, and I just looks for you to do accounts! Lose no time after finding to report to chieftain himself.
(The third-level dictionary shows that "catching up with" is a root, and it is followed by an isolated unmatched character "tightly", so "catching up with" and the following "tightly" form the two-character word "losing no time". Furthermore, because "losing no time" occurs at least twice in different sentences of the same text, the segmentation system automatically saves "losing no time" in the secondary dictionary, so that next time it is matched and cut directly during secondary dictionary segmentation.)
Word segmentation result after checking with the segmentation checking rules:
I tells you, and Gu Landan is male is the treasure daughter of princess's grave chieftain, if you do not lose no time a Gu Landan, male searching is returned, and I just looks for you to do accounts! Lose no time after finding to report to chieftain himself.
("Gu Landan is male" is a run of consecutive isolated unmatched Chinese characters; according to the checking rule it is merged into one word as a Chinese character string and cut. Because this run of consecutive isolated unmatched characters occurs at least twice in different sentences of the same text, the segmentation system automatically saves "Gu Landan is male" in the one-level dictionary, so that next time it is matched and cut directly during one-level dictionary segmentation.)
Word segmentation result after final manual intervention:
I tells you, and Gu Landan is male is the treasure daughter of princess's grave chieftain, if you do not lose no time a Gu Landan, male searching is returned, and I just looks for you to do accounts! Lose no time after finding to report to chieftain himself.
(Because "princess's grave" is a place name, it must not be cut into separate parts; manual intervention makes it one proper noun. The system detects the result of this manual intervention and, since the word is by nature a specific term, automatically stores it in the one-level dictionary after manual confirmation, so that next time it is matched and cut directly during one-level dictionary segmentation.)
After all cutting is complete, the system removes the match marks in the sentence and the font reverts to the normal format:
I tells you, and Gu Landan is male is the treasure daughter of princess's grave chieftain, if you do not lose no time a Gu Landan, male searching is returned, and I just looks for you to do accounts! Lose no time after finding to report to chieftain himself.
This gives the word segmentation result we require.
Having practised the segmentation procedure on the sentences above, we now segment another sentence:
Original sentence:
Painstaking efforts through them are found, and finally in afternoon about 5 on April 8th, 1936, it is male that they have found Gu Landan at Xinjiang Urumqi, so the princess's grave that loses no time to send someone is reported to chieftain.
Word segmentation result using the one-level dictionary:
Painstaking efforts through them are found, and finally in afternoon about 5 on April 8th, 1936, it is male that they have found Gu Landan at Xinjiang Urumqi, so the princess's grave that loses no time to send someone is reported to chieftain.
("Gu Landan is male" and "princess's grave" were stored in the one-level dictionary after the previous segmentation, so this time they are cut successfully at the one-level dictionary stage.)
Word segmentation result using the secondary dictionary:
Painstaking efforts through them are found, and finally in afternoon about 5 on April 8th, 1936, it is male that they have found Gu Landan at Xinjiang Urumqi, so the princess's grave that loses no time to send someone is reported to chieftain.
(Because "losing no time" was stored in the secondary dictionary after the previous segmentation, this time it is cut successfully at the secondary dictionary stage. Regions containing Arabic numerals are kept by the system as they are and are not merged with Chinese characters.)
All the words cut out have been marked, here for example with an italic font, which shows that the system completed the matching and segmentation at the secondary dictionary stage. This takes fewer segmentation steps than last time and demonstrates that the method automatically improves its own segmentation.
After all cutting is complete, the system removes the match marks in the sentence, the font reverts to the normal format, and we obtain the final word segmentation result:
Painstaking efforts through them are found, and finally in afternoon about 5 on April 8th, 1936, it is male that they have found Gu Landan at Xinjiang Urumqi, so the princess's grave that loses no time to send someone is reported to chieftain.
In the same way, by continually enriching the dictionaries at each level, adjusting their words according to frequency, improving the rules for checking segmentation results, and continually refining the system in practice, the word segmentation system can segment in an increasingly user-friendly and intelligent manner.
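The automatic enrichment used in the worked examples, where a newly cut word is stored once it has occurred at least twice in different sentences of the same document, can be sketched as follows. The function name and the data layout are hypothetical, and the sketch assumes that repeated occurrences inside one sentence count only once.

```python
from collections import Counter


def words_to_store(new_words_per_sentence, min_sentences=2):
    """Return the newly cut words that occur in at least `min_sentences` sentences.

    `new_words_per_sentence` holds, for each sentence of one document, the
    words that were cut by rule rather than found in a dictionary.
    """
    sentence_counts = Counter()
    for words in new_words_per_sentence:
        sentence_counts.update(set(words))  # count each sentence at most once
    return {w for w, n in sentence_counts.items() if n >= min_sentences}


document = [["x", "x"], ["y"], ["y", "z"]]
print(words_to_store(document))  # {'y'}
```

Here "x" appears twice but only within one sentence, so under the stated assumption it is not stored, whereas "y" appears in two different sentences and is.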
Because the Chinese phonetic alphabet of the "Scheme for the Chinese Phonetic Alphabet" corresponds to the Chinese characters word by word, the above method for segmenting Chinese character text applies equally to text in that Chinese phonetic alphabet, provided the syllables of the text are clearly marked and not confused with one another. The only additional work is to add the corresponding Chinese phonetic alphabet to each word or affix made up of Chinese characters in the dictionaries at each level. Such as:
The original sentence expressed in Chinese characters is: "we can use Chinese character and latin literary composition."
The corresponding text in the Chinese phonetic alphabet of the "Scheme for the Chinese Phonetic Alphabet" is:
“Wǒmenhuìshǐyònghànyǔlādīngwěn。”
With the above segmentation method, we can segment the original Chinese character sentence into: "we can use Chinese character and latin literary composition."
In the same way, we can segment the above original Chinese phonetic alphabet sentence into:
“Wǒmenhuìshǐyònghànyǔlādīngwěn。”
Similarly, for any text in codes that have a one-to-one relationship with the Chinese phonetic alphabet of the "Scheme for the Chinese Phonetic Alphabet", such as the spelling-form and simplicity-form Chinese phonetics codes, we can segment the text by the above method, provided that before segmentation the syllables of the text are clearly marked and not confused with one another. The additional work is to add, to each word or affix made up of Chinese characters in the dictionaries at each level, the corresponding code that has a one-to-one relationship with the Chinese phonetic alphabet of the "Scheme for the Chinese Phonetic Alphabet". The segmentation of the spelling-form and simplicity-form Chinese phonetics codes corresponding to the above sentence, for example, is not repeated here.
(4) The conversion of Chinese characters or the Chinese phonetic alphabet into Chinese phonetics codes, and the two-way conversion between the spelling form and the simplicity form of Chinese phonetics codes, use the following steps and methods:
When Chinese characters or the Chinese phonetic alphabet are converted into Chinese phonetics codes, Chinese characters are first converted into the Chinese phonetic alphabet, and when a polyphonic character (same shape, different sounds) is met, all its possible readings are listed; text already in the Chinese phonetic alphabet does not need this first conversion. The text is then converted into the corresponding string of Chinese syllable phonetic codes according to the code table, and finally segmented into words by calling the word segmentation module stored in advance in the computer system;
When word-segmented Chinese phonetics codes are converted into Chinese characters or the Chinese phonetic alphabet, no further word segmentation is needed; the conversion is carried out in units of the original words;
When Chinese phonetics codes need to be converted into the Chinese phonetic alphabet, the system either looks up the code table stored in advance in the computer system, or looks up a comparison table, generated from this code table, between Chinese phonetics codes and the Chinese phonetic alphabet in units of syllables or words, and outputs the corresponding Chinese phonetic alphabet after matching;
When Chinese phonetics codes need to be converted into Chinese characters, they are either first converted into the Chinese phonetic alphabet in units of words and then into Chinese characters in units of words, or looked up directly in a comparison table, stored in advance in the computer system, between phonetic codes and Chinese characters in units of words; the corresponding Chinese characters are output after matching;
When homonyms are met, they are first discriminated by means such as Chinese lexical and syntactic context and statistical regularities, and the Chinese characters are then selected in units of words;
When spelling-form Chinese phonetics codes need to be converted into simplicity-form Chinese phonetics codes, the code table stored in advance in the computer is consulted to change the initial, medial and final of the spelling form into the initial code, medial code and final code of the simplicity form; the tone code remains unchanged;
Conversely, when simplicity-form Chinese phonetics codes need to be converted into spelling-form Chinese phonetics codes, the code table stored in advance in the computer is consulted to change the initial code, medial code and final code of the simplicity form into the initial, medial and final of the spelling form, which are identical to those of the Chinese phonetic alphabet; the tone code again remains unchanged;
When only some of the syllables of the spelling form or the simplicity form are converted, the result is a conversion to Mixed Pinyin;
When Chinese phonetics codes are converted into Chinese characters and the Chinese phonetic alphabet, the punctuation marks are also converted from the state identical to English into the corresponding Chinese punctuation marks. We call the method of this step the bidirectional conversion module between Chinese characters, the Chinese phonetic alphabet and Chinese phonetics codes.
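The conversion between the spelling form and the simplicity form, and the Mixed Pinyin that results when only some syllables are converted, can be illustrated with a small lookup. This is a hedged sketch: the three syllable pairs below are read off by aligning the two example strings given in this document ("wovmenohuiushiv..." and "wovmnohuiuxrv..."), not taken from the actual code table, and the sketch looks up whole syllables rather than separate initial, medial and final codes. The function and table names are hypothetical.

```python
# Assumed pairs, inferred from the example strings in this document only.
SPELLING_TO_SIMPLICITY = {"meno": "mno", "shiv": "xrv", "yongu": "ydu"}
SIMPLICITY_TO_SPELLING = {v: k for k, v in SPELLING_TO_SIMPLICITY.items()}


def convert_syllables(syllables, table, positions=None):
    """Convert the syllables at `positions` (all when None) through `table`.

    Syllables missing from the table are identical in both forms and are
    returned unchanged.
    """
    return [
        table.get(s, s) if positions is None or i in positions else s
        for i, s in enumerate(syllables)
    ]


spelling = ["wov", "meno", "huiu", "shiv", "yongu"]

simplicity = convert_syllables(spelling, SPELLING_TO_SIMPLICITY)
print(simplicity)  # ['wov', 'mno', 'huiu', 'xrv', 'ydu']

# Converting only the second syllable gives a Mixed Pinyin string.
mixed = convert_syllables(spelling, SPELLING_TO_SIMPLICITY, positions={1})
print(mixed)  # ['wov', 'mno', 'huiu', 'shiv', 'yongu']
```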
Some examples of bidirectional conversion, by the method of the present invention, between Chinese characters, the Chinese phonetic alphabet and phonetic codes in units of words are given below:
1. Converting Chinese characters and the Chinese phonetic alphabet into Chinese phonetics codes:
(1) First, the Chinese characters are converted by table lookup into the corresponding Chinese phonetic alphabet:
For example: "we can use Chinese character and latin literary composition." After conversion to the Chinese phonetic alphabet this becomes:
wǒmenhuìshǐyònghànyǔlādīngwěn。
(2) The Chinese phonetic alphabet, whether converted from Chinese characters or original, is then converted into the following Chinese phonetics codes string by means of the above comparison table (code table) between the Chinese phonetic alphabet and the phonetic codes.
Wovmnohuiuxrvyduhsuyyvlaadqawnv. (separate with space between syllable and syllable)
Or wo vmn ohui uxr vyd uhc uyy vla adq awn v. (separating without space between syllable and syllable)
(the schwa symbol o after skilled in mno can omit when not causing audio mixing.)
In order to allow everybody see clearly, will represent that the letter of tone has added underscore here, the tool sound insulation joint effect simultaneously of the tone letter in phonetic code, in actual speech code, tone is without underscore, and after skilled phonetic code, tone is held concurrently and can conveniently be distinguished every syllabic sign.
(3) phonetic code string is carried out participle cutting, finally complete phonetic code conversion.
By searching the Chinese phonetics codes word dictionary having divided word in advance, by multiple syllable write the two or more syllables of a word together of same word, between word and word, separate the Chinese phonetics codes just obtaining our final needs following with space:
wovmno huiu xrvydu hcuyyv laadqawnv.
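The word segmentation step can be sketched as a dictionary lookup over the syllable sequence. The text does not say which matching strategy is used, so the longest-match-first rule, the tiny dictionary, and the function name below are all assumptions made for illustration.

```python
# Hypothetical word dictionary built from the words of the example sentence.
WORD_DICTIONARY = {"wovmno", "huiu", "xrvydu", "hcuyyv", "laadqawnv"}
MAX_SYLLABLES_PER_WORD = 3  # assumed upper bound for this small example


def segment(syllables, dictionary=WORD_DICTIONARY, max_len=MAX_SYLLABLES_PER_WORD):
    """Join the syllables of each word and separate words with spaces."""
    words = []
    i = 0
    while i < len(syllables):
        # Try the longest candidate first; a single syllable is always accepted.
        for n in range(min(max_len, len(syllables) - i), 0, -1):
            candidate = "".join(syllables[i:i + n])
            if n == 1 or candidate in dictionary:
                words.append(candidate)
                i += n
                break
    return " ".join(words)


syllables = ["wov", "mno", "huiu", "xrv", "ydu", "hcu", "yyv", "laa", "dqa", "wnv"]
print(segment(syllables) + ".")
```

Accepting an unknown single syllable as a word keeps the loop from stalling on input that the dictionary does not cover.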
2. Converting Chinese phonetics codes into Chinese characters and pinyin:
Chinese phonetics codes can easily be converted into Chinese characters and pinyin by looking up the tables that map word-based Chinese phonetics codes to word-based Chinese characters and to pinyin, respectively. For example:
For wovmno, looking up the table that maps phonetic code syllables (initial code, medial code, final code, and tone code) to pinyin, or a syllable or word table generated from it, yields wǒmen, and wǒmen in turn yields the word-based Chinese characters. Once a word-based phonetic code has been linked to its word-based Chinese characters through the word-based pinyin, the phonetic code can, when needed, be mapped directly to the Chinese characters without going through the pinyin. For example, wovmno converts to wǒmen, and wǒmen converts to the Chinese characters for "we"; wovmno and "we" are thereby linked directly, so that when needed the conversion between wovmno and "we" is bidirectional and reversible without passing through the pinyin wǒmen.
When homophones are encountered, the word-based Chinese characters are selected after disambiguation by means such as the lexical, syntactic, and contextual relations of Chinese and statistical regularities. For example: "ysvlune has been loaded with mailbags." and "ysvlune has been loaded with crude oil." From the context, "ysvlune" in the first sentence means a cruise ship and "ysvlune" in the second means an oil tanker, so the two sentences convert to "The cruise ship has been loaded with mailbags" and "The oil tanker has been loaded with crude oil", respectively. Other words are handled in the same way.
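The context-based choice between homophones can be illustrated with a very small scoring rule. This is only a sketch: the text names context and statistics as the means of disambiguation but gives no algorithm, so the cue-word table, the overlap score, and the function name are assumptions.

```python
# Hypothetical table: each candidate word lists context words that support it.
HOMOPHONES = {
    "ysvlune": {
        "cruise ship": {"mailbag", "passenger"},
        "oil tanker": {"crude oil", "petroleum"},
    }
}


def choose_word(code, context_words, table=HOMOPHONES):
    """Pick the candidate whose cue words overlap most with the context."""
    candidates = table[code]
    context = set(context_words)
    return max(candidates, key=lambda word: len(candidates[word] & context))


print(choose_word("ysvlune", ["mailbag"]))
print(choose_word("ysvlune", ["crude oil"]))
```

A real system would presumably replace the overlap count with statistics gathered from a corpus; the lookup structure would stay the same.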
The results of the bidirectional reversible conversion above can be displayed either separately or side by side, for example:
Original sentence: "We can use Chinese Latin." With the method of the invention, the computer can reversibly convert it into the following forms:
Etc.
To help foreigners or members of China's ethnic minorities understand the meaning of the Chinese and learn Chinese more easily, the corresponding foreign-language or minority-language word can also be inserted for each word in the side-by-side display. In the text below, for example, the corresponding English word is added as a gloss on the Chinese meaning:
wovmno huiu xrvydu hcuyyv laadqawnv.
Wǒmen huì shǐyòng hànyǔ lādīngwěn。
We can use Chinese Latin.
In the same way, the method above allows bidirectional reversible conversion between any polysyllabic word-based Chinese characters or pinyin and Chinese phonetics codes, with separate or side-by-side display as required. Building on these Chinese words, bidirectional reversible conversion can be achieved between any word-based Chinese characters and pinyin on the one hand and full-spelling and simplified Chinese phonetics codes on the other, which facilitates all kinds of Chinese character and Chinese information processing.
By looking up the code table, the simplified Chinese phonetics codes "wovmnohuiuxrvyduhcuyyvlaadqawnv." can be converted into the following fuller forms:
1. wovmenohuiushivyonguhanuyyvlaadingawenv. (full spelling with tone letters)
2. wo3men5hui4shi3yong4han4yy3la1ding1wen3. (full spelling with tone digits)
3. wovmenohuiushivyduhanuyyvlaadqawenv. (Mixed Pinyin with tone letters)
Etc.
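The three forms listed above can be reproduced with a small table lookup per syllable. The sketch below is not the patent's code table: the entries are inferred from this single worked example, the syllable layout (one-letter initial, optional medial, one-letter final code, one-letter tone code) is an assumption, and all names are hypothetical.

```python
# Entries inferred from the worked example only; the real code table is larger.
INITIAL_CODES = {"x": "sh"}
FINAL_CODES = {"r": "i", "n": "en", "d": "ong", "c": "an", "q": "ing"}
TONE_DIGITS = {"a": "1", "v": "3", "u": "4", "o": "5"}


def expand(syllable, tone_digits=False):
    """Expand one simplified syllable into its full-spelling form."""
    initial, medial = syllable[0], syllable[1:-2]
    final, tone = syllable[-2], syllable[-1]
    body = INITIAL_CODES.get(initial, initial) + medial + FINAL_CODES.get(final, final)
    return body + (TONE_DIGITS[tone] if tone_digits else tone)


def convert(syllables, keep_simplified=(), tone_digits=False):
    """Join syllables, leaving those at the given indexes in simplified form."""
    return "".join(
        s if i in keep_simplified else expand(s, tone_digits)
        for i, s in enumerate(syllables)
    ) + "."


SYLLABLES = ["wov", "mno", "huiu", "xrv", "ydu", "hcu", "yyv", "laa", "dqa", "wnv"]
print(convert(SYLLABLES))                          # full spelling, tone letters
print(convert(SYLLABLES, tone_digits=True))        # full spelling, tone digits
print(convert(SYLLABLES, keep_simplified={4, 8}))  # Mixed Pinyin
```

Only the initial, medial, and final are looked up; the tone code is passed through unchanged unless digits are requested, which mirrors the rule that the tone code stays the same across forms.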
(5) Steps and method for bidirectional conversion between Chinese phonetics codes and Chinese speech:
1. Steps and method for converting Chinese phonetics codes into Chinese speech:
When Chinese phonetics codes are converted into Chinese speech, the system can look up the table that maps Chinese syllables in Chinese phonetics codes to Chinese syllable speech-synthesis files, or the table that maps word-based Chinese phonetics codes to Chinese word speech-synthesis files. It can also use the maximum matching method, looking up the table that maps Chinese phonetics code strings, in units of the longest paragraph, to Chinese paragraph speech-synthesis files, and output the corresponding Chinese speech. When the speech-synthesis files of the syllables, words, or paragraphs corresponding to these Chinese phonetics codes or code strings are replaced with speech-synthesis files of a particular Chinese speaker, a Chinese dialect, or a minority language, looking up the corresponding table outputs the voice of that speaker, dialect, or minority language, respectively. To synthesize foreign-language speech, the system looks up the table that maps Chinese phonetics codes, in units of words, phrases, or word groups, to the speech-synthesis files of the corresponding foreign words, phrases, or word groups, and outputs the corresponding foreign-language speech. For systems that can synthesize Chinese speech only when given the initial, medial, final, and tone of each Chinese syllable, the Chinese phonetics codes can be converted according to the code table into the pinyin initial, medial, final, and tone information, which is then fed into the Chinese speech-synthesis system. To synthesize speech for the punctuation marks and the line-break hyphen in a text written in Chinese phonetics codes, it suffices to retrieve the audio files, stored in the computer in advance, for the six stop marks, seven other marks, and the line-break hyphen of Chinese, and to play them with sound playback software.
When the input is Chinese information expressed in Chinese characters or pinyin, the Chinese characters or pinyin can first be converted into full-spelling or Mixed Pinyin Chinese phonetics codes by the Chinese characters phonetic and Chinese voice code bidirectional modular converter stored in the computer system in advance, and then converted into the speech of Chinese, a particular Chinese speaker, a Chinese dialect, a minority language, or foreign words, phrases, or word groups as described above. We call the method of this step the phonetic code voice synthetic module.
Some examples of converting Chinese phonetics codes into speech by the method of the invention are given below:
For example: wovmnohuiuxrvyduhcuyyvlaadqawnv.
This is Chinese information expressed in Chinese phonetics codes; written in Chinese characters, its meaning is:
"We can use Chinese Latin."
(1) Speech synthesis by looking up the table that maps Chinese phonetics code syllables to Chinese syllable speech-synthesis files:
Looking up this table yields the audio files of the Chinese speech corresponding to the phonetic codes. (For convenience of description, each audio file is written here as "pinyin of the corresponding syllable.wav". The real files carry no pinyin marks; they are simply audio files, stored in the computer in advance, that represent the Chinese speech of the corresponding syllable and can be played by sound playback software.)
wov(wǒ.wav)mno(men.wav)huiu(huì.wav)xrv(shǐ.wav)ydu(yòng.wav)hsu(hàn.wav)yyv(yǔ.wav)laa(lā.wav)dqa(dīng.wav)wnv(wěn.wav).
The audio files found for the syllables are played in order by sound playback software. Playback is continuous, but the interval between words is longer than the interval between syllables of the same word, so that the result sounds closer to reading aloud word by word and better matches listening habits.
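The playback rule above, with a longer pause between words than between syllables of one word, can be sketched as follows. The gap lengths, the function name, and the restriction of the table to a few syllables are assumptions; nothing is actually played, the function only returns the play order together with the pause that follows each file.

```python
# Hypothetical syllable-to-file table using the placeholder file names quoted above.
SYLLABLE_AUDIO = {
    "wov": "wǒ.wav",
    "mno": "men.wav",
    "huiu": "huì.wav",
    "xrv": "shǐ.wav",
    "ydu": "yòng.wav",
}


def build_playlist(words, table=SYLLABLE_AUDIO, syllable_gap_ms=40, word_gap_ms=160):
    """Return (audio file, pause after it in ms) pairs in playback order.

    `words` is a list of words, each given as a list of syllable codes.
    """
    playlist = []
    for syllables in words:
        for position, code in enumerate(syllables):
            is_last = position == len(syllables) - 1
            playlist.append((table[code], word_gap_ms if is_last else syllable_gap_ms))
    return playlist


for entry in build_playlist([["wov", "mno"], ["huiu"], ["xrv", "ydu"]]):
    print(entry)
```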
(2) Speech synthesis by looking up the table that maps Chinese word phonetic codes to word speech-synthesis files:
Looking up this table yields the word-based audio files of Chinese speech, stored in the computer in advance, that correspond to the word phonetic codes. (For convenience of description, each word-based Chinese audio file is written here as "pinyin of the corresponding word.wav". The real files carry no pinyin marks; they are simply audio files, stored in the computer in advance, that represent the Chinese speech of the corresponding word and can be played by sound playback software.)
wovmno(wǒmen.wav)huiu(huì.wav)xrvydu(shǐyòng.wav)hcuyyv(hànyǔ.wav)laadqawnv(lādīngwěn.wav).
The audio files found for the words are played in order by sound playback software. Playback is continuous, but the interval between words is longer than the interval between syllables of the same word, so that the result sounds closer to reading aloud word by word and better matches listening habits.
(3) Speech synthesis by looking up the table that maps Chinese phonetics code strings to maximum-matching paragraph speech-synthesis files:
This method uses maximum matching: the table that maps Chinese phonetics code strings, in units of the longest paragraph, to paragraph speech-synthesis files is looked up and the corresponding Chinese speech is output. If, for example, the longest paragraphs stored in the computer in advance are "wovmnohuiuxrvydu" ("we can use") and "hcuyyvlaadqawnv" ("Chinese Latin"), speech synthesis proceeds as follows:
wovmnohuiuxrvydu(wǒmenhuìshǐyòng.wav)hcuyyvlaadqawnv(hànyǔlādīngwěn.wav).
(For convenience of description, each paragraph-based Chinese audio file above is written as "pinyin of the corresponding paragraph.wav". The real files carry no pinyin marks; they are simply audio files, stored in the computer in advance, that represent the Chinese speech of the corresponding paragraph and can be played by sound playback software.)
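The maximum matching lookup can be sketched as a greedy longest-match over the sequence of words, falling back to shorter units when no longer one is stored. The table layout (tuples of word codes as keys), the fallback entries, and the function name are assumptions for illustration; the file names are the placeholders used above.

```python
# Hypothetical table: keys are tuples of word codes, values are audio files.
AUDIO_TABLE = {
    ("wovmno", "huiu", "xrvydu"): "wǒmenhuìshǐyòng.wav",
    ("hcuyyv", "laadqawnv"): "hànyǔlādīngwěn.wav",
    ("wovmno",): "wǒmen.wav",
    ("huiu",): "huì.wav",
}


def select_audio(words, table=AUDIO_TABLE):
    """Cover the word sequence with the longest stored units, left to right."""
    files = []
    i = 0
    while i < len(words):
        for n in range(len(words) - i, 0, -1):  # longest candidate first
            key = tuple(words[i:i + n])
            if key in table:
                files.append(table[key])
                i += n
                break
        else:
            raise KeyError(f"no audio file stored for {words[i]!r}")
    return files


print(select_audio(["wovmno", "huiu", "xrvydu", "hcuyyv", "laadqawnv"]))
```

Trying the longest candidate first means a stored paragraph recording is preferred over the recordings of its individual words.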
Likewise, in the three cases above, if the speech-synthesis files of the syllables, words, or paragraphs corresponding to the phonetic codes are replaced with speech-synthesis files of a particular Chinese speaker, a Chinese dialect, or a minority language, the speech synthesized by the computer is that of the particular speaker, the dialect, or the minority language, respectively.
In general, no fixed correspondence can be established between the sounds of foreign-language syllables and those of Chinese syllables, and the word order of foreign-language sentences differs from that of Chinese sentences. A fixed correspondence can be established only between Chinese words, phrases, or word groups and foreign words, phrases, or word groups. Synthesis of foreign-language speech from Chinese phonetics codes can therefore be carried out only at the level of words, phrases, or word groups, not syllable by syllable or sentence by sentence. For example, the word wovmno ("we") can be synthesized as the sound of the English word (we.wav), and the phrase or word group hcuyyvlaadqawnv as the sound of the English phrase or word group (ChineseLatin.wav). (Here we.wav and ChineseLatin.wav denote the audio files of English "we" and "Chinese Latin" stored in the computer in advance, which can be played by sound playback software.) If a Chinese dialect or minority language presents the same situation as a foreign language, the same approach is taken: speech synthesis is carried out only between words, phrases, or word groups.
Some speech-synthesis systems, such as robots that must imitate the changing mouth shapes of human articulation, need to know the initial, medial, final, and tone of each Chinese syllable before they can synthesize Chinese speech. Because the Chinese phonetics codes of the invention contain the initial, medial, final, and tone of each Chinese syllable, the codes can be converted into this information, using the table above that maps Chinese phonetics codes to pinyin initials, medials, finals, and tones, and then fed into the robot's Chinese speech-synthesis system. For the Chinese phonetics code "wovmno", for example, the table shows that w represents the pinyin initial (w), o represents the pinyin final (o), and v represents the third tone (ˇ); m represents the pinyin initial (m), n represents the pinyin final (en), and o represents the neutral tone (unmarked).
Similarly, by the method above any polysyllabic Chinese phonetics code can be converted into the initial, medial, final, and tone information of pinyin and fed into a Chinese speech-synthesis system of the robot-like kind described above, which meets the requirements of such a system.
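The decomposition of "wovmno" described above can be written as a small lookup. This is a sketch under assumptions: the tables contain only the entries named in the example, the syllable is assumed to consist of a one-letter initial, an optional medial, a one-letter final code, and a one-letter tone code, and the type and function names are hypothetical.

```python
from typing import NamedTuple


class SyllableInfo(NamedTuple):
    initial: str
    medial: str
    final: str
    tone: str


# Only the entries mentioned in the "wovmno" example are included.
FINAL_CODES = {"n": "en"}
TONE_CODES = {"v": "third tone", "o": "neutral tone"}


def decompose(code):
    """Split one syllable code into pinyin initial, medial, final, and tone."""
    return SyllableInfo(
        initial=code[0],
        medial=code[1:-2],
        final=FINAL_CODES.get(code[-2], code[-2]),
        tone=TONE_CODES[code[-1]],
    )


for syllable in ["wov", "mno"]:
    print(syllable, decompose(syllable))
```

The resulting fields are what a synthesis system of the kind described would take as input for each syllable.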
Sometimes, for convenience in proofreading, the punctuation marks and the line-break hyphen in a text written in Chinese phonetics codes need to be read aloud, which requires speech synthesis for them. So that Chinese information expressed in Chinese phonetics codes is 100% compatible with ASCII, we stipulate that the punctuation marks and the line-break hyphen in such a text are identical to the punctuation marks and hyphen of English. For speech synthesis it suffices to retrieve the corresponding audio files stored in the computer in advance and play them with sound playback software, for example:
Six stop marks: full stop "." (jùhào.wav), question mark "?" (wenhào.wav), exclamation mark "!" (gǎntànhào.wav), comma "," (dòuhào.wav), colon ":" (màohào.wav), semicolon ";" (fēnhào.wav).
Seven other marks: quotation marks " " (yǐnhào.wav), parentheses () (kuòhào.wav), dash "-" (pòzhéhào.wav), ellipsis ... (shěngluehào.wav), emphasis mark . (zhuózhònghào.wav), title marks (()) (shūmínghào.wav), separation dot . (jiàngéhào.wav).
Line-break hyphen: "-" (yíhánghào.wav).
The six stop marks, seven other marks, and line-break hyphen of the invention, all identical to those of English, are listed above. The ".wav" file in parentheses is the speech-synthesis file for the pronunciation of the corresponding punctuation mark or line-break hyphen. When this file is a Chinese speech-synthesis file, the mark is read aloud as the Chinese name of that mark; when it is a speech-synthesis file of a particular Chinese speaker, a Chinese dialect, or a minority language, the mark is read aloud in the voice of that speaker, dialect, or minority language, respectively.
When the input is Chinese information expressed in Chinese characters or pinyin, the Chinese characters or pinyin can first be converted into Chinese phonetics codes by the Chinese characters phonetic and Chinese voice code bidirectional modular converter and then converted into the speech of foreign words, phrases, or word groups, Chinese, a particular Chinese speaker, a Chinese dialect, a minority language, and so on, as described above.
2. Steps and method for converting Chinese speech into Chinese phonetics codes:
When Chinese speech is converted into Chinese phonetics codes, the Chinese speech recognition system can take Chinese paragraphs, Chinese words, and Chinese syllables, in that order, as recognition primitives. It looks up the tables, stored in the computer in advance, that map Chinese paragraph speech templates to Chinese paragraph phonetic codes, Chinese word speech templates to Chinese word phonetic codes, and Chinese syllable speech templates to Chinese syllable phonetic codes, and after matching identifies the corresponding paragraph, word, or syllable phonetic codes. With continuous speech input this yields, respectively, a continuous string of paragraph phonetic codes, word phonetic codes, or syllable phonetic codes. A syllable phonetic code string is segmented into words by the word segmentation module stored in the computer system in advance; word and paragraph phonetic code strings, which are already divided into words, need no further segmentation. In the segmented result, the syllables of each word are written together and words are separated by spaces. When the Chinese phonetics codes need to be converted further into Chinese characters or pinyin, the Chinese characters phonetic and Chinese voice code bidirectional modular converter stored in the computer system in advance performs the conversion and outputs the corresponding Chinese characters or pinyin. When the speech is Chinese with a dialect accent or a Chinese dialect, a similar method applies, provided that the syllables, words, or paragraphs of that dialect correspond to Chinese syllables, words, or paragraphs respectively: the system looks up the tables, stored in the computer in advance, that map the speech templates of the syllables, words, or paragraphs of the accented Chinese, or of the corresponding dialect syllables, words, or paragraphs, to Chinese syllable, word, or paragraph phonetic codes, and after matching identifies the corresponding phonetic code string. This achieves recognition of the accented Chinese or dialect as Chinese phonetics codes, that is, its conversion into Chinese phonetics codes. We call the method of this step the Chinese phonetics codes sound identification module;
Some examples of converting Chinese speech into phonetic codes by the method of the invention are given below:
For example, we read aloud in Chinese: "We can use Chinese Latin."
(1) By looking up the table, stored in the computer in advance, that maps Chinese syllable speech templates to Chinese syllable phonetic codes, the corresponding Chinese syllable phonetic code string is identified after matching:
wov mno huiu xrv ydu hsu yyv laa dqa wnv. (syllables separated by spaces)
Or wovmnohuiuxrvyduhcuyyvlaadqawnv. (syllables not separated by spaces)
(Once the reader is proficient, the neutral-tone symbol o in mno may be omitted where this causes no ambiguity.)
For clarity, the letters representing tones are underlined here. A tone letter in a phonetic code also serves to separate syllables. In actual phonetic codes the tone letters are not underlined; once the reader is proficient with phonetic codes, the tone letters, which double as syllable separators, are easy to pick out.
This completes a pure speech recognition process whose system complexity is independent of the size of the system's dictionary.
If the speech is Chinese with a dialect accent or a Chinese dialect, a similar method applies, provided that the syllables of that dialect correspond to Chinese syllables: the system looks up the table, stored in the computer in advance, that maps the speech templates of the accented Chinese, or of the dialect syllables that correspond to Chinese syllables, to Chinese syllable phonetic codes, and after matching identifies the corresponding Chinese syllable phonetic code string. This achieves recognition of the accented Chinese or dialect as Chinese phonetics codes, that is, its conversion into Chinese phonetics codes.
(2) The phonetic code string is segmented into words, which completes the word-based phonetic code conversion.
By looking up a Chinese phonetics code word dictionary that has been divided into words in advance, the syllables of each word are written together and words are separated by spaces, which yields the final Chinese phonetics codes:
wovmno huiu xrvydu hcuyyv laadqawnv.
To obtain a conventional speech recognition result, the following conversion can also be carried out. It should be emphasized that this process has no necessary connection with the speech recognition system; this standard conversion module can run independently of the speech recognition system.
Chinese phonetics codes are converted into Chinese characters and pinyin simply by passing them through the Chinese characters phonetic and Chinese voice code bidirectional modular converter:
For example, "wovmnohuiuxrvyduhcuyyvlaadqawnv."
is converted by this module into the following sentence in pinyin and in Chinese characters:
“Wǒmenhuìshǐyònghànyǔlādīngwěn。”
"We can use Chinese Latin."
The results of the recognition above can be displayed either separately or side by side, for example:
Original sentence: "We can use Chinese Latin." It can be converted into the following forms:
To help foreigners or members of China's ethnic minorities understand the meaning of the Chinese and learn Chinese more easily, the corresponding foreign-language or minority-language word can also be inserted for each word in the side-by-side display. In the text below, for example, the corresponding English word is added as a gloss on the Chinese meaning:
wovmno huiu xrvydu hcuyyv laadqawnv.
Wǒmen huì shǐyòng hànyǔ lādīngwěn。
We can use Chinese Latin.
Likewise, the method above also supports speech recognition in units of paragraphs and words, and can recognize any polysyllabic Mandarin speech as Chinese phonetics codes, which can be converted further into Chinese characters or pinyin as required. The Chinese phonetics codes, Chinese characters, and pinyin can be displayed separately or side by side. Building on these Chinese words, any Chinese speech information can be recognized, which facilitates all kinds of Chinese speech information processing.
(6) Steps and method for bidirectional machine translation between Chinese and foreign languages
The method adopted for bidirectional machine translation between Chinese and a foreign language builds on a morphology and syntax of the source language that are basically consistent with those of the target language, and achieves bidirectional machine translation through bidirectional sentence-pattern conversion between Chinese and the foreign language. The machine used for machine translation here is a general-purpose computer or an embedded computer system, hereinafter referred to as the computer or computer system. Morphology here concerns the definition and classification of parts of speech and the study of words, their inflection, and their usage. Syntax concerns the definition and classification of sentence constituents and the study of sentence types, sentence structure, and internal form. A sentence pattern is the order and form of the parts of speech (or their equivalents) of the words, phrases, word groups, and clauses within a sentence, together with the constituents they serve as. Before translation, the correspondence between the part-of-speech strings and the sentence patterns of sentences in the same language is first established manually. Then, on the basis of a Chinese morphology system and syntax system that are basically consistent with those of the foreign language to be translated, the sentence-pattern correspondence between the two languages is established and stored in the computer. During translation, the machine first scans the source-language sentence and obtains its part-of-speech string by looking up the part-of-speech-tagged source-language dictionary stored in the computer in advance. By looking up the stored table that maps source-language part-of-speech strings to source-language sentence patterns, it converts the part-of-speech string into the corresponding source-language sentence pattern. By looking up the stored table that maps source-language sentence patterns to target-language sentence patterns, it converts the source-language sentence pattern into the matching target-language sentence pattern. Finally, by looking up the stored bilingual dictionary of the source and target languages, it translates the source-language words or phrases into target-language words or phrases and outputs them word by word in the order of the target-language sentence pattern, which yields the required target-language sentence;
For a complex sentence in the source language, grammatical analysis is performed first to extract all the clauses it contains, layer by layer, until the last simple sentence has been extracted; each clause is then machine-translated in the manner described above for simple sentences. The complex-sentence part undergoes sentence-pattern conversion by looking up the stored table of source-language and target-language sentence patterns, and the constituents of the complex sentence other than the clauses are translated. Finally, each translated clause is placed at the corresponding position in the converted complex-sentence pattern. This cycle repeats until the whole required target-language sentence is obtained;
When the source language is Chinese expressed in Chinese characters, pinyin, or Chinese speech, the Chinese characters phonetic and Chinese voice code bidirectional modular converter and the Chinese phonetics codes sound identification module stored in the computer in advance first convert the Chinese characters, pinyin, or speech into Chinese phonetics codes, which are then translated. When a foreign language is translated into Chinese, the resulting target language, expressed in Chinese phonetics codes, is either used directly to express the Chinese information or, where necessary, converted by the stored Chinese characters phonetic and Chinese voice code bidirectional modular converter and Chinese phonetics codes voice synthetic module into Chinese characters, pinyin, or Chinese speech, or into the voice of a particular Chinese speaker, a Chinese dialect, or a minority language, for output;
Content in the source language that does not lend itself to grammatical analysis, such as classical Chinese, poetry, idioms, allusions, slang, and abbreviations, is not subjected to part-of-speech lookup or sentence-pattern conversion. Instead, before part-of-speech lookup and sentence-pattern conversion, it is matched directly against a one-to-one case library stored in the machine in advance and then output;
Some examples of two-way translation between Chinese and English by the method of the invention are given below:
1. wovmnomwvtisaxrvydulaadqawnv. (Chinese information expressed in Chinese phonetics codes)
We use Latin every day. (the same Chinese information expressed in Chinese characters)
a) Look up the Chinese dictionary that marks word parts of speech and build the part-of-speech string of the sentence (the part in parentheses is the part of speech; the same applies below):
wovmno (personal pronoun 1) + mwvtisa (time noun 1) + xrvydu (verb 1) + laadqawnv (noun 2).
we (personal pronoun 1) + every day (time noun 1) + use (verb 1) + Latin (noun 2).
b) Using the part-of-speech string obtained above, look up the Chinese sentence pattern stored in advance in the table (a sentence pattern is the string of each part of speech together with the sentence component that the word serves as; the same applies below):
wovmno (personal pronoun 1 as subject) + mwvtisa (time noun 1 as time adverbial) + xrvydu (verb 1 as predicate) + laadqawnv (noun 2 as object)
we (personal pronoun 1 as subject) + every day (time noun 1 as time adverbial) + use (verb 1 as predicate) + Latin (noun 2 as object)
c) Using the Chinese sentence pattern obtained above, look up the corresponding English sentence pattern stored in advance in the table:
wovmno (personal pronoun 1 as subject) + xrvydu (verb 1 as predicate) + laadqawnv (noun 2 as object) + mwvtisa (time noun 1 as time adverbial)
we (personal pronoun 1 as subject) + use (verb 1 as predicate) + Latin (noun 2 as object) + every day (time noun 1 as time adverbial)
At this point, looking up the Chinese-English dictionary to convert the meaning of each word or phrase and outputting the result in the order of this sentence pattern already completes the translation from Chinese into English. To show that this machine translation process is bidirectional, we carry out the following further conversions:
d) Using the English sentence pattern obtained above, look up the part-of-speech string stored in advance in the table that matches the parts of speech of the corresponding English words or phrases (this part-of-speech string can also be extracted from the target-language sentence pattern already obtained; the same applies below):
wovmno (personal pronoun 1) + xrvydu (verb 1) + laadqawnv (noun 2) + mwvtisa (time noun 1).
we (personal pronoun 1) + use (verb 1) + Latin (noun 2) + every day (time noun 1).
e) Look up the Chinese-English dictionary to convert the meaning of each word or phrase, and output in the order of the English sentence pattern obtained above:
We (personal pronoun 1) use (verb 1) Latin (noun 2) every day (time noun 1).
We use Latin every day.
This completes the translation from Chinese into English. Besides converting from a) to e), the same method can be applied from e) back to a), in which case English is translated into Chinese. This shows that the machine translation process realised by the method of the invention is bidirectional.
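Steps a) to e) can be sketched as a chain of table lookups. The following is only a minimal illustration under stated assumptions: the three tables hold just the entries of the worked example, the input is assumed to be already split into words, and all names (`LEXICON`, `CHINESE_PATTERNS`, `ZH_TO_EN_PATTERNS`, `translate_simple_sentence`) are hypothetical rather than taken from the invention.

```python
# Hypothetical tables holding only the entries of the worked example.
# Step a: dictionary marking each word with its part of speech, plus the
# English equivalent used for the final dictionary lookup.
LEXICON = {
    "wovmno": ("personal pronoun", "we"),
    "mwvtisa": ("time noun", "every day"),
    "xrvydu": ("verb", "use"),
    "laadqawnv": ("noun", "Latin"),
}

# Step b: part-of-speech string -> Chinese sentence pattern (component roles).
CHINESE_PATTERNS = {
    ("personal pronoun", "time noun", "verb", "noun"):
        ("subject", "time adverbial", "predicate", "object"),
}

# Step c: Chinese sentence pattern -> English sentence pattern (role order).
ZH_TO_EN_PATTERNS = {
    ("subject", "time adverbial", "predicate", "object"):
        ("subject", "predicate", "object", "time adverbial"),
}


def translate_simple_sentence(words):
    """Translate a pre-segmented simple sentence by table lookup only."""
    pos_string = tuple(LEXICON[w][0] for w in words)       # step a
    zh_pattern = CHINESE_PATTERNS[pos_string]              # step b
    en_pattern = ZH_TO_EN_PATTERNS[zh_pattern]             # step c
    # Assumes each role occurs once in the pattern.
    role_to_word = dict(zip(zh_pattern, words))
    reordered = [role_to_word[role] for role in en_pattern]
    return " ".join(LEXICON[w][1] for w in reordered)      # dictionary lookup


print(translate_simple_sentence(["wovmno", "mwvtisa", "xrvydu", "laadqawnv"]))
```

Running the tables in the opposite direction (English pattern to Chinese pattern, English word to code) would give the reverse translation in the same way.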
In the same way, the translation method described above for complex sentences can be used to realise two-way translation of complex sentences to and from Chinese.
If the following are added to the system for any source language and target language: a case library of content that is not amenable to grammatical analysis, a two-way translation dictionary marked with parts of speech, a table relating the part-of-speech string expressions of sentences to the sentence-pattern expressions of the same language, and a syntactic conversion comparison table between the different languages, then running the same translation program realises two-way translation between any languages. It is worth pointing out that these conversion and comparison tables should as far as possible be built on the same or similar morphological and syntactic systems. The difference between source and target language then shows up only as differences in words or phrases and in sentence patterns, so only words or phrases and sentence patterns need to be converted during machine translation; the morphological and syntactic systems, being the same or similar, generally need no conversion.
When the elements of this translation method that correspond to Chinese, namely the manually built Chinese dictionary marked with parts of speech, the comparison table of Chinese part-of-speech strings and Chinese sentence patterns, and the syntactic conversion comparison table between Chinese and the target foreign language, are replaced by the corresponding elements for another foreign language and likewise stored in advance in the computer, the translation method can also be extended to a machine translation method from one foreign language into another. We call the method of this step the Chinese phonetics codes Chinese and foreign language two-way translation module.
(7) Steps and method for Chinese phonetics codes domain name conversion and web page login
The web address used to log in to a website, or the e-mail address used, is a legal address containing Chinese phonetics codes in spelling or Mixed Pinyin form. A website address may be composed of a stem in Chinese phonetics codes (spelling or Mixed Pinyin) plus a legal web address prefix and suffix; an e-mail address is composed of such a stem plus a legal e-mail address suffix. Alternatively, a correspondence may be established between the Chinese phonetics codes and the legal web address prefixes and suffixes or the legal e-mail address suffixes. In all cases the web address or e-mail address is entered into the computer or embedded computer system as Chinese phonetics codes in spelling or Mixed Pinyin form. When the Chinese phonetics codes forming a web address or e-mail address take the form of a phrase or sentence, the words of that phrase or sentence are written with no spaces between them. Input may be made on a standard English keyboard, or by Western-language handwriting recognition, Western-language optical recognition, letter-word speech recognition or Chinese phonetics codes speech recognition.
When speech corresponding to a web address or e-mail address stored in advance is spoken to the computer or embedded computer system, the system looks up the stored web address or e-mail address corresponding to that speech, displays it in the browser address bar, and opens the corresponding web page, website or e-mail box. The browser may be opened manually beforehand, or opened automatically, according to a prior setting of the computer system, once the computer or embedded computer system hears the speech signal. After the website has opened, the system may continue to recognise subsequent speech; after recognition and search it opens the corresponding page of the website, or moves the cursor to the corresponding page content, and carries out further processing according to the prior settings. We call the method of this step the Chinese phonetics codes domain name conversion and web page login module;
(8) Steps and method for inputting various information with Chinese phonetics codes:
Chinese phonetics codes used for inputting various information may be entered on a standard English keyboard, or by Western-language handwriting recognition, Western-language optical recognition, letter-word speech recognition or Chinese phonetics codes speech recognition. When Chinese phonetics codes are entered in the code input box, all or some of the corresponding items of the various information whose meaning matches the entered codes first appear, according to the settings, in the candidate box; after selection and confirmation they are entered into the target computer or hand-held embedded mobile device. The various information includes, in units of words, phrases or sentences and matching the meaning of the entered codes: Chinese characters; the pinyin of the traditional "Scheme for the Chinese Phonetic Alphabet"; Chinese phonetics codes in spelling or Mixed Pinyin form; other phonetic alphabets having a one-to-one relationship with the traditional "Scheme for the Chinese Phonetic Alphabet"; foreign languages; the scripts of China's minority nationalities; and the wired or wireless network domain names or web addresses corresponding to the entered codes. By default, the entered codes and the corresponding domain names or web addresses can be converted into each other automatically according to the corresponding code table. When the entered codes correspond to Chinese homophones whose initial, final and tone are all identical, the candidate box lists each of these homophones, together with the foreign-language and minority-language words matching its meaning, in order marked by specific characters such as Arabic numerals; after selection and confirmation they are entered into the target computer or hand-held embedded mobile device;
The wired or wireless network domain name or web address corresponding to the entered Chinese phonetics codes can be obtained directly by composing "legal domain name or web address prefix + entered Chinese phonetics codes + legal domain name or web address suffix", or by calling the Chinese phonetics codes domain name conversion and web page login module. When the codes are entered as a phrase or sentence, the words of the phrase or sentence in the corresponding domain name or web address have no spaces between them. When Chinese-character homophones occur, the domain name or web address corresponding to the entered codes stays the same;
After Chinese phonetics codes are entered in the code input box, either each corresponding item shown in the candidate box is preceded by its own specific character, such as an Arabic numeral, and a single item is entered and displayed by keying in that character; or all the corresponding items, including the entered Chinese phonetics codes themselves, are identified by one specific character such as an Arabic numeral, and keying in that character enters and displays all the corresponding items at once, set out for comparison (contrast input);
By establishing correspondences among the corresponding items of the various information, entering any one of these items in the input box causes the other corresponding items to be shown in the candidate box in the preset form; after selection and confirmation they are entered, in the set display format, into the target computer or hand-held embedded mobile device.
Some examples of inputting various information by the method of the invention are given below:
For example, when the Chinese phonetics codes "wovmno" are entered in the code input box, the candidate box can display in either of the following two ways, depending on how the selection symbols are set. (To keep the examples compact and clear, English is used as the foreign language. Other foreign-language and minority-language items are shown as "other foreign language one", "other foreign language two", "minority language one", "minority language two" and so on; in practice these positions show the actual languages, for example Japanese for other foreign language one, Russian for other foreign language two, Tibetan for minority language one and Uighur for minority language two, and the kinds and number of foreign and minority languages are not limited. Of the phonetic alphabets having a one-to-one relationship with the traditional "Scheme for the Chinese Phonetic Alphabet", the examples list only Chinese phonetics codes; others, such as the phonetic symbols used in Taiwan and pinyin with digits as tone marks, are not listed. Likewise, only the usual PC Internet web address is listed among the wired or wireless domain names and web addresses. Unlisted items can be displayed as candidates and entered in the same way as needed.)
(1) If a specific selection symbol, such as an Arabic numeral, is placed before each candidate item, the candidate box shows the corresponding items whose meaning matches the entered Chinese phonetics codes "wovmno": 1. "we" in Chinese characters 2. wǒmen 3. we 4. other foreign language one 5. other foreign language two 6. minority language one 7. minority language two 8. http://www.wovmno.com. Keying in 1, 2, 3, 4, 5, 6, 7 or 8 selects and enters, respectively, the Chinese characters, the pinyin, the English, other foreign language one, other foreign language two, minority language one, minority language two or the corresponding web address. Any single candidate item can thus be conveniently selected and entered.
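The numbered candidate list of form (1) can be sketched as a small lookup table. This is an illustrative sketch only: the table `CANDIDATES` and the functions `show_candidates` and `select` are hypothetical names, and the Chinese-character, foreign-language and minority-language entries are descriptive stand-ins, since the text gives only placeholders for them.

```python
# Hypothetical correspondence table for one entered code. The order follows
# the numbered candidate list in the text; several values are stand-ins.
CANDIDATES = {
    "wovmno": [
        ("Chinese characters", "we (Chinese characters)"),
        ("pinyin", "wǒmen"),
        ("English", "we"),
        ("other foreign language one", "other foreign language one"),
        ("other foreign language two", "other foreign language two"),
        ("minority language one", "minority language one"),
        ("minority language two", "minority language two"),
        ("web address", "http://www.wovmno.com"),
    ],
}


def show_candidates(code):
    """Return the candidate box lines, each preceded by its own numeral."""
    items = CANDIDATES[code]
    return [f"{n}. {text}" for n, (_kind, text) in enumerate(items, start=1)]


def select(code, number):
    """Return the single item chosen by keying in its numeral."""
    return CANDIDATES[code][number - 1][1]


for line in show_candidates("wovmno"):
    print(line)
print(select("wovmno", 3))
```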
(2) If all the corresponding items whose meaning matches the entered Chinese phonetics codes, including the codes themselves, are identified by a single selection symbol, for example an Arabic numeral, the candidate box can show: 1. "we" in Chinese characters, wǒmen, wovmno, we, other foreign language one, other foreign language two, minority language one, minority language two, http://www.wovmno.com. Keying in the numeral "1" then enters all of these items at once, as a contrast input, into the target computer or hand-held embedded mobile device. After input, the contrast display may show all items, or selected items as required, side by side horizontally, or arranged vertically one above another. In this form, items of the same meaning can be selected in one step and all or some of them entered and displayed together for comparison. It should also be emphasised that the coding method adopted by the invention for the corresponding wired or wireless domain names or web addresses is "domain name or web address prefix + entered Chinese phonetics codes + domain name or web address suffix". In the examples the prefix is http://www., the entered Chinese phonetics codes forming the address are "wovmno", and the suffix is ".com". In practice the prefix and suffix are not limited to these; any prefix and suffix legal on the wired or wireless Internet may be used. For example, the prefix may also be FTP://, GOPHER://, TELNET://, NEWS://, http://wap. and the like, and the suffix may also be ".net", ".org", ".cn", ".cc", ".tv", ".mobi", ".biz", ".info", ".com.cn" and the like.
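The "prefix + entered Chinese phonetics codes + suffix" composition can be sketched in a few lines. The function name `compose_address` is hypothetical, the defaults are the prefix and suffix used in the examples, and the two-word input in the last call is only an illustration of joining words without spaces.

```python
def compose_address(words, prefix="http://www.", suffix=".com"):
    """Join the code words with no spaces and wrap them in prefix and suffix."""
    keyword = "".join(words)
    if any(ch.isspace() for ch in keyword):
        # The keyword of a domain name or web address may not contain spaces.
        raise ValueError("code words must not contain whitespace")
    return prefix + keyword + suffix


print(compose_address(["wovmno"]))
print(compose_address(["wovmno"], prefix="http://wap.", suffix=".cn"))
print(compose_address(["wovmno", "xrvydu"]))  # illustrative two-word phrase
```

Keeping the prefix and suffix as parameters reflects the point that any legal prefix or suffix may be substituted without changing the code keyword.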
When various information is entered with Chinese phonetics codes as the input code and Chinese homophones are encountered, that is, words written with different Chinese characters whose initials, finals and tones are all identical, the homophones can be shown one after another in the candidate box, numbered consecutively, in either of the two forms above as required. Each entry starts with the Chinese word in Chinese characters, followed by the contrast items in each language whose meaning matches that word; items are entered one by one, or as a contrast input, by pressing the special symbol, such as a numeral key, in front of them. For example, the Chinese phonetics codes "yxvlune" correspond to two homophones written in different Chinese characters, meaning "oil tanker" and "cruise ship". After "yxvlune" is entered, where items are to be entered one by one, the candidate box shows: 1 "oil tanker" in Chinese characters 2 the foreign-language word for "oil tanker" 3 the minority-language word for "oil tanker" 4 the domain name or web address for yxvlune 5 "cruise ship" in Chinese characters 6 the foreign-language word for "cruise ship" 7 the minority-language word for "cruise ship". Keying in the numeral in front of an item enters that item. For a single contrast input, the candidate box can instead show: 1 "oil tanker" in Chinese characters, the foreign-language word for "oil tanker", the minority-language word for "oil tanker", the domain name or web address for yxvlune 2 "cruise ship" in Chinese characters, the foreign-language word for "cruise ship", the minority-language word for "cruise ship". Keying in the numeral in front enters all the corresponding items of that meaning at once. By analogy, when there are more corresponding foreign languages and minority languages, they are all listed in the corresponding positions, numbered consecutively in the same way; when there are more homophones, the same method is continued until all homophones and their foreign-language and minority-language items have been shown in the candidate box. Because the domain name or web address corresponding to yxvlune is the same for all the homophones, it appears only once in the candidate box.
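The two homophone displays can be sketched as follows. This is a minimal sketch under assumptions: the table `HOMOPHONES` uses descriptive stand-ins for the actual words, the function names are hypothetical, and the web address is composed with the example prefix and suffix. The address is attached to the first homophone only, so that it appears once.

```python
# Hypothetical table: each homophone entry lists the Chinese-character word
# first, then the items in other languages that match its meaning.
HOMOPHONES = {
    "yxvlune": [
        ["oil tanker (Chinese characters)",
         "oil tanker (foreign language)",
         "oil tanker (minority language)"],
        ["cruise ship (Chinese characters)",
         "cruise ship (foreign language)",
         "cruise ship (minority language)"],
    ],
}


def web_address(code, prefix="http://www.", suffix=".com"):
    return prefix + code + suffix


def _rows(code):
    """One row per homophone; the shared web address goes in the first row."""
    rows = []
    for index, entry in enumerate(HOMOPHONES[code]):
        row = list(entry)
        if index == 0:
            row.append(web_address(code))
        rows.append(row)
    return rows


def itemized_candidates(code):
    """Every item gets its own numeral (item-by-item input)."""
    flat = [item for row in _rows(code) for item in row]
    return [f"{n} {item}" for n, item in enumerate(flat, start=1)]


def contrast_candidates(code):
    """Each homophone and its matching items share one numeral."""
    return [f"{n} " + ", ".join(row) for n, row in enumerate(_rows(code), start=1)]


print("\n".join(itemized_candidates("yxvlune")))
print("\n".join(contrast_candidates("yxvlune")))
```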
Because the invention treats a Chinese character used on its own as a single syllable, the method of coding a Chinese character in Chinese phonetics codes is the same as the method of coding a syllable of a Chinese word. The Chinese phonetics codes of a word are obtained by writing the codes of its individual syllables together. A group of several words is called a phrase. Because phrases and Chinese sentences are made up of words, the Chinese phonetics codes of phrases and sentences can be obtained from the codes of their words, and no separate coding scheme needs to be devised for phrases and sentences. Consequently, the same method used for inputting Chinese phonetics codes also provides contrast candidates, and item-by-item or contrast input, of the various information for phrases and for sentences.
(1) Contrast candidates with item-by-item input. For example, when "wovmnohuiuxrvyduhcuyyvlaadqawnv." is entered in the input box, the candidate box shows: 1 the sentence in Chinese characters 2 Wǒmen huì shǐyòng hànyǔ lādīngwěn. 3 We can use Chinese Latin. 4 the corresponding sentence in other foreign language one 5 the corresponding sentence in other foreign language two 6 the corresponding sentence in minority language one 7 the corresponding sentence in minority language two 8 http://www.wovmnohuiuxrvyduhcuyyvlaadqawnv.com. Keying in 1, 2, 3, 4, 5, 6, 7 or 8 selects and enters, respectively, the Chinese characters, the pinyin, the English sentence, the sentence in other foreign language one, the sentence in other foreign language two, the sentence in minority language one, the sentence in minority language two or the web address corresponding to the entered codes. It should be emphasised here that when Chinese phonetics codes are entered as a phrase or sentence, the words of the phrase or sentence in the corresponding domain name or web address are joined into one character string with no spaces, and this string, used as the keyword with a domain name or web address prefix and suffix added, forms the domain name or web address corresponding to the entered codes. This is done because, under the rules for Internet domain names and web addresses, the keyword string may not contain spaces; should spaces be permitted in future, the domain name or web address could also be formed with spaces. It is possible because the tone letters of the Chinese phonetics codes used by the invention also act as syllable separators: however many syllables are written together, one syllable is never confused with the next and each can still be read out accurately, so the meaning of a string of words written together can be understood from the pronunciation of its syllables. The same applies to phrases in Chinese phonetics codes. Arranged in this format, any phrase, sentence or web address candidate can be conveniently selected and entered.
(2) Contrast candidates with contrast input. For example, after "wovmnohuiuxrvyduhcuyyvlaadqawnv." is entered in the input box, the candidate box can instead show, under a single numeral: 1 the sentence in Chinese characters; Wǒmen huì shǐyòng hànyǔ lādīngwěn.; Wovmnohuiuxrvyduhcuyyvlaadqawnv.; We can use Chinese Latin.; the corresponding sentences in other foreign language one, other foreign language two, minority language one and minority language two; http://www.wovmnohuiuxrvyduhcuyyvlaadqawnv.com. Keying in 1 selects all of these and enters them together as a contrast input. The contrast input may be displayed horizontally, with the items following one another on a line in the order just listed, or, according to the settings, all or some of the items may be displayed vertically, one above another. For example, a vertical contrast display of selected items:
Here the contrast of Chinese characters, Chinese phonetics codes, the pinyin of the traditional "Scheme for the Chinese Phonetic Alphabet" and English is taken as the example. In the same way, the English sentence can be replaced by the corresponding sentence in another foreign language or a minority language, so that sentences in other languages are entered and displayed one above another in contrast with the Chinese characters, the traditional pinyin and the Chinese phonetics codes. This lets foreigners and members of China's minority nationalities understand the meaning of Chinese and learn Chinese more easily. It also lets people anywhere in the world, whether or not they know Chinese characters or a Chinese character input method, enter Chinese character information quickly and easily, and lets Chinese speakers quickly and easily enter foreign-language and minority-language information and web addresses that carry a Chinese meaning.
In the example above the word order of the English sentence is the same as that of the Chinese sentence, so the English words line up in the same order as the Chinese words. When the English word order differs from the Chinese, the contrast order of the English words differs from that of the Chinese words. In that case one may either ignore the word-order characteristics of each language and contrast purely word by word, or take them into account and contrast whole phrase with phrase or whole sentence with sentence. The same applies to other foreign languages and minority languages.
Because correspondences have been established among Chinese phonetics codes, the pinyin of the traditional "Scheme for the Chinese Phonetic Alphabet", the phonetic alphabets in one-to-one relationship with that Scheme, foreign-language words of various languages, minority languages, and the wired or wireless domain names or web addresses formed with the entered Chinese phonetics codes as keyword, entering any one of these corresponding items in the input box finds the others in reverse. On the one hand this lets items of the same meaning be looked up from one another; on the other hand it lets a script or set of characters we are familiar with serve as the input code, so that by selecting the corresponding item whose meaning matches, other scripts and the corresponding domain name or web address of the same meaning can be entered. For example, when the corresponding item "we" is entered in the input box, the candidate box shows all the corresponding items: 1. "we" in Chinese characters 2. wǒmen 3. wovmno 4. we 5. other foreign language one 6. other foreign language two 7. minority language one 8. minority language two 9. http://www.wovmno.com. Keying in 1, 2, 3, 4, 5, 6, 7, 8 or 9 selects and enters, respectively, the Chinese characters, the pinyin, the Chinese phonetics codes, the English, other foreign language one, other foreign language two, minority language one, minority language two or the corresponding web address. Alternatively, according to the setting, the candidate box shows all of these items under the single numeral 1, and keying in the numeral "1" enters them all at once, in horizontal or vertical contrast, into the target computer or hand-held embedded mobile device. The same applies to every other corresponding item of the various information, for example other foreign languages, minority languages, and domain names or web addresses, which can be entered and queried in reverse in the same way.
By analogy, the method above allows reversible input and contrast input display, in units of words, phrases or sentences of any number of syllables, among Chinese characters, the pinyin of the traditional "Scheme for the Chinese Phonetic Alphabet", Chinese phonetics codes, other phonetic alphabets in one-to-one relationship with the traditional Scheme, foreign languages, minority languages, and the domain names or web addresses corresponding to the Chinese phonetics codes, so that the various information of the same meaning can be conveniently entered and processed in either direction.
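The reversible lookup described above amounts to indexing every corresponding item of an entry so that any one of them retrieves the whole entry. The sketch below is illustrative only: `ENTRIES`, `build_index` and `lookup` are hypothetical names, and the Chinese-character value is a descriptive stand-in.

```python
# Hypothetical entry table; every field of an entry has the same meaning.
ENTRIES = [
    {
        "Chinese characters": "we (Chinese characters)",
        "pinyin": "wǒmen",
        "Chinese phonetics codes": "wovmno",
        "English": "we",
        "web address": "http://www.wovmno.com",
    },
]


def build_index(entries):
    """Map every item of every entry back to the entries that contain it."""
    index = {}
    for entry in entries:
        for value in entry.values():
            index.setdefault(value, []).append(entry)
    return index


INDEX = build_index(ENTRIES)


def lookup(text):
    """Return all entries containing `text` in any form (empty list if none)."""
    return INDEX.get(text, [])


print(lookup("we")[0]["Chinese phonetics codes"])
print(lookup("wovmno")[0]["web address"])
```

Because an item may belong to several entries (homophones, for instance), the index maps each item to a list of entries rather than to a single one.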
(9) Steps and method for Chinese programming with Chinese phonetics codes
For Chinese programming, the keywords of the programming language, its statements, and the keywords that make up those statements are first translated, according to their meaning or function in Chinese, into keywords expressed in Chinese characters, pinyin, or Chinese phonetics codes in spelling and Mixed Pinyin form, and a one-to-one keyword comparison table is built and stored in advance in the computer;
Every software program for a computer or mobile embedded computer system is a text file. When such software is programmed in Chinese, Chinese-character keywords, pinyin keywords or Chinese phonetics codes keywords (spelling and Mixed Pinyin) are used in one-to-one correspondence with the keywords of the programming language, its statements and the keywords making up those statements. Apart from this replacement of keywords, the symbols of the original programming language and all its programming conventions and rules remain unchanged;
Computer system be pure western code also namely ASCII character system time, except the Chinese character of the keyword of keyword and statement and composition statement thereof or the Chinese phonetic alphabet need to convert to except Chinese phonetics codes keyword, other also needs to convert Chinese phonetics codes to the Chinese information that Chinese character or the Chinese phonetic alphabet represent;
Before compilation, the computer takes the text of the source program and, using the keyword comparison table stored in advance, batch-converts the Chinese character, Chinese phonetic alphabet, or Chinese phonetics codes spelling and Mixed Pinyin keywords into the corresponding English keywords that the original compiling system can compile. The converted program is then compiled or interpreted in the same way as a program originally written with English keywords. A high-level language program is first compiled or interpreted into an assembly program, which is assembled into machine code and handed to the computer for execution; a Chinese assembly language program, once converted to an English-keyword assembly language program, is assembled directly into machine code and handed to the computer for execution.
When the source code needs to be read, the computer can use the one-to-one correspondence between the keywords of the programming language and the Chinese character, Chinese phonetic alphabet, or Chinese phonetics codes keywords to display the keywords, according to a preset option, in English, in Chinese characters, in the Chinese phonetic alphabet, or in Chinese phonetics codes spelling and Mixed Pinyin.
The content and character representation of the non-keyword parts of the program may be left unchanged, or they may be converted by the Chinese character, Chinese phonetic alphabet and Chinese phonetics codes bidirectional conversion module and by the Chinese phonetics codes Chinese-foreign language two-way translation module, and output as program source text in the information category preset by the system. The information categories are Chinese characters, the Chinese phonetic alphabet, Chinese phonetics codes spelling and Mixed Pinyin, and foreign languages.
Once the mnemonic keywords of English assembly language are in one-to-one correspondence with Chinese phonetics codes spelling and Mixed Pinyin keywords, those Chinese phonetics codes keywords can also be put in one-to-one correspondence with the machine code that the English mnemonics stand for. A Chinese high-level program can then be compiled directly into a Chinese assembly language program, which is assembled into machine code and handed to the computer for execution.
Going further, the hardware circuits can be redesigned so that the instruction set of the computer is better suited to Chinese programming instructions. A full range of computers that better fit the characteristics and habits of Chinese can then be designed, giving an unbroken line of Chinese programming languages from Chinese high-level languages, through Chinese low-level languages, down to machine language and machine code adapted to Chinese.
Take programming in C as an example, with the following C source code:
In the source code, // is the comment marker: everything that follows // on a line is a comment, whose purpose is to annotate the program and improve its readability. When the source code is compiled, the compiler ignores the comments.
If, according to the meaning or function of each keyword, we express the keywords in the above C source code in Chinese characters, in the Chinese phonetic alphabet, and in Chinese phonetics codes, we obtain for example: include (comprises, bāohán, bkahse). Here the English keyword stands outside the parentheses, and the corresponding Chinese character, Chinese phonetic alphabet, and Chinese phonetics codes keywords stand inside the parentheses, in that order. Similarly: stdio.h (input and output header file, shūrùshūchūtóuwénjiàn, xuaruuxuaquatxvwnejisu), void (null value, kōngzhí, kdajre), main (principal function, zhùhǎnshù, juvhsvxuu), printf (screen display, píngxiǎn, pqexisv). The above C source code can then be written with Chinese character keywords, which gives Chinese programming in Chinese characters:
The above C source code can also be written with Chinese phonetic alphabet keywords, which gives Chinese programming in the Chinese phonetic alphabet:
The above C source code can also be written with Chinese phonetics codes keywords, which gives Chinese programming in Chinese phonetics codes:
Replacing keywords according to the keyword comparison table above converts the programs written with Chinese character, Chinese phonetic alphabet, and Chinese phonetics codes keywords into a source program written with English keywords, for example:
This source program is exactly standard C source code. It can be handed to an existing C compiler to be compiled into an assembly language program, which is then assembled into machine code and run by the computer.
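The keyword replacement step described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation of the method: the function name `to_english_keywords`, the regular expression, and the decision to leave string literals untouched are assumptions made here, while the five table entries are the keyword pairs given in the example above.

```python
import re

# Keyword comparison table: Chinese phonetics codes keyword -> English keyword.
# The five pairs are taken from the C example in the text.
KEYWORD_TABLE = {
    "bkahse": "include",
    "xuaruuxuaquatxvwnejisu": "stdio.h",
    "kdajre": "void",
    "juvhsvxuu": "main",
    "pqexisv": "printf",
}

# A token is either a double-quoted string literal or an identifier.
# String literals are matched first so that their content is never rewritten
# (an assumption of this sketch).
_TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|[A-Za-z_]\w*')


def to_english_keywords(source, table=KEYWORD_TABLE):
    """Replace every identifier found in the table; keep everything else."""
    def swap(match):
        token = match.group(0)
        if token.startswith('"'):
            return token  # string literal: leave as written
        return table.get(token, token)
    return _TOKEN.sub(swap, source)


phonetic_source = '''#bkahse <xuaruuxuaquatxvwnejisu>
kdajre juvhsvxuu()
{
    pqexisv("wovmnohuiuxrvyduhcuyyvlaadqawnv.");
}
'''

english_source = to_english_keywords(phonetic_source)
print(english_source)
```

The output can be passed unchanged to an ordinary C compiler, which is the point of the approach: the replacement runs as a separate pass in front of the existing toolchain.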
It should be noted that Chinese phonetics codes and English are both written with the 26 Latin letters, so from the characters alone the two are easily confused. Before converting Chinese phonetics codes keywords into the corresponding English programming keywords, the computer must therefore first decide which of the two it is reading. If the keywords and statements made up of the 26 letters are English, the program is compiled directly in the existing conventional way; if they are judged to be Chinese phonetics codes, they are first converted to English keywords and then compiled in the conventional way. Chinese phonetics codes follow definite spelling rules. In the abbreviated form, for example, the last letter of a word is usually one of the letters "a", "e", "v", "u", "o", the tone code of the final syllable; counting from right to left from that letter, and not including it, another of the letters "a", "e", "v", "u", "o" generally appears after every 1 to 3 letters. For example, in bka-hse, xua-ruu-xua-qua-txv-wne-jisu, kda-jre, juv-hsv-xuu, and pqe-xisv (hyphens are inserted here only to show the syllables), the last letter of each syllable is its tone code, and all of these keywords show the feature described. A keyword that shows this feature is identified as a Chinese phonetics codes keyword, and in that case it must first be converted to an English keyword before it is compiled or interpreted.
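The spelling feature just described, a tone letter from "a", "e", "v", "u", "o" after every 1 to 3 other letters, can be expressed as a regular expression. The following Python sketch is an assumption about how such a check might be written; the name `looks_like_phonetic_code` and the exact pattern are not taken from the method itself.

```python
import re

# One syllable in abbreviated form: 1 to 3 letters followed by a tone letter.
# A word is a run of such syllables with nothing left over.
_PHONETIC_SHAPE = re.compile(r"(?:[a-z]{1,3}[aevuo])+")


def looks_like_phonetic_code(word):
    """Return True if the word can be split into syllables of that shape."""
    return _PHONETIC_SHAPE.fullmatch(word) is not None


for word in ["bkahse", "kdajre", "juvhsvxuu", "pqexisv",
             "include", "void", "main", "printf"]:
    print(word, looks_like_phonetic_code(word))
```

The five example keywords pass the check and `include`, `void`, `main`, and `printf` do not. Short English words such as `else` or `case` also fit the pattern, so the sketch is best read as a first filter; a lookup in the keyword comparison table is one assumed way to settle such cases.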
When the system is a pure Western-code (pure ASCII) system, Chinese characters and the Chinese phonetic alphabet cannot be displayed properly. Therefore, besides converting the Chinese character or Chinese phonetic alphabet keywords to Chinese phonetics codes keywords, the Chinese characters or Chinese phonetic alphabet of the non-keyword text must also be converted to Chinese phonetics codes. After this conversion the computer converts the keywords to English keywords, and the program is finally compiled into machine code and run by the computer.
Specifically, the following method can be used. For example, the programs written above with Chinese character keywords and Chinese phonetic alphabet keywords are all converted to source code with Chinese phonetics codes keywords, and the Chinese characters displayed by the program, "we can use Chinese character and latin literary composition.", are also converted to Chinese phonetics codes by the Chinese character, Chinese phonetic alphabet and Chinese phonetics codes bidirectional conversion module. The source code after conversion is as follows:
Before the program is run, the Chinese phonetics codes keywords in the above C source program are replaced with English keywords, which gives the following C source program written with English keywords:
The computer screen, which originally displayed: we can use Chinese character and latin literary composition. now displays: wovmnohuiuxrvyduhcuyyvlaadqawnv.
When the source code needs to be read, the computer can use the one-to-one correspondence between the keywords of the programming language and the Chinese character, Chinese phonetic alphabet, or Chinese phonetics codes keywords to display the keywords, according to a preset option, in English, in Chinese characters, in the Chinese phonetic alphabet, or in Chinese phonetics codes. The content and character representation of the non-keyword parts may be left unchanged, or they may be converted by the Chinese character, Chinese phonetic alphabet and Chinese phonetics codes bidirectional conversion module and by the Chinese phonetics codes and foreign language two-way translation module, and output as program source text in the information category preset by the system. The information categories are Chinese characters, the Chinese phonetic alphabet, Chinese phonetics codes, and foreign languages. Specifically, the following method can be used, for example:
Take the following C source code written with English keywords:
#include <stdio.h>
void main()
{
    printf("we can use Chinese character and latin literary composition.");
}
The Chinese character information in it, "we can use Chinese character and latin literary composition.", can be converted to Chinese phonetics codes by calling the Chinese character, Chinese phonetic alphabet and Chinese phonetics codes bidirectional conversion module as described above. The resulting Chinese phonetics codes information is: wovmnomwvtisaxrvydulaadqawnv. Finally, the two-way translation module between Chinese phonetics codes and foreign languages (chiefly English) is called to convert the Chinese information expressed in Chinese phonetics codes into English information: weuselatineveryday.
The above C source code, written with English keywords but containing Chinese characters, now becomes all-English source code:
#include <stdio.h>
void main()
{
    printf("weuselatineveryday.");
}
The computer screen, which originally displayed: we can use Chinese character and latin literary composition. now displays: weuselatineveryday.
Likewise, source code information expressed in English can be converted by the Chinese phonetics codes and foreign language two-way translation module into Chinese source code information expressed in Chinese phonetics codes, which in turn can be converted by the Chinese character, Chinese phonetic alphabet and Chinese phonetics codes bidirectional conversion module into Chinese source code information expressed in Chinese characters or the Chinese phonetic alphabet, for example:
The above C source code expressed entirely in Chinese phonetics codes:
#bkahse <xuaruuxuaquatxvwnejisu>
kdajre juvhsvxuu()
{
    pqexisv("wovmnohuiuxrvyduhcuyyvlaadqawnv.");
}
The computer screen, which originally displayed: we can use Chinese character and latin literary composition. now displays: wovmnohuiuxrvyduhcuyyvlaadqawnv.
The above C source code expressed entirely in Chinese characters:
#comprises <input and output header file>
null value principal function()
{
    screen display("we can use Chinese character and latin literary composition.");
}
The computer screen, which originally displayed: we can use Chinese character and latin literary composition. still displays the same text: we can use Chinese character and latin literary composition.
The above C source code expressed entirely in the Chinese phonetic alphabet:
#bāohán <shūrùshūchūtóuwénjiàn>
kōngzhí zhùhǎnshù()
{
    píngxiǎn("wǒmenhuìshǐyònghànyǔlādīngwěn。");
}
The computer screen, which originally displayed: we can use Chinese character and latin literary composition. now displays: wǒmenhuìshǐyònghànyǔlādīngwěn.
In the same way, the comments can also be expressed in Chinese characters, the Chinese phonetic alphabet, Chinese phonetics codes, or a foreign language (chiefly English); the details are not repeated here.
Computer programming has by now developed into visual programming. The greatest difference from earlier generations of programming is that earlier programs were procedure-driven, whereas visual programs are event-driven: as soon as the user of the software loads or clicks an item, the program associated with that event is triggered. What the two have in common is that the programs triggered in this way are still written in one programming language or another. Therefore, for the programming languages used to write event-driven programs, the same keyword replacement method described above for high-level and assembly languages can be used to turn them into languages for Chinese programming with Chinese character, Chinese phonetic alphabet, or Chinese phonetics codes keywords.
The method of programming in Chinese has been illustrated above with C. In fact, by the same method, every programming language that uses foreign-language (chiefly English) keywords, including assembly language, C++, and Java, can be turned into a Chinese programming language whose keywords are Chinese characters, the Chinese phonetic alphabet, or Chinese phonetics codes. By the method described above, source code information expressed in English can likewise be converted into Chinese source code information expressed in Chinese phonetics codes, and further into Chinese source code information expressed in Chinese characters or the Chinese phonetic alphabet, and vice versa.
By converting the keywords of the various programming languages into Chinese in this way, the existing compilers, assemblers, and interpreters can be used for Chinese programming, and the compilers, assemblers, and interpreters of the Chinese programming languages remain 100% compatible with those of the languages before conversion. No new compiler, assembler, or interpreter needs to be developed: a keyword replacement preprocessing system is simply added inside or outside the existing one, which yields a compiling, assembling, or interpreting system that accepts both Chinese and English keywords. The method stands on the shoulders of giants and achieves Chinese programming with half the effort.
(10) Steps and method of information search with Chinese phonetics codes:
Information search is based on existing conventional search engines. Chinese characters, the Chinese phonetic alphabet, Chinese phonetics codes spelling and Mixed Pinyin, or a foreign language can be entered directly into the keyword input box of the search engine as search keywords. Alternatively, the Chinese characters, Chinese phonetic alphabet, Chinese phonetics codes spelling and Mixed Pinyin, foreign language, or Chinese speech entered in the keyword input box can first be converted into a preset information category by the Chinese character, Chinese phonetic alphabet and Chinese phonetics codes bidirectional conversion module, the Chinese phonetics codes speech recognition module, and the Chinese phonetics codes Chinese-foreign language two-way translation module, after which the search is carried out. The information found can be output in the system default category or in a preset category. The information categories are Chinese characters, the Chinese phonetic alphabet, Chinese phonetics codes spelling and Mixed Pinyin, foreign languages, the Chinese voice of a particular person, Chinese dialect speech, minority language speech, Chinese speech, and foreign language speech.
When a web page found by the search engine, whose Chinese information is expressed in Chinese characters or the Chinese phonetic alphabet, needs to be converted into a web page expressed in Chinese phonetics codes spelling and Mixed Pinyin, the computer system first locates the source file of the page, for example a text file with the extension ".html" or ".htm". It then calls the Chinese character, Chinese phonetic alphabet and Chinese phonetics codes bidirectional conversion module stored in advance in the computer system and converts all Chinese characters or Chinese phonetic alphabet text in the file that will be displayed into Chinese phonetics codes spelling or Mixed Pinyin, in place at their original positions in the page. In general, the Chinese characters to be converted are all Chinese characters other than those used as file names and those used as Chinese font names.
When a Chinese character web page is converted into a web page expressed in Chinese phonetics codes spelling and Mixed Pinyin, the English text, English letters, Arabic numerals, Western punctuation marks, and hyphens already in the page need no conversion and are kept as they are.
Chinese characters used as file names in a web page must be converted to Chinese phonetics codes so that the page can be displayed and used on a pure Western-code (pure ASCII) computer system. The file originally named in Chinese characters is copied under the converted name and stored in a suitable place, for example a specified folder on a designated server or on the local machine, so that the computer system can find the file under its Chinese phonetics codes name.
For Chinese characters used as Chinese font names: when the named Chinese font does not exist in the Western-code (ASCII) system, the computer automatically replaces the font name with a reasonably close Western font name that has been preset and stored in the computer, or with the preset default Western font name.
When Chinese phonetics codes in a web page need to be converted to Chinese characters or the Chinese phonetic alphabet, the Chinese character, Chinese phonetic alphabet and Chinese phonetics codes bidirectional conversion module stored in advance in the computer system is called to obtain the corresponding Chinese characters or Chinese phonetic alphabet, which then replace the converted Chinese phonetics codes at their original positions in the page.
When Chinese phonetics codes, punctuation marks, or hyphens in a web page need to be converted to speech, the Chinese phonetics codes speech synthesis module stored in advance in the computer system can be used to output the corresponding speech in Chinese, in the voice of a particular person, in a Chinese dialect, or in a minority language, together with the spoken form of the punctuation marks.
When foreign-language text (chiefly English) in a web page needs to be converted to speech, an existing foreign-language speech synthesis module can be used to read aloud the foreign-language text displayed in the page.
When Chinese information expressed in Chinese phonetics codes in a web page needs to be converted to a foreign language (chiefly English), or foreign-language text in a web page needs to be converted to Chinese information expressed in Chinese phonetics codes spelling and Mixed Pinyin, the Chinese phonetics codes Chinese-foreign language bidirectional conversion module stored in advance in the computer can be called. The conversion is made in place: the Chinese phonetics codes are replaced by the foreign language at their position in the page, or the foreign-language text is replaced by Chinese phonetics codes spelling or Mixed Pinyin at its position in the page.
For any web page found and processed by the above method, all or part of the content, hyperlinks, or file paths of the original page can be replaced as required with specified content and specified hyperlinks or file paths.
When a web page is obtained not through a search engine but in some other way, for example through a web browser, the page obtained, whose information is expressed in Chinese characters, the Chinese phonetic alphabet, Chinese phonetics codes, or a foreign language, can likewise be converted by the modules described above and output as a web page in the information category preset by the system. The information categories are Chinese characters, the Chinese phonetic alphabet, Chinese phonetics codes spelling and Mixed Pinyin, foreign languages, the Chinese voice of a particular person, Chinese dialect speech, minority language speech, Chinese speech, and foreign language speech.
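The rule stated above, that Chinese text shown in a page is converted in place while English text, digits, Western punctuation, file names, and font names are left alone, can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: `TOY_WORD_CODES` is a two-entry stand-in for the bidirectional conversion module (the characters 我们 and 汉语 are assumed spellings of the syllable groups wovmno and hcuyyv that appear in the examples), the tag-splitting regular expression is a simplification rather than a real HTML parser, and the font name in the sample page is invented.

```python
import re

# Toy stand-in for the Chinese character <-> Chinese phonetics codes module.
# Only two words are covered; a real module would cover the whole lexicon.
TOY_WORD_CODES = {"我们": "wovmno", "汉语": "hcuyyv"}

_HAN_RUN = re.compile(r"[\u4e00-\u9fff]+")   # a run of Chinese characters
_TAG = re.compile(r"(<[^>]*>)")              # simplified: one HTML tag


def convert_text(text, codes=TOY_WORD_CODES):
    """Replace known Chinese words; leave every other character as it is."""
    return _HAN_RUN.sub(lambda m: codes.get(m.group(0), m.group(0)), text)


def convert_page(html):
    """Convert displayed text only; tags (and their attributes) are kept."""
    parts = _TAG.split(html)
    return "".join(p if p.startswith("<") else convert_text(p) for p in parts)


page = '<p style="font-family:宋体">我们 use HTML 5, 汉语!</p>'
print(convert_page(page))
```

Because tags are passed through unchanged, a Chinese font name or file name held in an attribute is not converted by this pass; renaming files and substituting fonts would be separate steps.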
(11) Steps and method of unified identification of trademarks, domain names, and the like with Chinese phonetics codes:
Chinese phonetics codes spelling or Mixed Pinyin can serve as the identifier of an organization or an individual, including as the trademark of a product or service and as an organization code. Such trademarks and organization codes can be legally registered and printed on physical objects. They can also be entered as key characters or keywords, for SMS web addresses, mobile commerce streets, and computers or embedded computer systems, into the search engines and browser address bars of mobile phones, computers, or embedded computer systems, in order to find the web page or website associated with the identifier or code. The wired or wireless domain names or web addresses that correspond to the Chinese phonetics codes used as such identifiers, trademarks, or organization codes can be obtained by calling the Chinese phonetics codes domain name conversion and web page login module. In this way the Chinese phonetics codes keywords used for SMS web addresses, for mobile commerce streets, and on computers or embedded computer systems, the domain names whose stem is a Chinese phonetics codes keyword, and the identifier of the organization or individual, the trademark of the product or service, and the organization code are all unified with one another.

Claims (10)

1. A Chinese phonetics codes spelling and Mixed Pinyin Chinese holographic information processing method, the method being used on general-purpose computers or embedded computer systems, wherein the Chinese phonetics codes adopted by the method have two written forms, spelling and Mixed Pinyin, Mixed Pinyin meaning that, among the syllables of a word written together, some are written in Chinese phonetics codes full spelling and others in Chinese phonetics codes abbreviated spelling ("simplicity"), the method being characterized mainly by the following steps:
Step A:
(1) In abbreviated spelling, the initial, final, and tone of each syllable of the Chinese phonetics codes are encoded as follows:
Note: the symbols in parentheses are Chinese phonetic alphabet symbols; the letters outside the parentheses are the codes adopted for the initial, final, and tone of each Chinese syllable. The following table of initial, final, and tone codes is referred to as the code table.
(1). The initials of the phonetic codes used to represent Chinese information are each represented by one Latin letter, using the following consonant letters as initial codes:
b: (b)   p: (p)   m: (m)   f: (f)   d: (d)   t: (t)
n: (n)   l: (l)   g: (g)   k: (k)   h: (h)
j: (zh), (j)   q: (ch), (q)   x: (sh), (x)   r: (r)
z: (z)   c: (c)   s: (s)   y: (y)   w: (w)
(2). The medials of the phonetic codes used to represent Chinese information are represented by Latin letters from the 26-letter alphabet, with y representing the (ü) of the simple finals and medials of the original Chinese phonetic alphabet; all other simple finals and medials use the same symbols as in the Chinese phonetic alphabet. The following medial codes are used:
i: (i)   u: (u)   y: (ü)
(3). Apart from the compound finals that contain a medial, each remaining compound final of the phonetic codes used to represent Chinese information is represented in abbreviated spelling by a single Latin letter, which may be a consonant letter. In abbreviated spelling the following rhyme codes are used:
a: (a)   o: (o)   e: (e)   i: (i)   u: (u)   y: (ü)
z: (ao)   t: (ai)   c: (an)   s: (ou)   w: (ei)   n: (en)
k: (ua)   l: (uo)   g: (ang)   d: (ong)   b: (eng)   q: (ing)
p: (ng)
er: (er)
r: (i) [used only after (zh), (ch), (sh)]
(4). The tone codes of the phonetic codes used to represent Chinese information are represented by five Latin letters, namely the following four letters together with the letter v, which the Chinese phonetic alphabet does not use:
a: (-) first tone (high level)   e: (/) second tone (rising)   v: (∨) third tone   u: (\) fourth tone (falling)   o: (unmarked) neutral tone
(2) The holographic representation of Chinese information using the above codes is as follows:
The word is the unit, a single Chinese character being regarded as one syllable. Each syllable of a word is encoded from its spelling in the "Scheme for the Chinese Phonetic Alphabet". In Chinese phonetics codes full spelling, ü is represented by a single Latin letter, namely y; the initial, medial, and final are otherwise written exactly as in the Scheme; and, unlike the Scheme, the tone is represented by one Latin letter, the tone code, which doubles as the syllable separator. Each full-spelling syllable is thus encoded in the order "initial as in the Chinese phonetic alphabet + medial as in the Chinese phonetic alphabet + final as in the Chinese phonetic alphabet + tone code doubling as syllable separator"; in abbreviated spelling each syllable is encoded in the order "initial code + medial code + rhyme code + tone code doubling as syllable separator". In both full and abbreviated spelling, the syllables of one word are written together without spaces, and words are separated from one another by spaces. The syllables of a word may all be in full spelling or all in abbreviated spelling, or any syllable of the word may be written in either form as required, so that within one word some syllables are abbreviated and others are spelled in full. Chinese phonetics codes spelling and Mixed Pinyin are together referred to as Chinese phonetics codes, or phonetic codes;
When Chinese information is in the spelling or Mixed Pinyin phonetic code state, punctuation is used in the same way as in English;
Step B:
Chinese characters, the Chinese phonetic alphabet, Chinese phonetics codes spelling, Chinese phonetics codes abbreviated spelling, and Mixed Pinyin can be converted into one another as required by the Chinese character, Chinese phonetic alphabet and Chinese phonetics codes bidirectional conversion module;
Chinese phonetics codes spelling and Chinese phonetics codes Mixed Pinyin can both be processed by the corresponding modules or methods for speech recognition, speech synthesis, intelligent Chinese word segmentation, machine translation, information search, representation and display in various computer file formats and web pages, combination with the prefixes and suffixes of valid network domain names to form domain names for logging in to the corresponding websites, programming with Chinese characters and words, and unified identification of trademarks and domain names;
Chinese holographic information composed of Chinese phonetics codes spelling or Chinese phonetics codes Mixed Pinyin can be processed with all software and hardware resources that handle Western languages, including handwriting input with a pen that recognizes Western script, Western-language OCR scanning input, standard English keyboard input, and speech recognition input of letters and words;
Chinese phonetics codes spelling or Mixed Pinyin can be printed, stored, displayed, communicated, and transmitted either alone or side by side with Chinese characters, the Chinese phonetic alphabet, foreign languages, or minority languages.
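As an illustration of the abbreviated-spelling order "initial code + medial code + rhyme code + tone code" set out in claim 1, the following Python sketch encodes syllables that have already been split into initial, medial, final, and tone. It is a sketch under stated assumptions rather than the claimed implementation: the function names, the tuple input format, and the numbering of tones (1 to 4, with 0 for the neutral tone) are chosen here, and the special rhyme code for (i) after zh, ch, sh is written in lower case to match examples such as jre and xrv in the description.

```python
# Code tables transcribed from the code table in claim 1.
INITIAL_CODES = {
    "": "", "b": "b", "p": "p", "m": "m", "f": "f", "d": "d", "t": "t",
    "n": "n", "l": "l", "g": "g", "k": "k", "h": "h",
    "zh": "j", "j": "j", "ch": "q", "q": "q", "sh": "x", "x": "x",
    "r": "r", "z": "z", "c": "c", "s": "s", "y": "y", "w": "w",
}
MEDIAL_CODES = {"": "", "i": "i", "u": "u", "ü": "y"}
RHYME_CODES = {
    "a": "a", "o": "o", "e": "e", "i": "i", "u": "u", "ü": "y",
    "ao": "z", "ai": "t", "an": "c", "ou": "s", "ei": "w", "en": "n",
    "ua": "k", "uo": "l", "ang": "g", "ong": "d", "eng": "b", "ing": "q",
    "ng": "p", "er": "er",
}
TONE_CODES = {1: "a", 2: "e", 3: "v", 4: "u", 0: "o"}  # 0 = neutral tone


def encode_syllable(initial, medial, final, tone):
    """Encode one pre-split syllable in abbreviated spelling."""
    if final == "i" and initial in ("zh", "ch", "sh"):
        rhyme = "r"  # special code for (i) after zh, ch, sh
    else:
        rhyme = RHYME_CODES[final]
    return (INITIAL_CODES[initial] + MEDIAL_CODES[medial]
            + rhyme + TONE_CODES[tone])


def encode_word(syllables):
    """Syllables of one word are written together without spaces."""
    return "".join(encode_syllable(*s) for s in syllables)


# kong1 zhi2, the Chinese phonetic alphabet keyword given for "void"
print(encode_word([("k", "", "ong", 1), ("zh", "", "i", 2)]))
```

Splitting a Chinese phonetic alphabet syllable into its parts is deliberately left outside the sketch, since that step depends on spelling conventions that the code table alone does not settle.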
2. The Chinese phonetics codes spelling and Mixed Pinyin Chinese holographic information processing method as claimed in claim 1, further characterized in that: intelligent Chinese word segmentation uses a method for segmenting, on a computer or embedded mobile device, Chinese character text and Chinese phonetic alphabet text that corresponds one-to-one with the "Scheme for the Chinese Phonetic Alphabet", the method being based mainly on a novel analysis of Chinese grammar whose morphology, syntax, and word formation are substantially consistent with English grammar; the main features of this novel Chinese grammar are that, in morphology, the parts of speech of Chinese are divided into noun, pronoun, numeral-classifier, adverb, adjective, verb, preposition, conjunction, modal particle, and onomatopoeia; in syntax, the sentence elements of Chinese are divided into subject, predicate, object, predicative, appositive, attribute, adverbial, and complement; complex sentences are divided into coordinate complex sentences and subordinate complex sentences, the latter being further divided into subject clauses, object clauses, predicative clauses, appositive clauses, attributive clauses, and adverbial clauses; the tenses borrowed for Chinese are divided into past tense, present tense, future tense, and past future tense; the aspects borrowed for Chinese are divided into simple, progressive, perfect, and perfect progressive; a borrowed passive voice and a subjunctive mood of the predicate verb are established; and, in word formation, Chinese words are formed mainly by adding a prefix, an infix, a suffix, or both a prefix and a suffix to a root, and by compounding roots;
The following are listed by category in the first-level dictionary: proper nouns of more than one Chinese character or syllable, pronouns, numeral-classifier words, some adverbs, prepositions, conjunctions, modal particles, and onomatopoeia; the feature words that mark coordinate complex sentences and each type of subordinate clause; the feature words for the tenses and aspects of verbs, the passive voice, and the subjunctive mood; and the prefixes and suffixes used in word formation; the main four-character words and set phrases of Chinese, monosyllabic words, adjectives, verbs, and the other nouns and adverbs not included in the first-level dictionary are listed by category in the second-level dictionary; the prefixes, infixes, suffixes, and roots used in Chinese word formation are listed by category in the third-level dictionary;
During segmentation, the breakpoints of the sentence or character string are always used: the Chinese characters or syllables to be segmented are matched and cut starting from both the left and the right side of each breakpoint; every successfully matched word is separated by added spaces and marked in the background as matched, and after all words have been cut the marks are removed and the original font format is restored;
The positions at which breakpoints are formed include: the beginning of a sentence, the end of a sentence, punctuation marks, Arabic numerals expressing quantities and sequence numbers, special characters, spaces already present between Chinese characters or syllables, and the breakpoints produced by segmentation with the preceding dictionary level;
In the first step of segmentation, the Chinese characters or syllables of the whole text to be segmented are scanned against the words and the prefixes and suffixes in the first-level dictionary; characters or syllables matched by the scan are treated as a word and cut; when a prefix and a suffix are both matched, they serve as boundaries, and everything from the prefix to the suffix, including all characters between them, is cut as one word; when there is more than one matching result, the result that produces the fewest isolated Chinese characters or syllables prevails;
After segmentation with the first-level dictionary, four, two, three, and one still-unmatched Chinese characters or syllables are taken, in that order, from the left and right sides of each breakpoint and matched against the words in the second-level dictionary; if the characters or syllables taken are matched, and forward and reverse matching of the same text from the two sides of the breakpoint give the same result, that result is accepted as a successful match; if the two results differ, the one that produces the fewest isolated Chinese characters or syllables is accepted as the successful match;
After segmentation with the second-level dictionary, the Chinese characters or syllables that remain unmatched are first compared against the third-level dictionary to determine whether each is a prefix, a suffix, an infix, or a root; a prefix absorbs the one isolated Chinese character or syllable after it to form a word, which is then cut; if the prefix is followed by two already-matched characters or syllables, it is combined with them and the three are cut as one word; a suffix absorbs the one isolated Chinese character or syllable before it to form a word, which is then cut; if the suffix is preceded by two already-matched characters or syllables, it is combined with them and the three are cut as one word; an infix absorbs one character or syllable on each side to form a word; if this absorption leaves a single isolated, unmatched character or syllable before or after the word, that character or syllable is also absorbed into the word formed around the infix; in general a word formed in this way has no more than four characters or syllables; a root is cut by the prefix, suffix, or infix procedure respectively, according to whether characters or syllables can be added before it, after it, or on both sides of it; when a word obtained by the above procedures occurs cumulatively at least twice in different sentences of the same document, the system automatically stores it in the second-level dictionary;
If, after segmentation with the above three dictionaries, the sentence still contains strings of unmatched Chinese characters or syllables, or strings of three or more consecutive isolated characters or syllables that were matched individually, each such string is combined into a word and cut; when a word obtained in this way occurs cumulatively at least twice in different sentences of the same document, the system stores it in the first-level dictionary, either automatically or after manual confirmation, according to its settings;
The final segmentation result and the checking rules can also be revised manually; after manual confirmation, new words produced by manual intervention are stored by category in the first-level or second-level dictionary according to their features; words in each dictionary level can also be added or deleted manually, and the words in each dictionary are arranged by category with high-frequency words first; when a certain threshold is reached, the system, after manual confirmation, promotes a word from the second-level dictionary to the first-level dictionary or demotes a word from the first-level dictionary to the second-level dictionary; this intelligent segmentation step is called the word segmentation module.
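The second-level matching rule of claim 2 (take four, two, three, then one characters from each side of a breakpoint, compare forward and reverse results, and prefer the result with the fewest isolated characters) can be illustrated with a minimal Python sketch. The function names, the toy lexicons, and the tie-breaking choice (forward result wins a tie) are assumptions made for illustration; they are not taken from the patent.

```python
# Illustrative sketch only: names, lexicons and tie-breaking are assumptions.
LENGTHS = (4, 2, 3, 1)  # candidate lengths tried in the order stated in claim 2

def forward_match(text, lexicon):
    """Cut text left to right; an unmatched single character is kept as isolated."""
    words, i = [], 0
    while i < len(text):
        for n in LENGTHS:
            piece = text[i:i + n]
            if len(piece) == n and (piece in lexicon or n == 1):
                words.append(piece)
                i += n
                break
    return words

def backward_match(text, lexicon):
    """Cut text right to left, using the same length order."""
    words, j = [], len(text)
    while j > 0:
        for n in LENGTHS:
            if n <= j and (text[j - n:j] in lexicon or n == 1):
                words.append(text[j - n:j])
                j -= n
                break
    return words[::-1]

def isolated(words):
    """Number of single-character pieces left over by a segmentation."""
    return sum(1 for w in words if len(w) == 1)

def segment(text, lexicon):
    """Accept identical results; otherwise keep the one with fewer isolated pieces."""
    fwd, bwd = forward_match(text, lexicon), backward_match(text, lexicon)
    if fwd == bwd:
        return fwd
    return bwd if isolated(bwd) < isolated(fwd) else fwd  # tie: forward (assumption)

print(segment("研究生命起源", {"研究", "研究生", "生命", "起源"}))
# ['研究', '生命', '起源']
```

The sketch covers only the stretch of text between two breakpoints and a single dictionary level; the affix handling of the third-level dictionary and the promotion of new words are left out.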
3. The Chinese phonetics codes spelling and Mixed Pinyin Chinese holographic information processing method as claimed in claim 1, further characterized in that: when Chinese characters or the Chinese phonetic alphabet are converted into Chinese phonetics codes, Chinese characters are first converted into the Chinese phonetic alphabet, and when a character with the same form but more than one pronunciation is met, all of its possible readings in the Chinese phonetic alphabet are listed; text already in the Chinese phonetic alphabet needs no such conversion; the text is then converted into the corresponding string of Chinese syllable phonetic codes according to the code table, and the word segmentation module stored in advance in the computer system is called to segment the string into words;
When Chinese phonetics codes that have already been segmented into words are converted into Chinese characters or the Chinese phonetic alphabet, no further word segmentation is needed, and conversion proceeds in units of the original words;
When Chinese phonetics codes need to be converted into the Chinese phonetic alphabet, either the code table stored in advance in the computer system is looked up, or a comparison table generated from that code table, which pairs Chinese phonetics codes with the Chinese phonetic alphabet in units of syllables or words, is looked up, and the corresponding Chinese phonetic alphabet text is output after matching;
When Chinese phonetics codes need to be converted into Chinese characters, either they are first converted into the Chinese phonetic alphabet in units of words and then into Chinese characters in units of words, or the word-unit comparison table of phonetic codes and Chinese characters stored in advance in the computer system is looked up directly, and the corresponding Chinese characters are output after matching;
When homonyms are met, they are first disambiguated using the lexical and syntactic context of the Chinese text together with statistical regularities, and the Chinese characters are then selected in units of words;
When Chinese phonetics codes in the spelling form need to be converted into Chinese phonetics codes in the simple form, the code table stored in advance in the computer is looked up to change the initial, medial, and final of the spelling form into the initial code, medial code, and final code of the simple form; the tone code remains unchanged;
Conversely, when Chinese phonetics codes in the simple form need to be converted into Chinese phonetics codes in the spelling form, the code table stored in advance in the computer is looked up to change the initial code, medial code, and final code of the simple form into the initial, medial, and final of the spelling form; the tone code remains unchanged;
When only some of the syllables are converted between the spelling form and the simple form, the result is a conversion to Mixed Pinyin;
When Chinese phonetics codes are converted into Chinese characters and the Chinese phonetic alphabet, the punctuation marks are likewise switched from the state identical to English punctuation to the corresponding Chinese punctuation state; the method of this step is called the Chinese character and pinyin to Chinese phonetics codes bidirectional conversion module.
4. Chinese phonetics codes spelling as claimed in claim 1 and Mixed Pinyin Chinese holographic information processing method, it is further characterized in that: when Chinese phonetics codes converts Chinese speech to, adopt the Chinese syllable looked in Chinese phonetics codes and the Chinese syllable phonetic synthesis file table of comparisons respectively, Chinese phonetics codes in units of word and the Chinese language words phonetic synthesis file table of comparisons, or pass through maximum matching method, the Chinese phonetics codes string that employing is looked in units of maximum paragraph and the Chinese paragraph phonetic synthesis file table of comparisons export corresponding Chinese speech, when by above-mentioned Chinese phonetics codes or Chinese phonetics codes string distinguish corresponding syllable, the phonetic synthesis file of word or paragraph changes Chinese particular person respectively into, Chinese dialects, during the phonetic synthesis file of minority language, by looking into Chinese phonetics codes or Chinese phonetics codes string and corresponding syllables, the phonetic synthesis file table of comparisons of word or paragraph, corresponding Chinese particular person can be exported respectively, Chinese dialects, the voice of minority language, when synthesizing foreign language voice, carry out looking into word, phrase or phrase are the Chinese phonetics codes of unit and corresponding foreign language word, foreign language phrase or the foreign language phrases phonetic synthesis file table of comparisons export corresponding foreign language word, the voice of foreign language phrase or foreign language phrases, to needing the initial consonant inputting each syllable of Chinese, referral letter, simple or compound vowel of a Chinese syllable and tone information just can carry out the system of Chinese syllable synthesis, Chinese phonetics codes can be converted to Chinese Pin Yin pseudonym according to code 
table, referral letter, after the information of simple or compound vowel of a Chinese syllable and tone, be input to again in speech synthesis systems for Chinese and carry out Chinese syllable synthesis, when carrying out phonetic synthesis to the punctuation mark in Chinese phonetics codes article and the number of dividing a word with a hyphen at the end of a line, as long as the audio files of six kinds of periods, seven kinds of labels and the number of dividing a word with a hyphen at the end of a line of storing Chinese in a computer in advance accordingly extracts by we, carry out playing just can with sound playout software,
When this phonetic synthesis file is the phonetic synthesis file of Chinese, then this punctuation mark or the bright sound read out of the number of dividing a word with a hyphen at the end of a line are the sound of the corresponding punctuation mark of Chinese or the number of dividing a word with a hyphen at the end of a line, when this phonetic synthesis file is Chinese particular person respectively, Chinese dialects, during the phonetic synthesis file of minority language, then this punctuation mark or the bright sound read out of the number of dividing a word with a hyphen at the end of a line just are Chinese particular person respectively, Chinese dialects, the corresponding punctuation mark of minority language or the sound of the number of dividing a word with a hyphen at the end of a line, when input be the Chinese information of expressing with Chinese character or the Chinese phonetic alphabet time, Chinese character or the Chinese phonetic alphabet can pass through to store Chinese characters phonetic in computer systems, which and Chinese voice code bidirectional modular converter in advance, first convert spelling to or Mixed Pinyin Chinese phonetics codes carries out above-mentioned Chinese again, Chinese particular person, Chinese dialects, minority language, foreign language word, the speech conversion of foreign language phrase or foreign language phrases,
When Chinese speech converts Chinese phonetics codes to, Chinese speech recognition system can successively respectively by Chinese paragraph, Chinese language words, Chinese syllable is as the primitive identified, by searching the Chinese paragraph sound template and the Chinese paragraph phonetic code table of comparisons that store in advance in a computer, Chinese language words sound template and the Chinese language words phonetic code table of comparisons, Chinese syllable sound template and the Chinese speech syllabified code table of comparisons, corresponding Chinese paragraph phonetic code is identified after coupling, Chinese language words phonetic code, Chinese syllable phonetic code, just continuous print Chinese paragraph phonetic code string is obtained successively respectively when voice input continuously, Chinese language words phonetic code string, Chinese syllable phonetic code string, the above-mentioned Chinese syllable phonetic code that obtains was ganged up the word-dividing mode stored in advance in computer systems, which and carried out by word segmentation, then the segmentation of words need not be carried out again to dividing the Chinese language words phonetic code string of word and Chinese paragraph phonetic code string, write the two or more syllables of a word together between the syllable of same word and syllable are taked to the word be syncopated as, between word and word, the mode in space represents, when Chinese phonetics codes needs to convert Chinese character or the Chinese phonetic alphabet further to, corresponding Chinese character or the Chinese phonetic alphabet is exported to the conversion of Chinese voice code bidirectional modular converter by the Chinese characters phonetic stored in advance in computer systems, which, for the dialect that Chinese speech is Chinese with certain dialectal accent or a certain China, as long as the syllable of the dialect of this China or word or paragraph have certain 
corresponding relation with Chinese syllable or word or paragraph respectively, we are by similar above method namely: by the sound template of the Chinese syllable or word or paragraph of searching the Chinese with certain dialectal accent stored in advance in a computer and Chinese syllable or word or the paragraph phonetic code table of comparisons, and there is the dialect syllable of certain corresponding relation or the sound template of word or paragraph and Chinese speech syllabified code or word or the paragraph table of comparisons, corresponding Chinese syllable or word or paragraph phonetic code string is identified after coupling, just can realize this Chinese with certain dialectal accent or the Chinese phonetics codes identification of dialect, realize this Chinese with certain dialectal accent or the conversion of dialect and Chinese phonetics codes, this step method is called Chinese phonetics codes phonetic synthesis and identification module.
5. Chinese phonetics codes spelling as claimed in claim 1 and Mixed Pinyin Chinese holographic information processing method, it is further characterized in that: the method for the Chinese adopted and the bidirectional machine translation of foreign language, setting up on the source language morphology syntax basis substantially consistent with target language, by Chinese and the two-way syntactic transfer of foreign language, realize the bidirectional machine translation of Chinese and foreign language, here mechanical translation machine used refers to computing machine or the embedded computer system of Global Access, hereinafter referred to as computing machine or computer system, here morphology is exactly about the definition of part of speech and division and research word, change of morphology and using method thereof, syntax is exactly be about the definition of sentence element and division and research sentence kind, sentence structure and internal form thereof, sentence pattern is exactly sentence each word inner, phrase, phrase, the part of speech of subordinate clause or quite part of speech and in sentence take on putting in order and form of composition, first the artificial part of speech string of method establishment same language sentence and the corresponding relation of sentence pattern is used before translation, and then setting up in the Chinese morphology system substantially consistent with the foreign language needing to translate and syntax system-based, set up the sentence pattern contrast relationship between the required bilingual translated and store in a computer, the sentence of machine first scan source language during translation, the part of speech string of the sentence of source language is obtained by the dictionary looking into the source language mark part of speech stored in advance in a computer, by looking into the mapping table of source language part of speech string and the source language sentence pattern stored 
in advance in a computer, the part of speech string of the sentence of source language is converted to corresponding source language sentence pattern, again by looking into the source language sentence pattern and the target language sentence pattern table of comparisons that store in advance in a computer, source language sentence pattern is converted to the target language sentence pattern of coupling, finally by the method looking into source language and the target language bilingual dictionary stored in advance in a computer, word in source language or phrase are translated into word or the phrase of target language, and according to target the order of language sentence pattern arranges output in units of word, just the target language sentences required for us is obtained,
For a complex sentence in the source language, grammatical analysis is performed first to extract all of its subordinate clauses, layer by layer, until the last clause extracted is a simple sentence; each extracted clause is then machine-translated in the manner described above for simple sentences; for the remainder of the complex sentence, the syntactic transfer of the complex sentence is completed by looking up the source-language and target-language sentence pattern comparison table stored in advance in the computer, and the components other than the subordinate clauses are translated; finally, each translated subordinate clause is placed at its corresponding position in the converted complex-sentence pattern; this cycle is repeated until the whole required target-language sentence is obtained;
When source language is the Chinese of expressing with Chinese character or the Chinese phonetic alphabet or Chinese speech, by storing Chinese characters phonetic in a computer and Chinese voice code bidirectional modular converter in advance, Chinese phonetics codes phonetic synthesis and identification module first convert Chinese character or the Chinese phonetic alphabet or Chinese speech to Chinese phonetics codes and translate, when foreign language turns over Chinese, translate the target language by Chinese speech representation that obtains or be directly used in expression Chinese information, or convert Chinese character or the Chinese phonetic alphabet or Chinese speech or Chinese particular person or Chinese dialects and minority language voice output to by storing Chinese characters phonetic in a computer and Chinese voice code bidirectional modular converter and Chinese phonetics codes phonetic synthesis and identification module in advance,
Content in the source language that does not lend itself to grammatical analysis, including classical Chinese, poems, idioms, allusions, slang, and abbreviations, is not subjected to part-of-speech lookup or syntactic transfer; instead, before part-of-speech lookup and syntactic transfer take place, it is matched against a one-to-one example library stored in advance in the machine, and the matched translation is output directly;
When the part of speech string that each key element corresponding with Chinese used by this interpretation method comprises the Chinese mark dictionary of part of speech, Chinese is manually set up and the Chinese sentence patterns table of comparisons and Chinese and the target foreign syntactic transfer table of comparisons change into each key element above-mentioned that another foreign language translation phase is applied to also store in a computer in advance time, above-mentioned interpretation method can also be extended to the machine translation method that a kind of foreign language translation becomes another foreign language, the method for this step is called Chinese phonetics codes Chinese foreign language two-way translation module.
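The table-driven translation of a simple sentence described in claim 5 (part-of-speech string, then source sentence pattern, then target sentence pattern, then word-by-word lookup in a bilingual dictionary) can be sketched as follows. All four tables, the role labels, and the example sentence are hypothetical placeholders; the sketch also assumes that each role occurs only once per sentence, which the patent does not state.

```python
# Illustrative sketch only: every table below is a hypothetical toy example.
POS_DICT = {"我": "pron", "在": "prep", "学校": "noun", "学习": "verb"}

# Part-of-speech string -> source-language sentence pattern (one role per word).
POS_TO_PATTERN = {
    ("pron", "prep", "noun", "verb"): ("S", "PREP", "PREP_OBJ", "V"),
}

# Source sentence pattern -> target sentence pattern (same roles, target order).
PATTERN_MAP = {
    ("S", "PREP", "PREP_OBJ", "V"): ("S", "V", "PREP", "PREP_OBJ"),
}

BILINGUAL = {"我": "I", "在": "at", "学校": "school", "学习": "study"}

def translate(words):
    """Translate a segmented simple sentence by successive table lookups."""
    pos_string = tuple(POS_DICT[w] for w in words)      # step 1: tag each word
    source_pattern = POS_TO_PATTERN[pos_string]         # step 2: source pattern
    target_pattern = PATTERN_MAP[source_pattern]        # step 3: target pattern
    word_by_role = dict(zip(source_pattern, words))     # assumes unique roles
    # step 4: translate each word and emit it in target-pattern order
    return [BILINGUAL[word_by_role[role]] for role in target_pattern]

print(" ".join(translate(["我", "在", "学校", "学习"])))  # I study at school
```

Clause extraction for complex sentences, the example library for idioms, and inflection of the target words are outside the scope of this sketch.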
6. Chinese phonetics codes spelling as claimed in claim 1 and Mixed Pinyin Chinese holographic information processing method, it is further characterized in that: the network address logging in various website or E-mail address adopted is the various legal network address containing the Chinese phonetics codes of spelling or Mixed Pinyin, website both can adopt sews composition with the front and back of Chinese phonetics codes various legal network address for stem adds of spelling or Mixed Pinyin, E-mail address adopts the Chinese phonetics codes of spelling or Mixed Pinyin to be stem ++ the legal suffix composition of various E-mail address, also can by spelling or the Chinese phonetics codes of Mixed Pinyin and the front and back of various legal network address be sewed and corresponding relation set up in the legal suffix of various E-mail address, all input various network address and E-mail address with the Chinese phonetics codes of spelling or Mixed Pinyin to computing machine or embedded computer system, at Chinese phonetics codes with phrase, when sentential form composition network address or E-mail address, all words in the phrase of the Chinese phonetics codes of map network domain name or E-mail address or sentence not space each other, both can be inputted by standard English keyboard during input, also western language handwriting recognition can be passed through, western language optical identification, the mode of letter word speech recognition and Chinese phonetics codes speech recognition inputs, when sending corresponding with the network address of certain website or the E-mail address voice pre-deposited wherein to computing machine or embedded computer system, computing machine or embedded computer system can be searched and store network address corresponding with these voice in a computer or E-mail address in advance, and the network address corresponding with these voice or E-mail address is shown in browser address bar, and open 
corresponding webpage or website or E-mail address, browser can manually be opened in advance, also can according to automatically opening the setting of computer system in advance after voice signal heard by computing machine or embedded computer system, after website is opened, computing machine or embedded computer system can continue to identify follow-up voice, after identifying search, open corresponding web page or the corresponding web page contents of cursor pointing of this website, and do further subsequent treatment according to setting computer in advance or embedded computer system, the method of this step is referred to as the conversion of Chinese phonetics codes domain name and webpage log-in module.
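Claim 6 composes a network address or e-mail address from a legal prefix, a stem of Chinese phonetics codes written without spaces, and a legal suffix. A minimal sketch of that composition is given below; the function names, the default prefix and suffixes, and the stem words `zhong` and `guo` are placeholders chosen for illustration and are not actual entries of the patent's code table.

```python
# Illustrative sketch only: names, defaults and stems are assumptions.
def make_stem(code_words):
    """Join the code words of a phrase with no spaces, as claim 6 requires."""
    stem = "".join(code_words).lower()
    # The codes use only the 26 Latin letters, so reject anything else.
    if not (stem.isascii() and stem.isalpha()):
        raise ValueError(f"stem must contain only Latin letters: {stem!r}")
    return stem

def make_web_address(code_words, prefix="www.", suffix=".com"):
    """Legal prefix + phonetic-code stem + legal suffix."""
    return f"{prefix}{make_stem(code_words)}{suffix}"

def make_email_address(code_words, suffix="@example.com"):
    """Phonetic-code stem + legal e-mail suffix."""
    return f"{make_stem(code_words)}{suffix}"

print(make_web_address(["zhong", "guo"]))    # www.zhongguo.com
print(make_email_address(["zhong", "guo"]))  # zhongguo@example.com
```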
7. Chinese phonetics codes spelling as claimed in claim 1 and Mixed Pinyin Chinese holographic information processing method, it is further characterized in that: for various information input Chinese phonetics codes or inputted by standard English keyboard, or by western language handwriting recognition, western language optical identification, the mode of letter word speech recognition and Chinese phonetics codes speech recognition inputs, when inputting Chinese phonetics codes in coding input frame, each respective items included by the various information consistent with the meaning represented by this Chinese phonetics codes all or part ofly first can appear in candidate's input frame according to setting, through select confirm after be finally input to required input computing machine or various hand-held embedded movable equipment in, the various information inputted comprises consistent with inputted Chinese phonetics codes meaning with word, phrase, sentence is the Chinese character of unit, the Chinese phonetic alphabet used in tradition " Scheme for the Chinese Phonetic Alphabet ", Chinese phonetics codes spelling or Mixed Pinyin, with tradition " Scheme for the Chinese Phonetic Alphabet ", there is other various Chinese phonetic alphabet of one-to-one relationship, foreign language, Minorities In China word and the various wired or wireless domain names corresponding with inputted Chinese phonetics codes spelling or Mixed Pinyin or network address, the Chinese phonetics codes spelling inputted according to default or the corresponding various wired or wireless domain names of Mixed Pinyin or network address can be changed mutually automatically according to the code table of correspondence, when inputting Chinese phonetics codes in coding input frame, as run into initial consonant, simple or compound vowel of a Chinese syllable, the Chinese homophone word that tone is all identical, then in candidate's input frame, comprise arabic numeral order with 
specific character respectively and all list these homophone words and the foreign language consistent with its meaning thereof, minority language, through select confirm after be finally input to required input computing machine or various hand-held embedded movable equipment in,
The various wired or wireless domain names corresponding with inputted Chinese phonetics codes or network address, or directly obtained by Chinese phonetics codes+various legal wired and wireless network domain name or network address suffix composition of the prefix of various legal wired or wireless domain names or network address+input, or by call Chinese phonetics codes domain name conversion and webpage log-in module change obtain, at Chinese phonetics codes with phrase, during sentential form input, all words in the phrase of the Chinese phonetics codes in map network domain name or network address or sentence not space each other, when there is the homophone word of Chinese character, the various wired or wireless domain names corresponding with inputted Chinese phonetics codes or network address constant,
After input Chinese phonetics codes in coding input frame, each respective items included by the various information shown in candidate's input frame or add specific character comprise arabic numeral before each respective items, the input display of individual event respective items is carried out by keying in this specific character, or the respective items included by all various informations comprising the Chinese phonetics codes of input is only comprised arabic numeral to identify with a specific character, by disposable contrast input display while keying in each respective items that this specific character carries out included by all various informations;
By setting up the corresponding relation between each respective items of various information, when inputting any one in various information respective items in input frame, other each respective items included by various information will show in candidate's input frame with the form preset, in the computing machine being finally input to required input with set display format after selecting to confirm or various hand-held embedded movable equipment.
8. The Chinese phonetics codes spelling and Mixed Pinyin Chinese holographic information processing method as claimed in claim 1, further characterized in that: for programming in Chinese, the keywords of the programming language, its statements, and the keywords that make up those statements are first translated, according to their meaning or function in Chinese, into keywords expressed in Chinese characters, in the Chinese phonetic alphabet, and in spelling and Mixed Pinyin Chinese phonetics codes, and a one-to-one keyword comparison table is established and stored in advance in the computer;
Every software program for a computer or mobile embedded computer system is a text file; when such software is programmed in Chinese, Chinese-character keywords, Chinese phonetic alphabet keywords, or spelling and Mixed Pinyin Chinese phonetics codes keywords are used in one-to-one correspondence with the keywords of the programming language, its statements, and the keywords that make up those statements; apart from this replacement of keywords, the symbols of the original programming language and its programming conventions and rules remain unchanged;
When the computer system is a purely Western-code system, that is, an ASCII-only system, not only must the Chinese-character or Chinese phonetic alphabet keywords be converted into Chinese phonetics codes keywords, but all other Chinese information expressed in Chinese characters or the Chinese phonetic alphabet must also be converted into Chinese phonetics codes;
Before the source program text is compiled, the computer first uses the keyword comparison table stored in advance to batch-convert the Chinese-character, Chinese phonetic alphabet, or spelling and Mixed Pinyin Chinese phonetics codes keywords, one to one, into the English keywords, statements, and statement keywords that the original compiling system can compile; after conversion, the program is compiled or interpreted in the same way as a program originally written with English keywords; a program in a high-level computer language is first compiled or interpreted into an assembly program, which is then assembled into machine code and handed to the computer for execution, whereas a Chinese assembly language program, once converted into an English-keyword assembly language program, is assembled directly into machine code and handed to the computer for execution;
When the source code needs to be read, the computer can, based on the one-to-one correspondence between the keywords, statements, and statement-composing keywords of the programming language in use and the Chinese-character, Chinese pinyin, or Chinese phonetics codes keywords, display those keywords according to a preset setting in English, Chinese characters, Chinese pinyin, or Chinese phonetics codes spelling and Mixed Pinyin;
The content and character representation of the non-keyword parts of the program and its statements can remain unchanged, or they can be converted by the Chinese character/pinyin and Chinese phonetics codes bidirectional conversion module and the Chinese phonetics codes Chinese-foreign language bidirectional translation module and output as program source text in the information category preset by the system; this information category comprises Chinese characters, Chinese pinyin, Chinese phonetics codes spelling and Mixed Pinyin, and foreign languages;
After the mnemonic keywords of English assembly language are placed in one-to-one correspondence with the corresponding Chinese phonetics codes spelling and Mixed Pinyin keywords, the Chinese phonetics codes spelling and Mixed Pinyin keywords can also be placed in one-to-one correspondence with the machine code that corresponds to those mnemonic keywords, so that a Chinese high-level program can be compiled directly into a Chinese assembly language program, which is then assembled into machine code and handed to the computer for execution;
By further modifying the hardware circuits so that the instruction set of the computer's hardware is better suited to Chinese programming instructions, a computer instruction set that better matches the characteristics and habits of Chinese can be designed, thereby realizing a continuous line of computer Chinese programming languages, from Chinese high-level computer language, through Chinese low-level computer language, to a machine language and machine code adapted to Chinese.
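The keyword-table conversion described in this claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the table entries are plain pinyin stand-ins chosen for illustration (the actual Chinese phonetics codes are not reproduced here), and `KEYWORD_TABLE`, `translate_keywords`, and `SOURCE` are hypothetical names. The sketch assumes Python as the target language and uses the standard `tokenize` module so that only identifier-like tokens found in the table are replaced, while strings, comments, operators, and layout are left unchanged.

```python
import io
import tokenize

# Hypothetical keyword table: plain pinyin stand-ins, NOT the patent's codes.
# Each entry maps one source keyword to exactly one English keyword.
KEYWORD_TABLE = {
    "dingyi": "def",
    "ruguo": "if",
    "fouze": "else",
    "fanhui": "return",
}


def translate_keywords(source, table=KEYWORD_TABLE):
    """Batch-convert table keywords to English keywords before compiling.

    Only NAME tokens listed in the table are replaced; string literals,
    comments, operators, and indentation pass through untouched.
    """
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string in table:
            tok = tok._replace(string=table[tok.string])
        tokens.append(tok)
    return tokenize.untokenize(tokens)


# Example source text written with the stand-in keywords.
SOURCE = (
    "dingyi juedui(x):\n"
    "    ruguo x < 0:\n"
    "        fanhui -x\n"
    "    fouze:\n"
    "        fanhui x\n"
)

# After conversion the text is ordinary Python and can be compiled as usual.
namespace = {}
exec(compile(translate_keywords(SOURCE), "<converted>", "exec"), namespace)
```

Displaying the program back in the stand-in keywords, as the claim describes for reading source code, would amount to running the same function with the table inverted, which is possible only because the mapping is one to one.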
9. The Chinese phonetics codes spelling and Mixed Pinyin Chinese holographic information processing method as claimed in claim 1, further characterized in that: information search can be based on existing conventional search engines. Chinese characters, Chinese pinyin, Chinese phonetics codes spelling and Mixed Pinyin, or a foreign language can be entered directly into the keyword input box of the search engine as search keywords; alternatively, the Chinese characters, Chinese pinyin, Chinese phonetics codes spelling and Mixed Pinyin, foreign language, or Chinese speech entered into the keyword input box can first be converted into a preset information category by the above-mentioned Chinese character/pinyin and Chinese phonetics codes bidirectional conversion module, Chinese phonetics codes speech synthesis and recognition module, and Chinese phonetics codes Chinese-foreign language bidirectional translation module, after which the search is carried out. The retrieved information can be output in the system default or a preset information category; this information category comprises Chinese characters, Chinese pinyin, Chinese phonetics codes spelling and Mixed Pinyin, foreign languages, the speech of a specific Chinese speaker, Chinese dialect speech, minority language speech, Chinese speech, or foreign language speech;
When a web page found by the search engine whose Chinese information is expressed in Chinese characters or Chinese pinyin needs to be converted into a web page expressed in Chinese phonetics codes spelling and Mixed Pinyin, the computer system first locates the source file of the web page, which comprises a text file with the extension ".html" or ".hml"; by calling the Chinese character/pinyin and Chinese phonetics codes bidirectional conversion module stored in the computer system in advance, all Chinese characters or Chinese pinyin that would be displayed from the text file are converted, at their original positions in the web page, into Chinese phonetics codes spelling or Mixed Pinyin; in general, the Chinese characters to be converted are all Chinese characters except those used as file names and those used as Chinese font names;
When a Chinese-character web page is converted into a web page expressed in Chinese phonetics codes spelling and Mixed Pinyin, the English text, English letters, Arabic numerals, Western punctuation marks, and hyphens originally in the web page do not need conversion and are retained as they are;
For Chinese characters used as file names in the web page, so that the files can also be displayed and called in a pure Western-code (pure ASCII) computer system, the Chinese-character file names must be converted into Chinese phonetics codes; the files originally named with Chinese characters are copied under the converted names and stored in a suitable location, which comprises a designated folder on a designated server or on the local machine, to ensure that the computer system can find the files renamed with Chinese phonetics codes;
For Chinese characters used as Chinese font names, when the Western-code (ASCII) system does not contain the Chinese font, the computer can automatically replace the Chinese font name with a preset, relatively similar Western font name stored in the computer, or with the default Western font name preset by the computer;
When Chinese phonetics codes in a web page need to be converted into Chinese characters or Chinese pinyin, the corresponding Chinese characters or Chinese pinyin are obtained by calling the Chinese character/pinyin and Chinese phonetics codes bidirectional conversion module stored in the computer system in advance, and these Chinese characters or Chinese pinyin replace the converted Chinese phonetics codes at their original positions in the web page;
When Chinese phonetics codes, punctuation marks, or hyphens in a web page need to be converted into speech, the Chinese phonetics codes speech synthesis and recognition module stored in the computer system in advance can be used to output the corresponding speech in Chinese, in the voice of a specific Chinese speaker, in a Chinese dialect, or in a minority language, together with speech for the punctuation marks;
When foreign-language text in a web page, chiefly English, needs to be converted into speech, an existing foreign-language speech synthesis module, chiefly for English, can be used to read aloud the foreign-language text displayed in the web page;
When Chinese information expressed in Chinese phonetics codes in a web page needs to be converted into a foreign language, chiefly English, or when foreign-language text in a web page, chiefly English, needs to be converted into Chinese information expressed in Chinese phonetics codes spelling and Mixed Pinyin, the Chinese phonetics codes Chinese-foreign language bidirectional conversion module stored in the computer in advance can be called; the Chinese information expressed in Chinese phonetics codes is converted into the foreign language at the position of the converted phonetics codes in the web page, or the foreign-language text is converted into Chinese information expressed in Chinese phonetics codes spelling or Mixed Pinyin at the position of the converted foreign-language text in the web page;
For any web page found by the above method, all or part of the content of the original web page, and the paths of its hyperlinks or files, can be replaced as required with specified content and with the paths of specified hyperlinks or files;
When a web page is obtained not through a search engine but by other means, including various web browsers, the obtained web page, whose information is expressed in Chinese characters, Chinese pinyin, Chinese phonetics codes, or a foreign language, can likewise be converted by the above modules and output as a web page in the information category preset by the system; this information category comprises Chinese characters, Chinese pinyin, Chinese phonetics codes spelling and Mixed Pinyin, foreign languages, the speech of a specific Chinese speaker, Chinese dialect speech, minority language speech, Chinese speech, or foreign language speech.
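The in-place web page conversion described in this claim can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patented module: `CHAR_TABLE` is a hypothetical four-entry lookup table in which plain pinyin stands in for the Chinese phonetics codes, and `convert_text` and `convert_html` are hypothetical names. Runs of Chinese characters in displayed text are converted where they stand, while English letters, Arabic numerals, and Western punctuation are retained; tag contents, where file names and font names usually appear, are deliberately left alone because the claim handles them separately.

```python
import re

# Hypothetical lookup table: plain pinyin stands in for the phonetics codes.
CHAR_TABLE = {"中": "zhong", "文": "wen", "网": "wang", "页": "ye"}

# One or more consecutive characters from the basic CJK ideograph block.
CJK_RUN = re.compile(r"[\u4e00-\u9fff]+")

# Naive tag matcher; the capture group makes re.split keep the tags.
TAG = re.compile(r"(<[^>]*>)")


def convert_text(text, table=CHAR_TABLE):
    """Convert Chinese-character runs in place; leave everything else as is.

    Characters missing from the table are kept unchanged.
    """
    def replace_run(match):
        return " ".join(table.get(ch, ch) for ch in match.group(0))

    return CJK_RUN.sub(replace_run, text)


def convert_html(html, table=CHAR_TABLE):
    """Convert displayed text only; tags and their attributes are untouched.

    After splitting, odd-numbered parts are tags and even-numbered parts
    are the text between them.
    """
    parts = TAG.split(html)
    return "".join(
        part if index % 2 else convert_text(part, table)
        for index, part in enumerate(parts)
    )
```

For example, `convert_html('<p title="中文">中文 page</p>')` returns `'<p title="中文">zhong wen page</p>'`. The regular-expression tag matcher is a simplification: it also converts text inside script and style elements, so a real page would call for a proper HTML parser.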
10. The Chinese phonetics codes spelling and Mixed Pinyin Chinese holographic information processing method as claimed in claim 1, further characterized in that: the Chinese phonetics codes spelling or Mixed Pinyin can serve as the mark of an organization or an individual; such marks comprise marks used as trademarks of products or services and as organization codes; the trademark marks and organization codes can be lawfully registered and printed on various physical objects, and can also be entered, as key characters or keywords of SMS web addresses, of the mobile-phone mobile business street, and of computers or embedded computer systems, into the search engines and browser address bars of mobile phones, computers, or embedded computer systems, in order to find the web page or website associated with the mark or code; the various wired or wireless domain names or web addresses corresponding to the Chinese phonetics codes of such marks, trademarks, and organization codes can be obtained by calling the Chinese phonetics codes domain name conversion and web page login module; in this way the Chinese phonetics codes keywords of mobile phone SMS web addresses, the Chinese phonetics codes keywords of the mobile-phone mobile business street, the Chinese phonetics codes keywords of computers or embedded computer systems, and the domain names formed from the stems of the Chinese phonetics codes keywords are unified with the marks of organizations or individuals, the trademarks of products or services, and the organization codes.
CN201110212394.3A 2011-07-26 2011-07-26 Chinese phonetics codes spelling and Mixed Pinyin Chinese holographic information processing method Expired - Fee Related CN102902660B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110212394.3A CN102902660B (en) 2011-07-26 2011-07-26 Chinese phonetics codes spelling and Mixed Pinyin Chinese holographic information processing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201110212394.3A CN102902660B (en) 2011-07-26 2011-07-26 Chinese phonetics codes spelling and Mixed Pinyin Chinese holographic information processing method

Publications (2)

Publication Number Publication Date
CN102902660A CN102902660A (en) 2013-01-30
CN102902660B true CN102902660B (en) 2016-04-20

Family

ID=47574900

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110212394.3A Expired - Fee Related CN102902660B (en) 2011-07-26 2011-07-26 Chinese phonetics codes spelling and Mixed Pinyin Chinese holographic information processing method

Country Status (1)

Country Link
CN (1) CN102902660B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104020942A (en) * 2013-03-03 2014-09-03 上海能感物联网有限公司 Method for calling computer program to operate by Chinese text
CN104020840B (en) * 2013-03-03 2019-01-11 上海能感物联网有限公司 The method that foreign language text is remotely controlled computer program operation
CN104235891B (en) * 2013-06-14 2019-01-11 上海能感物联网有限公司 A method of smart electronics gas furnace is manipulated with phonetic order
CN104698998A (en) * 2013-12-05 2015-06-10 上海能感物联网有限公司 Robot system under Chinese speech field control
CN103646017B (en) * 2013-12-11 2017-01-04 南京大学 Acronym generating system for naming and working method thereof
CN105139477A (en) * 2014-06-08 2015-12-09 上海能感物联网有限公司 Non-specific person foreign language voice remote control driven car system
CN105260160A (en) * 2015-09-25 2016-01-20 百度在线网络技术(北京)有限公司 Voice information output method and apparatus
CN106126227B (en) * 2016-06-22 2019-03-19 北京普会科技有限公司 A method of it realizes write code across human language on computers
TWI702504B (en) * 2017-09-27 2020-08-21 毅 牛 System for splicing and converting images of chinese character into vocabularies and mobile terminal
CN109102723A (en) * 2018-02-14 2018-12-28 杨靖 A kind of interactive instructional system based on alphabetical Chinese and realize its method
CN110189554A (en) * 2018-09-18 2019-08-30 张滕滕 A kind of generation method of langue leaning system
CN110413972B (en) * 2019-07-23 2022-11-25 杭州城市大数据运营有限公司 Intelligent table name field name complementing method based on NLP technology
CN112000620A (en) * 2020-08-14 2020-11-27 深圳市绿联科技有限公司 File searching method, device and equipment
CN116821271B (en) * 2023-08-30 2023-11-24 安徽商信政通信息技术股份有限公司 Address recognition and normalization method and system based on voice-shape code

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101118539A (en) * 2006-08-01 2008-02-06 苗玉水 Modern Chinese information holographic Latinizing Chinese voice code representation
CN101118540A (en) * 2006-08-02 2008-02-06 苗玉水 Chinese characters phonetic and Chinese voice code bidirectional reversible transform method
CN101118541A (en) * 2006-08-03 2008-02-06 苗玉水 Chinese-voice-code voice recognizing method
CN101123089A (en) * 2006-08-08 2008-02-13 苗玉水 Voice mixing method for Chinese voice code
CN101727195A (en) * 2008-10-22 2010-06-09 苗玉水 Various information input method of Chinese phonetics codes
CN101739393A (en) * 2008-11-20 2010-06-16 苗玉水 Chinese text intelligent participle method
CN101131689B (en) * 2006-08-22 2010-08-18 苗玉水 Bidirectional mechanical translation method for sentence pattern conversion between Chinese language and foreign language

Also Published As

Publication number Publication date
CN102902660A (en) 2013-01-30

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C41 Transfer of patent application or patent right or utility model
TA01 Transfer of patent application right

Effective date of registration: 20160321

Address after: 810008, Qinghai Xining biological science and Technology Industrial Park, No. four, No. 26, hatch No., room 510

Applicant after: QINGHAI HANLA INFORMATION TECHNOLOGY CO., LTD.

Address before: 200093 Shanghai city Yangpu District Kongjiang village 44 room 105

Applicant before: Miao Yushui

C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160420

Termination date: 20200726
