CN1107896C - Chinese character and coding and input method for automatic transition of simplified original complex form Chinese character - Google Patents

Info

Publication number
CN1107896C
Authority
CN
China
Prior art keywords
word
coding
input
sentence
character
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN97106635A
Other languages
Chinese (zh)
Other versions
CN1213101A (en)
Inventor
吴宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to CN97106635A priority Critical patent/CN1107896C/en
Publication of CN1213101A publication Critical patent/CN1213101A/en
Application granted granted Critical
Publication of CN1107896C publication Critical patent/CN1107896C/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

The present invention discloses a coding and input method with automatic conversion between simplified and traditional Chinese characters. Chinese characters are divided into single characters and phrases. For single characters, basic strokes, frequently used character components and radicals serve as character-forming roots; each character is split into roots according to the principles of taking the largest part, giving priority to meaning, intuitiveness, stroke order, linking, and discarding the complex, and is then coded. Natural three-code input follows the order left to right, top to bottom, outside to inside, middle before sides, horizontal before vertical, and left-falling before right-falling. Phrases are entered with four codes according to their division into two-character, three-character, four-character and multi-character phrases. The method is easy to learn and use, leaves ample code space for a very large lexicon, greatly increases input speed, has a duplicate code rate of zero, and allows free conversion between simplified and traditional Chinese characters.

Description

A coding and input method for automatic conversion between simplified and traditional Chinese characters
The present invention relates to the technical field of Chinese character encoding for computers, and specifically to a coded input method that converts automatically between simplified and traditional Chinese characters.
Research on Chinese character input technology began at the end of the 1970s and now has a history of some twenty years. During this time a large number of Chinese character input methods have emerged in China and abroad. To date, more than 600 input methods have been registered in China alone; counting those developed abroad, there are at least 1,000. According to the character attribute each method relies on, they fall roughly into four types (shape codes, sound codes, sound-shape codes and sequence codes) and two broad schemes: fast shape-code schemes aimed at professional typists, and popular schemes such as sound-shape codes aimed at non-professional users. The former suit touch typing but are complex and hard to learn and master; typical examples are the Five-stroke method, Zheng code and Holographic code. The latter are easy to learn but inefficient; typical examples are pinyin, Natural code and Configuration code. The saying "what is easy to learn is slow to type, and what is fast to type is hard to learn" largely reflects the present state of Chinese character input technology in China.
With the arrival of the information age, the volume of information processing keeps growing and the pace of life and work keeps quickening, so the demands placed on Chinese character processing technology inevitably rise. In word processing, shape-code input methods have become the most common and practical. Wang code, Holographic code and others each have their advantages, but their coding still rests on the rigid and cumbersome extraction of traditional basic strokes and on single-character processing, and they are bound to a huge set of brevity codes and a small character and word stock, to say nothing of sentence processing. In addition, their rate of duplicate codes is high and their range of application is narrow. In particular, no coded input method in China has so far been able to realize, on the basis of natural three-code input, the automatic recognition and conversion of simplified and traditional characters, phrases and sentences.
The object of the present invention is to provide a coding and input method with automatic simplified-traditional conversion that, on the basis of natural three-code input, converts simplified and traditional characters and phrases automatically and produces no duplicate codes.
The object of the invention can be achieved as follows. A coding and input method for automatic conversion between simplified and traditional Chinese characters is characterized in that Chinese characters are divided into single characters and phrases. For single characters, basic strokes, common character components and radicals serve as the character-forming base roots, which are divided into non-character roots and character roots. A single character is split into base roots according to the principles of taking the largest part, meaning first, intuitiveness, stroke order, linking, and discarding the complex. Coding follows the order left to right, top to bottom, outside to inside, middle before sides, horizontal before vertical, and left-falling before right-falling, and is entered as three codes. When fewer or more than three codes are obtained, the rules of freedom, conversion, substitution, calculation, reasoning, head-and-tail, jumping and omission are applied to arrive at natural three-code input. Phrases are divided into two-character, three-character, four-character and multi-character phrases and are entered as four codes: a two-character phrase takes the first two codes of each character; a three-character phrase takes the first code of each character plus the second code of the last character; a four-character phrase takes the first code of each character; and a multi-character phrase takes the first codes of the first three characters and of the last character.
The present invention has the following characteristics:
1. Free conversion between the simplified and traditional forms of Chinese characters is realized. Because of the different historical backgrounds of the mainland and of Hong Kong and Taiwan, mainland users cannot be expected to memorize thousands of traditional characters, and people in Hong Kong and Taiwan likewise cannot memorize thousands of simplified characters. This situation, in which mainland users are hindered in using traditional characters and people in Hong Kong and Taiwan are hindered in using simplified characters, is changed, and a bridge is built for cultural exchange between them.
2. Because single characters take basic strokes, common components and radicals as base roots, and because the base roots and the principles of splitting characters are already familiar to most people, the "eight rules" are easy to understand, and the method is easy to learn, remember and use.
3. The method breaks with existing input that merely "draws a dipper with a gourd as a model". Characters with four or five base roots, which make up more than half of all Chinese characters, can be converted effectively into the three-code system, which shortens code length. The conflict between single characters and phrases is properly handled, which leaves ample space for a very large lexicon: the phrase entries can exceed 120,000. Input speed is greatly increased, and the rate of duplicate codes is zero.
4. Sentence coding is introduced. This coding can process simple sentences, complex sentences and sentence groups with varied and complicated grammatical rules, and it brings the rich results accumulated over many years of linguistic research into computer science. Overall, Chinese language processing reaches a higher and more practical level.
The present invention is explained in further detail below.
The present invention divides Chinese characters into single characters and phrases. For single characters, basic strokes, common character components and radicals serve as the character-forming base roots. There are 160 base roots, listed in the base root summary table, and they are divided into non-character roots and character roots. A single character is split into base roots according to six principles: take the largest, meaning first, intuitive, stroke order, linking, and discard the complex. In the examples below, each character is followed by its split and the corresponding keys.
Take the largest: among the possible ways of splitting a character, split out the largest possible base root each time, or split into the fewest base roots, as:
Body: from ---KA    becomes: penta [Figure C9710663500072] ---RJ
Meaning first: provided the writing order is followed, the base roots split out should respect the meaning of the character itself, as:
Tongue: thousand mouthfuls---HB    closes: [Figure C9710663500073] My god---UH
Intuitive: observe the structure of the whole character so that the split can be accepted directly by the eye, as:
I: hand dagger-axe---JR    state: coulee---QE
Stroke order: split in the order in which the character is written, as:
: a Ren soil---IXH    justice: Dian Qe---OX
Linking: where a natural connection exists between one base root of a character and another, it is handled by the "linking" method, as:
Pawl: [Figure C9710663500076] Foretell---NF    most: corpse Rui---PS
Discard the complex: when a compound base root is clearly better than its basic roots, give up the small and take the large, as:
Examine: soil 5---H5    [Figure C9710663500078] seven little---HQ
If an encoding scheme defines its base roots too finely, splitting characters will inevitably require repeated use of tiny, stroke-like base roots such as the dot, horizontal, vertical, left-falling and turning strokes, "twenty" and "thirty". Splitting and coding phrases, which determines input speed, then becomes more cumbersome and monotonous. On the one hand, this increases the eye movement needed and makes it harder for the brain to collect and organize information; on the other hand, because the base roots always hover among these basic strokes, the duplicate code rate stays high, so that the lexicon becomes smaller and smaller and input slower and slower. Extensive practice shows that the number of base roots is, to a certain extent, directly related to the size of the character and word stock and to the duplicate code rate. In practical operation, the present invention boldly breaks the constraint of rigid basic strokes and adopts a variety of flexible splitting methods. Along with the 160 base roots, first-level and second-level brevity codes are also listed; see the brevity code table for details.
The present invention takes the base root as the basic unit of code extraction and the seven basic strokes as the smallest unit; in other words, where no suitable base root exists, codes may be taken stroke by stroke. After a character is decomposed into base roots and strokes, they are arranged in writing order: left to right, top to bottom, outside to inside, middle before sides, horizontal before vertical, left-falling before right-falling, and whole before parts. Each radical and each phrase is centered on the full code; each single character is centered on three codes.
Radical stroke input coding formulas or directions put into verse: the radical stroke can not be exempted from, and the first sum of the end of registering contains.Earlier, where need where to stand in the single-character given name absence 0.The tautonomy stroke turns round ingeniously, title read-write limit conversion.Cross break " J " comes hook curved, sits in the right seat sign indicating number is filled out.
Radical coding rule: among the 160 base roots, some are neither key-name characters nor character roots; they are components that frequently occur in character forms, and are called radicals. Their coding rule is: key-name code + first-stroke code + second-stroke code + last-stroke code. Empty positions are filled with the numeric key "0". The first, second and last stroke codes here are all taken from single strokes, with the following correspondence:
Single stroke type:  dot  horizontal  vertical  left-falling  right-falling  bend  turn
Single stroke code:  V    I           I         A             A              L     J
For example, the codes of the following radical base roots:
Rolling Contraband Rui Fu
JIJI HILO VVIO BJIO
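The radical rule above (key-name code, then first, second and last stroke codes, padded with "0") can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `radical_code` is hypothetical, the stroke codes are supplied by the caller rather than looked up from the glyph, and the trailing "O" in the printed examples is assumed to be the pad digit "0".

```python
def radical_code(key_name: str, stroke_codes: list[str]) -> str:
    """Assemble a four-position radical code (hypothetical helper).

    key_name:     letter of the key on which the radical sits.
    stroke_codes: single-stroke codes of the radical in writing order,
                  supplied by the caller (no glyph lookup is modeled).
    """
    if not stroke_codes:
        raise ValueError("a radical needs at least one stroke")
    if len(stroke_codes) >= 3:
        # Rule: first stroke + second stroke + last stroke.
        picked = [stroke_codes[0], stroke_codes[1], stroke_codes[-1]]
    else:
        # Fewer than three strokes: use what is there.
        picked = list(stroke_codes)
    # Empty positions are filled with the numeric key "0".
    return (key_name + "".join(picked)).ljust(4, "0")


print(radical_code("B", ["J", "I"]))       # two strokes, one pad digit
print(radical_code("J", ["I", "J", "I"]))  # three strokes, no padding
```

The stroke sequences passed in here are assumed inputs chosen for illustration; the patent does not spell out the stroke breakdown of each radical.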
Stroke coding rule: key-name code + first-stroke code + second-stroke code + last-stroke code.
(1) Single-name strokes
In Chinese character forms, a line written continuously in one direction is called a stroke. Although single-name strokes are special, their codes are just as easy to derive with the rule above. For example: Dian one Shu Pie
Figure C9710663500091
VA00 I000 I002 A00U A002 II00
(2) Compound-name strokes
Compound-name strokes are the folding strokes, further divided into "bends" and "turns". Because folding strokes vary widely in shape and are often used in calligraphy and typesetting, they are also included in the coding. Their coding differs from that of single-name strokes: they are coded as "key-name code + the way the stroke name is read".
For example: second is pronounced: cross break crotch coding: KJLO
Pronounce: perpendicular crotch coding: LLLO
亅 pronounces: lifting-hook coding: JILO
Figure C9710663500094
Pronounce: perpendicular coding: the LIIO that carries
Individual character input coding formulas or directions put into verse:
General comprehensive justice is preferential, and one or two trigrams add the space,
According to order of strokes observed in calligraphy code fetch unit.The font code adds inadequately.
The key name Chinese character registers four, and the fixed base root is replaced ingeniously,
Become word base root according to removing outward.Simplified and traditional free user's choosing.
The outer Chinese character of key is also easy, and logic memory shape is directly perceived.
Transforming three compiles sign indicating number.The natural not empty biography of intelligence.
Coding rule for key-name characters: each key position has one special base root, the key-name character, such as bamboo, mouth and clothing, 26 in total. Their coding rule is key name + "4", which stands for striking the key on which the character sits four times in succession.
For example: clothing---CCCC---C4
Bamboo---AAAA---A4
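The relationship between the short form (key name + "4") and the four repeated keystrokes can be illustrated with a small Python sketch. The function names are hypothetical and only the two examples above (C4 and A4) come from the text.

```python
def key_name_short_code(key: str) -> str:
    """Short form of a key-name character: key letter followed by "4"."""
    if len(key) != 1 or not key.isalpha():
        raise ValueError("expected a single letter key")
    return key.upper() + "4"


def expand_key_name_code(short_code: str) -> str:
    """Expand the short form into the key struck four times."""
    if len(short_code) != 2 or short_code[1] != "4" or not short_code[0].isalpha():
        raise ValueError("expected a letter followed by '4'")
    return short_code[0].upper() * 4


print(key_name_short_code("C"))    # short form for the key-name character on C
print(expand_key_name_code("A4"))  # the equivalent repeated keystrokes
```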
Coding rule for character roots: among the 160 base roots, besides the key-name characters, many are themselves Chinese characters. Such roots are coded in the same way as characters outside the keys, namely key name + "splitting method".
For example: literary composition---split Tou Qe coding: AUX
Ear---split Three coding: BNE
From---split everybody coding: AXX
Coding rule for characters outside the keys: a large number of Chinese characters belong to this class, so mastering its coding rule is very important. Before giving the rule, the notion of the base root code is introduced. Each base root split from a character is assigned to a letter key on the keyboard, and the English letter of the key on which the root sits is that root's "base root code". The coding rule is: first root + second root + last root.
(1) For a character outside the keys that has fewer than three base roots and is not within the second-level coding range, take the codes in order and then add a font code or a last-stroke code.
For example: drought---day dried coding: MH2
Silicon---Shi Xi coding: NMM
(2) For a character with more than three base roots, except in special cases, the eight rules must be applied flexibly during code extraction so that the character is converted into a natural three-code. Three-codes account for more than 90% of the codes in this scheme.
For example: calf---ox soil header encoder: ZHF (conversion)
Pancreas---month longbow coding: DZR (from beginning to end)
Film---month European-allies Coding: DIE (calculating)
Jin---Wang Shanxi (Shanxi) coding: EWE (reasoning)
(3) When a character is not covered by the eight rules, its full code is taken. Such characters are few in this coding.
For example: bamboo mat spread on the floor for people to sit in ancient China---
Figure C9710663500104
End Yin coding: AAFF
An ancient plucked stringed instrument--- Ren Vow coding: AXMR
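The three cases above can be sketched as one small Python function. This is a simplified illustration, not the patent's procedure: the name `outside_key_code` and its parameters are hypothetical, the decomposition into base root codes is supplied by the caller, and the eight rules that choose a decomposition are not modeled.

```python
def outside_key_code(root_codes: list[str], supplement: str = "",
                     take_all: bool = False) -> str:
    """Assemble the code of a character outside the keys (hypothetical helper).

    root_codes: base root codes in writing order, chosen by the caller.
    supplement: font code or last-stroke code, used when there are
                fewer than three base roots (case 1).
    take_all:   True for the rare characters coded by their full code (case 3).
    """
    if not root_codes:
        raise ValueError("at least one base root code is required")
    if take_all:
        return "".join(root_codes)
    if len(root_codes) >= 3:
        # General rule: first root + second root + last root.
        return root_codes[0] + root_codes[1] + root_codes[-1]
    # Fewer than three roots: codes in order, then the supplement.
    return "".join(root_codes) + supplement


print(outside_key_code(["M", "H"], supplement="2"))            # case (1)
print(outside_key_code(["Z", "H", "F"]))                       # case (2)
print(outside_key_code(["A", "X", "M", "R"], take_all=True))   # case (3)
```

The three calls reuse the codes MH2, ZHF and AXMR from the examples above; which keys each base root maps to is taken from those examples, not derived.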
When fewer than three codes are obtained, or more than three, the eight rules are applied: freedom, conversion, substitution, calculation, reasoning, head-and-tail, jumping and omission.
1. Freedom rule
Refers to automatically converting a base root that can be split in second-level coding when it appears in third-level coding, while special base roots that stay independent in third-level coding are not split further.
For example: from sitting---everybody is soil coding: XXH
Seat---extensively encode: FAH from soil
Sit: this character belongs in second-level coding, so in principle its code should have two codes. However, it is not coded "AH"; instead the root "from" is split further into the two base roots "everybody", which brings it into third-level coding;
Seat: it likewise contains the root "from", but in third-level coding the root stays independent and is treated as a single base root.
This is a key concept of the coding, and it applies to each of the rules below. For which base roots need automatic conversion, refer to the entries in brackets in the base root summary table.
2. Conversion rule
Refers to converting a part of the character that is similar or close in shape to a common base root into that common base root.
For example: act of violence---encode in mountain chest---fractionation---month Bao mountain: BJW
Figure C9710663500111
---month Na---fractionation---woman month Fu coding: YDB
3. Substitution rule
Refers to using a base root or a whole character to replace a character that is identical or close to it in meaning. For example: rice---the wooden Http rice of Gu Rong---fractionation---is encoded: TOK
Degree---
Figure C9710663500112
Cross---fractionation---Rui watt of thousand coding: SYH
4. Calculation rule
Refers to a substitution method that applies addition to the base roots specially placed on the numeric keys, or to other conventionally accepted base roots.
For example: again---2 push violently, and---fractionation---Rolling 6 (2+2+2) wood is encoded: J6T
Day---24 films---fractionation---month European-allies
Figure C9710663500121
(24+6) coding: DIE
For simplicity, no further conversion is generally made after addition has been applied; for instance, the numeral 6 in "J6T" above is not converted again with "six-u".
5. Reasoning rule
Refers to using known conditions to imagine, or to infer logically, something closely connected with them.
For example: cut off the feet------fractionation---Rui enough changes
Figure C9710663500122
Cut off the feet---Rui Foot coding: SAF
---electric brightness---splits---light Mi car---electric Mi car coding: VLR to light
6. Head-and-tail rule
Refers to the direct compression applied to the whole when base roots intersect or nest within one another.
For example: smooth---the women longbow of longbow one's mother's sister---fractionation---is encoded: YZR
---the fire bow is passed---fractionation---fire bow Chuo coding: QRF to younger brother
7. Jumping rule
Refers to a natural cut made on the basis of an existing precedent or condition.
For example: system---fractionation--- Towel Dao coding: AWN
Pull---fractionation--- Towel hand-code: AWJ
Levy---fractionation---Chi-end coding: AIF
Punish---fractionation---Chi-heart coding: AIQ
8. Omission rule
Refers to the resolute, forced deletion made when a base root and a single stroke are separated or intersect.
For example:
Figure C9710663500126
---Ren repaiies---fractionation---, and Ren Fan San encodes: XAW
Must---heart honey---fractionation---Http heart worm coding: OQG
1. When two base roots in a character could both be converted, normally the first stays unchanged and the second is converted; if the character has a traditional form, both stay unchanged.
For example: large---fractionation---stone factory shellfish is encoded: NNP (not having the traditional font)
Vegetarian---fractionation---literary composition and encoding: AS (complex form of Chinese characters is arranged)
2. conversion rule does not have the complex form of Chinese characters as this word in the secondary coding, usually not directly conversion.
For example: that---fractionation---
Figure C9710663500131
Fu coding: JIB (not having the traditional font)
Pregnant---fractionation---queen coding: YE (complex form of Chinese characters is arranged)
Brevity code rules:
Brevity codes were introduced to reduce code length and increase input speed. Yet the more brevity codes a scheme has, the harder it is to memorize and the longer it takes to learn. What they bring is not speed; strictly speaking, they are a millstone around the neck. Neither an operator nor a learner may ever truly memorize several thousand brevity codes, because they are not a natural coding and depend entirely on rote memory. If the font code is a disguised duplicate code of second-level or third-level coding, then the brevity code is a font code in disguise that must be memorized; the two do not differ in any essential way. To regard brevity codes as an invention is a double mistake. In some codings, including several that are currently popular, brevity codes account for almost two thirds of the code set, and an average code length computed with those brevity codes loses objectivity and scientific value; it is a merely formal figure, not serious science. The same holds for their duplicate code rates. The number of brevity codes is therefore an important criterion for judging a scheme: the more brevity codes, the lower the quality of the scheme. Almost all current codings are built around brevity codes, and this is another significant, fundamental difference between this coding and other schemes. In this coding, brevity codes are mixed in with ordinary codes, as follows:
(1) 36 first-level brevity codes;
(2) 1190 second-level "brevity" codes (of which 900 are natural codes).
See the brevity code table for details. Input method: for a first-level or second-level brevity code or code, key in one or two letters respectively and then add one space, so that two or three keystrokes are needed respectively.
Input rules for traditional characters and digits:
The input of traditional characters is closely tied to the input of simplified characters; free conversion between them is realized through the numeric keys and the letter keys. That is, to input a traditional character, it suffices to convert a certain fixed base root in the character into the base root that is interchangeable with it, or into the corresponding numeric key (see the base root summary table). For characters in second-level coding, or brevity codes formed from second-level coding, the traditional form is obtained by means of the font code.
Because several necessary base roots are placed on the numeric keys for the conversion of a small number of traditional characters, these keys have two uses:
(1) when used as codes, they are keyed in directly;
(2) when used as digits, the key is held down first and the space bar is then struck.
Phrase coding rules:
The present invention provides a phrase input method. The coding of words and phrases is unified with the code extraction of single characters, and no input mark distinguishing character from word is added. Regardless of its length, a phrase takes four codes. Single characters and phrases can be entered mixed together: see a character, type a character (traditional characters included); see a word, type a word (traditional words included), with no switching required. This phrase input method greatly increases input speed. Because the main full-code space amounts to 26^4, plus a further 10^4 in the design, the stated theoretical total exceeds 560,000, and single characters occupy almost none of this four-code space, so the capacity for words and phrases is very large. Its capacity matches or exceeds that of the "Ci hai" dictionary, making it the scheme with the largest capacity at present.
The coding formulas or directions put into verse:
The phrase coding does not have strange the change, before four word Chinese idioms are respectively got,
Be as the criterion with four and do in accordance with regulations.Just many four yards complete.
The double word word each one or two, the first sign indicating number of multiword word three,
Fair and reasonable no complaint.First company of tailing in regular turn.
Preceding two head of three couplet silk flosses do not forget last eight main points,
Two yards strings of the continuous benefit of back word.The definition rule is passed through a full piece of writing.
Two-character phrase coding rule: take the first two codes of the full code of each character, four codes in total.
For example: the sub-Bing coding of study---fractionation---: QPJU
Effort---fractionation---woman is power power coding again: YCYY
Three-character phrase coding rule: the first two characters each take their first code, and the last character takes its first two codes, four codes in total.
---fractionation---Yan wood several codings: OATN for example: computing machine
The electric Woo wood of televisor---fractionation---several codings: VLTN
Four-character phrase coding rule: each character takes the first code of its full code, four codes in total.
---fractionation---Rolling Ren for example: operating system Si coding: JXAC
Middle mouthful of people's corpse coding: ICXP of Chinese people's---fractionation---
Multi-character phrase coding rule: take the first codes of the first, second and third characters and the first code of the last character, four codes in total.
For example: the Chinese Communist Party---middle mouthful is altogether Coding: IVZQ
The Hong Kong Special Administrative Region---standing grain Rui ox Contraband coding: LSZH
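The four phrase rules above can be combined into a single routine. The Python sketch below is an illustration only: the function name `phrase_code` is hypothetical, and it assumes the caller already knows the single-character code of each character in the phrase. The inputs in the usage lines are truncated to the prefixes visible in the examples above, and the fourth code in the last call is a made-up placeholder, since the rule never reads it.

```python
def phrase_code(char_codes: list[str]) -> str:
    """Build the four-code phrase code from per-character codes.

    char_codes: the single-character code of each character in the
                phrase, in order (supplied by the caller).
    """
    n = len(char_codes)
    if n < 2:
        raise ValueError("a phrase has at least two characters")
    if n == 2:
        # First two codes of each character.
        parts = [c[:2] for c in char_codes]
    elif n == 3:
        # First code of the first two characters, first two codes of the last.
        parts = [char_codes[0][:1], char_codes[1][:1], char_codes[2][:2]]
    elif n == 4:
        # First code of each character.
        parts = [c[:1] for c in char_codes]
    else:
        # First code of the first three characters and of the last character.
        parts = [c[:1] for c in char_codes[:3]] + [char_codes[-1][:1]]
    return "".join(parts)


print(phrase_code(["QP", "JU"]))              # two-character phrase
print(phrase_code(["O", "A", "TN"]))          # three-character phrase
print(phrase_code(["J", "X", "A", "C"]))      # four-character phrase
print(phrase_code(["I", "V", "Z", "X", "Q"])) # multi-character phrase
```

If a character's code is shorter than the number of codes the rule asks for, this sketch simply returns a shorter result; how the patent handles that case is not stated in the text.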
Sentence coding rules:
A sentence is a linguistic unit built from words or phrases according to grammatical rules; it carries a certain intonation and expresses a complete meaning. It can tell someone something, ask someone something, ask someone to do something, or express a strong emotion.
By mood, sentences fall into four kinds: declarative, interrogative, imperative and exclamatory. By structure, they fall into simple sentences and complex sentences. A general simple sentence (one-word sentences and subjectless sentences excepted) usually has a subject and a predicate, and some also have an object. Subject, predicate and object form the trunk of the sentence; attributes, adverbials and complements can be added to the trunk. Some of these components are filled by words and some by phrases.
A sentence made up of two or more simple sentences is a complex sentence. The simple sentences that make it up are related in meaning and are combined in a definite structural pattern. Once they become parts of a complex sentence they lose their independence and are called clauses. The relations between the clauses of a complex sentence (that is, the types of complex sentence) include coordination, succession, progression, selection, transition, cause and effect, hypothesis, condition and so on; some grammar books also list purpose, chaining, choice, explanation and so on. These relations are usually marked with correlative words, but some clauses are linked by meaning alone and need no correlative words.
A combination of several simple or complex sentences forms a sentence group: consecutive sentences that cohere and express one clear central meaning. Paired correlative words can seldom be used within a sentence group; more often a single one is used, and only in the following sentence. The relations between the sentences can be classified on the same principle as the clauses of a complex sentence: coordinate, successive, progressive, selective, transitional, causal, conditional, purposive and explanatory sentence groups, and so on.
Although the sentences in a text vary endlessly, however long a sentence is and however complicated its structure, its components always follow rules. The coding of sentences should conform to grammatical rules, but it need not adhere rigidly to grammatical structure. For this coding scheme, the basic rules for taking sentence codes were designed by induction and arrangement:
(1) Code-taking mnemonic verse:
For a simple sentence, take the first codes of subject, predicate and object, appending each in turn.
Add the attribute, adverbial or complement components; for multiple complex sentences and sentence groups,
take the keyword and the complex sentence type, and work through the basic rules step by step.
(2) Code-taking rules:
(1) Special sentence pattern rule: the first code of the keyword + the first codes of the subject, predicate and object, four codes in total.
The keywords cover: 把 (ba) sentences, 被 (bei, passive) sentences, 使 (shi, causative) sentences and 所 (suo) sentences.
For example: You sweep the classroom <clean>. Code: [JXJH
(Such) problems are what we can accept. Code: [NWMJ
(2) General simple sentence rule: take the first codes of the subject, predicate, object and attribute (or adverbial, or complement), four codes in total.
This covers: serial-verb sentences, pivotal sentences, judgment sentences, existential sentences, contracted sentences, subject-predicate sentences and so on.
For example: He comes to my home to watch TV. Code: XCJV
The class teacher hopes that I will write an application to join the Youth League. Code: [EXJM
(3) Complicated simple sentence rule: take the first codes of the subject, predicate, object, attribute, adverbial and complement, six codes in total.
For example: (The great teacher of the revolution) Marx [for the first time] [thoroughly] explained <clearly> the laws (of the development of nature and society).
Code: [ZPHXAS
(4) Complex sentence rule: the first code of the relation type + the first codes of the subject, predicate and object of each clause. The relation types include coordinate, successive, progressive, selective, transitional, causal, hypothetical, conditional, choice and explanatory complex sentences, among others.
For example: Because (in his youth) he [persisted every day in] taking exercise, he [never once in the twenty years that followed] fell <ill>, and he [never, because of] <poor> health, fell behind in his work.
Code: [VXPKXAUZBF
A single code for a multiple complex sentence has at most 12 letters; that is, it is limited to at most two levels of nesting and three clauses. In practice, two-level complex sentences are by far the most common kind of multiple complex sentence. When the subject, predicate or object of a simple sentence is the same in several clauses, usually only the first occurrence is kept.
(5) Sentence group rule: first convert the group into several simple or complex sentences, then process these in batches according to the rules.
For example: 1. I like autumn. Code: [JGLH
2. I like the autumn of this era. Code: [JGLA
3. I wish this fine autumn scenery would stay in the world forever. Code: [JNLX
The coordinate sentence group above can either be processed sentence by sentence under the simple sentence rule, or sentences 2 and 3 can be merged and converted into a causal complex sentence for processing.
Because sentence structure is special and complicated, sentence codes are entered as follows: the "[" key is typed before the code as a marker, and the space bar is pressed at the end. When "[" appears in a code, the system by default enters the sentence-processing function.
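The sentence rules above can be read as a small set of letter-picking functions plus a framing step. The Python sketch below is an illustration only: all function names are hypothetical, each argument is assumed to be the already-known root code of a sentence component, and the grouping of the letters of the example code `[VXPKXAUZBF` into one relation letter and three clauses is an assumption based on its length.

```python
MAX_LETTERS = 12  # upper bound the text gives for a multiple complex sentence


def first_codes(*codes):
    """Join the first code letter of each component code."""
    return "".join(code[0] for code in codes)


def simple_sentence(subject, predicate, obj, modifier):
    # General simple sentence: subject, predicate, object and one
    # modifier (attribute, adverbial or complement), four codes.
    return first_codes(subject, predicate, obj, modifier)


def special_sentence(keyword, subject, predicate, obj):
    # Special sentence pattern: keyword first, then subject,
    # predicate and object, four codes.
    return first_codes(keyword, subject, predicate, obj)


def complicated_sentence(subject, predicate, obj, attribute, adverbial, complement):
    # Complicated simple sentence: six codes.
    return first_codes(subject, predicate, obj, attribute, adverbial, complement)


def complex_sentence(relation, clauses):
    # Complex sentence: relation type first, then the subject,
    # predicate and object of every clause.
    letters = relation[0] + "".join(first_codes(s, p, o) for s, p, o in clauses)
    if len(letters) > MAX_LETTERS:
        raise ValueError("too long: split the sentence and code it in batches")
    return letters


def keystrokes(letters):
    # "[" switches the system into sentence processing; space ends input.
    return "[" + letters + " "


print(repr(keystrokes(simple_sentence("E", "X", "J", "M"))))
print(repr(keystrokes(complex_sentence(
    "V", [("X", "P", "K"), ("X", "A", "U"), ("Z", "B", "F")]))))
```

Keeping the framing in its own function mirrors the text: the letter selection depends on the sentence type, while the "[" marker and the closing space are the same for every sentence code.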
Multiple complex sentences and sentence groups are combined from many sentences, and their structure is far more complicated than that of a simple sentence. They must be analysed sentence by sentence, grasping the content of the preceding and following sentences and their inner connections and tracing the line of thought; only then can their complicated content and structure be grasped accurately.
First work out how many clauses the complex sentence (or sentence group) contains in total, determine the relations in meaning between the clauses, and find the subject, predicate and object of each clause (a sentence-component analysis is possible because each clause is itself a simple sentence). To find the grammatical relations between the clauses of a multiple complex sentence, the paired or single use of correlative words can be used to determine how the clauses combine and how they relate to one another.
In written analysis of the structure of a multiple complex sentence (or sentence group), the conventional method is to divide the sentence into clauses first and then into levels. In fact, levels exist only notionally, and their position is not fixed. Not only does the human brain find them hard to determine at once on paper; a computer finds them harder still to distinguish. If levels were included in what the computer executes, then in coding a complex sentence the code sequence would have to be recombined and reordered level by level, which would end in confusion and reduce the man-machine "dialogue" to empty talk. Under these conditions only the two-level complex sentence has a truly stable structure. Therefore, for simplicity and to suit users of different educational levels, complex sentences of three or more levels and sentence groups are processed in batches in the form "simple sentence + complex sentence", "complex sentence + simple sentence" or "complex sentence + complex sentence". In this way any complex sentence or sentence group can be entered directly into the coding system by the rules above. The summary table of character-forming base roots and the brevity code table are attached below.
Summary table of character-forming base roots
Brevity code table

Claims (4)

1. A computer Chinese input method with automatic conversion between simplified and traditional Chinese characters, characterised in that: Chinese input is divided into five parts (single characters, phrases, symbols, sentences, and radicals and strokes), each entered directly into the computer, and a simplified character can be entered directly as its traditional form through a conversion of its code;
In single-character input, the method splits Chinese characters into 160 base roots, which correspond to 26 code elements, and the roots are entered naturally in the order left to right, top to bottom, outside to inside, middle before sides, horizontal before vertical, and left-falling before right-falling; to enter a traditional character, the fixed base root at the end of the character's code is converted to the corresponding general base root, which completes direct input of the traditional form;
In phrase input, four-code input is used, with phrases divided into two-character, three-character, four-character and multi-character phrases; a two-character phrase takes the first two codes of each character, a three-character phrase takes the first code of each character and the second code of the last character, a four-character phrase takes the first code of each character, and a multi-character phrase takes the first codes of the first three characters and of the last character;
Symbol input covers linguistic, mathematical, astronomical, chemical, physical, medical and traffic symbols and so on; in computer input, four-code input is carried out by "reading method + substitute roots";
Sentence input is characterised in that: sentences are divided into simple sentences and complex sentences; for a simple sentence, either the first codes of the subject, predicate, object and attribute (or adverbial, or complement) are taken, four codes in total, or the first code of the keyword plus the first codes of the subject, predicate and object are taken, four codes in total, or the first codes of the subject, predicate, object, attribute, adverbial and complement are taken, six codes in total; for a complex sentence, either the first code of the relation type plus the first codes of the subject, predicate and object of each clause are taken, or the pinyin letter of each character in the sentence is entered freely in order; the "[" key is typed before the code as a marker, and the space bar ends the input;
In radical and stroke input, the code length is uniformly fixed at four codes, taking the key-name code plus the first-stroke, second-stroke and last-stroke codes; when there are fewer than four codes, the letter O is added as padding;
2. The computer Chinese input method with automatic conversion between simplified and traditional Chinese characters according to claim 1, wherein the base root table is as follows:
A bamboo [image] Chi Pie Fan For-additional (literary composition from)
B mouth (also ear) Fu
C clothing Yi [image] マ again Si ㄍ Si (identical element)
D month Jie [image] (order with Ran by Qu Yin then) Chuan
E king three or five thirty [image] (the rich hair in river)
F foot  Chuo Yin wide two foretells (go up and end )
G worker Cui insect without feet or legs Zhao an ancient type of spoon [image] (drooping debug)
H soil seven Contraband scholars do among (thousand weather wind narrow-necked earthen jars are died)
I twenty Lv  One Shu (Chu is sweet)
J cun Rolling Shou 亅 Network [image] Bao [image] Pig is not ( [image] Being)
K eats Cannibals [image] (mortar is owed from a jin tooth rice cave)
L standing grain Mi Qian in vain [image] Woo (not showing that adopting the fork-like farm tool used in ancient China bundle sees) says
M day [image] Sunset [image] Bad (the ware dawn early)
N several nine In-particular Dao [image] [image]
The Http of the O of factory's (stone) speech Yan ( Gu closes) Fu
P gold Jin shellfish the sixth of the twelve Earthly Branches [image] Corpse ( [image] [image] Silver coin)
Q fire [image] Little [image] Xin ( [image] Industry heart beans)
R penta dagger-axe [image] Shoot a retrievable arrow lance bow cutter (car Jian vows public an ancient weapon made of bamboo bone You Xi)
S water Rui Rain ( Pu is two pages in the cloud)
T wood The hot Tou Bing of ten (but fourth Jilin)
U Zhuang [image] Six Epileptic ( Upright bird)
V koilonychia fish, mouth (interior family square toes are returned in the field)
the protruding towering Jiong San of W mountain Cao towel Ji [image] (electric switch)
X Ren Qe Ren eight [image] [image] ( Many)
Y woman Quan is big for the Z dog for power four (and watt female Zhou Shi is than beautiful) ( Sheep Ma Niusheng centre altogether also)
3. The computer Chinese input method with automatic conversion between simplified and traditional Chinese characters according to claims 1 and 2, characterised in that the 160 base roots are each given a unique position and memorised by association, combining their own "sound", "shape" and "meaning" with the "sound", "shape" and "meaning" of the English letters on the keyboard; and characterised in that single-character input uses eight rules (free, conversion, substitution, calculation, reasoning, head-and-tail, jump and omission), so that all single characters are compressed entirely within the range of the natural three codes for computer input.
4. The computer Chinese input method with conversion between simplified and traditional Chinese characters according to claims 1, 2 and 3, wherein the first-level brevity codes and the input of a single character, phrase and symbol are characterised as follows: this-A, in-B, with-C, all-D, E, can-F, individual-G, -H, in-I, I-J, -K, and-L, be-M, no-N, just-O, the people-P, party-Q, want-R, institute-S, have-T, you-U, state-V, with-W, people-X, for-Y, deposit-Z.
Widow: split into Http Page (page or leaf) cutter; code: OSR
Preesed: split into penta Shu (rice) foot; code: RKF
Material desire: split into ox not paddy (rice) owe; code: ZJKK
Think tank: split into arrow [image] Eloquence (Rolling); code: ROVJ
Follow the beaten track: split into Chi husband (my god)  vows; code: AHFR
‰ (read as "per-thousand sign"): split into thousand, eight, mouth, 5; code: HXB5
CN97106635A 1997-09-30 1997-09-30 Chinese character and coding and input method for automatic transition of simplified original complex form Chinese character Expired - Fee Related CN1107896C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN97106635A CN1107896C (en) 1997-09-30 1997-09-30 Chinese character and coding and input method for automatic transition of simplified original complex form Chinese character

Publications (2)

Publication Number Publication Date
CN1213101A CN1213101A (en) 1999-04-07
CN1107896C true CN1107896C (en) 2003-05-07

Family

ID=5168855

Country Status (1)

Country Link
CN (1) CN1107896C (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102591473A (en) * 2011-01-18 2012-07-18 吴宁 Simplified and traditional Chinese input method with no need of remembering codes
CN105389015B (en) * 2014-09-04 2017-12-22 江山 Solid size input method of Chinese character

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1107595A (en) * 1994-11-01 1995-08-30 彭志春 Chinese character phonic and shape coding
CN1121204A (en) * 1994-10-15 1996-04-24 李保源 Primary and secondary character element code of Chinese character

Also Published As

Publication number Publication date
CN1213101A (en) 1999-04-07

Similar Documents

Publication Publication Date Title
CN1107896C (en) Chinese character and coding and input method for automatic transition of simplified original complex form Chinese character
CN1019424B (en) High-speed chinese character inputting method using synthetic coding of pronunciations, forms and strokes and keyboard used
CN1278209C (en) Composite phonetic alphabet Chinese character coding input method and its keyboard
CN1028680C (en) Holographic code for Chinese characters
CN1110743C (en) Writing-speeching-meaning coding method and keyboard for inputting Chinese characters therefor
CN1182232A (en) Zhiyin code Chinese character coding technology
CN1033540C (en) Simple Chinese input method
CN1164689A (en) Computer input method for Chinese characters' sound pattern meaning based on word and Chinese-Spanish compatible keyboard
CN1137432C (en) Fast-easy code Chinese character input method
CN1166997C (en) Chinese-character fast input method without splitting
CN1129058C (en) Chinese character phonetic code and keyboard design
CN1317630C (en) Stroke Chinese character input method
CN1420424A (en) Chinese charactor input method by Chinese character and redical pronunciation code
CN1271492C (en) 26104 computer Chinese character
CN1111776C (en) Chinese pronunciation-shape code keyboard and its input method
CN1303504C (en) 'Letter' input-method for Chinese characters
CN1848051A (en) Standard Chinese character inputting method
CN1186976A (en) Computer Chinese character eight-four code input method and key board
CN1059508C (en) Structural coding input method using Chinese character computerized pen
CN1374577A (en) General Chinese character input method suitable for letter keyboard and digital keyboard in computer and its keyboard
CN1162766C (en) Chinese-character 'pronunciation-shape code' input method and its keyboard profile
CN1023669C (en) Wang's code Chinese input method
CN1194401A (en) Chinese character coding keyboard and input method
CN1075805A (en) Chinese character stroke-form related coding method and its keyboard
CN1841278A (en) Double-code detachment-free high efficiency Chinese character input technology

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C19 Lapse of patent right due to non-payment of the annual fee
CF01 Termination of patent right due to non-payment of annual fee