CN1161495A - Computer Chinese character key-board input method - Google Patents
- Publication number
- CN1161495A (CN 96119064)
- Authority
- CN
- China
- Prior art keywords
- word
- characters
- radicals
- traditional chinese
- chinese dictionaries
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Document Processing Apparatus (AREA)
Abstract
The invented method adopts a coding scheme that combines radicals, character shape, pinyin spelling and numerals, unifies the input of single characters and multi-character words, and is equipped with a soft lexicon of very large capacity. The soft lexicon is two-dimensional and merges the general, specialized and personal lexicons into an organic whole, so that each user can reduce the dynamic code length to 1.5-1.2 or even lower. Input is accomplished with 1 to 4 codes made up of digits and letters. The rules for splitting and coding characters basically accord with the national education background. There is no need to memorize or distinguish high-frequency, first-level and second-level characters, and misreading the sound or tone of a character, or not knowing how to read it, does no harm. It is a universal and easy-to-learn input method.
Description
The present invention is a Chinese-character keyboard input method and belongs to the field of computer Chinese information processing systems. There are currently quite a number of Chinese-character keyboard input methods, such as those of Chinese patents 90104322, 90105471, 911066976 and 91103533, but so far no single input method satisfies all of the following eight requirements simultaneously:
(1) The processable character set should be large enough that every character a modern user may need (including simplified characters, traditional characters, variant characters, alternate-form characters and the more commonly used non-character symbols) is included. According to document (1), the processable character set should exceed 30,000 characters. To save memory as far as possible, however, the processable character set should be divided into several subsets to suit users with different requirements.
(2) The character-splitting and coding rules should be learnable and universal: they should conform to people's habits of recognizing Chinese characters (that is, the national education background) and to the various standards for Chinese characters, and should be acceptable in mainland China, Taiwan, Hong Kong, Macau and other Chinese-speaking regions of the world. According to document (2), the national education background of mainland China (that is, the content of primary and secondary school language textbooks) is: recognizing about 3000 characters, mastering the use of Hanyu Pinyin to spell Chinese characters and words, being able to look up a dictionary using a Chinese character indexing system, and writing Chinese characters in the correct stroke order.
(3) The splitting and coding rules should apply to the whole processable character set (including every subset) without any exceptions; the user should not have to memorize or distinguish high-frequency, first-level, second-level and obsolete characters; characters whose sound or tone the user misreads, or does not know at all, should still be enterable normally; and there should be a means of entering non-character symbols.
(4) There should be a lexicon that merges the three kinds usually distinguished at present (general, specialized and personal). The lexicon should reduce the dynamic code length effectively and meet the needs of all users, that is, it should be a general lexicon suitable for every user. There should be a very simple and efficient way of adding to, deleting from and optimizing the lexicon, with changes taking effect immediately, so that the user can easily convert the general lexicon supplied with the input method into a dedicated lexicon suited to that user.
(5) The dynamic code length, the duplicate-code rate and the input speed should be acceptable both to non-professionals (including university, secondary and primary school students and teachers, scientific workers, government officials, managers, secretaries, writers, reporters and so on) and to professional typists. Each user should be able to adjust the dynamic code length, duplicate-code rate and input speed to his or her own situation, remove redundant codes, and reduce the workload of input as far as possible.
(6) Because users' computer hardware configurations differ (in mainland China in particular there are still a considerable number of low-end computers of the 286 class and below), Chinese-character systems also vary widely, and the lexicon usually occupies a large amount of memory, the lexicon should be divided into several sub-lexicons so that the user can choose and adjust the amount of memory occupied.
(7) Entering and leaving the input method should be easy. Operation should be simple and convenient, the prompt line should give fairly complete information, and the usage rules should in general be consistent with those of the full-pinyin input method.
(8) On the whole, the method should combine the advantages of radical codes, shape codes, pinyin codes and numeric codes while avoiding their shortcomings, stay as consistent as possible with the national education background, make full use of computer resources, and save the user as much worry, effort and time as possible.
The documents (1)~(9) referred to in the present invention are:
(1) "Pocket Word Sea", edited by Zhao Suosheng and Miu Yonghe, Jiangsu Education Publishing House, January 1994
(2) "Chinese Character Keyboard Input Technology and Theoretical Foundations", by Chen Yifan and Hu Xuanhua, Tsinghua University Press, June 1994
(3) "Xinhua Dictionary" (1992 reset edition), Commercial Press, March 1994
(4) "Student's Four-Function Dictionary", Geng Fayou, Li Yili, Zhang Yiding and Ruan Henghui, International Culture Publishing House, June 1992
(5) "Hong Kong Pupil's Chinese Dictionary" (revised and enlarged edition), Liu Ningfu, Xia Yu and Huang Dongyue, Ming Hua Publishing Company, November 1988
(6) "Newly Compiled Chinese Dictionary", compiled by Li Guoyan, Mo Heng, Dan Yaohai and Wu Chongkang, Hunan Publishing House, August 1988
(7) "Complete Practical Data for Electronics Enthusiasts", edited by Zhao Dahe, Electronic Industry Press, July 1989
(8) "Modern Chinese Dictionary", compiled by the Dictionary Editing Office, Institute of Linguistics, Chinese Academy of Social Sciences, Commercial Press, January 1983
(9) "Modern Chinese Dictionary" (supplement), compiled by the Dictionary Editing Office, Institute of Linguistics, Chinese Academy of Social Sciences, Commercial Press, April 1989
The object of the invention is to overcome the failure of existing Chinese-character keyboard input methods to combine learnability and universality with good performance figures (a short dynamic code length, a low duplicate-code rate and a high input speed), and to provide a general-purpose Chinese character input method that is both easy to learn and strong on those performance figures. This input method is called the Blue Moon input method, abbreviated BM.
The computer Chinese-character keyboard input method of the present invention covers simplified characters, traditional characters, variant characters, alternate-form characters, radicals and the more commonly used non-character symbols. It uses the ten digit keys 0~9, the 26 letter keys A~Z, the Alt key, the Enter key, the Backspace key, the space bar, the semicolon key and the Caps Lock key of a QWERTY keyboard. It adopts a coding scheme that combines pinyin, radicals, character shape and digits, unifies the input of characters and words, and is equipped with a corresponding lexicon.

The method is characterized in that the lexicon takes the form of a soft lexicon. The soft lexicon is a two-dimensional lexicon in which each coordinate point represents a two-character word. Its horizontal index consists of the commonly used characters in the character code table whose code length is greater than 2. Its vertical index is selected from the 7000 generally used characters, sorted by frequency of use and divided into 56 sections of 125 characters each, each section occupying 64 KB. For each vertical-index character, every character of the horizontal index is considered in turn as to whether the two can form a word, in other words whether they have a fair chance of occurring next to each other in written or spoken language; this yields the soft-word section data files.

A character is entered with 1~4 codes typed on the letter and digit keys. Characters are split by radical. Radicals are divided into character radicals and non-character radicals, and character radicals are further divided into standard-character radicals and quasi-character radicals. A standard-character radical is coded with the initial letter of its standard pronunciation; each quasi-character radical is coded with a single digit 0~9 or a single letter; a non-character radical is coded with the single digit of its first stroke, strokes being divided into horizontal, vertical, left-falling, dot and turning. The radicals obtained by splitting a character are designated the first, second and last radical according to the stroke order of their first strokes in the character, and characters are classed as single-radical, two-radical or three-radical characters according to the number of radicals obtained.

Single-radical character code: initial letter of the character's pronunciation + first stroke code + second stroke code + last stroke code; when there are too few strokes, the first stroke is taken first and the last stroke next, and the missing codes are filled with the letter V. Two-radical character code: first radical code + last radical code + first stroke code of the last radical + last stroke code of the last radical; when the last radical is a non-character radical the third code becomes the second stroke code of the last radical, and the remaining provisions are the same as for single-radical characters. Three-radical character code: first radical code + second radical code + last radical code + last stroke code of the last radical.
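One way to picture the two-dimensional lexicon described above is as a bit matrix: one row per vertical-index character, one bit per horizontal-index character, and a set bit meaning that the pair forms a two-character word. The sketch below is only an illustration under that assumption. The text does not state the storage layout, nor whether the vertical index holds the first or the second character of a word (the sketch assumes the first), and the names `SoftLexiconSection`, `add_word`, `remove_word` and `has_word` are hypothetical. Only the figures 56 sections and 125 characters per section come from the text.

```python
SECTIONS = 56            # from the text: the vertical index is split into 56 sections
CHARS_PER_SECTION = 125  # from the text: 125 vertical-index characters per section


class SoftLexiconSection:
    """One section of the lexicon, modelled as a packed bit matrix (an assumption)."""

    def __init__(self, vertical_chars, horizontal_chars):
        if len(vertical_chars) > CHARS_PER_SECTION:
            raise ValueError("a section holds at most 125 vertical-index characters")
        self.row_of = {ch: i for i, ch in enumerate(vertical_chars)}
        self.col_of = {ch: j for j, ch in enumerate(horizontal_chars)}
        # One row of bits per vertical-index character, packed into whole bytes.
        self.row_bytes = (len(horizontal_chars) + 7) // 8
        self.bits = bytearray(len(vertical_chars) * self.row_bytes)

    def _locate(self, first, second):
        """Return (byte offset, bit mask) of the coordinate point (first, second)."""
        i, j = self.row_of[first], self.col_of[second]
        return i * self.row_bytes + j // 8, 1 << (j % 8)

    def add_word(self, first, second):
        """Mark the two-character word first+second as present."""
        offset, mask = self._locate(first, second)
        self.bits[offset] |= mask

    def remove_word(self, first, second):
        """Clear the coordinate point so the pair is no longer offered as a word."""
        offset, mask = self._locate(first, second)
        self.bits[offset] &= ~mask & 0xFF

    def has_word(self, first, second):
        """True if the pair is indexed in this section and marked as a word."""
        if first not in self.row_of or second not in self.col_of:
            return False
        offset, mask = self._locate(first, second)
        return bool(self.bits[offset] & mask)
```

Under this reading, tailoring the general lexicon to one user amounts to setting and clearing individual coordinate points, and loading fewer sections simply drops the rows for the less frequent vertical-index characters.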
The technical scheme of the invention is described in detail below. Figs. 1a and 1b are the complete list of quasi-character radicals of the invention, Fig. 2 is the stroke code table, Fig. 3 is the flowchart of the character-splitting and coding rules, and Figs. 4a and 4b are the flowchart for using the input method of the invention.
1. Determining the processable character set of the BM input method. The selection principle is to include the characters and symbols a modern user may need, covering simplified characters, traditional characters, variant characters, alternate-form characters, radicals and the more commonly used non-character symbols. The set includes the 7000 characters of the List of Generally Used Characters in Modern Chinese, the 6763 Chinese characters and 687 non-character symbols of GB2312-80, the 20992 Chinese characters of ISO-10646, and so on, for a total of 30865 Chinese characters and non-character symbols. To suit different industries, regions and users, the processable character set of the BM input method is further divided into the following four subsets, giving four versions:
(1) BM100 version: 7862 Chinese characters and non-character symbols, comprising the 7000 characters of the List of Generally Used Characters in Modern Chinese and the 6763 Chinese characters and 687 non-character symbols of GB2312-80; suitable for most users of simplified characters.
(2) BM110 version: 7891 Chinese characters and non-character symbols. Its selection range is the same as that of the BM100 version, except that every simplified character that has a traditional form is replaced by the corresponding traditional character; suitable for most users of traditional characters.
(3) BM120 version: 10137 characters, comprising all the characters of the BM100 and BM110 versions; suitable for most users who use both simplified and traditional characters.
(4) BM130 version: 30865 characters, that is, the whole processable character set of the BM input method; suitable for nearly all users, including writers, collators of ancient books and household-registration administrators.
2. The BM character-splitting and coding rules apply to all 30865 Chinese characters and non-character symbols of the processable character set. They consist of eight splitting rules, nine supplementary notes, and the splitting and coding flowchart shown in Fig. 3.
The eight splitting rules are:
(1) A character should be split into as close to three radicals as possible, but never more than three. According to the stroke order of their first strokes in the character, the radicals are designated the first, second and last radical in turn.
(2) Intersecting strokes within a character or radical may never be separated.
(3) A larger radical formed by one radical separating another radical may not be split. For example: the heart, wood, standing grain, already, originally, □, extend, separate, draw, lack etc.; And=Myeon+fore-telling ≠ fourth+eight not, the offspring=□+Si+the moon ≠ youngster+one+moon etc.
(4) The character as a whole is split in the priority order left-right, top-bottom, inside-outside; only when such a split yields no character radical is another way of splitting allowed. For example: suddenly=Ren+□+dog, height=Tou+mouth+Jiong, Peng=ten+San+beans etc.
(5) Enclosing (inside-outside) structures must be split completely into inside and outside. Enclosing structures include: mouth, Contraband, □, Qian, Jiong, □, □, the mountain, □, several, the door, be, □, worker, king, soil, do, □, Door etc. (extra strokes may be attached to the outside of an enclosing structure without affecting its status as an enclosing structure; for example □ and Guo both count as Jiong, and □ counts as towel). For example: the week=□≠□+mouthful; the following characters, however, may not be split: day, day, Gang, with, net, separate, the dawn, extend, meat etc.
(6) Characters or radicals that resemble an enclosing structure are split preferentially in the enclosing manner; only when this yields no character radical is another way of splitting allowed. Structures resembling an enclosing structure include: Bao, shoot a retrievable arrow, dagger-axe, corpse, □, factory's (extensively), □ (□, □), □, Chuo (□), Yin, □, the bow, □ etc. For example: a surname=dagger-axe+□, weak=□+Bing; but □+several etc.
(7) The five radicals eat, □, □, Epileptic and yarn should themselves not be split further wherever possible, unless the character would otherwise have fewer than three radicals.
(8) As many character radicals as possible should be split out; as far as possible the later radicals should be character radicals, and the later radicals should have as many strokes as possible.
The nine supplementary notes are:
(1) The eight splitting rules are applied in the priority order in which they are stated.
(2) Character forms follow document (3); for characters not in document (3), they follow document (1). The stroke order of characters and radicals basically follows document (4) (changes were made only for the following two characters: the Tuan commentary on meaning of different diagrams in The Book Changes 4S03, cover D0S3); for characters and radicals not in document (4), it follows document (5).
(3) A radical consists of at least two strokes, but single-stroke characters and non-character symbols are not subject to this restriction, for example: second Y4VV, skill Y569, (1) 4Y1V etc. A radical is an intersecting radical if any of its strokes intersect; otherwise it is a non-intersecting radical. Strokes that surround a radical from at least three directions form a character or radical with an enclosing structure; if the strokes surround it from only two directions, it is a character or radical resembling an enclosing structure. A character should be split into three radicals as far as possible, but never more than three: when there would be fewer than three, the smaller radicals are taken rather than the larger, and when there would be more than three, the larger are taken rather than the smaller, for example: with 3XC8, Jiang 53C8.
(4) Radicals fall into two classes, character radicals and non-character radicals. Character radicals are further divided into standard-character radicals and quasi-character radicals. Standard-character radicals are the characters listed as head entries in document (3) (including simplified, traditional, variant and alternate-form characters), and are always coded with the initial letter of the pronunciation given in document (3). A single-radical character that cannot be found in document (3) is coded with the initial letter of the pronunciation given in document (1) (but only for coding that single-radical character itself; where it appears inside a two-radical or three-radical character it is still treated as a non-character radical). The BM130 version has 1478 standard-character radicals in total. Quasi-character radicals are in fact non-character radicals, but are treated like standard-character radicals during splitting and coding. 41 quasi-character radicals were selected: 10 are coded with the Arabic numerals 0~9 and 5 with the letters A, O, I, U and V; single-radical non-character symbols are always coded with the letter E; the other 25 quasi-character radicals are coded with the initial letter of the phonetic notation in the appendix "Table of Names of Chinese Character Radicals" of document (6) (the radical Fu has no phonetic notation there and is coded with the letter E after its customary pronunciation); see Fig. 1 for details. Non-character radicals are always represented by the code of their first stroke; see Fig. 2 for details.
(5) There are five basic strokes: horizontal (Heng), vertical (Shu), left-falling (Pie), dot or right-falling (Dian and □) and turning (□, □). □ and □ are stipulated to be the standard strokes; □ is classed as Heng, 亅 as Shu, □ and □ as Pie, and □, □, □, Ya etc. as turning. On this basis □ is equivalent to gold, □ to soil, □ to electricity etc., and all are treated as standard characters. The classification may not be reversed, however: Heng may not be classed as □, nor Shu as 亅, etc. For example, □ is not equal to "fourth" and is a non-character; likewise the □ in the character "chi" is not equal to eight, so chi ≠ eight+□ etc.
(6) A stroke may be lengthened or shortened along its own direction (but not in the perpendicular direction; for example the top of the character "green grass or young crops" is not treated as the character "rich"), or moved; strokes that touch may be moved along the stroke they touch (but may not be moved apart to form two radicals), so as to form a standard character as far as possible, provided this does not violate people's habits of recognizing characters. For example: the □ in the character "one-tenth" is not treated as the character "power", and the character "chi" may not be moved to become the character "people"; but the □ in the character "□" may be equated with the character "ear", the □ in the character "week" with the character "Ji" (although the code of the character "Ji" itself remains SK10), the □ in the character "lying" with the character "body", the □ in the character "good" with the character "woman", and the □ in the character "taking advantage of" with the character "standing grain" (but it may not be moved to become "thousand" and "eight") etc.
(7) When a standard-character radical has several fairly common pronunciations in document (3), the pronunciation whose initial letter comes first in A~Z order is used; uncommon pronunciations are not covered by this rule. For example: "weight" takes C, "length" takes C, "rate" takes L, "Yan" takes S etc., but "closing" takes H and not G (the character "closing" also has the pronunciation GE) etc.
(8) Different standard characters made up of identical strokes are, except where an obvious distinction is needed, all given the code of the more common pronunciation. For example day (YUE) and day: characters containing day (YUE) are coded with the letter Y only in a few cases (for example day Y140 and firm 6Y10, which must be distinguished from day R140 and rushing 6R10); in all other cases they may be coded with the letter R (for example sudden and violent RG13, old woman NRMO etc.). Likewise for □ (DUN) and not (BU): except that □ (DUN) itself is coded D023, in all other cases the letter B is used as the code (for example □ BRR3 etc.).
(9) Non-character symbols fall into two classes: those with a standard pronunciation (10 Arabic numerals, 52 upper-case and lower-case English letters, 12 Roman numerals, 169 hiragana and katakana, 48 upper-case and lower-case Greek letters, 66 upper-case and lower-case Russian letters and some Chinese character radicals) and those without one (for example +, -, *, / etc.). The former are coded with the initial letter of the Chinese pronunciation of their standard reading (following document (6) and document (7)); the latter are always coded with the letter E. The splitting of non-character symbols is governed by the following 10 provisions:
  1. The 29 basic numerals among the non-character symbols (0~9, the Chinese numerals one to nine, and I~X) may be treated as character radicals, and a single stroke may also be treated as a radical (for example 3.=3+.); all other non-character symbols are treated as non-character radicals.
  2. If the writing order of a non-character symbol is ambiguous, it is determined by the stroke order of similar Chinese character radicals.
  3. Apart from the two strokes □ (Han Yin) and □ (Han Ya, □), any straight-line stroke with a bend is treated as two strokes; for example ∠, ( etc. are all regarded as made up of two strokes.
  4. Wherever a stroke retraces itself it is treated as two strokes, for example: n=|+n.
  5. A stroke written from lower left to upper right counts as horizontal; for example the letter V is coded W3V0.
  6. □ counts as Dian; note that ","=□+□, coded E3V4, that is, the stroke retraces itself.
  7. O is regarded as formed by "(" and ")" strokes (turning) that touch without intersecting; all other non-character symbols composed entirely of semicircles are split with the semicircle as the stroke unit, for example 3, ∽, ε, S, § and 8, ∞ etc. form non-intersecting or intersecting radicals of 2~4 semicircles; but in non-character symbols only partly composed of semicircles, for example %, ‰, 6, 9, ∝, U etc., a circle is still treated as a single stroke.
  8. A bold stroke is treated as a non-intersecting repetition of that stroke; bold strokes are always written last, and the □ stroke is extracted first, for example the tab symbol □ is coded E569. A blackened area is treated as an intersecting stroke classed as turning, and is likewise written last.
  9. A single stroke that crosses itself is treated as an intersecting radical whether or not it can be split into several semicircles (for example 8 and ∞).
  10. The I among the letters is treated as three strokes; in all other cases it is treated as Shu, for example the Roman numeral I is coded Y1VV. Decorative strokes (dispensable strokes) in symbols may all be ignored, for example the letter A has 3 strokes and not 5; note that the "-" stroke of J and τ is not a decorative stroke, so their codes are J0V4 and T0V4.
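Supplementary note (7) above, on radicals with several pronunciations, reduces to taking the alphabetically smallest initial among the common readings. The following is a minimal sketch of that selection; the function name `radical_letter` is hypothetical, the pinyin readings in the usage lines are assumed readings chosen to match the letters quoted in the note, and deciding which readings count as "common" is left to the caller because the text defers that to document (3).

```python
def radical_letter(common_readings):
    """Code letter of a standard-character radical.

    common_readings: pinyin spellings of the fairly common pronunciations only;
    uncommon pronunciations must already have been left out by the caller.
    """
    if not common_readings:
        raise ValueError("at least one common reading is required")
    # Take the initial of every reading and keep the one earliest in A~Z order.
    return min(reading[0].upper() for reading in common_readings)


# Assumed readings, for illustration only.
print(radical_letter(["zhong", "chong"]))  # C
print(radical_letter(["he"]))              # H (an uncommon reading "ge" is not passed in)
```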
The splitting and coding flowchart is shown in Fig. 3. It specifies the splitting process and the coding rules for Chinese characters and non-character symbols. Every Chinese character and non-character symbol in the processable character set can be given a four-code encoding.
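The three code formulas stated earlier (single-radical, two-radical and three-radical characters) can be sketched as a small function. This is an illustration under stated assumptions, not the flowchart of Fig. 3: the `Radical` class and the helper names are hypothetical, the splitting itself is taken as given, and the stroke digits in `STROKE` (0 horizontal, 1 vertical, 2 left-falling, 3 dot, 4 turning) are inferred from worked codes quoted in the text such as Y140, D023 and W3V0, while Fig. 2 remains the authoritative table. The handling of two-stroke radicals (first stroke, pad, last stroke) is likewise read off the quoted codes J0V4 and W3V0.

```python
from dataclasses import dataclass
from typing import Sequence

PAD = "V"  # from the text: missing stroke codes are filled with the letter V

# Assumption: stroke digits inferred from codes quoted in the text.
STROKE = {"heng": "0", "shu": "1", "pie": "2", "dian": "3", "zhe": "4"}


@dataclass(frozen=True)
class Radical:
    code: str                  # radical code: one letter or one digit
    strokes: Sequence[str]     # stroke codes in writing order
    is_character: bool = True  # False for a non-character radical


def non_character(strokes):
    """A non-character radical is coded with the code of its first stroke."""
    return Radical(code=strokes[0], strokes=strokes, is_character=False)


def first(strokes):
    return strokes[0]


def second(strokes):
    # With only one or two strokes the second-stroke position is padded.
    return strokes[1] if len(strokes) >= 3 else PAD


def last(strokes):
    # With a single stroke there is no separate last stroke.
    return strokes[-1] if len(strokes) >= 2 else PAD


def encode(radicals):
    """Four-code encoding of a character already split into 1 to 3 radicals."""
    if not 1 <= len(radicals) <= 3:
        raise ValueError("a character is split into one, two or three radicals")
    tail = radicals[-1]
    s = tail.strokes
    if len(radicals) == 1:
        return tail.code + first(s) + second(s) + last(s)
    if len(radicals) == 2:
        # A non-character last radical already contributes its first stroke
        # as its radical code, so the third code moves on to its second stroke.
        third = first(s) if tail.is_character else second(s)
        return radicals[0].code + tail.code + third + last(s)
    return radicals[0].code + radicals[1].code + tail.code + last(s)
```

For example, a single-radical character pronounced with initial Y and written vertical, turning, horizontal, horizontal gives `encode([Radical("Y", ["1", "4", "0", "0"])])`, which evaluates to `"Y140"`, matching a code quoted in the text.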
3. The flowchart for using the BM input method is shown in Fig. 4. The six supplementary notes are:
(1) The command format for loading the BM input method is: ZBM (.EXE/? /? /? /? /?), where each ? stands for an optional parameter. Parameter 1 is the entry key, any of Alt+F1~Alt+F10; if that key is already occupied or the method is being installed a second time, a message is given and the command is aborted. Parameter 2 is the type of Chinese-character system in use. Parameter 3 is where the character table is loaded: conventional memory, extended memory or expanded memory. Parameter 4 is where the soft lexicon is loaded, likewise conventional memory, extended memory or expanded memory. Parameter 5 is the number of soft-lexicon sections to load: depending on the version used and the memory configuration of the machine, the user may load only the more commonly used sections (each version has 56 sections, each occupying 64 KB of memory), all of them, or none; if they are loaded into conventional memory, at most 3 sections are allowed. If the character table is loaded into extended or expanded memory, the soft lexicon must also be loaded into extended or expanded memory; likewise, if the soft lexicon is loaded into conventional memory, the character table must also be loaded into conventional memory.
If ZBM is typed followed directly by Enter, the default settings apply: entry key Alt+F4, the Kingsoft system, character table loaded into conventional memory, soft lexicon not loaded.
For users of low-end machines with 1MB of memory or less, the character table and flexible lexicon can also be run from a floppy disk or hard disk: a 1.2MB floppy disk can hold 18 to 19 segments, a 1.44MB floppy disk can hold 22 segments, and a hard disk can hold 1 to 56 segments as circumstances allow. This is not recommended, however, because it can shorten the life of the hardware.
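The floppy-disk figures follow from the 64KB segment size. A rough check in Python, assuming nominal capacities of 1200KB and 1440KB and ignoring file-system overhead and the space taken by the character table (the function name `segments_that_fit` is illustrative only):

```python
SEGMENT_KB = 64        # size of one flexible-lexicon segment
TOTAL_SEGMENTS = 56    # segments per version


def segments_that_fit(capacity_kb, reserved_kb=0):
    """Whole 64KB segments that fit in the given capacity, capped at 56."""
    usable = max(capacity_kb - reserved_kb, 0)
    return min(usable // SEGMENT_KB, TOTAL_SEGMENTS)


print(segments_that_fit(1200))   # nominal 1.2MB floppy disk
print(segments_that_fit(1440))   # nominal 1.44MB floppy disk
```

Under these assumptions the results are 18 and 22, in line with the 18 to 19 and 22 segments quoted above.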
(2) Under the BM character-splitting and coding rules, every Chinese character and non-character symbol in the processable character set can be assigned 4 codes, but most characters and symbols do not need all of them. To input a character, key in its code characters in order. As soon as no more than 10 characters remain selectable (these are called forecast characters), the prompt line displays these 1 to 10 forecast characters in descending order of frequency; if only 1 forecast character remains, it is sent directly to the screen. Taking the BM100 version as an example, 4982 characters are displayed after 2 codes, 2713 characters after 3 codes, and only 167 characters need 4 codes (2.12% of the BM100 processable character set, all of them rarely used). A warning beep sounds when an invalid code is keyed in. When the symbol () appears among the 1 to 10 forecast characters, this version has reserved a code for that character but has not yet fixed its internal code. The reasoning is as follows. The BM100 version selects 7862 Chinese characters and non-character symbols, covering the 7000 characters of the List of Commonly Used Characters in Modern Chinese and the 6763 Chinese characters and 687 non-character symbols of GB2312-80. Most users' computers, however, can currently handle only the GB2312-80 character set; characters and symbols outside it can only be handled through the user-defined-character function of each Chinese-character system, which lacks generality and is not a problem the present invention sets out to solve. So that this input method can conform to future industry standards, such characters are handled this way for the time being, to be completed when the version is upgraded.
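The display rule in note (2) can be sketched as prefix filtering over a frequency-ordered code table. This is only an illustration under stated assumptions: the function `forecast`, the action names, and the 12-entry table of placeholder codes are hypothetical, and duplicate full-length codes are not handled.

```python
def forecast(code_table, typed, limit=10):
    """Return (action, characters) for the codes typed so far.

    code_table: list of (code, character) pairs, already sorted from most
    to least frequent, so matches come out in display order.
    """
    matches = [ch for code, ch in code_table if code.startswith(typed)]
    if not matches:
        return "beep", []          # invalid code: sound the warning
    if len(matches) == 1:
        return "commit", matches   # a single forecast goes straight to the screen
    if len(matches) <= limit:
        return "show", matches     # display the 2..10 forecast characters
    return "wait", []              # too many candidates: keep typing


# Hypothetical table; codes and characters are placeholders only.
table = [(f"A{i:03d}", f"ch{i}") for i in range(12)]

print(forecast(table, "A"))      # 12 matches, more than the limit
print(forecast(table, "A01"))    # matches A010 and A011
print(forecast(table, "A011"))   # exactly one match
print(forecast(table, "B"))      # no match
```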
(3) A modification can be made only while the prompt line displays (? word). A display of (1 word) means that the previous character and this character form a soft word; a display of (0 word) means that they do not.
(4) After the user has modified the flexible lexicon, and while no more than 9 forecast characters are shown, pressing the Alt+O key combination asks whether the modified soft-word information should be saved: key in "Y" to save, or "N" not to save.
CL.EXE, an auxiliary program of the present invention, can also save the modified soft-word information at the Chinese-character system prompt, as the user wishes, and it saves faster (when the save function of some word-processing software conflicts with this input method's soft-word save function, only the save function of this program can be used). CL.EXE is also used to unload this input method from the Chinese-character system and to check whether the input method has been loaded.
(5) Before the prompt line shows forecast characters, each press of the backspace key deletes the last code character, while the Enter key deletes all code characters keyed in so far. Note that if the previous character has also been deleted with the backspace key, the current character is treated as the first character of a new round of input; because a first character cannot use flexible-lexicon information, its code length will be longer. Once the prompt line shows forecast characters, the backspace key calls up the characters outside the horizontal-index soft-word set (rarely used characters). After a character has been selected, the backspace key deletes that character and the prompt-line display. Note that if the prompt line has not yet shown forecast characters and the warning beep sounds even though no invalid code was keyed in, this code has no horizontal-index soft-word character (commonly used character); pressing the ";" key at this point calls up the characters outside the horizontal-index soft-word set directly.
Where necessary, the Alt+space key combination can be used to force the current character to be treated as the first character of a new round of input.
(6) this input method has been used following 42 keys on the keyboard: 0~9, A~Z, Alt key, enter key, backspace key, space bar, semi-colon key, Caps Lock Capslock.
4. The character-splitting, coding, and usage rules of the present invention are aimed mainly at ease of learning and generality, so on their own the performance figures such as code length, duplicate-code rate, and input speed are not high (for the BM100 version, the mean code length is 3.65 and the dynamic code length is 3.18). Performance is raised by the flexible lexicon. The flexible lexicon is a two-dimensional lexicon in which each coordinate point represents one two-character word (any multi-character word can be decomposed into two-character words, so the flexible lexicon can hold all vocabulary). The horizontal index consists of characters whose code length exceeds 2 (code length is defined as the number of keystrokes needed to input one character or symbol, including any necessary backspace and digit-selection keys). The vertical index is ordered by character usage frequency and consists of all 7000 characters of the List of Commonly Used Characters in Modern Chinese, divided into 56 segments of 125 characters each; the user decides how many segments to load (the method also works with none loaded, but performance is lower). Typically, 1 loaded segment covers 43.9% of Chinese characters, 2 segments cover 60.6%, 3 segments cover 70.7%, and 18 segments cover 98.6%. The flexible lexicon makes it possible to discard combinations that are unlikely to form a word, that is, non-soft-word characters, which effectively shortens the code length, lowers the duplicate-code rate, and raises input speed. The user does not need to assign or memorize a special code for any word (in other words, character and word coding are unified) and only needs the semicolon key, which acts as a toggle, to add a word to the lexicon or change one, with changes taking effect immediately; for this reason the present invention calls this lexicon a flexible lexicon. Because each coordinate point of the flexible lexicon can represent a two-character word, its capacity is very large: the BM100 version can effectively hold about 6,800,000 two-character words, which existing input methods can hardly match. Of course, because the flexible lexicon discards combinations, a forecast can fail during input, meaning that the wanted character is not among the 1 to 10 forecast characters. In that case, press the backspace key and then key in the remaining codes of the character until the prompt line again shows 1 to 10 forecast characters. A failed forecast therefore costs at least one more code than input without the flexible lexicon. To reduce forecast failures, the number of soft words should be increased; when it is at its maximum the forecast success rate is 1, but this is equivalent to not loading the flexible lexicon at all. To reduce the dynamic code length, the number of soft words should be kept as small as possible. There is therefore an optimal range that balances forecast success rate against dynamic code length. The BM input method is preloaded with more than 150,000 two-character words, covering all the vocabulary and the more common idioms in references (3), (6), (8), and (9) that involve horizontal-index and vertical-index soft-word characters. Starting from this base, each user can gradually add and delete entries according to their own profession, habits, and so on, making the dynamic code length as short as possible and the success rate as high as possible, so that the general-purpose flexible lexicon provided gradually becomes a dedicated lexicon suited to each user. For an individual user, the dynamic code length can fall below 1.5 codes, even approaching one keystroke per character, which greatly reduces the user's workload without affecting how easy the BM input method is to learn.
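One way to picture the two-dimensional flexible lexicon is as a bit matrix with one bit per (previous character, current character) pair. The sketch below is a hedged illustration, not the patented data layout: the class `FlexibleLexicon`, its method names, and the choice of rows for the vertical index (previous character) and columns for the horizontal index (current character) are assumptions.

```python
class FlexibleLexicon:
    """Bit matrix: bit (row, col) is set when prev+cur is a soft word."""

    def __init__(self, vertical, horizontal):
        # vertical: characters ordered by usage frequency (rows)
        # horizontal: characters whose code length exceeds 2 (columns)
        self.rows = {ch: i for i, ch in enumerate(vertical)}
        self.cols = {ch: j for j, ch in enumerate(horizontal)}
        nbits = len(self.rows) * len(self.cols)
        self.bits = bytearray((nbits + 7) // 8)

    def _pos(self, prev, cur):
        n = self.rows[prev] * len(self.cols) + self.cols[cur]
        return n >> 3, 1 << (n & 7)      # byte offset, bit mask

    def is_word(self, prev, cur):
        if prev not in self.rows or cur not in self.cols:
            return False
        byte, mask = self._pos(prev, cur)
        return bool(self.bits[byte] & mask)

    def toggle(self, prev, cur):
        """Flip one pair, as the semicolon key is described as doing."""
        byte, mask = self._pos(prev, cur)
        self.bits[byte] ^= mask

    def forecast(self, prev, candidates):
        """Keep only candidates that form a soft word with prev."""
        return [c for c in candidates if self.is_word(prev, c)]


lex = FlexibleLexicon(["中", "大"], ["国", "学", "间"])
lex.toggle("中", "国")                      # add the word 中国
print(lex.forecast("中", ["国", "学"]))     # only 国 survives
```

Because a pair costs one bit and needs no code of its own, adding or removing a word is a single toggle, which matches the unified character and word coding described above.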
5. In the BM120 and BM130 versions of the present invention, the software determines automatically whether the user writes simplified or traditional characters. For a simplified-character user, traditional characters are automatically discarded; for a traditional-character user, simplified characters are automatically discarded. If the user is found to mix simplified and traditional input, nothing is discarded. Because most users use only simplified or only traditional characters, this intelligent handling effectively reduces the dynamic code length of the BM120 and BM130 versions.
In the BM130 version, the software first considers only characters in the BM120 character set. When the first forecast (the soft-word forecast) fails, it considers all BM120 characters other than the horizontal-index soft-word characters; only when the forecast fails again does it consider BM130 characters that are not in the BM120 set. When simplified or traditional characters have been discarded, the discarded characters are considered only after the third forecast failure; whenever discarding is in effect, variant and alternate-form characters are discarded along with them. The maximum code length of the BM130 version is therefore 8. Because characters outside the BM120 set are obsolete and very rarely used, this handling ensures that the dynamic code length of the input method does not get worse as the processable character set grows. Specifically, in the BM130 version with no discarding, the numbers of characters with code lengths 5, 6, and 7 are 9781, 9506, and 629 respectively, of which only 225 five-code characters belong to the BM120 set.
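The staged fallback can be sketched as sorting the matching characters into four tiers, each reached by one more failed forecast. This is an interpretation rather than the patent's code: the names are hypothetical, the second tier is simplified to "any other BM120 character", and the keystroke count assumes 4 codes, one backspace per failed forecast, and one digit-selection key, which is one reading that reproduces the stated maximum of 8 (and 7 when nothing is discarded).

```python
def stage_candidates(matches, forms_soft_word, bm120_set, discarded):
    """Split matching characters into the four forecast stages.

    Stage 0: BM120 characters forming a soft word with the previous one.
    Stage 1: other BM120 characters (simplification, see lead-in).
    Stage 2: BM130 characters outside the BM120 set.
    Stage 3: characters discarded by simplified/traditional detection.
    """
    stages = [[], [], [], []]
    for ch in matches:
        if ch in discarded:
            stages[3].append(ch)
        elif ch not in bm120_set:
            stages[2].append(ch)
        elif forms_soft_word(ch):
            stages[0].append(ch)
        else:
            stages[1].append(ch)
    return stages


def worst_case_keystrokes(failed_forecasts, codes=4, selection_keys=1):
    """Assumed count: codes + one backspace per failure + digit selection."""
    return codes + failed_forecasts + selection_keys


print(worst_case_keystrokes(3))   # three failures, discarding in effect
print(worst_case_keystrokes(2))   # two failures, nothing discarded
```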
Compared with existing Chinese-character input methods, the present invention has the following advantages and beneficial effects:
1. For the 1478 standard-character radicals (of which 29 are used only for non-character symbols and 202 do not belong to the BM100 processable character set and are used only to encode the character itself, so one can also say there are only 1247 standard-character radicals) and for 25 character-like radicals, the BM input method uses the initial letter of the standard pronunciation as the code. Only 16 character-like radicals have codes defined by the present invention (mnemonics for some of them are given in Fig. 1), and non-character radicals always take the codes of their strokes. The number of standard-character radicals is small (a typical primary-school pupil already knows this many characters). The BM input method therefore largely shares the easy-to-learn, easy-to-remember advantage of full-pinyin input; and because no page turning is needed to select characters, the code length is shorter and the duplicate-code rate lower, which ensures faster input. It largely overcomes the drawbacks of phonetic codes: many duplicate codes, frequent page turning, long codes, slow input, errors from inaccurate pronunciation or tone, and the difficulty of inputting unfamiliar characters (including non-character symbols).
2. The processable character set reaches 30865 characters, which basically meets everyone's needs (it can later be extended to more than 50,000 characters; room for this was reserved when the software was written). Even at this size, each user can still achieve a short dynamic code length (about 1.84 codes with all 56 segments loaded), a low duplicate-code rate, and fast input. BM has only a dozen or so character-splitting and coding rules, defines only 41 character-like radicals, and uses 1478 standard-character radicals (which essentially need no memorizing) plus some non-character radicals, and its mapping of radicals to keys is basically consistent with full-pinyin input. The BM input method therefore has the advantages of component-based codes while overcoming their hard-to-learn, hard-to-remember drawbacks, such as requiring the user to memorize a hundred or more artificially optimized radicals and cumbersome rules and to distinguish high-frequency, first-level, and second-level characters.
3.BM input method has been used the font style characteristic the information whether stroke of five basic strokes, the order of strokes observed in calligraphy and words of word or radicals by which characters are arranged in traditional Chinese dictionaries intersects, and therefore has the easy advantage of graphemic code again.Because 40 class word radicals by which characters are arranged in traditional Chinese dictionaries all belong to the 201 standardization radicals by which characters are arranged in traditional Chinese dictionaries that State Language Work Committee is recommended, font style characteristic information has also only been used the part of people's easy to perform, meets the national education background basically, so memory capacitance is minimum.
4. In the BM input method, ten radicals take the Arabic numerals 0 to 9 as their codes, and non-character radicals are coded with the Arabic numerals 0 to 9 according to the first (second, last) stroke of the radical in writing order. The method can therefore be said to have the character of a numeric code, yet its usage rules are very simple and need no laborious memorization.
5. The flexible lexicon of the BM input method organically merges the general, professional, and personal lexicons of existing input methods into one. For the same memory footprint, the two word-code-table data structures commonly used by existing input methods (according to reference (2), one is a fixed word code table independent of the character code table, the other a tree-structured word code table indexed by the character code table) can hold less than 2% and 5%, respectively, of the vocabulary this input method can hold. The BM100 version, for example, can hold no fewer than 6,800,000 two-character words, yet the user never has to assign codes to words. Because character and word coding are unified, the user only needs to master the BM character-splitting and coding rules to encode characters, and uses the semicolon key, which acts as a toggle, to add words to or delete words from the flexible lexicon; changes take effect immediately, and the operation is simple and quick. All remaining work is handled intelligently by the software, which avoids the drawbacks of personal lexicons in existing input methods, such as code conflicts, difficulty of memorization, and small capacity. As for a "general" lexicon, users' situations differ widely, and vocabulary varies by person, place, and time; however large such a lexicon is, someone will still find it not general enough. In other words, no universally applicable standard general lexicon exists. The BM input method preloads more than 150,000 fairly common two-character words into the flexible lexicon and lets users add to and prune it by extremely simple means, drawing on each user's own judgment and creativity, so that the dynamic code length is as short as possible, the duplicate-code rate as low as possible, and the input speed as high as possible. This removes the boundary between general, professional, and personal lexicons and makes the lexicon truly suited to each user.
6. The processable character sets of the BM120 and BM130 versions are much larger than those of the BM100 and BM110 versions, while all four versions share the same character-splitting and coding rules. Thanks to the flexible lexicon and the intelligent handling in software, the four versions contain no redundant codes, and their dynamic code lengths differ little. Existing input methods can hardly achieve this.
7. Although all four versions of the BM input method can use 3.6 to 3.75MB of memory, users of low-end machines (286 and below) need not worry about running out of memory, because the BM100 and BM110 versions need only 64KB of memory (about 80KB for BM120 and about 240KB for BM130) to run normally. In that configuration the dynamic code length of the BM100 version is about 3.18 (about 3.2 for BM110, 3.28 for BM120, and 3.31 for BM130), which is not bad, and the more free memory there is, the better the BM input method performs. (On a microcomputer with more than 1MB of memory, the flexible lexicon and character table can be kept out of conventional memory, in which case the BM100 and BM110 versions occupy only about 33.3KB and 33.7KB of conventional memory.) In other words, the performance of the BM input method is improved not by optimizing radicals or by adding to or changing the character-splitting and coding rules, which thoroughly resolves the chronic problem of existing Chinese keyboard input methods: "what is easy to learn is not easy to use, and what is easy to use is not easy to learn". Of course, so that users of low-end machines with only 1MB (or even 512KB) of memory can also use this input method, some provisions that do require memorization were made in character splitting, coding, and the choice of character-like radicals; the method is therefore slightly harder to learn than full-pinyin input, but much easier than other existing input methods.
8. In general, the BM input method is scientific (it basically conforms to the relevant Chinese-character standards and to people's cognitive habits regarding Chinese characters, with a short dynamic code length, a low duplicate-code rate, and fast input), simple and clear (a person with only primary-school education can see the code of most characters at a glance, and can type while listening or thinking), and rigorous (for every character in the processable set, the BM character-splitting and coding rules yield only one splitting scheme, with little ambiguity). Anyone who knows 1500 characters (1200 for the BM100 version), has a basic grasp of pinyin (that is, knows the initial letter of a character's pronunciation), knows the radical indexing method of reference (3), and writes characters in the correct stroke order can generally learn character splitting and coding in 1 to 3 hours and use the method proficiently within a week. In short, the BM input method was developed with the aims of being easy to learn, general, and easy to use. It basically matches the national educational background and people's cognitive habits regarding Chinese characters, and its principle can also be applied to input methods for other ideographic languages, including Japanese and Korean. On the software side, it makes the fullest possible use of new functions and technologies in computer software and hardware, opening a new path for the development of Chinese-character keyboard input.
The accompanying drawings are further described as follows:
1. Fig. 1 is the table of the 41 character-like radicals and their codes. They fall into four classes: the first class has 25 radicals, coded with the initial letter of the pronunciation given in the "Table of Chinese Character Radical Names" appended to reference (6); the second class has 10 commonly used radicals, the third class has 5 more unusual radicals, and the fourth class has 1 radical used for non-character symbols, and the codes of these last three classes are defined by the present invention.
2. Fig. 2 is the stroke code table. It specifies the codes of the 5 basic strokes in crossing and non-crossing radicals.
3. Fig. 3 is the flowchart of the BM character-splitting and coding rules. It specifies the concrete splitting steps and coding rules for each character.
4. Fig. 4 is the flowchart for using the BM input method. It specifies how to use this input method to enter Chinese characters and non-character symbols from the keyboard.
Embodiments of the present invention are as follows:
1. The processable character sets of the four versions are each encoded according to the BM character-splitting and coding rules and sorted in the order of the Arabic numerals 0 to 9 and the letters A to Z, forming the character code table. The main statistics are as follows:
The percentages of single-radical, two-radical, and three-radical characters in each version's processable character set are: BM100, 12.12%, 18.52%, and 69.36%; BM110, 11.39%, 13.90%, and 74.71%; BM120, 9.53%, 15.37%, and 75.10%; BM130, 3.91%, 11.90%, and 84.19%.
The BM130 character set has 1478 standard-character radicals in total: 864 belong to the 2500 GB commonly used characters, 114 belong to the 1000 GB secondary commonly used characters, 226 belong to the 7000 general-use characters promulgated by the State Language Work Committee but not to the 3500 commonly used characters, 16 are obsolete characters outside the 7000 general-use characters, 229 are non-standard-character radicals of single-radical characters (that is, used only to input the character itself; 27 of them belong to the 7000 general-use characters), and 29 are used only for non-character symbols. Of these 1478 standard-character radicals, 190 are traditional, variant, or alternate-form characters (used mainly in the BM120 and BM130 versions). The BM100 character set has 1123 standard-character radicals.
Without the flexible lexicon, the percentages of each version's processable character set that need the first 2, 3, and 4 codes (note that this is not the same as code length, because digit-selection and backspace keys are not counted) are: BM100, 63.37%, 34.51%, and 2.12%; BM110, 60.52%, 37.36%, and 2.12%; BM120, 53.52%, 44.84%, and 1.64%; BM130, 29.25%, 67.48%, and 3.27%.
2. From the commonly used characters in each of the four versions' character code tables, the characters whose code length exceeds 2 are picked out to form that version's horizontal-index soft-word character table. The vertical-index soft-word characters are taken from the 7000 general-use characters, sorted by usage frequency, and divided into 56 segments of 125 characters, each segment occupying 64KB. For each vertical-index soft-word character in turn, consider whether it can form a word with each character in the whole horizontal-index table, in other words whether the two have a fair chance of appearing adjacent in written or spoken language; this yields the soft-word segment data files. The main code-length statistics for each version are as follows:
(1) Mean code length without the flexible lexicon: about 3.648 for BM100, 3.666 for BM110, 3.719 for BM120, and 4.857 for BM130.
(2) Dynamic code length without the flexible lexicon: about 3.18 for BM100, 3.2 for BM110, 3.28 for BM120, and 3.31 for BM130.
(3) Dynamic code length with all 56 flexible-lexicon segments loaded, in the general case: about 1.78 for BM100, 1.78 for BM110, 1.80 for BM120, and 1.84 for BM130.
(4) Dynamic code length with 2 flexible-lexicon segments loaded, in the general case: about 2.33 for BM100, 2.34 for BM110, 2.38 for BM120, and 2.42 for BM130.
3. The program can then be written, assembled, linked, and debugged according to the character code table, the soft-word segment files, and the usage rules of the BM input method, and once it passes it is attached to the Chinese-character system. Because this input method involves a large amount of data and a long program, assembly language should be used in order to save as much memory as possible and let users of low-end machines use it. The main memory-usage statistics for each version are as follows:
When running normally, the BM100 program occupies at least 64KB of memory (of which at least 33.3KB must be conventional memory) and at most 3.6MB.
When running normally, the BM110 program occupies at least 64KB of memory (of which at least 33.7KB must be conventional memory) and at most 3.6MB.
When running normally, the BM120 program occupies at least 80KB of memory (of which at least 43.3KB must be conventional memory) and at most 3.65MB.
When running normally, the BM130 program occupies at least 240KB of memory (of which at least 128KB must be conventional memory) and at most 3.75MB.
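The maximum figures are roughly what one gets by adding all 56 lexicon segments of 64KB to each version's minimum footprint. The sketch below makes that assumption explicit; it is a plausibility check rather than the patent's own accounting, and the names `MIN_KB` and `max_footprint_kb` are illustrative.

```python
SEGMENT_KB = 64
TOTAL_SEGMENTS = 56
# Minimum footprints quoted in the text, in KB.
MIN_KB = {"BM100": 64, "BM110": 64, "BM120": 80, "BM130": 240}


def max_footprint_kb(version, segments=TOTAL_SEGMENTS):
    """Assumed maximum: minimum footprint plus the loaded lexicon segments."""
    return MIN_KB[version] + segments * SEGMENT_KB


for version in MIN_KB:
    kb = max_footprint_kb(version)
    print(version, kb, "KB =", round(kb / 1024, 2), "MB (1MB = 1024KB)")
```

Under this assumption each result lands within about 0.1MB of the stated maximum, whether 1MB is taken as 1000KB or 1024KB.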
4. The software allows the user to make the following choices in any combination:
(1) The user may choose any of Alt+F1 to Alt+F10 as the entry key.
(2) The user may choose any Chinese-character system.
(3) The user may load the character code table into conventional memory, extended memory, or expanded memory.
(4) The user may load the flexible lexicon into conventional memory, extended memory, or expanded memory.
(5) The user may load any number of flexible-lexicon segments from 0 to 56.
(6) The user can conveniently save modified soft-word information, unload this input method from the Chinese-character system, and check whether this input method has already been loaded into the Chinese-character system.
Claims (10)
1. A computer Chinese character keyboard input method covering simplified characters, traditional characters, variant characters, traditional variant characters, radicals, and the more commonly used non-character symbols, which uses the ten numeric keys 0~9 and the 26 letter keys A~Z of a standard keyboard together with the Alt key, Enter key, Backspace key, space bar, semicolon key, and Caps Lock key, and which adopts a coding scheme that combines pinyin, radical, shape, and digits and unifies characters and words, with a corresponding lexicon, characterized in that: the lexicon is structured as a flexible lexicon; the flexible lexicon is a two-dimensional lexicon in which each coordinate point represents a two-character word; the horizontal axis of the flexible lexicon consists of the commonly used characters in the Chinese character code table whose code length is greater than 2; the vertical axis of the flexible lexicon is selected from 7000 general-purpose characters, sorted by usage frequency and divided into 56 segments of 125 characters each, each segment occupying 64 KB of space; each vertical-axis character is examined in turn against every horizontal-axis character to decide whether the pair can form a word, that is, whether the two have a fairly high chance of appearing next to each other in writing and speech, and the flexible-word segment data files are formed from the result; characters and words are entered as 1~4 codes using the letter keys and numeric keys of the keyboard; characters are split by radical; radicals are divided into character-forming radicals and non-character radicals; character-forming radicals are further divided into standard-character radicals and character-like radicals; a standard-character radical takes the initial letter of its standard pronunciation as its code; a character-like radical takes a single digit 0~9 or a letter as its code; a non-character radical takes as its code the single digit 0~9 assigned to its first stroke; strokes are divided into horizontal, vertical, left-falling, dot, and turning strokes; the radicals split from a character are designated the first, second, and last radical according to the stroke order of their first strokes within the character; characters are classed as single-radical, two-radical, or three-radical characters according to the number of radicals split out; single-radical character code: initial letter of the character's pronunciation + first-stroke code + second-stroke code + last-stroke code, and when there are too few strokes the first stroke is taken first and the last stroke next, with any remaining positions padded with letters; two-radical character code: first-radical code + last-radical code + first-stroke code of the last radical + last-stroke code of the last radical, where, if the last radical is a non-character radical, the third code becomes the second-stroke code of the last radical, and the rest follows the rules for single-radical characters; three-radical character code: first-radical code + second-radical code + last-radical code + last-stroke code of the last radical.
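The three code formulas at the end of claim 1 can be illustrated with a minimal Python sketch. It is an interpretation, not the patented implementation: the function names are hypothetical, stroke codes are passed in as ready-made digits because the claim does not say which digit belongs to which stroke type, and the padding letter `V` and its placement are assumptions read off the examples Y1VV and W3V0 that appear in claim 3.

```python
PAD = "V"  # assumed padding letter, inferred from the examples Y1VV and W3V0


def single_radical_code(initial, strokes):
    """Initial letter + first-stroke + second-stroke + last-stroke code.

    `strokes` holds the stroke codes in writing order. With too few strokes,
    the first stroke is used first, then the last stroke, and the remaining
    positions are padded (assumed reading of the claim).
    """
    if not strokes:
        raise ValueError("at least one stroke code is required")
    first = strokes[0]
    last = strokes[-1] if len(strokes) >= 2 else PAD
    second = strokes[1] if len(strokes) >= 3 else PAD
    return initial + first + second + last


def two_radical_code(first_code, last_code, last_strokes, last_is_character_forming):
    """First-radical code + last-radical code + two stroke codes of the last radical.

    For a non-character last radical the third position uses its second
    stroke, since its radical code is already its first-stroke code.
    """
    if len(last_strokes) < 2:
        raise ValueError("this sketch expects a last radical of two or more strokes")
    third = last_strokes[0] if last_is_character_forming else last_strokes[1]
    return first_code + last_code + third + last_strokes[-1]


def three_radical_code(first_code, second_code, last_code, last_strokes):
    """First, second, and last radical codes + last-stroke code of the last radical."""
    if not last_strokes:
        raise ValueError("at least one stroke code is required")
    return first_code + second_code + last_code + last_strokes[-1]
```

Under these assumptions, `single_radical_code("Y", ["1"])` gives `Y1VV` and `single_radical_code("W", ["3", "0"])` gives `W3V0`, which agrees with the two codes quoted in claim 3. The radical codes passed to the two- and three-radical functions would come from a radical table that the claims describe but do not list.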
2. The computer Chinese character keyboard input method as claimed in claim 1, characterized in that the splitting of characters follows eight rules:
(1) a character should be split into three radicals where possible, but never into more than three;
(2) intersecting strokes within a character or a radical must never be split apart;
(3) a large radical formed by one radical separating another radical must not be split; examples include the heart, wood, standing grain, already, originally, extend, separate, draw, lack, etc.;
(4) a character is split as a whole in the priority order left-right, top-bottom, inside-outside; only when such a split yields no character-forming radical may it be split in another way;
(5) structures of the inside-outside type must be split completely into their inside and outside parts; inside-outside structures include: mouth, Contraband, □, Qian, Jiong, □, the mountain, □, □, door, be, □, worker, king, soil, do, door, etc.; additional strokes may be attached to the outside of an inside-outside structure without affecting its inside-outside property, so that □ and Guo both fall under Jiong and □ falls under □; the following characters must not be split: day, day, with, net, separate, the dawn, extend, meat, etc.;
(6) characters or radicals of the quasi inside-outside type are split preferentially in the inside-outside manner; only when this yields no character-forming radical may they be split in another way; quasi inside-outside structures include: Bao, shoot a retrievable arrow, dagger-axe, corpse, □, factory's (extensively), □ (□, □), □, Chuo (□), Yin, □, the bow, □, etc.;
(7) the five radicals eat, □, □, Epileptic, and yarn should themselves not be split further where possible, unless the number of radicals would fall short of three without splitting them;
(8) as many character-forming radicals as possible should be split out, the later radical should be a character where possible, and the later radical should have as many strokes as possible.
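Rules (1) and (4) above amount to a small selection procedure over candidate splits, which can be sketched as follows. This is only an illustration under stated assumptions: the `Split` record, the `choose_split` function, and the idea that candidate splits arrive pre-labelled with their structure type are all hypothetical, and the remaining six rules are not modelled.

```python
from dataclasses import dataclass

# Priority order stated in rule (4).
PRIORITY = ("left-right", "top-bottom", "inside-outside")


@dataclass(frozen=True)
class Split:
    structure: str               # one of PRIORITY, or "other"
    radicals: tuple              # radical labels in writing order
    has_character_forming: bool  # True if any radical is character-forming


def choose_split(candidates):
    """Pick a split following rule (1) (at most three radicals) and rule (4)."""
    valid = [c for c in candidates if 1 <= len(c.radicals) <= 3]
    # Prefer whole-character splits in priority order, provided they
    # yield at least one character-forming radical.
    for structure in PRIORITY:
        for c in valid:
            if c.structure == structure and c.has_character_forming:
                return c
    # Only then is another kind of split allowed.
    for c in valid:
        if c.structure not in PRIORITY:
            return c
    return valid[0] if valid else None
```

For example, given a left-right candidate with no character-forming radical and a top-bottom candidate with one, `choose_split` returns the top-bottom candidate, because the left-right split fails the condition in rule (4).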
3. The computer Chinese character keyboard input method as claimed in claim 1, characterized in that the eight character-splitting rules are supplemented by nine explanatory provisions:
(1) the eight splitting rules are applied in the order of priority in which they are stated;
(2) character forms are taken from the document as the standard, with changes made to only the following two characters: Tuan (the commentary on the meaning of the diagrams in the Book of Changes) 4S03, and cover D0S3;
(3) a radical consists of at least two strokes, although single-stroke characters and single-stroke non-character symbols, including one, second, (1), etc., are not bound by this provision; a radical in which any strokes intersect is an intersecting radical, and otherwise it is a non-intersecting radical; strokes that enclose a radical from no fewer than three directions form a character or radical of the inside-outside type, whereas if the strokes enclose it from only two directions the character or radical is of the quasi inside-outside type; a character should be split into three radicals where possible but never into more than three, that is, when there would be fewer than three radicals the smaller radicals are taken rather than the larger, and when there would be more than three the larger are taken rather than the smaller;
(4) standard-character radicals comprise simplified characters, traditional characters, variant characters, and traditional variant characters, take the initial letter of their standard pronunciation as their code, and number 1478 in total; character-like radicals are non-character radicals that are nevertheless treated like standard-character radicals during splitting and coding; 41 character-like radicals have been selected, of which 10 take the Arabic numerals 0~9 as their codes, 5 take the letters A, O, I, U, V as their codes, the single-radical non-character symbol uniformly takes the letter E as its code, 24 take the initial letter of their standard pronunciation as their code, and 1, the radical Fu, takes the letter E as its code according to its customary pronunciation; non-character radicals are uniformly represented by the code of their first stroke;
(5) among the basic strokes, Dian and □ are alike regarded as standard strokes; □ and □ belong to Heng □, 亅 belongs to the vertical stroke, □ and □ belong to the left-falling stroke, and Dian □, □, Ya, etc. all belong to the turning stroke, and so on by analogy; □ is equivalent to gold, □ is equivalent to soil, and □ is equivalent to electricity, all being regarded as standard characters;
(6) a stroke may be stretched or shortened linearly along its own direction, except in the vertical direction; a stroke may be moved in a straight line along its own direction; a stroke that touches another stroke may be moved along the stroke it touches, but may not be moved so as to form two radicals; the aim is to obtain a standard character where possible, provided that people's habitual recognition of the character is not violated; for example, the □ in the character "one-tenth" is not treated as the character "power", and the character "chi" cannot be shifted into □ and the character "people"; however, the □ in the character "□" may be equated with the character "ear", the □ in the character "week" with the character "Ji", the □ in the character "lying" with the character "body", the □ in the character "good" with the character "woman", the □ in the character "taking advantage of" with the character "standing grain", etc.;
(7) when a standard-character radical has several fairly common pronunciations, it takes the pronunciation whose initial letter comes first in alphabetical order A~Z, but uncommon pronunciations fall outside this provision; for example, "weight" takes C, "length" takes C, "rate" takes L, and "Yan" takes S, whereas "closing" takes H rather than G, etc.;
(8) different standard characters composed of identical strokes take only the commonly used pronunciation, except where they clearly need to be distinguished;
(9) non-character symbols are divided into two classes, those with a standard pronunciation and those without; those with a standard pronunciation include the 10 Arabic numerals, 52 upper- and lower-case English letters, 12 Roman numerals, 169 hiragana and katakana, 48 upper- and lower-case Greek letters, 66 upper- and lower-case Russian letters, and some Chinese character radicals; those without a standard pronunciation include +, -, *, /, etc.; the former take the initial letter of the Chinese reading of their standard pronunciation as their code, and the latter uniformly take the letter E as their code; the splitting of non-character symbols is governed by the following 10 provisions: 1. among the non-character symbols, the 29 basic numerals 0~9, one~nine, and I~X may be regarded as character-forming radicals, a single stroke may also be regarded as a radical, and all other non-character symbols are uniformly regarded as non-character radicals; 2. where the writing order of a non-character symbol is ambiguous, it is determined by the stroke order of a similar Chinese character radical; 3. apart from the two strokes □ (including □) and □ (including Ya, □), every straight-line stroke that turns is uniformly regarded as two strokes; 4. wherever a stroke repeats itself it is regarded as two strokes, for example n=|+n, etc.; 5. a stroke written from lower left to upper right belongs to the horizontal stroke, for example the letter V is coded W3V0, etc.; 6. "." belongs to □, and "," = □ + □ and is coded E3V4, that is, its stroke is repeated; 7. zero is regarded as formed by three "(" strokes, that is, turning strokes, touching one another; all other non-character symbols composed entirely of semicircles are split with the semicircle as the stroke unit, so that 3, ∽, ε, S, § and 8, ∞, etc. constitute non-intersecting or intersecting radicals of 2~4 semicircles; but in non-character symbols not composed entirely of semicircles, including %, ‰, 6, 9, ∝, U, etc., a circle is still regarded as a single stroke; 8. a blackened area is regarded as one intersecting stroke belonging to the turning stroke and is written last; a thickened stroke is regarded as a non-intersecting repetition of that stroke and is likewise always written afterwards, with the □ stroke extracted preferentially, for example the tab symbol □ is coded E569, etc.; 9. a single stroke that intersects itself, whether or not it can be split into several semicircles, is always regarded as an intersecting radical, for example 8 and ∞, etc.; 10. the letter I is regarded as three strokes, and in all other cases it is regarded as |, for example the Roman numeral I is coded Y1VV, etc.; decorative strokes in all kinds of symbols may be ignored, so that, for example, the letter A is regarded as 3 strokes, etc.
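Provision (7) above, on radicals with several common pronunciations, reduces to taking the alphabetically smallest initial among the common readings. A minimal sketch follows. The function name is hypothetical, and the pinyin readings in the usage note are assumptions about which characters the translated examples "weight", "length", "rate", and "closing" refer to (taken here as 重, 长, 率, and 合). Deciding which readings count as common is left to the caller.

```python
def radical_initial(common_readings):
    """Return the code letter for a standard-character radical.

    `common_readings` lists the pinyin of the commonly used pronunciations
    only; uncommon readings are excluded before calling, as the provision
    exempts them. The letter that sorts first in A~Z is chosen.
    """
    if not common_readings:
        raise ValueError("at least one common reading is required")
    return min(reading[0].upper() for reading in common_readings)
```

With the assumed readings, `radical_initial(["zhong", "chong"])` and `radical_initial(["zhang", "chang"])` both give `C`, and `radical_initial(["shuai", "lü"])` gives `L`. Passing only `["he"]` gives `H`, mirroring the "closing" example in which the reading that begins with G is treated as uncommon.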
4. The computer Chinese character keyboard input method as claimed in claim 1, characterized in that the numeric keys 0~9 and the space bar are used to select among characters and words that share the same code, the Alt+O key combination is used to save the modified flexible lexicon back, the semicolon key is used to add vocabulary to or delete vocabulary from the flexible lexicon, the Backspace and Enter keys are used to correct mistakes, and the Caps Lock key and the Alt+space key combination are used to end or begin a round of Chinese character and symbol input.
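The key assignments in claim 4 can be written down as a small dispatch table. The sketch below is illustrative only: the action names, the key labels, and the `candidates_pending` flag are assumptions. The flag is introduced because the digits serve both as code keys (claim 1) and as selection keys (claim 4), and the claims do not say how the two roles are told apart.

```python
# Keys with a fixed role in claim 4 (action names are hypothetical labels).
CONTROL_ACTIONS = {
    "Alt+O": "save_lexicon",
    ";": "edit_lexicon",
    "Backspace": "correct_input",
    "Enter": "correct_input",
    "CapsLock": "toggle_input_round",
    "Alt+space": "toggle_input_round",
}


def action_for(key, candidates_pending=False):
    """Map a key label to an action name."""
    is_digit = len(key) == 1 and key.isdigit()
    # Digits and the space bar choose among same-code candidates
    # when a candidate list is showing (assumed disambiguation).
    if candidates_pending and (key == "space" or is_digit):
        return "select_candidate"
    if key in CONTROL_ACTIONS:
        return CONTROL_ACTIONS[key]
    # Letters A~Z and digits 0~9 otherwise contribute to the 1~4 symbol code.
    if len(key) == 1 and key.isascii() and key.isalnum():
        return "code_input"
    return "ignored"
```

For instance, `action_for("5")` is treated as code input, while `action_for("5", candidates_pending=True)` selects a candidate.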
5. The computer Chinese character keyboard input method as claimed in claim 1, characterized in that the first subset of the character set handled is 7862 simplified characters, including 687 non-character symbols; the useful capacity of the flexible lexicon is not less than 6.8 million two-character words; in normal operation the program occupies at most 3.6 MB and at least 64 KB of memory, of which at least 33.3 KB is conventional memory; the average code length is about 3.18, and the dynamic code length is about 1.78~3.18.
6. The computer Chinese character keyboard input method as claimed in claim 1, characterized in that the second subset of the character set handled is 7891 traditional characters, including 687 non-character symbols; the useful capacity of the flexible lexicon is not less than 6.8 million two-character words; in normal operation the program occupies at most 3.6 MB and at least 64 KB of memory, of which at least 33.7 KB is conventional memory; the average code length is about 3.20, and the dynamic code length is about 1.78~3.20.
7. The computer Chinese character keyboard input method as claimed in claim 1, characterized in that the third subset of the character set handled is 10137 simplified and traditional characters, being the first and second subsets of the character set combined; the useful capacity of the flexible lexicon is not less than 11 million two-character words; in normal operation the program occupies at most 3.65 MB and at least 80 KB of memory, of which at least 43.3 KB is conventional memory; the average code length is about 3.28, and the dynamic code length is about 1.80~3.28.
8. The computer Chinese character keyboard input method as claimed in claim 1, characterized in that the entire character set handled is 30865 characters; the useful capacity of the flexible lexicon is not less than 18 million two-character words; in normal operation the program occupies at most 3.75 MB and at least 240 KB of memory, of which at least 128 KB is conventional memory; the average code length is about 3.31, and the dynamic code length is about 1.84~3.31.
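The memory figures in claims 5 to 8 can be read against the segmented layout of the flexible lexicon in claim 1 (56 segments of 125 vertical-axis characters, 64 KB per segment). The sketch below works through that arithmetic and shows one possible way to address a two-character word by segment. The class, its method names, and the use of in-memory sets are assumptions for illustration; the claims do not describe the storage format inside a segment.

```python
SEGMENTS = 56            # from claim 1
CHARS_PER_SEGMENT = 125  # from claim 1
SEGMENT_KB = 64          # from claim 1


def locate(rank):
    """Map a 0-based frequency rank on the vertical axis to (segment, offset)."""
    if not 0 <= rank < SEGMENTS * CHARS_PER_SEGMENT:
        raise ValueError("rank must lie within the 7000 vertical-axis characters")
    return divmod(rank, CHARS_PER_SEGMENT)


class FlexibleLexicon:
    """Two-dimensional lexicon: a point (vertical rank, horizontal index) is a word."""

    def __init__(self):
        self._segments = {}  # segment index -> set of (offset, horizontal index)

    def add(self, rank, horizontal_index):
        segment, offset = locate(rank)
        self._segments.setdefault(segment, set()).add((offset, horizontal_index))

    def remove(self, rank, horizontal_index):
        segment, offset = locate(rank)
        self._segments.get(segment, set()).discard((offset, horizontal_index))

    def contains(self, rank, horizontal_index):
        segment, offset = locate(rank)
        return (offset, horizontal_index) in self._segments.get(segment, set())


total_chars = SEGMENTS * CHARS_PER_SEGMENT  # 7000 vertical-axis characters
total_kb = SEGMENTS * SEGMENT_KB            # 3584 KB, i.e. 3.5 MB
```

The product 56 × 64 KB = 3584 KB (3.5 MB) sits just under the 3.6 MB maximum quoted in claims 5 and 6, and a single 64 KB segment matches their stated minimum. That correspondence is an observation from the arithmetic, not something the claims state.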
9. The computer Chinese character keyboard input method as claimed in claim 1, characterized in that the radicals into which the entire character set handled is split are 41 character-like radicals, 1478 standard-character radicals, and a number of non-character radicals.
10. The computer Chinese character keyboard input method as claimed in claim 1, characterized in that the standard-character radicals and 25 of the character-like radicals take the initial letter of their standard pronunciation as their code, and the other 16 character-like radicals take a single digit 0~9 or a letter as their code.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 96119064 CN1161495A (en) | 1996-05-03 | 1996-05-03 | Computer Chinese character key-board input method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN1161495A true CN1161495A (en) | 1997-10-08 |
Family
ID=5125523
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN 96119064 Pending CN1161495A (en) | 1996-05-03 | 1996-05-03 | Computer Chinese character key-board input method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN1161495A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102214013A (en) * | 2011-07-15 | 2011-10-12 | 李凯 | Input method of graphic and phonological Chinese character |
CN104991657A (en) * | 2015-06-11 | 2015-10-21 | 周连惠 | Chinese and Japanese katakana integrated input method and input method system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN1023916C (en) | Chinese keyboard entry technique with both simplified and original complex form of Chinese character root and its keyboard | |
JP2006127510A (en) | Multilingual input method editor for ten-key keyboard | |
CN102033615A (en) | Digital operation coding input method capable of optimizing world character information and information processing system thereof | |
CN101630197B (en) | Multiple building block type interactive Chinese character input method | |
CN101853084A (en) | Chinese digital pinyin and stroke combination input method and keyboard | |
CN104331173B (en) | The computer processing method and system of character information | |
CN1318786A (en) | Intensive Chinese and English keyboard capable of being displayed on screen | |
CN103616960A (en) | Six vowel binary syllabification input method | |
CN100476826C (en) | Chinese character ordering searching method and device and one information system | |
CN1161495A (en) | Computer Chinese character key-board input method | |
CN101135938B (en) | Chinese characters phonetic two-tone input method | |
CN1136496C (en) | Simplified spelling-touching screen mouse chinese character input method | |
CN103207684A (en) | Phonemic letter double-input method | |
CN1255670A (en) | Chinese-character 5-key input method | |
WO2001093180A1 (en) | World characters numerical coding input method and thereof its information handling system | |
CN1194285C (en) | Chinese-character encode input technique in more input modes for computer | |
CN107256092B (en) | Chinese character digital shape code quick input method | |
CN104793757B (en) | Chinese character input method and device | |
CN1472626A (en) | Intelligent embedded character inputting method and device | |
CN100405264C (en) | Chinese character characterized location encoding combination input method based on one-key -for-one-character | |
CN101021753A (en) | Chinese character five-stroke fourteen-radicals inputting method on cellphone or computer | |
Wu et al. | Computer processing of Chinese characters: An overview of two decades' research and development | |
CN1052200A (en) | Pronunciation-form-meaning words encode series with compatibility and keyboard | |
CN1801050A (en) | System and method for inputting international literal character | |
CN1208712C (en) | <<Chinese character structure> input method> |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C06 | Publication | ||
PB01 | Publication | ||
C01 | Deemed withdrawal of patent application (patent law 1993) | ||
WD01 | Invention patent application deemed withdrawn after publication |