CN1006419B - Sorting and coding method of chinese character outline symbol and character element - Google Patents

Sorting and coding method of chinese character outline symbol and character element

Info

Publication number
CN1006419B
Authority
CN
China
Prior art keywords
radical
class
axe
chinese character
dagger
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired
Application number
CN 85105556
Other languages
Chinese (zh)
Other versions
CN85105556A (en)
Inventor
陈爱文
周静梓
叶芬弟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
AIWEN COMPUTOR Co Ltd ZHANGJIAGANG
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Family has litigation
First worldwide family litigation filed: https://patents.darts-ip.com/?family=4794554&utm_source=google_patent&utm_medium=platform_link&utm_campaign=public_patent_search&patent=CN1006419(B) ("Global patent litigation dataset" by Darts-ip is licensed under a Creative Commons Attribution 4.0 International License.)
Application filed by Individual
Priority to CN 85105556 (patent CN1006419B)
Publication of CN85105556A
Publication of CN1006419B
Expired

Landscapes

  • Machine Translation (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The present invention belongs to the field of Chinese character information processing and has the following applications: (a) as a dictionary lookup method; (b) for computer input of Chinese characters; (c) as a substitute telegraph code. Its main features are: first, a single set of symbols serves as both dictionary code and computer code; second, the same set of symbols applies to both simplified and traditional Chinese characters; third, a component classification system is established according to the characteristics of Chinese character components. The symbols that most closely resemble the component shapes are then selected to form the outline symbols of Chinese characters. Finally, using the natural relations among the symbols, the 47 code elements of the dictionary code are merged into the 31 code elements of the computer code.

Description

Sorting and coding method of Chinese character outline symbol and character element
This invention belongs to the field of Chinese information processing technology. The encoding scheme can be used for computer input of Chinese characters, as a substitute telegraph code, and as a dictionary indexing method.
Prior art references:
"Chinese Character Information Processing", published by Science Press.
Li Jinkai: "Computer Chinese Information Configuration Code Method", Chinese Journal of Computers, Vol. 4, No. 4.
"Collected Papers of the International Symposium on Chinese Information Processing", published by the Chinese Information Processing Society of China, 1983.
Zhu Zilong: "Commentary on the 'day imperial' Computer Chinese Character Input Method", PC World, December 1982, 20.
The scheme of the present invention is as follows:
1. Chinese characters are split into a number of components (i.e. graphemes). The components are classified, and each class of component is represented by one symbol. Some characters, once their components have been split off, still have leftover strokes that belong to no component, so strokes must be classified as well; such a stroke may also be called a "single component". Each class of stroke is likewise represented by one symbol. Components and strokes are collectively called graphemes, and the code names of all graphemes are called "Chinese character outline symbols". This coding method is called the "grapheme (i.e. component) sorting and coding method".
2. Components are divided into 50 classes and strokes into six classes, sharing 47 code names in total. Used as the coding of a dictionary, these 47 code names are called the "dictionary code". The 47 symbols comprise Latin letters, numerals, and several special symbols. The Latin letters appear in uppercase, lowercase, and handwritten forms, such as "L" and "l", "H" and "h", "X" and "x"; the numerals appear as Chinese numerals and Arabic numerals, such as "five" (the Chinese numeral) and 5. These symbols have an inherent correspondence with one another: "H" and "h" form one group, and "five" and "5" form one group. On this basis, the 47 dictionary codes can be merged into 31 groups for computer input (26 Latin letters and five numerals), which is exactly 31 keys. This is the computer code.
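The merging of dictionary-code symbols onto keys described in item 2 can be illustrated with a short sketch. It is not the patent's own table: the names `to_key` and `to_computer_keys` are hypothetical, the Chinese numeral "five" is assumed to be the character 五, the only numeral pairing included is the one the text states ("five" with 5), and handwritten letter forms are left out because they have no plain-text representation.

```python
# Hypothetical sketch: fold dictionary-code symbols onto computer keys.
# Assumption: upper- and lowercase variants of one letter share a key, and a
# Chinese numeral shares the key of the matching Arabic numeral.

# Only the pairing named in the text is listed; the full table is not given.
CHINESE_TO_ARABIC = {"五": "5"}


def to_key(symbol: str) -> str:
    """Return the key that one dictionary-code symbol is merged into."""
    if symbol in CHINESE_TO_ARABIC:
        return CHINESE_TO_ARABIC[symbol]
    return symbol.upper()


def to_computer_keys(dictionary_code: list[str]) -> str:
    """Return the sequence of keys pressed for a list of dictionary-code symbols."""
    return "".join(to_key(symbol) for symbol in dictionary_code)


if __name__ == "__main__":
    print(to_key("h"), to_key("H"))        # both land on the same key
    print(to_computer_keys(["五", "A"]))   # Chinese "five" is typed on the 5 key
```

Folding by case is only one way to express the grouping; a complete implementation would need the full 47-to-31 table shown in Fig. 1, which is not reproduced in the text.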
3. The contents of Tables 1 to 3 and Fig. 1 of the present invention are as follows:
Table 1: stroke classification
Table 2: component classification (4 pages in total)
Table 3: Chinese character component system
Fig. 1: Chinese character outline symbols (comprising the 47 component and stroke code names, merged into 31 keys on the computer)
4. Basic coding rules (rules common to the dictionary code and the computer code):
(1) The order of the code positions follows the stroke order. For example, "ripples": 3 (Rui), V (car), Z (Chuo). The strokes of some components, however, are not all written in one go; other components are inserted in between. In that case, the point at which the first stroke of a component appears is taken as the position of the whole component. For example, "witch" is split into the three components "worker, people, people": the first two strokes of "worker" are written first, then "people, people", and finally the last horizontal stroke of "worker". "Worker" still counts as the first component in the sequence.
(2) For the horizontal stroke of a "dagger-axe" class component: if the stroke extends to the left and there are other strokes above or below it, the horizontal stroke is broken and treated as two horizontal strokes belonging to two separate components. For example, "hiding" should be split into "mouth, ear, dagger-axe", and "force" should be split into "two, only, shoot a retrievable arrow".
(3) Closed forms, crossed forms, three-sided enclosures, and two-sided enclosures are firm structures and cannot be taken apart. For example, "in vain" can only be split into "[missing glyph], day", not into "[missing glyph], Ji"; "ox" can only be split into "[Figure 85105556_IMG19]", not into "[missing glyph], ten"; "[Figure 85105556_IMG21]" can only be split into "[Figure 85105556_IMG22], Jiong", not into "Ren, [missing glyph]"; "ten thousand" can only be split into "one, [Figure 85105556_IMG23]", not into "[Figure 85105556_IMG24], [missing glyph]"; "make" can only be split into "people, Dian, [missing glyph]", not into "people, [Figure 85105556_IMG25] [Figure 85105556_IMG26]". This is because "day" is a closed form, "[Figure 85105556_IMG27]" is a crossed form, and "Jiong, [Figure 85105556_IMG28], [missing glyph]" are three-sided enclosures.
(4) Apart from the rules above, for strokes in a connected, adhering, or separated relation, the needs of the upper component always take priority. For example, "friendship" is split into "six, ㄨ", not into "Tou, father"; "suffering" is split into "vertical, ten", not into "[Figure 85105556_IMG29], do".
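Rule (1) above, which orders code positions by the first stroke of each component, can be sketched as follows. This is an illustrative sketch only: the function name `order_components` and the input format are hypothetical, and the stroke numbers used for "witch" are an assumption that follows the writing order described in the text (two strokes of "worker", then "people, people", then the last horizontal stroke of "worker").

```python
# Hypothetical sketch of basic rule (1): a component's position in the code is
# decided by when its FIRST stroke is written, even if its remaining strokes
# are written after other components.

def order_components(components):
    """components: list of (name, stroke_numbers) pairs.

    stroke_numbers are the 1-based positions of the component's strokes in
    the stroke order of the whole character.
    """
    # sorted() is stable, so components never swap unless their first strokes differ.
    ordered = sorted(components, key=lambda item: min(item[1]))
    return [name for name, _strokes in ordered]


if __name__ == "__main__":
    # "witch": "worker" is interrupted by the two "people" components.
    witch = [
        ("people", [3, 4]),
        ("people", [5, 6]),
        ("worker", [1, 2, 7]),
    ]
    print(order_components(witch))
```

Keying on the minimum stroke number is what makes an interrupted component such as "worker" come first, whatever order the components are listed in.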
5. Rules of the computer code
(1) Single-code-position characters. A character that consists of only one component or stroke is called a single-code-position character. Its code is the component or stroke code name plus the first letter of the character's pronunciation; the letter expressing the pronunciation is the pronunciation code position. For example, "wood" and "end" are both single-code-position characters, and the component code name of both is M. The M on its own is the dictionary code. The computer code adds the pronunciation code position: "wood" is m(m) and "end" is m(w).
(2) Two-code-position characters. In addition to the component code names, a pronunciation code position is added. For example, "rose" is split into "king, The-Fan"; the dictionary code is "five A", the pronunciation code position is "M", and the computer code is "five A(M)".
(3) Three-code-position characters. On a household PC only the three code positions are used. On a professional computer, where the number of duplicate-code characters must be reduced, a pronunciation code position may be added. For example, "eggplant" is split into "Lv, power, mouth"; the dictionary code and the household PC code are both "HXO". The pronunciation code position of "eggplant" is "Q", so the professional computer code is "HXOQ".
(4) Four-code-position characters take no pronunciation code position; the computer code is the same as the dictionary code.
(5) For characters with five or more code positions, the first, second, and third code positions and the last code position are taken. For example, "clamoring" is split into "mouth, mouth, [missing glyph], Jiong, people, mouth, mouth"; the dictionary code is "OOTnROO" and the computer code is "OOTO".
(6) The coexistence of traditional and simplified characters is handled as follows. For the few radicals that occur in a large number of characters, "speech (Yan), gold (Jin), food (Cannibals), Trucks (car), Si (Si), horse (horse)", the traditional form takes the same component code name as the simplified form: "speech, Yan" are both i, "gold, Jin" are both z, "food, Cannibals" are both S, "Si (Si)" are both W, and "horse (horse)" are both 5. When these forms do not serve as the left-hand radical, they are coded according to the ordinary coding rules.
For example:
Other traditional and simplified characters are each split and coded according to their own form. For example:
Figure 85105556_IMG32
In character libraries for special purposes (such as library character sets), where traditional and simplified characters must coexist in the same library, the maximum code length is increased to five positions (for characters exceeding five positions, the first to fourth positions and the last position are taken). Traditional-form radicals take multiple representations, as shown in the following table:
Figure 85105556_IMG33
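Rules (1) to (5) of the computer code can be summarised in a short sketch. It is a reading of the rules under stated assumptions, not the patent's implementation: the function name `computer_code` and its parameters are hypothetical, each code position is assumed to be one character of the input string, and the pronunciation letter is simply appended (the text writes it in parentheses in some examples and without them in others).

```python
# Hypothetical sketch of computer-code rules (1) to (5).
# Assumption: one character of `dictionary_code` equals one code position.

def computer_code(dictionary_code: str, pronunciation: str = "",
                  professional: bool = False) -> str:
    """Derive a computer code from a dictionary code.

    pronunciation: first letter of the character's pronunciation.
    professional:  True for the "professional computer" variant of rule (3).
    """
    length = len(dictionary_code)
    if length == 0:
        raise ValueError("dictionary code must not be empty")
    if length >= 5:
        # Rule (5): first, second, third and last code positions.
        return dictionary_code[:3] + dictionary_code[-1]
    if length == 4:
        # Rule (4): identical to the dictionary code.
        return dictionary_code
    if length == 3:
        # Rule (3): pronunciation letter only on professional computers.
        return dictionary_code + pronunciation if professional else dictionary_code
    # Rules (1) and (2): one or two positions plus the pronunciation letter.
    return dictionary_code + pronunciation


if __name__ == "__main__":
    print(computer_code("HXO", "Q"))                     # household PC
    print(computer_code("HXO", "Q", professional=True))  # professional computer
    print(computer_code("OOTnROO"))                      # seven positions
```

The examples reuse the codes quoted in the text for "eggplant" and "clamoring".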
6. When this coding is used as a dictionary code, it applies equally to Chinese characters, Japanese kanji, and Korean Chinese characters. When it is used as a computer code, changing the pronunciation code position to the first letter of the Japanese pronunciation (in Japanese romanization) yields a computer code for kanji; changing it to the first letter of the Korean pronunciation (in Korean romanization) yields a computer code for Korean Chinese characters.
7. Rules of the substitute telegraph code
With the existing telegraph code, looking up the code for a character is very inconvenient, and in a hurry the character sometimes cannot be found at all. A substitute code unified with the dictionary and the computer would make telegraph use much more convenient.
The substitute telegraph code reuses the computer code. The computer code contains a few dozen duplicate-code characters, which are listed in a duplicate-code table for reference. Within each group of duplicates, a numeral is appended to each character to tell them apart. For example, "drying in the air" and "scape" are both coded "D203"; the telegraph code of "drying in the air" can be stipulated as "D203 1." and that of "scape" as "D203 2.".
Advantages of the present invention
This coding is built on a classification system of Chinese character components; it reflects the objective regularities of Chinese character forms and can be combined with literacy teaching. It applies to simplified and traditional characters alike. The same set of symbols serves as both dictionary code and computer code and, when necessary, as a temporary substitute telegraph code. The coding is also applicable to Japanese kanji and Korean Chinese characters.
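The numbering of duplicate-code characters can be sketched as below. This is an assumption-laden illustration: the function name `telegraph_codes` is hypothetical, the order of characters within a duplicate group is taken from the order of the input (the text only says the numbering "can be stipulated"), and the "code, space, numeral, full stop" format copies the example as printed.

```python
from collections import defaultdict

# Hypothetical sketch: derive substitute telegraph codes from computer codes
# by numbering the characters that share one computer code.

def telegraph_codes(computer_codes: dict[str, str]) -> dict[str, str]:
    """Map each character to a unique telegraph code."""
    groups = defaultdict(list)
    for character, code in computer_codes.items():
        groups[code].append(character)

    result = {}
    for code, characters in groups.items():
        if len(characters) == 1:
            # No duplicate: the computer code is reused unchanged.
            result[characters[0]] = code
        else:
            # Duplicates: append 1, 2, ... in input order.
            for number, character in enumerate(characters, start=1):
                result[character] = f"{code} {number}."
    return result


if __name__ == "__main__":
    codes = {"drying in the air": "D203", "scape": "D203", "eggplant": "HXO"}
    for character, code in telegraph_codes(codes).items():
        print(character, "->", code)
```

Only characters that actually collide receive a numeral, so the bulk of the computer code stays usable as a telegraph code without change.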
Figure 85105556_IMG34
Figure 85105556_IMG36
Figure 85105556_IMG39

Claims (7)

1. The invention belongs to the technology of computer input by Chinese character radical coding, characterized in that:
a. all radicals (i.e. graphemes), rather than only "basic elements of characters", are sorted out from the whole body of Chinese characters in both simplified and traditional forms, constituting a comprehensive radical table of Chinese characters that serves as the Pinxing (shape-spelling) letters of Chinese characters;
b. Chinese character radicals are divided into five types: crossing, separation, adhesion, enclosure, and cabinet frame, forming a classification system that runs from "type" to "formula" to "class";
c. 26 Latin letters and five numerals are adopted as the code names of the radical classes, and the pictographic resemblance between a radical code name and its radicals is used to ease the classification, memorization, and input of radicals.
2. Item a of claim 1 can be achieved because two important rules are applied when splitting all Chinese characters:
(1) Perpendicular crossings are never split. However large the structure, and whether or not it is in common use (e.g. "then, [Figure 85105556_IMG1]", etc.), it counts as a single radical and is not split further;
(2) A left-falling stroke attached to a radical is always split off, except in "standing grain, Chi" (e.g. white = [Figure 85105556_IMG2] Day, tooth = Pie, body = Pie).
Because these two rules are applied, the total number of radicals is compressed from the traditional 600-plus to 300-plus, and the scope of the radicals becomes clear and stable. The result is a complete radical table of Chinese characters rather than a "table of basic elements of characters".
3. Item a of claim 1 can also be achieved because the present invention creates a radical type that reduces the number of radicals: "two parts, left and right, similar in shape and opposite in direction". Every structure with this feature counts as one radical, not two. Radicals of this class include: northern Zhao non-Bo Door [Figure 85105556_IMG5]
4. Item b of claim 1 is subdivided into the following classification system:
Figure 85105556_IMG7
5. Item b of claim 1 can also be achieved because the present invention creates a rule: for characters of the "dagger-axe" group, if there are other strokes above or below the left end of the horizontal stroke, the horizontal stroke is broken and treated as two horizontal strokes (e.g. hides = mouth ear dagger-axe, [missing glyph] = soil mouth dagger-axe, I = [Figure 85105556_IMG9] Rolling dagger-axe, penta = factory dagger-axe, battle-axe used in ancient China = Jin Dagger-axe, a surname = [Figure 85105556_IMG11] minister dagger-axe), so that the characters of the "dagger-axe" group become a regular type.
6. Item c of claim 1 captures the pictographic link between Chinese character radicals and Latin letters or numerals; the most important correspondences are:
Single crossing = X
Three-sided enclosure with one stroke in the middle = E
Three-sided enclosure opening to the right = C
Three-sided enclosure opening upward = U
Three-sided enclosure opening downward = n
Full enclosure containing an upright stroke = Q
Full enclosure with the left stroke hanging down = P
Full enclosure with a non-right-angle frame = A
Separated two-stroke class = 2
Separated three-stroke class = 3
Separated four-stroke class = 4
Separated five-stroke class = 5
"5" cabinet frame class (With horse) = 5
"L" cabinet frame class (seven [Figure 85105556_IMG14] Collect electric セ also ...) = L
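The correspondences of claim 6 amount to a lookup table from shape class to code name. The sketch below transcribes them as data and shows how several classes can land on one key. The English class labels are paraphrases, the names `SHAPE_CLASS_CODES` and `classes_on_key` are hypothetical, and folding letter case to find the key is an assumption based on the merging of letter variants described in the patent.

```python
# Hypothetical sketch: claim 6 as a lookup table (labels are paraphrases).
SHAPE_CLASS_CODES = [
    ("single crossing", "X"),
    ("three-sided enclosure, one stroke in the middle", "E"),
    ("three-sided enclosure opening right", "C"),
    ("three-sided enclosure opening upward", "U"),
    ("three-sided enclosure opening downward", "n"),
    ("full enclosure containing an upright stroke", "Q"),
    ("full enclosure, left stroke hanging down", "P"),
    ("full enclosure, non-right-angle frame", "A"),
    ("separated two-stroke class", "2"),
    ("separated three-stroke class", "3"),
    ("separated four-stroke class", "4"),
    ("separated five-stroke class", "5"),
    ('"5" cabinet frame class', "5"),
    ('"L" cabinet frame class', "L"),
]


def classes_on_key(key: str) -> list[str]:
    """Return the shape classes whose code name is typed on the given key.

    Assumption: upper- and lowercase forms of a letter share one key.
    """
    wanted = key.upper()
    return [label for label, code in SHAPE_CLASS_CODES if code.upper() == wanted]


if __name__ == "__main__":
    print(classes_on_key("5"))  # two classes share the 5 key
    print(classes_on_key("N"))  # the lowercase code name n is found via its key
```

Listing the table as ordered pairs rather than a dictionary keeps both classes that use "5" visible.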
7. Item c of claim 1 can also be achieved because use is made of the fact that a single Latin letter or numeral has several shapes. (For example, the single-crossing class takes X as its code name and the two-stroke reverse-symmetry class takes [symbol not reproduced] as its code name; on computer input, [symbol not reproduced] is merged into the X key. Likewise, the "two is blocked" class takes "five" as its code name and the five class takes "5"; on computer input, "five" is merged into the "5" key.) Latin letters have uppercase, lowercase, and handwritten forms; numerals have Chinese and Arabic forms. The several forms of one character can each serve as the code name of a different radical class, and on computer input the code names that are forms of the same letter are merged into the same key, so that more radical classes can be carried on fewer keys.
CN 85105556 1986-04-30 1986-04-30 Sorting and coding method of chinese character outline symbol and character element Expired CN1006419B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 85105556 CN1006419B (en) 1986-04-30 1986-04-30 Sorting and coding method of chinese character outline symbol and character element

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 85105556 CN1006419B (en) 1986-04-30 1986-04-30 Sorting and coding method of chinese character outline symbol and character element

Publications (2)

Publication Number Publication Date
CN85105556A (en) 1987-06-03
CN1006419B (en) 1990-01-10

Family

ID=4794554

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 85105556 Expired CN1006419B (en) 1986-04-30 1986-04-30 Sorting and coding method of chinese character outline symbol and character element

Country Status (1)

Country Link
CN (1) CN1006419B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1049990C (en) * 1995-05-23 2000-03-01 孙柱章 Oral arithmetic coding method and keyboard thereof
CN1088865C (en) * 1997-05-05 2002-08-07 冯天岳 Computer Chinese character input method of complex character interated code radical type

Also Published As

Publication number Publication date
CN85105556A (en) 1987-06-03

Similar Documents

Publication Publication Date Title
CN1102714A (en) Chinese character input method and keyboard based on two strokes and two-stroke symbol
CN85100837A (en) Optimize the Five-stroke Method compiling method and keyboard thereof
CN1006419B (en) Sorting and coding method of chinese character outline symbol and character element
CN1116335A (en) Chinese character screen-writing input system
CN1022781C (en) Encoding method of Chinese character strokes
CN1032986C (en) Chinese-character stroke order code enter method and its keyboard
CN1029046C (en) Chinese character radicals and strokes input method
CN1349157A (en) Digital configuration code Chinese character input method
CN1111373A (en) Computer Chinese input scheme based on the Chinese Phonetic Alphabet
CN1284066C (en) Three strokes code method for inputting Chinese characters into computer as well as its keyboard
CN1293448C (en) Ten-stroke digital code input method
CN1109957C (en) Chinese character digital coding input method based on Chinese character basic elements and normal parts
CN1744014A (en) Digital two-stroke and Chinese character input method and key board
CN1102716A (en) Method for putting Chinese character into computer by using numerals
CN1458566A (en) Chinese character plain code input method
CN1419179A (en) Chinese characters input method according to stroke sequence and keyboard thereof
CN1043381C (en) Four-stroke digit look-up method for Chinese characters
CN1142474C (en) Dictionary code Chinese character input method
CN1141632C (en) Chinese character two-bit digital code input method
CN1018774B (en) Chinese-character and symbol encode method based on pattern, pronunciation and symbol and keyboard thereof
CN1121006C (en) Chinese-character input method for computer
CN1164690A (en) Unified Chinese characters encoding method and its keyboard
CN1378122A (en) Yi-code input method for Chinese characters
CN1098212A (en) Chinese-charater key-in system using five-stroke combined geometric and phonetic codes
CN1255666A (en) Coding method for Chinese-character input

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C13 Decision
GR02 Examined patent application
C14 Grant of patent or utility model
GR01 Patent grant
C53 Correction of patent for invention or patent application
CB02 Change of applicant information

Address after: 325000 No. 243, Lane 16, Xinhe street, Zhejiang, Wenzhou

Applicant after: Chen Aiwen

Applicant after: Zhou Jingzi

Address before: care of Huang Wenhua, No. 1 Xiluoyuan, Shijia Alley, Dongcheng District, Beijing

Applicant before: Chen Aiwen

Applicant before: Zhou Jingzi

COR Change of bibliographic data

Free format text: CORRECT: APPLICANT ADDRESS; FROM: HUANG WENHUA CARE OF NO.1 XILUOYUAN SHIJIA ALLEY,DONGCHENG DISTRICT, BEIJING TO: 325000 NO.16 LANE 243, XINHE STREET, WENZHOU CITY, ZHEJIANG

C53 Correction of patent for invention or patent application
COR Change of bibliographic data

Free format text: CORRECT: APPLICANT; FROM: CHEN AIWEN; ZHOU JINGZI TO: ZHANGJIAGANG AIWEN COMPUTER LTD.

CP03 Change of name, title or address

Address after: No. 35 East Sha Chau Road, Jiangsu, Zhangjiagang

Applicant after: Aiwen Computor Co., Ltd., Zhangjiagang

Address before: No. 1 Xiluoyuan, Shijia Alley, Dongdan, Beijing

Applicant before: Chen Aiwen

Applicant before: Zhou Jingzi

C17 Cessation of patent right
CX01 Expiry of patent term