CN1499357A - Method for labelling united character and word as well as character patterns and character picture - Google Patents


Info

Publication number
CN1499357A
Authority
CN
China
Prior art keywords
mark
words
word
pronunciation
code
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA02147477XA
Other languages
Chinese (zh)
Inventor
李成跃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CNA02147477XA
Publication of CN1499357A
Legal status: Pending

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

The "character part" and the "label part" (which denotes pronunciation and information code) of a character or word are coupled into an integral whole, and that whole serves as the written (or technical) symbol for recording language (or processing information), recognizable by both people and machines. The invention labels the attributes of characters and words that matter for language and for information processing in a conjoined, comprehensive and visual way; it belongs to the fields of language and script, information technology and printing technology. It helps standardize and modernize language and script, strengthen Chinese spelling and the learning and use of Mandarin, promote the standardization of input codes, and unify and simplify simplified, traditional and variant characters.

Description

Conjoined labelling method for characters and words, and its character patterns and character pictures
The invention consists of a conjoined labelling method for characters and words, labelled character patterns, labelled character pictures, and Hanyu Pinyin character patterns and character pictures. It labels the attributes of characters and words relating to language and script and to information processing in a conjoined, comprehensive and visual way, and it supplies conjoined-label character patterns and character pictures for language and script, information technology and printing technology, the fields to which it belongs.
For convenience, the terms used in this specification are explained first. "Characters and words" covers Chinese characters as well as the characters or words of other languages; this specification uses Chinese characters as its example, and the approach applies to other languages by analogy. "Labelling" means annotating the characters and words in a text. The "conjoined labelling method for characters and words" couples a character or word with its label into a single whole in form and uses that whole as the written (or technical) symbol for recording language (or processing information). "Conjoined labelling of characters and words" is abbreviated "conjoined labelling" or, where no ambiguity arises, simply "labelling", as in "labelled Chinese character", "labelled character pattern" and "labelled character picture". A "character pattern" is the dot-matrix graphic of a character or word used as a written (or technical) symbol. A "character picture" is the colour graphic of a character or word used as a written (or technical) symbol. "Chroma" means the depth of a colour. An "attribute" of a character or word describes one or more items of information about its pronunciation, form and information code. The "information code" is the input code used to enter the character, the machine code used for machine recognition and processing, or a combination of the two.
The "machine code" is a graphic code used for recognition and processing of characters and words by people and machines. It describes one or more of the following: the database code, the pronunciation code, the "personalization" information code of the character or word, and the data code used to synthesize its character pattern (or character picture); it may also describe other information. The database code provides an "address" (code) for looking up information such as the pronunciation, form (including character pattern and character picture), part of speech and meaning of a character or word; it follows a fixed format, or is simply a database record number, a character-set code, or an internal code. The "pronunciation code" is a code that describes the pronunciation of a character or word, a code generated from that pronunciation, or a code giving the serial number of the pronunciation; where no ambiguity arises it is used as a general term.
Existing ways of labelling characters and words include character dictionaries, word dictionaries, "interlinear notes" and "pronunciation labels". Dictionaries give standardized notes on the pronunciation, form, meaning and usage of characters and words. "Interlinear notes" are annotations placed within the body text; double-line interlinear notes, for example, annotate a character or word with two lines of small type inside the text. A "pronunciation label" places the pronunciation above, below or after the character or word. In all of these methods the character and its label are separate in form and are not bound into one unit, which makes input and typesetting inconvenient. Whether a character is labelled at all is arbitrary, which does not help strengthen language standardization. In terms of content, no information code is labelled visually, which hinders coded input, machine recognition and processing.
In personal use, Chinese involves several kinds of symbols: Chinese characters, Hanyu Pinyin and Chinese character input codes. These symbols are independent of one another in form, which makes it hard to standardize the use of Chinese symbols.
Chinese character input codes are symbol codes added specifically for entering Chinese characters. They stand apart from both the characters and Hanyu Pinyin, and the encoding schemes are numerous, numbering in the hundreds. Users must master one or more of them. Coded input requires constantly "translating" characters into codes, which is tiring. How to derive the code has to be memorized, and it is forgotten again after a period of disuse. Because these input codes are not labelled together with the characters, they require memorization and "translation", which makes them harder to learn and use. Standard phonetic codes, shape codes and numeric codes have reportedly been released, but promoting and applying them still requires an intuitive form of presentation.
Chinese characters are hard to read, hard to write, hard to remember, hard to input, hard to retrieve, and hard for machines to recognize and process (the "six difficulties").
Chinese characters are ideographs with an artistic quality, which shows in the "personalization" of character forms. Under existing conditions, however, characters must be given a fixed, standardized form to suit machine recognition, and such fixed forms do not readily express the "personalized" artistic character of the script. Dozens of typefaces are now available to choose from, but none of them is the user's own handwriting style.
Hanyu Pinyin is the statutory tool for labelling the pronunciation of Chinese characters and popularizing Mandarin. However, existing "pronunciation labels" are not joined to the character in form, which makes input and typesetting inconvenient. Their range of application is narrow, usually limited to teaching reading through Pinyin. Wider social use lacks an application form in which character and label are tightly joined in form, which does not help strengthen pronunciation standards. Under existing conditions, Pinyin itself is not input, output or displayed smoothly: its typeface and type size match the characters poorly, so it looks awkward and out of place, and entering it on a computer is cumbersome.
Learning to read through Pinyin is not synchronized with training in character input, and Mandarin that has been learned is easily forgotten again. One reason is that character, pronunciation and input code are not joined together in form.
Information processing requires that certain practical attributes of characters be standardized. Using these standards requires memorization, because the attributes cannot be seen directly in the form of the character. For example, when the components of the character 想 ("think") are specified as 木, 目, 心, the coding components are 木, 目, 心; when they are specified as 相, 心, the coding components are 相, 心. Likewise, 幕 ("curtain") and 暮 ("dusk") are both spelled "mù" in Hanyu Pinyin, so the "phonetic code" of their input codes is identical; if a "shape code" is added to tell them apart, the distinguishing feature has to be memorized.
Existing machine recognition and processing of characters, which covers both printed typefaces and individual handwriting, generally works on the "image features" of characters. The character is the basic unit of recognition: there are as many "image features" to recognize as there are characters. The same character has different image features in different typefaces, and even within one typeface the features are not necessarily identical at different type sizes. Chinese has nearly 90,000 characters, so the image features are numerous and varied, and the recognition workload is large. The diversity of printed typefaces and the "personalization" of individual handwriting are an inevitable part of the development of script and its technology, and will keep increasing the difficulty of machine recognition and processing.
Existing machine reading of characters generally works on their "internal codes". The character is the basic unit of reading: there are as many internal codes to process as there are characters. If the internal code of a character is unknown, machine reading is difficult.
Existing character patterns do not combine a character with its labelled attributes, which hinders input and typesetting, does not help strengthen language standardization, and hinders coded input, machine recognition and processing.
Existing character patterns do not lend themselves to information security management or copyright protection. They are generic. For example, once the typeface is fixed, the dot-matrix code of the character 永 ("forever") is fixed within a given information system and can be used on any terminal of that system. This "generality" works against information security management and copyright protection. Existing character patterns carry no "individual" marking. In real life, a written 永 does carry such markings: who wrote it, in whose hand, whether it may be made public, whether it is licensed, who published it, who printed it, who copied it, and so on. Under existing conditions, however, everyone within the same information system can use this 永, and every 永 in the same typeface is identical in form, with no "individual" marking. Advances in information technology have made machine recognition and processing of text, and the copying and publishing of text, very convenient. This convenience has a side that works against information security management and copyright protection. Without "individual" marking, character patterns make it hard to establish legal responsibility, to prevent forgery, and to curb piracy and illegal copying. Character patterns need a defined "switch" for "individual" marking and need to carry "personalization" information. Existing "encryption" technology solves the problem that a text must not be viewed without permission, but not the problem that a text may be viewed yet must not be illegally modified or copied.
The prior art includes a "combined character" scheme that synthesizes Chinese character glyphs from component character patterns, which would simplify or improve existing Chinese font libraries. The Chinese part of a font library could then hold only a small number of component patterns, or only component patterns plus the patterns of commonly used characters. However, existing character patterns and character pictures cannot supply the pattern-synthesis data codes that machine recognition and input would need under this scheme.
Existing character pictures are mostly used in children's books. These "coloured text pictures" illustrate the pronunciation, strokes, structure and spelling of the text and help children learn to read through Pinyin. But they are only tools for "illustrating" script and themselves still need to be explained by text; they are not used to record language and carry no meaning as information codes.
Current practice in information processing and script reform needs a form that helps Chinese characters, Hanyu Pinyin and input codes merge with one another, promote one another and develop together, and that serves practical applications while keeping long-term goals in view.
Chinese characters exist in simplified, traditional and variant forms. Some characters have a simplified and a traditional form, some have a standard form and variants, and some have all three. For convenience these are referred to collectively as "the simplified, traditional and variant forms of Chinese characters"; in essence, one character has several forms. In existing Chinese information technology, the simplified, traditional and variant forms of one character are treated as several different "characters" with several character-set codes, which increases the number of characters to be processed. Further simplification of characters and the creation of new ones are constrained by existing character-set encodings. Existing character sets cannot assign them a "seat" promptly, and even after a seat is assigned it takes a long time to become established by usage, because the seat is not labelled visually on the character pattern and character picture. Moreover, a character 大 made by Zhang San and a character 大 made by Li Si are treated by a computer under existing conditions as two different characters even if they are identical in form, because their seats in the character set differ. There is currently no way to label the registration code of Zhang San's character on its character pattern (or character picture) so that Li Si knows of it and need not create the character again, nor to have Li Si's character labelled with the same registration code as Zhang San's so that the two are processed as one and the same character.
The purpose of the invention is to provide a conjoined labelling method for characters and words, labelled character patterns, labelled character pictures, and Hanyu Pinyin character patterns and character pictures, in order to:
(1) label the attributes of characters and words in a conjoined, comprehensive and visual way;
(2) unite character, pronunciation and information code in one;
(3) reduce the difficulty of learning, using and code-inputting characters;
(4) reduce and overcome the "six difficulties" of Chinese characters, make the spelling of characters more determinate, and improve the input, output and display of Hanyu Pinyin;
(5) strengthen language standardization;
(6) promote the standardization of coded input;
(7) improve character recognition and machine reading technology;
(8) help strengthen information security management and copyright protection;
(9) give information processing and script reform a practical form that helps Chinese characters, Hanyu Pinyin and input codes merge and develop together, serving practical applications while keeping long-term goals in view;
(10) realize "different forms, same code" for the simplified, traditional and variant forms of Chinese characters, and provide a practical form for the further simplification of characters, the use of newly created characters, and the improvement of Chinese character-set encoding;
(11) provide the conditions for machine recognition and input of "pattern synthesis" data;
(12) personalize the output and display of characters;
(13) strengthen the role of character patterns and character pictures in information processing.
The objects of the invention are achieved as follows:
(1) Conjoined, comprehensive and visual labelling of character attributes. The invention uses the conjoined labelling method and the corresponding character patterns and character pictures to label, in a conjoined, comprehensive and visual way, the attributes of characters and words relating to language and script and to information processing.
In structure, the labelling is conjoined. A conjoined label and its character pattern and character picture have a "character part" and a "label part", which are coupled into one whole in form. Character and labelled attributes appear together as a unit: wherever the character is, its labelled attributes are too. Used in this integral form as the written (or technical) symbol for recording language (or processing information), they realize the conjoined labelling of characters and words.
In content, the labelling is comprehensive. The attributes labelled comprise the pronunciation, form and information code of the character; or only its pronunciation and form; or only its information code and form; or only its information code.
In method, the labelling is visual, which makes recognition and processing convenient for both people and machines. Labelling materials that people or machines can recognize easily (colour, chroma, codes, marks, characters and their deformations, or fingerprints, watermarks, magnetic ink and the like) are selected and defined as the signs of the label. These materials make the label pictorial: it can vividly show the selected attributes of a character, giving a visual label for one or more attributes such as pronunciation, spelling, strokes, stroke order, structural style, component composition, component attribution, selected components, input code and machine code. The labelled object and the label content merge into one. The character part may take a labelled form that pictures the selected attributes, including one or more of spelling, strokes, stroke order, structural style, component composition, component attribution and selected components; for example, the labelled form of a character can indicate its coding components. In the label part, the "pronunciation input code" label can use labelling materials such as colour, chroma or character deformation to show the pronunciation and syllable tone directly, or to indicate the "short code" of the input code.
In combination (and use), the labelling is flexible. The "character part" may appear as the original form of the character or as its labelled form. The "label part" states the pronunciation and information code of the character (the information code comprising "input code" and "machine code"). It may contain both "pronunciation input code" and "machine code", only "pronunciation input code", only "pronunciation (input code implied)", only "machine code", or follow some other mode. The machine code may be labelled visibly or as a hidden mark. The pronunciation spelling and the input code may be joined, giving the "pronunciation input code" mode, or not joined, giving for example the "pronunciation (input code implied)" mode. In the "pronunciation input code" mode, the front part is the pronunciation of the character and the back part is the input code; or the front part is the pronunciation and the back part is a shape-meaning feature code that distinguishes homophones, the two parts together forming the input code. The two parts may be separated by a symbol or not, and the separator can be defined and chosen as needed. The machine code is a graphic code for recognition and processing by people and machines. In form, it defines marks such as a "recognition starting point" and "vertical and horizontal recognition references", or other marks, or no such marks. In content, it describes one or more of the database code, the pronunciation code, the "personalization" information code of the character, and the data code used to synthesize the character pattern (or character picture), or it describes other information. The database code provides an "address" (code) for looking up information such as the pronunciation, form (including character pattern and character picture), part of speech and meaning of a character; it follows a fixed format, or is simply a database record number, a character-set code, or an internal code. The "personalization" information code describes one or more items registered with an authority, covering works, publications, printing and copying equipment, special typefaces, notarial seals and passwords, or other user-defined information. The relative position of the "character part" and the "label part" (above or below, before or after) can be set as needed. In combination, both parts may appear in full, or the "character part" may appear with the "input code" alone or with the "machine code" alone.
In application, the labelling takes diverse forms: labelled character patterns, labelled character pictures, Hanyu Pinyin character patterns and character pictures, or other forms, each giving a conjoined visual expression of one or more attributes of a character.
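The label modes described above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class name, the field names, the bracketed text rendering, the feature code "jin" and the use of a Unicode code point as a stand-in for the database code are all hypothetical and do not come from the specification, whose conjoined labels are graphical character patterns and character pictures rather than text strings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ConjoinedLabel:
    """A character part plus its label part (all names here are hypothetical)."""
    character: str                        # the character part, e.g. "幕"
    pronunciation: Optional[str] = None   # Pinyin syllable, e.g. "mù"
    feature_code: Optional[str] = None    # shape-meaning feature for homophones
    machine_code: Optional[str] = None    # code intended for machine recognition

    def input_code(self, separator: str = "") -> Optional[str]:
        """Pronunciation first, feature code second, with an optional separator."""
        if self.pronunciation is None:
            return None
        if self.feature_code is None:
            return self.pronunciation
        return f"{self.pronunciation}{separator}{self.feature_code}"

    def render(self, separator: str = "") -> str:
        """Return character part and label part as one inseparable text unit."""
        parts = [p for p in (self.input_code(separator), self.machine_code) if p]
        return f"{self.character}[{' | '.join(parts)}]" if parts else self.character


# Stand-in database code: the Unicode code point of the character (an assumption).
db_code = format(ord("幕"), "04X")

full = ConjoinedLabel("幕", "mù", "jin", db_code)      # input code and machine code
sound_only = ConjoinedLabel("幕", "mù")                # pronunciation, input code implied
machine_only = ConjoinedLabel("幕", machine_code=db_code)

for label in (full, sound_only, machine_only):
    print(label.render("-"))
```

Each instance corresponds to one of the modes listed in the text: both "pronunciation input code" and "machine code", pronunciation alone, or machine code alone. Because the label is stored with the character, the two always travel together, which is the point of the conjoined structure.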
(2) Uniting character, pronunciation and information code. The invention uses conjoined labels and the corresponding character patterns and character pictures to label, in a conjoined, comprehensive and visual way, the attributes of characters relating to language and script and to information processing. In content and structure, character, pronunciation and input code are united in one. The labelled form of a character can express its strokes, components and structure and can also imply (hint at) its input code. The pronunciation label and the input code merge into one. A labelled character pattern or character picture is one whole in form; the whole conjoined label is the unit of application and serves as both written symbol and technical code.
(3) Reducing the difficulty of learning, using and code-inputting characters. The labelled form of a character pictures attributes such as spelling, strokes, stroke order, components, structure and component attribution, which helps in learning and understanding it. Labelling the pronunciation helps in reading. Because the input code is labelled alongside the text, one can type from the text without memorizing the code. Combining the input code with the pronunciation reduces the difficulty of coded input.
(4) Reducing and overcoming the "six difficulties" of Chinese characters, making the spelling of characters more determinate, and improving the input, output and display of Hanyu Pinyin. For "hard to read", the character is labelled with its pronunciation. For "hard to write" and "hard to remember", the stroke order, structure and component composition are labelled on the labelled form of the character, making it easier to learn and use. If someone truly cannot write or remember the form of a character, Hanyu Pinyin can be used to express the character as its pronunciation code and to record Chinese with pronunciation codes; alternatively, working from labelled text, the pronunciation code or input code labelled on the character makes it easy to "write" and "remember" it and also to input it. For "hard to input", the labelled form indicates the components from which the code is taken, and the input code is labelled directly beside the character. Where no labelled text is available for reference, or the character is not even recognized, the pronunciation code can be typed directly and the character chosen from the prompts of the input method, or the text can be expressed directly in "Hanyu Pinyin" mode. For "hard to retrieve", the pronunciation label, input code and machine code make retrieval by people and machines convenient. For "hard for machines to recognize and process", given the large number of characters, each character is labelled with a machine code: a graphic code dedicated to recognition and processing by people and machines, which makes machine recognition convenient.
The "pronunciation input code" of a conjoined label merges the pronunciation spelling of the character with its input code, which helps widen the range of application and the role of Hanyu Pinyin. It uses the "pronunciation plus shape-meaning feature" format: the front part is the pronunciation of the character and the back part is a shape-meaning feature code that distinguishes homophones, with or without a separating symbol, so that the description of a character is uniquely determined. This coding format offers one approach to overcoming the shortcoming of Hanyu Pinyin that one spelling corresponds to many characters. For Hanyu Pinyin character patterns and character pictures, each Pinyin syllable is made into one character pattern or character picture, and Pinyin is input and output using that whole. The output of Pinyin syllables then matches the characters in typeface and type size and looks pleasing, and the input of Pinyin syllables becomes convenient and concise.
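The effect of the "pronunciation plus shape-meaning feature" format on homophones can be illustrated with a short lookup sketch, using the characters 幕 ("curtain") and 暮 ("dusk"), both read mù. The tone-number spelling "mu4" and the single-letter feature codes "j" and "r" are invented for this example; the specification does not fix any particular feature alphabet.

```python
from collections import defaultdict

# Illustrative entries: (character, pronunciation code, shape-meaning feature).
# The feature letters are hypothetical, chosen only to tell the homophones apart.
ENTRIES = [
    ("幕", "mu4", "j"),
    ("暮", "mu4", "r"),
]


def build_index(entries, use_feature):
    """Map each input code to the list of characters it selects."""
    index = defaultdict(list)
    for char, sound, feature in entries:
        key = sound + feature if use_feature else sound
        index[key].append(char)
    return dict(index)


by_sound = build_index(ENTRIES, use_feature=False)
by_code = build_index(ENTRIES, use_feature=True)

print(by_sound)  # {'mu4': ['幕', '暮']}
print(by_code)   # {'mu4j': ['幕'], 'mu4r': ['暮']}
```

With the pronunciation alone, one code selects both characters and the user must choose between them; with the feature appended, each code selects exactly one character, which is the unique determination the text describes.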
(5) Strengthening language standardization. Conjoined labelling labels the standards and recommendations for the pronunciation, form and input code of characters in a conjoined, comprehensive and visual way. Because character and label are bound together in form, arbitrariness in applying language standards is curbed and standardization is enforced more strongly. For example, with labelled Chinese characters, whenever you encounter a character it actively offers you its Mandarin pronunciation and recommends a particular input code, which helps popularize Mandarin and widen the range of application and the role of Hanyu Pinyin.
(6) Promoting the standardization of coded input. Chinese character input codes come in many schemes, which still need to be selected among and standardized. Labelling input codes in a conjoined, comprehensive and visual way helps in learning and applying them. The invention recommends an input code in the "pronunciation plus shape-meaning feature" format, a Chinese character input code that conforms to existing language standards and is combined with Hanyu Pinyin.
(7) Improving character recognition and machine reading. The invention labels the machine code together with the character form. The machine code is a graphic code dedicated to recognition and processing by people and machines. In form, it defines marks such as a "recognition starting point" and "vertical and horizontal recognition references", or other marks. In content, it describes one or more of the database code, the pronunciation code, the "personalization" information code of the character, and the data code used to synthesize the character pattern (or character picture), or it describes other information. The database code provides an "address" (code) for looking up information such as the pronunciation, form (including character pattern and character picture), part of speech and meaning of a character; it follows a fixed format, or is simply a database record number, a character-set code, or an internal code. The "personalization" information code describes one or more items registered with an authority, covering works, publications, printing and copying equipment, special typefaces, notarial seals and passwords, or other user-defined information. With the machine code, machine recognition of the "graphic features" of characters is reduced to recognition of their "machine codes". This changes the way characters are recognized by machine, reduces the recognition workload, and helps improve the accuracy and reliability of recognition. The specific improvements are:
1. Fewer items to recognize. Chinese has nearly 90,000 characters and dozens of printed typefaces, so existing recognition of printed text must handle millions of "graphic features". With a hexadecimal machine code there are only 16 "graphic codes", so only 16 "graphic features" need to be recognized.
2. Recognition of the machine code is unaffected by typeface and type size. The "graphic features" of a character differ between typefaces and are not exactly the same at different type sizes. The machine code marks the recognition starting point of the graphic code and the vertical and horizontal recognition reference marks. These reference marks indicate the position and size of each "digit" of the graphic code in different typefaces and type sizes, as a reference for machine recognition. With the machine code, machine recognition of characters is unaffected by typeface and type size.
3. Simpler "normalization" of Chinese characters. Existing machine recognition must normalize before extracting "image features", including position normalization and size normalization, so that characters of every size are recognized correctly. The recognition starting point and the vertical and horizontal reference marks of the machine code simplify this step: normalizing millions of Chinese character "graphics" is reduced to normalizing 16 "graphic codes".
For machine reading, the machine code lets the computer process the "pronunciation code" labelled on the character instead of its "internal code", which reduces the processing load. For example, machine reading of Chinese characters is reduced from processing nearly 90,000 "internal codes" to processing some 1,300 "pronunciation codes".
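The reduction to 16 "graphic codes" can be made concrete with a minimal sketch of a hexadecimal machine code. The start mark ">", the fixed width and the function names are assumptions for illustration; the specification describes graphical marks (a recognition starting point and reference marks), not this text encoding. The sketch also shows a small piece of arithmetic implied by the figures above: four hexadecimal digits address only 65,536 values, so five are needed to cover nearly 90,000 characters.

```python
HEX_SYMBOLS = "0123456789ABCDEF"  # the only 16 symbols a recognizer must know
START_MARK = ">"                  # hypothetical stand-in for the recognition starting point


def encode_machine_code(database_code: int, width: int = 5) -> str:
    """Write a database code as a start mark followed by `width` hex digits."""
    if not 0 <= database_code < 16 ** width:
        raise ValueError("database code does not fit in the chosen width")
    digits = []
    for _ in range(width):
        database_code, remainder = divmod(database_code, 16)
        digits.append(HEX_SYMBOLS[remainder])
    return START_MARK + "".join(reversed(digits))


def decode_machine_code(marked: str) -> int:
    """Recover the database code by reading one hex symbol at a time."""
    if not marked.startswith(START_MARK):
        raise ValueError("missing start mark")
    value = 0
    for symbol in marked[len(START_MARK):]:
        value = value * 16 + HEX_SYMBOLS.index(symbol)
    return value


code = encode_machine_code(89_999)
print(code)                       # >15F8F
print(decode_machine_code(code))  # 89999
print(16 ** 4, 16 ** 5)           # 65536 1048576
```

However many characters or typefaces exist, the decoder only ever distinguishes the 16 symbols in `HEX_SYMBOLS`; the character count affects the number of digits, not the size of the symbol set to be recognized.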
(8) It helps strengthen information security management and copyright protection. Conjoined labelling provides the conditions for "personalized" use of characters and words, and such personalized use supports information security management and copyright protection. The labelling materials include color, shade, codes, marks, characters and their deformations, as well as fingerprints, watermarks, magnetic ink and the like; the user can choose among them and can define what each of them means. Some labelling materials, such as fingerprints, watermarks and magnetic ink, have an information security function in themselves. The glyph of a character can carry a personalized label. In the labelled form of a character, the "text" can be deformed or recolored, codes and marks can be added to the glyph, labelling materials such as handprints, watermarks and magnetic ink that people and machines can recognize can be used, and a personalized glyph (handwriting) picture can be used, so that the form of the character is personalized. In addition, the "character part" of a conjoined label may contain a glyph or contain none (blank), in which case it is output and displayed with a font file of a "special agreement". The personalization of the character part can serve as the "glyph password" of the character. The glyph password can also be expressed directly in the "personalized" information code of the machine code, for recognition and processing by people and machines. The machine code thus strengthens the "generality" of machine processing and, even more, the "personalization" of information security management. Personalized information, such as the registration of a work, whether disclosure is allowed, whether use is licensed, and whether the character pattern (or character picture) is a dedicated one, can be labelled in the character pattern and character picture of every character. This information is expressed in the personalized information code of the machine code, for recognition by people and machines. The personalized information code describes one or more items of information registered with an authority, covering the work, the publication, the printing and copying equipment, the dedicated typeface, the notarial seal and the password, or other user-defined information. Without clear responsibility and authorization, machine input and output, recognition and reading of the labelled text are restricted. Conjoined labelling can provide the conditions for preventing illegal copying (for example photocopying or scanning). Unauthorized copying is like wearing stolen clothes marked with someone else's name, because the character patterns (or character pictures) of the copy carry another person's password and mark registered with an authority. Such copying only extends the author's influence. The machine code of the characters also makes investigation easier. Manufacturers of copying and scanning equipment can use the personalized information labelled in the machine code, for example by recognizing a password (or mark) that says whether copying is allowed, so that their equipment can prevent illegal copying. Alternatively, the equipment can stamp its own mark on each copy; this mark should include the equipment's registration code with an authority, so that responsibility is clear. Conjoined labelling can also provide the conditions for stronger copyright management. If royalties could be paid directly to the author on the basis of the personalized information labelled in the machine code of the work's character patterns (or character pictures), that too would protect copyright. Labelling personalized information on the characters not only helps copyright protection but can also make the legal responsibility for a text clear: who wrote it and who published it can both be expressed in the personalized information. Using labelling materials such as fingerprints, watermarks and magnetic ink in conjoined labelling provides anti-counterfeiting and strengthens information security. By labelling a fingerprint or mark in the character part, or an information password registered with an authority in the personalized information of the machine code, a character pattern (or character picture) can be made dedicated. With dedicated character patterns and character pictures it is clear who wrote, who published, who printed and who copied a text.
The character "永" (yong, "forever") again serves as the example. 1. Protecting copyright at the point of machine-recognition input. Suppose the glyph of "永" has been given a personalized label or a glyph password, for example its top dot has been deformed, recolored or given some other mark. If no machine code is labelled, people can read this "永" but it cannot be input by machine recognition. If a machine code is labelled and encrypted, a password is needed before it can be input by machine recognition. If the glyph is absent (blank) and the machine code is encrypted, people cannot read this "永", and both a password and the font file of the "special agreement" are needed before it can be input by machine recognition and output for display. 2. A personal seal. The glyph of "永" is given a personalized label: the person's handwriting, together with a picture of his or her handprint or some other mark such as text, generates a personal glyph password. A statement of personalized information is added to the machine code, for example the time of signing (or use) and the character codes at specific (or selected) positions in the text, to generate a password (or mark) that is bound to the content of the text and is subject to registration (or notarization) with an authority, for recognition by people and machines. Personalizing both the character part and the label part makes this "永" unique, legally valid and bound to the labelled text. If this "永" is copied into another text or used in an unlawful setting, it loses its effect. Where legal responsibility must be made clear, every character in a text can carry this kind of conjoined label. 3. Dedicated character patterns (or character pictures). To prevent piracy (or unauthorized printing) and to make professional responsibility clear, publishers and printers can use their own dedicated character patterns (or character pictures). The machine code of each character pattern (or character picture) is labelled with a password and mark registered with the competent authority, for recognition by people and machines.
(9) For information processing and script reform, it provides a form of practice that helps Chinese characters, the Chinese phonetic alphabet and input codes merge and develop together, and that serves both present-day applications and the long-term goal. The long-term goal of information processing and script reform should be to follow the phonetic-spelling direction common to the world's scripts; the present task should be to simplify Chinese characters, promote the Chinese phonetic alphabet, popularize Mandarin and standardize the existing Chinese notation schemes. Conjoined labelling helps these schemes merge: in form, it joins the Chinese character, its phonetic spelling and its input code together; in function, they complement one another. For example, the character part indicates the "code-taking" component (the selected component) and at the same time indicates the "component attribution" of the character; the pronunciation label is combined with the shape-meaning feature code to form the input code. Conjoined labelling brings the Chinese character, the Chinese phonetic alphabet and the input code, each with its own standard, into one body where they refer to one another, which helps the schemes develop together. Annotating Chinese characters helps strengthen their standardization and modernization; combining the Chinese phonetic alphabet with character input improves the determinacy of spelling and widens the scope and role of the phonetic alphabet; linking the input code to pronunciation helps simplify and standardize input coding schemes and helps recommend suitable ones. The character part and label part of a conjoined label can be arranged in several ways. Of these, the horizontal arrangement of label part and character part is the most helpful for practising the phoneticization of Chinese characters. For example, for the character "树" (shu, "tree"), the character part is "树", the label part (pronunciation input code) is "shù`mu", and the horizontal arrangement is "shù`mu 树". As a present-day application, this arrangement combines the pronunciation, glyph and input code of the character, which helps strengthen the standardization of language, script and input coding. As a long-term plan, it supports phoneticization: as the character part on the right is gradually simplified and the label part on the left gradually becomes established by convention, a smooth transition to phonetic spelling can be achieved, which is the aim of script reform. Describing a Chinese character in the form "pronunciation plus feature code", as in "shù`mu" (树), makes its spelling in the Chinese phonetic alphabet uniquely determined. At present, conjoined labelling helps strengthen the standardization of language, script and information technology, helps popularize Mandarin, and helps widen the scope and role of the Chinese phonetic alphabet.
(10) It provides a form of practice for giving the simplified, traditional and variant forms of a Chinese character the same code ("different shapes, same code"), for the continued simplification of Chinese characters and the use of newly created characters, and for improving Chinese character set coding. Conjoined labelling treats the simplified, traditional and variant forms of a Chinese character as several different "typefaces" of one character and provides a corresponding font file for each of them. As far as possible it labels the simplified, traditional and variant forms with the same input code. In the machine code it assigns the simplified, traditional and variant forms of a character the same database query "address" (code), and it places their character patterns and character pictures in different "fields" at that address. These font files or fields can be named "simplified", "traditional", "variant", or "variant 1", "variant 2" and so on, or be represented by corresponding codes. This practice is called "different shapes, same code" here. If a Chinese character has simplified, traditional and variant forms, the code of its simplified (or standard) form in the character set can be used as the query address of the character in the database. In conjoined labelling, the input codes of commonly used characters that have simplified, traditional and variant forms can be handled as follows. (1) If the simplified, traditional and variant forms of a character share a shape-meaning feature, that shared feature is used as the shape-meaning feature of the input code, so that the input codes of all the forms are identical. (2) If they share no shape-meaning feature, a common "character combination" is designated in the full (principle) form of their input codes and used as the input code, so that the input codes of all the forms are again identical. After treatment (1) or (2), the forms of a commonly used character have the same pronunciation, the same input code and the same database query address, which encourages unification and simplification in use. Alternatively, a typeface marker code indicating "traditional", "variant", or "variant 1", "variant 2" and so on can be appended to the tail of the input code to help select the font file.
"Different shapes, same code" also provides a form of practice for improving existing Chinese character set coding. It removes the "seats" that traditional and variant forms occupy in the existing Chinese character set and gives them the code of the simplified (or standard) form, which helps simplify the character set; not letting "dead characters" occupy legitimate seats in the character set helps standardize the language and script. For the input and output display of uncommon Chinese characters, including their simplified, traditional and variant forms, the invention suggests the following. For input, either use a code of the form "pronunciation plus shape-meaning feature", or use "combined character pattern (or character picture) data codes", that is, input the synthesis data codes of the basic components of the character one after another (using known techniques) as its input code. For output display, synthesize the character from the character patterns (or character pictures) of its basic components (using known techniques).
Newly created characters and newly simplified characters use conjoined labelling: their registration information and database address are labelled in the machine code, and their pronunciation and input code are labelled in the pronunciation input code, which avoids duplicate creation and helps them spread. The registration number of a newly created character should follow a statutory format. The code in this format is labelled separately in the machine code, or is combined with the database code in the machine code. The format should be open to users and usable for generating a registration-number query code, with which all the information on a newly created character can be looked up. Every published new character should be registered, to avoid duplicate creation and to benefit language standardization; before creating a character, one should first check with the registration-number query code whether the character already exists. For newly simplified characters, labelling the pronunciation, input code, database code, pronunciation code and so on by means of the pronunciation input code and the machine code helps recognition by people and machines and helps simplified characters spread.
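The "different shapes, same code" arrangement described above (one query address, one shared input code, one field per form) can be pictured with the following Python sketch. It rests on several assumptions that are not in the source: the Unicode code point of the simplified form is used as the database "address", an in-memory dictionary stands in for the database, and the names `CharacterRecord`, `DATABASE`, `FONT_MARKERS`, `register` and `lookup` are hypothetical. Only the marker "f" for the traditional form is taken from the description of Fig. 4; how the marker is attached to the input code is not specified, so it is passed as a separate argument.

```python
from dataclasses import dataclass, field


@dataclass
class CharacterRecord:
    """All forms of one character, stored at a single database address."""
    input_code: str                            # shared by every form
    forms: dict = field(default_factory=dict)  # field name -> glyph


DATABASE = {}  # address -> CharacterRecord (stand-in for the character database)

# Typeface marker codes; only "f" (traditional) appears in the source.
FONT_MARKERS = {"": "simplified", "f": "traditional"}


def register(simplified, input_code, **other_forms):
    """Store every form under the address of the simplified form."""
    address = ord(simplified)  # assumption: code point used as the address
    forms = {"simplified": simplified, **other_forms}
    DATABASE[address] = CharacterRecord(input_code, forms)
    return address


def lookup(input_code, marker=""):
    """Return the glyph selected by a shared input code plus a typeface marker."""
    field_name = FONT_MARKERS[marker]
    for record in DATABASE.values():
        if record.input_code == input_code:
            # Fall back to the simplified form when the requested form is absent.
            return record.forms.get(field_name, record.forms["simplified"])
    raise KeyError(input_code)


register("树", "shù`mu", traditional="樹")
print(lookup("shù`mu"))       # simplified form
print(lookup("shù`mu", "f"))  # traditional form, same address and input code
```

Both forms share the component "木", so this corresponds to case (1) in the text: the shared shape-meaning feature gives the two forms one input code.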
(11) It provides the conditions for machine-recognition input of "character-pattern synthesis" data. Synthesizing Chinese characters from component patterns helps simplify existing Chinese font libraries and provides a way of handling newly created and rarely used characters. The synthesis data of a Chinese character comprise the codes of its components and a structural synthesis data code. Conjoined labelling labels the codes and data codes for the synthetic input and synthetic output display of a Chinese character directly in the machine code of its character pattern and character picture, which helps machine recognition and processing.
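The synthesis data described above, component codes plus a structural code, can be pictured with a small Python sketch. As a stand-in for the structural code it borrows Unicode's ideographic description characters (for example U+2FF1 for a top-to-bottom arrangement); this is an assumption made for illustration and not the coding the source defines. `SynthesisData`, `to_sequence` and `from_sequence` are hypothetical names, and only flat (non-nested) descriptions are handled.

```python
from dataclasses import dataclass

TOP_BOTTOM = "\u2FF1"  # Unicode ideographic description character: above-to-below


@dataclass(frozen=True)
class SynthesisData:
    """Synthesis data of one character: a structural code plus component codes."""
    structure: str     # structural synthesis code
    components: tuple  # component codes, in reading order

    def to_sequence(self) -> str:
        """Serialize as structure symbol followed by the components."""
        return self.structure + "".join(self.components)

    @classmethod
    def from_sequence(cls, sequence: str) -> "SynthesisData":
        """Parse a flat sequence: one structure symbol, then at least two components."""
        if len(sequence) < 3:
            raise ValueError("need a structure symbol and at least two components")
        return cls(sequence[0], tuple(sequence[1:]))


# "思" is built from "田" above "心".
si = SynthesisData(TOP_BOTTOM, ("田", "心"))
print(si.to_sequence())
```

A record of this kind is what the machine code of a character pattern would need to carry so that a reader could pass from recognition input to synthesized output without a stored glyph for the whole character.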
(12) It makes personalized output display of characters and words possible. Personalized output display shows itself in the diversity of glyphs and the diversity of their functions. It requires personalized character patterns and character pictures to be made and stored for use, which can be done with generally known techniques. In the character part of a conjoined label, the labelled form of the character carries the user's personal calligraphic style; a picture of the person's handwriting is scanned or drawn into a personalized glyph that expresses his or her style and taste. In the label part, the machine code of the character is labelled for machine recognition and processing. The machine code labels the database code of the character, which locates the character in the database and is a generally applicable code. The personalization of the glyph together with the generality of the database code makes personalized character patterns and character pictures general, legitimate and practical, because personalized output display does not interfere with recognition and processing by people and machines and also conforms to the national "character set" coding standard. Character patterns and character pictures with personal features are stored in a character database for use, or an existing "font file editing tool" is used to store the personalized character patterns in a font file. When characters are output for display, the machine code in the character pattern or character picture can be used to fetch the personalized character pattern and character picture directly from the character database, or the code of the character in the font file can be used to fetch the personalized character pattern. The output display of labelled text can be blank text, ciphertext, or conjoined text using a dedicated seal; it can be a display of single-color character patterns or of richly colored character pictures. The generality of the database code and the labelling of the "personalized" information code are different aspects of use and do not contradict each other.
(13) It strengthens the function of character patterns and character pictures in information processing. A labelled character pattern improves on existing character patterns in the following ways. It has a character part and a label part, coupled in form into a single whole, and this whole serves as the written (or technical) symbol for recording language (or processing information). The attributes of the character or word in language, script and information processing are labelled on the character pattern. Labelling materials are selected and defined as the marks of the labelling. The character part presents the original form of the character or its labelled form. The label part states the pronunciation and information codes of the character: it has both a "pronunciation input code" and a "machine code", or only a pronunciation input code, or only a pronunciation (with the input code implied), or only a machine code, or some other pattern.
A labelled character picture likewise has its character part and label part coupled in form into a single whole. It improves on existing character pictures in that this whole serves as the written (or technical) symbol for recording language (or processing information); the attributes of the character or word in language, script and information processing are labelled on the character picture; labelling materials are selected and defined as the marks of the labelling; the character part presents the original form of the character or its labelled form; the label part states the pronunciation and information codes of the character, with both a pronunciation input code and a machine code, or only a pronunciation input code, or only a pronunciation (with the input code implied), or only a machine code, or some other pattern; and one or more attributes of the character or word are labelled visually.
Character patterns and character pictures for the Chinese phonetic alphabet comprise the letters and syllables of the alphabet and the corresponding character patterns or character pictures. They improve on existing use of the Chinese phonetic alphabet in that an information code is labelled in the character pattern and character picture, and in that a single letter, or a whole syllable, is made into one character pattern or character picture; this improves the existing way of inputting the Chinese phonetic alphabet. What all of these have in common is that they are conjoined and technical: they serve both language and script and information processing.
Compared with the prior art, the present invention has the following features:
1. Conjoined labelling joins the pronunciation of a character or word to its glyph, providing a mandatory form of use for strengthening language standardization: it helps strengthen the learning and use of Mandarin and widens the scope and role of the Chinese phonetic alphabet.
2. Conjoined labelling joins the input code of a character or word to its glyph, which helps simplify and standardize the existing Chinese notation schemes and promotes the standardization of coded input; combining visual labelling with coded input lowers the difficulty of coded input.
3. Conjoined labelling joins pronunciation, glyph and input code into one and visually labels, indicating (or designating), attributes of a Chinese character such as its structural pattern, radical classification, component makeup and input code. This helps the teaching of literacy through phonetic spelling and lowers the difficulty of learning and using Chinese and of coded input. Literacy education and information-skills training proceed together, and school education is tied closely to the use of Mandarin in society, which helps children and young people learn information-input skills and retain them for life.
4. Labelling an easily machine-recognizable "machine code" on character patterns and character pictures will improve existing machine-recognition technology.
5. Labelling "personalized" information registered with an authority on character patterns or character pictures will strengthen information security management and copyright protection.
6. The horizontal "phonetic spelling plus Chinese character" arrangement of conjoined labelling gives script reform a form of practice that helps both to popularize Mandarin and to phoneticize Chinese characters.
7. Conjoined labelling labels the pronunciation of a Chinese character, illustrates its glyph, provides (or indicates) its input code, labels its machine code, labels after the phonetic syllable the shape-meaning feature code that distinguishes homophones, and so on. This will reduce and overcome the "six difficulties" of Chinese characters and improve the determinacy of spelling.
8. Treating the simplified, traditional and variant forms of a Chinese character as "one code, several typefaces" will simplify existing character set coding and encourage the three forms (simplified, traditional and variant) to be unified and simplified in use.
9. Labelling the "synthesis data" code of a Chinese character on the character pattern (or character picture) provides the conditions for going from machine-recognition input to synthesized output, and will help simplify and improve existing Chinese font libraries.
10. Labelling the registration information of newly created characters on the character pattern (or character picture) helps the use and spread of newly created and newly simplified characters.
11. Conjoined labelling makes "personalized" output display possible, keeping the artistic character of Chinese characters without interfering with machine-recognition input.
12. Using conjoined-labelled character patterns and character pictures as the written (or technical) symbols for recording language (or processing information) provides a practical new form for language, script and information processing.
Below, the invention will be further described.
I. Conjoined labelling
Conjoined labelling means labelling the attributes that characters and words have in language, script and information processing in a conjoined, comprehensive and visual way. These attributes include the pronunciation, glyph and information codes of the character or word, and their various combinations.
The general form of a conjoined label is shown in Fig. 1, "Conjoined labelling (single character)", and Fig. 2, "Conjoined labelling (phrase)". Fig. 1 is the general form for a single character; Fig. 2 is the general form for a phrase.
A conjoined label has a "character part" and a "label part". The two are coupled in form into a single whole, which serves as the written (or technical) symbol for recording language (or processing information). This whole, shown inside the blue frame in Fig. 1 and Fig. 2, is the unit in which character patterns and character pictures are used: for example, when the character "思" (si, "think") is needed, its character part and label part appear together.
The character part presents the original form of the character or word, or its labelled form. The original form is the existing character pattern or character picture, that is, the existing glyph with no labelling applied. The labelled form is a glyph on which labelling materials are used to indicate glyph attributes visually. For example, the labelled form of a Chinese character gives a visual statement of its strokes, stroke order, components and structure, following the existing language and script standards. In the character part of Fig. 1, "Conjoined labelling (single character)", the labelled form of "思" uses the defined red to mark "心" (xin, "heart") as the "code-taking (coding) component" of "思".
The label part states the pronunciation and information codes of the character or word. The information codes comprise the "input code" and the machine code. The label part contains the "pronunciation input code" and the "machine code" of the character or word. The pronunciation input code joins the pronunciation label and the input code into one and has its own characteristics in use. In the label part of Fig. 1, "Conjoined labelling (single character)", the pronunciation input code and machine code of "思" are labelled, and the defined red marks "sī" as the "short code" of the input code of "思".
The pronunciation label and the input code of a character or word may be joined, giving the "pronunciation input code" pattern, or the pronunciation spelling may be kept separate from the input code, giving other patterns. In the pronunciation input code pattern, the front part is the pronunciation and the back part is the input code; or the front part is the pronunciation and the back part is a shape-meaning feature code that distinguishes homophones, and the two parts together form the input code. The two parts are separated by a symbol or left unseparated, and the separator can be defined and chosen as needed. In this description the latter pattern is called "pronunciation plus shape-meaning feature", "pronunciation plus shape-meaning (distinguishing) feature", or "pronunciation`feature". The pronunciation label is called the "pronunciation" for short and the shape-meaning feature code the "feature", so the latter pattern of the pronunciation input code is written in the form "pronunciation`feature". In Fig. 3, "Pronunciation input code", the pronunciation of "思" is the red part "sī" before the separator "`", and the input code of "思" (in its full, principle form) is the combination of both parts, "sī`xn".
Pronunciation is labelled according to the language and script standards. Chinese characters and words are labelled according to the Scheme for the Chinese Phonetic Alphabet and the relevant orthography rules for the Chinese phonetic alphabet. Tone sandhi, neutral tones and multiple readings of Chinese characters and words can be labelled on the corresponding character patterns and character pictures and distinguished in the input code (or in the output display), for example by adding a tone-change code or by editing and composing the pronunciation label; alternatively, tone sandhi is left unlabelled and the original tone is kept. When tone sandhi is not labelled, a person reading aloud applies the sandhi rules; for machine recognition and reading, a tone-sandhi handling program is set up according to those rules, and consecutive syllables that need sandhi are processed by it. Pronunciation can also be indicated with labelling materials such as special colors or deformed characters. In the pronunciation input code pattern, the pronunciation of a conjoined-labelled phrase is the spelled pronunciation of each of its characters, arranged in character order and placed before the "input code" or "shape-meaning feature code". The pronunciation label can also use other patterns of the Chinese phonetic alphabet, such as the "double-spelling" (shuangpin) pattern; the double-spelling codes and their keyboard definition are described below.
The pronunciation input code can also have the font file code of the character appended at its tail, so that people who cannot read Chinese characters can select the font file when inputting. In Fig. 4, "Pronunciation input code (2)", the character "f" indicates that the "traditional" font file is selected.
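The "pronunciation`feature" pattern described above can be illustrated with a short Python sketch that joins and splits the two parts. The function names `make_input_code` and `split_input_code` are hypothetical. The default separator "`" and the examples "sī`xn" and "shù`mu" come from the description; the trailing font file code is not modelled because its exact syntax is not given, and splitting is only supported when a separator is actually used.

```python
SEPARATOR = "`"  # default separator used in the description


def make_input_code(pronunciation, feature="", separator=SEPARATOR):
    """Join a pronunciation and an optional shape-meaning feature code."""
    if not pronunciation:
        raise ValueError("a pronunciation is required")
    if not feature:
        return pronunciation  # feature hidden or absent: pronunciation only
    return pronunciation + separator + feature


def split_input_code(code, separator=SEPARATOR):
    """Split an input code into (pronunciation, feature); feature may be empty."""
    if not separator:
        raise ValueError("cannot split a code written without a separator")
    pronunciation, _, feature = code.partition(separator)
    return pronunciation, feature


print(make_input_code("sī", "xn"))  # full (principle) input code
print(split_input_code("shù`mu"))   # pronunciation and feature parts
print(split_input_code("sī"))       # feature part is empty
```

Keeping the pronunciation as the leading part means that a code with the feature omitted is still a valid prefix of the full code, which fits the way the description lets the feature be hidden or the code be shortened.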
The spelling of the input coding of words and pronunciation is linked together, and shows as " pronunciation input coding " pattern, and perhaps the spelling of the input coding of words discord pronunciation is linked together, and shows as other pattern." pronunciation input coding " pattern, its front is the words pronunciation, the back is an input coding; Or the front is the words pronunciation, and the back is the shape justice feature code in order to difference unisonance words, and front and back two parts are in conjunction with as input coding; The centre separates with symbol, or separates without symbol, and list separator can define as required and select for use.As Fig. 3, " pronunciation input coding ", the input coding of " think of " word is " s ī `xn ".
" pronunciation ` feature " form of input coding is a kind of in " pronunciation input coding "; It is exactly in pronunciation mark back, the shape justice feature code (abbreviation feature) of filling words, and front and back two parts combination is as input coding; Between the two, separate with character " ` " (or other symbol), or separate without symbol.Shape justice (difference) feature code, the statement words is at the distinguishing characteristics of shape right way of conduct faces such as stroke, parts, structure.The definition of shape justice (difference) feature code as far as possible with the interrelating of the pronunciation (or name) of these features, makes its code be simple and easy to note.Such as, Chinese character " think of " (as Fig. 5, the pattern 1 in " dimension style ") is with shape justice (difference) feature of first-selected parts " heart " conduct with other phonetically similar word; With the double spelling code " xn " (double spelling code illustrates and sees below) of the basic syllable " xin " of " heart ", as shape justice (difference) feature code.The input coding of Chinese character " think of " on principle, is expressed as " s ī `xn ".
The feature-code part of the input code may also be implied by a formatting cue in the "character part" and omitted from the "pronunciation input code" (Fig. 5, "marking styles", pattern 3). In the figure, the character part of 思 uses a predefined red to indicate that its form-meaning (distinguishing) feature is the component 心. The input code of 思 is in principle "sī`xn"; here "`xn" is hidden and only a red "sī" is shown. Hiding the feature code simplifies the notation.
The "brevity code" pattern is a simplification of the full input code (the principle code). It can be marked directly on the principle code by color, shade, or a change of typeface (Fig. 5, "marking styles", pattern 2). In the figure, the input code of 思 is in principle "sī`xn"; supposing its brevity code is "sī", a special color and typeface indicate and set off the brevity code within the input code "sī`xn": the brevity-code part is bold and red, and the remainder is italic.
For the input codes of Chinese characters and phrases, this specification recommends the "pronunciation`feature" form. It is one application of the prior art "two-part Chinese characters" (application number 02108826.8). It describes the pronunciation of a character together with its form-meaning (distinguishing) feature, the two being separated by the character "`" (or another defined character) or written without a separator, so that the code (and spelling) of each character is uniquely determined. The form-meaning feature may be a component of the character, a stroke structure, or a stroke; a feature that has a pronunciation is described by that pronunciation, and one that has none is described by a stroke code. The pronunciation of a component is expressed in the double-spelling pattern of Hanyu Pinyin. These details are not elaborated here. Describing Chinese characters by "pronunciation" plus "form-meaning (distinguishing) feature" has the following benefits. First, pronunciation is the primary attribute of a word: in language one may not know the form of a word, but one cannot fail to know its pronunciation. Second, all Chinese characters can be input without duplicate codes. Third, it helps modernize both Chinese characters and Hanyu Pinyin. Fourth, it offers a practical, "self-developing" path toward an alphabetic spelling of Chinese. For example, the input code of the character 树 ("tree") is "shù`mu"; as the pronunciation part on the left becomes further standardized and the form-meaning feature on the right becomes established by usage, the code provides a "full-character pattern" of continuous transition toward spelled Chinese. (Within the conjuncted mark, spelled Chinese can also take other patterns, such as "pinyin plus Chinese character".) Fifth, it supports the practice of Hanyu Pinyin and the search for a uniquely determined spelling of words.
The input code of a Chinese phrase falls into two cases. In the first, the phrase is output as a combination of individually marked type matrices (or word figures), each carrying the input code of a single character; the input code of the phrase is then read, by existing convention, from the "pronunciation`feature" marks of the individual characters, taking the syllables or characters that the particular input method requires. The simple convention is: two-character words are read by "sound sound"; three-character phrases by "several sound"; phrases of four or more characters by "several". In the second case, the phrase is output as a single conjuncted-mark phrase type matrix (or word figure), which should carry the input code of the phrase. If the "pronunciation`feature" form is used, the pronunciation part of the phrase is in principle marked first and the form-meaning feature part second. The pronunciation code spells the phrase in character order; the feature code is determined (or chosen) as required, is expressed by the pronunciation (or name) codes of the components, and is likewise written in character order. The length of the input code is set by the particular input method. The brevity code still follows the existing convention (two-character words by "sound sound", three-character phrases by "several sound", phrases of four or more characters by "several") and is marked, using the marking means, on the corresponding characters of the pronunciation part of the phrase.
Besides the "pronunciation`feature" form, the input codes of Chinese characters and phrases can take other patterns. For example, another popular Chinese input code may be appended after the pronunciation mark to describe the form-meaning feature, or other input codes may be marked on their own.
Actual keyboard entry of input codes. Under present conditions, toned Hanyu Pinyin letters and syllables, and the letter "ü", have no corresponding keys on a QWERTY keyboard, so practical input requires the following handling. When "ü" must be entered in an input code, it is replaced by "v", following current practice. Toned pinyin letters and syllables, such as "ā" and "dǎ", are entered as the basic syllable plus a tone mark. The tone mark, a digit or a letter, denotes one of the four tones of standard Chinese. With digits, "1, 2, 3, 4" denote the four tones and are added after the basic syllable, so that "ā" and "dǎ" are actually entered as "a1" and "da3". With letters, many choices are possible and may be defined as required; one scheme is given here. Following the traditional tone names (yinping, yangping, shangsheng, qusheng), take the four syllables "yin, yang, shang (shǎng), qu" as representatives of the four tones and use the first (or second) letter of each, namely "i, y, s, q", as the tone codes; in written statements these letters may be set in italic. Thus "ā" and "dǎ" are written as "ai" and "das" with the tone letter in italic, and are actually keyed as "ai" and "das"; the tone code need not be italic when typed. The actual input of phrase input codes is similar to that of single characters. When the phrase is output as a combination of individually marked type matrices (or word figures), the characters and tones of its "principle code" are handled as for single characters, and tones are ignored when entering its brevity code. When the phrase is output as a conjuncted-mark phrase type matrix (or word figure) carrying the input code of the phrase, and the "principle code" is assumed to take the "pronunciation`feature" form, the basic syllables of the characters are entered first, in order; if tones are needed, the tone marks of the characters are then added, in order, after the basic syllables of the phrase; if the form-meaning feature is also needed, the feature codes are entered last, again in character order. For the phrase 大地 ("the earth"), pronounced "dàdì", the actual input code can be "dadi44" or "dadiqq". Whether tones are entered, and whether form-meaning feature codes are required, can be defined by the particular input method.
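The keyboard conventions described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `split_tone` and `keyboard_code` are hypothetical, and tone marks are detected through Unicode decomposition, which the text does not prescribe.

```python
import unicodedata

# Tone letters proposed in the text: yin, yang, shang, qu -> i, y, s, q
TONE_LETTERS = {1: "i", 2: "y", 3: "s", 4: "q"}
# Combining marks of the four tones: macron, acute, caron, grave
TONE_MARKS = {"\u0304": 1, "\u0301": 2, "\u030c": 3, "\u0300": 4}


def split_tone(syllable):
    """Return (basic syllable, tone number or None); "ü" becomes "v"."""
    base, tone = [], None
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        elif ch == "\u0308" and base:  # diaeresis: the previous "u" was "ü"
            base[-1] = "v"
        else:
            base.append(ch)
    return "".join(base), tone


def keyboard_code(syllables, style="digit"):
    """Basic syllables in order, then the tone marks of the syllables in order."""
    parts = [split_tone(s) for s in syllables]
    bases = "".join(b for b, _ in parts)
    tones = "".join(
        str(t) if style == "digit" else TONE_LETTERS[t]
        for _, t in parts
        if t is not None
    )
    return bases + tones


print(keyboard_code(["dǎ"]))                  # digit style
print(keyboard_code(["dà", "dì"], "letter"))  # letter style
```

Feature codes, when an input method requires them, would simply be appended after the tone marks in character order, as the paragraph above describes.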
In the conjuncted mark, the machine code is a graphic code used for the recognition and processing of words by both people and machines. Only a few illustrative machine-code patterns are explained here, to show how the machine code is applied and what it does in the conjuncted mark.
On a marked type matrix (or marked word figure), the machine code consists of one or several groups of codes, and each group consists of one or several codes. The codes of a group are arranged in the order "..., 8, 4, 2, 1", which fixes the weight and direction of each code within the group: every code has its defined weight, and all of its positions (here called "code positions") carry that same weight. Each code is divided, as required, into "code position intervals" (runs of consecutive code positions), and each interval is assigned a defined content. The machine code is generally placed at the bottom of the type matrix or word figure, but it may also be placed around it or elsewhere. Weights are generally defined along the vertical direction and code positions along the horizontal direction. Reference marks are provided in both directions to distinguish and define the location, height, and width of the weights and code positions, and dot patterns of the corresponding codes are provided for machine recognition and reading, or for recognition by people. Because fonts and font sizes differ, the dot pattern of a single position may differ as well. Here the dot pattern of one position is called a "code symbol", and the dot pattern made up of the code symbols is called a "code figure". The machine code may be marked visibly, or marked "invisibly" with a special material so that it is hard to see. It may use a sparse pattern, marked on alternate rows and alternate positions, or a compact pattern, marked on every row and every position without gaps. Each group of codes may, as required, contain code position intervals for one or more of the following: the database code, the pronunciation code, the "personalization" information code, and the data code for synthesizing the type matrix (or word figure); intervals describing other information may also be arranged. The "personalization" information code describes one or more items registered with an authority, concerning works, publications, printing and copying equipment, special fonts, notarial seals, passwords, and the like, or other user-defined information. The last position of each group of codes is the check position.
The machine code can also be used for recognition by people. With the code figure as the unit (or feature) of recognition, it can serve illiterate users as a "shape code" for words; it can be given three-dimensional treatment, such as embossing or perforation, for use by blind (or sighted) people; or it can designate the meaning and part of speech of a word, and so on.
For convenience, a single Chinese character is used first as the example. Suppose the machine code is placed at the bottom of the type matrix or word figure, each group consists of 4 codes, and the code is hexadecimal, arranged by weight "..., 8, 4, 2, 1". The example character is 思; its machine-code patterns are shown in Fig. 6, "machine code (sparse pattern)", and Fig. 7, "machine code (compact pattern)". In both figures, the zigzag code figure on the left is the vertical reference mark of the code symbols: it defines the height of a code symbol, defines the weight and weight direction of each code, and also serves as the start flag of each code; its lower-left corner is the starting point for recognizing the whole machine code. The code figure of alternating black and white rectangles along the bottom edge is the horizontal reference mark, which defines the width of a code symbol.
Fig. 6 shows the sparse pattern of the machine code, 10 rows by 36 columns. The zigzag code figure on the left is the vertical positioning reference, and the codes are separated by white rectangular strips. Each code position is marked by a code symbol of one black and one white rectangle, as defined by the vertical and horizontal reference marks. Each code has 18 code positions. From left to right in the figure, position 1 is the start flag, positions 2 to 17 are data positions, and position 18 is the check position. Within the data positions, positions 2 to 6 hold the database code of the word, providing more than 1.04 million numbers (16^5 = 1,048,576); positions 7 to 9 hold the pronunciation sequence number, providing more than 4,000 numbers (16^3 = 4,096); positions 10 to 17 are spare positions, available for the "personalization" information code. Position 18, the last position, is the check position: for each group of machine codes, it makes the number of code positions set to "1" (or, if so defined, to "0") in the codes add up to a whole multiple of ten. In the figure, the database code marked for 思 is "0CBBCH", the pronunciation sequence number is "3BAH", and the check position is "0003H".
Fig. 7 shows the compact pattern of the machine code, 6 rows by 36 columns. Each code uses the zigzag code figure as its vertical positioning reference, and the codes need no separators. Each code position is marked by a single black (or white) rectangular code symbol, as defined by the vertical and horizontal reference marks. Each code has 36 code positions. From left to right in the figure, positions 1 and 2 are the start flag, positions 3 to 35 are data positions, and position 36 is the check position. Within the data positions, positions 3 to 7 hold the database code of the word, providing more than 1.04 million numbers (16^5 = 1,048,576); positions 8 to 10 hold the pronunciation sequence number, providing more than 4,000 numbers (16^3 = 4,096); positions 11 to 35 are spare positions, available for the "personalization" information code. Position 36, the last position, is the check position: for each group of machine codes, it makes the number of code positions set to "1" (or, if so defined, to "0") in the codes add up to a whole multiple of ten. In the figure, the database code marked for 思 is "0CBBCH", the pronunciation sequence number is "3BAH", and the check position is "0003H".
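As a rough illustration of how a group of four codes weighted 8, 4, 2, 1 yields one hexadecimal digit per code position, the following Python sketch reads the data positions of such a group. The function name `read_hex_digits` is hypothetical, the rows are assumed to be supplied in the order 8, 4, 2, 1 with black as 1 and white as 0, and the start flag and check positions are assumed to have been stripped already.

```python
HEX = "0123456789ABCDEF"
WEIGHTS = (8, 4, 2, 1)


def read_hex_digits(rows):
    """rows: four equal-length sequences of 0/1, ordered by weight 8, 4, 2, 1.

    Each column is one code position; its hexadecimal digit is the sum of
    the weights of the rows that hold a 1 at that position.
    """
    if len(rows) != 4 or len({len(r) for r in rows}) != 1:
        raise ValueError("expected four rows of equal length")
    digits = []
    for column in zip(*rows):
        value = sum(w * bit for w, bit in zip(WEIGHTS, column))
        digits.append(HEX[value])
    return "".join(digits)


# Data positions for the example values quoted in the text:
# database code 0CBBC followed by pronunciation sequence number 3BA.
rows = [
    [0, 1, 1, 1, 1, 0, 1, 1],  # weight 8
    [0, 1, 0, 0, 1, 0, 0, 0],  # weight 4
    [0, 0, 1, 1, 0, 1, 1, 1],  # weight 2
    [0, 0, 1, 1, 0, 1, 1, 0],  # weight 1
]
digits = read_hex_digits(rows)
database_code, pronunciation_number = digits[:5], digits[5:8]
print(database_code, pronunciation_number)
```

Slicing the digit string by position mirrors the "code position intervals" of the text: the first five data positions form the database code and the next three the pronunciation sequence number.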
The database code describes the retrieval (query) "address" of a word in the database, or gives a newly coined word its "location" there. It may be formed from the record number of the word in the database, from the character-set code of the first (or remaining) character of a phrase, from the pronunciation code of the word, or from a combination of these, defined according to actual needs so as to help retrieval. Alternatively, a database record number, a character-set code, or an internal machine code may be used directly as the database code.
The pronunciation sequence number is obtained by sorting the pronunciations of all words by some rule to produce a "sound order" and expressing that order as a code. The pronunciation code is the code that denotes the pronunciation (or attributes) of a word; in the machine code, it combines the number of characters in the word with the pronunciation sequence numbers of its characters, and is used to retrieve the pronunciation (or other information) of the word. Here the "pronunciation code" includes the "pronunciation sequence number", and the two terms are sometimes used interchangeably. Mandarin Chinese has more than 1,300 syllables, and sorting them yields more than 1,300 pronunciation sequence numbers. The pronunciation code of a single Chinese character carries no character count and is simply its pronunciation sequence number. In the pronunciation code of a Chinese phrase, the first position gives the number of characters and the remaining positions are the pronunciation sequence numbers of the characters in order; alternatively, the character count may be omitted. For example, for the word 思想 ("thought"), suppose the pronunciation sequence number of 思 is "3BAH" and that of 想 is "445H"; with a character count of 2, the pronunciation code of the phrase is "23BA445H".
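The construction of a phrase pronunciation code can be sketched as follows. The function name `phrase_pronunciation_code` is hypothetical; sequence numbers are assumed to be three hexadecimal digits written without the trailing "H", and the character count is assumed to fit in one hexadecimal position (1 to 15).

```python
def phrase_pronunciation_code(serials, with_count=True):
    """Join per-character pronunciation sequence numbers into one code.

    serials: list of 3-digit hexadecimal strings, one per character.
    with_count: prefix one hexadecimal digit giving the number of characters.
    """
    for s in serials:
        if len(s) != 3 or any(c not in "0123456789ABCDEF" for c in s):
            raise ValueError(f"not a 3-digit hexadecimal sequence number: {s!r}")
    if with_count and not 1 <= len(serials) <= 15:
        raise ValueError("one count position holds 1 to 15 characters")
    prefix = format(len(serials), "X") if with_count else ""
    return prefix + "".join(serials)


# 思想: 思 = 3BA, 想 = 445 (the supposed values from the text)
print(phrase_pronunciation_code(["3BA", "445"]))
```

For a single character, calling the function with `with_count=False` returns the sequence number unchanged, matching the rule that a single character carries no count.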
Data code for synthesizing the type matrix (or word figure). Marking synthesis data in the machine code lets machine recognition supply what is needed to synthesize the type matrix (or word figure) for input, output, and display. Synthesizing Chinese fonts from component type matrices, to simplify or improve font storage, is a known technique; synthesizing Chinese character "word figures" can be regarded as an extension of it. The synthesis data of a type matrix comprise the component type matrices needed to build the character, together with the sizes and position coordinates of those components. The sizes and position coordinates can be standardized as structure-type data. Once the component codes that make up a Chinese character and the structure-type code of those components are known, the character can be synthesized. Chinese characters have several hundred basic components and a few dozen structure types. Five code positions can be assigned as the code position interval for synthesis data: 2 positions, giving 256 codes, describe the basic-component code; 2 positions, giving 256 codes, describe the structure-type code; and 1 position, giving 16 codes, states whether a type matrix or a word figure is being synthesized, or other information. Whether the synthesis data code is marked can be decided according to the requirements of use and the technical conditions.
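A minimal sketch of the five-position synthesis interval follows. The names `pack_synthesis_code` and `unpack_synthesis_code`, the field order (component, structure type, kind), and the numeric values in the example are all assumptions made for illustration; the text fixes only the widths of 2, 2, and 1 hexadecimal positions.

```python
def pack_synthesis_code(component, structure, kind):
    """Pack three fields into 5 hexadecimal positions (2 + 2 + 1)."""
    if not (0 <= component < 256 and 0 <= structure < 256 and 0 <= kind < 16):
        raise ValueError("field out of range for its code positions")
    return f"{component:02X}{structure:02X}{kind:X}"


def unpack_synthesis_code(code):
    """Inverse of pack_synthesis_code: return (component, structure, kind)."""
    if len(code) != 5:
        raise ValueError("expected exactly 5 hexadecimal positions")
    return int(code[0:2], 16), int(code[2:4], 16), int(code[4], 16)


# Hypothetical values: component 0x1F, structure type 3, kind 1
code = pack_synthesis_code(0x1F, 3, 1)
print(code, unpack_synthesis_code(code))
```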
The number of groups and the length of the machine code should be determined by the range of use of the type matrix and word figure, that is, by the range of font sizes commonly used. If the type matrix and word figure are to be used over a wide range of sizes, for example from size 16 to size 128, then fewer groups and shorter codes should be used, so that the machine code remains legible at the smallest size and more room is left for the character part. For example, under present conditions, one group of compact-pattern codes ("1 group of 4 codes") with a code length of 12 or 13 positions can be set so that the machine code displays clearly at size 16.
The code symbol of the machine code is the form taken by the dot pattern of each position in a code. The same machine code takes differently shaped code symbols in different fonts and font sizes. Even within one font, the code symbols of the same code need not be identical at different places on the type matrix and word figure. With the notion of the code symbol, the size of the code symbols can vary with the font (or font size) while the machine code they express stays the same, which helps machine recognition of the machine code and, ultimately, recognition of the words.
Arrangement of code positions. Each code of the machine code has start flag positions, data positions, and a check position. The start flag occupies 1 (or 2) positions and, together with those of the other codes, forms the start code figure that marks where recognition of the machine code begins. The data positions hold the database code, the pronunciation sequence number (or pronunciation code), the synthesis input/output code, and the "personalization" information code of the word. With 5 hexadecimal positions, the database code provides more than 1.04 million codes; fewer than 90,000 Chinese characters exist and common phrases are not numerous, so this suffices for word lookup. With 3 hexadecimal positions, the pronunciation sequence number of a single character provides more than 4,000 numbers, while standard Chinese has only some 1,300 toned syllables. With 4 hexadecimal positions, the synthesis input/output code of a single character can cover the codes of the several hundred basic components and the few dozen structure types of Chinese characters. The "personalization" information code of a word may state whether use is licensed, whether the text is public, and the sequence number of the word in the text; or describe one or more items registered with an authority, concerning works, publications, printing and copying equipment, special fonts, notarial seals, passwords, and the like; or carry other user-defined information.
The code positions of a phrase are arranged slightly differently from those of a single character. The database code of a phrase is either assigned to the phrase as a unit, or the database code of its first character is used as the database code of the phrase. The pronunciation code of a phrase lists the pronunciation sequence numbers of its characters in order, optionally preceded by 1 position stating the number of characters. The synthesis input/output code likewise lists the synthesis data of the characters in order, optionally preceded by 1 position stating the number of characters. The "personalization" information code can remain unchanged. The machine code of a phrase therefore grows with the lengths of the database code, pronunciation code, and synthesis input/output code. For example, for the word 思想 ("thought"), suppose the character-set code of its first character is used as the database code of the phrase. On retrieval, the first character 思 is found from that character-set code, and within the 思 entry of the database the field recording 思想 is found from the pronunciation sequence number of the phrase 思想 (or of the remaining character 想) marked in the machine code. Suppose the pronunciation sequence number of 思 is "3BAH" and that of 想 is "445H", with a character count of 2; then the pronunciation code of 思想 is "23BA445H", 7 positions, which is 4 more than the pronunciation code of a single character, and the machine code of the phrase is correspondingly 4 positions longer. Phrases of more than 15 characters are rare; in this example such phrases are synthesized from their individual characters, and only 1 hexadecimal position is assigned to the character count. If the character count of a phrase (or sentence) of more than 15 characters must be expressed, 2 (or more) positions can be assigned as a special application, for instance when a whole Tang poem is to be expressed in one type matrix.
The machine code, and in particular its "personalization" data, provides new conditions for information security management and copyright protection. The "personalization" data, marked in the "personalization" information code, may state whether use is licensed, whether the text is public, and the sequence number of the word in the text; or describe one or more items registered with an authority, concerning works, publications, printing and copying equipment, special fonts, notarial seals, passwords, and the like; or carry other user-defined information. The "personalization" information code can be used in two ways. First, without registration with an authority, it serves only an individual (or a particular occasion) and lacks legal force. Second, once registered with an authority, it has full legal force and can be used in society at large. The conjuncted mark takes the dot pattern of a word as the object of encryption. Each word can have its own password, marked in the "personalization" information code of its type matrix (or word figure). Type matrices and word figures stamped with "personalization" marks make it possible to record, when text is output, whose type matrix and word figure and whose printer produced it, and whether the text is registered and whether it may be made public. With the machine code, while a text is in the "private" state it cannot, without authorization, be recognized and read by machine, nor can its type matrices and word figures be edited or modified by machine. When the text needs to be in the "public" state, it can be "restored": a special lawful (or personal) program processes the "personalization" information code so that machines can recognize and read the type matrices and word figures of the text. If a text is copied illegally, the "personalization" data on its type matrices and word figures cannot be changed. Where text is very easy to copy, this helps protect copyright. When the machine code is marked by means such as watermarks or magnetic ink, the words gain confidentiality and anti-counterfeiting functions.
The "personalization" information code can be defined to suit individual needs, or according to a legally prescribed form set by an authority. As an illustration, taking Fig. 7, "machine code (compact pattern)", define positions 11 to 35 as the interval for the "personalization" information code. With 4 positions, 11 to 14, stating the publishing-organization code, more than 60,000 codes are available. With 5 positions, 15 to 19, stating the registration code of the authority, more than 1,040,000 codes are available; these codes are generated by the authority with a lawful cryptographic algorithm from one or more items concerning the work, author, publication, printing, special font, notarial seal, marked characters and their positions, random passwords, and the like. With 4 positions, 20 to 23, stating the code of the lawful cryptographic algorithm that the authority uses for spot checks, more than 60,000 codes are available. With 5 positions, 24 to 28, stating the self-defined code of the author (or publisher), more than 60,000 codes are available; the arrangement of these code positions and the setting of passwords can be decided by the author, or filed with a notarial (or administrative) office. The interval lengths of the "personalization" information code and the choice of marked items should be determined by the needs of the user and the space available on the type matrix (or word figure).
Combined with the font file, the machine code allows the recognition and output display of a file to be encrypted and decrypted under a "special agreement". For example, a "special agreement" font file may turn all characters "blank" while the machine code uses encrypted codes; the text output then shows a blank character part and a ciphered mark part. Without decryption, neither people nor machines can read the text. With the password alone, a machine can recognize the text but people still cannot read it. Only with both the password and a "special agreement" font file that makes the characters visible can the text be output normally, recognizable by machine and readable by people.
Generating a "special-purpose seal" with the conjuncted mark. For example, to generate a special-purpose seal on a receipt, one's own signature and fingerprint form the character part of the type matrix, and a password computed in a lawful manner from the key information in the receipt, such as the reason, the amount, and the time of issue, forms the mark part (the machine code). The type matrix is thereby bound to this receipt as its special-purpose seal and has full legal force; taken away from this receipt, the seal loses its validity. This is its conjoined property.
Generating "special-purpose type matrices and word figures" with the machine code. The practice is similar to that of the special-purpose seal. The "personalization" information part marks the relevant codes registered with an authority, but the password need not be generated from the text.
The machine code and its "personalization" information code can be edited and modified. Ready-made font-file editing tools exist for editing and modifying type matrices, and individuals can download such software from the Internet. In word figures, the machine code can be edited and modified with an ordinary picture-editing tool. A dedicated program can also be provided to mark the "personalization" information codes of all type matrices and word figures uniformly before a text is finalized.
Recognition of the machine code by people and machines. Recognizing the machine code with existing machine-recognition technology presents no technical difficulty; on the contrary, it lowers the difficulty of machine recognition. The recognition of the machine code does, however, have its own characteristics, which a concrete recognition program should take into account. Take Fig. 6, "machine code (sparse pattern)", and Fig. 7, "machine code (compact pattern)", as examples, and suppose the machine code is at the bottom of the type matrix and word figure.
1. First, find the starting points of recognition. These comprise the coordinate origin of the code figure, the position and weight of each code, and the start flag of each code. The coordinate origin is the intersection of the vertical and horizontal references, namely the white rectangular code symbol in the lower-left corner. The position and weight of each code are indicated by black-and-white code figures in the sparse pattern and by black-and-white code symbols in the compact pattern. The start flag of each code is the zigzag code figure.
2. Use the horizontal reference mark as the stepping mark for reading the machine code (black-and-white code figures in the sparse pattern, black-and-white code symbols in the compact pattern). Horizontally, read the binary value of each code of the group at the corresponding position; vertically, read the weight of each code at that position.
3. Verify. The check position brings the total number of black (or white) code symbols in each group of codes up to a whole multiple of ten. Here a black code symbol is defined as "1" and a white code symbol as "0", and the number of "1" code symbols is the object of the check. Checking against a whole multiple of ten can give higher checking precision than parity checking.
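The "whole ten" check can be sketched as below. The reading adopted here is an assumption: the check value is the amount that must be added to the count of "1" code symbols in the data positions to reach a multiple of ten. The function names are hypothetical. With the example values quoted in the text for 思 (database code 0CBBC, pronunciation sequence number 3BA) and the spare positions taken as all zero, this reading yields a check value of 3, which agrees with the stated check position.

```python
def ones_count(hex_digits):
    """Number of code symbols set to 1 across the given hexadecimal positions."""
    return sum(bin(int(d, 16)).count("1") for d in hex_digits)


def check_value(data_digits):
    """Value that brings the count of 1 symbols up to a multiple of ten."""
    return (10 - ones_count(data_digits) % 10) % 10


def verify(data_digits, check):
    """True when the count of 1 symbols plus the check value is a whole ten."""
    return (ones_count(data_digits) + check) % 10 == 0


# Data positions 2 to 17 of Fig. 6: database code, sequence number, spares
data = "0CBBC" + "3BA" + "0" * 8
print(ones_count(data), check_value(data), verify(data, check_value(data)))
```

Unlike a single parity bit, this check also catches any change in the number of "1" symbols that is not a multiple of ten, which is one way to read the claim of higher precision.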
The machine code can also be recognized by a person. Once familiar with the 16 hexadecimal code patterns, a user can treat the machine code as a "font code" for entering characters and words, either as a standalone font code or as a supplement to other input codings.
The machine code can also be marked with other code patterns from the prior art, such as bar codes.
Conjoined marking can give the type matrix and word figure of a character or word an output display "switch" attribute, which controls how many attributes are displayed and in which marking style. For example, one can define that only the pronunciation and body of the character are shown, or only the machine code and body, or that all attributes are shown together. When only the body is shown, the type matrix and word figure display exactly like existing type matrices and word figures, returning to the existing state. One simple program realization: when a type matrix is output, display only the "word segment"; set the codes of the "mark part" entirely to "0" (no figure), then increase or reduce the number of dots (pixels defined as "1") of the "word segment" according to the defined scaling ratio. In this way only the "word segment" of the type matrix is output.
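The simple realization just described can be sketched in Python. This is only an illustration under stated assumptions: the type matrix is modeled as rows of 0/1 dots, the "word segment" is assumed to occupy the top rows and the "mark part" the rows below, and only enlargement by an integer factor is shown. The names `show_word_segment_only` and `scale_up` are hypothetical.

```python
from typing import List

Bitmap = List[List[int]]  # rows of dots, 1 = inked, 0 = no figure


def show_word_segment_only(glyph: Bitmap, word_rows: int) -> Bitmap:
    """Keep the top word_rows rows and set every mark-part dot to 0."""
    width = len(glyph[0])
    return [
        row[:] if y < word_rows else [0] * width
        for y, row in enumerate(glyph)
    ]


def scale_up(bitmap: Bitmap, factor: int) -> Bitmap:
    """Enlarge a bitmap by repeating each dot factor times in both directions."""
    scaled: Bitmap = []
    for row in bitmap:
        wide = [dot for dot in row for _ in range(factor)]
        scaled.extend(wide[:] for _ in range(factor))
    return scaled


if __name__ == "__main__":
    glyph = [[1, 0], [0, 1], [1, 1], [1, 1]]  # two rows of text, two rows of marks
    print(show_word_segment_only(glyph, word_rows=2))
    print(scale_up([[1, 0]], 2))
```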
Because its marked content is comprehensive and its marking method flexible, conjoined marking admits a variety of marking styles. It can mark all of the body, pronunciation, input coding and machine code of a character or word, or only some of them, or a single item. See Fig. 5, "marking styles".
Pattern 1 marks the pronunciation, body and message code of the character. In the word segment, the defined red indicates that the "code-taking (or coding) component" is "心" (heart); in the mark part, the defined red indicates that the "brevity code" of the input coding is "sī". Pattern 2 marks the "pronunciation input coding" and the body; the word segment indicates the "code-taking (or coding) component" in red; the mark part gives the "pronunciation input coding", with the "brevity code" of the input coding indicated in red. Pattern 3 marks the pronunciation and the body; the "form-meaning (distinguishing) feature" part of the input coding is implicit in the word segment, where the component serving as the "form-meaning (distinguishing) feature" is indicated in red; in the mark part, the "brevity code" of the input coding is indicated in red within the "pronunciation" spelling. Pattern 4 marks the machine code and the body; the word segment indicates the "code-taking (or coding) component" in red. Pattern 5 marks only the pronunciation; the word segment shows the original form of the character. Pattern 6 marks only the machine code; the word segment is blank, and the output can be displayed with a "specially agreed" font file.
The concrete practical forms of conjoined marking include marked type matrices, marked word figures, and Chinese phonetic alphabet type matrices and word figures.
To personalize the output display of characters and words, one can use tools in existing operating systems such as "Paint" and the "EUDC Editor", or a "font file editing tool", to generate personalized type matrices and word figures and store them in a database or font file. The operating system can also provide ready-made "blank" font files so that users can conveniently "make" their own type matrices (or word figures).
If intuitive marking is applied to the word segment, conjoined marking can offer a pattern for simplifying Chinese characters. For example, blacken the component "声" in the traditional character "聲" (this character has already been simplified and is used here only as an illustration), render the remaining strokes faintly, mark the pronunciation, make the result into a type matrix, and use it in publications. Readers then pass imperceptibly from the traditional "聲" to the simplified "声", without special memorization and without increasing the number of characters to be learned.
Conjoined marking of Chinese characters treats the simplified, traditional and variant forms of a character as "several different fonts of the same character". They share the same database coding "address", are assigned the same input coding as far as possible, are marked with the same pronunciation, and their type matrices and word figures are stored in different "fields" under the same "address" in the database. In use, the database can be equipped with corresponding font files and with query codes for characters. The font files can be called "simplified", "traditional", "variant", or "variant 1", "variant 2", and so on, or be denoted by corresponding codes. The query code can be built from pronunciation, strokes, components or a popular input coding, which makes it convenient for users who do not know a character to look it up in the database. A user who knows the simplified, traditional and variant forms simply selects the corresponding font file and enters the desired form. A user who does not know them but has a text to refer to proceeds as follows. If the text uses marked Chinese characters, the user selects the font file from the tail mark of the input coding and enters the character with the marked input coding, or treats the "code pattern" appearance of the machine code as a "font code" and enters it by copying the sample. If the text uses unmarked Chinese characters, the user first builds a query code from the strokes, components or a popular input coding of the character, finds the character in the character database, learns the character and its input coding, and then selects the corresponding font file to enter the simplified, traditional or variant form.
"Same code for variant forms" has the advantage of promoting the standardization of the spoken and written language and of leading learners to use simplified characters first; its disadvantage is that it is inconvenient for people who do not know a character to enter it. For the computer encoding of Chinese characters, "same code for variant forms" offers only a standardization proposal and does not solve the root problem: the huge type matrix (or word figure) database for the simplified, traditional and variant forms still exists. To solve the problem fundamentally, it is suggested that Chinese characters be output by synthesizing them from component type matrices (or word figures). As for naming the font files: given names such as "simplified", "traditional", "variant", "variant 1", "variant 2" and so on, how should they relate to existing names such as "Song typeface" and "regular script"? They can be called "Song simplified", "regular-script traditional", and so on.
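The storage idea of one "address" with several "fields" can be sketched as a small Python data structure. This is a hypothetical illustration, not the database layout of the source: the address string `ADDR-1`, the class `CharacterRecord` and both helper functions are invented for the example, and a plain character stands in for the stored type matrix or word figure. The query here uses pronunciation only; strokes or components could serve as query codes in the same way.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CharacterRecord:
    """All forms of one character, stored under a single database address."""
    pronunciation: str
    # Font-file name -> glyph data (a plain character stands in for the bitmap).
    glyphs: Dict[str, str] = field(default_factory=dict)


def select_glyph(db: Dict[str, CharacterRecord], address: str, font: str) -> Optional[str]:
    """Return the glyph stored in the given field, or None if it is absent."""
    record = db.get(address)
    return None if record is None else record.glyphs.get(font)


def query_by_pronunciation(db: Dict[str, CharacterRecord], pronunciation: str) -> List[str]:
    """Look up the addresses of all characters with the given pronunciation."""
    return [addr for addr, rec in db.items() if rec.pronunciation == pronunciation]


# Hypothetical address; the real database coding is not given here.
EXAMPLE_DB = {
    "ADDR-1": CharacterRecord("sheng", {"simplified": "声", "traditional": "聲"}),
}

if __name__ == "__main__":
    for address in query_by_pronunciation(EXAMPLE_DB, "sheng"):
        print(address, select_glyph(EXAMPLE_DB, address, "traditional"))
```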
Conjoined marking provides a query "address" for learning the multiple attributes of a character or word, and marks its necessary attributes on the type matrix and word figure.
The number of marked type matrices (or marked word figures). With conjoined marking, the number of Chinese character type matrices (or word figures) is somewhat larger than the number of type matrices needed for existing Chinese characters. Within the GB range of 6763 Chinese characters, marking the basic syllables requires nearly 7300 type matrices, about 7.5% more than the number of characters; marking syllables with tones requires more than 7600 type matrices, about 13% more than the number of characters. Tone sandhi is not considered; common neutral-tone characters are taken into account. Phrases are more numerous, but common phrases are few and the basic vocabulary is smaller still. The basic vocabulary of Chinese can be made into marked type matrices (or marked word figures). Other phrases can be synthesized from the marked type matrices (or marked word figures) of single characters, with their input coding in "brevity code" form, taking the first letter of each "code-taking" character. Some existing "font file editing tools" are reported to hold more than 60,000 type matrices. If Chinese characters are output by type-matrix synthesis (a known technique), the pronunciation mark of a marked Chinese character can use Chinese phonetic alphabet type matrices and be composed by editing, so that no additional type matrices (or word figures) need to be made for polyphonic readings, tone sandhi, the neutral tone and similar uses. In fact, a marked Chinese character can be synthesized from type matrices (or word figures) of components, pronunciation, input coding and machine code.
Legal patterns for conjoined marking and standardization of codes. Conjoined marking is personalized and diverse in its applied forms, but it should also have legal patterns of its own for use on legal occasions. Such legal patterns need to prescribe, for example, the arrangement (above and below, before and after) of the "word segment" and the "mark part", and to standardize the "pronunciation input coding". The codes involved in conjoined marking, such as the "machine code" and, within it, the "database coding", the "pronunciation code" and the "'personalized' message code", should likewise have a legal or standardized format. These formats need to define the order in which attributes are stated, the length of the code fields, and so on. Until legal patterns and standards exist, conjoined marking can provide users with corresponding reference patterns and reference code formats.
In conjoined marking, the "mark part" states the pronunciation and message code of a character or word; its "other pattern" means that the "mark part" can be "blank" or some other invisible mark.
II. Preparing type matrices and word figures
This part describes how marked type matrices, marked word figures, and Chinese phonetic alphabet type matrices and word figures are actually prepared. They are concrete applications of the conjoined marking method to type matrices and word figures, and they share some common features:
(1) They have a "word segment" and a "mark part", two parts coupled in form into one whole; this whole serves as the written (or technical) symbol that records language (or processes information);
(2) The attributes of a character or word with respect to language and information processing are marked on the type matrix (or on the word figure, or on both); these attributes comprise the pronunciation, body and message code of the character or word, or only the pronunciation and body, or only the message code and body, or only the message code;
(3) The marking materials, including color (or absence of color), chroma, codes, marks, characters and their deformations, or fingerprints, watermarks, magnetic ink and the like, are selected and defined to serve as the signs that mark the character or word;
(4) The "word segment" shows the original form of the character or word, or its marked form;
(5) The "mark part" states the pronunciation and message code of the character or word, and may contain the "pronunciation input coding" and the "machine code", or only the "pronunciation input coding", or only the "pronunciation" (with the input coding implicit), or only the "machine code", or take some other pattern;
(6) One or more attributes of the character or word are expressed visually.
They differ as follows:
(1) Each is a concrete application of the conjoined marking method, to type matrices and to word figures respectively.
(2) In Chinese phonetic alphabet type matrices and word figures, the Chinese phonetic alphabet is treated as "text" and forms the "word segment" that is marked; its written form and its pronunciation mark are one and the same, and its "pronunciation input coding" can be omitted. See Fig. 8, "Chinese phonetic alphabet word figure".
Pattern 1: the word segment and the mark part are both complete; in the input coding, "i" denotes "the first". Pattern 2: only the word segment and the machine code are present; in the figure, the machine code contains only the "pronunciation" code. Pattern 3: only the Chinese phonetic alphabet is present.
When applying these features, note the following: the type matrix or word figure as a whole serves as the written (or technical) symbol that records language (or processes information); a character or word may carry one or more attributes; the marking materials, including color, chroma, codes, marks, characters and their deformations, or fingerprints, watermarks, magnetic ink and the like, must be selected and defined before they can serve as marking signs. Here a "code" may be a numeral or a character, and a "mark" may be a figure or a mark within a stroke.
1. General steps for preparing a marked type matrix (or word figure)
(1) As required, define the marking "attributes" of the character or word, including their content and number; select the marking "materials", including color, chroma or changes of body and their parameters;
(2) As required, define the correspondence between the "attributes" and the marking "materials";
(3) Determine the size of the type matrix (or word figure) and the positions and sizes of the word segment and the mark part, and select the components to be marked;
(4) Realize each marking "attribute" with its marking "material" as defined, in the "EUDC Editor" and "Paint" tools or in a "font file editing tool";
(5) Save the result in a database or font file.
2. Preparing marked type matrices
Here the "marked type matrix" is prepared with the "EUDC Editor" and "Paint" tools, or with a "font file editing tool". The "EUDC Editor", the "Paint" tool and "font file editing tools" are existing, known technology.
(1) Determine the marking "attributes";
(2) Select and define the marking "materials", such as the basic color, the distinguishing parameters of the different chroma levels, or the direction and angle of twist for changes of body;
(3) Determine the correspondence between the marking "materials" and the marking "attributes", for example which object each chroma level, or each twist angle and direction, refers to;
(4) Determine the positions and sizes of the word segment and the mark part, and make or select the typeface, font and code pattern to be marked;
(5) Process each marking "attribute" as defined; for example, deform in batch the "attributes" that require deformation, using the defined twist direction and angle;
(6) Save the result in a database or font file.
Preparing marked type matrices for phrases. Phrases generally contain few characters, at most a dozen or so. They are made and stored with a "font file editing tool" and are not constrained by the customary Chinese character width (2 bytes). Under existing technology, some systems treat a phrase type matrix as having the width of a single Chinese character, so that after a phrase is entered the cursor sometimes has to be moved along several extra positions.
Realizing different chroma levels in a marked type matrix:
To express several chroma levels simultaneously on one type matrix, the principle is to give the different chroma regions of the type matrix different numbers of color dots per unit area, with clearly perceptible "differences" between them. One concrete way to realize this is as follows: first, test and define the number of color dots per unit area (the density) for each chroma level, and make a module for each level; next, outline the typeface to be marked as a "hollow" frame; then fill (dot) the inside of the hollow frame with the prefabricated chroma module of the defined density. See Fig. 9, "example of chroma distinction in a type matrix", in which several squares of different chroma are realized with the "EUDC Editor".
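The dot-density idea can be sketched in Python. This is a minimal illustration with assumed details: a chroma "module" is modeled as a small square tile containing a chosen number of dots, and the hollow frame is modeled as a 0/1 mask whose interior is filled by repeating the tile. The names `chroma_module` and `fill_hollow` and the way dots are spread over the tile are choices made for the example only.

```python
from typing import List

Bitmap = List[List[int]]


def chroma_module(size: int, dots: int) -> Bitmap:
    """Build a size x size tile holding `dots` dots spread over the tile."""
    total = size * size
    if not 0 <= dots <= total:
        raise ValueError("dots must fit inside the tile")
    tile = [[0] * size for _ in range(size)]
    for k in range(dots):
        index = k * total // dots  # distinct positions because total >= dots
        tile[index // size][index % size] = 1
    return tile


def fill_hollow(mask: Bitmap, tile: Bitmap) -> Bitmap:
    """Cover the interior of a hollow frame (mask value 1) with the tile."""
    size = len(tile)
    return [
        [tile[y % size][x % size] if inside else 0 for x, inside in enumerate(row)]
        for y, row in enumerate(mask)
    ]


if __name__ == "__main__":
    light = chroma_module(4, 4)   # 4 dots per 16 cells
    dark = chroma_module(4, 12)   # 12 dots per 16 cells
    interior = [[1] * 8 for _ in range(8)]
    print(sum(map(sum, fill_hollow(interior, light))))
    print(sum(map(sum, fill_hollow(interior, dark))))
```

Two regions filled with tiles of clearly different dot counts then read as two different chroma levels on a one-color type matrix.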
Take the character "思" (sī, "think") as an example.
1. Determine the attributes to be marked: body, pronunciation, input coding and machine code. The body mark indicates the code-taking component "心" (heart). The pronunciation and the input coding are joined together. The machine code states the database coding and the pronunciation code, and carries a check digit. The database coding is defined as the character-set code. For "思", the character-set code is "0CBBCH", the pronunciation code is "3BAH" and the check digit is "003H"; the input coding of "思" is "si`xn";
2. Select the marking materials and define their relation to the attributes. The basic color of the type matrix is black. The marked part of the character body is outlined as "hollow", marking the code-taking component "心". In the pronunciation input coding, the brevity code is set in bold and the form-meaning feature characters in italics. The machine code uses the compact pattern;
3. Marking positions: from top to bottom, the word segment, the pronunciation input coding and the machine code. Sizes: the word segment takes 50% of the height, the pronunciation input coding 30% and the machine code 20%, with suitable gaps between the three parts;
4. The typeface is regular script; its figure is compressed to 50% of the height (the type matrix height, here and below), leaving a gap. The phonetic letters use a general style; their figure is compressed to 30% of the height and to the width of the type matrix, leaving a gap. The machine code is made into a code pattern as required by the attribute definition; its figure is compressed to 20% of the height and to the width of the type matrix, leaving a gap;
5. Join the word segment, the pronunciation input coding and the machine code of "思" into one and adjust the gaps; this yields a preliminary marked type matrix;
6. Output the marked type matrix at sizes from 16 to 128 points and check for "dropped dots" (lost code symbols). If dots are dropped, shift the code symbols left or right until none are dropped over the usable range (for example, from 16 to 128 points). This yields the final marked type matrix of "思";
7. Save the type matrix figure of "思" in the character database or font file. Large numbers of type matrix figures can be edited and stored with a "font file editing tool", for example a "font creation program". Building the character database is generally known technology.
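The 50% / 30% / 20% division of the type matrix height in step 3 can be sketched as a small Python helper. This is an illustration under assumptions: heights are counted in rows of dots, the gap is a fixed number of rows reserved below each part except the last, and the function name `layout_rows` is hypothetical.

```python
from typing import List, Tuple


def layout_rows(
    height: int,
    shares: Tuple[int, ...] = (50, 30, 20),
    gap: int = 1,
) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges, bottom exclusive, for each part.

    Parts are listed from top to bottom: word segment, pronunciation
    input coding, machine code. A gap is reserved between parts.
    """
    ranges = []
    start = 0
    cumulative = 0
    for index, share in enumerate(shares):
        cumulative += share
        end = height * cumulative // 100
        is_last = index == len(shares) - 1
        ranges.append((start, end if is_last else end - gap))
        start = end
    return ranges


if __name__ == "__main__":
    print(layout_rows(128, gap=2))
    print(layout_rows(16))
```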
The marked type matrix of "思" is shown in Fig. 10, "schematic of the marked type matrix of '思'". Fig. 11 shows "a marked type matrix pattern for a phrase"; this figure was made by entering the type matrix by its code and coloring it after output. In the pronunciation input coding part, red indicates the input brevity code "ysjj" of the phrase.
When the machine code is marked with the existing "EUDC Editor", then, to avoid "dropped dots" and to accommodate font sizes from 16 to 128 points, the code symbols of the machine code can be laid out in the 64 × 64 dot grid as in Fig. 12, "example layout without 'dropped codes' (12-dot baseline pattern)". In this "12-dot baseline pattern" (actual size), the 12 black dots ("code symbols") on the baseline are not lost anywhere in the range from 16 to 128 points.
Examples of finished marked type matrices are shown in Fig. 13, "marked type matrix (pattern 1)", and Fig. 14, "marked type matrix (pattern 2)".
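A check for lost code symbols across output sizes can be sketched in Python. This is only a model, not the behavior of any particular font rasterizer: it assumes the 64-dot baseline row is resampled by nearest-neighbor sampling at pixel centers, and it treats a code symbol as lost when none of its columns is sampled. The function names and the sample symbol positions are hypothetical and are not the layout of Fig. 12.

```python
from typing import List, Set, Tuple

Symbol = Tuple[int, int]  # (start column, width) of one black code symbol


def sampled_columns(source_width: int, target_width: int) -> Set[int]:
    """Source columns hit when sampling at the center of each target pixel."""
    return {
        (2 * i + 1) * source_width // (2 * target_width)
        for i in range(target_width)
    }


def dropped_symbols(symbols: List[Symbol], source_width: int, target_width: int) -> List[Symbol]:
    """Return the symbols that no sampled column falls on at this size."""
    hit = sampled_columns(source_width, target_width)
    return [
        (start, width)
        for start, width in symbols
        if not any(column in hit for column in range(start, start + width))
    ]


if __name__ == "__main__":
    thin = [(0, 1)]            # one-dot-wide symbol
    wide = [(0, 4), (60, 4)]   # four-dot-wide symbols
    print(dropped_symbols(thin, 64, 16))
    print(all(dropped_symbols(wide, 64, size) == [] for size in range(16, 129)))
```

In this model a one-dot symbol disappears at 16 points, while four-dot symbols survive at every size from 16 to 128, which is the kind of margin the adjustment step aims for.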
3. Preparing marked word figures
The preparation of a "marked word figure" is similar to that of a marked type matrix; here it is done with the "Paint" tool.
(1) Determine the marking "attributes";
(2) Select and define the marking "materials", such as the basic color, the distinguishing parameters of the different chroma levels, or the direction and angle of twist for changes of body; define the "color order" of the colors and chroma levels. "Color order" means arranging the selected colors and chroma levels into a sequence and numbering them, so as to represent the marking of a given "attribute" or the priority of selection. For color-order diagrams see Fig. 15, "colors and color order", and Fig. 16, "chroma and color order".
(3) Determine the correspondence between the marking "materials" and the marking "attributes", for example which object each chroma level, or each twist angle and direction, refers to;
(4) Determine the size of the word figure and the positions and sizes of the word segment and the mark part, and make or select the typeface, font and code pattern to be marked;
(5) Process each marking "attribute" as defined; for example, deform in batch the "attributes" that require deformation, using the defined twist direction and angle;
(6) Save the result in a database file.
(7) The printed output can be a "color type matrix" or a "gray-scale type matrix".
Again take the character "思" as an example.
1. Determine the attributes to be marked: body, pronunciation, input coding and machine code. The body mark indicates the code-taking component "心" (heart). The pronunciation and the input coding are joined together. The machine code states the database coding and the pronunciation code, and carries a check digit. The database coding is defined as the character-set code. For "思", the character-set code is "0CBBCH", the pronunciation code is "3BAH" and the check digit is "003H"; the input coding of "思" is "si`xn";
2. Select the marking materials and define their relation to the attributes. The basic color of the type matrix is black. The marked part of the character body is indicated in red, marking the code-taking component "心". In the pronunciation input coding, the brevity code is set in bold red and the form-meaning feature characters in italics. The machine code uses the compact pattern;
3. The size of the word figure is assumed to be 128 × 128 pixels. Marking positions: from top to bottom, the word segment, the pronunciation input coding and the machine code. Sizes: the word segment takes 50% of the height (the word figure height, here and below), the pronunciation input coding 30% and the machine code 20%, with suitable gaps between the three parts;
4. The typeface is regular script; its figure is compressed to 50% of the height, leaving a gap. The phonetic letters use a general style; their figure is compressed to 30% of the height and to the width of the word figure, leaving a gap. The machine code is made into a code pattern as required by the attribute definition; its figure is compressed to 20% of the height and to the width of the word figure, leaving a gap;
5. Join the word segment, the pronunciation input coding and the machine code of "思" into one and adjust the gaps; this yields a preliminary marked word figure;
6. Enlarge and reduce the marked word figure and observe whether "dropped dots" (lost code symbols) occur within the usable range. If they do, adjust the code symbols until none are dropped within that range. This yields the final marked word figure of "思";
7. Save the word figure of "思" in the character database. Building the character database is generally known technology.
The marked word figure of "思" is shown in Fig. 17, "schematic of the marked word figure of '思'".
4. Chinese phonetic alphabet type matrices and word figures
Chinese phonetic alphabet type matrices and word figures are the application of the conjoined marking method to the Chinese phonetic alphabet. They are prepared in the same way as marked type matrices and marked word figures. The only difference is that the Chinese phonetic alphabet is treated as "text" and forms the "word segment" that is marked; its written form and its pronunciation mark are one and the same, and its "pronunciation input coding" can be omitted. In practical use it can be input, output and displayed in horizontal lines together with Chinese characters (or marked Chinese characters); it moves the pronunciation mark of a marked Chinese character (a Chinese character input, output and displayed as a marked type matrix or word figure) outside the character, enlarging the display of the body in the word segment; and it can also provide a practical pattern for the "spelling of Chinese characters". Compared with existing uses of the Chinese phonetic alphabet, it is characterized in that: (1) the type matrix and word figure carry a marked message code; (2) a single phonetic letter, or a whole phonetic syllable, is made into one type matrix or word figure; (3) manual input of the Chinese phonetic alphabet is made more convenient. A practical pattern of Chinese phonetic alphabet type matrices (or word figures) is shown schematically in Fig. 8, "Chinese phonetic alphabet word figure". Their use in horizontal lines with Chinese characters is shown schematically in Fig. 42, "Chinese phonetic alphabet type matrix used in horizontal lines with Chinese characters (pattern 1)"; in that figure, the Chinese phonetic alphabet type matrix is marked with the database coding of "思".
III. "Double-spelling" codes and keyboard definition
In conjoined marking, the pronunciation mark of a character or word, its input coding and the description of its form-meaning features can all use the "double-spelling" pattern of the Chinese phonetic alphabet.
In the double-spelling pattern, each initial, final or letter (or combination of letters) is represented by one alphabetic key code on the keyboard. Double-spelling codes should be defined according to the practical needs of the spelling of Chinese characters, so as to absorb the results achieved in that field. This specification offers one practical scheme for defining double-spelling codes. In this specification:
1. The final "ü", where it has to be written as "ü", is replaced by the letter "v".
2. The finals "ê, er, ueng" combine with no initial in Mandarin and belong to zero-initial syllables, so no separate key positions are defined for them. "er", used on its own, is actually typed as the two characters "e" and "r". "ê", if needed on its own, can be represented by the characters "e'". "ueng", should it need to combine with an initial, is represented by the letter combination "u-eng", each part of which is converted to its double-spelling code, giving "u-g" (the symbol "-" can be omitted in practice), without being bound by the existing double-spelling format.
3. The character "ng" is defined as corresponding to the phonetic symbol "兀" (ng).
4. If a newly created syllable has no defined key position yet, it can be spelled with a phonemically close or formally close spelling, expressed in the pattern "x-y...-z" (each character representing one letter or letter combination), and converted to the corresponding double-spelling codes as needed.
5. For zero-initial syllables, the conversion of the medials "i, u, ü" follows the "Scheme for the Chinese Phonetic Alphabet", and the remaining nucleus and coda of the final are represented by the corresponding double-spelling code. For example, "ian" used alone becomes "yan", and its double-spelling code is "yj" (y-an), not "m" (ian).
The double-spelling codes are defined as follows:
"A": final "a"
"B": initial "b", final "ou"
"C": initial "c", final "iao"
"D": initial "d", finals "uang, iang"
"E": final "e"
"F": initial "f", final "en"
"G": initial "g", final "eng" and the letter "ng"
"H": initial "h", final "ang"
"I": initial "ch", final "i"
"J": initial "j", final "an"
"K": initial "k", final "ao"
"L": initial "l", final "ai"
"M": initial "m", final "ian"
"N": initial "n", final "in"
"O": finals "o, uo"
"P": initial "p", finals "un, ün"
"Q": initial "q", final "iu"
"R": initial "r", finals "uan, üan"
"S": initial "s", finals "iong, ong"
"T": initial "t", final "üe (ue)"
"U": initial "sh", final "u"
"V": initial "zh", finals "ui, ü"
"W": letter "w", finals "ua, ia"
"X": initial "x", final "ie"
"Y": letter "y", finals "uai, ing"
"Z": initial "z", final "ei"
In this specification, the form-meaning feature descriptions of characters and words (in the examples) use the "double-spelling" pattern.
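The key definitions above can be turned into a small encoder, sketched here in Python. The two tables are transcribed from the definitions in this section; the function `double_spell` and its splitting rule (try a two-letter initial first, then a one-letter initial) are assumptions made for the illustration. The sketch covers only syllables written with an initial or with the letters "y" and "w"; zero-initial spellings such as "er" or "an" are not handled.

```python
# Key letters for initials (and the letters "w" and "y").
INITIAL_KEYS = {
    "b": "b", "c": "c", "d": "d", "f": "f", "g": "g", "h": "h",
    "ch": "i", "j": "j", "k": "k", "l": "l", "m": "m", "n": "n",
    "p": "p", "q": "q", "r": "r", "s": "s", "t": "t", "sh": "u",
    "zh": "v", "w": "w", "x": "x", "y": "y", "z": "z",
}

# Key letters for finals.
FINAL_KEYS = {
    "a": "a", "ou": "b", "iao": "c", "uang": "d", "iang": "d", "e": "e",
    "en": "f", "eng": "g", "ang": "h", "i": "i", "an": "j", "ao": "k",
    "ai": "l", "ian": "m", "in": "n", "o": "o", "uo": "o", "un": "p",
    "ün": "p", "iu": "q", "uan": "r", "üan": "r", "iong": "s", "ong": "s",
    "üe": "t", "ue": "t", "u": "u", "ui": "v", "ü": "v", "ua": "w",
    "ia": "w", "ie": "x", "uai": "y", "ing": "y", "ei": "z",
}


def double_spell(syllable: str) -> str:
    """Convert one toneless pinyin syllable to its two-letter code."""
    text = syllable.lower().replace("v", "ü")  # "v" typed in place of "ü"
    for length in (2, 1):  # try "zh", "ch", "sh" before single letters
        initial, final = text[:length], text[length:]
        if initial in INITIAL_KEYS and final in FINAL_KEYS:
            return INITIAL_KEYS[initial] + FINAL_KEYS[final]
    raise ValueError(f"no double-spelling code defined for {syllable!r}")


if __name__ == "__main__":
    for syllable in ("si", "xin", "yan", "zhuang"):
        print(syllable, double_spell(syllable))
```

The results for "yan" ("yj") and for "xin" ("xn", the feature part of the input coding "si`xn") match the examples given in this description.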
The input coding of marked Chinese characters uses "double-spelling" keystrokes, which gives code input a fixed rhythm and reduces the number of keystrokes. The "double-spelling" key positions on the keyboard are defined below.
The separator in "pronunciation`feature" (the input coding, for short) within the "pronunciation input coding" and the syllable-dividing mark of the Chinese phonetic alphabet can be defined separately or uniformly. Defined separately: as separators of the input coding, the character "`" on key No. 41 of the IBM QWERTY keyboard (ASCII value 96) stands between the pronunciation and the form-meaning features, and the character "-" on key No. 12 (ASCII value 45) stands between form-meaning features; other symbols may be used instead. The syllable-dividing mark of the Chinese phonetic alphabet is the character "'" on key No. 40 of the IBM QWERTY keyboard (ASCII value 39), or another character. Defined uniformly: the separator of the input coding and the syllable-dividing mark of the Chinese phonetic alphabet are unified as the syllable-dividing mark, represented by the character "'" on key No. 40 of the IBM QWERTY keyboard (ASCII value 39), or by another character. On the numeric keypad, to reduce the number of symbol definitions, the separator and the syllable-dividing mark are unified as the syllable-dividing mark and represented by the numeric key "0". Chinese punctuation marks are defined as in the operating system.
1. Key definitions for the standard (QWERTY) keyboard
The standard settings of the original keyboard are unchanged. The ordinary spelling code is defined as on the original standard keyboard, so only the double spelling code is described here. In "key No. XX (XX)", the first number is the key number on the IBM standard keyboard and the number in brackets is the ASCII value of the character.
Key No. 16 (81): initial "q", final "iu"; key No. 17 (87): letter "w", finals "ua, ia";
Key No. 18 (69): final "e"; key No. 19 (82): initial "r", finals "uan, üan";
Key No. 20 (84): initial "t", final "üe (ue)"; key No. 21 (89): letter "y", finals "uai, ing";
Key No. 22 (85): initial "sh", final "u"; key No. 23 (73): initial "ch", final "i";
Key No. 24 (79): finals "o, uo"; key No. 25 (80): initial "p", finals "un, ün";
Key No. 30 (65): final "a"; key No. 31 (83): initial "s", finals "iong, ong";
Key No. 32 (68): initial "d", finals "uang, iang"; key No. 33 (70): initial "f", final "en";
Key No. 34 (71): initial "g", characters "eng, ng"; key No. 35 (72): initial "h", final "ang";
Key No. 36 (74): initial "j", final "an"; key No. 37 (75): initial "k", final "ao";
Key No. 38 (76): initial "l", final "ai"; key No. 44 (90): initial "z", final "ei";
Key No. 45 (88): initial "x", final "ie"; key No. 46 (67): initial "c", final "iao";
Key No. 47 (86): initial "zh", finals "ui, ü"; key No. 48 (66): initial "b", final "ou";
Key No. 49 (78): initial "n", final "in"; key No. 50 (77): initial "m", final "ian".
The symbols for the initials, finals, letters and syllable-dividing (and separator) marks assigned by the double spelling code are all marked on the keycaps of the standard keyboard, or beside the keycaps.
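The key table above can be sketched as a lookup, assuming that the first keystroke of a syllable selects an initial and the second selects a final. The names `KEYS` and `candidates` are illustrative and do not come from the specification. Where a key carries two finals, the sketch simply returns both readings; how the real method chooses between them, and how syllables without an initial are typed, is not stated here and is left open.

```python
# Key letter -> (initial or None, finals), transcribed from the key table above.
# The alternative spelling "ue" for "üe" is omitted for brevity.
KEYS = {
    "q": ("q", ["iu"]),           "w": ("w", ["ua", "ia"]),
    "e": (None, ["e"]),           "r": ("r", ["uan", "üan"]),
    "t": ("t", ["üe"]),           "y": ("y", ["uai", "ing"]),
    "u": ("sh", ["u"]),           "i": ("ch", ["i"]),
    "o": (None, ["o", "uo"]),     "p": ("p", ["un", "ün"]),
    "a": (None, ["a"]),           "s": ("s", ["iong", "ong"]),
    "d": ("d", ["uang", "iang"]), "f": ("f", ["en"]),
    "g": ("g", ["eng", "ng"]),    "h": ("h", ["ang"]),
    "j": ("j", ["an"]),           "k": ("k", ["ao"]),
    "l": ("l", ["ai"]),           "z": ("z", ["ei"]),
    "x": ("x", ["ie"]),           "c": ("c", ["iao"]),
    "v": ("zh", ["ui", "ü"]),     "b": ("b", ["ou"]),
    "n": ("n", ["in"]),           "m": ("m", ["ian"]),
}


def candidates(pair):
    """Return every initial+final reading of a two-key double spelling code."""
    first, second = pair.lower()  # exactly two keystrokes are expected
    initial, _ = KEYS[first]
    if initial is None:
        # Assumption: zero-initial syllables are outside this sketch.
        raise ValueError(f"key {first!r} carries no initial")
    return [initial + final for final in KEYS[second][1]]


print(candidates("uh"))  # key U gives "sh", key H gives "ang"
print(candidates("jd"))  # key D carries two finals, so two readings come back
```

A practical input method would filter the candidates against a table of valid syllables, since a pair such as "jd" yields one reading that is not a real syllable.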
2. Key definitions for the numeric keypad
A recommended national standard already exists for the key definitions of the Chinese phonetic alphabet. A different definition scheme is given here. The two schemes can be converted into each other ("letter/digit" conversion) through a basic code table.
(1) Chinese phonetic alphabet:
Digit key "1" represents the phonetic letters "a, b"; digit key "2" represents the phonetic letters "c, d";
Digit key "3" represents the phonetic letters "e, f"; digit key "4" represents the phonetic letters "g, h, i";
Digit key "5" represents the phonetic letters "j, k, l"; digit key "6" represents the phonetic letters "m, n, o";
Digit key "7" represents the phonetic letters "p, q, r"; digit key "8" represents the phonetic letters "s, t, u";
Digit key "9" represents the phonetic letters "v, w, x"; digit key "0" represents the phonetic letters "y, z".
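The letter-to-digit assignment above can be sketched as a small lookup. The names `GROUPS`, `LETTER_TO_DIGIT` and `to_digits` are illustrative only; tone marks, "ü" and the syllable-dividing mark are not handled in this sketch.

```python
# Digit key -> phonetic letters, transcribed from the list above.
GROUPS = {
    "1": "ab", "2": "cd", "3": "ef", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqr", "8": "stu", "9": "vwx", "0": "yz",
}

# Invert the table so each letter maps to its digit key.
LETTER_TO_DIGIT = {ch: digit for digit, letters in GROUPS.items() for ch in letters}


def to_digits(pinyin):
    """Convert an unaccented pinyin string to its digit-key sequence."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in pinyin.lower())


print(to_digits("si"))
print(to_digits("wei"))
```

Because several letters share one key, the conversion is one-way: going back from digits to letters needs a code table, as the text notes.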
(2) Double spelling code:
Digit key "1" represents the double spelling initial "b" and the finals "a, ou";
Digit key "2" represents the double spelling initials "c, d" and the finals "iao, iang, uang";
Digit key "3" represents the double spelling initial "f" and the finals "e, en";
Digit key "4" represents the double spelling initials "g, h, ch" and the characters "eng, ng, ang, i";
Digit key "5" represents the double spelling initials "j, k, l" and the finals "an, ao, ai";
Digit key "6" represents the double spelling initials "m, n" and the finals "ian, in, o, uo";
Digit key "7" represents the double spelling initials "p, q, r" and the finals "un, ün, iu, uan, üan";
Digit key "8" represents the double spelling initials "s, t, sh" and the finals "iong, ong, üe, ue, u";
Digit key "9" represents the double spelling initials "zh, x", the letter "w", and the finals "ui, ü, ia, ua, ie";
Digit key "0" represents the double spelling initial "z", the letter "y", and the finals "ing, uai, ei".
The symbols for the initials, finals, letters and syllable-dividing (and separator) marks assigned by the digit code are marked on the keycaps of the keypad, or beside the keycaps.
Four, applications of the conjoined mark
Some concrete applications of the conjoined mark are listed here.
Database management of type matrices and word figures. Each type matrix (or word figure) is encoded (the coding scheme is described above), and its dot-matrix (or pixel) image is stored in the corresponding field of a database. The database has fields for the pronunciation and stroke order of the character and for the various codings in common use, together with the corresponding search and retrieval programs. The concrete operations can be carried out with ordinary database knowledge.
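The database described above might look like the following minimal sketch, which uses SQLite purely for illustration. The table name, column names, the code "0001" and the blank 16x16 bitmap are all hypothetical; the specification does not prescribe a schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE glyph (
           code          TEXT PRIMARY KEY,  -- code of the type matrix or word figure
           pronunciation TEXT,
           stroke_order  TEXT,
           input_code    TEXT,              -- one of the common input codings
           bitmap        BLOB               -- dot-matrix (pixel) image
       )"""
)


def store(code, pronunciation, stroke_order, input_code, bitmap):
    """Insert one type matrix record."""
    conn.execute(
        "INSERT INTO glyph VALUES (?, ?, ?, ?, ?)",
        (code, pronunciation, stroke_order, input_code, bitmap),
    )


def find_by_pronunciation(pronunciation):
    """Return (code, input_code, bitmap) for every record with this pronunciation."""
    rows = conn.execute(
        "SELECT code, input_code, bitmap FROM glyph WHERE pronunciation = ?",
        (pronunciation,),
    )
    return rows.fetchall()


blank_16x16 = bytes(32)  # 16 x 16 dots, one bit per dot, all empty
store("0001", "sī", "", "sī`xn", blank_16x16)
print(find_by_pronunciation("sī"))
```

Further fields for other codings can be added as extra columns in the same way.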
Ways of using marked type matrices.
Marked type matrices can be made into a separate font file, so that the marked Chinese characters appear in a text in a uniform ("homogeneous") way; this is done with a "font file editing tool". They can also be mixed into text set in an existing font, using the existing "EUDC Editor". The advantage of the latter is that the words which need marking are easy to input and typeset, while obviously "different", easily recognized characters may be left unmarked. Either approach will serve.
A description of a word's "attributes" covers one or more of the following: font "personalization", "pronunciation", "strokes", "stroke order", "structural style", "component composition", "selected components" and "preferred component".
Miscellaneous examples of applying the conjoined mark. Here color, shade and character deformation are used as the marking material.
1, color (or shade) and color order
The colors (or shades) to be used are chosen as required; they are arranged in a sequence and numbered, and this numbering is called the "color order" here. "Last" denotes the final number in the sequence. The position in the color order gives the order in which the items of a given "attribute" are selected (and marked). Examples are given in Figure 15, "color and color order", and Figure 16, "shade and color order".
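The idea of a color order can be sketched as pairing items with colors position by position. The function name `assign_colors` is illustrative; the sample data follows the description of Figure 26, where four components in coding order are marked red, green, blue and black.

```python
def assign_colors(items, color_order):
    """Pair each item with the color at the same position in the color order."""
    if len(items) > len(color_order):
        raise ValueError("the color order is too short for the items to mark")
    return list(zip(items, color_order))


# Components in the order they are selected for coding, per Figure 26.
marked = assign_colors(
    ["heart", "wood", "lance", "wood"],
    ["red", "green", "blue", "black"],
)
print(marked)
```

A reader who knows the color order can then recover the selection order from the colors alone.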
2, structure of Chinese characters
Broadly, the structures of Chinese characters fall into three classes: left-right structure, top-bottom structure, and enclosure (including mixed) structure. On closer division, however, there are many specific patterns. The conjoined mark helps in learning character structure and in information coding. In Figures 18 to 23, different colors distinguish the structures: 6 kinds of left-right structure, 6 kinds of top-bottom structure, 7 kinds of inside-outside enclosure, and 5 kinds of other structure. An independent character is shown with a single color or with no color.
(1) left-right structure, as in Figure 18, "left-right structure";
(2) top-bottom structure, as in Figure 19, "top-bottom structure";
(3) inside-outside enclosure: 1. semi-enclosed, as in Figure 20, "inside-outside enclosure (semi-enclosed structure)";
2. fully enclosed, as in Figure 21, "inside-outside enclosure (fully enclosed structure)";
(4) other structures, as in Figure 22, "other structures";
(5) independent character, as in Figure 23, "independent character".
3, marking the "radical classification" (Figure 24, "marking the 'radical classification'")
The character for "think" (sī) is an associative compound that is also phono-semantic. It is formed from "heart" and "fontanel" (xìn), and "fontanel" also gives the sound. "Fontanel" stands for the brain; the ancients held that heart and brain together produce thought. Original meaning: to think, to consider. In radical classification, some dictionaries file the character under the "field" radical, and some file it under both the "field" radical and the "heart" radical. Here it is filed under the "heart" radical, which is marked in red directly on the glyph.
The character for "meaning" is an associative compound, formed from "heart" and "sound". Original meaning: will, intention. In radical classification, some dictionaries file the character under the "standing" radical, and some file it under both the "standing" radical and the "heart" radical. Here it is filed under the "heart" radical, which is marked in red directly on the glyph.
4, marking the "preferred" component for Chinese character coding (code extraction) (Figure 25, "marking the 'preferred' component for Chinese character coding")
In Chinese character information processing, coding a character (code extraction) sometimes requires its "preferred" component to be determined. In particular, when the position of the "preferred" component does not agree with the writing order, it has to be memorized. Here the "preferred" component is marked intuitively in red.
5, marking the selected components for Chinese character coding (code extraction) and their order (Figure 26, "selected components and order")
In Chinese character information processing, coding a character (code extraction) requires determining the components and the order in which they are selected. These components and their order have to be memorized; here they are marked intuitively using the color order.
6, marking the starting stroke in writing a Chinese character (Figure 27, "marking the starting stroke in writing a Chinese character")
The starting stroke in writing the character is marked in red.
7, marking the coding (code extraction) component (or stroke) of an independent character (Figure 28, "coding component (or stroke) of an independent character")
8, indicating the pronunciation of a Chinese character within a "pinyin character string".
For example, one input code of the character for "micro" is "wēi'chi"; color can be used to mark "wēi" as the pronunciation of the character, as in Figure 29, "marking the pronunciation in a character string".
9, indicating the input code of a Chinese character within a "pinyin character string".
For example, one input code of the character for "micro" is "wēi'chi"; color can be used to mark the short code of its input coding, "wēi'c", as in Figure 30, "short code in the input coding".
10, marking by partial character deformation, as in Figure 31, "marking the short code by character deformation".
11, specifying a common coding (code extraction) component for "simplified, traditional and variant forms". Characters of this kind have the same pronunciation and the same input code; if, in addition, their database code in the machine code is the same, their use can be pushed toward unification and simplification (see Figure 32, "specifying the common coding (code extraction) component of simplified and traditional (variant) forms").
12, distinguishing the tones of pinyin syllables by color (or shade): using the color order, the first letter or the main vowel of the syllable is marked to indicate the four tones, as in Figure 33, "distinguishing the tones of pinyin syllables by color (or shade)".
13, distinguishing the tones of pinyin syllables by character deformation: the first letter or the main vowel of the syllable is deformed to indicate the four tones, as in Figure 34, "distinguishing the tones of pinyin syllables by character deformation (1)", and Figure 35, "distinguishing the tones of pinyin syllables by character deformation (2)".
14, distinguishing the tones of pinyin syllables by color (or shade) combined with character deformation: the first letter or the main vowel of the syllable is marked to indicate the four tones, as in Figure 36, "distinguishing the tones of pinyin syllables by color (or shade) combined with character deformation".
15, indicating the pronunciation of a Chinese character within a "pinyin character string" by character deformation, alone or combined with "color marking", as in Figure 37, "marking the pronunciation with color and deformation".
In "wēi'chi" (the "pinyin character string" of the character for "micro"), "wēi" is set in bold to mark it as the pronunciation of the character. The pronunciation may also be marked in combination with "color marking".
16, marking on a stroke or component to indicate a "practical attribute" of the character, for example the code extraction component (Figure 38, "marking on a stroke or component").
17, in a conjoined mark, the character part and the mark part can be arranged in different patterns, and the content of the mark part can be decided according to need and technical conditions. Figure 39, "conjoined mark (no machine code)", illustrates three arrangement patterns that do not use a machine code.
18, the special-purpose marking, as in Figure 40, "special-purpose marking". An individual's own handwriting, fingerprint and machine code (password) are combined in a conjoined mark, or specified text content is encrypted with the machine code, and after notarization a "special-purpose marking" is generated. Such a marking is conjoined (it cannot be transferred to another use), unique, and has legal effect.
19, "personalized" type matrices. The use of the machine code makes "personalization" of the type matrix possible, because variation in the shape of the characters does not affect recognition by people or machines. In Figure 41, "'personalized' type matrix (illustration)", a Tang poem is made into a single type matrix and marked with the "address" code of that poem in the database, so machine recognition is not affected. In the figure the database code is hypothetical, and the character part was made with a "font file editing tool". The pronunciation code of the verse can also be marked on the figure; marking the pronunciation of ancient texts is especially helpful for reading classical Chinese. This is, of course, only a simple illustration of "personalized" use of type matrices (and word figures). The character part can use one's "own" handwriting.
Five, explanation of the drawings of the specification
Fig. 1, "conjoined mark (single character)". In the figure: 1 is the "character part"; 2 is the "mark part"; 3 is the "component classification feature" (the component "heart" of the character for "think" is marked in red); 4 is the "pronunciation input coding" (the pronunciation code "sī" is marked in red and bold; the feature code "xn" is marked in italic, that is, slanted); 5 is the "machine code"; 6 is the blue box, within which is the configuration (application unit) of the conjoined mark (single character). Fig. 1 is a color word figure; the drawings of the specification are printed in black and gray.
Fig. 2, "conjoined mark (phrase)". In the figure: 1 is the "character part"; 2 is the "mark part"; 3 is the "component classification feature" (the two "heart" components of the phrase "thought" are marked in red); 4 is the "pronunciation input coding"; 5 is the "machine code"; 6 is the blue box, within which is the configuration (application unit) of the conjoined mark (phrase). Fig. 2 is a color word figure; the drawings of the specification are printed in black and gray.
Fig. 3, "pronunciation input coding". In the figure: 1 is the "pronunciation part" (the pronunciation code "sī" is marked in red and bold); 2 is the "separator"; 3 is the "feature part" (the feature code "xn" is marked by italic deformation). Fig. 3 is a color figure; the drawings of the specification are printed in black and gray.
Fig. 4, "pronunciation input coding (2)". In the figure: 1 is the "pronunciation part" (the pronunciation code "sī" is marked in red and bold); 2 is the "separator"; 3 is the "feature part" (the feature code "xn" is marked by italic deformation); 4 is the "font code" (the font code "f" is marked in blue italic). Fig. 4 is a color figure; the drawings of the specification are printed in black and gray.
Fig. 5, "marking styles". In the figure: 1, the "all marks" style; 2, the "partial marks" style; 3, the "single-item mark" style. 1-1, "style 1": in the character part, the component "heart" of the character for "think" is marked in red; in the pronunciation input coding, the pronunciation code "sī" is marked in red and bold and the feature code "xn" by italic deformation. 2-1, "style 2": in the character part, the component "heart" of the character for "think" is marked in red; in the pronunciation input coding, the pronunciation code "sī" is marked in red and bold and the feature code "xn" by italic deformation. 2-2, "style 3": in the character part, the component "heart" of the character for "think" is marked in red; the pronunciation code "sī" is marked in red and bold. 2-3, "style 4": in the character part, the component "heart" of the character for "think" is marked in red. 3-1, "style 5". 3-2, "style 6": machine code only; the character part is blank (no character) and is indicated by a lavender "think" character. Fig. 5 is a color figure; the drawings of the specification are printed in black and gray.
Fig. 6, "machine code (sparse pattern)". In the figure: 1.0 is the "arrangement of bit weights in each code group" (the weights "8421" correspond to the positions indicated by the red lines); 2.0 is the "vertical code-symbol reference", the "zigzag" code pattern at the left side of the machine code indicated by the red line; 3.0 is the "horizontal code-symbol reference", the "black and white rectangle" code pattern along the bottom edge of the machine code indicated by the red line.
Fig. 7, "machine code (compact pattern)". In the figure: 1.0 is the "arrangement of bit weights in each code group" (the weights "8421" correspond to the positions indicated by the red lines); 2.0 is the "vertical code-symbol reference", the "zigzag" code pattern at the left side of the machine code indicated by the red line; 3.0 is the "horizontal code-symbol reference", the "black and white rectangle" code symbols along the bottom edge of the machine code indicated by the red line.
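The "8421" weights can be illustrated with a minimal sketch. It assumes that each code group consists of four cells, that a filled cell counts as 1, and that a group's value is the sum of the weights of its filled cells. These are assumptions about how the machine code is read, not details confirmed by the figure descriptions, and the names `WEIGHTS`, `group_value` and `decode` are illustrative.

```python
WEIGHTS = (8, 4, 2, 1)  # weight of each cell position in one code group


def group_value(cells):
    """Sum the weights of the filled cells in one four-cell group."""
    if len(cells) != len(WEIGHTS):
        raise ValueError("a code group must have exactly four cells")
    return sum(weight for weight, filled in zip(WEIGHTS, cells) if filled)


def decode(groups):
    """Decode a sequence of four-cell groups into their values."""
    return [group_value(group) for group in groups]


print(decode([(0, 1, 0, 1), (1, 0, 0, 1)]))
```

Under these assumptions each group carries one value from 0 to 15, and a code is read group by group.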
Fig. 8, "Chinese phonetic alphabet word figure". In the figure: 1, "style 1": the pinyin part is red and bold; in the input coding, the tone (high level tone) is marked by a red, italic "i". 2, "style 2": the pinyin part is red and bold. 3, "style 3": only the pinyin part, red and bold. Fig. 8 is a color figure; the drawings of the specification are printed in black and gray.
Fig. 9, "example of distinguishing type matrices by shade". In the figure, the shades run from dark to light in the order 1, 2, 3, 4, 5.
Figure 10, "illustration of the marked type matrix of 'think'". In the figure: in the character part, the component "heart" of the character for "think" is a "hollow" character; in the pronunciation input coding, the pronunciation code "sī" is marked in bold and the feature code "xn" by italic deformation.
Figure 11, "a marked type matrix style for phrases". In the figure: the Chinese character part is in red glyphs; in the pinyin part, the initial of each syllable is a red italic character. Figure 11 is a color figure; the drawings of the specification are printed in black and gray.
Figure 12, "example of a setting with no 'dropped code' (base 12 pattern)". In the figure: 1, "16 point": the actual font size is 16 points. 2, "18 point": the actual font size is 18 points. 3, "20 point": the actual font size is 20 points. 4, "28 point": the actual font size is 28 points. 5, "36 point": the actual font size is 36 points. 6, "48 point": the actual font size is 48 points. 7, "72 point": the actual font size is 72 points. 8, "96 point": the actual font size is 96 points. 9, "128 point": the actual font size is 128 points.
Figure 13, "marked type matrix (style 1)". In the figure: 1, the character "big", with the left radical "hollow". 2, the character "big", with the starting stroke "hollow". 3, the character "big", with the beginning end of the starting stroke "hollow". 4, the character "big", with the whole character "hollow". 5, the character "big", with the glyph reversed (white on black). 6, the character "big", with the glyph reversed and a "pit" inside the reversed area. 7, the character "big", comprising the character part and the pronunciation input coding; in the character part, the left radical is "hollow". 8, a marked type matrix containing only the pronunciation input coding and the machine code.
Figure 14, "marked type matrix (style 2)". In the figure: 1, contains the character part and the pronunciation mark. 2, contains the character part and the machine code. 3, contains the character part, the pronunciation input coding and the machine code. 4, contains only the pronunciation input coding and the machine code. 5, contains the character part, the pronunciation mark and the machine code, with the pronunciation mark and the character part arranged horizontally. 6, a Chinese phonetic alphabet type matrix.
Figure 15, "color and color order". In the figure, each color and its color order are written in the form "color name (color order number)". 1, "red (1)": the color order of red is defined as 1. 2, "green (2)": the color order of green is defined as 2. 3, "blue (3)": the color order of blue is defined as 3. 4, "yellow (4)": the color order of yellow is defined as 4. 5, "purple (5)": the color order of purple is defined as 5. 6, "light blue (6)": the color order of light blue is defined as 6. 7, "light green (7)": the color order of light green is defined as 7. 8, "...": the ellipsis indicates that there are further colors and color order numbers. 9, "black (second from last)": the color order of black is defined as "second from last". 10, "white (last)": the color order of white is defined as "last". Figure 15 is a color figure; the drawings of the specification are printed in black and gray.
Figure 16, "shade and color order". In the figure, each shade and its color order are written in the form "color name (color order number)". 1, "red (1)": the color order of red is defined as 1. 2, "secondary red (2)": the color order of secondary red is defined as 2. 3, "light red (3)": the color order of light red is defined as 3. 4, "white (last)": the color order of white is defined as "last". 5, "green (1)": the color order of green is defined as 1. 6, "secondary green (2)": the color order of secondary green is defined as 2. 7, "light green (3)": the color order of light green is defined as 3. 8, "blue (1)": the color order of blue is defined as 1. 9, "secondary blue (2)": the color order of secondary blue is defined as 2. 10, "light blue (3)": the color order of light blue is defined as 3. Figure 16 is a color figure; the drawings of the specification are printed in black and gray.
Figure 17, "illustration of the marked word figure of 'think'". In the figure: in the character part of the word figure, the component "heart" is marked in red; in the pronunciation input coding "sī`xn", the pronunciation code "sī" is marked in red and bold and the feature code "xn" by italic deformation. Figure 17 is a color figure; the drawings of the specification are printed in black and gray.
Figure 18, "left-right structure". In the figure: 1, "left and right balanced": the character "group", with the right component "sheep" marked in yellow. 2, "small left, large right": the character "big", with the left component "Ren" marked in yellow. 3, "large left, small right": the character "just", with the right component "Dao" marked in yellow. 4, "right side divided again": the character "wedding", with the left component "woman" marked in yellow; the right component "dusk" is further divided into the components "family name" and "day", and the upper-right component "family name" is marked in purple. 5, "left side divided again": the character "portion", with the right component "Fu" marked in yellow; the left component "sound" is further divided into the components "stand" and "mouth", and the upper-left component "stand" is marked in purple. 6, "three side by side": the character "thanks", with the left component "Yan" marked in yellow and the middle component "body" marked in purple.
Figure 19, "top-bottom structure". In the figure: 1, "top and bottom balanced": the character "think", with the lower component "heart" marked in yellow. 2, "small top, large bottom": the character "word", with the upper component "Http" marked in yellow. 3, "large top, small bottom": the character "ridge", with the lower component "soil" marked in yellow. 4, "bottom divided again": the character "despot", with the upper component "rain" marked in yellow; in the remaining part, the left component "leather" is marked in purple. 5, "top divided again": the character "temporarily", with the lower component "day" marked in yellow; in the remaining part "cut", the left component "car" is marked in purple. 6, "three stacked": the character "meaning", with the lower component "heart" marked in yellow and the upper component "stand" marked in purple.
Figure 20, "inside-outside enclosure (semi-enclosed structure)". In the figure: 1, "enclosed from the upper left": the character "mediocre", with the upper-left component "extensively" marked in yellow. 2, "enclosed from the lower left": the character "fan", with the lower-left component "Chuo" marked in yellow. 3, "enclosed from the upper right": the character "pasture", with the upper-right component "Bao" marked in yellow. 4, "enclosed on three sides from the left": the character "casket", with the component "Contraband" marked in yellow. 5, "enclosed on three sides from above": the character "spare time", with the component "door" marked in yellow. 6, "enclosed on three sides from below": the character "act of violence", with the component "Qian" marked in yellow.
Figure 21, "inside-outside enclosure (fully enclosed structure)". In the figure: 1, "inside-outside enclosure (fully enclosed structure)": the character "Gu" is divided into the component "mouth" and the component "Gu"; the component "mouth" is marked in yellow.
Figure 22, "other structures". 1, "upper part three side by side": the character "diligent", with the lower component "heart" marked in yellow; the upper component "Mao" is divided into the three components "wood", "lance", "wood", with the left component "wood" marked in purple and the middle component "lance" marked in blue. 2, "middle part divided top and bottom": the character "swash", with the left component "Rui" marked in yellow; the middle component (glyph image A0214747700232) is further divided into the component "white" and the component "side", with "white" marked in blue and "side" marked in green. 3, "middle part divided left and right": the character "basket", with the upper component marked in yellow; the middle-left component (glyph image A0214747700233) is marked in purple, and the middle-right component (glyph image) is marked in blue. 4, "middle part mixed": the character "rate", with the top component "Tou" marked in yellow; the middle part combines the component (glyph image A0214747700235) with "one", the former marked in blue and the component "one" marked in purple. 5, "top and bottom divided again": the character "device" is divided into the three parts "Song", "dog", "Song"; the middle component "dog" is marked in yellow; the upper component "Song" is further divided into two "mouth" components, the left one marked in purple and the right one in blue; the lower component "Song" is likewise divided into two "mouth" components, the left one marked in green.
Figure 23, " independent body "; 1, the Chinese character on the left side " mountain " is made the background color sign with yellow; 2, the Chinese character on the right " mountain " need not be color-coded.
Figure 24, " mark ' the radicals by which characters are arranged in traditional Chinese dictionaries classification "; 1, Chinese character " think of ", radicals by which characters are arranged in traditional Chinese dictionaries classify as " heart ", use red marker; 2, Chinese character " meaning ", radicals by which characters are arranged in traditional Chinese dictionaries classify as " heart ", use red marker.
Figure 25, " mark encode Chinese characters for computer ' first-selection ' parts "; 1, Chinese character " portion ", encode Chinese characters for computer " first-selection " parts are " Fu ", use red marker; 2, Chinese character " diligent ", encode Chinese characters for computer " first-selection " parts are " heart ", use red marker.
Figure 26, " selected components and order "; 1, Chinese character " diligent ", selected components and order are " heart, wood, lance, wood ", use " red, green, blue, black " sign respectively; 2, Chinese character " portion ", selected components and order are " Fu, upright, mouth ", use " red, green, blueness " sign respectively.
Figure 27, "marking the first stroke in writing a Chinese character": 1. for the Chinese character "nine", the first stroke written is "Pie", marked in red; 2. for the Chinese character "power", the first stroke written is [stroke glyph not reproduced], marked in red; 3. for the Chinese character "light", the first stroke written is "Shu", marked in red.
Figure 28, "coding component (or stroke) of a single-component Chinese character": 1. for the single-component character "husband", the coding component (stroke) is "two", marked in red; 2. for the single-component character "mountain", the coding component (stroke) is "Shu", marked in red.
Figure 29, "marking the pronunciation within a character string": in the character string "wēi'chi", the pronunciation part "wēi" is marked in red.
Figure 30, "short code within the input code": in the character string "wēi'chi", the short code of the input code, "wēi`c", is marked in red.
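The relationship between the full string and its short code in Figures 29 and 30 can be sketched as follows. The rule used here (pronunciation, separator, then the first letter of the rear part) is inferred from the single example in the text and is an assumption, as are the function names and the set of accepted separators.

```python
# Separators seen in the figure descriptions; the patent says the separator
# may be defined as needed, so this tuple is only an assumption.
SEPARATORS = ("'", "`")


def split_code(code):
    """Split an input code into (pronunciation, separator, rear part)."""
    for i, ch in enumerate(code):
        if ch in SEPARATORS:
            return code[:i], ch, code[i + 1:]
    # No separator: the whole string is treated as the pronunciation.
    return code, "", ""


def short_code(code, keep=1):
    """Keep the pronunciation plus the first `keep` letters of the rear part."""
    pronunciation, sep, rear = split_code(code)
    return pronunciation + sep + rear[:keep]


parts = split_code("wēi'chi")
brief = short_code("wēi`chi")
```

The pronunciation part is what Figure 29 marks in red, and the shortened string is what Figure 30 marks.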
Figure 31, "marking the short code by character deformation": in the character string "wēi'chi", the short code of the input code, "wēi`c", is marked both in red and by character deformation (italics).
Figure 32, "designating the common coding (code-taking) component of the simplified, traditional and variant forms": for the Chinese character "thoroughly", the simplified, traditional and variant forms are shown as "1, 2, 3" respectively; the designated common coding (code-taking) component, "cave", is marked in red.
Figure 33, "distinguishing the tones of Pinyin syllables by color (or chroma)": 1. the letter "a" in the syllable "ma" marked in red indicates the first tone (high level), representing the syllable "mā"; 2. the letter "a" marked in aubergine indicates the second tone (rising), representing "má"; 3. the letter "a" marked in blue indicates the third tone (falling-rising), representing "mǎ"; 4. the letter "a" marked in light blue indicates the fourth tone (falling), representing "mà"; 5. the syllable "ma" left in the original color of the characters indicates that it is read in the neutral tone.
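The color scheme of Figure 33 amounts to replacing the tone diacritic with a color on the vowel that carried it. A minimal Python sketch, assuming the four colors named in the figure description, is given below; the function name `color_mark` and its return shape are hypothetical. It relies on Unicode canonical decomposition (via the standard `unicodedata` module) to separate the tone mark from its base letter.

```python
import unicodedata

# Combining marks used for the four Pinyin tones.
TONE_MARKS = {"\u0304": 1, "\u0301": 2, "\u030c": 3, "\u0300": 4}
# Colors per tone as listed for Figure 33; 0 stands for the neutral tone,
# which keeps the original character color (represented here as None).
TONE_COLORS = {0: None, 1: "red", 2: "aubergine", 3: "blue", 4: "light blue"}


def color_mark(syllable):
    """Return (plain syllable, index of the colored letter, color)."""
    plain, tone, marked = [], 0, None
    for ch in unicodedata.normalize("NFC", syllable):
        decomposed = unicodedata.normalize("NFD", ch)
        marks = [c for c in decomposed if c in TONE_MARKS]
        # Strip only tone marks, so the diaeresis of "ü" is preserved.
        rest = "".join(c for c in decomposed if c not in TONE_MARKS)
        if marks:
            tone = TONE_MARKS[marks[0]]
            marked = len(plain)
        plain.append(unicodedata.normalize("NFC", rest))
    return "".join(plain), marked, TONE_COLORS[tone]


first_tone = color_mark("m\u0101")   # "mā"
neutral = color_mark("ma")
```

A renderer would then draw the plain syllable and apply the returned color to the letter at the returned index.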
Figure 34, "distinguishing the tones of Pinyin syllables by character deformation (variant 1)": 1. the letter "a" in the syllable "ma" set in italics indicates the first tone (high level), representing the syllable "mā"; 2. the letter "a" in the syllable "ma" set in bold italics indicates the second tone (rising), representing "má"; 3. the letter "A" in the character string "mA" set in italics indicates the third tone (falling-rising), representing "mǎ"; 4. the letter "A" in the character string "mA" set in bold italics indicates the fourth tone (falling), representing "mà"; 5. the syllable "ma" with no deformation indicates that it is read in the neutral tone.
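The deformation scheme of Figure 34 encodes the tone in two binary properties of the marked letter: its case and whether the italic is also bold. A hedged Python sketch of the decoding direction follows; the table and function names are hypothetical, and the handling of bold without italics (not defined in the figure description) is an assumption made here.

```python
# (letter is uppercase, italic is also bold) -> tone number, per Figure 34.
STYLE_TO_TONE = {
    (False, False): 1,  # lowercase italic      -> first tone
    (False, True): 2,   # lowercase bold italic -> second tone
    (True, False): 3,   # uppercase italic      -> third tone
    (True, True): 4,    # uppercase bold italic -> fourth tone
}


def tone_from_style(letter, italic=False, bold=False):
    """Decode the tone from the deformation of the marked letter.

    Returns 0 for the neutral tone (no deformation at all).
    """
    if not italic and not bold:
        return 0
    if not italic:
        # Assumption: the figure defines no meaning for bold without italics.
        raise ValueError("bold without italics is not defined in this scheme")
    return STYLE_TO_TONE[(letter.isupper(), bold)]


tones = [
    tone_from_style("a", italic=True),
    tone_from_style("a", italic=True, bold=True),
    tone_from_style("A", italic=True),
    tone_from_style("A", italic=True, bold=True),
    tone_from_style("a"),
]
```

Because the scheme needs no color, it would survive monochrome printing, which may be why the patent lists it alongside the color variant.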
Figure 35, "distinguishing the tones of Pinyin syllables by character deformation (variant 2)": 1. the letter "a" in the syllable "ma" set in italics and shifted downward indicates the first tone (high level), representing the syllable "mā"; 2. the letter "a" in the syllable "ma" set in bold italics and shifted downward indicates the second tone (rising), representing "má"; 3. the letter "A" in the character string "mA" set in italics and shifted downward indicates the third tone (falling-rising), representing "mǎ"; 4. the letter "A" in the character string "mA" set in bold italics and shifted downward indicates the fourth tone (falling), representing "mà"; 5. the syllable "ma" with no deformation indicates that it is read in the neutral tone.
Figure 36, "distinguishing the tones of Pinyin syllables by color (or chroma) combined with character deformation": 1. the letter "a" in the syllable "ma" set in italics, shifted downward and marked in red indicates the first tone (high level), representing the syllable "mā"; 2. the letter "a" in the syllable "ma" set in bold italics, shifted downward and marked in aubergine indicates the second tone (rising), representing "má"; 3. the letter "A" in the character string "mA" set in italics, shifted downward and marked in blue indicates the third tone (falling-rising), representing "mǎ"; 4. the letter "A" in the character string "mA" set in bold italics, shifted downward and marked in light blue indicates the fourth tone (falling), representing "mà"; 5. the syllable "ma" with neither deformation nor color marking indicates that it is read in the neutral tone.
Figure 37, "marking the pronunciation by color and deformation": in the character string "wēi`chi", the pronunciation part "wēi" is marked in red and set in bold; the shape-meaning feature part "chi" keeps its color and is set in italics.
Figure 38, "marking on a stroke or component": 1. for the Chinese character "think of", the body of its component "heart" is marked with "white dots"; 2. for the Chinese character "husband", the body of its component "two" is marked with "white dots"; 3. for the Chinese character "mountain", the starting tip of its first stroke "Shu" is marked with a white "hollow".
Figure 39, "conjoined labeling (no machine code)". 1. "Pattern 1": the character part uses the "labeled form", with the remaining part "to" marked in blue; the label part has no "machine code", and the "pronunciation input code" is placed below the character part. 2. "Pattern 2": the character part uses the "labeled form", with the remaining part "to" marked in blue; the label part has no "machine code", and the "pronunciation input code" consists of the pronunciation part only, placed to the left of the character part. 3. "Pattern 3": the character part uses the "labeled form", with the remaining part "to" marked in blue; the label part has no "machine code", and the "pronunciation input code" consists of the pronunciation part only, placed above the character part.
Figure 40, "special-purpose seal": a personal special-purpose seal comprises a personal signature, a personal handprint, and a machine code that has been notarized and registered; in the figure, the signature is blue, the handprint is red, the machine code is black, the seal frame is red, and the seal background color is "beautiful blue".
Figure 41, "'personalized' type matrix (schematic)": a single type matrix carries the verse "The white sun sets behind the mountains; the Yellow River flows into the sea." together with the machine code of that verse.
Figure 42, "Pinyin type matrix used in a horizontal row with a Chinese character (pattern 1)". 1. Pinyin type matrix: the blue frame contains the Pinyin and its machine code; the Pinyin characters "sī" are marked in red and set in bold. 2. For the Chinese character "think of", the component "heart" is marked in red. The Pinyin type matrix and the Chinese character glyph are arranged in a horizontal row.

Claims (7)

1. A conjoined labeling method for characters and words, having a "character part" and a "label part" and belonging to the fields of language and script and of information technology, characterized in that:
(1) the "character part" and the "label part" are coupled in form into one whole, which serves as a script (or technical) symbol for recording language (or processing information);
(2) the attributes of characters and words in language and in information processing are labeled; these attributes comprise the pronunciation, form and information code of the character or word, or only its pronunciation and form, or only its information code and form, or only its information code;
(3) labeling materials (including color, chroma, codes, marks, characters and their deformations, or fingerprints, watermarks, magnetic ink, etc.) are selected and defined as the signs of the labeling, and are used to label visibly, or to mark covertly, one or more attributes of the character or word, including pronunciation, spelling, strokes, stroke order, structural pattern, component composition, component attribution, selected components, input code and machine code;
(4) the "character part" presents the original form of the character or word, or its labeled form; the labeled form is a glyph pattern in which selected labeling materials mark the attributes of the character or word on its body;
(5) the "label part" states the pronunciation and the information code of the character or word (the information code comprising the "input code" and the "machine code"); it has a "pronunciation input code" and a "machine code", or only a "pronunciation input code", or only a "pronunciation (input code implicit)", or only a "machine code", or follows another pattern; the pronunciation spelling and the input code are either joined together, appearing as a "pronunciation input code", or not joined together; in the "pronunciation input code" pattern, the front part is the pronunciation and the rear part is the input code, or the front part is the pronunciation and the rear part is a shape-meaning feature code used to distinguish homophones, the two parts together serving as the input code; the two parts are separated by a symbol or left unseparated, and the separator may be defined and chosen as needed; the machine code is a graphic code used for human and machine recognition and processing of the character or word; in form, its definition includes marks such as a "recognition starting point" and "vertical and horizontal recognition references", or other marks, or no such marks at all; in content, it describes one or more of the database code, the pronunciation code and the "personalized" information code of the character or word and the data code used to synthesize the type matrix (or word figure), or other information; the database code provides an "address" (code) for querying information including the pronunciation, the form (including type matrix and word figure), the part of speech and the meaning of the character or word; it is composed in a given format, or is directly a database record number, or a character-set code, or a machine internal code; the "personalized" information code describes one or more items of information registered with an authority, covering works, publications, printing and copying equipment, special fonts, notarized seals and passwords, or other self-defined information;
(6) the labeling of characters and words is realized with a labeled type matrix, or a labeled word figure, or a Pinyin type matrix and word figure, or in another form.
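The structure recited in claim 1, a character part coupled with a label part whose contents vary by pattern, can be modeled as a small record type. The sketch below is only one possible reading: the class name, field names, pattern labels and the sample machine-code string are hypothetical, and the machine code, which the claim describes as a graphic code, is simplified here to a plain string.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LabeledCharacter:
    character: str                       # the character part
    pronunciation: Optional[str] = None  # front part of the input code
    shape_code: Optional[str] = None     # rear part: shape-meaning feature code
    machine_code: Optional[str] = None   # simplified to a string in this sketch
    separator: str = "'"                 # the claim leaves the separator open

    def input_code(self) -> Optional[str]:
        """Join pronunciation and shape code into one input code."""
        if self.pronunciation is None:
            return None
        if self.shape_code:
            return f"{self.pronunciation}{self.separator}{self.shape_code}"
        return self.pronunciation

    def label_pattern(self) -> str:
        """Name which of the label-part patterns of item (5) applies."""
        has_pron = self.pronunciation is not None
        has_machine = self.machine_code is not None
        if has_pron and has_machine:
            return "pronunciation input code + machine code"
        if has_pron:
            if self.shape_code:
                return "pronunciation input code only"
            return "pronunciation only (input code implicit)"
        if has_machine:
            return "machine code only"
        return "other"


# "REC-0001" is an invented placeholder, not a value from the patent.
full = LabeledCharacter("思", "sī", machine_code="REC-0001")
plain = LabeledCharacter("思", "sī")
```

Keeping the pattern as a derived property rather than a stored field means a label cannot claim a pattern that its contents do not support.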
2. A labeled type matrix, belonging to the fields of language and script, information technology and printing technology, characterized in that:
(1) it has a "character part" and a "label part", the two parts being coupled in form into one whole, which serves as a script (or technical) symbol for recording language (or processing information);
(2) the attributes of characters and words in language and in information processing are labeled on the type matrix; these attributes comprise the pronunciation, form and information code of the character or word, or only its pronunciation and form, or only its information code and form, or only its information code;
(3) labeling materials (including chroma, codes, marks, characters and their deformations, or fingerprints, watermarks, magnetic ink, etc.) are selected and defined as the signs of the labeling;
(4) the "character part" presents the original form of the character or word, or its labeled form;
(5) the "label part" states the pronunciation and the information code of the character or word; it has a "pronunciation input code" and a "machine code", or only a "pronunciation input code", or only a "pronunciation (input code implicit)", or only a "machine code", or follows another pattern.
3. A labeled word figure, being a character graphic having a "character part" and a "label part" and belonging to the fields of language and script, information technology and printing technology, characterized in that:
(1) the "character part" and the "label part" are coupled in form into one whole, which serves as a script (or technical) symbol for recording language (or processing information);
(2) the attributes of characters and words in language and in information processing are labeled on the word figure; these attributes comprise the pronunciation, form and information code of the character or word, or only its pronunciation and form, or only its information code and form, or only its information code;
(3) labeling materials (including color, chroma, codes, marks, characters and their deformations, or fingerprints, watermarks, magnetic ink, etc.) are selected and defined as the signs of the labeling;
(4) the "character part" presents the original form of the character or word, or its labeled form;
(5) the "label part" states the pronunciation and the information code of the character or word; it has a "pronunciation input code" and a "machine code", or only a "pronunciation input code", or only a "pronunciation (input code implicit)", or only a "machine code", or follows another pattern;
(6) one or more attributes of the character or word are presented visually.
4. A Pinyin type matrix and word figure, comprising Pinyin letters and syllables and the corresponding type matrices or word figures, belonging to the fields of language and script, information technology and printing technology, characterized in that:
(1) the type matrix and the word figure are labeled with an information code;
(2) one type matrix or word figure is made from a single Pinyin letter, or one type matrix or word figure is made from a Pinyin syllable.
5. Input and output (including printing and display) technologies and products realized with the conjoined labeling method for characters and words of claim 1, or the labeled type matrix of claim 2, or the labeled word figure of claim 3, or the Pinyin type matrix and word figure of claim 4, or any combination thereof.
6. Printing technologies and products realized with the conjoined labeling method for characters and words of claim 1, or the labeled type matrix of claim 2, or the labeled word figure of claim 3, or the Pinyin type matrix and word figure of claim 4, or any combination thereof.
7. Commercial publications (including multimedia reading matter) and advertisements realized with the conjoined labeling method for characters and words of claim 1, or the labeled type matrix of claim 2, or the labeled word figure of claim 3, or the Pinyin type matrix and word figure of claim 4, or any combination thereof.
CNA02147477XA 2002-11-01 2002-11-01 Method for lablling united character and word as well as character patterns and character picture Pending CN1499357A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNA02147477XA CN1499357A (en) 2002-11-01 2002-11-01 Method for lablling united character and word as well as character patterns and character picture

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNA02147477XA CN1499357A (en) 2002-11-01 2002-11-01 Method for lablling united character and word as well as character patterns and character picture

Publications (1)

Publication Number Publication Date
CN1499357A true CN1499357A (en) 2004-05-26

Family

ID=34232996

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA02147477XA Pending CN1499357A (en) 2002-11-01 2002-11-01 Method for lablling united character and word as well as character patterns and character picture

Country Status (1)

Country Link
CN (1) CN1499357A (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102830809A (en) * 2011-06-15 2012-12-19 高静敏 Chinese character coding input method
CN102830809B (en) * 2011-06-15 2016-05-11 高静敏 Encode method for entering Chinese characters
CN105487684A (en) * 2014-09-28 2016-04-13 北大方正集团有限公司 Method and device for outputting Pinyin and Chinese characters
CN105487684B (en) * 2014-09-28 2018-03-23 北大方正集团有限公司 The output intent of Chinese-character phonetic letter character and the output device of Chinese-character phonetic letter character
CN106021204A (en) * 2016-06-12 2016-10-12 朱信 Making and using of word stock with multiple repeated words
CN108829655A (en) * 2018-06-12 2018-11-16 黄�益 A kind of text handling method
CN108959224A (en) * 2018-06-12 2018-12-07 黄�益 A kind of text handling method
CN109766978A (en) * 2019-01-17 2019-05-17 北京悦时网络科技发展有限公司 A kind of generation method of word code, recognition methods, device, storage medium
CN109766978B (en) * 2019-01-17 2020-06-16 北京悦时网络科技发展有限公司 Word code generation method, word code identification device and storage medium
US11334780B2 (en) 2019-01-17 2022-05-17 Yueshi Network Technology Development Co., Ltd. Method for generating word code, method and device for recognizing codes
CN115687669A (en) * 2022-10-12 2023-02-03 广州中望龙腾软件股份有限公司 Character caching method, terminal and storage medium
CN115687669B (en) * 2022-10-12 2024-06-21 广州中望龙腾软件股份有限公司 Text caching method, terminal and storage medium

Similar Documents

Publication Publication Date Title
CN1040276A (en) Simplified and complex character root Chinese character entering technique and keyboard thereof
CN1125990A (en) Method for minimizing uncertainty in computer software processes allowing for automatic identification of faults locations and locations for modifications due to new system requirements with introdu..
CN1342276A (en) Keyboard input devices, methods and systems
CN1499357A (en) Method for lablling united character and word as well as character patterns and character picture
CN1924995A (en) Content analysis based short message ask/answer system and implementing method thereof
CN1241101C (en) Chinese syllable double reading scheme, Chinese keyboard and information input and processing method
CN1154502A (en) Method and device for ducation standardized inputting Chinese characters by five stroke
CN1048343C (en) Free combination code Chinese character input method and key board
CN1045021C (en) Computer entering method for Chinese numerals and its keyboard
CN1258037A (en) Chinese keyboard and Chinese-character phonetic code input method
CN1275732A (en) Chinese character keyboard input system and applied technology thereof
CN1235122C (en) Indefinite code Chinese character input method for computer and keyboard thereof
CN1220127C (en) 'Dual-separation' Chinese characters, 'dual-separation' input method and combined characters
CN1108552C (en) Perfecting method (PHF) for phoenticizing Chinese charaters
CN1123819C (en) Chinese character key-position code input method for computer
CN1026924C (en) Chinese-character sound dissection encode and input method
CN1050913C (en) Chinese-character word processor with radical coding input
CN1062797A (en) character input keyboard and method
CN1025896C (en) New concept Chinese character coding
CN1120408C (en) Chinese-character struture-pronunciation input method for computer
CN1289078A (en) Chinese character phonetic code and keyboard design
CN1609765A (en) Type code Chinese character ridical inputting method and keyboard thereof
CN1387112A (en) 'Line sign 5-stroke' input method for computer, palm computer, mobile telephone and telephone
CN1255712C (en) Six-element splitting 'shape code' input system
CN1306369C (en) High-speed Chinese character phonetic code computer coding method and its keyboard

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication