CN107451105B - Bright braille conversion system based on novel Chinese character holographic coding rule - Google Patents

Bright braille conversion system based on novel Chinese character holographic coding rule Download PDF

Info

Publication number
CN107451105B
CN107451105B CN201710517639.0A CN201710517639A CN107451105B CN 107451105 B CN107451105 B CN 107451105B CN 201710517639 A CN201710517639 A CN 201710517639A CN 107451105 B CN107451105 B CN 107451105B
Authority
CN
China
Prior art keywords
chinese character
holographic
pronunciation
code
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710517639.0A
Other languages
Chinese (zh)
Other versions
CN107451105A (en
Inventor
富明慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Sun Yat Sen University
Original Assignee
National Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Sun Yat Sen University filed Critical National Sun Yat Sen University
Priority to CN201710517639.0A priority Critical patent/CN107451105B/en
Publication of CN107451105A publication Critical patent/CN107451105A/en
Application granted granted Critical
Publication of CN107451105B publication Critical patent/CN107451105B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/12Use of codes for handling textual entities
    • G06F40/126Character encoding
    • G06F40/129Handling non-Latin characters, e.g. kana-to-kanji conversion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/284Lexical analysis, e.g. tokenisation or collocates

Abstract

The invention provides a bright Braille conversion system based on a novel Chinese character holographic coding rule, which comprises the following components: the text acquisition module is used for acquiring Chinese character texts from the outside; the pronunciation database is used for storing the pronunciation of the Chinese characters; the word segmentation preprocessing module is used for automatically or manually inserting word segmentation marks into the Chinese character texts acquired from the outside by the text acquisition module; the Chinese character holographic code pre-compiling module is used for compiling the Chinese character text into a coding format of the Chinese character holographic code and storing the Chinese character holographic code into the Chinese character holographic file storage module; and the Chinese character holographic file storage module is used for storing the file in the Chinese character holographic code format. The invention adopts the novel Chinese character holographic code as a file storage format, determines the font of the Chinese character, uniquely determines the pronunciation of the Chinese character, and also determines whether the Chinese character is separated from the following Chinese characters or not, and contains all information required by the conversion of the plain braille. The invention can fundamentally overcome the problems of 'feijing', 'misunderstanding' and the like commonly existing in the current braille reading of Chinese characters.

Description

Bright braille conversion system based on novel Chinese character holographic coding rule
Technical Field
The invention relates to the field of Chinese character coding and word processing, in particular to a plain braille conversion system based on a novel Chinese character holographic coding rule.
Background
The Chinese characters are unique characters in the world, each character has three elements of ' sound ', ' shape ' and ' meaning ', the sound ' is in the meaning, the meaning ' is in the shape ', and the three elements are inseparable and have no difference. However, the braille of the existing Chinese characters is a pinyin scheme, and because a large number of Chinese characters have the phenomena of homophones, multiple characters and one character with multiple tones, after the Chinese characters are converted into the braille, the situation that the meaning of a word cannot be uniquely determined only by the pronunciation, so that the blind person can read the word easily and even misunderstand the word is generally existed, and the braille is the biggest difficult problem for popularizing and popularizing the braille in China.
With the development of information technology, especially the popularization and spread of computers and dot display devices (hereinafter referred to as dot display devices), it is a good condition to thoroughly solve the above problems.
Disclosure of Invention
In view of the above, there is a need to provide a new type of braille conversion system based on holographic encoding rules of chinese characters, which converts and stores chinese characters in a special format, and combines the "sound", "shape" and "meaning" of chinese characters into the same set of encoding rules, so as to improve the meaning expression accuracy of braille conversion.
In order to achieve the purpose, the invention adopts the following technical scheme:
a bright Braille conversion system based on novel Chinese character holographic coding rules comprises:
the text acquisition module is used for acquiring Chinese character texts from the outside;
the pronunciation database is used for storing the pronunciation of the Chinese characters; wherein, a plurality of different pronunciations of each polyphone are numbered according to a certain sequence, and one of the pronunciations is set as a default pronunciation;
the word segmentation preprocessing module is used for automatically or manually inserting word segmentation marks into the Chinese character texts acquired from the outside by the text acquisition module;
the Chinese character holographic code pre-compiling module is used for compiling the Chinese character text into a coding format of the Chinese character holographic code by combining default pronunciation set in a pronunciation database and a word segmentation mark inserted in the word segmentation preprocessing module and storing the Chinese character text into the Chinese character holographic file storage module;
the Chinese character holographic file storage module is used for storing a file in a Chinese character holographic code format;
the encoding format of the Chinese character holographic code is as follows:
one Chinese character holographic code corresponds to one Chinese character;
the first 2 bytes of the Chinese character holographic code are the internal code of the Chinese character;
one bit of the 3 rd byte of the Chinese character holographic code is defined as a word segmentation identification code, and whether the Chinese character and the next Chinese character form word segmentation is identified by different numerical values of the word segmentation identification code;
defining the 4 th byte of the Chinese character holographic code as a pronunciation identification code, and identifying the number corresponding to the correct pronunciation of the Chinese character in the context by the numerical value of the pronunciation identification code;
the system further comprises:
the text editing module is used for reading a file in a Chinese character holographic code format from the Chinese character holographic file storage module, interpreting Chinese character information and word segmentation information in the Chinese character holographic code, and displaying corresponding Chinese character text and word segmentation marks for a user to review and modify; when a user modifies the Chinese character text or the word segmentation marks, synchronously modifying the Chinese character holographic codes stored in the Chinese character holographic file storage module;
the phonetic notation editing module is used for reading a file in a Chinese character holographic code format from the Chinese character holographic file storage module, interpreting Chinese character information and pronunciation information in the Chinese character holographic code, displaying corresponding Chinese character text and pronunciation information of polyphones, and combining a pronunciation database for a user to review and correct the correct pronunciation of the polyphones; when the user changes the pronunciation of the polyphonic characters, the Chinese character holographic code stored in the Chinese character holographic file storage module is synchronously modified;
the Braille conversion module is used for reading the file in the Chinese character holographic code format from the Chinese character holographic file storage module, interpreting the participle information and the pronunciation information in the Chinese character holographic code, and determining the pronunciation of each Chinese character by combining the pronunciation database so as to convert the Chinese character information in the Chinese character holographic code into Braille for the user to review and modify; when the user modifies the braille, the Chinese character holographic code stored in the Chinese character holographic file storage module is synchronously modified.
Furthermore, in the word segmentation preprocessing module, automatic insertion of word segmentation markers is realized by combining an external or system-built word segmentation database, common words are stored in the word segmentation database, and the word segmentation preprocessing module compares the Chinese character text acquired from the outside by the text acquisition module with words in the word segmentation database so as to automatically insert word segmentation markers in the Chinese character text.
Further, still include:
the listening and reading module is used for reading a file in a Chinese character holographic code format from the Chinese character holographic file storage module, interpreting word segmentation information and pronunciation information in the Chinese character holographic code, and determining the pronunciation of each Chinese character by combining a pronunciation database so as to read aloud by using computer voice; and the pause position of reading is determined according to the punctuation marks and the positions of the word segmentation marks.
Further, still include:
and the paraphrasing module is used for reading the file in the Chinese character holographic code format from the Chinese character holographic file storage module, interpreting Chinese character information, word segmentation information and pronunciation information in the Chinese character holographic code, and determining the font, pronunciation and word segmentation state of each Chinese character so as to provide the correct meaning of each Chinese character or phrase in context for the user to inquire.
Further, the system also comprises a point display device which is used for displaying the contents of the text editing module, the phonetic notation editing module, the Braille conversion module and the paraphrase module in a Braille form.
Further, the encoding format of the Chinese character holographic code further includes:
one bit of the 3 rd byte of the Chinese character holographic code is defined as a default pronunciation identification code, and different values of the default pronunciation identification code are used for identifying whether the pronunciation adopted by the Chinese character in the context is the default pronunciation; when the pronunciation adopted by the Chinese character in the context is the default pronunciation, the 4 th byte of the holographic code of the Chinese character is omitted.
Further, in the Chinese character holographic code, only the last bit and the second last bit are used for information in the 3 rd byte;
the last digit in the 3 rd byte is a default pronunciation identification code, when the digit is 0, the Chinese character adopts default pronunciation, and when 1, the pronunciation of the Chinese character is specified by the 4 th byte;
the second last bit in the 3 rd byte is a word segmentation identification code, when the bit is 0, the Chinese character does not form word segmentation with the next Chinese character, and when the bit is 1, the Chinese character forms word segmentation with the next Chinese character.
Further, the encoding format of the Chinese character holographic code further includes:
when the Chinese character is a single-tone character, the 4 th byte of the holographic code of the Chinese character is omitted.
Further, the encoding format of the Chinese character holographic code further includes:
and when the 4 th byte of the Chinese character holographic code of the Chinese character is omitted and the Chinese character does not form word segmentation with the next Chinese character, the 3 rd byte of the Chinese character holographic code is omitted.
Further, in the reading database, a plurality of different readings of the polyphones are ordered and numbered in order of highest frequency of use, with the highest frequency of use being set as the default reading.
Through the technical scheme, the novel Chinese character holographic code is adopted as a file storage format, the shape of the Chinese character is determined, the pronunciation of the Chinese character is uniquely determined, whether the Chinese character is participled with the following Chinese characters is determined, and all information required by the conversion of the plain braille is contained. The invention provides a bright Braille conversion system based on a novel Chinese character holographic coding rule, which can fundamentally overcome the problems of 'feijie', 'misunderstanding' and the like commonly existing in the current Braille reading. In addition, the file in Chinese character holographic code format synchronously generated as a byproduct is used as a file in Chinese character holographic code format in the process of making paper braille books for the blind by a publishing department, so that the misinterpretation rate of the blind during listening and reading on a computer or a mobile phone and reading on a braille display can be greatly reduced. The information transmission accuracy is guaranteed, and multiple purposes are achieved.
Drawings
FIG. 1 is a functional module schematic diagram of a bright Braille conversion system based on a novel Chinese character holographic coding rule provided by the invention.
Detailed Description
The technical solution of the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
The embodiment of the invention provides a bright Braille conversion system based on a novel Chinese character holographic coding rule, wherein a novel Chinese character holographic coding rule, namely a Chinese character holographic code, is introduced into the system; the method aims to integrate the 'sound', 'shape' and 'meaning' of the Chinese characters into the same set of coding rules so as to improve the meaning expression accuracy of the plain braille conversion.
Specifically, as a technical core of the present invention, the encoding format of the chinese character holographic code is as follows:
one Chinese character holographic code corresponds to one Chinese character;
the first 2 bytes of the Chinese character holographic code are the internal code of the Chinese character;
one bit of the 3 rd byte of the Chinese character holographic code is defined as a word segmentation identification code, and whether the Chinese character and the next Chinese character form word segmentation is identified by different numerical values of the word segmentation identification code;
one bit of the 3 rd byte of the Chinese character holographic code is defined as a word segmentation identification code, and whether the Chinese character and the next Chinese character form word segmentation is identified by different numerical values of the word segmentation identification code; another bit in the 3 rd byte is defined as a default pronunciation identification code, and whether the pronunciation adopted by the Chinese character in the context is the default pronunciation is identified by different values of the default pronunciation identification code;
the 4 th byte of the Chinese character holographic code is defined as a pronunciation identification code, and the numerical value of the pronunciation identification code is used for identifying the number corresponding to the correct pronunciation of the Chinese character in the context.
Further, in the Chinese character holographic code, only the last bit and the second last bit are used for information in the 3 rd byte;
the last digit in the 3 rd byte is a default pronunciation identification code, when the digit is 0, the Chinese character adopts default pronunciation, and when 1, the pronunciation of the Chinese character is specified by the 4 th byte; because the single-tone character has only one default pronunciation, the 3 rd byte last bit of the Chinese character holographic code of the single-tone character is 0 necessarily, and the 3 rd byte last bit of the Chinese character holographic code of the polyphonic character can be 0 or 1;
the second last bit in the 3 rd byte is a word segmentation identification code, when the bit is 0, the Chinese character does not form word segmentation with the next Chinese character, and when the bit is 1, the Chinese character forms word segmentation with the next Chinese character.
According to the definition, since the information in the 3 rd byte only uses the last bit and the second last bit, and the information corresponds to the unusual control characters in the 4 ASCII codes, the characters of the common ASCII codes are not occupied, no ambiguity is caused when the characters and the Chinese characters are mixed, and the operation and storage efficiency of the computer is improved.
As an improvement, the following method can be adopted to appropriately omit the 3 rd byte and the 4 th byte of the Chinese character holographic code:
when the pronunciation adopted by the Chinese character in the context is the default pronunciation, the 4 th byte of the holographic code of the Chinese character is omitted; when the Chinese character is a single-tone character, the 4 th byte of the holographic code of the Chinese character is omitted; that is, when the last bit of the 3 rd byte is 0, the 4 th byte is omitted;
when the 4 th byte of the Chinese character holographic code of the Chinese character is omitted and the Chinese character does not form word segmentation with the next Chinese character, the 3 rd byte of the Chinese character holographic code is omitted; that is, when the last two bits of the 3 rd byte are 0 at the same time, the 3 rd byte is also omitted, and the Chinese character holographic code of the Chinese character only takes the first 2 bytes.
According to the above rules, appropriate omission of bytes containing no substantial information can greatly reduce the number of data bits used for storing information, thereby reducing the storage space.
The following description starts to describe in detail a bright braille conversion system based on a novel holographic encoding rule of chinese characters, as shown in fig. 1, where the system specifically includes:
the text acquisition module is used for acquiring Chinese character texts from the outside;
the pronunciation database is used for storing the pronunciation of the Chinese characters; wherein a plurality of different pronunciations of each polyphone are numbered in a sequence, and one of the pronunciations is set as a default pronunciation. In this embodiment, the different pronunciations of the polyphone are ordered and numbered in order of highest frequency of use, with the highest frequency of use being the default pronunciation. Note that, the reading database stores not only the readings of polyphones but also the readings of monophonic characters, and only the pronunciation of a monophonic character is unique and is a default reading, and the number of the pronunciation is also only one.
And the word segmentation preprocessing module is used for automatically or manually inserting word segmentation marks into the Chinese character text acquired from the outside by the text acquisition module. The word segmentation mark inserted in the word segmentation preprocessing module is mainly used for providing basic word segmentation information reference when a Chinese character text is converted into a Chinese character holographic code, and the position of the word segmentation mark is not required to be completely accurate; therefore, in order to avoid a great deal of work caused by manually inserting the word segmentation markers, the automatic intelligent word segmentation marker inserting mode can be adopted. Specifically, the function of automatically inserting the word segmentation markers needs to be realized by combining an external word segmentation database or a word segmentation database built in the system, common words are stored in the word segmentation database, and the word segmentation preprocessing module compares the Chinese character text acquired from the outside by the text acquisition module with words in the word segmentation database so as to automatically insert the word segmentation markers in the Chinese character text.
The Chinese character holographic code pre-compiling module is used for compiling the Chinese character text into a coding format of the Chinese character holographic code by combining default pronunciation set in the pronunciation database and word segmentation marks inserted in the word segmentation preprocessing module, and storing the Chinese character text into the Chinese character holographic file storage module.
The Chinese character holographic file storage module is used for storing a file in a Chinese character holographic code format, namely a Chinese character holographic code file. Based on the definition of the encoding format of the Chinese character holographic code given above, the Chinese character holographic code file simultaneously contains Chinese character information, word segmentation information and pronunciation information. Specifically, the Chinese character information is determined by the first 2 bytes of the Chinese character holographic code, the word segmentation information is determined by the word segmentation identification code of the 3 rd byte of the Chinese character holographic code, and the pronunciation information is determined by the default pronunciation identification code of the 3 rd byte of the Chinese character holographic code and the 4 th byte combined pronunciation database.
And the text editing module is used for reading the file in the Chinese character holographic code format from the Chinese character holographic file storage module, interpreting the Chinese character information and the word segmentation information in the Chinese character holographic code, and displaying the corresponding Chinese character text and word segmentation marks for the user to review and modify. In the module, Chinese characters are displayed in a text window, at the moment, operations such as adding, changing and deleting the Chinese characters can be carried out like processing a conventional plain text file, and the positions of word segmentation marks can be modified; in this embodiment, at the end of a word segmentation except for punctuation marks, a TAB key is used as a word segmentation mark. When the user modifies the Chinese character text or the word segmentation mark, the module can synchronously modify the Chinese character holographic code stored in the Chinese character holographic file storage module.
And the phonetic notation editing module is used for reading the file in the Chinese character holographic code format from the Chinese character holographic file storage module, interpreting the Chinese character information and the pronunciation information in the Chinese character holographic code, displaying the corresponding Chinese character text and the pronunciation information of the polyphone, and combining the pronunciation database for the user to review and correct the correct pronunciation of the polyphone. In the module, the text window displays Chinese character text and symbols, when the cursor moves to polyphone characters, a pull-down menu can be automatically popped up, and the correct pronunciation of the current Chinese character can be selected by the up and down cursors. When the cursor moves to the non-polyphonic character, the phonetic notation menu is automatically closed. When the user changes the pronunciation of the polyphonic character, the module will modify the Chinese character holographic code stored in the Chinese character holographic file storage module synchronously.
Because the Chinese character holographic code conversion is carried out in the Chinese character holographic code precompiling module, the Chinese character holographic code conversion is based on rough word segmentation preprocessing and default pronunciation set by a system; although the accuracy of information matching can be improved by improving the intelligent recognition function of the word segmentation preprocessing module and the pronunciation database, the word segmentation information and the pronunciation information of the Chinese characters cannot be completely and accurately expressed by the Chinese character holographic code file initially stored in the Chinese character holographic file storage module. However, by means of the text editing module and the phonetic notation editing module, the Chinese character information, the word segmentation information and the pronunciation information with a small number of errors can be adjusted, and the accuracy of the Chinese character holographic code file is further improved. On the basis, various functional modules can be added, and the Chinese character information, the word segmentation information and the pronunciation information contained in the Chinese character holographic code are used for serving the user.
Specifically, as an improvement, the present invention further includes the following functional modules:
and the Braille conversion module is used for reading the file in the Chinese character holographic code format from the Chinese character holographic file storage module, interpreting the word segmentation information and the pronunciation information in the Chinese character holographic code, and determining the pronunciation of each Chinese character by combining the pronunciation database so as to convert the Chinese character information in the Chinese character holographic code into Braille for the user to review and modify. In the module, Braille can be displayed in the text window, and a user can check and edit the Braille by adding and deleting the Braille and modify unreasonable word segmentation marks. When the user modifies the Braille, the module can synchronously modify the Chinese character holographic code stored in the Chinese character holographic file storage module.
The listening and reading module is used for reading a file in a Chinese character holographic code format from the Chinese character holographic file storage module, interpreting word segmentation information and pronunciation information in the Chinese character holographic code, and determining the pronunciation of each Chinese character by combining a pronunciation database so as to read aloud by using computer voice; and the pause position of reading is determined according to the punctuation marks and the positions of the word segmentation marks. In the module, the parsed Chinese character holographic code can be read aloud by using screen reading software, and because the Chinese character holographic code simultaneously contains pronunciation information and word segmentation information, the screen reading software can correctly read polyphone and can pause more reasonably, so that not only is wrong information caused by misreading of the polyphone avoided, but also a more comfortable listening and reading effect is achieved, which cannot be achieved during listening and reading of a conventional text file.
And the paraphrasing module is used for reading the file in the Chinese character holographic code format from the Chinese character holographic file storage module, interpreting Chinese character information, word segmentation information and pronunciation information in the Chinese character holographic code, and determining the font, pronunciation and word segmentation state of each Chinese character so as to provide the correct meaning of each Chinese character or phrase in context for the user to inquire. Because the Braille of the Chinese characters is a character for expressing the pronunciation, and is influenced by homonymy and heteronymy of the Chinese characters and one character with multiple tones, the condition that one Braille single character or phrase corresponds to a plurality of different Chinese characters or phrases often appears in the traditional Braille dictionary software, the actual meaning of the original Chinese character text to be expressed cannot be confirmed, and the one-to-one correspondence with the original Chinese character text can be realized after reading out the Braille by adopting the Chinese character holographic code provided by the invention for storage, so that the paraphrasing function can be accurately realized. Specifically, after determining the font, pronunciation and word segmentation state of each Chinese character, the paraphrase module performs matching query from a paraphrase database and displays the queried word meaning to the user. The paraphrase database may be an internal database integrated in the system, or may be an external database such as a network dictionary and a dictionary referred from the outside.
And the point display device is used for displaying the contents of the text editing module, the phonetic notation editing module, the Braille conversion module and the paraphrase module in a Braille form. The text editing module, the phonetic notation editing module and the Braille editing module output ASCII codes of identical current character lines on a pointing display, and only a TAB key serving as a word segmentation mark is displayed as a half-angle blank; when the device is used in cooperation with a text editing module and a Braille editing module, the point display can display the content of a line where a current character is positioned, and the operations of proofreading, content addition and deletion and word segmentation can be performed through touch-reading cooperation; when the system is used in cooperation with a phonetic notation editing module, when a Chinese character phonetic notation menu displayed in a computer screen pops up, the point display device can display the current pronunciation, and phonetic notation selection can be completed by switching the point display device by an upper cursor and a lower cursor; when the Chinese character interpretation and word composition module is used in cooperation with the paraphrasing module, the displayed current character is allowed to be interpreted and word composition, the corresponding shortcut key is pressed, and the Chinese character interpretation or word composition information is displayed on the click display.
Obviously, the display device of the invention is not limited to a point display, and can be connected with other display devices such as a liquid crystal display screen and the like to output and display the contents of the text editing module, the phonetic notation editing module, the braille conversion module and the paraphrasing module.
Through the technical scheme, the invention adopts the Chinese character holographic code as a file storage format, uniquely determines the pronunciation of the Chinese character while determining the character form of the Chinese character, and also determines whether the Chinese character is participled with the following Chinese character or not, and contains all information required by the Chinese character clear-blind conversion. The holographic Chinese character code is used as a file storage format, so that the problems of 'feijing', 'misunderstanding' and the like commonly existing in the current braille reading of Chinese characters can be fundamentally overcome.
The following will illustrate the conversion process and technical advantages of Chinese character holographic code by taking several specific examples.
Specifically, for a single-tone word or a polyphone word reading a default pronunciation (in this embodiment, the most frequently used pronunciation is specified), the 4 th code is OX1(16 th system), and the default is set.
Example one:
large (large in size), which is a polyphone, with two pronunciations, da4 and dai4, da4 being the 1 st pronunciation, so its holographic chinese code is the large inner code plus OX1+ OX1, where the 3 rd byte 16 th digit OX1 is the read-through and polyphone cue code, which indicates a polyphone because its last bit is "1", and the pronunciation will be specified by the 4 th byte; the 4 th byte is OX1, corresponding to 1 in 10, which indicates that the word reads the 1 st reading, i.e. the highest frequency reading da 4. Since the 3 rd byte OX1 has zero second last bit, it is indicated that the word is not separated from the following Chinese character.
In addition, the size is large, because the reading is the 1 st reading, the 4 th byte OX1 of the holographic Chinese character code can be defaulted; since no participles are composed with the following words and the 4 th byte is missing, the 3 rd byte can also be omitted. Such a large (large in size) holographic Chinese code can be simplified as: a large inner code.
The following steps are repeated: large (doctor's "large"): is the 2 nd pronunciation of the polyphone 'big', so the holographic Chinese character code thereof is the big inner code + OX1+ OX 2;
example two:
rich: is a single tone word and has only one pronunciation fu4, so the complete holographic Chinese character code is rich inner code + OX1+ OX 1.
Because of the single-tone character, it can also be abbreviated as: rich inner code + OX 1;
when a word is not formed with a later word, the 3 rd byte is OX1, and the simplification can be continued: the rich holographic Chinese character code is rich inner code.
The following is to read the Chinese character holographic Chinese character code under the phrase state:
hobby: love is a single-tone character, and forms participles with the following characters; preferably, the polyphones are polyphones, the 1 st pronunciation being "hao 3" and the 2 nd pronunciation being "hao 4".
The favorite holographic Chinese character code is the love inner code + OX2 (corresponding to binary 10, the last bit is zero, and represents a single tone character, the second last bit is 1, and represents a participle with the following character), because the code is a single tone character, the 4 th byte is omitted) + the good inner code + OX1 (the last bit is 1, and represents a polyphone character, and the second last bit is zero, and represents a participle without the following character) + OX2 (2 in 10 th scale, and represents reading the 2 nd pronunciation).
Example three:
jilin province: ji and Lin are single-tone characters and multi-tone characters, but read the 1 st pronunciation (sheng 3).
Therefore, the inner code + OX2 (single-tone word, and following word component word) of ji lin province, the inner code of ji lin + OX2+ the inner code of ji province + OX1+ OX1, and obviously, the last 2 bytes of ji province can be omitted.
Example four:
good ease and fatigue: the first character is a polyphone character, and the 2 nd pronunciation is read; the third word is also polyphonic (e4, wu4), reading the 2 nd pronunciation, so the holographic Chinese character code of the word is:
good inner code + OX3 (polyphone character, and later character component participle) + OX2+ Yi inner code + OX2 (monophone character, and later character component participle) + oxa inner code + OX3 (polyphone character, and later character component word) + OX2 (oxa 2 nd pronunciation) + Lao inner code (3 rd, 4 th byte omitted)).
The default rule in holographic Chinese code is not confusing. In most cases, the 1 st pronunciation (including the only pronunciation) is adopted for Chinese characters, and more than half of characters in one article do not form words with later characters, so that the storage space is greatly saved due to the shortage of the characters.
By adopting the holographic code of the Chinese character as the file storage format, the invention can avoid the trouble of selecting polyphone when the Chinese character is converted into the braille; and the error of synonymy different sound when the braille is converted into the Chinese character can be avoided. By matching with voice software to play the text after the pronunciation edition, the blind can more accurately and easily know the content to be listened, and the problems of polyphone misreading and phrase misreading during the listening and reading of the conventional text file can be avoided; when the blind person touches and reads the strange or difficult characters on the braille display, the blind person can also use the computer to operate and call the internal code to explain the current characters or give out common words, which is a technical advantage that the traditional braille conversion method cannot provide.
The above-mentioned embodiments only express several embodiments of the present invention, and the description thereof is more specific and detailed, but not construed as limiting the scope of the present invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the inventive concept, which falls within the scope of the present invention. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A bright Braille conversion system based on novel Chinese character holographic coding rules is characterized by comprising:
the text acquisition module is used for acquiring Chinese character texts from the outside;
the pronunciation database is used for storing the pronunciation of the Chinese characters; wherein, a plurality of different pronunciations of each polyphone are numbered according to a certain sequence, and one of the pronunciations is set as a default pronunciation;
the word segmentation preprocessing module is used for automatically or manually inserting word segmentation marks into the Chinese character texts acquired from the outside by the text acquisition module;
the Chinese character holographic code pre-compiling module is used for compiling the Chinese character text into a coding format of the Chinese character holographic code by combining default pronunciation set in a pronunciation database and a word segmentation mark inserted in the word segmentation preprocessing module and storing the Chinese character text into the Chinese character holographic file storage module;
the Chinese character holographic file storage module is used for storing a file in a Chinese character holographic code format;
the encoding format of the Chinese character holographic code is as follows:
one Chinese character holographic code corresponds to one Chinese character;
the first 2 bytes of the Chinese character holographic code are the internal code of the Chinese character;
one bit of the 3 rd byte of the Chinese character holographic code is defined as a word segmentation identification code, and whether the Chinese character and the next Chinese character form word segmentation is identified by different numerical values of the word segmentation identification code;
defining the 4 th byte of the Chinese character holographic code as a pronunciation identification code, and identifying the number corresponding to the correct pronunciation of the Chinese character in the context by the numerical value of the pronunciation identification code;
the system further comprises:
the text editing module is used for reading a file in a Chinese character holographic code format from the Chinese character holographic file storage module, interpreting Chinese character information and word segmentation information in the Chinese character holographic code, and displaying corresponding Chinese character text and word segmentation marks for a user to review and modify; when a user modifies the Chinese character text or the word segmentation marks, synchronously modifying the Chinese character holographic codes stored in the Chinese character holographic file storage module;
the phonetic notation editing module is used for reading a file in a Chinese character holographic code format from the Chinese character holographic file storage module, interpreting Chinese character information and pronunciation information in the Chinese character holographic code, displaying corresponding Chinese character text and pronunciation information of polyphones, and combining a pronunciation database for a user to review and correct the correct pronunciation of the polyphones; when the user changes the pronunciation of the polyphonic characters, the Chinese character holographic code stored in the Chinese character holographic file storage module is synchronously modified;
the Braille conversion module is used for reading the file in the Chinese character holographic code format from the Chinese character holographic file storage module, interpreting the participle information and the pronunciation information in the Chinese character holographic code, and determining the pronunciation of each Chinese character by combining the pronunciation database so as to convert the Chinese character information in the Chinese character holographic code into Braille for the user to review and modify; when the user modifies the braille, the Chinese character holographic code stored in the Chinese character holographic file storage module is synchronously modified.
2. The system of claim 1, wherein the automatic insertion of segmentation markers in the segmentation pre-processing module is achieved by combining an external or system-internal segmentation database in which commonly used segmentation is stored, and the segmentation pre-processing module compares the text of the chinese characters obtained from the outside by the text collection module with the segmentation in the segmentation database to automatically insert the segmentation markers in the text of the chinese characters.
3. The system for converting braille according to the novel holographic encoding of chinese characters rules of claim 1, further comprising:
the listening and reading module is used for reading a file in a Chinese character holographic code format from the Chinese character holographic file storage module, interpreting word segmentation information and pronunciation information in the Chinese character holographic code, and determining the pronunciation of each Chinese character by combining a pronunciation database so as to read aloud by using computer voice; and the pause position of reading is determined according to the punctuation marks and the positions of the word segmentation marks.
4. The system for converting Ming Braille based on the novel holographic encoding rule of Chinese characters as claimed in claim 3, further comprising:
and the paraphrasing module is used for reading the file in the Chinese character holographic code format from the Chinese character holographic file storage module, interpreting Chinese character information, word segmentation information and pronunciation information in the Chinese character holographic code, and determining the font, pronunciation and word segmentation state of each Chinese character so as to provide the correct meaning of each Chinese character or phrase in context for the user to inquire.
5. A system for the conversion of clear Braille based on the novel holographic Chinese character coding rules according to claim 4, characterized by further comprising a display for displaying the contents of the text editing module, the ZhuYin editing module, the Braille conversion module and the paraphrase module in Braille form.
6. The system for converting Minbraille according to the novel holographic Chinese character encoding rule of claim 1, wherein the encoding format of the holographic Chinese character codes further comprises:
one bit of the 3 rd byte of the Chinese character holographic code is defined as a default pronunciation identification code, and different values of the default pronunciation identification code are used for identifying whether the pronunciation adopted by the Chinese character in the context is the default pronunciation; when the pronunciation adopted by the Chinese character in the context is the default pronunciation, the 4 th byte of the holographic code of the Chinese character is omitted.
7. The system for converting Ming Braille based on the novel holographic Chinese character coding rule according to claim 6, wherein in the holographic Chinese character code, only the last bit and the second last bit are used for the information in the 3 rd byte;
the last digit in the 3 rd byte is a default pronunciation identification code, when the digit is 0, the Chinese character adopts default pronunciation, and when 1, the pronunciation of the Chinese character is specified by the 4 th byte;
the second last bit in the 3 rd byte is a word segmentation identification code, when the bit is 0, the Chinese character does not form word segmentation with the next Chinese character, and when the bit is 1, the Chinese character forms word segmentation with the next Chinese character.
8. The system for converting Minbraille according to the novel holographic Chinese character encoding rule of claim 1, wherein the encoding format of the holographic Chinese character codes further comprises:
when the Chinese character is a single-tone character, the 4 th byte of the holographic code of the Chinese character is omitted.
9. The system for converting Ming Braille based on the novel holographic Chinese character encoding rule according to claim 6 or 8, wherein the encoding format of the holographic Chinese character code further comprises:
and when the 4 th byte of the Chinese character holographic code of the Chinese character is omitted and the Chinese character does not form word segmentation with the next Chinese character, the 3 rd byte of the Chinese character holographic code is omitted.
10. A system for converting Minbraille according to the novel holographic encoding rules for Chinese characters as claimed in claim 1 or 6, wherein the reading database is configured such that a plurality of different readings of polyphones are ordered and numbered in descending order of frequency of use, wherein the reading with the highest frequency of use is set as the default reading.
CN201710517639.0A 2017-06-29 2017-06-29 Bright braille conversion system based on novel Chinese character holographic coding rule Active CN107451105B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710517639.0A CN107451105B (en) 2017-06-29 2017-06-29 Bright braille conversion system based on novel Chinese character holographic coding rule

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710517639.0A CN107451105B (en) 2017-06-29 2017-06-29 Bright braille conversion system based on novel Chinese character holographic coding rule

Publications (2)

Publication Number Publication Date
CN107451105A CN107451105A (en) 2017-12-08
CN107451105B true CN107451105B (en) 2020-04-07

Family

ID=60488117

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710517639.0A Active CN107451105B (en) 2017-06-29 2017-06-29 Bright braille conversion system based on novel Chinese character holographic coding rule

Country Status (1)

Country Link
CN (1) CN107451105B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108415899B (en) * 2018-01-31 2021-09-17 北京联合大学 Braille word segmentation modification method and system

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1591414A (en) * 2004-06-03 2005-03-09 华建电子有限责任公司 Automatic translating converting method for Chinese language to braille
CN1661526A (en) * 2004-02-24 2005-08-31 商荣杰 Set symbol computer keyboard and design of encoding signal input system
CN1848049A (en) * 2006-03-27 2006-10-18 富明慧 Half square braille digital coding Chinese character inputting method
JP2006302149A (en) * 2005-04-22 2006-11-02 Chiba Univ Japanese input device
CN101408803A (en) * 2008-11-04 2009-04-15 中兴通讯股份有限公司 Method for inputting Braille to terminal equipment and terminal equipment thereof
CN103870008A (en) * 2014-04-03 2014-06-18 可牛网络技术(北京)有限公司 Method and device for output and input of Braille characters on touch screen
CN103995600A (en) * 2014-03-20 2014-08-20 江苏科技大学 Braille and Chinese character converting device and method
GB2532770A (en) * 2014-11-27 2016-06-01 Stuart Wainwright Michael Apparatus for use in teaching a language

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1661526A (en) * 2004-02-24 2005-08-31 商荣杰 Set symbol computer keyboard and design of encoding signal input system
CN1591414A (en) * 2004-06-03 2005-03-09 华建电子有限责任公司 Automatic translating converting method for Chinese language to braille
JP2006302149A (en) * 2005-04-22 2006-11-02 Chiba Univ Japanese input device
CN1848049A (en) * 2006-03-27 2006-10-18 富明慧 Half square braille digital coding Chinese character inputting method
CN101408803A (en) * 2008-11-04 2009-04-15 中兴通讯股份有限公司 Method for inputting Braille to terminal equipment and terminal equipment thereof
CN103995600A (en) * 2014-03-20 2014-08-20 江苏科技大学 Braille and Chinese character converting device and method
CN103870008A (en) * 2014-04-03 2014-06-18 可牛网络技术(北京)有限公司 Method and device for output and input of Braille characters on touch screen
GB2532770A (en) * 2014-11-27 2016-06-01 Stuart Wainwright Michael Apparatus for use in teaching a language

Also Published As

Publication number Publication date
CN107451105A (en) 2017-12-08

Similar Documents

Publication Publication Date Title
CN101276245B (en) Reminding method and system for coding to correct error in input process
CN100568225C (en) The Words symbolization processing method and the system of numeral and special symbol string in the text
WO2008106470A1 (en) Shared language model
CN100462901C (en) GB phoneticize input method
CN101118540A (en) Chinese characters phonetic and Chinese voice code bidirectional reversible transform method
CN107451105B (en) Bright braille conversion system based on novel Chinese character holographic coding rule
CN100432903C (en) Half square braille digital coding Chinese character inputting method
CN100501656C (en) Tone and shape combination method for inputting Chinese character into electronic apparatus
CN110716654B (en) Chinese character input method, voice synthesis method and Chinese character input system
CN1195265C (en) Chinese language phonetic transcription simple and quick full spelling input method and its keyboare
CN103853705A (en) Real-time voice subtitle translation method of Chinese voice and foreign language voice of computer
CN100561469C (en) Create and use the method and system of Chinese language data and user-corrected data
CN107145478B (en) Method for converting Chinese sentence into braille
CN106021241A (en) Braille dot location Chinese character codes and a method of machine translation between the Braille dot location Chinese character codes and Braille characters
CN101587381B (en) Input method for audio-shaped characters without repeated code
CN103854647A (en) Chinese-foreign-language bidirectional real time voice translation wireless mobile communication device
CN103297709A (en) Device for adding Chinese subtitles to Chinese audio video data
CN102622098B (en) New pictophonetic code Chinese character input method
CN103279202A (en) Standard holographic Chinese character input method conforming to teaching use and system of input method
CN101604210B (en) Standard sound and font input method for Chinese characters
CN101901062B (en) Computer Chinese character information processing method based on phoneme encoding
CN103279463A (en) Device for performing real-time voice subtitle translation on Chinese voice and foreign language voice through computer
WO2020087769A1 (en) Phonetic writing input method
CN1333325C (en) Pictographic character direct-viewing coding input method
CN1202647A (en) Phonetic Chinese characters

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant