TW432300B - Classification method, database, database establishment method, and input query system of mispronounced Chinese characters

Publication number
TW432300B
Authority
TW
Taiwan
Prior art keywords
character
chinese
database
characters
mispronounced
Application number
TW88105799A
Other languages
Chinese (zh)
Inventor
Yung-Sheng Jang
Original Assignee
Iqchina Technology Inc
Application filed by Iqchina Technology Inc
Priority to TW88105799A
Application granted
Publication of TW432300B

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

The present invention relates to a classification method, a database, a database establishment method, and an input query system for mispronounced Chinese characters. In other words, it relates to an input method and a query system for Chinese characters. A pronunciation test is given to dozens of people with a given educational background, e.g. high school graduates, to calculate the ratio of correct pronunciation and identify the commonly mispronounced characters. A threshold ratio is then chosen to reduce the amount of data, and the most commonly mispronounced characters are classified and collected to establish the required database. When a user of the input query system wants to enter a Chinese character but does not know its correct phonetic symbols, several Chinese characters chosen from the database are displayed for the user to select from. After the selection, a query means lets the user obtain information related to the selected character, such as its correct phonetic symbols, application examples, possible meanings, quotations, category (for example, pictograph or associative compound), formation principle, and allusions. The result is a useful and convenient system that both improves the user's general knowledge of Chinese characters and helps promote Chinese culture.
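The test-and-threshold step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the function names, the response format, the sample data, and the 0.5 threshold are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def mispronunciation_rates(responses, correct):
    """Return {character: {misreading: share of subjects who gave it}}.

    responses: iterable of (subject_id, character, reading_given)  (assumed format)
    correct:   {character: set of accepted readings}                (assumed format)
    """
    totals = Counter()
    wrong = defaultdict(Counter)
    for _subject, char, reading in responses:
        totals[char] += 1
        if reading not in correct[char]:
            wrong[char][reading] += 1
    return {c: {r: n / totals[c] for r, n in wrong[c].items()} for c in totals}


def select_common(rates, threshold):
    """Keep only (character, misreading) pairs whose share reaches the threshold."""
    return {
        (char, reading): rate
        for char, by_reading in rates.items()
        for reading, rate in by_reading.items()
        if rate >= threshold
    }


# Invented sample data for four hypothetical subjects.
CORRECT = {"滸": {"ㄏㄨˇ"}, "中": {"ㄓㄨㄥ"}}
RESPONSES = [
    (1, "滸", "ㄒㄩˇ"), (2, "滸", "ㄒㄩˇ"), (3, "滸", "ㄏㄨˇ"), (4, "滸", "ㄒㄩˇ"),
    (1, "中", "ㄓㄨㄥ"), (2, "中", "ㄓㄨㄥ"), (3, "中", "ㄓㄨㄥ"), (4, "中", "ㄗㄨㄥ"),
]

rates = mispronunciation_rates(RESPONSES, CORRECT)
common = select_common(rates, threshold=0.5)
print(common)
```

With this sample data only the reading ㄒㄩˇ for 滸 passes the threshold; the rarer slip on 中 is dropped, which is the data reduction the abstract refers to.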

Description

[Technical Field]

The present invention relates to a method for classifying mispronounced Chinese characters (白字), a database of such characters, a method for building that database, and an input query system based on it. Besides accepting the correct Zhuyin (phonetic symbols), the system uses a collection of the mispronunciations common among Chinese speakers to retrieve the characters the user most likely intends and offers them for selection, which improves the user's ability to enter characters. It also lets the user look up a character's correct Zhuyin, its definition, example sentences, and similar information, so that the user's command of Chinese characters improves step by step. It is thus a novel system for entering and looking up Chinese characters.

[Background of the Invention]

Chinese characters developed mainly from pictographs. Through combination and adaptation, further types arose: indicatives (指事), associative compounds (會意), phono-semantic compounds (形聲), derivative cognates (轉注), and phonetic loans (假借). New characters continue to be created on these same principles as times change. Rarely used characters and habitually misread characters trouble Chinese speakers both in reading and in entering Chinese on a computer, and a more forgiving way of interpreting input, together with a query system, is desirable for newly created characters as well. Such an input system would make input more efficient, and paired with a query system it would extend the user's knowledge of characters and thereby help promote Chinese culture.

The six principles above have produced tens of thousands of characters. They daunt learners and users of Chinese in other countries, and even native speakers often struggle, depending on their level of education and their own aptitude. Yet the Chinese script is a most beautiful one: each character has its own independent meaning and takes on new meanings when combined with others, making it a script rich in variation and deep in meaning. There are more than a billion Chinese people in the world, and many neighboring countries influenced by Chinese culture, such as Japan, Korea, and Southeast Asian countries, also use Chinese characters. With the recent growth of the Internet, Chinese characters are used very frequently worldwide, so efficient input of Chinese characters is urgently needed.

[Problems to Be Solved by the Invention]

Current input methods include many based on decomposing character shapes and some based on other indirect schemes, but none of these is very widespread. The method most commonly used is Zhuyin input, because people learn the Zhuyin symbols and the Zhuyin pronunciation of characters from early childhood; casual Internet users, for example, still find Zhuyin input the most convenient and familiar. Conventional Zhuyin input nevertheless has the following drawbacks:

(1) The pronunciations people actually use do not always match the officially prescribed ones. For example, the dictionary pronunciation of 滑稽 is ㄍㄨˇ ㄐㄧ, but most people say ㄏㄨㄚˊ ㄐㄧ. A user of ordinary proficiency who types the habitual pronunciation therefore often cannot find the character needed.

(2) Some characters are not in wide circulation, and readers who cannot be bothered to look each one up simply read them by one of their components. This happens often when reading classical texts: in 水滸傳 (Water Margin), 滸 is read by its component as ㄒㄩˇ when it is actually pronounced ㄏㄨˇ. Words like 水滸傳 are nonetheless needed in everyday writing, for example online or in school essays. A user who does not know the correct pronunciation cannot type the character and cannot simply substitute another.

(3) Education today emphasizes practical use: as long as a character is used correctly, its pronunciation gets little attention. Pronunciation teaching receives less and less weight from senior high school onward, so people of ordinary proficiency generally know less about correct pronunciation. This causes little trouble in daily life, but users of registration, inquiry, or guidance systems (online, in hospitals, or in mass rapid transit, for example) often find they cannot enter a character correctly. As technology advances and public facilities are automated, this becomes a real problem.

In view of these problems with Zhuyin input, the inventor seeks to provide a system that adapts to the ordinary user's proficiency and guides the user to the intended character, improving input ability and efficiency. The system also acts as a tutor: while using it, users learn about the characters they mispronounce and so improve their command of Chinese characters.

[Means of Solving the Problems]

To this end, the present invention provides a method for classifying mispronounced characters that sorts everyday Chinese characters into six categories:

1. Partial-component readings (部份成音類): characters made up of two or more parts that are often pronounced with the Zhuyin of one of those parts.
2. Whole-shape similarity (全體相似類): characters pronounced with the Zhuyin of a familiar character whose overall shape resembles the intended character.
3. Multiple component readings (多讀音類): characters made up of two or more parts, of which two or more may each be taken as the pronunciation.
4. Polyphonic characters (破音字類): characters whose Zhuyin includes the customary correct pronunciation and another pronunciation that is also correct.
5. Confusable sounds (混淆音類): characters whose pronunciation is easily confused, or characters that are habitually misread.
6. Correct readings (正確音類): characters read with the correct pronunciation published by the National Institute for Compilation and Translation (國立編譯館).

The invention further provides a method for building a mispronounced-character database, comprising the following procedures:

Procedure 1: Select a number of test subjects within a given range of proficiency in Chinese characters.
Procedure 2: Give the subjects a character test and collect the mispronunciations of the characters they frequently misread, building a database of the mispronunciations that users of ordinary proficiency are likely to use.
Procedure 3: Classify the mispronounced characters by type.
Procedure 4: Retrieve related information for each of the mispronounced characters, such as its correct pronunciation, actual meaning, or literary source.
Procedure 5: Build a system, usually software, that links three things: the database of likely mispronunciations, the mispronounced characters, and the related information on those characters.

The invention further provides a mispronounced-character database built by this method.

The mispronounced-character input query system comprises:

- a Zhuyin input means, for entering Zhuyin;
- a character retrieval and display means, which displays the characters whose correct pronunciation matches the entered Zhuyin and also the characters, collected in the mispronounced-character database, that the user may intend, for the user to choose from;
- a selection means, by which the user enters the selected character at its destination;
- a query means, by which the user, after entering the character, can retrieve related information about it, such as its correct pronunciation, actual meaning, and literary source.
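One way to picture the six categories and the three-way link of Procedure 5 is a small in-memory index from readings to characters. The sketch below is only an illustration under stated assumptions: the class and field names are hypothetical, tagging a character with several correct readings as polyphonic is this example's own choice, and the three sample entries are chosen for the example rather than taken from the patent's database.

```python
from dataclasses import dataclass, field
from enum import Enum


class Category(Enum):
    PARTIAL_COMPONENT = 1  # read by the sound of one component
    WHOLE_SHAPE = 2        # read like a familiar look-alike character
    MULTI_COMPONENT = 3    # several components could supply the reading
    POLYPHONIC = 4         # more than one correct reading
    CONFUSABLE = 5         # easily confused or habitually misread sounds
    CORRECT = 6            # the standard reading


@dataclass
class Entry:
    char: str
    correct: list                                    # correct reading(s)
    info: str = ""                                   # meaning, examples, source
    misreadings: dict = field(default_factory=dict)  # misreading -> Category


class MispronunciationDB:
    """Links readings (right or wrong), characters, and related information."""

    def __init__(self):
        self.entries = {}     # character -> Entry
        self.by_reading = {}  # reading -> list of (character, Category)

    def add(self, entry):
        self.entries[entry.char] = entry
        # Assumption of this sketch: several correct readings means polyphonic.
        kind = Category.POLYPHONIC if len(entry.correct) > 1 else Category.CORRECT
        for reading in entry.correct:
            self.by_reading.setdefault(reading, []).append((entry.char, kind))
        for reading, category in entry.misreadings.items():
            self.by_reading.setdefault(reading, []).append((entry.char, category))

    def lookup(self, reading):
        return list(self.by_reading.get(reading, []))


db = MispronunciationDB()
db.add(Entry("許", ["ㄒㄩˇ"]))
db.add(Entry("滸", ["ㄏㄨˇ"], info="as in 水滸傳",
             misreadings={"ㄒㄩˇ": Category.PARTIAL_COMPONENT}))
db.add(Entry("行", ["ㄏㄤˊ", "ㄒㄧㄥˊ"]))

hits = db.lookup("ㄒㄩˇ")
print(hits)
```

Indexing correct readings and misreadings in the same table means a single lookup on the typed Zhuyin returns both the characters that really have that reading and the characters people commonly give it to.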

Preferred embodiments of the present invention and their effects are described in detail below with reference to the drawings.

[Preferred Embodiments of the Invention]

The classification method used in the present invention sorts Chinese characters into the following six categories.

Category 1, partial-component readings: characters made up of two or more parts that are often pronounced with the Zhuyin of one of those parts. The specification lists ten example groups, each a reading followed by the characters commonly given that reading. The groups that can still be read in the scanned original include:

- ㄨㄥ: 蓊、嗡
- ㄨㄢˊ: 莞、脘、烷、浣
- ㄘㄠˊ: 漕、螬、艚
- ㄏㄨㄤˊ: 惶、隍、遑、艎、湟、蝗
- ㄓㄠˋ: 晁、桃、祧
- ㄗㄢˋ: 趲、鑽、瓚、讚

Category 2, whole-shape similarity: characters pronounced with the Zhuyin of a familiar character whose overall shape resembles the intended character. Again ten example groups are listed; those that can still be read include:

- ㄧㄥˊ: 贏、籯
- ㄩㄣˋ: 蘊、慍、醞、搵
- ㄐㄧ: 碁、綦
- ㄓㄥ: 癥、澂

Category 3, multiple component readings: characters made up of two or more parts, of which two or more may each be taken as the pronunciation. For example:

1. 鸋: may be read as 鳥 or as 寧
2. 鵁: may be read as 交 or as 鳥

3. 頵: may be read as 君 or as 頁
4. 騉: may be read as 馬 or as 昆
5. 玟: may be read as 王 or as 文
6. 鴠: may be read as 旦 or as 鳥
7. 黵: may be read as 黑 or as 詹
8. 皸: may be read as 軍 or as 皮
9. [character illegible]: may be read as 馬 or as 中
10. 黀: may be read as 麻 or as 取

Category 4, polyphonic characters: characters whose Zhuyin includes the customary correct pronunciation and another pronunciation that is also correct. For example:

1. 瀧: pronounced ㄕㄨㄤ or ㄌㄨㄥˊ
2. 沈: pronounced ㄕㄣˇ or ㄔㄣˊ
3. 晁: pronounced ㄔㄠˊ, ㄓㄠˋ, or ㄓㄠ
4. 行: pronounced ㄏㄤˊ or ㄒㄧㄥˊ
5. 一: pronounced ㄧ, ㄧˊ, or ㄧˋ, among others
6. 不: pronounced ㄅㄨˋ or ㄅㄨˊ
7. 的: pronounced ˙ㄉㄜ or ㄉㄧ (tone mark illegible)
8. 得: pronounced ㄉㄜˊ or ㄉㄟˇ
9. 吃: pronounced ㄔ or ㄐㄧˊ
10. 白: pronounced ㄅㄞˊ or ㄅㄛˊ

Category 5, confusable sounds: characters whose pronunciation is easily confused.

The confusions in this category include:

1. Confusion of ㄏ and ㄈ: for example, 花 is read as ㄈㄚ, and 發生 is read as 花生.
2. Confusion of retroflex and non-retroflex initials: for example, ㄘ and ㄔ, ㄙ and ㄕ, or ㄓ and ㄗ.
3. Confusion of nasal sounds: for example, ㄌ and another initial, or a second pair of sounds (the symbols other than ㄌ are illegible in the scanned original).
4. Imprecise pronunciation: for example 裝, for which the original cites the readings ㄓㄤ and ㄓㄨㄤ (the item is partly illegible).

Category 6, correct readings: characters read with the correct pronunciation published by the National Institute for Compilation and Translation (國立編譯館). For example, 中 is read ㄓㄨㄥ and 文 is read ㄨㄣˊ.

This classification covers the various ways in which people may misread characters, and a very good database system can be built on it.

The invention further provides a method for building a mispronounced-character database. As shown in Figure 1, the method comprises the following procedures:

Procedure 1: Select a number of test subjects within a given range of proficiency in Chinese characters, for example a number of subjects of secondary-school level.
Procedure 2: Give the subjects a character test and collect the mispronunciations of the characters they frequently misread, building a database of the mispronunciations that users of ordinary proficiency are likely to use.
Procedure 3: Classify the mispronounced characters by type.
Procedure 4: Retrieve related information for each of the mispronounced characters, such as its correct pronunciation, actual meaning, or literary source.
Procedure 5: Build a system, usually software, that links three things: the database of likely mispronunciations, the mispronounced characters, and the related information on those characters.

The invention further provides a mispronounced-character database built by this method.

The invention also provides a mispronounced-character input query system, comprising:

- a Zhuyin input means, for entering Zhuyin;
- a character retrieval and display means, which displays the characters whose correct pronunciation matches the entered Zhuyin and also the characters, collected in the mispronounced-character database, that the user may intend, for the user to choose from;
- a selection means, by which the user enters the selected character at its destination;
- a query means, by which the user, after entering the character, can retrieve related information about it, such as its correct pronunciation, actual meaning, and literary source.

The operation of the input query system is shown in Figure 2. The user first enters a Zhuyin reading (step 100). The system finds the characters with that correct pronunciation in the character table and, following the classification method above, also retrieves from the mispronounced-character database every character the user may be looking for but has entered by a mispronunciation (step 110). The correct characters and the possibly mispronounced characters are then displayed (step 120), and the user picks the intended character (step 130). At this point the system can automatically display the character's correct pronunciation, or the user can issue a command to look up its correct meaning (step 140). If the character is not polyphonic, all the information related to it is displayed directly, such as its meaning, idioms, and sources. If it is polyphonic, its alternative readings are displayed (step 150), and the user is prompted that further information on the character, such as its true meaning, example sentences, and literary source, is available for consultation (step 160). The information is then output (step 170) and the routine ends, awaiting the user's next operation.

[Effects of the Invention]

By the means disclosed above, the present invention achieves the following effects.

(1) The characters that people commonly misread are divided into the five categories above which, together with the correct readings, make six categories in all. On this basis the database of mispronounced characters and the program for input and retrieval can be designed, yielding a logical and practical database and retrieval system.

(2) Because the database and retrieval system are derived from tests on subjects of ordinary proficiency, they suit the great majority of users, and their industrial value is high.

(3) Users are given a convenient input method. They no longer waste time searching a dictionary, or lose input speed, because they cannot find the correct reading to type. The invention is novel and efficient.

(4) The correct reading of a mispronounced character, together with its meaning or source, can be shown to the user. This educates the user and raises the user's knowledge of Chinese characters, making this a meaningful method of input and retrieval.

The foregoing is only a preferred embodiment of the present invention and does not limit it. Any changes made within the spirit of the invention also fall within its scope.

[Brief Description of the Drawings]

Figure 1 is a flow diagram of a preferred embodiment of the method for building the mispronounced-character database.

Figure 2 is a flow diagram of the operation of a preferred embodiment of the mispronounced-character input query system.
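The operating flow of Figure 2 (steps 100 to 170) can be sketched in a few functions. This is a hedged illustration only: the three tables, the function names, and the `choose` callback that stands in for the user's selection are assumptions of the example, and the tiny data set is made up around the 滸 case discussed in the specification.

```python
# Hypothetical tables standing in for the character table and the database.
STANDARD = {  # reading -> characters that correctly have this reading
    "ㄒㄩˇ": ["許"],
    "ㄏㄨˇ": ["滸"],
    "ㄏㄤˊ": ["行"],
    "ㄒㄧㄥˊ": ["行"],
}
MISREAD = {"ㄒㄩˇ": ["滸"]}  # misreading -> characters commonly given it
INFO = {
    "許": {"readings": ["ㄒㄩˇ"], "notes": ""},
    "滸": {"readings": ["ㄏㄨˇ"], "notes": "as in 水滸傳"},
    "行": {"readings": ["ㄏㄤˊ", "ㄒㄧㄥˊ"], "notes": ""},
}


def candidates(zhuyin):
    """Steps 100-120: correct matches first, then likely intended characters."""
    exact = STANDARD.get(zhuyin, [])
    guessed = [c for c in MISREAD.get(zhuyin, []) if c not in exact]
    return exact + guessed


def describe(char):
    """Steps 140-160: correct reading(s) and related information."""
    entry = INFO[char]
    return {
        "char": char,
        "readings": entry["readings"],
        "polyphonic": len(entry["readings"]) > 1,  # step 150 applies if True
        "notes": entry["notes"],
    }


def query(zhuyin, choose):
    """Run one pass of the flow; `choose` plays the user's pick (step 130)."""
    shown = candidates(zhuyin)
    if not shown:
        return None
    return describe(choose(shown))  # step 170: hand the result back


result = query("ㄒㄩˇ", choose=lambda shown: shown[-1])
print(result)
```

Here a user who types ㄒㄩˇ while meaning 滸 is shown both 許 and 滸, picks 滸, and is told that its correct reading is ㄏㄨˇ.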

Claims (4)

1. A method of classifying mispronounced characters (白字), which classifies everyday Chinese characters into the following six types:
(1) Partial-sound type: a group of characters, each composed of two or more parts, that are often pronounced with the phonetic notation (Zhuyin) of one of their component parts;
(2) Overall-similarity type: characters pronounced with the Zhuyin of a character familiar to the user whose overall shape resembles that of the intended character;
(3) Multi-reading type: characters composed of two or more parts, where two or more of those parts can each supply a different Zhuyin reading for the character;
(4) Heteronym type (破音字): characters whose readings include the customary correct pronunciation and another pronunciation that is also correct;
(5) Confusable-sound type: characters whose pronunciations are easily confused;
(6) Correct-sound type: characters pronounced in accordance with the correct pronunciation published by the National Institute for Compilation and Translation (國立編譯館).

2. A method of establishing a mispronounced-character database, comprising the following procedures:
Procedure 1: selecting a number of test subjects whose Chinese-character proficiency falls within a certain range;
Procedure 2: administering a Chinese-character test to said subjects and collecting mispronunciation data for the characters they often mispronounce, so as to establish a database of the mispronunciations that users of ordinary proficiency are likely to use;
Procedure 3: classifying and organizing the group of mispronounced characters by nature;
Procedure 4: retrieving related information for each character in said group of mispronounced characters, such as its correct pronunciation, its actual meaning, or its source;
Procedure 5: establishing a linked retrieval system, usually software, that connects three things: the database of mispronunciations likely to be used for said mispronounced characters, the group of mispronounced characters, and the related information of said mispronunciation data.

3. A mispronounced-character database, being a database established by the method of establishing a mispronounced-character database described in claim 2.

4. A mispronounced-character input query system, being a system that uses the mispronounced-character database described in claim 3, comprising:
a Zhuyin input means for inputting Zhuyin;
a related-character retrieval and display means which, in addition to displaying the characters whose correct pronunciation is the input Zhuyin, also displays, for the user to choose from, the group of characters collected in the database established by said method that might be entered with that Zhuyin;
a selection means by which the user enters the selected character at the destination;
a query means by which the user, after entering the desired character at the destination, can retrieve related information about that character, such as its correct pronunciation, actual meaning, and source.
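As a rough illustration of how the database-establishment procedures and the input query system could fit together, the following Python sketch builds a small mispronunciation table from pronunciation-test answers and then looks up candidate characters by a typed Zhuyin reading. Everything concrete here is an assumption made for illustration: the patent does not specify data structures, the names `CharInfo`, `build_database`, `build_index`, and `lookup` are hypothetical, the 0.7 threshold is arbitrary, and the sample answers are invented. The sketch covers only the candidates drawn from the mispronunciation database, not the ordinary candidates an input method would also show.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class CharInfo:
    """Related information kept for one character (hypothetical layout)."""
    correct_reading: str
    meaning: str = ""
    mispronunciations: set = field(default_factory=set)


def build_database(test_results, correct_readings, max_correct_ratio=0.7):
    """Build the database from pronunciation-test answers.

    test_results maps a character to the readings given by the test subjects.
    Characters whose correct-answer ratio exceeds max_correct_ratio are
    dropped to keep the database small (the threshold value is an assumption).
    """
    database = {}
    for char, answers in test_results.items():
        correct = correct_readings[char]
        ratio = sum(1 for a in answers if a == correct) / len(answers)
        if ratio > max_correct_ratio:
            continue  # rarely mispronounced, so not stored
        wrong = {a for a in answers if a != correct}
        database[char] = CharInfo(correct, mispronunciations=wrong)
    return database


def build_index(database):
    """Index stored characters by every reading, right or wrong, a user may type."""
    index = defaultdict(set)
    for char, info in database.items():
        index[info.correct_reading].add(char)
        for reading in info.mispronunciations:
            index[reading].add(char)
    return index


def lookup(index, typed_reading):
    """Return the candidate characters for a typed reading, sorted for display."""
    return sorted(index.get(typed_reading, set()))


# Invented sample data: four subjects read two characters aloud.
sample_results = {
    "酗": ["ㄒㄩㄥ", "ㄒㄩㄥ", "ㄒㄩˋ", "ㄒㄩㄥ"],  # often read like its component 凶
    "人": ["ㄖㄣˊ", "ㄖㄣˊ", "ㄖㄣˊ", "ㄖㄣˊ"],    # always read correctly
}
sample_correct = {"酗": "ㄒㄩˋ", "人": "ㄖㄣˊ"}

sample_db = build_database(sample_results, sample_correct)
sample_index = build_index(sample_db)
print(lookup(sample_index, "ㄒㄩㄥ"))  # ['酗']
```

In this sketch a user who types the wrong reading ㄒㄩㄥ is still offered 酗, and the stored `CharInfo` for the chosen character can then supply its correct reading, which mirrors the role of the query means. Keeping the wrong readings in the same index as the correct ones is one possible design choice among several.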
TW88105799A 1999-04-12 1999-04-12 Classification method, database, database establishment method, and input query system of mispronounced Chinese characters TW432300B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW88105799A TW432300B (en) 1999-04-12 1999-04-12 Classification method, database, database establishment method, and input query system of mispronounced Chinese characters

Publications (1)

Publication Number Publication Date
TW432300B true TW432300B (en) 2001-05-01

Family

ID=21640257

Family Applications (1)

Application Number Title Priority Date Filing Date
TW88105799A TW432300B (en) 1999-04-12 1999-04-12 Classification method, database, database establishment method, and input query system of mispronounced Chinese characters

Country Status (1)

Country Link
TW (1) TW432300B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108415895A (en) * 2017-02-09 2018-08-17 腾讯科技(北京)有限公司 Media content error correction method and device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108415895A (en) * 2017-02-09 2018-08-17 腾讯科技(北京)有限公司 Media content error correction method and device
CN108415895B (en) * 2017-02-09 2023-04-07 腾讯科技(北京)有限公司 Media content error correction method and device

Legal Events

Date Code Title Description
GD4A Issue of patent certificate for granted invention patent
MM4A Annulment or lapse of patent due to non-payment of fees