JPS6162985A - Recognition order determining system - Google Patents

Recognition order determining system

Info

Publication number
JPS6162985A
Authority
JP
Japan
Prior art keywords
character type
recognition
character
resemblance
candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP59185643A
Other languages
Japanese (ja)
Inventor
Hiroshi Matsumura
松村 博
Tatsunosuke Iwahara
岩原 達之助
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tokyo Sanyo Electric Co Ltd
Sanyo Electric Co Ltd
Original Assignee
Tokyo Sanyo Electric Co Ltd
Sanyo Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tokyo Sanyo Electric Co Ltd, Sanyo Electric Co Ltd filed Critical Tokyo Sanyo Electric Co Ltd
Priority to JP59185643A priority Critical patent/JPS6162985A/en
Publication of JPS6162985A publication Critical patent/JPS6162985A/en
Pending legal-status Critical Current

Landscapes

  • Character Discrimination (AREA)

Abstract

PURPOSE: To improve the recognition rate of the titled system without requiring an enormous knowledge part by determining priority levels of individual character type categories in accordance with frequency and determining the recognition order of candidate character type categories on the basis of calculated degrees of resemblance and priority levels. CONSTITUTION: A binary character pattern outputted from a character observing part 1 is inputted to a pattern matching part 4 through a feature extracting part 2, and degrees of resemblance between a feature pattern and standard feature patterns are calculated. A priority level is determined for each character type category in a dictionary part 3 in accordance with frequency. Operating parts 4a-4c select candidate character type categories from category sets in the order of the degree of resemblance, and their character type codes and degrees of resemblance are stored in a candidate memory 5 together with priority levels of category sets. A knowledge part 6 determines the recognition order of candidate character type categories on the basis of relations between degrees of resemblance and priority levels and stores character type codes in a result memory 8 in accordance with the recognition order.

Description

DETAILED DESCRIPTION OF THE INVENTION (a) Field of Industrial Application The present invention relates to a character recognition system for recognizing handwritten Chinese characters (kanji), and more particularly to a method for determining the recognition ranking of candidate character type categories.

(b) Prior Art In general, a character recognition system calculates the degree of similarity between a feature pattern extracted from an input character pattern and the standard feature pattern of each character type category registered in advance in a dictionary section, and selects the n candidate character type categories with the highest similarity.

The candidate character type category with the highest similarity is then output as the recognition result, and, so that misrecognitions can be corrected, the n selected candidate character type categories are assigned recognition rankings from 1st to n-th in descending order of similarity.
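
The conventional scheme described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the function name `rank_by_similarity`, the codes `M1` to `M4`, and the numeric similarity scores are hypothetical and do not come from the patent.

```python
from typing import Dict, List, Tuple


def rank_by_similarity(similarities: Dict[str, float], n: int) -> List[Tuple[int, str]]:
    """Return the n most similar categories as (rank, code) pairs, rank 1 first."""
    # Sort by similarity, largest first; ties keep their original order.
    ordered = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    return [(rank, code) for rank, (code, _) in enumerate(ordered[:n], start=1)]


# Hypothetical similarity scores for four character type categories.
scores = {"M1": 0.62, "M2": 0.91, "M3": 0.78, "M4": 0.40}
ranking = rank_by_similarity(scores, n=3)
```

Here the first-ranked entry would be output as the recognition result, and the remaining entries would serve as fallbacks if it proves wrong.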

However, determining the recognition ranking from similarity alone, as described above, results in many misrecognitions. Methods have therefore been considered in which some form of post-processing is applied after multiple candidate character type categories have been selected by similarity, and the recognition ranking is determined from the result.

As such post-processing, methods have previously been proposed that perform grammatical processing, as disclosed in Japanese Patent Application Laid-Open No. 59-32082, and that determine whether the characters before and after the character being recognized are kanji, katakana, or hiragana, as in Japanese Patent Application Laid-Open No. 59-27381.

(c) Problems to Be Solved by the Invention In the prior art that performs grammatical processing as post-processing, the knowledge section, which holds grammatical dictionaries and the like, becomes enormous, and the processing itself becomes very complex. In the method that determines whether the preceding and following characters are kanji, katakana, and so on, no improvement in the recognition rate can be expected when the selected candidate character type categories consist only of kanji or only of hiragana.

(d) Means for Solving the Problems In the present invention, a priority corresponding to frequency of use is assigned to each character type category, and the recognition ranking of the candidate character type categories is determined on the basis of both the calculated similarity and this priority.

(e) Operation According to the present invention, character type categories that have high similarity and are frequently used come to be ranked higher in the recognition ranking, and the knowledge section need only store the relationship between similarity and priority.

(f) Embodiment Fig. 1 is a block diagram of a character recognition system to which the present invention is applied. (1) is a character observation section that reads the characters written on an input document and outputs the result as a binary character pattern; (2) is a feature extraction section that extracts a feature pattern from the input character pattern; (3) is a dictionary section that stores a standard feature pattern for each character type category; and (4) is a pattern matching section that matches the extracted feature pattern against the standard feature patterns and calculates the similarity between the two.

The character type categories in the dictionary section (3) are grouped by frequency: high-frequency categories form category set 1, medium-frequency categories form category set 2, and low-frequency categories form category set 3. Priorities 1 to 3 are assigned to category sets 1 to 3, respectively.
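
One way to picture this grouping is sketched below. The patent does not say how the frequency boundaries are chosen, so the equal three-way split used here is purely an assumption, as are the function name `assign_priorities` and the usage counts.

```python
from typing import Dict


def assign_priorities(frequencies: Dict[str, int]) -> Dict[str, int]:
    """Map each character type code to priority 1 (frequent) through 3 (rare)."""
    # Order the codes from most to least frequent.
    ordered = sorted(frequencies, key=frequencies.get, reverse=True)
    third = -(-len(ordered) // 3)  # ceiling division so every code lands in a set
    return {code: min(index // third, 2) + 1 for index, code in enumerate(ordered)}


# Hypothetical usage counts for six categories.
counts = {"Ma": 900, "Mb": 850, "Mp": 300, "Mq": 280, "Mv": 12, "Mw": 5}
priorities = assign_priorities(counts)
```

In a real dictionary the boundaries between the sets would more likely be fixed when the dictionary is built, from corpus statistics rather than a strict equal split.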

The pattern matching section (4) has three operating units (4a) to (4c), one for each of category sets 1 to 3. Each operating unit selects n candidate character type categories from its category set in descending order of similarity and stores their character type codes and calculated similarities in the candidate memory (5). In doing so, the operating unit attaches the priority of its category set to each character type code and similarity, so that these three items are stored in the candidate memory (5) as the information for each candidate character type category. In this way, n candidates from each category set, 3n candidate character type categories in total, are stored in the candidate memory (5).
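
The selection step can be sketched as below, assuming similarity is expressed as a distance in which a smaller value means a closer match. The function name `fill_candidate_memory`, the codes `Mc` and `Mx`, and all distance values are hypothetical; the sketch models the candidate memory as a plain list.

```python
import heapq
from typing import Dict, List, Tuple

Candidate = Tuple[str, float, int]  # (character type code, distance, priority)


def fill_candidate_memory(distances_by_set: Dict[int, Dict[str, float]], n: int) -> List[Candidate]:
    """Collect the n closest categories from each category set."""
    memory: List[Candidate] = []
    for priority, distances in sorted(distances_by_set.items()):
        # One operating unit per category set; each pick is tagged with the set's priority.
        closest = heapq.nsmallest(n, distances.items(), key=lambda item: item[1])
        memory.extend((code, distance, priority) for code, distance in closest)
    return memory


# Hypothetical distances, keyed by the priority of each category set.
distances_by_set = {
    1: {"Ma": 30, "Mb": 50, "Mc": 95},
    2: {"Mp": 20, "Mq": 40, "Mr": 60},
    3: {"Mv": 10, "Mw": 70, "Mx": 99},
}
candidate_memory = fill_candidate_memory(distances_by_set, n=2)
```

With three sets and n = 2, the memory holds 3n = 6 entries, each carrying the three items named in the text.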

The knowledge section (6) stores the relationship between similarity and priority. Referring to the contents of the knowledge section (6), the clustering control processing section (7) determines the recognition rankings of the top n of the 3n candidate character type categories stored in the candidate memory (5) and stores their character type codes in the result memory (8) in order of recognition ranking. For example, if the city block distance Di is used as the measure of similarity, with a smaller distance meaning a higher similarity, the knowledge section (6) specifically stores, as shown in Fig. 2, the relationship between the distance Di and whether the recognition order may be rearranged according to priority.
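
For reference, the city block distance is the sum of the absolute differences between corresponding components of two patterns. The sketch below assumes feature patterns are equal-length numeric vectors, which the patent does not specify; the vectors themselves are invented for illustration.

```python
from typing import Sequence


def city_block_distance(feature: Sequence[float], standard: Sequence[float]) -> float:
    """Sum of absolute component differences (L1 distance)."""
    if len(feature) != len(standard):
        raise ValueError("feature patterns must have the same length")
    return sum(abs(x - y) for x, y in zip(feature, standard))


# Hypothetical feature pattern and two hypothetical standard patterns.
input_pattern = [1, 0, 3, 2]
distance_a = city_block_distance(input_pattern, [1, 1, 3, 2])
distance_b = city_block_distance(input_pattern, [4, 0, 0, 2])
```

The first standard pattern differs in a single component and so yields the smaller distance, meaning the higher similarity.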

The way the recognition ranking is determined is explained below using a specific example.

First, a candidate character type category with character type code Mi, city block distance Di, and priority P (i = a, b, c, ...; P = 1, 2, 3) is written (Mi, Di, P). Suppose, for example, that the five candidate character type categories with the highest similarity have been selected from each of category sets 1 to 3, as shown in Fig. 3(a), and that their city block distances Di satisfy Dv < Dp < Da < Dq < Db < Dr < Dw.

If, as in the prior art, the ranking were based only on the magnitude of the city block distance, the recognition order would be as shown in Fig. 3(b).

Suppose, however, that the thresholds D1, D2, and D3 held in the knowledge section (6) and the calculated city block distances satisfy D1 < Dv, Da < D2 < Dq, and Dw < D3. The clustering control processing section (7) then assigns the character type codes Mv, Mp, and Ma to rank B and the character type codes Mq, Mb, Mr, and Mw to rank C, and rearranges the recognition order by priority within each rank. As shown in Fig. 3(c), the character type codes with higher priority therefore come first within each rank, and a recognition ranking based on both similarity and priority results. The top five character type codes, Ma, Mp, Mv, Mb, and Mq, are stored in that order in the result memory (8), and the answer output control section (9) sends the first-ranked character type code Ma to a character display device such as a word processor or personal computer as the recognition result, where that character is displayed. If this turns out to be a misrecognition, the character type code with the next recognition ranking is sent, and the codes of successively lower rankings are sent in turn until the correct recognition result is obtained.
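
The behaviour of the answer output control section can be sketched as a simple loop over the result memory. How the system learns that an output was wrong is not described, so the `is_correct` callback below is a hypothetical stand-in for the user's confirmation, and the list of codes is illustrative only.

```python
from typing import Callable, Iterable, Optional


def output_until_accepted(result_memory: Iterable[str], is_correct: Callable[[str], bool]) -> Optional[str]:
    """Send codes in recognition order; stop at the first one that is accepted."""
    for code in result_memory:
        # In the described system this is where the code would go to the display device.
        if is_correct(code):
            return code
    return None  # every stored candidate was rejected


# Illustrative result memory and a user who accepts only "Mv".
result_memory = ["Ma", "Mp", "Mv", "Mb", "Mq"]
accepted = output_until_accepted(result_memory, lambda code: code == "Mv")
```

In this run the first two codes are rejected and the third is accepted, so the better the ranking, the fewer corrections the user has to make.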

Candidate character type categories that have high similarity and high frequency are thus given precedence, so such characters are ranked higher and the recognition rate improves.

As another example, if the city block distance Dv is sufficiently smaller than those of the other candidates, so that Dv < D1, while D1 < Dp and Db < D2, the character type code Mv is assigned to rank A and its recognition ranking is not changed, whereas the rank B character type codes Mp, Ma, Mq, and Mb are reordered by priority, giving the recognition order shown in Fig. 3(d). In this case too, compared with Fig. 3(b), the frequently used candidate character type categories are ranked higher.
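
Both worked examples can be reproduced with the sketch below. It rests on several assumptions that are not in the patent text: the numeric distances and thresholds are invented to satisfy the stated inequalities; codes a and b are taken to have priority 1, codes p, q, and r priority 2, and codes v and w priority 3; candidates with equal priority inside a rank keep their distance order; and only the seven candidates named in the text are included rather than all fifteen.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple

Candidate = Tuple[str, float, int]  # (character type code, city block distance, priority)


def decide_order(candidates: Sequence[Candidate], thresholds: Sequence[float], n: int) -> List[str]:
    """Rank candidates by distance band first, then by priority inside a band."""

    def sort_key(candidate: Candidate):
        _, distance, priority = candidate
        band = bisect_right(thresholds, distance)  # 0 = rank A, 1 = rank B, 2 = rank C
        # Rank A keeps pure distance order; other bands let priority reorder them.
        effective_priority = 0 if band == 0 else priority
        return (band, effective_priority, distance)

    return [code for code, _, _ in sorted(candidates, key=sort_key)[:n]]


# Invented distances satisfying Dv < Dp < Da < Dq < Db < Dr < Dw.
candidates = [
    ("Mv", 10, 3), ("Mp", 20, 2), ("Ma", 30, 1), ("Mq", 40, 2),
    ("Mb", 50, 1), ("Mr", 60, 2), ("Mw", 70, 3),
]
first_example = decide_order(candidates, thresholds=[5, 35, 80], n=5)
second_example = decide_order(candidates, thresholds=[15, 55, 80], n=5)
```

With the first set of thresholds Mv falls in rank B and is overtaken by the higher-priority codes; with the second it falls in rank A and stays first. Storing only a few thresholds is what keeps the knowledge section small.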

In this embodiment, the character type categories are divided in advance into category sets according to priority, and the priority information is obtained from the set. Alternatively, the priority information may be stored in advance in the dictionary section (3) together with each standard feature pattern.
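
The alternative layout, in which each dictionary entry carries its own priority, might look like the sketch below. The class name `DictionaryEntry`, its field names, and the sample patterns are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class DictionaryEntry:
    code: str                            # character type code
    standard_pattern: Tuple[float, ...]  # standard feature pattern
    priority: int                        # 1 = frequent, 3 = rare


# Hypothetical dictionary with the priority stored beside each pattern.
entries = [
    DictionaryEntry("Ma", (1.0, 1.0, 3.0), 1),
    DictionaryEntry("Mp", (0.0, 2.0, 1.0), 2),
    DictionaryEntry("Mv", (4.0, 0.0, 0.0), 3),
]
dictionary: Dict[str, DictionaryEntry] = {entry.code: entry for entry in entries}
priority_of_mp = dictionary["Mp"].priority
```

With this layout a single matching unit can scan the whole dictionary and read the priority directly from each matched entry, instead of inferring it from the category set.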

(g) Effects of the Invention According to the present invention, frequently used character type categories come to be ranked higher in the recognition ranking, so the recognition rate improves. In addition, no enormous knowledge section is required, and the recognition ranking can be determined in a short processing time.

[Brief Description of the Drawings]

Fig. 1 is a block diagram of a character recognition system to which the present invention is applied, Fig. 2 is an explanatory diagram showing the contents of the knowledge section, and Figs. 3(a) to 3(d) are explanatory diagrams showing a specific example of recognition ranking determination. Explanation of main reference numerals: (1) character observation section, (2) feature extraction section, (3) dictionary section, (4) pattern matching section, (5) candidate memory, (6) knowledge section, (7) clustering control processing section, (8) result memory, (9) answer output control section. Applicant: Sanyo Electric Co., Ltd. and one other. Agent: Patent Attorney Shizuo Sano.

Claims (1)

[Claims] (1) A recognition ranking determination method for a character recognition system that calculates the degree of similarity between a feature pattern extracted from an input character pattern and the standard feature pattern of each character type category registered in advance in a dictionary section and selects a plurality of candidate character type categories, characterized in that a priority corresponding to frequency is assigned to each of the character type categories, and the recognition ranking of the plurality of candidate character type categories is determined on the basis of the similarity obtained by the calculation and the priority.
JP59185643A 1984-09-04 1984-09-04 Recognition order determining system Pending JPS6162985A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP59185643A JPS6162985A (en) 1984-09-04 1984-09-04 Recognition order determining system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP59185643A JPS6162985A (en) 1984-09-04 1984-09-04 Recognition order determining system

Publications (1)

Publication Number Publication Date
JPS6162985A true JPS6162985A (en) 1986-03-31

Family

ID=16174357

Family Applications (1)

Application Number Title Priority Date Filing Date
JP59185643A Pending JPS6162985A (en) 1984-09-04 1984-09-04 Recognition order determining system

Country Status (1)

Country Link
JP (1) JPS6162985A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS6465680A (en) * 1987-09-04 1989-03-10 Fujitsu Ltd Character recognizing system

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS57209575A (en) * 1981-06-19 1982-12-22 Fujitsu Ltd Character recognizing device


Similar Documents

Publication Publication Date Title
US5745602A (en) Automatic method of selecting multi-word key phrases from a document
JP3981734B2 (en) Question answering system and question answering processing method
JP2005228328A (en) Apparatus and method for searching for digital ink query
CN111274428B (en) Keyword extraction method and device, electronic equipment and storage medium
JPS6162985A (en) Recognition order determining system
CN112559324B (en) Software test case generation method based on in-application visual mining
JPS592191A (en) Recognizing and processing system of handwritten japanese sentence
JPS5882373A (en) Online character recognizing method
JPH0896081A (en) Character recognizing device and character recognizing method
JPS6162986A (en) Recognition order determining system
JPH09325962A (en) Document corrector and program storage medium
JPH08512162A (en) Cursive writing analysis method
JPS5842904B2 (en) Handwritten kana/kanji character recognition device
JPS6168679A (en) Decision system of recognition order
KR100255640B1 (en) Character recognizing method
JP3763262B2 (en) Handwritten character recognition device
JPS6186883A (en) Recognition system for on-line handwritten character
JPH10232864A (en) Sentence input device and computer readable recording medium recording sentence input program
JPS6172376A (en) Recognizing order deciding system
KR930012140B1 (en) Recogntion method of on-line writing down character using stroke automata
CN117917621A (en) Chinese character input method and system and keyboard
JPS6059487A (en) Recognizer of handwritten character
CN117634474A (en) Language identification method and related equipment applied to medium-day text
JPH04115383A (en) Character recognizing system for on-line handwritten character recognizing device
JPH0338630B2 (en)