JPH0793466A

JPH0793466A - Device for discriminating character kind and method therefor

Info

Publication number: JPH0793466A
Application number: JP5236002A
Authority: JP
Inventors: Masahiro Shishikura; 正博宍倉
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1993-09-22
Filing date: 1993-09-22
Publication date: 1995-04-07

Abstract

PURPOSE: To accurately discriminate whether a character to be read is handwritten or printed, in a recognition device used for reading characters. CONSTITUTION: For example, a one-dot border of white picture elements is added around each piece of binarized character information to form a white-framed pattern. Sixteen kinds of 2×2 patterns, each consisting of four picture elements (2×2 dots) and each a different combination of white and black picture elements, are individually matched against the white-framed pattern. The frequency of occurrence of each 2×2 pattern in the white-framed pattern is then counted, and the kind of character is discriminated from the ratio of linear to non-linear components among the 2×2 patterns. Because the outline of a printed character is more linear than that of a handwritten character, the kind of character can be discriminated in this way, so that each character is always recognized using a dictionary suited to its kind.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a character type discriminating device, and a discriminating method therefor, used in character reading, for example in address recognition by an automatic mail reading and sorting machine that reads the addresses on mail and sorts the mail accordingly.

[0002]

[Prior Art] In recent years, automatic mail reading and sorting machines, which read the address written on each piece of mail and sort the mail accordingly, have been developed as automated equipment using character reading technology and have been introduced at central and other post offices.

[0003] In such an automatic mail reading and sorting machine, it is very important to correctly detect the address information area, that is, the position where the address is written, and it is also becoming important to correctly discriminate the character type of the address.

[0004] That is, addresses on mail used to be written by hand in most cases, but recently the amount of mail whose address is written in printed type, using a word processor or the like, has been increasing.

[0005] In character reading technology, accurate recognition of characters is essential, and the accuracy of recognition determines the accuracy of reading. Therefore, discriminating the character type at recognition time and using a dictionary suited to that character type makes more accurate reading possible.

[0006] For this reason, an automatic mail reading and sorting machine determines, when reading an address, whether the address is handwritten or printed type, and performs address recognition processing using a dictionary chosen according to the result.

[0007] By making address recognition processing faster and more accurate in this way, the accuracy of address reading is improved and a high sorting rate is secured. However, in order to use different dictionaries according to the character type, that is, a handwriting dictionary for handwritten characters and a printed type dictionary for printed type, the character type of the character to be recognized must be discriminated accurately.

[0008] If this discrimination is wrong, normal recognition processing cannot be performed; misreads and rejects result, and the accuracy of address reading falls. Conventionally, however, the character type has been discriminated by exploiting the fact that handwriting shows more variation than printed type in character size and in the distance between characters (pitch). This approach discriminates accurately when the number of characters is relatively large, but has the drawback that it cannot do so when the number of characters is small.

[0009] That is, a conventional automatic mail reading and sorting machine judges whether an address is handwritten or printed type from the size of each character cut out, character by character, from the address information area of the mail and from the pitch between characters. The discrimination therefore tends to become inaccurate depending on the length of the address, which is an obstacle to securing a high sorting rate.

[0010] Thus, in the field of character reading, including sorting machines that read addresses and sort mail, the character type must be discriminated correctly, and failure to do so has been a cause of reduced reading accuracy.

[0011]

[Problems to Be Solved by the Invention] As described above, the type of character to be read must be discriminated correctly; conventionally, when it could not be, reading accuracy fell and, as a consequence, the performance of the automated equipment deteriorated.

[0012] Accordingly, an object of the present invention is to provide a character type discriminating device and a discriminating method therefor that can discriminate the type of a character to be read more accurately and that are suitable for use in character reading devices that perform recognition processing in automated equipment such as automatic mail reading and sorting machines.

[0013]

[Means for Solving the Problems] To achieve the above object, the character type discriminating device of the present invention, which discriminates whether a character to be read is handwritten or printed type, comprises detection means for detecting linear components and non-linear components of the character to be read, and discrimination means for discriminating the character type of the character from the ratio of the linear components to the non-linear components detected by the detection means.

[0014] Further, the character type discriminating device of the present invention, which discriminates whether a character to be read is handwritten or printed type, comprises: forming means for forming a pattern to be collated by adding a frame of white pixels, one dot wide, around the binary dot pattern of the character to be read; mask means for applying, to the pattern to be collated formed by the forming means, 16 kinds of mask patterns, each consisting of four pixels (2×2 dots) and made up of a combination of white and black pixels; counting means for counting the occurrence rate, in the pattern to be collated, of each mask pattern applied by the mask means; and judging means for judging whether the character is handwritten or printed type from the occurrence rate of each mask pattern counted by the counting means.

[0015] Further, in the character type discriminating method of the present invention, when discriminating whether a character to be read is handwritten or printed type, the linear components and non-linear components of the character to be read are detected, and the character type of the character is discriminated from the ratio of the detected linear components to the detected non-linear components.

[0016] Furthermore, in the character type discriminating method of the present invention, when discriminating whether a character to be read is handwritten or printed type, a pattern to be collated is formed by adding a frame of white pixels, one dot wide, around the binary dot pattern of the character to be read; 16 kinds of mask patterns, each consisting of four pixels (2×2 dots) and made up of a combination of white and black pixels, are applied to the pattern to be collated; the occurrence rate of each applied mask pattern in the pattern to be collated is counted; and whether the character is handwritten or printed type is judged from the counted occurrence rate of each mask pattern.

[0017]

[Operation] With the means described above, the present invention makes it possible to discriminate between handwritten and printed type regardless of the number of characters, so that an appropriate dictionary for the character type can be used in the subsequent recognition processing.

[0018]

[Embodiment] An embodiment of the present invention is described below with reference to the drawings. FIG. 1 schematically shows the configuration of an automatic mail reading and sorting machine according to the present invention.

[0019] That is, this automatic mail reading and sorting machine comprises: a supply unit 11 in which mail P (the items to be read), namely letters such as postcards and sealed letters, is set in a batch in an upright position; a take-out unit 12 that takes out the mail P set in the supply unit 11 one piece at a time, starting from the front; an intake transport path 13 that transports the mail P taken out by the take-out unit 12; a reading unit 14 that reads the address information of the mail P transported along the transport path 13; and a sorting unit 15 that sorts the mail P whose address information has been read by the reading unit 14 on the basis of the reading result (sorting designation data).

[0020] The sorting unit 15 consists of a letter transport path 16 that transports the mail P that has passed the reading unit 14, sorting transport paths 17a to 17g arranged in multiple vertical tiers (here, for example, seven tiers A to G), and a large number of pockets (stacking boxes) 18 provided along each of the sorting transport paths 17a to 17g.

[0021] An operator panel 19, serving as the control panel operated by an operator (a post office employee), is provided above the supply unit 11. Transport detectors (not shown), consisting for example of photosensors, are provided at various points along the transport paths 13, 16, and 17a to 17g to detect the passage of the mail P along each path.

[0022] The mail P set in the supply unit 11 is taken out piece by piece by the take-out unit 12, transported along the intake transport path 13, and delivered to the reading unit 14. The reading unit 14 then reads the address information written on the mail P.

[0023] The mail P is then sent to the sorting unit 15, where it is selectively transported along the letter transport path 16 and one of the sorting transport paths 17a to 17g on the basis of the sorting designation data corresponding to the address information read by the reading unit 14, and is sorted and stacked in a predetermined pocket, that is, the pocket 18 corresponding to the sorting designation data.

[0024] FIG. 2 shows the schematic configuration of the reading unit 14. The reading unit 14 consists of a photoelectric conversion unit 14a, which acquires an image of the entire surface of the mail P and converts it photoelectrically, and an identification unit 14b, which identifies the address information by recognizing character patterns in the output of the photoelectric conversion unit 14a.

[0025] The photoelectric conversion unit 14a obtains a pattern signal (read signal) by optically scanning the surface of the mail P on which the address information is written and converting it photoelectrically. It consists, for example, of a light source that illuminates the mail P and a self-scanning CCD image sensor that receives the light reflected from the mail P and converts it into an electric signal.

[0026] The identification unit 14b consists of an address area detection unit 42, a character recognition unit 43, a town name and block recognition unit 44, an address dictionary 45, and an address recognition unit 46. On the basis of the signal from the photoelectric conversion unit 14a, the address area detection unit 42 detects, from all the information written on the mail P, the area in which the address information is written (the reading area), and outputs data indicating the position of this address information area.

[0027] The detection method used by the address area detection unit 42 is described in detail in, for example, Japanese Patent Application No. Hei 5-62365, and is not explained here. The character recognition unit 43 consists of: a detection and cut-out circuit 52, which detects and cuts out, line by line and then character by character, the signal supplied from the address area detection unit 42, that is, each piece of character information corresponding to the address information within the address information area; a normalization circuit 53, which normalizes and samples the output of the detection and cut-out circuit 52, that is, the cut-out character information; and a recognition circuit 54, which recognizes the character information processed by the normalization circuit 53 using, for example, a dictionary 55.

[0028] The recognition processing performed by the recognition circuit 54 using the dictionary 55 is described in detail later. The town name and block recognition unit 44 recognizes the town name and block from the recognized characters supplied by the character recognition unit 43, using the addresses registered in the address dictionary 45.

[0029] The address recognition unit 46 recognizes the address from the town name and block supplied by the town name and block recognition unit 44, and outputs the sorting designation data corresponding to that address.

[0030] That is, the sorting designation data indicates the position of a pocket 18 in the sorting unit 15, and the mail P corresponding to that sorting designation data is sorted and stacked in that pocket 18.

[0031] The recognition circuit 54 recognizes characters by obtaining, for example by the multiple similarity method, the similarity between the character information to be recognized and the reference patterns corresponding to the characters in the dictionary 55. In doing so, it discriminates the character type of the character information to be recognized and selects the part of the dictionary 55 to use according to the result.

[0032] That is, the dictionary 55 contains, prepared in advance, a handwritten character dictionary 55a holding reference patterns corresponding to handwritten characters and a printed character dictionary 55b holding reference patterns corresponding to printed type. The recognition processing described above is performed using the handwritten character dictionary 55a when the character information to be recognized is judged to be handwritten, and using the printed character dictionary 55b when it is judged to be printed type.

[0033] The method by which the recognition circuit 54 discriminates the character type of the character information to be recognized is described below. FIG. 3 shows examples of the character types used to write the address information on the mail P.

[0034] That is, the address information on the mail P may consist of printed characters produced by a word processor or the like (FIG. 3(a)) or of handwritten characters (FIG. 3(b)).

[0035] Because printed characters are formed mechanically, their outlines are in many cases more linear than those of handwritten characters. This embodiment exploits that feature: the character type is discriminated by examining whether the outline of the character information to be recognized is linear, and also by detecting the size and pitch of the pieces of character information corresponding to the address information.

[0036] FIG. 4 shows the flow of processing for the character type discrimination described above. The recognition circuit 54 first adds one dot of white pixels on each side (top, bottom, left, and right) of the character information (binarized pattern) that was cut out by the detection and cut-out circuit 52 and supplied via the normalization circuit 53, forming a white-framed pattern that serves as the pattern to be collated (step ST1).
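Step ST1 can be sketched in a few lines of Python. This is only an illustration under assumptions not stated in the patent: the binarized pattern is represented as a list of rows, with 1 for a black pixel and 0 for a white pixel, and the function name `add_white_frame` is hypothetical.

```python
def add_white_frame(pattern):
    """Return a copy of a binary pattern surrounded by a one-dot white border.

    pattern: list of equal-length rows, 1 = black pixel, 0 = white pixel
    (this representation is an assumption made for the sketch).
    """
    width = len(pattern[0])
    blank_row = [0] * (width + 2)
    # One white row above, one white pixel on each side of every row,
    # and one white row below.
    return [blank_row[:]] + [[0] + list(row) + [0] for row in pattern] + [blank_row[:]]


# A tiny 3x3 "T" used only as sample input.
T = [
    [1, 1, 1],
    [0, 1, 0],
    [0, 1, 0],
]
framed_T = add_white_frame(T)
```

The frame guarantees that every black pixel on the edge of the character has white neighbours, so outline segments that touch the edge of the cut-out box are still seen by the 2×2 patterns applied in the later steps.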

[0037] FIG. 5 shows an example of a white-framed pattern obtained by adding a white frame around the binarized pattern "T" cut out from the address information area on the mail P. Next, the recognition circuit 54 creates 16 kinds of 2×2 patterns (mask patterns), each consisting of four pixels (2×2 dots) and each a different combination of white and black pixels (step ST2).

[0038] Each of the 16 kinds of 2×2 patterns thus created is then matched against the white-framed pattern, and the frequency of occurrence (occurrence rate) of each 2×2 pattern in the white-framed pattern is counted (step ST3).
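A minimal Python sketch of the counting in step ST3 follows. It rests on assumptions that the text does not spell out: the 2×2 window is slid over every position of the white-framed pattern with a stride of one dot (overlapping windows), and each window is encoded as a 4-bit number in the order top-left, top-right, bottom-left, bottom-right, with 1 meaning black. The name `count_2x2_patterns` and the encoding are hypothetical.

```python
from collections import Counter


def count_2x2_patterns(framed):
    """Count how often each of the 16 possible 2x2 patterns occurs.

    framed: white-framed binary pattern (list of rows, 1 = black, 0 = white).
    Returns a Counter keyed by a 4-bit code: bit 3 = top-left, bit 2 = top-right,
    bit 1 = bottom-left, bit 0 = bottom-right (an assumed encoding).
    """
    counts = Counter()
    for y in range(len(framed) - 1):
        for x in range(len(framed[0]) - 1):
            code = (
                (framed[y][x] << 3)
                | (framed[y][x + 1] << 2)
                | (framed[y + 1][x] << 1)
                | framed[y + 1][x + 1]
            )
            counts[code] += 1
    return counts


# A single black dot with its one-dot white frame.
dot = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
dot_counts = count_2x2_patterns(dot)
```

For the single framed dot, each of the four windows contains exactly one black pixel in a different corner, so four different one-black patterns are counted once each.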

[0039] The 16 kinds of 2×2 patterns can be classified into three groups: components that can form part of a straight line (linear components), components that cannot form part of a straight line (non-linear components), and components that have no bearing on the formation of a straight line (invalid components).

[0040] FIG. 6 shows an example of the 16 kinds of 2×2 patterns classified into the three groups. FIG. 6(a) shows the two patterns in which all four pixels are white or all four are black; these are invalid components, which have no bearing on the formation of a straight line.

[0041] FIG. 6(b) shows the four patterns in which one of the four pixels is black (and the other three are white); these are non-linear components, which cannot form part of a straight line. FIG. 6(c) shows the two patterns in which the two pixels on one diagonal are white and the two on the other diagonal are black; these too are non-linear components.

[0042] FIG. 6(d) shows the four patterns in which three of the four pixels are black (and the remaining one is white); these are non-linear components, which cannot form part of a straight line. FIG. 6(e) shows the four patterns in which the upper two, lower two, left two, or right two pixels are black and the other two are white; these are linear components, which can form part of a straight line.

[0043] That is, the 16 kinds of 2×2 patterns obtainable by combining four white or black pixels are classified into the invalid components shown in FIG. 6(a) (the first group), the non-linear components shown in FIG. 6(b), (c), and (d) (the second group), and the linear components shown in FIG. 6(e) (the third group).
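The three-way classification of FIG. 6 can be checked mechanically. The sketch below assumes a 4-bit encoding of each 2×2 pattern (bit 3 = top-left, bit 2 = top-right, bit 1 = bottom-left, bit 0 = bottom-right, 1 = black); that encoding and the function name `classify_pattern` are illustrative choices, not part of the patent.

```python
def classify_pattern(code):
    """Classify a 2x2 pattern as 'invalid', 'linear', or 'nonlinear'.

    code: 4-bit integer, bit 3 = top-left, bit 2 = top-right,
          bit 1 = bottom-left, bit 0 = bottom-right, 1 = black (assumed encoding).
    """
    black = bin(code).count("1")
    if black in (0, 4):
        return "invalid"       # FIG. 6(a): all white or all black
    if black == 2 and code not in (0b1001, 0b0110):
        return "linear"        # FIG. 6(e): two adjacent pixels (a row or a column)
    return "nonlinear"         # FIG. 6(b), (c), (d): one black, diagonal pair, three black


groups = {}
for code in range(16):
    groups.setdefault(classify_pattern(code), []).append(code)
```

Enumerating all 16 codes reproduces the group sizes given in the text: 2 invalid patterns, 4 linear patterns, and 4 + 2 + 4 = 10 non-linear patterns.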

[0044] The recognition circuit 54 then tallies the frequency of occurrence of each 2×2 pattern in the white-framed pattern per component group according to the above classification, and discriminates between handwritten and printed characters from the ratio of linear components to non-linear components (the proportion of linear components) (step ST4).

[0045] For example, the character is judged to be a printed character when the linear components, relative to the non-linear components, are at or above a certain reference value (threshold), and a handwritten character when they are at or below it.
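The decision rule of step ST4 might be sketched as below. The patent gives no numerical threshold and does not say how a ratio exactly equal to the reference value is handled, so the `threshold` parameter, the choice to treat equality as printed type, and the handling of a zero non-linear count are all assumptions of this sketch.

```python
def discriminate(linear, nonlinear, threshold):
    """Judge 'printed' or 'handwritten' from linear and non-linear counts.

    threshold: reference value for the linear / non-linear ratio. The patent
    does not state a value; it must be chosen by the implementer.
    """
    if nonlinear == 0:
        # Assumed convention: no non-linear windows at all means a perfectly
        # straight outline if any linear windows exist, otherwise no evidence.
        ratio = float("inf") if linear > 0 else 0.0
    else:
        ratio = linear / nonlinear
    # Assumed tie-break: a ratio equal to the threshold counts as printed type.
    return "printed" if ratio >= threshold else "handwritten"
```

For instance, with a hypothetical threshold of 2.0, `discriminate(50, 10, 2.0)` gives `"printed"` and `discriminate(10, 10, 2.0)` gives `"handwritten"`.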

[0046] A concrete example of character type discrimination is now described, taking a printed "口" and a handwritten "ロ" as examples. FIG. 7 shows the white-framed patterns formed from the printed "口" and the handwritten "ロ", and FIG. 8 shows the frequency of occurrence of each 2×2 pattern (linear and non-linear components) for the printed "口" and the handwritten "ロ".

[0047] When the frequency of occurrence of each 2×2 pattern is examined for the linear and non-linear components in the respective white-framed patterns of the printed "口" and the handwritten "ロ", the ratio of linear to non-linear components for the printed "口" in this case is 40:8 (= 5:1).

[0048] For the handwritten "ロ", on the other hand, the ratio of linear to non-linear components is 30:28 (= 15:14). This shows that the printed "口" has a larger proportion of linear components relative to non-linear components (in other words, a smaller proportion of non-linear components) than the handwritten "ロ".
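The arithmetic of this worked example can be verified with exact fractions. The counts 40:8 and 30:28 are the ones quoted from FIG. 8; the variable names are illustrative, and the "share" figures are an additional way of expressing the same comparison rather than something the patent computes.

```python
from fractions import Fraction

# Linear : non-linear counts quoted in the text for FIG. 8.
printed_ratio = Fraction(40, 8)        # printed kanji "口"
handwritten_ratio = Fraction(30, 28)   # handwritten katakana "ロ"

# Share of linear windows among all valid (linear + non-linear) windows.
printed_share = Fraction(40, 40 + 8)
handwritten_share = Fraction(30, 30 + 28)

print(printed_ratio, handwritten_ratio)                  # 5 15/14
print(float(printed_share), float(handwritten_share))
```

The two ratios reduce to 5:1 and 15:14 as stated, so any reference value strictly between 15/14 (about 1.07) and 5 would separate these two samples. The patent does not say which value is actually used.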

[0049] In this way, a character can be judged to be printed type if the linear components, relative to the non-linear components, are at or above a certain reference value, and handwritten if they are at or below it. The character type could be discriminated from this judgment alone, but in this embodiment, to ensure accuracy, the size of each piece of character information cut out by the detection and cut-out circuit 52 and supplied via the normalization circuit 53, and the pitch between pieces of character information, are also detected, and these detection results are taken into account.

[0050] For example, when the size and pitch of the pieces of character information are constant, the characters are judged to be printed; when they are not constant, they are judged to be handwritten. In addition, when the character type is discriminated by comparing the linear components with a reference value, performing this processing on several of the pieces of character information cut out by the detection and cut-out circuit 52, or on all of them, secures the level of reliability needed to satisfy the required reading accuracy.
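The text does not say how results from several characters are combined, nor how "constant" size and pitch are measured. The sketch below shows one possible reading, with everything beyond the text labeled as an assumption: linear and non-linear counts are pooled across characters before the ratio is taken, and uniformity is tested with a coefficient of variation against a hypothetical tolerance. Both function names are illustrative.

```python
import statistics


def pooled_linear_ratio(per_character_counts):
    """Pool (linear, nonlinear) counts over several characters and return the ratio.

    Pooling is an assumed combination rule; the patent only says the
    comparison may be applied to several or all characters.
    """
    linear = sum(l for l, _ in per_character_counts)
    nonlinear = sum(n for _, n in per_character_counts)
    if nonlinear == 0:
        return float("inf") if linear > 0 else 0.0
    return linear / nonlinear


def is_uniform(values, tolerance):
    """Return True if sizes or pitches vary little relative to their mean.

    tolerance: hypothetical upper bound on the coefficient of variation.
    """
    mean = statistics.mean(values)
    return statistics.pstdev(values) / mean <= tolerance


ratio = pooled_linear_ratio([(40, 8), (30, 28)])
```

Pooling counts lets characters with long outlines weigh more than short ones; a per-character majority vote would be an equally plausible alternative.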

[0051] Once the character type of the character information corresponding to the address information written on the mail P has been discriminated in this way, the appropriate dictionary within the dictionary 55 is selected according to the result.

[0052] That is, when the characters are judged to be handwritten, the handwritten character dictionary 55a is selected (step ST5a); when they are judged to be printed, the printed character dictionary 55b is selected (step ST5b).

[0053] Character recognition by the multiple similarity method described above is then performed using the selected dictionary 55a or 55b (step ST6). As a result, because handwritten and printed characters can be discriminated more accurately, the character information corresponding to the address information can be recognized correctly.
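Steps ST1 through ST5 can be strung together in a compact, self-contained Python sketch. It is a sketch under assumptions, not the patented implementation: patterns are lists of rows with 1 = black, windows overlap with a stride of one dot, the threshold value is hypothetical, and the dictionaries are represented only by their reference numerals 55a and 55b. Recognition itself (step ST6) is not shown.

```python
def frame(pattern):
    """ST1: add a one-dot white border around a binary pattern."""
    width = len(pattern[0]) + 2
    return [[0] * width] + [[0, *row, 0] for row in pattern] + [[0] * width]


def group(tl, tr, bl, br):
    """ST2 and classification: name the group of one 2x2 window."""
    black = tl + tr + bl + br
    if black in (0, 4):
        return "invalid"
    if black == 2 and tl != br:  # the two black pixels share a row or a column
        return "linear"
    return "nonlinear"


def linear_ratio(pattern):
    """ST3 and ST4: tally windows and return linear / non-linear."""
    g = frame(pattern)
    tally = {"linear": 0, "nonlinear": 0, "invalid": 0}
    for y in range(len(g) - 1):
        for x in range(len(g[0]) - 1):
            tally[group(g[y][x], g[y][x + 1], g[y + 1][x], g[y + 1][x + 1])] += 1
    if tally["nonlinear"] == 0:
        return float("inf") if tally["linear"] else 0.0
    return tally["linear"] / tally["nonlinear"]


DICTIONARIES = {"handwritten": "55a", "printed": "55b"}


def select_dictionary(pattern, threshold):
    """ST5a / ST5b: pick the dictionary; threshold is a hypothetical value."""
    kind = "printed" if linear_ratio(pattern) >= threshold else "handwritten"
    return kind, DICTIONARIES[kind]


# Synthetic test shapes, not taken from the patent figures.
BOX = [[1, 1, 1, 1, 1], [1, 0, 0, 0, 1], [1, 0, 0, 0, 1], [1, 0, 0, 0, 1], [1, 1, 1, 1, 1]]
DIAGONAL = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
```

The hollow square yields 24 linear and 8 non-linear windows (ratio 3.0), the eight non-linear windows being its four outer and four inner corners, while the one-dot diagonal yields no linear windows at all. With a hypothetical threshold of 2.0 the square is routed to dictionary 55b and the diagonal to dictionary 55a.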

[0054] As described above, handwritten and printed type can be discriminated regardless of the number of characters. That is, the discrimination exploits the fact that the outline of a printed character contains more linear components than that of a handwritten character. This allows more accurate discrimination than relying on character size and pitch alone, so that an appropriate dictionary for the character type can be used in the subsequent recognition processing. It therefore reduces the cases in which a character cannot be recognized correctly merely because its character type was misjudged, makes the recognition processing both more accurate and faster, and ultimately raises the accuracy of address reading, thereby improving the performance of the automatic mail reading and sorting machine as well.

[0055] The above embodiment has been described for the case in which the multiple similarity method is used for character recognition, but the invention is not limited to this and can equally be applied to various reading devices that use, for example, a pattern matching method or the simple similarity method. Various other modifications are of course possible without departing from the gist of the invention.

[0056]

[Effects of the Invention] As described above in detail, according to the present invention, the type of character to be read can be discriminated more accurately. It is thus possible to provide a character type discriminating apparatus and discriminating method suitable for use in character reading devices and the like that perform recognition processing in automated equipment such as automatic mail reading/sorting machines.

[Brief description of drawings]

FIG. 1 is a configuration diagram showing an outline of an automatic mail reading/sorting machine according to an embodiment of the present invention.

FIG. 2 is a block diagram showing the schematic configuration of the reading unit of the same automatic mail reading/sorting machine.

FIG. 3 is a diagram showing an example of the character types used to write address information on mail items in the same embodiment.

FIG. 4 is a flowchart explaining the flow of the character type discrimination processing in the same embodiment.

FIG. 5 is a diagram showing an example of a pattern with a white frame in the same embodiment.

FIG. 6 is a diagram showing the classification of the 2 × 2 patterns in the same embodiment.

FIG. 7 is a pattern diagram explaining the character type discrimination processing by way of a specific example.

FIG. 8 is a frequency distribution diagram explaining the character type discrimination processing by way of a specific example.

[Explanation of symbols]

14 ... reading unit, 14a ... photoelectric conversion unit, 14b ... identification unit, 42 ... address area detection unit, 43 ... character recognition unit, 54 ... recognition circuit, 55 ... dictionary, 55a ... dictionary for handwritten characters, 55b ... dictionary for printed characters, P ... mail item.

Claims

[Claims]

1. A character type discriminating apparatus for discriminating whether a character to be read is handwritten or printed, comprising: detecting means for detecting a linear component and a non-linear component of the character to be read; and discriminating means for discriminating the character type of the character based on the ratio between the linear component and the non-linear component detected by the detecting means.

2. A character type discriminating apparatus for discriminating whether a character to be read is handwritten or printed, comprising: forming means for forming a pattern to be collated by attaching a frame of one dot of white pixels around the binary dot pattern of the character to be read; mask means for applying, to the pattern to be collated formed by the forming means, 16 kinds of mask patterns each consisting of four pixels of 2 × 2 dots and made up of combinations of white pixels and black pixels; counting means for counting, for each mask pattern applied by the mask means, its occurrence frequency in the pattern to be collated; and discriminating means for discriminating whether the character is handwritten or printed based on the occurrence frequency of each mask pattern counted by the counting means.

3. The character type discriminating apparatus according to claim 2, wherein the discriminating means makes its determination taking into account also the size and pitch of each character to be read.

4. A character type discriminating method for discriminating whether a character to be read is handwritten or printed, wherein a linear component and a non-linear component of the character to be read are detected, and the character type of the character is discriminated based on the ratio between the detected linear component and non-linear component.

5. A character type discriminating method for discriminating whether a character to be read is handwritten or printed, wherein a pattern to be collated is formed by attaching a frame of one dot of white pixels around the binary dot pattern of the character to be read; 16 kinds of mask patterns, each consisting of four pixels of 2 × 2 dots and made up of combinations of white pixels and black pixels, are applied to the formed pattern to be collated; the occurrence frequency in the pattern to be collated is counted for each applied mask pattern; and whether the character is handwritten or printed is discriminated based on the counted occurrence frequency of each mask pattern.

6. The character type determination method according to claim 5, wherein the determination is performed in consideration of the size and pitch of each character to be read.