JP5003051B2 - Automatic mail sorting machine and automatic mail sorting method

Info

Publication number: JP5003051B2
Application number: JP2006209364A
Authority: JP
Inventors: 寛光森; 克彦近藤
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2006-08-01
Filing date: 2006-08-01
Publication date: 2012-08-15
Anticipated expiration: 2026-08-01
Also published as: JP2008033851A

Description

The present invention relates to an automatic mail sorting machine and an automatic mail sorting method that collect images of mail pieces to be sorted, read addresses from the collected mail images, and automatically sort the mail pieces based on classification specifying information derived from the read addresses. In particular, it relates to an automatic mail sorting machine and an automatic mail sorting method that read addresses in parallel from a single mail image using a plurality of address information reading units with different recognition algorithms.

An automatic mail sorting machine is known that reads an address (postal code, prefecture, city or town name, chome, house number, company name, addressee name, and so on) from a mail piece to be sorted and automatically sorts the mail piece based on classification specifying information (for example, a classification code) derived from the read address. A machine of this type usually collects images of the mail pieces to be sorted in the mail sorting machine main unit and sends the collected mail images to an address information reading unit (OCR: Optical Character Reader), where the address is read.

There are also automatic mail sorting machines that, when the address information reading unit fails to read the address, send the mail image to an operator input unit and ask the operator to enter the correct value. This operator-assisted input function is generally called a video coding desk. The operator views the address (postal code or address character string) in the mail image shown on a display and keys it in, and the classification specifying information is derived from that input.

In recent years, various recognition algorithms have been developed in the field of character recognition, and recognition rates have improved. However, the characters written as addresses on mail include both handwritten and printed characters in a wide variety of styles, so a single recognition algorithm cannot handle them all, and recognition performance has been limited.

It has therefore been proposed to read addresses in parallel from a single mail image using a plurality of address information reading units with different recognition algorithms. For example, in the character recognition device described in Patent Document 1, a first recognition unit collates a character pattern obtained from a mail image against reference patterns by composite similarity extraction and obtains a first character candidate as the recognition result for the read character. A second recognition unit collates the shape of the character pattern obtained by reading against reference shape data by contour feature matching and obtains a second character candidate as the recognition result for the read character. These recognition results are applied to a matrix table to read out a character candidate selection index, and one of the two recognition results is selected according to that index.

Patent Document 1: JP 7-271899 A

However, the character recognition device described in Patent Document 1 has several problems.

The first problem is that defects or bias in the recognition results cannot easily be corrected. The discrimination table (matrix table) used to select a recognition result is fixed data, so one of the two recognition results is always selected deterministically. Such defects and bias could be corrected by updating the discrimination table, but the device of Patent Document 1 makes no provision for the user to update it.

The second problem is that the discrimination table is large and takes time to consult. This is because the discrimination table of the device in Patent Document 1 is a matrix table that defines a selection index for each character candidate. Specifically, the similarity confidence values that serve as recognition elements for the first character candidate in the first recognition unit are arranged in the X direction for each character candidate, the shape features that serve as recognition elements for the second character candidate in the second recognition unit are arranged in the Y direction for each character candidate, and a character candidate selection index based on the comparison of the two arrangements is registered in advance in each cell. The resulting table is enormous.

The present invention has been made in view of the above circumstances. Its object is to provide an automatic mail sorting machine and an automatic mail sorting method that read addresses in parallel from a single mail image using a plurality of address information reading units with different recognition algorithms, and that accumulate the operator's input information and the feature amounts from each address information reading unit in a database and update the discrimination table based on the accumulated information, thereby autonomously correcting defects and bias in the recognition results and improving the sorting completion rate of mail.

To achieve the above object, the automatic mail sorting machine of the present invention collects images of mail pieces to be sorted, reads addresses from the collected mail images, and automatically sorts the mail pieces based on classification specifying information derived from the read addresses. It comprises: a parallel reading unit that reads addresses in parallel from a single mail image using a plurality of address information reading units with different recognition algorithms; a recognition result integration unit that, referring to a discrimination table, compares the recognition results and feature amounts from the address information reading units and derives the classification specifying information considered to be correct; an operator input unit that, when that derivation fails, asks an operator to enter the correct address information reading unit or the correct classification specifying information; a database that accumulates the operator's input information and the feature amounts from each address information reading unit; and a learning unit that updates the discrimination table based on the information accumulated in the database.

With this configuration, an automatic mail sorting machine that reads addresses in parallel from a single mail image using a plurality of address information reading units with different recognition algorithms can improve its sorting completion rate. This is because the operator's input information and the feature amounts from each address information reading unit are accumulated in the database and the discrimination table is updated based on the accumulated information, so that defects and bias in the recognition results are corrected autonomously.

In the automatic mail sorting machine of the present invention, the database may accumulate correct address information reading unit information, which is the operator's input information, together with address area detection information and likelihood, which are the feature amounts from each address information reading unit. The learning unit may then, based on the data accumulated in the database, learn a prototype for each class, treating the correct address information reading unit as the class of a feature space and the address area detection information and likelihood as the feature vector of that space, and copy the prototypes to the discrimination table.

With this configuration, the discrimination table is smaller and can also be consulted more quickly. This is because the data written to the discrimination table are not matrix data defining a selection index for each character candidate but prototype feature data in which each address information reading unit is a class of the feature space.
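As a rough illustration of the prototype learning described above, the following sketch computes one prototype per class from accumulated records. The description does not say how a prototype is learned, so taking the per-class mean of the feature vectors (a nearest-centroid scheme) is an assumption made here, and the function name and record layout are hypothetical.

```python
def learn_prototypes(records):
    """Learn one prototype per class from accumulated database records.

    records: iterable of (correct_ocr_number, feature_vector) pairs, where
    the feature vector holds the address area detection information and
    likelihood values from the reading units (layout is hypothetical).

    ASSUMPTION: the prototype of a class is the mean of its feature
    vectors. The source only says that a prototype is "learned".
    """
    sums = {}
    counts = {}
    for ocr_number, features in records:
        if ocr_number not in sums:
            sums[ocr_number] = [0.0] * len(features)
            counts[ocr_number] = 0
        for i, value in enumerate(features):
            sums[ocr_number][i] += value
        counts[ocr_number] += 1
    # The returned mapping plays the role of the discrimination table:
    # one short vector per reading unit rather than a per-character matrix.
    return {
        ocr_number: [total / counts[ocr_number] for total in totals]
        for ocr_number, totals in sums.items()
    }
```

Under this assumption the table holds one vector per address information reading unit, which is consistent with the size and lookup-time advantage claimed in the text.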

In the automatic mail sorting machine of the present invention, the recognition result integration unit may take the feature amounts from each address information reading unit as an input vector, compute the Euclidean distance to each prototype defined in the discrimination table, and treat the recognition result of the address information reading unit with the smallest Euclidean distance as correct.

This captures the characteristics of each address information reading unit (recognition algorithm) and selects the recognition result of the unit best suited to the characters being recognized. It also easily accommodates an increase or decrease in the number of address information reading units.
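The minimum-distance selection described above might look like the following minimal sketch. The function name and the dictionary representation of the discrimination table (reading unit number mapped to prototype vector) are assumptions for illustration, not details given in the description.

```python
import math


def select_reading_unit(input_vector, discrimination_table):
    """Pick the reading unit whose prototype is nearest to the input vector.

    input_vector: feature amounts from the reading units for one mail image.
    discrimination_table: {ocr_number: prototype_vector} (hypothetical layout).
    Returns (best_ocr_number, distances), where distances maps each
    ocr_number to its Euclidean distance from the input vector.
    """
    distances = {
        ocr_number: math.dist(input_vector, prototype)
        for ocr_number, prototype in discrimination_table.items()
    }
    best_ocr_number = min(distances, key=distances.get)
    return best_ocr_number, distances
```

Because the table is keyed by reading unit, adding or removing a unit in this sketch only adds or removes one entry, in line with the remark about changing the number of units.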

In the automatic mail sorting machine of the present invention, the recognition result integration unit may compare the Euclidean distance for each address information reading unit with a reject threshold and, when every Euclidean distance is larger than the reject threshold, treat the recognition results of all address information reading units as incorrect.

This avoids deriving erroneous classification specifying information, and the operator can be asked to enter the correct address information reading unit or the correct value.
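The reject rule can be sketched as a small check on the per-unit distances. The function name, the use of `None` to signal a reject, and the choice of the nearest unit when at least one distance passes are assumptions for illustration.

```python
def accept_or_reject(distances, reject_threshold):
    """Apply the reject threshold to the per-unit Euclidean distances.

    distances: {ocr_number: distance} (hypothetical layout).
    Returns None (reject, hand over to the operator) when every distance
    is larger than the threshold; otherwise returns the nearest unit.
    """
    if all(d > reject_threshold for d in distances.values()):
        return None
    return min(distances, key=distances.get)
```

How the threshold value is chosen is not stated in the description; in this sketch it is simply a parameter.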

In the automatic mail sorting machine of the present invention, the recognition result integration unit may compare the recognition results from the address information reading units and, when all the recognition results are identical, treat the recognition results of all address information reading units as correct.

This omits the lookup of the discrimination table and the Euclidean distance calculation, and so speeds up the derivation of the classification specifying information.
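A minimal sketch of this agreement shortcut is shown below. The function name and the dictionary of per-unit classification codes are hypothetical; the point is only that agreement is tested before any table lookup or distance calculation takes place.

```python
def unanimous_code(recognition_results):
    """Return the shared classification code if all reading units agree.

    recognition_results: {ocr_number: classification_code} (hypothetical).
    Returns the common code when every unit produced the same result, or
    None when they differ (or when there are no results), in which case
    the caller would fall back to the distance-based comparison.
    """
    codes = set(recognition_results.values())
    if len(codes) == 1:
        return codes.pop()
    return None
```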

The automatic mail sorting method of the present invention collects images of mail pieces to be sorted, reads addresses from the collected mail images, and automatically sorts the mail pieces based on classification specifying information derived from the read addresses. The method reads addresses in parallel from a single mail image using a plurality of address information reading units with different recognition algorithms; compares, while referring to a discrimination table, the recognition results and feature amounts from the address information reading units to derive the classification specifying information considered to be correct; asks an operator, when that derivation fails, to enter the correct address information reading unit or the correct classification specifying information; accumulates the operator's input information and the feature amounts from each address information reading unit in a database; and updates the discrimination table based on the information accumulated in the database.

With this method, when addresses are read in parallel from a single mail image using a plurality of address information reading units with different recognition algorithms, the operator's input information and the feature amounts of each address information reading unit are accumulated in the database and the discrimination table is updated based on the accumulated information. Defects and bias in the recognition results are thereby corrected autonomously, and the sorting completion rate of mail is improved.

In the automatic mail sorting method of the present invention, the database may accumulate correct address information reading unit information, which is the operator's input information, together with address area detection information and likelihood, which are the feature amounts from each address information reading unit. Based on the accumulated data, a prototype may be learned for each class, treating the correct address information reading unit as the class of a feature space and the address area detection information and likelihood as the feature vector of that space, and the prototypes may be copied to the discrimination table.

Compared with using matrix data that define a selection index for each character candidate, this makes the discrimination table smaller and also shortens the time needed to consult it.

In the automatic mail sorting method of the present invention, the feature amounts from each address information reading unit may be taken as an input vector, the Euclidean distance to each prototype defined in the discrimination table may be computed, and the recognition result of the address information reading unit with the smallest Euclidean distance may be treated as correct.

This captures the characteristics of each address information reading unit (recognition algorithm) and selects the recognition result of the unit best suited to the characters being recognized. It also easily accommodates an increase or decrease in the number of address information reading units.

In the automatic mail sorting method of the present invention, the Euclidean distance for each address information reading unit may be compared with a reject threshold and, when every Euclidean distance is larger than the reject threshold, the recognition results of all address information reading units may be treated as incorrect.

This avoids deriving erroneous classification specifying information, and the operator can be asked to enter the correct address information reading unit or the correct value.

In the automatic mail sorting method of the present invention, the recognition results from the address information reading units may be compared and, when all the recognition results are identical, the recognition results of all address information reading units may be treated as correct.

This omits the lookup of the discrimination table and the Euclidean distance calculation, and so speeds up the derivation of the classification specifying information.

As described above, according to the present invention, addresses are read in parallel from a single mail image using a plurality of address information reading units with different recognition algorithms, while the operator's input information and the feature amounts of each address information reading unit are accumulated in a database and the discrimination table is updated based on the accumulated information. Defects and bias in the recognition results are thereby corrected autonomously, and the sorting completion rate of mail is improved.

Embodiments of the present invention are described below with reference to the drawings. In the drawings, an address information reading unit is denoted OCR and the database is denoted DB where appropriate.

[Automatic mail sorting machine]

FIG. 1 is a block diagram showing the configuration of an automatic mail sorting machine according to an embodiment of the present invention.

The automatic mail sorting machine shown in this figure collects images of mail pieces to be sorted, reads addresses from the collected mail images, and automatically sorts the mail pieces based on classification specifying information derived from the read addresses. It comprises a mail sorting machine main unit 1, a parallel reading unit 2, a recognition result integration unit 3, a recognition result output unit 4, an operator input unit 5, a database 6, and a learning unit 7.

The mail sorting machine main unit 1 collects images of the mail pieces to be sorted using a scanner or the like, sends the collected mail images to the parallel reading unit 2, and sorts the mail pieces based on the classification specifying information returned from the recognition result output unit 4. The classification specifying information is not particularly limited as long as it can identify the sorting destination of a mail piece. In this embodiment, a predetermined classification code is used.

The parallel reading unit 2 reads addresses in parallel from a single mail image using a plurality of address information reading units 2a and 2b with different recognition algorithms. When reading of one mail image is finished, the parallel reading unit 2 outputs the recognition results, the feature amounts, and the processed image. In this embodiment, a classification code is output as the recognition result, and ABF area information (Address Block Finding information: address area detection information) and a likelihood value are output as the feature amounts.

In this embodiment the address is read by two address information reading units 2a and 2b, but it may be read by three or more address information reading units.

The recognition result integration unit 3, referring to the discrimination table, compares the recognition results and feature amounts from the address information reading units 2a and 2b and derives the classification code considered to be correct.

When the recognition result integration unit 3 succeeds in deriving the classification code considered to be correct, the recognition result output unit 4 returns that classification code to the mail sorting machine main unit 1. When the derivation fails (hereinafter referred to as a reject where appropriate), the recognition result output unit 4 sends the recognition results, feature amounts, and processed images of the address information reading units 2a and 2b to the operator input unit 5.

The operator input unit 5 displays the recognition results and processed images of the address information reading units 2a and 2b and asks the operator to enter either the correct address information reading unit (correct OCR number) or the correct value (postal code or address character string). When the correct address information reading unit is entered, the classification code that is the recognition result of that unit is sent to the mail sorting machine main unit 1. When the correct value is entered, it is converted to a classification code and sent to the mail sorting machine main unit 1.
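The handling of the two kinds of operator input can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name, the tagged-tuple input format, and the `code_lookup` mapping from a keyed-in postal code or address string to a classification code are all hypothetical.

```python
def resolve_operator_input(operator_input, recognition_results, code_lookup):
    """Turn the operator's entry into a classification code.

    operator_input: ("ocr_number", n) when the operator names the reading
        unit that was correct, or ("correct_value", text) when the operator
        keys in the postal code or address string (hypothetical format).
    recognition_results: {ocr_number: classification_code}.
    code_lookup: {postal_code_or_address: classification_code}; a stand-in
        for whatever conversion the real machine performs.
    """
    kind, value = operator_input
    if kind == "ocr_number":
        # Reuse the code already produced by the unit the operator chose.
        return recognition_results[value]
    if kind == "correct_value":
        # Convert the keyed-in value to a classification code.
        return code_lookup[value]
    raise ValueError(f"unknown operator input kind: {kind!r}")
```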

The database 6 accumulates the operator's input information and the feature amounts from the address information reading units 2a and 2b. For example, it accumulates the correct OCR number as the operator's input information and the ABF area information and likelihood as the feature amounts from the address information reading units 2a and 2b.

The learning unit 7 learns from the information accumulated in the database 6 and updates the discrimination table. For example, based on the data accumulated in the database 6, it learns a prototype for each class, treating the correct OCR number as the class of a feature space and the ABF area information and likelihood as the feature vector of that space, and copies the prototypes to the discrimination table.

With the automatic mail sorting machine configured in this way, when addresses are read in parallel from a single mail image using the plurality of address information reading units 2a and 2b with different recognition algorithms, defects and bias in the recognition results are corrected autonomously and the sorting completion rate of mail is improved.

Moreover, the data written to the discrimination table are not matrix data defining a selection index for each character candidate but prototype feature data in which each of the address information reading units 2a and 2b is a class of the feature space. The discrimination table is therefore smaller and can also be consulted more quickly.

The recognition result integration unit 3 can take the feature amounts from the address information reading units 2a and 2b as an input vector, compute the Euclidean distance to each prototype defined in the discrimination table, and treat the recognition result of the address information reading unit 2a or 2b with the smallest Euclidean distance as correct. This captures the characteristics of each of the address information reading units 2a and 2b and selects the recognition result of the unit best suited to the characters being recognized.

The recognition result integration unit 3 can also compare the Euclidean distance for each of the address information reading units 2a and 2b with a reject threshold and, when every Euclidean distance is larger than the reject threshold, treat the recognition results of all the address information reading units 2a and 2b as incorrect. This avoids deriving an erroneous classification code, and the operator input unit 5 can ask for the correct address information reading unit or the correct value to be entered.

The recognition result integration unit 3 can also compare the recognition results from the address information reading units 2a and 2b and, when all the recognition results are identical, treat the recognition results of all the address information reading units 2a and 2b as correct. This omits the lookup of the discrimination table and the Euclidean distance calculation, and so speeds up the derivation of the classification code.

Next, an automatic mail sorting method according to an embodiment of the present invention is described with reference to FIG. 2.

[Automatic mail sorting method]

FIG. 2 is a flowchart showing the sorting procedure of the automatic mail sorting machine according to the embodiment of the present invention.

As shown in this figure, when mail sorting starts, the mail sorting machine main unit 1 collects images of the supplied mail pieces with a scanner or the like and sends the images to the parallel reading unit 2 (S11).

The parallel reading unit 2 reads the address in the mail image sent from the mail sorting machine main unit 1 in parallel with the plurality of address information reading units 2a and 2b and derives classification codes (S12).

The recognition result integration unit 3 receives the recognition results, feature amounts, and processed images of the address information reading units 2a and 2b from the parallel reading unit 2 and, referring to the discrimination table, compares the recognition results and feature amounts of the address information reading units 2a and 2b to derive the classification code considered to be correct (S13).

The recognition result output unit 4 determines whether the recognition result integration unit 3 succeeded in deriving a classification code (S14). If the determination is YES, it sends the classification code to the mail sorting machine main unit 1 (S15). If the determination is NO, it sends the recognition results, feature amounts, and processed images to the operator input unit 5.

The operator input unit 5 displays the recognition results and processed images of the address information reading units 2a and 2b, and asks the operator to enter either the reading unit that produced the correct result (the correct OCR number) or the correct value (postal code or address character string) (S16).
When the operator's input is complete (S17), the classification code is sent to the mail sorting machine main unit 1 (S18), and the OCR number entered by the operator is stored in the database 6 together with the feature quantities of the reading units 2a and 2b, namely the ABF area information and the likelihood (S19).
When 10,000 records have accumulated in the database 6 (S20), the data are passed to the learning function to calculate the prototypes described above (S21), and the discrimination table is updated by copying these prototypes into it (S22).
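The control flow of steps S13 to S22 can be sketched as follows. This is only an illustration under stated assumptions: the names `Reading`, `sort_one`, `integrate`, `ask_operator`, and `retrain` are hypothetical, the operator is assumed to return both an OCR number and a classification code, and retraining is assumed to fire once, when the record count reaches the batch size.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Reading:
    """Output of one address information reading unit (hypothetical layout)."""
    code: Optional[str]   # classification code, None if nothing was read
    abf_area: int         # ABF area information (1..9)
    likelihood: float     # likelihood reported by the reading unit


def sort_one(
    readings: List[Reading],
    integrate: Callable[[List[Reading]], Optional[str]],
    ask_operator: Callable[[List[Reading]], Tuple[int, str]],
    database: list,
    retrain: Callable[[list], None],
    batch_size: int = 10_000,
) -> str:
    """Return the classification code for one mail item."""
    code = integrate(readings)                    # S13
    if code is not None:                          # S14: YES
        return code                               # S15
    ocr_number, code = ask_operator(readings)     # S16, S17
    database.append((ocr_number, readings))       # S19
    if len(database) == batch_size:               # S20 (assumed trigger)
        retrain(database)                         # S21, S22
    return code                                   # S18
```

Passing the integration, operator, and learning steps in as callables keeps the sketch independent of how each unit is actually implemented.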

Next, specific examples of the present invention will be described with reference to FIGS. 3 to 6.
FIG. 3 is an explanatory diagram showing an example of a screen displayed by the operator input unit of the automatic mail sorting machine according to Example 1 of the present invention. FIG. 4 is an explanatory diagram showing an example of the data stored in the database of the automatic mail sorting machine according to Example 1. FIG. 5 is an explanatory diagram showing an example of the prototypes set in the discrimination table of the automatic mail sorting machine according to Example 1. FIG. 6 is an explanatory diagram showing an example of the feature information sent from each address information reading unit of the automatic mail sorting machine according to Example 1.

In the mail sorting machine main unit 1, a 256-level grayscale image of each supplied mail item is captured by a scanner or the like and sent to the parallel reading unit 2. This example describes the case where two address information reading units 2a and 2b are connected to the parallel reading unit 2.

The parallel reading unit 2 reads the address in the mail image received from the mail sorting machine main unit 1 using the address information reading units 2a and 2b, which are connected in parallel. To read the address, the 256-level grayscale image is binarized with an appropriate threshold, and recognition is performed on the binarized image. The items to be read are the individual address levels (postal code, prefecture, city or town name, chome, house number, company name, addressee name, and so on). To derive a classification code as the final result, enough of these levels must be read to determine the classification code uniquely.
Each of the address information reading units 2a and 2b of the parallel reading unit 2 outputs a classification code, ABF coordinate information (x coordinate, y coordinate), a likelihood, and a processed image to the recognition result integration unit 3.
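As a minimal sketch of the binarization step, assuming a single global threshold and that dark pixels represent ink, the conversion could look like this. The function name `binarize` and the default threshold of 128 are illustrative choices; the text only says that an appropriate threshold is used.

```python
from typing import List


def binarize(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Convert a 256-level grayscale image (rows of 0..255) to a binary image.

    Pixels darker than the threshold become 1 (assumed to be ink),
    all other pixels become 0 (assumed to be background).
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]
```

Each reading unit may choose its own threshold, which is one reason the two units can produce different processed images from the same grayscale input.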

From the information received from the address information reading units 2a and 2b, the recognition result integration unit 3 determines which area the ABF coordinate information falls in when the mail image is divided into nine areas, and converts the ABF coordinate information into nine-division area information (ABF area information). It then compares the recognition results (classification codes) of the reading units 2a and 2b received from the parallel reading unit 2. If all the classification codes are identical, that classification code is output to the recognition result output unit 4. If the classification codes recognized by the reading units 2a and 2b differ, the mail item is judged a reject, and the ABF area information, likelihood, and processed images are sent to the recognition result output unit 4.
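The coordinate-to-area conversion and the agreement check might be sketched as below. The text does not say how the nine areas are laid out or numbered, so a 3 x 3 grid numbered 1 to 9 row by row is an assumption, as are the function names `abf_area` and `agreed_code`.

```python
from typing import List, Optional


def abf_area(x: float, y: float, width: float, height: float) -> int:
    """Map ABF coordinates to one of nine areas (assumed 3 x 3 grid, 1..9)."""
    col = min(int(3 * x / width), 2)    # clamp so x == width stays in column 2
    row = min(int(3 * y / height), 2)   # clamp so y == height stays in row 2
    return row * 3 + col + 1


def agreed_code(codes: List[Optional[str]]) -> Optional[str]:
    """Return the common classification code, or None to signal a reject."""
    if codes and all(c == codes[0] for c in codes):
        return codes[0]
    return None
```

Reducing the coordinates to a coarse area index keeps the feature vector small, at the cost of discarding the exact position of the address block.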

A correct classification code received by the recognition result output unit 4 is sent to the mail sorting machine main unit 1, where the supplied mail item is sorted into the designated sorting box based on that code, completing the sort. In the case of a reject, the ABF area information, likelihood, and processed images of the address information reading units 2a and 2b are sent to the operator input unit 5.

At the operator input unit 5, the operator visually checks the mail image. The operator input unit 5 has a display for showing mail images and a mouse and keyboard for entering information. FIG. 3 shows an example of the display at the operator input unit 5. On the left of the screen shown in FIG. 3 is an area 5a that displays the binarized image processed by one address information reading unit 2a, and on the right is an area 5b that displays the binarized image processed by the other address information reading unit 2b. Above these areas are a field 5c in which the operator enters the correct OCR number (1 or 2 in this example) and a field 5d in which the operator enters the correct value (classification code, postal code, address, or the like). If the reading units 2a and 2b have read some of the address levels, the classification code reading results are displayed in areas 5e and 5f, located below the areas 5a and 5b that display the processed images.
The operator then keys in the correct OCR number or the correct value. The resulting correct classification code is sent to the mail sorting machine main unit 1, and a mail item for which a correct value was keyed in is sorted based on that value, completing the sort.

The correct OCR number entered at the operator input unit 5 is sent to the database 6 and stored there together with the ABF area information and likelihood of the address information reading units 2a and 2b, as shown in FIG. 4.
Once data for 10,000 mail items have accumulated in the database 6, the data are passed to the learning function of the learning unit 7.

In this example, as shown in FIG. 4, each of the two address information reading units 2a and 2b provides two feature quantities, the ABF area information and the likelihood, so the characteristics of the reading unit selected at that time can be represented by a four-dimensional feature vector. The feature space spanned by this feature vector contains two classes, one for each of the reading units 2a and 2b, and a prototype is set as the representative pattern of each class. The prototypes can be obtained easily, for example with the widely known k-means method. Each feature quantity is standardized so that the pattern distribution in the feature space does not depend on the scale of the feature quantities. The resulting prototypes are shown in FIG. 5. If the classes cannot be linearly separated with one prototype per class, the number of prototypes is increased until the classes are separated.
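A minimal sketch of the learning step is given below. It assumes one prototype per class, computed as the mean of the standardized feature vectors of that class (which is what k-means yields with a single cluster per class); it is not the patent's learning function, whose details are not given. The feature order (area of 2a, likelihood of 2a, area of 2b, likelihood of 2b) and all function names are assumptions.

```python
from statistics import mean, pstdev
from typing import Dict, List, Sequence, Tuple


def fit_scaler(vectors: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Per-feature mean and standard deviation (1.0 for constant features)."""
    columns = list(zip(*vectors))
    means = [mean(c) for c in columns]
    stds = [pstdev(c) or 1.0 for c in columns]
    return means, stds


def standardize(v: Sequence[float], means: List[float], stds: List[float]) -> List[float]:
    """Scale each feature to zero mean and unit variance."""
    return [(x - m) / s for x, m, s in zip(v, means, stds)]


def learn_prototypes(records: Sequence[Tuple[int, Sequence[float]]]):
    """records: (correct OCR number, 4-dimensional feature vector) pairs.

    Returns ({ocr_number: prototype}, means, stds).
    """
    means, stds = fit_scaler([v for _, v in records])
    by_class: Dict[int, List[List[float]]] = {}
    for label, v in records:
        by_class.setdefault(label, []).append(standardize(v, means, stds))
    prototypes = {
        label: [mean(col) for col in zip(*vs)] for label, vs in by_class.items()
    }
    return prototypes, means, stds
```

The scaler parameters are returned along with the prototypes because input vectors must be standardized with the same means and deviations before they are compared with the prototypes.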

The prototypes obtained by learning are reflected in the discrimination table: the existing discrimination table of the recognition result integration unit 3 is updated by copying the prototypes into it.
After the update, the classification code is derived by comparing the output values of the address information reading units 2a and 2b with reference to the updated discrimination table. The table is consulted by standardizing the four-dimensional feature quantities from the reading units 2a and 2b, taking the standardized values as the input vector, and computing the Euclidean distance to the prototype of each reading unit 2a, 2b set in the discrimination table. For example, for the input vector shown in FIG. 6, the Euclidean distance to the prototype of the reading unit 2a is 2.24 and the Euclidean distance to the prototype of the reading unit 2b is 2.48. Because the distance for the reading unit 2a is the smallest, the recognition result of the reading unit 2a is output. A reject threshold is also set for the Euclidean distance, and if every Euclidean distance exceeds it, the mail item is rejected. It is also preferable to reject when the Euclidean distances of the reading units 2a and 2b are equal.
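The nearest-prototype decision with the two reject conditions can be sketched as follows. The function name `choose_reader`, the dictionary layout of the discrimination table, and the use of exact equality to detect a tie are assumptions made for illustration; the prototype values in the usage note are made up and are not those of FIG. 5 or FIG. 6.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def choose_reader(
    x: Sequence[float],
    prototypes: Dict[int, Sequence[float]],
    reject_threshold: float,
) -> Tuple[Optional[int], Dict[int, float]]:
    """Pick the reading unit whose prototype is nearest to input vector x.

    x must already be standardized. Returns (ocr_number, distances), with
    ocr_number set to None when the item should be rejected.
    """
    distances = {k: math.dist(x, p) for k, p in prototypes.items()}
    best = min(distances.values())
    if best > reject_threshold:
        return None, distances        # every distance exceeds the threshold
    winners = [k for k, d in distances.items() if d == best]
    if len(winners) > 1:
        return None, distances        # equal distances: reject
    return winners[0], distances
```

For instance, with hypothetical prototypes `{1: [1, 2, 0, 0], 2: [1, 2, 1, 0.5]}` and input `[0, 0, 0, 0]`, the distances are about 2.24 and 2.5, so reading unit 1 is selected for a threshold of 3.0 and the item is rejected for a threshold of 2.0.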

If the desired recognition accuracy is not obtained even with the updated discrimination table, the ABF area information and likelihood of the address information reading units 2a and 2b are accumulated further, learning is performed again, and the discrimination table is updated.

The present invention is applicable to an automatic mail sorting machine and an automatic mail sorting method that collect images of mail items to be sorted, read addresses from the collected mail images, and automatically sort the mail items based on classification specifying information derived from the read addresses. The present invention is particularly useful in an automatic mail sorting machine and an automatic mail sorting method that read addresses in parallel from one mail image using a plurality of address information reading units having different recognition algorithms.

FIG. 1 is a block diagram showing the configuration of the automatic mail sorting machine according to the embodiment of the present invention.
FIG. 2 is a flowchart showing the sorting procedure of the automatic mail sorting machine according to the embodiment of the present invention.
FIG. 3 is an explanatory diagram showing an example of a screen displayed by the operator input unit of the automatic mail sorting machine according to Example 1 of the present invention.
FIG. 4 is an explanatory diagram showing an example of the data stored in the database of the automatic mail sorting machine according to Example 1 of the present invention.
FIG. 5 is an explanatory diagram showing an example of the prototypes set in the discrimination table of the automatic mail sorting machine according to Example 1 of the present invention.
FIG. 6 is an explanatory diagram showing an example of the feature information sent from each address information reading unit of the automatic mail sorting machine according to Example 1 of the present invention.

Explanation of symbols

1 Mail sorting machine main unit
2 Parallel reading unit
2a Address information reading unit
2b Address information reading unit
3 Recognition result integration unit
4 Recognition result output unit
5 Operator input unit
6 Database
7 Learning unit

Claims

1. An automatic mail sorting machine that collects images of mail items to be sorted, reads addresses from the collected mail images, and automatically sorts the mail items based on classification specifying information derived from the read addresses, the machine comprising:
a parallel reading unit that reads addresses in parallel from one mail image using a plurality of address information reading units having different recognition algorithms;
a recognition result integration unit that, referring to a discrimination table, compares the recognition results and feature quantities from the address information reading units and derives the classification specifying information that appears to be correct;
an operator input unit that requests an operator to input the correct address information reading unit or a correct value when derivation of the classification specifying information that appears to be correct has failed;
a database that stores the information input by the operator and the feature quantities from each address information reading unit; and
a learning unit that updates the discrimination table based on the information accumulated in the database,
wherein the database stores correct address information reading unit information, which is the information input by the operator, and address area detection information and likelihood, which are the feature quantities from each address information reading unit, and
wherein the learning unit, based on the data accumulated in the database, learns a prototype of each class, with the correct address information reading unit as the class of a feature space and the address area detection information and likelihood as the feature vector of the feature space, and copies the prototype to the discrimination table.

2. The automatic mail sorting machine according to claim 1, wherein the recognition result integration unit takes the feature quantities from each address information reading unit as an input vector, obtains the Euclidean distance to each prototype defined in the discrimination table, and treats the recognition result of the address information reading unit with the smallest Euclidean distance as correct.

3. The automatic mail sorting machine according to claim 2, wherein the recognition result integration unit compares the Euclidean distance of each address information reading unit with a reject threshold and, if all the Euclidean distances are larger than the reject threshold, treats the recognition results of all the address information reading units as incorrect.

4. The automatic mail sorting machine according to any one of claims 1 to 3, wherein the recognition result integration unit compares the recognition results from the address information reading units and, if all the recognition results are identical, treats the recognition results of all the address information reading units as correct.

5. An automatic mail sorting method that collects images of mail items to be sorted, reads addresses from the collected mail images, and automatically sorts the mail items based on classification specifying information derived from the read addresses, the method comprising:
reading addresses in parallel from one mail image using a plurality of address information reading units having different recognition algorithms;
referring to a discrimination table, comparing the recognition results and feature quantities from the address information reading units and deriving the classification specifying information that appears to be correct;
requesting an operator to input the correct address information reading unit or a correct value when derivation of the classification specifying information that appears to be correct has failed;
accumulating in a database the information input by the operator and the feature quantities from each address information reading unit; and
updating the discrimination table based on the information accumulated in the database,
wherein the database stores correct address information reading unit information, which is the information input by the operator, and address area detection information and likelihood, which are the feature quantities from each address information reading unit, and
wherein, based on the data accumulated in the database, a prototype of each class is learned, with the correct address information reading unit as the class of a feature space and the address area detection information and likelihood as the feature vector of the feature space, and the prototype is copied to the discrimination table.

6. The automatic mail sorting method according to claim 5, wherein the feature quantities from each address information reading unit are taken as an input vector, the Euclidean distance to each prototype defined in the discrimination table is obtained, and the recognition result of the address information reading unit with the smallest Euclidean distance is treated as correct.

7. The automatic mail sorting method according to claim 6, wherein the Euclidean distance of each address information reading unit is compared with a reject threshold and, if all the Euclidean distances are larger than the reject threshold, the recognition results of all the address information reading units are treated as incorrect.

8. The automatic mail sorting method according to any one of claims 5 to 7, wherein the recognition results from the address information reading units are compared and, if all the recognition results are identical, the recognition results of all the address information reading units are treated as correct.