JP6474504B1

JP6474504B1 - Handwritten character recognition system

Info

Publication number: JP6474504B1
Application number: JP2018008538A
Authority: JP
Inventors: 雄太松本; 知太寺田
Original assignee: Nomura Research Institute Ltd
Current assignee: Nomura Research Institute Ltd
Priority date: 2018-01-23
Filing date: 2018-01-23
Publication date: 2019-02-27
Anticipated expiration: 2038-01-23
Also published as: JP2019128690A

Abstract

Problem to be solved: To improve on the recognition accuracy of conventional handwritten character recognition techniques by using a plurality of handwritten character recognition methods in combination.
Solution: A handwritten character recognition system that cuts the image to be recognized into line images, one per line, recognizes each line image with a plurality of character recognition methods, and determines the character recognition result by majority vote over the plurality of results.
Selected drawing: Figure 1

Description

The present invention relates to handwritten character recognition processing, and more particularly to a technique for preprocessing image data of handwritten characters within such processing.

OCR (optical character recognition) has long been widely used as a character recognition technique, that is, an information processing technique that converts an image of a printed document into a character code string. In addition, Patent Document 1 below discloses a technique that performs character recognition on a document image obtained by scanning a handwritten document, determines whether each recognized character is full-width or half-width, and outputs the result.

Patent Document 1: JP 2004-151836 A

Although various handwritten character recognition techniques have been used alongside OCR, their accuracy varies with the image being recognized, stable accuracy is difficult to obtain, and further improvement in accuracy is desired.

The present invention has been made in view of these problems, and its object is to provide a handwritten character recognition system whose recognition accuracy is improved by using various handwritten character recognition techniques in combination.

The handwritten character recognition system according to the present invention comprises: a line recognition unit that cuts the multiple lines of handwritten character strings in the image of the target document's image data into line units to produce line images; a first character recognition unit that recognizes each line image by a predetermined character recognition method; a plurality of second character recognition units that perform character recognition by methods different from that of the first character recognition unit; and a majority decision unit that obtains a character recognition result by majority vote from the result of the first character recognition unit and the results of the plurality of second character recognition units. The second character recognition units use character recognition methods that differ from one another.

According to the present invention, the accuracy of handwritten character recognition can be improved.

FIG. 1 is a system configuration diagram of the handwritten character recognition system according to the present invention.
FIG. 2 is an example of a document processed by the handwritten character recognition system according to the present invention.
FIG. 3 is a diagram explaining the first line recognition method of the handwritten character recognition device according to the present invention.
FIG. 4 is a diagram explaining the problems of the conventional line recognition method.
FIG. 5 is a diagram explaining the second line recognition method of the handwritten character recognition device according to the present invention.
FIG. 6 is a diagram explaining the second line recognition method of the handwritten character recognition device according to the present invention.
FIG. 7 is a diagram explaining the third line recognition method of the handwritten character recognition device according to the present invention.
FIG. 8 shows the data structure of the recognition results held by the handwritten character recognition device according to the present invention.
FIG. 9 is an operation flowchart of the handwritten character recognition system according to the present invention.

In the following, identical or equivalent components, members, and processes shown in the drawings are given the same reference numerals, and duplicate descriptions are omitted as appropriate. Members that are not important to the explanation are partly omitted from the drawings.

FIG. 1 is a schematic diagram showing the configuration of a handwritten character recognition system 1 according to the present embodiment. The handwritten character recognition system 1 comprises: a scanner 11 that reads a slip bearing handwritten characters as an image; a scanner management device 10 that receives and stores the image data read by the scanner 11 and, as necessary, transmits the stored image data to a predetermined device; a handwritten character recognition device 20 that receives image data from the scanner management device 10, executes each process to recognize the handwritten characters on the slip, and transmits the recognition result to a predetermined device; remote engine devices 31 ... 3N that, at the request of the handwritten character recognition device 20, recognize image data that has undergone predetermined processing, using handwritten character recognition methods different from that of the handwritten character recognition device 20, and return the recognition results to it; and a confirmation approver device 41 and a confirmation operator device 42 that display to a user the recognition result received from the handwritten character recognition device 20 together with the image data being recognized. Each device is connected to a network 50, and the devices can exchange data with one another according to predetermined permissions.

Although this embodiment uses the system hardware configuration of FIG. 1, various other hardware configurations are possible; for example, the scanner management device 10 and the handwritten character recognition device 20 may be implemented on the same computer. A slip is given here as one example of a document read by the scanner, but other documents such as application forms may be used.

In this embodiment the remote engine devices 31 ... 3N, which are hardware separate from the handwritten character recognition device 20, each use a different handwritten character recognition method. A plurality of different methods could instead be implemented in the handwritten character recognition device 20 itself. As a recent trend, however, it is difficult for a single system company to implement multiple methods while also pursuing cost reduction and quality improvement. Using the remote engine devices 31 ... 3N of service providers other than one's own company yields recognition results from methods that take a different perspective from the in-house method, and using these results in combination can be expected to improve recognition overall. Because the remote engine devices 31 ... 3N may be developed and operated by different companies, their input data formats may differ, so the handwritten character recognition device 20 converts the image data to match each specification before sending it, as necessary. For example, some service providers specify multi-line image data rather than line images, and conversely some specify character-level image data.

The network 50 includes a wired network, a wireless network, or a combination of the two, and may include the Internet, an intranet, a LAN, a WAN, WiFi, Bluetooth (registered trademark), a wireless telephone network, and the like.

The scanner 11 is an image scanner that captures image information. For example, it projects light onto the document to be imaged, detects the reflected light with an image sensor, and converts it into an electrical signal to produce image data as digital data. The scanner 11 is connected to the scanner management device 10 by a connection cable such as a USB cable, and the scanner management device 10 acquires the image data via that cable.

This embodiment illustrates a configuration with a separate scanner management device 10 and scanner 11. As another configuration, the system may use a multifunction peripheral (a printer MFP or digital MFP) that integrates the functions of the scanner management device 10 and the scanner 11. MFPs in widespread use today have an automatic document feeder (ADF) that feeds multiple sheets in succession and scans them continuously. As a further development, the functions of the handwritten character recognition device 20 may be implemented in the MFP. In that configuration, the confirmation operator reads a large number of slips using the ADF; the MFP obtains its own handwritten character recognition results and, by sending requests, also obtains the recognition results processed by the remote engine devices 31 ... 3N; the recognition result is determined by majority vote; and the confirmation operator accesses the MFP from the confirmation operator device 42 to check the determined result.

The scanner management device 10 adds image identification information that identifies the acquired image data and transmits the image data to the handwritten character recognition device 20. The scanner management device 10 has a function for trimming part of the image; when the range to be recognized by the handwritten character recognition device 20 is determined in advance, the image is trimmed to that range before transmission. In a document such as the slip of the target image shown in FIG. 2, the item to be written at each position is designated in advance by a frame, and trimming is performed with the frame lines as the reference. The items on the slip have field data attributes such as name and address. Which field data attribute each trimmed image has can be determined from its position on the slip, and this attribute can also be transmitted to the handwritten character recognition device 20. The trimming and the determination of field data attributes may instead be executed by the handwritten character recognition device 20.

The handwritten character recognition device 20 applies initial processing (skew correction, scaling, noise removal, brightness correction, removal of unnecessary portions, and so on) to the image of the received image data (the trimmed image, if trimming was performed), performs line recognition, recognizes the characters or character strings in each recognized line, and outputs the recognition result and its reliability. Various conventional techniques exist for each initial processing step, and they can be applied as is. Skew correction can align the writing direction of the characters with the longitudinal direction of the trimmed region, or simply align the image horizontally with the frame printed on the document. Scaling brings the characters to the standard character size used for handwritten character recognition. Noise removal eliminates noise physically present on the scanned document as well as noise introduced during scanning. Brightness correction uses a general-purpose technique optimized for handwritten character recognition. Removal of unnecessary portions eliminates objects that are not targets of handwritten character recognition, such as template characters printed on the form (for example, the honorific "様" after a name field).

Line recognition can be performed by several methods, which are described here. The first line recognition method scans the target image after initial processing (see FIG. 3(a)) from top to bottom, counts the black dots in each pixel row to generate a histogram, extracts the sections that contain no black dots, takes the midpoint of each such section as a line cut point (see FIG. 3(b)), and cuts the target image along the horizontal straight line passing through each cut point (see FIG. 3(c)). In the histogram of FIG. 3(b), the horizontal axis shows frequency (the number of black dots counted) and the vertical axis shows class (the vertical position in the image). Histograms usually put frequency on the vertical axis and class on the horizontal axis, but the axes are swapped here for convenience of explanation.
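As an illustration only, the first line recognition method might be sketched in Python as follows. The image is assumed to be a list of pixel rows with 1 for a black dot and 0 for white; the function names are hypothetical, and ignoring the blank margins above the first line and below the last line is an assumption not stated in the description.

```python
def row_histogram(image):
    """Count the black dots (value 1) in each pixel row."""
    return [sum(row) for row in image]


def cut_points(hist):
    """Return the midpoint of every blank run that lies between two text rows."""
    cuts = []
    run_start = None      # first row of the current blank run, if any
    seen_text = False     # becomes True once a row with black dots is found
    for y, count in enumerate(hist):
        if count == 0:
            if run_start is None:
                run_start = y
        else:
            if run_start is not None and seen_text:
                cuts.append((run_start + y - 1) // 2)
            run_start = None
            seen_text = True
    return cuts


def split_lines(image):
    """Cut the image along horizontal lines through each cut point."""
    bounds = [0] + cut_points(row_histogram(image)) + [len(image)]
    return [image[top:bottom] for top, bottom in zip(bounds, bounds[1:])]


image = [
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
]
line_images = split_lines(image)
```

Because every cut is a straight horizontal line, this sketch shares the limitation of the first method: a stroke that crosses into a neighboring line is split between two line images.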

The first line recognition method is simple and fast. However, because it cuts along a horizontal straight line, when part of a character in the target line protrudes into the line above or below, the protruding part is cut away without being included in the target line, and the character can no longer be recognized correctly (see FIG. 4(a)). The second line recognition method solves this problem.

Like the first method, the second line recognition method generates a histogram, and then cuts using the valleys of the histogram as the reference for the line cut points (see FIG. 5(b)). More specifically, among the frequencies over the whole vertical range, it identifies one or more vertical sub-intervals whose frequency is at or below a predetermined threshold and treats each sub-interval as a cut candidate. The cut point within each sub-interval can be chosen in several ways: it may be the midpoint of the sub-interval, or the position within the sub-interval that has the minimum frequency.

In FIG. 5(c), the starting side of the cut point is taken as the start point, and the far side in the horizontal direction (the writing direction) is taken as the end point. In other words, of the two points where the horizontal line through the cut point intersects the boundary of the image region, one is the start point and the other is the end point. The strokes that make up each character are then treated as obstacles, the task is posed as a graph search problem from the start point to the end point (that is, finding a path on the graph from the start point to the end point), an existing graph search algorithm is applied, and the path found is used as the cut line. The image region is cut along these cut lines, and the pieces are taken as line images in order from the top. One existing algorithm that can be applied is the A* (A-star) search algorithm. Each line image is given line identification information in addition to the image identification information.
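The path search of the second line recognition method could be sketched as below, as one possible reading rather than the patented implementation. The sketch assumes a binary image (1 for a black dot), 4-connected moves of unit cost, and a Manhattan-distance heuristic; `astar_cut_line` is a hypothetical name, and the description does not specify the move set or cost function.

```python
import heapq


def astar_cut_line(image, cut_y):
    """Find a path of white pixels from the left edge to the right edge of
    row cut_y, treating black pixels (1) as obstacles. Returns a list of
    (y, x) points, or None if the two edges cannot be connected."""
    height, width = len(image), len(image[0])
    start, goal = (cut_y, 0), (cut_y, width - 1)
    if image[start[0]][start[1]] or image[goal[0]][goal[1]]:
        return None

    def heuristic(node):
        # Manhattan distance never overestimates with unit-cost 4-way moves.
        return abs(node[0] - goal[0]) + abs(node[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    best_cost = {start: 0}
    came_from = {start: None}
    while open_heap:
        _, cost, node = heapq.heappop(open_heap)
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1]
        if cost > best_cost[node]:
            continue  # stale heap entry
        y, x = node
        for ny, nx in ((y, x + 1), (y - 1, x), (y + 1, x), (y, x - 1)):
            if 0 <= ny < height and 0 <= nx < width and image[ny][nx] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get((ny, nx), float("inf")):
                    best_cost[(ny, nx)] = new_cost
                    came_from[(ny, nx)] = node
                    heapq.heappush(
                        open_heap,
                        (new_cost + heuristic((ny, nx)), new_cost, (ny, nx)),
                    )
    return None


# A stroke protrudes into the cut row at column 2; the path detours around it.
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
path = astar_cut_line(image, 1)
```

When strokes from adjacent lines touch and leave no gap, the search has to route around both characters or fails altogether, which is the situation the third line recognition method addresses.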

As a variation of the second line recognition method, as shown in FIG. 6(b), the image may be divided into two equal halves in the horizontal direction and a histogram obtained for each half. As in FIG. 5(b), one or more vertical sub-intervals whose frequency is at or below the predetermined threshold are identified for each half; the sub-intervals of one half become start point candidates and those of the other half become end point candidates. Start points and end points are paired in order from the top, and a graph search algorithm is applied to each pair to derive a cut line (see FIG. 6(c)). This allows lines to be recognized properly when a character belonging to the target line protrudes far into the line above or below, for example when the "F" in the second line protrudes far into the character string of the third line, as shown in FIG. 6(a).

The first line recognition method has a second problem: when a character in the target line touches a character in the line above or below, part of the touching character is cut away without being included in the target line, and the character can no longer be recognized correctly. The second line recognition method cannot always solve this second problem either. In FIG. 4(b), part of "代" in the first line touches "7" in the second line. Because there is no gap between these two characters, the A* search algorithm forms the cut line between the first and second lines by detouring around them; the character string of the first line cut along this line then contains an unwanted character, and the character string of the second line lacks a needed one.

The third line recognition method solves this problem. It extracts rectangles by labeling, which counts each cluster of black dots as one (see FIG. 7(a)); extracts as split targets the rectangles whose area or height is large compared with the standard rectangle area or height (see FIG. 7(b)); changes to white the dots on a horizontal straight line passing through the center or centroid of each extracted rectangle (see FIG. 7(c)); and then, as in the second line recognition method, forms the histogram and performs the graph search from the cut points to cut out the lines (see FIG. 7(d)).
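The labeling and splitting step of the third line recognition method might look like the following sketch. Several details are assumptions made for illustration: clusters are found with 8-connectivity, the "standard" height is taken as the median box height, a box counts as oversized when it exceeds 1.5 times that height, and the whitened line spans only the width of the box. The function names are hypothetical.

```python
from collections import deque
from statistics import median


def bounding_boxes(image):
    """Label 8-connected clusters of black dots and return their bounding
    boxes as (top, left, bottom, right) tuples."""
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if image[y0][x0] == 0 or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue = deque([(y0, x0)])
            top = bottom = y0
            left = right = x0
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def split_touching_clusters(image, ratio=1.5):
    """Return a copy of the image in which every oversized cluster has the
    row through the center of its bounding box changed to white."""
    result = [row[:] for row in image]
    boxes = bounding_boxes(image)
    if not boxes:
        return result
    standard_height = median(b[2] - b[0] + 1 for b in boxes)
    for top, left, bottom, right in boxes:
        if bottom - top + 1 > ratio * standard_height:
            center_y = (top + bottom) // 2
            for x in range(left, right + 1):
                result[center_y][x] = 0
    return result


# Column 3 holds two characters from adjacent lines that touch each other.
image = [
    [1, 0, 0, 1, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [1, 0, 0, 1, 0],
    [1, 0, 0, 0, 0],
]
separated = split_touching_clusters(image)
```

After this step a gap exists between the touching characters, so the histogram and graph search of the second method can be run on `separated` as usual.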

The handwritten character recognition device 20 recognizes each line image using its own handwritten character recognition function. This function is built from a conventional offline handwritten character recognition method: characters are first extracted, and character recognition is then performed by one of various techniques (for example, techniques that use a neural network or techniques that use feature information of character attributes). Because the handwritten character recognition device 20 performs recognition with its own local engine, it can process quickly. Whichever recognition technique is used, it can output a reliability value along with the character code string that is the recognition result.

Each of the remote engine devices 31 ... 3N performs handwritten character recognition, using its own offline handwritten character recognition method, on each line image received from the handwritten character recognition device 20 with image identification information and line identification information attached. Because each remote engine device uses a different offline handwritten character recognition method, the results may differ for some line images. Each remote engine device 31 ... 3N transmits its recognition result (character code string and reliability) to the handwritten character recognition device 20.

The handwritten character recognition device 20 can also select, from the plurality of remote engine devices 31, ..., 3N (N: a natural number), the remote engine devices suited to the target image and transmit the target image only to those selected (even when selecting, it is desirable to select a plurality of remote engine devices so that a majority vote is possible). Because the handwritten character recognition device 20 holds the field data attribute of each target image, it can select remote engine devices based on that attribute. As for whether to use the remote engine devices at all, the handwritten character recognition device 20 may send recognition requests to the remote engine devices 31, ..., 3N only when the reliability of its own recognition result is at or below a predetermined value.
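The selection logic described above can be illustrated with a short sketch. The threshold value, the attribute names, the engine identifiers, and the function `select_remote_engines` are all hypothetical; the description only states that selection may depend on the field data attribute and that requests may be limited to cases where the local reliability is at or below a predetermined value.

```python
def select_remote_engines(field_attribute, local_reliability,
                          engines_by_attribute, threshold=0.8):
    """Return the remote engines to query for one line image.

    An empty list means the local result is trusted and no request is sent.
    """
    if local_reliability > threshold:
        return []
    # Engines registered for this field attribute; none if it is unknown.
    return list(engines_by_attribute.get(field_attribute, []))


engines_by_attribute = {
    "name": ["engine31", "engine32"],
    "address": ["engine32", "engine33"],
}
to_query = select_remote_engines("name", 0.55, engines_by_attribute)
```

Registering at least two engines per attribute keeps a majority vote possible together with the local result.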

The handwritten character recognition device 20 holds its own recognition result and the recognition results processed by each of the remote engine devices 31 ... 3N in the data format of FIG. 8. A majority vote over these results yields a single recognition result. The vote tallies the results per character code string and adopts the character code string recognized by the largest number of engines. In the example of FIG. 8, processing engines 001 and 002 both returned character code string α, so the count for α is 2, while processing engine 003 returned character code string β, so the count for β is 1; the majority vote therefore adopts α as the recognition result. The reliability of the adopted result may be any one of the reliability values reported for the adopted character code string, the average of those values, or the number of engines that recognized the adopted string divided by the total number of engines that performed recognition. Instead of counting per character code string as above, the reliability values may be totaled per character code string and the string with the highest total adopted. That approach presupposes that the reliability values used by the processing engines are normalized.

The handwritten character recognition device 20 can additionally apply correction based on a language model to the character string chosen by majority vote. Examples of language models include N-gram models and artificial intelligence models (deep learning models). A general-purpose N-gram or artificial intelligence model may be used, or a model tailored to the field data attribute may be built and used. With artificial intelligence models in particular, actual input data can be prepared as training data to build, for example, a trained model for names and a trained model for addresses.
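The two voting schemes described above might be sketched as follows. Each result is assumed to be an (engine ID, character code string, reliability) tuple, with reliabilities already normalized to a common scale; the function names are hypothetical, and resolving ties in favor of the string seen first is an assumption, since the description does not address ties.

```python
from collections import Counter, defaultdict


def majority_vote(results):
    """Adopt the string returned by the most engines.

    Returns (string, reliability), where reliability is the share of
    engines that agreed on the adopted string.
    """
    counts = Counter(text for _, text, _ in results)
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(results)


def reliability_vote(results):
    """Adopt the string with the highest total reliability."""
    totals = defaultdict(float)
    for _, text, reliability in results:
        totals[text] += reliability
    return max(totals, key=totals.get)


# Mirrors the FIG. 8 example: engines 001 and 002 agree, engine 003 differs.
results = [
    ("001", "α", 0.60),
    ("002", "α", 0.50),
    ("003", "β", 0.90),
]
adopted, reliability = majority_vote(results)
adopted_by_reliability = reliability_vote(results)
```

The two schemes can disagree: a single engine reporting a very high reliability may outweigh two engines that agree with low reliability, which is why normalization across engines matters for the second scheme.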

The recognition result can be checked on the confirmation operator device 42. When a target document is selected, the device displays the recognition result together with the image of the document (or part of it); the confirmation operator checks the content and, if the recognition result is correct, sets a flag to that effect. If the recognition result is incorrect, the confirmation operator corrects it on the confirmation operator device 42. The handwritten character recognition device 20 stores all of the results recognized by the individual engines, the result of the majority vote, and the corrected result. Which result is displayed by default is configurable, and the other results can be made available for reference at the confirmation operator's request. The handwritten character recognition device 20 may also accept from the confirmation operator the designation of a remote engine device other than those to which it already sent requests, and perform handwritten character recognition again.

確認承認者装置４１では確認担当者が行った確認をレビューすることができ、対象書面を選択すると、対象書面の画像（又はその一部）、認識結果及び確認担当者の確認結果が表示し、確認承認者が内容確認して認識結果が正しい場合にはそのフラグを設定する。認識結果が正しくない場合には誤っている認識結果を確認承認者が確認承認者装置４１で修正する。 The confirmation approver device 41 is used to review the checks performed by the checker. When a target document is selected, the device displays the image of the target document (or a part of it), the recognition result, and the checker's confirmation result; the confirmation approver reviews the content and, if the recognition result is correct, sets a flag to that effect. If the recognition result is incorrect, the confirmation approver corrects it on the confirmation approver device 41.

次に、本実施形態の動作について、図９を用いて説明する。 Next, the operation of the present embodiment will be described with reference to FIG. 9.

スキャナー管理装置１０はスキャナー１１でスキャンした画像データを取得し（ステップ１０５）、手書文字認識処理すべき領域をトリミングし（ステップ１１０）、トリミングした画像データに画像情報識別情報及び記載項目データ属性を付与して手書文字認識装置２０に送信する。 The scanner management device 10 acquires image data scanned by the scanner 11 (step 105), trims the area to be subjected to handwritten character recognition (step 110), attaches image information identification information and a description item data attribute to the trimmed image data, and transmits it to the handwritten character recognition device 20.

手書文字認識装置２０はスキャナー管理装置１０から受信したトリミング画像に対して、各種初期処理（傾き補正等）を実行し（ステップ１１５）、いずれかの行認識方法にて行認識処理を実行し、各行画像を取得する（ステップ１２０）。 The handwritten character recognition device 20 performs various initial processes (such as tilt correction) on the trimmed image received from the scanner management device 10 (step 115), then executes line recognition processing using one of the line recognition methods to obtain each line image (step 120).
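The claims name a histogram-based line recognition method. A much simplified sketch of that idea is given here: it counts ink pixels per pixel row and treats rows at or below a threshold as gaps between text lines. The binary list-of-lists image format, the function name `split_lines`, and the straight horizontal cuts are assumptions for illustration; the claimed method additionally solves a graph search problem to route the cutting line around character strokes, which this sketch omits.

```python
def split_lines(image, threshold=0):
    """Return (start_row, end_row) ranges of text lines in a binary image.

    image: list of pixel rows, each a list of 0 (background) or 1 (ink).
    A pixel row whose ink count is <= threshold is treated as a gap.
    """
    histogram = [sum(row) for row in image]  # ink pixels per pixel row
    lines, start = [], None
    for y, count in enumerate(histogram):
        if count > threshold and start is None:
            start = y                        # a text line begins
        elif count <= threshold and start is not None:
            lines.append((start, y))         # the text line ends (end exclusive)
            start = None
    if start is not None:                    # text line touching the bottom edge
        lines.append((start, len(histogram)))
    return lines

page = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
    [1, 0, 1],
    [0, 0, 0],
]
line_ranges = split_lines(page)
line_images = [page[top:bottom] for top, bottom in line_ranges]
```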

手書文字認識装置２０は自装置内で各行画像の手書文字認識を実行すると共に（ステップ１２５）、リモートエンジン装置３１・・・３Ｎに各行画像を送信して自装置外での手書文字認識を依頼する。 The handwritten character recognition device 20 performs handwritten character recognition of each line image within its own device (step 125), and also transmits each line image to the remote engine devices 31... 3N to request handwritten character recognition outside its own device.

手書文字認識装置２０の依頼を受けたリモートエンジン装置３１・・・３Ｎは対象となる行画像を手書文字認識し（ステップ１３０）、認識結果を手書文字認識装置２０に送信する。 Upon receiving the request from the handwritten character recognition device 20, the remote engine devices 31... 3N perform handwritten character recognition on the target line image (step 130) and transmit the recognition result to the handwritten character recognition device 20.

手書文字認識装置２０は自装置内で実行した手書文字認識の認識結果を得ると共に、リモートエンジン装置３１・・・３Ｎの認識結果を受信する。 The handwritten character recognition device 20 obtains a recognition result of handwritten character recognition executed in its own device and receives a recognition result of the remote engine devices 31... 3N.
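The local recognition and the remote requests described in steps 125 and 130 can be sketched as follows. Each engine is modeled as a plain callable that takes a line image and returns a `(text, reliability)` pair; these callables, the function name `recognize_with_all_engines`, and the use of a thread pool are stand-ins assumed for this example, since the specification does not describe the transport or the engine interfaces.

```python
from concurrent.futures import ThreadPoolExecutor

def recognize_with_all_engines(line_image, local_engine, remote_engines):
    """Run the local engine and all remote engines on one line image.

    Returns a list of (text, reliability) pairs: the local result first,
    then the remote results in the order the engines were given.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(remote_engines))) as pool:
        # Send the remote requests first so they run while the local engine works.
        futures = [pool.submit(engine, line_image) for engine in remote_engines]
        results = [local_engine(line_image)]
        results.extend(future.result() for future in futures)
    return results

# Stand-in engines for illustration only.
local = lambda image: ("abc", 0.9)
remotes = [lambda image: ("abc", 0.8), lambda image: ("abd", 0.5)]
all_results = recognize_with_all_engines(b"line-image-bytes", local, remotes)
```

The claims describe a variant in which the remote engines are consulted only when the reliability of the local result is at or below a predetermined value; that would amount to checking the local reliability before submitting the remote requests.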

手書文字認識装置２０は得られた複数の認識結果を多数決し、決定した認識結果を保存する（ステップ１３５）。手書文字認識装置２０は多数決した認識結果に対して補正処理を行って保存する（ステップ１４０）。 The handwritten character recognition device 20 takes a majority vote over the obtained recognition results and stores the result so determined (step 135). The handwritten character recognition device 20 then performs correction processing on the majority-vote result and stores the corrected result (step 140).
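The majority decision of step 135 can be sketched as a vote over whole recognized strings. The tie-breaking rule used here (prefer the string with the larger summed reliability) and the function name `majority_vote` are assumptions for this example; the specification states only that the result is decided by majority. The input is assumed to be a non-empty list.

```python
from collections import Counter, defaultdict

def majority_vote(results):
    """Pick the text returned by the most engines.

    results: non-empty list of (text, reliability) pairs, one per engine.
    Ties are broken by the larger summed reliability (an assumed rule).
    """
    votes = Counter()
    reliability_sum = defaultdict(float)
    for text, reliability in results:
        votes[text] += 1
        reliability_sum[text] += reliability
    return max(votes, key=lambda text: (votes[text], reliability_sum[text]))

voted = majority_vote([("abc", 0.9), ("abd", 0.95), ("abc", 0.6)])
```

In this example two of the three engines agree on "abc", so it wins even though the dissenting engine reports a higher reliability.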

確認担当者装置４２では確認担当者の操作を受けて、手書文字認識装置２０から処理済みの認識結果を参照し、必要に応じて認識結果を修正する（不図示）。修正された場合には修正内容が手書文字認識装置２０に保存される。 On the checker device 42, in response to the checker's operation, the processed recognition result is retrieved from the handwritten character recognition device 20 and corrected as necessary (not shown). When a correction is made, its content is stored in the handwritten character recognition device 20.

確認承認者装置４１では確認承認者の操作を受けて、手書文字認識装置から処理済みの認識結果又は修正内容を参照し、必要に応じて認識結果又は修正内容を修正する（不図示）。 The confirmation approver device 41 receives the operation of the confirmation approver, refers to the processed recognition result or correction content from the handwritten character recognition device, and corrects the recognition result or correction content as necessary (not shown).

上述の実施の形態において、保存は、ハードディスクや半導体メモリにより行う。また、本明細書の記載に基づき、各部を、図示しないＣＰＵや、インストールされたアプリケーションプログラムのモジュールや、システムプログラムのモジュールや、ハードディスクから読み出したデータの内容を一時的に記憶する半導体メモリなどにより実現できることは本明細書に触れた当業者には理解される。 In the embodiment described above, storage is performed by a hard disk or semiconductor memory. Further, those skilled in the art who have read this specification will understand, based on its description, that each unit can be realized by a CPU (not shown), modules of an installed application program, modules of a system program, a semiconductor memory that temporarily stores the content of data read from the hard disk, and the like.

以上、実施の形態に係る手書文字認識システム１の構成と動作について説明した。この実施の形態は例示であり、各構成要素や各処理の組み合わせにいろいろな変形例が可能なこと、またそうした変形例も本発明の範囲にあることは当業者に理解される。 The configuration and operation of the handwritten character recognition system 1 according to the embodiment have been described above. This embodiment is an exemplification, and it will be understood by those skilled in the art that various modifications can be made to each component and combination of processes, and such modifications are within the scope of the present invention.

本発明は、手書き文字の画像データを前処理する技術に好適に利用可能である。 The present invention can be suitably used for a technique for preprocessing image data of handwritten characters.

手書文字認識システム１
スキャナー管理装置１０
スキャナー１１
手書文字認識装置２０
リモートエンジン装置３１
リモートエンジン装置３Ｎ
確認承認者装置４１
確認担当者装置４２

Handwritten character recognition system 1
Scanner management device 10
Scanner 11
Handwritten character recognition device 20
Remote engine device 31
Remote engine device 3N
Confirmation approver device 41
Checker device 42

Claims

A handwritten character recognition system comprising: a line recognition unit that cuts out each of a plurality of lines of handwritten character strings from an image of image data of a target document and extracts it as a line image; a first character recognition unit that performs character recognition on the line image by a predetermined character recognition method; a plurality of second character recognition units that perform character recognition by character recognition methods different from that of the first character recognition unit; and a majority decision unit that obtains a character recognition result by majority decision from the character recognition result of the first character recognition unit and the plurality of character recognition results of the plurality of second character recognition units, wherein the second character recognition units use mutually different character recognition methods,
the character recognition result consists of a character string obtained by character recognition of the line image and a reliability of the character recognition,
the first character recognition unit is configured as a handwritten character recognition device on one computer, and each of the plurality of second character recognition units is configured as a remote engine device on a computer connected to the one computer via a network,
when the reliability of the character recognition result by the first character recognition unit is equal to or less than a predetermined value, the second character recognition units perform character recognition,
the line recognition unit computes a histogram of the image of the image data, identifies one or more cutout points using the histogram, and cuts out a predetermined portion of the image at the identified cutout points to form a line image,
the line recognition unit treats the lines constituting characters in the line image as obstacles, sets as a cutout point a position, within any predetermined section, at which the frequency of the histogram is equal to or less than a predetermined threshold, solves a graph search problem from a start point to a goal point that are set with the cutout point as a reference, and cuts along the solution as a cutting line, and
the line recognition unit extracts, as a rectangle, each lump of continuous lines constituting characters in the line image and, if the size or height of the extracted rectangular area is equal to or greater than a predetermined value, performs a disconnection process on the continuous lines of the lump along a transverse straight line passing through the center or the center of gravity of the extracted rectangular area, and then solves the graph search problem.