JP4594952B2

JP4594952B2 - Character recognition device and character recognition method

Info

Publication number: JP4594952B2
Application number: JP2007072673A
Authority: JP
Inventors: 裕子江藤
Original assignee: Toshiba Corp; Toshiba Solutions Corp
Current assignee: Toshiba Corp; Toshiba Digital Solutions Corp
Priority date: 2007-03-20
Filing date: 2007-03-20
Publication date: 2010-12-08
Anticipated expiration: 2027-03-20
Also published as: JP2008234291A

Description

The present invention relates to a character recognition device and a character recognition method for recognizing characters from an image of an identification document such as a driver's license.

When a user signs a mobile phone service contract at a sales agency or similar outlet, a copy of an identification document, such as a driver's license or health insurance card, is pasted onto part of the application form filled in by the user, and the form is faxed to a contract center or the like.

At the contract center, the application form received by fax is digitized into a monochrome image with an optical character reader (hereinafter, OCR) such as a scanner and displayed on a PC or the like. An operator checks the user's entries and the contents of the identification document on the monochrome image, and the user's (contractor's) data is then registered in a database and managed.

At this point, the printed monochrome fax document is read by the OCR, and the text data obtained by character recognition of the image is registered in the database in association with the image data.

However, the identification document is not necessarily copied at full size or pasted in the correct orientation in the designated identification-pasting field of the application form.

When only the identification document portion has been copied at an arbitrary magnification, or has not been pasted in the correct orientation, the region of the identification document cannot be cut out of the application form image correctly and correct character recognition results cannot be obtained. The operator then has to register the contents of the identification document by keying the unrecognized characters directly into the PC.

One known technique for recognizing identification documents reads a driver's license with a dedicated scanner, detects ruled lines in the scanned license image, and uses them as a guide to recognize the characters on the license.

As a technique for determining the orientation of an image before reading characters, reference marks are printed at predetermined positions in three of the four corners of the form to be transmitted. The reference marks are detected in the form image obtained through a facsimile machine, the orientation of the form image is detected and corrected, and characters are recognized from the corrected image (see, for example, Patent Document 1).
Patent Document 1: JP-A-2-12479

With the above prior art, the position of the identification document can be detected by scanning the entire form image and working out the character positions. However, some fields, such as the license number field of a driver's license, contain characters printed over a background of diagonal lines. In a monochrome fax, or in a monochrome image obtained by scanning the fax paper, character recognition of this portion has a high probability of error.

The present invention has been made to solve this problem, and its object is to provide a character recognition device and a character recognition method that can improve the recognition rate for character images with a diagonal-line background.

To solve the above problem, the character recognition device of the present invention comprises: image information acquisition means for acquiring an image from a printed surface on which a character string is printed over a background of diagonal lines; character image cutout means for extracting, from the image acquired by the image information acquisition means, the image of the character string with the diagonal-line background and cutting it out character by character; character image rotation means for rotating each character image cut out by the character image cutout means to an angle at which the diagonal lines become substantially horizontal; ray processing means for removing, from the character image rotated by the character image rotation means, the black pixel components that form lines in the horizontal direction; feature vector extraction means for extracting a feature vector of the character image from which the ray processing means has removed the black pixel components forming horizontal lines; a dictionary storage unit storing a dictionary that associates text data with feature vectors of reference character images rotated in advance by an angle matching the angle of the diagonal lines; and character recognition means for comparing the feature vector of the character image extracted by the feature vector extraction means with the feature vectors stored in the dictionary storage unit and outputting the text data whose feature vector matches or approximates it.

The character recognition method of the present invention comprises the steps of: image information acquisition means acquiring an image from a printed surface on which a character string is printed over a background of diagonal lines; character image cutout means extracting, from the image acquired by the image information acquisition means, the image of the character string with the diagonal-line background and cutting it out character by character; character image rotation means rotating each character image cut out by the character image cutout means to an angle at which the diagonal lines become substantially horizontal; ray processing means removing, from the character image rotated by the character image rotation means, the black pixel components that form lines in the horizontal direction; feature vector extraction means extracting a feature vector of the character image from which the ray processing means has removed the black pixel components forming horizontal lines; and, with a dictionary that associates text data with feature vectors of reference character images rotated in advance by an angle matching the angle of the diagonal lines stored in a dictionary storage unit, character recognition means comparing the feature vector of the character image extracted by the feature vector extraction means with the feature vectors stored in the dictionary storage unit and outputting the text data whose feature vector matches or approximates it.

As described above, the present invention can improve the recognition rate for character images with a diagonal-line background.

Embodiments of the present invention are described in detail below with reference to the drawings.
FIG. 1 shows the configuration of an identification document recognition system according to one embodiment of the present invention, and FIG. 2 shows an example of an application form.
As shown in FIG. 1, the identification document recognition system comprises an image scanner 2 (hereinafter, scanner 2), which scans the surface of an application form 1 (a form) with, for example, a CCD to acquire (generate) image information (image data), and a computer 10, which is connected to the scanner 2 and performs extraction of the image of the license 23 and character recognition on the image information (image data) read from the application form 1.

The scanner 2 scans the surface of the application form 1 with a CCD sensor and acquires a monochrome image (image data). The result is the same if monochrome fax paper received by fax is scanned in color. In other words, the scanner 2 functions as image information acquisition means that acquires monochrome image information from the application form 1.

The computer 10 includes an operation unit 11, a communication I/F 12, a memory 13 serving as storage means, a display unit 14, a hard disk device 15, and a CPU 16. The operation unit 11 is input means, such as a keyboard and a mouse, operated by the user.

The memory 13 stores a standard dictionary 13a (hereinafter, first dictionary 13a), which associates text data with standard character images for character recognition (also called registered patterns) or their feature vectors, and a dictionary 13b dedicated to characters with diagonal lines (hereinafter, second dictionary 13b), which is used to recognize specific characters on the license (the characters in the license number portion M that are overlaid with diagonal lines).

That is, the memory 13 functions as a dictionary storage unit storing a dictionary (second dictionary 13b) that associates text data with feature vectors of reference character images rotated (tilted) in advance by a predetermined angle. The predetermined angle is the angle that matches the angle of the diagonal lines in the character images to be recognized.

The memory 13 also stores a license format 13c, which is used to detect the position, tilt, and so on of the license image within the form image and to cut out character images from the detected license image.
The license format 13c stores one of the characteristic reference characters that are preprinted several times on a license, namely 「年」 (year), 「月」 (month), and 「日」 (day); in this example the character type 「日」 is used. It also stores positional data for the multiple reference characters 「日」: the X and Y coordinates of each 「日」 relative to a reference point on the license 23 (such as the upper-left corner, taken as X = 0, Y = 0), the distances between the characters, and so on.

The license format 13c also holds region information for cutting out the images of the character strings to be recognized from the license image in the form. Like the positions of the character 「日」, the region information is specified by the X and Y coordinates of each character, the distances between characters, and so on. The formats and dictionaries may instead be stored in the hard disk device 15.

For example, in addition to the cutout region information (the specification of each reading range, with the upper-left corner of the license taken as X = 0, Y = 0), the license format 13c holds a dictionary designation condition (dictionary designation information) for the character images of the license number portion M (the printed surface on which a character string is printed over a background of diagonal lines). Of the 12 character images to be cut out, the first dictionary 13a is used for the 1st to 4th, the second dictionary 13b for the 5th to 8th, which are overlaid with diagonal lines, and the first dictionary 13a for the 9th to 12th.
The display unit 14 displays the image of the application form 1 captured by the scanner 2, a screen for performing character recognition on the extracted image of the license portion, the text data of the character recognition result, and so on.
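The dictionary designation condition held in the license format 13c for the license number portion M can be pictured as a simple lookup from character position to dictionary. The following is a minimal sketch only; the function name and the string labels are hypothetical and do not appear in the patent.

```python
# Hypothetical labels for the two dictionaries described in the text.
FIRST_DICTIONARY = "13a"   # standard dictionary
SECOND_DICTIONARY = "13b"  # dictionary for characters overlaid with diagonal lines


def dictionary_for_position(position: int) -> str:
    """Return the dictionary label for the 1-based character position
    within the 12 characters of license number portion M."""
    if not 1 <= position <= 12:
        raise ValueError("position must be between 1 and 12")
    # Characters 5 to 8 carry the diagonal-line background.
    if 5 <= position <= 8:
        return SECOND_DICTIONARY
    return FIRST_DICTIONARY


assignments = [dictionary_for_position(p) for p in range(1, 13)]
```

Keeping the rule as data in the format, rather than in the recognition code, matches the text's description of the format as the place where the designation is set.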

An operating system (hereinafter, OS) and control software that causes the CPU 16 to control each unit are installed on the hard disk device 15, and they cooperate to realize the operation of this system. The operation is described below as the operation of the CPU 16 after the computer has started.

That is, the CPU 16 functions as: feature character detection means that detects, in the image of the application form 1 acquired by the scanner 2, multiple instances of at least one type of feature character printed on the license, such as 「年」, 「月」, or 「日」; image extraction means that determines the scaling ratio and/or orientation (portrait, landscape, normal, upside down, and so on) of the license image from the positional relationship of the detected characters and the preset reference positions of the characters on the license 23, and extracts the image of the license 23 from the image of the application form 1; and character recognition means that performs character recognition on the extracted image of the license 23 portion.

The CPU 16 functions as image information acquisition means that acquires an image from a printed surface on which a character string is printed over a background of diagonal lines. It functions as character image cutout means that extracts the image of that character string from the acquired image and cuts it out character by character. It also functions as character image rotation means that rotates each cut-out character image to an angle at which the diagonal lines become substantially horizontal.

The CPU 16 functions as ray processing means that removes (erases) from the rotated character image the black pixel components that form lines in the horizontal direction. It also functions as feature vector extraction means that extracts a feature vector of the character image from which those black pixel components have been removed.

The CPU 16 functions as character recognition means that compares the feature vector of the character image extracted by the feature vector extraction means with the feature vectors of the second dictionary 13b in the memory 13, excluding the feature vector components of the diagonal-line portion, and outputs the text data whose feature vector matches or approximates it.
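The comparison step can be sketched as a nearest-neighbor search over dictionary entries that skips the vector components belonging to the diagonal-line portion. This is an illustration under assumptions: the patent does not specify the distance measure or the vector layout, so Euclidean distance, the `ignored` index set, and the function name `recognize` are all hypothetical.

```python
import math


def recognize(feature, dictionary, ignored=frozenset()):
    """Return the text whose reference vector is closest to `feature`.

    `dictionary` maps text to a reference feature vector of the same length.
    Components whose indices are in `ignored` (assumed to correspond to the
    diagonal-line portion) are left out of the comparison.
    """
    kept = [i for i in range(len(feature)) if i not in ignored]
    best_text, best_distance = None, math.inf
    for text, reference in dictionary.items():
        distance = math.sqrt(sum((feature[i] - reference[i]) ** 2 for i in kept))
        if distance < best_distance:
            best_text, best_distance = text, distance
    return best_text


# Toy vectors: component 2 stands in for a region affected by the diagonal lines.
toy_dictionary = {"1": [1.0, 0.0, 9.0], "7": [0.0, 1.0, 0.0]}
toy_feature = [0.9, 0.1, 0.0]
```

With these toy values, the match changes depending on whether component 2 is ignored, which shows why excluding the line-affected components matters.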

That is, in accordance with the license format 13c stored in advance in the memory 13 for reading the image of a license, the CPU 16 acquires the images of the character strings in predetermined regions of the license 23 image, such as the name, date of birth, registered domicile, address, license number, and date of license acquisition. It cuts the image of each region into individual characters and performs character recognition on each character image with reference to the dictionary designated for it by the license format 13c (either the first dictionary 13a or the second dictionary 13b).

When recognizing a cut-out character image with diagonal lines in the background, the CPU 16 rotates the character image to an angle at which the diagonal lines become substantially horizontal (in this example, 45° clockwise about the character center), removes the horizontal line components made up of black pixels (the diagonal-line components) from the rotated image, and extracts a feature vector from the result. It then recognizes the character using the second dictionary 13b dedicated to that portion, comparing feature vectors with the diagonal-line locations excluded.
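The line-removal step can be sketched as follows, assuming the character image has already been rotated so that the background lines are horizontal. The image is taken to be a list of rows with 1 for black and 0 for white, and any horizontal run of black pixels at least `min_run` long is treated as a line. The representation, the threshold, and the function name are assumptions made for illustration; the patent does not state how a line component is identified.

```python
def remove_horizontal_lines(image, min_run):
    """Return a copy of `image` with long horizontal black runs erased.

    `image` is a list of rows; each row is a list of 0 (white) / 1 (black).
    A run of at least `min_run` consecutive black pixels in a row is treated
    as part of a background line and set to white.
    """
    result = [row[:] for row in image]
    for row in result:
        x = 0
        width = len(row)
        while x < width:
            if row[x] == 1:
                start = x
                while x < width and row[x] == 1:
                    x += 1
                if x - start >= min_run:
                    for i in range(start, x):
                        row[i] = 0
            else:
                x += 1
    return result


# A vertical stroke crossed by one full-width line in the middle row.
sample = [
    [0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0],
]
cleaned = remove_horizontal_lines(sample, min_run=4)
```

Note that the stroke pixel lying on the removed line is erased as well, which is consistent with the text's point that the later comparison leaves the diagonal-line locations out.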

As shown in FIG. 2, a form such as the application form 1 for a mobile phone service contract has entry fields 21 for the address, name, application details, and so on, and an identity verification pasting field 22. The applicant fills in the entry fields 21 by hand with a ballpoint pen or the like.

Instructions for attaching the document and similar notes are preprinted in the identity verification pasting field 22. A copy of the identification document presented by the applicant, such as the license 23 or a health insurance card, is enlarged or reduced to fit within the field and pasted over them.

Because the application form 1 is fed into a facsimile machine or the like, transmitted over a communication network, and received at the contract center, the contract center cannot rework the received fax image, that is, the image of the application form 1 with the identification document pasted on it, for example by straightening only the license portion.

In other words, the license on the fax image may appear at varying positions within the identity verification pasting field 22 and in various states: landscape, portrait, reduced, enlarged, and so on. In addition, text preprinted on the application form 1 often shows around the license 23 in the background.

If pasting the license in landscape orientation in the license pasting field of the application form 1 is taken as the normal orientation, three other cases are conceivable for the license 23: a first case in which it is pasted upside down, a second case in which it is pasted in portrait orientation (perpendicular to the normal orientation), and a third case in which it is pasted slightly tilted from any of these orientations. "Slightly tilted" here means, in terms of human visual perception, up to about 3 to 5 degrees.

As shown in FIG. 2, at least five instances (p1 ... p5) of the character 「日」, all of roughly the same size, are printed on the license 23.
In this system, therefore, the positions of the center points of the five 「日」 characters (p1 ... p5) that serve as references for detecting where the license 23 is pasted are registered in the memory 13. As shown in FIG. 3, the system checks exhaustively which reference each center point of the 「日」 characters (d1 ... dn) detected in the image 1a of the application form being recognized corresponds to, and finds the most plausible combination.

The problem of finding the optimal combination among many is called a combinatorial optimization problem. Many solution methods are known, including breadth-first search, depth-first search, genetic algorithms, and simulated annealing, and this system uses one of them.
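As a rough illustration of the exhaustive check described above, the sketch below tries every ordered selection of detected center points against the reference points and scores each by how well its pairwise distances agree with the reference distances up to a common scale. The scoring function and all names are assumptions for illustration; the patent does not specify how plausibility is measured, and a practical system might use one of the search methods it lists instead of plain enumeration.

```python
import itertools
import math


def pairwise_distances(points):
    """Distances between every pair of points, in a fixed pair order."""
    return [math.dist(a, b) for a, b in itertools.combinations(points, 2)]


def best_assignment(reference, detected):
    """Return (error, scale, candidate) for the best-matching selection.

    `reference` and `detected` are lists of (x, y) center points.
    `candidate[i]` is the detected point matched to `reference[i]`.
    """
    reference_distances = pairwise_distances(reference)
    best = None
    for candidate in itertools.permutations(detected, len(reference)):
        candidate_distances = pairwise_distances(candidate)
        scale = sum(candidate_distances) / sum(reference_distances)
        if scale == 0:
            continue  # degenerate selection: all points coincide
        # Scale-normalized mismatch between the two distance patterns.
        error = sum(
            abs(c - scale * r)
            for c, r in zip(candidate_distances, reference_distances)
        ) / scale
        if best is None or error < best[0]:
            best = (error, scale, candidate)
    return best


# Reference triangle, and detections containing a 2x copy plus one stray point.
reference_points = [(0, 0), (4, 0), (0, 3)]
detected_points = [(10, 10), (50, 50), (18, 10), (10, 16)]
error, scale, matched = best_assignment(reference_points, detected_points)
```

Because only distances are compared, the score is unaffected by rotation of the pasted license; the number of candidates grows quickly with the number of detections, which is why the text treats this as an optimization problem.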

Once the most plausible combination has been found, the relative positions indicate where on the image the entries such as name, date of birth, address, and license number are located. Based on this calculation, the system cuts out the partial image of the license 23 and the entries within it, and performs character recognition.

The overall operation of the identification document recognition system is now described with reference to the flowchart of FIG. 4.
When the application form 1 is set on the reading platen of the scanner 2 and a scan is started, the scanner 2 scans the surface of the application form 1, generates image data, and sends it to the computer 10.

In the computer 10, the CPU 16 scans the image data of the application form 1 received from the scanner 2 in the vertical direction and detects multiple feature characters 「日」 (S101). From the positional relationship of the feature characters, it optimizes the combination of the detected 「日」 characters (S102) and detects the scaling ratio and orientation of the image of the license portion.

Based on the detected scaling ratio (size) and orientation (tilt, upside down, and so on) of the image of the license portion, the CPU 16 cuts out the partial image of the license, then cuts out the images of the individual entries from it in accordance with the preset license format 13c (S103), and performs character recognition on each entry (S104). The contents of the license 23 pasted on the application form 1 are thereby converted into text data, stored in the memory 13, and displayed on the display unit 14.

When the user then performs a save or output operation on the operation unit 11, the CPU 16 associates the text data in the memory 13 with the image of the license portion and saves or outputs them to the hard disk device 15 (database), which is the save location or output destination.

The license image recognition process in this system is described in detail below with reference to FIGS. 5 and 6.
To perform the license image recognition process, the CPU 16 first scans the image of the application form 1 read by the scanner 2 in a fixed direction (the vertical direction) (S111 in FIG. 5).

The CPU 16 then counts the run lengths of consecutive white pixels and black pixels obtained by scanning the image (S112). As shown in FIG. 4, on a given line of interest 31 it searches for a place where the ratio of the run lengths is approximately black : white : black : white : black = a : b : a : b : a (S113), and thereby detects a location (image region) where the character 「日」 is likely to be.
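The run-length search of S112 and S113 can be sketched on a single scan line as below. The line is assumed to be a sequence of 0 (white) and 1 (black) pixel values, and "approximately" is modeled with a relative tolerance; both the tolerance value and the function names are assumptions not found in the patent.

```python
from itertools import groupby


def run_lengths(line):
    """Run-length encode a scan line into (value, length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(line)]


def find_pattern(line, tolerance=0.25):
    """Return start offsets where five consecutive runs are
    black:white:black:white:black with lengths close to a:b:a:b:a."""
    runs = run_lengths(line)
    hits = []
    offset = 0
    for i, (value, length) in enumerate(runs):
        window = runs[i:i + 5]
        if len(window) == 5 and [v for v, _ in window] == [1, 0, 1, 0, 1]:
            a1, b1, a2, b2, a3 = (n for _, n in window)
            a = (a1 + a2 + a3) / 3
            b = (b1 + b2) / 2
            blacks_ok = all(abs(n - a) <= tolerance * a for n in (a1, a2, a3))
            whites_ok = all(abs(n - b) <= tolerance * b for n in (b1, b2))
            if blacks_ok and whites_ok:
                hits.append(offset)
        offset += length
    return hits


# Three 2-pixel strokes separated by 5-pixel gaps, starting at offset 3.
matching_line = [0] * 3 + [1] * 2 + [0] * 5 + [1] * 2 + [0] * 5 + [1] * 2 + [0] * 4
# The middle stroke is too thick, so the a:b:a:b:a ratio does not hold.
uneven_line = [1] * 2 + [0] * 5 + [1] * 6 + [0] * 5 + [1] * 2
```

Comparing runs against their own averages, rather than against fixed pixel counts, keeps the test independent of the copy magnification.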

Next, starting from the detected location, the CPU 16 traces black pixels to the left and right and detects the range 32 of horizontally continuous black pixels (S114).
The CPU 16 then traces black pixels up and down from the black pixel at the end of the range 32 and detects the range 33 of vertically continuous black pixels (S115).
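The two tracing steps S114 and S115 can be sketched as follows, assuming a binary image stored as a list of rows with 1 for black. The function names and the tiny test glyph are hypothetical and serve only to show the order of operations: horizontal trace from the detected pixel, then vertical trace from one end of that horizontal range.

```python
def horizontal_extent(image, x, y):
    """Trace black pixels left and right from (x, y); return (left, right)."""
    row = image[y]
    left = right = x
    while left > 0 and row[left - 1] == 1:
        left -= 1
    while right < len(row) - 1 and row[right + 1] == 1:
        right += 1
    return left, right


def vertical_extent(image, x, y):
    """Trace black pixels up and down from (x, y); return (top, bottom)."""
    top = bottom = y
    while top > 0 and image[top - 1][x] == 1:
        top -= 1
    while bottom < len(image) - 1 and image[bottom + 1][x] == 1:
        bottom += 1
    return top, bottom


# A coarse glyph shaped like the feature character: three bars, two sides.
glyph = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
# Start on the top bar, then trace vertically from the left end of that bar.
left, right = horizontal_extent(glyph, x=1, y=0)
top, bottom = vertical_extent(glyph, x=left, y=0)
```

The resulting pair of ranges gives a bounding box whose size can then be checked against the reference character size.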

The CPU 16 then determines whether the detected horizontal range 32 and vertical range 33 satisfy the reference character size set in advance in the memory 13: at least 1 mm wide × 2 mm high and at most 5 mm wide × 5 mm high (S116).
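The size condition of S116 is stated in millimeters, so a pixel-based implementation has to convert using the scan resolution. The sketch below assumes a resolution of 200 dpi purely for illustration; the patent does not state the resolution, and the function names are hypothetical.

```python
MM_PER_INCH = 25.4


def pixels_to_mm(pixels, dpi):
    """Convert a pixel count to millimeters at the given resolution."""
    return pixels * MM_PER_INCH / dpi


def is_candidate_size(width_px, height_px, dpi=200):
    """Check the S116 condition: at least 1 mm x 2 mm (width x height)
    and at most 5 mm x 5 mm. The default dpi is an assumed value."""
    width_mm = pixels_to_mm(width_px, dpi)
    height_mm = pixels_to_mm(height_px, dpi)
    return 1.0 <= width_mm <= 5.0 and 2.0 <= height_mm <= 5.0
```

At the assumed 200 dpi, a 16 × 24 pixel box is about 2.0 mm × 3.0 mm and passes, while a 4-pixel-wide box (about 0.5 mm) is rejected.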

If the condition is satisfied (Yes in S116), the CPU 16 performs character recognition on the image within the detected horizontal range 32 and vertical range 33, and selects the regions whose recognition result is 「日」 (S117).

The CPU 16 repeats the above processing until no places remain where the ratio of consecutive black and white pixel counts is a : b : a : b : a (S118). In other words, the CPU 16 detects feature characters in image regions where the arrangement ratio of black pixels a and white pixels b is a : b : a : b : a. The arrangement ratio here refers to both the order of the pixel runs and the ratio of their pixel counts.

When recognizing characters from an image, it has conventionally been common to recognize every character in the entire image. Doing so may involve hundreds to thousands of characters, which slows processing considerably.

This embodiment therefore exploits the fact that the strokes of the feature character 「日」 are evenly spaced. It first identifies, from the arrangement of white and black pixels, the ranges where a 「日」 is likely to be, and performs character recognition only on those ranges. This allows 「日」, a characteristic character of the license 23, to be detected very quickly.

By repeating the above processing, the CPU 16 detects multiple 「日」 characters, calculates the distances (intervals) between them, and compares these with the reference distances between the characters stored in advance in the memory 13 to obtain the scaling ratio (enlargement or reduction ratio) of the image of the license portion. The reference distances may be the actual dimensions of the license or distances multiplied by some magnification.
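Given detected centers that have already been matched to the reference centers, the scaling ratio can be estimated by comparing inter-character distances. The sketch below averages the ratio over all pairs; the averaging and the function name are assumptions, since the patent only says the distances are compared.

```python
import math
from statistics import mean


def scale_factor(detected, reference):
    """Estimate the scaling ratio of the pasted license image.

    `detected[i]` is the detected center matched to `reference[i]`;
    both are lists of (x, y) points in the same order.
    """
    ratios = []
    for i in range(len(reference)):
        for j in range(i + 1, len(reference)):
            reference_distance = math.dist(reference[i], reference[j])
            if reference_distance > 0:
                detected_distance = math.dist(detected[i], detected[j])
                ratios.append(detected_distance / reference_distance)
    return mean(ratios)


# Reference layout and a half-size copy shifted to another position.
reference_centers = [(0, 0), (4, 0), (0, 3)]
detected_centers = [(10, 10), (12, 10), (10, 11.5)]
estimated = scale_factor(detected_centers, reference_centers)
```

Averaging over several pairs reduces the influence of a small error in any single detected center.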

Having obtained the scaling ratio of the license portion, the CPU 16 cuts out a range corresponding to that ratio from the image of the FAX sheet and temporarily stores the image of the license portion in the memory 13.

Then, the CPU 16 cuts out character images from the image of the license portion in the memory 13 according to the license format 13c, performs character recognition on each character image using, depending on its position, either the first dictionary 13a (the standard dictionary) or the second dictionary 13b (the dictionary dedicated to hatched characters), and stores the recognition results in the memory 13.
The CPU 16 associates the text data resulting from this character recognition with the partial image of the license 23 temporarily stored in the memory 13, and registers them in the database built on the hard disk device 15.

Next, an application example of the license recognition processing will be described with reference to FIG. 7.
As shown in FIG. 7, strictly speaking, the character “日” differs in shape between the part above its center and the part below. Detecting this subtle difference makes the processing even faster.

That is, after recognizing (detecting) “日”, the characteristic character of the license 23, the CPU 16 examines the upper or lower pattern of each “日” and detects the protruding portion 35. In this processing, the CPU 16 determines whether the protruding portion 35 lies in the upper half or the lower half of the character.

The CPU 16 then counts the number of “日” characters with the protruding portion 35 at the top and the number with it at the bottom, and determines the orientation (up or down) of the license 23 from whichever count is larger. In other words, for the detected characteristic characters “日”, the CPU 16 counts on which side, upper or lower, part of the character protrudes, and determines the orientation of the image of the license 23 from the counts.
By detecting the direction of the protruding portion 35 and determining the character orientation in advance (up: normal direction, down: reverse direction), it is no longer necessary to determine up and down through combinatorial optimization over the plurality of “日” characters, so the license 23 can be recognized faster.
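The majority decision can be sketched in a few lines. The labels "top" and "bottom", the function name, and the choice to return `None` on a tie are assumptions for illustration; the patent does not say how a tie is handled, nor does this sketch decide which side corresponds to the normal direction.

```python
from collections import Counter

def majority_side(sides):
    """Return the side ("top" or "bottom") on which most characters protrude.

    `sides` holds one label per detected character. A tie returns None
    (an assumed convention; the source does not specify tie handling).
    """
    counts = Counter(sides)
    top, bottom = counts["top"], counts["bottom"]
    if top == bottom:
        return None
    return "top" if top > bottom else "bottom"

# Four detected characters, one of them misjudged (hypothetical data).
print(majority_side(["bottom", "bottom", "top", "bottom"]))  # bottom
```

Voting over all detected characters means a single misjudged character does not flip the orientation decision.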

Conventionally, a license was read with a dedicated scanner, ruled lines were detected in the resulting image of the license 23, and the license was recognized by relying on those lines. With this ruled-line approach, however, the lines are often broken in a copy of the application form 1 or in a form received by FAX on a facsimile machine, so the license frequently cannot be recognized. The license recognition system of this embodiment instead detects the plurality of “日” characters printed in bold on the license pasted on the application form, so the image of the license 23 can be recognized more stably.

The image portion of the license 23 can also be recognized whether the license is placed on the application form 1 in the normal direction or upside down relative to it (approximately 0 degrees or 180 degrees).

Furthermore, if the characteristic character “日” is not detected in the initial vertical scan, the image is rotated by 90 degrees and the license image recognition processing is performed again, so that the image portion of a license 23 placed perpendicular to the normal direction (90 degrees or 270 degrees) can also be recognized.

A character recognition function can normally recognize a character even if its image is tilted by up to about ±5 degrees. The license recognition method of the above embodiment can therefore recognize the license 23 when it is placed within 0±5, 90±5, 180±5, or 270±5 degrees. For ordinary application forms, covering these ranges is enough to recognize most license images.

Furthermore, if the image is rotated not only by 0 and 90 degrees but in 10-degree steps (10 degrees, 20 degrees, ..., 170 degrees) and the license image recognition processing is performed at each step, the ranges 10±5 degrees, 20±5 degrees, ..., 170±5 degrees are also covered. Because “日” can be recognized even when it is upside down, the ranges 190±5 degrees, 200±5 degrees, ..., 350±5 degrees are covered as well, so a license placed in any direction can be recognized.
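The coverage argument can be checked numerically. The sketch below is an illustration under assumptions: the function name is hypothetical, angles are tested at whole degrees only, and the ±5 degree tolerance is taken from the text above.

```python
def is_covered(theta, step=10, tolerance=5):
    """True if a license at angle theta (degrees) lies within tolerance of a trial rotation.

    Trial rotations run from 0 up to (but not including) 180 in `step` increments.
    Each also covers its upside-down counterpart (+180 degrees), because the
    characteristic character reads the same when inverted.
    """
    for rotation in range(0, 180, step):
        for base in (rotation, rotation + 180):
            # Smallest circular difference between theta and base, in degrees.
            diff = abs((theta - base + 180) % 360 - 180)
            if diff <= tolerance:
                return True
    return False

uncovered_10 = [t for t in range(360) if not is_covered(t, step=10)]
uncovered_90 = [t for t in range(360) if not is_covered(t, step=90)]
print(len(uncovered_10), len(uncovered_90))
```

With 10-degree steps no whole-degree angle is left uncovered, whereas with only the 0 and 90 degree trials most oblique placements (45 degrees, for example) fall outside the tolerance.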

In the conventional method, because the ruled lines are evenly spaced, the recognition result was sometimes shifted by one line. In this embodiment, the “日” characters are not evenly spaced on the image, so a shifted recognition result is less likely.

Moreover, “日”, the characteristic character on the license, still reads as “日” when turned upside down. Unlike other characters, it can be detected without rotating the image by 180 degrees, so the processing up to cutting the license portion out of the image of the application form 1 is extremely fast.

Next, the character recognition processing for the items entered on the license image (step S104) will be described in detail with reference to FIGS. 8 to 15.

The license portion of the application form image 1a shown in FIG. 3 contains, in addition to the name, registered domicile, date of birth, address, date of issue, and expiration date, a field for license conditions, and within that field is the license number M.

In this example, as shown in FIG. 8, the license number M “909811351090” is assumed to be printed. When this license number M is recognized, the digits near the center of the 12-digit number are overlaid with oblique lines, which hinder character recognition.

In this embodiment, therefore, among the character images cut out one character at a time, the CPU 16 applies different image processing and different dictionaries to the character images with a white background (“9”, “0”, “9”, “8”, “1”, “0”, “9”, “0”) and to the character images with oblique lines drawn in the background (“1”, “1”, “3”, “5”).

For the white-background character images “9”, “0”, “9”, “8”, “1”, “0”, “9”, “0”, the CPU 16 performs character recognition with reference to the first dictionary 13a. The first dictionary 13a is built in the same way as a general character recognition dictionary, so its description is omitted here.
For the character images “1”, “1”, “3”, “5” with oblique lines in the background, the CPU 16 performs character recognition with reference to the second dictionary 13b.
<Dictionary dedicated to the hatched part of the license number>

As shown in FIG. 9, patterns of the digits “0” to “9” rotated 45 degrees to the right are registered in advance in the second dictionary 13b in association with text data. To create the second dictionary 13b, each pattern to be registered is first scaled vertically and horizontally to a fixed size, that is, linearly normalized.
Next, the normalized image is divided vertically and horizontally into a grid, and the feature values obtained by calculating density features and the like for each cell are registered as the dictionary.
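The normalization and grid feature steps can be sketched as below. This is a simplified illustration: nearest-neighbour resampling, the 27-pixel normalized size, and the use of the black-pixel fraction as the density feature are assumptions chosen for brevity, and the function names are hypothetical.

```python
def normalize(image, size=27):
    """Linearly scale a binary image (rows of 0/1) to size x size by nearest neighbour."""
    height, width = len(image), len(image[0])
    return [[image[y * height // size][x * width // size] for x in range(size)]
            for y in range(size)]

def grid_features(image, grid=9):
    """Split a square image into grid x grid cells; return each cell's black-pixel fraction."""
    cell = len(image) // grid
    features = []
    for gy in range(grid):
        for gx in range(grid):
            black = sum(image[y][x]
                        for y in range(gy * cell, (gy + 1) * cell)
                        for x in range(gx * cell, (gx + 1) * cell))
            features.append(black / (cell * cell))
    return features

# Hypothetical 18x18 pattern with a 2x2 black block in the top-left corner.
pattern = [[1 if y < 2 and x < 2 else 0 for x in range(18)] for y in range(18)]
vector = grid_features(normalize(pattern))
print(len(vector), vector[0])  # 81 1.0
```

With a 9×9 grid the result is one value per cell, ordered row by row, so the vector always has 81 entries regardless of the input size.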

FIG. 9 shows an example in which a digit, for example “3”, is divided into a 9×9 grid. In this case, the features are represented by an 81-dimensional feature vector. For a cut-out hatched character image, the CPU 16 performs character recognition against the feature vectors of the second dictionary 13b using a character recognition technique such as the multiple similarity method.
<Processing of character images with oblique lines>
The specific processing is described below.

As shown in FIG. 8, the license number consists of 12 digits, and the middle four digits are overlaid with oblique lines running from upper right to lower left at an angle of 45 degrees. The digits in the license number field are printed at a fixed pitch. With this in mind, the license format 13c stores the cut-out position and range of each character image and which dictionary to use. The CPU 16 obtains the character pitch from the part without oblique lines.

Based on the character pitch, the CPU 16 then cuts out the digits in the hatched part of the license number M one at a time in order, and performs character recognition with reference to the dictionary specified in the license format 13c.
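One way to picture the pitch-based cut-out is the sketch below. It assumes the left edges of the four unhatched leading digits are already known and that the hatched digits follow immediately at the same pitch; the function names and coordinate values are hypothetical.

```python
def estimate_pitch(left_edges):
    """Mean spacing between consecutive digit positions (sorted left to right)."""
    gaps = [b - a for a, b in zip(left_edges, left_edges[1:])]
    return sum(gaps) / len(gaps)

def hatched_digit_boxes(left_edges, count=4):
    """Predict (left, right) x-ranges of `count` digits that follow the given digits."""
    pitch = estimate_pitch(left_edges)
    start = left_edges[-1] + pitch
    return [(start + i * pitch, start + (i + 1) * pitch) for i in range(count)]

# Left edges of the four unhatched leading digits (hypothetical pixel positions).
plain_digits = [100, 120, 141, 160]
print(estimate_pitch(plain_digits))  # 20.0
print(hatched_digit_boxes(plain_digits))
```

Because the boxes are predicted from the pitch rather than found by segmentation, the oblique lines crossing the digits do not disturb the cut-out positions.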

While the CPU 16 recognizes the license number one character at a time, when it cuts out, for example, the hatched character image “3” in the fourth digit as shown in FIG. 11, it rotates the cut-out character image “3” by 45 degrees to the right (S201 in FIG. 10), converting the image so that the oblique lines become horizontal (see FIG. 12).

Next, the CPU 16 detects the positions of the oblique lines in the rotated image (S202).
In the rotated image the oblique lines are clusters of black pixels nearly parallel to the X axis, that is, horizontal lines, so the CPU 16 detects their positions by a method such as taking the horizontal projection of the black pixels.

Having detected the positions of the oblique lines, the CPU 16 then removes them from the image (S203) to obtain a thinned image as shown in FIG. 13.
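Steps S202 and S203 can be sketched with a horizontal projection, as follows. The `min_fill` threshold (the fraction of a row that must be black for it to count as a line) and the function names are assumptions; the patent only says that a projection or similar method is used.

```python
def find_line_rows(image, min_fill=0.8):
    """Rows whose horizontal projection (black-pixel count) reaches min_fill of the width."""
    width = len(image[0])
    return [y for y, row in enumerate(image) if sum(row) >= min_fill * width]

def remove_rows(image, rows):
    """Return a copy of the image with the given rows cleared to white (0)."""
    rows = set(rows)
    return [[0] * len(row) if y in rows else list(row)
            for y, row in enumerate(image)]

# A vertical stroke crossed by two horizontal lines (hypothetical rotated image).
rotated = [
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
]
line_rows = find_line_rows(rotated)
cleaned = remove_rows(rotated, line_rows)
print(line_rows)  # [1, 3]
```

Clearing a whole row also erases the character pixels on it, which is why the cells on those rows are excluded from the later comparison.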

Next, in the same manner as when the second dictionary 13b was created, the character pattern with the oblique lines erased is normalized (its size is adjusted so that it can be compared with the registered patterns of the dictionary) and its feature vector is obtained (S204). Normalization here means linearly scaling the pattern to the size of the dictionary's registered patterns. When the image is divided into a 9×9 grid as in this example, the feature vector has 81 dimensions.
The features of the cells where oblique lines were drawn differ greatly from those of the first dictionary 13a, the standard dictionary, so the features of the hatched parts would adversely affect the recognition result.

The CPU 16 therefore removes the cells containing oblique lines from the feature vectors of both the registered patterns of the second dictionary 13b and the input pattern of the reshaped image, and performs the comparison with the second dictionary 13b using the reduced vectors.

In other words, the CPU 16 compares the feature vector of the character image from which the oblique lines were removed with the feature vectors of the second dictionary 13b, using only the components that exclude the cells containing oblique lines.
FIG. 14 shows an example of a feature vector with the hatched parts removed. In this example, the cells that contained oblique lines are rows P, namely the 2nd, 4th, and 7th rows (one row each). The features of 3×9 cells, that is, 27 dimensions, are therefore removed from the 81-dimensional feature vector, leaving a 54-dimensional feature vector for comparison.
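The reduced comparison can be sketched as below. Note the substitution: the patent names the multiple similarity method, whereas this sketch uses plain cosine similarity purely to keep the example short. The function names and the toy dictionary are hypothetical.

```python
from math import sqrt

GRID = 9

def kept_indices(line_rows, grid=GRID):
    """Indices of feature cells outside the grid rows (1-based) that held a line."""
    skip = {row - 1 for row in line_rows}
    return [r * grid + c for r in range(grid) if r not in skip for c in range(grid)]

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def recognize(features, dictionary, line_rows):
    """Return the label whose reduced dictionary vector is most similar to the input."""
    keep = kept_indices(line_rows)
    reduced = [features[i] for i in keep]
    return max(dictionary,
               key=lambda label: cosine(reduced, [dictionary[label][i] for i in keep]))

# Toy 81-dimensional vectors: "3" is marked in cell 0, "5" in cell 80.
dictionary = {"3": [1.0] + [0.0] * 80, "5": [0.0] * 80 + [1.0]}
# Input resembles "3" but carries line residue across the whole 2nd grid row.
features = [1.0] + [0.0] * 8 + [1.0] * 9 + [0.0] * 63
print(len(kept_indices([2, 4, 7])))          # 54
print(recognize(features, dictionary, [2, 4, 7]))  # 3
```

Dropping rows 2, 4, and 7 leaves 81 − 27 = 54 components, matching the count given for FIG. 14, and the residue in row 2 no longer influences the match.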

By removing the feature components of the hatched parts in this way and performing character recognition with the remaining components (S205), character recognition can be carried out without the influence of the oblique lines.

If, instead, all the cells touched by oblique lines were removed from the feature vector without rotating the character, the oblique lines would cover many cells, and most of the features might become unusable for recognition.

By rotating the image 45 degrees so that the oblique lines become horizontal, however, the cells removed for each line can be limited to one or two horizontal rows, which has the advantage of preserving a certain number of feature vector dimensions.
FIG. 15 shows an example in which the features of the hatched parts are removed without rotation. In this example, the 42 cells shown filled with hatching are affected by the oblique lines behind the digit, leaving only 39 dimensions usable for character recognition. The amount of information available for comparison clearly decreases, so recognition accuracy falls.

Thus, according to the identification card recognition system of this embodiment, a recognition dictionary dedicated to hatched characters (the second dictionary 13b) is created and registered in advance from character patterns rotated by 45 degrees. A feature vector is obtained from the input image after the input pattern is rotated by 45 degrees and the oblique lines are removed, and character recognition is performed by comparing it with the feature vectors of the second dictionary 13b, with the components at the oblique-line positions excluded. As a result, the characters in the hatched part of the license number can be recognized even when that part is a monochrome image.

That is, for a hatched character image, the dedicated second dictionary 13b is consulted, and character recognition is performed with feature vectors from which the components covered by oblique lines have been removed from both the input image pattern and the registered patterns of the second dictionary 13b. Character recognition is thus possible even for characters overlaid with oblique lines.

The present invention is not limited to the above embodiment. Although the embodiment was described using the character “日” on a license, other documents such as insurance cards and passports can be recognized in the same way. The character is not limited to “日”; any character that appears multiple times on the identification card to be recognized, such as “年” (year), “月” (month), or “号” (number), may be used.

In the above embodiment, the oblique lines were simply removed from the character image. Thickening the character, that is, adding a few pixels around the black pixel components in the character image, can somewhat mitigate the loss of feature information.
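Thickening of this kind is commonly done with a morphological dilation. The sketch below is one possible illustration using a square neighbourhood; the function name and the `radius` parameter are assumptions, and the patent does not specify how the thickening is performed.

```python
def thicken(image, radius=1):
    """Dilate black pixels (1) of a binary image by `radius` pixels in every direction."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if image[y][x]:
                # Paint the square neighbourhood, clipped at the image border.
                for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                    for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                        out[ny][nx] = 1
    return out

# A single black pixel in the centre of a 5x5 image grows into a 3x3 block.
dot = [[1 if (y, x) == (2, 2) else 0 for x in range(5)] for y in range(5)]
print(sum(map(sum, thicken(dot))))  # 9
```

Thicker strokes spill into neighbouring cells, so some stroke information survives in the rows that remain after the line rows are dropped.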

In the above embodiment, the second dictionary 13b stores the feature vectors of character patterns rotated by 45 degrees in advance in association with text data. If the feature vectors are instead stored with the components of the hatched parts already removed, the step of removing those components can be skipped at recognition time.

FIG. 1 is a diagram showing the configuration of the identification card recognition system of this embodiment.
FIG. 2 is a diagram showing an application form.
FIG. 3 is a diagram showing an image of the application form.
FIG. 4 is a flowchart showing the outline of the operation of the identification card recognition system of FIG. 1.
FIG. 5 is a flowchart showing the details of the license image recognition processing.
FIG. 6 is a diagram for explaining the method of detecting the character “日”.
FIG. 7 is a diagram showing how the orientation of the license is determined from the difference in shape between the upper and lower parts of the character “日”.
FIG. 8 is an enlarged view of the image of the license number portion.
FIG. 9 is a diagram showing the registered contents of the second dictionary dedicated to the hatched part of the license number.
FIG. 10 is a flowchart showing the character recognition processing for hatched character images of the license number.
FIG. 11 is a diagram showing the hatched character image “3” cut out from the license number portion.
FIG. 12 is a diagram showing the character image “3” rotated by 45°.
FIG. 13 is a diagram showing the rotated character image “3” with the oblique lines removed.
FIG. 14 is a diagram showing an example of a feature vector with the hatched parts removed.
FIG. 15 is a diagram showing how the information in the feature vector decreases when the features of the hatched parts are removed without rotation.

Explanation of symbols

1 … application form, 2 … scanner, 10 … computer, 11 … operation unit, 12 … communication I/F, 13 … memory, 13a … first dictionary, 13b … second dictionary, 13c … license format, 14 … display unit, 15 … hard disk device, 16 … CPU.

Claims

Image information acquisition means for acquiring an image from a print surface on which a character string with a diagonal line is printed;
A character image extracting means for extracting an image of a character string with the diagonal line as a background from the images acquired by the image information acquiring means, and cutting out each character unit;
A character image rotating means for rotating the character image cut out by the character image cutting means at an angle at which the oblique line becomes substantially horizontal;
Ray processing means for removing a black pixel component constituting a line in a horizontal direction from the character image rotated by the character image rotating means;
Feature vector extracting means for extracting a feature vector of a character image from which black pixel components constituting a line in the horizontal direction are removed by the ray processing means;
A dictionary storage unit that stores in advance a dictionary associating text data with the feature vector of a reference character image rotated to an angle that matches the angle of the oblique line;
Character recognition means for comparing the feature vector of the character image extracted by the feature vector extracting means with the feature vectors stored in the dictionary storage unit, and outputting the text data whose feature vector matches or approximates it. A character recognition device characterized by the above.

The character recognition device according to claim 1,
The image information acquisition means
Form image information acquisition means for acquiring an image from a form;
Characteristic character detection means for detecting a plurality of characters of at least one kind of “year”, “month”, and “day” from the image of the form acquired by the form image information acquisition means;
Based on the positional relationship of the plurality of characters detected by the characteristic character detection means and a reference position of the character of the identification card set in advance, the expansion rate and / or direction of the image of the identification card is obtained, The character recognition apparatus according to claim 1, further comprising an image extracting unit that extracts an image of an identification card from the inside.

An image information acquisition unit acquiring an image from a printing surface on which a character string with a diagonal line as a background is printed;
A character image cutting means, from the image acquired by the image information acquisition means, extracting a character string image with the diagonal line as a background, and cutting out each character unit;
A step of rotating the character image cut out by the character image cut-out means to an angle at which the oblique line is substantially horizontal;
A ray processing means removing a black pixel component constituting a line in a horizontal direction from the character image rotated by the character image rotating means;
A step of extracting a feature vector of a character image from which a black pixel component constituting a line in a horizontal direction is removed by the ray processing unit;
A dictionary in which the feature vector of the reference character image rotated at an angle that matches the angle of the oblique line and the text data are stored in a dictionary storage unit in advance, and the feature vector extracting unit extracts the dictionary. Character recognition comprising: a step of outputting text data having a feature vector in which a character recognition unit compares and matches or approximates a feature vector of a character image and a feature vector stored in the dictionary storage unit Method.

The character recognition method according to claim 3.
A form image information obtaining unit obtaining an image from the form;
A feature character detection unit detecting a plurality of characters of at least one kind of “year”, “month”, and “day” from the image of the form acquired by the form image information acquisition unit;
Based on the positional relationship between the plurality of characters detected by the characteristic character detection means and a preset reference position of the character of the identification card, the image extraction means obtains the expansion rate and / or direction of the image of the identification card, 4. A character recognition method according to claim 3, further comprising the step of extracting an image of an identification card from a form image.