JP2021149136A

JP2021149136A - Server, method, and program for extracting character string such as serial number

Info

Publication number: JP2021149136A
Application number: JP2020045108A
Authority: JP
Inventors: 敏郎松村; Toshiro Matsumura; 敬宇蓑和; Yoshitaka Minowa
Original assignee: ISP KK
Current assignee: ISP KK
Priority date: 2020-03-16
Filing date: 2020-03-16
Publication date: 2021-09-27
Anticipated expiration: 2040-03-16
Also published as: JP6878739B1

Abstract

PROBLEM TO BE SOLVED: To provide a server, a method, and a program for extracting target character strings such as serial numbers from a free-form (non-fixed) format. SOLUTION: A character string extraction server extracts a plurality of target character strings contained in unspecified regions of an image. The server includes: rectangle detection means that detects, from a binarized image of the image, a plurality of closed areas, each being a region surrounded by margin, and detects a rectangle for each closed area; skew angle detection means that detects the skew angle of the character string inside each detected rectangle; skew correction means that, for each rectangle, detects a rectangle in an image rotated according to the skew angle; and character recognition means that recognizes the character string in the interior of the rectangle and/or the rectangle of the rotated image. The recognized character strings include the plurality of target character strings. SELECTED DRAWING: Figure 2

Description

The present invention relates to a character string extraction system, and in particular to a server, a method, and a program for extracting a target character string such as a serial number from a free-form (non-fixed) format.

Conventionally, a campaign application management system has been proposed for managing applications to a campaign that can be entered either by sending a code from a terminal or by attaching the code to a prescribed form and mailing it. In that system, a data extraction unit reads, by image recognition, a code indicating a serial number from the application form image acquired by an image reading unit, decodes the image data of the read code, and extracts the mailed serial number represented as text data (Japanese Unexamined Patent Publication No. 2016-091113: Patent Document 1).

As a mobile terminal device that can properly extract a serial code even when digits, characters, and the like other than the serial code are mixed into the image data, it has also been proposed to extract, as the serial code, the portion that matches format information stored in an information storage unit from image data in which the serial code and various surrounding symbols such as characters and digits appear (Japanese Unexamined Patent Publication No. 2018-136830: Patent Document 2).

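Patent Document 1: Japanese Unexamined Patent Publication No. 2016-091113
Patent Document 2: Japanese Unexamined Patent Publication No. 2018-136830
Patent Document 3: Japanese Patent No. 5940615
Patent Document 4: Japanese Patent No. 6353893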

Campaigns for sales promotion and similar purposes are becoming more varied. In one form, participants collect several application tickets attached to products (or their packaging), such as stickers or other slips of paper printed with a serial number unique to each point, and receive a prize (or eligibility for a prize draw) on reaching a prescribed number of points. In another, the points of application tickets collected over time are added up successively and then redeemed for a draw on a set date. Such campaigns can strongly stimulate consumers' willingness to buy, but tallying the points tends to become cumbersome. To reduce manual work such as visual checks and manual entry, one could embed an OCR (Optical Character Recognition) engine in a web application to recognize the serial codes. In general, however, successful recognition is difficult or the supported formats tend to be restricted, so this approach has little practical effect.

A campaign support system is therefore desirable that, across the various forms of campaign, imposes little effort both on those who apply and on those who accept the applications.

In view of the above, an object of the present invention is to provide a system that extracts character strings such as serial numbers from an image in which a purchaser of a product has photographed application tickets for a campaign. In particular, the object is to provide a server, a method, and a program that can extract the target character strings regardless of format, where the image may contain any number of application tickets, from one to several.

One aspect of the present invention for solving the above problem is a character string extraction server for extracting a plurality of target character strings contained in unspecified regions of an image, comprising: rectangle detection means that detects, from a binarized image of the image, a plurality of closed areas, each being a region surrounded by margin, and detects a rectangle for each closed area; skew angle detection means that detects the skew angle of the character string inside each detected rectangle; skew correction means that, for each rectangle, detects a rectangle in an image rotated according to the skew angle; and character string recognition means that recognizes the character string in the interior of the rectangle and/or the rectangle of the rotated image, wherein the recognized character strings include the plurality of target character strings.

According to the present invention, the region of a target character string located in an unspecified part of the image can be detected by exploiting the fact that a target character string such as a serial number has margin around it. The inclination of the character string is also detected from the margins and the like, and corrected. As a result, characters can be recognized successfully even from an image in which several separately collected application tickets have been laid out arbitrarily and photographed. A campaign that uses multiple application tickets can therefore be run easily, without restricting the format or spending effort on tallying. Applicants, for their part, can register multiple serial numbers on the campaign site simply by photographing any number of application tickets with a smartphone or the like, without having to download an application sheet from the campaign site, and can thus take part in the campaign easily.

Preferably, the character string recognition means determines, from the aspect ratio of each rectangle and/or rectangle of the rotated image, whether the character string inside that rectangle includes a target character string, and recognizes the character string in the interior of only those rectangles determined to include one. According to the present invention, the rectangular regions that contain target character strings such as serial numbers are extracted with a largely uniform shape as a result of skew correction and the like. Regions containing a target character string can therefore be selected on the basis of the rectangle's aspect ratio alone. This limits the OCR processing range without relying on format information, so the target character strings can be extracted efficiently.
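
The aspect-ratio test described above can be illustrated with a minimal Python sketch. The tuple layout `(left, top, right, bottom)`, the function names, and the ratio bounds `min_ratio` and `max_ratio` are assumptions made for illustration; the specification does not give concrete values.

```python
def aspect_ratio(rect):
    """Width divided by height of a rectangle given as (left, top, right, bottom)."""
    left, top, right, bottom = rect
    return (right - left) / (bottom - top)


def filter_by_aspect_ratio(rects, min_ratio=2.0, max_ratio=8.0):
    """Keep only rectangles whose aspect ratio lies in [min_ratio, max_ratio].

    The bounds are hypothetical; in practice they would be tuned to the
    printed layout of the serial numbers in a given campaign.
    """
    kept = []
    for rect in rects:
        left, top, right, bottom = rect
        if bottom <= top or right <= left:
            continue  # skip degenerate rectangles
        if min_ratio <= aspect_ratio(rect) <= max_ratio:
            kept.append(rect)
    return kept


candidates = [(0, 0, 200, 50), (0, 0, 50, 50), (10, 10, 410, 30)]
ocr_targets = filter_by_aspect_ratio(candidates)
```

Only the rectangles that pass the filter would then be handed to the OCR step, which is how the OCR processing range is narrowed without format information.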

The target character string may be a character string with margin around it, composed of a combination of characters including digits, alphabetic letters, kana characters, and symbols. The character string may occupy one line (one column) or two or more lines (columns), and the skew angle is detected from the margins above and below the character string and from the line spacing, according to the layout.

The binarized image may be generated as follows. The frequency of pixels in the grayscale image of the image whose luminance is below a predetermined luminance value is counted. For each pixel whose luminance is at or above the predetermined value, the absolute difference in luminance from a neighboring pixel is subtracted from an intermediate value, and the frequency of the result is counted. This produces a histogram in which the luminance values of all pixels are at or below the intermediate value, and a threshold is determined from that histogram.

In this way, a binarized image suited to the analysis according to the present invention is generated even from an image taken under poor shooting conditions, which makes successful character recognition more likely in the end and yields a highly practical system.

The rectangle and/or the rectangle of the rotated image may include a rectangle formed by merging a plurality of rectangles that lie close to one another vertically or horizontally.

Serial numbers and the like are assumed to be printed with sufficient margin from their surroundings (they may be delimited by ruled lines, in which case the ruled lines may be removed and treated as margin), so a detected rectangle often contains only the target character string. Conversely, a rectangle that contains unwanted character strings often does not contain the target character string. When, for example, an application sticker whose serial number spans several lines is photographed at close range, a separate rectangle may be detected for each line, so that each rectangle contains only part of the serial number. When a serial number has several digits, a separate rectangle may be detected for each digit. Because the character strings that make up a serial number are considered to lie close to one another, merging rectangles that are sufficiently close can yield a single rectangle covering the whole serial number.
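
One possible way to merge nearby rectangles is sketched below in Python. The rectangle layout `(left, top, right, bottom)`, the helper names, and the `max_gap` tolerance are assumptions for illustration; the specification only states that sufficiently close rectangles are merged.

```python
def gaps(a, b):
    """Horizontal and vertical gap between two rectangles (0 when they overlap on that axis)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return dx, dy


def union(a, b):
    """Smallest rectangle containing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def merge_close_rects(rects, max_gap=5):
    """Repeatedly merge any two rectangles whose gaps are both within max_gap pixels."""
    rects = list(rects)
    merged = True
    while merged:
        merged = False
        for i in range(len(rects)):
            for j in range(i + 1, len(rects)):
                dx, dy = gaps(rects[i], rects[j])
                if dx <= max_gap and dy <= max_gap:
                    rects[i] = union(rects[i], rects[j])
                    del rects[j]
                    merged = True
                    break
            if merged:
                break  # restart the scan, since the merged rectangle may now reach others
    return rects


# Two lines of one serial number (3 px apart) and an unrelated rectangle far to the right.
lines = [(0, 0, 100, 20), (0, 23, 100, 43), (300, 0, 400, 20)]
result = merge_close_rects(lines)
```

The scan restarts after every merge because the enlarged rectangle can come within range of rectangles that were previously too far away. This is quadratic in the number of rectangles, which is acceptable for the handful of tickets in a single photograph.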

Preferably, the rectangle of the rotated image is detected from coordinates obtained by rotating, on the basis of a line segment interpolation table, the coordinates that define the rectangle detected by the rectangle detection means. This allows the OCR processing range to be detected with very little computation.

Another aspect of the present invention is a character string extraction method for extracting a plurality of target character strings contained in unspecified regions of an image, comprising: a rectangle detection step of detecting, from a binarized image of the image, a plurality of closed areas, each being a region surrounded by margin, and detecting a rectangle for each closed area; a skew angle detection step of detecting the skew angle of the character string inside each detected rectangle; a skew correction step of detecting, for each rectangle, a rectangle in an image rotated according to the skew angle; and a character recognition step of recognizing the character string in the interior of the rectangle and/or the rectangle of the rotated image, wherein the recognized character strings include the plurality of target character strings.

Yet another aspect of the present invention is a program that causes a computer to execute the above character string extraction method.

The present invention provides a highly versatile character string extraction server, method, and program that can support campaigns of various forms. Even for a campaign that requires tallying multiple application tickets, serial codes and the like can be extracted automatically from a photograph of any number of application tickets. The party accepting applications can therefore run the campaign easily, and applicants can take part easily.

Further, according to the present invention, a series of processes is carried out so that characters are recognized successfully whatever the condition of an image taken by an ordinary user. Target character strings such as serial numbers can thus be extracted with high accuracy even from an image in which any number of application tickets have simply been laid out arbitrarily, which provides practical support for the campaign.

FIG. 1 schematically shows a campaign support system according to the present invention.
FIG. 2 is a functional block diagram of the character string extraction server according to the present invention.
FIG. 3 is an example of an image taken with a user terminal.
FIG. 4A is a binarized image of a comparative example.
FIG. 4B is a binarized image of an embodiment according to the present invention.
FIG. 5 is an example of a histogram used in the binarization method according to the present invention.
FIG. 6A schematically shows a binarized image.
FIG. 6B schematically shows horizontal blocking of the binarized image.
FIG. 6C schematically shows the detection of white-background rectangles.
FIG. 6D schematically shows the detection of a closed area.
FIG. 6E schematically shows the detection of a rectangle.
FIG. 7 schematically shows the detection of other closed areas and rectangles.
FIG. 8 schematically illustrates the inclination (skew) of a character string relative to the OCR direction.
FIG. 9 schematically shows a plurality of scan pixel regions determined by a line segment interpolation algorithm.
FIG. 10 schematically illustrates detection of the skew angle by scanning the scan pixel regions.
FIG. 11(a) schematically illustrates the generation of a rotated image according to the skew angle. FIG. 11(b) schematically shows the detection of a rectangle in the rotated image.
FIGS. 12(a) and 12(b) illustrate the generation of a rotated image and the detection of a rectangle based on a line segment interpolation table.

Various features of the present invention are described below with reference to the drawings, together with preferred embodiments that are not intended to limit the invention. The drawings are simplified and schematic for the purpose of explanation.

Referring to FIG. 1, the campaign support system 1 includes a character string extraction server 100 according to the present invention, a campaign server 200, and one or more user terminals 300. A user terminal 300 can capture an image I containing application tickets in order to apply to a campaign. Over a communication network 2 such as the Internet, the campaign server 200 can communicate with the user terminal 300, and the character string extraction server 100 can communicate with the campaign server 200.

The character string extraction server 100 is equipped with a central processing unit (CPU), RAM, ROM, a hard disk, and the like, executes programs under the control of a suitable operating system (OS), and provides functional means for carrying out various processes. The programs stored on the server are built with HTML, JavaScript (registered trademark), native programs (object code), and the like. The server 100 may be provided as an ASP, and users can access and use the server 100 through a Web browser, dedicated client software, or the like. Alternatively, the present invention can be incorporated into the campaign support system 1 as a character string extraction library.

The campaign server 200 may be an external server, consisting of a known server device, that manages various information related to the campaign. The campaign server 200 may issue a user ID and passcode in response to access to the campaign site from a user terminal 300 and acquire an image I taken with the camera of the user terminal. Alternatively, the operator of a campaign who has received postcards with application stickers affixed may acquire the image I by photographing the postcards. The campaign server 200 can send the image I to the character string extraction server 100 and receive the analysis result. It can also, as appropriate, check the result against a list of valid serial numbers stored in a database (not shown), check for duplicate applications, and tally each user's application points using a point list or the like linked to the serial numbers in order to hold a prize draw and so on. The external server is not limited to a campaign server and may be any computer device that communicates with the character string extraction server 100 according to the present invention.

The user terminal 300 may be a known high-function terminal such as a smartphone, or a tablet-type high-function terminal, equipped with a camera device and browser software.

FIG. 2 shows a functional block diagram of the character string extraction logic implemented in the character string extraction server 100. The character string extraction server 100 includes binarization means 10, closed area and rectangle detection means 20, skew angle detection means 30, skew correction means 40, target character string extraction means 50, and storage means 60. From an image I received from the campaign server 200 (or directly from a user terminal 300) showing any number of application tickets, from one to several, it extracts the target character strings such as the serial numbers printed on each ticket, outputs the result (a set of character strings), and returns it to the campaign server 200. Each of the means 10 to 60 of the character string extraction server 100 can be implemented in software, firmware, hardware, or any combination of these.

In the campaigns envisaged by the present invention, the number of application tickets is generally arbitrary (one to several), and their arrangement in the image may also be arbitrary. For example, a user can access the campaign site 200 from a user terminal 300 such as a smartphone, follow the on-screen instructions, and send the campaign site an image I of the collected application tickets. FIG. 3 shows an exemplary image taken with the camera of a user terminal 300. In this example, the user has arranged (affixed) ten application tickets in two vertical columns on a whitish surface (a table, the back of a flyer, or the like) and photographed them. The shadow of the user terminal appears in the image, mainly around the left corner, and some of the tickets are tilted relative to the vertical and horizontal directions of the image. Such cases are unlikely to be rare in images that users take with smartphones and other mobile terminals. If this image is binarized and processed by OCR in the conventional way, successful extraction of the serial numbers cannot be expected. The result may be frequent recognition errors, a need for visual checks and manual entry after all, or repeated requests to the user to resend the image until OCR succeeds. Such a system is inconvenient and undesirable.

The binarization means 10 performs preprocessing, including binarization, so that the required characters can ultimately be recognized accurately even from an image I such as that in FIG. 3. For example, the binarization means 10 scales the image I as appropriate, trims it where necessary so that its dimensions are divisible by a predetermined block size, and applies filters as appropriate to generate a binarized image for analysis.

To illustrate conventional binarization, when the image in FIG. 3 is binarized with Otsu's method, the application tickets lying in the shadowed area are filled in black, as shown in FIG. 4A. Even if the threshold is shifted toward the dark side, the shadow fades only at the cost of making the characters in the bright areas too faint. Various binarization methods other than Otsu's method exist. With, for example, a dynamic threshold method that divides the image and computes a threshold for each part, however, misalignment can arise at the boundaries depending on the division size and other factors, and the figure (foreground) may not be detected accurately.

The inventors of the present invention devised a binarization method that can easily separate ground (background) from figure (foreground) even in an image that contains shadows. With this method, as shown in FIG. 4B, the serial numbers in the shadow are not filled in black.

The binarization according to the present invention is described with reference to FIG. 5. First, pixels (black) whose luminance is below a predetermined value (for example, below 16 or below 32) are extracted from the pixels that make up the grayscale image (256 levels) of the image I. This allows black parts (serial numbers printed in black, and so on) to be extracted even inside a shadow. The predetermined luminance value may be determined by measuring in advance, on a sample image such as that in FIG. 3, the brightness threshold of the pixels to be kept as black. For each pixel (p) at or above the predetermined luminance value (for example, 16 or more, or 32 or more), the calculation "p = intermediate value − |−s + p|" is performed using, for example, the pixel (s) to its upper left. The smaller the difference in brightness between the two pixels, the closer the result is to the intermediate value (for example, 128), and the larger the difference, the closer it is to 0 (results below 0 are corrected to 0). FIG. 5 is the histogram calculated in this way (horizontal axis: luminance value, vertical axis: frequency), in which all pixels are distributed at or below the intermediate value (128). By determining a threshold from this histogram with, for example, Otsu's method and generating the binarized image, black pixels can be extracted both in dark areas and in bright areas.
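
The calculation described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the default values `dark=32` and `mid=128`, the treatment of border pixels (which have no upper-left neighbor and are treated here as flat), and the convention that 1 means black in the output are all choices made for this example.

```python
def emboss_transform(gray, dark=32, mid=128):
    """Map each pixel to a value in [0, mid] using its upper-left neighbor.

    Pixels darker than `dark` keep their value; other pixels become
    mid - |p - s|, clamped at 0, where s is the upper-left neighbor.
    """
    height, width = len(gray), len(gray[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            p = gray[y][x]
            if p < dark:
                out[y][x] = p
            else:
                # Assumption: border pixels use themselves as the neighbor.
                s = gray[y - 1][x - 1] if y > 0 and x > 0 else p
                out[y][x] = max(0, mid - abs(p - s))
    return out


def otsu_threshold(hist):
    """Return the level t that maximizes between-class variance (class 0 = values <= t)."""
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0
    sum0 = 0
    for t, count in enumerate(hist):
        w0 += count
        sum0 += t * count
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(gray, dark=32, mid=128):
    """Return a binary image (1 = black, 0 = white) from a 0..255 grayscale image."""
    transformed = emboss_transform(gray, dark, mid)
    hist = [0] * 256
    for row in transformed:
        for value in row:
            hist[value] += 1
    t = otsu_threshold(hist)
    return [[1 if value <= t else 0 for value in row] for row in transformed]


# Bright rows (200) above shadowed rows (60), with one dark stroke pixel (10) in each.
gray = [
    [200, 200, 200, 200],
    [200, 10, 200, 200],
    [60, 60, 60, 60],
    [60, 60, 10, 60],
]
transformed = emboss_transform(gray)
binary = binarize(gray)
```

In this toy image the stroke pixels come out black in both the bright and the shadowed rows, while the flat shadow background comes out white. Because the transform responds to luminance differences, the boundary of the shadow itself also maps to low values in this sketch.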

Emboss filters suitable for the binarization are not limited to one that uses the upper-left pixel (s). The more neighbors are used, the more strongly contours (edges) are emphasized, and the fewer, the more weakly. For example, to suit the skew angle detection described later, the second derivative over the four pixels above, below, left, and right can be taken (4-neighborhood). Alternatively, to suit the closed area detection described later, the two neighbors at the upper right and lower left can be used. Other binarization emboss types, such as the four corner pixels of a matrix of size 3 × 3 or the like, or all eight neighboring pixels, can also be used as appropriate to generate the histogram.

The closed area and rectangle detection means 20 divides the binarized image for analysis into blocks of a fixed size, detects closed areas, and detects rectangles on the basis of the closed areas. Detecting closed areas has the following advantages. Known OCR engines generally require the range of the image to be read to be set for each format. Treating the whole image as the OCR processing range is undesirable, because unwanted characters and patterns are also processed as characters, and the number of readable characters may be exceeded, causing an error. If the number and positions of the application tickets in the image are arbitrary, the OCR processing range cannot be set in advance. The present invention can detect the OCR processing range from the image by exploiting the fact that the printed characters (serial numbers and the like) have margin around them. In this specification, a closed region surrounded by margin is sometimes referred to as a "closed area".

The detection of closed areas and rectangles is described with reference to FIGS. 6A to 6E. FIG. 6A shows a part of the binarized image for analysis Ia that contains two application tickets. In the illustrated example, as a result of binarization, the seven-digit serial number printed in two rows on each ticket, the frames above and below it, and characters other than the serial number, such as an automated voice reception number, are black pixels, and the remaining parts are white pixels.

FIG. 6B shows horizontal blocking, in which the image Ia is divided in the horizontal (width W) direction into blocks with a block size (Δw) of a few pixels (2, 3, 4, 5, 6, or any larger integer number of pixels). A span is detected as a white blank when all of the pixels across the block size in the horizontal direction are white. This makes features that extend horizontally (margins, line spacing, and so on) easier to detect even in an image in which character strings and the like are (partly) tilted. Within each block (B1 to B14), each run of consecutive white blanks is detected as a white-background rectangle.
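
The horizontal blocking step can be sketched as follows in Python. The binary image is assumed to be a list of rows with 1 for black and 0 for white, its width is assumed to be a multiple of the block size (the preprocessing trims the image for this), and rectangles are returned as `(left, top, right, bottom)` with exclusive right and bottom edges. These conventions and the function name are illustrative assumptions.

```python
def white_rectangles(binary, block_w):
    """Detect runs of all-white row spans inside each vertical strip of width block_w."""
    height, width = len(binary), len(binary[0])
    rects = []
    for left in range(0, width - block_w + 1, block_w):
        right = left + block_w
        run_start = None  # first row of the current run of white blanks, if any
        for y in range(height):
            is_blank = not any(binary[y][left:right])  # every pixel in the span is white
            if is_blank and run_start is None:
                run_start = y
            elif not is_blank and run_start is not None:
                rects.append((left, run_start, right, y))
                run_start = None
        if run_start is not None:
            rects.append((left, run_start, right, height))
    return rects


binary_image = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
]
found = white_rectangles(binary_image, block_w=2)
```

Here the first strip (columns 0 and 1) yields two white-background rectangles separated by the row containing a black pixel, and the second strip (columns 2 and 3) yields one.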

The many white rectangles in FIG. 6C are the detected white rectangles R. The gray portions are those that were not extracted as white blanks because the span of the block size contained one or more black pixels. For each white rectangle R, the upper-left and lower-right coordinates (raster data), taking the upper-left corner of the image as the origin, can be stored in the storage means 60 such as a memory.

Closed areas are detected by examining how the white rectangles in adjacent blocks connect. A white rectangle in one block can be regarded as connected to a white rectangle in an adjacent block when the two touch. Even when white rectangles in adjacent blocks do not touch, they may be regarded as connected if a predetermined condition is satisfied. The predetermined condition can be set in advance as a connection tolerance.
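One way to express the connection test is sketched below. The specification does not define the tolerance condition, so the row-gap test used here, the parameter name `tol`, and the function name `connected` are assumptions for illustration only. Rectangles are (x0, y0, x1, y1) with inclusive corners.

```python
def connected(a, b, tol=0):
    """Decide whether white rectangles a and b in neighbouring blocks connect.

    tol is a hypothetical connection tolerance in rows: 0 means the two
    rectangles must share at least one row; larger values accept small gaps.
    """
    # The rectangles must sit in horizontally adjacent blocks.
    adjacent = a[2] + 1 == b[0] or b[2] + 1 == a[0]
    if not adjacent:
        return False
    # gap <= 0 when the vertical extents overlap in at least one row.
    gap = max(a[1], b[1]) - min(a[3], b[3])
    return gap <= tol
```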

FIG. 6D shows, in gray, an example of a series of white rectangles representing a closed area. A closed area is a closed region surrounded by white background. For example, suppose one white rectangle in a block connects to a plurality of (for example, two) white rectangles in the following block, these continue to connect through the subsequent blocks without interruption, and they finally connect to a single common white rectangle. The series of white rectangles then encloses a closed region. The logic described in Patent Document 4, for example, can be used as the closed area detection logic, but this is not a statement that the program logic according to the present invention is publicly known.

FIG. 6E schematically shows the rectangles (broken lines) detected from each of the detected closed areas. The y-coordinates of the top and bottom sides and the x-coordinates of the left and right sides of each rectangle are stored in the storage means 60. For detection by horizontal blocking, for example, the x-coordinate of the left side may be the x-coordinate of the lower-right corner of the left white rectangle, the x-coordinate of the right side may be the x-coordinate of the upper-left corner of the right white rectangle, the y-coordinate of the top side may be the minimum of the lower-right y-coordinates of the upper series of white rectangles, and the y-coordinate of the bottom side may be the maximum of the upper-left y-coordinates of the lower series of white rectangles. The interior of a rectangle detected in this way can be taken as the target of OCR processing.

The above describes the detection of closed areas and rectangles based on horizontal blocking of the image Ia. Preferably, the image Ia is additionally divided into blocks in the vertical (height H) direction with a predetermined block size Δh, and closed areas and rectangles are detected (not shown). For detection by vertical blocking, for example, the x-coordinate of the left side may be the minimum of the lower-right x-coordinates of the left series of white rectangles, the x-coordinate of the right side may be the maximum of the upper-left x-coordinates of the right series of white rectangles, the y-coordinate of the top side may be the lower-right y-coordinate of the upper white rectangle, and the y-coordinate of the bottom side may be the upper-left y-coordinate of the lower white rectangle. The rectangles detected by horizontal blocking and those detected by vertical blocking may each be OCR processed. Alternatively, it may be determined whether a rectangle detected by horizontal blocking overlaps a rectangle detected by vertical blocking. If they overlap, a single rectangle is composed from the minimum and maximum of the left and right x-coordinates and of the top and bottom y-coordinates of the overlapping rectangles, and the composed rectangle may be subjected to OCR processing or the like.
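The composition of overlapping rectangles described above amounts to taking a bounding box. The following Python sketch illustrates it under assumed conventions: rectangles are (left, top, right, bottom) tuples, and the helper names `overlaps` and `merge_if_overlapping` are hypothetical.

```python
def overlaps(a, b):
    """True when rectangles a and b, given as (left, top, right, bottom), intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def merge_if_overlapping(h_rect, v_rect):
    """Compose one rectangle from a horizontal-blocking result and a
    vertical-blocking result, or return None when they do not overlap."""
    if not overlaps(h_rect, v_rect):
        return None
    return (
        min(h_rect[0], v_rect[0]),  # smallest left x
        min(h_rect[1], v_rect[1]),  # smallest top y
        max(h_rect[2], v_rect[2]),  # largest right x
        max(h_rect[3], v_rect[3]),  # largest bottom y
    )
```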

With the closed area and rectangle detection described above, the OCR processing range can be detected appropriately even when the background behind the application tickets contains a pattern such as the wood grain of a table.

As shown by the gray portions and broken lines in FIG. 7, closed areas and rectangles that do not contain a target character string may also be detected. As described later, because the rectangle of the rotated image is detected, a rectangle containing a target character string tends to have an aspect ratio that corresponds to that character string. By setting such an aspect ratio (which may be a range) in advance, OCR processing can be limited to rectangles that contain a target character string.

For example, when the serial number consists of seven digits in two lines, the rectangle defined by the margin around those seven digits and two lines has a predetermined aspect ratio. For example, the ratio of the width (W) parallel to the OCR direction to the height (H) perpendicular to the OCR direction lies in a predetermined range of 1.3 to 1.8. When an application ticket containing a serial number is tilted in the image relative to the width (height) direction, the rectangle detected from the margin around the serial number does not necessarily fall within the predetermined range. As described later, in the present invention the tilt is detected from the margin around the serial number and the line spacing and is then corrected, so the aspect ratio of a rectangle containing a serial number falls within the predetermined range. A rectangle in which no tilt is detected cannot be skew corrected. Such a rectangle, however, is a portion unsuited to the serial number skew detection of the present invention, that is, a portion that does not contain a target character string, and is therefore generally not considered to hinder the extraction of target character strings such as serial numbers.
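The aspect ratio check can be illustrated with a short Python sketch. The 1.3 to 1.8 range comes from the example in the text. The function name `has_target_aspect`, the (left, top, right, bottom) rectangle convention, and the choice to treat both limits as inclusive are assumptions.

```python
def has_target_aspect(rect, lo=1.3, hi=1.8):
    """Return True when width / height of rect lies within [lo, hi].

    rect is (left, top, right, bottom); width is measured along the OCR
    direction and height perpendicular to it.
    """
    width = rect[2] - rect[0]
    height = rect[3] - rect[1]
    if width <= 0 or height <= 0:
        return False  # degenerate rectangle, nothing to recognize
    return lo <= width / height <= hi
```

A rectangle of 150 by 100 pixels (ratio 1.5) would be passed to OCR, while a square one would be skipped.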

As shown in FIG. 8, the character strings contained in the rectangular regions 31 and 32 may be upright or tilted relative to the OCR direction indicated by the arrow (left to right, top to bottom, etc. in the image). When a character string is tilted (not upright) relative to the predetermined direction, the characters generally cannot be recognized correctly. When an image contains a plurality of application tickets arranged arbitrarily, the tilts (skews) of the character strings may differ from one another. In this specification, the tilt by which a character string is rotated from the reference direction (OCR direction) about the normal vector of the image may be referred to as skew.

The amount of skew of a character string can be detected from the margins and line spacing of the character string. Calculating a horizontal or vertical histogram of an image is a known way to detect the orientation of the line spacing (margins) of the character strings in the image. To generate a histogram in an arbitrary direction, not only horizontal or vertical, one could rotate the image step by step and scan it in a fixed direction each time, but this requires a large amount of computation and tends to load the CPU. Logic that can detect the skew angles of a plurality of regions in an image almost instantly, without burdening the CPU, is preferable.

To this end, scan pixels in all directions, centered on the center of the detected rectangle, are determined in advance. Such scan pixels can be determined by the line segment interpolation method described in Patent Document 3, but this is not a statement that the program logic according to the present invention is publicly known.

Specifically, the center pixel of the detected rectangle is obtained. A plurality of rectangular scan pixel regions are then determined by line segment interpolation and stored in the storage means 60. Each region has the center pixel o, has L₁ pixels on one pair of opposite sides and L₂ pixels on the other pair, and is rotated about the center pixel o in steps of a predetermined angle (θ) over the range of -90 degrees to 90 degrees. With the line segment interpolation algorithm, a series of pixels tilted by an arbitrary angle θ relative to a series of a predetermined number of pixels that includes the center pixel o and lies in the reference direction is determined using only additions and subtractions with a sine and cosine table (trigonometric table). Scan positions in substantially all directions can therefore be determined with little computation.

FIG. 9 shows, in dotted lines, exemplary scan pixel regions S1, S2, and S3, each having the center pixel o of the detected rectangle (gray portion) and rotated from the reference direction (arrow) by 0 degrees, -15 degrees, and -30 degrees, respectively.

A histogram is calculated for each of these rectangular scan regions. FIG. 10 schematically shows the detection of the skew angle based on the histograms created by scanning. For each scan pixel region, the histogram (lower part of the figure) may be a bar graph with L₂ on the horizontal axis and the histogram value (number of black pixels) on the vertical axis. When the histogram contains a run of positions with no black pixels, the angle of that scan pixel region can represent the angle of the line spacing. In the illustrated example, the angle α₁ = 0 degrees, corresponding to scan pixel region S1', is determined as the skew angle for rectangle 31 (broken line), and the angle α₂ = -15 degrees, corresponding to scan pixel region S2, is determined as the skew angle for rectangle 32 (broken line).
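The idea of choosing the angle whose histogram shows an empty run between text lines can be sketched in Python as below. This is a simplified stand-in, not the method of the specification: instead of precomputed scan pixel regions from a line segment interpolation table, it projects each black pixel onto the axis perpendicular to the candidate direction using `math.sin` and `math.cos`. The function names, the list of (x, y) black pixel coordinates, the scoring rule (longest empty run between occupied bins), and the sign convention of the angle are all assumptions.

```python
import math
from collections import Counter


def longest_interior_gap(hist):
    """Longest run of empty bins lying between two occupied bins."""
    occupied = sorted(hist)
    return max((b - a - 1 for a, b in zip(occupied, occupied[1:])), default=0)


def detect_skew(black_pixels, center, candidates):
    """Return the candidate angle (degrees) whose projection histogram has
    the widest empty band, taken here as the line spacing direction."""
    cx, cy = center
    best_angle, best_gap = None, -1
    for angle in candidates:
        s = math.sin(math.radians(angle))
        c = math.cos(math.radians(angle))
        # Histogram of black pixels per scan row of the rotated region.
        hist = Counter(round(-(x - cx) * s + (y - cy) * c) for x, y in black_pixels)
        gap = longest_interior_gap(hist)
        if gap > best_gap:
            best_angle, best_gap = angle, gap
    return best_angle
```

For two parallel strokes tilted by -15 degrees, only the -15 degree candidate leaves a clear empty band between them, so that angle is selected.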

Rectangle 31, whose detected skew angle is within the tolerance of OCR processing (for example, 5 degrees or less, or 3 degrees or less), may be OCR processed as is by the target character string extracting means 50, which includes a known OCR engine. Rectangle 32, whose skew angle is such that successful character reading by OCR cannot be expected, is skew corrected by the skew correction means 40.

To skew correct the rectangular region 32, a rotated image of the rectangular region 32 is generated. If each input point (x, y) of the rectangular region 32 is rotated by (-α₂) degrees about the origin o to obtain the output point (x', y') of the rotated image, some output pixels receive no value, the image becomes faint, and the accuracy of OCR processing may drop. Therefore, so that OCR accuracy does not drop, it is preferable to determine the coordinates (x, y) of the output points of the rotated image from the input points (x', y') of the rectangular region 32.
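The pixel dropout described above is the usual reason for filling a rotated image by inverse (backward) mapping: every output pixel looks up its source pixel, so no output pixel is left empty. The Python sketch below illustrates that reading of the paragraph. It uses nearest-neighbour lookup with `math.sin` and `math.cos` rather than a line segment interpolation table, and the function name `rotate_inverse`, the list-of-rows image format, and the sign convention of the angle are assumptions.

```python
import math


def rotate_inverse(src, angle_deg, fill=0):
    """Rotate src about its center by inverse mapping.

    src is a non-empty list of equal-length rows. For every output pixel the
    inverse rotation is applied to find the source pixel, so the output has
    no holes; pixels whose source falls outside src get `fill`.
    """
    h, w = len(src), len(src[0])
    cx, cy = (w - 1) / 2, (h - 1) / 2
    s = math.sin(math.radians(angle_deg))
    c = math.cos(math.radians(angle_deg))
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = x - cx, y - cy
            # Inverse rotation of the output coordinate gives the source coordinate.
            ix = round(c * dx + s * dy + cx)
            iy = round(-s * dx + c * dy + cy)
            if 0 <= ix < w and 0 <= iy < h:
                out[y][x] = src[iy][ix]
    return out
```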

FIG. 11(a) schematically shows the generation of the rotated image. The skew correction means 40 can determine, by the line segment interpolation method, a rectangular pixel region C' that has the center pixel o, whose four sides each have a number of pixels equal to, for example, the diameter 2r of the circumscribed circle of the detected rectangle 32, and that is rotated about the center pixel o by the skew angle (α₂) relative to the reference direction (the region of the rotated image C). The rectangle 33 of the rotated image can be detected by copying the corresponding pixels of the pixel region C' (FIG. 11(b)).

In the rectangle 33 of the rotated image, the character string is upright relative to the OCR direction, so the serial number that is the target character string can be recognized.

The detection of the coordinates of the rectangle 33 of the rotated image is described with reference to FIG. 12. In FIG. 12, each of the many small rectangles represents a pixel. FIG. 12(a) shows a region C containing the rectangle 32 (the part indicated by the rectangular outer frame), a region C' rotated by α₂ = -15 degrees relative to the region C (the part shown in shades of gray), and the coordinates (x', y') used to determine the rectangle 32. These coordinates (x', y') are the coordinates of the white rectangles that define the closed area. The region C' can be generated based on the pixel carries of the line segment interpolation table. Specifically, for the second pixel located in the direction of angle α₂ from the center pixel o (the first pixel), the X component is the initial value 0.5, which is the X component of the center pixel o, plus the X-direction increment of the hypotenuse extending in the direction of angle α₂, namely Δx (cos α₂). The Y component is the initial value 0.5, which is the Y component of the center pixel o, plus the Y-direction increment of that hypotenuse, namely Δy (sin α₂). Each value is truncated to an integer, and the carry value of the immediately preceding pixel (0 when the preceding pixel is the center pixel) is subtracted from it. This determines the X-direction carry and the Y-direction carry of the second pixel relative to the immediately preceding pixel.
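The carry computation can be illustrated with the following Python sketch, which extends the step described for the second pixel to a whole run of pixels. It is an interpretation, not the table of Patent Document 3: the function name `carry_steps` is hypothetical, the "carry value of the preceding pixel" is read as that pixel's integer position, `math.floor` is used for truncation (which differs from truncation toward zero for negative values), and cos and sin are computed once rather than read from a precomputed table.

```python
import math


def carry_steps(angle_deg, n_pixels):
    """Per-pixel (x_carry, y_carry) along a line from the center pixel.

    Starts at (0.5, 0.5), the center of the first pixel, and adds
    (cos, sin) of the angle for every further pixel. The carry is the
    change in the integer part compared with the preceding pixel.
    """
    dx = math.cos(math.radians(angle_deg))
    dy = math.sin(math.radians(angle_deg))
    x = y = 0.5
    prev_ix = prev_iy = 0          # integer position of the center pixel
    steps = []
    for _ in range(n_pixels - 1):
        x += dx                    # addition only inside the loop
        y += dy
        ix, iy = math.floor(x), math.floor(y)
        steps.append((ix - prev_ix, iy - prev_iy))
        prev_ix, prev_iy = ix, iy
    return steps
```

At -15 degrees the X direction carries on every pixel, while the Y direction carries only once within the first five pixels, when the accumulated value first drops below zero.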

FIG. 12(b) shows the image skew corrected based on the line segment interpolation table (the rotated image), the rotationally transformed coordinates (x, y), and so on. The coordinates relating to the original rectangle are rotationally transformed, and the rectangle 33 of the rotated image is detected by likewise taking the minimum and maximum X and Y coordinates. The internal region of the rectangle 33 is then OCR processed.
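Taking the minimum and maximum of the rotated coordinates can be sketched as follows. This Python example applies a plain trigonometric rotation rather than the line segment interpolation table, and the function name `rotated_bounds`, the point list format, and the sign convention of the angle are assumptions.

```python
import math


def rotated_bounds(points, center, angle_deg):
    """Rotate points about center and return (min_x, min_y, max_x, max_y)."""
    cx, cy = center
    s = math.sin(math.radians(angle_deg))
    c = math.cos(math.radians(angle_deg))
    xs, ys = [], []
    for x, y in points:
        dx, dy = x - cx, y - cy
        xs.append(cx + c * dx - s * dy)   # rotated X coordinate
        ys.append(cy + s * dx + c * dy)   # rotated Y coordinate
    return min(xs), min(ys), max(xs), max(ys)
```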

In the example, the processing times were 250 ms for filtering and closed area detection on one image, 350 ms for detecting about 30 rectangles and analyzing their skew angles (about 12 ms per rectangle), and 2.0 to 3.0 ms for skew correction (rotation) of one rectangle, and each serial number was extracted accurately.

As described above, according to the present invention, applying is simple and tallying is easy even in a campaign that uses multiple (types of) application tickets. The character strings needed for the campaign can be extracted instantly and with high accuracy, yielding a highly practical system.

Those skilled in the art will appreciate that many different modifications are possible without departing from the spirit and aspects of the present invention. Accordingly, it goes without saying that the aspects of the present invention are merely illustrative and do not limit the scope of the present invention.

100 Character string extraction server
10 Binarization means
20 Closed area / rectangle detecting means
30 Skew angle detecting means
40 Skew correction means
50 Target character string extracting means
60 Storage means

Claims

A character string extraction server for extracting a plurality of target character strings contained in an unspecified area of an image, comprising:
rectangle detecting means for detecting, from a binarized image of the image, a plurality of closed areas, each being a region surrounded by a margin, and detecting a rectangle for each closed area;
skew angle detecting means for detecting the skew angle of the character string inside each of the detected rectangles;
skew correction means for detecting, for each rectangle, a rectangle of a rotated image according to the skew angle; and
character string recognition means for recognizing the character string in the internal region of the rectangle and/or the rectangle of the rotated image, wherein the recognized character strings include the plurality of target character strings.

A character string extraction server in which the character string recognition means determines, based on the aspect ratio of the rectangle and/or the rectangle of the rotated image, whether the character string inside each rectangle includes a target character string, and recognizes the character string in the internal region of each rectangle determined to include a target character string.

The character string extraction server according to claim 1, wherein the target character string is a character string that has a margin around it and consists of a combination of characters including numerals, alphabetic letters, kana characters, and symbols.

The character string extraction server according to claim 1, wherein the binarized image is generated by: counting the frequency of pixels in the grayscale image of the image whose luminance value is less than a predetermined luminance value; for pixels in the grayscale image whose luminance value is equal to or greater than the predetermined luminance value, subtracting from the intermediate value the absolute value of the difference in luminance value from neighboring pixels and counting the frequency; creating a histogram in which the luminance values of all pixels are equal to or less than the intermediate value; and determining a threshold value based on the histogram.

The character string extraction server according to claim 1, wherein the rectangle and / or the rectangle of the rotated image includes a rectangle in which a plurality of rectangles adjacent to each other in the vertical and horizontal directions are combined.

The character string extraction server according to claim 1 or 2, wherein the rectangle of the rotated image is detected based on coordinates obtained by rotationally transforming, based on a line segment interpolation table, the coordinates used to determine the rectangle detected by the rectangle detecting means.

A character string extraction method for extracting a plurality of target character strings contained in an unspecified area of an image, comprising:
a rectangle detection step of detecting, from a binarized image of the image, a plurality of closed areas, each being a region surrounded by a margin, and detecting a rectangle for each closed area;
a skew angle detection step of detecting the skew angle of each of the detected rectangles;
a skew correction step of detecting, for each rectangle, a rectangle of a rotated image according to the skew angle; and
a character recognition step of recognizing the character string in the internal region of the rectangle and/or the rectangle of the rotated image, wherein the recognized character strings include the plurality of target character strings.

A program that causes a computer to execute the character string extraction method according to claim 7.