JP2016053763A

JP2016053763A - Image processor, image processing method and program

Info

Publication number: JP2016053763A
Application number: JP2014178559A
Authority: JP
Inventors: Hiroaki Ikeda (池田 裕章)
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2014-09-02
Filing date: 2014-09-02
Publication date: 2016-04-14

Abstract

PROBLEM TO BE SOLVED: When characters are to be read without omission from an image captured by a digital camera or the like, the entire image must be analyzed, so the processing time increases rapidly as the image size grows.

SOLUTION: An image processor suppresses omissions in character area extraction and shortens the processing time by including: detection means for detecting a feature area from an input image; determination means for determining, on the basis of the position and size of the detected feature area, a first search range in the input image and a second search range different from the first search range; reduction means for reducing the second search range at a prescribed reduction rate; and extraction means for extracting character areas from an image of the first search range and from the reduced image of the second search range, using the data of each pixel constituting the image and the reduced image.

SELECTED DRAWING: Figure 6

Description

The present invention relates to an image processing apparatus, an image processing method, and a program for extracting characters from an image.

Attempts have been made to make large collections of images easier to handle by analyzing captured images and extracting information from them. For example, techniques that identify a photographed subject by similar-image search or image analysis are coming into practical use. Likewise, if a commemorative photograph taken at a tourist site shows a monument or signboard bearing the name of a landmark, the shooting location can be identified by character recognition.

Accordingly, techniques have been disclosed that obtain character information by detecting a character region in a captured image and performing character recognition on it (for example, Non-Patent Document 1 and Non-Patent Document 2). Because it is not easy to extract a character region from a captured image with a non-uniform background, Non-Patent Document 1 identifies the character region by focusing on stroke width. Non-Patent Document 2 selects character pixel blocks from the pixel blocks extracted with MSER (Maximally Stable Extremal Regions), one of the pixel block extraction methods. With MSER, pixels that are present stably across a large number of binary images generated from the captured image can be extracted as pixel blocks.

Accurately extracting a region of interest from images taken under unrestricted shooting conditions has also been difficult, because objects other than the subject of interest appear in the image and the size of the region of interest differs from photograph to photograph. Patent Document 1 and Patent Document 2 disclose techniques that limit the region to be analyzed to a part of the image.

In Patent Document 1, a face region is detected in the image as the region of interest, a body region is then detected from the position of the face, and an object region showing a landmark is extracted from the region that excludes the face and the body. In Patent Document 2, parts of the subject such as the face and hands are detected, the face, the hands, and the space between the face and the hands are set as a restricted region, and feature values are extracted from the region of interest excluding the restricted region.

Patent Document 3 discloses a technique for normalizing the size of a region of interest that differs from photograph to photograph.

In Patent Document 3, before the image is passed to a face detection classifier, the size of the region presumed to be a face is detected in the captured image and compared with a predetermined value to calculate an enlargement or reduction ratio.

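Patent Document 1: Japanese Patent Laid-Open No. 2013-65156
Patent Document 2: Japanese Patent Laid-Open No. 2012-53606
Patent Document 3: Japanese Patent Laid-Open No. 2008-191760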

Non-Patent Document 1: Epshtein et al., "Detecting text in natural scenes with stroke width transform", CVPR 2010, Conference on Computer Vision and Pattern Recognition, pp. 2963-2970.
Non-Patent Document 2: Merino-Gracia et al., "A head-mounted device for recognizing text in natural scenes", CBDAR 2011, Proceedings of the 4th International Conference on Camera-Based Document Analysis and Recognition, pp. 29-41.

With the prior art, however, the region to be analyzed cannot be extracted from an image both quickly and without omission. That is, if the detection accuracy of the region of interest (hereinafter also called the feature region) is insufficient, the region that should be analyzed is not included in the feature region or in the region derived from it, and parts of the image that should be analyzed are missed.

In the example of a marathon event, face detection accuracy is strongly affected by the participants' hats, sunglasses, costumes, and the like, so the face region serving as the feature region may not be detected correctly. Extracting every bib number to be analyzed without omission therefore required analyzing the entire image, which increased the time needed for analysis. In particular, if the apparatus is configured to search the whole image thoroughly for bib number regions in order to raise analysis accuracy, the processing time increases sharply as the number of pixels in the image grows.

The present invention has been made in view of the above problems, and its object is to provide an image processing apparatus, an image processing method, and a program that suppress omissions in character region extraction and shorten the processing time of the extraction processing.

To solve this problem, an image processing apparatus according to the present invention has the following configuration. The image processing apparatus comprises: detection means for detecting a feature region from an input image; determination means for determining, based on the position and size of the detected feature region, a first search range in the input image and a second search range different from the first search range; reduction means for reducing the second search range at a predetermined reduction ratio; and extraction means for extracting character regions from the image of the first search range and from the reduced image of the second search range, using the data of each pixel constituting those images.

According to the present invention, extracting character regions from both the first search range and the second search range suppresses omissions in character region extraction, and reducing the second search range before extracting character regions from it shortens the processing time of the extraction processing.

FIG. 1: Block diagrams showing the hardware configuration and the functional configuration of the image processing apparatus.
FIG. 2: An image showing an example of what the image processing apparatus processes.
FIG. 3: A flowchart of the processing that reads character information from an image in the first embodiment.
FIG. 4: A flowchart of the character region extraction processing in the first embodiment.
FIG. 5: A graph showing the relationship between image size and the processing time of character region extraction in the first embodiment.
FIG. 6: A flowchart of the detailed character region extraction processing in the first embodiment.
FIG. 7: A diagram showing face detection results and search ranges in the first embodiment.
FIG. 8: A diagram showing images of the search ranges after reduction in the first embodiment.
FIG. 9: A diagram showing the mask range and the extended search range in the first embodiment.
FIG. 10: A flowchart of the detailed character region extraction processing in the second embodiment.
FIG. 11: A diagram showing an example of a bib on which markers are printed.

Preferred embodiments for carrying out the present invention are described below with reference to the drawings.

(First embodiment)

FIG. 1A shows an example of the hardware configuration of the image processing apparatus according to the first embodiment. Reference numeral 102 denotes a CPU that performs the processing of the apparatus, 103 a ROM that stores a control program, 104 a RAM that temporarily stores data being processed, and 105 an external storage device such as a magnetic disk. The ROM 103 may store the processing program of the apparatus shown in the flowcharts described later. There may be a plurality of CPUs 102.

Reference numeral 106 denotes a network interface that connects to a LAN or WAN and communicates with remote devices. Reference numeral 108 denotes an operation unit, such as a keyboard, for operating the apparatus, and 107 a display unit that displays the state of the apparatus and information for the operator. The operation unit 108 and the display unit 107 may be integrated, as in a touch-panel liquid crystal monitor.

The processing program of the apparatus shown in the flowcharts described later may be stored in the external storage device 105 or supplied from outside via the network interface 106, and loaded into the RAM 104 under the control of the CPU 102.

These components are connected to a system bus 101.

A general-purpose computer may be used as the hardware of the image processing apparatus of this embodiment.

FIG. 1B is a block diagram showing an example of the functional configuration of the image processing apparatus according to the first embodiment. Assume that an image to be processed has been loaded into the image processing apparatus from an image input unit (not shown). The face region detection unit 112 detects a face region as the feature region. The search range determination unit 113 determines, from the detected face region, a range that is likely to contain a character region and sets it as a search range. The search range determination unit 113 can determine a plurality of search ranges depending on the image being processed.

The image reduction unit 114 reduces the determined search range, and the image analysis unit 115 then analyzes it. The image analysis unit 115 includes a character region extraction unit 116 and a character recognition unit 117. The character region extraction unit 116 extracts character regions from a search range; there may be more than one character region extraction unit 116. The character recognition unit 117 reads characters from a character region. The character information obtained by the image analysis unit 115 is associated with the processed image and stored in a memory or storage device.

FIG. 2 is an example of an image processed by the image processing apparatus of FIG. 1: a photograph of a citizens' marathon or road race in which amateur athletes participate. To provide captured images to those who request them, as data or as printed photographs, the images must be searchable by bib number. Using the image processing apparatus of this embodiment, the bib numbers of the participants appearing in an image such as that of FIG. 2 can be read and recorded in association with the image.

Next, an outline of the processing that the image processing apparatus of this embodiment executes to read character information from an image is described with reference to the flowchart of FIG. 3.

In step S301, an image input unit (not shown) inputs the image to be processed and stores it in the RAM 104. In step S302, the character region extraction unit 116 extracts character regions from the input image. In step S303, the character recognition unit 117 performs character recognition on the images of the character regions extracted in step S302. As a result, all character images in the image are converted into character codes and read as character information. In step S304, the character codes resulting from character recognition are attached to the input image as metadata and stored. The image with the metadata attached is saved in the external storage device 105.

The character region extraction processing in step S302 analyzes the entire image pixel by pixel, detects pixel blocks, and extracts character regions. The approach exemplified in the background art, in which the entire image is analyzed pixel by pixel to detect pixel blocks, prevents loss of accuracy such as extraction omissions, but the analysis processing time tends to increase.

For example, MSER, the pixel block extraction method used in Non-Patent Document 2, merges locations in an image whose luminance values are close into a single region. In this method, neighboring pixels with similar luminance values are connected. To find locations with similar luminance, the image is binarized at a plurality of threshold values to produce a plurality of binary images. Because binarization covers every pixel, when one side of the image becomes n times longer, n^2 times as many pixels must be examined.

Therefore, as one side of the search range becomes longer, the number of pixel accesses within the search range grows as a power of the side length, and the processing time grows in the same way. When the MSER of Non-Patent Document 2 is applied to the entire image to extract pixel blocks, the pixel accesses, and consequently the analysis processing time, grow as a power of the number of pixels on one side (the long side) of the image, as shown in FIG. 5. The ratio of the long side to the short side of the image is assumed to be 3 to 2. In this embodiment, the relationship between the number of pixels on the long side and the processing time shown in FIG. 5 is held in advance in a memory or storage device for a predetermined ratio of long side to short side, and a processing time acquisition unit (not shown) obtains the predicted analysis time from the number of pixels on the long side of a newly processed image. Instead of the relationship shown in FIG. 5, the relationship between the total number of pixels in the image and the processing time may be prepared and held in advance in a memory or storage device and used to obtain the processing time in the same way.
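The lookup performed by the processing time acquisition unit can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the table values, the linear interpolation, and the names `TIME_TABLE` and `predict_time` are assumptions, since the text only states that the relationship of FIG. 5 is stored in advance.

```python
from bisect import bisect_left

# Hypothetical (long-side pixels, seconds) samples standing in for the FIG. 5 curve.
# They follow a quadratic trend and are NOT values taken from the patent.
TIME_TABLE = [(1000, 1.0), (2000, 4.0), (3000, 9.0), (4000, 16.0)]


def predict_time(long_side_px: int) -> float:
    """Estimate the analysis time by linear interpolation in TIME_TABLE."""
    xs = [x for x, _ in TIME_TABLE]
    if long_side_px <= xs[0]:
        return TIME_TABLE[0][1]
    if long_side_px >= xs[-1]:
        return TIME_TABLE[-1][1]
    i = bisect_left(xs, long_side_px)
    (x0, t0), (x1, t1) = TIME_TABLE[i - 1], TIME_TABLE[i]
    return t0 + (t1 - t0) * (long_side_px - x0) / (x1 - x0)
```

Values outside the tabulated range are clamped to the nearest sample here; a real table would need to cover the image sizes actually encountered.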

Next, the processing of this embodiment, which shortens the processing time while maintaining character region extraction accuracy, is described. In this embodiment, a range in which a character region is likely to exist is determined, the image within that range is reduced at an appropriate reduction ratio, and character region extraction is performed on it. Because a character region may also exist outside that range, the part of the input image that remains after removing the likely range is also searched for character regions, so that fewer character regions are missed. Alternatively, the character region search is performed on the part of the input image that remains after removing both the likely range and the regions where no character region can exist.

In this way, the apparatus is configured to generate pixel blocks by pixel-by-pixel analysis and to perform character region extraction. The flow is described with reference to the flowchart of FIG. 6, which takes as its processing target an image of the marathon event or road race shown in FIG. 2 and extracts bib number regions as the character regions.

Before the flowchart of FIG. 6 is described, the processing in step S302, in which the character region extraction unit 116 generates pixel blocks by pixel-by-pixel analysis and extracts character regions, is outlined with reference to the flowchart of FIG. 4.

In step S401, pixel blocks are generated from the input image. Pixel blocks can be generated, for example, by binarizing the image and extracting the black pixels. With the MSER of Non-Patent Document 2, pixel blocks with reduced noise can be extracted by taking the pixel blocks that are present stably across a plurality of binary images binarized at a plurality of levels.

At this point, however, the pixel blocks include non-character blocks, so in step S402 the pixel blocks likely to be characters are identified among those obtained in step S401. Machine learning is one way to decide whether a pixel block is a character pixel block: features such as the size and density of the pixel block are extracted and passed to a classifier trained in advance on those features to distinguish character pixel blocks from others. Every pixel block obtained in step S401 is classified; those not judged in step S402 to be pixel blocks within a character region are regarded as non-characters and excluded from processing in step S403.

Next, in step S404, the pixel blocks selected as character pixel blocks are grouped according to their positional relationship. For example, adjacent pixel blocks separated by no more than half the width of a pixel block are placed in the same group. The value used for this decision may be obtained statistically from the spacing between adjacent pixel blocks. A pixel block excluded in step S403 may also be added to a group if it lies between pixel blocks judged to belong to that group. In this way, the pixel blocks that make up parts of characters are gathered into one group, which becomes a character region candidate.
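The grouping rule of step S404 might look like the sketch below. It is an illustration under simplifying assumptions that the text does not state: blocks are represented by bounding boxes, only the horizontal gap is considered (a single line of text), and the gap is compared with the width of the block to its left. The name `group_blocks` and the `ratio` parameter are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) of a pixel block


def group_blocks(boxes: List[Box], ratio: float = 0.5) -> List[List[Box]]:
    """Group horizontally adjacent blocks whose gap is at most ratio * width."""
    groups: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b[0]):
        if groups:
            px, _, pw, _ = groups[-1][-1]   # previous block in the current group
            gap = box[0] - (px + pw)        # horizontal space between the blocks
            if gap <= ratio * pw:
                groups[-1].append(box)
                continue
        groups.append([box])
    return groups
```

For example, three blocks of width 10 at x = 0, 13 and 40 yield two groups: the first two blocks (gap 3) are merged, while the third (gap 17) starts a new group.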

In step S405, the background of each pixel block group obtained in step S404 is analyzed to determine, among other things, whether the grouped pixel blocks belong to the same character region. On the assumption that the background of a character region is nearly uniform or changes gradually, a circumscribed quadrilateral of the pixel blocks is set, and statistics such as the mean and variance of luminance are computed for the foreground (the pixel blocks inside the quadrilateral) and the background (the remaining pixels inside it). The decision is based on whether these values satisfy reference criteria, which are determined in advance by measuring actual images. Examples are that the difference between the mean luminance of the foreground and that of the background is 10 or more, or that the variance of the background luminance is 70 or less.
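The two example criteria of step S405 can be sketched as a single check. This is only an illustration: requiring both criteria together, taking the absolute difference of the means, and using the population variance are assumptions, and `is_text_background` is a hypothetical name. The default thresholds 10 and 70 are the example values given in the text.

```python
from statistics import mean, pvariance
from typing import Sequence


def is_text_background(fg: Sequence[float], bg: Sequence[float],
                       min_contrast: float = 10.0,
                       max_bg_variance: float = 70.0) -> bool:
    """Return True when foreground/background luminance meets both criteria."""
    # Contrast between the pixel blocks (foreground) and the rest of the box.
    contrast = abs(mean(fg) - mean(bg))
    # A nearly uniform background has a small luminance variance.
    return contrast >= min_contrast and pvariance(bg) <= max_bg_variance
```

Here `fg` and `bg` are the luminance values of the foreground and background pixels inside the circumscribed quadrilateral.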

In step S406, the character regions to be extracted are determined from the pixel block groups and the background information obtained so far. In this step, grouped pixel blocks are further merged with one another and unnecessary pixel blocks are deleted, which fixes the character regions that are finally extracted. The processing described here is only one example of character region extraction, and the extraction is not limited to the method of FIG. 4.

This embodiment, which uses the character region extraction method of step S302 described with reference to FIG. 4, is now described in detail with reference to the flowchart of FIG. 6.

In step S601, it is determined whether the number of pixels of the input image is equal to or less than a preset threshold. If it is, the processing time obtained by the acquisition means (not shown), for example from FIG. 5, is within the predetermined value, so the face region is not detected as a feature region and no reduction is performed; instead, in step S615 the bib number region, which is the character region, is extracted from the entire image. The pixel-count threshold is calculated in advance from the relationship between the processing time and the number of pixels (the processing amount). For example, if the processing time is to be kept within a predetermined value of T seconds, the number of pixels on the long side at which the analysis processing time reaches T seconds is obtained from the storage device holding the values of the graph of FIG. 5, and that number of pixels, Pt, is used as the threshold. If the number of pixels on the long side is Pt or less, the processing finishes within T seconds even when the bib number region is extracted from the entire image.
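The decision in step S601 can be sketched as below. The table values, the rule of taking the largest tabulated size that fits within T, and the names `pixel_threshold` and `needs_region_search` are assumptions for illustration; the text only says that Pt is obtained from the stored values of FIG. 5.

```python
# Hypothetical (long-side pixels, seconds) samples standing in for FIG. 5.
TIME_TABLE = [(1000, 1.0), (2000, 4.0), (3000, 9.0), (4000, 16.0)]


def pixel_threshold(max_seconds: float) -> int:
    """Pt: largest tabulated long side whose analysis time is within max_seconds."""
    fitting = [px for px, sec in TIME_TABLE if sec <= max_seconds]
    return max(fitting) if fitting else 0


def needs_region_search(long_side_px: int, max_seconds: float) -> bool:
    """True when the image exceeds Pt, i.e. processing continues with face detection."""
    return long_side_px > pixel_threshold(max_seconds)
```

With these sample values and T = 5 seconds, Pt is 2000 pixels, so an image with a long side of 1800 pixels is analyzed whole while one of 3500 pixels goes on to face detection.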

If an image exceeding the pixel-count threshold is input in step S601, the face region detection unit 112 performs face detection in step S602. Conventional techniques can be used for face detection. In step S603, the processing branches according to whether a face was detected in step S602. Even when no face is detected in step S602, a bib number may still appear in the image, so the analysis that extracts bib number regions must still be performed.

If no face region is detected as the feature region, in step S614 the image reduction unit 114 calculates a reduction ratio by a predetermined method and reduces the entire image, and the process proceeds to step S615. The reduction ratio is calculated, relative to the size of the bib number (character region) to be extracted, within a range in which the extraction accuracy does not deteriorate.

First, the size of the bib number (character region) to be extracted may be specified directly or determined as follows. The minimum proportion of the image occupied by a participant (the subject) is decided in advance. Assuming that the aspect ratio of the image and the aspect ratio of the bib number are constant, suppose that subjects whose width is at least 20% of the long side of the image are to be extracted as analysis targets.

If the long side of the character region (bib number) averages 40% of the subject's width, character regions whose long side is at least 8% of the long side of the image become analysis targets. For an image whose long side is 3500 pixels, for example, bib numbers whose long side is 280 pixels or more are extraction targets.

Next, suppose that extraction accuracy is ensured when the long side of the character region is at least 80 pixels. The smallest reduction ratio that does not degrade bib number extraction accuracy is then 80/280, or about 29%. This ratio depends on the size of the image being processed: for an image whose long side is 4500 pixels, the same calculation gives 80/360, or about 22%. In other words, the larger the image, the smaller the reduction ratio can be, and the greater the reduction in processing time.
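The arithmetic above can be checked with a short sketch. The function name `whole_image_reduction` and its parameter names are hypothetical, and clamping the result to 1.0 (so the image is never enlarged) is an assumption; the default values 20%, 40% and 80 pixels are the examples given in the text.

```python
def whole_image_reduction(long_side_px: int,
                          subject_ratio: float = 0.20,
                          bib_ratio: float = 0.40,
                          min_bib_px: int = 80) -> float:
    """Smallest scale factor that keeps the smallest target bib at min_bib_px."""
    # Long side of the smallest bib number that must still be extracted.
    smallest_bib_px = long_side_px * subject_ratio * bib_ratio
    return min(1.0, min_bib_px / smallest_bib_px)


print(round(whole_image_reduction(3500), 2))  # 0.29
print(round(whole_image_reduction(4500), 2))  # 0.22
```

Because the processing time grows roughly with the square of the side length, reducing a 3500-pixel image to about 29% of its size cuts the number of pixels to analyze to less than a tenth.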

If a face is detected in step S602, the process proceeds to step S604. FIG. 7 shows face regions 701 and 702 detected by the face region detection unit 112. Note that detection of the face region of the participant on the left side of the image has failed.

In step S604, the search range determination unit 113 estimates, from the size and position of each detected face region, the size and position of the character region containing the bib number, and determines the first search range. Because the average size of a face and the size of a bib number are known, the range in which a bib is likely to exist can be set from the position and size of the face region. For example, the width of the range in which the bib number exists is set to twice the width of the face region, and the first search range extends from the bottom of the face region down to three times the height of the face region. When a plurality of face regions are detected, a first search range is determined for each of them. FIG. 7 shows the first search ranges 703 and 704 determined in this way. In a first search range, uniforms and skin occupy a large proportion of the area, so there is relatively little pixel block noise. The time needed to remove pixel block noise is therefore shorter than for regions outside the first search range.
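The example geometry of step S604 can be sketched as follows. Centering the range horizontally on the face and clipping it to the image bounds are assumptions not stated in the text, and `first_search_range` is a hypothetical name; the factors 2 and 3 are the example values above.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, width, height)


def first_search_range(face: Box, image_w: int, image_h: int,
                       width_factor: float = 2.0,
                       height_factor: float = 3.0) -> Box:
    """Range below the face, width_factor * w wide and height_factor * h tall."""
    x, y, w, h = face
    center_x = x + w / 2
    left = max(0, int(center_x - width_factor * w / 2))
    right = min(image_w, int(center_x + width_factor * w / 2))
    top = min(image_h, y + h)                       # start at the bottom of the face
    bottom = min(image_h, int(top + height_factor * h))
    return (left, top, right - left, bottom - top)
```

For a 40 x 40 face at (100, 50) in a 600 x 400 image, this gives a range 80 pixels wide and 120 pixels tall starting at (80, 90).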

After the first search range is determined, a second search range that includes neither the face region nor the first search range is determined. One method is to mask the face region and the first search range out of the input image and use the remainder as the second search range.

To shorten the processing time further, the first and second search ranges can be made smaller. For example, the first search range can be made smaller by setting it to the range that extends from the bottom of the face region with twice the width and twice the height of the face region. For the second search range, the face region and a range that extends from the bottom of the face region with twice the width and five times the height of the face region are masked out of the input image. This makes the second search range smaller as well.

That is, the size (number of pixels) of the image to be processed can be reduced by identifying more precisely the first search range, where a character region is likely to exist, and at the same time excluding from the second search range the areas where a character region is very unlikely to exist.
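As a rough illustration of the masking approach, the sketch below builds a boolean mask of the second search range and counts the pixels that remain to be processed. The function names and the representation of rectangles as `(x, y, w, h)` tuples are assumptions made for this example; a real implementation would normally use an image library rather than nested lists.

```python
def second_search_mask(img_w, img_h, excluded):
    """Return a row-major grid that is True where a pixel belongs to the
    second search range, i.e. lies outside every excluded rectangle."""
    mask = [[True] * img_w for _ in range(img_h)]
    for (x, y, w, h) in excluded:
        # Clip each excluded rectangle to the image before masking it out.
        for row in range(max(0, y), min(img_h, y + h)):
            for col in range(max(0, x), min(img_w, x + w)):
                mask[row][col] = False
    return mask


def count_pixels(mask):
    """Number of pixels left in the second search range."""
    return sum(sum(row) for row in mask)
```

Passing the face region together with the taller exclusion rectangle (twice the face width, five times the face height) instead of the first search range itself gives the smaller second search range described above.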

In step S605, the image reduction unit 114 determines a reduction ratio for each of the first search ranges 703 and 704. The size of a character region (bib number) at which character extraction accuracy does not deteriorate is acquired in advance from the ROM 103 or the external storage device 105. The size of the character region (bib number) is estimated from the size of each face region detected in step S602, and the reduction ratio is determined from the character region size acquired in advance and the character region size estimated from the face region. For example, if the width of the face region is W and character extraction accuracy does not deteriorate as long as the character region is at least 50 pixels wide, the reduction ratio is 50/W.

Determining the reduction ratio in this way yields, for each of the differently sized first search ranges 703 and 704, a reduction ratio that does not lower character extraction accuracy. By reducing the first search ranges 703 and 704 at their respective ratios, character regions can be extracted in less time without lowering extraction accuracy.
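The ratio calculation of step S605 might look like the sketch below. The function name and the `char_to_face_ratio` parameter are hypothetical; the default of 1.0 reproduces the 50/W example, in which the estimated character region width equals the face width W. Capping the result at 1.0 (never enlarging) is an assumption, not something the description states.

```python
def reduction_ratio_from_face(face_width: float,
                              min_char_width: float = 50.0,
                              char_to_face_ratio: float = 1.0) -> float:
    """Reduction ratio that keeps the estimated character region at least
    min_char_width pixels wide after reduction."""
    if face_width <= 0:
        raise ValueError("face_width must be positive")
    # Estimated width of the bib-number region, derived from the face width.
    estimated_char_width = face_width * char_to_face_ratio
    # Assumption: cap at 1.0 so that small regions are never enlarged.
    return min(1.0, min_char_width / estimated_char_width)
```

A face 200 pixels wide gives a ratio of 0.25 and a face 100 pixels wide gives 0.5, so each first search range is reduced by a different amount while the reduced bib numbers end up at a similar size.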

In addition, so that bib numbers are also extracted without omission from the part of the input image outside the first search range and the face region, a reduction ratio is determined for the second search range, which is the input image excluding the first search range and the face region. The reduction ratio of the second search range can be determined with reference to that of the first search range. For example, when only one face region is detected, the reduction ratio of the second search range is set equal to that of the first search range. When a plurality of face regions are detected, it may be set equal to the reduction ratio of the first search range determined from the largest face region. Alternatively, it may be calculated from the reduction ratios of the plurality of first search ranges.

Alternatively, using the method described for step S614, a reduction ratio that does not lower extraction accuracy may be obtained from the size of the bib number (character region) to be extracted and used as the reduction ratio of the second search range.

For example, when bib numbers whose long side is 320 pixels or more are to be extracted and a long side of 80 pixels is needed to extract a bib number, the minimum reduction ratio of the second search range is 25%. Because the character regions are extracted after the second search range has been reduced at this ratio, the time needed to extract bib numbers of at least the desired size from the second search range is shortened without lowering extraction accuracy.
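Two of the options described for the second search range can be sketched as follows. Both function names are hypothetical, and the cap at 1.0 is an assumption. The first function reproduces the 80/320 = 25% example; the second selects the ratio of the first search range that belongs to the largest detected face.

```python
def ratio_from_target_size(required_px: float, smallest_target_px: float) -> float:
    """Minimum reduction ratio such that a bib number whose long side is
    smallest_target_px still has required_px pixels after reduction."""
    if smallest_target_px <= 0:
        raise ValueError("smallest_target_px must be positive")
    return min(1.0, required_px / smallest_target_px)


def ratio_from_largest_face(face_widths, first_ratios):
    """Reuse the first-search-range ratio that belongs to the widest face."""
    if not face_widths or len(face_widths) != len(first_ratios):
        raise ValueError("need one ratio per detected face")
    largest = max(range(len(face_widths)), key=lambda i: face_widths[i])
    return first_ratios[largest]
```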

As described above, the first and second search ranges are each reduced independently at their own minimum reduction ratio before character regions are extracted, while omissions in character region extraction are suppressed. The extraction therefore takes less time than reducing the entire image at the minimum reduction ratio applicable to the whole image and then extracting character regions.

In step S606, the image reduction unit 114 reduces the image of the first search range determined in step S604 at the reduction ratio determined in step S605, and in step S607 the bib number region is extracted from the reduced partial image. In step S608, it is determined whether bib number regions have been extracted from all the first search ranges; when extraction is complete for all of them, the process proceeds to step S609. FIG. 8 shows the reduced images of step S606. FIG. 8(a) is the original image of the first search range 703, and (c) is its reduced image. Likewise, (b) is the original image of the first search range 704, and (d) is its reduced image. After reduction, the bib number characters are nearly the same size, within the range in which extraction accuracy does not deteriorate.

In step S609, the second search range is reduced at the reduction ratio determined in step S605, and in step S610 the already searched face regions and first search ranges are masked. FIG. 9(a) shows the reduced image after masking. That is, the second search range is the original image excluding the face regions and the first search ranges.

In step S611, bib number regions are extracted from the reduced image of the second search range without analyzing the masked range. In step S612, a determination unit (not shown) determines whether a bib number region obtained from the second search range overlaps the mask range. The determination is made, for example, according to whether the character region (bib number region) touches the first search range and the ratio of its width to its height (aspect ratio, width/height) is smaller than a predetermined value. In that case, only a partial region of the character region has been extracted, so it is highly likely that the bib number has not been extracted in full.

When the character region (bib number region) is determined to overlap the mask range, part of the character region lies in the first search range, and characters of the bib number that could not be extracted because of the masking may be hidden there. In step S613, the second search range is extended and extraction of the bib number region continues. The direction of extension is obtained by estimating the direction of the character line from the characters already extracted; the second search range is extended in that direction toward the adjoining first search range. Alternatively, the participant's bib number may be regarded as nearly horizontal, and the second search range may be extended horizontally toward the first search range. FIG. 9(b) shows the character region extended so that its aspect ratio is at least the predetermined value. As a result, all bib number regions in the image can be extracted.
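The check of step S612 and the horizontal variant of the extension in step S613 can be sketched as below. All names are hypothetical, rectangles are `(x, y, w, h)` tuples assumed to share one coordinate system, and the side on which to grow is chosen by comparing rectangle centres, which is an assumption made for this example.

```python
import math


def touches(a, b):
    """True if rectangles a and b share an edge or overlap."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax <= bx + bw and bx <= ax + aw and ay <= by + bh and by <= ay + ah


def is_truncated(char, first_range, min_aspect):
    """Step S612: the region touches the first search range and is narrower
    (width / height) than a complete bib number is expected to be."""
    _, _, w, h = char
    return touches(char, first_range) and (w / h) < min_aspect


def expand_horizontally(char, first_range, min_aspect):
    """Step S613, horizontal case: widen the region toward the first search
    range until its aspect ratio reaches at least min_aspect."""
    x, y, w, h = char
    target_w = math.ceil(min_aspect * h)
    if target_w <= w:
        return char  # already wide enough
    first_cx = first_range[0] + first_range[2] / 2
    char_cx = x + w / 2
    if first_cx >= char_cx:
        return (x, y, target_w, h)  # first search range is to the right
    return (x + w - target_w, y, target_w, h)  # first search range is to the left
```

The extended rectangle would then be searched again without the mask so that the hidden characters can be recovered.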

FIG. 6 has been described such that the processing from step S609 onward for the second search range is performed after the processing of steps S606 to S608 for the first search range has finished, but the invention is not limited to this order. Once the first and second search ranges have been determined, processing of the first search range and processing of the second search range may be performed in parallel. When a plurality of character region extraction units 116 run in parallel on a plurality of CPUs 102, the processing time for the whole image can be shortened considerably.
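One possible way to run the extraction for several search ranges concurrently is sketched below with Python's standard `concurrent.futures` module. The function `extract_all` and its `extract` callback are hypothetical stand-ins for the character region extraction unit. A thread pool is used to keep the sketch simple; for CPU-bound extraction in CPython a process pool would usually be the more effective choice.

```python
from concurrent.futures import ThreadPoolExecutor


def extract_all(search_ranges, extract, max_workers=4):
    """Apply extract() to every search range concurrently.

    Results are returned in the same order as search_ranges.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(extract, search_ranges))
```

The list passed in would contain the reduced first search ranges followed by the reduced, masked second search range.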

Although FIG. 6 has been described as reducing both the first and second search ranges before extracting character regions, the first search range need not be reduced, depending on its size (image size) in the input image. For example, when the first search range is small in the input image, as with the first search range 704 in FIG. 7, the reduction ratio is large (the image size changes little through reduction), so the first search range need not be reduced.

As described above, for an image with a large number of pixels, processing time can be reduced by dividing the range subjected to character region extraction and reducing the image within a range where accuracy does not deteriorate. Furthermore, setting the search ranges so that the bib number region, which is a character region, is divided as little as possible, and reducing each divided image within a range where accuracy does not deteriorate, shortens the processing time considerably.

(Second Embodiment)
A second embodiment is described below. The configuration of the image processing apparatus of the second embodiment is the same as that of the first embodiment, so its description is omitted.

The flow of processing, executed by the image processing apparatus of this embodiment, for reading character information from an image is described with reference to the flowchart of FIG. 10. Description of processing steps identical to those of the first embodiment is omitted.

In step S1001, it is determined whether the number of pixels of the input image is equal to or less than a preset threshold. If the threshold is exceeded, steps S1002 to S1005 are performed as in the first embodiment: the first and second search ranges are determined, and a reduction ratio is determined for each partial image of the first and second search ranges into which the image has been divided.

In step S1006, the amount of computation for the reduction processing and for extracting bib number regions from the reduced image is calculated both for the case of processing the partial images and for the case of processing a reduced image of the entire, undivided image, and the two are compared. If dividing the image requires less computation, that is, if it shortens the processing time, the process proceeds to step S1007.
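The description does not specify how the amount of computation is calculated. The sketch below uses a simple cost model as an assumption: reduction cost is proportional to the number of source pixels, and extraction cost is proportional to the number of pixels after reduction. The function names and the weights `reduce_cost` and `extract_cost` are hypothetical.

```python
def region_cost(w, h, ratio, reduce_cost=1.0, extract_cost=10.0):
    """Estimated cost of reducing a w x h region at `ratio` and then
    extracting character regions from the reduced image."""
    source_pixels = w * h
    reduced_pixels = (w * ratio) * (h * ratio)
    return source_pixels * reduce_cost + reduced_pixels * extract_cost


def split_is_cheaper(parts, whole):
    """Step S1006: True if processing the partial images costs less than
    processing one reduced image of the whole input.

    parts is a list of (w, h, ratio) tuples; whole is one (w, h, ratio) tuple.
    """
    split_cost = sum(region_cost(w, h, r) for (w, h, r) in parts)
    return split_cost < region_cost(*whole)
```

With such a model, dividing pays off when the second search range can be reduced much more strongly than the whole image would have to be.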

Steps S1007 to S1013 are the same as in the first embodiment: bib number regions are extracted from the first and second search ranges.

If it is determined in step S1006 that dividing the image does not shorten the processing time, the process proceeds to step S1015, where a reduction ratio is calculated by a predetermined method and the entire input image is reduced. The reduction ratio is calculated in the same way as in step S614 of FIG. 6 described in the first embodiment. In step S1016, bib number regions are extracted from the entire input image.

The processing when the number of pixels of the input image is equal to or less than the threshold in step S1001, and the processing when a bib number region overlaps the mask range in step S1013, are the same as in the first embodiment.

As described above, according to the present invention, evaluating the reduction in processing time before processing partial images allows the faster processing path to be selected.

(Third Embodiment)
The above embodiments were described using face detection and bib numbers as an example, but the feature region need not be a face region. In this embodiment, a marker region rather than a face region is detected as the feature region, the first search range is determined from it, and the bib number region, which is a character region, is detected within the first search range. The marker region can be detected by a conventional marker detection method using a marker region detection unit (not shown) in place of the face region detection unit 112.

The marker is a specific pattern printed at the upper-left corner of the bib, as shown in FIG. 11(a) or (b), but its position and shape are not limited to this. It suffices that the position of the marker relative to the bib number is known in advance. Because the shape and size of the marker pattern are fixed, a first search range that is likely to contain the bib number is determined from the position and size of the marker. After the first search range is determined, the second search range is determined in the same way as in the first embodiment.

(Fourth Embodiment)
The above embodiments were described using detection of bib numbers as an example, but the invention is also applicable to detecting the registration number on an automobile license plate. In that case, an automobile logo or headlight is detected as the feature region, and a first search range that is likely to contain the registration number of the license plate is determined from the position and size of the detected feature region. The other processing is the same as in the above embodiments.

<Other Embodiments>
The present invention can also be realized by supplying a program that implements one or more functions of the above embodiments to a system or apparatus via a network or a storage medium, and having one or more processors in a computer of that system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

101 System bus
102 CPU
103 ROM
104 RAM
105 External storage device
106 Network interface
107 Display unit
108 Operation unit

Claims

An image processing apparatus comprising:
detection means for detecting a feature region from an input image;
determining means for determining, based on the position and size of the detected feature region, a first search range in the input image and a second search range different from the first search range;
reduction means for reducing the second search range at a predetermined reduction ratio; and
extraction means for extracting a character region from each of the image of the first search range and the reduced image of the second search range, using the data of each pixel constituting the respective image.

The image processing apparatus according to claim 1, wherein the reduction means also reduces the first search range, and the extraction means extracts a character region from each of the reduced images of the first search range and the second search range.

The image processing apparatus according to claim 1, further comprising: judging means for determining whether the aspect ratio of the character region extracted from the second search range is smaller than a predetermined value; and extension means for extending the second search range when the judging means determines that the aspect ratio of the character region is smaller than the predetermined value, wherein the extraction means extracts a character region from the extended second search range.

The image processing apparatus according to claim 3, wherein the extension means extends the second search range by extending the character region, along the direction of the character line of the extracted character region, toward the first search range that the character region touches.

The image processing apparatus according to claim 4, wherein the extension means extends the character region until its aspect ratio is at least the predetermined value.

The image processing apparatus according to claim 1, wherein the determining means determines the second search range by excluding, from the input image, the feature region and the first search range extended by a predetermined height or width.

The image processing apparatus according to claim 1, wherein the extraction means extracts character regions in parallel from the reduced first search range and the reduced second search range.

The image processing apparatus according to claim 1, wherein the reduction ratio used by the reduction means to reduce the first search range is calculated based on the size of the feature region detected by the detection means and the size of a character region that the extraction means can extract.

The image processing apparatus according to claim 1, wherein the reduction ratio used by the reduction means to reduce the second search range is calculated based on the size of the character region extracted by the extraction means and the size of a character region that the extraction means can extract.

The image processing apparatus according to claim 1, further comprising acquisition means for acquiring, from the number of pixels of the input image, a processing time for extracting character regions from the entire input image, wherein, if the processing time acquired by the acquisition means is within a predetermined value, the extraction means extracts character regions from the entire input image.

The image processing apparatus according to claim 1, wherein the feature area is a face area.

The image processing apparatus according to claim 1, wherein the feature region is a marker region.

An image processing method comprising:
a detection step of detecting a feature region from an input image;
a determination step of determining, based on the position and size of the detected feature region, a first search range in the input image and a second search range different from the first search range;
a reduction step of reducing the second search range at a predetermined reduction ratio; and
an extraction step of extracting a character region from each of the image of the first search range and the reduced image of the second search range, using the data of each pixel constituting the respective image.

A program that causes a computer to execute:
a detection step of detecting a feature region from an input image;
a determination step of determining, based on the position and size of the detected feature region, a first search range in the input image and a second search range different from the first search range;
a reduction step of reducing the second search range at a predetermined reduction ratio; and
an extraction step of extracting a character region from each of the image of the first search range and the reduced image of the second search range, using the data of each pixel constituting the respective image.