JP5003445B2

JP5003445B2 - Image processing apparatus, character area specifying method, and character area specifying program

Info

Publication number: JP5003445B2
Application number: JP2007314902A
Authority: JP
Inventors: 和也矢後
Original assignee: Konica Minolta Business Technologies Inc
Current assignee: Konica Minolta Business Technologies Inc
Priority date: 2007-12-05
Filing date: 2007-12-05
Publication date: 2012-08-15
Anticipated expiration: 2027-12-05
Also published as: JP2009141597A

Description

The present invention relates to an image processing apparatus that specifies a character region in an image composed of a plurality of pixels, a character region specifying method using the image processing apparatus, and a character region specifying program. In particular, it relates to an image processing apparatus that accurately generates character line rectangles from an image containing a complex and varied background and accurately specifies those character line rectangles, and to a character region specifying method and a character region specifying program using that apparatus.

In recent years, as information has increasingly been digitized, demand has grown for storing or transmitting documents electronically rather than on paper. Accordingly, in image processing apparatuses that acquire image data, such as MFPs (Multi Function Peripherals), a function for transmitting scanned image data directly, for example as an e-mail attachment, without printing it on paper has become widespread.

Meanwhile, the images handled by image processing apparatuses such as MFPs are shifting from monochrome to color, so the image data is increasingly color image data. In an MFP, the color image data obtained by scanning a full-color A4 original (297 mm × 210 mm) at a resolution of 300 dpi amounts to about 25 MB. Color image data has therefore become too large to send as an e-mail attachment.
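The figure of about 25 MB can be checked with a short calculation. The sketch below assumes uncompressed 24-bit RGB (3 bytes per pixel), which the text does not state explicitly; the function name `scan_size_bytes` is purely illustrative.

```python
MM_PER_INCH = 25.4

def scan_size_bytes(width_mm, height_mm, dpi, bytes_per_pixel=3):
    """Uncompressed size of a scan, assuming bytes_per_pixel bytes per pixel."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px * height_px * bytes_per_pixel

# A4 original (210 mm x 297 mm) scanned at 300 dpi, assumed 24-bit color.
size = scan_size_bytes(210, 297, 300)
print(f"{size} bytes = {size / 2**20:.1f} MiB")
```

Under these assumptions the scan is 2480 × 3508 pixels, which is consistent with the size quoted above.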

To address this problem, image data obtained by scanning (hereinafter "scan data") is generally compressed to reduce its size before transmission. However, if the scan data is compressed at the same resolution over the entire image, the legibility of the characters in the image is impaired. Conversely, if it is compressed at a resolution high enough to keep the characters legible, the size of the scan data cannot be reduced sufficiently.

To address this problem, Japanese Patent Application Laid-Open No. 2004-304469 (Patent Document 1), previously filed by the present applicant and since published, proposes compression methods such as so-called compact PDF (Portable Document Format) conversion, in which scan data is compressed with a different resolution and a different compression method for each region of the image. According to this method, a compact PDF is created by the following procedure: (1) region discrimination is performed on the scan data to separate character regions from non-character portions; (2) the character regions are binarized at high resolution, characters of the same color are merged to determine the character colors, and the result is losslessly compressed, for example by MMR (Modified Modified READ) compression; (3) the non-character portions are reduced in resolution and lossily compressed, for example by JPEG (Joint Photographic Experts Group) compression; and (4) the JPEG layer and the MMR layer are superimposed. In this way, a PDF file that achieves both character legibility and data compressibility can be generated.

Because a compact PDF is generated by the above procedure, accurately extracting character regions from the scanned image data is important. In particular, when characters are extracted from an image containing a complex and varied background, the problem is that background regions are mistakenly extracted as characters. To address this, methods have been proposed that exploit the fact that characters usually appear grouped in lines, and that judge whether a rectangle formed by connecting pixels is a character based on how line-like it is.

For example, in the region attribute identification method described in Japanese Patent Application Laid-Open No. 5-73718 (Patent Document 2), the connectivity of all black pixels in a character region is examined, and the coordinates of the rectangle circumscribing each cluster of connected black pixels are detected. For each circumscribed rectangle, the nearest circumscribed rectangle is found and their positional relationship is determined. If the two are adjacent in the horizontal direction, a horizontal connection counter is incremented by one; if they are adjacent in the vertical direction, a vertical connection counter is incremented by one. When all circumscribed rectangles have been processed, the two counters are compared: if the horizontal connection counter is larger, the region is judged to be a horizontally written region, and if the vertical connection counter is larger, it is judged to be a vertically written region.
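The counter-based judgment of Patent Document 2 can be illustrated roughly as follows. This is only a sketch: the text does not say how "nearest" or "adjacent in the horizontal direction" is measured, so the sketch assumes distance between rectangle centers and compares the horizontal and vertical center offsets. The rectangle format `(x0, y0, x1, y1)` and all function names are hypothetical.

```python
def center(rect):
    """Center point of a rectangle given as (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = rect
    return ((x0 + x1) / 2, (y0 + y1) / 2)

def writing_direction(rects):
    """Vote on writing direction using each rectangle's nearest neighbor."""
    horizontal = vertical = 0
    for i, rect in enumerate(rects):
        others = [o for j, o in enumerate(rects) if j != i]
        if not others:
            continue
        cx, cy = center(rect)
        # Assumption: "nearest" means smallest squared center distance.
        nearest = min(
            others,
            key=lambda o: (center(o)[0] - cx) ** 2 + (center(o)[1] - cy) ** 2,
        )
        nx, ny = center(nearest)
        # Assumption: the larger center offset decides the adjacency direction.
        if abs(nx - cx) >= abs(ny - cy):
            horizontal += 1
        else:
            vertical += 1
    if horizontal > vertical:
        return "horizontal"
    if vertical > horizontal:
        return "vertical"
    return "undetermined"

row = [(0, 0, 10, 10), (12, 0, 22, 10), (24, 0, 34, 10)]
column = [(0, 0, 10, 10), (0, 12, 10, 22), (0, 24, 10, 34)]
print(writing_direction(row), writing_direction(column))
```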

In the document image region extraction method described in Japanese Patent Application Laid-Open No. 5-166000 (Patent Document 3), a region image is created by region image creation processing, and a label image is created from that region image by label image creation processing. In document image creation processing, all regions of the label image other than character regions are set to white, producing a character image containing only characters. Next, adjacent region search processing finds, for each character region, the nearest character region above, below, to the left, and to the right. Based on this result, writing direction combination processing combines character regions in the row direction or the column direction. Grouping combination processing then combines multiple character strings whose writing directions match. Finally, region integration processing extracts the regions of the document image by finding the minimum rectangle enclosing the character regions combined in the grouping combination processing.

Furthermore, Japanese Patent Application Laid-Open No. 2007-193750 (Patent Document 4) describes an image processing apparatus that separates a scanned image into figure regions, which mainly contain figures and graphs, and text regions. For text regions, neighboring black pixels are connected, and character determination is performed on each rectangle obtained by the connection. For figure regions, the apparatus does not connect black pixels; instead, it performs labeling to extract the circumscribed rectangles of contiguous black pixels and performs character determination on each of those rectangles.

Thus, a line rectangle forming method is known in which margins are extracted and pixels in the remaining region are connected to generate line rectangles. Because this margin extraction separates character lines from one another at the blank space between them (the line spacing), it is known to form line rectangles with good accuracy. A character region determination method is also known in which a rectangle produced by the connection is judged to be a character rectangle if its aspect ratio is equal to or greater than a prescribed value.

Patent Document 1: Japanese Patent Application Laid-Open No. 2004-304469
Patent Document 2: Japanese Patent Application Laid-Open No. 5-73718
Patent Document 3: Japanese Patent Application Laid-Open No. 5-166000
Patent Document 4: Japanese Patent Application Laid-Open No. 2007-193750

However, with the conventional line rectangle forming method described above, for a character line with no other character lines around it, or a character line at the edge of a text region, even the ordinary gaps between characters may be misjudged as margins. This misjudgment splits the character line, so that an accurate character line is not formed even when pixels are connected. In addition, with the conventional character determination method, because vertically long and horizontally long rectangles also exist in background regions, part of the background is sometimes extracted as characters.

The present invention has been made to solve the above problems. Its main object is to provide an image processing apparatus that generates character line rectangles more accurately from an image containing a complex and varied background and specifies the character line rectangles more accurately, together with a character region specifying method using the image processing apparatus and a character region specifying program.

According to one aspect of the present invention, an image processing apparatus that specifies a character region in an image composed of a plurality of pixels includes: generating means for generating a binary image based on the image; first extracting means for extracting a margin region and the remaining non-margin region from the binary image; expansion means for generating first line rectangles by performing pixel expansion processing within the non-margin region in a first direction and in a second direction orthogonal to it, respectively; determining means for determining the character line direction of the binary image based on the aspect ratios of the first line rectangles; first connecting means for generating second line rectangles by connecting first line rectangles to one another in the character line direction determined by the determining means; and first specifying means for specifying the character region based on the regions of the binary image corresponding to the second line rectangles.
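The expansion step can be pictured with a small pure-Python sketch. It is not the patented implementation: margin extraction is omitted (the whole image is treated as the non-margin region), the expansion amount `k` is an arbitrary assumption, and the names `dilate_horizontal`, `dilate_vertical`, and `bounding_boxes` are hypothetical. Pixels are expanded along one direction, and the bounding boxes of the resulting 4-connected components stand in for the first line rectangles.

```python
from collections import deque

def dilate_horizontal(img, k):
    """Turn on every pixel within k columns of a foreground pixel in its row."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                for nx in range(max(0, x - k), min(w, x + k + 1)):
                    out[y][nx] = 1
    return out

def dilate_vertical(img, k):
    """Vertical expansion, done by transposing and reusing dilate_horizontal."""
    transposed = [list(col) for col in zip(*img)]
    return [list(col) for col in zip(*dilate_horizontal(transposed, k))]

def bounding_boxes(img):
    """Inclusive bounding boxes (x0, y0, x1, y1) of 4-connected components."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and img[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes

# Two "text lines" of three one-pixel "characters" each.
image = [
    [1, 0, 1, 0, 1],
    [0, 0, 0, 0, 0],
    [1, 0, 1, 0, 1],
]
line_rects = bounding_boxes(dilate_horizontal(image, 1))
column_rects = bounding_boxes(dilate_vertical(image, 1))
print(line_rects)
print(column_rects)
```

In this toy image, horizontal expansion merges each row of pixels into one wide rectangle, while vertical expansion merges each column into one tall rectangle; the aspect ratios of such rectangles are what the determining means then examines.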

According to this aspect, the first connecting means generates line rectangles even from a character line with no surrounding character lines or a character line at the edge of a text region. Character line rectangles can therefore be formed more accurately than with conventional methods, which improves the accuracy of character region extraction.

Preferably, the determining means includes: calculating means for calculating, among the first line rectangles, the number of line rectangles satisfying a first predetermined condition indicating that the aspect ratio is that of a horizontally long line rectangle, and the number of line rectangles satisfying a second predetermined condition indicating that the aspect ratio is that of a vertically long line rectangle; and first deciding means for deciding the character line direction of the binary image based on the two counts.
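A minimal sketch of this count-based decision follows. The two "predetermined conditions" are not specified in this passage, so the sketch assumes a rectangle is horizontally long when its width is at least `ratio` times its height and vertically long in the opposite case; the threshold of 2.0, the tie-breaking rule, and the function name `decide_line_direction` are all assumptions.

```python
def decide_line_direction(rects, ratio=2.0):
    """Decide the character line direction from first line rectangles.

    rects: iterable of inclusive rectangles (x0, y0, x1, y1).
    Returns (direction, wide_count, tall_count).
    """
    wide = tall = 0
    for x0, y0, x1, y1 in rects:
        width = x1 - x0 + 1
        height = y1 - y0 + 1
        if width >= ratio * height:      # assumed first predetermined condition
            wide += 1
        elif height >= ratio * width:    # assumed second predetermined condition
            tall += 1
    # Assumption: ties are resolved in favor of horizontal writing.
    direction = "horizontal" if wide >= tall else "vertical"
    return direction, wide, tall

rects = [(0, 0, 99, 9), (0, 20, 79, 29), (120, 0, 129, 59)]
print(decide_line_direction(rects))
```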

Preferably, the first specifying means includes: second extracting means for extracting at least one small rectangle contained in a second line rectangle; and second deciding means for deciding, for each second line rectangle, that the region corresponding to the second line rectangle is a character region when the ratio of the number of small rectangles to the aspect ratio of the second line rectangle is within a predetermined range.
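The intuition is that a line of roughly square characters contains about as many characters as its aspect ratio, so the count divided by the aspect ratio should be near 1. The sketch below assumes the aspect ratio is taken as long side over short side and that the "predetermined range" is 0.5 to 2.0; both choices, and the name `is_text_by_count`, are illustrative assumptions.

```python
def is_text_by_count(line_rect, small_rects, low=0.5, high=2.0):
    """Judge a second line rectangle by (small rectangle count) / (aspect ratio)."""
    x0, y0, x1, y1 = line_rect
    width = x1 - x0 + 1
    height = y1 - y0 + 1
    # Assumption: aspect ratio is long side / short side, so it is always >= 1.
    aspect = max(width, height) / min(width, height)
    ratio = len(small_rects) / aspect
    return low <= ratio <= high

line = (0, 0, 99, 9)  # 100 x 10 pixels, aspect ratio 10
ten_chars = [(i * 10, 0, i * 10 + 8, 9) for i in range(10)]
print(is_text_by_count(line, ten_chars))       # ten glyph-like boxes
print(is_text_by_count(line, ten_chars[:1]))   # a single box, e.g. a ruled line
```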

In this case, after the second line rectangles are extracted, information about the small rectangles within each line rectangle is additionally used to judge whether the line rectangle is a character region, so character regions can be specified more accurately than before. That is, character region extraction accuracy improves and erroneous extraction of the background is reduced.

Preferably, the first specifying means includes: second extracting means for extracting at least one small rectangle contained in a second line rectangle; and third deciding means for deciding, for each second line rectangle, that the region corresponding to the second line rectangle is a character region when the variation in size of the small rectangles contained in the second line rectangle is equal to or less than a predetermined value.
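One way to express this test is sketched below. The passage does not define "size" or "variation", so the sketch assumes size means rectangle area and variation means the coefficient of variation (population standard deviation divided by mean); the threshold 0.5 and the name `is_text_by_size_variation` are likewise assumptions.

```python
from statistics import mean, pstdev

def is_text_by_size_variation(small_rects, max_cv=0.5):
    """Accept a line rectangle when its small rectangles have similar areas."""
    areas = [(x1 - x0 + 1) * (y1 - y0 + 1) for x0, y0, x1, y1 in small_rects]
    if not areas:
        return False
    # Coefficient of variation: spread of the areas relative to their mean.
    return pstdev(areas) / mean(areas) <= max_cv

uniform = [(0, 0, 8, 9), (10, 0, 18, 9), (20, 0, 29, 9)]   # glyph-like boxes
mixed = [(0, 0, 9, 9), (12, 4, 12, 4), (20, 0, 69, 9)]     # background-like boxes
print(is_text_by_size_variation(uniform), is_text_by_size_variation(mixed))
```

Characters in one line tend to have similar sizes, whereas fragments of a photograph or pattern usually do not, which is why a small spread is taken here as evidence of text.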

Preferably, the apparatus further includes: second connecting means for generating third line rectangles by connecting, in the direction perpendicular to the character line direction determined by the determining means, first line rectangles outside the regions specified as character regions by the first specifying means; and second specifying means for specifying a character region based on the regions of the binary image corresponding to the third line rectangles.

Preferably, the generating means generates a plurality of binary images, one for each of a plurality of colors obtained by performing color reduction processing on the image; the first extracting means, the expansion means, the determining means, and the first connecting means process each binary image; and the first specifying means specifies the character region based on the union of the regions corresponding to the second line rectangles in the respective binary images.
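The union over the per-color binary images can be sketched as a simple mask merge. The data layout (a dictionary from color name to a list of inclusive rectangles) and the function name `union_mask` are assumptions made only for illustration.

```python
def union_mask(width, height, rects_per_color):
    """Mark every pixel covered by a second line rectangle of any color plane."""
    mask = [[0] * width for _ in range(height)]
    for rects in rects_per_color.values():
        for x0, y0, x1, y1 in rects:
            for y in range(y0, y1 + 1):
                for x in range(x0, x1 + 1):
                    mask[y][x] = 1
    return mask

# Hypothetical result of processing two reduced colors separately.
rects_per_color = {
    "black": [(0, 0, 1, 0)],
    "red": [(1, 0, 2, 1)],
}
for row in union_mask(4, 2, rects_per_color):
    print(row)
```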

According to another aspect of the present invention, a character region specifying method uses an image processing apparatus to specify a character region in an image composed of a plurality of pixels. The image processing apparatus includes a control unit, and the character region specifying method includes the steps of: the control unit generating a binary image based on the image; the control unit extracting a margin region and the remaining non-margin region from the binary image; the control unit generating first line rectangles by performing pixel expansion processing within the non-margin region in a first direction and in a second direction orthogonal to it, respectively; the control unit determining the character line direction of the binary image based on the aspect ratios of the first line rectangles; the control unit generating second line rectangles by connecting first line rectangles to one another in the character line direction; and the control unit specifying the character region based on the regions of the binary image corresponding to the second line rectangles.
Preferably, the determining step includes the steps of: calculating, among the first line rectangles, the number of line rectangles satisfying a first predetermined condition indicating that the aspect ratio is that of a horizontally long line rectangle, and the number of line rectangles satisfying a second predetermined condition indicating that the aspect ratio is that of a vertically long line rectangle; and deciding the character line direction of the binary image based on the two counts.
Preferably, the step of specifying the character region includes the steps of: extracting at least one small rectangle contained in a second line rectangle; and deciding, for each second line rectangle, that the region corresponding to the second line rectangle is a character region when the ratio of the number of small rectangles to the aspect ratio of the second line rectangle is within a predetermined range.
Preferably, the step of specifying the character region includes the steps of: extracting at least one small rectangle contained in a second line rectangle; and deciding, for each second line rectangle, that the region corresponding to the second line rectangle is a character region when the variation in size of the small rectangles contained in the second line rectangle is equal to or less than a predetermined value.
Preferably, the character region specifying method further includes the steps of: generating third line rectangles by connecting, in the direction perpendicular to the character line direction determined in the determining step, first line rectangles outside the regions specified as character regions in the step of specifying the character region; and specifying a character region based on the regions of the binary image corresponding to the third line rectangles.
Preferably, in the step of generating the binary image, a plurality of binary images are generated, one for each of a plurality of colors obtained by performing color reduction processing on the image; in the step of extracting the margin region and the remaining non-margin region, the step of generating the first line rectangles, the determining step, and the step of generating the second line rectangles, each binary image is processed; and in the step of specifying the character region, the character region is specified based on the union of the regions corresponding to the second line rectangles in the respective binary images.

According to still another aspect of the present invention, a character region specifying program causes a computer to specify a character region in an image composed of a plurality of pixels. The program causes the computer to execute the steps of: generating a binary image based on the image; extracting a margin region and the remaining non-margin region from the binary image; generating first line rectangles by performing pixel expansion processing within the non-margin region in a first direction and in a second direction orthogonal to it, respectively; determining the character line direction of the binary image based on the aspect ratios of the first line rectangles; generating second line rectangles by connecting first line rectangles to one another in the character line direction; and specifying the character region based on the regions of the binary image corresponding to the second line rectangles.
Preferably, the determining step includes the steps of: calculating, among the first line rectangles, the number of line rectangles satisfying a first predetermined condition indicating that the aspect ratio is that of a horizontally long line rectangle, and the number of line rectangles satisfying a second predetermined condition indicating that the aspect ratio is that of a vertically long line rectangle; and deciding the character line direction of the binary image based on the two counts.
Preferably, the step of specifying the character region includes the steps of: extracting at least one small rectangle contained in a second line rectangle; and deciding, for each second line rectangle, that the region corresponding to the second line rectangle is a character region when the ratio of the number of small rectangles to the aspect ratio of the second line rectangle is within a predetermined range.
Preferably, the step of specifying the character region includes the steps of: extracting at least one small rectangle contained in a second line rectangle; and deciding, for each second line rectangle, that the region corresponding to the second line rectangle is a character region when the variation in size of the small rectangles contained in the second line rectangle is equal to or less than a predetermined value.
Preferably, the program further causes the computer to execute the steps of: generating third line rectangles by connecting, in the direction perpendicular to the character line direction determined in the determining step, first line rectangles outside the regions specified as character regions in the step of specifying the character region; and specifying a character region based on the regions of the binary image corresponding to the third line rectangles.
Preferably, in the step of generating the binary image, a plurality of binary images are generated, one for each of a plurality of colors obtained by performing color reduction processing on the image; in the step of extracting the margin region and the remaining non-margin region, the step of generating the first line rectangles, the determining step, and the step of generating the second line rectangles, each binary image is processed; and in the step of specifying the character region, the character region is specified based on the union of the regions corresponding to the second line rectangles in the respective binary images.

As described above, according to the present invention, character line rectangles can be generated more accurately from an image containing a complex and varied background, and the character line rectangles can be specified more accurately.

Embodiments of the present invention are described below with reference to the drawings. In the following description, identical parts and components are denoted by identical reference numerals. Their names and functions are also the same.

In the present embodiment, the image processing apparatus according to the present invention is typically an MFP (Multi Function Peripheral) that integrates a copy function, a scan function, a FAX transmission function, and the like. However, the image processing apparatus according to the present invention is not limited to an MFP; it may be any other apparatus that includes means for processing input image data, such as a general-purpose personal computer.

<Hardware configuration>
FIG. 1 is a diagram showing a specific example of the hardware configuration of the MFP 10 according to the present embodiment. Referring to FIG. 1, the MFP 10 according to the present embodiment processes an image (image data) composed of a plurality of pixels (pixel data), and includes a scan processing unit 1, an input image processing unit 2, a storage unit 3, a CPU (Central Processing Unit) 4, a network I/F (interface) 5, an output image processing unit 6, an engine unit 7, a modem/NCU (Network Control Unit) 8, and an operation unit 9.

The scan processing unit 1 scans and reads a set original in accordance with a control signal from the CPU 4, and outputs image data to the input image processing unit 2. In accordance with that control signal, the input image processing unit 2 calculates values such as RGB data for each pixel of the image data input from the scan processing unit 1, and either outputs them to the CPU 4 or stores them in the storage unit 3.

The storage unit 3 includes electronic memory such as DRAM (Dynamic Random Access Memory) or SRAM (Static Random Access Memory) and magnetic memory such as a hard disk, and holds programs and image data. The storage unit 3 is also used as a work area when the CPU 4 executes a program.

The CPU 4 executes programs stored in the storage unit 3. Based on operation signals input from the operation unit 9, the CPU 4 outputs the necessary control signals to each unit to control the MFP 10 as a whole. For example, the CPU 4 detects operation key presses, controls the operation panel display, converts input data into image files, and creates e-mail. The CPU 4 also executes a program stored in the storage unit 3 to perform image processing on the image data held in the storage unit 3, and outputs control signals and the like to the storage unit 3, the network I/F unit 5, the modem/NCU 8, and so on. The CPU 4 performs processing according to the present embodiment, such as color conversion, color correction, resolution conversion, and region specification, on the image data input from the input image processing unit 2. The processed data is held in the storage unit 3.

The network I/F unit 5 is an interface for transmitting e-mail and the like to other apparatuses via a network, and performs tasks such as creating data packets in accordance with a protocol. In accordance with the control signal from the CPU 4, the network I/F unit 5 transmits image data input from the CPU 4, or image data read from the storage unit 3, to another apparatus via the network.

In accordance with the control signal from the CPU 4, the output image processing unit 6 reads the image data held in the storage unit 3, applies screen control, smoothing processing, and the like to the image, and outputs the processed image data to the engine unit 7.

In accordance with the control signal from the CPU 4, the engine unit 7 generates a toner image based on the image data input from the output image processing unit 6, and prints the image by transferring the toner image onto the set printing paper. When the MFP 10 is a color MFP that outputs color images, the engine unit 7 generates the toner image using toners of four colors: yellow, magenta, cyan, and black.

The modem/NCU 8 performs modulation and demodulation for facsimile transmission and reception, and controls communication over a telephone line in accordance with a facsimile communication protocol and the like. In accordance with the control signal, the modem/NCU 8 transmits the image data input from the CPU 4, or the image data read from the storage unit 3, to another device via the telephone line.

The operation unit 9 includes operation keys and a display unit and functions as a user I/F. It accepts user operations such as entering a destination, selecting scan conditions, selecting an image file format, and starting or interrupting processing. The operation unit 9 outputs an operation signal based on the user's operation to the CPU 4.

<Functional configuration>
FIG. 2 is a block diagram showing a specific example of the functional configuration with which the MFP 10 according to the present embodiment compresses image data and creates a PDF (Portable Document Format) file. The units shown in FIG. 2 are functions realized mainly by the CPU 4 executing a program stored in the storage unit 3, but some of them may instead be realized by other dedicated hardware such as the scan processing unit 1 or the input image processing unit 2.

Referring to FIG. 2, the functions for creating a PDF file in the MFP 10 according to the present embodiment include an image data acquisition unit 101, a preprocessing unit 103, a photo determination unit 104, a character area specifying unit 105, a lossless compression unit 107, a resolution reduction unit 109, a lossy compression unit 111, and a PDF conversion unit 113.

The image data acquisition unit 101 acquires the image data generated by the scan processing unit 1 and inputs it to the preprocessing unit 103 in a data format such as TIFF (Tagged Image File Format), JPEG (Joint Photographic Experts Group), or BMP (Bit MaP). In other words, the image data acquisition unit 101 is the scanner portion of the MFP 10: it reads a document and outputs image data.

As preprocessing for specifying character areas, the preprocessing unit 103 applies processing such as image format conversion, resolution conversion, and background removal to the image data input from the image data acquisition unit 101, and passes the result to the character area specifying unit 105. The preprocessing unit 103 also performs HSL (hue, saturation, lightness) conversion, binarization of the lightness, and labeling. In labeling, each group of connected pixels obtained by binarizing the lightness is given a rectangle number together with the upper-left and lower-right coordinates of its circumscribed rectangle, and is thereafter handled as a rectangle.
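The labeling step can be illustrated with a minimal sketch. The function name `label_rectangles`, the use of 4-connectivity, and the list-of-lists image representation are assumptions made for illustration; the description does not state how connectivity is defined.

```python
from collections import deque

def label_rectangles(binary):
    """Return (rect_number, upper_left, lower_right) for each group of
    4-connected pixels whose value is 1. Coordinates are (x, y)."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    rects = []
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first search over one connected group of pixels.
            queue = deque([(x, y)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            rects.append((len(rects), (x0, y0), (x1, y1)))
    return rects
```

Each returned tuple carries the three items the text names: a rectangle number and the upper-left and lower-right corners of the circumscribed rectangle.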

The photo determination unit 104 identifies photo areas. Photo areas are identified so that a different character extraction method can be used for photo areas and for all other areas. When character extraction is attempted on a photo area, non-character objects are often misjudged as characters, so characters must be judged more strictly there. The method of identifying photo areas is not limited here. A typical method takes, from the rectangles obtained by the preprocessing unit 103, those of a predetermined size that may be photographs, examines their hue data, and judges a rectangle to be a photograph if it contains many colors.

The character area specifying unit 105 performs character area specifying processing on the image data of areas other than the photo areas, input from the photo determination unit 104. The character area specifying processing includes color reduction (binarization), margin extraction, expansion, character line direction determination, first connection processing, first character line determination, second connection processing, character color determination, and so on. By executing this processing, the character area specifying unit 105 extracts the character areas (characters and ruled lines) in the image, and separates the character areas from background areas such as photographs, figures, and graphs. The character areas referred to here also include areas of characters that appear inside a photograph, figure, or graph. In other words, the character area specifying unit 105 is a block that extracts character areas by applying different character area specifying processing to photo areas and to other areas, and that carries the processing through to calculating the character color. The character area specifying processing for areas other than photo areas is described later.

The image data constituting the separated background areas, that is, the areas not specified as character areas, is input to the lossy compression unit 111 via the resolution reduction unit 109. The image data constituting the areas specified as character areas, on the other hand, is input directly to the lossless compression unit 107 without passing through the resolution reduction unit 109.

The lossless compression unit 107 applies lossless compression, such as MMR (Modified Modified READ) compression, to the image data constituting the character areas input from the character area specifying unit 105. The lossy compression unit 111 applies lossy compression, such as JPEG compression, to the image data constituting the background areas whose resolution has been reduced by the resolution reduction unit 109. The compressed image data for the character areas and for the background areas is input to the PDF conversion unit 113, which creates a PDF file from it.

The functional configuration of the MFP 10 shown in FIG. 2 is for the case where, when a PDF file is created, the image data constituting the background areas is reduced in resolution and then lossy-compressed. The background image data may instead be lossy-compressed without reducing its resolution, in which case the functions of the MFP 10 need not include the resolution reduction unit 109.

FIG. 3 is a functional block diagram showing the functional configuration of the character area specifying unit 105. As shown in FIG. 3, the character area specifying unit 105 includes a generation unit 11, a first extraction unit 12, an expansion unit 13, a determination unit 14, a first connecting unit 17, a first specifying unit 18, a second connecting unit 22, and a second specifying unit 23. The determination unit 14 includes a calculation unit 15 and a first determination unit 16. The first specifying unit 18 includes a second extraction unit 19 and a second determination unit 20. The first specifying unit 18 may include a third determination unit 21 instead of, or together with, the second determination unit 20. Each functional block of the character area specifying unit 105 is realized by the CPU 4 executing a program read from the storage unit 3.

The generation unit 11 generates binary image data based on the input image data. The input image data may be image data obtained by scanning with the MFP 10 or image data input to the MFP 10 from an external device. The generation unit 11 according to the present embodiment generates multiple kinds of binary image data from the image data.

More specifically, the generation unit 11 reduces the colors of the image data and creates a binary image for each of the resulting colors. For each pixel, the generation unit 11 generates binary image data based on the luminance value of that pixel. Using either a preset luminance threshold or a luminance threshold determined by first inspecting the target image data, the generation unit 11 assigns "0" (white or black) or "1" (black or white) to each pixel of the image data and stores the result in the storage unit 3.

Because multiple kinds of binary images are generated, whenever a threshold falls between the luminance of the character color and the luminance of the background color in one of the binary images, the character pixels and the background pixels are separated into "0" and "1" in that binary image. Characters in the same character line usually have the same color, so they are usually reduced to the same color (white or black). In the MFP 10 according to the present embodiment, for example, the number of reduced colors is four. The method of calculating the thresholds for color reduction is not particularly limited; examples include the frequency method and the median cut method.
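One possible reading of this step is sketched below: luminance is divided into four bands by a simple median-cut style split, and one binary image is produced per band. The function names, the equal-population split, and the per-band (rather than per-threshold) masks are all assumptions for illustration; the description leaves the threshold method open.

```python
def median_cut_thresholds(values, levels=4):
    """Split the luminance values into `levels` equally populated bands and
    return the levels - 1 boundary values between neighbouring bands."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(n * k) // levels] for k in range(1, levels)]

def binary_images(luma, levels=4):
    """Return one binary image per luminance band.
    In image k a pixel is 1 if its luminance falls in band k, else 0."""
    flat = [v for row in luma for v in row]
    cuts = median_cut_thresholds(flat, levels)

    def band(v):
        # Number of boundaries at or below v gives the band index.
        return sum(v >= c for c in cuts)

    return [[[1 if band(v) == k else 0 for v in row] for row in luma]
            for k in range(levels)]
```

With this layout, a character color and a background color that lie on opposite sides of any boundary end up as "1" and "0" in the image for the character's band.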

The first extraction unit 12 extracts margin areas and the remaining non-margin areas from the binary image. A "margin area" is a blank portion in which no black pixel appears over at least a specified number of pixels (that is, white pixels continue for at least the specified number). A "non-margin area" is any area of the image data other than a margin area.

More specifically, the first extraction unit 12 scans each binary image created by the generation unit 11 in the main scanning direction and the sub-scanning direction, extracts the blank portions (margin areas) in which no black pixel appears over at least the specified number of pixels (white pixels continue for at least the specified number), and fills them with the margin color (for example, "0"). In non-margin areas, where white pixels do not continue for the specified number, the pixel data corresponding to black pixels is set to "1". The specified number is, for example, the number of pixels corresponding to a length of 4.5 cm (about 600 pixels at 300 dpi). This extraction prevents the expansion unit 13, described later, from joining characters that belong to different character lines. As a result of the extraction, however, a character line with no black pixels (pixels assigned "1") around it may be split. Concrete examples are shown later.
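The margin extraction can be sketched as a run-length scan along rows and then columns. The names `mark_margins` and `min_run` are hypothetical, and the sketch marks margin pixels with -1, following the value that FIG. 4 is described as using, rather than the "0" given above as an example margin color.

```python
MARGIN = -1  # marker for margin pixels (assumed, after the FIG. 4 description)

def _mark_runs(binary, out, coords, min_run):
    """Mark every run of non-black pixels along `coords` of length >= min_run."""
    run = []
    for x, y in coords:
        if binary[y][x] == 1:
            if len(run) >= min_run:
                for rx, ry in run:
                    out[ry][rx] = MARGIN
            run = []
        else:
            run.append((x, y))
    if len(run) >= min_run:  # run that reaches the edge of the image
        for rx, ry in run:
            out[ry][rx] = MARGIN

def mark_margins(binary, min_run):
    """Scan rows (main scanning) and columns (sub-scanning) of a 0/1 image."""
    height, width = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    for y in range(height):
        _mark_runs(binary, out, [(x, y) for x in range(width)], min_run)
    for x in range(width):
        _mark_runs(binary, out, [(x, y) for y in range(height)], min_run)
    return out
```

Pixels left as 0 or 1 form the non-margin area that the later expansion step operates on.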

FIG. 4 is an image diagram showing pixel expansion in the main scanning direction. In FIG. 4, the "margin area" is the area made up of pixels assigned "-1", and the "non-margin area" is the area made up of pixels assigned "0" or "1".

Pixel expansion is described with reference to FIG. 4. Based on the image data 20M passed from the first extraction unit 12, the expansion unit 13 generates at least one first line rectangle by expanding the black pixels (pixels whose data is "1") within the non-margin area. More specifically, in the areas that the first extraction unit 12 did not judge to be margin areas, black pixels are expanded so that they join one another. In this expansion, the CPU 4 checks neighboring pixels in the main scanning direction based on the input peripheral pixel map (image data) 20 and creates peripheral pixel maps (image data) 20L and 20R in which the pixels have been expanded leftward and rightward. The image data 20L is the image data after leftward expansion, and the image data 20R is the image data after rightward expansion.

Specifically, the CPU 4 performs the following processing on each line in the main scanning direction while moving along that direction. (1) The CPU 4 inspects the value of the pixel of interest. If the pixel of interest has been filled as a margin pixel, the corresponding pixel in the expanded peripheral pixel map is set to "-1". (2) If it is not a margin pixel (that is, it is a white or black pixel), the CPU 4 checks whether a black pixel exists within half the maximum character width, in pixels, to the left of the pixel of interest (this is the case of expanding pixels rightward). The maximum character width is, for example, 190 pixels.

If a black pixel is found, the value of the pixel of interest in the rightward-expanded peripheral pixel map is set to "1". If no black pixel is found, it is set to "0". If a margin pixel is encountered along the way, the search stops there. In this way, the expansion unit 13 searches to the left of each pixel of interest, which expands black pixels to the right, and generates image data 21R. The expansion unit 13 performs the same processing toward the right to generate image data 21L. The peripheral pixel maps are thus completed for all pixels.
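The search described in steps (1) and (2) can be sketched for a single scan line as follows. `dilate_row`, `reach`, and `search_left` are hypothetical names; `reach` stands for half the maximum character width, and treating the pixel of interest itself as part of the search range (distance 0) is an assumption.

```python
MARGIN = -1  # margin pixel marker

def dilate_row(row, reach, search_left):
    """Build one line of a peripheral pixel map.
    search_left=True looks to the left of each pixel (black pixels grow
    rightward); search_left=False looks to the right (they grow leftward)."""
    step = -1 if search_left else 1
    out = []
    for x, value in enumerate(row):
        if value == MARGIN:
            out.append(MARGIN)       # step (1): margin pixels stay -1
            continue
        found = 0
        for d in range(reach + 1):   # step (2): bounded search
            nx = x + d * step
            if nx < 0 or nx >= len(row) or row[nx] == MARGIN:
                break                # stop at the image edge or a margin pixel
            if row[nx] == 1:
                found = 1
                break
        out.append(found)
    return out
```

Running the function once with `search_left=True` and once with `search_left=False` gives the two maps that the text calls the rightward- and leftward-expanded image data.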

FIG. 5 is an image diagram showing pixel expansion in the sub-scanning direction for the rightward-expanded image data 21R shown in FIG. 4. As shown in FIG. 5, the expansion unit 13 connects the peripheral pixel map of the image data 21R in the sub-scanning direction and generates image data 22R. More specifically, the expansion unit 13 (1) scans each X coordinate in the Y direction and connects "1" pixels that lie within a first predetermined interval (one pixel in the present embodiment) of each other. That is, the expansion unit 13 fills the "0" pixel data lying between two "1" pixel data with "1". As with expansion in the main scanning direction, pixel data of "-1" is left unchanged.
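A minimal sketch of this sub-scanning connection follows. The name `fill_column_gaps` is hypothetical, and interpreting the "first predetermined interval" as the number of "0" pixels lying between two "1" pixels is an assumption.

```python
MARGIN = -1  # margin pixel marker

def fill_column_gaps(grid, max_gap=1):
    """For each X coordinate, scan in Y and fill runs of 0 that lie between
    two 1 pixels when the run is at most max_gap long. Margin pixels are
    never changed and are never bridged."""
    height, width = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for x in range(width):
        last_black = None  # Y of the latest 1 with only 0 pixels after it
        for y in range(height):
            value = grid[y][x]
            if value == 1:
                if last_black is not None and 0 < y - last_black - 1 <= max_gap:
                    for fy in range(last_black + 1, y):
                        out[fy][x] = 1
                last_black = y
            elif value == MARGIN:
                last_black = None
    return out
```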

FIG. 6 is an image diagram showing pixel expansion in the sub-scanning direction for the leftward-expanded image data 21L shown in FIG. 4. As shown in FIG. 6, after expanding the pixels of the image data 21R in the sub-scanning direction, the expansion unit 13 expands the pixels of the image data 21L in the sub-scanning direction in the same way.

FIG. 7 is an image diagram showing the union of the black pixels of the leftward- and rightward-expanded image data 22L and 22R shown in FIGS. 5 and 6. After expanding the pixels of the image data 21R and 21L in the sub-scanning direction, the expansion unit 13 (2) checks, for each pixel of interest, whether the peripheral pixel map value in either the leftward-expanded image data 22L or the rightward-expanded image data 22R is "1". That is, as shown in FIG. 7, the expansion unit 13 superimposes the image data 22L and 22R, and if either of the two values is "1", it judges the pixel to be part of a horizontal (main scanning direction) character line and fills it with black.
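The superimposition amounts to a per-pixel OR of the two maps, as in the short sketch below. `union_maps` is a hypothetical name, and the sketch assumes that margin pixels (-1) occupy the same positions in both maps.

```python
MARGIN = -1  # margin pixel marker

def union_maps(left_map, right_map):
    """Combine two peripheral pixel maps: a pixel becomes 1 if it is 1 in
    either map, margin pixels stay -1, and everything else is 0."""
    out = []
    for left_row, right_row in zip(left_map, right_map):
        out.append([
            MARGIN if l == MARGIN else (1 if l == 1 or r == 1 else 0)
            for l, r in zip(left_row, right_row)
        ])
    return out
```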

The expansion unit 13 then rotates the image data 20M by 90 degrees and performs the same pixel expansion in the main scanning direction and the sub-scanning direction as above, thereby connecting the vertical character lines.

Specifically, as shown in FIG. 8, based on the image data 30 rotated by 90 degrees, the expansion unit 13 searches to the left of each pixel, which expands black pixels to the right, and generates image data 31R. The expansion unit 13 then performs the same processing toward the right of each pixel and generates image data 31L, in which pixels have been expanded leftward.

Then, as shown in FIG. 9, the expansion unit 13 connects pixels in the sub-scanning direction for the rightward-expanded image data 31R and generates image data 32R. As shown in FIG. 10, the expansion unit 13 likewise performs pixel expansion in the sub-scanning direction on the image data 31L and generates image data 32L. Next, as shown in FIG. 11, for each pixel of interest the expansion unit 13 checks whether the value in either the rightward-expanded image data 32R or the leftward-expanded image data 32L is "1". If either of the image data 32L and 32R is "1", the expansion unit 13 judges the pixel to be part of a horizontal character line and fills it with black.

Returning to FIG. 3, the determination unit 14 determines the character line direction of the binary image based on the shapes of the first line rectangles. Specifically, the calculation unit 15 counts, among the first line rectangles, the number of line rectangles whose aspect ratio satisfies a first predetermined condition and the number whose aspect ratio satisfies a second predetermined condition. The first determination unit 16 then decides the character line direction of the binary image based on these two counts. In other words, the determination unit 14 decides whether black pixels should be reconnected in the horizontal or the vertical direction, so that more accurate character line rectangles can be generated from character lines whose rectangles are incompletely formed.

Specifically, the calculation unit 15 first searches the first line rectangles formed by the expansion unit 13 for rectangles whose aspect ratio is at least a first predetermined value (for example, 3). The calculation unit 15 then searches the first line rectangles for rectangles whose aspect ratio is at most a second predetermined value (for example, 1/3). Next, treating the rectangles found in this way as character lines, it is determined whether vertically long or horizontally long character line rectangles are more numerous in the image as a whole, and the character line direction of the whole image is decided accordingly.
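The counting and comparison can be sketched as below. The function name, the definition of the aspect ratio as width divided by height, and the choice of "horizontal" when the two counts are equal are assumptions; the values 3 and 1/3 are the examples given in the text.

```python
def text_line_direction(rects, wide_ratio=3.0, tall_ratio=1 / 3):
    """Decide the character line direction of an image.
    rects: list of (x0, y0, x1, y1) with inclusive corner coordinates."""
    horizontal = vertical = 0
    for x0, y0, x1, y1 in rects:
        width = x1 - x0 + 1
        height = y1 - y0 + 1
        ratio = width / height      # assumed definition of the aspect ratio
        if ratio >= wide_ratio:
            horizontal += 1         # horizontally long line rectangle
        elif ratio <= tall_ratio:
            vertical += 1           # vertically long line rectangle
    # Ties are resolved as horizontal here, which is an arbitrary choice.
    return "horizontal" if horizontal >= vertical else "vertical"
```

Rectangles whose ratio falls between the two values are ignored, since they are not regarded as character lines for this decision.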

The first connecting unit 17 generates second line rectangles by connecting the first line rectangles to one another in the character line direction determined by the determination unit 14. More specifically, the first connecting unit 17 reconnects the character line rectangles (black pixels) formed by the expansion unit 13 in the character line direction determined by the determination unit 14. The first connecting unit 17 thereby bridges gaps, such as the spaces between characters within a character line, that the first extraction unit 12 judged to be margin areas, and produces accurate character line rectangles. This time, black pixels are allowed to be connected across margin areas (pixel data of "-1"). This processing makes it possible to generate a line rectangle for every character line.

FIG. 12 is an image diagram showing rotation of the image data. When the character line direction is horizontal, the following processing is executed. (1) The first connecting unit 17 first rotates the image data 40 by 90 degrees to obtain image data 50. (2) The first connecting unit 17 then inspects the value of the pixel of interest.

FIG. 13 is an image diagram showing the expansion processing applied to the image data shown in FIG. 12. As shown in FIG. 13, if the pixel of interest has been filled as a margin pixel, the left and right peripheral pixel maps are set to "-1". If it is not a margin pixel (that is, it is a white pixel "0" or a black pixel "1"), the unit checks whether a black pixel exists within half the maximum character width (for example, 190 pixels) to the left of the pixel of interest. If a black pixel is found, the value in the left peripheral pixel map is set to "1"; if not, it is set to "0". If a margin pixel is encountered along the way, the search stops there. In this way, the first connecting unit 17 obtains the rightward-expanded image data 50R. As shown in FIG. 13, the first connecting unit 17 performs the same processing toward the right of each pixel, which expands black pixels to the left, and obtains image data 50L. The peripheral pixel maps are thus completed for all pixels.

FIG. 14 is an image diagram showing the first connection processing between the first character line rectangles (black pixels) shown in FIG. 13. Following step (2), as shown in FIG. 14, (3) the first connecting unit 17 scans each X coordinate in the Y direction and connects "1" pixels that lie within a second predetermined interval (three pixels in the present embodiment) of each other. The second predetermined interval is set wider than the first predetermined interval used in the expansion processing. In addition, unlike the connection in the expansion processing, even margin pixels ("-1") are filled with "1".
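Step (3) differs from the connection in the expansion processing in two ways: the allowed interval is wider, and margin pixels may be overwritten. A minimal sketch under those two points follows; `connect_columns` is a hypothetical name, and counting the interval as the number of pixels lying between two "1" pixels is an assumption.

```python
MARGIN = -1  # margin pixel marker

def connect_columns(grid, max_gap=3):
    """For each X coordinate, scan in Y and fill every pixel lying between
    two 1 pixels when at most max_gap pixels separate them. Both 0 pixels
    and margin pixels (-1) inside the gap are overwritten with 1."""
    height, width = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for x in range(width):
        last_black = None  # Y of the most recent 1 pixel in this column
        for y in range(height):
            if grid[y][x] != 1:
                continue
            if last_black is not None and 0 < y - last_black - 1 <= max_gap:
                for fy in range(last_black + 1, y):
                    out[fy][x] = 1
            last_black = y
    return out
```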

Finally, (4) the first connecting unit 17 checks, for each pixel of interest, whether the peripheral pixel map value in the leftward- and rightward-expanded image data 51L and 51R is "1". If either of the image data 51L and 51R is "1", the pixel is judged to be part of a horizontal line and is filled with black. That is, the image data 51L and 51R are superimposed and the union of the first character line rectangles (black pixels) is computed. As a result, as shown in FIG. 14, the margins that existed between characters within a character line are filled with black pixels, and a more accurate character line rectangle is formed.

FIG. 15 is an image diagram showing, for the image data of FIG. 14, pixel data 61 obtained by superimposing the pixel data 50L and 50R that have undergone only the expansion processing. It is included for reference: as FIG. 15 shows, in the pixel data 61 the black pixels are insufficiently connected and accurate character lines have not been generated. In FIG. 14 the margin areas between characters are filled with "1", whereas in FIG. 15 they remain "-1" and an accurate character line rectangle is not formed.

When the character line direction is vertical, on the other hand, the above processing is performed without rotating the image.

The expansion unit 13 specifies the position, size, and shape of each first line rectangle by obtaining the coordinates of the four corners of the rectangular area (first line rectangle) that contains the black pixels after the expansion processing. Likewise, the first connecting unit 17 specifies the position, size, and shape of each second line rectangle by obtaining the coordinates of the four corners of the rectangular area (second line rectangle) that contains the black pixels after the connection processing.

Returning to FIG. 3, the first specifying unit 18 specifies character areas based on the areas of the binary image that correspond to the second line rectangles. The second extraction unit 19 extracts at least one small rectangle contained in a second line rectangle. For each second line rectangle, the second determination unit 20 decides that the corresponding area of the binary image is a character area when the relationship between the aspect ratio of the second line rectangle and the number of small rectangles satisfies a third predetermined condition. For each second line rectangle, the third determination unit 21 decides that the corresponding area of the binary image is a character area when the relationship between the size of the second line rectangle and the sizes of the small rectangles satisfies a fourth predetermined condition.

より詳しくは、始めに、特定部１８の第２の抽出部１９は、第１の連結部１７によって形成された第２の行矩形から、行矩形の特徴的形状、たとえばアスペクト比に基づいて文字行候補矩形を抽出する。すなわち、第２の抽出部１９は、文字行方向に長く且つアスペクト比が規定値以上（例えば３）の矩形を文字行候補矩形として抽出する。次に、特定部１８の第２の決定部２０が、抽出された文字行候補矩形内に含まれる小矩形の数を調べる。そして、第２の決定部２０は、文字行候補矩形のアスペクト比と、文字行候補矩形の中に含まれる小矩形の数の組が文字行らしい値であるかによって文字行候補矩形を文字行矩形と背景の一部矩形とに分類し、文字判定を行う。次に、特定部１８の第３の決定部２１が、抽出された文字行候補矩形内に含まれる、小矩形の数、小矩形のサイズも調べる。そして、第３の決定部２１は、文字行候補矩形のアスペクト比と、文字行候補矩形の中に含まれる小矩形の数の組が文字行らしい値であるか、小矩形の大きさが同一サイズであるかによって文字行候補矩形を文字行矩形と背景の一部矩形とに分類し、文字判定を行う。 More specifically, the second extraction unit 19 of the specifying unit 18 first extracts character line candidate rectangles from the second row rectangles formed by the first connecting unit 17, based on a characteristic shape of the row rectangle such as its aspect ratio. That is, the second extraction unit 19 extracts, as a character line candidate rectangle, a rectangle that is long in the character line direction and whose aspect ratio is equal to or greater than a specified value (for example, 3). Next, the second determination unit 20 of the specifying unit 18 examines the number of small rectangles included in each extracted character line candidate rectangle. The second determination unit 20 then performs character determination by classifying each character line candidate rectangle as either a character line rectangle or a rectangle that is part of the background, according to whether the pair of its aspect ratio and the number of small rectangles it contains takes values typical of a character line. Next, the third determination unit 21 of the specifying unit 18 examines both the number and the size of the small rectangles included in each extracted character line candidate rectangle. The third determination unit 21 then performs character determination by classifying each character line candidate rectangle as either a character line rectangle or a rectangle that is part of the background, according to whether the pair of its aspect ratio and the number of small rectangles it contains takes values typical of a character line and whether the small rectangles are all of the same size.

図１６は、特定部１８における文字領域であるか否かの判断方法を示すイメージ図である。図１６（ａ）に示すように、第２の行矩形の縦横比（Ｘ／Ｙ）と第２の行矩形に含まれる小矩形の個数との関係が第３の所定の条件を満たしており、かつ、第２の行矩形の縦横比（Ｘ／Ｙ）と第２の行矩形に含まれる小矩形のサイズとの関係が第４の所定の条件を満たしている場合には、特定部１８は第２の行矩形を文字領域と判断する。 FIG. 16 is an image diagram illustrating how the specifying unit 18 determines whether an area is a character area. As shown in FIG. 16A, when the relationship between the aspect ratio (X / Y) of the second row rectangle and the number of small rectangles included in the second row rectangle satisfies the third predetermined condition, and the relationship between the aspect ratio (X / Y) of the second row rectangle and the size of the small rectangles included in the second row rectangle satisfies the fourth predetermined condition, the specifying unit 18 determines that the second row rectangle is a character area.

一方、図１６（ｂ）に示すように、第２の行矩形のサイズと第２の行矩形に含まれる小矩形のサイズとの関係、たとえば第２の行矩形の高さ（縦幅）と小矩形の長さ（横幅）との関係、が第４の所定の条件を満たしていない場合には、特定部１８は第２の行矩形を文字領域でないと判断する。具体的には、第２の行矩形の高さに対する小矩形の長さの割合が１以上である場合に文字領域でないと判断する。また、具体的には、第２の行矩形に含まれる小矩形のサイズにばらつきが大きい、すなわち縦幅もしくは横幅の標準偏差が所定値以上である場合に当該第２の行矩形は文字領域でないと判断する。 On the other hand, as shown in FIG. 16B, when the relationship between the size of the second row rectangle and the size of the small rectangles included in it, for example the relationship between the height (vertical width) of the second row rectangle and the length (horizontal width) of a small rectangle, does not satisfy the fourth predetermined condition, the specifying unit 18 determines that the second row rectangle is not a character area. Specifically, the second row rectangle is determined not to be a character area when the ratio of the length of a small rectangle to the height of the second row rectangle is 1 or more. It is likewise determined not to be a character area when the sizes of the small rectangles it contains vary widely, that is, when the standard deviation of their vertical or horizontal widths is equal to or greater than a predetermined value.

また、図１６（ｃ）に示すように、第２の行矩形の縦横比（Ｘ／Ｙ）と第２の行矩形に含まれる小矩形の個数との関係が第３の所定の条件を満たしていない場合、たとえば縦横比が１０である場合において、小矩形の個数が１０未満である場合（縦横比に対する小矩形の個数の割合が１未満である場合）や、小矩形の個数が５０以上である場合（縦横比に対する小矩形の割合が５以上である場合）には、特定部１８は第２の行矩形を文字領域でないと判断する。 Further, as shown in FIG. 16C, when the relationship between the aspect ratio (X / Y) of the second row rectangle and the number of small rectangles included in it does not satisfy the third predetermined condition, the specifying unit 18 determines that the second row rectangle is not a character area. For example, with an aspect ratio of 10, this is the case when the number of small rectangles is less than 10 (the ratio of the number of small rectangles to the aspect ratio is less than 1) or when the number of small rectangles is 50 or more (the ratio of the number of small rectangles to the aspect ratio is 5 or more).
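The decisions illustrated in FIG. 16 can be sketched roughly as follows. This is only an illustrative Python sketch, not the implementation of the specifying unit 18: the function name `is_character_line`, the parameter `size_std_limit`, and its default value are hypothetical, and the bounds 1 and 5 on the count-to-aspect ratio are taken from the example given for an aspect ratio of 10.

```python
from statistics import pstdev


def is_character_line(row_w, row_h, small_rects, size_std_limit=5.0):
    """Return True if a horizontal row rectangle looks like a character line.

    row_w, row_h   -- width X and height Y of the second row rectangle
    small_rects    -- list of (width, height) of the small rectangles inside it
    size_std_limit -- hypothetical limit on the standard deviation of sizes
    """
    if row_h <= 0 or not small_rects:
        return False

    aspect = row_w / row_h                      # X / Y
    count_ratio = len(small_rects) / aspect     # small rectangles per unit aspect

    # Third condition (FIG. 16(c)): too few (< 1) or too many (>= 5) small
    # rectangles relative to the aspect ratio means "not a character area".
    if count_ratio < 1 or count_ratio >= 5:
        return False

    # Fourth condition (FIG. 16(b)): a small rectangle whose width is at least
    # the row height means "not a character area".
    if any(w / row_h >= 1 for w, _ in small_rects):
        return False

    # Fourth condition, continued: widely varying small-rectangle sizes.
    widths = [w for w, _ in small_rects]
    heights = [h for _, h in small_rects]
    if pstdev(widths) >= size_std_limit or pstdev(heights) >= size_std_limit:
        return False

    return True
```

For a 200 x 20 row rectangle (aspect ratio 10), twelve small rectangles of similar size pass, while five or fifty small rectangles are rejected by the third condition.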

上記の判断によって、第２の行矩形のうちから文字行である可能性が高い行矩形が特定される。つまり、文字行矩形の特定がより正確になって、文字行の誤抽出が少なくなる。 Based on the above determination, a line rectangle that is likely to be a character line is identified from the second line rectangles. That is, the specification of the character line rectangle becomes more accurate, and the erroneous extraction of the character line is reduced.

第２の連結部２２は、特定部１８によって文字領域として特定された領域以外の前記第１の行矩形同士を、判定部１４により判定された文字行方向と垂直な方向に連結することによって第３の行矩形を生成する。つまり、第２の連結部２２は、文字行と垂直方向への連結を行う。ここでは、２値画像から特定部１８にて文字領域と判定された行矩形エリアを省いた画像について、規定しきい値範囲のサイズを持った第１の矩形毎に、判定部１４にて判定された文字行方向と垂直な方向に同一のサイズ（文字幅）の矩形があるかを判定し、同一サイズの矩形を順次連結を行うことで、再度新たな文字方向へ向かって行矩形を生成する。この処理を含めることにより、縦横両方向に文字行が含まれているような画像であっても文字領域をより正確に抽出することが可能である。 The second connecting unit 22 generates third row rectangles by connecting first row rectangles, other than those in the areas specified as character areas by the specifying unit 18, in the direction perpendicular to the character line direction determined by the determination unit 14. That is, the second connecting unit 22 performs connection in the direction perpendicular to the character lines. Here, for the image obtained by removing from the binary image the row rectangle areas determined to be character areas by the specifying unit 18, it is determined, for each first rectangle whose size falls within a specified threshold range, whether a rectangle of the same size (character width) exists in the direction perpendicular to the character line direction determined by the determination unit 14. Rectangles of the same size are then connected in sequence, so that row rectangles are generated again along the new character direction. Including this processing makes it possible to extract character areas more accurately even from an image that contains character lines in both the vertical and horizontal directions.

そして、第２の特定部２３は、前記２値画像における第３の行矩形に対応する領域に基づいて文字領域を特定する。そして、上記の各部は、生成された各２値画像に対して処理を行い、第１の特定部１８および第２の特定部２３は、それぞれの前記２値画像における前記第２の行矩形に対応する領域の和集合に基づいて文字領域を特定する。 The second specifying unit 23 then specifies a character area based on the area corresponding to the third row rectangle in the binary image. Each of the above units processes each of the generated binary images, and the first specifying unit 18 and the second specifying unit 23 specify the character area based on the union of the areas corresponding to the second row rectangles in the respective binary images.

文字色決定部は、ここでは、文字と判定された部分の色を決定する。決定方法は、各文字行内のＲＧＢ値をそれぞれ平均して算出する。 Here, the character color determination unit determines the color of the portion determined to be a character. In the determination method, the RGB values in each character line are averaged.

＜ＰＤＦ化処理＞ <PDF processing>
図１７は、本実施の形態にかかるＭＦＰ１０において画像データの圧縮を行ってＰＤＦファイルを作成する処理手順を示すフローチャートである。図１７のフローチャートに示される処理は、主にＣＰＵ４が記憶部３に記憶されるプログラムを実行して図２および図３に示される各部を制御することで実現される処理である。 FIG. 17 is a flowchart showing a processing procedure for creating a PDF file by compressing image data in the MFP 10 according to the present embodiment. The process shown in the flowchart of FIG. 17 is a process realized mainly by the CPU 4 executing a program stored in the storage unit 3 to control each unit shown in FIGS. 2 and 3.

すなわち、図１７を参照して、本実施の形態にかかるＭＦＰ１０においては、まず画像データ取得部１０１において画像データが取得される（ステップ１００、以下ステップをＳと略す。）。そして、取得された画像データに対して、前処理部１０３での前処理を経てから領域特定部１０５において領域特定処理が施される（Ｓ３００）。画像データにはその判別結果に応じて領域ごとに適した圧縮処理が行われて、ＰＤＦ化部１１３においてＰＤＦ化処理が実行されることで（Ｓ５００）、その画像データが圧縮されてＰＤＦファイルが作成される。 That is, referring to FIG. 17, in MFP 10 according to the present embodiment, first, image data acquisition unit 101 acquires image data (step 100; step is hereinafter abbreviated as S). Then, after the preprocessing in the preprocessing unit 103 is performed on the acquired image data, region specifying processing is performed in the region specifying unit 105 (S300). The image data is subjected to a compression process suitable for each area according to the determination result, and the PDF conversion process is executed in the PDF conversion unit 113 (S500), so that the image data is compressed and a PDF file is generated. Created.

すなわち、Ｓ５００では、Ｓ３００において文字領域と判定された領域を構成する画像データについては、解像度を低下させずに可逆圧縮部１０７でＭＭＲ圧縮方式のような可逆圧縮処理が施される。また、Ｓ３００において背景領域と判定された領域を構成する画像データについては、Ｓ５００において、低解像度化部１０９で解像度を低下させるように解像度変換された後に非可逆圧縮部１１１でＪＰＥＧ圧縮方式のような非可逆圧縮処理が施される。なお、Ｓ５００において、背景領域と判定された領域を構成する画像データについて解像度を低下させずに非可逆圧縮処理が施されてもよい。 That is, in S500, reversible compression processing such as the MMR compression method is performed in the reversible compression unit 107 without reducing the resolution for the image data constituting the region determined as the character region in S300. In addition, the image data constituting the area determined as the background area in S300 is subjected to resolution conversion so as to reduce the resolution in the resolution reduction unit 109 in S500, and then the JPEG compression method is used in the lossy compression unit 111. Irreversible compression processing is performed. In S500, the irreversible compression process may be performed without reducing the resolution on the image data constituting the area determined as the background area.

上記Ｓ５００でのＰＤＦ化処理については、上述したような、いわゆるコンパクトＰＤＦファイルを作成する一般的な処理が採用され、本発明において限定される処理ではない。以下においては、本発明の特徴とする領域特定処理（Ｓ３００）の処理手順について詳細に説明する。 The PDF conversion process in S500 employs a general process for creating a so-called compact PDF file as described above, and is not a process limited in the present invention. Hereinafter, the processing procedure of the area specifying process (S300), which is a feature of the present invention, will be described in detail.

＜文字領域特定処理＞ <Character area specifying process>
図１８は、文字領域特定処理Ｓ３００の処理手順を示すフローチャートである。図１８を参照して、ＣＰＵ４もしくは入力画像処理部２は、まず、入力された画像データに対して、２値画像生成処理を行う（Ｓ３１０）。より詳細には、ＣＰＵ４もしくは入力画像処理部２は、入力された画像データに対して、減色処理を施してから、複数種類のしきい値に基づいて２値化処理を施す（Ｓ３１０）。ここでの２値画像生成処理においては、ＣＰＵ４もしくは入力画像処理部２が、１つの画像データに対して、複数種類の色毎に２値画像を生成する。本実施の形態に係るＭＦＰ１０においては、記憶部３に記憶された図１９に示すカラーの画像に基づき、ＣＰＵ４もしくは入力画像処理部２が、４種類のしきい値に基づいて、図２０（ａ）から図２０（ｄ）に示す４種類の２値画像を生成する。そして、余白領域抽出処理Ｓ３２０へと移行する。 FIG. 18 is a flowchart showing the processing procedure of the character area specifying process S300. Referring to FIG. 18, the CPU 4 or the input image processing unit 2 first performs a binary image generation process on the input image data (S310). More specifically, the CPU 4 or the input image processing unit 2 applies a color reduction process to the input image data and then applies a binarization process based on a plurality of threshold values (S310). In this binary image generation process, the CPU 4 or the input image processing unit 2 generates, from a single set of image data, one binary image for each of a plurality of colors. In the MFP 10 according to the present embodiment, based on the color image shown in FIG. 19 stored in the storage unit 3, the CPU 4 or the input image processing unit 2 generates the four binary images shown in FIGS. 20(a) to 20(d) using four threshold values. The process then proceeds to the blank area extraction process S320.

図２１は、余白領域抽出処理Ｓ３２０の処理手順を示すフローチャートである。図２１に示すように、ＣＰＵ４が、対象領域について、主走査方向（ｘ方向）に連続する文字を構成していない画素としての特定色の画素（ここでは白画素）の数をカウントし、規定数以上連続する白画素を検出すると、その連続する白画素を特定の色にて塗りつぶす（Ｓ３２１）。なお、ここでは文字を構成していない画素が具体的に白画素である特定色の画素であるものとされているが、対象となる画素の色は限定されていなくてもよく、たとえば文字を構成する画素色以外の色の画素等であってもよい。 FIG. 21 is a flowchart showing the processing procedure of the blank area extraction process S320. As shown in FIG. 21, for the target area, the CPU 4 counts the number of consecutive pixels in the main scanning direction (x direction) that have a specific color (here, white pixels) and are therefore treated as pixels that do not constitute a character. When it detects a run of white pixels at least as long as a specified number, it fills that run with a specific color (S321). Although the pixels that do not constitute a character are taken here to be white pixels, the color of the target pixels need not be limited; for example, they may be pixels of any color other than the color of the pixels that constitute characters.

同様に、副走査方向について、規定数以上連続する白画素が検出され、さらにその白画素の連続に対して主走査方向の連続数がチェックされる（Ｓ３２２）。上記規定数もまた同様に決定されるものであるが、具体的には、主走査方向に２ｄｏｔ、副走査方向に１５０ｄｏｔ以上などが挙げられる。そして、図１８に戻って、膨張処理Ｓ３３０へと移行する。 Similarly, in the sub-scanning direction, white pixels continuous for a predetermined number or more are detected, and the number of consecutive white pixels in the main scanning direction is checked with respect to the continuous white pixels (S322). The prescribed number is also determined in the same manner. Specifically, the prescribed number is 2 dots in the main scanning direction, 150 dots or more in the sub-scanning direction, and the like. Then, returning to FIG. 18, the process proceeds to the expansion process S330.
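The run detection of S321 can be sketched for a single scan line as follows. This is a minimal Python sketch under stated assumptions: the function name `mark_blank_runs` is hypothetical, white pixels are encoded as 0 and character pixels as 1, and the marker value -1 for blank pixels is borrowed from the description of the first connection process. The same routine could be applied column by column for the sub-scanning direction.

```python
def mark_blank_runs(row, min_run, white=0, blank=-1):
    """Mark runs of at least `min_run` consecutive white pixels as blank.

    row     -- one scan line as a sequence of pixel values
    min_run -- the specified number of consecutive white pixels
    Returns a new list; shorter white runs are left untouched.
    """
    out = list(row)
    start = None  # index where the current white run began, if any
    # A trailing sentinel (None) closes a run that reaches the end of the line.
    for i, value in enumerate(list(row) + [None]):
        if value == white:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_run:
                out[start:i] = [blank] * (i - start)
            start = None
    return out
```

With `min_run=3`, the line `[1, 0, 0, 0, 1, 0, 0, 1]` becomes `[1, -1, -1, -1, 1, 0, 0, 1]`: only the run of three white pixels is marked as blank.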

図２２は、膨張処理Ｓ３３０の処理手順を示すフローチャートである。図２２に示すように、膨張処理Ｓ３３０においては、文字を構成する特定色の画素（ここでは黒画素とする）が膨張され近傍の画素が連結される。なお、ここでは文字を構成する画素が具体的に黒画素である特定色の画素であるものとされているが、対象となる画素の色は限定されていなくてもよく、たとえば背景を構成する画素色以外の色の画素等であってもよい。 FIG. 22 is a flowchart showing the processing procedure of the expansion processing S330. As shown in FIG. 22, in the expansion process S330, pixels of a specific color constituting the character (here, black pixels) are expanded and neighboring pixels are connected. Here, the pixels constituting the character are assumed to be pixels of a specific color that is specifically a black pixel, but the color of the target pixel may not be limited. For example, it constitutes the background. It may be a pixel of a color other than the pixel color.

図２２を参照して、まず、近傍の画素として、具体的には対象領域の主走査方向について所定の距離以下で隣合う黒画素が検出される（Ｓ３３１）。より詳しくは、画像を主走査方向（ｘ方向）に走査して、あるｘ座標について、そのｘ座標位置の左右最大文字幅（たとえば１９０ｐｉｘｅｌ）の１／２の範囲について黒画素が探索され、黒画素が検出されたそのｙ座標における配列値が１とされて、ｙ座標が０から画像高さから１減じた座標値までについて、順次ｙ方向に、黒画素の探索が繰り返される。 With reference to FIG. 22, first, as neighboring pixels, specifically, adjacent black pixels within a predetermined distance or less in the main scanning direction of the target region are detected (S331). More specifically, the image is scanned in the main scanning direction (x direction), and a black pixel is searched for a range of ½ of the maximum left and right character width (for example, 190 pixels) of the x coordinate position for a certain x coordinate. The array value at the y coordinate where the pixel is detected is set to 1, and the search for black pixels is repeated in the y direction sequentially from the y coordinate to the coordinate value obtained by subtracting 1 from the image height.

ただし、途中で行間が検出された場合には、それ以上のｙ方向の探索が行われない。さらに、上記あるｘ座標について生成された配列が走査され、配列値０の連続が規定数以下である場合にはその連続の配列値を１に書換えられる。なお、本実施の形態においては、近傍の画素を検出する方法として画素の間隔が所定の距離以下であるか否かで検出する方法が示されているが、その他の方法で近傍の画素が検出されてもよい。 However, if a line spacing is detected midway, no further search in the y direction is performed. Furthermore, the array generated with respect to the certain x coordinate is scanned, and if the sequence of array values 0 is equal to or less than a specified number, the array value of the sequences is rewritten to 1. In this embodiment, as a method for detecting neighboring pixels, a method for detecting whether or not a pixel interval is equal to or smaller than a predetermined distance is shown. However, neighboring pixels are detected by other methods. May be.
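The rewriting of short runs of 0 in the per-column array can be sketched as follows. This is an illustrative Python sketch only: the function name `close_short_gaps` is hypothetical, and the text does not say whether runs touching the ends of the array are treated differently, so the sketch applies the rule literally to every run of 0.

```python
def close_short_gaps(flags, max_gap):
    """Rewrite every run of 0 whose length is at most `max_gap` to 1.

    flags   -- array of 0/1 values generated for one x coordinate
    max_gap -- the specified number of consecutive 0 values
    """
    out = list(flags)
    n = len(out)
    i = 0
    while i < n:
        if out[i] == 0:
            j = i
            while j < n and out[j] == 0:   # find the end of this run of 0
                j += 1
            if j - i <= max_gap:
                out[i:j] = [1] * (j - i)
            i = j
        else:
            i += 1
    return out
```

With `max_gap=2`, the array `[1, 0, 0, 1, 0, 0, 0, 0, 1]` becomes `[1, 1, 1, 1, 0, 0, 0, 0, 1]`: the gap of two is closed and the gap of four is kept.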

次に、副走査方向（ｙ方向）に黒画素が膨張される（Ｓ３３２）。より詳しくは、画像が主走査方向（ｘ方向）に走査されてあるｘ座標についてｙ方向に走査され、黒画素が探索される。そして、検出された黒画素の上下最大文字幅（たとえば１９０ｐｉｘｅｌ）の１／２の範囲について、そのｘ座標について生成された配列の配列値が１であるならば、その範囲にある白画素が黒に塗りつぶされる。 Next, black pixels are expanded in the sub-scanning direction (y direction) (S332). More specifically, the image is scanned in the y direction with respect to the x coordinate scanned in the main scanning direction (x direction), and a black pixel is searched. Then, if the array value of the array generated for the x coordinate is 1 in the range of ½ of the maximum vertical character width (for example, 190 pixels) of the detected black pixel, the white pixels in the range are black. It is filled with.

次に、対象領域が９０度回転され（Ｓ３３３）、同様に、主走査方向について所定の距離以下で隣り合う黒画素が検出されて（Ｓ３３４）、副走査方向にそれらの黒画素が膨張される（Ｓ３３５）。図２３は、回転前の膨張処理後の画像と、回転後の膨張処理後の画像とを重ね合わせて、第１の行矩形の和集合を取得した状態の画像データを示すイメージ図である。その後、図１８に戻って、文字行方向判定処理Ｓ３４０が実行される。 Next, the target region is rotated 90 degrees (S333), and similarly, adjacent black pixels within a predetermined distance in the main scanning direction are detected (S334), and these black pixels are expanded in the sub-scanning direction. (S335). FIG. 23 is an image diagram showing image data in a state in which a union of first row rectangles is acquired by superimposing an image after expansion processing before rotation and an image after expansion processing after rotation. Thereafter, returning to FIG. 18, the character line direction determination processing S340 is executed.

図２４は、文字行方向判定処理Ｓ３４０の処理手順を示すフローチャートである。図２４に示すように、ＣＰＵ４は、上述の処理によって連結された黒画素群を囲む最小矩形領域を得るためにラベリングを行い、当該ラベリングによって得られた連結された文字を囲む最小矩形の座標値を取得する（Ｓ３４１）。なお、ここでのラベリング方法は一般的な方法を採用するものとする。 FIG. 24 is a flowchart showing the processing procedure of the character line direction determination processing S340. As shown in FIG. 24, the CPU 4 performs labeling in order to obtain the minimum rectangular area surrounding the black pixel group connected by the above-described processing, and the coordinate value of the minimum rectangle surrounding the connected character obtained by the labeling. Is acquired (S341). Note that a general method is adopted as the labeling method here.
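The text only says that a general labeling method is used in S341. One common choice, shown here purely as an assumption, is connected-component labeling by breadth-first search with 4-connectivity; the function name `bounding_boxes` and the (x0, y0, x1, y1) box format are hypothetical.

```python
from collections import deque


def bounding_boxes(img):
    """Return the minimum bounding box of each 4-connected group of 1-pixels.

    img -- binary image as a list of rows, 1 = black pixel, other = background
    Each box is (x0, y0, x1, y1) with inclusive coordinates.
    """
    h = len(img)
    w = len(img[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if img[y][x] != 1 or seen[y][x]:
                continue
            # Start a new component and grow it with a breadth-first search.
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nx, ny = cx + dx, cy + dy
                    if 0 <= nx < w and 0 <= ny < h \
                            and img[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes
```

Whether the actual apparatus uses 4-connectivity or 8-connectivity is not stated; switching to 8-connectivity only requires adding the four diagonal offsets.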

次に、上記ラベリングで得られた矩形領域ごとに、短辺の長さ、短辺と長辺との長さの比率を算出し、所定条件を満たす矩形のみを抽出する（Ｓ３４２）。画像全体として縦長と横長の矩形数をカウントする（Ｓ３４３）。そして、縦長の矩形と横長の矩形とどちらが多いかを判断する（Ｓ３４４）。縦長の矩形の方が多い場合（Ｓ３４４にてＹＥＳの場合）、文字行を縦方向に決定する（Ｓ３４５）。縦長の矩形の方が多くない場合（Ｓ３４４にてＮＯの場合）、文字行を横方向に決定する（Ｓ３４６）。その後、図１８に戻って、第１の連結処理Ｓ３５０が実行される。 Next, for each rectangular region obtained by the labeling, the length of the short side and the ratio of the length of the short side to the long side are calculated, and only the rectangle satisfying the predetermined condition is extracted (S342). The number of vertically and horizontally long rectangles is counted as the entire image (S343). Then, it is determined whether there are more vertically long rectangles or horizontally long rectangles (S344). If there are more vertically long rectangles (YES in S344), the character line is determined in the vertical direction (S345). If there are not more vertically long rectangles (NO in S344), the character line is determined in the horizontal direction (S346). Thereafter, returning to FIG. 18, the first connection process S350 is executed.
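Steps S342 to S346 amount to a majority vote between vertically long and horizontally long rectangles, which can be sketched as follows. This is a rough Python sketch: the function name `line_direction` and the parameters `min_short` and `min_ratio` are hypothetical stand-ins for the unspecified "predetermined condition", with the default ratio of 3 taken from the example value given elsewhere in the description.

```python
def line_direction(boxes, min_short=2, min_ratio=3.0):
    """Decide the character line direction from labeled bounding boxes.

    boxes -- iterable of (x0, y0, x1, y1) with inclusive coordinates
    Returns "vertical" if vertically long rectangles are in the majority,
    otherwise "horizontal" (ties fall to horizontal, as in the NO branch of S344).
    """
    tall = wide = 0
    for x0, y0, x1, y1 in boxes:
        w = x1 - x0 + 1
        h = y1 - y0 + 1
        short, long_side = min(w, h), max(w, h)
        # S342: keep only rectangles that satisfy the assumed shape condition.
        if short < min_short or long_side / short < min_ratio:
            continue
        # S343: count vertically long and horizontally long rectangles.
        if h > w:
            tall += 1
        else:
            wide += 1
    # S344 to S346: pick the direction with more rectangles.
    return "vertical" if tall > wide else "horizontal"
```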

図２５は、第１の連結処理Ｓ３５０の処理手順を示すフローチャートである。図２５に示すように、ＣＰＵ４は、判定処理において判定された文字行方向に第１の行矩形同士を連結することによって第２の行矩形を生成する。より詳細には、ＣＰＵ４は、まず黒画素を左右方向へ膨張させて画像データを取得する（Ｓ３５１）。そして左右の画像データを重ね合わせて、黒画素の和集合を算出したのちに、文字方向に向かって第２の所定の間隔以内に存在する「１」を連結する（Ｓ３５２）。ここで、第２の所定の間隔は、膨張処理における第１の所定の間隔よりも広い間隔が指定されている。また、膨張処理における連結とは異なり、文字間が余白画素「−１」であっても文字画素「１」によって埋める。その後、図１８に戻って、第１の文字判定処理Ｓ３６０が実行される。 FIG. 25 is a flowchart showing the processing procedure of the first connection processing S350. As shown in FIG. 25, the CPU 4 generates a second line rectangle by connecting the first line rectangles in the character line direction determined in the determination process. More specifically, the CPU 4 first expands the black pixels in the left-right direction to acquire image data (S351). Then, the left and right image data are overlapped to calculate the union of black pixels, and then “1” existing within the second predetermined interval in the character direction is connected (S352). Here, the second predetermined interval is specified to be wider than the first predetermined interval in the expansion process. Further, unlike the connection in the expansion process, even if the space between characters is a blank pixel “−1”, it is filled with the character pixel “1”. Thereafter, returning to FIG. 18, the first character determination process S360 is executed.
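The connection of S352 along one scan line in the character line direction can be sketched as follows. This is only an illustrative Python sketch: the function name `connect_along_line` is hypothetical, and measuring the "second predetermined interval" as the number of pixels strictly between two character pixels is an assumption. Unlike the expansion process, the gap is filled whatever it contains, including blank pixels marked -1.

```python
def connect_along_line(row, max_gap, text=1):
    """Fill the gap between two character pixels when it is at most `max_gap` long.

    row     -- one line of pixel values along the character line direction
              (1 = character pixel, -1 = blank pixel, 0 = other background)
    max_gap -- the second predetermined interval, in pixels
    """
    out = list(row)
    last = None  # index of the most recent character pixel
    for i, value in enumerate(row):
        if value == text:
            gap = i - last - 1 if last is not None else 0
            if 0 < gap <= max_gap:
                # Blank pixels (-1) inside the gap are overwritten as well.
                out[last + 1:i] = [text] * gap
            last = i
    return out
```

With `max_gap=3`, the line `[1, -1, -1, 1, 0, 0, 0, 0, 1]` becomes `[1, 1, 1, 1, 0, 0, 0, 0, 1]`: the two blank pixels are bridged, while the gap of four is left open.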

図２６は、第１の文字判定処理Ｓ３６０の処理手順を示すフローチャートである。図２６に示すように、ＣＰＵ４は、矩形領域ごとに、短辺の長さと、短辺と長辺との長さの比率とを算出し、所定条件を満たす第２の行矩形のみを抽出する（Ｓ３６１）。ＣＰＵ４は、第２の行矩形に含まれる少なくとも１つの小矩形を抽出する（Ｓ３６２）。ＣＰＵ４は、各第２の行矩形について、第２の行矩形の縦横比と小矩形の個数との関係が第３の所定条件を満たす場合に、第２の行矩形に対応する領域を文字領域として決定する（Ｓ３６３）。加えて、ＣＰＵ４は、各第２の行矩形について、第２の行矩形のサイズと小矩形のサイズとの関係が第４の所定条件を満たす場合に、第２の行矩形に対応する領域を文字領域として決定してもよい。 FIG. 26 is a flowchart showing the processing procedure of the first character determination processing S360. As shown in FIG. 26, for each rectangular area, the CPU 4 calculates the length of the short side and the ratio of the length of the short side to the long side, and extracts only the second row rectangle that satisfies the predetermined condition. (S361). The CPU 4 extracts at least one small rectangle included in the second row rectangle (S362). For each second row rectangle, the CPU 4 sets the region corresponding to the second row rectangle as the character region when the relationship between the aspect ratio of the second row rectangle and the number of small rectangles satisfies the third predetermined condition. (S363). In addition, for each second row rectangle, when the relationship between the size of the second row rectangle and the size of the small rectangle satisfies the fourth predetermined condition, the CPU 4 selects an area corresponding to the second row rectangle. It may be determined as a character area.

図２７は、文字行方向への第１の連結処理後の画像を示すイメージ図である。図２６および図２７に示すように、膨張後の画像の第１の行矩形と比較して、連結後の画像の第２の行矩形には、文字行内に余白が混在する箇所が無くなっている。つまり、より正確な文字行領域が取得されている。その後、図１８に戻って、第２の連結処理Ｓ３７０が実行される。 FIG. 27 is an image diagram showing an image after the first connection processing in the character line direction. As shown in FIGS. 26 and 27, compared to the first row rectangle of the image after expansion, the second row rectangle of the image after concatenation has no portion where a blank space is mixed in the character line. . That is, a more accurate character line area is acquired. Thereafter, returning to FIG. 18, the second connection process S370 is executed.

図２８は、第２の連結処理Ｓ３７０の処理手順を示すフローチャートである。図２８に示すように、ＣＰＵ４は、特定部１８によって文字領域として特定された領域以外の領域において、上述の処理によって連結された黒画素群を囲む矩形領域を得るためにラベリングを行い、当該ラベリングによって得られた連結された文字を囲む最小矩形の座標値を取得する（Ｓ３７１）。なお、ここでのラベリング方法は一般的な方法を採用するものとする。次に、上記ラベリングで得られた矩形領域ごとに、短辺の長さ、短辺と長辺との長さの比率を算出し、所定条件を満たす矩形のみを抽出する（Ｓ３７２）。ここで、ＣＰＵ４は、当該所定条件を満たす矩形を判定された文字行方向と垂直な方向に連結することによって第３の行矩形を生成してもよい。そして、ＣＰＵ４は、第１の文字判定処理Ｓ３６０と同様に、２値画像における第３の行矩形に対応する領域に基づいて文字領域を特定する。 FIG. 28 is a flowchart showing the processing procedure of the second connection process S370. As shown in FIG. 28, in the areas other than those specified as character areas by the specifying unit 18, the CPU 4 performs labeling to obtain the rectangular areas surrounding the groups of black pixels connected by the above processing, and acquires the coordinate values of the minimum rectangle surrounding each group of connected characters obtained by the labeling (S371). A general labeling method is used here. Next, for each rectangular area obtained by the labeling, the length of the short side and the ratio between the lengths of the short and long sides are calculated, and only rectangles satisfying a predetermined condition are extracted (S372). Here, the CPU 4 may generate third row rectangles by connecting the rectangles that satisfy the predetermined condition in the direction perpendicular to the determined character line direction. Then, as in the first character determination process S360, the CPU 4 specifies a character area based on the area corresponding to the third row rectangle in the binary image.

その後、図１８に戻って、全ての２値画像に対して処理が完了したか否かが判断される（Ｓ３８０）。全ての２値画像に対して処理が完了していない場合（Ｓ３８０にてＮＯの場合）、次の２値画像に対してＳ３２０からＳ３７０の処理が繰り返される。一方、全ての２値画像に対して処理が完了した場合（Ｓ３８０にてＹＥＳの場合）、文字色決定処理Ｓ３９０が実行する。 Thereafter, returning to FIG. 18, it is determined whether or not the processing has been completed for all the binary images (S380). If processing has not been completed for all binary images (NO in S380), the processing from S320 to S370 is repeated for the next binary image. On the other hand, when the processing is completed for all the binary images (YES in S380), the character color determination processing S390 is executed.

図２９は、文字色決定処理の処理手順を示すフローチャートである。図２９に示すように、文字領域に対応する元画像の色データ（ＲＧＢデータ）が参照されて、文字の色が決定される。ＣＰＵ４が、文字領域と判定された第２の行矩形毎に、すなわちＲＧＢデータの平均値が算出されて（Ｓ３９１）、１つの文字領域に対して１色が割当てられる（Ｓ３９２）。なお、文字領域と判定され隣り合う２つの矩形領域の間隔が所定の間隔以下であり、それらの矩形領域に割当てられた色の差が所定値以下である場合、これらの矩形領域を統合してもよい。このようにすることが文字領域である矩形領域の数を減らすことができ、作成されるＰＤＦデータのサイズを小さくすることができる。また、作成処理の速度を早めることができる。 FIG. 29 is a flowchart showing the processing procedure of the character color determination process. As shown in FIG. 29, the color of the characters is determined by referring to the color data (RGB data) of the original image corresponding to each character area. For each second row rectangle determined to be a character area, the CPU 4 calculates the average value of the RGB data (S391) and assigns one color to each character area (S392). If the interval between two adjacent rectangular areas determined to be character areas is equal to or smaller than a predetermined interval, and the difference between the colors assigned to them is equal to or smaller than a predetermined value, these rectangular areas may be merged. Doing so reduces the number of rectangular areas that are character areas, which reduces the size of the PDF data to be created and also speeds up the creation process.
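The averaging of S391 and S392 and the optional merging test can be sketched as follows. This is a minimal Python sketch under stated assumptions: the function names `average_color` and `should_merge` are hypothetical, the color difference is measured as the largest per-channel absolute difference (the text does not specify a metric), and which pixels of the row rectangle are passed in is left to the caller.

```python
def average_color(pixels):
    """Return the per-channel integer average of a list of (R, G, B) pixels."""
    n = len(pixels)
    if n == 0:
        raise ValueError("at least one pixel is required")
    return tuple(sum(p[c] for p in pixels) // n for c in range(3))


def should_merge(gap, color_a, color_b, max_gap, max_diff):
    """Decide whether two adjacent character areas may be merged.

    gap      -- distance between the two rectangular areas, in pixels
    color_a  -- (R, G, B) color assigned to the first area
    color_b  -- (R, G, B) color assigned to the second area
    max_gap  -- the predetermined interval
    max_diff -- the predetermined color difference (assumed metric, see above)
    """
    diff = max(abs(a - b) for a, b in zip(color_a, color_b))
    return gap <= max_gap and diff <= max_diff
```

For example, two areas 3 pixels apart with colors (20, 30, 40) and (22, 30, 41) would be merged for `max_gap=5` and `max_diff=4`, whereas the same pair 10 pixels apart would not.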

以上のようにして、全ての２値画像に対して処理が完了して、コンパクトＰＤＦデータ作成における領域特定処理Ｓ３００が終了する。図３０は、コンパクトＰＤＦデータ作成前の最終的な文字領域の画像を示すイメージ図である。 As described above, the processing for all the binary images is completed, and the area specifying process S300 in creating compact PDF data is completed. FIG. 30 is an image diagram showing an image of a final character area before the creation of compact PDF data.

上述のように、本実施の形態にかかる領域特定処理では、原稿に含まれるテキスト領域と図領域とについて文字判定処理を分岐し、テキスト領域については従来からなされている黒画素が連結された矩形単位で文字判定がなされるのに対して、図領域については黒画素が連結されることなくラベリングで得られた矩形領域ごとに文字判定がなされる。 As described above, in the area specifying process according to the present embodiment, the character determination process is branched for the text area and the figure area included in the document, and the conventional black pixel is connected to the text area. While character determination is performed in units, character determination is performed for each rectangular area obtained by labeling without connecting black pixels in the figure area.

このように図領域内では黒画素を連結しないことによって、図領域にある文字近傍に多く存在すると考えられる線や点が文字を構成する画素と連結されて、文字判定の精度を低下させることが防止される。その結果、図領域中の文字領域が高精度で判定される。また、図領域とテキスト領域とを分けてテキスト領域について従来からなされている文字判定を行うことで、処理速度も確保される。 In this way, by not connecting black pixels in the figure area, lines and dots that are considered to exist in the vicinity of the characters in the figure area are connected to the pixels that make up the character, thereby reducing the accuracy of character determination. Is prevented. As a result, the character area in the figure area is determined with high accuracy. Moreover, the processing speed is also ensured by dividing the figure area and the text area and performing the conventional character determination for the text area.

＜まとめ＞ <Summary>
以下、本実施の形態に係る画像処理装置（ＭＦＰ１０）についての特徴をまとめる。本実施の形態に係る文字行矩形の形成方法は、一度連結を行って行単位に形成された矩形群の情報（縦長の矩形か横長の矩形か）を用い、文字間に余白があると判定してしまった文字行を縦方向か横方向のどちらに最連結を行えば良いか文字行方向を決定し、再度、連結処理を行うことで、正確に行矩形を生成する。また、画像全体を見たときの文字行方向と垂直な方向の文字行を正確に抽出するために、画像全体で見たときの文字行方向と垂直な方向に走査し、規定しきい値以上のサイズを持った矩形について、垂直方向に同一サイズの矩形があるかを判定し、同一サイズの矩形があれば順次連結を行うことで、行矩形を生成する。 The features of the image processing apparatus (MFP 10) according to the present embodiment are summarized below. The method of forming character line rectangles according to the present embodiment uses information on the group of rectangles formed line by line by a first connection (whether they are vertically long or horizontally long) to determine the character line direction, that is, whether character lines that were judged to have blank space between characters should be reconnected in the vertical or the horizontal direction. The connection process is then performed again, so that row rectangles are generated accurately. In addition, to accurately extract character lines that run perpendicular to the character line direction of the image as a whole, the image is scanned in the direction perpendicular to that character line direction. For each rectangle whose size is equal to or greater than a specified threshold, it is determined whether rectangles of the same size exist in the perpendicular direction, and if so they are connected in sequence to generate row rectangles.

そして、本実施の形態に係る文字を判定する方法として、形成された矩形の中からアスペクト比が規定値以上の文字行らしい矩形を抽出した後で、行矩形内に含まれる小矩形（文字単位に相当）の大きさにばらつきがあるような行矩形は、背景の一部分であると判定することで、正確に文字を判定する。また、行矩形内に含まれる小矩形（文字単位に相当）の数と行矩形のアスペクト比の組について、行矩形のアスペクト比に対し、小矩形数が少なすぎたり多すぎたりした矩形は、背景の一部分であると判定することで、正確に文字を判定する。 Then, as a method of determining the character according to the present embodiment, after extracting a rectangle that seems to be a character line having an aspect ratio of a specified value or more from the formed rectangle, a small rectangle (character unit) included in the line rectangle The line rectangle having a variation in the size of the image is determined to be a part of the background, thereby accurately determining the character. In addition, regarding the combination of the number of small rectangles (corresponding to character units) contained in the row rectangle and the aspect ratio of the row rectangle, the rectangle that has too few or too many small rectangles relative to the row rectangle aspect ratio is By determining that it is a part of the background, the character is accurately determined.

Specifically, in order to achieve the above object, the MFP 10 according to the present embodiment has a configuration that realizes the following functions.
(1) The scanned image is color-reduced and a binary image is created for each color, with the aim of separating the characters from the background around them.
(2) Each created image is scanned in the main scanning direction and the sub-scanning direction, and margin portions containing no black pixels at all are extracted.
(3) In each remaining region, the pixels are expanded and connected.
(4) Among the connected rectangles, those whose aspect ratio is at or above a specified value (for example, 3) are regarded as rectangles that form a block of a line, and the character line direction is determined from whether horizontally long or vertically long rectangles are more numerous in the image as a whole.
(5) In the determined direction (the character line direction), the pixels are expanded and connected again, so that character lines are formed from characters that could not form a line because a margin had been extracted between them. This connection may straddle a region extracted as a margin in (2).
(6) From the rectangles produced by the connection, those whose aspect ratio is at or above a specified value (for example, 3) are extracted as character line candidate rectangles.
(7) The number and the sizes of the small rectangles contained in each extracted character line candidate rectangle are examined.
(8) Each character line candidate rectangle is classified as either a character line rectangle or a rectangle that is part of the background, according to whether the pair of its aspect ratio and the number of small rectangles it contains takes a value typical of a character line, or whether the small rectangles are of the same size. Character determination is performed in this way.
(9) Each image produced in (2), with the rectangles judged to be characters removed, is scanned in the direction perpendicular to the character line direction. For each rectangle whose size falls within a specified threshold range, it is determined whether a rectangle of the same size exists in the perpendicular direction, and if so, the rectangles are connected one after another to generate a line rectangle.
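Steps (2) and (4) above can be illustrated with a short sketch. It is a simplified reading of the text, not the implementation in the MFP 10: the image is assumed to be a list of rows with 1 for a black pixel and 0 for a white pixel, the function names `blank_rows_and_columns` and `vote_line_direction` are hypothetical, margins are reduced to fully blank rows and columns, and the tie-breaking rule in the vote is an assumption because the text does not state one.

```python
from typing import List, Sequence, Tuple

Image = Sequence[Sequence[int]]  # 1 = black pixel, 0 = white pixel
Box = Tuple[int, int, int, int]  # (x, y, width, height)


def blank_rows_and_columns(img: Image) -> Tuple[List[int], List[int]]:
    """Step (2), simplified: indices of rows and columns with no black pixel."""
    if not img:
        return [], []
    rows = [y for y, row in enumerate(img) if not any(row)]
    cols = [x for x in range(len(img[0])) if not any(row[x] for row in img)]
    return rows, cols


def vote_line_direction(boxes: Sequence[Box], min_aspect: float = 3.0) -> str:
    """Step (4): compare the counts of horizontally and vertically long boxes."""
    horizontal = sum(1 for _, _, w, h in boxes if h > 0 and w / h >= min_aspect)
    vertical = sum(1 for _, _, w, h in boxes if w > 0 and h / w >= min_aspect)
    # Ties go to "horizontal" here; this choice is an assumption of the sketch.
    return "horizontal" if horizontal >= vertical else "vertical"


if __name__ == "__main__":
    page = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
    ]
    print(blank_rows_and_columns(page))
    print(vote_line_direction([(0, 0, 90, 10), (0, 20, 60, 10), (0, 40, 10, 50)]))
```

The boxes passed to `vote_line_direction` stand for the bounding rectangles obtained after the expansion and connection in step (3). Boxes that are neither horizontally nor vertically long, such as near-square ones, do not take part in the vote.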

<Other embodiments>
Although the present embodiment describes the case where the invention is applied to character determination processing performed as image processing before a PDF file is created, the present invention is not limited to this. For example, it can also be applied to character determination processing performed as preprocessing for character recognition such as OCR (Optical Character Reader), so that characters are not erroneously recognized.

Furthermore, a character determination program can be provided that causes a computer having the CPU 4 to execute the character determination processing executed by the MFP 10 according to the present embodiment. Such a program can be recorded on a computer-readable recording medium, such as a flexible disk supplied with the computer, a CD-ROM (Compact Disk-Read Only Memory), a ROM (Read Only Memory), a RAM (Random Access Memory), or a memory card, and provided as a program product. Alternatively, the program can be provided by recording it on a recording medium such as a hard disk built into the computer. The program can also be provided by download via a network.

The character determination program according to the present invention may call the necessary modules, from among the program modules provided as part of the computer's operating system (OS), in a predetermined sequence and at predetermined timing so as to execute the information management processing. In that case, the program itself does not include those modules, and the information management processing is executed in cooperation with the OS. A program that does not include such modules can also be included in the character determination program according to the present invention.

The provided program product is installed in a program storage unit such as a hard disk and then executed. The program product includes the program itself and the recording medium on which the program is recorded.

The embodiment disclosed here should be considered illustrative in all respects and not restrictive. The scope of the present invention is indicated by the claims rather than by the above description, and is intended to include all modifications within the meaning and scope equivalent to the claims.

A diagram showing a specific example of the hardware configuration of the MFP according to the present embodiment.
A block diagram showing the functional configuration for creating a PDF file.
A functional block diagram showing the functional configuration of the character area specifying unit.
An image diagram showing pixel expansion in the main scanning direction.
An image diagram showing pixel expansion in the sub-scanning direction of the image data after expansion to the right.
An image diagram showing pixel expansion in the sub-scanning direction of the image data after expansion to the left.
An image diagram showing the union of the black pixels of the image data after expansion to the left and right.
An image diagram showing pixel expansion in the main scanning direction after rotation.
An image diagram showing pixel expansion in the sub-scanning direction of the image data after expansion to the right, after rotation.
An image diagram showing pixel expansion in the sub-scanning direction of the image data after expansion to the left, after rotation.
An image diagram showing the union of the black pixels of the image data after expansion to the left and right, after rotation.
An image diagram showing rotation of the image data.
An image diagram showing the expansion processing.
An image diagram showing the first connection processing between first character line rectangles (black pixels).
An image diagram showing the pixel data obtained by superimposing pixel data on which only the expansion processing has been performed.
An image diagram showing how the specifying unit judges whether an area is a character area.
A flowchart showing the processing procedure for creating a PDF file.
A flowchart showing the processing procedure of the character area specifying processing.
A diagram showing an example of an input color image.
An image diagram showing a binary image.
A flowchart showing the processing procedure of the margin area extraction processing.
A flowchart showing the processing procedure of the expansion processing.
An image diagram showing the image data in the state where the union of the first row rectangles has been obtained by superimposing the image after expansion processing before rotation and the image after expansion processing after rotation.
A flowchart showing the processing procedure of the character line direction determination processing.
A flowchart showing the processing procedure of the first connection processing.
A flowchart showing the processing procedure of the first character determination processing.
An image diagram showing the image after the first connection processing in the character line direction.
A flowchart showing the processing procedure of the second connection processing.
A flowchart showing the processing procedure of the character color determination processing.
An image diagram showing the image of the final character area before the compact PDF data is created.

Explanation of symbols

1 scan processing unit, 2 input image processing unit, 3 storage unit, 4 CPU, 5 network I/F, 6 output image processing unit, 7 engine unit, 8 modem/NCU, 9 operation unit, 10 image processing apparatus (MFP), 11 generation unit, 12 first extraction unit, 13 expansion unit, 14 determining unit, 15 calculation unit, 16 first determination unit, 17 first connecting unit, 18 first specifying unit, 19 second extraction unit, 20 second determination unit, 21 third determination unit, 22 second connecting unit, 23 second specifying unit, 101 image data acquisition unit, 103 preprocessing unit, 105 area specifying unit, 107 lossless compression unit, 109 resolution reduction unit, 111 lossy compression unit, 113 PDF conversion unit.

Claims

An image processing apparatus for specifying a character area from an image composed of a plurality of pixels, comprising:
generating means for generating a binary image based on the image;
first extraction means for extracting a margin area and a non-margin area other than the margin area from the binary image;
expansion means for generating first row rectangles by performing pixel expansion processing on the non-margin area in a first direction and in a second direction perpendicular to the first direction, respectively;
determining means for determining a character line direction of the binary image based on an aspect ratio of the first row rectangles;
first connecting means for generating a second row rectangle by connecting the first row rectangles in the character line direction determined by the determining means; and
first specifying means for specifying a character area based on an area corresponding to the second row rectangle in the binary image.

The image processing apparatus according to claim 1, wherein the determining means includes:
calculating means for calculating, among the first row rectangles, the number of row rectangles satisfying a first predetermined condition indicating that the aspect ratio is that of a horizontally long row rectangle, and the number of row rectangles satisfying a second predetermined condition indicating that the aspect ratio is that of a vertically long row rectangle; and
first determination means for determining the character line direction of the binary image based on the two numbers.

The image processing apparatus according to claim 1, wherein the first specifying means includes:
second extraction means for extracting at least one small rectangle included in the second row rectangle; and
second determination means for determining, for each second row rectangle, an area corresponding to the second row rectangle as a character area when the ratio of the number of small rectangles to the aspect ratio of the second row rectangle is within a predetermined range.

The image processing apparatus according to claim 1, wherein the first specifying means includes:
second extraction means for extracting at least one small rectangle included in the second row rectangle; and
third determination means for determining, for each second row rectangle, an area corresponding to the second row rectangle as a character area when the variation in the size of the small rectangles included in the second row rectangle is equal to or less than a predetermined value.

The image processing apparatus according to claim 1, further comprising:
second connecting means for generating a third row rectangle by connecting, in a direction perpendicular to the character line direction determined by the determining means, the first row rectangles other than the area specified as the character area by the first specifying means; and
second specifying means for specifying a character area based on an area corresponding to the third row rectangle in the binary image.

The image processing apparatus according to claim 1, wherein:
the generating means generates a plurality of types of binary images, each corresponding to one of a plurality of colors obtained by performing color reduction processing on the image;
the first extraction means, the expansion means, the determining means, and the first connecting means perform their processing on each of the binary images; and
the first specifying means specifies the character area based on a union of the areas corresponding to the second row rectangles in the respective binary images.

A character area specifying method using an image processing apparatus for specifying a character area from an image composed of a plurality of pixels,
the image processing apparatus including a control unit,
the character area specifying method comprising:
a step in which the control unit generates a binary image based on the image;
a step in which the control unit extracts a margin area and a non-margin area other than the margin area from the binary image;
a step in which the control unit generates first row rectangles by performing pixel expansion processing on the non-margin area in a first direction and in a second direction perpendicular to the first direction, respectively;
a step in which the control unit determines a character line direction of the binary image based on an aspect ratio of the first row rectangles;
a step in which the control unit generates a second row rectangle by connecting the first row rectangles in the character line direction; and
a step in which the control unit specifies a character area based on an area corresponding to the second row rectangle in the binary image.

The character area specifying method according to claim 7, wherein the step of determining includes:
a step of calculating, among the first row rectangles, the number of row rectangles satisfying a first predetermined condition indicating that the aspect ratio is that of a horizontally long row rectangle, and the number of row rectangles satisfying a second predetermined condition indicating that the aspect ratio is that of a vertically long row rectangle; and
a step of determining the character line direction of the binary image based on the two numbers.

The character area specifying method according to claim 7 or 8, wherein the step of specifying the character area includes:
a step of extracting at least one small rectangle included in the second row rectangle; and
a step of determining, for each second row rectangle, an area corresponding to the second row rectangle as a character area when the ratio of the number of small rectangles to the aspect ratio of the second row rectangle is within a predetermined range.

The character area specifying method according to claim 7, wherein the step of specifying the character area includes:
a step of extracting at least one small rectangle included in the second row rectangle; and
a step of determining, for each second row rectangle, an area corresponding to the second row rectangle as a character area when the variation in the size of the small rectangles included in the second row rectangle is equal to or less than a predetermined value.

The character area specifying method according to claim 7, further comprising:
a step of generating a third row rectangle by connecting, in a direction perpendicular to the character line direction determined in the step of determining, the first row rectangles other than the area specified as the character area in the step of specifying the character area; and
a step of specifying a character area based on an area corresponding to the third row rectangle in the binary image.

The character area specifying method, wherein:
in the step of generating the binary image, a plurality of types of binary images are generated, each corresponding to one of a plurality of colors obtained by performing color reduction processing on the image;
the steps of extracting the margin area and the non-margin area, generating the first row rectangles, determining, and generating the second row rectangle are performed on each of the binary images; and
in the step of specifying the character area, the character area is specified based on a union of the areas corresponding to the second row rectangles in the respective binary images.

A character area specifying program for causing a computer to specify a character area from an image composed of a plurality of pixels,
the program causing the computer to execute:
a step of generating a binary image based on the image;
a step of extracting a margin area and a non-margin area other than the margin area from the binary image;
a step of generating first row rectangles by performing pixel expansion processing on the non-margin area in a first direction and in a second direction perpendicular to the first direction, respectively;
a step of determining a character line direction of the binary image based on an aspect ratio of the first row rectangles;
a step of generating a second row rectangle by connecting the first row rectangles in the character line direction; and
a step of specifying a character area based on an area corresponding to the second row rectangle in the binary image.

The character area specifying program according to claim 13, wherein the step of determining includes:
a step of calculating, among the first row rectangles, the number of row rectangles satisfying a first predetermined condition indicating that the aspect ratio is that of a horizontally long row rectangle, and the number of row rectangles satisfying a second predetermined condition indicating that the aspect ratio is that of a vertically long row rectangle; and
a step of determining the character line direction of the binary image based on the two numbers.

The character area specifying program according to claim 13 or 14, wherein the step of specifying the character area includes:
a step of extracting at least one small rectangle included in the second row rectangle; and
a step of determining, for each second row rectangle, an area corresponding to the second row rectangle as a character area when the ratio of the number of small rectangles to the aspect ratio of the second row rectangle is within a predetermined range.

The character area specifying program according to claim 13, wherein the step of specifying the character area includes:
a step of extracting at least one small rectangle included in the second row rectangle; and
a step of determining, for each second row rectangle, an area corresponding to the second row rectangle as a character area when the variation in the size of the small rectangles included in the second row rectangle is equal to or less than a predetermined value.

The character area specifying program according to any one of claims 13 to 16, further causing the computer to execute:
a step of generating a third row rectangle by connecting, in a direction perpendicular to the character line direction determined in the step of determining, the first row rectangles other than the area specified as the character area in the step of specifying the character area; and
a step of specifying a character area based on an area corresponding to the third row rectangle in the binary image.

The character area specifying program, wherein:
in the step of generating the binary image, a plurality of types of binary images are generated, each corresponding to one of a plurality of colors obtained by performing color reduction processing on the image;
the steps of extracting the margin area and the non-margin area, generating the first row rectangles, determining, and generating the second row rectangle are performed on each of the binary images; and
in the step of specifying the character area, the character area is specified based on a union of the areas corresponding to the second row rectangles in the respective binary images.