JP6264955B2 - Image processing device

Info

Publication number: JP6264955B2
Application number: JP2014044335A
Authority: JP
Inventors: 近藤 真樹; 小澤 良平; 田中 良幸; 長谷川 智彦
Original assignee: Brother Industries Ltd
Current assignee: Brother Industries Ltd
Priority date: 2014-03-06
Filing date: 2014-03-06
Publication date: 2018-01-24
Anticipated expiration: 2034-03-06
Also published as: JP2015170979A

Description

This specification discloses an image processing apparatus that divides image data representing a one-line character string to generate image data representing a character string of two or more lines.

Patent Document 3 discloses a distribution processing server that distributes document images to a mobile phone. When the width of a one-line character string in a document image exceeds the width of the displayable area of the mobile phone, the distribution processing server divides the character string image and generates a plurality of partial images to be arranged along the vertical direction on the mobile phone. The distribution processing server then distributes the partial images to the mobile phone. As a result, the mobile phone displays each character string with vertical scrolling rather than horizontal scrolling. Patent Document 1 discloses a mobile phone that rearranges a plurality of characters by using a position coordinate list indicating the positions of the characters in a document image.

Patent Document 1: JP 2012-108750 A
Patent Document 2: JP 2012-230623 A
Patent Document 3: JP 2011-242987 A
Patent Document 4: JP 2010-183484 A
Patent Document 5: JP 2005-223824 A
Patent Document 6: JP H05-94511 A
Patent Document 7: JP 2000-137801 A
Patent Document 8: JP H11-25283 A
Patent Document 9: JP 2012-216038 A

Non-Patent Document 1: “Automatic line breaks to fit the screen size! Displaying document files legibly on smartphones: newly developed layout reconstruction technology ‘GT-Layout’; service for the online storage ‘Dropbox’ launched”, [online], May 30, 2012, FUJIFILM Corporation, [retrieved January 24, 2014], Internet <http://www.fujifilm.co.jp/corporate/news/articleffnr_0647.html>

Patent Document 3 does not disclose a specific method for dividing the character string image. In Patent Document 1, data indicating the position of each character (that is, the position coordinate list) is used when the characters are rearranged. This specification provides a technique that does not require data indicating the positions of individual characters when image data representing a one-line character string is to be divided to generate image data representing a character string of two or more lines.

The image processing apparatus disclosed in this specification includes a target character string image data acquisition unit, a dividing unit, and a specific character string image data generation unit. The acquisition unit acquires target character string image data representing a one-line target character string composed of a plurality of characters arranged along a first direction. The dividing unit divides the target character string image data to generate first partial character string image data, representing a first partial character string that is part of the target character string, and second partial character string image data, representing a second partial character string that is part of the target character string. At least one of the first and second partial character strings is composed of two or more of the plurality of characters. The specific character string image data generation unit uses the first and second partial character string image data to generate specific character string image data representing a first specific character string and a second specific character string arranged along a second direction orthogonal to the first direction. The first specific character string includes the first partial character string, and the second specific character string includes the second partial character string. The pixels that make up the target character string image data include character pixels, which form the characters of the target character string, and background pixels, which form the background of those characters. The dividing unit includes a histogram generation unit and a dividing position determination unit. The histogram generation unit uses the target character string image data to generate a projection histogram, which shows the frequency distribution of character pixels when each pixel of the target character string image data is projected along the second direction. The dividing position determination unit determines a dividing position of the target character string image data based on the projection histogram and a specific length. The specific length is the upper limit, along the first direction, of the length of the multi-line character strings to be represented by the specific character string image data. The dividing unit divides the target character string image data at the dividing position to generate the first and second partial character string image data.

With this configuration, the image processing apparatus uses the projection histogram to determine the dividing position of the target character string image data, which represents a one-line target character string composed of a plurality of characters. It then divides the target character string image data to generate first and second partial character string image data representing first and second partial character strings, at least one of which is composed of two or more characters, and uses the first and second partial character string image data to generate the specific character string image data. Consequently, when the image processing apparatus divides target character string image data representing a one-line target character string to generate specific character string image data representing specific character strings of two or more lines, it does not need data indicating the positions of the individual characters.
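As a rough illustration of the projection histogram and dividing position described above, the following Python sketch counts character pixels per column of a one-line bitmap and picks a dividing position. The function names, the 0/1 bitmap representation, and the rule of choosing the last all-background column within the specific length are assumptions made for this illustration, not the procedure of the embodiment.

```python
def projection_histogram(bitmap):
    """Count character pixels (value 1) in each column of a one-line bitmap.

    Projecting along the second (vertical) direction yields one count
    per position along the first (horizontal) direction.
    """
    width = len(bitmap[0])
    return [sum(row[x] for row in bitmap) for x in range(width)]


def choose_split(hist, max_len):
    """Return a dividing column, or None if no split is needed or possible.

    Assumed rule: pick the right-most column within max_len whose
    histogram count is zero (a column containing only background).
    """
    if len(hist) <= max_len:
        return None  # the line already fits within the specific length
    for x in range(max_len, 0, -1):
        if hist[x] == 0:
            return x
    return None  # no background column found within max_len


def split_line(bitmap, max_len):
    """Divide the bitmap at the chosen column into two partial bitmaps."""
    x = choose_split(projection_histogram(bitmap), max_len)
    if x is None:
        return [bitmap]
    return [[row[:x] for row in bitmap], [row[x:] for row in bitmap]]
```

Because the split column is chosen only from the histogram, no per-character position list is needed, which is the point made in the paragraph above.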

A control method and a computer program for realizing the above image processing apparatus, and a computer-readable recording medium storing the computer program, are also novel and useful.

FIG. 1 shows the configuration of a communication system.
FIG. 2 shows a flowchart of the processing of the image processing server.
FIG. 3 shows a flowchart of the character string analysis process.
FIG. 4 shows a flowchart of the strip-shaped area determination process.
FIG. 5 shows a flowchart of the combining process.
FIG. 6 shows a specific example of the combining process.
FIG. 7 shows a flowchart of the dividing candidate position determination process.
FIG. 8 shows a specific example of the dividing candidate position determination process.
FIG. 9 shows a flowchart of the rearrangement process.
FIG. 10 shows a flowchart of the line number determination process.
FIG. 11 shows specific examples of the rearrangement process and the enlargement process.

(Configuration of communication system 2)
As shown in FIG. 1, the communication system 2 includes a multi-function device 10 and an image processing server 50, which can communicate with each other via the Internet 4. The multi-function device 10 is a peripheral device (that is, a peripheral device of a PC (Personal Computer) or the like, not shown) capable of executing multiple functions including a print function, a scan function, a copy function, and a FAX function. The image processing server 50 is a server provided on the Internet 4 by the vendor of the multi-function device 10.

(Outline of each process executed by the multi-function device 10)
The copy functions that the multi-function device 10 can execute are classified into a monochrome copy function and a color copy function; this embodiment focuses on the color copy function. The color copy function is further classified into a normal color copy function and a character enlargement color copy function. Whichever color copy function the user instructs, the multi-function device 10 first color-scans the sheet bearing the image to be scanned (hereinafter the “scan target sheet”) to generate scanned image data SID. The scanned image data SID is RGB bitmap data with multiple gradations (for example, 256 gradations).

The scanned image SI represented by the scanned image data SID (that is, the image on the scan target sheet) has a white background and includes a text object TOB and a photo object POB. The text object TOB includes four lines of character strings composed of a plurality of black characters “A to Q”. The characters may be a color other than black (for example, red). The photo object POB contains no characters and includes a photograph composed of a plurality of colors.

In the drawings of this embodiment, each character string of the text object TOB is shown, for convenience, as the letters “A to Q” in regular order; in practice, each character string forms a sentence. Within each character string (that is, each one-line character string), the sentence proceeds from left to right in the horizontal direction of the scanned image SI. Across the four lines of character strings “A to Q”, the sentence proceeds from top to bottom in the vertical direction of the scanned image SI. In every image described below (for example, the processed image PI described later), the direction in which the characters of a one-line character string are arranged is called the “horizontal direction”, and the direction orthogonal to it is called the “vertical direction”. Because the sentence proceeds from left to right, the left end and the right end in the horizontal direction are called the “front end” and the “rear end”, respectively.

An indent margin is formed on the front-end side (that is, the left side) of the first-line character string “A to E” of the four lines of character strings “A to Q” in the text object TOB. A line break follows the third-line character string “LM”, and an indent margin is formed on the front-end side of the last-line character string “N to Q”. In other words, the first three lines “A to M” form one paragraph, and the last line “N to Q” forms another paragraph.

When the user instructs the normal color copy function, the multi-function device 10 uses the scanned image data SID to print an image on a sheet (hereinafter the “print target sheet”) according to the copy magnification set by the user. For example, when the copy magnification is 1:1, the multi-function device 10 prints on the print target sheet an image of the same size as the image on the scan target sheet. When the copy magnification indicates enlargement, the multi-function device 10 prints on the print target sheet an image larger than the image on the scan target sheet. In this case, for example, the image on an A4-size scan target sheet is enlarged and printed on an A3-size print target sheet. As a result, an image in which both objects TOB and POB are enlarged is printed on the print target sheet.

When the user instructs the character enlargement color copy function, on the other hand, the multi-function device 10 transmits the scanned image data SID to the image processing server 50 via the Internet 4. The multi-function device 10 then receives processed image data PID from the image processing server 50 via the Internet 4 and prints the processed image PI represented by the processed image data PID on the print target sheet. In particular, the multi-function device 10 prints the processed image PI on a print target sheet of the same size as the scan target sheet (for example, A4).

In the processed image PI, compared with the scanned image SI, the text object TOB is enlarged while the photo object POB is not. Therefore, even if the characters in the scanned image SI are small, the characters in the processed image PI are larger, so the user can easily recognize them. The number of characters in the first-line character string “A to F” of the processed image PI (that is, 6) differs from the number of characters in the first-line character string “A to E” of the scanned image SI (that is, 5). The number of lines in the processed image PI (that is, 3) also differs from the number of lines in the scanned image SI (that is, 4). However, the relationship between the two paragraphs in the processed image PI is the same as in the scanned image SI. That is, an indent margin is formed on the front-end side (that is, the left side) of the first-line character string “A to F” of the three lines “A to Q” in the processed image PI, a line break follows the second-line character string “G to M”, and an indent margin is formed on the front-end side of the last-line character string “N to Q”. In other words, the first two lines “A to M” form one paragraph, and the last line “N to Q” forms another paragraph.

(Configuration of the image processing server 50)
The image processing server 50 performs image processing on the scanned image data SID received from the multi-function device 10 to generate processed image data PID, and transmits the processed image data PID to the multi-function device 10. The image processing server 50 includes a network interface 52 and a control unit 60. The network interface 52 is connected to the Internet 4. The control unit 60 includes a CPU 62 and a memory 64. The CPU 62 is a processor that executes various processes (that is, the processes of FIG. 2 and so on) according to a program 66 stored in the memory 64.

(Each process executed by the image processing server 50; FIG. 2)
Next, the processes executed by the CPU 62 of the image processing server 50 are described with reference to FIG. 2. The CPU 62 starts the process of FIG. 2 when it receives the scanned image data SID from the multi-function device 10 via the Internet 4.

In S100, the CPU 62 executes the character string analysis process (see FIG. 3, described later) to determine a text object area TOA containing the four lines of character strings “A to Q” in the scanned image SI. The CPU 62 then determines four strip-shaped areas LA11 to LA14 containing the four lines of character strings “A to Q” in the text object area TOA.

In S200, the CPU 62 executes the combining process (see FIG. 5, described later) to generate two sets of combined image data representing two combined images CI1 and CI2. The combined image CI1 contains a one-line character string “A to M” in which the three lines of character strings in the three strip-shaped areas LA11 to LA13 are combined (that is, concatenated) in a straight line along the horizontal direction. The combined image CI2 contains the one-line character string “N to Q” in the single strip-shaped area LA14. Although the combined image CI2 does not contain a character string formed by combining multiple lines, it is called a “combined image” for convenience. The combined image CI1 contains the character string “A to M” that forms one paragraph, and the combined image CI2 contains the character string “N to Q” that forms another paragraph. In other words, one set of combined image data representing one combined image is generated for each paragraph in the text object area TOA.

In S300, the CPU 62 executes the target area determination process to determine a target area TA in the scanned image SI. Specifically, the CPU 62 first determines a space area whose upper-left vertex coincides with the upper-left vertex of the text object area TOA. The space area is larger in size (that is, area) than the text object area TOA and does not overlap any other object area (for example, the object area containing the photo object POB). The CPU 62 then determines, within the space area, a target area TA whose aspect ratio equals that of the space area. The size (area) of the target area TA is the largest size that does not exceed α times the size (area) of the text object area TOA, where α is a value greater than 1 (for example, 1.4). The target area TA is positioned so that its upper-left vertex coincides with the upper-left vertex of the text object area TOA. The aspect ratio of the target area TA usually differs from that of the text object area TOA. The target area TA in the scanned image SI coincides with the target area TA in the processed image PI (see the processed image PI of S500). The process of S300 is therefore equivalent to determining the target area TA in the processed image PI. The target area TA in the processed image PI is the area in which the enlarged character strings “A to Q” are to be placed.
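The size constraint in S300 can be sketched as a small calculation. In the Python sketch below, the function name `target_area`, the width/height parameters, and the rounding down to whole pixels are assumptions for illustration; only the aspect-ratio rule and the α-times area limit come from the description above.

```python
import math


def target_area(toa_w, toa_h, space_w, space_h, alpha=1.4):
    """Return (width, height) of a target area inside the space area.

    The target area keeps the space area's aspect ratio, stays inside the
    space area, and has the largest area not exceeding alpha times the
    area of the text object area.
    """
    area_limit = alpha * toa_w * toa_h
    # Uniform scale factor applied to the space area; capped at 1.0 so
    # the target area never extends beyond the space area.
    scale = min(1.0, math.sqrt(area_limit / (space_w * space_h)))
    # Round down so the pixel area never exceeds the limit (assumption).
    return math.floor(space_w * scale), math.floor(space_h * scale)
```

For a 100 x 50 text object area and a 200 x 100 space area, the limit is 7,000 square pixels, so the space area is scaled by about 0.59 and the target area is 118 x 59 pixels.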

In S400, the CPU 62 executes the rearrangement process (see FIG. 9, described later) to determine a rearrangement area RA. Using the two sets of combined image data representing the two combined images CI1 and CI2, the CPU 62 then rearranges the characters “A to Q” within the rearrangement area RA, thereby generating rearranged image data representing a rearranged image RI.

In S500, the CPU 62 enlarges the rearranged image data representing the rearranged image RI to generate enlarged image data. The CPU 62 then uses the enlarged image data to generate processed image data PID representing the processed image PI. Although each character is enlarged in the processed image PI, the processed image data PID has the same number of pixels as the scanned image data SID.

In S600, the CPU 62 transmits the processed image data PID to the multi-function device 10 via the Internet 4. As a result, the processed image PI represented by the processed image data PID is printed on the print target sheet.

(Character string analysis process; FIG. 3)
Next, the character string analysis process executed in S100 of FIG. 2 is described with reference to FIG. 3. In S110, the CPU 62 binarizes the scanned image data SID to generate binary data BD (only part of which is shown in FIG. 3) having the same number of pixels as the scanned image data SID. The CPU 62 first uses the scanned image data SID to determine the background color of the scanned image SI (white in this embodiment). Specifically, the CPU 62 generates a histogram showing the frequency distribution of the pixel values of the pixels in the scanned image data SID, and determines the background color by identifying the pixel value with the highest frequency (hereinafter the “highest-frequency pixel value”). Then, for each pixel in the scanned image data SID, the CPU 62 assigns “0” to the pixel at the corresponding position in the binary data BD if the pixel value matches the highest-frequency pixel value, and assigns “1” if it does not. As a result, in the binary data BD, each pixel representing a character of the text object TOB (for example, “A” or “B”) has the pixel value “1”, each pixel representing the photo object POB has the pixel value “1”, and every other pixel (that is, each background pixel) has the pixel value “0”. Below, a pixel with the value “1” in the binary data BD is called an “ON pixel”, and a pixel with the value “0” is called an “OFF pixel”.
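The binarization of S110 can be sketched in a few lines of Python. The function name `binarize` and the representation of the image as rows of RGB tuples are assumptions for this illustration.

```python
from collections import Counter


def binarize(image):
    """Binarize an image given as rows of pixel values (e.g. RGB tuples).

    The most frequent pixel value is taken as the background color.
    Background pixels become 0 (OFF); all other pixels become 1 (ON).
    """
    counts = Counter(px for row in image for px in row)  # pixel-value histogram
    background = counts.most_common(1)[0][0]             # highest-frequency value
    return [[0 if px == background else 1 for px in row] for row in image]
```

Note that any pixel that differs from the background, whether part of a character or part of a photograph, becomes an ON pixel, matching the description above.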

In S120, the CPU 62 performs a labeling process on the binary data BD generated in S110 to generate label data LD (only part of which is shown in FIG. 3) having the same number of pixels as the binary data BD. Specifically, the CPU 62 divides the ON pixels in the binary data BD into two or more ON pixel groups and assigns a different pixel value (for example, “1”, “2”, and so on) to each group. One ON pixel group consists of two or more mutually adjacent ON pixels. That is, when the eight pixels adjacent to the ON pixel being labeled include one or more ON pixels, the CPU 62 classifies (groups) the ON pixel being labeled and those adjacent ON pixels into the same ON pixel group. The CPU 62 determines the two or more ON pixel groups by grouping the ON pixels one after another while changing the ON pixel being labeled. In the label data LD of FIG. 3, for example, the pixel value “1” is assigned to each ON pixel representing the character “A” (one ON pixel group), and the pixel value “2” is assigned to each ON pixel representing the character “B” (another ON pixel group).
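The labeling of S120 corresponds to 8-connected component labeling. The following Python sketch uses a breadth-first flood fill; the function name `label_components` and the flood-fill strategy are assumptions for illustration, and unlike the description above, an isolated ON pixel also receives its own label here.

```python
from collections import deque


def label_components(binary):
    """Label 8-connected groups of ON pixels (value 1) with 1, 2, 3, ...

    Returns (labels, count), where labels has the same shape as binary
    and OFF pixels keep the label 0.
    """
    height, width = len(binary), len(binary[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or labels[y][x] != 0:
                continue
            count += 1                      # start a new ON pixel group
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                # Visit the eight neighbors of the current pixel.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and binary[ny][nx] == 1
                                and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count
```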

In S130, the CPU 62 uses the label data LD generated in S120 to determine a unit area for each ON pixel group. Each unit area is the rectangle circumscribing the corresponding ON pixel group. With the label data LD of FIG. 3, for example, the CPU 62 determines the unit area circumscribing the ON pixel group assigned the pixel value “1” (that is, the unit area for the character “A”) and the unit area circumscribing the ON pixel group assigned the pixel value “2” (that is, the unit area for the character “B”). More specifically, the CPU 62 determines from the scanned image SI 17 unit areas corresponding to the 17 characters “A” to “Q” and one unit area corresponding to the photo object POB (18 unit areas in total). A unit area is determined by storing in the memory 64 the positions of the pixels at its vertices. In the following descriptions of “determining an area (or position)”, however, the storing of pixel positions in the memory 64 is not mentioned again.
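Determining the circumscribing rectangles in S130 amounts to tracking the minimum and maximum coordinates of each label. A minimal Python sketch follows; the function name `unit_regions` and the `(left, top, right, bottom)` tuple layout with inclusive coordinates are assumptions for illustration.

```python
def unit_regions(labels):
    """Return {label: (left, top, right, bottom)} for each nonzero label.

    Each rectangle circumscribes all pixels carrying that label;
    coordinates are inclusive pixel indices.
    """
    boxes = {}
    for y, row in enumerate(labels):
        for x, label in enumerate(row):
            if label == 0:
                continue  # background pixel
            if label not in boxes:
                boxes[label] = [x, y, x, y]
            else:
                box = boxes[label]
                box[0] = min(box[0], x)
                box[1] = min(box[1], y)
                box[2] = max(box[2], x)
                box[3] = max(box[3], y)
    return {label: tuple(box) for label, box in boxes.items()}
```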

Ｓ１４０では、ＣＰＵ６２は、Ｓ１３０で決定された単位領域を利用して、スキャン画像ＳＩ内のオブジェクト領域を決定する。具体的には、ＣＰＵ６２は、１８個の単位領域を複数個の単位領域群に区分し、各単位領域群に対応する各オブジェクト領域を決定する。１個の単位領域群は、近傍に存在する１個以上の単位領域によって構成される。ＣＰＵ６２は、２個の単位領域の間の距離（即ち画素数）が所定の距離未満である場合に、当該２個の単位領域を同じ単位領域群に区分する。上記の所定の距離は、スキャン画像データＳＩＤの解像度に応じて予め決められている。例えば、本実施例では、スキャン画像データＳＩＤが３００ｄｐｉの解像度を有しており、３００ｄｐｉの解像度に対応する上記の所定の距離は、１０画素である。そして、図３のラベルデータＬＤでは、文字「Ａ」に対応する単位領域ＲＥ１と、文字「Ｂ」に対応する単位領域ＲＥ２と、の間の距離は、３画素である。従って、ＣＰＵ６２は、単位領域ＲＥ１と単位領域ＲＥ２とを同じ単位領域群に区分する。これにより、ＣＰＵ６２は、近傍に存在する各文字（例えば「Ａ」と「Ｂ」）をグループ化することができる。より具体的には、ＣＰＵ６２は、スキャン画像ＳＩについて、テキストオブジェクトＴＯＢ内の１７個の文字「Ａ」〜「Ｑ」に対応する１７個の単位領域を含む単位領域群と、１個の写真オブジェクトＰＯＢに対応する１個の単位領域を含む単位領域群と、を決定する（即ち、合計で２個の単位領域群を決定する）。そして、ＣＰＵ６２は、２個の単位領域群のそれぞれについて、当該単位領域群に外接する矩形の領域をオブジェクト領域として決定する。即ち、ＣＰＵ６２は、スキャン画像ＳＩの中から、テキストオブジェクトＴＯＢ内の１７個の文字「Ａ」〜「Ｑ」を含むオブジェクト領域ＴＯＡと、写真オブジェクトＰＯＢを含むオブジェクト領域ＰＯＡと、を決定する（即ち、合計で２個のオブジェクト領域ＴＯＡ，ＰＯＡを決定する）。 In S140, the CPU 62 determines an object area in the scan image SI using the unit area determined in S130. Specifically, the CPU 62 divides the 18 unit areas into a plurality of unit area groups, and determines each object area corresponding to each unit area group. One unit region group is composed of one or more unit regions existing in the vicinity. When the distance (that is, the number of pixels) between the two unit areas is less than a predetermined distance, the CPU 62 divides the two unit areas into the same unit area group. The predetermined distance is determined in advance according to the resolution of the scan image data SID. For example, in this embodiment, the scan image data SID has a resolution of 300 dpi, and the predetermined distance corresponding to the resolution of 300 dpi is 10 pixels. In the label data LD of FIG. 3, the distance between the unit region RE1 corresponding to the character “A” and the unit region RE2 corresponding to the character “B” is 3 pixels. Therefore, the CPU 62 divides the unit area RE1 and the unit area RE2 into the same unit area group. 
As a result, the CPU 62 can group characters that are close to each other (for example, “A” and “B”). More specifically, for the scanned image SI, the CPU 62 determines a unit area group including the 17 unit areas corresponding to the 17 characters “A” to “Q” in the text object TOB, and a unit area group including the one unit area corresponding to the photo object POB (that is, two unit area groups in total). Then, for each of the two unit area groups, the CPU 62 determines the rectangular area circumscribing the unit area group as an object area. That is, the CPU 62 determines, from the scanned image SI, an object area TOA including the 17 characters “A” to “Q” in the text object TOB and an object area POA including the photo object POB (that is, two object areas TOA and POA in total).
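The grouping in S140 can be sketched as follows. The specification does not state how the distance between two rectangles is measured, so this sketch assumes the larger of the horizontal and vertical pixel gaps; the helper names `gap`, `group_unit_areas`, and `object_area`, and the sample coordinates, are hypothetical.

```python
def gap(a, b):
    """Pixel gap between two inclusive boxes (left, top, right, bottom).

    Assumption: the distance is the larger of the horizontal and vertical gaps.
    """
    dx = max(a[0] - b[2] - 1, b[0] - a[2] - 1, 0)
    dy = max(a[1] - b[3] - 1, b[1] - a[3] - 1, 0)
    return max(dx, dy)


def group_unit_areas(boxes, max_gap=10):
    """Group boxes whose gap is less than max_gap (10 pixels at 300 dpi in the text)."""
    parent = list(range(len(boxes)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if gap(boxes[i], boxes[j]) < max_gap:
                parent[find(i)] = find(j)

    groups = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)
    return list(groups.values())


def object_area(group):
    """Rectangle circumscribing every unit area in the group."""
    return (min(b[0] for b in group), min(b[1] for b in group),
            max(b[2] for b in group), max(b[3] for b in group))


# Two character boxes 3 pixels apart, plus a distant photo box (made-up values).
boxes = [(0, 0, 9, 14), (13, 0, 22, 14), (0, 100, 199, 249)]
areas = [object_area(g) for g in group_unit_areas(boxes)]
print(areas)
```

A union-find structure is used here so that chains of nearby characters end up in one group even when the first and last characters of a line are far apart.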

Ｓ１５０では、ＣＰＵ６２は、Ｓ１４０で決定された２個のオブジェクト領域ＴＯＡ，ＰＯＡのそれぞれについて、当該オブジェクト領域の種類を決定する。具体的には、ＣＰＵ６２は、各オブジェクト領域ＴＯＡ，ＰＯＡが、文字を含むテキストオブジェクト領域（以下では単に「テキスト領域」と呼ぶ）であるのか否かを判断する。ＣＰＵ６２は、まず、スキャン画像データＳＩＤのうち、オブジェクト領域ＴＯＡを表わす部分画像データを構成する複数個の画素の画素値の頻度の分布を示すヒストグラムを生成する。そして、ＣＰＵ６２は、当該ヒストグラムを利用して、頻度がゼロより高い画素値の数（即ち、オブジェクト領域ＴＯＡで利用されている色の数）を算出する。ＣＰＵ６２は、算出済みの数が所定数（例えば「１０」）未満である場合には、オブジェクト領域ＴＯＡがテキスト領域であると判断し、算出済みの数が上記の所定数以上である場合には、オブジェクト領域ＴＯＡがテキスト領域でないと判断する。オブジェクト領域ＴＯＡは、黒色の文字「Ａ」〜「Ｑ」と、白色の背景と、を含む。従って、オブジェクト領域ＴＯＡに対応するヒストグラムでは、通常、黒色を示す画素値と白色を示す画素値とを含む２個の画素値のみの頻度がゼロより高い。このために、ＣＰＵ６２は、オブジェクト領域ＴＯＡがテキスト領域であると判断する。一方、例えば、写真オブジェクトＰＯＢでは、通常、１０色以上の色が利用されている。従って、オブジェクト領域ＰＯＡに対応するヒストグラムでは、通常、頻度がゼロより高い画素値の数が上記の所定数以上になる。このために、ＣＰＵ６２は、オブジェクト領域ＰＯＡがテキスト領域でないと判断する（即ち写真オブジェクト領域であると判断する）。 In S150, the CPU 62 determines the type of the object area for each of the two object areas TOA and POA determined in S140. Specifically, the CPU 62 determines whether or not each object area TOA, POA is a text object area including characters (hereinafter simply referred to as “text area”). First, the CPU 62 generates a histogram indicating the frequency distribution of the pixel values of a plurality of pixels constituting the partial image data representing the object area TOA in the scan image data SID. Then, the CPU 62 uses the histogram to calculate the number of pixel values whose frequency is higher than zero (that is, the number of colors used in the object area TOA). When the calculated number is less than a predetermined number (for example, “10”), the CPU 62 determines that the object area TOA is a text area, and when the calculated number is equal to or greater than the predetermined number. The object area TOA is determined not to be a text area. The object area TOA includes black characters “A” to “Q” and a white background. 
Therefore, in the histogram corresponding to the object region TOA, the frequency of only two pixel values including the pixel value indicating black and the pixel value indicating white is usually higher than zero. For this reason, the CPU 62 determines that the object area TOA is a text area. On the other hand, for example, in the photo object POB, ten or more colors are usually used. Therefore, in the histogram corresponding to the object area POA, the number of pixel values having a frequency higher than zero is usually greater than or equal to the predetermined number. Therefore, the CPU 62 determines that the object area POA is not a text area (that is, determines that it is a photo object area).
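The classification in S150 amounts to counting how many distinct pixel values occur in an object area. A minimal sketch, assuming pixels are given as a flat list of RGB tuples (the function name `is_text_area` and the sample pixels are hypothetical):

```python
from collections import Counter


def is_text_area(pixels, max_colors=10):
    """Return True when fewer than max_colors distinct pixel values are used."""
    histogram = Counter(pixels)   # pixel value -> frequency
    used_colors = len(histogram)  # a Counter only holds values with frequency > 0
    return used_colors < max_colors


# Black characters on a white background: two pixel values in use.
text_pixels = [(0, 0, 0)] * 30 + [(255, 255, 255)] * 70
# A stand-in for a photo that uses ten different pixel values.
photo_pixels = [(i, i, i) for i in range(10)] * 5

print(is_text_area(text_pixels), is_text_area(photo_pixels))
```

The default of 10 follows the example value of the predetermined number given in the text.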

Ｓ１６０では、ＣＰＵ６２は、Ｓ１５０で決定されたテキスト領域ＴＯＡに対して帯状領域決定処理（後述の図４参照）を実行する。ただし、ＣＰＵ６２は、写真オブジェクト領域ＰＯＡに対して帯状領域決定処理を実行しない。Ｓ１６０が終了すると、図３の処理が終了する。 In S160, the CPU 62 executes a band-shaped area determination process (see FIG. 4 described later) for the text area TOA determined in S150. However, the CPU 62 does not execute the band-shaped area determination process for the photo object area POA. When S160 ends, the process of FIG. 3 ends.

（帯状領域決定処理；図４） (Band-shaped area determination process; FIG. 4)
続いて、図４を参照して、図３のＳ１６０で実行される帯状領域決定処理の内容を説明する。以下では、スキャン画像ＳＩ内のテキスト領域ＴＯＡを例として、図４の処理の内容を説明する。スキャン画像ＳＩ内に複数個のテキストオブジェクトが含まれる場合には、テキストオブジェクト毎（即ちテキスト領域毎）に図４の処理が実行される。 Next, with reference to FIG. 4, the contents of the band-shaped area determination process executed in S160 of FIG. 3 will be described. In the following, the process of FIG. 4 is described using the text area TOA in the scanned image SI as an example. When the scanned image SI includes a plurality of text objects, the process of FIG. 4 is executed for each text object (that is, for each text area).

Ｓ１６２では、ＣＰＵ６２は、テキスト領域ＴＯＡに対応する射影ヒストグラムを生成する。当該射影ヒストグラムは、二値データＢＤ（Ｓ１１０参照）を構成する複数個の画素のうち、テキスト領域ＴＯＡを表わす各画素を横方向に射影する場合におけるＯＮ画素（即ち「１」を示す画素）の頻度の分布を示す。換言すると、当該射影ヒストグラムは、スキャン画像データＳＩＤを構成する複数個の画素のうち、テキスト領域ＴＯＡを表わす各画素を横方向に射影する場合における文字構成画素の頻度の分布を示す。文字構成画素は、テキスト領域ＴＯＡに含まれる文字（例えば「Ａ」）を構成する画素（即ち黒色を示す画素）である。当該射影ヒストグラムでは、１行の文字列（例えば「Ａ〜Ｅ」）が、頻度がゼロより高い範囲（以下では「高頻度範囲」と呼ぶ）で表わされ、２行の文字列の間の行間（例えば「Ａ〜Ｅ」と「Ｆ〜Ｋ」の間の行間）が、頻度がゼロである範囲で表わされる。 In S162, the CPU 62 generates a projection histogram corresponding to the text area TOA. The projection histogram shows the frequency distribution of ON pixels (that is, pixels indicating “1”) obtained when, among the pixels constituting the binary data BD (see S110), the pixels representing the text area TOA are projected in the horizontal direction. In other words, the projection histogram shows the frequency distribution of character-constituting pixels obtained when, among the pixels constituting the scan image data SID, the pixels representing the text area TOA are projected in the horizontal direction. A character-constituting pixel is a pixel (that is, a pixel indicating black) that forms part of a character (for example, “A”) included in the text area TOA. In the projection histogram, one line of characters (for example, “A to E”) appears as a range where the frequency is higher than zero (hereinafter called a “high-frequency range”), and the space between two lines (for example, between “A to E” and “F to K”) appears as a range where the frequency is zero.

Ｓ１６４では、ＣＰＵ６２は、Ｓ１６２で生成された射影ヒストグラムを利用して、１個以上の高頻度範囲に対応する１個以上の帯状領域を決定する。１個の帯状領域の縦方向の長さ（即ち縦方向の画素数）は、当該帯状領域に対応する高頻度範囲の縦方向の長さに等しい。また、１個の帯状領域の横方向の長さ（即ち横方向の画素数）は、オブジェクト領域ＴＯＡの横方向の長さに等しい。より具体的には、ＣＰＵ６２は、テキスト領域ＴＯＡの中から、文字列「Ａ〜Ｅ」を含む帯状領域ＬＡ１１と、文字列「Ｆ〜Ｋ」を含む帯状領域ＬＡ１２と、文字列「ＬＭ」を含む帯状領域ＬＡ１３と、文字列「Ｎ〜Ｑ」を含む帯状領域ＬＡ１４と、を決定する（即ち、合計で４個の帯状領域ＬＡ１１〜ＬＡ１４を決定する）。 In S164, the CPU 62 uses the projection histogram generated in S162 to determine one or more band-shaped areas corresponding to the one or more high-frequency ranges. The vertical length (that is, the number of pixels in the vertical direction) of a band-shaped area equals the vertical length of its corresponding high-frequency range. The horizontal length (that is, the number of pixels in the horizontal direction) of a band-shaped area equals the horizontal length of the object area TOA. More specifically, the CPU 62 determines, from the text area TOA, a band-shaped area LA11 including the character string “A to E”, a band-shaped area LA12 including “F to K”, a band-shaped area LA13 including “LM”, and a band-shaped area LA14 including “N to Q” (that is, four band-shaped areas LA11 to LA14 in total).
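S162 and S164 together can be sketched as a row-wise projection followed by a scan for runs of non-zero counts. The sketch assumes the text area is given as binary rows (1 = ON pixel); the function names and the sample data are hypothetical.

```python
def row_projection(binary):
    """Count the ON pixels in each row (projection in the horizontal direction)."""
    return [sum(row) for row in binary]


def high_frequency_ranges(histogram):
    """Return inclusive (start, end) index pairs of runs where the count is above zero."""
    ranges, start = [], None
    for i, count in enumerate(histogram):
        if count > 0 and start is None:
            start = i
        elif count == 0 and start is not None:
            ranges.append((start, i - 1))
            start = None
    if start is not None:  # a run that reaches the last row
        ranges.append((start, len(histogram) - 1))
    return ranges


def band_areas(binary):
    """Return one (left, top, right, bottom) box per text line, as wide as the text area."""
    width = len(binary[0])
    return [(0, top, width - 1, bottom)
            for top, bottom in high_frequency_ranges(row_projection(binary))]


# Hypothetical text area: two "lines" separated by one blank row.
binary = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
print(row_projection(binary), band_areas(binary))
```

Each band takes its height from a high-frequency range and its width from the whole text area, matching the two length conditions stated above.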

続いて、ＣＰＵ６２は、Ｓ１６６〜Ｓ１７４の処理を実行して、各帯状領域ＬＡ１１〜ＬＡ１４に対応する各基準位置を決定する。基準位置は、図２のＳ２００の結合処理（後述の図５参照）において、各帯状領域に含まれる各文字列を結合するための基準となる位置である。 Subsequently, the CPU 62 executes the processes of S166 to S174 to determine each reference position corresponding to each of the belt-like areas LA11 to LA14. The reference position is a position serving as a reference for combining the character strings included in the belt-like regions in the combining process in S200 of FIG. 2 (see FIG. 5 described later).

Ｓ１６６では、ＣＰＵ６２は、Ｓ１６４で決定された４個の帯状領域ＬＡ１１〜ＬＡ１４のうちの１個の帯状領域（以下では「対象帯状領域」と呼ぶ）を処理対象として決定する。以下では、帯状領域ＬＡ１１が対象帯状領域として決定される場合を例として説明する。 In S166, the CPU 62 determines one band-shaped area (hereinafter referred to as “target band-shaped area”) among the four band-shaped areas LA11 to LA14 determined in S164 as a processing target. Hereinafter, a case where the band-shaped area LA11 is determined as the target band-shaped area will be described as an example.

Ｓ１６８では、ＣＰＵ６２は、対象帯状領域ＬＡ１１の縦方向の全範囲ＡＲの中から、縦方向の３画素分の評価範囲を設定する。対象帯状領域ＬＡ１１に関する１回目のＳ１６８では、ＣＰＵ６２は、３画素のうちの最も上の画素が対象帯状領域ＬＡ１１の縦方向の全範囲ＡＲの中間位置に存在するように、１回目の評価範囲を設定する。対象帯状領域ＬＡ１１に関する２回以降のＳ１６８では、ＣＰＵ６２は、前回の評価範囲が１画素だけ下側にずれるように、今回の評価範囲を設定する。なお、変形例では、評価範囲は、縦方向の３画素分の範囲でなくてもよく、縦方向の１画素分又は２画素分の範囲であってもよいし、縦方向の４画素分以上の範囲であってもよい。 In S168, the CPU 62 sets an evaluation range of three pixels in the vertical direction within the entire vertical range AR of the target band-shaped area LA11. In the first execution of S168 for the target band-shaped area LA11, the CPU 62 sets the first evaluation range so that the uppermost of its three pixels lies at the middle position of the entire vertical range AR. In the second and subsequent executions of S168 for the target band-shaped area LA11, the CPU 62 sets the current evaluation range by shifting the previous evaluation range down by one pixel. In a modification, the evaluation range need not span three pixels in the vertical direction; it may span one or two pixels, or four or more pixels.

Ｓ１７０では、ＣＰＵ６２は、今回の評価範囲について、合計下辺長さを算出する。合計下辺長さは、対象帯状領域ＬＡ１１内の５個の文字「Ａ」〜「Ｅ」に対応する５個の単位領域（図３のＳ１３０で決定済み）のうちの１個以上の単位領域の下辺ＸＡ〜ＸＥが今回の評価範囲内に存在する場合に、当該１個以上の単位領域の下辺の長さの和である。図４の例では、１回目及び２回目の評価範囲では、１個の単位領域の下辺が存在しないので、ＣＰＵ６２は、合計下辺長さとして「０」を決定する。そして、ｐ回目の評価範囲では、５個の下辺ＸＡ〜ＸＥの全てが存在するので、ＣＰＵ６２は、合計下辺長さとして「１」以上の値（即ち５個の下辺ＸＡ〜ＸＥの長さの和）を算出する。 In S170, the CPU 62 calculates the total lower side length for the current evaluation range. When the lower sides (XA to XE) of one or more of the five unit areas (determined in S130 of FIG. 3) corresponding to the five characters “A” to “E” in the target band-shaped area LA11 lie within the current evaluation range, the total lower side length is the sum of the lengths of those lower sides. In the example of FIG. 4, no unit area's lower side lies within the first or second evaluation range, so the CPU 62 determines “0” as the total lower side length for those ranges. In the p-th evaluation range, all five lower sides XA to XE are present, so the CPU 62 calculates a value of “1” or more (that is, the sum of the lengths of the five lower sides XA to XE) as the total lower side length.

Ｓ１７２では、ＣＰＵ６２は、対象帯状領域ＬＡ１１の全ての評価範囲について、Ｓ１６８及びＳ１７０の処理が終了したのか否かを判断する。具体的には、ＣＰＵ６２は、対象帯状領域ＬＡ１１の縦方向の全範囲ＡＲの下端と、前回の評価範囲（例えばｐ回目の評価範囲）の下端と、が一致する場合には、全ての評価範囲について処理が終了したと判断して（Ｓ１７２でＹＥＳ）、Ｓ１７４に進む。一方、ＣＰＵ６２は、全ての評価範囲について処理が終了していないと判断する場合（Ｓ１７２でＮＯ）には、Ｓ１６８に戻り、評価範囲を新たに設定する。 In S172, the CPU 62 determines whether the processes of S168 and S170 have been completed for all evaluation ranges of the target band-shaped area LA11. Specifically, when the lower end of the entire vertical range AR of the target band-shaped area LA11 coincides with the lower end of the most recent evaluation range (for example, the p-th evaluation range), the CPU 62 determines that all evaluation ranges have been processed (YES in S172) and proceeds to S174. Otherwise (NO in S172), the CPU 62 returns to S168 and sets a new evaluation range.

Ｓ１７４では、ＣＰＵ６２は、複数個の評価範囲について算出された複数個の合計下辺長さに基づいて、対象帯状領域ＬＡ１１の基準位置を決定する。具体的には、ＣＰＵ６２は、まず、複数個の評価範囲の中から、複数個の合計下辺長さのうちの最大の合計下辺長さが算出された１個の評価範囲（例えばｐ回目の評価範囲）を選択する。なお、ＣＰＵ６２は、複数個の評価範囲の中に、最大の合計下辺長さが算出された２個以上の評価範囲が存在する場合には、当該２個以上の評価範囲のうち、最初に設定された評価範囲を選択する。そして、ＣＰＵ６２は、選択済みの評価範囲の縦方向の中間位置を基準位置として決定する。即ち、図４の例では、対象帯状領域ＬＡ１１の縦方向において、５個の下辺ＸＡ〜ＸＥの近傍の位置、即ち、対象帯状領域ＬＡ１１の最下端の近傍の位置が、基準位置として決定される。 In S174, the CPU 62 determines the reference position of the target band-shaped area LA11 based on the total lower side lengths calculated for the evaluation ranges. Specifically, the CPU 62 first selects, from among the evaluation ranges, the one for which the largest total lower side length was calculated (for example, the p-th evaluation range). If two or more evaluation ranges share the largest total lower side length, the CPU 62 selects the one that was set first. The CPU 62 then determines the vertical middle position of the selected evaluation range as the reference position. Thus, in the example of FIG. 4, a position near the five lower sides XA to XE in the vertical direction of the target band-shaped area LA11, that is, a position near the lowermost end of the target band-shaped area LA11, is determined as the reference position.
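The loop of S168 to S174 can be sketched as a sliding window over the lower half of the band. The sketch assumes unit areas are inclusive boxes (left, top, right, bottom), takes the length of a lower side to be the box width in pixels, and uses integer division for the middle position; the function name `reference_position` and the sample boxes (loosely modelled on the lowercase “d” to “i” example) are hypothetical.

```python
def reference_position(band_top, band_bottom, unit_areas, window=3):
    """Return the row chosen as the reference position of one band-shaped area."""
    best_total, best_top = -1, None
    top = (band_top + band_bottom) // 2  # first range starts at the middle of AR
    while top + window - 1 <= band_bottom:
        bottom = top + window - 1
        # Total lower side length: widths of boxes whose lower side is in range.
        total = sum(right - left + 1
                    for left, _, right, lower in unit_areas
                    if top <= lower <= bottom)
        if total > best_total:  # strict '>' keeps the earliest range on ties
            best_total, best_top = total, top
        top += 1  # shift the evaluation range down by one pixel
    if best_top is None:
        raise ValueError("band is too short for the evaluation range")
    return best_top + window // 2  # middle row of the selected range


# Made-up boxes: five share a baseline at row 22, one descender ends at row 29,
# and the dot of an "i" ends at row 4.
areas = [
    (0, 2, 9, 22), (12, 8, 20, 22), (23, 2, 29, 22),
    (32, 8, 41, 29), (44, 2, 53, 22),
    (56, 2, 58, 4), (56, 8, 58, 22),
]
print(reference_position(0, 29, areas))
```

With these values the window that first contains the shared baseline wins, so the descender of the “g”-like box does not pull the reference position down.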

Ｓ１７６では、ＣＰＵ６２は、全ての帯状領域ＬＡ１１〜ＬＡ１４について、Ｓ１６６〜Ｓ１７４の処理が終了したのか否かを判断する。ＣＰＵ６２は、処理が終了していないと判断する場合（Ｓ１７６でＮＯ）には、Ｓ１６６において、未処理の帯状領域（例えばＬＡ１２）を処理対象として決定して、Ｓ１６８以降の各処理を再び実行する。この結果、４個の帯状領域ＬＡ１１〜ＬＡ１４に対応する４個の基準位置が決定される。そして、ＣＰＵ６２は、処理が終了したと判断する場合（Ｓ１７６でＹＥＳ）には、図４の処理を終了する。 In S176, the CPU 62 determines whether or not the processes of S166 to S174 have been completed for all the strip-shaped areas LA11 to LA14. If the CPU 62 determines that the process has not ended (NO in S176), the CPU 62 determines an unprocessed belt-like area (for example, LA12) as a process target in S166, and executes each process after S168 again. . As a result, four reference positions corresponding to the four belt-like regions LA11 to LA14 are determined. Then, if the CPU 62 determines that the process is complete (YES in S176), the process of FIG. 4 is terminated.

上述したように、本実施例では、基準位置を決定するために、各文字の単位領域の下辺の長さに着目している。従って、対象帯状領域ＬＡ１１内の縦方向の全範囲ＡＲのうちの比較的に上側の範囲では、通常、合計下辺長さが最大にならない。従って、Ｓ１６８では、対象帯状領域ＬＡ１１の縦方向の全範囲ＡＲの中間位置に１回目の評価範囲が設定され、その後、評価範囲を下側に移動させる。これにより、Ｓ１６８で設定される評価範囲の数を減らすことができ、この結果、基準位置を迅速に決定することができる。 As described above, this embodiment focuses on the lengths of the lower sides of the characters' unit areas in order to determine the reference position. The total lower side length therefore does not usually reach its maximum in the relatively upper part of the entire vertical range AR of the target band-shaped area LA11. For this reason, in S168 the first evaluation range is set at the middle position of the entire vertical range AR of the target band-shaped area LA11 and is then moved downward. This reduces the number of evaluation ranges set in S168, so the reference position can be determined quickly.

また、図４に示される他の例では、Ｓ１６６で処理対象として決定される対象帯状領域は、小文字のアルファベットである６個の文字「ｄ」〜「ｉ」を含む。文字「ｉ」以外の各文字については、１個の単位領域が決定されるが、文字「ｉ」については、２個の単位領域が決定される。文字「ｇ」の下辺Ｘｇは、他の５個の文字の下辺Ｘｄ〜Ｘｆ，Ｘｈ，Ｘｉ２よりも下側に存在している。この例では、下辺Ｘｉ１を含む評価範囲と、下辺Ｘｄ〜Ｘｆ，Ｘｈ，Ｘｉ２を含む評価範囲と、下辺Ｘｇを含む評価範囲と、のそれぞれについて、１以上の合計下辺長さが算出される。そして、下辺Ｘｄ〜Ｘｆ，Ｘｈ，Ｘｉ２を含む評価範囲について算出される合計下辺長さが最大になるので、当該評価範囲の中間位置が基準位置として決定される。このように、本実施例では、合計下辺長さが最大である評価範囲に基づいて基準位置が決定され、その基準位置に基づいて２行以上の文字列が結合される（図２のＳ２００の結合画像ＣＩ１参照）。このために、ユーザが、処理済み画像ＰＩ（図２のＳ５００参照）内の文字列を構成する複数個の文字の並びを不自然に感じるのを抑制することができる。 In another example shown in FIG. 4, the target band-shaped area determined as the processing target in S166 includes six characters “d” to “i” that are lowercase alphabets. For each character other than the character “i”, one unit region is determined, but for the character “i”, two unit regions are determined. The lower side Xg of the character “g” exists below the lower sides Xd to Xf, Xh, and Xi2 of the other five characters. In this example, one or more total lower side lengths are calculated for each of the evaluation range including the lower side Xi1, the evaluation range including the lower sides Xd to Xf, Xh, and Xi2, and the evaluation range including the lower side Xg. Since the total lower side length calculated for the evaluation range including the lower sides Xd to Xf, Xh, and Xi2 is maximized, the intermediate position of the evaluation range is determined as the reference position. As described above, in this embodiment, the reference position is determined based on the evaluation range having the maximum total lower side length, and two or more lines of character strings are combined based on the reference position (in S200 of FIG. 2). (See the combined image CI1). For this reason, it is possible to suppress the user from unnaturally feeling the arrangement of a plurality of characters constituting the character string in the processed image PI (see S500 in FIG. 2).

（結合処理；図５） (Combining process; FIG. 5)
続いて、図５を参照して、図２のＳ２００で実行される結合処理の内容を説明する。Ｓ２１０では、ＣＰＵ６２は、スキャン画像ＳＩ内の１個以上のテキスト領域のうちの１個のテキスト領域（以下では「対象テキスト領域」と呼ぶ）を処理対象として決定する。以下では、テキスト領域ＴＯＡが対象テキスト領域として決定される場合を例として説明する。 Next, the contents of the combining process executed in S200 of FIG. 2 will be described with reference to FIG. 5. In S210, the CPU 62 determines one of the one or more text areas in the scanned image SI as the processing target (hereinafter called the “target text area”). The following describes, as an example, the case where the text area TOA is determined as the target text area.

Ｓ２１２では、ＣＰＵ６２は、対象テキスト領域ＴＯＡについて決定された４個の帯状領域ＬＡ１１〜ＬＡ１４（図４のＳ１６４参照）のうちの１個の帯状領域（以下では「対象帯状領域」と呼ぶ）を処理対象として決定する。 In S212, the CPU 62 determines one of the four band-shaped areas LA11 to LA14 determined for the target text area TOA (see S164 in FIG. 4) as the processing target (hereinafter called the “target band-shaped area”).

Ｓ２１４では、ＣＰＵ６２は、対象帯状領域（例えばＬＡ１１）に対応する射影ヒストグラムを生成する。当該射影ヒストグラムは、二値データＢＤ（図３のＳ１１０参照）を構成する複数個の画素のうち、対象帯状領域を表わす各画素を縦方向に射影する場合におけるＯＮ画素（即ち「１」を示す画素）の頻度の分布を示す。換言すると、当該射影ヒストグラムは、スキャン画像データＳＩＤを構成する複数個の画素のうち、対象帯状領域を表わす各画素を縦方向に射影する場合における文字構成画素の頻度の分布を示す。当該射影ヒストグラムでは、１個の文字（例えば「Ａ」）が、頻度がゼロより高い範囲で表わされ、２個の文字（例えば「Ａ」と「Ｂ」）の間の余白の領域が、頻度がゼロである範囲で表わされる。また、当該射影ヒストグラムでは、対象帯状領域の先端（即ち左端）と先頭の文字（例えば「Ａ」）との間に余白が存在する場合には、当該余白の領域（以下では「先端余白領域」と呼ぶ）が、頻度がゼロである範囲で表わされる。同様に、対象帯状領域の後端（即ち右端）と最後の文字（例えば「Ｅ」）との間に余白が存在する場合には、当該余白の領域（以下では「後端余白領域」と呼ぶ）が、頻度がゼロである範囲で表わされる。 In S214, the CPU 62 generates a projection histogram corresponding to the target band-like region (for example, LA11). The projection histogram indicates an ON pixel (that is, “1”) when each pixel representing the target band-like region is projected in the vertical direction among a plurality of pixels constituting the binary data BD (see S110 in FIG. 3). (Pixel) frequency distribution. In other words, the projection histogram shows the distribution of the frequency of the character constituting pixels when each pixel representing the target band-like region is projected in the vertical direction among the plurality of pixels constituting the scan image data SID. In the projection histogram, one character (for example, “A”) is represented in a range where the frequency is higher than zero, and a blank area between two characters (for example, “A” and “B”) Expressed in the range where the frequency is zero. Further, in the projection histogram, when a margin exists between the leading end (that is, the left end) of the target band-like region and the first character (for example, “A”), the margin region (hereinafter, “leading margin region”). Is expressed in the range where the frequency is zero. Similarly, when there is a blank space between the rear end (that is, the right end) of the target band-shaped region and the last character (for example, “E”), the blank region (hereinafter referred to as “rear end blank region”). 
This rear end blank region is likewise represented by a range where the frequency is zero.

Ｓ２１６では、ＣＰＵ６２は、Ｓ２１４で生成された射影ヒストグラムを利用して、対象帯状領域内の先端余白領域の横方向の長さを特定し、先端余白領域の横方向の長さが閾値ｔｈ１よりも大きいのか否かを判断する。これにより、ＣＰＵ６２は、対象帯状領域内の文字列よりも対象帯状領域の先端側に、比較的大きな余白領域（即ちインデント）が存在するのか否かを判断することができる。即ち、ＣＰＵ６２は、対象帯状領域内の文字列が、パラグラフの先頭を構成する文字列であるのか否かを判断することができる。閾値ｔｈ１は、対象帯状領域の縦方向の長さに応じて決定される。具体的には、ＣＰＵ６２は、対象帯状領域の縦方向の長さに等しい値を閾値ｔｈ１として決定する。ただし、変形例では、閾値ｔｈ１は、対象帯状領域の縦方向の長さとは異なる値であってもよいし（例えば対象帯状領域の縦方向の長さの０．５倍）、予め決められている固定値であってもよい。 In S216, the CPU 62 uses the projection histogram generated in S214 to specify the horizontal length of the leading edge margin region in the target band-shaped region, and the horizontal length of the leading margin region is larger than the threshold value th1. Judge whether it is large or not. Thereby, the CPU 62 can determine whether or not a relatively large blank area (that is, an indent) exists on the leading end side of the target band-shaped area with respect to the character string in the target band-shaped area. That is, the CPU 62 can determine whether or not the character string in the target band-like region is a character string that forms the head of the paragraph. The threshold th1 is determined according to the length in the vertical direction of the target band-like region. Specifically, the CPU 62 determines a value equal to the vertical length of the target strip area as the threshold th1. However, in the modified example, the threshold th1 may be a value different from the vertical length of the target strip-shaped region (for example, 0.5 times the vertical length of the target strip-shaped region) or may be determined in advance. It may be a fixed value.

ＣＰＵ６２は、先端余白領域の横方向の長さが閾値ｔｈ１よりも大きいと判断する場合（Ｓ２１６でＹＥＳ）、即ち、対象帯状領域内の文字列がパラグラフの先頭を構成する文字列であると判断する場合には、当該文字列と前行の文字列とを結合対象として決定せずに、Ｓ２１８の処理を実行する。Ｓ２１８では、ＣＰＵ６２は、対象帯状領域内の先端余白領域が維持されていると共に、対象帯状領域内の後端余白領域が消去されている新たな帯状領域を決定する。なお、対象帯状領域が後端余白領域を含まない場合には、Ｓ２１８では、ＣＰＵ６２は、対象帯状領域をそのまま新たな帯状領域として決定する。このようにして、ＣＰＵ６２は、決定済みの新たな帯状領域を表わす部分画像データを取得する。当該部分画像データによって表わされる文字列は、当該文字列よりも先端側にインデントの余白領域を含むので、通常、パラグラフの先頭の文字列を構成している。従って、以下では、当該部分画像データ、当該文字列のことを、それぞれ、「先頭部分画像データ」、「先頭文字列」と呼ぶ。 If the CPU 62 determines that the horizontal length of the leading edge margin area is larger than the threshold th1 (YES in S216), that is, determines that the character string in the target band-shaped area is the character string constituting the head of the paragraph. If so, the process of S218 is executed without determining that the character string and the character string of the previous line are to be combined. In S218, the CPU 62 determines a new band-shaped area in which the leading edge blank area in the target band-shaped area is maintained and the trailing edge blank area in the target band-shaped area is deleted. If the target band-shaped area does not include the trailing edge blank area, in S218, the CPU 62 determines the target band-shaped area as a new band-shaped area as it is. In this way, the CPU 62 acquires partial image data representing a determined new band-like area. Since the character string represented by the partial image data includes an indented blank area on the leading end side of the character string, the character string usually constitutes the first character string of the paragraph. Therefore, hereinafter, the partial image data and the character string are referred to as “first partial image data” and “first character string”, respectively.

一方、ＣＰＵ６２は、先端余白領域の横方向の長さが閾値ｔｈ１以下であると判断する場合（Ｓ２１６でＮＯ）、即ち、対象帯状領域内の文字列がパラグラフの先頭を構成する文字列でないと判断する場合には、当該文字列と前行の文字列とを結合対象として決定して、Ｓ２２０の処理を実行する。Ｓ２２０では、ＣＰＵ６２は、まず、対象帯状領域内の前端余白領域及び後端余白領域が消去された新たな帯状領域を決定する。なお、対象帯状領域が前端余白領域及び後端余白領域を含まない場合には、Ｓ２２０では、ＣＰＵ６２は、対象帯状領域をそのまま新たな帯状領域として決定する。このようにして、ＣＰＵ６２は、決定済みの新たな帯状領域を表わす部分画像データを取得する。当該部分画像データによって表わされる文字列は、通常、パラグラフの先頭の文字列を構成していない。従って、以下では、当該部分画像データ、当該文字列のことを、それぞれ、「非先頭部分画像データ」、「非先頭文字列」と呼ぶ。 On the other hand, when the CPU 62 determines that the horizontal length of the leading edge margin area is equal to or smaller than the threshold th1 (NO in S216), that is, the character string in the target band-like area is not a character string constituting the head of the paragraph. When determining, the character string and the character string of the previous line are determined as a combination target, and the process of S220 is executed. In S220, the CPU 62 first determines a new band area in which the front end margin area and the rear end margin area in the target band area are deleted. If the target band area does not include the front end margin area and the rear end margin area, in S220, the CPU 62 determines the target band area as a new band area as it is. In this way, the CPU 62 acquires partial image data representing a determined new band-like area. The character string represented by the partial image data usually does not constitute the first character string of the paragraph. Therefore, hereinafter, the partial image data and the character string are referred to as “non-leading partial image data” and “non-leading character string”, respectively.
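The decision of S216 and the trimming of S218 and S220 can be sketched on a binary band as follows. The sketch assumes the band is a list of equally long rows (1 = ON pixel) and sets the threshold th1 to the band height, as in the embodiment; the function names and sample bands are hypothetical.

```python
def column_projection(band):
    """Count the ON pixels in each column (projection in the vertical direction)."""
    return [sum(column) for column in zip(*band)]


def trim_band(band):
    """Return (is_paragraph_head, trimmed_band) for one band-shaped area."""
    histogram = column_projection(band)
    inked = [x for x, count in enumerate(histogram) if count > 0]
    if not inked:
        raise ValueError("band contains no character pixels")
    first, last = inked[0], inked[-1]
    th1 = len(band)            # threshold th1 equals the band height
    is_head = first > th1      # 'first' is the width of the leading margin
    start = 0 if is_head else first  # keep the indent only for a paragraph head
    return is_head, [row[start:last + 1] for row in band]


# Leading margin of 3 columns against a band height of 2: paragraph head.
band_a = [[0, 0, 0, 1, 1, 0, 1, 0, 0],
          [0, 0, 0, 1, 0, 0, 1, 0, 0]]
# Leading margin of 1 column: not a paragraph head, both margins removed.
band_b = [[0, 1, 1, 0, 1, 0, 0],
          [0, 1, 0, 0, 1, 0, 0]]
print(trim_band(band_a))
print(trim_band(band_b))
```

In both cases the trailing margin is removed; only the treatment of the leading margin differs, which is what lets the indent survive at the start of a paragraph.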

Ｓ２２０では、さらに、ＣＰＵ６２は、非先頭文字列の前行の文字列（例えばＳ２１８で決定される先頭文字列）が左側に存在すると共に非先頭文字列が右側に存在するように、前行の文字列と非先頭文字列とを横方向に沿って直線状に結合する。より具体的には、ＣＰＵ６２は、前行の文字列について決定された基準位置（図４のＳ１７４参照）と、非先頭文字列について決定された基準位置と、が縦方向の同じ位置に存在するように、前行の文字列を表わす部分画像データ（例えば先頭文字列画像データ）と、非先頭文字列を表わす非先頭文字列画像データと、を結合する。この際に、ＣＰＵ６２は、前行の文字列と非先頭文字列との間に予め決められている横方向の固定長さを有する余白領域が形成されるように、当該余白領域を表わす画素、即ち、スキャン画像ＳＩの背景色を有する画素を補充する。即ち、ＣＰＵ６２は、補充される画素を介して、各部分画像データを結合する。上記の固定長さは、当該固定長さを有する余白領域が後述の分断候補位置として決定され得る長さ（例えば図７のＳ２４２のｈ／４以上の長さ）である。また、ＣＰＵ６２は、結合後の画像データによって表わされる画像が矩形形状にならない場合には、スキャン画像ＳＩの背景色を有する画素を補充して、矩形形状の画像を表わす画像データを生成する。Ｓ２２０が終了すると、Ｓ２２２に進む。 In S220, the CPU 62 further controls the previous line so that the character string of the previous line of the non-first character string (for example, the first character string determined in S218) exists on the left side and the non-first character string exists on the right side. A character string and a non-leading character string are joined in a straight line along the horizontal direction. More specifically, the CPU 62 has the reference position determined for the preceding character string (see S174 in FIG. 4) and the reference position determined for the non-leading character string at the same vertical position. As described above, the partial image data (for example, the leading character string image data) representing the preceding character string and the non-leading character string image data representing the non-leading character string are combined. At this time, the CPU 62 displays a pixel representing the blank area so that a blank area having a predetermined fixed length in the horizontal direction is formed between the character string in the previous line and the non-leading character string. That is, the pixels having the background color of the scanned image SI are supplemented. That is, the CPU 62 combines the partial image data via the pixels to be replenished. The fixed length is a length (for example, a length equal to or greater than h / 4 in S242 in FIG. 
7) such that a blank area of that length can be determined as a division candidate position, described later. If the image represented by the combined image data is not rectangular, the CPU 62 adds pixels having the background color of the scanned image SI to generate image data representing a rectangular image. When S220 ends, the process proceeds to S222.
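The joining step of S220 can be sketched as padding both partial images so that their reference rows coincide, then concatenating them with a blank spacer. The sketch assumes images are lists of rows of pixel values and that the reference position is given as a row index within each image; the function name `join_lines`, the `gap` parameter, and the sample images are hypothetical.

```python
def join_lines(left, left_ref, right, right_ref, gap, background=0):
    """Join two line images side by side with their reference rows aligned.

    Returns (combined_image, reference_row_of_combined_image).
    """
    ref = max(left_ref, right_ref)  # shared reference row in the output
    below = max(len(left) - left_ref, len(right) - right_ref)
    height = ref + below

    def pad(image, image_ref):
        blank = [background] * len(image[0])
        rows = [blank[:] for _ in range(ref - image_ref)]  # padding above
        rows += [row[:] for row in image]
        rows += [blank[:] for _ in range(height - len(rows))]  # padding below
        return rows

    spacer = [background] * gap  # fixed-length blank area between the lines
    combined = [l + spacer + r
                for l, r in zip(pad(left, left_ref), pad(right, right_ref))]
    return combined, ref


# A 3-row image with its reference at row 2, and a 2-row image with it at row 1.
left = [[1, 0], [1, 1], [1, 1]]
right = [[1], [1]]
combined, ref = join_lines(left, 2, right, 1, gap=2)
print(combined, ref)
```

Padding with the background value both above and below each image is what keeps the result rectangular when the two lines have different heights.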

Ｓ２２２では、ＣＰＵ６２は、対象テキスト領域に含まれる全ての帯状領域（例えばＬＡ１１〜ＬＡ１４）について、Ｓ２１２〜Ｓ２２０の処理が終了したのか否かを判断する。ＣＰＵ６２は、処理が終了していないと判断する場合（Ｓ２２２でＮＯ）には、Ｓ２１２において、未処理の帯状領域（例えばＬＡ１２）を処理対象として決定して、Ｓ２１４以降の各処理を再び実行する。そして、ＣＰＵ６２は、処理が終了したと判断する場合（Ｓ２２２でＹＥＳ）には、Ｓ２２４に進む。 In S222, the CPU 62 determines whether or not the processing in S212 to S220 has been completed for all the band-like regions (for example, LA11 to LA14) included in the target text region. If the CPU 62 determines that the process has not ended (NO in S222), the CPU 62 determines an unprocessed belt-like area (eg, LA12) as a process target in S212, and executes each process subsequent to S214 again. . If the CPU 62 determines that the process has been completed (YES in S222), the CPU 62 proceeds to S224.

Ｓ２２４では、ＣＰＵ６２は、スキャン画像ＳＩ内の全てのテキスト領域について、Ｓ２１０〜Ｓ２２２の処理が終了したのか否かを判断する。ＣＰＵ６２は、処理が終了していないと判断する場合（Ｓ２２４でＮＯ）には、Ｓ２１０において、未処理のテキスト領域を処理対象として決定する。そして、ＣＰＵ６２は、処理が終了したと判断する場合（Ｓ２２４でＹＥＳ）には、Ｓ２３０において、分断候補位置決定処理（後述の図７参照）を実行した後に、図５の処理を終了する。 In S224, the CPU 62 determines whether or not the processing in S210 to S222 has been completed for all text regions in the scan image SI. If the CPU 62 determines that the processing has not ended (NO in S224), the CPU 62 determines an unprocessed text area as a processing target in S210. If the CPU 62 determines that the process has been completed (YES in S224), the CPU 62 executes the division candidate position determination process (see FIG. 7 described later) in S230, and then ends the process in FIG.

（結合処理の具体例；図６） (Specific example of the combining process; FIG. 6)
続いて、図６を参照して、図５の結合処理の具体例を説明する。（例１）の（１−１）では、帯状領域ＬＡ１１の先端余白領域ＦＢ１１が閾値ｔｈ１よりも大きいと判断される（Ｓ２１６でＹＥＳ）。帯状領域ＬＡ１１が後端余白領域を含まないので、帯状領域ＬＡ１１がそのまま新たな帯状領域として決定され、帯状領域ＬＡ１１を表わす先頭部分画像データが取得される（Ｓ２１８）。 Next, a specific example of the combining process of FIG. 5 will be described with reference to FIG. 6. In (1-1) of (Example 1), the leading edge blank area FB11 of the band-shaped area LA11 is determined to be larger than the threshold th1 (YES in S216). Because the band-shaped area LA11 does not include a rear end blank area, the band-shaped area LA11 itself is determined as the new band-shaped area, and head partial image data representing the band-shaped area LA11 is acquired (S218).

（１−２）では、帯状領域ＬＡ１２の先端余白領域ＦＢ１２が閾値ｔｈ１以下であると判断される（Ｓ２１６でＮＯ）。帯状領域ＬＡ１２が前端余白領域及び後端余白領域を含まないので、帯状領域ＬＡ１２がそのまま新たな帯状領域として決定され、帯状領域ＬＡ１２を表わす非先頭部分画像データが取得される（Ｓ２２０）。 In (1-2), it is determined that the leading edge blank area FB12 of the strip-shaped area LA12 is equal to or less than the threshold th1 (NO in S216). Since the band-shaped area LA12 does not include the front end margin area and the rear end blank area, the band-shaped area LA12 is determined as a new band-shaped area as it is, and the non-leading partial image data representing the band-shaped area LA12 is acquired (S220).

（１−３）では、２個の帯状領域ＬＡ１１，ＬＡ１２の２個の基準位置が縦方向の同じ位置に存在するように、固定長さを有する余白領域を表わす画素を介して、（１−１）の先頭部分画像データと、（１−２）の非先頭部分画像データと、が結合される（Ｓ２２０）。この結果、文字列「Ａ〜Ｅ」と文字列「Ｆ〜Ｋ」とが横方向に沿って直線状に結合されている１行の文字列「Ａ〜Ｋ」を含む中間画像ＭＩ１を表わす中間画像データが生成される。 In (1-3), the head partial image data of (1-1) and the non-leading partial image data of (1-2) are combined, via pixels representing a blank area having the fixed length, so that the two reference positions of the two band-shaped areas LA11 and LA12 lie at the same vertical position (S220). As a result, intermediate image data is generated that represents an intermediate image MI1 including the one-line character string “A to K”, in which the character strings “A to E” and “F to K” are joined in a straight line along the horizontal direction.

（１−４）では、帯状領域ＬＡ１３の先端余白領域ＦＢ１３が閾値ｔｈ１以下であると判断される（Ｓ２１６でＮＯ）。帯状領域ＬＡ１３が後端余白領域ＲＢ１３を含むので、後端余白領域ＲＢ１３が消去された新たな帯状領域が決定され、当該新たな帯状領域を表わす非先頭部分画像データが取得される（Ｓ２２０）。 In (1-4), it is determined that the leading edge blank area FB13 of the band-shaped area LA13 is equal to or less than the threshold th1 (NO in S216). Since the band-shaped area LA13 includes the rear end blank area RB13, a new band-shaped area from which the rear end blank area RB13 has been deleted is determined, and non-leading partial image data representing the new band-shaped area is acquired (S220).

（１−５）では、３個の帯状領域ＬＡ１１〜ＬＡ１３の３個の基準位置が縦方向の同じ位置に存在するように、固定長さを有する余白領域を表わす画素を介して、（１−３）の中間画像ＭＩ１を表わす中間画像データと、（１−４）の非先頭部分画像データと、が結合される（Ｓ２２０）。この結果、文字列「Ａ〜Ｋ」と文字列「ＬＭ」とが横方向に沿って直線状に結合されている１行の文字列「Ａ〜Ｍ」を含む結合画像ＣＩ１を表わす結合画像データが生成される。 In (1-5), the intermediate image data representing the intermediate image MI1 of (1-3) and the non-leading partial image data of (1-4) are combined, via pixels representing a blank area having the fixed length, so that the three reference positions of the three band-shaped areas LA11 to LA13 lie at the same vertical position (S220). As a result, combined image data is generated that represents a combined image CI1 including the one-line character string “A to M”, in which the character strings “A to K” and “LM” are joined in a straight line along the horizontal direction.

上記の（１−３）及び（１−５）では、非先頭部分画像データが結合される際に、同じ固定長さを有する余白領域が挿入されるので、結合画像ＣＩ１では、文字「Ｅ」と文字「Ｆ」との間の余白の長さと、文字「Ｋ」と文字「Ｌ」との間の余白の長さと、が同じになる。この結果、結合画像ＣＩ１内の文字列「Ａ〜Ｍ」を構成する各文字の間の余白の長さがほぼ等しくなり得る。 In the above (1-3) and (1-5), when the non-leading partial image data is combined, a blank area having the same fixed length is inserted. Therefore, in the combined image CI1, the character “E” And the length of the margin between the characters “F” and the length of the margin between the characters “K” and “L” are the same. As a result, the lengths of the margins between the characters constituting the character string “A to M” in the combined image CI1 can be substantially equal.

In (1-6), the leading blank area FB14 of the strip region LA14 is determined to be larger than the threshold th1 (YES in S216). Because the strip region LA14 includes the trailing blank area RB14, a new strip region from which the trailing blank area RB14 has been removed is determined, and leading partial image data representing the new strip region is acquired (S218). Because the strip region LA14 is the last processing target (YES in S222 of FIG. 5), the leading partial image data acquired here becomes the combined image data representing the combined image CI2. That is, in (Example 1), two sets of combined image data are generated, representing the two combined images CI1 and CI2 that correspond to two paragraphs.
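The branching at S216 in (Example 1) can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function name `group_strips_into_paragraphs`, the list-of-widths input, and the treatment of the very first strip as a paragraph start are assumptions made for the example.

```python
def group_strips_into_paragraphs(front_blanks, th1):
    """Group strip-region indices into paragraphs.

    front_blanks: horizontal length of the leading blank of each strip
                  region, in reading order (hypothetical input format).
    th1:          threshold compared against each leading blank.

    A strip whose leading blank exceeds th1 starts a new paragraph
    (YES in S216, leading partial image data). Otherwise the strip is
    appended to the current paragraph (NO in S216, non-leading data).
    """
    paragraphs = []
    for index, blank in enumerate(front_blanks):
        # Assumption: the first strip always opens a paragraph.
        if blank > th1 or not paragraphs:
            paragraphs.append([index])
        else:
            paragraphs[-1].append(index)
    return paragraphs


# Four strips shaped like LA11 to LA14: indented, flush, flush, indented.
example = group_strips_into_paragraphs([12, 0, 0, 12], th1=5)
```

With these inputs the first three strips form one group and the fourth forms another, mirroring the two combined images CI1 and CI2.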

Next, (Example 2) is described. (2-1) to (2-5) are substantially the same as (1-1) to (1-5) above. In (2-1), the leading blank area FB21 is determined to be larger than the threshold th1, and leading partial image data representing the strip region LA21 is acquired (S218). In (2-2), the leading blank area FB22 is determined to be equal to or less than the threshold th1, and non-leading partial image data representing the strip region LA22 is acquired (S220). In (2-3), the leading partial image data of (2-1) and the non-leading partial image data of (2-2) are combined (S220). As a result, intermediate image data is generated that represents the intermediate image MI2, which includes the single-line character string "d to v". In (2-4), the leading blank area FB23 is determined to be equal to or less than the threshold th1, a new strip region from which the trailing blank area RB23 of the strip region LA23 has been removed is determined, and non-leading partial image data representing the new strip region is acquired (S220). In (2-5), the intermediate image data of (2-3) and the non-leading partial image data of (2-4) are combined such that the three reference positions of the three strip regions LA21 to LA23 lie at the same position in the vertical direction (S220). As a result, intermediate image data is generated that represents the intermediate image MI3, which includes the single-line character string "d to vW to Z" in which the character string "d to v" and the character string "W to Z" are joined in a straight line along the horizontal direction.

The intermediate image MI3 of (2-5) is not rectangular. Therefore, in (2-6), pixels representing a blank area, that is, pixels having the background color of the scan image SI, are added to the intermediate image data representing the intermediate image MI3 so that a rectangular combined image CI3 circumscribing the intermediate image MI3 is formed (S220). Combined image data representing the rectangular combined image CI3 is thereby generated.
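The combining steps of (1-5), (2-5), and (2-6) can be sketched as a single operation: align two images on their reference rows, insert a fixed-length gap, and fill whatever is left with the background color so that the result is rectangular. The sketch below assumes images are row-major lists of pixel values and that each reference position is given as a row index; `join_aligned` and its parameters are hypothetical names, not taken from the specification.

```python
def join_aligned(left, left_ref, right, right_ref, gap, background=0):
    """Join two images horizontally with their reference rows aligned.

    left, right:         images as lists of rows (lists of pixel values).
    left_ref, right_ref: row index of each image's reference position.
    gap:                 fixed number of blank columns inserted between them.
    background:          pixel value used for the gap and for padding.

    Returns (joined_image, reference_row_of_joined_image).
    """
    top = max(left_ref, right_ref)                      # rows above the reference
    bottom = max(len(left) - left_ref, len(right) - right_ref)
    height = top + bottom                               # circumscribing height

    def pad(image, ref):
        width = len(image[0])
        above = top - ref
        below = height - above - len(image)
        blank = [background] * width
        return ([blank[:] for _ in range(above)]
                + [row[:] for row in image]
                + [blank[:] for _ in range(below)])

    padded_left = pad(left, left_ref)
    padded_right = pad(right, right_ref)
    joined = [l + [background] * gap + r
              for l, r in zip(padded_left, padded_right)]
    return joined, top


# A 2-row image and a 3-row image whose reference rows differ by one.
joined, ref = join_aligned([[1, 1], [1, 1]], 1, [[2], [2], [2]], 2, gap=1)
```

Padding both inputs to the common height before concatenating is what turns a non-rectangular outline such as MI3 into a rectangular image such as CI3.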

(Division candidate position determination processing; FIG. 7)
Next, the division candidate position determination processing executed in S230 of FIG. 5 is described with reference to FIG. 7. In S232, the CPU 62 determines, as a processing target, one set of combined image data (hereinafter "target combined image data") from among the one or more sets of combined image data representing the one or more combined images (for example, CI1) generated in S210 to S224 of FIG. 5. In the example of FIG. 7, the target combined image CI1 represented by the target combined image data includes the English sentence "I said 'I have a dream'.".

In S234, the CPU 62 executes binarization processing on the target combined image data. The binarization processing is the same as that of S110 of FIG. 3.

In S236, the CPU 62 uses the binary data generated in S234 to generate a projection histogram corresponding to the target combined image data. The projection histogram shows the distribution of the frequency of ON pixels (that is, character-constituting pixels) when the pixels constituting the binary data are projected in the vertical direction. In the projection histogram, a single character or symbol (for example, "I", "'", or ".") appears as a range in which the frequency is higher than zero, and the blank between two characters or symbols (for example, in "I said", the blank between "I" and "s" or the blank between "s" and "a") appears as a range in which the frequency is zero.
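A vertical projection histogram of the kind described for S236 can be sketched in a few lines. The sketch assumes the binary data is a row-major list of rows containing 1 for ON pixels and 0 for OFF pixels; the function name `projection_histogram` is chosen for illustration.

```python
def projection_histogram(binary):
    """Count ON pixels in every column of a binary image.

    binary: list of rows, each a list of 0 (OFF) or 1 (ON) values.
    Returns a list whose x-th entry is the number of ON pixels in column x.
    """
    if not binary:
        return []
    # zip(*binary) iterates over columns; summing a column counts its ON pixels.
    return [sum(column) for column in zip(*binary)]


# Two glyph-like blobs separated by one empty column.
hist = projection_histogram([[1, 0, 0, 1],
                             [1, 0, 1, 1]])
```

Columns with a count of zero correspond to blanks between characters, and columns with a positive count correspond to characters or symbols.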

In S238, the CPU 62 sets a threshold for distinguishing, within the target combined image CI1, regions in which character-constituting pixels exist from regions in which they do not. Specifically, the CPU 62 sets zero as the threshold in principle. However, when one or more continuous ranges exist in the projection histogram generated in S236, the CPU 62 selects those continuous ranges and, for each selected continuous range, determines the minimum frequency within that range (that is, a value greater than zero) as the threshold. In other words, the CPU 62 determines a value greater than zero as the threshold for a continuous range, and zero as the threshold for any range other than a continuous range. When no continuous range exists, as in the target combined image CI1 of FIG. 7, the CPU 62 determines zero as the threshold for all ranges.

A continuous range is, for example, the range occupied by a decorative line such as a strikethrough or an underline included in the sentence. Because a decorative line is represented by ON pixels, the range of the projection histogram corresponding to the decorative line has a frequency higher than zero and is relatively long in the horizontal direction. For this reason, in this embodiment, the CPU 62 selects, as a continuous range, a range in which the frequency is higher than zero and whose horizontal length is equal to or greater than a predetermined length. The predetermined length is determined in advance according to the resolution of the scan image data SID. For example, the predetermined length is 50 pixels when the resolution of the scan image data SID is 300 dpi, and 100 pixels when the resolution is 600 dpi. The predetermined length may be any value that allows the presence of a decorative line to be identified; for example, it is a value larger than the horizontal length of one character. The threshold determined here is used in S240 and S244 described later.
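The threshold selection of S238 can be sketched as a per-column threshold list. This is an illustrative reading of the paragraph above: a run of consecutive columns with a frequency above zero that is at least `min_run` columns long is treated as a continuous range, and the minimum frequency inside the run becomes the threshold for every column of that run. The names `column_thresholds` and `min_run` are hypothetical; `min_run` plays the role of the predetermined length (for example, 50 at 300 dpi).

```python
def column_thresholds(hist, min_run):
    """Return one threshold per histogram column.

    hist:    projection histogram (ON-pixel count per column).
    min_run: minimum length, in columns, for a run of non-zero
             frequencies to count as a continuous range.
    """
    thresholds = [0] * len(hist)
    start = None
    # A trailing 0 acts as a sentinel so that a run touching the
    # right edge is closed inside the loop.
    for x, freq in enumerate(list(hist) + [0]):
        if freq > 0 and start is None:
            start = x                      # a non-zero run begins
        elif freq == 0 and start is not None:
            if x - start >= min_run:       # long enough: continuous range
                lowest = min(hist[start:x])
                thresholds[start:x] = [lowest] * (x - start)
            start = None
    return thresholds


# Columns 2..6 form a long run whose lowest count (1) is the line itself.
th = column_thresholds([3, 0, 2, 1, 1, 4, 1, 0, 5], min_run=4)
```

Short runs (ordinary characters) keep a threshold of zero, so text without decorative lines is handled exactly as in the default case.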

In S240, the CPU 62 uses the projection histogram generated in S236 and the threshold determined in S238 to determine one intermediate blank area as a processing target. An intermediate blank area is an area corresponding to the blank between two characters or symbols. Specifically, an intermediate blank area is an area that has a frequency equal to or less than the threshold determined in S238 and that is sandwiched between two areas having a frequency higher than that threshold. For example, in the target combined image CI1 of FIG. 7, a frequency of zero is determined as the threshold for all ranges. In this case, for example, the area BA1, which has a frequency of zero and is sandwiched between two areas having a frequency higher than zero (the area of "I" and the area of "s"), is an intermediate blank area. In the first execution of S240, the CPU 62 determines, as the processing target, the intermediate blank area located nearest the leading end, that is, leftmost (area BA1 in the target combined image CI1 of FIG. 7). In the second and subsequent executions of S240, the CPU 62 determines, as the current processing target, the intermediate blank area nearest the leading end among the one or more intermediate blank areas located to the right of the previous processing target (for example, the area between "s" and "a" in "said").
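The definition of an intermediate blank area in S240 can be sketched as a left-to-right scan over the histogram. The sketch assumes a per-column threshold list of the same length as the histogram and returns half-open column intervals; the function name and return format are illustrative choices.

```python
def intermediate_blank_regions(hist, thresholds):
    """Find blank regions sandwiched between two character regions.

    hist:       projection histogram (ON-pixel count per column).
    thresholds: per-column thresholds; a column belongs to a character
                region when its count is strictly above its threshold.
    Returns a list of (start, end) column intervals, end exclusive,
    ordered from the leading (left) side.
    """
    regions = []
    blank_start = None
    seen_character = False
    for x, (freq, limit) in enumerate(zip(hist, thresholds)):
        if freq > limit:                       # character column
            if blank_start is not None:
                regions.append((blank_start, x))
                blank_start = None
            seen_character = True
        elif seen_character and blank_start is None:
            blank_start = x                    # blank after a character
    # A blank still open here touches the right edge and is not
    # sandwiched, so it is intentionally discarded.
    return regions


plain = intermediate_blank_regions([0, 2, 0, 0, 3, 1, 0, 4, 0], [0] * 9)
struck = intermediate_blank_regions([2, 1, 1, 3, 1, 2], [1] * 6)
```

The second call shows why a positive threshold matters: with a one-pixel line running through every column, no column has a count of zero, yet the gaps are still found.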

In S242, the CPU 62 determines whether the horizontal length of the intermediate blank area being processed is less than h/4. Here, "h" is the vertical length of the target combined image CI1 (that is, the number of pixels in the vertical direction).

When the CPU 62 determines that the horizontal length of the intermediate blank area being processed is equal to or greater than h/4 (NO in S242), in other words, that the intermediate blank area is relatively large, the CPU 62 determines, in S246, the right end of the intermediate blank area as a division candidate position. Because a blank area is used as the division candidate position, division in the middle of a single character (for example, "A") can be suppressed. The right end of the intermediate blank area is chosen for the following reason. Suppose that a single-line character string is divided at the right end of each of two intermediate blank areas it contains, and that the resulting three lines of character strings are rearranged so as to be arranged along the vertical direction. In this case, no blank is formed on the left side of the second and third lines, so the leading ends (that is, the left ends) of the second and third lines can be aligned. Because the leading ends of the second and subsequent lines can be aligned in this way, the rearranged multi-line character strings look neat. In a modification, the CPU 62 may determine, in S246, a position other than the right end of the intermediate blank area (for example, the left end or the middle) as the division candidate position. When S246 ends, the process proceeds to S248.

On the other hand, when the CPU 62 determines that the horizontal length of the intermediate blank area being processed is less than h/4 (YES in S242), in other words, that the intermediate blank area is relatively small, the CPU 62 determines, in S244, whether the horizontal length of at least one of the left adjacent area and the right adjacent area is less than h/2. The left (or right) adjacent area is the area that adjoins the intermediate blank area being processed on its left (or right) side and that has a frequency higher than the threshold determined in S238. For example, for the intermediate blank area BA1 in "I said", the area corresponding to "I" is the left adjacent area and the area corresponding to "s" is the right adjacent area.

When the CPU 62 determines that the horizontal lengths of both the left adjacent area and the right adjacent area are equal to or greater than h/2 (NO in S244), for example, when relatively large characters (such as uppercase letters, kanji, or Japanese kana) exist in both adjacent areas, the CPU 62 determines, in S246, the right end of the intermediate blank area as a division candidate position. On the other hand, when the CPU 62 determines that the horizontal length of at least one of the left adjacent area and the right adjacent area is less than h/2 (YES in S244), for example, when a relatively small character (such as a lowercase letter) or a symbol (such as a comma, period, or quotation mark) exists in at least one of the adjacent areas, the CPU 62 proceeds to S248 without executing S246. That is, the CPU 62 does not determine the intermediate blank area currently being processed as a division candidate position.
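The two-stage test of S242 and S244 reduces to a small predicate. The sketch below takes the widths of the blank and of its two adjacent areas together with the image height h; the function name and argument order are assumptions for the example.

```python
def is_division_candidate(blank_width, left_width, right_width, h):
    """Decide whether an intermediate blank area yields a division candidate.

    blank_width: horizontal length of the intermediate blank area.
    left_width:  horizontal length of the left adjacent area.
    right_width: horizontal length of the right adjacent area.
    h:           vertical length (pixel height) of the combined image.
    """
    # S242: a blank of at least h/4 is large enough by itself (NO in S242).
    if blank_width >= h / 4:
        return True
    # S244: a small blank qualifies only when both neighbours are wide,
    # i.e. neither adjacent area is narrower than h/2 (NO in S244).
    return left_width >= h / 2 and right_width >= h / 2


# Height 40: h/4 = 10 and h/2 = 20.
word_gap = is_division_candidate(12, 6, 14, h=40)      # space between words
letter_gap = is_division_candidate(4, 14, 14, h=40)    # inside a lowercase word
kana_gap = is_division_candidate(4, 30, 30, h=40)      # between two wide glyphs
```

When the predicate holds, the embodiment takes the right end of the blank area as the division candidate position (S246).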

In S248, the CPU 62 determines whether the processing of S240 to S246 has been completed for all intermediate blank areas included in the target combined image CI1. When the CPU 62 determines that the processing has not been completed (NO in S248), the CPU 62 determines, in S240, an unprocessed intermediate blank area as the processing target and executes S242 and the subsequent steps again. When the CPU 62 determines that the processing has been completed (YES in S248), the process proceeds to S250.

In S250, the CPU 62 determines whether the processing of S232 to S248 has been completed for all the combined image data generated in S210 to S224 of FIG. 5. When the CPU 62 determines that the processing has not been completed (NO in S250), the CPU 62 determines, in S232, unprocessed combined image data as the processing target. When the CPU 62 determines that the processing has been completed (YES in S250), the processing of FIG. 7 ends.

(Specific example of division candidate position determination processing; FIG. 8)
Next, a specific example of the division candidate position determination processing of FIG. 7 is described with reference to FIG. 8. The target combined image of case A is the same as the target combined image CI1 of FIG. 7. Accordingly, no continuous range (that is, no decorative line) exists in the target combined image, and a frequency of zero is determined as the threshold for all ranges (S238). In this case, the area BA1 between "I" and "s" in "I said" is determined as the first intermediate blank area to be processed (S240). The intermediate blank area BA1 corresponds to the blank (a so-called space) between the word "I" and the word "said", and usually has a horizontal length of h/4 or more (NO in S242). The intermediate blank area BA1 is therefore determined as a division candidate position (S246). Even if the character string is divided at the blank between the two English words "I" and "said", the user is unlikely to find the divided character strings hard to read, which is why the intermediate blank area BA1 is determined as a division candidate position in this embodiment.

Next, the area BA2 between "s" and "a" in "I said" is determined as the second intermediate blank area to be processed (S240). The intermediate blank area BA2 corresponds to the blank between two characters ("s" and "a") constituting the single English word "said", and usually has a horizontal length of less than h/4 (YES in S242). The left adjacent area and the right adjacent area of the intermediate blank area BA2 correspond to "s" and "a" of "said", respectively, and usually have a horizontal length of less than h/2 (YES in S244). The intermediate blank area BA2 is therefore not determined as a division candidate position. If the character string were divided at the blank between two characters (for example, "s" and "a") constituting a single English word (for example, "said"), the user would likely find the divided character strings hard to read, which is why the intermediate blank area BA2 is not determined as a division candidate position in this embodiment.

In the same manner, for each of the third and subsequent intermediate blank areas, it is determined whether the intermediate blank area is a division candidate position. For example, in "... said 'I ...", the intermediate blank area BA3 corresponding to the blank between "d" and "'" has a horizontal length of h/4 or more (NO in S242) and is therefore determined as a division candidate position (S246). By contrast, the intermediate blank area BA4 corresponding to the blank between "'" and "I" has a horizontal length of less than h/4 (YES in S242), and both its left adjacent area "'" and its right adjacent area "I" have a horizontal length of less than h/2 (YES in S244). The intermediate blank area BA4 is therefore not determined as a division candidate position. As a result, five division candidate positions are determined for the sentence "I said 'I have a dream'.".

Case A assumes an example in which the blank between two English words is determined as a division candidate position. However, even when, for example, a space of one character is inserted between two Japanese sentences, that space is usually determined to be NO in S242 and is determined as a division candidate position (S246). For languages other than English and Japanese as well, a relatively large blank, when present, is usually determined as a division candidate position.

In case B, the target combined image includes the same sentence as case A, but a strikethrough is applied to "'I have a dream'.". In this case, in the projection histogram generated in S236, the range corresponding to the strikethrough has a frequency higher than zero and a horizontal length equal to or greater than the predetermined length (for example, 50 pixels), so that range is selected as a continuous range. For the continuous range (that is, "'I have a dream'."), the minimum frequency within the continuous range (that is, a value greater than zero) is determined as the threshold, and for the range other than the continuous range (that is, "I said"), a frequency of zero is determined as the threshold (S238).

In the range other than the continuous range (that is, "I said"), each intermediate blank area is determined using the threshold of zero, as in case A, and the intermediate blank area BA1 is determined as a division candidate position (NO in S242). In the left part of the area BA5 (the part without the strikethrough), the threshold is zero, and in the right part of the area BA5 (the part with the strikethrough), the threshold is a positive value. The left part of the area BA5 is determined to be equal to or less than its threshold (zero), and the right part of the area BA5 is determined to be equal to or less than its threshold (the positive value), so the area BA5 is determined as an intermediate blank area (S240). The areas BA6 and BA7 within the continuous range are likewise each determined to be equal to or less than the threshold (the positive value), so the areas BA6 and BA7 are determined as intermediate blank areas (S240). The intermediate blank areas BA5 and BA7 are then determined as division candidate positions (NO in S242, S246), and the intermediate blank area BA6 is not determined as a division candidate position (YES in S242, YES in S244). Thus, in this embodiment, a threshold greater than zero is determined for the continuous range corresponding to the strikethrough, so that the intermediate blank areas (such as BA5) and the adjacent areas (for example, the area corresponding to "I") can be determined appropriately with the strikethrough taken into account. As a result, five division candidate positions can be determined appropriately, as in case A.

Case B assumes that a strikethrough is applied, but a similar projection histogram is obtained when an underline is applied instead. Therefore, even when an underline is applied, five division candidate positions can be determined appropriately, as in case B.

The target combined image of case C includes a Japanese character string. The intermediate blank area BA8 corresponds to the blank between the bracket C1 and the hiragana C2 ("あ"), and usually has a horizontal length of less than h/4 (YES in S242). The right adjacent area (the hiragana C2) usually has a horizontal length of h/2 or more, but the left adjacent area (the bracket C1) usually has a horizontal length of less than h/2 (YES in S244). The intermediate blank area BA8 is therefore not determined as a division candidate position. If the character string were divided at the blank between a bracket and a character, the user would likely find the divided character strings hard to read, which is why the intermediate blank area BA8 is not determined as a division candidate position in this embodiment.

The intermediate blank area BA9 corresponds to the blank between the left stroke and the right stroke constituting the single hiragana C3 ("い"), and usually has a horizontal length of less than h/4 (YES in S242). The left adjacent area (the left stroke of the hiragana C3) and the right adjacent area (the right stroke of the hiragana C3) usually have a horizontal length of less than h/2 (YES in S244). The intermediate blank area BA9 is therefore not determined as a division candidate position. If the single hiragana C3 were divided, the user could not recognize it, which is why the intermediate blank area BA9 is not determined as a division candidate position in this embodiment. A blank can form within a single character not only for the hiragana "い" but also for the hiragana C8 to C10 ("け", "に", "は"), the katakana C11 ("ハ"), and the kanji C12 ("卵"); such a blank is likewise usually not determined as a division candidate position (YES in S244).

The intermediate blank area BA10 corresponds to the blank between the hiragana C4 ("う") and the hiragana C5 ("え"), and usually has a horizontal length of less than h/4 (YES in S242). The left adjacent area (the hiragana C4) and the right adjacent area (the hiragana C5) usually have a horizontal length of h/2 or more (NO in S244). The intermediate blank area BA10 is therefore determined as a division candidate position (S246). Even if the character string is divided at the blank between two Japanese characters, the user is unlikely to find the divided character strings hard to read, which is why the intermediate blank area BA10 is determined as a division candidate position in this embodiment.

The intermediate blank area BA11 corresponds to the blank between the hiragana C6 ("お") and the full stop C7 ("。"), and usually has a horizontal length of less than h/4 (YES in S242). The left adjacent area (the hiragana C6) usually has a horizontal length of h/2 or more, but the right adjacent area (the full stop C7) usually has a horizontal length of less than h/2 (YES in S244). The intermediate blank area BA11 is therefore not determined as a division candidate position. If the character string were divided at the blank between a character and a full stop, the user would likely find the divided character strings hard to read, which is why the intermediate blank area BA11 is not determined as a division candidate position in this embodiment. Similarly, the blank between a character and a Japanese comma ("、") is usually not determined as a division candidate position (YES in S244).

(Rearrangement processing; FIG. 9)
Next, the rearrangement processing executed in S400 of FIG. 2 is described with reference to FIG. 9. In S410, the CPU 62 determines, as a processing target, one text area (for example, TOA) from among the one or more text areas in the scan image SI. Hereinafter, the text area determined as the processing target in S410 is called the "target text area". The target area determined for the target text area (for example, TA in S300 of FIG. 2) is called the "target target area". The combined image in which the character strings included in the target text area are combined (for example, CI1 and CI2 in S200 of FIG. 2) and the combined image data representing that combined image are called the "target combined image" and the "target combined image data", respectively.

In S420, the CPU 62 sets the horizontal length OPx and the vertical length OPy of the target text area as the initial values of, respectively, the horizontal length W (that is, the number of pixels W in the horizontal direction) and the vertical length H (that is, the number of pixels H in the vertical direction) of a candidate rearrangement area, which is a candidate for the rearrangement area to be determined (see RA in S400 of FIG. 2).

In S430, the CPU 62 judges whether the ratio W/H of the horizontal length W to the vertical length H of the candidate rearrangement area is less than the ratio TW/TH of the horizontal length TW to the vertical length TH of the target target area.

If the CPU 62 judges that the ratio W/H is less than the ratio TW/TH (YES in S430), then in S432 it adds a predetermined fixed value β (e.g., one pixel) to the current horizontal length W of the candidate rearrangement area to determine a new horizontal length W. When S432 ends, the process proceeds to S440.

If, on the other hand, the CPU 62 judges that the ratio W/H is greater than or equal to the ratio TW/TH (NO in S430), then in S434 it subtracts the predetermined fixed value β (e.g., one pixel) from the current horizontal length W of the candidate rearrangement area to determine a new horizontal length W. When S434 ends, the process proceeds to S440. In the present embodiment the same fixed value β is used in S432 and S434, but in a modification the fixed value of S432 and the fixed value of S434 may differ.
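The width adjustment of S430 to S434 can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment; the function name `step_width` and the cross-multiplied comparison are assumptions introduced here.

```python
def step_width(w, h, tw, th, beta=1):
    """Return the new candidate width W after one pass through S430-S434.

    w, h   : current width and height of the candidate rearrangement area
    tw, th : width and height of the target area
    beta   : fixed step in pixels (the text gives one pixel as an example)
    """
    # S430: test W/H < TW/TH. Cross-multiplying avoids float division;
    # this form is an assumption and is valid only for positive h and th.
    if w * th < tw * h:
        return w + beta   # S432: candidate is too narrow, so widen it
    return w - beta       # S434: candidate is wide enough or too wide, so narrow it
```

Note that equal ratios fall into the S434 branch, since the text assigns "greater than or equal to" to NO in S430.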

In S440, the CPU 62 determines the line spacing m along the vertical direction (i.e., the number of pixels m between lines) according to the resolution of the scan image data SID. For example, the CPU 62 sets the line spacing m to one pixel when the resolution of the scan image data SID is 300 dpi, and to two pixels when the resolution is 600 dpi. That is, the higher the resolution of the scan image data SID, the larger the line spacing m that the CPU 62 determines. With this configuration, the CPU 62 can determine a line spacing m of a size appropriate to the resolution of the scan image data SID. In a modification, the same value may be used as the line spacing m regardless of the resolution of the scan image data SID.
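One possible way to express the resolution-dependent line spacing of S440 is shown below. Only the two data points (300 dpi gives 1 pixel, 600 dpi gives 2 pixels) come from the text; the proportional rule, the lower bound of one pixel, and the name `line_spacing` are assumptions made for illustration.

```python
def line_spacing(dpi):
    """Return the line spacing m in pixels for a given scan resolution.

    Assumed rule: one pixel per full 300 dpi, with a minimum of one pixel.
    It reproduces the two examples in the text (300 dpi -> 1, 600 dpi -> 2).
    """
    if dpi <= 0:
        raise ValueError("resolution must be positive")
    return max(1, dpi // 300)
```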

In S450, the CPU 62 executes a line number determination process (see FIG. 10, described later) based on the target combined image data and the new horizontal length W of the candidate rearrangement area determined in S432 or S434. In this process, the CPU 62 determines the number of lines needed when the plurality of characters (e.g., "A to M") included in the target combined image (e.g., CI1 in FIG. 2) are rearranged within the candidate rearrangement area.

(Line number determination process; FIG. 10)
The line number determination process executed in S450 of FIG. 9 is described with reference to FIG. 10. When two or more target combined images exist, the CPU 62 executes the line number determination process for each target combined image and thereby determines two or more line numbers, one for each of the two or more target combined images.

In S451, the CPU 62 judges whether the horizontal length IW of one target combined image (e.g., CI1) is less than or equal to the horizontal length W of the candidate rearrangement area. If the CPU 62 judges that the length IW is less than or equal to the length W (YES in S451), it determines "1" as the number of lines in S452, because all the characters (e.g., "A to M") included in the target combined image (e.g., CI1) fit within the candidate rearrangement area while arranged in a single straight line along the horizontal direction. When S452 ends, the process of FIG. 10 ends.

If, on the other hand, the CPU 62 judges that the length IW is greater than the length W (NO in S451), the plurality of characters included in the target combined image must be divided and arranged over a plurality of lines. To this end, the CPU 62 executes S453 and S454 to select one or more division candidate positions from among the plurality of division candidate positions determined in FIG. 7 (see, e.g., the arrows attached to the target combined image CI1 in FIG. 10).

In S453, the CPU 62 selects one division candidate position from among the plurality of division candidate positions such that the selection length SW is the largest length not exceeding the horizontal length W of the candidate rearrangement area. When no division candidate position has been selected yet, the selection length SW is the horizontal length between the leading end (i.e., the left end) of the target combined image and the division candidate position to be selected. When one or more division candidate positions have already been selected, the selection length SW is the horizontal length between the most recently selected division candidate position and the division candidate position to be newly selected, which lies further toward the trailing end (i.e., to the right). In the example of FIG. 10, the division candidate position between the character "F" and the character "G" in the target combined image CI1 is selected.

In S454, the CPU 62 judges whether the remaining length RW is less than or equal to the horizontal length W of the candidate rearrangement area. The remaining length RW is the horizontal length between the most recently selected division candidate position and the trailing end (i.e., the right end) of the target combined image. If the CPU 62 judges that the remaining length RW is greater than the length W (NO in S454), it returns to S453 and newly selects, from among the plurality of division candidate positions, a division candidate position located further toward the trailing end than the most recently selected one.

If, on the other hand, the CPU 62 judges that the remaining length RW is less than or equal to the length W (YES in S454), then in S455 it determines the number of lines as the number of selected division candidate positions plus one. When S455 ends, the process of FIG. 10 ends.
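The flow of S451 to S455 amounts to a greedy selection of division positions, which can be sketched as follows. This is an illustrative sketch under stated assumptions: division candidate positions are modeled as x coordinates measured from the left end of the combined image, the name `count_lines` is hypothetical, and the error raised when no candidate fits is an addition, since the text does not describe that situation.

```python
def count_lines(image_width, candidates, w):
    """Return (number_of_lines, selected_positions) for one combined image.

    image_width : horizontal length IW of the combined image
    candidates  : x coordinates of the division candidate positions
    w           : horizontal length W of the candidate rearrangement area
    """
    # S451 / S452: the whole image fits on a single line.
    if image_width <= w:
        return 1, []

    selected = []
    start = 0  # left end of the image, or the most recently selected position
    while True:
        # S453: pick the candidate giving the largest selection length SW <= W.
        fitting = [x for x in candidates if start < x <= start + w]
        if not fitting:
            # Not covered by the text; treated as an error in this sketch.
            raise ValueError("no division candidate fits within width W")
        start = max(fitting)
        selected.append(start)
        # S454: does the remaining length RW fit on one line?
        if image_width - start <= w:
            # S455: number of lines = number of selected positions + 1.
            return len(selected) + 1, selected
```

For instance, with 13 characters of 10 pixels each and a candidate after every character, `count_lines(130, list(range(10, 130, 10)), 70)` selects the single position 70 and yields two lines.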

In the example of FIG. 10, the CPU 62 selects only one division candidate position for the target combined image CI1. As a result, the CPU 62 determines "2" as the number of lines corresponding to the target combined image CI1. The CPU 62 also judges that the horizontal length IW of the target combined image CI2 is less than or equal to the horizontal length W of the candidate rearrangement area (YES in S451) and, as a result, determines "1" as the number of lines corresponding to the target combined image CI2.

(Continuation of the rearrangement process; FIG. 9)
In S460 of FIG. 9, the CPU 62 determines a new vertical length H of the candidate rearrangement area according to the formula shown in S460. In that formula, "NH" is the height of the rearranged character strings obtained from one target combined image, and "N" is the number of target combined images. "m" is the line spacing determined in S440, "n" is the number of lines determined in S450, and "h" is the vertical length of the target combined image (see h1 of the combined image CI1 and h2 of the combined image CI2 in FIG. 10).
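The formula of S460 itself appears only in the drawing, but the worked example for FIG. 11 gives NH = n × h + (n − 1) × m per combined image and H = NH1 + NH2 + (N − 1) × m. A minimal sketch generalizing that example to N combined images follows; the generalization and the name `region_height` are assumptions.

```python
def region_height(blocks, m):
    """Return the vertical length H of the candidate rearrangement area.

    blocks : list of (n, h) pairs, one per combined image, where n is the
             number of lines from S450 and h is the image height
    m      : line spacing from S440
    """
    if not blocks:
        raise ValueError("at least one combined image is required")
    # NH for each combined image: n lines of height h with (n - 1) gaps of m.
    heights = [n * h + (n - 1) * m for n, h in blocks]
    # H: all NH values plus (N - 1) gaps of m between consecutive images.
    return sum(heights) + (len(heights) - 1) * m
```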

In S470, the CPU 62 judges whether the aspect ratio W/H of the candidate rearrangement area approximates the aspect ratio TW/TH of the target target area. Specifically, the CPU 62 judges whether the aspect ratio W/H falls within a predetermined range set on the basis of the aspect ratio TW/TH. This predetermined range extends from the value obtained by subtracting a value γ from TW/TH to the value obtained by adding γ to TW/TH. The value γ may be a predetermined fixed value, or a value obtained by multiplying TW/TH by a predetermined coefficient (e.g., 0.05).
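The judgment of S470 can be sketched as a simple range test. The sketch below uses the coefficient variant of γ; the name `is_close_aspect` is hypothetical, and treating the range boundaries as inclusive is an assumption, since the text does not say whether they are.

```python
def is_close_aspect(w, h, tw, th, coeff=0.05):
    """Return True if W/H lies within TW/TH - gamma .. TW/TH + gamma.

    gamma is taken as TW/TH multiplied by coeff (0.05 is the example
    coefficient given in the text).
    """
    target = tw / th
    gamma = target * coeff
    # Inclusive bounds are an assumption of this sketch.
    return target - gamma <= w / h <= target + gamma
```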

If the CPU 62 judges that the aspect ratio W/H of the candidate rearrangement area does not approximate the aspect ratio TW/TH of the target target area (NO in S470), it executes the processes of S430 to S460 again. The CPU 62 thereby determines a new horizontal length W and a new vertical length H of the candidate rearrangement area, and then executes the judgment of S470 again.

If, on the other hand, the CPU 62 judges that the aspect ratio W/H of the candidate rearrangement area approximates the aspect ratio TW/TH of the target target area (YES in S470), then in S480 it first determines the candidate rearrangement area having the horizontal length W and the vertical length H as the rearrangement area (e.g., RA in FIG. 2). Then, if one or more division candidate positions were selected in S453 of FIG. 10, the CPU 62 divides the target combined image data at those positions to generate two or more pieces of divided image data representing two or more divided images. Next, the CPU 62 arranges the two or more pieces of divided image data within the rearrangement area such that the two or more divided images line up along the vertical direction. In doing so, the CPU 62 arranges the divided image data such that the line spacing determined in S440 is formed between any two divided images adjacent in the vertical direction. As a result, rearranged image data is generated that represents a rearranged image RI in which the plurality of characters "A" to "Q" are rearranged within the rearrangement area RA, as shown for example in S400 of FIG. 2. The size of the characters "A" to "Q" in the rearranged image RI is equal to their size in the scan image SI.

In S490, the CPU 62 judges whether the processes of S410 to S480 have been completed for all the text areas. If the CPU 62 judges that they have not (NO in S490), it determines an unprocessed text area as the processing target in S410 and executes the processes from S412 onward again. If the CPU 62 judges that they have been completed (YES in S490), it ends the process of FIG. 9.

(Specific case; FIG. 11)
Next, a specific case of the rearrangement process of S400 in FIG. 2 (see FIG. 9) and the enlargement process of S500 is described with reference to FIG. 11. As shown in (1), the horizontal length OPx and the vertical length OPy of the target text area TOA are set as the initial values of the horizontal length W and the vertical length H of the candidate rearrangement area, respectively (S420 of FIG. 9). In this case, W/H is less than TW/TH. That is, the target target area TA is wider in shape than the target text area TOA. Making the candidate rearrangement area wider therefore brings its aspect ratio closer to that of the target target area TA. Accordingly, as shown in (2), the fixed value β is added to the current horizontal length W of the candidate rearrangement area to determine a new horizontal length W (S432).

Next, three lines containing the character strings "A to E", "F to K", and "LM" are determined as the number of lines corresponding to the combined image CI1, and one line containing the character string "N to Q" is determined as the number of lines corresponding to the combined image CI2 (S450). Then the vertical length NH1 (= 3 lines × h1 + (3 lines − 1) × m) of the three lines of rearranged character strings obtained from the combined image CI1 (i.e., "A to E", "F to K", and "LM") is calculated, and the vertical length NH2 (= 1 line × h2 + (1 line − 1) × m) of the one line of rearranged character string obtained from the combined image CI2 (i.e., "N to Q") is calculated (S460). Finally, the new vertical length H (= NH1 + NH2 + (2 − 1) × m) of the candidate rearrangement area is determined (S460).

In the state of (2), the aspect ratio W/H of the candidate rearrangement area does not approximate the aspect ratio TW/TH of the target target area TA (NO in S470). Therefore, as shown in (3), the fixed value β is added once more to the current horizontal length W of the candidate rearrangement area, and a new horizontal length W is determined again (S432).

Next, two lines containing the character strings "A to F" and "G to M" are determined as the number of lines corresponding to the combined image CI1 (S450). That is, because the horizontal length W of the candidate rearrangement area has increased, the maximum number of characters that can make up one line within the candidate rearrangement area increases. One line containing the character string "N to Q" is determined as the number of lines corresponding to the combined image CI2 (S450). Then the vertical length NH1 (= 2 lines × h1 + (2 lines − 1) × m) of the two lines of rearranged character strings obtained from the combined image CI1 (i.e., "A to F" and "G to M") is calculated, and the vertical length NH2 (= 1 line × h2 + (1 line − 1) × m) of the one line of rearranged character string obtained from the combined image CI2 (i.e., "N to Q") is calculated (S460). Finally, the new vertical length H (= NH1 + NH2 + (2 − 1) × m) of the candidate rearrangement area is determined (S460).

In the state of (3), the aspect ratio W/H of the candidate rearrangement area approximates the aspect ratio TW/TH of the target target area TA (YES in S470). Therefore, as shown in (4), the candidate rearrangement area of (3) is determined as the rearrangement area RA (S480). Next, the combined image data representing the combined image CI1 is divided to generate two pieces of divided image data representing the two divided images DI1 and DI2 (S480). Then the two pieces of divided image data representing the divided images DI1 and DI2 and the target combined image data representing the combined image CI2 are arranged within the rearrangement area RA such that the two divided images DI1 and DI2 and the combined image CI2 line up along the vertical direction and a line spacing of length m is formed between any two adjacent images. As a result, rearranged image data representing the rearranged image RI is generated (S480).

Next, the rearranged image data is enlarged to generate enlarged image data representing an enlarged image (S500 of FIG. 2). Specifically, the rearranged image RI is enlarged in the direction in which its diagonal extends, and enlarged image data representing the enlarged image is generated as a result. For example, when the aspect ratio W/H of the rearrangement area RA equals the aspect ratio TW/TH of the target target area TA, all four sides of the enlarged image coincide with the four sides of the target target area TA. That is, in this case the size of the enlarged image matches the size of the target area TA. When, however, the aspect ratio W/H of the rearrangement area RA is not equal to the aspect ratio TW/TH of the target target area TA, the enlargement ends, in the course of gradually enlarging the rearranged image RI, at the point where any side of the enlarged image coincides with any side of the target target area TA. That is, in this case the size of the enlarged image is smaller than the size of the target area TA.
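Enlarging along the diagonal until the first side reaches the target area is equivalent to a uniform scale by the smaller of the two side ratios. A minimal sketch of that computation follows; the name `enlarge_to_fit` is hypothetical, and computing the factor directly (instead of enlarging gradually as the text describes) is a simplification made here.

```python
def enlarge_to_fit(w, h, tw, th):
    """Return (scale, (enlarged_width, enlarged_height)).

    w, h   : size of the rearranged image RI
    tw, th : size of the target area TA
    """
    # Uniform scaling keeps the aspect ratio; the smaller ratio is the one
    # at which a side of the enlarged image first meets a side of the target.
    scale = min(tw / w, th / h)
    return scale, (w * scale, h * scale)
```

When W/H equals TW/TH both ratios agree and the enlarged image fills the target area exactly; otherwise one dimension stays smaller than the target, as described above.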

Then, as shown in (5), the enlarged image data obtained by enlarging the rearranged image data representing the rearranged image RI is overwritten into the target area TA of the scan image data SID (S500 of FIG. 2). This completes the processed image data PID representing the processed image PI.

(Effects of the embodiment)
According to the present embodiment, the image processing server 50 combines three pieces of partial image data representing the three lines of character strings obtained from the scan image data SID (i.e., "A to E", "F to K", and "LM") to generate combined image data representing a combined image CI1 that contains a single character string in which those three lines are joined in a straight line along the horizontal direction (S200 of FIG. 2). In doing so, the image processing server 50 generates the combined image data using the unit areas corresponding to the individual characters in the scan image SI (S130 of FIG. 3, S170 of FIG. 4). Once the combined image data has been generated, however, the image processing server 50 only needs to divide the combined image data when the characters in the scan image SI are later to be rearranged within the rearrangement area RA (S480 of FIG. 9, FIG. 11), and need not use data indicating the position of each character in the scan image SI. In particular, when dividing the combined image data, the image processing server 50 does not divide it character by character, but divides it into pieces of divided image data each representing a divided image that contains two or more characters (e.g., CI1 in FIG. 11). That is, the image processing server 50 need not execute processing in units of single characters in the scan image SI. As a result, the image processing server 50 can execute the processing quickly and can therefore provide the processed image data PID to the user of the multi-function device 10 quickly.

Furthermore, in the present embodiment, after generating the combined image data representing the combined image CI1, the image processing server 50 can appropriately determine a plurality of division candidate positions (i.e., the later division positions) in the combined image data by using the projection histogram generated in S236 of FIG. 7. That is, the image processing server 50 can appropriately determine the division positions of the combined image data for each of the various cases shown in FIG. 8. The image processing server 50 can therefore divide the combined image data at appropriate positions and appropriately generate the rearranged image data representing the rearranged image RI. As a result, the image processing server 50 can use the rearranged image data to provide the user with an appropriate processed image PI.

(Correspondence)
The image processing server 50 is an example of the "image processing apparatus". The scan image SI is an example of the "original image". In the example of FIG. 11, the first three lines of character strings "A to M" in the scan image SI are an example of the "character strings of M lines". The combined image data representing the combined image CI1, the divided image data representing the divided image DI1, the divided image data representing the divided image DI2, and the rearranged image data representing the rearranged image RI are examples of the "target character string image data", the "first partial character string image data", the "second partial character string image data", and the "specific character string image data", respectively. The character string "A to F" in the divided image DI1 is an example of both the "first partial character string" and the "first specific character string". The character string "G to M" in the divided image DI2 is an example of both the "second partial character string" and the "second specific character string". Thus, in the present embodiment, the "first (or second) partial character string" and the "first (or second) specific character string" coincide. The horizontal direction, the left side, the right side, and the vertical direction are examples of the "first direction", the "first side in the first direction", the "second side in the first direction", and the "second direction", respectively.

W used in S453 of FIG. 10 is an example of the "specific length". In FIG. 7, the continuous range determined in S238, the threshold determined in S238, h/4 used in S242, and h/2 used in S244 are examples of the "specific range", the "first threshold", the "second threshold", and the "third threshold", respectively. For example, in case A of FIG. 8, the area corresponding to "I" and the intermediate blank area BA1 are examples of the "first range" and the "second range", respectively. Likewise, in case C, the area corresponding to the parenthesis C1, the intermediate blank area BA8, and the area corresponding to the hiragana character C2 are examples of the "first range", the "second range", and the "third range", respectively.

Specific examples of the present invention have been described in detail above, but they are merely illustrative and do not limit the scope of the claims. The technology recited in the claims includes various modifications and alterations of the specific examples illustrated above. Modifications of the above embodiment are listed below.

(Modification 1) In the above embodiment, as shown in FIG. 11, the first three lines of character strings "A to E", "F to K", and "LM" in the scan image SI are combined (see the combined image CI1), and the resulting one-line character string "A to M" is then divided (see the two divided images DI1 and DI2). Instead, the CPU 62 need not combine the first three lines of character strings "A to E", "F to K", and "LM" in the scan image SI, and may execute the following processing when generating the rearranged image RI. That is, the CPU 62 divides the partial image data representing the character string "F to K" on the second line in the scan image SI to generate first divided image data representing the character string "F" and second divided image data representing the character string "G to K". Next, the CPU 62 combines the partial image data representing the character string "A to E" on the first line in the scan image SI with the first divided image data representing the character string "F" to generate first combined image data representing the character string "A to F". Next, the CPU 62 combines the second divided image data representing the character string "G to K" with the partial image data representing the character string "LM" on the third line in the scan image SI to generate second combined image data representing the character string "G to M". The CPU 62 then arranges the first combined image data and the second combined image data within the rearrangement area RA to generate the rearranged image data representing the rearranged image RI. In this modification, the character string "F to K" on the second line in the scan image SI and the partial image data representing that character string are examples of the "target character string" and the "target character string image data", respectively. The character string "F" after division, the first divided image data, the character string "G to K" after division, and the second divided image data are examples of the "first partial character string", the "first partial character string image data", the "second partial character string", and the "second partial character string image data", respectively. The character string "A to F" on the first line and the character string "G to M" on the second line in the rearranged image RI are examples of the "first specific character string" and the "second specific character string", respectively. Thus, in this modification, the "first (or second) partial character string" and the "first (or second) specific character string" do not coincide.

(Modification 2) In the above embodiment, enlarged image data is generated by enlarging the rearranged image data, and the enlarged image data is written over a partial area of the scanned image data SID, whereby the processed image data PID is generated (see FIG. 11). Instead, the CPU 62 may write the rearranged image data as it is over a partial area of the scanned image data SID to generate the processed image data PID. That is, the processed image PI may represent characters having the same size as the characters "A to Q" in the scanned image SI. In this modification, the processed image data PID is an example of the "specific character string image data".

(Modification 3) In the above embodiment, the image processing server 50 executes image processing on the scanned image data SID (that is, the processes of S100 to S500 in FIG. 2) to generate the processed image data PID, and transmits the processed image data PID to the multi-function device 10 (S600). Instead, the multi-function device 10 may execute the image processing on the scanned image data SID to generate the processed image data PID (that is, the image processing server 50 need not exist). In this modification, the multi-function device 10 is an example of the "image processing apparatus".

(Modification 4) The target of the image processing executed by the image processing server 50 need not be the scanned image data SID, and may be data generated by document creation software, table editing software, drawing software, or the like. That is, the "original image data" is not limited to data obtained by scanning a scan target sheet, and may be another type of data.

(Modification 5) In the above embodiment, the scanned image SI contains character strings in which the text proceeds from left to right in the horizontal direction and from top to bottom in the vertical direction (that is, horizontally written character strings). Instead, the scanned image SI may contain character strings in which the text proceeds from top to bottom in the vertical direction and from right to left in the horizontal direction (that is, vertically written character strings). In this case, in S162 and S164 of FIG. 4, the image processing server 50 normally cannot determine band-shaped regions based on the horizontal projection histogram. The image processing server 50 therefore generates a vertical projection histogram and determines the band-shaped regions from it. Thereafter, the image processing server 50 may execute the same processing as in the above embodiment, using the vertical direction in place of the horizontal direction and the horizontal direction in place of the vertical direction. In this modification, the vertical direction and the horizontal direction are examples of the "first direction" and the "second direction", respectively. The upper side and the lower side in the vertical direction are examples of the "first side in the first direction" and the "second side in the first direction", respectively.
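The fallback described in Modification 5 can be sketched in Python as follows. This is only an illustration under assumptions not stated in the text: the image is a list of rows of 0/1 pixels (1 being a character constituent pixel), the "horizontal projection histogram" is read as one count per pixel row, the "vertical projection histogram" as one count per pixel column, and a band-shaped region is taken to be a maximal run of non-zero counts. The function names are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # 1 = character constituent pixel, 0 = background pixel


def row_counts(img: Image) -> List[int]:
    """Character-pixel count per pixel row (pixels projected horizontally)."""
    return [sum(row) for row in img]


def column_counts(img: Image) -> List[int]:
    """Character-pixel count per pixel column (pixels projected vertically)."""
    return [sum(col) for col in zip(*img)]


def bands(hist: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, of maximal runs where hist > 0."""
    runs, start = [], None
    for i, v in enumerate(hist):
        if v > 0 and start is None:
            start = i
        elif v == 0 and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(hist)))
    return runs


# Toy vertically written page: two text columns separated by one blank column.
vertical_page = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1],
    [0, 1, 0, 1, 1],
]

# Row counts contain no blank row, so the two text columns cannot be separated.
print(bands(row_counts(vertical_page)))     # prints [(0, 4)]
# Column counts expose the blank column, giving two band-shaped regions.
print(bands(column_counts(vertical_page)))  # prints [(0, 2), (3, 5)]
```

After the bands are found from the column counts, the remaining steps would run with the two axes exchanged, as the modification states.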

(Modification 6) In the above embodiment, the processes of FIGS. 2 to 11 are realized by the CPU 62 of the image processing server 50 executing the program 66 (that is, software). Instead, at least one of the processes of FIGS. 2 to 11 may be realized by hardware such as a logic circuit.

The technical elements described in this specification or the drawings exhibit technical utility either alone or in various combinations, and are not limited to the combinations recited in the claims as filed. Furthermore, the techniques illustrated in this specification or the drawings achieve a plurality of objects simultaneously, and achieving any one of those objects in itself has technical utility.

2: Communication system, 4: Internet, 10: Multi-function device, 50: Image processing server, 52: Network interface, 60: Control unit, 62: CPU, 64: Memory, 66: Program, SI: Scanned image, PI: Processed image, TOB: Text object, POB: Photo object, TOA: Text object area (text area), POA: Photo object area, LA11 to LA14: Band-shaped regions, TA: Target area, RA: Rearrangement area, CI1, CI2: Combined images, DI1, DI2: Divided images, RI: Rearranged image

Claims

An image processing apparatus comprising:
a target character string image data acquisition unit that acquires target character string image data representing a one-line target character string composed of a plurality of characters arranged along a first direction;
a dividing unit that divides the target character string image data to generate first partial character string image data representing a first partial character string that is a part of the target character string and second partial character string image data representing a second partial character string that is a part of the target character string, wherein at least one of the first partial character string and the second partial character string is composed of two or more characters that are some of the plurality of characters; and
a specific character string image data generation unit that uses the first partial character string image data and the second partial character string image data to generate specific character string image data representing a first specific character string and a second specific character string arranged along a second direction orthogonal to the first direction, wherein the first specific character string includes the first partial character string and the second specific character string includes the second partial character string,
wherein the plurality of pixels constituting the target character string image data include character constituent pixels, which constitute the characters included in the target character string, and background pixels, which constitute the background of the characters included in the target character string,
the dividing unit comprises:
a histogram generation unit that generates a projection histogram using the target character string image data, the projection histogram showing the frequency distribution of the character constituent pixels obtained when each pixel constituting the target character string image data is projected along the second direction; and
a division position determination unit that determines a division position of the target character string image data based on the projection histogram and a specific length, the specific length being the upper limit of the length along the first direction of the plurality of lines of character strings represented by the specific character string image data,
the dividing unit divides the target character string image data at the division position to generate the first partial character string image data and the second partial character string image data,
the division position determination unit determines, using the projection histogram, a plurality of candidate positions that are candidates for the division position of the target character string image data, and determines the division position from among the plurality of candidate positions such that the length of the first specific character string along the first direction is the maximum length not exceeding the specific length,
in the target character string, the text proceeds from a first side toward a second side in the first direction, and
the division position determination unit:
in a specific case where the projection histogram indicates that (A) the frequency of the character constituent pixels in a first range along the first direction is greater than a first threshold, (B) the frequency of the character constituent pixels in a second range along the first direction, the second range being located on the second side of the first range in the first direction and adjacent to the first range, is equal to or less than the first threshold, and (C) the frequency of the character constituent pixels in a third range along the first direction, the third range being located on the second side of the second range in the first direction and adjacent to the second range, is greater than the first threshold, determines whether the length of the second range along the first direction is less than a second threshold that is determined according to the number of pixels of the target character string image data along the second direction;
in the specific case, determines whether at least one of the length of the first range along the first direction and the length of the third range along the first direction is less than a third threshold that is determined according to the number of pixels of the target character string image data along the second direction;
when it is determined that the length of the second range along the first direction is equal to or greater than the second threshold, determines a predetermined position within the second range as one candidate position;
when it is determined that the length of the second range along the first direction is less than the second threshold and that both the length of the first range along the first direction and the length of the third range along the first direction are equal to or greater than the third threshold, determines a predetermined position within the second range as one candidate position; and
when it is determined that the length of the second range along the first direction is less than the second threshold and that at least one of the length of the first range along the first direction and the length of the third range along the first direction is less than the third threshold, does not determine the predetermined position within the second range as one candidate position.
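The candidate rule recited in this claim can be sketched in Python as follows. The sketch rests on several assumptions that the claim leaves open: the projection histogram is a list with one count per pixel position along the first direction, the second and third thresholds are simple fractions of the image height (`gap_ratio` and `glyph_ratio` are hypothetical parameters with arbitrary defaults), and the "predetermined position" is taken to be the second-side edge of the second range. The function names are likewise hypothetical.

```python
from itertools import groupby
from typing import List, Tuple


def runs(hist: List[int], t1: int) -> List[Tuple[bool, int, int]]:
    """Split hist into maximal runs (is_high, start, end_exclusive),
    where is_high means the frequency is greater than the first threshold t1."""
    out, pos = [], 0
    for is_high, group in groupby(hist, key=lambda v: v > t1):
        n = len(list(group))
        out.append((is_high, pos, pos + n))
        pos += n
    return out


def candidate_positions(hist: List[int], height: int, t1: int = 0,
                        gap_ratio: float = 0.25,
                        glyph_ratio: float = 0.5) -> List[int]:
    """Return candidate division positions along the first direction."""
    th2 = height * gap_ratio    # assumed form of the second threshold
    th3 = height * glyph_ratio  # assumed form of the third threshold
    r = runs(hist, t1)
    cands = []
    # Only low runs with a high run on both sides satisfy (A), (B) and (C).
    for i in range(1, len(r) - 1):
        is_high, start, end = r[i]
        if is_high:
            continue
        first_len = r[i - 1][2] - r[i - 1][1]   # length of the first range
        gap_len = end - start                   # length of the second range
        third_len = r[i + 1][2] - r[i + 1][1]   # length of the third range
        if gap_len >= th2:
            cands.append(end)   # wide gap: always a candidate
        elif first_len >= th3 and third_len >= th3:
            cands.append(end)   # narrow gap between two wide ranges
        # otherwise: narrow gap next to a narrow range, not a candidate
    return cands


# Height 8 gives th2 = 2.0 and th3 = 4.0 with the default ratios.
hist = [5, 5, 5, 5, 0, 5, 5, 5, 5, 0, 0, 0, 5, 5, 0, 5, 5, 5, 5]
print(candidate_positions(hist, height=8))  # prints [5, 12]
```

In the example, the one-pixel gap at index 14 is rejected because the range before it is only two pixels long, while the one-pixel gap at index 4 is accepted because both of its neighbours are four pixels long.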

The image processing apparatus according to claim 1, wherein the third threshold is larger than the second threshold.

An image processing apparatus comprising:
a target character string image data acquisition unit that acquires target character string image data representing a one-line target character string composed of a plurality of characters arranged along a first direction;
a dividing unit that divides the target character string image data to generate first partial character string image data representing a first partial character string that is a part of the target character string and second partial character string image data representing a second partial character string that is a part of the target character string, wherein at least one of the first partial character string and the second partial character string is composed of two or more characters that are some of the plurality of characters; and
a specific character string image data generation unit that uses the first partial character string image data and the second partial character string image data to generate specific character string image data representing a first specific character string and a second specific character string arranged along a second direction orthogonal to the first direction, wherein the first specific character string includes the first partial character string and the second specific character string includes the second partial character string,
wherein the plurality of pixels constituting the target character string image data include character constituent pixels, which constitute the characters included in the target character string, and background pixels, which constitute the background of the characters included in the target character string,
the dividing unit comprises:
a histogram generation unit that generates a projection histogram using the target character string image data, the projection histogram showing the frequency distribution of the character constituent pixels obtained when each pixel constituting the target character string image data is projected along the second direction; and
a division position determination unit that determines a division position of the target character string image data based on the projection histogram and a specific length, the specific length being the upper limit of the length along the first direction of the plurality of lines of character strings represented by the specific character string image data,
the dividing unit divides the target character string image data at the division position to generate the first partial character string image data and the second partial character string image data,
the division position determination unit determines, using the projection histogram, a plurality of candidate positions that are candidates for the division position of the target character string image data, and determines the division position from among the plurality of candidate positions such that the length of the first specific character string along the first direction is the maximum length not exceeding the specific length,
in the target character string, the text proceeds from a first side toward a second side in the first direction, and
the division position determination unit:
selects, from the projection histogram, a specific range along the first direction, the specific range indicating that the frequency of the character constituent pixels is higher than zero and having a length along the first direction equal to or greater than a predetermined length;
determines the minimum value of the frequency of the character constituent pixels in the frequency distribution within the specific range as a first threshold to be used in the specific range; and
in a specific case where the projection histogram indicates that (A) the frequency of the character constituent pixels in a first range along the first direction is greater than the first threshold and (B) the frequency of the character constituent pixels in a second range along the first direction, the second range being located on the second side of the first range in the first direction and adjacent to the first range, is equal to or less than the first threshold, determines a predetermined position along the first direction within the second range as one candidate position.
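The per-range first threshold recited in this claim can be sketched in Python as follows. This is a hedged illustration, not the claimed implementation: the histogram is assumed to be a list with one count per pixel position along the first direction, the "predetermined position" is assumed to be the first position of the second range, and `nonzero_ranges` and `candidates_in_range` are hypothetical names.

```python
from typing import List, Tuple


def nonzero_ranges(hist: List[int], min_len: int) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, of runs where the frequency
    is higher than zero and whose length is at least min_len."""
    out, start = [], None
    for i, v in enumerate(list(hist) + [0]):  # sentinel closes a trailing run
        if v > 0 and start is None:
            start = i
        elif v == 0 and start is not None:
            if i - start >= min_len:
                out.append((start, i))
            start = None
    return out


def candidates_in_range(hist: List[int], start: int, end: int) -> List[int]:
    """Find candidate positions inside one specific range [start, end)."""
    # First threshold for this range: the minimum frequency found in it.
    t1 = min(hist[start:end])
    cands = []
    for x in range(start + 1, end):
        # (A) frequency above t1 immediately followed by
        # (B) frequency at or below t1 on the second side.
        if hist[x - 1] > t1 and hist[x] <= t1:
            cands.append(x)
    return cands


hist = [0, 3, 4, 1, 4, 5, 1, 1, 3, 0]
for start, end in nonzero_ranges(hist, min_len=6):
    print((start, end), candidates_in_range(hist, start, end))
# prints (1, 9) [3, 6]
```

Because the range never drops to zero, a fixed threshold of zero would yield no candidates here; taking the minimum frequency of the range as the threshold is what exposes positions 3 and 6.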

An image processing apparatus comprising:
a target character string image data acquisition unit that acquires target character string image data representing a one-line target character string composed of a plurality of characters arranged along a first direction;
a dividing unit that divides the target character string image data to generate first partial character string image data representing a first partial character string that is a part of the target character string and second partial character string image data representing a second partial character string that is a part of the target character string, wherein at least one of the first partial character string and the second partial character string is composed of two or more characters that are some of the plurality of characters; and
a specific character string image data generation unit that uses the first partial character string image data and the second partial character string image data to generate specific character string image data representing a first specific character string and a second specific character string arranged along a second direction orthogonal to the first direction, wherein the first specific character string includes the first partial character string, the second specific character string includes the second partial character string, and the specific character string image data is generated such that the first specific character string and the second specific character string are arranged with the leading end of the first specific character string aligned with the leading end of the second specific character string,
wherein the plurality of pixels constituting the target character string image data include character constituent pixels, which constitute the characters included in the target character string, and background pixels, which constitute the background of the characters included in the target character string,
the dividing unit comprises:
a histogram generation unit that generates a projection histogram using the target character string image data, the projection histogram showing the frequency distribution of the character constituent pixels obtained when each pixel constituting the target character string image data is projected along the second direction; and
a division position determination unit that determines a division position of the target character string image data based on the projection histogram and a specific length, the specific length being the upper limit of the length along the first direction of the plurality of lines of character strings represented by the specific character string image data,
the dividing unit divides the target character string image data at the division position to generate the first partial character string image data and the second partial character string image data,
the division position determination unit determines, using the projection histogram, a plurality of candidate positions that are candidates for the division position of the target character string image data, and determines the division position from among the plurality of candidate positions such that the length of the first specific character string along the first direction is the maximum length not exceeding the specific length,
in the target character string, the text proceeds from a first side toward a second side in the first direction, and
the division position determination unit, in a specific case where the projection histogram indicates that (A) the frequency of the character constituent pixels in a first range along the first direction is greater than a first threshold, (B) the frequency of the character constituent pixels in a second range along the first direction, the second range being located on the second side of the first range in the first direction and adjacent to the first range, is equal to or less than the first threshold, and (C) the frequency of the character constituent pixels in a third range along the first direction, the third range being located on the second side of the second range in the first direction and adjacent to the second range, is greater than the first threshold, determines the edge of the second range on the second side as one candidate position.
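The selection of the division position from the candidate positions ("the maximum length not exceeding the specific length") can be sketched in Python as follows. The sketch assumes that candidate positions are pixel offsets along the first direction measured from the first-side end of the target character string, each being the second-side edge of a gap as in this claim. Repeating the selection greedily to produce more than two lines is an assumption made for illustration, and `choose_division` and `wrap` are hypothetical names.

```python
from typing import List, Optional


def choose_division(cands: List[int], start: int, max_len: int) -> Optional[int]:
    """Pick the candidate that makes the piece starting at `start` as long
    as possible without exceeding max_len; None if no candidate fits."""
    fitting = [c for c in cands if start < c <= start + max_len]
    return max(fitting) if fitting else None


def wrap(total_len: int, cands: List[int], max_len: int) -> List[int]:
    """Return division positions so that every resulting piece of the
    one-line string [0, total_len) is at most max_len long."""
    cuts, start = [], 0
    while total_len - start > max_len:
        cut = choose_division(cands, start, max_len)
        if cut is None:
            break  # no candidate fits; a real implementation needs a fallback
        cuts.append(cut)
        start = cut
    return cuts


# A 30-pixel line with four candidate positions and a specific length of 14.
print(wrap(30, [5, 12, 19, 26], max_len=14))  # prints [12, 26]
```

Since each piece in this sketch begins exactly at a second-side gap edge, it starts at the first column of the next character, which is convenient when the pieces are later stacked with their leading ends aligned.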

The image processing apparatus according to any one of claims 1 to 4, further comprising:
an original image data acquisition unit that acquires original image data representing M lines of original character strings (M being an integer of 2 or more), wherein each of the M lines of original character strings is composed of a plurality of characters arranged along the first direction, and the M lines of original character strings are arranged along the second direction,
wherein the target character string image data acquisition unit acquires the target character string image data by combining M pieces of original character string image data, which are obtained from the original image data and represent the M lines of original character strings, to generate the target character string image data, and
the target character string image data represents the target character string in which the M lines of original character strings are joined in a straight line along the first direction.
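Joining M line images into a single one-line image can be sketched in Python as follows. The claim does not say how lines of different heights are reconciled, so the padding used here (background rows added at the bottom of shorter lines) is purely an assumption, as are the list-of-rows image representation and the name `combine_lines`.

```python
from typing import List

Image = List[List[int]]  # 1 = character constituent pixel, 0 = background pixel


def combine_lines(lines: List[Image], background: int = 0) -> Image:
    """Join line images left to right into one image whose height is the
    height of the tallest line; shorter lines are padded at the bottom."""
    height = max(len(img) for img in lines)
    combined: Image = [[] for _ in range(height)]
    for img in lines:
        width = len(img[0]) if img else 0
        for y in range(height):
            row = img[y] if y < len(img) else [background] * width
            combined[y].extend(row)
    return combined


line_a = [[1, 1], [1, 0]]   # 2 pixels high, 2 wide
line_b = [[1, 1, 1]]        # 1 pixel high, 3 wide
line_c = [[0, 1], [1, 1]]   # 2 pixels high, 2 wide

target = combine_lines([line_a, line_b, line_c])
print(target)  # prints [[1, 1, 1, 1, 1, 0, 1], [1, 0, 0, 0, 0, 1, 1]]
```

The resulting image corresponds to the one-line target character string, whose width is the sum of the widths of the M original lines.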

A computer program for an image processing apparatus,
the computer program causing a computer mounted on the image processing apparatus to execute the following steps:
a target character string image data acquisition step of acquiring target character string image data representing a one-line target character string composed of a plurality of characters arranged along a first direction;
a dividing step of dividing the target character string image data to generate first partial character string image data representing a first partial character string that is a part of the target character string and second partial character string image data representing a second partial character string that is a part of the target character string, wherein at least one of the first partial character string and the second partial character string is composed of two or more characters that are some of the plurality of characters; and
a specific character string image data generation step of using the first partial character string image data and the second partial character string image data to generate specific character string image data representing a first specific character string and a second specific character string arranged along a second direction orthogonal to the first direction, wherein the first specific character string includes the first partial character string and the second specific character string includes the second partial character string,
wherein the plurality of pixels constituting the target character string image data include character constituent pixels, which constitute the characters included in the target character string, and background pixels, which constitute the background of the characters included in the target character string,
the dividing step includes:
a histogram generation step of generating a projection histogram using the target character string image data, the projection histogram showing the frequency distribution of the character constituent pixels obtained when each pixel constituting the target character string image data is projected along the second direction; and
a division position determination step of determining a division position of the target character string image data based on the projection histogram and a specific length, the specific length being the upper limit of the length along the first direction of the plurality of lines of character strings represented by the specific character string image data,
in the dividing step, the target character string image data is divided at the division position to generate the first partial character string image data and the second partial character string image data,
in the division position determination step, a plurality of candidate positions that are candidates for the division position of the target character string image data are determined using the projection histogram, and the division position is determined from among the plurality of candidate positions such that the length of the first specific character string along the first direction is the maximum length not exceeding the specific length,
in the target character string, the text proceeds from a first side toward a second side in the first direction, and
in the division position determination step:
in a specific case where the projection histogram indicates that (A) the frequency of the character constituent pixels in a first range along the first direction is greater than a first threshold, (B) the frequency of the character constituent pixels in a second range along the first direction, the second range being located on the second side of the first range in the first direction and adjacent to the first range, is equal to or less than the first threshold, and (C) the frequency of the character constituent pixels in a third range along the first direction, the third range being located on the second side of the second range in the first direction and adjacent to the second range, is greater than the first threshold, it is determined whether the length of the second range along the first direction is less than a second threshold that is determined according to the number of pixels of the target character string image data along the second direction;
in the specific case, it is determined whether at least one of the length of the first range along the first direction and the length of the third range along the first direction is less than a third threshold that is determined according to the number of pixels of the target character string image data along the second direction;
when it is determined that the length of the second range along the first direction is equal to or greater than the second threshold, a predetermined position within the second range is determined as one candidate position;
when it is determined that the length of the second range along the first direction is less than the second threshold and that both the length of the first range along the first direction and the length of the third range along the first direction are equal to or greater than the third threshold, a predetermined position within the second range is determined as one candidate position; and
when it is determined that the length of the second range along the first direction is less than the second threshold and that at least one of the length of the first range along the first direction and the length of the third range along the first direction is less than the third threshold, the predetermined position within the second range is not determined as one candidate position.

A computer program for an image processing apparatus,
the computer program causing a computer mounted on the image processing apparatus to execute the following steps:
a target character string image data acquisition step of acquiring target character string image data representing a one-line target character string composed of a plurality of characters arranged along a first direction;
a dividing step of dividing the target character string image data to generate first partial character string image data representing a first partial character string that is a part of the target character string and second partial character string image data representing a second partial character string that is a part of the target character string, wherein at least one of the first partial character string and the second partial character string is composed of two or more characters that are some of the plurality of characters; and
a specific character string image data generation step of using the first partial character string image data and the second partial character string image data to generate specific character string image data representing a first specific character string and a second specific character string arranged along a second direction orthogonal to the first direction, wherein the first specific character string includes the first partial character string and the second specific character string includes the second partial character string,
wherein the plurality of pixels constituting the target character string image data include character constituent pixels, which constitute the characters included in the target character string, and background pixels, which constitute the background of the characters included in the target character string,
the dividing step includes:
a histogram generation step of generating a projection histogram using the target character string image data, the projection histogram showing the frequency distribution of the character constituent pixels obtained when each pixel constituting the target character string image data is projected along the second direction; and
a division position determination step of determining a division position of the target character string image data based on the projection histogram and a specific length, the specific length being the upper limit of the length along the first direction of the plurality of lines of character strings represented by the specific character string image data,
in the dividing step, the target character string image data is divided at the division position to generate the first partial character string image data and the second partial character string image data,
in the division position determination step, a plurality of candidate positions that are candidates for the division position of the target character string image data are determined using the projection histogram, and the division position is determined from among the plurality of candidate positions such that the length of the first specific character string along the first direction is the maximum length not exceeding the specific length,
in the target character string, the text proceeds from a first side toward a second side in the first direction, and
in the division position determination step:
a specific range along the first direction is selected from the projection histogram, the specific range indicating that the frequency of the character constituent pixels is higher than zero and having a length along the first direction equal to or greater than a predetermined length;
the minimum value of the frequency of the character constituent pixels in the frequency distribution within the specific range is determined as a first threshold to be used in the specific range; and
in a specific case where the projection histogram indicates that (A) the frequency of the character constituent pixels in a first range along the first direction is greater than the first threshold and (B) the frequency of the character constituent pixels in a second range along the first direction, the second range being located on the second side of the first range in the first direction and adjacent to the first range, is equal to or less than the first threshold, a predetermined position along the first direction within the second range is determined as one candidate position.

A computer program for an image processing apparatus,
In the computer mounted on the image processing apparatus, the following steps, that is,
A target character string image data acquisition step of acquiring target character string image data representing a target character string of one line composed of a plurality of characters arranged along the first direction;
A dividing step of dividing the target character string image data into first partial character string image data representing a first partial character string that is a part of the target character string and second partial character string image data representing a second partial character string that is a part of the target character string, wherein at least one of the first partial character string and the second partial character string is constituted by two or more characters that are part of the plurality of characters;
A specific character string image data generating step of generating, using the first partial character string image data and the second partial character string image data, specific character string image data representing a first specific character string and a second specific character string arranged along a second direction orthogonal to the first direction, wherein the first specific character string includes the first partial character string, the second specific character string includes the second partial character string, and the specific character string image data is generated so that the first specific character string and the second specific character string are arranged with the leading end of the first specific character string aligned with the leading end of the second specific character string, the specific character string image data generating step, are executed,
The plurality of pixels constituting the target character string image data include character constituent pixels constituting the characters included in the target character string and background pixels constituting the background of the characters included in the target character string,
The dividing step includes
A histogram generating step of generating a projection histogram using the target character string image data, wherein the projection histogram is a histogram showing the frequency distribution of the character constituent pixels obtained when each pixel constituting the target character string image data is projected along the second direction, the histogram generating step; and
A division position determining step of determining a division position of the target character string image data based on the projection histogram and a specific length, wherein the specific length is the upper limit length, along the first direction, of the character strings of a plurality of lines represented by the specific character string image data, the division position determining step,
In the dividing step, the first partial character string image data and the second partial character string image data are generated by dividing the target character string image data at the division position,
In the division position determining step, a plurality of candidate positions that are candidates for the division position of the target character string image data are determined using the projection histogram, and the division position is determined from among the plurality of candidate positions so that the length of the first specific character string along the first direction becomes the maximum length not exceeding the specific length,
In the target character string, a sentence advances from the first side to the second side in the first direction,
In the division position determining step, in a specific case where the projection histogram indicates that (A) the frequency of the character constituent pixels in a first range along the first direction is greater than a first threshold; (B) the frequency of the character constituent pixels in a second range along the first direction, the second range being located on the second side in the first direction with respect to the first range and being adjacent to the first range, is equal to or less than the first threshold; and (C) the frequency of the character constituent pixels in a third range along the first direction, the third range being located on the second side in the first direction with respect to the second range and being adjacent to the second range, is greater than the first threshold, the end on the second side of the second range is determined as one candidate position, the computer program.
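The steps recited in this claim can be illustrated with a small sketch under stated assumptions. The image is assumed to be a list of pixel rows in which 1 marks a character constituent pixel and 0 a background pixel, the first direction is assumed to be horizontal with the first side on the left, and the first specific character string is assumed to consist only of the first partial character string. All function names are hypothetical and chosen for illustration.

```python
from typing import List, Optional

Image = List[List[int]]  # rows of pixels: 1 = character, 0 = background


def projection_histogram(image: Image) -> List[int]:
    """Project pixels along the second (vertical) direction: one count of
    character constituent pixels per column."""
    return [sum(column) for column in zip(*image)]


def gap_end_candidates(hist: List[int], threshold: int) -> List[int]:
    """Candidate = last column of a run at or below the threshold (B) that
    is preceded (A) and followed (C) by a column above the threshold."""
    candidates = []
    n = len(hist)
    x = 1
    while x < n:
        if hist[x] <= threshold and hist[x - 1] > threshold:
            end = x
            while end < n and hist[end] <= threshold:
                end += 1
            if end < n:  # condition (C): a character column follows
                candidates.append(end - 1)
            x = end
        else:
            x += 1
    return candidates


def choose_division(candidates: List[int],
                    specific_length: int) -> Optional[int]:
    """Pick the candidate giving the longest first line that still fits."""
    fitting = [c for c in candidates if c + 1 <= specific_length]
    return max(fitting) if fitting else None


def split_and_stack(image: Image, division: int) -> Image:
    """Divide after the division column and stack the two parts along the
    second direction with their leading (left) ends aligned."""
    first = [row[:division + 1] for row in image]
    second = [row[division + 1:] for row in image]
    width = max(len(first[0]), len(second[0]))

    def pad(rows: Image) -> Image:
        return [r + [0] * (width - len(r)) for r in rows]

    return pad(first) + pad(second)


image = [
    [1, 1, 0, 0, 1, 1, 0, 1, 1, 1],
    [1, 0, 0, 0, 1, 1, 0, 0, 1, 1],
]
hist = projection_histogram(image)
candidates = gap_end_candidates(hist, threshold=0)
division = choose_division(candidates, specific_length=8)
for row in split_and_stack(image, division):
    print(row)
```

Requiring condition (C) means that a run of background columns at the very end of the line produces no candidate, so the second partial character string is never empty in this sketch.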