JP7019963B2

JP7019963B2 - Character string area / character rectangle extraction device, character string area / character rectangle extraction method, and program

Info

Publication number: JP7019963B2
Application number: JP2017087683A
Authority: JP
Inventors: 敏生岡; 敬由阿部; 達海大庭
Original assignee: Toppan Inc
Current assignee: Toppan Inc
Priority date: 2016-05-10
Filing date: 2017-04-26
Publication date: 2022-02-16
Anticipated expiration: 2037-04-26
Also published as: JP2017204270A

Description

The present invention relates to a character string area / character rectangle extraction device, a character string area / character rectangle extraction method, and a program.

Optical character recognition (OCR) involves character string area extraction, which extracts a character string area indicating the range in which characters exist in a document, and character rectangle extraction, which extracts character rectangles indicating the range of each character included in that character string area. In a typical OCR processing flow, a document read by a scanner or the like is converted into image data, character string area extraction and character rectangle extraction are performed by analyzing the image data, and character recognition is performed on each of the extracted character rectangles.
Various methods have been proposed for having a computer or the like perform character string area extraction and character rectangle extraction automatically, such as a method using projection (for example, Patent Document 1) and a method based on merging black pixels (for example, Patent Document 2). A method has also been proposed in which character string area extraction is performed manually based on a user's operation and character rectangle extraction is performed automatically (for example, Patent Document 3).
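As an illustration of the projection-based approach mentioned above, the following is a minimal sketch, not the method of Patent Document 1 itself. It assumes a binarized image given as a list of rows (1 = black pixel, 0 = white) and horizontally written text; the function name `row_runs` is hypothetical.

```python
from typing import List, Tuple


def row_runs(image: List[List[int]]) -> List[Tuple[int, int]]:
    """Return inclusive (top, bottom) row ranges that contain black pixels.

    `image` is a binary raster: 1 = black (ink), 0 = white (assumed format).
    """
    # Horizontal projection: number of black pixels in each row.
    profile = [sum(row) for row in image]
    runs = []
    start = None
    for y, count in enumerate(profile):
        if count > 0 and start is None:
            start = y  # a text line begins
        elif count == 0 and start is not None:
            runs.append((start, y - 1))  # a blank row ends the text line
            start = None
    if start is not None:
        runs.append((start, len(profile) - 1))
    return runs


page = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0],
]
print(row_runs(page))  # [(1, 2), (4, 4)]
```

A sketch like this splits lines only where a completely blank row exists, which is one reason purely automatic extraction can merge two lines into one on tightly set or skewed pages.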

Patent Document 1: Japanese Unexamined Patent Publication No. H05-266250
Patent Document 2: Japanese Unexamined Patent Publication No. H05-274472
Patent Document 3: Japanese Unexamined Patent Publication No. 2009-70242
Patent Document 4: Japanese Unexamined Patent Publication No. 2001-43313
Patent Document 5: Japanese Unexamined Patent Publication No. 2007-102702
Patent Document 6: Japanese Unexamined Patent Publication No. 2010-225112
Patent Document 7: Japanese Patent No. 5699570

Non-Patent Document 1: Y. LeCun et al., "Gradient-Based Learning Applied to Document Recognition", Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, November 1998.

However, in the prior art in which character string area extraction is performed manually based on a user's operation, erroneous recognition in line extraction may occur, for example, when a character string area spanning several lines is specified. Also, for example, when a character string area is specified one line at a time and a line break falls in the middle of a character string (word), the character string at the end of the character string area is cut off partway, so processing to join the divided character string becomes necessary.

The present invention has been made in view of the above points, and provides a character string area / character rectangle extraction device, a character string area / character rectangle extraction method, and a program that can improve the accuracy of character string area extraction and character rectangle extraction by using auxiliary information based on a user's operation.

(1) The present invention has been made to solve the above problem. One aspect of the present invention is a character string area / character rectangle extraction device comprising: an image data acquisition unit that acquires image data representing an image including a character string; a display unit that displays an image based on the acquired image data; an operation input unit that accepts operation input from a user; a character string area extraction unit that extracts a character string area enclosing, in a rectangle, the entire character string included in a line consisting of the character string included in the image, based on a start point and an end point of the line that are specified based on auxiliary information based on the operation input; and a character string area joining unit that joins the extracted character string area with another character string area adjacent to that character string area.

(2) One aspect of the present invention is the character string area / character rectangle extraction device according to (1), wherein the character string area extraction unit extracts the character string area based on the range in which a connection line connecting the start point and the end point intersects characters.
(3) One aspect of the present invention is the character string area / character rectangle extraction device according to (1) or (2), wherein the operation input unit accepts the operation input based on operation of a pointer, and the character string area joining unit joins the extracted character string area with another character string area adjacent to that character string area based on the operation of the pointer.

(4) One aspect of the present invention is the character string area / character rectangle extraction device according to any one of (1) to (3), wherein the character string area extraction unit corrects the extracted character string area based on auxiliary information based on the operation input.

(5) One aspect of the present invention is the character string area / character rectangle extraction device according to (4), wherein the operation input unit accepts the operation input based on operation of a pointer, and the character string area extraction unit corrects the extracted character string area based on the operation of the pointer.

(6) One aspect of the present invention is the character string area / character rectangle extraction device according to (1), wherein the character string area joining unit joins the extracted character string area with another character string area adjacent to that character string area based on statistical information on the occurrence of character strings.

(7) One aspect of the present invention is the character string area / character rectangle extraction device according to any one of (1) to (6), further comprising a ruby association unit that associates the character string with the ruby corresponding to the character string based on auxiliary information based on the operation input.

(8) One aspect of the present invention is the character string area / character rectangle extraction device according to (7), wherein the display unit displays a ruby association area, which is the range enclosing the associated character string and the ruby corresponding to the character string.

(9) One aspect of the present invention is the character string area / character rectangle extraction device according to any one of (1) to (8), further comprising a character rectangle extraction unit that extracts character rectangles, each representing the rectangle of one of the characters constituting the character string.

(10) One aspect of the present invention is the character string area / character rectangle extraction device according to (9), wherein the character rectangle extraction unit extracts the character rectangles by identifying appropriate character cut positions from among a plurality of candidate character cut positions, based on an evaluation value calculated from character shape information, recognition confidence in character recognition, and the like.

(11) One aspect of the present invention is the character string area / character rectangle extraction device according to (9), wherein the character rectangle extraction unit corrects the extracted character rectangles based on auxiliary information based on the operation input.

(12) One aspect of the present invention is the character string area / character rectangle extraction device according to (11), wherein the operation input unit accepts the operation input based on operation of a pointer, and the character rectangle extraction unit corrects the extracted character rectangles based on the operation of the pointer.

(13) One aspect of the present invention is the character string area / character rectangle extraction device according to any one of (1) to (12), wherein the display unit displays line numbers, which are numbers assigned in the order in which the character string areas indicating the display range of the character strings included in the lines were extracted, each in association with its character string area.

(14) One aspect of the present invention is the character string area / character rectangle extraction device according to (13), wherein the display unit displays the image based on the image data together with a list display image, which is an image in which the line numbers are listed.

(15) One aspect of the present invention is the character string area / character rectangle extraction device according to any one of (1) to (14), wherein the character string area extraction unit generates, based on the start point and the end point specified by auxiliary information based on the operation input for a line consisting of a character string included in the image, identifying information that identifies the positions of the start point and the end point of another line consisting of a character string included in the image, and extracts the character string area of the character string included in the other line based on the start point and the end point identified by the identifying information.

(16) One aspect of the present invention is the character string area / character rectangle extraction device according to (15), wherein the character string area extraction unit corrects the identifying information generated for the image based on auxiliary information based on the operation input for at least one of the start point and the end point identified for the image.

(17) One aspect of the present invention is the character string area / character rectangle extraction device according to (15) or (16), wherein the character string area extraction unit corrects the character string area extracted for the image based on auxiliary information based on the operation input for the character string area extracted for the image.

(18) One aspect of the present invention is the character string area / character rectangle extraction device according to any one of (15) to (17), wherein the character string area extraction unit generates the identifying information for a second image different from a first image, based on at least one of auxiliary information based on the operation input performed on the first image and the identifying information generated for the first image.

(19) One aspect of the present invention is the character string area / character rectangle extraction device according to any one of (15) to (18), wherein the character string area extraction unit extracts the character string area for a second image different from a first image, based on information about the character string area extracted from the first image.

(20) One aspect of the present invention is a character string area / character rectangle extraction method performed by a computer, comprising: an image data acquisition step in which an image data acquisition unit acquires image data representing an image including a character string; a display step in which a display unit displays an image based on the acquired image data; an operation input step in which an operation input unit accepts operation input from a user; a character string area extraction step in which a character string area extraction unit extracts a character string area enclosing, in a rectangle, the entire character string included in a line consisting of the character string included in the image, based on a start point and an end point of the line that are specified based on auxiliary information based on the operation input; and a character string area joining step in which a character string area joining unit joins the extracted character string area with another character string area adjacent to that character string area.

(21) One aspect of the present invention is a program for causing a computer to execute: an image data acquisition step of acquiring image data representing an image including a character string; a display step of displaying an image based on the acquired image data; an operation input step of accepting operation input from a user; a character string area extraction step of extracting a character string area enclosing, in a rectangle, the entire character string included in a line consisting of the character string included in the image, based on a start point and an end point of the line that are specified based on auxiliary information based on the operation input; and a character string area joining step of joining the extracted character string area with another character string area adjacent to that character string area.
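Aspects (1) and (2) describe extracting a character string area from a user-specified start point and end point, using the range where the line connecting them intersects characters. The following is a minimal sketch of one way this could work, under stated assumptions: candidate character regions are already available as axis-aligned bounding boxes (for example, from connected-component analysis), and the string area is taken as the union of the boxes the connection line touches. The names `segment_hits_box` and `extract_string_area` are hypothetical, and the line-box test is the standard Liang-Barsky clipping test, which this text does not prescribe.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom)
Point = Tuple[float, float]


def segment_hits_box(p: Point, q: Point, box: Box) -> bool:
    """Liang-Barsky test: does the segment p-q touch the axis-aligned box?"""
    left, top, right, bottom = box
    dx, dy = q[0] - p[0], q[1] - p[1]
    t0, t1 = 0.0, 1.0  # parameter range of the segment still inside the box
    for delta, lo, hi, origin in ((dx, left, right, p[0]), (dy, top, bottom, p[1])):
        if delta == 0:
            # Segment is parallel to this axis: it must lie within the slab.
            if origin < lo or origin > hi:
                return False
            continue
        ta, tb = (lo - origin) / delta, (hi - origin) / delta
        if ta > tb:
            ta, tb = tb, ta
        t0, t1 = max(t0, ta), min(t1, tb)
        if t0 > t1:
            return False
    return True


def extract_string_area(start: Point, end: Point,
                        char_boxes: List[Box]) -> Optional[Box]:
    """Union of all character boxes crossed by the start-end connection line."""
    hits = [b for b in char_boxes if segment_hits_box(start, end, b)]
    if not hits:
        return None
    return (min(b[0] for b in hits), min(b[1] for b in hits),
            max(b[2] for b in hits), max(b[3] for b in hits))


# Three boxes on one line and one box on the next line (made-up coordinates).
boxes = [(10, 10, 20, 30), (25, 12, 35, 28), (40, 8, 50, 32), (10, 50, 20, 70)]
print(extract_string_area((5, 20), (55, 20), boxes))  # (10, 8, 50, 32)
```

Because only boxes crossed by the connection line contribute, the box on the following line is excluded even though it overlaps the first line horizontally.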

According to the present invention, the accuracy of character string area extraction can be improved by using auxiliary information based on a user's operation.

A block diagram showing the functional configuration of the character string area / character rectangle extraction device according to an embodiment of the present invention.
A diagram showing an example of the character string area extraction screen displayed by the display unit of the character string area / character rectangle extraction device according to the embodiment.
A diagram showing an example of the character string area extraction screen displayed by the display unit of the character string area / character rectangle extraction device according to the embodiment.
A diagram showing an example of the character string area extraction screen displayed by the display unit of the character string area / character rectangle extraction device according to the embodiment.
A diagram showing an example of character string area extraction processing by the character string area extraction unit of the character string area / character rectangle extraction device according to the embodiment.
A diagram showing an example of character string area extraction processing by the character string area extraction unit of the character string area / character rectangle extraction device according to the embodiment.
A diagram showing an example of the character string area extraction screen displayed by the display unit of the character string area / character rectangle extraction device according to the embodiment.
A diagram showing an example of the character string area extraction screen displayed by the display unit of the character string area / character rectangle extraction device according to the embodiment.
A diagram showing an example of character rectangle extraction processing by the character rectangle extraction unit of the character string area / character rectangle extraction device according to the embodiment.
A diagram showing an example of ruby extraction processing by the ruby association unit of the character string area / character rectangle extraction device according to the embodiment.
A diagram showing an example of the character string area management screen displayed by the display unit of the character string area / character rectangle extraction device according to the embodiment.
A diagram showing an example of the character string area management screen displayed by the display unit of the character string area / character rectangle extraction device according to the embodiment.
A diagram showing an example of the character string area management screen displayed by the display unit of the character string area / character rectangle extraction device according to the embodiment.
A diagram showing an example of the character string area management screen displayed by the display unit of the character string area / character rectangle extraction device according to the embodiment.
A flowchart showing the operation of the character string area / character rectangle extraction device according to the embodiment.
A block diagram showing the functional configuration of the character string area / character rectangle extraction device according to Modification 1 of the embodiment.
A flowchart showing the operation of the character string area / character rectangle extraction device according to Modification 1 of the embodiment.
A diagram showing an example of the character string area extraction screen displayed by the display unit of the character string area / character rectangle extraction device according to Modification 2 of the embodiment.
A diagram showing an example of the character string area extraction screen displayed by the display unit of the character string area / character rectangle extraction device according to Modification 2 of the embodiment.
A diagram showing an example of the character string area extraction screen displayed by the display unit of the character string area / character rectangle extraction device according to Modification 2 of the embodiment.
A diagram showing an example of the character string area extraction screen displayed by the display unit of the character string area / character rectangle extraction device according to Modification 2 of the embodiment.
A diagram showing an example of the character string area extraction screen displayed by the display unit of the character string area / character rectangle extraction device according to Modification 2 of the embodiment.
A diagram showing an example of the character string area extraction screen displayed by the display unit of the character string area / character rectangle extraction device according to Modification 2 of the embodiment.
A diagram showing an example of the character string area extraction screen displayed by the display unit of the character string area / character rectangle extraction device according to Modification 2 of the embodiment.

<Embodiment>
Hereinafter, an embodiment of the present invention will be described.
The device according to the present embodiment receives image data representing an image including a character string (for example, image data generated by reading a document with a scanner or the like) and extracts (identifies) the range in which the character string is displayed in the image based on the input image data (hereinafter also referred to as a character string area). The device further extracts (identifies) the range in which each character of the character string included in the extracted character string area is displayed (hereinafter also referred to as a character rectangle). That is, the device according to the present embodiment is a character string area / character rectangle extraction device that extracts character string areas and character rectangles.
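Aspect (10) describes selecting character cut positions from several candidates using an evaluation value. The following is a minimal sketch of that idea under stated assumptions: candidate cut positions are x coordinates along a horizontal line, and the evaluation value is a hypothetical shape score that prefers near-square characters (width close to the line height). Recognition confidence, which the text also mentions, is not modeled. The names `best_cuts` and `squareness` are illustrative, and dynamic programming is one possible search strategy, not one stated in this text.

```python
from typing import Callable, List, Tuple


def best_cuts(candidates: List[int],
              score: Callable[[int, int], float]) -> List[Tuple[int, int]]:
    """Pick the segmentation with the highest total score.

    `candidates` must be sorted; the first and last entries are the left and
    right edges of the character string area.
    """
    n = len(candidates)
    best = [float("-inf")] * n  # best total score for a segmentation ending at j
    prev = [-1] * n             # previous cut index on that best segmentation
    best[0] = 0.0
    for j in range(1, n):
        for i in range(j):
            value = best[i] + score(candidates[i], candidates[j])
            if value > best[j]:
                best[j], prev[j] = value, i
    # Walk back from the right edge to recover the chosen (left, right) spans.
    spans = []
    j = n - 1
    while j > 0:
        i = prev[j]
        spans.append((candidates[i], candidates[j]))
        j = i
    spans.reverse()
    return spans


def squareness(height: int) -> Callable[[int, int], float]:
    """Hypothetical shape score: 0 for a square span, more negative otherwise."""
    def score(left: int, right: int) -> float:
        return -abs((right - left) - height)
    return score


# Cuts at 9 and 31 fall inside characters (e.g. between a radical and its body).
cuts = [0, 9, 20, 31, 40, 60]
print(best_cuts(cuts, squareness(20)))  # [(0, 20), (20, 40), (40, 60)]
```

In this example the spurious candidates at 9 and 31 are rejected because using them would produce spans whose widths deviate from the line height.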

The character string area / character rectangle extraction device according to the present embodiment then performs optical character recognition (OCR) on the image of the character contained in each extracted character rectangle and recognizes the character. In the present embodiment, a well-known technique is used for optical character recognition. For example, one representative character recognition technique is the method using a convolutional neural network described in Non-Patent Document 1.

In the prior art in which character string area extraction and character rectangle extraction are performed automatically, when the layout of the character string areas in a document is complicated, for example, the layout may be misrecognized as one that differs greatly from the actual layout. Examples include misrecognizing the order of the character string areas so that it differs from the actual text, errors in line extraction such as recognizing two lines as one, and misrecognizing pictures or figures as characters. In such cases, misrecognition due to failed character rectangle extraction may also occur. Examples include overlooking a character, recognizing two characters as one, recognizing one character as two, and failing to recognize the ruby attached to a character so that the character and its ruby are recognized together as a single character.

In the prior art in which character string area extraction is performed manually, errors in line extraction and misrecognition due to failed character rectangle extraction may occur, for example, when a character string area spanning several lines is specified. Misrecognition due to failed character rectangle extraction may also occur even when a character string area is specified one line at a time. Furthermore, in that case, when a line break falls in the middle of a character string (word), the character string at the end of the character string area is cut off partway, so processing to join the divided character string becomes necessary.

<Configuration of the character string area / character rectangle extraction device>
Hereinafter, the configuration of the character string area / character rectangle extraction device 1 according to the embodiment will be described with reference to the drawings.
FIG. 1 is a block diagram showing the functional configuration of the character string area / character rectangle extraction device 1 according to the embodiment of the present invention.
As shown in the figure, the character string area / character rectangle extraction device 1 according to the present embodiment includes a control unit 10, an image data acquisition unit 11, an operation input unit 12, a display unit 13, a storage unit 14, a character string area extraction unit 15, a character string area joining unit 16, a character rectangle extraction unit 17, a ruby association unit 18, a management screen generation unit 19, and a character recognition unit 20.

The control unit 10 controls various processes in the character string area / character rectangle extraction device 1. The control unit 10 includes, for example, a CPU (Central Processing Unit).

The image data acquisition unit 11 acquires image data representing an image including a character string. The image data acquisition unit 11 includes, for example, an input interface that acquires image data generated when a document is read by an external device such as a scanner. Alternatively, the image data acquisition unit 11 itself includes, for example, a scanner.

The operation input unit 12 accepts operation input from the user. The operation input unit 12 generates an operation input signal (auxiliary information) based on the accepted operation input and outputs it to the display unit 13, the character string area extraction unit 15, the character string area joining unit 16, the character rectangle extraction unit 17, the ruby association unit 18, or the management screen generation unit 19, which are described later. The operation input unit 12 includes a pointing device that the user uses to operate a pointer (or cursor) displayed on the display unit 13, for example, a mouse, a touch pad, a touch panel, a stylus, or a trackball.

In the present embodiment, the character string area / character rectangle extraction device 1 includes the operation input unit 12, but the invention is not limited to this. For example, the operation input unit 12 may be provided in an external device, and the character string area / character rectangle extraction device 1 may acquire a signal indicating the user's pointer operation from that external device.

The display unit 13 displays an image based on the image data acquired by the image data acquisition unit 11. The display unit 13 includes a display, for example a liquid crystal display (LCD), an organic EL (Organic Electroluminescence) display, or a CRT (Cathode Ray Tube).

In the present embodiment, the character string area / character rectangle extraction device 1 includes the display unit 13, but the invention is not limited to this. For example, the character string area / character rectangle extraction device 1 may transmit image data representing the image to be displayed to an external device, and the image may be displayed on a display unit of that external device.

The storage unit 14 stores the various computer programs, data, and the like used in the character string area / character rectangle extraction device 1. The storage unit 14 also serves as a temporary storage area used in the various arithmetic processes of the device. The storage unit 14 includes a storage medium, for example an HDD (Hard Disk Drive), flash memory, EEPROM (Electrically Erasable Programmable Read Only Memory), RAM (Random Access read/write Memory), ROM (Read Only Memory), or any combination thereof.

Based on the operation input signal (auxiliary information) input from the operation input unit 12, the character string area extraction unit 15 identifies the start point and end point of a line consisting of a character string included in the image based on the image data acquired by the image data acquisition unit 11. Based on the identified start point and end point, it extracts a character string area indicating the display target range of the character string included in that line. The character string area extraction unit 15 also corrects the extracted character string area based on an operation input signal (auxiliary information) that is input from the operation input unit 12 and reflects the user's pointer operation.

The character string area combining unit 16 combines the character string area extracted by the character string area extraction unit 15 with another character string area adjacent to it (for example, an adjacent character string area that was extracted before the character string area in question). The character string area combining unit 16 causes the display unit 13 to display an image showing the combined character string area.

The character string area combining unit 16 combines the extracted character string area with the adjacent character string area based on an operation input signal (auxiliary information) that is input from the operation input unit 12 and reflects the user's pointer operation.

The character rectangle extraction unit 17 extracts character rectangles, each of which is the bounding rectangle of one character of the character string included in the character string area extracted by the character string area extraction unit 15. The character rectangle extraction unit 17 also corrects the extracted character rectangles based on an operation input signal (auxiliary information) that is input from the operation input unit 12 and reflects the user's pointer operation.

Note that, as in the image processing device described in Patent Document 7, for example, the character rectangle extraction unit 17 may extract character rectangles by selecting an appropriate character cutout position from among a plurality of candidates, based on an evaluation value calculated from character shape information, recognition confidence in character recognition, and the like.

For example, when characters are cut out from the character string "化学" ("chemistry"), there are several candidate cutout positions: a pattern in which "化" and "学" are cut out, a pattern in which "イ" and "ヒ学" are cut out, and a pattern in which "イ", "ヒ", and "学" are cut out. From among these candidates, an appropriate cutout position is selected based on an evaluation value calculated from previously accumulated character shape information, recognition confidence in character recognition, and the like.
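The selection among cutout candidates can be sketched as follows. This is only an illustrative sketch, not the method of Patent Document 7: the function names, the confidence table, and the use of a mean score as the evaluation value are all assumptions made for the example.

```python
from itertools import combinations

def candidate_segmentations(pieces):
    """Yield every way of grouping adjacent pieces into characters."""
    n = len(pieces)
    for k in range(n):                      # number of cut positions used
        for cuts in combinations(range(1, n), k):
            bounds = (0, *cuts, n)
            yield [tuple(pieces[a:b]) for a, b in zip(bounds, bounds[1:])]

def best_segmentation(pieces, score):
    """Return the grouping with the highest mean per-character score."""
    return max(candidate_segmentations(pieces),
               key=lambda seg: sum(score(g) for g in seg) / len(seg))

# Hypothetical recognition confidences (assumed values, not measured data).
CONFIDENCE = {
    ("イ", "ヒ"): 0.95,        # the two pieces together read as the character 化
    ("学",): 0.97,
    ("イ",): 0.60,
    ("ヒ",): 0.55,
    ("ヒ", "学"): 0.05,
    ("イ", "ヒ", "学"): 0.01,
}

def score(group):
    return CONFIDENCE.get(group, 0.0)

result = best_segmentation(["イ", "ヒ", "学"], score)
```

With these assumed confidences, the grouping that treats "イ" and "ヒ" as one character and "学" as another scores highest.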

Based on an operation input signal (auxiliary information) reflecting the user's pointer operation, the ruby mapping unit 18 associates a character string included in a character string area extracted by the character string area extraction unit 15 with the ruby that was extracted by the character string area extraction unit 15 and corresponds to that character string. The ruby mapping unit 18 causes the display unit 13 to display an image showing a ruby association area, which is the range enclosing the associated character string and its corresponding ruby.

The management screen generation unit 19 generates a management screen with which the user corrects character string areas and character rectangles. The management screen generation unit 19 associates each character string area extracted by the character string area extraction unit 15 (a character string area extracted line by line) with a line number, which is a number assigned in the order of extraction. The management screen generation unit 19 then causes the display unit 13 to display a management screen that shows an image in which a line number is attached to each character string area (extracted line by line) included in the image based on the image data acquired by the image data acquisition unit 11, together with a list display image in which those line numbers are listed.

The character recognition unit 20 performs optical character recognition (OCR) on the image of the character contained in each character rectangle extracted by the character rectangle extraction unit 17, and recognizes the character. Specifically, the character recognition unit 20 analyzes image data representing characters contained in a document (for example, one read by a scanner) and converts it into a computer-editable data format (for example, a sequence of character codes).

(Extraction of character string area)
The extraction of a character string area by the character string area / character rectangle extraction device 1 is described below with reference to the drawings.
FIGS. 2 to 4 are diagrams showing an example of the character string area extraction screen displayed by the display unit 13 of the character string area / character rectangle extraction device 1 according to the embodiment of the present invention.

As shown in FIG. 2, the user operates the pointer pt1 with the operation input unit 12 to specify, line by line, the range of a character string included in the character string area extraction screen ds1. To specify the character string area for a line, the user, for example, uses a mouse to first move the pointer pt1 to a position near the start point of the line and click the mouse button, and then move the pointer pt1 to a position near the end point of the line and click the mouse button again. Alternatively, the user, for example, moves the pointer pt1 to a position near the start point of the line, presses the mouse button, and drags the pointer pt1 to a position near the end point of the line.

As shown in FIG. 2, when the start point of a line is specified, an icon, for example a white circle, is displayed at the start point st1, which is the specified position (that is, for example, the position of the first mouse click or the position where the drag started).
Next, as shown in FIG. 3, when the end point of the line is specified, an icon, for example a black circle, is displayed at the end point ed1, which is the specified position (that is, for example, the position of the second mouse click or the position where the drag ended). When the end point is specified, a connection line cn1 connecting the start point st1 and the end point ed1 is also displayed.
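The two ways of designating a start point and an end point (two clicks, or a single drag) might be handled as in the following sketch. The event format, the function name, and the `drag_threshold` parameter are assumptions introduced for illustration; the embodiment does not specify how pointer events are represented.

```python
def endpoints_from_events(events, drag_threshold=3):
    """Derive (start, end) from a stream of ("down" | "up", x, y) pointer events.

    A press and release at nearly the same position counts as a click;
    a release far from the press counts as the end of a drag.
    """
    start = None   # position of the first click, once seen
    press = None   # position of the most recent button press
    for kind, x, y in events:
        if kind == "down":
            press = (x, y)
        elif kind == "up" and press is not None:
            moved = abs(x - press[0]) + abs(y - press[1]) > drag_threshold
            if moved and start is None:
                return press, (x, y)       # drag: press is start, release is end
            if start is None:
                start = press              # first click sets the start point
            else:
                return start, (x, y)       # second click sets the end point
            press = None
    return None                            # designation not yet complete

clicks = [("down", 10, 30), ("up", 10, 30), ("down", 200, 31), ("up", 200, 31)]
drag = [("down", 10, 30), ("up", 200, 31)]
```

Both event sequences above yield the same pair of points, which would then define the connection line.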

Then, as shown in FIG. 4, when the start point and end point of a line have been specified, an enclosing line is displayed so as to surround, as a rectangle, the entire character string included in that line. The region surrounded by the enclosing line is the character string area extracted by the character string area extraction unit 15. FIG. 4 shows a state in which three character string areas, sa1, sa2, and sa3, have been extracted.

An example of the character string area extraction process is described below with reference to the drawings.
FIG. 5 is a diagram showing an example of the character string area extraction process performed by the character string area extraction unit 15 of the character string area / character rectangle extraction device 1 according to the embodiment of the present invention.

As shown in FIG. 5(A), when the start point st1 and the end point ed1 are specified and the connection line cn1 is set, the character string area extraction unit 15 identifies the range in which the connection line cn1 intersects characters. In the example of FIG. 5(A), the comma "、" does not intersect the connection line cn1, but every other character does. At each position where a character intersects the connection line cn1, the character string area extraction unit 15 measures the positions of the upper end and the lower end of that character. From the measured upper and lower ends, it then identifies the uppermost end portion up1, which is the highest point, and the lowermost end portion lp1, which is the lowest point.

In the example of FIG. 5, the uppermost end portion up1 is the upper end of the character "書", as shown in FIG. 5(B), and the lowermost end portion lp1 is the lower end of the character "学", as shown in FIG. 5(C).

Having identified the uppermost end portion up1, the character string area extraction unit 15 sets the uppermost end line ul1, a line extended horizontally to the left and right from up1, as the line representing the upper edge of the character string area sa2. Similarly, having identified the lowermost end portion dp1, it sets the lowermost end line dl1, a line extended horizontally to the left and right from dp1, as the line representing the lower edge of the character string area sa2.

The character string area extraction unit 15 also sets a line extended vertically up and down from the start point st1 as the line representing the left edge of the character string area sa2. Similarly, it sets a line extended vertically up and down from the end point ed1 as the line representing the right edge of the character string area sa2.

Since the vertical and horizontal extents of the character string area sa2 are thus determined, the range of the character string area sa2 is uniquely determined.
FIG. 5(D) shows the character string area sa2 extracted by the method described above.
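The extraction steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: each character is given as a precomputed bounding box (a stand-in for whatever pixel analysis the device actually performs), image coordinates are used (y grows downward), and the names `Box`, `crosses`, and `extract_string_area` are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Box:
    left: float
    top: float
    right: float
    bottom: float

def line_y_at(start, end, x):
    """y coordinate of the connection line at horizontal position x."""
    (x0, y0), (x1, y1) = start, end
    if x1 == x0:
        return y0
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

def crosses(box, start, end):
    """True if the connection line segment passes through the box."""
    lo = max(box.left, min(start[0], end[0]))
    hi = min(box.right, max(start[0], end[0]))
    if lo > hi:
        return False                      # no horizontal overlap
    ya, yb = line_y_at(start, end, lo), line_y_at(start, end, hi)
    return min(ya, yb) <= box.bottom and max(ya, yb) >= box.top

def extract_string_area(char_boxes, start, end):
    """Bound the area by the crossed characters and the two designated points."""
    hit = [b for b in char_boxes if crosses(b, start, end)]
    if not hit:
        return None
    top = min(b.top for b in hit)         # uppermost end among crossed characters
    bottom = max(b.bottom for b in hit)   # lowermost end among crossed characters
    left, right = sorted((start[0], end[0]))
    return Box(left, top, right, bottom)

boxes = [
    Box(10, 20, 30, 42),
    Box(35, 18, 55, 40),
    Box(58, 38, 62, 44),   # a low-sitting comma that the line does not cross
    Box(65, 22, 85, 45),
]
area = extract_string_area(boxes, start=(8, 30), end=(90, 30))
```

In this example the comma is ignored because the connection line at y = 30 passes above it, so the upper and lower edges come only from the three crossed characters.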

The method of extracting a character string area described above is merely an example. Alternatively, for example, the character string area extraction unit 15 may set the uppermost end line ul1 and the lowermost end line dl1 based on a position a predetermined length above the uppermost end portion up1 and a position a predetermined length below the lowermost end portion dp1, respectively. That is, the character string area extraction unit 15 may set, as the character string area sa2, a range that extends slightly above the uppermost end portion up1 and slightly below the lowermost end portion dp1 (in other words, it may make the character string area sa2 larger by a predetermined length).

As a further alternative, for example, the character string area extraction unit 15 may, without identifying the positions of the uppermost end portion up1 and the lowermost end portion dp1, set as the uppermost end line ul1 and the lowermost end line dl1 the lines located a predetermined length above and below the connection line cn1, respectively.
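The two alternatives can be expressed as small helpers. This is an illustrative sketch only; the function names and the margin values are assumptions, and image coordinates are assumed (y grows downward, so "above" means a smaller y).

```python
def pad_vertically(top, bottom, margin):
    """Widen a measured [top, bottom] range by a fixed margin on each side."""
    return top - margin, bottom + margin

def band_around_line(line_y, half_height):
    """Skip measurement and take a fixed band around a horizontal connection line."""
    return line_y - half_height, line_y + half_height

padded = pad_vertically(18, 45, margin=3)
band = band_around_line(30, half_height=12)
```

The first helper still depends on measuring the characters, while the second depends only on where the user placed the connection line, so it is cheaper but more sensitive to imprecise pointer placement.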

Note that the character string from which the character string area extraction unit 15 extracts a character string area is not necessarily one in which the characters are arranged horizontally (that is, horizontal writing). For example, it may be a character string in which the characters are arranged vertically (that is, vertical writing) or diagonally. The range of the character string area is therefore preferably set by a method suited to the way the character string is written.

As described above, the character string area extraction unit 15 can also correct an extracted character string area based on an operation input signal (auxiliary information) that is input from the operation input unit 12 and reflects the user's pointer operation.
FIG. 6 is a diagram showing an example of the character string area extraction process performed by the character string area extraction unit 15 of the character string area / character rectangle extraction device 1 according to the embodiment of the present invention.

In FIG. 6, the character string area extracted by the character string area extraction unit 15 includes both the range containing "矩形を抽出する" ("extract a rectangle") and the range containing "くけい". As illustrated, the extracted character string area is the region that includes the range enclosed by the dotted line. That is, the character string area extraction unit 15 has treated the text up to and including "くけい", the furigana (reading) of "矩形" ("rectangle"), as a single-line character string area, and has therefore extracted an incorrect character string area.

According to the character string area / character rectangle extraction device 1 according to the present embodiment, the user can correct a character string area erroneously extracted by the character string area extraction unit 15.
The user operates the pointer pt1 with the operation input unit 12 to move the upper edge of the erroneously extracted character string area sa100. Specifically, the user, for example, operates the mouse to move the pointer pt1 to the upper edge of the character string area sa100 shown by the dotted line in FIG. 6, and drags that edge to the position of the upper edge of the character string area sa101 shown by the solid line in FIG. 6. As a result, the character string area sa101 is correctly modified so that it contains only the character string "矩形を抽出する" and does not contain the furigana "くけい".

(Combining character string areas)
The extraction described above extracts character string areas line by line. However, a line may end in the middle of a word, so there are cases where it is desirable to have multiple lines recognized together as a single character string area.
The combining of character string areas by the character string area / character rectangle extraction device 1 is described below with reference to the drawings.
FIGS. 7 and 8 are diagrams showing an example of the character string area extraction screen displayed by the display unit 13 of the character string area / character rectangle extraction device 1 according to the embodiment of the present invention.

As shown in FIG. 7, when setting the end point ed1, the user performs an operation indicating that the character string area being set is to be combined with another character string area adjacent to it. In the example of FIG. 7, this operation is dragging the pointer pt1 downward when setting the end point ed1. Alternatively, when the display unit 13 is a touch panel, for example, the operation is a flick.
When such an operation is performed, the character string area combining unit 16 combines the character string area being set with the character string area set immediately before it (in the example of FIG. 7, the character string area containing the character string "光学文字認識（ＯＣＲ）においては、文書中の").

When an operation to combine character string areas is performed, the combined character string area is extracted as shown in FIG. 8. FIG. 8 shows a state in which two character string areas, sa1 and sa4, have been extracted on the character string area extraction screen ds1. Unlike in the example of FIG. 4, in the example of FIG. 8 the character string area sa4 is extracted, which is the character string area formed by combining the character string areas sa1 and sa2 shown in FIG. 4.
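One simple way to model the combining of a newly set line with the previously set character string area is sketched below. The representation (each character string area as a list of line rectangles) and the `join_previous` flag, standing in for the downward drag or flick, are assumptions made for illustration.

```python
def add_line(areas, rect, join_previous=False):
    """Append a newly extracted line rectangle.

    If join_previous is set and an earlier area exists, the rectangle is
    added to that area instead of starting a new one.
    """
    if join_previous and areas:
        areas[-1].append(rect)
    else:
        areas.append([rect])
    return areas

# Rectangles are (left, top, right, bottom) tuples with assumed coordinates.
line1 = (10, 10, 300, 30)
line2 = (10, 35, 300, 55)
line3 = (10, 60, 180, 80)

areas = []
add_line(areas, line1)
add_line(areas, line2, join_previous=True)   # end point set with the combine gesture
add_line(areas, line3)
```

Keeping the individual line rectangles, rather than replacing them with one enclosing rectangle, preserves the reading order within the combined area when the lines have different widths.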

(Extraction of character rectangles)
The extraction of character rectangles by the character string area / character rectangle extraction device 1 is described below with reference to the drawings.
FIG. 9 is a diagram showing an example of the character rectangle extraction process performed by the character rectangle extraction unit 17 of the character string area / character rectangle extraction device 1 according to the embodiment of the present invention.

FIG. 9(A) shows a state in which the character rectangle extraction unit 17 has performed character rectangle extraction on each of the character string areas sa5 and sa6 extracted by the character string area extraction unit 15.

Known methods of extracting the character rectangle of each character from a character string containing multiple characters include, for example, the method using two threshold values described in Patent Document 4 and methods using a histogram. Any character rectangle extraction method may be used for the extraction of character rectangles in the present embodiment.
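As one illustration of a histogram-based approach, the following sketch splits a line at columns that contain no ink, using a vertical projection histogram. This is a common technique chosen here as an assumption; the embodiment does not state which histogram method is meant, and the function name and bitmap format (rows of 0/1 values, 1 for a black pixel) are hypothetical.

```python
def split_by_projection(bitmap):
    """Return (start, end) column ranges, end exclusive, of runs containing ink."""
    width = len(bitmap[0])
    # Number of black pixels in each column.
    histogram = [sum(row[x] for row in bitmap) for x in range(width)]
    spans, start = [], None
    for x, count in enumerate(histogram):
        if count and start is None:
            start = x                     # a run of inked columns begins
        elif not count and start is not None:
            spans.append((start, x))      # the run ended at an empty column
            start = None
    if start is not None:
        spans.append((start, width))      # run reaches the right edge
    return spans

bitmap = [
    [1, 1, 0, 0, 1, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 0, 1],
    [1, 1, 0, 0, 1, 0, 1, 1],
]
spans = split_by_projection(bitmap)
```

A purely gap-based split like this cuts wherever an empty column appears, including a gap between the left and right components of a single character, which is one reason user correction of the extracted rectangles is useful.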

In the example of FIG. 9(A), character rectangles have been extracted without error for all characters included in the character string area sa5. In the character string area sa6, on the other hand, incorrect character rectangles have been extracted at three places, so the user performs an operation to correct each of them. FIGS. 9(B), 9(C), and 9(D) illustrate the user's correction operations for these three places.

As shown in the upper part of FIG. 9(B), the character rectangle extraction unit 17 has recognized the character "域" together with the left-hand radical (hen, here the hand radical) of the character "抽" as a single character, and has extracted the rectangle rc01, which is an incorrect character rectangle. Likewise, it has recognized only the right-hand component (tsukuri, here "由") of the character "抽" as a single character, and has extracted the rectangle rc02, which is also an incorrect character rectangle.

According to the character string area / character rectangle extraction device 1 according to the present embodiment, the user can correct a character rectangle erroneously extracted by the character rectangle extraction unit 17.
The user operates the pointer pt1 with the operation input unit 12 to move the line that is both the right edge of the erroneously extracted character rectangle rc01 and the left edge of the erroneously extracted character rectangle rc02 (that is, the dividing line sl1). Specifically, the user, for example, operates the mouse to move the pointer pt1 to the dividing line sl1 shown in the upper part of FIG. 9(B), and drags the dividing line sl1 to the position shown in the lower part of FIG. 9(B).
As a result, the rectangle rc11 contains only the character "域", and the rectangle rc12 correctly contains the whole character "抽".

As shown in the upper part of FIG. 9(C), the character rectangle extraction unit 17 has erroneously split a run of several "・" (middle dot) characters into two character rectangles.
In this example, when several symbols such as "・" (middle dot) appear consecutively, extracting them together as a single character rectangle is taken to be the correct extraction.

The user makes a correction that deletes the dividing line sl2, which is the boundary between the two character rectangles that were erroneously split. Specifically, the user, for example, operates the mouse to move the pointer pt1 to the dividing line sl2 shown in the upper part of FIG. 9(C), and drags the dividing line sl2 so that the pointer pt1 moves to the position shown in the lower part of FIG. 9(C) (a position outside the rectangle). As a result, the dividing line sl2 is deleted, and the run of consecutive "・" (middle dot) symbols is contained together in a single character rectangle.

As shown in the upper part of FIG. 9(D), the character rectangle extraction unit 17 has recognized the symbol "－" (hyphen) and the character "２" together as a single character, and has extracted the rectangle rc03, which is an incorrect character rectangle.

ユーザは、２つの文字が１つの文字として誤って認識されて抽出がなされた文字矩形を分割する修正を行う。具体的には、ユーザは、例えば、マウスを操作して図９（Ｄ）の上段に示す文字列領域ｓａ６の上端の位置にポインタｐｔ１を移動させ、当該ポインタｐｔ１をドラッグして図９（Ｃ）の下段に示す文字列領域ｓａ６の下端の位置に移動させる。
この操作により、区切り線ｓｌ３が生成および表示され、「－（ハイフン）」の記号と「２」の文字とが、それぞれ文字矩形ｒｃ０３ａと文字矩形ｒｃ０３ｂとに含まれるように修正される。 The user makes a correction that splits a character rectangle extracted when two characters were erroneously recognized as one character. Specifically, the user, for example, operates the mouse to move the pointer pt1 to the position of the upper end of the character string area sa6 shown in the upper part of FIG. 9D, and drags the pointer pt1 to the position of the lower end of the character string area sa6 shown in the lower part of FIG. 9D.
By this operation, the dividing line sl3 is generated and displayed, and the rectangle is corrected so that the "-" (hyphen) symbol and the character "2" are contained in the character rectangle rc03a and the character rectangle rc03b, respectively.
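The two corrections described above (deleting the dividing line sl2 to merge rectangles, and adding the dividing line sl3 to split a rectangle) can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the character rectangles of a horizontally written character string area are defined by the x-positions of vertical dividing lines. The class `StringArea`, its methods, and the pixel `tolerance` are hypothetical names that do not appear in the embodiment.

```python
from bisect import insort
from dataclasses import dataclass, field


@dataclass
class StringArea:
    """Hypothetical model: one string area split by vertical dividing lines."""
    left: int
    right: int
    dividers: list = field(default_factory=list)  # sorted x-positions

    def rectangles(self):
        """Return the (x_start, x_end) span of each character rectangle."""
        edges = [self.left, *self.dividers, self.right]
        return list(zip(edges[:-1], edges[1:]))

    def remove_divider(self, x, tolerance=2):
        """Delete the divider nearest to x, merging two rectangles.

        Returns False when no divider lies within `tolerance` pixels of x.
        """
        near = [d for d in self.dividers if abs(d - x) <= tolerance]
        if not near:
            return False
        self.dividers.remove(min(near, key=lambda d: abs(d - x)))
        return True

    def add_divider(self, x):
        """Insert a divider at x, splitting one rectangle into two."""
        if not (self.left < x < self.right) or x in self.dividers:
            return False
        insort(self.dividers, x)  # keep the list sorted
        return True


area = StringArea(left=0, right=100, dividers=[30, 60])
area.remove_divider(61)   # merge, as with the dividing line sl2
area.add_divider(80)      # split, as with the dividing line sl3
print(area.rectangles())
```

In this sketch a drag that ends outside the rectangle would be mapped to `remove_divider`, and a drag from the upper end to the lower end of the area would be mapped to `add_divider`; how pointer events are translated into these calls is left open.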

（ルビの対応付け）
本実施形態に係る文字列領域・文字矩形抽出装置１によれば、ユーザは、文字列に含まれる単語と当該単語に対応付けられたルビとの対応付けを行うことができる。
なお、ここでいうルビ（ｒｕｂｙ）とは、文字列に含まれる単語の上側に、当該文字列よりも小さいサイズの文字で記される、当該単語の振り仮名を表す文字である。
以下、文字列領域・文字矩形抽出装置１によるルビの対応付けについて、図面を参照しながら説明する。
図１０は、本発明の実施形態に係る文字列領域・文字矩形抽出装置１のルビ対応付け部１８によるルビ抽出処理の一例を示す図である。 (Mapping of ruby)
According to the character string area / character rectangle extraction device 1 according to the present embodiment, the user can associate a word included in a character string with the ruby that corresponds to the word.
The term "ruby" as used herein refers to characters that are written above a word included in a character string, in a size smaller than that of the character string, and that represent the furigana (reading) of the word.
Hereinafter, the association of ruby by the character string area / character rectangle extraction device 1 will be described with reference to the drawings.
FIG. 10 is a diagram showing an example of ruby extraction processing by the ruby mapping unit 18 of the character string region / character rectangle extraction device 1 according to the embodiment of the present invention.

図１０（Ａ）は、文字列領域抽出部１５により、「矩形」という単語を含む文字列領域と、当該単語に対応するルビである「くけい」という振り仮名を含むルビの文字列領域と、が抽出された状態を表す。また、図１０（Ａ）は、文字矩形抽出部１７により、当該文字列領域に含まれる文字と当該ルビの文字列領域に含まれる文字とに対して、それぞれ文字矩形の抽出がなされた状態を表す。 FIG. 10A shows a state in which the character string area extraction unit 15 has extracted a character string area including the word "矩形" (rectangle) and a ruby character string area including the furigana "くけい" (kukei), which is the ruby corresponding to the word. FIG. 10A also shows a state in which the character rectangle extraction unit 17 has extracted character rectangles for the characters included in the character string area and for the characters included in the ruby character string area.

ユーザは、操作入力部１２により、単語と当該単語の振り仮名を表すルビとを対応付ける操作を行う。具体的には、図１０（Ｂ）に図示するように、例えば、ユーザはマウスを操作してポインタｐｔ１を単語またはルビの近傍に移動させ、次に、当該ポインタｐｔ１をドラッグして単語およびルビを囲むように移動させる。 The user uses the operation input unit 12 to perform an operation of associating a word with the ruby representing the furigana of the word. Specifically, as illustrated in FIG. 10B, the user, for example, operates the mouse to move the pointer pt1 to the vicinity of the word or the ruby, and then drags the pointer pt1 so that it surrounds the word and the ruby.

この操作が行われることにより、図１０（Ｃ）に図示するように、単語と当該単語の振り仮名を表すルビとを囲むルビ対応付け領域ｒｂ１が生成および表示され、単語と当該単語の振り仮名を表すルビとの対応付けがなされる。
このように、単語と当該単語の振り仮名を表すルビとの対応付けがなされることによって、例えば、文字認識部２０が、単語およびルビに対して文字認識を行う場合において、単語およびルビの双方の情報を活用することができるため、文字認識の精度を高めることができる。 By performing this operation, as shown in FIG. 10C, a ruby correspondence area rb1 surrounding the word and the ruby representing the furigana of the word is generated and displayed, and the word is associated with the ruby representing its furigana.
Because the word is associated with the ruby representing its furigana in this way, when, for example, the character recognition unit 20 performs character recognition on the word and the ruby, it can use the information of both the word and the ruby, so the accuracy of character recognition can be improved.
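The association operation described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the text is written horizontally, every character rectangle fully enclosed by the dragged region belongs to the association, and the ruby characters are distinguished from the base word by their smaller height. `Rect`, `CharBox`, `associate_ruby`, and the `label` field (a placeholder for content that would only be known after character recognition) are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def height(self):
        return self.y1 - self.y0

    def contains(self, other):
        """True when `other` lies entirely inside this rectangle."""
        return (self.x0 <= other.x0 and self.y0 <= other.y0
                and other.x1 <= self.x1 and other.y1 <= self.y1)


@dataclass(frozen=True)
class CharBox:
    label: str   # placeholder label for the character in this rectangle
    rect: Rect


def associate_ruby(selection, boxes):
    """Split the boxes enclosed by `selection` into (base word, ruby).

    Assumption: boxes shorter than the midpoint between the smallest and
    largest enclosed heights are treated as ruby characters.
    """
    enclosed = [b for b in boxes if selection.contains(b.rect)]
    if not enclosed:
        return "", ""
    heights = [b.rect.height for b in enclosed]
    threshold = (min(heights) + max(heights)) / 2
    ordered = sorted(enclosed, key=lambda b: b.rect.x0)  # left to right
    base = "".join(b.label for b in ordered if b.rect.height >= threshold)
    ruby = "".join(b.label for b in ordered if b.rect.height < threshold)
    return base, ruby


boxes = [
    CharBox("矩", Rect(10, 20, 30, 40)), CharBox("形", Rect(30, 20, 50, 40)),
    CharBox("く", Rect(10, 10, 23, 19)), CharBox("け", Rect(23, 10, 36, 19)),
    CharBox("い", Rect(36, 10, 49, 19)), CharBox("抽", Rect(60, 20, 80, 40)),
]
print(associate_ruby(Rect(5, 5, 55, 45), boxes))
```

The returned pair plays the role of the ruby correspondence area rb1: a later recognition stage could, for example, check a candidate reading of the base word against the recognized ruby.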

（文字列領域の管理）
本実施形態に係る文字列領域・文字矩形抽出装置１によれば、ユーザは、管理画面生成部１９によって生成される文字列領域管理画面によって、文字列領域抽出部１５によって抽出された文字列領域の結合や分割、および文字列領域抽出部１５によって抽出された文字列領域の順番の入れ替えなどの管理を行うことができる。
図１１乃至図１４は、本発明の実施形態に係る文字列領域・文字矩形抽出装置１の表示部１３によって表示される文字列領域管理画面の一例を示す図である。 (Management of character string area)
According to the character string area / character rectangle extraction device 1 according to the present embodiment, the user can use the character string area management screen generated by the management screen generation unit 19 to manage the character string areas extracted by the character string area extraction unit 15, for example by combining or dividing them or by changing their order.
FIGS. 11 to 14 are diagrams showing an example of a character string area management screen displayed by the display unit 13 of the character string area / character rectangle extraction device 1 according to the embodiment of the present invention.

図１１に示すように、文字列領域管理画面ｍｓ１は、２つの画面が左右に並べて配置された画面である。左側の画面には、文字列領域抽出画面と同様に、画像データ取得部１１によって取得された画像データが示す画像が表示される。なお、図１１は、ユーザの操作により、行ごとに始点と終点とが設定され、行単位での文字列領域の抽出がなされた状態の画面を表す。 As shown in FIG. 11, the character string area management screen ms1 is a screen in which two screens are arranged side by side. Similar to the character string area extraction screen, the screen on the left side displays the image indicated by the image data acquired by the image data acquisition unit 11. Note that FIG. 11 shows a screen in which the start point and the end point are set for each line by the user's operation and the character string area is extracted for each line.

抽出された文字列領域には、抽出された順に、文字列領域番号が割り当てられ、文字列領域管理画面ｍｓ１に表示される。図１１に図示するように、例えば、文字列領域ｓａ１１には「１」番を表す文字列領域番号ｎ１が割り当てられ、また、文字列領域ｓａ１２には「２」番を表す文字列領域番号ｎ２が割り振られている。 Character string area numbers are assigned to the extracted character string areas in the order of extraction and are displayed on the character string area management screen ms1. As shown in FIG. 11, for example, the character string area sa11 is assigned the character string area number n1, which represents number "1", and the character string area sa12 is assigned the character string area number n2, which represents number "2".

文字列領域管理画面ｍｓ１を構成する２つの画面のうち、右側の画面には、左側の画面においてそれぞれの文字列領域に対して割り当てられた文字列領域番号のリストである文字列領域番号リストが表示される。図１１に図示するように、例えば、左側の画面の文字列領域ｓａ１１に割り当てられた文字列領域番号「１」を表すリスト要素ｌｎ１や、左側の画面の文字列領域ｓａ１２に割り当てられた文字列領域番号「２」を表すリスト要素ｌｎ２などが、文字列領域番号リストに順に表示されている。 Of the two screens that make up the character string area management screen ms1, the right screen displays a character string area number list, which is a list of the character string area numbers assigned to the respective character string areas on the left screen. As shown in FIG. 11, for example, the list element ln1 representing the character string area number "1" assigned to the character string area sa11 on the left screen, the list element ln2 representing the character string area number "2" assigned to the character string area sa12 on the left screen, and so on are displayed in order in the character string area number list.

上述したように、文字列領域番号は、文字列領域が抽出された順に、当該文字列領域に対して割り当てられる。また、文字列領域番号リストにおけるリスト要素の並び順は、文字列領域・文字矩形抽出装置１が（文面の）内容的な文字列の並び順として認識している順番を表す。 As described above, the character string area numbers are assigned to the character string areas in the order in which the character string areas are extracted. Further, the order of the list elements in the character string area number list represents the order recognized as the order of the content character strings (of the text) by the character string area / character rectangle extraction device 1.

すなわち、図１１に示す例では、右側の画面に表示された文字列領域番号リストにおいて、上から順に、文字列領域番号ｎ１を表すリスト要素ｌｎ１、および文字列領域番号ｎ２を表すリスト要素ｌｎ２が表示されていることから、文字列領域・文字矩形抽出装置１は、内容的に、文字列領域番号ｎ１に対応する文字列領域ｓａ１１に含まれる文字列である「３．光学文字認識」の次に、文字列領域番号ｎ２に対応する文字列領域ｓａ１２に含まれる文字列である「光学文字認識（ＯＣＲ）においては、文書中の」が続いているものと認識している。 That is, in the example shown in FIG. 11, the character string area number list displayed on the right screen shows, in order from the top, the list element ln1 representing the character string area number n1 and the list element ln2 representing the character string area number n2. The character string area / character rectangle extraction device 1 therefore recognizes that, in terms of content, the character string "3. Optical character recognition" included in the character string area sa11 corresponding to the character string area number n1 is followed by the partial line "In optical character recognition (OCR), in a document" included in the character string area sa12 corresponding to the character string area number n2.

なお、図１１に示す文字列領域管理画面ｍｓ１は、ユーザの操作により、行ごとに始点と終点とが設定され、行単位での文字列領域が抽出された直後の画面を表したものであるため、文字列領域番号リストに表示されているリスト要素の並び順は、文字列領域番号の並び順と同一になっている。すなわち、初期状態（デフォルト状態）の文字列領域番号リストでは、リスト要素が文字列領域番号の順に並べられて表示される。 The character string area management screen ms1 shown in FIG. 11 represents a screen immediately after the start point and the end point are set for each line by the user's operation and the character string area is extracted for each line. Therefore, the order of the list elements displayed in the character string area number list is the same as the order of the character string area numbers. That is, in the character string area number list in the initial state (default state), the list elements are displayed in the order of the character string area number.

上述したように、文字列領域管理画面において、ユーザは、操作入力部１２による操作により、文字列領域抽出部１５によって抽出された文字列領域の結合や分割などの管理を行うことができる。 As described above, on the character string area management screen, the user can manage the combination and division of the character string area extracted by the character string area extraction unit 15 by the operation by the operation input unit 12.

図１２に示す文字列領域管理画面ｍｓ２は、ユーザにより、文字列領域の結合の操作がなされた後の時点における文字列領域管理画面の状態の一例である。図示するように、図１１における文字列領域ｓａ１２、文字列領域ｓａ１３、文字列領域ｓａ１４、・・・、および文字列領域ｓａ１９の８つの文字列領域は、図１２においては、結合されて１つの文字列領域ｓａ２１になっている。また、図示するように、図１１における文字列領域ｓａ２０、および文字列領域ｓａ２１の２つの文字列領域は、図１２においては、結合されて１つの文字列領域ｓａ２３になっている。 The character string area management screen ms2 shown in FIG. 12 is an example of the state of the character string area management screen after the user has performed an operation of combining character string areas. As shown in the figure, the eight character string areas in FIG. 11, namely the character string areas sa12, sa13, sa14, ..., and sa19, are combined into one character string area sa21 in FIG. 12. Likewise, the two character string areas sa20 and sa21 in FIG. 11 are combined into one character string area sa23 in FIG. 12.

また、図示するように、上記のように文字列領域が結合されたことにより、図１１における「２」番を表す文字列領域番号ｎ２、「３」番を表す文字列領域番号ｎ３、「４」番を表す文字列領域番号ｎ４、・・・、および「９」番を表す文字列領域番号ｎ９は、図１２においては、それぞれ「２－１」番を表す文字列領域番号ｎ２１、「２－２」番を表す文字列領域番号ｎ２２、「２－３」番を表す文字列領域番号ｎ２３、・・・、および「２－８」番を表す文字列領域番号ｎ２８に付け替えがなされている。また、図示するように、上記のように文字列領域が結合されたことにより、図１１における「１０」番を表す文字列領域番号ｎ１０、および「１１」番を表す文字列領域番号ｎ１１は、図１２においては、それぞれ「３－１」番を表す文字列領域番号ｎ３１、および「３－２」番を表す文字列領域番号ｎ３２に付け替えがなされている。 Further, as shown in the figure, because the character string areas have been combined as described above, the character string area number n2 representing number "2", the character string area number n3 representing number "3", the character string area number n4 representing number "4", ..., and the character string area number n9 representing number "9" in FIG. 11 are renumbered in FIG. 12 as the character string area number n21 representing number "2-1", the character string area number n22 representing number "2-2", the character string area number n23 representing number "2-3", ..., and the character string area number n28 representing number "2-8", respectively. Similarly, the character string area number n10 representing number "10" and the character string area number n11 representing number "11" in FIG. 11 are renumbered in FIG. 12 as the character string area number n31 representing number "3-1" and the character string area number n32 representing number "3-2", respectively.

また、図示するように、上記のように文字列領域番号の付け替えがなされたことにより、例えば、図１１における「２」番を表すリスト要素ｌｎ２、「３」番を表すリスト要素ｌｎ３、および「４」番を表すリスト要素ｌｎ４は、図１２においては、「２－１」番を表すリスト要素ｌｎ２１、「２－２」番を表すリスト要素ｌｎ２２、および「２－３」番を表すリスト要素ｌｎ２３に変更がなされている。また、図示するように、上記のように文字列領域番号の付け替えがなされたことにより、図１１における「１０」番を表すリスト要素ｌｎ１０、および「１１」番を表すリスト要素ｌｎ１１は、図１２においては、「３－１」番を表すリスト要素ｌｎ３１、および「３－２」番を表すリスト要素ｌｎ３２に変更がなされている。 Further, as shown in the figure, because the character string area numbers have been renumbered as described above, the list element ln2 representing number "2", the list element ln3 representing number "3", and the list element ln4 representing number "4" in FIG. 11, for example, are changed in FIG. 12 to the list element ln21 representing number "2-1", the list element ln22 representing number "2-2", and the list element ln23 representing number "2-3". Similarly, the list element ln10 representing number "10" and the list element ln11 representing number "11" in FIG. 11 are changed in FIG. 12 to the list element ln31 representing number "3-1" and the list element ln32 representing number "3-2".
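The renumbering described for FIGS. 11 and 12 can be reproduced with a short Python sketch. It assumes, for illustration only, that each character string area is held as a group of the line-level areas it was built from, that a group with a single line is labeled with its position ("1"), and that a combined group is labeled per line as "position-line" ("2-1", "2-2", and so on). The functions `merge_groups` and `labels` are hypothetical names.

```python
def merge_groups(groups, first, last):
    """Combine the consecutive groups first..last (0-based, inclusive)."""
    merged = [line for group in groups[first:last + 1] for line in group]
    return groups[:first] + [merged] + groups[last + 1:]


def labels(groups):
    """Return the character string area numbers shown in the list."""
    result = []
    for position, group in enumerate(groups, start=1):
        if len(group) == 1:
            result.append(str(position))
        else:
            result.extend(f"{position}-{k}" for k in range(1, len(group) + 1))
    return result


# FIG. 11: eleven line-level areas, numbered "1" to "11".
groups = [[f"line{n}"] for n in range(1, 12)]
groups = merge_groups(groups, 1, 8)   # lines 2..9 become one area
groups = merge_groups(groups, 2, 3)   # lines 10..11 become one area
print(labels(groups))
```

Under these assumptions the printed list starts with "1", continues with "2-1" through "2-8", and ends with "3-1" and "3-2", matching the list elements described for FIG. 12.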

上述したように、文字列領域管理画面において、ユーザは、操作入力部１２による操作により、文字列領域抽出部１５によって抽出された文字列領域の順番の入れ替えの管理を行うことができる。具体的には、ユーザは、順番を入れ替えたい文字列領域の位置へポインタｐｔ１を移動させ、その位置から文字列領域番号リストにおける所望の移動先の位置へ当該ポインタｐｔ１をドラッグさせることにより、文字列領域の順番の入れ替えがなされる。 As described above, on the character string area management screen, the user can manage the order change of the character string area extracted by the character string area extraction unit 15 by the operation by the operation input unit 12. Specifically, the user moves the pointer pt1 to the position of the character string area whose order is to be changed, and drags the pointer pt1 from that position to the desired destination position in the character string area number list to display the character. The order of the column areas is changed.

例えば図１３に示すように、ユーザは、操作入力部１２による操作により、ポインタｐｔ１を文字列領域ｓａ１１の位置へ移動させてマウスボタンをクリックし、そのまま文字列領域番号リストにおける所望の位置（図１３に示す例においては、文字列領域番号リストの末尾の位置）までポインタｐｔ１をドラッグする。 For example, as shown in FIG. 13, the user, by operating the operation input unit 12, moves the pointer pt1 to the position of the character string area sa11, presses the mouse button, and, keeping it pressed, drags the pointer pt1 to the desired position in the character string area number list (in the example shown in FIG. 13, the position at the end of the character string area number list).

図１４に示す文字列領域管理画面ｍｓ３は、ユーザにより、文字列領域の順番の入れ替えの操作がなされた後の時点における文字列領域管理画面の状態の一例である。図示するように、図１３に示す文字列領域番号リストにおいて先頭に（最上段に）表示されている「１」番を示すリスト要素ｌｎ１は、図１３に示す文字列領域番号リストにおいては末尾に（リスト要素ｌｎ３２の下の位置である最下段に）表示され、その他の全てのリスト要素はそれぞれ１段上に繰り上げられて表示されている。 The character string area management screen ms3 shown in FIG. 14 is an example of the state of the character string area management screen after the user has performed the operation of changing the order of the character string areas. As shown in the figure, the list element ln1 indicating number "1", which is displayed at the beginning (in the top row) of the character string area number list shown in FIG. 13, is displayed at the end (in the bottom row, below the list element ln32) of the character string area number list shown in FIG. 14, and every other list element is displayed one row higher.

上記のように文字列領域番号リストにおけるリスト要素の並び替えがなされると、文字列領域・文字矩形抽出装置１は、並び替えがなされた後のリスト要素（に対応する文字列領域に含まれる文字列）の順番が、内容的な文字列の順番であると認識する。すなわち、図１４に示す例においては、文字列領域・文字矩形抽出装置１は、内容的には、文字列領域番号「３－２」に対応する「・ユーザの操作に基づいて手動で文字列領域抽出・・・・・・・・・３-２章」という文字列の後に、文字列領域番号「１」に対応する「３．光学文字認識」という文字列が続いているものと認識する。 When the list elements in the character string area number list are rearranged as described above, the character string area / character rectangle extraction device 1 recognizes the order of the rearranged list elements (that is, of the character strings included in the corresponding character string areas) as the order of the character strings in terms of content. That is, in the example shown in FIG. 14, the character string area / character rectangle extraction device 1 recognizes that, in terms of content, the character string "・Manual character string area extraction based on the user's operation ......... Chapter 3-2" corresponding to the character string area number "3-2" is followed by the character string "3. Optical character recognition" corresponding to the character string area number "1".
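The reordering described above amounts to moving one element of an ordered list and then reading the text in the new order. A minimal Python sketch follows; `move_element`, `reading_order_text`, and the shortened sample strings are hypothetical and serve only to illustrate the idea.

```python
def move_element(order, label, new_index):
    """Return a new list with `label` moved to position `new_index`."""
    items = list(order)
    items.remove(label)
    items.insert(new_index, label)
    return items


def reading_order_text(order, text_by_label):
    """Concatenate the area texts in the order the list currently shows."""
    return "\n".join(text_by_label[label] for label in order)


order = ["1", "2-1", "2-2", "3-1", "3-2"]
text_by_label = {
    "1": "3. Optical character recognition",
    "2-1": "(first line of the combined area 2)",
    "2-2": "(second line of the combined area 2)",
    "3-1": "(first line of the combined area 3)",
    "3-2": "(second line of the combined area 3)",
}

# Drag the element "1" to the end of the list, as in FIGS. 13 and 14.
order = move_element(order, "1", len(order) - 1)
print(order)
print(reading_order_text(order, text_by_label))
```

Keeping the display order separate from the area numbers means that the numbers assigned at extraction time stay stable while the content order is edited.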

（文字列領域・文字矩形抽出装置の動作）
以下、実施形態に係る文字列領域・文字矩形抽出装置１の動作について、図面を参照しながら説明する。
図１５は、本発明の実施形態に係る文字列領域・文字矩形抽出装置１の動作を示すフローチャートである。本フローチャートは、文字列を含む画像を示す画像データ（例えば、文書をスキャナ等によって読み取ることにより生成される画像データ）が、画像データ取得部１１に入力される際に開始する。 (Operation of character string area / character rectangle extractor)
Hereinafter, the operation of the character string area / character rectangle extraction device 1 according to the embodiment will be described with reference to the drawings.
FIG. 15 is a flowchart showing the operation of the character string area / character rectangle extraction device 1 according to the embodiment of the present invention. This flowchart starts when image data indicating an image including a character string (for example, image data generated by reading a document with a scanner or the like) is input to the image data acquisition unit 11.

（ステップｓｔ００１）文字列領域・文字矩形抽出装置１の画像データ取得部１１は、文字列を含む画像を示す画像データ（例えば、文書をスキャナ等によって読み取ることにより生成される画像データ）を取得する。その後、ステップｓｔ００２へ進む。 (Step st001) The image data acquisition unit 11 of the character string area / character rectangle extraction device 1 acquires image data indicating an image including a character string (for example, image data generated by reading a document with a scanner or the like). After that, the process proceeds to step st002.

（ステップｓｔ００２）文字列領域・文字矩形抽出装置１の制御部１０は、画像データ取得部１１によって取得された画像データを記憶部１４に記憶させる。そして、制御部１０は、記憶部１４に記憶された当該画像データに対して、各種の事前処理を行う。ここでいう事前処理とは、画像データ取得部１１によって取得された画像データに基づく画像の傾きを補正する処理や、当該画像の色分解を行う処理などである。すなわち、当該事前処理は、例えば、文字列領域の抽出、文字矩形の抽出、および文字認識の処理を容易にするため、あるいは文字列領域の抽出、文字矩形の抽出、および文字認識の精度を高めるために行われる処理である。その後、ステップｓｔ００３へ進む。 (Step st002) The control unit 10 of the character string area / character rectangle extraction device 1 stores the image data acquired by the image data acquisition unit 11 in the storage unit 14. Then, the control unit 10 performs various preprocessing on the image data stored in the storage unit 14. The pre-processing referred to here is a process of correcting the inclination of an image based on the image data acquired by the image data acquisition unit 11, a process of performing color separation of the image, and the like. That is, the preprocessing facilitates, for example, the processing of character string area extraction, character rectangle extraction, and character recognition, or enhances the accuracy of character string area extraction, character rectangle extraction, and character recognition. It is a process performed for the purpose. Then, the process proceeds to step st003.

なお、文字の傾きを補正する処理方法としては、例えば、特許文献５に記載されているような、様々なずらし角度ごとに輪郭画像から重み付きヒストグラムを作成し、最適な経路を探索することによって文字の傾きを補正する処理方法などが知られている。また、色分解を行って文字列領域を抽出する方法としては、例えば、特許文献６に記載されているような、エッジ画像データと色画像データとを用いて文字列領域を抽出する方法などが知られている。 As a processing method for correcting the inclination of characters, for example, a weighted histogram is created from a contour image for each of various shift angles as described in Patent Document 5, and an optimum path is searched for. A processing method for correcting the inclination of characters is known. Further, as a method of extracting a character string area by performing color separation, for example, a method of extracting a character string area using edge image data and color image data as described in Patent Document 6 is used. Are known.

（ステップｓｔ００３）文字列領域・文字矩形抽出装置１の文字列領域抽出部１５は、操作入力部１２から入力される操作入力信号（補助情報）に基づいて、画像データ取得部１１が取得した画像データに基づく画像に含まれる文字列からなる行の始点と終点とを特定し、特定された行の始点と終点とに基づいて当該行に含まれる文字列の表示対象範囲を示す文字列領域を抽出する。その後、ステップｓｔ００４へ進む。 (Step st003) Based on the operation input signal (auxiliary information) input from the operation input unit 12, the character string area extraction unit 15 of the character string area / character rectangle extraction device 1 specifies the start point and the end point of a line consisting of a character string included in the image based on the image data acquired by the image data acquisition unit 11, and, based on the specified start point and end point, extracts a character string area indicating the display target range of the character string included in the line. Then, the process proceeds to step st004.

（ステップｓｔ００４）文字列領域・文字矩形抽出装置１の文字列領域結合部１６は、文字列領域抽出部１５によって抽出された文字列領域と、当該文字列領域と隣接する他の文字列領域（例えば、当該文字列領域が抽出される前に抽出された隣接する他の文字列領域）と、を結合するか否かの判定を行う。その後、ステップｓｔ００５へ進む。 (Step st004) The character string area joining unit 16 of the character string area / character rectangle extraction device 1 determines whether or not to combine the character string area extracted by the character string area extraction unit 15 with another character string area adjacent to it (for example, an adjacent character string area that was extracted before the character string area in question). Then, the process proceeds to step st005.

（ステップｓｔ００５）文字列領域結合部１６が、文字列領域抽出部１５によって抽出された文字列領域と、当該文字列領域と隣接する他の文字列領域と（例えば、当該文字列領域が抽出される前に抽出された隣接する他の文字列領域）、を結合すると判定した場合には、ステップｓｔ００６へ進む。そうでない場合（すなわち、結合しないと判定された場合）には、ステップｓｔ００７へ進む。 (Step st005) If the character string area joining unit 16 determines that the character string area extracted by the character string area extraction unit 15 is to be combined with another character string area adjacent to it (for example, an adjacent character string area extracted earlier), the process proceeds to step st006. Otherwise (that is, if it is determined that the areas are not to be combined), the process proceeds to step st007.

（ステップｓｔ００６）文字列領域結合部１６が、文字列領域抽出部１５によって抽出された文字列領域と、当該文字列領域と隣接する他の文字列領域と（例えば、当該文字列領域が抽出される前に抽出された隣接する他の文字列領域）、を結合する。その後、ステップｓｔ００７へ進む。 (Step st006) The character string area joining unit 16 combines the character string area extracted by the character string area extraction unit 15 with the other character string area adjacent to it (for example, an adjacent character string area extracted earlier). Then, the process proceeds to step st007.

（ステップｓｔ００７）文字列領域結合部１６は、抽出された文字列領域を示す画像を表示部１３に表示させる。そして、ユーザは、表示部１３に表示された画像を確認し、文字列領域が誤りなく抽出されているか否かを確認する。文字列領域が誤りなく抽出されている場合には、ステップｓｔ００９へ進む。そうでない場合、すなわち、文字列領域が誤って抽出されている箇所が存在する場合には、ステップｓｔ００８へ進む。 (Step st007) The character string area joining unit 16 causes the display unit 13 to display an image showing the extracted character string area. Then, the user confirms the image displayed on the display unit 13 and confirms whether or not the character string area is extracted without error. If the character string area is extracted without error, the process proceeds to step st009. If this is not the case, that is, if there is a place where the character string area is erroneously extracted, the process proceeds to step st008.

（ステップｓｔ００８）文字列領域抽出部１５は、操作入力部１２から入力される、ユーザによるポインタの操作に基づく操作入力信号（補助情報）に基づいて、抽出された文字列領域を修正する。なお、ユーザは、表示部１３に表示された文字列領域管理画面などを参照しながら、操作入力部１２（例えば、マウス）により文字列領域の修正のための操作を行う。その後、ステップｓｔ００９へ進む。 (Step st008) The character string area extraction unit 15 modifies the extracted character string area based on the operation input signal (auxiliary information) input from the operation input unit 12 based on the operation of the pointer by the user. The user performs an operation for correcting the character string area by the operation input unit 12 (for example, a mouse) while referring to the character string area management screen displayed on the display unit 13. Then, the process proceeds to step st009.

（ステップｓｔ００９）画像データ取得部１１によって取得された画像データに基づく画像に含まれる全ての文字列領域の抽出が完了した場合には、ステップｓｔ０１０へ進む。
そうでない場合は、ステップｓｔ００３へ戻る。 (Step st009) When the extraction of all the character string regions included in the image based on the image data acquired by the image data acquisition unit 11 is completed, the process proceeds to step st010.
If not, the process returns to step st003.

（ステップｓｔ０１０）文字列領域・文字矩形抽出装置１の文字矩形抽出部１７は、文字列領域抽出部１５によって抽出された文字列領域に含まれる文字列を構成するそれぞれの文字の矩形を表す文字矩形を抽出する。
なお、例えば、文字矩形抽出部１７は、複数の文字切り出し位置の候補の中から、文字の形状情報や、文字認識における認識確度などから算出される評価値に基づいて、適切な文字切り出し位置を特定することにより、文字矩形を抽出する。その後、ステップｓｔ０１１へ進む。 (Step st010) The character rectangle extraction unit 17 of the character string area / character rectangle extraction device 1 extracts character rectangles, each representing the rectangle of one of the characters constituting the character string included in the character string area extracted by the character string area extraction unit 15.
For example, the character rectangle extraction unit 17 extracts the character rectangles by selecting appropriate character cutout positions from among a plurality of candidate cutout positions, based on an evaluation value calculated from character shape information, recognition confidence in character recognition, and the like. After that, the process proceeds to step st011.

（ステップｓｔ０１１）文字矩形抽出部１７は、抽出された文字矩形を示す画像を表示部１３に表示させる。そして、ユーザは、表示部１３に表示された画像を確認し、文字矩形が誤りなく抽出されているか否かを確認する。文字矩形が誤りなく抽出されている場合には、ステップｓｔ０１３へ進む。そうでない場合、すなわち、文字矩形が誤って抽出されている箇所が存在する場合には、ステップｓｔ０１２へ進む。 (Step st011) The character rectangle extraction unit 17 causes the display unit 13 to display an image showing the extracted character rectangle. Then, the user confirms the image displayed on the display unit 13 and confirms whether or not the character rectangle is extracted without error. If the character rectangle is extracted without error, the process proceeds to step st013. If this is not the case, that is, if there is a place where the character rectangle is erroneously extracted, the process proceeds to step st012.

（ステップｓｔ０１２）文字矩形抽出部１７は、操作入力部１２から入力される、ユーザによるポインタの操作に基づく操作入力信号（補助情報）に基づいて、抽出された文字矩形を修正する。なお、ユーザは、表示部１３に表示された文字矩形管理画面（図示せず）などを参照しながら、操作入力部１２（例えば、マウス）により文字矩形の修正のための操作を行う。その後、ステップｓｔ００９へ進む。 (Step st012) The character rectangle extraction unit 17 corrects the extracted character rectangle based on the operation input signal (auxiliary information) input from the operation input unit 12 based on the operation of the pointer by the user. The user performs an operation for correcting the character rectangle by the operation input unit 12 (for example, a mouse) while referring to the character rectangle management screen (not shown) displayed on the display unit 13. Then, the process proceeds to step st009.

（ステップｓｔ０１３）文字列領域・文字矩形抽出装置１の文字認識部２０は、文字矩形抽出部１７によって抽出されたそれぞれの文字矩形に含まれる文字を示す画像に対して光学文字認識（ＯＣＲ）を行い、文字を認識する。具体的には、文字認識部２０は、（例えば、スキャナによって読み取られた）文書に含まれる文字を示す画像データを解析し、コンピュータにより編集可能なデータ形式（例えば、文字コードの列）に変換する。
以上で、本フローチャートに示される処理が終了する。 (Step st013) The character recognition unit 20 of the character string area / character rectangle extraction device 1 performs optical character recognition (OCR) on the image showing the character contained in each character rectangle extracted by the character rectangle extraction unit 17, and thereby recognizes the characters. Specifically, the character recognition unit 20 analyzes image data indicating characters contained in a document (for example, one read by a scanner) and converts it into a data format that can be edited by a computer (for example, a sequence of character codes).
This completes the process shown in this flowchart.
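The transitions between steps st001 to st013 described above can be summarized as a small transition table. The Python sketch below encodes only the "proceeds to" relations as they are stated in the text (including the transition from step st012 to step st009); the names `UNCONDITIONAL`, `BRANCHES`, `next_step`, and `trace` are hypothetical, and the yes/no answers stand in for the device's determination at st005 and the user's confirmations at st007, st009, and st011.

```python
from typing import Optional

# Steps with a single successor (None marks the end of the flowchart).
UNCONDITIONAL = {
    "st001": "st002", "st002": "st003", "st003": "st004", "st004": "st005",
    "st006": "st007", "st008": "st009", "st010": "st011",
    "st012": "st009",  # as stated in the description of step st012
    "st013": None,
}

# Decision steps: (next step if yes, next step if no).
BRANCHES = {
    "st005": ("st006", "st007"),  # combine with an adjacent area?
    "st007": ("st009", "st008"),  # string areas extracted without error?
    "st009": ("st010", "st003"),  # all string areas extracted?
    "st011": ("st013", "st012"),  # character rectangles without error?
}


def next_step(step: str, answer: Optional[bool] = None) -> Optional[str]:
    """Return the step that follows `step`, given the decision outcome."""
    if step in BRANCHES:
        if answer is None:
            raise ValueError(f"{step} is a decision step and needs an answer")
        yes, no = BRANCHES[step]
        return yes if answer else no
    return UNCONDITIONAL[step]


def trace(answers):
    """Walk from st001 to the end, consuming one answer per decision step."""
    pending = iter(answers)
    step, path = "st001", []
    while step is not None:
        path.append(step)
        answer = next(pending) if step in BRANCHES else None
        step = next_step(step, answer)
    return path


# No combining needed and no corrections required anywhere.
print(trace([False, True, True, True]))
```

Keeping the flow in a table makes it easy to compare against the flowchart of FIG. 15 step by step, which is not reproduced here.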

以上、説明したように、本実施形態に係る文字列領域・文字矩形抽出装置１は、文字列を含む画像を示す画像データを取得する画像データ取得部１１と、取得された画像データに基づく画像を表示する表示部１３と、ユーザからの操作入力を受け付ける操作入力部１２と、操作入力に基づく補助情報に基づいて当該画像に含まれる文字列からなる行の始点と終点とを特定し特定された行の始点と終点とに基づいて行に含まれる文字列の表示対象範囲を示す文字列領域を抽出する文字列領域抽出部１５と、抽出された文字列領域と当該文字列領域と隣接する他の文字列領域とを結合する文字列領域結合部１６と、を備える。 As described above, the character string area / character rectangle extraction device 1 according to the present embodiment includes: an image data acquisition unit 11 that acquires image data indicating an image including a character string; a display unit 13 that displays an image based on the acquired image data; an operation input unit 12 that accepts operation input from the user; a character string area extraction unit 15 that, based on auxiliary information derived from the operation input, specifies the start point and the end point of a line consisting of a character string included in the image and, based on the specified start point and end point, extracts a character string area indicating the display target range of the character string included in the line; and a character string area joining unit 16 that combines the extracted character string area with another character string area adjacent to it.

以上により、本発明の実施形態に係る文字列領域・文字矩形抽出装置１は、ユーザの操作に基づく補助情報を用いて、文字列領域の抽出および文字矩形の抽出の精度を高めることができる。 As described above, the character string area / character rectangle extraction device 1 according to the embodiment of the present invention can improve the accuracy of the character string area extraction and the character rectangle extraction by using the auxiliary information based on the user's operation.

＜実施形態の変形例１＞
上述した実施形態に係る文字列領域・文字矩形抽出装置１においては、ユーザの操作に基づく補助情報などにしたがって文字列領域の結合の処理が行われる。しかしながら、文字認識部２０による文字認識の結果を示す情報も用いて文字列領域の結合の処理が行われるような構成であってもよい。
以下に説明する実施形態の変形例１に係る文字列領域・文字矩形抽出装置２においては、文字認識部２０による文字認識の結果を示す情報も用いて文字列領域の結合の処理が行われる。 <Modification 1 of the embodiment>
In the character string area / character rectangle extraction device 1 according to the above-described embodiment, the processing of combining the character string areas is performed according to the auxiliary information based on the user's operation. However, the configuration may be such that the processing of combining the character string regions is performed using the information indicating the result of character recognition by the character recognition unit 20.
In the character string area / character rectangle extraction device 2 according to the first modification of the embodiment described below, the processing of combining the character string areas is performed using the information indicating the result of character recognition by the character recognition unit 20.

（文字列領域・文字矩形抽出装置の構成）
以下、実施形態の変形例１に係る文字列領域・文字矩形抽出装置２の構成について、図面を参照しながら説明する。
図１６は、本発明の実施形態の変形例１に係る文字列領域・文字矩形抽出装置２の機能構成を示すブロック図である。
図示するように、実施形態の変形例１に係る文字列領域・文字矩形抽出装置２は、制御部１０と、画像データ取得部１１と、操作入力部１２と、表示部１３と、記憶部１４と、文字列領域抽出部１５と、文字列領域結合部１６と、文字矩形抽出部１７と、ルビ対応付け部１８と、管理画面生成部１９と、文字認識部２０と、と含んで構成される。
また、文字列領域結合部１６は、言語解析部２６を含んで構成される。 (Structure of character string area / character rectangle extractor)
Hereinafter, the configuration of the character string area / character rectangle extraction device 2 according to the modification 1 of the embodiment will be described with reference to the drawings.
FIG. 16 is a block diagram showing a functional configuration of the character string area / character rectangle extraction device 2 according to the first modification of the embodiment of the present invention.
As shown in the figure, the character string area / character rectangle extraction device 2 according to Modification 1 of the embodiment includes a control unit 10, an image data acquisition unit 11, an operation input unit 12, a display unit 13, a storage unit 14, a character string area extraction unit 15, a character string area joining unit 16, a character rectangle extraction unit 17, a ruby mapping unit 18, a management screen generation unit 19, and a character recognition unit 20.
Further, the character string area joining unit 16 includes a language analysis unit 26.

制御部１０は、文字列領域・文字矩形抽出装置２における各種の処理を制御する。制御部１０は、例えば、ＣＰＵを含んで構成される。 The control unit 10 controls various processes in the character string area / character rectangle extraction device 2. The control unit 10 includes, for example, a CPU.

なお、本実施形態の変形例１においては、文字列領域・文字矩形抽出装置１が操作入力部１２を備えるものとしたが、これに限られない。例えば、操作入力部１２が外部の装置に備えられ、文字列領域・文字矩形抽出装置１が、ユーザによるポインタ操作を示す信号を当該外部の装置から取得するようにしてもよい。 In the first modification of the present embodiment, the character string area / character rectangle extraction device 1 is provided with the operation input unit 12, but the present invention is not limited to this. For example, the operation input unit 12 may be provided in an external device, and the character string area / character rectangle extraction device 1 may acquire a signal indicating a pointer operation by the user from the external device.

表示部１３は、画像データ取得部１１によって取得された画像データに基づく画像を表示する。表示部１３は、ディスプレイ、例えば、液晶ディスプレイ（ＬＣＤ）、有機ＥＬディスプレイ、またはＣＲＴ等を含んで構成される。 The display unit 13 displays an image based on the image data acquired by the image data acquisition unit 11. The display unit 13 includes a display, for example, a liquid crystal display (LCD), an organic EL display, a CRT, or the like.

なお、本実施形態の変形例１においては、文字列領域・文字矩形抽出装置１が表示部１３を備えるものとしたが、これに限られない。例えば、表示部１３が外部の装置に備えられ、文字列領域・文字矩形抽出装置１が、表示させる画像を示す画像データを当該外部の装置へ送信するようにしてもよい。 In the first modification of the present embodiment, the character string area / character rectangle extraction device 1 is provided with the display unit 13, but the present invention is not limited to this. For example, the display unit 13 may be provided in an external device, and the character string area / character rectangle extraction device 1 may transmit image data indicating an image to be displayed to the external device.

記憶部１４は、文字列領域・文字矩形抽出装置１において用いられる各種のコンピュータプログラムやデータを記憶する。また、記憶部１４は、文字列領域・文字矩形抽出装置１における各種の演算処理等において用いられる一時的な記憶領域としての機能も有する。記憶部１４は、記憶媒体、例えば、ＨＤＤ、フラッシュメモリ、ＥＥＰＲＯＭ、ＲＡＭ、ＲＯＭ、又はそれらの任意の組み合わせを含んで構成される。 The storage unit 14 stores various computer programs and data used in the character string area / character rectangle extraction device 1. Further, the storage unit 14 also has a function as a temporary storage area used in various arithmetic processes and the like in the character string area / character rectangle extraction device 1. The storage unit 14 includes a storage medium, for example, an HDD, a flash memory, an EEPROM, a RAM, a ROM, or any combination thereof.

文字列領域抽出部１５は、操作入力部１２から入力される操作入力信号（補助情報）に基づいて、画像データ取得部１１が取得した画像データに基づく画像に含まれる文字列からなる行の始点と終点とを特定し、特定された行の始点と終点とに基づいて当該行に含まれる文字列の表示対象範囲を示す文字列領域を抽出する。 Based on the operation input signal (auxiliary information) input from the operation input unit 12, the character string area extraction unit 15 identifies the start point and the end point of a line consisting of a character string included in the image based on the image data acquired by the image data acquisition unit 11. Based on the identified start point and end point, it then extracts a character string area indicating the display target range of the character string included in that line.

また、上述したように、文字列領域結合部１６は言語解析部２６を備える。
言語解析部２６は、文字認識部２０による文字認識の結果を示す情報を解析する。そして、言語解析部２６は、解析された結果を示す情報に基づいて、文字列領域を結合するか否かを判定する。具体的には、例えば、言語解析部２６は、ある２つの文字列領域にそれぞれ含まれる文字に対しての文字認識部２０による文字認識の結果を示す情報と、文字の生起確率における統計的な情報（例えば、ある文字列の中でＮ個の文字列または単語の組み合わせが、どの程度出現するかを調査する言語モデルであるＮグラムモデルなど）とに基づいて、文字列領域を結合するか否か（すなわち、当該２つの文字列領域が、同一の文字列領域であるか否か）を判定する。 Further, as described above, the character string region joining unit 16 includes a language analysis unit 26.
The language analysis unit 26 analyzes information indicating the result of character recognition by the character recognition unit 20. Then, the language analysis unit 26 determines whether or not to combine character string areas based on information indicating the analysis result. Specifically, for example, the language analysis unit 26 determines whether or not to combine two character string areas (that is, whether or not the two character string areas are the same character string area) based on information indicating the result of character recognition by the character recognition unit 20 for the characters contained in each of the two areas, and on statistical information about character occurrence probabilities (for example, an N-gram model, which is a language model that examines how often a combination of N characters or words appears in a character string).

または、文字列領域結合部１６は、操作入力部１２から入力される、ユーザによるポインタの操作に基づく操作入力信号（補助情報）に基づいて、抽出された文字列領域と、当該文字列領域と隣接する他の文字列領域と、を結合する。
すなわち、文字列領域結合部１６は、ユーザの操作により（手動で）文字列領域の結合を行う場合と、統計情報に基づいて（自動で）文字列領域の結合を行う場合と、がある。 Alternatively, the character string area combining unit 16 combines an extracted character string area with another adjacent character string area based on an operation input signal (auxiliary information) that is input from the operation input unit 12 and is based on the user's pointer operation.
That is, the character string area joining unit 16 may (manually) join the character string areas by the user's operation, or (automatically) join the character string areas based on the statistical information.
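The automatic, statistics-based combining decision described above can be illustrated with a minimal sketch. It assumes a character bigram model (N = 2) with add-alpha smoothing and judges only the boundary between the two recognized strings; the function names `train_bigram`, `bigram_prob`, and `should_merge`, the smoothing, and the threshold value are illustrative assumptions, not details given in this description.

```python
from collections import Counter

def train_bigram(corpus):
    """Collect the vocabulary, context counts, and bigram counts from training strings."""
    vocab, contexts, bigrams = set(), Counter(), Counter()
    for text in corpus:
        vocab.update(text)
        contexts.update(text[:-1])            # characters that have a successor
        bigrams.update(zip(text, text[1:]))   # adjacent character pairs
    return vocab, contexts, bigrams

def bigram_prob(prev, nxt, model, alpha=1.0):
    """Estimate P(nxt | prev) with add-alpha smoothing (an assumed choice)."""
    vocab, contexts, bigrams = model
    return (bigrams[(prev, nxt)] + alpha) / (contexts[prev] + alpha * max(len(vocab), 1))

def should_merge(text_a, text_b, model, threshold=0.2):
    """Return True if the OCR text of area B plausibly continues the OCR text of area A."""
    if not text_a or not text_b:
        return False
    # Hypothetical criterion: probability of B's first character following A's last character.
    return bigram_prob(text_a[-1], text_b[0], model) >= threshold
```

A practical system would likely score longer contexts or word-level N-grams; the boundary-only test is used here only to keep the sketch short.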

ルビ対応付け部１８は、ユーザによるポインタの操作に基づく操作入力信号（補助情報）に基づいて、文字列領域抽出部１５によって抽出された文字列領域に含まれる文字列と、当該文字列に対応するルビと、を対応付ける。ルビ対応付け部１８は、対応付けられた文字列と当該文字列に対応するルビとが囲まれた範囲であるルビ対応付け領域を示す画像を、表示部１３に表示させる。 Based on an operation input signal (auxiliary information) based on the user's pointer operation, the ruby mapping unit 18 associates a character string included in the character string area extracted by the character string area extraction unit 15 with the ruby corresponding to that character string. The ruby mapping unit 18 causes the display unit 13 to display an image showing a ruby mapping area, which is the range enclosing the associated character string and its corresponding ruby.

（文字列領域・文字矩形抽出装置の動作）
以下、実施形態の変形例１に係る文字列領域・文字矩形抽出装置２の動作について、図面を参照しながら説明する。
図１７は、本発明の実施形態の変形例１に係る文字列領域・文字矩形抽出装置２の動作を示すフローチャートである。本フローチャートは、文字列を含む画像を示す画像データ（例えば、文書をスキャナ等によって読み取ることにより生成される画像データ）が、画像データ取得部１１に入力される際に開始する。 (Operation of character string area / character rectangle extractor)
Hereinafter, the operation of the character string area / character rectangle extraction device 2 according to the modification 1 of the embodiment will be described with reference to the drawings.
FIG. 17 is a flowchart showing the operation of the character string area / character rectangle extraction device 2 according to the first modification of the embodiment of the present invention. This flowchart starts when image data indicating an image including a character string (for example, image data generated by reading a document with a scanner or the like) is input to the image data acquisition unit 11.

（ステップｓｔ１０１）文字列領域・文字矩形抽出装置２の画像データ取得部１１は、文字列を含む画像を示す画像データ（例えば、文書をスキャナ等によって読み取ることにより生成される画像データ）を取得する。その後、ステップｓｔ１０２へ進む。 (Step st101) The image data acquisition unit 11 of the character string area / character rectangle extraction device 2 acquires image data indicating an image including a character string (for example, image data generated by reading a document with a scanner or the like). Then, the process proceeds to step st102.

（ステップｓｔ１０２）文字列領域・文字矩形抽出装置２の制御部１０は、画像データ取得部１１によって取得された画像データを記憶部１４に記憶させる。そして、制御部１０は、記憶部１４に記憶された当該画像データに対して、各種の事前処理を行う。ここでいう事前処理とは、画像データが示す画像の傾きを補正する処理や、色分解を行う処理などである。すなわち、当該事前処理は、例えば、文字列領域の抽出、文字矩形の抽出、および文字認識の処理を容易にするため、あるいは文字列領域の抽出、文字矩形の抽出、および文字認識の精度を高めるために行われる処理である。その後、ステップｓｔ１０３へ進む。 (Step st102) The control unit 10 of the character string area / character rectangle extraction device 2 stores the image data acquired by the image data acquisition unit 11 in the storage unit 14. Then, the control unit 10 performs various kinds of preprocessing on the image data stored in the storage unit 14. The preprocessing referred to here includes, for example, correcting the inclination of the image indicated by the image data and performing color separation. That is, the preprocessing is performed, for example, to facilitate character string area extraction, character rectangle extraction, and character recognition, or to improve their accuracy. Then, the process proceeds to step st103.

（ステップｓｔ１０３）文字列領域・文字矩形抽出装置２の文字列領域抽出部１５は、操作入力部１２から入力される操作入力信号（補助情報）に基づいて、画像データ取得部１１が取得した画像データに基づく画像に含まれる文字列からなる行の始点と終点とを特定し、特定された行の始点と終点とに基づいて当該行に含まれる文字列の表示対象範囲を示す文字列領域を抽出する。その後、ステップｓｔ１０４へ進む。 (Step st103) Based on the operation input signal (auxiliary information) input from the operation input unit 12, the character string area extraction unit 15 of the character string area / character rectangle extraction device 2 identifies the start point and the end point of a line consisting of a character string included in the image based on the image data acquired by the image data acquisition unit 11. Based on the identified start point and end point, it then extracts a character string area indicating the display target range of the character string included in that line. Then, the process proceeds to step st104.

（ステップｓｔ１０４）画像データ取得部１１によって取得された画像データに基づく画像に含まれる全ての文字列領域の抽出が完了した場合には、ステップｓｔ１０５へ進む。
そうでない場合は、ステップｓｔ１０３へ戻る。 (Step st104) When the extraction of all the character string regions included in the image based on the image data acquired by the image data acquisition unit 11 is completed, the process proceeds to step st105.
If not, the process returns to step st103.

（ステップｓｔ１０５）文字列領域・文字矩形抽出装置２の文字矩形抽出部１７は、文字列領域抽出部１５によって抽出された文字列領域に含まれる文字列を構成するそれぞれの文字の矩形を表す文字矩形を抽出する。その後、ステップｓｔ１０６へ進む。 (Step st105) The character rectangle extraction unit 17 of the character string area / character rectangle extraction device 2 represents a character representing a rectangle of each character constituting the character string included in the character string area extracted by the character string area extraction unit 15. Extract the rectangle. Then, the process proceeds to step st106.

（ステップｓｔ１０６）文字矩形抽出部１７は、抽出された文字矩形を示す画像を表示部１３に表示させる。そして、ユーザは、表示部１３に表示された画像を確認し、文字矩形が誤りなく抽出されているか否かを確認する。文字矩形が誤りなく抽出されている場合には、ステップｓｔ１０８へ進む。そうでない場合、すなわち、文字矩形が誤って抽出されている箇所が存在する場合には、ステップｓｔ１０７へ進む。 (Step st106) The character rectangle extraction unit 17 causes the display unit 13 to display an image showing the extracted character rectangle. Then, the user confirms the image displayed on the display unit 13 and confirms whether or not the character rectangle is extracted without error. If the character rectangle is extracted without error, the process proceeds to step st108. If this is not the case, that is, if there is a place where the character rectangle is erroneously extracted, the process proceeds to step st107.

（ステップｓｔ１０７）文字矩形抽出部１７は、操作入力部１２から入力される、ユーザによるポインタの操作に基づく操作入力信号（補助情報）に基づいて、抽出された文字矩形を修正する。なお、ユーザは、表示部１３に表示された文字矩形管理画面（図示せず）などを参照しながら、操作入力部１２（例えば、マウス）により文字矩形の修正のための操作を行う。その後、ステップｓｔ１０８へ進む。 (Step st107) The character rectangle extraction unit 17 corrects the extracted character rectangle based on the operation input signal (auxiliary information) input from the operation input unit 12 based on the operation of the pointer by the user. The user performs an operation for correcting the character rectangle by the operation input unit 12 (for example, a mouse) while referring to the character rectangle management screen (not shown) displayed on the display unit 13. Then, the process proceeds to step st108.

（ステップｓｔ１０８）文字列領域・文字矩形抽出装置２の文字認識部２０は、文字矩形抽出部１７によって抽出されたそれぞれの文字矩形に含まれる文字を示す画像に対して光学文字認識（ＯＣＲ）を行い、文字を認識する。具体的には、文字認識部２０は、（例えば、スキャナによって読み取られた）文書に含まれる文字を示す画像データを解析し、コンピュータにより編集可能なデータ形式（例えば、文字コードの列）に変換する。その後、ステップｓｔ１０９へ進む。 (Step st108) The character recognition unit 20 of the character string area / character rectangle extraction device 2 performs optical character recognition (OCR) on the image showing the character contained in each character rectangle extracted by the character rectangle extraction unit 17, and recognizes the characters. Specifically, the character recognition unit 20 analyzes image data indicating the characters contained in a document (for example, one read by a scanner) and converts it into a data format that can be edited by a computer (for example, a sequence of character codes). Then, the process proceeds to step st109.

（ステップｓｔ１０９）文字列領域結合部１６に備えられた言語解析部２６は、文字認識部２０による文字認識の結果を示す情報を解析する。そして、言語解析部２６は、解析された結果を示す情報に基づいて、文字列領域を結合するか否かを判定する。具体的には、例えば、言語解析部２６は、ある２つの文字列領域にそれぞれ含まれる文字に対しての文字認識部２０による文字認識の結果を示す情報と、文字の生起確率における統計的な情報（例えば、ある文字列の中でＮ個の文字列または単語の組み合わせが、どの程度出現するかを調査する言語モデルであるＮグラムモデルなど）とに基づいて、文字列領域を結合するか否か（すなわち、当該２つの文字列領域が、同一の文字列領域であるか否か）を判定する。 (Step st109) The language analysis unit 26 provided in the character string area combining unit 16 analyzes information indicating the result of character recognition by the character recognition unit 20. Then, the language analysis unit 26 determines whether or not to combine character string areas based on information indicating the analysis result. Specifically, for example, the language analysis unit 26 determines whether or not to combine two character string areas (that is, whether or not the two character string areas are the same character string area) based on information indicating the result of character recognition by the character recognition unit 20 for the characters contained in each of the two areas, and on statistical information about character occurrence probabilities (for example, an N-gram model, which is a language model that examines how often a combination of N characters or words appears in a character string).

そして、文字列領域結合部１６は、言語解析部２６による上記の判定の結果に基づいて、文字列領域抽出部１５によって抽出された文字列領域と、当該文字列領域と隣接する他の文字列領域と（例えば、当該文字列領域が抽出される前に抽出された隣接する他の文字列領域）、を結合する。その後、ステップｓｔ１１０へ進む。 Then, based on the result of the above determination by the language analysis unit 26, the character string area combining unit 16 combines the character string area extracted by the character string area extraction unit 15 with another character string area adjacent to it (for example, an adjacent character string area that was extracted before that character string area). Then, the process proceeds to step st110.

（ステップｓｔ１１０）文字列領域結合部１６は、結合された文字列領域を示す画像を表示部１３に表示させる。そして、ユーザは、表示部１３に表示された画像を確認し、文字列領域の結合が誤りなく行われているか否かを確認する。文字列領域の結合が誤りなく行われている場合には、本フローチャートに示される処理が終了する。そうでない場合、すなわち、文字列領域が誤って結合されている（または、誤って分割されている）箇所が存在する場合には、ステップｓｔ１１１へ進む。 (Step st110) The character string area combining unit 16 causes the display unit 13 to display an image showing the combined character string area. Then, the user confirms the image displayed on the display unit 13 and confirms whether or not the character string areas are combined without error. If the character string areas are combined without error, the process shown in this flowchart ends. If this is not the case, that is, if there is a place where the character string areas are erroneously combined (or erroneously divided), the process proceeds to step st111.

（ステップｓｔ１１１）文字列領域抽出部１５は、操作入力部１２から入力される、ユーザによるポインタの操作に基づく操作入力信号（補助情報）に基づいて、抽出された文字列領域を修正する。なお、ユーザは、表示部１３に表示された文字列領域管理画面などを参照しながら、操作入力部１２（例えば、マウス）により文字列領域の修正のための操作を行う。
以上で、本フローチャートに示される処理が終了する。 (Step st111) The character string area extraction unit 15 modifies the extracted character string area based on the operation input signal (auxiliary information) input from the operation input unit 12 based on the operation of the pointer by the user. The user performs an operation for correcting the character string area by the operation input unit 12 (for example, a mouse) while referring to the character string area management screen displayed on the display unit 13.
This completes the process shown in this flowchart.
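The control flow of steps st102 to st111 can be summarized in a short sketch. Every entry in `ops` is a hypothetical callable standing in for one of the units described above (for example, `extract_region` for the character string area extraction unit 15), and `run_extraction_flow` itself is an illustrative name; none of these appear in this description. Acquisition of the image data (step st101) is assumed to have happened before the call.

```python
def run_extraction_flow(image_data, ops):
    """Run steps st102 to st111 once; `ops` maps hypothetical step names to callables."""
    data = ops["preprocess"](image_data)                 # st102: tilt correction, color separation, ...
    regions = []
    while True:
        regions.append(ops["extract_region"](data))      # st103: one line, from user input
        if ops["all_regions_extracted"](data, regions):  # st104: otherwise return to st103
            break
    rects = ops["extract_rects"](data, regions)          # st105: character rectangles
    if not ops["rects_ok"](rects):                       # st106: user checks the rectangles
        rects = ops["correct_rects"](rects)              # st107: manual correction
    texts = ops["recognize"](data, rects)                # st108: OCR per rectangle
    regions = ops["merge_regions"](regions, texts)       # st109: language-analysis-based combining
    if not ops["regions_ok"](regions):                   # st110: user checks the combined areas
        regions = ops["correct_regions"](regions)        # st111: manual correction
    return regions, texts
```

Passing the steps in as callables keeps the sketch independent of any particular OCR engine or user interface.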

以上、説明したように、本実施形態の変形例１に係る文字列領域・文字矩形抽出装置１は、文字列を含む画像を示す画像データを取得する画像データ取得部１１と、取得された画像データに基づく画像を表示する表示部１３と、ユーザからの操作入力を受け付ける操作入力部１２と、操作入力に基づく補助情報に基づいて当該画像に含まれる文字列からなる行の始点と終点とを特定し特定された行の始点と終点とに基づいて行に含まれる文字列の表示対象範囲を示す文字列領域を抽出する文字列領域抽出部１５と、抽出された文字列領域と当該文字列領域と隣接する他の文字列領域とを結合する文字列領域結合部１６と、を備える。 As described above, the character string area / character rectangle extraction device 1 according to the first modification of the present embodiment includes: an image data acquisition unit 11 that acquires image data indicating an image including a character string; a display unit 13 that displays an image based on the acquired image data; an operation input unit 12 that receives an operation input from a user; a character string area extraction unit 15 that identifies, based on auxiliary information based on the operation input, the start point and the end point of a line consisting of a character string included in the image, and extracts, based on the identified start point and end point, a character string area indicating the display target range of the character string included in the line; and a character string area combining unit 16 that combines the extracted character string area with another character string area adjacent to it.

さらに、本実施形態の変形例１に係る文字列領域・文字矩形抽出装置１は、文字認識部２０による文字認識の結果を示す情報を解析し、解析された結果を示す情報に基づいて文字列領域を結合するか否かを判定する言語解析部２６を備え、文字列領域結合部は、言語解析部２６によって判定された結果に基づいて、抽出された前記文字列領域と当該文字列領域と隣接する他の文字列領域とを結合する。 Further, the character string area / character rectangular extraction device 1 according to the modification 1 of the present embodiment analyzes the information indicating the result of character recognition by the character recognition unit 20, and the character string is based on the information indicating the analyzed result. A language analysis unit 26 for determining whether or not to combine areas is provided, and the character string area combination unit includes the character string area extracted and the character string area based on the result determined by the language analysis unit 26. Combines with other adjacent string areas.

以上により、本発明の実施形態の変形例１に係る文字列領域・文字矩形抽出装置２は、ユーザの操作に基づく補助情報を用いて、文字列領域の抽出および文字矩形の抽出の精度を高めることができる。さらに、本発明の実施形態の変形例１に係る文字列領域・文字矩形抽出装置２は、文字認識の結果を示す情報も用いて文字列領域の結合の判定を行うことができるため、文字列領域の抽出の精度を高めたり、文字列領域の抽出のための処理の効率化を図ったりすることができる。 As described above, the character string area / character rectangle extraction device 2 according to the first modification of the embodiment of the present invention improves the accuracy of the character string area extraction and the character rectangle extraction by using the auxiliary information based on the user's operation. be able to. Further, since the character string area / character rectangular extraction device 2 according to the first modification of the embodiment of the present invention can determine the combination of the character string areas by using the information indicating the result of the character recognition, the character string can be determined. It is possible to improve the accuracy of area extraction and improve the efficiency of processing for extracting the character string area.

＜実施形態の変形例２＞
上述した実施形態に係る文字列領域・文字矩形抽出装置１においては、文字列領域抽出部１５は、ユーザの操作に基づく補助情報に対応させ、表示部１３が表示する画像に含まれる文字列からなる、ユーザの操作が行われた行の始点および終点の位置を特定した。しかしながら、文字列領域抽出部１５は、画像に対するユーザの操作に基づく補助情報により始点および終点が特定された行に基づいて、その画像に含まれる他の行の始点および終点の位置を特定してもよい。
以下に説明する実施形態の変形例２に係る文字列領域・文字矩形抽出装置１においては、文字列領域抽出部１５は、表示部１３が表示する画像に対しユーザの操作により始点および終点が指定された行に基づいて、その画像に含まれる他の行の始点および終点の位置を特定する。なお、以下に述べる実施形態の変形例２の説明において、実施形態と同じ構成には同じ符号を付し、その説明を省略する。 <Modification 2 of the embodiment>
In the character string area / character rectangle extraction device 1 according to the above-described embodiment, the character string area extraction unit 15 identified, in accordance with auxiliary information based on the user's operation, the positions of the start point and the end point of the line on which the user's operation was performed, among the lines of character strings included in the image displayed by the display unit 13. However, the character string area extraction unit 15 may also identify the positions of the start points and end points of other lines included in the image based on the line whose start point and end point were identified by the auxiliary information based on the user's operation on the image.
In the character string area / character rectangle extraction device 1 according to the second modification of the embodiment described below, the character string area extraction unit 15 identifies the positions of the start points and end points of other lines included in the image displayed by the display unit 13 based on the line whose start point and end point were designated by the user's operation on that image. In the following description of the second modification, the same components as those of the embodiment are given the same reference numerals, and their description is omitted.

文字列領域抽出部１５は、表示部１３が表示する画像に含まれる文字列からなる行に対する前記操作入力に基づく補助情報により特定された始点および終点に基づいて、その画像に含まれる文字列からなる他の行の始点および終点の位置を特定する特定情報を生成し、特定情報により特定される始点と終点とに基づいて、他の行に含まれる文字列の文字列領域を抽出する。例えば、文字列領域抽出部１５は、画像に含まれる複数の行のうちいずれか一つの行の始点および終点の位置がユーザの入力操作により指定されると、残りの行の始点および終点の位置を特定する。 The character string area extraction unit 15 generates specific information identifying the positions of the start points and end points of the other lines of character strings included in the image displayed by the display unit 13, based on the start point and the end point identified, by auxiliary information based on the operation input, for one line of character strings in that image. It then extracts the character string areas of the character strings included in the other lines based on the start points and end points identified by the specific information. For example, when the positions of the start point and the end point of any one of the plurality of lines included in the image are designated by a user input operation, the character string area extraction unit 15 identifies the positions of the start points and end points of the remaining lines.

また、文字列領域抽出部１５は、表示部１３が表示する画像に対して特定された始点および終点のうち少なくともいずれか一方に対する前記操作入力に基づく補助情報により、その画像に対して生成された特定情報を修正する。例えば、文字列領域抽出部１５は、画像に含まれる複数の行の始点および終点の位置が特定されている場合、複数の行の始点のうちいずれか一つの始点の位置がユーザの入力操作により修正されると、残りの行の始点の位置を修正する。あるいは、文字列領域抽出部１５は、画像に含まれる複数の行の始点および終点の位置が特定されている場合、複数の行の終点のうちいずれか一つの終点の位置がユーザの入力操作により修正されると、残りの行の終点の位置を修正する。 Further, the character string area extraction unit 15 corrects the specific information generated for the image displayed by the display unit 13, using auxiliary information based on the operation input for at least one of the start points and end points identified for that image. For example, in a case where the positions of the start points and end points of a plurality of lines included in the image have been identified, when the position of any one of the start points of the lines is corrected by a user input operation, the character string area extraction unit 15 corrects the positions of the start points of the remaining lines. Alternatively, in the same case, when the position of any one of the end points of the lines is corrected by a user input operation, the character string area extraction unit 15 corrects the positions of the end points of the remaining lines.
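How the remaining lines are corrected is not specified in this description. The following minimal sketch assumes one plausible rule: the displacement that the user applied to one start point is applied to the start points of all lines, while the end points are left unchanged. The function name `propagate_start_correction` and the data layout (a list of start/end coordinate pairs) are illustrative assumptions.

```python
def propagate_start_correction(lines, index, new_start):
    """Apply the user's correction of one start point to every line's start point.

    lines: list of ((start_x, start_y), (end_x, end_y)) tuples, one per line.
    index: index of the line whose start point the user moved.
    new_start: corrected (x, y) position of that start point.
    Assumption: every start point is shifted by the same displacement.
    """
    (old_x, old_y), _ = lines[index]
    dx, dy = new_start[0] - old_x, new_start[1] - old_y
    return [((sx + dx, sy + dy), end) for (sx, sy), end in lines]
```

For example, moving the start point of the first line down by 5 shifts the start points of all other lines down by 5 as well. The same approach would apply to end points by shifting the second element of each pair instead.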

また、文字列領域抽出部１５は、表示部１３が表示する画像に対して抽出された文字列領域に対する操作入力に基づく補助情報により、その画像に対して抽出された前記文字列領域を修正する。例えば、文字列領域抽出部１５は、画像に含まれる複数の文字列の文字列領域の位置が抽出されている場合、複数の文字列領域のうちいずれか一つの文字列領域の位置がユーザの入力操作により修正されると、残りの文字列領域の位置を修正する。 Further, the character string area extraction unit 15 corrects the character string areas extracted for the image displayed by the display unit 13, using auxiliary information based on an operation input for a character string area extracted for that image. For example, in a case where the positions of the character string areas of a plurality of character strings included in the image have been extracted, when the position of any one of the character string areas is corrected by a user input operation, the character string area extraction unit 15 corrects the positions of the remaining character string areas.

また、文字列領域抽出部１５は、表示部１３が表示する第１画像に対して行われたユーザによる操作入力信号（補助情報）、および、その第１画像に対して生成された特定情報のうち少なくともいずれか一方に基づいて、表示部１３が表示する第２画像に対する特定情報を生成する。
ここで、第１画像および第２画像とは、文字列領域を抽出する対象である文字列が含まれる画像であり、例えば第１画像は書籍の１ページ目、第２画像は当該書籍の２ページ目等を撮像した画像である。
例えば、文字列領域抽出部１５は、書籍のあるページを撮像した画像に対してユーザにより指定された始点および終点の位置に基づいて、書籍の別のページに含まれる文字列の始点および終点の位置を特定する。あるいは、文字列領域抽出部１５は、書籍のあるページを撮像した画像に対してユーザにより指定された始点および終点の位置から特定した他の行の始点および終点の位置に基づいて、書籍の別のページに含まれる文字列の始点および終点の位置を特定する。あるいは、文字列領域抽出部１５は、書籍のあるページを撮像した画像に対してユーザにより指定された始点および終点の位置と、文字列領域抽出部１５が特定した他の行の始点および終点の位置とに基づいて、書籍の別のページに含まれる文字列の始点および終点の位置を特定する。 Further, the character string area extraction unit 15 generates specific information for a second image displayed by the display unit 13 based on at least one of the operation input signal (auxiliary information) resulting from the user's operation on a first image displayed by the display unit 13 and the specific information generated for that first image.
Here, the first image and the second image are images that include character strings from which character string areas are to be extracted. For example, the first image is a captured image of the first page of a book, and the second image is a captured image of the second page of that book.
For example, the character string area extraction unit 15 identifies the positions of the start points and end points of the character strings on another page of the book based on the positions of the start point and the end point designated by the user for an image capturing one page of the book. Alternatively, the character string area extraction unit 15 identifies the positions of the start points and end points of the character strings on another page of the book based on the positions of the start points and end points of the other lines, which were identified from the positions of the start point and the end point designated by the user for the image capturing one page of the book. Alternatively, the character string area extraction unit 15 identifies the positions of the start points and end points of the character strings on another page of the book based on both the positions of the start point and the end point designated by the user for the image capturing one page of the book and the positions of the start points and end points of the other lines identified by the character string area extraction unit 15.

また、文字列領域抽出部１５は、表示部１３が表示する第１画像に対して抽出された文字列領域に関する情報に基づいて、第１画像とは異なる第２画像に対する文字列領域を抽出する。例えば、文字列領域抽出部１５は、書籍のあるページを撮像した画像に対して抽出された文字列領域の位置に基づいて、書籍の別のページに含まれる文字列領域を抽出する。 Further, the character string area extraction unit 15 extracts a character string area for a second image different from the first image based on the information regarding the character string area extracted for the first image displayed by the display unit 13. .. For example, the character string area extraction unit 15 extracts a character string area included in another page of a book based on the position of the character string area extracted with respect to an image obtained by capturing a page of the book.

（特定情報の生成）
以下、文字列領域抽出部１５が、画像に含まれる文字列からなる行に対する前記操作入力に基づく補助情報により特定された始点および終点に基づいて、その画像に含まれる文字列からなる他の行の始点および終点の位置を特定する特定情報を生成する方法について、図面を参照しながら説明する。
図１８乃至図１９は、実施形態に係る文字列領域・文字矩形抽出装置１の表示部１３によって表示される文字列領域抽出画面の一例を示す図である。 (Generation of specific information)
Hereinafter, a method by which the character string area extraction unit 15 generates specific information identifying the positions of the start points and end points of the other lines of character strings included in an image, based on the start point and the end point identified for one line of character strings in that image by auxiliary information based on the operation input, will be described with reference to the drawings.
FIGS. 18 and 19 are diagrams showing an example of the character string area extraction screen displayed by the display unit 13 of the character string area / character rectangle extraction device 1 according to the embodiment.

図１８に示すように、ユーザは、操作入力部１２によりポインタｐｔ１を操作することにより、文字列領域抽出画面ｄｓ２に含まれる文字列の範囲を、行単位で指定する。ここでは、ユーザは、文字列領域抽出画面ｄｓ２に表示される画像に含まれる縦書きの文字列のうち、右端の一行目に記載された文字列の始点および終点を指定するものとして説明する。また、ユーザによる当該行単位での文字列領域を指定する操作は、すでに説明したものと同様な操作であるため、ここでは詳細な説明を省略する。 As shown in FIG. 18, the user operates the pointer pt1 by the operation input unit 12 to specify the range of the character string included in the character string area extraction screen ds2 in line units. Here, the user will be described as designating the start point and the end point of the character string described in the first line at the right end of the vertically written character strings included in the image displayed on the character string area extraction screen ds2. Further, since the operation of specifying the character string area for each line by the user is the same operation as that already described, detailed description thereof will be omitted here.

ユーザにより行の始点および終点を指定する操作がなされると、表示部１３により、文字列領域抽出画面ｄｓ２には、始点ｓｔ１に白い丸型のアイコン、終点ｅｄ１に黒い丸型のアイコンがそれぞれ表示される。また、ユーザにより行の終点を指定する操作がなされると、文字列領域抽出画面ｄｓ２には、始点ｓｔ１と終点ｅｄ１を結ぶ接続線ｃｎ１が表示される。 When the user performs an operation to specify the start point and the end point of the line, the display unit 13 displays a white circle icon at the start point st1 and a black circle icon at the end point ed1 on the character string area extraction screen ds2. Will be done. Further, when the user performs an operation of designating the end point of the line, the connection line cn1 connecting the start point st1 and the end point ed1 is displayed on the character string area extraction screen ds2.

そして、文字列領域抽出部１５は、ユーザにより行の始点および終点が指定されると、当該始点および終点が指定された行（図１８の例では、一行目の行）の位置関係に基づいて、文字列領域抽出画面ｄｓ２に表示される画像に含まれる他の行（図１８の例では、二行目以降の行）の始点、および終点の位置を特定する特定情報を生成する。特定情報とは、例えば、二行目以降の行の始点、および終点のｘｙ座標値である。以下、特定情報が行の始点、および終点のｘｙ座標値であるものとして説明する。また、以下においては、特定情報を生成する処理は、ｘｙ座標系においてｙ軸方向に配列された文字列（つまり、縦書きの文字列）を、ｘ軸の負の方向（つまり、右から左の方向）に処理する動作を例として説明する。 Then, when the start point and the end point of the line are specified by the user, the character string area extraction unit 15 is based on the positional relationship of the line (in the example of FIG. 18, the first line) to which the start point and the end point are specified. , Generates specific information for specifying the positions of the start points and end points of other lines (lines after the second line in the example of FIG. 18) included in the image displayed on the character string area extraction screen ds2. The specific information is, for example, xy coordinate values of the start point and the end point of the second and subsequent rows. Hereinafter, it is assumed that the specific information is the xy coordinate values of the start point and the end point of the row. Further, in the following, in the process of generating specific information, a character string arranged in the y-axis direction (that is, a vertically written character string) in the xy coordinate system is moved in the negative direction of the x-axis (that is, from right to left). The operation of processing in the direction of) will be described as an example.

文字列領域抽出部１５は、例えば、ユーザの操作により指定された一行目の行の始点ｓｔ１のｘｙ座標値が（ｘｓｔ１、ｙｓｔ１）であったとすると、二行目の行の始点Ｅｓｔ２のｘｙ座標値を（ｘｓｔ１－ｄ、ｙｓｔ１）とする。つまり、二行目の行の始点Ｅｓｔ２は、一行目の行の始点の位置からｘ軸の負の方向にｄ離れ、ｙ軸方向に変化しない（同じｙ座標値）位置とする。
また、文字列領域抽出部１５は、例えば、ユーザの操作により指定された一行目の行の終点ｅｄ１のｘｙ座標値が（ｘｅｄ１、ｙｅｄ１）であったとすると、二行目の行の終点Ｅｅｄ２のｘｙ座標値を（ｘｅｄ１－ｄ、ｙｅｄ１）とする。つまり、二行目の行の終点Ｅｅｄ２は、一行目の行の終点の位置からｘ軸方向に行間の間隔がｄとなる位置であって、ｙ軸方向に一行目の行の終点のｙ座標値と同じとなる位置とする。
なお、ｄは、予め記憶部１４に記憶された値であってもよいし、ユーザが指定する値であってもよい。また、ｄは、文字列領域抽出画面ｄｓ２に表示される画像の種別に応じて設定される値であってもよい。画像の種別とは、例えば、文庫本の書式に基づいて記載された文字列を撮像した画像か、新書の書式に基づいて記載された文字列を撮像した画像か等である。これにより、文字列領域抽出部１５は、文庫本の書式と、新書の書式とで行間の間隔が異なる場合には、それぞれに対応した行間の間隔を用いて特定情報を生成することができる。 For example, assuming that the xy coordinate value of the start point st1 of the first row specified by the user's operation is (xst1, yst1), the character string area extraction unit 15 has the xy coordinates of the start point Est2 of the second row. The value is (xst1-d, yst1). That is, the start point Est2 of the second row is set to a position d away from the position of the start point of the first row in the negative direction of the x-axis and does not change in the y-axis direction (same y-coordinate value).
Further, if the xy coordinate value of the end point ed1 of the first line specified by the user's operation is (xed1, yed1), the character string area extraction unit 15 has the end point Eed2 of the second line. Let the xy coordinate value be (xed1-d, yesd1). That is, the end point Eed2 of the second row is a position where the distance between the rows is d in the x-axis direction from the position of the end point of the first row, and the y coordinate of the start point of the first row in the y-axis direction. The position is the same as the value.
Note that d may be a value stored in the storage unit 14 in advance, or may be a value specified by the user. Further, d may be a value set according to the type of the image displayed on the character string area extraction screen ds2. The type of the image indicates, for example, whether the image captures character strings laid out in the bunko (pocket paperback) book format or in the shinsho ("new book" paperback) format. As a result, when the line spacing differs between the bunko format and the shinsho format, the character string area extraction unit 15 can generate the specific information using the line spacing that corresponds to each format.

文字列領域抽出部１５は、三行目以降の行についても同様に、始点Ｅｓｔ３～Ｅｓｔ８それぞれのｘｙ座標値を（ｘｓｔ１－ｋ×ｄ、ｙｓｔ１）、終点Ｅｅｄ３～Ｅｅｄ８それぞれのｘｙ座標値を（ｘｅｄ１－ｋ×ｄ、ｙｅｄ１）とする。ここで、ｋは行番号から１を減算した値である。行番号は、三行目の行が３、四行目の行が４、・・の順に設定される番号である。このように、文字列領域抽出部１５は、ユーザの操作により指定された文字列領域抽出画面ｄｓ２に表示される画像に含まれる一行目の行の始点のｘｙ座標に基づいて、文字列領域抽出画面ｄｓ２に表示される画像の二行目以降の行の始点のｘｙ座標値を、ユーザの指定した一行目の行を基準として、ｘ軸方向に行間の間隔がｄとなる位置であって、ｙ軸方向に一行目の行の始点のｙ座標値と同じとなる位置とする。 Similarly, for the third and subsequent lines, the character string area extraction unit 15 sets the xy coordinate values of the start points Est3 to Est8 to (xst1 - k × d, yst1) and the xy coordinate values of the end points Eed3 to Eed8 to (xed1 - k × d, yed1). Here, k is the line number minus 1. The line numbers are assigned in order: 3 for the third line, 4 for the fourth line, and so on. In this way, based on the xy coordinates of the start point of the first line, specified by the user's operation in the image displayed on the character string area extraction screen ds2, the character string area extraction unit 15 places the start point of each of the second and subsequent lines, relative to the user-specified first line, at a position where the line spacing in the x-axis direction is d and the y-coordinate value equals that of the start point of the first line.
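The coordinate rule described above can be illustrated with a minimal Python sketch. It assumes vertical text processed from right to left, as in the description. The function name `propagate_line_points` and its parameters are hypothetical names chosen for illustration and are not part of the disclosed device.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in the screen coordinate system


def propagate_line_points(
    st1: Point, ed1: Point, d: float, n_lines: int
) -> List[Tuple[Point, Point]]:
    """Return (start, end) pairs for lines 1..n_lines.

    Line 1 is the user-specified line. Every other line is shifted by
    k * d in the negative x direction, where k is the line number minus 1,
    and keeps the y-coordinates of the first line.
    """
    lines = []
    for line_no in range(1, n_lines + 1):
        k = line_no - 1
        start = (st1[0] - k * d, st1[1])
        end = (ed1[0] - k * d, ed1[1])
        lines.append((start, end))
    return lines
```

For instance, calling `propagate_line_points((700.0, 50.0), (700.0, 900.0), 80.0, 3)` yields start points at x = 700, 620, and 540, all with y = 50.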

図１８の例では、表示部１３は、文字列領域抽出部１５が特定した特定情報に基づいて、文字列領域抽出画面ｄｓ２に、二行目の行の始点Ｅｓｔ２に白い方形のアイコン、終点Ｅｅｄ２に黒い方形のアイコンをそれぞれ表示する。また、表示部１３は、二行目の行と同様に、三行目以降の行の始点Ｅｓｔ３～Ｅｓｔ８に白い方形のアイコン、終点Ｅｅｄ３～Ｅｅｄ８に黒い方形のアイコンをそれぞれ表示する。 In the example of FIG. 18, based on the specific information generated by the character string area extraction unit 15, the display unit 13 displays, on the character string area extraction screen ds2, a white square icon at the start point Est2 of the second line and a black square icon at the end point Eed2. Likewise, the display unit 13 displays white square icons at the start points Est3 to Est8 of the third and subsequent lines and black square icons at the end points Eed3 to Eed8.

ここで、文字列領域抽出部１５は、二行目以降の行の始点および終点の位置を特定する際、その行に対応する箇所に文字列が示されているか否かに関わらず、始点および終点の位置を特定している。文字列領域抽出部１５は、始点ｓｔ１のｘｙ座標値からｘ軸方向にｄの整数倍離れた箇所に相当する点で、文字列領域抽出画面ｄｓ２に表示することができる点の全てを始点とする。また、文字列領域抽出部１５は、終点ｅｄ１のｘｙ座標値からｘ軸方向にｄの整数倍離れた箇所に相当する点で、文字列領域抽出画面ｄｓ２に表示することができる点の全てを終点とする。図１８の例では、文字列領域抽出部１５は、始点Ｅｓｔ８、および終点Ｅｅｄ８の間には文字列が存在していない場合であっても、始点Ｅｓｔ８、および終点Ｅｅｄ８を特定する。 Here, when specifying the positions of the start points and end points of the second and subsequent lines, the character string area extraction unit 15 does so regardless of whether a character string actually appears at the location corresponding to each line. The character string area extraction unit 15 treats as start points all points that lie at an integral multiple of d in the x-axis direction from the xy coordinate value of the start point st1 and that can be displayed on the character string area extraction screen ds2. Likewise, it treats as end points all points that lie at an integral multiple of d in the x-axis direction from the xy coordinate value of the end point ed1 and that can be displayed on the character string area extraction screen ds2. In the example of FIG. 18, the character string area extraction unit 15 specifies the start point Est8 and the end point Eed8 even though no character string exists between them.
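As a rough illustration of "all points that can be displayed on the screen", the following Python sketch enumerates the multiples of d that stay within the screen. It assumes the screen spans x values from 0 (inclusive) to `screen_width` (exclusive) and considers only the negative x direction used in the running example; `candidate_offsets` and `screen_width` are hypothetical names.

```python
from typing import List


def candidate_offsets(x1: float, d: float, screen_width: float) -> List[int]:
    """Return every k >= 1 for which the point x1 - k * d is still on screen.

    No check is made for whether text exists at the resulting position,
    mirroring the description: candidates are purely geometric.
    """
    if d <= 0:
        raise ValueError("line spacing d must be positive")
    offsets = []
    k = 1
    while x1 - k * d >= 0:
        if x1 - k * d < screen_width:
            offsets.append(k)
        k += 1
    return offsets
```

With the illustrative values x1 = 600, d = 80 and a screen 800 wide, the sketch returns seven offsets, which would correspond to seven further lines after the user-specified one.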

図１９に示すように、文字列領域抽出部１５は、ユーザによる一行目の行の始点および終点のｘｙ座標値に基づいて、他の行の始点および終点の位置を特定した後、特定した始点から終点までに含まれる文字列を囲む矩形状の文字列領域ｓａ１～ｓａ７を抽出する。文字列領域抽出部１５が文字列領域を抽出する処理については、すでに説明したものと同様な処理であるため、ここでは詳細な説明を省略するが、文字列領域抽出部１５は、特定した始点および終点を接続する接続線と文字とが交差している範囲を文字列領域とする。このため、図１８の始点Ｅｓｔ８、および終点Ｅｅｄ８のように、始点と終点との間に文字列が存在していない場合、始点Ｅｓｔ８、および終点Ｅｅｄ８に対応する文字列領域は抽出されない。 As shown in FIG. 19, after specifying the positions of the start points and end points of the other lines based on the xy coordinate values of the start point and end point of the first line given by the user, the character string area extraction unit 15 extracts rectangular character string areas sa1 to sa7, each enclosing the character string contained between a specified start point and its end point. The extraction process itself is the same as that already described, so a detailed description is omitted here; in brief, the character string area extraction unit 15 takes as the character string area the range in which the connection line joining a specified start point and end point intersects characters. Therefore, when no character string exists between a start point and an end point, as with the start point Est8 and the end point Eed8 in FIG. 18, no character string area corresponding to them is extracted.
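The consequence described above, that a line whose connection line crosses no characters yields no character string area, can be sketched in Python as follows. This is only a simplified stand-in for the extraction process described earlier in the document: it assumes a binarized image given as a list of rows in which 1 marks a character (black) pixel, and it samples points along the connection line rather than rasterizing it exactly. The names `segment_hits_ink` and `keep_lines_with_text` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Image = List[List[int]]  # image[y][x], 1 = character pixel, 0 = background


def segment_hits_ink(image: Image, start: Point, end: Point, samples: int = 200) -> bool:
    """Return True if any sampled point on the start-end segment is a character pixel."""
    height = len(image)
    width = len(image[0]) if height else 0
    for i in range(samples + 1):
        t = i / samples
        x = round(start[0] + t * (end[0] - start[0]))
        y = round(start[1] + t * (end[1] - start[1]))
        if 0 <= y < height and 0 <= x < width and image[y][x]:
            return True
    return False


def keep_lines_with_text(
    image: Image, lines: List[Tuple[Point, Point]]
) -> List[Tuple[Point, Point]]:
    """Drop candidate lines whose connection line intersects no character pixel."""
    return [(s, e) for (s, e) in lines if segment_hits_ink(image, s, e)]
```

A fuller implementation would go on to compute the bounding rectangle of the intersected characters; the sketch stops at the filtering step.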

なお、上記においては、文字列領域抽出部１５は、文字列領域抽出画面ｄｓ２に表示される画像に縦書きの文字列が示されている場合について説明した。しかしながら、文字列領域抽出部１５は、文字列領域抽出画面ｄｓ２に表示される画像に横書きの文字列が示されている場合にも、ユーザの操作により指定された行の始点および終点の位置に基づいて、文字列領域抽出画面ｄｓ２に表示される画像に含まれる他の行の始点および終点の位置を特定することができる。この場合、横書きの文字列は、ｘｙ座標系においてｘ軸方向に配列された文字列であり、ｙ軸の正の方向（つまり、上から下の方向）に示される文字列となる。このことから、文字列領域抽出部１５は、ユーザの操作により指定された行の始点のｘｙ座標値に対し、ｘ軸方向にｘ座標値が同じ値であって、ｙ軸の正の方向に一定（例えば、ｄ）の間隔ごと離れた位置を、他の行の始点の位置とする。また、文字列領域抽出部１５は、ユーザの操作により指定された行の終点のｘｙ座標値に対し、ｘ軸方向にｘ座標値が同じ値であって、ｙ軸の正の方向に一定（例えば、ｄ）の間隔ごと離れた位置を、他の行の終点の位置とする。 In the above, the character string area extraction unit 15 has described the case where a vertically written character string is shown in the image displayed on the character string area extraction screen ds2. However, even when the horizontal character string is shown in the image displayed on the character string area extraction screen ds2, the character string area extraction unit 15 is set at the start point and end point positions of the line specified by the user's operation. Based on this, the positions of the start point and the end point of other lines included in the image displayed on the character string area extraction screen ds2 can be specified. In this case, the horizontally written character string is a character string arranged in the x-axis direction in the xy coordinate system, and is a character string shown in the positive direction of the y-axis (that is, from top to bottom). From this, the character string area extraction unit 15 has the same x-coordinate value in the x-axis direction with respect to the xy-coordinate value of the start point of the line specified by the user's operation, and is in the positive direction of the y-axis. A position separated by a fixed interval (for example, d) is set as the position of the start point of another line. 
Further, with respect to the xy coordinate value of the end point of the line specified by the user's operation, the character string area extraction unit 15 sets, as the positions of the end points of the other lines, positions that have the same x-coordinate value and are spaced at fixed intervals (for example, d) in the positive direction of the y-axis.

なお、文字列領域抽出部１５は、縦書きの文字列であって、ｘ軸の負の方向（つまり、右から左の方向）ではなく、ｘ軸の正の方向（つまり、左から右の方向）に示されている場合や、横書きの文字列であって、ｙ軸の正の方向（つまり、上から下の方向）ではなく、ｙ軸の負の方向（つまり、下から上の方向）に示されている場合であっても、上記と同様な方法を用い、ユーザの操作により指定された行の始点および終点の位置に基づいて、文字列領域抽出画面ｄｓ２に表示される画像に含まれる他の行の始点および終点の位置を特定することができる。 The character string area extraction unit 15 is a vertically written character string, and is not in the negative direction of the x-axis (that is, the direction from right to left) but in the positive direction of the x-axis (that is, from left to right). Direction) or a horizontal string that is not in the positive direction of the y-axis (that is, from top to bottom) but in the negative direction of the y-axis (that is, from bottom to top). ), The same method as above is used to display the image displayed on the character string area extraction screen ds2 based on the positions of the start and end points of the line specified by the user's operation. The positions of the start and end points of other contained lines can be specified.
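The four layouts mentioned above (vertical text advancing right to left or left to right, horizontal text advancing top to bottom or bottom to top) differ only in the direction in which successive lines are offset. A minimal Python sketch of that idea follows, assuming the positive y-axis points downward as stated in the description; the layout keys and the function name `propagate_points` are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]

# Unit step from one line to the next, as (dx, dy). Positive y points downward.
STEP_DIRECTIONS: Dict[str, Tuple[int, int]] = {
    "vertical_right_to_left": (-1, 0),
    "vertical_left_to_right": (1, 0),
    "horizontal_top_to_bottom": (0, 1),
    "horizontal_bottom_to_top": (0, -1),
}


def propagate_points(
    st1: Point, ed1: Point, d: float, n_lines: int, layout: str
) -> List[Tuple[Point, Point]]:
    """Return (start, end) pairs for n_lines lines spaced d apart in the layout's direction."""
    dx, dy = STEP_DIRECTIONS[layout]
    lines = []
    for k in range(n_lines):  # k = line number minus 1
        start = (st1[0] + k * d * dx, st1[1] + k * d * dy)
        end = (ed1[0] + k * d * dx, ed1[1] + k * d * dy)
        lines.append((start, end))
    return lines
```

Expressing the layout as a step vector keeps a single propagation routine for all four cases instead of one routine per writing direction.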

このように、本実施形態の変形例２の文字列領域・文字矩形抽出装置１では、文字列領域抽出部１５は、前記操作入力に基づく補助情報に基づいて、表示部１３が表示する文字列領域抽出画面ｄｓ２に表示された、画像データ取得部１１が取得した画像データに基づく画像（「画像」の一例）に含まれる文字列からなる行のうち、補助情報により特定された行の始点ｓｔ１と終点ｅｄ１における当該行（例えば一行目）とは異なる他の行（例えば、二行目～八行目）の始点および終点をそれぞれ特定する特定情報を生成し、特定情報に基づいて特定される他の行の始点Ｅｓｔ２～Ｅｓｔ８と終点Ｅｅｄ２～Ｅｅｄ８とに基づいて、他の行に含まれる文字列の文字列領域ｓａ２～ｓａ５を抽出する。 As described above, in the character string area / character rectangular extraction device 1 of the modification 2 of the present embodiment, the character string area extraction unit 15 displays the character string displayed by the display unit 13 based on the auxiliary information based on the operation input. Of the lines consisting of character strings displayed in the area extraction screen ds2 and consisting of the character strings displayed in the image based on the image data acquired by the image data acquisition unit 11 (an example of "image"), the start point st1 of the line specified by the auxiliary information. And the end point ed1 generates specific information that specifies the start point and end point of another line (for example, the second line to the eighth line) different from the line (for example, the first line), and is specified based on the specific information. Based on the start points Est2 to Est8 and the end points Eed2 to Eed8 of the other lines, the character string areas sa2 to sa5 of the character strings included in the other lines are extracted.

書籍等の場合、文字列は一定の間隔で規則正しく並んで記載されることが多い。このため、書籍等を撮像した画像に含まれる文字列は、一定の間隔で並んでいる場合がほとんどである。つまり、文字列領域抽出部１５は、ユーザが指定した行の始点ｓｔ１、終点ｅｄ１、および接続線ｃｎ１に基づいて、他の行の始点Ｅｓｔ２～Ｅｓｔ８と終点Ｅｅｄ２～Ｅｅｄ８とを特定することができる。文字列領域抽出部１５が他の行の始点および終点の位置を特定することができるため、ユーザは、画像における文字列の一行ごとに始点および終点を指定する必要がない。つまり、文字列領域・文字矩形抽出装置１では、ユーザの操作に基づく補助情報を用いて、文字列領域の抽出の精度を高めることができる他、文字列領域抽出部１５が補助情報に基づいて、他の行の始点および終点の位置を特定することができ、ユーザが文字列からなる行を一行ごとに始点および終点を指定する操作の手間を省くことができる。 In books and similar documents, character strings are usually laid out regularly at fixed intervals. Consequently, the character strings in a captured image of a book or the like are, in most cases, arranged at fixed intervals. The character string area extraction unit 15 can therefore specify the start points Est2 to Est8 and the end points Eed2 to Eed8 of the other lines based on the start point st1, the end point ed1, and the connection line cn1 of the line specified by the user. Because the character string area extraction unit 15 can specify the positions of the start points and end points of the other lines, the user does not need to specify a start point and an end point for every line of text in the image. In other words, the character string area / character rectangle extraction device 1 not only improves the accuracy of character string area extraction by using auxiliary information based on the user's operation, but also spares the user the effort of specifying the start point and end point of each line one by one, because the character string area extraction unit 15 specifies them for the other lines based on the auxiliary information.

なお、文字列領域抽出部１５は、ユーザにより特定の行の始点および終点が指定された場合、他の行の始点および終点の位置を特定するか否かを、選択できるようにしてもよい。そして、文字列領域抽出部１５は、他の行の始点および終点の位置を特定することが選択された場合に、他の行の全ての行の始点および終点の位置を特定するか、または他の行の一部の始点および終点の位置を特定するかを、選択できるようにしてもよい。
文字列領域抽出部１５が、他の行の一部の始点および終点の位置を特定する場合、例えば、画像に文字列が七行に渡って記載され、その中の二行目から四行目までの行の始点および終点の位置を特定する場合について説明する。ユーザは、まず、二行目の行の始点近傍の位置にポインタｐｔ１を移動させてマウスボタンをクリックし、そのまま四行目の行の終点の位置までポインタｐｔ１をドラッグする。その後、二行目の行の始点近傍の位置にポインタｐｔ１を移動させてマウスボタンをクリックし、次に二行目の行の終点の位置にポインタｐｔ１を移動させてマウスボタンをクリックする。このような入力操作がなされることにより、文字列領域抽出部１５は、七行の文字列のうち、二行目の行に対して指定された始点および終点の位置に基づいて、三、四行目のそれぞれの行の始点および終点の位置を特定する。
文字列領域抽出部１５が、複数ある行の一部の始点および終点の位置を特定することにより、複数の行に記載される文字列のそれぞれが互いに書式が異なる場合（例えば、目次や見出しを示す文字列、本文を示す文字列等）、それぞれの書式に応じた始点および終点の位置を特定することができる。 Note that the character string area extraction unit 15 may allow selection of whether or not to specify the positions of the start points and end points of other lines when the user has specified the start point and end point of a particular line. When specifying the positions for other lines is selected, the character string area extraction unit 15 may further allow selection of whether to specify the positions of the start points and end points of all of the other lines or of only some of them.
The case where the character string area extraction unit 15 specifies the positions of the start points and end points of only some of the other lines is described using an example in which character strings span seven lines in the image and the start and end points of the second to fourth lines are to be specified. The user first moves the pointer pt1 to a position near the start point of the second line, clicks the mouse button, and drags the pointer pt1 to the position of the end point of the fourth line. The user then moves the pointer pt1 to a position near the start point of the second line and clicks the mouse button, and next moves the pointer pt1 to the position of the end point of the second line and clicks the mouse button. Through this input operation, the character string area extraction unit 15 specifies, among the seven lines of character strings, the positions of the start points and end points of the third and fourth lines based on the positions of the start point and end point specified for the second line.
Because the character string area extraction unit 15 specifies the positions of the start points and end points for only some of the lines, it can specify start and end positions suited to each format when the character strings on different lines are formatted differently (for example, character strings forming a table of contents or headings versus character strings forming the body text).

（特定情報の修正）
以下、文字列領域抽出部１５が、自身が生成した特定情報を、ユーザによるポインタの操作に基づく補助情報に基づいて修正する方法について、図面を参照しながら説明する。
図２０乃至図２１は、実施形態に係る文字列領域・文字矩形抽出装置１の表示部１３によって表示される文字列領域抽出画面の一例を示す図である。
(Correction of specific information)
Hereinafter, a method of correcting the specific information generated by the character string area extraction unit 15 based on the auxiliary information based on the operation of the pointer by the user will be described with reference to the drawings.
FIGS. 20 and 21 are diagrams each showing an example of the character string area extraction screen displayed by the display unit 13 of the character string area / character rectangle extraction device 1 according to the embodiment.

図２０に示すように、文字列領域抽出部１５が、ユーザにより指定された行の始点および終点の位置に基づいて、同じ画像に含まれる他の行の始点および終点の位置を特定しても、必ずしも実際の行の始点および終点の位置と一致するとは限らない。図２０の例では、文字列領域抽出部１５が特定した始点の位置は、実際の始点の位置よりも右方向にずれている。文字列領域抽出部１５は、ユーザにより指定された始点のｘ座標値からｘ軸方向にｄの整数倍離れた位置であって、ｙ軸方向の座標値が変化しない位置を、他の行の始点の位置とした。しかしながら、実際には、ユーザにより指定された行の始点の位置と、他の行の始点の位置とは、ｘ軸方向にｄ１離れているため、上述した右方向のずれが生じている。 As shown in FIG. 20, even if the character string area extraction unit 15 specifies the positions of the start points and end points of other lines included in the same image based on the positions of the start points and end points of the line specified by the user. , Does not always match the actual start and end points of the line. In the example of FIG. 20, the position of the start point specified by the character string area extraction unit 15 is shifted to the right from the position of the actual start point. The character string area extraction unit 15 sets a position on another line at a position separated by an integral multiple of d in the x-axis direction from the x-coordinate value of the start point specified by the user and in which the coordinate value in the y-axis direction does not change. The position of the starting point was used. However, in reality, the position of the start point of the row specified by the user and the position of the start point of another row are separated by d1 in the x-axis direction, so that the above-mentioned deviation in the right direction occurs.

文字列領域抽出部１５が特定した他の行の始点および終点の位置が、実際の始点および終点の位置と異なっている場合、文字列領域抽出部１５が特定した始点と終点とを接続させた接続線が、文字列と交差しないことがある。接続線が文字列と交差しない場合、文字列領域抽出部１５は、文字列領域を抽出することができない。このため、文字列領域抽出部１５は、画像に対して特定された始点および終点を修正できることが望ましい。 When the positions of the start point and the end point of the other lines specified by the character string area extraction unit 15 are different from the actual start point and end point positions, the start point and the end point specified by the character string area extraction unit 15 are connected. The connecting line may not intersect the string. If the connecting line does not intersect the character string, the character string area extraction unit 15 cannot extract the character string area. Therefore, it is desirable that the character string area extraction unit 15 can correct the start point and the end point specified for the image.

本実施形態の変形例２に係る文字列領域・文字矩形抽出装置１によれば、文字列領域抽出部１５は、ユーザによる入力操作に基づいて、画面に含まれる文字列に対して生成された特定情報を、修正することができる。
図２１に示すように、ユーザは、文字列領域抽出画面ｄｓ２に表示された画面に含まれる文字列に対して特定された始点Ｅｓｔ２の位置を、操作入力部１２によりポインタｐｔ１を操作することによって移動させる。具体的には、ユーザは、例えば、マウスを操作して図２１に点線で示される始点Ｅｓｔ２が表示された位置にポインタｐｔ１を移動させ、始点Ｅｓｔ２をドラッグして、図２１に実線で示される始点Ｅｓｔ１２が表示された位置に移動させる。これにより、文字列領域抽出部１５は、二行目の行の始点の位置を、始点Ｅｓｔ２から始点Ｅｓｔ１２が表示される箇所に修正する。 According to the character string area / character rectangle extraction device 1 according to Modification 2 of the present embodiment, the character string area extraction unit 15 can correct the specific information generated for the character strings included in the screen, based on an input operation by the user.
As shown in FIG. 21, the user moves the position of the start point Est2, specified for a character string included in the screen displayed on the character string area extraction screen ds2, by operating the pointer pt1 with the operation input unit 12. Specifically, for example, the user operates the mouse to move the pointer pt1 to the position where the start point Est2, shown by a dotted line in FIG. 21, is displayed, and then drags the start point Est2 to the position where the start point Est12, shown by a solid line in FIG. 21, is displayed. As a result, the character string area extraction unit 15 corrects the position of the start point of the second line from the start point Est2 to the location where the start point Est12 is displayed.

文字列領域抽出部１５は、ユーザのポインタ入力操作により二行目の行の始点の位置が修正されると、当該修正内容に基づき、三行目以降の行の始点の位置を修正する。具体的には、文字列領域抽出部１５は、一行目の行の始点Ｅｓｔ１のｘ座標値と修正後の二行目の行の始点Ｅｓｔ１２のｘ座標値の差分を算出する。当該ｘ座標値の差分が一行目と二行目の、ｘ軸方向における間隔に相当し、図２１の例では、ｘ軸方向の差分はｄ１である。また、文字列領域抽出部１５は、修正前の二行目の行の始点Ｅｓｔ２と修正後の二行目の行の始点Ｅｓｔ１２とのｙ座標の差分を算出する。当該ｙ座標の差分がｙ軸方向の修正量に相当する。 When the position of the start point of the second line is corrected by the pointer input operation of the user, the character string area extraction unit 15 corrects the position of the start point of the third and subsequent lines based on the corrected content. Specifically, the character string area extraction unit 15 calculates the difference between the x-coordinate value of the start point Est1 of the first row and the x-coordinate value of the start point Est12 of the modified second row. The difference between the x-coordinate values corresponds to the distance between the first row and the second row in the x-axis direction, and in the example of FIG. 21, the difference in the x-axis direction is d1. Further, the character string area extraction unit 15 calculates the difference in y-coordinates between the start point Est2 of the second row before modification and the start point Est12 of the second row after modification. The difference in the y-coordinate corresponds to the correction amount in the y-axis direction.

文字列領域抽出部１５は、ユーザの操作により指定された一行目の行の始点ｓｔ１のｘｙ座標値が（ｘｓｔ１、ｙｓｔ１）、ユーザのポインタ操作により修正された二行目の行の始点Ｅｓｔ１２のｘｙ座標値が（ｘｓｔ１－ｄ１、ｙｓｔ１＋ｙ１）であったとする。つまり、ｘ軸方向に行間の間隔をｄからｄ１とする修正、およびｙ軸の正の方向にｙ１移動させる修正がユーザにより行われたとする。この場合、文字列領域抽出部１５は、三行目の行の始点Ｅｓｔ３のｘｙ座標値を（ｘｓｔ１－２×ｄ１、ｙｓｔ１＋ｙ１）とする。つまり、文字列領域抽出部１５は、三行目の行の始点Ｅｓｔ３の位置を、二行目の行の始点の位置からｘ軸方向にｄ１離れ、ｙ軸方向に変化しない位置とする。文字列領域抽出部１５は、四行目以降の行についても同様に、始点Ｅｓｔ４～Ｅｓｔ８それぞれのｘｙ座標値を（ｘｓｔ１－ｋ×ｄ１、ｙｓｔ１＋ｙ１）とする。ここで、ｋは行番号から１を減算した値である。 Suppose that the xy coordinate value of the start point st1 of the first line specified by the user's operation is (xst1, yst1), and that the xy coordinate value of the start point Est12 of the second line as corrected by the user's pointer operation is (xst1 - d1, yst1 + y1). That is, suppose the user has made a correction that changes the line spacing in the x-axis direction from d to d1 and moves the point by y1 in the positive direction of the y-axis. In this case, the character string area extraction unit 15 sets the xy coordinate value of the start point Est3 of the third line to (xst1 - 2 × d1, yst1 + y1). In other words, the start point Est3 of the third line is placed at a distance d1 in the x-axis direction from the corrected start point of the second line, with no change in the y-axis direction relative to that point. Similarly, for the fourth and subsequent lines, the character string area extraction unit 15 sets the xy coordinate values of the start points Est4 to Est8 to (xst1 - k × d1, yst1 + y1), where k is the line number minus 1.
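The correction rule above can be sketched in Python as follows. The sketch derives the new spacing d1 and the y shift y1 from the user-specified first start point and the corrected second start point, then recomputes the start points of the remaining lines. It assumes vertical text processed right to left; `repropagate_after_correction` is a hypothetical name, not part of the disclosed device.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]


def repropagate_after_correction(
    st1: Point, corrected_st2: Point, n_lines: int
) -> Dict[int, Point]:
    """Return corrected start points for lines 2..n_lines, keyed by line number.

    d1 is the x distance between the first start point and the corrected
    second start point; y1 is the y shift applied by the correction. The
    user-specified first line itself is left unchanged.
    """
    d1 = st1[0] - corrected_st2[0]
    y1 = corrected_st2[1] - st1[1]
    return {
        line_no: (st1[0] - (line_no - 1) * d1, st1[1] + y1)
        for line_no in range(2, n_lines + 1)
    }
```

Because the uncorrected second start point has the same y-coordinate as the first start point, measuring y1 against the first start point is equivalent to measuring it against the second start point before correction.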

表示部１３は、文字列領域抽出部１５が修正した特定情報に基づいて、文字列領域抽出画面ｄｓ３に、始点Ｅｓｔ１１～Ｅｓｔ１８に表示させていた白い方形のアイコンを消去するとともに始点Ｅｓｔ２１～Ｅｓｔ２８に白い方形のアイコンそれぞれを表示する。 The display unit 13 erases the white square icon displayed at the start points Est11 to Est18 on the character string area extraction screen ds3 based on the specific information corrected by the character string area extraction unit 15, and at the start points Est21 to Est28. Display each of the white square icons.

なお、上記においては、文字列領域抽出部１５が始点の位置を修正する例について説明したが、終点についても同様である。文字列領域抽出部１５は、例えば、ユーザのポインタ入力操作により二行目の行の終点の位置が修正されると、当該修正内容に基づき、三行目以降の行の終点の位置を修正する。 Although the above describes an example in which the character string area extraction unit 15 corrects the positions of start points, the same applies to end points. For example, when the position of the end point of the second line is corrected by the user's pointer input operation, the character string area extraction unit 15 corrects the positions of the end points of the third and subsequent lines based on that correction.

このように、文字列領域・文字矩形抽出装置１では、文字列領域抽出部１５は、操作入力部１２から入力される操作入力信号（補助情報）に基づき、特定情報を修正する。ユーザが、例えば、文字列領域抽出部１５により特定された複数の始点または終点のうちの一つの始点の位置をユーザが修正すれば、文字列領域抽出部１５は、その修正内容に応じて残りの始点または終点の位置を修正する。従って、ユーザは一行ごとに始点または終点の位置を修正する必要がない。このため、文字列領域・文字矩形抽出装置１では、ユーザの操作に基づく補助情報を用いて、文字列領域の抽出の精度を高めることができる他、文字列領域抽出部１５が他の行の始点または終点を誤って特定した場合でも、当該誤って特定された始点または終点を一行ごとに修正する手間を省くことができる。 As described above, in the character string area / character rectangular extraction device 1, the character string area extraction unit 15 corrects the specific information based on the operation input signal (auxiliary information) input from the operation input unit 12. If, for example, the user corrects the position of one of the plurality of start points or end points specified by the character string area extraction unit 15, the character string area extraction unit 15 remains according to the correction content. Correct the position of the start point or end point of. Therefore, the user does not need to correct the position of the start point or the end point for each line. Therefore, in the character string area / character rectangle extraction device 1, the accuracy of extracting the character string area can be improved by using the auxiliary information based on the user's operation, and the character string area extraction unit 15 can be used for other lines. Even if the start point or end point is erroneously specified, it is possible to save the trouble of correcting the erroneously specified start point or end point line by line.

また、文字列領域抽出部１５は、ユーザにより特定の行の始点または終点の位置が修正された場合に、他の行の始点または終点を修正するか否かを、選択できるようにしてもよい。そして、文字列領域抽出部１５は、他の行の始点または終点を修正することが選択された場合に、他の行の全ての行の始点または終点を修正するか、または他の行の一部の行の始点または終点を修正するかを、選択できるようにしてもよい。 Further, the character string area extraction unit 15 may be able to select whether or not to correct the start point or end point of another line when the position of the start point or end point of a specific line is corrected by the user. .. Then, when it is selected to modify the start point or end point of another line, the character string area extraction unit 15 corrects the start point or end point of all the lines of the other line, or one of the other lines. It may be possible to select whether to modify the start point or the end point of the line of the part.

文字列領域抽出部１５が、他の行の一部の始点または終点を修正する場合、例えば、画像に七行分の始点および終点の位置が特定され、その中の二行目から四行目までの行の始点を修正する場合について説明する。ユーザは、まず、シフトキーを押下しながら、二行目から四行目までの行の始点近傍の位置に、ポインタｐｔ１を順に移動させ、移動させる度にマウスボタンをクリックする。次に、ユーザは、シフトキーの押下を止め、二行目の行の始点近傍の位置にポインタｐｔ１を移動させてマウスボタンをクリックし、そのまま二行目の行の始点を移動させたい位置までポインタｐｔ１を移動させてドラッグする。このような入力操作がなされることにより、文字列領域抽出部１５は、七行の文字列のうち、二行目の行に対して修正された始点の位置に基づいて、三、四行目のそれぞれの行の始点の位置を修正する。 When the character string area extraction unit 15 modifies the start point or end point of a part of another line, for example, the positions of the start point and end point of seven lines are specified in the image, and the second to fourth lines in the image are specified. The case of correcting the start point of the line up to is described. First, the user moves the pointer pt1 in order to a position near the start point of the lines from the second line to the fourth line while pressing the shift key, and clicks the mouse button each time the pointer is moved. Next, the user stops pressing the shift key, moves the pointer pt1 to a position near the start point of the second line, clicks the mouse button, and then clicks the pointer to the position where the start point of the second line is desired to be moved. Move pt1 and drag. By performing such an input operation, the character string area extraction unit 15 performs the third and fourth lines based on the position of the starting point corrected with respect to the second line of the seven lines of the character string. Correct the position of the start point of each line of.

（文字列領域の修正）
以下、文字列領域抽出部１５が、補助情報に基づいて、自身が抽出した文字列領域を修正する方法について、図面を参照しながら説明する。
図２２は、実施形態に係る文字列領域・文字矩形抽出装置１の表示部１３によって表示される文字列領域抽出画面の一例を示す図である。
(Correction of character string area)
Hereinafter, a method of modifying the character string area extracted by the character string area extraction unit 15 based on the auxiliary information will be described with reference to the drawings.
FIG. 22 is a diagram showing an example of a character string area extraction screen displayed by the display unit 13 of the character string area / character rectangle extraction device 1 according to the embodiment.

図２２に示すように、文字列領域抽出部１５が、画像に含まれる文字列の文字列領域を特定しても、実際の文字列領域と一致するとは限らない。図２２の例では、文字列領域抽出部１５が特定した文字列領域ｓａ１～ｓａ７それぞれの領域には、文字列が記載される領域に、その文字列に付された振り仮名が記載される領域が含まれてしまっている。
図２２に示すように、文字列領域の中に文字列に含まれる文字以外の文字が含まれている場合、後述する文字矩形抽出部１７において、文字矩形の抽出を行うことができない可能性がある。 As shown in FIG. 22, even if the character string area extraction unit 15 specifies the character string area of a character string included in the image, it does not always match the actual character string area. In the example of FIG. 22, each of the character string areas sa1 to sa7 specified by the character string area extraction unit 15 includes, in addition to the area where the character string is written, the area where the furigana (phonetic reading) attached to that character string is written.
As shown in FIG. 22, when a character string area contains characters other than those belonging to the character string, the character rectangle extraction unit 17, described later, may be unable to extract the character rectangles.

本実施形態の変形例２に係る文字列領域・文字矩形抽出装置１によれば、ユーザは、文字列領域抽出部１５によって特定された文字列領域を修正することができる。
図２２に示すように、ユーザは、文字列領域ｓａ１の右端の線を、操作入力部１２によりポインタｐｔ１を操作することによって移動させる。具体的には、ユーザは、例えば、マウスを操作して図２２に点線で示される文字列領域ｓａ１の右側上端の角の近傍の位置にポインタｐｔ１を移動させ、当該角の近傍の位置からドラッグして、図２２に実線で示される文字列領域ｓａ１１の右側上端の角の近傍の位置に移動させる。これにより、文字列領域ｓａ１の領域が、文字列領域ｓａ１１の領域に修正され、文字列領域ｓａ１１には文字列のみが含まれ、文字列の振り仮名が記載された領域が含まれない領域に修正される。 According to the character string area / character rectangle extraction device 1 according to the second modification of the present embodiment, the user can modify the character string area specified by the character string area extraction unit 15.
As shown in FIG. 22, the user moves the right edge of the character string area sa1 by operating the pointer pt1 with the operation input unit 12. Specifically, for example, the user operates the mouse to move the pointer pt1 to a position near the upper-right corner of the character string area sa1, shown by a dotted line in FIG. 22, and drags from there to a position near the upper-right corner of the character string area sa11, shown by a solid line in FIG. 22. As a result, the character string area sa1 is corrected to the character string area sa11, which contains only the character string and excludes the area where its furigana is written.

文字列領域抽出部１５は、文字列領域ｓａ１がユーザの入力操作により修正されると、当該修正内容に基づき、他の文字列領域ｓａ２～ｓａ７それぞれを修正する。具体的には、文字列領域抽出部１５は、修正前の文字列領域ｓａ１の代表点（例えば、領域の右側上端の角）のｘｙ座標値と、修正後の文字列領域ｓａ１１の代表点のｘｙ座標値を比較し、移動量を算出する。文字列領域抽出部１５は、他の文字列領域ｓａ２～ｓａ７それぞれに対し、それぞれの代表点を、算出した移動量だけ移動させることにより、文字列領域ｓａ３～ｓａ７を修正し、それぞれ修正後の文字列領域ｓａ１３～ｓａ１７とする。 When the character string area sa1 is corrected by the user's input operation, the character string area extraction unit 15 corrects each of the other character string areas sa2 to sa7 based on that correction. Specifically, the character string area extraction unit 15 compares the xy coordinate value of a representative point (for example, the upper-right corner) of the character string area sa1 before correction with the xy coordinate value of the representative point of the character string area sa11 after correction, and calculates the amount of movement. For each of the other character string areas sa2 to sa7, the character string area extraction unit 15 moves the representative point by the calculated amount, thereby correcting the character string areas sa3 to sa7 into the corrected character string areas sa13 to sa17, respectively.
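A minimal Python sketch of propagating an area correction is given below. It takes the upper-right corner as the representative point, as in the example in the text, and assumes that moving that corner resizes the rectangle (the right and top edges move while the left and bottom edges stay fixed). That resizing interpretation, the `Rect` layout, and the function names are assumptions made for illustration.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (left, top, right, bottom)


def corner_delta(before: Rect, after: Rect) -> Tuple[float, float]:
    """Movement (dx, dy) of the upper-right corner between two rectangles."""
    return (after[2] - before[2], after[1] - before[1])


def apply_corner_delta(rect: Rect, delta: Tuple[float, float]) -> Rect:
    """Move the upper-right corner of rect by delta, keeping left and bottom fixed."""
    left, top, right, bottom = rect
    dx, dy = delta
    return (left, top + dy, right + dx, bottom)


def correct_other_areas(sa1_before: Rect, sa1_after: Rect, others: List[Rect]) -> List[Rect]:
    """Apply the correction made to the first area to every other area."""
    delta = corner_delta(sa1_before, sa1_after)
    return [apply_corner_delta(r, delta) for r in others]
```

In the furigana example, dragging the right edge of the first area 20 units to the left would narrow every other area by the same 20 units.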

なお、文字列領域抽出部１５は、ユーザにより特定の行に対応する文字列領域が修正された場合に、他の行に対応する文字列領域を修正するか否かを、選択できるようにしてもよい。そして、文字列領域抽出部１５は、他の行に対応する文字列領域を修正することが選択された場合に、他の行に対応する文字列領域の全てを修正するか、または他の行に対応する文字列領域の一部の文字列領域を修正するかを、選択できるようにしてもよい。 Note that the character string area extraction unit 15 may allow selection of whether or not to correct the character string areas corresponding to other lines when the user has corrected the character string area corresponding to a particular line. When correcting the character string areas corresponding to other lines is selected, the character string area extraction unit 15 may further allow selection of whether to correct all of those character string areas or only some of them.

The case in which the character string area extraction unit 15 modifies only some of the other character string areas is described here, taking as an example an image in which character string areas for seven lines have been identified and the character string areas of the second to fourth lines are to be modified. While holding down the Shift key, the user first moves the pointer pt1 in turn to a position inside the character string area of each of the second to fourth lines, clicking the mouse button each time. The user then releases the Shift key, moves the pointer pt1 to a predetermined position in the character string area of the second line (for example, the upper right corner), and clicks the mouse button. From that state, the user drags the pointer pt1 to the position to which that corner of the character string area is to be moved. In response to this input operation, the character string area extraction unit 15 modifies the character string areas of the third and fourth lines, among the seven lines, based on the modification made to the character string area of the second line.
Because the character string area extraction unit 15 can modify only some of a plurality of character string areas, when an image contains both character string areas of character strings without furigana and character string areas of character strings with furigana, for example, only the character string areas of the character strings with furigana can be modified.
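A minimal sketch of modifying only a selected subset of lines is given below. It assumes each character string area is represented just by the xy coordinates of its representative corner, keyed by line number; the function name `modify_selected` and the data layout are hypothetical.

```python
from typing import Dict, Set, Tuple

Point = Tuple[float, float]


def modify_selected(
    corners: Dict[int, Point],
    selected: Set[int],
    edited_line: int,
    new_corner: Point,
) -> Dict[int, Point]:
    """Move the representative corner of every selected line by the edit made to edited_line."""
    if edited_line not in selected:
        raise ValueError("the edited line must be one of the selected lines")
    old_x, old_y = corners[edited_line]
    dx, dy = new_corner[0] - old_x, new_corner[1] - old_y
    # Lines outside the selection are returned unchanged.
    return {
        line: (x + dx, y + dy) if line in selected else (x, y)
        for line, (x, y) in corners.items()
    }


# Seven lines; the second to fourth are selected and the second is dragged.
corners = {n: (100.0, 30.0 * n) for n in range(1, 8)}
result = modify_selected(corners, {2, 3, 4}, edited_line=2, new_corner=(80.0, 60.0))
```

Keeping the selection as a set of line numbers mirrors the Shift-click selection in the text: lines that were not clicked keep their original areas.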

As described above, in the character string area / character rectangle extraction device 1, the character string area extraction unit 15 modifies the character string areas sa1 to sa5 extracted from the image displayed on the character string area extraction screen ds2, based on the operation input signal (auxiliary information) input from the operation input unit 12. For example, even when a character string area extracted by the character string area extraction unit 15 includes an area containing furigana, once the user modifies one character string area (for example, the character string area sa1), the character string area extraction unit 15 modifies the other character string areas (for example, the character string areas sa2 to sa7) according to that modification. The user therefore does not need to modify each character string area individually. Consequently, in addition to the effects described above, the character string area / character rectangle extraction device 1 saves the user the trouble of modifying each of a plurality of character string areas.

(Generation of specific information in the second image)
Hereinafter, a method by which the character string area extraction unit 15 generates specific information for a second image different from the first image will be described with reference to the drawings.
FIG. 23 is a diagram showing an example of the character string area extraction screen displayed by the display unit 13 of the character string area / character rectangle extraction device 1 according to the embodiment.

As shown in FIG. 23, the character string area extraction unit 15 may generate specific information by using the xy coordinate values of the start points and end points identified for the image displayed on the character string area extraction screen ds2 as the xy coordinate values of the start points and end points of the lines of character strings included in the image displayed on another character string area extraction screen ds3, which differs from the character string area extraction screen ds2.
For example, the character string area extraction unit 15 stores in the storage unit 14 specific information indicating the xy coordinate values of the start point and end point of each line of character strings included in the image displayed on the character string area extraction screen ds2. When identifying the positions of the start point and end point of a line of character strings included in the image displayed on the character string area extraction screen ds3, the character string area extraction unit 15 refers to the xy coordinate values of the start points and end points stored in the storage unit 14 for the image displayed on the character string area extraction screen ds2. It then uses each of the referenced xy coordinate values as the xy coordinate values of the start point and end point of the corresponding line in the image displayed on the character string area extraction screen ds3.

Alternatively, the character string area extraction unit 15 may use auxiliary information instead of the specific information. For example, the character string area extraction unit 15 stores in the storage unit 14 the xy coordinate values of the start point and end point designated by the user on the image displayed on the character string area extraction screen ds2. When identifying the positions of the start point and end point of a line of character strings included in the image displayed on the character string area extraction screen ds3, the character string area extraction unit 15 refers to the xy coordinate values of the start point and end point that were designated based on the auxiliary information and stored in the storage unit 14. It then uses the referenced xy coordinate values as the xy coordinate values of the start point and end point for the character string included in the image displayed on the character string area extraction screen ds3.
Then, in the same manner as when it identifies the positions of the start points and end points of the second and subsequent lines in the image displayed on the character string area extraction screen ds2 based on the xy coordinate values of the start point and end point of the first line designated by the user, the character string area extraction unit 15 identifies the positions of the start points and end points of the second and subsequent lines in the image displayed on the character string area extraction screen ds3 based on the xy coordinate values of the start point and end point of the first line.

In the example of FIG. 23, based on the specific information identified by the character string area extraction unit 15, the display unit 13 displays on the character string area extraction screen ds3 a white square icon at each of the start points Est11 to Est18 and a black square icon at each of the end points Eed11 to Eed18.

Alternatively, the character string area extraction unit 15 may use auxiliary information instead of the specific information. For example, the character string area extraction unit 15 stores in the storage unit 14 the auxiliary information designated by the user in the image displayed on the character string area extraction screen ds2. When identifying the positions of the start point and end point of a line of character strings included in the image displayed on the character string area extraction screen ds3, the character string area extraction unit 15 refers to the auxiliary information stored in the storage unit 14 for the image displayed on the character string area extraction screen ds2. Based on that auxiliary information, the character string area extraction unit 15 identifies the positions of the start point and end point of a specific line (for example, the first line) of character strings included in the image displayed on the character string area extraction screen ds3. The character string area extraction unit 15 may then identify the positions of the start points and end points of the other lines in the image displayed on the character string area extraction screen ds3 based on the positional relationship between the identified start point and end point.
Alternatively, the character string area extraction unit 15 may identify the positions of the start points and end points in the image displayed on the character string area extraction screen ds3 based on both the auxiliary information and the specific information for the image displayed on the character string area extraction screen ds2.

As described above, in the character string area / character rectangle extraction device 1 of Modification 2 of the present embodiment, the character string area extraction unit 15 generates specific information about the lines of character strings included in the second image (the image displayed on the character string area extraction screen ds3) based on at least one of the auxiliary information based on the operation input performed on the first image (the image displayed on the character string area extraction screen ds2) and the specific information generated for the character string area extraction screen ds2. As a result, in addition to the effects described above, the character string area / character rectangle extraction device 1 of Modification 2 of the present embodiment can identify the positions of the start points and end points of the lines of character strings included in the second image based on the auxiliary information or specific information that the character string area extraction unit 15 acquired for the first image, so the user does not have to designate the start point and end point for each image.

Note that, because the character string area extraction unit 15 uses the xy coordinate values of the start points and end points of the first image as the xy coordinate values of the start points and end points of the second image, the positions of the start points and end points of the second image do not necessarily coincide with the actual start and end positions of the lines. When the first image and the second image are captured images of book pages, for example, the position at which the character strings begin is likely to differ between the first image and the second image in many cases. In such a case, the user can have the character string area extraction unit 15 modify the specific information as described above. For example, by correcting the positions of the start point and end point of the first line by operating the pointer pt1, the user can correct the positions of the start points and end points of the second and subsequent lines.
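The reuse of start and end points across images, followed by a correction of the first line, can be sketched as below. The dictionary `storage` is a hypothetical stand-in for the storage unit 14, and the function names are illustrative. The correction rule is also an assumption of this sketch: every start point is shifted by the movement of the first line's start point and every end point by the movement of the first line's end point.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Line = Tuple[Point, Point]  # (start point, end point) of one line of text

# Hypothetical stand-in for storage unit 14: image id -> line endpoints.
storage: Dict[str, List[Line]] = {}


def store_lines(image_id: str, lines: List[Line]) -> None:
    """Remember the start and end points identified for an image."""
    storage[image_id] = list(lines)


def reuse_lines(source_id: str) -> List[Line]:
    """Use the stored endpoints of one image as the endpoints for another."""
    return list(storage[source_id])


def correct_lines(lines: List[Line], corrected_first: Line) -> List[Line]:
    """Shift all lines by the correction the user made to the first line."""
    (sx, sy), (ex, ey) = lines[0]
    (nsx, nsy), (nex, ney) = corrected_first
    ds = (nsx - sx, nsy - sy)  # movement of the start point
    de = (nex - ex, ney - ey)  # movement of the end point
    return [
        ((a + ds[0], b + ds[1]), (c + de[0], d + de[1]))
        for (a, b), (c, d) in lines
    ]


# Example values are made up: two vertical lines identified on the first image.
store_lines("ds2", [((10, 10), (10, 200)), ((40, 10), (40, 200))])
ds3_lines = reuse_lines("ds2")
ds3_fixed = correct_lines(ds3_lines, ((12, 15), (12, 190)))
```

Treating start and end movements separately lets the corrected first line be reproduced exactly while the remaining lines follow it.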

(Extraction of character string area in the second image)
Hereinafter, a method by which the character string area extraction unit 15 extracts the character string areas of the character strings included in the second image, based on information about the character string areas extracted from the first image, will be described with reference to the drawings.
FIG. 24 is a diagram showing an example of the character string area extraction screen displayed by the display unit 13 of the character string area / character rectangle extraction device 1 according to the embodiment.

As shown in FIG. 24, the character string area extraction unit 15 may use the character string areas identified for the image displayed on the character string area extraction screen ds2 as the character string areas of the character strings included in the image displayed on another character string area extraction screen ds3, which differs from the character string area extraction screen ds2.
For example, the character string area extraction unit 15 stores in the storage unit 14 information about each of the character string areas extracted in the image displayed on the character string area extraction screen ds2. The information about a character string area includes information by which the character string area can be identified, for example the xy coordinate values of the four corners of the area.
When extracting the character string areas of the character strings included in the image displayed on the character string area extraction screen ds3, the character string area extraction unit 15 refers to the xy coordinate values of the four corners of each of the character string areas (for example, the character string areas sa1 to sa7 in FIG. 20) that were extracted on the character string area extraction screen ds2 and stored in the storage unit 14. The character string area extraction unit 15 then extracts the areas enclosed by the respective xy coordinate values as the character string areas Esa1 to Esa7 for the character strings included in the image displayed on the character string area extraction screen ds3.
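As a rough illustration of extracting "the area enclosed by the xy coordinate values" from a second image, the sketch below crops the bounding box of four stored corners out of an image held as a list of pixel rows. The image representation, the exclusive right and bottom bounds, and the names `bounding_box`, `crop`, and `extract_from_second_image` are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]                      # (x, y)
Corners = Tuple[Point, Point, Point, Point]  # four corners of one area


def bounding_box(corners: Corners) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) enclosing the four corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)


def crop(image: Sequence[Sequence[int]], corners: Corners) -> List[List[int]]:
    """Cut the enclosed area out of an image stored as rows of pixel values."""
    left, top, right, bottom = bounding_box(corners)
    # Right and bottom are treated as exclusive bounds in this sketch.
    return [list(row[left:right]) for row in image[top:bottom]]


def extract_from_second_image(
    stored_areas: List[Corners], second_image: Sequence[Sequence[int]]
) -> List[List[List[int]]]:
    """Apply the areas stored for the first image to the second image."""
    return [crop(second_image, corners) for corners in stored_areas]


# A tiny 4 x 6 "image" whose pixel value encodes row and column.
image = [[r * 10 + c for c in range(6)] for r in range(4)]
areas = [((1, 1), (3, 1), (3, 3), (1, 3))]
patches = extract_from_second_image(areas, image)
```

Because the corner coordinates are applied unchanged, the result is only as good as the alignment between the two images.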

As described above, in the character string area / character rectangle extraction device 1 of Modification 2 of the present embodiment, the character string area extraction unit 15 extracts the character string areas of the character strings included in the second image (the image displayed on the character string area extraction screen ds3) based on information about the character string areas extracted for the first image (the image displayed on the character string area extraction screen ds2). As a result, in addition to the effects described above, the character string area / character rectangle extraction device 1 can extract the character string areas of the character strings included in the second image based on the character string areas that the character string area extraction unit 15 extracted for the first image, saving the user the trouble of performing the operations for extracting character string areas for each image.

Note that, because the character string area extraction unit 15 uses the positions of the character string areas of the first image as the positions of the character string areas of the second image, the positions of the character string areas of the second image do not necessarily coincide with the actual positions of the character string areas. When the first image and the second image are captured images of book pages, for example, the positions of the character string areas are likely to differ between the first image and the second image in many cases. In such a case, the user can have the character string area extraction unit 15 modify the character string areas as described above. For example, by correcting the position of the character string area of the first line by operating the pointer pt1, the user can correct the positions of the character string areas of the second and subsequent lines.

Although the embodiments of the present invention have been described in detail above, the specific configuration is not limited to those described, and various design changes and the like can be made without departing from the gist of the present invention.

Note that part or all of the character string area / character rectangle extraction device 1 in the embodiment described above, Modification 1 of the embodiment, and the character string area / character rectangle extraction device 2 in Modification 2 of the embodiment may be realized by a computer. In that case, a program for realizing this control function may be recorded on a computer-readable recording medium, and the function may be realized by causing a computer system to read and execute the program recorded on the recording medium.

The "computer system" referred to here is a computer system built into the character string area / character rectangle extraction device 1 and the character string area / character rectangle extraction device 2, and includes an OS and hardware such as peripheral devices. The "computer-readable recording medium" refers to a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or a storage device such as a hard disk built into a computer system.

Furthermore, the "computer-readable recording medium" may also include a medium that holds the program dynamically for a short time, such as a communication line used when the program is transmitted via a network such as the Internet or a communication line such as a telephone line, and a medium that holds the program for a certain period of time, such as a volatile memory inside a computer system serving as the server or client in that case. The program may be one that realizes part of the functions described above, or one that realizes those functions in combination with a program already recorded in the computer system.

The character string area / character rectangle extraction device 1 in the embodiment described above and the character string area / character rectangle extraction device 2 in Modification 1 of the embodiment may also be realized as an integrated circuit such as an LSI (Large Scale Integration). Each functional block of the character string area / character rectangle extraction device 1 and the character string area / character rectangle extraction device 2 may be implemented as an individual processor, or some or all of the blocks may be integrated into a single processor. The method of circuit integration is not limited to LSI and may be realized by a dedicated circuit or a general-purpose processor. If advances in semiconductor technology give rise to an integrated circuit technology that replaces LSI, an integrated circuit based on that technology may be used.

1 ... Character string area / character rectangle extraction device, 2 ... Character string area / character rectangle extraction device, 10 ... Control unit, 11 ... Image data acquisition unit, 12 ... Operation input unit, 13 ... Display unit, 14 ... Storage unit, 15 ... Character string area extraction unit, 16 ... Character string area joining unit, 17 ... Character rectangle extraction unit, 18 ... Ruby association unit, 19 ... Management screen generation unit, 26 ... Language analysis unit

Claims

A character string area / character rectangle extraction device comprising:
an image data acquisition unit that acquires image data representing an image containing a character string;
a display unit that displays an image based on the acquired image data;
an operation input unit that accepts operation input from a user;
a character string area extraction unit that extracts a character string area enclosing the entire character string included in a line, based on a start point and an end point of the line consisting of the character string included in the image, the start point and the end point being specified based on auxiliary information based on the operation input; and
a character string area joining unit that joins the extracted character string area and another character string area adjacent to that character string area.

The character string area / character rectangle extraction device according to claim 1, wherein the character string area extraction unit extracts the character string area based on the range in which a connecting line joining the start point and the end point intersects the characters.

The character string area / character rectangle extraction device according to claim 1 or claim 2, wherein the operation input unit accepts the operation input based on operation of a pointer, and the character string area joining unit joins the extracted character string area and another character string area adjacent to that character string area based on the operation of the pointer.

The character string area / character rectangle extraction device according to any one of claims 1 to 3, wherein the character string area extraction unit modifies the extracted character string area based on auxiliary information based on the operation input.

The character string area / character rectangle extraction device according to claim 4, wherein the operation input unit accepts the operation input based on operation of a pointer, and the character string area extraction unit modifies the extracted character string area based on the operation of the pointer.

The character string area / character rectangle extraction device according to claim 1, wherein the character string area joining unit joins the extracted character string area and another character string area adjacent to that character string area based on statistical information on the appearance of character strings.

The character string area / character rectangle extraction device according to any one of claims 1 to 6, further comprising a ruby association unit that associates the character string with the ruby corresponding to the character string based on auxiliary information based on the operation input.

The character string area / character rectangle extraction device according to claim 7, wherein the display unit displays a ruby correspondence area, which is a range enclosing the associated character string and the ruby corresponding to that character string.

The character string area / character rectangle extraction device according to any one of claims 1 to 8, further comprising a character rectangle extraction unit that extracts character rectangles, each representing the rectangle of an individual character constituting the character string.

The character string area / character rectangle extraction device according to claim 9, wherein the character rectangle extraction unit extracts the character rectangle by identifying an appropriate character cutout position from among a plurality of character cutout position candidates based on an evaluation value calculated from character shape information, recognition accuracy in character recognition, and the like.

The character string area / character rectangle extraction device according to claim 9, wherein the character rectangle extraction unit modifies the extracted character rectangle based on auxiliary information based on the operation input.

The character string area / character rectangle extraction device according to claim 11, wherein the operation input unit accepts the operation input based on operation of a pointer, and the character rectangle extraction unit modifies the extracted character rectangle based on the operation of the pointer.

The character string area / character rectangle extraction device according to any one of claims 1 to 12, wherein the display unit displays line numbers, which are numbers assigned in the order in which the character string areas indicating the display target ranges of the character strings included in the lines were extracted, in association with the character string areas.

The character string area / character rectangle extraction device according to claim 13, wherein the display unit displays the image based on the image data and a list display image, which is an image in which the line numbers are listed.

The character string area / character rectangle extraction device according to any one of claims 1 to 14, wherein the character string area extraction unit generates, based on the start point and the end point specified by auxiliary information based on the operation input for a line consisting of a character string included in the image, specific information identifying the positions of the start point and the end point of another line consisting of a character string included in the image, and extracts the character string area of the character string included in the other line based on the start point and the end point identified by the specific information.

The character string area / character rectangle extraction device according to claim 15, wherein the character string area extraction unit modifies the specific information generated for the image based on auxiliary information based on the operation input for at least one of the start point and the end point specified for the image.

The character string area / character rectangle extraction device according to claim 15 or claim 16, wherein the character string area extraction unit modifies the character string area extracted for the image based on auxiliary information based on the operation input for the character string area extracted for the image.

The character string area / character rectangle extraction device according to any one of claims 15 to 17, wherein the character string area extraction unit generates the specific information for a second image different from a first image based on at least one of the auxiliary information based on the operation input performed on the first image and the specific information generated for the first image.

The character string area / character rectangle extraction device according to any one of claims 15 to 18, wherein the character string area extraction unit extracts the character string area for a second image different from a first image based on information about the character string area extracted from the first image.

A character string area / character rectangle extraction method performed by a computer, the method comprising:
an image data acquisition step in which an image data acquisition unit acquires image data representing an image containing a character string;
a display step in which a display unit displays an image based on the acquired image data;
an operation input step in which an operation input unit accepts operation input from a user;
a character string area extraction step in which a character string area extraction unit extracts a character string area enclosing the entire character string included in a line, based on a start point and an end point of the line consisting of the character string included in the image, the start point and the end point being specified based on auxiliary information based on the operation input; and
a character string area joining step in which a character string area joining unit joins the extracted character string area and another character string area adjacent to that character string area.

A program for causing a computer to execute:
an image data acquisition step of acquiring image data representing an image containing a character string;
a display step of displaying an image based on the acquired image data;
an operation input step of accepting operation input from a user;
a character string area extraction step of extracting a character string area enclosing the entire character string included in a line, based on a start point and an end point of the line consisting of the character string included in the image, the start point and the end point being specified based on auxiliary information based on the operation input; and
a character string area joining step of joining the extracted character string area and another character string area adjacent to that character string area.