JP6222541B2

JP6222541B2 - Image processing apparatus and program

Info

Publication number: JP6222541B2
Application number: JP2013042800A
Authority: JP
Inventors: 真太郎安達; 徹也脇山; 洋実北; 勝也小柳; 清水　淳一; 淳一清水; 紘幸岸本
Original assignee: Fuji Xerox Co Ltd; Fujifilm Business Innovation Corp
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2013-03-05
Filing date: 2013-03-05
Publication date: 2017-11-01
Anticipated expiration: 2033-03-05
Also published as: JP2014170452A

Description

The present invention relates to an image processing apparatus and a program.

Patent Document 1 discloses a title extraction apparatus comprising: extraction condition acquisition means for acquiring a keyword character string written near a title character string on a document image, together with position information of the title character string relative to the keyword character string; character recognition means for performing character recognition on a region of the document image that includes at least the title character string and the keyword character string; keyword search means for searching the recognition result of the character recognition means for the keyword character string acquired by the extraction condition acquisition means and acquiring its position; title position acquisition means for acquiring the position of the title character string based on the position of the keyword character string acquired by the keyword search means and the relative position of the title character string with respect to the keyword character string acquired by the extraction condition acquisition means; and title output means for outputting data of the title character string based on the position of the title character string acquired by the title position acquisition means.

Patent Document 2 discloses a form OCR program that causes a computer to execute form OCR processing for recognizing characters in entry fields from a form image obtained by reading a form in which, for a plurality of items, entry fields and preprinted item names are arranged and are each partitioned by ruled lines. The program comprises: a full-page OCR processing step of executing OCR processing on the entire form image and recognizing the positions of the entry frames that partition the entry fields, the positions of the item name frames that surround the item names, and the character strings within these frames; an entry field specifying step of specifying, based on the result information of the full-page OCR processing step, the entry fields on which OCR processing should be executed again; and a partial OCR processing step of executing OCR processing partially on the specified entry fields.

JP 2008-077454 A
JP 2005-173730 A

An object of the present invention is to provide an image processing apparatus and a program capable of improving the accuracy with which a desired character string is extracted from a document image obtained by reading a recording medium.

The invention according to claim 1 is an image processing apparatus comprising: analysis means for analyzing a document image obtained by reading a recording medium; coordinate detection means for detecting, from the analysis result of the analysis means, the coordinates of a plurality of predetermined character strings in the document image; coordinate acquisition means for acquiring the coordinates of a target point of the document image corresponding to a target point of a reference image that corresponds to the document image, based on the coordinates of the plurality of predetermined character strings included in the reference image, the coordinates of the target point included in the reference image, and the plurality of coordinates detected by the coordinate detection means; and character string extraction means for extracting, from the analysis result of the analysis means, a character string in a region whose base point is the coordinates acquired by the coordinate acquisition means.

The invention according to claim 2 is the image processing apparatus according to claim 1, wherein the coordinate detection means detects the coordinates of the plurality of predetermined character strings using regular expressions.

The invention according to claim 3 is the image processing apparatus according to claim 1 or 2, wherein the coordinate acquisition means acquires a magnification of enlargement or reduction between the reference image and the document image using the coordinates of the plurality of predetermined character strings included in the reference image and the coordinates detected by the coordinate detection means, and acquires the coordinates of the target point of the document image based on the magnification.

The invention according to claim 4 is the image processing apparatus according to claim 1 or 2, wherein the coordinate acquisition means acquires a plurality of coordinates of the target point of the document image corresponding to the target point of the reference image, in accordance with the relative coordinates from the coordinates of the plurality of predetermined character strings included in the reference image to the target point included in the reference image, and the character string extraction means extracts a character string included in any one of a plurality of regions whose base points are the respective coordinates acquired by the coordinate acquisition means.

The invention according to claim 5 is the image processing apparatus according to claim 4, wherein the character string extraction means extracts a character string included in all of the plurality of regions whose base points are the respective coordinates acquired by the coordinate acquisition means.

The invention according to claim 6 is the image processing apparatus according to claim 4 or 5, wherein the character string extraction means treats character strings that satisfy a predetermined condition as extraction targets.

The invention according to claim 7 is the image processing apparatus according to claim 1 or 2, wherein the coordinate detection means detects the coordinates of at least three predetermined character strings, and the coordinate acquisition means acquires the coordinates of the target point of the document image based on the coordinates of the at least three predetermined character strings included in the reference image, the coordinates of a target point within the region enclosed by those coordinates, and the plurality of coordinates detected by the coordinate detection means.

The invention according to claim 8 is a program that causes a computer to execute: an analysis step of analyzing a document image obtained by reading a recording medium; a coordinate detection step of detecting, from the analysis result of the analysis step, the coordinates of a plurality of predetermined character strings in the document image; a coordinate acquisition step of acquiring the coordinates of a target point of the document image corresponding to a target point of a reference image that corresponds to the document image, based on the coordinates of the plurality of predetermined character strings included in the reference image, the coordinates of the target point included in the reference image, and the plurality of coordinates detected in the coordinate detection step; and a character string extraction step of extracting, from the analysis result of the analysis step, a character string in a region whose base point is the coordinates acquired in the coordinate acquisition step.

According to the invention of claim 1, it is possible to provide an image processing apparatus that extracts a desired character string from a document image obtained by reading a recording medium with higher accuracy than an apparatus lacking this configuration.

According to the invention of claim 2, in addition to the effect of the invention of claim 1, it is possible to provide an image processing apparatus that efficiently detects the coordinates of the predetermined character strings used to extract the desired character string.

According to the invention of claim 3, in addition to the effect of the invention of claim 1 or 2, it is possible to provide an image processing apparatus that can extract the desired character string even when the document image is enlarged or reduced relative to the original image.

According to the invention of claim 4, in addition to the effect of the invention of claim 1 or 2, it is possible to provide an image processing apparatus that can extract the desired character string even when the positional relationship among the character strings in the document image has broken down from the positional relationship in the original image.

According to the invention of claim 5, in addition to the effect of the invention of claim 4, it is possible to provide an image processing apparatus that extracts the desired character string more reliably than an apparatus lacking this configuration.

According to the invention of claim 6, in addition to the effect of the invention of claim 4 or 5, it is possible to provide an image processing apparatus that extracts the desired character string more reliably than an apparatus lacking this configuration.

According to the invention of claim 7, in addition to the effect of the invention of claim 1 or 2, it is possible to provide an image processing apparatus that can extract the desired character string even when the document image is enlarged or reduced relative to the original image.

According to the invention of claim 8, it is possible to provide a program that extracts a desired character string from a document image obtained by reading a recording medium with higher accuracy than a program lacking this configuration.

A schematic diagram showing the hardware configuration of the image processing apparatus 2 according to an embodiment of the present invention.
A block diagram showing the functional configuration of the image processing apparatus 2 realized by executing a program.
A schematic diagram explaining character string extraction in the embodiment: (a) illustrates a reference image, (b) illustrates the information stored in advance in the reference information storage unit 34, and (c) illustrates a document image.
A flowchart showing the flow of storing reference information in the reference information storage unit 34.
A flowchart illustrating the operation of extracting a character string from a document image.
A schematic diagram explaining character string extraction in a first modification: (a) illustrates a reference image, (b) illustrates the information stored in advance in the reference information storage unit 34, and (c) illustrates a document image.
A flowchart illustrating the operation of extracting a character string from a document image in the first modification.
A schematic diagram explaining character string extraction in a second modification: (a) illustrates a reference image, and (b) shows the position of the target point within the region enclosed by the coordinate points of three use character strings in the reference image.
A schematic diagram explaining character string extraction in the second modification: (a) illustrates a document image, and (b) shows the position of the target point within the region enclosed by the coordinate points of three use character strings in the document image.
A flowchart illustrating the operation of extracting a character string from a document image in the second modification.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.
FIG. 1 is a schematic diagram showing the hardware configuration of an image processing apparatus 2 according to an embodiment of the present invention.

As shown in FIG. 1, the image processing apparatus 2 comprises a main body 8 including a CPU 4 and a memory 6, an input device 10, an output device 12, a storage device 14, a storage medium 16, a communication device 18, and an image reading device 20. The image processing apparatus 2 thus has the components of a computer capable of communicating with other apparatuses.

The CPU 4 executes processing based on a program stored in the memory 6. The storage device 14 is, for example, a built-in HDD, and the storage medium 16 is, for example, a CD, an FD, or an external HDD. The CPU 4 may also execute a program stored in the storage device 14 or the storage medium 16.

The input device 10 is, for example, a keyboard, a mouse, or a joystick, and the output device 12 is a display device such as a display. The input device 10 and the output device 12 may be implemented as a touch panel or the like.

The communication device 18 is a communication device, such as data circuit-terminating equipment, that communicates with other apparatuses via a communication line such as a LAN or the Internet.

The image reading device 20 is an image reading device such as a scanner, and reads a recording medium such as paper as a document image.

FIG. 2 is a block diagram showing the functional configuration of the image processing apparatus 2 realized by executing a program. Part or all of the configuration shown in FIG. 2 may be realized by hardware such as an ASIC or an FPGA.

As shown in FIG. 2, the image processing apparatus 2 has a document image acquisition unit 30, an analysis unit 32, a reference information storage unit 34, a coordinate detection unit 36, a coordinate acquisition unit 38, and a character string extraction unit 40, and performs processing to extract a character string from a document image obtained by reading a recording medium.

The document image acquisition unit 30 acquires document image data obtained by reading a recording medium. In this embodiment, the document image acquisition unit 30 acquires the document image data read by the image reading device 20 from the image reading device 20. However, the document image acquisition unit 30 may instead acquire document image data transmitted via the communication device 18, or read out document image data stored in the storage medium 16. In the following description, "document image" refers to the document image data acquired by the document image acquisition unit 30.

The analysis unit 32 analyzes the document image data acquired by the document image acquisition unit 30. For example, the analysis unit 32 analyzes the document image using an OCR (Optical Character Recognition) function and detects the characters included in the document image and their positions.

The reference information storage unit 34 stores information about the reference image corresponding to the document image acquired by the document image acquisition unit 30. The reference image is a predetermined image corresponding to that document image; examples include the image data from which the document image was formed on the recording medium, and image data serving as a template for the document. Details of the information stored in the reference information storage unit 34 are described later.

The position of a character string within the reference image may differ from the position of the same character string within the document image. For example, if the recording medium is misaligned when the image reading device 20 reads it, the position of the character string in the document image is shifted relative to its position in the reference image. Likewise, if the image has been enlarged or reduced relative to the reference image, the position of the character string in the document image differs from its position in the reference image. In such cases, an attempt to extract the character string from the document image using its absolute coordinates in the reference image (for example, coordinates whose origin is the upper left corner of the image) may fail, because the absolute coordinates of the character string in the document image have shifted.

In this embodiment, when a desired character string is extracted from the document image acquired by the document image acquisition unit 30, the character string to be extracted is specified not by absolute coordinates but by coordinates relative to a plurality of predetermined character strings. In the following description, these predetermined character strings may be referred to as use character strings.

The coordinate detection unit 36 detects, from the analysis result of the analysis unit 32, the coordinates (absolute coordinates) of the plurality of predetermined character strings (use character strings) in the document image acquired by the document image acquisition unit 30. Which character strings serve as use character strings is stored in advance in the reference information storage unit 34.

The coordinate acquisition unit 38 acquires the coordinates of the target point of the document image corresponding to the target point of the reference image, based on the coordinates (absolute coordinates) of the plurality of predetermined character strings included in the reference image corresponding to the document image acquired by the document image acquisition unit 30, the coordinates of the target point included in the reference image, and the plurality of coordinates detected by the coordinate detection unit 36. The target point is a point indicating the position of the character string to be extracted. In this embodiment, the coordinates of the target point included in the reference image are described as coordinates relative to the coordinates of a predetermined character string; however, since relative coordinates can be calculated once absolute coordinates are known, absolute coordinates may be used instead.

The coordinates of the use character strings included in the reference image corresponding to the document image acquired by the document image acquisition unit 30, and the coordinates of the target point included in the reference image, are stored in advance in the reference information storage unit 34.

The character string extraction unit 40 extracts, from the analysis result of the analysis unit 32, the character string in a region whose base point is the target point coordinates acquired by the coordinate acquisition unit 38. In this embodiment, the character string within a rectangular region whose base point is the target point coordinates is extracted. The width and height of the rectangular region are determined from the width and height information stored in advance in the reference information storage unit 34.

Next, character string extraction in this embodiment is described using a specific example.
FIG. 3 is a schematic diagram explaining character string extraction in this embodiment. FIG. 3(a) illustrates a reference image, FIG. 3(b) illustrates the information stored in advance in the reference information storage unit 34, and FIG. 3(c) illustrates a document image. The example assumes that a recording medium on which the reference image was formed has been sent by facsimile, so that the document image, as the received facsimile image, is reduced relative to the reference image and the positions of the character strings are translated as a whole.

In the example shown in FIG. 3, the character string to be extracted is assumed to be "789123", written as the order number, and the plurality of predetermined character strings (use character strings) used to extract it are assumed to be the two strings "Order Form" (注文書) and "Everest" (エベレスト).

In FIG. 3(a), the + marks indicate the coordinate positions of the use character strings, and the frame indicates the region of the character string to be extracted in the reference image. In this example, the upper left corner of the frame is the target point in the reference image, but another corner of the frame may be used as the target point instead.

As shown in FIG. 3(b), the reference information storage unit 34 stores, for example, the coordinates of the use character strings in the reference image, the coordinates of the target point in the reference image, and the width and height of the rectangular region whose base point is that target point.

In the example shown in FIG. 3, the coordinates of the use character string "Order Form" in the reference image are (x1_org, y1_org), the coordinates of the use character string "Everest" in the reference image are (x2_org, y2_org), the relative coordinates of the target point in the reference image (relative to the coordinates of the use character string "Order Form") are (x_org, y_org), the width of the frame in the reference image is w_org, and the height of the frame is h_org.
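The stored reference information described above can be pictured as a small record. The following is a minimal sketch only: the class name `ReferenceInfo`, its field names, and the numeric values are hypothetical and are not taken from the patent; only the kinds of values stored (use character string coordinates, the relative target point, and the frame width and height) follow the text.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class ReferenceInfo:
    """Hypothetical record of the reference information for one reference image."""
    anchors: Dict[str, Point]  # use character string -> absolute (x, y) in the reference image
    base_anchor: str           # use character string the target offset is measured from
    target_offset: Point       # (x_org, y_org), relative to base_anchor
    box_size: Point            # (w_org, h_org) of the rectangular region

    def target_absolute(self) -> Point:
        """Absolute position of the target point in the reference image."""
        ax, ay = self.anchors[self.base_anchor]
        dx, dy = self.target_offset
        return (ax + dx, ay + dy)


# Illustrative values only (not from the patent figures).
reference = ReferenceInfo(
    anchors={"注文書": (100.0, 50.0), "エベレスト": (500.0, 650.0)},
    base_anchor="注文書",
    target_offset=(300.0, 100.0),
    box_size=(120.0, 30.0),
)
```

Storing the target point as an offset from a use character string, rather than as an absolute position, mirrors the embodiment; the absolute position can still be recovered when needed, as `target_absolute` shows.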

Similarly, the coordinates of the use character string "Order Form" in the document image are (x1_img, y1_img), the coordinates of the use character string "Everest" in the document image are (x2_img, y2_img), the relative coordinates of the target point in the document image (relative to the coordinates of the use character string "Order Form") are (x_img, y_img), the width of the frame in the document image is w_img, and the height of the frame is h_img.

In FIG. 3(c), the + marks are projections of the coordinate positions of the use character strings in the reference image, and the × marks indicate the coordinate positions of the use character strings in the document image. The broken-line frame is a projection of the frame in the reference image, and the solid-line frame indicates the region of the character string to be extracted in the document image. Because the document image has been reduced and shifted relative to the reference image, the character string to be extracted, "789123", does not lie within the broken-line frame.

In the example shown in FIG. 3, the coordinate detection unit 36 detects the coordinates (x1_img, y1_img) and (x2_img, y2_img) of the use character strings from the OCR result that the analysis unit 32 produced for the document image.

The coordinate acquisition unit 38 then acquires the relative coordinates (x_img, y_img) of the target point in the document image by, for example, the following calculation formula.

In this way, the coordinate acquisition unit 38 acquires the magnification of enlargement or reduction between the reference image and the document image using the coordinates of the plurality of predetermined character strings (use character strings) included in the reference image and the coordinates detected by the coordinate detection unit 36, and acquires the coordinates of the target point of the document image based on this magnification.
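The calculation formula itself does not appear in this text, so the following Python sketch shows only one plausible way to obtain a magnification from two use character strings and apply it to the target point. It assumes a single uniform scale, taken as the ratio of the distances between the two use character strings in the document image and in the reference image. The function names are hypothetical, and the patent's own formula may differ (for example, it may scale each axis separately).

```python
import math


def magnification(ref_a, ref_b, img_a, img_b):
    """Ratio of the distance between the two use character strings in the
    document image to the same distance in the reference image (assumed uniform scale)."""
    d_org = math.hypot(ref_b[0] - ref_a[0], ref_b[1] - ref_a[1])
    d_img = math.hypot(img_b[0] - img_a[0], img_b[1] - img_a[1])
    if d_org == 0:
        raise ValueError("the two use character strings must have distinct reference coordinates")
    return d_img / d_org


def locate_target(ref_a, ref_b, img_a, img_b, offset_org, size_org):
    """Return ((x_img, y_img), absolute target point, (w_img, h_img)).

    offset_org is (x_org, y_org), measured from the first use character string;
    size_org is (w_org, h_org).
    """
    scale = magnification(ref_a, ref_b, img_a, img_b)
    x_img, y_img = offset_org[0] * scale, offset_org[1] * scale
    w_img, h_img = size_org[0] * scale, size_org[1] * scale
    # Convert the relative offset into absolute document-image coordinates.
    x_abs, y_abs = img_a[0] + x_img, img_a[1] + y_img
    return (x_img, y_img), (x_abs, y_abs), (w_img, h_img)
```

Because the offset is measured from a use character string detected in the document image, a pure translation of the page cancels out, and only the scale has to be estimated.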

The character string extraction unit 40 then extracts, from the OCR result of the analysis unit 32, the character string contained in the region whose base point is (x_img, y_img) in the document image. The width and height of this region are w_img and h_img. The values of w_img and h_img are calculated, for example, by multiplying the width w_org and height h_org of the frame in the reference image by the magnification of the document image relative to the reference image.
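The extraction step can be sketched as a filter over OCR results. This is an illustration under stated assumptions: the `OcrWord` structure and the rule that a word must lie entirely inside the rectangle are hypothetical choices, since the text does not specify the OCR output format or the containment test. The base point is given here in absolute document-image coordinates, that is, the relative point (x_img, y_img) already added to the position of the use character string it is measured from.

```python
from typing import List, NamedTuple


class OcrWord(NamedTuple):
    """Hypothetical OCR result: recognized text plus its bounding box."""
    text: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def extract_in_region(words: List[OcrWord], x: float, y: float, w: float, h: float) -> str:
    """Concatenate the text of every OCR word whose box lies fully inside the
    rectangle with upper left corner (x, y), width w, and height h."""
    inside = [
        wd for wd in words
        if wd.x >= x and wd.y >= y and wd.x + wd.w <= x + w and wd.y + wd.h <= y + h
    ]
    # Read top to bottom, then left to right.
    inside.sort(key=lambda wd: (wd.y, wd.x))
    return "".join(wd.text for wd in inside)
```

A looser rule, such as accepting any word whose box overlaps the rectangle, would tolerate small estimation errors at the cost of occasionally picking up neighboring text.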

In the example of FIG. 3, "Order Form" and "Everest" are the use character strings. Preferably, a use character string is one of the character strings included in the reference image that is also expected to appear within a predetermined region of the document image. For example, character strings already included in the document template data, such as the document title and item names, are preferably selected as use character strings.

また、図３に示した例では、利用文字列として、「注文書」、「エベレスト」といった一意の文字列を用いる例を示したが、正規表現により表された文字列を利用文字列としてもよい。例えば、メタ文字を用いて、３文字の漢字からなる文字列、「注文」の文字列の後に任意の１文字がある文字列などに該当する文字列を利用文字列として指定するようにしてもよい。例えば、座標検出部３６により利用文字列「注文書」の座標を検出する際に、「書」について文字認識ができず、解析部３２の解析結果から「注文書」という文字列を探すことが出来ない場合であっても、利用文字列を「注文書」ではなく、正規表現により「注文」の文字列の後に任意の１文字がある文字列を指定すれば、利用文字列が見つかることとなる。 The example in FIG. 3 uses unique character strings such as “Order Form” and “Everest” as the use character strings, but a character string expressed by a regular expression may also be used. For example, metacharacters may be used to designate as a use character string any string that matches a pattern such as a string of three kanji characters, or the string “注文” (“order”) followed by any single character. Suppose that, when the coordinate detection unit 36 detects the coordinates of the use character string “注文書” (“Order Form”), the character “書” is not recognized, so that the string “注文書” cannot be found in the analysis result of the analysis unit 32. Even in that case, the use character string is still found if it is specified not as “注文書” but, by a regular expression, as “注文” followed by any single character.
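The regular-expression idea above can be sketched in Python as follows. The patterns, the function name `find_use_string`, and the OCR result format (a list of `(text, x, y)` tuples) are assumptions for illustration; the misrecognized string “注文害” is a made-up example of an OCR error.

```python
import re

# "注文" followed by any single character.
ORDER_PLUS_ONE = re.compile(r"注文.")
# Any three characters from the CJK Unified Ideographs block (kanji).
THREE_KANJI = re.compile(r"[\u4e00-\u9fff]{3}")

def find_use_string(pattern, ocr_words):
    """Return the first (text, x, y) whose whole text matches the pattern, else None."""
    for text, x, y in ocr_words:
        if pattern.fullmatch(text):
            return (text, x, y)
    return None

# Hypothetical OCR result in which "書" was misread as "害".
ocr_words = [("エベレスト", 10, 20), ("注文害", 200, 30)]

literal_hit = find_use_string(re.compile(re.escape("注文書")), ocr_words)
regex_hit = find_use_string(ORDER_PLUS_ONE, ocr_words)
kanji_hit = find_use_string(THREE_KANJI, ocr_words)
```

Here the literal search fails, while both regular expressions still locate the misrecognized title and therefore its coordinates.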

図４は、基準情報格納部３４への基準情報の格納の流れを示すフローチャートである。 FIG. 4 is a flowchart showing the flow of storing reference information in the reference information storage unit 34.
ステップ１０（Ｓ１０）において、利用文字列を決定する。例えば、使用者によって、基準画像に含まれる文字列のうちいずれの文字列を利用文字列とするかが指定され、画像処理装置２は、指定された文字列を利用文字列として決定する。なお、本実施形態では、利用文字列としては、複数の文字列が指定される。 In step 10 (S10), a use character string is determined. For example, the user designates which of the character strings included in the reference image is to be used as the use character string, and the image processing apparatus 2 determines the designated character string as the use character string. In the present embodiment, a plurality of character strings are designated as the use character strings.

ステップ１２（Ｓ１２）において、ステップ１０で決定された利用文字列からの抽出対象の文字列への相対座標を指定する。例えば、使用者によって、目標点が指定されることにより、相対座標が決定される。また、使用者によって、目標点を基点とした抽出領域の幅及び高さが指定される。 In step 12 (S12), relative coordinates to the character string to be extracted from the used character string determined in step 10 are designated. For example, relative coordinates are determined by designating a target point by the user. Further, the width and height of the extraction area with the target point as the base point are designated by the user.

ステップ１４（Ｓ１４）において、各利用文字列に対する座標情報と、目標点の相対座標及び目標点を基点とした抽出領域の幅及び高さの情報が、基準情報格納部３４に格納される。 In step 14 (S14), the coordinate information for each character string used, the relative coordinates of the target point, and the width and height information of the extraction area based on the target point are stored in the reference information storage unit 34.
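One possible shape for the reference information stored in steps 10 to 14 is sketched below in Python. The class name `ReferenceInfo`, its field names, and the sample coordinates are hypothetical; the description only says that the use character string coordinates, the relative coordinates of the target point, and the width and height of the extraction area are stored.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

@dataclass(frozen=True)
class ReferenceInfo:
    use_strings: Dict[str, Tuple[float, float]]  # S10: use string -> coordinates in the reference image
    base_string: str                             # use string the relative coordinates are measured from
    target_offset: Tuple[float, float]           # S12: relative coordinates of the target point
    region_size: Tuple[float, float]             # S12: width and height of the extraction area

def build_reference_info(use_strings, base_string, target_point, region_size):
    """Convert a user-designated absolute target point into relative coordinates (S12)
    and bundle everything that is stored in S14."""
    bx, by = use_strings[base_string]
    offset = (target_point[0] - bx, target_point[1] - by)
    return ReferenceInfo(dict(use_strings), base_string, offset, region_size)

info = build_reference_info(
    {"注文書": (200, 40), "エベレスト": (60, 120)},
    base_string="注文書",
    target_point=(260, 150),
    region_size=(120, 30),
)
```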

図５は、文書画像から文字列を抽出する動作について例示するフローチャートである。 FIG. 5 is a flowchart illustrating an operation for extracting a character string from a document image.
ステップ２０（Ｓ２０）において、画像読取装置２０により読み取られた文書画像を文書画像取得部３０が取得する。 In step 20 (S20), the document image acquisition unit 30 acquires the document image read by the image reading device 20.

ステップ２２（Ｓ２２）において、解析部３２が、文書画像に含まれる文字及び文字の位置について解析する。 In step 22 (S22), the analysis unit 32 analyzes characters and character positions included in the document image.

ステップ２４（Ｓ２４）において、基準情報格納部３４に格納されている基準情報を読み込む。 In step 24 (S24), the reference information stored in the reference information storage unit 34 is read.

ステップ２６（Ｓ２６）において、基準情報に定義されている利用文字列が文書画像に存在するか否かが判定され、利用文字列が存在する場合には、ステップ２８へ移行し、存在しない場合には、ステップ３４へ移行する。 In step 26 (S26), it is determined whether or not the use character string defined in the reference information exists in the document image. If the use character string exists, the process proceeds to step 28; if it does not exist, the process proceeds to step 34.

ステップ２８（Ｓ２８）において、座標検出部３６が、文書画像における利用文字列の絶対座標を検出する。 In step 28 (S28), the coordinate detector 36 detects the absolute coordinates of the used character string in the document image.

ステップ３０（Ｓ３０）において、座標取得部３８が、基準情報に定義されている基準画像における利用文字列の座標及び目標点の座標と、ステップ２８で検出した利用文字列の文書画像における座標とから、文書画像における目標点の相対座標を取得する。 In step 30 (S30), the coordinate acquisition unit 38 uses the coordinates of the used character string and the target point in the reference image defined in the reference information, and the coordinates in the document image of the used character string detected in step 28. The relative coordinates of the target point in the document image are acquired.

ステップ３２（Ｓ３２）において、文字列抽出部４０が、ステップ３０で取得した相対座標に基づいて、目標点を基点とした領域内の文字列を、ステップ２２の解析結果から抽出する。 In step 32 (S32), the character string extraction unit 40 extracts the character string in the region based on the target point from the analysis result in step 22 based on the relative coordinates acquired in step 30.

一方、ステップ３４（Ｓ３４）では、相対座標に基づく文字列の抽出が行なえないため、予め定められた絶対座標に基づく領域内の文字列を抽出する。 On the other hand, in step 34 (S34), since character strings cannot be extracted based on relative coordinates, character strings in an area based on predetermined absolute coordinates are extracted.
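The branch between relative-coordinate extraction and the absolute-coordinate fallback (steps 26 to 34) might look like the following Python sketch. It is simplified to a single use character string and an optional magnification. The function names, the `(text, x, y)` OCR format, and the rule that a word belongs to a region when its coordinate point lies inside the rectangle are all assumptions.

```python
def words_in(region, ocr_words):
    """Texts whose coordinate point lies inside region = (x, y, width, height)."""
    x0, y0, w, h = region
    return [t for t, x, y in ocr_words if x0 <= x <= x0 + w and y0 <= y <= y0 + h]

def extract(ocr_words, use_string, ref_offset, region_size, fallback_region, scale=1.0):
    """Sketch of S26 to S34 for one use character string."""
    # S26/S28: look for the use string and take its absolute coordinates.
    anchor = next(((x, y) for t, x, y in ocr_words if t == use_string), None)
    if anchor is None:
        # S34: no anchor found, so fall back to the predetermined absolute region.
        region = fallback_region
    else:
        # S30: target point = anchor + scaled relative coordinates.
        region = (anchor[0] + ref_offset[0] * scale,
                  anchor[1] + ref_offset[1] * scale,
                  region_size[0] * scale,
                  region_size[1] * scale)
    # S32 (or S34): pick the strings inside the region from the OCR result.
    return words_in(region, ocr_words)

# Hypothetical OCR result.
words = [("注文書", 200, 40), ("789123", 260, 150), ("備考", 20, 400)]
found = extract(words, "注文書", (60, 110), (120, 30), (0, 390, 100, 30))
fallback = extract(words, "見積書", (60, 110), (120, 30), (0, 390, 100, 30))
```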

なお、以上説明した実施形態では、利用文字列を複数設けているが、利用文字列を１つとし、当該１つの利用文字列から目標点までの相対距離により文書画像における文字列の抽出領域を求めるようにしてもよい。ただし、利用文字列が１つの場合、文書画像が基準画像に対して拡大又は縮小されたときには、対応できないことがある。 In the embodiment described above, a plurality of use character strings are provided. However, a single use character string may be used instead, with the character string extraction region in the document image obtained from the relative distance from that one use character string to the target point. Note that with only one use character string, the apparatus may be unable to cope when the document image is enlarged or reduced relative to the reference image.

次に、本実施形態の第１の変形例について説明する。上記の実施形態では、基準情報格納部３４には、目標点の座標として、複数の利用文字列のうちいずれかの利用文字列からの相対座標を格納していた。これに対し、第１の変形例では、複数の利用文字列それぞれからの相対座標を格納している。そして、第１の変形例では、座標取得部３８は、基準画像に含まれる複数の利用文字列の座標から基準画像に含まれる目標点への相対座標に応じて、基準画像の目標点に対応する文書画像の目標点の座標を複数取得し、文字列抽出部４０は、座標取得部３８により取得された各座標を基点とした複数の領域のうち、いずれかの領域に含まれる文字列を抽出する。 Next, a first modification of the present embodiment will be described. In the above embodiment, the reference information storage unit 34 stores, as the coordinates of the target point, the relative coordinates from one of the plurality of use character strings. In the first modification, by contrast, it stores the relative coordinates from each of the plurality of use character strings. The coordinate acquisition unit 38 then acquires a plurality of coordinates of the target point of the document image corresponding to the target point of the reference image, according to the relative coordinates from the coordinates of the plurality of use character strings included in the reference image to the target point included in the reference image. The character string extraction unit 40 extracts a character string included in any of the plurality of regions based on the coordinates acquired by the coordinate acquisition unit 38.

図６は、第１の変形例における文字列の抽出について説明する模式図であり、図６（ａ）は、基準画像を例示し、図６（ｂ）は、基準情報格納部３４に予め格納されている情報を例示し、図６（ｃ）は、文書画像を例示している。 FIG. 6 is a schematic diagram for explaining extraction of a character string in the first modification. FIG. 6A illustrates a reference image, and FIG. 6B is stored in the reference information storage unit 34 in advance. FIG. 6C illustrates a document image.

なお、ここでは、例えば文書データのデータ形式が変換されることにより、文書画像における文字列の配置が、基準画像のおける文字列の配列と異なっている場合を例に説明する。例えば、基準画像が、ＰＤＦ(ＰｏｒｔａｂｌｅＤｏｃｕｍｅｎｔＦｏｒｍａｔ)形式の文書データに基づく画像である場合に、当該文書データを他の形式の文書データに変換し、変換後の文書データについて記録媒体に画像形成した場合、この記録媒体を読み取った文書画像の文字列の配置は、基準画像における文字列の配置と異なる場合がある。図６に示した例では、文書画像において、利用文字列として指定されている「注文書」の文字列が基準画像に比べ、左方向にずれている。その結果、基準画像における「注文書」の文字列と「社名」以下の文字列との位置関係と、文書画像における「注文書」の文字列と「社名」以下の文字列との位置関係とは、異なっている。また、図６に示した例では、文書画像における文字列は、さらに、全体的に右下方向にずれている。 Here, a case will be described as an example in which the arrangement of character strings in the document image differs from that in the reference image, for example because the data format of the document data has been converted. For example, when the reference image is based on document data in PDF (Portable Document Format), and that document data is converted into another format and then printed on a recording medium, the arrangement of character strings in the document image read from the recording medium may differ from the arrangement in the reference image. In the example shown in FIG. 6, the character string “Order Form”, designated as a use character string, is shifted to the left in the document image compared with the reference image. As a result, the positional relationship between “Order Form” and the character strings from “Company Name” onward in the reference image differs from the corresponding positional relationship in the document image. In addition, in the example shown in FIG. 6, the character strings in the document image are shifted toward the lower right as a whole.

なお、図６に示した例では、文書画像は、基準画像に対して拡大又は縮小されていないが、第１の変形例として示す処理を拡大又は縮小された文書画像に適用してもよい。 In the example shown in FIG. 6, the document image is not enlarged or reduced with respect to the reference image. However, the process shown as the first modification may be applied to the enlarged or reduced document image.

図６で示した例では、抽出対象の文字列が、受注番号として記載されている「７８９１２３」であるものと仮定し、また、この文字列の抽出のために用いる予め定められた複数の文字列（利用文字列）が、「注文書」、「エベレスト」、「ページ」の３つであるものと仮定する。 In the example shown in FIG. 6, it is assumed that the character string to be extracted is “789123”, written as the order number, and that the plurality of predetermined character strings (use character strings) used to extract it are the three strings “Order Form”, “Everest”, and “Page”.

また、図６（ａ）において、＋印は、利用文字列の座標位置を示し、枠線は、基準画像における抽出対象の文字列の領域を示している。この例では、枠線の左上隅の点が基準画像における目標点である。また、＋印から目標点へと伸びる各矢印は、利用文字列の各座標位置から目標点への相対座標を明示するベクトルを説明のため明示したものである。 In FIG. 6A, each + mark indicates the coordinate position of a use character string, and the frame line indicates the area of the character string to be extracted in the reference image. In this example, the point at the upper left corner of the frame line is the target point in the reference image. Each arrow extending from a + mark to the target point is drawn, for the purpose of explanation, to show the vector of relative coordinates from the coordinate position of that use character string to the target point.

図６（ｂ）に示すように、例えば、基準情報格納部３４は、基準画像における利用文字列の座標と、基準画像における目標点の座標及び当該目標点を基点とした矩形領域の幅及び高さ情報を格納している。ここで、第１の変形例では、複数の利用文字列それぞれからの相対座標を格納している点で上記の実施形態と異なっている。 As shown in FIG. 6B, for example, the reference information storage unit 34 uses the coordinates of the used character string in the reference image, the coordinates of the target point in the reference image, and the width and height of the rectangular area based on the target point. Information is stored. Here, the first modification is different from the above-described embodiment in that the relative coordinates from each of the plurality of used character strings are stored.

また、図６（ｃ）において、×印は、文書画像における利用文字列の座標位置を示している。また、×印から伸びる各矢印は、図６（ａ）に示した矢印（利用文字列の座標位置から目標点への相対座標を示すベクトル）を投影したものであり、矢印により表されるベクトルは、図６（ａ）に示すベクトルと図６（ｃ）に示すベクトルとで等しい。また、図６（ｃ）において、ベクトルの終点は、基準画像の目標点に対応する文書画像の目標点の座標位置を示し、文書画像における、この目標点を基点とする枠線は、図６（ａ）に示した枠線と同じである。このように、図６（ａ）に示した基準画像と図６（ｃ）に示した文書画像とでは、各ベクトル及び枠線は同じであるものの、利用文字列の座標位置が変更されているため、枠線に囲まれる領域は、各々異なっている。 In FIG. 6C, each x mark indicates the coordinate position of a use character string in the document image. Each arrow extending from an x mark is a projection of the corresponding arrow shown in FIG. 6A (the vector of relative coordinates from the coordinate position of the use character string to the target point), so the vectors represented by the arrows are equal in FIG. 6A and FIG. 6C. In FIG. 6C, the end point of each vector indicates the coordinate position of the target point of the document image corresponding to the target point of the reference image, and the frame line based on that target point in the document image is the same as the frame line shown in FIG. 6A. Thus, although the vectors and frame lines are the same in the reference image of FIG. 6A and the document image of FIG. 6C, the coordinate positions of the use character strings have changed, so the areas enclosed by the frame lines differ from one another.

第１の変形例では、座標取得部３８は、座標検出部３６により検出された文書画像における利用文字列の座標と、基準情報格納部３４に格納されている目標点への相対座標とから、文書画像の目標点の利用文字列からの座標を取得する。図６に示した例では、座標取得部３８は、座標検出部３６により検出された文書画像における利用文字列「注文書」の座標と、基準情報格納部３４に格納されている利用文字列「注文書」の基準画像における座標及びこの利用文字列から目標点への相対座標とに基づいて、利用文字列「注文書」からの目標点の座標を取得する。また、座標取得部３８は、同様にして、利用文字列「エベレスト」からの目標点の座標、利用文字列「ページ」からの目標点の座標についても取得する。 In the first modification, the coordinate acquisition unit 38 acquires the coordinates of the target point of the document image as measured from each use character string, using the coordinates of the use character string in the document image detected by the coordinate detection unit 36 and the relative coordinates to the target point stored in the reference information storage unit 34. In the example shown in FIG. 6, the coordinate acquisition unit 38 acquires the coordinates of the target point from the use character string “Order Form” based on the coordinates of “Order Form” in the document image detected by the coordinate detection unit 36, the coordinates of “Order Form” in the reference image stored in the reference information storage unit 34, and the relative coordinates from this use character string to the target point. In the same way, the coordinate acquisition unit 38 also acquires the coordinates of the target point from the use character string “Everest” and from the use character string “Page”.

また、第１の変形例では、文字列抽出部４０は、座標取得部３８により取得された各座標を基点とした複数の領域のうち、いずれかの領域に含まれる文字列を抽出する。図６に示した例では、まず、座標取得部３８により取得された利用文字列「注文書」についての座標を基点とした領域の指定が行なわれる。なお、領域の指定は、基準情報格納部３４に格納されている、矩形領域の幅及び高さ情報を用いて矩形領域を定めることにより行なわれる。同様にして、座標取得部３８により取得された他の利用文字列の座標を基点とした領域についても指定される。なお、指定された各領域は、図６（ｃ）では、各枠線内の領域として表される。次に、文字列抽出部４０は、例えば、指定された全ての領域に含まれる文字列を抽出対象の文字列として抽出する。 In the first modification, the character string extraction unit 40 extracts a character string included in any of the plurality of regions based on the coordinates acquired by the coordinate acquisition unit 38. In the example shown in FIG. 6, a region is first designated with the coordinates acquired by the coordinate acquisition unit 38 for the use character string “Order Form” as its base point. The region is designated by defining a rectangular area using the width and height information of the rectangular area stored in the reference information storage unit 34. Regions based on the coordinates acquired for the other use character strings are designated in the same way. Each designated region appears in FIG. 6C as the area inside a frame line. Next, the character string extraction unit 40 extracts, for example, the character strings included in all of the designated regions as the character strings to be extracted.

なお、文字列抽出部４０は、このように指定された全ての領域に含まれる文字列を抽出対象の文字列として抽出してもよいが、予め定められた閾値以上の個数の領域に含まれる文字列を抽出対象の文字列として抽出するよう構成してもよい。 Note that the character string extraction unit 40 may extract character strings included in all areas specified in this way as extraction target character strings, but are included in a number of areas equal to or greater than a predetermined threshold. A character string may be extracted as a character string to be extracted.

また、文字列抽出部４０は、予め定められた条件を満たす文字列を抽出対象としてもよい。例えば、文字列が、数字、アルファベットなどの予め定められた文字の種別であるもののみを抽出対象とするようにしてもよいし、形態素解析などを行なうことにより領域内の文字列の内容を解析し、文字列が予め定められた内容（例えば、住所を示す文字列、氏名を示す文字列など）であるもののみを抽出対象とするようにしてもよい。 The character string extraction unit 40 may extract a character string that satisfies a predetermined condition. For example, it may be possible to extract only a character string having a predetermined character type such as a number or alphabet, or analyze the contents of the character string in the region by performing morphological analysis or the like. Alternatively, only a character string having a predetermined content (for example, a character string indicating an address or a character string indicating a name) may be extracted.
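The selection rules described for the first modification (strings contained in all regions, in at least a threshold number of regions, or matching a character-type condition) can be sketched in Python as follows. The function name `vote`, its parameters, the point-in-rectangle membership rule, and the sample data are assumptions for illustration.

```python
import re
from collections import Counter

def vote(regions, ocr_words, min_regions=None, pattern=None):
    """Keep the OCR strings that fall inside at least min_regions candidate regions.

    regions: list of (x, y, width, height), one per use character string.
    min_regions=None means "all regions". pattern optionally restricts the
    character type, e.g. digits only."""
    needed = len(regions) if min_regions is None else min_regions
    counts = Counter()
    for x0, y0, w, h in regions:
        for text, x, y in ocr_words:
            if x0 <= x <= x0 + w and y0 <= y <= y0 + h:
                counts[(text, x, y)] += 1
    hits = [text for (text, _, _), n in counts.items() if n >= needed]
    if pattern is not None:
        hits = [text for text in hits if re.fullmatch(pattern, text)]
    return hits

# Hypothetical regions: two agree closely, one is far off because its anchor moved.
regions = [(250, 140, 120, 30), (255, 145, 120, 30), (400, 140, 120, 30)]
words = [("789123", 260, 150), ("受注番号", 256, 146), ("東京", 410, 150)]

in_all = vote(regions, words)
in_two = vote(regions, words, min_regions=2)
digits = vote(regions, words, min_regions=2, pattern=r"[0-9]+")
```

In this made-up layout no string lies in all three regions, so the strict rule returns nothing, while the threshold rule combined with the digits-only condition isolates the order number.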

図７は、第１の変形例において、文書画像から文字列を抽出する動作について例示するフローチャートである。ここで、ステップ２０〜ステップ２４については、図４におけるフローチャートと同様なので、ステップ３０以降について説明する。 FIG. 7 is a flowchart illustrating an operation of extracting a character string from a document image in the first modification. Here, since Step 20 to Step 24 are the same as those in the flowchart in FIG. 4, Step 30 and subsequent steps will be described.

ステップ３０（Ｓ３０）において、基準情報に定義されている利用文字列が文書画像に存在するか否かが判定され、利用文字列が存在する場合には、ステップ３２へ移行し、存在しない場合には、ステップ３８へ移行する。 In step 30 (S30), it is determined whether or not the use character string defined in the reference information exists in the document image. If the use character string exists, the process proceeds to step 32; if it does not exist, the process proceeds to step 38.

ステップ３２（Ｓ３２）において、座標検出部３６が、文書画像における利用文字列の絶対座標を検出する。 In step 32 (S32), the coordinate detection unit 36 detects the absolute coordinates of the used character string in the document image.

ステップ３４（Ｓ３４）において、座標取得部３８が、ステップ３２で検出された文書画像における利用文字列の座標と、基準情報格納部３４に格納されている目標点への相対座標とから、文書画像の目標点の当該利用文字列からの相対座標を取得する。 In step 34 (S34), the coordinate acquisition unit 38 calculates the document image from the coordinates of the used character string in the document image detected in step 32 and the relative coordinates to the target point stored in the reference information storage unit 34. The relative coordinates of the target point of the target character string are obtained.

ステップ３６（Ｓ３６）において、文字列抽出部４０が、ステップ３４により取得された座標を基点とした領域内の文字列をステップ２２の解析結果から抽出する。 In step 36 (S 36), the character string extraction unit 40 extracts the character string in the region based on the coordinates acquired in step 34 from the analysis result of step 22.

ステップ３８（Ｓ３８）において、基準情報に他の利用文字列が定義されているか否かを判定し、他の利用文字列が定義されている場合には、ステップ３０へと戻り、当該他の利用文字列について、上記の処理がなされる。一方、定義されている全ての利用文字列について上記の処理がなされた場合には、ステップ４０へと移行する。 In step 38 (S38), it is determined whether or not another use character string is defined in the reference information. If another use character string is defined, the process returns to step 30 and the other use character string is defined. The above processing is performed on the character string. On the other hand, if the above processing has been performed for all defined use character strings, the process proceeds to step 40.

ステップ４０（Ｓ４０）において、ステップ３０において文書画像中に存在すると判定された利用文字列があるか否かが判定され、１つ以上の利用文字列について文書画像中に存在するとステップ３０で判定されている場合には、ステップ４２へ移行し、文書画像中に存在すると判定された利用文字列が１つもなかった場合には、ステップ４４へ移行する。 In step 40 (S40), it is determined whether or not there is a use character string determined to be present in the document image in step 30, and it is determined in step 30 that one or more use character strings exist in the document image. If there is no character string to be used that is determined to exist in the document image, the process proceeds to step 44.

ステップ４２（Ｓ４２）において、文字列抽出部４０は、ステップ３６で抽出された文字列の中から、文字列を決定する。例えば、上述のように、文字列抽出部４０は、指定された全ての領域に含まれる文字列を抽出対象の文字列として決定する。なお、ステップ３６で抽出された文字列のいずれも、条件を満たさない（例えば、指定された一部の領域にしか含まれていない等）場合、ステップ４４へ移行するようにしてもよい。 In step 42 (S42), the character string extraction unit 40 determines a character string from the character strings extracted in step 36. For example, as described above, the character string extraction unit 40 determines character strings included in all designated areas as extraction target character strings. If none of the character strings extracted in step 36 satisfies the condition (for example, it is included only in a part of the designated area), the process may proceed to step 44.

一方、ステップ４４（Ｓ４４）では、相対座標に基づく文字列の抽出が行なえないため、予め定められた絶対座標に基づく領域内の文字列を抽出する。 On the other hand, in step 44 (S44), since character strings cannot be extracted based on relative coordinates, character strings in an area based on predetermined absolute coordinates are extracted.

なお、以上説明したフローチャートにおいて、ステップ４０では、文書画像中に存在するとステップ３０で判定された利用文字列が１つ以上あれば、ステップ４２へ移行するものとして説明したが、１つに限らず、２以上としてもよい。 In the flowchart described above, in step 40, it is described that if there is one or more character strings determined in step 30 as being present in the document image, the process proceeds to step 42. However, the number is not limited to one. It is good also as 2 or more.

次に、本実施形態の第２の変形例について説明する。第２の変形例では、座標検出部３６は、少なくとも３つの利用文字列の座標を検出し、座標取得部３８は、基準情報格納部３４に格納された、基準画像に含まれる少なくとも３つの利用文字列の座標及び該座標により囲まれる領域内における目標点の座標と、座標検出部３６により検出された複数の利用文字列の座標とに基づいて、文書画像の目標点の座標を取得する。 Next, a second modification of the present embodiment will be described. In the second modification, the coordinate detection unit 36 detects the coordinates of at least three usage character strings, and the coordinate acquisition unit 38 stores at least three usages included in the reference image stored in the reference information storage unit 34. The coordinates of the target point of the document image are acquired based on the coordinates of the character string, the coordinates of the target point in the area surrounded by the coordinates, and the coordinates of the plurality of used character strings detected by the coordinate detection unit 36.

図８は、第２の変形例における文字列の抽出について説明する模式図であり、図８（ａ）は、基準画像を例示し、図８（ｂ）は、基準画像における３つの利用文字列の座標点に囲まれた領域内の目標点の位置関係を図示している。 FIG. 8 is a schematic diagram for explaining extraction of a character string in the second modified example, FIG. 8A illustrates a reference image, and FIG. 8B illustrates three use character strings in the reference image. The positional relationship of the target points within the area surrounded by the coordinate points is shown.

図８（ａ）において、＋印は、利用文字列の座標位置を示し、丸印は目標点を示し、枠線は、目標点を基点とした領域であり、基準画像における抽出対象の文字列の領域を示している。このように、第２の実施形態では、目標点を囲むように少なくとも３つの利用文字列を用いる。 In FIG. 8A, the + mark indicates the coordinate position of the used character string, the circle mark indicates the target point, and the frame line is an area having the target point as a base point, and the character string to be extracted in the reference image Shows the area. Thus, in the second embodiment, at least three use character strings are used so as to surround the target point.

第２の変形例で、基準情報格納部３４には、基準画像における少なくとも３つの利用文字列の座標と、基準画像における目標点の座標情報及び当該目標点を基点とした矩形領域の幅及び高さ情報が格納される。例えば、基準画像における目標点の座標情報としては、例えば、図８（ｂ）に示されるように、利用文字列の座標点により形成される三角形に対する目標点の相対位置を示す座標が格納される。なお、図８（ｂ）に示す例では、三角形のいずれか一辺に向けた目標点からの垂線による当該一辺の内分点の位置（図中に示される、垂線の交点から頂点までの距離ａ及びｂ）と、当該一辺の一方の頂点から目標点を通るように他の辺に向けた直線による当該他の辺の内分点の位置（図中に示される、直線の交点から頂点までの距離ｃ及びｄ）とを基準情報格納部３４は目標点の座標情報として格納する。 In the second modification, the reference information storage unit 34 includes the coordinates of at least three used character strings in the reference image, the coordinate information of the target point in the reference image, and the width and height of the rectangular area based on the target point. Information is stored. For example, as the coordinate information of the target point in the reference image, for example, as shown in FIG. 8B, coordinates indicating the relative position of the target point with respect to the triangle formed by the coordinate points of the use character string are stored. . In the example shown in FIG. 8B, the position of the internal dividing point of the one side by the perpendicular from the target point toward any one side of the triangle (the distance a from the intersection of the perpendicular to the apex shown in the figure) And b), and the position of the internal dividing point of the other side by a straight line toward the other side so as to pass through the target point from one vertex of the one side (shown in the figure from the intersection of the straight line to the vertex) The reference information storage unit 34 stores the distances c and d) as the coordinate information of the target point.

図９は、第２の変形例における文字列の抽出について説明する模式図であり、図９（ａ）は、文書画像を例示し、図９（ｂ）は、文書画像における３つの利用文字列の座標点に囲まれた領域内の目標点の位置関係を図示している。ここでは、文書画像が、基準画像に比べて、縮小され、かつ、全体的に文字列の位置が平行移動している場合を例に説明する。 FIG. 9 is a schematic diagram for explaining extraction of a character string in the second modified example, FIG. 9A illustrates a document image, and FIG. 9B illustrates three used character strings in the document image. The positional relationship of the target points within the area surrounded by the coordinate points is shown. Here, an example will be described in which the document image is reduced as compared with the reference image and the position of the character string is moved in parallel as a whole.

図９（ａ）において、×印は文書画像における利用文字列の座標位置を示し、丸印は文書画像における目標点を示し、枠線は、目標点を基点とした領域であり、文書画像における抽出対象の文字列の領域を示している。第２の変形例では、図９（ｂ）に示すように、上記垂線及び直線の辺の内分比率が図８（ｂ）と同様になるよう、垂線の交点（図中に示される、垂線の交点から頂点までの距離ａ'及びｂ'）と、直線の交点（図中に示される、直線の交点から頂点までの距離ｃ'及びｄ'）が定められ文書画像における目標点の位置が求められる。座標取得部３８は、このように目標点の座標を計算し、文字列抽出部４０は、上述の実施形態と同様、目標点を基点とした枠線内の文字列を抽出する。 In FIG. 9A, each x mark indicates the coordinate position of a use character string in the document image, the circle indicates the target point in the document image, and the frame line is the area based on the target point, that is, the area of the character string to be extracted in the document image. In the second modification, as shown in FIG. 9B, the intersection of the perpendicular (with distances a' and b' from the intersection to the vertices, as shown in the figure) and the intersection of the straight line (with distances c' and d' from the intersection to the vertices) are determined so that the ratios in which the perpendicular and the straight line internally divide their sides are the same as in FIG. 8B, and the position of the target point in the document image is obtained from them. The coordinate acquisition unit 38 calculates the coordinates of the target point in this way, and the character string extraction unit 40 extracts the character string within the frame line based on the target point, as in the embodiment described above.
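The construction with internal division ratios can be sketched in Python as follows, assuming the side receiving the perpendicular is AB and the straight line runs from vertex A through the target point to side BC. The function names, the vertex labelling, and the coordinates are hypothetical, and each ratio is stored as a fraction (for example a/(a+b)) rather than as a pair of distances. The reconstruction is exact when the document triangle is a scaled and translated copy of the reference triangle, which is the case described for FIG. 9.

```python
def _lerp(p, q, t):
    """Point dividing segment pq at fraction t measured from p."""
    return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))

def encode_target(a, b, c, target):
    """Reference image: return (t_ab, t_bc).

    t_ab: fraction along AB of the foot of the perpendicular from the target.
    t_bc: fraction along BC where the line from A through the target meets BC."""
    abx, aby = b[0] - a[0], b[1] - a[1]
    dx, dy = target[0] - a[0], target[1] - a[1]
    t_ab = (dx * abx + dy * aby) / (abx * abx + aby * aby)
    ex, ey = c[0] - b[0], c[1] - b[1]
    det = ex * dy - dx * ey
    if det == 0:
        raise ValueError("line from A through the target is parallel to BC")
    t_bc = (dx * aby - abx * dy) / det
    return t_ab, t_bc

def decode_target(a, b, c, t_ab, t_bc):
    """Document image: rebuild the target from the stored fractions.

    The target is where the line from A to the division point on BC crosses
    the perpendicular to AB erected at the division point on AB."""
    foot = _lerp(a, b, t_ab)
    q = _lerp(b, c, t_bc)
    ux, uy = q[0] - a[0], q[1] - a[1]
    vx, vy = b[0] - a[0], b[1] - a[1]
    denom = ux * vx + uy * vy
    if denom == 0:
        raise ValueError("degenerate configuration")
    s = ((foot[0] - a[0]) * vx + (foot[1] - a[1]) * vy) / denom
    return (a[0] + s * ux, a[1] + s * uy)

# Hypothetical reference triangle and target point.
ratios = encode_target((0, 0), (10, 0), (0, 10), (2, 3))
# Document image: the same layout reduced to 50% and shifted by (5, 7).
target_img = decode_target((5, 7), (10, 7), (5, 12), *ratios)
```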

図１０は、第２の変形例において、文書画像から文字列を抽出する動作について例示するフローチャートである。ここで、ステップ２０〜ステップ２４については、図４におけるフローチャートと同様なので、ステップ４０以降について説明する。 FIG. 10 is a flowchart illustrating an operation of extracting a character string from a document image in the second modification. Here, since Step 20 to Step 24 are the same as those in the flowchart in FIG. 4, Step 40 and subsequent steps will be described.

ステップ４０（Ｓ４０）において、基準情報に定義されている３つの利用文字列が文書画像に存在するか否かが判定され、利用文字列が存在する場合には、ステップ４２へ移行し、存在しない場合には、ステップ４８へ移行する。 In step 40 (S40), it is determined whether or not the three use character strings defined in the reference information exist in the document image. If the use character strings exist, the process proceeds to step 42; if they do not exist, the process proceeds to step 48.

ステップ４２（Ｓ４２）において、座標検出部３６が、文書画像における各利用文字列の絶対座標を検出する。 In step 42 (S42), the coordinate detector 36 detects the absolute coordinates of each character string used in the document image.

ステップ４４（Ｓ４４）において、座標取得部３８が、ステップ４２で検出された文書画像における利用文字列の座標と、基準情報格納部３４に格納されている基準画像の利用文字列の座標及び目標点の座標情報とから、文書画像の目標点の座標を取得する。 In step 44 (S44), the coordinate acquisition unit 38 uses the coordinates of the used character string in the document image detected in step 42, the coordinates of the used character string of the reference image stored in the reference information storage unit 34, and the target point. The coordinate of the target point of the document image is acquired from the coordinate information of the document image.

ステップ４６（Ｓ４６）において、文字列抽出部４０が、ステップ４４により取得された座標を基点とした領域内の文字列をステップ２２の解析結果から抽出する。 In step 46 (S 46), the character string extraction unit 40 extracts a character string in the region based on the coordinates acquired in step 44 from the analysis result in step 22.

一方、ステップ４８（Ｓ４８）では、予め定められた絶対座標に基づく領域内の文字列を抽出する。 On the other hand, in step 48 (S48), a character string in an area based on predetermined absolute coordinates is extracted.

以上、図８及び図９に示した例では、１つの目標点と、利用文字列の座標点からなる１つの三角形領域とを用いて説明したが、利用文字列を４つ以上設定し、目標点を囲む複数の異なる三角形領域を用いてもよい。また、複数の異なる三角形領域において、各三角形領域内の目標点は、枠線上の同じ頂点であってもよいし、異なる頂点であってもよい。このように、利用文字列を４つ以上設定し、目標点を囲む複数の異なる三角形領域を用いることにより、利用文字列として３つだけを設定する場合に比べ、ステップ４０において「有」と判定されやすくなることが期待される。 As described above, in the examples shown in FIGS. 8 and 9, the description has been made using one target point and one triangular area composed of the coordinate points of the used character string. A plurality of different triangular regions surrounding a point may be used. Further, in a plurality of different triangular regions, the target points in each triangular region may be the same vertex on the frame line, or may be different vertices. In this way, when four or more use character strings are set and a plurality of different triangular regions surrounding the target point are used, compared with the case where only three use character strings are set, “Yes” is determined in step 40. It is expected to become easier.

２画像処理装置 2 Image processing apparatus
４ＣＰＵ 4 CPU
６メモリ 6 Memory
２０画像読取装置 20 Image reading device
３０文書画像取得部 30 Document image acquisition unit
３２解析部 32 Analysis unit
３４基準情報格納部 34 Reference information storage unit
３６座標検出部 36 Coordinate detection unit
３８座標取得部 38 Coordinate acquisition unit
４０文字列抽出部 40 Character string extraction unit

Claims

An image processing apparatus comprising:
analysis means for analyzing a document image obtained by reading a recording medium;
coordinate detection means for detecting coordinates of a plurality of predetermined character strings in the document image from the analysis result by the analysis means;
coordinate acquisition means for acquiring the coordinates of the target point of the document image corresponding to the target point of the reference image, based on the coordinates of the plurality of predetermined character strings included in the reference image corresponding to the document image, the coordinates of the target point included in the reference image, and the plurality of coordinates detected by the coordinate detection means; and
character string extraction means for extracting, from the analysis result by the analysis means, a character string in a region based on the coordinates acquired by the coordinate acquisition means,
wherein the coordinate acquisition means uses the coordinates of the plurality of predetermined character strings included in the reference image and the coordinates of the corresponding plurality of character strings in the document image detected by the coordinate detection means to acquire a magnification of enlargement or reduction between the reference image and the document image, and acquires the coordinates of the target point of the document image based on the magnification.

The image processing apparatus according to claim 1, wherein the coordinate detection unit detects coordinates of a plurality of predetermined character strings using a regular expression.

The image processing apparatus according to claim 1, wherein the coordinate acquisition means acquires a plurality of coordinates of the target point of the document image corresponding to the target point of the reference image, according to the relative coordinates from the coordinates of each of the plurality of predetermined character strings included in the reference image to the target point included in the reference image, and
the character string extraction means extracts a character string included in any of a plurality of regions each based on one of the coordinates acquired by the coordinate acquisition means.

The image processing apparatus according to claim 3, wherein the character string extraction unit extracts a character string included in all of a plurality of regions with each coordinate acquired by the coordinate acquisition unit as a base point.

The image processing apparatus according to claim 3, wherein the character string extraction unit extracts a character string that satisfies a predetermined condition.

The image processing apparatus according to claim 1, wherein the coordinate detection means detects coordinates of at least three predetermined character strings, and
the coordinate acquisition means acquires the coordinates of the target point of the document image based on the coordinates of the at least three predetermined character strings included in the reference image, the coordinates of a target point in an area surrounded by those coordinates, and the plurality of coordinates detected by the coordinate detection means.

A program for causing a computer to execute:
an analysis step of analyzing a document image obtained by reading a recording medium;
a coordinate detection step of detecting coordinates of a plurality of predetermined character strings in the document image from the analysis result in the analysis step;
a coordinate acquisition step of acquiring the coordinates of the target point of the document image corresponding to the target point of the reference image, based on the coordinates of the plurality of predetermined character strings included in the reference image corresponding to the document image, the coordinates of the target point included in the reference image, and the plurality of coordinates detected in the coordinate detection step; and
a character string extraction step of extracting, from the analysis result in the analysis step, a character string in a region based on the coordinates acquired in the coordinate acquisition step,
wherein the coordinate acquisition step uses the coordinates of the plurality of predetermined character strings included in the reference image and the coordinates of the corresponding plurality of character strings in the document image detected in the coordinate detection step to acquire a magnification of enlargement or reduction between the reference image and the document image, and acquires the coordinates of the target point of the document image based on the magnification.