JP2018085093A

JP2018085093A - Information processing apparatus, control method, and program

Info

Publication number: JP2018085093A
Application number: JP2017147464A
Authority: JP
Inventors: Hiroshi Kawaguchi (川口 容); Shinichi Miura (三浦 新一); Takafumi Shirahase (白波瀬 孝文)
Original assignee: Canon Marketing Japan Inc
Current assignee: Canon Marketing Japan Inc
Priority date: 2016-11-17
Filing date: 2017-07-31
Publication date: 2018-05-31
Anticipated expiration: 2037-07-31
Also published as: JP6947971B2

Abstract

PROBLEM TO BE SOLVED: To provide a mechanism capable of recognizing whether an item for which a value has been acquired, or an item that caused an error when its value was being acquired, is an item that should be acquired. SOLUTION: The apparatus registers acquisition conditions including a key character string and a positional condition beforehand, and specifies an area on the basis of the key character string and the positional condition. If an attempt to acquire a character string from the specified area fails, the apparatus determines whether a character string has been acquired from another area in the same row or column. If it is determined that a character string has been successfully acquired there, the apparatus determines that the specified area is an area from which a value should be acquired. Further, the apparatus determines whether an area from which a value has been acquired is an area from which the value should be acquired. SELECTED DRAWING: Figure 13

Description

The present invention relates to an image processing technique that makes it easy to acquire the value at a desired location in a document image.

Various image processing techniques have been disclosed. Among them, methods have been proposed in which a form such as a report or a slip is captured with a scanner, and the values of required items are recognized and acquired from the captured data using OCR (Optical Character Recognition) technology.

In particular, for an item that needs to be acquired, it is important to determine whether its value was acquired from the location where it should be acquired.

Patent Literature 1 describes a method that determines, for a specified item name, whether information is required information by comparing the position at which the item name being searched for is displayed with the positions at which other item names of the same category are displayed.

JP 2014-186435 A

However, the method described in Patent Literature 1 determines whether information is required from the differences between the display positions of a plurality of predefined item names (character strings). It therefore cannot be applied to items that cannot be defined in advance, such as test values.

The method described in Patent Literature 1 also cannot handle non-standard forms in which the categorization of the items on the form changes.

When item values are acquired from a form or the like, indicating whether an item that could not be acquired because of an error was an item that should have been acquired is particularly useful for error correction.

In addition, indicating whether each item is an item to be acquired, based on the result of one acquisition pass over a form or the like, is effective when specifying the items whose values should be acquired from a non-standard form.

Accordingly, an object of the present invention is to provide a mechanism that makes it possible to recognize whether an item for which a value has been acquired, or an item that caused an error when its value was being acquired, is an item that should be acquired.

The present invention is an information processing apparatus including acquisition condition storage means for storing a key character string and a position condition, the apparatus comprising: specifying means for specifying an area based on the key character string and the position condition; acquisition means for acquiring a character string from the specified area; determination means for determining, when the acquisition means could not acquire a character string, whether a character string was acquired in another area in the same alignment as the specified area; and judging means for judging, when the determination means determines that a character string was acquired, that the specified area was an area from which a value should be acquired.

According to the present invention, it is possible to recognize whether an item for which a value has been acquired, or an item that caused an error when its value was being acquired, is an item that should be acquired.

FIG. 1 is a diagram showing a configuration example of the image processing system.
FIG. 2 is a block diagram showing the schematic configuration of the PC 201.
FIG. 3 is a flowchart outlining the image processing executed by the image processing system.
FIG. 4 is a flowchart of the value acquisition processing.
FIG. 5 is a data diagram showing an example of block information.
FIG. 6 is a data diagram showing an example of character recognition area information.
FIG. 7 is a data diagram showing an example of a setting file.
FIG. 8 is a data diagram showing an example of a setting file.
FIG. 9 is an image diagram showing an example of a document image to be read.
FIG. 10 is an image diagram of block information displayed on a document image.
FIG. 11 is an image diagram of character recognition area information displayed on a document image as selection areas.
FIG. 12 is a diagram showing an example of a screen that displays recognition results.
FIG. 13 is a diagram showing an example of a screen that displays recognition results.
FIG. 14 is an image diagram explaining an example of the value acquisition processing.
FIG. 15 is a flowchart of the value acquisition processing of the second embodiment.
FIG. 16 is an image diagram showing an example of a document image to be read.
FIG. 17 is a diagram showing an example of a screen that displays recognition results in the second embodiment.

<First Embodiment>
Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

FIG. 1 is a diagram showing a configuration example of an image processing system to which an image processing method according to an embodiment of the present invention is applied.

In FIG. 1, the image processing system includes, for example, a personal computer (PC) 201 as an information processing apparatus, a scanner 202 as an image reading apparatus, and a printer 203 as a printing apparatus, which are connected to one another via a network 204.

The scanner 202 optically reads a paper document, digitizes it, and can send the resulting image data to the PC 201. The PC 201 executes predetermined image processing on the received image data. During this processing, the operator can check and correct the processing result by operating a keyboard, a mouse, or the like. The printer 203 receives the image data on which the predetermined image processing has been performed from the PC 201 and prints it.

The network 204 is a so-called communication network realized by any one of, or a combination of, the Internet, a LAN or WAN, a telephone line, a dedicated digital line, an ATM or frame relay line, a communication satellite line, a cable television line, a wireless line for data broadcasting, and the like. It need only be capable of transmitting and receiving data.

The description below applies the image processing method according to the embodiment of the present invention to the illustrated image processing system, but the invention is not limited to this configuration; it may also be applied to a multifunction peripheral in which a scanner and a printer are integrated. Furthermore, the PC 201 may execute the image processing method of the present invention not only on image data input from the scanner 202 but also on document image data captured by a digital camera or the like. The input source and input method of the image data are not limited.

FIG. 2 is a block diagram showing the schematic configuration of the PC 201 of FIG. 1.

In the PC 201, the CPU 101 controls the entire apparatus according to a control program stored in the ROM 102. The ROM 102 stores various programs, including the control programs for the processing described later that the CPU 101 executes, and various parameter data. The RAM 103 temporarily stores programs loaded from the storage device 104 and stores area images and various data. The RAM 103 also functions as a work area and a temporary save area for data.

The storage device 104 is composed of, for example, a hard disk or a CD-ROM, and stores various data including a database for managing image data. The display 105 is composed of, for example, an LCD or a CRT. The input device 106 is composed of, for example, a mouse, a keyboard, a pen tablet, or the like.

The network interface (I/F) 109 communicates with external devices connected to the network 204 (not only the scanner 202 and the printer 203 but also servers, external storage devices, and the like, not shown) to read and write programs and data.

FIG. 3 is a flowchart outlining the image processing executed by the image processing system of FIG. 1. This processing is executed by the CPU 101 based on the image processing program in the PC 201. The detailed processing of FIG. 3 is described later with reference to the flowchart of FIG. 4.

First, in step S301, the PC 201 controls the scanner 202 to read an image of a paper document and acquires the image data. Next, the PC 201 performs block selection processing on the image and extracts the table, text, picture or drawing, frame, and line regions from it.

Block selection processing recognizes one page of read image data, such as that shown in FIG. 9, as clusters corresponding to individual objects, judges the attribute of each block (character, drawing, photograph, line, table, and so on), and divides the page into regions with different attributes.

Specifically, the input image is first binarized into black and white, and contour tracing is performed to extract clusters of pixels enclosed by black pixel contours. For black pixel clusters with a large area, contour tracing is also performed on the white pixels inside them to extract white pixel clusters, and black pixel clusters are then extracted recursively from the inside of any white pixel cluster that is at least a certain size.

The black pixel clusters obtained in this way are classified by size and shape into regions with different attributes. For example, a cluster whose aspect ratio is close to 1 and whose size falls within a certain range is treated as a pixel cluster corresponding to a character, and a portion where adjacent characters can be grouped in good alignment is a character region; a flat pixel cluster is a line region; the range occupied by a black pixel cluster that is at least a certain size and contains well-aligned rectangular white pixel clusters is a table region; a region in which irregularly shaped pixel clusters are scattered is a photograph region; and pixel clusters of any other arbitrary shape are, by elimination among the attributes listed above, drawing regions.

FIG. 5 shows the block information for each block obtained by the block selection processing. FIG. 10 shows an example of an image corresponding to the block information of FIG. 5. The block information is not displayed on the screen; the blocks are drawn in FIG. 10 only for the purpose of explanation.

Among the blocks obtained by this block selection processing, the blocks with the text attribute (type) are acquired, and character recognition processing is performed on them. Character recognition is a known technique, so its description is omitted.

From the character string obtained by character recognition, it is determined whether the block spans multiple lines. If it does, the block is divided into individual lines and each line is registered as an area. A single-line block is registered as an area as it is. Because the region (coordinates) of the recognized characters is also obtained during character recognition, those coordinates are registered as well. FIG. 6 shows an example of data registered as areas. Block 4 of FIG. 5 is divided into 13 areas and registered.
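The line splitting and registration described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the `Area` record, the `register_areas` function, and the `(x, y, width, height)` box layout are hypothetical and are only meant to resemble the character recognition area information of FIG. 6.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # hypothetical layout: (x, y, width, height)


@dataclass
class Area:
    area_id: int  # sequential ID, assumed to start at 1
    text: str     # recognized character string of one line
    box: Box      # coordinates of that line


def register_areas(blocks: List[List[Tuple[str, Box]]]) -> List[Area]:
    """Register one area per recognized line.

    Each block is a list of (text, box) pairs, one pair per recognized line.
    A single-line block yields one area; a multi-line block is split so that
    every line becomes its own area with its own coordinates.
    """
    areas: List[Area] = []
    for lines in blocks:
        for text, box in lines:
            areas.append(Area(area_id=len(areas) + 1, text=text, box=box))
    return areas
```

With this sketch, a block whose recognition result contains 13 lines contributes 13 separate areas, in the same way that block 4 is split in FIG. 6.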

Next, the area from which a value is to be acquired is specified according to the acquisition condition 701 set in the setting file (FIG. 7). To do this, character recognition processing is performed on the blocks obtained by block selection, the result is divided into lines, and the areas are registered. Among these areas, the key area is searched for using the key-area search string of the acquisition condition 701 (for example, "HDL cholesterol"). Starting from the key area, the area next to it in the search direction (for example, horizontal) is specified as the value acquisition area. If the string found there does not satisfy the value format (for example, numeric, three digits), the next area in the search direction is specified as the value acquisition area instead.

If multiple search strings are registered in the setting file separated by commas, as in 702, the second search string (for example, "triglyceride") is searched for when no character string matches the first search string.

Likewise, if multiple search directions are registered in the setting file separated by commas, as in 703, the second search direction (for example, down) is searched when no value matching the conditions is found in the first search direction.
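The comma-separated fallback for search strings might look like the sketch below. The function names `split_setting` and `find_key_text` are hypothetical, the sample strings are English renderings of the examples in the text, and exact string equality is an assumption; the source does not say whether partial matches are accepted.

```python
from typing import List, Optional


def split_setting(value: str) -> List[str]:
    """Split a comma-separated setting into an ordered list of alternatives."""
    return [item.strip() for item in value.split(",") if item.strip()]


def find_key_text(recognized: List[str], search_setting: str) -> Optional[str]:
    """Return the first search string, in registration order, that occurs
    among the recognized strings, or None if no alternative matches.

    Exact equality is assumed here for simplicity.
    """
    for candidate in split_setting(search_setting):
        if candidate in recognized:
            return candidate
    return None
```

The same `split_setting` helper could be reused for a search-direction setting such as `"right, down"`, since both settings are described as ordered, comma-separated lists.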

Furthermore, in addition to the search direction above, the exclusion column setting condition 801 and the priority column setting condition 803 shown in FIG. 8 are used to decide which columns to exclude and which to prioritize, and thereby to specify the area from which a value should be acquired. In this embodiment, exclusion columns and priority columns are described as series of items arranged vertically, but depending on the layout of the form they may be series of items arranged horizontally (which may be called exclusion rows and priority rows).

The exclusion column setting condition 801 defines the columns to be excluded: a column in which a configured character string 802 appears is judged to show information other than the current test result, such as a reference value for the test or a previous test value.

The priority column setting condition 803 defines the columns to be prioritized: a column in which a configured character string 804 appears is judged to show the current test result.

The character string (numeral) registered for the area specified as the value acquisition area is then acquired as the current test result.

In step S302, the PC 201 displays the character string (numeral) acquired in step S301 on the display 105 as the current test result. FIGS. 12 and 13 show display examples; each screen is described later. In addition, in response to an output instruction from the user, the group of acquired values is exported to a CSV file or the like.

The value acquisition processing of step S301 is described with reference to FIG. 4.

In step S401, the PC 201 reads the acquisition condition 701, the exclusion column setting condition 801, and the priority column setting condition 803 from the setting file. Each condition is as described for step S301. The user may freely select which setting file to read.

In step S402, the PC 201 reads an image captured by the scanner or an image stored in a predetermined folder.

In step S403, the PC 201 executes block selection processing on the read image. Block selection is executed using a block selection library. The block selection processing is as described for step S301, and the block information is the same as in FIG. 5. The character recognition processing may also be executed by the block selection library.

In step S404, the PC 201 acquires the block information of the blocks with the text type or the table type from the block selection library.

In step S405, the PC 201 performs character recognition processing on the acquired blocks. Character recognition is a known technique and its description is omitted; one example is a technique that uses pattern matching to compare written characters against character templates and extract character candidates.

Character recognition processing yields a character string (which may include numerals) and the area information (coordinates) of that string. The area information describes the frame that encloses the character string. The character recognition results are managed in memory for each block.

In step S405, character recognition is first executed on all blocks before the processing moves on to step S406, but character recognition and the determination of S406 may instead be performed one block at a time. In other words, the procedure is not limited to executing the determination of step S406 only after character recognition has been executed on all blocks.

In step S406, the PC 201 determines whether all blocks have been processed. If not all blocks have been processed, the processing moves to step S407 to acquire the character string of the next block. If all blocks have been processed, the processing moves to step S411.

In step S407, the PC 201 acquires the character recognition result (character string) of the block being processed from memory.

In step S408, the PC 201 determines whether the acquired character string spans multiple lines. If it does, the processing moves to step S410. If it is a single line, the processing moves to step S409.

In step S409, the PC 201 stores the character recognition result in memory. The result is registered in the character recognition area information of FIG. 6, which holds an area ID, the character string obtained from the character recognition result, and the area information (coordinates) indicating where the character string is located.

In step S410, the PC 201 divides the area line by line, then generates and registers character recognition area information. As shown at 601 in FIG. 6, the character string of block 4 is registered as 13 separate areas. The area information (coordinates) registered for each area is the position of the character string of the corresponding line.

In step S411, the PC 201 acquires the key-area search string of the read acquisition condition 701 (see FIG. 7) and searches the character recognition area information. The area whose character string matches the key-area search string is specified, and this matching area becomes the key area.

In step S412, the search string of the read exclusion column setting condition 801 (see FIG. 8) is acquired, the character recognition area information is searched, and the area whose value matches is specified. If the matching area lies in a table region, the table column that contains it is treated as an "exclusion column".

If multiple search strings are registered for the exclusion column setting condition 801, this is done for every search string.

If the location that matches the exclusion column search string is not in table form, the areas aligned with the X coordinate of the matching character string may be judged to be the "exclusion column". If the excluded items are arranged horizontally, the Y coordinate may be used instead.

In step S413, the search string of the read priority column setting condition 803 (see FIG. 8) is acquired, the character recognition area information is searched, and the area whose value matches is specified. If the matching area lies in a table region, the table column that contains it is treated as a "priority column".

If multiple search strings are registered for the priority column setting condition 803, this is done for every search string.

If the location that matches the priority column search string is not in table form, the areas aligned with the X coordinate of the matching character string may be judged to be the "priority column". If the prioritized items are arranged horizontally, the Y coordinate may be used instead.
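For the case where columns are identified by X coordinate rather than by table structure, steps S412 and S413 can be sketched as below. Everything named here is an assumption for illustration: the `(x, y, width, height)` box layout, the `x_overlap` test used to decide that two areas are "aligned", the label strings, and the rule that exclusion wins when a column matches both conditions. The sample header strings are English renderings, not the literal strings of FIG. 8.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # hypothetical layout: (x, y, width, height)


def x_overlap(a: Box, b: Box) -> bool:
    """True when two boxes share some horizontal extent (same vertical column)."""
    return a[0] < b[0] + b[2] and b[0] < a[0] + a[2]


def label_columns(areas: Dict[int, Tuple[str, Box]],
                  excluded_words: List[str],
                  priority_words: List[str]) -> Dict[int, str]:
    """Label each area 'excluded', 'priority', or 'normal'.

    An area whose text equals a registered word is treated as a column
    header; every area horizontally aligned with that header (including the
    header itself) inherits the header's label. Exclusion is applied last so
    that it overrides priority, which is an assumption of this sketch.
    """
    labels = {area_id: "normal" for area_id in areas}
    for words, label in ((priority_words, "priority"), (excluded_words, "excluded")):
        headers = [box for text, box in areas.values() if text in words]
        for area_id, (_, box) in areas.items():
            if any(x_overlap(box, header) for header in headers):
                labels[area_id] = label
    return labels
```

For a horizontally arranged form, the same idea would apply with a Y-coordinate overlap test in place of `x_overlap`.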

In step S414, the PC 201 takes the key area and the search direction of the acquisition condition 701 (for example, right) and specifies the area one position to the right of the key area. Based on the coordinates of the key area, it refers to the character recognition area information (for example, FIG. 6) to search for and specify an area whose coordinates lie to the right. The specified area becomes the value acquisition area. Among the areas whose coordinates lie to the right, the area with the nearest coordinates is the first area, the next nearest is the second area, and so on. If there are multiple key areas, a value acquisition area is specified for each of them in the same way, according to its own conditions.
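The directional search of step S414 can be illustrated with the following sketch. The function name `areas_in_direction`, the `(x, y, width, height)` box layout, and the overlap test used to decide that an area lies on the same row or column as the key area are all assumptions; the source only states that the areas in the search direction are ordered from nearest to farthest.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # hypothetical layout: (x, y, width, height)


def areas_in_direction(key: Box, areas: Dict[int, Box], direction: str) -> List[int]:
    """Return the IDs of areas lying in `direction` from the key area, nearest first.

    'right' keeps areas that start at or beyond the key's right edge and
    overlap it vertically; 'down' keeps areas that start at or below the
    key's bottom edge and overlap it horizontally.
    """
    kx, ky, kw, kh = key
    hits = []
    for area_id, (x, y, w, h) in areas.items():
        if direction == "right":
            aligned = y < ky + kh and ky < y + h
            if aligned and x >= kx + kw:
                hits.append((x - (kx + kw), area_id))
        elif direction == "down":
            aligned = x < kx + kw and kx < x + w
            if aligned and y >= ky + kh:
                hits.append((y - (ky + kh), area_id))
        else:
            raise ValueError(f"unsupported direction: {direction}")
    # Sort by distance so that index 0 is the "first area", index 1 the "second area".
    return [area_id for _, area_id in sorted(hits)]
```

The first element of the returned list corresponds to the first candidate value acquisition area; later elements are the fallbacks used when a candidate is skipped or fails the value format.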

When the value acquisition area is being determined as described above, if a "priority column" exists in the value search direction, that column (for example, 1101 in FIG. 11) is preferentially treated as the candidate area from which to acquire the value. If there are multiple "priority columns", they become candidate acquisition areas in order of priority.

If an "exclusion column" exists in the value search direction, that column (for example, 1102 in FIG. 11) is excluded from the value acquisition areas.

For example, in report example 1401 of FIG. 14, when the value of the key item "HDL cholesterol" is acquired, column 1402, which shows reference values, is not judged to be an exclusion column because it contains no character string that designates an exclusion column. However, column 1403, which contains the character string "今回" ("this time"), is judged to be a priority column, so column 1402 is skipped and the area of the character string "51" in column 1403 becomes the value acquisition area.

In report example 1411 of FIG. 14, column 1412, which contains the character string "基準値" ("reference value"), is judged to be an exclusion column. The character string "40〜86", which lies in the exclusion column in the search direction, is excluded, and the area of the character string "51" in column 1413, reached by skipping the exclusion column, becomes the value acquisition area. Column 1413 is not judged to be a priority column because it contains no character string that designates a priority column.
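The two FIG. 14 cases can be reproduced with a small selection routine. This is a sketch under assumptions: `pick_value_area` is a hypothetical name, the candidates are assumed to arrive already ordered from nearest to farthest, column IDs reuse the reference numerals of FIG. 14 for readability, and "order of priority" among several priority columns is modeled simply as nearest first.

```python
from typing import List, Optional, Set, Tuple


def pick_value_area(candidates: List[Tuple[int, str]],
                    priority_cols: Set[int],
                    excluded_cols: Set[int]) -> Optional[Tuple[int, str]]:
    """Choose the area to read a value from.

    `candidates` holds (column_id, text) pairs ordered from nearest to
    farthest in the search direction. Excluded columns are dropped first;
    among the rest, the nearest priority column wins, and if there is no
    priority column the nearest remaining candidate is used.
    """
    usable = [c for c in candidates if c[0] not in excluded_cols]
    for candidate in usable:
        if candidate[0] in priority_cols:
            return candidate
    return usable[0] if usable else None


# Report example 1401: column 1403 is a priority column, so 1402 is skipped.
example_1401 = pick_value_area([(1402, "40-86"), (1403, "51")],
                               priority_cols={1403}, excluded_cols=set())

# Report example 1411: column 1412 is an exclusion column, so 1413 is used.
example_1411 = pick_value_area([(1412, "40-86"), (1413, "51")],
                               priority_cols=set(), excluded_cols={1412})
```

Both calls select the area holding "51", by different routes: the first through the priority rule, the second through the exclusion rule.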

In step S415, the PC 201 acquires a value from the value acquisition area specified in step S414. If the acquired character string does not conform to the value format of the read acquisition condition 701 (see FIG. 7), the next character string in the search direction of the acquisition condition 701 (see FIG. 7) is acquired. This is repeated until a character string that matches the value format is acquired or the edge of the same table is reached.
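The format check and retry loop of step S415 might be written as follows. The regular expression is an assumed encoding of the "numeric, three digits" value format: the source does not say whether exactly three digits are required, and this sketch accepts one to three digits because the example value "51" has two. The function name `first_value_in_format` is hypothetical.

```python
import re
from typing import Iterable, Optional

# Assumed encoding of the "numeric, three digits" value format (1 to 3 digits).
VALUE_FORMAT = re.compile(r"\d{1,3}")


def first_value_in_format(texts: Iterable[str],
                          value_format: re.Pattern = VALUE_FORMAT) -> Optional[str]:
    """Return the first string, in search order, that matches the value format.

    `texts` are the strings of the candidate areas ordered along the search
    direction up to the edge of the table. None means no candidate matched,
    which corresponds to the error case of step S416.
    """
    for text in texts:
        stripped = text.strip()
        if value_format.fullmatch(stripped):
            return stripped
    return None
```

Using `fullmatch` rather than `search` means a range such as "40-86" is rejected as a whole instead of yielding its leading digits.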

In step S416, the PC 201 determines whether a value was acquired. If a value was acquired, step S418 is performed. If no value was acquired, step S417 is performed.

In step S417, the PC 201 determines whether value acquisition has been attempted for every search direction set in the search direction field of the read acquisition condition 701 (see FIG. 7). If it has, step S418 is performed. If it has not, the search direction is changed to the next direction and step S414 is performed.
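The loop formed by steps S414 to S417 amounts to trying each configured search direction in order until one yields a value. A minimal sketch, assuming a hypothetical callback `acquire` that performs steps S414 and S415 for one direction and returns `None` on failure:

```python
from typing import Callable, List, Optional, Tuple


def acquire_with_fallback(directions: List[str],
                          acquire: Callable[[str], Optional[str]]
                          ) -> Optional[Tuple[str, str]]:
    """Try each search direction in registration order.

    Returns (direction, value) for the first direction that yields a value,
    or None when every configured direction has been tried without success.
    """
    for direction in directions:
        value = acquire(direction)
        if value is not None:
            return direction, value
    return None
```

For example, with directions `["right", "down"]` and a form where only the area below the key holds a conforming value, the sketch returns the pair for `"down"` after the attempt for `"right"` fails.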

In step S418, if value acquisition has been performed for all recognized key items, the PC 201 performs step S419. If there are key items for which value acquisition has not yet been performed, it performs the value acquisition processing (steps S414 to S417) for them.

ステップＳ４１９では、値取得領域として特定された領域のうち、ステップＳ４１５で値を取得できなかったものについて、値を取得すべき領域であったどうかを判定する。判定した結果は、ステップＳ３０２にて結果を表示する際に反映させる。 In step S419, it is determined whether or not the area for which a value could not be acquired in step S415 among the areas specified as the value acquisition area was an area where a value should be acquired. The determined result is reflected when the result is displayed in step S302.

具体的処理を、画像として取り込んだ報告書等が表形式の場合について説明する。ステップＳ４１５にて、あるキー項目についてエラーとなり、当該キー項目に対して値が取得できなかった場合、値を取得しようとした領域に対して、同じ列に値が正常に取得できた他の値取得領域が存在する場合は「優先領域」と判定する。逆に同じ列に値が正常に取得できた他の値取得領域が存在しない場合は「非優先領域」と判定する。つまり、値が正常に取得できた値取得領域が存在する列については、取得すべき項目が並んだ列である可能性が高いため、同列でエラーとなった領域についても、値を取得すべき領域と判定している。 Specific processing will be described for a case where a report or the like captured as an image is in a table format. In step S415, if an error occurs with respect to a key item and a value cannot be acquired for the key item, another value in which the value can be normally acquired in the same column for the area from which the value is to be acquired. If the acquisition area exists, it is determined as “priority area”. Conversely, if there is no other value acquisition area in which the value can be normally acquired in the same column, it is determined as a “non-priority area”. In other words, since there is a high possibility that the column where there is a value acquisition area where the value can be acquired normally is the column where the items to be acquired are arranged, the value should be acquired even for the area where an error occurred in the same column Judged as an area.

FIG. 13 shows an example of the determination result. In FIG. 13, two value acquisition areas, 1302 and 1303, are specified for the key item "ALT (GPT)", and value acquisition results in an error for both. For the value acquisition area 1302, values were acquired normally from value acquisition areas in the same column for other key items such as "AST (GOT)", so it is determined to be a "priority area", that is, a key area from which a value should be acquired. For the value acquisition area 1303, on the other hand, no value acquisition area in the same column yielded a value for any other key item, so it is determined to be a "non-priority area", that is, a key area from which a value need not be acquired.

If the report or other document captured as an image is not in table format, the area may be determined to be a "priority area" when another key item whose value was acquired exists in an area whose X coordinate is close to that of the key area that resulted in the error, and a "non-priority area" when no such key item exists. For a table format, areas may be grouped by row instead of by column; for a non-table format, they may be grouped by Y coordinate.
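The determination of step S419 can be illustrated with a short sketch. It assumes each value acquisition area carries a single X coordinate and that "same column" (or "close X coordinate" for non-table documents) is approximated by a fixed tolerance; the `ValueArea` type, the `tolerance` parameter, and the coordinates are assumptions made for this example, not details taken from the specification.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ValueArea:
    key: str               # key item the area belongs to
    x: float               # X coordinate of the area (hypothetical unit: pixels)
    value: Optional[str]   # None when value acquisition failed


def classify_failed_areas(areas: List[ValueArea], tolerance: float = 5.0) -> Dict[int, str]:
    """Label each failed area, by list index, as 'priority' or 'non-priority'.

    A failed area is 'priority' when some other area with a nearby X
    coordinate yielded a value, and 'non-priority' otherwise.
    """
    labels: Dict[int, str] = {}
    for i, area in enumerate(areas):
        if area.value is not None:
            continue  # only areas that failed are classified
        has_successful_peer = any(
            other is not area
            and other.value is not None
            and abs(other.x - area.x) <= tolerance
            for other in areas
        )
        labels[i] = "priority" if has_successful_peer else "non-priority"
    return labels


# Loosely modeled on FIG. 13: two failed areas for "ALT(GPT)".
areas = [
    ValueArea("AST(GOT)", 300.0, "22"),
    ValueArea("ALT(GPT)", 300.0, None),
    ValueArea("ALT(GPT)", 420.0, None),
]
print(classify_failed_areas(areas))
```

Grouping by row or by Y coordinate, as the text permits, would only require comparing a `y` field instead of `x`.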

Next, the screen displayed by the value output processing in step S302 is described.

FIG. 12 shows a recognition result screen 1201, which is an example of the screen displayed in step S302.

The recognition result screen 1201 displays the read image 1202 on the left side and a list 1203 of the recognition results for each test item on the right side.

If there is an item whose value could not be acquired in step S415, for example because the key area is blank or no character string matches the value format, the item is highlighted as an error item (1204, 1205). If a recognition result is wrong, the user can enter a corrected value in the correction input area (1206). When a value is entered in the correction input area, the entered value becomes the value to be registered.

The priority column and the exclusion column specified by the priority column setting condition 803 and the exclusion column setting condition 801 may also be displayed so that they can be identified.

FIG. 13 shows a recognition result screen 1301, which is an example of a screen reflecting the determination result of step S419.

No value could be acquired from the value acquisition area 1302 in step S415, and the area was determined to be a "priority area" in step S419. The value acquisition area 1302 and its recognition result 1304 are therefore highlighted.

No value could be acquired from the value acquisition area 1303 in step S415 either, but the area was determined to be a "non-priority area" in step S419. The value acquisition area 1303 and its recognition result 1305 are therefore displayed in a form different from that of a "priority area".

The recognition result screens 1201 and 1301 each have an output button, which outputs the recognition results, together with any corrections made by the user, to a CSV file. When multiple images have been read, value acquisition and confirmation are performed for all images, and pressing the output button on the last image outputs the values to the CSV file in a single batch. The output file format is only an example and is not limiting.
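A batch export of this kind might look like the sketch below, in which a user correction, where present, replaces the recognized value before the row is written. The column layout (`image`, `item`, `value`), the function name, and the file names are assumptions for illustration; the specification does not define the CSV layout.

```python
import csv
import io
from typing import Dict, Iterable, Tuple


def export_results(
    results: Iterable[Tuple[str, str, str]],
    corrections: Dict[Tuple[str, str], str],
) -> str:
    """Return CSV text with one row per (image, item).

    `results` holds (image, item, recognized value) tuples collected
    over all images; `corrections` maps (image, item) to a value the
    user typed into the correction input area.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["image", "item", "value"])
    for image, item, recognized in results:
        # A corrected value, if any, takes precedence over the recognized one.
        writer.writerow([image, item, corrections.get((image, item), recognized)])
    return buffer.getvalue()


results = [("page1.png", "AST(GOT)", "22"), ("page1.png", "ALT(GPT)", "")]
corrections = {("page1.png", "ALT(GPT)"): "51"}
print(export_results(results, corrections))
```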

As described above, it becomes possible to recognize whether an item that caused an error during value acquisition is an item that should be acquired.

<Second Embodiment>

A second embodiment of the present invention is described below. Description of content that is the same as in the first embodiment is omitted.

FIG. 15 is a flowchart showing the detailed flow of the value acquisition processing in step S301 (it corresponds to FIG. 4 of the first embodiment). FIG. 16 shows the document image read in this example, and FIG. 17 shows the screen that displays the recognition results; both are referred to as needed in the description.

Steps S401 to S410 are the same as in FIG. 4, so their description is omitted. However, the exclusion column setting condition 801 and the priority column setting condition 803 do not need to be loaded in step S401. The processing from step S1501 onward is described below.

In step S1501, the PC 201 repeatedly executes the following processing for each block divided in step S301. The processing is performed block by block because the arrangement of the displayed items may differ from block to block.

In step S1502, the PC 201 obtains the key area search character string (see FIG. 7) of the loaded acquisition condition 701 and searches the character recognition area information. The area whose character string in the character recognition area information matches the search character string is identified, and the matched area becomes the key search area.

In step S1503, the PC 201 obtains the key area and the search direction of the acquisition condition 701 (for example, right; this example uses a single search direction), searches areas one after another from the key area in the search direction, and specifies one value acquisition area. To search, the PC 201 refers to the character recognition area information (for example, FIG. 6) and, based on the coordinates of the key area, looks in order for areas whose coordinates lie in the search direction. Each area found becomes a value acquisition area, and the areas obtained in order along the search direction are called columns.
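A coordinate-based search of this kind can be sketched for the search direction "right". The example assumes that each entry of the character recognition area information is a rectangle with a recognized string, and that "lying in the search direction" means starting at or beyond the right edge of the key area while overlapping it vertically. The `Region` type and the overlap rule are assumptions for illustration; the actual layout of the character recognition area information (FIG. 6) is not reproduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Region:
    text: str
    x: int  # left edge
    y: int  # top edge
    w: int  # width
    h: int  # height


def regions_to_right(key_area: Region, regions: List[Region]) -> List[Region]:
    """Return the regions to the right of `key_area`, ordered left to right.

    A region qualifies when it starts at or beyond the right edge of
    the key area and its vertical extent overlaps that of the key area.
    """
    def overlaps_vertically(r: Region) -> bool:
        return r.y < key_area.y + key_area.h and key_area.y < r.y + r.h

    hits = [
        r for r in regions
        if r is not key_area
        and r.x >= key_area.x + key_area.w
        and overlaps_vertically(r)
    ]
    return sorted(hits, key=lambda r: r.x)


key_area = Region("HDL", 10, 100, 80, 20)
regions = [
    key_area,
    Region("40-86", 100, 100, 50, 20),
    Region("51", 160, 102, 30, 20),
    Region("LDL", 10, 130, 80, 20),
    Region("120", 160, 130, 30, 20),
]
print([r.text for r in regions_to_right(key_area, regions)])
```

The position of each returned region in the list corresponds to what the text calls a column for that key item.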

In step S1504, the PC 201 acquires a value from the value acquisition area specified in step S1503. If the acquired character string matches the value format (see FIG. 7) of the loaded acquisition condition 701, the area is determined to be an acquisition candidate; if it does not match the value format, the area is determined to be an exclusion candidate.

The recognition result screen 1701 in FIG. 17 displays an image preview 1702 in which the items determined to be acquisition candidates are shaded (1703). For the test items "HDL cholesterol" and "LDL cholesterol", the second and third columns are determined to be acquisition candidates; for the test item "urine protein", the second through fourth columns are. All other items are determined to be exclusion candidates.

In step S1505, it is determined whether the processing of steps S1503 and S1504 has been performed for all areas (columns) found by the search in step S1503. If so, the process proceeds to step S1506; if not, it returns to step S1503 to process the next area (column).

In step S1506, the PC 201 determines whether the processing of steps S1502 to S1505 has been performed for all recognized key items. If so, the process proceeds to step S1507; if not, it returns to step S1502 to process the next key item.

In step S1507, the PC 201 identifies exclusion columns, that is, columns that are not acquired as item values, based on the processing results up to step S1506. As one example, a column is identified as an exclusion column when the proportion of exclusion candidates among the areas in the same column exceeds a predetermined value (for example, 40%). Here, "the same column" means areas having the same X coordinate range when the target block is in table format, and areas having similar X coordinate ranges when it is not. For a table whose rows and columns are transposed, the Y coordinate range is used instead.

In the recognition result screen 1701 of FIG. 17, the second column (the "normal value" column) is an exclusion candidate for the test items "HDL cholesterol" and "LDL cholesterol" (two of the three test items are exclusion candidates). The second column is therefore determined to be an exclusion column, and the recognition result field 1704 indicates that it is an exclusion column.

In step S1508, the PC 201 identifies value acquisition candidate columns, that is, columns that are candidates for acquisition as item values, based on the processing results up to step S1506. As one example, a column is identified as a value acquisition candidate column when the proportion of acquisition candidates is equal to or greater than a predetermined value (for example, 60%).

In the recognition result screen 1701 of FIG. 17, the third and fourth columns (the "XX year" and "YY year" columns) are acquisition candidates for all test items (three of the three test items are acquisition candidates). The third and fourth columns are therefore determined to be value acquisition candidate columns, and the recognition result field 1704 indicates that they are value acquisition candidate columns.
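The column classification of steps S1504, S1507 and S1508 can be sketched as follows. The sketch assumes each key item has its own value format expressed as a regular expression, that every row has the same number of columns, and that the thresholds are the example values 40% and 60% from the text. The function name, the `undetermined` label, and the sample data are hypothetical, and the data is only loosely modeled on FIG. 17.

```python
import re
from typing import Dict, List, Tuple


def classify_columns(
    rows: List[Tuple[str, List[str]]],
    exclude_over: float = 0.40,
    accept_at_least: float = 0.60,
) -> Dict[int, str]:
    """Label each column index as 'exclusion', 'candidate' or 'undetermined'.

    `rows` holds, per key item, its value format (a regular expression)
    and the strings found along the search direction. A cell matching
    the format is an acquisition candidate; any other cell is an
    exclusion candidate.
    """
    total = len(rows)
    labels: Dict[int, str] = {}
    for col in range(len(rows[0][1])):
        matched = sum(
            1 for fmt, cells in rows if re.fullmatch(fmt, cells[col].strip())
        )
        unmatched = total - matched
        if unmatched / total > exclude_over:
            labels[col] = "exclusion"
        elif matched / total >= accept_at_least:
            labels[col] = "candidate"
        else:
            labels[col] = "undetermined"
    return labels


rows = [
    (r"\d+", ["40-86", "51", "55"]),         # HDL cholesterol
    (r"\d+", ["70-139", "120", "118"]),      # LDL cholesterol
    (r"\([-+]\)", ["(-)", "(-)", "(+)"]),    # urine protein
]
print(classify_columns(rows))
```

In this data the first value column fails the format for two of three items, so it exceeds the 40% exclusion threshold, while the other two columns match for all three items and qualify as value acquisition candidate columns.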

In step S1509, the PC 201 displays the recognition result screen 1701 and accepts the user's designation of a value acquisition column from among the value acquisition candidate columns. On the recognition result screen 1701, a selection check box 1705 is displayed above each column determined to be a value acquisition candidate column. When the user checks the selection check box 1705 of the column to be designated as the value acquisition column and presses the acquisition column determination button 1706, the value acquisition column is decided (the third column in the screen example), and the item values in that column are fixed as the values acquired for the respective test items.

In step S1510, the PC 201 ends the repeated processing if steps S1502 to S1509 have been performed for all blocks; otherwise, it returns to step S1501 to process the next block.

This concludes the description of the value acquisition processing according to the second embodiment.

As described above, exclusion columns and value acquisition candidate columns are presented based on the recognition results, so the value acquisition column can be decided efficiently.

Although embodiments have been described above, the present invention can be embodied as, for example, a system, an apparatus, a method, a program, or a recording medium. Specifically, it may be applied to a system composed of a plurality of devices or to an apparatus consisting of a single device. For example, it may be implemented in a cloud environment, in which case the configuration file creation tool is executed on a server in the cloud environment.

The program of the present invention is a program that enables a computer to execute the processing methods of the flowcharts shown in the figures. A storage medium may store a program that enables a computer to execute the processing methods shown in the figures. The program of the present invention may also be a separate program for each processing method of each apparatus shown in the figures.

As described above, it goes without saying that the object of the present invention is also achieved by supplying a system or an apparatus with a recording medium on which a program implementing the functions of the above embodiments is recorded, and having the computer (or CPU or MPU) of that system or apparatus read and execute the program stored on the recording medium.

In this case, the program itself read from the recording medium implements the novel functions of the present invention, and the recording medium storing the program constitutes the present invention.

Examples of recording media that can be used to supply the program include a flexible disk, hard disk, optical disk, magneto-optical disk, CD-ROM, CD-R, DVD-ROM, magnetic tape, nonvolatile memory card, ROM, EEPROM, silicon disk, and solid state drive.

It also goes without saying that the present invention covers not only the case where the functions of the above embodiments are implemented by a computer executing the program it has read, but also the case where an OS (operating system) or the like running on the computer performs part or all of the actual processing in accordance with the instructions of the program, and the functions of the above embodiments are implemented by that processing.

It further goes without saying that the present invention covers the case where the program read from the recording medium is written to a memory provided on a function expansion board inserted into the computer or in a function expansion unit connected to the computer, after which a CPU or the like provided on the function expansion board or in the function expansion unit performs part or all of the actual processing in accordance with the instructions of the program code, and the functions of the above embodiments are implemented by that processing.

The present invention may be applied to a system composed of a plurality of devices or to an apparatus consisting of a single device. It goes without saying that the present invention is also applicable where it is achieved by supplying a program to a system or an apparatus. In this case, the system or apparatus can obtain the effects of the present invention by reading a recording medium that stores a program for achieving the present invention.

Furthermore, the system or apparatus can obtain the effects of the present invention by using a communication program to download and read a program for achieving the present invention from a server, database, or the like on a network.

All configurations that combine the above embodiments and their modifications are also included in the present invention.

201 PC
202 Scanner
203 Printer
204 Network

Claims

An information processing apparatus having an acquisition condition storage means for storing a key character string and a position condition, the apparatus comprising:
a specifying means for specifying an area based on the key character string and the position condition;
an acquiring means for acquiring a character string from the specified area;
a judging means for judging, when the acquiring means cannot acquire a character string, whether a character string could be acquired from another area in the same arrangement as the area; and
a determining means for determining, when the judging means judges that the character string could be acquired, that the specified area is an area from which a value should be acquired.

The information processing apparatus according to claim 1, further comprising a display means for identifiably displaying the area determined by the determining means to be an area from which a value should be acquired.

An information processing apparatus having an acquisition condition storage means for storing a key character string, a position condition, and a value format condition, the apparatus comprising:
a specifying means for specifying an area based on the key character string and the position condition;
an acquiring means for acquiring a character string from the specified area;
a determining means for determining whether the character strings acquired from areas in the same arrangement satisfy the value format condition; and
a display control means for identifiably displaying, when the determining means determines that the condition is not satisfied, that the areas in the same arrangement are not areas from which a value is to be acquired.

The information processing apparatus according to claim 3, wherein, when the determining means determines that the condition is satisfied, the display control means identifiably displays that the areas in the same arrangement are candidates for areas from which a value is to be acquired.

The information processing apparatus according to claim 4, wherein the display control means accepts designation of an area from which a value is to be acquired from among the areas identifiably displayed as candidates for areas from which a value is to be acquired.

The information processing apparatus according to claim 1, wherein the arrangement is a table column or row.

The information processing apparatus according to claim 1, wherein the arrangement is a certain range of vertical arrangement or horizontal arrangement.

A control method for an information processing apparatus having an acquisition condition storage means for storing a key character string and a position condition, the method comprising:
a specifying step of specifying an area based on the key character string and the position condition;
an acquiring step of acquiring a character string from the specified area;
a judging step of judging, when a character string cannot be acquired in the acquiring step, whether a character string could be acquired from another area in the same arrangement as the area; and
a determining step of determining, when it is judged in the judging step that the character string could be acquired, that the specified area is an area from which a value should be acquired.

A control method for an information processing apparatus having an acquisition condition storage means for storing a key character string, a position condition, and a value format condition, the method comprising:
a specifying step of specifying an area based on the key character string and the position condition;
an acquiring step of acquiring a character string from the specified area;
a determining step of determining whether the character strings acquired from areas in the same arrangement satisfy the value format condition; and
a display control step of identifiably displaying, when it is determined in the determining step that the condition is not satisfied, that the areas in the same arrangement are not areas from which a value is to be acquired.

A program for causing an information processing apparatus to function as the means according to any one of claims 1 to 7.