JP2023012225A

JP2023012225A - Support system, support method, and program

Info

Publication number: JP2023012225A
Application number: JP2021115746A
Authority: JP
Inventors: 光晟河津; Kosei Kawazu; 崇岡田; Takashi Okada
Original assignee: Toppan Printing Co Ltd
Current assignee: Toppan Inc
Priority date: 2021-07-13
Filing date: 2021-07-13
Publication date: 2023-01-25

Abstract

PROBLEM TO BE SOLVED: To support the work of detecting writing errors in handwritten characters.
SOLUTION: A support system according to the present invention includes: an acquisition unit that acquires a learning data set in which a learning image and a first label associated with the learning image form a set; an estimation unit that performs image recognition on the learning image of the learning data set acquired by the acquisition unit, thereby estimating a second label estimated to be associated with the learning image; a comparison unit that compares the first label with the second label; a display unit that displays, based on the comparison result of the comparison unit, the learning images for which the first label and the second label differ; and an input unit that receives input information indicating whether or not the image shown in the learning image displayed by the display unit differs from the first label associated with that learning image.
SELECTED DRAWING: Figure 1

Description

The present invention relates to a support system, a support method, and a program.

In recent years, techniques that apply deep learning to handwritten characters written on forms and the like in order to perform character recognition have been actively developed. Training a deep learning model requires a large number of handwritten characters together with labels that indicate the character type of each written character. If the handwritten characters used for training contain writing errors, training on them can reduce the accuracy of character recognition. It is therefore necessary to remove from the training data any characters that are written incorrectly, or that a person clearly cannot read as the correct character. For this reason, handwritten characters are inspected for writing errors before training. Because handwriting varies widely in style, however, finding errors has to rely on human visual inspection. Since the data used for deep learning is enormous, the number of characters to inspect is very large and oversights can occur, so there is a limit to how well a person can visually inspect every one of a large number of handwritten characters.

Patent Document 1 describes a technique for making the work of correcting labels assigned to learning data more efficient. In the technique described in Patent Document 1, high-dimensional data obtained from images is converted into low-dimensional data and displayed as a plot, and the labels assigned to the images are corrected using, for example, the distance between a reference point selected on the plot and the points that have the same label as the reference point. However, when the learning data consists of characters, the number of classes (character types) runs into the thousands, so a plot is not an effective way to display the data.

There are also techniques for detecting errors in handwritten characters that are aimed at the automatic scoring of kanji. Patent Document 2 describes a technique in which an input handwritten character image and the correct character label that should be written there are input to a machine learning model to generate a correct character image that has the characteristics of the input handwritten character image, and kanji writing errors are detected by taking the difference between the generated correct character image and the input handwritten character image. Taking the difference between the input character and the correct character is effective when judging whether a character has the correct shape, as in kanji scoring. For the handwritten characters of a data set used for character recognition of forms and the like, however, it is not necessary to judge details of character shape such as stops (tome), hooks (hane), and sweeps (harai) precisely. It is sufficient that a person looking at the character can tell that it is the correct character, so this approach is not effective.

Patent Document 1: JP 2020-135631 A
Patent Document 2: JP 2021-26729 A

The techniques described above have the problem that the work of detecting writing errors contained in an enormous number of handwritten characters cannot be carried out efficiently.

In view of the above problem, an object of the present invention is to provide a support system, a support method, and a program that can support the work of detecting writing errors in handwritten characters.

A support system according to one aspect of the present invention includes: an acquisition unit that acquires a learning data set in which a learning image and a first label associated with the learning image form a set; an estimation unit that performs image recognition on the learning image of the learning data set acquired by the acquisition unit, thereby estimating a second label estimated to be associated with the learning image; a comparison unit that compares the first label with the second label; a display unit that displays, based on the comparison result of the comparison unit, the learning images for which the first label and the second label differ; and an input unit that receives input information indicating whether or not the image shown in the learning image displayed by the display unit differs from the first label associated with that learning image.

A support method according to one aspect of the present invention includes: an acquisition step of acquiring a learning data set in which a learning image and a first label associated with the learning image form a set; an estimation step of performing image recognition on the learning image of the acquired learning data set, thereby estimating a second label estimated to be associated with the learning image; a comparison step of comparing the first label with the second label; a display step of displaying, based on the comparison result, the learning images for which the first label and the second label differ; and an input step of receiving input information indicating whether or not the image shown in the displayed learning image differs from the first label associated with that learning image.

A program according to one aspect of the present invention causes a computer to execute: a step of acquiring a learning data set in which a learning image and a first label associated with the learning image form a set; a step of performing image recognition on the learning image of the acquired learning data set, thereby estimating a second label estimated to be associated with the learning image; a step of comparing the first label with the second label; a step of displaying, based on the comparison result, the learning images for which the first label and the second label differ; and a step of receiving input information indicating whether or not the image shown in the displayed learning image differs from the first label associated with that learning image.

According to the present invention, the work of detecting writing errors in handwritten characters can be supported.

FIG. 1 is a block diagram showing a schematic configuration of an inspection support device according to an embodiment of the present invention.
FIG. 2 is a diagram for explaining the procedure by which the inspection support device according to the embodiment cuts out handwritten character images, one character at a time, from an image of a manuscript sheet.
FIG. 3 is a schematic diagram showing the data structure of the acquired label table stored by the inspection support device according to the embodiment.
FIG. 4 is a schematic diagram showing the data structure of the estimated label table stored by the inspection support device according to the embodiment.
FIG. 5 is an image diagram showing an example of a visual inspection screen displayed by the inspection support device according to the embodiment.
FIG. 6 is an image diagram showing another example of a visual inspection screen displayed by the inspection support device according to the embodiment.
FIG. 7 is an image diagram showing another example of a visual inspection screen displayed by the inspection support device according to the embodiment.
FIG. 8 is a flowchart showing the procedure of the inspection support processing executed by the inspection support device according to the embodiment.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.
FIG. 1 is a block diagram showing a schematic configuration of an inspection support device 1 according to this embodiment.
The inspection support device 1 (support system) includes a data input unit 101, an acquisition unit 102, an acquired label storage unit 103, an estimation unit 104, a classifier 105, an estimated label storage unit 106, a comparison unit 107, a selection data storage unit 108, a display unit 109, a correct/incorrect information input unit 110, and an error data storage unit 111.

The data input unit 101 inputs an image, captured by a camera, scanner, or the like, of a manuscript sheet on which the writing position is specified for each type of character (hereinafter, "character type"), for example. A plurality of handwritten characters of different character types have been written in advance at the respective designated positions on the manuscript sheet. The characters are preferably imaged under conditions that allow them to be recognized by the downstream acquisition unit 102 and by the classifier 105 included in the estimation unit 104. Because the images input by the data input unit 101 are data to be used for deep learning, the data input unit 101 normally inputs images of a plurality of manuscript sheets for the same character type.

The acquisition unit 102 acquires a learning data set in which a learning image and a first label associated with the learning image form a set. The learning image is a handwritten character image showing a handwritten character. The first label is information indicating the type of the character shown in the handwritten character image. Specifically, the acquisition unit 102 performs image processing on the manuscript sheet image input by the data input unit 101 and cuts out the characters in the image one character at a time. Each character image cut out in this way (handwritten character image) is a learning image used for deep learning. The acquisition unit 102 then identifies the correct character type corresponding to each handwritten character image from its position in the manuscript sheet image, and writes the combination of the data of the cut-out handwritten character image (hereinafter, "character image data") and the character code representing the corresponding correct character type (first label) to the acquired label storage unit 103. The means for cutting out handwritten character images one character at a time from the manuscript sheet image is described later.

The acquired label storage unit 103 stores an acquired label table. The acquired label table stores the character image data of each single character in association with the character code representing the corresponding character type (first label).

The estimation unit 104 performs image recognition on the learning image of the learning data set acquired by the acquisition unit 102, thereby estimating a second label estimated to be associated with the learning image. For example, the estimation unit 104 includes the classifier 105. The classifier 105 is a CNN (Convolutional Neural Network), trained in advance, that takes a handwritten character image as input and estimates the probability of each character type for the character written in the image. The estimation unit 104 inputs a handwritten character image cut out by the acquisition unit 102 to the classifier 105 and takes the character type output with the highest probability in the estimation result as the character type estimated to be associated with that handwritten character image. The estimation unit 104 then writes the combination of the address of the handwritten character image cut out by the acquisition unit 102 and the character code representing the estimated character type (second label) to the estimated label storage unit 106.
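The selection of the second label described above can be sketched as follows. This is only an illustrative sketch: the function name `estimate_second_label` and the dictionary form of the classifier output are assumptions, not part of the source, and the probabilities are made-up values.

```python
def estimate_second_label(probabilities):
    """Return the character type with the highest estimated probability.

    `probabilities` maps a character type to the probability that a
    classifier (assumed here to be the pre-trained CNN) assigned to it.
    """
    if not probabilities:
        raise ValueError("classifier returned no candidates")
    return max(probabilities, key=probabilities.get)


# Hypothetical classifier output for one handwritten character image.
example_output = {"柏": 0.31, "拍": 0.62, "泊": 0.07}
second_label = estimate_second_label(example_output)
```

In this made-up example the second label would be "拍", because it has the highest probability among the candidates.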

The estimated label storage unit 106 stores an estimated label table. The estimated label table stores, in association with each other, a character image data address indicating the head address of the area in which the character image data of a single character is stored, and the character code representing the character type estimated for it by the estimation unit 104 (second label). The character image data address indicates the address of the area of the acquired label storage unit 103 in which the character image data is stored, and serves as an index when reading character image data from the estimated label storage unit 106.

The comparison unit 107 compares the first label with the second label. Specifically, the comparison unit 107 reads the character codes corresponding to the same handwritten character image from the acquired label storage unit 103 and the estimated label storage unit 106 and compares them. The comparison unit 107 then treats handwritten character images whose character codes differ as targets for visual inspection, and writes the address of each such image, combined with the character code read from the acquired label storage unit 103 (first label), to the selection data storage unit 108.
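A minimal sketch of this comparison is shown below. It assumes, for illustration only, that both tables can be viewed as dictionaries keyed by the character image data address; the function name `select_for_inspection` is hypothetical.

```python
def select_for_inspection(acquired, estimated):
    """Pick the images whose first and second labels differ.

    acquired:  address -> first label (character code from the sheet position)
    estimated: address -> second label (character code from the classifier)
    Returns (address, first label) pairs, i.e. what the text describes as
    being written to the selection data storage unit.
    """
    selected = []
    for address, first_label in acquired.items():
        second_label = estimated.get(address)
        if second_label is not None and second_label != first_label:
            selected.append((address, first_label))
    return selected


# Hypothetical tables for three handwritten character images.
acquired = {0: "柏", 1: "柏", 2: "拍"}
estimated = {0: "柏", 1: "拍", 2: "拍"}
selection = select_for_inspection(acquired, estimated)
```

Only the image at address 1 is selected here, since it is the only one for which the two labels disagree.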

The selection data storage unit 108 stores a selection data table. The selection data table stores, in association with each other, a character image data address indicating the head address of the area in which the character image data of a single character selected by the comparison unit 107 for visual inspection is stored, and the character code representing the corresponding character type acquired by the acquisition unit 102 (first label). The character image data address indicates the address of the area of the acquired label storage unit 103 in which the character image data is stored, and serves as an index when reading character image data from the selection data storage unit 108.

The display unit 109 displays, based on the comparison result of the comparison unit 107, the learning images for which the first label and the second label differ. In doing so, the display unit 109 displays those learning images together in a tiled layout, grouped by the type of the first label. Specifically, among the handwritten character images stored in the selection data storage unit 108, the display unit 109 arranges all handwritten character images that have the same character code as tiles and displays them on a display or the like. The display unit 109 also displays, together with the learning images, a reference image of the character indicated by the first label. For example, the display unit 109 may display correct character images (reference images) generated from two or more different digital fonts alongside the handwritten character images for comparison. The display unit 109 may also display the character represented by the correct character code in the title bar.
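The grouping and tiling described above can be sketched as follows. The function names and the fixed column count are assumptions made for illustration; the source does not specify how the tile layout is computed.

```python
from collections import defaultdict


def group_by_first_label(selected):
    """Group selected image addresses by their first label (character code)."""
    groups = defaultdict(list)
    for address, first_label in selected:
        groups[first_label].append(address)
    return dict(groups)


def tile_rows(addresses, columns):
    """Split one group into rows of at most `columns` tiles each."""
    if columns < 1:
        raise ValueError("columns must be at least 1")
    return [addresses[i:i + columns] for i in range(0, len(addresses), columns)]


# Hypothetical selection data: (address, first label) pairs.
selected = [(1, "柏"), (5, "柏"), (7, "拍"), (9, "柏")]
groups = group_by_first_label(selected)
rows_for_kashiwa = tile_rows(groups["柏"], columns=2)
```

Each group would then be shown on its own screen, so that the inspector compares all candidates for one character type against the same reference images.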

The correct/incorrect information input unit 110 receives input information indicating whether or not the handwritten character image shown in the learning image displayed by the display unit 109 differs from the first label associated with that handwritten character image. Specifically, the correct/incorrect information input unit 110 accepts input information indicating whether or not a handwritten character image displayed by the display unit 109 for visual inspection is a miswritten character (that is, differs from the first label). The correct/incorrect information input unit 110 then writes the combination of the address of the miswritten character and the corresponding character code (first label) to the error data storage unit 111. For example, the inspector visually inspects the group of handwritten character images displayed by the display unit 109 by comparing them with the correct character images, and selects as miswritten characters those handwritten character images judged to be written incorrectly or to be so badly deformed that they cannot be read. One common approach is to select the handwritten character image of a miswritten character with a mouse click. In this case, the correct/incorrect information input unit 110 determines that the handwritten character image selected by the mouse click is a miswritten character, and writes its character image data address, combined with the corresponding character code, to the error data storage unit 111.
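As a rough sketch of this step, the clicked images can be turned into error entries as below. The function name `record_errors` and the representation of clicks as a list of addresses are assumptions for illustration; the actual user interface handling is not described in the source.

```python
def record_errors(selected, clicked_addresses):
    """Return the entries to write to the error data table.

    selected:          (address, first label) pairs shown for inspection
    clicked_addresses: addresses of the images the inspector clicked
    """
    clicked = set(clicked_addresses)
    return [(address, label) for address, label in selected if address in clicked]


# Hypothetical inspection session: three images shown, one clicked.
shown = [(1, "柏"), (5, "柏"), (9, "柏")]
error_entries = record_errors(shown, clicked_addresses=[5])
```

Images that are displayed but not clicked are simply not written to the error data table.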

The error data storage unit 111 stores an error data table. The error data table stores, in association with each other, a character image data address indicating the head address of the area in which character image data determined by the correct/incorrect information input unit 110 to be a miswritten character is stored, and the corresponding character code (first label). The character image data address indicates the address of the area of the acquired label storage unit 103 in which the character image data is stored, and serves as an index when reading character image data from the error data storage unit 111.

Next, the means by which the acquisition unit 102 cuts out handwritten character images one character at a time from the manuscript sheet image will be described with reference to FIG. 2. FIG. 2 is a diagram for explaining the procedure by which the inspection support device 1 according to this embodiment cuts out handwritten character images one character at a time from the manuscript sheet image.
The manuscript sheet in this embodiment is grid paper with frames (cells) that separate the characters one by one. Because the cells of the manuscript sheet are squares arranged at equal intervals, once the coordinates of the four corners of the outer frame quadrilateral are known, the relative positions of all cells and of the characters inside them can be calculated, and the handwritten character images can be cut out. However, the position of the outer frame of a scanned manuscript sheet is not constant because of tilt and misalignment during scanning. The acquisition unit 102 therefore detects the four corner coordinates of the outer frame quadrilateral by the following procedure and cuts out the handwritten character images one character at a time.

The acquisition unit 102 binarizes the manuscript sheet image input from the data input unit 101 (manuscript sheet character image 201) and inverts black and white to obtain a binarized and inverted result image 202. The acquisition unit 102 then extracts contours from the binarized and inverted result image 202 and detects the contour with the largest area, such as the one indicated by reference numeral 2031 in the outer frame detection result image 203, as the outer frame.
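The two operations in this paragraph, binarization with inversion and selection of the largest contour, can be sketched in plain Python as follows. The fixed threshold of 128, the list-of-rows image format, and the representation of a contour as a list of (x, y) vertices are all assumptions for illustration; contour extraction itself is not shown, and a real implementation would more likely rely on an image processing library.

```python
def binarize_inverted(gray, threshold=128):
    """Binarize a grayscale image and invert it in one pass.

    `gray` is a list of rows of 0-255 values. Dark pixels (ink, frame
    lines) become 255 and light pixels (paper) become 0.
    """
    return [[255 if pixel < threshold else 0 for pixel in row] for row in gray]


def polygon_area(points):
    """Area of a closed contour given as (x, y) vertices (shoelace formula)."""
    total = 0.0
    count = len(points)
    for i in range(count):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % count]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def largest_contour(contours):
    """Return the contour enclosing the largest area (assumed outer frame)."""
    return max(contours, key=polygon_area)


# Hypothetical contours: one grid cell and the outer frame.
cell = [(10, 10), (20, 10), (20, 20), (10, 20)]
frame = [(0, 0), (100, 0), (100, 80), (0, 80)]
outer = largest_contour([cell, frame])
```

Inverting the image makes the frame lines the foreground, which is the form contour extraction routines typically expect.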

The non-straight frame line enlarged view 204 is an enlargement of the region 2032 at the upper right corner of the outer frame detection result image 203. As the non-straight frame line enlarged view 204 shows, an image contour is usually not a straight line because of fine irregularities in the line. The acquisition unit 102 therefore applies straight-line approximation to the extracted outer frame contour and makes the outer frame straight, as shown in the straight-line-approximated frame line enlarged view 205. The straight-line-approximated frame line enlarged view 205 is an enlargement of the region 2032 of the outer frame detection result image 203 after straight-line approximation has been applied to the outer frame contour. As a result, the outer frame contour has four end points, and the four corner coordinates of the outer frame quadrilateral are determined.
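The source does not name the approximation algorithm. As one possible illustration, the sketch below uses the Ramer-Douglas-Peucker method, a common choice for this kind of contour simplification, on an open polyline. The function names and the tolerance `epsilon` are assumptions, and handling of a closed contour is omitted for brevity.

```python
import math


def _distance_to_line(point, start, end):
    """Perpendicular distance from `point` to the line through start and end."""
    (px, py), (ax, ay), (bx, by) = point, start, end
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / length


def simplify(points, epsilon):
    """Ramer-Douglas-Peucker simplification of an open polyline."""
    if len(points) < 3:
        return list(points)
    # Find the interior point farthest from the chord between the end points.
    index, farthest = 0, 0.0
    for i in range(1, len(points) - 1):
        distance = _distance_to_line(points[i], points[0], points[-1])
        if distance > farthest:
            index, farthest = i, distance
    if farthest <= epsilon:
        # Every interior point is within tolerance: keep only the end points.
        return [points[0], points[-1]]
    left = simplify(points[:index + 1], epsilon)
    right = simplify(points[index:], epsilon)
    return left[:-1] + right  # drop the duplicated split point


# A slightly bumpy top edge followed by a straight right edge.
bumpy = [(0, 0), (5, 1), (10, 0), (10, 5), (10, 10)]
corners = simplify(bumpy, epsilon=2)
```

With a suitable tolerance, small bumps along an edge are discarded while true corners survive, which is how the outer frame can end up with exactly four end points.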

Because the detected outer frame quadrilateral is in most cases a distorted quadrilateral that is neither a rectangle nor a parallelogram, the acquisition unit 102 applies an image transformation by homography so that the sheet is viewed head-on. The acquisition unit 102 then cuts out the cells, according to the dimensions of the manuscript sheet, from the transformed single-character image frame line detection result image 206, thereby obtaining the handwritten character images one character at a time.
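Once the image has been rectified, cutting out the cells reduces to simple arithmetic on the grid dimensions. The sketch below assumes the rectified image covers exactly the outer frame and that the numbers of rows and columns are known from the sheet design; the function name `cell_boxes` is hypothetical, and the homography step itself is not shown.

```python
def cell_boxes(width, height, rows, cols):
    """Return (left, top, right, bottom) pixel boxes for every grid cell.

    The boxes are listed row by row, left to right, for a rectified image
    of `width` x `height` pixels divided into `rows` x `cols` equal cells.
    """
    boxes = []
    for row in range(rows):
        top = row * height // rows
        bottom = (row + 1) * height // rows
        for col in range(cols):
            left = col * width // cols
            right = (col + 1) * width // cols
            boxes.append((left, top, right, bottom))
    return boxes


# Hypothetical sheet: a 100 x 50 pixel rectified frame with 2 rows of 4 cells.
boxes = cell_boxes(width=100, height=50, rows=2, cols=4)
```

Because the boxes come out in a fixed order, the position of each box in the list can also be used to look up the character type designated for that cell, which corresponds to how the first label is assigned from the position on the sheet.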

Next, the acquired label table stored by the acquired label storage unit 103 will be described with reference to FIG. 3. FIG. 3 is a schematic diagram showing the data structure of the acquired label table stored by the inspection support device 1 according to this embodiment. The acquired label table stores the character image data of each handwritten character image cut out by the acquisition unit 102 in association with the character code representing the correct character type of that handwritten character image (first label).

Next, the estimated label table stored by the estimated label storage unit 106 will be described with reference to FIG. 4. FIG. 4 is a schematic diagram showing the data structure of the estimated label table stored by the inspection support device 1 according to this embodiment. The estimated label table stores, in association with each other, a character image data address indicating the head address at which the character image data of each handwritten character image cut out by the acquisition unit 102 is stored, and the character code representing the character type of that handwritten character image as estimated by the estimation unit 104 (second label). The character image data address serves as an index when reading character image data from the acquired label storage unit 103.
Storing the address of the area of the acquired label storage unit 103 in which each piece of character image data is stored, rather than the character image data itself, reduces the required storage capacity.
Like the estimated label table shown in FIG. 4, the selection data table stored by the selection data storage unit 108 and the error data table stored by the error data storage unit 111 store the character image data address, not the character image data, in association with the character code (first label).
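The relationship between the tables can be sketched with two small record types. The class names, the use of a list index as a stand-in for the character image data address, and the placeholder image bytes are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcquiredEntry:
    """Row of the acquired label table: the image itself plus its first label."""
    image_data: bytes
    first_label: str


@dataclass(frozen=True)
class AddressEntry:
    """Row of the estimated, selection, or error data table.

    Holds only a reference to the image (its address) and a character code.
    """
    image_address: int
    label: str


def load_image(acquired_table, entry):
    """Follow the stored address back to the image data."""
    return acquired_table[entry.image_address].image_data


# Hypothetical tables; the list index plays the role of the address.
acquired_table = [
    AcquiredEntry(image_data=b"<pixels of image 0>", first_label="柏"),
    AcquiredEntry(image_data=b"<pixels of image 1>", first_label="柏"),
]
estimated_table = [AddressEntry(0, "柏"), AddressEntry(1, "拍")]
image = load_image(acquired_table, estimated_table[1])
```

Only the acquired label table holds pixel data in this sketch, so adding further tables costs one small reference per entry rather than another copy of each image.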

Next, the visual inspection screens displayed by the display unit 109 will be described with reference to FIGS. 5 to 7. FIG. 5 is an image diagram showing an example of a visual inspection screen displayed by the inspection support device 1 according to this embodiment.
The visual inspection screen is a screen that the display unit 109 displays on a display DP or the like so that the inspector can visually inspect characters for writing errors. In the example shown in FIG. 5, the display unit 109 displays on the visual inspection screen 300 a group of handwritten character images having the same character code among the handwritten character images stored in the selection data storage unit 108 (handwritten character images 301-1 to 301-9), a correct character image 302 in a textbook typeface, and a correct character image 303 in a Gothic typeface. In this example, the display unit 109 displays the group of handwritten character images corresponding to the character code of the character "柏" (handwritten character images 301-1 to 301-9) and the two correct character images 302 and 303. To make it easy to distinguish the handwritten character images 301-1 to 301-9 under inspection from the correct character images 302 and 303, the display unit 109 displays the correct character images 302 and 303 with black and white inverted (white character on a black background). The display unit 109 also displays the correct character images 302 and 303 at the center of the visual inspection screen 300, and displays the character "柏" 304 represented by the correct character code in the title bar of the visual inspection screen 300.

ここで、正解文字種は「柏」であるのに対して、手書文字画像群（手書文字画像３０１－１～３０１－９）の中段左から４番目の手書文字画像３０１－６及び下段左から２番目の手書文字画像３０１－８には「拍」が書かれており、書き間違いであることがわかる。また、手書文字画像群（手書文字画像３０１－１～３０１－９）の上段左から３番目の手書文字画像３０１－３は「柏」の「木」部分が書き崩されて書かれており、「柏」であるか「拍」であるかの判断がつきにくいため書き間違い文字として選択される。 Here, the correct character type is "柏", whereas the handwritten character image 301-6, fourth from the left in the middle row of the handwritten character image group (handwritten character images 301-1 to 301-9), and the handwritten character image 301-8, second from the left in the bottom row, show "拍", so they can be seen to be writing errors. In addition, in the handwritten character image 301-3, third from the left in the top row, the "木" component of "柏" is written in a distorted form, which makes it hard to judge whether the character is "柏" or "拍"; it is therefore also selected as an erroneously written character.

例えば、検査者は、目視検査用画面において、書き間違い文字と判断した手書文字画像をマウスクリックすることにより、書き間違い文字を入力する。正誤情報入力部１１０は、表示部１０９が表示した目視検査用画面においてマウスクリックを受け付けた手書文字画像を、書き間違い文字と判定し、その文字画像データアドレスを、対応する文字コード（第１ラベル）とともに誤りデータ記憶部１１１に書き込む。 For example, the inspector inputs an erroneously written character by clicking, on the visual inspection screen, the handwritten character image judged to be erroneously written. The correct/wrong information input unit 110 determines that a handwritten character image that received a mouse click on the visual inspection screen displayed by the display unit 109 is an erroneously written character, and writes its character image data address, together with the corresponding character code (first label), to the error data storage unit 111.
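The click handling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the class `ErrorDataStore`, the function `on_tile_clicked`, and the use of file paths as character image data addresses are hypothetical names and choices introduced here.

```python
class ErrorDataStore:
    """Hypothetical in-memory stand-in for the error data storage unit 111."""

    def __init__(self):
        # character image data address -> first label (character code)
        self._records = {}

    def write(self, address, first_label):
        self._records[address] = first_label

    def addresses(self):
        return set(self._records)


def on_tile_clicked(store, displayed, address):
    """Treat a clicked tile as an erroneously written character and record it.

    displayed maps each image address shown on the current screen to its
    first label; clicks on anything else are rejected.
    """
    if address not in displayed:
        raise KeyError(f"{address} is not on the current inspection screen")
    store.write(address, displayed[address])


# Example: the inspector clicks two of the displayed images.
displayed = {"img/301-6.png": "柏", "img/301-7.png": "柏", "img/301-8.png": "柏"}
store = ErrorDataStore()
on_tile_clicked(store, displayed, "img/301-6.png")
on_tile_clicked(store, displayed, "img/301-8.png")
```

Keeping the first label next to the address mirrors the text, which stores the address together with the corresponding character code.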

このように、表示部１０９は、選定データ記憶部１０８に文字画像データアドレスが記憶されている手書文字画像のみを目視検査対象として表示するため、取得ラベル記憶部１０３が記憶する全ての手書文字画像を目視検査する必要がなくなる。すなわち、検査者は、書き間違いの可能性のある手書文字画像のみを目視検査すれば良くなる。そのため、検査者の負担が減り、検査を効率的に行える。
また、この表示例のように、文字種毎にまとめてタイル状に表示し目視検査を行えるようにすることで、検査対象の手書文字画像を一つずつ正解文字画像と見比べて検査する手間を省き、効率的に検査を行うことができる。
また、正解文字画像を複数種類（本例では２種類）表示することにより、同じ文字種におけるフォントや書体による違いを検査者が確認することができる。例えば、「柏」の場合には、「白」部分の一画目のはらいは、教科書体では左上部分にあるが、ゴシック体では中央部分にある。よって、検査者は、２種類の正解文字画像を見比べて、「白」の一画目のはらいは左上にあっても中央にあっても良いことを知ることができる。
また、正解文字画像を白黒反転して表示することにより、検査者が正解文字画像と検査対象となる手書文字画像とを判別し易くなる。
また、正解文字画像を中心に表示し、正方形に近いタイル状に各手書文字画像を表示することにより、検査対象となる手書文字画像各々と正解文字画像との距離が略同一となるため、検査者がどの手書文字画像と見比べるときにも視線の移動距離が略同じになり、比較し易くなる。 In this manner, the display unit 109 displays only the handwritten character images whose character image data addresses are stored in the selection data storage unit 108 as targets for visual inspection. It eliminates the need for visual inspection of character images. In other words, the inspector only needs to visually inspect the handwritten character images that may contain handwriting errors. Therefore, the burden on the inspector is reduced, and the inspection can be performed efficiently.
As in this display example, grouping the images by character type and displaying them as tiles for visual inspection saves the trouble of comparing each handwritten character image under inspection with the correct character image one at a time, so the inspection can be performed efficiently.
In addition, by displaying a plurality of types (two types in this example) of correct character images, the inspector can confirm how the same character type differs between fonts and typefaces. For example, in the case of "柏", the first stroke (a sweeping stroke) of the "白" component is at the upper left in the textbook typeface but at the center in the Gothic typeface. By comparing the two types of correct character images, the inspector can therefore see that the first stroke of "白" may be either at the upper left or at the center.
In addition, by displaying the correct character image with black and white reversed, the inspector can easily distinguish between the correct character image and the handwritten character image to be inspected.
In addition, by displaying the correct character image in the center and displaying each handwritten character image in a nearly square tile shape, the distance between each handwritten character image to be inspected and the correct character image is approximately the same. When the inspector compares any handwritten character image, the moving distance of the line of sight becomes substantially the same, which facilitates the comparison.
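The layout described above (a near-square grid of tiles with the correct character images at the center) could be computed along the following lines. This is a sketch only: the function name `layout_tiles`, the row-major slot order, and the rule for choosing the number of columns are assumptions made for illustration, and the black-and-white inversion of the reference tiles is left to the rendering code.

```python
import math


def layout_tiles(handwritten, references):
    """Return a row-major grid (list of rows) with references at the centre.

    handwritten: identifiers of the images to inspect
    references:  identifiers of the correct (reference) character images
    Unused slots at the end of the grid are filled with None.
    """
    total = len(handwritten) + len(references)
    cols = math.ceil(math.sqrt(total))  # near-square grid
    rows = math.ceil(total / cols)
    slots = rows * cols
    # Centre the run of reference slots within the row-major sequence.
    start = (slots - len(references)) // 2
    ref_slots = set(range(start, start + len(references)))

    hand_iter, ref_iter = iter(handwritten), iter(references)
    flat = []
    for i in range(slots):
        if i in ref_slots:
            flat.append(next(ref_iter))
        else:
            flat.append(next(hand_iter, None))
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


grid = layout_tiles([f"301-{i}" for i in range(1, 10)], ["302", "303"])
for row in grid:
    print(row)
```

For nine handwritten images and two reference images this gives three rows of four slots, with 302 and 303 in the middle of the second row; 301-3, 301-6, and 301-8 then fall at the positions the text gives for FIG. 5.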

図６は、本実施形態による検査支援装置１が表示する目視検査用画面の他の例を示すイメージ図である。
本図に示す目視検査用画面３００Ａに表示されている手書文字画像群（手書文字画像３０１－１～３０１－９）及び正解文字画像３０２，３０３は、図５に示す目視検査用画面３００に表示されているものと同一である。本図に示す目視検査用画面３００Ａでは、正解文字画像３０２，３０３が、手書文字画像群（手書文字画像３０１－１～３０１－９）の後（画面右下）に表示されている点が、目視検査用画面３００と異なる。他の表示は、目視検査用画面３００と同様であるため、その説明を省略する。本図に示す例に限らず、正解文字画像３０２，３０３は、検査者が視認できる位置であれば、目視検査用画面のどの位置に表示されていてもよい。 FIG. 6 is an image diagram showing another example of the visual inspection screen displayed by the inspection support apparatus 1 according to this embodiment.
The group of handwritten character images (handwritten character images 301-1 to 301-9) and the correct character images 302 and 303 displayed on the visual inspection screen 300A shown in this figure are the same as those displayed on the visual inspection screen 300 shown in FIG. 5. The visual inspection screen 300A differs from the visual inspection screen 300 in that the correct character images 302 and 303 are displayed after the group of handwritten character images (at the lower right of the screen). The rest of the display is the same as on the visual inspection screen 300, so its description is omitted. The arrangement is not limited to the example shown in this figure: the correct character images 302 and 303 may be displayed at any position on the visual inspection screen as long as the inspector can see them.

図７は、本実施形態による検査支援装置１が表示する目視検査用画面の他の例を示すイメージ図である。
本図に示す目視検査用画面３００Ｂに表示されている手書文字画像群（手書文字画像３０１－１～３０１－９）及び正解文字画像３０２，３０３は、図５に示す目視検査用画面３００に表示されているものと同一である。目視検査用画面３００が横長の画面であるのに対し、本図に示す目視検査用画面３００Ｂは、縦長の画面である。また、目視検査用画面３００に表示されている２種類の正解文字画像３０２，３０３が横一列に配置されているのに対し、本図に示す目視検査用画面３００Ｂでは、２種類の正解文字画像３０２，３０３が縦一列に配置されている。本図に示す例に限らず、目視検査用画面３００は、横長の画面であってもよいし、縦長の画面であってもよい。 FIG. 7 is an image diagram showing another example of the visual inspection screen displayed by the inspection support apparatus 1 according to this embodiment.
The group of handwritten character images (handwritten character images 301-1 to 301-9) and the correct character images 302 and 303 displayed on the visual inspection screen 300B shown in this figure are the same as those displayed on the visual inspection screen 300 shown in FIG. 5. Whereas the visual inspection screen 300 is a horizontally long screen, the visual inspection screen 300B shown in this figure is a vertically long screen. In addition, whereas the two types of correct character images 302 and 303 on the visual inspection screen 300 are arranged in a horizontal row, on the visual inspection screen 300B they are arranged in a vertical column. The visual inspection screen is not limited to the example shown in this figure and may be either horizontally long or vertically long.

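なお、上述した表示例では、正解文字画像を２種類表示しているが、これに限らず、検査用画面に表示する正解文字画像は３種類以上であってもよいし、１種類であってもよい。 In the display examples described above, two types of correct character images are displayed, but this is not limiting: three or more types, or only one type, of correct character image may be displayed on the inspection screen.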

続いて、図８を参照して、検査支援装置１による検査支援処理について説明する。図８は、本実施形態による検査支援装置１が実行する検査支援処理の手順を示すフローチャートである。 Next, the examination support processing by the examination support apparatus 1 will be described with reference to FIG. 8 . FIG. 8 is a flow chart showing the procedures of examination support processing executed by the examination support apparatus 1 according to this embodiment.

（ステップＳ１）データ入力部１０１は、書き間違いの有無の検査したい検査対象となる文字画像の入力を受け付ける。文字画像は、例えば、原稿用紙等に複数文字種の手書き文字が書かれた画像である。 (Step S1) The data input unit 101 receives an input of a character image to be inspected for writing errors. A character image is, for example, an image in which a plurality of character types of handwritten characters are written on a manuscript sheet or the like.

（ステップＳ２）取得部１０２は、ステップＳ１で入力された文字画像に対して画像処理を行い、一文字単位で手書文字画像を切り出し、対応する文字種を表す第１ラベルを取得する。 (Step S2) The acquiring unit 102 performs image processing on the character image input in step S1, cuts out the handwritten character image for each character, and acquires the first label representing the corresponding character type.

（ステップＳ３）推定部１０４は、ステップＳ２で切り出された各手書文字画像を識別器１０５に入力し、各々の文字種（第２ラベル）を推定する。 (Step S3) The estimation unit 104 inputs each handwritten character image cut out in step S2 to the classifier 105, and estimates each character type (second label).

（ステップＳ４）比較部１０７は、ステップＳ２で切り出された手書文字画像から１枚選択し、それに対応するステップＳ２で取得された第１ラベルと、ステップＳ３で推定した第２ラベルとを比較する。 (Step S4) The comparison unit 107 selects one handwritten character image cut out in step S2, and compares the corresponding first label obtained in step S2 with the second label estimated in step S3. do.

（ステップＳ５）比較部１０７は、ステップＳ４での比較結果、第１ラベルと第２ラベルとが異なっているか否かを判定する。ラベルが異なる場合（ステップＳ５：ＹＥＳ）には、ステップＳ６に処理を進める。ラベルが同じ場合（ステップＳ５：ＮＯ）には、ステップＳ７に処理を進める。 (Step S5) Based on the comparison in step S4, the comparison unit 107 determines whether or not the first label and the second label differ. If the labels differ (step S5: YES), the process proceeds to step S6. If the labels are the same (step S5: NO), the process proceeds to step S7.

（ステップＳ６）比較部１０７は、ラベルが異なる手書文字画像の文字画像データアドレスを、それに対応するステップＳ２で取得された第１ラベルに対応付けて選定データ記憶部１０８に書き込んで保存する。 (Step S6) The comparison unit 107 writes and saves the character image data address of the handwritten character image with a different label in the selection data storage unit 108 in association with the corresponding first label acquired in step S2.

（ステップＳ７）比較部１０７は、ステップＳ２で切り出された全ての手書文字画像のラベルを比較したか否かを判定する。全ての手書文字画像のラベルを比較し終えた場合（ステップＳ７：ＹＥＳ）には、ステップＳ８に処理を進める。全ての手書文字画像のラベルを比較し終えていない場合（ステップＳ７：ＮＯ）には、ステップＳ４に処理を戻す。 (Step S7) The comparison unit 107 determines whether or not the labels of all handwritten character images cut out in step S2 have been compared. If the labels of all handwritten character images have been compared (step S7: YES), the process proceeds to step S8. If the labels of all handwritten character images have not been compared (step S7: NO), the process returns to step S4.

（ステップＳ８）表示部１０９は、選定データ記憶部１０８から同一の文字種（第１ラベル）の手書文字画像を読み出し、読み出した全ての手書文字画像を正解文字画像とともにタイル状に並べディスプレイ等に表示する。 (Step S8) The display unit 109 reads the handwritten character images of the same character type (first label) from the selection data storage unit 108, arranges all the read handwritten character images together with the correct character image in a tile form, and displays them. to display.

（ステップＳ９）正誤情報入力部１１０は、書き間違い文字の選択入力を取得する。例えば、検査者は、ステップＳ８で表示された手書文字画像群を目視検査し、書き間違えている手書文字画像がある場合には、その手書文字画像をマウスクリック等で選択する。正誤情報入力部１１０は、マウスクリックで選択された手書文字画像を書き間違い文字と判定し、その文字画像データアドレスを第１ラベルに対応付けて誤りデータ記憶部１１１に書き込む。表示されている全ての手書文字画像に対して目視検査が終了すると、ステップＳ１０に処理を進める。例えば、表示部１０９は、表示した目視検査用画面が検査者の操作により閉じられたときに、表示されている全ての手書文字画像に対して目視検査が終了したと判定する。 (Step S9) The correct/wrong information input unit 110 acquires the selection input of the erroneously written character. For example, the inspector visually inspects the group of handwritten character images displayed in step S8, and if there is an incorrectly written handwritten character image, selects the handwritten character image by a mouse click or the like. The correct/wrong information input unit 110 determines that the handwritten character image selected by mouse click is an incorrectly written character, and writes the character image data address to the error data storage unit 111 in association with the first label. When all the displayed handwritten character images have been visually inspected, the process proceeds to step S10. For example, when the displayed visual inspection screen is closed by the inspector's operation, the display unit 109 determines that the visual inspection has been completed for all displayed handwritten character images.

（ステップＳ１０）表示部１０９は、選定データ記憶部１０８に記憶されている全ての手書文字画像に対し目視検査を実施したか否かを判定する。目視検査を実施していない手書文字画像がある場合（ステップＳ１０：ＮＯ）には、目視検査していない文字種に対してステップＳ８に処理を進める。選定データ記憶部１０８に記憶されている全ての手書文字画像の目視検査が終わっている場合（ステップＳ１０：ＹＥＳ）には、処理を終了する。 (Step S10) The display unit 109 determines whether or not all handwritten character images stored in the selection data storage unit 108 have been visually inspected. If there is a handwritten character image that has not been visually inspected (step S10: NO), the process proceeds to step S8 for the character type that has not been visually inspected. If all the handwritten character images stored in the selection data storage unit 108 have been visually inspected (step S10: YES), the process ends.
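The selection and grouping logic of steps S3 to S8 can be sketched as follows. This is a minimal sketch under assumptions: `classify` is a hypothetical callable standing in for the classifier 105, a plain dict stands in for the selection data storage unit 108, and the function names are illustrative rather than taken from the embodiment.

```python
from collections import defaultdict


def select_for_inspection(samples, classify):
    """Steps S3 to S7: keep only samples whose labels disagree.

    samples:  iterable of (address, first_label) pairs
    classify: callable mapping an address to an estimated (second) label
    Returns a dict address -> first_label (stand-in for storage unit 108).
    """
    selected = {}
    for address, first_label in samples:
        second_label = classify(address)      # S3: estimate second label
        if first_label != second_label:       # S4, S5: compare labels
            selected[address] = first_label   # S6: store for inspection
    return selected


def group_by_first_label(selected):
    """Step S8: collect the selected images into one screen per character type."""
    screens = defaultdict(list)
    for address, first_label in selected.items():
        screens[first_label].append(address)
    return dict(screens)


# Toy data: a lookup table plays the role of the trained classifier.
samples = [("a", "柏"), ("b", "柏"), ("c", "拍"), ("d", "柏")]
estimates = {"a": "柏", "b": "拍", "c": "拍", "d": "相"}
selected = select_for_inspection(samples, estimates.get)
screens = group_by_first_label(selected)
```

Only images "b" and "d" reach the inspector here; "a" and "c" are skipped because the estimated label agrees with the given one.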

このように、書き間違いのある手書文字画像の文字画像データアドレスを誤りデータ記憶部１１１に記憶しておくことで、取得ラベル記憶部１０３に記憶されている手書文字画像のうち、誤りデータ記憶部１１１にその文字画像データアドレスが記憶されているものをディープラーニングの学習用画像から除くことができる。すなわち、書き間違いのない手書文字画像のみをディープラーニングの学習用画像として用いることができる。 In this way, by storing the character image data addresses of the handwritten character images with writing errors in the error data storage unit 111, the error data among the handwritten character images stored in the acquired label storage unit 103 can be Images whose character image data addresses are stored in the storage unit 111 can be excluded from deep learning learning images. In other words, only handwritten character images without spelling mistakes can be used as training images for deep learning.
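Excluding the recorded images from the training data then reduces to a filter on the stored addresses. The sketch below assumes the labelled set is a list of (address, label) pairs and the error records are a set of addresses; `clean_training_set` is a hypothetical name.

```python
def clean_training_set(labelled, error_addresses):
    """Drop every sample whose address was recorded as a writing error."""
    return [(addr, label) for addr, label in labelled
            if addr not in error_addresses]


labelled = [("a", "柏"), ("b", "柏"), ("c", "拍")]
training = clean_training_set(labelled, {"b"})
```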

このように、本実施形態よれば、検査支援装置１は、学習用画像と、学習用画像に対応付けられた第１ラベルと、がセットになった学習データセットを取得する取得部１０２と、取得部１０２によって取得された学習データセットの学習用画像を画像認識することによって、学習用画像に対応付けられることが推定される第２ラベルを推定する推定部１０４と、第１ラベルと、第２ラベルとを比較する比較部１０７と、比較部１０７による比較結果に基づいて、第１ラベルと第２ラベルとが異なる学習用画像を表示する表示部１０９と、表示部１０９によって表示された学習用画像に示されている手書文字画像が、その手書文字画像に対応付けられた第１ラベルと異なっているか否かを示す入力情報が入力される正誤情報入力部１１０と、を備える。 As described above, according to the present embodiment, the inspection support apparatus 1 includes the acquisition unit 102 that acquires a learning data set in which a learning image and a first label associated with the learning image are set; An estimating unit 104 that estimates a second label that is estimated to be associated with the learning image by image recognition of the learning image of the learning data set acquired by the acquiring unit 102, a first label, a first A comparison unit 107 that compares the two labels, a display unit 109 that displays learning images with different first labels and second labels based on the comparison result of the comparison unit 107, and a learning image displayed by the display unit 109. and a correct/incorrect information input unit 110 for inputting input information indicating whether or not the handwritten character image shown in the handwritten image is different from the first label associated with the handwritten character image.

上述した構成により、学習データセットに対して学習用画像に誤りがないかの目視検査をする際、画像認識によって取得される第２ラベルとあらかじめ付与されている第１ラベルとを比較してラベルの異なる学習用画像のみ選定することで、学習データセットの大半を占める間違いではない学習用画像を検査する手間を省くことができる。よって、検査者は、全ての学習用画像を目視検査する必要がなくなり、手書文字画像に示されている文字が第１ラベルと異なっているか可能性のある学習用画像のみを目視検査すれば良くなる。すなわち、目視検査すべき学習用画像を削減できるため、検査者の負担が減り、検査を効率的に行える。よって、本実施形態によれば、膨大にある学習データセットから誤りを検出する作業を支援し、効率的に行うことができる。 With the configuration described above, when the learning images of a learning data set are visually inspected for errors, the second label obtained by image recognition is compared with the first label assigned in advance, and only the learning images whose labels differ are selected. This saves the trouble of inspecting the error-free learning images that make up most of the learning data set. The inspector therefore no longer needs to visually inspect every learning image, and only has to inspect the learning images in which the character shown in the handwritten character image may differ from the first label. In other words, because the number of learning images to be visually inspected is reduced, the burden on the inspector is reduced and the inspection can be performed efficiently. Thus, according to the present embodiment, the work of detecting errors in a very large learning data set can be supported and carried out efficiently.

また、表示部１０９は、第１ラベルと第２ラベルとが異なる学習用画像を、第１ラベルの種別ごとに、まとめてタイル状に表示する。このように、第１ラベルと第２ラベルとが異なる検査対象の学習用画像を同一のラベル毎にまとめてタイル状に表示することで、検査者は、複数の学習用画像をまとめて一度に検査することができるため、効率的に検査することが可能になる。 In addition, the display unit 109 collectively displays learning images with different first labels and second labels in tiles for each type of first label. In this way, the training images to be inspected, which have different first labels and second labels, are collectively displayed in tiles for each of the same labels. Since it can be inspected, efficient inspection becomes possible.

また、表示部１０９は、学習用画像とともに、第１ラベルが示す基準画像を表示する。これにより、検査者は、表示画面上で学習用画像を基準画像と比較して検査することができるため、作業を効率化することができる。 Also, the display unit 109 displays the reference image indicated by the first label together with the learning image. As a result, the inspector can inspect the learning image by comparing it with the reference image on the display screen, so that the work can be made more efficient.

また、学習用画像は、手書きの文字が示された手書文字画像であり、第１ラベルは、手書文字画像に示された文字の種別を示す情報である。よって、本実施形態によれば、大量にある手書き文字の書き間違いを検出する作業を支援することができる。すなわち、手書文字画像に対して、効率的かつ見落としが少ない目視検査を行うことができる。 The learning image is a handwritten character image showing handwritten characters, and the first label is information indicating the type of the character shown in the handwritten character image. Therefore, according to the present embodiment, it is possible to support the task of detecting writing errors in a large number of handwritten characters. That is, it is possible to perform an efficient visual inspection with few oversights on the handwritten character image.

また、取得部１０２は、文字の種別が異なる複数の手書きの文字が書かれた原稿用紙の画像を二値化及び白黒反転させて一文字単位で手書文字画像を取得し、原稿用紙における位置から各手書文字画像の文字の種別を特定する。これにより、スキャンした原稿用紙の画像に影等が写り込んでいた場合にそれらを排除し、原稿用紙の直線を精度良く抽出することができるため、原稿用紙の画像から手書文字画像とその文字の種別とを正確に取得することができる。 In addition, the acquisition unit 102 binarizes and black-and-white inverts an image of a manuscript sheet on which a plurality of handwritten characters of different character types are written, acquires a handwritten character image for each character, and identifies the character type of each handwritten character image from its position on the manuscript sheet. As a result, shadows and the like captured in the scanned image of the manuscript sheet can be removed and the straight lines of the manuscript sheet can be extracted with high accuracy, so the handwritten character images and their character types can be acquired accurately from the image of the manuscript sheet.
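The idea of binarizing, inverting, and labeling by position can be illustrated with a short sketch. It makes strong simplifying assumptions that are not in the embodiment: the sheet is a grayscale image held as a list of pixel rows, the cells form a regular grid of equal size (the embodiment instead detects the frame lines of the manuscript sheet), and the threshold value, function names, and `labels_by_position` table are hypothetical.

```python
def binarize_and_invert(gray, threshold=128):
    """Map dark ink (below threshold) to 1 and light paper to 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def cut_cells(binary, rows, cols, labels_by_position):
    """Split a binarized sheet into rows x cols cells, labeled by position.

    labels_by_position[r][c] is the character type expected in cell (r, c),
    which plays the role of the first label.
    """
    height, width = len(binary), len(binary[0])
    cell_h, cell_w = height // rows, width // cols
    cells = []
    for r in range(rows):
        for c in range(cols):
            patch = [line[c * cell_w:(c + 1) * cell_w]
                     for line in binary[r * cell_h:(r + 1) * cell_h]]
            cells.append((labels_by_position[r][c], patch))
    return cells


# A 2 x 4 toy "sheet" holding two character cells side by side.
gray = [[255, 0, 255, 255],
        [255, 255, 0, 0]]
binary = binarize_and_invert(gray)
cells = cut_cells(binary, rows=1, cols=2, labels_by_position=[["柏", "拍"]])
```

After inversion the ink pixels carry the value 1, which corresponds to white characters on a black background.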

上述した実施形態における検査支援装置１の全部または一部をコンピュータで実現するようにしてもよい。その場合、この機能を実現するためのプログラムをコンピュータ読み取り可能な記録媒体に記録して、この記録媒体に記録されたプログラムをコンピュータシステムに読み込ませ、実行することによって実現してもよい。なお、ここでいう「コンピュータシステム」とは、ＯＳや周辺機器等のハードウェアを含むものとする。また、「コンピュータ読み取り可能な記録媒体」とは、フレキシブルディスク、光磁気ディスク、ＲＯＭ、ＣＤ－ＲＯＭ等の可搬媒体、コンピュータシステムに内蔵されるハードディスク等の記憶装置のことをいう。さらに「コンピュータ読み取り可能な記録媒体」とは、インターネット等のネットワークや電話回線等の通信回線を介してプログラムを送信する場合の通信線のように、短時間の間、動的にプログラムを保持するもの、その場合のサーバやクライアントとなるコンピュータシステム内部の揮発性メモリのように、一定時間プログラムを保持しているものも含んでもよい。また上記プログラムは、前述した機能の一部を実現するためのものであってもよく、さらに前述した機能をコンピュータシステムにすでに記録されているプログラムとの組み合わせで実現できるものであってもよく、ＦＰＧＡ等のプログラマブルロジックデバイスを用いて実現されるものであってもよい。 All or part of the examination support apparatus 1 in the above-described embodiment may be realized by a computer. In that case, a program for realizing this function may be recorded in a computer-readable recording medium, and the program recorded in this recording medium may be read into a computer system and executed. It should be noted that the "computer system" referred to here includes hardware such as an OS and peripheral devices. The term "computer-readable recording medium" refers to portable media such as flexible discs, magneto-optical discs, ROMs and CD-ROMs, and storage devices such as hard discs incorporated in computer systems. Furthermore, "computer-readable recording medium" means a medium that dynamically retains a program for a short period of time, like a communication line when transmitting a program via a network such as the Internet or a communication line such as a telephone line. It may also include something that holds the program for a certain period of time, such as a volatile memory inside a computer system that serves as a server or client in that case. Further, the program may be for realizing a part of the functions described above, or may be capable of realizing the functions described above in combination with a program already recorded in the computer system. It may be implemented using a programmable logic device such as FPGA.

以上、この発明の実施形態について図面を参照して詳述してきたが、具体的な構成はこの実施形態に限られるものではなく、この発明の要旨を逸脱しない範囲の設計等も含まれる。 Although the embodiment of the present invention has been described in detail with reference to the drawings, the specific configuration is not limited to this embodiment, and design and the like are included within the scope of the gist of the present invention.

例えば、上述した実施形態では、検査支援装置１は、手書き文字が書かれた原稿用紙から手書文字画像を取得しているが、これに限らず、例えば、タッチパネルを備えたタブレット端末やパーソナルコンピュータ等にタッチペンで一文字ずつ手書き入力する等、他の方法で手書文字画像を取得してもよい。 For example, in the above-described embodiment, the examination support apparatus 1 acquires a handwritten character image from a document sheet on which handwritten characters are written. Handwritten character images may be obtained by other methods such as handwriting input of each character with a touch pen.

また、上述した実施形態では、学習用画像として手書文字画像を例に説明したが、学習用画像は、これに限らず、風景画像（例えば、雲が撮像された風景画像に、その雲の種別をラベルしたもの）や、天気画像（晴れ、曇り、雨等をラベルしたもの）や、その他物の画像等、任意の画像を対象とすることができる。例えば、検査支援装置１は、動物を撮像した動物画像に、その動物の種別をラベルしたものである場合には、第１ラベルが「猫」の学習用画像のなかに「熊」の画像があるときに、検査対象画像として表示してもよい。 In the embodiment described above, handwritten character images were used as an example of learning images, but learning images are not limited to these: any images can be targeted, such as landscape images (for example, landscape images of clouds labeled with the cloud type), weather images (labeled as sunny, cloudy, rainy, and so on), or images of other objects. For example, when the learning images are animal images labeled with the type of animal, the inspection support apparatus 1 may display an image of a "bear" found among the learning images whose first label is "cat" as an image to be inspected.

１…検査支援装置（支援システム）
１０１…データ入力部（入力部）
１０２…取得部
１０３…取得ラベル記憶部
１０４…推定部
１０５…識別器
１０６…推定ラベル記憶部
１０７…比較部
１０８…選定データ記憶部
１０９…表示部
１１０…正誤情報入力部
１１１…誤りデータ記憶部
２０１…原稿用紙文字画像
２０２…二値化・白黒反転結果画像
２０３…外枠検出結果画像
２０４…非直線枠線拡大図
２０５…直線近似枠線拡大図
２０６…一文字画像枠線検出結果画像
３００,３００Ａ，３００Ｂ…目視検査用画面
３０１－１～３０１－９…手書文字画像
３０２…正解文字画像
３０３…正解文字画像
ＤＰ…ディスプレイ
1 ... Inspection support apparatus (support system)
101 ... Data input unit (input unit)
102 ... Acquisition unit
103 ... Acquired label storage unit
104 ... Estimation unit
105 ... Classifier
106 ... Estimated label storage unit
107 ... Comparison unit
108 ... Selection data storage unit
109 ... Display unit
110 ... Correct/wrong information input unit
111 ... Error data storage unit
201 ... Manuscript sheet character image
202 ... Binarization and black-and-white inversion result image
203 ... Outer frame detection result image
204 ... Enlarged view of non-straight frame line
205 ... Enlarged view of straight-line-approximated frame line
206 ... Single-character image frame line detection result image
300, 300A, 300B ... Visual inspection screen
301-1 to 301-9 ... Handwritten character image
302 ... Correct character image
303 ... Correct character image
DP ... Display

Claims

1. A support system comprising:
an acquisition unit that acquires a learning data set in which a learning image and a first label associated with the learning image form a set;
an estimation unit that estimates, by performing image recognition on the learning image of the learning data set acquired by the acquisition unit, a second label estimated to be associated with the learning image;
a comparison unit that compares the first label with the second label;
a display unit that displays, based on the comparison result from the comparison unit, the learning image for which the first label and the second label differ; and
an input unit to which input information is input, the input information indicating whether or not the image shown in the learning image displayed by the display unit differs from the first label associated with the learning image.

2. The support system according to claim 1, wherein the display unit displays the learning images for which the first label and the second label differ, grouped by type of the first label.

3. The support system according to claim 2, wherein the display unit displays the learning images for which the first label and the second label differ as tiles for each type of the first label.

4. The support system according to any one of claims 1 to 3, wherein the display unit displays a reference image indicated by the first label together with the learning image.

5. The support system according to any one of claims 1 to 4, wherein
the learning image is a handwritten character image showing a handwritten character, and
the first label is information indicating the type of the character shown in the handwritten character image.

6. The support system according to claim 5, wherein the acquisition unit binarizes and black-and-white inverts an image of a manuscript sheet on which a plurality of handwritten characters of different character types are written, acquires a handwritten character image for each character, and identifies the character type of each handwritten character image from its position on the manuscript sheet.

7. A support method comprising:
an acquisition step of acquiring a learning data set in which a learning image and a first label associated with the learning image form a set;
an estimation step of estimating, by performing image recognition on the learning image of the acquired learning data set, a second label estimated to be associated with the learning image;
a comparison step of comparing the first label with the second label;
a display step of displaying, based on the comparison result of the comparison step, the learning image for which the first label and the second label differ; and
an input step in which input information is input, the input information indicating whether or not the image shown in the displayed learning image differs from the first label associated with the learning image.

8. A program for causing a computer to execute:
a step of acquiring a learning data set in which a learning image and a first label associated with the learning image form a set;
a step of estimating, by performing image recognition on the learning image of the acquired learning data set, a second label estimated to be associated with the learning image;
a comparison step of comparing the first label with the second label;
a step of displaying, based on the comparison result of the comparison step, the learning image for which the first label and the second label differ; and
a step in which input information is input, the input information indicating whether or not the image shown in the displayed learning image differs from the first label associated with the learning image.