JP2021002270A - Image recognition learning device, image recognition learning method, image recognition learning program and terminal device

Info

Publication number: JP2021002270A
Application number: JP2019116379A
Authority: JP
Inventors: 英樹竹原; Hideki Takehara
Original assignee: JVCKenwood Corp
Current assignee: JVCKenwood Corp
Priority date: 2019-06-24
Filing date: 2019-06-24
Publication date: 2021-01-07

Abstract

To construct a multi-layer neural network having an appropriate recognition rate. SOLUTION: An image recognition learning device includes: an image acquisition unit 31 that acquires an image given a label indicating a first category and an image given a label indicating a second category; a reference image generation unit 20 that generates a reference image based on the image given the label indicating the first category; a difference image generation unit 33 that generates a first difference image calculated from a difference between the image given the label indicating the first category and the reference image, and a second difference image calculated from a difference between the image given the label indicating the second category and the reference image; and a learning unit 35 that generates a trained model by learning the weights of a multi-layer neural network that discriminates between the first category and the second category, using the first difference image and the second difference image. SELECTED DRAWING: Figure 1

Description

The present invention relates to an image recognition learning device, an image recognition learning method, an image recognition learning program, and a terminal device.

As methods for visual inspection of products, methods that determine the presence or absence of an abnormality by image recognition are known (see, for example, Patent Documents 1 to 3).

Patent Document 1: Japanese Unexamined Patent Publication No. 2006-047040
Patent Document 2: Japanese Unexamined Patent Publication No. 2018-173814
Patent Document 3: Japanese Unexamined Patent Publication No. H10-289317

Patent Document 1 extracts features from an inspection target region of an inspection object and performs learning. In Patent Document 1, features must be extracted through repeated trial and error for each part or product. Patent Document 2 performs image recognition using teacher data of the recognition target, but features other than the essential ones are learned at the same time. As a result, a very large amount of teacher data is required, and it may be difficult to prepare enough teacher data for practical use. Patent Document 2 also removes non-essential features by masking everything other than the feature portion. However, feature quantities must then be used to determine which portions are non-essential. Patent Document 3 adjusts pixel density by comparing a predetermined comparison region between an input image and a reference image. This requires a large number of reference images to be stored in advance.

Deep learning using a multi-layer neural network generally requires a very large amount of teacher data, and it may be difficult to prepare enough teacher data for practical use. If the amount of teacher data is insufficient, it becomes difficult to obtain an appropriate recognition rate. Thus, deep learning using a multi-layer neural network leaves room for improvement in the recognition rate.

The present invention has been made in view of the above, and an object of the present invention is to construct a multi-layer neural network having an appropriate recognition rate.

In order to solve the above problem and achieve the object, an image recognition learning device according to the present invention includes: an image acquisition unit that acquires an image given a label indicating a first classification and an image given a label indicating a second classification; a reference image generation unit that generates a reference image based on the image given the label indicating the first classification; a difference image generation unit that generates a first difference image calculated from the difference between the image given the label indicating the first classification and the reference image, and a second difference image calculated from the difference between the image given the label indicating the second classification and the reference image; and a learning unit that generates a trained model by learning, using the first difference image and the second difference image, the weights of a multi-layer neural network that discriminates between the first classification and the second classification.

An image recognition learning method according to the present invention includes: an image acquisition step of acquiring an image given a label indicating a first classification and an image given a label indicating a second classification; a reference image generation step of generating a reference image based on the image given the label indicating the first classification; a difference image generation step of generating a first difference image calculated from the difference between the image given the label indicating the first classification and the reference image, and a second difference image calculated from the difference between the image given the label indicating the second classification and the reference image; and a learning step of generating a trained model by learning, using the first difference image and the second difference image, the weights of a multi-layer neural network that discriminates between the first classification and the second classification.

An image recognition learning program according to the present invention includes: an image acquisition step of acquiring an image given a label indicating a first classification and an image given a label indicating a second classification; a reference image generation step of generating a reference image based on the image given the label indicating the first classification; a difference image generation step of generating a first difference image calculated from the difference between the image given the label indicating the first classification and the reference image, and a second difference image calculated from the difference between the image given the label indicating the second classification and the reference image; and a learning step of generating a trained model by learning, using the first difference image and the second difference image, the weights of a multi-layer neural network that discriminates between the first classification and the second classification.

A terminal device according to the present invention includes: an image acquisition unit that acquires a specific image; a reference image acquisition unit that acquires a reference image based on an image given a label indicating a first classification; a difference image generation unit that generates a third difference image calculated from the difference between the specific image and the reference image; a trained model acquisition unit that acquires a trained model in which the weights of a multi-layer neural network that discriminates between the first classification and a second classification have been learned using a first difference image, calculated from the difference between the image given the label indicating the first classification and the reference image, and a second difference image, calculated from the difference between an image given a label indicating the second classification and the reference image; and a data inference unit that, by inputting the third difference image into the trained model, classifies the specific image acquired by the image acquisition unit into the first classification or the second classification.

According to the present invention, a multi-layer neural network having an appropriate recognition rate can be constructed.

FIG. 1 is a schematic diagram showing a configuration example of the image processing system according to the first embodiment.
FIG. 2 is a diagram illustrating an image set.
FIG. 3 is a schematic diagram showing a configuration example of the learning unit according to the first embodiment.
FIG. 4 is a diagram illustrating a composite abnormal image according to the first embodiment.
FIG. 5 is a schematic diagram showing a configuration example of the inference unit according to the first embodiment.
FIG. 6 is a flowchart showing the flow of processing in the image processing system according to the first embodiment.
FIG. 7 is a flowchart showing the flow of processing in the learning unit according to the first embodiment.
FIG. 8 is a flowchart showing the flow of processing in the inference unit according to the first embodiment.
FIG. 9 is a schematic diagram showing a configuration example of the image processing system according to the second embodiment.
FIG. 10 is a schematic diagram showing a configuration example of the image processing system according to the third embodiment.

Embodiments of the image recognition learning device, the image recognition learning method, the image recognition learning program, and the terminal device according to the present invention will be described in detail below with reference to the accompanying drawings. The present invention is not limited to the following embodiments.

[First Embodiment]
FIG. 1 is a schematic diagram showing a configuration example of the image processing system according to the first embodiment. The image processing system 1 executes image recognition using a multi-layer neural network (hereinafter referred to as a "DNN") trained by supervised deep learning using teacher data.

Deep learning is a machine learning technique that repeatedly updates the weights of a DNN so as to improve the accuracy of inference using the DNN, thereby deriving weights that give good inference results.

The image processing system 1 is used, for example, to detect abnormalities such as scratches on the surface of a product. The image processing system 1 includes a recording unit 10, a reference image generation unit 20, a learning unit 30, an inference unit 40, and a display unit 50. The configuration including the recording unit 10, the reference image generation unit 20, and the learning unit 30 is referred to as an image recognition learning device 2. The configuration including the recording unit 10, the reference image generation unit 20, and the inference unit 40 is referred to as an image recognition device 3.

The recording unit 10 records image sets, which are the teacher data used in the image processing system 1. A training image set, a confirmation (validation) image set, and a test image set are recorded in the recording unit 10. Dividing the data into training, confirmation, and test image sets is a partitioning scheme commonly used in deep learning. The training image set is used for learning. The confirmation image set is used to evaluate generalization performance and, based on the evaluation result, to readjust the parameters of the classifier of the trained model. Generalization performance is the ability to correctly predict class labels and function values not only for the training data given at the time of learning but also for new, unknown data. The test image set is used to evaluate the results of inference using the DNN. Each image set contains pairs of an image and the classification type (label) of that image. The recording unit 10 is, for example, a semiconductor memory element such as a flash memory, or a storage device such as a hard disk or an optical disk. In addition to the image sets serving as teacher data, the recording unit 10 can also acquire images from outside.

The image sets will be described with reference to FIG. 2. FIG. 2 is a diagram illustrating an image set. In the present embodiment, the images in an image set are classified into five classification types: scratch, stain, color unevenness, deformation, and normal, the last of which corresponds to none of the others. Images are classified by giving each image a label indicating its classification type. In the present embodiment, the numbers of images of the normal, scratch, stain, color unevenness, and deformation classification types are 100, 5, 5, 5, and 5, respectively, but are not limited to these values. The number of images of each classification type may be more or less than 100. An image whose classification type is normal is referred to as a normal image 100 (see FIG. 4), and an image whose classification type is scratch, stain, color unevenness, or deformation is referred to as an abnormal image 110 (see FIG. 4). In other words, an image given a label indicating the first classification is a normal image 100, and an image given a label indicating the second classification is an abnormal image 110. Although there are five classification types in the present embodiment, the number is not limited to this. For example, the four types scratch, stain, color unevenness, and deformation may be combined into a single classification type (for example, abnormal). The classification types need only include, for example, at least one of scratch, stain, color unevenness, and deformation. There may also be six or more classification types.

In the present embodiment, the number of images of each classification type is the same in the training image set, the confirmation image set, and the test image set. The images of each classification type belonging to the training image set, the confirmation image set, and the test image set are all different images. Although the number of images of each classification type is assumed to be the same across the image sets in the present embodiment, the numbers need not be the same as long as each image set contains at least one image of each classification type.

The reference image generation unit 20 generates a reference image based on the normal images 100, which are the images given a label indicating the first classification. The reference image generation unit 20 generates a reference image that is the average image of all the normal images 100 included in the training image set recorded in the recording unit 10, and records the reference image in the recording unit 10. In the present embodiment, the reference image generation unit 20 generates the reference image by averaging all the normal images 100 included in the training image set pixel by pixel for each of the R, G, and B components. Alternatively, the reference image generation unit 20 may generate the reference image by averaging pixel by pixel for each of the Y, Cb, and Cr components obtained by separating the image into luminance and color difference components. Alternatively, when the color components of the reference image are unnecessary, the reference image generation unit 20 may convert the normal images 100 to grayscale and generate the reference image from the converted images.

In the present embodiment, the reference image is generated from the training image set, but it may instead be generated from the confirmation image set, or from both the training image set and the confirmation image set.

In the present embodiment, the reference image is the average image of all the normal images 100, but it may be the average image of any one or more normal images 100.
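The pixel-wise RGB averaging performed by the reference image generation unit 20 can be sketched as follows. This is a minimal illustration only: the function name `make_reference_image`, the use of NumPy, the 8-bit H x W x 3 array layout, and the rounding back to 8 bits are all assumptions that do not appear in the embodiment.

```python
import numpy as np

def make_reference_image(normal_images):
    """Average same-sized H x W x 3 uint8 RGB images pixel by pixel.

    Hypothetical helper: the embodiment only states that the normal
    images are averaged per pixel for each of R, G, and B.
    """
    if len(normal_images) == 0:
        raise ValueError("at least one normal image is required")
    # Accumulate in floating point so the sum cannot overflow uint8.
    stack = np.stack([img.astype(np.float64) for img in normal_images], axis=0)
    mean = stack.mean(axis=0)  # per-pixel, per-channel mean
    return np.rint(mean).astype(np.uint8)

# Toy usage with random stand-ins for 100 normal images.
rng = np.random.default_rng(0)
normals = [rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8) for _ in range(100)]
reference = make_reference_image(normals)
```

The same function covers the case of a reference image built from only some of the normal images, since it accepts any non-empty list.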

Although the reference image generation unit 20 has been described as being separate from the learning unit 30, it may be included in the learning unit 30.

The learning unit 30 generates a trained model (hereinafter referred to as a "DNN model") by learning, using the first difference image and the second difference image, the weights of a multi-layer neural network that discriminates between the first classification and the second classification. Using the training image set and the confirmation image set, the learning unit 30 trains the DNN model by deep learning as a classifier that classifies images into the five classification types. The learning unit 30 stores the trained DNN model in the recording unit 10. The learning unit 30 outputs the loss value and the accuracy (correct answer rate) obtained during learning to the display unit 50.

The DNN model consists of the parameters necessary to reproduce the trained DNN and includes at least the structure and the weights of the DNN. The DNN model has, as hidden layers between the input layer and the output layer, at least one each of a convolutional layer, a pooling layer, and a fully connected layer. One example of such a structure is LeNet, in which the hidden layers consist of convolution and pooling repeated twice, followed by three fully connected layers.
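As a rough illustration of the LeNet-style layout mentioned above, the following sketch walks the feature-map size through two convolution and pooling stages and lists the resulting sizes of three fully connected layers. The 32 x 32 input, 5 x 5 kernels, 2 x 2 pooling, channel counts 6 and 16, and hidden sizes 120 and 84 are the values of the classic LeNet-5 and are assumptions here; the embodiment does not specify them. The output size 5 corresponds to the five classification types.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a square convolution."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, window=2, stride=2):
    """Output side length of a square pooling layer."""
    return (size - window) // stride + 1

def lenet_feature_size(input_size=32, kernel=5, channels=(6, 16)):
    """Side length and flattened size after (conv -> pool) repeated twice."""
    size = input_size
    for _ in channels:  # one convolution and one pooling per stage
        size = pool_out(conv_out(size, kernel))
    return size, size * size * channels[-1]

side, flat = lenet_feature_size()
# Three fully connected layers: flat -> 120 -> 84 -> 5 classification types.
fc_sizes = [flat, 120, 84, 5]
```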

The learning unit 30 will be described in detail with reference to FIG. 3. FIG. 3 is a schematic diagram showing a configuration example of the learning unit according to the first embodiment. The learning unit 30 includes an image acquisition unit 31, an abnormal image generation unit 32, a difference image generation unit 33, a deformed image generation unit 34, a DNN learning unit 35, and a trained model storage unit 36.

The image acquisition unit 31 acquires an image set from the recording unit 10. The image acquisition unit 31 acquires the normal images 100, which are images given a label indicating the first classification, and the abnormal images 110, which are images given a label indicating the second classification. In the present embodiment, the image acquisition unit 31 acquires a normal image 100 or an abnormal image 110 of the image set from the recording unit 10. The image acquisition unit 31 outputs the acquired normal image 100 or abnormal image 110 to the abnormal image generation unit 32.

The abnormal image generation unit 32 will be described with reference to FIG. 4. FIG. 4 is a diagram illustrating a composite abnormal image according to the first embodiment. The abnormal image generation unit 32 cuts out an area containing a specific target from an abnormal image 110 given a label indicating the second classification, and composites it onto a normal image 100 given a label indicating the first classification to generate a composite abnormal image 120. In the present embodiment, the area containing the specific target is an abnormal area A1 that contains, for example, a scratch, stain, color unevenness, or deformation. The abnormal image generation unit 32 generates the composite abnormal image 120 by compositing the abnormal area A1 onto the normal image 100. The abnormal image generation unit 32 includes the generated composite abnormal image 120 among the abnormal images. The abnormal image generation unit 32 outputs the generated composite abnormal image 120 to the difference image generation unit 33.

More specifically, the abnormal image generation unit 32 cuts out the abnormal area A1 from a known abnormal image 110 and generates composite abnormal images 120 by compositing the abnormal area A1 onto the normal image 100 at the same position and at varying positions within a composite area A2, which is the area in which the abnormal area A1 is to be composited; in other words, while moving the abnormal area A1. The abnormal image generation unit 32 may rotate the abnormal area A1 of the known abnormal image 110 before compositing it into the composite area A2 to generate a composite abnormal image 120. The abnormal image generation unit 32 may enlarge or reduce the abnormal area A1 before compositing it into the composite area A2. The abnormal image generation unit 32 may raise or lower the brightness level of the abnormal area A1 before compositing it into the composite area A2. The movement, rotation, enlargement, reduction, and brightness change may each be applied individually to generate a composite abnormal image 120, or several of them may be combined.

The abnormal area A1 is an area of a known abnormal image 110 in which an abnormality such as a scratch, stain, color unevenness, or deformation appears. The area to be cut out as the abnormal area A1 is designated in advance by the user. When feature extraction processing can be applied to the abnormal image 110, the abnormal area A1 may be cut out by the feature extraction processing.

The composite area A2 may be set individually for each classification type of the abnormal image 110. More specifically, the composite area A2 may be set, for each type of abnormality such as scratch, stain, color unevenness, or deformation, so as to be limited to the area where that abnormality can occur. This suppresses the generation of inappropriate composite abnormal images 120.
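The cut-and-paste operation of the abnormal image generation unit 32 can be sketched as follows, under several assumptions that are not part of the embodiment: images are NumPy arrays, the areas A1 and A2 are axis-aligned rectangles given as hypothetical `(top, left, height, width)` tuples, and the patch simply overwrites the pixels of the normal image. Only the movement within A2 is shown; rotation, scaling, and brightness changes are omitted.

```python
import numpy as np

def composite_abnormal_images(abnormal, normal, a1, a2, step=1):
    """Paste area A1 of `abnormal` onto copies of `normal` at every
    position (in increments of `step`) that fits inside area A2.

    a1, a2: (top, left, height, width) rectangles (assumed format).
    Returns a list of composite images; `normal` itself is not modified.
    """
    t1, l1, h, w = a1
    t2, l2, h2, w2 = a2
    patch = abnormal[t1:t1 + h, l1:l1 + w]
    composites = []
    for top in range(t2, t2 + h2 - h + 1, step):
        for left in range(l2, l2 + w2 - w + 1, step):
            img = normal.copy()
            img[top:top + h, left:left + w] = patch  # overwrite with the abnormal area
            composites.append(img)
    return composites

# Toy usage: a 2 x 2 "scratch" moved over a 4 x 4 composite area.
normal = np.zeros((8, 8, 3), dtype=np.uint8)
abnormal = np.zeros((8, 8, 3), dtype=np.uint8)
abnormal[2:4, 3:5] = 255
composites = composite_abnormal_images(abnormal, normal, a1=(2, 3, 2, 2), a2=(0, 0, 4, 4))
```

Restricting the loop to A2 mirrors the idea of limiting composites to areas where the abnormality can plausibly occur.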

The number of normal images 100 onto which the abnormal area A1 is composited need only be one or more, but a larger number is preferable.

For example, 20 composite abnormal images 120 are generated for each of the classification types scratch, stain, color unevenness, and deformation. Although 20 are generated for each type in the present embodiment, any number of one or more suffices, and a larger number is desirable. However, when the number of composite abnormal images 120 is increased, it is desirable to increase the number of normal images 100 as well.

Returning to FIG. 3, the difference image generation unit 33 generates a normal difference image, which is the first difference image calculated from the difference between a normal image 100 given a label indicating the first classification and the reference image, and an abnormal difference image, which is the second difference image calculated from the difference between an abnormal image 110 given a label indicating the second classification and the reference image. The difference image generation unit 33 includes the composite abnormal images 120 generated by the abnormal image generation unit 32 among the abnormal images 110 given a label indicating the second classification. In other words, the abnormal difference images generated by the difference image generation unit 33 include the difference images between the composite abnormal images 120 and the reference image. The difference image generation unit 33 outputs the generated normal difference images and abnormal difference images to the deformed image generation unit 34.

In the present embodiment, the difference image generation unit 33 generates a difference image based on the pixel-by-pixel absolute difference for each of the R, G, and B components. Alternatively, the difference image generation unit 33 may generate a difference image based on the pixel-by-pixel absolute difference for each of the Y, Cb, and Cr components. Alternatively, when the color components of the difference image are unnecessary, the difference image generation unit 33 may convert the normal image 100, the abnormal image 110, and the reference image to grayscale and generate the difference image from the converted images.
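A pixel-wise absolute difference of this kind can be sketched as follows. The function name `difference_image` and the use of 8-bit NumPy arrays are assumptions made for illustration. The widening to a signed type matters: subtracting two unsigned 8-bit arrays directly would wrap around instead of producing a negative value.

```python
import numpy as np

def difference_image(image, reference):
    """Per-pixel, per-channel absolute difference of two uint8 images."""
    if image.shape != reference.shape:
        raise ValueError("image and reference must have the same shape")
    # Widen to int16 so that (image - reference) can be negative.
    diff = np.abs(image.astype(np.int16) - reference.astype(np.int16))
    return diff.astype(np.uint8)  # values are in 0..255, so this is lossless

# Toy usage on a single RGB pixel.
img = np.array([[[10, 200, 0]]], dtype=np.uint8)
ref = np.array([[[20, 50, 255]]], dtype=np.uint8)
diff = difference_image(img, ref)
```

The same function would serve for normal images, abnormal images, composite abnormal images, and images to be inspected, since each is compared against the same reference image.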

For the training image set, the deformed image generation unit 34 generates normal deformed images as new normal difference images and abnormal deformed images as new abnormal difference images. More specifically, the deformed image generation unit 34 deforms the normal difference images (first difference images) and the abnormal difference images (second difference images) by moving, rotating, enlarging, or reducing them, or by changing their brightness level, to generate new normal difference images and new abnormal difference images. Affine transformation is used for deformation processing such as movement, rotation, enlargement, and reduction. The movement, rotation, enlargement, reduction, and brightness change may each be applied individually, or several of them may be combined. In this way, a large number of normal deformed images are generated from the normal difference images, and a large number of abnormal deformed images are generated from the abnormal difference images. The deformed image generation unit 34 outputs the generated normal deformed images and abnormal deformed images to the DNN learning unit 35.
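The deformation of a difference image by an affine transformation can be sketched as follows. Everything in this block is an illustrative assumption: the helper names, the convention that the 2 x 3 matrix maps output pixel coordinates back to input coordinates, nearest-neighbour sampling, and zero fill outside the image. A production pipeline would more likely rely on an image-processing library.

```python
import numpy as np

def affine_warp(image, matrix, fill=0):
    """Warp `image` with a 2x3 matrix that maps output (x, y) to input (x, y),
    using nearest-neighbour sampling; pixels mapped outside get `fill`."""
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    sx = np.rint(matrix[0, 0] * xs + matrix[0, 1] * ys + matrix[0, 2]).astype(int)
    sy = np.rint(matrix[1, 0] * xs + matrix[1, 1] * ys + matrix[1, 2]).astype(int)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out = np.full_like(image, fill)
    out[valid] = image[sy[valid], sx[valid]]
    return out

def translation(dx, dy):
    """Move the content by (dx, dy): output (x, y) samples input (x - dx, y - dy)."""
    return np.array([[1.0, 0.0, -dx], [0.0, 1.0, -dy]])

def rotation_scale(angle_deg, scale, center):
    """Rotate by angle_deg and scale by `scale` about `center` = (cx, cy)."""
    a = np.deg2rad(angle_deg)
    # Inverse of the forward map p' = scale * R(a) @ (p - c) + c.
    inv = np.array([[np.cos(a), np.sin(a)], [-np.sin(a), np.cos(a)]]) / scale
    c = np.array(center, dtype=float)
    return np.hstack([inv, (c - inv @ c)[:, None]])

def change_brightness(image, gain):
    """Multiply pixel values by `gain`, clipped to the 8-bit range."""
    return np.clip(image.astype(np.float64) * gain, 0, 255).astype(np.uint8)

# Toy usage on a 5 x 5 difference image with one non-zero pixel.
diff = np.zeros((5, 5), dtype=np.uint8)
diff[1, 1] = 100
shifted = affine_warp(diff, translation(dx=2, dy=1))
brighter = change_brightness(diff, 1.5)
```

Combining several deformations amounts to composing the matrices or applying the functions in sequence.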

ＤＮＮ学習部３５は、訓練用画像セットについて、変形画像生成部３４が生成した正常変形画像を含む正常差分画像と、変形画像生成部３４が生成した異常変形画像を含む異常差分画像とを用いて分類タイプを認識するモデルとしてＤＮＮの重みを学習する。ＤＮＮ学習部３５における学習は、公知のディープラーニングにおける学習と同様の方法で実行すればよい。例えば、ＤＮＮ学習部３５は、分類タイプが「正常」である正常差分画像をＤＮＮに入力してディープラーニング学習を実行する。そして、分類タイプが「異常」であると誤った推論結果が出力される場合に誤差逆伝搬法に基づいて、学習中の重みを更新する。また、ＤＮＮ学習部３５は、分類タイプが「異常」である異常差分画像をＤＮＮに入力してディープラーニング学習を実行する。そして、分類タイプが「正常」であると誤った推論結果が出力される場合に誤差逆伝搬法に基づいて、学習中の重みを更新する。このようにエポックと呼ばれる処理を繰り返し行うことによって、学習の結果として学習済みの重みを求める。所望の損失値と正解率が得られるまでエポックを繰り返し実行する。ＤＮＮ学習部３５は、ＤＮＮの重みを含むＤＮＮモデルを学習済みモデル保存部３６へ出力する。 For the training image set, the DNN learning unit 35 learns the weights of the DNN as a model that recognizes the classification type, using the normal difference images, including the normal deformed images generated by the deformed image generation unit 34, and the abnormal difference images, including the abnormally deformed images generated by the deformed image generation unit 34. Learning in the DNN learning unit 35 may be performed in the same manner as learning in known deep learning. For example, the DNN learning unit 35 inputs a normal difference image, whose classification type is "normal", into the DNN and executes deep learning. If the DNN outputs the erroneous inference result that the classification type is "abnormal", the weights being learned are updated by error backpropagation. Likewise, the DNN learning unit 35 inputs an abnormal difference image, whose classification type is "abnormal", into the DNN and executes deep learning. If the DNN outputs the erroneous inference result that the classification type is "normal", the weights being learned are updated by error backpropagation. By repeating this process, which is called an epoch, the trained weights are obtained as the result of learning. Epochs are repeated until the desired loss value and accuracy are obtained. The DNN learning unit 35 outputs the DNN model, including the DNN weights, to the trained model storage unit 36.
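
The text leaves the network and the training procedure to known deep learning practice. Purely as an illustration of weight updates by error backpropagation over repeated epochs, the sketch below trains a very small network with one hidden layer on flattened difference images. Everything in it is an assumption: the architecture, the cross-entropy loss, the learning rate, and the function names are not taken from the source, and the weights are updated from the loss gradient on every epoch rather than only on misclassified inputs.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(x, y, hidden=8, epochs=300, lr=0.1, seed=0):
    """Train a one-hidden-layer network with full-batch gradient descent.

    x: (n, d) flattened difference images scaled to [0, 1]
    y: (n,) labels, 0 = normal, 1 = abnormal
    Returns the learned weights and the loss recorded at each epoch.
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape
    w1 = rng.normal(0.0, 0.1, (d, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.1, hidden)
    b2 = 0.0
    eps = 1e-12
    losses = []
    for _ in range(epochs):
        # Forward pass.
        h = np.tanh(x @ w1 + b1)
        p = sigmoid(h @ w2 + b2)
        loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
        losses.append(float(loss))
        # Backward pass (error backpropagation).
        dz2 = (p - y) / n                      # gradient at the output pre-activation
        dw2 = h.T @ dz2
        db2 = dz2.sum()
        dh = np.outer(dz2, w2) * (1.0 - h ** 2)
        dw1 = x.T @ dh
        db1 = dh.sum(axis=0)
        # Gradient-descent weight update.
        w1 -= lr * dw1
        b1 -= lr * db1
        w2 -= lr * dw2
        b2 -= lr * db2
    return (w1, b1, w2, b2), losses

def predict(params, x):
    """Return 0 (normal) or 1 (abnormal) for each row of x."""
    w1, b1, w2, b2 = params
    return (sigmoid(np.tanh(x @ w1 + b1) @ w2 + b2) >= 0.5).astype(int)
```

In practice the stopping rule would follow the text, repeating epochs until the loss and accuracy reach the desired values, instead of the fixed epoch count used here.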

ＤＮＮ学習部３５は、確認用画像セットについて、正常差分画像と異常差分画像とについて分類タイプを推論する。ＤＮＮ学習部３５における推論は、公知のディープラーニングにおける推論と同様の方法で実行すればよい。例えば、ＤＮＮ学習部３５は、分類タイプが「正常」である正常差分画像をＤＮＮに入力して学習済みのＤＮＮの汎化性能を確認する。また、ＤＮＮ学習部３５は、分類タイプが「異常」である異常差分画像をＤＮＮに入力して学習済みのＤＮＮの汎化性能を確認する。このような処理をエポック毎に繰り返し行うことによって、ＤＮＮ学習部３５は、汎化性能を確認する。学習済みモデル保存部汎化性能が十分でない場合には、ＤＮＮモデルの見直しや訓練用画像セットの見直しが必要になる。 For the confirmation image set, the DNN learning unit 35 infers the classification type of the normal difference images and the abnormal difference images. Inference in the DNN learning unit 35 may be performed in the same manner as inference in known deep learning. For example, the DNN learning unit 35 inputs a normal difference image, whose classification type is "normal", into the DNN and checks the generalization performance of the trained DNN. Likewise, the DNN learning unit 35 inputs an abnormal difference image, whose classification type is "abnormal", into the DNN and checks the generalization performance of the trained DNN. By repeating this processing every epoch, the DNN learning unit 35 confirms the generalization performance. If the generalization performance is insufficient, the DNN model or the training image set needs to be reviewed.

学習済みモデル保存部３６は、ＤＮＮ学習部３５から出力されたＤＮＮモデルを記録部１０に記録する。 The trained model storage unit 36 records the DNN model output from the DNN learning unit 35 in the recording unit 10.

図５を用いて、推論部４０について説明する。図５は、第一実施形態に係る推論部の構成例を示す概略図である。推論部４０は、ＤＮＮを使用して画像認識を実行して推論する。より詳しくは、推論部４０は、記録部１０から学習されたＤＮＮモデルを読み出してＤＮＮを再現し、テスト用画像セットの分類タイプを分類し、推論結果を表示部５０へ出力する。推論部４０は、画像取得部４２、差分画像生成部４３、データ推論部４４、及び学習済みモデル取得部４６、を有する。 The inference unit 40 will be described with reference to FIG. 5. FIG. 5 is a schematic view showing a configuration example of the inference unit according to the first embodiment. The inference unit 40 performs inference by executing image recognition with the DNN. More specifically, the inference unit 40 reads the trained DNN model from the recording unit 10, reconstructs the DNN, classifies the images of the test image set by classification type, and outputs the inference result to the display unit 50. The inference unit 40 includes an image acquisition unit 42, a difference image generation unit 43, a data inference unit 44, and a trained model acquisition unit 46.

学習済みモデル取得部４６は、記録部１０に記録されているＤＮＮモデルを取得する。学習済みモデル取得部４６は、取得したＤＮＮモデルをデータ推論部４４へ出力する。 The trained model acquisition unit 46 acquires the DNN model recorded in the recording unit 10. The trained model acquisition unit 46 outputs the acquired DNN model to the data inference unit 44.

画像取得部４２は、記録部１０からテスト用画像セットの正常画像１００と異常画像１１０とを評価画像として取得する。画像取得部４２は、評価画像を差分画像生成部４３へ出力する。 The image acquisition unit 42 acquires the normal image 100 and the abnormal image 110 of the test image set from the recording unit 10 as evaluation images. The image acquisition unit 42 outputs the evaluation image to the difference image generation unit 43.

差分画像生成部４３は、評価画像と基準画像との差分画像である評価差分画像を生成する。より詳しくは、差分画像生成部４３は、評価画像のうちの正常画像１００と基準画像との正常差分画像、及び、評価画像のうちの異常画像１１０と基準画像との異常差分画像を評価差分画像として生成する。差分画像生成部４３は、生成した評価差分画像をデータ推論部４４へ出力する。 The difference image generation unit 43 generates an evaluation difference image, which is the difference image between an evaluation image and the reference image. More specifically, the difference image generation unit 43 generates, as evaluation difference images, the normal difference image between each normal image 100 among the evaluation images and the reference image, and the abnormal difference image between each abnormal image 110 among the evaluation images and the reference image. The difference image generation unit 43 outputs the generated evaluation difference images to the data inference unit 44.

データ推論部４４は、評価差分画像について分類タイプを推論する。データ推論部４４における推論は、公知のディープラーニングにおける推論と同様の方法で実行すればよい。データ推論部４４は、正常差分画像と異常差分画像とを用いて、正常であるか異常であるかを認識するＤＮＮで推論する。このような処理によって、データ推論部４４は、ＤＮＮを使用した推論結果を評価する。データ推論部４４は、推論結果を表示部５０へ出力する。 The data inference unit 44 infers the classification type for the evaluation difference image. The inference in the data inference unit 44 may be executed in the same manner as the inference in known deep learning. The data inference unit 44 uses the normal difference image and the abnormal difference image to infer with a DNN that recognizes whether the image is normal or abnormal. By such processing, the data inference unit 44 evaluates the inference result using DNN. The data inference unit 44 outputs the inference result to the display unit 50.

表示部５０は、例えば、液晶ディスプレイ（ＬＣＤ：ＬｉｑｕｉｄＣｒｙｓｔａｌＤｉｓｐｌａｙ）または有機ＥＬ（ＯｒｇａｎｉｃＥｌｅｃｔｒｏ−Ｌｕｍｉｎｅｓｃｅｎｃｅ）ディスプレイを含むディスプレイである。表示部５０は、学習部３０の学習の途中経過及び推論部４０の推論結果を表示する。 The display unit 50 is, for example, a display including a liquid crystal display (LCD: Liquid Crystal Display) or an organic EL (Organic Electro-Luminescence) display. The display unit 50 displays the progress of learning by the learning unit 30 and the inference result of the inference unit 40.

ここまでで、画像処理システム１の構成について説明したが、基準画像生成部２０、学習部３０、推論部４０はＲＯＭ、ＲＡＭ、ＣＰＵ、ＧＰＵ等であり、またはこれらの組み合わせである。 The configuration of the image processing system 1 has been described so far. The reference image generation unit 20, the learning unit 30, and the inference unit 40 are each a ROM, a RAM, a CPU, a GPU, or the like, or a combination thereof.

次に、図６を用いて、画像処理システムの画像認識方法及び作用について説明する。図６は、第一実施形態に係る画像処理システムにおける処理の流れを示すフローチャートである。画像認識学習装置２として使用する場合、ステップＳＴ１ないしステップＳＴ３及びステップＳＴ５の処理を実行すればよい。画像認識装置３として使用する場合、ステップＳＴ１ないしステップＳＴ２及びステップＳＴ４ないしステップＳＴ５の処理を実行すればよい。ステップＳＴ１ないしステップＳＴ３の処理は、ＤＮＮモデルを学習する際に少なくとも一度実行すればよい。 Next, the image recognition method and operation of the image processing system will be described with reference to FIG. FIG. 6 is a flowchart showing a processing flow in the image processing system according to the first embodiment. When used as the image recognition learning device 2, the processes of steps ST1 to ST3 and step ST5 may be executed. When used as the image recognition device 3, the processes of steps ST1 to ST2 and steps ST4 to ST5 may be executed. The processes of steps ST1 to ST3 may be executed at least once when learning the DNN model.

画像処理システム１は、記録部１０によって、画像セットを記録する(ステップＳＴ１)。画像処理システム１は、ステップＳＴ２へ進む。 The image processing system 1 records an image set by the recording unit 10 (step ST1). The image processing system 1 proceeds to step ST2.

画像処理システム１は、基準画像生成部２０によって、基準画像を生成する(ステップＳＴ２)。画像処理システム１は、ステップＳＴ３へ進む。 The image processing system 1 generates a reference image by the reference image generation unit 20 (step ST2). The image processing system 1 proceeds to step ST3.

画像処理システム１は、学習部３０によって、ＤＮＮモデルを学習する (ステップＳＴ３)。ステップＳＴ３の処理の詳細は後述する。画像処理システム１は、ステップＳＴ４へ進む。 The image processing system 1 learns the DNN model by the learning unit 30 (step ST3). Details of the process in step ST3 will be described later. The image processing system 1 proceeds to step ST4.

画像処理システム１は、推論部４０によって、ＤＮＮモデルによって正常であるか異常であるかを推論する(ステップＳＴ４)。ステップＳＴ４の処理の詳細は後述する。画像処理システム１は、ステップＳＴ５へ進む。 The image processing system 1 infers whether it is normal or abnormal by the DNN model by the inference unit 40 (step ST4). Details of the process in step ST4 will be described later. The image processing system 1 proceeds to step ST5.

画像処理システム１は、表示部５０によって、推論結果を表示する(ステップＳＴ５)。画像処理システム１は、処理を終了する。 The image processing system 1 displays the inference result by the display unit 50 (step ST5). The image processing system 1 ends the processing.

図７を用いて、図６に示すフローチャートのステップＳＴ３の学習部３０における処理について説明する。図７は、第一実施形態に係る学習部における処理の流れを示すフローチャートである。Ｅはエポック数のカウンタである。なお、エポック数とは、同一の訓練用画像セットまたは確認用画像セットを用いて、繰り返し学習する回数を意味する。本実施形態では、エポック数を１０とする。エポック数の数だけステップＳＴ１０からステップＳＴ２２までの処理を繰り返す。Ｓは画像セットの種類を示すカウンタである。カウンタＳが０の場合、訓練用画像セットを示し、カウンタＳが１の場合、確認用画像セットを示す。画像セットの種類の数だけステップＳＴ１１からステップＳＴ２１までの処理を繰り返す。 The process in the learning unit 30 of step ST3 of the flowchart shown in FIG. 6 will be described with reference to FIG. 7. FIG. 7 is a flowchart showing a processing flow in the learning unit according to the first embodiment. E is a counter for the number of epochs. The number of epochs means the number of times of repeated learning using the same training image set or confirmation image set. In this embodiment, the number of epochs is 10. The process from step ST10 to step ST22 is repeated as many times as the number of epochs. S is a counter indicating the type of image set. When the counter S is 0, the training image set is shown, and when the counter S is 1, the confirmation image set is shown. The process from step ST11 to step ST21 is repeated for the number of types of image sets.

学習部３０が処理を実行する前に、基準画像生成部２０が基準画像を生成する。また、画像取得部３１は、記録部１０から各画像セットの正常画像１００を取得する。 The reference image generation unit 20 generates a reference image before the learning unit 30 executes the process. Further, the image acquisition unit 31 acquires the normal image 100 of each image set from the recording unit 10.

学習部３０は、エポック数のカウンタＥをクリアしてゼロにする(ステップＳＴ１０)。学習部３０は、ステップＳＴ１１に進む。 The learning unit 30 clears the epoch number counter E to zero (step ST10). The learning unit 30 proceeds to step ST11.

学習部３０は、画像セットのカウンタＳをクリアしてゼロにする(ステップＳＴ１１)。学習部３０は、ステップＳＴ１２に進む。 The learning unit 30 clears the counter S of the image set to zero (step ST11). The learning unit 30 proceeds to step ST12.

学習部３０は、異常画像生成部３２によって、異常エリアＡ１を各正常画像１００に合成して各合成異常画像１２０を生成する（ステップＳＴ１２）。学習部３０は、ステップＳＴ１３に進む。 The learning unit 30 synthesizes the abnormal area A1 with each normal image 100 by the abnormal image generation unit 32 to generate each composite abnormal image 120 (step ST12). The learning unit 30 proceeds to step ST13.

学習部３０は、差分画像生成部３３によって、各正常画像１００と基準画像の差分画像である正常差分画像をそれぞれ生成する（ステップＳＴ１３）。学習部３０は、ステップＳＴ１４に進む。 The learning unit 30 generates a normal difference image which is a difference image of each normal image 100 and a reference image by the difference image generation unit 33 (step ST13). The learning unit 30 proceeds to step ST14.

学習部３０は、差分画像生成部３３によって、各異常画像１１０と基準画像の差分画像である異常差分画像をそれぞれ生成する（ステップＳＴ１４）。学習部３０は、ステップＳＴ１５に進む。 The learning unit 30 generates an abnormal difference image which is a difference image of each abnormal image 110 and a reference image by the difference image generation unit 33 (step ST14). The learning unit 30 proceeds to step ST15.

学習部３０は、差分画像生成部３３によって、各合成異常画像１２０と基準画像の差分画像である異常差分画像をそれぞれ生成する（ステップＳＴ１５）。学習部３０は、ステップＳＴ１６に進む。 The learning unit 30 generates an abnormal difference image which is a difference image of each composite abnormal image 120 and a reference image by the difference image generation unit 33 (step ST15). The learning unit 30 proceeds to step ST16.

学習部３０は、変形画像生成部３４によって、画像セットのカウンタＳが示す画像セットが訓練用画像セットであるか検査する（ステップＳＴ１６）。変形画像生成部３４は、画像セットのカウンタＳが０である場合、画像セットが訓練用画像セットであると判定する（ステップＳＴ１６でＹｅｓ）。変形画像生成部３４は、画像セットのカウンタＳが０ではない場合、画像セットが訓練用画像セットではないと判定する（ステップＳＴ１６でＮｏ）。 The learning unit 30 uses the deformed image generation unit 34 to check whether the image set indicated by the image set counter S is the training image set (step ST16). When the counter S is 0, the deformed image generation unit 34 determines that the image set is the training image set (Yes in step ST16). When the counter S is not 0, the deformed image generation unit 34 determines that the image set is not the training image set (No in step ST16).

画像セットのカウンタＳが示す画像セットが訓練用画像セットであると判定された場合（ステップＳＴ１６のＹＥＳ）、学習部３０は、変形画像生成部３４によって、各正常差分画像に移動、回転、拡大、縮小、または輝度レベルの変更等の変形処理を加えて、正常変形画像をそれぞれ生成する（ステップＳＴ１７）。学習部３０は、ステップＳＴ１８に進む。 When it is determined that the image set indicated by the image set counter S is the training image set (Yes in step ST16), the learning unit 30 uses the deformed image generation unit 34 to apply deformation processing such as translation, rotation, enlargement, reduction, or a change of brightness level to each normal difference image, thereby generating the normal deformed images (step ST17). The learning unit 30 proceeds to step ST18.

学習部３０は、変形画像生成部３４によって、各異常差分画像に移動、回転、拡大、縮小、または輝度レベルの変更等の変形処理を加えて、異常変形画像をそれぞれ生成する（ステップＳＴ１８）。学習部３０は、ステップＳＴ１９に進む。 The learning unit 30 generates an abnormally deformed image by adding a deforming process such as moving, rotating, enlarging, reducing, or changing the brightness level to each abnormal difference image by the deformed image generation unit 34 (step ST18). The learning unit 30 proceeds to step ST19.

学習部３０は、ＤＮＮ学習部３５によって、各正常変形画像を含む正常差分画像と、各異常変形画像を含む異常差分画像とを用いて分類タイプを認識するモデルとしてＤＮＮの重みを学習する（ステップＳＴ１９）。学習部３０は、ステップＳＴ２１に進む。 The learning unit 30 uses the DNN learning unit 35 to learn the weights of the DNN as a model that recognizes the classification type, using the normal difference images, including each normal deformed image, and the abnormal difference images, including each abnormally deformed image (step ST19). The learning unit 30 proceeds to step ST21.

画像セットのカウンタＳが示す画像セットが訓練用画像セットではないと判定された他場合（ステップＳＴ１６のＮＯ）、ＤＮＮ学習部３５によって、正常差分画像と異常差分画像について分類タイプを推論する（ステップＳＴ２０）。ここでは、ＤＮＮ学習部３５による推論結果を、推論部４０は表示部５０に送出し、表示部５０は推論結果を表示する。ここで、推論結果には、推論精度（Ａｃｃｕｒａｃｙ）、適合率（Ｐｒｅｃｉｓｉｏｎ）、再現率（Ｒｅｃａｌｌ）などがある。本装置のユーザは、汎化性能の確認を行い、汎化性能の進展が思わしくなければ、処理を中止し、学習済みモデルの分類器のパラメータを再調整し、図７を最初からやり直してもよい。学習部３０は、ステップＳＴ２１に進む。 When it is determined that the image set indicated by the image set counter S is not the training image set (No in step ST16), the DNN learning unit 35 infers the classification type of the normal difference images and the abnormal difference images (step ST20). Here, the inference unit 40 sends the inference result of the DNN learning unit 35 to the display unit 50, and the display unit 50 displays it. The inference result includes the accuracy, the precision, the recall, and the like. The user of the device checks the generalization performance and, if it is not improving satisfactorily, may stop the processing, readjust the parameters of the classifier of the trained model, and restart the processing of FIG. 7 from the beginning. The learning unit 30 proceeds to step ST21.
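
For reference, the three measures named above can be computed from true and predicted labels as in the following sketch. The function name and the convention that label 1 means "abnormal" (the positive class) are assumptions made for the example.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, and recall for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    total = tp + fp + fn + tn
    return {
        # Share of all images that were classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of images predicted abnormal that really are abnormal.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Share of truly abnormal images that were detected.
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }
```

For an inspection task where abnormal products are rare, recall on the abnormal class is usually the figure to watch, because accuracy alone can look high even when most defects are missed.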

学習部３０は、画像セットのカウンタＳが２未満である場合、画像セットのカウンタＳを＋１して(ステップＳＴ２１)、ステップＳＴ１１の処理を再度実行する。学習部３０は、画像セットのカウンタＳが２未満ではない場合、ステップＳＴ２２に進む。 When the counter S of the image set is less than 2, the learning unit 30 increments the counter S of the image set by +1 (step ST21), and executes the process of step ST11 again. If the counter S of the image set is not less than 2, the learning unit 30 proceeds to step ST22.

学習部３０は、エポック数のカウンタＥが１０未満である場合、エポック数のカウンタＥを＋１して(ステップＳＴ２２)、ステップＳＴ１０の処理を再度実行する。学習部３０は、エポック数のカウンタＥが１０未満ではない場合、ステップＳＴ２３に進む。 When the epoch number counter E is less than 10, the learning unit 30 increments the epoch number counter E by +1 (step ST22) and executes the process of step ST10 again. If the counter E of the number of epochs is not less than 10, the learning unit 30 proceeds to step ST23.

学習部３０は、学習済みモデル保存部３６によって、ＤＮＮの重みを含むＤＮＮモデルを記録部１０に記録する（ステップＳＴ２３）。学習部３０は、処理を終了する。 The learning unit 30 uses the trained model storage unit 36 to record the DNN model, including the DNN weights, in the recording unit 10 (step ST23). The learning unit 30 ends the process.

図７では、各エポック毎に訓練用画像セットと確認用画像セットの合成異常画像、正常差分画像、異常差分画像、正常変形画像、異常変形画像を生成しているが、最初のエポックでこれらを保存しておき、２回目以降のエポックではこれらを読み出すだけにしてもよいし、事前に保存しておいて最初から読み出してもよい。 In FIG. 7, a composite abnormal image, a normal difference image, an abnormal difference image, a normal deformed image, and an abnormal deformed image of the training image set and the confirmation image set are generated for each epoch, but these are generated in the first epoch. They may be saved and then only read in the second and subsequent epochs, or they may be saved in advance and read from the beginning.
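
The overall control flow of FIG. 7 can be sketched as two nested loops, one over epochs (counter E) and one over the image set types (counter S), with the generated images cached after the first epoch as the preceding paragraph permits. The sketch follows the stated intent of repeating the steps once per epoch and once per image set; the function name and the four callbacks (`prepare`, `train_one_epoch`, `validate`, `save_model`) are hypothetical placeholders and do not appear in the source.

```python
def run_learning(image_sets, prepare, train_one_epoch, validate, save_model, num_epochs=10):
    """Run the epoch loop over the training and confirmation image sets.

    image_sets: dict with the keys "training" and "confirmation"
    prepare(images, augment): builds difference images (and deformed images if augment)
    train_one_epoch(data): updates the DNN weights (step ST19)
    validate(data): infers on the confirmation set and returns a result (step ST20)
    save_model(): records the trained model (step ST23)
    """
    cache = {}
    history = []
    for _epoch in range(num_epochs):                               # counter E
        for s, name in enumerate(("training", "confirmation")):    # counter S = 0, 1
            if name not in cache:
                # Generate once, then reuse in later epochs.
                cache[name] = prepare(image_sets[name], augment=(s == 0))
            if s == 0:
                train_one_epoch(cache[name])
            else:
                history.append(validate(cache[name]))
    save_model()
    return history
```

Only the training image set is augmented here, matching the branch at step ST16; the confirmation image set is used as generated.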

図８を用いて、図６に示すフローチャートのステップＳＴ４の推論部４０における処理について説明する。図８は、第一実施形態に係る推論部における処理の流れを示すフローチャートである。 The process in the inference unit 40 of step ST4 of the flowchart shown in FIG. 6 will be described with reference to FIG. FIG. 8 is a flowchart showing a processing flow in the inference unit according to the first embodiment.

推論部４０は、学習済みモデル取得部４６によって、記録部１０に記録されているＤＮＮモデルを読み出す（ステップＳＴ３０）。推論部４０は、ステップＳＴ３１に進む。 The inference unit 40 uses the trained model acquisition unit 46 to read the DNN model recorded in the recording unit 10 (step ST30). The inference unit 40 proceeds to step ST31.

推論部４０は、画像取得部４２によって、記録部１０からテスト用画像セットの正常画像１００と異常画像１１０とを評価画像として取得する（ステップＳＴ３１）。推論部４０は、ステップＳＴ３２に進む。 The inference unit 40 acquires the normal image 100 and the abnormal image 110 of the test image set from the recording unit 10 as evaluation images by the image acquisition unit 42 (step ST31). The inference unit 40 proceeds to step ST32.

推論部４０は、差分画像生成部４３によって、評価画像と基準画像との差分画像である評価差分画像を生成する（ステップＳＴ３２）。推論部４０は、ステップＳＴ３３に進む。 The inference unit 40 generates an evaluation difference image, which is a difference image between the evaluation image and the reference image, by the difference image generation unit 43 (step ST32). The inference unit 40 proceeds to step ST33.

推論部４０は、データ推論部４４によって、評価差分画像について分類タイプを推論する（ステップＳＴ３３）。推論部４０は、処理を終了する。 The inference unit 40 infers the classification type for the evaluation difference image by the data inference unit 44 (step ST33). The inference unit 40 ends the process.
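
Steps ST31 to ST33 amount to taking the difference from the reference image and passing the result to the trained model. A minimal sketch follows, assuming 8-bit NumPy images and a `model` callable that returns an abnormality score between 0 and 1; the callable, the threshold, and the function name are assumptions for illustration and stand in for the trained DNN.

```python
import numpy as np

def classify_images(images, reference, model, threshold=0.5):
    """Return "normal" or "abnormal" for each evaluation image."""
    results = []
    for image in images:
        # Step ST32: evaluation difference image against the reference image.
        diff = np.abs(image.astype(np.int16) - reference.astype(np.int16)).astype(np.uint8)
        # Step ST33: inference on the difference image.
        score = float(model(diff))
        results.append("abnormal" if score >= threshold else "normal")
    return results
```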

以上のように、画像認識学習装置２では、基準画像生成部２０が、正常画像１００から基準画像を生成する。学習部３０の差分画像生成部３３が、正常画像１００と基準画像との正常差分画像、及び、異常画像１１０と基準画像との異常差分画像を生成する。ＤＮＮ学習部３５が、正常差分画像と異常差分画像とを用いて、正常画像１００と異常画像１１０とを認識するようにＤＮＮの重みを学習する。 As described above, in the image recognition learning device 2, the reference image generation unit 20 generates a reference image from the normal image 100. The difference image generation unit 33 of the learning unit 30 generates a normal difference image between the normal image 100 and the reference image, and an abnormal difference image between the abnormal image 110 and the reference image. The DNN learning unit 35 learns the weight of the DNN so as to recognize the normal image 100 and the abnormal image 110 by using the normal difference image and the abnormal difference image.

画像認識装置３では、基準画像生成部２０が、正常画像１００から基準画像を生成する。推論部４０の差分画像生成部４３が、評価画像のうちの正常画像１００と基準画像との正常差分画像、及び、評価画像のうちの異常画像１１０と基準画像との異常差分画像を評価差分画像として生成する。データ推論部４４が、正常差分画像と異常差分画像とを用いて、正常であるか異常であるかを認識するＤＮＮを使用して推論する。 In the image recognition device 3, the reference image generation unit 20 generates the reference image from the normal images 100. The difference image generation unit 43 of the inference unit 40 generates, as evaluation difference images, the normal difference image between each normal image 100 among the evaluation images and the reference image, and the abnormal difference image between each abnormal image 110 among the evaluation images and the reference image. The data inference unit 44 performs inference on the normal difference images and the abnormal difference images using the DNN, which recognizes whether an image is normal or abnormal.

上述したように、本実施形態では、正常画像１００から基準画像を生成する。本実施形態では、正常画像１００と基準画像との正常差分画像、及び、異常画像１１０と基準画像との異常差分画像を生成する。本実施形態は、正常差分画像と異常差分画像とを用いて、正常画像１００と異常画像１１０とを認識するようにＤＮＮの重みを学習することができる。本実施形態は、学習部３０のＤＮＮ学習部３５における学習時、正常差分画像と異常差分画像とを用いることにより、異常個所のみを学習させることができる。本実施形態は、適切な認識率を有するＤＮＮを構築することができる。 As described above, in the present embodiment, the reference image is generated from the normal image 100. In the present embodiment, a normal difference image between the normal image 100 and the reference image and an abnormal difference image between the abnormal image 110 and the reference image are generated. In this embodiment, the weight of the DNN can be learned so as to recognize the normal image 100 and the abnormal image 110 by using the normal difference image and the abnormal difference image. In this embodiment, during learning in the DNN learning unit 35 of the learning unit 30, only the abnormal portion can be learned by using the normal difference image and the abnormal difference image. In this embodiment, a DNN having an appropriate recognition rate can be constructed.

本実施形態では、学習部３０の異常画像生成部３２が、既知である異常画像１１０上の異常エリアＡ１を正常画像１００に合成して合成異常画像１２０を生成する。この際に、異常エリアＡ１を合成エリアＡ２内に位置を変えて合成異常画像１２０を生成する。異常エリアＡ１を拡大または縮小させて合成異常画像１２０を生成する。異常エリアＡ１を回転させて合成異常画像１２０を生成する。異常エリアＡ１の輝度レベルを高くまたは低くして合成異常画像１２０を生成する。本実施形態は、少ない既知の異常画像１１０から合成異常画像１２０を大量に生成することができる。本実施形態によれば、例えば、異常の発生率が低い工業製品について、本来得られにくい異常画像１１０から合成異常画像１２０を大量に生成することができる。このように、本実施形態によれば、少量の教師データから、適切な認識率を有するＤＮＮを構築することができる。 In the present embodiment, the abnormal image generation unit 32 of the learning unit 30 synthesizes the abnormal area A1 on the known abnormal image 110 with the normal image 100 to generate the composite abnormal image 120. At this time, the abnormal area A1 is repositioned within the composite area A2 to generate the composite abnormal image 120. The abnormal area A1 is enlarged or reduced to generate a composite abnormal image 120. The abnormal area A1 is rotated to generate a composite abnormal image 120. The composite abnormality image 120 is generated by increasing or decreasing the brightness level of the abnormality area A1. In this embodiment, a large amount of composite abnormal images 120 can be generated from a small number of known abnormal images 110. According to the present embodiment, for example, for an industrial product having a low occurrence rate of abnormality, a large amount of composite abnormal image 120 can be generated from the abnormal image 110 which is originally difficult to obtain. As described above, according to this embodiment, a DNN having an appropriate recognition rate can be constructed from a small amount of teacher data.
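
The synthesis of a composite abnormal image can be illustrated as pasting the abnormal area A1 at a random position inside the composite area A2 of a normal image. The sketch below covers repositioning and the brightness change only; rotation and scaling of A1 are omitted. The function name, the `(top, left, height, width)` description of A2, and the plain overwrite used for pasting are assumptions, since the text does not describe how the areas are represented or blended.

```python
import numpy as np

def synthesize_abnormal(normal, patch, area, rng, gain=1.0):
    """Paste `patch` (abnormal area A1) into a copy of `normal` inside `area` (composite area A2).

    area: (top, left, height, width) of A2, assumed to lie inside the image
    rng:  a numpy.random.Generator used to pick the paste position
    gain: brightness multiplier applied to the patch
    Returns the composite image and the (row, column) where the patch was placed.
    """
    top, left, height, width = area
    ph, pw = patch.shape[:2]
    if ph > height or pw > width:
        raise ValueError("patch does not fit inside the composite area")
    y = top + int(rng.integers(0, height - ph + 1))
    x = left + int(rng.integers(0, width - pw + 1))
    adjusted = np.clip(np.rint(patch.astype(np.float64) * gain), 0, 255)
    out = normal.copy()
    out[y:y + ph, x:x + pw] = adjusted.astype(normal.dtype)
    return out, (y, x)
```

Calling the function many times with different random positions and gains produces many composite abnormal images from one known abnormal area.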

本実施形態では、学習部３０の異常画像生成部３２が、異常画像１１０の分類タイプ毎に個別に合成エリアＡ２を設定することができる。本実施形態によれば、不適切な合成異常画像１２０が生成されることを抑制することができる。本実施形態は、不適切な教師データに基づく学習を抑制することができる。 In the present embodiment, the abnormal image generation unit 32 of the learning unit 30 can individually set the composite area A2 for each classification type of the abnormal image 110. According to this embodiment, it is possible to suppress the generation of an inappropriate composite abnormality image 120. This embodiment can suppress learning based on inappropriate teacher data.

本実施形態では、学習部３０の変形画像生成部３４が、正常差分画像に移動、回転、拡大、または縮小等の変形処理を加えて、正常変形画像を生成する。本実施形態では、変形画像生成部３４が、異常差分画像に移動、回転、拡大、縮小、または輝度レベルの変更等の変形処理を加えて、異常変形画像を生成する。このようにして、本実施形態は、正常差分画像から正常変形画像を大量に生成し、異常差分画像から異常変形画像を大量に生成することができる。本実施形態によれば、ディープラーニングによる認識率の向上につなげることができる。 In the present embodiment, the deformed image generation unit 34 of the learning unit 30 adds deformation processing such as movement, rotation, enlargement, or reduction to the normal difference image to generate a normal deformed image. In the present embodiment, the deformed image generation unit 34 generates an abnormally deformed image by adding deformation processing such as moving, rotating, enlarging, reducing, or changing the brightness level to the abnormal difference image. In this way, the present embodiment can generate a large amount of normal deformed images from the normal difference image and a large amount of abnormally deformed images from the abnormal difference image. According to this embodiment, it is possible to improve the recognition rate by deep learning.

また、本実施形態は、正常画像１００から基準画像を生成する。本実施形態は、評価画像のうちの正常画像１００と基準画像との正常差分画像、及び、評価画像のうちの異常画像１１０と基準画像との異常差分画像を評価差分画像として生成する。本実施形態は、推論部４０のデータ推論部４４による推論時、正常差分画像と異常差分画像とを用いて、正常であるか異常であるかを認識するＤＮＮを使用して推論することができる。本実施形態によれば、適切な認識率で推論結果を得ることができる。 In addition, the present embodiment generates the reference image from the normal images 100. The present embodiment generates, as evaluation difference images, the normal difference image between each normal image 100 among the evaluation images and the reference image, and the abnormal difference image between each abnormal image 110 among the evaluation images and the reference image. At inference time, the data inference unit 44 of the inference unit 40 can perform inference on the normal difference images and the abnormal difference images using the DNN, which recognizes whether an image is normal or abnormal. According to the present embodiment, the inference result can be obtained with an appropriate recognition rate.

［第二実施形態］ [Second Embodiment]
図９を参照しながら、本実施形態に係る画像処理システム１Ａについて説明する。図９は、第二実施形態に係る画像処理システムの構成例の一例を示す概略図である。画像処理システム１Ａは、基本的な構成は第一実施形態の画像処理システム１と同様である。以下の説明においては、画像処理システム１と同様の構成要素には、同一の符号または対応する符号を付し、その詳細な説明は省略する。本実施形態は、画像処理システム１Ａは、カメラ６０Ａを有する点が、第一実施形態と異なる。 The image processing system 1A according to the present embodiment will be described with reference to FIG. 9. FIG. 9 is a schematic view showing an example configuration of the image processing system according to the second embodiment. The basic configuration of the image processing system 1A is the same as that of the image processing system 1 of the first embodiment. In the following description, components that are the same as those of the image processing system 1 are given the same or corresponding reference numerals, and their detailed description is omitted. The present embodiment differs from the first embodiment in that the image processing system 1A has a camera 60A.

画像処理システム１Ａは、製品を生産する工場の生産ライン等に配置される。画像処理システム１Ａは、カメラ６０Ａ、記録部１０Ａ、基準画像生成部２０Ａ、学習部３０、及び推論部４０Ａを有する。 The image processing system 1A is arranged on a production line or the like of a factory that produces products. The image processing system 1A includes a camera 60A, a recording unit 10A, a reference image generation unit 20A, a learning unit 30, and an inference unit 40A.

カメラ６０Ａは、例えば、検査対象である製品の外観の画像を撮影する。カメラ６０Ａは、工場の生産ライン等に配置される。カメラ６０Ａは、撮影した画像を記録部１０Ａ、基準画像生成部２０Ａ、及び推論部４０Ａへ出力する。 The camera 60A captures, for example, an image of the appearance of the product to be inspected. The camera 60A is arranged on a production line of a factory or the like. The camera 60A outputs the captured image to the recording unit 10A, the reference image generation unit 20A, and the inference unit 40A.

記録部１０Ａは、カメラ６０Ａから入力される画像を訓練用画像セット、及び確認用画像セットとして記録する。カメラ６０Ａから、正常、傷、汚れ、色むら、及び変形の各分類タイプの画像を取得し、訓練用画像セットと確認用画像セットにそれぞれ所定の画像枚数分記録する。 The recording unit 10A records the image input from the camera 60A as a training image set and a confirmation image set. Images of each classification type of normal, scratch, stain, color unevenness, and deformation are acquired from the camera 60A, and a predetermined number of images are recorded in the training image set and the confirmation image set, respectively.

基準画像生成部２０Ａは、カメラ６０Ａから入力される正常画像１００の平均画像である基準画像を生成し、記録部１０Ａに基準画像を記録する。ここでは、基準画像はカメラ６０Ａから入力される正常画像１００から生成したが、訓練用画像セットに含まれる正常画像１００から生成してもよい。 The reference image generation unit 20A generates a reference image which is an average image of the normal image 100 input from the camera 60A, and records the reference image in the recording unit 10A. Here, the reference image is generated from the normal image 100 input from the camera 60A, but may be generated from the normal image 100 included in the training image set.
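
Generating the reference image as the average of the normal images can be sketched as a per-pixel mean over a stack of equally sized images. The function name and the rounding back to 8-bit values are assumptions made for the example.

```python
import numpy as np

def make_reference_image(normal_images):
    """Return the per-pixel average of equally sized uint8 normal images."""
    if len(normal_images) == 0:
        raise ValueError("at least one normal image is required")
    # Average in floating point to avoid 8-bit overflow, then round back.
    stack = np.stack([img.astype(np.float64) for img in normal_images])
    return np.rint(stack.mean(axis=0)).astype(np.uint8)
```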

推論部４０Ａは、カメラ６０Ａから入力される画像の分類タイプを分類する。より詳しくは、推論部４０Ａは、記録部１０Ａから学習されたＤＮＮモデルを読み出してＤＮＮを再現し、カメラ６０Ａから入力される画像の分類タイプが「正常」か「異常」かを推論する。推論部４０Ａによって「正常」との推論結果が得られた画像に写された製品は、正常品である。推論部４０Ａによって「異常」との推論結果が得られた画像に写された製品は、エラー品である。 The inference unit 40A classifies images input from the camera 60A by classification type. More specifically, the inference unit 40A reads the trained DNN model from the recording unit 10A, reconstructs the DNN, and infers whether the classification type of an image input from the camera 60A is "normal" or "abnormal". A product shown in an image for which the inference unit 40A returns "normal" is a normal product. A product shown in an image for which the inference unit 40A returns "abnormal" is a defective product.

上述したように、本実施形態は、カメラ６０Ａが撮影した画像に基づいて、適切な認識率を有するＤＮＮを構築することができる。 As described above, in the present embodiment, a DNN having an appropriate recognition rate can be constructed based on the image taken by the camera 60A.

本実施形態は、カメラ６０Ａが撮影した画像について、ＤＮＮを使用して製品が正常であるか異常であるかを適切な認識率で推論することができる。ここでは、画像処理システム１Ａに学習部３０を含むとしたが、学習部３０は含まなくてもよい。その場合、画像処理システム１Ａは、ＵＳＢやネットワーク等を経由して外部機器などに接続する接続部をを備え、記録部１０Ａは接続部を介して外部機器から学習されたＤＮＮモデルを取得する。 For an image captured by the camera 60A, the present embodiment can use the DNN to infer, with an appropriate recognition rate, whether the product is normal or abnormal. Although the image processing system 1A has been described as including the learning unit 30, the learning unit 30 may be omitted. In that case, the image processing system 1A includes a connection unit that connects to an external device or the like via USB, a network, or the like, and the recording unit 10A acquires the trained DNN model from the external device via the connection unit.

［第三実施形態］ [Third Embodiment]
図１０を参照しながら、本実施形態に係る画像処理システム１Ｂについて説明する。図１０は、第三実施形態に係る画像処理システムの構成例の一例を示す概略図である。本実施形態は、カメラ６０Ｂが工場の生産ライン等に配置され、推論部４０Ｂが工場の生産ラインを監視する監視施設の端末（端末装置）９０Ｂ等に配置され、学習部３０Ｂがクラウドに配置される点で、第一実施形態と異なる。カメラ６０Ｂと、記録部１０Ｂ及び推論部４０Ｂはデータを送受信可能に接続されている。基準画像生成部２０Ｂと推論部４０Ｂとは、データを送受信可能に接続されている。学習部３０Ｂと推論部４０Ｂとは、データを送受信可能に接続されている。 The image processing system 1B according to the present embodiment will be described with reference to FIG. 10. FIG. 10 is a schematic view showing an example configuration of the image processing system according to the third embodiment. The present embodiment differs from the first embodiment in that the camera 60B is arranged on a production line of a factory or the like, the inference unit 40B is arranged on a terminal (terminal device) 90B or the like of a monitoring facility that monitors the production line, and the learning unit 30B is arranged in the cloud. The camera 60B is connected to the recording unit 10B and the inference unit 40B so that data can be transmitted and received. The reference image generation unit 20B and the inference unit 40B are connected so that data can be transmitted and received. The learning unit 30B and the inference unit 40B are connected so that data can be transmitted and received.

カメラ６０Ｂは、撮影した画像を、記録部１０Ｂ及び推論部４０Ｂへ送信する。カメラ６０Ｂは、正常、傷、汚れ、色むら、及び変形の各分類タイプの画像を取得する。 The camera 60B transmits the captured image to the recording unit 10B and the inference unit 40B. The camera 60B acquires images of each classification type of normal, scratch, stain, color unevenness, and deformation.

記録部１０Ｂは、クラウド上に配置されている。記録部１０Ｂは、カメラ６０Ｂから送信された画像を訓練用画像セット、及び確認用画像セットとして記録する。 The recording unit 10B is arranged on the cloud. The recording unit 10B records the image transmitted from the camera 60B as a training image set and a confirmation image set.

基準画像生成部２０Ｂは、クラウド上に配置されている。基準画像生成部２０Ｂは、カメラ６０Ｂから送信された正常画像１００の平均画像である基準画像を生成し、記録部１０Ｂに基準画像を記録する。ここでは、基準画像はカメラ６０Ｂから入力される正常画像１００から生成したが、訓練用画像セットに含まれる正常画像１００から生成してもよい。 The reference image generation unit 20B is arranged on the cloud. The reference image generation unit 20B generates a reference image which is an average image of the normal image 100 transmitted from the camera 60B, and records the reference image in the recording unit 10B. Here, the reference image is generated from the normal image 100 input from the camera 60B, but may be generated from the normal image 100 included in the training image set.

The inference unit 40B is arranged on the terminal 90B, which is located in the monitoring facility that monitors the production line of the factory. The inference unit 40B determines the classification type of the image transmitted from the camera 60B. The inference unit 40B includes a reference image acquisition unit 41B, an image acquisition unit 42B, a difference image generation unit 43B, a data inference unit 44B, and a trained model acquisition unit 46B.

The reference image acquisition unit 41B acquires a reference image based on the normal image 100, which is an image with a label indicating the first classification.

The image acquisition unit 42B acquires a specific image. The specific image is, for example, an image showing a product to be inspected among the products produced on the production line of the factory.

The difference image generation unit 43B generates a third difference image calculated from the difference between the specific image acquired by the image acquisition unit 42B and the reference image acquired by the reference image acquisition unit 41B.
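As an illustration of the difference step, the following sketch computes a signed per-pixel difference. The text here states only that the difference image is "calculated from the difference", so the signed subtraction, the float32 representation, and the name `make_difference_image` are assumptions made for the example.

```python
import numpy as np

def make_difference_image(image, reference):
    """Return the signed per-pixel difference (image - reference) as float32."""
    image = np.asarray(image, dtype=np.float32)
    reference = np.asarray(reference, dtype=np.float32)
    if image.shape != reference.shape:
        raise ValueError("image and reference must have the same shape")
    return image - reference

# Hypothetical data: a flat reference and an inspected image with one bright pixel.
reference = np.full((3, 3), 100, dtype=np.uint8)
specific = reference.copy()
specific[1, 1] = 160  # stands in for a small defect
third_difference = make_difference_image(specific, reference)
```

Only the pixel that deviates from the reference is nonzero in the result, which is the property that lets a classifier focus on deviations from the normal appearance.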

The data inference unit 44B inputs the third difference image generated by the difference image generation unit 43B into the trained model acquired by the trained model acquisition unit 46B, thereby classifying the specific image acquired by the image acquisition unit 42B into the first classification or the second classification.
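The inference flow described above can be sketched as follows. The trained model is represented by an arbitrary callable that returns a score for the second classification; `toy_model` is a stand-in used only to make the example runnable and is not the trained DNN. The function name `infer`, the class labels, and the 0.5 decision threshold are also assumptions.

```python
import numpy as np

FIRST_CLASSIFICATION = "normal"     # assumed label for the first classification
SECOND_CLASSIFICATION = "abnormal"  # assumed label for the second classification

def infer(image, reference, model, threshold=0.5):
    """Classify an image by feeding its difference image to a model."""
    diff = np.asarray(image, dtype=np.float32) - np.asarray(reference, dtype=np.float32)
    score = float(model(diff))  # model returns a score in [0, 1] for the second classification
    return SECOND_CLASSIFICATION if score >= threshold else FIRST_CLASSIFICATION

def toy_model(diff):
    """Stand-in for a trained model: maps the mean absolute difference into [0, 1)."""
    return 1.0 - np.exp(-np.abs(diff).mean())

reference = np.full((4, 4), 100.0, dtype=np.float32)
good = reference.copy()
bad = reference.copy()
bad[0, 0] += 160.0  # hypothetical defect

good_label = infer(good, reference, toy_model)
bad_label = infer(bad, reference, toy_model)
```

Passing the model in as a callable keeps the sketch independent of any particular DNN framework; a real trained model would replace `toy_model`.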

The trained model acquisition unit 46B acquires a trained model from the learning unit 30B. The trained model is obtained by training the weights of a multilayer neural network that distinguishes the first classification from the second classification, using a normal difference image and an abnormal difference image. The normal difference image serves as the first difference image and is calculated from the difference between the normal image 100, which is an image with a label indicating the first classification, and the reference image acquired by the reference image acquisition unit 41B. The abnormal difference image serves as the second difference image and is calculated from the difference between the abnormal image 110, which is an image with a label indicating the second classification, and the reference image.
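To make the idea of training on labeled difference images concrete, here is a deliberately small sketch: a network with one hidden layer, trained by full-batch gradient descent on synthetic, flattened difference images. The network size, the tanh and sigmoid activations, the learning rate, and the synthetic data are all assumptions chosen for brevity; the specification does not prescribe them, and a practical DNN would be considerably larger.

```python
import numpy as np

def train_mlp(X, y, hidden=8, lr=0.2, steps=500, seed=0):
    """Train a one-hidden-layer network with full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.1, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, size=(hidden, 1))
    b2 = np.zeros(1)
    for _ in range(steps):
        h = np.tanh(X @ W1 + b1)                  # hidden activations
        p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))  # probability of the second classification
        g_out = (p - y[:, None]) / len(y)         # cross-entropy gradient w.r.t. the logits
        g_hid = (g_out @ W2.T) * (1.0 - h ** 2)   # backpropagate through tanh
        W2 -= lr * (h.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        W1 -= lr * (X.T @ g_hid)
        b1 -= lr * g_hid.sum(axis=0)
    return W1, b1, W2, b2

def predict(params, X):
    """Return 0 (first classification) or 1 (second classification) per row."""
    W1, b1, W2, b2 = params
    h = np.tanh(X @ W1 + b1)
    return ((h @ W2 + b2).ravel() >= 0.0).astype(int)

# Synthetic flattened 4x4 difference images (hypothetical data).
rng = np.random.default_rng(1)
normal_diffs = rng.normal(0.0, 0.05, size=(100, 16))    # near zero: close to the reference
abnormal_diffs = rng.normal(0.0, 0.05, size=(100, 16))
abnormal_diffs[:, :4] += 1.0                            # a defect in the first four pixels
X = np.vstack([normal_diffs, abnormal_diffs])
y = np.concatenate([np.zeros(100), np.ones(100)])       # 0 = first, 1 = second classification

params = train_mlp(X, y)
accuracy = float((predict(params, X) == y).mean())
```

Because normal difference images cluster around zero, the two classes in this toy data are easy to separate, which is the intuition behind learning on differences rather than on raw images.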

As described above, in the present embodiment, in the terminal 90B arranged on the production line of the factory or the like, the image acquisition unit 42B acquires a specific image, such as an image showing the product to be inspected. The difference image generation unit 43B generates a third difference image calculated from the difference between the specific image and the reference image based on the normal image 100. The data inference unit 44B inputs the third difference image into the trained model that the trained model acquisition unit 46B acquired from the learning unit 30B on the cloud, thereby classifying the specific image into the first classification or the second classification. According to the present embodiment, an inference result with an appropriate recognition rate can be obtained for a specific image captured by the camera 60B arranged on the production line of the factory or the like.

In the present embodiment, for an image captured by the camera 60B, the inference unit 40B arranged on the terminal 90B can use the DNN to infer, with an appropriate recognition rate, whether the product is normal or abnormal.

The image processing system 1 according to the present invention has been described above, but the invention may be implemented in various forms other than the embodiments described above.

Each component of the illustrated image processing system 1 is functionally conceptual and does not necessarily have to be physically configured as illustrated. That is, the specific form of each device is not limited to the one illustrated, and all or part of each device may be functionally or physically distributed or integrated in arbitrary units according to the processing load, usage conditions, and the like of each device.

The configuration of the image processing system 1 is realized, for example, as software or firmware of a computer or the like, by a device such as a CPU or GPU executing a program loaded into a memory such as ROM or RAM. In the above embodiments, the configuration has been described as functional blocks realized by the cooperation of such hardware or software. That is, these functional blocks can be realized in various forms by hardware only, software only, or a combination thereof.

The components described above include those that a person skilled in the art could easily conceive of and those that are substantially the same. Furthermore, the configurations described above can be combined as appropriate. Various omissions, substitutions, or modifications of the configuration are also possible without departing from the gist of the present invention.

Whether the color components of the reference image and the difference images are unnecessary may be determined based on the recognition rate of the DNN. First, the learning unit 30 generates the reference image and the difference images in grayscale and learns the weights of the DNN. The inference unit 40 then performs inference using the DNN. If the recognition rate of the inference result is equal to or higher than a threshold value, it is determined that the color components are unnecessary and that it suffices to generate the reference image and the difference images in grayscale and learn the weights of the DNN. If the recognition rate is below the threshold value, it is determined that the color components are necessary and that the reference image and the difference images should be generated in RGB or YCbCr to learn the weights of the DNN. Because the user does not have to judge whether the color components are needed, a DNN with an appropriate recognition rate can be constructed easily.
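The decision procedure can be sketched as a small function. It assumes a caller-supplied callable, here called `train_and_evaluate`, that trains the DNN on images generated in the given color mode and returns the resulting recognition rate; that callable, the mode names, and the threshold of 0.95 are hypothetical and are not taken from the specification.

```python
def choose_color_mode(train_and_evaluate, threshold=0.95, color_mode="RGB"):
    """Try grayscale first; fall back to a color mode if the recognition rate is too low."""
    grayscale_rate = train_and_evaluate("GRAY")
    if grayscale_rate >= threshold:
        # Color components are judged unnecessary.
        return "GRAY", grayscale_rate
    # Color components are judged necessary: retrain with RGB (or YCbCr).
    return color_mode, train_and_evaluate(color_mode)

# Hypothetical recognition rates standing in for real training runs.
rates_needing_color = {"GRAY": 0.90, "RGB": 0.97}
rates_gray_enough = {"GRAY": 0.98, "RGB": 0.99}

mode_a, rate_a = choose_color_mode(rates_needing_color.get)
mode_b, rate_b = choose_color_mode(rates_gray_enough.get)
```

Trying grayscale first means the more expensive color training is run only when the grayscale model falls short of the threshold.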

1 Image processing system
2 Image recognition learning device
3 Image recognition device
10 Recording unit
20 Reference image generation unit
30 Learning unit
31 Image acquisition unit
32 Abnormal image generation unit
33 Difference image generation unit
34 Deformed image generation unit
35 DNN learning unit
36 Trained model storage unit
40 Inference unit
41 Trained model acquisition unit
42 Image acquisition unit
43 Difference image generation unit
44 Data inference unit
50 Display unit
100 Normal image
110 Abnormal image
120 Composite abnormal image
A1 Abnormal area
A2 Composite area

Claims

An image recognition learning device comprising:
an image acquisition unit that acquires an image with a label indicating a first classification and an image with a label indicating a second classification;
a reference image generation unit that generates a reference image based on the image with the label indicating the first classification;
a difference image generation unit that generates a first difference image calculated from the difference between the image with the label indicating the first classification and the reference image, and a second difference image calculated from the difference between the image with the label indicating the second classification and the reference image; and
a learning unit that generates a trained model by training, using the first difference image and the second difference image, the weights of a multilayer neural network that distinguishes between the first classification and the second classification.

The image recognition learning device according to claim 1, further comprising
an abnormal image generation unit that cuts out an area including a specific object from an image with a label indicating the second classification and superimposes the area on an image with a label indicating the first classification to generate a composite abnormal image,
wherein the difference image generation unit includes the composite abnormal image among the images with the label indicating the second classification.

The image recognition learning device according to claim 1 or 2, further comprising a deformed image generation unit that moves, rotates, enlarges, reduces, or changes the brightness level of each of the first difference image and the second difference image to generate a new first difference image and a new second difference image.

An image recognition learning method including:
an image acquisition step of acquiring an image with a label indicating a first classification and an image with a label indicating a second classification;
a reference image generation step of generating a reference image based on the image with the label indicating the first classification;
a difference image generation step of generating a first difference image calculated from the difference between the image with the label indicating the first classification and the reference image, and a second difference image calculated from the difference between the image with the label indicating the second classification and the reference image; and
a learning step of generating a trained model by training, using the first difference image and the second difference image, the weights of a multilayer neural network that distinguishes between the first classification and the second classification.

An image recognition learning program that causes a computer operating as an image recognition learning device to execute:
an image acquisition step of acquiring an image with a label indicating a first classification and an image with a label indicating a second classification;
a reference image generation step of generating a reference image based on the image with the label indicating the first classification;
a difference image generation step of generating a first difference image calculated from the difference between the image with the label indicating the first classification and the reference image, and a second difference image calculated from the difference between the image with the label indicating the second classification and the reference image; and
a learning step of generating a trained model by training, using the first difference image and the second difference image, the weights of a multilayer neural network that distinguishes between the first classification and the second classification.

A terminal device comprising:
an image acquisition unit that acquires a specific image;
a reference image acquisition unit that acquires a reference image based on an image with a label indicating a first classification;
a difference image generation unit that generates a third difference image calculated from the difference between the specific image and the reference image;
a trained model acquisition unit that acquires a trained model obtained by training the weights of a multilayer neural network that distinguishes between the first classification and a second classification, using a first difference image calculated from the difference between the image with the label indicating the first classification and the reference image, and a second difference image calculated from the difference between an image with a label indicating the second classification and the reference image; and
a data inference unit that, by inputting the third difference image into the trained model, classifies the specific image acquired by the image acquisition unit into the first classification or the second classification.