JP2023163983A

JP2023163983A - Image processing device, image processing method, image processing program, image classification device, and learned model generated by image classification device

Info

Publication number: JP2023163983A
Application number: JP2022075252A
Authority: JP
Inventors: 和寛鶴田; Kazuhiro Tsuruta; 健太郎藤井; Kentaro Fujii; 哲也小代; Tetsuya Koshiro
Original assignee: Nakamura Sangyo Gakuen; Fuji Techno Co Ltd
Current assignee: Nakamura Sangyo Gakuen; Fuji Techno Co Ltd
Priority date: 2022-04-28
Filing date: 2022-04-28
Publication date: 2023-11-10

Abstract

PROBLEM TO BE SOLVED: To provide an image processing device, an image processing method, and the like capable of creating data for machine learning that achieves stable determination accuracy even when the amount of data is small. SOLUTION: An image processing device 10 comprises first arithmetic means 120 and second arithmetic means 130. The first arithmetic means 120 numerically converts a photographic image O into a plurality of pieces of color information per pixel, and creates data by linking the pieces of numerically converted color information. The second arithmetic means 130 creates data used for machine learning by subjecting the data created by the first arithmetic means 120 to fast Fourier transformation. SELECTED DRAWING: Figure 1

Description

The present invention relates to an image processing device, an image processing method, and an image processing program that create data used for machine learning based on information acquired from captured images. The present invention also relates to an image classification device that performs machine learning using data created by such an image processing device, and to a trained model generated by this image classification device.

Conventionally, methods are known for detecting abnormalities from images taken with a camera or the like. For example, a method is known in which images showing normal parts without scratches and images showing abnormal parts with scratches are learned as training data by deep learning, generating a trained model that determines whether the part shown in a given image is normal (no scratches) or abnormal (scratched).

Detection methods (determination methods) that apply such machine learning to captured images are required to offer improved determination accuracy, a lightweight implementation, and real-time performance. In particular, achieving high determination accuracy with a small amount of training data is recognized as a common challenge in current AI technology.

The technique described in Patent Document 1 was also conceived in response to this challenge. Patent Document 1 describes an apparatus that uses a modification stage classification model to determine whether soil being transported needs to be modified. This modification stage classification model classifies soil into one of several modification stages, and is created by learning a plurality of small-area images cut out from sample images prepared for each modification stage.
Patent Document 1 also describes performing a fast Fourier transform on the values of the pixels (pixel values) constituting each small-area image to obtain an amplitude spectrum. The value of each pixel obtained after the fast Fourier transform is normalized, and the transformed small-area images are used in machine learning for training and verification (paragraph 0040 of the specification of Patent Document 1).

Patent Document 2 describes a crosswalk detection device that includes extraction means which, each time the image captured by a vehicle-mounted camera is updated by an update of the signal input from the camera, extracts from a predetermined area of the captured image a zebra pattern of a predetermined period corresponding to a crosswalk, based on a spectrum pattern obtained by Fourier transforming that area.
Patent Document 2 also describes determining whether a candidate zebra pattern corresponding to a crosswalk exists in an edge-processed image signal, based on a comparison between the average value of the amplitude spectrum and an intensity threshold (paragraph 0042 of the specification of Patent Document 2).

Patent Document 3 describes a detection device and a program intended to improve the detection rate when detecting a detection target in an image. Patent Document 3 states that high accuracy can be achieved by having a second determination unit (second classification model) judge only the objects that were not detected by a first determination unit (first classification model), that is, the undetected objects (paragraphs 0039 and 0040 of the specification of Patent Document 3).

Patent Document 1: Japanese Patent Application Publication No. 2020-041801
Patent Document 2: Japanese Patent Application Publication No. 2013-114652
Patent Document 3: Japanese Patent Application Publication No. 2021-157550

However, the technique described in Patent Document 2 does not create data for machine learning in the first place, so the spectrum pattern it obtains is not adequate as data that yields effective learning results.
The technique described in Patent Document 3, in turn, seeks to improve determination accuracy by creating and using a plurality of classification models. It therefore requires enough (that is, a large amount of) training data and learning to create multiple classification models.

The technique described in Patent Document 1 does transform captured images, but it obtains a large amount of training data efficiently by cutting out a plurality of small-area images from a single image. This works well when, as with soil, a similar determination target appears no matter which part of the image is cut out. However, when the object shown differs from one cut-out part to another, as with captured images of mechanical parts such as screws and nuts or of a human body, effective training data cannot be obtained.

Therefore, the present invention provides an image processing device, an image processing method, and the like that create data used for machine learning so as to achieve stable determination accuracy even with a small amount of data.

The image processing device of the present invention includes a first calculation means that numerically converts a captured image into a plurality of pieces of color information for each pixel and creates data by combining the numerically converted color information, and a second calculation means that creates data used for machine learning by applying a fast Fourier transform to the data created by the first calculation means.
As a result, the photographed object shown in the captured image is numerically converted into a plurality of pieces of color information, which are combined to create integrated data, and the integrated data is then subjected to a fast Fourier transform.

Preferably, the first calculation means scans the captured image in both the row direction and the column direction, numerically converts the pixels of the captured image into a plurality of pieces of color information, and creates data by combining the color information obtained by scanning in the row direction with the color information obtained by scanning in the column direction.
As a result, the photographed object shown in the captured image is scanned from multiple directions, namely the row direction and the column direction, so that more color information is acquired and combined.
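The row and column scanning described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `link_row_and_column_scans` is hypothetical, and the order in which the two scans are joined (all row scans first, then all column scans) is an assumption, since the text does not specify it.

```python
import numpy as np


def link_row_and_column_scans(channels):
    """Join row-direction and column-direction scans of each color channel.

    channels: sequence of 2-D arrays (one per color), all the same shape.
    The join order (row scans first, then column scans) is an assumption.
    """
    row_scans = [np.asarray(c).ravel(order="C") for c in channels]  # row by row
    col_scans = [np.asarray(c).ravel(order="F") for c in channels]  # column by column
    return np.concatenate(row_scans + col_scans)


# Tiny single-channel example so the two scan orders are easy to compare.
channel = np.array([[1, 2],
                    [3, 4]])
linked = link_row_and_column_scans([channel])
print(linked.tolist())  # [1, 2, 3, 4, 1, 3, 2, 4]
```

Scanning in both directions doubles the length of the combined data, which matches the statement that more color information is acquired and combined.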

Preferably, the first calculation means numerically converts a captured image that is a three-dimensional captured image into a plurality of pieces of color information for each voxel (volume element).
As a result, even more color information is acquired and combined.
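For the three-dimensional case, one possible reading is sketched below. The array layout (depth × height × width × color), the function name `link_voxel_channels`, and the scan order are all assumptions made for illustration; the text only states that each voxel is converted into a plurality of pieces of color information.

```python
import numpy as np


def link_voxel_channels(volume):
    """Flatten a 3-D color volume channel by channel and join the results.

    volume: array shaped (depth, height, width, channels); this layout is
    an assumption, not something the source specifies.
    """
    volume = np.asarray(volume)
    channels_first = np.moveaxis(volume, -1, 0)  # (channels, depth, height, width)
    return channels_first.reshape(-1)            # channel 0 voxels, then channel 1, ...


# Hypothetical 2x2x2 volume with 3 color values per voxel.
volume = np.arange(2 * 2 * 2 * 3).reshape(2, 2, 2, 3)
linked = link_voxel_channels(volume)
print(linked.shape)  # (24,)
```

Each added depth slice contributes another full plane of values per color, which is why the three-dimensional variant yields more color information than the two-dimensional one.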

The image processing system of the present invention includes the image processing device described above, a light source that illuminates a photographed object, and an imaging means that photographs the object illuminated by the light source. Preferably, the photographed object is a mechanical part, and there are a plurality of light sources that illuminate the photographed object from a plurality of different directions.
As a result, the photographed object is illuminated from a plurality of different directions by the plurality of light sources, and the imaging means acquires uniform captured images without shadows even when the photographed object changes.

The image classification device of the present invention includes learning means that performs machine learning using training data created by the image processing device described above.
Preferably, the learning means performs machine learning using only normal data, that is, data created by the image processing device described above from captured images of photographed objects in a normal state.
As a result, a trained model capable of classifying the photographed object shown in a captured image is generated.
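The idea of learning from normal data only can be illustrated with a deliberately simple stand-in. The sketch below is not the learning means of the invention (which is described elsewhere as deep learning such as a CNN, RNN, or AutoEncoder); it substitutes a distance-to-mean score on log-amplitude spectra, and every signal, parameter, and threshold rule in it is a synthetic assumption.

```python
import numpy as np


def spectrum_features(signal):
    """Log-amplitude spectrum of a real 1-D signal."""
    return np.log1p(np.abs(np.fft.rfft(signal)))


rng = np.random.default_rng(0)
n = np.arange(64)
base = 1.0 + 0.5 * np.sin(2 * np.pi * 3 * n / 64)  # synthetic "normal" pattern


def normal_sample():
    """Synthetic normal signal: the base pattern plus small noise."""
    return base + rng.normal(0.0, 0.01, size=n.size)


# "Training" uses normal samples only: remember their mean spectrum.
train = np.stack([spectrum_features(normal_sample()) for _ in range(20)])
center = train.mean(axis=0)
train_scores = np.linalg.norm(train - center, axis=1)
threshold = train_scores.mean() + 3.0 * train_scores.std()  # assumed rule of thumb


def anomaly_score(signal):
    """Distance between a signal's spectrum and the mean normal spectrum."""
    return float(np.linalg.norm(spectrum_features(signal) - center))


normal_score = anomaly_score(normal_sample())
# Synthetic "defect": an extra frequency component absent from normal data.
defect = normal_sample() + 0.5 * np.sin(2 * np.pi * 20 * n / 64)
defect_score = anomaly_score(defect)
```

In this toy setting the defect adds energy at a frequency the normal samples never contain, so its score lies far from the normal scores. A real system would replace the distance score with the trained model's own criterion, such as an autoencoder's reconstruction error.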

The image processing method of the present invention includes a step in which the first calculation means numerically converts a captured image into a plurality of pieces of color information for each pixel and creates data by combining the numerically converted color information, and a step in which the second calculation means creates data used for machine learning by applying a fast Fourier transform to the data created by the first calculation means.

The image processing program of the present invention causes a computer to operate as an image processing device that includes a first calculation means that numerically converts a captured image into a plurality of pieces of color information for each pixel and creates data by combining the numerically converted color information, and a second calculation means that creates data used for machine learning by applying a fast Fourier transform to the data created by the first calculation means.

(1) The image processing device of the present invention includes a first calculation means that numerically converts a captured image into a plurality of pieces of color information for each pixel and creates data by combining the numerically converted color information, and a second calculation means that creates data used for machine learning by applying a fast Fourier transform to the data created by the first calculation means. With this configuration, the photographed object shown in the captured image is numerically converted into a plurality of pieces of color information, which are combined into integrated data, and the integrated data is then fast Fourier transformed. Consequently, feature quantities that are not affected by changes in the position or shape of an abnormal portion of the photographed object can be clearly extracted, and data that can be used effectively for machine learning can be created.
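One way to build intuition for the position-independence claimed in (1) is a standard property of the discrete Fourier transform: circularly shifting a signal changes only the phase of its spectrum, not the amplitude. The sketch below demonstrates that property on random data. It is offered as background only; the source does not state that the amplitude spectrum is what is used, and a defect moving within a real image does not correspond exactly to a circular shift of the combined data.

```python
import numpy as np

rng = np.random.default_rng(0)
signal = rng.integers(0, 256, size=192).astype(np.float64)

# Circularly shift the signal, as if its content had moved to another position.
shifted = np.roll(signal, 17)

amp = np.abs(np.fft.fft(signal))
amp_shifted = np.abs(np.fft.fft(shifted))

print(np.allclose(amp, amp_shifted))  # True: the amplitude spectra match
```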

(2) The first calculation means scans the captured image in the row direction and the column direction, numerically converts the pixels of the captured image into a plurality of pieces of color information, and creates data by combining the color information obtained by scanning in the row direction with the color information obtained by scanning in the column direction. With this configuration, the photographed object shown in the captured image is scanned from multiple directions, so that more color information is acquired and combined. Consequently, feature quantities that are not affected by changes in the position or shape of an abnormal portion of the photographed object can be extracted more clearly.

(3) The first calculation means numerically converts a captured image that is a three-dimensional captured image into a plurality of pieces of color information for each voxel. With this configuration, even more color information is acquired and combined, so that feature quantities that are not affected by changes in the position or shape of an abnormal portion of the photographed object can be extracted still more clearly.

(4) The image processing system of the present invention includes the image processing device described above, a light source that illuminates a photographed object, and an imaging means that photographs the object illuminated by the light source. In particular, the photographed object is a mechanical part, and a plurality of light sources illuminate it from a plurality of different directions. With this configuration, the imaging means acquires uniform captured images without shadows even when the photographed object changes, so that the feature quantities of the photographed object can be extracted stably and without unevenness.

(5) The image classification device of the present invention includes learning means that performs machine learning using training data created by the image processing device described above. In particular, the learning means performs machine learning using only normal data created by the image processing device from captured images of photographed objects in a normal state. With this configuration, a trained model capable of classifying the photographed object shown in a captured image is generated, and the photographed object shown in a captured image can be classified using that trained model.

The image processing method and the image processing program of the present invention achieve the same effects as the image processing device of the present invention.

FIG. 1 is a schematic configuration diagram of an image processing system according to an embodiment of the present invention.
FIG. 2 is a flow diagram for explaining an image processing method according to the embodiment of the present invention.
FIG. 3 is a diagram for explaining the image processing method according to the embodiment of the present invention.
FIG. 4 shows diagrams for explaining captured images: (A) is a captured image of a normal part, (B) is a captured image of an abnormal part (rusted), and (C) is a captured image of an abnormal part (dented).
FIG. 5 shows diagrams for explaining converted images created by the image processing method according to the embodiment of the present invention from the captured images shown in FIG. 4: (A) is a converted image created from the captured image of the normal part, (B) is a converted image created from the captured image of the abnormal part (rusted), and (C) is a converted image created from the captured image of the abnormal part (dented).
FIGS. 6 to 10 are diagrams for explaining converted images created from other images by the image processing method according to the embodiment of the present invention.

Embodiments of the present invention are described in detail below. The constituent elements described below are an example (a representative example) of an embodiment of the present invention, and the present invention is not limited to the following content as long as it does not depart from its gist.

[Image processing system]
FIG. 1 is a schematic configuration diagram of an image processing system according to an embodiment of the present invention. As shown in FIG. 1, the image processing system 1 according to the embodiment includes an image processing device 10, an imaging means 20, and a light source 30.

The light source 30 illuminates the photographed object O and is, for example, an LED light or a strobe.

The imaging means 20 photographs the photographed object O and is, for example, a camera that captures two-dimensional images, or a camera with a depth sensor (3D camera) that captures three-dimensional images containing depth in addition to the two-dimensional plane.

Captured image data of the photographed object O, illuminated by the light source 30 and photographed by the imaging means 20, is sent to the image processing device 10. The captured image data can be sent from the imaging means 20 to the image processing device 10 via wired or wireless communication, or via a storage medium such as a USB memory or an external hard disk.

[Image processing device]
The image processing device 10 processes captured image data to create a converted image. This converted image is used as data for machine learning, such as training data.

As shown in FIG. 1, the image processing device 10 includes an input means 110, a first calculation means 120, a second calculation means 130, a display means 140, a storage means 150, and an output means 160.

The input means 110 performs input processing of the captured image photographed by the imaging means 20. As described above, the input means 110 receives the captured image sent from the imaging means 20 via wired or wireless communication or a storage medium, and stores it in the storage means 150.
The input means 110 also accepts various input operations from the user of the image processing device 10 via a keyboard, mouse, touch panel, or the like.

The storage means 150 stores information needed by the image processing device 10 and is, for example, a memory or a database. In addition to captured images, the storage means 150 stores converted images, images in the middle of conversion, programs for operating the image processing device 10, and the like.

The first calculation means 120 numerically converts the captured image into a plurality of pieces of color information for each pixel, and creates data by combining the numerically converted color information.
When the captured image is a three-dimensional captured image, the first calculation means 120 numerically converts the captured image into a plurality of pieces of color information for each voxel, and creates data by combining the numerically converted color information.

The second calculation means 130 applies a fast Fourier transform to the data created by the first calculation means 120 to create a converted image. This converted image is used as data for machine learning.

The output means 160 outputs the converted image created by the second calculation means 130 to an external device or the like via wired or wireless communication or a storage medium.

The display means 140 displays various information to the user and is, for example, a display.

A device such as a mobile PC, a desktop PC, or a tablet can be used as the image processing device 10 according to the embodiment of the present invention by implementing on it a program that realizes the functions shown in FIG. 1 and described above.

[Image classification device]
Although not illustrated, an embodiment of the present invention also includes an image classification device having learning means that performs machine learning using converted images created by the image processing device 10 as training data or the like. As the training data, converted images created from captured images of a mechanical part (a screw), such as those shown in FIGS. 4(A), 4(B), and 4(C), are used.

The image classification device performs machine learning by learning means such as deep learning, for example a CNN (Convolutional Neural Network), an RNN (Recurrent Neural Network), or an AutoEncoder, or such as transfer learning, and generates a trained model.

The image classification device then uses the trained model to judge captured images. For example, for a captured image of a screw manufactured in a factory, it performs inference with the trained model and determines whether the screw shown in the captured image is normal (no rust or dents) or abnormal (rust or dents present).

The image processing device 10 may itself include this learning means, giving the image processing device 10 the determination function described above. In that case, the learning means learns using the converted images stored in the storage means 150 as training data, and the generated trained model is stored in the storage means 150.

[Image processing method]
FIG. 2 is a flow diagram for explaining the image processing method according to the embodiment of the present invention.
An image processing method using the image processing device according to the present embodiment is described below with reference to the drawings. In this description, the image processing method according to the present embodiment is assumed to be carried out by a user.

First, a captured image photographed by the imaging means 20 is input to the image processing device 10 (step S110). The input means 110 then performs input processing, and the captured image is stored in the storage means 150. In this description, the captured image is treated as a two-dimensional image.

Next, the first calculation means 120 numerically converts the captured image into a plurality of pieces of color information for each pixel (step S120).
FIG. 3 is a diagram for explaining the image processing method according to the embodiment of the present invention. For clarity, part of a captured image is shown enlarged; in the example shown in FIG. 3, the enlarged part of the image contains 8×8 pixels.

The first calculation means 120 then calculates a plurality of pieces of color information for each of these 8×8 pixels.
For example, when the plurality of pieces of color information are red, green, and blue values based on the RGB color model, the <Red>, <Green>, and <Blue> color information for each of the 8×8 pixels is calculated as follows.

<Red>
255 175 72 32 210 216 53 8
202 158 99 66 199 195 49 24
119 110 120 96 156 139 105 95
100 141 114 112 114 86 217 224
69 115 104 112 146 110 244 247
62 85 103 129 122 116 73 66
112 94 63 108 154 175 35 0
140 107 32 78 181 208 43 0

<Green>
1 35 156 164 209 207 156 168
18 52 149 154 210 207 145 161
78 77 102 99 189 198 184 187
147 157 63 67 146 148 255 255
155 158 61 56 149 136 255 255
129 131 106 101 77 74 56 61
102 117 150 150 43 34 0 0
93 119 161 155 37 18 0 0

<Blue>
1 10 81 42 41 42 47 85
30 39 88 54 79 79 56 91
112 88 66 42 132 144 136 140
199 183 44 34 135 143 252 248
216 193 45 31 158 153 255 252
197 167 79 64 84 96 66 65
173 148 99 84 26 42 0 4
165 145 97 73 10 18 0 5

For example, the upper-left number in each matrix is the color information of the upper-left pixel of the 8×8 pixels (the hatched portion in FIG. 3). In this example, therefore, the pixel in the hatched portion of FIG. 3 has a <Red> value of 255, a <Green> value of 1, and a <Blue> value of 1.
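The per-pixel conversion into separate color planes can be sketched as follows. This is an illustrative assumption about data layout (height × width × RGB, as commonly produced by image loaders), not a statement of how the device stores images; the values are the top-left 2×2 corner of the 8×8 example above.

```python
import numpy as np

# Top-left 2x2 corner of the 8x8 example, stored as height x width x (R, G, B).
patch = np.array(
    [
        [[255, 1, 1], [175, 35, 10]],
        [[202, 18, 30], [158, 52, 39]],
    ],
    dtype=np.uint8,
)

# Split into one 2-D plane per color channel.
red, green, blue = patch[..., 0], patch[..., 1], patch[..., 2]

print(red[0, 0], green[0, 0], blue[0, 0])  # upper-left pixel: 255 1 1
```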

The first calculation means 120 then creates data by combining the numerically converted color information (step S130). In this description, the per-pixel <Red> color information of the captured image is followed by the per-pixel <Green> color information, which is in turn followed by the per-pixel <Blue> color information.

For example, taking the color information of the 8×8 pixels above, when the photographed image is scanned in the row direction (horizontal direction), the combined data is a sequence of 64×3=192 color values such as
"255, 175, 72, ..., 208, 43, 0, ..., 1, 35, 156, ..., 18, 0, 0, ..., 1, 10, 81, ..., 18, 0, 5".
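As a minimal sketch of step S130, assuming each channel is held as a list of rows, the row-direction scan and the red, green, blue concatenation can be written as follows. The channel values are copied from the <Red>, <Green>, and <Blue> tables above; the function names `scan_rows` and `concatenate_channels` are illustrative and do not appear in the specification.

```python
# 8x8 channel values copied from the <Red>, <Green>, and <Blue> tables.
RED = [
    [255, 175, 72, 32, 210, 216, 53, 8],
    [202, 158, 99, 66, 199, 195, 49, 24],
    [119, 110, 120, 96, 156, 139, 105, 95],
    [100, 141, 114, 112, 114, 86, 217, 224],
    [69, 115, 104, 112, 146, 110, 244, 247],
    [62, 85, 103, 129, 122, 116, 73, 66],
    [112, 94, 63, 108, 154, 175, 35, 0],
    [140, 107, 32, 78, 181, 208, 43, 0],
]
GREEN = [
    [1, 35, 156, 164, 209, 207, 156, 168],
    [18, 52, 149, 154, 210, 207, 145, 161],
    [78, 77, 102, 99, 189, 198, 184, 187],
    [147, 157, 63, 67, 146, 148, 255, 255],
    [155, 158, 61, 56, 149, 136, 255, 255],
    [129, 131, 106, 101, 77, 74, 56, 61],
    [102, 117, 150, 150, 43, 34, 0, 0],
    [93, 119, 161, 155, 37, 18, 0, 0],
]
BLUE = [
    [1, 10, 81, 42, 41, 42, 47, 85],
    [30, 39, 88, 54, 79, 79, 56, 91],
    [112, 88, 66, 42, 132, 144, 136, 140],
    [199, 183, 44, 34, 135, 143, 252, 248],
    [216, 193, 45, 31, 158, 153, 255, 252],
    [197, 167, 79, 64, 84, 96, 66, 65],
    [173, 148, 99, 84, 26, 42, 0, 4],
    [165, 145, 97, 73, 10, 18, 0, 5],
]


def scan_rows(channel):
    """Flatten one channel left to right, top to bottom (row-direction scan)."""
    return [value for row in channel for value in row]


def concatenate_channels(red, green, blue):
    """Append the green sequence after red, then the blue sequence after green."""
    return scan_rows(red) + scan_rows(green) + scan_rows(blue)


combined = concatenate_channels(RED, GREEN, BLUE)
print(len(combined))   # 192
print(combined[:3])    # [255, 175, 72]
print(combined[-3:])   # [18, 0, 5]
```

The resulting list has 64 values per channel, so the red values occupy indices 0 to 63, the green values 64 to 127, and the blue values 128 to 191.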

Next, the second calculation means 130 applies a fast Fourier transform to the data created by the first calculation means 120, thereby creating a converted image (step S140).

Finally, the output means 160 outputs the converted image (step S150).
The converted image obtained by the above procedure is used as training data in machine learning.

[Example 1]
FIG. 4 is a diagram for explaining photographed images. As shown in FIGS. 4(A) to 4(C), the photographed object is a mechanical part (a screw). The photographed image in FIG. 4(B) shows a screw with rust (see the dotted circle), and the photographed image in FIG. 4(C) shows a screw with a dent (see the dotted circle). The photographed image in FIG. 4(A), by contrast, shows a normal screw with no such rust or dent.

FIG. 5 is a diagram for explaining the converted images created from the photographed images in FIG. 4 by the image processing method according to the present embodiment. FIG. 5(A) is the converted image created from the photographed image in FIG. 4(A), FIG. 5(B) is the converted image created from the photographed image in FIG. 4(B), and FIG. 5(C) is the converted image created from the photographed image in FIG. 4(C).

As shown in FIG. 5, each converted image is an image in which the frequency components are extracted as amplitudes (vertical axis). Because color information has no time axis (time information), the second calculation means 130 performs the fast Fourier transform using an arbitrary value (for example, a frequency of 100 Hz, that is, a period of 10 milliseconds).
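The specification does not give the exact computation behind step S140. As a minimal sketch, assuming the combined data is treated as a one-dimensional signal sampled at the arbitrary rate of 100 Hz, an amplitude spectrum could be computed as follows. The function name `amplitude_spectrum`, the division by the sequence length, and the use of a direct discrete Fourier transform (which yields the same coefficients as a fast Fourier transform, only more slowly) are all illustrative assumptions.

```python
import cmath
import math


def amplitude_spectrum(samples, sample_rate_hz=100.0):
    """Return (frequencies, amplitudes) for the non-negative half of the DFT.

    sample_rate_hz is an arbitrary value, because color data has no time axis.
    """
    n = len(samples)
    freqs, amps = [], []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(samples)
        )
        freqs.append(k * sample_rate_hz / n)
        amps.append(abs(coeff) / n)  # assumed normalization: divide by length
    return freqs, amps


# First row of the <Red> table, used here only as a short demonstration input.
first_red_row = [255, 175, 72, 32, 210, 216, 53, 8]
freqs, amps = amplitude_spectrum(first_red_row)
print(freqs)  # [0.0, 12.5, 25.0, 37.5, 50.0]
```

Changing `sample_rate_hz` only relabels the frequency axis; the amplitudes themselves do not depend on it, which is consistent with the value being arbitrary.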

Comparing FIG. 5(A) with FIG. 5(B) shows that the amplitude differs in some portions (see the arrows in FIG. 5(B)). Comparing FIG. 5(A) with FIG. 5(C) likewise shows portions where the amplitude differs (see the arrows in FIG. 5(C)). The amplitudes in FIG. 5(B) and FIG. 5(C) also differ from each other.

In other words, the converted images in FIG. 5 can be said to capture the features of the photographed object in each photographed image: rust, a dent, or no defect. These converted images are therefore effective as training data indicating normality or abnormality in machine learning, and performing machine learning with this training data makes it possible to generate a learned model that can achieve stable determination accuracy even with a small amount of data.

[Example 2]
FIGS. 6 to 9 are diagrams for explaining converted images created from other images by the image processing method according to the present embodiment. In each of FIGS. 6 to 9, the upper part shows an image assumed to be a photographed image, and the lower part shows the converted image created from that image by the image processing method according to the present embodiment.

[Position change]
For example, the image at the top of FIG. 6(A) shows a six-pointed star pattern that can be regarded as an abnormality (a six-pointed-star-shaped scratch). From this image, the converted image at the bottom of FIG. 6(A) was created.

The image at the top of FIG. 6(B) is the image at the top of FIG. 6(A) with only the position of the six-pointed star pattern changed. That is, the position of the pattern differs, but its size, color, and so on do not. From this image, the converted image at the bottom of FIG. 6(B) was created.

Comparing the converted images in FIGS. 6(A) and 6(B) shows that neither the magnitude nor the shape of the amplitude changes. In other words, even if the position of a scratch, rust, dent, or the like in the photographed image changes, its information (features) is still extracted accurately. In short, when the converted image is used as training data in machine learning, robustness to changes in the position of the abnormal part is high.
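One property of the Fourier transform that may contribute to this observation is that a circular shift of a sequence changes only the phase of its coefficients, not their magnitudes. The following sketch illustrates that property on a toy one-dimensional signal. The signal values and the name `dft_magnitudes` are hypothetical, and moving a mark within a real image is not exactly a circular shift of the combined color sequence, so for actual images the invariance would be expected to hold only approximately.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitude of each DFT coefficient (an FFT yields the same values)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(samples)))
        for k in range(n)
    ]


# Toy signal: a small "mark" on a flat zero background, at two positions.
background = [0.0] * 16
mark = [3.0, 7.0, 3.0]

signal_a = background[:]
signal_a[2:5] = mark      # mark near the start
signal_b = background[:]
signal_b[9:12] = mark     # same mark, moved 7 samples to the right

mags_a = dft_magnitudes(signal_a)
mags_b = dft_magnitudes(signal_b)

# Largest difference between the two magnitude spectra (floating-point noise only).
max_diff = max(abs(a - b) for a, b in zip(mags_a, mags_b))
print(max_diff < 1e-9)  # True
```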

[Color change]
FIG. 7(A) is the same as the image and converted image shown in FIG. 6(A). It is reproduced for comparison with FIG. 7(B).
FIG. 7(B), on the other hand, is the image at the top of FIG. 7(A) with the color of the six-pointed star pattern changed.

Comparing the converted images in FIGS. 7(A) and 7(B) shows that the magnitude of the amplitude changes but its shape does not. In other words, by correcting (for example, enlarging or reducing) the amplitude shown in the converted image, features from which the presence of an abnormal part can be determined are extracted.
Moreover, because the magnitude of the amplitude changes while its shape does not, the extracted features can also distinguish abnormalities that differ only in color, such as red rust and blue rust.
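The amplitude correction mentioned above can be illustrated with the linearity of the Fourier transform: multiplying every sample by a constant multiplies every amplitude by the same constant, so rescaling the spectrum recovers the same shape. The sketch below assumes, purely for illustration, that the color change amounts to a uniform scaling of the values; a real color change generally affects the red, green, and blue values differently. The names `dft_magnitudes` and `normalize` and the sample values are hypothetical.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitude of each DFT coefficient (an FFT yields the same values)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(samples)))
        for k in range(n)
    ]


def normalize(magnitudes):
    """Rescale so the largest magnitude is 1, leaving only the shape."""
    peak = max(magnitudes)
    return [m / peak for m in magnitudes] if peak else list(magnitudes)


original = [0, 0, 200, 240, 200, 0, 0, 0]
recolored = [0.5 * x for x in original]  # assumed uniform color change

shape_a = normalize(dft_magnitudes(original))
shape_b = normalize(dft_magnitudes(recolored))

# After normalization the two spectra coincide up to floating-point noise.
shape_gap = max(abs(a - b) for a, b in zip(shape_a, shape_b))
print(shape_gap < 1e-9)  # True
```

Keeping the unnormalized magnitudes as well preserves the information needed to tell the two colors apart.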

[Size change]
FIG. 8(A) is also the same as the image and converted image shown in FIG. 6(A). It is reproduced for comparison with FIGS. 8(B) and 8(C).
FIG. 8(B) is the image at the top of FIG. 8(A) with the six-pointed star pattern enlarged, and FIG. 8(C) is the same image with the pattern reduced.

Comparing the converted images in FIGS. 8(A) to 8(C) shows that both the magnitude and the shape of the amplitude change. The extracted features can therefore distinguish the "degree of abnormality", such as the size of a scratch.

[Shape change]
FIG. 9(A) is also the same as the image and converted image shown in FIG. 6(A). It is reproduced for comparison with FIGS. 9(B) and 9(C).
In FIG. 9(B), the shape of the pattern is changed from a six-pointed star to a hexagon, and in FIG. 9(C), the six-pointed star pattern is rotated 90 degrees to the left.

Comparing the converted images in FIGS. 9(A) to 9(C) shows that both the magnitude and the shape of the amplitude change. The extracted features can therefore distinguish the "type of abnormality", such as a scrape or a dent.

[Example 3]
FIG. 10 is a diagram for explaining converted images created from other images by the image processing method according to the present embodiment. In FIG. 10 as well, the upper part shows an image assumed to be a photographed image, and the lower part shows the converted image created from that image by the image processing method according to the present embodiment.

For example, the image at the top of FIG. 10(A) shows a pattern of horizontal lines that can be regarded as an abnormality (horizontal scrapes). The image at the top of FIG. 10(B) shows a pattern of vertical lines that can be regarded as an abnormality (vertical scrapes), and the image at the top of FIG. 10(C) shows a pattern of diagonal lines that can be regarded as an abnormality (diagonal scrapes).
From these images, the converted images at the bottom of FIGS. 10(A) to 10(C) were created.

Comparing the converted images in FIGS. 10(A) and 10(B) shows that the magnitude and shape of the amplitude differ. The converted image in FIG. 10(C), on the other hand, combines the amplitudes shown in the converted images in FIGS. 10(A) and 10(B).
In other words, by scanning the photographed image not only in the row direction (horizontal direction) as described above but also in the column direction (vertical direction), and creating data that combines the color information obtained from each scan, robustness to changes in the shape of the abnormal part and in the direction in which it was made (for example, a scrape made horizontally, vertically, or diagonally) is high.

Accordingly, suppose the first calculation means 120 scans the photographed image in the row direction and the column direction, numerically converts the pixels of the photographed image into a plurality of pieces of color information, and creates data by combining the color information obtained from the row-direction scan with the color information obtained from the column-direction scan, and the second calculation means 130 then applies a fast Fourier transform to that data to create a converted image. In that case, the amplitude of the converted image based on the image at the top of FIG. 9(A) and that based on the image at the top of FIG. 9(C) are expected to have approximately the same magnitude and shape.
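A minimal sketch of the combined row and column scan, assuming each channel is a list of rows, is shown below. The specification states only that the row-scan and column-scan color information are combined; the particular ordering used here (all row-scanned channels first, then all column-scanned channels) and the function names are illustrative assumptions.

```python
def scan_rows(channel):
    """Row-direction scan: left to right, top to bottom."""
    return [value for row in channel for value in row]


def scan_columns(channel):
    """Column-direction scan: top to bottom, left to right."""
    height, width = len(channel), len(channel[0])
    return [channel[r][c] for c in range(width) for r in range(height)]


def combined_scan(channels):
    """Concatenate the row scans of every channel, then the column scans."""
    row_part = [v for ch in channels for v in scan_rows(ch)]
    col_part = [v for ch in channels for v in scan_columns(ch)]
    return row_part + col_part


# Tiny hypothetical single-channel image (2 rows x 3 columns).
tiny = [
    [1, 2, 3],
    [4, 5, 6],
]
print(combined_scan([tiny]))
# [1, 2, 3, 4, 5, 6, 1, 4, 2, 5, 3, 6]
```

For an 8×8 RGB image this doubles the length of the data from 192 to 384 values before the Fourier transform is applied.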

Although an embodiment of the present invention has been described above, the described embodiment is merely an example, and the present invention is not limited to it as long as the gist of the invention is not departed from.

[Unsupervised learning]
In the above, an image classification device has been described that has a learning means for performing machine learning using the converted images created by the image processing device 10 as training data. That is, the learning means has been described as performing machine learning based on training data that includes both normal data and abnormal data, as follows.
- Normal data: training data created from a photographed image (FIG. 4(A)) of a photographed object (screw) in a normal state.
- Abnormal data: training data created from photographed images (FIGS. 4(B) and 4(C)) of a photographed object (screw) in an abnormal state, such as one with rust or a dent.

In addition to supervised learning using training data labeled "normal", "abnormal (rust)", "abnormal (dent)", and so on, the learning means can also perform unsupervised learning.
For example, the learning means can perform machine learning using only the normal data described above. Specifically, using One-Class SVM (Support Vector Machine), a method applied to one-class classification in unsupervised learning, the learning means learns a single class as normal data to determine a discrimination boundary, and detects outliers (abnormal data) with reference to that boundary.

In other words, as is clear from the converted images in FIGS. 5(A) to 5(C), the image processing device 10 extracts feature quantities sufficient to determine the discrimination boundary between normal data (FIG. 5(A)) and abnormal data (FIGS. 5(B) and 5(C)). The learning means can therefore learn, for example using the data in FIG. 5(A) as a reference and based on the Euclidean distance, the Mahalanobis distance, or the like, to judge data whose feature quantities (indicated by the magnitude, shape, and so on of the amplitude) fall inside the discrimination boundary as normal, and data judged to fall outside the boundary as abnormal.
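The following is a minimal sketch of learning from normal data only. It is not One-Class SVM: it swaps in a much simpler rule, a Euclidean-distance threshold around the mean of the normal amplitude vectors, to illustrate the idea of a discrimination boundary. The function names, the `margin` parameter, and the amplitude values are all hypothetical.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equally long vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fit_boundary(normal_vectors, margin=1.5):
    """Use the mean of the normal data as the center; the radius is the
    largest normal distance times an assumed safety margin."""
    center = mean_vector(normal_vectors)
    radius = margin * max(euclidean(v, center) for v in normal_vectors)
    return center, radius


def is_abnormal(vector, center, radius):
    """Outside the boundary means abnormal; inside means normal."""
    return euclidean(vector, center) > radius


# Hypothetical amplitude vectors taken from converted images of normal parts.
normal_spectra = [[10.0, 2.0, 1.0], [11.0, 2.0, 1.0], [9.0, 3.0, 1.0]]
center, radius = fit_boundary(normal_spectra)

print(is_abnormal([10.0, 2.0, 1.0], center, radius))  # False
print(is_abnormal([10.0, 8.0, 4.0], center, radius))  # True
```

A Mahalanobis distance would additionally account for the variance of each amplitude component, at the cost of needing enough normal samples to estimate a covariance matrix.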

As a practical matter, even when supervised learning with both normal data and abnormal data is desired, obtaining the abnormal data can be difficult. For example, in a factory that manufactures mechanical parts, the probability of producing a defective part is extremely low, on the order of one in 100,000 to one in 1,000,000. It can therefore be extremely difficult to obtain sufficient abnormal data for each of several types of abnormality, such as red rust, blue rust, and dents.
With the image processing device 10 and the image classification device of the present invention, unsupervised learning can be performed with the created converted images even in such a case, so stable determination accuracy can be achieved even with a small amount of data.

In this way, the converted images created by the image processing device 10 can be used for both supervised learning and unsupervised learning.

[Preprocessing]
Preprocessing (image processing) can also be applied to the photographed image.
For example, the first calculation means 120 converts the tone of the photographed image into red, green, and blue tones based on the RGB color model. This produces a plurality of images: a photographed image converted to a red tone, one converted to a green tone, and one converted to a blue tone. The first calculation means 120 then numerically converts each of these generated (converted) images into a plurality of pieces of color information.

Tone conversion based on the RGB color model means, for example, that when RGB values are specified in the range 0 to 255, the photographed image converted to a red tone is converted so that the red value is 255 and the green and blue values are 0.

The first calculation means 120 then numerically converts the converted images into a plurality of pieces of color information. At this point, (1) the first calculation means 120 may create a single set of data by combining the color information from the photographed images converted to the red, green, and blue tones. Alternatively, (2) it may create separate sets of data: one combining the color information from the photographed image converted to a red tone, one combining the color information from the photographed image converted to a green tone, and one combining the color information from the photographed image converted to a blue tone.

Applying such preprocessing makes it possible to extract the feature quantities of an abnormal location in the photographed image more clearly, and thus to identify the abnormal location more accurately. This is because some abnormal locations, such as red rust and blue rust, are characterized by their color, and the preprocessing makes those characteristics stand out.
For example, suppose that, as in (2) above, the first calculation means 120 creates combined data from the photographed image converted to a red tone, and the second calculation means 130 then creates learning data used for machine learning. Even if that learning data does not clearly capture the feature quantities of the abnormal location, the learning data created from the photographed image converted to a green or blue tone may capture them clearly.

The image classification device then performs machine learning with the learning data created from the photographed images converted to the red, green, and blue tones, and a learned model is generated for each. When inference is performed on a given photographed image with these learned models, if any one of them judges the image abnormal, it can be concluded that the photographed object in that image has an abnormal location.
In detection methods based on photographed images using machine learning in particular, judging a normal item as abnormal is tolerable to some extent, but judging an abnormal item as normal is not.
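The decision rule described above, under which one abnormal verdict is enough, can be sketched as a logical OR over the per-tone verdicts. The function name and the dictionary of verdicts are hypothetical; in practice each verdict would come from the learned model trained on the corresponding tone.

```python
def has_abnormality(per_tone_verdicts):
    """Return True if at least one per-tone model judged the image abnormal."""
    return any(per_tone_verdicts.values())


# Hypothetical verdicts from the red, green, and blue models for one image.
verdicts = {"red": False, "green": True, "blue": False}
print(has_abnormality(verdicts))  # True

all_normal = {"red": False, "green": False, "blue": False}
print(has_abnormality(all_normal))  # False
```

This rule favors catching abnormal parts over avoiding false alarms, in line with the tolerance stated above.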

[Three-dimensional image]
The above description has mainly treated the photographed image as a two-dimensional image, but the image processing device 10 can also create a converted image from a three-dimensional image that includes depth in addition to the two planar dimensions. In this case, the first calculation means 120 numerically converts the photographed image into a plurality of pieces of color information for each volume element (voxel) and creates data by combining the numerically converted color information. The second calculation means 130 then creates a converted image by applying a fast Fourier transform to that data.

Because the color information is then acquired from three-dimensional information (XYZ), which carries more information than two-dimensional information (XY), the features of abnormal parts such as rust, dents, and scratches can be extracted more effectively. For example, when the photographed object has a dent, the deeper part of the dent is darker than the shallower part near the surface; that is, the color information differs, so the feature can be extracted more accurately.

Furthermore, the photographed object is not limited to mechanical parts such as screws and nuts; it may be a part inside or on the human body, such as the inside of the stomach or the skin, where an abnormality such as a malignant tumor occurs.

The plurality of pieces of color information are not limited to red, green, and blue information based on the RGB color model; they may be hue, saturation, and value (brightness) information based on the HSV model, or other color information.
The color information can be chosen as appropriate for the photographed object, for example according to whether it is brightly colored, darkly colored, or glossy.
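As a small illustration of the HSV alternative, the upper-left pixel of the earlier example (red 255, green 1, blue 1) can be converted with Python's standard `colorsys` module. The wrapper name `rgb_to_hsv_255` is hypothetical, and `colorsys` returns hue, saturation, and value each scaled to the range 0 to 1, which is only one of several possible numeric conventions.

```python
import colorsys


def rgb_to_hsv_255(red, green, blue):
    """Convert 0-255 RGB values to (hue, saturation, value), each in 0-1."""
    return colorsys.rgb_to_hsv(red / 255.0, green / 255.0, blue / 255.0)


# Upper-left pixel of the 8x8 example: red 255, green 1, blue 1.
hue, saturation, value = rgb_to_hsv_255(255, 1, 1)
print(hue, value)  # 0.0 1.0
```

The three HSV values per pixel could then be combined into one sequence in the same way as the red, green, and blue values.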

The present invention is industrially useful because it can be used effectively in fields such as supervised learning and unsupervised learning, as an image processing device, image processing method, and the like that create data used for machine learning so as to achieve stable determination accuracy even with a small amount of data.

1 Image processing system
10 Image processing device
110 Input means
120 First calculation means
130 Second calculation means
140 Display means
150 Storage means
160 Output means
20 Photographing means
30 Light source
O Photographed object

Claims

1. An image processing device comprising:
a first calculation means that numerically converts a photographed image into a plurality of pieces of color information for each pixel and creates data by combining the numerically converted color information; and
a second calculation means that creates data used for machine learning by applying a fast Fourier transform to the data created by the first calculation means.

2. The image processing device according to claim 1, wherein the first calculation means scans the photographed image in the row direction and the column direction, numerically converts the pixels of the photographed image into a plurality of pieces of color information, and creates the data by combining the color information obtained by scanning in the row direction with the color information obtained by scanning in the column direction.

3. The image processing device according to claim 1, wherein the first calculation means numerically converts the photographed image, which is a three-dimensional photographed image, into a plurality of pieces of color information for each volume element.

4. The image processing device according to claim 2, wherein the plurality of pieces of color information are red, green, and blue information based on an RGB color model, or hue, saturation, and value information based on an HSV model.

5. The image processing device according to claim 2 or 3, wherein the plurality of pieces of color information are red, green, and blue information based on an RGB color model, and the first calculation means converts the photographed image into red, green, and blue tones based on the RGB color model and numerically converts the converted images into the plurality of pieces of color information.

6. An image processing system comprising:
a light source that illuminates an object to be photographed;
a photographing means that photographs the object illuminated by the light source; and
the image processing device according to claim 4.

7. The image processing system according to claim 6, wherein the object to be photographed is a mechanical part, and a plurality of the light sources are provided and illuminate the object to be photographed from a plurality of different directions.

8. An image classification device comprising a learning means that performs machine learning using training data created by the image processing device according to claim 1.

9. The image classification device according to claim 8, wherein the learning means further performs machine learning using only normal data created by the image processing device from a photographed image of a photographed object in a normal state.

10. A learned model generated by the image classification device according to claim 8 or 9.

11. An image processing method comprising:
a step in which a first calculation means numerically converts a photographed image into a plurality of pieces of color information for each pixel and creates data by combining the numerically converted color information; and
a step in which a second calculation means creates data used for machine learning by applying a fast Fourier transform to the data created by the first calculation means.

12. An image processing program that causes a computer to operate as an image processing device having:
a first calculation means that numerically converts a photographed image into a plurality of pieces of color information for each pixel and creates data by combining the numerically converted color information; and
a second calculation means that creates data used for machine learning by applying a fast Fourier transform to the data created by the first calculation means.