WO2022137841A1 - Anomaly detection system, learning device, anomaly detection program, and learning program - Google Patents
Anomaly detection system, learning device, anomaly detection program, and learning program
- Publication number: WO2022137841A1 (application PCT/JP2021/040920)
- Authority
- WO
- WIPO (PCT)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0004—Industrial image inspection
- G06T7/001—Industrial image inspection using an image reference approach
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N21/00—Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
- G01N21/84—Systems specially adapted for particular applications
- G01N21/88—Investigating the presence of flaws or contamination
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/48—Extraction of image or video features by mapping characteristic values of the pattern into a parameter space, e.g. Hough transformation
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/761—Proximity, similarity or dissimilarity measures
Definitions
- the present invention relates to an abnormality detection system, a learning device, an abnormality detection program, and a learning program.
- AE: AutoEncoder
- VAE: Variational AutoEncoder
- The abnormality detection system disclosed in Patent Document 1 includes a storage unit that stores a latent variable model and a joint probability model, an acquisition unit that acquires sensor data output by a sensor, a measurement unit that measures the likelihood of the acquired sensor data based on the stored latent variable model and joint probability model, a determination unit that determines whether the sensor data is normal or abnormal based on the measured likelihood, and a learning unit that learns the latent variable model and the joint probability model from the sensor data output by the sensor.
- An image restoration generation unit generates a restored image in which the input inspection target image, representing the appearance of the inspection target, is restored within a subspace of the feature space representing non-defective features, the subspace having been obtained in advance from feature vectors extracted from a plurality of non-defective images showing the appearance of good inspection objects.
- the abnormality determination unit compares the generated restored image with the inspection target image and detects an abnormality in the appearance of the inspection target object.
- The present invention has been made in view of the above circumstances, and an object of the present invention is to provide an abnormality detection system, a learning device, an abnormality detection program, and a learning program that can secure stable determination accuracy, independent of image size, when detecting defects in the appearance of an object.
- An anomaly detection system that detects defects in the appearance of an object.
- An input unit for inputting inspection images of a target object in a plurality of image sizes equal to or larger than a predetermined size;
- a feature extraction unit trained in advance to extract a feature map from training images including non-defective images of the target object, and an image generation unit trained in advance to restore the training image from the extracted feature map;
- and a detection unit that detects an abnormality of the target object based on a similarity calculated by comparing the inspection image input to the input unit with the restored image of that inspection image produced by the feature extraction unit and the image generation unit. An abnormality detection system comprising these units.
- The abnormality detection system according to (3) above, wherein the feature extraction unit extracts a feature map satisfying the following formula (1), where M is the size of the inspection image and N is the size of the feature map: N ≥ M × (1/2)^a … formula (1)
- Here, M and N are the number of vertical or horizontal pixels, and a is the number of convolutional layers of the feature extraction unit.
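Formula (1) can be checked numerically. The sketch below is an illustrative aid, not part of the patent; it assumes each of the a convolutional stages halves the resolution, computes the bound M × (1/2)^a, and tests whether a feature-map size N satisfies it:

```python
def min_feature_map_size(m: int, a: int) -> float:
    """Right-hand side of formula (1): M * (1/2)^a."""
    return m * (0.5 ** a)

def satisfies_formula_1(m: int, n: int, a: int) -> bool:
    """True when the feature map size N meets N >= M * (1/2)^a."""
    return n >= min_feature_map_size(m, a)

# A 512-pixel image passed through 6 halving stages yields an 8-pixel map,
# which is exactly the lower bound allowed by formula (1).
print(satisfies_formula_1(512, 8, 6))   # True: 8 >= 512 * (1/2)^6 = 8
print(satisfies_formula_1(512, 4, 6))   # False: 4 < 8
```

For example, a 1024-pixel input with seven halving convolutional stages still permits the 8-pixel minimum feature map discussed later, since 1024 × (1/2)^7 = 8.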
- A learning device for training a learning model used in anomaly detection for detecting defects in the appearance of an object, the learning model comprising a feature extraction unit and an image generation unit, the learning device comprising: an input unit for inputting training images including non-defective images of the target object; the feature extraction unit, which extracts a feature map from the training image input to the input unit; the image generation unit, which generates a restored image by restoring the training image from the extracted feature map; and a learning unit that updates the parameters of the feature extraction unit and the image generation unit based on the training image and the restored image. The input unit inputs training images of a plurality of image sizes equal to or larger than the predetermined size.
- The abnormality detection system comprises an input unit for inputting inspection images of a plurality of image sizes equal to or larger than a predetermined size, a feature extraction unit that extracts a feature map and is trained on training images including non-defective images of the target object, an image generation unit, and a detection unit for detecting an abnormality in the target object. As a result, stable determination accuracy can be ensured regardless of the image size.
- FIG. 1 is a diagram showing the configuration of the abnormality detection system 100.
- FIG. 2 is a block diagram of the abnormality detection system 100.
- the abnormality detection system 100 is connected to the photographing device 50 by a network 90 or a cable. Further, the photographing device 50 may be included in the configuration of the abnormality detection system 100.
- the abnormality detection system 100 functions as a learning device during learning.
- the photographing device 50 photographs an object to be inspected, generates image data, and outputs the image data.
- This image data is also an inspection image 350 (or training image 351) to be inspected (see FIG. 4 described later).
- the photographing device 50 is composed of, for example, a camera.
- the inspection target is, for example, a predetermined product, and the product includes a substrate or other electronic circuits, or parts such as bolts and nuts.
- The inspection includes sorting products into non-defective and defective by detecting the presence or absence of abnormalities such as breakage, bending, chipping, scratches, and stains. Alternatively, the inspection may only detect such abnormal portions.
- The photographing device 50 photographs an imaging range including the inspection target and outputs a captured image (image data). Captured images of a plurality of image sizes may be output.
- The captured image is black-and-white or color and has an image size (number of pixels) such as 720 × 480 (SD), 1920 × 1080 (HD), or 3840 × 2160 (4K).
- An image size of 512 × 512 or 1024 × 1024 may also be obtained from these captured images by trimming, compression, or the like. From the viewpoint of processing speed, it is preferable that the input image size be no larger than 2000 × 2000.
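The size policy just described (native sizes accepted, preferably no larger than 2000 × 2000) could be sketched as below. The helper name and the aspect-preserving scaling rule are illustrative assumptions, not specified in the patent:

```python
MAX_SIDE = 2000  # preferred upper bound stated in the description

def plan_input_size(width: int, height: int) -> tuple[int, int]:
    """Return the size at which a captured image would be fed to the model:
    unchanged when both sides are within the cap, otherwise scaled down so
    the longer side equals the cap (aspect ratio preserved)."""
    longest = max(width, height)
    if longest <= MAX_SIDE:
        return width, height
    scale = MAX_SIDE / longest
    return round(width * scale), round(height * scale)

print(plan_input_size(1920, 1080))  # HD frame fits the cap: unchanged
print(plan_input_size(3840, 2160))  # 4K frame: scaled down to the cap
```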
- the photographing device 50 transmits the generated photographed image to the abnormality detection system 100.
- the abnormality detection system 100 includes a control unit 110, a storage unit 120, a communication unit 130, and an operation display unit 140. These components are connected to each other via the bus 150.
- the abnormality detection system 100 is composed of, for example, a computer terminal.
- the abnormality detection system 100 may be an on-premises server or a cloud server using a commercial cloud service.
- The control unit 110 is composed of a CPU (Central Processing Unit) and memory such as RAM (Random Access Memory) and ROM (Read Only Memory), and controls each part of the abnormality detection system 100 and performs arithmetic processing according to a program. The details of the functions of the control unit 110 will be described later.
- the storage unit 120 is composed of an HDD (Hard Disk Drive), SSD (Solid State Drive), etc., and stores various programs and various data.
- a learning model learned by machine learning (learning model 200 described later) is stored in the storage unit 120. Further, the training image used for learning may be stored in the storage unit 120.
- the communication unit 130 is an interface circuit (for example, a LAN card or the like) for communicating with an external device via a network.
- the communication unit 130 receives the captured image generated by the photographing device 50, and passes the received captured image to the input unit 111 (described later) and the storage unit 120.
- the operation display unit 140 may be composed of, for example, a touch panel, a liquid crystal display, and a signal tower.
- the operation display unit 140 receives various inputs from the user.
- the operation display unit 140 displays the inspection result of the inspection target.
- FIG. 3 is a functional block diagram showing the function of the control unit 110 during learning of the abnormality detection system 100
- FIG. 4 is a schematic diagram showing a configuration example of the control unit 110 during learning
- FIG. 5 is a flowchart showing a learning process of the abnormality detection system 100.
- the abnormality detection system 100 functions as a learning device at the time of learning.
- the control unit 110 functions as an input unit 111 and a learning unit 112.
- the input unit 111 can input a plurality of sizes of captured images (training image, inspection image).
- the learning unit 112 learns and generates a learning model using a large number of training images input from the input unit 111.
- At the time of learning of the abnormality detection system 100, captured images (image data) obtained by photographing a plurality of normal inspection targets are used as training images (learning data).
- Hereinafter, the image data of a normal target object (non-defective product) among the captured images is referred to as the "training image 351".
- An example of a target object is, for example, an electronic circuit (board).
- A training image group composed of a plurality of training images 351 is used as input data for the learning model 200, which is composed of an autoencoder (AE) or a variational autoencoder (VAE).
- the learning model 200 is a model of a neural network composed of a feature extraction unit 201 (also referred to as an encoder) and an image generation unit 202 (also referred to as a decoder).
- The feature extraction unit 201 applies a plurality of convolution layers and pooling layers (where "pooling layer" alone means a maximum pooling layer or an average pooling layer; the same applies hereinafter) to the input data and outputs the resulting feature map 355 to the image generation unit 202, which restores and outputs the input data.
- The training image 351 is input, and learning is performed by backpropagation so that the difference (loss) between the restored image 360 output from the learning model 200 and the training image 351 approaches zero.
- the learning unit 112 generates or updates the learning model.
- the feature extraction unit 201 as an encoder is composed of a plurality of convolution layers and a pooling layer.
- the pooling layer referred to here is, for example, the maximum pooling layer. For example, maximum pooling is performed in a 2 ⁇ 2 area.
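Maximum pooling over 2 × 2 areas, as performed by the encoder's pooling layers, can be shown with a minimal sketch on a 2-D list of pixel values (an illustration only, not the patent's implementation):

```python
def max_pool_2x2(img):
    """2x2 max pooling with stride 2 over a 2-D list of pixel values,
    halving each spatial dimension as in the encoder's pooling layers."""
    h, w = len(img), len(img[0])
    return [
        [max(img[r][c], img[r][c + 1], img[r + 1][c], img[r + 1][c + 1])
         for c in range(0, w, 2)]
        for r in range(0, h, 2)
    ]

tile = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 9, 2],
    [2, 8, 3, 7],
]
print(max_pool_2x2(tile))  # [[4, 5], [8, 9]]
```

Each 2 × 2 block collapses to its maximum, so a 4 × 4 input becomes a 2 × 2 output; stacking such layers produces the halving per stage assumed by formula (1).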
- The feature extraction unit 201 does not include a fully connected layer or a global average pooling (GAP) layer. As a result, the feature map 355 obtained from the input captured image retains the spatial information of the captured image.
- The feature extraction unit 201 extracts a feature map 355 having a vertical and horizontal size of 8 pixels or more regardless of the image size of the input captured image. To achieve this, the system is configured at learning time to extract a feature map 355 of at least 8 × 8 pixels.
- The size of the feature map 355 extracted by the feature extraction unit 201 is proportional to the size of the input captured image and is set so as to satisfy the following formula (1): N ≥ M × (1/2)^a … formula (1)
- Here, M is the vertical or horizontal size (number of pixels) of the inspection image 350 (or training image 351), N is the corresponding size of the feature map, and a is the number of convolutional layers of the feature extraction unit 201.
- The reason for formula (1) is that when the feature extraction unit 201 downsamples the input captured image, convolution processing must first abstract the information; otherwise, characteristic information of the non-defective image may be lost during downsampling.
- the structures of the feature extraction unit 201 and the image generation unit 202 may be changed according to the input image size.
- the structural change is, for example, a change in the number of strides, the number of layers of the convolutional layer (or the deconvolutional layer), and the like (see Structures 1 to 3 described later).
- the image generation unit 202 has a configuration corresponding to the feature extraction unit 201, that is, a configuration in which the configuration of the feature extraction unit 201 is reversed.
- The image generation unit 202 includes deconvolution layers and unpooling layers (also called upsampling layers) corresponding respectively to the convolution layers and pooling layers of the feature extraction unit 201, so that the image input to the feature extraction unit 201 and the restored image 360 output from the image generation unit 202 have the same size.
- the input unit 111 acquires a training image group including a plurality of training images 351 from the photographing device 50 via the communication unit 130. Alternatively, this training image group is temporarily stored in the storage unit 120 in advance. Then, the input unit 111 acquires this.
- This training image group includes a training image 351 composed of a plurality of different image sizes of a predetermined size or larger.
- the predetermined size has a vertical and horizontal size of 512 pixels or more, more preferably 1024 pixels or more.
- a processed image subjected to various processing by the input unit 111 may also be used in order to increase the number of samples of the training image.
- Various processes include a trimming process for cutting out a part of the training image 351, a rotation process, an inversion (mirror image) process, and the like.
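The augmentation processes named above (trimming, rotation, inversion) can be sketched with plain list operations. The helper names are hypothetical and serve only to illustrate the kinds of processing described:

```python
def crop(img, top, left, size):
    """Trimming: cut a size x size patch out of the training image."""
    return [row[left:left + size] for row in img[top:top + size]]

def rotate_90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]

def mirror(img):
    """Horizontal inversion (mirror image)."""
    return [row[::-1] for row in img]

sample = [[1, 2], [3, 4]]
print(rotate_90(sample))  # [[3, 1], [4, 2]]
print(mirror(sample))     # [[2, 1], [4, 3]]
```

Applying several such transforms to each training image 351 multiplies the number of samples without collecting new photographs.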
- Step S402 The control unit 110 selects a learning model 200 having a different structure according to the image size of the training image 351 to be trained.
- any of the following (Structure 1) to (Structure 3) can be applied.
- (Structure 1) The differing structural element is the stride; all kernels (filters) are shared. The larger the image size, the larger the stride. In this case, the other structural elements (number of layers, kernel size, amount of padding) are the same.
- (Structure 2) The differing structural element is the number of layers; some kernels are shared. Specifically, the number of convolutional layers (deconvolutional layers) is varied according to the image size: when the image size is larger than the predetermined size, more layers are used. For the shared depth, the same kernels are used in common; that is, layers are added before or after the encoder or decoder used for the small size.
- (Structure 3) The differing structural element is the number of layers, and kernels are not shared. Specifically, a plurality of learning models with different numbers of layers and different kernels are used selectively according to the image size, and each is trained separately as described below.
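The choice among Structures 1 to 3 amounts to a configuration lookup keyed on image size. The thresholds and layer counts below are hypothetical; the patent only fixes the direction of the change (larger images get a larger stride under Structure 1, or more layers under Structure 2):

```python
# Hypothetical size threshold and counts; only the direction of the change
# (bigger image -> bigger stride or more layers) comes from the description.
def select_structure_1(image_side: int) -> dict:
    """Structure 1: all kernels shared, stride grows with image size."""
    stride = 4 if image_side > 1024 else 2
    return {"stride": stride, "layers": 6, "shared_kernels": "all"}

def select_structure_2(image_side: int) -> dict:
    """Structure 2: extra convolution layers are added before or after the
    small-size encoder/decoder, reusing its kernels for the shared depth."""
    layers = 6 + (1 if image_side > 1024 else 0)
    return {"stride": 2, "layers": layers, "shared_kernels": "partial"}

print(select_structure_1(2000))  # larger stride for the large image
print(select_structure_2(512))   # base layer count for the small image
```

Structure 3 would instead map each size bracket to an entirely separate model with its own kernels, trained independently.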
- Step S403 Using the selected learning model 200, the training image 351 is input to the feature extraction unit 201, and the restored image 360 is output from the image generation unit 202 via the feature map 355.
- Step S404 The learning unit 112 updates the parameters of the learning model 200 (feature extraction unit 201 and image generation unit 202) based on the error between the training image 351 used in step S403 and the restored image 360. Specifically, the difference between the training image 351 and the restored image 360 is taken, and the parameters are updated so that this error becomes small.
- Step S405 When the learning for a predetermined number of times is completed (YES), for example, when the learning for all the training images 351 included in the training image group is completed, the process proceeds to step S406. If it is not completed, the process is returned to step S402, and learning using the next training image 351 is repeated.
- Step S406 The control unit 110 stores the learning model 200 generated or updated by such machine learning in the storage unit 120, and ends the learning process (end).
- FIG. 6 is a functional block diagram of the control unit 110 when an abnormality is detected in the abnormality detection system 100.
- FIG. 7 is a schematic diagram showing a configuration example of the control unit 110, and
- FIG. 8 is a flowchart showing an abnormality detection process.
- control unit 110 functions as an input unit 111, a calculation unit 115, and a detection unit 116.
- the input unit 111 acquires a captured image from the photographing device 50 via the communication unit 130 in the same manner as during the above learning.
- This photographed image is obtained by photographing the target object that is the subject of the actual inspection with the photographing device 50, and is hereinafter referred to as the inspection image, or "inspection image 350".
- the input inspection image 350 is input to the learning model 200 and the restored image 360 is output.
- The feature extraction unit 201, as the encoder of the learning model 200, generates a feature map 355 in the process. Even when the image size of the input inspection image is large, the structure of the learning model 200 is changed as described above (for example, Structures 1 to 3) so that the feature map 355 has a size of 8 × 8 pixels or more.
- the feature map 355 is set to have a size proportional to the input image size.
- The input inspection image is fed from the input unit 111 at its native size, without resizing to a fixed image size (for example, 256 × 256 or 512 × 512), and a feature map 355 proportional to that image size is extracted. Alternatively, the image may be resized in several steps according to its size, or an upper limit on the input image size may be set, with images exceeding it resized down to the limit.
- For example, captured images of 2000 pixels or less in both the vertical and horizontal directions are input as is, while images exceeding 2000 pixels in either direction are resized so that both directions become 2000 pixels or less.
- The feature extraction unit 201 as an encoder is composed of a plurality of convolution layers and pooling layers, but has no fully connected layer or global average pooling (GAP) layer.
- the size of the feature map 355 extracted by the feature extraction unit 201 is a size proportional to the size of the input inspection image 350. Further, by using the learning model 200 learned as described above, the size of the feature map 355 has a vertical and horizontal size of 8 pixels or more, and satisfies the above equation (1).
- The calculation unit 115 calculates the degree of similarity between the restored data output from the learning model 200 and the inspection image on which it is based. For example, the calculation unit 115 calculates and outputs the absolute value of the pixel-wise difference between the restored data and the inspection image as the degree of similarity, or the root mean square of that difference. The calculation unit 115 may also calculate the similarity by a well-known method such as SSIM or cosine distance. The similarity may be output as a score.
- The detection unit 116 detects an abnormality in the inspection image based on the similarity calculated by the calculation unit 115 and outputs the detection result. For example, the detection unit 116 may treat pixels where the absolute difference between the restored data and the inspection image is equal to or greater than a predetermined threshold as abnormal (defective), or may determine an inspection image as abnormal when the root mean square of that difference is equal to or greater than a predetermined threshold.
- The detection unit 116 may also determine an inspection image as abnormal when a similarity calculated by a well-known method such as SSIM or cosine distance is less than a predetermined threshold. These thresholds can be set experimentally in view of the required abnormality detection accuracy of the abnormality detection system 100.
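The similarity calculation and threshold-based detection just described can be sketched as follows. The per-pixel and RMS thresholds are illustrative values, not figures from the patent:

```python
import math

def abs_diff_map(inspect_img, restored_img):
    """Per-pixel absolute difference between inspection and restored image."""
    return [[abs(a - b) for a, b in zip(ra, rb)]
            for ra, rb in zip(inspect_img, restored_img)]

def rms_score(diff):
    """Root mean square of the difference map, used as a single score."""
    flat = [d for row in diff for d in row]
    return math.sqrt(sum(d * d for d in flat) / len(flat))

def detect_anomaly(inspect_img, restored_img, pixel_thr=50, rms_thr=10.0):
    """Flag pixels whose difference reaches pixel_thr, and flag the whole
    image when the RMS score reaches rms_thr (thresholds illustrative)."""
    diff = abs_diff_map(inspect_img, restored_img)
    defect_pixels = [(r, c) for r, row in enumerate(diff)
                     for c, d in enumerate(row) if d >= pixel_thr]
    return rms_score(diff) >= rms_thr, defect_pixels

restored = [[100, 100], [100, 100]]
good = [[100, 100], [100, 100]]
scratched = [[100, 100], [100, 200]]
print(detect_anomaly(good, restored))       # (False, [])
print(detect_anomaly(scratched, restored))  # (True, [(1, 1)])
```

A good product reconstructs almost exactly, so both scores stay low; the simulated scratch at pixel (1, 1) survives reconstruction poorly and is flagged both per pixel and at image level.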
- the input unit 111 acquires a photographed image (inspection image 350) of the inspection object from the photographing apparatus 50 or the like.
- the image size of the inspection image 350 is a plurality of types of image sizes of a predetermined size or more.
- the predetermined size has a vertical and horizontal size of 512 pixels or more, more preferably 1024 pixels or more.
- Step S502 The control unit 110 changes the structure of the learning model 200 according to the image size of the inspection image 350.
- the learning model is changed to one of the above structures (Structure 1) to (Structure 3).
- a learning model 200 having a different number of strides (structure 1) or a different number of layers (structures 2 and 3) is read from the storage unit 120 and used.
- Step S503 Using the learning model 200 after changing the structure, the inspection image 350 is input to the feature extraction unit 201, and the restored image 360 is output from the image generation unit 202 via the feature map 355.
- Step S504 The calculation unit 115 calculates the degree of similarity between the restored image 360 obtained in step S503 and the inspection image 350 that is the basis thereof. The similarity is output as a score.
- Step S505 The detection unit 116 detects an abnormality in the inspection image, that is, an abnormality in the target object that is the subject of the inspection image, based on the similarity obtained in step S504, and outputs a determination result.
- FIG. 9 is a schematic diagram illustrating the relationship between the size of the feature map and the restoration accuracy.
- In the area B, the information in the spatial direction can be used for reconstruction (restoration) as intended.
- The area A, by contrast, is created by kernel operations that are already incomplete due to the influence of padding, and these incomplete kernel operations are repeated further in the subsequent decoding. For the rightmost pixels, there is an area a1 (one pixel) not affected by the padding and an area a2 (one pixel) affected by the padding, and the calculation also involves the padded area a3 (the pixels added by padding). Because many of the pixels used in the calculation belong to areas a2 and a3, the degree of incompleteness is high.
- When the size of the feature map is 8 × 8 or more, the number of pixels in area A is less than the number of pixels in area B; when it is 6 × 6 or less, the opposite holds, and the number of pixels in area A exceeds the number of pixels in area B.
- At 8 × 8, 36 reconstructable pixels (6 × 6) can be secured. This is larger than the number of pixels in area A (28), so area B, which can be reconstructed as intended, is dominant.
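The pixel counts for areas A and B follow from simple arithmetic on an n × n feature map with a one-pixel padding-affected border. This small check, an illustration rather than the patent's method, reproduces the 8 × 8 versus 6 × 6 comparison:

```python
def border_and_interior(n: int) -> tuple[int, int]:
    """Pixel counts of the padding-affected border (area A, width 1)
    and the cleanly reconstructable interior (area B) of an n x n map."""
    interior = (n - 2) ** 2
    border = n * n - interior
    return border, interior

print(border_and_interior(8))  # (28, 36): interior (area B) dominates
print(border_and_interior(6))  # (20, 16): border (area A) dominates
```

At 8 × 8 the interior's 36 pixels outnumber the 28 border pixels, while at 6 × 6 the relation reverses, which is why an 8 × 8 minimum feature map is required.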
- In the present embodiment, a feature map 355 having a size proportional to the input image size is extracted: the input unit 111 feeds the input image to the learning model 200 as is, without resizing it to a predetermined size, a feature map 355 proportional to that size is extracted, and the restored image 360 is obtained from it. In this way, an abnormality can be detected with at least a certain level of accuracy regardless of the image size input to the input unit 111.
- The configuration of the abnormality detection system 100 described above is the main configuration described in explaining the features of the above embodiment; it is not limited to this configuration and can be modified within the scope of the claims. Configurations provided in a general abnormality detection system 100 are not excluded.
- the means and methods for performing various processes in the abnormality detection system 100 (or learning device) according to the above-described embodiment can be realized by either a dedicated hardware circuit or a programmed computer.
- The above programs, including the abnormality detection program and the learning program, may be provided on a computer-readable recording medium such as a USB memory or a DVD (Digital Versatile Disc)-ROM, or may be provided online via a network such as the Internet.
- the program recorded on the computer-readable recording medium is usually transferred to and stored in a storage unit such as a hard disk.
- the above program may be provided as a single application software, or may be incorporated into the software of the device as a function of the device.
Abstract
Description
所定サイズ以上の複数種類の画像サイズの対象物体の検査画像を入力する入力部と、
前記対象物体の良品画像を含む訓練画像から特徴マップを抽出するように予め学習された特徴抽出部と、
前記特徴抽出部で抽出された前記特徴マップから前記訓練画像を復元するように予め学習された画像生成部と、
前記入力部に入力された、所定サイズ以上の複数種類の画像サイズである、検査対象である対象物体の検査画像と、前記特徴抽出部および前記画像生成部により復元された該検査画像の復元画像とを比較することで算出した類似度に基づいて、前記対象物体の異常を検出する検出部と、を備える、異常検出システム。
N≧M×(1/2)^a 式(1)
ただし、M、Nは縦または横の画素数、aは前記特徴抽出部の畳み込み層の層数である。
前記学習モデルは、特徴抽出部および画像生成部で構成され、
対象物体の良品画像を含む訓練画像を入力する入力部と、
前記入力部に入力された前記訓練画像に基づいて特徴マップを抽出する前記特徴抽出部と、
前記特徴抽出部で抽出された前記特徴マップから前記訓練画像を復元した復元画像を生成する前記画像生成部と、
前記訓練画像と前記復元画像に基づいて、前記特徴抽出部および前記画像生成部のパラメータを更新する学習部と、
を備え、
前記入力部は、所定サイズ以上の複数種類の画像サイズの前記訓練画像を入力する、学習装置。
N≧M×(1/2)^a 式(1)
ただし、M、Nは縦または横の画素数、aは前記特徴抽出部の畳み込み層の層数である。
以下、図3から図5を参照し、制御部110の学習機能について説明する。図3は、異常検出システム100の学習時における制御部110の機能を示す機能ブロック図であり、図4は、学習時における制御部110の構成例を示す模式図である。図5は、異常検出システム100の学習処理を示すフローチャートである。上述のように異常検出システム100は、学習時には、学習装置として機能する。
N ≥ M × (1/2)^a   Formula (1)
Here, M is the vertical or horizontal size (number of pixels) of the inspection image 350 (or the training image 351), N is the corresponding size of the feature map, and a is the number of convolutional layers of the feature extraction unit 201. Formula (1) is imposed because, before the feature extraction unit 201 downsamples the input captured image, a convolution operation must first be applied to abstract the information. Without it, characteristic information of the non-defective image may be lost during downsampling.
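As an illustrative aside (not part of the patent text; the function names below are ours), the constraint of formula (1) can be checked numerically for a given number of convolutional layers:

```python
def min_feature_map_size(m: int, a: int) -> float:
    """Lower bound on the feature-map side length N imposed by
    formula (1): N >= M * (1/2)**a, where M is the image side length
    and a is the number of convolutional layers in the encoder."""
    return m * (1 / 2) ** a

def satisfies_formula1(m: int, n: int, a: int) -> bool:
    """True if a feature map of side n for an image of side m
    respects formula (1)."""
    return n >= min_feature_map_size(m, a)

# A 1024-pixel image passed through 7 halving stages may shrink to 8x8,
# matching the 8x8 minimum feature-map size discussed in the text.
print(min_feature_map_size(1024, 7))   # 8.0
print(satisfies_formula1(1024, 8, 7))  # True
print(satisfies_formula1(1024, 4, 7))  # False
```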
The input unit 111 acquires a training image group consisting of a plurality of training images 351 from the imaging device 50 via the communication unit 130. Alternatively, the training image group is accumulated in the storage unit 120 in advance, and the input unit 111 acquires it from there. This training image group includes training images 351 of a plurality of different image sizes equal to or larger than a predetermined size. The predetermined size is 512 pixels or more in both height and width, and more preferably 1024 pixels or more. As training images 351 for the learning model 200, processed images obtained by applying various operations in the input unit 111 may also be used in order to increase the number of training samples. Such operations include trimming (cutting out part of a training image 351), rotation, and flipping (mirroring).
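The augmentation operations just mentioned (cropping, rotation, mirroring) can be sketched as follows; this is our illustration rather than code from the patent, and the crop geometry is an arbitrary choice:

```python
import numpy as np

def augment(image: np.ndarray) -> list:
    """Expand one training image into several samples using the
    operations named in the text: cropping (trimming), rotation,
    and flipping (mirroring)."""
    h, w = image.shape[:2]
    samples = [image]
    # trimming: cut out a central region of half the size
    samples.append(image[h // 4:h // 4 + h // 2, w // 4:w // 4 + w // 2])
    # rotations by 90, 180, and 270 degrees
    for k in (1, 2, 3):
        samples.append(np.rot90(image, k))
    # horizontal and vertical mirror images
    samples.append(np.fliplr(image))
    samples.append(np.flipud(image))
    return samples

augmented = augment(np.zeros((512, 512), dtype=np.uint8))
print(len(augmented))      # 7 samples from one image
print(augmented[1].shape)  # (256, 256) central crop
```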
The control unit 110 selects a learning model 200 whose structure differs according to the image size of the training image 351 to be used for training. For example, any of the following (Structure 1) to (Structure 3) can be applied.
(Structure 1) The structural element that differs is the stride, and all kernels (filters) are shared: the larger the image size, the larger the stride. In this case, the other structural elements (number of layers, kernel size, and padding) are the same.
(Structure 2) The structural element that differs is the number of layers, and some kernels are shared. Specifically, the number of convolutional layers (and deconvolutional layers) varies with the image size: when the image size exceeds a predetermined size, more layers are used. For the layers the models have in common, the same kernels are shared; in other words, layers are added before or after the encoder and decoder used for small sizes.
(Structure 3) The structural element that differs is the number of layers, and the kernels are not shared. Specifically, a plurality of learning models that differ in both the number of layers and the kernels are used selectively according to the image size, and each is trained separately as described below.
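As a sketch of (Structure 1) under our own assumptions — the patent states only that a larger image gets a larger stride; the doubling rule and base size below are illustrative, not prescribed:

```python
def stride_for_size(image_size: int, base_size: int = 512, base_stride: int = 1) -> int:
    """Choose a stride that doubles each time the input side length
    doubles past the base size, while kernels, layer count, kernel
    size, and padding stay fixed (Structure 1)."""
    stride = base_stride
    size = base_size
    while size < image_size:
        size *= 2
        stride *= 2
    return stride

print(stride_for_size(512))   # 1
print(stride_for_size(1024))  # 2
print(stride_for_size(2048))  # 4
```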
In step S402, using the selected learning model 200, the training image 351 is input to the feature extraction unit 201, and the image generation unit 202 outputs a restored image 360 via the feature map 355.
In step S403, the learning unit 112 updates the parameters of the learning model 200 (the feature extraction unit 201 and the image generation unit 202) based on the error between the input training image 351 and the output restored image 360. Specifically, the difference between the training image 351 and the restored image 360 is computed, and the parameters of the learning model 200 are updated so as to reduce this error.
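This update step can be illustrated with a deliberately tiny stand-in: a linear encoder and decoder trained by gradient descent to shrink the reconstruction error. This is our sketch only; the dimensions, learning rate, and linear layers are assumptions and not the convolutional model of the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the feature extraction unit (encoder) and the
# image generation unit (decoder): one linear map each.
x = rng.normal(size=16)                      # flattened "training image"
W_enc = rng.normal(scale=0.1, size=(8, 16))  # encoder parameters
W_dec = rng.normal(scale=0.1, size=(16, 8))  # decoder parameters
lr = 0.01                                    # learning rate (assumption)

def reconstruction_error(W_enc, W_dec, x):
    """Squared error between the training image and its restoration."""
    return float(np.sum((W_dec @ (W_enc @ x) - x) ** 2))

initial_error = reconstruction_error(W_enc, W_dec, x)
for _ in range(1000):
    z = W_enc @ x          # feature map
    x_hat = W_dec @ z      # restored image
    err = x_hat - x        # difference used for the update
    # gradient steps that reduce the reconstruction error
    W_dec -= lr * np.outer(err, z)
    W_enc -= lr * np.outer(W_dec.T @ err, x)
final_error = reconstruction_error(W_enc, W_dec, x)

print(final_error < initial_error)  # True: the error decreases
```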
If the predetermined number of training iterations has been completed (YES), for example, if training has finished for all training images 351 in the training image group, the process proceeds to step S406. If not, the process returns to step S402, and training is repeated with the next training image 351.
The control unit 110 stores the learning model 200 generated or updated by this machine learning in the storage unit 120 and ends the training process (END).
Hereinafter, the abnormality detection process performed using the learning model 200 generated by the above training process will be described with reference to FIGS. 6 to 8. FIG. 6 is a functional block diagram of the control unit 110 during abnormality detection by the abnormality detection system 100. FIG. 7 is a schematic diagram showing a configuration example of the control unit 110, and FIG. 8 is a flowchart showing the abnormality detection process.
The input unit 111 acquires a captured image of the inspection target (inspection image 350) from the imaging device 50 or the like. The image size of this inspection image 350 is one of a plurality of image sizes equal to or larger than a predetermined size. The predetermined size is 512 pixels or more in both height and width, and more preferably 1024 pixels or more.
The control unit 110 changes the structure of the learning model 200 according to the image size of the inspection image 350, switching, for example, to a learning model with any of the structures (Structure 1) to (Structure 3) described above. That is, a learning model 200 with a different stride (Structure 1) or a different number of layers (Structures 2 and 3) is read from the storage unit 120 and used.
Using the learning model 200 after the structure change, the inspection image 350 is input to the feature extraction unit 201, and the image generation unit 202 outputs a restored image 360 via the feature map 355.
The calculation unit 115 calculates the similarity between the restored image 360 obtained in step S503 and the inspection image 350 from which it was generated. The similarity is output as a score.
The detection unit 116 detects an abnormality in the inspection image, that is, an abnormality of the target object that is the subject of the inspection image, based on the similarity obtained in step S504, and outputs the determination result.
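A minimal sketch of the similarity scoring and thresholding steps follows. The patent does not specify the similarity measure; the negated mean squared error and the threshold value below are our assumptions for illustration.

```python
import numpy as np

def similarity_score(inspection: np.ndarray, restored: np.ndarray) -> float:
    """Similarity between the inspection image and its restoration,
    here taken as negated mean squared error (an illustrative choice):
    a perfect reconstruction scores 0, worse ones score lower."""
    diff = inspection.astype(float) - restored.astype(float)
    return -float(np.mean(diff ** 2))

def is_abnormal(inspection, restored, threshold: float = -25.0) -> bool:
    """Flag an abnormality when the similarity score falls below a
    threshold (the threshold value is an assumption)."""
    return similarity_score(inspection, restored) < threshold

good = np.full((8, 8), 100.0)
print(is_abnormal(good, good))                   # False: perfect reconstruction
print(is_abnormal(good, np.full((8, 8), 90.0)))  # True: large reconstruction error
```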
In this embodiment, abnormalities can be detected with constant accuracy independent of the input image size for the following reasons. First, even when the feature map 355 is generated by the feature extraction unit 201 serving as the encoder, the spatial information of the image is retained rather than converted into vector information. Second, by making the size of the feature map 355 at least 8×8, the influence of padding can be suppressed. FIG. 9 is a schematic diagram explaining the relationship between feature map size and restoration accuracy. The outer region of the feature map (shaded) is region A, which is affected by padding (padding = 1), and the inner region is region B, which is not affected (or is little affected) by padding. Region B can use spatial information for reconstruction (restoration) as intended. Region A, by contrast, is produced by kernel operations that are already incomplete because of padding, and further incomplete kernel operations are compounded on it in subsequent decoding. For example, as illustrated, when convolving with a 3×3 kernel, the rightmost pixel is computed from region a1 (one pixel) unaffected by padding, region a2 (three pixels) affected by padding, and region a3 (five pixels added by padding). In region A, incompleteness is high because regions a2 and a3 account for many of the pixels used in the calculation.
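The effect described above can be quantified with a small calculation (ours, for illustration): the fraction of feature-map pixels lying in the padding-affected border band A shrinks as the map grows, which is one way to see why an 8×8 minimum feature-map size is useful.

```python
def padding_affected_fraction(n: int, pad: int = 1) -> float:
    """Fraction of an n x n feature map occupied by region A, the
    border band of width `pad` whose values are computed with
    padded (incomplete) kernel windows."""
    if n <= 2 * pad:
        return 1.0  # the whole map is padding-affected
    inner = (n - 2 * pad) ** 2  # region B, unaffected by padding
    return 1 - inner / (n * n)

print(padding_affected_fraction(4))   # 0.75
print(padding_affected_fraction(8))   # 0.4375
print(padding_affected_fraction(16))  # 0.234375
```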
110 control unit
111 input unit
112 learning unit
115 calculation unit
116 detection unit
200 learning model
201 feature extraction unit
202 image generation unit
120 storage unit
130 communication unit
140 operation display unit
50 imaging device
350 inspection image
351 training image
355 feature map
360 restored image
Claims (16)
- An abnormality detection system for detecting defects in the appearance of an object, the system comprising:
an input unit that inputs inspection images of a target object in a plurality of image sizes equal to or larger than a predetermined size;
a feature extraction unit trained in advance to extract a feature map from training images including non-defective images of the target object;
an image generation unit trained in advance to restore the training images from the feature map extracted by the feature extraction unit; and
a detection unit that detects an abnormality of the target object based on a similarity calculated by comparing the inspection image of the target object under inspection, input to the input unit in one of the plurality of image sizes equal to or larger than the predetermined size, with a restored image of the inspection image reconstructed by the feature extraction unit and the image generation unit.
- The abnormality detection system according to claim 1, wherein the detection unit is configured to perform detection with at least a certain accuracy regardless of the image size input to the input unit.
- The abnormality detection system according to claim 1 or 2, wherein the size of the feature map extracted by the feature extraction unit is 8×8 or larger.
- The abnormality detection system according to claim 3, wherein the feature extraction unit extracts a feature map satisfying the following formula (1), where M is the size of the inspection image and N is the size of the feature map:
N ≥ M × (1/2)^a   Formula (1)
where M and N are the numbers of vertical or horizontal pixels, and a is the number of convolutional layers of the feature extraction unit.
- The abnormality detection system according to claim 3 or 4, wherein the size of the feature map extracted by the feature extraction unit is proportional to the size of the inspection image input to the input unit.
- The abnormality detection system according to any one of claims 1 to 5, wherein the feature extraction unit extracts the feature map without losing the spatial information of the image.
- The abnormality detection system according to claim 6, wherein the feature extraction unit includes neither a fully connected layer nor a GAP (Global Average Pooling) layer.
- The abnormality detection system according to any one of claims 1 to 7, wherein the feature extraction unit and the image generation unit change their structure according to the size of the input inspection image.
- The abnormality detection system according to any one of claims 1 to 8, wherein the inspection image is an image of an electronic circuit.
- A learning apparatus that trains a learning model for abnormality detection that detects defects in the appearance of an object, wherein
the learning model comprises a feature extraction unit and an image generation unit, and the learning apparatus comprises:
an input unit that inputs training images including non-defective images of a target object;
the feature extraction unit, which extracts a feature map based on the training images input to the input unit;
the image generation unit, which generates restored images by restoring the training images from the feature map extracted by the feature extraction unit; and
a learning unit that updates the parameters of the feature extraction unit and the image generation unit based on the training images and the restored images,
wherein the input unit inputs the training images in a plurality of image sizes equal to or larger than a predetermined size.
- The learning apparatus according to claim 10, wherein the size of the feature map extracted by the feature extraction unit is 8×8 or larger.
- The learning apparatus according to claim 11, wherein the feature extraction unit extracts a feature map satisfying the following formula (1), where M is the size of the training image and N is the size of the feature map:
N ≥ M × (1/2)^a   Formula (1)
where M and N are the numbers of vertical or horizontal pixels, and a is the number of convolutional layers of the feature extraction unit.
- The learning apparatus according to any one of claims 10 to 12, wherein the feature extraction unit extracts the feature map without losing the spatial information of the image.
- The learning apparatus according to claim 13, wherein the feature extraction unit includes neither a fully connected layer nor a GAP (Global Average Pooling) layer.
- An abnormality detection program for causing a computer to function as the abnormality detection system according to any one of claims 1 to 9.
- A learning program for causing a computer to function as the learning apparatus according to any one of claims 10 to 14.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2022571941A JPWO2022137841A1 (ja) | 2020-12-25 | 2021-11-08 | |
US18/037,817 US20230410285A1 (en) | 2020-12-25 | 2021-11-08 | Abnormality detection system, learning apparatus, abnormality detection program, and learning program |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2020-216488 | 2020-12-25 | ||
JP2020216488 | 2020-12-25 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022137841A1 (ja) | 2022-06-30 |
Family
ID=82157533
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2021/040920 WO2022137841A1 (ja) | 2020-12-25 | 2021-11-08 | 異常検出システム、学習装置、異常検出プログラム、および学習プログラム |
Country Status (3)
Country | Link |
---|---|
US (1) | US20230410285A1 (ja) |
JP (1) | JPWO2022137841A1 (ja) |
WO (1) | WO2022137841A1 (ja) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2017097718A (ja) * | 2015-11-26 | 2017-06-01 | Ricoh Co., Ltd. | Identification processing apparatus, identification system, identification processing method, and program |
CN109615604A (zh) * | 2018-10-30 | 2019-04-12 | Institute of Automation, Chinese Academy of Sciences | Part appearance defect detection method based on an image-reconstruction convolutional neural network |
JP2019069145A (ja) * | 2017-10-06 | 2019-05-09 | Canon Medical Systems Corporation | Medical image processing apparatus and medical image processing system |
JP2019522897A (ja) * | 2016-05-25 | 2019-08-15 | KLA-Tencor Corporation | Generation of simulated images from input images for semiconductor applications |
JP2020181532A (ja) * | 2019-04-26 | 2020-11-05 | Fujitsu Limited | Image determination apparatus and image determination method |
- 2021
- 2021-11-08 JP JP2022571941A patent/JPWO2022137841A1/ja active Pending
- 2021-11-08 US US18/037,817 patent/US20230410285A1/en active Pending
- 2021-11-08 WO PCT/JP2021/040920 patent/WO2022137841A1/ja active Application Filing
Also Published As
Publication number | Publication date |
---|---|
JPWO2022137841A1 (ja) | 2022-06-30 |
US20230410285A1 (en) | 2023-12-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP7426011B2 (ja) | System, storage medium, and method for error detection and correction | |
KR102150673B1 (ko) | Appearance defect inspection method and appearance defect inspection system | |
JP7059883B2 (ja) | Learning apparatus, image generation apparatus, learning method, and learning program | |
CN110619618A (zh) | Surface defect detection method and apparatus, and electronic device | |
CN109671078B (zh) | Method and apparatus for detecting abnormalities in product surface images | |
JP7435303B2 (ja) | Inspection apparatus, unit selection apparatus, inspection method, and inspection program | |
JP2015041164A (ja) | Image processing apparatus, image processing method, and program | |
WO2021181749A1 (ja) | Learning apparatus, image inspection apparatus, trained parameters, learning method, and image inspection method | |
JP2020181532A (ja) | Image determination apparatus and image determination method | |
CN112767394A (zh) | Image detection method, apparatus, and device | |
US20240005477A1 (en) | Index selection device, information processing device, information processing system, inspection device, inspection system, index selection method, and index selection program | |
JP7459697B2 (ja) | Abnormality detection system, learning apparatus, abnormality detection program, learning program, abnormality detection method, and learning method | |
CN111986103A (zh) | Image processing method and apparatus, electronic device, and computer storage medium | |
JP7059889B2 (ja) | Learning apparatus, image generation apparatus, learning method, and learning program | |
WO2022137841A1 (ja) | Abnormality detection system, learning apparatus, abnormality detection program, and learning program | |
US11120541B2 (en) | Determination device and determining method thereof | |
JP2022029262A (ja) | Image processing apparatus, image processing method, image processing program, and learning apparatus | |
US7646892B2 (en) | Image inspecting apparatus, image inspecting method, control program and computer-readable storage medium | |
JP6904062B2 (ja) | Information processing apparatus, information processing method, and program | |
WO2016092783A1 (en) | Information processing apparatus, method for processing information, discriminator generating apparatus, method for generating discriminator, and program | |
CN116342395A (zh) | Image restoration method and apparatus, electronic device, and medium | |
JP7459696B2 (ja) | Abnormality detection system, learning apparatus, abnormality detection program, learning program, abnormality detection method, and learning method | |
JP7070308B2 (ja) | Estimator generation apparatus, inspection apparatus, estimator generation method, and estimator generation program | |
US20240273708A1 (en) | Abnormality determination computer and abnormality determination method | |
CN113658112B (zh) | Pantograph-catenary abnormality detection method based on template matching and a neural network algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21909995 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2022571941 Country of ref document: JP Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 18037817 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 21909995 Country of ref document: EP Kind code of ref document: A1 |