CN114022475A - Image anomaly detection and anomaly positioning method and system based on self-supervision mask - Google Patents
- Publication number
- CN114022475A (application CN202111397389.4A)
- Authority
- CN
- China
- Prior art keywords
- image
- mask
- reconstructed
- updating
- anomaly
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Abstract
The invention provides an image anomaly detection and anomaly localization method and system based on a self-supervised mask, relating to the technical field of computer vision and image processing. The method comprises the steps of mask random generation, mask initialization, image feature extraction, image reconstruction, reconstructed image alignment, mask updating, mask updating termination decision, and anomaly evaluation. By introducing self-supervised mask training, the invention improves the anomaly localization capability of the anomaly detection algorithm and thereby achieves better performance on both anomaly detection and anomaly localization tasks.
Description
Technical Field
The invention relates to the technical field of computer vision and image processing, and in particular to an unsupervised image anomaly detection and anomaly localization method and system based on a self-supervised mask.
Background
Currently, deep learning techniques based on deep neural networks have achieved significant success in object classification tasks, and such data-driven approaches typically require large amounts of labeled data for training. In anomaly detection, however, the kinds of anomalies cannot be exhaustively enumerated, so collecting enough anomalous data for model training is too costly. Anomaly detection tasks therefore usually provide only normal data for model training, and require the method to detect anomalous data without having been trained on any.
Image anomaly detection solutions based on image reconstruction train an image reconstruction model on normal data only, on the assumption that the model cannot reconstruct anomalous data well. In the anomaly detection stage, the model's limited ability to reconstruct anomalous data leads to larger reconstruction errors, so the reconstruction error can be used as the detection index. In practical applications such as medical diagnosis and industrial defect detection, however, an anomaly often appears in only a small portion of the image pixels, and the above approach can only judge whether the whole image contains an anomaly; it cannot accurately locate the anomalous region. Anomaly localization is very important for improving both the detection performance and the interpretability of an anomaly detection algorithm, yet existing anomaly detection algorithms often ignore it.
The invention patent with publication number CN110866908B discloses an image processing method, apparatus, server and storage medium, comprising: acquiring an image to be detected, and performing down-sampling anomaly classification on it to obtain an anomaly class prediction label and a target feature map; performing preliminary anomaly localization based on the anomaly class prediction label and the target feature map to obtain an initial localization image corresponding to the image to be detected; performing up-sampling anomaly localization on the initial localization image to obtain a target localization image corresponding to the image to be detected; and outputting the anomaly class prediction label and the target localization image.
Disclosure of Invention
In view of the defects in the prior art, the invention provides an image anomaly detection and anomaly localization method and system based on a self-supervised mask.
The scheme of the image anomaly detection and anomaly localization method and system based on the self-supervised mask is as follows:
In a first aspect, an image anomaly detection and anomaly localization method based on a self-supervised mask is provided, the method comprising:
a mask random generation step: randomly generating a mask of the same size as the model training image, and applying the mask to the image to remove the information in part of the image;
a mask initialization step: generating multi-scale initialization masks for the image to be tested, and applying each mask to the image to be tested to generate multi-scale images to be tested with part of the image information removed;
an image feature extraction step: extracting high-dimensional features with a deep convolutional neural network from the image obtained in the mask random generation step or the mask initialization step;
an image reconstruction step: reconstructing an image from the high-dimensional features with a deep convolutional neural network to obtain a reconstructed training image or a reconstructed test image;
a reconstructed image alignment step: realizing self-supervised learning with an image reconstruction loss function computed between the reconstructed training image and the model training image;
a mask updating step: updating the multi-scale masks with a mask updating algorithm according to the reconstructed test image, so that the masks concentrate on the anomalous part of the image;
a mask updating termination decision step: judging whether the updated mask obtained in the mask updating step is identical to the mask before updating; if so, entering the anomaly evaluation step; if not, applying the updated mask to the image to be tested and entering the image feature extraction step again;
an anomaly evaluation step: performing image anomaly evaluation with an anomaly evaluation function according to the result of the mask updating termination decision step.
Preferably, the mask random generation step includes:
taking images for model training as input, and decomposing each input image into (H/k) × (W/k) grids, where H and W are the height and width of the image and k controls the size of the grid;
each grid is a square of k × k pixels and serves as the basic unit of the mask;
each grid is randomly selected to be masked or retained, and the resulting mask matrix is denoted M.
Preferably, the mask initialization step includes: given an image to be tested, a set of multi-scale masks is used as initialization; the set consists of eight checkerboard-like matrices at different scales, where the grid size k belongs to K;
for each grid size k, the set includes a pair of complementary masks that together cover all pixels in the image.
Preferably, the image feature extraction step includes: taking the image obtained in the mask random generation step or the mask initialization step as input, and extracting high-dimensional feature information of the image with a deep convolutional neural network, where the image feature extraction network consists of multiple layers of convolution and down-sampling operations.
Preferably, the image reconstruction step includes: taking the high-dimensional feature information obtained in the image feature extraction step as input, and reconstructing the image with a deep convolutional neural network model to obtain a reconstructed image, where the image recovery network consists of multiple layers of convolution and up-sampling operations;
if the input is a model training image, a reconstructed training image is output; if the input is an image to be tested, a reconstructed test image is output.
Preferably, the reconstructed image alignment step specifically includes:
for the model training image, comparing the reconstructed training image obtained in the image reconstruction step with the model training image, and calculating the following loss functions respectively:
(1) mean square loss function:
(2) gradient magnitude similarity loss function:
wherein 1 represents an all-ones matrix;
I_c and its reconstructed counterpart denote the c-th color channel of the model training image and of the reconstructed training image, respectively;
GMS denotes the gradient magnitude similarity function of the model training image and the reconstructed training image;
i, j represents two-dimensional coordinates of the image;
and the gradient magnitude similarity loss matrix for channel c:
wherein a represents a constant;
h_x and h_y are Prewitt filters in the x and y dimensions;
(3) structural similarity index (SSIM) loss function:
wherein SSIM denotes the structural similarity index function centered on the two-dimensional image coordinates (i, j).
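The gradient magnitude similarity term can be illustrated with a minimal NumPy sketch. The formulas of this patent are not reproduced in the text above, so the sketch assumes the standard gradient magnitude similarity definition from the image-quality literature: the gradient magnitude is the root of the squared Prewitt responses, and the similarity is (2·g1·g2 + a) / (g1² + g2² + a). The function names, the edge padding, the default value of the constant a, and the averaging of 1 − GMS over pixels and channels are all assumptions made for illustration.

```python
import numpy as np

# 3x3 Prewitt filters for the x and y dimensions (h_x and h_y in the text).
PREWITT_X = np.array([[1, 0, -1], [1, 0, -1], [1, 0, -1]], dtype=np.float64) / 3.0
PREWITT_Y = PREWITT_X.T


def filter3x3(channel, kernel):
    """Correlate a 2-D array with a 3x3 kernel, edge-padded to keep the size."""
    padded = np.pad(channel, 1, mode="edge")
    h, w = channel.shape
    out = np.zeros((h, w), dtype=np.float64)
    for di in range(3):
        for dj in range(3):
            out += kernel[di, dj] * padded[di:di + h, dj:dj + w]
    return out


def gradient_magnitude(channel):
    """Gradient magnitude of one color channel from the two Prewitt responses."""
    gx = filter3x3(channel, PREWITT_X)
    gy = filter3x3(channel, PREWITT_Y)
    return np.sqrt(gx ** 2 + gy ** 2)


def gms_map(channel, recon_channel, a=0.0026):
    """Per-pixel gradient magnitude similarity; `a` is a placeholder constant."""
    g1 = gradient_magnitude(channel)
    g2 = gradient_magnitude(recon_channel)
    return (2.0 * g1 * g2 + a) / (g1 ** 2 + g2 ** 2 + a)


def gms_loss(image, recon, a=0.0026):
    """Mean of (1 - GMS) over pixels and channels; inputs have shape (H, W, C)."""
    maps = [gms_map(image[..., c], recon[..., c], a) for c in range(image.shape[-1])]
    return float(np.mean(1.0 - np.stack(maps)))
```

With this definition the loss is zero when the reconstruction equals the input and grows as the edge structure of the two images diverges.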
Preferably, the mask updating step includes:
in each iterative update, regions with small reconstruction error are regarded as normal regions and are removed from the mask in the next iteration, so that the mask is updated according to the reconstruction error;
given the grid size k, the image is divided into grids of k × k pixels, and the mask is updated in units of these grids, which makes the algorithm more stable and reduces the number of iterative updates;
for each grid, the average reconstruction error is calculated and compared with a threshold, and only the grids whose reconstruction error is higher than the threshold are retained in the mask.
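The grid-wise update described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented algorithm itself: the mask is taken to be a binary array in which 1 marks a covered pixel, the per-pixel reconstruction error map is assumed to be given, and the function name `update_mask` and the fixed scalar `threshold` are hypothetical (the text does not say how the threshold is chosen).

```python
import numpy as np


def update_mask(mask, error_map, k, threshold):
    """Shrink a binary mask (1 = covered) to grid cells with high mean error.

    mask and error_map both have shape (H, W), with H and W divisible by k.
    """
    h, w = mask.shape
    if h % k or w % k:
        raise ValueError("mask height and width must be divisible by k")
    # Average the reconstruction error inside every k x k grid cell.
    cell_error = error_map.reshape(h // k, k, w // k, k).mean(axis=(1, 3))
    # Cells whose mean error exceeds the threshold are candidates to keep.
    keep_cells = (cell_error > threshold).astype(mask.dtype)
    # Expand each cell decision back to a k x k block of pixels.
    keep_pixels = np.kron(keep_cells, np.ones((k, k), dtype=mask.dtype))
    # A pixel stays covered only if it was covered before the update.
    return mask * keep_pixels
```

Because the update only ever removes cells, the covered area is non-increasing from one iteration to the next, which is what allows a simple "mask unchanged" test to serve as the stopping rule.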
Preferably, the mask update termination decision step includes:
when most of the area covered by the mask is an anomalous area, mask updating is stopped and the final mask is obtained;
after the process is finished, the mask is expected to cover only the anomalous part of the image, and the final mask and the reconstructed image are used as the input of the anomaly evaluation step;
if the mask keeps changing in the mask updating step, the anomaly evaluation step is not entered; instead, the image feature extraction step is re-entered until the mask output by the mask updating step remains unchanged.
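The termination decision amounts to iterating until the mask reaches a fixed point. The loop below is a hedged sketch of that control flow. The callables `reconstruct` (standing in for feature extraction plus image reconstruction) and `update` (standing in for the mask updating step) are hypothetical placeholders, the squared-error map is an assumed choice of reconstruction error, and `max_iters` is a safety cap that the text does not mention.

```python
import numpy as np


def refine_mask(image, mask, reconstruct, update, max_iters=50):
    """Iterate reconstruction and mask update until the mask stops changing.

    image: (H, W, C) array. mask: (H, W) binary array, 1 = covered pixel.
    reconstruct: callable mapping a masked image to a reconstructed image.
    update: callable mapping (mask, error_map) to an updated mask.
    """
    recon = None
    for _ in range(max_iters):
        # Hide the covered pixels before feeding the image to the network.
        masked = image * (1.0 - mask)[..., None]
        recon = reconstruct(masked)
        # Per-pixel squared reconstruction error, averaged over channels.
        error_map = ((recon - image) ** 2).mean(axis=-1)
        new_mask = update(mask, error_map)
        if np.array_equal(new_mask, mask):
            # Mask unchanged: hand the final mask and reconstruction onward.
            return new_mask, recon
        mask = new_mask
    return mask, recon
```

The returned pair corresponds to the final mask and the reconstructed image that are passed on for evaluation.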
Preferably, the anomaly evaluation step includes:
for the image to be tested, comparing the final reconstructed image obtained in the mask updating termination decision step with the image to be tested, and calculating the following anomaly evaluation function:
In a second aspect, an image anomaly detection and anomaly localization system based on a self-supervised mask is provided, the system comprising:
a mask random generation module: randomly generating a mask of the same size as the model training image, and applying the mask to the image to remove the information in part of the image;
a mask initialization module: generating multi-scale initialization masks for the image to be tested, and applying each mask to the image to be tested to generate multi-scale images to be tested with part of the image information removed;
an image feature extraction module: extracting high-dimensional features with a deep convolutional neural network from the image obtained by the mask random generation module or the mask initialization module;
an image reconstruction module: reconstructing an image from the high-dimensional features with a deep convolutional neural network to obtain a reconstructed training image or a reconstructed test image;
a reconstructed image alignment module: realizing self-supervised learning with an image reconstruction loss function computed between the reconstructed training image and the model training image;
a mask updating module: updating the multi-scale masks with a mask updating algorithm according to the reconstructed test image, so that the masks concentrate on the anomalous part of the image;
a mask updating termination decision module: judging whether the updated mask obtained by the mask updating module is identical to the mask before updating; if so, entering the anomaly evaluation module; if not, applying the updated mask to the image to be tested and entering the image feature extraction module again;
an anomaly evaluation module: performing image anomaly evaluation with an anomaly evaluation function according to the result of the mask updating termination decision module.
Compared with the prior art, the invention has the following beneficial effects:
1. through self-supervised mask training, the invention extends the image reconstruction task used for image anomaly detection to both image anomaly detection and anomaly localization;
2. in practical applications such as medical diagnosis and industrial defect detection, anomalies often appear in only a small portion of the image pixels, and anomaly detection methods based on the image reconstruction task cannot accurately locate the anomalous region; by introducing self-supervised mask training, the invention improves the anomaly localization capability and the interpretability of the anomaly detection algorithm, and thereby achieves better performance on anomaly detection and anomaly localization tasks.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of the system in the embodiment.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit the invention in any way. It should be noted that those skilled in the art can make various changes and modifications without departing from the spirit of the invention, all of which fall within the scope of the present invention.
The embodiment of the invention provides an image anomaly detection and anomaly localization method based on a self-supervised mask. As shown in FIG. 1, the method specifically comprises the following steps:
A mask random generation step: randomly generating a mask of the same size as the model training image, and applying the mask to the image to remove the information in part of the image.
In this step, images for model training are used as input, and each input image is decomposed into (H/k) × (W/k) grids, where H and W are the height and width of the image and k controls the size of the grid. Each grid is a square of k × k pixels and serves as the basic unit of the mask. The grid size k is sampled from a set K = {k_i}, i = 1, ..., N_k, where N_k is the cardinality of the set and k_i denotes the i-th grid size; this implementation uses K = {4, 8, 16, 32}, because it covers a wide range of anomaly scales. To expand the mask exploration space, a random mask is dynamically generated for each image during each training phase. Each grid is then randomly selected to be masked or retained, and the resulting mask matrix is denoted M. In this way, a set of random masks of different sizes and shapes can be generated. With this way of generating random masks, each image is augmented into a set of different training triples, each consisting of the input image I, the resulting mask M, and the generated model input image, which is obtained by an element-wise product in the spatial domain (the mask is copied along the channel dimension).
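The random mask generation and the resulting training triple can be sketched in a few lines of NumPy. This is a minimal illustration under assumptions: the mask stores 1 for a covered (removed) pixel, so the model input is the element-wise product of the image with the complement of the mask (an implementation that stores 1 for retained pixels would multiply by the mask directly); the masking probability `mask_prob` and the function names are hypothetical and not taken from the patent.

```python
import numpy as np


def random_grid_mask(height, width, k, mask_prob=0.5, rng=None):
    """Return an (height, width) binary mask built from k x k grid cells.

    1 marks a covered (removed) pixel, 0 a retained pixel.
    """
    if height % k or width % k:
        raise ValueError("height and width must be divisible by k")
    rng = np.random.default_rng() if rng is None else rng
    # One independent mask/retain decision per grid cell.
    cells = (rng.random((height // k, width // k)) < mask_prob).astype(np.float32)
    # Expand each cell decision to a k x k block of pixels.
    return np.kron(cells, np.ones((k, k), dtype=np.float32))


def make_training_triple(image, scales=(4, 8, 16, 32), rng=None):
    """Build one (image, mask, model input) triple for an (H, W, C) image."""
    rng = np.random.default_rng() if rng is None else rng
    # Sample the grid size k from the set K.
    k = int(rng.choice(scales))
    mask = random_grid_mask(image.shape[0], image.shape[1], k, rng=rng)
    # Copy the mask along the channel dimension and remove covered pixels.
    model_input = image * (1.0 - mask)[..., None]
    return image, mask, model_input
```

Calling `make_training_triple` anew for every image in every training pass yields a fresh mask each time, which is one way to realize the dynamic generation described above.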
A mask initialization step: generating multi-scale initialization masks for the image to be tested, and applying each mask to the image to be tested to generate multi-scale images to be tested with part of the image information removed.
Specifically, given an image to be tested, a set of multi-scale masks is used as initialization. The set consists of eight checkerboard-like matrices at different scales, where the grid size k belongs to K. For each grid size k, the set includes a pair of complementary masks that together cover all pixels in the image, which avoids missing any possible anomalous area.
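The eight initialization masks follow directly from four grid sizes with two complementary checkerboards each. The sketch below assumes K = {4, 8, 16, 32} as in the training stage and uses 1 for a covered pixel; the function names are hypothetical.

```python
import numpy as np


def checkerboard_pair(height, width, k):
    """Return two complementary checkerboard masks with k x k pixel cells."""
    rows = np.arange(height) // k  # grid row index of every pixel row
    cols = np.arange(width) // k   # grid column index of every pixel column
    # Cells whose row and column indices sum to an odd number are covered.
    board = ((rows[:, None] + cols[None, :]) % 2).astype(np.float32)
    return board, 1.0 - board


def init_multiscale_masks(height, width, scales=(4, 8, 16, 32)):
    """Build the full initialization set: one complementary pair per scale."""
    masks = []
    for k in scales:
        masks.extend(checkerboard_pair(height, width, k))
    return masks
```

Since the two masks of each pair sum to an all-ones matrix, every pixel is covered by exactly one mask at every scale.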
An image feature extraction step: for the image obtained in the mask random generation step or the mask initialization step, extracting the high-dimensional features of the image with a deep convolutional neural network, where the image feature extraction network consists of multiple layers of convolution and down-sampling operations.
An image reconstruction step: reconstructing an image from the high-dimensional features with a deep convolutional neural network to obtain a reconstructed training image or a reconstructed test image.
Specifically, the high-dimensional feature information obtained in the image feature extraction step is taken as input, and the image is reconstructed with a deep convolutional neural network model to obtain a reconstructed image, where the image recovery network consists of multiple layers of convolution and up-sampling operations. If the input is a model training image, a reconstructed training image is output; if the input is an image to be tested, a reconstructed test image is output.
A reconstructed image alignment step: realizing self-supervised learning with an image reconstruction loss function computed between the reconstructed training image and the model training image.
Specifically, for the model training image, the reconstructed training image obtained in the image reconstruction step is compared with the model training image, and the following loss functions are calculated respectively:
(1) mean square loss function:
(2) gradient magnitude similarity loss function:
wherein 1 represents an all-ones matrix;
I_c and its reconstructed counterpart denote the c-th color channel of the model training image and of the reconstructed training image, respectively;
GMS denotes the gradient magnitude similarity function of the model training image and the reconstructed training image;
i, j represents two-dimensional coordinates of the image;
and the gradient magnitude similarity loss matrix for channel c:
wherein a represents a constant;
h_x and h_y are Prewitt filters in the x and y dimensions.
(3) Structural similarity index (SSIM) loss function:
wherein SSIM denotes the structural similarity index function centered on the two-dimensional image coordinates (i, j).
A mask updating step: updating the multi-scale masks with a mask updating algorithm according to the reconstructed test image, so that the masks concentrate on the anomalous part of the image.
The purpose of the mask update is to remove from the mask the regions that likely correspond to normal regions of the image, so that the image reconstruction network focuses on the remaining anomalous regions. In each iterative update, regions with small reconstruction error are regarded as normal regions and are removed from the mask in the next iteration, so that the mask is updated according to the reconstruction error. Given the grid size k, the image is divided into grids of k × k pixels, and the mask is updated in units of these grids, which makes the algorithm more stable and reduces the number of iterative updates. Then, for each grid, the average reconstruction error is calculated and compared with a threshold, and only the grids whose reconstruction error is higher than the threshold are retained in the mask.
A mask updating termination decision step: judging whether the updated mask obtained in the mask updating step is identical to the mask before updating; if so, entering the anomaly evaluation step; if not, applying the updated mask to the image to be tested and entering the image feature extraction step again.
Specifically, when most of the area covered by the mask is an anomalous area, providing more image information cannot significantly reduce the reconstruction error of the anomalous area. In this case, the overall reconstruction error is not significantly reduced and the corresponding mask remains unchanged. Mask updating is then terminated and the final mask is obtained. When the procedure finishes, the mask is expected to cover only the anomalous part of the image, and the final mask and the reconstructed image are taken as input to the anomaly evaluation step. If the mask keeps changing in the mask updating step, the anomaly evaluation step is not entered; instead, the image feature extraction step is re-entered until the mask output by the mask updating step remains unchanged.
An anomaly evaluation step: performing image anomaly evaluation with an anomaly evaluation function according to the result of the mask updating termination decision step.
Specifically, for the image to be tested, the final reconstructed image obtained in the mask updating termination decision step is compared with the image to be tested, and the following anomaly evaluation function is calculated:
Next, the present invention will be described in more detail.
The invention provides an image anomaly detection and anomaly localization method based on a self-supervised mask; FIG. 1 is a flow chart of an embodiment of the method. For an image used for model training, the method randomly generates a mask of the same size as the image and applies it to the image to remove part of the image information. For an image to be tested, it generates multi-scale initialization masks and applies each of them to the image to generate multi-scale images to be tested with part of the image information removed. A deep convolutional neural network extracts high-dimensional features from the input image, and a deep convolutional neural network then reconstructs an image from these features to obtain a reconstructed training image or a reconstructed test image; an image reconstruction loss function realizes self-supervised learning. According to the reconstructed test image, a mask updating algorithm updates the multi-scale masks so that they concentrate on the anomalous part of the image; the mask updating termination decision is made by judging whether the updated mask is identical to the mask before updating; and an anomaly evaluation function performs the anomaly evaluation of the image.
Through self-supervised mask training, the invention extends the image reconstruction task used for image anomaly detection to both image anomaly detection and anomaly localization. In practical applications such as medical diagnosis and industrial defect detection, anomalies often appear in only a small portion of the image pixels, and anomaly detection methods based on the image reconstruction task cannot accurately locate the anomalous region. By introducing self-supervised mask training, the invention improves the anomaly localization capability and the interpretability of the anomaly detection algorithm, and thereby achieves better performance on anomaly detection and anomaly localization tasks.
Specifically, with reference to FIG. 1, the method comprises the following steps:
a mask random generation step: randomly generating a mask of the same size as the image used for model training, and applying the mask to the image to remove the information in part of the image;
a mask initialization step: generating multi-scale initialization masks for the image to be tested, and applying each mask to the image to be tested to generate multi-scale images to be tested with part of the image information removed;
an image feature extraction step: extracting high-dimensional features with a deep convolutional neural network from the image obtained in the mask random generation step or the mask initialization step;
an image reconstruction step: reconstructing an image with a deep convolutional neural network from the high-dimensional features obtained in the image feature extraction step to obtain a reconstructed training image or a reconstructed test image;
a reconstructed image alignment step: realizing self-supervised learning with an image reconstruction loss function computed between the reconstructed training image obtained in the image reconstruction step and the model training image;
a mask updating step: updating the multi-scale masks with a mask updating algorithm according to the reconstructed test image obtained in the image reconstruction step, so that the masks concentrate on the anomalous part of the image;
a mask updating termination decision step: judging whether the updated mask obtained in the mask updating step is identical to the mask before updating; if so, entering the anomaly evaluation step; if not, applying the updated mask to the image to be tested and entering the image feature extraction step again;
an anomaly evaluation step: performing image anomaly evaluation with an anomaly evaluation function according to the result of the mask updating termination decision step.
In the embodiment of the invention, in the mask random generation step, a mask of the same size as the image is randomly generated for each image used for model training, and the mask is applied to the image to remove the information in part of the image.
In the mask initialization step, multi-scale initialization masks are generated for the image to be tested, and each initialization mask is applied to the image to be tested to generate multi-scale images to be tested with part of the image information removed.
In the image feature extraction step, a deep convolutional neural network extracts the high-dimensional features of the image obtained in the mask random generation step or the mask initialization step.
In the image reconstruction step, a deep convolutional neural network reconstructs an image from the high-dimensional features obtained in the image feature extraction step to obtain a reconstructed training image or a reconstructed test image.
In the reconstructed image alignment step, self-supervised learning is realized with an image reconstruction loss function computed between the reconstructed training image obtained in the image reconstruction step and the model training image.
In the mask updating step, the multi-scale masks are updated with a mask updating algorithm according to the reconstructed test image obtained in the image reconstruction step, so that the masks concentrate on the anomalous part of the image.
In the mask updating termination decision step, it is judged whether the updated mask obtained in the mask updating step is identical to the mask before updating; if so, the anomaly evaluation step is entered; if not, the updated mask is applied to the image to be tested and the image feature extraction step is entered again.
In the anomaly evaluation step, image anomaly evaluation is performed with an anomaly evaluation function according to the result of the mask updating termination decision step.
The invention also provides an image anomaly detection and anomaly positioning system based on the self-supervision mask, which specifically comprises the following modules:
a mask random generation module: randomly generating a mask with the size consistent with that of the model training image, and applying the mask on the image to remove information of a partial region of the image;
a mask initialization module: generating multi-scale initialization masks for the image to be tested, and respectively applying the masks to the image to be tested to generate multi-scale images to be tested with partial image area information removed;
an image feature extraction module: extracting high-dimensional features of the image by using a deep convolutional neural network for the image obtained by the mask random generation module or the mask initialization module;
an image reconstruction module: carrying out image reconstruction on the high-dimensional features of the image by using a deep convolutional neural network to obtain a reconstructed training image or a reconstructed test image;
a reconstructed image alignment module: according to the reconstructed training image and the model training image, using an image reconstruction loss function to realize self-supervised learning;
a mask updating module: updating the multi-scale mask according to the reconstructed test image by using a mask updating algorithm, so that the mask is more concentrated on the abnormal part of the image;
a mask update termination decision module: judging, according to the updated mask obtained by the mask updating module, whether the mask is consistent with the mask before updating; if so, entering the anomaly evaluation module; if not, applying the updated mask to the image to be tested and entering the image feature extraction module again;
an anomaly evaluation module: according to the result of the mask update termination decision module, using an anomaly evaluation function to realize image anomaly evaluation.
Specifically, a network framework of a training system consisting of a mask random generation module, an image feature extraction module, an image reconstruction module, a reconstructed image alignment module, a mask updating termination decision module and an anomaly evaluation module is shown in fig. 2, and the whole system framework can be trained end to end.
In the embodiment system framework shown in FIG. 2, images for model training are used as input, and each input image is decomposed into (H/k) × (W/k) grids, where H and W are the height and width of the image and k controls the size of the grid. Each grid consists of a square of k × k pixels and serves as the basic unit of the mask. Specifically, the size k is sampled from the set K, where N_k is the cardinality of the set. In our implementation, we use K = {4, 8, 16, 32}, because it covers a wide range of scales of anomaly classes. To expand the mask exploration space, a random mask is dynamically generated for each image during each training phase. Each grid is then randomly selected for masking or retention, and the resulting mask matrix is denoted M. In this way, a set of random masks of different sizes and shapes can be generated. With this way of generating random masks, each image is augmented into a diverse set of training triples (I, M, I_M), where I is the input image, M is the resulting mask, and I_M is the generated model input image, produced from I and M by an element-wise product in the spatial domain (the mask needs to be copied along the channel dimension).
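The random mask generation described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the `mask_prob` parameter, and the polarity (1 marks a pixel hidden by the mask, so the model input is the image times the complement of the mask) are all hypothetical choices made for the example.

```python
import numpy as np

def random_grid_mask(height, width, k, mask_prob=0.5, rng=None):
    """Return a (height, width) binary mask built from k x k cells.

    Assumption for this sketch: 1 marks a pixel hidden by the mask,
    0 marks a retained pixel. mask_prob is a hypothetical parameter.
    """
    if height % k or width % k:
        raise ValueError("height and width must be multiples of k")
    rng = np.random.default_rng() if rng is None else rng
    # One independent draw per grid cell: shape (H/k, W/k).
    cells = (rng.random((height // k, width // k)) < mask_prob).astype(np.uint8)
    # Expand every cell into a k x k block of identical values.
    return np.kron(cells, np.ones((k, k), dtype=np.uint8))

def apply_mask(image, mask):
    """Zero out hidden pixels; the mask is copied along the channel axis."""
    return image * (1 - mask)[..., None]

rng = np.random.default_rng(0)
K = (4, 8, 16, 32)                 # grid sizes stated in the text
k = int(rng.choice(K))             # sample one grid size per image
image = rng.random((64, 64, 3))    # stand-in for a training image
mask = random_grid_mask(64, 64, k, rng=rng)
masked_image = apply_mask(image, mask)
triple = (image, mask, masked_image)   # one training triple
```

Drawing a fresh mask every time an image is visited gives the network a different masked view of the same image on each pass, which is the augmentation effect the paragraph describes.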
In the system framework of the embodiment shown in FIG. 2, given an image to be tested, the method is initialized with a set of multi-scale masks. Initially, this set consists of eight checkerboard-like matrices at different scales, with grid size k ∈ K. For each grid size k, the set includes a pair of complementary masks that together cover all pixels in the image, so that no possibly abnormal region is missed.
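A minimal NumPy sketch of this initialization, assuming that "checkerboard-like" means alternating k × k cells and that 1 marks a pixel covered by the mask. The function names are hypothetical.

```python
import numpy as np

def checkerboard_pair(height, width, k):
    """Return two complementary (height, width) checkerboard masks with k x k cells."""
    rows = np.arange(height) // k          # cell row index of every pixel row
    cols = np.arange(width) // k           # cell column index of every pixel column
    board = ((rows[:, None] + cols[None, :]) % 2).astype(np.uint8)
    return board, 1 - board                # the second mask covers what the first leaves out

def init_multiscale_masks(height, width, sizes=(4, 8, 16, 32)):
    """One complementary pair per grid size: 4 sizes give 8 masks."""
    masks = []
    for k in sizes:
        masks.extend(checkerboard_pair(height, width, k))
    return masks

masks = init_multiscale_masks(64, 64)
```

Because the two masks of a pair sum to one at every pixel, each pixel is hidden in exactly one of them, which is what guarantees that no region escapes inspection at a given scale.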
In the system framework of the embodiment shown in fig. 2, the image obtained by the mask random generation module or the mask initialization module is used as an input, a deep convolution neural network is used to extract high-dimensional feature information of the image, and the image feature extraction network is composed of a plurality of layers of convolution and down-sampling operations.
In the system framework of the embodiment shown in fig. 2, high-dimensional feature information obtained by an image feature extraction module is used as input, image reconstruction is realized by using a deep convolutional neural network model to obtain a reconstructed image, and an image attribute recovery network is formed by a plurality of layers of convolution and upsampling operations. And if the input is the model training image, outputting a reconstructed training image, and if the input is the image to be tested, outputting a reconstructed testing image.
In the system framework of the embodiment shown in fig. 2, for the model training image, the reconstructed training image obtained by the image reconstruction module is compared with the model training image, so as to calculate the following loss functions respectively:
(1) mean square loss function:
(2) gradient magnitude similarity loss function:
wherein 1 represents an all-ones matrix;
I_c and its reconstructed counterpart respectively represent the c-th color channel of the model training image and of the reconstructed training image;
the gradient magnitude similarity function is evaluated between the model training image and the reconstructed training image;
i, j represent the two-dimensional coordinates of the image;
and the gradient magnitude similarity loss matrix for channel c:
wherein a represents a constant;
h_x and h_y are the Prewitt filters in the x and y dimensions.
(3) structural similarity index loss function:
wherein the structural similarity index function is centered on the two-dimensional image coordinates (i, j).
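For orientation, the three losses listed above are commonly written as follows in the image reconstruction literature. This is a sketch under assumptions, not the patent's own notation: the symbols \hat{I} (reconstructed training image), N_c (number of color channels), g(·) (gradient magnitude), and the loss names L_2, L_G, L_S are introduced here for illustration, and the normalization factors are one conventional choice. The constant a, the Prewitt filters h_x and h_y, the channel index c, and the coordinates (i, j) follow the definitions in the text; products, squares, square roots, and the quotient are taken element-wise, and * denotes convolution.

```latex
\begin{aligned}
L_2(I,\hat{I}) &= \frac{1}{N_c H W}\sum_{c}\sum_{(i,j)}
      \bigl(I_c(i,j)-\hat{I}_c(i,j)\bigr)^2 \\[4pt]
g(I_c) &= \sqrt{(I_c * h_x)^2 + (I_c * h_y)^2} \\[4pt]
\mathrm{GMS}(I_c,\hat{I}_c) &=
      \frac{2\,g(I_c)\,g(\hat{I}_c) + a}{g(I_c)^2 + g(\hat{I}_c)^2 + a} \\[4pt]
L_G(I,\hat{I}) &= \frac{1}{N_c H W}\sum_{c}\sum_{(i,j)}
      \bigl(\mathbf{1}-\mathrm{GMS}(I_c,\hat{I}_c)\bigr)_{(i,j)} \\[4pt]
L_S(I,\hat{I}) &= \frac{1}{H W}\sum_{(i,j)}
      \bigl(1-\mathrm{SSIM}(I,\hat{I})_{(i,j)}\bigr)
\end{aligned}
```

In this form the GMS map equals 1 wherever the two gradient magnitudes agree, so the loss matrix 1 − GMS vanishes for a perfect reconstruction, and the constant a keeps the quotient defined in flat regions where both gradients are zero.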
In the embodiment system framework shown in fig. 2, the purpose of the mask update is to remove from the mask the regions that likely correspond to normal regions of the image, so that the image reconstruction network focuses on the remaining abnormal regions. In each update iteration, the mask is updated using the reconstruction error: regions with a small reconstruction error are regarded as normal and are removed from the mask in the next iteration. The mask is updated in units of k × k grid cells, which makes the algorithm more stable and reduces the number of update iterations. Thus, given a grid size k, the image is partitioned into grid cells of k × k pixels. Then, for each cell, the average reconstruction error over it is calculated and the mask is updated according to a threshold, so that only the cells whose reconstruction error is above the threshold remain in the mask.
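The cell-wise update can be sketched as below. Assumptions for this illustration: 1 marks a pixel that is still in the mask, `error_map` is a per-pixel reconstruction error (for example a squared difference averaged over channels), and `threshold` is supplied by the caller, since the text does not say how it is chosen. The function name `update_mask` is hypothetical.

```python
import numpy as np

def update_mask(mask, error_map, k, threshold):
    """Keep a k x k cell in the mask only if it is already masked and its
    mean reconstruction error is above the threshold."""
    h, w = mask.shape
    if h % k or w % k:
        raise ValueError("mask size must be a multiple of k")
    # Mean error per cell: view the map as (H/k, k, W/k, k) and average
    # over the two within-cell axes.
    cell_error = error_map.reshape(h // k, k, w // k, k).mean(axis=(1, 3))
    keep_cell = (cell_error > threshold).astype(np.uint8)
    keep = np.kron(keep_cell, np.ones((k, k), dtype=np.uint8))
    # Multiplying by the old mask means cells can only leave the mask.
    return mask * keep

mask = np.ones((8, 8), dtype=np.uint8)   # everything masked initially
error_map = np.zeros((8, 8))
error_map[:4, :4] = 1.0                  # only the top-left cell reconstructs badly
new_mask = update_mask(mask, error_map, k=4, threshold=0.5)
```

Averaging over a whole cell before thresholding is what makes the update robust to isolated noisy pixels, in line with the stability argument in the paragraph above.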
Referring to fig. 2, when most of the area covered by the mask is an abnormal area, providing more image information does not significantly reduce the reconstruction error of the abnormal area. In this case, the overall reconstruction error is not significantly reduced and the corresponding mask remains unchanged. The mask update is then terminated and the final mask is obtained. When the method ends, the mask is expected to cover only the abnormal portion of the image, and the final mask and the reconstructed image are taken as input to the anomaly evaluation module. If the mask still changes in the mask updating step, the anomaly evaluation module is not entered; instead, the image feature extraction module is entered again until the output mask of the mask updating module remains unchanged.
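The update loop and its termination test can be sketched as follows. Everything named here is an assumption for illustration: `toy_reconstruct` is a hypothetical stand-in for the trained feature extraction and reconstruction networks (it simply fills hidden pixels with a constant "normal" value), `max_iters` is a safety cap that the text does not mention, and 1 marks a pixel hidden by the mask.

```python
import numpy as np

def toy_reconstruct(masked_input, mask, normal_value=0.5):
    """Hypothetical stand-in for the trained network: copies visible pixels
    and fills hidden pixels with a constant 'normal' appearance."""
    return np.where(mask[..., None] == 1, normal_value, masked_input)

def refine_mask(image, mask, reconstruct, k, threshold, max_iters=50):
    """Iterate reconstruction and mask update until the mask stops changing."""
    h, w = mask.shape
    ones = np.ones((k, k), dtype=np.uint8)
    recon = image
    for _ in range(max_iters):
        masked_input = image * (1 - mask)[..., None]       # hide masked pixels
        recon = reconstruct(masked_input, mask)
        error_map = ((recon - image) ** 2).mean(axis=-1)   # per-pixel error
        cell_error = error_map.reshape(h // k, k, w // k, k).mean(axis=(1, 3))
        keep = np.kron((cell_error > threshold).astype(np.uint8), ones)
        new_mask = mask * keep
        if np.array_equal(new_mask, mask):                 # termination decision
            break
        mask = new_mask
    return mask, recon

image = np.full((16, 16, 3), 0.5)   # uniform "normal" image
image[8:12, 8:12] = 1.0             # synthetic anomaly filling one 4 x 4 cell
initial_mask = np.ones((16, 16), dtype=np.uint8)
final_mask, final_recon = refine_mask(
    image, initial_mask, toy_reconstruct, k=4, threshold=0.1
)
```

In this toy run the first pass removes every normal cell from the mask, the second pass leaves the mask unchanged, and the loop stops with the mask covering only the synthetic anomaly.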
In the system framework of the embodiment shown in fig. 2, for the image to be tested, the final reconstructed image obtained from the mask update termination decision module is compared with the image to be tested, so as to calculate the following anomaly evaluation function:
In summary, the embodiment of the invention provides an image anomaly detection and anomaly positioning method and system based on a self-supervised mask, in which training with a self-supervised mask extends the image reconstruction task used for image anomaly detection to both image anomaly detection and anomaly positioning. In practical applications, such as medical diagnosis and industrial defect detection, anomalies often appear in only a small portion of the pixels of an image, and anomaly detection methods based on the image reconstruction task alone cannot accurately locate the abnormal region. Introducing self-supervised mask training improves both the anomaly positioning capability and the interpretability of the anomaly detection algorithm, thereby achieving better performance on anomaly detection and anomaly positioning tasks.
Those skilled in the art will appreciate that, in addition to implementing the system and its various devices, modules, units provided by the present invention as pure computer readable program code, the system and its various devices, modules, units provided by the present invention can be fully implemented by logically programming method steps in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, the system and various devices, modules and units thereof provided by the invention can be regarded as a hardware component, and the devices, modules and units included in the system for realizing various functions can also be regarded as structures in the hardware component; means, modules, units for performing the various functions may also be regarded as structures within both software modules and hardware components for performing the method.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The embodiments and features of the embodiments of the present application may be combined with each other arbitrarily without conflict.
Claims (10)
1. An image anomaly detection and anomaly positioning method based on a self-supervised mask, characterized by comprising the following steps:
a mask random generation step: randomly generating a mask with the size consistent with that of the model training image, and applying the mask on the image to remove information of a partial region of the image;
a mask initialization step: generating multi-scale initialization masks for the image to be tested, and respectively applying the masks to the image to be tested to generate multi-scale images to be tested with partial image area information removed;
an image feature extraction step: extracting high-dimensional features of the image by using a deep convolutional neural network for the image obtained in the mask random generation step or the mask initialization step;
an image reconstruction step: carrying out image reconstruction on the high-dimensional features of the image by using a deep convolutional neural network to obtain a reconstructed training image or a reconstructed test image;
a reconstructed image alignment step: according to the reconstructed training image and the model training image, using an image reconstruction loss function to realize self-supervised learning;
a mask updating step: updating the multi-scale mask according to the reconstructed test image by using a mask updating algorithm, so that the mask is more concentrated on the abnormal part of the image;
a mask updating termination decision step: judging, according to the updated mask obtained in the mask updating step, whether the mask is consistent with the mask before updating; if so, entering an abnormality evaluation step; if not, applying the updated mask to the image to be tested and entering the image feature extraction step again;
an abnormality evaluation step: according to the result of the mask updating termination decision step, using an anomaly evaluation function to realize image anomaly evaluation.
2. The method for image anomaly detection and anomaly localization based on a self-supervised mask according to claim 1, wherein the mask random generation step comprises:
using images for model training as input, decomposing each input image into (H/k) × (W/k) grids, wherein H and W are the height and width of the image and k controls the size of the grid;
each grid is composed of a square of k × k pixels and is set as a basic unit of the mask;
each grid is randomly selected for masking or retention and the resulting mask matrix is denoted M.
3. The method for image anomaly detection and anomaly localization based on a self-supervised mask according to claim 1, wherein the mask initialization step comprises: given an image to be tested, initializing with a set of multi-scale masks, the set being composed of eight checkerboard-like matrices at different scales, wherein the grid size k belongs to K;
for each grid size k, a pair of complementary masks is included that collectively cover all pixels in the image.
4. The method for detecting and locating image abnormality based on a self-supervised mask as claimed in claim 1, wherein the image feature extraction step includes: taking the image obtained in the mask random generation step or the mask initialization step as input, and extracting high-dimensional feature information of the image by using a deep convolutional neural network, wherein the image feature extraction network consists of a plurality of layers of convolution and down-sampling operations.
5. The method for detecting and locating image anomalies based on a self-supervised mask according to claim 1, characterized in that the image reconstruction step comprises: taking the high-dimensional feature information obtained in the image feature extraction step as input, and realizing image reconstruction by using a deep convolutional neural network model to obtain a reconstructed image, wherein the image attribute recovery network consists of a plurality of layers of convolution and up-sampling operations;
if the input is the model training image, outputting a reconstructed training image, and if the input is the image to be tested, outputting a reconstructed test image.
6. The image anomaly detection and anomaly positioning method based on the self-supervision mask according to claim 1, characterized in that the reconstructed image alignment step specifically comprises the following steps:
for the model training image, comparing the reconstructed training image obtained in the image reconstruction step with the model training image, and respectively calculating the following loss functions:
(1) mean square loss function:
(2) gradient magnitude similarity loss function:
wherein 1 represents an all-ones matrix;
I_c and its reconstructed counterpart respectively represent the c-th color channel of the model training image and of the reconstructed training image;
the gradient magnitude similarity function is evaluated between the model training image and the reconstructed training image;
i, j represent the two-dimensional coordinates of the image;
and the gradient magnitude similarity loss matrix for channel c:
wherein a represents a constant;
h_x and h_y are the Prewitt filters in the x and y dimensions;
(3) structural similarity index loss function:
7. The method for image anomaly detection and anomaly localization based on a self-supervised mask according to claim 1, wherein the mask updating step comprises:
in each update iteration, a region with a small reconstruction error is regarded as a normal region and is removed from the mask in the next iteration, so that the mask is updated by the reconstruction error;
given the grid size k, the image is divided into grid cells of k × k pixels, and the mask is updated in units of these cells, so that the algorithm is more stable and the number of update iterations is reduced;
for each grid cell, the average reconstruction error is calculated, the mask is updated according to a threshold value, and only the cells whose reconstruction error is higher than the threshold value are retained in the mask.
8. The method of claim 1, wherein the mask update termination decision step comprises:
when most of the area covered by the mask is an abnormal area, stopping updating the mask and obtaining a final mask;
after the process is finished, the mask is expected to cover only the abnormal part of the image, and the final mask and the reconstructed image are used as the input of the abnormality evaluation step;
and if the mask still changes in the mask updating step, not entering the abnormality evaluation step, but re-entering the image feature extraction step until the output mask of the mask updating step remains unchanged.
9. The method of claim 1, wherein the abnormality evaluation step comprises:
for the image to be tested, comparing the final reconstructed image obtained in the mask updating termination decision step with the image to be tested, thereby calculating the following anomaly evaluation function:
10. An image anomaly detection and anomaly localization system based on a self-supervised mask, comprising:
a mask random generation module: randomly generating a mask with the size consistent with that of the model training image, and applying the mask on the image to remove information of a partial region of the image;
a mask initialization module: generating multi-scale initialization masks for the image to be tested, and respectively applying the masks to the image to be tested to generate multi-scale images to be tested with partial image area information removed;
an image feature extraction module: extracting high-dimensional features of the image by using a deep convolutional neural network for the image obtained by the mask random generation module or the mask initialization module;
an image reconstruction module: carrying out image reconstruction on the high-dimensional features of the image by using a deep convolutional neural network to obtain a reconstructed training image or a reconstructed test image;
a reconstructed image alignment module: according to the reconstructed training image and the model training image, using an image reconstruction loss function to realize self-supervised learning;
a mask updating module: updating the multi-scale mask according to the reconstructed test image by using a mask updating algorithm, so that the mask is more concentrated on the abnormal part of the image;
a mask update termination decision module: judging, according to the updated mask obtained by the mask updating module, whether the mask is consistent with the mask before updating; if so, entering the anomaly evaluation module; if not, applying the updated mask to the image to be tested and entering the image feature extraction module again;
an anomaly evaluation module: according to the result of the mask update termination decision module, using an anomaly evaluation function to realize image anomaly evaluation.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111397389.4A CN114022475B (en) | 2021-11-23 | 2021-11-23 | Image anomaly detection and anomaly positioning method and system based on self-supervision mask |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111397389.4A CN114022475B (en) | 2021-11-23 | 2021-11-23 | Image anomaly detection and anomaly positioning method and system based on self-supervision mask |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114022475A (en) | 2022-02-08 |
CN114022475B (en) | 2024-08-02 |
Family
ID=80066084
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111397389.4A Active CN114022475B (en) | 2021-11-23 | 2021-11-23 | Image anomaly detection and anomaly positioning method and system based on self-supervision mask |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114022475B (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021191908A1 (en) * | 2020-03-25 | 2021-09-30 | Yissum Research Development Company Of The Hebrew University Of Jerusalem Ltd. | Deep learning-based anomaly detection in images |
CN112651916A (en) * | 2020-12-25 | 2021-04-13 | 上海交通大学 | Method, system and medium for pre-training of self-monitoring model |
CN113658115A (en) * | 2021-07-30 | 2021-11-16 | 华南理工大学 | Image anomaly detection method for generating countermeasure network based on deep convolution |
Non-Patent Citations (2)
Title |
---|
CHAOQIN HUANG et al.: "Self-Supervised Masking for Unsupervised Anomaly Detection and Localization", 《IEEE TRANSACTIONS ON MULTIMEDIA》, 19 May 2022 (2022-05-19) * |
陈辰;唐胜;李锦涛;: "动态生成掩膜弱监督语义分割", 中国图象图形学报, no. 06, 16 June 2020 (2020-06-16) * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114565594A (en) * | 2022-03-04 | 2022-05-31 | 西安电子科技大学 | Image anomaly detection method based on soft mask contrast loss |
CN114862814A (en) * | 2022-05-18 | 2022-08-05 | 上海师范大学天华学院 | Solar cell panel defect detection method and system, storage medium and terminal |
CN116246114A (en) * | 2023-03-14 | 2023-06-09 | 哈尔滨市科佳通用机电股份有限公司 | Method and device for detecting pull ring falling image abnormality of self-supervision derailment automatic device |
CN116246114B (en) * | 2023-03-14 | 2023-10-10 | 哈尔滨市科佳通用机电股份有限公司 | Method and device for detecting pull ring falling image abnormality of self-supervision derailment automatic device |
Also Published As
Publication number | Publication date |
---|---|
CN114022475B (en) | 2024-08-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||