CN116091987A - Industrial scene-oriented multi-strategy image anomaly sample generation method - Google Patents

Industrial scene-oriented multi-strategy image anomaly sample generation method

Info

Publication number
CN116091987A
CN116091987A (application CN202310035284.7A)
Authority
CN
China
Prior art keywords
sample
abnormal
image
generation method
industrial
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310035284.7A
Other languages
Chinese (zh)
Inventor
王素玉
晋一淑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN202310035284.7A priority Critical patent/CN116091987A/en
Publication of CN116091987A publication Critical patent/CN116091987A/en
Pending legal-status Critical Current

Classifications

    • G06V 20/50 Scenes; scene-specific elements: context or environment of the image
    • G06T 7/40 Image analysis: analysis of texture
    • G06T 7/62 Image analysis: analysis of geometric attributes of area, perimeter, diameter or volume
    • G06T 7/90 Image analysis: determination of colour characteristics
    • G06V 10/764 Image or video recognition using pattern recognition or machine learning, using classification, e.g. of video objects
    • G06V 10/774 Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 2201/07 Indexing scheme: target detection
    • Y02P 90/30 Computing systems specially adapted for manufacturing


Abstract

The invention discloses a multi-strategy image anomaly sample generation method for industrial scenes, comprising the following steps: obtain a normal sample in an industrial scene and feed it into a preset network; treat the industrial data as one of three categories (texture, common object, and small target object) and apply the corresponding one of three generation methods to produce abnormal samples; train on the normal and abnormal samples; and finally generate a detection model for the corresponding category. The invention is low in cost and high in precision to implement, and has high application value.

Description

Industrial scene-oriented multi-strategy image anomaly sample generation method
Technical Field
The invention relates to the technical field of image anomaly detection in deep-learning-based computer vision tasks and the technical field of image classification and recognition, and in particular to an industrial-scene-oriented multi-strategy image anomaly detection method.
Background
With the development of image acquisition technology, the cost of acquiring product images on industrial assembly lines has gradually decreased, and analyzing and inspecting these images helps detect anomalies and improve the level of industrial manufacturing. Under current deep-learning-based computer vision techniques, the main image anomaly detection approaches applicable to industrial scenes are the following:
(1) Image-reconstruction-based methods. The core idea is to train a model on normal samples only; when an abnormal sample is input, the model cannot reconstruct the abnormal region faithfully, so the anomaly can be judged from the difference between the input image and the reconstructed image.
(2) Methods based on generative adversarial networks (Generative Adversarial Network, GAN). The core idea is to introduce a generator and a discriminator to improve the quality of the reconstructed image; unlike (1), these methods also take differences in feature space into account.
(3) Methods based on deep feature embedding, mainly comprising knowledge-distillation-based and deep-feature-modeling-based approaches. They are divided into two parts, feature extraction and anomaly estimation, and judge anomalies by comparing the deep embedded features of the target image with those of normal images.
(4) Methods based on self-supervised learning, which learn visual features from unlabeled images and then apply them to the detection task. They mainly comprise pretext tasks and contrastive learning; the former considers simulated anomalies, relative-position prediction, and the like, while the latter mainly concerns network design.
However, the above anomaly detection techniques have the following drawbacks:
(1) Method (1) relies solely on normal samples and introduces no prior knowledge of anomalies, so abnormal features cannot be learned and modeling can only rely on the limited features present in normal samples.
(2) Method (2) lacks accurate estimation and inference of the probability distribution, which often yields blurry, low-quality reconstructions, and GAN training also faces challenges such as mode collapse.
(3) Method (3) needs to extract fine-grained features to handle multi-scale and non-aligned data; its algorithm design is complex and its space-time complexity high, making deployment requirements hard to meet.
(4) In method (4), existing abnormal sample generation methods are coarse: the generated anomalies still differ substantially from real ones and cannot simulate abnormal data in actual scenes well, which hinders network learning, while contrastive learning requires carefully designed network structures.
Disclosure of Invention
The invention provides a multi-strategy abnormal sample generation method for industrial scenes, which designs a series of abnormal sample generation strategies tailored to industrial scenes and has the advantages of low cost, high precision, and ease of deployment.
In the method, an image anomaly detection network receives normal samples of the target industrial product and simulates abnormal samples according to the multi-strategy anomaly sample generation method provided by the invention; the network is trained on the normal samples and the generated abnormal samples, and after training the resulting model automatically judges whether a target industrial image is abnormal.
In order to achieve the above purpose, the present invention provides a multi-strategy image anomaly sample generation method for industrial scenes, which comprises the steps of obtaining normal samples of industrial products collected by equipment, and sending the normal samples into a preset anomaly detection network; selecting a corresponding abnormal sample generation method according to the types of the obtained industrial product normal samples, and then adopting the method to generate corresponding abnormal samples; and obtaining an abnormal sample, and combining the normal sample for training to generate a detection model.
Specifically, the technical scheme adopted by the invention is a multi-strategy image anomaly sample generation method facing industrial scenes, and the method comprises the following steps:
step S1, obtaining a normal sample in an industrial scene, confirming that the sample type is one of texture, small target object, and common object, and then feeding the sample into the training network;
step S2, judging whether the sample class is texture class data, if so, calling a texture class abnormal sample generation method, namely adopting affine transformation and spherical deformation to carry out mixed processing, and turning to step S5, if not, turning to step S3;
step S3, judging whether the class is a small target object class, if so, calling a small target object class abnormal sample generation method, namely adopting color dithering, position limiting, random rotation and size reduction to carry out mixed processing, and turning to step S5, if not, turning to step S4;
step S4, calling the common object abnormal sample generation method, namely adopting position limitation and random rotation for mixed processing, and turning to step S5;
and S5, training the normal sample and the generated abnormal sample to generate a detection model.
The model generated by the method is obtained through a self-supervised training process. The process is oriented to industrial scenes and treats industrial data as three categories: textures, common objects, and small target objects. Training can be completed using normal samples alone, without any real abnormal samples, so the cost is low; the method therefore suits industrial scenes where abnormal samples are diverse and hard to collect, and has high practical value. Specifically, the method generates abnormal samples from normal samples to simulate real abnormal data, designs a multi-strategy abnormal sample generation method for the different data categories, selects ResNet-18 as the feature extraction network, trains on the normal and abnormal data fed into the network, learns the feature representations of the different categories, and thereby generates a detection model for the corresponding category.
Steps S2, S3, and S4 design three abnormal sample generation methods. First, the data types in an industrial scene are treated as three categories: textures, common objects, and small target objects. A region is randomly selected and cropped from a normal sample and then pasted back into the original sample to simulate an abnormal sample. On this basis, abnormal sample generation methods with different strategies are designed for the three categories: for texture data, the cropped region is processed with a mixture of affine transformation and spherical deformation to simulate real deformation; for common object data, the position of the cropping region is limited and the cropped image is randomly rotated to simulate angle changes; for small target objects, color dithering and size reduction are additionally applied to the cropped region on top of the common-object processing.
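The cut-and-paste operation shared by all three strategies can be sketched as follows. This is an illustrative sketch only: images are represented as plain 2-D lists of pixel values, and the function name `cut_paste` is an assumption, not terminology from the patent.

```python
import copy
import random

def cut_paste(image, crop_h, crop_w, rng=random):
    """Crop a random crop_h x crop_w region from `image` and paste it back
    at a random location, returning a new image that simulates an anomaly.
    The original image is left unchanged."""
    h, w = len(image), len(image[0])
    # source location of the crop
    sy = rng.randrange(h - crop_h + 1)
    sx = rng.randrange(w - crop_w + 1)
    patch = [row[sx:sx + crop_w] for row in image[sy:sy + crop_h]]
    # destination location of the paste
    dy = rng.randrange(h - crop_h + 1)
    dx = rng.randrange(w - crop_w + 1)
    out = copy.deepcopy(image)
    for i in range(crop_h):
        out[dy + i][dx:dx + crop_w] = patch[i]
    return out
```

In the full method, the cropped patch would additionally be transformed (affine/spherical warp, rotation, color dithering, shrinking) according to the sample category before being pasted back.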
Step S2 designs an image anomaly sample generation method for texture-type industrial data. The cropped region is processed with a mixture of affine transformation and spherical deformation, i.e., the ratio of the two processing modes is set to 1:1 and the two kinds of abnormal samples are mixed evenly. For the affine transformation, in order to preserve the area of the transformed image as much as possible while randomly generating transformations of different shapes, the three generated coordinates are set to (0, a), (b, c), (d, e), the distances between the coordinates are constrained, and the difference between specified vertices is required to be no less than 0.2 times the side length of the original image; this prevents the coordinates from overlapping or lying too close together and improves the quality of the generated image. For the spherical deformation, a polar coordinate is generated from three elements, namely the current pixel, the image center point, and the horizontal line through the center; a pixel on the line connecting the current point and the center is then mapped according to the polar radius and polar angle and output as the target point.
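The vertex constraint for the affine transformation can be sketched as rejection sampling. Interpreting "difference between specified vertices" as Euclidean distance is an assumption (the text does not define the metric), and the function name is illustrative.

```python
import random

def sample_affine_vertices(side, min_frac=0.2, rng=random):
    """Sample three affine output vertices inside a side x side image,
    rejecting layouts where any two vertices are closer than
    min_frac * side (0.2 times the side length, per the text)."""
    min_d = min_frac * side
    while True:
        pts = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(3)]
        # keep only layouts where every pair of vertices is far enough apart
        if all(((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5 >= min_d
               for i, p in enumerate(pts) for q in pts[i + 1:]):
            return pts
```

The rejection loop prevents degenerate (near-collinear or overlapping) vertex triples, which is what the 0.2x-side-length rule is stated to achieve.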
Step S4 designs an image anomaly sample generation method for common-object industrial data. First, the cropping region is processed by combining position limitation and random rotation. Objects are divided into 3 position-limiting modes: for a square target object, the cropping-region aspect ratio is set to 1:1 and the peripheral boundary is excluded, i.e., the crop vertices lie between 0.1 and 0.9 times the side length of the original image; for a rectangular target object, the cropping-region ratio is set to 4:3 or 3:4, with the long-side vertices between 0.1 and 0.9 times the side length and the short-side vertices between 0.2 and 0.8 times the side length of the original image. The cropped region is then randomly rotated within [-45°, 45°] before being pasted.
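The position limits and rotation range can be sketched as below. The text fixes only the bounds and aspect ratios, so the uniform sampling of the crop size and position within those bounds is an assumption, and `sample_object_crop` is an illustrative name.

```python
import random

def sample_object_crop(side, shape="square", rng=random):
    """Sample a position-limited crop box (x0, y0, x1, y1) and a rotation
    angle for a common-object sample of size side x side."""
    if shape == "square":
        # 1:1 crop, vertices within [0.1, 0.9] of the side length
        xlo, xhi = 0.1 * side, 0.9 * side
        ylo, yhi = xlo, xhi
        aspect = 1.0
    else:
        # 4:3 or 3:4 crop; long-side vertices in [0.1, 0.9],
        # short-side vertices in [0.2, 0.8] of the side length
        aspect = rng.choice([4 / 3, 3 / 4])
        if aspect > 1:
            xlo, xhi = 0.1 * side, 0.9 * side
            ylo, yhi = 0.2 * side, 0.8 * side
        else:
            xlo, xhi = 0.2 * side, 0.8 * side
            ylo, yhi = 0.1 * side, 0.9 * side
    w = rng.uniform(0.1 * side, xhi - xlo)
    h = w / aspect
    x0 = rng.uniform(xlo, xhi - w)
    y0 = rng.uniform(ylo, yhi - h)
    angle = rng.uniform(-45.0, 45.0)  # rotation range [-45 deg, 45 deg]
    return (x0, y0, x0 + w, y0 + h), angle
```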
Step S3 designs an image anomaly sample generation method for small-target-object industrial data. On top of the common-object processing of step S4, color dithering and size reduction are applied to the cropped region: the 4 color-dithering parameters, namely brightness, contrast, saturation, and hue, are all set to 0.1, and the size reduction limits the pixel count of the cropped region, with the long-side and short-side pixel ranges being [30, 80] and [2, 15] respectively.
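The extra small-target parameters can be collected in a short sketch; uniform sampling of the long and short sides within the stated ranges is an assumption, and the function name is illustrative.

```python
import random

def sample_small_target_params(rng=random):
    """Sample the extra small-target settings: a thin crop whose long side
    is in [30, 80] px and short side in [2, 15] px, plus the fixed
    color-dithering strengths stated in the text."""
    long_px = rng.randint(30, 80)   # long-side pixel range [30, 80]
    short_px = rng.randint(2, 15)   # short-side pixel range [2, 15]
    jitter = {"brightness": 0.1, "contrast": 0.1, "saturation": 0.1, "hue": 0.1}
    return (long_px, short_px), jitter
```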
Compared with the prior art, the invention has the following technical effects.
The technical features and advantages of the present invention can be considered from the perspectives of the technical principle and the implementation of the method.
1. The invention designs a multi-strategy abnormal sample generation algorithm aiming at the detection object in the industrial scene, improves the quality of the abnormal sample and improves the abnormal detection precision.
2. The invention does not need a real abnormal sample, only depends on a small amount of normal samples for training, has lower cost, is suitable for actual industrial scenes, and has higher application value.
3. The abnormal sample generation algorithm designed by the invention is applicable to various industrial product types and, once trained, can be used long-term, making it convenient and efficient.
4. The invention has a simple network structure, a small trained model size, and low requirements on equipment performance, making on-site deployment very convenient.
Drawings
FIG. 1 is a schematic flow chart of the method of the present invention;
FIG. 2 is a flow chart of the training process of the method according to the present invention;
FIG. 3 is a flow chart showing specific steps in the training process of the method according to the present invention.
Detailed Description
The technical scheme of the invention is further explained below through examples.
The invention provides a multi-strategy image anomaly sample generation method for industrial scenes, suited to product anomaly detection requirements in such scenes. Based on self-supervised learning, the method treats the data categories as three types: textures, objects, and small target objects. Using only a small number of normal samples, it generates a large number of abnormal samples to simulate real abnormal data and then trains on both sample sets to produce a detection model. In use, images to be inspected collected from the industrial assembly line are fed into the trained model, which computes, judges, and outputs a detection result.
The algorithm framework is as follows:
referring to fig. 2, fig. 2 is a flow chart illustrating a training process according to the method of the present invention.
As shown in fig. 2, the embodiment of the present invention provides an industrial scene-oriented multi-strategy image anomaly sample generation method, where the image anomaly sample generation method generally includes the following steps:
step S1, acquiring a normal sample in an industrial scene, and sending the normal sample into a training network;
step S2, judging whether the class is texture class data, if yes, calling a texture class abnormal sample generation method, namely adopting affine transformation and spherical deformation to carry out mixed processing, turning to step S5, and if not, turning to step S3;
step S3, judging whether the class is a small target class, if so, calling a small target object class abnormal sample generation method, namely adopting color dithering, position limiting, rotation and size reduction to carry out mixed processing, and turning to step S5, if not, turning to step S4;
s4, invoking an object abnormal sample generation method, namely adopting position limiting and rotation to perform mixing treatment, and turning to S5;
and S5, training the normal sample and the generated abnormal sample to generate a detection model.
The specific algorithm is described below:
The training process of the models invoked in steps S2, S3, and S4 is shown in fig. 2. First, the normal samples required for training are fed into the network; since each category is trained independently, they can be fed in directly. The targets generally comprise two major classes, textures and objects, where objects further comprise common objects and small target objects. During training, the sample is first judged to be a texture or not; if not, it is an object, and it is then further judged whether it is a small target object. Through these steps, the training data is classified as one of three types: texture, common object, or small target object. A cut-and-paste scheme is then used as the baseline method: a small region is randomly cropped from a normal sample and copied, the copied image is processed with a different strategy according to the characteristics of the three categories, and the processed image is randomly pasted back onto the normal sample. The sample now contains an abnormally deformed region, so it is treated as an abnormal sample and fed into the network together with the normal sample for classification training. The details of the strategies adopted for the three categories are as follows.
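The dispatch order described above (texture checked first, then small target, else common object) can be sketched as follows; the operation labels returned are descriptive, not patent terminology.

```python
def select_strategy(is_texture, is_small_target):
    """Mirror the category dispatch: texture is checked first; otherwise the
    sample is an object, and small targets get the extra processing steps.
    Returns the list of operations each strategy applies to the crop."""
    if is_texture:
        return ["affine_transform", "spherical_deformation"]
    if is_small_target:
        return ["color_dithering", "position_limit", "random_rotation", "size_reduction"]
    return ["position_limit", "random_rotation"]
```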
(1) If the sample is a texture, products of this type often deform in texture structure due to stretching and extrusion, and such deformation cannot be simulated by simple cutting and pasting. Therefore, affine transformation and spherical deformation are applied to the cropped region, and a generation mode combining the two deformations is used to approximate real anomalies. To simulate the deformation, a region is randomly cropped from a normal sample and processed with a mixture of affine transformation and spherical deformation, i.e., the ratio of the two processing modes is set to 1:1 and the two abnormal sample generation strategies are mixed randomly and evenly; the processed image is then randomly pasted back onto the normal sample, simulating the real deformation of texture data.
Affine transformation applies a series of geometric operations such as translation and rotation to an image while preserving the straightness and parallelism of lines. The invention adopts a complex affine transformation whose key point is the selection of the three output coordinates: to preserve the area of the transformed image as much as possible while randomly generating transformations of different shapes, the three coordinates are set to (0, a), (b, c), (d, e), the distances between the coordinates are constrained, and the difference between specified vertices must be no less than 0.2 times the side length of the original image, thereby preventing the coordinate positions from overlapping or lying too close together and improving the quality of the generated image.
Spherical deformation bulges the middle of the image into a sphere, giving the object a three-dimensional protruding effect. The deformation is implemented with polar coordinates: a polar coordinate is generated from three elements, namely the current pixel, the image center point, and the horizontal line through the center; according to the following formulas, a pixel on the line connecting the current point and the center is mapped and finally output as the target point. Specifically, let (x', y') denote a point on the original image and (x, y) the corresponding point on the spherical deformation result; the spherical deformation then satisfies the following relations:
x' = midx + (ρ²/R)·cos θ

y' = midy + (ρ²/R)·sin θ

offsetx = x - midx, offsety = y - midy

ρ = sqrt(offsetx² + offsety²)

θ = arctan(offsety / offsetx)

wherein (midx, midy) represents the coordinates of the exact center point of the image, ρ and θ represent the polar radius and polar angle of the point (x, y), and R is the radius of the spherical bulge.
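A minimal sketch of the mapping follows, assuming the bulge takes the common ρ²/R polar form; the exact functional form, the radius parameter R, and the name `spherize_source` are assumptions for illustration.

```python
import math

def spherize_source(x, y, midx, midy, radius):
    """For an output pixel (x, y), compute the source pixel along the ray
    from the image center (midx, midy), using the polar bulge mapping
    rho_source = rho**2 / radius."""
    ox, oy = x - midx, y - midy   # offsets from the center
    rho = math.hypot(ox, oy)      # polar radius of (x, y)
    theta = math.atan2(oy, ox)    # polar angle of (x, y)
    r = rho * rho / radius        # bulged source distance
    return midx + r * math.cos(theta), midy + r * math.sin(theta)
```

Points at distance `radius` from the center map to themselves, while interior points are pulled toward the center, which produces the protruding-sphere effect when the image is resampled.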
(2) If the sample is a common object, analysis shows that the object always lies in the center of the image. If an arbitrary region were cropped and pasted as with texture data, a non-object region would easily be cropped, degrading the quality of the generated abnormal sample. The position of the cropping region is therefore limited so that cropping and pasting occur only in the central area of the image, which makes the generated abnormal samples closer to real anomalies.
Because object shapes differ, the objects are divided into 3 position-limiting modes to generate more accurate abnormal data. If the target object is square, the cropping-region aspect ratio is set to 1:1 and the peripheral boundary is excluded, i.e., the crop vertices lie between 0.1 and 0.9 times the side length of the original image. Similarly, if the target object is rectangular, the cropping-region ratio is set to 4:3 or 3:4, with long-side vertices between 0.1 and 0.9 times the side length and short-side vertices between 0.2 and 0.8 times the side length of the original image.
In addition, considering the diversity of abnormal morphology changes, the cropped regions are randomly rotated within [-45°, 45°] on top of the position limits before being pasted, thereby simulating the various angle changes of real anomalies.
(3) If the sample is a small target object, analysis shows that the object region is small within the sample. If an arbitrary region were cropped and pasted as with texture data, a non-object region would easily be cropped, and even with position limiting and rotation the generated samples would hardly approach real abnormal data. The cropped region is therefore processed more carefully: experiments show that combining the method in (2) with color dithering and size reduction highlights the difference between the cropped region and the original sample, yields a better generation effect, and gives the network a clearer feature difference between normal and abnormal samples to learn. The 4 color-dithering parameters, namely brightness, contrast, saturation, and hue, are all set to 0.1; the size reduction limits the pixel count of the cropping region, with long-side and short-side pixel ranges of [30, 80] and [2, 15] respectively.
The above introduces the three generation strategies. The specific training steps after generating the abnormal samples are shown in fig. 3. To ensure the feature extraction capability of the network while controlling the size of the generated model, ResNet-18 is selected as the feature extraction network to learn the input samples. The three types of data are processed with the three generation methods respectively, and the abnormal samples are then fed into the network together with the normal samples for classification training with the following objective function:
L = E_{x∈X}[ CE(g(x), 0) + CE(g(CP(x)), 1) ]
where X is the normal dataset, CP(·) denotes the abnormal sample generation method, g is a binary classifier parameterized by a deep convolutional network, and CE(·) is the cross-entropy loss function.
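The objective can be written out for a single normal/generated pair. Representing g by its predicted anomaly probability is an illustrative assumption, as is the function name; the structure (cross-entropy toward label 0 on x and label 1 on CP(x)) follows the formula above.

```python
import math

def cutpaste_loss(p_anom_on_normal, p_anom_on_generated):
    """Per-pair objective CE(g(x), 0) + CE(g(CP(x)), 1): g should predict
    'normal' (label 0) on the normal sample x and 'abnormal' (label 1) on
    the generated sample CP(x). Arguments are g's predicted anomaly
    probabilities on the two inputs."""
    eps = 1e-12  # numerical floor to avoid log(0)
    return (-math.log(1.0 - p_anom_on_normal + eps)
            - math.log(p_anom_on_generated + eps))
```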
In each training round, the network generates abnormal samples from the available normal samples to assist learning; the number generated is determined by the specified batch size. For example, with a batch size of 32, 32 abnormal samples are generated from normal samples in each iteration until the specified number of training rounds is completed, at which point the trained model is produced. The model can then be used in practice: an image to be inspected is fed into the model, which judges whether it is abnormal and outputs the result.
After adopting the multi-strategy abnormal sample generation method, experimental verification was carried out on the MVTec dataset, which comprises 15 categories (5 texture categories and 10 object categories), with the following results.
[Per-category results table, reproduced as an image in the original publication.]

Claims (6)

1. An industrial scene-oriented multi-strategy image anomaly sample generation method is characterized by comprising the following steps of:
step S1, obtaining a normal sample in an industrial scene, confirming that the sample type is one of texture, small target object, and common object, and then feeding the sample into the training network;
step S2, judging whether the sample class is texture class data, if so, calling a texture class abnormal sample generation method, namely adopting affine transformation and spherical deformation to carry out mixed processing, and turning to step S5, if not, turning to step S3;
step S3, judging whether the class is a small target object class, if so, calling a small target object class abnormal sample generation method, namely adopting color dithering, position limiting, random rotation and size reduction to carry out mixed processing, and turning to step S5, if not, turning to step S4;
step S4, calling the common object abnormal sample generation method, namely adopting position limitation and random rotation for mixed processing, and turning to step S5;
and S5, training the normal sample and the generated abnormal sample to generate a detection model.
2. The industrial scene-oriented multi-strategy image anomaly sample generation method of claim 1, wherein the method comprises the steps of:
the model generated by the method is obtained by a training process based on a self-supervision mode; the process is oriented to industrial scenes, and industrial data are regarded as three categories of textures, common objects and small target objects; generating abnormal samples by using normal samples to simulate real abnormal data, designing a multi-strategy abnormal sample generation method aiming at data of different categories, selecting ResNet-18 as a characteristic extraction network, training normal and abnormal data sent into the network, learning characteristic representations of different categories, and further generating a detection model of corresponding categories.
3. The industrial scene-oriented multi-strategy image anomaly sample generation method of claim 1, wherein the method comprises the steps of:
step S2, step S3 and step S4 are used for designing three abnormal sample generation methods; firstly, regarding data types in an industrial scene as three types of textures, common objects and small target objects, randomly selecting a region in a normal sample for cutting, and then pasting the region into a primary sample to simulate an abnormal sample; then designing abnormal sample generation methods with different strategies for the three types of data, and processing the cut area by adopting an affine transformation and spherical deformation mixed mode for texture type data so as to simulate real deformation; for common object data, limiting the position of a cutting area, and carrying out random rotation processing on the cut image so as to simulate angle transformation; for small target objects, on the basis of a common object processing method, color dithering and size reduction processing are added to the cut area.
4. The industrial scene-oriented multi-strategy image anomaly sample generation method of claim 1, wherein the method comprises the steps of:
step S2 designs an image anomaly sample generation method for texture-type industrial data; the cropped area is processed in a mode mixing affine transformation and spherical deformation, namely the ratio of the two processing modes is set to 1:1 and the two kinds of abnormal samples are mixed uniformly; for the affine transformation, in order to ensure the area of the transformed image as much as possible and randomly generate transformations of different shapes, the three generated coordinates are set to (0, a), (b, c), (d, e), the distances between the coordinates are limited, and the difference between specified vertices is not smaller than 0.2 times the side length of the original image, thereby avoiding overlapping or too-close coordinate positions and improving the quality of the generated image; for the spherical deformation, a polar coordinate is generated according to three elements, namely the current pixel point, the image center point, and the horizontal line passing through the center point, and a pixel point on the straight line connecting the current point and the center point is mapped according to the polar radius and polar angle and output as the target point.
5. The industrial scene-oriented multi-strategy image anomaly sample generation method of claim 1, wherein the method comprises the steps of:
step S4 designs an image abnormal sample generation method for common-object industrial data; the cut region is first processed by combining position limitation with random rotation; objects are divided into 3 position-limitation modes: for a square target object, the aspect ratio of the cut region is set to 1:1 and the peripheral boundary areas are excluded from the cut, i.e. the cut vertices lie between 0.1 and 0.9 times the side length of the original image; for a rectangular target object, the aspect ratio of the cut region is set to 4:3 or 3:4, the long-side vertices lie between 0.1 and 0.9 times the side length of the original image and the short-side vertices between 0.2 and 0.8 times the side length; the cut region is then randomly rotated before pasting, with a rotation angle range of [-45°, 45°].
6. The industrial scene-oriented multi-strategy image anomaly sample generation method of claim 1, wherein the method comprises the steps of:
step S3 designs an image abnormal sample generation method for small-target-object industrial data; colour jitter and size-reduction processing of the cut region are added on the basis of claim 5, wherein the 4 colour-jitter parameters, namely brightness, contrast, saturation and hue, are all set to 0.1, and the size reduction limits the pixel count of the cut region: the long-side and short-side length ranges are [30, 80] and [2, 15] pixels respectively.
CN202310035284.7A 2023-01-10 2023-01-10 Industrial scene-oriented multi-strategy image anomaly sample generation method Pending CN116091987A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310035284.7A CN116091987A (en) 2023-01-10 2023-01-10 Industrial scene-oriented multi-strategy image anomaly sample generation method


Publications (1)

Publication Number Publication Date
CN116091987A true CN116091987A (en) 2023-05-09

Family

ID=86213504

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310035284.7A Pending CN116091987A (en) 2023-01-10 2023-01-10 Industrial scene-oriented multi-strategy image anomaly sample generation method

Country Status (1)

Country Link
CN (1) CN116091987A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116385807A (en) * 2023-05-30 2023-07-04 南京信息工程大学 Abnormal image sample generation method and device
CN116385807B (en) * 2023-05-30 2023-09-12 南京信息工程大学 Abnormal image sample generation method and device

Similar Documents

Publication Publication Date Title
US10740897B2 (en) Method and device for three-dimensional feature-embedded image object component-level semantic segmentation
Lee et al. Context-aware synthesis and placement of object instances
CN109493317B (en) 3D multi-vertebra segmentation method based on cascade convolution neural network
CN108875935B (en) Natural image target material visual characteristic mapping method based on generation countermeasure network
CN108416266B (en) Method for rapidly identifying video behaviors by extracting moving object through optical flow
US11182644B2 (en) Method and apparatus for pose planar constraining on the basis of planar feature extraction
CN110175986B (en) Stereo image visual saliency detection method based on convolutional neural network
CN110728209A (en) Gesture recognition method and device, electronic equipment and storage medium
CN110619638A (en) Multi-mode fusion significance detection method based on convolution block attention module
CN111127631B (en) Three-dimensional shape and texture reconstruction method, system and storage medium based on single image
CN110580680B (en) Face super-resolution method and device based on combined learning
CN110910339B (en) Logo defect detection method and device
CN113554742A (en) Three-dimensional image reconstruction method, device, equipment and storage medium
CN110458178A (en) The multi-modal RGB-D conspicuousness object detection method spliced more
CN116091987A (en) Industrial scene-oriented multi-strategy image anomaly sample generation method
CN114494594A (en) Astronaut operating equipment state identification method based on deep learning
Wu et al. Mapnerf: Incorporating map priors into neural radiance fields for driving view simulation
Li et al. Primitive fitting using deep geometric segmentation
CN117218457A (en) Self-supervision industrial anomaly detection method based on double-layer two-dimensional normalized flow
Von Zuben et al. Generative adversarial networks for extrapolation of corrosion in automobile images
Lee et al. ELF-Nets: deep learning on point clouds using extended laplacian filter
CN115841546A (en) Scene structure associated subway station multi-view vector simulation rendering method and system
CN114973364A (en) Depth image false distinguishing method and system based on face region attention mechanism
CN110717471B (en) B-ultrasonic image target detection method based on support vector machine model and B-ultrasonic scanner
Chen et al. Deep learning framework-based 3D shape reconstruction of tanks from a single RGB image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination