CN117079008A

CN117079008A - Abnormality detection method, device, equipment and storage medium based on distillation learning

Info

Publication number: CN117079008A
Application number: CN202310875823.8A
Authority: CN
Inventors: 丁贵广; 杨会越; 陈辉
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 2023-07-17
Filing date: 2023-07-17
Publication date: 2023-11-17

Abstract

The application relates to an anomaly detection method, device, equipment and storage medium based on distillation learning, wherein the method comprises the following steps: collecting abnormal detection data of a target; inputting the abnormality detection data into a Teacher-Student framework to obtain a plurality of pairs of feature images, and normalizing the plurality of pairs of feature images based on a preset feature processing strategy; inputting a pair of feature maps with the lowest resolution in the pairs of feature maps into a pre-trained reconstrator network to obtain a plurality of pairs of reconstructed feature maps, subtracting each pair of reconstructed feature maps to obtain an anomaly map, and acquiring an anomaly detection result of anomaly detection image data based on the anomaly map and the reconstructed feature anomaly map. Therefore, the problems that the existing pixel level anomaly detection method based on distillation learning is low in accuracy, poor in robustness and the like when facing anomaly data are solved.

Description

Abnormality detection method, device, equipment and storage medium based on distillation learning

Technical Field

The present application relates to the field of distillation learning technology, and in particular, to a method, apparatus, device, and storage medium for detecting anomalies based on distillation learning.

Background

The abnormal detection can be divided into supervised learning and unsupervised learning according to supervision conditions, and a large amount of abnormal data is required for training a model by a supervised learning method, however, the data amount of the abnormal data is insufficient and the manual labeling cost is high; unsupervised anomaly detection can be classified into two categories according to detection targets: anomaly detection at the image level and anomaly detection at the pixel level. The object of image level anomaly detection is to perform two classifications on images and determine whether anomalies exist in the images; the object of the pixel level anomaly detection is to determine whether each pixel point in the image is an anomaly region.

There are three methods for image level anomaly detection: generating a model, a data distribution and classification method; wherein the method based on the generated model is to detect abnormality according to the degree of reconstruction loss; the distribution-based method considers that the samples deviating from the normal data distribution are abnormal, and when the probability distribution of only normal products is generated, the probability density of the abnormal image is very low, so that the abnormal data can be classified; the classification-based method is an anomaly detection method combining geometric transformation and classification, and unknown anomaly data is poor in classification accuracy, so that an anomaly sample is detected; the above method can distinguish between normal and abnormal images, but cannot effectively locate abnormal pixel positions in the abnormal image.

The pixel level anomaly detection target is each pixel point, and the difficulty is higher than that of image level anomaly detection. The main methods of pixel level anomaly detection are based on methods of generating models, such as generating countermeasure networks (Generative Adversarial Network, GAN) and Auto Encoders (AE); in recent years, the related art can perform pixel level anomaly detection by combining the GAN and the automatic encoder, but the method needs to generate a model and reconstruct a normal image with higher accuracy, otherwise, the accuracy of anomaly detection is degraded.

At present, the exploration of a pixel level abnormality detection method based on distillation learning is less common, and the existing method mainly focuses on the characteristic distribution of normal or abnormal images, and detects an abnormal region through the difference of the distribution during the test; because abnormal data are difficult to obtain in large quantity and the cost of manual labeling is high, the existing method mostly uses abnormal-free images as training sets, trains the ability of a model to extract normal characteristics, and has the following defects:

1. the method is characterized in that the resolution is compressed when the features are extracted based on the depth model, so that the defect features with smaller scales are lost, the existing method tries to solve the problem by adopting a feature pyramid mode, the features with multiple resolutions are usually extracted, the abnormal graph of each resolution is calculated, and finally a detection result is obtained by multiplying the multiple abnormal graphs, wherein the feature pyramid can detect the multi-scale abnormality, but when the detection of one abnormal graph is wrong or the precision is low, the final result is influenced, and the robustness is poor;

2. the existing method mainly focuses on how to train a Student model to effectively learn features, but does not have the capability of reconstructing a normal image similar to that of a generated model, a convolutional neural network can cause high-dimensional to low-dimensional feature loss when extracting features, and the Student learning features only focus on the consistency of a feature space and Teacher and do not take the consistency when reconstructing a picture by the features into consideration.

In summary, the existing pixel level anomaly detection method based on distillation learning has low precision and poor robustness when facing anomaly data, and needs to be solved.

Disclosure of Invention

The application provides an anomaly detection method, device, equipment and storage medium based on distillation learning, which are used for solving the problems of low precision, poor robustness and the like of the existing pixel-level anomaly detection method based on distillation learning when facing anomaly data.

An embodiment of a first aspect of the present application provides an abnormality detection method based on distillation learning, including the steps of: collecting abnormal detection data of a target; and inputting the anomaly detection data into a Teacher-Student framework to obtain a plurality of pairs of feature graphs, normalizing the pairs of feature graphs based on a preset feature processing strategy, inputting a pair of feature graphs with the lowest resolution in the pairs of feature graphs into a pre-trained Reconstructor network to obtain a plurality of pairs of reconstructed feature graphs, subtracting each pair of reconstructed feature graphs to obtain an anomaly graph, and obtaining an anomaly detection result of the anomaly detection image data based on the anomaly graph and the reconstructed feature anomaly graph.

Optionally, in an embodiment of the present application, before inputting the anomaly detection data into the Teacher-Student framework to obtain a plurality of pairs of feature graphs, and normalizing the plurality of pairs of feature graphs based on a preset feature processing policy, the method further includes: collecting training detection images meeting preset conditions; inputting the training detection image into a Teacher model to obtain a first feature map; training the Reconstructor network based on the first feature map.

Optionally, in an embodiment of the present application, the inputting the anomaly detection data into a Teacher-Student framework to obtain a plurality of pairs of feature graphs, and normalizing the plurality of pairs of feature graphs based on a preset feature processing policy includes: respectively inputting the detection images meeting preset conditions into the pre-trained Teacher model and the Student model to be trained to obtain the pairs of feature images; normalizing each pixel point of each pair of feature maps in the plurality of pairs of feature maps by using an L2 norm based on a channel dimension; and constructing a first loss function based on the normalization result to maintain consistency of the feature space of the Teacher model and the Student model.

Optionally, in an embodiment of the present application, after inputting the anomaly detection data into a Teacher-Student framework to obtain a plurality of pairs of feature graphs, normalizing the plurality of pairs of feature graphs based on a preset feature processing policy, the method further includes: processing the detection images meeting the preset conditions through a preset network layer in a Teacher model and a Student model respectively to obtain corresponding second feature images; inputting the second feature map to the Reconstructor network to respectively obtain reconstruction features corresponding to the Teacher model and the Student model; fixing the parameters of the Reconstructor network, and constructing a second loss function to keep the consistency of the reconstructed image feature space of the Teacher model and the Student model; based on the first and second loss functions, a total loss function is constructed to supervise training of the Student model.

Optionally, in one embodiment of the application, the mathematical representation of the total loss function is as follows:

L＝γ ₁ L ₁ +γ ₂ L ₂

wherein L is ₁ For weighted loss of all feature maps, gamma ₁ Is L ₁ Coefficient of loss function, L ₂ Reconstructing a loss function for a feature, gamma ₂ Is L ₂ Coefficients of the loss function.

An embodiment of the second aspect of the present application provides an abnormality detection device based on distillation learning, including: the first acquisition module is used for acquiring the abnormality detection data of the target; the processing module is used for inputting the abnormality detection data into a Teacher-Student framework to obtain a plurality of pairs of feature images, normalizing the pairs of feature images based on a preset feature processing strategy, and inputting a pair of feature images with the lowest resolution in the pairs of feature images into a pre-trained Reconstructor network to obtain a plurality of pairs of reconstructed feature images, subtracting each pair of reconstructed feature images to obtain an abnormality image, and obtaining an abnormality detection result of the abnormality detection image data based on the abnormality image and the reconstructed feature abnormality image.

Optionally, in one embodiment of the present application, further includes: the second acquisition module is used for acquiring training detection images meeting preset conditions before inputting the abnormality detection data into a Teacher-Student framework to obtain a plurality of pairs of feature images and normalizing the pairs of feature images based on a preset feature processing strategy; the first feature extraction module is used for inputting the training detection image into a Teacher model to obtain a first feature map; and the training module is used for training the Reconstructor network based on the first feature map.

Optionally, in one embodiment of the present application, the processing module includes: the second feature extraction module is used for respectively inputting the detection images meeting preset conditions into the pre-trained Teacher model and the Student model to be trained to obtain the multiple pairs of feature images; the normalization module is used for normalizing each pixel point of each pair of feature graphs in the plurality of pairs of feature graphs by using an L2 norm based on the channel dimension; and the function construction module is used for constructing a first loss function based on the normalization result so as to keep consistency of the feature space of the Teacher model and the Student model.

Optionally, in one embodiment of the present application, further includes: the third feature extraction module is used for inputting the abnormality detection data into a Teacher-Student framework to obtain a plurality of pairs of feature images, normalizing the pairs of feature images based on a preset feature processing strategy, and respectively processing the detection images meeting preset conditions through a preset network layer in the Teacher model and the Student model to obtain corresponding second feature images; the reconstruction module is used for inputting the second feature map to a trained Reconstructor network to respectively obtain reconstruction features corresponding to the Teacher model and the Student model; the parameter setting module is used for fixing the trained Reconstructor network parameters and constructing a second loss function so as to keep the consistency of the reconstructed image feature space of the Teacher model and the Student model; and the supervision module is used for constructing a total loss function based on the first loss function and the second loss function so as to supervise training of the Student model.

L＝γ ₁ L ₁ + ₂ L ₂

An embodiment of a third aspect of the present application provides an electronic device, including: the processor executes the program to realize the abnormality detection method based on distillation learning as described in the above embodiment.

An embodiment of the fourth aspect of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the abnormality detection method based on distillation learning as above.

Thus, embodiments of the present application have the following beneficial effects:

the embodiment of the application can collect the abnormality detection data of the target; inputting the abnormality detection data into a Teacher-Student framework to obtain a plurality of pairs of feature images, and normalizing the plurality of pairs of feature images based on a preset feature processing strategy; inputting a pair of feature maps with the lowest resolution in the pairs of feature maps into a pre-trained Reconstructor network to obtain a plurality of reconstructed feature maps, subtracting each pair of reconstructed feature maps to obtain an anomaly map, and acquiring an anomaly detection result of anomaly detection image data based on the anomaly map and the reconstructed feature anomaly map, so that the model fully learns feature distribution of different scales of a normal image, has the capability of reconstructing the normal image, effectively improves the robustness of the model, and improves the detection precision. Therefore, the problems that the existing pixel level anomaly detection method based on distillation learning is low in accuracy, poor in robustness and the like when facing anomaly data are solved.

Additional aspects and advantages of the application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the application.

Drawings

The foregoing and/or additional aspects and advantages of the application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:

fig. 1 is a flowchart of an abnormality detection method based on distillation learning according to an embodiment of the present application;

FIG. 2 is a schematic diagram of logic for performing an anomaly detection method based on distillation learning according to an embodiment of the present application;

fig. 3 is an exemplary diagram of an abnormality detection apparatus based on distillation learning according to an embodiment of the present application;

fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

The system comprises a 10-abnormality detection device based on distillation learning, a 100-first acquisition module, a 200-processing module, a 300-detection module, a 401-memory, a 402-processor and a 403-communication interface.

Detailed Description

Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative and intended to explain the present application and should not be construed as limiting the application.

The following describes an abnormality detection method, apparatus, device, and storage medium based on distillation learning according to an embodiment of the present application with reference to the accompanying drawings. In view of the above-mentioned problems in the background art, the present application provides an abnormality detection method based on distillation learning, in which abnormality detection data of a target is collected; inputting the abnormality detection data into a Teacher-Student framework to obtain a plurality of pairs of feature images, and normalizing the plurality of pairs of feature images based on a preset feature processing strategy; inputting a pair of feature maps with the lowest resolution in the pairs of feature maps into a pre-trained Reconstructor network to obtain a plurality of reconstructed feature maps, subtracting each pair of reconstructed feature maps to obtain an anomaly map, and acquiring an anomaly detection result of anomaly detection image data based on the anomaly map and the reconstructed feature anomaly map, so that the model fully learns feature distribution of different scales of a normal image, has the capability of reconstructing the normal image, effectively improves the robustness of the model, and improves the detection precision. Therefore, the problems that the existing pixel level anomaly detection method based on distillation learning is low in accuracy, poor in robustness and the like when facing anomaly data are solved.

Specifically, fig. 1 is a flowchart of an abnormality detection method based on distillation learning according to an embodiment of the present application.

As shown in fig. 1, the abnormality detection method based on distillation learning includes the steps of:

in step S101, abnormality detection data of a target is acquired.

In the embodiment of the application, firstly, the image data needing to be detected abnormally can be acquired, for example, in industrial vision, the image data of cloth, bearings and the like are acquired at fixed time by an industrial camera, so that reliable data support is provided for the subsequent detection of possible abnormalities such as color stain abnormality, off-line hole abnormality or texture structure abnormality and the like.

In step S102, the anomaly detection data is input to the Teacher-Student framework to obtain a plurality of pairs of feature graphs, and the plurality of pairs of feature graphs are normalized based on a preset feature processing policy.

After the anomaly detection data are acquired, the anomaly detection data can be respectively input into a Teacher network model and a Student network model in a Teacher-Student framework, wherein the Teacher model in the embodiment of the application is a pre-trained ResNet-18 network on an ImageNet data set, the Student model is an untrained ResNet-18 network which has the same structure as the Teacher network, so that a plurality of pairs of feature graphs are obtained, and the obtained pairs of feature graphs are normalized by utilizing L2 norms based on channel dimensions, so that a data basis is provided for realizing double consistency of the subsequent Teacher model and the Student model.

Optionally, in an embodiment of the present application, before inputting the anomaly detection data into the Teacher-Student framework to obtain a plurality of pairs of feature graphs, and normalizing the plurality of pairs of feature graphs based on a preset feature processing policy, the method further includes: collecting training detection images meeting preset conditions; inputting the training detection image into a Teacher model to obtain a first feature map; and training the reconstractor network based on the first feature map.

It should be noted that, before the anomaly detection data is respectively input to the Teacher-Student framework to obtain a plurality of pairs of feature graphs, the embodiment of the present application further needs to train the Reconstructor network,

specifically, since the Teacher model is a pre-trained res net-18 network on the ImageNet dataset, the re-structor network is similar to the Teacher network in structure, but the re-structor network is arranged in reverse order, and the convolution layer is replaced by a deconvolution layer, and the input of the re-structor network is a feature map of the Teacher with the lowest resolution, namely a feature map of the conv5 output.

Therefore, the embodiment of the application can train the reconstractor network by using normal images, and the Teacher model is recorded as T, the reconstractor network is R, and the data set is D= { I ₁ ,I ₂ ,…I _n Each picture is w×h×c in size; for image I _k The conv5 output feature map f through the Teacher model takes f as input of a re-constructor network, and after a series of deconvolution layers, the image output by the re-constructor network is I' _k The embodiment of the application can use the L2 loss function for supervision in the process of the reconstrator network training, and the L2 loss function is shown as the following formula:

therefore, the embodiment of the application trains the Reconstructor network by utilizing the Teacher network model, so that the image reconstructed by the Reconstructor network according to the feature map is similar to the original image, and the performance of subsequent model anomaly detection is effectively ensured.

Optionally, in an embodiment of the present application, inputting the anomaly detection data into a Teacher-Student framework to obtain a plurality of pairs of feature graphs, and normalizing the plurality of pairs of feature graphs based on a preset feature processing policy, including: respectively inputting detection images meeting preset conditions into a pre-trained Teacher model and a Student model to be trained to obtain a plurality of pairs of feature images; normalizing each pixel point of each pair of feature maps in the plurality of pairs of feature maps by using an L2 norm based on the channel dimension; and constructing a first loss function based on the normalization result to maintain consistency of the feature space of the Teacher model and the Student model.

In the embodiment of the application, the ResNet-18 network can generate pyramid type characteristics for the acquired image data, the bottom layer generates high-resolution characteristics which can encode bottom layer information such as texture, color and the like, the top layer generates low-resolution characteristics containing context information, different layers in the deep neural network correspond to different receptive fields, the Student network can learn characteristic information with different resolutions from the Teacher network, and the layered characteristic matching can detect anomalies with different scales.

It should be noted that, the training data of the Teacher-Student framework are all normal pictures, and feature images output by conv2, conv3 and conv4 of the Teacher network are used as targets of Student learning, so that the difference between the output of Student on a normal image and Teacher is as small as possible, wherein the normal image dataset can be expressed as D= { I ₁ ,I ₂ ,…I _n Input image I _k The first feature map output by Teacher and Student, with dimensions w×h×c, is written asAnd->The sizes are w _l ×h _l ×d _l 。

Specifically, first, the embodiment of the present application may normalize each pixel point position of the feature map with all resolutions output by the Student and the Teacher according to the channel dimension by the L2 norm, as shown in the following formula:

wherein (i, j) is the position coordinates,and->For the eigenvectors of Teacher and Student at position (i, j), is +.>

The embodiment of the application can use L2 loss function supervision in the training process of the Student model to reduce the distance between regularized feature vectors of the Student network model and the Teacher network model, and the mathematical expression of the loss function is as follows:

picture I _k Is a weighted loss of all resolution feature maps, the loss function can be expressed as:

wherein alpha is _l Is the weight of the first feature map.

Therefore, the embodiment of the application enables the Student network to receive multi-level mixed knowledge from the feature pyramid under better supervision by integrating the multi-scale feature matching strategy, so that the anomalies of different scales can be detected, the difference of the feature pyramids generated by the two networks is used as an evaluation function, the probability of anomaly occurrence of each pixel point in an image is calculated, and the feature representation of the Teacher model on a normal image is learned through the Student model, so that the Student network model and the Teacher network model have consistency in the extracted feature space.

Optionally, in an embodiment of the present application, after the anomaly detection data is input to the Teacher-Student framework to obtain a plurality of pairs of feature graphs, and the plurality of pairs of feature graphs are normalized based on a preset feature processing policy, the method further includes: processing the detection images meeting the preset conditions through a preset network layer in a Teacher model and a Student model respectively to obtain corresponding second feature images; inputting the second feature map to a Reconstructor network to respectively obtain reconstruction features corresponding to the Teacher model and the Student model; fixing the parameters of the Reconstructor network, and constructing a second loss function to keep consistency of the reconstructed image feature space of the Teacher model and the Student model; based on the first and second loss functions, a total loss function is constructed to supervise training of the Student model.

It should be noted that, after ensuring feature space consistency, embodiments of the present application may output a conv5 output feature map in a Student networkAs input to the Reconstructor network, reconstruction features are obtained respectively>At the same time, feature map of conv5 output in Teacher network>As input to the reconstractor network, reconstructed features are obtainedAs shown in fig. 2.

It should be noted that parameters of the reconstractor network need to be fixed in the training process, so that when the Student features are input as the reconstractor network, the reconstruction features are consistent with those of the Teacher network model, namely, the reconstruction features are minimizedAnd->The distance between them, embodiments of the present application can be supervised using an L2 loss function, the mathematical expression of which is as follows:

therefore, the embodiment of the application refines the knowledge learned by the pre-trained Teacher network to the Student network through the Teacher-S-traversal framework, trains the distance through the minimized feature, enables the features extracted by the Student and the Teacher to have consistency, so as to learn the distribution of the non-abnormal image, and enables the image reconstructed according to the Teacher and the Student to have consistency by utilizing the Reconstructor model.

Alternatively, in one embodiment of the application, the mathematical representation of the total loss function is as follows:

L＝γ ₁ L ₁ +γ ₂ L ₂

As will be appreciated by those skilled in the art, training of the Student network model requires that feature space consistency be guaranteed simultaneously with reconstructed image feature space consistency, and thus the total loss of training in embodiments of the present application can be defined as L ₁ And L ₂ Is shown below:

L＝γ ₁ L ₁ +γ ₂ L ₂

Therefore, the embodiment of the application can effectively avoid the problem of feature loss caused by feature compression by a convolution layer by designing the Reconstructor model, and introduce reconstruction consistency when training the Student network model, so that the features extracted by the Student network model are consistent with the Teacher network model in the reconstructed feature space, thereby increasing the robustness of the model and improving the detection precision.

In step S103, a pair of feature maps with the lowest resolution in the pairs of feature maps is input to a retraining network, so as to obtain a plurality of pairs of reconstructed feature maps, each pair of reconstructed feature maps is subtracted to obtain an anomaly map, and an anomaly detection result of the anomaly detection image data is obtained based on the anomaly map and the reconstructed feature anomaly map.

After training a Reconstructor model to reconstruct an image by using a Teacher network model and training a Student network model together by using the Teacher network model and the Reconstructor with fixed parameters, the embodiment of the application can further test the trained model, and the test process is as follows:

1. inputting the image I to be tested into a Teacher network model and a Student network model to respectively obtain feature imagesAnd->

2. Taking the first three feature images of the Teacher network model and the Student network model, subtracting the feature images with the same resolution, and normalizing according to the channel dimension to obtain three abnormal images omega ₁ ,Ω ₂ ,Ω ₃ I.e.

3. Will beAnd->Respectively inputting the trained Reconstructor models to obtain a reconstructed image I _t And I _s ；

4. Will I _t And I _s Subtracting and normalizing according to the channel dimension to obtain omega ₄ The final output anomaly graph is:

Ω＝Ω ₁ ·Ω ₂ ·Ω ₃ ·Ω ₄ 。

therefore, the embodiment of the application tests by using the distillation learning anomaly detection method with double consistencies, and locates the anomaly region by using the difference between the characteristics of the Student network model and the Teacher network model, thereby effectively improving the detection precision of anomaly data and the robustness of the model.

According to the anomaly detection method based on distillation learning provided by the embodiment of the application, anomaly detection data of a target are collected; inputting the abnormality detection data into a Teacher-Student framework to obtain a plurality of pairs of feature images, and normalizing the plurality of pairs of feature images based on a preset feature processing strategy; inputting a pair of feature maps with the lowest resolution in the pairs of feature maps into a pre-trained Reconstructor network to obtain a plurality of reconstructed feature maps, subtracting each pair of reconstructed feature maps to obtain an anomaly map, and acquiring an anomaly detection result of anomaly detection image data based on the anomaly map and the reconstructed feature anomaly map, so that the model fully learns feature distribution of different scales of a normal image, has the capability of reconstructing the normal image, effectively improves the robustness of the model, and improves the detection precision.

Next, an abnormality detection device based on distillation learning according to an embodiment of the present application will be described with reference to the drawings.

Fig. 3 is a block diagram schematically illustrating an abnormality detection apparatus based on distillation learning according to an embodiment of the present application.

As shown in fig. 3, the abnormality detection device 10 based on distillation learning includes: the device comprises a first acquisition module 100, a processing module 200 and a detection module 300.

The first acquisition module 100 is configured to acquire anomaly detection data of a target.

The processing module 200 is configured to input the anomaly detection data to the Teacher-Student framework, obtain a plurality of pairs of feature graphs, and normalize the plurality of pairs of feature graphs based on a preset feature processing policy.

The detection module 300 is configured to input a pair of feature maps with the lowest resolution of the pairs of feature maps to a pre-trained reconstrator network, obtain a plurality of pairs of reconstructed feature maps, subtract each pair of reconstructed feature maps to obtain an anomaly map, and obtain an anomaly detection result of the anomaly detection image data based on the anomaly map and the reconstructed feature anomaly map.

Optionally, in one embodiment of the present application, the abnormality detection device 10 based on distillation learning of the embodiment of the present application further includes: the device comprises a second acquisition module, a first characteristic extraction module and a training module.

The second acquisition module is used for acquiring training detection images meeting preset conditions before the abnormal detection data are input into the Teacher-Student framework to obtain a plurality of pairs of feature images and the plurality of pairs of feature images are normalized based on a preset feature processing strategy.

The first feature extraction module is used for inputting the training detection image into the Teacher model to obtain a first feature map.

And the training module is used for training the Reconstructor network based on the first feature map.

Optionally, in one embodiment of the present application, the processing module 200 includes: the system comprises a second feature extraction module, a normalization module and a function construction module.

The second feature extraction module is used for respectively inputting the detection images meeting the preset conditions into the pre-trained Teacher model and the Student model to be trained to obtain a plurality of pairs of feature images.

And the normalization module is used for normalizing each pixel point of each pair of feature maps in the plurality of pairs of feature maps by using the L2 norm based on the channel dimension.

And the function construction module is used for constructing a first loss function based on the normalization result so as to keep consistency of the feature space of the Teacher model and the Student model.

Optionally, in one embodiment of the present application, the abnormality detection device 10 based on distillation learning of the embodiment of the present application further includes: the system comprises a third feature extraction module, a reconstruction module, a parameter setting module and a supervision module.

The third feature extraction module is used for inputting the abnormal detection data into the Teacher-Student framework to obtain a plurality of pairs of feature images, normalizing the pairs of feature images based on a preset feature processing strategy, and then respectively processing the detection images meeting preset conditions through a preset network layer in the Teacher model and the Student model to obtain corresponding second feature images.

And the reconstruction module is used for inputting the second feature map to the trained Reconstructor network to respectively obtain reconstruction features corresponding to the Teacher model and the Student model.

And the parameter setting module is used for fixing the parameters of the re-constructor after the training is finished and constructing a second loss function so as to keep the consistency of the reconstructed image feature space of the Teacher model and the Student model.

And the supervision module is used for constructing a total loss function based on the first loss function and the second loss function so as to supervise training of the Student model.

L＝γ ₁ L ₁ +γ ₂ L ₂

It should be noted that the foregoing explanation of the embodiment of the abnormality detection method based on distillation learning is also applicable to the abnormality detection device based on distillation learning of this embodiment, and will not be repeated here.

According to the abnormality detection device based on distillation learning, provided by the embodiment of the application, the abnormality detection data of the target are collected; inputting the abnormality detection data into a Teacher-Student framework to obtain a plurality of pairs of feature images, and normalizing the plurality of pairs of feature images based on a preset feature processing strategy; inputting a pair of feature maps with the lowest resolution in the pairs of feature maps into a pre-trained Reconstructor network to obtain a plurality of reconstructed feature maps, subtracting each pair of reconstructed feature maps to obtain an anomaly map, and acquiring an anomaly detection result of anomaly detection image data based on the anomaly map and the reconstructed feature anomaly map, so that the model fully learns feature distribution of different scales of a normal image, has the capability of reconstructing the normal image, effectively improves the robustness of the model, and improves the detection precision.

Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device may include:

memory 401, processor 402, and a computer program stored on memory 401 and executable on processor 402.

The processor 402 implements the abnormality detection method based on distillation learning provided in the above-described embodiment when executing a program.

Further, the electronic device further includes:

a communication interface 403 for communication between the memory 401 and the processor 402.

A memory 401 for storing a computer program executable on the processor 402.

Memory 401 may comprise high-speed RAM memory or may also include non-volatile memory (non-volatile memory), such as at least one disk memory.

If the memory 401, the processor 402, and the communication interface 403 are implemented independently, the communication interface 403, the memory 401, and the processor 402 may be connected to each other by a bus and perform communication with each other. The bus may be an industry standard architecture (Industry Standard Architecture, abbreviated ISA) bus, an external device interconnect (Peripheral Component, abbreviated PCI) bus, or an extended industry standard architecture (Extended Industry Standard Architecture, abbreviated EISA) bus, among others. The buses may be divided into address buses, data buses, control buses, etc. For ease of illustration, only one thick line is shown in fig. 4, but not only one bus or one type of bus.

Alternatively, in a specific implementation, if the memory 401, the processor 402, and the communication interface 403 are integrated on a chip, the memory 401, the processor 402, and the communication interface 403 may complete communication with each other through internal interfaces.

The processor 402 may be a central processing unit (Central Processing Unit, abbreviated as CPU) or an application specific integrated circuit (Application Specific Integrated Circuit, abbreviated as ASIC) or one or more integrated circuits configured to implement embodiments of the present application.

The embodiment of the present application also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the abnormality detection method based on distillation learning as above.

In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or N embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.

Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present application, "N" means at least two, for example, two, three, etc., unless specifically defined otherwise.

Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and additional implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order from that shown or discussed, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present application.

Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., a ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or N wires, a portable computer cartridge (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer readable medium may even be paper or other suitable medium on which the program is printed, as the program may be electronically captured, via optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.

It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the N steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, may be implemented using any one or combination of the following techniques, as known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application specific integrated circuits having suitable combinational logic gates, programmable Gate Arrays (PGAs), field Programmable Gate Arrays (FPGAs), and the like.

Those of ordinary skill in the art will appreciate that all or a portion of the steps carried out in the method of the above-described embodiments may be implemented by a program to instruct related hardware, where the program may be stored in a computer readable storage medium, and where the program, when executed, includes one or a combination of the steps of the method embodiments.

In addition, each functional unit in the embodiments of the present application may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules may also be stored in a computer readable storage medium if implemented in the form of software functional modules and sold or used as a stand-alone product.

The above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, or the like. While embodiments of the present application have been shown and described above, it will be understood that the above embodiments are illustrative and not to be construed as limiting the application, and that variations, modifications, alternatives and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the application.

Claims

1. An abnormality detection method based on distillation learning is characterized by comprising the following steps:

collecting abnormal detection data of a target;

inputting the abnormality detection data into a Teacher-Student framework to obtain a plurality of pairs of feature graphs, normalizing the pairs of feature graphs based on a preset feature processing strategy, and

inputting a pair of feature maps with the lowest resolution ratio in the pairs of feature maps to a pre-trained Reconstructor network to obtain a plurality of pairs of reconstructed feature maps, subtracting each pair of reconstructed feature maps to obtain an anomaly map, and acquiring an anomaly detection result of the anomaly detection image data based on the anomaly map and the reconstructed feature anomaly map.

2. The method of claim 1, further comprising, prior to inputting the anomaly detection data to a Teacher-Student framework, obtaining a plurality of pairs of feature maps and normalizing the plurality of pairs of feature maps based on a preset feature processing policy:

collecting training detection images meeting preset conditions;

inputting the training detection image into a Teacher model to obtain a first feature map;

training the Reconstructor network based on the first feature map.

3. The method of claim 2, wherein inputting the anomaly detection data to a Teacher-Student framework to obtain a plurality of pairs of feature maps and normalizing the plurality of pairs of feature maps based on a preset feature processing policy comprises:

respectively inputting the detection images meeting preset conditions into the pre-trained Teacher model and the Student model to be trained to obtain the pairs of feature images;

normalizing each pixel point of each pair of feature maps in the plurality of pairs of feature maps by using an L2 norm based on a channel dimension;

and constructing a first loss function based on the normalization result to maintain consistency of the feature space of the Teacher model and the Student model.

4. The method of claim 3, further comprising, after inputting the anomaly detection data to a Teacher-Student framework to obtain a plurality of pairs of feature maps and normalizing the plurality of pairs of feature maps based on a preset feature processing policy:

processing the detection images meeting the preset conditions through a preset network layer in a Teacher model and a Student model respectively to obtain corresponding second feature images;

inputting the second feature map to the Reconstructor network to respectively obtain reconstruction features corresponding to the Teacher model and the Student model;

fixing the parameters of the Reconstructor network, and constructing a second loss function to keep the consistency of the reconstructed image feature space of the Teacher model and the Student model;

based on the first and second loss functions, a total loss function is constructed to supervise training of the Student model.

5. The method of claim 4, wherein the mathematical representation of the total loss function is as follows:

L＝γ ₁ L ₁ + ₂ L ₂

6. An abnormality detection device based on distillation learning, comprising:

the first acquisition module is used for acquiring the abnormality detection data of the target;

the processing module is used for inputting the abnormality detection data into a Teacher-Student framework to obtain a plurality of pairs of feature graphs, normalizing the pairs of feature graphs based on a preset feature processing strategy, and

the detection module is used for inputting a pair of feature images with the lowest resolution ratio in the pairs of feature images into a pre-trained Reconstructor network to obtain a plurality of pairs of reconstructed feature images, subtracting each pair of reconstructed feature images to obtain an anomaly image, and acquiring an anomaly detection result of the anomaly detection image data based on the anomaly image and the reconstructed feature anomaly image.

7. The apparatus as recited in claim 6, further comprising:

the second acquisition module is used for acquiring training detection images meeting preset conditions before inputting the abnormality detection data into a Teacher-Student framework to obtain a plurality of pairs of feature images and normalizing the pairs of feature images based on a preset feature processing strategy;

the first feature extraction module is used for inputting the training detection image into a Teacher model to obtain a first feature map;

8. The apparatus of claim 7, wherein the processing module comprises:

the second feature extraction module is used for respectively inputting the detection images meeting preset conditions into the pre-trained Teacher model and the Student model to be trained to obtain the multiple pairs of feature images;

the normalization module is used for normalizing each pixel point of each pair of feature graphs in the plurality of pairs of feature graphs by using an L2 norm based on the channel dimension;

9. The apparatus as recited in claim 8, further comprising:

the third feature extraction module is used for inputting the abnormality detection data into a Teacher-Student framework to obtain a plurality of pairs of feature images, normalizing the pairs of feature images based on a preset feature processing strategy, and respectively processing the detection images meeting preset conditions through a preset network layer in the Teacher model and the Student model to obtain corresponding second feature images;

the reconstruction module is used for inputting the second feature map to a trained Reconstructor network to respectively obtain reconstruction features corresponding to the Teacher model and the Student model;

the parameter setting module is used for fixing the trained Reconstructor network parameters and constructing a second loss function so as to keep the consistency of the reconstructed image feature space of the Teacher model and the Student model;

10. The apparatus of claim 9, wherein the mathematical representation of the total loss function is as follows:

L＝γ ₁ L ₁ + ₂ L ₂

11. An electronic device, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor executing the program to implement the distillation learning-based anomaly detection method of any one of claims 1-5.

12. A computer-readable storage medium having stored thereon a computer program, characterized in that the program is executed by a processor for realizing the abnormality detection method based on distillation learning according to any one of claims 1 to 5.