WO2023097638A1 - A fast anomaly detection method and system based on contrastive representation distillation


Info

Publication number
WO2023097638A1 (application PCT/CN2021/135278)
Authority
WO
WIPO (PCT)
Prior art keywords
network
output
teacher
feature
samples
Application number
PCT/CN2021/135278
Other languages
English (en)
French (fr)
Inventor
束岸楠
陈璨
黄举
Original Assignee
宁德时代新能源科技股份有限公司
Application filed by 宁德时代新能源科技股份有限公司
Priority to CN202180067533.XA (patent CN117099125A)
Priority to EP21960115.0A (patent EP4224379A4)
Priority to PCT/CN2021/135278 (patent WO2023097638A1)
Publication of WO2023097638A1
Priority to US18/356,229 (patent US12020425B2)


Classifications

    • G06N3/045 — Neural networks; architecture; combinations of networks
    • G06N3/0464 — Convolutional networks [CNN, ConvNet]
    • G06N3/096 — Learning methods; transfer learning
    • G06T7/001 — Industrial image inspection using an image reference approach
    • G06T2207/20084 — Artificial neural networks [ANN]
    • G06T2207/20224 — Image subtraction

Definitions

  • the present application relates to the field of artificial intelligence, in particular to a method and system for fast anomaly detection based on contrastive representation distillation.
  • Anomaly detection for industrial products usually relies on manual labeling of defects, which consumes considerable human resources and risks missed detection of unknown defect samples.
  • the present application provides a method and system for anomaly detection, which can quickly detect defects without manually marking defects, significantly reduce detection costs, and greatly improve quality inspection efficiency.
  • The present application provides a method for anomaly detection, including: acquiring a picture of an object to be detected; inputting the acquired picture respectively into a trained teacher network and a student network obtained by distillation from the teacher network, so as to obtain the feature map output by the teacher network and the feature map output by the student network, wherein the teacher network is trained by constructing defect samples and learning the feature distribution of normal samples from a pre-trained expert network; and determining the largest abnormal pixel in the difference map between the feature map output by the teacher network and the feature map output by the student network as the outlier value of the acquired picture, so as to output an anomaly detection result.
  • A three-layer network architecture is designed, namely an expert-teacher-student network architecture, so that only normal image data is needed to train the network, avoiding the large time and cost of image labeling. Moreover, by constructing defective samples, a more accurate feature space distribution of normal samples is obtained, which enhances the discrimination of hard negative samples and solves the problem of underfitting.
  • By using knowledge distillation to compress the network parameters, the network is ensured to have fast inference ability as well as strong robustness, realizing fast visual detection of product anomalies.
  • The feature maps of each level of the student network are distilled from the feature maps of the corresponding levels of the teacher network, and determining the largest abnormal pixel in the difference map between the feature maps as the outlier value of the acquired picture further includes: determining the largest abnormal pixel in the total difference map between the feature maps of each level with different resolutions output by the teacher network and the corresponding feature maps of each level output by the student network as the outlier value of the acquired picture, so as to output the anomaly detection result.
  • By learning the feature maps of the teacher network's intermediate layers, the knowledge expressed in those layers is transferred, so as to obtain a deeper network, better learn the generalization ability of the teacher network, and thus achieve more accurate anomaly detection.
  • the defect samples are constructed through data augmentation.
  • By using data augmentation to construct defect samples, the collection and labeling of abnormal product images is avoided, and the underfitting problem is solved by constructing hard negative samples.
  • The training of the teacher network includes the following steps, wherein the training data set includes normal defect-free samples and defect samples: inputting the normal defect-free samples into the pre-trained expert network and the teacher network, and minimizing, in the contrastive loss function, the distance between the last-layer feature vectors output by the expert network and the teacher network; inputting the normal defect-free samples into the pre-trained expert network and the defect samples into the teacher network, and maximizing, in the contrastive loss function, the distance between the last-layer feature vectors output by the expert network and the teacher network; and updating the parameters of the teacher network based on the loss calculation results. Thus, in the training of the teacher network, the features of normal samples are first learned from the pre-trained expert network, and a more compact feature space expression is obtained by constructing defect samples and maximizing the distance of the output features, making hard negative samples more discriminative.
  • The training of the student network includes the following steps, wherein the training data set includes normal samples: inputting the normal samples into the trained teacher network and the student network to obtain the feature maps of each level output by the teacher network and the corresponding feature maps of each level output by the student network; minimizing, in the distillation loss function, the distance between them; and updating the parameters of the student network based on the loss calculation results.
  • Outputting the anomaly detection result further includes: locating the abnormal region in the acquired picture based on the determined outlier value to obtain a segmented abnormal-region mask.
  • Anomaly detection results are visually represented by locating and segmenting abnormal regions in images to be detected based on outliers.
  • The present application provides a system for anomaly detection, including: an image acquisition module configured to acquire a picture of an object to be detected; a feature extraction module configured to input the acquired picture respectively into a trained teacher network and a student network obtained by distillation from the teacher network, so as to obtain the feature map output by the teacher network and the feature map output by the student network, wherein the teacher network is trained by constructing defect samples and learning the feature distribution of normal samples from a pre-trained expert network; and an anomaly detection module configured to determine the largest abnormal pixel in the difference map between the feature map output by the teacher network and the feature map output by the student network as the outlier value of the acquired picture, so as to output an anomaly detection result.
  • A three-layer network architecture is designed, namely an expert-teacher-student network architecture, so that only normal image data is needed to train the network, avoiding the large time and cost of image labeling. Moreover, by constructing defective samples, a more accurate feature space distribution of normal samples is obtained, which enhances the discrimination of hard negative samples and solves the problem of underfitting.
  • By using knowledge distillation to compress the network parameters, the network is ensured to have fast inference ability as well as strong robustness, realizing fast visual detection of product anomalies.
  • The feature maps of each level of the student network are obtained by distillation from the feature maps of the corresponding levels of the teacher network, and the anomaly detection module is further configured to: determine the largest abnormal pixel in the total difference map between the feature maps of each level with different resolutions output by the teacher network and the corresponding feature maps output by the student network as the outlier value of the acquired picture, so as to output the anomaly detection result.
  • By learning the feature maps of the teacher network's intermediate layers, the knowledge expressed in those layers is transferred, so as to obtain a deeper network, better learn the generalization ability of the teacher network, and thus achieve more accurate anomaly detection.
  • the defect samples are constructed through data augmentation.
  • By using data augmentation to construct defect samples, the collection and labeling of abnormal product images is avoided, and the underfitting problem is solved by constructing hard negative samples.
  • The training of the teacher network includes the following steps, wherein the training data set includes normal defect-free samples and defect samples: inputting the normal defect-free samples into the pre-trained expert network and the teacher network, and minimizing, in the contrastive loss function, the distance between the last-layer feature vectors output by the expert network and the teacher network; inputting the normal defect-free samples into the pre-trained expert network and the defect samples into the teacher network, and maximizing, in the contrastive loss function, the distance between the last-layer feature vectors output by the expert network and the teacher network; and updating the parameters of the teacher network based on the loss calculation results. Thus, in the training of the teacher network, the features of normal samples are first learned from the pre-trained expert network, and a more compact feature space expression is obtained by constructing defect samples and maximizing the distance of the output features, making hard negative samples more discriminative.
  • The training of the student network includes the following steps, wherein the training data set includes normal samples: inputting the normal samples into the trained teacher network and the student network to obtain the feature maps of each level output by the teacher network and the corresponding feature maps of each level output by the student network; minimizing, in the distillation loss function, the distance between them; and updating the parameters of the student network based on the loss calculation results.
  • The anomaly detection module is further configured to: locate the abnormal region in the acquired picture based on the determined outlier value to obtain a segmented abnormal-region mask. Anomaly detection results are visually represented by locating and segmenting abnormal regions in the images to be detected based on the outlier values.
  • The present application provides an apparatus for anomaly detection, including: a memory storing computer-executable instructions; and at least one processor, wherein the computer-executable instructions, when executed by the at least one processor, cause the apparatus to implement the method described in any one of the preceding aspects.
  • The apparatus uses the expert-teacher-student network architecture and trains the network using only normal image data, avoiding the time and cost of image labeling; by constructing defect samples, it obtains a more compact and consistent feature space distribution of normal samples, which enhances the discriminative power for hard negative samples and solves the problem of underfitting.
  • By using knowledge distillation to compress the network parameters, the network is ensured to have fast inference ability as well as strong robustness, realizing fast visual detection of product anomalies.
  • FIG. 1 is an example flowchart of a method for anomaly detection according to an embodiment of the present application
  • FIG. 2 is a schematic structural diagram of a three-layer network architecture for anomaly detection according to an embodiment of the present application
  • FIG. 3 is an example flowchart of a method for training a teacher network according to an embodiment of the present application
  • FIG. 4 is a schematic diagram of a method for training a teacher network according to an embodiment of the present application.
  • FIG. 5 is an example flowchart of a method for training a student network according to an embodiment of the present application
  • FIG. 6 is a schematic diagram of a method for training a student network according to an embodiment of the present application.
  • FIG. 7 is a schematic architecture diagram of a system for anomaly detection according to an embodiment of the present application.
  • FIG. 8 is a schematic architecture diagram of an apparatus for anomaly detection according to another embodiment of the present application.
  • Anomaly detection system 700: image acquisition module 701, feature extraction module 702, anomaly detection module 703; apparatus 800: memory 801 and processor 802.
  • Power batteries are used not only in energy storage power systems such as hydraulic, thermal, wind, and solar power plants, but also widely in electric vehicles such as electric bicycles, electric motorcycles, and electric cars, as well as in military equipment, aerospace, and other fields.
  • Sealing nail welding is an indispensable link in the production process of power batteries. Whether the sealing nail welding meets the standard directly affects the safety of the battery.
  • the welding area of the sealing nail is called the weld bead. Due to changes in temperature, environment, and laser angle during welding, there are often defects such as explosive lines (virtual welding) and melting beads on the weld bead.
  • the inventors have conducted in-depth research and designed a three-layer network architecture based on comparative representation distillation for fast anomaly detection.
  • This application uses only normal image data to train the network, avoiding the time and cost of image labeling, and constructs defect samples to prompt the network to correct the feature space distribution of normal samples according to these hard negative samples, so as to obtain a more discriminative spatial distribution, which solves the underfitting problem.
  • the application compresses the network parameters by using the technology of knowledge distillation, which not only ensures the fast reasoning ability of the network, but also makes the network have strong robustness, and realizes the rapid visual detection of product abnormalities.
  • the network proposed in this application only needs 5-10ms to infer a picture.
  • the network proposed in this application can still detect abnormal samples of unknown defect types.
  • The present application can be applied to the field of anomaly detection combined with artificial intelligence (AI). The method and system for anomaly detection disclosed in the embodiments of the present application can be used for, but are not limited to, anomaly detection of sealing-nail weld beads, and can also be used for anomaly detection of various other products in modern industrial manufacturing.
  • FIG. 1 is an exemplary flowchart of a method 100 for anomaly detection according to an embodiment of the present application.
  • FIG. 2 is a schematic structural diagram of a three-layer network architecture for anomaly detection according to an embodiment of the present application.
  • the method 100 starts at step 101, acquiring a picture of an object to be detected.
  • In step 102, the acquired picture is respectively input into the trained teacher network and the student network obtained by distillation from the teacher network, so as to obtain the feature map output by the teacher network and the feature map output by the student network, wherein the teacher network is trained by constructing defect samples and learning the feature distribution of normal samples from the pre-trained expert network.
  • In step 103, the largest abnormal pixel in the difference map between the feature map output by the teacher network and the feature map output by the student network is determined as the outlier value of the acquired picture, so as to output an anomaly detection result.
  • the teacher network in step 102 is a randomly initialized model, which assists in learning the feature distribution of normal samples from the pre-trained expert network by constructing defective samples, so as to obtain a more compact and accurate feature space expression.
  • the pre-trained expert network can be a network model trained in advance, which has powerful feature extraction and image classification capabilities.
  • The pre-trained expert network can be AlexNet, VGG, GoogLeNet, ResNet, DenseNet, SENet, ShuffleNet, or MobileNet.
  • the student network in step 102 is a randomly initialized model that is distilled from the trained teacher network. Knowledge distillation adopts the teacher-student model, and the complex and large network is used as the teacher network, while the structure of the student network is relatively simple.
  • The teacher network is used to assist the training of the student network: because the teacher network has strong learning ability, the knowledge it has learned can be transferred to the student network, which has relatively weak learning ability, so as to enhance the student network's generalization. After the acquired picture to be detected is input into the teacher network and the student network, the feature map output by each network can be obtained. Since the teacher network only teaches the student network its feature extraction and image classification ability on normal images, the student network has no knowledge of abnormal inputs, while the teacher network does; this leads to underlying differences in the behavior of the two networks.
  • This difference can be defined and quantified by the difference map between the feature maps output by the two networks: the largest abnormal pixel in the difference map is determined as the outlier value of the acquired picture and used as an anomaly detection indicator to locate anomalies in the image, thereby realizing pixel-level anomaly detection of product images.
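To make the difference-map computation above concrete, the following NumPy sketch (illustrative only, not part of the application; the function name anomaly_score is hypothetical) averages the squared teacher-student feature error over channels and takes the largest pixel as the outlier value:

```python
import numpy as np

def anomaly_score(teacher_feat, student_feat):
    """Difference map between same-resolution teacher and student feature maps.

    Both inputs are (C, H, W) arrays. The per-pixel squared error is averaged
    over channels; the image-level outlier value is the largest pixel.
    """
    diff_map = ((teacher_feat - student_feat) ** 2).mean(axis=0)  # (H, W)
    return diff_map, float(diff_map.max())

# Toy example: the student deviates from the teacher at a single pixel,
# standing in for a defect the student was never taught to reproduce.
teacher = np.zeros((8, 4, 4))
student = np.zeros((8, 4, 4))
student[:, 2, 3] = 1.0
diff_map, outlier = anomaly_score(teacher, student)
```

The pixel where the two networks disagree most both scores the image and locates the anomaly.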
  • The feature maps of each level of the student network are distilled from the feature maps corresponding to each level of the teacher network, and determining the largest abnormal pixel in the difference map between the feature maps output by the two networks as the outlier value of the obtained picture further includes: determining the largest abnormal pixel in the total difference map between the feature maps of each level with different resolutions output by the teacher network and the corresponding feature maps of each level output by the student network as the outlier value of the acquired picture, so as to output the anomaly detection result.
  • The student network not only learns the final-output knowledge of the teacher network, but also learns the intermediate-layer features in the teacher network structure. Specifically, the student network not only fits the soft targets of the teacher network, but also fits the outputs of the intermediate layers (features extracted by the teacher network) at different resolutions, thus making the student network deeper. Therefore, as shown in Figure 2, in the inference stage after training is completed, the acquired picture to be detected can be input into the teacher network and the student network obtained by feature distillation from the teacher network, each outputting feature maps at four different levels.
  • The sum of the mean squared errors of the feature maps of each level output by the two networks is used as the total difference map, that is, the outlier distribution map of the network, and the largest abnormal pixel in the total difference map is defined as the outlier value of the picture to be detected, which is used as the anomaly detection indicator.
  • the knowledge expressed in the middle layer is transferred by learning the feature map of the middle layer of the teacher network, so as to obtain a deeper network while compressing the model, and better learn the generalization ability of the teacher network.
  • the degree of anomaly is defined and quantified through the difference between the feature maps of each level, and the anomaly detection of the product image at the pixel level is realized.
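A minimal sketch of the multi-level total difference map described above (illustrative, not part of the application; nearest-neighbour upsampling and the function name total_anomaly_map are assumptions, since the application does not specify how resolutions are aligned):

```python
import numpy as np

def total_anomaly_map(teacher_feats, student_feats, out_hw):
    """Sum the per-level MSE maps into one total difference map.

    teacher_feats / student_feats: lists of (C, H_l, W_l) arrays, one pair per
    level; each level's (H_l, W_l) error map is nearest-neighbour upsampled to
    out_hw (assumed to be a multiple of every level's size) before summing.
    """
    H, W = out_hw
    total = np.zeros((H, W))
    for t, s in zip(teacher_feats, student_feats):
        err = ((t - s) ** 2).mean(axis=0)            # (H_l, W_l) per-level MSE map
        ry, rx = H // err.shape[0], W // err.shape[1]
        total += np.kron(err, np.ones((ry, rx)))     # upsample and accumulate
    return total

# Two levels at different resolutions, both deviating in the top-left corner.
t_feats = [np.zeros((3, 4, 4)), np.zeros((3, 2, 2))]
s_feats = [np.zeros((3, 4, 4)), np.zeros((3, 2, 2))]
s_feats[0][:, 0, 0] = 1.0
s_feats[1][:, 0, 0] = 2.0
total_map = total_anomaly_map(t_feats, s_feats, (4, 4))
```

Deviations that agree across levels reinforce each other in the total map, which is what makes its largest pixel a usable outlier value.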
  • FIG. 4 is a schematic diagram of a method for training a teacher network according to an embodiment of the present application, wherein defect samples may be constructed through data augmentation.
  • defect samples can be constructed through data augmentation, and the method of contrastive representation can be used to learn a more compact feature distribution of normal samples from the pre-trained expert network.
  • Data augmentation methods include, for example, Cutout, Random Erasing, and GridMask.
  • a defect image may be constructed by randomly selecting a region in a normal image and setting the pixel value of the region to 0 or other uniform values.
  • By using data augmentation to construct defect samples, the collection and labeling of abnormal product images is avoided; constructing hard negative samples solves the underfitting problem and yields a more discriminative feature space expression.
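A Cutout-style construction of defect samples, as described above, might look like the following sketch (illustrative only; the function name make_defect_sample and the patch size are hypothetical):

```python
import numpy as np

def make_defect_sample(image, patch, rng):
    """Cutout-style synthetic defect: zero out a random square region.

    image: (H, W) or (H, W, C) array of a normal (defect-free) product photo.
    Returns a copy with a patch-by-patch region erased, serving as a
    constructed hard negative sample for training the teacher network.
    """
    out = image.copy()
    h, w = image.shape[:2]
    y = int(rng.integers(0, h - patch + 1))
    x = int(rng.integers(0, w - patch + 1))
    out[y:y + patch, x:x + patch] = 0  # uniform value simulating a defect
    return out

rng = np.random.default_rng(0)
normal = np.ones((16, 16))
defect = make_defect_sample(normal, 4, rng)
```

Any uniform fill value (not only 0) would serve; the point is that no real defect images need to be collected or labeled.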
  • the method 300 starts at step 301, inputting normal and defect-free samples into the pre-trained expert network and teacher network, and minimizing the distance between the last layer feature vectors output by the expert network and the teacher network in the contrastive loss function.
  • the normal non-defective samples are input into the pre-trained expert network and the defective samples are input into the teacher network, and the distance between the last layer feature vectors output by the expert network and the teacher network is maximized in the contrastive loss function.
  • the parameters of the teacher network are updated based on the loss calculation result. Subsequently, each step of the method 300 is repeated, and continuous iterative optimization is performed until convergence.
  • Specifically, normal images (i.e., positive samples) are input into the expert network and the teacher network, and feature vectors are output from the last layer of each network; the distance between the last-layer feature vectors of the expert network and the teacher network for the positive samples (for example, the Euclidean distance) is minimized.
  • Defective images (i.e., negative samples) are input into the teacher network while normal samples are fed into the expert network; feature vectors are again output from the last layers of the two networks, and the distance between the last-layer feature vectors of the expert network and the teacher network is maximized, so that the teacher network better learns the distribution of normal samples in the feature space through the constructed negative samples and obtains a more compact feature space.
  • the contrastive loss function used in training is a loss function with the self-discovery property of hard negative samples. This property is crucial for learning high-quality self-supervised representations.
  • the teacher network can obtain a more compact feature space representation, making it more discriminative for hard negative samples.
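The pull-together/push-apart behaviour described above can be illustrated with a simple margin-based contrastive loss (a common textbook form used here as a stand-in; the application's actual contrastive loss, with its hard-negative self-discovery property, may differ):

```python
import numpy as np

def contrastive_loss(f_expert, f_teacher, is_positive, margin=1.0):
    """Margin-based contrastive loss on last-layer feature vectors.

    Positive pairs (expert(normal), teacher(normal)) are pulled together by
    penalising the squared Euclidean distance d**2; negative pairs
    (expert(normal), teacher(defect)) are pushed apart by penalising
    max(0, margin - d)**2. The margin value is an illustrative choice.
    """
    d = float(np.linalg.norm(np.asarray(f_expert) - np.asarray(f_teacher)))
    if is_positive:
        return d ** 2
    return max(0.0, margin - d) ** 2

# A positive pair with identical features incurs no loss; a negative pair
# with identical features incurs the full margin penalty.
pos_loss = contrastive_loss(np.ones(4), np.ones(4), is_positive=True)
neg_loss = contrastive_loss(np.ones(4), np.ones(4), is_positive=False)
```

Minimising this loss over both pair types is what tightens the teacher's feature space around normal samples while keeping constructed defects at a distance.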
  • FIG. 5 is an exemplary flowchart of a method 500 for training a student network according to an embodiment of the present application, wherein the training data set includes normal samples
  • FIG. 6 is a schematic diagram of a method for training a student network according to an embodiment of the present application.
  • the method 500 starts at step 501 , inputting normal samples into the trained teacher network and the student network to obtain feature maps of each level output by the teacher network and feature maps corresponding to each level output by the student network.
  • step 502 the distance between the feature maps of each level output by the teacher network and the feature maps corresponding to each level output by the student network is minimized in the distillation loss function.
  • the parameters of the student network are updated based on the loss calculation result.
  • each step of the method 500 is repeated, and continuous iterative optimization is performed until convergence.
  • Specifically, the intermediate layers are trained first. Since the output sizes of corresponding layers of the two networks may differ, an additional convolutional layer is used to match them; the intermediate layers of the student network are then trained by knowledge distillation so that they learn the outputs of the teacher network's hidden layers. That is, the training of the student model is guided by the loss against the teacher network, minimizing the distance between corresponding levels of the two networks (for example, minimizing the MSE difference).
  • the current parameters are used as the initialization parameters of the student network, and knowledge distillation is used to train all layer parameters of the student network, so that the student network learns the output of the teacher network.
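The per-level distillation objective described above can be sketched as follows (illustrative only; the function name distillation_loss is hypothetical, and the shapes are assumed to have already been matched by the additional convolutional layer):

```python
import numpy as np

def distillation_loss(teacher_feats, student_feats):
    """Sum of per-level MSEs between teacher and student feature maps.

    Each list entry is a (C, H, W) array; the pairs are assumed to already
    have matching shapes (in practice an extra convolutional layer adapts
    the student's outputs when the sizes differ).
    """
    return sum(float(((t - s) ** 2).mean())
               for t, s in zip(teacher_feats, student_feats))

# Identical features give zero loss; a unit offset at one level gives MSE 1.
t_feats = [np.zeros((2, 4, 4)), np.zeros((2, 2, 2))]
s_feats = [np.zeros((2, 4, 4)), np.ones((2, 2, 2))]
loss = distillation_loss(t_feats, s_feats)
```

In actual training this scalar would be backpropagated through the student only, with the teacher's parameters frozen.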
  • In this way, the teacher network distills the learned discriminative feature space into the student network, so that the student network better learns the expert network's feature extraction and image classification capabilities, and through model compression the network can detect anomalies with fewer parameters.
  • the network structure is simple, so it is easy to deploy and realize rapid detection.
  • The network can identify and detect defects in a way that traditional detection methods cannot: it can still correctly detect samples with unknown defect types, avoiding missed detection of unknown defects, and has strong robustness.
  • outputting the abnormality detection result further includes locating the abnormal region in the acquired picture based on the determined abnormal value to obtain the segmented abnormal region mask.
  • the abnormality detection result can be visually represented.
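The thresholding step implied above might be sketched as follows (illustrative; both the function name segment_anomaly and the use of a fixed threshold are assumptions, since the application does not specify how the mask is derived from the outlier values):

```python
import numpy as np

def segment_anomaly(diff_map, threshold):
    """Binary mask of abnormal pixels: difference map above a threshold.

    The threshold is a placeholder; in practice it would be calibrated,
    e.g. from a high percentile of difference values on normal images.
    """
    return diff_map > threshold

diff_map = np.array([[0.1, 0.2],
                     [0.9, 0.3]])
mask = segment_anomaly(diff_map, 0.5)
```

The resulting mask can be overlaid on the picture to be detected to visualise the located abnormal region.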
  • FIG. 7 is a schematic architecture diagram of a system 700 for anomaly detection according to an embodiment of the present application.
  • the system 700 includes at least an image acquisition module 701 , a feature extraction module 702 and an anomaly detection module 703 .
  • The image acquisition module 701 can be used to acquire pictures of the object to be detected.
  • The feature extraction module 702 can be used to input the acquired picture respectively into the trained teacher network and the student network obtained by distillation from the teacher network, to obtain the feature map output by the teacher network and the feature map output by the student network, wherein the teacher network is trained by constructing defect samples and learning the feature distribution of normal samples from the pre-trained expert network.
  • the anomaly detection module 703 can be used to determine the largest anomalous pixel in the difference map between the feature map output by the teacher network and the feature map output by the student network as an outlier value of the acquired picture, so as to output an anomaly detection result.
  • The system for anomaly detection constructs an expert-teacher-student three-layer network architecture. It first learns a more compact feature distribution of normal samples from the expert network through contrastive representation, which avoids the difficulty of collecting and labeling abnormal product images in traditional methods, saves substantial manpower and material resources, and solves the problem of underfitting; it then uses knowledge distillation to train a high-quality student network.
  • Learning from normal sample images instead of defect samples enables the network to correctly detect unknown defects and avoid missed detection of unknown defect types, giving it strong robustness.
  • The degree of abnormality is defined and quantified through the differences between feature maps, realizing fast pixel-level anomaly detection of product images.
  • The feature maps of each level of the student network are distilled from the feature maps of the corresponding levels of the teacher network, and the anomaly detection module can be further configured to determine the largest abnormal pixel in the total difference map between the feature maps of each level with different resolutions output by the teacher network and the corresponding feature maps of each level output by the student network as the outlier value of the acquired picture, so as to output the anomaly detection result.
  • feature distillation can be used to obtain a deeper network, better learn the generalization ability of the teacher network, and realize the ability to perform fast anomaly detection with fewer network parameters.
  • The degree of anomaly is defined and quantified through the differences between the feature maps of each level, realizing pixel-level anomaly detection of the product image.
  • FIG. 8 is a schematic architecture diagram of an apparatus 800 for anomaly detection according to another embodiment of the present application.
  • an apparatus 800 may include a memory 801 and at least one processor 802 .
  • the memory 801 may store computer-executable instructions.
  • the memory 801 may also store a trained teacher network and a student network distilled from the teacher network, wherein the teacher network is trained by constructing defect samples and learning the feature distribution of normal samples from the pre-trained expert network.
  • the device 800 performs the following operations: acquiring a picture of the object to be detected; inputting the acquired picture into the trained teacher network and the student network distilled from the teacher network to obtain the feature map output by the teacher network and the feature map output by the student network; and determining the largest abnormal pixel in the difference map between the feature map output by the teacher network and the feature map output by the student network as the outlier value of the acquired picture, so as to output the anomaly detection result.
  • the memory 801 may include RAM, ROM, or a combination thereof. In some cases, memory 801 may contain, among other things, a BIOS that may control basic hardware or software operations, such as interaction with peripheral components or devices.
  • Processor 802 may comprise an intelligent hardware device (e.g., a general-purpose processor, DSP, CPU, microcontroller, ASIC, FPGA, programmable logic device, discrete gate or transistor logic components, discrete hardware components, or any combination thereof).
  • the device for anomaly detection can learn a more compact feature distribution of normal samples by using the expert-teacher-student three-layer network architecture, which avoids the difficulty of collecting and labeling abnormal product images in traditional methods, saves considerable manpower and material resources, and solves the under-fitting problem; by learning normal sample images instead of defect samples, the network can correctly detect unknown defects, avoiding missed detection of unknown defect types, and is extremely robust. In addition, the degree of abnormality is defined and quantified through the difference between feature maps, realizing fast pixel-level anomaly detection of product images.
  • the computer-executable instructions when executed by at least one processor 802 , cause the apparatus 800 to perform various operations described above with reference to FIGS. 1-6 , and details are omitted here for brevity.
  • a general-purpose processor can be a microprocessor, but in the alternative, the processor can be any conventional processor, controller, microcontroller, or state machine.
  • a processor may also be implemented as a combination of computing devices (e.g., a combination of a DSP and a microprocessor, multiple microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration).
  • the functions described herein may be implemented in hardware, software executed by a processor, firmware, or any combination thereof. If implemented in software executed by a processor, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Other examples and implementations are within the scope of the disclosure and the following claims. For example, due to the nature of software, functions described herein can be implemented using software executed by a processor, hardware, firmware, hardwiring, or any combination thereof. Features implementing functions may also be physically located at various locations, including being distributed such that portions of functions are implemented at different physical locations.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

This application provides a method and system for anomaly detection. The method includes: acquiring a picture of an object to be detected; inputting the acquired picture into a trained teacher network and a student network distilled from the teacher network, respectively, to obtain a feature map output by the teacher network and a feature map output by the student network, wherein the teacher network is trained by constructing defect samples and learning the feature distribution of normal samples from a pre-trained expert network; and determining the largest abnormal pixel in the difference map between the feature map output by the teacher network and the feature map output by the student network as the outlier value of the acquired picture, so as to output an anomaly detection result.

Description

A Fast Anomaly Detection Method and System Based on Contrastive Representation Distillation

Technical Field

This application relates to the field of artificial intelligence, and in particular to a fast anomaly detection method and system based on contrastive representation distillation.
Background

In modern industrial manufacturing, industrial products inevitably exhibit certain anomalies or defects due to process and equipment factors. Anomaly detection of industrial products is therefore a key part of product quality inspection, and is very important for improving product processes and increasing production-line yield.

However, in traditional industrial manufacturing, anomaly detection of industrial products usually relies on manual defect labeling, which consumes considerable human resources and carries the risk of missing unknown defect samples.
Summary

In view of the above problems, this application provides a method and system for anomaly detection that can quickly detect defects without manual defect labeling, significantly reducing detection cost and greatly improving quality-inspection efficiency.

In a first aspect, this application provides a method for anomaly detection, including: acquiring a picture of an object to be detected; inputting the acquired picture into a trained teacher network and a student network distilled from the teacher network, respectively, to obtain a feature map output by the teacher network and a feature map output by the student network, wherein the teacher network is trained by constructing defect samples and learning the feature distribution of normal samples from a pre-trained expert network; and determining the largest abnormal pixel in the difference map between the feature map output by the teacher network and the feature map output by the student network as the outlier value of the acquired picture, so as to output an anomaly detection result.

In the technical solution of the embodiments of this application, a three-layer network architecture, namely the expert-teacher-student architecture, is designed so that the network can be trained using only normal image data, avoiding the large time and cost consumption of image annotation. By constructing defect samples, a more precise feature-space distribution of normal samples is obtained, which enhances the discriminability for hard negative samples and solves the under-fitting problem. In addition, knowledge distillation is used to compress the network parameters, ensuring fast inference while also giving the network strong robustness, realizing fast visual detection of product anomalies.
In some embodiments, the feature maps at each level of the student network are distilled from the feature maps at the corresponding levels of the teacher network, and determining the largest abnormal pixel in the difference map between the feature map output by the teacher network and the feature map output by the student network as the outlier value of the acquired picture further includes: determining the largest abnormal pixel in the total difference map between the feature maps of different resolutions at each level output by the teacher network and the corresponding feature maps at each level output by the student network as the outlier value of the acquired picture, so as to output the anomaly detection result. During knowledge distillation, learning the intermediate-layer feature maps of the teacher network transfers the knowledge expressed by the intermediate layers, yielding a deeper network that better learns the generalization ability of the teacher network, thus realizing more accurate anomaly detection.
In some embodiments, the defect samples are constructed through data augmentation. Constructing defect samples by data augmentation avoids collecting and labeling abnormal product images, and constructing hard negative samples solves the under-fitting problem.
In some embodiments, the training of the teacher network includes the following steps, where the training dataset includes normal defect-free samples and defect samples: inputting the normal defect-free samples into the pre-trained expert network and the teacher network, and minimizing, in a contrastive loss function, the distance between the last-layer feature vectors output by the expert network and the teacher network; inputting the normal defect-free samples into the pre-trained expert network and the defect samples into the teacher network, and maximizing, in the contrastive loss function, the distance between the last-layer feature vectors output by the expert network and the teacher network; and updating the parameters of the teacher network based on the loss computation result. Thus, during teacher training, the features of normal samples are first learned from the pre-trained expert network, and by constructing defect samples and maximizing the distance between the output features, a more compact feature-space representation is obtained, making the network more discriminative for hard negative samples.
In some embodiments, the training of the student network includes the following steps, where the training dataset includes normal samples: inputting the normal samples into the trained teacher network and the student network to obtain the feature maps at each level output by the teacher network and the corresponding feature maps at each level output by the student network; minimizing, in a distillation loss function, the distance between the feature maps at each level output by the teacher network and the corresponding feature maps at each level output by the student network; and updating the parameters of the student network based on the loss computation result. Thus, during student training, the teacher network distills the learned discriminative feature space to the student network so that the student network better learns the feature-extraction ability of the expert network, realizing more accurate anomaly detection.
In some embodiments, outputting the anomaly detection result further includes: localizing the abnormal region in the acquired picture based on the determined outlier value to obtain a segmented abnormal-region mask. Localizing and segmenting the abnormal region in the picture to be detected based on the outlier value intuitively presents the anomaly detection result.
In a second aspect, this application provides a system for anomaly detection, including: an image acquisition module configured to acquire a picture of an object to be detected; a feature extraction module configured to input the acquired picture into a trained teacher network and a student network distilled from the teacher network, respectively, to obtain a feature map output by the teacher network and a feature map output by the student network, wherein the teacher network is trained by constructing defect samples and learning the feature distribution of normal samples from a pre-trained expert network; and an anomaly detection module configured to determine the largest abnormal pixel in the difference map between the feature map output by the teacher network and the feature map output by the student network as the outlier value of the acquired picture, so as to output an anomaly detection result.

In the technical solution of the embodiments of this application, a three-layer network architecture, namely the expert-teacher-student architecture, is designed so that the network can be trained using only normal image data, avoiding the large time and cost consumption of image annotation. By constructing defect samples, a more precise feature-space distribution of normal samples is obtained, which enhances the discriminability for hard negative samples and solves the under-fitting problem. In addition, knowledge distillation is used to compress the network parameters, ensuring fast inference while also giving the network strong robustness, realizing fast visual detection of product anomalies.
In some embodiments, the feature maps at each level of the student network are distilled from the feature maps at the corresponding levels of the teacher network, wherein the anomaly detection module is further configured to: determine the largest abnormal pixel in the total difference map between the feature maps of different resolutions at each level output by the teacher network and the corresponding feature maps at each level output by the student network as the outlier value of the acquired picture, so as to output the anomaly detection result. During knowledge distillation, learning the intermediate-layer feature maps of the teacher network transfers the knowledge expressed by the intermediate layers, yielding a deeper network that better learns the generalization of the teacher network, thus realizing more accurate anomaly detection.

In some embodiments, the defect samples are constructed through data augmentation. Constructing defect samples by data augmentation avoids collecting and labeling abnormal product images, and constructing hard negative samples solves the under-fitting problem.
In some embodiments, the training of the teacher network includes the following steps, where the training dataset includes normal defect-free samples and defect samples: inputting the normal defect-free samples into the pre-trained expert network and the teacher network, and minimizing, in a contrastive loss function, the distance between the last-layer feature vectors output by the expert network and the teacher network; inputting the normal defect-free samples into the pre-trained expert network and the defect samples into the teacher network, and maximizing, in the contrastive loss function, the distance between the last-layer feature vectors output by the expert network and the teacher network; and updating the parameters of the teacher network based on the loss computation result. Thus, during teacher training, the features of normal samples are first learned from the pre-trained expert network, and by constructing defect samples and maximizing the distance between the output features, a more compact feature-space representation is obtained, making the network more discriminative for hard negative samples.

In some embodiments, the training of the student network includes the following steps, where the training dataset includes normal samples: inputting the normal samples into the trained teacher network and the student network to obtain the feature maps at each level output by the teacher network and the corresponding feature maps at each level output by the student network; minimizing, in a distillation loss function, the distance between the feature maps at each level output by the teacher network and the corresponding feature maps at each level output by the student network; and updating the parameters of the student network based on the loss computation result. Thus, during student training, the teacher network distills the learned discriminative feature space to the student network so that the student network better learns the feature-extraction ability of the expert network, realizing more accurate anomaly detection.
In some embodiments, the anomaly detection module is further configured to: localize the abnormal region in the acquired picture based on the determined outlier value to obtain a segmented abnormal-region mask. Localizing and segmenting the abnormal region in the picture to be detected based on the outlier value intuitively presents the anomaly detection result.

In a third aspect, this application provides an apparatus for anomaly detection, including: a memory storing computer-executable instructions; and at least one processor, the computer-executable instructions, when executed by the at least one processor, causing the apparatus to implement the method according to any one of the preceding aspects.

In the technical solution of the embodiments of this application, the apparatus uses the expert-teacher-student network architecture and trains the network using only normal image data, avoiding the large time and cost consumption of image annotation; by constructing defect samples, a more compact feature-space distribution of normal samples is obtained, which enhances the discriminability for hard negative samples and solves the under-fitting problem. In addition, knowledge distillation is used to compress the network parameters, ensuring fast inference while also giving the network strong robustness, realizing fast visual detection of product anomalies.

The above description is only an overview of the technical solution of this application. In order to understand the technical means of this application more clearly, it may be implemented in accordance with the content of the specification; and in order to make the above and other objects, features and advantages of this application more apparent and understandable, specific embodiments of this application are set forth below.
Brief Description of the Drawings

In order that the manner in which the above features of this application are obtained can be understood in detail, the content briefly summarized above may be described more specifically with reference to embodiments, some of which are illustrated in the accompanying drawings. It should be noted, however, that the drawings show only certain typical aspects of this application and therefore should not be considered limiting of its scope, as the description may admit other equally effective aspects.
FIG. 1 is an example flowchart of a method for anomaly detection according to an embodiment of this application;

FIG. 2 is a structural schematic diagram of a three-layer network architecture for anomaly detection according to an embodiment of this application;

FIG. 3 is an example flowchart of a method for training a teacher network according to an embodiment of this application;

FIG. 4 is a schematic diagram of a method for training a teacher network according to an embodiment of this application;

FIG. 5 is an example flowchart of a method for training a student network according to an embodiment of this application;

FIG. 6 is a schematic diagram of a method for training a student network according to an embodiment of this application;

FIG. 7 is a schematic architecture diagram of a system for anomaly detection according to an embodiment of this application;

FIG. 8 is a schematic architecture diagram of an apparatus for anomaly detection according to another embodiment of this application.
The reference numerals in the detailed description are as follows:

anomaly detection system 700, image acquisition module 701, feature extraction module 702, anomaly detection module 703;

apparatus 800, memory 801, processor 802.
Detailed Description

Embodiments of the technical solution of this application will be described in detail below with reference to the accompanying drawings. The following embodiments are only used to illustrate the technical solution of this application more clearly and are therefore only examples; they cannot be used to limit the protection scope of this application.

Unless otherwise defined, all technical and scientific terms used herein have the same meanings as commonly understood by those skilled in the technical field of this application. The terms used herein are only for the purpose of describing specific embodiments and are not intended to limit this application. The terms "including" and "having" and any variations thereof in the specification, claims and above description of the drawings of this application are intended to cover non-exclusive inclusion.

In the description of the embodiments of this application, "a plurality of" means two or more, unless otherwise clearly and specifically defined. Reference herein to an "embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment may be included in at least one embodiment of this application. The appearance of this phrase in various places in the specification does not necessarily refer to the same embodiment, nor to an independent or alternative embodiment mutually exclusive with other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.

In the description of the embodiments of this application, the term "and/or" merely describes an association relationship between associated objects, indicating that three relationships may exist; for example, A and/or B may mean: A alone, both A and B, or B alone. In addition, the character "/" herein generally indicates an "or" relationship between the associated objects.
At present, judging from the development of the market, power batteries are used ever more widely. Power batteries are not only used in energy-storage power systems such as hydraulic, thermal, wind, and solar power stations, but are also widely used in electric vehicles such as electric bicycles, electric motorcycles, and electric cars, as well as in fields such as military equipment and aerospace. As the application fields of power batteries keep expanding, their market demand keeps growing. Sealing-pin welding is an indispensable link in the production process of power batteries, and whether the sealing-pin welding meets the standard directly affects battery safety. The sealing-pin welding area is called the weld bead; owing to variations in temperature, environment, laser angle, and so on during welding, defects such as burst lines (cold welds) and melted beads often occur on the weld bead.
With the development of machine vision and industrial automation, methods for automatically detecting anomalies based on artificial intelligence exist. However, in current deep-learning-based visual defect-detection methods for industrial products, a large number of defect samples must first be collected and accurately labeled to serve as the network's training dataset. In actual production, defect samples are scarce, and the data-labeling process occupies a large part of model-development time, consuming considerable time and labor. In addition, actual products exhibit a great variety of defect types, and an object-detection network requires an accurate definition of each defect type; in real scenarios it is often impossible to accurately define all defect types, creating a risk of missing unknown defect samples, while too many defect types lead to too many model parameters and an oversized model, affecting deployment and robustness. Moreover, training a network using only a knowledge-distillation-based method is also problematic, because only normal samples are used during training and the model is built only for normal samples. The learned feature space then suffers from data under-fitting, because in practical applications there are various kinds of anomalies, and some abnormal samples are even very close to the normal training samples.
Based on the above considerations, in order to solve the problems of large-scale sample labeling, oversized models, and under-fitting in anomaly detection, the inventors, after in-depth research, designed a three-layer network architecture based on contrastive representation distillation for fast anomaly detection. This application trains the network using only normal image data, avoiding the large time and cost consumption of image annotation, and constructs defect samples to prompt the network to correct the feature-space distribution of normal samples according to these hard negative samples, thereby obtaining a more discriminative spatial distribution and solving the under-fitting problem. In addition, this application compresses the network parameters using knowledge distillation, ensuring fast inference while giving the network strong robustness, realizing fast visual detection of product anomalies. Compared with the inference speed of previous algorithms, the network proposed in this application needs only 5-10 ms to infer one image; moreover, for unknown abnormal samples, the proposed network can still detect abnormal samples of unknown defect types.
It can be appreciated that this application can be applied to the field of anomaly detection combined with artificial intelligence (AI). The method and system for anomaly detection disclosed in the embodiments of this application may be used for, but are not limited to, anomaly detection of sealing-pin weld beads, and may also be used for anomaly detection of various other kinds of products in modern industrial manufacturing.
FIG. 1 is an example flowchart of a method 100 for anomaly detection according to an embodiment of this application. According to an embodiment of this application, refer to FIG. 1 and further to FIG. 2, which is a structural schematic diagram of a three-layer network architecture for anomaly detection according to an embodiment of this application. The method 100 starts at step 101, acquiring a picture of an object to be detected. In step 102, the acquired picture is input into a trained teacher network and a student network distilled from the teacher network, respectively, to obtain the feature map output by the teacher network and the feature map output by the student network, wherein the teacher network is trained by constructing defect samples and learning the feature distribution of normal samples from a pre-trained expert network. In step 103, the largest abnormal pixel in the difference map between the feature map output by the teacher network and the feature map output by the student network is determined as the outlier value of the acquired picture, so as to output the anomaly detection result.
As shown in FIG. 2, the teacher network in step 102 is a randomly initialized model that learns the feature distribution of normal samples from the pre-trained expert network with the aid of constructed defect samples, thereby obtaining a more compact and precise feature-space representation. The pre-trained expert network may be a network model trained in advance that has strong feature-extraction and image-classification capabilities. For example, the pre-trained expert network may be AlexNet, VGG, GoogLeNet, ResNet, DenseNet, SENet, ShuffleNet, or MobileNet. The student network in step 102 is a randomly initialized model distilled from the trained teacher network. Knowledge distillation adopts a teacher-student mode: the complex, large network serves as the teacher network, while the student network has a relatively simple structure, and the teacher network assists the training of the student network. Because the teacher network has strong learning ability, the knowledge it has learned can be transferred to the student network with relatively weak learning ability, thereby enhancing the generalization ability of the student network. After the acquired picture to be detected is input into the teacher network and the student network, the feature map output by the teacher network and the feature map output by the student network can be obtained. Since the teacher network only teaches the student network the ability to extract features from and classify normal images, the student network has no knowledge of abnormal inputs, whereas the teacher network does; this leads to a potential difference in the behavior of the two networks on abnormal inputs. This difference can be defined and quantified through the difference map between the feature maps output by the two networks; that is, the largest abnormal pixel in the difference map is determined as the outlier value of the acquired picture and serves as the anomaly-detection indicator to localize the anomaly in the picture, realizing pixel-level anomaly detection of product images.
Thus, by constructing the expert-teacher-student network architecture, a more compact feature distribution of normal samples can first be learned from the expert network through contrastive representation, which avoids the difficulty of collecting and labeling abnormal product images in traditional methods, saves considerable manpower and material resources, and solves the under-fitting problem; knowledge distillation is then used to train a high-quality student network, realizing the transfer of feature-extraction and classification capabilities. The degree of abnormality is defined and quantified through the difference between feature maps, realizing fast pixel-level anomaly detection of product images.

According to an embodiment of this application, optionally, continuing to refer to FIG. 2, the feature maps at each level of the student network are distilled from the feature maps at the corresponding levels of the teacher network, wherein determining the largest abnormal pixel in the difference map between the feature map output by the teacher network and the feature map output by the student network as the outlier value of the acquired picture further includes: determining the largest abnormal pixel in the total difference map between the feature maps of different resolutions at each level output by the teacher network and the corresponding feature maps at each level output by the student network as the outlier value of the acquired picture, so as to output the anomaly detection result.
As can be seen in FIG. 2, the student network learns not only the resulting knowledge of the teacher network but also the intermediate-layer features within the teacher network's structure. Specifically, the student network fits not only the soft targets of the teacher network but also the outputs of the intermediate layers at different resolutions (the features extracted by the teacher network), making the student network deeper. Thus, as shown in FIG. 2, in the inference stage after training, the acquired picture to be detected can be input into the teacher network and into the student network feature-distilled from it, each outputting four feature maps at different levels; the sum of the mean squared differences of the feature maps at each level output by the two networks is then taken as the total difference map, i.e., the outlier distribution map of the network, and the largest abnormal pixel in this total difference map is defined as the outlier value of the picture to be detected, serving as the anomaly-detection indicator.
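The level-wise scoring just described can be sketched in a few lines. The following is a minimal NumPy illustration, not the application's implementation: the function name, the nearest-neighbour upsampling via `np.kron`, and the toy shapes are all assumptions for the example. Per-level squared differences between teacher and student feature maps are upsampled to a common resolution and summed into a total difference map, whose maximum pixel serves as the image-level outlier value.

```python
import numpy as np

def anomaly_map(teacher_feats, student_feats, out_size):
    """Sum per-level squared differences between teacher and student
    feature maps (each of shape (C, H_l, W_l)) into one total difference
    map of shape `out_size`; the maximum pixel is the image-level
    anomaly score."""
    total = np.zeros(out_size, dtype=np.float64)
    for t, s in zip(teacher_feats, student_feats):
        # mean squared difference across channels -> (H_l, W_l)
        level = ((t - s) ** 2).mean(axis=0)
        # nearest-neighbour upsample to the output resolution
        ry = out_size[0] // level.shape[0]
        rx = out_size[1] // level.shape[1]
        total += np.kron(level, np.ones((ry, rx)))
    return total, float(total.max())
```

For real networks the feature maps would come from the four levels of the teacher and student backbones, and bilinear interpolation would typically replace the nearest-neighbour upsampling used here for brevity.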
Thus, during knowledge distillation, learning the intermediate-layer feature maps of the teacher network transfers the knowledge expressed by the intermediate layers, yielding a deeper network while compressing the model, better learning the generalization ability of the teacher network, and realizing fast anomaly detection with fewer network parameters. In addition, the degree of abnormality is defined and quantified through the differences between the feature maps at each level, realizing pixel-level anomaly detection of product images.

According to an embodiment of this application, optionally, refer to FIG. 4, which is a schematic diagram of a method for training a teacher network according to an embodiment of this application, wherein the defect samples may be constructed through data augmentation.
When training the teacher network, defect samples can be constructed through data augmentation, and a contrastive-representation method is used to learn a more compact feature distribution of normal samples from the pre-trained expert network. Data-augmentation methods include, for example, Cutout, Random Erasing, and GridMask. For example, a defect image can be constructed by randomly selecting a region in a normal image and setting the pixel values of the region to 0 or another uniform value.
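A Cutout-style defect construction of the kind described above can be sketched as follows; this is an illustrative helper, and the patch size and fill value are assumed hyper-parameters, not values prescribed by the application.

```python
import numpy as np

def make_defect(image, rng, size=8, fill=0.0):
    """Construct a synthetic defect ('hard negative') by erasing a random
    square patch from a normal image, in the spirit of Cutout / Random
    Erasing. `size` and `fill` are illustrative choices."""
    h, w = image.shape[:2]
    defect = image.copy()
    y = rng.integers(0, h - size + 1)  # random top-left corner
    x = rng.integers(0, w - size + 1)
    defect[y:y + size, x:x + size] = fill
    return defect
```

In practice the erased region could also be filled with noise or a grid pattern (GridMask); the key point is that negatives are manufactured from normal images rather than collected.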
Constructing defect samples through data augmentation avoids collecting and labeling abnormal product images; constructing hard negative samples solves the under-fitting problem and yields a more discriminative feature-space representation.

According to an embodiment of this application, optionally, refer to FIGS. 3 to 4. FIG. 3 is an example flowchart of a method 300 for training a teacher network according to an embodiment of this application, wherein the training dataset includes normal defect-free samples and defect samples. The method 300 starts at step 301, inputting the normal defect-free samples into the pre-trained expert network and the teacher network, and minimizing, in a contrastive loss function, the distance between the last-layer feature vectors output by the expert network and the teacher network. In step 302, the normal defect-free samples are input into the pre-trained expert network and the defect samples are input into the teacher network, and the distance between the last-layer feature vectors output by the expert network and the teacher network is maximized in the contrastive loss function. In step 303, the parameters of the teacher network are updated based on the loss computation result. The steps of the method 300 are then repeated for continuous iterative optimization until convergence.
As shown in FIG. 4, when training the teacher network, normal images (i.e., positive samples) are first input into the pre-trained expert network and the teacher network, and feature vectors are output from the last layer of each network; the distance (for example, the Euclidean distance) between the last-layer feature vectors of the expert network and the teacher network for positive samples is minimized in the contrastive loss function, so that the teacher network learns the feature distribution of normal samples from the expert network. Then defect images (i.e., negative samples), constructed for example through data augmentation, are input into the teacher network while normal samples are input into the expert network; feature vectors are output from the last layer of each network, and the distance between the last-layer feature vectors of the expert network and the teacher network is maximized in the contrastive loss function, so that the constructed negative samples help the teacher network better learn the distribution of normal samples in the feature space, prompting the teacher network to obtain a more compact feature-space representation. The contrastive loss function used in training has the property of spontaneously discovering hard negative samples, a property crucial for learning high-quality self-supervised representations. The point of focusing on hard negative samples is that samples already far away need not be pushed further; the focus is on pushing away those samples that are not yet far away, making the resulting feature space more uniform and compact.
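The two objectives above (pull the teacher's embedding of a normal sample toward the expert's, push its embedding of a constructed defect at least some distance away) can be written as one margin-style contrastive loss. This is a hedged sketch: the application specifies only minimizing/maximizing the last-layer feature distances, so the margin form and the default value below are assumptions for illustration.

```python
import numpy as np

def contrastive_loss(e_normal, t_normal, t_defect, margin=1.0):
    """Margin-based contrastive objective for teacher training:
    - pull: squared distance between the expert's and teacher's
      embeddings of the same normal sample (to be minimized);
    - push: hinge term that is nonzero only while the teacher's defect
      embedding is within `margin` of the expert's normal embedding,
      so already-distant ('easy') negatives contribute nothing."""
    pull = np.linalg.norm(e_normal - t_normal) ** 2
    push = max(0.0, margin - np.linalg.norm(e_normal - t_defect)) ** 2
    return pull + push
```

The hinge makes the hard-negative focus explicit: gradients flow only from negatives that are not yet `margin` away, matching the "push away what is not yet far" behavior described in the text.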
Thus, by training the teacher network with contrastive representation and by constructing defect samples, the teacher network can obtain a more compact feature-space representation, making it more discriminative for hard negative samples.

According to an embodiment of this application, optionally, refer to FIGS. 5 to 6. FIG. 5 is an example flowchart of a method 500 for training a student network according to an embodiment of this application, wherein the training dataset includes normal samples; FIG. 6 is a schematic diagram of a method for training a student network according to an embodiment of this application. The method 500 starts at step 501, inputting the normal samples into the trained teacher network and the student network to obtain the feature maps at each level output by the teacher network and the corresponding feature maps at each level output by the student network. In step 502, the distance between the feature maps at each level output by the teacher network and the corresponding feature maps at each level output by the student network is minimized in a distillation loss function. In step 503, the parameters of the student network are updated based on the loss computation result. The steps of the method 500 are then repeated for continuous iterative optimization until convergence.
As shown in FIG. 6, when training the student network, normal images are input into the trained teacher network and the student network. Four feature maps at different levels are taken from the two networks, where the four feature maps have different resolutions, and a student distillation loss function (for example, an L2 loss function) is designed to minimize the distance between corresponding levels across the two networks. For example, during the first stage, the intermediate layers are trained first; since the output sizes at each level of the two networks may differ, an additional convolutional layer is attached to match the output sizes at each level, and the intermediate layers of the student network are then trained by knowledge distillation so that they learn the outputs of the teacher network's hidden layers; that is, the training of the student model is guided by the teacher network's loss so that the distance between the corresponding levels of the two networks is minimized (for example, the MSE difference is minimized). During the second stage, after the intermediate layers of the student network have been trained, the current parameters are used as the initialization parameters of the student network, and all layer parameters of the student network are trained by knowledge distillation so that the student network learns the output of the teacher network.
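The level-wise distillation objective described above can be sketched as a sum of per-level mean squared errors. The helper below is illustrative only and assumes the teacher and student feature maps already have matching shapes (the text notes that an extra convolutional layer is attached when they do not).

```python
import numpy as np

def distill_loss(teacher_levels, student_levels):
    """Feature-distillation objective for student training: mean squared
    error between each teacher-level feature map and the corresponding
    student-level feature map, summed over all levels."""
    return sum(float(((t - s) ** 2).mean())
               for t, s in zip(teacher_levels, student_levels))
```

In a real training loop this scalar would be backpropagated through the student only, with the teacher's parameters frozen.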
Thus, during student training, the teacher network distills the learned discriminative feature space to the student network so that the student network better learns the feature-extraction and image-classification capabilities of the expert network; model compression gives the network the ability to perform anomaly detection with fewer network parameters, and the simple network structure makes it easy to deploy and enables fast detection. In addition, learning from normal sample images rather than defect samples gives the network a defect-recognition capability that traditional detection methods lack: samples with unknown defect types can still be correctly detected, avoiding missed detection of unknown defect types, with extremely strong robustness.

According to an embodiment of this application, optionally, continuing to refer to FIG. 2, outputting the anomaly detection result further includes localizing the abnormal region in the acquired picture based on the determined outlier value to obtain a segmented abnormal-region mask.
Thus, localizing and segmenting the abnormal region in the picture to be detected based on the outlier value intuitively presents the anomaly detection result.
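Localizing the abnormal region from the outlier map can be as simple as thresholding the total difference map into a binary segmentation mask; the threshold below is an assumed hyper-parameter, not one specified by the application.

```python
import numpy as np

def anomaly_mask(diff_map, threshold):
    """Binarize the total difference map: pixels whose anomaly value
    exceeds `threshold` are marked 1 (abnormal region), others 0."""
    return (diff_map > threshold).astype(np.uint8)
```

In practice the threshold could be calibrated on a held-out set of normal images, e.g. from the distribution of their maximum pixel values.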
FIG. 7 is a schematic architecture diagram of a system 700 for anomaly detection according to an embodiment of this application. According to an embodiment of this application, referring to FIG. 7, the system 700 includes at least an image acquisition module 701, a feature extraction module 702, and an anomaly detection module 703. The image acquisition module 701 may be used to acquire a picture of an object to be detected. The feature extraction module 702 may be used to input the acquired picture into a trained teacher network and a student network distilled from the teacher network, respectively, to obtain the feature map output by the teacher network and the feature map output by the student network, wherein the teacher network is trained by constructing defect samples and learning the feature distribution of normal samples from a pre-trained expert network. The anomaly detection module 703 may be used to determine the largest abnormal pixel in the difference map between the feature map output by the teacher network and the feature map output by the student network as the outlier value of the acquired picture, so as to output the anomaly detection result.
Corresponding to the above method 100 for anomaly detection, the system for anomaly detection according to this application, by constructing an expert-teacher-student three-layer network architecture, can first learn a more compact feature distribution of normal samples from the expert network through contrastive representation, avoiding the difficulty of collecting and labeling abnormal product images in traditional methods, saving considerable manpower and material resources, and solving the under-fitting problem; it then uses knowledge distillation to train a high-quality student network. Learning from normal sample images rather than defect samples enables the network to correctly detect unknown defects, avoiding missed detection of unknown defect types, with extremely strong robustness. In addition, the degree of abnormality is defined and quantified through the difference between feature maps, realizing fast pixel-level anomaly detection of product images.

According to an embodiment of this application, optionally, the feature maps at each level of the student network are distilled from the feature maps at the corresponding levels of the teacher network, wherein the anomaly detection module may be further configured to determine the largest abnormal pixel in the total difference map between the feature maps of different resolutions at each level output by the teacher network and the corresponding feature maps at each level output by the student network as the outlier value of the acquired picture, so as to output the anomaly detection result.

Thus, feature distillation can be used to obtain a deeper network that better learns the generalization ability of the teacher network, realizing fast anomaly detection with fewer network parameters. In addition, the degree of abnormality is defined and quantified through the differences between the feature maps at each level, realizing pixel-level anomaly detection of product images.

Those skilled in the art can understand that the system of the present disclosure and its modules can be implemented in hardware or in software, and the modules can be merged or combined in any suitable manner.
FIG. 8 is a schematic architecture diagram of an apparatus 800 for anomaly detection according to another embodiment of this application. According to an embodiment of this application, referring to FIG. 8, the apparatus 800 may include a memory 801 and at least one processor 802. The memory 801 may store computer-executable instructions. The memory 801 may also store a trained teacher network and a student network distilled from the teacher network, wherein the teacher network is trained by constructing defect samples and learning the feature distribution of normal samples from a pre-trained expert network. The computer-executable instructions, when executed by the at least one processor 802, cause the apparatus 800 to perform the following operations: acquiring a picture of an object to be detected; inputting the acquired picture into the trained teacher network and the student network distilled from the teacher network, respectively, to obtain the feature map output by the teacher network and the feature map output by the student network; and determining the largest abnormal pixel in the difference map between the feature map output by the teacher network and the feature map output by the student network as the outlier value of the acquired picture, so as to output the anomaly detection result.

The memory 801 may include RAM, ROM, or a combination thereof. In some cases, the memory 801 may in particular contain a BIOS that can control basic hardware or software operations, such as interaction with peripheral components or devices. The processor 802 may include an intelligent hardware device (for example, a general-purpose processor, DSP, CPU, microcontroller, ASIC, FPGA, programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof).

Thus, corresponding to the above method 100 for anomaly detection, the apparatus for anomaly detection according to this application, by using the expert-teacher-student three-layer network architecture, can learn a more compact feature distribution of normal samples, avoiding the difficulty of collecting and labeling abnormal product images in traditional methods, saving considerable manpower and material resources, and solving the under-fitting problem. Learning from normal sample images rather than defect samples enables the network to correctly detect unknown defects, avoiding missed detection of unknown defect types, with extremely strong robustness. In addition, the degree of abnormality is defined and quantified through the difference between feature maps, realizing fast pixel-level anomaly detection of product images. Here, the computer-executable instructions, when executed by the at least one processor 802, cause the apparatus 800 to perform the various operations described above with reference to FIGS. 1-6, which are not repeated here for brevity.

The various illustrative blocks and modules described in connection with the disclosure herein may be implemented or performed with a general-purpose processor, DSP, ASIC, FPGA or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices (for example, a combination of a DSP and a microprocessor, multiple microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration).

The functions described herein may be implemented in hardware, software executed by a processor, firmware, or any combination thereof. If implemented in software executed by a processor, the functions may be stored on or transmitted over a computer-readable medium as one or more instructions or code. Other examples and implementations fall within the scope of the disclosure and the appended claims. For example, due to the nature of software, the functions described herein may be implemented using software executed by a processor, hardware, firmware, hardwiring, or any combination thereof. Features implementing the functions may also be physically located at various positions, including being distributed so that portions of the functions are implemented at different physical locations.

Although this application has been described with reference to preferred embodiments, various improvements can be made thereto and components therein can be replaced with equivalents without departing from the scope of this application. In particular, as long as there is no structural conflict, the technical features mentioned in the various embodiments can be combined in any manner. This application is not limited to the specific embodiments disclosed herein, but includes all technical solutions falling within the scope of the claims.

Claims (13)

  1. A method for anomaly detection, the method comprising:
    acquiring a picture of an object to be detected;
    inputting the acquired picture into a trained teacher network and a student network distilled from the teacher network, respectively, to obtain a feature map output by the teacher network and a feature map output by the student network, wherein the teacher network is trained by constructing defect samples and learning the feature distribution of normal samples from a pre-trained expert network; and
    determining the largest abnormal pixel in the difference map between the feature map output by the teacher network and the feature map output by the student network as the outlier value of the acquired picture, so as to output an anomaly detection result.
  2. The method of claim 1, wherein the feature maps at each level of the student network are distilled from the feature maps at the corresponding levels of the teacher network, and wherein determining the largest abnormal pixel in the difference map between the feature map output by the teacher network and the feature map output by the student network as the outlier value of the acquired picture further comprises:
    determining the largest abnormal pixel in the total difference map between the feature maps of different resolutions at each level output by the teacher network and the corresponding feature maps at each level output by the student network as the outlier value of the acquired picture, so as to output the anomaly detection result.
  3. The method of claim 1 or 2, wherein the defect samples are constructed through data augmentation.
  4. The method of any one of claims 1-3, wherein the training of the teacher network comprises the following steps, the training dataset including normal defect-free samples and defect samples:
    inputting the normal defect-free samples into the pre-trained expert network and the teacher network, and minimizing, in a contrastive loss function, the distance between the last-layer feature vectors output by the expert network and the teacher network;
    inputting the normal defect-free samples into the pre-trained expert network and the defect samples into the teacher network, and maximizing, in the contrastive loss function, the distance between the last-layer feature vectors output by the expert network and the teacher network; and
    updating the parameters of the teacher network based on the loss computation result.
  5. The method of any one of claims 1-4, wherein the training of the student network comprises the following steps, the training dataset including normal samples:
    inputting the normal samples into the trained teacher network and the student network to obtain the feature maps at each level output by the teacher network and the corresponding feature maps at each level output by the student network;
    minimizing, in a distillation loss function, the distance between the feature maps at each level output by the teacher network and the corresponding feature maps at each level output by the student network; and
    updating the parameters of the student network based on the loss computation result.
  6. The method of any one of claims 1-5, wherein outputting the anomaly detection result further comprises:
    localizing the abnormal region in the acquired picture based on the determined outlier value to obtain a segmented abnormal-region mask.
  7. A system for anomaly detection, the system comprising:
    an image acquisition module configured to acquire a picture of an object to be detected;
    a feature extraction module configured to input the acquired picture into a trained teacher network and a student network distilled from the teacher network, respectively, to obtain a feature map output by the teacher network and a feature map output by the student network, wherein the teacher network is trained by constructing defect samples and learning the feature distribution of normal samples from a pre-trained expert network; and
    an anomaly detection module configured to determine the largest abnormal pixel in the difference map between the feature map output by the teacher network and the feature map output by the student network as the outlier value of the acquired picture, so as to output an anomaly detection result.
  8. The system of claim 7, wherein the feature maps at each level of the student network are distilled from the feature maps at the corresponding levels of the teacher network, and wherein the anomaly detection module is further configured to:
    determine the largest abnormal pixel in the total difference map between the feature maps of different resolutions at each level output by the teacher network and the corresponding feature maps at each level output by the student network as the outlier value of the acquired picture, so as to output the anomaly detection result.
  9. The system of claim 7 or 8, wherein the defect samples are constructed through data augmentation.
  10. The system of any one of claims 7-9, wherein the training of the teacher network comprises the following steps, the training dataset including normal defect-free samples and defect samples:
    inputting the normal defect-free samples into the pre-trained expert network and the teacher network, and minimizing, in a contrastive loss function, the distance between the last-layer feature vectors output by the expert network and the teacher network;
    inputting the normal defect-free samples into the pre-trained expert network and the defect samples into the teacher network, and maximizing, in the contrastive loss function, the distance between the last-layer feature vectors output by the expert network and the teacher network; and
    updating the parameters of the teacher network based on the loss computation result.
  11. The system of any one of claims 7-10, wherein the training of the student network comprises the following steps, the training dataset including normal samples:
    inputting the normal samples into the trained teacher network and the student network to obtain the feature maps at each level output by the teacher network and the corresponding feature maps at each level output by the student network;
    minimizing, in a distillation loss function, the distance between the feature maps at each level output by the teacher network and the corresponding feature maps at each level output by the student network; and
    updating the parameters of the student network based on the loss computation result.
  12. The system of any one of claims 7-11, wherein the anomaly detection module is further configured to:
    localize the abnormal region in the acquired picture based on the determined outlier value to obtain a segmented abnormal-region mask.
  13. An apparatus for anomaly detection, the apparatus comprising:
    a memory storing computer-executable instructions; and
    at least one processor, the computer-executable instructions, when executed by the at least one processor, causing the apparatus to implement the method of any one of claims 1-6.
PCT/CN2021/135278 2021-12-03 2021-12-03 一种基于对比表征蒸馏的快速异常检测方法和系统 WO2023097638A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN202180067533.XA CN117099125A (zh) 2021-12-03 2021-12-03 一种基于对比表征蒸馏的快速异常检测方法和系统
EP21960115.0A EP4224379A4 (en) 2021-12-03 2021-12-03 METHOD AND SYSTEM FOR RAPID ANOMALY DETECTION BASED ON CONTRASTIVE IMAGE DISTILLATION
PCT/CN2021/135278 WO2023097638A1 (zh) 2021-12-03 2021-12-03 一种基于对比表征蒸馏的快速异常检测方法和系统
US18/356,229 US12020425B2 (en) 2021-12-03 2023-07-21 Fast anomaly detection method and system based on contrastive representation distillation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/135278 WO2023097638A1 (zh) 2021-12-03 2021-12-03 一种基于对比表征蒸馏的快速异常检测方法和系统

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/356,229 Continuation US12020425B2 (en) 2021-12-03 2023-07-21 Fast anomaly detection method and system based on contrastive representation distillation

Publications (1)

Publication Number Publication Date
WO2023097638A1 true WO2023097638A1 (zh) 2023-06-08

Family

ID=86611278

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/135278 WO2023097638A1 (zh) 2021-12-03 2021-12-03 一种基于对比表征蒸馏的快速异常检测方法和系统

Country Status (4)

Country Link
US (1) US12020425B2 (zh)
EP (1) EP4224379A4 (zh)
CN (1) CN117099125A (zh)
WO (1) WO2023097638A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116958148A (zh) * 2023-09-21 2023-10-27 曲阜师范大学 输电线路关键部件缺陷的检测方法、装置、设备、介质
CN116993694A (zh) * 2023-08-02 2023-11-03 江苏济远医疗科技有限公司 一种基于深度特征填充的无监督宫腔镜图像异常检测方法
CN118470563A (zh) * 2024-07-11 2024-08-09 北京数慧时空信息技术有限公司 一种基于自监督网络的遥感影像变化检测方法

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117274750B (zh) * 2023-11-23 2024-03-12 神州医疗科技股份有限公司 一种知识蒸馏半自动可视化标注方法及系统
CN118411369A (zh) * 2024-07-04 2024-07-30 杭州橙织数据科技有限公司 一种基于机器视觉的缝纫线迹缺陷检测方法及系统
CN118468230B (zh) * 2024-07-10 2024-09-20 苏州元瞰科技有限公司 一种基于多模态教师与学生框架的玻璃缺陷检测算法

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112801298A (zh) * 2021-01-20 2021-05-14 北京百度网讯科技有限公司 异常样本检测方法、装置、设备和存储介质
CN112991330A (zh) * 2021-04-19 2021-06-18 征图新视(江苏)科技股份有限公司 基于知识蒸馏的正样本工业缺陷检测方法
CN113326941A (zh) * 2021-06-25 2021-08-31 江苏大学 基于多层多注意力迁移的知识蒸馏方法、装置及设备

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108764462A (zh) 2018-05-29 2018-11-06 成都视观天下科技有限公司 一种基于知识蒸馏的卷积神经网络优化方法
CN111105008A (zh) * 2018-10-29 2020-05-05 富士通株式会社 模型训练方法、数据识别方法和数据识别装置
CA3076424A1 (en) * 2019-03-22 2020-09-22 Royal Bank Of Canada System and method for knowledge distillation between neural networks
CN111767711B (zh) * 2020-09-02 2020-12-08 之江实验室 基于知识蒸馏的预训练语言模型的压缩方法及平台
US20240005477A1 (en) * 2020-12-16 2024-01-04 Konica Minolta, Inc. Index selection device, information processing device, information processing system, inspection device, inspection system, index selection method, and index selection program
US20220036194A1 (en) * 2021-10-18 2022-02-03 Intel Corporation Deep neural network optimization system for machine learning model scaling

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112801298A (zh) * 2021-01-20 2021-05-14 北京百度网讯科技有限公司 异常样本检测方法、装置、设备和存储介质
CN112991330A (zh) * 2021-04-19 2021-06-18 征图新视(江苏)科技股份有限公司 基于知识蒸馏的正样本工业缺陷检测方法
CN113326941A (zh) * 2021-06-25 2021-08-31 江苏大学 基于多层多注意力迁移的知识蒸馏方法、装置及设备

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4224379A4 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116993694A (zh) * 2023-08-02 2023-11-03 江苏济远医疗科技有限公司 一种基于深度特征填充的无监督宫腔镜图像异常检测方法
CN116993694B (zh) * 2023-08-02 2024-05-14 江苏济远医疗科技有限公司 一种基于深度特征填充的无监督宫腔镜图像异常检测方法
CN116958148A (zh) * 2023-09-21 2023-10-27 曲阜师范大学 输电线路关键部件缺陷的检测方法、装置、设备、介质
CN116958148B (zh) * 2023-09-21 2023-12-12 曲阜师范大学 输电线路关键部件缺陷的检测方法、装置、设备、介质
CN118470563A (zh) * 2024-07-11 2024-08-09 北京数慧时空信息技术有限公司 一种基于自监督网络的遥感影像变化检测方法

Also Published As

Publication number Publication date
EP4224379A4 (en) 2024-02-14
US12020425B2 (en) 2024-06-25
CN117099125A (zh) 2023-11-21
EP4224379A1 (en) 2023-08-09
US20230368372A1 (en) 2023-11-16

Similar Documents

Publication Publication Date Title
WO2023097638A1 (zh) 一种基于对比表征蒸馏的快速异常检测方法和系统
Chen et al. Accurate and robust crack detection using steerable evidence filtering in electroluminescence images of solar cells
EP4322106B1 (en) Defect detection method and apparatus
WO2023097637A1 (zh) 一种用于缺陷检测的方法和系统
CN107748901B (zh) 基于相似性局部样条回归的工业过程故障诊断方法
CN111898566B (zh) 姿态估计方法、装置、电子设备和存储介质
CN113780484B (zh) 工业产品缺陷检测方法和装置
CN116797977A (zh) 巡检机器人动态目标识别与测温方法、装置和存储介质
CN115393714A (zh) 一种融合图论推理的输电线路螺栓缺销钉检测方法
CN111768380A (zh) 一种工业零配件表面缺陷检测方法
CN116385353B (zh) 一种摄像头模组异常检测方法
CN117495786A (zh) 缺陷检测元模型构建方法、缺陷检测方法、设备及介质
CN117078603A (zh) 基于改进的yolo模型的半导体激光芯片损伤探测方法和系统
CN116432078A (zh) 建筑楼宇机电设备监测系统
CN116977780A (zh) 一种基于机器视觉的锻造缺陷识别方法
CN115063405A (zh) 钢材表面缺陷检测的方法、系统、电子设备和存储介质
CN114267044A (zh) 一种数字水表的数据识别方法及装置
WO2020194583A1 (ja) 異常検知装置、制御方法、及びプログラム
CN112150458A (zh) 基于人工智能的光伏电池板焊带偏移检测方法及系统
CN117789184B (zh) 一种统一的焊缝射线图像智能识别方法
Rochan et al. Domain Adaptation in 3D Object Detection with Gradual Batch Alternation Training
CN110148089B (zh) 一种图像处理方法、装置及设备、计算机存储介质
CN118411351A (zh) 一种基于深度学习的线路部件检测方法、系统及存储介质
Rochan et al. Gradual Batch Alternation for Effective Domain Adaptation in LiDAR-Based 3D Object Detection
CN112419241A (zh) 一种基于人工智能的物体鉴别方法、装置和可读存储介质

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 202180067533.X

Country of ref document: CN

ENP Entry into the national phase

Ref document number: 2021960115

Country of ref document: EP

Effective date: 20230503

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21960115

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE