CN116051532A - Deep learning-based industrial part defect detection method and system and electronic equipment - Google Patents

Publication number
CN116051532A
Authority
CN
China
Prior art keywords
neural network
result
output
industrial part
deep neural
Prior art date
Legal status
Pending
Application number
CN202310109015.0A
Other languages
Chinese (zh)
Inventor
蒋学芹
陈齐航
周树波
潘峰
Current Assignee
Donghua University
Original Assignee
Donghua University
Priority date
Filing date
Publication date
Application filed by Donghua University filed Critical Donghua University
Priority to CN202310109015.0A
Publication of CN116051532A

Classifications

    • G06T 7/0004: Industrial image inspection
    • G06N 3/08: Neural network learning methods
    • G06V 10/26: Segmentation of patterns in the image field
    • G06V 10/806: Fusion of extracted features
    • G06V 10/82: Image or video recognition using neural networks
    • G06T 2207/20081: Training; Learning
    • G06T 2207/20084: Artificial neural networks [ANN]
    • G06T 2207/30108: Industrial image inspection
    • G06T 2207/30164: Workpiece; Machine component
    • Y02P 90/30: Computing systems specially adapted for manufacturing

Abstract

The invention provides a deep learning-based industrial part defect detection method, system, and electronic device. The method comprises the following steps: acquiring a data set; constructing a deep neural network for detecting surface defects of a target industrial part; and training the deep neural network with the data set to obtain a trained target neural network, which is then used to detect surface defects of the target industrial part. Built on a convolutional neural network and a Transformer, the method combines the strengths of both (the CNN's local detail capture and the Transformer's global dependency modeling) to improve segmentation accuracy, while a parallel-branch design ensures fast convergence during training and meets the time requirements of inference testing.

Description

Deep learning-based industrial part defect detection method, system, and electronic device

Technical Field

The present invention relates to the field of physics, and in particular to surface defect detection technology for industrial parts, specifically a deep learning-based industrial part defect detection method, system, and electronic device.

Background Art

In industrial production, the limitations of existing technology, working conditions, and other factors can seriously degrade the quality of finished products. Surface defects are a typical manifestation of reduced product quality; therefore, to guarantee yield and reliable quality, product surfaces must be inspected for defects.

A "defect" can generally be understood as a missing region, flaw, or abnormal area compared with a normal sample. Surface defect detection refers to detecting scratches, flaws, foreign-matter occlusion, color contamination, holes, and other defects on a sample's surface, so as to obtain information such as the category, contour, position, and size of each defect. Manual inspection was once the mainstream approach: workers were trained to recognize complex surface defects, but this method is inefficient, its results are easily affected by subjective human factors, and it cannot meet real-time requirements. Automating defect detection is therefore a task of great significance and great challenge.

Traditional machine-vision methods must manually engineer features for a specific domain and then make decisions with hand-crafted rules or learnable classifiers (such as SVMs or decision trees). This approach depends heavily on human experience and has a long development cycle, making it difficult to keep pace with product iteration.

Summary of the Invention

The object of the present invention is to provide a deep learning-based industrial part defect detection method, system, and electronic device that solve the above problems of existing product surface defect detection techniques.

To achieve the above and other related objects, the present invention provides a deep learning-based industrial part defect detection method comprising the following steps: acquiring a data set, the data set including target surface defect images of industrial parts; constructing a deep neural network for detecting surface defects of a target industrial part, the deep neural network comprising a fusion module, a Transformer branch, a CNN branch, and a decoder, the Transformer branch, the CNN branch, and the decoder all being connected to the fusion module, wherein the fusion module fuses a first result output by the Transformer branch with a second result output by the CNN branch, the decoder decodes a third result output by the fusion module, and the output of the decoder serves as the output of the deep neural network; and training the deep neural network with the data set to obtain a trained target neural network, so that surface defect detection of the target industrial part is performed based on the target neural network.

In an embodiment of the present invention, acquiring the data set comprises the following steps: acquiring original surface defect images of industrial parts; and preprocessing the original surface defect images to obtain the target surface defect images.

In an embodiment of the present invention, the fusion module is further configured to enhance a fourth result produced by fusing the first result and the second result, producing a fifth result, and to concatenate the fifth result, the first result, and the second result.

In an embodiment of the present invention, there are three fusion modules, and the decoder comprises a first attention module, a second attention module, a first convolutional layer, a second convolutional layer, and a segmentation head. The decoder decodes the third results as follows: the third result output by one fusion module and the third result output by another fusion module are input to the first attention module; a sixth result output by the first attention module is input to the first convolutional layer to obtain a seventh result; the seventh result and the third result output by the remaining fusion module are input to the second attention module; an eighth result output by the second attention module is input to the second convolutional layer to obtain a ninth result; the ninth result is input to the segmentation head; and the output of the segmentation head serves as the output of the decoder.

In an embodiment of the present invention, the segmentation head comprises a third convolutional layer and a bilinear interpolation layer; when the ninth result is input to the segmentation head, it first passes through the third convolutional layer, and the bilinear interpolation layer then restores the output of the third convolutional layer to the original resolution.

In an embodiment of the present invention, training the deep neural network with the data set comprises the following steps: selecting training images from the data set; and inputting the training images into the deep neural network to train it, wherein training is performed by minimizing a loss function.

In an embodiment of the present invention, the deep neural network is trained iteratively; training with the data set further comprises evaluating the network with the intersection-over-union (IoU) score and/or the Dice score.
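The two evaluation metrics named above can be computed directly from binary masks. The sketch below is illustrative only; the function names and the numpy representation are assumptions, not from the patent:

```python
import numpy as np

def iou_score(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Intersection-over-union: |A ∩ B| / |A ∪ B| over boolean masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return float((inter + eps) / (union + eps))

def dice_score(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice score: 2|A ∩ B| / (|A| + |B|) over boolean masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return float((2 * inter + eps) / (pred.sum() + target.sum() + eps))
```

Both scores lie in [0, 1], with 1 meaning the predicted mask matches the label exactly; the small `eps` avoids division by zero on empty masks.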

The present invention provides a deep learning-based industrial part defect detection system comprising: an acquisition module for acquiring a data set, the data set including target surface defect images of industrial parts; a construction module for constructing a deep neural network for detecting surface defects of a target industrial part, the deep neural network comprising a fusion module, a Transformer branch, a CNN branch, and a decoder, the Transformer branch, the CNN branch, and the decoder all being connected to the fusion module, wherein the fusion module fuses a first result output by the Transformer branch with a second result output by the CNN branch, the decoder decodes a third result output by the fusion module, and the output of the decoder serves as the output of the deep neural network; and a training module for training the deep neural network with the data set to obtain a trained target neural network, so that surface defect detection of the target industrial part is performed based on the target neural network.

The present invention provides a storage medium storing a computer program which, when executed by a processor, implements the deep learning-based industrial part defect detection method described above.

The present invention provides an electronic device comprising a processor and a memory; the memory stores a computer program, and the processor executes the program stored in the memory so that the electronic device performs the deep learning-based industrial part defect detection method described above.

As described above, the deep learning-based industrial part defect detection method, system, and electronic device of the present invention have the following beneficial effects:

(1) Compared with the prior art, the present invention addresses the difficulty of capturing long-range dependency information when only a convolutional neural network is used. The proposed deep learning-based defect detection method introduces a Transformer and attention modules, improving segmentation accuracy.

(2) The present invention provides a deep learning-based industrial part defect detection method that detects surface defects of industrial parts on the theoretical basis of a convolutional neural network and a Transformer, combining the advantages of both to improve segmentation accuracy, while the parallel-branch design ensures fast convergence during training and meets the time requirements of inference testing.

Brief Description of the Drawings

FIG. 1 is a flowchart of the deep learning-based industrial part defect detection method of the present invention in an embodiment.

FIG. 2 is a framework diagram of the deep neural network of the present invention in an embodiment.

FIG. 3 is a schematic structural diagram of the fusion module of the present invention in an embodiment.

FIG. 4 is a schematic structural diagram of the AttentionGate module of the present invention in an embodiment.

FIG. 5 is a schematic structural diagram of the SCSE module of the present invention in an embodiment.

FIG. 6 is a schematic structural diagram of the ViT of the present invention in an embodiment.

FIG. 7 is a schematic structural diagram of the ResNet34 of the present invention in an embodiment.

FIG. 8 is a flowchart of the deep learning-based industrial part defect detection method of the present invention in another embodiment.

FIG. 9 is a schematic structural diagram of the deep learning-based industrial part defect detection system of the present invention in an embodiment.

Detailed Description of the Embodiments

The embodiments of the present invention are described below through specific examples; those skilled in the art can readily understand other advantages and effects of the invention from the disclosure of this specification. The invention may also be implemented or applied through other different embodiments, and the details of this specification may be modified or changed in various ways based on different viewpoints and applications without departing from the spirit of the invention. It should be noted that, in the absence of conflict, the following embodiments and the features therein may be combined with one another.

It should be noted that the drawings provided in the following embodiments only schematically illustrate the basic idea of the invention; they show only the components related to the invention rather than the actual number, shape, and size of components in a real implementation. In practice, the type, quantity, and proportion of each component may vary arbitrarily, and the component layout may be more complex.

Over the past decade, with the development of massive data analysis and learning techniques, deep neural networks have been applied to many visual recognition tasks. Compared with classical machine-vision methods, deep learning can learn high-level features directly from data and thus represents complex structures better, replacing hand-engineered features with an automatic learning process.

The convolution operation is translation invariant, which makes it naturally suited to image processing, but its locality limits the region it attends to, making long-range dependencies hard to capture. As the Transformer has excelled in natural language processing in recent years, its strength in capturing global dependencies through a global attention mechanism has also been applied to computer vision. The present invention makes the two complementary and combines them with attention modules to construct a complete network structure for defect detection in the industrial field.

The deep learning-based industrial part defect detection method of the present invention is explained below with reference to specific embodiments and the drawings.

As shown in FIG. 1, the deep learning-based industrial part defect detection method of the present invention is applied to tile surface defect detection. Specifically, the method comprises the following steps:

Step S1: capture surface defect images of tiles with a camera.

Step S2: preprocess the surface defect images to obtain a data set.

Specifically, all surface defect images are resized to 256×256×3, and data augmentation operations such as random rotation, translation, and scaling are applied to the images to obtain the data set.

At the same time, the above operations are applied synchronously to the images' segmentation labels.

The data set is then randomly split into 385 images for training and 31 images for inference testing.
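As an illustration of this preprocessing step, the sketch below resizes an image/label pair to 256×256 and applies the same random geometric transform to both. Nearest-neighbour resizing and 90° rotations/flips stand in for the patent's generic rotation, translation, and scaling; all names are assumptions:

```python
import numpy as np

def resize_nearest(arr: np.ndarray, size: int = 256) -> np.ndarray:
    """Nearest-neighbour resize of an H×W(×C) image or label to size×size."""
    h, w = arr.shape[:2]
    ys = np.arange(size) * h // size   # source row index for each output row
    xs = np.arange(size) * w // size   # source column index for each output column
    return arr[ys][:, xs]

def augment_pair(image: np.ndarray, mask: np.ndarray, rng: np.random.Generator):
    """Apply one identical random flip/rotation to an image and its label."""
    k = int(rng.integers(0, 4))                     # rotate by k * 90 degrees
    image, mask = np.rot90(image, k), np.rot90(mask, k)
    if rng.random() < 0.5:                          # random horizontal flip
        image, mask = image[:, ::-1], mask[:, ::-1]
    return image, mask
```

Applying the same transform to the label keeps each pixel of the segmentation mask aligned with the augmented image, which is what "applied synchronously" requires.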

Step S3: during training, an image I_img of dimensions H×W×C is taken from the data set as input each time.

Here, H and W are the height and width of the image I_img, and C is its number of channels.

Specifically, a pair of images is randomly selected from the data set each time during training: a surface defect image and its corresponding segmentation label.

Step S4: construct the deep neural network and train it with the image I_img as input.

Referring to FIG. 2, the image I_img is fed into the network during training.

In one embodiment, the deep neural network comprises a Transformer branch, a CNN branch, fusion modules, and a decoder; the Transformer branch uses ViT (Vision Transformer; see FIG. 6 for its structure) as its backbone, and the CNN branch uses a residual network (ResNet) as its backbone.

It should be noted that the invention combines the strengths of the Transformer and the CNN: the Transformer captures global dependencies well through its global attention mechanism, while the CNN is better at capturing local details. Feature maps of the same resolution from the Transformer branch and the CNN branch are fed into the corresponding fusion module; the fusion modules' outputs are then fed into the decoder and successively upsampled to produce the output of the deep neural network.

As shown in FIG. 7, in one embodiment the CNN branch uses ResNet34 as the backbone.

The Transformer branch uses ViT as its backbone: the image is divided into 16×16×3 patches, each patch is flattened into a one-dimensional vector and linearly projected to a vector of length 384, and these vectors are fed into ViT. The vectors output by ViT are then reassembled into a three-dimensional feature map, and two upsampling steps raise the feature map's resolution while reducing its channel count, yielding outputs at different resolutions.

The CNN branch uses ResNet as its backbone. Specifically, the input image first passes through a convolutional layer with kernel size 7 and stride 2, followed by a ReLU activation, and is downsampled by max pooling; it then passes through three residual blocks, each producing an output at a different resolution.

The Transformer branch and the CNN branch capture the image's feature information, the fusion modules fuse this information, and the decoder decodes the outputs of the fusion modules.

The overall pipeline is shown in FIG. 2. This embodiment takes a two-dimensional image as an example, with an input size of 256×256×3. For the Transformer branch, the Transformer's output vectors are reshaped to 16×16×384 and then passed through bilinear interpolation and deconvolution twice, yielding output feature maps of sizes 32×32×128 and 64×64×64, respectively.

For the CNN branch, the first convolutional block has 64 kernels of size 7 with stride 2, followed by a ReLU activation and max pooling; it produces a feature map of size 64×64×64. Three residual blocks follow, all with kernel size 3; the first has stride 1 and the other two have stride 2, with 64, 128, and 256 kernels, respectively, yielding output feature maps of sizes 64×64×64, 32×32×128, and 16×16×256.
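The spatial sizes quoted above follow from the standard convolution output-size formula; the short calculation below reproduces them for a 256×256 input. The padding values are assumptions, since the text states only kernel sizes and strides:

```python
def conv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

# CNN-branch spatial sizes for a 256×256 input (padding is assumed):
size = conv_out(256, kernel=7, stride=2, padding=3)    # first conv block -> 128
size = conv_out(size, kernel=3, stride=2, padding=1)   # max pooling -> 64

stage_sizes = []
for stride in (1, 2, 2):                               # three residual blocks
    size = conv_out(size, kernel=3, stride=stride, padding=1)
    stage_sizes.append(size)
```

With these assumptions the three residual blocks produce 64×64, 32×32, and 16×16 maps, matching the 64×64×64, 32×32×128, and 16×16×256 shapes given above.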

After the CNN-branch and Transformer-branch feature maps are obtained, each pair of feature maps with the same resolution is fed into a fusion module, whose structure is shown in FIG. 3. The CNN-branch and Transformer-branch feature maps each pass through a convolutional layer with kernel size 1 to align their channel counts and are then added; the sum passes through an SCSE block (Spatial Squeeze and Channel Excitation Block), and the result is concatenated with the CNN-branch and Transformer-branch feature maps; finally, the concatenation is fed into a residual block to obtain the fusion module's final output.

The SCSE block structure is shown in FIG. 5.

The SCSE block has two branches, an SSE branch and a CSE branch. In the SSE branch, the input feature map passes through a convolutional layer with a single kernel of size 1, producing a two-dimensional map that is multiplied with the input feature map along the spatial dimensions to give the SSE branch's output. In the CSE branch, the input feature map passes through an average pooling layer to give a one-dimensional vector, then through two convolutional layers with kernel size 1 that first reduce and then restore the dimension, finally producing a one-dimensional vector with as many entries as there are channels; this vector is multiplied with the input feature map along the channel dimension to give the CSE branch's output. The outputs of the SSE and CSE branches are added to give the final SCSE output.

It should be noted that, to further strengthen the Transformer branch's capture of global dependencies and the CNN branch's capture of local details, the invention uses additional attention modules inside the fusion module: an SCSE block enhances the fused result of the Transformer-branch and CNN-branch outputs, the enhanced feature maps are concatenated, and a residual block finally produces the fusion module's output, which is fed into the decoder.

In the decoding process of the present invention, denote the feature map output by the i-th fusion module as f_i. The feature map output by layer i+1 of the decoder is then computed from the previous decoder output and f_{i+1} using a convolution layer Conv, upsampling Up, and the attention module AG. (The formula images in the original document are not reproduced here.)

In one embodiment, the present invention uses an Attention Gate module (AG module for short) in the decoding process to further improve the final segmentation result. A single pixel in a deep feature map has a larger receptive field and can attend to more global information; the AG module therefore uses the information extracted from a deep feature map as an attention mechanism applied to a shallow feature map, so that the shallow feature map can also attend to more global information.

Let the feature map output by the i-th decoder layer and the feature map f_{i+1} output by the (i+1)-th fusion module be the inputs; the output of the AG module is then given by the formulas in the original document (the formula images are not reproduced here), where Conv is a convolution layer with kernel size 1 and stride 1, and Up is bilinear-interpolation upsampling.

Specifically, the decoder consists of two AttentionGate modules, two convolution layers, and a segmentation head. The structure of the AttentionGate module is shown in Figure 4: the shallow feature map first passes through a convolution layer with kernel size 1 to align its channel count with that of the deep feature map, and is then downsampled to align its resolution with the deep feature map as well. The result is added to the deep feature map, passed through a ReLU layer, restored to the original channel count by a convolution layer with kernel size 1, and passed through a sigmoid layer; the result is then upsampled back to the resolution of the shallow feature map, yielding the attention feature map. Multiplying the attention feature map with the original shallow feature map gives the output of the AttentionGate module.
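The gating pipeline just described can be sketched as follows; the channel widths are illustrative, and using bilinear interpolation for the downsampling step (the text only specifies it for the upsampling step) is an assumption:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionGate(nn.Module):
    """Sketch of the AttentionGate described in the text (details assumed)."""
    def __init__(self, shallow_ch, deep_ch):
        super().__init__()
        self.align = nn.Conv2d(shallow_ch, deep_ch, kernel_size=1)    # match channels
        self.restore = nn.Conv2d(deep_ch, shallow_ch, kernel_size=1)  # restore channels

    def forward(self, shallow, deep):
        g = self.align(shallow)
        # downsample to the deep feature map's resolution
        g = F.interpolate(g, size=deep.shape[2:], mode="bilinear", align_corners=False)
        # add, ReLU, 1x1 conv back to the shallow channel count, sigmoid
        a = torch.sigmoid(self.restore(F.relu(g + deep)))
        # upsample the attention map back to the shallow resolution
        a = F.interpolate(a, size=shallow.shape[2:], mode="bilinear", align_corners=False)
        return shallow * a  # gated shallow feature map

shallow = torch.randn(1, 16, 32, 32)
deep = torch.randn(1, 32, 16, 16)
out = AttentionGate(16, 32)(shallow, deep)  # same shape as `shallow`
```

The gate leaves the shallow feature map's shape unchanged, which is what lets two such modules be chained inside the decoder.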

The decoder takes the output of one fusion module together with f_1 as inputs to the first AttentionGate module and passes the result through a convolution layer; the resulting feature map and f_2 are then input to the second AttentionGate module, whose output passes through another convolution layer; finally, this result is passed through the segmentation head to obtain the final segmentation result. (The symbols for the intermediate feature maps appear as formula images in the original document and are not reproduced here.)

The segmentation head includes a convolution layer and bilinear interpolation; the output of the convolution layer is restored to the original resolution through bilinear interpolation to obtain the final segmentation result.
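A minimal sketch of this head follows; the channel count and class count are illustrative assumptions, while the 224×224 output size is the input resolution stated later in this document:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SegHead(nn.Module):
    """Segmentation head: convolution followed by bilinear upsampling."""
    def __init__(self, in_ch, num_classes, out_size):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, num_classes, kernel_size=1)
        self.out_size = out_size

    def forward(self, x):
        x = self.conv(x)
        # restore the prediction to the original input resolution
        return F.interpolate(x, size=self.out_size, mode="bilinear", align_corners=False)

feat = torch.randn(1, 16, 56, 56)          # decoder feature map (illustrative)
mask = SegHead(16, 1, (224, 224))(feat)    # 224x224x1 segmentation result
```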

In one embodiment, the deep neural network is trained by minimizing a loss function.

It should be noted that the minimized loss function is an energy function derived from traditional registration methods (the formula image for L_total is not reproduced here). L_total improves the final segmentation accuracy through deep supervision; α, β, and γ are adjustable hyperparameters, G is the segmentation label, head is the prediction head that converts its input into a segmentation result, and t_i is the output of the Transformer branch after i upsampling steps.

L = L_IoU + L_bce;

where L_IoU is the IoU loss and L_bce the binary cross-entropy loss (the original formula images are not reproduced here; in their standard forms, L_IoU = 1 − Σ(y·ŷ)/(Σy + Σŷ − Σ(y·ŷ)) and L_bce = −Σ[y·log ŷ + (1 − y)·log(1 − ŷ)]), y being the segmentation label and ŷ the output of the deep neural network.

In this embodiment, training is set to 200 epochs and the Adam optimizer drives the network optimization; α, β, and γ in L_total are set to 0.5, 0.3, and 0.2, respectively. After the iterations are completed, the final model is obtained.
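Under the standard definitions of the IoU and binary cross-entropy losses (an assumption, since the original formula images are omitted here), the combined loss L = L_IoU + L_bce over a flattened binary mask can be sketched with the standard library alone:

```python
import math

def bce_loss(y, p, eps=1e-7):
    """Mean binary cross-entropy between labels y and predictions p (flat masks)."""
    return -sum(t * math.log(max(q, eps)) + (1 - t) * math.log(max(1 - q, eps))
                for t, q in zip(y, p)) / len(y)

def iou_loss(y, p, eps=1e-7):
    """Soft IoU (Jaccard) loss: 1 minus intersection over union."""
    inter = sum(t * q for t, q in zip(y, p))
    union = sum(y) + sum(p) - inter
    return 1.0 - (inter + eps) / (union + eps)

def total_loss(y, p):
    """L = L_IoU + L_bce, as stated in the text."""
    return iou_loss(y, p) + bce_loss(y, p)

perfect = total_loss([1, 0, 1, 1], [1.0, 0.0, 1.0, 1.0])  # near 0
poor = total_loss([1, 0], [0.5, 0.5])                     # clearly larger
```

A perfect prediction drives both terms to zero, while an uninformative prediction is penalized by both terms at once, which is the motivation for summing a region-overlap loss with a per-pixel loss.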

In one embodiment, the deep neural network is trained iteratively.

In one embodiment, the trained deep neural network is evaluated using the IoU (Intersection over Union) score and the Dice score as metrics of defect segmentation performance.

It should be noted that inference testing is performed with the trained deep neural network; specifically, during testing, images are selected from the test set one at a time in sequence as input, together with each image's corresponding segmentation label.

In this embodiment, there are 31 segmentation labels for the two-dimensional images. The test network outputs the segmentation result and the segmentation evaluation metrics. The IoU score is |y ∩ ŷ| / |y ∪ ŷ| and the Dice score is 2|y ∩ ŷ| / (|y| + |ŷ|), where y is the segmentation label and ŷ is the network output. (The original formula images are not reproduced here; the standard definitions are given.)

It should be noted that the IoU coefficient and the Dice coefficient are set-similarity measures used to compute the similarity between two samples; their values lie in the range [0, 1], and the better the segmentation, the closer the IoU and Dice values are to 1.
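For binary masks, these two measures can be computed with a short standard-library sketch (the mask values below are illustrative):

```python
def iou_score(y, y_hat):
    """Intersection over union of two flat binary masks."""
    inter = sum(1 for t, p in zip(y, y_hat) if t == 1 and p == 1)
    union = sum(1 for t, p in zip(y, y_hat) if t == 1 or p == 1)
    return inter / union if union else 1.0

def dice_score(y, y_hat):
    """Dice coefficient: 2|A∩B| / (|A| + |B|)."""
    inter = sum(1 for t, p in zip(y, y_hat) if t == 1 and p == 1)
    total = sum(y) + sum(y_hat)
    return 2 * inter / total if total else 1.0

y = [1, 1, 0, 0]
y_hat = [1, 0, 1, 0]
iou = iou_score(y, y_hat)    # 1/3
dice = dice_score(y, y_hat)  # 0.5
```

Note that the two scores are monotonically related for binary masks (Dice = 2·IoU / (1 + IoU)), so a segmentation that improves one improves the other.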

The present invention discloses a deep learning-based industrial part defect detection method. First, a camera captures images of the surface of an industrial part, and data augmentation operations such as rotation, translation, and scaling are applied to obtain preprocessed images. The preprocessed images are then input to a Transformer network and a convolutional neural network for feature extraction and feature fusion; the feature maps are subsequently transformed back to the original size through feature upsampling and fed into the prediction layer to obtain the final semantic segmentation result. The training process is supervised: the IoU cost loss and the Dice cost loss are used for iterative training and parameter optimization until the model parameters converge, and the model parameter file is saved. During testing, the input is an industrial part defect image of size 224×224×3; the trained deep neural network is tested on it, finally producing a segmentation result of size 224×224×1. By combining the advantages of the Transformer and the CNN, the present invention achieves more accurate defect segmentation on industrial part datasets and improves the accuracy of the final segmentation result.

Prior-art solutions usually employ only a convolutional neural network. On the basis of a convolutional neural network, the present invention also introduces a Transformer and adds attention modules to the fusion module and the decoder, which increases the weight of useful information, suppresses the influence of noise, and improves segmentation accuracy.

The present invention relates to a deep learning-based industrial part defect detection method implemented on the theoretical basis of a convolutional neural network and a Transformer. It combines the advantages of the two to improve segmentation accuracy, and adopts a parallel-branch design that ensures the convergence speed during training and meets the time requirements of inference testing.

As shown in Figure 8, in one embodiment, the deep learning-based industrial part defect detection method of the present invention includes the following steps:

Step H1: acquire a dataset.

It should be noted that the dataset includes target surface defect images of industrial parts.

In one embodiment, acquiring the dataset includes the following steps:

Step (11): acquire original surface defect images of industrial parts.

Step (12): preprocess the original surface defect images to obtain target surface defect images.

In one embodiment, the preprocessing in step (12) includes image augmentation; the image augmentation includes, but is not limited to, any of the following operations: rotation, translation, and scaling.

Step H2: construct a deep neural network for detecting surface defects of target industrial parts.

In this embodiment, the deep neural network includes a fusion module, a Transformer branch, a CNN branch, and a decoder; the Transformer branch, the CNN branch, and the decoder are all connected to the fusion module. The fusion module fuses the first result output by the Transformer branch with the second result output by the CNN branch; the decoder decodes the third result output by the fusion module, and the output of the decoder serves as the output of the deep neural network.

In one embodiment, the fusion module is further configured to enhance a fourth result produced by fusing the first result and the second result, producing a fifth result, and to concatenate the fifth result, the first result, and the second result.

In one embodiment, the number of fusion modules is three. The decoder includes a first attention module, a second attention module, a first convolution layer, a second convolution layer, and a segmentation head, and decodes the third result as follows:

Step (21): the third result output by one fusion module and the third result f_1 output by another fusion module are input to the first attention module.

Step (22): the sixth result output by the first attention module is input to the first convolution layer to obtain a seventh result.

Step (23): the seventh result and the third result f_2 output by the remaining fusion module are input to the second attention module.

Step (24): the eighth result output by the second attention module is input to the second convolution layer to obtain a ninth result.

Step (25): the ninth result is input to the segmentation head; the output of the segmentation head serves as the output of the decoder.

(The symbols for the intermediate feature maps appear as formula images in the original document and are not reproduced here.)

In one embodiment, the segmentation head includes a third convolution layer and a bilinear interpolation layer; in inputting the ninth result to the segmentation head, the ninth result is first input to the third convolution layer, and the output of the third convolution layer is then restored to the original resolution by the bilinear interpolation layer.

Step H3: train the deep neural network with the dataset to obtain a trained target neural network, so as to perform surface defect detection on the target industrial parts based on the target neural network.

In one embodiment, training the deep neural network with the dataset includes the following steps:

Step (31): select training images from the dataset.

Step (32): input the training images into the deep neural network to train it.

In one embodiment, during training, the deep neural network is trained by minimizing a loss function.

In one embodiment, during training, the deep neural network is trained iteratively.

In one embodiment, training the deep neural network with the dataset further includes the following step: evaluating the deep neural network using the IoU score and/or the Dice score.

In one embodiment, performing surface defect detection on the target industrial part based on the target neural network includes: inputting a target surface defect image of the target industrial part into the target neural network, so that the target neural network detects surface defects of the target industrial part; the output of the target neural network is the result of its detection of the surface defects of the target industrial part.

It should be noted that, for the working principle of the deep learning-based industrial part defect detection method provided in this embodiment, reference may be made to the description of the method in the specific embodiments above, which will not be repeated here.

It should be noted that the scope of protection of the deep learning-based industrial part defect detection method of the present invention is not limited to the execution order of the steps listed in this embodiment; all solutions realized by adding, removing, or replacing steps of the prior art according to the principles of the present invention fall within the scope of protection of the present invention.

A computer program is stored on the storage medium of the present invention; when executed by a processor, the computer program implements the above deep learning-based industrial part defect detection method. The storage medium includes various media capable of storing program code, such as read-only memory (ROM), random access memory (RAM), magnetic disks, USB flash drives, memory cards, and optical discs.

Any combination of one or more storage media may be employed. The storage medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection with one or more wires, a portable computer disk, a hard disk, RAM, ROM, erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by, or in connection with, an instruction execution system, apparatus, or device.

A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by, or in connection with, an instruction execution system, apparatus, or device.

Program code contained on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical cable, RF, or any suitable combination of the above.

Computer program code for carrying out the operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet via an Internet service provider).

The present invention is described below with reference to flowcharts and/or block diagrams of methods, apparatuses (systems), and computer program products according to embodiments of the invention. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, produce means for implementing the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.

These computer program instructions may also be stored in a computer-readable medium, causing a computer, another programmable data processing apparatus, or other equipment to operate in a specific way, such that the instructions stored in the computer-readable medium produce an article of manufacture including instructions that implement the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.

The computer program instructions may also be loaded onto a computer, another programmable data processing apparatus, or other equipment, causing a series of operational steps to be performed thereon to produce a computer-implemented process, such that the instructions executed on the computer or other programmable apparatus provide a process for implementing the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.

The electronic device of the present invention includes a processor and a memory.

The memory is used to store a computer program; preferably, the memory includes various media capable of storing program code, such as ROM, RAM, magnetic disks, USB flash drives, memory cards, and optical discs.

The processor is connected to the memory and is used to execute the computer program stored in the memory, so that the electronic device performs the above deep learning-based industrial part defect detection method.

Preferably, the processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

As shown in Figure 9, in one embodiment, the deep learning-based industrial part defect detection system of the present invention includes:

an acquisition module 91, configured to acquire a dataset, the dataset including target surface defect images of industrial parts;

a construction module 92, configured to construct a deep neural network for detecting surface defects of target industrial parts, the deep neural network including a fusion module, a Transformer branch, a CNN branch, and a decoder, with the Transformer branch, the CNN branch, and the decoder all connected to the fusion module, wherein the fusion module fuses the first result output by the Transformer branch with the second result output by the CNN branch, the decoder decodes the third result output by the fusion module, and the output of the decoder serves as the output of the deep neural network; and

a training module 93, configured to train the deep neural network with the dataset and obtain a trained target neural network, so as to perform surface defect detection on the target industrial parts based on the target neural network.

It should be noted that the structures and principles of the acquisition module 91, the construction module 92, and the training module 93 correspond one-to-one to steps H1 to H3 of the above deep learning-based industrial part defect detection method, and are therefore not repeated here.

It should be noted that the division of the modules of the above system is merely a division of logical functions; in actual implementation, they may be fully or partially integrated into one physical entity, or physically separated. These modules may all be implemented as software invoked by a processing element, all as hardware, or partly as software invoked by a processing element and partly as hardware. For example, the x module may be a separately established processing element, or may be integrated into a chip of the above system; it may also be stored in the memory of the above system in the form of program code, with a processing element of the system invoking and executing its function. The other modules are implemented similarly. Moreover, all or some of these modules may be integrated together or implemented independently. The processing element referred to here may be an integrated circuit with signal processing capability. In implementation, each step of the above method, or each of the above modules, can be completed by integrated logic circuits of hardware in the processor element or by instructions in the form of software.

For example, the above modules may be one or more integrated circuits configured to implement the above method, such as one or more application-specific integrated circuits (ASICs), one or more digital signal processors (DSPs), or one or more field-programmable gate arrays (FPGAs). As another example, when one of the above modules is implemented as program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a central processing unit (CPU) or another processor that can invoke program code. As a further example, these modules may be integrated together and implemented in the form of a system-on-a-chip (SOC).

It should be noted that the deep learning-based industrial part defect detection system of the present invention can implement the deep learning-based industrial part defect detection method of the present invention, but the implementation apparatus of the method includes, and is not limited to, the structure of the system listed in this embodiment; all structural variations and replacements of the prior art made according to the principles of the present invention fall within the scope of protection of the present invention.

In summary, compared with the prior art, the deep learning-based industrial part defect detection method, system, and electronic device of the present invention aim to solve the problem that long-range dependency information is difficult to capture when only a convolutional neural network is used. The proposed method introduces a Transformer and attention modules, improving segmentation accuracy. On the theoretical basis of a convolutional neural network and a Transformer, it detects surface defects of industrial parts, combining the advantages of both to improve segmentation accuracy, and adopts a parallel-branch design that ensures the convergence speed when training the deep neural network and meets the time requirements of inference testing. The present invention therefore effectively overcomes various shortcomings of the prior art and has high industrial application value.

The above embodiments merely illustrate the principles and effects of the present invention and are not intended to limit it. Anyone skilled in the art may modify or change the above embodiments without departing from the spirit and scope of the present invention. Therefore, all equivalent modifications or changes made by those with ordinary knowledge in the technical field without departing from the spirit and technical ideas disclosed by the present invention shall still be covered by the claims of the present invention.

Claims (10)

1. A deep learning-based industrial part defect detection method, characterized by comprising the following steps:
acquiring a data set; the dataset comprising a target surface defect image of an industrial part;
constructing a deep neural network for detecting surface defects of a target industrial part; the deep neural network comprises: a fusion module, a Transformer branch, a CNN branch and a decoder; the Transformer branch, the CNN branch and the decoder are all connected with the fusion module; the fusion module is used for fusing a first result output by the Transformer branch and a second result output by the CNN branch, the decoder is used for decoding a third result output by the fusion module, and the output of the decoder is used as the output of the deep neural network;
and training the deep neural network by using the data set, and acquiring a trained target neural network so as to detect surface defects of the target industrial part based on the target neural network.
2. The deep learning based industrial part defect detection method of claim 1, wherein the acquiring the data set comprises the steps of:
acquiring an original surface defect image of an industrial part;
and preprocessing the original surface defect image to obtain a target surface defect image.
3. The deep learning based industrial part defect detection method of claim 1, wherein the fusion module is further configured to enhance a fourth result generated after the first result and the second result are fused, generate a fifth result, and splice the fifth result, the first result, and the second result.
4. The deep learning-based industrial part defect detection method of claim 1, wherein the number of fusion modules is three; the decoder comprises: a first attention module, a first convolution layer, a second attention module, a second convolution layer and a segmentation head; the process of decoding the third result by the decoder comprises the following steps:
inputting a third result output by one fusion module and a third result output by another fusion module into the first attention module;
inputting a sixth result output by the first attention module into the first convolution layer to obtain a seventh result;
inputting the seventh result and the third result output by the remaining fusion module to the second attention module;
inputting an eighth result output by the second attention module to the second convolution layer to obtain a ninth result;
inputting the ninth result to the segmentation head; the output of the segmentation head is taken as the output of the decoder.
5. The deep learning based industrial part defect detection method of claim 4, wherein the segmentation head comprises: a third convolution layer and a bilinear interpolation layer; in the process of inputting the ninth result to the segmentation head, the ninth result is first input to the third convolution layer, and the output of the third convolution layer is then restored to the original resolution through the bilinear interpolation layer.
6. The deep learning based industrial part defect detection method of claim 1, wherein the training the deep neural network with the data set comprises the steps of:
selecting a training image from the dataset;
inputting the training image to the deep neural network to train the deep neural network;
in the training process of the deep neural network, the deep neural network is trained by minimizing a loss function.
7. The deep learning based industrial part defect detection method of claim 1, wherein, in training the deep neural network, the deep neural network is trained iteratively;
the training of the deep neural network using the data set further comprises the following step: evaluating the deep neural network using intersection-over-union (IoU) scores and/or Dice scores.
8. An industrial part defect detection system based on deep learning, comprising:
the acquisition module is used for acquiring the data set; the dataset comprising a target surface defect image of an industrial part;
the construction module is used for constructing a deep neural network for detecting the surface defects of the target industrial part; the deep neural network comprises: a fusion module, a Transformer branch, a CNN branch and a decoder; the Transformer branch, the CNN branch and the decoder are all connected with the fusion module; the fusion module is used for fusing a first result output by the Transformer branch and a second result output by the CNN branch, the decoder is used for decoding a third result output by the fusion module, and the output of the decoder is used as the output of the deep neural network;
and the training module is used for training the deep neural network by utilizing the data set, and acquiring a trained target neural network so as to detect the surface defects of the target industrial part based on the target neural network.
9. A storage medium having stored thereon a computer program, which when executed by a processor, implements the deep learning-based industrial part defect detection method of any one of claims 1 to 7.
10. An electronic device, comprising: a processor and a memory;
the memory is used for storing a computer program;
the processor is configured to execute the computer program stored in the memory, so that the electronic device performs the deep learning-based industrial part defect detection method according to any one of claims 1 to 7.
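Claim 7 evaluates the trained network with intersection-over-union (IoU, the "cross-ratio" score in the machine translation) and Dice scores. The patent does not give formulas, so the following is a minimal NumPy sketch of the standard definitions of both metrics on binary segmentation masks; all names are illustrative:

```python
import numpy as np

def iou_score(pred, target, eps=1e-7):
    """Intersection over union: |A ∩ B| / |A ∪ B| for binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return (inter + eps) / (union + eps)

def dice_score(pred, target, eps=1e-7):
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|) for binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2 * inter + eps) / (pred.sum() + target.sum() + eps)

# Toy predicted and ground-truth defect masks:
pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
gt   = np.array([[1, 0, 0],
                 [0, 1, 1]])
print(round(float(iou_score(pred, gt)), 3))   # 2/4 -> 0.5
print(round(float(dice_score(pred, gt)), 3))  # 4/6 -> 0.667
```

The small epsilon keeps both scores defined when prediction and ground truth are both empty, a common convention in segmentation evaluation.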
CN202310109015.0A 2023-02-13 2023-02-13 Deep learning-based industrial part defect detection method and system and electronic equipment Pending CN116051532A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310109015.0A CN116051532A (en) 2023-02-13 2023-02-13 Deep learning-based industrial part defect detection method and system and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310109015.0A CN116051532A (en) 2023-02-13 2023-02-13 Deep learning-based industrial part defect detection method and system and electronic equipment

Publications (1)

Publication Number Publication Date
CN116051532A true CN116051532A (en) 2023-05-02

Family

ID=86121904

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310109015.0A Pending CN116051532A (en) 2023-02-13 2023-02-13 Deep learning-based industrial part defect detection method and system and electronic equipment

Country Status (1)

Country Link
CN (1) CN116051532A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117274253A (en) * 2023-11-20 2023-12-22 华侨大学 Part detection method and device based on multimode transducer and readable medium
CN117274253B (en) * 2023-11-20 2024-02-27 华侨大学 Part detection method and device based on multimode transducer and readable medium
CN119180813A (en) * 2024-11-15 2024-12-24 苏州元脑智能科技有限公司 Welding defect detection method, device, storage medium and program product

Similar Documents

Publication Publication Date Title
CN113850825A (en) Remote sensing image road segmentation method based on context information and multi-scale feature fusion
CN111768388B (en) A product surface defect detection method and system based on positive sample reference
CN110147763A (en) Video semanteme dividing method based on convolutional neural networks
CN116051532A (en) Deep learning-based industrial part defect detection method and system and electronic equipment
CN114066964A (en) Aquatic product real-time size detection method based on deep learning
CN114612477A (en) A lightweight image segmentation method, system, medium, terminal and application
Han et al. L-Net: lightweight and fast object detector-based ShuffleNetV2
CN116310850B (en) Remote sensing image target detection method based on improved RetinaNet
CN114049356A (en) Method, device and system for detecting apparent cracks in structures
CN117173412A (en) Medical image segmentation method based on CNN and Transformer fusion network
CN115908772A (en) Target detection method and system based on Transformer and fusion attention mechanism
CN111833282A (en) An Image Fusion Method Based on Improved DDcGAN Model
CN116912708A (en) Remote sensing image building extraction method based on deep learning
CN116468663A (en) A method for surface micro-defect detection based on improved YOLOv5
CN116310305A (en) An encoder-decoder structured semantic segmentation model based on tensor and second-order covariance attention mechanism
CN114596503A (en) Road extraction method based on remote sensing satellite image
CN114882368A (en) Non-equilibrium hyperspectral image classification method
Zhang et al. An industrial interference-resistant gear defect detection method through improved YOLOv5 network using attention mechanism and feature fusion
CN112215301B (en) Image straight line detection method based on convolutional neural network
Wu et al. Meta transfer learning-based super-resolution infrared imaging
CN118015413A (en) Target detection method, device, system and electronic equipment
CN114677349B (en) Image segmentation method and system for enhancing edge information of encoding and decoding end and guiding attention
CN115861713A (en) Carotid plaque ultrasonic image processing method based on multitask learning
Lu et al. Crack _ PSTU: Crack detection based on the U-Net framework combined with Swin Transformer
CN113538278B (en) Depth map completion method based on deformable convolution

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination