CN114723833A - Improved YOLOV5-based deep learning wafer solder joint detection method - Google Patents

Improved YOLOV5-based deep learning wafer solder joint detection method

Info

Publication number
CN114723833A
CN114723833A (application CN202210362924.0A)
Authority
CN
China
Prior art keywords
module
network
wafer
welding spot
attention mechanism
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210362924.0A
Other languages
Chinese (zh)
Inventor
许江杰
邹艳丽
谭宇飞
余自淳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangxi Normal University
Original Assignee
Guangxi Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangxi Normal University filed Critical Guangxi Normal University
Priority to CN202210362924.0A priority Critical patent/CN114723833A/en
Publication of CN114723833A publication Critical patent/CN114723833A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0004 Industrial image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/90 Determination of colour characteristics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20112 Image segmentation details
    • G06T2207/20132 Image cropping
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30108 Industrial image inspection
    • G06T2207/30148 Semiconductor; IC; Wafer
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Quality & Reliability (AREA)
  • Testing Or Measuring Of Semiconductors Or The Like (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a deep learning wafer solder joint detection method based on an improved YOLOV5, which comprises the following steps: 1) producing a wafer solder joint data set; 2) constructing the attention mechanism module CCANET; 3) constructing a YOLOV5 network integrated with the attention mechanism; 4) introducing the Ghost module; 5) training the improved YOLOV5 network. The invention improves the accuracy of wafer solder joint detection while using fewer network parameters; under the same conditions it not only detects more wafer solder joints but also detects occluded solder joints.

Description

Improved YOLOV5-based deep learning wafer solder joint detection method
Technical Field
The invention relates to the technical field of target detection, and in particular to a deep learning wafer solder joint detection method based on an improved YOLOV5.
Background
As chips grow more complex, the number of modules and functions integrated in a chip keeps increasing, and effective wafer testing accounts for an ever larger share of the overall chip design effort. Wafer testing is also one of the most important ways of measuring chip yield; raising the yield greatly reduces manufacturing losses in industrial production and improves production efficiency. Wafer testing therefore has great strategic importance in chip manufacturing as a whole.
In wafer testing, every die on the wafer is probe-tested: a probe on the test head is brought into contact with a solder joint (pad), and the capacitance and other electrical properties of the wafer are then measured electrically. The alignment between probe and solder joint is still mostly done manually. Because a solder joint measures about 40 x 50 microns or less, the operator has to feel for the pad during manual alignment: if the probe tip touches the pad surface too lightly, the test data are inaccurate; if it presses too hard, the microscopic circuitry on the wafer is damaged. In addition, existing wafer testing machines are not very stable and have a high misjudgment rate, which has long been one of the main problems troubling wafer production plants. In actual industrial production the detection rate of a wafer testing machine is about 40% and the manual test rate about 30%; one of the main causes of this low detection efficiency is misalignment between the probe and the wafer solder joint. The traditional approach of checking alignment manually cannot meet the industrial requirements for high accuracy and real-time operation. The rapid development of machine vision has changed industrial inspection, and machine vision methods are now used to identify and locate wafer solder joints. By detecting the solder joints, the alignment state can be monitored in real time and corrected whenever an abnormality occurs.
Deep learning is widely used in target detection and provides new technical support for wafer solder joint detection. At present the two mainstream families of target detection algorithms are two-stage detectors and single-stage (one-stage) detectors. A two-stage detector treats object detection as a classification problem: it first generates regions that may contain objects and then classifies and refines these candidate regions to obtain the final result. A single-stage detector outputs the final detection result directly, without an explicit candidate-box generation step.
The most classical two-stage methods are the R-CNN series. Although they are accurate, their large number of parameters and low detection speed cannot meet real-time requirements. The YOLO family of single-stage detectors, in contrast, currently offers the best trade-off between accuracy and speed; in particular, the YOLOV5 algorithm proposed in 2020 uses many network design techniques to keep accuracy high while reducing the number of parameters. Nevertheless, because of the particular nature of the detection target, YOLOV5 still leaves room for optimization on the wafer solder joint detection task.
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art and provides a deep learning wafer solder joint detection method based on an improved YOLOV5. The method detects more wafer solder joints, including occluded ones, uses fewer network parameters, achieves high accuracy, and is therefore better suited to deployment in practical applications.
The technical scheme for realizing the purpose of the invention is as follows:
a deep learning wafer solder joint detection method based on improved YOLOV5 comprises the following steps:
1) producing a wafer solder joint data set: wafer solder joint images are collected and preprocessed to increase the number of samples and the picture quality of the data set; the data set is then labeled with rectangular boxes, solder joints that need to be aligned being named "Rig hole" and solder joints that do not need to be aligned being named "Wro hole"; finally a wafer solder joint data set containing 1464 images is produced and split into a training set and a validation set at a ratio of 9:1;
2) constructing an attention mechanism module CCANET: the method comprises the following steps:
2.1) first, a filter module consisting of a 1 x 1 convolution module, a BatchNorm module and a Sigmoid function is used to obtain a feature map X; by reducing the dimensionality of the data, cross-channel interaction of information is achieved while the amount of computation is reduced and information continuity is enhanced;
2.2) then a three-branch configuration is used:
2.2.1) the first branch feeds the feature map X into a color attention mechanism module and outputs feature map X1, as shown in equation (1):
X1 = X * (σ(MLP(Resize(AvgPool(X)))) + σ(MLP(Resize(AvgPool7(X)))) + σ(MLP(Resize(MaxPool(X))))) (1),
where MLP is a multilayer perceptron, Resize is feature-map size adjustment, AvgPool is global average pooling to feature-map size 1 x 1, AvgPool7 is global average pooling to feature-map size 7 x 7, MaxPool is global max pooling to feature-map size 1 x 1, and σ is the Sigmoid activation function;
2.2.2) the second branch performs channel addition of the feature map X and the output X1 of the first branch to obtain feature map X2, as shown in equation (2):
X2 = X ⊕ X1 (2),
where ⊕ denotes channel-wise addition;
2.2.3) the feature map X2 output by step 2.2.2) is input to the position attention mechanism module, which outputs feature map X3, as shown in equation (3):
X3 = Ms(X2) (3),
where Ms denotes the operation of the position attention mechanism;
2.2.4) the third branch multiplies the feature map X by the feature map X3 to obtain feature map X4, as shown in equation (4):
X4 = X ⊗ X3 (4),
where ⊗ denotes element-wise multiplication;
2.3) the feature map X4 is finally output, completing the attention mechanism module CCANET;
3) constructing a YOLOV5 network integrated with attention mechanism: the method comprises the following steps:
3.1) introducing the attention mechanism: the original YOLOV5 network consists of a feature extraction layer, a backbone layer and a detection layer, and the feature extraction layer and the backbone layer of the original network are optimized. First, the CCANET module constructed in step 2) is added after the first C3 module, the second C3 module and the SPP module of the feature extraction layer of the YOLOV5 network; then the feature map output by each CCANET module is channel-added with the feature map output by the backbone layer of the original YOLOV5 network to obtain a new feature extraction layer;
3.2) the number of output channels of each CCANET module is adaptively scaled with a scaling ratio σ = 1/2, as expressed in formula (5):
XC = Y * σ (5),
where XC is the number of output channels and Y is the number of input channels; scaling the output feature maps improves network efficiency and raises accuracy while reducing the number of parameters;
4) introducing the Ghost module: the attention-enhanced YOLOV5 network formed in step 3) is further optimized by introducing the Ghost module: the first basic convolution module Conv after each of the three feature-map concatenation modules Concat in the network's backbone layer is replaced with a Ghost module, yielding the final improved YOLOV5 convolutional neural network;
5) training the improved YOLOV5 network: first, the improved YOLOV5 network obtained in step 4) is trained on the wafer solder joint data set produced in step 1) to obtain a wafer solder joint detection network; then the wafer solder joint picture or video to be tested is fed into the detection network to obtain the positions and the number of the solder joints; finally, the accuracy of the detection network and the network model parameters are evaluated.
The preprocessing in step 1) consists of image cropping and image enhancement. The cropping takes into account that the actual solder joints occupy only a small part of the whole picture; training on the full frame would lengthen training and reduce network accuracy. The original 6112 x 3440 picture is therefore cropped to a 2030 x 1500 picture containing only the wafer solder joints, which reduces redundant background information and enlarges the small-target solder joints into medium-sized targets, improving training accuracy and shortening training time. The image enhancement addresses pictures captured under different illumination intensities, i.e. the brightness of darker solder joint pictures is increased.
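By way of illustration, the following Python sketch shows one way the cropping and brightening described above could be carried out with OpenCV. The file names, the crop origin and the brightness gain are placeholders chosen for the example and are not specified by the patent.

    import cv2
    import numpy as np

    def crop_solder_region(img: np.ndarray, x0: int, y0: int, w: int = 2030, h: int = 1500) -> np.ndarray:
        # Cut the region containing the wafer solder joints out of the full 6112 x 3440 frame.
        return img[y0:y0 + h, x0:x0 + w]

    def enhance_brightness(img: np.ndarray, gain: float = 1.4, bias: int = 10) -> np.ndarray:
        # Simple linear brightening for pictures captured under weak illumination.
        return cv2.convertScaleAbs(img, alpha=gain, beta=bias)

    image = cv2.imread("wafer_pad_raw.png")            # hypothetical full-resolution capture
    roi = crop_solder_region(image, x0=2000, y0=900)   # crop origin chosen per image in practice
    if roi.mean() < 80:                                # brighten only the darker samples
        roi = enhance_brightness(roi)
    cv2.imwrite("wafer_pad_cropped.png", roi)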
The color attention mechanism module described in step 2.2.1) comprises the following processes:
2.2.1.1) the training set pictures are input to the network in RGB three-channel form; because wafer solder joint targets show distinct color differences, channel feature information is enhanced so that the color features are strengthened and the network's attention is focused on them;
2.2.1.2) first, three pooling operations of different scales, 1 x 1 global average pooling, 7 x 7 global average pooling and 1 x 1 global max pooling, are applied to the feature map to extract channel features from different structures, enriching the structural information of the feature map and reducing information loss; the three pooled feature maps are then normalized by a Resize module so that they share the same size, which facilitates their subsequent fusion; the Resize module applies adaptive scaling of the feature-map size, and the scaled size is 1 x 1; finally, the three scaled feature maps are each fed into a shared network consisting of a multilayer perceptron, and the outputs are added to give the feature vector.
In step 3), the CCANET modules constructed in step 2) are added so that, while the network is deepened, the number of output channels of each module is adaptively scaled to half the number of input channels; scaling the attention mechanism modules improves the efficiency of feature extraction to a certain extent and raises network accuracy while reducing the number of parameters.
In step 2) the color attention module is embedded in the position attention module. Because of the particular nature of the target detection task, a color attention module and a position attention module are used to describe "what" the task target is and "where" it is in a picture, producing an attention mechanism module better suited to this target detection sub-task. Embedding the color attention module in the position attention module connects the two through a nested structure, which reduces redundant computation and strengthens the correlation and dependency between the two kinds of features.
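As an illustration of this nested structure, the following PyTorch sketch implements equations (1) to (5): a color attention branch with three pooling scales and a shared MLP, a residual channel addition, and a position attention map that re-weights the filtered feature map. The exact form of the position attention Ms (here a CBAM-style spatial map), the MLP reduction ratio and the use of element-wise operators for ⊕ and ⊗ are assumptions made for the sketch, not the patented implementation.

    import torch
    import torch.nn as nn

    class ColorAttention(nn.Module):
        # Channel ("color") attention of step 2.2.1): three pooling branches and a shared MLP, eq. (1).
        def __init__(self, channels: int, reduction: int = 4):
            super().__init__()
            self.avg1 = nn.AdaptiveAvgPool2d(1)      # global average pooling to 1 x 1
            self.avg7 = nn.AdaptiveAvgPool2d(7)      # average pooling to 7 x 7
            self.max1 = nn.AdaptiveMaxPool2d(1)      # global max pooling to 1 x 1
            self.resize = nn.AdaptiveAvgPool2d(1)    # Resize: bring every branch back to 1 x 1
            self.mlp = nn.Sequential(                # shared multilayer perceptron
                nn.Conv2d(channels, channels // reduction, 1, bias=False),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // reduction, channels, 1, bias=False),
            )

        def forward(self, x):
            w = (torch.sigmoid(self.mlp(self.resize(self.avg1(x))))
                 + torch.sigmoid(self.mlp(self.resize(self.avg7(x))))
                 + torch.sigmoid(self.mlp(self.resize(self.max1(x)))))
            return x * w                             # X1 = X * (sum of the three branch weights)

    class PositionAttention(nn.Module):
        # Ms of eq. (3): a spatial attention map (CBAM-style form assumed here).
        def __init__(self, kernel_size: int = 7):
            super().__init__()
            self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

        def forward(self, x):
            avg = x.mean(dim=1, keepdim=True)
            mx, _ = x.max(dim=1, keepdim=True)
            return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

    class CCANET(nn.Module):
        def __init__(self, in_channels: int):
            super().__init__()
            out_channels = in_channels // 2          # adaptive channel scaling, sigma = 1/2, eq. (5)
            self.filter = nn.Sequential(             # step 2.1): 1 x 1 conv + BatchNorm + Sigmoid
                nn.Conv2d(in_channels, out_channels, 1, bias=False),
                nn.BatchNorm2d(out_channels),
                nn.Sigmoid(),
            )
            self.color = ColorAttention(out_channels)
            self.position = PositionAttention()

        def forward(self, x):
            x = self.filter(x)                       # feature map X
            x1 = self.color(x)                       # first branch: color attention
            x2 = x + x1                              # second branch: residual channel addition, eq. (2)
            x3 = self.position(x2)                   # position attention map, eq. (3)
            return x * x3                            # third branch: X4 = X (*) X3, eq. (4)

    print(CCANET(128)(torch.randn(1, 128, 40, 40)).shape)   # torch.Size([1, 64, 40, 40])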
For the specific color and position characteristics of the wafer detection task, the technical scheme provides the color and position attention mechanism module CCANET, which improves the accuracy of the wafer solder joint detection task by enhancing channel information and position feature information;
according to the technical scheme, different semantic features are contained according to feature maps with different depths, firstly, a color attention mechanism is embedded into a position attention mechanism through a nested structure, the correlation between different feature information is enhanced, the two feature information is enhanced, and information loss is reduced; then, channel addition is carried out on the enhanced color characteristic information and the shallow information through an internal residual error structure, so that the fact that the global information added with an attention mechanism is not lost is guaranteed; finally, by means of the characteristic information, network attention is better focused on interested places, and the precision of a target task is improved;
according to the technical scheme, an attention mechanism layer is designed from three dimensions of the number, the position and the channel number, four CCANET modules are added to a feature extraction layer of a YOLOV5 network, then the CCANET modules are subjected to self-adaptive scaling, and output channels of the attention mechanism modules CCANET are subjected to self-adaptive scaling to be 1/2 of the input channel number. From the angle of network scaling, the relation between the number and the position of the attention mechanism layers and the relation between the number of the attention mechanism layers and the number of channels are found out when the attention mechanism layers are designed, and under the condition of limited network parameters, the target detection precision is greatly improved;
according to the technical scheme, the Ghost module is introduced, the conv of the common convolution layer in the Neck layer in the original network is replaced by the Ghost module, more redundancy characteristics are generated through less calculated amount, and network parameters are reduced.
The invention improves the efficiency of accurate wafer chip alignment and inspection in actual industrial production. At the same time, detection of the wafer solder joints helps to judge the alignment between the solder joints and the probes.
The method detects more wafer solder joints, including occluded ones, uses fewer network parameters, achieves high accuracy, and is therefore better suited to deployment in practical applications.
Description of the drawings:
FIG. 1 is a schematic flow chart of an exemplary method;
FIG. 2 is a schematic diagram of a CCANET structure of the attention mechanism module in the embodiment;
FIG. 3 is a schematic structural diagram of a color attention mechanism module in an embodiment;
FIG. 4 is a schematic diagram of the feature extraction layer structure of the improved YOLOV5 network in the embodiment;
FIG. 5 is a schematic diagram of the Neck layer structure of the improved YOLOV5 network in the embodiment;
FIG. 6 is a diagram showing the results of detection in the examples;
FIG. 7 is a diagram showing the results of detection in the examples;
FIG. 8 is a diagram showing the detection results in the example.
Detailed description of the embodiments:
the invention will be further elucidated below by reference to the drawings and examples, without being limited thereto.
Example:
referring to fig. 1, a deep learning wafer solder joint inspection method based on improved YOLOV5 includes the following steps:
1) producing a wafer solder joint data set: wafer solder joint images are collected and preprocessed to increase the number of samples and the picture quality of the data set; the data set is then labeled with rectangular boxes, solder joints that need to be aligned being named "Rig hole" and solder joints that do not need to be aligned being named "Wro hole"; finally a wafer solder joint data set containing 1464 images is produced and split into a training set and a validation set at a ratio of 9:1;
2) constructing an attention mechanism module CCANET: attention mechanism module CCANET is shown in fig. 2 and comprises:
2.1) first, a filter module consisting of a 1 x 1 convolution module, a BatchNorm module and a Sigmoid function is used to obtain a feature map X; by reducing the dimensionality of the data, cross-channel interaction of information is achieved while the amount of computation is reduced and information continuity is enhanced;
2.2) then a three-branch configuration is used:
2.2.1) the first branch feeds the feature map X into the color attention mechanism module, which is shown in FIG. 3, and outputs feature map X1, as shown in equation (1):
X1 = X * (σ(MLP(Resize(AvgPool(X)))) + σ(MLP(Resize(AvgPool7(X)))) + σ(MLP(Resize(MaxPool(X))))) (1),
where MLP is a multilayer perceptron, Resize is feature-map size adjustment, AvgPool is global average pooling to feature-map size 1 x 1, AvgPool7 is global average pooling to feature-map size 7 x 7, MaxPool is global max pooling to feature-map size 1 x 1, and σ is the Sigmoid activation function;
2.2.2) the second branch performs channel addition of the feature map X and the output X1 of the first branch to obtain feature map X2, as shown in equation (2):
X2 = X ⊕ X1 (2),
where ⊕ denotes channel-wise addition;
2.2.3) the feature map X2 output by step 2.2.2) is input to the position attention mechanism module, which outputs feature map X3, as shown in equation (3):
X3 = Ms(X2) (3),
where Ms denotes the operation of the position attention mechanism;
2.2.4) the third branch multiplies the feature map X by the feature map X3 to obtain feature map X4, as shown in equation (4):
X4 = X ⊗ X3 (4),
where ⊗ denotes element-wise multiplication;
2.3) the feature map X4 is finally output, completing the attention mechanism module CCANET;
3) constructing a YOLOV5 network integrated with attention mechanism: the method comprises the following steps:
3.1) introducing the attention mechanism: as shown in FIG. 4, the original YOLOV5 network consists of a feature extraction layer, a backbone layer and a detection layer, and the feature extraction layer and the backbone layer of the original network are optimized. First, the CCANET module constructed in step 2) is added after the first C3 module, the second C3 module and the SPP module of the feature extraction layer of the YOLOV5 network; then the feature map output by each CCANET module is channel-added with the feature map output by the backbone layer of the original YOLOV5 network to obtain a new feature extraction layer;
3.2) the number of output channels of each CCANET module is adaptively scaled with a scaling ratio σ = 1/2, as expressed in formula (5):
XC = Y * σ (5),
where XC is the number of output channels and Y is the number of input channels; scaling the output feature maps improves network efficiency and raises accuracy while reducing the number of parameters;
4) introducing the Ghost module: the attention-enhanced YOLOV5 network formed in step 3) is further optimized by introducing the Ghost module: the first basic convolution module Conv after each of the three feature-map concatenation modules Concat in the network's backbone layer is replaced with a Ghost module, yielding the final improved YOLOV5 convolutional neural network;
5) training the improved YOLOV5 network: as shown in FIG. 4 and FIG. 5, the improved YOLOV5 network obtained in step 4) is first trained on the wafer solder joint data set produced in step 1) to obtain a wafer solder joint detection network; then the wafer solder joint picture or video to be tested is fed into the detection network to obtain the positions and the number of the solder joints; finally, the accuracy of the detection network and the network model parameters are evaluated.
The preprocessing in step 1) consists of image cropping and image enhancement. The cropping takes into account that the actual solder joints occupy only a small part of the whole picture; training on the full frame would lengthen training and reduce network accuracy. The original 6112 x 3440 picture is therefore cropped to a 2030 x 1500 picture containing only the wafer solder joints, which reduces redundant background information and enlarges the small-target solder joints into medium-sized targets, improving training accuracy and shortening training time. The image enhancement addresses pictures captured under different illumination intensities, i.e. the brightness of darker solder joint pictures is increased.
The color attention mechanism module described in step 2.2.1) comprises the following processes:
2.2.1.1) the training set pictures are input to the network in RGB three-channel form; because wafer solder joint targets show distinct color differences, channel feature information is enhanced so that the color features are strengthened and the network's attention is focused on them;
2.2.1.2) first, three pooling operations of different scales, 1 x 1 global average pooling, 7 x 7 global average pooling and 1 x 1 global max pooling, are applied to the feature map to extract channel features from different structures, enriching the structural information of the feature map and reducing information loss; the three pooled feature maps are then normalized by a Resize module so that they share the same size, which facilitates their subsequent fusion; the Resize module applies adaptive scaling of the feature-map size, and the scaled size is 1 x 1; finally, the three scaled feature maps are each fed into a shared network consisting of a multilayer perceptron, and the outputs are added to give the feature vector.
In step 3), the CCANET modules constructed in step 2) are added so that, while the network is deepened, the number of output channels of each module is adaptively scaled to half the number of input channels; scaling the attention mechanism modules improves the efficiency of feature extraction to a certain extent and raises network accuracy while reducing the number of parameters.
In step 2) the color attention module is embedded in the position attention module. Because of the particular nature of the target detection task, a color attention module and a position attention module are used to describe "what" the task target is and "where" it is in a picture, producing an attention mechanism module better suited to this target detection sub-task. Embedding the color attention module in the position attention module connects the two through a nested structure, which reduces redundant computation and strengthens the correlation and dependency between the two kinds of features.
In this example the experimental environment is configured as shown in Table 1. The experiments are written in Python 3.8 with the PyTorch 1.9 deep learning framework. Training runs for 500 epochs with a mini-batch size of 32, an initial learning rate of 0.01, a weight decay coefficient of 0.0005 and a momentum coefficient of 0.9. The experimental data set is the collected wafer solder joint data set, in which the solder joints are divided into two classes, Rig and Wro; the data set contains 1464 pictures, and the ratio of training set to validation set is 9:1. The preselected (anchor) boxes used during training are obtained with a k-means clustering algorithm, giving nine boxes, namely (10,13), (16,30), (33,23), (30,61), (62,45), (59,119), (116,90), (156,198) and (373,326), which improves the generalization ability of the model.
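The anchor clustering mentioned above could be reproduced roughly as in the following sketch, which clusters the labelled box widths and heights with plain Euclidean k-means via scikit-learn; YOLOv5's own script additionally uses an IoU-based fitness and genetic refinement, so this is only an approximation, and the label file name is hypothetical.

    import numpy as np
    from sklearn.cluster import KMeans

    def cluster_anchors(wh: np.ndarray, k: int = 9) -> np.ndarray:
        # wh: (N, 2) array of labelled box widths and heights in pixels; returns k (w, h) anchors.
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(wh)
        anchors = km.cluster_centers_
        return anchors[np.argsort(anchors.prod(axis=1))]   # sort by box area, small to large

    # Hypothetical usage with box sizes parsed from the YOLO label files:
    # wh = np.loadtxt("train_box_wh.txt")                  # one "width height" pair per solder joint
    # print(cluster_anchors(wh).round().astype(int))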
TABLE 1
Name | Configuration
Operating system | Windows Server 2016 Standard
CPU | Intel(R) Xeon(R) Gold 6152 CPU @ 2.10GHz
GPU | NVIDIA Tesla P40 24GB
GPU acceleration library | CUDA Version: 10.2
Experimental results and analysis:
the testing evaluation index adopts rig (AP), Wro (AP), F1 and the comprehensive precision mean mAP @0.5 to measure the accuracy of the detection algorithm, and then adopts network parameters (parameters) and model size to comprehensively evaluate the size of the network model.
Table 2 reports the ablation of the CCANET and Ghost modules on the YOLOV5 algorithm under the above evaluation indexes. As can be seen from Table 2, the original YOLOV5 algorithm achieves Rig(AP) of 75.4%, Wro(AP) of 89.2%, average precision mAP@0.5 of 82.3%, 7.3M parameters, F1 of 78% and a model size of 14.4M. After the Ghost module is added, Rig(AP) drops by 6.8%, Wro(AP) rises by 0.6%, mAP@0.5 drops by 3.1%, the parameters decrease by 1.05M, F1 drops by 1% and the model size decreases by 2.3M. After both the Ghost module and the CCANET module are added, Rig(AP) rises by 7.9%, Wro(AP) rises by 5%, mAP@0.5 rises by 6.5%, the parameters decrease by 0.6M, F1 rises by 7% and the model size decreases by 1.6M. These experiments show that adding the Ghost module alone reduces the parameters and the model size but also lowers the average and per-class precision, whereas combining the two modules gives the best result: all precision indexes improve markedly while the parameter count and model size shrink, so the combined improved algorithm achieves a better balance between model parameters and detection accuracy.
Table 2: ablation experiments with different modules
Figure BDA0003585880280000091
Table 3 compares the improved YOLOV5 algorithm with other network algorithms under the above evaluation indexes. As can be seen from Table 3, the original YOLOV5 algorithm achieves Rig(AP) of 75.4%, Wro(AP) of 89.2%, average precision mAP@0.5 of 82.3%, 7.3M parameters, F1 of 78% and a model size of 14.4M; the YOLOV3-spp algorithm achieves Rig(AP) of 61.3%, Wro(AP) of 92.4%, mAP@0.5 of 76.8%, 62.5M parameters, F1 of 78% and a model size of 119M; the improved YOLOV5 algorithm achieves Rig(AP) of 83.3%, Wro(AP) of 94.2%, mAP@0.5 of 88.8%, 6.7M parameters, F1 of 85% and a model size of 12.8M.
TABLE 3: comparative experiments with other algorithms (table provided as an image in the original filing)
As shown in FIG. 6, FIG. 7 and FIG. 8, the method of this embodiment is robust in wafer solder joint detection, maintains high detection accuracy with a smaller model, and is therefore better suited to deployment in practical industrial applications to assist real-time monitoring of wafer alignment. However, when the illumination on the solder joint surface is too weak, accuracy still has room for improvement; future work will further improve detection precision under different illumination intensities.
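For completeness, the following sketch shows how a trained weight file could be used for the real-time check described above, assuming the improved modules are registered in the YOLOv5 code base used for training and that the weights were saved as best.pt; the paths and the confidence threshold are placeholders.

    import torch

    # Load the trained detector through the standard YOLOv5 hub interface (assumed setup).
    model = torch.hub.load("ultralytics/yolov5", "custom", path="runs/train/exp/weights/best.pt")
    model.conf = 0.25                                # confidence threshold for pad detections

    results = model("wafer_pad_cropped.png")         # a solder joint picture (or a video frame)
    detections = results.pandas().xyxy[0]            # one row per detected Rig hole / Wro hole
    print(len(detections), "solder joints detected")
    print(detections[["name", "confidence"]])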

Claims (5)

1. A deep learning wafer solder joint detection method based on improved YOLOV5 is characterized by comprising the following steps:
1) producing a wafer solder joint data set: wafer solder joint images are collected and preprocessed, the data set is labeled, solder joints that need to be aligned and solder joints that do not need to be aligned are marked with rectangular boxes and named "Rig hole" and "Wro hole" respectively, and finally a wafer solder joint data set containing 1464 images is produced and divided into a training set and a validation set at a ratio of 9:1;
2) constructing an attention mechanism module CCANET: the method comprises the following steps:
2.1) first, a filter module consisting of a 1 x 1 convolution, a BatchNorm module and a Sigmoid function is used to obtain a feature map X;
2.2) then a three-branch configuration is used:
2.2.1) the first branch feeds the feature map X into a color attention mechanism module and outputs feature map X1, as shown in equation (1):
X1 = X * (σ(MLP(Resize(AvgPool(X)))) + σ(MLP(Resize(AvgPool7(X)))) + σ(MLP(Resize(MaxPool(X))))) (1),
where MLP is a multilayer perceptron, Resize is feature-map size adjustment, AvgPool is global average pooling to feature-map size 1 x 1, AvgPool7 is global average pooling to feature-map size 7 x 7, MaxPool is global max pooling to feature-map size 1 x 1, and σ is the Sigmoid activation function;
2.2.2) the second branch performs channel addition of the feature map X and the output X1 of the first branch to obtain feature map X2, as shown in equation (2):
X2 = X ⊕ X1 (2),
where ⊕ denotes channel-wise addition;
2.2.3) the feature map X2 output by step 2.2.2) is input to the position attention mechanism module, which outputs feature map X3, as shown in equation (3):
X3 = Ms(X2) (3),
where Ms denotes the operation of the position attention mechanism;
2.2.4) the third branch multiplies the feature map X by the feature map X3 to obtain feature map X4, as shown in equation (4):
X4 = X ⊗ X3 (4),
where ⊗ denotes element-wise multiplication;
2.3) the feature map X4 is finally output, completing the attention mechanism module CCANET;
3) constructing a YOLOV5 network integrated with attention mechanism: the method comprises the following steps:
3.1) introducing the attention mechanism: the original YOLOV5 network consists of a feature extraction layer, a backbone layer and a detection layer, and the feature extraction layer and the backbone layer of the original network are optimized. First, the CCANET module constructed in step 2) is added after the first C3 module, the second C3 module and the SPP module of the feature extraction layer of the YOLOV5 network; then the feature map output by each CCANET module is channel-added with the feature map output by the backbone layer of the original YOLOV5 network to obtain a new feature extraction layer;
3.2) the number of output channels of each CCANET module is adaptively scaled with a scaling ratio σ = 1/2, as expressed in formula (5):
XC = Y * σ (5),
where XC is the number of output channels and Y is the number of input channels;
4) introducing the Ghost module: the attention-enhanced YOLOV5 network formed in step 3) is further optimized by introducing the Ghost module: the first basic convolution module Conv after each of the three feature-map concatenation modules Concat in the network's backbone layer is replaced with a Ghost module, yielding the final improved YOLOV5 convolutional neural network;
5) training the improved YOLOV5 network: first, the improved YOLOV5 network obtained in step 4) is trained on the wafer solder joint data set produced in step 1) to obtain a wafer solder joint detection network; then the wafer solder joint picture or video to be tested is fed into the detection network to obtain the positions and the number of the solder joints; finally, the accuracy of the detection network and the network model parameters are evaluated.
2. The improved YOLOV5-based deep learning wafer solder joint detection method according to claim 1, wherein the preprocessing in step 1) is image cropping and image enhancement, the image cropping crops the 6112 x 3440 original pictures to 2030 x 1500 pictures containing only wafer solder joints and enlarges the small-target solder joints into medium-sized targets, and the image enhancement enhances wafer solder joint pictures taken under different illumination intensities, i.e. increases the brightness of darker solder joint pictures.
3. The improved YOLOV5-based deep learning wafer solder joint detection method according to claim 1, wherein the color attention mechanism module of step 2.2.1) comprises the following processes:
2.2.1.1) the training set pictures are input to the network in RGB three-channel form, and channel feature information is enhanced in view of the distinct color differences of wafer solder joint targets, so that the color features are strengthened and the network's attention is focused on them;
2.2.1.2) first, three pooling operations of different scales, 1 x 1 global average pooling, 7 x 7 global average pooling and 1 x 1 global max pooling, are used to extract channel features from different structures; the three pooled feature maps are then normalized by a Resize module so that they share the same size, the Resize module applying adaptive scaling of the feature-map size with a scaled size of 1 x 1; finally, the three scaled feature maps are each fed into a shared network consisting of a multilayer perceptron, and the outputs are added to give the feature vector.
4. The improved YOLOV5-based deep learning wafer solder joint detection method according to claim 1, wherein the CCANET modules constructed in step 2) are added in step 3) and the number of output channels of each module is adaptively scaled to half the number of input channels.
5. The improved YOLOV5-based deep learning wafer solder joint detection method according to claim 1, wherein the color attention module of step 2) is embedded in the position attention module.
CN202210362924.0A 2022-04-08 2022-04-08 Improved YOLOV5-based deep learning wafer solder joint detection method Pending CN114723833A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210362924.0A CN114723833A (en) 2022-04-08 2022-04-08 Improved YOLOV5-based deep learning wafer solder joint detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210362924.0A CN114723833A (en) 2022-04-08 2022-04-08 Improved YOLOV5-based deep learning wafer solder joint detection method

Publications (1)

Publication Number Publication Date
CN114723833A true CN114723833A (en) 2022-07-08

Family

ID=82242284

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210362924.0A Pending CN114723833A (en) 2022-04-08 2022-04-08 Improved YOLOV5-based deep learning wafer solder joint detection method

Country Status (1)

Country Link
CN (1) CN114723833A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116183623A (en) * 2023-01-04 2023-05-30 天津大学 Intelligent wafer surface defect detection method and device
CN116183623B (en) * 2023-01-04 2024-04-26 天津大学 Intelligent wafer surface defect detection method and device
CN115953408A (en) * 2023-03-15 2023-04-11 国网江西省电力有限公司电力科学研究院 YOLOv 7-based lightning arrester surface defect detection method
CN117788418A (en) * 2023-12-27 2024-03-29 国网山东省电力公司潍坊供电公司 Real-time detection and diagnosis method and device for thermal state of electrical equipment

Similar Documents

Publication Publication Date Title
US20230333533A1 (en) System and method for generating machine learning model with trace data
CN111179251B (en) Defect detection system and method based on twin neural network and by utilizing template comparison
CN108961235B (en) Defective insulator identification method based on YOLOv3 network and particle filter algorithm
CN114723833A (en) Improved YOLOV 5-based deep learning wafer solder joint detection method
CN111080693A (en) Robot autonomous classification grabbing method based on YOLOv3
CN109919908B (en) Method and device for detecting defects of light-emitting diode chip
CN112037219A (en) Metal surface defect detection method based on two-stage convolution neural network
CN115439458A (en) Industrial image defect target detection algorithm based on depth map attention
CN112115879B (en) Self-supervision pedestrian re-identification method and system with shielding sensitivity
CN116342894B (en) GIS infrared feature recognition system and method based on improved YOLOv5
CN112464983A (en) Small sample learning method for apple tree leaf disease image classification
CN113222982A (en) Wafer surface defect detection method and system based on improved YOLO network
CN113344852A (en) Target detection method and device for power scene general-purpose article and storage medium
CN116206112A (en) Remote sensing image semantic segmentation method based on multi-scale feature fusion and SAM
CN112819748A (en) Training method and device for strip steel surface defect recognition model
CN113780145A (en) Sperm morphology detection method, sperm morphology detection device, computer equipment and storage medium
CN113763364B (en) Image defect detection method based on convolutional neural network
Sun et al. Cascaded detection method for surface defects of lead frame based on high-resolution detection images
CN117523394A (en) SAR vessel detection method based on aggregation characteristic enhancement network
CN116934696A (en) Industrial PCB defect detection method and device based on YOLOv7-Tiny model improvement
CN116580289A (en) Fine granularity image recognition method based on attention
CN115830302A (en) Multi-scale feature extraction and fusion power distribution network equipment positioning identification method
CN114596273B (en) Intelligent detection method for multiple defects of ceramic substrate by using YOLOV4 network
Nafi et al. High accuracy swin transformers for image-based wafer map defect detection
CN115661042A (en) Hierarchical classification defect detection method based on attention mechanism guidance

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination