CN116363064A - Defect identification method and device integrating target detection model and image segmentation model


Info

Publication number
CN116363064A
Authority
CN
China
Prior art keywords
defect
equipment
model
target detection
detection model
Legal status
Pending
Application number
CN202310152888.XA
Other languages
Chinese (zh)
Inventor
袁烨
张永
兰儒恺
王茂霖
何志超
Current Assignee
Yuanshi Intelligent Technology Nantong Co ltd
Original Assignee
Yuanshi Intelligent Technology Nantong Co ltd
Application filed by Yuanshi Intelligent Technology Nantong Co ltd
Priority to CN202310152888.XA
Publication of CN116363064A

Classifications

    • G06T7/0004 Industrial image inspection
    • G06N3/0455 Auto-encoder networks; encoder-decoder networks
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06V10/26 Segmentation of patterns in the image field
    • G06V10/764 Recognition using machine-learning classification, e.g. of video objects
    • G06V10/766 Recognition using machine-learning regression
    • G06V10/774 Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V10/82 Recognition using neural networks
    • G06T2207/20081 Training; learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/30108 Industrial image inspection
    • G06T2207/30164 Workpiece; machine component
    • G06V2201/07 Target detection
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a defect identification method and device fusing a target detection model and an image segmentation model. The method comprises: acquiring a surface image of the equipment to be detected; inputting the surface image into a target detection model to obtain a defect detection result for the equipment; inputting the surface image into an image segmentation model to obtain a defect segmentation result for the equipment; and obtaining a defect identification result for the equipment according to the defect detection result and the defect segmentation result. The target detection model is trained on surface images of sample equipment and the corresponding defect detection labels, and the image segmentation model is trained on the same surface images and the corresponding defect segmentation labels. The invention uses computer vision to detect equipment flaws efficiently and accurately, reducing detection error, improving detection precision, and saving human resources.

Description

Defect identification method and device integrating target detection model and image segmentation model
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to a defect identification method and device fusing a target detection model and an image segmentation model.
Background
Gears are typical power-transmission components in the automotive field and play an important role in every automobile. If a gear is defective, the performance and service life of the vehicle are directly affected.
Although a number of specialized instruments for measuring gear parameters are available on the market, considerations of price and practicality mean that most manufacturers still inspect gears manually. Manual inspection consumes large amounts of human resources, is slow, is prone to false and missed detections caused by eye fatigue, and struggles to meet the demands of modern industrial mass production.
Disclosure of Invention
The invention provides a defect identification method and device integrating a target detection model and an image segmentation model, to overcome the low detection efficiency and low detection accuracy of manual defect detection in the prior art and to realize automatic, accurate defect detection.
The invention provides a defect identification method integrating a target detection model and an image segmentation model, which comprises the following steps:
Acquiring a surface image of equipment to be detected;
inputting the surface image of the equipment to be detected into a target detection model to obtain a defect detection result of the equipment to be detected;
inputting the surface image of the equipment to be detected into an image segmentation model to obtain a defect segmentation result of the equipment to be detected;
obtaining a defect identification result of the equipment to be detected according to the defect detection result and the defect segmentation result;
the target detection model is obtained by training according to the surface image of the sample equipment and the defect detection label of the sample equipment, and the image segmentation model is obtained by training according to the surface image of the sample equipment and the defect segmentation label of the sample equipment.
According to the defect identification method for fusing the target detection model and the image segmentation model, the training steps of the target detection model comprise:
acquiring a surface image of the sample device;
marking a corresponding defect segmentation label on the surface image of the sample equipment according to a plurality of preset defect types;
marking a corresponding defect detection label on the surface image of the sample equipment according to the defect segmentation label;
Constructing a defect detection data set according to the surface image of the sample equipment and the defect detection label of the sample equipment;
pre-training parameters of an original target detection model according to the defect detection data set to obtain pre-training parameters;
according to the pre-training parameters, performing iterative training on the parameters of the improved target detection model to obtain the target detection model;
the improved target detection model is obtained by improving an original backbone network of an original target detection model on the basis of the original target detection model; the improvement of the original backbone network comprises replacing at least one convolutional network layer in the original backbone network of the original target detection model with a deformable convolutional network layer, and adding a symmetrical network structure consisting of an encoder and a decoder in the original backbone network.
According to the defect identification method for fusing the target detection model and the image segmentation model provided by the invention, the performing of iterative training on the parameters of the improved target detection model according to the pre-training parameters to obtain the target detection model comprises the following steps:
initializing parameters of the improved target detection model according to the pre-training parameters;
Fixing parameters of other networks except the improved backbone network in the initialized improved target detection model, and training the parameters of the improved backbone network according to the defect detection data set until a first preset termination condition is met to obtain training parameters of the improved backbone network;
and training the initialized parameters of the improved target detection model according to the training parameters of the improved backbone network until a second preset termination condition is met, so as to obtain the target detection model.
According to the defect identification method for fusing the target detection model and the image segmentation model provided by the invention, the obtaining of the surface image of the sample equipment comprises the following steps:
acquiring an original surface image of the sample device;
preprocessing and image expansion are carried out on the original surface image to obtain a surface image of the sample equipment;
wherein the preprocessing includes one or more combinations of normalization processing, histogram equalization processing, and noise reduction processing; and the image augmentation includes one or more combinations of superimposing randomly generated background pictures, random cropping, and random flipping.
According to the defect identification method for fusing the target detection model and the image segmentation model provided by the invention, the loss function of the target detection model is constructed from a regression loss function and a classification loss function;
the regression loss function is constructed from the angle loss, distance loss, shape loss, and intersection-over-union (IoU) loss determined by the defect detection result of the sample equipment and the defect detection label of the sample equipment;
the classification loss function is constructed from the varifocal loss determined by the defect detection result of the sample equipment and the defect detection label of the sample equipment.
According to the defect identification method for fusing the target detection model and the image segmentation model, for the current iterative training, the current learning rate of the target detection model is calculated according to the first learning rate and/or the second learning rate;
the first learning rate is calculated according to the current iteration times, the maximum iteration times and the maximum learning rate;
the second learning rate is calculated according to a maximum learning rate, a minimum learning rate, the current iteration number and the maximum iteration number.
According to the defect identification method for fusing the target detection model and the image segmentation model provided by the invention, the defect identification result of the equipment to be detected is obtained according to the defect detection result and the defect segmentation result, and the method comprises the following steps:
performing non-maximum suppression (NMS) on the defect detection result, and acquiring, from the processed defect detection result, the defect detection frames whose intersection-over-union (IoU) is larger than a preset threshold; the defect detection frame is the bounding box containing a detected defect area;
acquiring, from the defect segmentation result, the defect segmentation frames whose IoU is larger than the preset threshold; the defect segmentation frame is the bounding box containing a segmented defect area;
and acquiring the defect identification result according to the confidence corresponding to the defect detection frame and the confidence corresponding to the defect segmentation frame.
The invention also provides a defect recognition device fusing the target detection model and the image segmentation model, which comprises:
the acquisition module is used for acquiring a surface image of the equipment to be detected;
the detection module is used for inputting the surface image of the equipment to be detected into a target detection model to obtain a defect detection result of the equipment to be detected;
The segmentation module is used for inputting the surface image of the equipment to be detected into an image segmentation model to obtain a defect segmentation result of the equipment to be detected;
the identification module is used for acquiring a defect identification result of the equipment to be detected according to the defect detection result and the defect segmentation result;
the target detection model is obtained by training according to the surface image of the sample equipment and the defect detection label of the sample equipment, and the image segmentation model is obtained by training according to the surface image of the sample equipment and the defect segmentation label of the sample equipment.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the defect identification method of fusing the target detection model and the image segmentation model according to any one of the above when executing the program.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a defect recognition method of fusing a target detection model and an image segmentation model as described in any one of the above.
The invention also provides a computer program product comprising a computer program which when executed by a processor implements a method of defect identification incorporating a target detection model and an image segmentation model as described in any one of the above.
According to the defect identification method and device fusing a target detection model and an image segmentation model, the acquired surface image of the equipment to be detected is input into the target detection model and the image segmentation model respectively, yielding the corresponding defect detection result and defect segmentation result for the equipment; the defect identification result is then obtained from these two results. Equipment defect detection is thus carried out efficiently and accurately with computer vision, reducing detection error, improving detection precision, and saving human resources.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a first schematic flow chart of the defect identification method fusing a target detection model and an image segmentation model provided by the present invention;
FIG. 2 is a second schematic flow chart of the defect identification method fusing a target detection model and an image segmentation model provided by the present invention;
FIG. 3 is a schematic view of the original image segmentation model in the present invention;
FIG. 4 is a graph comparing fusion model predictions provided by the present invention with predictions using a single method;
FIG. 5 is a schematic flow chart of a training method of the object detection model provided by the invention;
FIG. 6 is a schematic diagram of the structure of an improved object detection model provided by the present invention;
FIG. 7 is a PR curve comparison of the original target detection model and the target detection model with added deformable convolution provided by the present invention;
FIG. 8 is a PR curve comparison of an improved target detection model provided by the present invention and an original target detection model with only the addition of a deformable convolution;
FIG. 9 is a flow chart of a training method of an image segmentation model provided by the invention;
FIG. 10 is a flow chart of the iterative training method of the target detection model provided by the invention;
FIG. 11 is a schematic diagram of a defect recognition device for fusing a target detection model and an image segmentation model according to the present invention;
Fig. 12 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
For the purpose of making the objects, technical solutions, and advantages of the present invention more apparent, the technical solutions of the present invention will be described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently some, not all, embodiments of the present invention. All other embodiments obtained by one of ordinary skill in the art without inventive effort, based on the embodiments of the present invention, fall within the scope of the invention.
In recent years, with the continuous advance of intelligent manufacturing, AI+ (Artificial Intelligence Plus) industrial quality inspection has become an important application scenario in the field of industrial intelligence. The rapid development of machine vision brings new possibilities to intelligent manufacturing, and its efficiency and accuracy are increasingly applied to industrial quality inspection. How to combine machine vision with equipment flaw detection so as to raise production and processing efficiency and lower the rejection rate is key to advancing the entire equipment flaw detection industry.
At present, the machine vision industry is still monopolized by a few leading international enterprises. Some of these businesses hold more than 50% of the worldwide visual inspection market, all providing solutions based on core components and technology (operating systems, sensors, etc.). Other vendors' machine vision inspection schemes have accumulated some technology in recent years, but significant gaps remain compared with these international leaders. The gear flaw detection task is therefore of great significance for improving industrial quality inspection efficiency and guaranteeing product quality.
The invention aims to overcome the defects of low detection efficiency and low detection accuracy caused by manual defect detection, and provides a defect identification method for fusing a target detection model and an image segmentation model.
The defect recognition method of the fusion target detection model and the image segmentation model of the present invention is described below with reference to fig. 1.
Fig. 1 is a schematic flow chart of a defect identifying method according to the present embodiment, as shown in fig. 1, the method includes:
step 101, acquiring a surface image of equipment to be detected;
the device to be detected may be a device that needs to perform defect detection. By way of example, the device to be detected may be a gear, a bearing, etc., which is not particularly limited in this embodiment.
As shown in fig. 2, after the surface image of the device to be detected is acquired, detection model selection is performed on the acquired data, and the data is processed by a method of combining target detection and semantic segmentation. Specifically, selecting a target detection model to perform target detection on a surface image of the equipment to be detected, selecting an image segmentation model to perform semantic segmentation on the surface image of the equipment to be detected, respectively obtaining training results, and then fusing the target detection model and the image segmentation model to obtain a final result.
Step 102, inputting the surface image of the equipment to be detected into a target detection model to obtain a defect detection result of the equipment to be detected;
step 103, inputting the surface image of the equipment to be detected into an image segmentation model to obtain a defect segmentation result of the equipment to be detected;
the target detection model is obtained by training according to the surface image of the sample equipment and the defect detection label of the sample equipment, and the image segmentation model is obtained by training according to the surface image of the sample equipment and the defect segmentation label of the sample equipment.
Specifically, before step 102 is performed, the initial target detection model may be trained to obtain the target detection model. Specifically, an initial target detection model is firstly constructed, then a surface image of sample equipment and a defect detection label marked by the surface image of the sample equipment are collected, and the initial target detection model is trained to obtain the target detection model.
Wherein the initial target detection model comprises an original target detection model and/or an improved version of it; the original target detection model may be built on PP-YOLOE (an improved single-stage anchor-free target detection model based on the YOLO family).
The PP-YOLOE detection model is a high-performance, deployment-friendly industrial object detection model. It comprises the backbone network CSPRepResNet (a cross-stage-partial network built from RepResBlocks), the feature-fusion neck CSPPAN (Cross Stage Partial Path Aggregation Network), a lightweight ET-Head (Efficient Task-aligned Head), the improved dynamic label-assignment algorithm TAL (Task Alignment Learning), and other modules, with a series of model sizes designed for different application scenarios. The full PP-YOLOE series achieves a leading trade-off among precision, speed, and cost.
After training to obtain a target detection model, inputting a surface image of the equipment to be detected into the target detection model, and carrying out defect detection on the equipment to be detected through the target detection model to obtain a defect detection result of the equipment to be detected.
Alternatively, the surface image of the equipment to be detected may be input into the target detection model directly, or it may first undergo preprocessing to obtain a higher-quality image and thereby improve defect detection precision. Whether to preprocess the surface image may be set according to the actual scene, which is not specifically limited in this embodiment.
Similarly, the initial image segmentation model may be trained to obtain the image segmentation model. Specifically, an initial image segmentation model is first constructed; then surface images of the sample equipment and the corresponding defect segmentation labels marked on them are collected, and the initial image segmentation model is trained to obtain the image segmentation model.
As shown in fig. 3, a schematic diagram of the structure of the initial image segmentation model is shown. The initial image segmentation model is built on U-Net++ (a segmentation architecture based on nested and dense skip connections); the first convolutional layer of U-Net++ is replaced with a deformable convolution structure, yielding an improved U-Net++ model, i.e., the initial image segmentation model.
U-Net++ improves and innovates on the U-Net image semantic segmentation network. It comprises downsampling (hereinafter also referred to as Down-sampling), upsampling (hereinafter also referred to as Up-sampling), skip connections (hereinafter also referred to as Skip connection), and convolutional blocks (hereinafter also referred to as Convolution). On top of the encoder-decoder U-shaped structure of U-Net, convolution layers are placed on the skip paths to bridge the semantic gap between the encoder and decoder sub-networks; the resulting series of nested, dense skip paths forms a flexible network structure, improves gradient flow, and supports model pruning in cooperation with deep supervision.
After training to obtain an image segmentation model, inputting a surface image of the equipment to be detected into the image segmentation model, and carrying out defect segmentation on the equipment to be detected through the image segmentation model to obtain a defect segmentation result of the equipment to be detected.
Step 104, obtaining a defect identification result of the equipment to be detected according to the defect detection result and the defect segmentation result;
optionally, after the defect detection result of the device to be detected and the defect segmentation result of the device to be detected are obtained, the defect detection result and the defect segmentation result may be comprehensively considered, and the defect identification result of the device to be detected is obtained according to the comprehensive considered result.
Alternatively, comprehensively considering the defect detection result and the defect segmentation result may mean selecting, as the final defect identification result, whichever of the defect detection frame's confidence and the defect segmentation frame's confidence is larger; or the defect detection result and the defect segmentation result may be fused, with the fusion result used as the final defect identification result. The choice is made according to the actual scene and is not specifically limited in this embodiment.
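As an illustration of the maximum-confidence strategy, the following minimal Python sketch fuses the two sets of boxes. The (x1, y1, x2, y2, confidence) box format, the IoU threshold, and all function names are assumptions made for illustration, not details fixed by this embodiment:

```python
# Minimal sketch of max-confidence fusion of detection and segmentation boxes.
# Box format and names are illustrative assumptions, not the patent's code.
from typing import List, Tuple

Box = Tuple[float, float, float, float, float]  # x1, y1, x2, y2, confidence

def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / (union + 1e-9)

def fuse(det_boxes: List[Box], seg_boxes: List[Box],
         iou_thr: float = 0.5) -> List[Box]:
    """Keep, for each detection box, whichever overlapping box
    (detection or segmentation) carries the higher confidence."""
    fused = []
    for d in det_boxes:
        best = d
        for s in seg_boxes:
            if iou(d, s) > iou_thr and s[4] > best[4]:
                best = s
        fused.append(best)
    return fused
```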
To further verify the validity of the defect detection model provided in this embodiment, the defect detection mode combining the target detection model and the semantic segmentation model in this embodiment is compared with the defect detection mode based on a single defect detection model.
FIG. 4 compares the PR (Precision-Recall) curve of the fusion model's predictions with the curves obtained using each single method alone. It can be seen that combining target detection and semantic segmentation, as in this embodiment, yields higher detection precision and overcomes the shortcomings of existing single-model structures.
According to the method, the acquired surface image of the equipment to be detected is input into the target detection model and the image segmentation model respectively, yielding the corresponding defect detection result and defect segmentation result; the defect identification result is then obtained from these two results. This establishes a framework for efficient, rapid detection of gear flaws in industrial production, realizes efficient and accurate equipment flaw detection with computer vision, reduces detection error, improves detection precision, and saves human resources.
On the basis of the above embodiment, fig. 5 is a schematic flow chart of a training method of the target detection model according to the present embodiment.
As shown in fig. 5, the training step of the object detection model includes:
step 501, acquiring a surface image of the sample device;
alternatively, after the surface image of the sample equipment is acquired, it may be input directly into the original target detection model, or it may first undergo preprocessing and/or image augmentation. Whether to preprocess and/or augment the surface image of the sample equipment may be set according to the actual scenario, which is not specifically limited in this embodiment.
The surface images of the sample equipment comprise defective surface images and a portion of normal surface images.
Step 502, marking a corresponding defect segmentation label on the surface image of the sample equipment according to a plurality of preset defect types;
and marking the defect type in the defect surface image of the sample equipment and generating a label file. Marking according to the requirement of semantic segmentation (Semantic Segmentation) during marking, wherein each defect corresponds to a plurality of point coordinates to obtain segment (segmented region);
The semantic segmentation is to label each point in the target category on the image according to the semantics, so that different kinds of objects are distinguished on the image. The classification task at the pixel level can be understood.
Semantic segmentation is a typical computer vision problem that involves taking some raw data (e.g., planar images) as input and converting them into a mask with highlighted regions of interest.
At this time, the requirement of semantic segmentation may be to divide the defective surface image of the sample device into a foreground region and a segmented region.
Wherein the foreground region (corresponding to the background) is a part of the region without defects; the defective area is segmented as a segmented area.
The defect segmentation label can be obtained by marking each pixel point, and the marking of each pixel point is defined as 0 or 1. Wherein, 0 represents that the pixel point is positioned in the foreground region; 1 represents that the pixel point is located in the divided region.
And obtaining the defect segmentation label corresponding to the surface image mark of the sample equipment in the defect segmentation mode.
Step 503, marking a corresponding defect detection label on the surface image of the sample equipment according to the defect segmentation label;
After the surface image of the sample equipment has been marked with its defect segmentation labels, the extreme (minimum and maximum) X and Y coordinates of each defect area are taken to generate a bbox (bounding box, a rectangular label frame), i.e., the defect detection frame for the target detection model.
The rectangular label frame bbox marks an object's rectangle on the image; common annotation formats include Pascal VOC (top-left and bottom-right corners: xmin, ymin, xmax, ymax), COCO (top-left corner plus width and height), and YOLO (center coordinates plus width and height).
Finally, two types of labels, namely a defect segmentation label and a defect detection label, can be obtained for the same defect area.
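The conversion from a defect segmentation label to a defect detection label can be sketched as below; the binary-mask input and the Pascal-VOC-style (xmin, ymin, xmax, ymax) output are assumptions made for illustration:

```python
# Sketch: derive a rectangular detection label (bbox) from a binary defect
# mask by taking the extreme X/Y coordinates of the defect pixels.
import numpy as np

def mask_to_bbox(mask: np.ndarray):
    """mask: 2-D array with 1 = segmented (defect) region, 0 = foreground."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None  # image contains no defect
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```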
Step 504, constructing a defect detection data set according to the surface image of the sample equipment and the defect detection label of the sample equipment;
then, constructing a defect detection data set of the sample equipment by using the defect surface image of the sample equipment and the corresponding defect detection label for training a subsequent target detection model.
Step 505, pre-training parameters of an original target detection model according to the defect detection data set to obtain pre-training parameters;
when the original target detection model is trained by using the obtained defect detection data set of the equipment, loss value calculation can be performed through a loss function so as to obtain pre-training parameters of the original target detection model.
Alternatively, the loss function used to train the original target detection model may be constructed jointly from a classification loss function and a regression loss function, whose specific forms may be determined according to the actual situation; the choice of loss function is not specifically limited in this embodiment. For example, the classification loss function may adopt a varifocal loss and the regression loss function a bounding-box regression loss.
Step 506, performing iterative training on the parameters of the improved target detection model according to the pre-training parameters to obtain the target detection model;
optionally, after the pre-training parameters are obtained, based on the pre-training parameters, performing back propagation iterative training on the parameters of the improved target detection model based on the loss function, and obtaining the target detection model after the training reaches a certain convergence condition.
The improved target detection model is obtained by improving an original backbone network of an original target detection model on the basis of the original target detection model; the improvement of the original backbone network comprises replacing at least one convolutional network layer in the original backbone network of the original target detection model with a deformable convolutional network layer, and adding a symmetrical network structure consisting of an encoder and a decoder in the original backbone network.
As shown in fig. 6, a schematic structural diagram of the improved target detection model is shown. Specifically, deformable convolution and multi-scale feature fusion are used: in the feature extraction module of PP-YOLOE, the convolutional network (Convolutional Neural Network, CNN) in the first ResBlock (residual block) of CSPRepResNet is replaced with a deformable convolution network (Deformable Convolution Network, DCN), which adds, on top of the traditional neural network, offset vectors that adjust the convolution kernel's sampling positions so that the convolution result better matches the defect area.
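A minimal sketch of such a replacement, using torchvision's DeformConv2d, is given below; the block structure and channel sizes are illustrative assumptions rather than the actual CSPRepResNet code:

```python
# Sketch: swap a standard 3x3 convolution for a deformable one (DCN).
# Block structure and channel sizes are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformableConvBlock(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, k: int = 3):
        super().__init__()
        # A plain conv predicts 2*k*k sampling offsets per position; these
        # steer the deformable kernel toward the defect region.
        self.offset = nn.Conv2d(in_ch, 2 * k * k, kernel_size=k, padding=k // 2)
        self.dcn = DeformConv2d(in_ch, out_ch, kernel_size=k, padding=k // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.dcn(x, self.offset(x))

# e.g. replacing the first convolution of a residual block (illustrative):
# block.conv1 = DeformableConvBlock(64, 64)
```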
As shown in fig. 7, which compares the PR curves of the original target detection model and the model with deformable convolution added, the mAP (mean average precision) of the model with deformable convolution (hereinafter also DCN+PP-YOLOE) is clearly higher than that of the original model (hereinafter also PP-YOLOE), indicating that the prediction effect improves once DCN is used.
Further, in order to fuse the features fully, a symmetrical network structure composed of an encoder and a decoder (also called a U-Net) is added to the original backbone network; adopting the idea of U-Net, a first multi-scale feature fusion is performed on the feature map output by CSPRepResNet, fully exchanging information between different receptive fields.
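The idea can be illustrated with a minimal symmetric encoder-decoder module appended to the backbone feature map; channel sizes and layer choices are assumptions made for illustration:

```python
# Sketch: a tiny symmetric encoder-decoder (U-Net-style) fusion module.
# Channel sizes and layer choices are illustrative assumptions.
import torch
import torch.nn as nn

class TinyUNetFusion(nn.Module):
    def __init__(self, ch: int = 256):
        super().__init__()
        self.down = nn.Sequential(  # encoder: halve the spatial resolution
            nn.Conv2d(ch, ch * 2, 3, stride=2, padding=1), nn.ReLU())
        self.up = nn.Sequential(    # decoder: restore the resolution
            nn.ConvTranspose2d(ch * 2, ch, 2, stride=2), nn.ReLU())
        self.fuse = nn.Conv2d(ch * 2, ch, 1)  # merge skip + decoded features

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        skip = x                    # encoder feature kept for the skip path
        y = self.up(self.down(x))   # encode to a coarser scale, then decode
        return self.fuse(torch.cat([skip, y], dim=1))
```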
As shown in fig. 8, which compares the PR curves of the improved target detection model (hereinafter also UNet+DCN+PP-YOLOE) and the model with only deformable convolution added, the mAP of the improved model on equipment flaw detection is clearly higher, indicating that the prediction effect improves further once U-Net feature fusion is used.
Fig. 9 is a flowchart of a training method of an image segmentation model according to the present embodiment.
As shown in fig. 9, after obtaining the defect segmentation label, the training step of the image segmentation model includes:
step 901, constructing a defect segmentation data set according to a surface image of the sample equipment and a defect segmentation label of the sample equipment;
A defect segmentation data set for the sample equipment is then constructed from the defective surface images of the sample equipment and the corresponding defect segmentation labels, for training the subsequent image segmentation model.
Step 902, pre-training parameters of an original image segmentation model according to the defect segmentation data set to obtain pre-training parameters;
when the original image segmentation model is trained by using the obtained defect segmentation data set of the equipment, loss value calculation can be performed through a loss function so as to obtain pre-training parameters of the deep learning model.
Alternatively, the choice of loss function for training the original image segmentation model may be determined according to the actual scenario. For example, a cross-entropy loss function may be chosen, so that the loss value for multi-class classification is calculated as:

$$\mathrm{Loss} = -\sum_{i=0}^{C-1} y_i \log(p_i)$$

where $p = [p_0, \ldots, p_{C-1}]$ is a probability distribution, each $p_i$ being the probability that the sample belongs to class $i$; $y = [y_0, \ldots, y_{C-1}]$ is the one-hot encoding of the sample label, with $y_i = 1$ indicating that the sample belongs to class $i$; and $C$ is the number of classes.
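This formula can be checked against a library implementation with a short sketch (shapes are illustrative):

```python
# Sketch: the cross-entropy loss written out elementwise and verified
# against PyTorch's built-in version; shapes are illustrative.
import torch
import torch.nn.functional as F

logits = torch.randn(4, 3)            # 4 samples, C = 3 classes
target = torch.tensor([0, 2, 1, 1])   # class indices
p = logits.softmax(dim=1)
y = F.one_hot(target, num_classes=3).float()
manual = -(y * p.log()).sum(dim=1).mean()   # Loss = -sum_i y_i log(p_i)
assert torch.allclose(manual, F.cross_entropy(logits, target))
```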
Step 903, performing iterative training on parameters of the improved image segmentation model according to the pre-training parameters to obtain the image segmentation model;
after the pre-training parameters are obtained, iterative training of back propagation can be performed on the parameters of the improved image segmentation model based on the loss function on the basis of the pre-training parameters, and the image segmentation model can be obtained after training reaches a certain convergence condition.
The improved image segmentation model is obtained by improving an original backbone network of an original image segmentation model on the basis of the original image segmentation model; the improvement of the original backbone network comprises the step of replacing at least one convolution network layer in the original backbone network of the original image segmentation model with a deformable convolution network layer, such as replacing a first layer convolution network structure of U-Net++ with a deformable convolution structure, so as to obtain an improved U-Net++ model, namely an improved image segmentation model.
In the embodiment, a defect detection label corresponding to the surface image mark of the sample equipment is obtained through the defect segmentation label corresponding to the surface image mark of the sample equipment, and a defect detection data set of the sample equipment is constructed; then, pre-training parameters of an original target detection model by using the defect detection data set to obtain pre-training parameters; and finally, carrying out iterative training on the parameters of the improved target detection model according to the pre-training parameters to obtain the target detection model. Meanwhile, a defect segmentation data set of the sample equipment is constructed by using a defect segmentation label corresponding to the surface image mark of the sample equipment; then, pre-training parameters of an original image segmentation model by using the defect segmentation data set to obtain pre-training parameters; and finally, carrying out iterative training on parameters of the improved image segmentation model according to the pre-training parameters to obtain the image segmentation model, completing training of the target detection model and the image segmentation model, improving the training precision and efficiency of the target detection model and the image segmentation model, and further realizing the efficient and accurate equipment flaw detection work by utilizing computer vision.
Based on the above embodiment, in this embodiment, the parameters of the improved target detection model are iteratively trained according to the pre-training parameters to obtain the target detection model, and the obtained pre-training parameters are specifically migrated to the improved target detection model, and then the improved model network is trained by using the defect detection data set to obtain a parameter file suitable for equipment defect detection, so as to complete the construction of the target detection model.
Fig. 10 is a schematic flow chart of an iterative training method of the target detection model according to the present embodiment.
As shown in fig. 10, the iterative training step of the object detection model includes:
step 1001, initializing parameters of the improved target detection model according to the pre-training parameters;
step 1002, fixing parameters of other networks except for the improved backbone network in the initialized improved target detection model, and training the parameters of the improved backbone network according to the defect detection data set until a first preset termination condition is met, so as to obtain training parameters of the improved backbone network;
Because parts of the model structure have been replaced or added, the improved target detection model is trained by fine-tuning. Specifically: the common feature extraction layers are frozen, i.e., the pre-training parameters are used directly as the initialization parameters of the improved target detection model; the parameters of the improved backbone network are then fine-tuned with the defect detection dataset while the parameters of the other parts remain fixed, until the first preset termination condition is met.
The training strategy may be a multi-GPU strategy combining single-precision (Float32) and half-precision (Float16) floating point with data parallelism, reducing the model's video-memory footprint and increasing computation speed.
Step 1003, training the parameters of the initialized improved target detection model according to the training parameters of the improved backbone network until a second preset termination condition is met, to obtain the target detection model.
After the fine-tuning meets the first preset termination condition, all layers are unfrozen, and the whole target detection model is then fine-tuned at a smaller learning rate until the second preset termination condition is met, finally yielding the target detection model.
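A minimal sketch of this staged fine-tuning, including the mixed-precision strategy mentioned above, might look as follows; the model.backbone attribute, the loss interface, and all hyperparameters are illustrative assumptions rather than the patent's actual implementation:

```python
# Sketch of two-stage fine-tuning with Float16/Float32 mixed precision.
# model.backbone, the loss interface, and hyperparameters are assumptions.
import torch

def set_trainable(module: torch.nn.Module, flag: bool) -> None:
    for p in module.parameters():
        p.requires_grad = flag

def staged_finetune(model, loader, backbone_lr=1e-3, full_lr=1e-4):
    scaler = torch.cuda.amp.GradScaler()   # mixed-precision loss scaling
    # Stage 1: freeze everything except the improved backbone.
    set_trainable(model, False)
    set_trainable(model.backbone, True)
    opt = torch.optim.SGD(
        (p for p in model.parameters() if p.requires_grad), lr=backbone_lr)
    for images, targets in loader:  # run until the first termination condition
        with torch.cuda.amp.autocast():
            loss = model(images, targets)
        scaler.scale(loss).backward()
        scaler.step(opt)
        scaler.update()
        opt.zero_grad()
    # Stage 2: thaw all layers; fine-tune the whole model at a smaller lr.
    set_trainable(model, True)
    opt = torch.optim.SGD(model.parameters(), lr=full_lr)
    # ...same loop as above, until the second termination condition is met.
```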
Likewise, the parameters of the improved image segmentation model can be iteratively trained in the same way according to the pre-training parameters to obtain the image segmentation model.
According to the method, parameters of the improved target detection model are first initialized from the pre-training parameters; the parameters of networks other than the improved backbone are fixed, and the improved backbone's parameters are trained with the defect detection data set until the first preset termination condition is met, giving the backbone's training parameters; finally, the initialized parameters of the whole improved target detection model are trained on that basis until the second preset termination condition is met, yielding the target detection model. The image segmentation model is then obtained by the same iterative training method, improving the training precision and efficiency of both models and enabling efficient and accurate equipment flaw detection with computer vision.
On the basis of the foregoing embodiment, the acquiring of a surface image of the sample equipment in this embodiment includes: acquiring an original surface image of the sample equipment; and preprocessing and augmenting the original surface image to obtain the surface image of the sample equipment. The preprocessing includes one or more combinations of normalization, histogram equalization, and noise reduction; the image augmentation includes one or more combinations of superimposing randomly generated background pictures, random cropping, and random flipping.
The normalization processing aims to change the value range of the pixel value of the image from 0-255 to 0-1, thereby accelerating the network training.
Alternatively, a specific choice of the image normalization process may be to set according to the actual scene. For example, in this embodiment, the image normalization processing may be performed by selecting the maximum and minimum normalization method.
Maximum-minimum normalization subtracts the minimum of the data from each value and divides by the range:

$$x_i^{\mathrm{norm}} = \frac{x_i - \min(x)}{\max(x) - \min(x)}$$

where $x_i^{\mathrm{norm}}$ is the normalized result, $x_i$ is an image pixel value, and $\max(x)$ and $\min(x)$ are the maximum and minimum pixel values of the image, respectively.
Because the lighting is uneven, histogram equalization is performed on the image so that the probability distribution of the gray values in the transformed image is uniform, enhancing the contrast of the image.
The histogram equalization process changes the gray scale of each pixel in an image by changing the histogram of the image, and is mainly used for enhancing the contrast of an image with a smaller dynamic range. The original image may be concentrated in a narrower interval due to its gray scale distribution, resulting in an insufficient definition of the image. For example, the gray level of an overexposed image is concentrated in a high brightness range, while underexposure will concentrate the image gray level in a low brightness range. By adopting histogram equalization, the histogram of the original image can be converted into a uniformly distributed (equalized) form, so that the dynamic range of gray value difference between pixels is increased, and the effect of enhancing the overall contrast of the image is achieved. In other words, the basic principle of histogram equalization is: the gray values with a large number of pixels in the image (i.e. the gray values which play a main role in the picture) are widened, and the gray values with a small number of pixels (i.e. the gray values which do not play a main role in the picture) are merged, so that the contrast is increased, the image is clear, and the purpose of enhancement is achieved.
The noise reduction processing may use mean filtering: a low-pass filter removes high-frequency signals from the training image, eliminating sharp noise and realizing image denoising.
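A compact sketch of this preprocessing chain, assuming OpenCV and a single-channel uint8 input, might be:

```python
# Sketch: histogram equalization, mean-filter denoising, and min-max
# normalization. OpenCV and a grayscale uint8 input are assumptions.
import cv2
import numpy as np

def preprocess(gray: np.ndarray) -> np.ndarray:
    eq = cv2.equalizeHist(gray)      # spread the gray-level histogram
    smooth = cv2.blur(eq, (3, 3))    # mean filter removes sharp noise
    x = smooth.astype(np.float32)
    return (x - x.min()) / (x.max() - x.min() + 1e-9)  # (x - min)/(max - min)
```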
In addition, when training the target detection model, various data enhancement strategies can be adopted to improve the performance of the model through image expansion. Wherein the image augmentation may be a combination comprising one or more of overlaying a randomly generated background picture, random cropping, and random flipping.
Superimposing randomly generated background pictures randomly expands the images based on background characteristics: a background picture is generated and the training sample is placed on it at a random scale, enhancing the diversity of the data.
Alternatively, the background picture may be a randomly generated picture. For example, a black background may be selected as the generated background picture in the present embodiment, which is not particularly limited.
Random cropping crops the randomly expanded samples at random proportions based on background characteristics, generating a new round of training samples and further enhancing the diversity of the data.
Random flipping exploits the up-down symmetry of the data, flipping a certain proportion of the training samples vertically.
In addition, in the embodiment, random pixel content transformation can be performed on the training samples, so that the diversity of data is further increased.
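A minimal sketch of these augmentations follows; the 1.5x canvas size, the black background, and the flip probability are assumptions made for illustration:

```python
# Sketch: paste onto a randomly generated (here: black) background at a
# random position, random-crop back, and randomly flip up-down.
# Scales and probabilities are illustrative assumptions.
import random
import numpy as np

def augment(img: np.ndarray) -> np.ndarray:
    h, w = img.shape[:2]
    canvas = np.zeros((int(h * 1.5), int(w * 1.5)) + img.shape[2:], img.dtype)
    y0, x0 = random.randint(0, h // 2), random.randint(0, w // 2)
    canvas[y0:y0 + h, x0:x0 + w] = img           # random placement
    cy = random.randint(0, canvas.shape[0] - h)  # random crop back to h x w
    cx = random.randint(0, canvas.shape[1] - w)
    out = canvas[cy:cy + h, cx:cx + w]
    if random.random() < 0.5:                    # up-down flip (symmetric data)
        out = out[::-1]
    return out
```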
In this embodiment, the original surface images of the sample equipment undergo preprocessing comprising one or more of normalization, histogram equalization, and noise reduction, and image augmentation comprising one or more of superimposing randomly generated background pictures, random cropping, and random flipping. This makes the surface images of the sample equipment richer, improving the training precision and efficiency of the target detection model and the image segmentation model, and in turn the accuracy of equipment flaw detection.
On the basis of the above embodiment, the loss function of the target detection model in this embodiment is constructed from a regression loss function and a classification loss function; the regression loss function is constructed from the angle loss, distance loss, shape loss, and intersection-over-union (IoU) loss determined by the defect detection result of the sample equipment and the defect detection label of the sample equipment; the classification loss function is constructed from the varifocal loss determined by the defect detection result of the sample equipment and the defect detection label of the sample equipment.
When the target detection model is trained, the loss function comprises a regression loss function for carrying out regression prediction of the detection frame and a classification loss function for classifying the defect types;
The regression loss function may be the SIoU Loss (SCYLLA-IoU Loss), proposed together with the proprietary SCYLLA-Net (a convolutional neural network whose architecture is derived for a particular data set, from a given set of predefined layer types, using genetic algorithms). The loss is composed of four cost terms: an angle cost, a distance cost, a shape cost and an intersection-over-union (IoU, Intersection over Union) cost. The regression loss function $L_{box}$ is defined as:

$$L_{box} = 1 - IoU + \frac{\Delta + \Omega}{2}$$

where $\Delta$ is the distance cost defined based on the angle cost, $\Omega$ is the shape cost, and $IoU$ is the intersection-over-union cost.
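Under the standard SIoU definitions of these cost terms, the loss could be sketched in PyTorch roughly as follows; the shape-cost exponent θ = 4 and the (x1, y1, x2, y2) box format are assumptions of the sketch rather than details fixed by this embodiment.

```python
import math
import torch

def siou_loss(pred: torch.Tensor, target: torch.Tensor,
              theta: float = 4.0, eps: float = 1e-7) -> torch.Tensor:
    """Sketch of L_box = 1 - IoU + (Delta + Omega) / 2 for [N, 4] boxes
    in (x1, y1, x2, y2) format. theta is an assumed shape-cost exponent."""
    # Widths, heights and centers of both boxes.
    w1, h1 = pred[:, 2] - pred[:, 0], pred[:, 3] - pred[:, 1]
    w2, h2 = target[:, 2] - target[:, 0], target[:, 3] - target[:, 1]
    cx1, cy1 = (pred[:, 0] + pred[:, 2]) / 2, (pred[:, 1] + pred[:, 3]) / 2
    cx2, cy2 = (target[:, 0] + target[:, 2]) / 2, (target[:, 1] + target[:, 3]) / 2

    # IoU term.
    inter_w = (torch.min(pred[:, 2], target[:, 2]) - torch.max(pred[:, 0], target[:, 0])).clamp(0)
    inter_h = (torch.min(pred[:, 3], target[:, 3]) - torch.max(pred[:, 1], target[:, 1])).clamp(0)
    inter = inter_w * inter_h
    iou = inter / (w1 * h1 + w2 * h2 - inter + eps)

    # Angle cost and the distance cost Delta it parameterizes.
    s_cw, s_ch = (cx2 - cx1).abs(), (cy2 - cy1).abs()
    sigma = torch.sqrt(s_cw ** 2 + s_ch ** 2) + eps
    sin_alpha = (s_ch / sigma).clamp(-1, 1)
    angle = 1 - 2 * torch.sin(torch.arcsin(sin_alpha) - math.pi / 4) ** 2
    cw = torch.max(pred[:, 2], target[:, 2]) - torch.min(pred[:, 0], target[:, 0]) + eps
    ch = torch.max(pred[:, 3], target[:, 3]) - torch.min(pred[:, 1], target[:, 1]) + eps
    gamma = 2 - angle
    delta = (1 - torch.exp(-gamma * (s_cw / cw) ** 2)) + (1 - torch.exp(-gamma * (s_ch / ch) ** 2))

    # Shape cost Omega.
    omega_w = (w1 - w2).abs() / torch.max(w1, w2)
    omega_h = (h1 - h2).abs() / torch.max(h1, h2)
    omega = (1 - torch.exp(-omega_w)) ** theta + (1 - torch.exp(-omega_h)) ** theta

    return 1 - iou + (delta + omega) / 2
```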
The classification loss function may be the Varifocal Loss (VFL), calculated as:

$$\mathrm{VFL}(p, q) = \begin{cases} -q\bigl(q\log p + (1 - q)\log(1 - p)\bigr), & q > 0 \\ -\alpha\, p^{\gamma}\log(1 - p), & q = 0 \end{cases}$$

where $\alpha$ is the loss weight balancing foreground and background; the factor $p^{\gamma}$ weights different samples, increasing the loss weight of hard samples; $p$ is the predicted IACS (IoU-aware classification score); and $q$ is the target IoU score.
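A compact PyTorch rendering of this formula might look as follows; the default values α = 0.75 and γ = 2.0 follow common practice and are assumptions here.

```python
import torch
import torch.nn.functional as F

def varifocal_loss(pred_logits: torch.Tensor, target_iou: torch.Tensor,
                   alpha: float = 0.75, gamma: float = 2.0) -> torch.Tensor:
    """Sketch of Varifocal Loss: positives (q > 0) are weighted by their
    target IoU score q, negatives by alpha * p^gamma, which down-weights
    easy background while keeping hard samples influential."""
    p = pred_logits.sigmoid()
    weight = torch.where(target_iou > 0, target_iou, alpha * p.pow(gamma))
    bce = F.binary_cross_entropy_with_logits(pred_logits, target_iou, reduction="none")
    return (weight * bce).sum()
```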
It should be noted that, when training the target detection model or the image segmentation model, an Online Hard Example Mining (OHEM) strategy may be used to cope with class-imbalanced defects and the small number of negative samples: the loss value is computed with OHEM and then back-propagated. The core of OHEM is to select hard samples as training samples, and as the data set grows the performance improvement brought by OHEM becomes more pronounced.
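As a sketch, OHEM can be reduced to keeping only the highest-loss fraction of a batch when back-propagating; the 70% keep ratio below is an assumed hyperparameter.

```python
import torch

def ohem_loss(per_sample_loss: torch.Tensor, keep_ratio: float = 0.7) -> torch.Tensor:
    """Online Hard Example Mining sketch: back-propagate only the hardest
    (highest-loss) fraction of samples in the batch."""
    num_keep = max(1, int(per_sample_loss.numel() * keep_ratio))
    hard_losses, _ = torch.topk(per_sample_loss.flatten(), num_keep)
    return hard_losses.mean()
```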
In addition, when training the target detection model and the image segmentation model with these loss functions, directionality can be introduced into the cost terms of the loss function, enabling faster convergence in the training stage and better inference performance.
In this embodiment, the target detection model is trained with a loss function jointly constructed from a regression loss function, built from the angle loss, distance loss, shape loss and intersection-over-union loss determined by the defect detection result and the defect detection label of the sample equipment, and a classification loss function, built from the scaling loss determined by the same result and label; the image segmentation model is trained with a cross-entropy loss function. By computing the loss values of the target detection model and the image segmentation model in this way, both models can be obtained quickly and accurately, realizing efficient and accurate detection of equipment flaws with computer vision.
Based on the above embodiment, in this embodiment, for the current iterative training, the current learning rate of the target detection model is calculated according to the first learning rate and/or the second learning rate; the first learning rate is calculated according to the current iteration times, the maximum iteration times and the maximum learning rate; the second learning rate is calculated according to a maximum learning rate, a minimum learning rate, the current iteration number and the maximum iteration number.
Here, the learning rate is a tuning parameter of the optimization algorithm in machine learning and statistics that determines the step size of each iteration of model training as the loss function converges toward a minimum.
In training the target detection model and the image segmentation model, the learning-rate decay strategy may use either one of the first and second learning rates, or a superposition (weighted addition) of the two.
The first learning rate $lr_1$ may adopt learning-rate warmup, calculated from the current iteration number, the maximum warmup iteration number and the maximum learning rate as:

$$lr_1 = \frac{t}{T_{warmup}} \cdot lr_{max}$$

where $T_{warmup}$ is the maximum iteration number of the warmup, $t$ is the current iteration number, and $lr_{max}$ is the maximum learning rate. This strategy helps slow early overfitting of the model to mini-batches (small batches of data) in the initial stage and keeps the distribution stable.
The second learning rate $lr_2$ may adopt cosine annealing, calculated from the maximum learning rate, the minimum learning rate, the current iteration number and the maximum iteration number as:

$$lr_2 = lr_{min} + \frac{1}{2}\left(lr_{max} - lr_{min}\right)\left(1 + \cos\left(\frac{T_{cur}}{T_{max}}\pi\right)\right)$$

where $lr_{min}$ is the minimum learning rate, $lr_{max}$ is the maximum learning rate, $T_{cur}$ is the current epoch and $T_{max}$ is the maximum epoch. With this strategy the optimization can "jump out" of local minima and find a path toward the global minimum.
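Both schedules, and their superposition, could be sketched as follows; the equal superposition weights are an assumption, since the embodiment only specifies a weighted addition.

```python
import math

def learning_rate(t: int, T_warmup: int, T_max: int,
                  lr_max: float, lr_min: float,
                  w1: float = 0.5, w2: float = 0.5) -> float:
    """Sketch of the combined schedule: linear warmup (lr_1) plus cosine
    annealing (lr_2), superposed by weighted addition."""
    lr1 = lr_max * min(t, T_warmup) / T_warmup  # warmup term, capped at lr_max
    lr2 = lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t / T_max))  # cosine term
    return w1 * lr1 + w2 * lr2
```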
Similarly, the image segmentation model can also be obtained by training according to the first learning rate and/or the second learning rate.
In this embodiment, the current learning rate of the target detection model is obtained from one of, or a superposition (weighted addition) of, the first learning rate, calculated from the current iteration number, the maximum iteration number and the maximum learning rate, and the second learning rate, calculated from the maximum learning rate, the minimum learning rate, the current iteration number and the maximum iteration number. This completes the training of the target detection model quickly, further enabling efficient and accurate equipment flaw detection with computer vision.
On the basis of the foregoing embodiment, in this embodiment, obtaining the defect identification result of the equipment to be detected according to the defect detection result and the defect segmentation result comprises: performing non-maximum suppression processing on the defect detection result, and obtaining, from the processed defect detection result, the defect detection frames whose intersection-over-union is greater than a preset threshold, a defect detection frame being the bounding box of a detected defect area; obtaining, from the defect segmentation result, the defect segmentation frames whose intersection-over-union is greater than the preset threshold, a defect segmentation frame being the bounding box of a segmented defect area; and obtaining the defect identification result according to the confidence of the defect detection frames and the confidence of the defect segmentation frames.
Non-maximum suppression (NMS) suppresses elements that are not maxima; in effect, it is a local maximum search. "Local" here denotes a neighborhood, which has two variable parameters: its dimensionality and its size.
In edge refinement, non-maximum suppression likewise suppresses all gradient values except the local maxima (by setting them to 0), a local maximum indicating the location with the strongest change in intensity.
Specifically, multi-class NMS (multi-class non-maximum suppression) may be selected for post-processing, applying NMS within each class of the defect detection results; the result is then fused with the segmentation result output by the image segmentation model, so that an accurate prediction is finally obtained by fusing the target detection model with the semantic segmentation model.
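A minimal per-class NMS, here built on torchvision's nms operator, might read as follows; the IoU threshold of 0.5 is an assumed value.

```python
import torch
from torchvision.ops import nms

def multiclass_nms(boxes: torch.Tensor, scores: torch.Tensor,
                   labels: torch.Tensor, iou_threshold: float = 0.5) -> torch.Tensor:
    """Per-class NMS sketch: suppress overlapping boxes independently
    within each defect class; returns the kept box indices."""
    keep_indices = []
    for cls in labels.unique():
        cls_mask = (labels == cls).nonzero(as_tuple=True)[0]
        kept = nms(boxes[cls_mask], scores[cls_mask], iou_threshold)
        keep_indices.append(cls_mask[kept])
    return torch.cat(keep_indices)
```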
After non-maximum suppression processing is applied to the defect detection result, the defect detection frames whose intersection-over-union is greater than the preset threshold are obtained from it and taken as the bounding boxes of the detected defect areas.
A defect detection frame with a larger intersection-over-union locates the defect more accurately; that is, when the intersection-over-union is greater than the preset threshold, the area bounded by the defect detection frame can be determined to be a defect area.
Optionally, the preset threshold may be set according to the actual scene, which is not specifically limited in this embodiment.
Similarly, the defect segmentation frames whose intersection-over-union is greater than the preset threshold are obtained from the defect segmentation result and taken as the bounding boxes of the segmented defect areas.
Finally, for each pair of corresponding frames, the result of whichever of the defect detection frame and the defect segmentation frame carries the higher confidence is selected as the final defect identification result.
Specifically, if the confidence of the defect detection frame obtained by the target detection model is higher, the corresponding defect detection result is taken as the final defect identification result; if the confidence of the defect segmentation frame obtained by the image segmentation model is higher, the corresponding defect segmentation result is taken as the final identification result.
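Concretely, the fusion step might be sketched as follows; the dict result structure, the greedy IoU matching and the 0.5 threshold are assumptions of the sketch.

```python
def fuse_results(det_boxes: list, seg_boxes: list, iou_threshold: float = 0.5) -> list:
    """Sketch of the fusion step: match each detection box to a segmentation
    box by IoU and keep whichever of the pair carries the higher confidence.
    Boxes are dicts {'bbox': (x1, y1, x2, y2), 'score': float, 'label': int}."""
    def iou(a, b):
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / (area_a + area_b - inter + 1e-7)

    fused = []
    for det in det_boxes:
        # Greedily pick the segmentation frame overlapping this detection most.
        best = max(seg_boxes, key=lambda s: iou(det['bbox'], s['bbox']), default=None)
        if best is not None and iou(det['bbox'], best['bbox']) > iou_threshold:
            fused.append(det if det['score'] >= best['score'] else best)
        else:
            fused.append(det)  # no overlapping segmentation frame: keep the detection
    return fused
```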
In this embodiment, non-maximum suppression is first applied to the defect detection result; the defect detection frames with intersection-over-union greater than the preset threshold are then obtained and taken as the bounding boxes of the detected defect areas; the defect segmentation frames with intersection-over-union greater than the preset threshold are likewise obtained from the defect segmentation result and taken as the bounding boxes of the segmented defect areas; and finally, for each pair of corresponding frames, the result of the frame with the higher confidence is selected as the final defect identification result. The defect identification result of the equipment to be detected is thus obtained from the defect detection result and the defect segmentation result, realizing efficient and accurate equipment flaw detection with computer vision.
The defect recognition device for the fusion target detection model and the image segmentation model provided by the invention is described below, and the defect recognition device for the fusion target detection model and the image segmentation model described below and the defect recognition method for the fusion target detection model and the image segmentation model described above can be correspondingly referred to each other.
As shown in fig. 11, a schematic structural diagram of a defect identifying device for fusing a target detection model and an image segmentation model according to the present invention is provided, where the device includes:
an acquisition module 1101, configured to acquire a surface image of a device to be detected;
the detection module 1102 is configured to input a surface image of the device to be detected to a target detection model, so as to obtain a defect detection result of the device to be detected;
the segmentation module 1103 is configured to input the surface image of the device to be detected to an image segmentation model, so as to obtain a defect segmentation result of the device to be detected;
an identifying module 1104, configured to obtain a defect identifying result of the device to be detected according to the defect detecting result and the defect dividing result;
the target detection model is obtained by training according to the surface image of the sample equipment and the defect detection label of the sample equipment, and the image segmentation model is obtained by training according to the surface image of the sample equipment and the defect segmentation label of the sample equipment.
With the defect identification device fusing the target detection model and the image segmentation model provided above, the acquired surface image of the equipment to be detected is input into the target detection model and the image segmentation model respectively, yielding the corresponding defect detection result and defect segmentation result of the equipment to be detected; the defect identification result of the equipment to be detected is then obtained from the defect detection result and the defect segmentation result. This establishes a framework for efficient and rapid gear flaw detection in industrial production, realizes efficient and accurate equipment flaw detection with computer vision, reduces the error of equipment flaw detection, improves detection precision and saves human resources.
On the basis of the foregoing embodiment, the apparatus in this embodiment further includes a training module, specifically configured to: acquiring a surface image of the sample device; marking a corresponding defect segmentation label on the surface image of the sample equipment according to a plurality of preset defect types; marking a corresponding defect detection label on the surface image of the sample equipment according to the defect segmentation label; constructing a defect detection data set according to the surface image of the sample equipment and the defect detection label of the sample equipment; pre-training parameters of an original target detection model according to the defect detection data set to obtain pre-training parameters; according to the pre-training parameters, performing iterative training on the parameters of the improved target detection model to obtain the target detection model; the improved target detection model is obtained by improving an original backbone network of an original target detection model on the basis of the original target detection model; the improvement of the original backbone network comprises replacing at least one convolutional network layer in the original backbone network of the original target detection model with a deformable convolutional network layer, and adding a symmetrical network structure consisting of an encoder and a decoder in the original backbone network.
On the basis of the foregoing embodiment, the training module in this embodiment further includes an iterative training module, specifically configured to: initializing parameters of the improved target detection model according to the pre-training parameters; fixing parameters of other networks except the improved backbone network in the initialized improved target detection model, and training the parameters of the improved backbone network according to the defect detection data set until a first preset termination condition is met to obtain training parameters of the improved backbone network; and training the initialized parameters of the improved target detection model according to the training parameters of the improved backbone network until a second preset termination condition is met, so as to obtain the target detection model.
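As an illustration, the staged training carried out by this iterative training module could be sketched in PyTorch as below; the attribute name "backbone" and freezing via requires_grad are assumptions of the sketch, not details fixed by this embodiment.

```python
import torch.nn as nn

def staged_training(model: nn.Module, backbone_name: str = "backbone") -> None:
    """Sketch of the two-stage fine-tuning described above."""
    # Stage 1: freeze everything except the improved backbone and train it
    # until the first preset termination condition is met.
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(backbone_name)
    # ... run the training loop on the defect detection data set ...

    # Stage 2: unfreeze all parameters and train the whole model
    # until the second preset termination condition is met.
    for param in model.parameters():
        param.requires_grad = True
    # ... continue the training loop from the stage-1 parameters ...
```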
On the basis of the foregoing embodiment, the training module in this embodiment further includes an image processing module, specifically configured to: acquire an original surface image of the sample equipment; and preprocess and augment the original surface image to obtain the surface image of the sample equipment; wherein the preprocessing includes one or more combinations of normalization processing, histogram equalization processing, and noise reduction processing, and the image augmentation includes one or more combinations of superimposing randomly generated background pictures, random cropping, and random flipping.
On the basis of the above embodiment, in the training module in this embodiment, the loss function of the target detection model is constructed and generated based on a regression loss function and a classification loss function; the regression loss function is constructed and generated according to the angle loss, distance loss, shape loss and intersection-over-union loss determined by the defect detection result of the sample equipment and the defect detection label of the sample equipment; the classification loss function is constructed and generated according to the scaling loss determined by the defect detection result of the sample equipment and the defect detection label of the sample equipment.
On the basis of the above embodiment, in the training module in this embodiment, for the current iterative training, the current learning rate of the target detection model is calculated according to the first learning rate and/or the second learning rate; the first learning rate is calculated according to the current iteration times, the maximum iteration times and the maximum learning rate; the second learning rate is calculated according to a maximum learning rate, a minimum learning rate, the current iteration number and the maximum iteration number.
On the basis of the above embodiment, the identification module 1104 in this embodiment is specifically further configured to: perform non-maximum suppression processing on the defect detection result, and obtain, from the processed defect detection result, the defect detection frames whose intersection-over-union is greater than a preset threshold, a defect detection frame being the bounding box of a detected defect area; obtain, from the defect segmentation result, the defect segmentation frames whose intersection-over-union is greater than the preset threshold, a defect segmentation frame being the bounding box of a segmented defect area; and obtain the defect identification result according to the confidence of the defect detection frames and the confidence of the defect segmentation frames.
Fig. 12 illustrates a schematic diagram of the physical structure of an electronic device. As shown in fig. 12, the electronic device may include: a processor 1210, a communication interface (Communications Interface) 1220, a memory 1230 and a communication bus 1240, wherein the processor 1210, the communication interface 1220 and the memory 1230 communicate with each other via the communication bus 1240. The processor 1210 may invoke logic instructions in the memory 1230 to perform a defect identification method fusing the target detection model and the image segmentation model, the method comprising: acquiring a surface image of equipment to be detected; inputting the surface image of the equipment to be detected into a target detection model to obtain a defect detection result of the equipment to be detected; inputting the surface image of the equipment to be detected into an image segmentation model to obtain a defect segmentation result of the equipment to be detected; and obtaining a defect identification result of the equipment to be detected according to the defect detection result and the defect segmentation result; wherein the target detection model is obtained by training according to the surface image of the sample equipment and the defect detection label of the sample equipment, and the image segmentation model is obtained by training according to the surface image of the sample equipment and the defect segmentation label of the sample equipment.
In addition, the logic instructions in the memory 1230 may be implemented in the form of software functional units and, when sold or used as a stand-alone product, stored in a computer-readable storage medium. On this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art or in part, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
In another aspect, the present invention also provides a computer program product, the computer program product including a computer program, the computer program being storable on a non-transitory computer readable storage medium, the computer program, when executed by a processor, being capable of executing the defect recognition method of fusing a target detection model and an image segmentation model provided by the above methods, the method comprising: acquiring a surface image of equipment to be detected; inputting the surface image of the equipment to be detected into a target detection model to obtain a defect detection result of the equipment to be detected; inputting the surface image of the equipment to be detected into an image segmentation model to obtain a defect segmentation result of the equipment to be detected; obtaining a defect identification result of the equipment to be detected according to the defect detection result and the defect segmentation result; the target detection model is obtained by training according to the surface image of the sample equipment and the defect detection label of the sample equipment, and the image segmentation model is obtained by training according to the surface image of the sample equipment and the defect segmentation label of the sample equipment.
In yet another aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, is implemented to perform a method of defect identification of a fusion object detection model and an image segmentation model provided by the above methods, the method comprising: acquiring a surface image of equipment to be detected; inputting the surface image of the equipment to be detected into a target detection model to obtain a defect detection result of the equipment to be detected; inputting the surface image of the equipment to be detected into an image segmentation model to obtain a defect segmentation result of the equipment to be detected; obtaining a defect identification result of the equipment to be detected according to the defect detection result and the defect segmentation result; the target detection model is obtained by training according to the surface image of the sample equipment and the defect detection label of the sample equipment, and the image segmentation model is obtained by training according to the surface image of the sample equipment and the defect segmentation label of the sample equipment.
The apparatus embodiments described above are merely illustrative; the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, i.e., they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. In light of this understanding, the foregoing technical solutions may be embodied essentially or in part in the form of a software product, which may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., including instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform the various embodiments or portions of the methods described herein.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A defect identification method fusing a target detection model and an image segmentation model, characterized by comprising the following steps:
acquiring a surface image of equipment to be detected;
inputting the surface image of the equipment to be detected into a target detection model to obtain a defect detection result of the equipment to be detected;
inputting the surface image of the equipment to be detected into an image segmentation model to obtain a defect segmentation result of the equipment to be detected;
obtaining a defect identification result of the equipment to be detected according to the defect detection result and the defect segmentation result;
the target detection model is obtained by training according to the surface image of the sample equipment and the defect detection label of the sample equipment, and the image segmentation model is obtained by training according to the surface image of the sample equipment and the defect segmentation label of the sample equipment.
2. The method for identifying a defect by fusing a target detection model and an image segmentation model as set forth in claim 1, wherein the training of the target detection model comprises:
acquiring a surface image of the sample device;
marking a corresponding defect segmentation label on the surface image of the sample equipment according to a plurality of preset defect types;
marking a corresponding defect detection label on the surface image of the sample equipment according to the defect segmentation label;
constructing a defect detection data set according to the surface image of the sample equipment and the defect detection label of the sample equipment;
pre-training parameters of an original target detection model according to the defect detection data set to obtain pre-training parameters;
according to the pre-training parameters, performing iterative training on the parameters of the improved target detection model to obtain the target detection model;
the improved target detection model is obtained by improving an original backbone network of an original target detection model on the basis of the original target detection model; the improvement of the original backbone network comprises replacing at least one convolutional network layer in the original backbone network of the original target detection model with a deformable convolutional network layer, and adding a symmetrical network structure consisting of an encoder and a decoder in the original backbone network.
3. The method for identifying defects by fusing a target detection model and an image segmentation model according to claim 2, wherein the performing iterative training on parameters of an improved target detection model according to the pre-training parameters to obtain the target detection model comprises:
initializing parameters of the improved target detection model according to the pre-training parameters;
fixing parameters of other networks except the improved backbone network in the initialized improved target detection model, and training the parameters of the improved backbone network according to the defect detection data set until a first preset termination condition is met to obtain training parameters of the improved backbone network;
and training the initialized parameters of the improved target detection model according to the training parameters of the improved backbone network until a second preset termination condition is met, so as to obtain the target detection model.
4. The method of defect identification of a fusion object detection model and an image segmentation model according to claim 2, wherein the acquiring a surface image of the sample device comprises:
acquiring an original surface image of the sample device;
preprocessing and image expansion are carried out on the original surface image to obtain a surface image of the sample equipment;
wherein the preprocessing includes one or more combinations of normalization processing, histogram equalization processing, and noise reduction processing; and the image augmentation includes one or more combinations of superimposing randomly generated background pictures, random cropping, and random flipping.
5. The defect recognition method of fusing a target detection model and an image segmentation model according to any one of claims 1-4, wherein a loss function of the target detection model is generated based on a regression loss function and a classification loss function construction;
the regression loss function is constructed and generated according to the angle loss, distance loss, shape loss and intersection-over-union loss determined by the defect detection result of the sample equipment and the defect detection label of the sample equipment;
the classification loss function is constructed and generated according to the scaling loss determined by the defect detection result of the sample equipment and the defect detection label of the sample equipment.
6. The method for identifying defects by fusing a target detection model and an image segmentation model according to any one of claims 1-4, wherein for a current iterative training, a current learning rate of the target detection model is calculated according to a first learning rate and/or a second learning rate;
the first learning rate is calculated according to the current iteration times, the maximum iteration times and the maximum learning rate;
the second learning rate is calculated according to a maximum learning rate, a minimum learning rate, the current iteration number and the maximum iteration number.
7. The method for identifying a defect by fusing a target detection model and an image segmentation model according to any one of claims 1-4, wherein the obtaining a defect identification result of the device to be detected according to the defect detection result and the defect segmentation result comprises:
performing non-maximum suppression processing on the defect detection result, and obtaining, from the processed defect detection result, a defect detection frame whose intersection-over-union is greater than a preset threshold; the defect detection frame being the bounding box of a detected defect area;
obtaining, from the defect segmentation result, a defect segmentation frame whose intersection-over-union is greater than the preset threshold; the defect segmentation frame being the bounding box of a segmented defect area;
and obtaining the defect identification result according to the confidence of the defect detection frame and the confidence of the defect segmentation frame.
8. A defect recognition apparatus that fuses a target detection model and an image segmentation model, comprising:
the acquisition module is used for acquiring a surface image of the equipment to be detected;
the detection module is used for inputting the surface image of the equipment to be detected into a target detection model to obtain a defect detection result of the equipment to be detected;
The segmentation module is used for inputting the surface image of the equipment to be detected into an image segmentation model to obtain a defect segmentation result of the equipment to be detected;
the identification module is used for acquiring a defect identification result of the equipment to be detected according to the defect detection result and the defect segmentation result;
the target detection model is obtained by training according to the surface image of the sample equipment and the defect detection label of the sample equipment, and the image segmentation model is obtained by training according to the surface image of the sample equipment and the defect segmentation label of the sample equipment.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of defect identification of fusing a target detection model and an image segmentation model as claimed in any one of claims 1 to 7 when the program is executed by the processor.
10. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the defect identification method of fusing a target detection model and an image segmentation model according to any one of claims 1 to 7.
CN202310152888.XA 2023-02-22 2023-02-22 Defect identification method and device integrating target detection model and image segmentation model Pending CN116363064A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310152888.XA CN116363064A (en) 2023-02-22 2023-02-22 Defect identification method and device integrating target detection model and image segmentation model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310152888.XA CN116363064A (en) 2023-02-22 2023-02-22 Defect identification method and device integrating target detection model and image segmentation model

Publications (1)

Publication Number Publication Date
CN116363064A true CN116363064A (en) 2023-06-30

Family

ID=86932174

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310152888.XA Pending CN116363064A (en) 2023-02-22 2023-02-22 Defect identification method and device integrating target detection model and image segmentation model

Country Status (1)

Country Link
CN (1) CN116363064A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116542980A (en) * 2023-07-06 2023-08-04 宁德时代新能源科技股份有限公司 Defect detection method, defect detection apparatus, defect detection program, storage medium, and defect detection program
CN116542980B (en) * 2023-07-06 2023-11-03 宁德时代新能源科技股份有限公司 Defect detection method, defect detection apparatus, defect detection program, storage medium, and defect detection program

Similar Documents

Publication Publication Date Title
CN109902600B (en) Road area detection method
CN108305260B (en) Method, device and equipment for detecting angular points in image
CN110532946B (en) Method for identifying axle type of green-traffic vehicle based on convolutional neural network
CN114742799B (en) Industrial scene unknown type defect segmentation method based on self-supervision heterogeneous network
CN111696110A (en) Scene segmentation method and system
CN110599453A (en) Panel defect detection method and device based on image fusion and equipment terminal
CN114419413A (en) Method for constructing sensing field self-adaptive transformer substation insulator defect detection neural network
CN116012291A (en) Industrial part image defect detection method and system, electronic equipment and storage medium
CN111915628A (en) Single-stage instance segmentation method based on prediction target dense boundary points
CN113505702A (en) Pavement disease identification method and system based on double neural network optimization
CN116363064A (en) Defect identification method and device integrating target detection model and image segmentation model
CN113537037A (en) Pavement disease identification method, system, electronic device and storage medium
CN116091823A (en) Single-feature anchor-frame-free target detection method based on fast grouping residual error module
CN111915634A (en) Target object edge detection method and system based on fusion strategy
CN113177956B (en) Semantic segmentation method for unmanned aerial vehicle remote sensing image
CN117392375A (en) Target detection algorithm for tiny objects
CN117372829A (en) Marine vessel target identification method, device, electronic equipment and readable medium
CN116129417A (en) Digital instrument reading detection method based on low-quality image
CN114663658B (en) Small sample AOI surface defect detection method with cross-domain migration capability
CN112906707B (en) Semantic segmentation method and device for surface defect image and computer equipment
CN112446292B (en) 2D image salient object detection method and system
CN112800952B (en) Marine organism identification method and system based on improved SSD algorithm
CN114708423A (en) Underwater target detection method based on improved Faster RCNN
CN113065548A (en) Feature-based text detection method and device
CN111739025A (en) Image processing method, device, terminal and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination