CN113487551B - Gasket detection method and device for improving dense target performance based on deep learning - Google Patents


Info

Publication number
CN113487551B
CN113487551B (application CN202110732858.7A)
Authority
CN
China
Prior art keywords
layer
gasket
convolution
detection
feature map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110732858.7A
Other languages
Chinese (zh)
Other versions
CN113487551A (en)
Inventor
黄坤山
李霁峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Foshan Nanhai Guangdong Technology University CNC Equipment Cooperative Innovation Institute
Original Assignee
Foshan Nanhai Guangdong Technology University CNC Equipment Cooperative Innovation Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Foshan Nanhai Guangdong Technology University CNC Equipment Cooperative Innovation Institute filed Critical Foshan Nanhai Guangdong Technology University CNC Equipment Cooperative Innovation Institute
Priority to CN202110732858.7A priority Critical patent/CN113487551B/en
Publication of CN113487551A publication Critical patent/CN113487551A/en
Application granted granted Critical
Publication of CN113487551B publication Critical patent/CN113487551B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0004Industrial image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20092Interactive image processing based on input by user
    • G06T2207/20104Interactive definition of region of interest [ROI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging

Abstract

The invention discloses a gasket detection method for improving dense-target performance based on deep learning, which comprises the following steps: collecting gasket images on a detection assembly line, preprocessing the gasket images, and feeding the preprocessed images into a trained gasket detection model for detection to obtain gasket detection and positioning results. The gasket detection model comprises a deep feature extraction network Backbone, a feature pyramid network Neck and a detection network Head. The collected gasket image data are organized, cleaned and annotated, and finally made into a training set and a test set. Gasket types and labels are defined according to the manufacturer's requirements for gasket detection, and the model is trained and generalization-tested repeatedly with the training set until a gasket detection model whose performance meets the requirements is obtained. Compared with common visual detection algorithms, the invention performs better on dense-target detection and realizes high-speed, high-precision, non-contact detection and positioning of gaskets.

Description

Gasket detection method and device for improving dense target performance based on deep learning
Technical Field
The invention relates to the field of deep learning, in particular to a gasket detection method and device for improving dense target performance based on deep learning.
Background
A gasket is an indispensable part between a screw and a nut: it increases the contact area, reduces pressure, prevents loosening, and protects both the part and the screw. Factories producing gaskets generally convey them along a production line to each processing area, so the gaskets must be precisely controlled during transport and counted and inspected in a timely manner.
The conventional approach is to have workers count and inspect at both ends of the line, but the gaskets are numerous and small, so this method is inefficient and costly, and worker fatigue over long shifts causes many missed inspections.
Disclosure of Invention
The invention aims to provide a gasket detection method and device for improving the performance of a dense target based on deep learning.
To accomplish this, the invention adopts the following technical scheme:
a gasket detection method for improving dense target performance based on deep learning comprises the following steps:
collecting gasket images on a detection assembly line, preprocessing the gasket images, and sending the preprocessed gasket images into a trained gasket detection model for detection to obtain gasket detection and positioning results;
the gasket detection model comprises a deep feature extraction network Backbone, a feature pyramid network Neck and a detection network Head, wherein:
the deep feature extraction network Backbone has a 19-layer structure, in which the first and second layers are both convolution layers and the third layer is a maximum pooling layer; the fourth to seventh layers, eighth to eleventh layers, twelfth to fifteenth layers and sixteenth to nineteenth layers each form a residual unit, each residual unit comprising two residual blocks, and each residual block being formed by two adjacent convolution layers; within each residual unit, the stride and number of convolution kernels of the first three convolution layers are the same, while the stride and number of convolution kernels of the last layer are twice those of the first three; the feature maps output by the seventh, eleventh, fifteenth and nineteenth layers are denoted C2, C3, C4 and C5 respectively;
the feature pyramid network Neck has a five-layer structure of P2, P3, P4, P5, and P6, respectively, wherein:
the P6 layer receives the feature map output by the P5 layer and performs a convolution operation for feature scaling and channel-number reduction; the P5 layer performs a convolution operation on feature map C5 to reduce its channel number; the P4 layer performs a convolution on feature map C4 to reduce its channel number, performs a convolution on the P5 output for up-sampling, and then fuses the channel-reduced feature map with the up-sampled one; P3 fuses the channel-reduced C3 with the up-sampled P4 output; P2 fuses the channel-reduced C2 with the up-sampled P3 output;
the detection network Head has a first branch structure and a second branch structure, wherein:
the first layer of the first branch is a convolution layer that takes the feature maps output by the Neck network layers P2, P3, P4, P5 and P6 and performs a convolution operation keeping size and channel number unchanged, to stabilize the features; the second layer is a convolution layer that reduces the channel number of the first layer's output; the third layer is a bounding-box generation and prediction layer that generates a prediction box pixel by pixel on the feature map output by the second layer, performs a regression operation, and finally outputs an initial four-dimensional vector of the detected target to establish an ROI (region of interest); the fourth layer is a deconvolution layer that performs a deconvolution operation on the ROI obtained by the previous layer; the fifth layer is a bounding-box generation and prediction layer that performs prediction-box generation and regression again on the feature map output by the fourth layer, yielding a new four-dimensional vector; the sixth layer is a bounding-box correction layer that corrects the initial four-dimensional vector from the third layer and the new four-dimensional vector from the fifth layer to obtain an accurate bounding box;
the second branch feeds the feature map output by the second layer of the first branch, together with the accurate bounding box obtained by the sixth layer of the first branch, into a classification layer, which identifies the targets inside the bounding boxes.
Further, the convolution kernel size of each of the first and second layers of the Backbone network is 3×3×64, with stride 1 in the first layer and stride 2 in the second; the pooling kernel of the third layer is 3×3×64 with stride 1; the fourth to sixth convolution layers have stride 1 and 64 kernels, while the seventh layer has stride 2 and the kernel count rises to 128; the eighth to tenth layers have stride 1 and 128 kernels, while the eleventh layer has stride 2 and the kernel count rises to 256; the twelfth to fourteenth layers have stride 1 and 256 kernels, while the fifteenth layer has stride 2 and the kernel count rises to 512; the sixteenth to eighteenth layers have stride 1 and 512 kernels, while the nineteenth layer has stride 2 and the kernel count rises to 1024.
Further, the convolution kernel in the P6 layer of the Neck network is 1×1 with stride 2 and 128 kernels; in the P5 layer it is 1×1 with stride 1 and 128 kernels; the P4 layer has two convolution layers: the first kernel is 1×1 with stride 1 and 128 kernels, and the second is 1×1 with stride 1/2 (i.e., 2× up-sampling) and 128 kernels.
Further, in the Head network, the convolution kernel of the first layer is 3×3, and that of the second layer is 1×1 with 36 kernels.
Further, the initial four-dimensional vector from the third layer and the new four-dimensional vector from the fifth layer are corrected to obtain the accurate bounding box, where:
Obox={x,y,w,h}
Tbox={x′,y′,w′,h′}
where Obox denotes the initial four-dimensional vector, Tbox the new four-dimensional vector, and Nbox the final bounding-box positioning result.
Further, the training process of the gasket detection model is as follows:
step 1, collecting a large number of gasket pictures, and ensuring the quantity balance of various types of gaskets in all gasket pictures;
step 2, preprocessing the acquired gasket pictures, including arrangement, cleaning and labeling;
step 3, manufacturing a training set and a testing set for the preprocessed gasket picture; wherein the training set does not coincide with the test set;
and 4, training the established gasket detection model by using a training set, performing generalization test on the trained gasket detection model by using a testing set, detecting performance indexes of the gasket detection model according to a testing result, and obtaining the trained gasket detection model after the performance indexes meet design requirements.
Further, the preprocessing of the collected gasket picture specifically includes:
Arrangement adjusts the size and orientation of the gasket pictures so that they are unified into the same format; cleaning defines the labels of the various gasket types according to the producer's definitions; annotation marks the corresponding type label on each gasket with an annotation tool and generates an annotation file as a training positive sample, while non-gasket regions are classed as training negative samples.
A deep learning-based gasket detection device that promotes dense target performance, comprising:
the image acquisition module is used for acquiring and detecting gasket images on the assembly line and preprocessing the gasket images;
the identification detection module is used for sending the preprocessed gasket image into a trained gasket detection model for detection to obtain a gasket detection and positioning result; the gasket detection model comprises a deep feature extraction network Backbone, a feature pyramid network Neck and a detection network Head.
A terminal device is installed on a gasket detection pipeline and comprises a memory, a processor and a computer program which is stored in the memory and can run on the processor, wherein the processor realizes the steps of the gasket detection method based on deep learning to improve dense target performance when executing the computer program.
A computer-readable storage medium storing a computer program which, when executed by a processor, performs the steps of the aforementioned deep learning-based gasket detection method for improving dense target performance.
Compared with the prior art, the invention has the following technical characteristics:
The invention learns from the production-object gaskets and can finally detect and position the gaskets on the production line. Experiments show that the method not only reduces the detection workload but also performs better on dense-target detection than common visual detection algorithms, thereby realizing high-speed, high-precision, non-contact detection and positioning of gaskets.
Drawings
FIG. 1 is a schematic diagram of a gasket detection model;
FIG. 2 is a flowchart of a training process for a pad inspection model;
FIG. 3 is an image of a target pad to be inspected in one embodiment of the invention;
FIG. 4 is a graph showing the output of the object detection pad after model detection in accordance with one embodiment of the present invention.
Detailed Description
Referring to fig. 1 and 2, the method for detecting the gasket based on deep learning to improve the dense target performance comprises the following steps:
collecting gasket images on a detection assembly line, preprocessing the gasket images, and feeding the preprocessed images into a trained gasket detection model for detection to obtain gasket detection and positioning results; the gasket detection model comprises a deep feature extraction network Backbone, a feature pyramid network Neck and a detection network Head, wherein:
1. Deep feature extraction network Backbone
The Backbone network has a 19-layer structure, specifically:
The first and second layers are both convolution layers, processed in series to suit this project's small-target input; each convolution kernel is 3×3×64 (height × width × channels), since the reduced kernel size allows more small-target features to be captured. The stride of the first layer is 1 and that of the second layer is 2, so a 224×224×3 input gasket image passes through these two layers to give a 112×112×64 output, denoted feature map C1.
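The shape arithmetic above can be checked with the standard convolution output-size formula; this is an illustrative sketch (the helper name and padding of 1 are our assumptions, not stated in the patent):

```python
# Hypothetical sketch: output-size arithmetic for the first two 3x3
# convolution layers, assuming padding of 1 ("same"-style padding).
def conv_out(size: int, kernel: int = 3, stride: int = 1, pad: int = 1) -> int:
    """Standard convolution output-size formula."""
    return (size + 2 * pad - kernel) // stride + 1

h = 224                      # input gasket image is 224x224x3
h = conv_out(h, stride=1)    # layer 1: 3x3 conv, stride 1 -> 224
h = conv_out(h, stride=2)    # layer 2: 3x3 conv, stride 2 -> 112
print(h)                     # 112, matching the stated 112x112x64 output
```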
The third layer is a maximum pooling layer, which reinforces the feature map C1 obtained from the first two convolutions so that features are not lost in the subsequent repeated convolutions; its pooling kernel is 3×3×64 with stride 1, so the output remains 112×112×64.
The fourth to seventh layers adopt small blocks designed on the residual principle, with every two convolution layers forming one residual block; the convolution kernels of the two layers in each block are 1×1×64 and 3×3×64 respectively. The fourth to sixth layers have stride 1 and 64 kernels, while the seventh layer has stride 2 and the kernel count rises to 128. The third layer's output is 112×112×64; after the two residual blocks, the seventh layer outputs 56×56×128, denoted C2.
The eighth to eleventh layers likewise adopt residual blocks of two convolution layers each, with kernels of 1×1×128 and 3×3×128 respectively. The eighth to tenth layers have stride 1 and 128 kernels, while the eleventh layer has stride 2 and the kernel count rises to 256. The seventh layer's output is 56×56×128; after the two residual blocks, the eleventh layer outputs 28×28×256, denoted C3.
The twelfth to fifteenth layers likewise adopt residual blocks of two convolution layers each, with kernels of 1×1×256 and 3×3×256 respectively. The twelfth to fourteenth layers have stride 1 and 256 kernels, while the fifteenth layer has stride 2 and the kernel count rises to 512. The eleventh layer's output is 28×28×256; after the two residual blocks, the fifteenth layer outputs 14×14×512, denoted C4.
The sixteenth to nineteenth layers likewise adopt residual blocks of two convolution layers each, with kernels of 1×1×512 and 3×3×512 respectively. The sixteenth to eighteenth layers have stride 1 and 512 kernels, while the nineteenth layer has stride 2 and the kernel count rises to 1024. The fifteenth layer's output is 14×14×512; after the two residual blocks, the nineteenth layer outputs 7×7×1024, denoted C5.
This completes the Backbone feature extraction design. The final feature-map outputs are C2 at 56×56×128, C3 at 28×28×256, C4 at 14×14×512 and C5 at 7×7×1024, which serve as inputs to the subsequent feature pyramid network Neck.
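The C2 to C5 shapes follow a simple pattern from the layer description above: each residual unit halves the spatial size (one stride-2 layer) and doubles the channel count. A hedged sketch of that bookkeeping (it mirrors the stated numbers only, and is not an implementation of the network):

```python
# Trace the Backbone feature-map shapes: starting from the 112x112x64
# output of the pooling layer, each of the 4 residual units halves the
# spatial size and doubles the channels.
def backbone_shapes(size=112, channels=64, units=4):
    shapes = []
    for _ in range(units):
        size //= 2          # the unit's last conv has stride 2
        channels *= 2       # kernel count doubles
        shapes.append((size, size, channels))
    return shapes

print(backbone_shapes())
# [(56, 56, 128), (28, 28, 256), (14, 14, 512), (7, 7, 1024)] -> C2..C5
```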
2. Feature pyramid network Neck
The Neck network adopts a feature pyramid structure and mainly fuses the multi-scale features output by the Backbone network to improve feature recognition at various scales. To suit the small, densely packed targets of gasket detection, this application uses an FPN with a 5-layer structure: the first four layers P2, P3, P4 and P5 correspond to the Backbone outputs C2, C3, C4 and C5 respectively, while the fifth layer P6 applies a further 1×1 convolution with stride 2 to C5, shrinking the feature map again to capture even smaller targets. Five output feature maps are finally obtained, all unified to 128 channels, with sizes 56×56, 28×28, 14×14, 7×7 and 3×3 respectively.
The P6 layer takes the 7×7×1024 feature map forwarded identically through P5 (i.e., C5) and applies 1×1 convolution kernels with stride 2 and 128 kernels, performing both feature scaling and channel-number reduction; the P6 output is 3×3×128.
The P5 layer takes the 7×7×1024 feature map C5 and applies a 1×1 convolution with stride 1 and 128 kernels, performing only channel-number reduction; P5 outputs 7×7×128.
The P4 layer takes the 14×14×512 feature map C4 and the 7×7×128 feature map output by P5. P4 first applies to C4 a 1×1 convolution with stride 1 and 128 kernels, completing the channel-number reduction and yielding a 14×14×128 feature map; it then applies to the P5 output a 1×1 convolution with stride 1/2 and 128 kernels, completing the up-sampling and yielding a 14×14×128 feature map; finally P4 fuses the two feature maps, giving the final 14×14×128 feature map.
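The P4 fusion only works because the two branches end up with identical shapes; a minimal sketch of that shape check (the helper names and the (H, W, C) tuple convention are our assumptions, not the patent's code):

```python
# Shape bookkeeping for the P4 fusion: a channel-reduced C4 and a
# 2x-upsampled P5 must agree before element-wise fusion.
def reduce_channels(shape, c=128):
    """1x1 convolution with stride 1 keeps H, W and sets channels to c."""
    h, w, _ = shape
    return (h, w, c)

def upsample2x(shape):
    """Stride-1/2 convolution doubles the spatial size."""
    h, w, c = shape
    return (2 * h, 2 * w, c)

c4, p5 = (14, 14, 512), (7, 7, 128)
a = reduce_channels(c4)   # -> (14, 14, 128)
b = upsample2x(p5)        # -> (14, 14, 128)
assert a == b == (14, 14, 128)   # fusion is now well-defined
```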
The processing mode of P3 and P2 is consistent with P4:
that is, P3 fuses the feature map after the channel down processing of C3 and the feature map after the up sampling processing of the output of P4, and obtains a feature map with a final output size of 28×28×128.
And P2 fuses the channel-reduced C2 with the up-sampled P3 output, obtaining a feature map with final output size 56×56×128.
Thus, the Neck network design ends. In the final output, the feature size of P2 is 56×56×128, the feature size of P3 is 28×28×128, the feature size of P4 is 14×14×128, the feature size of P5 is 7×7×128, and the feature size of P6 is 3×3×128.
3. Detecting network Head
To improve dense-detection performance, the Head network in this scheme is split into two branches, responsible respectively for accurate class-confidence detection and accurate bounding-box localization on the feature maps output from P2 to P6.
The structure of the first branch is as follows:
the first layer is a convolution layer, and Neck output is obtained and subjected to 3*3 convolution operation; taking P5 as an example, the feature map of 7×7×128 output by P5, the size and the channel number are all kept unchanged, and this operation is mainly to keep the feature stable and strengthen the effect of the Neck network fusion.
The second layer is a convolution layer, which reduces the channel number of the first layer's output in preparation for generating bounding boxes on the feature map. It uses 1×1 convolution kernels with 36 kernels; the input is 7×7×128 and the output is 7×7×36.
The third layer is a bounding-box generation and prediction layer, which generates a prediction box pixel by pixel on the 7×7×36 feature map output by the second layer, performs a regression operation, and finally outputs an initial four-dimensional vector (x, y, w, h) of the detected object to establish the ROI region.
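A minimal sketch of what "pixel-by-pixel" generation could look like, shown purely as our interpretation (the seeding stride, box size and function name are assumptions; regression would then refine these seeds into the initial ROIs):

```python
# Seed one candidate box (x, y, w, h) at every cell of a 7x7 feature map.
def seed_boxes(feat_h=7, feat_w=7, stride=32, box=32):
    boxes = []
    for i in range(feat_h):
        for j in range(feat_w):
            # project the cell centre back onto the input image
            x, y = j * stride + stride // 2, i * stride + stride // 2
            boxes.append((x, y, box, box))   # centre-x, centre-y, w, h
    return boxes

print(len(seed_boxes()))   # 49 candidate boxes for a 7x7 feature map
```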
The fourth layer is a deconvolution layer (deconv), which performs a deconvolution operation on the ROI (region of interest) obtained by the previous layer; that is, with the target position locked, the feature map is restored to its pre-convolution state, giving the target's extent in lower-level semantics.
The fifth layer is a bounding-box generation and prediction layer with the same structure as the third layer; it performs prediction-box generation and regression again on the deconvolved feature map, yielding a new four-dimensional vector (x′, y′, w′, h′).
The sixth layer is a bounding-box correction layer, which compares and corrects the four-dimensional vectors obtained by the third and fifth layers to finally obtain an accurate bounding box, calculated as follows:
Obox={x,y,w,h}
Tbox={x′,y′,w′,h′}
where Obox denotes the target's initial four-dimensional vector, Tbox the new four-dimensional vector, and Nbox the finally obtained bounding-box positioning result.
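The text does not spell out the formula that combines Obox and Tbox into Nbox. Purely as an illustrative assumption (not the claimed method), a simple element-wise average of the two vectors would look like this:

```python
# Hypothetical correction step: average the initial and re-predicted
# four-dimensional vectors. The actual correction formula is not given
# in the text; this is only a placeholder illustration.
def correct_box(obox, tbox):
    return tuple((o + t) / 2 for o, t in zip(obox, tbox))

obox = (10.0, 12.0, 30.0, 30.0)   # initial (x, y, w, h)
tbox = (11.0, 11.0, 32.0, 28.0)   # re-predicted (x', y', w', h')
print(correct_box(obox, tbox))    # (10.5, 11.5, 31.0, 29.0)
```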
The second branch feeds the 7×7×36 feature map output by the second layer of the first branch, together with the accurate bounding box obtained by the sixth layer of the first branch, into the classification layer, which identifies the target inside the bounding box.
The feature maps at different scales output by P2 to P6 in the Neck network are each fed into the Head network for processing; if no target is detected when a feature map is processed, no bounding box is output for it, otherwise the bounding box and the identified target are output.
For the gasket detection model, the training process of the invention is as follows:
step 1, collecting a large number of gasket pictures, and ensuring the quantity balance of various types of gaskets in all gasket pictures;
step 2, preprocessing the acquired gasket pictures, including arrangement, cleaning and labeling;
the size direction of the gasket pictures is adjusted, and the gasket pictures can be unified into the same format, so that the gasket pictures can be placed into a gasket detection model for training and testing. The cleaning is to define the labels of various types of gaskets according to the definition of the gasket by the producer; labeling is to label the corresponding type label in the gasket by using a labeling tool, and generate a labeling file as a training positive sample, and classify the non-gasket part as a training negative sample.
Step 3, manufacturing a training set and a testing set for the preprocessed gasket picture; wherein the training set does not coincide with the test set.
Step 4: train the established gasket detection model with the training set, run generalization tests on it with the test set, check performance indicators such as accuracy against the test results, and once the indicators meet the design requirements, take the trained gasket detection model and apply it to the detection assembly line of gasket production.
While checking the performance indicators, the classification and recognition accuracy of the gaskets is counted, and the false-detection rate and accuracy are calculated separately; if the performance does not meet the standard, steps 1 to 4 are repeated, continually increasing the number and variety of gaskets to enrich the detection algorithm's coverage of gaskets, while adjusting the ratio of positive to negative samples to strengthen its rejection of negative samples.
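The indicators named above can be computed from simple counts; a minimal sketch (the exact definitions the authors use are not stated, so the false-detection-rate definition below is our assumption):

```python
# Accuracy and false-detection rate from confusion-matrix counts.
def accuracy(tp, tn, fp, fn):
    """Fraction of all samples classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def false_detection_rate(fp, tp):
    """Assumed definition: fraction of predicted gaskets that are not gaskets."""
    return fp / (tp + fp)

# e.g. 90 true positives, 5 true negatives, 3 false positives, 2 false negatives
print(accuracy(90, 5, 3, 2))          # 0.95
print(false_detection_rate(10, 90))   # 0.1
```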
Another aspect of the present invention further provides a spacer detection apparatus for improving dense target performance based on deep learning, including:
the image acquisition module is used for acquiring and detecting gasket images on the assembly line and preprocessing the gasket images;
the identification detection module is used for sending the preprocessed gasket image into a trained gasket detection model for detection to obtain a gasket detection and positioning result; the gasket detection model comprises a deep feature extraction network Backbone, a feature pyramid network Neck and a detection network Head.
It should be noted that the specific structures of the deep feature extraction network Backbone, the feature pyramid network Neck and the detection network Head are explained in the corresponding steps of the foregoing method embodiments and are not repeated here.
The embodiment of the application further provides a terminal device, which can be a computer or a server; the method comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor realizes the steps of the gasket detection method for improving the dense target performance based on deep learning when executing the computer program.
A computer program may also be split into one or more modules/units that are stored in memory and executed by a processor to complete the present application. One or more modules/units may be a series of instruction segments of a computer program capable of performing a specific function, where the instruction segments are used to describe an execution process of the computer program in the terminal device, for example, the computer program may be divided into a picture acquisition module and an identification detection module, and the functions of each module are referred to in the foregoing apparatus and are not described herein.
Embodiments of the present application further provide a computer-readable storage medium storing a computer program which, when executed by a processor, performs the steps of the deep learning-based gasket detection method for improving dense target performance.
The integrated modules/units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on this understanding, the present application may implement all or part of the flow of the above method embodiments by instructing the relevant hardware through a computer program, which may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of each method embodiment described above. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content of the computer-readable medium may be appropriately increased or decreased according to the legislation and patent practice of the jurisdiction; for example, in some jurisdictions, the computer-readable medium does not include electrical carrier signals and telecommunications signals.
The above embodiments are intended only to illustrate the technical solution of the present application, not to limit it. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. A gasket detection method for improving dense target performance based on deep learning, characterized by comprising the following steps:
collecting gasket images on a detection assembly line, preprocessing the gasket images, and sending the preprocessed gasket images into a trained gasket detection model for detection to obtain gasket detection and positioning results;
the gasket detection model comprises a deep feature extraction network Backbone, a feature pyramid network Neck and a detection network Head, wherein:
the deep feature extraction network Backbone has a 19-layer network structure, wherein the first layer and the second layer are both convolution layers and the third layer is a maximum pooling layer; the fourth to seventh layers, the eighth to eleventh layers, the twelfth to fifteenth layers and the sixteenth to nineteenth layers each form a residual unit, each residual unit comprises two residual blocks, and each residual block is formed by two adjacent convolution layers; in each residual unit, the first three convolution layers share the same step size and number of convolution kernels, while the last convolution layer has twice the step size and twice the number of convolution kernels of the first three; the feature maps output by the seventh, eleventh, fifteenth and nineteenth layers are denoted C2, C3, C4 and C5, respectively;
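For illustration only (not part of the claimed subject matter), the Backbone layer schedule recited above can be traced numerically; the 512×512 input resolution and "same" padding (output spatial size = ceil(size / stride)) are assumptions introduced here for the trace:

```python
import math

# Illustrative sketch of the 19-layer Backbone recited in claims 1 and 2.
# The 512x512 input resolution and "same" padding are assumptions.
LAYERS = [
    ("conv", 1, 64), ("conv", 2, 64), ("maxpool", 1, 64),   # layers 1-3
    ("conv", 1, 64), ("conv", 1, 64), ("conv", 1, 64),      # residual unit 1
    ("conv", 2, 128),
    ("conv", 1, 128), ("conv", 1, 128), ("conv", 1, 128),   # residual unit 2
    ("conv", 2, 256),
    ("conv", 1, 256), ("conv", 1, 256), ("conv", 1, 256),   # residual unit 3
    ("conv", 2, 512),
    ("conv", 1, 512), ("conv", 1, 512), ("conv", 1, 512),   # residual unit 4
    ("conv", 2, 1024),
]
TAPS = {7: "C2", 11: "C3", 15: "C4", 19: "C5"}  # layers whose output is kept

def backbone_shapes(h, w):
    """Trace (height, width, channels) through the 19 layers."""
    feats = {}
    for idx, (_, stride, channels) in enumerate(LAYERS, start=1):
        h, w = math.ceil(h / stride), math.ceil(w / stride)
        if idx in TAPS:
            feats[TAPS[idx]] = (h, w, channels)
    return feats

shapes = backbone_shapes(512, 512)
for name in ("C2", "C3", "C4", "C5"):
    print(name, shapes[name])
```

Each residual unit ends with a stride-2, double-width convolution, so C2 through C5 halve in spatial size while doubling in channels, which is what the Neck below relies on.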
the feature pyramid network Neck has a five-layer structure of P2, P3, P4, P5, and P6, respectively, wherein:
the P6 layer receives the feature map output by the P5 layer and performs a convolution operation to scale the features and reduce the number of channels; the P5 layer performs a convolution operation on feature map C5 to reduce its number of channels; the P4 layer performs a convolution operation on feature map C4 to reduce its number of channels, performs a convolution operation on the feature map output by P5 to up-sample it, and then fuses the channel-reduced feature map with the up-sampled feature map; P3 fuses the channel-reduced feature map of C3 with the up-sampled output of P4; P2 fuses the channel-reduced feature map of C2 with the up-sampled output of P3;
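For illustration only, the top-down fusion recited above can be sketched with NumPy, modelling each 1×1 convolution as a per-pixel linear projection, up-sampling as nearest-neighbour repetition, and P6's stride-2 convolution as subsampling plus projection; the 128-channel width follows claim 3, while the random weights and input spatial sizes are assumptions:

```python
import numpy as np

# Illustrative sketch of the Neck's top-down feature fusion.
rng = np.random.default_rng(0)

def project(feat, out_ch=128):
    """Per-pixel linear map over channels (stands in for a 1x1 conv)."""
    w = rng.standard_normal((feat.shape[-1], out_ch)) * 0.01
    return feat @ w

def upsample2x(feat):
    """Nearest-neighbour 2x spatial up-sampling."""
    return feat.repeat(2, axis=0).repeat(2, axis=1)

# Backbone outputs C2..C5 with the channel counts of claims 1-2.
sizes = {"C2": (128, 128), "C3": (64, 256), "C4": (32, 512), "C5": (16, 1024)}
C = {k: rng.standard_normal((s, s, ch)) for k, (s, ch) in sizes.items()}

P5 = project(C["C5"])                    # channel reduction only
P4 = project(C["C4"]) + upsample2x(P5)   # lateral path + top-down path
P3 = project(C["C3"]) + upsample2x(P4)
P2 = project(C["C2"]) + upsample2x(P3)
P6 = project(P5[::2, ::2])               # stride-2 conv: halve size, project
print([p.shape for p in (P2, P3, P4, P5, P6)])
```

The fusion is element-wise addition here; since every level is reduced to the same channel count first, the lateral and up-sampled maps always align in shape.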
the detection network Head has a first branch structure and a second branch structure, wherein:
the first layer of the first branch is a convolution layer that takes the feature maps output by the Neck layers P2, P3, P4, P5 and P6 and performs a convolution operation, keeping the size and number of channels unchanged to stabilize the features; the second layer is a convolution layer used to reduce the number of channels of the first layer's output; the third layer is a bounding box generation and prediction layer used to generate a prediction box pixel by pixel on the feature map output by the second layer, perform a regression operation, and output an initial four-dimensional vector for each detected target in the feature map to establish a region of interest (ROI); the fourth layer is a deconvolution layer used to perform a deconvolution operation on the ROI obtained by the previous layer; the fifth layer is a bounding box generation and prediction layer used to perform prediction box generation and regression again on the feature map output by the fourth layer, yielding a new four-dimensional vector; the sixth layer is a bounding box correction layer used to correct the initial and new four-dimensional vectors obtained by the third and fifth layers to obtain an accurate bounding box;
the second branch takes the feature map output by the second layer of the first branch and the accurate bounding box obtained by the sixth layer of the first branch as input to the classification layer, and identifies the targets within the bounding box.
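For illustration only, the first branch's two-pass box refinement can be sketched as follows. The patent's exact correction formula combining Obox and Tbox is not reproduced in this text, so the `fuse()` below uses a plain element-wise average purely as a hypothetical stand-in:

```python
import numpy as np

# Illustrative sketch of the Head's pixel-wise box generation, two
# regression passes, and a hypothetical correction step.
rng = np.random.default_rng(0)

def generate_boxes(feat_h, feat_w, stride):
    """One prediction box per feature-map pixel: (cx, cy, w, h)."""
    ys, xs = np.mgrid[0:feat_h, 0:feat_w]
    cx, cy = (xs + 0.5) * stride, (ys + 0.5) * stride
    wh = np.full_like(cx, float(stride), dtype=float)
    return np.stack([cx, cy, wh, wh], axis=-1).reshape(-1, 4)

def regress(boxes, offsets):
    """Regression step: shift centres, scale widths/heights."""
    out = boxes.astype(float).copy()
    out[:, :2] += offsets[:, :2]
    out[:, 2:] *= np.exp(offsets[:, 2:])
    return out

def fuse(obox, tbox):
    """Hypothetical correction layer: average the two passes."""
    return (obox + tbox) / 2.0

anchors = generate_boxes(4, 4, stride=8)              # pixel-wise boxes
obox = regress(anchors, rng.normal(0, 0.1, (16, 4)))  # third layer: Obox
tbox = regress(obox, rng.normal(0, 0.1, (16, 4)))     # fifth layer: Tbox
nbox = fuse(obox, tbox)                               # sixth layer: Nbox
print(nbox.shape)
```

Running the regression twice, the second time on deconvolved ROI features, is what the claim relies on for dense targets: the second pass sees a higher-resolution view of each candidate region.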
2. The gasket detection method for improving dense target performance based on deep learning according to claim 1, wherein the convolution kernel size of each of the first and second layers of the Backbone network is 3×3×64, the step size of the first layer being 1 and that of the second layer being 2; the pooling kernel size of the third layer is 3×3×64 with a step size of 1; the convolution layers of the fourth to sixth layers have a step size of 1 and 64 convolution kernels, while the seventh convolution layer has a step size of 2 and the number of kernels rises to 128; the eighth to tenth layers have a step size of 1 and 128 kernels, while the eleventh layer has a step size of 2 and the number of kernels rises to 256; the twelfth to fourteenth layers have a step size of 1 and 256 kernels, while the fifteenth layer has a step size of 2 and the number of kernels rises to 512; the sixteenth to eighteenth layers have a step size of 1 and 512 kernels, while the nineteenth layer has a step size of 2 and the number of kernels rises to 1024.
3. The gasket detection method for improving dense target performance based on deep learning according to claim 1, wherein the convolution kernel size in the P6 layer of the Neck network is 1×1 with a step size of 2 and 128 convolution kernels; the convolution kernel size in the P5 layer is 1×1 with a step size of 1 and 128 convolution kernels; the P4 layer has two convolution layers, the first with a 1×1 kernel, a step size of 1 and 128 kernels, and the second with a 1×1 kernel, a step size of 1/2 (i.e., 2× up-sampling) and 128 kernels.
4. The gasket detection method for improving dense target performance based on deep learning according to claim 1, wherein the convolution kernel size of the first layer in the Head network is 3×3, and that of the second layer is 1×1 with 36 kernels.
5. The gasket detection method for improving dense target performance based on deep learning according to claim 1, wherein the correction of the initial and new four-dimensional vectors obtained by the third and fifth layers to obtain the accurate bounding box is calculated as follows:
where Obox denotes the initial four-dimensional vector, Tbox denotes the new four-dimensional vector, and Nbox denotes the final positioning result of the bounding box.
6. The gasket detection method for improving dense target performance based on deep learning according to claim 1, wherein the training process of the gasket detection model is as follows:
step 1, collecting a large number of gasket pictures, and ensuring the quantity balance of various types of gaskets in all gasket pictures;
step 2, preprocessing the acquired gasket pictures, including arrangement, cleaning and labeling;
step 3, dividing the preprocessed gasket pictures into a training set and a testing set, wherein the training set does not overlap with the testing set;
step 4, training the established gasket detection model with the training set, performing a generalization test on the trained gasket detection model with the testing set, checking the model's performance indicators against the test results, and obtaining the trained gasket detection model once the performance indicators meet the design requirements.
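For illustration only, step 3's requirement that the two sets be disjoint can be sketched as below; the 8:2 split ratio and the file names are assumptions not recited in the claims:

```python
import random

# Illustrative sketch of a disjoint train/test split (claim 6, step 3).
def split_dataset(paths, train_frac=0.8, seed=0):
    """Shuffle the picture list and cut it into two non-overlapping sets."""
    shuffled = paths[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

pictures = [f"gasket_{i:04d}.jpg" for i in range(100)]  # hypothetical names
train_set, test_set = split_dataset(pictures)
assert not set(train_set) & set(test_set)  # the claim requires no overlap
print(len(train_set), len(test_set))
```

Splitting by shuffled file list (rather than by slicing a sorted list) avoids accidentally grouping all pictures of one gasket type into the same set.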
7. The gasket detection method for improving dense target performance based on deep learning according to claim 6, wherein the preprocessing of the collected gasket pictures is specifically as follows:
the arrangement adjusts the size and orientation of the gasket pictures so that all pictures are unified into the same format; the cleaning defines the labels of the various gasket types according to the producer's definition of the gaskets; the labeling marks the corresponding type label on each gasket with a labeling tool and generates an annotation file as a training positive sample, while non-gasket regions are classified as training negative samples.
8. A gasket detection device for improving dense target performance based on deep learning, characterized by comprising:
an image acquisition module, used for acquiring gasket images on a detection assembly line and preprocessing the gasket images;
an identification detection module, used for sending the preprocessed gasket image into a trained gasket detection model for detection to obtain a gasket detection and positioning result;
wherein the gasket detection model comprises a deep feature extraction network Backbone, a feature pyramid network Neck and a detection network Head, wherein:
the deep feature extraction network Backbone has a 19-layer network structure, wherein the first layer and the second layer are both convolution layers and the third layer is a maximum pooling layer; the fourth to seventh layers, the eighth to eleventh layers, the twelfth to fifteenth layers and the sixteenth to nineteenth layers each form a residual unit, each residual unit comprises two residual blocks, and each residual block is formed by two adjacent convolution layers; in each residual unit, the first three convolution layers share the same step size and number of convolution kernels, while the last convolution layer has twice the step size and twice the number of convolution kernels of the first three; the feature maps output by the seventh, eleventh, fifteenth and nineteenth layers are denoted C2, C3, C4 and C5, respectively;
the feature pyramid network Neck has a five-layer structure of P2, P3, P4, P5, and P6, respectively, wherein:
the P6 layer receives the feature map output by the P5 layer and performs a convolution operation to scale the features and reduce the number of channels; the P5 layer performs a convolution operation on feature map C5 to reduce its number of channels; the P4 layer performs a convolution operation on feature map C4 to reduce its number of channels, performs a convolution operation on the feature map output by P5 to up-sample it, and then fuses the channel-reduced feature map with the up-sampled feature map; P3 fuses the channel-reduced feature map of C3 with the up-sampled output of P4; P2 fuses the channel-reduced feature map of C2 with the up-sampled output of P3;
the detection network Head has a first branch structure and a second branch structure, wherein:
the first layer of the first branch is a convolution layer that takes the feature maps output by the Neck layers P2, P3, P4, P5 and P6 and performs a convolution operation, keeping the size and number of channels unchanged to stabilize the features; the second layer is a convolution layer used to reduce the number of channels of the first layer's output; the third layer is a bounding box generation and prediction layer used to generate a prediction box pixel by pixel on the feature map output by the second layer, perform a regression operation, and output an initial four-dimensional vector for each detected target in the feature map to establish a region of interest (ROI); the fourth layer is a deconvolution layer used to perform a deconvolution operation on the ROI obtained by the previous layer; the fifth layer is a bounding box generation and prediction layer used to perform prediction box generation and regression again on the feature map output by the fourth layer, yielding a new four-dimensional vector; the sixth layer is a bounding box correction layer used to correct the initial and new four-dimensional vectors obtained by the third and fifth layers to obtain an accurate bounding box;
the second branch takes the feature map output by the second layer of the first branch and the accurate bounding box obtained by the sixth layer of the first branch as input to the classification layer, and identifies the targets within the bounding box.
9. A terminal device installed on a gasket detection pipeline, characterized by comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the gasket detection method for improving dense target performance based on deep learning according to any one of claims 1-7.
10. A computer readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the gasket detection method for improving dense target performance based on deep learning according to any one of claims 1-7.
CN202110732858.7A 2021-06-30 2021-06-30 Gasket detection method and device for improving dense target performance based on deep learning Active CN113487551B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110732858.7A CN113487551B (en) 2021-06-30 2021-06-30 Gasket detection method and device for improving dense target performance based on deep learning


Publications (2)

Publication Number Publication Date
CN113487551A CN113487551A (en) 2021-10-08
CN113487551B true CN113487551B (en) 2024-01-16

Family

ID=77936821

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110732858.7A Active CN113487551B (en) 2021-06-30 2021-06-30 Gasket detection method and device for improving dense target performance based on deep learning

Country Status (1)

Country Link
CN (1) CN113487551B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110287849A (en) * 2019-06-20 2019-09-27 北京工业大学 A lightweight deep-network image object detection method suitable for Raspberry Pi
CN110532859A (en) * 2019-07-18 2019-12-03 西安电子科技大学 Remote Sensing Target detection method based on depth evolution beta pruning convolution net
CN111783590A (en) * 2020-06-24 2020-10-16 西北工业大学 Multi-class small target detection method based on metric learning
CN111814750A (en) * 2020-08-14 2020-10-23 深延科技(北京)有限公司 Intelligent garbage classification method and system based on deep learning target detection and image recognition
CN111832513A (en) * 2020-07-21 2020-10-27 西安电子科技大学 Real-time football target detection method based on neural network
CN111898668A (en) * 2020-07-24 2020-11-06 佛山市南海区广工大数控装备协同创新研究院 Small target object detection method based on deep learning
CN112613541A (en) * 2020-12-08 2021-04-06 北京迈格威科技有限公司 Target detection method and device, storage medium and electronic equipment


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
A-DFPN: Adversarial Learning and Deformation Feature Pyramid Networks for Object Detection; Miao Cheng; IEEE Xplore; full text *
Rethinking Classification and Localization for Object Detection; Yue Wu; arXiv; full text *
Research on Object Detection Algorithms Based on Feature Pyramid Networks; 刘佳甲; CNKI (China National Knowledge Infrastructure); full text *
Mask R-CNN Industrial Surface Defect Detection with an Improved FPN; 王海云; Manufacturing Automation, Vol. 42, No. 12, 2020; full text *


Similar Documents

Publication Publication Date Title
CN111681273B (en) Image segmentation method and device, electronic equipment and readable storage medium
CN111932511B (en) Electronic component quality detection method and system based on deep learning
CN110827312A (en) Learning method based on cooperative visual attention neural network
CN113177957B (en) Cell image segmentation method and device, electronic equipment and storage medium
CN116309612B (en) Semiconductor silicon wafer detection method, device and medium based on frequency decoupling supervision
CN113487551B (en) Gasket detection method and device for improving dense target performance based on deep learning
CN116912674A (en) Target detection method and system based on improved YOLOv5s network model under complex water environment
CN116824135A (en) Atmospheric natural environment test industrial product identification and segmentation method based on machine vision
CN116403062A (en) Point cloud target detection method, system, equipment and medium
CN116051532A (en) Deep learning-based industrial part defect detection method and system and electronic equipment
CN113763384B (en) Defect detection method and defect detection device in industrial quality inspection
CN115995017A (en) Fruit identification and positioning method, device and medium
CN113223037B (en) Unsupervised semantic segmentation method and unsupervised semantic segmentation system for large-scale data
CN115631197A (en) Image processing method, device, medium, equipment and system
CN115734072A (en) Internet of things centralized monitoring method and device for industrial automation equipment
CN114964628A (en) Shuffle self-attention light-weight infrared detection method and system for ammonia gas leakage
CN113487550B (en) Target detection method and device based on improved activation function
CN116109627B (en) Defect detection method, device and medium based on migration learning and small sample learning
CN117576098B (en) Cell division balance evaluation method and device based on segmentation
CN117440104B (en) Data compression reconstruction method based on target significance characteristics
Bian et al. Swin transformer UNet for very high resolution image dehazing
CN112381832A (en) Image semantic segmentation method based on optimized convolutional neural network
CN116702878A (en) Migration learning method for improving unsupervised domain adaptation effect based on countermeasure
CN117474898A (en) Method and device for detecting foreign matters in cans
CN116612284A (en) Small part image segmentation method and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant