CN113487550B - Target detection method and device based on improved activation function - Google Patents

Target detection method and device based on improved activation function

Info

Publication number
CN113487550B
CN113487550B
Authority
CN
China
Prior art keywords
layer
convolution
gasket
residual
layers
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110732476.4A
Other languages
Chinese (zh)
Other versions
CN113487550A (en)
Inventor
黄坤山
李霁峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Foshan Nanhai Guangdong Technology University CNC Equipment Cooperative Innovation Institute
Original Assignee
Foshan Nanhai Guangdong Technology University CNC Equipment Cooperative Innovation Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Foshan Nanhai Guangdong Technology University CNC Equipment Cooperative Innovation Institute filed Critical Foshan Nanhai Guangdong Technology University CNC Equipment Cooperative Innovation Institute
Priority to CN202110732476.4A priority Critical patent/CN113487550B/en
Publication of CN113487550A publication Critical patent/CN113487550A/en
Application granted granted Critical
Publication of CN113487550B publication Critical patent/CN113487550B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0004Industrial image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Quality & Reliability (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a target detection method and device based on an improved activation function. The method collects a gasket image on a detection line, preprocesses it, and feeds the preprocessed image into a trained target detection model to obtain the gasket's detection and positioning result. The target detection model comprises a deep feature extraction network (Backbone), a feature pyramid network (Neck) and a detection network (Head). Because the training data volume is large, the data types are numerous, and continual version iteration is required, the activation function is improved to obtain better learning performance. The scheme can rapidly detect and position gaskets, which facilitates subsequent robotic-arm sorting and accelerates intelligent gasket classification. At the same time, its stable and reliable performance allows long-term operation, realizing high-speed, high-precision, non-contact gasket detection and positioning, improving efficiency and saving cost.

Description

Target detection method and device based on improved activation function
Technical Field
The invention relates to the field of target detection and neural networks, and in particular to a target detection method and device based on an improved activation function.
Background
Parts produced on factory lines, especially small parts of various kinds, are extremely common, and the workpieces produced must be checked and inspected promptly during production. In current practice, a dedicated inspection line is set up and workers standing on both sides of the line count and inspect the workpieces flowing past. This is a lagging approach: it is inefficient and costly, and as working hours lengthen the workers tire and miss many inspections.
Disclosure of Invention
The invention aims to provide a target detection method and device based on an improved activation function, to solve the problems of low efficiency and high cost in existing inspection methods.
To achieve this aim, the invention adopts the following technical scheme:
A target detection method based on an improved activation function, comprising the following steps:
collecting a gasket image on the detection line, preprocessing it, and feeding the preprocessed image into a trained target detection model for detection to obtain the gasket's detection and positioning result;
the target detection model comprises a deep feature extraction network Backbone, a feature pyramid network Neck and a detection network Head, wherein:
the depth feature extraction network backhaul has a 53-layer network structure, wherein the first layer and the second layer are both convolution layers, and the third layer is a maximum pooling layer; the fourth layer to the twelfth layer, the thirteenth layer to the twenty-fourth layer, the twenty-fifth layer to the thirty-third layer, and the thirty-fourth layer to the fifty-third layer respectively constitute first residual error units to fourth residual error units, wherein: the first residual error unit and the third residual error unit both comprise two residual error blocks, and each residual error block is composed of two convolution layers of adjacent layers; the second residual error unit and the fourth residual error unit respectively comprise three residual error blocks and five residual error blocks, and each residual error block consists of four convolution layers of adjacent layers; in each residual error unit, the step length and the number of convolution kernels of the rest convolution layers are the same except for the last convolution layer, and the step length and the number of the convolution kernels of the last layer are multiples of the step length and the number of the convolution kernels of the first three convolution layers; the feature maps output by the twelfth layer, the twenty-fourth layer, the thirty-third layer and the fifty-third layer are respectively marked as C2, C3, C4 and C5;
the feature pyramid network Neck has a five-layer structure of P2, P3, P4, P5, and P6, respectively, wherein:
the P6 layer receives the feature map output by the P5 layer and carries out convolution operation to carry out feature scaling and channel descent treatment; the P5 layer carries out convolution operation on the feature map C5 so as to carry out channel number reduction processing; the P4 layer carries out convolution operation on the feature map C4 to carry out channel number reduction processing, carries out convolution operation on the feature map output by the P5 to carry out up-sampling processing, and then carries out fusion on the feature map after the channel number reduction processing and the feature map after the up-sampling processing; p3 fuses the feature map after the channel descent processing of C3 and the feature map after the up-sampling processing of the output of P4; p2 fuses the characteristic diagram after the C2 is subjected to channel descent treatment and the characteristic diagram after the P3 output is subjected to up-sampling treatment;
the detection network Head comprises an average pooling layer, a full connection layer and a frame regression and classification layer, wherein:
five average pooling layers are arranged and correspond to five feature graphs output by the Neck networks P2 to P6 respectively; after carrying out average pooling treatment on the corresponding feature graphs by the five average pooling layers respectively, inputting the results of the five average pooling layers into the full-connection layer, and finally entering a frame regression and classification layer to carry out prediction frame generation and regression operation to obtain classification results;
the activation functions used in the deep feature extraction network backbox, the feature pyramid network neg and the detection network Head are expressed as follows:
wherein x represents input data, e is a natural logarithmic base, beta is an adjustable parameter, and the value range is [1, + ] infinity.
Further, the adjustable parameter β in the activation function is obtained either by manual tuning or by adaptive learning.
Further, the convolution kernel size of each of the first and second layers of the Backbone network is 3×3×64, with stride 1 for the first layer and stride 2 for the second; the pooling kernel of the third layer is 3×3×64 with stride 1; the fourth to eleventh convolution layers have stride 1 and 64 kernels; the twelfth convolution layer has stride 2 and its kernel count rises to 256; the thirteenth to twenty-third layers have stride 1 and 256 kernels; the twenty-fourth layer has stride 2 and its kernel count rises to 512; the twenty-fifth to thirty-second layers have stride 1 and 512 kernels; the thirty-third layer has stride 2 and its kernel count rises to 1024; the thirty-fourth to fifty-second layers have stride 1 and 1024 kernels; and the fifty-third layer has stride 2 and its kernel count rises to 2048.
Further, in the Neck network the P6 layer uses 1×1 convolution kernels with stride 2 and 256 kernels; the P5 layer uses 1×1 kernels with stride 1 and 256 kernels; the P4 layer contains two convolution layers, the first using 1×1 kernels with stride 1 and 256 kernels, and the second using 1×1 kernels with stride 1/2 (i.e., 2× upsampling) and 256 kernels.
Further, the training process of the target detection model is as follows:
step 1, collecting a large number of gasket pictures, and ensuring the quantity balance of various types of gaskets in all gasket pictures;
step 2, preprocessing the acquired gasket pictures, including arrangement, cleaning and labeling;
step 3, manufacturing a training set and a testing set for the preprocessed gasket picture; wherein the training set does not coincide with the test set;
and 4, training the established target detection model by using a training set, performing generalization test on the trained target detection model by using a testing set, detecting the performance index of the target detection model according to a testing result, and obtaining the trained target detection model after the performance index meets the design requirement.
Further, the preprocessing of the collected gasket pictures is specifically:
arrangement adjusts the size and orientation of the gasket pictures so as to unify them into one format; cleaning defines the labels of each gasket type according to the producer's definitions; labeling marks the corresponding type label on each gasket with a labeling tool and generates an annotation file as training positive samples, while non-gasket regions are classed as training negative samples.
A target detection apparatus based on an improved activation function, comprising:
an image acquisition module for collecting gasket images on the detection line and preprocessing them;
a recognition detection module for feeding the preprocessed gasket image into a trained target detection model for detection to obtain the gasket's detection and positioning result; the target detection model comprises a deep feature extraction network Backbone, a feature pyramid network Neck and a detection network Head.
A terminal device installed on a gasket detection line, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor; when executing the computer program, the processor implements the steps of the aforementioned target detection method based on an improved activation function.
A computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the aforementioned target detection method based on an improved activation function.
Compared with the prior art, the invention has the following technical characteristics:
according to the method, a target detection model of the deep learning method is introduced into gasket detection, so that production automation and intellectualization are realized, an activation function more suitable for the project is provided, and better learning performance is obtained; the target detection model can rapidly detect and position the gaskets, thereby providing convenience for the sorting of the subsequent mechanical arms and accelerating the realization of the intellectualization of the gasket sorting; meanwhile, the device can run for a long time due to stable and reliable performance, realizes high-speed, high-precision and non-contact detection and positioning of the gasket, replaces an inefficient manual method, improves efficiency and saves cost.
Drawings
FIG. 1 is a schematic diagram of a structure of a target detection model;
FIG. 2 is a flowchart of a training process for a target detection model;
FIG. 3 is an image of a target gasket to be detected in one embodiment of the invention;
FIG. 4 shows the detection result output for the target gasket after model detection in one embodiment of the invention.
Detailed Description
Referring to FIG. 1, the target detection method based on an improved activation function of the present invention includes the following steps:
collecting a gasket image on the detection line, preprocessing it, and feeding the preprocessed image into a trained target detection model for detection to obtain the gasket's detection and positioning result; the target detection model comprises a deep feature extraction network Backbone, a feature pyramid network Neck and a detection network Head, wherein:
1. Deep feature extraction network Backbone
The Backbone network has a 53-layer network structure, specifically:
the first layer and the second layer are both convolution layers, and are processed by adopting two convolution layers which are processed in series in sequence to adapt to the small target input of the project, and the size of each convolution kernel is 3×3×64 (the number of long×wide×channel), because the kernel size is reduced, more small target features can be captured; the step length of the first layer is 1, the step length of the second layer is 2, so that the gasket image with the input of 224×224×3 is processed by the two layers, and the output of 112×112×64 is obtained, and the output characteristic diagram is marked as C1.
The third layer is a max-pooling layer, which reinforces the feature map C1 obtained from the first two convolutions so that features are not lost in the subsequent chain of convolutions; its pooling kernel size is 3×3×64 with stride 1, so the final output remains 112×112×64.
The fourth to twelfth layers (the first residual unit) use small blocks designed on the residual principle, with every three convolution layers forming one residual block, giving three residual blocks; the kernel sizes of the three convolution layers in each block are 1×1×64, 3×3×64 and 1×1×64. The fourth to eleventh layers have stride 1 and 64 kernels; the twelfth layer has stride 2 and its kernel count rises to 256. The feature map output by the third layer is 112×112×64; after the three residual blocks, the twelfth layer outputs 56×56×256, denoted C2.
The thirteenth to twenty-fourth layers (the second residual unit) likewise use residual blocks, with every four convolution layers forming one block, giving three residual blocks; the convolution layers in each block use kernels of 1×1×256, 3×3×256 and 1×1×256. The thirteenth to twenty-third layers have stride 1 and 256 kernels; the twenty-fourth layer has stride 2 and its kernel count rises to 512. The feature map input to the thirteenth layer is 56×56×256; after the three residual blocks, the twenty-fourth layer outputs 28×28×512, denoted C3.
The twenty-fifth to thirty-third layers (the third residual unit) share the structure of the fourth to twelfth layers: every three convolution layers form one residual block, giving three residual blocks whose layers use kernels of 1×1×512, 3×3×512 and 1×1×512. The twenty-fifth to thirty-second layers have stride 1 and 512 kernels; the thirty-third layer has stride 2 and its kernel count rises to 1024. The feature map input to the twenty-fifth layer is 28×28×512; after the three residual blocks, the thirty-third layer outputs 14×14×1024, denoted C4.
The thirty-fourth to fifty-third layers (the fourth residual unit) again use residual blocks, with every four convolution layers forming one block, giving five residual blocks; the convolution layers in each block use kernels of 1×1×1024, 3×3×1024 and 1×1×1024. The thirty-fourth to fifty-second layers have stride 1 and 1024 kernels; the fifty-third layer has stride 2 and its kernel count rises to 2048. The feature map input to the thirty-fourth layer is 14×14×1024; after the five residual blocks, the fifty-third layer outputs 7×7×2048, denoted C5.
This completes the Backbone feature extraction design. Its final feature map outputs are C2 at 56×56×256, C3 at 28×28×512, C4 at 14×14×1024 and C5 at 7×7×2048, which serve as inputs to the subsequent feature pyramid network Neck.
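As a concrete illustration, the following PyTorch sketch builds the first residual unit as described: three bottleneck residual blocks whose final convolution carries the stride-2 downsampling and the 64→256 channel expansion. The shortcut projection on the downsampling layer is an assumption, since the text does not state how the skip connection handles the shape change, and plain ReLU stands in for the patent's improved activation.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Bottleneck residual block: 1x1 -> 3x3 -> 1x1 convolutions plus a skip path."""
    def __init__(self, channels, out_channels=None, stride=1):
        super().__init__()
        out_channels = out_channels or channels
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),  # stand-in for the patent's improved activation
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            # the last layer of the block carries the stride and channel expansion,
            # matching the text's "last convolution layer" rule
            nn.Conv2d(channels, out_channels, 1, stride=stride, bias=False),
            nn.BatchNorm2d(out_channels),
        )
        # assumed 1x1 projection so the shortcut matches shape when stride/channels change
        self.skip = (nn.Identity() if stride == 1 and out_channels == channels
                     else nn.Conv2d(channels, out_channels, 1, stride=stride, bias=False))

    def forward(self, x):
        return torch.relu(self.body(x) + self.skip(x))

# First residual unit (layers 4-12): three blocks of three convolution layers;
# the final block downsamples and expands 64 -> 256 channels.
first_unit = nn.Sequential(
    ResidualBlock(64),
    ResidualBlock(64),
    ResidualBlock(64, out_channels=256, stride=2),
)
x = torch.randn(1, 64, 112, 112)   # C1-sized input from the stem
print(first_unit(x).shape)         # torch.Size([1, 256, 56, 56])
```

The other three residual units follow the same pattern with their own block counts, depths and channel widths.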
2. Feature pyramid network Neck
The Neck network adopts a feature pyramid structure whose main job is to fuse and output the multi-scale features produced by the Backbone, improving feature recognition across scales. To suit the small, densely packed targets of gasket detection, this application uses an FPN with a 5-layer structure: the first four layers, P2, P3, P4 and P5, correspond to the Backbone outputs C2, C3, C4 and C5 respectively, and the fifth layer, P6, applies a further 1×1 convolution with stride 2 to C5, shrinking the feature map again to capture smaller targets. This yields 5 output feature maps with a unified channel count of 256 and sizes of 56×56, 28×28, 14×14, 7×7 and 3×3 respectively.
The P6 layer takes the 7×7×2048 feature map that is also fed to P5 and passes it through 1×1 convolution kernels with stride 2 and 256 kernels, thereby scaling the features and reducing the channel count; P6 outputs 3×3×256.
The P5 layer takes the 7×7×2048 feature map C5 and convolves it with 1×1 kernels, stride 1 and 256 kernels, performing only the channel-reduction operation on the feature map; P5 outputs 7×7×256.
The P4 layer takes the 14×14×1024 feature map C4 and the 7×7×256 feature map output by P5. It convolves the C4 output with 1×1 kernels, stride 1 and 256 kernels, giving a channel-reduced feature map of 14×14×256, and convolves the P5 output with 1×1 kernels, stride 1/2 and 256 kernels, completing the upsampling and giving a 14×14×256 feature map; finally P4 fuses the two to obtain an output feature map of 14×14×256.
P3 and P2 are processed in the same way as P4:
that is, P3 fuses the channel-reduced C3 with the upsampled P4 output, obtaining a final output feature map of 28×28×256;
and P2 fuses the channel-reduced C2 with the upsampled P3 output, obtaining a final output feature map of 56×56×256.
This completes the Neck network design. In the final output, P2 is 56×56×256, P3 is 28×28×256, P4 is 14×14×256, P5 is 7×7×256 and P6 is 3×3×256.
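For illustration, a minimal PyTorch sketch of the Neck follows. It assumes 1×1 lateral convolutions for the channel reduction, nearest-neighbour 2× upsampling for the stride-1/2 operation, and element-wise addition as the fusion operator, none of which the text pins down. Note that under standard convolution arithmetic, a 1×1, stride-2 convolution maps a 7×7 input to 4×4 rather than the 3×3 the text reports.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Neck(nn.Module):
    def __init__(self, in_channels=(256, 512, 1024, 2048), out_channels=256):
        super().__init__()
        # 1x1 lateral convolutions reducing C2..C5 to a unified 256 channels
        self.lateral = nn.ModuleList(
            nn.Conv2d(c, out_channels, 1) for c in in_channels)
        # P6: 1x1 convolution with stride 2 applied to C5
        self.p6_conv = nn.Conv2d(in_channels[-1], out_channels, 1, stride=2)

    def forward(self, c2, c3, c4, c5):
        p5 = self.lateral[3](c5)                                      # 7x7x256
        p4 = self.lateral[2](c4) + F.interpolate(p5, scale_factor=2)  # 14x14x256
        p3 = self.lateral[1](c3) + F.interpolate(p4, scale_factor=2)  # 28x28x256
        p2 = self.lateral[0](c2) + F.interpolate(p3, scale_factor=2)  # 56x56x256
        p6 = self.p6_conv(c5)   # 4x4x256 under standard arithmetic (text reports 3x3)
        return p2, p3, p4, p5, p6

neck = Neck()
outs = neck(torch.randn(1, 256, 56, 56), torch.randn(1, 512, 28, 28),
            torch.randn(1, 1024, 14, 14), torch.randn(1, 2048, 7, 7))
print([tuple(o.shape[2:]) for o in outs])  # [(56, 56), (28, 28), (14, 14), (7, 7), (4, 4)]
```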
3. Detection network Head
The detection network Head in this scheme comprises an average pooling layer, a fully connected layer, and a box regression and classification layer, wherein:
Five average pooling layers are provided, one for each of the five feature maps output by Neck layers P2 to P6. Each average pooling layer reduces its corresponding feature map to a size of 1×1; the results of the five average pooling layers are then fed into the fully connected layer and finally into the box regression and classification layer, which generates and regresses prediction boxes to produce the final classification result.
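A minimal sketch of this Head follows; the hidden width and number of gasket classes are illustrative assumptions, as the text fixes neither.

```python
import torch
import torch.nn as nn

class Head(nn.Module):
    def __init__(self, in_channels=256, num_levels=5, hidden=1024, num_classes=4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)          # average-pool each level to 1x1
        self.fc = nn.Linear(in_channels * num_levels, hidden)
        self.box = nn.Linear(hidden, 4)              # box regression: x, y, w, h
        self.cls = nn.Linear(hidden, num_classes)    # gasket class scores

    def forward(self, features):
        # features: the P2..P6 maps output by the Neck
        pooled = [self.pool(f).flatten(1) for f in features]
        h = torch.relu(self.fc(torch.cat(pooled, dim=1)))
        return self.box(h), self.cls(h)

head = Head()
levels = [torch.randn(1, 256, s, s) for s in (56, 28, 14, 7, 3)]
boxes, scores = head(levels)
print(boxes.shape, scores.shape)  # torch.Size([1, 4]) torch.Size([1, 4])
```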
This scheme adaptively modifies the traditional activation function, mainly because conventional activations converge too slowly during training, while gasket detection involves a large volume of training data, many data types, and continual version iteration. The modified activation function is as follows:
where x denotes the input data, e is the base of the natural logarithm, and β is a manually adjustable parameter with range [1, +∞). β can also be learned adaptively; however, since the detected target is small and of a single kind, background interference is weak and the chance of falling into a local minimum during training is nearly zero. Therefore, to reduce the number of training parameters and speed up training, β is not obtained by adaptive learning here but is tuned manually through extensive experiments and tests, which showed that the activation function then achieves its best learning effect. In this scheme the activation function is applied after the convolution layers, the max-pooling layer, the average pooling layers and the fully connected layer.
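The formula itself is an image not reproduced in this text. Its stated ingredients (input x, natural base e, a tunable β ≥ 1) match the Swish family of activations, so the sketch below assumes f(x) = x·sigmoid(βx) purely for illustration, with β either fixed by hand or registered as a learnable parameter:

```python
import torch
import torch.nn as nn

class BetaActivation(nn.Module):
    """Assumed Swish-style form f(x) = x * sigmoid(beta * x) with tunable beta >= 1.

    The patent's exact formula is an image not reproduced here; this form is an
    assumption consistent with the stated ingredients (x, e, adjustable beta).
    """
    def __init__(self, beta=1.0, learnable=False):
        super().__init__()
        beta = torch.tensor(float(beta))
        # beta may be hand-tuned (fixed) or learned adaptively, as the text describes
        self.beta = nn.Parameter(beta) if learnable else beta

    def forward(self, x):
        return x * torch.sigmoid(self.beta * x)

act = BetaActivation(beta=1.5)             # hand-tuned beta, the route the patent prefers
print(act(torch.linspace(-2.0, 2.0, 5)))   # smooth, slightly non-monotone below zero
```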
For the above target detection model, the training process of the invention is as follows:
Step 1, collecting a large number of gasket pictures, ensuring a balanced number of each gasket type across all pictures;
Step 2, preprocessing the collected gasket pictures, including arrangement, cleaning and labeling;
Arrangement adjusts the size and orientation of the gasket pictures and unifies them into one format so that they can be fed into the target detection model for training and testing (a sketch of this step follows below). Cleaning defines the labels of each gasket type according to the producer's definitions. Labeling marks the corresponding type label on each gasket with a labeling tool and generates an annotation file as training positive samples, while non-gasket regions are classed as training negative samples.
Step 3, building a training set and a test set from the preprocessed gasket pictures, the training set not overlapping the test set.
Step 4, training the established target detection model with the training set, performing a generalization test on the trained model with the test set, checking performance indices such as accuracy against the test results, obtaining the trained target detection model once the indices meet the design requirements, and deploying the trained model on the gasket production inspection line.
When checking the performance indices, the gasket classification accuracy is tallied, and the false detection rate and the accuracy are computed separately. If performance falls short, steps 1 to 4 are repeated, continually increasing the number and variety of gaskets to enrich the detection algorithm's exposure to gaskets, while adjusting the ratio of positive to negative samples to strengthen the algorithm's rejection of negative samples.
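A minimal sketch of this performance check follows; the counting conventions (a label of -1 for "no gasket", false detections counted over negative samples) are assumptions, since the text does not define the formulas.

```python
def evaluate(predictions, labels):
    """Classification accuracy and false-detection rate over the test set.

    predictions/labels are per-image class ids; -1 denotes 'no gasket'
    (an assumed convention -- the patent does not define the formulas).
    """
    correct = sum(p == t for p, t in zip(predictions, labels))
    false_det = sum(p != -1 and t == -1 for p, t in zip(predictions, labels))
    negatives = sum(t == -1 for t in labels) or 1   # avoid division by zero
    return correct / len(labels), false_det / negatives

acc, fdr = evaluate([0, 1, 2, -1, 1], [0, 1, 2, -1, 2])
print(f"accuracy={acc:.2f}, false detection rate={fdr:.2f}")
```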
Another aspect of the present invention provides a target detection apparatus based on an improved activation function, comprising:
an image acquisition module for collecting gasket images on the detection line and preprocessing them;
a recognition detection module for feeding the preprocessed gasket image into the trained target detection model for detection to obtain the gasket's detection and positioning result; the target detection model comprises a deep feature extraction network Backbone, a feature pyramid network Neck and a detection network Head.
It should be noted that for the specific structures of the deep feature extraction network Backbone, the feature pyramid network Neck and the detection network Head, reference can be made to the corresponding steps and explanations in the foregoing method embodiment, which are not repeated here.
The embodiment of the application further provides a terminal device, which can be a computer or a server, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor; when executing the computer program, the processor implements the steps of the above target detection method based on an improved activation function.
The computer program may also be split into one or more modules/units, which are stored in the memory and executed by the processor to complete the present application. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, used to describe the execution of the computer program in the terminal device; for example, the computer program may be divided into an image acquisition module and a recognition detection module, whose functions are described in the foregoing apparatus and are not repeated here.
An implementation of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the above target detection method based on an improved activation function.
The integrated modules/units, if implemented as software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. On this understanding, the present application may implement all or part of the flow of the above method embodiments by instructing the relevant hardware through a computer program; the computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of each method embodiment described above. The computer program comprises computer program code, which may be in source code, object code or executable-file form, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content of the computer-readable medium may be increased or decreased as required by legislation and patent practice in the relevant jurisdiction; for example, in some jurisdictions the computer-readable medium excludes electrical carrier signals and telecommunications signals.
The above embodiments are only intended to illustrate the technical solution of the present application, not to limit it. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application and are intended to fall within the scope of the present application.

Claims (9)

1. A method of target detection based on an improved activation function, comprising the steps of:
collecting gasket images on the detection line and preprocessing them, including arrangement, cleaning and labeling; feeding the preprocessed gasket image into a trained target detection model for detection to obtain the gasket's detection and positioning result;
the target detection model comprises a Backbone employing a deep feature extraction network, a Neck employing a feature pyramid structure, and a detection network Head, wherein:
the Backbone has a 53-layer network structure, in which the first and second layers are both convolution layers and the third layer is a max-pooling layer; the fourth to twelfth, thirteenth to twenty-fourth, twenty-fifth to thirty-third, and thirty-fourth to fifty-third layers respectively constitute the first to fourth residual units, wherein: the first and third residual units each comprise three residual blocks, each composed of three adjacent convolution layers; the second and fourth residual units comprise three and five residual blocks respectively, each composed of four adjacent convolution layers; within each residual unit, all convolution layers except the last share the same stride and kernel count, and the stride and kernel count of the last layer are multiples of those of the preceding convolution layers; the feature maps output by the twelfth, twenty-fourth, thirty-third and fifty-third layers are denoted C2, C3, C4 and C5 respectively;
the Neck has a five-layer structure: P2, P3, P4, P5 and P6, wherein:
the P6 layer receives the feature map output by the P5 layer and applies a convolution operation that scales the features and reduces the channel count; the P5 layer applies a convolution operation to feature map C5 to reduce its channel count; the P4 layer applies a convolution operation to feature map C4 to reduce its channel count, applies a convolution operation to the feature map output by P5 to upsample it, and then fuses the channel-reduced feature map with the upsampled feature map; P3 fuses the channel-reduced C3 with the upsampled P4 output; P2 fuses the channel-reduced C2 with the upsampled P3 output;
the Head comprises an average pooling layer, a fully connected layer, and a box regression and classification layer, wherein:
five average pooling layers are provided, corresponding to the five feature maps output by the P2 to P6 layers of the Neck; after each average pooling layer processes its corresponding feature map, the results of the five average pooling layers are fed into the fully connected layer and finally into the box regression and classification layer, which generates and regresses prediction boxes to obtain the classification result;
the activation function used in the Backbone, Neck and Head is expressed as follows:
where x denotes the input data, e is the base of the natural logarithm, and β is an adjustable parameter with value range [1, +∞).
2. The target detection method based on an improved activation function according to claim 1, wherein the adjustable parameter β in the activation function is obtained either by manual tuning or by adaptive learning.
3. The target detection method based on an improved activation function according to claim 1, wherein the convolution kernel size of each of the first and second layers of the Backbone is 3×3×64, with stride 1 for the first layer and stride 2 for the second; the pooling kernel of the third layer is 3×3×64 with stride 1; the fourth to eleventh convolution layers have stride 1 and 64 kernels; the twelfth convolution layer has stride 2 and its kernel count rises to 256; the thirteenth to twenty-third layers have stride 1 and 256 kernels; the twenty-fourth layer has stride 2 and its kernel count rises to 512; the twenty-fifth to thirty-second layers have stride 1 and 512 kernels; the thirty-third layer has stride 2 and its kernel count rises to 1024; the thirty-fourth to fifty-second layers have stride 1 and 1024 kernels; and the fifty-third layer has stride 2 and its kernel count rises to 2048.
4. The target detection method based on an improved activation function according to claim 1, wherein the P6 layer of the Neck uses 1×1 convolution kernels with stride 2 and 256 kernels; the P5 layer uses 1×1 kernels with stride 1 and 256 kernels; the P4 layer contains two convolution layers, the first using 1×1 kernels with stride 1 and 256 kernels, and the second using 1×1 kernels with stride 1/2 and 256 kernels.
5. The target detection method based on an improved activation function according to claim 1, wherein the training process of the target detection model is as follows:
Step 1, collecting a large number of gasket pictures, ensuring a balanced number of each gasket type across all pictures;
Step 2, preprocessing the collected gasket pictures, including arrangement, cleaning and labeling;
Step 3, building a training set and a test set from the preprocessed gasket pictures, the training set not overlapping the test set;
and Step 4, training the target detection model with the training set, performing a generalization test on the trained model with the test set, checking the model's performance indices against the test results, and obtaining the trained target detection model once the performance indices meet the design requirements.
6. The target detection method based on an improved activation function according to claim 5, wherein the preprocessing of the collected gasket pictures is specifically:
arrangement adjusts the size and orientation of the gasket pictures so as to unify them into one format; cleaning defines the labels of each gasket type according to the producer's definitions; labeling marks the corresponding type label on each gasket with a labeling tool and generates an annotation file as training positive samples, while non-gasket regions are classed as training negative samples.
7. A target detection device based on an improved activation function, comprising:
an image acquisition module for collecting gasket images on the detection line and preprocessing them, including arrangement, cleaning and labeling;
a recognition detection module for feeding the preprocessed gasket image into a trained target detection model for detection to obtain the gasket's detection and positioning result;
the target detection model comprises a Backbone employing a deep feature extraction network, a Neck employing a feature pyramid structure, and a detection network Head, wherein:
the Backbone has a 53-layer network structure, in which the first and second layers are both convolution layers and the third layer is a max-pooling layer; the fourth to twelfth, thirteenth to twenty-fourth, twenty-fifth to thirty-third, and thirty-fourth to fifty-third layers respectively constitute the first to fourth residual units, wherein: the first and third residual units each comprise three residual blocks, each composed of three adjacent convolution layers; the second and fourth residual units comprise three and five residual blocks respectively, each composed of four adjacent convolution layers; within each residual unit, all convolution layers except the last share the same stride and kernel count, and the stride and kernel count of the last layer are multiples of those of the preceding convolution layers; the feature maps output by the twelfth, twenty-fourth, thirty-third and fifty-third layers are denoted C2, C3, C4 and C5 respectively;
the Neck has a five-layer structure: P2, P3, P4, P5 and P6, wherein:
the P6 layer receives the feature map output by the P5 layer and applies a convolution operation that scales the features and reduces the channel count; the P5 layer applies a convolution operation to feature map C5 to reduce its channel count; the P4 layer applies a convolution operation to feature map C4 to reduce its channel count, applies a convolution operation to the feature map output by P5 to upsample it, and then fuses the channel-reduced feature map with the upsampled feature map; P3 fuses the channel-reduced C3 with the upsampled P4 output; P2 fuses the channel-reduced C2 with the upsampled P3 output;
the Head comprises an average pooling layer, a fully connected layer, and a box regression and classification layer, wherein:
five average pooling layers are provided, corresponding to the five feature maps output by the P2 to P6 layers of the Neck; after each average pooling layer processes its corresponding feature map, the results of the five average pooling layers are fed into the fully connected layer and finally into the box regression and classification layer, which generates and regresses prediction boxes to obtain the classification result;
the activation function used in the Backbone, Neck and Head is expressed as follows:
where x denotes the input data, e is the base of the natural logarithm, and β is an adjustable parameter with value range [1, +∞).
8. A terminal device, installed on a gasket detection line, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the target detection method based on an improved activation function according to any one of claims 1-6.
9. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the target detection method based on an improved activation function according to any one of claims 1-6.
CN202110732476.4A 2021-06-30 2021-06-30 Target detection method and device based on improved activation function Active CN113487550B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110732476.4A CN113487550B (en) 2021-06-30 2021-06-30 Target detection method and device based on improved activation function

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110732476.4A CN113487550B (en) 2021-06-30 2021-06-30 Target detection method and device based on improved activation function

Publications (2)

Publication Number Publication Date
CN113487550A CN113487550A (en) 2021-10-08
CN113487550B true CN113487550B (en) 2024-01-16

Family

ID=77936813

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110732476.4A Active CN113487550B (en) 2021-06-30 2021-06-30 Target detection method and device based on improved activation function

Country Status (1)

Country Link
CN (1) CN113487550B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110532859A (en) * 2019-07-18 2019-12-03 西安电子科技大学 Remote Sensing Target detection method based on depth evolution beta pruning convolution net
CN111626120A (en) * 2020-04-24 2020-09-04 南京理工大学 Target detection method based on improved YOLO-6D algorithm in industrial environment
KR20200129314A (en) * 2019-05-08 2020-11-18 전북대학교산학협력단 Object detection in very high-resolution aerial images feature pyramid network
CN112330682A (en) * 2020-11-09 2021-02-05 重庆邮电大学 Industrial CT image segmentation method based on deep convolutional neural network
CN112434672A (en) * 2020-12-18 2021-03-02 天津大学 Offshore human body target detection method based on improved YOLOv3

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10726244B2 (en) * 2016-12-07 2020-07-28 Samsung Electronics Co., Ltd. Method and apparatus detecting a target

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20200129314A (en) * 2019-05-08 2020-11-18 전북대학교산학협력단 Object detection in very high-resolution aerial images feature pyramid network
CN110532859A (en) * 2019-07-18 2019-12-03 西安电子科技大学 Remote Sensing Target detection method based on depth evolution beta pruning convolution net
CN111626120A (en) * 2020-04-24 2020-09-04 南京理工大学 Target detection method based on improved YOLO-6D algorithm in industrial environment
CN112330682A (en) * 2020-11-09 2021-02-05 重庆邮电大学 Industrial CT image segmentation method based on deep convolutional neural network
CN112434672A (en) * 2020-12-18 2021-03-02 天津大学 Offshore human body target detection method based on improved YOLOv3

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Wen Boyuan; Wu Muqing. Study on Pedestrian Detection Based on an Improved YOLOv4 Algorithm. 2020 IEEE 6th International Conference on Computer and Communications (ICCC). 2021, full text. *
Research on fault diagnosis methods for rolling bearings based on deep learning; Guo Xiaolin; China Master's Theses Full-text Database, Engineering Science and Technology II; full text *
Research progress of deep convolutional neural networks in object detection; Yao Qunli; Hu Xian; Lei Hong; Computer Engineering and Applications (17); full text *

Also Published As

Publication number Publication date
CN113487550A (en) 2021-10-08

Similar Documents

Publication Publication Date Title
CN110543878B (en) Pointer instrument reading identification method based on neural network
CN111402248A (en) Transmission line lead defect detection method based on machine vision
CN111950453A (en) Optional-shape text recognition method based on selective attention mechanism
CN111179249A (en) Power equipment detection method and device based on deep convolutional neural network
CN110827297A (en) Insulator segmentation method for generating countermeasure network based on improved conditions
CN111368825B (en) Pointer positioning method based on semantic segmentation
CN110726898B (en) Power distribution network fault type identification method
CN111401358B (en) Instrument dial correction method based on neural network
CN110263790A (en) A kind of power plant's ammeter character locating and recognition methods based on convolutional neural networks
CN115908142B (en) Visual identification-based damage inspection method for tiny contact net parts
CN116071315A (en) Product visual defect detection method and system based on machine vision
CN116703885A (en) Swin transducer-based surface defect detection method and system
CN111767826A (en) Timing fixed-point scene abnormity detection method
CN114359167A (en) Insulator defect detection method based on lightweight YOLOv4 in complex scene
CN113487550B (en) Target detection method and device based on improved activation function
CN116523885A (en) PCB defect detection method based on multi-scale fusion and deep learning
CN115937492A (en) Transformer equipment infrared image identification method based on feature identification
CN115690001A (en) Method for detecting defects in steel pipe welding digital radiographic image
CN115100546A (en) Mobile-based small target defect identification method and system for power equipment
CN114841980A (en) Insulator defect detection method and system based on line patrol aerial image
CN114494236A (en) Fabric defect detection method and system based on over-complete convolutional neural network
CN114140662A (en) Insulator lightning stroke image sample amplification method based on cyclic generation countermeasure network
CN113408805A (en) Lightning ground flashover identification method, device, equipment and readable storage medium
KR20220027674A (en) Apparatus and Method for Classifying States of Semiconductor Device based on Deep Learning
CN113034432A (en) Product defect detection method, system, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant