CN113420819B - Lightweight underwater target detection method based on CenterNet - Google Patents

Lightweight underwater target detection method based on CenterNet

Info

Publication number
CN113420819B
CN113420819B (application CN202110723096.4A)
Authority
CN
China
Prior art keywords
target
image
underwater
detection
algorithm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110723096.4A
Other languages
Chinese (zh)
Other versions
CN113420819A (en)
Inventor
沈钧戈
毛昭勇
丁文俊
姜旭阳
Current Assignee
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date
Filing date
Publication date
Application filed by Northwestern Polytechnical University filed Critical Northwestern Polytechnical University
Priority to CN202110723096.4A
Publication of CN113420819A
Application granted
Publication of CN113420819B

Classifications

    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/24 Classification techniques
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods

Abstract

The invention provides a lightweight underwater target detection method based on CenterNet. The method comprises: shooting target images underwater and making them into a data set; dividing the data set into a training set and a test set and labeling the training set; selecting ResNet18 as the feature extraction network and building a feature pyramid for multi-scale feature fusion, with the fused feature map of the largest image size output to the detection head; performing deep learning training on the images and label information in the training set with the CenterNet algorithm to obtain a trained model; and performing target detection to obtain the classification and position information of the targets to be detected in an image. The method is lighter, suitable for embedded devices, and more accurate, further improving the detection accuracy for multi-scale targets in underwater optical images; it reduces part of the required computation and increases inference speed, making the algorithm lighter and more real-time.

Description

Lightweight underwater target detection method based on CenterNet
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to an underwater target detection method.
Background
With the development of human civilization, mankind's use of marine resources has become ever deeper and more frequent, and the types and number of marine facilities ever more abundant. Marine facilities are difficult to build and often play an important role at the military-strategic level; once damaged, the loss is enormous and repair is difficult. These characteristics make them extremely vulnerable to sabotage by hostile states and terrorists, so protecting the safety of marine facilities is extremely important; at the same time, the particular geographical locations of marine facilities make protecting them especially difficult.
Underwater target detection is the "eye" with which humans observe the ocean, and is of great significance for marine resource development and marine facility protection. Traditional underwater target detection mainly extracts features of the target's radiated noise by hand, then builds a classifier that categorizes and identifies targets from the extracted features. In recent years, with the great progress of artificial intelligence in image recognition, the application of deep learning to underwater target detection has also received more and more research. Deep-learning-based detection algorithms fall into two categories: two-stage detectors, such as R-CNN, and one-stage detectors, such as SSD and YOLO. One-stage detectors output the category and position of a target directly from the backbone network, without an RPN, and are therefore faster. So far, one-stage and two-stage detectors have seen some practice in underwater target detection, but anchor-free algorithms have not been used. CenterNet is a one-stage, anchor-free algorithm: it avoids the complicated work of designing anchor boxes, needs no non-maximum suppression (NMS), and has a simpler structure than many anchor-free algorithms, so it is faster and more real-time than other algorithms; it also demands less of the GPU, making it better suited to embedded devices with limited computing power. The algorithm consists of two parts: the feature extraction network, Hourglass, which extracts features, and the detection head, which localizes and classifies targets based on center points. However, underwater targets tend to be small and densely distributed.
The Hourglass feature extraction network used by CenterNet has an overly large receptive field due to its special nested structure, and its layers are deep, so a large amount of small-target information is lost and its detection of small and dense targets is poor; moreover, its structure is complex, its computation heavy and its inference slow, making it unsuitable for a lightweight algorithm. Current underwater optical-image target detection has the following shortcomings:
1. underwater targets are often small and dense, and existing underwater optical image target detection algorithms cannot well detect such targets.
2. Existing underwater target detection algorithms are not both lightweight and highly accurate. The present invention solves the two problems described above.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a lightweight underwater target detection method based on CenterNet.
The technical scheme adopted by the invention for solving the technical problem comprises the following steps:
step 1, shooting a target image underwater to make a data set;
placing the target to be detected under water, mounting a camera on a remotely operated vehicle (ROV), shooting multi-scale, multi-azimuth images of the target to be detected to obtain target images, and making the target images into a data set;
step 2, dividing the data set into a training set and a test set, and labeling the training set;
step 3, selecting ResNet18, which detects small and dense targets better and is more suitable for a lightweight algorithm, as the feature extraction network; building a feature pyramid, performing multi-scale feature fusion on the last layers with 128, 256 and 512 convolution channels in ResNet18, and outputting the fused feature map of the largest image size to the detection head;
step 4, performing deep learning training on the images and the labeled information in the training set by using a CenterNet algorithm to obtain a trained CenterNet algorithm model;
and 5, performing target detection on the images in the test set or the actually shot images by using a CenterNet algorithm, and acquiring the classification information and the position information of the target to be detected in the images.
The step 2 comprises the following steps:
dividing the images into a training set and a test set at a ratio between 7:3 and 9:1, with more than 500 images in the training set so as to ensure the generalization performance of the detection algorithm; annotating the collected training-set images with the labelImg software, the annotation information being the position and category of the targets to be detected in each image, to obtain an underwater optical image data set;
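As a minimal illustration of the partition described above (the 8:2 ratio, which lies inside the 7:3 to 9:1 range, and the file names are hypothetical), the training/test split can be sketched as:

```python
import random

def split_dataset(image_paths, train_ratio=0.8, seed=0):
    """Shuffle image paths and split them into a training and a test set."""
    paths = list(image_paths)
    random.Random(seed).shuffle(paths)  # deterministic shuffle for reproducibility
    n_train = int(len(paths) * train_ratio)
    return paths[:n_train], paths[n_train:]

# Hypothetical file names standing in for the captured underwater images.
images = [f"underwater_{i:04d}.jpg" for i in range(1000)]
train_set, test_set = split_dataset(images, train_ratio=0.8)
```

With 1000 captured images this yields 800 training and 200 test images, comfortably above the 500-image floor the method requires for generalization.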
the step 3 comprises the following steps:
A network with too many layers is unsuitable for underwater targets and for a lightweight algorithm, since too many layers cause heavy loss of small-target information and increase the required computation. Therefore, given the small size and dense distribution of underwater targets and the limited computing power of embedded devices, ResNet18 is selected as the feature extraction network. Meanwhile, to improve the detection of small targets, multi-scale feature fusion is applied to ResNet18: a feature pyramid is built, multi-scale feature fusion is performed on the last layers with 128, 256 and 512 convolution channels in ResNet18, and the fused feature map of image size 64 × 64 is output to the detection head, which markedly improves detection accuracy. Compared with a standard feature pyramid, the invention deletes the two output channels of the feature pyramid whose output pictures are smaller, reducing the computation of the detection algorithm while preserving CenterNet's advantage of needing no non-maximum suppression, so the algorithm is lighter and more real-time.
Meanwhile, the original CenterNet algorithm enlarges its 16 × 16 output picture to 128 × 128 through three deconvolution layers and outputs it to the detection head to obtain the position, width, height and offset of targets in the image. Because the invention performs multi-scale feature fusion with the feature pyramid and keeps only the channel with the largest output image size (64 × 64), three deconvolution layers are unnecessary: only one deconvolution layer is kept and the other two are deleted, which still raises the final output image size to 128 × 128, reducing the required computation and making the algorithm lighter.
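The size bookkeeping above can be checked with the standard transposed-convolution output formula; the kernel size 4, stride 2 and padding 1 used here are the usual choices in CenterNet-style upsampling heads and are assumptions, not values stated in the patent:

```python
def deconv_out(size, kernel=4, stride=2, pad=1):
    """Output spatial size of a transposed convolution (deconvolution) layer."""
    return (size - 1) * stride - 2 * pad + kernel

# Original CenterNet: three deconvolution layers take a 16x16 map up to 128x128.
s = 16
for _ in range(3):
    s = deconv_out(s)   # 16 -> 32 -> 64 -> 128

# Modified network: one deconvolution takes the fused 64x64 map to 128x128.
t = deconv_out(64)
```

Starting from the larger fused 64 × 64 map, a single layer reaches the same 128 × 128 output, which is why the other two layers can be deleted.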
The step 4 comprises the following steps:
storing the annotation files and the images of the data set in two folders, moving them together into the algorithm's data folder, and running the algorithm's main file from the terminal to train the network. During training, the feature extraction network (including the multi-scale fusion) is first called to extract features from the underwater optical images in the training set; the detection-head files are then called, and the feature map output by the feature extraction network is fed into the loss function in the detection head to compute a value, completing one forward pass. The convolutional neural network then adjusts the model parameters according to the change of the loss value until the loss reaches its minimum; this entire automatic process is how the deep-learning algorithm trains the model, driving the training loss toward its minimum. The algorithm produces a continuously updated model during training; over repeated forward and backward passes the loss gradually converges to its minimum, i.e. the loss-versus-time curve flattens and no longer declines, and the model at that point is the optimal underwater optical-image target detection model, completing the training part of the CenterNet algorithm.
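The forward/backward loop described above (compute the loss, adjust parameters, repeat until the loss stops declining) can be sketched on a toy one-parameter quadratic loss; the loss function, starting point and learning rate here are stand-ins, not the CenterNet loss:

```python
def train(w0=5.0, lr=0.1, steps=200):
    """Minimise the toy loss L(w) = (w - 2)^2 by gradient descent,
    mimicking repeated forward and backward passes until the loss converges."""
    w = w0
    history = []
    for _ in range(steps):
        loss = (w - 2.0) ** 2      # forward pass: evaluate the loss value
        grad = 2.0 * (w - 2.0)     # backward pass: gradient of the loss
        w -= lr * grad             # parameter update
        history.append(loss)
    return w, history

w_final, losses = train()
```

The history shows exactly the convergence criterion the patent describes: the loss drops steeply at first and then the curve flattens, at which point the parameters are taken as the trained model.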
The step 5 comprises the following steps:
reading the images in the test set or the actually shot images into a trained underwater optical image target detection training model, and then detecting the images through an underwater target detection algorithm;
In the detection stage, the input image is first scaled to 512 × 512, and features are extracted from the scaled image by the feature extraction network, whose parameters are now those of the trained model, so the best feature information in the underwater optical image is extracted; the model's loss has converged at this point, which means its parameters are the most suitable ones for extracting features (were they not optimal, the loss would not have converged and would still be rising or falling). The feature information is input to the detection head, whose specific detection procedure is as follows:
suppose the input image is I e R W×H×3 Where W and H are the width and height of the image, respectively, at the time of detection, a hotspot map (keypoint heat map) of the key point is generated by the gaussian kernel:
Figure BDA0003134518570000041
Figure BDA0003134518570000042
representing the value of each point in the hot spot diagram, wherein R is the step length of outputting the corresponding original drawing, and is set to be 4, C represents the category of target detection objects, and if 4 underwater targets to be detected exist, C =4; in this way it is possible to obtain,
Figure BDA0003134518570000043
is a predicted value of the detected object for
Figure BDA0003134518570000044
Indicating that for class C, an object of this class is detected in the current (x.y) coordinates, and
Figure BDA0003134518570000045
then it means that there is no object with category C at this coordinate point currently;
the hot spots of each class in the output graph are extracted separately in the following way:
finally predicted from the model trained in step 4
Figure BDA0003134518570000046
The value of (a), that is, the probability value of the object existing at the center point of the current prediction target, selects the center point; detecting the value of the current hot spot by adopting maximum pooling (MaxPool) of 3x3
Figure BDA0003134518570000047
The points which are larger than or equal to the surrounding eight adjacent points (eight directions) are taken, and then the numerical value in all the points is taken
Figure BDA0003134518570000048
The largest first m points, m being less than or equal to 100, produce an effect similar to non-maximum suppression in anchor-based detection; predicting the position of a target object through the position of the central point by using m central points selected from the image to obtain m prediction frames, and judging whether the prediction frames are accurate or not so as to obtain an estimated value as a confidence coefficient;
Figure BDA0003134518570000049
in order to be able to detect the point,
Figure BDA00031345185700000410
representing the detected points in the class C, and expressing the position of each key point (namely the central point of the target to be detected) as an integer coordinate
Figure BDA00031345185700000411
Then use
Figure BDA00031345185700000412
Representing the probability that the current point is the center point, then a prediction box is generated using the coordinates:
Figure BDA00031345185700000413
wherein
Figure BDA00031345185700000414
Is the offset of the current point to the original image,
Figure BDA00031345185700000415
representing the length and width of the predicted corresponding target of the current point;
deleting the prediction target with the confidence coefficient smaller than the threshold 0.3, reserving the position of the prediction frame with the confidence coefficient larger than or equal to the threshold 0.3 as final position information, and using the classification of the heat point map as final classification information.
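A minimal numpy sketch of the heatmap described above: one Gaussian peak per ground-truth center, per class channel. The fixed sigma and the target coordinates are illustrative assumptions (CenterNet derives the Gaussian radius from the box size):

```python
import numpy as np

def draw_gaussian(heatmap, cx, cy, sigma=2.0):
    """Splat a Gaussian peak (value 1 at the center) onto one heatmap channel."""
    h, w = heatmap.shape
    ys, xs = np.ogrid[:h, :w]
    g = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
    np.maximum(heatmap, g, out=heatmap)  # keep the larger value where peaks overlap

# Output stride R = 4 on a 512x512 input gives a 128x128 map; C = 4 classes.
C, size = 4, 128
Y = np.zeros((size, size, C), dtype=np.float32)
draw_gaussian(Y[:, :, 0], cx=30, cy=40)   # hypothetical target of class 0
draw_gaussian(Y[:, :, 2], cx=90, cy=100)  # hypothetical target of class 2
```

Each channel stays in [0, 1], peaks at exactly 1 on a center point, and decays toward 0 away from it, matching the Ŷ_{x,y,c} semantics above.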
The advantage of the invention is that the lightweight underwater target detection method based on CenterNet is an underwater optical-image target detection algorithm that can be carried on embedded devices with limited computing resources and can detect, in real time and with high accuracy, the category and position of targets in underwater optical images:
1. The detection head selected by the invention is that of CenterNet, a one-stage detector and an anchor-free target detection algorithm requiring no non-maximum suppression (NMS). It is faster than two-stage detectors and than anchor-based algorithms that need an NMS step, giving it real-time performance; it also requires less computation than other detectors, so it is lighter and suitable for embedded devices.
2. The invention selects ResNet18 as the feature extraction network. The network has a simple structure, few layers and little required computation, yet detects underwater optical-image targets (characteristically small and densely distributed) with high accuracy, so the algorithm is lighter and can detect underwater targets accurately and in real time even when carried on embedded devices with limited computing resources.
3. The invention uses an improved feature pyramid for multi-scale feature fusion, further improving the algorithm's detection accuracy for multi-scale targets in underwater optical images while preserving CenterNet's advantage of needing no non-maximum suppression.
4. On the basis of the original CenterNet, the invention deletes two deconvolution layers, reducing part of the required computation, increasing inference speed and making the algorithm lighter and more real-time.
Drawings
Fig. 1 is an exemplary view of a photographed underwater optical image.
Fig. 2 is a schematic structural diagram of ResNet18 according to the present invention.
Fig. 3 is a schematic diagram of the general structure of the algorithm of the present invention.
Figure 4 is a schematic diagram of the CenterNet detection head structure of the present invention.
Fig. 5 shows two example output images according to the present invention: fig. 5(a) is the output for the first example image and fig. 5(b) for the second.
Detailed Description
The invention is further illustrated with reference to the following figures and examples.
In order to solve the limitations and defects of the prior art, the invention provides a lightweight underwater target detection method based on the CenterNet.
Therefore, ResNet18 is used as the feature extraction network, and a distinctive multi-scale feature fusion scheme fuses the multi-scale information in ResNet18 and outputs the feature map of maximum resolution to the detection head. These two structures markedly improve the algorithm's detection of small and dense targets and reduce the computation and model size, allowing the algorithm to detect small and dense underwater targets accurately and quickly on embedded devices with limited computing power.
This target detection method is easier to carry in a small underwater vehicle, a passive underwater monitor or a mobile monitoring station. If the server can be carried inside an underwater monitor, even a sonar image need not be transmitted back to the monitoring station: the monitor can issue a countermeasure command itself and counter the target faster. Higher processing speed is precious in military use, where every second counts, and even in a civilian underwater robot it yields faster reactions and thus higher working efficiency.
A lightweight underwater target detection method based on CenterNet comprises the following steps:
step 1, shooting a target image underwater, and making a data set;
step 2, dividing the data set into a training set and a test set, and labeling the training set;
step 3, selecting ResNet18, which detects small and dense targets better and is more suitable for a lightweight algorithm, as the feature extraction network; building a feature pyramid, performing multi-scale feature fusion on the last layers with 128, 256 and 512 convolution channels in ResNet18, and outputting the fused feature map of image size 64 × 64 to the detection head;
step 4, performing deep learning training on the images and the labeled information in the training set by using a CenterNet algorithm;
and 5, performing target detection on the images in the test set or the actually shot images by using a CenterNet algorithm to acquire classification information and position information of the target to be detected in the images.
The step 1 comprises the following steps:
placing the target to be detected under water, mounting a camera on a remotely operated vehicle (ROV), and shooting multi-scale, multi-azimuth images of the target to be detected; a sample photograph is shown in FIG. 1;
the step 2 comprises the following steps:
dividing the images into a training set and a test set at a ratio between 7:3 and 9:1; the number of training images should not be too small (preferably more than 500) so as to ensure the generalization performance of the detection algorithm. The collected training-set images are annotated with the labelImg software, the annotation information being the position and category of the targets to be detected in each image, yielding an underwater optical image data set;
the step 3 comprises the following steps:
A network with too many layers is unsuitable for underwater targets and for a lightweight algorithm, since too many layers cause heavy loss of small-target information and increase the required computation. Therefore, given the small size and dense distribution of underwater targets and the limited computing power of embedded devices, ResNet18 is selected as the feature extraction network; the conventional ResNet18 structure is shown in FIG. 2. In this scheme the input pictures are 512 × 512, and the output feature maps of conv3_x, conv4_x and conv5_x have 128, 256 and 512 channels respectively. To improve the detection of small targets, multi-scale feature fusion is applied to ResNet18 to build a feature pyramid: the last layers with 128, 256 and 512 convolution channels in ResNet18 are fused across scales, and the fused feature map of image size 64 × 64 is output to the detection head, which noticeably improves detection accuracy; the overall structure of the algorithm is shown in FIG. 3. The fusion proceeds as follows: first, a 1 × 1 convolution is applied to each of the three input feature maps of different sizes, reducing the number of channels to 128, which allows the different feature maps to be fused and reduces computation; then each of the three channel-reduced feature maps is brought to the common spatial size of 64 × 64; finally, the three 64 × 64 × 128 feature maps are added element-wise, and the result is output to the detection head.
Unlike a standard feature pyramid, the method deletes the two output channels of the feature pyramid whose output pictures are smaller, reducing the computation of the detection algorithm while preserving CenterNet's advantage of needing no non-maximum suppression, so the algorithm is lighter and more real-time. Meanwhile, the original CenterNet algorithm enlarges its 16 × 16 output picture to 128 × 128 through three deconvolution layers and outputs it to the detection head to obtain the position, width, height and offset of targets in the image. Because the method performs multi-scale feature fusion with the feature pyramid and keeps only the channel with the largest output image size (64 × 64), three deconvolution layers are unnecessary: only one deconvolution layer is kept and the other two are deleted, which still raises the final output image size to 128 × 128, reducing the required computation and making the algorithm lighter.
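The fusion just described (1 × 1 convolution down to 128 channels, resize to a common resolution, element-wise addition) can be sketched in numpy. The weights are random stand-ins rather than trained parameters, and nearest-neighbour upsampling stands in for whatever interpolation the trained network uses:

```python
import numpy as np

def conv1x1(x, w):
    """1x1 convolution: per-pixel channel mixing. x is (H, W, C_in), w is (C_in, C_out)."""
    return x @ w

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of an (H, W, C) feature map."""
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

rng = np.random.default_rng(0)
# Hypothetical ResNet18 outputs for a 512x512 input: 64x64x128, 32x32x256, 16x16x512.
f3 = rng.standard_normal((64, 64, 128))
f4 = rng.standard_normal((32, 32, 256))
f5 = rng.standard_normal((16, 16, 512))

# Reduce every map to 128 channels with (randomly initialised) 1x1 convolutions.
p3 = conv1x1(f3, rng.standard_normal((128, 128)))
p4 = conv1x1(f4, rng.standard_normal((256, 128)))
p5 = conv1x1(f5, rng.standard_normal((512, 128)))

# Bring all maps to 64x64 and add them element-wise; this fused map feeds the head.
fused = p3 + upsample_nearest(p4, 2) + upsample_nearest(p5, 4)
```

The design choice the patent highlights is visible here: only the largest-resolution output (64 × 64) is produced, so no smaller pyramid outputs need separate heads or an NMS merge step.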
The step 4 comprises the following steps:
and respectively storing the annotation files and the images in the data set in two folders, and jointly moving the annotation files and the images into the data folders of the algorithm. Py file to run algorithm by command at terminal to train the network. In the training process, the algorithm firstly calls a feature extraction network comprising a multi-scale fusion mode to perform feature extraction on an underwater optical image in a training set, then calls a detection head related file, outputs a feature diagram output by the feature extraction network into a Loss Function (Loss Function) in a detection head to calculate a numerical value, and completes one-time forward propagation; and then, the convolutional neural network adjusts parameters in the model according to the change condition of the Loss function value, so that the Loss of the training model changes to the minimum value. The algorithm can generate a continuously updated training model in the training process, loss gradually converges to the minimum value in the repeated forward propagation and backward propagation processes, and the training model at the moment is the optimal underwater optical image target detection training model, namely the training part of the CenterNet algorithm is completed.
The step 5 comprises the following steps:
and inputting the underwater optical image into an underwater target detection algorithm, and reading the trained underwater optical image target detection training model. In the detection stage, firstly, the input image is zoomed to 512 x 512, and then the zoomed image is subjected to feature extraction through a feature extraction network, at the moment, parameters in the feature extraction network are parameters of a training model, so that feature information in the underwater optical image can be optimally extracted. The characteristic information is input into the detection head, the schematic diagram of the detection head structure is shown in fig. 4, and the specific detection mode of the detection head is as follows.
Suppose the input image is I e R W×H×3 Where W and H are the width and height of the image, respectively, upon detection, a hot spot map (keypoint heat map) of the key point is generated:
Figure BDA0003134518570000081
wherein R is a step length for outputting the corresponding original image, which is set to 4, and C represents a category of the target detection object, and if 4 types of targets to be detected underwater exist, C =4; in this way it is possible to obtain,
Figure BDA0003134518570000082
is a predicted value of the detected object for
Figure BDA0003134518570000083
Indicating that for category c, an object of this category is detected in the current (x.y) coordinates, and
Figure BDA0003134518570000084
it means that there is no object of the category c at this coordinate point at present. The hot spots of each class in the output graph are extracted individually. The extraction method comprises the following steps: detecting the value of the current hot spot by adopting maximum pooling (MaxPool) of 3x3
Figure BDA0003134518570000085
Points (or equal) larger than the surrounding eight neighboring points (eight orientations), and then taking the value of all points
Figure BDA0003134518570000086
The first 100 points of maximum, produce effects similar to non-maximum suppression in the anchor-based assay.
Suppose $\hat{P}_c = \{(\hat{x}_i, \hat{y}_i)\}_{i=1}^{n}$ is the set of detected points, where $(\hat{x}_i, \hat{y}_i)$ represents a point detected in class c. The position of each keypoint is expressed by integer coordinates $(x_i, y_i)$, and $\hat{Y}_{x_i y_i c}$ represents the confidence of the current point; these coordinates are then used to generate a calibration box:

$(\hat{x}_i + \delta\hat{x}_i - \hat{w}_i/2, \; \hat{y}_i + \delta\hat{y}_i - \hat{h}_i/2, \; \hat{x}_i + \delta\hat{x}_i + \hat{w}_i/2, \; \hat{y}_i + \delta\hat{y}_i + \hat{h}_i/2)$

where $(\delta\hat{x}_i, \delta\hat{y}_i)$ is the offset of the current point relative to the original image, and $(\hat{w}_i, \hat{h}_i)$ represents the width and height of the target corresponding to the predicted current point. Finally, from the values $\hat{Y}_{x_i y_i c}$ predicted by the model, i.e. the probability that an object exists at the current center point, the 100 points with the largest values are selected as possible center points; in the present invention a threshold of 0.3 is set, that is, the center points among the 100 selected results whose value is greater than the threshold are taken as the final result. The detection results finally output for the two images are shown in fig. 5.
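The box-generation formula above can be illustrated as follows (a minimal sketch under assumed inputs; the point list, offsets, sizes and scores are hypothetical values, not outputs of the trained model):

```python
def decode_boxes(points, offsets, sizes, scores, threshold=0.3):
    """Turn center points into (x1, y1, x2, y2, score) boxes in heat-map
    coordinates, keeping only points whose probability passes the threshold."""
    boxes = []
    for (x, y), (dx, dy), (w, h), s in zip(points, offsets, sizes, scores):
        if s < threshold:
            continue  # center point rejected, as in the 0.3-threshold step
        cx, cy = x + dx, y + dy  # integer coordinates plus predicted offset
        boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2, s))
    return boxes

boxes = decode_boxes(
    points=[(10, 12), (3, 3)],
    offsets=[(0.4, 0.6), (0.0, 0.0)],
    sizes=[(4.0, 6.0), (2.0, 2.0)],
    scores=[0.8, 0.2],  # second point falls below the 0.3 threshold
)
print(boxes)
```

Multiplying the resulting coordinates by the output stride R = 4 maps the boxes back to the scale of the input image.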

Claims (3)

1. A lightweight underwater target detection method based on CenterNet is characterized by comprising the following steps:
step 1, shooting a target image underwater, and making a data set;
placing the target to be detected underwater, mounting a camera on a remotely operated unmanned submersible, capturing multi-scale, multi-azimuth images of the target to obtain target images, and making the target images into a data set;
step 2, dividing the data set into a training set and a test set, and labeling the training set;
step 3, selecting ResNet18, which detects small and dense targets better and is more suitable for a lightweight algorithm, as the feature extraction network; building a feature pyramid that performs multi-scale feature fusion on the last layers with 128, 256 and 512 convolution channels in ResNet18 and outputs a 64×64 feature map to the CenterNet detection head; the two output channels of the feature pyramid with the smaller output-image sizes are deleted, multi-scale feature fusion is performed with the feature pyramid while only the channel with the largest output-image size is retained, so that three layers of deconvolution are no longer needed: only one deconvolution layer is retained, the other two are deleted, and the size of the final output image is increased to 128×128;
step 4, performing deep learning training on the images and the labeled information in the training set by using a CenterNet algorithm to obtain a trained CenterNet algorithm model;
step 5, performing target detection on the images in the test set or on actually captured images by using the CenterNet algorithm to obtain the classification information and position information of the target to be detected in the image: the images in the test set or the actually captured images are read into the trained underwater optical image target detection model, and detection is then performed with the underwater target detection algorithm;
in the detection stage, the input image is first scaled to 512×512, and feature extraction is then performed on the scaled image through the feature extraction network; the parameters in the feature extraction network are at this point the parameters of the trained model, so the optimal feature information in the underwater optical image is extracted and the LOSS of the model has converged to its optimum; the feature information is input to the detection head, whose specific detection mode is as follows:
the input image is $I \in R^{W \times H \times 3}$, where W and H are the width and height of the image, respectively; during detection, a keypoint heat map is generated through a Gaussian kernel:

$\hat{Y} \in [0, 1]^{(W/R) \times (H/R) \times C}$

$\hat{Y}_{x,y,c}$ represents the value of each point in the heat map, where R is the output stride relative to the original image, set to 4, and C represents the number of categories of target detection objects; $\hat{Y}$ is the prediction for the detected object: $\hat{Y}_{x,y,c} = 1$ indicates that, for class c, an object of this class is detected at the current (x, y) coordinates, while $\hat{Y}_{x,y,c} = 0$ indicates that no object of class c is currently present at this coordinate point;
the hot spots of each class in the output map are extracted separately in the following way:
from the values $\hat{Y}_{x,y,c}$ predicted by the model trained in step 4, i.e. the probability that an object exists at the center point of the current predicted target, center points are selected: a 3×3 max pooling detects the points whose current hot-spot value $\hat{Y}_{x,y,c}$ is greater than or equal to the eight surrounding neighboring points, and from all such points the m points with the largest $\hat{Y}_{x,y,c}$ values are taken, m ≤ 100; the positions of the target objects are predicted from the m center points selected in the image to obtain m prediction boxes, and an estimate of whether each prediction box is accurate is obtained as its confidence;
suppose $\hat{P}_c = \{(\hat{x}_i, \hat{y}_i)\}_{i=1}^{n}$ is the set of detected points, where $(\hat{x}_i, \hat{y}_i)$ represents a point detected in class c; the position of each keypoint is expressed by integer coordinates $(x_i, y_i)$, and $\hat{Y}_{x_i y_i c}$ represents the probability that the current point is a center point; a prediction box is then generated using these coordinates:

$(\hat{x}_i + \delta\hat{x}_i - \hat{w}_i/2, \; \hat{y}_i + \delta\hat{y}_i - \hat{h}_i/2, \; \hat{x}_i + \delta\hat{x}_i + \hat{w}_i/2, \; \hat{y}_i + \delta\hat{y}_i + \hat{h}_i/2)$

where $(\delta\hat{x}_i, \delta\hat{y}_i)$ is the offset of the current point relative to the original image, and $(\hat{w}_i, \hat{h}_i)$ represents the width and height of the target corresponding to the predicted current point; prediction targets with confidence less than the threshold 0.3 are deleted, the positions of prediction boxes with confidence greater than or equal to the threshold 0.3 are retained as the final position information, and the classification of the heat map is used as the final classification information.
2. The CenterNet-based lightweight underwater target detection method of claim 1, wherein:
the step 2 comprises the following steps:
dividing the images into a training set and a test set according to a proportion between 7:3 and 9:1; performing data annotation on the collected training-set images using the software labelimage, the annotation information being the position information and category information of the target to be detected in the image, thereby obtaining an underwater optical image data set in which the number of training-set images is greater than 500.
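The split described in claim 2 can be sketched as follows (a minimal illustration; the file names are hypothetical, and the 8:2 ratio is just one value inside the claimed 7:3 to 9:1 range):

```python
import random

def split_dataset(image_paths, train_ratio=0.8, seed=0):
    """Split images into training and test sets; claim 2 allows 7:3 to 9:1."""
    assert 0.7 <= train_ratio <= 0.9
    paths = list(image_paths)
    random.Random(seed).shuffle(paths)  # deterministic shuffle for a fixed seed
    cut = int(len(paths) * train_ratio)
    return paths[:cut], paths[cut:]

# hypothetical file names; claim 2 requires more than 500 training images
images = [f"underwater_{i:03d}.jpg" for i in range(600)]
train, test = split_dataset(images, train_ratio=0.8)
print(len(train), len(test))
```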
3. The CenterNet-based lightweight underwater target detection method according to claim 1, characterized in that:
the step 4 comprises the following steps:
storing the annotation files and the images of the data set in two separate folders, moving both into the data folder of the algorithm, and running the algorithm's main.py file from the terminal to train the network; during training, the feature extraction network, including the multi-scale fusion module, is first called to extract features from the underwater optical images in the training set, then the files of the detection head are called and the feature map output by the feature extraction network is passed to the loss function in the detection head to compute its value, completing one forward pass; the convolutional neural network then adjusts the parameters of the model according to the change of the LOSS value until the LOSS function reaches its minimum and no longer decreases; the model at this point is the optimal underwater optical image target detection model, and the training part of the CenterNet algorithm is complete.
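The stopping criterion in claim 3 (train until the LOSS reaches its minimum and stops decreasing) can be sketched as an early-stopping check (an illustrative sketch only; the loss sequence, patience and min_delta values are hypothetical, and the actual training loop is not reproduced here):

```python
def train_until_converged(loss_per_epoch, patience=3, min_delta=1e-3):
    """Return (stop_epoch, best_loss): training stops once the LOSS has not
    improved by more than min_delta for `patience` consecutive epochs."""
    best = float("inf")
    stale = 0
    for epoch, loss in enumerate(loss_per_epoch):
        if best - loss > min_delta:
            best, stale = loss, 0  # LOSS still decreasing
        else:
            stale += 1
            if stale >= patience:  # minimum reached, no further decrease
                return epoch, best
    return len(loss_per_epoch) - 1, best

# hypothetical per-epoch LOSS values that flatten out at 0.55
losses = [2.0, 1.2, 0.8, 0.6, 0.55, 0.55, 0.55, 0.55]
print(train_until_converged(losses))
```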
Application CN202110723096.4A, filed 2021-06-25, priority 2021-06-25: Lightweight underwater target detection method based on CenterNet. Status: Active. Granted as CN113420819B (en).


Publications (2)

Publication Number Publication Date
CN113420819A (en) 2021-09-21
CN113420819B (en) 2022-12-06




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant