CN113435269A - Improved water surface floating object detection and identification method and system based on YOLOv3 - Google Patents

Improved water surface floating object detection and identification method and system based on YOLOv3

Info

Publication number
CN113435269A
Authority
CN
China
Prior art keywords
water surface
training
drifter
yolov3
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110647573.3A
Other languages
Chinese (zh)
Inventor
刘献忠
徐浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
East China Normal University
Original Assignee
East China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by East China Normal University filed Critical East China Normal University
Priority to CN202110647573.3A priority Critical patent/CN113435269A/en
Publication of CN113435269A publication Critical patent/CN113435269A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a water surface floating object detection and identification method based on an improved YOLOv3 recognition model, relating to the technical field of computer vision and comprising the following steps: acquiring water surface floating object data in advance, augmenting the image data with geometric and color transformations, and labeling the floating objects to obtain a water surface floating object data set, which is split into a training set and a test set; constructing an improved YOLOv3 network model and training it with the training set; and detecting and identifying the test set with the trained improved YOLOv3 network model. The improved YOLOv3 has strong generalization ability, occupies little storage and GPU memory, improves detection and identification accuracy while preserving real-time performance, and enables accurate and fast monitoring and identification of water surface floating objects on client devices with limited computing power and memory.

Description

Improved water surface floating object detection and identification method and system based on YOLOv3
Technical Field
The invention belongs to the technical field of computer vision, and relates to an improved method and system for detecting and identifying water surface floating objects based on YOLOv3.
Background
In recent years, urbanization and industrialization in China have accelerated, and while the economy has developed rapidly, the problem of water pollution has grown serious. Large amounts of floating debris in rivers and lakes not only spoil the natural ecological landscape but also seriously threaten human health and sustainable economic development, so research on effectively monitoring floating objects in rivers and lakes has important practical significance.
Existing video-image-based water surface floating object detection techniques mainly target remote sensing images, detecting floating objects by extracting their spectral, spatial and texture features. Because remote sensing images are usually captured from a great distance, small floating objects in urban river channels are hard to detect; moreover, forming remote sensing images places requirements on the imaging equipment, so collecting a remote sensing data set containing many floating objects is difficult, which hinders practical adoption. Traditional image segmentation techniques also perform poorly here: reflections on the water surface and changing illumination prevent large reflective areas from being segmented correctly, so the segmentation effect is not ideal.
Deep-learning object detection uses convolutional neural networks to extract features and achieves strong adaptability and generalization through training. YOLOv3 is currently a mainstream algorithm in the object detection field, but the model is too large to be embedded in devices with limited computing power (such as automatic salvage ships and other river patrol equipment) while meeting real-time detection requirements. Because no public, professional data set exists for water surface floating object detection, models also suffer from class imbalance. YOLOv3 sacrifices some detection speed to improve accuracy, yet detecting small floating objects on the water surface remains difficult.
Disclosure of Invention
In order to remedy the defects of the prior art, the invention aims to provide a water surface floating object detection and identification method based on an improved YOLOv3 recognition model, which maintains and improves the performance of the YOLOv3 algorithm on water surface floating object detection and identification while reducing the size of the detection model.
The technical scheme of the invention is realized as follows:
step one, acquiring a data set for training water surface floating object detection in advance, augmenting the image data with geometric and color transformations, labeling the floating objects to obtain a water surface floating object data set, and splitting it into a training set and a test set;
step two, constructing an improved YOLOv3 network model;
step three, training the improved YOLOv3 network model constructed in step two with the training set obtained in step one;
and step four, detecting and identifying the test set obtained in step one with the improved YOLOv3 network model trained in step three.
The inventive data set was collected manually on site and must be processed into the PASCAL VOC format used by YOLOv3.
Further, the step one is divided into the following two steps:
1.1, acquiring a training data set of water surface floating objects manually on site; performing color transformation by adjusting hue, contrast, saturation and brightness; applying geometric transformation and random cropping to the images; and then randomly selecting pictures and stitching them to generate new images; the geometric transformation refers to scaling, translation and rotation;
and 1.2, manually labeling the data with Labelme, converting the label format to the PASCAL VOC format, and splitting into training and test sets at a ratio between 8:2 and 9:1.
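As a concrete illustration of the split in step 1.2, the following sketch divides a list of annotated image ids at the 8:2 ratio; the function name, seed handling and ratio parameter are my own assumptions, not part of the patent:

```python
import random

def split_dataset(image_ids, train_ratio=0.8, seed=0):
    # Shuffle deterministically, then cut at the requested ratio
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)
    cut = int(len(ids) * train_ratio)
    return ids[:cut], ids[cut:]
```

A 9:1 split, the upper end of the patent's range, is obtained by passing `train_ratio=0.9`.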
Further, in the second step, the construction of the improved YOLOv3 network model specifically includes the following steps:
2.1, replacing the original DarkNet53 backbone of the YOLOv3 network model with a GhostNet network, and adding an attention layer (SELayer) into each Ghost bottleneck of GhostNet to strengthen attention to important channel features;
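For illustration, a minimal PyTorch sketch of a squeeze-and-excitation attention layer of the kind inserted into each Ghost bottleneck; the class name, reduction ratio and layer layout are assumptions, not taken from the patent:

```python
import torch
import torch.nn as nn

class SELayer(nn.Module):
    """Squeeze-and-excitation channel attention: re-weights channels
    so the network attends to the most informative ones."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: one value per channel
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                    # excitation: weights in (0, 1)
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                         # re-scale each channel
```

The output shape equals the input shape, so the layer can be dropped into an existing bottleneck without changing the surrounding architecture.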
and 2.2, extracting feature maps with down-sampling factors of 4, 8, 16 and 32 in turn from the GhostNet structure via average pooling, then successively up-sampling them and fusing them with the original features to obtain four new feature maps.
In addition to the final 13 × 13 output feature map, the GhostNet backbone provides three branch feature maps of sizes 26 × 26, 52 × 52 and 104 × 104.
Feature extraction runs through the whole backbone, and feature maps of different sizes carry feature information of different levels: the shallower the level, the more limited the feature semantics; the deeper the level, the richer the semantics. The 104 × 104 feature map is extracted by the first three Ghost blocks of GhostNet; two further Ghost blocks yield the 52 × 52 feature map; six more produce the 26 × 26 feature map; and a final five Ghost blocks yield the 13 × 13 feature map, completing backbone feature extraction.
2.3, replacing the original positioning loss function of the YOLOv3 network model by GIOU loss;
the IOU is used for respectively solving intersection and union of any A, B frames and finally solving the ratio of the intersection and the union; the IOU expression is:
IOU = |A ∩ B| / |A ∪ B|
GIOU means that for any two boxes A and B, the smallest closed shape C enclosing both is found first; then the ratio of the area of C \ (A ∪ B) to the area of C is computed, where the area of C \ (A ∪ B) is the area of C minus the area of A ∪ B; this ratio is then subtracted from the IOU of A and B to obtain GIOU. The GIOU expression is:
GIOU = IOU - |C \ (A ∪ B)| / |C|
wherein: A. b is two arbitrary convex regions, C refers to the smallest closed shape containing A and B;
the expression for the final localization loss is:
L_GIOU = 1 - GIOU.
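The IOU and GIOU formulas above can be sketched as plain Python for axis-aligned boxes given as (x1, y1, x2, y2); the function names are mine, not from the patent:

```python
def iou(a, b):
    """IoU of two axis-aligned boxes, plus their union area."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union, union

def giou_loss(a, b):
    """L_GIOU = 1 - GIOU, where GIOU = IOU - |C \\ (A∪B)| / |C|."""
    i, union = iou(a, b)
    cw = max(a[2], b[2]) - min(a[0], b[0])  # smallest enclosing box C
    ch = max(a[3], b[3]) - min(a[1], b[1])
    c_area = cw * ch
    return 1.0 - (i - (c_area - union) / c_area)
```

Unlike plain IOU loss, this penalty stays informative even when the boxes do not overlap, because the enclosing-box term still varies with their separation.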
2.4, replacing the class loss function of the original YOLOv3 model with Focal Loss;
the Focal Loss is computed as:
FL(p_t) = -α (1 - p_t)^γ log(p_t)
where α = 2, γ = 0.25, and p_t is the probability for positive and negative samples, defined as:

p_t = p if y = 1, and p_t = 1 - p otherwise,

where p is the predicted positive-sample probability and y is the label value;
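The Focal Loss for a single binary prediction is small enough to sketch directly; the code below uses the patent's stated values α = 2, γ = 0.25 (note that the original Focal Loss paper uses α = 0.25, γ = 2), and the function name is mine:

```python
import math

def focal_loss(p, y, alpha=2.0, gamma=0.25):
    # p_t = p for a positive label, 1 - p otherwise (the piecewise
    # definition above); alpha and gamma follow the patent's values
    pt = p if y == 1 else 1.0 - p
    return -alpha * (1.0 - pt) ** gamma * math.log(pt)
```

The (1 - p_t)^γ factor down-weights well-classified samples, so training gradient is dominated by the hard, misclassified ones.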
in step 2.2, deeper features are obtained through four-scale feature fusion, with 13 × 13, 26 × 26, 52 × 52 and 104 × 104 selected as the four output feature maps; the number of network iterations is set to 1000. The features of the four feature maps are further extracted by an improved up-sampling module, dw_res2net_block, then up-sampled and fused with the original features as new candidate features.
The dw_res2net_block is built on the original inverted_res_block structure with reference to the basic structures of GhostNet and Res2Net. A smaller residual connection is added, which alleviates vanishing gradients and increases communication between feature maps, and the original 3 × 3 convolution layer is replaced by 3 × 1 and 1 × 3 convolutions, giving finer feature extraction while cutting the parameter count by 1/3. To further reduce model parameters, DWConv replaces part of the convolution operations, achieving a convolution effect close to the original layers with less computation.
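The factorised-convolution and depthwise-convolution ideas behind dw_res2net_block can be sketched in PyTorch; the patent does not disclose the exact block layout, so the class name, channel handling and layer order below are illustrative assumptions:

```python
import torch
import torch.nn as nn

class DwRes2NetBlock(nn.Module):
    """Sketch of the dw_res2net_block idea: a 3x3 convolution factorised
    into 3x1 + 1x3, a depthwise conv (DWConv) to cut parameters, and a
    small residual connection to ease gradient flow."""
    def __init__(self, channels):
        super().__init__()
        self.branch = nn.Sequential(
            nn.Conv2d(channels, channels, (3, 1), padding=(1, 0)),  # 3x1
            nn.Conv2d(channels, channels, (1, 3), padding=(0, 1)),  # 1x3
            nn.Conv2d(channels, channels, 3, padding=1,
                      groups=channels),  # depthwise convolution
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return x + self.branch(x)  # residual connection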
In step 2.3, the improvement to the YOLOv3 backbone includes acquiring more effective channel information through the SE self-attention mechanism, which is added into each Ghost bottleneck so that the network pays more attention to training important channel features.
Further, the improvement to the YOLOv3 backbone also includes clustering anchor box sizes on the water surface floating object data set with k-means++, generating 12 different sizes. This technique improves the model's detection performance and accelerates convergence.
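A numpy sketch of k-means++ anchor clustering over labelled box (width, height) pairs; it uses plain Euclidean distance for brevity, whereas YOLO implementations often use 1 - IoU as the distance, and the function name and defaults are mine:

```python
import numpy as np

def kmeanspp_anchors(wh, k=12, iters=20, seed=0):
    """Cluster (width, height) pairs into k anchor sizes with
    k-means++ seeding followed by standard Lloyd iterations."""
    rng = np.random.default_rng(seed)
    wh = np.asarray(wh, dtype=float)
    # k-means++ initialisation: each new centre is drawn with probability
    # proportional to its squared distance from the nearest existing centre
    centres = [wh[rng.integers(len(wh))]]
    while len(centres) < k:
        d2 = np.min([((wh - c) ** 2).sum(1) for c in centres], axis=0)
        centres.append(wh[rng.choice(len(wh), p=d2 / d2.sum())])
    centres = np.array(centres)
    # Lloyd iterations: assign points to nearest centre, recompute means
    for _ in range(iters):
        labels = np.argmin(((wh[:, None] - centres[None]) ** 2).sum(2), axis=1)
        for j in range(k):
            if (labels == j).any():
                centres[j] = wh[labels == j].mean(0)
    return centres
```

Because each new seed is pushed away from existing ones, the 12 anchors spread over the box-size distribution rather than collapsing into one region, which is the property the patent relies on.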
In step three, the training of the improved YOLOv3 network model comprises the following steps:
3.1, initializing the weights and parameters of the improved YOLOv3 network model, the parameters including convolution layer values, the learning rate, the number of iterations (epochs) and the per-batch data volume (batch_size);
3.2, placing the training set and test set in the agreed directory and running the program to train;
3.3, before training, the program selects 1/10 of the training set as a validation set; each iteration evaluates the validation set and records poorly performing difficult samples. Once the iteration count is reached, the model metrics are output; the metrics include mean average precision (mAP), per-class average precision (AP), precision, and recall.
3.4, after training finishes, analyzing the recorded difficult samples, augmenting them, and repeating steps 3.3-3.4 three times.
3.5, when the iteration count is reached, training ends and the model parameters and weights are saved.
In step four, the improved YOLOv3 network model's ability to detect water surface floating objects is evaluated using mean class accuracy, model parameter count and FLOPs as performance indicators.
The invention also provides a system for realizing the detection method, which comprises the following steps: the system comprises a data input module, a data processing module, a YOLOv3 network module and a result output module.
The data input module is used for transmitting the acquired image data to the data processing module;
the data processing module is used for carrying out geometric transformation and color transformation on the image data, labeling the target object to obtain a data set, and splitting the data set into a training set and a test set;
the YOLOv3 network module is used for training a model and detecting and identifying a target object;
the result output module is used for outputting a corresponding result to the input image data.
Based on the foregoing method, the present invention further provides a system implementing the detection and identification method, comprising:
a user login interface, a water surface floating object detection page, picture upload, video upload, a detection function, and camera recognition; wherein,
the user login interface: displays the application name, author and version information, and provides an entrance to the detection interface;
the water surface floating object detection page: this interface is the display area for the application's main function buttons and for the uploaded and predicted pictures; the main functions include picture upload, video upload, start detection, and camera detection;
picture upload: uploading a picture to be detected;
video upload: uploading a video to be detected;
the detection function: clicking the button performs recognition and displays the recognized result;
camera recognition: opening the camera to monitor detected objects in real time.
The beneficial effects of the invention include:
the improved YOLOv3 network model in the invention adopts GhostNet network structure to greatly reduce the parameters and FLOPs of the model; meanwhile, the detection effect of the network on the small target water surface drifter can be improved by adopting four-scale feature map fusion; a GIOU loss function is adopted; the Focal local Loss function is adopted, so that the training of difficult samples is emphasized more in the training process of the model, and the problem of sample class imbalance is solved; and the detection effect of the model is improved by means of data enhancement and multi-scale training. Through the method, compared with a YOLOv3 original edition algorithm, the method has the advantages that the detection effect is higher, the volume of the model is greatly reduced, the parameter quantity is also greatly reduced, and accurate and rapid detection and identification can be realized in a mobile client with limited calculation capacity.
Drawings
FIG. 1 is a class-frequency chart of the difficult-sample data set before enhancement according to an embodiment of the present invention.
FIG. 2 is a class-frequency chart of the difficult-sample data set after enhancement according to an embodiment of the present invention.
Fig. 3 is a general framework diagram of the YOLOv3 improved network of the present invention.
Fig. 4 is a detailed diagram of dw _ res2net _ block modules in the network of the present invention.
FIG. 5 is a flow chart of model training according to the present invention.
FIG. 6 is a diagram of the improved YOLOv3 network mAP convergence.
Fig. 7 is a graph of the raw YOLOv3 network mAP convergence.
Detailed Description
The invention is described in further detail below with reference to specific examples and the accompanying drawings. Except for the content specifically mentioned below, the procedures, conditions and experimental methods used to implement the invention are general and common knowledge in the art, and the invention is not particularly limited thereto.
The invention provides an improved water surface floating object detection and identification method based on YOLOv3, which comprises the following steps:
step one, acquiring a data set for training water surface floating object detection in advance, augmenting the image data with geometric and color transformations, labeling the floating objects to obtain a water surface floating object data set, and splitting it into a training set and a test set;
step two, constructing an improved YOLOv3 network model;
step three, training the improved YOLOv3 network model constructed in step two with the training set obtained in step one;
and step four, detecting and identifying the test set obtained in step one with the improved YOLOv3 network model trained in step three.
The invention also provides a system for realizing the detection method, which comprises the following steps: the system comprises a data input module, a data processing module, a YOLOv3 network module and a result output module.
The data input module is used for transmitting the acquired image data to the data processing module;
the data processing module is used for carrying out geometric transformation and color transformation on the image data, labeling the target object to obtain a data set, and splitting the data set into a training set and a test set;
the YOLOv3 network module is used for training a model and detecting and identifying a target object;
the result output module is used for outputting a corresponding result to the input image data.
Examples
The embodiment provides an improved water surface drifter detection and identification method based on YOLOv3, which comprises the following steps:
step 1, preparing a water surface drift object data set and performing data enhancement on the data, wherein the method specifically comprises the following steps:
(1) The data set was collected in real scenes and contains 3443 pictures in total, split into training and test sets at an 8:2 ratio: 2755 training pictures and 688 test pictures, all uniformly resized to 416 × 416 and labeled with Labelme. Because the number of instances and training difficulty differ per class, rarely occurring difficult samples are insufficiently learned by the model, so samples with precision below 50% are marked during training and trained again after training finishes. The annotation files are in json format and are converted to VOC format by script for network training.
(2) To give the network more data, a series of augmentation methods is used, covering geometric and color transformations. First, a small number of samples from minority classes are randomly selected; color transformations adjust hue, contrast, saturation and brightness; geometric transformations such as scaling, translation and rotation plus random cropping are applied; and pictures are then randomly selected and stitched to generate new images. After training, samples with precision below 50% are augmented again in the same way and used as training data together with the original data set. Data augmentation alleviates sample imbalance: FIG. 1 shows the class frequency of the difficult samples before enhancement in this embodiment, and FIG. 2 the frequency after enhancement.
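A minimal numpy sketch of part of the augmentation pipeline above, operating on an HxWx3 uint8 image array: a colour transform (brightness jitter) plus geometric transforms (horizontal flip and random crop). The hue, saturation, rotation and stitching steps from the patent are omitted, and all names and parameter values are illustrative assumptions:

```python
import numpy as np

def augment(img, rng):
    """Brightness jitter, random horizontal flip, and a random 90% crop."""
    out = img.astype(np.int16)
    out = np.clip(out + rng.integers(-30, 31), 0, 255).astype(np.uint8)
    if rng.random() < 0.5:
        out = out[:, ::-1]                   # horizontal flip
    h, w = out.shape[:2]
    ch, cw = int(h * 0.9), int(w * 0.9)      # 90% crop window
    y0 = rng.integers(0, h - ch + 1)
    x0 = rng.integers(0, w - cw + 1)
    return out[y0:y0 + ch, x0:x0 + cw]
```

In a real pipeline the bounding-box labels must be transformed alongside the pixels, which this sketch does not do.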
And 2, building a YOLOv3 improved network model.
(1) FIG. 3 is the overall framework of the improved YOLOv3 network. First, the YOLOv3 backbone is improved: the ResNet blocks in Darknet53 are replaced by Ghost blocks, which apply a partial convolution to the input, generate additional maps through cheap linear operations (DWConv), and concatenate the two results into a new feature map. This GhostNet connection style greatly deepens the network while reducing model parameters, obtaining better feature extraction with less computation.
(2) Enhancing multi-scale feature fusion capability. Feature maps of sizes 13 × 13, 26 × 26, 52 × 52 and 104 × 104 are extracted in turn from the GhostNet structure: Route-1, Route-2, Route-3 and Route-4. The dw_res2net_block shown in FIG. 4 further extracts and fuses these features, which are then up-sampled to obtain fused features m1, m2, m3 and m4, improving detection of small floating objects.
(3) The dw_res2net_block is built on the original inverted_res_block structure with reference to the basic structures of GhostNet and Res2Net. A smaller residual connection is added, which alleviates vanishing gradients and increases communication between feature maps, and the original 3 × 3 convolution layer is replaced by 3 × 1 and 1 × 3 convolutions, giving finer feature extraction while cutting the parameter count by 1/3. To further reduce model parameters, DWConv replaces part of the convolution operations, achieving a convolution effect close to the original layers with less computation. In addition, an SE block is added to weight the feature channels, improving the model's feature extraction.
(4) Replacing the original positioning loss function of the YOLOv3 network model by GIOU loss;
For any two boxes A and B, IOU is computed as the ratio of the area of their intersection to the area of their union; the IOU expression is:
IOU = |A ∩ B| / |A ∪ B|
GIOU means that for any two boxes A and B, the smallest closed shape C enclosing both is found first; then the ratio of the area of C \ (A ∪ B) to the area of C is computed, where the area of C \ (A ∪ B) is the area of C minus the area of A ∪ B; this ratio is then subtracted from the IOU of A and B to obtain GIOU. The GIOU expression is:
GIOU = IOU - |C \ (A ∪ B)| / |C|
wherein: A. b is two arbitrary convex regions, C refers to the smallest closed shape containing A and B;
the final positioning loss expression is:
L_GIOU = 1 - GIOU
(5) Replacing the category Loss function of the original model of YOLOv3 with Focal local Loss;
the Focal Loss is computed as:
FL(p_t) = -α (1 - p_t)^γ log(p_t)
where α = 2, γ = 0.25, and p_t is the probability for positive and negative samples, defined as:

p_t = p if y = 1, and p_t = 1 - p otherwise,

where p is the predicted positive-sample probability and y is the label value;
(6) The original YOLOv3 network clusters anchors with the k-means algorithm; the invention replaces it with k-means++. Rather than randomly choosing all k cluster centres at once, k-means++ picks each new centre far from those already chosen, which is more reasonable; 12 anchors are selected this way. k-means++ can reduce the final localization error to some extent.
Step 3, training the improved model
The model's input picture size is 416 × 416 and the initial learning rate is set to 1e-3. The processed training set is fed to the model in batches of the configured batch_size (set according to hardware conditions) for forward propagation and loss computation; the network parameters are then updated by back-propagating the loss. After a number of iterations, when the network loss stabilizes, training stops and the network model's parameters are saved.
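The forward/backward cycle described above can be sketched generically in PyTorch; the function name, the choice of Adam, and the loop structure are assumptions for illustration, not details taken from the patent:

```python
import torch

def train(model, loader, loss_fn, epochs, lr=1e-3, device="cpu"):
    """Generic training loop: forward pass, loss, back-propagation,
    parameter update, repeated for the given number of epochs."""
    model.to(device).train()
    opt = torch.optim.Adam(model.parameters(), lr=lr)  # initial lr = 1e-3
    for epoch in range(epochs):
        for images, targets in loader:
            preds = model(images.to(device))
            loss = loss_fn(preds, targets.to(device))
            opt.zero_grad()
            loss.backward()  # back-propagate the loss
            opt.step()       # update the network parameters
    return model
```

In practice the loop would also monitor the loss and stop once it stabilizes, then save the model weights, as the text describes.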
Step 4, using the trained improved model to perform detection test
The trained model is used to detect the test data and the results are averaged. The results show that the improved model's detection precision on water surface floating objects is greatly improved, especially for small targets.
As shown in Table 1, comparing the per-class AP of YOLOv3 and the improved YOLOv3, every class's AP improves considerably, though recognition of low-frequency classes such as waterweeds, branches and lotus leaves remains weak; adding Focal Loss to the improved YOLOv3 raises recognition of these rare difficult classes, at the cost of only a small amount of precision on easy-to-train samples. As shown in Table 2, in overall mean accuracy the improved YOLOv3's detection precision is 13% higher than the original YOLOv3, and adding Focal Loss brings roughly a further 1% improvement.
TABLE 1 average AP of each model
[Table 1 is provided as an image in the original document.]
TABLE 2 Overall average accuracy of the model
[Table 2 is provided as an image in the original document.]
As shown in Table 3, comparing model parameters with the required computing power, the improved YOLOv3 is much smaller than YOLOv3 and also requires less computation, making it better suited for deployment on devices with limited computing power and high real-time requirements.
TABLE 3 comparison of the parameters of the model with the calculated forces required
[Table 3 is provided as an image in the original document.]
With the iteration count set to 1000 and training on an RTX 2080 Ti GPU, the improved YOLOv3 network converges in about three and a half days, while the original network takes about a week. The convergence curves are shown in FIG. 6 (improved network) and FIG. 7 (original network): the improved network's training is smoother and converges faster, while the original network fluctuates more and trains slowly. As Table 3 also shows, the improved model has far fewer parameters than the original, so fewer parameters must be computed and updated during training, making the improved model's training and convergence faster.
The protection scope of the present invention is not limited to the above embodiments. Variations and advantages that occur to those skilled in the art may be incorporated without departing from the spirit and scope of the inventive concept, and the scope of protection is defined by the appended claims.

Claims (10)

1. A water surface floating object detection and identification method based on an improved YOLOv3 recognition model, characterized by comprising the following steps:
step one, acquiring in advance a data set for training on water surface drifters, enhancing and augmenting the image data with geometric and color transformations, labeling the drifters in the data set to obtain a water surface drifter data set, and splitting the water surface drifter data set into a training set and a test set;
step two, constructing an improved YOLOv3 network model;
step three, training the improved YOLOv3 network model constructed in step two with the water surface drifter training set obtained in step one;
step four, detecting and identifying the water surface drifter test set, split from the water surface drifter image data in step one, with the improved YOLOv3 network model trained in step three.
2. The water surface drifter detection and recognition method based on the improved YOLOv3 recognition model according to claim 1, wherein step one specifically comprises the following two substeps:
1.1, acquiring a training data set of water surface drifters manually on site; performing color transformation by adjusting hue, contrast, saturation and brightness; performing geometric transformation by scaling, translation and rotation, together with random cropping; and then randomly selecting pictures and splicing them to generate new images;
1.2, labeling the data manually with Labelme, converting the data set label format into the PASCAL VOC format, and dividing the data into training and test sets at a ratio of between 8:2 and 9:1.
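The color transformations of substep 1.1 can be sketched with Python's standard `colorsys` module. This is illustrative only: the jitter ranges, function names and per-pixel formulation below are assumptions, not values from the patent, and a real pipeline would apply such transforms to whole images via an image library.

```python
import colorsys
import random

def jitter_pixel(rgb, hue_shift=0.02, sat_scale=1.1, val_scale=0.9):
    """Apply a hue/saturation/brightness jitter to one RGB pixel (values 0-255)."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    h = (h + hue_shift) % 1.0        # hue transform (wraps around the color wheel)
    s = min(1.0, s * sat_scale)      # saturation transform
    v = min(1.0, v * val_scale)      # brightness transform
    r2, g2, b2 = colorsys.hsv_to_rgb(h, s, v)
    return tuple(int(round(c * 255)) for c in (r2, g2, b2))

def augment_image(pixels, seed=0):
    """Jitter every pixel of a small image (list of RGB tuples) with random factors."""
    rng = random.Random(seed)
    hue = rng.uniform(-0.05, 0.05)
    sat = rng.uniform(0.8, 1.2)
    val = rng.uniform(0.8, 1.2)
    return [jitter_pixel(p, hue, sat, val) for p in pixels]
```

Geometric transforms (scaling, translation, rotation, random cropping) and the mosaic-style splicing of randomly selected pictures would be applied on top of this color jitter.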
3. The method for detecting and identifying water surface drifters based on the improved YOLOv3 recognition model according to claim 1, wherein in step two, constructing the improved YOLOv3 network model specifically comprises the following steps:
2.1, replacing the original DarkNet53 backbone of the YOLOv3 network model with a GhostNet network; an attention-mechanism layer (SELayer) is added into the Ghost bottleneck of the GhostNet, strengthening attention to the main channel features;
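The SELayer added to the Ghost bottleneck follows the usual squeeze-and-excitation pattern: global average pooling per channel, two small fully connected layers, and a sigmoid gate that reweights each channel. A minimal pure-Python sketch (the weight shapes and activation choices are assumptions; the patent does not give the layer's internals):

```python
import math

def se_layer(channels, reduce_w, expand_w):
    """Squeeze-and-Excitation over per-channel feature maps.

    `channels` is a list of 2-D feature maps (each a list of rows);
    `reduce_w` and `expand_w` stand in for the two fully connected layers.
    """
    # squeeze: global average pooling, one scalar per channel
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in channels]
    # excite: FC -> ReLU -> FC -> sigmoid
    hidden = [max(0.0, sum(w * s for w, s in zip(ws, squeezed))) for ws in reduce_w]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(ws, hidden))))
             for ws in expand_w]
    # reweight each channel map by its learned gate
    return [[[v * g for v in row] for row in ch] for ch, g in zip(channels, gates)]
```

With zero excitation weights every gate is sigmoid(0) = 0.5, i.e. all channels are scaled equally; training drives the gates apart so that informative channels are emphasized.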
2.2, extracting feature maps with down-sampling factors of 4, 8, 16 and 32 in turn from the GhostNet network structure through average pooling, then up-sampling them in turn and fusing them with the original features to obtain four new feature maps;
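The up-sample-and-fuse step of substep 2.2 can be illustrated with nearest-neighbour 2x up-sampling. Element-wise addition is used for fusion here purely for brevity, an assumption on my part; YOLO-style necks typically concatenate along the channel axis instead.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x up-sampling of a 2-D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in (0, 1)]  # duplicate each column
        out.append(wide)
        out.append(list(wide))                   # duplicate each row
    return out

def fuse(a, b):
    """Fuse two equally sized maps element-wise (stand-in for channel concat)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
```

A coarse 13x13 map would be up-sampled this way to 26x26 and fused with the 26x26 backbone features, and so on down to the 104x104 scale.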
2.3, replacing the original localization loss function of the YOLOv3 network model with the GIOU loss;
the IOU denotes, for any two boxes A and B, taking their intersection and their union respectively and computing the ratio of the two; the IOU expression is:
IOU = |A ∩ B| / |A ∪ B|
the GIOU denotes, for any two boxes A and B, first finding the smallest closed shape C that encloses both boxes, then computing the ratio of the area of C \ (A ∪ B) to the area of C, where the area of C \ (A ∪ B) is the area of C minus the area of A ∪ B, and finally subtracting this ratio from the IOU value of A and B to obtain the GIOU; the GIOU expression is:
GIOU = IOU − |C \ (A ∪ B)| / |C|
wherein A and B are two arbitrary convex regions, and C is the smallest closed shape containing both A and B;
the expression for the final localization loss is:
L_GIOU = 1 − GIOU;
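For axis-aligned boxes, the IOU, GIOU and the localization loss L_GIOU = 1 − GIOU of substep 2.3 reduce to a few lines; a minimal sketch (the real model computes this over batched predicted and ground-truth boxes):

```python
def giou_loss(box_a, box_b):
    """GIoU localization loss for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    # intersection A ∩ B
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = area_a + area_b - inter
    iou = inter / union
    # C: smallest enclosing box of A and B
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    area_c = cw * ch
    giou = iou - (area_c - union) / area_c   # GIOU = IOU − |C \ (A∪B)| / |C|
    return 1.0 - giou                        # L_GIOU
```

Unlike a plain 1 − IOU loss, this still produces a useful gradient for disjoint boxes: two non-overlapping boxes give IOU = 0 but a GIOU below 0, so the loss grows with their separation.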
2.4, replacing the category loss function of the original YOLOv3 model with Focal Loss;
the calculation formula of Focal local is as follows:
FL(p_t) = −α(1 − p_t)^γ log(p_t),
wherein α = 2 and γ = 0.25, and p_t denotes the probability of the true class for positive and negative samples, given by:
p_t = p when y = 1, and p_t = 1 − p otherwise,
where p denotes the positive sample probability and y denotes the label value.
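The Focal Loss of substep 2.4 in a few lines of Python. Note one hedge: the claim states α = 2 and γ = 0.25, but the defaults below follow the more common Focal Loss convention (α = 0.25 as the balancing weight, γ = 2 as the focusing exponent); pass the claim's values explicitly if you want to match the text as written.

```python
import math

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Focal Loss for one prediction: FL(p_t) = -alpha * (1 - p_t)**gamma * log(p_t).

    p is the predicted positive-class probability, y the label (1 or 0).
    Defaults are the common convention, not the values stated in the claim.
    """
    p_t = p if y == 1 else 1.0 - p   # probability assigned to the true class
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)
```

A well-classified sample (p_t near 1) is down-weighted by the (1 − p_t)^γ factor, which is how the method trades a little easy-sample precision for better recognition of hard, under-represented classes such as waterweeds and lotus leaves.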
4. The method for detecting and identifying water surface drifters based on the improved YOLOv3 recognition model according to claim 3, wherein in step 2.2, deeper features are obtained through four-scale feature fusion, with 13×13, 26×26, 52×52 and 104×104 selected as the four output feature maps; the number of network iterations is set to 1000; features are further extracted from the four-layer feature maps by the improved up-sampling module dw_res2net_block and then up-sampled and fused with the original features to serve as new candidate features;
the dw _ res2net _ block is added with a smaller residual connecting module, so that the interaction among all sections of feature maps is increased while the gradient disappears is relieved, the module feature extraction capability becomes finer and finer, and the parameter quantity of 1/3 is reduced; while DWConv is used instead of part of the operation of convolution.
5. The method for detecting and identifying water surface drifters based on the improved YOLOv3 recognition model according to claim 3, wherein in step two, the improvement of the YOLOv3 backbone network further comprises adding an SE self-attention mechanism into the Ghost bottleneck, so that during training the network focuses more on important channel features.
6. The method for detecting and identifying water surface drifters based on the improved YOLOv3 recognition model according to claim 3, wherein in step two, the improvement of the YOLOv3 backbone network further comprises clustering the anchor box sizes on the water surface drifter data set with k-means++ to generate 12 different sizes, improving the detection performance of the model and accelerating convergence.
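Anchor clustering as in claim 6 can be sketched with plain k-means++ over (width, height) box sizes. One simplifying assumption: distances here are squared Euclidean on (w, h), whereas YOLO implementations often cluster with a 1 − IoU distance instead; the patent specifies only "k-means++".

```python
import random

def kmeanspp_anchors(boxes, k, seed=0, iters=20):
    """Cluster (w, h) box sizes into k anchor sizes with k-means++ seeding."""
    rng = random.Random(seed)
    d2 = lambda a, b: (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    # k-means++ seeding: each new center drawn with probability ~ distance^2
    centers = [rng.choice(boxes)]
    while len(centers) < k:
        dists = [min(d2(b, c) for c in centers) for b in boxes]
        r, acc = rng.uniform(0, sum(dists)), 0.0
        for b, dd in zip(boxes, dists):
            acc += dd
            if acc >= r:
                centers.append(b)
                break
    # standard Lloyd iterations
    for _ in range(iters):
        groups = [[] for _ in centers]
        for b in boxes:
            groups[min(range(len(centers)), key=lambda i: d2(b, centers[i]))].append(b)
        centers = [
            (sum(w for w, _ in g) / len(g), sum(h for _, h in g) / len(g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return sorted(centers)
```

Run with k = 12 on the labeled drifter boxes, the resulting sizes would replace YOLOv3's default anchors, which is what gives the claimed detection and convergence benefit.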
7. The method for detecting and identifying water surface floating objects based on the improved YOLOv3 recognition model according to claim 1, wherein in step three, training the improved YOLOv3 network model comprises the following steps:
3.1, initializing the weights and parameters of the improved YOLOv3 network model, the parameters comprising the convolutional-layer parameter values, the learning rate, the number of iterations (epochs) and the amount of data per batch (batch_size);
3.2, placing the training set and the test set in the agreed directory, and running the program for training;
3.3, before training, selecting 1/10 of the data from the training set as a validation set; validating on the validation set at each iteration and recording difficult samples with poor performance; after the number of iterations is reached, outputting the metrics of the model, the metrics comprising the mean average precision (mAP), the per-category average precision (AP), the precision rate and the recall rate;
3.4, after training finishes, analyzing the recorded difficult samples, enhancing them, and repeating steps 3.3 to 3.4 three times;
3.5, upon reaching the number of iterations, finishing training and saving the parameters and weights of the model.
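The bookkeeping in substeps 3.3 and 3.4 (hold out 1/10 as validation, record hard samples) can be sketched in a few lines. The score-threshold criterion for "poor performance" is an assumption; the claim does not define it.

```python
import random

def split_validation(train_set, frac=0.1, seed=0):
    """Hold out `frac` of the training set as a validation set (substep 3.3)."""
    items = list(train_set)
    random.Random(seed).shuffle(items)
    n_val = max(1, int(len(items) * frac))
    return items[n_val:], items[:n_val]   # (remaining training set, validation set)

def record_hard_samples(val_scores, threshold=0.5):
    """Return validation samples scoring below a threshold (assumed criterion
    for the claim's 'difficult samples with poor performance')."""
    return [name for name, score in val_scores if score < threshold]
```

The recorded hard samples would then be fed back into the augmentation of step one before the training round is repeated, as substep 3.4 describes.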
8. The method for detecting and identifying water surface floating objects based on the improved YOLOv3 recognition model according to claim 1, wherein in step four, the detection capability of the improved YOLOv3 network model on water surface floating objects is evaluated using the average class accuracy, the number of model parameters and FLOPs as performance indicators.
9. A system for implementing the detection and identification method according to any one of claims 1 to 8, the system comprising: a data input module, a data processing module, a YOLOv3 network module and a result output module;
the data input module is used for transmitting the acquired image data to the data processing module;
the data processing module is used for carrying out geometric transformation and color transformation on the image data, labeling the target object to obtain a data set, and splitting the data set into a training set and a test set;
the YOLOv3 network module is used for training a model and detecting and identifying a target object;
the result output module is used for outputting a corresponding result to the input image data.
10. A system for implementing the detection and identification method according to any one of claims 1 to 8, the system comprising:
a user login interface, a water surface drifter detection page, picture uploading, video uploading, a detection function and camera identification; wherein,
the user login interface: displays the application's name, author and version information, and provides an entrance to the detection interface;
the water surface drifter detection page: the display area for the application's main function buttons and for the uploaded and predicted pictures; the main functions displayed comprise: uploading pictures, uploading videos, starting detection and camera detection;
the picture uploading: uploads a picture to be detected;
the video uploading: uploads a video to be detected;
the detection function: clicking the button performs identification and displays the identified result;
the camera identification: opens the camera to monitor the detected objects in real time.
CN202110647573.3A 2021-06-10 2021-06-10 Improved water surface floating object detection and identification method and system based on YOLOv3 Pending CN113435269A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110647573.3A CN113435269A (en) 2021-06-10 2021-06-10 Improved water surface floating object detection and identification method and system based on YOLOv3


Publications (1)

Publication Number Publication Date
CN113435269A true CN113435269A (en) 2021-09-24

Family

ID=77755681

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110647573.3A Pending CN113435269A (en) 2021-06-10 2021-06-10 Improved water surface floating object detection and identification method and system based on YOLOv3

Country Status (1)

Country Link
CN (1) CN113435269A (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112257794A (en) * 2020-10-27 2021-01-22 东南大学 YOLO-based lightweight target detection method
CN112434672A (en) * 2020-12-18 2021-03-02 天津大学 Offshore human body target detection method based on improved YOLOv3
CN112465057A (en) * 2020-12-08 2021-03-09 中国人民解放军空军工程大学 Target detection and identification method based on deep convolutional neural network
CN112528934A (en) * 2020-12-22 2021-03-19 燕山大学 Improved YOLOv3 traffic sign detection method based on multi-scale feature layer
CN112784748A (en) * 2021-01-22 2021-05-11 大连海事大学 Microalgae identification method based on improved YOLOv3


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
周飞: "水面清污机器人垃圾检测算法的研究", 《中国优秀硕士学位论文全文数据库信息科技辑》 *
李汉冰: "基于Yolov3改进的实时车辆检测方法", 《激光与光电子学进展》 *
郭飞: "基于交通夜视场景的改进 YOLOv3 轻量化网络模型", 《电子技术与软件工程》 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112926501A (en) * 2021-03-23 2021-06-08 哈尔滨理工大学 Traffic sign detection algorithm based on YOLOv5 network structure
CN114332584A (en) * 2021-12-16 2022-04-12 淮阴工学院 Lake water surface floater identification method and medium based on image processing
CN114170483A (en) * 2022-02-11 2022-03-11 南京甄视智能科技有限公司 Training and using method, device, medium and equipment of floater identification model
CN114170483B (en) * 2022-02-11 2022-05-20 南京甄视智能科技有限公司 Training and using method, device, medium and equipment of floater identification model
CN114937195A (en) * 2022-03-29 2022-08-23 江苏海洋大学 Water surface floating object target detection system based on unmanned aerial vehicle aerial photography and improved YOLO v3
CN115019243A (en) * 2022-04-21 2022-09-06 山东大学 Monitoring floater lightweight target detection method and system based on improved YOLOv3
CN114758206A (en) * 2022-06-13 2022-07-15 武汉珈鹰智能科技有限公司 Steel truss structure abnormity detection method and device
CN114758206B (en) * 2022-06-13 2022-10-28 武汉珈鹰智能科技有限公司 Steel truss structure abnormity detection method and device
CN115035354A (en) * 2022-08-12 2022-09-09 江西省水利科学院 Reservoir water surface floater target detection method based on improved YOLOX
CN115035354B (en) * 2022-08-12 2022-11-08 江西省水利科学院 Reservoir water surface floater target detection method based on improved YOLOX

Similar Documents

Publication Publication Date Title
CN113435269A (en) Improved water surface floating object detection and identification method and system based on YOLOv3
CN111104898B (en) Image scene classification method and device based on target semantics and attention mechanism
CN108805070A (en) A kind of deep learning pedestrian detection method based on built-in terminal
CN114445670B (en) Training method, device and equipment of image processing model and storage medium
CN110287806A (en) A kind of traffic sign recognition method based on improvement SSD network
CN112784756B (en) Human body identification tracking method
CN113160246A (en) Image semantic segmentation method based on depth supervision
CN114511710A (en) Image target detection method based on convolutional neural network
CN115620010A (en) Semantic segmentation method for RGB-T bimodal feature fusion
CN110738132A (en) target detection quality blind evaluation method with discriminant perception capability
Tu et al. Scale effect on fusing remote sensing and human sensing to portray urban functions
CN115909280A (en) Traffic sign recognition algorithm based on multi-head attention mechanism
CN113988147A (en) Multi-label classification method and device for remote sensing image scene based on graph network, and multi-label retrieval method and device
CN114694185A (en) Cross-modal target re-identification method, device, equipment and medium
CN115908789A (en) Cross-modal feature fusion and asymptotic decoding saliency target detection method and device
CN114693952A (en) RGB-D significance target detection method based on multi-modal difference fusion network
CN116524189A (en) High-resolution remote sensing image semantic segmentation method based on coding and decoding indexing edge characterization
CN115661932A (en) Fishing behavior detection method
CN114842180A (en) Point cloud completion method, device, equipment and medium
CN113361496B (en) City built-up area statistical method based on U-Net
CN114926826A (en) Scene text detection system
CN114549959A (en) Infrared dim target real-time detection method and system based on target detection model
CN114283315A (en) RGB-D significance target detection method based on interactive guidance attention and trapezoidal pyramid fusion
CN115471901B (en) Multi-pose face frontization method and system based on generation of confrontation network
CN116957921A (en) Image rendering method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210924