CN113569667A

CN113569667A - Inland ship target identification method and system based on lightweight neural network model

Info

Publication number: CN113569667A
Application number: CN202110775903.7A
Authority: CN
Inventors: 张煜; 康哲; 马杰
Original assignee: Wuhan University of Technology WUT
Current assignee: Wuhan University of Technology WUT
Priority date: 2021-07-09
Filing date: 2021-07-09
Publication date: 2021-10-29
Anticipated expiration: 2041-07-09
Also published as: CN113569667B

Abstract

The invention discloses a method and a system for recognizing inland ship targets based on a lightweight neural network model, wherein the method comprises the following steps: s1, constructing a lightweight neural network model, and compressing a MobileNetv3Large network in a feature extraction network part to obtain a feature extraction network; on the algorithm prediction structure, performing multi-convolution layer feature fusion by using a feature pyramid structure; loss calculation is carried out by utilizing a loss function of the fusion distance measurement index; s2, screening and sorting images of inland ships to form an inland ship image data set, and dividing a training set and a test set; s3, training the constructed lightweight neural network model; and S4, recognizing the river ship target by using the trained model. The method effectively improves the image target identification precision of the inland ship, reduces the dependence of ship identification on the calculation performance of hardware equipment, and effectively improves the processing capacity of the video monitoring information of the inland ship.

Description

Inland ship target identification method and system based on lightweight neural network model

Technical Field

The invention relates to the technical field of image recognition, in particular to a method and a system for recognizing an inland ship target based on a lightweight neural network model.

Background

With the rapid development of inland waterway shipping services, the traffic volume of inland waterway is rapidly increased, the navigation environment is increasingly complex, and the identification of a ship target by utilizing a channel video monitoring system is an important basis for ship safety monitoring and danger early warning. At present, analysis and processing of inland river video monitoring information need to be completed by a large amount of manpower, and the problem that information processing is wrong and untimely is easily caused due to the influence of human factors. By deploying the inland ship identification model, ship targets can be positioned and classified in real time, and data support is provided for ship navigation supervision and dangerous behavior early warning.

The traditional target recognition algorithm is designed based on a mathematical modeling idea, and target recognition is carried out by extracting ship contour features in an image. The ship contour information is easily interfered by the image background, so that the traditional target identification algorithm has weaker ship classification and positioning capacity in a complex environment, meanwhile, the small-scale ship image target characteristic information is less, and the traditional identification algorithm has lower identification precision on the ship. Aiming at the problems, a plurality of scholars provide a target identification algorithm based on a deep convolutional neural network, and the algorithm extracts ship image features through a deep network so as to realize accurate classification and positioning of multi-scale ship image targets. However, the algorithm based on the deep convolutional neural network needs to be trained by means of a large amount of ship image data, and a large amount of network parameters are generated in the training process. Meanwhile, the ship identification model can meet the requirements of real-time identification and accurate identification of the ship only by means of a high-performance image processor. The inland ship monitoring equipment belongs to typical embedded equipment, is weak in computing capability, and cannot run an identification model with high computational consumption and model parameter quantity.

Disclosure of Invention

The invention mainly aims to provide a inland ship identification method and system based on a lightweight neural network, which can realize quick acquisition of ship features and accurate classification and positioning of ship targets without artificially extracting the features.

The technical scheme adopted by the invention is as follows:

the method for recognizing the inland ship target based on the lightweight neural network model comprises the following steps:

s1, constructing a lightweight neural network model, and removing a 5 th Bnic module and a 9 th Bnic module on the basis of original 15 Bnic modules of a MobileNetv3Large network in a feature extraction network part to obtain a compressed feature extraction network S-MobileNet network; on the algorithm prediction structure, performing multi-convolution layer feature fusion on the 6 th, 9 th and 13 th Bnegk modules of the S-MobileNet network by using a feature pyramid structure; in the aspects of the regression loss function of the prediction frame and the inhibition of the non-maximum value of the target prediction frame, performing loss calculation by using the loss function of the fusion distance measurement index;

s2, screening and sorting the images of the inland ships to form an inland ship image data set; dividing the data set into a training set and a testing set by using a random division principle;

s3, training the constructed lightweight neural network model through a training set and a testing set;

and S4, recognizing the river ship target by using the trained model.

According to the technical scheme, in the process of calculating the loss value of the prediction frame, the classification loss and the position loss of the prediction frame are adjusted by using a label smoothing method, so that overfitting is avoided.

According to the technical scheme, the loss function of the fusion distance measurement index is a MIoU loss index function constructed by utilizing the intersection ratio of the target prediction frame and the real frame and the distance measurement index of the central point of the two frames, and specifically comprises the following steps:

wherein | P ≧ G | is the intersection area of the target prediction frame and the real frame, | P ≧ G | is the union area of the two frames, L_MIoUAs a function of MIoU loss, d₁Is the distance between the center points of the two frames, d₂Diagonal distance of two frames minimum closure area, b^gtRespectively prediction box and trueCenter point of solid frame, (x)₁,y₁)、(x₂,y₂) The coordinates of the center points of the prediction frame and the real frame are respectively.

According to the technical scheme, the label smoothing method comprises the following specific calculation processes:

wherein, P_iAnd expressing the adjusted prediction probability, K expressing the total number of the categories to be classified, epsilon expressing the set hyper-parameter, i expressing a target prediction value, and y expressing a target true value.

According to the technical scheme, the characteristic pyramid structure has a top-down operation logic, specifically, 3 target prediction channels are constructed by transversely connecting with the convolution layer information of the characteristic extraction network part, and the identification capability of the multi-scale image target is improved.

According to the technical scheme, the model training process specifically comprises the following steps: performing model training by applying a transfer learning method; meanwhile, in the model training process, the model loss value is calculated in each iteration, and when the difference value of the model loss value in a certain number of training iterations is smaller than the interruption threshold value, the model training is finished.

According to the technical scheme, the transfer learning method comprises the following steps: in the initial stage of model training, only starting the last full connection layer of the S-MobileNet network, freezing the convolution layer of the Bnegk module, performing model pre-training, and storing parameters after pre-training; and after the pre-training is finished, starting all convolution modules of the S-MobileNet network to perform full convolution layer operation.

According to the technical scheme, the ratio of the number of the images of the training set to the number of the images of the testing set is 8: 2.

According to the technical scheme, in the model training process, operations including cutting, translation and scaling are carried out on three or four images randomly, the color saturation, brightness and contrast of the images are adjusted, the selected images are placed according to a specified sequence, the processed images are combined into one image, and then training is carried out.

The invention also provides a inland ship target recognition system based on the lightweight neural network model, which comprises the following steps:

the lightweight neural network model building module is used for building a lightweight neural network model, and on the basis of the original 15 Bneck modules of the MobileNetv3Large network, the 5 th Bneck module and the 9 th Bneck module are removed in the feature extraction network part to obtain a compressed feature extraction network S-MobileNet network; on the algorithm prediction structure, performing multi-convolution layer feature fusion on the 6 th, 9 th and 13 th Bnegk modules of the S-MobileNet network by using a feature pyramid structure; in the aspects of the regression loss function of the prediction frame and the inhibition of the non-maximum value of the target prediction frame, performing loss calculation by using the loss function of the fusion distance measurement index;

the data set processing module is used for screening and sorting the inland ship images to form an inland ship image data set; dividing the data set into a training set and a testing set by using a random division principle;

the training module is used for training the constructed lightweight neural network model through a training set and a testing set;

and the recognition module is used for recognizing the river ship target by using the trained model.

The invention has the following beneficial effects: according to the invention, the 5 th and 9 th Bneck modules with smaller weight in the MobileNetv3Large network are deleted, and the 6 th, 9 th and 13 th Bneck modules of the compressed network are subjected to multi-convolution layer characteristic fusion, so that a new lightweight neural network model is constructed, the accurate identification of the image target of the multi-scale inland ship is realized by using the new network model, the influence of human factors in ship identification and positioning is reduced, the dependence of the ship target identification on the computing capacity of hardware equipment is reduced, and the processing capacity of the video monitoring information of the ship in the inland river environment is effectively improved.

Furthermore, 3 target prediction channels are constructed by transversely connecting the characteristic pyramid structure with the characteristic extraction network convolution layer information, and the identification capability of the multi-scale image target is improved.

Furthermore, in the process of calculating the loss value of the prediction frame, the classification loss and the position loss of the prediction frame are adjusted by using a label smoothing method, so that the phenomenon of overfitting of the model is avoided.

Furthermore, in the model training process, the images are cut, translated, zoomed and the like, the color saturation, brightness, contrast and the like of the images are adjusted, and finally the images are spliced, so that the utilization rate of the image information of the data set of the small ship is effectively improved, the diversity of the data set is enhanced, and the training effect of the light ship recognition algorithm is ensured.

Furthermore, by utilizing the MIoU loss function and integrating the central point normalized distance between the prediction frame and the real frame, the regression speed of the target prediction frame to the real frame is improved. When the prediction frame does not intersect with the real frame, the distance index can accelerate the speed of the two frames to generate overlapping; when the predicted frame intersects with the real frame, the boundary of the predicted frame is accelerated to reduce the difference value with the boundary of the real frame under the combined action of the intersection ratio of the two frames and the distance measurement index.

Drawings

The invention will be further described with reference to the accompanying drawings and examples, in which:

FIG. 1 is a flow chart of a inland ship target identification method based on a lightweight neural network model according to an embodiment of the invention;

FIG. 2(a) is a schematic diagram of three image stitching according to an embodiment of the present invention;

FIG. 2(b) is a schematic diagram of four image stitching according to the embodiment of the present invention;

FIG. 3 is a diagram illustrating a specific process of constructing an S-MobileNet network according to an embodiment of the present invention;

FIG. 4 is a diagram illustrating two types of loss convergence curves in a model training process according to an embodiment of the present invention;

FIG. 5(a) is a schematic diagram of large-scale vessel identification according to an embodiment of the present invention;

FIG. 5(b) is a schematic diagram of small-scale vessel identification according to an embodiment of the present invention;

FIG. 5(c) is a schematic view of a ship partial block according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

As shown in fig. 1, the inland ship target identification method based on the lightweight neural network model in the embodiment of the present invention includes the following steps:

and S4, recognizing the river ship target by using the trained model.

The method can realize the rapid acquisition of the ship characteristics and the accurate classification and positioning of the ship target without manually extracting the characteristics. The model constructed by the method has the characteristics of short training time, small parameter quantity, low computational power consumption and the like, and is suitable for being deployed in embedded equipment with weak computational capability.

Furthermore, the inland ship image data set is formed by screening and sorting inland ship images, and the data set is divided into a training set and a testing set by using a random division principle. In the embodiment of the invention, 6000 inland ship images are preliminarily collated by using a web crawler technology, and a ship data set consisting of 4000 images is formed by further screening in consideration of the diversity and adaptability of ship image data. And then, randomly dividing the inland ship image data set into a training set and a test set, wherein the number ratio of the images of the training set to the test set is 8:2, namely the training set is composed of 3200 images, and the test set is composed of 800 images.

As shown in fig. 2(a) and 2(b), in order to avoid the problem that the training of the small sample data set is easy to generate the over-fitting problem, in the model training process, the ship image is randomly turned, zoomed, color gamut changed and the like by using an image stitching data enhancement method, so that the diversity of the ship data is enhanced. Specifically, in the model training process, operations such as cropping, translation, scaling and the like are randomly performed on three or four images. Meanwhile, the color saturation, brightness, contrast and the like of the image are adjusted, the selected pictures are put in a specified sequence, the processed pictures are combined into one picture, and then the picture is input into a ship recognition algorithm for training. The method effectively improves the utilization rate of image information of the data set of the small ship, enhances the diversity of the data set and ensures the training effect of the lightweight ship identification algorithm.

When a lightweight neural network model is constructed, compressing the characteristic extraction network part based on a MobileNet v3Large network to obtain an S-MobileNet network; on the basis of an algorithm prediction structure, the multi-convolution layer feature fusion is realized by using a feature pyramid structure, and the identification capability of a small-scale ship image target is improved; in the aspects of the regression loss function of the prediction frame and the inhibition of the non-maximum value of the target prediction frame, performing loss calculation by using the loss function of the fusion distance measurement index; and finally, in the process of calculating the loss value of the prediction frame, the classification loss and the position loss of the prediction frame are adjusted by using a label smoothing method, so that the overfitting phenomenon of the model is avoided.

The label smoothing method specifically corrects the calculated value of the cross entropy loss function, optimizes the excessive trust of partial error labels in the model training process, better calibrates each parameter of the network and improves the generalization capability of the recognition model. The label smoothing method specifically comprises the following calculation processes:

wherein, P_iAnd expressing the adjusted prediction probability, K expressing the total number of the categories to be classified, epsilon expressing the set hyper-parameter, i expressing a target prediction value, and y expressing a target true value. Since model predictions tend to be more confident of higher confidence in the prediction box, the hyperparameter of 0.5 is added to reduce the effect of the label smoothing method on the confidence value. As shown in table 1, based on the same inland ship image data, when the hyper parameter is set to 0.5, the ship identification experiment accuracy (the value of the mAP) is the highest.

TABLE 1 results of different superparametric experiments

Value of hyper-parameter	Experimental mAP values
		1.0	0.9427
0.8	0.9502
		0.6	0.9588
0.5	0.9637
		0.4	0.9541
0.2	0.9523

As shown in fig. 3, the S-MobileNet network is specifically obtained by the following steps: on the basis of the original 15 Bneck modules of the MobileNetv3Large network, the 5 th Bneck module and the 9 th Bneck module are removed to obtain a compressed feature extraction network, and the network is composed of 13 Bneck modules, so that the model parameters and the calculated amount are further reduced compared with the original network.

The S-MobileNet feature extraction network has operation logic from bottom to top and deep semantic information through forward convolution calculation, can effectively extract ship image features, and improves classification and positioning accuracy of ship targets. The input of the network is an RGB three-channel image with 608 multiplied by 608 resolution, the number of convolution channels, the size of the channels and the size of convolution kernels of each Bneck module of the network are shown in Table 2:

TABLE 2 Bnic Module introduction

Bneck number	Number of convolution channels	Size of channel	Convolution kernel size
				1	16	304×304	3×3
2	16	304×304	3×3
				3	24	152×152	3×3
4	24	152×152	5×5
				5	40	76×76	5×5
6	40	76×76	3×3
				7	80	38×38	3×3
8	80	38×38	3×3
				9	80	38×38	3×3
10	112	38×38	3×3
				11	112	38×38	5×5
12	160	19×19	5×5
				13	160	19×19	5×5

The Bneck module integrates a depth separable convolution, a lightweight attention model and an inverse residual error structure with a linear bottleneck, and simultaneously replaces a swish function with an h-swish activation function, so that the calculated amount is reduced, and the feature extraction capability is improved.

The characteristic pyramid structure means: after the S-MobileNet network is obtained, the prediction structure of the characteristic pyramid design algorithm is utilized to carry out splicing (Concat) on the convolution layers of the 6 th, 9 th and 13 th Bnegk modules of the characteristic extraction network. Meanwhile, the fusion of feature maps with different sizes and the output of feature information are completed by using standard convolution (Conv2D) and UpSampling (UpSamplling 2D). The prediction structure has top-down operation logic, 3 target prediction channels are constructed by transversely splicing the prediction structure with the information of the feature extraction network convolution layer, ship target prediction of different scales is respectively carried out, and the recognition capability of multi-scale ship image targets is improved.

Further, the loss function of the fused distance metric is: constructing an MIoU loss index by utilizing the intersection ratio of the target prediction frame and the real frame and the distance measurement index of the central points of the two frames, which specifically comprises the following steps:

wherein | P ≧ G | is the intersection area of the target prediction frame and the real frame, | P ≧ G | is the union area of the two frames, L_MIoUAs a function of MIoU loss, d₁Is the distance between the center points of the two frames, d₂Diagonal distance of two frames minimum closure area, b^gtThe center points of the predicted frame and the real frame, respectively, (x)₁,y₁)、(x₂,y₂) The coordinates of the center points of the prediction frame and the real frame are respectively.

In the embodiment of the invention, the inland ship identification model training process based on the lightweight neural network comprises the following specific steps:

s31, training an inland ship recognition model by applying a transfer learning method;

s32, in the model training process, calculating a model loss value in each iteration;

and S33, when the difference value of the model loss value in 10 iterations is smaller than the interruption threshold value, ending the model training.

Further, the transfer learning method refers to: in the initial stage of model training, only starting the last full connection layer of the S-MobileNet network, freezing all convolution layers of the Bnegk module, performing model pre-training, and updating the trained parameters; and after the pre-training is finished, starting all convolution modules of the S-MobileNet network to perform full convolution layer training.

The inland ship target identification system based on the lightweight neural network model is mainly used for realizing the embodiment of the method, and comprises the following steps:

Further functions of each module are detailed in the above embodiment of the method, and are not described in detail here.

According to the invention, by designing a lightweight ship target recognition model facing to an inland river environment, utilizing technologies such as a multi-feature fusion structure, label smoothing and transfer learning, and based on an inland ship image data set and a data enhancement method, accurate recognition of a multi-scale inland ship image target is realized, background information interference is effectively eliminated, target identification precision under adverse conditions such as ship shielding is improved, influence of human factors in ship recognition and positioning is reduced, dependence of ship target recognition on hardware equipment computing capacity is reduced, and processing capacity of video monitoring information of ships in the inland river environment is effectively improved.

Model training and validation examples:

in the above embodiment, the parameters of the model training process are specifically set as follows: in the training of the lightweight inland ship identification algorithm, the algorithm momentum (momentum) is set to be 0.9, firstly, in the migration learning stage, the batch size is set to be 30, 15 iterations are operated, the initial learning rate is set to be 10^-3(ii) a After all convolutional layers are turned on, due to the increase in the scale of the parameters, the batch is set to 6, the initial learning rate is set to 10^-4This phase has a total of 90 iterations.

In the above embodiment, the model trainingTwo types of loss convergence curves in the training process are shown in fig. 4, in which a loss convergence curve is obtained based on the training set, and a val _ loss convergence curve is obtained based on the validation set. In the transfer learning stage, the learning rate is kept at 10^-3The two types of losses converge to about 30; after the convolutional layer was all on, the learning rate was 10^-4Then, the two types of losses converge to about 7.4; when the epoch is 56, the learning rate is reduced to 10^-5The two types of losses converge to about 5.6; when the epoch is 85, the learning rate is reduced to 10^-6The two types of losses eventually converge to around 3.8. The data shows that the two types of losses of the model have small difference and have stronger overfitting resistance.

In order to further verify the effectiveness of the method, the method selects the average identification precision AP of each type of ship_iAnd carrying out quantitative evaluation on the average mAP of the average precision of various ships, wherein the calculation formula is as follows:

wherein tp is the number of correctly classified prediction frames and up-to-standard boundary positions; fp is the number of wrong classification of the prediction frame or unqualified boundary positions; fn is the number of real frames that are not predicted; pr is the accuracy, i.e. the proportion of the real target in the prediction result; re is recall rate, namely the proportion of the prediction result which can cover the real target at most; p (R) is a Pr-Re characteristic curve corresponding to each type of ship; n is the number of ship categories, and n is 5 in the experiment, which is respectively: ore carriers, container ships, cargo ships, fishing ships and passenger ships.

To better verify the effectiveness of the present invention, YOLO2, YOLOv3, Tiny-YOLO3 and YOLOv3-MobileNetv3 target algorithm (hereinafter, YOLOv3-ML algorithm) were compared and the results are shown in table 3:

TABLE 3 comparison of the results of the different tests

As can be seen from the data in Table 3, the 3 algorithms have good identification precision on container ships, ore sand ships and cargo ships, and compared with other types of ships, the three types of ships have large target size and are convenient for image feature extraction. Through comparative analysis of experimental results of YOLOv2 and YOLOv3, the Darknet-53 network can be found to greatly improve the identification capability of small-scale ships such as fishing boats, passenger boats and the like; the number of the convolutional layers is small, so that the extraction of ship target features is incomplete, deep semantic information is not available, and the experimental result of the algorithm is poor; the lightweight ship target identification model provided by the invention can effectively identify small-scale targets such as fishing ships and the like, and the model parameter and the calculated power consumption are only about 1/3 of YOLOv 3.

The experimental result of the inland ship identification model based on the lightweight neural network is shown in the figure, and as can be seen from the figure 5(a), the method can accurately identify small-scale ships such as fishing boats, passenger ships and the like, and target missing identification or error identification does not occur; as can be seen from fig. 5(b), the method has good identification capability for various large-scale ship targets, and effectively eliminates the interference of the near-shore background; as can be seen from FIG. 5(c), the ship image target under the ship shielding condition can be well identified, and the robustness is good.

It will be understood that modifications and variations can be made by persons skilled in the art in light of the above teachings and all such modifications and variations are intended to be included within the scope of the invention as defined in the appended claims.

Claims

1. A inland ship target identification method based on a lightweight neural network model is characterized by comprising the following steps:

and S4, recognizing the river ship target by using the trained model.

2. The inland ship target identification method based on the lightweight neural network model according to claim 1, characterized in that in the calculation process of the loss value of the prediction frame, the classification loss and the position loss of the prediction frame are adjusted by using a label smoothing method, so that overfitting is avoided.

3. The inland ship target identification method based on the lightweight neural network model according to claim 1, wherein the loss function of the fusion distance measurement index is a MIoU loss index function constructed by using the intersection ratio of a target prediction frame and a real frame and the distance measurement index of the central point of the two frames, and specifically comprises the following steps:

wherein | P ≧ G | is the intersection area of the target prediction frame and the real frame, | P ≧ G | is the union area of the two frames, L_MIoUAs a function of MIoU loss, d₁Is the distance between the center points of the two frames, d₂For two-frame minimum closure areaDiagonal distance, b^gtThe center points of the predicted frame and the real frame, respectively, (x)₁,y₁)、(x₂,y₂) The coordinates of the center points of the prediction frame and the real frame are respectively.

4. The inland ship target identification method based on the lightweight neural network model according to claim 2, characterized in that the label smoothing method specifically comprises the following calculation processes:

5. The inland ship target identification method based on the lightweight neural network model as claimed in claim 1, wherein the feature pyramid structure has a top-down operation logic, and specifically, 3 target prediction channels are constructed by transversely connecting with convolutional layer information of a feature extraction network part, so that the identification capability of a multi-scale image target is improved.

6. The inland ship target identification method based on the lightweight neural network model according to claim 1, characterized in that the model training process specifically comprises: performing model training by applying a transfer learning method; meanwhile, in the model training process, the model loss value is calculated in each iteration, and when the difference value of the model loss value in a certain number of training iterations is smaller than the interruption threshold value, the model training is finished.

7. The inland ship target identification method based on the lightweight neural network model according to claim 6, wherein the transfer learning method is as follows: in the initial stage of model training, only starting the last full connection layer of the S-MobileNet network, freezing the convolution layer of the Bnegk module, performing model pre-training, and storing parameters after pre-training; and after the pre-training is finished, starting all convolution modules of the S-MobileNet network to perform full convolution layer operation.

8. The inland ship target recognition method based on the lightweight neural network model as claimed in claim 1, wherein the ratio of the number of images in the training set to the number of images in the testing set is 8: 2.

9. The inland ship target recognition method based on the lightweight neural network model as claimed in claim 1, wherein in the model training process, operations including cutting, translation and scaling are performed on three or four images at random, the color saturation, brightness and contrast of the images are adjusted, the selected images are placed in a specified order, and after the processed images are combined into one image, training is performed.

10. An inland ship target recognition system based on a lightweight neural network model is characterized by comprising: