CN115909064A

CN115909064A - Marine ship target detection method based on deep learning in sea fog environment

Info

Publication number: CN115909064A
Application number: CN202211429553.XA
Authority: CN
Inventors: 王宁; 王元元; 唐路源; 吴伟; 赵红
Original assignee: Dalian Maritime University
Current assignee: Dalian Maritime University
Priority date: 2022-11-15
Filing date: 2022-11-15
Publication date: 2023-04-04

Abstract

The invention relates to a marine ship target detection method based on deep learning in a sea fog environment, comprising the following steps: acquiring an image to be processed that contains rain and fog; and performing rain and fog removal preprocessing on the image based on a GCANet deep learning network to obtain the image with rain and fog removed. The invention adds an image rain and fog removal preprocessing stage before target detection, which highlights the salient features of the target by improving the contrast between the target and the background.

Description

Marine ship target detection method based on deep learning in sea fog environment

Technical Field

The invention belongs to the technical field of computer vision, and particularly relates to a marine ship target detection method based on deep learning in a sea fog environment.

Background

China has a vast sea area and abundant marine resources, and the development and utilization of the oceans has always been a key concern. Target detection of marine ships is therefore of great research significance.

The marine environment is complex and changeable. To address the difficulty of target detection under sea fog, a deep learning method is applied to first defog the original image and then perform target detection on the defogged image, which improves the accuracy of ship target detection in a sea fog environment and achieves good recognition under complex sea fog conditions. Traditional image defogging algorithms fall into two main types: algorithms based on image enhancement [1] and algorithms based on image restoration [2]. Enhancement-based algorithms aim to remove image noise as far as possible and raise the image contrast so as to recover a clear image; their disadvantage is that the defogging is not targeted and the result is not faithful to the real scene. Restoration-based algorithms are mostly built on the atmospheric scattering model [3]: they analyze the factors that degrade the image and compensate for the loss caused by fog in order to restore the original image. They generally outperform enhancement-based algorithms but rely on prior knowledge. Both conventional approaches are computationally expensive and have poor robustness. With the rapid development of big data and deep learning, deep-learning-based defogging algorithms [4-9] no longer depend on prior knowledge: an end-to-end network directly regresses the residual between the foggy image and the clear image and thereby establishes the relation between the two, so that a foggy image fed into the network directly yields a clear fog-free image. Such methods are fast and robust.

Object detection is one of the main research directions in the field of computer vision. The marine ship target detection task aims to identify the position and category of ship targets in an acquired sea surface image. Target detection research currently follows two main directions: traditional algorithms based on digital image processing combined with machine learning [10], and algorithms based on deep convolutional neural networks [11]. A traditional ship target detection algorithm first generates candidate regions for feature extraction on the original image, then extracts suitable target features from the candidate regions, and finally trains a classifier such as Adaboost or a support vector machine to produce a recognition and localization model, which is applied to the image to be recognized [12]. While improving the recognition and localization accuracy of ship target detection, traditional algorithms neglect time complexity and cannot guarantee the timeliness of the system, so their practicability is low. Deep convolutional neural networks [13-19] do not need to generate specific candidate regions for feature extraction: they extract features of the target autonomously through multiple convolutions, the model can be trained and refined continuously, and the best model is finally selected. Compared with traditional algorithms, these detection algorithms detect targets quickly and remain robust while keeping high recognition accuracy, which greatly improves the practicability of the system.

Existing defogging algorithms are not targeted and cannot cope with the fog-prone marine environment. Important target information is lost after defogging, and the loss of detection-relevant information caused by image preprocessing is neither considered nor compensated, so the feature extraction capability of the network is insufficient.

Existing target detection models do not fully consider how the statistical shape characteristics of marine ship targets affect detection and recognition, so the false detection rate of deep-learning-based marine ship target detection is high.

Disclosure of Invention

To solve the above problems, the invention provides the following technical solution: a marine ship target detection method based on deep learning in a sea fog environment, comprising the following steps:

acquiring an image to be processed containing rain and fog;

performing rain and fog removal preprocessing on the image to be processed based on a GCANet deep learning network to obtain the image with rain and fog removed;

identifying the type and position of the marine ship target in the image with rain and fog removed by adopting an SEMSSD target detection model.

Further: the rain and fog removal preprocessing of the image to be processed based on the GCANet deep learning network proceeds as follows:

for the image to be processed containing rain and fog, a first convolution-normalization module with a 3 × 3 convolution kernel and a stride of 1 and a second convolution-normalization module with a 3 × 3 convolution kernel and a stride of 2 are used in succession to encode the image into a feature map;

the feature map is passed sequentially through smoothed dilated residual blocks with different dilation rates to obtain image features of different levels;

the features of different levels are fused through a gated fusion sub-network, which learns the weights of the low-, middle- and high-level features, to obtain a fused feature map;

the fused feature map is passed through a deconvolution layer that up-samples it to the original resolution, and is then converted back to image space through two convolution layers to obtain the fog residual of the marine ship target;

the fog residual is added to the original foggy input image to obtain the image with rain and fog removed.
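The gated fusion and residual addition steps above can be illustrated with a minimal sketch. It assumes that each feature level and each gate map is flattened to a list of floats, that the gate maps have already been predicted by the fusion sub-network, and that pixel values are normalized to [0, 1] so that the result can be clipped. The function names `gated_fusion` and `add_residual` are hypothetical and do not come from the patent.

```python
from typing import List, Sequence


def gated_fusion(levels: Sequence[Sequence[float]],
                 gates: Sequence[Sequence[float]]) -> List[float]:
    """Fuse feature levels as a per-position weighted sum of gate * feature."""
    if len(levels) != len(gates):
        raise ValueError("one gate map is needed per feature level")
    size = len(levels[0])
    fused = [0.0] * size
    for feat, gate in zip(levels, gates):
        for idx in range(size):
            fused[idx] += gate[idx] * feat[idx]
    return fused


def add_residual(foggy: Sequence[float], residual: Sequence[float]) -> List[float]:
    """Add the predicted fog residual to the foggy input and clip to [0, 1].

    Clipping is an assumption made here for normalized pixel values.
    """
    return [min(1.0, max(0.0, x + r)) for x, r in zip(foggy, residual)]


if __name__ == "__main__":
    low, mid, high = [1.0, 2.0], [3.0, 4.0], [5.0, 6.0]
    gates = [[0.5, 0.5], [0.25, 0.25], [0.25, 0.25]]
    print(gated_fusion([low, mid, high], gates))
    print(add_residual([0.8, 0.2], [0.3, -0.1]))
```

In a trained network the gate maps would be learned per pixel, so different image regions can favor low-, middle- or high-level features.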

Further: the SEMSSD target detection model takes a VGG network as its backbone network, and the VGG network comprises the 3rd convolution layer of the 4th convolution block, a channel attention module, a fully connected layer, three convolution layers and one average pooling layer;

the 3rd convolution layer of the 4th convolution block, the channel attention module and the fully connected layer are connected in series in that order;

three convolution layers are added in sequence after the fully connected layer, namely a first convolution layer, a second convolution layer and a third convolution layer;

one average pooling layer is connected in series after the third convolution layer.

Further: the SEMSSD target detection model predicts the target position by using prior frames, and the prior frames are determined in the following way:

collecting statistics on the shape characteristics of the marine ship targets in the images to be processed;

adjusting the shape of the prior frames based on these statistical characteristics of the marine ship targets; the statistical characteristics include aspect ratio information for the prior frames.

Further: the aspect ratios of 6 output layers of the SEMSSD target detection model are respectively as follows: [1, 2], [1, 2, 4], [1, 2, 3], [1, 2].

Further: the large number of detection frames of different sizes predicted by the feature layers are screened by non-maximum suppression to determine the detection frame and the category corresponding to each target in the image.

Further: the input feature map of the channel attention module SE is denoted P1. P1 is split into two paths: one path is left unprocessed and its output is denoted P2; the other path passes sequentially through global pooling, a fully connected layer, a ReLU function, a fully connected layer and a Sigmoid function, and its output is denoted P3. P2 and P3 are added to obtain the output P4 of the channel attention module SE, and the dimensions of P4 are consistent with those of the input P1.

A marine ship target detection device based on deep learning in a sea fog environment comprises:

an acquisition module, used for acquiring an image to be processed containing rain and fog;

a rain and fog preprocessing module, used for performing rain and fog removal preprocessing on the image to be processed based on the GCANet deep learning network to obtain the image with rain and fog removed;

a detection and identification module, used for identifying the type and position of the marine ship target in the image with rain and fog removed by adopting an SEMSSD target detection model.

The marine ship target detection method based on deep learning in the sea fog environment provided by the invention has the following advantages:

1. Rain and fog are frequent at sea. Targets photographed in rain and fog have blurred features, which inevitably makes feature extraction for marine targets difficult, and the extraction of effective features is closely tied to the subsequent detection and recognition accuracy. For this reason, an image defogging preprocessing stage is added before target detection, which highlights the salient features of the target by improving the contrast between the target and the background. Compared with traditional defogging models, the GCANet deep learning defogging network not only removes fog from the foggy image but also keeps the resulting image clear, which helps improve the detection and recognition accuracy of ship targets in a sea fog environment.

2. Because the image defogging preprocessing loses important information that is beneficial to detection, a channel attention module SE is introduced at the basic convolution layer that is rich in feature information. Through an attention mechanism, the module learns the importance of each feature channel, which improves the feature extraction capability of the network and the subsequent target detection and recognition accuracy.

3. In marine target detection based on convolutional neural networks, making full use of statistical information such as target shape allows network training to converge faster. For this reason, the generic prior frames are abandoned and reasonable prior frames are determined through statistical analysis of the shape characteristics of the ship target data set, which effectively reduces the false detection rate for marine ship targets.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the description of the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.

FIG. 1 is a flow chart of the method;

FIG. 2 is a diagram of a GCANet network architecture;

FIG. 3 is a diagram of the G-SEMSSD model architecture;

FIG. 4 (a) is the 2D aspect ratio distribution of the MVDD13 dataset, and (b) is the 3D aspect ratio distribution of the MVDD13 dataset;

FIG. 5 shows the detection results of the various models for each ship category: (a) fishing boat; (b) sailing boat; (c) fire fighting ship; (d) cargo vessel; (e) container ship; (f) warship; (g) passenger ship; (h) submarine; (i) bulk carrier; (j) support ship; (k) drillship; (l) tanker; (m) pleasure boat.

Detailed Description

It should be noted that, provided there is no conflict, the embodiments of the present invention and the features of the embodiments may be combined with each other. The present invention will be described in detail below with reference to the accompanying drawings and embodiments.

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the exemplary embodiments according to the invention. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should further be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.

The relative arrangement of the components and steps, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless specifically stated otherwise. Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description. Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate. Any specific values in all examples shown and discussed herein are to be construed as exemplary only and not as limiting. Thus, other examples of the exemplary embodiments may have different values. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be discussed further in subsequent figures.

In the description of the present invention, it is to be understood that the orientation or positional relationship indicated by directional terms such as "front, rear, upper, lower, left, right", "lateral, vertical, horizontal" and "top, bottom" is generally based on the orientation or positional relationship shown in the drawings and is used only for convenience and simplicity of description. In the absence of any contrary indication, these directional terms do not indicate or imply that the device or element referred to must have a particular orientation or be constructed and operated in a particular orientation, and therefore should not be considered as limiting the scope of the present invention. The terms "inner" and "outer" refer to the inside and outside relative to the profile of the respective component itself.

For ease of description, spatially relative terms such as "over …", "above …", "on the upper surface of …", "above", etc. may be used herein to describe the spatial relationship of one device or feature to another device or feature as shown in the figures. It will be understood that the spatially relative terms are intended to encompass different orientations of the device in use or operation in addition to the orientation depicted in the figures. For example, if a device in the figures is turned over, devices described as "above" or "on" other devices or configurations would then be oriented "below" or "under" the other devices or configurations. Thus, the exemplary term "above …" may include both orientations "above …" and "below …". The device may be otherwise variously oriented (rotated 90 degrees or at other orientations) and the spatially relative descriptors used herein interpreted accordingly.

It should be noted that the terms "first", "second", and the like are used to define the components only for convenience of distinguishing the corresponding components. Unless otherwise stated, these terms have no special meaning and therefore should not be construed as limiting the scope of the present invention.

FIG. 1 is a flow chart of the method.

A marine ship target detection method based on deep learning in a sea fog environment comprises the following steps:

S1, acquiring an image to be processed containing rain and fog;

S2, performing rain and fog removal preprocessing on the image to be processed based on a GCANet deep learning network to obtain the image with rain and fog removed;

S3, identifying the type and position of the marine ship target in the image with rain and fog removed by adopting the trained SEMSSD target detection model.

Steps S1, S2 and S3 are executed in sequence.
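The S1 to S3 sequence can be sketched as a two-stage pipeline. This is only an illustration under stated assumptions: the `Image` and `Detection` types, the function `detect_ships` and its `defog` and `detect` arguments are hypothetical placeholders standing in for a GCANet-style defogging network and an SEMSSD-style detector.

```python
from typing import Callable, List, Tuple

# Hypothetical types: a grayscale image as nested lists and a detection as
# (class name, confidence score, (xmin, ymin, xmax, ymax)).
Image = List[List[float]]
Detection = Tuple[str, float, Tuple[float, float, float, float]]


def detect_ships(image: Image,
                 defog: Callable[[Image], Image],
                 detect: Callable[[Image], List[Detection]]) -> List[Detection]:
    """Run S2 (defogging) and then S3 (detection) on an image acquired in S1."""
    clear = defog(image)   # S2: rain and fog removal preprocessing
    return detect(clear)   # S3: ship type and position identification


if __name__ == "__main__":
    # Stand-in callables; a real system would plug in trained networks here.
    brighten = lambda img: [[min(1.0, v + 0.1) for v in row] for row in img]
    dummy_detector = lambda img: [("cargo", 0.9, (0.0, 0.0, 1.0, 1.0))]
    print(detect_ships([[0.5]], brighten, dummy_detector))
```

Passing the two stages as callables keeps the defogging network and the detector independently replaceable, which matches the way the comparison experiments later switch each component on and off.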

FIG. 2 is a diagram of the GCANet network architecture.

Further, the rain and fog removal preprocessing of the image to be processed based on the GCANet deep learning network proceeds as follows:

S11, for the image to be processed containing rain and fog, encoding the image into a feature map using a first convolution-normalization module comprising two 3 × 3 convolutions with a stride of 1, followed by a second convolution-normalization module comprising one 3 × 3 convolution with a stride of 2, which reduces the width and the height of the feature map to half of the original;

S12, passing the feature map sequentially through smoothed dilated residual blocks with different dilation rates to obtain image features of different levels;

the encoded feature map may pass sequentially through 7 smoothed dilated residual blocks with dilation rates of 2, 4 and 1 to obtain image features of different levels;

S13, fusing the features of different levels through a gated fusion sub-network, which learns the weights of the low-, middle- and high-level features, to obtain a fused feature map;

S14, passing the fused feature map through a deconvolution layer that up-samples it to the original resolution, and converting it back to image space through two convolution layers to obtain the fog residual of the marine ship target;

the stride of the deconvolution layer may be 1/2;
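To make the resolution changes in S11 to S14 concrete, the following sketch tracks the spatial size of the feature map through the network using the usual convolution output-size arithmetic. Several details are assumptions not stated in the text: padding equal to the dilation rate for every 3 × 3 convolution, the split of the 7 dilated blocks into rates (2, 2, 2, 4, 4, 4, 1), a transposed convolution with kernel 4, stride 2 and padding 1, and 3 × 3 kernels for the two final convolutions. All function names are hypothetical.

```python
from typing import Dict


def conv_out(size: int, kernel: int, stride: int = 1,
             padding: int = 0, dilation: int = 1) -> int:
    """Output size of a convolution along one spatial dimension."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def deconv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output size of a transposed convolution along one spatial dimension."""
    return (size - 1) * stride - 2 * padding + kernel


def gcanet_spatial_sizes(size: int) -> Dict[str, int]:
    """Track one spatial dimension through the S11 to S14 stages."""
    sizes = {"input": size}
    s = conv_out(size, 3, stride=1, padding=1)   # S11, first module, conv 1
    s = conv_out(s, 3, stride=1, padding=1)      # S11, first module, conv 2
    s = conv_out(s, 3, stride=2, padding=1)      # S11, second module (halves size)
    sizes["encoded"] = s
    for rate in (2, 2, 2, 4, 4, 4, 1):           # S12, assumed rate split
        s = conv_out(s, 3, padding=rate, dilation=rate)
    sizes["dilated"] = s
    s = deconv_out(s, kernel=4, stride=2, padding=1)  # S14, assumed deconv shape
    s = conv_out(s, 3, padding=1)                # S14, first output convolution
    s = conv_out(s, 3, padding=1)                # S14, second output convolution
    sizes["output"] = s
    return sizes


if __name__ == "__main__":
    print(gcanet_spatial_sizes(256))
```

Under these assumptions an even input size is halved by the encoder and restored exactly by the deconvolution, so the fog residual has the same resolution as the input image and can be added to it pixel by pixel.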

FIG. 3 is a diagram of the G-SEMSSD model architecture.

Further, the SEMSSD target detection model takes a VGG network as its backbone network, and the VGG network comprises the 3rd convolution layer of the 4th convolution block, a channel attention module, a fully connected layer, three convolution layers and one average pooling layer;

one average pooling layer is connected in series after the third convolution layer.

Further, regarding the channel attention module: foggy images suffer from reduced contrast and brightness, so target edges in the image are often blurred and texture features are indistinct, and the image defogging preprocessing loses important information that is beneficial to detection. The channel attention module SE is therefore introduced at the basic VGG convolution layer. The input feature map of the SE module is denoted P1. P1 is split into two paths: one path is left unprocessed and its output is denoted P2; the other path passes sequentially through global pooling, a fully connected layer, a ReLU function, a fully connected layer and a Sigmoid function, and its output is denoted P3. P2 and P3 are added to obtain the output P4 of the SE module, and the dimensions of P4 are consistent with those of the input P1.
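The SE data flow from P1 to P4 can be sketched in plain Python. The sketch assumes that the feature map is a list of channels, each flattened to a list of floats, that global pooling is average pooling, and that the two fully connected layers have no bias terms. The text states that P2 and P3 are added, so `combine="add"` follows that wording; the commonly published squeeze-and-excitation block instead rescales each channel by its gate, which is offered as `combine="scale"`. Which variant the authors' implementation uses cannot be determined from the text. The name `se_block` and the weight layout are hypothetical.

```python
import math
from typing import List, Sequence


def se_block(p1: Sequence[Sequence[float]],
             w1: Sequence[Sequence[float]],
             w2: Sequence[Sequence[float]],
             combine: str = "add") -> List[List[float]]:
    """Channel attention: pooling -> FC -> ReLU -> FC -> Sigmoid -> combine.

    p1: C channels, each a flat list of H*W values.
    w1: hidden x C weight matrix of the first fully connected layer.
    w2: C x hidden weight matrix of the second fully connected layer.
    """
    # Global average pooling: one value per channel.
    pooled = [sum(ch) / len(ch) for ch in p1]
    # First fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * x for w, x in zip(row, pooled))) for row in w1]
    # Second fully connected layer followed by Sigmoid gives P3.
    p3 = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
          for row in w2]
    if combine == "add":      # as worded in the text: P4 = P2 + P3
        return [[v + g for v in ch] for ch, g in zip(p1, p3)]
    if combine == "scale":    # standard SE variant: P4 = P2 * P3
        return [[v * g for v in ch] for ch, g in zip(p1, p3)]
    raise ValueError("combine must be 'add' or 'scale'")


if __name__ == "__main__":
    feature_map = [[1.0, 3.0], [2.0, 2.0]]   # 2 channels, 2 positions each
    print(se_block(feature_map, [[0.0, 0.0]], [[0.0], [0.0]]))
```

In both variants the per-channel gate is broadcast over all spatial positions, so P4 keeps the same shape as P1.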

The training process of the SEMSSD target detection model is as follows:

GCANet is added at the front end of the SEMSSD target detection model, which serves as the ship target detection network, as an image defogging preprocessing stage. The size of the input image is unchanged by defogging. The defogged image is scaled to 300 × 300 and then fed into the VGG backbone network of the SSD model, and an SE module attached after the Conv4_3 layer (the 3rd convolution layer of the 4th convolution block) yields a feature map F1 of shape 38 × 38 × 512.

F1 has two branches: one branch is normalized and output as M1; the other passes through the FC7 layer of VGG to give a 19 × 19 × 1024 feature map M2. M2 passes through a convolution layer to give a 10 × 10 × 512 feature map M3; M3 passes through a convolution layer to give a 5 × 5 × 256 feature map M4; M4 passes through a convolution layer to give a 3 × 3 × 256 feature map M5;

M5 is average-pooled to give a 1 × 1 × 256 feature map M6.

At this point the network has six output feature maps M1, M2, M3, M4, M5 and M6. The SSD model locates targets by using prior frames: prior frames of different sizes are placed on each feature map, and after the confidence and position offset of the prior frames are obtained, non-maximum suppression is applied to obtain the prediction frame closest to the real position. Assuming that the network uses m feature maps for prediction, the prior frames of each feature map are determined as described below.

Further: the SEMSSD target detection model locates the target by using prior frames, and the prior frames are determined as follows:

collecting statistics on the target characteristics of the marine ship targets in the images to be processed;

adjusting the shape of the prior frames based on these statistical characteristics of the marine ship targets, wherein the statistical characteristics may be aspect ratio information;

the prior frame aspect ratio (width-to-height ratio, AR) in the original SSD model is derived from generic data sets and is therefore not suited to the ship target detection task in a marine environment, so the statistical characteristics of the targets need to be obtained from a ship data set. Taking the MVDD13 data set as an example, the AR of the ground-truth target frames is counted.

FIG. 4 (a) shows the 2D aspect ratio distribution of the MVDD13 dataset and (b) shows the 3D aspect ratio distribution. The target AR distribution is high in the middle and low on both sides, similar to a normal distribution with mean μ = 2.33 and variance σ² = 1.77. Approximately 68.2% of the targets then fall in the interval [μ − σ, μ + σ], i.e. approximately [1.00, 3.66], since σ = √1.77 ≈ 1.33.

According to the results of the comparative experiment, the AR distributions of 6 output layers were obtained as: [1, 2], [1, 2, 4], [1, 2, 3], [1, 2].
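The statistical analysis of ground-truth aspect ratios described above can be sketched as follows. The box format (xmin, ymin, xmax, ymax) and the function names `aspect_ratio_stats` and `one_sigma_interval` are assumptions made for illustration; the sample boxes in the usage example are made up and are not taken from MVDD13.

```python
import math
from statistics import fmean, pvariance
from typing import Iterable, Tuple

Box = Tuple[float, float, float, float]  # assumed (xmin, ymin, xmax, ymax)


def aspect_ratio_stats(boxes: Iterable[Box]) -> Tuple[float, float]:
    """Return the mean and population variance of width / height."""
    ratios = [(x2 - x1) / (y2 - y1) for x1, y1, x2, y2 in boxes]
    mu = fmean(ratios)
    return mu, pvariance(ratios, mu)


def one_sigma_interval(mu: float, variance: float) -> Tuple[float, float]:
    """Return [mu - sigma, mu + sigma] for a given mean and variance."""
    sigma = math.sqrt(variance)
    return mu - sigma, mu + sigma


if __name__ == "__main__":
    # Illustrative boxes only.
    print(aspect_ratio_stats([(0, 0, 20, 10), (0, 0, 40, 10)]))
    # Statistics quoted in the text for MVDD13.
    low, high = one_sigma_interval(2.33, 1.77)
    print(round(low, 2), round(high, 2))
```

With the quoted mean and variance, the one-sigma interval spans roughly 1 to 3.66, which is in line with aspect ratios such as 1, 2 and 3 appearing in the per-layer settings.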

The prior frame scale calculation formula is as follows:

$$s_k = s_{\min} + \frac{s_{\max} - s_{\min}}{m - 1}\,(k - 1), \quad k \in [1, m]$$

where m is the number of feature maps used for prediction, and s_min = 0.2 and s_max = 0.9 represent the minimum and maximum proportion of the prior box to the area of the feature map, respectively. Since the detected targets have different ARs, different ARs are set in the six output feature maps according to the above.
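As a worked example of the scale setting, the sketch below spaces the scales of the six output feature maps linearly between s_min and s_max, which is the scheme of the original SSD formulation and is assumed here to be what the text refers to. The function name `prior_scales` is hypothetical.

```python
from typing import List


def prior_scales(num_maps: int = 6, s_min: float = 0.2,
                 s_max: float = 0.9) -> List[float]:
    """Linearly spaced prior box scales, one per output feature map."""
    if num_maps < 2:
        raise ValueError("at least two feature maps are needed")
    step = (s_max - s_min) / (num_maps - 1)
    return [s_min + step * k for k in range(num_maps)]


if __name__ == "__main__":
    # Rounded for display; the step between consecutive maps is 0.14.
    print([round(s, 2) for s in prior_scales()])
```

The shallow, high-resolution map M1 thus receives the smallest boxes and the 1 × 1 map M6 the largest, so small and large ships are handled by different layers.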

The width w and height h of each prior box are calculated using the following two equations:

wherein: s is the size of the original picture;

The center coordinate of each prior box is

$$\left(\frac{i + 0.5}{|f_k|},\ \frac{j + 0.5}{|f_k|}\right), \quad i, j \in [0, |f_k|)$$

where |f_k| represents the size of the k-th feature map, and i and j are position coordinates on the feature map. For each position of the six output feature maps, 4, 6, 6, 6, 4 and 4 prior frames of different sizes are applied respectively, according to the corresponding ARs given in Table 1, and the number of prior frames is 8732 in total.
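The total of 8732 prior frames can be checked with a short sketch. It assumes the six feature map sizes 38, 19, 10, 5, 3 and 1 and 4, 6, 6, 6, 4 and 4 boxes per position, which is the only assignment of 4 or 6 boxes per layer that sums to 8732 for these sizes. The center and shape formulas follow the original SSD convention (centers at cell midpoints, width s·√AR and height s/√AR in normalized coordinates) and are assumptions here, since the patent's own width and height equations are not reproduced in this text. All names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

FEATURE_MAP_SIZES = (38, 19, 10, 5, 3, 1)   # M1 to M6
BOXES_PER_POSITION = (4, 6, 6, 6, 4, 4)     # assumed per-layer counts


def count_priors(sizes: Sequence[int] = FEATURE_MAP_SIZES,
                 per_position: Sequence[int] = BOXES_PER_POSITION) -> int:
    """Total number of prior boxes over all output feature maps."""
    return sum(f * f * n for f, n in zip(sizes, per_position))


def prior_centers(fmap_size: int) -> List[Tuple[float, float]]:
    """Normalized centers ((i + 0.5) / |f_k|, (j + 0.5) / |f_k|) of all cells."""
    return [((i + 0.5) / fmap_size, (j + 0.5) / fmap_size)
            for j in range(fmap_size) for i in range(fmap_size)]


def prior_shape(scale: float, aspect_ratio: float) -> Tuple[float, float]:
    """Normalized (width, height) under the SSD convention (an assumption)."""
    root = math.sqrt(aspect_ratio)
    return scale * root, scale / root


if __name__ == "__main__":
    print(count_priors())
    print(prior_centers(1))
    print(prior_shape(0.2, 4.0))
```

Most of the boxes come from the 38 × 38 map, which is why the aspect ratios chosen for the shallow layers matter most for small, distant ships.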

TABLE 1 comparison of test results before and after changing the Prior frame AR

As can be seen from Table 1, the mAP@.5 of MSSD is 3.7% higher than that of SSD. This shows that prior frames derived from data statistics improve the detection performance of SSD to some extent, and that fine-tuning the AR has a positive effect on the model.

Among all 8732 prior frames, a single detection frame must be determined for each target, so non-maximum suppression is used for screening. The steps are as follows:

(1) Sort all prior frames by confidence score and find the prior frame with the highest score.

(2) Set the IoU threshold to 0.5, traverse the other prior frames, and delete any frame whose IoU with the highest-scoring frame determined in step (1) is greater than 0.5.

(3) Select the highest-scoring frame from the remaining frames and repeat steps (1) and (2) until no frame can be deleted.
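The three screening steps can be sketched as a greedy loop. This is a minimal single-class illustration, assuming boxes in (xmin, ymin, xmax, ymax) form; the names `nms` and `_iou` are hypothetical, and a practical detector would run the loop once per class.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed (xmin, ymin, xmax, ymax)


def _iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Return indices of the boxes kept by greedy non-maximum suppression."""
    # Step (1): sort by confidence, highest first.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Step (2): drop boxes overlapping the current best beyond the threshold.
        order = [i for i in order if _iou(boxes[best], boxes[i]) <= iou_threshold]
        # Step (3): the loop repeats with the highest-scoring remaining box.
    return keep


if __name__ == "__main__":
    candidates = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    print(nms(candidates, [0.9, 0.8, 0.7]))
```

In the usage example the second box overlaps the first with an IoU of about 0.68 and is suppressed, while the third box does not overlap and is kept.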

Further, the settings for the SEMSSD target detection model training process are as follows. The optimizer is stochastic gradient descent and training runs for 300 epochs in total. The weights of the backbone network are frozen for the first 50 epochs, so that only part of the weights are updated; after 50 epochs all weights are unfrozen and updated. The learning rate decreases linearly as the number of training epochs increases, from an initial value of 2 × 10⁻³ to a final value of 1 × 10⁻⁴. Each batch contains 8 pictures. An epoch is complete when all training set pictures have been used once, and after each epoch the network is evaluated on the validation set pictures.
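The training schedule can be written down as two small helper functions. The sketch assumes that epochs are numbered from 0, that the learning rate is updated once per epoch, and that the final value is reached in the last epoch; none of these details is stated in the text. The names `learning_rate` and `backbone_frozen` are hypothetical.

```python
TOTAL_EPOCHS = 300     # total training epochs
FREEZE_EPOCHS = 50     # backbone weights frozen during these first epochs
LR_START = 2e-3        # initial learning rate
LR_END = 1e-4          # final learning rate
BATCH_SIZE = 8         # pictures per batch


def learning_rate(epoch: int) -> float:
    """Linearly interpolate the learning rate for a 0-indexed epoch."""
    if not 0 <= epoch < TOTAL_EPOCHS:
        raise ValueError("epoch out of range")
    fraction = epoch / (TOTAL_EPOCHS - 1)
    return LR_START + (LR_END - LR_START) * fraction


def backbone_frozen(epoch: int) -> bool:
    """True while the backbone weights are excluded from updates."""
    return epoch < FREEZE_EPOCHS


if __name__ == "__main__":
    for epoch in (0, 49, 50, 299):
        print(epoch, backbone_frozen(epoch), learning_rate(epoch))
```

Freezing the backbone first lets the newly added layers adapt before the pretrained features are fine-tuned.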

In order to objectively reflect the improvement of the ship target detection effect in the sea fog environment, the mAP is used as the index for evaluating the detection precision, and it is calculated as follows:

IoU (intersection over union) is the ratio of the intersection to the union of the areas of the frame predicted by the SSD network and the real frame. A larger IoU indicates that the predicted frame fits the real frame better, and an IoU greater than the threshold of 0.5 indicates that the prediction is correct. The specific formula is as follows:

$$\mathrm{IoU} = \frac{\mathrm{area}(B_p \cap B_{gt})}{\mathrm{area}(B_p \cup B_{gt})}$$

where B_p and B_gt are the prediction box and the real box, respectively.
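A minimal sketch of the IoU computation and the 0.5 correctness test follows, assuming axis-aligned boxes given as (xmin, ymin, xmax, ymax). The function names `iou` and `is_correct` are hypothetical.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed (xmin, ymin, xmax, ymax)


def iou(box_p: Box, box_gt: Box) -> float:
    """Ratio of intersection area to union area of two boxes."""
    inter_w = max(0.0, min(box_p[2], box_gt[2]) - max(box_p[0], box_gt[0]))
    inter_h = max(0.0, min(box_p[3], box_gt[3]) - max(box_p[1], box_gt[1]))
    inter = inter_w * inter_h
    area_p = (box_p[2] - box_p[0]) * (box_p[3] - box_p[1])
    area_gt = (box_gt[2] - box_gt[0]) * (box_gt[3] - box_gt[1])
    union = area_p + area_gt - inter
    return inter / union if union > 0 else 0.0


def is_correct(box_p: Box, box_gt: Box, threshold: float = 0.5) -> bool:
    """A prediction counts as correct when its IoU exceeds the threshold."""
    return iou(box_p, box_gt) > threshold


if __name__ == "__main__":
    # Two 10 x 10 boxes shifted by half a width share 50 of 150 units of area.
    print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
    print(is_correct((0, 0, 10, 10), (5, 0, 15, 10)))
```

A box shifted by half its width therefore reaches an IoU of only one third and would not count as a correct prediction at the 0.5 threshold.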

S42, AP (Average Precision).

At a given IoU threshold, the recall R and the precision P can be determined accordingly, and for each vessel type a corresponding P-R curve can be drawn. AP is defined as the area enclosed by the P-R curve:

$$\mathrm{AP} = \int_0^1 P(R)\,\mathrm{d}R$$

mAP (mean Average Precision). The mAP is the average of the APs of all target types:

$$\mathrm{mAP} = \frac{1}{C}\sum_{i=1}^{C}\mathrm{AP}_i$$

where AP_i is the AP value of the i-th class and C is the total number of classes. The evaluation index selected by the invention is the mAP score at an IoU threshold of 0.5, denoted mAP@.5.
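The AP and mAP definitions can be sketched numerically. The sketch assumes that each detection of a class has already been marked as a true or false positive by IoU matching at the 0.5 threshold, and it integrates the P-R curve with a monotone precision envelope (the all-point interpolation used in Pascal VOC style evaluation). The interpolation choice is an assumption, since the text only specifies the area under the curve. The function names are hypothetical.

```python
from typing import Dict, Sequence


def average_precision(scores: Sequence[float],
                      is_true_positive: Sequence[bool],
                      num_ground_truth: int) -> float:
    """Area under the precision-recall curve for one class."""
    if num_ground_truth == 0:
        return 0.0
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    recalls, precisions = [], []
    true_positives = 0
    for rank, idx in enumerate(order, start=1):
        if is_true_positive[idx]:
            true_positives += 1
        recalls.append(true_positives / num_ground_truth)
        precisions.append(true_positives / rank)
    # Precision envelope: make precision non-increasing as recall grows.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    area, previous_recall = 0.0, 0.0
    for recall, precision in zip(recalls, precisions):
        area += (recall - previous_recall) * precision
        previous_recall = recall
    return area


def mean_average_precision(ap_per_class: Dict[str, float]) -> float:
    """Mean of the per-class AP values."""
    return sum(ap_per_class.values()) / len(ap_per_class)


if __name__ == "__main__":
    ap = average_precision([0.9, 0.8, 0.7], [True, False, True], num_ground_truth=2)
    print(ap)
    print(mean_average_precision({"cargo": ap, "tanker": 1.0}))
```

In the usage example the first and third detections are correct, giving an AP of 5/6 for that class.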

A marine ship target detection device based on deep learning in a sea fog environment comprises:

an acquisition module, used for acquiring an image to be processed containing rain and fog;

an identification module, used for identifying the type and position of the marine ship target in the image to be processed by adopting the trained SEMSSD target detection model.

To demonstrate in detail the effectiveness and superiority of the model provided by the invention, the following 7 groups of comparison experiments were carried out. The experimental models are as follows:

(1) G-SSD: the original SSD network trained on pictures defogged by the GCANet module;

(2) MSSD: an SSD network whose prior frames are adjusted according to the MVDD13 data set;

(3) SESSD: an SSD network with an SE module embedded in its convolution layer;

(4) SE-MSSD: an MSSD network with an SE module embedded in its convolution layer;

(5) G-MSSD: the MSSD network trained on pictures defogged by the GCANet module;

(6) G-SESSD: the SESSD network trained on pictures defogged by the GCANet module;

(7) G-SEMSSD: the SE-MSSD network trained on pictures defogged by the GCANet module;
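The seven experimental models and the SSD baseline can be read as all combinations of three switches. The table below is an illustrative configuration sketch only; the key names `defog`, `se` and `tuned_priors` are hypothetical labels for GCANet defogging, the SE module and the MVDD13-based prior frames.

```python
from typing import Dict

# Each variant is described by three on/off components.
VARIANTS: Dict[str, Dict[str, bool]] = {
    "SSD":      {"defog": False, "se": False, "tuned_priors": False},
    "G-SSD":    {"defog": True,  "se": False, "tuned_priors": False},
    "MSSD":     {"defog": False, "se": False, "tuned_priors": True},
    "SESSD":    {"defog": False, "se": True,  "tuned_priors": False},
    "SE-MSSD":  {"defog": False, "se": True,  "tuned_priors": True},
    "G-MSSD":   {"defog": True,  "se": False, "tuned_priors": True},
    "G-SESSD":  {"defog": True,  "se": True,  "tuned_priors": False},
    "G-SEMSSD": {"defog": True,  "se": True,  "tuned_priors": True},
}


def components(name: str) -> str:
    """List the components enabled for a named variant."""
    enabled = [key for key, on in VARIANTS[name].items() if on]
    return ", ".join(enabled) if enabled else "baseline"


if __name__ == "__main__":
    for name in VARIANTS:
        print(f"{name}: {components(name)}")
```

Laying the variants out this way makes it easy to see that each comparison in the experiments isolates one component at a time.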

The experiments also use mAP@.5 as the evaluation index, and the results are given in the following table:

TABLE 2 comparison of detection accuracy for different models

Compared with the original SSD, the mAP of the G-SSD, MSSD and SESSD models is improved by 1.00%, 3.70% and 3.96%, respectively, so each of the three techniques improves the detection performance of the SSD model to some extent. In particular, the detection performance of the MSSD model is clearly improved because reasonable default frames are determined from the statistical characteristics of the data. In addition, the mAP of G-MSSD is 3.1% higher than that of G-SSD, which further indicates that adjusting the prior frame aspect ratio benefits detection. As can be seen from Table 2, G-SESSD has a more marked advantage over the G-SSD model, which shows that the SE module effectively improves detection performance. Although the G-SESSD is slightly inferior to the SESSD in the SE-MSSD, the G-SEMSSD algorithm has obvious advantages.

For each ship category in the MVDD13 data set, the results of the different models are shown in Fig. 5, whose panels give the detection results of the various models for: (a) fishing boats; (b) sailing boats; (c) fire-fighting ships; (d) cargo ships; (e) container ships; (f) warships; (g) passenger ships; (h) submarines; (i) bulk carriers; (j) support ships; (k) drillships; (l) tankers; (m) pleasure boats.

Except for container ships and warships, the G-SEMSSD model balances precision and recall well on the other ship types, and on fire-fighting ships, sailing boats and fishing boats it far exceeds the other models. The ship types in the data set have distinctive shape characteristics: the width of most ships is larger than the height, so the preset aspect ratios favor accurate detection by the algorithm.

In conclusion, the G-SEMSSD model provided by the invention can obtain higher detection precision in the sea fog environment, and can be applied to ship detection tasks in the sea fog environment.

Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

1. Wu D, Zhu Q S. The latest research progress of image dehazing. Acta Automatica Sinica, 2015, 41(2): 221-239.

2. Gan J J, Xiao C X. Fast image dehazing based on accurate scattering map. Journal of Image and Graphics, 2013, 18(5): 583-590.

3. He K, Sun J, Tang X, et al. Single image haze removal using dark channel prior. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, 33(12): 2341-2353.

4. Chen D, He M, Fan Q, Liao J, Zhang L, Hou D, Yuan L, Hua G. Gated context aggregation network for image dehazing and deraining. 2019 IEEE Winter Conference on Applications of Computer Vision, 2019, 1375-1383.

5. Cai B, Xu X, Jia K, Qing C, Tao D. Dehazenet: An end-to-end system for single image haze removal. IEEE Transactions on Image Processing, 2016, 25(11): 5187-5198.

6. Li B, Peng X, Wang Z, Xu J, Feng D. An all-in-one network for dehazing and beyond. arXiv preprint arXiv:1707.06543, 2017.

7. Zhang H, Patel V M. Densely connected pyramid dehazing network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, 3194-3203.

8. Li C, Guo C, Guo J, Han P, Fu H, Cong R. PDR-Net: Perception-inspired single image dehazing network with refinement. IEEE Transactions on Multimedia, 2020, 22(3): 704-716.

9. Chen X, Lu Y, Wu Z, Yu J, Wen L. Reveal of domain effect: How visual restoration contributes to object detection in aquatic scenes. arXiv preprint arXiv:2003.01913, 2020.

10. Isaza C, Anaya K, Fuentes-Silva C, et al. Dynamic set point model for driver alert state using digital image processing. Multimedia Tools and Applications, 2019, 78(14): 19543-19563.

11. Sun W F, Dai Y S, Ji Y G, et al. New ideas for HFSWR maritime target tracking. Marine Sciences, 2017, 41(6): 144-149.

12. Zhang H, Mao H, Qiu D. Feature extraction for the stored-grain insect detection system based on image recognition technology. Transactions of the Chinese Society of Agricultural Engineering, 2009, 25(2): 126-130.

13. Huang H, Zhou H, Yang X, Zhang L, Qi L, Zang A Y. Faster R-CNN for marine organisms detection and recognition using data augmentation. Neurocomputing, 2019, 337: 372-384.

14. He K, Zhang X, Ren S, Sun J. Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916.

15. Ren S, He K, Girshick R, Sun J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.

16. Redmon J, Farhadi A. Yolo9000: Better, faster, stronger. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017, 6517-6525.

17. Liu W, Anguelov D, Erhan D, Szegedy C, Berg A C. SSD: Single shot multibox detector. In: European Conference on Computer Vision, 2015, 21-37.

18. Redmon J, Farhadi A. Yolov3: An incremental improvement. arXiv preprint arXiv:1804.02767, 2018.

19. Bochkovskiy A, Wang C Y, Liao H Y. Yolov4: Optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934, 2020.

Claims

1. A marine ship target detection method based on deep learning in a sea fog environment, characterized by comprising the following steps:

acquiring an image to be processed containing rain and fog;

performing rain and fog removal preprocessing on the image to be processed containing rain and fog based on a GCANet deep learning network, to obtain the image to be processed after rain and fog removal;

and identifying the type and position of the marine ship target in the image to be processed after rain and fog removal by adopting an SEMSSD target detection model.
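By way of non-limiting illustration, the order of the steps above can be sketched as a two-stage composition. The callables `dehaze` and `detect` are hypothetical stand-ins for the GCANet network and the SEMSSD target detection model; their interfaces are assumptions made for this sketch only.

```python
from typing import Any, Callable, List, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2)
Detection = Tuple[str, float, Box]        # (ship type, confidence, position)


def detect_ships_in_fog(
    image: Any,
    dehaze: Callable[[Any], Any],              # assumed wrapper around GCANet
    detect: Callable[[Any], List[Detection]],  # assumed wrapper around SEMSSD
) -> List[Detection]:
    """Remove rain and fog first, then detect ships on the restored image."""
    clear_image = dehaze(image)
    return detect(clear_image)
```

The detector only ever sees the restored image, which reflects the preprocessing-before-detection order of the method.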

2. The marine ship target detection method based on deep learning in the sea fog environment according to claim 1, characterized in that: the rain and fog removal preprocessing of the image to be processed containing rain and fog based on the GCANet deep learning network, to obtain the image to be processed after rain and fog removal, is performed as follows:

3. The marine ship target detection method based on deep learning in the sea fog environment according to claim 1, characterized in that: the SEMSSD target detection model takes a VGG network as a backbone network, and the VGG network comprises the 3rd convolution layer of the 4th convolution block, a channel attention module, a fully connected layer, three convolution layers and one average pooling layer;

4. The marine ship target detection method based on deep learning in the sea fog environment according to claim 1, characterized in that: the SEMSSD target detection model predicts a target position by using a prior frame, and the prior frame is determined in the following way:

adjusting the statistical shape features of the prior frame based on the target characteristics of the marine ship target, the statistical features including aspect ratio information of the prior frame.
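As an illustration only, the aspect ratio statistics of a data set can be gathered by assigning the width-to-height ratio of each annotated frame to the nearest candidate ratio. The candidate ratios, the box layout (x1, y1, x2, y2) and the function name are assumptions made for this sketch; the source does not state how the statistics were computed.

```python
import math


def aspect_ratio_histogram(boxes, candidates=(0.5, 1.0, 2.0, 3.0, 4.0)):
    """Count annotated boxes by the candidate width/height ratio they are closest to.

    Distance is measured in log space so that 2:1 and 1:2 are treated symmetrically.
    """
    counts = {c: 0 for c in candidates}
    for x1, y1, x2, y2 in boxes:
        width, height = x2 - x1, y2 - y1
        if width <= 0 or height <= 0:
            continue  # skip degenerate annotations
        ratio = width / height
        nearest = min(candidates, key=lambda c: abs(math.log(ratio) - math.log(c)))
        counts[nearest] += 1
    return counts
```

Candidate ratios that collect most of the annotated frames are natural choices for the prior frames of the detector.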

5. The marine ship target detection method based on deep learning in the sea fog environment according to claim 1, characterized in that: the aspect ratios of 6 output layers of the SEMSSD target detection model are respectively as follows: [1, 2], [1, 2, 4], [1, 2, 3], [1, 2].
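For illustration, in the standard SSD formulation a prior frame with scale s and aspect ratio a has width s·√a and height s/√a, so every frame of a layer covers the same area. The sketch below applies that formula to one layer. The scale value in the usage is hypothetical, and the sketch omits the reciprocal ratios and the extra square frame that the original SSD adds, since the source does not say whether they are used here.

```python
import math


def prior_shapes(scale, aspect_ratios):
    """Return (width, height) pairs, relative to the image size, for one output layer.

    Each pair satisfies width / height == aspect ratio and width * height == scale ** 2.
    """
    return [(scale * math.sqrt(a), scale / math.sqrt(a)) for a in aspect_ratios]
```

For example, `prior_shapes(0.2, [1, 2, 4])` yields one square frame and two progressively wider frames of equal area, which suits targets whose width exceeds their height.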

6. The marine ship target detection method based on deep learning in the sea fog environment according to claim 1, characterized in that: a large number of detection frames of different sizes predicted by the feature layers are screened by a non-maximum suppression method, to determine the detection frame and the category corresponding to each target in the image.
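The screening step can be illustrated with a minimal greedy non-maximum suppression for one category. The IoU threshold of 0.45 and the data layout are assumptions made for this sketch and are not taken from the source.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def non_maximum_suppression(detections, iou_thr=0.45):
    """Greedy NMS for one category; detections is a list of (score, box).

    Repeatedly keeps the highest-scoring frame and discards the remaining
    frames that overlap it by more than iou_thr.
    """
    remaining = sorted(detections, key=lambda d: d[0], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        remaining = [d for d in remaining if iou(best[1], d[1]) <= iou_thr]
    return kept
```

Running the function once per category leaves at most one frame per cluster of overlapping predictions, which then gives the frame and category reported for each target.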

7. The marine ship target detection method based on deep learning in the sea fog environment according to claim 3, characterized in that: the input feature map of the channel attention module SE is denoted P1, and P1 is divided into two paths; one path is not processed, and its output is denoted P2; the other path passes in sequence through global pooling, a fully connected layer, a ReLU function, a fully connected layer and a Sigmoid function, and its output is denoted P3; P2 and P3 are added to obtain the output P4 of the channel attention module SE, and the dimensions of the output P4 are consistent with those of the input P1.
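The data flow of the channel attention module can be sketched in plain Python as follows. The sketch assumes average pooling for the global pooling step, a feature map stored as nested lists of shape C x H x W, and broadcasting of the per-channel output P3 over all spatial positions; the weights are hypothetical inputs. It follows the wording above and combines P2 and P3 by addition, whereas the original SE block of the literature uses channel-wise multiplication, which would correspond to replacing `+` with `*` in the last line.

```python
import math


def se_channel_weights(p1, w1, b1, w2, b2):
    """Second path of the module: global pooling -> FC -> ReLU -> FC -> Sigmoid.

    p1 is a C x H x W nested list; w1 is (C/r) x C and w2 is C x (C/r).
    Returns P3, one value in (0, 1) per channel.
    """
    # Global average pooling: one scalar per channel (average pooling is an assumption).
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in p1]
    # First fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * x for w, x in zip(row, pooled)) + b)
              for row, b in zip(w1, b1)]
    # Second fully connected layer followed by Sigmoid.
    return [1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(row, hidden)) + b)))
            for row, b in zip(w2, b2)]


def se_module(p1, w1, b1, w2, b2):
    """Return P4, which has the same C x H x W shape as the input P1."""
    p2 = p1                                        # first path: left unprocessed
    p3 = se_channel_weights(p1, w1, b1, w2, b2)    # second path: per-channel output
    # Combine the two paths by addition, broadcasting P3 over each channel.
    return [[[v + s for v in row] for row in ch] for ch, s in zip(p2, p3)]
```

Because P3 holds one value per channel, the combination leaves the spatial layout untouched, so P4 keeps the dimensions of P1.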

8. A marine ship target detection device based on deep learning in a sea fog environment, characterized by comprising:

a detection and identification module, configured to identify the type and position of the marine ship target in the image to be processed after rain and fog removal by adopting the SEMSSD target detection model.