CN115272876A

CN115272876A - Remote sensing image ship target detection method based on deep learning

Info

Publication number: CN115272876A
Application number: CN202210836090.2A
Authority: CN
Inventors: 张致齐; 郑慧刚; 谢广奇; 曹金山; 常学立
Original assignee: Hubei University of Technology
Current assignee: Hubei University of Technology
Priority date: 2022-07-15
Filing date: 2022-07-15
Publication date: 2022-11-01

Abstract

The invention provides a remote sensing image ship target detection method based on deep learning, which is applied to an FRS-Net network; the technical key points are as follows: the method comprises the following steps: constructing a data set and preprocessing and enhancing the data set; constructing an FRS-Net network; performing FRS-Net model training; obtaining model parameters of FRS-Net, and detecting the target image based on the model of FRS-Net; and evaluating the network performance, and performing iterative optimization on the network model according to the evaluation result. The invention mainly aims at remote sensing ship detection tasks under a complex environment interfered by cloud and fog, provides a more appropriate anchor frame setting and distribution strategy according to the characteristics of remote sensing images, and enables a shallow network to obtain higher-level semantic information by constructing characteristic fusion so as to slow down the interference of the cloud and fog. The method has the advantages of high operation efficiency, accurate target detection and the like, can be suitable for large-scale remote sensing ship detection tasks, and has obvious value and significance in the aspect of quickly extracting ship targets.

Description

Remote sensing image ship target detection method based on deep learning

Technical Field

The invention belongs to the technical field of remote sensing ship target detection in a complex environment, and relates to a remote sensing image ship target detection method based on deep learning.

Background

The ship detection technology of the optical remote sensing image is widely applied to the fields of river monitoring, port management, illegal border crossing pursuit and the like. But influenced by vapor evaporation, condensation at the high altitude etc. the regional cloud layer, haze cover of being more easily appearing in the regional comparison of waters in land. This phenomenon increases the difficulty of the task of target detection based on optical remote sensing images. Aiming at the ship detection task blocked by the thin cloud or haze, the image can be cleared by using a cloud or haze removing algorithm, and then the ship is detected by using a target detection algorithm. However, the cloud and mist removing operations are time-consuming, the whole detection process is long, and the requirements in the high-time-efficiency application field are difficult to meet. The method for directly detecting the optical remote sensing image shielded by the cloud and fog can reduce the time consumption of the whole task target detection, so that the requirement of the high-timeliness target detection task is better responded.

Target detection algorithms based on deep learning, such as the YOLO series, etc., although performing satisfactorily in the PASCAL VOC and COCO target detection tasks. But is difficult to be directly applied to the detection task of the space remote sensing image ships under the cloud and mist coverage. The main reason is that different from the natural image target detection task mainly focusing on close-range targets, ship targets in the space remote sensing image are small and narrow, and the ship targets are often densely arranged in narrow areas such as ports and riverways, and cloud and fog covering interference of different degrees is superposed on the narrow and narrow areas, so that the ideal effect is difficult to obtain by directly using a related target detection algorithm.

Disclosure of Invention

Aiming at the problem that the existing target detection algorithm is difficult to maintain certain precision in the remote sensing ship detection task under the shielding of foggy days and has quick detection speed, the invention provides a remote sensing image ship target detection method based on deep learning, which comprises the following steps:

step S101: constructing a data set and preprocessing and enhancing the data set;

step S102: constructing an FRS-Net network;

step S103: performing FRS-Net model training;

step S104: obtaining a model parameter of FRS-Net, and detecting a target image;

step S105: and evaluating the network performance, and performing iterative optimization on the network model according to the evaluation result.

Preferably, the constructing and preprocessing and enhancing the data set comprises the following steps:

collecting the disclosed remote sensing data set to obtain a basic remote sensing ship image, and simulating the remote sensing ship data set with different degrees of cloud and mist covering environments through a dark channel mist adding algorithm on the basis;

simplifying the number of anchor frames by a K-means algorithm, increasing a shallow network, and transmitting high-level semantic information of a deep network to the shallow network for parameter fusion in a characteristic pyramid fusion mode;

and a finer characteristic scale is added in a prediction network, and the accuracy of detecting the densely arranged targets is improved.

Preferably, the FRS-Net network includes a backbone feature extraction network and a feature fusion network; wherein the backbone feature extraction network adopts CSPdark 53-Tiny; the feature fusion network adopts an FPN algorithm, and high-level semantic information of the deep network is transmitted to the shallow network through the FPN algorithm.

Preferably, the performing FRS-Net model training includes:

the number of anchor frames is simplified by using the characteristics of the remote sensing ship and adopting a K-Means clustering algorithm, the parameter quantity of the network is reduced on the basis of ensuring the accuracy by simplifying the number of the anchor frames matched with each feature layer, and the reasoning time of the network is shortened.

Preferably, the performing FRS-Net model training further comprises: the method has the advantages that the new shallow network is added to bring detailed information for the network, the feature fusion network is used to bring advanced semantic information for the shallow network, the robustness of the network in a complex environment is enhanced, the prediction network is improved, the condition that the extracted features of the network are inconsistent with the calculated features is relieved, and the accuracy of the algorithm for detecting the densely arranged targets is improved.

Preferably, the obtaining of the model parameter of FRS-Net to detect the target image specifically includes the following steps:

and detecting the remote sensing images covered by the clouds with different concentrations by using an FRS-Net algorithm, wherein the remote sensing images covered by the clouds with different concentrations are obtained by adding the clouds with different concentrations in the same remote sensing ship image through a dark channel fogging algorithm.

Preferably, the obtaining the model parameter of FRS-Net and detecting the target image further includes: inputting the test set to a trained FRS-Net network structure, and obtaining detection results in batches; the detection result comprises bounding box information, confidence and category scores; a part of anchor frames with strong interference are excluded through a non-maximum suppression algorithm; and finally, measuring the network performance of FRS-Net by evaluating the indexes.

Preferably, the evaluating the network performance and the iteratively optimizing the network model according to the evaluation result specifically include the following steps:

comparing the predicted boundary frame with the real frame, evaluating the predicted boundary frame by adopting average precision, calculation delay and frame per second, and performing iterative optimization on the network model according to an evaluation result; wherein obtaining the average precision comprises:

the average precision is a curve area formed by setting different thresholds according to precision and recall rate; and testing ten data sets with different degrees of foggy day coverage concentration in the test data set to obtain an average value.

Preferably, in the obtaining the average Precision, the Average Precision (AP), precision (Precision) and Recall (Recall) are calculated by the following formulas:

in the above formula, TP represents a true case, FP represents a false positive case, and FN represents a false negative case; p (r) represents a curve bounded by accuracy and recall, and AP represents the average accuracy by integrating the curve.

The invention has the technical effects and advantages that:

aiming at the remote sensing ship detection task under the complex environment interfered by rain and fog, the invention provides a more appropriate anchor frame setting and distribution strategy according to the characteristics of the remote sensing image, and enables a shallow network to obtain higher-level semantic information by constructing characteristic fusion so as to slow down the interference of the cloud and fog. The method is suitable for the requirement of remote sensing ship target detection under the complex scene interfered by cloud and fog, provides a more appropriate anchor frame setting and distribution strategy according to the characteristics of the remote sensing image, enables the high-level semantic information of the deep network to be transmitted to the shallow network through the backbone extraction network and the FPN algorithm, and improves the ship detection capability of the network in the complex environment such as cloud coverage. The method has the advantages of high operation efficiency, accurate target detection and the like, can be suitable for large-scale remote sensing ship detection tasks, and has obvious value and significance in the aspect of quickly extracting ship targets.

Drawings

FIG. 1 is a schematic flow chart of a ship detection method for covering an environment by high-resolution optical remote sensing cloud based on a deep neural network;

FIG. 2 is a schematic diagram of an anchor frame matching strategy according to an embodiment of the present invention;

FIG. 3 is a diagram of the FRS-Net network architecture of the present invention;

FIG. 4 is a schematic diagram of FRS-Net prediction network processing according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The invention provides a ship detection method for a high-resolution optical remote sensing cloud and mist masking environment based on a deep neural network, which is applied to an FRS-Net network, is used for a remote sensing ship target detection task under the cloud and mist masking of different concentrations, and can meet the application field with extremely high timeliness such as satellite borne and the like.

Specifically, as shown in fig. 1, the method of the present invention includes: step S101: constructing a data set and preprocessing and enhancing the data set; the method comprises the following specific steps of,

first, in terms of data set construction, since the remote sensing data sets disclosed in the prior art are all clear ship images, in this embodiment, the data sets are represented as remote sensing images of clear ship images. In order to obtain remote sensing images under cloud and mist shielding, remote sensing ship detection tasks under different degrees of cloud and mist shielding environments are simulated through a dark channel mist adding algorithm.

Furthermore, since the remote sensing ship data sets disclosed at present mostly mainly comprise clear remote sensing images, the network training is better in expression effect. The invention obtains a basic remote sensing ship image by collecting a public remote sensing data set DIOR, simulates remote sensing ship data sets (marked as SDIOR in the invention) with different degrees of cloud and fog covering environments by a dark channel fog adding algorithm on the basis, and correspondingly enhances the data sets.

In the aspect of data set construction, a dark channel fogging algorithm is utilized to carry out data simulation, the dark channel first-aid algorithm is taken as a theoretical basis, and a used fogging model is as follows:

I(x)＝J(x)t(x)+A(1-t(x))

wherein J (x) is a clear remote sensing image; i (x) is a remote sensing image of a cloud and mist covering environment; a is the atmospheric light component, and t (x) is the image transmittance. In order to simulate the remote sensing picture covered in the foggy day on vision and data, t (x) and A values of a real cloud image are extracted through an existing cloud coverage map (a priori map) and are used for the model to carry out fogging processing on an original data set image, and the concentration is quantitatively controlled through a coefficient in the fogging process. The specific process is as follows:

where ω is set to 0.95.I is^cCalculated by guided filtering, Ω (x) represents a local area block centered on x. After t (x) and A values of the existing real cloud and mist coverage map are obtained, the original data set image is subjected to mist adding treatment, wherein alpha is a concentration coefficient obtained in the previous step, and the formula is as follows:

J(x)＝α·t(x)·(I(x)-A)+A

in the implementation, the degree of cloud and fog shielding is regulated and controlled by adjusting alpha, remote sensing ship detection tasks of different cloud and fog shielding are simulated, ten remote sensing ship images with different cloud and fog shielding degrees are included in the original image, and the remote sensing ship images are mixed together to form a data set.

Secondly, the number of anchor frames is simplified through a K-means algorithm by utilizing the characteristics of the ship shape in the aspect of data preprocessing, and the detection efficiency of the model is improved. The target detection algorithm can obtain more detailed information by adding the shallow layer network, and then the high-level semantic information of the deep layer network is transmitted to the shallow layer network for parameter fusion in a characteristic pyramid fusion mode, so that the shallow layer network can obtain more high-level semantic information, and the capability of the algorithm for detecting the ship target in the cloud and mist coverage environment is improved.

In the aspect of data enhancement, aiming at the characteristics of the shape of the remote sensing ship, the number of anchor frames is simplified by using a K-Means clustering algorithm, and the detection speed is improved while certain accuracy is ensured.

Fig. 2 is a schematic diagram of the anchor frame matching strategy, in which yellow stars represent each anchor frame, and different color sample points represent that they are allocated to different anchor frames, (a) represents allocation of 4 anchor frames, and (b) represents allocation of 6 anchor frames. Specifically, the length and width of all target samples in the training set and all anchor boxes clustered by the K-Means algorithm are extracted and mapped to the initial input size of the network model (416 ). And performing IOU calculation on all target samples and the anchor frames to obtain the anchor frame distributed to each target sample point, and selecting partial sample points according to the same proportion. (a) Relative to (b), the same is true over the entire sample, but the number of target samples matched to each anchor frame increases due to the decrease in the number of anchor frames. For example, in (b) the target sample detected by the two anchor frames, only one anchor frame is needed for detection in (a), and the irregular curve frame in (b) represents the phenomenon.

Then, aiming at the scene of dense arrangement of small targets in a remote sensing ship detection task, the situation that the network extraction features are inconsistent with the calculation features is relieved by adding finer feature scales in a prediction network, and the accuracy of detection of the dense arrangement targets is improved.

Step S102 is to construct an FRS-Net network on the basis of step S101, and to enhance the detection capability of the remote sensing ship target under complex scenes such as rain and fog coverage.

As shown in fig. 3, an FRS-Net network structure diagram is shown, in which an FRS-Net algorithm is mainly composed of a backbone feature extraction network, a feature fusion network and a prediction network. The algorithm comprises image preprocessing, wherein the remote sensing image to be detected is stretched, compressed and subjected to data enhancement in the image preprocessing process, wherein the data enhancement comprises one or more of HSV color gamut transformation, mosaic transformation and mirror image transformation; the image is preprocessed and then sent to a target feature extraction network for feature extraction, the extracted feature information is sent to a feature fusion network for feature fusion to improve the high-level semantic information content contained in a shallow network, and then the feature information is sent to a feature information conversion network for decoding the feature information, so that target frame information, target confidence coefficient and category information required by a prediction network are obtained. The remote sensing ship detection task under the cloud and mist shielding is realized by the difference between target information predicted by different reduction networks and real target information through the prediction network.

In order to meet the requirement of high-aging application, FRS-Net adopts a backbone extraction network CSPdark Net53-Tiny. The CSPdark net53-Tiny is composed of three basic volumes and two large residual blocks respectively. The basic convolution mainly consists of normal convolution, batch Normalization (BN) and an active function leakage rectification linear unit (leakage ReLU). The residual module can resolve vanishing gradients, achieving good performance even in deeper networks.

The feature fusion network adopts FPN algorithm. It is known that the shallow element layer possesses more low-level semantic information, including bounding box position information. The new shallow network is added to bring more detailed information, so that the accuracy of the ship target identification algorithm in a complex environment is improved. On the basis, high-level semantic information of the deep network is transmitted to the shallow network through the FPN, and the capability of the network for detecting the ship in the cloud coverage environment is further improved.

Step S103, performing FRS-Net model training on the basis of the completion of step S102, wherein the specific process comprises the following steps:

the method has the advantages that the new shallow network is added to bring detailed information for the network, the feature fusion network is used to bring high-level semantic information for the shallow network, the robustness of the network in a complex environment is enhanced, the prediction network is improved on the basis, the condition that the extracted features of the network are inconsistent with the calculated features is relieved, and the accuracy of the algorithm for detecting the densely arranged targets is improved. The specific implementation process is as follows.

Referring to FIG. 3, a sample image is stretched during the FRS-Net algorithm training phaseRather than scaling to 416 x 416, feature extraction and downsampling through the backbone network, three feature maps with different feature scales, C x 13,

is obtained by the feature fusion network. In the feature fusion network, in order to integrate with the feature map

The fusion is carried out on the parameters, the C multiplied by 13 needs to adjust the space size of the channel and the feature map, and after the basic convolution and the up-sampling, the feature scale C multiplied by 13 is changed into

At this time, the spatial sizes of the feature maps become uniform, and then the feature maps are superimposed on the channel dimension to generate the feature map

At the moment, the characteristic diagram has two functions, firstly, the characteristic diagram is sent to a prediction network for predicting a target boundary frame, a type and a confidence coefficient after channel conversion is carried out through a prediction head; secondly, the adjustment of the space sizes of the channel and the feature map is continued and the parameter fusion is carried out with the feature map to generate the feature map

And then fed into the predictive network to calculate feature information. In the FRS-Net feature layer object detection allocation strategy, the feature output layer 52 × 52 is used to detect small objects, and the feature output layer 26 × 26 is used to detect non-small objects. In order to match the feature information of the feature output layer 26 × 26 with the feature information of the feature output layer 52 × 52, information matching is ensured by adding a base convolution and upsampling operation on the feature output layer with the feature scale of 26 × 26. After the characteristic information of the characteristic output layer 26 × 26 and the characteristic information of the characteristic output layer 52 × 52 are subjected to parameter fusion, the shallow layer network can acquire high-level semantic information from the deep layer network, so that the robustness of the network model in the cloud and fog covering environment is improved. At the same time, two feature inputs are maintained due to FRS-NetAnd (3) layering, so that the coating has high timeliness similar to that of YOLOv 4-Tiny.

Further, step S103 further includes: the number of anchor frames is simplified by using the characteristics of the remote sensing ship and adopting a K-Means clustering algorithm, the parameter quantity of the network is reduced on the basis of ensuring the accuracy by simplifying the number of the anchor frames matched with each feature layer, and the reasoning time of the network is shortened.

Fig. 4 is a schematic processing diagram of an FRS-Net prediction network, in which two densely arranged boat targets are sent to the prediction network after feature information is extracted by the feature network, and feature values of the boat targets are converted into frame information. Firstly, a target central point in the frame information passes through a sigmod activation function and outputs sigma (t)_x)∈(0,1),σ(t_y) E (0, 1). Wherein the length and width of each network is set to 1 in size. When the feature map size of the predicted network output is 26 × 26, the center point of the target needs to match the size of the upper feature map on a normalized basis. Due to the insufficient size of the 26 × 26 feature map, the center points of the two boats are distributed on the same grid at the moment. However, the coordinates 26 × 26 at the upper left corner of the grid can only match one target center point, so that in the grid with two target features, only one target is calculated in the prediction network, but the convolution network extracts the features of two boats, which causes the phenomenon that the two targets are inconsistent. The chaotic matching causes that the detection of the remote sensing ship by the algorithm often has the conditions of missing detection and false detection. Because two boats have the same width and height and belong to the same detection layer, the problem cannot be solved by adding the anchor frame, and the additional reasoning time of a network is increased. The FRS-Net algorithm alleviates the above problem by adding a new feature prediction scale of 52 x 52. In the right diagram of fig. 4, the target center points may obtain a grid again in the feature diagram enlarged by two times, and the enlargement of the feature scale also increases the spatial distance between the two target center points, so that the two target center points can be successfully allocated to different grids. In densely arranged remote sensing ship images, the method enables a part of target center points which are critical to grid edges to be redistributed to a new grid. It can successfully alleviate the features extracted by the convolutional networkThe method is inconsistent with the characteristics of the prediction network calculation, and the capability of the target detection algorithm for detecting densely arranged small ships is improved.

Step S104, on the basis of the completion of step S103, obtaining model parameters of FRS-Net, and detecting the target image based on the model of FRS-Net, wherein the specific steps are as follows:

and detecting on the remote sensing images covered by the cloud mist with different concentrations by using an FRS-Net algorithm. The remote sensing images covered by the clouds with different concentrations are obtained by adding the cloud scenes with different concentrations in the same remote sensing ship image through a dark channel fogging algorithm.

And inputting the test set into the trained FRS-Net network structure, and obtaining detection results in batches, wherein the detection results comprise bounding box information, confidence coefficient and category scores.

Testing the ship detection capability of the FRS-Net under different cloud and fog covering degree environments, simulating 10 degrees of images of the verification set each time, finally obtaining ship detection results under different cloud and fog covering, and averaging.

It should be noted that, in the detection stage, the input remote sensing ship image is stretched and compressed to 416 × 416, and is sent to the backbone network for extracting target features, including detail features and high-level semantic features. In order to accurately identify the target, the detail features and the high-level language features are subjected to information fusion at a feature fusion layer. In order to enable the prediction network to calculate the fused information, the fused information is decoded so that the parameter dimension becomes N × 2 × (4 +1+ c). Where N represents the feature output layer size and C represents the number of classifications. In the present invention, FRS-Net has only two feature output layers, so the size of N is 52 × 52, 26 × 26, respectively. However, the invention only aims at the ship target detection task, so that the number of C is 1. Specifically, 2 represents the number of FRS-Net anchor boxes, 4 represents the center point and size of the target box, and 1 represents the confidence of the predicted box.

In the prediction stage of the prediction network, the prediction frame obtains a bounding box score by multiplying the confidence coefficient and the category score, and the prediction frame with the bounding box score larger than the set threshold is reserved and the prediction frame with the bounding box score smaller than the set threshold is discarded. To avoid having multiple prediction bounding boxes for the same target, the final prediction box is obtained by introducing a non-maximum suppression algorithm (NMS).

Step S105, on the basis of the completion of step S104, evaluates the network performance, and performs iterative optimization on the network model according to the evaluation result, which includes the following specific steps:

and comparing the prediction frame reserved in the step S104 with the real frame, evaluating the prediction frame by adopting common network evaluation index average accuracy (mAP), calculation delay (Latency) and Frame Per Second (FPS), and performing iterative optimization on the network model according to an evaluation result. Wherein each evaluation index is defined as follows:

the average precision AP is a curve area formed by precision (precision) and Recall (Recall) by setting different thresholds.

The calculated delay is the time elapsed between the issuance of a request and the receipt of a response, and is used to measure the speed of algorithmic reasoning.

The frames per second FPS refers to the number of output frames per second.

The parameter indexes for measuring the performance of the FRS-Net network adopt the common average precision (mAP), the detection speed (FPS) and the network reasoning time (Latency).

Further, the mAP is obtained by averaging the APs of all the predicted types, and in detail, the average value of the APs obtained by testing ten data sets with different degrees of foggy day coverage concentration in the test data set is obtained by the invention. In this embodiment, the AP is the area of a curve formed by Precision (Precision) and Recall (Recall) calculations, and this curve is commonly referred to as a PR curve, which is formed by setting different thresholds. Regarding the calculation of the accuracy and recall ratio, it is calculated by the following formula.

In the above formula, TP represents the true case, FP represents the false positive case, and FN represents the false negative case. P (r) represents a curve defined by accuracy and recall, which is integrated to yield AP.

In conclusion, the invention mainly aims at the remote sensing ship detection task under the complex environment interfered by rain and fog, provides a more appropriate anchor frame setting and distribution strategy according to the characteristics of the remote sensing image, and enables a shallow network to obtain higher-level semantic information by constructing feature fusion so as to slow down the interference of the cloud and fog.

Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications may be made to the embodiments or portions thereof without departing from the spirit and scope of the invention.

Claims

1. A remote sensing image ship target detection method based on deep learning is characterized by comprising the following steps:

constructing a data set and preprocessing and enhancing the data set;

constructing an FRS-Net network;

performing FRS-Net model training;

obtaining model parameters of FRS-Net, and detecting the target image based on the model of FRS-Net;

and evaluating the network performance, and performing iterative optimization on the network model according to the evaluation result.

2. The remote sensing image ship target detection method based on deep learning of claim 1, wherein the steps of constructing and preprocessing and enhancing the data set comprise:

collecting a public remote sensing data set to obtain a basic remote sensing ship image, and simulating a remote sensing ship data set with different degrees of cloud and mist covering environments through a dark channel mist adding algorithm;

fine feature scales are added in a prediction network to relieve the condition that network extraction features are inconsistent with calculation features, and the accuracy of detection on densely arranged targets is improved.

3. The remote sensing image ship target detection method based on deep learning of claim 1, wherein the FRS-Net network comprises a backbone feature extraction network and a feature fusion network; wherein, the backbone feature extraction network adopts CSPdark net53-Tiny; the feature fusion network adopts an FPN algorithm, and high-level semantic information of the deep network is transmitted to the shallow network through the FPN algorithm.

4. The remote sensing image ship target detection method based on deep learning of claim 1, wherein the performing FRS-Net model training comprises:

5. The remote sensing image ship target detection method based on deep learning of claim 4, wherein the performing FRS-Net model training further comprises:

the new shallow network is added to bring detailed information to the network, high-level semantic information is brought to the shallow network through the feature fusion network, the robustness of the network in a complex environment is enhanced, the prediction network is improved, the condition that the extracted features of the network are inconsistent with the calculated features is relieved, and the accuracy of the algorithm for detecting the densely arranged targets is improved.

6. The remote sensing image ship target detection method based on deep learning of claim 1, wherein the obtaining of the model parameters of FRS-Net, the detection of the target image based on the model of FRS-Net comprises:

7. The remote sensing image ship target detection method based on deep learning of claim 6, wherein the obtaining of the model parameters of FRS-Net, the detection of the target image based on the model of FRS-Net further comprises:

inputting the test set to a trained FRS-Net network structure, and obtaining detection results in batches; the detection result comprises bounding box information, confidence and category scores; a part of anchor frames with strong interference are excluded through a non-maximum suppression algorithm; and finally, measuring the network performance of FRS-Net by evaluating the indexes.

8. The remote sensing image ship target detection method based on deep learning of claim 1, wherein the evaluation of network performance and the iterative optimization of the network model according to the evaluation result comprise the following steps:

9. The method for detecting the ship target based on the deep learning remote sensing image as recited in claim 8, wherein in the obtaining of the average accuracy, the average Accuracy (AP), the accuracy (Precision) and the Recall ratio (Recall) are calculated by the following formulas:

in the above formula, TP represents a true case, FP represents a false positive case, and FN represents a false negative case; p (r) represents a curve formed by the precision and the recall rate, and AP represents the average precision obtained by integrating the curve.