CN110826485A - Target detection method and system for remote sensing image

Target detection method and system for remote sensing image

Info

Publication number
CN110826485A
CN110826485A (application CN201911071646.8A)
Authority
CN
China
Prior art keywords
target
detected
remote sensing
detection
target area
Prior art date
Legal status
Granted
Application number
CN201911071646.8A
Other languages
Chinese (zh)
Other versions
CN110826485B (en)
Inventor
王俊强
李建胜
周学文
吴峰
郑凯
Current Assignee
Information Engineering University of PLA Strategic Support Force
Original Assignee
Information Engineering University of PLA Strategic Support Force
Priority date
Filing date
Publication date
Application filed by Information Engineering University of PLA Strategic Support Force filed Critical Information Engineering University of PLA Strategic Support Force
Priority to CN201911071646.8A priority Critical patent/CN110826485B/en
Publication of CN110826485A publication Critical patent/CN110826485A/en
Application granted granted Critical
Publication of CN110826485B publication Critical patent/CN110826485B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/13Satellite images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/182Network patterns, e.g. roads or rivers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Abstract

The invention relates to a target detection method and system for remote sensing images. The target detection system comprises a memory, a processor, and a computer program stored in the memory and runnable on the processor; when running the computer program, the processor implements the following steps: (1) training to obtain a deep learning network; (2) acquiring the size of the target area to be detected and the actual size of the target; (3) dividing the target area to be detected according to the proportional relation between the size of the target area to be detected and the actual size of the target, obtaining a plurality of small grids; (4) sequentially inputting all the small grid images obtained in step (3) into the deep learning network obtained in step (1) for target detection, finally obtaining a detection result. The adaptively divided grids avoid cutting the target excessively, or cutting it at all, so that after detection and identification by a fast, high-precision deep learning network model, targets in the remote sensing image to be detected can be detected more accurately.

Description

Target detection method and system for remote sensing image
Technical Field
The invention relates to a target detection method and a target detection system for remote sensing images.
Background
Remote sensing image target detection determines whether a target of interest exists in a remote sensing image and detects and accurately locates it; it is one of the important research directions in image interpretation. The traditional target detection process uses a sliding window to select regions, then extracts shallow features, and finally judges the category. The recognition performance of this approach depends heavily on the hand-designed features, and it is difficult to fully mine the deep features in the image. Feature extraction is not robust, cannot adapt to illumination changes, inconsistent resolution, and other conditions of multi-source remote sensing images, and can hardly meet the requirements of large-scale automated application.
In recent years, with the continuous growth of computing power, deep learning has been widely applied, and remote sensing image target detection based on deep learning has developed rapidly. For target detection, a deep convolutional neural network requires no manually designed features: it automatically extracts features from remote sensing image data, and its performance exceeds that of traditional algorithms.
However, when target detection is performed on a remote sensing image shot at high altitude, the full image contains too many pixels for the processor to detect in one pass, so the image to be detected must be divided before detection; the division of the target area to be detected often cuts the target, so that targets in the remote sensing image cannot be accurately identified.
Disclosure of Invention
The invention aims to provide a target detection method and a target detection system for remote sensing images, to solve the problem that target detection in existing remote sensing images is inaccurate.
In order to achieve the above object, the present invention provides a target detection method for remote sensing images, comprising the steps of:
(1) training to obtain a deep learning network;
(2) acquiring the size of a target area to be detected and the actual size of a target;
(3) dividing the target area to be detected according to the proportional relation between the size of the target area to be detected and the actual size of the target to obtain a plurality of small grids;
(4) sequentially inputting all the small grid images obtained in step (3) into the deep learning network obtained in step (1) for target detection, finally obtaining a detection result.
Advantageous effects: according to the target detection method for remote sensing images, the target area to be detected is first divided into grids; the division is adaptive, based on the size of the target area to be detected and the actual size of the target, so the adaptively divided grids avoid cutting the target excessively, or cutting it at all. After target detection and identification by a fast, high-precision deep learning network model, targets in the remote sensing image to be detected can be detected more accurately.
Further, the target area to be detected in step (3) is divided into m rows and n columns of small grids:

$$m=\left\lceil \frac{Y_{max}-Y_{min}}{Y_{height}\,(1-overlap_y)} \right\rceil,\qquad n=\left\lceil \frac{X_{max}-X_{min}}{X_{width}\,(1-overlap_x)} \right\rceil$$

where $\lceil\cdot\rceil$ denotes rounding up to an integer; $X_{max}$ is the maximum X coordinate of the target area to be detected, $X_{min}$ the minimum X coordinate, $Y_{max}$ the maximum Y coordinate, and $Y_{min}$ the minimum Y coordinate; $X_{width}$ is the actual length of the map-container loading area at a given image level for the target area to be detected, and $Y_{height}$ its actual width; $overlap_x$ is the overlap rate in the X direction, determined by the ratio of the target size to the length of the target area to be detected in the X direction, and $overlap_y$ is the overlap rate in the Y direction, determined by the ratio of the target size to the length of the target area to be detected in the Y direction.
Further, the overlap rate in the X direction and the overlap rate in the Y direction are:

$$overlap_x=\frac{l_{object}}{X_{width}},\qquad overlap_y=\frac{w_{object}}{Y_{height}}$$

where $l_{object}$ is the actual length of the target and $w_{object}$ is the actual width of the target.
Further, the coordinates of the divided small grids are expressed as

$$\begin{aligned} x^{ij}_{min}&=X_{min}+(j-1)\,X_{width}\,(1-overlap_x), & x^{ij}_{max}&=x^{ij}_{min}+X_{width},\\ y^{ij}_{min}&=Y_{min}+(i-1)\,Y_{height}\,(1-overlap_y), & y^{ij}_{max}&=y^{ij}_{min}+Y_{height}, \end{aligned}$$

where $x^{ij}_{max}$ is the maximum X coordinate of the grid in row i, column j, $x^{ij}_{min}$ the minimum X coordinate, $y^{ij}_{max}$ the maximum Y coordinate, and $y^{ij}_{min}$ the minimum Y coordinate; i = 1, 2, …, m and j = 1, 2, …, n.
Further, the image level is 18. At image level 18, the detection effect is better.
Further, the deep learning network comprises a Faster R-CNN network and an RPN network.
Further, the Faster R-CNN network comprises a deep residual network, and the deep residual network has 50 layers.
Further, the RPN network outputs rectangular candidate boxes by convolution.
Further, the mechanism for generating rectangular candidate frames in the RPN network is as follows: the number of rectangular candidate frames is determined by the number of area scaling factors and the number of aspect ratios of the training sample image.
In order to achieve the above object, the present invention provides a target detection system for remote sensing images, including a memory, a processor, and a computer program stored in the memory and executable on the processor; the processor implements the steps of the above target detection method when executing the computer program.
Drawings
FIG. 1 is a flowchart of a target detection method of a remote sensing image according to the present invention;
FIG. 2 is a schematic structural diagram of a Faster R-CNN model constructed in an embodiment of the present invention;
FIG. 3-a is an original image before being processed by the data enhancement method of the present invention;
FIG. 3-b is an image of the present invention under random color data enhancement;
FIG. 3-c is an image of the noise-disturbed data enhancement according to the present invention;
FIG. 3-d is an image of the present invention under random scaling data enhancement;
FIG. 3-e is an image of the present invention under random rotation data enhancement;
FIG. 3-f is an image of the present invention under random flip data enhancement;
FIG. 4 is a schematic diagram illustrating the meshing of remote sensing images according to the present invention;
FIG. 5 is a flow chart of multi-level object detection according to the present invention;
FIG. 6 is a comparison graph of the total loss training results under four generation mechanisms according to the present invention;
FIG. 7-a is a comparison graph of the detection results of different target scales under the Anchor mechanism 1 of the present invention;
FIG. 7-b is a comparison graph of the detection results of different target scales under the Anchor mechanism 2 of the present invention;
FIG. 7-c is a comparison graph of the detection results of different target scales under the Anchor mechanism 3 of the present invention;
FIG. 7-d is a comparison graph of the detection results of different target scales under the Anchor mechanism 4 of the present invention;
FIG. 8-a is a diagram of grid division and airplane target size at image level 17 according to the present invention;
FIG. 8-b is a diagram of grid division and airplane target size at image level 18 according to the present invention;
FIG. 8-c is a diagram of grid division and airplane target size at image level 19 according to the present invention;
FIG. 9-a is a diagram of detection bounding boxes at image level 18 according to the present invention;
FIG. 9-b is a diagram of detection bounding boxes at image level 19 according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be further described in detail with reference to the accompanying drawings and examples, but the embodiments of the present invention are not limited thereto.
The technical idea of the invention is as follows: divide the image to be detected into grids according to the overlap rates in the X and Y directions, then sequentially input the divided small grids into a constructed and trained Faster R-CNN detection model for target detection, so that targets in the image to be detected can be detected more accurately and reliably.
The embodiment of the target detection method comprises the following steps:
In this embodiment, airplanes and track and field grounds are taken as examples. It should be noted that the invention also applies to targets of other structural forms, such as ships and oil drums. The implementation flow of the target detection method is shown in FIG. 1; the specific implementation steps are as follows:
Step 1, obtaining training sample images and test sample images.
Target sample images of different scales and categories are manually collected from digital-earth satellite imagery worldwide using a target detection service system. After the target sample images are annotated, training sample images containing the targets are formed. The target categories selected in this implementation are airplanes and track and field grounds: 1000 airplane samples and 900 track and field samples were obtained. 70% of the picture samples are used as the training set and the remaining 30% as the test set; a large number of small airplane targets are introduced into the test set to verify the model's accuracy on small targets. After the data enhancement operation, the final training set totals 7980 images and the test set 3420.
Step 2, constructing a deep learning network model based on Faster R-CNN
The structure of the deep learning network model in this embodiment is shown in FIG. 2; the model is obtained by combining a Faster R-CNN network and an RPN network.
A 50-layer deep residual network (ResNet-50) is adopted in the Faster R-CNN network to extract basic features and obtain a feature map. At the structural level of the neural network, the deep residual network solves the problem of vanishing gradients during back propagation: the gradient does not vanish even when the network is deep, which ensures detection precision.
The RPN network generates rectangular candidate regions from the feature map, outputs through convolution the position parameters of each rectangular candidate frame (i.e., anchor) and the probability that it is a target, and then outputs the proposal boxes to the Faster R-CNN for training and learning.
Since the Faster R-CNN network and the RPN network both belong to the prior art, they are not described further here.
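As a concrete illustration of this combined structure, the following minimal sketch builds a ResNet-50-based Faster R-CNN (with its built-in RPN) for the two target classes used here. The patent's experiments ran on a TensorFlow framework; the torchvision stand-in below (whose variant adds an FPN on top of ResNet-50) is only an assumed illustration of the moving parts, not the original implementation.

```python
import torch
import torchvision

# Faster R-CNN with a ResNet-50 backbone; torchvision wires the RPN
# (anchor generation + proposal scoring) and the detection head together.
# num_classes = 2 target categories (airplane, track and field) + background.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
    weights=None, num_classes=3)
model.eval()

# A dummy 3-channel image stands in for one small grid tile.
tile = [torch.rand(3, 600, 600)]
with torch.no_grad():
    detections = model(tile)[0]   # dict with 'boxes', 'labels', 'scores'
print(detections['boxes'].shape)
```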
Step 3, learning and training of deep learning network
The training sample images are input into the deep learning network model for learning and training. Since the learning and training process of deep learning networks is much discussed in the prior art, only the distinctive design links are introduced here:
3.1 The generation mechanism of rectangular candidate frames (i.e., anchors) in the RPN network of the present invention is: for a training sample image, the number of rectangular candidate frames is determined by the number of area scaling factors and the number of aspect ratios. For example, for a 51 × 39 training sample image with 256 channels, a base area of 256 × 256 represents a rectangular candidate frame with a scaling factor of 1 and an aspect ratio of 1.
Based on this generation mechanism, the invention sets four anchor generation mechanisms, as shown in Table 1 below:
TABLE 1
[Table 1 lists the four anchor generation mechanisms (their area scaling factors and aspect ratios); it appears only as an image in the original document.]
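To make the mechanism concrete, the sketch below enumerates anchors for one feature-map cell from a list of area scaling factors and aspect ratios, so the anchor count is their product. The specific coefficients of the four mechanisms survive only as the Table 1 image, so the values below are purely hypothetical placeholders.

```python
import numpy as np

def generate_anchors(base_size, scales, aspect_ratios):
    """Return len(scales) * len(aspect_ratios) anchors centred on one
    feature-map cell, as (x1, y1, x2, y2) offsets from the cell centre."""
    anchors = []
    for s in scales:
        area = (base_size * s) ** 2          # area scaling factor s
        for r in aspect_ratios:              # r = height / width
            w = np.sqrt(area / r)
            h = w * r
            anchors.append([-w / 2, -h / 2, w / 2, h / 2])
    return np.array(anchors)

# Hypothetical coefficients standing in for one row of Table 1.
anchors = generate_anchors(16, scales=[0.5, 1.0, 2.0],
                           aspect_ratios=[0.5, 1.0, 2.0])
print(len(anchors))   # 3 scales x 3 ratios = 9 candidate frames per cell
```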
3.2 Loss function in the RPN network
The RPN network calculates the total loss using a multi-task loss, expressed as:

$$L(\{p_i\},\{t_i\})=\frac{1}{N_{cls}}\sum_i L_{cls}(p_i,p_i^*)+\lambda\,\frac{1}{N_{reg}}\sum_i p_i^*\,L_{reg}(t_i,t_i^*)$$

where i indexes the rectangular candidate boxes; $p_i$ is the predicted probability that the i-th rectangular candidate box is a target; $p_i^*$ is the true probability, equal to 1 if the rectangular candidate box is a target and 0 otherwise; $t_i$ is a four-dimensional vector representing the predicted coordinate transformation parameters (translation and scaling) of rectangular candidate box i; $t_i^*$ represents the transformation parameters of the rectangular candidate box relative to the true marker box; $\{p_i\}$ is the set of predicted probabilities of all rectangular candidate boxes and $\{t_i\}$ the set of all predicted coordinate transformation parameters; $\lambda$ is a balance coefficient; $N_{cls}$ and $N_{reg}$ are normalization factors; the classification loss $L_{cls}$ can be calculated with cross-entropy, and the position regression loss $L_{reg}$ is calculated with the Smooth L1 loss function.
3.3 Loss function of the classification regression process
When performing the target classification regression, a multi-task loss is calculated synchronously; for each pooling (RoI) layer in the Faster R-CNN network, the loss function is expressed as:

$$L(p,u,t^u,v)=L_{cls}(p,u)+\lambda\,[u\ge 1]\,L_{loc}(t^u,v)$$

where $p=(p_0,\dots,p_K)$ denotes the probabilities over the K+1 classes (including background), u denotes the true class of the RoI, $v=(v_x,v_y,v_w,v_h)$ represents the real position of the target, $t^u$ represents the predicted position, $[u\ge 1]$ is 1 when $u\ge 1$ and 0 otherwise, and $\lambda$ is a balance coefficient; the classification loss $L_{cls}(p,u)$ can be calculated with cross-entropy, and the bounding-box regression loss $L_{loc}(t^u,v)$ is calculated with the Smooth L1 loss function.
3.4 Data enhancement improves the diversity of training samples, prevents overfitting caused by insufficient samples during training, and enhances the robustness of the model.
Five ways are used here: random rotation of 0° to 360°, noise perturbation, color dithering, random non-uniform scaling of 0.8 to 1.2 times, and flip transformation of 90°, 180°, or 270°, as shown in FIGS. 3-a through 3-f. In training, the initial learning rate is set to 0.0003 and the learning momentum to 0.9. During training, transfer learning is performed from a model pre-trained on a public data set, which initializes the network weights.
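A minimal sketch of the five enhancement modes, assuming Pillow/NumPy; the noise standard deviation and color range below are illustrative assumptions, and box annotations would need the matching geometric transform:

```python
import random
import numpy as np
from PIL import Image, ImageEnhance

def augment(img):
    """Apply one randomly chosen mode out of the five used above."""
    mode = random.randrange(5)
    if mode == 0:                                  # random rotation 0-360 deg
        return img.rotate(random.uniform(0, 360), expand=True)
    if mode == 1:                                  # noise perturbation
        arr = np.asarray(img, dtype=np.float32)
        arr += np.random.normal(0, 10, arr.shape)  # assumed sigma = 10
        return Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))
    if mode == 2:                                  # color dithering
        return ImageEnhance.Color(img).enhance(random.uniform(0.5, 1.5))
    if mode == 3:                                  # non-uniform scaling
        fx, fy = random.uniform(0.8, 1.2), random.uniform(0.8, 1.2)
        return img.resize((int(img.width * fx), int(img.height * fy)))
    return img.rotate(random.choice([90, 180, 270]), expand=True)

augmented = augment(Image.new("RGB", (256, 256), "gray"))
```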
3.5 After learning and training the deep learning network model on the training set, training is complete and a deep learning network with the detection function is obtained.
Step 4, carrying out dynamic grid division on the remote sensing image to be detected to obtain a plurality of small grid images.
4.1 Dynamic grid division is performed according to the map container; as shown in FIG. 4, the coordinate system in the map container is the real coordinate system (generally a projected coordinate system).
The maxima and minima of the target area to be detected in the X and Y directions of the remote sensing image can be obtained from the coordinate system in the figure: $X_{max}$ is the maximum X coordinate of the target area to be detected, $X_{min}$ the minimum X coordinate, $Y_{max}$ the maximum Y coordinate, and $Y_{min}$ the minimum Y coordinate.
4.2 The actual size of the map-container loading area of the remote sensing image at a given level is determined from the target area to be detected.
The invention specifically determines the actual size of the map-container loading area of the remote sensing image at level 18: $X_{width}$ is the actual length of the container loading area at level 18 for the target area to be detected, and $Y_{height}$ is the actual width of the container loading area at level 18.
Target detection and positioning over a large area can be realized based on grid division, but if the tile level of the detected imagery is too high, precision improves while the number of grid divisions becomes excessive, greatly reducing efficiency and failing production requirements. It is therefore important to select a proper image level in the detection process, determined mainly by the proportion of the target's size within the image.
4.3 The target area to be detected of the remote sensing image is divided according to the overlap rates in the X and Y directions, forming m rows and n columns of small grids after division:

$$m=\left\lceil \frac{Y_{max}-Y_{min}}{Y_{height}\,(1-overlap_y)} \right\rceil,\qquad n=\left\lceil \frac{X_{max}-X_{min}}{X_{width}\,(1-overlap_x)} \right\rceil$$

where $\lceil\cdot\rceil$ denotes rounding up to an integer; $overlap_x$ is the overlap rate in the X direction, determined by the ratio of the target size to the length of the target area to be detected in the X direction, and $overlap_y$ is the overlap rate in the Y direction, determined by the ratio of the target size to the length of the target area to be detected in the Y direction.

The overlap rates $overlap_x$ and $overlap_y$ in the X and Y directions are expressed as:

$$overlap_x=\frac{l_{object}}{X_{width}},\qquad overlap_y=\frac{w_{object}}{Y_{height}}$$

where $l_{object}$ is the actual length of the target and $w_{object}$ is the actual width of the target.
4.4 After division, the coordinate maxima and minima of the small grids of the target area to be detected in the X and Y directions are:

$$\begin{aligned} x^{ij}_{min}&=X_{min}+(j-1)\,X_{width}\,(1-overlap_x), & x^{ij}_{max}&=x^{ij}_{min}+X_{width},\\ y^{ij}_{min}&=Y_{min}+(i-1)\,Y_{height}\,(1-overlap_y), & y^{ij}_{max}&=y^{ij}_{min}+Y_{height}, \end{aligned}$$

where $x^{ij}_{max}$ is the maximum X coordinate of the grid in row i, column j, $x^{ij}_{min}$ the minimum X coordinate, $y^{ij}_{max}$ the maximum Y coordinate, and $y^{ij}_{min}$ the minimum Y coordinate; i = 1, 2, …, m and j = 1, 2, …, n.
4.5 After each frame is calculated according to the formulas in step 4.4, the small grids are drawn according to the frame coordinates, as sketched below.
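A minimal sketch of steps 4.3-4.5 in Python; since the original formulas survive only as images, the overlap and stride expressions below are the reconstruction used above and should be read as an assumption:

```python
import math

def grid_divide(x_min, x_max, y_min, y_max,
                x_width, y_height, l_object, w_object):
    """Adaptive grid division: overlap rates from the target's actual size,
    then m x n overlapping grid frames covering the region."""
    overlap_x = l_object / x_width          # X-direction overlap rate
    overlap_y = w_object / y_height         # Y-direction overlap rate
    step_x = x_width * (1 - overlap_x)      # stride between adjacent grids
    step_y = y_height * (1 - overlap_y)
    m = math.ceil((y_max - y_min) / step_y) # rows ("upward integer")
    n = math.ceil((x_max - x_min) / step_x) # columns
    frames = []
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            gx = x_min + (j - 1) * step_x
            gy = y_min + (i - 1) * step_y
            frames.append((gx, gy, gx + x_width, gy + y_height))
    return m, n, frames

# Toy region: 10 km x 6 km, 1 km tiles, a 100 m x 80 m target.
m, n, frames = grid_divide(0, 10_000, 0, 6_000, 1_000, 1_000, 100, 80)
print(m, n, len(frames))
```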
Step 5, sequentially inputting the small grid images into the deep learning network model for target detection.
The grid of the target area to be detected is drawn according to step 4; under the action of the map control, the small grids automatically load the level-18 tile images of the remote sensing image to be detected, yielding the frame (i.e., position information) of each range to be detected; combining the tile images with the remote sensing image to be detected determines the detection and positioning of the target. That is, airplanes in the remote sensing image are detected and identified, and their specific positions are located. Finally, duplicate detections in the overlapping grid portions are filtered out by a spatial analysis method, for example as sketched below.
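The patent does not spell out the spatial analysis used for this filtering; a common choice, shown here as an assumed illustration, is IoU-based suppression of duplicate boxes coming from neighbouring grids:

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def filter_duplicates(detections, thresh=0.5):
    """Keep the higher-scoring box when two detections from overlapping
    grids cover the same target. detections: [(box, score), ...]"""
    detections = sorted(detections, key=lambda d: d[1], reverse=True)
    kept = []
    for box, score in detections:
        if all(iou(box, k[0]) < thresh for k in kept):
            kept.append((box, score))
    return kept
```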
As noted above, grid division enables target detection and positioning over a large area, but an overly high tile level multiplies the number of grids and greatly reduces efficiency, while a low level sacrifices precision. To resolve this, a multi-level target detection process is designed: the target is first detected in low-level imagery, and then high-level judgment and accurate positioning are performed centered on the target to be detected; the whole process is shown in FIG. 5.
Experimental analysis:
a target detection service system under a B/S framework is developed and constructed based on an ArcGIS API for JavaScript component, a TensorFlow deep learning framework and a Django framework, and large-range target detection is realized through client operation. The server-side operating system is Ubuntu16.04, 16G memory and is configured with GTX1080Ti video card. Test comparative analysis was performed using 3420 test sets in step 1.
Firstly, comparing the training precision of the SSD-based deep learning network model with that of the Faster R-CNN-based deep learning network model
The SSD adopts an Inception v3 network for feature extraction, whose network parameters are slightly more complex than ResNet-50's. The invention trains each of the four anchor generation mechanisms and verifies precision on the test set.
For precision verification, the average precision (AP) of each category is calculated on the test set, and the mean average precision (mAP) is taken as the measure of the model's training result. Briefly, a recall-precision curve can be plotted for each category; AP is the area under that curve, and mAP is the average of the APs over multiple categories.
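A minimal sketch of that computation, assuming detections have already been matched to ground truth; the all-point interpolation shown is one common AP convention, and the patent does not state which variant it uses:

```python
import numpy as np

def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the precision-recall curve for one category."""
    order = np.argsort(-np.asarray(scores))
    tp = np.asarray(is_true_positive, dtype=float)[order]
    cum_tp = np.cumsum(tp)
    recall = cum_tp / num_ground_truth
    precision = cum_tp / np.arange(1, len(tp) + 1)
    ap, prev_r = 0.0, 0.0
    for r, p in zip(recall, precision):   # integrate precision over recall
        ap += (r - prev_r) * p
        prev_r = r
    return ap

ap_plane = average_precision([0.9, 0.8, 0.6], [1, 0, 1], num_ground_truth=2)
ap_field = average_precision([0.95, 0.7], [1, 1], num_ground_truth=2)
print((ap_plane + ap_field) / 2)   # mAP over the two categories
```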
The test results of the two models and their corresponding parameter configurations are compared in Table 2:
TABLE 2
[Table 2 compares the test precision of the SSD-based and Faster R-CNN-based models under their parameter configurations; it appears only as an image in the original document.]
The comparative analysis of Table 2 shows that the mean detection accuracy for airplanes is lower than for track and field grounds, because airplane targets are generally smaller: before detection starts, the model must resize the image, so the features of many small airplanes disappear and such targets are likely to be missed or falsely detected. The target detection precision of the Faster R-CNN-based deep learning network model is superior to the SSD framework's, with a particularly clear advantage on airplane targets, showing that the Faster R-CNN framework's small-target detection outperforms the SSD framework's.
Secondly, comparing the performance of the four different anchor generation mechanisms of Faster R-CNN
FIG. 6 shows the loss function training behavior under the four anchor mechanisms. The total loss converges for all four; after 400,000 training steps the total loss of every mechanism falls below 0.1, and the RPN loss in particular is kept small, indicating a good training effect. The loss functions of anchor mechanism 3 fluctuate the least, so its training effect is the best.
Table 2 shows that all four Faster R-CNN generation mechanisms reach an mAP of nearly 90%. Comparing anchor mechanism 4 with anchor mechanism 1: mechanism 4 adds a smaller area scaling factor and its mAP is better, indicating that smaller target candidate frames improve the detection precision of small targets. Comparing anchor mechanism 3 with anchor mechanism 4: mechanism 3 adds an aspect ratio coefficient of 0.25 relative to mechanism 4, and the mAP values are comparable, indicating that for this data set adding aspect ratio coefficients does not raise precision. Comparing anchor mechanism 3 with anchor mechanism 2: their area scaling factors are consistent and their aspect ratio coefficients are equal in number but different in value, and the final mAP values differ, indicating that suitable coefficients are needed to ensure precision.
From the perspective of the frame prediction accuracy of the target candidate frames, anchor mechanisms 2 and 3 are comparable and superior to mechanisms 1 and 4, mainly because adding aspect ratio coefficients gives a better effect when predicting targets with a large aspect ratio (such as a track and field ground); FIGS. 7-a through 7-d show detection results at several different target scales. Therefore, combining the mAP values and the frame prediction effect, anchor mechanism 3 is selected as the anchor mechanism of the Faster R-CNN target detection framework.
Thirdly, comparing detection precision and efficiency at three image levels (17, 18, 19)
All airplane targets at Xinzheng Airport were detected based on the dynamic grid division mode, over a detection range of 21.5 square kilometers, comparing detection accuracy and efficiency at the three image levels (17, 18, and 19). The grid divisions are shown in FIGS. 8-a, 8-b, and 8-c; the overlap between grids is generated dynamically from the target size and the detection level. The detection results are counted and recognition quality judged by accuracy and recall: recall is the number of correctly identified airplanes divided by the total number of airplanes in the test image, and accuracy is the number of correctly identified airplanes divided by the number of detections the model identified as airplanes; an identification is judged correct when IoU (the ratio of the intersection area of the detection frame and the real frame to their union area) exceeds 0.5. The statistical results are shown in Table 3, where detection time is the total of image tile loading, network transmission, detection execution, and result feedback.
TABLE 3
[Table 3 reports recall, accuracy, and detection time at image levels 17, 18, and 19; it appears only as an image in the original document.]
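As a concrete reading of the recall and accuracy definitions above, the sketch below scores a set of detections against ground-truth boxes; it reuses the iou helper from the duplicate-filtering sketch earlier, and the 0.5 threshold is the one stated in the text:

```python
def score_detections(detections, ground_truth, iou_thresh=0.5):
    """Return (recall, accuracy) for one test image.
    detections / ground_truth: lists of (x1, y1, x2, y2) boxes.
    Assumes the iou() helper defined in the filtering sketch above."""
    matched = set()
    correct = 0
    for det in detections:
        for k, gt in enumerate(ground_truth):
            if k not in matched and iou(det, gt) > iou_thresh:
                matched.add(k)      # each real airplane matched at most once
                correct += 1
                break
    recall = correct / len(ground_truth) if ground_truth else 0.0
    accuracy = correct / len(detections) if detections else 0.0
    return recall, accuracy

r, a = score_detections([(0, 0, 10, 10), (50, 50, 60, 60)],
                        [(1, 1, 11, 11), (100, 100, 120, 120)])
print(r, a)   # 0.5, 0.5
```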
As Table 3 shows, although the training sample set contains no high-resolution image data, good results are still achieved. Detection efficiency is highest at level 17, but because the airplane targets occupy too small a proportion of the image, the resize operation required before the image enters the feature extraction network makes small airplane target features disappear, which is the main reason for the low recognition rate. At level 18, detection efficiency is lower than at level 17 but precision improves greatly. Precision is highest at level 19, with a recall of 97.6% and an accuracy of 95%, but the number of divided grids is so large that efficiency drops sharply. Therefore it is very important to select an appropriate detection image level according to the size of the detection target. In the experiment, level 18 is selected as the grid detection image level, and the frame of the detected target is accurately positioned at level 19 using the accurate positioning method designed herein. FIGS. 9-a and 9-b show the detection bounding boxes at the two levels; the position accuracy of the level-19 image borders is clearly better than that of the level-18 borders, achieving an accurate positioning effect.
In general, the Faster R-CNN target detection framework has higher detection precision; a reasonable data enhancement scheme enriches the training sample set, and the anchor generation mechanism designed herein can realize small-target detection. However, if the resolution of the detected imagery is low, the recall rate still remains low, so selecting a suitable image resolution for detection is also important in balancing efficiency and precision. Experiments verify that the method designed in this invention, based on dynamic grid division and multi-level imagery, can realize target detection and accurate frame positioning over a large area, and offers a useful reference for fast remote sensing target retrieval based on deep learning and its application in industrial production.
Target detection system embodiment:
the target detection system of the remote sensing image comprises a memory, a processor and a computer program which is stored in the memory and can be run on the processor, wherein the processor realizes the target detection method when running the computer program, and the process of the target detection method is described in detail in the embodiment and is not described herein again.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit its protection scope. Although the present application is described in detail with reference to the above embodiments, those skilled in the art should understand that various changes, modifications, or equivalents of the embodiments can be made after reading the present application, and these fall within the protection scope of the claims of the present invention.

Claims (10)

1. A target detection method for remote sensing images, characterized by comprising the following steps:
(1) training to obtain a deep learning network;
(2) acquiring the size of a target area to be detected and the actual size of a target;
(3) dividing the target area to be detected according to the proportional relation between the size of the target area to be detected and the actual size of the target to obtain a plurality of small grids;
(4) sequentially inputting all the small grid images obtained in step (3) into the deep learning network obtained in step (1) for target detection, finally obtaining a detection result.
2. The target detection method for remote sensing images according to claim 1, wherein the target area to be detected in step (3) is divided into m rows and n columns of small grids:

$$m=\left\lceil \frac{Y_{max}-Y_{min}}{Y_{height}\,(1-overlap_y)} \right\rceil,\qquad n=\left\lceil \frac{X_{max}-X_{min}}{X_{width}\,(1-overlap_x)} \right\rceil$$

where $\lceil\cdot\rceil$ denotes rounding up to an integer; $X_{max}$ is the maximum X coordinate of the target area to be detected, $X_{min}$ the minimum X coordinate, $Y_{max}$ the maximum Y coordinate, and $Y_{min}$ the minimum Y coordinate; $X_{width}$ is the actual length of the map-container loading area at a given image level for the target area to be detected, and $Y_{height}$ its actual width; $overlap_x$ is the overlap rate in the X direction, determined by the ratio of the target size to the length of the target area to be detected in the X direction, and $overlap_y$ is the overlap rate in the Y direction, determined by the ratio of the target size to the length of the target area to be detected in the Y direction.
3. The target detection method for remote sensing images according to claim 2, wherein the overlap rate in the X direction and the overlap rate in the Y direction are:

$$overlap_x=\frac{l_{object}}{X_{width}},\qquad overlap_y=\frac{w_{object}}{Y_{height}}$$

where $l_{object}$ is the actual length of the target and $w_{object}$ is the actual width of the target.
4. The target detection method for remote sensing images according to claim 3, wherein the coordinates of the divided small grids are expressed as

$$\begin{aligned} x^{ij}_{min}&=X_{min}+(j-1)\,X_{width}\,(1-overlap_x), & x^{ij}_{max}&=x^{ij}_{min}+X_{width},\\ y^{ij}_{min}&=Y_{min}+(i-1)\,Y_{height}\,(1-overlap_y), & y^{ij}_{max}&=y^{ij}_{min}+Y_{height}, \end{aligned}$$

where $x^{ij}_{max}$ is the maximum X coordinate of the grid in row i, column j, $x^{ij}_{min}$ the minimum X coordinate, $y^{ij}_{max}$ the maximum Y coordinate, and $y^{ij}_{min}$ the minimum Y coordinate; i = 1, 2, …, m and j = 1, 2, …, n.
5. The target detection method for remote sensing images according to claim 2, wherein the image level is 18.
6. The target detection method for remote sensing images according to any one of claims 1-5, wherein the deep learning network comprises a Faster R-CNN network and an RPN network.
7. The target detection method for remote sensing images according to claim 6, wherein the Faster R-CNN network comprises a deep residual network, and the deep residual network has 50 layers.
8. The target detection method for remote sensing images according to claim 7, wherein the RPN network outputs rectangular candidate frames by convolution.
9. The target detection method for remote sensing images according to claim 8, wherein the generation mechanism of rectangular candidate frames in the RPN network is as follows: the number of rectangular candidate frames is determined by the number of area scaling factors and the number of aspect ratios of the training sample images.
10. A target detection system for remote sensing images, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the target detection method for remote sensing images according to any one of claims 1-9 when executing the computer program.
CN201911071646.8A 2019-11-05 2019-11-05 Target detection method and system for remote sensing image Active CN110826485B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911071646.8A CN110826485B (en) 2019-11-05 2019-11-05 Target detection method and system for remote sensing image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911071646.8A CN110826485B (en) 2019-11-05 2019-11-05 Target detection method and system for remote sensing image

Publications (2)

Publication Number Publication Date
CN110826485A true CN110826485A (en) 2020-02-21
CN110826485B CN110826485B (en) 2023-04-18

Family

ID=69552492

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911071646.8A Active CN110826485B (en) 2019-11-05 2019-11-05 Target detection method and system for remote sensing image

Country Status (1)

Country Link
CN (1) CN110826485B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112560933A (en) * 2020-12-10 2021-03-26 中邮信息科技(北京)有限公司 Model training method and device, electronic equipment and medium
CN115063428A (en) * 2022-08-18 2022-09-16 中国科学院国家空间科学中心 Spatial dim small target detection method based on deep reinforcement learning

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180089530A1 (en) * 2015-05-11 2018-03-29 Siemens Healthcare Gmbh Method and system for landmark detection in medical images using deep neural networks
WO2019144575A1 (en) * 2018-01-24 2019-08-01 中山大学 Fast pedestrian detection method and device
CN108304873A (en) * 2018-01-30 2018-07-20 深圳市国脉畅行科技股份有限公司 Object detection method based on high-resolution optical satellite remote-sensing image and its system
CN108427912A (en) * 2018-02-05 2018-08-21 西安电子科技大学 Remote sensing image object detection method based on the study of dense target signature
CN108460341A (en) * 2018-02-05 2018-08-28 西安电子科技大学 Remote sensing image object detection method based on integrated depth convolutional network
CN109344774A (en) * 2018-10-08 2019-02-15 国网经济技术研究院有限公司 Heat power station target identification method in remote sensing image
CN109800637A (en) * 2018-12-14 2019-05-24 中国科学院深圳先进技术研究院 A kind of remote sensing image small target detecting method
CN109919108A (en) * 2019-03-11 2019-06-21 西安电子科技大学 Remote sensing images fast target detection method based on depth Hash auxiliary network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
孙梓超; 谭喜成; 洪泽华; 董华萍; 沙宗尧; 周松涛; 杨宗亮: "Target detection in remote sensing imagery based on deep convolutional neural networks" (基于深度卷积神经网络的遥感影像目标检测) *


Also Published As

Publication number Publication date
CN110826485B (en) 2023-04-18

Similar Documents

Publication Publication Date Title
CN107871119B (en) Target detection method based on target space knowledge and two-stage prediction learning
CN109801293B (en) Remote sensing image segmentation method and device, storage medium and server
CN108460382B (en) Optical remote sensing image ship detection method based on deep learning single-step detector
CN111563473B (en) Remote sensing ship identification method based on dense feature fusion and pixel level attention
CN107665498B (en) Full convolution network aircraft detection method based on typical example mining
CN109492561B (en) Optical remote sensing image ship detection method based on improved YOLO V2 model
CN111179217A (en) Attention mechanism-based remote sensing image multi-scale target detection method
CN110796048B (en) Ship target real-time detection method based on deep neural network
CN106228125B (en) Method for detecting lane lines based on integrated study cascade classifier
CN108304761A (en) Method for text detection, device, storage medium and computer equipment
CN111914924B (en) Rapid ship target detection method, storage medium and computing equipment
CN109242019B (en) Rapid detection and tracking method for optical small target on water surface
CN110826485B (en) Target detection method and system for remote sensing image
CN112766184B (en) Remote sensing target detection method based on multi-level feature selection convolutional neural network
CN110674674A (en) Rotary target detection method based on YOLO V3
CN111144234A (en) Video SAR target detection method based on deep learning
CN116563726A (en) Remote sensing image ship target detection method based on convolutional neural network
CN114565824A (en) Single-stage rotating ship detection method based on full convolution network
CN110633727A (en) Deep neural network ship target fine-grained identification method based on selective search
CN114241314A (en) Remote sensing image building change detection model and algorithm based on CenterNet
CN113486819A (en) Ship target detection method based on YOLOv4 algorithm
CN113628180A (en) Semantic segmentation network-based remote sensing building detection method and system
CN110084203B (en) Full convolution network airplane level detection method based on context correlation
CN110363792A (en) A kind of method for detecting change of remote sensing image based on illumination invariant feature extraction
CN112001388B (en) Method for detecting circular target in PCB based on YOLOv3 improved model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant