CN114842353A - Neural network remote sensing image target detection method based on self-adaptive target direction - Google Patents

Neural network remote sensing image target detection method based on self-adaptive target direction

Info

Publication number: CN114842353A (application CN202210484478.0A)
Authority: CN (China)
Prior art keywords: target, pixels, feature map, remote sensing, neural network
Legal status: Granted; Active
Other languages: Chinese (zh)
Other versions: CN114842353B
Inventors: 董志鹏, 刘焱雄, 冯义楷, 王艳丽, 陈义兰
Current and original assignee: First Institute of Oceanography MNR
Application filed by First Institute of Oceanography MNR
Priority to CN202210484478.0A; published as CN114842353A, granted and published as CN114842353B

Classifications

    • G06V20/13 — Scenes; terrestrial scenes: satellite images
    • G06N3/045 — Neural network architectures: combinations of networks
    • G06N3/08 — Neural networks: learning methods
    • G06V10/764 — Image or video recognition using machine learning: classification
    • G06V10/82 — Image or video recognition using machine learning: neural networks
    • G06V2201/07 — Indexing scheme: target detection
    • Y02A90/10 — ICT supporting adaptation to climate change


Abstract

The invention relates to a neural network remote sensing image target detection method based on self-adaptive target direction, and belongs to the technical field of remote sensing image target recognition and information extraction. The invention first proposes a self-adaptive target-direction region regression method based on anchor points and five parameters, which can regress target regions of arbitrary orientation in high-resolution remote sensing images. Second, building on this regression idea, a convolutional neural network remote sensing image target detection method based on self-adaptive target direction is proposed; it regresses target regions in arbitrary directions, accurately classifies target categories, and yields accurate high-resolution remote sensing image target detection results. The method is simple, reliable, accurate and easy to implement, and can be widely applied to remote sensing image target recognition and information extraction.

Description

Neural network remote sensing image target detection method based on self-adaptive target direction
Technical Field
The invention relates to a neural network remote sensing image target detection method based on a self-adaptive target direction, and belongs to the technical field of remote sensing image target identification and information extraction.
Background
High-resolution remote sensing image target detection is a key technology for automatically extracting, analyzing and understanding image information in high-resolution earth observation systems, and plays an important role in military reconnaissance, ocean monitoring, precision strikes and other applications based on high-resolution remote sensing images. It refers to the process of accurately locating target regions of interest in an image and accurately classifying the target categories. Scholars at home and abroad have conducted extensive research on this problem; many methods follow the pipeline of extracting target candidate regions, obtaining candidate-region features, and classifying those features. In this pipeline, target candidate regions are first obtained with a sliding window, a selective-search algorithm, the EdgeBoxes algorithm, or similar; candidate-region features are then extracted with hand-crafted descriptors such as histograms of oriented gradients, local binary patterns, and the scale-invariant feature transform; finally, the features are fed as vectors into a traditional classifier, such as a support vector machine, AdaBoost, or a decision tree, to detect the targets in the image. This approach achieves good results on specific target detection tasks.
However, because remote sensing satellites image under complex and changeable conditions and produce large numbers of images every day, this approach struggles to scale to remote sensing target detection over large data volumes under varying conditions, and its robustness and generality are weak.
In recent years, deep learning has attracted extensive attention from researchers in different fields. Convolutional neural networks (CNNs) are among the most popular deep learning models: image features need not be designed by hand, and because each feature map shares the convolution-kernel parameters of its "local receptive fields", a CNN has fewer parameters than other network models. Moreover, the convolutional neural network can automatically extract and learn effective image features from massive data and labels through its specialized network structure, and, given sufficient training data, the model generalizes well and maintains robustness and universality under complex and variable conditions. Convolutional neural network models have therefore been widely used in digital image processing. Since the region-based CNN (R-CNN) detection architecture achieved better results than conventional detectors (built on hand-designed image features) in the 2014 PASCAL VOC detection challenge, convolutional-neural-network detection architectures have developed rapidly, for example Fast R-CNN, SSD (Single Shot MultiBox Detector) and the YOLO (You Only Look Once) architectures. However, all of these networks regress the target area with a horizontal box, which is difficult to apply effectively to remote sensing image targets with large aspect ratios and dense distribution, and many targets are missed.
To address these problems, the invention provides a convolutional neural network remote sensing image target detection method based on self-adaptive target direction.
Disclosure of Invention
The invention provides a convolutional neural network remote sensing image target detection method based on self-adaptive target direction, aimed at the problem that existing convolutional neural network target detection frameworks are difficult to apply effectively to targets with large aspect ratios and dense distribution.
The invention relates to a neural network remote sensing image target detection method based on a self-adaptive target direction, which comprises the following steps of:
s1, self-adaptive direction target region regression: expressing the target area by using five parameters, and realizing regression on the target area in any direction based on the anchor point;
expressing the target region by five parameters (x, y, w, h, θ), wherein (x, y) is the coordinate of the central point of the target region, w and h are the width and the height of the target region respectively, θ is the included angle between the x axis and the corner point with the minimum y value among the four corner points of the target region, and θ ∈ (0, π/2);
and (3) regressing the target area in any direction based on the anchor point and the five parameters, wherein the calculation formula is as follows:
[Equations (1)–(5), rendered as images in the original document, map the network outputs (x_0, x_1, x_2, x_3, x_4) at feature-map position (i, j) and the anchor dimensions (a_w, a_h) to the five regression values (O_x, O_y, O_w, O_h, O_θ).]
wherein: (O_x, O_y, O_w, O_h, O_θ) are the regressed five-parameter (x, y, w, h, θ) values of the target region; a_w and a_h are the width and height of the anchor point, respectively; (x_0, x_1, x_2, x_3, x_4) are the network output values of the convolutional neural network at position (i, j) of the feature map;
and calculating and obtaining the coordinates of four corner points of the target region based on the five parameters (x, y, w, h and theta) of the target region, wherein the calculation formula is as follows:
[Equation (6), rendered as an image in the original document, computes the coordinates of the four corner points from (x, y, w, h, θ).]
wherein: (x_P1, y_P1), (x_P2, y_P2), (x_P3, y_P3) and (x_P4, y_P4) are the coordinates of the four corner points P1, P2, P3 and P4 of the target region, respectively;
s2, convolutional neural network target detection of self-adaptive target direction: the target detection architecture of the convolutional neural network based on the self-adaptive target direction can realize target region regression in any direction and accurate classification of target categories; the training loss of the target detection architecture is calculated as follows:
Loss = L_coord + L_class + L_obj    (7)
[Equations (8)–(11), rendered as images in the original document; (8)–(10) define the coordinate loss L_coord, the class loss L_class and the confidence loss L_obj, and (11) defines the function used to process the raw network outputs.]
wherein: loss is the training Loss of the target detection architecture; l is coord 、L class And L obj Target coordinates, categories and confidence loss, respectively; m is the width or height of the feature map; n is the number of anchor points at each location of the feature map;
Figure BDA0003628701100000041
indicating whether the anchor point labeled k at the location of the feature map (i, j) is a positive sample, if so
Figure BDA0003628701100000042
Is 1, otherwise is 0; w is a ij And h ij Width and height of the target region for the true value corresponding to anchor point labeled k at the location of feature map (i, j);(x ij ,y ij ,w ij ,h ijij ) The five parameters of the target area are true values;
Figure BDA0003628701100000043
a predicted value of a network architecture of a target area five-parameter generated for an anchor point with a label of k; w is a a And h a Width and height of anchor point labeled k; r is the classification number of the network architecture;
Figure BDA0003628701100000044
generating predicted values of different categories of the target area for the network measurement architecture pair;
Figure BDA0003628701100000045
a confidence predictor targeting the target region generated based on the anchor point labeled k.
Preferably, in step S2, Darknet-53 is used in the target detection architecture to extract feature maps of the image, and the target region is trained and regressed on three feature maps of different scales.
Preferably, in step S2, the target region is trained and regressed based on three anchor points at each position of the feature map with the size of 13 × 13 pixels, where the sizes of the anchor points are 116 × 90 pixels, 156 × 198 pixels and 373 × 326 pixels, respectively;
upsampling the feature map with the size of 13 pixels multiplied by 13 pixels to 26 pixels multiplied by 26 pixels, and combining the upsampled feature map with the size of 26 pixels multiplied by 26 pixels in a network architecture to form a new feature map with the size of 26 pixels multiplied by 26 pixels;
training and regressing the target area based on three anchor points at each position of a new feature map with the size of 26 pixels multiplied by 26 pixels, wherein the sizes of the three anchor points are respectively 30 pixels multiplied by 61 pixels, 62 pixels multiplied by 45 pixels and 59 pixels multiplied by 119 pixels;
upsampling the new feature map with the size of 26 pixels multiplied by 26 pixels to 52 pixels multiplied by 52 pixels, and combining the upsampled feature map with the size of 52 pixels multiplied by 52 pixels in the network architecture to form a new feature map with the size of 52 pixels multiplied by 52 pixels;
the target area is trained and regressed at each position of the new feature map with dimensions of 52 pixels × 52 pixels based on three anchor points, which have dimensions of 10 pixels × 13 pixels, 16 pixels × 30 pixels and 33 pixels × 23 pixels, respectively.
Preferably, in the step S2, a multi-scale training concept is adopted to train the target detection architecture, and if the intersection ratio of one anchor point to the true-value target region is the largest of the intersection ratios of all anchor points to the true-value target region in the training process, the anchor point region is marked as a positive sample; the anchor points remaining that are not marked as positive samples are marked as negative samples.
Preferably, in the step S2, in the target detection architecture testing stage, all the x, y, θ, confidence level and category prediction values in the network are processed by using formula (11).
Preferably, in the step S2, the five parameters of the target region generated from each anchor point are obtained using equations (1)-(5); a generated target region is retained if its confidence is greater than the set threshold and removed otherwise; to reduce redundancy in the target detection result, a non-maximum suppression algorithm with an intersection-over-union threshold of 0.3 is applied to the retained target regions; and the target regions remaining after non-maximum suppression are the target detection result of the target detection framework.
The beneficial effects of the invention are: the neural network remote sensing image target detection method based on the self-adaptive target direction can effectively solve the problem that the existing convolutional neural network target detection framework adopts a horizontal frame to miss detection of the remote sensing image targets which are large in length-width ratio and densely distributed, and obtains an accurate high-resolution remote sensing image target detection result.
Drawings
FIG. 1 is a diagram of an object detection architecture of the present invention.
Fig. 2 is a schematic diagram of the present invention based on five-parameter representation of target areas.
FIG. 3 is a schematic diagram of the regression of the target region based on the anchor point according to the present invention.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.
Example (b):
the technical scheme of the invention can adopt a computer software mode to support the automatic operation process. The technical scheme of the invention is explained in detail in the following by combining the drawings and the embodiment.
(1) Self-adaptive direction target area regression: and expressing the target area by using five parameters, and realizing regression on the target area in any direction based on the anchor point.
In the invention, the target region is represented by five parameters (x, y, w, h, θ), as shown in FIG. 2, wherein (x, y) is the coordinate of the central point of the target region, w and h are the width and height of the target region respectively, θ is the included angle between the x axis and the corner point with the minimum y value among the four corner points of the target region, and θ ∈ (0, π/2).
The target area in any direction is regressed based on the anchor point and the five parameters, as shown in fig. 3, the calculation formula is as follows:
[Equations (1)–(5), rendered as images in the original document, map the network outputs (x_0, x_1, x_2, x_3, x_4) at feature-map position (i, j) and the anchor dimensions (a_w, a_h) to the five regression values (O_x, O_y, O_w, O_h, O_θ).]
wherein: (O_x, O_y, O_w, O_h, O_θ) are the regressed five-parameter (x, y, w, h, θ) values of the target region; a_w and a_h are the width and height of the anchor point, respectively; and (x_0, x_1, x_2, x_3, x_4) are the network output values of the convolutional neural network at position (i, j) of the feature map.
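Equations (1)–(5) are reproduced only as images in this text. As an illustration, the following Python sketch fills in a plausible YOLOv3-style decoding consistent with the surrounding description (sigmoid offsets for the centre, exponential anchor scaling for width and height, and a sigmoid mapped onto (0, π/2) for θ). The exact published formulas may differ, and the function name is an assumption, not the patent's.

```python
import math

def decode_anchor_output(net_out, i, j, anchor_w, anchor_h):
    """Hypothetical decoding of the five network outputs (x0..x4) at
    feature-map cell (i, j) into (Ox, Oy, Ow, Oh, Otheta).

    Assumed YOLOv3-style form, not the patent's exact equations (1)-(5),
    which appear only as images in the source text.
    """
    x0, x1, x2, x3, x4 = net_out
    sigmoid = lambda v: 1.0 / (1.0 + math.exp(-v))
    o_x = i + sigmoid(x0)                # centre x, in feature-map cells
    o_y = j + sigmoid(x1)                # centre y
    o_w = anchor_w * math.exp(x2)        # width, scaled from the anchor width a_w
    o_h = anchor_h * math.exp(x3)        # height, scaled from the anchor height a_h
    o_theta = sigmoid(x4) * math.pi / 2  # angle theta in (0, pi/2)
    return o_x, o_y, o_w, o_h, o_theta
```

Under this assumed form, all-zero network outputs decode to the cell centre with the anchor's own size and θ = π/4.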
Coordinates of four corner points of the target area can be calculated and obtained based on five parameters (x, y, w, h and theta) of the target area, and the calculation formula is as follows:
[Equation (6), rendered as an image in the original document, computes the coordinates of the four corner points from (x, y, w, h, θ).]
wherein: (x_P1, y_P1), (x_P2, y_P2), (x_P3, y_P3) and (x_P4, y_P4) are the coordinates of the four corner points P1, P2, P3 and P4 of the target region, respectively.
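Equation (6) is likewise an image in this text; the computation it describes is standard rotated-rectangle geometry. The sketch below (an illustration, with an assumed corner ordering that may differ from the patent's P1–P4 convention) rotates the axis-aligned corner offsets by θ about the centre (x, y):

```python
import math

def corners_from_five_params(x, y, w, h, theta):
    """Corner coordinates of a target region given the five parameters
    (x, y, w, h, theta): centre, width, height and rotation angle.

    Standard rotated-rectangle geometry; the patent's exact corner
    ordering P1..P4 (equation (6), an image in the source) may differ.
    """
    c, s = math.cos(theta), math.sin(theta)
    offsets = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each axis-aligned offset by theta and translate to the centre.
    return [(x + dx * c - dy * s, y + dx * s + dy * c) for dx, dy in offsets]
```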
(2) And (3) convolutional neural network target detection of self-adaptive target direction: the target detection architecture of the convolutional neural network based on the self-adaptive target direction can realize target region regression in any direction and accurate classification of target categories.
The target detection architecture of the convolutional neural network based on self-adaptive target direction is shown in fig. 1. In the target detection framework, Darknet-53 is used to extract feature maps of the image, and the target region is trained and regressed on feature maps at three different scales. The target region is trained and regressed at each position of a feature map of size 13 pixels × 13 pixels based on three anchor points of size 116 pixels × 90 pixels, 156 pixels × 198 pixels and 373 pixels × 326 pixels, respectively.
The feature map with the size of 13 pixels × 13 pixels is up-sampled to 26 pixels × 26 pixels, and is combined with the feature map with the size of 26 pixels × 26 pixels in the network architecture to form a new feature map with the size of 26 pixels × 26 pixels. The target region is trained and regressed at each position of the new feature map with dimensions of 26 pixels × 26 pixels based on three anchor points, which have dimensions of 30 pixels × 61 pixels, 62 pixels × 45 pixels and 59 pixels × 119 pixels, respectively.
The new feature map with the size of 26 pixels × 26 pixels is up-sampled to 52 pixels × 52 pixels, and is combined with the feature map with the size of 52 pixels × 52 pixels in the network architecture to form a new feature map with the size of 52 pixels × 52 pixels. The target area is trained and regressed at each position of the new feature map with dimensions of 52 pixels × 52 pixels based on three anchor points, which have dimensions of 10 pixels × 13 pixels, 16 pixels × 30 pixels and 33 pixels × 23 pixels, respectively.
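The three scales and their anchor sizes described above can be collected in one place. The sketch below records them and counts the candidate regions the head evaluates per image (the dictionary and function names are ours; the sizes are those listed in the text, which coincide with the standard YOLOv3 anchor set):

```python
# Anchor sizes (width, height) in pixels at each feature-map scale,
# as listed in the description: coarse maps get large anchors.
ANCHORS_PER_SCALE = {
    13: [(116, 90), (156, 198), (373, 326)],  # 13 x 13 map: large targets
    26: [(30, 61), (62, 45), (59, 119)],      # 26 x 26 map (upsampled + merged)
    52: [(10, 13), (16, 30), (33, 23)],       # 52 x 52 map: small targets
}

def total_anchor_boxes():
    """Total candidate regions per image: three anchors at every
    position of each of the three feature maps."""
    return sum(size * size * len(anchors)
               for size, anchors in ANCHORS_PER_SCALE.items())
```

This gives 3 × (13² + 26² + 52²) = 10 647 candidate regions per image.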
Training the target detection architecture by adopting a multi-scale training thought, wherein if the intersection ratio of one anchor point to a true-value target area is the largest of the intersection ratios of all the anchor points to the true-value target area in the training process, the anchor point area is marked as a positive sample; the anchor points remaining that are not marked as positive samples are marked as negative samples. The training loss of the architecture of the present invention is calculated as follows:
Loss = L_coord + L_class + L_obj    (7)
[Equations (8)–(11), rendered as images in the original document; (8)–(10) define the coordinate loss L_coord, the class loss L_class and the confidence loss L_obj, and (11) defines the function used to process the raw network outputs.]
wherein: loss is the training Loss of the target detection architecture; l is coord 、L class And L obj Target coordinates, categories and confidence loss, respectively; m is the width or height of the feature map; n is the number of anchor points at each location of the feature map;
Figure BDA0003628701100000075
indicating whether the anchor point labeled k at the location of the feature map (i, j) is a positive sample, if so
Figure BDA0003628701100000076
Is 1, otherwise is 0; w is a ij And h ij Width and height of the target region for the true value corresponding to anchor point labeled k at the location of feature map (i, j); (x) ij ,y ij ,w ij ,h ijij ) The five parameters are true values of the target area;
Figure BDA0003628701100000077
a predicted value of a network architecture of a target area five-parameter generated for an anchor point with a label of k; w is a a And h a Width and height of anchor point labeled k; r is the classification number of the network architecture;
Figure BDA0003628701100000078
generating predicted values of different categories of the target area for the network measurement architecture pair;
Figure BDA0003628701100000079
a confidence predictor targeting the target region generated based on the anchor point labeled k.
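The positive-sample rule used during training (an anchor is positive exactly when its intersection-over-union with the true-value target region is the largest among all anchors) can be sketched as follows. For brevity this illustration uses axis-aligned IoU; the patent applies the rule to rotated target regions, whose IoU computation is more involved. Function names are ours.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    union = area(box_a) + area(box_b) - inter
    return inter / union if union > 0 else 0.0

def assign_positive(anchors, gt_box):
    """Return 1/0 labels: the single anchor with the highest IoU against
    the true-value region is positive, all remaining anchors negative."""
    ious = [iou(a, gt_box) for a in anchors]
    best = max(range(len(anchors)), key=lambda k: ious[k])
    return [1 if k == best else 0 for k in range(len(anchors))]
```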
In the target detection architecture test stage, all x, y, θ, confidence and class prediction values in the network are processed using formula (11). The five parameters of each target region generated from an anchor point are obtained using equations (1)-(5). A generated target region is retained if its confidence is greater than the set threshold and removed otherwise. To reduce redundancy in the target detection results, a non-maximum suppression algorithm with an intersection-over-union threshold of 0.3 is applied to the retained target regions. The target regions remaining after non-maximum suppression are the target detection result of the target detection framework.
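The test-time filtering just described (confidence thresholding followed by non-maximum suppression with an IoU threshold of 0.3) can be sketched as below. Axis-aligned IoU is used for brevity, whereas the patent suppresses rotated regions, and the confidence threshold of 0.5 is an assumed value, since the patent only refers to a "set threshold".

```python
def detect_postprocess(detections, iou_threshold=0.3, conf_threshold=0.5):
    """Greedy NMS over detections, each a dict with 'box' = (x1, y1, x2, y2)
    and 'conf'. Regions at or below conf_threshold are dropped; then the most
    confident region is kept and any remaining region overlapping it by
    more than iou_threshold is suppressed, repeatedly."""
    def iou(a, b):
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
        union = area(a) + area(b) - inter
        return inter / union if union > 0 else 0.0

    pool = sorted((d for d in detections if d['conf'] > conf_threshold),
                  key=lambda d: d['conf'], reverse=True)
    kept = []
    while pool:
        best = pool.pop(0)
        kept.append(best)
        pool = [d for d in pool if iou(d['box'], best['box']) <= iou_threshold]
    return kept
```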
The method can be widely applied to remote sensing image target identification and information extraction occasions.
The specific examples described herein are merely illustrative of the spirit of the invention. Those skilled in the art may make various modifications, additions or similar substitutions to the specific embodiments described herein without departing from the spirit of the invention or exceeding the scope defined in the appended claims.

Claims (6)

1. A neural network remote sensing image target detection method based on a self-adaptive target direction is characterized by comprising the following steps:
s1, self-adaptive direction target region regression: expressing the target area by using five parameters, and realizing regression of the target area in any direction based on the anchor point;
expressing the target region by five parameters (x, y, w, h, θ), wherein (x, y) is the coordinate of the central point of the target region, w and h are the width and the height of the target region respectively, θ is the included angle between the x axis and the corner point with the minimum y value among the four corner points of the target region, and θ ∈ (0, π/2);
and (3) regressing the target area in any direction based on the anchor point and the five parameters, wherein the calculation formula is as follows:
[Equations (1)–(5), rendered as images in the original document, map the network outputs (x_0, x_1, x_2, x_3, x_4) at feature-map position (i, j) and the anchor dimensions (a_w, a_h) to the five regression values (O_x, O_y, O_w, O_h, O_θ).]
wherein: (O_x, O_y, O_w, O_h, O_θ) are the regressed five-parameter (x, y, w, h, θ) values of the target region; a_w and a_h are the width and height of the anchor point, respectively; (x_0, x_1, x_2, x_3, x_4) are the network output values of the convolutional neural network at position (i, j) of the feature map;
and calculating and obtaining the coordinates of four corner points of the target region based on the five parameters (x, y, w, h and theta) of the target region, wherein the calculation formula is as follows:
[Equation (6), rendered as an image in the original document, computes the coordinates of the four corner points from (x, y, w, h, θ).]
wherein: (x_P1, y_P1), (x_P2, y_P2), (x_P3, y_P3) and (x_P4, y_P4) are the coordinates of the four corner points P1, P2, P3 and P4 of the target region, respectively;
s2, convolutional neural network target detection of self-adaptive target direction: the target detection architecture of the convolutional neural network based on the self-adaptive target direction can realize target region regression in any direction and accurate classification of target categories; the training loss of the target detection architecture is calculated as follows:
[Equations (7)–(11), rendered as images in the original document, give the total training loss Loss = L_coord + L_class + L_obj and its component coordinate, class and confidence losses, together with the output-processing function (11).]
wherein: loss is the training Loss of the target detection architecture; l is coord 、L class And L obj Target coordinates, categories and confidence loss, respectively; m is the width or height of the feature map; n is the number of anchor points at each location of the feature map;
Figure FDA0003628701090000026
indicating whether the anchor point labeled k at the location of the feature map (i, j) is a positive sample, if so
Figure FDA0003628701090000027
Is 1, otherwise is 0; w is a ij And h ij Width and height of the target region for the true value corresponding to anchor point labeled k at the location of feature map (i, j); (x) ij ,y ij ,w ij ,h ijij ) The five parameters of the target area are true values;
Figure FDA0003628701090000028
a predicted value of a network architecture of a target area five-parameter generated for an anchor point with a label of k; w is a a And h a Width and height of anchor point labeled k; r is the classification number of the network architecture;
Figure FDA0003628701090000029
generating predicted values of different categories of the target area for the network measurement architecture pair;
Figure FDA00036287010900000210
a confidence predictor targeting the target region generated based on the anchor point labeled k.
2. The neural network remote sensing image target detection method based on self-adaptive target direction according to claim 1, wherein in step S2, Darknet-53 is used in the target detection architecture to extract feature maps of the image, and the target region is trained and regressed on three feature maps of different scales.
3. The method for target detection based on the adaptive target direction neural network remote sensing image of claim 2, wherein in step S2, the target region is trained and regressed based on three anchor points at each position of the feature map with the size of 13 × 13 pixels, the three anchor points having the sizes of 116 × 90 pixels, 156 × 198 pixels and 373 × 326 pixels, respectively;
upsampling the feature map with the size of 13 pixels multiplied by 13 pixels to 26 pixels multiplied by 26 pixels, and combining the upsampled feature map with the size of 26 pixels multiplied by 26 pixels in the network architecture to form a new feature map with the size of 26 pixels multiplied by 26 pixels;
training and regressing the target area based on three anchor points at each position of a new feature map with the size of 26 pixels multiplied by 26 pixels, wherein the sizes of the three anchor points are respectively 30 pixels multiplied by 61 pixels, 62 pixels multiplied by 45 pixels and 59 pixels multiplied by 119 pixels;
upsampling the new feature map with the size of 26 pixels multiplied by 26 pixels to 52 pixels multiplied by 52 pixels, and combining the upsampled feature map with the size of 52 pixels multiplied by 52 pixels in the network architecture to form a new feature map with the size of 52 pixels multiplied by 52 pixels;
the target area is trained and regressed at each position of the new feature map with dimensions of 52 pixels × 52 pixels based on three anchor points, which have dimensions of 10 pixels × 13 pixels, 16 pixels × 30 pixels and 33 pixels × 23 pixels, respectively.
4. The neural network remote sensing image target detection method based on self-adaptive target direction according to claim 3, wherein in the step S2, a multi-scale training concept is adopted to train the target detection architecture; in the training process, if the intersection ratio of an anchor point to the true-value target region is the largest among the intersection ratios of all anchor points to the true-value target region, the anchor point region is marked as a positive sample; the remaining anchor points not marked as positive samples are marked as negative samples.
5. The neural network remote sensing image target detection method based on self-adaptive target direction according to claim 1, wherein in step S2, in the testing stage of the target detection architecture, all x, y, θ, confidence and category prediction values in the network are processed using formula (11).
6. The neural network remote sensing image target detection method based on self-adaptive target direction according to claim 5, wherein in step S2, the five parameters of each anchor-based generated target region are obtained using formulas (1)–(5); a generated target region is retained if its confidence exceeds the set threshold and removed otherwise; to reduce redundancy in the target detection result, non-maximum suppression with an intersection-over-union threshold of 0.3 is applied to the retained target regions; the target regions remaining after non-maximum suppression constitute the target detection result of the target detection architecture.
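The post-processing stage of claim 6 can be sketched as confidence filtering followed by greedy non-maximum suppression. The confidence threshold of 0.5 is an assumption (the claim only says "the set threshold"), and the boxes are simplified to axis-aligned form, whereas the patent's regions are oriented:

```python
import numpy as np

CONF_THRESHOLD = 0.5   # assumed value; the claim leaves the threshold unspecified
NMS_IOU = 0.3          # intersection-over-union threshold stated in claim 6

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union

def detect(boxes, scores):
    """Confidence filtering followed by greedy non-maximum suppression."""
    keep = [i for i, s in enumerate(scores) if s > CONF_THRESHOLD]
    boxes = [boxes[i] for i in keep]
    scores = [scores[i] for i in keep]
    order = np.argsort(scores)[::-1]         # highest confidence first
    kept = []
    for idx in order:
        # Keep a box only if it overlaps no already-kept box above the threshold.
        if all(iou(boxes[idx], boxes[j]) <= NMS_IOU for j in kept):
            kept.append(idx)
    return [boxes[i] for i in kept]

boxes = [(0, 0, 100, 100), (10, 10, 110, 110), (200, 200, 300, 300)]
scores = [0.9, 0.8, 0.75]
print(len(detect(boxes, scores)))   # 2: the first two boxes overlap, one is suppressed
```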
CN202210484478.0A 2022-05-06 2022-05-06 Neural network remote sensing image target detection method based on self-adaptive target direction Active CN114842353B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210484478.0A CN114842353B (en) 2022-05-06 2022-05-06 Neural network remote sensing image target detection method based on self-adaptive target direction

Publications (2)

Publication Number Publication Date
CN114842353A true CN114842353A (en) 2022-08-02
CN114842353B CN114842353B (en) 2024-04-02

Family

ID=82567243

Country Status (1)

Country Link
CN (1) CN114842353B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110046530A (en) * 2019-03-15 2019-07-23 Kunshan Branch, Institute of Microelectronics, Chinese Academy of Sciences Barcode tilt correction method based on multi-task target detection
CN111046756A (en) * 2019-11-27 2020-04-21 武汉大学 Convolutional neural network detection method for high-resolution remote sensing image target scale features
WO2021139069A1 (en) * 2020-01-09 2021-07-15 南京信息工程大学 General target detection method for adaptive attention guidance mechanism

Non-Patent Citations (1)

Title
SUN ZICHAO; TAN XICHENG; HONG ZEHUA; DONG HUAPING; SHA ZONGYAO; ZHOU SONGTAO; YANG ZONGLIANG: "Remote sensing image target detection based on deep convolutional neural networks", Aerospace Shanghai, no. 05, 25 October 2018 (2018-10-25) *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant