CN111797697B - Angle high-resolution remote sensing image target detection method based on improved CenterNet - Google Patents

Angle high-resolution remote sensing image target detection method based on improved CenterNet

Info

Publication number
CN111797697B
Authority
CN
China
Prior art keywords
target
image
remote sensing
size
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010521896.3A
Other languages
Chinese (zh)
Other versions
CN111797697A (en)
Inventor
王鑫
戴慧凤
石爱业
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hohai University HHU
Original Assignee
Hohai University HHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hohai University HHU filed Critical Hohai University HHU
Priority to CN202010521896.3A priority Critical patent/CN111797697B/en
Publication of CN111797697A publication Critical patent/CN111797697A/en
Application granted granted Critical
Publication of CN111797697B publication Critical patent/CN111797697B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/13Satellite images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Astronomy & Astrophysics (AREA)
  • Remote Sensing (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an angle high-resolution remote sensing image target detection method based on improved CenterNet. Under the CenterNet framework, HRNet is adopted as the backbone network to obtain an improved CenterNet framework; a number of remote sensing target images are given as training samples and input into the improved CenterNet framework for training, yielding a remote sensing image target detection framework. The remote sensing image to be detected is cropped into several unit images of the same size, each unit image is input into the remote sensing image target detection framework for target detection, and the target detection frame of each unit image is determined; the unit images are then edge-stitched according to their target detection frames to determine the detection targets of the remote sensing image to be detected, improving the accuracy with which the corresponding targets are detected in the remote sensing image to be detected.

Description

Angle high-resolution remote sensing image target detection method based on improved CenterNet
Technical Field
The invention relates to the technical field of digital image processing, in particular to an angular high-resolution remote sensing image target detection method based on improved CenterNet.
Background
Remote sensing technology is one of the important marks of a country's scientific and technological level and comprehensive national strength, and is widely applied in both military and civil fields. Its essence is to extract more effective information from complex remote sensing images, and the high-resolution remote sensing image is an important object of analysis. With the rapid development of remote sensing technology, higher requirements are placed on the accuracy and speed of remote sensing image target detection. The on-satellite self-detection mode acquires the remote sensing image directly and processes it on board, which greatly improves processing speed but places high space demands on the computing unit. A remote sensing target detection algorithm is therefore needed that detects quickly, occupies little space, and does not sacrifice accuracy.
The remote sensing image target detection method of publication CN110490069A first applies an additive operator splitting algorithm to the remote sensing image, constructing a nonlinear scale space that converges stably at any step size. The image in each scale space is then screened with the response value of the Hessian matrix to detect feature points. The neighborhood of each feature point is divided, down-sampled with the current scale parameter as the sampling step, and the gray mean and the first-order horizontal and vertical gradients of each grid sampling point are computed. Binary comparison of the per-grid results yields a feature descriptor; finally, feature matching is performed with the Hamming distance as the similarity measure. The method works well for real-time detection of remote sensing images, but its detection accuracy is slightly poor and its transferability is mediocre.
The remote sensing image target detection method of publication CN110084093A first uses the multi-layer outputs of a convolutional neural network to extract high-level features of the remote sensing images in the training data set, labels arbitrary quadrilaterals with a four-point marking method, generates candidate boxes of multiple areas and aspect ratios on the high-level features, and screens the candidate boxes. Candidate regions screened from different layers of the network are then feature-fused, classification and localization errors are computed from the fusion result, and deep learning training with an optimization function is performed on the screened candidate regions to obtain a trained, optimized model. Finally, target discrimination and localization are carried out on the remote sensing image to be detected with this model. The method can detect small objects, high-aspect-ratio targets, and multi-class targets in remote sensing images, but its space requirements are high, making it unsuitable for on-satellite self-detection.
Another work proposes an aircraft target detection algorithm combining saliency maps with a deep belief network. It first extracts the salient targets in the image with a histogram-based contrast method; next, it locates candidate targets through connected-region localization; it then extracts color moments, Hu invariant moments, Tamura texture features, and edge direction histograms of the candidate targets. Finally, the normalized features are fed to the deep belief network to detect the targets. The method detects aircraft accurately, but the extraction process is cumbersome and not intelligent enough for processing large-scale data sets.
In summary, the limitations of existing high-resolution remote sensing image target detection methods are mainly that: (1) detection accuracy is poor and the process is not intelligent enough for large-scale data sets; (2) the detection process is too cumbersome and unsuitable for on-satellite self-detection. The traditional remote sensing image target detection schemes are therefore limited and prone to low detection accuracy.
Disclosure of Invention
Aiming at the problems, the invention provides an angular high-resolution remote sensing image target detection method based on an improved CenterNet.
In order to realize the purpose of the invention, the invention provides an angular high-resolution remote sensing image target detection method based on improved CenterNet, which comprises the following steps:
s10, adopting HRNet as a backbone network under the CenterNet framework to obtain an improved CenterNet framework;
s20, giving a plurality of remote sensing target images as training samples, inputting the training samples into an improved CenterNet frame, and training to obtain a remote sensing image target detection frame;
s30, cutting the remote sensing image to be detected into a plurality of unit images with the same size, respectively inputting each unit image into a remote sensing image target detection frame for target detection, and determining a target detection frame of each unit image;
and S40, performing edge splicing on each unit image according to the target detection frame of each unit image to determine the detection target of the remote sensing image to be detected.
In one embodiment, the unit images are respectively input into a remote sensing image target detection frame for target detection, and determining the target detection frame of the unit image comprises:
respectively inputting the unit images into the remote sensing image target detection framework; calculating the feature map of each unit image with HRNet; performing convolution operation on the feature map to calculate the thermodynamic diagram (heatmap) of the image; obtaining the center points of the unit image by searching the peaks of the thermodynamic diagram; obtaining the target center point, target size, and target orientation angle of the unit image by regression calculation; and determining the target detection frame according to the target center point, target size, and target orientation angle.
Specifically, the convolution operation is performed on the feature map, the thermodynamic diagram of the image is calculated, the central point of the unit image is obtained by searching the peak value of the thermodynamic diagram, the target central point, the target size and the target orientation angle of the unit image are obtained through regression calculation, and the target detection frame is determined according to the target central point, the target size and the target orientation angle, and the method comprises the following steps:
performing convolution operation on the feature map to calculate the thermodynamic diagram (heatmap) of the image

$\hat{Y} \in [0,1]^{\frac{W}{R} \times \frac{H}{R} \times C}$

wherein $R$ represents the size scaling, $C$ is the number of key-point classes, and $W$ and $H$ are the width and height of the image; the peak points of the thermodynamic diagram are the center points, and the width and height of the target are predicted at the peak-point position of each feature map;
training key points of a target through thermodynamic diagrams to determine central points of unit images;
and obtaining a target central point, a target size and a target orientation angle of the unit image through regression calculation, and determining a target detection frame according to the target central point, the target size and the target orientation angle.
Specifically, assume the target key point position on the target true value (ground truth) graph of the unit image is $p \in \mathbb{R}^2$. The original input image is continuously down-sampled through the HRNet network to obtain the corresponding feature map, on which the corresponding key point is

$\tilde{p} = \left\lfloor \frac{p}{R} \right\rfloor$

wherein $R$ represents the size scaling;

by the Gaussian kernel formula

$Y_{xyc} = \exp\left( -\frac{(x - \tilde{p}_x)^2 + (y - \tilde{p}_y)^2}{2\sigma_p^2} \right)$

the GT points on the truth graph are dispersed into a thermodynamic diagram

$Y \in [0,1]^{\frac{W}{R} \times \frac{H}{R} \times C}$

where $\sigma_p$ is the size-adaptive standard deviation of the target; if two or more Gaussians overlap within the target size, the one with the largest value is selected, and $\tilde{p}_x$ and $\tilde{p}_y$ are the horizontal and vertical coordinates of $\tilde{p}$;
the training objective function of the key points of the target thermodynamic diagram is set as follows:

$L_k = \frac{-1}{N} \sum_{xyc} \begin{cases} (1 - \hat{Y}_{xyc})^{\alpha} \log(\hat{Y}_{xyc}) & \text{if } Y_{xyc} = 1 \\ (1 - Y_{xyc})^{\beta} (\hat{Y}_{xyc})^{\alpha} \log(1 - \hat{Y}_{xyc}) & \text{otherwise} \end{cases}$

wherein $\alpha$ and $\beta$ are hyper-parameters, usually set to 2 and 4 in the experiments, $N$ is the number of key points in the image, and the factor $\frac{1}{N}$ mainly serves to normalize the loss;
a local offset is added to the prediction of the center point,

$\hat{O} \in \mathbb{R}^{\frac{W}{R} \times \frac{H}{R} \times 2}$

all classes $c$ share the same offset prediction, and this offset is trained with an L1 loss:

$L_{off} = \frac{1}{N} \sum_{p} \left| \hat{O}_{\tilde{p}} - \left( \frac{p}{R} - \tilde{p} \right) \right|$

it follows that the local offset makes predictions only at the key-point locations $\tilde{p}$;
when searching for target pixel points, the key points of $\hat{Y}$ are used to get all the centers; assume $(x_1^{(k)}, y_1^{(k)}, x_2^{(k)}, y_2^{(k)})$ is the bounding box of a target of class $k$, then its center position is

$p_k = \left( \frac{x_1^{(k)} + x_2^{(k)}}{2}, \frac{y_1^{(k)} + y_2^{(k)}}{2} \right)$

and the size regressed for each target is equal to

$s_k = \left( x_2^{(k)} - x_1^{(k)},\ y_2^{(k)} - y_1^{(k)} \right)$

for fast acquisition of the target box, a single size prediction $\hat{S} \in \mathbb{R}^{\frac{W}{R} \times \frac{H}{R} \times 2}$ is used for calculation, and an L1 loss is added at the center position:

$L_{size} = \frac{1}{N} \sum_{k=1}^{N} \left| \hat{S}_{p_k} - s_k \right|$
when searching for the angle orientation of the target, the key points of $\hat{Y}$ are likewise used for regression; assume $a_k$ is the orientation angle of the class-$k$ target box, then the orientation prediction is $\hat{A} \in \mathbb{R}^{\frac{W}{R} \times \frac{H}{R} \times 1}$, and an L1 loss is added at the center position:

$L_{angle} = \frac{1}{N} \sum_{k=1}^{N} \left| \hat{A}_{p_k} - a_k \right|$
the network target loss function of the whole target detection process consists of four parts:

$L_{det} = L_k + \lambda_{size} L_{size} + \lambda_{angle} L_{angle} + \lambda_{off} L_{off}$

in the experiments, $\lambda_{size} = 0.1$, $\lambda_{angle} = 0.1$, $\lambda_{off} = 1$; the whole network predicts $C + 5$ values at each position, namely the key-point class $c$, the target center point $(x, y)$, the size $(w, h)$ and the target angle, and all of these outputs share the same deep convolutional neural network backbone.
The peak point for each category on the thermodynamic diagram is extracted to determine the target center point, target size, and target orientation angle.
Specifically, extracting the peak point of each category on the thermodynamic diagram includes:
Each response point on the thermodynamic diagram is compared with its eight neighboring points: if its value is greater than or equal to the maximum of the eight neighbors, the point is kept; otherwise it is eliminated. The first 100 peak points meeting this requirement are retained, giving the peak points of each category on the thermodynamic diagram.
In the angle high-resolution remote sensing image target detection method based on the improved CenterNet described above, HRNet is adopted as the backbone network under the CenterNet framework to obtain an improved CenterNet framework. A number of remote sensing target images are given as training samples and input into the improved CenterNet framework for training, yielding a remote sensing image target detection framework. The remote sensing image to be detected is cropped into several unit images of the same size; each unit image is input into the remote sensing image target detection framework for target detection and its target detection frame is determined; the unit images are then edge-stitched according to their target detection frames to determine the detection targets of the remote sensing image to be detected, which improves the accuracy of detecting the corresponding targets in the remote sensing image to be detected.
Drawings
FIG. 1 is a flow diagram of an embodiment of an angular high-resolution remote sensing image target detection method based on improved CenterNet;
FIG. 2 is a schematic diagram of an image cropping and stitching process according to one embodiment;
FIG. 3 is a schematic diagram of an overall algorithm framework of an angular high-resolution remote sensing image target detection method based on the improved CenterNet according to an embodiment;
FIG. 4 is a schematic diagram of an experimental result of a confusion matrix for remote sensing image target identification according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
Today, remote sensing technology is developing rapidly, placing higher requirements on the accuracy and speed of remote sensing image target recognition. The on-satellite self-recognition mode acquires the remote sensing image directly and processes it on board, greatly improving processing speed, but it places high space demands on the computing unit; a remote sensing target recognition algorithm is therefore needed that recognizes quickly, occupies little space, and does not sacrifice accuracy. The invention provides an angular high-resolution remote sensing image target detection method based on an improved CenterNet, which performs angle recognition on the basis of CenterNet and replaces the backbone; experiments show that, when space efficiency and experimental accuracy are considered together, HRNet is the most suitable backbone. In one embodiment, referring to fig. 1, fig. 1 is a flow chart of the angular high-resolution remote sensing image target detection method based on the improved CenterNet, and the method comprises the following steps:
s10, adopting HRNet as a Backbone network (Backbone) under the CenterNet framework to obtain the improved CenterNet framework.
And S20, giving a plurality of remote sensing target images as training samples, and inputting the training samples into an improved CenterNet frame to train to obtain a remote sensing image target detection frame.
The steps give a plurality of remote sensing target images as training samples, and the training samples are input into an improved CenterNet framework to calculate a target Feature map (Feature map). Compared with the traditional CenterNet target detection framework which usually adopts ResNet, DLA, Hourglass and the like as a Backbone network (Backbone), the improved CenterNet target detection framework adopts lightweight HRNet as the Backbone network to calculate the feature map of the remote sensing target image, and compared with other Backbone networks, the HRNet has the advantages of parallel high-resolution network architecture and repeated multi-scale fusion mode, and can calculate the target feature map more quickly and accurately.
Specifically, a plurality of remote sensing target images are given as training samples and input into the improved CenterNet framework; then the feature map of each remote sensing target image is calculated with HRNet. HRNet performs repeated multi-scale fusion across parallel branches of different scales and resolutions: through down-sampling and up-sampling, the feature maps of the three scales are each resampled into the other scales and fused at every scale, finally yielding three representations fused at different levels. The high-resolution branch draws more accurate classification features from the low-resolution branch, while the low-resolution branch obtains more accurate position features from the high-resolution branch. Repeated fusion across layers produces multiple intermediate representations between a standard image and a noisy one; these serve as guide images for recovering the standard representation, and the learning capability of the convolutional neural network further refines the feature output. Finally, after each input image block passes through HRNet, several feature maps with lower resolution than the input image are output.
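As an illustration of the repeated multi-scale fusion idea described above, the following sketch resamples each branch to every other branch's resolution and sums them. It is a minimal, hypothetical rendering: real HRNet fusion also adapts channel counts with strided and 1×1 convolutions, which are omitted here.

```python
import torch
import torch.nn.functional as F

def fuse_multiscale(feats):
    """Fuse a list of feature maps ordered from high to low resolution:
    every branch receives the sum of all branches resampled to its own
    resolution. Channel counts are kept equal here for simplicity."""
    fused = []
    for i, target in enumerate(feats):
        acc = target.clone()
        for j, other in enumerate(feats):
            if j != i:
                acc = acc + F.interpolate(other, size=target.shape[-2:],
                                          mode="bilinear", align_corners=False)
        fused.append(acc)
    return fused

# Toy check with three scales of an 8-channel map.
feats = [torch.randn(1, 8, 128, 128),
         torch.randn(1, 8, 64, 64),
         torch.randn(1, 8, 32, 32)]
print([t.shape for t in fuse_multiscale(feats)])
```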
And S30, cutting the remote sensing image to be detected into a plurality of unit images with the same size, inputting each unit image into the remote sensing image target detection frame respectively for target detection, and determining the target detection frame of each unit image.
The above steps perform convolution operation on the feature map obtained by HRNet calculation, calculate a thermodynamic diagram (Heat map) of the image, then obtain a central point of the image by finding a peak value of the thermodynamic diagram, and obtain a target central point (x, y), a target size (w, h), and a target orientation angle by regression calculation, and based on the central point, the size, and the orientation angle, obtain a target detection frame, i.e. a target detection result.
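The following sketch illustrates the per-position outputs this step describes: C heatmap channels plus 2 offset, 2 size, and 1 angle channel regressed from the shared backbone feature map. The head widths and channel counts are assumptions for illustration, not the patent's exact configuration.

```python
import torch
import torch.nn as nn

class DetectionHeads(nn.Module):
    """C + 5 outputs per heatmap position: C class channels, a 2-channel
    center offset, a 2-channel size and a 1-channel orientation angle,
    all computed from the shared backbone feature map."""
    def __init__(self, in_ch=64, num_classes=15):
        super().__init__()
        def head(out_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 64, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(64, out_ch, 1))
        self.heatmap = head(num_classes)  # key-point class scores
        self.offset = head(2)             # sub-pixel center offset
        self.size = head(2)               # object size (w, h)
        self.angle = head(1)              # orientation angle

    def forward(self, feat):
        return (torch.sigmoid(self.heatmap(feat)), self.offset(feat),
                self.size(feat), self.angle(feat))

hm, off, wh, ang = DetectionHeads()(torch.randn(1, 64, 128, 128))
print(hm.shape, off.shape, wh.shape, ang.shape)
```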
And S40, performing edge splicing on each unit image according to the target detection frame of each unit image to determine the detection target of the remote sensing image to be detected.
Considering that remote sensing images acquired on a satellite are generally large in size and high in resolution, so that processing them directly is inefficient in both time and space, this embodiment proposes to divide (crop) the large-format remote sensing image to be detected, with overlap, into a number of smaller image blocks of the same size for processing. When a target lies on a cutting line it is split into two parts, making its position, size, or orientation difficult to fuse in the final detection; the stitching method for edge targets is therefore proposed to solve this problem and improve the detection accuracy of targets on the cutting lines.
In one example, as shown in FIG. 2, (a) is the original image. (b) shows the cropped image blocks: the airplane lying on the cutting line of the original image is split into two parts across the left and right blocks, and this cut airplane is taken as the edge target. (c) is the result of directly stitching the left and right blocks; the recognition results of the two blocks cannot be fused at all, and the two partial detections differ in size, scale, and direction, which seriously harms the recognition accuracy of the edge target. (d) is the result of the stitching strategy for edge targets proposed by the invention: overlapping cropping produces left, middle, and right image blocks, the edge target falls entirely inside the middle, overlap-cropped block, and overlapping stitching is likewise used when merging, so the strategy completely avoids the recognition error of edge targets caused by cropping and greatly improves their recognition accuracy. Note that this example shows only one edge of the image; in actual operation, the overlap cropping is applied to all four edges.
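A minimal sketch of the overlapped cropping just described: tiles are generated with a stride smaller than the tile size, so any target cut by one tile border lies wholly inside a neighbouring tile, and per-tile detections are shifted back to full-image coordinates before merging. The tile size, overlap, and helper names here are illustrative assumptions.

```python
def tile_coords(width, height, tile=512, overlap=128):
    """Generate (x0, y0, x1, y1) crop windows that step through the image
    with stride tile - overlap; extra tiles are appended so the right and
    bottom edges are always covered."""
    stride = tile - overlap
    xs = list(range(0, max(width - tile, 0) + 1, stride))
    ys = list(range(0, max(height - tile, 0) + 1, stride))
    if xs[-1] + tile < width:
        xs.append(width - tile)
    if ys[-1] + tile < height:
        ys.append(height - tile)
    return [(x, y, x + tile, y + tile) for y in ys for x in xs]

def to_global(det, origin):
    """Shift one per-tile detection (cx, cy, w, h, angle) back into the
    coordinate frame of the full image before merging."""
    cx, cy, w, h, a = det
    ox, oy = origin
    return (cx + ox, cy + oy, w, h, a)

print(tile_coords(1200, 900))
```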
In the angle high-resolution remote sensing image target detection method based on the improved CenterNet described above, HRNet is adopted as the backbone network under the CenterNet framework to obtain an improved CenterNet framework. A number of remote sensing target images are given as training samples and input into the improved CenterNet framework for training, yielding a remote sensing image target detection framework. The remote sensing image to be detected is cropped into several unit images of the same size; each unit image is input into the remote sensing image target detection framework for target detection and its target detection frame is determined; the unit images are then edge-stitched according to their target detection frames to determine the detection targets of the remote sensing image to be detected, which improves the accuracy of detecting the corresponding targets in the remote sensing image to be detected.
In one embodiment, the unit images are respectively input into a remote sensing image target detection frame for target detection, and determining the target detection frame of the unit image comprises:
respectively inputting the unit images into the remote sensing image target detection framework; calculating the feature map of each unit image with HRNet; performing convolution operation on the feature map to calculate the thermodynamic diagram (heatmap) of the image; obtaining the center points of the unit image by searching the peaks of the thermodynamic diagram; obtaining the target center point, target size, and target orientation angle of the unit image by regression calculation; and determining the target detection frame according to the target center point, target size, and target orientation angle.
As an embodiment, performing a convolution operation on the feature map, calculating a thermodynamic diagram of the image, obtaining a center point of the unit image by finding a peak of the thermodynamic diagram, and obtaining a target center point, a target size, and a target orientation angle of the unit image by a regression calculation, and determining the target detection frame according to the target center point, the target size, and the target orientation angle includes:
performing convolution operation on the feature map to calculate the thermodynamic diagram (heatmap) of the image

$\hat{Y} \in [0,1]^{\frac{W}{R} \times \frac{H}{R} \times C}$

wherein $R$ represents the size scaling, $C$ is the number of key-point classes, and $W$ and $H$ are the width and height of the image; the peak points of the thermodynamic diagram are the center points, and the width and height of the target are predicted at the peak-point position of each feature map;
training key points of a target through thermodynamic diagrams to determine central points of unit images;
and obtaining a target central point, a target size and a target orientation angle of the unit image through regression calculation, and determining a target detection frame according to the target central point, the target size and the target orientation angle.
As an embodiment, assume the target key point position on the target true value (ground truth) graph of the unit image is $p \in \mathbb{R}^2$. The original input image is continuously down-sampled through the HRNet network to obtain the corresponding feature map, on which the corresponding key point is

$\tilde{p} = \left\lfloor \frac{p}{R} \right\rfloor$

wherein $R$ represents the size scaling;

by the Gaussian kernel formula

$Y_{xyc} = \exp\left( -\frac{(x - \tilde{p}_x)^2 + (y - \tilde{p}_y)^2}{2\sigma_p^2} \right)$

the GT points on the truth graph are dispersed into a thermodynamic diagram

$Y \in [0,1]^{\frac{W}{R} \times \frac{H}{R} \times C}$

where $\sigma_p$ is the size-adaptive standard deviation of the target; if two or more Gaussians overlap within the target size, the one with the largest value is selected, and $\tilde{p}_x$ and $\tilde{p}_y$ are the horizontal and vertical coordinates of $\tilde{p}$;
the training objective function of the key points of the target thermodynamic diagram is set as follows:

$L_k = \frac{-1}{N} \sum_{xyc} \begin{cases} (1 - \hat{Y}_{xyc})^{\alpha} \log(\hat{Y}_{xyc}) & \text{if } Y_{xyc} = 1 \\ (1 - Y_{xyc})^{\beta} (\hat{Y}_{xyc})^{\alpha} \log(1 - \hat{Y}_{xyc}) & \text{otherwise} \end{cases}$

wherein $\alpha$ and $\beta$ are hyper-parameters, usually set to 2 and 4 in the experiments, $N$ is the number of key points in the image, and the factor $\frac{1}{N}$ mainly serves to normalize the loss;
a local offset is added to the prediction of the center point,

$\hat{O} \in \mathbb{R}^{\frac{W}{R} \times \frac{H}{R} \times 2}$

all classes $c$ share the same offset prediction, and this offset is trained with an L1 loss:

$L_{off} = \frac{1}{N} \sum_{p} \left| \hat{O}_{\tilde{p}} - \left( \frac{p}{R} - \tilde{p} \right) \right|$

it follows that the local offset makes predictions only at the key-point locations $\tilde{p}$;
when searching for target pixel points, the key points of $\hat{Y}$ are used to get all the centers; assume $(x_1^{(k)}, y_1^{(k)}, x_2^{(k)}, y_2^{(k)})$ is the bounding box of a target of class $k$, then its center position is

$p_k = \left( \frac{x_1^{(k)} + x_2^{(k)}}{2}, \frac{y_1^{(k)} + y_2^{(k)}}{2} \right)$

and the size regressed for each target is equal to

$s_k = \left( x_2^{(k)} - x_1^{(k)},\ y_2^{(k)} - y_1^{(k)} \right)$

for fast acquisition of the target box, a single size prediction $\hat{S} \in \mathbb{R}^{\frac{W}{R} \times \frac{H}{R} \times 2}$ is used for calculation, and an L1 loss is added at the center position:

$L_{size} = \frac{1}{N} \sum_{k=1}^{N} \left| \hat{S}_{p_k} - s_k \right|$
when searching for the angle orientation of the target, the key points of $\hat{Y}$ are likewise used for regression; assume $a_k$ is the orientation angle of the class-$k$ target box, then the orientation prediction is $\hat{A} \in \mathbb{R}^{\frac{W}{R} \times \frac{H}{R} \times 1}$, and an L1 loss is added at the center position:

$L_{angle} = \frac{1}{N} \sum_{k=1}^{N} \left| \hat{A}_{p_k} - a_k \right|$
the network target loss function of the whole target detection process consists of four parts:

$L_{det} = L_k + \lambda_{size} L_{size} + \lambda_{angle} L_{angle} + \lambda_{off} L_{off}$

in the experiments, $\lambda_{size} = 0.1$, $\lambda_{angle} = 0.1$, $\lambda_{off} = 1$; the whole network predicts $C + 5$ values at each position, namely the key-point class $c$, the target center point $(x, y)$, the size $(w, h)$ and the target angle, and all of these outputs share the same deep convolutional neural network backbone.
And extracting a peak value point of each category on the thermodynamic diagram to determine a target center point, a target size and a target orientation angle.
Specifically, extracting the peak point of each category on the thermodynamic diagram includes:
Each response point on the thermodynamic diagram is compared with its eight neighboring points: if its value is greater than or equal to the maximum of the eight neighbors, the point is kept; otherwise it is eliminated. The first 100 peak points meeting this requirement are retained, giving the peak points of each category on the thermodynamic diagram.
Suppose

$\hat{\mathcal{P}}_c = \{ (\hat{x}_i, \hat{y}_i) \}_{i=1}^{N}$

is the set of $N$ detected center points of category $c$. Each key point is given in the form of integer coordinates $(x_i, y_i)$, and $\hat{Y}_{x_i y_i c}$ is used as the detection confidence to generate a target box of the following form:

$\left( \hat{x}_i + \delta\hat{x}_i - \frac{\hat{w}_i}{2},\ \hat{y}_i + \delta\hat{y}_i - \frac{\hat{h}_i}{2},\ \hat{x}_i + \delta\hat{x}_i + \frac{\hat{w}_i}{2},\ \hat{y}_i + \delta\hat{y}_i + \frac{\hat{h}_i}{2},\ \hat{a}_i \right)$

wherein $(\delta\hat{x}_i, \delta\hat{y}_i) = \hat{O}_{\hat{x}_i, \hat{y}_i}$ is the offset prediction result, $(\hat{w}_i, \hat{h}_i) = \hat{S}_{\hat{x}_i, \hat{y}_i}$ is the size prediction result, and $\hat{a}_i = \hat{A}_{\hat{x}_i, \hat{y}_i}$ is the angle prediction result.
In one example, the convolution operation on the feature map obtained by HRNet, the calculation of the thermodynamic diagram (heatmap), the location of the center points at the heatmap peaks, and the regression of the target center point (x, y), target size (w, h), and target orientation angle that together yield the target detection frame, i.e. the target detection result, may be carried out as follows:
First, the feature map obtained by HRNet is convolved to calculate the thermodynamic map of the image

$\hat{Y} \in [0,1]^{\frac{W}{R} \times \frac{H}{R} \times C}$

where $R$ represents the size scaling, $C$ is the number of key-point classes (i.e. the number of channels of the output feature map), and $W$ and $H$ are the width and height of the image; the peak points of the thermodynamic diagram are the center points, and the width and height of the target are predicted at the peak-point position of each feature map;
Second, the key points of the target are trained through thermodynamic diagrams. Assume the position of a target key point on the target true value graph (ground truth) is $p \in \mathbb{R}^2$; the GT points contain all the effective information in the feature map and are the key nodes for feature extraction. The original input image is continuously down-sampled through the HRNet network to obtain the corresponding feature maps, on which the corresponding key point is

$\tilde{p} = \left\lfloor \frac{p}{R} \right\rfloor$

wherein $R$ represents the size scaling;

by the Gaussian kernel formula

$Y_{xyc} = \exp\left( -\frac{(x - \tilde{p}_x)^2 + (y - \tilde{p}_y)^2}{2\sigma_p^2} \right)$

the GT points on the truth graph are dispersed into a thermodynamic diagram

$Y \in [0,1]^{\frac{W}{R} \times \frac{H}{R} \times C}$

where $\sigma_p$ is the size-adaptive standard deviation of the target. If two or more Gaussians overlap within the target size, the one with the largest value is selected; $\tilde{p}_x$ and $\tilde{p}_y$ are the horizontal and vertical coordinates of $\tilde{p}$;
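A minimal sketch, under the definitions above, of dispersing a GT center onto one class channel of the heatmap with this Gaussian kernel; overlapping responses keep the element-wise maximum, matching the rule of selecting the largest value. The fixed sigma stands in for the size-adaptive σp.

```python
import numpy as np

def splat_gaussian(heatmap, center, sigma):
    """Disperse one GT center onto a single-class heatmap; overlapping
    responses keep the element-wise maximum."""
    h, w = heatmap.shape
    ys, xs = np.mgrid[0:h, 0:w]
    cx, cy = center
    g = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))
    np.maximum(heatmap, g, out=heatmap)
    return heatmap

hm = np.zeros((128, 128), dtype=np.float32)
splat_gaussian(hm, center=(40, 60), sigma=3.0)
splat_gaussian(hm, center=(43, 60), sigma=3.0)  # overlapping target: max kept
print(hm[60, 40], hm[60, 43])
```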
the training objective function of the key points of the target thermodynamic diagram is set as follows:

$L_k = \frac{-1}{N} \sum_{xyc} \begin{cases} (1 - \hat{Y}_{xyc})^{\alpha} \log(\hat{Y}_{xyc}) & \text{if } Y_{xyc} = 1 \\ (1 - Y_{xyc})^{\beta} (\hat{Y}_{xyc})^{\alpha} \log(1 - \hat{Y}_{xyc}) & \text{otherwise} \end{cases}$

wherein $\alpha$ and $\beta$ are hyper-parameters, usually set to 2 and 4 in the experiments, $N$ is the number of key points in the image, and the factor $\frac{1}{N}$ mainly serves to normalize the loss;
Thirdly, since the down-sampling of the image introduces a discretization bias at the GT points, a local offset is added to the prediction of the center point,

$\hat{O} \in \mathbb{R}^{\frac{W}{R} \times \frac{H}{R} \times 2}$

all classes $c$ share the same offset prediction, and this offset is trained with an L1 loss:

$L_{off} = \frac{1}{N} \sum_{p} \left| \hat{O}_{\tilde{p}} - \left( \frac{p}{R} - \tilde{p} \right) \right|$

it follows that the local offset makes predictions only at the key-point locations $\tilde{p}$; the operation is not used at other positions;
Fourthly, when searching for target pixel points, the key points of $\hat{Y}$ are used to obtain all the centers. Assume $(x_1^{(k)}, y_1^{(k)}, x_2^{(k)}, y_2^{(k)})$ is the bounding box of a target of class $k$, then its center position is

$p_k = \left( \frac{x_1^{(k)} + x_2^{(k)}}{2}, \frac{y_1^{(k)} + y_2^{(k)}}{2} \right)$

and the size regressed for each target is equal to

$s_k = \left( x_2^{(k)} - x_1^{(k)},\ y_2^{(k)} - y_1^{(k)} \right)$

to get the target box quickly, a single size prediction is used, i.e.

$\hat{S} \in \mathbb{R}^{\frac{W}{R} \times \frac{H}{R} \times 2}$

where $W$ and $H$ are the image width and height and $R$ represents the size scaling, and an L1 loss is added at the center position:

$L_{size} = \frac{1}{N} \sum_{k=1}^{N} \left| \hat{S}_{p_k} - s_k \right|$
Fifthly, the key points of $\hat{Y}$ are likewise used for regression when searching for the angle orientation of the target. Assume $a_k$ is the orientation angle of the class-$k$ target box; the orientation prediction is $\hat{A} \in \mathbb{R}^{\frac{W}{R} \times \frac{H}{R} \times 1}$, and similarly an L1 loss is added at the center position:

$L_{angle} = \frac{1}{N} \sum_{k=1}^{N} \left| \hat{A}_{p_k} - a_k \right|$
Sixthly, the network target loss function of the whole target detection process consists of four parts:

$L_{det} = L_k + \lambda_{size} L_{size} + \lambda_{angle} L_{angle} + \lambda_{off} L_{off}$

in the experiments, $\lambda_{size} = 0.1$, $\lambda_{angle} = 0.1$, $\lambda_{off} = 1$; the whole network predicts $C + 5$ values at each position, namely the key-point class $c$, the target center point $(x, y)$, the size $(w, h)$ and the target angle, and all of these outputs share the same deep convolutional neural network backbone.
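The size, angle, and offset terms are all L1 losses evaluated only at GT center positions, combined with the weights quoted above. A minimal sketch, assuming a mask tensor that is 1 at key-point pixels:

```python
import torch

def masked_l1(pred, target, mask):
    """L1 regression counted only at GT center positions, as used for the
    offset, size and angle heads; `mask` is 1 at key-point pixels."""
    n = mask.sum().clamp(min=1.0)
    return ((pred - target).abs() * mask).sum() / n

def total_loss(l_k, l_size, l_angle, l_off,
               lam_size=0.1, lam_angle=0.1, lam_off=1.0):
    # L_det = L_k + 0.1*L_size + 0.1*L_angle + 1*L_off (weights from the text)
    return l_k + lam_size * l_size + lam_angle * l_angle + lam_off * l_off
```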
Seventh, a peak point of each category on the thermodynamic diagram is extracted. To obtain these peak points, we compare all the response points on the thermodynamic diagram with the eight neighboring points, and if the value of the corresponding point is greater than or equal to the maximum value of the eight neighboring points, that corresponding point is retained, otherwise the corresponding point is eliminated. Finally, we will retain all the first 100 peak points that meet their requirements. Suppose that
Figure BDA00025323752900001113
Is the set of N center points of the detected category c, as shown in the following formula:
Figure BDA0002532375290000121
each keypoint being in the form of a shaped coordinate (x) i ,y i ) It is given.
Figure BDA0002532375290000122
As a measured detection confidence, a target block diagram of the following formula is generated:
Figure BDA0002532375290000123
wherein the content of the first and second substances,
Figure BDA0002532375290000124
in order to shift the result of the prediction,
Figure BDA0002532375290000125
is the result of the scale prediction and is,
Figure BDA0002532375290000126
is the prediction result of the angle.
The embodiment has the following beneficial effects:
the CenterNet framework has simple steps and strong sensitivity to the direction, can quickly identify the target of the high-resolution remote sensing image, and can accurately identify the direction of the target.
The HRNet network is light in weight, but the characteristic extraction effect is good, and the HRNet network is used in the on-satellite self-identification operation environment and can take the precision and space limitations into consideration.
The splicing strategy aiming at the edge target can effectively improve the accuracy of edge target identification and integrally realize the target identification of the end-to-end large-size remote sensing image.
In one embodiment, the overall algorithm framework of the angular high-resolution remote sensing image target detection method based on the improved centret can be referred to fig. 3, and includes:
step one, a plurality of remote sensing target images are given and used as training samples, and the training samples are input into an improved CenterNet framework to calculate a target Feature map (Feature map). Compared with the traditional CenterNet target detection framework which usually adopts ResNet, DLA, Hourglass and the like as a Backbone network (Backbone), the improved CenterNet target detection framework adopts light-weight HRNet as the Backbone network to calculate the feature map of the remote sensing target image, and compared with other Backbone networks, the HRNet has the advantages that the parallel high-resolution network architecture and the repeated multi-scale fusion mode can calculate the target feature map more quickly and accurately;
firstly, giving a plurality of remote sensing target images as training samples, and inputting the training samples into a CenterNet framework;
and secondly, calculating a characteristic map of each remote sensing target image by using the HRNet. The HRNet adopts repeated multi-scale fusion under the same scale and resolution, wherein the feature maps of three scales can be respectively changed into three feature maps of different scales through sampling and up-sampling modes, then the three feature maps are respectively fused on different scales, and finally three representations of different-level fusion are obtained. The high-resolution images use the low-resolution images to obtain more accurate classification features, and the low-resolution images can obtain more accurate position features through the high-resolution images. The repeated fusion of different layers can obtain a plurality of intermediate products between the standard image and the noisy image, the more products are used as guide images for obtaining the standard image, and the learning capability of the convolutional neural network is added to obtain more standard feature image output. Finally, after each input image block passes through HRNet, a plurality of feature maps with lower resolution than the input image are output.
Performing convolution operation on the feature map obtained by HRNet calculation, calculating a Heat map of the image, obtaining a central point of the image by searching a peak value of the Heat map, obtaining a target central point (x, y), a target size (w, h) and a target orientation angle by regression calculation, and obtaining a target detection frame, namely a target detection result, based on the central point, the size and the orientation angle;
First, the feature map obtained by HRNet is convolved to calculate the thermodynamic map of the image

$\hat{Y} \in [0,1]^{\frac{W}{R} \times \frac{H}{R} \times C}$

where $R$ represents the size scaling, $C$ is the number of key-point classes (i.e. the number of channels of the output feature map), and $W$ and $H$ are the width and height of the image; the peak points of the thermodynamic diagram are the center points, and the width and height of the target are predicted at the peak-point position of each feature map;
Second, the key points of the target are trained through thermodynamic diagrams. Assume the position of a target key point on the target true value graph (ground truth) is $p \in \mathbb{R}^2$; the GT points contain all the effective information in the feature map and are the key nodes for feature extraction. The original input image is continuously down-sampled through the HRNet network to obtain the corresponding feature maps, on which the corresponding key point is

$\tilde{p} = \left\lfloor \frac{p}{R} \right\rfloor$

wherein $R$ represents the size scaling;

by the Gaussian kernel formula

$Y_{xyc} = \exp\left( -\frac{(x - \tilde{p}_x)^2 + (y - \tilde{p}_y)^2}{2\sigma_p^2} \right)$

the GT points on the truth graph are dispersed into a thermodynamic diagram

$Y \in [0,1]^{\frac{W}{R} \times \frac{H}{R} \times C}$

where $\sigma_p$ is the size-adaptive standard deviation of the target. If two or more Gaussians overlap within the target size, the one with the largest value is selected; $\tilde{p}_x$ and $\tilde{p}_y$ are the horizontal and vertical coordinates of $\tilde{p}$;
the training objective function of the key points of the target thermodynamic diagram is set as follows:

$L_k = \frac{-1}{N} \sum_{xyc} \begin{cases} (1 - \hat{Y}_{xyc})^{\alpha} \log(\hat{Y}_{xyc}) & \text{if } Y_{xyc} = 1 \\ (1 - Y_{xyc})^{\beta} (\hat{Y}_{xyc})^{\alpha} \log(1 - \hat{Y}_{xyc}) & \text{otherwise} \end{cases}$

wherein $\alpha$ and $\beta$ are hyper-parameters, usually set to 2 and 4 in the experiments, $N$ is the number of key points in the image, and the factor $\frac{1}{N}$ mainly serves to normalize the loss;
Thirdly, since the down-sampling of the image introduces a discretization bias at the GT points, a local offset is added to the prediction of the center point,

$\hat{O} \in \mathbb{R}^{\frac{W}{R} \times \frac{H}{R} \times 2}$

all classes $c$ share the same offset prediction, and this offset is trained with an L1 loss:

$L_{off} = \frac{1}{N} \sum_{p} \left| \hat{O}_{\tilde{p}} - \left( \frac{p}{R} - \tilde{p} \right) \right|$

it follows that the local offset makes predictions only at the key-point locations $\tilde{p}$; the operation is not used at other positions;
Fourthly, when searching for target pixel points, the key points of $\hat{Y}$ are used to obtain all the centers. Assume $(x_1^{(k)}, y_1^{(k)}, x_2^{(k)}, y_2^{(k)})$ is the bounding box of a target of class $k$, then its center position is

$p_k = \left( \frac{x_1^{(k)} + x_2^{(k)}}{2}, \frac{y_1^{(k)} + y_2^{(k)}}{2} \right)$

and the size regressed for each target is equal to

$s_k = \left( x_2^{(k)} - x_1^{(k)},\ y_2^{(k)} - y_1^{(k)} \right)$

to get the target box quickly, a single size prediction is used, i.e.

$\hat{S} \in \mathbb{R}^{\frac{W}{R} \times \frac{H}{R} \times 2}$

where $W$ and $H$ are the image width and height and $R$ represents the size scaling, and an L1 loss is added at the center position:

$L_{size} = \frac{1}{N} \sum_{k=1}^{N} \left| \hat{S}_{p_k} - s_k \right|$
Fifthly, the key points of $\hat{Y}$ are likewise used for regression when searching for the angle orientation of the target. Assume $a_k$ is the orientation angle of the class-$k$ target box; the orientation prediction is $\hat{A} \in \mathbb{R}^{\frac{W}{R} \times \frac{H}{R} \times 1}$, and similarly an L1 loss is added at the center position:

$L_{angle} = \frac{1}{N} \sum_{k=1}^{N} \left| \hat{A}_{p_k} - a_k \right|$
Sixthly, the network target loss function of the whole target detection process consists of four parts:

$L_{det} = L_k + \lambda_{size} L_{size} + \lambda_{angle} L_{angle} + \lambda_{off} L_{off}$

in the experiments, $\lambda_{size} = 0.1$, $\lambda_{angle} = 0.1$, $\lambda_{off} = 1$; the whole network predicts $C + 5$ values at each position, namely the key-point class $c$, the target center point $(x, y)$, the size $(w, h)$ and the target angle, and all of these outputs share the same deep convolutional neural network backbone.
Seventh, a peak point of each category on the thermodynamic diagram is extracted. To obtain these peak points, we compare all the response points on the thermodynamic diagram with the eight neighboring points, and if the value of the corresponding point is greater than or equal to the maximum value of the eight neighboring points, that corresponding point is retained, otherwise the corresponding point is eliminated. Finally, we will retain all the first 100 peak points that meet their requirements. Suppose that
Figure BDA0002532375290000151
Is the set of N center points of the detected category c, as shown in the following formula:
Figure BDA0002532375290000152
each keypoint being in the form of a shaped coordinate (x) i ,y i ) It is given.
Figure BDA0002532375290000153
As a measured detection confidence, a target block diagram of the following formula is generated:
Figure BDA0002532375290000154
wherein the content of the first and second substances,
Figure BDA0002532375290000155
in order to shift the result of the prediction,
Figure BDA0002532375290000156
is the result of the scale prediction and is,
Figure BDA0002532375290000157
is the prediction result of the angle.
And step three, considering that remote sensing images acquired on the satellite are generally large in size and high in resolution, so that processing them directly is inefficient in both time and space, the invention proposes to divide (crop) the large-format remote sensing image to be detected, with overlap, into a number of small image blocks of the same size for processing. When a target lies on a cutting line it is split into two parts, making its position, size, or orientation difficult to fuse in the final detection; a stitching method for edge targets is therefore proposed to solve this problem and improve the detection accuracy of targets on the cutting lines.
As shown in fig. 2, (a) is the original image. (b) shows the cropped image blocks: the airplane lying on the cutting line of the original image is split into two parts across the left and right blocks, and this cut airplane is taken as the edge target. (c) is the result of directly stitching the left and right blocks; the recognition results of the two blocks cannot be fused at all, and the two partial detections differ in size, scale, and direction, which seriously harms the recognition accuracy of the edge target. (d) is the result of the stitching strategy for edge targets proposed by the invention: overlapping cropping produces left, middle, and right image blocks, the edge target falls entirely inside the middle, overlap-cropped block, and overlapping stitching is likewise used when merging, so the strategy completely avoids the recognition error of edge targets caused by cropping and greatly improves their recognition accuracy. It is worth noting that this example shows only one edge of the image; in actual practice, the overlap cropping is applied to all four edges.
To verify the proposed algorithm, the public DOTA data set is adopted. DOTA is an aerial image data set produced jointly by the group of Gui-Song Xia at the State Key Laboratory of remote sensing of Wuhan University and the group of Xiang Bai at the School of Electronic Information of Huazhong University of Science and Technology; it collects 2806 aerial images from different sensors and platforms and contains 188282 instances in 15 categories: plane, ship, storage tank, baseball diamond, tennis court, swimming pool, ground track field, harbor, bridge, large vehicle, small vehicle, helicopter, roundabout, soccer ball field, and basketball court. Of these, 14 are main classes; small vehicle and large vehicle are subclasses of vehicle. Unlike traditional labeling methods, the data set adopts angled target detection annotations, so the localization is more accurate.
The original data set was cropped into 512 × 512 images for training. Specifically, the data set is divided into a training set and a validation set; although the class distribution of DOTA is extremely unbalanced, both sets must contain all 15 classes, and the ratio of training set to validation set is about 3:1. The CenterNet target detection method is adopted and trained with HRNet as the backbone; a model of the position, size, and orientation of the target is then obtained by regression, the validation set is predicted with this model, and finally the target output is obtained through the stitching strategy for edge targets.
In one example, the running environment of the experiments was Ubuntu 14.04 with the PyTorch 1.2.0 framework, trained on a computer platform configured with an NVIDIA Tesla K40C. The improved algorithm adopts a cross-entropy loss function and the Adam optimizer; batch_size is set to 16, 100 epochs are trained in total, and the learning rate is 0.000001.
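A minimal sketch of that training configuration, with stand-ins for the detector and the data loader (the real pipeline feeds 512 × 512 DOTA crops to the improved CenterNet and minimises the L_det objective rather than this dummy loss):

```python
import torch
import torch.nn as nn

# Stand-ins for the detector and the 512x512 DOTA crops.
model = nn.Conv2d(3, 20, 3, padding=1)
loader = [(torch.randn(16, 3, 512, 512), torch.randn(16, 20, 512, 512))
          for _ in range(2)]

optimizer = torch.optim.Adam(model.parameters(), lr=0.000001)  # Adam, lr 1e-6
for epoch in range(100):                                       # 100 epochs
    for images, targets in loader:                             # batch_size 16
        optimizer.zero_grad()
        loss = (model(images) - targets).abs().mean()          # dummy objective
        loss.backward()
        optimizer.step()
```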
FIG. 4 shows the confusion matrix results of the remote sensing image target recognition experiment. The adopted method detects plane, ship, basketball court, storage tank, swimming pool, and similar targets well, but large targets such as ground track field, harbor, baseball diamond, tennis court, and soccer ball field are missed to a certain extent: these large sites resemble the background and are easily overlooked during neural network learning, producing false negatives. In addition, detection of the vehicle categories is still fairly accurate, but the similarity between large vehicle and small vehicle causes a certain degree of confusion between their results. It is worth pointing out that background is a category that accounts for the false detection data and uses false detections directly as output of the confusion matrix.
The remote sensing image target detection results cover the 15 classes of targets in the DOTA data set. Visual inspection of the recognition results shows that targets with distinctive characteristics, such as plane, ship, storage tank, baseball diamond, tennis court, swimming pool, large vehicle, small vehicle, helicopter, roundabout, and basketball court, are recognized well: both direction and size are appropriate, with only a limited degree of false detection. For fields that occupy a large proportion of the image but offer few distinguishing features, such as soccer ball field and ground track field, the effect is poorer, and the direction in particular shows a certain degree of deviation. For strip-shaped objects such as harbor and bridge, the angle problem is worse: because these objects are elongated, a small angular deviation at a point has a large influence on the overall accuracy of the category, so the predicted angles of harbor and bridge deviate considerably from the original objects. In general, the proposed algorithm performs well on the position, size, and direction of remote sensing image recognition, the direction in particular distinguishing it from traditional target detection.
In the experiments, the original backbones of CenterNet and HRNet_30 are selected as backbones for comparison and introduced into the improved CenterNet-based angular remote sensing image target recognition framework. Specifically, ResNet_50, DLA_34, and Hourglass were used as the original backbones in the comparison test. Because the invention is dedicated to on-satellite self-learning remote sensing image target detection, a lighter network is needed as the backbone, and the space utilization of the network is an important metric. The space efficiency of the various networks is shown in Table 1: HRNet_30 is much smaller than Hourglass and slightly smaller than the ResNet_50 network, but slightly larger than the DLA_34 network, in terms of both network parameters and network computation.
TABLE 1 Comparison of backbone space efficiency
Table 2 shows the accuracy comparison of the final experimental results of the different backbones. Note that the evaluation indexes AP and AR used here are both evaluated with angle taken into account. HRNet_30 is much more accurate than ResNet_50 and DLA_34 in AP, AR, AP50, and AP75, but slightly less accurate than Hourglass.
TABLE 2 Comparison of backbone experimental accuracy
Comparing the space efficiency and experimental accuracy of the backbones comprehensively: HRNet_30 occupies much less space than the Hourglass network at only a slight cost in accuracy; the DLA_34 network occupies less space than HRNet_30 but is far less accurate; and ResNet_50 does not match HRNet_30 in either space occupation or experimental accuracy. In conclusion, when space efficiency and experimental accuracy are considered together, the algorithm with HRNet as the backbone is best suited to self-learning on-satellite remote sensing image target recognition and is the optimal choice.
In addition, it is noted that both the algorithm provided by the invention and IENet address oriented target detection in remote sensing images, and both are evaluated on the DOTA data set. In terms of accuracy, the AP50 of the proposed algorithm is 0.5829 versus 0.5714 for IENet, so the proposed algorithm is slightly more accurate; in terms of network parameters, the backbone of the proposed algorithm has 25.42 MB of parameters versus 212 MB for IENet, so the proposed algorithm has a much lower space occupancy. Therefore, compared with IENet, the algorithm provided by the invention is better and more suitable for on-satellite self-recognition of remote sensing images.
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of technical features contains no contradiction, it should be considered to fall within the scope of this specification.
It should be noted that the terms "first/second/third" in the embodiments of the present application merely distinguish similar objects and do not imply a specific ordering; where permissible, "first/second/third" may be interchanged so that the embodiments described herein can be implemented in an order other than that illustrated or described.
The terms "comprising" and "having" and any variations thereof in the embodiments of the present application are intended to cover non-exclusive inclusions. For example, a process, method, apparatus, product, or device that comprises a list of steps or modules is not limited to the listed steps or modules but may alternatively include other steps or modules not listed or inherent to such process, method, product, or device.
The above-mentioned embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and improvements without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (4)

1. An angular high-resolution remote sensing image target detection method based on improved CenterNet is characterized by comprising the following steps:
s10, adopting HRNet as a backbone network under the CenterNet framework to obtain an improved CenterNet framework;
s20, giving a plurality of remote sensing target images as training samples, inputting the training samples into an improved CenterNet frame, and training to obtain a remote sensing image target detection frame;
s30, cutting the remote sensing image to be detected into a plurality of unit images with the same size, inputting each unit image into the remote sensing image target detection frame respectively for target detection, and determining the target detection frame of each unit image, wherein the method comprises the following steps: respectively inputting the unit images into a remote sensing image target detection framework, calculating a feature graph of the unit images by using HRNet, performing convolution operation on the feature graph, calculating a thermodynamic diagram of the images, obtaining a central point of the unit images by searching a peak value of the thermodynamic diagram, obtaining a target central point, a target size and a target orientation angle of the unit images by regression calculation, and determining a target detection frame according to the target central point, the target size and the target orientation angle;
and S40, performing edge splicing on the unit images according to the target detection frame of each unit image to determine the detection targets of the remote sensing image to be detected.
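As a minimal editorial sketch of the S30/S40 flow in claim 1 (all helper names are hypothetical; the patent publishes no code), a large scene is cut into fixed-size unit images, each unit is detected independently, and the resulting boxes are shifted back into scene coordinates before merging:

```python
import numpy as np

def detect_scene(image: np.ndarray, detect_unit, tile: int = 512):
    """Cut a large remote sensing image into tile x tile unit images, run
    the detector on each unit, and shift every box back into scene
    coordinates. detect_unit(img) is assumed to return tuples of
    (cx, cy, w, h, angle, score, cls) in unit-image coordinates."""
    H, W = image.shape[:2]
    detections = []
    for y0 in range(0, H, tile):
        for x0 in range(0, W, tile):
            unit = image[y0:y0 + tile, x0:x0 + tile]
            for (cx, cy, w, h, ang, score, cls) in detect_unit(unit):
                detections.append((cx + x0, cy + y0, w, h, ang, score, cls))
    # Edge splicing: boxes from adjacent units describing the same target
    # would be merged here (e.g. by rotated NMS); omitted for brevity.
    return detections
```

In practice the unit images would typically overlap, so that a target cut by a tile border is seen whole in at least one unit; the edge-splicing step then merges or suppresses the duplicated boxes.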
2. The improved CenterNet-based angular high-resolution remote sensing image target detection method according to claim 1, wherein performing a convolution operation on the feature map, calculating the heatmap of the image, obtaining the center points of the unit image by searching for the peaks of the heatmap, obtaining the target center point, the target size and the target orientation angle of the unit image by regression calculation, and determining the target detection frame according to the target center point, the target size and the target orientation angle comprises the following steps:
performing a convolution operation on the feature map to calculate the heatmap of the image

$$\hat{Y} \in [0,1]^{\frac{W}{R} \times \frac{H}{R} \times C}$$

wherein R denotes the size scaling (output stride), C is the number of keypoint classes, and W and H are the width and height of the image; the peak points of the heatmap are the center points, and the width and height of the target are predicted at each peak position;
training the keypoints of the target through the heatmap to determine the center points of the unit image;
and obtaining a target central point, a target size and a target orientation angle of the unit image through regression calculation, and determining a target detection frame according to the target central point, the target size and the target orientation angle.
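A minimal sketch of the prediction heads implied by claim 2 (and by the C+5 output count in claim 3); the layer widths are illustrative assumptions, not the patent's exact configuration:

```python
import torch
import torch.nn as nn

class DetectionHeads(nn.Module):
    """From the backbone feature map, predict a C-channel heatmap plus
    2-channel offset, 2-channel size and 1-channel angle maps (C+5 per cell)."""
    def __init__(self, in_ch: int = 32, num_classes: int = 15):
        super().__init__()
        def head(out_ch: int) -> nn.Sequential:
            return nn.Sequential(
                nn.Conv2d(in_ch, 64, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(64, out_ch, 1))
        self.heatmap = head(num_classes)  # keypoint class scores
        self.offset = head(2)             # sub-pixel center offset
        self.size = head(2)               # target width and height
        self.angle = head(1)              # target orientation angle

    def forward(self, feat: torch.Tensor):
        return (torch.sigmoid(self.heatmap(feat)),  # peaks are center points
                self.offset(feat), self.size(feat), self.angle(feat))

hm, off, wh, ang = DetectionHeads()(torch.randn(1, 32, 128, 128))
print(hm.shape, off.shape, wh.shape, ang.shape)  # 15 + 2 + 2 + 1 channels
```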
3. The improved CenterNet-based angular high-resolution remote sensing image target detection method according to claim 2, wherein
assuming that the position of a target keypoint on the ground-truth map of a unit image is $p \in \mathbb{R}^2$, the original input image is successively down-sampled by the HRNet network to obtain the corresponding feature map, on which the corresponding keypoint is

$$\tilde{p} = \left\lfloor \frac{p}{R} \right\rfloor$$

wherein R denotes the size scaling;
the GT points on the ground-truth map are dispersed onto a heatmap $Y \in [0,1]^{\frac{W}{R} \times \frac{H}{R} \times C}$ by the Gaussian kernel formula

$$Y_{xyc} = \exp\left(-\frac{(x - \tilde{p}_x)^2 + (y - \tilde{p}_y)^2}{2\sigma_p^2}\right)$$

wherein $\sigma_p$ is a standard deviation adaptive to the target size, and $\tilde{p}_x$ and $\tilde{p}_y$ are the horizontal and vertical coordinates of the keypoint $\tilde{p}$; if the Gaussians of two targets of the same class overlap, the element-wise maximum is taken;
the training objective function of the keypoints of the target heatmap is set as:

$$L_k = \frac{-1}{N} \sum_{xyc} \begin{cases} \left(1 - \hat{Y}_{xyc}\right)^{\alpha} \log \hat{Y}_{xyc} & \text{if } Y_{xyc} = 1 \\ \left(1 - Y_{xyc}\right)^{\beta} \hat{Y}_{xyc}^{\alpha} \log\left(1 - \hat{Y}_{xyc}\right) & \text{otherwise} \end{cases}$$

wherein $\alpha$ and $\beta$ are hyper-parameters, set to 2 and 4 respectively in the experiments, N is the number of keypoints in the image, and the factor $\frac{1}{N}$ serves mainly to normalize the loss;
a local offset $\hat{O} \in \mathbb{R}^{\frac{W}{R} \times \frac{H}{R} \times 2}$ is added to the prediction of the center point; all classes c share the same offset prediction, and this offset is trained with an L1 loss:

$$L_{off} = \frac{1}{N} \sum_{p} \left| \hat{O}_{\tilde{p}} - \left(\frac{p}{R} - \tilde{p}\right) \right|$$

it follows that the local offset makes a prediction only at the keypoint locations $\tilde{p}$;
when searching for target pixel points, the keypoints $\hat{Y}$ are used to obtain all centers; assuming $\left(x_1^{(k)}, y_1^{(k)}, x_2^{(k)}, y_2^{(k)}\right)$ is the target box of class $c_k$ for target k, its center position is

$$p_k = \left(\frac{x_1^{(k)} + x_2^{(k)}}{2},\; \frac{y_1^{(k)} + y_2^{(k)}}{2}\right)$$

and the size regressed for each target is

$$s_k = \left(x_2^{(k)} - x_1^{(k)},\; y_2^{(k)} - y_1^{(k)}\right);$$

for fast acquisition of the target box, a single size prediction $\hat{S}$ is used for all classes, and an L1 loss is added at the center position:

$$L_{size} = \frac{1}{N} \sum_{k=1}^{N} \left| \hat{S}_{p_k} - s_k \right|;$$
when finding the angular orientation of the target, the keypoints $\hat{Y}$ are likewise used for regression; assuming $a_k$ is the orientation of the target box of class $c_k$ for target k, predicted as $\hat{A}_{p_k}$, an L1 loss is added at the center position:

$$L_{angle} = \frac{1}{N} \sum_{k=1}^{N} \left| \hat{A}_{p_k} - a_k \right|;$$
the network target loss function in the whole target detection process consists of four parts:
$$L_{det} = L_k + \lambda_{size} L_{size} + \lambda_{angle} L_{angle} + \lambda_{off} L_{off}$$
in the experiments, $\lambda_{size} = 0.1$, $\lambda_{angle} = 0.1$ and $\lambda_{off} = 1$; the whole network predicts C+5 values at each position, namely the C keypoint class scores, the target center point (x, y), the size (w, h) and the target angle, and all outputs share the backbone of the same deep convolutional neural network;
the peak point for each category on the thermodynamic diagram is extracted to determine the target center point, target size, and target orientation angle.
4. The improved CenterNet-based angular high-resolution remote sensing image target detection method according to claim 3, wherein extracting the peak points of each category on the heatmap comprises:
comparing each response point on the heatmap with its eight neighboring points; if the value of the response point is greater than or equal to the maximum of its eight neighbors, the response point is retained, otherwise it is discarded; and the top 100 peak points meeting this requirement are retained to obtain the peak points of each category on the heatmap.
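The eight-neighbour comparison of claim 4 is equivalent to keeping the responses that survive a 3x3 max-pooling; a minimal sketch, assuming a (B, C, H, W) heatmap tensor:

```python
import torch
import torch.nn.functional as F

def extract_peaks(heatmap: torch.Tensor, k: int = 100):
    """heatmap: (B, C, H, W). Keep responses equal to the max of their 3x3
    neighbourhood (i.e. >= all eight neighbours), then take the top k."""
    pooled = F.max_pool2d(heatmap, kernel_size=3, stride=1, padding=1)
    keep = (pooled == heatmap).float()          # eight-neighbour test
    scores, inds = torch.topk((heatmap * keep).flatten(1), k)
    C, H, W = heatmap.shape[1:]
    cls = torch.div(inds, H * W, rounding_mode="floor")
    ys = torch.div(inds % (H * W), W, rounding_mode="floor")
    xs = inds % W
    return scores, cls, ys, xs                  # per-category peak points

scores, cls, ys, xs = extract_peaks(torch.rand(2, 15, 128, 128))
```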
CN202010521896.3A 2020-06-10 2020-06-10 Angle high-resolution remote sensing image target detection method based on improved CenterNet Active CN111797697B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010521896.3A CN111797697B (en) 2020-06-10 2020-06-10 Angle high-resolution remote sensing image target detection method based on improved CenterNet

Publications (2)

Publication Number Publication Date
CN111797697A CN111797697A (en) 2020-10-20
CN111797697B true CN111797697B (en) 2022-08-05

Family

ID=72804728

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010521896.3A Active CN111797697B (en) 2020-06-10 2020-06-10 Angle high-resolution remote sensing image target detection method based on improved CenterNet

Country Status (1)

Country Link
CN (1) CN111797697B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111968145B (en) * 2020-10-23 2021-01-15 腾讯科技(深圳)有限公司 Box type structure identification method and device, electronic equipment and storage medium
CN112232432B (en) * 2020-10-26 2023-04-11 西安交通大学 Security check X-ray image target detection and identification method based on improved central point detection
CN112465854A (en) * 2020-12-17 2021-03-09 北京三川未维科技有限公司 Unmanned aerial vehicle tracking method based on anchor-free detection algorithm
CN112733624B (en) * 2020-12-26 2023-02-03 电子科技大学 People stream density detection method, system storage medium and terminal for indoor dense scene
CN112883840B (en) * 2021-02-02 2023-07-07 中国人民公安大学 Power transmission line extraction method based on key point detection
CN112884742B (en) * 2021-02-22 2023-08-11 山西讯龙科技有限公司 Multi-target real-time detection, identification and tracking method based on multi-algorithm fusion
CN112949730B (en) * 2021-03-11 2024-04-09 无锡禹空间智能科技有限公司 Method, device, storage medium and equipment for detecting target with few samples
CN112990198B (en) * 2021-03-22 2023-04-07 华南理工大学 Detection and identification method and system for water meter reading and storage medium
CN113159198A (en) * 2021-04-27 2021-07-23 上海芯物科技有限公司 Target detection method, device, equipment and storage medium
CN113420819B (en) * 2021-06-25 2022-12-06 西北工业大学 Lightweight underwater target detection method based on CenterNet
CN113673576A (en) * 2021-07-26 2021-11-19 浙江大华技术股份有限公司 Image detection method, terminal and computer readable storage medium thereof
CN113627288B (en) * 2021-07-27 2023-08-18 武汉大学 Intelligent information label acquisition method for massive images
CN113723511B (en) * 2021-08-31 2023-12-22 厦门大学 Target detection method based on remote sensing electromagnetic radiation and infrared image
CN113869246B (en) * 2021-09-30 2022-04-01 安徽大学 Wheat stripe rust germ summer spore microscopic image detection method based on improved CenterNet technology
CN114913428A (en) * 2022-04-26 2022-08-16 哈尔滨理工大学 Remote sensing image target detection system based on deep learning

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2458123A1 (en) * 2003-03-13 2004-09-13 Synodon Inc. Remote sensing of gas leaks
CN111091105A (en) * 2019-12-23 2020-05-01 郑州轻工业大学 Remote sensing image target detection method based on new frame regression loss function

Also Published As

Publication number Publication date
CN111797697A (en) 2020-10-20

Similar Documents

Publication Publication Date Title
CN111797697B (en) Angle high-resolution remote sensing image target detection method based on improved CenterNet
CN113378686B (en) Two-stage remote sensing target detection method based on target center point estimation
CN108021890B (en) High-resolution remote sensing image port detection method based on PLSA and BOW
Duan et al. Automatic alignment of geographic features in contemporary vector data and historical maps
Shahi et al. Road condition assessment by OBIA and feature selection techniques using very high-resolution WorldView-2 imagery
CN113223068A (en) Multi-modal image registration method and system based on depth global features
CN110175524A (en) A kind of quick vehicle checking method of accurately taking photo by plane based on lightweight depth convolutional network
EP3680608A1 (en) Antenna downward inclination angle measurement method based on multi-scale detection algorithm
CN105069451A (en) License plate identifying and positioning method based on binocular camera
CN104574335A (en) Infrared and visible image fusion method based on saliency map and interest point convex hulls
CN113033315A (en) Rare earth mining high-resolution image identification and positioning method
Li et al. Ship detection of optical remote sensing image in multiple scenes
CN106845458A (en) A kind of rapid transit label detection method of the learning machine that transfinited based on core
Guo et al. Exploring GIS knowledge to improve building extraction and change detection from VHR imagery in urban areas
Su et al. Object detection in aerial images using a multiscale keypoint detection network
Tang et al. Fast multidirectional vehicle detection on aerial images using region based convolutional neural networks
Sun et al. Building outline extraction from aerial imagery and digital surface model with a frame field learning framework
CN110751077B (en) Optical remote sensing picture ship detection method based on component matching and distance constraint
CN112614121A (en) Multi-scale small-target equipment defect identification and monitoring method
CN116403071B (en) Method and device for detecting few-sample concrete defects based on feature reconstruction
CN114860974A (en) Remote sensing image retrieval positioning method
Farooq et al. Efficient object proposals extraction for target detection in VHR remote sensing images
Wang et al. Detection of remote sensing targets with angles via modified CenterNet
Vanegas et al. Detection of aligned objects for high resolution image understanding
CN116385477A (en) Tower image registration method based on image segmentation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant