CN115019181A - Remote sensing image rotating target detection method, electronic equipment and storage medium

Remote sensing image rotating target detection method, electronic equipment and storage medium

Info

Publication number
CN115019181A
Authority
CN
China
Prior art keywords
target
remote sensing
sample points
rotating frame
sensing image
Prior art date
Legal status
Granted
Application number
CN202210900309.0A
Other languages
Chinese (zh)
Other versions
CN115019181B (en)
Inventor
金世超
冯鹏铭
贺广均
符晗
刘世烁
常江
邹同元
梁银川
车程安
田路云
Current Assignee
Beijing Institute of Satellite Information Engineering
Original Assignee
Beijing Institute of Satellite Information Engineering
Priority date
Filing date
Publication date
Application filed by Beijing Institute of Satellite Information Engineering
Priority to CN202210900309.0A
Publication of CN115019181A
Application granted
Publication of CN115019181B
Legal status: Active
Anticipated expiration

Classifications

    • G06V 20/10 Scenes; scene-specific elements; terrestrial scenes
    • G06N 3/02, G06N 3/08 Computing arrangements based on biological models; neural networks; learning methods
    • G06V 10/52 Scale-space analysis, e.g. wavelet analysis
    • G06V 10/764 Recognition using classification, e.g. of video objects
    • G06V 10/766 Recognition using regression, e.g. by projecting features on hyperplanes
    • G06V 10/7715 Feature extraction, e.g. by transforming the feature space; mappings, e.g. subspace methods
    • G06V 10/82 Recognition using neural networks
    • G06V 2201/07 Target detection

Abstract

The invention relates to a remote sensing image rotating target detection method, an electronic device and a storage medium. In training, for a given target position label, rich sample points are obtained by elliptical-distribution sampling; an adaptive foreground sampling strategy then draws high-quality foreground sample points in order from the high-level feature maps to the low-level ones, and these are fed into the loss function together with the foreground targets predicted by the network, so that a more accurate target feature representation is learned.

Description

Remote sensing image rotating target detection method, electronic equipment and storage medium
Technical Field
The invention relates to a remote sensing image rotating target detection method, an electronic device and a storage medium.
Background
With the gradual improvement of the resolution and imaging quality of remote sensing images, extracting targets of interest from high-resolution remote sensing images has become feasible. Deep learning based on convolutional neural networks, which automatically extracts and learns target features, can detect rotated targets in remote sensing images well.
Common deep-learning target detection methods fall into two main types. 1. Multi-step detectors: on the feature maps extracted by a feature-extraction backbone network, a Region Proposal Network (RPN) first distinguishes targets (foreground) from background using anchor boxes, and the targets are then classified and regressed from the RPN output. 2. Single-step detectors, which come in anchor-based and anchor-free variants: anchor-based single-step detectors use anchor boxes to predict the offset and category of each target position on the feature map and then perform regression and classification, while anchor-free single-step detectors directly predict the four vertices of the target box on the downsampled feature map with a method similar to keypoint detection, classifying the target and regressing the vertices.
A conventional anchor-free single-step detector mainly involves the following steps:
Step 1: use a deep convolutional neural network as the feature-extraction backbone to extract features from the input image, obtaining downsampled feature maps;
Step 2: upsample the downsampled feature maps layer by layer with a Feature Pyramid Network (FPN) to recover finer image detail;
Step 3: from the FPN feature maps, predict the category and position of the rotated box representing each target with two network branches, namely a target classification network and a size-and-angle regression network;
Step 4: obtain foreground sample points with a sampling strategy based on a fixed center distance and a restricted feature-map level, and compute the loss of each branch network through its loss function by combining the foreground sample points with the network predictions;
Step 5: optimize the network parameters through gradient backpropagation, thereby training the target detection network;
Step 6: detect targets in high-resolution remote sensing images with the trained network, deriving and outputting the category, position and angle of each target's rotated box from the outputs of the two sub-networks. A decoding sketch of this step follows.
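To make step 6 concrete, here is a hypothetical sketch of decoding one pyramid level's raw head outputs into scored rotated boxes. The (x_c, y_c, w, h, θ) output layout, the offset-plus-grid parameterization and the 0.3 score threshold are illustrative assumptions, not details given by the patent.

```python
# Hypothetical decoding of one pyramid level's raw outputs into scored
# rotated boxes; layout, parameterization and threshold are assumptions.
import torch

def decode_level(cls_map, reg_map, stride, score_thresh=0.3):
    # cls_map: (C, H, W) raw class scores; reg_map: (5, H, W) box parameters.
    scores, labels = torch.sigmoid(cls_map).max(dim=0)   # best class per location
    ys, xs = torch.nonzero(scores > score_thresh, as_tuple=True)
    # Map feature-map locations back to image coordinates and add the
    # predicted center offsets.
    cx = xs.float() * stride + reg_map[0, ys, xs]
    cy = ys.float() * stride + reg_map[1, ys, xs]
    boxes = torch.stack([cx, cy, reg_map[2, ys, xs],
                         reg_map[3, ys, xs], reg_map[4, ys, xs]], dim=1)
    return boxes, scores[ys, xs], labels[ys, xs]

cls_map, reg_map = torch.randn(15, 64, 64), torch.randn(5, 64, 64)
boxes, scores, labels = decode_level(cls_map, reg_map, stride=8)
```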
Although conventional methods can acquire and learn target features, they ignore the extremely large variation of target size and aspect ratio in remote sensing images, so the features at the head and tail ends of a target cannot be captured and positive samples are lost.
Disclosure of Invention
In view of these technical problems, the invention provides a remote sensing image rotating target detection method, an electronic device and a storage medium that use an adaptive elliptical-distribution foreground sampling strategy, so that the network framework can obtain high-quality target features within a single-step anchor-free pipeline and then detect targets with rotated boxes. Through this sampling strategy, the invention addresses another key problem in remote sensing target detection: the large variation of target scale and aspect ratio makes it difficult for the network to extract accurate target features. Solving it improves detection accuracy.
The technical solution for realizing the purpose of the invention is as follows: a remote sensing image rotating target detection method comprises the following steps:
s1, obtaining the remote sensing image and a target label corresponding to the remote sensing image, and obtaining a corresponding multi-scale feature map according to the remote sensing image;
s2, constructing a target classification branch network and a position and angle regression branch network, and predicting the multi-scale feature map to obtain a target prediction value;
s3, screening sample points meeting the elliptical distribution on the multi-scale characteristic graph by using the target label corresponding to the remote sensing image in the S1 to obtain foreground sample points and obtain the real category and the regression value of the foreground sample points;
step S4, performing network training by using the foreground sample points and the predicted values, repeatedly executing steps S1 to S3, and training a detection model;
and S5, detecting the remote sensing image by using the detection model obtained in the step S4.
According to one aspect of the invention, in step S1, after the remote sensing image is acquired, the remote sensing image is preprocessed, and the preprocessing includes at least cropping and flipping.
According to an aspect of the present invention, in step S1, the method further includes:
step S11, extracting features of the input samples with a feature-extraction backbone network to obtain feature maps;
step S12, performing upsampling and feature fusion on the feature maps obtained in step S11 with the feature pyramid to obtain multi-scale feature maps.
According to an aspect of the present invention, in step S2, the method specifically includes:
step S21, establishing a classification prediction network branch and a position regression prediction network branch on the basis of the multi-scale feature maps obtained in step S1;
step S22, predicting on the multi-scale feature maps with the classification prediction network branch and the position regression prediction network branch to obtain target prediction values.
According to an aspect of the present invention, in step S3, the method specifically includes:
s31, according to the target label corresponding to the remote sensing image in the S1, aiming at the actual size of each target, screening out sample points meeting the elliptical distribution on the multi-scale feature map obtained in the S2;
in step S32, a self-adaptive foreground sample sampling strategy is adopted, and a certain number of foreground sample points are sequentially sampled from the high level to the low level on the feature pyramid, and the true category and the regression value thereof are obtained.
According to an aspect of the present invention, step S31 specifically includes:
step S311, using the multi-scale feature maps P_i obtained in step S1 and the sizes of the different feature maps, calculating the position coordinates (x, y) at which each sample point maps back to the original image;
step S312, using the sample-point position coordinates (x, y) and the rotated-box coordinates (x_c, y_c, w, h, θ) in the corresponding target label, constructing an elliptical sampling range based on the target size and rotation angle, and screening out the sample points inside the elliptical range E:

E = { (x, y) : (Δx / (w/2))² + (Δy / (h/2))² ≤ τ }

wherein x_c is the x-axis coordinate of the rotated-box center point, y_c is the y-axis coordinate of the rotated-box center point, w is the width of the rotated box, h is the height of the rotated box, and Δx and Δy are calculated from the target label and the sample-point coordinates:

Δx = dx·cos θ + dy·sin θ,  Δy = dy·cos θ − dx·sin θ

wherein dx and dy represent the coordinate offsets between the sample point and the rotated-box center point:

dx = x − x_c,  dy = y − y_c

and τ is the range threshold of the elliptical distribution.
According to an aspect of the present invention, step S32 specifically includes:
step S321, using the width w and height h of each rotated box in the target label, obtaining the longest side l of each rotated box:

l = max(w, h)

step S322, for the sample points inside the elliptical range obtained in step S31, calculating the Euclidean distance d from each sample point to the rotated-box center point, and, with the longest side l of the rotated box, calculating the normalized distance d̂ from the sample point to the center:

d = √(dx² + dy²),  d̂ = d / l

step S323, sampling foreground points for the rotated boxes in order of longest side from large to small, and, using the normalized distance, successively selecting the closest not-yet-selected sample points from the high-level feature maps down to the low-level ones as the foreground sample points of each rotated box.
According to an aspect of the present invention, in step S4, performing network training with the foreground sample points and the predicted values specifically includes:
step S41, computing the loss of the target classification network branch with the focal loss:

L_cls = −Σ_{i=1..M} α(1 − p_i)^β log(p_i)

wherein α and β are a balance factor and a smoothing factor respectively, M is the number of sample points selected in the image, and p_i is the predicted probability of the true class at sample point i;
step S42, computing the loss of the position regression branch network with the Smooth-L1 loss;
step S43, combining the two branch network losses by weighted averaging to obtain the total loss:

L = (L_cls + λ·L_reg) / N_pos

wherein N_pos is the number of positive samples, L_cls and L_reg are the classification and position regression losses respectively, and λ is a balance weight.
According to an aspect of the present invention, there is provided an electronic apparatus including: one or more processors, one or more memories, and one or more computer programs; wherein, the processor is connected with the memory, the one or more computer programs are stored in the memory, and when the electronic device runs, the processor executes the one or more computer programs stored in the memory, so as to make the electronic device execute a remote sensing image rotating object detection method according to any one of the above technical solutions.
According to an aspect of the present invention, there is provided a computer readable storage medium for storing computer instructions, which when executed by a processor, implement a method for detecting a rotating target in a remote sensing image according to any one of the above technical solutions.
According to the concept of the invention, the invention provides a remote sensing image rotating target detection method, an electronic device and a storage medium. In training, for a given target position label, rich sample points are first obtained by elliptical-distribution sampling; then, with an adaptive foreground sampling strategy, high-quality foreground sample points are obtained in order from the high-level feature maps to the low-level ones and fed into the loss function together with the foreground targets predicted by the network, so that a more accurate target feature representation is learned.
Meanwhile, in the constructed feature pyramid, high-quality foreground sample points are obtained at every level, from high to low, according to the target size, and training and prediction proceed by fusing them. This resolves the difficulty that small targets struggle to obtain sample points in the feature pyramid while large targets obtain too many redundant ones; the adaptive method improves sampling precision and generalization, which is of great significance for rotated-box target detection in high-resolution remote sensing images.
Drawings
FIG. 1 is a flow chart schematically illustrating model training of a method for detecting a rotating target in a remote sensing image according to an embodiment of the invention;
FIG. 2 is a schematic diagram illustrating adaptive elliptically distributed foreground sample point sampling according to an embodiment of the present invention;
FIG. 3 schematically illustrates a model training phase in accordance with one embodiment of the present invention;
FIG. 4 is a flow chart that schematically illustrates a method for detecting a rotating target in a remote sensing image, in accordance with an embodiment of the present invention;
FIG. 5 schematically shows a flowchart of step S2 according to an embodiment of the present invention;
fig. 6 schematically shows a flowchart of step S3 according to an embodiment of the present invention.
Detailed Description
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the embodiments are briefly described below. Obviously, the drawings described below show only some embodiments of the invention, and a person skilled in the art can derive other drawings from them without inventive effort.
The present invention is described in detail below with reference to the drawings and specific embodiments; the embodiments of the present invention are not limited to the following embodiments.
As shown in fig. 1 to 6, the method for detecting the remote sensing image rotating target of the invention comprises the following steps:
s1, obtaining the remote sensing image and a target label corresponding to the remote sensing image, and obtaining a corresponding multi-scale characteristic diagram according to the remote sensing image;
s2, constructing a target classification branch network and a position and angle regression branch network, and predicting the multi-scale feature map to obtain a target prediction value;
s3, screening sample points meeting the elliptical distribution on the multi-scale characteristic graph by using the target label corresponding to the remote sensing image in the S1 to obtain foreground sample points and obtain the real category and the regression value of the foreground sample points;
s4, performing network training by using the foreground sample points and the predicted values, repeatedly executing the steps S1-S3, and training a detection model;
and S5, detecting the remote sensing image by using the detection model obtained in the step S4.
In this embodiment, as shown in figs. 1 and 4, during training, for a given target position label, rich sample points are first obtained by elliptical-distribution sampling; then, with an adaptive foreground sampling strategy, high-quality foreground sample points are obtained in order from the high-level feature maps to the low-level ones and fed into the loss function together with the foreground targets predicted by the network, so that a more accurate target feature representation is learned.
Meanwhile, in the constructed feature pyramid, high-quality foreground sample points are obtained at every level, from high to low, according to the target size, and training and prediction proceed by fusing them. This resolves the difficulty that small targets struggle to obtain sample points in the feature pyramid while large targets obtain too many redundant ones; the adaptive method improves sampling precision and generalization, which is of great significance for rotated-box target detection in high-resolution remote sensing images.
In an embodiment of the present invention, preferably, in step S1, after the remote sensing image is acquired, the remote sensing image is preprocessed, and the preprocessing includes at least cropping and flipping.
In this embodiment, preprocessing such as cropping and flipping the remote sensing image reduces the complexity of the subsequent algorithm and improves efficiency.
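As a concrete illustration, the sketch below crops a random window and applies a random horizontal flip with NumPy. The window size and flip probability are illustrative assumptions, not values given by the patent, and the matching transformation of the rotated-box labels is omitted for brevity.

```python
# Minimal preprocessing sketch: random crop plus random horizontal flip.
# Window size (1024) and flip probability (0.5) are illustrative assumptions.
import numpy as np

def preprocess(image, crop=1024, flip_p=0.5, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    # Random crop (falls back to the full image if it is already small enough).
    if h > crop and w > crop:
        top = int(rng.integers(0, h - crop + 1))
        left = int(rng.integers(0, w - crop + 1))
        image = image[top:top + crop, left:left + crop]
    # Random horizontal flip.
    if rng.random() < flip_p:
        image = image[:, ::-1]
    return np.ascontiguousarray(image)
```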
In an embodiment of the present invention, preferably, in step S1, the method further includes:
step S11, extracting features of the input samples with a feature-extraction backbone network to obtain feature maps;
step S12, performing upsampling and feature fusion on the feature maps obtained in step S11 with the feature pyramid to obtain multi-scale feature maps.
As shown in fig. 5, in an embodiment of the present invention, preferably, in step S2, the method specifically includes:
s21, establishing a classification prediction network branch and a position regression prediction network branch on the basis of the multi-scale feature map obtained in the S1;
and step S22, predicting the multi-scale characteristic diagram by adopting the classification prediction network branch and the position regression prediction network branch to obtain a target prediction value.
As shown in fig. 6, in an embodiment of the present invention, preferably, in step S3, the method specifically includes:
s31, according to the target label corresponding to the remote sensing image in the S1, aiming at the actual size of each target, screening out sample points meeting the elliptical distribution on the multi-scale feature map obtained in the S2;
in step S32, a self-adaptive foreground sample sampling strategy is adopted, and a certain number of foreground sample points are sequentially sampled from the high level to the low level on the feature pyramid, and the true category and the regression value thereof are obtained.
In this embodiment, the feature-extraction backbone network extracts the features of the input samples to obtain feature maps C_i, where i is the layer index of the feature map. Applying the feature pyramid structure, the backbone feature maps C3, C4 and C5 are first upsampled to obtain the feature maps P3, P4 and P5; P5 is then downsampled to obtain P6 and P7; and, using 1 × 1 convolutions, the features P3, P4, P5, P6 and P7 of the resulting feature pyramid are cascaded to obtain Feature Heads of different scales. A classification prediction network branch and a position regression prediction network branch are then established on the basis of the acquired Feature Heads.
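A minimal PyTorch sketch of this pyramid construction follows. The backbone channel widths (512, 1024, 2048) and the stride-2 convolutions for the extra levels are assumptions in the spirit of common FPN practice; the patent states only that the backbone maps are fused by upsampling and 1 × 1 convolutions and that P5 is downsampled into P6 and P7.

```python
# Sketch of the five-level pyramid (P3-P7) under the assumptions above.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FPN(nn.Module):
    def __init__(self, in_channels=(512, 1024, 2048), out_channels=256):
        super().__init__()
        self.lateral = nn.ModuleList(
            nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels)
        self.down6 = nn.Conv2d(out_channels, out_channels, 3, stride=2, padding=1)
        self.down7 = nn.Conv2d(out_channels, out_channels, 3, stride=2, padding=1)

    def forward(self, c3, c4, c5):
        p5 = self.lateral[2](c5)
        p4 = self.lateral[1](c4) + F.interpolate(p5, scale_factor=2.0)
        p3 = self.lateral[0](c3) + F.interpolate(p4, scale_factor=2.0)
        p6 = self.down6(p5)              # downsample P5 -> P6
        p7 = self.down7(F.relu(p6))      # downsample again -> P7
        return [p3, p4, p5, p6, p7]      # strides 8, 16, 32, 64, 128

fpn = FPN()
c3, c4, c5 = (torch.randn(1, c, s, s) for c, s in
              zip((512, 1024, 2048), (128, 64, 32)))
print([p.shape[-1] for p in fpn(c3, c4, c5)])  # [128, 64, 32, 16, 8]
```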
In target detection, foreground targets must be distinguished from the background. The classification prediction network branch distinguishes foreground targets from the background and discriminates target categories, while the position regression prediction network branch determines the position and angle of the rotated box.
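A sketch of these two prediction branches is given below: per-location class scores and a five-parameter rotated-box regression (x_c, y_c, w, h, θ). The tower depth and channel width are assumptions; the patent does not specify the branch architecture.

```python
# Sketch of the two prediction branches shared across pyramid levels.
import torch
import torch.nn as nn

def conv_tower(channels=256, depth=4):
    layers = []
    for _ in range(depth):
        layers += [nn.Conv2d(channels, channels, 3, padding=1),
                   nn.GroupNorm(32, channels),
                   nn.ReLU(inplace=True)]
    return nn.Sequential(*layers)

class RotatedBoxHead(nn.Module):
    def __init__(self, num_classes, channels=256):
        super().__init__()
        self.cls_tower = conv_tower(channels)
        self.reg_tower = conv_tower(channels)
        self.cls_out = nn.Conv2d(channels, num_classes, 3, padding=1)
        self.reg_out = nn.Conv2d(channels, 5, 3, padding=1)  # xc, yc, w, h, theta

    def forward(self, pyramid):
        cls = [self.cls_out(self.cls_tower(p)) for p in pyramid]
        reg = [self.reg_out(self.reg_tower(p)) for p in pyramid]
        return cls, reg
```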
In one embodiment of the present invention, preferably, in step S31, an elliptical distribution is constructed to screen sample points according to the target label and the obtained feature maps P_i of size H × W × C, where H × W is the two-dimensional size of a feature map and C is its number of channels. This specifically comprises:
step S311, using the multi-scale feature maps P_i and the sizes of the different feature maps, calculating the position coordinates (x, y) at which each sample point maps back to the original image. Let s be the total stride from the input to layer i; then for a position (p, q) on the feature map P_i, its position on the input image is:

x = ⌊s/2⌋ + p·s,  y = ⌊s/2⌋ + q·s

wherein ⌊·⌋ represents rounding down;
step S312, utilizing the position coordinates of the sample points
Figure 978506DEST_PATH_IMAGE003
And rotating frame coordinates in the corresponding target label
Figure 799831DEST_PATH_IMAGE004
Constructing an ellipse distribution sampling range based on the target size and the rotation angle, and screening out the ellipse distribution range
Figure 945642DEST_PATH_IMAGE005
Inner sample points:
Figure 249020DEST_PATH_1
wherein the content of the first and second substances,
Figure 991144DEST_PATH_IMAGE007
is the x-axis coordinate of the central point of the rotating frame,
Figure 932555DEST_PATH_IMAGE008
is the coordinate of the center point y axis of the rotating frame, w is the width of the rotating frame, h is the height of the rotating frame,
Figure 514846DEST_PATH_IMAGE009
and
Figure 659651DEST_PATH_IMAGE010
calculating the coordinates of the target label and the sample point to obtain:
Figure 618380DEST_PATH_IMAGE011
wherein the content of the first and second substances,
Figure 148718DEST_PATH_IMAGE012
and
Figure 901910DEST_PATH_IMAGE013
represents the coordinate offset between the sample point and the center point of the rotating frame:
Figure DEST_PATH_IMAGE048
Figure 704650DEST_PATH_IMAGE049
range threshold representing elliptical distribution:
Figure 152556DEST_PATH_IMAGE016
in the embodiment, a large number of backgrounds can be ignored and foreground objects are highlighted by constructing the ellipse distribution sampling range based on the size and the rotation angle of the object, and the sampling is adaptively performed on the feature map by adjusting the lengths of the long side and the short side of the ellipse, so that the extraction of background information in the rectangular frame is reduced, and the object feature information is more accurately extracted.
As shown in figs. 2 and 3, in an embodiment of the present invention, preferably, step S32 specifically includes:
step S321, using the width w and height h of each rotated box in the target label, obtaining the longest side l of each rotated box:

l = max(w, h)

step S322, for the sample points inside the elliptical range obtained in step S31, calculating the Euclidean distance d from each sample point to the rotated-box center point, and, with the longest side l of the rotated box, calculating the normalized distance d̂ from the sample point to the center:

d = √(dx² + dy²),  d̂ = d / l

step S323, sampling foreground points for the rotated boxes in order of longest side from large to small, and, using the normalized distance, successively selecting the closest not-yet-selected sample points from the high-level feature maps down to the low-level ones as the foreground sample points of each rotated box. A sketch of this selection follows.
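A sketch of this adaptive selection, reusing grid_points and in_ellipse from the previous sketch. The per-box sample budget num_samples is an assumption; the patent says only that "a certain number" of foreground points are drawn from the high pyramid levels down to the low ones.

```python
# Sketch of step S32: greedy high-to-low foreground point selection.
import numpy as np

def adaptive_sampling(levels, boxes, num_samples=6, tau=1.0):
    """levels: [(points_xy, level_id), ...] ordered coarse to fine;
    boxes: (N, 5) array of (xc, yc, w, h, theta).
    Returns {box index: [(level_id, point index), ...]}."""
    taken = {lid: np.zeros(len(pts), dtype=bool) for pts, lid in levels}
    order = np.argsort(-np.maximum(boxes[:, 2], boxes[:, 3]))  # largest first
    assignment = {}
    for bi in order:
        xc, yc, w, h, _ = boxes[bi]
        chosen = []
        for pts, lid in levels:  # high (coarse) levels first, then lower ones
            free = np.flatnonzero(in_ellipse(pts, boxes[bi], tau) & ~taken[lid])
            if free.size == 0:
                continue
            d_hat = np.hypot(pts[free, 0] - xc, pts[free, 1] - yc) / max(w, h)
            for j in free[np.argsort(d_hat)]:  # closest not-yet-selected first
                taken[lid][j] = True
                chosen.append((lid, int(j)))
                if len(chosen) == num_samples:
                    break
            if len(chosen) == num_samples:
                break
        assignment[int(bi)] = chosen
    return assignment

boxes = np.array([[400.0, 300.0, 120.0, 40.0, np.pi / 6],
                  [420.0, 310.0, 60.0, 20.0, 0.0]])
levels = [(grid_points(2 ** (10 - l), 2 ** (10 - l), 2 ** l), l)
          for l in (7, 6, 5, 4, 3)]
print(adaptive_sampling(levels, boxes))
```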
In an embodiment of the present invention, preferably, in step S4, performing network training with the foreground sample points and the predicted values specifically includes:
step S41, computing the loss of the target classification network branch with the focal loss:

L_cls = −Σ_{i=1..M} α(1 − p_i)^β log(p_i)

wherein α and β are a balance factor and a smoothing factor respectively, M is the number of sample points selected in the image, and p_i is the predicted probability of the true class at sample point i;
step S42, computing the loss of the position regression branch network with the Smooth-L1 loss;
step S43, combining the two branch network losses by weighted averaging to obtain the total loss:

L = (L_cls + λ·L_reg) / N_pos

wherein N_pos is the number of positive samples, L_cls and L_reg are the classification and position regression losses respectively, and λ is a balance weight.
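A PyTorch sketch of these losses under the reconstruction above: a sigmoid focal loss summed over the selected sample points, Smooth-L1 over the rotated-box regression targets, and a weighted combination normalized by the number of positive samples. The default alpha, beta and lam values are assumptions.

```python
# Sketch of step S4 losses: focal + Smooth-L1, normalized by positives.
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, beta=2.0):
    # targets: 0/1 tensor of the same shape as the raw logits.
    p = torch.sigmoid(logits)
    pt = torch.where(targets > 0, p, 1.0 - p)
    at = torch.where(targets > 0,
                     torch.full_like(p, alpha), torch.full_like(p, 1.0 - alpha))
    return (-at * (1.0 - pt) ** beta * torch.log(pt.clamp(min=1e-6))).sum()

def detection_loss(cls_logits, cls_targets, reg_pred, reg_targets,
                   num_pos, lam=1.0):
    l_cls = focal_loss(cls_logits, cls_targets)
    l_reg = F.smooth_l1_loss(reg_pred, reg_targets, reduction="sum")
    return (l_cls + lam * l_reg) / max(num_pos, 1)
```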
According to an aspect of the present invention, there is provided an electronic apparatus including: one or more processors, one or more memories, and one or more computer programs; wherein, the processor is connected with the memory, the one or more computer programs are stored in the memory, and when the electronic device runs, the processor executes the one or more computer programs stored in the memory, so as to make the electronic device execute a remote sensing image rotating object detection method according to any one of the above technical solutions.
According to an aspect of the present invention, there is provided a computer-readable storage medium for storing computer instructions, which when executed by a processor, implement a method for detecting a rotating target in a remote sensing image according to any one of the above technical solutions.
In summary, the invention provides a remote sensing image rotating target detection method, an electronic device and a storage medium. In training, for a given target position label, rich sample points are first obtained by elliptical-distribution sampling; then, with an adaptive foreground sampling strategy, high-quality foreground sample points are obtained in order from the high-level feature maps to the low-level ones and fed into the loss function together with the foreground targets predicted by the network, so that a more accurate target feature representation is learned.
Meanwhile, in the constructed feature pyramid, high-quality foreground sample points are obtained at every level, from high to low, according to the target size, and training and prediction proceed by fusing them. This resolves the difficulty that small targets struggle to obtain sample points in the feature pyramid while large targets obtain too many redundant ones; the adaptive method improves sampling precision and generalization, which is of great significance for rotated-box target detection in high-resolution remote sensing images.
Furthermore, it should be noted that the present invention may be provided as a method, apparatus or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media having computer-usable program code embodied in the medium.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should also be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in the process, method, article, or terminal equipment comprising the element.
Finally, it should be noted that while the above describes a preferred embodiment of the invention, it will be appreciated by those skilled in the art that, once the basic inventive concepts have been learned, numerous changes and modifications may be made without departing from the principles of the invention, which shall be deemed to be within the scope of the invention. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.

Claims (10)

1. A method for detecting a remote sensing image rotating target comprises the following steps:
s1, obtaining the remote sensing image and a target label corresponding to the remote sensing image, and obtaining a corresponding multi-scale characteristic diagram according to the remote sensing image;
s2, constructing a target classification branch network and a position and angle regression branch network, and predicting the multi-scale feature map to obtain a target prediction value;
s3, screening out sample points meeting the elliptical distribution on the multi-scale characteristic diagram by using the target labels corresponding to the remote sensing images in the S1, obtaining foreground sample points, and obtaining the real categories and regression values of the foreground sample points;
s4, performing network training by using the foreground sample points and the predicted values, repeatedly executing the steps S1-S3, and training a detection model;
and S5, detecting the remote sensing image by using the detection model obtained in the step S4.
2. The method of claim 1, wherein in step S1, after the remote sensing image is obtained, the remote sensing image is preprocessed, and the preprocessing includes at least cropping and flipping.
3. The method according to claim 2, wherein in step S1, the method further comprises:
step S11, extracting features of the input samples with a feature-extraction backbone network to obtain feature maps;
step S12, performing upsampling and feature fusion on the feature maps obtained in step S11 with the feature pyramid to obtain multi-scale feature maps.
4. The method according to claim 3, wherein in step S2, the method specifically comprises:
s21, establishing a classification prediction network branch and a position regression prediction network branch on the basis of the multi-scale feature map obtained in the S1;
and step S22, predicting the multi-scale feature map by adopting the classification prediction network branch and the position regression prediction network branch to obtain a target prediction value.
5. The method according to claim 1, wherein in step S3, the method specifically includes:
s31, according to the target label corresponding to the remote sensing image in the S1, aiming at the actual size of each target, screening out sample points meeting the elliptical distribution on the multi-scale feature map obtained in the S2;
in step S32, a self-adaptive foreground sample sampling strategy is adopted, and a certain number of foreground sample points are sequentially sampled from a high layer to a low layer on the feature pyramid, and the true category and the regression value thereof are obtained.
6. The method according to claim 5, wherein step S31 specifically comprises:
step S311, using the multi-scale feature maps P_i obtained in step S1 and the sizes of the different feature maps, calculating the position coordinates (x, y) at which each sample point maps back to the original image;
step S312, using the sample-point position coordinates (x, y) and the rotated-box coordinates (x_c, y_c, w, h, θ) in the corresponding target label, constructing an elliptical sampling range based on the target size and rotation angle, and screening out the sample points inside the elliptical range E:

E = { (x, y) : (Δx / (w/2))² + (Δy / (h/2))² ≤ τ }

wherein x_c is the x-axis coordinate of the rotated-box center point, y_c is the y-axis coordinate of the rotated-box center point, w is the width of the rotated box, h is the height of the rotated box, and Δx and Δy are calculated from the target label and the sample-point coordinates:

Δx = dx·cos θ + dy·sin θ,  Δy = dy·cos θ − dx·sin θ

wherein dx and dy represent the coordinate offsets between the sample point and the rotated-box center point:

dx = x − x_c,  dy = y − y_c

and τ is the range threshold of the elliptical distribution.
7. The method according to claim 6, wherein step S32 specifically comprises:
step S321, using the width w and height h of each rotated box in the target label, obtaining the longest side l of each rotated box:

l = max(w, h)

step S322, for the sample points inside the elliptical range obtained in step S31, calculating the Euclidean distance d from each sample point to the rotated-box center point, and, with the longest side l of the rotated box, calculating the normalized distance d̂ from the sample point to the center:

d = √(dx² + dy²),  d̂ = d / l

step S323, sampling foreground points for the rotated boxes in order of longest side from large to small, and, using the normalized distance, successively selecting the closest not-yet-selected sample points from the high-level feature maps down to the low-level ones as the foreground sample points of each rotated box.
8. The method according to claim 1, wherein in step S4, performing network training with the foreground sample points and the predicted values specifically comprises:
step S41, computing the loss of the target classification network branch with the focal loss:

L_cls = −Σ_{i=1..M} α(1 − p_i)^β log(p_i)

wherein α and β are a balance factor and a smoothing factor respectively, M is the number of sample points selected in the image, and p_i is the predicted probability of the true class at sample point i;
step S42, computing the loss of the position regression branch network with the Smooth-L1 loss;
step S43, combining the two branch network losses by weighted averaging to obtain the total loss:

L = (L_cls + λ·L_reg) / N_pos

wherein N_pos is the number of positive samples, L_cls and L_reg are the classification and position regression losses respectively, and λ is a balance weight.
9. An electronic device, comprising: one or more processors, one or more memories, and one or more computer programs; wherein a processor is connected to the memory, the one or more computer programs being stored in the memory, and when the electronic device is running, the processor executes the one or more computer programs stored in the memory to cause the electronic device to perform the method for remote sensing image rotation target detection according to any one of claims 1-8.
10. A computer-readable storage medium storing computer instructions which, when executed by a processor, implement a method for detecting a rotating object in a remote sensing image according to any one of claims 1 to 8.
CN202210900309.0A 2022-07-28 2022-07-28 Remote sensing image rotating target detection method, electronic equipment and storage medium Active CN115019181B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210900309.0A CN115019181B (en) 2022-07-28 2022-07-28 Remote sensing image rotating target detection method, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210900309.0A CN115019181B (en) 2022-07-28 2022-07-28 Remote sensing image rotating target detection method, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115019181A true CN115019181A (en) 2022-09-06
CN115019181B CN115019181B (en) 2023-02-07

Family

ID=83065607

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210900309.0A Active CN115019181B (en) 2022-07-28 2022-07-28 Remote sensing image rotating target detection method, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115019181B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116403122A (en) * 2023-04-14 2023-07-07 北京卫星信息工程研究所 Method for detecting anchor-frame-free directional target
CN116630794A (en) * 2023-04-25 2023-08-22 北京卫星信息工程研究所 Remote sensing image target detection method based on sorting sample selection and electronic equipment

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105160686A (en) * 2015-10-21 2015-12-16 武汉大学 Improved scale invariant feature transformation (SIFT) operator based low altitude multi-view remote-sensing image matching method
CN110458158A (en) * 2019-06-11 2019-11-15 中南大学 A kind of text detection and recognition methods for blind person's aid reading
US20200311479A1 (en) * 2019-03-28 2020-10-01 International Business Machines Corporation Learning of detection model using loss function
CN111901681A (en) * 2020-05-04 2020-11-06 东南大学 Intelligent television control device and method based on face recognition and gesture recognition
CN112005312A (en) * 2018-02-02 2020-11-27 莫勒库莱特股份有限公司 Wound imaging and analysis
CN113191372A (en) * 2021-04-29 2021-07-30 华中科技大学 Construction method and application of ship target directional detection model
CN113468968A (en) * 2021-06-02 2021-10-01 中国地质大学(武汉) Remote sensing image rotating target detection method based on non-anchor frame
CN113887605A (en) * 2021-09-26 2022-01-04 中国科学院大学 Shape-adaptive rotating target detection method, system, medium, and computing device
CN114170527A (en) * 2021-11-30 2022-03-11 航天恒星科技有限公司 Remote sensing target detection method represented by rotating frame
CN114445371A (en) * 2022-01-27 2022-05-06 安徽大学 Remote sensing image target detection method and device based on ellipse intersection ratio

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105160686A (en) * 2015-10-21 2015-12-16 武汉大学 Improved scale invariant feature transformation (SIFT) operator based low altitude multi-view remote-sensing image matching method
CN112005312A (en) * 2018-02-02 2020-11-27 莫勒库莱特股份有限公司 Wound imaging and analysis
US20200311479A1 (en) * 2019-03-28 2020-10-01 International Business Machines Corporation Learning of detection model using loss function
CN110458158A (en) * 2019-06-11 2019-11-15 中南大学 A kind of text detection and recognition methods for blind person's aid reading
CN111901681A (en) * 2020-05-04 2020-11-06 东南大学 Intelligent television control device and method based on face recognition and gesture recognition
CN113191372A (en) * 2021-04-29 2021-07-30 华中科技大学 Construction method and application of ship target directional detection model
CN113468968A (en) * 2021-06-02 2021-10-01 中国地质大学(武汉) Remote sensing image rotating target detection method based on non-anchor frame
CN113887605A (en) * 2021-09-26 2022-01-04 中国科学院大学 Shape-adaptive rotating target detection method, system, medium, and computing device
CN114170527A (en) * 2021-11-30 2022-03-11 航天恒星科技有限公司 Remote sensing target detection method represented by rotating frame
CN114445371A (en) * 2022-01-27 2022-05-06 安徽大学 Remote sensing image target detection method and device based on ellipse intersection ratio

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
LIPING et al.: "A multi-feature fusion-based change detection method for remote sensing images", Remote Sensing
丁业兵: "Ellipse fitting method based on moments of inertia", Computer Engineering and Applications (计算机工程与应用)
聂光涛 et al.: "Survey of object detection algorithms in optical remote sensing images", Acta Automatica Sinica (自动化学报)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116403122A (en) * 2023-04-14 2023-07-07 北京卫星信息工程研究所 Method for detecting anchor-frame-free directional target
CN116403122B (en) * 2023-04-14 2023-12-19 北京卫星信息工程研究所 Method for detecting anchor-frame-free directional target
CN116630794A (en) * 2023-04-25 2023-08-22 北京卫星信息工程研究所 Remote sensing image target detection method based on sorting sample selection and electronic equipment
CN116630794B (en) * 2023-04-25 2024-02-06 北京卫星信息工程研究所 Remote sensing image target detection method based on sorting sample selection and electronic equipment

Also Published As

Publication number Publication date
CN115019181B (en) 2023-02-07

Similar Documents

Publication Publication Date Title
CN110443143B (en) Multi-branch convolutional neural network fused remote sensing image scene classification method
CN110287960B (en) Method for detecting and identifying curve characters in natural scene image
CN106960195B (en) Crowd counting method and device based on deep learning
CN105574513B (en) Character detecting method and device
CN111027493B (en) Pedestrian detection method based on deep learning multi-network soft fusion
CN115019181B (en) Remote sensing image rotating target detection method, electronic equipment and storage medium
CN111027547A (en) Automatic detection method for multi-scale polymorphic target in two-dimensional image
CN109685768B (en) Pulmonary nodule automatic detection method and system based on pulmonary CT sequence
CN109977997B (en) Image target detection and segmentation method based on convolutional neural network rapid robustness
CN111723860A (en) Target detection method and device
CN108647588A (en) Goods categories recognition methods, device, computer equipment and storage medium
CN112800964B (en) Remote sensing image target detection method and system based on multi-module fusion
US9330336B2 (en) Systems, methods, and media for on-line boosting of a classifier
CN108305260B (en) Method, device and equipment for detecting angular points in image
CN111814794A (en) Text detection method and device, electronic equipment and storage medium
CN111814753A (en) Target detection method and device under foggy weather condition
CN111626176A (en) Ground object target detection method and system of remote sensing image
CN116368500A (en) Model training method, image processing method, calculation processing apparatus, and non-transitory computer readable medium
CN115019182B (en) Method, system, equipment and storage medium for identifying fine granularity of remote sensing image target
CN110659601B (en) Depth full convolution network remote sensing image dense vehicle detection method based on central point
CN115457565A (en) OCR character recognition method, electronic equipment and storage medium
CN114821102A (en) Intensive citrus quantity detection method, equipment, storage medium and device
CN114663380A (en) Aluminum product surface defect detection method, storage medium and computer system
CN112800955A (en) Remote sensing image rotating target detection method and system based on weighted bidirectional feature pyramid
CN113901972A (en) Method, device and equipment for detecting remote sensing image building and storage medium

Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant