CN113221775B - Single-stage arbitrary-quadrilateral regression-box method for detecting large-aspect-ratio targets in remote sensing images - Google Patents


Info

Publication number: CN113221775B (granted); published as CN113221775A
Application number: CN202110545880.0A
Authority: CN (China)
Prior art keywords: target, regression, loss, remote sensing image
Legal status: Active
Other languages: Chinese (zh)
Inventors: 宿南, 黄志博, 闫奕名, 冯收, 赵春晖, 黄博闻
Original and current assignee: Harbin Engineering University
Application filed by Harbin Engineering University
Priority to CN202110545880.0A

Classifications

    • G06V20/13 Satellite images (Scenes; Terrestrial scenes)
    • G06F18/2415 Classification techniques relating to the classification model, based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/253 Fusion techniques of extracted features
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; locating or processing of specific regions to guide the detection or recognition
    • G06V2201/07 Target detection


Abstract

A single-stage arbitrary-quadrilateral regression-box method for detecting large-aspect-ratio targets in remote sensing images belongs to the technical field of remote sensing images. It aims to solve the problem that, because remote sensing images are captured from a bird's-eye view, a horizontal box cannot accurately localize a target with a large aspect ratio. The method is based on a single-stage target detection framework and can regress an arbitrary quadrilateral. The process comprises the following steps: extracting features from three feature layers of the target remote sensing image using a feature pyramid network structure, and fusing the extracted features; performing regression calculation on the target position using an arbitrary quadrilateral box to obtain arbitrary-quadrilateral candidate boxes together with a classification result and a confidence score; and merging the candidate boxes with high confidence scores across the three scales, restoring them to the original image size, calculating the intersection-over-union between the candidate boxes of each category, and removing redundant candidate boxes with a non-maximum suppression algorithm adapted to arbitrary quadrilaterals to obtain the final detection result. The method is used for detecting large-aspect-ratio targets in remote sensing images.

Description

Single-stage arbitrary-quadrilateral regression-box method for detecting large-aspect-ratio targets in remote sensing images
Technical Field
The invention relates to a detection algorithm for large-aspect-ratio targets in remote sensing images, and belongs to the technical field of remote sensing images.
Background
With the development of optical remote sensing satellite technology, the resolution of remote sensing images has greatly improved, and with it the demand for target detection in optical remote sensing imagery. However, because of changes in shooting angle and application scenario, remote sensing target detection differs from conventional target detection and faces two new challenges.
On the one hand, since remote sensing images are mostly captured by satellites or unmanned aerial vehicles, the field of view is large, and detecting a specific target in a wide-area scene requires high detection speed as well as accuracy. On the other hand, remote sensing images are all taken from an overhead view, and the horizontal rectangular box used in conventional target detection cannot adequately describe the position of a tilted target with a large aspect ratio, such as a ship.
Disclosure of Invention
The invention aims to solve the problem that, because a remote sensing image is taken from an overhead view, a target with a large aspect ratio cannot be accurately localized with a horizontal box, and provides a single-stage arbitrary-quadrilateral regression-box method for detecting large-aspect-ratio targets in remote sensing images.
In the method according to the invention, the detection algorithm is based on a single-stage target detection framework and can regress an arbitrary quadrilateral;
the specific process comprises the following steps:
S1, extracting features from the three feature layers of the target remote sensing image using the feature pyramid network structure, and fusing the extracted features;
S2, performing regression calculation on the target position of the target remote sensing image using an arbitrary quadrilateral box, obtaining arbitrary-quadrilateral candidate boxes together with a classification result and a confidence score;
S3, merging the candidate boxes with high confidence scores across the three scales, restoring them to the original image size, calculating the intersection-over-union between the candidate boxes of each category, and then removing redundant candidate boxes with a non-maximum suppression algorithm adapted to arbitrary quadrilaterals to obtain the final detection result.
Preferably, in step S1, feature extraction on the three feature layers of the target remote sensing image is computed with a CSP-Darknet53 network;
specifically:
when extracting features from a deep feature map, the feature map of the base layer is copied;
when extracting features from feature maps of different scales, the upper-layer and lower-layer information are combined by up-sampling and down-sampling, respectively.
Preferably, the regression calculation of S2 comprises: rotation regression realized by regressing the center coordinates and the width and height, and adding four offsets.
Preferably, when rotation regression is realized by adding the four offsets, the loss function loss of the target position detection part is expressed as the sum of four regression losses:

loss = lbox + la + lcls + lobj

The four parts are: the regression loss lbox of the horizontal bounding box, the regression loss la of the normalized tilt offsets, the classification loss lcls, and the confidence loss lobj.

The regression loss lbox of the horizontal bounding box is:

$$ l_{box} = \lambda_{box} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 + (w_i - \hat{w}_i)^2 + (h_i - \hat{h}_i)^2 \right] $$

where $\{x_i, y_i, w_i, h_i\}$ denotes the predicted values for each candidate region of the target's bounding contour, and $\{\hat{x}_i, \hat{y}_i, \hat{w}_i, \hat{h}_i\}$ the ground-truth values in the bounding-contour label; $\mathbb{1}_{ij}^{obj}$ indicates whether an object is present at position (i, j), 1 meaning present and 0 absent; $\lambda_{box}$ is a user-defined horizontal regression loss coefficient, $\lambda_{box} \in (0, 1]$; $S^2$ denotes the grid cells in the region of side length $S$; and $B$ denotes the bounding boxes on each grid cell.

The regression loss la of the normalized tilt offsets is:

$$ l_{a} = \lambda_{\alpha} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} \sum_{k=1}^{4} (\alpha_{ik} - \hat{\alpha}_{ik})^2 $$

where $\alpha_{ik}$ is the predicted tilt of the target and $\hat{\alpha}_{ik}$ the ground-truth tilt; $\lambda_{\alpha}$ is a user-defined rotation-offset regression loss coefficient, $\lambda_{\alpha} \in (0, 1]$; and $k$ indexes the $k$-th rotation offset.

The classification loss lcls is:

$$ l_{cls} = \lambda_{class} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} \sum_{c \in classes} (p_i(c) - \hat{p}_i(c))^2 $$

where $p_i(c)$ is the predicted probability of class $c$ and $\hat{p}_i(c)$ the true probability of class $c$; $\lambda_{class}$ is a user-defined classification loss coefficient, $\lambda_{class} \in (0, 1]$.

The confidence loss lobj is:

$$ l_{obj} = \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} (c_i - \hat{c}_i)^2 + \lambda_{noobj} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{noobj} (c_i - \hat{c}_i)^2 $$

where $c_i$ is the predicted probability that a target is present at position $i$ and $\hat{c}_i$ the true probability, 1 meaning a target is present and 0 absent; $\lambda_{noobj}$ is a user-defined confidence loss coefficient, $\lambda_{noobj} \in (0, 1]$; and $\mathbb{1}_{ij}^{noobj}$ indicates whether no object is present at position (i, j), 1 meaning absent and 0 present.

The classification result is obtained from the classification loss lcls, and the confidence score from the confidence loss lobj.
Preferably, the regression loss lbox of the horizontal bounding box is used to localize the center position and bounding contour of the target;
the regression loss la of the normalized tilt offsets is used to represent the degree of tilt of the target;
the classification loss lcls is used to train the classification capability;
the confidence loss lobj is used to distinguish whether a candidate region contains a target object.
Preferably, the specific method for calculating the intersection-over-union between the candidate boxes of each category in S3 is as follows, the target boxes being quadrilaterals:

S3-1, for any two quadrilaterals $R_i$ and $R_j$, establish an empty point set PSet;
S3-2, add the intersection points of all edges of $R_i$ and $R_j$ to PSet;
S3-3, add all vertices of quadrilateral $R_i$ that lie inside quadrilateral $R_j$ to PSet;
S3-4, add all vertices of quadrilateral $R_j$ that lie inside quadrilateral $R_i$ to PSet;
S3-5, sort all points in PSet counterclockwise, then compute the area of the overlapping region of the two quadrilaterals by triangle subdivision.

The area of triangle $\Delta IJK$ is expressed as:

$$ S_{\Delta IJK} = \frac{1}{2} \left| \overrightarrow{IJ} \times \overrightarrow{IK} \right| $$

where $\overrightarrow{IJ}$ is the vector from I to J, and $\overrightarrow{IK}$ the vector from I to K.

The area of the polygon IJKLMNOP is expressed as:

$$ S_{IJKLMNOP} = S_{\Delta IJK} + S_{\Delta IKL} + S_{\Delta ILM} + S_{\Delta IMN} + S_{\Delta INO} + S_{\Delta IOP} $$

where $S_{\Delta IKL}$, $S_{\Delta ILM}$, $S_{\Delta IMN}$, $S_{\Delta INO}$ and $S_{\Delta IOP}$ are the areas of triangles $\Delta IKL$, $\Delta ILM$, $\Delta IMN$, $\Delta INO$ and $\Delta IOP$, respectively.

S3-6, obtain the intersection-over-union IoU[i, j]:

$$ IoU[i, j] = \frac{Area(i)}{Area(R_i) + Area(R_j) - Area(i)} $$

where $Area(R_i)$ is the area of rotated box $i$, $Area(R_j)$ the area of rotated box $j$, and $Area(i)$ the overlapping area of rotated boxes $i$ and $j$.

S3-7, sort all candidate boxes by confidence score; when the intersection-over-union IoU[i, j] of two boxes exceeds 0.5, keep only the candidate box with the higher confidence score.
The invention has the following advantages: to solve the above problems, the invention provides a single-stage arbitrary-quadrilateral regression-box method for detecting large-aspect-ratio targets in remote sensing images, which can regress an arbitrary quadrilateral. For a large-scale optical remote sensing image, the pixel position of a target in the image is obtained; the position and category of a target object can be quickly obtained from a large background area and described by a quadrilateral fitted closely to the target contour, realizing fast detection of the target.
drawings
FIG. 1 is a flowchart of the single-stage arbitrary-quadrilateral regression-box method for detecting large-aspect-ratio targets in remote sensing images according to the present invention;
FIG. 2 is a schematic diagram of the present invention illustrating the addition of four offsets to achieve regression rotation;
FIG. 3 is a first overlapping case of two quadrilaterals, for example for a ship;
FIG. 4 is a second overlapping case of two quadrilaterals, for example for a ship;
FIG. 5 is a third overlapping case of two quadrilaterals, for example for a ship;
fig. 6 is a fourth case of overlapping two quadrilaterals, taking a ship as an example.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict.
The invention is further described with reference to the following drawings and specific examples, which are not intended to be limiting.
The first embodiment: this embodiment is described below with reference to FIG. 1. The single-stage arbitrary-quadrilateral regression-box method of this embodiment is based on a single-stage target detection framework and can regress an arbitrary quadrilateral;
the specific process comprises the following steps:
S1, extracting features from the three feature layers of the target remote sensing image using the feature pyramid network structure, and fusing the extracted features;
S2, performing regression calculation on the target position of the target remote sensing image using an arbitrary quadrilateral box, obtaining arbitrary-quadrilateral candidate boxes together with a classification result and a confidence score;
S3, merging the candidate boxes with high confidence scores across the three scales, restoring them to the original image size, calculating the intersection-over-union between the candidate boxes of each category, and then removing redundant candidate boxes with a non-maximum suppression algorithm adapted to arbitrary quadrilaterals to obtain the final detection result.
The second embodiment: this embodiment further explains the first embodiment. In step S1, feature extraction on the three feature layers of the target remote sensing image is computed with a CSP-Darknet53 network;
specifically:
when extracting features from a deep feature map, the feature map of the base layer is copied;
when extracting features from feature maps of different scales, the upper-layer and lower-layer information are combined by up-sampling and down-sampling, respectively.
In this embodiment, conventional target detection algorithms are generally divided into single-stage and two-stage algorithms. A two-stage algorithm first extracts regions of interest and then classifies and regresses each candidate region. Although this improves detection accuracy to a certain extent, it greatly increases the computational load of the network and is not suitable for detection over large-area remote sensing images. A single-stage target detection architecture needs only one pass of feature extraction and regression-classification computation, so it is much faster. In this embodiment, the CSP-Darknet53 network is used as the feature extraction algorithm to extract rich information features from the input target remote sensing image. Copying the feature map of the base layer during deep feature extraction alleviates the vanishing-gradient problem, increases network feature reuse, reduces the number of network parameters, and speeds up feature extraction. Meanwhile, feature maps of different scales are not used in isolation; the upper-layer and lower-layer information are combined by up-sampling and down-sampling, respectively, avoiding information loss.
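The cross-scale combination described above can be illustrated with a toy sketch in pure Python, with feature maps as nested lists. The 2x nearest-neighbour upsampling and element-wise addition used here are assumptions for illustration, not the patent's exact fusion operators:

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2-D feature map (list of lists)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in (0, 1)]  # duplicate each column
        out.append(wide)
        out.append(list(wide))                   # duplicate each row
    return out

def downsample2x(fmap):
    """Stride-2 subsampling: the down-sampling direction of the fusion."""
    return [row[::2] for row in fmap[::2]]

def fuse(deep, shallow):
    """Combine an upsampled deep map with a shallower map element-wise
    (addition is an assumed fusion operator for this sketch)."""
    up = upsample2x(deep)
    return [[a + b for a, b in zip(r_up, r_sh)] for r_up, r_sh in zip(up, shallow)]
```

In a three-scale pyramid, each level would be fused with its neighbours this way before the regression head runs on each scale.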
The third embodiment: this embodiment further explains the first or second embodiment. The regression calculation in S2 comprises: rotation regression realized by regressing the center coordinates and the width and height, and adding four offsets.
In this embodiment, since a remote sensing image differs from a conventional image in that objects are all seen from an overhead view, a target with a large aspect ratio, such as a ship, cannot be accurately localized with a horizontal box. The regression calculation of the tilted target position in S2 can therefore be realized with an arbitrary quadrilateral box. In the regression part, in addition to the usual center coordinates and width and height, four extra offsets are added to realize rotation regression. Because the angle itself is not used as a regression parameter, the periodicity problem of angle regression is avoided; describing the rotation through these four parameters greatly improves the robustness of the algorithm and reduces the influence of fluctuation of a single parameter on the final result. The algorithm describes the rotation by four offsets, but the loss function does not regress the four offsets directly; instead, it regresses the ratios of the four offsets to the width and height.
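The four-offset idea can be made concrete with a small decoding sketch. The exact offset convention is not spelled out above, so the gliding-vertex-style decoding below, in which each normalized offset alpha_k slides one vertex along a side of the horizontal box, is an assumption for illustration only:

```python
def decode_quad(cx, cy, w, h, alphas):
    """Decode center (cx, cy), width/height (w, h) and four normalized
    offsets into quadrilateral vertices.

    Assumed convention (not stated in the patent): alpha_k in [0, 1]
    slides the k-th vertex along one side of the horizontal box, so
    all-zero offsets reproduce the axis-aligned box itself.
    """
    x1, y1 = cx - w / 2, cy - h / 2   # top-left of the horizontal box
    x2, y2 = cx + w / 2, cy + h / 2   # bottom-right
    a1, a2, a3, a4 = alphas
    return [
        (x1 + a1 * w, y1),  # vertex gliding along the top edge
        (x2, y1 + a2 * h),  # right edge
        (x2 - a3 * w, y2),  # bottom edge
        (x1, y2 - a4 * h),  # left edge
    ]
```

Note that the offsets are dimensionless ratios of the box width and height, matching the statement that the loss regresses normalized offsets rather than raw distances or angles.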
The fourth embodiment: this embodiment is described below with reference to FIG. 2 and further explains the third embodiment. When rotation regression is realized by adding the four offsets, the loss function loss of the target position detection part is expressed as the sum of four regression losses:

loss = lbox + la + lcls + lobj

The four parts are: the regression loss lbox of the horizontal bounding box, the regression loss la of the normalized tilt offsets, the classification loss lcls, and the confidence loss lobj.

The regression loss lbox of the horizontal bounding box is:

$$ l_{box} = \lambda_{box} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 + (w_i - \hat{w}_i)^2 + (h_i - \hat{h}_i)^2 \right] $$

where $\{x_i, y_i, w_i, h_i\}$ denotes the predicted values for each candidate region of the target's bounding contour, and $\{\hat{x}_i, \hat{y}_i, \hat{w}_i, \hat{h}_i\}$ the ground-truth values in the bounding-contour label; $\mathbb{1}_{ij}^{obj}$ indicates whether an object is present at position (i, j), 1 meaning present and 0 absent; $\lambda_{box}$ is a user-defined horizontal regression loss coefficient, $\lambda_{box} \in (0, 1]$; $S^2$ denotes the grid cells in the region of side length $S$; and $B$ denotes the bounding boxes on each grid cell.

The regression loss la of the normalized tilt offsets is:

$$ l_{a} = \lambda_{\alpha} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} \sum_{k=1}^{4} (\alpha_{ik} - \hat{\alpha}_{ik})^2 $$

where $\alpha_{ik}$ is the predicted tilt of the target and $\hat{\alpha}_{ik}$ the ground-truth tilt; $\lambda_{\alpha}$ is a user-defined rotation-offset regression loss coefficient, $\lambda_{\alpha} \in (0, 1]$; and $k$ indexes the $k$-th rotation offset.

The classification loss lcls is:

$$ l_{cls} = \lambda_{class} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} \sum_{c \in classes} (p_i(c) - \hat{p}_i(c))^2 $$

where $p_i(c)$ is the predicted probability of class $c$ and $\hat{p}_i(c)$ the true probability of class $c$; $\lambda_{class}$ is a user-defined classification loss coefficient, $\lambda_{class} \in (0, 1]$.

The confidence loss lobj is:

$$ l_{obj} = \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} (c_i - \hat{c}_i)^2 + \lambda_{noobj} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{noobj} (c_i - \hat{c}_i)^2 $$

where $c_i$ is the predicted probability that a target is present at position $i$ and $\hat{c}_i$ the true probability, 1 meaning a target is present and 0 absent; $\lambda_{noobj}$ is a user-defined confidence loss coefficient, $\lambda_{noobj} \in (0, 1]$; and $\mathbb{1}_{ij}^{noobj}$ indicates whether no object is present at position (i, j), 1 meaning absent and 0 present.

The classification result is obtained from the classification loss lcls, and the confidence score from the confidence loss lobj.
In this embodiment, the higher the confidence score, the closer the region is to a target; the lower the score, the closer it is to background. The detector uses the same branch to train the classification and regression parameters simultaneously: on the one hand this speeds up network computation and reduces network parameters; on the other hand the classification and regression parameters promote each other during training, accelerating network convergence.
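A minimal numerical sketch of the four-part loss in pure Python follows. The sum-of-squared-errors form and the per-cell dictionary layout are assumptions for illustration; a real detector would compute this over tensors:

```python
def detection_loss(preds, targets,
                   l_box=0.5, l_alpha=0.5, l_cls=0.5, l_noobj=0.5):
    """loss = lbox + la + lcls + lobj over grid cells.

    Each cell is a dict with keys: 'box' (x, y, w, h), 'alpha' (four
    normalized tilt offsets), 'cls' (class probabilities), 'conf'
    (objectness). Targets additionally carry 'obj', the indicator
    1^{obj} (1 if a target lies in the cell).
    """
    lbox = la = lcls = lobj = 0.0
    for p, t in zip(preds, targets):
        if t["obj"]:
            lbox += l_box * sum((pv - tv) ** 2 for pv, tv in zip(p["box"], t["box"]))
            la += l_alpha * sum((pv - tv) ** 2 for pv, tv in zip(p["alpha"], t["alpha"]))
            lcls += l_cls * sum((pv - tv) ** 2 for pv, tv in zip(p["cls"], t["cls"]))
            lobj += (p["conf"] - t["conf"]) ** 2
        else:
            # 1^{noobj}: only the down-weighted confidence term applies
            lobj += l_noobj * (p["conf"] - t["conf"]) ** 2
    return lbox + la + lcls + lobj
```

A perfect prediction gives zero loss; any deviation in box, offsets, class, or confidence increases the total, with the lambda coefficients weighting the four parts as in the formulas above.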
The fifth embodiment: this embodiment further explains the fourth embodiment. The regression loss lbox of the horizontal bounding box is used to localize the center position and bounding contour of the target;
the regression loss la of the normalized tilt offsets is used to represent the degree of tilt of the target;
the classification loss lcls is used to train the classification capability;
the confidence loss lobj is used to distinguish whether a candidate region contains a target object.
The sixth embodiment: this embodiment is described below with reference to FIG. 3 to FIG. 6 and further explains the fifth embodiment. The specific method for calculating the intersection-over-union between the candidate boxes of each category in S3 is as follows, the target boxes being quadrilaterals:

S3-1, for any two quadrilaterals $R_i$ and $R_j$, establish an empty point set PSet;
S3-2, add the intersection points of all edges of $R_i$ and $R_j$ to PSet;
S3-3, add all vertices of quadrilateral $R_i$ that lie inside quadrilateral $R_j$ to PSet;
S3-4, add all vertices of quadrilateral $R_j$ that lie inside quadrilateral $R_i$ to PSet;
S3-5, sort all points in PSet counterclockwise, then compute the area of the overlapping region of the two quadrilaterals by triangle subdivision.

The area of triangle $\Delta IJK$ is expressed as:

$$ S_{\Delta IJK} = \frac{1}{2} \left| \overrightarrow{IJ} \times \overrightarrow{IK} \right| $$

where $\overrightarrow{IJ}$ is the vector from I to J, and $\overrightarrow{IK}$ the vector from I to K.

The area of the polygon IJKLMNOP is expressed as:

$$ S_{IJKLMNOP} = S_{\Delta IJK} + S_{\Delta IKL} + S_{\Delta ILM} + S_{\Delta IMN} + S_{\Delta INO} + S_{\Delta IOP} $$

where $S_{\Delta IKL}$, $S_{\Delta ILM}$, $S_{\Delta IMN}$, $S_{\Delta INO}$ and $S_{\Delta IOP}$ are the areas of triangles $\Delta IKL$, $\Delta ILM$, $\Delta IMN$, $\Delta INO$ and $\Delta IOP$, respectively.

S3-6, obtain the intersection-over-union IoU[i, j]:

$$ IoU[i, j] = \frac{Area(i)}{Area(R_i) + Area(R_j) - Area(i)} $$

where $Area(R_i)$ is the area of rotated box $i$, $Area(R_j)$ the area of rotated box $j$, and $Area(i)$ the overlapping area of rotated boxes $i$ and $j$.

S3-7, sort all candidate boxes by confidence score; when the intersection-over-union IoU[i, j] of two boxes exceeds 0.5, keep only the candidate box with the higher confidence score.
In this embodiment, S3 proposes a non-maximum suppression algorithm for arbitrary quadrilaterals. The regression box in conventional target detection is generally a horizontal rectangle, so the intersection-over-union of two horizontal rectangles is easy to obtain during non-maximum suppression. However, this algorithm uses a quadrilateral regression box in an arbitrary direction to describe the position and size of the target, so the invention adopts a more complex intersection-over-union calculation for arbitrary shapes. The regression boxes of ships, for example, are all quadrilaterals; taking these as the example, as shown in FIG. 3 to FIG. 6, the overlap of two quadrilaterals can be roughly divided into four cases: FIG. 3 shows the simplest case, in which the overlapping region is a triangle; in FIG. 4 the overlapping region is a quadrilateral; in FIG. 5 a hexagon; and in FIG. 6 an octagon. Odd-numbered cases occur when a vertex of one quadrilateral falls on an edge of the other.
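The point-set procedure S3-1 to S3-7 can be sketched in pure Python. This is an illustrative implementation under the assumption that candidate boxes are convex quadrilaterals; the shoelace formula used for the area is equivalent to the triangle-subdivision sum described above:

```python
import math

def _seg_intersect(p1, p2, p3, p4):
    """Intersection point of segments p1p2 and p3p4, or None."""
    d = (p2[0] - p1[0]) * (p4[1] - p3[1]) - (p2[1] - p1[1]) * (p4[0] - p3[0])
    if abs(d) < 1e-12:
        return None  # parallel or collinear
    t = ((p3[0] - p1[0]) * (p4[1] - p3[1]) - (p3[1] - p1[1]) * (p4[0] - p3[0])) / d
    u = ((p3[0] - p1[0]) * (p2[1] - p1[1]) - (p3[1] - p1[1]) * (p2[0] - p1[0])) / d
    if 0.0 <= t <= 1.0 and 0.0 <= u <= 1.0:
        return (p1[0] + t * (p2[0] - p1[0]), p1[1] + t * (p2[1] - p1[1]))
    return None

def _inside(pt, quad):
    """True if pt is inside (or on the boundary of) a convex quadrilateral."""
    sign = 0
    for i in range(4):
        a, b = quad[i], quad[(i + 1) % 4]
        cr = (b[0] - a[0]) * (pt[1] - a[1]) - (b[1] - a[1]) * (pt[0] - a[0])
        if abs(cr) < 1e-12:
            continue                # on the edge line: does not decide
        s = 1 if cr > 0 else -1
        if sign == 0:
            sign = s
        elif s != sign:
            return False            # cross products change sign: outside
    return True

def _shoelace(pts):
    """Polygon area; equivalent to the triangle-subdivision sum S3-5."""
    area = 0.0
    for i in range(len(pts)):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % len(pts)]
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0

def quad_iou(qa, qb):
    """S3-1..S3-6: intersection-over-union of two convex quadrilaterals."""
    pset = []                                     # S3-1: empty point set
    for i in range(4):                            # S3-2: edge intersections
        for j in range(4):
            p = _seg_intersect(qa[i], qa[(i + 1) % 4], qb[j], qb[(j + 1) % 4])
            if p is not None:
                pset.append(p)
    pset += [p for p in qa if _inside(p, qb)]     # S3-3
    pset += [p for p in qb if _inside(p, qa)]     # S3-4
    uniq = []
    for p in pset:                                # drop near-duplicate points
        if not any(abs(p[0] - q[0]) < 1e-9 and abs(p[1] - q[1]) < 1e-9 for q in uniq):
            uniq.append(p)
    if len(uniq) < 3:
        return 0.0
    cx = sum(p[0] for p in uniq) / len(uniq)      # S3-5: sort counterclockwise
    cy = sum(p[1] for p in uniq) / len(uniq)
    uniq.sort(key=lambda p: math.atan2(p[1] - cy, p[0] - cx))
    inter = _shoelace(uniq)
    return inter / (_shoelace(qa) + _shoelace(qb) - inter)  # S3-6

def nms_quads(boxes, scores, thresh=0.5):
    """S3-7: keep the higher-scoring box when IoU exceeds thresh."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(quad_iou(boxes[i], boxes[k]) <= thresh for k in keep):
            keep.append(i)
    return keep
```

For two axis-aligned unit squares this reduces to ordinary rectangle IoU, which makes the sketch easy to sanity-check by hand.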
Although the invention herein has been described with reference to particular embodiments, it is to be understood that these embodiments are merely illustrative of the principles and applications of the present invention. It is therefore to be understood that numerous modifications may be made to the illustrative embodiments and that other arrangements may be devised without departing from the spirit and scope of the present invention as defined by the appended claims. It should be understood that features described in different dependent claims and herein may be combined in ways different from those described in the original claims. It is also to be understood that features described in connection with individual embodiments may be used in other described embodiments.

Claims (3)

1. A single-stage arbitrary-quadrilateral regression-box method for detecting large-aspect-ratio targets in remote sensing images, characterized in that
the detection method is based on a single-stage target detection framework and can regress an arbitrary quadrilateral;
the specific process comprises the following steps:
S1, extracting features from the three feature layers of the target remote sensing image using the feature pyramid network structure, and fusing the extracted features;
S2, performing regression calculation on the target position of the target remote sensing image using an arbitrary quadrilateral box, obtaining arbitrary-quadrilateral candidate boxes together with a classification result and a confidence score;
S3, merging the candidate boxes with high confidence scores across the three scales, restoring them to the original image size, calculating the intersection-over-union between the candidate boxes of each category, and then removing redundant candidate boxes with a non-maximum suppression algorithm adapted to arbitrary quadrilaterals to obtain the final detection result;
s2 the regression calculation includes: regression rotation is realized by regression center coordinates, regression width and height and adding four offsets;
when the four offsets are added to realize regression rotation, the loss function loss of the target position detection part is represented by the regression loss of the four parts:
loss=lbox+la+lcls+lobj
the regression losses of the four parts are respectively: regression loss lbox of horizontal bounding box, regression loss la of normalized tilt offset, loss lcls of classification and loss lobj of confidence;
wherein, the regression loss lbox of the horizontal circumscribed frame is:
Figure FDA0003540799300000011
wherein, { xi,yi,wi,hiDenotes the predicted value of each candidate region of the target bounding contour,
Figure FDA0003540799300000012
representing the true value in the target circumscribing outline tag,
Figure FDA0003540799300000013
indicates whether there is an object at the (i, j) position, 1 indicates present, and 0 indicates absent; lambda [ alpha ]boxRepresents a custom horizontal regression loss coefficient, λbox∈(0,1];S2Representing each lattice point in the area with the side length S; b represents each bounding box on the grid point;
the regression loss la for the normalized tilt offset is:
Figure FDA0003540799300000014
wherein alpha isikA predicted value representing the inclination of the target,
Figure FDA0003540799300000015
a true value representing the tilt of the target; lambda [ alpha ]αRepresents a custom rotation offset regression loss coefficient, λα∈(0,1](ii) a k represents a k-th rotational offset amount;
the classified losses lcls are:
Figure FDA0003540799300000021
pi(c) representing the probability of prediction as class c;
Figure FDA0003540799300000022
representing the true probability of class c; lambda [ alpha ]classDenotes a custom classification loss factor, λclass∈(0,1];
The loss of confidence lobj is:
Figure FDA0003540799300000023
ciindicating the probability of predicting the target at the i position,
Figure FDA0003540799300000024
the probability that a target is really located at the position i is shown, 1 shows that the target is located, and 0 shows that the target is not located; lambda [ alpha ]noobjRepresenting a custom confidence loss coefficient, λnoobj∈(0,1];
Figure FDA0003540799300000025
Indicating whether there is no object at the (i, j) position, 1 indicating no, 0 indicating present;
the classification result is obtained through the classification loss l_cls, and the confidence score is obtained through the confidence loss l_obj;
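The four loss terms share a YOLO-style sum-of-squares structure: every term is masked by whether a grid cell/box is responsible for an object, and the no-object confidence term is down-weighted. A minimal NumPy sketch of that structure (the function name, dict layout, and coefficient names are illustrative, not the patent's exact implementation):

```python
import numpy as np

def detection_loss(pred, true, obj_mask, lam):
    """Four-part sum-of-squares detection loss over an S*S grid with B boxes
    per cell. pred/true: dicts of arrays; obj_mask: shape (S*S, B) in {0, 1}.
    A sketch of the claim's structure, not the patent's exact code."""
    noobj_mask = 1.0 - obj_mask
    # l_box: horizontal bounding-box regression over (x, y, w, h)
    l_box = lam["box"] * np.sum(obj_mask[..., None] * (pred["box"] - true["box"]) ** 2)
    # l_alpha: normalized tilt-offset regression over the k offsets
    l_alpha = lam["alpha"] * np.sum(obj_mask[..., None] * (pred["alpha"] - true["alpha"]) ** 2)
    # l_cls: per-class probability regression on object cells only
    l_cls = lam["cls"] * np.sum(obj_mask[..., None] * (pred["cls"] - true["cls"]) ** 2)
    # l_obj: confidence, with the no-object term down-weighted by lambda_noobj
    conf_err = (pred["conf"] - true["conf"]) ** 2
    l_obj = np.sum(obj_mask * conf_err) + lam["noobj"] * np.sum(noobj_mask * conf_err)
    return l_box + l_alpha + l_cls + l_obj
```

The masking is what makes the loss single-stage: regression, tilt, and class terms are only charged where a ground-truth object falls, while confidence is supervised everywhere.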
S3, the specific method for calculating the intersection-over-union between the candidate boxes of each category comprises:
representing each target in the remote sensing image as a quadrilateral;
S3-1, for any two quadrilaterals R_i and R_j, establishing an empty point set PSet;
S3-2, adding the intersection points of all edges of the two quadrilaterals R_i and R_j into PSet;
S3-3, adding all vertices of quadrilateral R_i that lie inside quadrilateral R_j into PSet;
S3-4, adding all vertices of quadrilateral R_j that lie inside quadrilateral R_i into PSet;
S3-5, sorting all points in PSet counterclockwise, and calculating the area of the overlapping region of the two quadrilaterals using a triangulation algorithm:
the area of triangle ΔIJK is expressed as:

S_ΔIJK = (1/2) |vec(IJ) × vec(IK)|

wherein vec(IJ) denotes the vector from I to J, and vec(IK) denotes the vector from I to K;
the area of the polygon Area(IJKLMNOP) is expressed as:

S_Area(IJKLMNOP) = S_ΔIJK + S_ΔIKL + S_ΔILM + S_ΔIMN + S_ΔINO + S_ΔIOP

wherein S_ΔIKL, S_ΔILM, S_ΔIMN, S_ΔINO and S_ΔIOP denote the areas of triangles ΔIKL, ΔILM, ΔIMN, ΔINO and ΔIOP, respectively;
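Step S3-5 and the area formulas above amount to ordering the collected points counterclockwise and summing a triangle fan from one vertex. A minimal sketch (function names are illustrative):

```python
import math

def cross(o, a, b):
    # z-component of (a - o) x (b - o): twice the signed triangle area
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def sort_ccw(points):
    # order the point set counterclockwise around its centroid (step S3-5)
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    return sorted(points, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))

def polygon_area(points):
    # fan triangulation from the first vertex: sum of |vec(IJ) x vec(IK)| / 2
    pts = sort_ccw(points)
    area = 0.0
    for j in range(1, len(pts) - 1):
        area += abs(cross(pts[0], pts[j], pts[j + 1])) / 2.0
    return area
```

The fan decomposition is valid here because the intersection of two convex quadrilaterals is itself convex, so every fan triangle lies inside the polygon.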
S3-6, obtaining the intersection-over-union IoU[i, j]:

IoU[i, j] = Area(I) / (Area(R_i) + Area(R_j) − Area(I))

wherein Area(R_i) denotes the area of rotated box i; Area(R_j) denotes the area of rotated box j; Area(I) denotes the overlapping area of rotated box i and rotated box j;
and S3-7, sorting all candidate boxes by confidence score, and when the intersection-over-union IoU[i, j] of two boxes is greater than 0.5, retaining the candidate box with the higher confidence score.
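Steps S3-6 and S3-7 together form a greedy non-maximum suppression over the quadrilateral boxes. A sketch of that loop, with the overlap computation of steps S3-1 to S3-5 abstracted behind an `overlap_area` callback (names and the greedy formulation are illustrative):

```python
def rotated_nms(boxes, scores, areas, overlap_area, thresh=0.5):
    """Greedy NMS over rotated/quadrilateral boxes (steps S3-6, S3-7).
    boxes: list of quadrilaterals; overlap_area(i, j) -> intersection area.
    A sketch under assumed names, not the patent's exact code."""
    # sort candidate indices by descending confidence score
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        i = order.pop(0)          # highest-scoring remaining box survives
        keep.append(i)
        remaining = []
        for j in order:
            inter = overlap_area(i, j)
            iou = inter / (areas[i] + areas[j] - inter)  # step S3-6
            if iou <= thresh:     # suppress only when IoU > 0.5
                remaining.append(j)
        order = remaining
    return keep
```

Because the surviving box is always the higher-scoring one, only the lower-confidence member of any pair with IoU above the threshold is discarded, matching step S3-7.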
2. The method for detecting a large-aspect-ratio target remote sensing image with a single-stage arbitrary-quadrilateral regression box according to claim 1, wherein in step S1 the feature extraction on the three feature layers of the target remote sensing image is performed through a CSP-Darknet53 network;
the method specifically comprises:
copying the feature map of the base layer when performing feature extraction on the deep feature map;
and, when performing feature extraction on feature maps of different scales, combining the upper-layer and lower-layer information through up-sampling and down-sampling, respectively.
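The base-layer copy in claim 2 is the CSP (Cross Stage Partial) idea: part of the feature map bypasses the heavy computation and is re-merged by concatenation. A minimal NumPy sketch of that split-and-merge, where `transform` stands in for the real convolution stage (a sketch of the idea, not CSP-Darknet53 itself):

```python
import numpy as np

def csp_block(x, transform):
    """CSP-style partial feature copy: split the channel axis, transform one
    half, and concatenate it with the untouched copy of the other half."""
    half = x.shape[0] // 2
    copied, processed = x[:half], x[half:]
    return np.concatenate([copied, transform(processed)], axis=0)
```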
3. The method for detecting a large-aspect-ratio target remote sensing image with a single-stage arbitrary-quadrilateral regression box according to claim 1, wherein the regression loss l_box of the horizontal bounding box is used for locating the center position and the bounding contour of the target;
the regression loss l_α of the normalized tilt offset is used for representing the degree of tilt of the target;
the classification loss l_cls is used for training the classification capability;
and the confidence loss l_obj is used for distinguishing whether a candidate region contains a target object.
CN202110545880.0A 2021-05-19 2021-05-19 Method for detecting target remote sensing image with single-stage arbitrary quadrilateral regression frame large length-width ratio Active CN113221775B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110545880.0A CN113221775B (en) 2021-05-19 2021-05-19 Method for detecting target remote sensing image with single-stage arbitrary quadrilateral regression frame large length-width ratio


Publications (2)

Publication Number Publication Date
CN113221775A CN113221775A (en) 2021-08-06
CN113221775B true CN113221775B (en) 2022-04-26

Family

ID=77093115

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110545880.0A Active CN113221775B (en) 2021-05-19 2021-05-19 Method for detecting target remote sensing image with single-stage arbitrary quadrilateral regression frame large length-width ratio

Country Status (1)

Country Link
CN (1) CN113221775B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113449702B (en) * 2021-08-31 2021-12-03 天津联图科技有限公司 Target detection method and device for remote sensing image, storage medium and electronic equipment
CN114972710B (en) * 2022-07-27 2022-10-28 深圳爱莫科技有限公司 Method and system for realizing multi-shape target detection in image
CN116030120B (en) * 2022-09-09 2023-11-24 北京市计算中心有限公司 Method for identifying and correcting hexagons

Citations (6)

Publication number Priority date Publication date Assignee Title
CN109492561A (en) * 2018-10-29 2019-03-19 北京遥感设备研究所 A kind of remote sensing image Ship Detection based on improved YOLO V2 model
CN110189304A (en) * 2019-05-07 2019-08-30 南京理工大学 Remote sensing image target on-line quick detection method based on artificial intelligence
KR102030628B1 (en) * 2019-04-04 2019-10-10 (주)아이엠시티 Recognizing method and system of vehicle license plate based convolutional neural network
CN111723748A (en) * 2020-06-22 2020-09-29 电子科技大学 Infrared remote sensing image ship detection method
CN111738114A (en) * 2020-06-10 2020-10-02 杭州电子科技大学 Vehicle target detection method based on anchor-free accurate sampling remote sensing image
CN112102241A (en) * 2020-08-11 2020-12-18 中山大学 Single-stage remote sensing image target detection algorithm


Non-Patent Citations (2)

Title
Quantitative Analysis of Metallographic Image Using Attention-Aware Deep Neural Networks; Xu Yifei et al.; Sensors; 2020-12-23; Vol. 21, No. 43; 1-22 *
A Survey of Single-Stage Vehicle Detection Algorithms Based on Deep Learning; Zhao Qihui et al.; Journal of Computer Applications; 2021-01-26; Vol. 40, No. 02; 30-36 *

Also Published As

Publication number Publication date
CN113221775A (en) 2021-08-06

Similar Documents

Publication Publication Date Title
CN113221775B (en) Method for detecting target remote sensing image with single-stage arbitrary quadrilateral regression frame large length-width ratio
TWI677826B (en) License plate recognition system and method
CN112084869B (en) Compact quadrilateral representation-based building target detection method
CN109816708B (en) Building texture extraction method based on oblique aerial image
CN110097584B (en) Image registration method combining target detection and semantic segmentation
CN111553347B (en) Scene text detection method oriented to any angle
CN111369495B (en) Panoramic image change detection method based on video
CN109145747A (en) A kind of water surface panoramic picture semantic segmentation method
CN113326763B (en) Remote sensing target detection method based on boundary frame consistency
CN111814827A (en) Key point target detection method based on YOLO
CN112766184A (en) Remote sensing target detection method based on multi-level feature selection convolutional neural network
CN113033315A (en) Rare earth mining high-resolution image identification and positioning method
CN111126381A (en) Insulator inclined positioning and identifying method based on R-DFPN algorithm
CN111027538A (en) Container detection method based on instance segmentation model
Fond et al. Facade proposals for urban augmented reality
CN114565842A (en) Unmanned aerial vehicle real-time target detection method and system based on Nvidia Jetson embedded hardware
CN112560852A (en) Single-stage target detection method with rotation adaptive capacity based on YOLOv3 network
Zhao et al. Boundary regularized building footprint extraction from satellite images using deep neural network
CN113284185B (en) Rotating target detection method for remote sensing target detection
CN110636248B (en) Target tracking method and device
CN114387346A (en) Image recognition and prediction model processing method, three-dimensional modeling method and device
Zhang et al. Alignment of 3d building models with satellite images using extended chamfer matching
Gerhardt et al. Neural network-based traffic sign recognition in 360° images for semi-automatic road maintenance inventory
KR100946707B1 (en) Method, system and computer-readable recording medium for image matching of panoramic images
Liu et al. Polar ray: A single-stage angle-free detector for oriented object detection in aerial images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant