CN111914795B - Method for detecting rotating target in aerial image - Google Patents

Method for detecting rotating target in aerial image

Info

Publication number
CN111914795B
Authority
CN
China
Prior art keywords
target
loss
mask
candidate region
image
Prior art date
Legal status
Active
Application number
CN202010823765.0A
Other languages
Chinese (zh)
Other versions
CN111914795A (en)
Inventor
刘怡光
唐天航
朱先震
Current Assignee
Sichuan University
Original Assignee
Sichuan University
Priority date
Filing date
Publication date
Application filed by Sichuan University filed Critical Sichuan University
Priority to CN202010823765.0A priority Critical patent/CN111914795B/en
Publication of CN111914795A publication Critical patent/CN111914795A/en
Application granted granted Critical
Publication of CN111914795B publication Critical patent/CN111914795B/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/13Satellite images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Abstract

The invention adopts a deep learning method to design a target detection model that detects targets such as vehicles, ships and airplanes in high-altitude aerial images while predicting the location of each target's rotated bounding box. First, an image feature extraction network is designed to acquire high-dimensional features of the input aerial image, and a feature pyramid is constructed using an FPN (Feature Pyramid Network) architecture to extract target features at different resolutions. Then, a clustering method generates the base anchor sizes of the candidate region extraction network, adjusting the anchors according to the size distribution of targets in the training images and thereby improving training efficiency. A feature denoising detector combined with an attention mechanism is designed to denoise the target features of the candidate regions. Finally, a rotation angle error optimization method assigns corresponding weight factors to target boxes of different aspect ratios, optimizing the localization of targets with large aspect ratios and realizing rotated-box prediction for various targets in aerial images.

Description

Method for detecting rotating target in aerial image
Technical Field
The invention relates to an aerial image target detection algorithm, and in particular to a rotated-box localization prediction method for rotating target detection.
Background
Target detection is a challenging computer vision task with application prospects in many fields, including face recognition, search and rescue, and intelligent transportation. Traditional target detection methods rely on manually designed features of the targets to be detected; this process is laborious, and because such features are hard to extract and unstable, the resulting detectors are inefficient and lack robustness. With the recent introduction and application of deep learning methods, the field of target detection has reached several milestones, and both detection accuracy and detection speed have improved greatly. Deep-learning-based target detection methods fall mainly into single-step and two-step detection. Single-step algorithms are fast but sacrifice some accuracy, making high-precision detection requirements difficult to meet; classical single-step models include the YOLO series and SSD, while two-step detection is represented by Fast RCNN. Although the two families differ clearly in model architecture, detector design and training optimization, as the main algorithms of target detection they share the same overall pipeline: for an input image, a basic feature extraction network first processes the low-dimensional pixel information to construct high-dimensional feature information, and a detector then predicts the target center point and bounding box size from these high-order features.
Small target detection and rotating target detection are further important computer vision tasks beyond the classical detection task. Small targets occupy few pixels and a small fraction of the image, and are easily ignored during the feature extraction of a convolutional neural network, so they are difficult to detect. In recent years, many algorithms designed for small targets combine low-dimensional with high-dimensional features for prediction, preventing small-target features from being lost as the convolution depth increases and degrading the final prediction result.
Aerial images contain many target aggregation areas such as parking lots, harbors and airports. In highly aggregated areas, using a traditional horizontal box causes non-maximum suppression to discard a large number of target boxes, so many targets are missing from the detection result. Detecting targets with rotated boxes effectively avoids this problem while also realizing more accurate localization prediction.
Disclosure of Invention
The invention provides a multi-scale-clustering-based target feature denoising and angle error optimization method combined with an attention mechanism for small target detection and rotating target localization prediction in aerial images, realizing accurate localization prediction of rotating targets.
The method adopts the residual network ResNet as the basic feature extraction framework to extract high-dimensional feature information from the input image, and designs a feature pyramid structure to fuse the high-dimensional and low-dimensional features. Multi-scale clustering is then used to set the anchor parameters of the candidate region proposal network (RPN), and anchors are allocated to each feature layer according to characteristics such as the receptive field of each pyramid resolution. Next, according to the candidate regions generated by the RPN, the corresponding feature maps are cropped from the matching feature layers, and the features of each candidate region are denoised by the proposed attention denoiser. The denoised target features are finally input to a fully connected layer for the final localization and classification prediction.
The aerial image rotating target detection method comprises the following steps:
Step one: data acquisition and labeling. Aerial images are collected with equipment or from network resources: high-resolution images are obtained by shooting from high altitude with an unmanned aerial vehicle or from sources such as Google Maps. After the images are acquired, the targets are labeled; unlike the traditional labeling with horizontal circumscribed rectangles, a rotated-box labeling mode is adopted. The specific implementation steps are as follows:
Step A: collect images with an unmanned aerial vehicle or from network resources to build a large amount of training image data;
Step B: label each target box with an annotation tool using the 4-point method, i.e. by the four vertices of a quadrilateral;
Step C: complete the rotated rectangular box label by taking the minimum circumscribed rectangle of the quadrilateral, producing the annotation file (a conversion sketch is given below).
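As an illustrative aid (not part of the original disclosure), the following minimal Python sketch shows how step C can be realized with OpenCV's minAreaRect, which returns the minimum circumscribed rotated rectangle of a point set; the example coordinates are hypothetical.

import cv2
import numpy as np

def quad_to_rotated_rect(quad):
    # quad: the four labeled vertices of one target, shape (4, 2).
    pts = np.asarray(quad, dtype=np.float32)
    (cx, cy), (w, h), angle = cv2.minAreaRect(pts)
    return cx, cy, w, h, angle  # center, box size, rotation angle

# Example: one annotated target given by its four labeled vertices
print(quad_to_rotated_rect([(120, 40), (300, 95), (280, 160), (100, 105)]))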
Step two: data preprocessing. Aerial images have very high resolution, and directly feeding the original image to the model, whether during training or actual testing, is impractical: it places a heavy burden on the equipment and greatly slows training. The original image is therefore cut into small sub-images, which are then input to the model for training and prediction. The specific implementation steps are as follows:
Step A: crop the image. According to the equipment capability and the deep learning model, the crop size is set to 800 x 800 pixels; since direct cropping may cut off targets at the crop edges, an overlap of 200 pixels is set between adjacent crops;
Step B: reconstruct the target label data, configuring corresponding labels for each cropped image; whether a label belongs to a cropped image is judged by whether the label center lies inside it;
Step C: construct the training data. Training data tensors are built uniformly from the cropped images and label data for convenient model input; during label processing, the labels expressed by the 4-point method are converted to the center point, box size and rotation angle representation (a tiling sketch is given below).
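A minimal tiling sketch for step two follows (illustrative, not from the original disclosure), assuming 800 x 800 crops with a 200-pixel overlap (stride 600) and labels already converted to (cx, cy, w, h, angle); the helper names are hypothetical.

import numpy as np

TILE, OVERLAP = 800, 200
STRIDE = TILE - OVERLAP  # 600

def tile_image(image, labels):
    # labels: (N, 5) array of (cx, cy, w, h, angle) in full-image coordinates.
    H, W = image.shape[:2]
    tiles = []
    for y0 in range(0, max(H - OVERLAP, 1), STRIDE):
        for x0 in range(0, max(W - OVERLAP, 1), STRIDE):
            patch = image[y0:y0 + TILE, x0:x0 + TILE]
            # Step B rule: a label belongs to the tile containing its center.
            kept = [(cx - x0, cy - y0, w, h, a)
                    for cx, cy, w, h, a in labels
                    if x0 <= cx < x0 + TILE and y0 <= cy < y0 + TILE]
            tiles.append((patch, np.asarray(kept)))
    return tiles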
Step three: model design. The deep learning detection model of the invention mainly comprises four core structures: a feature extraction network, a candidate region generation network, a feature denoising structure and a rotating target predictor. During detection and training, the input data passes through these four structures in turn to finally produce the prediction result. The specific implementation steps of the model are as follows:
Step A: adopt the residual network ResNet as the feature extractor to obtain high-dimensional information from the input image, then construct a feature pyramid in a top-down manner, fusing the high-dimensional features downward in sequence to generate multiple feature maps;
Step B: generate the anchor sizes of the candidate region generation network by clustering: first count the target sizes in the training data, set the number of cluster centers, and cluster with the K-means method to produce the corresponding number of cluster centers, whose coordinates serve as the anchor width and height parameters for configuring the candidate region generation network; from the feature maps, the network then produces several groups of candidate region classifications and anchor offset values;
Step C: according to the anchor offset values, crop the feature map from the corresponding level of the feature pyramid to generate region-of-interest features; from this result, construct a denoising map through several convolution layers, multiply the denoising map element-wise with each region-of-interest feature layer to obtain the denoised target features, and generate the corresponding attention loss function during training;
Step D: input the denoised target features to a fully connected layer to predict the target's classification information and localization information, where the classification information is the index of the target class and the localization information is the target center, size and rotation angle; during training, the rotation angle error weight is set according to the target aspect ratio to realize angle error optimization;
Step four: loss function design. The loss function mainly comprises three parts: the foreground/background classification error and anchor offset localization error of the candidate region generation network; the attention loss of the attention denoising; and the classification error and rotated-box localization error of the final prediction result.
Drawings
FIG. 1 is a diagram of the rotating target detection network architecture of the method of the present invention.
FIG. 2 shows sample aerial images from step one of the present invention.
FIG. 3 is a schematic diagram of the image cropping in step two of the present invention.
FIG. 4 is a structural diagram of the feature pyramid in step three of the present invention.
FIG. 5 is a diagram of the attention denoising detector in step three of the present invention.
FIG. 6 shows the model regression target design of step three of the present invention.
FIG. 7 shows the attention mechanism mask design of step three of the present invention.
Detailed Description
The details of the model design of the present invention are described with reference to fig. 1, and the steps of the embodiment are as follows:
Step one: image feature extraction. Process the low-dimensional image pixels and extract high-dimensional feature information (the feature pyramid structure is shown in FIG. 4). The specific implementation steps are as follows:
Step A: the residual network ResNet50d serves as the backbone network; the input image passes through 4 residual blocks, correspondingly generating 4 feature maps of different resolutions, {C2, C3, C4, C5};
Step B: fuse the generated feature maps from top to bottom: first generate the pyramid top-level feature P5 from C5 by convolution, then add P5 and C4 to obtain P4, and continue fusing downward in sequence to finally generate the feature pyramid {P2, P3, P4, P5} (a fusion sketch is given below);
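For illustration only (PyTorch, not part of the original disclosure), the following sketch implements the top-down fusion rule of equations (1)-(2) below: P5 = Conv(C5), PC_j = 0.5*P_{j+1} + 0.5*C_j, P_j = Conv(PC_j). The 1x1 lateral convolutions and the nearest-neighbor upsampling used to match channel counts and spatial sizes are assumptions; the patent only states the fusion rule.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TopDownFusion(nn.Module):
    def __init__(self, channels=(256, 512, 1024, 2048), out_ch=256):
        super().__init__()
        # 1x1 lateral convs bring C2..C5 to a common channel width (assumed).
        self.lateral = nn.ModuleList(nn.Conv2d(c, out_ch, 1) for c in channels)
        self.smooth = nn.ModuleList(nn.Conv2d(out_ch, out_ch, 3, padding=1)
                                    for _ in channels)

    def forward(self, c2, c3, c4, c5):
        feats = [lat(c) for lat, c in zip(self.lateral, (c2, c3, c4, c5))]
        p5 = self.smooth[3](feats[3])                      # eq. (1)
        pyramid = [p5]
        for j in (2, 1, 0):                                # C4, C3, C2
            up = F.interpolate(pyramid[0], size=feats[j].shape[-2:],
                               mode="nearest")
            pc = 0.5 * up + 0.5 * feats[j]                 # eq. (2)
            pyramid.insert(0, self.smooth[j](pc))
        return pyramid                                     # [P2, P3, P4, P5]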
Step two: generate the anchor sizes by a clustering method and allocate the anchors to each feature layer for prediction; the candidate region extraction network uses the generated feature pyramid to predict the offset of each target relative to its anchor. The specific implementation steps are as follows:
Step A: count the target size information in the training data and cluster it with the K-means method; the number of cluster centers is set to 35, finally generating 35 cluster centers corresponding to the sizes of 35 anchors, and each anchor is allocated to its corresponding feature layer;
Step B: perform convolution on the generated feature pyramid to produce the target's foreground/background score prediction and its offset relative to the anchor; the model outputs 2 values for the foreground/background scores, representing the foreground and background scores respectively, and 4 values for the anchor offset, representing the center offsets x, y and the size offsets w, h (a clustering sketch is given below);
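As a hedged sketch of step A (not from the original disclosure), the clustering can be realized with scikit-learn's KMeans over the (width, height) pairs collected from the training labels; the 35 centers follow the description, while the sort-by-area step for splitting anchors across pyramid levels is an assumption.

import numpy as np
from sklearn.cluster import KMeans

def cluster_anchor_sizes(target_sizes, n_anchors=35, seed=0):
    # target_sizes: (N, 2) array of (width, height) in pixels.
    km = KMeans(n_clusters=n_anchors, n_init=10, random_state=seed)
    km.fit(np.asarray(target_sizes, dtype=np.float64))
    # Each cluster center's coordinates become one anchor's (w, h).
    anchors = km.cluster_centers_
    # Sort by area so anchors can be split across pyramid levels.
    return anchors[np.argsort(anchors.prod(axis=1))]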
Step three: feature denoising combined with the attention mechanism. First crop the features according to the candidate regions, then denoise the target features and predict the final target classification and rotated-box localization. The specific implementation steps are as follows:
Step A: decode the anchor offsets of each candidate region against the corresponding anchor size to recover the real size;
Step B: compute the real size as a percentage of the input image size;
Step C: use this percentage to crop the feature map of the corresponding feature layer, obtaining the region-of-interest feature map;
Step D: input the feature map to the attention denoising generator to generate a feature denoising map of the same size as the feature map;
Step E: multiply the feature map element-wise with the feature denoising map to obtain the target feature map;
Step F: attach a fully connected layer to predict the target class and the rotated-box localization information (a forward-pass sketch is given below);
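A minimal PyTorch sketch of steps D-F follows (illustrative, not from the original disclosure): a small convolutional "attention denoising generator" produces a denoising map that is multiplied element-wise with the ROI features before the fully connected heads. The layer sizes, the sigmoid gating, and the head dimensions are assumptions.

import torch
import torch.nn as nn

class AttentionDenoiser(nn.Module):
    def __init__(self, ch=256, roi=7, n_classes=16):
        super().__init__()
        self.denoise = nn.Sequential(               # Step D
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.Sigmoid())
        self.fc = nn.Sequential(nn.Flatten(),
                                nn.Linear(ch * roi * roi, 1024), nn.ReLU())
        self.cls_head = nn.Linear(1024, n_classes)   # class scores
        self.box_head = nn.Linear(1024, 5)           # cx, cy, w, h, angle

    def forward(self, roi_feats):                    # (N, ch, roi, roi)
        denoise_map = self.denoise(roi_feats)
        clean = roi_feats * denoise_map              # Step E
        h = self.fc(clean)                           # Step F
        return self.cls_head(h), self.box_head(h), denoise_map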
Step four: loss function design. The loss function of the target detection model is designed for model training and convergence, with the following specific steps:
Step A: the candidate region generation network loss comprises 2 parts, namely the foreground/background classification loss and the anchor offset value loss;
Step B: the attention loss: a convergence mask target is constructed for each object in the manner shown in FIG. 7; meanwhile, in the attention denoising, an attention feature map is generated along with the feature denoising map, and the attention loss function is constructed from the attention feature map and the mask target;
Step C: the target classification and localization loss comprises the target class classification loss and the predicted rotated-box localization loss; when constructing the localization loss, a corresponding weight is set according to the target aspect ratio, so that the final prediction result is not excessively influenced by the supervision error of large-aspect-ratio targets (a weighting sketch is given below);
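A hedged sketch of the aspect-ratio-dependent angle weighting in step C follows. The patent does not give the weighting formula; the mapping below, which down-weights the angle error of very elongated targets so that it does not dominate the loss, is one plausible choice shown only to illustrate where the factor enters.

import torch
import torch.nn.functional as F

def angle_loss_weight(w, h, w_min=0.25):
    # Aspect ratio >= 1; weight decays with log aspect ratio (assumed form).
    aspect = torch.maximum(w, h) / torch.clamp(torch.minimum(w, h), min=1e-6)
    return torch.clamp(1.0 / (1.0 + torch.log(aspect)), min=w_min)

def weighted_angle_loss(pred_theta, gt_theta, w, h):
    # Smooth L1 on the angle term, as stated in the description,
    # scaled by the per-target aspect-ratio weight.
    l1 = F.smooth_l1_loss(pred_theta, gt_theta, reduction="none")
    return (angle_loss_weight(w, h) * l1).mean()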
The implementation details of step one are as follows: first, the feature maps {C2, C3, C4, C5} are generated by the residual network, and P5 is obtained by convolution:

P5 = Conv(C5) (1)

Top-down fusion is then carried out to obtain P2, P3, P4:

PC_j = 0.5*P_{j+1} + 0.5*C_j, j ∈ {2, 3, 4}
P_j = Conv(PC_j), j ∈ {2, 3, 4} (2)
The anchor point allocation strategy of the second step is shown by the following formula:
Figure 237138DEST_PATH_IMAGE003
(3)
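Since equation (3) survives only as an image, the sketch below uses the standard FPN level-assignment rule, k = floor(k0 + log2(sqrt(w*h)/224)), as a stand-in that maps larger anchors to coarser pyramid levels; the patent's actual formula may differ.

import math

def assign_level(w, h, k0=4, k_min=2, k_max=5):
    # Map an anchor of size (w, h) to a pyramid level P2..P5.
    k = math.floor(k0 + math.log2(math.sqrt(w * h) / 224.0))
    return min(max(k, k_min), k_max)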
The mask generation details of step three are as follows. First, an independent mask is generated for each target according to its label:

t_mask_k = FillPoly(Zeros([H, W]), gt_boxes_k, labels_k) (4)

where FillPoly denotes pixel filling: a zero matrix of the same size as the input image is constructed first, and the pixels inside the target area are then set to the target's class index. After construction, all target masks are concatenated to build a high-dimensional matrix:

mask = Concate(t_mask_k), k ∈ [0, len(gt_boxes)] (5)

A one-hot vector of the target box regression is then constructed from the result of the candidate region generation network:

onehot = One_hot(rois_assignments, len(gt_boxes)) (6)

where rois_assignments denotes the target box index corresponding to each candidate region. At the same time, the masks are cropped using the generated candidate region results:

rois_cropped_mask = ROI_Align(rois, mask) (7)

where ROI_Align denotes a crop-and-scale operation that crops the mask according to each candidate region while scaling all crops to the same size. Finally, the one-hot vector activates the corresponding mask (a construction sketch follows):

rois_mask = Sum(rois_onehot * rois_cropped_mask) (8)
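A NumPy/OpenCV sketch of equations (4)-(8) follows (illustrative, not from the original disclosure), assuming H x W input images, gt_boxes given as 4-point polygons, and an external ROI-Align helper (e.g. torchvision.ops.roi_align) supplying the cropped mask stack of equation (7).

import numpy as np
import cv2

def build_mask_targets(gt_boxes, labels, H, W):
    # Eq. (4): one mask per target, pixels set to the class index.
    t_masks = []
    for quad, label in zip(gt_boxes, labels):
        m = np.zeros((H, W), dtype=np.float32)
        cv2.fillPoly(m, [np.asarray(quad, dtype=np.int32)], float(label))
        t_masks.append(m)
    return np.stack(t_masks)            # Eq. (5): shape (K, H, W)

def activate_rois_mask(rois_cropped_mask, rois_assignments, num_gt):
    # Eq. (6): one-hot over ground-truth boxes for each candidate region.
    onehot = np.eye(num_gt, dtype=np.float32)[rois_assignments]
    # Eq. (8): select each region's own mask from the cropped stack
    # (rois_cropped_mask: (R, K, S, S), from the eq. (7) ROI_Align step).
    return np.einsum("rk,rkst->rst", onehot, rois_cropped_mask)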
In step four, the construction of the loss function is divided into three parts; the overall loss function is:

Loss_total = Loss_RPN + Loss_FAST + Loss_MASK (9)
wherein the loss function of the candidate region generation network is defined as follows:

[Equation (10): RPN loss; rendered only as an image in the original] (10)

the target prediction loss is defined as follows:

[Equation (11): target prediction loss; rendered only as an image in the original] (11)

and the attention loss is defined as follows:

[Equation (12): attention loss; rendered only as an image in the original] (12)
wherein λ_i are the weights controlling each loss term; rp_n denotes the foreground/background prediction probability of the candidate region generation; rv_n denotes the anchor offset prediction of the candidate region generation; gp_n denotes the true foreground/background probability (1 for foreground, 0 for background); gv_n denotes the true anchor offset value; Fp_n denotes the classification result of the predicted target; Fv_n denotes the predicted rotated-box localization result; Gp_n denotes the true target class; Gv_n denotes the true localization of the target's rotated box; R_h^n and R_w^n denote the scaled attention feature size; u_ij denotes the value at the corresponding position of the attention feature map; gu_ij denotes the true mask value; L_cls denotes the softmax cross entropy; L_reg and L_reg_theta denote the smooth L1 loss; and L_AD denotes the pixel-level softmax cross entropy.
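As a closing illustration (not part of the original disclosure), the following PyTorch sketch assembles the overall loss of equation (9). The description states that L_cls is a softmax cross entropy, L_reg and L_reg_theta are smooth L1 losses, and L_AD is a pixel-level softmax cross entropy; equations (10)-(12) survive only as images, so the exact term grouping and the lambda weights are assumptions.

import torch
import torch.nn.functional as F

def total_loss(rp, gp, rv, gv,            # RPN: fg/bg scores, anchor offsets
               Fp, Gp, Fv, Gv,            # head: class scores, rotated boxes
               att_map, gt_mask,          # attention map, per-pixel class mask
               lambdas=(1.0, 1.0, 1.0, 1.0, 1.0)):
    l1, l2, l3, l4, l5 = lambdas
    # gp, Gp: long tensors of class indices; gt_mask: (N, H, W) long tensor.
    loss_rpn = (l1 * F.cross_entropy(rp, gp)
                + l2 * F.smooth_l1_loss(rv, gv))
    loss_fast = (l3 * F.cross_entropy(Fp, Gp)
                 + l4 * F.smooth_l1_loss(Fv, Gv))
    # Pixel-level cross entropy between attention map and mask target.
    loss_mask = l5 * F.cross_entropy(att_map, gt_mask)
    return loss_rpn + loss_fast + loss_mask   # eq. (9)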

Claims (1)

1. A method for detecting a rotating target of an aerial image, characterized by comprising the following steps: the method comprises data acquisition and labeling, data preprocessing, model design and loss function design, wherein aerial images are acquired with an unmanned aerial vehicle and a satellite map, targets are labeled in a rotated-box mode, the images are then preprocessed and divided into multiple sub-images of fixed size, corresponding label data are configured for each image generated by cropping and used to construct the training data tensor, model design is then carried out, the model comprising a feature extraction network, a candidate region generation network, a feature denoising structure and a rotating target predictor, image features are extracted with a residual network, anchor sizes are generated based on a clustering method, feature denoising is performed in combination with an attention mechanism, the candidate target features are input to a fully connected layer to predict the target's classification and localization information, the loss function of the rotating target detection model is designed, comprising the candidate region generation network loss, the attention loss, and the target classification and localization loss, model training is completed with the relevant aerial data, and finally the localization and classification prediction of rotating targets in aerial images is realized, wherein the specific implementation steps are as follows:
the implementation details of step one are as follows: first, the feature maps {C2, C3, C4, C5} are generated by the residual network, and P5 is obtained by convolution:
P5=Conv(C5) (1)
then top-down fusion is carried out to obtain P2, P3, P4:
PC_j = 0.5*P_{j+1} + 0.5*C_j, j ∈ {2, 3, 4}
P_j = Conv(PC_j), j ∈ {2, 3, 4} (2)
The anchor point allocation strategy of the second step is shown by the following formula:
[Equation (3): anchor-to-feature-layer assignment formula; rendered only as an image in the original]
the mask generation details of step three are as follows: first, an independent mask is generated for each target according to its label:
t_maskk=FillPoly(Zeros([H,W]),gt_boxesk,labelsk) (4)
wherein FillPoly denotes pixel filling: a zero matrix of the same size as the input image is constructed first, then the pixels in the target area are set to the target's class index, and after construction all target masks are concatenated to construct a high-dimensional matrix:
mask=Concate(t_maskk),k∈[0,len(gt_boxes)] (5)
and a one-hot vector of the target box regression is constructed according to the result of the candidate region generation network:
onehot=One_hot(rois_assignments,len(gt_boxes)) (6)
here, rois _ assignments represents the target box number corresponding to each candidate region, and at the same time, the mask is cropped using the generated candidate region result:
rois_cropped_mask=ROI_Align(rois,mask) (7)
here, ROI_Align denotes a crop-and-scale operation, where masks are cropped according to the candidate regions while being scaled to the same size, and finally, the corresponding mask is activated using the one-hot vector:
rois_mask=Sum(rois_onehot*rois_cropped_mask) (8)
step four, the construction of the loss function is divided into three parts, and the overall loss function is as follows:
Loss_total = Loss_RPN + Loss_FAST + Loss_MASK (9)
wherein the loss function of the candidate region generation network is defined as follows:

[Equation (10): RPN loss; rendered only as an image in the original]

the target prediction loss is defined as follows:

[Equation (11): target prediction loss; rendered only as an image in the original]

and the attention loss is defined as follows:

[Equation (12): attention loss; rendered only as an image in the original]
wherein λ_1, λ_2, λ_3, λ_4, λ_5 are the weights controlling each loss term; rp_n denotes the foreground prediction probability generated for the n-th candidate region; rv_ni denotes the anchor offset prediction of parameter i generated for the n-th candidate region; gp_n denotes the true foreground/background probability (1 for foreground, 0 for background); gv_ni denotes the true anchor offset value of parameter i for the n-th candidate region; Fp_n denotes the classification result of the n-th predicted target; Fv_ni denotes the rotated-box localization result of parameter i for the n-th predicted target; Gp_n denotes the true class of the n-th target; Gv_ni denotes the true rotated-box localization of parameter i for the n-th target; R_h^n and R_w^n denote the scaled attention feature size; u_re denotes the value at the corresponding position of the attention feature map; gu_re denotes the true mask value; L_cls denotes the softmax cross entropy; L_reg and L_reg_θ denote the smooth L1 loss; and L_AD denotes the pixel-level softmax cross entropy.
CN202010823765.0A 2020-08-17 2020-08-17 Method for detecting rotating target in aerial image Active CN111914795B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010823765.0A CN111914795B (en) 2020-08-17 2020-08-17 Method for detecting rotating target in aerial image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010823765.0A CN111914795B (en) 2020-08-17 2020-08-17 Method for detecting rotating target in aerial image

Publications (2)

Publication Number Publication Date
CN111914795A CN111914795A (en) 2020-11-10
CN111914795B true CN111914795B (en) 2022-05-27

Family

ID=73279011

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010823765.0A Active CN111914795B (en) 2020-08-17 2020-08-17 Method for detecting rotating target in aerial image

Country Status (1)

Country Link
CN (1) CN111914795B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112508029A (en) * 2020-12-03 2021-03-16 苏州科本信息技术有限公司 Instance segmentation method based on target box labeling
CN112799055A (en) * 2020-12-28 2021-05-14 深圳承泰科技有限公司 Method and device for detecting detected vehicle and electronic equipment
CN112926463A (en) * 2021-03-02 2021-06-08 普联国际有限公司 Target detection method and device
CN112926480B (en) * 2021-03-05 2023-01-31 山东大学 Multi-scale and multi-orientation-oriented aerial photography object detection method and system
CN112907972B (en) * 2021-04-06 2022-11-29 昭通亮风台信息科技有限公司 Road vehicle flow detection method and system based on unmanned aerial vehicle and computer readable storage medium
CN113298720B (en) * 2021-04-21 2022-08-19 重庆邮电大学 Self-adaptive overlapped image rotation method
CN113591748A (en) * 2021-08-06 2021-11-02 广东电网有限责任公司 Aerial photography insulator sub-target detection method and device
CN113723217A (en) * 2021-08-09 2021-11-30 南京邮电大学 Object intelligent detection method and system based on yolo improvement
CN113628208B (en) * 2021-08-30 2024-02-06 北京中星天视科技有限公司 Ship detection method, device, electronic equipment and computer readable medium
CN113673478B (en) * 2021-09-02 2023-08-11 福州视驰科技有限公司 Port large-scale equipment detection and identification method based on deep learning panoramic stitching
CN114360007B (en) * 2021-12-22 2023-02-07 浙江大华技术股份有限公司 Face recognition model training method, face recognition device, face recognition equipment and medium
CN114119610B (en) * 2022-01-25 2022-06-28 合肥中科类脑智能技术有限公司 Defect detection method based on rotating target detection
CN116306936A (en) * 2022-11-24 2023-06-23 北京建筑大学 Knowledge graph embedding method and model based on hierarchical relation rotation and entity rotation
CN116823838B (en) * 2023-08-31 2023-11-14 武汉理工大学三亚科教创新园 Ocean ship detection method and system with Gaussian prior label distribution and characteristic decoupling


Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10699119B2 (en) * 2016-12-02 2020-06-30 GEOSAT Aerospace & Technology Methods and systems for automatic object detection from aerial imagery
US11295532B2 (en) * 2018-11-15 2022-04-05 Samsung Electronics Co., Ltd. Method and apparatus for aligning 3D model

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101807256A (en) * 2010-03-29 2010-08-18 天津大学 Object identification detection method based on multiresolution frame
JP2019049484A (en) * 2017-09-11 2019-03-28 コニカミノルタ株式会社 Object detection system and object detection program
CN110276269A (en) * 2019-05-29 2019-09-24 西安交通大学 A kind of Remote Sensing Target detection method based on attention mechanism
CN111079519A (en) * 2019-10-31 2020-04-28 高新兴科技集团股份有限公司 Multi-posture human body detection method, computer storage medium and electronic device
CN111178213A (en) * 2019-12-23 2020-05-19 大连理工大学 Aerial photography vehicle detection method based on deep learning
CN111223041A (en) * 2020-01-12 2020-06-02 大连理工大学 Full-automatic natural image matting method
CN111401201A (en) * 2020-03-10 2020-07-10 南京信息工程大学 Aerial image multi-scale target detection method based on spatial pyramid attention drive
CN111428765A (en) * 2020-03-17 2020-07-17 武汉大学 Target detection method based on global convolution and local depth convolution fusion

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Fuhao Zou et al. Arbitrary-oriented object detection via dense feature fusion and attention model for remote sensing super-resolution image. Neural Computing and Applications, 2020, vol. 32. *
Ronggang Huang et al. Multiple rotation symmetry group detection via saliency-based visual attention and Frieze expansion pattern. Signal Processing: Image Communication, 2018, vol. 60. *
Tsung-Yi Lin et al. Feature Pyramid Networks for Object Detection. Computer Vision and Pattern Recognition, 2017. *
Zhou Ying et al. Uncertain target detection based on a multi-scale feature clustering algorithm. Fire Control & Command Control, 2019, vol. 44, no. 4. *
Lei Jiahui. Remote sensing target detection algorithm based on deep learning. China Master's Theses Full-text Database (Engineering Science and Technology), 2020. *

Also Published As

Publication number Publication date
CN111914795A (en) 2020-11-10

Similar Documents

Publication Publication Date Title
CN111914795B (en) Method for detecting rotating target in aerial image
Yu et al. A real-time detection approach for bridge cracks based on YOLOv4-FPM
CN111145174B (en) 3D target detection method for point cloud screening based on image semantic features
CN110728200B (en) Real-time pedestrian detection method and system based on deep learning
Liu et al. Multiscale U-shaped CNN building instance extraction framework with edge constraint for high-spatial-resolution remote sensing imagery
CN111178213A (en) Aerial photography vehicle detection method based on deep learning
CN104134234A (en) Full-automatic three-dimensional scene construction method based on single image
CN112633277A (en) Channel ship board detection, positioning and identification method based on deep learning
CN111242041A (en) Laser radar three-dimensional target rapid detection method based on pseudo-image technology
CN111640116B (en) Aerial photography graph building segmentation method and device based on deep convolutional residual error network
CN111914720B (en) Method and device for identifying insulator burst of power transmission line
CN110991444A (en) Complex scene-oriented license plate recognition method and device
CN113343858B (en) Road network geographic position identification method and device, electronic equipment and storage medium
CN113177560A (en) Universal lightweight deep learning vehicle detection method
CN116343053B (en) Automatic solid waste extraction method based on fusion of optical remote sensing image and SAR remote sensing image
Lu et al. A CNN-transformer hybrid model based on CSWin transformer for UAV image object detection
CN115861619A (en) Airborne LiDAR (light detection and ranging) urban point cloud semantic segmentation method and system of recursive residual double-attention kernel point convolution network
CN111027538A (en) Container detection method based on instance segmentation model
CN114519819B (en) Remote sensing image target detection method based on global context awareness
CN110287798B (en) Vector network pedestrian detection method based on feature modularization and context fusion
Liao et al. Lr-cnn: Local-aware region cnn for vehicle detection in aerial imagery
CN113902792A (en) Building height detection method and system based on improved RetinaNet network and electronic equipment
CN113989296A (en) Unmanned aerial vehicle wheat field remote sensing image segmentation method based on improved U-net network
CN113326734A (en) Rotary target detection method based on YOLOv5
CN110348311B (en) Deep learning-based road intersection identification system and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant