CN111914795B - Method for detecting rotating target in aerial image - Google Patents
- Publication number
- CN111914795B (granted publication of application CN202010823765.0A)
- Authority
- CN
- China
- Prior art keywords
- target
- loss
- mask
- candidate region
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/13—Satellite images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Abstract
The invention adopts a deep learning method to design a target detection model that detects targets such as vehicles, ships, and airplanes in high-altitude aerial images while predicting the position of each target's rotated bounding box. First, an image feature extraction network is designed to acquire high-dimensional features of the input aerial image, and a feature pyramid is built with an FPN (Feature Pyramid Network) architecture to extract target features at different resolutions. Then, the base anchor sizes of the candidate-region extraction network are generated by a clustering method, so that the anchors are adjusted according to the size distribution of targets in the training images, improving training efficiency. A feature-denoising detector combined with an attention mechanism is designed to denoise the target features of each candidate region. Finally, a rotation-angle error optimization method assigns corresponding weight factors to target boxes of different aspect ratios, optimizing the localization of large-aspect-ratio targets and enabling rotated-box prediction for various targets in aerial images.
Description
Technical Field
The invention relates to an aerial image target detection algorithm, and in particular to a method for predicting the position of a target's rotated bounding box for rotating-target detection.
Background
Target detection is a challenging computer vision task with application prospects in many fields, including face recognition, search and rescue, and intelligent transportation. Traditional target detection methods rely on hand-crafted features of the targets to be detected; this is tedious, and because such features are hard to extract and unstable, these methods suffer from low efficiency and a lack of robustness. With the recent introduction and application of deep learning methods, the field of target detection has reached several milestones, greatly improving both detection accuracy and speed. Deep-learning-based target detection divides mainly into single-stage and two-stage detection. Single-stage algorithms are fast but sacrifice some accuracy, making high-precision detection difficult; classical single-stage models include the YOLO series and SSD, while two-stage detection is represented by Fast RCNN. The two families differ markedly in model architecture, detector design, and training optimization, but as the dominant approaches to target detection they follow the same overall pipeline: for an input image, a base feature extraction network first processes low-dimensional pixel information to construct high-dimensional feature information, and a detector then predicts the target center point and bounding-box size from those high-order features.
Small-target detection and rotating-target detection are further important computer vision tasks beyond the classical detection task. Small targets occupy few pixels and a small fraction of the image, and are easily lost during the feature extraction of a convolutional neural network, so they are difficult to detect. In recent years, many algorithms designed for small targets combine low-dimensional with high-dimensional features for prediction, preventing small-target features from being ignored as the convolution depth increases and degrading the final prediction. Aerial images contain many target-dense areas such as parking lots, harbors, and airports. In highly dense areas, using traditional horizontal boxes causes non-maximum suppression to suppress a large number of target boxes, so many targets are missing from the detection result; detecting with rotated boxes effectively avoids this problem while also providing more accurate localization.
Disclosure of Invention
The invention provides a target-feature denoising and angle-error optimization method, based on multi-scale clustering and combined with an attention mechanism, for small-target detection and rotated-target localization prediction in aerial images, achieving accurate localization prediction of rotating targets.
The method adopts the residual network ResNet as the basic feature extraction framework to extract high-dimensional feature information from the input image, and designs a feature pyramid structure to fuse high-dimensional and low-dimensional features. Multi-scale clustering then sets the anchor parameters of the candidate-region proposal network (RPN), and anchors are assigned to each feature layer according to characteristics such as the receptive field of each resolution in the feature pyramid. Next, according to the candidate regions generated by the RPN, the corresponding feature maps are cropped from the corresponding feature layers, and each candidate-region feature is denoised with the proposed attention denoiser. The denoised target features are fed into fully connected layers for final localization and classification prediction.
The aerial image rotating target detection method comprises the following steps:
Step one: data acquisition and labeling. Aerial images are collected using equipment or network resources, for example high-resolution images captured by an unmanned aerial vehicle at high altitude or obtained from Google Maps and similar sources. After the images are collected, the targets are labeled; unlike the traditional horizontal circumscribed rectangle, labeling here uses rotated boxes. The specific implementation steps are:
Step A: collecting images with an unmanned aerial vehicle or from network resources, building a large body of training image data;
Step B: labeling the target boxes with a labeling tool, using the 4-point method, i.e. labeling the four vertices of a quadrilateral;
Step C: completing the rotated rectangular box labels by circumscribing each quadrilateral with its minimum-area rectangle, producing the label files.
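Step C, circumscribing a 4-point quadrilateral label with its minimum-area rotated rectangle, is normally done with a routine such as OpenCV's minAreaRect. The following self-contained NumPy sketch of the same idea (rotating calipers over the quadrilateral's edges; the function name and return convention are illustrative, not from the patent) shows how the rotated-box label can be derived:

```python
import numpy as np

def min_area_rect(quad):
    """Smallest enclosing rotated rectangle of a 4-point polygon.

    Returns (cx, cy, w, h, theta_rad). For a convex quadrilateral the optimal
    rectangle is flush with one of its edges, so we try each edge direction
    and keep the smallest area.
    """
    pts = np.asarray(quad, dtype=float)
    best = None
    for i in range(len(pts)):
        edge = pts[(i + 1) % len(pts)] - pts[i]
        theta = np.arctan2(edge[1], edge[0])
        c, s = np.cos(-theta), np.sin(-theta)
        rot = np.array([[c, -s], [s, c]])        # rotate points by -theta
        proj = pts @ rot.T
        mins, maxs = proj.min(axis=0), proj.max(axis=0)
        w, h = maxs - mins
        if best is None or w * h < best[0]:
            center_rot = (mins + maxs) / 2.0
            # rotate the rectangle centre back into image coordinates
            inv = np.array([[np.cos(theta), -np.sin(theta)],
                            [np.sin(theta),  np.cos(theta)]])
            cx, cy = inv @ center_rot
            best = (w * h, cx, cy, w, h, theta)
    _, cx, cy, w, h, theta = best
    return cx, cy, w, h, theta
```

The edge-flush property is why trying only four candidate directions suffices for a convex quadrilateral.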
Step two: data preprocessing. Aerial images have very high resolution; directly feeding the original images into the model, whether for training or testing, is impractical, placing a heavy burden on the hardware and greatly slowing training. The original images are therefore cut into small tiles before being input to the model for training and prediction. The specific implementation steps are:
Step A: cutting the images. The tile size is set to 800 x 800 pixels according to the hardware capability and the deep learning model; since direct cutting may sever targets at the tile edges, a cutting overlap of 200 pixels is set;
Step B: reconstructing the target label data by configuring corresponding labels for each cut image; whether a label belongs to an image is decided by whether the label's center lies inside the cut image;
Step C: constructing the training data by uniformly building training tensors from the cut images and label data for convenient model input; during label processing, the 4-point labels are converted to a center-point, box-size, and rotation-angle representation.
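The cropping and label-reassignment logic above can be sketched as follows. The 800-pixel tile and 200-pixel overlap follow the patent, while the clamping of the final tile to the image edge and the function names are assumptions of this sketch:

```python
def tile_origins(img_w, img_h, tile=800, overlap=200):
    """Top-left corners of overlapping crops covering the full image.

    Stride is tile - overlap; the last tile in each axis is clamped so the
    image edge is always covered.
    """
    step = tile - overlap
    xs = list(range(0, max(img_w - tile, 0) + 1, step))
    if xs[-1] + tile < img_w:
        xs.append(img_w - tile)
    ys = list(range(0, max(img_h - tile, 0) + 1, step))
    if ys[-1] + tile < img_h:
        ys.append(img_h - tile)
    return [(x, y) for y in ys for x in xs]

def assign_label(center, origin, tile=800):
    """A box belongs to a crop iff its center falls inside that crop."""
    cx, cy = center
    ox, oy = origin
    return ox <= cx < ox + tile and oy <= cy < oy + tile
```

Under the center rule a box whose center lies in the overlap region is assigned to more than one tile; the patent does not elaborate on this case.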
Step three: model design. The deep learning detection model of the invention comprises four core structures: a feature extraction network, a candidate-region generation network, a feature denoising structure, and a rotating-target predictor. During detection and training, the input data passes through these 4 structures in turn to produce the prediction result. The specific implementation steps of the model are:
Step A: adopting the residual network ResNet as the feature extractor to obtain high-dimensional information of the input image, then building a feature pyramid top-down, fusing the high-dimensional features downwards in turn to generate multiple feature maps;
Step B: generating the anchor sizes of the candidate-region generation network by clustering. First the target sizes in the training data are counted and the number of cluster centers is set; K-means clustering then generates the corresponding number of cluster centers, whose center-point coordinates serve as the anchor width and height parameters used to configure the candidate-region generation network. The network then generates, from the feature maps, multiple groups of candidate-region classifications and anchor localization offsets;
Step C: cropping the feature map from the feature-pyramid layer corresponding to each anchor offset to generate region-of-interest features, building a denoising map from them through several convolution layers, and multiplying the denoising map elementwise with each region-of-interest feature layer to obtain the denoised target features; a corresponding attention loss function is generated during training;
Step D: inputting the denoised target features into fully connected layers to predict the target's classification information and localization information; the classification information is the target class index, and the localization information is the target center, size, and rotation angle. During training, the rotation-angle error weight is set according to the target aspect ratio to optimize the angle error;
Step four: loss function design. The loss function comprises three parts: the foreground/background classification error and anchor offset localization error in the candidate-region generation network; the attention loss in attention denoising; and the classification error and rotated-box localization error of the final prediction.
Drawings
FIG. 1 is a diagram of a rotating object detection network architecture for the method of the present invention.
Fig. 2 is a partial aerial image of step one of the present invention.
FIG. 3 is a schematic diagram of image segmentation in step two of the present invention.
Fig. 4 is a structural diagram of a feature pyramid in step three of the present invention.
FIG. 5 is a diagram of an attention denoising detector in step three according to the present invention.
FIG. 6 is a model regression target design of step three of the present invention.
FIG. 7 is an attention mechanism mask design of step three of the present invention.
Detailed Description
The details of the model design of the present invention are described with reference to fig. 1, and the steps of the embodiment are as follows:
Step one: image feature extraction. The low-dimensional image pixels are processed to extract high-dimensional feature information (the feature pyramid structure is shown in fig. 4). The specific implementation steps are:
Step A: the residual network ResNet50d serves as the backbone; the input image passes through 4 residual blocks, producing 4 feature maps of different resolutions {C2, C3, C4, C5};
Step B: the generated feature maps are fused from top to bottom. First the pyramid's top-level feature P5 is generated from C5 by convolution; P5 is then combined with C4 to obtain P4, and fusion proceeds downwards in turn, finally producing the feature pyramid {P2, P3, P4, P5};
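The top-down fusion of steps A-B can be illustrated with the 0.5/0.5 averaging given later in eq. (2). In this sketch the per-level convolution is an identity placeholder and the upsampling is nearest-neighbour, both assumptions rather than the patent's actual layers:

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling, a stand-in for the interpolation an FPN uses."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def build_pyramid(c_maps, conv=lambda x: x):
    """Top-down fusion following eq. (2): each level averages the upsampled
    higher level with the lateral backbone map, then applies a convolution
    (here an identity placeholder). `c_maps` maps level j to a (H, W) array
    whose resolution halves as j grows.
    """
    p = {5: conv(c_maps[5])}           # eq. (1): P5 = Conv(C5)
    for j in (4, 3, 2):                # produce P4, P3, P2 in turn
        pc = 0.5 * upsample2x(p[j + 1]) + 0.5 * c_maps[j]
        p[j] = conv(pc)
    return p
```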
Step two: generating anchor sizes by the clustering method and assigning anchors to each feature layer for prediction; the candidate-region extraction network uses the generated feature pyramid to predict target offsets relative to the anchors. The specific implementation steps are:
Step A: counting the target size information in the training data and clustering with K-means; the number of cluster centers is set to 35, yielding 35 cluster centers that correspond to 35 anchor sizes, and each anchor is assigned to its corresponding feature layer;
Step B: performing convolution on the feature pyramid to generate the target's foreground/background score prediction and its offset relative to the anchor; the model outputs 2 values for the foreground/background scores, representing the foreground and background scores respectively, and 4 values for the offset relative to the anchor, representing the center offsets x, y and the size offsets w, h;
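The clustering of step A reduces to a plain K-means over (width, height) pairs, with the resulting centres used as anchor sizes. The initialisation and iteration count below are illustrative choices, not specified by the patent:

```python
import numpy as np

def kmeans_anchors(wh, k, iters=50, seed=0):
    """Plain K-means on (width, height) pairs; the cluster centres become the
    anchor sizes (the patent uses k = 35 on the training-set box sizes).
    """
    rng = np.random.default_rng(seed)
    wh = np.asarray(wh, dtype=float)
    centers = wh[rng.choice(len(wh), size=k, replace=False)]
    for _ in range(iters):
        # assign each box to its nearest centre, then recompute the centres
        d = np.linalg.norm(wh[:, None, :] - centers[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        for j in range(k):
            pts = wh[assign == j]
            if len(pts):
                centers[j] = pts.mean(axis=0)
    return centers
```

Detection-specific variants often cluster with an IoU-based distance instead of the Euclidean one used here; the patent does not say which distance it uses.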
Step three: feature denoising combined with an attention mechanism. The features are first cropped according to the candidate regions, the target features are then denoised, and the final target classification and rotated-box localization are predicted. The specific implementation steps are:
Step A: decoding the candidate region's anchor offsets against the corresponding anchor size to recover the real box size;
Step B: computing the box size as a percentage of the input image size;
Step C: cropping the feature map of the corresponding feature layer by that percentage to obtain the region-of-interest feature map;
Step D: feeding the feature map into the attention denoising generator to produce a feature denoising map of the same size as the feature map;
Step E: multiplying the feature map elementwise with the feature denoising map to obtain the target feature map;
Step F: attaching fully connected layers to predict the target class and the rotated-box localization;
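Steps D-E, gating the RoI features with the denoising map, amount to an elementwise multiply. The sigmoid that squashes the denoiser's output into [0, 1] is an assumption of this sketch, since the patent does not name the activation:

```python
import numpy as np

def denoise_features(roi_feat, denoise_logits):
    """Attention denoising as an elementwise gate: the denoiser's raw output
    is squashed to [0, 1] with a sigmoid (an assumed activation) and
    multiplied with the RoI feature map, suppressing noisy background
    activations while keeping target responses.
    """
    gate = 1.0 / (1.0 + np.exp(-denoise_logits))
    return roi_feat * gate
```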
Step four: loss function design. The loss function of the target detection model is designed for model training and convergence. The specific steps are:
Step A: the candidate-region generation network contains 2 loss terms, namely the foreground/background classification loss and the anchor offset value loss;
Step B: attention loss. A convergence mask target is built for each target in the manner shown in fig. 7; meanwhile, during attention denoising, an attention feature map is generated alongside the feature denoising map, and the attention loss function is built from the attention feature map and the mask target;
Step C: target classification and localization loss, comprising the target-class classification loss and the predicted rotated-box localization loss; when building the localization loss, corresponding weights are set according to the target's aspect ratio, preventing the supervision error of large-aspect-ratio targets from excessively influencing the final prediction;
The implementation details of step one are as follows: first, the feature maps {C2, C3, C4, C5} are generated by the residual network, and P5 is computed by convolution:
P5=Conv(C5) (1)
Then fusion is carried out from top to bottom to obtain P2, P3, P4:
PCj=0.5*Pj+1+0.5*Cj, j∈{2,3,4}
Pj=Conv(PCj), j∈{2,3,4} (2)
The anchor point allocation strategy of the second step is shown by the following formula:
The mask generation details of step three are as follows: first, an independent mask is generated for each target according to its label:
t_maskk=FillPoly(Zeros([H,W]), gt_boxesk, labelsk) (4)
where FillPoly denotes pixel filling: a zero matrix with the same size as the input image is constructed, and the pixels inside the target area are set to the target's class index. After construction, all target masks are concatenated into a high-dimensional matrix:
mask=Concate(t_maskk), k∈[0, len(gt_boxes)] (5)
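A minimal stand-in for the FillPoly construction (a real pipeline would call OpenCV's fillPoly): rasterise one convex quadrilateral onto a zero matrix, writing the class index into the covered pixels. The vertex order and the pixel-centre test are assumptions of this sketch:

```python
import numpy as np

def fill_poly(shape, quad, value):
    """Rasterise a convex quadrilateral onto a zero matrix, writing `value`
    (the target's class index) into every covered pixel. Vertices are assumed
    ordered clockwise on screen (image y-axis pointing down); pixel centres
    are tested against all four edges with a cross-product sign check.
    """
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    px, py = xs + 0.5, ys + 0.5          # pixel centres
    quad = np.asarray(quad, dtype=float)
    inside = np.ones(shape, dtype=bool)
    for i in range(4):
        x0, y0 = quad[i]
        x1, y1 = quad[(i + 1) % 4]
        # interior pixels give a non-negative cross product for every edge
        cross = (x1 - x0) * (py - y0) - (y1 - y0) * (px - x0)
        inside &= cross >= 0
    mask = np.zeros(shape, dtype=np.int32)
    mask[inside] = value
    return mask
```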
A one-hot vector for target-box regression is then constructed from the result of the candidate-region generation network:
onehot=One_hot(rois_assignments, len(gt_boxes)) (6)
Here rois_assignments denotes the ground-truth box index assigned to each candidate region. At the same time, the mask is cropped using the generated candidate regions:
rois_cropped_mask=ROI_Align(rois, mask) (7)
ROI_Align here denotes a crop-and-scale operation, which crops the mask according to each candidate region while scaling all crops to the same size. Finally, each candidate region's own mask is activated using the one-hot vector:
rois_mask=Sum(onehot*rois_cropped_mask) (8)
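Equations (6)-(8), selecting each candidate region's own cropped mask via a one-hot vector, can be sketched as below; the (R, K, H, W) layout of the cropped masks is an assumed convention of this sketch:

```python
import numpy as np

def activate_masks(rois_cropped_mask, rois_assignments, num_boxes):
    """Select each RoI's own target mask, following eqs. (6)-(8): One_hot
    picks the assigned ground-truth box channel and the weighted sum
    collapses the channel axis to a single per-RoI mask.

    rois_cropped_mask: (R, K, H, W), one cropped mask per ground-truth box.
    """
    onehot = np.eye(num_boxes)[rois_assignments]                       # (R, K), eq. (6)
    return (onehot[:, :, None, None] * rois_cropped_mask).sum(axis=1)  # eq. (8)
```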
The construction of the loss function in step four is divided into three parts; the overall loss function is:
Losstotal=LossRPN+LossFAST+LossMASK (9)
wherein the loss function of the candidate area generating network is defined as follows:
the target predicted loss is defined as follows:
attention loss is defined as follows:
where λ_i are weights controlling each component loss; rp_n denotes the foreground/background prediction probability generated for each candidate region and rv_n the corresponding anchor offset prediction; gp_n denotes the true foreground/background probability (foreground 1, background 0) and gv_n the true anchor offset value; Fp_n denotes the classification result of the predicted target, Fv_n the predicted rotated-box localization, Gp_n the true target class, and Gv_n the true rotated-box localization; R_h^n and R_w^n denote the scaled attention feature size; u_ij denotes the value at the corresponding position of the attention feature map and gu_ij the true mask value; L_cls denotes the softmax cross entropy, L_reg and L_reg_θ the smooth L1 loss, and L_AD the pixel-level softmax cross entropy.
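The aspect-ratio weighting of the angle error described in step four can be sketched as follows. The patent does not state the exact weighting function, so the 1/min(aspect, cap) factor here is purely an assumed stand-in:

```python
import numpy as np

def smooth_l1(x):
    """Elementwise smooth-L1 (Huber with delta = 1), i.e. the L_reg of the text."""
    x = np.abs(x)
    return np.where(x < 1.0, 0.5 * x * x, x - 0.5)

def rotated_box_loss(pred, target, ratio_cap=4.0):
    """Localisation loss with aspect-ratio weighting of the angle term: the
    theta component is down-weighted by 1 / min(aspect, ratio_cap) so that
    elongated targets do not dominate the supervision. Boxes are rows of
    (cx, cy, w, h, theta); the weighting function and cap are assumptions.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    loss = smooth_l1(pred - target)
    aspect = np.maximum(target[:, 2], target[:, 3]) / np.minimum(target[:, 2], target[:, 3])
    loss[:, 4] *= 1.0 / np.minimum(aspect, ratio_cap)
    return loss.sum(axis=1).mean()
```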
Claims (1)
1. A method for detecting rotating targets in aerial images, characterized by comprising: data acquisition and labeling, data preprocessing, model design, and loss-function design, wherein aerial images are acquired using an unmanned aerial vehicle and satellite maps, and targets are labeled with rotated boxes; the images are then preprocessed and cut into a plurality of sub-images of fixed size, corresponding label data are configured for each image produced by cutting, and the label data are used to construct the training data tensor; model design is then carried out, the model comprising a feature extraction network, a candidate-region generation network, a feature denoising structure, and a rotating-target predictor, wherein image features are extracted with a residual network, anchor sizes are generated by a clustering method, feature denoising is performed in combination with an attention mechanism, and the candidate target features are input into fully connected layers to predict the target's classification and localization information; the loss function of the rotating-target detection model is designed, comprising the candidate-region generation network loss, the attention loss, and the target classification and localization loss; model training is completed with the relevant aerial data, finally achieving localization and classification prediction of rotating targets in aerial images, the specific implementation steps being as follows:
The implementation details of step one are as follows: first, the feature maps {C2, C3, C4, C5} are generated by the residual network, and P5 is computed by convolution:
P5=Conv(C5) (1)
Then fusion is carried out from top to bottom to obtain P2, P3, P4:
PCj=0.5*Pj+1+0.5*Cj,j∈{2,3,4}
Pj=Conv(PCj),j∈{2,3,4} (2)
The anchor point allocation strategy of the second step is shown by the following formula:
the mask generation details of the third step are as follows: firstly, generating an independent mask for each target according to the label:
t_maskk=FillPoly(Zeros([H,W]),gt_boxesk,labelsk) (4)
where FillPoly denotes pixel filling: a zero matrix with the same size as the input image is first constructed, then the pixels inside the target area are set to the target's class index; after construction, all target masks are concatenated into a high-dimensional matrix:
mask=Concate(t_maskk),k∈[0,len(gt_boxes)] (5)
and constructing a single-hot vector of target frame regression according to the result of the candidate region generation network:
onehot=One_hot(rois_assignments,len(gt_boxes)) (6)
here, rois_assignments denotes the ground-truth box index assigned to each candidate region; at the same time, the mask is cropped using the generated candidate regions:
rois_cropped_mask=ROI_Align(rois,mask) (7)
here, ROI_Align denotes a crop-and-scale operation, which crops the mask according to each candidate region while scaling all crops to the same size; finally, each candidate region's own mask is activated using the one-hot vector:
rois_mask=Sum(onehot*rois_cropped_mask) (8)
step four, the construction of the loss function is divided into three parts, and the overall loss function is as follows:
Losstotal=LossRPN+LossFAST+LossMASK (9)
wherein the loss function of the candidate area generating network is defined as follows:
the target predicted loss is defined as follows:
attention loss is defined as follows:
where λ_1, λ_2, λ_3, λ_4, λ_5 are weights controlling each component loss; rp_n denotes the foreground prediction probability generated for the n-th candidate region and rv_ni the anchor offset prediction of parameter i generated for the n-th candidate region; gp_n denotes the true foreground/background probability (foreground 1, background 0) and gv_ni the true anchor offset value of parameter i for the n-th candidate region; Fp_n denotes the classification result of the n-th predicted target and Fv_ni the rotated-box localization result of parameter i for the n-th predicted target; Gp_n denotes the true class of the n-th target and Gv_ni the true rotated-box localization of parameter i for the n-th target; R_h^n and R_w^n denote the scaled attention feature size; u_ij denotes the value at the corresponding position of the attention feature map and gu_ij the true mask value; L_cls denotes the softmax cross entropy, L_reg and L_reg_θ the smooth L1 loss, and L_AD the pixel-level softmax cross entropy.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010823765.0A CN111914795B (en) | 2020-08-17 | 2020-08-17 | Method for detecting rotating target in aerial image |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111914795A (en) | 2020-11-10
CN111914795B (en) | 2022-05-27
Family
ID=73279011
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010823765.0A Active CN111914795B (en) | 2020-08-17 | 2020-08-17 | Method for detecting rotating target in aerial image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111914795B (en) |
Families Citing this family (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112508029A (en) * | 2020-12-03 | 2021-03-16 | 苏州科本信息技术有限公司 | Instance segmentation method based on target box labeling |
CN112799055A (en) * | 2020-12-28 | 2021-05-14 | 深圳承泰科技有限公司 | Method and device for detecting detected vehicle and electronic equipment |
CN112926463A (en) * | 2021-03-02 | 2021-06-08 | 普联国际有限公司 | Target detection method and device |
CN112926480B (en) * | 2021-03-05 | 2023-01-31 | 山东大学 | Multi-scale and multi-orientation-oriented aerial photography object detection method and system |
CN112907972B (en) * | 2021-04-06 | 2022-11-29 | 昭通亮风台信息科技有限公司 | Road vehicle flow detection method and system based on unmanned aerial vehicle and computer readable storage medium |
CN113298720B (en) * | 2021-04-21 | 2022-08-19 | 重庆邮电大学 | Self-adaptive overlapped image rotation method |
CN113591748A (en) * | 2021-08-06 | 2021-11-02 | 广东电网有限责任公司 | Aerial photography insulator sub-target detection method and device |
CN113723217A (en) * | 2021-08-09 | 2021-11-30 | 南京邮电大学 | Object intelligent detection method and system based on yolo improvement |
CN113628208B (en) * | 2021-08-30 | 2024-02-06 | 北京中星天视科技有限公司 | Ship detection method, device, electronic equipment and computer readable medium |
CN113673478B (en) * | 2021-09-02 | 2023-08-11 | 福州视驰科技有限公司 | Port large-scale equipment detection and identification method based on deep learning panoramic stitching |
CN114360007B (en) * | 2021-12-22 | 2023-02-07 | 浙江大华技术股份有限公司 | Face recognition model training method, face recognition device, face recognition equipment and medium |
CN114119610B (en) * | 2022-01-25 | 2022-06-28 | 合肥中科类脑智能技术有限公司 | Defect detection method based on rotating target detection |
CN116306936A (en) * | 2022-11-24 | 2023-06-23 | 北京建筑大学 | Knowledge graph embedding method and model based on hierarchical relation rotation and entity rotation |
CN116823838B (en) * | 2023-08-31 | 2023-11-14 | 武汉理工大学三亚科教创新园 | Ocean ship detection method and system with Gaussian prior label distribution and characteristic decoupling |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101807256A (en) * | 2010-03-29 | 2010-08-18 | 天津大学 | Object identification detection method based on multiresolution frame |
JP2019049484A (en) * | 2017-09-11 | 2019-03-28 | コニカミノルタ株式会社 | Object detection system and object detection program |
CN110276269A (en) * | 2019-05-29 | 2019-09-24 | 西安交通大学 | A kind of Remote Sensing Target detection method based on attention mechanism |
CN111079519A (en) * | 2019-10-31 | 2020-04-28 | 高新兴科技集团股份有限公司 | Multi-posture human body detection method, computer storage medium and electronic device |
CN111178213A (en) * | 2019-12-23 | 2020-05-19 | 大连理工大学 | Aerial photography vehicle detection method based on deep learning |
CN111223041A (en) * | 2020-01-12 | 2020-06-02 | 大连理工大学 | Full-automatic natural image matting method |
CN111401201A (en) * | 2020-03-10 | 2020-07-10 | 南京信息工程大学 | Aerial image multi-scale target detection method based on spatial pyramid attention drive |
CN111428765A (en) * | 2020-03-17 | 2020-07-17 | 武汉大学 | Target detection method based on global convolution and local depth convolution fusion |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10699119B2 (en) * | 2016-12-02 | 2020-06-30 | GEOSAT Aerospace & Technology | Methods and systems for automatic object detection from aerial imagery |
US11295532B2 (en) * | 2018-11-15 | 2022-04-05 | Samsung Electronics Co., Ltd. | Method and apparatus for aligning 3D model |
Non-Patent Citations (5)
Title |
---|
Fuhao Zou et al. Arbitrary-oriented object detection via dense feature fusion and attention model for remote sensing super-resolution image. Neural Computing and Applications, 2020, vol. 32. *
Ronggang Huang et al. Multiple rotation symmetry group detection via saliency-based visual attention and Frieze expansion pattern. Signal Processing: Image Communication, 2018, vol. 60. *
Tsung-Yi Lin et al. Feature Pyramid Networks for Object Detection. Computer Vision and Pattern Recognition, 2017. *
Zhou Ying et al. Uncertain target detection based on a multi-scale feature clustering algorithm. Fire Control & Command Control, 2019, vol. 44, no. 4. *
Lei Jiahui. Deep-learning-based remote sensing target detection algorithms. China Masters' Theses Full-text Database (Engineering Science and Technology), 2020. *
Also Published As
Publication number | Publication date |
---|---|
CN111914795A (en) | 2020-11-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111914795B (en) | Method for detecting rotating target in aerial image | |
Yu et al. | A real-time detection approach for bridge cracks based on YOLOv4-FPM | |
CN111145174B (en) | 3D target detection method for point cloud screening based on image semantic features | |
CN110728200B (en) | Real-time pedestrian detection method and system based on deep learning | |
Liu et al. | Multiscale U-shaped CNN building instance extraction framework with edge constraint for high-spatial-resolution remote sensing imagery | |
CN111178213A (en) | Aerial photography vehicle detection method based on deep learning | |
CN104134234A (en) | Full-automatic three-dimensional scene construction method based on single image | |
CN112633277A (en) | Channel ship board detection, positioning and identification method based on deep learning | |
CN111242041A (en) | Laser radar three-dimensional target rapid detection method based on pseudo-image technology | |
CN111640116B (en) | Aerial photography graph building segmentation method and device based on deep convolutional residual error network | |
CN111914720B (en) | Method and device for identifying insulator burst of power transmission line | |
CN110991444A (en) | Complex scene-oriented license plate recognition method and device | |
CN113343858B (en) | Road network geographic position identification method and device, electronic equipment and storage medium | |
CN113177560A (en) | Universal lightweight deep learning vehicle detection method | |
CN116343053B (en) | Automatic solid waste extraction method based on fusion of optical remote sensing image and SAR remote sensing image | |
Lu et al. | A CNN-transformer hybrid model based on CSWin transformer for UAV image object detection | |
CN115861619A (en) | Airborne LiDAR (light detection and ranging) urban point cloud semantic segmentation method and system of recursive residual double-attention kernel point convolution network | |
CN111027538A (en) | Container detection method based on instance segmentation model | |
CN114519819B (en) | Remote sensing image target detection method based on global context awareness | |
CN110287798B (en) | Vector network pedestrian detection method based on feature modularization and context fusion | |
Liao et al. | Lr-cnn: Local-aware region cnn for vehicle detection in aerial imagery | |
CN113902792A (en) | Building height detection method and system based on improved RetinaNet network and electronic equipment | |
CN113989296A (en) | Unmanned aerial vehicle wheat field remote sensing image segmentation method based on improved U-net network | |
CN113326734A (en) | Rotary target detection method based on YOLOv5 | |
CN110348311B (en) | Deep learning-based road intersection identification system and method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||