CN111401201A - Aerial image multi-scale target detection method based on spatial pyramid attention drive - Google Patents

Info

Publication number
CN111401201A
Authority
CN
China
Prior art keywords
attention
feature
spatial
pyramid
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010164167.7A
Other languages
Chinese (zh)
Other versions
CN111401201B (en
Inventor
孙玉宝
辛宇
徐宏伟
陈勋豪
周旺平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Information Science and Technology
Original Assignee
Nanjing University of Information Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Information Science and Technology
Priority to CN202010164167.7A
Publication of CN111401201A
Application granted
Publication of CN111401201B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/13Satellite images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Remote Sensing (AREA)
  • Astronomy & Astrophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-scale target detection method for aerial images driven by spatial pyramid attention, comprising the following steps: first, for a data set of large-size images, a blocking method is applied to augment the training data set; a residual network whose feature representation is enhanced by convolutional attention is designed as the backbone network to extract image features efficiently; a spatial pyramid attention module is then constructed so that the network focuses more accurately on targets of different scales and extracts the regions of interest where the targets are located; a target category analysis and target frame regression module is established to classify the regions of interest at different scales and predict the target frames; in the testing stage, a multi-scale testing strategy is applied with the trained detection network, and the detection results of different scales are fused by a global integrated non-maximum suppression algorithm, further improving detection accuracy.

Description

Aerial image multi-scale target detection method based on spatial pyramid attention drive
Technical Field
The invention belongs to the technical field of image recognition and target detection, and particularly relates to an aerial image multi-scale target detection method based on spatial pyramid attention driving.
Background
Target detection, also called target extraction, is a form of image segmentation based on the geometric and statistical characteristics of targets; it combines the segmentation and identification of targets into one task, and its accuracy and real-time performance are important capabilities of the whole system. In complex scenes in particular, where a plurality of targets must be processed in real time, automatic target extraction and identification are especially important. With the development of computer technology and the wide application of computer vision, research on real-time target tracking using computer image processing has become increasingly popular, and dynamic real-time tracking and positioning of targets has wide application value in intelligent traffic systems, intelligent monitoring systems, military target detection, surgical instrument positioning in medical navigation operations, and the like.
On the one hand, many target detection methods have appeared in recent years, such as YOLO, SSD, RetinaNet and the RCNN series, wherein YOLO, SSD and RetinaNet are single-stage methods, and the original RCNN and its extensions Fast-RCNN and Faster-RCNN are two-stage methods.
On the other hand, the visual attention mechanism is a brain signal processing mechanism unique to human vision. Human vision obtains a target area needing important attention, namely a focus of attention in general, by rapidly scanning a global image so as to acquire more information which is critical to the characteristics of the target needing attention. Therefore, the model introducing the attention mechanism is of great help to improve the accuracy of target detection.
Under the condition of not considering the detection speed, the accuracy of the two-stage target detection algorithm is higher than that of the single-stage target detection algorithm, so that the two-stage target detection algorithm can achieve higher accuracy in many conditions such as detection of aerial pictures of the unmanned aerial vehicle. Therefore, the patent provides a feature pyramid dual-attention-driven multi-scale target detection network based on a deep learning theory and a latest attention mechanism method.
Disclosure of Invention
The technical problem to be solved by the invention is to overcome the deficiencies of the prior art by providing an aerial image multi-scale target detection method based on spatial pyramid attention drive.
In order to achieve the technical purpose, the technical scheme adopted by the invention is as follows:
A multi-scale target detection method for aerial images based on spatial pyramid attention driving, comprising the following steps:
S101: collecting an unmanned aerial vehicle aerial image set and carrying out blocking processing to obtain a large number of small cut-block images of consistent size;
S102: inputting the small cut-block images into a residual network and extracting features through a convolution attention module in the residual network, wherein the convolution attention module comprises a first channel attention unit and a first spatial attention unit; a channel attention map is calculated by the first channel attention unit, a spatial attention map is calculated by the first spatial attention unit, and a first feature map is generated by combining the channel attention map and the spatial attention map;
S103: extracting features from the first feature map by a detector based on a feature pyramid, adding a dual attention module containing a second spatial attention unit and a second channel attention unit to each layer of the feature pyramid from top to bottom, fusing the feature maps generated by the two attention units to obtain a second feature map, performing the region-of-interest alignment operation on the second feature map with the region suggestion network in the last layer, and fixing the size of the features;
S104: for the second feature map obtained after region-of-interest alignment, establishing a target category analysis and target frame regression module, and carrying out classification and target frame prediction on the regions of interest at different scales;
S105: carrying out multi-scale image testing with the original image and the original image enlarged 1.5 times, inputting the images of the two scales separately into the deep network for testing, and fusing the results of the different scales through a global integrated non-maximum suppression algorithm, thereby improving detection accuracy.
In order to optimize the technical scheme, the specific measures adopted further comprise:
The step S101 specifically includes: carrying out sliding-window blocking on the image with a window size of 1000 × 1000 pixels and an overlap rate of 0.25, keeping the coordinate information of the manually labeled box only for vehicles whose IOU is greater than 0.7, and, for all vehicles in each cut-block image, converting the manually labeled bounding box into the coordinates of the small cut-block image.
The step S102 specifically includes: inputting the picture into a residual network embedded with a convolution attention module, wherein the first channel attention unit compresses the feature map in the spatial dimension using maximum pooling and average pooling to obtain two different spatial background descriptions $F^{c}_{avg}$ and $F^{c}_{max}$; the spatial background descriptions $F^{c}_{avg}$ and $F^{c}_{max}$ are passed through the network to calculate the channel attention map, wherein the calculation formula of the first channel attention unit is as follows:
$$M_c(F) = \sigma\big(W_1(W_0(F^{c}_{avg})) + W_1(W_0(F^{c}_{max}))\big)$$
wherein: $W_1$ and $W_0$ denote the weights of a multi-layer perceptron, the two weights are shared by both inputs, and $W_0$ is followed by a ReLU activation function; σ denotes the Sigmoid function, and F denotes the feature produced by the convolution operation at the corresponding stage of the attention mechanism;
wherein the first spatial attention unit applies maximum pooling and average pooling along the channel dimension to obtain two different feature descriptions $F^{s}_{avg}$ and $F^{s}_{max}$, and generates a spatial attention map by convolution, wherein the calculation formula of the first spatial attention unit is as follows:
$$M_s(F) = \sigma\big(f^{7\times 7}([F^{s}_{avg}; F^{s}_{max}])\big)$$
wherein: σ denotes the Sigmoid function, and $f^{7\times 7}$ denotes a convolution with a kernel size of 7 × 7;
a first feature map is then generated from the channel attention map and the spatial attention map.
The step S103 specifically includes: extracting features from the first feature map by a feature pyramid-based detector, and adding a dual attention module containing a second position attention unit and a second spatial attention unit to each layer of the feature pyramid from top to bottom;
a correlation strength matrix between any two point features is calculated by the second position attention unit: the original feature $A_j$ is reduced in dimension by convolution to obtain feature $B_i$, feature $C_j$ and feature $D_i$; the dimensions of $B_i$ and $C_j$ are then reshaped, and the correlation strength matrix between any two point features is obtained from their matrix product; the feature $S_{ji}$ of each position with respect to the other positions is calculated with the softmax function; the feature $S_{ji}$ is then multiplied and fused with the feature $D_i$, and finally the result is added to the original feature $A_j$ to obtain the position feature map finally output by the position attention unit, wherein the calculation formula of the second position attention unit is as follows:
$$S_{ji} = \frac{\exp(B_i \cdot C_j)}{\sum_{i}\exp(B_i \cdot C_j)}, \qquad E_{j1} = \sum_{i} S_{ji} D_i + A_j$$
wherein $A_j$ denotes the feature corresponding to the given position; $B_i$, $C_j$ and $D_i$ denote three new features generated from $A_j$ by convolution dimensionality reduction; $S_{ji}$ denotes the position attention map obtained by reshaping $B_i$ and $C_j$, multiplying them as matrices and passing the product through the softmax layer; and $E_{j1}$ denotes the position feature map finally output by the second position attention unit;
the second spatial attention unit carries out dimension transformation and matrix multiplication on the features of any two channels to obtain the correlation strength of any two channels, then calculates the feature map between the channels, and finally performs weighted fusion with the feature map between the channels, so that global correlation is generated among the channels and features with a stronger semantic response are obtained, wherein the calculation formula of the second spatial attention unit is as follows:
$$x_{ji} = \frac{\exp(A_i \cdot A_j)}{\sum_{i}\exp(A_i \cdot A_j)}, \qquad E_{j2} = \sum_{i} x_{ji} A_i + A_j$$
wherein $A_j$ denotes the feature corresponding to a given position; $x_{ji}$ denotes the channel feature map obtained by multiplying $A_j$ with its transpose $A_i$ and passing the product through the softmax layer; and $E_{j2}$ denotes the spatial feature map finally output by the second spatial attention unit;
and finally, the position feature map and the spatial feature map are fused to obtain the final second feature map, the region suggestion network in the last layer performs the region-of-interest alignment operation on the obtained second feature map, and the size of the features is fixed.
The step S104 specifically includes: after the region-of-interest alignment operation has been performed on the second feature map and features of fixed size have been obtained, two 1024-dimensional fully connected layers are attached and then split into two branches, on which the target category analysis module and the target frame regression module are respectively established, so that the regions of interest at the different scales of the feature pyramid are classified and their target frames predicted.
The step S105 specifically includes: multi-scale image testing is adopted; both the original test images and the original images enlarged 1.5 times are used, the images of the two scales are processed by blocking and then input separately into the deep network for testing to obtain the detection results at each scale, and the detection results of the two scales are combined by the global non-maximum suppression fusion algorithm, thereby improving detection accuracy.
The global integrated non-maximum suppression algorithm proceeds as follows:
Step 1. globally align the coordinates of the prediction boxes of the sub-blocks at each scale;
Step 2. weight the confidence scores of the detection boxes and sort them;
Step 3. select the bounding box with the highest confidence, add it to the final output list, and delete it from the bounding box list;
Step 4. calculate the areas of all bounding boxes;
Step 5. calculate the IOU between the bounding box with the highest confidence and each of the other candidate boxes;
Step 6. delete the bounding boxes whose IOU is greater than the threshold;
Step 7. repeat the above process until the bounding box list is empty.
The invention has the beneficial effects that:
The invention uses the theory of computer target detection and attention mechanisms to establish a multi-scale target detection network driven by feature pyramid dual attention. For aerial images that are large, contain small targets to be detected and have highly complex backgrounds, the data set is first processed by blocking; the strong feature extraction capability of the feature pyramid dual attention is then exploited; and a multi-scale fusion detection method is adopted, in which the detection results of the two scales are combined by the global non-maximum suppression fusion algorithm to obtain a more accurate final detection result. The detection network provided by the invention performs well on target detection in aerial pictures and is applicable to fields such as geographic environment detection, traffic flow control and military behavior monitoring.
Drawings
FIG. 1 is a schematic flow chart of the algorithm of the present invention;
FIG. 2 is a schematic flow diagram of a global non-maximum suppression fusion algorithm;
FIG. 3 is a schematic diagram of a feature pyramid portion of a dual attention mechanism drive constructed in accordance with the present invention;
FIG. 4 is a schematic diagram of a detection network of the present invention;
FIG. 5 is a quantitative analysis comparison chart for the unmanned aerial vehicle data set.
Detailed Description
Embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
As shown in fig. 1, the present invention is a spatial pyramid attention-driven aerial image multi-scale target detection method, wherein: the method comprises the following steps:
S101, before training, block processing is applied to the unmanned aerial vehicle aerial photography automobile data set used to verify the effectiveness of the designed network;
Specifically: before the data set is sent to the network for training, it is first preprocessed. The data set used in our experiments comprises 4355 aerial images and the corresponding coordinates of manually labeled vehicles. Because each image is shot from an unmanned aerial vehicle and is therefore too large, the image is partitioned with a sliding window of 1000 × 1000 pixels to obtain a large number of small cut-block images. To avoid, as far as possible, vehicles being left incomplete by the partitioning, an overlap rate of 0.25 is adopted, and the coordinate information of the manually labeled box is kept only for vehicles whose IOU is greater than 0.7. For all vehicle instances in each cut-block image, the cut-block is saved and the manually labeled bounding boxes of the vehicles are converted into the coordinates of the small cut-block image. In total, 48416 small cut-block images of size 1000 × 1000 are obtained.
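The blocking step can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the patented implementation: the function names `tile_origins` and `boxes_in_tile` are hypothetical, the last window on each axis is assumed to be shifted back so that it stays inside the image, and the "IOU greater than 0.7" rule is read as the overlap between a labeled box and its part inside the tile (which equals the clipped area divided by the box area).

```python
def tile_origins(length, tile=1000, overlap=0.25):
    """Start offsets of the sliding windows along one image axis."""
    stride = int(tile * (1 - overlap))  # 750 px for the values in the text
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # assumption: last window is shifted to fit
    return starts


def boxes_in_tile(boxes, x0, y0, tile=1000, keep_ratio=0.7):
    """Clip (x1, y1, x2, y2) boxes to a tile and shift them to tile coordinates.

    A box is kept only if the part inside the tile covers more than
    keep_ratio of the original box area (assumed meaning of the IOU rule).
    """
    kept = []
    for x1, y1, x2, y2 in boxes:
        cx1, cy1 = max(x1, x0), max(y1, y0)
        cx2, cy2 = min(x2, x0 + tile), min(y2, y0 + tile)
        inter = max(0, cx2 - cx1) * max(0, cy2 - cy1)
        area = (x2 - x1) * (y2 - y1)
        if area > 0 and inter / area > keep_ratio:
            kept.append((cx1 - x0, cy1 - y0, cx2 - x0, cy2 - y0))
    return kept
```

With these defaults, a box that straddles the right edge of one tile is dropped there and kept whole in the overlapping neighbor tile, which is the purpose of the 0.25 overlap.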
S102, inputting the small cut-block image into a residual error network, extracting features through a convolution attention module in the residual error network, wherein the convolution attention module comprises a first channel attention unit and a first space attention unit, obtaining a channel attention diagram through calculation according to the first channel attention unit, obtaining a space attention diagram through calculation according to the first space attention unit, and generating a first feature diagram by combining the channel attention diagram and the space attention diagram.
Specifically: the picture first passes through the backbone network, for which a residual network is selected, and a convolution attention module is embedded in the residual structure. The convolution attention module combines spatial attention and channel attention, and the resulting attention map is multiplied by the input feature map so that the features are learned adaptively; after the picture passes through the backbone network, a feature map is generated and sent to the next stage;
the convolution attention module comprises a first channel attention unit and a first spatial attention unit. The first channel attention unit is concerned with what is meaningful in an input picture. To calculate the channel attention efficiently, the first channel attention unit compresses the feature map in the spatial dimension using maximum pooling and average pooling to obtain two different spatial background descriptions $F^{c}_{avg}$ and $F^{c}_{max}$; the channel attention map is then calculated by passing the two spatial background descriptions through a shared network consisting of an MLP, so the calculation formula of the first channel attention unit is as follows:
$$M_c(F) = \sigma\big(W_1(W_0(F^{c}_{avg})) + W_1(W_0(F^{c}_{max}))\big)$$
wherein $W_1$ and $W_0$ denote the weights of the multi-layer perceptron, the two weights are shared by both inputs, and $W_0$ is followed by a ReLU activation function; σ denotes the Sigmoid function, and F denotes the feature produced by the convolution operation at the corresponding stage of the attention module.
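As a hedged illustration of the channel attention computation described above, the following plain-Python sketch pools each channel by average and maximum, passes both descriptors through the same two-layer perceptron with a ReLU after the first layer, sums the outputs and applies a Sigmoid. The helper names (`matvec`, `shared_mlp`, `channel_attention`) and the nested-list tensor layout are assumptions made for self-containment; a real network would use a tensor library.

```python
import math


def matvec(w, v):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]


def shared_mlp(v, w0, w1):
    """Two-layer perceptron: W0, then ReLU, then W1."""
    hidden = [max(0.0, h) for h in matvec(w0, v)]
    return matvec(w1, hidden)


def channel_attention(feat, w0, w1):
    """feat: C channels, each an H x W nested list. Returns C weights in (0, 1)."""
    flat = [[x for row in ch for x in row] for ch in feat]
    avg = [sum(ch) / len(ch) for ch in flat]  # average pooling over space
    mx = [max(ch) for ch in flat]             # maximum pooling over space
    a = shared_mlp(avg, w0, w1)               # the same weights serve both inputs
    m = shared_mlp(mx, w0, w1)
    return [1.0 / (1.0 + math.exp(-(ai + mi))) for ai, mi in zip(a, m)]
```

Each returned weight would then scale its channel of the input feature map.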
Unlike the first channel attention unit, the first spatial attention unit mainly focuses on position information. Two different feature descriptions $F^{s}_{avg}$ and $F^{s}_{max}$ are obtained by applying maximum pooling and average pooling along the channel dimension; the two feature descriptions are then merged by concatenation, and a spatial attention map is generated by a convolution operation. The calculation formula of the first spatial attention unit is as follows:
$$M_s(F) = \sigma\big(f^{7\times 7}([F^{s}_{avg}; F^{s}_{max}])\big)$$
wherein: σ denotes the Sigmoid function, and $f^{7\times 7}$ denotes a convolution operation with a kernel size of 7 × 7; the first feature map is then generated from the channel attention map and the spatial attention map.
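The spatial attention computation can be sketched in the same plain-Python style: pool across channels by average and maximum, stack the two maps, convolve them with one k × k kernel per map, and apply a Sigmoid. The function name `spatial_attention`, the zero padding at the borders and the nested-list layout are assumptions for illustration; the text only fixes the 7 × 7 kernel size.

```python
import math


def spatial_attention(feat, kernel):
    """feat: C x H x W nested lists; kernel: 2 x k x k weights with k odd.

    Returns an H x W attention map with values in (0, 1).
    """
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    k = len(kernel[0])
    pad = k // 2
    # Pool along the channel dimension: one average map and one maximum map.
    avg = [[sum(feat[ch][y][x] for ch in range(c)) / c for x in range(w)]
           for y in range(h)]
    mx = [[max(feat[ch][y][x] for ch in range(c)) for x in range(w)]
          for y in range(h)]
    pooled = [avg, mx]  # concatenation of the two descriptions
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            s = 0.0
            for p in range(2):
                for dy in range(k):
                    for dx in range(k):
                        yy, xx = y + dy - pad, x + dx - pad
                        if 0 <= yy < h and 0 <= xx < w:  # assumed zero padding
                            s += kernel[p][dy][dx] * pooled[p][yy][xx]
            row.append(1.0 / (1.0 + math.exp(-s)))
        out.append(row)
    return out
```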
S103, features are extracted from the first feature map by a detector based on the feature pyramid; a dual attention module containing a second spatial attention unit and a second channel attention unit is added to each layer of the feature pyramid from top to bottom to calculate the degree of association between different features and to model the association between channels; the region suggestion network in the last layer then performs the region-of-interest alignment operation on the generated second feature map to fix the size of the features.
Specifically: in the detector stage, a feature pyramid network is first fused into Faster-RCNN to increase the detector's awareness of the whole-image information; the spatial feature pyramid structure is also improved by adding a dual attention module; and finally, the region-of-interest pooling operation that fixes the feature size in the original Faster-RCNN is replaced by the region-of-interest alignment operation, which has pixel-level, higher precision.
The loss function of the detection network comprises a classification loss and a regression loss, and the loss function formula is as follows:
$$L(\{p_i\},\{t_i\}) = \frac{1}{N_{cls}}\sum_i L_{cls}(p_i, p_i^{*}) + \lambda\frac{1}{N_{reg}}\sum_i p_i^{*} L_{reg}(t_i, t_i^{*})$$
wherein: i is the index of the i-th target box; $p_i$ is the probability that the anchor box is a target; $p_i^{*}$ is 1 when the anchor box is a target and 0 otherwise; $t_i$ is the location coordinates of the prediction box; $t_i^{*}$ is the coordinates of the real label; $L_{cls}$ and $L_{reg}$ are the classification loss and the regression loss, $N_{cls}$ and $N_{reg}$ are their normalization terms, and λ is the weight balancing the two losses;
The bottom-up part of the feature pyramid consists of the features obtained by the backbone network. The operation adopted is as follows: a 1 × 1 dimensionality-reduction convolution is applied to the 2nd bottom-up layer, and the result is added to the upsampled 3rd bottom-up layer to obtain the 2nd top-down layer; the same applies to the next top-down layer. The region suggestion network is then applied to the resulting top-down part to obtain suggestions for the regions to be detected.
The feature pyramid part with the dual attention module works as follows: features of the object to be detected are extracted on feature maps of different scales, and adding the dual attention mechanism to each top-down layer of the feature pyramid yields feature maps with higher precision and richer information. The dual attention module introduces the self-attention mechanism into the spatial dimension and the channel dimension of the features respectively, namely as a second position attention unit and a second channel attention unit, so that the global dependencies of the features are captured effectively.
The second position attention unit mutually enhances the expression of the respective features by using the association between any two features. Specifically, a correlation strength matrix between any two point features is first calculated: the original feature $A_j$ is reduced in dimension by convolution to obtain feature $B_i$, feature $C_j$ and feature $D_i$; the dimensions of $B_i$ and $C_j$ are then reshaped, and the correlation strength matrix between any two point features is obtained from their matrix product. The feature $S_{ji}$ of each position with respect to the other positions is then obtained through normalization by the softmax operation, wherein the more similar two point features are, the larger the response value $S_{ji}$. The response value $S_{ji}$ is then used as a weight to fuse the feature D by weighting, so that each position is fused with similar features across the global space of the feature map. The calculation formula of the second position attention unit is as follows:
$$S_{ji} = \frac{\exp(B_i \cdot C_j)}{\sum_{i}\exp(B_i \cdot C_j)}, \qquad E_{j1} = \sum_{i} S_{ji} D_i + A_j$$
wherein $A_j$ denotes the feature corresponding to a given position; $B_i$, $C_j$ and $D_i$ denote three new feature maps generated by feeding $A_j$ into the convolutional layers; $S_{ji}$ denotes the spatial feature map obtained by reshaping $B_i$ and $C_j$, multiplying them as matrices and passing the product through a softmax layer; and $E_{j1}$ denotes the position feature map finally output by the second position attention unit.
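The position attention described above can be sketched as follows, under stated assumptions: the features are given as lists of per-position vectors, the reduced features B, C and D are supplied by the caller rather than produced by convolutions, and no learnable scale factor is applied before the residual addition. The name `position_attention` is hypothetical.

```python
import math


def softmax(v):
    """Numerically stable softmax over a list of scores."""
    m = max(v)
    e = [math.exp(x - m) for x in v]
    total = sum(e)
    return [x / total for x in e]


def position_attention(a, b, c, d):
    """a, d: N positions x C channels; b, c: N positions x C' reduced channels.

    For each position j, weights every position i by softmax(B_i . C_j),
    sums the weighted D_i, and adds the original feature A_j.
    """
    n = len(a)
    out = []
    for j in range(n):
        scores = [sum(bi * cj for bi, cj in zip(b[i], c[j])) for i in range(n)]
        s_j = softmax(scores)  # S_ji for fixed j, normalized over i
        fused = [sum(s_j[i] * d[i][ch] for i in range(n))
                 for ch in range(len(d[0]))]
        out.append([f + x for f, x in zip(fused, a[j])])
    return out
```

When all scores are equal the attention is uniform, so every position receives the mean of D plus its own original feature.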
The second spatial attention unit enhances the specific semantic response capability of the channels by modeling the association between the channels. The process is similar to that of the position attention module, except that, when the feature attention map X is obtained, dimension transformation and matrix multiplication are carried out on any two channel features to obtain the correlation strength of any two channels, and the feature map between the channels is then obtained through the softmax operation. Finally, fusion is carried out by weighting with the attention map between the channels, so that global association is generated among all the channels and features with a stronger semantic response are obtained. The calculation formula of the channel attention module is as follows:
$$x_{ji} = \frac{\exp(A_i \cdot A_j)}{\sum_{i}\exp(A_i \cdot A_j)}, \qquad E_{j2} = \sum_{i} x_{ji} A_i + A_j$$
wherein $A_j$ denotes the feature corresponding to a given position; $x_{ji}$ denotes the channel feature map obtained by multiplying $A_j$ with its transpose $A_i$ and passing the product through the softmax layer; and $E_{j2}$ denotes the spatial feature map finally output by the second spatial attention unit.
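The inter-channel attention can be sketched in the same way. Here each channel is flattened into a vector over positions, the correlation of two channels is their dot product, and the softmax-normalized correlations weight the channels before the original channel is added back. The name `channel_self_attention` and the omission of any learnable scale factor are assumptions of this sketch.

```python
import math


def softmax(v):
    """Numerically stable softmax over a list of scores."""
    m = max(v)
    e = [math.exp(x - m) for x in v]
    total = sum(e)
    return [x / total for x in e]


def channel_self_attention(a):
    """a: C channels, each flattened to a list of N position values.

    For each channel j, weights every channel i by softmax(A_i . A_j),
    sums the weighted channels, and adds the original channel A_j.
    """
    c = len(a)
    out = []
    for j in range(c):
        scores = [sum(p * q for p, q in zip(a[i], a[j])) for i in range(c)]
        x_j = softmax(scores)  # x_ji for fixed j, normalized over i
        fused = [sum(x_j[i] * a[i][n] for i in range(c))
                 for n in range(len(a[0]))]
        out.append([f + v for f, v in zip(fused, a[j])])
    return out
```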
In the target detection algorithm, candidate region boxes for the result to be detected are obtained from the region suggestion network, and candidate regions of different sizes are then mapped onto a feature map of fixed size using region-of-interest pooling. However, region-of-interest pooling has two obvious disadvantages: an error may be introduced when the candidate box boundaries are quantized to integer coordinates, and a further error may be introduced when floating-point numbers are rounded during pooling. The accumulated error can shift the coordinate position of the candidate box and degrade the detection result. Because our data set targets cars in unmanned aerial vehicle aerial images, the targets to be detected occupy an extremely small proportion of the picture; we therefore replace region-of-interest pooling with the region-of-interest alignment operation, which has pixel-level, higher precision. The quantization operation is removed, and the image value at a pixel whose coordinates are floating-point numbers is obtained by bilinear interpolation, so that the whole feature aggregation process becomes a continuous operation.
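The bilinear interpolation step that replaces coordinate quantization can be illustrated with a short sketch. The function name `bilinear_sample` and the clamping of out-of-range coordinates to the map border are assumptions; a full region-of-interest alignment would average several such samples per output bin.

```python
import math


def bilinear_sample(fmap, y, x):
    """Value of a 2-D feature map at a floating-point location (y, x)."""
    h, w = len(fmap), len(fmap[0])
    y = min(max(y, 0.0), h - 1.0)  # assumed: clamp to the map border
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    # Interpolate along x on the two neighboring rows, then along y.
    top = fmap[y0][x0] * (1 - wx) + fmap[y0][x1] * wx
    bottom = fmap[y1][x0] * (1 - wx) + fmap[y1][x1] * wx
    return top * (1 - wy) + bottom * wy
```

Because the sampled value varies continuously with (y, x), no rounding of the candidate box coordinates is needed.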
S104, after the region-of-interest alignment operation has been performed on the second feature map and features of fixed size have been obtained, two 1024-dimensional fully connected layers are attached and then split into two branches, on which the target category analysis module and the target frame regression module are respectively established, so that the regions of interest at the different scales of the feature pyramid are classified and their target frames predicted.
S105, multi-scale image testing is adopted in the testing process: both the original images of the test set and the original images enlarged 1.5 times are used, the images of the two scales are processed by blocking and input separately into the deep network for testing to obtain the detection results at each scale, and the detection results of the two scales are combined by the global non-maximum suppression fusion algorithm to improve detection accuracy.
The global integrated non-maximum suppression algorithm proceeds as follows:
step1, globally align the coordinates of the prediction boxes of the sub-blocks at each scale;
step2, compute the weighted confidence of each detection box and sort the boxes by it;
step3, select the bounding box with the highest confidence, add it to the final output list, and delete it from the bounding box list;
step4, calculate the areas of all bounding boxes;
step5, calculate the IOU between the bounding box with the highest confidence and each of the other candidate boxes;
step6, delete the bounding boxes whose IOU is larger than the threshold;
step7, repeat the above process until the bounding box list is empty.
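The seven steps can be sketched in NumPy as follows. The tile offsets, the per-box confidence weights and the IOU threshold of 0.5 are hypothetical parameters chosen for illustration; the text does not give their values.

```python
import numpy as np

def to_global(boxes, offset, scale):
    """Step 1: map tile-local boxes (x1, y1, x2, y2) to original-image coordinates.
    offset is the tile origin in the scaled image; scale is 1.0 or 1.5."""
    boxes = np.asarray(boxes, dtype=np.float64).reshape(-1, 4)
    ox, oy = offset
    return (boxes + np.array([ox, oy, ox, oy])) / scale

def global_nms(boxes, scores, weights=None, iou_thr=0.5):
    """Steps 2 to 7: returns the indices of the boxes that are kept."""
    boxes = np.asarray(boxes, dtype=np.float64).reshape(-1, 4)
    scores = np.asarray(scores, dtype=np.float64)
    if weights is not None:                      # step 2: weighted confidence
        scores = scores * np.asarray(weights, dtype=np.float64)
    order = np.argsort(-scores)                  # step 2: sort, best first
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])   # step 4
    keep = []
    while order.size > 0:                        # step 7: loop until the list is empty
        best, rest = order[0], order[1:]
        keep.append(int(best))                   # step 3: move the best box to the output
        xx1 = np.maximum(boxes[best, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[best, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[best, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[best, 3], boxes[rest, 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        union = np.maximum(areas[best] + areas[rest] - inter, 1e-12)
        iou = inter / union                      # step 5
        order = rest[iou <= iou_thr]             # step 6: drop heavily overlapping boxes
    return keep
```

Mapping every box back to the original image before suppression lets duplicate detections of the same car, found in overlapping tiles or at both scales, compete with each other in a single pass.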
Comparison experiments were carried out on the invention. The data set used in the experiments is the UAV aerial car data set of the 'Behcet' information fusion challenge, and the hyper-parameters are set as follows: the maximum number of training epochs is 12 and the batch size is 1. The learning rate is set with a warm-up strategy (initial value 0.3333, adjusted gradually to 0.00025 over the first 500 iterations) and is then reduced at the 8th and 11th epochs.
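One way to read this schedule is sketched below. It rests on several assumptions that the text does not confirm: 0.3333 is treated as a warm-up ratio (the starting fraction of the base rate), 0.00025 as the base learning rate reached after 500 iterations, the warm-up as linear, the decay factor as 0.1, and `epoch` as the number of completed epochs.

```python
def learning_rate(iteration, epoch, base_lr=0.00025, warmup_iters=500,
                  warmup_ratio=0.3333, decay_epochs=(8, 11), gamma=0.1):
    """Learning rate at a global iteration index and epoch count (all defaults assumed)."""
    # Step decay: multiply by gamma once for every decay epoch already reached.
    lr = base_lr * gamma ** sum(epoch >= e for e in decay_epochs)
    # Linear warm-up from warmup_ratio * lr to lr over the first warmup_iters iterations.
    if iteration < warmup_iters:
        alpha = iteration / warmup_iters
        lr *= warmup_ratio * (1 - alpha) + alpha
    return lr
```

Under these assumptions the rate rises from about one third of 0.00025 to the full value during warm-up and is then divided by ten at each of the two decay points.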
The experiments were evaluated with two kinds of analysis, quantitative and visual:
For the quantitative comparison, precision, recall and the F1 score are used to judge detection accuracy; the F1 score is computed from precision and recall and measures the overall detection accuracy of the algorithm. Precision, recall and the F1 score are calculated as follows:
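$$\text{precision} = \frac{\text{true positives}}{\text{true positives} + \text{false positives}}$$
$$\text{recall} = \frac{\text{true positives}}{\text{true positives} + \text{false negatives}}$$
$$F1 = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}}$$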
where true positives are targets that should be detected and are correctly detected, false positives are detections of objects that should not have been detected, and false negatives are targets that should have been detected but were missed.
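Given the three counts defined above, the scores can be computed as in the short sketch below. The function name and the convention of returning 0.0 when a denominator is zero are assumptions made for illustration.

```python
def detection_scores(tp, fp, fn):
    """Precision, recall and F1 from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For example, 80 correct detections with 20 false alarms and 20 missed cars give a precision and a recall of 0.8 each, and therefore an F1 score of 0.8.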
For the visual comparison, the models trained with the different detection algorithms are run on the same test picture, the detections are rendered with purpose-written visualization code, and the detection results of the different models on that picture are then compared by eye.
When applied to UAV aerial images, conventional target detection algorithms suffer from shortcomings such as low detection accuracy and poor results. The invention uses deep learning and attention mechanisms to build a multi-scale UAV aerial target detection network driven by dual attention on a feature pyramid. During feature extraction, the attention mechanism is integrated into the spatial pyramid, so that richer and more effective information is extracted and then passed to the region proposal network for classification and regression.
The above is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to this embodiment; all technical solutions that follow the idea of the present invention fall within its protection scope. It should be noted that modifications and refinements made by those skilled in the art without departing from the principle of the invention also fall within the protection scope of the invention.

Claims (7)

1. An aerial image multi-scale target detection method based on spatial pyramid attention drive, characterized by comprising the following steps:
S101: collecting a set of UAV aerial images and dividing them into blocks to obtain a large number of small image tiles of uniform size;
S102: inputting the image tiles into a residual network and extracting features through a convolutional attention module in the residual network, wherein the convolutional attention module comprises a first channel attention unit and a first spatial attention unit; a channel attention map is computed by the first channel attention unit, a spatial attention map is computed by the first spatial attention unit, and a first feature map is generated by combining the channel attention map and the spatial attention map;
S103: extracting features from the first feature map with a feature-pyramid-based detector, adding to each layer of the feature pyramid, from top to bottom, a dual attention module containing a second spatial attention unit and a second channel attention unit, fusing the feature maps generated by the two attention units to obtain a second feature map, and, in the last layer, performing the region-of-interest alignment operation on the second feature map with the region proposal network to fix the size of the features;
S104: for the second feature map after region-of-interest alignment, establishing a target category classification module and a target box regression module, and performing classification and target box prediction on the regions of interest at different scales;
S105: performing multi-scale image testing with the original image and the image enlarged to 1.5 times the original size, inputting the images of the two scales into the deep network separately for testing, and fusing the results of the different scales through the global integrated non-maximum suppression algorithm, thereby improving detection accuracy.
2. The aerial image multi-scale target detection method based on spatial pyramid attention drive according to claim 1, characterized in that step S101 specifically comprises:
dividing the image into blocks with a sliding window of 1000 × 1000 pixels and an overlap rate of 0.25; keeping the coordinate information of the manual annotation box of each vehicle whose IOU is larger than 0.7; and, for all vehicles in a tile, converting the manually annotated bounding boxes into the coordinates of that tile.
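The blocking step can be sketched as follows. The window size, the overlap rate and the 0.7 threshold come from the claim; the rest is assumed for illustration. In particular, the "IOU" of an annotation is read here as the fraction of the annotation box that lies inside the tile, the last tile is shifted back so that it ends at the image border, and kept boxes are clipped to the tile.

```python
def tile_origins(length, tile=1000, overlap=0.25):
    """Top-left coordinates of the sliding windows along one image axis."""
    stride = int(tile * (1 - overlap))           # 750 pixels for the claimed values
    if length <= tile:
        return [0]
    origins = list(range(0, length - tile + 1, stride))
    if origins[-1] != length - tile:             # assumed: last window touches the border
        origins.append(length - tile)
    return origins

def tile_annotations(boxes, x0, y0, tile=1000, keep_ratio=0.7):
    """Keep boxes (x1, y1, x2, y2) mostly inside the tile at (x0, y0) and
    convert them to tile-local coordinates."""
    kept = []
    for x1, y1, x2, y2 in boxes:
        ix1, iy1 = max(x1, x0), max(y1, y0)
        ix2, iy2 = min(x2, x0 + tile), min(y2, y0 + tile)
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        area = (x2 - x1) * (y2 - y1)
        if area > 0 and inter / area > keep_ratio:   # assumed reading of the IOU test
            kept.append((ix1 - x0, iy1 - y0, ix2 - x0, iy2 - y0))
    return kept
```

With an overlap of 0.25, a car cut by the edge of one tile usually appears whole in the neighbouring tile, so discarding the heavily truncated copy loses little.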
3. The aerial image multi-scale target detection method based on spatial pyramid attention drive according to claim 2, characterized in that step S102 specifically comprises:
inputting the picture into a residual network in which a convolutional attention module is embedded, wherein the first channel attention unit compresses the picture in the spatial dimension using max pooling and average pooling to obtain two different spatial contexts (Figure FDA0002406283740000011 and Figure FDA0002406283740000012); the spatial contexts (Figure FDA0002406283740000013 and Figure FDA0002406283740000014) are passed through the residual network and a channel attention map is calculated, the calculation formula of the first channel attention unit being:
Figure FDA0002406283740000015
wherein: W1 and W0 represent the weights of a multi-layer perceptron, the two weights are shared by the inputs, and W0 is followed by a ReLU activation function; σ represents the Sigmoid function, and F represents the convolution operation corresponding to the stage in the attention mechanism;
wherein the first spatial attention unit obtains two different feature descriptions (Figure FDA0002406283740000021 and Figure FDA0002406283740000022) along the channel dimension by max pooling and average pooling, and generates a spatial attention map by a convolution calculation, the calculation formula of the first spatial attention unit being:
Figure FDA0002406283740000023
wherein: σ denotes the Sigmoid function, and f^{7×7} denotes a convolution with a kernel size of 7 × 7;
a first feature map is then generated from the channel attention map and the spatial attention map.
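The two attention units of this claim can be sketched in NumPy as below. The sketch follows the widely used convolutional block attention design, which matches the description (max and average pooling, a shared two-layer perceptron with ReLU, a Sigmoid, a 7 × 7 convolution). Since the claimed formulas are only available as images, the exact form, the sequential combination of the two maps and all parameter shapes are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(F, W0, W1):
    """F: (C, H, W); W0: (C // r, C) and W1: (C, C // r) are shared perceptron weights."""
    avg = F.mean(axis=(1, 2))                    # average pooling over space
    mx = F.max(axis=(1, 2))                      # max pooling over space
    def mlp(v):
        return W1 @ np.maximum(W0 @ v, 0.0)      # ReLU follows W0
    return sigmoid(mlp(avg) + mlp(mx))           # (C,) channel attention map

def spatial_attention(F, kernel):
    """F: (C, H, W); kernel: (2, 7, 7) convolution weights."""
    desc = np.stack([F.mean(axis=0), F.max(axis=0)])   # pooling along the channel axis
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(desc, ((0, 0), (pad, pad), (pad, pad)))
    H, W = F.shape[1:]
    out = np.zeros((H, W))
    for i in range(H):                           # plain 7 x 7 convolution, zero padded
        for j in range(W):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)                          # (H, W) spatial attention map

def first_feature_map(F, W0, W1, kernel):
    """Assumed combination: channel attention first, then spatial attention."""
    F1 = F * channel_attention(F, W0, W1)[:, None, None]
    return F1 * spatial_attention(F1, kernel)[None, :, :]
```

Both maps take values between 0 and 1, so they rescale the features without changing their shape.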
4. The aerial image multi-scale target detection method based on spatial pyramid attention drive according to claim 3, characterized in that step S103 specifically comprises:
extracting features from the first feature map with a feature-pyramid-based detector, and adding to each layer of the feature pyramid, from top to bottom, a dual attention mechanism comprising a second position attention unit and a second spatial attention unit;
calculating, through the second position attention unit, a correlation strength matrix between the features of any two points: the original feature A_j is reduced in dimension by convolution to obtain feature B_i, feature C_j and feature D_i; the dimensions of B_i and C_j are then changed, and the correlation strength matrix between the features of any two points is obtained from their matrix product; the softmax function is used to calculate the attention S_ji of each position with respect to the other positions; the attention S_ji and the feature D_i are then multiplied and fused; and finally the result is added to the original feature A_j to obtain the position feature map finally output by the position attention unit, the calculation formula of the second position attention unit being:
Figure FDA0002406283740000024
wherein A_j represents the feature corresponding to a given position; B_i, C_j and D_i represent the three new features generated from A_j by convolutional dimensionality reduction; S_ji represents the position attention map obtained by reshaping B_i and C_j, multiplying them as matrices and passing the result through a softmax layer; and E_j1 represents the position feature map finally output by the second position attention unit;
performing, through the second spatial attention unit, dimension transformation and matrix multiplication on the features of any two channels to obtain the correlation strength of any two channels, then calculating the attention map between the channels, and finally performing fusion by weighting with the attention maps between the channels, so that a global correlation is produced between the channels and features with a stronger semantic response are obtained, the calculation formula of the second spatial attention unit being:
Figure FDA0002406283740000031
wherein A_j represents the feature corresponding to a given position; x_ji represents the channel attention map obtained by multiplying A_j by its transpose A_i and passing the result through a softmax layer; and E_i2 represents the spatial feature map finally output by the second spatial attention unit;
finally, feature fusion is performed on the position feature map and the spatial feature map to obtain the final second feature map, and, in the last layer, the region proposal network performs the region-of-interest alignment operation on the obtained second feature map to fix the size of the features.
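The two units of this claim can be sketched in NumPy as below. The claimed formulas are only available as images, so the sketch follows the verbal description: the 1 × 1 convolutions are modelled as plain matrices, no learnable scale factor is applied to the attended features, and the two outputs are fused by element-wise addition. All three choices are assumptions.

```python
import numpy as np

def softmax(z, axis):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def position_attention(A, Wb, Wc, Wd):
    """A: (C, H, W); Wb, Wc: (C', C) and Wd: (C, C) stand in for 1 x 1 convolutions."""
    C, H, W = A.shape
    flat = A.reshape(C, H * W)                   # one column per position
    B, Cm, D = Wb @ flat, Wc @ flat, Wd @ flat
    energy = B.T @ Cm                            # entry [i, j] is B_i . C_j
    S = softmax(energy, axis=0)                  # S[i, j] = s_ji, normalised over i
    E = D @ S + flat                             # E_j = sum_i s_ji * D_i + A_j
    return E.reshape(C, H, W)

def channel_attention(A):
    """Correlation between channels, computed from the features themselves."""
    C, H, W = A.shape
    flat = A.reshape(C, H * W)                   # one row per channel
    X = softmax(flat @ flat.T, axis=0)           # X[i, j] = x_ji, normalised over i
    E = X.T @ flat + flat                        # E_j = sum_i x_ji * A_i + A_j
    return E.reshape(C, H, W)

def second_feature_map(A, Wb, Wc, Wd):
    """Assumed fusion: element-wise sum of the two attention outputs."""
    return position_attention(A, Wb, Wc, Wd) + channel_attention(A)
```

The position unit lets every location gather information from all other locations, while the channel unit does the same across feature channels, which is why the two outputs complement each other when fused.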
5. The aerial image multi-scale target detection method based on spatial pyramid attention drive according to claim 4, characterized in that step S104 specifically comprises:
after the region-of-interest alignment operation has been performed on the second feature map and features of fixed size have been obtained, attaching two fully connected layers of 1024 units each, splitting into two branches to establish a target category classification module and a target box regression module respectively, and classifying the regions of interest at the different scales of the feature pyramid and predicting the target boxes.
6. The aerial image multi-scale target detection method based on spatial pyramid attention drive according to claim 5, characterized in that step S105 specifically comprises:
adopting multi-scale image testing in the test phase, wherein, in addition to the original images of the test set, images enlarged to 1.5 times the original size are used; dividing the images of the two scales into blocks and inputting them into the deep network separately for testing to obtain the detection results at each scale; and fusing the detection results of the two scales with the global integrated non-maximum suppression algorithm, thereby improving detection accuracy.
7. The aerial image multi-scale target detection method based on spatial pyramid attention drive according to claim 6, characterized in that the global integrated non-maximum suppression algorithm proceeds as follows:
step1, globally align the coordinates of the prediction boxes of the sub-blocks at each scale;
step2, compute the weighted confidence of each detection box and sort the boxes by it;
step3, select the bounding box with the highest confidence, add it to the final output list, and delete it from the bounding box list;
step4, calculate the areas of all bounding boxes;
step5, calculate the IOU between the bounding box with the highest confidence and each of the other candidate boxes;
step6, delete the bounding boxes whose IOU is larger than the threshold;
step7, repeat the above process until the bounding box list is empty.
CN202010164167.7A 2020-03-10 2020-03-10 Aerial image multi-scale target detection method based on spatial pyramid attention drive Active CN111401201B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010164167.7A CN111401201B (en) 2020-03-10 2020-03-10 Aerial image multi-scale target detection method based on spatial pyramid attention drive

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010164167.7A CN111401201B (en) 2020-03-10 2020-03-10 Aerial image multi-scale target detection method based on spatial pyramid attention drive

Publications (2)

Publication Number Publication Date
CN111401201A true CN111401201A (en) 2020-07-10
CN111401201B CN111401201B (en) 2023-06-20

Family

ID=71432330

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010164167.7A Active CN111401201B (en) 2020-03-10 2020-03-10 Aerial image multi-scale target detection method based on spatial pyramid attention drive

Country Status (1)

Country Link
CN (1) CN111401201B (en)

Cited By (65)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111814726A (en) * 2020-07-20 2020-10-23 南京工程学院 Detection method for visual target of detection robot
CN111814704A (en) * 2020-07-14 2020-10-23 陕西师范大学 Full convolution examination room target detection method based on cascade attention and point supervision mechanism
CN111860683A (en) * 2020-07-30 2020-10-30 中国人民解放军国防科技大学 Target detection method based on feature fusion
CN111882002A (en) * 2020-08-06 2020-11-03 桂林电子科技大学 MSF-AM-based low-illumination target detection method
CN111914795A (en) * 2020-08-17 2020-11-10 四川大学 Method for detecting rotating target in aerial image
CN111914917A (en) * 2020-07-22 2020-11-10 西安建筑科技大学 Target detection improved algorithm based on feature pyramid network and attention mechanism
CN111985552A (en) * 2020-08-17 2020-11-24 中国民航大学 Method for detecting diseases of thin strip-shaped structure of airport pavement under complex background
CN112016569A (en) * 2020-07-24 2020-12-01 驭势科技(南京)有限公司 Target detection method, network, device and storage medium based on attention mechanism
CN112037237A (en) * 2020-09-01 2020-12-04 腾讯科技(深圳)有限公司 Image processing method, image processing device, computer equipment and medium
CN112101366A (en) * 2020-09-11 2020-12-18 湖南大学 Real-time segmentation system and method based on hybrid expansion network
CN112101113A (en) * 2020-08-14 2020-12-18 北京航空航天大学 Lightweight unmanned aerial vehicle image small target detection method
CN112101189A (en) * 2020-09-11 2020-12-18 北京航空航天大学 SAR image target detection method and test platform based on attention mechanism
CN112132216A (en) * 2020-09-22 2020-12-25 平安国际智慧城市科技股份有限公司 Vehicle type recognition method and device, electronic equipment and storage medium
CN112131925A (en) * 2020-07-22 2020-12-25 浙江元亨通信技术股份有限公司 Construction method of multi-channel characteristic space pyramid
CN112163447A (en) * 2020-08-18 2021-01-01 桂林电子科技大学 Multi-task real-time gesture detection and recognition method based on Attention and Squeezenet
CN112163580A (en) * 2020-10-12 2021-01-01 中国石油大学(华东) Small target detection algorithm based on attention mechanism
CN112183269A (en) * 2020-09-18 2021-01-05 哈尔滨工业大学(深圳) Target detection method and system suitable for intelligent video monitoring
CN112233071A (en) * 2020-09-28 2021-01-15 国网浙江省电力有限公司杭州供电公司 Multi-granularity hidden danger detection method and system based on power transmission network picture in complex environment
CN112307984A (en) * 2020-11-02 2021-02-02 安徽工业大学 Safety helmet detection method and device based on neural network
CN112365480A (en) * 2020-11-13 2021-02-12 哈尔滨市科佳通用机电股份有限公司 Brake pad loss fault identification method for brake clamp device
CN112396035A (en) * 2020-12-07 2021-02-23 国网电子商务有限公司 Object detection method and device based on attention detection model
CN112464851A (en) * 2020-12-08 2021-03-09 国网陕西省电力公司电力科学研究院 Smart power grid foreign matter intrusion detection method and system based on visual perception
CN112464910A (en) * 2020-12-18 2021-03-09 杭州电子科技大学 Traffic sign identification method based on YOLO v4-tiny
CN112465880A (en) * 2020-11-26 2021-03-09 西安电子科技大学 Target detection method based on multi-source heterogeneous data cognitive fusion
CN112528786A (en) * 2020-11-30 2021-03-19 北京百度网讯科技有限公司 Vehicle tracking method and device and electronic equipment
CN112561876A (en) * 2020-12-14 2021-03-26 中南大学 Image-based pond and reservoir water quality detection method and system
CN112633158A (en) * 2020-12-22 2021-04-09 广东电网有限责任公司电力科学研究院 Power transmission line corridor vehicle identification method, device, equipment and storage medium
CN112651371A (en) * 2020-12-31 2021-04-13 广东电网有限责任公司电力科学研究院 Dressing security detection method and device, storage medium and computer equipment
CN112651326A (en) * 2020-12-22 2021-04-13 济南大学 Driver hand detection method and system based on deep learning
CN112733691A (en) * 2021-01-04 2021-04-30 北京工业大学 Multi-direction unmanned aerial vehicle aerial photography vehicle detection method based on attention mechanism
CN112883907A (en) * 2021-03-16 2021-06-01 云南师范大学 Landslide detection method and device for small-volume model
CN112907972A (en) * 2021-04-06 2021-06-04 昭通亮风台信息科技有限公司 Road vehicle flow detection method and system based on unmanned aerial vehicle and computer readable storage medium
CN112926480A (en) * 2021-03-05 2021-06-08 山东大学 Multi-scale and multi-orientation-oriented aerial object detection method and system
CN113192058A (en) * 2021-05-21 2021-07-30 中国矿业大学(北京) Intelligent brick pile loading system based on computer vision and loading method thereof
CN113255759A (en) * 2021-05-20 2021-08-13 广州广电运通金融电子股份有限公司 Attention mechanism-based in-target feature detection system, method and storage medium
CN113343755A (en) * 2021-04-22 2021-09-03 山东师范大学 System and method for classifying red blood cells in red blood cell image
CN113345082A (en) * 2021-06-24 2021-09-03 云南大学 Characteristic pyramid multi-view three-dimensional reconstruction method and system
CN113420729A (en) * 2021-08-23 2021-09-21 城云科技(中国)有限公司 Multi-scale target detection method, model, electronic equipment and application thereof
CN113469942A (en) * 2021-06-01 2021-10-01 天津大学 CT image lesion detection method
CN113486930A (en) * 2021-06-18 2021-10-08 陕西大智慧医疗科技股份有限公司 Small intestinal lymphoma segmentation model establishing and segmenting method and device based on improved RetinaNet
CN113537119A (en) * 2021-07-28 2021-10-22 国网河南省电力公司电力科学研究院 Transmission line connecting part detection method based on improved Yolov4-tiny
CN113538331A (en) * 2021-05-13 2021-10-22 中国地质大学(武汉) Metal surface damage target detection and identification method, device, equipment and storage medium
CN113567984A (en) * 2021-07-30 2021-10-29 长沙理工大学 Method and system for detecting artificial small target in SAR image
CN113591748A (en) * 2021-08-06 2021-11-02 广东电网有限责任公司 Aerial photography insulator sub-target detection method and device
CN113591859A (en) * 2021-06-23 2021-11-02 北京旷视科技有限公司 Image segmentation method, apparatus, device and medium
CN113628179A (en) * 2021-07-30 2021-11-09 厦门大学 PCB surface defect real-time detection method and device and readable medium
CN113743521A (en) * 2021-09-10 2021-12-03 中国科学院软件研究所 Target detection method based on multi-scale context sensing
CN113762251A (en) * 2021-08-17 2021-12-07 慧影医疗科技(北京)有限公司 Target classification method and system based on attention mechanism
CN113822871A (en) * 2021-09-29 2021-12-21 平安医疗健康管理股份有限公司 Target detection method and device based on dynamic detection head, storage medium and equipment
CN114038067A (en) * 2022-01-07 2022-02-11 深圳市海清视讯科技有限公司 Coal mine personnel behavior detection method, equipment and storage medium
CN114140683A (en) * 2020-08-12 2022-03-04 天津大学 Aerial image target detection method, equipment and medium
CN114155475A (en) * 2022-01-24 2022-03-08 杭州晨鹰军泰科技有限公司 Method, device and medium for recognizing end-to-end personnel actions under view angle of unmanned aerial vehicle
CN114241003A (en) * 2021-12-14 2022-03-25 成都阿普奇科技股份有限公司 All-weather lightweight high-real-time sea surface ship detection and tracking method
CN114529825A (en) * 2022-04-24 2022-05-24 城云科技(中国)有限公司 Target detection model, method and application for fire fighting channel occupation target detection
CN114549413A (en) * 2022-01-19 2022-05-27 华东师范大学 Multi-scale fusion full convolution network lymph node metastasis detection method based on CT image
CN114648736A (en) * 2022-05-18 2022-06-21 武汉大学 Robust engineering vehicle identification method and system based on target detection
CN114821374A (en) * 2022-06-27 2022-07-29 中国电子科技集团公司第二十八研究所 Knowledge and data collaborative driving unmanned aerial vehicle aerial photography target detection method
CN114972860A (en) * 2022-05-23 2022-08-30 郑州轻工业大学 Target detection method based on attention-enhanced bidirectional feature pyramid network
CN115100545A (en) * 2022-08-29 2022-09-23 东南大学 Target detection method for small parts of failed satellite under low illumination
CN115147375A (en) * 2022-07-04 2022-10-04 河海大学 Concrete surface defect characteristic detection method based on multi-scale attention
CN115424230A (en) * 2022-09-23 2022-12-02 哈尔滨市科佳通用机电股份有限公司 Fault detection method for vehicle door pulley out-of-track, storage medium and equipment
CN116468730A (en) * 2023-06-20 2023-07-21 齐鲁工业大学(山东省科学院) Aerial insulator image defect detection method based on YOLOv5 algorithm
CN117474861A (en) * 2023-10-31 2024-01-30 东北石油大学 Surface mounting special-shaped element parameter extraction method and system based on improved RetinaNet and Canny-Franklin moment sub-pixels
CN117671473A (en) * 2024-02-01 2024-03-08 中国海洋大学 Underwater target detection model and method based on attention and multi-scale feature fusion
CN112131925B (en) * 2020-07-22 2024-06-07 随锐科技集团股份有限公司 Construction method of multichannel feature space pyramid

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110084210A (en) * 2019-04-30 2019-08-02 电子科技大学 The multiple dimensioned Ship Detection of SAR image based on attention pyramid network
CN110110751A (en) * 2019-03-31 2019-08-09 华南理工大学 A kind of Chinese herbal medicine recognition methods of the pyramid network based on attention mechanism
CN110378242A (en) * 2019-06-26 2019-10-25 南京信息工程大学 A kind of remote sensing target detection method of dual attention mechanism
CN110533084A (en) * 2019-08-12 2019-12-03 长安大学 A kind of multiscale target detection method based on from attention mechanism
CN110533045A (en) * 2019-07-31 2019-12-03 中国民航大学 A kind of luggage X-ray contraband image, semantic dividing method of combination attention mechanism
CN110532955A (en) * 2019-08-30 2019-12-03 中国科学院宁波材料技术与工程研究所 Example dividing method and device based on feature attention and son up-sampling
CN110705457A (en) * 2019-09-29 2020-01-17 核工业北京地质研究院 Remote sensing image building change detection method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
李新叶等: "基于深度学习的图像语义分割研究进展", 《科学技术与工程》 *
沈文祥等: "基于多级特征和混合注意力机制的室内人群检测网络", 《计算机应用》 *

Cited By (96)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111814704A (en) * 2020-07-14 2020-10-23 陕西师范大学 Full convolution examination room target detection method based on cascade attention and point supervision mechanism
CN111814726B (en) * 2020-07-20 2023-09-22 南京工程学院 Detection method for visual target of detection robot
CN111814726A (en) * 2020-07-20 2020-10-23 南京工程学院 Detection method for visual target of detection robot
CN111914917A (en) * 2020-07-22 2020-11-10 西安建筑科技大学 Target detection improved algorithm based on feature pyramid network and attention mechanism
CN112131925B (en) * 2020-07-22 2024-06-07 随锐科技集团股份有限公司 Construction method of multichannel feature space pyramid
CN112131925A (en) * 2020-07-22 2020-12-25 浙江元亨通信技术股份有限公司 Construction method of multi-channel characteristic space pyramid
CN112016569A (en) * 2020-07-24 2020-12-01 驭势科技(南京)有限公司 Target detection method, network, device and storage medium based on attention mechanism
CN111860683A (en) * 2020-07-30 2020-10-30 中国人民解放军国防科技大学 Target detection method based on feature fusion
CN111860683B (en) * 2020-07-30 2021-04-27 中国人民解放军国防科技大学 Target detection method based on feature fusion
CN111882002A (en) * 2020-08-06 2020-11-03 桂林电子科技大学 MSF-AM-based low-illumination target detection method
CN111882002B (en) * 2020-08-06 2022-05-24 桂林电子科技大学 MSF-AM-based low-illumination target detection method
CN114140683A (en) * 2020-08-12 2022-03-04 天津大学 Aerial image target detection method, equipment and medium
CN112101113A (en) * 2020-08-14 2020-12-18 北京航空航天大学 Lightweight unmanned aerial vehicle image small target detection method
CN112101113B (en) * 2020-08-14 2022-05-27 北京航空航天大学 Lightweight unmanned aerial vehicle image small target detection method
CN111985552A (en) * 2020-08-17 2020-11-24 中国民航大学 Method for detecting diseases of thin strip-shaped structure of airport pavement under complex background
CN111914795A (en) * 2020-08-17 2020-11-10 四川大学 Method for detecting rotating target in aerial image
CN111985552B (en) * 2020-08-17 2022-07-29 中国民航大学 Method for detecting diseases of thin strip-shaped structure of airport pavement under complex background
CN111914795B (en) * 2020-08-17 2022-05-27 四川大学 Method for detecting rotating target in aerial image
CN112163447A (en) * 2020-08-18 2021-01-01 桂林电子科技大学 Multi-task real-time gesture detection and recognition method based on Attention and Squeezenet
CN112163447B (en) * 2020-08-18 2022-04-08 桂林电子科技大学 Multi-task real-time gesture detection and recognition method based on Attention and Squeezenet
CN112037237B (en) * 2020-09-01 2023-04-07 腾讯科技(深圳)有限公司 Image processing method, image processing device, computer equipment and medium
CN112037237A (en) * 2020-09-01 2020-12-04 腾讯科技(深圳)有限公司 Image processing method, image processing device, computer equipment and medium
CN112101366A (en) * 2020-09-11 2020-12-18 湖南大学 Real-time segmentation system and method based on hybrid expansion network
CN112101189B (en) * 2020-09-11 2022-09-30 北京航空航天大学 SAR image target detection method and test platform based on attention mechanism
CN112101189A (en) * 2020-09-11 2020-12-18 北京航空航天大学 SAR image target detection method and test platform based on attention mechanism
CN112183269A (en) * 2020-09-18 2021-01-05 哈尔滨工业大学(深圳) Target detection method and system suitable for intelligent video monitoring
CN112183269B (en) * 2020-09-18 2023-08-29 哈尔滨工业大学(深圳) Target detection method and system suitable for intelligent video monitoring
CN112132216B (en) * 2020-09-22 2024-04-09 平安国际智慧城市科技股份有限公司 Vehicle type recognition method and device, electronic equipment and storage medium
CN112132216A (en) * 2020-09-22 2020-12-25 平安国际智慧城市科技股份有限公司 Vehicle type recognition method and device, electronic equipment and storage medium
CN112233071A (en) * 2020-09-28 2021-01-15 国网浙江省电力有限公司杭州供电公司 Multi-granularity hidden danger detection method and system based on power transmission network picture in complex environment
CN112163580B (en) * 2020-10-12 2022-05-03 中国石油大学(华东) Small target detection algorithm based on attention mechanism
CN112163580A (en) * 2020-10-12 2021-01-01 中国石油大学(华东) Small target detection algorithm based on attention mechanism
CN112307984B (en) * 2020-11-02 2023-02-17 安徽工业大学 Safety helmet detection method and device based on neural network
CN112307984A (en) * 2020-11-02 2021-02-02 安徽工业大学 Safety helmet detection method and device based on neural network
CN112365480A (en) * 2020-11-13 2021-02-12 哈尔滨市科佳通用机电股份有限公司 Brake pad loss fault identification method for brake clamp device
CN112465880B (en) * 2020-11-26 2023-03-10 西安电子科技大学 Target detection method based on multi-source heterogeneous data cognitive fusion
CN112465880A (en) * 2020-11-26 2021-03-09 西安电子科技大学 Target detection method based on multi-source heterogeneous data cognitive fusion
CN112528786B (en) * 2020-11-30 2023-10-31 北京百度网讯科技有限公司 Vehicle tracking method and device and electronic equipment
CN112528786A (en) * 2020-11-30 2021-03-19 北京百度网讯科技有限公司 Vehicle tracking method and device and electronic equipment
CN112396035A (en) * 2020-12-07 2021-02-23 国网电子商务有限公司 Object detection method and device based on attention detection model
CN112464851A (en) * 2020-12-08 2021-03-09 国网陕西省电力公司电力科学研究院 Smart power grid foreign matter intrusion detection method and system based on visual perception
CN112561876B (en) * 2020-12-14 2024-02-23 中南大学 Image-based water quality detection method and system for ponds and reservoirs
CN112561876A (en) * 2020-12-14 2021-03-26 中南大学 Image-based pond and reservoir water quality detection method and system
CN112464910A (en) * 2020-12-18 2021-03-09 杭州电子科技大学 Traffic sign identification method based on YOLO v4-tiny
CN112633158A (en) * 2020-12-22 2021-04-09 广东电网有限责任公司电力科学研究院 Power transmission line corridor vehicle identification method, device, equipment and storage medium
CN112651326A (en) * 2020-12-22 2021-04-13 济南大学 Driver hand detection method and system based on deep learning
CN112651371A (en) * 2020-12-31 2021-04-13 广东电网有限责任公司电力科学研究院 Dressing security detection method and device, storage medium and computer equipment
CN112733691A (en) * 2021-01-04 2021-04-30 北京工业大学 Multi-direction unmanned aerial vehicle aerial photography vehicle detection method based on attention mechanism
CN112926480B (en) * 2021-03-05 2023-01-31 山东大学 Multi-scale and multi-orientation-oriented aerial photography object detection method and system
CN112926480A (en) * 2021-03-05 2021-06-08 山东大学 Multi-scale and multi-orientation-oriented aerial object detection method and system
CN112883907A (en) * 2021-03-16 2021-06-01 云南师范大学 Landslide detection method and device for small-volume model
CN112907972A (en) * 2021-04-06 2021-06-04 昭通亮风台信息科技有限公司 Road vehicle flow detection method and system based on unmanned aerial vehicle and computer readable storage medium
CN113343755A (en) * 2021-04-22 2021-09-03 山东师范大学 System and method for classifying red blood cells in red blood cell image
CN113538331A (en) * 2021-05-13 2021-10-22 中国地质大学(武汉) Metal surface damage target detection and identification method, device, equipment and storage medium
CN113255759B (en) * 2021-05-20 2023-08-22 广州广电运通金融电子股份有限公司 In-target feature detection system, method and storage medium based on attention mechanism
CN113255759A (en) * 2021-05-20 2021-08-13 广州广电运通金融电子股份有限公司 Attention mechanism-based in-target feature detection system, method and storage medium
CN113192058B (en) * 2021-05-21 2021-11-23 中国矿业大学(北京) Intelligent brick pile loading system based on computer vision and loading method thereof
CN113192058A (en) * 2021-05-21 2021-07-30 中国矿业大学(北京) Intelligent brick pile loading system based on computer vision and loading method thereof
CN113469942A (en) * 2021-06-01 2021-10-01 天津大学 CT image lesion detection method
CN113486930B (en) * 2021-06-18 2024-04-16 陕西大智慧医疗科技股份有限公司 Method and device for establishing and segmenting small intestine lymphoma segmentation model based on improved RetinaNet
CN113486930A (en) * 2021-06-18 2021-10-08 陕西大智慧医疗科技股份有限公司 Small intestinal lymphoma segmentation model establishing and segmenting method and device based on improved RetinaNet
CN113591859A (en) * 2021-06-23 2021-11-02 北京旷视科技有限公司 Image segmentation method, apparatus, device and medium
CN113345082B (en) * 2021-06-24 2022-11-11 云南大学 Feature pyramid multi-view three-dimensional reconstruction method and system
CN113345082A (en) * 2021-06-24 2021-09-03 云南大学 Feature pyramid multi-view three-dimensional reconstruction method and system
CN113537119A (en) * 2021-07-28 2021-10-22 国网河南省电力公司电力科学研究院 Transmission line connecting part detection method based on improved Yolov4-tiny
CN113628179A (en) * 2021-07-30 2021-11-09 厦门大学 PCB surface defect real-time detection method and device and readable medium
CN113567984B (en) * 2021-07-30 2023-08-22 长沙理工大学 Method and system for detecting artificial small target in SAR image
CN113567984A (en) * 2021-07-30 2021-10-29 长沙理工大学 Method and system for detecting artificial small target in SAR image
CN113628179B (en) * 2021-07-30 2023-11-24 厦门大学 PCB surface defect real-time detection method, device and readable medium
CN113591748A (en) * 2021-08-06 2021-11-02 广东电网有限责任公司 Aerial photography insulator target detection method and device
CN113762251A (en) * 2021-08-17 2021-12-07 慧影医疗科技(北京)有限公司 Target classification method and system based on attention mechanism
CN113762251B (en) * 2021-08-17 2024-05-10 慧影医疗科技(北京)股份有限公司 Attention mechanism-based target classification method and system
CN113420729A (en) * 2021-08-23 2021-09-21 城云科技(中国)有限公司 Multi-scale target detection method, model, electronic equipment and application thereof
CN113743521A (en) * 2021-09-10 2021-12-03 中国科学院软件研究所 Target detection method based on multi-scale context sensing
CN113743521B (en) * 2021-09-10 2023-06-27 中国科学院软件研究所 Target detection method based on multi-scale context awareness
CN113822871A (en) * 2021-09-29 2021-12-21 平安医疗健康管理股份有限公司 Target detection method and device based on dynamic detection head, storage medium and equipment
CN114241003A (en) * 2021-12-14 2022-03-25 成都阿普奇科技股份有限公司 All-weather lightweight high-real-time sea surface ship detection and tracking method
CN114038067A (en) * 2022-01-07 2022-02-11 深圳市海清视讯科技有限公司 Coal mine personnel behavior detection method, equipment and storage medium
CN114549413A (en) * 2022-01-19 2022-05-27 华东师范大学 Multi-scale fusion full convolution network lymph node metastasis detection method based on CT image
CN114155475B (en) * 2022-01-24 2022-05-17 杭州晨鹰军泰科技有限公司 Method, device and medium for identifying end-to-end personnel actions under view angle of unmanned aerial vehicle
CN114155475A (en) * 2022-01-24 2022-03-08 杭州晨鹰军泰科技有限公司 Method, device and medium for recognizing end-to-end personnel actions under view angle of unmanned aerial vehicle
CN114529825B (en) * 2022-04-24 2022-07-22 城云科技(中国)有限公司 Target detection model, method and application for fire fighting access occupied target detection
CN114529825A (en) * 2022-04-24 2022-05-24 城云科技(中国)有限公司 Target detection model, method and application for fire fighting channel occupation target detection
CN114648736A (en) * 2022-05-18 2022-06-21 武汉大学 Robust engineering vehicle identification method and system based on target detection
CN114972860A (en) * 2022-05-23 2022-08-30 郑州轻工业大学 Target detection method based on attention-enhanced bidirectional feature pyramid network
CN114821374A (en) * 2022-06-27 2022-07-29 中国电子科技集团公司第二十八研究所 Knowledge and data collaborative driving unmanned aerial vehicle aerial photography target detection method
CN115147375B (en) * 2022-07-04 2023-07-25 河海大学 Concrete surface defect feature detection method based on multi-scale attention
CN115147375A (en) * 2022-07-04 2022-10-04 河海大学 Concrete surface defect characteristic detection method based on multi-scale attention
CN115100545A (en) * 2022-08-29 2022-09-23 东南大学 Target detection method for small parts of failed satellite under low illumination
CN115424230B (en) * 2022-09-23 2023-06-06 哈尔滨市科佳通用机电股份有限公司 Method for detecting failure of vehicle door pulley derailment track, storage medium and device
CN115424230A (en) * 2022-09-23 2022-12-02 哈尔滨市科佳通用机电股份有限公司 Fault detection method for vehicle door pulley out-of-track, storage medium and equipment
CN116468730B (en) * 2023-06-20 2023-09-05 齐鲁工业大学(山东省科学院) Aerial Insulator Image Defect Detection Method Based on YOLOv5 Algorithm
CN116468730A (en) * 2023-06-20 2023-07-21 齐鲁工业大学(山东省科学院) Aerial insulator image defect detection method based on YOLOv5 algorithm
CN117474861A (en) * 2023-10-31 2024-01-30 东北石油大学 Surface mounting special-shaped element parameter extraction method and system based on improved RetinaNet and Canny-Franklin moment sub-pixels
CN117671473A (en) * 2024-02-01 2024-03-08 中国海洋大学 Underwater target detection model and method based on attention and multi-scale feature fusion
CN117671473B (en) * 2024-02-01 2024-05-07 中国海洋大学 Underwater target detection model and method based on attention and multi-scale feature fusion

Also Published As

Publication number Publication date
CN111401201B (en) 2023-06-20

Similar Documents

Publication Publication Date Title
CN111401201B (en) Aerial image multi-scale target detection method based on spatial pyramid attention drive
CN112200161B (en) Face recognition detection method based on mixed attention mechanism
CN109993082B (en) Convolutional neural network road scene classification and road segmentation method
CN109857889B (en) Image retrieval method, device and equipment and readable storage medium
EP3690741A2 (en) Method for automatically evaluating labeling reliability of training images for use in deep learning network to analyze images, and reliability-evaluating device using the same
CN113780296A (en) Remote sensing image semantic segmentation method and system based on multi-scale information fusion
CN113298815A (en) Semi-supervised remote sensing image semantic segmentation method and device and computer equipment
CN113591872A (en) Data processing system, object detection method and device
CN104778699B (en) Tracking method for adaptive object features
CN113111968A (en) Image recognition model training method and device, electronic equipment and readable storage medium
CN115797929A (en) Small farmland image segmentation method and device based on dual attention mechanism
CN115620393A (en) Fine-grained pedestrian behavior recognition method and system oriented to automatic driving
CN116524189A (en) High-resolution remote sensing image semantic segmentation method based on coding and decoding indexing edge characterization
CN111339967A (en) Pedestrian detection method based on multi-view graph convolution network
CN114821341A (en) Remote sensing small target detection method based on double attention of FPN and PAN network
CN117333948A (en) End-to-end multi-target broiler behavior identification method integrating space-time attention mechanism
CN112949380A (en) Intelligent underwater target identification system based on laser radar point cloud data
CN115731517B (en) Crowded crowd detection method based on crown-RetinaNet network
CN115862119A (en) Human face age estimation method and device based on attention mechanism
CN115359091A (en) Armor plate detection tracking method for mobile robot
CN115223245A (en) Method, system, equipment and storage medium for detecting and clustering behavior of tourists in scenic spot
CN115331254A (en) Anchor frame-free example portrait semantic analysis method
Motwake et al. Enhancing land cover classification in remote sensing imagery using an optimal deep learning model
CN117523428B (en) Ground target detection method and device based on aircraft platform
CN117576665B (en) Automatic driving-oriented single-camera three-dimensional target detection method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant