CN111401201A - Aerial image multi-scale target detection method based on spatial pyramid attention drive - Google Patents
- Publication number
- CN111401201A (application number CN202010164167.7A)
- Authority
- CN
- China
- Prior art keywords
- attention
- feature
- spatial
- pyramid
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/13—Satellite images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Multimedia (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Biomedical Technology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Remote Sensing (AREA)
- Astronomy & Astrophysics (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a multi-scale target detection method for aerial images driven by spatial pyramid attention, comprising the following steps: first, for a data set of large-size images, a block processing method is applied to augment the training data set; a residual network whose feature representation is enhanced by convolutional attention is designed as the backbone network to extract image features efficiently; a spatial pyramid attention module is then constructed so that the network focuses more accurately on targets of different scales and extracts the regions of interest in which the targets are located; a target classification and bounding-box regression module is established to classify the regions of interest at different scales and predict the target boxes; in the testing stage, a multi-scale testing strategy is applied to the trained detection network, and the detection results at different scales are fused by a global integrated non-maximum suppression algorithm, further improving detection accuracy.
Description
Technical Field
The invention belongs to the technical field of image recognition and target detection, and particularly relates to an aerial image multi-scale target detection method based on spatial pyramid attention driving.
Background
Target detection, also called target extraction, is a form of image segmentation based on the geometric and statistical characteristics of targets. It combines the segmentation and recognition of targets, and its accuracy and real-time performance are key capabilities of the overall system. In complex scenes in particular, where several targets must be processed in real time, automatic target extraction and recognition are especially important. With the development of computer technology and the wide application of computer vision, real-time target tracking based on computer image processing has become an increasingly active research topic, and dynamic real-time tracking and positioning of targets has wide application value in intelligent traffic systems, intelligent monitoring systems, military target detection, the positioning of surgical instruments in medical navigation surgery, and similar fields.
On the one hand, many target detection methods have appeared in recent years, such as YOLO, SSD, RetinaNet, and the R-CNN series. YOLO, SSD, and RetinaNet are single-stage methods, while the original R-CNN and its extensions Fast R-CNN and Faster R-CNN are two-stage methods.
On the other hand, the visual attention mechanism is a brain signal processing mechanism specific to human vision. By rapidly scanning the global image, human vision identifies the target area that deserves the most attention, commonly called the focus of attention, and thereby acquires more of the information that is critical to the target of interest. Introducing an attention mechanism into a model is therefore of great help in improving the accuracy of target detection.
Under the condition of not considering the detection speed, the accuracy of the two-stage target detection algorithm is higher than that of the single-stage target detection algorithm, so that the two-stage target detection algorithm can achieve higher accuracy in many conditions such as detection of aerial pictures of the unmanned aerial vehicle. Therefore, the patent provides a feature pyramid dual-attention-driven multi-scale target detection network based on a deep learning theory and a latest attention mechanism method.
Disclosure of Invention
The technical problem to be solved by the invention is to address the shortcomings of the prior art by providing an aerial image multi-scale target detection method based on spatial pyramid attention drive.
In order to achieve the technical purpose, the technical scheme adopted by the invention is as follows:
A multi-scale target detection method for aerial images based on spatial pyramid attention drive, comprising the following steps:
S101: collecting a set of unmanned aerial vehicle (UAV) aerial images and dividing them into blocks to obtain a large number of small tile images of uniform size;
S102: inputting the tile images into a residual network and extracting features through a convolutional attention module in the residual network, wherein the convolutional attention module comprises a first channel attention unit and a first spatial attention unit; a channel attention map is computed by the first channel attention unit, a spatial attention map is computed by the first spatial attention unit, and a first feature map is generated by combining the channel attention map and the spatial attention map;
S103: extracting features from the first feature map with a feature-pyramid-based detector, adding a dual attention module containing a second spatial attention unit and a second channel attention unit to each top-down layer of the feature pyramid, fusing the feature maps generated by the two attention units to obtain a second feature map, and, in the last layer, performing a region-of-interest alignment operation on the second feature map by means of the region proposal network so as to fix the size of the features;
S104: for the second feature map obtained after region-of-interest alignment, establishing a target classification and bounding-box regression module, and performing classification and bounding-box prediction on the regions of interest at different scales;
S105: performing multi-scale image testing with the original image and the image enlarged to 1.5 times the original size, inputting the images of the two scales separately into the deep network for testing, and fusing the results of the different scales through a global integrated non-maximum suppression algorithm to improve detection accuracy.
In order to optimize the technical scheme, the specific measures adopted further comprise:
The step S101 specifically includes: dividing each image into blocks with a sliding window of 1000 × 1000 pixels and an overlap rate of 0.25; retaining the coordinates of the manually labeled boxes of vehicles whose IoU is greater than 0.7; and, for all vehicles in each tile, converting the manually labeled bounding boxes into the coordinate system of the small tile image.
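The tiling step can be sketched as follows. This is only a minimal illustration: the function names are hypothetical, the final window is assumed to be shifted so that it ends at the image border, and the "IoU greater than 0.7" rule is assumed to compare each labeled box with its part that falls inside the tile (for a clipped box this IoU equals the retained area fraction).

```python
def tile_origins(length, tile=1000, overlap=0.25):
    """Top-left offsets of the sliding windows along one image axis."""
    stride = int(tile * (1 - overlap))  # 750 px for the values in the text
    if length <= tile:
        return [0]
    origins = list(range(0, length - tile, stride))
    origins.append(length - tile)  # assumption: last window is flush with the border
    return origins


def clip_box_to_tile(box, x0, y0, tile=1000, keep_iou=0.7):
    """Convert a global (x1, y1, x2, y2) box into tile coordinates.

    Returns None when the IoU between the box and its clipped part
    does not exceed keep_iou (assumed reading of the 0.7 rule).
    """
    x1, y1, x2, y2 = box
    ix1, iy1 = max(x1, x0), max(y1, y0)
    ix2, iy2 = min(x2, x0 + tile), min(y2, y0 + tile)
    if ix2 <= ix1 or iy2 <= iy1:
        return None  # the box does not touch this tile
    inside = (ix2 - ix1) * (iy2 - iy1)
    area = (x2 - x1) * (y2 - y1)
    if inside / area <= keep_iou:
        return None  # the vehicle is cut too heavily by the tile border
    return (ix1 - x0, iy1 - y0, ix2 - x0, iy2 - y0)
```

With an overlap rate of 0.25 the window advances 750 pixels per step, so a vehicle cut by one tile border is usually complete in the neighbouring tile.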
The step S102 specifically includes: inputting the picture into a residual network in which the convolutional attention module is embedded. The first channel attention unit compresses the feature map along the spatial dimension using max pooling and average pooling to obtain two different spatial context descriptors, $F^{c}_{avg}$ and $F^{c}_{max}$; both descriptors are passed through the shared network to compute the channel attention map. The calculation formula of the first channel attention unit is as follows:
$$M_c(F) = \sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))\big) = \sigma\big(W_1(W_0(F^{c}_{avg})) + W_1(W_0(F^{c}_{max}))\big)$$
wherein: $W_1$ and $W_0$ denote the weights of the multi-layer perceptron, which are shared by the two inputs, and $W_0$ is followed by a ReLU activation function; $\sigma$ denotes the Sigmoid function, and $F$ denotes the feature produced by the convolution operation at the corresponding stage of the attention mechanism;
the first spatial attention unit applies max pooling and average pooling along the channel dimension to obtain two different feature descriptors, $F^{s}_{avg}$ and $F^{s}_{max}$, and generates the spatial attention map by convolution. The calculation formula of the first spatial attention unit is as follows:
$$M_s(F) = \sigma\big(f^{7\times 7}([\mathrm{AvgPool}(F); \mathrm{MaxPool}(F)])\big) = \sigma\big(f^{7\times 7}([F^{s}_{avg}; F^{s}_{max}])\big)$$
wherein: $\sigma$ denotes the Sigmoid function, and $f^{7\times 7}$ denotes a convolution with a kernel size of 7 × 7;
a first feature map is then generated from the channel attention map and the spatial attention map.
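As a minimal sketch of the channel attention step described above (spatial pooling, a shared two-layer perceptron with ReLU, and a Sigmoid), the following pure-Python example may help. The function names, the flat per-channel data layout, and the toy weights are illustrative assumptions and are not taken from the patent.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(w, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def shared_mlp(x, w0, w1):
    """Two-layer perceptron: W0, then ReLU, then W1."""
    hidden = [max(0.0, h) for h in matvec(w0, x)]
    return matvec(w1, hidden)


def channel_attention(feature, w0, w1):
    """feature: list of C channels, each a flat list of H*W activations.

    Returns one attention weight in (0, 1) per channel.
    """
    avg_desc = [sum(ch) / len(ch) for ch in feature]  # average pooling over space
    max_desc = [max(ch) for ch in feature]            # max pooling over space
    a = shared_mlp(avg_desc, w0, w1)                  # the same weights serve both inputs
    m = shared_mlp(max_desc, w0, w1)
    return [sigmoid(x + y) for x, y in zip(a, m)]


def apply_channel_attention(feature, weights):
    """Scale every channel by its attention weight."""
    return [[w * v for v in ch] for ch, w in zip(feature, weights)]
```

Because the same `w0` and `w1` process both pooled descriptors, the unit adds few parameters relative to the backbone.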
The step S103 specifically includes: extracting features from the first feature map with a feature-pyramid-based detector, and adding a dual attention module containing a second position attention unit and a second spatial attention unit to each top-down layer of the feature pyramid;
the second position attention unit computes a correlation strength matrix between the features of any two points: the original feature $A_j$ is reduced in dimension by convolution to obtain features $B_i$, $C_j$ and $D_i$; $B_i$ and $C_j$ are then reshaped, and their matrix product gives the correlation strength matrix between the features of any two points; the softmax function is applied to obtain the attention $S_{ji}$ of each position with respect to the other positions; $S_{ji}$ is then multiplied with $D_i$ and fused, and the result is finally added to the original feature $A_j$ to obtain the position feature map output by the position attention unit. The calculation formula of the second position attention unit is as follows:
$$S_{ji} = \frac{\exp(B_i \cdot C_j)}{\sum_{i} \exp(B_i \cdot C_j)}, \qquad E_{j1} = \sum_{i} S_{ji} D_i + A_j$$
wherein $A_j$ denotes the feature at a given position; $B_i$, $C_j$ and $D_i$ denote the three new features generated from $A_j$ by convolutional dimensionality reduction; $S_{ji}$ denotes the position attention map obtained by reshaping $B_i$ and $C_j$, multiplying them as matrices, and applying a softmax layer; $E_{j1}$ denotes the position feature map finally output by the second position attention unit;
the second spatial attention unit applies dimension transformation and matrix multiplication to the features of any two channels to obtain the correlation strength between them, then computes the inter-channel attention map, and finally performs weighted fusion with this attention map so that global correlation is established among the channels and features with a stronger semantic response are obtained. The calculation formula of the second spatial attention unit is as follows:
$$x_{ji} = \frac{\exp(A_i \cdot A_j)}{\sum_{i} \exp(A_i \cdot A_j)}, \qquad E_{j2} = \sum_{i} x_{ji} A_i + A_j$$
wherein $A_j$ denotes the feature corresponding to a given position; $x_{ji}$ denotes the channel attention map obtained by multiplying $A_j$ with its transpose $A_i$ and applying a softmax layer; $E_{j2}$ denotes the spatial feature map finally output by the second spatial attention unit;
finally, the two feature maps $E_{j1}$ and $E_{j2}$ are fused to obtain the final second feature map; in the last layer, the region proposal network is used to perform the region-of-interest alignment operation on the second feature map and fix the size of the features.
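The position attention computation described in this step (pairwise correlation, softmax normalization, weighted fusion, and addition of the original feature) can be sketched in pure Python as below. This is a sketch under stated assumptions: the projections B, C and D are passed in as ready-made lists rather than produced by convolutions, no learnable scaling factor is applied, and all names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def position_attention(a, b, c, d):
    """a, b, c, d: lists of N per-position feature vectors.

    d[i] must have the same length as a[j].
    Returns E with E[j] = sum_i S[j][i] * d[i] + a[j].
    """
    n = len(a)
    out = []
    for j in range(n):
        # correlation of position j with every position i, normalised by softmax
        s_j = softmax([dot(b[i], c[j]) for i in range(n)])
        # weighted fusion of D over all positions
        fused = [sum(s_j[i] * d[i][k] for i in range(n)) for k in range(len(d[0]))]
        # add the original feature (residual connection)
        out.append([f + x for f, x in zip(fused, a[j])])
    return out
```

The channel-wise unit follows the same pattern with the roles of positions and channels exchanged, so the same helper functions could be reused on the transposed feature matrix.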
The step S104 specifically includes: after the region-of-interest alignment operation has been performed on the second feature map and features of fixed size have been obtained, two fully connected layers of dimension 1024 are attached and then split into two branches, on which a target classification module and a bounding-box regression module are established respectively, so as to classify the regions of interest at the different scales of the feature pyramid and predict the target boxes.
The step S105 specifically includes: multi-scale image testing is adopted; in addition to the original images of the test set, images enlarged to 1.5 times the original size are used; the images of both scales are divided into blocks and input separately into the deep network for testing to obtain the detection results at each scale; the detection results of the two scales are then merged by the global non-maximum suppression fusion algorithm, thereby improving detection accuracy.
The global integrated non-maximum suppression algorithm proceeds as follows:
Step 1: globally align the coordinates of the prediction boxes of the sub-blocks at each scale;
Step 2: compute the weighted confidence of each detection box and sort the boxes by it;
Step 3: select the bounding box with the highest confidence, add it to the final output list, and delete it from the bounding box list;
Step 4: calculate the areas of all bounding boxes;
Step 5: calculate the IoU between the bounding box with the highest confidence and each of the other candidate boxes;
Step 6: delete the bounding boxes whose IoU is greater than the threshold;
Step 7: repeat the above process until the bounding box list is empty.
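The steps above can be sketched as a short pure-Python routine. The detection tuple layout, the per-scale confidence weights, and the IoU threshold of 0.5 are assumptions made for illustration; the text does not specify them.

```python
def to_global(box, tile_x, tile_y, scale):
    """Map a tile-local box at a given test scale back to original-image pixels."""
    x1, y1, x2, y2 = box
    return ((x1 + tile_x) / scale, (y1 + tile_y) / scale,
            (x2 + tile_x) / scale, (y2 + tile_y) / scale)


def area(b):
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def iou(a, b):
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    return inter / (area(a) + area(b) - inter)


def global_nms(detections, scale_weights, iou_threshold=0.5):
    """detections: list of (box, score, tile_x, tile_y, scale) tuples.

    scale_weights maps a test scale to a confidence weight (hypothetical).
    Returns the kept (global_box, weighted_score) pairs.
    """
    # Steps 1-2: align coordinates globally, weight the confidences, sort
    pool = [(to_global(b, tx, ty, s), score * scale_weights.get(s, 1.0))
            for b, score, tx, ty, s in detections]
    pool.sort(key=lambda det: det[1], reverse=True)
    kept = []
    while pool:  # Step 7: repeat until the list is empty
        best = pool.pop(0)  # Step 3: take the most confident box
        kept.append(best)
        # Steps 4-6: drop candidates that overlap the kept box too much
        pool = [det for det in pool if iou(best[0], det[0]) <= iou_threshold]
    return kept
```

Aligning all boxes to original-image coordinates before suppression is what lets duplicates from overlapping tiles and from the two test scales compete in a single list.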
The invention has the beneficial effects that:
The invention uses the theory of computer target detection and attention mechanisms to establish a multi-scale target detection network driven by feature pyramid dual attention. For the case where the aerial images are large, the targets to be detected are small, and the background is highly complex, the data set is first divided into blocks; the strong feature extraction capability of the dual-attention-driven feature pyramid is then exploited, and a multi-scale fusion detection method is adopted in which the detection results of the two scales are merged by the global non-maximum suppression fusion algorithm to obtain the final, more accurate detection result. The detection network provided by the invention achieves good results in target detection on aerial pictures and can play a significant role in fields such as geographic environment monitoring, traffic flow control, and the monitoring of military activity.
Drawings
FIG. 1 is a schematic flow chart of the algorithm of the present invention;
FIG. 2 is a schematic flow diagram of a global non-maximum suppression fusion algorithm;
FIG. 3 is a schematic diagram of a feature pyramid portion of a dual attention mechanism drive constructed in accordance with the present invention;
FIG. 4 is a schematic diagram of a detection network of the present invention;
FIG. 5 is a comparison chart of the quantitative analysis on the unmanned aerial vehicle data set.
Detailed Description
Embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
As shown in FIG. 1, the present invention is a spatial pyramid attention-driven multi-scale target detection method for aerial images, comprising the following steps:
S101: before training, block processing is performed on the unmanned aerial vehicle (UAV) aerial vehicle data set that is used to verify the effectiveness of the designed network;
Specifically: before the data set is fed to the network for training, it is preprocessed. The data set used in our experiments contains 4355 aerial images together with the coordinates of the manually labeled vehicles. Because the images are captured by a UAV, each image is too large, so it is divided with a sliding window of 1000 × 1000 pixels to obtain a large number of small tile images. To avoid, as far as possible, vehicles being cut apart by the tiling, an overlap rate of 0.25 is used, and the coordinates of the manually labeled vehicle boxes whose IoU is greater than 0.7 are retained. For all vehicle instances in the tiled images, the tiles are saved and the manually labeled bounding boxes are converted into the coordinates of the small tile images, yielding a total of 48416 tile images of size 1000 × 1000.
S102: the small tile image is input into a residual network and features are extracted through a convolutional attention module in the residual network, wherein the convolutional attention module comprises a first channel attention unit and a first spatial attention unit; a channel attention map is computed by the first channel attention unit, a spatial attention map is computed by the first spatial attention unit, and a first feature map is generated by combining the channel attention map and the spatial attention map.
Specifically: the picture first passes through the backbone network, for which a residual network is chosen, with a convolutional attention module embedded in the residual blocks. The convolutional attention module combines spatial and channel attention; the resulting attention map is multiplied with the input feature map for adaptive feature learning. After the picture passes through the backbone network, a feature map is generated and sent to the next stage;
the convolutional attention module comprises a first channel attention unit and a first spatial attention unit. The first channel attention unit focuses on what is meaningful in the input picture. To compute channel attention efficiently, it compresses the feature map along the spatial dimension using max pooling and average pooling to obtain two different spatial context descriptors, $F^{c}_{avg}$ and $F^{c}_{max}$, and the channel attention map is computed by passing both descriptors through a shared network consisting of an MLP. The calculation formula of the first channel attention unit is therefore as follows:
$$M_c(F) = \sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))\big) = \sigma\big(W_1(W_0(F^{c}_{avg})) + W_1(W_0(F^{c}_{max}))\big)$$
wherein $W_1$ and $W_0$ denote the weights of the multi-layer perceptron, which are shared by the two inputs, and $W_0$ is followed by a ReLU activation function; $\sigma$ denotes the Sigmoid function, and $F$ denotes the feature produced by the convolution operation at the corresponding stage of the attention module.
The first spatial attention unit differs from the first channel attention unit in that it focuses mainly on position information. It applies max pooling and average pooling along the channel dimension to obtain two different feature descriptors, $F^{s}_{avg}$ and $F^{s}_{max}$, concatenates them, and generates the spatial attention map with a convolution operation. The calculation formula of the first spatial attention unit is as follows:
$$M_s(F) = \sigma\big(f^{7\times 7}([\mathrm{AvgPool}(F); \mathrm{MaxPool}(F)])\big) = \sigma\big(f^{7\times 7}([F^{s}_{avg}; F^{s}_{max}])\big)$$
wherein: $\sigma$ denotes the Sigmoid function and $f^{7\times 7}$ denotes a convolution with a kernel size of 7 × 7; the first feature map is then generated from the channel attention map and the spatial attention map.
S103: features are extracted from the first feature map by a feature-pyramid-based detector; a dual attention module containing a second spatial attention unit and a second channel attention unit is added to each top-down layer of the feature pyramid to compute the degree of association between different features and to model the association between channels; in the last layer, the region proposal network is used to perform the region-of-interest alignment operation on the generated second feature map so as to fix the size of the features.
Specifically: in the detector stage, a feature pyramid network is first integrated into Faster R-CNN to improve the detector's awareness of the information in the whole image; the spatial feature pyramid structure is further improved by adding a dual attention module; finally, the region-of-interest pooling operation originally used in Faster R-CNN to fix the feature size is replaced by the pixel-level, higher-precision region-of-interest alignment operation.
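The loss function of the detection network comprises a classification loss and a regression loss, and is formulated as follows:
$$L(\{p_i\},\{t_i\}) = \frac{1}{N_{cls}}\sum_{i} L_{cls}(p_i, p_i^{*}) + \lambda\,\frac{1}{N_{reg}}\sum_{i} p_i^{*}\, L_{reg}(t_i, t_i^{*})$$
wherein: $i$ is the index of the i-th target box; $p_i$ is the predicted probability that the anchor box is a target; $p_i^{*}$ is 1 when the anchor box is a target and 0 otherwise; $t_i$ is the position coordinates of the prediction box; $t_i^{*}$ is the coordinates of the ground-truth label; $N_{cls}$, $N_{reg}$ and $\lambda$ are the normalization terms and the balancing weight of the standard Faster R-CNN loss;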
the bottom-up part of the feature pyramid consists of the features obtained by the backbone network. To build the top-down part, a 1 × 1 convolution for dimensionality reduction is applied to the 2nd bottom-up layer, and the upsampled result of the 3rd bottom-up layer is added to it to obtain the 2nd top-down layer; the next top-down layer is obtained in the same way. The region proposal network is then applied to the resulting top-down layers to obtain proposals for the regions to be detected.
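The merge that produces one top-down layer (a 1 × 1 convolution on the lower layer plus the upsampled upper layer) can be sketched as follows. Nested lists in [channel][row][column] order stand in for tensors, nearest-neighbour 2× upsampling is assumed because the text does not name the sampling method, and all function names are hypothetical.

```python
def conv1x1(fmap, weights):
    """fmap: [C][H][W]; weights: [C_out][C]. Per-pixel channel mixing."""
    h, w = len(fmap[0]), len(fmap[0][0])
    return [[[sum(wc * fmap[c][y][x] for c, wc in enumerate(row))
              for x in range(w)] for y in range(h)] for row in weights]


def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a [C][H][W] map (assumed method)."""
    out = []
    for channel in fmap:
        rows = []
        for r in channel:
            wide = [v for v in r for _ in range(2)]  # repeat each column
            rows.append(wide)
            rows.append(list(wide))                  # repeat each row
        out.append(rows)
    return out


def merge_top_down(lower, upper, lateral_weights):
    """Add the 1x1-reduced lower layer to the upsampled upper layer."""
    lateral = conv1x1(lower, lateral_weights)
    up = upsample2x(upper)
    return [[[a + b for a, b in zip(row_a, row_b)]
             for row_a, row_b in zip(ch_a, ch_b)]
            for ch_a, ch_b in zip(lateral, up)]
```

The 1 × 1 convolution brings the lower layer to the channel count of the upper layer so that the two maps can be added element by element.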
Specifically, the feature pyramid part, built on the residual network and integrating the dual attention module, extracts the features of the objects to be detected on feature maps of different scales. Adding the dual attention mechanism to each top-down layer of the feature pyramid yields feature maps with higher precision and richer information. The dual attention module introduces the self-attention mechanism into the spatial dimension and the channel dimension of the features respectively, by means of a second position attention unit and a second channel attention unit, thereby effectively capturing the global dependencies of the features.
The second position attention unit enhances the representation of each feature by exploiting the association between any two features. Specifically, a correlation strength matrix between the features of any two points is first computed: the original feature $A_j$ is reduced in dimension by convolution to obtain features $B_i$, $C_j$ and $D_i$; $B_i$ and $C_j$ are then reshaped, and their matrix product gives the correlation strength matrix between the features of any two points. Normalization by the softmax operation then gives the attention $S_{ji}$ of each position with respect to the other positions; the more similar the features of two points are, the larger the response value $S_{ji}$. The response values $S_{ji}$ are then used as weights to fuse the feature $D$, so that each position is fused with similar features from the global space of the feature map. The calculation formula of the second position attention unit is as follows:
$$S_{ji} = \frac{\exp(B_i \cdot C_j)}{\sum_{i} \exp(B_i \cdot C_j)}, \qquad E_{j1} = \sum_{i} S_{ji} D_i + A_j$$
wherein $A_j$ denotes the feature corresponding to a given position; $B_i$, $C_j$ and $D_i$ denote the three new feature maps generated by feeding $A_j$ through the convolutional layers; $S_{ji}$ denotes the spatial attention map obtained by reshaping $B_i$ and $C_j$, multiplying them as matrices, and applying a softmax layer; $E_{j1}$ denotes the position feature map finally output by the second position attention unit.
The second spatial attention unit enhances the specific semantic response capability of the channels by modeling the association between them. The process is similar to that of the position attention module, except that, when the feature attention map $X$ is obtained, dimension transformation and matrix multiplication are applied to the features of any two channels to obtain the correlation strength between them, and the inter-channel attention map is then obtained through the softmax operation. Finally, fusion is performed by weighting with the inter-channel attention map, so that global association is established among all channels and features with a stronger semantic response are obtained. The calculation formula of the channel attention module is as follows:
$$x_{ji} = \frac{\exp(A_i \cdot A_j)}{\sum_{i} \exp(A_i \cdot A_j)}, \qquad E_{j2} = \sum_{i} x_{ji} A_i + A_j$$
wherein $A_j$ denotes the feature corresponding to a given position; $x_{ji}$ denotes the channel attention map obtained by multiplying $A_j$ with its transpose $A_i$ and applying a softmax layer; $E_{j2}$ denotes the spatial feature map finally output by the second spatial attention unit.
In the target detection algorithm, candidate boxes for the objects to be detected are obtained from the region proposal network, and region-of-interest pooling is then used to map candidate regions of different sizes onto a feature map of fixed size. However, region-of-interest pooling has two obvious drawbacks: an error is introduced when the boundaries of the candidate box are quantized to integer coordinates, and a further error is introduced when floating-point values are rounded during pooling. The accumulated error shifts the coordinate position of the candidate box and degrades the detection result. Because our data set concerns the detection of vehicles in UAV aerial images, the targets to be detected occupy an extremely small proportion of the picture. We therefore replace region-of-interest pooling with the pixel-level, higher-precision region-of-interest alignment operation, which removes the quantization step and uses bilinear interpolation to obtain image values at pixel positions with floating-point coordinates, turning the whole feature aggregation process into a continuous operation.
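The bilinear interpolation at the core of region-of-interest alignment can be illustrated with a short sketch. The 2 × 2 sampling grid per bin and the averaging of the samples are common choices assumed here for illustration, and the function names are hypothetical.

```python
import math


def bilinear_sample(fmap, x, y):
    """Interpolate a 2-D feature map (indexed [row][col]) at a float position."""
    h, w = len(fmap), len(fmap[0])
    x = min(max(x, 0.0), w - 1.0)  # clamp to the valid range
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    dx, dy = x - x0, y - y0
    top = fmap[y0][x0] * (1 - dx) + fmap[y0][x1] * dx
    bottom = fmap[y1][x0] * (1 - dx) + fmap[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_align_bin(fmap, x1, y1, x2, y2, samples=2):
    """Average samples x samples bilinear samples inside one RoI bin.

    The bin borders stay floating point; nothing is rounded to integers.
    """
    total = 0.0
    for i in range(samples):
        for j in range(samples):
            sx = x1 + (j + 0.5) * (x2 - x1) / samples
            sy = y1 + (i + 0.5) * (y2 - y1) / samples
            total += bilinear_sample(fmap, sx, sy)
    return total / (samples * samples)
```

Since the bin borders are never rounded, a shift of a fraction of a pixel in the candidate box changes the pooled value smoothly, which matters when the vehicles cover only a few feature-map cells.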
S104: after the region-of-interest alignment operation has been performed on the second feature map and features of fixed size have been obtained, two fully connected layers of dimension 1024 are attached and then split into two branches, on which a target classification module and a bounding-box regression module are established respectively, so as to classify the regions of interest at the different scales of the feature pyramid and predict the target boxes.
S105: multi-scale image testing is adopted in the testing process; in addition to the original images of the test set, images enlarged to 1.5 times the original size are used; the images of both scales are divided into blocks and input separately into the deep network for testing to obtain the detection results at each scale; the detection results of the two scales are then merged by the global non-maximum suppression fusion algorithm to improve detection accuracy.
The global integrated non-maximum suppression algorithm proceeds as follows:
Step 1: globally align the coordinates of the prediction boxes of the sub-blocks at each scale;
Step 2: compute the weighted confidence of each detection box and sort the boxes by it;
Step 3: select the bounding box with the highest confidence, add it to the final output list, and delete it from the bounding box list;
Step 4: calculate the areas of all bounding boxes;
Step 5: calculate the IoU between the bounding box with the highest confidence and each of the other candidate boxes;
Step 6: delete the bounding boxes whose IoU is greater than the threshold;
Step 7: repeat the above process until the bounding box list is empty.
Comparative experiments were carried out on the invention. The data set used in the experiments is the unmanned aerial vehicle aerial vehicle data set of the 'Behcet' information fusion challenge, and the hyper-parameters are set as follows: the maximum number of training periods (epochs) is 12 and the batch size is 1; the learning rate follows a warm-up strategy, with an initial learning rate given as 0.3333 that is adjusted gradually to 0.00025 over the first 500 iterations; the learning rate is then reduced at the 8th and 11th epochs.
The experiments are evaluated by two kinds of analysis, quantitative and visual:
for the quantitative comparison, precision, recall, and the F1 score are used to judge detection accuracy; the F1 score is computed from precision and recall and measures the detection accuracy of the algorithm. Precision, recall, and the F1 score are calculated as follows:
$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN}, \qquad F1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
wherein true positives (TP) are targets that should be detected and are correctly detected, false positives (FP) are detections of objects that should not be detected, and false negatives (FN) are targets that should be detected but are not.
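As a small worked example of these three metrics, the helper below computes them from raw counts. The function name and the convention of returning 0.0 when a denominator is zero are assumptions made for illustration.

```python
def detection_scores(tp, fp, fn):
    """Return (precision, recall, f1) from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0  # assumed convention when nothing is detected or present
    return precision, recall, f1
```

For instance, 80 correctly detected vehicles with 20 spurious detections and 20 missed vehicles give a precision, recall, and F1 score of 0.8 each.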
The visual comparison means that the models trained with the different detection algorithms are run on the same test picture, the detection results are visualized with visualization code written for this purpose, and the detection results of the different models on the same picture are then compared manually.
Conventional target detection algorithms suffer from low detection accuracy and poor results on unmanned aerial vehicle aerial images. The invention uses deep learning and the attention mechanism to establish a multi-scale unmanned aerial vehicle aerial target detection network driven by feature pyramid dual attention; during feature extraction, the attention mechanism is integrated into the spatial pyramid, so that richer and more effective information can be extracted and then sent to the region proposal network for classification and regression.
The above is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above-mentioned embodiments, and all technical solutions belonging to the idea of the present invention belong to the protection scope of the present invention. It should be noted that modifications and embellishments within the scope of the invention may be made by those skilled in the art without departing from the principle of the invention.
Claims (7)
1. A multi-scale target detection method for aerial images based on spatial pyramid attention drive, characterized by comprising the following steps:
S101: collecting an unmanned aerial vehicle aerial image set and dividing it into blocks to obtain a large number of small cut-block images of consistent size;
S102: inputting the small cut-block images into a residual network and extracting features through a convolutional attention module in the residual network, wherein the convolutional attention module comprises a first channel attention unit and a first spatial attention unit; a channel attention map is computed by the first channel attention unit, a spatial attention map is computed by the first spatial attention unit, and a first feature map is generated by combining the channel attention map and the spatial attention map;
S103: extracting features from the first feature map with a feature pyramid-based detector, adding a dual attention module containing a second spatial attention unit and a second channel attention unit to each layer of the feature pyramid from top to bottom, fusing the feature maps generated by the two attention units to obtain a second feature map, and, in the last layer, having the region proposal network perform a region-of-interest alignment operation on the second feature map to fix the size of the features;
S104: establishing a target category analysis module and a target frame regression module for the region-of-interest-aligned second feature map, and performing classification and target frame prediction on the regions of interest at different scales;
S105: performing multi-scale image testing with the original image and the original image enlarged 1.5 times, inputting the images of the two scales into the deep network separately for testing, and fusing the results from the different scales through a global integrated non-maximum suppression algorithm, thereby improving detection accuracy.
2. The aerial image multi-scale target detection method based on spatial pyramid attention driving according to claim 1, characterized in that: the step S101 specifically includes:
dividing the image into blocks with a sliding window of 1000 × 1000 pixels and an overlap rate of 0.25; for every vehicle in a cut-block image, retaining the coordinate information of the manually labeled frame when its IOU is larger than 0.7, and converting the manually labeled bounding box into the coordinates of the small cut-block picture.
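The blocking step can be sketched in Python as follows. The window size, overlap rate and 0.7 threshold come from the claim; everything else is an assumption made for illustration: the function names are hypothetical, the last window is shifted back so that it ends at the image border, and the IOU is taken between a labeled box and its part that lies inside the window (which equals the retained fraction of the box area).

```python
def tile_origins(length, tile=1000, overlap=0.25):
    """Start coordinates of sliding windows along one image axis."""
    stride = int(tile * (1 - overlap))  # 750 pixels for the claimed settings
    if length <= tile:
        return [0]
    origins = list(range(0, length - tile + 1, stride))
    # Assumption: add a final window flush with the border if pixels remain.
    if origins[-1] + tile < length:
        origins.append(length - tile)
    return origins


def boxes_in_tile(boxes, x0, y0, tile=1000, keep_iou=0.7):
    """Keep boxes sufficiently inside the tile and convert to tile coordinates.

    `boxes` holds (x1, y1, x2, y2) tuples in full-image coordinates.
    """
    kept = []
    for x1, y1, x2, y2 in boxes:
        # Clip the labeled box to the tile.
        cx1, cy1 = max(x1, x0), max(y1, y0)
        cx2, cy2 = min(x2, x0 + tile), min(y2, y0 + tile)
        inter = max(0, cx2 - cx1) * max(0, cy2 - cy1)
        area = (x2 - x1) * (y2 - y1)
        # The clipped box lies inside the original, so IOU = inter / area.
        if area > 0 and inter / area > keep_iou:
            kept.append((cx1 - x0, cy1 - y0, cx2 - x0, cy2 - y0))
    return kept


if __name__ == "__main__":
    print(tile_origins(3000))
    print(boxes_in_tile([(760, 770, 860, 870), (700, 700, 800, 800)], 750, 750))
```

Dropping boxes that are mostly cut off by the window border avoids training on fragments of vehicles, while the overlap ensures that such vehicles appear whole in a neighbouring window.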
3. The aerial image multi-scale target detection method based on spatial pyramid attention driving according to claim 2, characterized in that: the step S102 specifically includes:
inputting the picture into a residual network embedded with a convolutional attention module, wherein the first channel attention unit compresses the picture in the spatial dimension using maximum pooling and average pooling to obtain two different spatial context descriptors $F^{c}_{avg}$ and $F^{c}_{max}$; the descriptors $F^{c}_{avg}$ and $F^{c}_{max}$ are passed through the network and a channel attention map is computed, the calculation formula of the first channel attention unit being:
$$M_c(F) = \sigma\big(W_1(W_0(F^{c}_{avg})) + W_1(W_0(F^{c}_{max}))\big)$$
wherein: $W_1$ and $W_0$ represent the weights of a multi-layer perceptron and are shared by the two inputs, $W_0$ is followed by a ReLU activation function, $\sigma$ represents the Sigmoid function, and $F$ represents the feature from the convolution operation of the corresponding stage in the attention mechanism;
wherein the first spatial attention unit applies maximum pooling and average pooling along the channel dimension to obtain two different feature descriptors $F^{s}_{avg}$ and $F^{s}_{max}$, and generates a spatial attention map by convolution, the calculation formula of the first spatial attention unit being:
$$M_s(F) = \sigma\big(f^{7\times 7}([F^{s}_{avg}; F^{s}_{max}])\big)$$
wherein: $\sigma$ denotes the Sigmoid function and $f^{7\times 7}$ denotes a convolution with a kernel size of 7 × 7;
a first feature map is then generated from the channel attention map and the spatial attention map.
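The two units of the convolutional attention module can be illustrated with a small NumPy sketch. It is not the patented implementation: the function names, the weight shapes, and the choice to apply the channel attention map first and the spatial attention map second are assumptions, and the weights are supplied by the caller rather than learned.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def first_channel_attention(F, W0, W1):
    """Channel attention map of shape (C,) for a feature map F of shape (C, H, W).

    W0 has shape (C_hidden, C) and W1 has shape (C, C_hidden); both are shared
    by the average-pooled and max-pooled descriptors.
    """
    avg = F.mean(axis=(1, 2))  # spatial average pooling
    mx = F.max(axis=(1, 2))    # spatial max pooling

    def mlp(v):
        return W1 @ np.maximum(W0 @ v, 0.0)  # ReLU after W0

    return sigmoid(mlp(avg) + mlp(mx))


def first_spatial_attention(F, kernel):
    """Spatial attention map of shape (H, W); kernel has shape (2, 7, 7)."""
    desc = np.stack([F.mean(axis=0), F.max(axis=0)])  # pooling over channels
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(desc, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    H, W = F.shape[1:]
    out = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)


def conv_attention(F, W0, W1, kernel):
    """Assumed combination: refine by channel attention, then spatial attention."""
    refined = F * first_channel_attention(F, W0, W1)[:, None, None]
    return refined * first_spatial_attention(refined, kernel)[None, :, :]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    F = rng.standard_normal((8, 5, 6))
    out = conv_attention(F, rng.standard_normal((2, 8)),
                         rng.standard_normal((8, 2)),
                         rng.standard_normal((2, 7, 7)))
    print(out.shape)
```

Both attention maps lie in the interval (0, 1), so each acts as a soft gate that rescales the feature map without changing its shape.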
4. The aerial image multi-scale target detection method based on spatial pyramid attention driving according to claim 3, characterized in that: the step S103 specifically includes:
extracting features from the first feature map with a feature pyramid-based detector, and adding a dual attention mechanism comprising a second position attention unit and a second spatial attention unit to each layer of the feature pyramid from top to bottom;
calculating a correlation strength matrix between the features of any two points through the second position attention unit, namely: the original feature $A_j$ is reduced in dimension by convolution to obtain features $B_i$, $C_j$ and $D_i$; $B_i$ and $C_j$ are then reshaped, and their matrix product gives the correlation strength matrix between the features of any two points; a softmax function is applied to obtain the attention $S_{ji}$ of each position to the other positions; $S_{ji}$ and $D_i$ are then multiplied and fused; and finally the result is added to the original feature $A_j$ to obtain the position feature map output by the position attention unit, the calculation formula of the second position attention unit being:
$$S_{ji} = \frac{\exp(B_i \cdot C_j)}{\sum_{i} \exp(B_i \cdot C_j)}, \qquad E_{j1} = \sum_{i} S_{ji} D_i + A_j$$
wherein $A_j$ represents the feature corresponding to a given position; $B_i$, $C_j$ and $D_i$ represent the three new features generated from $A_j$ by convolutional dimensionality reduction; $S_{ji}$ represents the position attention map obtained by reshaping $B_i$ and $C_j$, multiplying them as matrices and applying the softmax layer; and $E_{j1}$ represents the position feature map finally output by the second position attention unit;
performing dimension transformation and matrix multiplication on the features of any two channels through the second spatial attention unit to obtain the correlation strength between any two channels, then computing the attention map between the channels, and finally fusing the features by weighting them with the inter-channel attention map, so that global correlations are established between the channels and features with stronger semantic responses are obtained, the calculation formula of the second spatial attention unit being:
$$x_{ji} = \frac{\exp(A_i \cdot A_j)}{\sum_{i} \exp(A_i \cdot A_j)}, \qquad E_{j2} = \sum_{i} x_{ji} A_i + A_j$$
wherein $A_j$ represents the feature corresponding to a given position, $x_{ji}$ represents the channel attention map obtained by multiplying $A_j$ with the transpose of $A_i$ and applying the softmax layer, and $E_{j2}$ represents the spatial feature map finally output by the second spatial attention unit.
Finally, feature fusion is performed on the position feature map and the spatial feature map to obtain the final second feature map; in the last layer, the region proposal network performs a region-of-interest alignment operation on the second feature map to fix the size of the features.
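The two units of the dual attention mechanism can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the function names are hypothetical, the convolutional dimensionality reduction is modelled as a 1 × 1 convolution (a matrix product over channels) with caller-supplied weights, no learnable scaling factor is applied before the residual addition, and the final fusion is assumed to be an element-wise sum.

```python
import numpy as np


def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # numerically stable
    return e / e.sum(axis=axis, keepdims=True)


def second_position_attention(A, Wb, Wc, Wd):
    """Position attention on A of shape (C, H, W).

    Wb and Wc have shape (C_reduced, C); Wd has shape (C, C).
    """
    C, H, W = A.shape
    flat = A.reshape(C, H * W)                # one column per position
    B, Cf, D = Wb @ flat, Wc @ flat, Wd @ flat
    energy = Cf.T @ B                         # energy[j, i] = B_i . C_j
    S = softmax(energy, axis=1)               # S[j, i], normalised over i
    E = D @ S.T + flat                        # E_j = sum_i S_ji D_i + A_j
    return E.reshape(C, H, W)


def second_spatial_attention(A):
    """Attention over channel correlations, as described for the second
    spatial attention unit; A has shape (C, H, W)."""
    C, H, W = A.shape
    flat = A.reshape(C, H * W)                # one row per channel
    energy = flat @ flat.T                    # energy[j, i] = A_j . A_i
    X = softmax(energy, axis=1)               # x[j, i], normalised over i
    E = X @ flat + flat                       # E_j = sum_i x_ji A_i + A_j
    return E.reshape(C, H, W)


def dual_attention(A, Wb, Wc, Wd):
    """Assumed fusion: element-wise sum of the two output feature maps."""
    return second_position_attention(A, Wb, Wc, Wd) + second_spatial_attention(A)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((4, 3, 5))
    out = dual_attention(A, rng.standard_normal((2, 4)),
                         rng.standard_normal((2, 4)),
                         rng.standard_normal((4, 4)))
    print(out.shape)
```

The position branch lets every location aggregate features from all other locations, while the channel branch lets every channel aggregate from all other channels; both keep the input shape, so they can be added to each pyramid level without changing the rest of the detector.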
5. The aerial image multi-scale target detection method based on spatial pyramid attention driving according to claim 4, characterized in that: the step S104 specifically includes:
after region-of-interest alignment of the second feature map has produced features of fixed size, connecting two 1024-dimensional fully connected layers, which then split into two branches that establish a target category analysis module and a target frame regression module respectively, so that the regions of interest at the different scales of the feature pyramid are classified and their target frames are predicted.
6. The aerial image multi-scale target detection method based on spatial pyramid attention driving according to claim 5, characterized in that: the step S105 specifically includes:
multi-scale image testing is adopted: the original image and the original image enlarged 1.5 times are collected, the images of both scales are divided into blocks and input into the deep network separately to obtain detection results at each scale, and the detection results of the two scales are merged through the global integrated non-maximum suppression algorithm, thereby improving detection accuracy.
7. The aerial image multi-scale target detection method based on spatial pyramid attention driving of claim 6, characterized in that: the global integrated non-maximum suppression algorithm process is as follows:
Step 1: globally aligning the coordinates of the prediction frames of the sub-blocks at each scale;
Step 2: weighting the confidence scores of the detection frames and sorting them;
Step 3: selecting the bounding box with the highest confidence, adding it to the final output list, and deleting it from the bounding box list;
Step 4: calculating the areas of all bounding boxes;
Step 5: calculating the IOU between the bounding box with the highest confidence and the other candidate boxes;
Step 6: deleting the bounding boxes whose IOU is larger than the threshold;
Step 7: repeating the above process until the bounding box list is empty.
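The seven steps above can be sketched in Python as follows. Several details are assumptions made for illustration and are not specified in the claim: the input layout (a box in sub-block coordinates plus the sub-block offset and image scale), the mapping to original-image coordinates by adding the offset and dividing by the scale, the optional per-scale confidence weights, the 0.5 threshold, and all function names.

```python
def box_area(b):
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def box_iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2, ...) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def global_nms(detections, iou_threshold=0.5, scale_weights=None):
    """Fuse detections from sub-blocks at several scales.

    Each detection is (x1, y1, x2, y2, score, offset_x, offset_y, scale),
    with the box given in sub-block coordinates of the scaled image.
    Returns (x1, y1, x2, y2, score) tuples in original-image coordinates.
    """
    scale_weights = scale_weights or {}
    boxes = []
    for x1, y1, x2, y2, score, ox, oy, scale in detections:
        # Step 1: align every box to the coordinates of the original image.
        # Step 2 (first half): weight the confidence by its scale.
        weight = scale_weights.get(scale, 1.0)
        boxes.append(((x1 + ox) / scale, (y1 + oy) / scale,
                      (x2 + ox) / scale, (y2 + oy) / scale, score * weight))
    # Step 2 (second half): sort by weighted confidence, highest first.
    boxes.sort(key=lambda b: b[4], reverse=True)

    kept = []
    while boxes:                      # Step 7: repeat until the list is empty
        best = boxes.pop(0)           # Step 3: take the most confident box
        kept.append(best)
        # Steps 4 to 6: compute areas and IOUs, drop boxes above the threshold.
        boxes = [b for b in boxes if box_iou(best, b) <= iou_threshold]
    return kept


if __name__ == "__main__":
    dets = [
        (10, 10, 60, 60, 0.9, 750, 0, 1.0),     # vehicle seen at original scale
        (390, 15, 465, 90, 0.8, 750, 0, 1.5),   # same vehicle at 1.5x scale
        (100, 100, 150, 150, 0.7, 0, 0, 1.0),   # a different vehicle
    ]
    for box in global_nms(dets):
        print(box)
```

Aligning the coordinates before suppression is what allows duplicates of the same vehicle, found in overlapping sub-blocks or at different scales, to be recognised as overlapping boxes and reduced to a single detection.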
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010164167.7A CN111401201B (en) | 2020-03-10 | 2020-03-10 | Aerial image multi-scale target detection method based on spatial pyramid attention drive |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111401201A | 2020-07-10 |
CN111401201B | 2023-06-20 |
Family
ID=71432330
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010164167.7A Active CN111401201B (en) | 2020-03-10 | 2020-03-10 | Aerial image multi-scale target detection method based on spatial pyramid attention drive |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111401201B (en) |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110110751A (en) * | 2019-03-31 | 2019-08-09 | 华南理工大学 | A kind of Chinese herbal medicine recognition methods of the pyramid network based on attention mechanism |
CN110084210A (en) * | 2019-04-30 | 2019-08-02 | 电子科技大学 | The multiple dimensioned Ship Detection of SAR image based on attention pyramid network |
CN110378242A (en) * | 2019-06-26 | 2019-10-25 | 南京信息工程大学 | A kind of remote sensing target detection method of dual attention mechanism |
CN110533045A (en) * | 2019-07-31 | 2019-12-03 | 中国民航大学 | A kind of luggage X-ray contraband image, semantic dividing method of combination attention mechanism |
CN110533084A (en) * | 2019-08-12 | 2019-12-03 | 长安大学 | A kind of multiscale target detection method based on from attention mechanism |
CN110532955A (en) * | 2019-08-30 | 2019-12-03 | 中国科学院宁波材料技术与工程研究所 | Example dividing method and device based on feature attention and son up-sampling |
CN110705457A (en) * | 2019-09-29 | 2020-01-17 | 核工业北京地质研究院 | Remote sensing image building change detection method |
Non-Patent Citations (2)
Title |
---|
李新叶等: "基于深度学习的图像语义分割研究进展", 《科学技术与工程》 * |
沈文祥等: "基于多级特征和混合注意力机制的室内人群检测网络", 《计算机应用》 * |
Cited By (96)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111814704A (en) * | 2020-07-14 | 2020-10-23 | 陕西师范大学 | Full convolution examination room target detection method based on cascade attention and point supervision mechanism |
CN111814726B (en) * | 2020-07-20 | 2023-09-22 | 南京工程学院 | Detection method for visual target of detection robot |
CN111814726A (en) * | 2020-07-20 | 2020-10-23 | 南京工程学院 | Detection method for visual target of detection robot |
CN111914917A (en) * | 2020-07-22 | 2020-11-10 | 西安建筑科技大学 | Target detection improved algorithm based on feature pyramid network and attention mechanism |
CN112131925B (en) * | 2020-07-22 | 2024-06-07 | 随锐科技集团股份有限公司 | Construction method of multichannel feature space pyramid |
CN112131925A (en) * | 2020-07-22 | 2020-12-25 | 浙江元亨通信技术股份有限公司 | Construction method of multi-channel characteristic space pyramid |
CN112016569A (en) * | 2020-07-24 | 2020-12-01 | 驭势科技(南京)有限公司 | Target detection method, network, device and storage medium based on attention mechanism |
CN111860683A (en) * | 2020-07-30 | 2020-10-30 | 中国人民解放军国防科技大学 | Target detection method based on feature fusion |
CN111860683B (en) * | 2020-07-30 | 2021-04-27 | 中国人民解放军国防科技大学 | Target detection method based on feature fusion |
CN111882002A (en) * | 2020-08-06 | 2020-11-03 | 桂林电子科技大学 | MSF-AM-based low-illumination target detection method |
CN111882002B (en) * | 2020-08-06 | 2022-05-24 | 桂林电子科技大学 | MSF-AM-based low-illumination target detection method |
CN114140683A (en) * | 2020-08-12 | 2022-03-04 | 天津大学 | Aerial image target detection method, equipment and medium |
CN112101113A (en) * | 2020-08-14 | 2020-12-18 | 北京航空航天大学 | Lightweight unmanned aerial vehicle image small target detection method |
CN112101113B (en) * | 2020-08-14 | 2022-05-27 | 北京航空航天大学 | Lightweight unmanned aerial vehicle image small target detection method |
CN111985552A (en) * | 2020-08-17 | 2020-11-24 | 中国民航大学 | Method for detecting diseases of thin strip-shaped structure of airport pavement under complex background |
CN111914795A (en) * | 2020-08-17 | 2020-11-10 | 四川大学 | Method for detecting rotating target in aerial image |
CN111985552B (en) * | 2020-08-17 | 2022-07-29 | 中国民航大学 | Method for detecting diseases of thin strip-shaped structure of airport pavement under complex background |
CN111914795B (en) * | 2020-08-17 | 2022-05-27 | 四川大学 | Method for detecting rotating target in aerial image |
CN112163447A (en) * | 2020-08-18 | 2021-01-01 | 桂林电子科技大学 | Multi-task real-time gesture detection and recognition method based on Attention and Squeezenet |
CN112163447B (en) * | 2020-08-18 | 2022-04-08 | 桂林电子科技大学 | Multi-task real-time gesture detection and recognition method based on Attention and Squeezenet |
CN112037237B (en) * | 2020-09-01 | 2023-04-07 | 腾讯科技(深圳)有限公司 | Image processing method, image processing device, computer equipment and medium |
CN112037237A (en) * | 2020-09-01 | 2020-12-04 | 腾讯科技(深圳)有限公司 | Image processing method, image processing device, computer equipment and medium |
CN112101366A (en) * | 2020-09-11 | 2020-12-18 | 湖南大学 | Real-time segmentation system and method based on hybrid expansion network |
CN112101189B (en) * | 2020-09-11 | 2022-09-30 | 北京航空航天大学 | SAR image target detection method and test platform based on attention mechanism |
CN112101189A (en) * | 2020-09-11 | 2020-12-18 | 北京航空航天大学 | SAR image target detection method and test platform based on attention mechanism |
CN112183269A (en) * | 2020-09-18 | 2021-01-05 | 哈尔滨工业大学(深圳) | Target detection method and system suitable for intelligent video monitoring |
CN112183269B (en) * | 2020-09-18 | 2023-08-29 | 哈尔滨工业大学(深圳) | Target detection method and system suitable for intelligent video monitoring |
CN112132216B (en) * | 2020-09-22 | 2024-04-09 | 平安国际智慧城市科技股份有限公司 | Vehicle type recognition method and device, electronic equipment and storage medium |
CN112132216A (en) * | 2020-09-22 | 2020-12-25 | 平安国际智慧城市科技股份有限公司 | Vehicle type recognition method and device, electronic equipment and storage medium |
CN112233071A (en) * | 2020-09-28 | 2021-01-15 | 国网浙江省电力有限公司杭州供电公司 | Multi-granularity hidden danger detection method and system based on power transmission network picture in complex environment |
CN112163580B (en) * | 2020-10-12 | 2022-05-03 | 中国石油大学(华东) | Small target detection algorithm based on attention mechanism |
CN112163580A (en) * | 2020-10-12 | 2021-01-01 | 中国石油大学(华东) | Small target detection algorithm based on attention mechanism |
CN112307984B (en) * | 2020-11-02 | 2023-02-17 | 安徽工业大学 | Safety helmet detection method and device based on neural network |
CN112307984A (en) * | 2020-11-02 | 2021-02-02 | 安徽工业大学 | Safety helmet detection method and device based on neural network |
CN112365480A (en) * | 2020-11-13 | 2021-02-12 | 哈尔滨市科佳通用机电股份有限公司 | Brake pad loss fault identification method for brake clamp device |
CN112465880B (en) * | 2020-11-26 | 2023-03-10 | 西安电子科技大学 | Target detection method based on multi-source heterogeneous data cognitive fusion |
CN112465880A (en) * | 2020-11-26 | 2021-03-09 | 西安电子科技大学 | Target detection method based on multi-source heterogeneous data cognitive fusion |
CN112528786B (en) * | 2020-11-30 | 2023-10-31 | 北京百度网讯科技有限公司 | Vehicle tracking method and device and electronic equipment |
CN112528786A (en) * | 2020-11-30 | 2021-03-19 | 北京百度网讯科技有限公司 | Vehicle tracking method and device and electronic equipment |
CN112396035A (en) * | 2020-12-07 | 2021-02-23 | 国网电子商务有限公司 | Object detection method and device based on attention detection model |
CN112464851A (en) * | 2020-12-08 | 2021-03-09 | 国网陕西省电力公司电力科学研究院 | Smart power grid foreign matter intrusion detection method and system based on visual perception |
CN112561876B (en) * | 2020-12-14 | 2024-02-23 | 中南大学 | Image-based water quality detection method and system for ponds and reservoirs |
CN112561876A (en) * | 2020-12-14 | 2021-03-26 | 中南大学 | Image-based pond and reservoir water quality detection method and system |
CN112464910A (en) * | 2020-12-18 | 2021-03-09 | 杭州电子科技大学 | Traffic sign identification method based on YOLO v4-tiny |
CN112633158A (en) * | 2020-12-22 | 2021-04-09 | 广东电网有限责任公司电力科学研究院 | Power transmission line corridor vehicle identification method, device, equipment and storage medium |
CN112651326A (en) * | 2020-12-22 | 2021-04-13 | 济南大学 | Driver hand detection method and system based on deep learning |
CN112651371A (en) * | 2020-12-31 | 2021-04-13 | 广东电网有限责任公司电力科学研究院 | Dressing security detection method and device, storage medium and computer equipment |
CN112733691A (en) * | 2021-01-04 | 2021-04-30 | 北京工业大学 | Multi-direction unmanned aerial vehicle aerial photography vehicle detection method based on attention mechanism |
CN112926480B (en) * | 2021-03-05 | 2023-01-31 | 山东大学 | Multi-scale and multi-orientation-oriented aerial photography object detection method and system |
CN112926480A (en) * | 2021-03-05 | 2021-06-08 | 山东大学 | Multi-scale and multi-orientation-oriented aerial object detection method and system |
CN112883907A (en) * | 2021-03-16 | 2021-06-01 | 云南师范大学 | Landslide detection method and device for small-volume model |
CN112907972A (en) * | 2021-04-06 | 2021-06-04 | 昭通亮风台信息科技有限公司 | Road vehicle flow detection method and system based on unmanned aerial vehicle and computer readable storage medium |
CN113343755A (en) * | 2021-04-22 | 2021-09-03 | 山东师范大学 | System and method for classifying red blood cells in red blood cell image |
CN113538331A (en) * | 2021-05-13 | 2021-10-22 | 中国地质大学(武汉) | Metal surface damage target detection and identification method, device, equipment and storage medium |
CN113255759B (en) * | 2021-05-20 | 2023-08-22 | 广州广电运通金融电子股份有限公司 | In-target feature detection system, method and storage medium based on attention mechanism |
CN113255759A (en) * | 2021-05-20 | 2021-08-13 | 广州广电运通金融电子股份有限公司 | Attention mechanism-based in-target feature detection system, method and storage medium |
CN113192058B (en) * | 2021-05-21 | 2021-11-23 | 中国矿业大学(北京) | Intelligent brick pile loading system based on computer vision and loading method thereof |
CN113192058A (en) * | 2021-05-21 | 2021-07-30 | 中国矿业大学(北京) | Intelligent brick pile loading system based on computer vision and loading method thereof |
CN113469942A (en) * | 2021-06-01 | 2021-10-01 | 天津大学 | CT image lesion detection method |
CN113486930B (en) * | 2021-06-18 | 2024-04-16 | 陕西大智慧医疗科技股份有限公司 | Method and device for establishing and segmenting small intestine lymphoma segmentation model based on improved RetinaNet |
CN113486930A (en) * | 2021-06-18 | 2021-10-08 | 陕西大智慧医疗科技股份有限公司 | Small intestinal lymphoma segmentation model establishing and segmenting method and device based on improved RetinaNet |
CN113591859A (en) * | 2021-06-23 | 2021-11-02 | 北京旷视科技有限公司 | Image segmentation method, apparatus, device and medium |
CN113345082B (en) * | 2021-06-24 | 2022-11-11 | 云南大学 | Characteristic pyramid multi-view three-dimensional reconstruction method and system |
CN113345082A (en) * | 2021-06-24 | 2021-09-03 | 云南大学 | Characteristic pyramid multi-view three-dimensional reconstruction method and system |
CN113537119A (en) * | 2021-07-28 | 2021-10-22 | 国网河南省电力公司电力科学研究院 | Transmission line connecting part detection method based on improved Yolov4-tiny |
CN113628179A (en) * | 2021-07-30 | 2021-11-09 | 厦门大学 | PCB surface defect real-time detection method and device and readable medium |
CN113567984B (en) * | 2021-07-30 | 2023-08-22 | 长沙理工大学 | Method and system for detecting artificial small target in SAR image |
CN113567984A (en) * | 2021-07-30 | 2021-10-29 | 长沙理工大学 | Method and system for detecting artificial small target in SAR image |
CN113628179B (en) * | 2021-07-30 | 2023-11-24 | 厦门大学 | PCB surface defect real-time detection method, device and readable medium |
CN113591748A (en) * | 2021-08-06 | 2021-11-02 | 广东电网有限责任公司 | Aerial photography insulator target detection method and device |
CN113762251A (en) * | 2021-08-17 | 2021-12-07 | 慧影医疗科技(北京)有限公司 | Target classification method and system based on attention mechanism |
CN113762251B (en) * | 2021-08-17 | 2024-05-10 | 慧影医疗科技(北京)股份有限公司 | Attention mechanism-based target classification method and system |
CN113420729A (en) * | 2021-08-23 | 2021-09-21 | 城云科技(中国)有限公司 | Multi-scale target detection method, model, electronic equipment and application thereof |
CN113743521A (en) * | 2021-09-10 | 2021-12-03 | 中国科学院软件研究所 | Target detection method based on multi-scale context sensing |
CN113743521B (en) * | 2021-09-10 | 2023-06-27 | 中国科学院软件研究所 | Target detection method based on multi-scale context awareness |
CN113822871A (en) * | 2021-09-29 | 2021-12-21 | 平安医疗健康管理股份有限公司 | Target detection method and device based on dynamic detection head, storage medium and equipment |
CN114241003A (en) * | 2021-12-14 | 2022-03-25 | 成都阿普奇科技股份有限公司 | All-weather lightweight high-real-time sea surface ship detection and tracking method |
CN114038067A (en) * | 2022-01-07 | 2022-02-11 | 深圳市海清视讯科技有限公司 | Coal mine personnel behavior detection method, equipment and storage medium |
CN114549413A (en) * | 2022-01-19 | 2022-05-27 | 华东师范大学 | Multi-scale fusion full convolution network lymph node metastasis detection method based on CT image |
CN114155475B (en) * | 2022-01-24 | 2022-05-17 | 杭州晨鹰军泰科技有限公司 | Method, device and medium for identifying end-to-end personnel actions under view angle of unmanned aerial vehicle |
CN114155475A (en) * | 2022-01-24 | 2022-03-08 | 杭州晨鹰军泰科技有限公司 | Method, device and medium for recognizing end-to-end personnel actions under view angle of unmanned aerial vehicle |
CN114529825B (en) * | 2022-04-24 | 2022-07-22 | 城云科技(中国)有限公司 | Target detection model, method and application for fire fighting access occupied target detection |
CN114529825A (en) * | 2022-04-24 | 2022-05-24 | 城云科技(中国)有限公司 | Target detection model, method and application for fire fighting channel occupation target detection |
CN114648736A (en) * | 2022-05-18 | 2022-06-21 | 武汉大学 | Robust engineering vehicle identification method and system based on target detection |
CN114972860A (en) * | 2022-05-23 | 2022-08-30 | 郑州轻工业大学 | Target detection method based on attention-enhanced bidirectional feature pyramid network |
CN114821374A (en) * | 2022-06-27 | 2022-07-29 | 中国电子科技集团公司第二十八研究所 | Knowledge and data collaborative driving unmanned aerial vehicle aerial photography target detection method |
CN115147375B (en) * | 2022-07-04 | 2023-07-25 | 河海大学 | Concrete surface defect feature detection method based on multi-scale attention |
CN115147375A (en) * | 2022-07-04 | 2022-10-04 | 河海大学 | Concrete surface defect characteristic detection method based on multi-scale attention |
CN115100545A (en) * | 2022-08-29 | 2022-09-23 | 东南大学 | Target detection method for small parts of failed satellite under low illumination |
CN115424230B (en) * | 2022-09-23 | 2023-06-06 | 哈尔滨市科佳通用机电股份有限公司 | Method for detecting failure of vehicle door pulley derailment track, storage medium and device |
CN115424230A (en) * | 2022-09-23 | 2022-12-02 | 哈尔滨市科佳通用机电股份有限公司 | Fault detection method for vehicle door pulley out-of-track, storage medium and equipment |
CN116468730B (en) * | 2023-06-20 | 2023-09-05 | 齐鲁工业大学(山东省科学院) | Aerial Insulator Image Defect Detection Method Based on YOLOv5 Algorithm |
CN116468730A (en) * | 2023-06-20 | 2023-07-21 | 齐鲁工业大学(山东省科学院) | Aerial insulator image defect detection method based on YOLOv5 algorithm |
CN117474861A (en) * | 2023-10-31 | 2024-01-30 | 东北石油大学 | Surface mounting special-shaped element parameter extraction method and system based on improved RetinaNet and Canny-Franklin moment sub-pixels |
CN117671473A (en) * | 2024-02-01 | 2024-03-08 | 中国海洋大学 | Underwater target detection model and method based on attention and multi-scale feature fusion |
CN117671473B (en) * | 2024-02-01 | 2024-05-07 | 中国海洋大学 | Underwater target detection model and method based on attention and multi-scale feature fusion |
Also Published As
Publication number | Publication date |
---|---|
CN111401201B (en) | 2023-06-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111401201B (en) | Aerial image multi-scale target detection method based on spatial pyramid attention drive | |
CN112200161B (en) | Face recognition detection method based on mixed attention mechanism | |
CN109993082B (en) | Convolutional neural network road scene classification and road segmentation method | |
CN109857889B (en) | Image retrieval method, device and equipment and readable storage medium | |
EP3690741A2 (en) | Method for automatically evaluating labeling reliability of training images for use in deep learning network to analyze images, and reliability-evaluating device using the same | |
CN113780296A (en) | Remote sensing image semantic segmentation method and system based on multi-scale information fusion | |
CN113298815A (en) | Semi-supervised remote sensing image semantic segmentation method and device and computer equipment | |
CN113591872A (en) | Data processing system, object detection method and device | |
CN104778699B (en) | Tracking method with adaptive object features | |
CN113111968A (en) | Image recognition model training method and device, electronic equipment and readable storage medium | |
CN115797929A (en) | Small farmland image segmentation method and device based on dual attention mechanism | |
CN115620393A (en) | Fine-grained pedestrian behavior recognition method and system oriented to automatic driving | |
CN116524189A (en) | High-resolution remote sensing image semantic segmentation method based on coding and decoding indexing edge characterization | |
CN111339967A (en) | Pedestrian detection method based on multi-view graph convolution network | |
CN114821341A (en) | Remote sensing small target detection method based on double attention of FPN and PAN network | |
CN117333948A (en) | End-to-end multi-target broiler behavior identification method integrating space-time attention mechanism | |
CN112949380A (en) | Intelligent underwater target identification system based on laser radar point cloud data | |
CN115731517B (en) | Crowded crowd detection method based on crown-RetinaNet network | |
CN115862119A (en) | Human face age estimation method and device based on attention mechanism | |
CN115359091A (en) | Armor plate detection tracking method for mobile robot | |
CN115223245A (en) | Method, system, equipment and storage medium for detecting and clustering behavior of tourists in scenic spot | |
CN115331254A (en) | Anchor frame-free example portrait semantic analysis method | |
Motwake et al. | Enhancing land cover classification in remote sensing imagery using an optimal deep learning model | |
CN117523428B (en) | Ground target detection method and device based on aircraft platform | |
CN117576665B (en) | Automatic driving-oriented single-camera three-dimensional target detection method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |