CN112861720B - Remote sensing image small sample target detection method based on prototype convolutional neural network - Google Patents

Remote sensing image small sample target detection method based on prototype convolutional neural network

Info

Publication number: CN112861720B
Application number: CN202110172985.6A
Authority: CN (China)
Prior art keywords: image, prototype, category, target detection, images
Other versions: CN112861720A
Other languages: Chinese (zh)
Legal status: Active (granted)
Inventors: 程塨, 施佩珍, 闫博唯, 姚西文, 韩军伟, 郭雷
Current Assignee: Northwestern Polytechnical University
Original Assignee: Northwestern Polytechnical University
Application filed by Northwestern Polytechnical University; priority to CN202110172985.6A (filing date 2021-02-08)
Publication of CN112861720A: 2021-05-28
Application granted; publication of CN112861720B: 2024-05-14


Classifications

    • G06V20/13 Satellite images (Scenes; Terrestrial scenes)
    • G06F18/24 Classification techniques (Pattern recognition; Analysing)
    • G06N3/04 Neural networks: architecture, e.g. interconnection topology
    • G06N3/08 Neural networks: learning methods
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI] (Image preprocessing)
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT] (Descriptors for shape, contour or point-related features)
    • G06V2201/07 Target detection (Indexing scheme relating to image or video recognition or understanding)

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Remote Sensing (AREA)
  • Astronomy & Astrophysics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a remote sensing image small sample target detection method based on a prototype convolutional neural network. The method constructs a target detection network mainly comprising a feature extraction and category prototype acquisition module, a prototype-guided RPN module, a redirection feature map module and a detector module; the network first undergoes base learning on a base-class dataset containing a large number of annotated samples, is then fine-tuned on a balanced sub-dataset, and finally post-processing operations such as non-maximum suppression are applied to achieve multi-class target detection in small-sample remote sensing images. Using only a small amount of newly annotated data, the method can rapidly and accurately detect targets of different categories in optical remote sensing images with complex backgrounds, achieving high detection precision and fast detection speed.

Description

Remote sensing image small sample target detection method based on prototype convolutional neural network
Technical Field
The invention belongs to the technical field of remote sensing image processing, and particularly relates to a remote sensing image small sample target detection method based on a prototype convolutional neural network, which is applicable to remote sensing image target detection in small sample settings where only a few annotated samples are available.
Background
With the continuous development of satellite remote sensing technology, acquiring massive remote sensing imagery has become increasingly easy. However, annotating remote sensing images requires considerable manpower and financial resources, and some target categories are rare, so their data are difficult to collect. How to achieve target detection in remote sensing images with only a small number of annotated samples is therefore one of the pressing problems to be solved.
Existing small sample target detection methods can be grouped into three categories: meta-learning-based, metric-learning-based, and fine-tuning-based target detection. Meta-learning-based target detection constructs different tasks at each iteration so that the network gains strong generalization ability and can quickly adapt to new tasks with new categories, thereby accomplishing small sample detection. Metric-learning-based target detection learns a metric space in which objects of the same category are as close as possible and objects of different categories are as far apart as possible, where the "distance" is computed with measures such as the common Euclidean distance or cosine similarity. Fine-tuning-based target detection first trains an initial model and then builds a balanced dataset to fine-tune the model parameters, thereby achieving small sample detection. However, besides characteristics such as large scale variation and dense arrangement, remote sensing images are also affected by illumination, cloud cover, target appearance and complex background environments, so they differ substantially from natural scene images. These issues pose a great challenge to small sample target detection in optical remote sensing images, and existing small sample target detection methods are difficult to apply directly to remote sensing images.
Disclosure of Invention
In order to overcome the shortcomings of the prior art, the invention provides a remote sensing image small sample target detection method based on a prototype convolutional neural network. The method constructs a target detection network mainly comprising a feature extraction and category prototype acquisition module, a prototype-guided RPN module, a redirection feature map module and a detector module. The feature extraction and category prototype acquisition module extracts features from the support set images and the query set images with a parameter-shared deep convolutional neural network and applies global average pooling to the support set feature maps to obtain the category prototypes; the prototype-guided RPN module uses the obtained category prototypes to guide the region proposal network in generating regions of interest; the redirection feature map module uses a channel attention mechanism to reshape the channel information of the foreground candidate box features; and the detector module classifies and regresses the reshaped features and outputs detection boxes containing the predicted target category and position. The network is first base-trained on a base-class dataset, then fine-tuned on a balanced sub-dataset containing both base-class and new-class samples, and finally post-processing operations such as non-maximum suppression are applied to achieve multi-class target detection in small-sample remote sensing images. Using only a small amount of newly annotated data, the method can rapidly and accurately detect targets of different categories in optical remote sensing images with complex backgrounds, achieving high detection precision and fast detection speed.
A remote sensing image small sample target detection method based on a prototype convolutional neural network is characterized by comprising the following steps:
Step 1, base training data preparation: form the support set images from the target instance images of all categories in the base-class training image dataset D_base, where a target instance image is the image patch inside a target instance bounding box in an image; randomly extract whole images containing target instances of any category and number from the base-class training image dataset D_base as query set images; then preprocess all support set images and query set images, and use the preprocessed images as training data. The preprocessing consists of normalizing all images, resizing the support set images to M×M, and resizing the query set images to M'×M', where M' is 0.8 to 1.2 times the size of the base training images and M is 0.28 times M'; the base training image dataset is the remote sensing image target detection DIOR dataset;
Step 2, target detection network construction: the target detection network mainly comprises a feature extraction and category prototype acquisition module, a prototype-guided RPN module, a redirection feature map module and a detector module;
The feature extraction and category prototype acquisition module operates as follows: first, the support set images and the query set images are input, and features are extracted from each with a feature extraction backbone network B to obtain the corresponding feature maps; the feature maps of the support set images are denoted {F_s,1, F_s,2, ..., F_s,C}, where the feature extraction backbone network B adopts the convolutional layers of the first four stages of a ResNet-101 network, F_s,i denotes the feature map of the support images of the i-th category, i = 1, 2, ..., C, and C is the number of categories; then, global average pooling is applied to the support set feature maps {F_s,1, F_s,2, ..., F_s,C} to obtain the prototype vector of each category {p_1, p_2, ..., p_C}, where p_i denotes the prototype vector of the i-th category, i = 1, 2, ..., C;
The prototype-guided RPN module generates regions of interest that may contain targets, as follows: the category prototype vectors {p_1, p_2, ..., p_C} obtained by the feature extraction and category prototype acquisition module are fed into a three-layer fully connected network, which outputs a vector of the same length as the flattened convolution kernels of the RPN classifier; this vector is reshaped to the same shape as the RPN classifier kernels to form a new set of convolution kernel parameters, which serve as the parameters of an auxiliary classifier; the auxiliary classifier and the RPN classifier each assign foreground-background scores to the anchors on the query image feature map, and the two scores are added to give each anchor's foreground target score; then, the label of each anchor (foreground, background, or ignored sample) is determined according to the anchor assignment rule of the RPN; the foreground target scores are sorted from high to low, and the r highest-scoring anchors are adjusted by the RPN regressor to obtain r regions of interest;
The redirection feature map module extracts features from the regions of interest produced by the prototype-guided RPN module using the RoI Align operation of the Mask R-CNN network, obtaining region-of-interest feature maps {F_1, F_2, ..., F_r}, where F_i denotes the feature map of the i-th region of interest, i = 1, 2, ..., r, and r is the number of regions of interest; then, the category prototype vectors are multiplied channel by channel with the region-of-interest feature maps to obtain the redirected feature maps;
The detector module detects the redirected feature maps output by the redirection feature map module with the second-stage detector of the Faster R-CNN network and outputs detection boxes containing the predicted target category and position; the classification loss of the detector module is the cross-entropy loss, and its regression loss is the Smooth L1 loss;
Step 3, target detection network training: input the preprocessed support set images and query set images obtained in step 1 into the target detection network constructed in step 2 for training, obtaining a target detection network trained on the base-class dataset;
Step 4, fine-tuning training data preparation: first, randomly extract 3N annotated sample images of each category from the base-class training image dataset D_base and combine them with the new-class training image dataset D_novel to form the fine-tuning training image dataset D_few; the new-class image dataset contains no more than 30 annotated samples, and N is the number of annotated samples per category;
Then, perform the processing of step 1 with the fine-tuning training image dataset D_few in place of the base-class training image dataset D_base, yielding the fine-tuning training data;
Step 5, target detection network fine-tuning: input the support set images and query set images of the fine-tuning training data obtained in step 4 into the trained target detection network obtained in step 3 and train the network again, obtaining the target detection network trained on the fine-tuning dataset;
Then, input the annotated samples of each category in the dataset D_few constructed in step 4 into the trained feature extraction backbone network B, obtain feature representative vectors by global average pooling, and take the mean of all feature representative vectors of a category as that category's prototype vector; in this way, each category obtains one prototype vector, giving C_few prototype vectors, where C_few is the number of image categories contained in the dataset D_few;
Step 6, target detection: input the preprocessed data to be detected as query set images into the feature extraction backbone network B trained in step 5 to obtain query image features; input the C_few category prototype vectors obtained in step 5 and the query image features into the trained prototype-guided RPN module, and obtain detection boxes containing the predicted target category and position through the trained redirection feature map module and detector module; then filter out redundant detection boxes by non-maximum suppression, and the remaining detection boxes are the final target detection result for the data to be detected.
The beneficial effects of the invention are as follows: the feature extraction and category prototype acquisition module with a shared backbone network effectively alleviates overfitting, saves memory, and increases computation speed; using the prototypes to guide the RPN yields higher-quality regions of interest, which benefits the subsequent detector; the feature redirection processing further improves detection precision; and because the target detection network is trained successively on the base training data and the fine-tuning training data, it can accomplish target detection in complex-background remote sensing images with only a small number of new annotated samples, with high detection precision and good robustness.
Drawings
FIG. 1 is a flowchart of the remote sensing image small sample target detection method based on a prototype convolutional neural network;
FIG. 2 is a two-stage training schematic of the method of the present invention;
FIG. 3 is an exemplary diagram of the detection results of the method of the present invention.
Detailed Description
The invention is further described below with reference to the figures and embodiments; the invention includes but is not limited to the following embodiments.
The hardware environment for implementation is a computer with an Intel(R) Core(TM) i3-8100 CPU and 8.0 GB of memory; the software environment is Ubuntu 16.04.5 LTS and PyCharm 2018. The experiments use the large-scale public remote sensing image database DIOR, which contains 23463 images in total: 5862 images form the training set, 5863 images form the validation set, and the remaining 11738 images form the test set. It covers 20 common remote sensing target categories, each containing approximately 1200 images, with an image size of 800 x 800 pixels and spatial resolutions ranging from 0.5 m/pixel to 30 m/pixel. To verify the effectiveness of the proposed solution, baseball fields, basketball courts, bridges, chimneys, ships, airplanes, airports, highway toll stations, wharves, track and field fields, dams, golf courses, oil storage tanks, tennis courts and vehicles are taken as base classes, and highway service areas, overpasses, stadiums, train stations and windmills are taken as new classes. The 11725 images of the training and validation sets are used for training; according to the base/new class split, images containing new classes are removed from these 11725 images, and the remaining 8573 images form the base-class training dataset D_base. The new classes have 10 annotated samples per class, and these are combined with 10 annotated samples randomly extracted from each base class to construct the small sample dataset D_few. Finally, the remaining 11738 images are used as the test set.
As shown in fig. 1, the specific implementation process of the remote sensing image small sample target detection method based on the prototype convolutional neural network is as follows:
1. Base class training data preparation
First, the support set images and the query set images are constructed from the base-class training image dataset D_base. The query set images are obtained directly by random extraction from D_base, i.e., whole images containing target instances of any category and number are randomly extracted as query set images. The support set images are composed of the target instance images of all categories in D_base, where a target instance image is the image patch inside a target instance bounding box; these instances are obtained by cropping them out of the whole remote sensing image using its ground-truth bounding boxes.
Then, the support set images and the query set images are preprocessed as follows: (1) each image is normalized channel by channel using the means R_mean, G_mean, B_mean and standard deviations R_std, G_std, B_std of its three RGB channel components:
I'_p,c = (I_p,c - Mean_c) / Std_c
where I_p,c denotes the c-channel component of the image before normalization, I'_p,c denotes the c-channel component after normalization, Mean_c denotes the mean of channel c of the image, and Std_c denotes the standard deviation of channel c of the image.
(2) The support set images are resized to M×M and the query set images are resized to M'×M', where M' is 0.8 to 1.2 times the original image size in D_base and M is 0.28 times M'.
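As an illustration of the preprocessing above, the following Python sketch normalizes each image with its own per-channel RGB statistics and resizes the support and query images; the use of OpenCV, the function names, and the choice of M' equal to the base image size are assumptions made only for this example, not part of the patented method.

```python
import numpy as np
import cv2

def normalize(img):
    """Per-channel normalization with the image's own RGB mean and std,
    i.e. I'_p,c = (I_p,c - Mean_c) / Std_c."""
    img = img.astype(np.float32)
    mean = img.reshape(-1, 3).mean(axis=0)       # R_mean, G_mean, B_mean
    std = img.reshape(-1, 3).std(axis=0) + 1e-6  # R_std, G_std, B_std (epsilon avoids division by zero)
    return (img - mean) / std

def preprocess(support_imgs, query_imgs, base_size=800):
    """Support images are resized to M x M, query images to M' x M',
    with M' in 0.8-1.2x the base image size and M = 0.28 * M'."""
    m_prime = base_size                 # one admissible choice of M' (assumption)
    m = int(0.28 * m_prime)             # e.g. 224 when M' = 800
    support = [cv2.resize(normalize(s), (m, m)) for s in support_imgs]
    query = [cv2.resize(normalize(q), (m_prime, m_prime)) for q in query_imgs]
    return support, query
```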
2. Building a target detection network
The target detection network mainly comprises a feature extraction and category prototype acquisition module, a prototype-guided RPN module, a redirection feature map module and a detector module.
(1) Feature extraction and category prototype acquisition module
The present invention adopts the convolutional layers of the first four stages of a ResNet-101 network as the feature extraction backbone network B. The ResNet-101 architecture is described in K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770-778.
First, the support set images and the query set images are input, and features are extracted from each with the feature extraction backbone network B to obtain the corresponding feature maps; the feature maps of the support set images are denoted {F_s,1, F_s,2, ..., F_s,C}, where F_s,i denotes the feature map of the support images of the i-th category, i = 1, 2, ..., C, and C is the number of categories contained in the support set images. Then, global average pooling is applied to the support set feature maps {F_s,1, F_s,2, ..., F_s,C} to obtain the prototype vector of each category {p_1, p_2, ..., p_C}, where p_i denotes the prototype vector of the i-th category, i = 1, 2, ..., C.
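A minimal PyTorch-style sketch of this step is given below; the name `backbone`, the per-category batching of support instances, and the averaging over several instances of one category are illustrative assumptions, not fixed by the patent.

```python
import torch
import torch.nn.functional as F

def class_prototypes(backbone, support_images_by_class):
    """support_images_by_class: list of C tensors, each of shape (N_i, 3, M, M)
    holding the support instance images of one category.
    Returns a (C, D) tensor of category prototype vectors {p_1, ..., p_C}."""
    prototypes = []
    for imgs in support_images_by_class:
        feats = backbone(imgs)                        # feature maps F_s,i of shape (N_i, D, h, w)
        pooled = F.adaptive_avg_pool2d(feats, 1)      # global average pooling -> (N_i, D, 1, 1)
        prototypes.append(pooled.flatten(1).mean(0))  # average over the category's instances -> (D,)
    return torch.stack(prototypes)                    # (C, D)
```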
(2) Prototype-guided RPN module
The prototype-guided RPN module generates regions of interest that may contain targets, as follows: the category prototype vectors {p_1, p_2, ..., p_C} obtained by the feature extraction and category prototype acquisition module are fed into a three-layer fully connected network, which outputs a vector of the same length as the flattened convolution kernels of the RPN classifier; this vector is reshaped to the same shape as the RPN classifier kernels to form a new set of convolution kernel parameters, which serve as the parameters of an auxiliary classifier. The auxiliary classifier and the RPN classifier each assign foreground-background scores to the anchors on the query image feature map, and the two scores are added to give each anchor's foreground target score. Then, the label of each anchor (foreground, background, or ignored sample) is determined according to the anchor assignment rule of the RPN. The foreground target scores are sorted from high to low, and the r highest-scoring anchors are adjusted by the RPN regressor to obtain r regions of interest; in this embodiment, r is 256. A code sketch of this prototype-guided scoring is given after the reference below.
The RPN is described in S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 39, no. 6, pp. 1137-1149, 2017.
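The following sketch illustrates how a category prototype can be turned into an auxiliary RPN classifier whose score is added to the standard RPN objectness score, as described above; the hidden widths of the three-layer fully connected network, the per-prototype loop, and the omission of a bias term in the auxiliary convolution are assumptions made only for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PrototypeGuidedRPNScore(nn.Module):
    def __init__(self, proto_dim, rpn_cls: nn.Conv2d):
        super().__init__()
        self.rpn_cls = rpn_cls                      # standard 1x1 RPN classification conv
        kernel_numel = rpn_cls.weight.numel()       # length of the flattened RPN kernels
        # Three fully connected layers mapping a prototype to the flattened kernel length.
        self.mlp = nn.Sequential(
            nn.Linear(proto_dim, 1024), nn.ReLU(inplace=True),
            nn.Linear(1024, 1024), nn.ReLU(inplace=True),
            nn.Linear(1024, kernel_numel),
        )

    def forward(self, query_feat, prototypes):
        """query_feat: (1, D, H, W) RPN feature map of the query image;
        prototypes: (C, proto_dim). Returns one summed objectness map per prototype."""
        scores = []
        for p in prototypes:
            aux_w = self.mlp(p).view_as(self.rpn_cls.weight)   # reshape to the RPN kernel shape
            aux_score = F.conv2d(query_feat, aux_w)            # auxiliary classifier score
            std_score = self.rpn_cls(query_feat)               # standard RPN classifier score
            scores.append(aux_score + std_score)               # foreground score = sum of the two
        return scores
```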
(3) Redirection feature map module
First, the redirection feature map module extracts features from the regions of interest produced by the prototype-guided RPN module using the RoI Align operation proposed in the Mask R-CNN network (Kaiming He et al., 2017), obtaining region-of-interest feature maps {F_1, F_2, ..., F_r}, where F_i denotes the feature map of the i-th region of interest, i = 1, 2, ..., r, and r is the number of regions of interest. Then, the category prototype vectors are multiplied channel by channel with the region-of-interest feature maps to obtain the redirected feature maps {F_1,1, F_1,2, ..., F_1,C, F_2,1, F_2,2, ..., F_2,C, ..., F_r,1, F_r,2, ..., F_r,C}, where F_i,j denotes the redirected feature map obtained by multiplying the feature map of the i-th region of interest channel by channel with the prototype vector of category j.
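The channel-by-channel multiplication can be written compactly with broadcasting, as in the sketch below; the tensor names and the assumption that the prototype dimension equals the RoI feature channel count are illustrative.

```python
import torch

def redirect_features(roi_feats, prototypes):
    """roi_feats: (r, D, s, s) RoI Align outputs {F_1, ..., F_r};
    prototypes: (C, D) category prototypes {p_1, ..., p_C}.
    Returns the redirected maps F_i,j with shape (r, C, D, s, s)."""
    r, d, s, _ = roi_feats.shape
    c = prototypes.shape[0]
    roi = roi_feats.unsqueeze(1)            # (r, 1, D, s, s)
    proto = prototypes.view(1, c, d, 1, 1)  # (1, C, D, 1, 1), broadcast over spatial positions
    return roi * proto                      # channel-wise reweighting of every RoI by every prototype
```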
(4) Detector module
The detector module detects the redirected feature maps output by the redirection feature map module with the second-stage detector of the Faster R-CNN network and outputs detection boxes containing the predicted target category and position; the classification loss of the detector module is the cross-entropy loss, and its regression loss is the Smooth L1 loss. A sketch of the combined loss is given after the reference below.
The Faster R-CNN is described in S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 39, no. 6, pp. 1137-1149, 2017.
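For reference, the two losses named above can be combined as in this sketch; the equal weighting of the two terms and the tensor shapes are assumptions made only for illustration.

```python
import torch.nn.functional as F

def detector_loss(cls_logits, cls_targets, box_preds, box_targets):
    """cls_logits: (n, C+1) class scores with background as class 0; cls_targets: (n,) labels;
    box_preds / box_targets: (n_pos, 4) regression offsets for positive RoIs."""
    loss_cls = F.cross_entropy(cls_logits, cls_targets)   # classification: cross-entropy loss
    loss_reg = F.smooth_l1_loss(box_preds, box_targets)   # regression: Smooth L1 loss
    return loss_cls + loss_reg
```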
3. Training target detection network
The preprocessed support set images and query set images obtained in step 1 are input into the target detection network constructed in step 2 for training, yielding the target detection network trained on the base-class dataset.
4. Fine tuning training data preparation
A dataset D_few containing both base-class and new-class annotated samples is constructed as follows: 3N annotated sample images of each category are randomly extracted from the base-class training image dataset D_base and combined with the new-class training image dataset D_novel to form the fine-tuning training image dataset D_few; the new-class image dataset contains only a small number of annotated samples (no more than 30), and N is the number of annotated samples per category; in this embodiment, N = 10.
Then, the processing of step 1 is performed with the fine-tuning training image dataset D_few in place of the base-class training image dataset D_base, yielding the fine-tuning training data.
5. Fine tuning of a target detection network
The fine-tuning training data obtained in step 4 likewise comprise preprocessed support set images and query set images. These are input into the target detection network trained on the base-class dataset obtained in step 3, and the network is trained again for fine-tuning, yielding the target detection network trained on the fine-tuning dataset, as shown in FIG. 2.
Then, the annotated samples of each category in the dataset D_few are input into the retrained feature extraction backbone network B, feature representative vectors are obtained by global average pooling, and the mean of all feature representative vectors of a category is taken as that category's prototype vector. In this way, each category obtains one prototype vector, giving C_few prototype vectors, where C_few is the number of image categories contained in the dataset D_few; in this embodiment, C_few is 20.
6. Target detection
The data to be detected, preprocessed according to the method of step 1, are fed as query set images into the retrained backbone network B; combined with the 20 category prototype vectors obtained in step 5, they pass through the retrained prototype-guided RPN module, redirection feature map module and detector module to obtain detection boxes predicting the category and position of the targets in the query images. Non-maximum suppression is then used to filter out redundant detection boxes and produce the final detection result for each image, with the score threshold set to 0.3 and the NMS overlap (IoU) threshold set to 0.5.
The NMS method is described in A. Neubeck and L. Van Gool, "Efficient Non-Maximum Suppression," 18th International Conference on Pattern Recognition, 2006, pp. 850-855.
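A post-processing sketch with the thresholds of this embodiment (score threshold 0.3, NMS IoU threshold 0.5) is shown below; using torchvision's NMS operator and applying it per category are assumptions standing in for the referenced NMS method.

```python
import torch
from torchvision.ops import nms

def postprocess(boxes, scores, labels, score_thr=0.3, iou_thr=0.5):
    """boxes: (n, 4) in (x1, y1, x2, y2); scores: (n,); labels: (n,) predicted categories."""
    keep = scores > score_thr                              # drop low-confidence detections
    boxes, scores, labels = boxes[keep], scores[keep], labels[keep]
    kept = []
    for c in labels.unique():                              # class-wise non-maximum suppression
        idx = (labels == c).nonzero(as_tuple=True)[0]
        kept.append(idx[nms(boxes[idx], scores[idx], iou_thr)])
    kept = torch.cat(kept) if kept else torch.empty(0, dtype=torch.long)
    return boxes[kept], scores[kept], labels[kept]
```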
FIG. 3 shows an example of the detection results obtained by the invention. To verify the effectiveness of the method, the mAP value is used to measure the detection results; it lies between 0 and 1, and a larger value indicates better detection. The calculation of mAP is described in M. Everingham, S. M. A. Eslami, L. Van Gool, et al., "The PASCAL Visual Object Classes Challenge: A Retrospective," International Journal of Computer Vision, 2015, pp. 98-136.
The detection results of the proposed method are compared with those of the small sample target detection method Meta R-CNN in Table 1; the proposed method achieves higher mAP values on both the base classes and the new classes, i.e., higher detection precision.
TABLE 1
Method                     Base class mAP    New class mAP
Meta R-CNN                 52.3%             17.0%
Method of the invention    52.6%             18.6%

Claims (1)

1. A remote sensing image small sample target detection method based on a prototype convolutional neural network is characterized by comprising the following steps:
Step 1, base training data preparation: forming the support set images from the target instance images of all categories in the base-class training image dataset D_base, wherein a target instance image is the image patch inside a target instance bounding box in an image; randomly extracting whole images containing target instances of any category and number from the base-class training image dataset D_base as query set images; then preprocessing all support set images and query set images, the preprocessed images being used as training data; the preprocessing comprising normalizing all images, resizing the support set images to M×M, and resizing the query set images to M'×M', wherein M' is 0.8 to 1.2 times the size of the base training images and M is 0.28 times M'; the base training image dataset being the remote sensing image target detection DIOR dataset;
Step 2, target detection network construction: the target detection network mainly comprises a feature extraction and category prototype acquisition module, a prototype-guided RPN module, a redirection feature map module and a detector module;
the feature extraction and category prototype acquisition module operates as follows: first, the support set images and the query set images are input, and features are extracted from each with a feature extraction backbone network B to obtain the corresponding feature maps, the feature maps of the support set images being denoted {F_s,1, F_s,2, ..., F_s,C}, wherein the feature extraction backbone network B adopts the convolutional layers of the first four stages of a ResNet-101 network, F_s,i denotes the feature map of the support images of the i-th category, i = 1, 2, ..., C, and C is the number of categories; then, global average pooling is applied to the support set feature maps {F_s,1, F_s,2, ..., F_s,C} to obtain the prototype vector of each category {p_1, p_2, ..., p_C}, wherein p_i denotes the prototype vector of the i-th category, i = 1, 2, ..., C;
the prototype-guided RPN module generates regions of interest that may contain targets, as follows: the category prototype vectors {p_1, p_2, ..., p_C} obtained by the feature extraction and category prototype acquisition module are fed into a three-layer fully connected network, which outputs a vector of the same length as the flattened convolution kernels of the RPN classifier; this vector is reshaped to the same shape as the RPN classifier kernels to form a new set of convolution kernel parameters, which serve as the parameters of an auxiliary classifier; the auxiliary classifier and the RPN classifier each assign foreground-background scores to the anchors on the query image feature map, and the two scores are added to give each anchor's foreground target score; then, the label of each anchor, namely foreground, background, or ignored sample, is determined according to the anchor assignment rule of the RPN; the foreground target scores are sorted from high to low, and the r highest-scoring anchors are adjusted by the RPN regressor to obtain r regions of interest;
the redirection feature map module extracts features from the regions of interest produced by the prototype-guided RPN module using the RoI Align operation of the Mask R-CNN network, obtaining region-of-interest feature maps {F_1, F_2, ..., F_r}, wherein F_i denotes the feature map of the i-th region of interest, i = 1, 2, ..., r, and r is the number of regions of interest; then, the category prototype vectors are multiplied channel by channel with the region-of-interest feature maps to obtain the redirected feature maps;
the detector module detects the redirected feature maps output by the redirection feature map module with the second-stage detector of the Faster R-CNN network and outputs detection boxes containing the predicted target category and position, wherein the classification loss of the detector module is the cross-entropy loss and its regression loss is the Smooth L1 loss;
Step 3, target detection network training: inputting the preprocessed support set images and query set images obtained in step 1 into the target detection network constructed in step 2 for training, obtaining a target detection network trained on the base-class dataset;
Step 4, fine-tuning training data preparation: first, randomly extracting 3N annotated sample images of each category from the base-class training image dataset D_base and combining them with the new-class training image dataset D_novel to form the fine-tuning training image dataset D_few, wherein the new-class image dataset contains no more than 30 annotated samples and N is the number of annotated samples per category;
then, performing the processing of step 1 with the fine-tuning training image dataset D_few in place of the base-class training image dataset D_base, yielding the fine-tuning training data;
Step 5, target detection network fine-tuning: inputting the support set images and query set images of the fine-tuning training data obtained in step 4 into the trained target detection network obtained in step 3 and training the network again, obtaining the target detection network trained on the fine-tuning dataset;
then, inputting the annotated samples of each category in the dataset D_few constructed in step 4 into the trained feature extraction backbone network B, obtaining feature representative vectors by global average pooling, and taking the mean of all feature representative vectors of a category as that category's prototype vector; in this way, each category obtains one prototype vector, giving C_few prototype vectors, wherein C_few is the number of image categories contained in the dataset D_few;
Step 6, target detection: inputting the preprocessed data to be detected as query set images into the feature extraction backbone network B trained in step 5 to obtain query image features; inputting the C_few category prototype vectors obtained in step 5 and the query image features into the trained prototype-guided RPN module, and obtaining detection boxes containing the predicted target category and position through the trained redirection feature map module and detector module; then filtering out redundant detection boxes by non-maximum suppression, the remaining detection boxes being the final target detection result for the data to be detected.
CN202110172985.6A 2021-02-08 2021-02-08 Remote sensing image small sample target detection method based on prototype convolutional neural network Active CN112861720B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110172985.6A CN112861720B (en) 2021-02-08 2021-02-08 Remote sensing image small sample target detection method based on prototype convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110172985.6A CN112861720B (en) 2021-02-08 2021-02-08 Remote sensing image small sample target detection method based on prototype convolutional neural network

Publications (2)

Publication Number Publication Date
CN112861720A (en) 2021-05-28
CN112861720B (en) 2024-05-14

Family

ID=75989205

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110172985.6A Active CN112861720B (en) 2021-02-08 2021-02-08 Remote sensing image small sample target detection method based on prototype convolutional neural network

Country Status (1)

Country Link
CN (1) CN112861720B (en)

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113283513B (en) * 2021-05-31 2022-12-13 西安电子科技大学 Small sample target detection method and system based on target interchange and metric learning
CN113378936B (en) * 2021-06-11 2024-03-08 长沙军民先进技术研究有限公司 Faster RCNN-based few-sample target detection method
CN113408546B (en) * 2021-06-21 2023-03-07 武汉工程大学 Single-sample target detection method based on mutual global context attention mechanism
CN113780272A (en) * 2021-07-02 2021-12-10 北京建筑大学 SAR image ship detection method and device, electronic equipment and storage medium
CN113743455A (en) * 2021-07-23 2021-12-03 北京迈格威科技有限公司 Target retrieval method, device, electronic equipment and storage medium
CN113642574B (en) * 2021-07-30 2022-11-29 中国人民解放军军事科学院国防科技创新研究院 Small sample target detection method based on feature weighting and network fine tuning
CN113705570B (en) * 2021-08-31 2023-12-08 长沙理工大学 Deep learning-based few-sample target detection method
CN114124437B (en) * 2021-09-28 2022-09-23 西安电子科技大学 Encrypted flow identification method based on prototype convolutional network
CN113822368B (en) * 2021-09-29 2023-06-20 成都信息工程大学 Anchor-free incremental target detection method
CN114169442B (en) * 2021-12-08 2022-12-09 中国电子科技集团公司第五十四研究所 Remote sensing image small sample scene classification method based on double prototype network
CN114399644A (en) * 2021-12-15 2022-04-26 北京邮电大学 Target detection method and device based on small sample
CN114219804B (en) * 2022-02-22 2022-05-24 汉斯夫(杭州)医学科技有限公司 Small sample tooth detection method based on prototype segmentation network and storage medium
CN114743045B (en) * 2022-03-31 2023-09-26 电子科技大学 Small sample target detection method based on double-branch area suggestion network
CN115049870A (en) * 2022-05-07 2022-09-13 电子科技大学 Target detection method based on small sample
CN115049944B (en) * 2022-06-02 2024-05-28 北京航空航天大学 Small sample remote sensing image target detection method based on multitasking optimization
CN114861842B (en) * 2022-07-08 2022-10-28 中国科学院自动化研究所 Few-sample target detection method and device and electronic equipment
CN115100532B (en) * 2022-08-02 2023-04-07 北京卫星信息工程研究所 Small sample remote sensing image target detection method and system
CN115115898B (en) * 2022-08-31 2022-11-15 南京航空航天大学 Small sample target detection method based on unsupervised feature reconstruction
CN115984621B (en) * 2023-01-09 2023-07-11 宁波拾烨智能科技有限公司 Small sample remote sensing image classification method based on restrictive prototype comparison network
CN116310894B (en) * 2023-02-22 2024-04-16 中交第二公路勘察设计研究院有限公司 Unmanned aerial vehicle remote sensing-based intelligent recognition method for small-sample and small-target Tibetan antelope
CN116129226B (en) * 2023-04-10 2023-07-25 之江实验室 Method and device for detecting few-sample targets based on multi-prototype mixing module
CN117409250B (en) * 2023-10-27 2024-04-30 北京信息科技大学 Small sample target detection method, device and medium


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109063594A (en) * 2018-07-13 2018-12-21 吉林大学 Remote sensing images fast target detection method based on YOLOv2
CN109919108A (en) * 2019-03-11 2019-06-21 西安电子科技大学 Remote sensing images fast target detection method based on depth Hash auxiliary network
CN110189304A (en) * 2019-05-07 2019-08-30 南京理工大学 Remote sensing image target on-line quick detection method based on artificial intelligence
CN111160249A (en) * 2019-12-30 2020-05-15 西北工业大学深圳研究院 Multi-class target detection method of optical remote sensing image based on cross-scale feature fusion
CN111797676A (en) * 2020-04-30 2020-10-20 南京理工大学 High-resolution remote sensing image target on-orbit lightweight rapid detection method
CN111666836A (en) * 2020-05-22 2020-09-15 北京工业大学 High-resolution remote sensing image target detection method of M-F-Y type lightweight convolutional neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于特征融合与分类器在线学习的目标跟踪算法;胡秀华;郭雷;李晖晖;;控制与决策(第09期);全文 *

Also Published As

Publication number Publication date
CN112861720A (en) 2021-05-28

Similar Documents

Publication Publication Date Title
CN112861720B (en) Remote sensing image small sample target detection method based on prototype convolutional neural network
CN110363122B (en) Cross-domain target detection method based on multi-layer feature alignment
CN106778595B (en) Method for detecting abnormal behaviors in crowd based on Gaussian mixture model
CN108596055B (en) Airport target detection method of high-resolution remote sensing image under complex background
CN111709416B (en) License plate positioning method, device, system and storage medium
CN111160249A (en) Multi-class target detection method of optical remote sensing image based on cross-scale feature fusion
CN110163213B (en) Remote sensing image segmentation method based on disparity map and multi-scale depth network model
CN111353395A (en) Face changing video detection method based on long-term and short-term memory network
CN111738055B (en) Multi-category text detection system and bill form detection method based on same
CN110082821B (en) Label-frame-free microseism signal detection method and device
CN111460980B (en) Multi-scale detection method for small-target pedestrian based on multi-semantic feature fusion
CN104036284A (en) Adaboost algorithm based multi-scale pedestrian detection method
CN112016605A (en) Target detection method based on corner alignment and boundary matching of bounding box
CN109584206B (en) Method for synthesizing training sample of neural network in part surface flaw detection
CN114495010A (en) Cross-modal pedestrian re-identification method and system based on multi-feature learning
CN114360030A (en) Face recognition method based on convolutional neural network
CN111401113A (en) Pedestrian re-identification method based on human body posture estimation
CN110826411A (en) Vehicle target rapid identification method based on unmanned aerial vehicle image
CN114332473A (en) Object detection method, object detection device, computer equipment, storage medium and program product
CN116091946A (en) Yolov 5-based unmanned aerial vehicle aerial image target detection method
CN114266805A (en) Twin region suggestion network model for unmanned aerial vehicle target tracking
CN112446305A (en) Pedestrian re-identification method based on classification weight equidistant distribution loss model
CN116993760A (en) Gesture segmentation method, system, device and medium based on graph convolution and attention mechanism
Xia et al. Abnormal event detection method in surveillance video based on temporal CNN and sparse optical flow
Zeng et al. Masanet: Multi-angle self-attention network for semantic segmentation of remote sensing images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant