CN115035409A - Weak supervision remote sensing image target detection algorithm based on similarity comparison learning - Google Patents

Weak supervision remote sensing image target detection algorithm based on similarity comparison learning

Info

Publication number
CN115035409A
CN115035409A (application CN202210698556.7A; granted as CN115035409B)
Authority
CN
China
Prior art keywords
candidate frame
similarity
candidate
remote sensing
cluster
Prior art date
Legal status
Granted
Application number
CN202210698556.7A
Other languages
Chinese (zh)
Other versions
CN115035409B (en)
Inventor
Zhang Haopeng (张浩鹏)
Tan Zhiwen (谭智文)
Jiang Zhiguo (姜志国)
Xie Fengying (谢凤英)
Zhao Danpei (赵丹培)
Current Assignee
Beihang University
Original Assignee
Beihang University
Priority date
Filing date
Publication date
Application filed by Beihang University
Priority to CN202210698556.7A
Publication of CN115035409A
Application granted
Publication of CN115035409B
Status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/10 Terrestrial scenes
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/70 Arrangements using pattern recognition or machine learning
    • G06V 10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V 10/761 Proximity, similarity or dissimilarity measures
    • G06V 10/82 Arrangements using neural networks
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07 Target detection


Abstract

The invention relates to a weakly supervised remote sensing image target detection algorithm based on similarity contrastive learning. A feature extraction neural network first produces a similarity matrix, which is combined with candidate frame detection scores to form similarity candidate frame clusters, from which merging candidate frames are generated. A contrast loss for the merging candidate frames is then computed from the similarity candidate frame clusters; the merging candidate frames are input into the feature extraction neural network to obtain a multi-example loss and a refining loss, and the network is updated and trained with all three losses. Finally, the trained feature extraction neural network detects the test image: the merging candidate frames of the test image are obtained and input into the weakly supervised detection framework to produce the detection result. The disclosed algorithm generates a new candidate frame set with a more uniform size distribution and a reduced proportion of small candidate frames, and obtains discriminative candidate frame features, thereby improving the algorithm's ability to identify semantically rich candidate frames.

Description

Weak supervision remote sensing image target detection algorithm based on similarity comparison learning
Technical Field
The invention relates to the technical field of digital image processing, and in particular to a weakly supervised remote sensing image target detection algorithm based on similarity contrastive learning.
Background
Target detection is the task of finding targets of interest in an image from image features and determining their positions and categories. With the development of deep learning, target detection with convolutional neural networks trained on annotated image sets has matured. However, mainstream target detection methods such as Faster R-CNN, YOLOv3, and SSD require target-level labels: for every image in the training set, the specific position and size of each target must be given. Today remote sensing data is growing explosively, and targets in remote sensing images tend to be densely distributed and arbitrarily oriented. Obtaining target-level annotations of remote sensing images is therefore extremely time-consuming and labor-intensive.
To address the difficulty of obtaining target-level labels, researchers have proposed and developed target detection algorithms based on weakly supervised learning. Unlike traditional deep learning detectors trained with target-level labels, weakly supervised target detection uses image-level labels: the training set only states which categories of targets exist in an image, without giving their positions or sizes, as shown in fig. 2. At test time, a weakly supervised detector can still predict the location and size of targets of the categories of interest. For ever-growing remote sensing image datasets, weakly supervised target detection, which avoids fine-grained target-level labeling, therefore has strong application prospects.
However, most existing weakly supervised target detection algorithms are developed and tested on natural image datasets, and when transferred to remote sensing datasets they suffer from a severe problem: the small-frame domination phenomenon, as in fig. 3. There are two main reasons. First, mainstream weakly supervised detectors begin by extracting a large number of candidate frames from the input image with a candidate frame extraction algorithm; compared with natural images, remote sensing images have more complex backgrounds, and remote sensing targets have sharp textures and complex structures, so the extraction algorithm produces many small candidate frames. These small frames dominate the parameter updates during training, so at test time many small frames are detected as targets instead of large, information-rich detection frames that contain the whole target. Second, because of the complex background of remote sensing images, the feature extraction part of the weakly supervised detection framework struggles to learn discriminative feature representations, and under heavy background noise the algorithm has difficulty correctly predicting large, semantically rich candidate frames.
Therefore, a weakly supervised target detection algorithm that overcomes the above drawbacks is urgently needed by those skilled in the art.
Disclosure of Invention
In view of the above, the invention provides a weakly supervised remote sensing image target detection algorithm based on similarity contrastive learning, which updates and trains the original feature extraction part through a similarity merging candidate frame generation step and a contrastive learning step, thereby generating a new candidate frame set in which the proportion of small candidate frames is greatly reduced, so that more complete targets are detected.
The concrete scheme is as follows:
a weakly supervised remote sensing image target detection algorithm based on similarity contrastive learning comprises the following steps.
S11, extracting candidate frame features of the initial candidate frames of a training image with the feature extraction neural network, and calculating similarity by the cosine similarity criterion to obtain a similarity matrix;
S12, obtaining candidate frame detection scores from the MIL network branch;
S13, determining the center candidate frame indexes from the similarity matrix and the candidate frame detection scores to obtain similarity candidate frame clusters;
S14, generating merging candidate frames from the similarity candidate frame clusters;
S15, obtaining positive and negative sample sets from the similarity candidate frame clusters, and calculating the loss of each similarity candidate frame cluster with the candidate-frame-based contrast loss function according to the similarity scores between the candidate frames in the positive and negative sample sets and the center candidate frame;
S16, calculating the contrast loss of the merging candidate frames from the losses of the similarity candidate frame clusters;
S17, inputting the merging candidate frames into the feature extraction neural network, and obtaining their multi-example loss and refining loss through the MIL network branch and the refining network branch;
S18, combining the contrast loss, the multi-example loss, and the refining loss, and updating and training the feature extraction neural network by gradient back-propagation;
and S19, detecting the test image with the trained feature extraction neural network: executing steps S11 to S14 to obtain the merging candidate frames of the test image, inputting them into the feature extraction neural network, and passing them sequentially through the MIL network branch and the refining network branch to obtain the detection result.
Preferably, in S11, obtaining the similarity matrix includes:
calculating the similarity between candidate frames according to the cosine similarity formula

    sim(p_i, p_j) = (p_i · p_j) / (||p_i|| ||p_j||)

where p_i and p_j are the features of the i-th and j-th candidate frames;
obtaining the similarity matrix M_F, where M_F ∈ R^{m×m}, m is the total number of candidate frames, and the element in row i, column j of M_F is

    M_F^{ij} = sim(p_i, p_j).
Preferably, in S13, obtaining the similarity candidate frame clusters includes the following steps:
step one, setting every candidate frame to the available state;
step two, according to the candidate frame detection scores obtained by the MIL model, finding the available candidate frame with the highest score, defining it as the center candidate frame, and recording its index as Center_j,

    Center_j = argmax_{i ∈ available} q_i^c

step three, extracting column Center_j of the similarity matrix and recording it as F_j;
step four, finding the position indexes of the elements of F_j that are higher than a threshold, which together with the index of the center candidate frame form the similarity candidate frame cluster C_j;
step five, setting every candidate frame involved in the cluster to the unavailable state;
step six, repeating steps two to five on the candidate frames that remain available to obtain new similarity candidate frame clusters, until all candidate frames are unavailable or the upper limit on the number of iterations is reached.
Preferably, in S14, according to the position and size information of all candidate frames in the similarity candidate frame cluster, the minimum bounding rectangle is calculated as the new merging candidate frame, with coordinates recorded as [x_1^new, y_1^new, x_2^new, y_2^new].
Preferably, in S15, obtaining the positive and negative sample sets includes:
selecting the candidate frame corresponding to any index in the similarity candidate frame cluster as the positive sample, denoted pos_j;
the position indexes of the elements of F_j that are lower than the threshold form the negative sample index set; N_j indexes are selected from it as the negative sample set, denoted

    C_j^neg = {neg_j^1, ..., neg_j^{N_j}}.
preferably, the loss of the similarity candidate frame cluster is calculated by the following formula:
Figure BDA0003703059930000042
wherein, delta is a hyper-parameter,
Figure BDA0003703059930000043
the xth element of the Fj vector represents the cosine similarity score between the jth candidate frame feature and the xth candidate frame feature, where x is pos j Or neg j i
Preferably, the contrast loss is obtained by the following formula:

    L_contrast = (1/K) Σ_{j=1}^{K} L_j

where K is the number of similarity candidate frame clusters.
According to the technical scheme, compared with the prior art, in the disclosed algorithm the similarity candidate frame generation network uses the similarity criterion and a specially designed candidate frame generation procedure to obtain new candidate frames with a more uniform size distribution; the candidate-frame-based contrastive learning network constructs a contrast sample set and, through the candidate-frame-based contrast loss function, strengthens the feature expression ability of the feature extraction part of the weakly supervised detection framework. This improves the algorithm's ability to identify semantically rich candidate frames and in turn improves the generation quality of the similarity candidate frame generation module. With the disclosed algorithm, a new candidate frame set with a more uniform size distribution and a greatly reduced proportion of small candidate frames can be generated, so that more complete targets are detected and the detection effect is greatly improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only embodiments of the present invention, and other drawings can be obtained from them by those skilled in the art without creative effort.
FIG. 1 is a diagram of an overall framework of a weak supervised remote sensing image target detection algorithm based on similarity comparison learning, provided by the invention;
FIG. 2 is a diagram illustrating the difference between image-level labeling and target-level labeling provided by the present invention;
FIG. 3 is a diagram illustrating an example of a small frame dominance phenomenon provided by the present invention;
fig. 4 is a diagram showing a comparison between a detection result of the detection algorithm of the present invention and a detection result of the existing weak supervision detection algorithm.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention discloses a weakly supervised remote sensing image target detection algorithm based on similarity contrastive learning, which builds a similarity merging candidate frame generation network and a contrastive learning network on top of the original feature extraction part and the MIL branch, effectively solving the small-frame domination problem that arises when weakly supervised target detection methods are applied to remote sensing images.
Specifically, the proposed similarity candidate frame generation network obtains new candidate frames with a more balanced size distribution and richer semantics, effectively mitigating the excess of small noise frames produced when candidate frames are generated for remote sensing images;
the candidate-frame-based contrastive learning network improves the feature expression ability of the feature extraction part of the weakly supervised target detection framework, further improving the detection performance of the algorithm.
Meanwhile, the algorithm disclosed by the invention can be applied to the existing weak supervision target detection algorithm as a plug-in, and the detection performance of the algorithm can be improved without destroying the original frame. In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Specifically, as shown in fig. 1, the training phase, i.e., the similarity contrastive learning process, proceeds as follows.
First, the candidate frame features of the initial candidate frames of a training image are extracted by the feature extraction neural network, and similarity is calculated with the cosine similarity criterion to obtain a similarity matrix.
The similarity matrix is generated by computing the similarity between candidate frame features: for each candidate frame, the feature extraction part produces a feature p, and the cosine similarity criterion is applied as follows. The similarity score between the i-th and j-th candidate frames is the cosine of the angle between their feature vectors:
    sim(p_i, p_j) = (p_i · p_j) / (||p_i|| ||p_j||)

where p_i and p_j are the features of the i-th and j-th candidate frames;
further, the similarity matrix, denoted M_F, is obtained, where M_F ∈ R^{m×m}, m is the total number of candidate frames, and the element in row i, column j of M_F is

    M_F^{ij} = sim(p_i, p_j);
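The similarity-matrix step can be sketched with a few lines of NumPy; `similarity_matrix` is an illustrative name (the patent does not name its implementation):

```python
import numpy as np

def similarity_matrix(features: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Pairwise cosine similarity M_F between candidate frame features.

    features: (m, d) array, one d-dimensional feature p_i per candidate frame.
    Returns M_F in R^{m x m} with M_F[i, j] = cos(p_i, p_j).
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, eps)  # unit-normalize each row
    return unit @ unit.T                      # dot products of unit vectors are cosines
```

Normalizing once and taking a single matrix product gives all m × m cosine scores at once, which matters because m (the number of candidate frames) is typically in the thousands.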
secondly, obtaining a candidate frame detection score according to the MIL network branch, determining a center candidate frame index according to the similarity matrix and the candidate frame detection score to obtain a similarity candidate frame cluster,
the part designs a similarity candidate frame cluster generating algorithm (SPC) to obtain a series of similarity candidate frame clusters by using the similarity matrix obtained in the last part and the detection score of each candidate frame obtained by the MIL model.
Let the candidate frame be b, first all the candidate frames are taken from b 0 To b m And sequentially indexing. Each candidate frame b i A corresponding score vector q can be obtained in the MIL branch i . For the category c (indicating that the category target exists in the image) of which the value is 1 in the true value image level label corresponding to the input image, executing the following steps to obtain a plurality of final similarity candidate frame clusters;
the method comprises the following specific steps:
step one, setting every candidate frame to the available state;
step two, according to the candidate frame detection scores obtained by the MIL branch, finding the available candidate frame with the highest score for category c, defining it as the center candidate frame, and recording its index as Center_j,

    Center_j = argmax_{i ∈ available} q_i^c

step three, for the center candidate frame, extracting column Center_j of the similarity matrix and recording it as F_j;
step four, finding the position indexes of all elements of F_j that are higher than a threshold, which together with the index of the center candidate frame form the similarity candidate frame cluster C_j;
step five, setting all candidate frames involved in the cluster to the unavailable state;
step six, repeating steps two to five on the candidate frames that remain available to obtain new similarity candidate frame clusters, until all candidate frames are unavailable or the upper limit on the number of iterations is reached;
candidate frame clusters are generated according to the above steps for every category whose ground-truth label is 1, and combined to obtain the required similarity candidate frame cluster set C = {C_1, ..., C_K}, where K is the total number of iterations, i.e., the number of candidate frame clusters.
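The six steps for one category can be sketched as below; `threshold` and `max_iters` stand for the patent's unstated threshold and iteration limit, and their values here are illustrative assumptions:

```python
import numpy as np

def spc_clusters(M_F: np.ndarray, scores: np.ndarray,
                 threshold: float = 0.8, max_iters: int = 20) -> list:
    """Similarity candidate frame cluster generation (SPC) for one category c.

    M_F: (m, m) cosine similarity matrix; scores: (m,) MIL detection scores
    q_i^c for category c. Returns a list of clusters, each a sorted list of
    candidate frame indexes.
    """
    m = len(scores)
    available = np.ones(m, dtype=bool)            # step one: all frames available
    clusters = []
    for _ in range(max_iters):                    # step six: iteration upper limit
        if not available.any():
            break
        masked = np.where(available, scores, -np.inf)
        center = int(np.argmax(masked))           # step two: center candidate frame
        F_j = M_F[:, center]                      # step three: its similarity column
        members = np.where(available & (F_j > threshold))[0]  # step four
        cluster = sorted(set(members.tolist()) | {center})
        clusters.append(cluster)
        available[cluster] = False                # step five: mark unavailable
    return clusters
```

Running this once per positive category and concatenating the outputs yields the cluster set C = {C_1, ..., C_K} described above.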
Further, the merging candidate frames are generated from the similarity candidate frame clusters. Concretely, the minimum bounding rectangle is calculated from the position and size information of all candidate frames in a similarity candidate frame cluster and taken as the new candidate frame, with coordinates [x_1^new, y_1^new, x_2^new, y_2^new].
Specifically, for each similarity candidate frame cluster C_j, the position and size information of the candidate frames corresponding to its indexes is retrieved. The candidate frame extraction algorithm yields the position and size of every candidate frame of each image, namely the top-left and bottom-right corner coordinates of each frame, in the form [x_1, y_1, x_2, y_2], where x and y are the coordinate values of the point on the respective axes.
For cluster C_j, the new merging candidate frame is simply the minimum bounding rectangle of the candidate frames corresponding to all indexes in the cluster, and its coordinates are recorded as the information of the j-th new candidate frame.
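The minimum-bounding-rectangle merge for one cluster then reduces to coordinate-wise min/max; `merge_cluster` is an illustrative name:

```python
def merge_cluster(boxes: list, cluster: list) -> list:
    """Minimum bounding rectangle of the candidate frames in one cluster C_j.

    boxes: per-frame corner coordinates [x1, y1, x2, y2] (top-left, bottom-right);
    cluster: list of frame indexes belonging to C_j. Returns the merged frame
    [x1_new, y1_new, x2_new, y2_new].
    """
    xs1, ys1, xs2, ys2 = zip(*(boxes[i] for i in cluster))
    return [min(xs1), min(ys1), max(xs2), max(ys2)]
```

Because the merged frame encloses every member of the cluster, it is never smaller than any of the small frames it replaces, which is what shifts the candidate set toward larger, more complete frames.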
Positive and negative sample sets are obtained from the similarity candidate frame clusters. As is clear from the generation of the similarity merging candidate frames, each similarity candidate frame cluster C_j contains the index Center_j of its center candidate frame and the indexes of several candidate frames whose similarity with the center exceeds the threshold.
The candidate frame corresponding to any index in the cluster, i.e., a randomly chosen index in C_j, is selected as the positive sample and denoted pos_j.
The negative sample set is obtained similarly: for the center candidate frame, column Center_j of M_F is extracted and denoted F_j.
The position indexes of the elements of F_j that are lower than the threshold form the negative sample index set; N_j indexes are selected from it, and the set they comprise is recorded as the negative sample set

    C_j^neg = {neg_j^1, ..., neg_j^{N_j}}

where |C_j^neg| = N_j denotes the number of elements in the set.
then, according to the similarity score between the candidate frames in the positive and negative sample sets and the center candidate frame, calculating the loss of each similarity candidate frame cluster by using a comparison loss function based on the candidate frames;
the corresponding calculation formula is:
Figure BDA0003703059930000073
wherein, delta is a hyper-parameter,
Figure BDA0003703059930000081
the xth element of the Fj vector represents the cosine similarity score between the jth candidate box feature and the xth candidate box feature, where x is pos j Or neg j i . The right part of the above expression is an expression of the comparison loss function based on the candidate frame, and each candidate is obtained through the function expressionContrast loss for boxed clusters.
The contrast loss is then calculated from the losses of the similarity candidate frame clusters as

    L_contrast = (1/K) Σ_{j=1}^{K} L_j

where K is the number of similarity candidate frame clusters.
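The per-cluster and total contrast losses can be sketched as below; since the loss formula is not legible in this copy of the patent, the margin-based form used here is an assumption consistent with the surrounding description (hyper-parameter δ, positive pulled toward the center, negatives pushed away):

```python
import numpy as np

def cluster_contrast_loss(F_j: np.ndarray, pos_idx: int,
                          neg_idxs: list, delta: float = 0.2) -> float:
    """Contrast loss L_j for one cluster (assumed margin form).

    F_j[x] is the cosine similarity between the cluster's center frame and
    frame x; delta is the loss hyper-parameter. Each sampled negative is
    pushed at least `delta` below the positive's similarity score.
    """
    return float(np.mean([max(0.0, delta - F_j[pos_idx] + F_j[n])
                          for n in neg_idxs]))

def contrast_loss(cluster_losses: list) -> float:
    """Total contrast loss: the mean of L_j over the K clusters."""
    return float(np.mean(cluster_losses))
```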
The merging candidate frames are input into the feature extraction neural network, and their multi-example loss and refining loss are obtained through the MIL network branch and the refining network branch; the feature extraction neural network is then updated and trained by gradient back-propagation together with the contrast loss.
Specifically, the merging candidate frames are fed into the weakly supervised target detection framework in the same manner as the original candidate frames, yielding a loss for the merging candidate frames; the feature extraction part of the neural network is then gradient-updated with this loss together with the original candidate frame loss and the contrast loss, so that the network attends more to candidate frame regions with rich semantic features and the small-frame domination phenomenon is reduced.
And finally, entering a testing stage:
The test image is detected with the trained feature extraction neural network: steps S11 to S14 are executed to obtain the merging candidate frames of the test image, which are input into the feature extraction neural network and passed sequentially through the MIL network branch and the refining network branch to obtain the detection result.
Guided by this loss function, the feature extraction ability of the feature extraction part of the weakly supervised target detection framework is effectively improved, the gap between the features of target-related candidate frames and those of background noise becomes larger, and the generation quality of the similarity merging candidate frames is in turn further improved.
Further, the present application applies the weakly supervised remote sensing image target detection algorithm based on similarity contrastive learning to detect targets of the categories of interest in remote sensing images. In the experimental part, two public remote sensing image datasets were used: HRSC2016 and NWPU VHR-10. HRSC2016 contains 1061 remote sensing ship images with sizes varying from 300 × 300 to 1500 × 900; the dataset covers four categories (aircraft carriers, commercial ships, attack ships, and civil ships) and comprises 436 training images, 181 validation images, and 444 test images. NWPU VHR-10 comprises 650 remote sensing images of various sizes covering 10 target categories (airplanes, ships, storage tanks, basketball courts, tennis courts, baseball diamonds, ground track fields, harbors, bridges, and vehicles), with 455 training images and 195 test images.
The evaluation indexes used in the experiments were mAP and CorLoc, both following the PASCAL VOC standard. mAP is measured on the test set: the higher the value, the better the detection result. CorLoc is measured on the training set: higher values indicate better localization during training.
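For reference, the PASCAL-VOC-style average precision underlying the mAP columns can be sketched as below (an all-point interpolation sketch; the patent does not state which VOC variant it uses):

```python
import numpy as np

def voc_ap(recalls: np.ndarray, precisions: np.ndarray) -> float:
    """PASCAL-VOC-style average precision (all-point interpolation).

    recalls / precisions: recall values and matching precisions computed from
    a score-ranked detection list. mAP is the mean of this AP over classes.
    """
    r = np.concatenate(([0.0], recalls, [1.0]))
    p = np.concatenate(([0.0], precisions, [0.0]))
    for i in range(len(p) - 2, -1, -1):       # make precision non-increasing
        p[i] = max(p[i], p[i + 1])
    idx = np.where(r[1:] != r[:-1])[0]        # points where recall changes
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))
```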
Table 1 compares the performance of the disclosed method with other weakly supervised target detection algorithms on the HRSC2016 dataset.

TABLE 1
Method    mAP      CorLoc
WSDDN     1.46     8.35
PCL       6.14     17.98
OICR      15.67    19.31
Ours      39.79    55.29
As can be seen from the table, the algorithm greatly improves detection performance and transfers well to the remote sensing image domain. Meanwhile, fig. 4 shows a comparison between the detection results of the disclosed algorithm and those of existing weakly supervised detection algorithms, where (a) is WSDDN, (b) is OICR, (c) is PCL, and (d) is the result of the disclosed algorithm; the disclosed algorithm effectively resolves the small-frame domination problem, detects more complete targets, and greatly improves the detection effect.
Table 2 compares the results of the disclosed algorithm with other algorithms on the NWPU VHR-10 dataset.

TABLE 2
Method    mAP      CorLoc
OICR      10.83    14.63
PCL       12.42    18.80
Ours      33.80    52.32
Clearly, the algorithm also outperforms other weakly supervised target detection algorithms on a multi-class (10-class) remote sensing dataset.
The above results show that the algorithm effectively solves the small-frame domination problem that traditional weakly supervised target detection algorithms exhibit on remote sensing images, greatly improves the detection effect and metrics, and fully demonstrates both the effectiveness of the algorithm and the application value of weakly supervised target detection in the remote sensing field.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The device disclosed in the embodiment corresponds to the method disclosed in the embodiment, so that the description is simple, and the relevant points can be referred to the description of the method part.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (7)

1. A weak supervision remote sensing image target detection algorithm based on similarity comparison learning, characterized by comprising the following steps:
S11, extracting candidate frame features of the initial candidate frames of the training image with a feature extraction neural network, and computing similarities using the cosine similarity criterion to obtain a similarity matrix;
S12, obtaining candidate frame detection scores from the MIL network branch;
S13, determining a center candidate frame index from the similarity matrix and the candidate frame detection scores to obtain similarity candidate frame clusters;
S14, generating merged candidate frames from the similarity candidate frame clusters;
S15, obtaining positive and negative sample sets from the similarity candidate frame clusters, and calculating the loss of each similarity candidate frame cluster with a candidate-frame-based contrastive loss function, according to the similarity scores between the candidate frames in the positive and negative sample sets and the center candidate frame;
S16, calculating the contrastive loss of the merged candidate frames from the losses of the similarity candidate frame clusters;
S17, inputting the merged candidate frames into the feature extraction neural network, and obtaining the multiple-instance loss and the refinement loss of the merged candidate frames through the MIL network branch and the refinement network branch;
S18, combining the contrastive loss, the multiple-instance loss and the refinement loss, and updating and training the feature extraction neural network through gradient back-propagation;
and S19, detecting the test image with the trained feature extraction neural network: executing steps S11 to S14 to obtain the merged candidate frames of the test image, inputting them into the feature extraction neural network, and passing them through the MIL network branch and the refinement network branch in turn to obtain the detection result.
2. The similarity comparison learning-based weakly supervised remote sensing image target detection algorithm according to claim 1, wherein in S11, obtaining the similarity matrix includes:
calculating the similarity between candidate frames from the candidate frame features according to the cosine similarity formula
M_F^{ij} = (p_i · p_j) / (‖p_i‖ ‖p_j‖),
where p_i and p_j denote the features of the i-th and j-th candidate frames,
thereby obtaining the similarity matrix M_F ∈ R^{m×m}, in which m is the total number of candidate frames and M_F^{ij} is the element in row i and column j of M_F.
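As a non-authoritative sketch, the cosine-similarity matrix construction described above can be written in plain Python (function and variable names are illustrative, not taken from the patent):

```python
import math

def cosine_similarity(p_i, p_j):
    # Cosine similarity between two candidate-frame feature vectors.
    dot = sum(a * b for a, b in zip(p_i, p_j))
    norm_i = math.sqrt(sum(a * a for a in p_i))
    norm_j = math.sqrt(sum(b * b for b in p_j))
    return dot / (norm_i * norm_j)

def similarity_matrix(features):
    # features: list of m candidate-frame feature vectors.
    # Returns the m x m matrix M_F with M_F[i][j] = cos(p_i, p_j).
    m = len(features)
    return [[cosine_similarity(features[i], features[j]) for j in range(m)]
            for i in range(m)]
```

For m candidate frames with d-dimensional features this costs O(m²·d); in practice the same matrix would be computed in one batched tensor operation.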
3. The similarity comparison learning-based weakly supervised remote sensing image target detection algorithm according to claim 1, wherein in S13, obtaining the similarity candidate frame clusters includes the following steps:
step one, setting all candidate frames to the available state;
step two, according to the candidate frame detection scores produced by the MIL model, finding the available candidate frame with the highest score, defining it as the center candidate frame, and recording its index as
Center_j = argmax_{i ∈ available} s_i,
where s_i is the detection score of the i-th candidate frame;
step three, extracting the Center_j-th column vector of the similarity matrix, denoted F_j;
step four, finding the indices of the elements of F_j higher than the set threshold τ, which together with the index of the center candidate frame form the similarity candidate frame cluster C_j;
step five, setting the candidate frames involved in these indices to the unavailable state;
and step six, repeating steps two to five on the candidate frames still in the available state to obtain new similarity candidate frame clusters, until all candidate frames are set unavailable or the upper limit on the number of iterations is reached.
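The greedy clustering loop of steps one to six can be sketched as follows (a hedged illustration: `tau`, `max_clusters`, and all names are assumed, and `scores` stands in for the MIL detection scores):

```python
def build_clusters(M_F, scores, tau=0.7, max_clusters=10):
    # Greedy clustering: repeatedly take the highest-scoring available
    # box as the cluster centre, then group every available box whose
    # similarity to the centre exceeds the threshold tau.
    m = len(scores)
    available = [True] * m
    clusters = []
    for _ in range(max_clusters):          # upper limit on iterations
        candidates = [i for i in range(m) if available[i]]
        if not candidates:                 # all boxes used up
            break
        center = max(candidates, key=lambda i: scores[i])
        F_j = [M_F[i][center] for i in range(m)]   # centre's column
        cluster = [i for i in candidates if i == center or F_j[i] > tau]
        for i in cluster:                  # mark cluster members unavailable
            available[i] = False
        clusters.append((center, cluster))
    return clusters
```

Each returned pair holds the centre index and the member indices of one similarity candidate frame cluster C_j.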
4. The weak supervised remote sensing image target detection algorithm based on similarity comparison learning according to claim 1, wherein in S14, according to the position and size information of all candidate frames in a similarity candidate frame cluster, the minimum enclosing rectangle is calculated as a new merged candidate frame, whose coordinates are recorded as [x_1^new, y_1^new, x_2^new, y_2^new].
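Because candidate frames are axis-aligned, the minimum enclosing rectangle of claim 4 reduces to coordinate-wise min/max; a minimal sketch with illustrative names:

```python
def merge_cluster_boxes(boxes):
    # Minimum enclosing rectangle of all boxes in one cluster.
    # Each box is (x1, y1, x2, y2): (x1, y1) top-left, (x2, y2) bottom-right.
    x1 = min(b[0] for b in boxes)
    y1 = min(b[1] for b in boxes)
    x2 = max(b[2] for b in boxes)
    y2 = max(b[3] for b in boxes)
    return (x1, y1, x2, y2)
```

The merged box covers every member of the cluster, which is what counteracts the small-frame domination discussed above.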
5. The similarity comparison learning-based weakly supervised remote sensing image target detection algorithm according to claim 3, wherein in S15, obtaining the positive and negative sample sets includes:
selecting the candidate frame corresponding to any index in the similarity candidate frame cluster as the positive sample, recorded as pos_j;
the indices of the elements of F_j lower than the threshold form the negative sample index set, from which N_j indices are selected as the negative sample set, recorded as {neg_j^i}, i = 1, …, N_j.
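A hedged sketch of the positive/negative sampling of claim 5 (the names, the fixed seed, and the exclusion of cluster members from the negative pool are illustrative assumptions):

```python
import random

def sample_pos_neg(F_j, cluster, tau=0.7, n_neg=3, seed=0):
    # F_j: similarity of every box to the cluster centre (centre's column
    # of M_F). cluster: member indices of the similarity cluster.
    rng = random.Random(seed)
    pos = rng.choice(cluster)                      # any cluster member is positive
    neg_pool = [i for i, s in enumerate(F_j)       # boxes dissimilar to the centre
                if s < tau and i not in cluster]
    negs = rng.sample(neg_pool, min(n_neg, len(neg_pool)))
    return pos, negs
```

Here `n_neg` plays the role of N_j, the number of negatives drawn per cluster.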
6. The weak supervised remote sensing image target detection algorithm based on similarity comparison learning according to claim 5, wherein the loss of a similarity candidate frame cluster is calculated by the following formula:
L_j = −log[ exp(F_j^{pos_j}/δ) / ( exp(F_j^{pos_j}/δ) + Σ_{i=1..N_j} exp(F_j^{neg_j^i}/δ) ) ],
where δ is a hyper-parameter and F_j^x denotes the x-th element of the vector F_j, i.e. the cosine similarity score between the center candidate frame feature of the j-th cluster and the x-th candidate frame feature, with x being pos_j or neg_j^i.
7. The similarity comparison learning-based weakly supervised remote sensing image target detection algorithm according to claim 6, wherein the contrastive loss function is obtained by the following formula:
L_contrast = (1/K) Σ_{j=1..K} L_j,
where K is the number of similarity candidate frame clusters.
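One plausible reading of the per-cluster loss of claim 6 and the averaging of claim 7 is an InfoNCE-style contrastive loss with δ acting as a temperature; the exact formula appears only as an image in the original publication, so this sketch is an assumption, not the patent's definitive form:

```python
import math

def cluster_contrastive_loss(pos_sim, neg_sims, delta=0.2):
    # pos_sim: cosine similarity between centre and positive sample.
    # neg_sims: similarities between centre and the N_j negative samples.
    # delta: temperature hyper-parameter (assumed role).
    pos = math.exp(pos_sim / delta)
    neg = sum(math.exp(s / delta) for s in neg_sims)
    return -math.log(pos / (pos + neg))

def total_contrastive_loss(per_cluster_losses):
    # Claim 7: average the per-cluster losses over the K clusters.
    return sum(per_cluster_losses) / len(per_cluster_losses)
```

With this form, the loss shrinks toward zero as the positive similarity dominates the negatives, pulling cluster members together in feature space.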
CN202210698556.7A 2022-06-20 2022-06-20 Weak supervision remote sensing image target detection algorithm based on similarity comparison learning Active CN115035409B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210698556.7A CN115035409B (en) 2022-06-20 2022-06-20 Weak supervision remote sensing image target detection algorithm based on similarity comparison learning


Publications (2)

Publication Number Publication Date
CN115035409A true CN115035409A (en) 2022-09-09
CN115035409B CN115035409B (en) 2024-05-28

Family

ID=83124106


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018137357A1 (en) * 2017-01-24 2018-08-02 北京大学 Target detection performance optimization method
CN109190636A (en) * 2018-07-30 2019-01-11 北京航空航天大学 A kind of remote sensing images Ship Target information extracting method
CN111275044A (en) * 2020-02-21 2020-06-12 西北工业大学 Weak supervision target detection method based on sample selection and self-adaptive hard case mining
CN112183414A (en) * 2020-09-29 2021-01-05 南京信息工程大学 Weak supervision remote sensing target detection method based on mixed hole convolution
WO2022062543A1 (en) * 2020-09-27 2022-03-31 上海商汤智能科技有限公司 Image processing method and apparatus, device and storage medium




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant