CN117725243B - Class irrelevant instance retrieval method based on hierarchical semantic region decomposition - Google Patents

Class irrelevant instance retrieval method based on hierarchical semantic region decomposition

Info

Publication number
CN117725243B
Authority
CN
China
Prior art keywords
feature
features
instance
image
cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410173702.3A
Other languages
Chinese (zh)
Other versions
CN117725243A (en
Inventor
赵万磊
孙琦颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen University
Original Assignee
Xiamen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen University
Priority to CN202410173702.3A
Publication of CN117725243A
Application granted
Publication of CN117725243B
Legal status: Active
Anticipated expiration

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a class irrelevant instance retrieval method based on hierarchical semantic region decomposition. A detector extracts instances from the feature images and stores them in a potential instance library; a descriptor then extracts features from the instances in the potential instance library to obtain instance features, which are stored in a feature library; when a query picture is input, a retriever extracts features from the query picture to obtain query features, matches the query features against the instance features in the feature library, and returns the top K instance features most similar to the query features as the retrieval result. In the detector, the invention rapidly discovers instance-level regions at different scales by hierarchically decomposing the image into semantic regions and screening them, so that the instances likely to be retrieved are discovered comprehensively. The hierarchical decomposition handles the object occlusion and object embedding problems that are common in real instance retrieval scenarios.

Description

Class irrelevant instance retrieval method based on hierarchical semantic region decomposition
Technical Field
The invention relates to the technical field of computer vision, in particular to a class irrelevant instance retrieval method based on hierarchical semantic region decomposition.
Background
Instance-based retrieval is a direction of image retrieval that is currently in wide demand. It does not retrieve the picture as a whole; rather, it retrieves the specific instance that the user indicates in the picture. Given a query picture in which the query instance is marked by a rectangular box, the instance retrieval task is to retrieve the pictures containing the query instance from a database consisting of a large number of images, and to mark the position of the queried instance on each retrieved image. Retrieval with such fine-grained requirements is widely used in real-world scenarios. For example, on e-commerce platforms and in online shopping, a user can upload a picture of a product, and the system performs instance retrieval on the image features and returns similar or related products to help the user find the product of interest; in addition, for large-scale image libraries or image databases, instance retrieval can be used to quickly search for and locate specific images, which is widely used in image management systems, photo album applications, and image archives.
Conventional retrieval methods generally use hand-crafted features to describe potential instances and query instances, and then obtain more refined and accurate instance features through techniques such as feature compression; such methods generally have high computational requirements and cannot robustly cope with non-rigid transformations of images. In recent years, with the development of computer vision technology, some methods have attempted to use deep learning for instance discovery and feature extraction. Some methods represent an image by a deep global feature aggregated from a convolutional layer, assigning high weights to potential instance regions during pooling. Such approaches cannot localize instances and are susceptible to background interference. Other attempts have focused on designing instance-level features for the retrieval task, dividing instance retrieval into two steps: instance localization, and feature extraction from the detected instance regions. Among these, methods whose backbone network is trained in a supervised or weakly supervised manner are limited in their ability to detect objects of unknown classes; in addition, the instances of different scales found by existing methods are not comprehensive enough, which degrades retrieval performance.
The main difficulty of instance retrieval lies in discovering potential instances during instance localization, because the scale, size and shape of the subsequent query instance are unknown, and the instance may even be incomplete. How to find potential instances and extract their features is therefore the key to solving the problem. Existing methods fall short in extracting instances exhaustively and accurately, and cannot handle instance occlusion or multi-instance retrieval. Thus, there is a need for a simple but efficient hierarchical instance discovery method to address these challenges.
Disclosure of Invention
In view of the problems in the prior art, the invention aims to provide a class irrelevant instance retrieval method based on hierarchical semantic region decomposition, which rapidly discovers instance-level regions at different scales by hierarchically decomposing the image into semantic regions and screening them, so that the instances that may be retrieved are discovered comprehensively.
In order to achieve the above purpose, the invention adopts the following technical scheme:
A class irrelevant instance retrieval method based on hierarchical semantic region decomposition, in which a detector extracts instances from the feature images and stores them in a potential instance library; a descriptor then extracts features from the instances in the potential instance library to obtain instance features, which are stored in a feature library; when a query picture is input, a retriever extracts features from the query picture to obtain query features, matches the query features against the instance features in the feature library, and returns the top K instance features most similar to the query features as the retrieval result; the detector processes the feature images as follows:
S1, encoding a feature image with a feature encoder trained by an unsupervised method to obtain an encoded image block set;
The feature images come from a feature library; after a feature image is encoded by the unsupervised feature encoder, the resulting feature set is the encoded image block set;
S2, performing hierarchical decomposition on the image by fast bisecting clustering, so as to detect instance regions at different scales; the steps are as follows:
Adding the coded image block set into a to-be-processed node queue, and performing hierarchical bipartition on each node in the queue until no node exists in the queue;
S2.1, performing initialization clustering on the image block set in each node: if the similarity between two image blocks is greater than a threshold, the two image blocks are considered similar, and the degree of each of the two image blocks is increased by 1; after the similarity has been computed for all image blocks, the image block features with the highest and the lowest degree are selected as the clustering seeds, and the remaining image block features are assigned to the corresponding clusters according to their distances to the seeds;
S2.2, performing binary clustering on each cluster obtained in the step S2.1;
Defining an objective function [equation not reproduced in this text], which is intended to minimize the distance between features within each cluster; the function is expressed in terms of the number of clusters k, each cluster, and the image block features belonging to that cluster;
Specifically, the distance between each feature and its cluster is as follows:
[equation not reproduced in this text], defined in terms of the cluster, the number of features in the cluster, the sum of the features in the cluster, and the transpose of the image block feature;
When a feature is moved into another cluster, the distance between the feature and the cluster it is moved to is calculated as follows:
[equation not reproduced in this text], defined in terms of the target cluster, the number of features in that cluster, and the sum of the features in that cluster;
If the comparison of these two distances shows that moving the feature improves the objective function, the move is performed; this process is iterated until the objective function converges, yielding two new image block sets;
S2.3, detecting regional connectivity;
Mapping the two image block sets obtained in step S2.2 back to the original image to obtain two partial subgraphs, and further dividing each subgraph into smaller sub-regions according to the spatial connectivity of the image blocks; through region connectivity detection, each subgraph yields at least one group of connected sub-regions, which makes subsequent processing more accurate and reliable;
s2.4, judging whether to continue to divide into two parts or not;
Mapping the sub-regions obtained in step S2.3 to the encoder for encoding to obtain a plurality of image block sets, and using the average internal connectivity of each image block set to judge whether to continue bisection;
For the feature set S corresponding to each image block set, connectivity is measured by counting the internal edges of S; specifically, given the total number of internal edges of S, when the average internal connectivity exceeds a threshold, the feature set S is not segmented further, that is, the image block set corresponding to S is discarded from further bisection; otherwise, the image block set corresponding to S is added to the queue and bisection continues;
S3, screening instances: instances with insignificant semantics are screened out according to a saliency condition;
Whether a feature set node S is a dummy node is determined by the intensity of its feature energy, specifically as follows:
For the feature set S corresponding to each image block set, whether S is a dummy node is judged by the intensity of feature energy:
Specifically, given a feature set S and the set of features with high energy, the overlap ratio of the two sets is as follows: [equation not reproduced in this text]
When the overlap ratio is smaller than a threshold, the feature set S is a dummy node and is discarded; otherwise, the feature set is retained.
In step S2.4, the internal edges of the set S are established as follows: if the similarity between two image blocks in S is greater than a threshold, the two are considered similar and an edge is established between them; this edge is one of the internal edges of the set S.
In step S2.4, the average internal connectivity is defined as the ratio of the total number of internal edges to the size of the feature set; its value range is [range not reproduced in this text].
The instance processing in the potential instance library by the descriptor is specifically as follows:
For a feature set of an image obtained by the detector, its elements are projected, in order, onto a feature map of the size of the encoder's downsampled output (given by the length and the width of the feature map), forming a mask of that size whose entries are 1 where the instance is present and 0 elsewhere; this mask is up-sampled to the original image size to give the instance representation mask at the original size; the image is input into a feature extractor based on a convolutional neural network to obtain a feature map encoding the image; for each localization result represented by a mask, the mask is downsampled to the same size as the feature map; the feature map is multiplied by the downsampled mask, and generalized mean pooling is applied on each channel to obtain the feature representation of each instance.
With this scheme, a hierarchy-based instance detection structure is introduced to discover instance information at different scales, together with a reasonable mechanism for stopping the hierarchy and for screening instances. First, the hierarchical structure allows instances at different levels to be found completely, and semantically salient regions are detected by minimizing the feature distance within clusters, which improves retrieval precision and also makes the detection model more robust in special retrieval tasks such as multi-instance retrieval and occlusion. Second, a specific optimization algorithm is introduced to address the computational complexity and local-optimum problems of traditional clustering methods, which greatly reduces the computation required for instance discovery so that potential instances can be found quickly. In addition, the saliency-based instance screening mechanism also greatly improves retrieval efficiency.
Drawings
FIG. 1 is an overall flow chart of instance retrieval according to the present invention;
fig. 2 is a flow chart of a method of the detector portion of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described with reference to the following examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. On the contrary, the invention is intended to cover any alternatives, modifications, equivalents, and variations as may be included within the spirit and scope of the invention as defined by the appended claims. Further, in the following detailed description of the present invention, certain specific details are set forth in order to provide a thorough understanding of the present invention. The present invention will be fully understood by those skilled in the art without the details described herein.
As shown in FIG. 1, the invention discloses a class irrelevant instance retrieval method based on hierarchical semantic region decomposition. A detector extracts instances from the feature images and stores them in a potential instance library; a descriptor then extracts features from the instances in the potential instance library to obtain instance features, which are stored in a feature library; when a query picture is input, a retriever extracts features from the query picture to obtain query features, matches the query features against the instance features in the feature library, and returns the top K instance features most similar to the query features as the retrieval result.
As shown in fig. 2, the detector processes the feature image specifically as follows:
S1, a feature encoder trained by an unsupervised method is used to encode the feature image, and an encoded image block set is obtained.
The feature image (original image) comes from a feature library; after the feature image is encoded by the unsupervised feature encoder, the resulting feature set is the encoded image block set.
S2, carrying out hierarchical decomposition on the image in a rapid bipartite clustering mode, and detecting the instance areas on different scales. The method comprises the following steps:
And adding the coded image block set into a to-be-processed node queue, and carrying out hierarchical bipartition on each node (each node corresponds to one image block set) in the queue until no node exists in the queue.
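The queue-driven decomposition described above can be sketched as follows. This is a minimal illustration, not the patented implementation: `decompose` and its callable arguments `bisect`, `split_connected` and `is_compact` are hypothetical names standing in for steps S2.1 to S2.2, S2.3 and S2.4 respectively, and keeping every sub-region as a candidate is an assumption of the sketch.

```python
from collections import deque


def decompose(root, bisect, split_connected, is_compact):
    """Breadth-first hierarchical bisection driven by a queue of pending nodes.

    All arguments are hypothetical stand-ins: `root` is a list of image block
    indices; the callables play the roles of S2.1-S2.2, S2.3 and S2.4.
    """
    queue = deque([root])
    regions = []  # every sub-region found, at every scale (assumption)
    while queue:
        node = queue.popleft()
        if len(node) < 2:
            continue  # a single block cannot be bisected
        for half in bisect(node):
            for region in split_connected(half):
                if not region or len(region) == len(node):
                    continue  # degenerate split; re-queuing it would never end
                regions.append(region)
                if not is_compact(region):
                    queue.append(region)  # still loose: bisect again later
    return regions


# Toy run: split index lists in the middle; sets of <= 2 blocks count as compact.
found = decompose(
    list(range(8)),
    bisect=lambda n: (n[: len(n) // 2], n[len(n) // 2:]),
    split_connected=lambda half: [half],
    is_compact=lambda r: len(r) <= 2,
)
```

Because a region is re-queued only when it is strictly smaller than its parent, the queue always empties, which matches the "until no node exists in the queue" condition.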
S2.1, performing initialization clustering on the image block set in each node: if the similarity between two image blocks is greater than a threshold, the two image blocks are considered similar, and the degree of each of the two image blocks is increased by 1; after the similarity has been computed for all image blocks, the image block features with the highest and the lowest degree are selected as the clustering seeds, and the remaining image block features are assigned to the corresponding clusters according to their distances to the seeds.
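A minimal sketch of this degree-based seeding is given below. The text does not name the similarity measure or the distance, so cosine similarity and squared Euclidean distance are assumptions here, and `init_clusters` and `tau` are hypothetical names.

```python
import math


def cosine(a, b):
    """Cosine similarity of two feature vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def sq_dist(a, b):
    """Squared Euclidean distance (assumed distance to the seeds)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_clusters(feats, tau):
    """Count similar neighbours per block, then seed two clusters."""
    n = len(feats)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if cosine(feats[i], feats[j]) > tau:
                degree[i] += 1
                degree[j] += 1
    hi = max(range(n), key=degree.__getitem__)  # highest-degree seed
    lo = min(range(n), key=degree.__getitem__)  # lowest-degree seed
    labels = [
        0 if sq_dist(f, feats[hi]) <= sq_dist(f, feats[lo]) else 1
        for f in feats
    ]
    return labels, degree


feats = [[1.0, 0.0], [0.9, 0.1], [0.95, 0.05], [0.0, 1.0]]
labels, degree = init_clusters(feats, tau=0.9)
```

In this toy input the three nearly parallel vectors each gain degree 2 and the outlier keeps degree 0, so the outlier becomes the second seed. If every block had the same degree, both seeds would coincide and all blocks would land in cluster 0; the sketch does not handle that case specially.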
S2.2, performing binary clustering on each cluster obtained in the step S2.1.
Defining an objective function [equation not reproduced in this text]. The objective function is intended to minimize the distance between features within each cluster. In this formula, k represents the number of clusters, and the remaining symbols denote a cluster and an image block feature belonging to that cluster. The quantities used below are the number of features in a cluster, the sum of the features in a cluster, and the transpose of an image block feature (which is a vector).
Specifically, the distance between each feature and its cluster is as follows:
[equation not reproduced in this text], defined in terms of the cluster, the number of features in the cluster, the sum of the features in the cluster, and the transpose of the image block feature.
The similarity between features is measured by the sum of squared Euclidean distances between them. An optimization procedure is used here to gradually reduce the value of the objective function by moving features between clusters. When a feature is moved into another cluster, the distance between the feature and the cluster it is moved into is calculated as follows:
[equation not reproduced in this text], defined in terms of the target cluster, the number of features in that cluster, and the sum of the features in that cluster.
If the comparison of these two distances shows that moving the feature improves the objective function, the move is performed; this process is iterated until the objective function converges, yielding two new image block sets.
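The move-and-test loop can be illustrated with the sketch below. The patent's own distance expressions are not reproduced in this text, so the sketch substitutes the standard Hartigan-style incremental test for the within-cluster sum of squared distances, which likewise needs only the per-cluster feature count and feature sum. The function name `refine_bisection` and the variables `n` and `D` are assumptions.

```python
def refine_bisection(feats, labels, max_iter=100):
    """Move features between two clusters while each move lowers the total
    within-cluster squared distance (Hartigan-style test, an assumption)."""
    dim = len(feats[0])
    labels = list(labels)
    n = [0, 0]                      # number of features per cluster
    D = [[0.0] * dim, [0.0] * dim]  # per-cluster sum of features
    for f, c in zip(feats, labels):
        n[c] += 1
        for d in range(dim):
            D[c][d] += f[d]

    def sq_to_mean(f, c):
        # Squared distance from f to the mean D[c] / n[c] of cluster c.
        return sum((f[d] - D[c][d] / n[c]) ** 2 for d in range(dim))

    for _ in range(max_iter):
        moved = False
        for i, f in enumerate(feats):
            u, v = labels[i], 1 - labels[i]
            if n[u] <= 1 or n[v] == 0:
                continue  # never empty a cluster; skip an empty target
            leave = n[u] / (n[u] - 1) * sq_to_mean(f, u)  # cost removed from u
            join = n[v] / (n[v] + 1) * sq_to_mean(f, v)   # cost added to v
            if join < leave:  # the move lowers the objective
                for d in range(dim):
                    D[u][d] -= f[d]
                    D[v][d] += f[d]
                n[u] -= 1
                n[v] += 1
                labels[i] = v
                moved = True
        if not moved:
            break  # converged: no single move improves the objective
    return labels


refined = refine_bisection([[0.0], [0.1], [0.2], [5.0], [5.1]], [0, 0, 0, 0, 1])
```

Updating `n` and `D` in place keeps each test at a cost linear in the feature dimension, so no cluster mean is recomputed from scratch after a move.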
S2.3, detecting the connectivity of the area.
The two image block sets obtained in step S2.2 are mapped back to the original image to obtain two partial subgraphs. These two subgraphs are not necessarily two suitable instance representations; their connectivity on the feature image must also be considered. Further partitioning is therefore required to obtain connected sub-regions.
According to the spatial connectivity of the image blocks, each subgraph is further divided into smaller sub-regions, each of which is guaranteed to be connected on the original image. Through region connectivity detection, each subgraph yields at least one group of connected sub-regions, which makes subsequent processing more accurate and reliable.
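Region connectivity detection can be sketched as connected-component labelling on the image block grid. The text does not state which neighbourhood is used, so 4-connectivity is an assumption, as is the representation of blocks as `(row, col)` grid cells; `connected_regions` is a hypothetical name.

```python
from collections import deque


def connected_regions(cells):
    """Split (row, col) grid cells into 4-connected groups via breadth-first search."""
    remaining = set(cells)
    regions = []
    for start in sorted(cells):
        if start not in remaining:
            continue  # already absorbed into an earlier region
        remaining.discard(start)
        queue, region = deque([start]), []
        while queue:
            r, c = queue.popleft()
            region.append((r, c))
            for nb in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if nb in remaining:
                    remaining.discard(nb)
                    queue.append(nb)
        regions.append(sorted(region))
    return regions


parts = connected_regions([(0, 0), (0, 1), (1, 1), (3, 3)])
```

Here the three touching cells form one sub-region and the isolated cell forms another, so one cluster of blocks becomes two candidate regions.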
S2.4, judging whether to continue bisection or not.
The sub-regions obtained in step S2.3 are mapped to the encoder for encoding to obtain a plurality of image block sets, and the average internal connectivity of each image block set is used to judge whether to continue bisection.
For the feature set S corresponding to each image block set, connectivity is measured by counting the internal edges of S. The internal edges of the set S are established as follows: if the similarity between two image blocks in S is greater than a threshold, the two are considered similar and an edge is established between them; this edge is one of the internal edges of the set S.
Specifically, given the total number of internal edges of the feature set S, the average internal connectivity is defined as the ratio of the total number of edges to the size of S; its value range is [range not reproduced in this text]. As deeper and deeper nodes are decomposed, the feature set S becomes more compact and the average internal connectivity increases accordingly. The invention therefore controls the granularity of segmentation by adjusting the threshold on the average internal connectivity: when the average internal connectivity exceeds the threshold, the feature set S is not segmented further, that is, the image block set corresponding to S is discarded from further bisection; otherwise, the image block set corresponding to S is added to the queue and bisection continues.
A larger threshold leads to finer segmentation granularity. The role of this termination condition is to ensure that the segmentation is not subdivided excessively, avoiding too many subsets, while ensuring sufficiently tight connectivity within each resulting subset.
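The stopping test can be sketched as below, under the assumptions that similarity is cosine similarity and that each pair of blocks contributes at most one undirected edge. The names `average_internal_connectivity`, `tau` and the stop threshold `0.5` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two feature vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def average_internal_connectivity(feats, tau):
    """Number of internal edges divided by the size of the feature set."""
    edges = sum(
        1
        for i in range(len(feats))
        for j in range(i + 1, len(feats))
        if cosine(feats[i], feats[j]) > tau  # similar pair -> one internal edge
    )
    return edges / len(feats)


S = [[1.0, 0.0], [0.9, 0.1], [0.95, 0.05], [0.0, 1.0]]
rho = average_internal_connectivity(S, tau=0.9)
stop_splitting = rho > 0.5  # hypothetical threshold: stop once S is tight enough
```

With three mutually similar blocks and one outlier there are 3 edges over 4 features, so the ratio is 0.75 and this node would not be bisected again under the assumed threshold.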
S3, screening instances: instances with insignificant semantics are screened out according to a saliency condition.
In hierarchical clustering, the feature set corresponding to an image block set does not necessarily correspond to a semantically compact region; it may be a mixture of several instances of different categories, a mixture of an object and background, or merely blank background. Such a feature set is a dummy node. Building instance-level features from dummy nodes is of little value.
The invention judges whether a feature set node S is a dummy node by the intensity of its feature energy.
For the feature set S corresponding to each image block set, whether S is a dummy node is judged by the intensity of feature energy:
Specifically, given a feature set S and the set of features with high energy, the overlap ratio of the two sets is as follows: [equation not reproduced in this text]
A low overlap ratio essentially indicates that the region is dominated by semantically unimportant features, so these dummy nodes in the hierarchy are ignored by setting a threshold. Specifically, when the overlap ratio is smaller than the threshold, the feature set S is a dummy node and is discarded; otherwise, the feature set is retained.
Being judged a dummy node does not prevent a feature set node S from being segmented further. When clustering reaches a finer level of granularity, a feature set node may become a salient region.
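The screening rule can be illustrated with the sketch below. The overlap formula is not reproduced in this text, so several choices here are assumptions: feature energy is taken as the squared L2 norm, high-energy features are those at or above a cut-off `energy_min`, and the overlap ratio is the fraction of the node's features that are high-energy. All names are hypothetical.

```python
def energy(f):
    """Feature energy, assumed here to be the squared L2 norm."""
    return sum(x * x for x in f)


def is_dummy(node, feats, energy_min, ratio_min):
    """Return True when too few of the node's features are high-energy.

    node: indices of the image block features that make up the feature set S.
    """
    high = {i for i, f in enumerate(feats) if energy(f) >= energy_min}
    overlap = len(set(node) & high) / len(node)
    return overlap < ratio_min


feats = [[3.0, 4.0], [0.1, 0.0], [1.0, 0.0], [0.0, 2.0]]
salient = is_dummy([0, 3], feats, energy_min=2.0, ratio_min=0.5)
background = is_dummy([1, 2], feats, energy_min=2.0, ratio_min=0.5)
```

In this toy input only features 0 and 3 are high-energy, so the node made of them is kept while the node made of features 1 and 2 is flagged as a dummy node.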
The instance processing in the potential instance library by the descriptor is specifically as follows:
For a feature set of an image obtained by the detector, its elements are projected, in order, onto a feature map of the size of the encoder's downsampled output (given by the length and the width of the feature map), forming a mask of that size whose entries are 1 where the instance is present and 0 elsewhere; this mask is up-sampled to the original image size to give the instance representation mask at the original size.
The original image is input into a feature extractor based on a convolutional neural network to obtain a feature map encoding the original image.
Instance-level features are extracted using a masked region-of-interest feature extraction method. Specifically, for each localization result represented by a mask, the mask is downsampled to the same size as the feature map. The feature map is multiplied by the downsampled mask, and generalized mean pooling is applied on each channel to obtain the feature representation of each instance, which is stored in the feature library.
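A minimal sketch of masked generalized mean (GeM) pooling follows, using nested lists in place of tensors. The exponent `p=3.0` and the clamp `eps` are common defaults rather than values given in the text, and `masked_gem` is a hypothetical name; activations are assumed non-negative, as after a ReLU.

```python
def masked_gem(fmap, mask, p=3.0, eps=1e-6):
    """Generalized mean pooling of a masked feature map.

    fmap: C x H x W nested lists of non-negative activations.
    mask: H x W nested lists of 0/1, already resized to the feature map.
    Returns one pooled value per channel.
    """
    out = []
    for channel in fmap:
        total, count = 0.0, 0
        for row, mask_row in zip(channel, mask):
            for value, m in zip(row, mask_row):
                total += max(value * m, eps) ** p  # zero out, clamp, raise to p
                count += 1
        out.append((total / count) ** (1.0 / p))
    return out


desc = masked_gem([[[1.0, 2.0], [3.0, 4.0]]], [[1, 0], [0, 1]])
```

With `p = 1` this reduces to average pooling, and as `p` grows it approaches max pooling, which is why GeM is often used as a tunable middle ground for retrieval descriptors.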
The retriever processes the query picture as follows:
The query picture is preprocessed, and features are then extracted from it to obtain the query features. For the query features, the cosine similarity between the query features and every instance feature in the feature library is calculated, the similarities are ranked, and the top-ranked instances are returned as the query result.
For the target bounding box that the query needs to return, the bounding box of the coordinates whose value is 1 in the mask up-sampled to the original size in the descriptor step is used as the bounding box of the instance.
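The ranking step can be sketched as a brute-force cosine search. The dictionary layout of the feature library and the name `retrieve` are assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query, library, k):
    """Return the ids of the k instance features most similar to the query."""
    scored = [(cosine(query, feat), inst_id) for inst_id, feat in library.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # most similar first
    return [inst_id for _, inst_id in scored[:k]]


library = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.7, 0.7]}
top2 = retrieve([1.0, 0.1], library, k=2)
```

If the stored features are L2-normalized in advance, the cosine similarity reduces to a dot product, which is the usual way to make this scan cheaper on a large library.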
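Deriving the bounding box from the up-sampled mask can be sketched as follows. The `(x_min, y_min, x_max, y_max)` return order with inclusive pixel coordinates is an assumed convention, and `mask_to_box` is a hypothetical name.

```python
def mask_to_box(mask):
    """Tightest axis-aligned box around the entries of `mask` equal to 1.

    Returns (x_min, y_min, x_max, y_max) in inclusive pixel coordinates,
    or None when the mask contains no 1.
    """
    coords = [
        (r, c)
        for r, row in enumerate(mask)
        for c, value in enumerate(row)
        if value == 1
    ]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return min(cols), min(rows), max(cols), max(rows)


box = mask_to_box([
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
])
```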
The invention encodes the image with a feature encoder trained by an unsupervised method to obtain an encoded image block feature set; then, after initial points are selected according to prior knowledge, the image is hierarchically decomposed by fast bisecting clustering until the internal connectivity condition is met, which achieves rapid detection of instance regions at different scales; instances with insignificant semantics are then screened out according to the saliency condition to reduce the number of instances stored in the feature library; finally, for each instance mask obtained, instance-level features are extracted using the masked region-of-interest feature extraction method and stored in the feature library. For given query instance features, the cosine similarity between the query features and all instance features is calculated, so that the result pictures and the instance localization boxes can be retrieved. In summary, the invention provides a simple and efficient instance retrieval method that discovers instance-level regions at different scales by hierarchically decomposing images into semantic regions, so that the instances that may be retrieved are discovered comprehensively; the hierarchical decomposition handles the object occlusion and object embedding problems that are common in real instance retrieval scenarios and improves retrieval precision.
For the instance retrieval task, tests were performed on three well-known instance retrieval datasets, Instance-160, Instance-240, Instance-335 and INSTRE, to demonstrate the effectiveness of the present invention. The invention is compared with the R-MAC, CAM-weight, BLCF and DASR methods under three retrieval metrics, mAP-50, mAP-100 and mAP-all, and the results in Table 1 show the advantage of the method of the present invention. The method of the present invention is referred to as CLAID. Before evaluation, the same existing normalization and whitening strategy and the same strategy for selecting the feature extraction layer were applied to all models, so as to reflect the requirements of real scenarios.
TABLE 1
In Table 1, mAP-50 is the mean average precision reported on the top 50 results; mAP-100 is the mean average precision reported on the top 100 results; mAP-all is the mean average precision reported over all retrieved results.
The left-most column of Table 1 gives the method name, and Instance-160, Instance-335 and INSTRE in the top row are the three retrieval datasets. R-MAC is the regional maximum activation of convolutions method; CAM-weight is a class activation mapping weighting method; BLCF is a bag-of-words model over local convolutional features; BLCF-SalGAN is a saliency-weighted bag-of-words model over local convolutional features; DASR is an instance retrieval method based on deeply activated salient regions; DUODIS is a region-based, dataset-driven unsupervised object discovery method; CLAID is the class irrelevant instance-level descriptor method (the present invention).
Table 1 shows the results of retrieval experiments on three well-known instance retrieval datasets, indicating how accurately the different methods retrieve instances by reporting the mean average precision over the top 50, top 100 and all retrieved results. Table 1 shows that the invention exceeds the other instance retrieval approaches in retrieval accuracy.
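For reference, one common way to compute mean average precision at a cut-off K is sketched below. The text does not give the exact formula used for Table 1, so the normalization by `min(K, number of relevant items)` is an assumption, and the function names are hypothetical.

```python
def average_precision_at_k(relevance, k, num_relevant):
    """Average precision over the top-k ranks of one query.

    relevance: list of 0/1 flags in ranked order (1 = correct instance).
    num_relevant: total number of relevant items for the query.
    """
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this rank
    denom = min(k, num_relevant)  # assumed normalization
    return total / denom if denom else 0.0


def mean_ap_at_k(runs, k):
    """Mean of the per-query average precision; runs = [(relevance, num_relevant)]."""
    return sum(average_precision_at_k(r, k, n) for r, n in runs) / len(runs)


ap = average_precision_at_k([1, 0, 1], k=3, num_relevant=2)
m_ap = mean_ap_at_k([([1, 0, 1], 2), ([0, 1], 1)], k=3)
```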
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any adaptations, uses, or adaptations of the disclosure following the general principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains.

Claims (4)

1. A class irrelevant instance retrieval method based on hierarchical semantic region decomposition, in which a detector extracts instances from the feature images and stores them in a potential instance library; a descriptor then extracts features from the instances in the potential instance library to obtain instance features, which are stored in a feature library; when a query picture is input, a retriever extracts features from the query picture to obtain query features, matches the query features against the instance features in the feature library, and returns the top K instance features most similar to the query features as the retrieval result; the method is characterized in that the detector processes the feature images as follows:
S1, encoding a feature image with a feature encoder trained by an unsupervised method to obtain an encoded image block set;
The feature images come from a feature library; after a feature image is encoded by the unsupervised feature encoder, the resulting feature set is the encoded image block set;
S2, performing hierarchical decomposition on the image by fast bisecting clustering, so as to detect instance regions at different scales; the steps are as follows:
Adding the coded image block set into a to-be-processed node queue, and performing hierarchical bipartition on each node in the queue until no node exists in the queue;
S2.1, performing initialization clustering on the image block set in each node: if the similarity between two image blocks is greater than a threshold, the two image blocks are considered similar, and the degree of each of the two image blocks is increased by 1; after the similarity has been computed for all image blocks, the image block features with the highest and the lowest degree are selected as the clustering seeds, and the remaining image block features are assigned to the corresponding clusters according to their distances to the seeds;
S2.2, performing binary clustering on each cluster obtained in the step S2.1;
An objective function is defined [formula not reproduced]; the objective function is intended to minimize the distance between features within each cluster; in it, k denotes the number of clusters, and each image block feature belongs to one of the clusters;
Specifically, the distance between each feature and the cluster it belongs to is computed as follows [formula not reproduced];
where the quantities involved are the cluster, the number of features in the cluster, the sum of the features in the cluster, and the transpose of the image block feature;
When a feature is moved to another cluster, the distance between the feature and the cluster it is moved to is computed as follows [formula not reproduced];
where the quantities involved are the target cluster, the number of features in that cluster, and the sum of the features in that cluster;
If comparing the two distances shows that moving the feature improves the objective function, the feature is moved; this process is iterated until the objective function converges, yielding two new image block sets;
S2.3, detecting regional connectivity;
The two image block sets obtained in step S2.2 are mapped back to the original image to obtain two partial sub-images; each sub-image is further divided into smaller subregions according to the spatial connectivity of its image blocks, so that regional connectivity detection yields at least one group of connected subregions per sub-image, which makes the subsequent processing more accurate and reliable;
S2.4, judging whether to continue bisecting;
The subregions obtained in step S2.3 are mapped to the encoder and encoded to obtain a plurality of image block sets, and for each image block set the average internal connectivity is used to judge whether to continue bisecting;
For the feature set S corresponding to each image block set, connectivity is measured by counting the internal edges of S; specifically, given the total number of internal edges of S, the average internal connectivity is compared with a threshold to decide whether S is segmented further: when the average internal connectivity is greater than the threshold, the image block set corresponding to the feature set S is discarded; otherwise, the image block set corresponding to the feature set S is added to the queue and bisection continues;
S3, instance screening: instances whose semantics are not salient are screened out according to a saliency condition;
For the feature set S corresponding to each image block set, the intensity of feature energy is used to determine whether the feature set node is a dummy node, specifically as follows:
Given a feature set S and the set of features with high energy, the overlap ratio between the feature set S and the high-energy feature set is computed as follows [formula not reproduced];
When the overlap ratio is smaller than a threshold, the feature set S is a dummy node and is discarded; otherwise, the feature set is retained.
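The region connectivity detection of step S2.3 in claim 1 can be illustrated with a minimal Python sketch. It assumes that each image block is identified by its (row, column) position on the encoder grid and that two blocks are spatially connected when they share a side (4-connectivity); the claim fixes neither choice, and the function name `connected_subregions` is hypothetical.

```python
from collections import deque


def connected_subregions(blocks):
    """Split a set of (row, col) block positions into 4-connected groups."""
    remaining = set(blocks)
    regions = []
    while remaining:
        # Start a new region from any block that has not been visited yet.
        start = remaining.pop()
        region = {start}
        queue = deque([start])
        while queue:
            r, c = queue.popleft()
            # Assumption: neighbours are the blocks directly above, below, left, right.
            for nb in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    region.add(nb)
                    queue.append(nb)
        regions.append(region)
    return regions


# Two blocks touch each other; the third is isolated, so two subregions result.
subregions = connected_subregions({(0, 0), (0, 1), (3, 3)})
print(len(subregions))  # 2
```

Each returned group corresponds to one connected subregion that would then be re-encoded and tested in step S2.4.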
2. The class irrelevant instance retrieval method based on hierarchical semantic region decomposition according to claim 1, characterized in that: in step S2.4, the internal edges of the set S are established as follows: if the similarity between two image blocks in the set S is greater than a threshold, the two image blocks are considered similar and an edge is established between them; this edge is one of the internal edges of the set S.
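A minimal sketch of the edge rule in claim 2, assuming cosine similarity as the similarity measure (the claim does not name one) and an illustrative threshold value; `cosine`, `internal_edges`, and `degrees` are hypothetical helper names. The same thresholded comparison also gives the per-block degrees used to choose seeds in step S2.1.

```python
import math


def cosine(u, v):
    """Cosine similarity of two feature vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def internal_edges(features, threshold):
    """Return index pairs (i, j), i < j, whose similarity exceeds the threshold."""
    edges = []
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            if cosine(features[i], features[j]) > threshold:
                edges.append((i, j))
    return edges


def degrees(num_blocks, edges):
    """Count, for every image block, how many edges touch it."""
    deg = [0] * num_blocks
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg


# Blocks 0 and 1 point in nearly the same direction; block 2 is orthogonal to block 0.
feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
edges = internal_edges(feats, threshold=0.8)  # threshold value is illustrative
print(edges, degrees(len(feats), edges))      # [(0, 1)] [1, 1, 0]
```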
3. The class irrelevant instance retrieval method based on hierarchical semantic region decomposition according to claim 1, characterized in that: in step S2.4, the average internal connectivity is defined as the ratio of the total number of edges to the size of the feature set, and the value of the average internal connectivity lies within a bounded range [bounds not reproduced].
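The ratio in claim 3 can be sketched as follows, assuming the internal edges are given as undirected index pairs and the feature set size is the number of image blocks in S; the function name is hypothetical.

```python
def average_internal_connectivity(edges, set_size):
    """Ratio of the number of internal edges to the size of the feature set."""
    if set_size <= 0:
        raise ValueError("feature set must contain at least one image block")
    # Count each undirected edge once, whatever order its endpoints are given in.
    unique_edges = {tuple(sorted(edge)) for edge in edges}
    return len(unique_edges) / set_size


# Four image blocks joined in a chain by three edges.
chain = [(0, 1), (1, 2), (2, 3)]
print(average_internal_connectivity(chain, 4))  # 0.75
```

The resulting value is what step S2.4 compares against its threshold.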
4. The class irrelevant instance retrieval method based on hierarchical semantic region decomposition according to claim 1, characterized in that: the descriptor processes the instances in the potential instance library as follows:
For a feature set in an image obtained by the detector, the feature set is projected, according to its order, onto the downsampled feature map of the encoder, whose size is given by the length and the width of the feature map, forming a mask of the same size in which positions belonging to the instance are 1 and all other positions are 0; the mask is upsampled to the original image size to obtain the instance representation mask at the original size; the image is input into a convolutional-neural-network-based feature extractor to obtain a feature map encoding the image; for each localization result represented by a mask, the mask is downsampled to the same size as the feature map; the feature map is multiplied by the downsampled mask, and generalized mean pooling is applied on each channel to obtain the feature representation of each instance.
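A minimal pure-Python sketch of the masking and pooling at the end of claim 4, under stated assumptions: nearest-neighbour resizing for the mask, non-negative activations, and a pooling exponent `p = 3` (a common default for generalized mean pooling, not a value taken from the claim). The function names are hypothetical.

```python
def downsample_mask(mask, out_h, out_w):
    """Nearest-neighbour resize of a binary mask given as a list of rows."""
    in_h, in_w = len(mask), len(mask[0])
    return [[mask[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]


def masked_gem(feature_map, mask, p=3.0, eps=1e-6):
    """Generalized mean pooling per channel of feature_map multiplied by mask.

    feature_map: [C][H][W] non-negative activations; mask: [H][W] of 0/1.
    """
    height, width = len(mask), len(mask[0])
    pooled = []
    for channel in feature_map:
        # Multiply by the mask, clamp to eps, raise to the power p.
        powered = [max(channel[r][c] * mask[r][c], eps) ** p
                   for r in range(height) for c in range(width)]
        # Mean over all positions, then take the p-th root.
        pooled.append((sum(powered) / len(powered)) ** (1.0 / p))
    return pooled


# A 4x4 instance mask reduced to the 2x2 resolution of a one-channel feature map.
full_mask = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
small_mask = downsample_mask(full_mask, 2, 2)  # [[1, 0], [0, 0]]
descriptor = masked_gem([[[2.0, 2.0], [2.0, 2.0]]], small_mask)
```

With a larger `p` the pooled value moves toward the maximum activation inside the mask; with `p = 1` it reduces to average pooling over the masked map.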

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410173702.3A CN117725243B (en) 2024-02-07 2024-02-07 Class irrelevant instance retrieval method based on hierarchical semantic region decomposition

Publications (2)

Publication Number Publication Date
CN117725243A (en) 2024-03-19
CN117725243B (en) 2024-06-04

Family

ID=90207330

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410173702.3A Active CN117725243B (en) 2024-02-07 2024-02-07 Class irrelevant instance retrieval method based on hierarchical semantic region decomposition

Country Status (1)

Country Link
CN (1) CN117725243B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111914107A (en) * 2020-07-29 2020-11-10 Xiamen University Instance retrieval method based on multi-channel attention area expansion
CN114730463A (en) * 2019-11-22 2022-07-08 Hoffmann-La Roche (豪夫迈·罗氏有限公司) Multi-instance learner for tissue image classification
CN117453944A (en) * 2023-12-25 2024-01-26 Xiamen University Multi-level significant region decomposition unsupervised instance retrieval method and system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8949252B2 (en) * 2010-03-29 2015-02-03 Ebay Inc. Product category optimization for image similarity searching of image-based listings in a network-based publication system

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
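Instance search based on weakly supervised feature learning; Jie Lin et al.; Neurocomputing; 2019-12-04; Vol. 424; full text *
A K-means text clustering algorithm with optimized initial center points; 赵万磊 et al.; 《计算机软件及计算机应用》; 2005-09-10; Vol. 25 (No. 9); full text *
An image representation method based on multiple-instance learning; 祁萌; 工业控制计算机; 2012-06-25 (No. 06); full text *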

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant