CN112883216B - Semi-supervised image retrieval method and device based on disturbance consistency self-integration - Google Patents
- Publication number
- CN112883216B (application CN202110226266.8A)
- Authority
- CN
- China
- Prior art keywords
- data
- image
- semi
- hash
- layer
- Prior art date
- Legal status: Active (assumed; not a legal conclusion)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/2155—Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/55—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Library & Information Science (AREA)
- Databases & Information Systems (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computational Linguistics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Evolutionary Biology (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a semi-supervised image retrieval method and device based on disturbance consistency self-integration. An image is input into a trained semi-supervised image feature extraction model to obtain its features; the model comprises a convolutional neural network, a hash layer and a disturbance consistency self-integration module. The features of the image are converted into a discrete binary hash code of the image, and retrieval is performed with the binary hash code to obtain the image retrieval result. By integrating the features of the same sample under different data enhancement conditions, the method can discover the distinguishing features of each category; a designed disturbance consistency loss function maximizes the similarity between the hash-layer output of unlabeled data and the corresponding integrated features, making full use of the unlabeled data to improve the generalization ability of the network and obtain a better retrieval effect.
Description
Technical Field
The invention belongs to the technical field of software, and particularly relates to a semi-supervised image retrieval method and device based on disturbance consistency self-integration.
Background
With the explosive growth of image data on the Internet, the huge volume of image data and the high dimensionality of image features pose a major challenge for image retrieval. Deep hashing has become a research hotspot in recent years due to its low storage cost and high retrieval speed.
Generally, a deep hash method maps high-dimensional real-valued image features into compact binary hash codes to enable fast retrieval, and constrains the hash codes with the semantic similarity of images during the mapping to ensure retrieval accuracy. In a big-data environment, supervised hash methods usually depend on a large amount of labeled image data to obtain high retrieval accuracy, and their performance degrades greatly when only a small amount of labeled data is available. Chinese patent application CN109800314A discloses a method for generating hash codes for image retrieval using a deep convolutional network, in which a hash layer is added before the classification layer and the output of the hash layer is binarized to obtain the hash codes of images. However, that application trains the hash model with a large amount of labeled data to obtain good retrieval performance, and in practical scenarios labeling a large amount of data consumes huge manpower and material resources. Deep semi-supervised hashing methods have therefore been proposed, which learn a better hash function from a small amount of labeled data together with a large amount of unlabeled data.
Existing semi-supervised hash methods mainly use the visual similarity between unlabeled and labeled data to guide the learning of hash codes for the unlabeled data, realizing hash-function learning by preserving the visual neighbor relations between unlabeled and labeled samples in the hash space. Many researchers have therefore tried to construct reliable sample neighbor relations. These research efforts can be broadly divided into graph-based methods and relation-consistency-based methods. Graph-based methods construct an approximate graph using the visual similarity between samples, where the nodes of the graph represent labeled and unlabeled data and the edges reflect the visual similarity between samples. Relation-consistency-based methods adopt a self-integration model to generate an integrated feature for each sample, and use the visual similarity of the integrated features between paired samples to represent the semantic similarity between the samples.
At present, semi-supervised hashing methods use the visual similarity between samples to represent their semantic similarity, but visual similarity cannot reflect the true semantic similarity between samples: two samples with similar visual information may come from two different categories. Guiding hash-code learning with wrong visual similarity therefore makes the similarity of the hash codes learned for two samples inconsistent with their true semantic similarity relationship.
Disclosure of Invention
Aiming at the problems of the existing methods, the invention aims to provide a semi-supervised image retrieval method and device based on disturbance consistency self-integration.
The technical content of the invention comprises:
a semi-supervised image retrieval method based on disturbance consistency self-integration comprises the following steps:
1) inputting the image into a trained semi-supervised image feature extraction model to obtain the features of the image, wherein the semi-supervised image feature extraction model comprises a convolutional neural network, a hash layer and a disturbance consistency self-integration module, and is trained using a small amount of labeled data and a large amount of unlabeled data as follows:
1.1) training a pre-trained convolutional neural network and a hash layer with a small amount of labeled data to obtain a preliminarily trained convolutional neural network and hash layer;
1.2) the disturbance consistency self-integration module maximizes the similarity between the hash-layer output h_k of unlabeled data x_k and the corresponding integrated feature, and the preliminarily trained convolutional neural network and hash layer are further trained to obtain the trained convolutional neural network and hash layer; the integrated feature h̄_k^t, where t is the iteration number and k is the index of the unlabeled data, is obtained by a weighted summation of h_k and h̄_k^{t−1};
2) converting the features of the image into a discrete binary hash code of the image;
3) retrieving according to the binary hash code to obtain an image retrieval result.
Further, before the labeled data and unlabeled data are input into the convolutional neural network, enhanced versions of the labeled data and the unlabeled data are respectively obtained, and the semi-supervised image feature extraction model is obtained by training on the enhanced data of the labeled and unlabeled data.
Further, the semi-supervised image feature extraction model further comprises a classification layer; before the preliminarily trained convolutional neural network and hash layer are trained with unlabeled data, the classification layer is trained using the fc7 features of the labeled data to obtain a trained classification layer, wherein the fc7 features are the fully-connected-layer output of the convolutional neural network.
Further, the classification loss function for the classification training is L_c = ∑_{j∈L} −y_j log f_j, where y_j is the true label of labeled data x_j, f_j is the classification-layer prediction for labeled data x_j, j is the index of the labeled data, and L is the labeled data set.
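For illustration, the classification loss above is a standard cross-entropy summed over the labeled set; a minimal sketch, with purely illustrative one-hot labels and prediction values:

```python
import numpy as np

def classification_loss(y_true, f_pred, eps=1e-12):
    """L_c = sum over labeled set of -y_j * log(f_j): cross-entropy between
    one-hot true labels y_j and classification-layer predictions f_j."""
    return float(-np.sum(y_true * np.log(f_pred + eps)))

# two labeled samples, three classes (illustrative values)
y = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
f = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1]])
loss = classification_loss(y, f)
```

Summing (rather than averaging) over the labeled set follows the formula as stated; `eps` only guards against log(0).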
Further, the hash layer is trained on the labeled data through a pairwise similarity preserving loss function L_s, where S is the semantic similarity matrix, and h_i and h_j are the hash-layer outputs of labeled data x_i and x_j, respectively.
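The pairwise loss itself appears only as an image in the source, so its exact form is not recoverable here; the sketch below uses the widely adopted negative-log-likelihood formulation (with Θ_ij = ½·h_iᵀh_j), which is an assumption consistent with the symbols S, h_i and h_j defined above, not necessarily the patent's exact formula:

```python
import numpy as np

def pairwise_similarity_loss(H, S):
    """Assumed NLL form of a pairwise similarity preserving loss:
    L_s = -sum_ij ( S_ij * theta_ij - log(1 + exp(theta_ij)) ),
    with theta_ij = 0.5 * h_i^T h_j and S_ij = 1 for same-class pairs."""
    theta = 0.5 * H @ H.T
    log_term = np.logaddexp(0.0, theta)  # numerically stable log(1 + exp(theta))
    return float(-np.sum(S * theta - log_term))

# three samples: the first two share a class, the third differs
H = np.array([[ 1.0,  1.0, -1.0],
              [ 1.0,  1.0, -1.0],
              [-1.0, -1.0,  1.0]])
S = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 0, 1]])
loss = pairwise_similarity_loss(H, S)
```

For these codes every pair already agrees with S, so each of the nine terms contributes the same small penalty log(1 + e^−1.5).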
Further, the disturbance consistency self-integration module further comprises a memory bank; the integrated features h̄_k^t are stored in the memory bank.
Further, a disturbance consistency loss function L_u maximizes the similarity between the hash-layer output h_k of unlabeled data x_k and the corresponding integrated feature, where U is the unlabeled data set, μ is a scaling factor, and α is the momentum coefficient.
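A sketch of this unlabeled-data objective under stated assumptions: the EMA update follows the weighted-summation description in the text, while the similarity measure (cosine similarity scaled by μ) is an assumption, since the exact L_u formula appears only as an image in the source:

```python
import numpy as np

def ema_update(h_bar_prev, h_k, alpha=0.9):
    """Exponential-moving-average update of the memory bank, reconstructed
    from the text's weighted-summation description:
    h_bar^t = alpha * h_bar^(t-1) + (1 - alpha) * h_k."""
    return alpha * h_bar_prev + (1.0 - alpha) * h_k

def perturbation_consistency_loss(H_u, H_bar, mu=1.0):
    """Assumed form of L_u: negative mean cosine similarity (scaled by mu)
    between unlabeled hash-layer outputs and their integrated features, so
    that minimizing the loss maximizes the similarity."""
    cos = np.sum(H_u * H_bar, axis=1) / (
        np.linalg.norm(H_u, axis=1) * np.linalg.norm(H_bar, axis=1))
    return float(-mu * np.mean(cos))

# one unlabeled sample whose current output is parallel to its memory entry
h_bar = np.array([[1.0, -1.0, 1.0]])   # integrated feature in the memory bank
h_k = np.array([[0.5, -0.5, 0.5]])     # current hash-layer output
updated = ema_update(h_bar, h_k, alpha=0.9)
loss = perturbation_consistency_loss(h_k, h_bar)
```

With α = 0.9 the memory entry moves only 10% of the way toward the new output, which keeps the integrated feature stable across iterations.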
Further, the method for converting the hash-layer output features of the image into the discrete binary hash code of the image is: inputting the features of the image into a sign function sgn(·).
A storage medium having a computer program stored therein, wherein the computer program is arranged to perform the above-mentioned method when executed.
An electronic device comprising a memory and a processor, the memory having a computer program stored therein and the processor being arranged to run the computer program to perform the method described above.
Compared with the prior art, the invention has the following positive effects:
1) by integrating the hash-layer features of the same sample under different data enhancement conditions, the distinguishing features of each category can be discovered;
2) the designed disturbance consistency loss function maximizes the similarity between the hash-layer output of unlabeled data and the corresponding integrated features, making full use of the unlabeled data to improve the generalization ability of the network;
3) better search effect can be obtained.
Drawings
FIG. 1 is a diagram of a semi-supervised hashing framework in accordance with the present invention.
Detailed Description
In order to make the technical solutions in the embodiments of the present invention better understood and make the objects, features, and advantages of the present invention more comprehensible, the technical core of the present invention is described in further detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The invention maximizes the similarity between the hash-layer output of unlabeled data and the corresponding integrated features to improve the generalization ability of the network, and designs a semi-supervised hash framework based on disturbance consistency self-integration (DCSE), as shown in FIG. 1. The framework comprises three parts: (1) a backbone network comprising a convolutional neural network, a hash layer and a classification layer; (2) a pairwise similarity preserving loss function and a classification loss function for learning hash codes and performing image classification on the labeled data; (3) a disturbance consistency self-integration module, which integrates the network outputs of the same unlabeled sample under different data enhancement conditions into a global feature, and then maximizes the similarity between the network output of the sample and the corresponding integrated feature using a designed disturbance consistency loss function.
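The backbone wiring in part (1) can be sketched as follows; the layer sizes (4096-d fc7 features, 48-bit codes, 10 classes), the random stand-in weights, and the tanh activation on the hash layer are illustrative assumptions, not details taken from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    # numerically stable softmax along the last axis
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# hypothetical sizes: 4096-d fc7 features, 48-bit codes, 10 classes
FC7_DIM, HASH_BITS, NUM_CLASSES = 4096, 48, 10

# random stand-in weights for the two heads on top of the CNN backbone
W_hash = rng.standard_normal((FC7_DIM, HASH_BITS)) * 0.01
W_cls = rng.standard_normal((FC7_DIM, NUM_CLASSES)) * 0.01

def hash_layer(fc7):
    """Real-valued hash-layer output squashed into (-1, 1); tanh is a common
    choice in deep hashing but an assumption here."""
    return np.tanh(fc7 @ W_hash)

def classification_layer(fc7):
    """Class-probability predictions f_j consumed by the classification loss."""
    return softmax(fc7 @ W_cls)

fc7 = rng.standard_normal((2, FC7_DIM))  # stand-in for CNN fc7 features
h = hash_layer(fc7)
f = classification_layer(fc7)
```

Both heads branch off the same fc7 features: the hash head feeds the pairwise and consistency losses, the classification head feeds the classification loss.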
Specifically, the labeled data and unlabeled data under different data enhancement conditions are input into the neural network to obtain the fc7-layer features.
In the labeled data stream, the fc7 feature output by the fully-connected layer for the labeled data is passed to the classification layer for classification, with the following classification loss function:
L_c = ∑_{j∈L} −y_j log f_j (1)
wherein y is j And f j Is the mark data x j True labels and classification layer prediction results, L denotes the label dataset. Simultaneously fc7 features of the label data are transferred to the hash layer for hash code learning, and the pairwise similarity preserving loss function is as follows:
whereinh i Is the mark data x i S is the semantic similarity matrix if sample x i And x j Have the same class, then S ij 1, otherwise S ij =0。
In the unlabeled data stream, a memory bank is established to store the integrated global feature of each sample, and a novel disturbance consistency loss function L_u (3) is designed to maximize the similarity between the output h_k of the current unlabeled sample x_k and the corresponding integrated feature h̄_k^{t−1}.
In loss function (3), μ is the scaling factor. The memory bank is then updated with an exponential moving average (EMA), i.e. equation (4): h̄_k^t = α·h̄_k^{t−1} + (1 − α)·h_k, where α is the momentum coefficient and t is the iteration number.
In actual image retrieval, the image features output by the hash layer of the semi-supervised hash framework are input into the sign function sgn(·) to obtain the discrete binary hash code of the image, and retrieval is performed with this binary hash code to obtain the image retrieval result.
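The binarization and retrieval just described can be sketched as follows; the {−1, +1} code convention and Hamming-distance ranking are standard deep-hashing practice assumed here, and the 8-bit feature vectors are purely illustrative:

```python
import numpy as np

def binarize(features):
    """Apply the sign function: map real-valued hash-layer outputs to
    discrete {-1, +1} binary codes."""
    return np.where(features > 0, 1, -1)

def hamming_distance(a, b):
    """Number of differing bits between two {-1, +1} codes."""
    return int(np.sum(a != b))

def retrieve(query_feat, db_feats):
    """Rank database items by Hamming distance between binary codes."""
    q = binarize(query_feat)
    codes = binarize(db_feats)
    dists = [hamming_distance(q, c) for c in codes]
    return np.argsort(dists)

# toy example with hypothetical 8-bit hash-layer outputs
query = np.array([0.9, -0.2, 0.4, -0.7, 0.1, 0.3, -0.5, 0.8])
db = np.array([
    [1.0, -0.1, 0.5, -0.9, 0.2, 0.4, -0.6, 0.7],   # same sign pattern as query
    [-0.3, 0.2, -0.4, 0.8, -0.1, -0.3, 0.5, -0.9],  # opposite sign pattern
])
ranking = retrieve(query, db)
```

Because only signs matter after binarization, the first database item matches the query exactly (distance 0) and ranks first.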
To validate the invention, we performed extensive experiments to evaluate the retrieval effect of DCSE. Our model was trained and tested on the image datasets CIFAR-10 and NUS-WIDE. CIFAR-10 contains 60,000 images; we randomly selected 100 images per class as the query set and used the remaining images as the retrieval set, in which 500 images per class were selected as the labeled data set and the rest served as the unlabeled data set. The NUS-WIDE dataset contains about 270,000 images; we selected the 21 most frequent categories, each with at least 5,000 images, then randomly selected 100 images per class as the query set and used the remaining images as the retrieval set. In the training phase, 500 images per class were randomly selected from the retrieval set as the labeled data set, and the rest were used as the unlabeled data set. Our base network is a pre-trained VGG16.
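The per-class protocol just described can be sketched as a simple split routine (the function name and the scaled-down toy sizes are illustrative, not from the patent):

```python
import random

def split_per_class(image_ids, labels, n_query_per_class,
                    n_labeled_per_class, seed=0):
    """Per-class split following the protocol above: a query set, a labeled
    subset of the retrieval set, and the remaining unlabeled retrieval images."""
    rng = random.Random(seed)
    by_class = {}
    for img, lab in zip(image_ids, labels):
        by_class.setdefault(lab, []).append(img)
    query, labeled, unlabeled = [], [], []
    for imgs in by_class.values():
        rng.shuffle(imgs)
        query.extend(imgs[:n_query_per_class])
        rest = imgs[n_query_per_class:]
        labeled.extend(rest[:n_labeled_per_class])
        unlabeled.extend(rest[n_labeled_per_class:])
    return query, labeled, unlabeled

# toy run: 2 classes of 1000 images each, scaled-down split sizes
ids = list(range(2000))
labs = [i // 1000 for i in ids]
q, l, u = split_per_class(ids, labs, n_query_per_class=10,
                          n_labeled_per_class=50)
```

The three resulting sets are disjoint by construction, mirroring the CIFAR-10 query / labeled / unlabeled partition (at full scale: 100 query and 500 labeled images per class).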
Table 1 shows the mAP results of DCSE and other image retrieval methods on CIFAR-10 and NUS-WIDE, including: locality-sensitive hashing (LSH), iterative quantization (ITQ), supervised discrete hashing (SDH), convolutional neural network hashing (CNNH), network-in-network hashing (NINH), semi-supervised deep hashing (SSDH), bipartite graph deep hashing (BGDH), semi-supervised generative adversarial hashing (SSGAH), semi-supervised deep pairwise hashing (SSDPH), and generalized product quantization (GPQ). The experimental results show that the invention outperforms the other comparison methods.
Table 2 shows the results of ablation experiments on DCSE; DCSE-1 is a variant of DCSE with the disturbance consistency self-integration module removed. The experimental results show that the proposed disturbance consistency self-integration module significantly improves semi-supervised retrieval performance.
Table 3 shows the results of the unseen-class experiment, in which 75% of the classes in the dataset are used for training and the remaining 25% for testing. Specifically, we divide the dataset into four parts: train75, test75, train25 and test25, where train75 and test75 belong to the 75% of classes and train25 and test25 belong to the remaining 25%. We use train75 as the labeled training set, train25 and test75 as the retrieval set, and test25 as the query set. The experimental results show that the invention outperforms the other comparison methods.
Table 1 mAP results of different methods at different bit lengths on the two datasets
Table 2 Ablation experiment results
Table 3 Unseen-class experiment results
The above embodiments merely express implementations of the present invention, and although their description is specific and detailed, it should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and improvements without departing from the inventive concept, all of which fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be subject to the appended claims.
Claims (5)
1. A semi-supervised image retrieval method based on disturbance consistency self-integration comprises the following steps:
1) inputting the image into a trained semi-supervised image feature extraction model to obtain the features of the image, wherein the semi-supervised image feature extraction model comprises a convolutional neural network, a hash layer, a classification layer and a disturbance consistency self-integration module, and is trained using a small amount of labeled data and a large amount of unlabeled data as follows:
1.1) inputting labeled data and unlabeled data under different data enhancement conditions into the convolutional neural network to obtain fc7-layer features, wherein the fc7-layer features are the fully-connected-layer output of the convolutional neural network;
1.2) respectively passing the fc7-layer features of the labeled data to the classification layer for classification learning and to the hash layer for hash-code learning, wherein the loss function of the classification learning is L_c = ∑_{j∈L} −y_j log f_j, in which y_j and f_j are the true label and the classification-layer prediction result of labeled data x_j and L represents the labeled data set; the loss function of the hash-code learning is a pairwise similarity preserving loss in which h_i is the hash-layer output of labeled data x_i and S_ij is an element of the semantic similarity matrix S, with S_ij = 1 when labeled data x_i and labeled data x_j belong to the same class and S_ij = 0 otherwise;
1.3) the disturbance consistency self-integration module integrates the hash-layer outputs h_k of the same unlabeled sample x_k under different data enhancement conditions to form a global feature, and maximizes the similarity between the hash-layer output h_k of the unlabeled sample x_k and the corresponding integrated feature using a disturbance consistency loss function L_u, in which μ is a scaling factor and U is the unlabeled data set; the integrated features are updated using an exponential moving average h̄_k^t = α·h̄_k^{t−1} + (1 − α)·h_k, where α is the momentum coefficient and t represents the iteration number in training;
2) converting the features of the image into a discrete binary hash code of the image;
3) retrieving according to the binary hash code to obtain an image retrieval result.
2. The method of claim 1, wherein before the labeled data and the unlabeled data are input into the trained convolutional neural network, enhanced data of the labeled data and the unlabeled data are respectively obtained, and the semi-supervised image feature extraction model is obtained through training of the enhanced data of the labeled data and the unlabeled data.
5. An electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the method according to any of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110226266.8A CN112883216B (en) | 2021-03-01 | 2021-03-01 | Semi-supervised image retrieval method and device based on disturbance consistency self-integration |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110226266.8A CN112883216B (en) | 2021-03-01 | 2021-03-01 | Semi-supervised image retrieval method and device based on disturbance consistency self-integration |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112883216A CN112883216A (en) | 2021-06-01 |
CN112883216B true CN112883216B (en) | 2022-09-16 |
Family
ID=76055106
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110226266.8A Active CN112883216B (en) | 2021-03-01 | 2021-03-01 | Semi-supervised image retrieval method and device based on disturbance consistency self-integration |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112883216B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113762393B (en) * | 2021-09-08 | 2024-04-30 | 杭州网易智企科技有限公司 | Model training method, gaze point detection method, medium, device and computing equipment |
CN114972118B (en) * | 2022-06-30 | 2023-04-28 | 抖音视界有限公司 | Noise reduction method and device for inspection image, readable medium and electronic equipment |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018028255A1 (en) * | 2016-08-11 | 2018-02-15 | 深圳市未来媒体技术研究院 | Image saliency detection method based on adversarial network |
CN109165306A (en) * | 2018-08-09 | 2019-01-08 | 长沙理工大学 | Image search method based on the study of multitask Hash |
CN109241313A (en) * | 2018-08-14 | 2019-01-18 | 大连大学 | A kind of image search method based on the study of high-order depth Hash |
CN110309331A (en) * | 2019-07-04 | 2019-10-08 | 哈尔滨工业大学(深圳) | A kind of cross-module state depth Hash search method based on self-supervisory |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105512273A (en) * | 2015-12-03 | 2016-04-20 | 中山大学 | Image retrieval method based on variable-length depth hash learning |
- 2021-03-01: CN application CN202110226266.8A (patent CN112883216B, status: Active)
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018028255A1 (en) * | 2016-08-11 | 2018-02-15 | 深圳市未来媒体技术研究院 | Image saliency detection method based on adversarial network |
CN109165306A (en) * | 2018-08-09 | 2019-01-08 | 长沙理工大学 | Image search method based on the study of multitask Hash |
CN109241313A (en) * | 2018-08-14 | 2019-01-18 | 大连大学 | A kind of image search method based on the study of high-order depth Hash |
CN110309331A (en) * | 2019-07-04 | 2019-10-08 | 哈尔滨工业大学(深圳) | A kind of cross-module state depth Hash search method based on self-supervisory |
Also Published As
Publication number | Publication date |
---|---|
CN112883216A (en) | 2021-06-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |