CN112214623A - Image-text sample-oriented efficient supervised graph embedding cross-media hash retrieval method
Image-text sample-oriented efficient supervised graph embedding cross-media hash retrieval method
- Publication number
- CN112214623A (application number CN202010943065.5A)
- Authority
- CN
- China
- Prior art keywords
- sample
- image
- matrix
- text
- samples
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/55—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/316—Indexing structures
- G06F16/325—Hash tables
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/40—Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
- G06F16/41—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/53—Querying
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Biomedical Technology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Evolutionary Biology (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Multimedia (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the technical field of multimedia, and in particular to an image-text sample-oriented efficient supervised graph embedding cross-media hash retrieval method, which comprises the following steps: constructing a set of image-text sample pairs and labeling the semantic categories of the sample pairs; extracting the features of the image and text samples in the sample set and mapping the features to a nonlinear space using a radial basis Gaussian kernel function; constructing a graph adjacency matrix of the sample pairs from their class labels and deriving the corresponding Laplacian matrix; mapping the class labels to a latent semantic space by a linear mapping, and learning a linear projection matrix for each of the image and text modalities while preserving the inter-modal and intra-modal semantic similarity of the image and text samples; learning an orthogonal rotation matrix to minimize the quantization error; and providing a discrete iterative optimization algorithm to obtain a discrete solution of the hash codes. By learning the hash codes with the inter-modal and intra-modal semantic similarity of the image and text samples, the label-based similarity and the minimized quantization error, the invention improves the retrieval performance of the algorithm.
Description
Technical Field
The invention relates to the technical field of multimedia, and in particular to an efficient supervised graph embedding cross-media hash retrieval method for image-text samples.
Background
With the rapid development of network technologies and portable mobile devices, more and more people are accustomed to sharing moments of their lives through the network; for example, on a birthday a person may publish a birthday photo (image) and describe his or her mood (text) through social software such as WeChat or Facebook. As a result, the data on the network grows explosively, and how a user finds the required information in such massive data becomes a challenge. On the one hand, the amount of data on the network is large, and the dimensionality of the sample features is usually very high, even up to tens of thousands of dimensions; conventional retrieval methods need to calculate the distance between the query sample and every sample to be searched, such as the Euclidean distance or the cosine distance, which causes excessive computational complexity and memory overhead. On the other hand, the data on the network comes in multiple modalities, and the representations of different modalities are heterogeneous, so how to measure the similarity of heterogeneous samples becomes another challenge. Cross-media hashing methods can address both of these problems well. Supervised cross-media hashing methods can learn the hash codes from class labels containing high-level semantics, improving the discriminative ability of the hash codes and obtaining satisfactory retrieval performance. However, most of these methods still have the following problems that need to be solved: 1) most methods cannot fully exploit the class labels to improve the performance of the hash codes; existing methods mainly learn the hash codes by preserving similarity based on two similarity matrices, which not only causes a loss of class information but also leads to high computational complexity and memory overhead; 2) most existing discrete hashing methods solve the hash code matrix bit by bit during optimization, which results in high computational complexity.
The invention provides an efficient supervised graph embedding cross-media hash retrieval method for image-text samples that can effectively solve the above problems. First, in order to better preserve the semantic similarity of the samples, the invention simultaneously preserves the inter-modal and intra-modal semantic similarity of the samples and the label-based similarity while learning the hash codes and the linear mapping matrices, and learns an orthogonal rotation matrix to reduce the quantization error, further improving the discriminative ability of the hash codes. Then, an iterative optimization algorithm is provided, which not only directly obtains a closed-form discrete solution of the hash codes of the samples but also reduces the computational complexity of the algorithm.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides an image-text sample-oriented efficient supervised graph embedding cross-media hash retrieval method, characterized in that the following steps are implemented by means of a computer device:
step 1, collecting images and text samples from a network, taking the images and the text samples belonging to the same webpage as image-text sample pairs to form an image-text sample set, labeling the types of the image-text sample pairs, and dividing the image-text sample pairs into a training set and a test set;
step 2, extracting the characteristics of all images and text samples in the training set and the test set, and normalizing and removing the mean value of the characteristics;
step 3, denoting the features of the image-text sample pairs in the training set by $X = \{X^{(1)}, X^{(2)}\}$, where $X^{(1)} \in \mathbb{R}^{d_1 \times n}$ and $X^{(2)} \in \mathbb{R}^{d_2 \times n}$ are the features of all image samples and text samples in the training set respectively, $\mathbb{R}$ denotes the real numbers, $d_1$ and $d_2$ denote the feature dimensions, and $n$ denotes the number of image-text sample pairs in the training set; $Y \in \{0,1\}^{c \times n}$ denotes the class labels of the sample pairs, where $c$ denotes the total number of categories and $n$ the number of image-text sample pairs; randomly selecting $m$ sample pairs $\{a_j^{(1)}, a_j^{(2)}\}_{j=1}^{m}$ as anchor points, where $m \ll n$, and mapping the features of all image samples and text samples to a nonlinear space by using the Gaussian radial basis function:

$$\phi(x_i^{(t)}) = \Big[\exp\big(-\tfrac{\|x_i^{(t)} - a_1^{(t)}\|_2^2}{2\sigma^2}\big), \ldots, \exp\big(-\tfrac{\|x_i^{(t)} - a_m^{(t)}\|_2^2}{2\sigma^2}\big)\Big]^{\mathrm{T}}, \quad t \in \{1, 2\},$$

where $\sigma$ is a scale parameter, $\|\cdot\|_2$ denotes the $\ell_2$ norm, and $(\cdot)^{\mathrm{T}}$ denotes the transpose of a matrix or vector;
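As a concrete illustration of this nonlinear anchor mapping, a minimal sketch is given below; the anchor selection, the kernel-width heuristic and all function names are illustrative assumptions rather than the patent's reference implementation:

```python
import numpy as np

def rbf_anchor_mapping(X, anchors, sigma=None):
    """Map features X (n x d) to an n x m nonlinear space via a Gaussian
    radial basis kernel against m anchor points (m x d); names are illustrative."""
    # Squared Euclidean distances between every sample and every anchor
    sq_dist = (
        np.sum(X ** 2, axis=1, keepdims=True)
        - 2.0 * X @ anchors.T
        + np.sum(anchors ** 2, axis=1)
    )
    if sigma is None:
        # Assumed heuristic: set the scale to the root-mean-square distance
        sigma = np.sqrt(sq_dist.mean())
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

# Example: 1000 image features of dimension 150 and 500 randomly chosen anchors
rng = np.random.default_rng(0)
X_img = rng.standard_normal((1000, 150))
anchors = X_img[rng.choice(1000, size=500, replace=False)]
Phi_img = rbf_anchor_mapping(X_img, anchors)   # shape (1000, 500)
```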
step 4, constructing the graph adjacency matrix $A \in \mathbb{R}^{n \times n}$ of the sample pairs by using the class labels of the image-text sample pairs, where $\mathbb{R}$ denotes the real numbers; it is defined as follows:

$$A_{ij} = \frac{y_i^{\mathrm{T}} y_j}{\|y_i\|_2\,\|y_j\|_2},$$

where $A_{ij}$ denotes the value in the $i$-th row and $j$-th column of the matrix $A$, $y_i$ denotes the class label vector of the $i$-th sample pair, and $\|\cdot\|_2$ denotes the $\ell_2$ norm;
step 5, further obtaining the Laplacian matrix $L = D - A$ of the graph adjacency matrix $A$, where $D$ is the diagonal degree matrix of $A$ whose diagonal elements are $D_{ii} = \sum_{j=1}^{n} A_{ij}$;
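The label-based adjacency and Laplacian construction of steps 4 and 5 can be sketched as follows; the cosine-normalized form of the adjacency matrix is an assumption consistent with the factorization used in step 71, and the function name is hypothetical:

```python
import numpy as np

def label_graph_laplacian(Y):
    """Build the label-based graph adjacency matrix A and its Laplacian L = D - A.

    Y: (c, n) 0/1 class-label matrix with one column per image-text pair.
    The cosine-similarity form of A is an illustrative assumption.
    """
    Y_norm = Y / (np.linalg.norm(Y, axis=0, keepdims=True) + 1e-12)
    A = Y_norm.T @ Y_norm                 # (n, n) adjacency, entries in [0, 1]
    D = np.diag(A.sum(axis=1))            # diagonal degree matrix
    L = D - A                             # graph Laplacian
    return A, D, L

# Example: six sample pairs over three semantic categories
Y = np.array([[1, 1, 0, 0, 1, 0],
              [0, 1, 1, 0, 0, 1],
              [0, 0, 0, 1, 1, 0]], dtype=float)
A, D, L = label_graph_laplacian(Y)
```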
Step 6, constructing the objective function of the method based on the variables of steps 1 to 5, by preserving the inter-modal and intra-modal semantic similarity of the sample features and minimizing the quantization error; in the objective function, $\alpha$, $\beta$, $\gamma$, $\lambda$, $\mu$ and $\eta$ are weight parameters, $W_1$ and $W_2$ denote the linear projection matrices learned for the image and text sample modalities respectively, $r$ denotes the length of the hash code, $\mathrm{tr}(\cdot)$ denotes the trace of a matrix, $P$ is the linear mapping matrix that maps the class labels to the latent semantic space, $B$ is the learned hash code of the image-text sample pairs, $R$ is an orthogonal rotation matrix, $I_r$ denotes the identity matrix of size $r \times r$, and $\|\cdot\|_F$ denotes the Frobenius norm used in the regularization term;
step 7, solving the objective function by using an iterative optimization algorithm, which specifically comprises the following steps:
step 71, fixing $W_2$, $P$, $B$ and $R$ and solving for $W_1$: after removing the terms irrelevant to $W_1$, the objective function reduces to a subproblem dominated by the graph-embedding (Laplacian) term.

Because the Laplacian matrix $L$ is of size $n \times n$, computing the graph-embedding term directly incurs computational complexity and memory overhead quadratic in $n$, which would limit the application of the invention to large-scale sample sets; the term is therefore expanded by using $L = D - A$. However, computing the resulting products with $D$ and with $A$ naively is still quadratic in $n$. The invention therefore predefines constants: since the adjacency matrix $A$ is the Gram matrix of the normalized label vectors, the product involving $A$ can be written in terms of the precomputed product of the kernelized features with the normalized label matrix, and the product involving $D$ can be computed from the precomputed degrees $D_{ii}$, so that no $n \times n$ matrix ever needs to be formed and the cost of evaluating the graph-embedding term is reduced accordingly. Step 72 fixes $W_1$, $P$, $B$ and $R$ and solves for $W_2$ in the same manner;
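The kind of precomputation described above can be illustrated with the following sketch, which evaluates a quadratic form $\Phi^{\mathrm{T}} L \Phi$ without ever materializing the $n \times n$ adjacency or Laplacian; the cosine-normalized label adjacency and all names are assumptions for illustration, not the constants actually defined in the patent:

```python
import numpy as np

def laplacian_quadratic_form(Phi, Y):
    """Compute Phi.T @ L @ Phi with L = D - A and A built from normalized labels,
    without materializing any n x n matrix.

    Phi: (n, m) kernelized features; Y: (c, n) 0/1 label matrix.
    The cosine-normalized adjacency is an illustrative assumption.
    """
    Y_norm = Y / (np.linalg.norm(Y, axis=0, keepdims=True) + 1e-12)
    C = Phi.T @ Y_norm.T                        # (m, c) precomputed constant
    degrees = Y_norm.T @ Y_norm.sum(axis=1)     # D_ii = y_i^T (sum_j y_j), shape (n,)
    term_D = (Phi * degrees[:, None]).T @ Phi   # Phi.T @ D @ Phi, linear in n
    term_A = C @ C.T                            # Phi.T @ A @ Phi, linear in n
    return term_D - term_A

# Sanity check against the explicit O(n^2) construction on a small example
rng = np.random.default_rng(1)
n, m, c = 200, 16, 5
Phi = rng.standard_normal((n, m))
Y = np.zeros((c, n))
Y[rng.integers(0, c, size=n), np.arange(n)] = 1.0   # one category per pair
A = Y.T @ Y                                         # unit-norm labels here
L = np.diag(A.sum(axis=1)) - A
assert np.allclose(laplacian_quadratic_form(Phi, Y), Phi.T @ L @ Phi, atol=1e-6)
```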
Step 73, fixing,,Andsolving for: removing andan irrelevant term, then the objective function becomes:
step 74, fixing $W_1$, $W_2$, $P$ and $B$ and solving for the orthogonal rotation matrix $R$: after removing the terms irrelevant to $R$, the objective function reduces to an orthogonal Procrustes-type subproblem, which can be solved by the singular value decomposition (SVD) algorithm: decomposing the relevant matrix as $U \Sigma V^{\mathrm{T}}$, where $U$ is the left singular matrix, $V$ is the right singular matrix and $\Sigma$ is the singular value matrix, the solution is then $R = U V^{\mathrm{T}}$;
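For reference, the SVD-based update of an orthogonal rotation matrix is the standard orthogonal Procrustes solution, sketched below; the operands $B$ and $V$ are placeholders, since the exact matrices entering the decomposition in the objective are not reproduced here:

```python
import numpy as np

def update_rotation(B, V):
    """Solve min_R ||B - R @ V||_F^2 subject to R.T @ R = I via the SVD of
    B @ V.T (orthogonal Procrustes); B and V are placeholder operands."""
    U, _, Vt = np.linalg.svd(B @ V.T)    # B @ V.T is an r x r matrix
    return U @ Vt                         # orthogonal rotation matrix

# Example usage with random placeholders for a 16-bit code on 100 pairs
rng = np.random.default_rng(2)
V = rng.standard_normal((16, 100))           # real-valued embedding
B = np.sign(rng.standard_normal((16, 100)))  # binary codes in {-1, +1}
R = update_rotation(B, V)
assert np.allclose(R @ R.T, np.eye(16), atol=1e-10)
```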
Step 75, fixing $W_1$, $W_2$, $P$ and $R$ and solving for the hash code matrix $B$: after removing the terms irrelevant to $B$, the subproblem admits a closed-form discrete solution of the form $B = \operatorname{sign}(\cdot)$, so the hash codes are obtained directly rather than bit by bit;
step 76, repeating the steps 71-75 until the algorithm converges or the maximum iteration number is reached;
step 8, a user inputs a query sample, which can be an image or a text; the features of the query sample are extracted, normalized and mean-removed, and mapped to the nonlinear space by using the Gaussian radial basis function to obtain the representation of the query sample;
Step 9, generating the hash code of the query sample by applying the learned linear mapping and the rotation matrix to the query representation and taking the sign;
step 10, calculating the Hamming distances between the hash code of the query sample and the hash codes of the heterogeneous samples in the sample set, arranging them from small to large, and returning the top $k$ samples to obtain the retrieval result.
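Steps 8 to 10 can be sketched as follows; the exact composition of the hash function (here the row-vector form of $\operatorname{sign}(R^{\mathrm{T}} W^{\mathrm{T}} \phi(x))$) and all variable names are assumptions for illustration:

```python
import numpy as np

def hash_codes(Phi, W, R):
    """Binary codes from kernelized features Phi (n, m), projection W (m, r)
    and orthogonal rotation R (r, r); row-vector form of sign(R^T W^T phi),
    which is an assumed composition."""
    return np.sign(Phi @ W @ R)                     # (n, r) codes in {-1, +1}

def retrieve(query_code, db_codes, k=100):
    """Rank database codes by Hamming distance to the query code and return
    the indices of the k nearest samples."""
    r = db_codes.shape[1]
    hamming = (r - db_codes @ query_code) / 2       # valid for +/-1 codes
    return np.argsort(hamming, kind="stable")[:k]

# Example: hash a text query against an image database (placeholders throughout)
rng = np.random.default_rng(3)
Phi_db = rng.standard_normal((1000, 500))           # kernelized image features
Phi_q = rng.standard_normal((1, 500))               # kernelized text query feature
W_img, W_txt = rng.standard_normal((500, 32)), rng.standard_normal((500, 32))
R = np.linalg.qr(rng.standard_normal((32, 32)))[0]  # a random orthogonal rotation
db_codes = hash_codes(Phi_db, W_img, R)
query_code = hash_codes(Phi_q, W_txt, R)[0]
top_k = retrieve(query_code, db_codes, k=100)
```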
Compared with the prior art, the invention has the beneficial effects that:
1. The computational complexity and memory overhead of the graph (spectral) embedding term are reduced by the introduced precomputed constants.
2. The hash codes are learned by preserving intra-modal and inter-modal semantic similarity together with the label-based similarity, which improves the performance of the hash codes.
3. An orthogonal rotation matrix is learned in a supervised manner to reduce the quantization error, which further enhances the discriminative ability of the hash codes and improves the performance of the algorithm.
Drawings
Fig. 1 is a flowchart of the steps of the efficient supervised graph embedding cross-media hash retrieval method for image-text samples according to the present invention.
Detailed Description
In order to describe the technical scheme of the invention more fully and clearly, the invention is further described in detail below with reference to specific embodiments; it should be understood that the embodiments described herein are only used for explaining and illustrating the invention and are not intended to limit the protection scope of the invention.
The invention relates to an image-text sample-oriented efficient supervised graph embedding cross-media hash retrieval method, which comprises: collecting images and text samples from the Internet, forming sample pairs from the images and text samples of the same webpage, establishing a set of image-text sample pairs, labeling the categories of the sample pairs, and dividing the image-text sample set into a training set and a test set; extracting the features of all image and text samples in the training set and the test set, and mapping the features of the image and text samples to a nonlinear space by using a radial basis Gaussian kernel function; constructing a graph adjacency matrix of the sample pairs by using the class labels of the sample pairs, and further obtaining the Laplacian matrix of the graph; mapping the class labels to a latent semantic space by a linear mapping, and learning a linear projection matrix for each of the image and text modalities in this space while preserving the inter-modal and intra-modal semantic similarity of the image and text samples; minimizing the quantization error by learning an orthogonal rotation matrix; and providing an efficient discrete iterative optimization algorithm which, by predefining several constants, avoids computing with the Laplacian matrix directly, improves the efficiency of the algorithm, and directly obtains a discrete solution of the hash codes. The retrieval performance of the algorithm is improved by learning the hash codes with the inter-modal and intra-modal semantic similarity of the image and text samples, the label-based similarity and the minimized quantization error.
Referring to Fig. 1, an efficient supervised graph embedding cross-media hash retrieval method for image-text samples is characterized in that the following steps are implemented by means of a computer device:
the first step is as follows: collecting images and text samples from a network, taking the images and the text samples belonging to the same webpage as image-text sample pairs to form an image-text sample set, labeling the types of the image-text sample pairs, randomly selecting 75% of the image-text sample pairs to form a training set, and forming a test set by the rest of the image-text sample pairs;
the second step is that: extracting 150-dimensional texture features of all image samples and 500-dimensional BOW (bag-of-words) features of all text samples, and normalizing and removing the mean value of the features;
the third step: the features of the image-text sample pairs in the training set are denoted by $X = \{X^{(1)}, X^{(2)}\}$, where $X^{(1)}$ and $X^{(2)}$ respectively represent the features of all the image and text samples in the training set, $n$ represents the number of sample pairs, and $Y$ represents the class labels of the sample pairs, where $c$ represents the number of sample categories; 500 sample pairs are randomly selected as anchor points (with $500 \ll n$), and the features of the samples are mapped to the nonlinear space by using the Gaussian radial basis function;
the fourth step: the graph adjacency matrix $A$ of the sample pairs is constructed by using the class labels of the image-text sample pairs, defined as $A_{ij} = \dfrac{y_i^{\mathrm{T}} y_j}{\|y_i\|_2\,\|y_j\|_2}$, where $A_{ij}$ denotes the value in the $i$-th row and $j$-th column of the matrix $A$ and $\|\cdot\|_2$ denotes the $\ell_2$ norm;
the fifth step: the Laplacian matrix $L = D - A$ of the graph adjacency matrix $A$ is further obtained, where $D$ is the diagonal matrix whose diagonal elements are $D_{ii} = \sum_{j=1}^{n} A_{ij}$;
the sixth step: based on the above variables, the objective function of the method is constructed by preserving the inter-modal and intra-modal semantic similarity of the sample features and minimizing the quantization error; the weight parameters, the linear projection matrices $W_1$ and $W_2$ learned for the image and text sample modalities, the hash code length $r$, the trace $\mathrm{tr}(\cdot)$ of a matrix, the linear mapping matrix $P$, the learned hash code $B$ of the image-text sample pairs, the orthogonal rotation matrix $R$, the identity matrix $I_r$ of size $r \times r$ and the regularization term are defined as in step 6 above;
the seventh step: the objective function is solved by using an iterative optimization algorithm; the iteration counter, the maximum number of iterations and the value of the objective function (initialized to a sufficiently large number) are initialized, and the convergence threshold is set to 0.001; the solution specifically comprises the following steps:
Since the Laplacian matrix $L$ is of size $n \times n$, computing the graph-embedding term directly incurs computational complexity and memory overhead quadratic in $n$, which would limit the application of the invention to large-scale sample sets, so the term is rewritten by using $L = D - A$; the resulting products with $D$ and with $A$ are still quadratic in $n$ if computed naively, so the invention predefines constants from the normalized label matrix, from which the product involving $A$ and the product involving $D$ are obtained without ever forming an $n \times n$ matrix, thereby reducing the cost of evaluating the graph-embedding term;
The rotation matrix subproblem can be solved by the singular value decomposition (SVD) algorithm: decomposing the relevant matrix as $U \Sigma V^{\mathrm{T}}$, where $U$ is the left singular matrix, $V$ is the right singular matrix and $\Sigma$ is the singular value matrix, the orthogonal rotation matrix is then $R = U V^{\mathrm{T}}$;
The closed-form discrete solution of the hash code matrix $B$ is then obtained directly;
(6) calculating the value of the objective function and judging whether its change is smaller than the threshold or the maximum number of iterations has been reached; if yes, stopping the iteration; if not, incrementing the iteration counter and repeatedly executing steps (1) - (5);
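The overall alternating loop of the seventh step can be sketched as below; the per-variable update functions are placeholders for the subproblem solutions of steps (1)-(5), and the stopping rule mirrors the 0.001 threshold of this embodiment:

```python
import numpy as np

def alternating_optimization(update_steps, objective, max_iter=50, tol=1e-3):
    """Generic alternating-minimization skeleton: each callable in update_steps
    updates one variable (e.g. W1, W2, P, R or B) with the others fixed."""
    prev_obj = np.inf                      # "a sufficiently large number"
    for t in range(max_iter):
        for step in update_steps:
            step()
        obj = objective()
        if abs(prev_obj - obj) < tol:      # convergence threshold (0.001 here)
            break
        prev_obj = obj
    return t + 1

# Toy usage: minimize (a - 3)^2 + (b + 1)^2 by coordinate updates
state = {"a": 0.0, "b": 0.0}
steps = [lambda: state.update(a=3.0), lambda: state.update(b=-1.0)]
objective = lambda: (state["a"] - 3.0) ** 2 + (state["b"] + 1.0) ** 2
iterations_used = alternating_optimization(steps, objective)
```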
eighth step: a user inputs a query sample, either an image or a text; if an image is input, its 150-dimensional texture features are extracted, and if a text is input, its 500-dimensional BOW features are extracted; the features are normalized, mean-removed and mapped to the nonlinear space by using the Gaussian radial basis function to obtain the representation of the query sample;
the ninth step: the hash code of the query sample is generated by applying the learned linear mapping and the rotation matrix to the query representation and taking the sign;
the tenth step: the Hamming distances between the hash code of the query sample and the hash codes of the heterogeneous samples in the sample set are calculated and arranged from small to large, and the top $k$ samples are returned to obtain the retrieval result.
The present embodiment verifies the effectiveness of the method of the invention on the public sample set MIRFlickr25K, which contains 20015 image-text pairs collected from the social networking site Flickr, the sample pairs covering 24 semantic categories; in the embodiment, 75% of the image-text sample pairs are randomly selected as the training set and the remaining 25% as the test set; each image is represented by a 150-dimensional Gist (texture) feature and each text by a 500-dimensional BOW (bag-of-words) feature, and the features are normalized and mean-removed; in order to evaluate the retrieval performance of the method, the mean average precision over the first 100 returned samples (MAP@100) is used as the evaluation criterion; the MAP@100 results of the method on the MIRFlickr25K sample set for different hash code lengths on the two tasks of retrieving texts with image queries and retrieving images with text queries are shown in Table 1, and the results show that the retrieval performance of the method is significantly higher than that of the prior art.
TABLE 1
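For reference, MAP@100 as used in this embodiment can be computed as sketched below; this follows the standard definition of truncated mean average precision and is an illustrative implementation, not code from the patent:

```python
import numpy as np

def map_at_k(retrieved, relevance, k=100):
    """Mean average precision truncated at the top-k results.

    retrieved: (q, n) ranked database indices per query.
    relevance: (q, n) 0/1 matrix; relevance[i, j] = 1 if database item j shares
               at least one semantic category with query i.
    Normalizing by the number of relevant items within the top k is one common
    convention; others normalize by min(k, total relevant).
    """
    ap_values = []
    for ranks, rel in zip(retrieved, relevance):
        hits = rel[ranks[:k]]                         # relevance of top-k results
        if hits.sum() == 0:
            ap_values.append(0.0)
            continue
        precision_at = np.cumsum(hits) / (np.arange(len(hits)) + 1)
        ap_values.append(float((precision_at * hits).sum() / hits.sum()))
    return float(np.mean(ap_values))

# Toy usage with 2 queries and 5 database items
retrieved = np.array([[0, 2, 1, 4, 3], [3, 1, 0, 2, 4]])
relevance = np.array([[1, 0, 1, 0, 0], [0, 1, 0, 1, 1]])
print(map_at_k(retrieved, relevance, k=5))
```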
Claims (4)
1. An image-text sample-oriented efficient supervised graph embedding cross-media Hash retrieval method is characterized by comprising the following steps:
step 1, collecting images and text samples from a network, taking the images and the text samples belonging to the same webpage as image-text sample pairs to form an image-text sample set, labeling the types of the image-text sample pairs, and dividing the image-text sample pairs into a training set and a test set;
step 2, extracting the characteristics of all images and text samples in the training set and the test set, and normalizing and removing the mean value of the characteristics;
step 3, denoting the features of the image-text sample pairs in the training set by $X = \{X^{(1)}, X^{(2)}\}$, where $X^{(1)} \in \mathbb{R}^{d_1 \times n}$ and $X^{(2)} \in \mathbb{R}^{d_2 \times n}$ respectively represent the features of all image samples and text samples in the training set, $\mathbb{R}$ represents the real numbers, $d_1$ and $d_2$ represent the feature dimensions, and $n$ represents the number of image-text sample pairs in the training set; $Y \in \{0,1\}^{c \times n}$ represents the class labels of the sample pairs, where $c$ represents the total number of categories and $n$ the number of image-text sample pairs; randomly selecting $m$ sample pairs as anchor points, where $m \ll n$, and mapping the features of all image samples and text samples to a nonlinear space by using the Gaussian radial basis function:

$$\phi(x_i^{(t)}) = \Big[\exp\big(-\tfrac{\|x_i^{(t)} - a_1^{(t)}\|_2^2}{2\sigma^2}\big), \ldots, \exp\big(-\tfrac{\|x_i^{(t)} - a_m^{(t)}\|_2^2}{2\sigma^2}\big)\Big]^{\mathrm{T}}, \quad t \in \{1, 2\},$$

where $\sigma$ is a scale parameter, $\|\cdot\|_2$ represents the $\ell_2$ norm, and $(\cdot)^{\mathrm{T}}$ represents the transpose of a matrix or vector;
step 4, constructing the graph adjacency matrix $A \in \mathbb{R}^{n \times n}$ of the sample pairs by using the class labels of the image-text sample pairs, where $\mathbb{R}$ represents a real number; it is defined as $A_{ij} = \dfrac{y_i^{\mathrm{T}} y_j}{\|y_i\|_2\,\|y_j\|_2}$, where $A_{ij}$ represents the value in the $i$-th row and $j$-th column of the matrix $A$ and $\|\cdot\|_2$ represents the $\ell_2$ norm;
step 5, obtaining the Laplacian matrix $L = D - A$ of the graph adjacency matrix $A$, where $D$ is the diagonal matrix of $A$ whose diagonal elements are $D_{ii} = \sum_{j=1}^{n} A_{ij}$;
Step 6, constructing the objective function of the method by combining steps 1 to 5, preserving the inter-modal and intra-modal semantic similarity of the sample features and minimizing the quantization error;
step 7, solving the objective function by using an iterative optimization algorithm;
step 8, inputting a query sample by a user, extracting the characteristics of the query sample, normalizing and removing the mean value of the characteristics, and mapping the characteristics of the sample to a nonlinear space by using a Gaussian radial basis function to obtain the representation of the query sample;
Step 9, generating a hash code of the query sample by utilizing the learned linear mapping function and the rotation matrix;
2. The image-text sample-oriented efficient supervised graph embedding cross-media hash retrieval method as claimed in claim 1, wherein the objective function in step 6 is defined over the following variables: $\alpha$, $\beta$, $\gamma$, $\lambda$, $\mu$ and $\eta$ are weight parameters, $W_1$ and $W_2$ are respectively the linear projection matrices learned for the image sample and text sample modalities, $r$ denotes the length of the hash code, $\mathrm{tr}(\cdot)$ denotes the trace of a matrix, $P$ is the linear mapping matrix, $B$ is the learned hash code of the image-text sample pairs, $R$ is an orthogonal rotation matrix, $I_r$ denotes the identity matrix of size $r \times r$, and $\|\cdot\|_F$ denotes the Frobenius norm used in the regularization term.
3. The image-text sample-oriented efficient supervised graph embedding cross-media hash retrieval method as claimed in claim 1 or 2, wherein step 7 of solving the objective function specifically comprises the following steps:
step 71, fixing $W_2$, $P$, $B$ and $R$ and solving for $W_1$: after removing the terms irrelevant to $W_1$, the objective function reduces to a subproblem dominated by the graph-embedding (Laplacian) term;

the Laplacian matrix $L$ is an $n \times n$ matrix, so computing the graph-embedding term directly has computational complexity and memory overhead quadratic in $n$;

the products with $D$ and with $A$ obtained by expanding $L = D - A$ are also quadratic in $n$ if computed naively; a constant is therefore predefined from the normalized label matrix so that the product involving $A$ is converted into a product of precomputed factors, and a further constant is predefined so that the product involving $D$ is converted into a product with the precomputed degrees, whereby the computational complexity and memory overhead of evaluating the graph-embedding term are reduced without ever forming an $n \times n$ matrix;

step 72, using a solution similar to that for $W_1$, the computational complexity and memory overhead of solving for $W_2$ are reduced in the same way;
Step 73, fixing $W_1$, $W_2$, $B$ and $R$ and solving for the linear mapping matrix $P$: after removing the terms irrelevant to $P$, the objective function reduces to the corresponding subproblem in $P$;
step 74, fixing $W_1$, $W_2$, $P$ and $B$ and solving for the orthogonal rotation matrix $R$: after removing the terms irrelevant to $R$, the resulting subproblem can be solved by a singular value decomposition (SVD) algorithm, i.e. decomposing the relevant matrix as $U \Sigma V^{\mathrm{T}}$, where $U$ is the left singular matrix, $V$ is the right singular matrix and $\Sigma$ is the singular value matrix, and then $R = U V^{\mathrm{T}}$;
Step 75, fixing $W_1$, $W_2$, $P$ and $R$ and solving for the hash code matrix $B$: after removing the terms irrelevant to $B$, a closed-form discrete solution of the form $B = \operatorname{sign}(\cdot)$ is obtained directly;
and step 76, repeating the steps 71-75 until the algorithm converges or the maximum iteration number is reached.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010943065.5A CN112214623A (en) | 2020-09-09 | 2020-09-09 | Image-text sample-oriented efficient supervised image embedding cross-media Hash retrieval method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010943065.5A CN112214623A (en) | 2020-09-09 | 2020-09-09 | Image-text sample-oriented efficient supervised image embedding cross-media Hash retrieval method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112214623A true CN112214623A (en) | 2021-01-12 |
Family
ID=74049225
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010943065.5A Withdrawn CN112214623A (en) | 2020-09-09 | 2020-09-09 | Image-text sample-oriented efficient supervised image embedding cross-media Hash retrieval method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112214623A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113191445A (en) * | 2021-05-16 | 2021-07-30 | 中国海洋大学 | Large-scale image retrieval method based on self-supervision countermeasure Hash algorithm |
CN113407661A (en) * | 2021-08-18 | 2021-09-17 | 鲁东大学 | Discrete hash retrieval method based on robust matrix decomposition |
CN113868366A (en) * | 2021-12-06 | 2021-12-31 | 山东大学 | Streaming data-oriented online cross-modal retrieval method and system |
CN117315687A (en) * | 2023-11-10 | 2023-12-29 | 哈尔滨理工大学 | Image-text matching method for single-class low-information-content data |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107256271A (en) * | 2017-06-27 | 2017-10-17 | 鲁东大学 | Cross-module state Hash search method based on mapping dictionary learning |
CN107729513A (en) * | 2017-10-25 | 2018-02-23 | 鲁东大学 | Discrete supervision cross-module state Hash search method based on semanteme alignment |
CN108595688A (en) * | 2018-05-08 | 2018-09-28 | 鲁东大学 | Across the media Hash search methods of potential applications based on on-line study |
CN109871454A (en) * | 2019-01-31 | 2019-06-11 | 鲁东大学 | A kind of discrete across media Hash search methods of supervision of robust |
CN110110100A (en) * | 2019-05-07 | 2019-08-09 | 鲁东大学 | Across the media Hash search methods of discrete supervision decomposed based on Harmonious Matrix |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107256271A (en) * | 2017-06-27 | 2017-10-17 | 鲁东大学 | Cross-module state Hash search method based on mapping dictionary learning |
CN107729513A (en) * | 2017-10-25 | 2018-02-23 | 鲁东大学 | Discrete supervision cross-module state Hash search method based on semanteme alignment |
CN108595688A (en) * | 2018-05-08 | 2018-09-28 | 鲁东大学 | Across the media Hash search methods of potential applications based on on-line study |
CN109871454A (en) * | 2019-01-31 | 2019-06-11 | 鲁东大学 | A kind of discrete across media Hash search methods of supervision of robust |
CN110110100A (en) * | 2019-05-07 | 2019-08-09 | 鲁东大学 | Across the media Hash search methods of discrete supervision decomposed based on Harmonious Matrix |
Non-Patent Citations (2)
Title |
---|
TAO YAO,LIANSHAN YAN, YILAN MA, HONG YU, QINGTANG SU: "《Fast discrete cross-modal hashing with semantic consistency》", 《NEURAL NETWORKS》 * |
姚涛 (YAO TAO): "Research on Cross-Media Retrieval Based on Hashing Methods" (《基于哈希方法的跨媒体检索研究》), China Excellent Doctoral and Master's Dissertations Full-text Database (Doctoral), Information Science and Technology Section * 
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113191445A (en) * | 2021-05-16 | 2021-07-30 | 中国海洋大学 | Large-scale image retrieval method based on self-supervision countermeasure Hash algorithm |
CN113191445B (en) * | 2021-05-16 | 2022-07-19 | 中国海洋大学 | Large-scale image retrieval method based on self-supervision countermeasure Hash algorithm |
CN113407661A (en) * | 2021-08-18 | 2021-09-17 | 鲁东大学 | Discrete hash retrieval method based on robust matrix decomposition |
CN113868366A (en) * | 2021-12-06 | 2021-12-31 | 山东大学 | Streaming data-oriented online cross-modal retrieval method and system |
CN117315687A (en) * | 2023-11-10 | 2023-12-29 | 哈尔滨理工大学 | Image-text matching method for single-class low-information-content data |
CN117315687B (en) * | 2023-11-10 | 2024-10-08 | 泓柯垚利(北京)劳务派遣有限公司 | Image-text matching method for single-class low-information-content data |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108334574B (en) | Cross-modal retrieval method based on collaborative matrix decomposition | |
Kulis et al. | Fast similarity search for learned metrics | |
CN112214623A (en) | Image-text sample-oriented efficient supervised image embedding cross-media Hash retrieval method | |
Kulis et al. | Kernelized locality-sensitive hashing | |
CN106033426B (en) | Image retrieval method based on latent semantic minimum hash | |
Ge et al. | Graph cuts for supervised binary coding | |
CN104820696B (en) | A kind of large-scale image search method based on multi-tag least square hash algorithm | |
CN109697451B (en) | Similar image clustering method and device, storage medium and electronic equipment | |
CN111159485B (en) | Tail entity linking method, device, server and storage medium | |
CN110929080B (en) | Optical remote sensing image retrieval method based on attention and generation countermeasure network | |
Huang et al. | Object-location-aware hashing for multi-label image retrieval via automatic mask learning | |
CN109871454B (en) | Robust discrete supervision cross-media hash retrieval method | |
Ali et al. | Modeling global geometric spatial information for rotation invariant classification of satellite images | |
CN110943981A (en) | Cross-architecture vulnerability mining method based on hierarchical learning | |
Choi et al. | Face video retrieval based on the deep CNN with RBF loss | |
Liu et al. | An indoor scene classification method for service robot Based on CNN feature | |
CN116304307A (en) | Graph-text cross-modal retrieval network training method, application method and electronic equipment | |
JP2014197412A (en) | System and method for similarity search of images | |
CN115795065A (en) | Multimedia data cross-modal retrieval method and system based on weighted hash code | |
CN113656700A (en) | Hash retrieval method based on multi-similarity consistent matrix decomposition | |
Al-Jubouri | Content-based image retrieval: Survey | |
CN108647295B (en) | Image labeling method based on depth collaborative hash | |
Pengcheng et al. | Fast Chinese calligraphic character recognition with large-scale data | |
CN107133348B (en) | Approximate searching method based on semantic consistency in large-scale picture set | |
Sun et al. | Search by detection: Object-level feature for image retrieval |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WW01 | Invention patent application withdrawn after publication | Application publication date: 20210112 |