CN109472282B - Depth image hashing method based on few training samples - Google Patents

Depth image hashing method based on few training samples

Info

Publication number
CN109472282B
CN109472282B (application number CN201811053140.XA)
Authority
CN
China
Prior art keywords
hash
samples
training
network
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201811053140.XA
Other languages
Chinese (zh)
Other versions
CN109472282A (en)
Inventor
耿立冰
潘炎
印鉴
赖韩江
潘文杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Guangzhou Zhongda Nansha Technology Innovation Industrial Park Co Ltd
Original Assignee
Sun Yat Sen University
Guangzhou Zhongda Nansha Technology Innovation Industrial Park Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University, Guangzhou Zhongda Nansha Technology Innovation Industrial Park Co Ltd filed Critical Sun Yat Sen University
Priority to CN201811053140.XA priority Critical patent/CN109472282B/en
Publication of CN109472282A publication Critical patent/CN109472282A/en
Application granted granted Critical
Publication of CN109472282B publication Critical patent/CN109472282B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/044: Recurrent networks, e.g. Hopfield networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06N3/084: Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a deep image hashing method based on few training samples. Existing traditional hashing methods and deep-learning-based hashing methods both presuppose a large number of training samples, yet in a real production environment the cost of obtaining large numbers of labeled training samples is very high. A method that can obtain a reasonably effective image hash model from few training samples therefore has great practical value.

Description

Depth image hashing method based on few training samples
Technical Field
The invention relates to the field of image retrieval and computer vision, in particular to a depth image hashing method based on few training samples.
Background
In recent years, with the rapid development of big data and information technology, the amount of image data generated every day is beyond estimation, and how to find the images a user wants among this vast volume of data has become important. Information retrieval technology has likewise developed and been applied greatly, and image hashing is one of the more important techniques in the information retrieval field.
From the implementation point of view, image hashing techniques can be divided into traditional image hashing and deep-learning-based image hashing (deep hashing). With the rapid development of deep learning in recent years, deep hashing has become the leading image hashing approach. A deep hash model has strong representational power, but it also needs a large number of training samples to learn the whole deep neural network. In a real-world environment, however, a large number of training samples is often hard to obtain, which raises the question: how can a relatively good hash model be designed when the training samples for some classes are few? This is the problem the present invention addresses; accordingly, a deep hashing method that learns few new samples from existing prior knowledge (few-shot hashing) is provided.
Disclosure of Invention
The invention provides a depth image hashing method based on few training samples that can obtain a relatively effective image hash model.
In order to achieve the technical effects, the technical scheme of the invention is as follows:
a depth image hashing method based on few training samples comprises the following steps:
s1: task definition and data division;
s2: constructing a triplet-based universal deep hash model;
s3: constructing a support memory based on a universal deep hash model;
s4: learning a feature representation of few samples through a bidirectional long-short term memory subnetwork and supporting memory;
s5: training the depth image Hash model under few samples, and performing retrieval test on the test set of few samples.
Further, the specific process of step S1 is:
s11: taking the cifar100 dataset as an example, a specific definition of few-shot hashing is given. Dividing the cifar100 into 2 parts, wherein the first part comprises 80 classes, each class comprises 500 training pictures which are sufficient and are marked as S (support set); the other part has 20 classes, each with only a small number of 3 (or 5, 10) training samples, and this part is denoted as l (learning set). The goal of this is to train a deep hash model so that pictures belonging to these 20 classes can be retrieved relatively efficiently across a class 100 image database.
Further, the specific process of step S2 is:
s21: for the task of depth image hashing, a feature learning sub-network, namely a deep convolutional network (CNN), needs to be constructed first. The convolution network is formed by stacking a convolution layer, an activation layer and a pooling layer and has strong characteristic expression capability;
s22: after passing through a convolution sub-network, each picture is converted into a semantic feature vector, and then a full connection layer with the output neuron number q and a corresponding sigmod activation function layer are added behind the feature vector. Thus, each image is converted into a q-dimensional real number vector ranging from 0 to 1, namely a Hash vector;
s23: after the hash vectors are obtained, constraint is carried out through a triple loss function (triple ranking loss), and the purpose of the triple loss function is that the distance between the approximate hash vectors of similar pictures is far smaller than the distance between the hash vectors of dissimilar pictures through learning;
s24: and training the triplet-based universal deep hash network to obtain a universal deep hash model.
Further, the specific process of step S3 is as follows:
s31: from the previous task definition, there are 2 parts of the data set, one part is S (support set), and the other part is l (learning set) which is concerned and has few training samples, and each class of S has enough training samples and can correspond to what has been seen or learned; there are few training samples in L, corresponding to newly seen things;
s32: and (4) carrying out feature extraction on the sample of S by using the trained triplet-based universal deep hash model. The method specifically comprises the following steps: sequentially inputting samples I [ I ] [ j ] (I is more than or equal to 1 and less than or equal to S, j is more than or equal to 1 and less than or equal to n, S is the number of the types of S, and n is the number of the samples of each type) into a deep hash model to obtain the semantic dimensional characteristics of each picture;
s33: arranging all the characteristics into M [ i ] [ j ], specifically, each row i is the same, representing that the characteristic vector of the row belongs to the same class, different columns represent the jth sample characteristic vector of the class, and M is a support memory (support memory);
further, the specific process of step S4 is as follows:
s41: in each iteration, the support pops up a feature vector for each class of features according to a specified sequence, and the feature vector is recorded as ft, wherein t is more than or equal to 1 and is less than or equal to s.
S42: the forward and backward unrolling of the bidirectional long short term memory subnetwork (BLSTM) is s time steps.
S43: let flTime-invariant (static) input x as a bidirectional long-short term memory subnetworkstaticLet ftInput x as time-varying (time-varying) for bidirectional long-short term memory subnetworkst
S44: through the interaction of the bidirectional long-short term memory sub-network and the support memory, the final feature representation of few new samples is obtained
S45: the new feature representation is constrained with a triplet loss function.
Further, the specific process of step S5 is as follows:
s51: and training the whole network by using a random gradient descent method.
S52: and searching the test set of the L in the whole image database, and calculating a test result.
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
the invention is designed on the premise that the existing traditional Hash method and the Hash method based on deep learning are both designed on the premise of a large number of training samples, and the cost for obtaining a large number of marked training samples is very high in a real production environment, so that under the condition of few training samples, if an image Hash model with relatively good effect can be obtained, the invention has very great practical value.
Drawings
FIG. 1 is a schematic diagram of the triplet-based universal deep hash network;
FIG. 2 is a diagram of the overall network architecture of the present invention;
FIG. 3 is a diagram of a network architecture of a bidirectional long-short term memory subnetwork;
FIG. 4 is the result of an NDCG experiment on the SUN dataset;
FIG. 5 is the result of NDCG experiment on CIFAR-10 dataset;
FIG. 6 shows the results of NDCG experiments on the CIFAR-100 dataset.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent;
for the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product;
it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
Example 1
1. Task definition and data partitioning
When a new training sample arrives, a deep neural network updates the whole network end to end; if the new training samples are few, overfitting inevitably occurs and the result is poor. Humans, however, tend to associate a new thing with things seen before: a child seeing a tiger for the first time may search his memory and find that it resembles the cat he is already familiar with, so that after seeing a tiger only once he can remember what a tiger looks like.
Inspired by this and applying it to the image hashing problem: if the training pictures of some things are very few, for example only 3 or 5 pictures per class, it is possible to learn "prior knowledge" (prior knowledge) or a "support memory" (support memory) from a large number of samples of other things, and then learn the new samples from that prior knowledge. This is called few-shot hashing.
A specific definition of few-shot hashing is given below using the cifar100 dataset as an example. Divide cifar100 into 2 parts: the first part contains 80 classes, each with a sufficient 500 training pictures, and is denoted S (support set); the other part contains 20 classes, each with only 3 (or 5, or 10) training samples, and is denoted L (learning set). The goal is to train a deep hash model so that pictures belonging to these 20 classes can be retrieved relatively effectively across the whole 100-class image database.
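For concreteness, this partition can be sketched as follows (a minimal Python illustration with hypothetical names; the patent does not prescribe an implementation):

```python
import random
from collections import defaultdict

def split_few_shot(samples, support_classes, k_shot=3, seed=0):
    """Split (image, label) pairs into the support set S (all samples of the
    support classes) and the learning set L (k_shot samples per other class)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for image, label in samples:
        by_class[label].append(image)

    S, L = {}, {}
    for label, images in by_class.items():
        if label in support_classes:
            S[label] = images               # e.g. 500 pictures per class
        else:
            rng.shuffle(images)
            L[label] = images[:k_shot]      # only 3 (or 5, 10) pictures
    return S, L

# e.g. for cifar100: S, L = split_few_shot(pairs, set(range(80)), k_shot=3)
```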
2. Construction of triple-based universal deep hash model
The deep hash model is widely applied in the field of image retrieval, for example in "search by image" and "find similar items" features such as Taobao's. Moreover, the deep hash model is the basic building block of the few-shot hashing model, so it is explained first. As shown in FIG. 1, it is divided into three main parts: a feature learning subnetwork, a hash code generation subnetwork, and a loss function.
a) Feature learning subnetwork
In the image field, a deep convolutional network (CNN) is formed by stacking convolutional layers, activation layers and pooling layers and has strong feature expression capability; AlexNet, GoogLeNet, VGG, ResNet and the like are common examples. A convolutional network converts an image into a feature vector: for example, the 1024-dimensional vector of the last pooling layer of GoogLeNet or the 4096-dimensional vector of the last fully-connected layer of VGG can serve as the feature representation of the image, and such deep features are far better than traditional hand-crafted features such as GIST and SIFT. GoogLeNet is used in the examples below.
b) Hash code generation subnetwork
After the CNN, each picture has been transformed into a 1024-dimensional feature vector, and the final goal is a 0/1 hash code of a specific length, such as a 12-bit hash code. The most intuitive and common practice is therefore to add a fully-connected layer with 12 output neurons after the 1024-dimensional feature vector, followed by a sigmoid activation function. Thus each image is converted into a 12-dimensional real vector with entries between 0 and 1, called the approximate hash code vector.
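A minimal PyTorch-style sketch of such a hash-generation head (the 12-bit setting is just the example above; the class name is my own):

```python
import torch
import torch.nn as nn

class HashHead(nn.Module):
    """Maps a CNN feature vector to an approximate hash code in (0, 1)^q."""
    def __init__(self, feature_dim=1024, hash_bits=12):
        super().__init__()
        self.fc = nn.Linear(feature_dim, hash_bits)  # 12 output neurons

    def forward(self, features):
        # sigmoid squashes each neuron into (0, 1): the approximate hash vector
        return torch.sigmoid(self.fc(features))

# usage: codes = HashHead()(torch.randn(8, 1024))  # shape (8, 12), values in (0, 1)
```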
c) Triplet loss function (triplet ranking loss)
The loss functions of deep hash models are various and can be roughly divided into two main classes: pair-based and triplet-based. The triplet ranking loss used herein is described in detail below.
The input to the triplet ranking loss is image triplets. Given an image dataset in which I denotes a sample and sim denotes the similarity between 2 images, if sim(I, I+) > sim(I, I−), then (I, I+, I−) is called a triplet. For example, in a single-label image dataset: if image a and image b belong to the same class and image c does not, then (a, b, c) is a triplet.
The purpose of the triplet ranking loss is that, through learning, the distance between the hash vectors of I and I+ becomes far smaller than the distance between the hash vectors of I and I−. Its mathematical definition is as follows:
l_tri(v(I), v(I+), v(I−)) = max(0, m + ||v(I) − v(I+)|| − ||v(I) − v(I−)||)    (1)
s.t. v(I), v(I+), v(I−) ∈ [0, 1]^q
where v(I) denotes the approximate hash code vector and m denotes the distance parameter (margin). As equation (1) shows, when the distance between I and I− is smaller than the distance between I and I+ plus the margin, the loss is positive, and minimizing it pushes I and I− further apart while pulling I and I+ closer; when the distance between I and I− is greater than the distance between I and I+ plus the margin, the loss is zero, indicating that this triplet has been learned.
After the deep hash model is trained, a user submits a picture; the picture is turned into an approximate hash vector by the deep hash model, and after quantization (each bit of the approximate hash vector becomes 1 if it is greater than or equal to 0.5, otherwise 0) the vector becomes a binary hash code. The Hamming distance between this hash code and the hash codes of all images in the database is computed, all Hamming distances are sorted from small to large, and the top-k results can then be quickly returned and presented to the user.
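As a sketch of this quantization-and-retrieval step (a minimal NumPy illustration with hypothetical names, not the patent's own code):

```python
import numpy as np

def quantize(approx_codes):
    """Binarize approximate hash vectors: a bit becomes 1 if >= 0.5, else 0."""
    return (np.asarray(approx_codes) >= 0.5).astype(np.uint8)

def hamming_top_k(query_code, db_codes, k=10):
    """Indices of the k database codes closest to the query in Hamming distance."""
    distances = (db_codes != query_code).sum(axis=1)  # per-row Hamming distance
    return np.argsort(distances, kind="stable")[:k]   # sorted small to large

# usage: binary = quantize(model_output); top = hamming_top_k(binary, db_binary)
```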
3. Construction of support memory based on universal deep hash model
From the previous task definition, the data set has 2 parts: one is S (support set) and the other is L (learning set), which is the part of interest and also the part with few training samples. Each class in S has sufficient training samples and corresponds to things a child has already seen or learned; the training samples in L are rare, corresponding to newly seen things. First, the prior knowledge, i.e. the support memory, is constructed from S.
A triplet-based deep hash network is trained with all the data of S, as shown in FIG. 1. Because the training samples of S are sufficient, the parameters of the deep hash network can be learned well, yielding an effective hash model denoted the Support Hashing Model (SHM).
Then the SHM is used to construct the "support memory". Specifically: the samples I[i][j] (1 ≤ i ≤ s, 1 ≤ j ≤ n, where s is the number of classes of S, e.g. 80, and n is the number of samples per class, e.g. 500) are sequentially input into the SHM to obtain the 1024-dimensional feature (the last pooling layer) of each picture, as shown in FIG. 2. All the features are arranged as M[i][j]: within each row i the feature vectors belong to the same class, different columns hold the j-th sample feature vector of that class, and M is the support memory.
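The arrangement of M can be illustrated with a short sketch (assuming a hypothetical shm_extract function that returns the 1024-dimensional SHM feature of an image):

```python
import numpy as np

def build_support_memory(shm_extract, support_set, n_per_class):
    """Arrange SHM features of S into M[i][j]: row i is a class, column j is
    the j-th sample of that class. support_set maps class label -> images."""
    classes = sorted(support_set)  # the s support classes
    M = np.zeros((len(classes), n_per_class, 1024), dtype=np.float32)
    for i, label in enumerate(classes):
        for j, image in enumerate(support_set[label][:n_per_class]):
            M[i, j] = shm_extract(image)  # feature of sample I[i][j]
    return M  # the support memory, shape (s, n, 1024)
```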
4. Learning feature representations of few samples through bidirectional long-short term memory subnetworks and supporting memory
This section is the core of few-shot hashing and will describe how to learn few new samples from the support memory.
First, the overall network structure of few-shot hashing is given. The main difference is that a support memory and a bidirectional long-short term memory sub-network are added on top of the triplet-based deep hash network, as shown in FIG. 2.
as can be seen from FIG. 2, during training, each new sample I is subjected to feature extraction through a convolution sub-network, and the feature is flIt is worth noting here that the parameters of the convolution sub-network are shared with the SHM and that part of the parameters are not updated, i.e., the trained SHM is used to act as a feature extractor for new samples in addition to being used to construct the support memory.
After feature extraction, a bidirectional long-short term memory network (BLSTM) is designed to carry out the interaction and learning between a new sample and the support memory, as shown in FIG. 3.
Specifically, in each iteration of the training phase, M pops out one feature vector for each class of features in a specified order, denoted f_t (1 ≤ t ≤ s); meanwhile, the forward and backward unrolling of the BLSTM is s time steps. Then, as shown in FIG. 3, f_l serves as the non-time-varying (static) input x_static of the BLSTM and f_t serves as its time-varying input x_t, in mathematical form:
x′_t = concat(x_t, x_static)    (2)
where the concat function is the concatenation of feature vectors, e.g. concatenating the two 1024-dimensional vectors f_t and f_l into the 2048-dimensional vector x′_t. The hidden size of the BLSTM is set to 1024 (consistent with the original feature dimension); over the s time steps of the BLSTM, each forward LSTM cell outputs a 1024-dimensional hf_t and each backward LSTM cell outputs a 1024-dimensional hb_t according to equation (3):
hf_t = LSTM_f(hf_{t−1}, x′_{t−1}), 1 < t ≤ s    (3)
hb_t = LSTM_b(hb_{t+1}, x′_{t+1}), 1 ≤ t < s
The new feature l_new of the new sample can then be expressed as:
l_sum = eltwise_sum(hf_s, hb_1)    (4)
l_new = eltwise_product(l_sum, 0.5)
where eltwise_sum is element-wise addition of vectors and eltwise_product is element-wise multiplication; intuitively, equation (4) adds hf_s and hb_1 and then takes their average as the new feature representation.
Thus, after interacting with the support memory through the BLSTM sub-network, each new sample obtains a new 1024-dimensional feature representation.
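The interaction just described can be sketched as follows (a minimal PyTorch illustration with names of my own; it uses the standard LSTMCell recurrence, which consumes x′_t at step t, a slight simplification of the index convention printed in equation (3)):

```python
import torch
import torch.nn as nn

def few_shot_feature(f_l, popped_features, lstm_f, lstm_b):
    """Compute l_new for one new sample.
    f_l:             (1024,) static feature of the new sample (from the SHM).
    popped_features: (s, 1024) one feature f_t popped from M per support class.
    lstm_f, lstm_b:  nn.LSTMCell(2048, 1024) forward / backward cells."""
    s, dim = popped_features.shape
    # x'_t = concat(x_t, x_static)
    x = torch.cat([popped_features, f_l.expand(s, dim)], dim=1)

    hf = cf = hb = cb = torch.zeros(1, 1024)
    for t in range(s):                    # forward unrolling over s steps
        hf, cf = lstm_f(x[t:t + 1], (hf, cf))
    for t in reversed(range(s)):          # backward unrolling
        hb, cb = lstm_b(x[t:t + 1], (hb, cb))

    # equation (4): average hf_s and hb_1 as the new representation
    return 0.5 * (hf + hb)                # l_new, shape (1, 1024)
```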
After the new feature representation is obtained, model training can be performed through the same hash generation sub-network and triplet ranking loss, as shown in FIG. 2.
5. Results of the experiment
1) Data set
SUN: 64 classes of pictures, 430 samples per class, 27,520 pictures in total. SUN is divided into 2 parts: the first part, S, contains all samples of 54 classes, 23,200 pictures in total. The remaining 10 classes form the second part, L, of newly learned samples, with only 3, 5 or 10 training samples per class (the three few-shot settings are referred to herein as 3-shot, 5-shot and 10-shot). All samples of S and L, except the test samples of L, make up the retrieval database.
CIFAR-10: 10 classes, 6,000 samples per class, 60,000 pictures in total. The first part S of CIFAR-10 contains the 48,000 samples of the first 8 classes; the remaining 2 classes form L. As before, the number of training samples per class of L has 3 settings: 3-shot, 5-shot and 10-shot. All samples of S and L, except the test samples of L, make up the retrieval database.
CIFAR-100 is similar to CIFAR-10, except that it contains 100 classes with 600 samples per class. The first 80 classes form S and the last 20 classes form L. As before, L has only 3, 5 or 10 training samples per class.
2) Evaluation index
The most common metrics in the information retrieval field, Mean Average Precision (MAP) and Normalized Discounted Cumulative Gain (NDCG), are selected as evaluation indexes. The larger the MAP and NDCG, the better the retrieval effect.
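For reference, minimal sketches of the two metrics (helper names are my own; rel holds the relevance of the retrieved images in ranked order):

```python
import numpy as np

def average_precision(rel):
    """AP for one query; rel is a 0/1 array in ranked order. MAP is its mean."""
    rel = np.asarray(rel, dtype=float)
    if rel.sum() == 0:
        return 0.0
    precision_at_k = np.cumsum(rel) / (np.arange(len(rel)) + 1)
    return float((precision_at_k * rel).sum() / rel.sum())

def ndcg(rel, k=None):
    """NDCG@k for one query; rel holds graded relevance in ranked order."""
    rel = np.asarray(rel, dtype=float)[:k]
    discounts = 1.0 / np.log2(np.arange(len(rel)) + 2)
    dcg = ((2.0 ** rel - 1) * discounts).sum()
    ideal = ((2.0 ** np.sort(rel)[::-1] - 1) * discounts).sum()
    return float(dcg / ideal) if ideal > 0 else 0.0
```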
3) Comparative test
The following are comparative tests on 3 data sets:
table 1: MAP experimental results on SUN dataset
Figure BDA0001795088980000081
Table 2: MAP experimental results on CIFAR-10 dataset
Figure BDA0001795088980000082
Table 3: MAP experimental results on CIFAR-100 dataset
Figure BDA0001795088980000083
The results show that the invention is a great improvement over prior methods: by partitioning the data, constructing a support memory (prior knowledge) from a large number of samples, and reasonably using the bidirectional long-short term memory sub-network together with the support memory, it learns the feature representation of the few new samples. The overall network structure of the invention is shown in FIG. 2.
The same or similar reference numerals correspond to the same or similar parts;
the positional relationships depicted in the drawings are for illustrative purposes only and are not to be construed as limiting the present patent;
it should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention, and are not intended to limit the embodiments of the present invention. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. And are neither required nor exhaustive of all embodiments. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the claims of the present invention.

Claims (3)

1. A depth image hashing method based on few training samples, characterized by comprising the following steps:
s1: task definition and data division;
s2: constructing a triplet-based universal deep hash model;
s3: constructing a support memory based on a universal deep hash model;
s4: learning a feature representation of few samples through a bidirectional long-short term memory subnetwork and the support memory;
s5: training a depth image hash model under few samples, and performing retrieval test on a test set of few samples;
the specific process of step S1 is:
dividing the cifar100 dataset, taken as an example, into 2 parts, wherein the first part contains 80 classes, each class containing a sufficient 500 training pictures, and is denoted S; the other part has 20 classes, each class having only 3, 5 or 10 training samples, and is denoted L; the goal is to train a deep hash model so that pictures belonging to these 20 classes can be retrieved relatively effectively across the whole 100-class image database;
the specific process of step S2 is:
s21: for the task of depth image hashing, a feature learning sub-network, namely a deep convolutional network, is first constructed, wherein the convolutional network is formed by stacking convolutional layers, activation layers and pooling layers and has strong feature expression capability;
s22: after passing through the convolution sub-network, each picture is converted into a semantic feature vector; a fully connected layer with q output neurons and a corresponding sigmoid activation function layer are then added after the feature vector, and each image is converted into a q-dimensional real vector with entries in the range 0 to 1, namely the hash vector;
s23: after the hash vectors are obtained, they are constrained by a triplet loss function (triplet ranking loss), wherein the triplet loss function aims to make, through learning, the distance between the approximate hash vectors of similar pictures far smaller than the distance between the hash vectors of dissimilar pictures;
s24: training a triplet-based universal deep hash network to obtain a universal deep hash model;
the specific process of step S3 is:
s31: from the previous task definition, there are 2 parts of the data set, one part is S, and the other part is L, and each class in S has sufficient training samples, which can correspond to what has been seen or learned; there are few training samples in L, corresponding to newly seen things;
s32: carrying out feature extraction on the samples in S with the trained triplet-based universal deep hash model, specifically: sequentially inputting the samples I[i][j] into the universal deep hash model to obtain the semantic features of each picture, wherein 1 ≤ i ≤ s, 1 ≤ j ≤ n, s is the number of classes of S, and n is the number of samples per class;
s33: all features are arranged as M [ i ] [ j ], specifically: each row i is the same, indicating that the feature vectors of the row belong to the same class, the different columns indicate the jth sample feature vector of the class, and M is the support memory.
2. The depth image hashing method based on few training samples according to claim 1, wherein the specific process of said step S4 is as follows:
s41: in each iteration, the support memory pops out one feature vector for each class of features in a specified order, denoted f_t, wherein 1 ≤ t ≤ s;
s42: the forward and reverse expansion of the bidirectional long and short term memory sub-network is s time steps;
s43: letting f_l be the time-invariant input x_static of the bidirectional long-short term memory subnetwork, and letting f_t be the time-varying input x_t of the bidirectional long-short term memory subnetwork;
s44: through the interaction of the bidirectional long-short term memory sub-network and the support memory, the final feature representation of the few new samples is obtained;
s45: the new feature representation is constrained with a triplet loss function.
3. The depth image hashing method based on few training samples according to claim 2, wherein said step S5 comprises the following steps:
s51: training the whole network by stochastic gradient descent;
s52: and searching the test set of the L in the whole image database, and calculating a test result.
CN201811053140.XA 2018-09-10 2018-09-10 Depth image hashing method based on few training samples Expired - Fee Related CN109472282B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811053140.XA CN109472282B (en) 2018-09-10 2018-09-10 Depth image hashing method based on few training samples

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811053140.XA CN109472282B (en) 2018-09-10 2018-09-10 Depth image hashing method based on few training samples

Publications (2)

Publication Number Publication Date
CN109472282A CN109472282A (en) 2019-03-15
CN109472282B true CN109472282B (en) 2022-05-06

Family

ID=65664206

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811053140.XA Expired - Fee Related CN109472282B (en) 2018-09-10 2018-09-10 Depth image hashing method based on few training samples

Country Status (1)

Country Link
CN (1) CN109472282B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110134803B (en) * 2019-05-17 2020-12-11 哈尔滨工程大学 Image data quick retrieval method based on Hash learning
CN110210988B (en) * 2019-05-31 2021-04-27 北京理工大学 Symbolic social network embedding method based on deep hash
CN110674335B (en) * 2019-09-16 2022-08-23 重庆邮电大学 Hash code and image bidirectional conversion method based on multiple generation and multiple countermeasures
CN113780245B (en) * 2021-11-02 2022-06-14 山东建筑大学 Method and system for retrieving articles in multiple scenes

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106682233A (en) * 2017-01-16 2017-05-17 华侨大学 Method for Hash image retrieval based on deep learning and local feature fusion
CN107871014A (en) * 2017-11-23 2018-04-03 清华大学 A kind of big data cross-module state search method and system based on depth integration Hash

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106682233A (en) * 2017-01-16 2017-05-17 华侨大学 Method for Hash image retrieval based on deep learning and local feature fusion
CN107871014A (en) * 2017-11-23 2018-04-03 清华大学 A kind of big data cross-module state search method and system based on depth integration Hash

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Deep Semantic Hashing with Generative Adversarial Networks; Zhaofan Qiu et al.; Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval; 2017-08-31; pp. 225-234 *
Deep Visual-Semantic Hashing for Cross-Modal Retrieval; Yue Cao et al.; Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2016-08-31; pp. 1445-1454 *
FP-CNNH: a fast image hashing algorithm based on deep convolutional neural networks; Liu Ye et al.; Computer Science; 2016-09-30; Vol. 43, No. 9; pp. 39-51 *
Object-Location-Aware Hashing for Multi-Label Image Retrieval via Automatic Mask Learning; Changqin Huang et al.; IEEE Transactions on Image Processing; 2018-05-21; Vol. 27; pp. 4490-4502 *

Also Published As

Publication number Publication date
CN109472282A (en) 2019-03-15

Similar Documents

Publication Publication Date Title
CN110188227B (en) Hash image retrieval method based on deep learning and low-rank matrix optimization
CN109472282B (en) Depth image hashing method based on few training samples
CN111858954B (en) Task-oriented text-generated image network model
CN110059160B (en) End-to-end context-based knowledge base question-answering method and device
CN109635083B (en) Document retrieval method for searching topic type query in TED (tele) lecture
CN104317834B (en) A kind of across media sort methods based on deep neural network
CN107562812A (en) A kind of cross-module state similarity-based learning method based on the modeling of modality-specific semantic space
CN108875076B (en) Rapid trademark image retrieval method based on Attention mechanism and convolutional neural network
CN112819023B (en) Sample set acquisition method, device, computer equipment and storage medium
CN110837578B (en) Video clip recommendation method based on graph convolution network
CN112100346A (en) Visual question-answering method based on fusion of fine-grained image features and external knowledge
CN111898374B (en) Text recognition method, device, storage medium and electronic equipment
CN106844518B (en) A kind of imperfect cross-module state search method based on sub-space learning
CN112949740B (en) Small sample image classification method based on multilevel measurement
CN112966091A (en) Knowledge graph recommendation system fusing entity information and heat
CN113239159B (en) Cross-modal retrieval method for video and text based on relational inference network
CN116580257A (en) Feature fusion model training and sample retrieval method and device and computer equipment
CN110674326A (en) Neural network structure retrieval method based on polynomial distribution learning
CN112148886A (en) Method and system for constructing content knowledge graph
CN110598022A (en) Image retrieval system and method based on robust deep hash network
CN113297410A (en) Image retrieval method and device, computer equipment and storage medium
CN111985520A (en) Multi-mode classification method based on graph convolution neural network
CN114219824A (en) Visible light-infrared target tracking method and system based on deep network
CN108446605A (en) Double interbehavior recognition methods under complex background
CN111078952A (en) Cross-modal variable-length Hash retrieval method based on hierarchical structure

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20220506