CN112395438A - Hash code generation method and system for multi-label image - Google Patents

Hash code generation method and system for multi-label image

Info

Publication number
CN112395438A
Authority
CN
China
Prior art keywords
label
hash
image
hash code
occurrence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011226768.2A
Other languages
Chinese (zh)
Inventor
刘渝
汪洋涛
谢延昭
周可
夏天
冯树耀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology filed Critical Huazhong University of Science and Technology
Priority to CN202011226768.2A priority Critical patent/CN112395438A/en
Publication of CN112395438A publication Critical patent/CN112395438A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51Indexing; Data structures therefor; Storage structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/901Indexing; Data structures therefor; Storage structures
    • G06F16/9014Indexing; Data structures therefor; Storage structures hash tables
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Library & Information Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a hash code generation method and system for multi-label images, belonging to the field of artificial-intelligence image retrieval. The method first uses a convolutional neural network and a graph convolutional network to generate an image representation and a label co-occurrence embedding, respectively, then fuses the two modal vectors with MFB, and finally learns a hash model through a loss function based on the Cauchy distribution. The mutual dependency among objects is explored through the co-occurrence probability of the objects in the label set, and an attention-based multi-modal bilinear model combines the co-occurrence features with the image features. This improves the ability of the hash codes to capture object-relation dependencies among the data, and thereby the performance of the hash codes. Using the co-occurrence relationship together with MFB both improves the accuracy of the hash codes and accelerates hash learning.

Description

Hash code generation method and system for multi-label image
Technical Field
The invention belongs to the field of artificial intelligence image retrieval, and particularly relates to a hash code generation method and system for a multi-label image.
Background
Similarity hash codes are widely used for large-scale image retrieval because of their lightweight storage (compact binary codes) and efficient comparison (exclusive-or, XOR). In classical image hashing, correctly identifying the objects in an image is an important factor in retrieval accuracy. In multi-label image retrieval, however, each image contains more objects, so identifying them correctly becomes more challenging.
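The XOR comparison mentioned above can be sketched as follows. This is a minimal illustration that assumes each hash code is packed into a Python integer; the function name `hamming_distance` and the sample codes are hypothetical and do not come from the patent.

```python
def hamming_distance(code_a: int, code_b: int) -> int:
    """Count the bit positions in which two packed hash codes differ."""
    # XOR leaves a 1 exactly where the two codes disagree.
    return bin(code_a ^ code_b).count("1")


a = 0b10110010
b = 0b10011010
print(hamming_distance(a, b))  # 2
```

Because the comparison is a single XOR followed by a bit count, its cost depends only on the code length, not on the size of the original image.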
Existing solutions have the following problems:
1. The object dependency is ambiguous: it is unclear how to construct an ideal topology and which dependencies it should express.
2. An end-to-end training approach that uses topology information cannot be established: it is very difficult to learn image hash codes end to end while using the topology information.
Disclosure of Invention
In view of the defects of the prior art and the need to improve it, the invention provides a hash code generation method and system for multi-label images. It aims to replace DP with an improved MFB, to fuse the image features with the multi-label correlation features obtained by a graph convolutional network, to train an end-to-end hash model with a Cauchy loss function, and to improve the accuracy of the hashing method by exploiting the multi-label correlation information.
To achieve the above object, according to a first aspect of the present invention, there is provided a hash code generation method for a multi-label image, the method including the steps of:
S1. Count all labels in the multi-label image set, map each label to a label word vector to obtain the label word vector matrix corresponding to the multi-label image set, and calculate the co-occurrence probability between any two labels to obtain the label co-occurrence correlation matrix corresponding to the multi-label image set;
S2. Extract the image feature vector of each multi-label image in the multi-label image set with a convolutional neural network, and convolve the label word vector matrix and the label co-occurrence correlation matrix with a graph convolutional network to obtain the label co-occurrence embedded feature vector corresponding to the multi-label image set;
S3. Fuse the feature vector of each image with the label co-occurrence embedded feature vector using the attention-based multi-modal bilinear model to obtain the fused feature vector of each multi-label image;
S4. Input the fused feature vector of each multi-label image into the hash activation layer to generate the corresponding hash code;
S5. Calculate, based on the Cauchy distribution, the total loss value of all hash codes generated for the whole multi-label image set;
S6. Adjust the parameters of the convolutional neural network, the graph convolutional network, the attention-based multi-modal bilinear model and the hash activation layer according to the total loss value, so as to minimize the total loss value;
S7. Repeat steps S2 to S6 until the stop condition is met, obtaining the trained hash code generation model for multi-label images and the hash code library of the multi-label image set.
Beneficial effects: the invention models the label correlation dependency as conditional probabilities and extracts the label correlation information by graph convolution, thereby determining the label correlation features of the objects in an image. The hash function based on the improved Cauchy distribution addresses the problem that the traditional sigmoid function concentrates similar samples poorly within a short Hamming distance, and achieves a better effect. Taking the correlation among labels into account also improves the recognition of multiple objects and yields a more accurate hash model.
Preferably, in step S1, the label dependency is modeled in the form of conditional probabilities, i.e.
$$P(r_i \mid r_j) = \frac{T_{ij}}{T_j}$$
where $T_j$ denotes the number of occurrences of label $r_j$ in the multi-label image set and $T_{ij}$ denotes the number of times the two objects $r_i$ and $r_j$ appear together.
Beneficial effects: describing the label dependency as conditional probabilities accurately reflects the correlation among the labels.
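The conditional-probability statistics of step S1 can be sketched as below. This is a minimal sketch under the assumption that entry (i, j) of the correlation matrix holds the co-occurrence count of labels i and j divided by the occurrence count of label j; the function name `cooccurrence_matrix` and the toy label sets are illustrative only.

```python
from itertools import permutations


def cooccurrence_matrix(label_sets, labels):
    """Return A with A[i][j] = T_ij / T_j, i.e. P(label i | label j)."""
    index = {name: i for i, name in enumerate(labels)}
    size = len(labels)
    occurrences = [0] * size                      # T_j
    together = [[0] * size for _ in range(size)]  # T_ij
    for tags in label_sets:
        present = sorted({index[t] for t in tags})
        for i in present:
            occurrences[i] += 1
        for i, j in permutations(present, 2):
            together[i][j] += 1
    # The diagonal is left at zero in this sketch (an assumption).
    return [
        [together[i][j] / occurrences[j] if occurrences[j] else 0.0
         for j in range(size)]
        for i in range(size)
    ]


labels = ["person", "football", "goal"]
images = [
    {"person", "football"},
    {"person"},
    {"person", "football", "goal"},
    {"football", "goal"},
]
A = cooccurrence_matrix(images, labels)
print(A[1][2])  # P(football | goal) = 1.0
```

Note that the matrix is not symmetric: "goal" always appears with "football" in the toy data, but "football" appears without "goal" in one image.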
Preferably, in step S2, the convolutional neural network employs a pre-trained ResNet-101.
Beneficial effects: among the pre-trained convolutional neural network models compared for image feature extraction, this model gives a balance between effect and training speed.
In step S3, for the i-th label, i = 1, 2, …, R, where R is the number of label word vectors in the label word vector matrix, the multi-modal bilinear model is as follows:
$$z_i = \mathbf{1}^{T}\left(U_i^{T}\mathbf{x} \circ V_i^{T}E\right)$$
where $z_i$ is the fusion feature corresponding to the i-th label feature, $\mathbf{x}$ is the image feature vector, $E$ is the label co-occurrence embedded feature vector, $k$ is the latent dimension of the factorized matrices, $U_i \in \mathbb{R}^{D(\mathbf{x}) \times k}$ and $V_i \in \mathbb{R}^{D(E) \times k}$ are the trainable parameters corresponding to the i-th label feature, $\mathbf{1} \in \mathbb{R}^{k}$ is the all-one vector of dimension k, $\circ$ is the Hadamard product, i.e. the element-wise multiplication of two vectors, and the function $D(\cdot)$ returns the dimension of its argument.
Beneficial effects: the Hadamard product with sum pooling is used instead of DP to increase the interaction between the vector elements of the two modalities, which improves accuracy. In addition, the sum pooling reduces the overfitting and parameter explosion caused by the increased interaction, which speeds up model convergence.
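A small numeric sketch of the factorized bilinear fusion follows. It assumes the standard MFB form, in which one fused element is the sum over the k latent dimensions of the element-wise product of the two projected vectors; the helper names `matvec_t` and `mfb_element` and all numbers are hypothetical.

```python
def matvec_t(matrix, vector):
    """Compute matrix^T @ vector for a matrix stored as a list of rows."""
    columns = len(matrix[0])
    return [sum(matrix[r][c] * vector[r] for r in range(len(vector)))
            for c in range(columns)]


def mfb_element(x, e, U_i, V_i):
    """One fused feature: 1^T (U_i^T x  o  V_i^T e)."""
    projected_x = matvec_t(U_i, x)  # length k
    projected_e = matvec_t(V_i, e)  # length k
    # Hadamard product followed by summation over the k latent dimensions.
    return sum(a * b for a, b in zip(projected_x, projected_e))


x = [1.0, 2.0]                  # toy image feature, D(x) = 2
e = [3.0, 1.0, 0.0]             # toy label embedding, D(E) = 3
U_i = [[1.0, 0.0], [0.0, 1.0]]  # 2 x k with k = 2
V_i = [[1.0, 0.0], [0.0, 2.0], [5.0, 5.0]]  # 3 x k
z = mfb_element(x, e, U_i, V_i)
print(z)  # 7.0
```

The same value is obtained from the full bilinear form with the weight matrix equal to the product of U_i and the transpose of V_i, but the factorized version never materializes that D(x) by D(E) matrix.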
Preferably, in step S4, a fully connected layer precedes the hash activation layer: the fused feature vector first enters the fully connected layer and then the hash activation layer, and the two layers have the same number of nodes.
Beneficial effects: fully connected layers are themselves difficult to train, but a single fully connected hash layer has a small parameter capacity and struggles to learn complex transformations. Adding a fully connected layer before the hash layer enlarges the parameter capacity, while too many fully connected layers hinder training. Through experiments, one fully connected layer followed by the activation layer was selected, which balances effect and training.
Preferably, in step S5, the total loss function is
$L = \lambda L_{cce} + (1-\lambda) L_{cq}$
Cauchy cross entropy error
Figure BDA0002762728350000041
Cauchy quantization error
Figure BDA0002762728350000042
where λ is the weight of the Cauchy cross-entropy error,
Figure BDA0002762728350000043
is the weight of the training sample pair $(x_i, x_j, s_{ij})$, $s_{ij}$ is the similarity relationship between the multi-label images $x_i$ and $x_j$ ($s_{ij} = 1$ indicates similar, $s_{ij} = 0$ indicates dissimilar), $S$ is the set of similarity relationships, $S_s = \{s_{ij} \in S : s_{ij} = 1\}$ is the set of similar pairs, $S_d = \{s_{ij} \in S : s_{ij} = 0\}$ is the set of dissimilar pairs, $|\cdot|$ denotes the number of elements of a set, $h_i, h_j \in \{-1, 1\}^K$ are the outputs of the fully connected hash layer for the inputs $x_i$ and $x_j$ respectively, $\delta(h_i, h_j)$ is the Hamming distance between $h_i$ and $h_j$, γ is the Cauchy distribution parameter, N is the number of multi-label images in the multi-label image set, and K is the hash code length.
Preferably, the hamming distance is calculated as follows:
Figure BDA0002762728350000044
Beneficial effects: the Cauchy distance in the Cauchy loss function is regularized by the Hamming distance. For computational convenience, the Hamming distance is not computed from its original definition; an approximate calculation is used instead, which yields better model performance.
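The loss formulas themselves survive only as equation images, so the following is a hedged sketch rather than the patented formulas. It assumes the forms used in Deep Cauchy Hashing: a normalized distance K/2 · (1 − cos(h_i, h_j)) as the relaxed Hamming distance, a Cauchy cross-entropy term per pair, and a Cauchy quantization term per code. All function names, the `eps` guard and the sample values are assumptions.

```python
import math


def relaxed_hamming(h_i, h_j):
    """K/2 * (1 - cosine); equals the exact Hamming distance for +/-1 codes.

    Both vectors must be non-zero.
    """
    dot = sum(a * b for a, b in zip(h_i, h_j))
    norm = math.sqrt(sum(a * a for a in h_i)) * math.sqrt(sum(b * b for b in h_j))
    return len(h_i) / 2.0 * (1.0 - dot / norm)


def cauchy_cross_entropy(pairs, gamma, eps=1e-6):
    """Assumed form: sum of w * (s * log(d / gamma) + log(1 + gamma / d))."""
    total = 0.0
    for h_i, h_j, s_ij, w_ij in pairs:
        d = max(relaxed_hamming(h_i, h_j), eps)  # guard against log(0)
        total += w_ij * (s_ij * math.log(d / gamma) + math.log(1.0 + gamma / d))
    return total


def cauchy_quantization(codes, gamma):
    """Assumed form: sum of log(1 + d(|h|, 1) / gamma) over all codes."""
    total = 0.0
    for h in codes:
        d = relaxed_hamming([abs(v) for v in h], [1.0] * len(h))
        total += math.log(1.0 + d / gamma)
    return total


def total_loss(pairs, codes, gamma, lam):
    """L = lam * L_cce + (1 - lam) * L_cq, as stated in the text."""
    return (lam * cauchy_cross_entropy(pairs, gamma)
            + (1.0 - lam) * cauchy_quantization(codes, gamma))


base = [1.0, 1.0, 1.0, 1.0]
near = [1.0, 1.0, 1.0, -1.0]    # Hamming distance 1 from base
far = [-1.0, -1.0, -1.0, 1.0]   # Hamming distance 3 from base
print(relaxed_hamming(base, near), relaxed_hamming(base, far))
```

Under these assumed forms, a similar pair is penalized more as its distance grows, while a dissimilar pair is penalized more as its distance shrinks, and the quantization term vanishes for codes that are already exactly ±1.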
Preferably, the method is applied to the field of image multi-label retrieval.
Beneficial effects: by introducing MFB and using an improved Cauchy loss, the invention improves existing hash generation methods in several respects, and retrieval with the generated hash codes achieves the best performance currently available.
To achieve the above object, according to a second aspect of the present invention, there is provided a hash code generation system for a multi-label image, including: a computer-readable storage medium and a processor;
the computer-readable storage medium is used for storing executable instructions;
the processor is configured to read executable instructions stored in the computer-readable storage medium, and execute the hash code generation method for a multi-label image according to the first aspect.
Generally, by the above technical solution conceived by the present invention, the following beneficial effects can be obtained:
The method first uses a convolutional neural network (CNN) and a graph convolutional network (GCN) to generate an image representation and a label co-occurrence embedding, respectively, then fuses the two modal vectors with MFB, and finally learns a hash model through a loss function based on the Cauchy distribution. The mutual dependency among objects is explored through the co-occurrence probability of the objects in the label set, and the attention-based multi-modal bilinear model (MFB) combines the co-occurrence features with the image features, which improves the ability of the hash codes to capture object-relation dependencies among the data and thereby the performance of the hash codes. Extensive experiments on public datasets show that the method achieves state-of-the-art retrieval results, and that using the co-occurrence relationship together with MFB both improves the accuracy of the hash codes and accelerates hash learning.
Drawings
Fig. 1 is a flowchart of a hash code generation method for a multi-label image according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
As shown in fig. 1, the present invention provides a hash code generation method for a multi-label image, including the following steps:
S1. Count all labels in the multi-label image set, map each label to a label word vector to obtain the label word vector matrix corresponding to the multi-label image set, and calculate the co-occurrence probability between any two labels to obtain the label co-occurrence correlation matrix corresponding to the multi-label image set.
Preferably, in step S1, the label dependency is modeled in the form of conditional probabilities, i.e.
$$P(r_i \mid r_j) = \frac{T_{ij}}{T_j}$$
where $T_j$ denotes the number of occurrences of label $r_j$ in the multi-label image set and $T_{ij}$ (equal to $T_{ji}$) denotes the number of times the two objects $r_i$ and $r_j$ appear together.
For example, the multi-label image shown in Fig. 1 carries the four labels {person, football, court, goal}; each label is mapped to a label word vector, giving the four label word vectors [00], [01], [10] and [11].
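The toy mapping in this example can be reproduced with the sketch below. It assumes, purely for illustration, that the vectors [00] to [11] are the binary encodings of the label indices; the text does not say how label word vectors are produced in practice, and the function name `label_word_vectors` is hypothetical.

```python
def label_word_vectors(labels):
    """Map each label to the binary encoding of its index (toy assumption)."""
    width = max(1, (len(labels) - 1).bit_length())
    return {
        name: [int(bit) for bit in format(i, f"0{width}b")]
        for i, name in enumerate(labels)
    }


vectors = label_word_vectors(["person", "football", "court", "goal"])
print(vectors["court"])  # [1, 0]
```

Stacking the four vectors row by row gives a label word vector matrix with one row per label, which is the shape the graph convolution step expects.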
To avoid the long-tail effect caused by rare samples, the matrix A is binarized using a threshold τ:
Figure BDA0002762728350000062
Figure BDA0002762728350000063
where
Figure BDA0002762728350000064
is the binary correlation matrix and q ∈ (0, 1).
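The thresholding step can be sketched as follows. The sketch assumes that entries at or above τ become 1 and all others become 0; the direction of the comparison and the function name `binarize` are assumptions, and the second formula involving q is not covered.

```python
def binarize(A, tau):
    """Threshold a co-occurrence matrix into a 0/1 correlation matrix."""
    return [[1 if value >= tau else 0 for value in row] for row in A]


A = [
    [0.0, 0.67, 0.5],
    [0.67, 0.0, 1.0],
    [0.33, 0.67, 0.0],
]
B = binarize(A, tau=0.5)
print(B)  # [[0, 1, 1], [1, 0, 1], [0, 1, 0]]
```

Weak co-occurrences such as the 0.33 entry are dropped, which is the effect the text attributes to suppressing rare, noisy label pairs.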
S2. Extract the image feature vector of each multi-label image in the multi-label image set with a convolutional neural network, and convolve the label word vector matrix and the label co-occurrence correlation matrix with a graph convolutional network to obtain the label co-occurrence embedded feature vector corresponding to the multi-label image set.
Preferably, in step S2, the convolutional neural network employs a pre-trained ResNet-101.
In this embodiment, the target image is passed through the pre-trained deep model to obtain a feature map of dimension 2048 × 14 × 14, and a global max pooling layer is then applied to generate the image-level feature
Figure BDA0002762728350000071
where θ denotes the parameters of the CNN and
Figure BDA0002762728350000072
The label features are extracted by a graph convolution function $F_{gcn}$. The word descriptions in the label set are converted into vectors $V(r)$, where $V^{c} \in \mathbb{R}^{R \times D(V(r))}$ denotes the input of layer $c$ and $D(V(r))$ denotes the dimension of $V(r)$. The relation input is the correlation matrix $A \in \mathbb{R}^{R \times R}$, and the updated node features are denoted $V^{c+1} \in \mathbb{R}^{R \times D(V(r))'}$. The propagation function of each GCN layer is:
Figure BDA0002762728350000073
where
Figure BDA0002762728350000074
In this embodiment, two GCN layers are used, i.e. $E(r) = V^{c+2}$. Experiments showed that the two-layer structure extracts the features while keeping training fast.
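A two-layer propagation can be sketched as below. Since the propagation formula is available only as an equation image, the sketch assumes the common form in which each layer multiplies the correlation matrix, the node features and a weight matrix, followed by a LeakyReLU on the hidden layer; the names `gcn_layer`, `label_embedding`, `W1`, `W2` and the slope 0.2 are all assumptions.

```python
def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def leaky_relu(X, slope=0.2):
    return [[v if v > 0 else slope * v for v in row] for row in X]


def gcn_layer(A, V, W, activate=True):
    """Assumed propagation: V_next = h(A @ V @ W)."""
    out = matmul(matmul(A, V), W)
    return leaky_relu(out) if activate else out


def label_embedding(A, V, W1, W2):
    """Two stacked layers, matching the two-layer embodiment."""
    hidden = gcn_layer(A, V, W1)
    return gcn_layer(A, hidden, W2, activate=False)


A = [[1.0, 1.0], [0.0, 1.0]]    # toy correlation matrix, R = 2
V = [[1.0, 0.0], [0.0, 1.0]]    # toy label word vectors
W1 = [[1.0, -1.0], [0.0, 1.0]]  # hypothetical layer weights
W2 = [[1.0], [1.0]]
E = label_embedding(A, V, W1, W2)
print(E)  # [[2.0], [1.0]]
```

Each row of the result is the embedding of one label after it has aggregated information from the labels it is correlated with.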
S3. Fuse the feature vector of each image with the label co-occurrence embedded feature vector using the attention-based multi-modal bilinear model to obtain the fused feature vector of each multi-label image.
Preferably, in step S3, for the feature of the i-th object, the multi-modal bilinear model with two low-rank matrices is as follows:
$$z_i = \mathbf{1}^{T}\left(U_i^{T}\mathbf{x} \circ V_i^{T}E(r)\right)$$
where $\mathbf{x}$ is the image feature vector, $E(r)$ is the label co-occurrence embedded feature vector, $k$ is the latent dimension of the factorized matrices, $U_i \in \mathbb{R}^{D(\mathbf{x}) \times k}$ and $V_i \in \mathbb{R}^{D(E(r)) \times k}$ are trainable parameters, $\mathbf{1} \in \mathbb{R}^{k}$ is the all-one vector of dimension k, $\circ$ is the Hadamard product, i.e. the element-wise multiplication of two vectors, and $D(\cdot)$ is the function that returns the dimension of its argument.
The two transformations are implemented by two parallel k-dimensional fully connected (fc) layers, and pooling is applied after the element-wise multiplication, giving:
Figure BDA0002762728350000079
where
Figure BDA0002762728350000081
and the sum function
Figure BDA0002762728350000082
denotes summation over one-dimensional non-overlapping windows of size
Figure BDA0002762728350000083
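Sum pooling over non-overlapping one-dimensional windows can be sketched as follows. The window size is assumed here to be the latent dimension k, since the symbol is given only as an image; the function name `sum_pool` is illustrative.

```python
def sum_pool(values, k):
    """Sum each consecutive, non-overlapping window of k elements."""
    if k <= 0 or len(values) % k != 0:
        raise ValueError("length of values must be a positive multiple of k")
    return [sum(values[start:start + k]) for start in range(0, len(values), k)]


# An element-wise product of length 6 pooled with k = 3 gives 2 fused features.
print(sum_pool([1, 2, 3, 4, 5, 6], 3))  # [6, 15]
```

Pooling in this way lets a single pair of wide fully connected layers produce all fused features at once, instead of keeping a separate pair of matrices per output element.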
S4. Input the fused feature vector of each multi-label image into the hash activation layer to generate the corresponding hash code.
Preferably, in step S4, a fully connected layer precedes the hash activation layer: the fused feature vector first enters the fully connected layer and then the hash activation layer, and the two layers have the same number of nodes.
The deep network is fitted through the loss function. Its last two layers are a fully connected layer and a fully connected hash layer, which take the obtained matrix Z as input and produce the predicted hash codes and the final hashing model.
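The last two layers can be sketched as below. The activation function of the hash layer is not named in the text, so the sketch assumes a tanh relaxation followed by a sign function to obtain codes in {-1, 1}; the names `dense` and `hash_layer` and all weights are hypothetical.

```python
import math


def dense(inputs, weights, biases):
    """Fully connected layer; each row of weights feeds one output node."""
    return [sum(w * v for w, v in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def hash_layer(z, W_fc, b_fc, W_hash, b_hash):
    """Fully connected layer, then a hash layer with the same node count."""
    hidden = dense(z, W_fc, b_fc)
    # Assumed relaxation: tanh keeps outputs in (-1, 1) and is differentiable.
    relaxed = [math.tanh(v) for v in dense(hidden, W_hash, b_hash)]
    code = [1 if v >= 0 else -1 for v in relaxed]
    return relaxed, code


z = [0.5, -1.0]                      # toy fused feature vector
W_fc = [[1.0, 0.0], [0.0, 1.0]]      # K = 2 nodes
W_hash = [[2.0, 0.0], [0.0, 2.0]]    # K = 2 nodes
relaxed, code = hash_layer(z, W_fc, [0.0, 0.0], W_hash, [0.0, 0.0])
print(code)  # [1, -1]
```

The relaxed outputs would be used while training with the loss, and the signed codes are what would be stored in the hash code library.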
S5. Calculate, based on the Cauchy distribution, the total loss value of all hash codes generated for the whole multi-label image set.
Preferably, in step S5, the total loss function is
$L = \lambda L_{cce} + (1-\lambda) L_{cq}$
Cauchy cross entropy error
Figure BDA0002762728350000084
Cauchy quantization error
Figure BDA0002762728350000085
where λ is the weight of the Cauchy cross-entropy error,
Figure BDA0002762728350000086
is the weight of the training sample pair $(x_i, x_j, s_{ij})$, $s_{ij}$ is the similarity relationship between $x_i$ and $x_j$ ($s_{ij} = 1$ indicates similar, $s_{ij} = 0$ indicates dissimilar), $S$ is the set of similarity relationships, $S_s = \{s_{ij} \in S : s_{ij} = 1\}$ is the set of similar pairs, $S_d = \{s_{ij} \in S : s_{ij} = 0\}$ is the set of dissimilar pairs, $|\cdot|$ denotes the number of elements of a set, $h_i, h_j \in \{-1, 1\}^K$ are the outputs of the fully connected hash layer for the inputs $x_i$ and $x_j$ respectively, $\delta(h_i, h_j)$ is the Hamming distance between $h_i$ and $h_j$, γ is the Cauchy distribution parameter, N is the input size, and K is the hash code length.
Preferably, the hamming distance is calculated as follows:
Figure BDA0002762728350000091
S6. Adjust the parameters of the convolutional neural network, the graph convolutional network, the attention-based multi-modal bilinear model and the hash activation layer according to the total loss value, so as to minimize the total loss value.
In this embodiment, the parameters of each module are adjusted by gradient descent so that the total loss value is minimized.
S7. Repeat steps S2 to S6 until the stop condition is met, obtaining the trained hash code generation model for multi-label images and the hash code library of the multi-label image set.
Preferably, in step S7, the stop condition is that the Hamming distance is smaller than a set threshold; in this embodiment the threshold is 2. Reaching a specified number of iterations may also be used as the stop condition.
Preferably, the method is applied to image multi-label retrieval.
For example, when applied to cloud photo album retrieval, the multi-label image set contains all photos of the user, and each photo may carry several labels, which are the types of objects contained in the image, such as people, dogs and tables. A query picture is input and its image feature vector is extracted by the convolutional neural network. This vector and the label co-occurrence embedded feature vector corresponding to the multi-label image set are then fed into the MFB, and the hash code is obtained after the hash activation layer. The hash code is compared with the codes in the hash code library, and the pictures whose similarity lies within a set threshold range are returned as the retrieval result. The similar pictures in the user's cloud photo album can be used to determine the user's photographing preferences, or to determine whether the album contains near-duplicate pictures that can be deleted to save cloud storage space.
For example, when applied to product image retrieval, the multi-label image set contains all product images in the product database, and each image may carry several labels, which are the types of objects contained in the image, such as a bag, a car or a computer of a certain brand. A product picture to be queried is input and its image feature vector is extracted by the convolutional neural network. This vector and the label co-occurrence embedded feature vector corresponding to the multi-label image set are then fed into the MFB, and the hash code is obtained after the hash activation layer. The hash code is compared with the codes in the hash code library, and the pictures whose similarity lies within a set threshold range are returned as the retrieval result. Each returned picture corresponds to a product, so the product is retrieved by means of a picture.
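The lookup step shared by both applications can be sketched as below. The sketch assumes that each ±1 code has been packed into an integer (for instance by mapping -1 to 0) and that similarity is judged by Hamming distance; the function name `retrieve`, the file names and the threshold are hypothetical.

```python
def retrieve(query_code, code_library, max_distance):
    """Return the ids whose codes lie within max_distance bits of the query."""
    hits = []
    for image_id, code in code_library.items():
        distance = bin(query_code ^ code).count("1")
        if distance <= max_distance:
            hits.append((distance, image_id))
    # Closest matches first; ties are broken by id for a stable order.
    return [image_id for _, image_id in sorted(hits)]


library = {
    "a.jpg": 0b10110010,
    "b.jpg": 0b10110011,
    "c.jpg": 0b01001101,
}
print(retrieve(0b10110010, library, max_distance=2))  # ['a.jpg', 'b.jpg']
```

A linear scan like this is only a sketch; the point is that each comparison needs nothing more than one XOR and a bit count against the stored code.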
The invention provides a hash code generation system of a multi-label image, which comprises: a computer-readable storage medium and a processor;
the computer-readable storage medium is used for storing executable instructions;
the processor is used for reading the executable instructions stored in the computer-readable storage medium and executing the hash code generation method of the multi-label image.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A hash code generation method of a multi-label image is characterized by comprising the following steps:
S1. counting all labels in a multi-label image set, mapping each label to a label word vector to obtain a label word vector matrix corresponding to the multi-label image set, and calculating the co-occurrence probability between any two labels to obtain a label co-occurrence correlation matrix corresponding to the multi-label image set;
S2. extracting the image feature vector of each multi-label image in the multi-label image set with a convolutional neural network, and convolving the label word vector matrix and the label co-occurrence correlation matrix with a graph convolutional network to obtain the label co-occurrence embedded feature vector corresponding to the multi-label image set;
S3. fusing the feature vector of each image with the label co-occurrence embedded feature vector using an attention-based multi-modal bilinear model to obtain the fused feature vector of each multi-label image;
S4. inputting the fused feature vector of each multi-label image into a hash activation layer to generate the corresponding hash code;
S5. calculating, based on the Cauchy distribution, the total loss value of all hash codes generated for the whole multi-label image set;
S6. adjusting the parameters of the convolutional neural network, the graph convolutional network, the attention-based multi-modal bilinear model and the hash activation layer according to the total loss value, so as to minimize the total loss value;
S7. repeating steps S2 to S6 until a stop condition is met, to obtain the trained hash code generation model for multi-label images and the hash code library of the multi-label image set.
2. The method of claim 1, wherein in step S1, the label dependency is modeled in the form of conditional probabilities
$$P(r_i \mid r_j) = \frac{T_{ij}}{T_j}$$
wherein $T_j$ denotes the number of occurrences of label $r_j$ in the multi-label image set and $T_{ij}$ denotes the number of times the two objects $r_i$ and $r_j$ appear together.
3. The method of claim 1 or claim 2, wherein in step S2, the convolutional neural network employs a pre-trained ResNet-101.
4. The method according to any one of claims 1 to 3, wherein in step S3, for the i-th label, i = 1, 2, …, R, where R is the number of label word vectors in the label word vector matrix, the multi-modal bilinear model is as follows:
$$z_i = \mathbf{1}^{T}\left(U_i^{T}\mathbf{x} \circ V_i^{T}E\right)$$
wherein $z_i$ is the fusion feature corresponding to the i-th label feature, $\mathbf{x}$ is the image feature vector, $E$ is the label co-occurrence embedded feature vector, $k$ is the latent dimension of the factorized matrices, $U_i \in \mathbb{R}^{D(\mathbf{x}) \times k}$ and $V_i \in \mathbb{R}^{D(E) \times k}$ are the trainable parameters corresponding to the i-th label feature, $\mathbf{1} \in \mathbb{R}^{k}$ is the all-one vector of dimension k, $\circ$ is the Hadamard product, i.e. the element-wise multiplication of two vectors, and the function $D(\cdot)$ returns the dimension of its argument.
5. The method according to any one of claims 1 to 4, wherein in step S4, a fully connected layer precedes the hash activation layer, the fused feature vector first enters the fully connected layer and then enters the hash activation layer, and the fully connected layer and the hash activation layer have the same number of nodes.
6. The method of claim 5, wherein in step S5, the total loss function is
$L = \lambda L_{cce} + (1-\lambda) L_{cq}$
Cauchy cross entropy error
Figure FDA0002762728340000026
Cauchy quantization error
Figure FDA0002762728340000027
wherein λ is the weight of the Cauchy cross-entropy error,
Figure FDA0002762728340000028
is the weight of the training sample pair $(x_i, x_j, s_{ij})$, $s_{ij}$ is the similarity relationship between the multi-label images $x_i$ and $x_j$ ($s_{ij} = 1$ indicates similar, $s_{ij} = 0$ indicates dissimilar), $S$ is the set of similarity relationships, $S_s = \{s_{ij} \in S : s_{ij} = 1\}$ is the set of similar pairs, $S_d = \{s_{ij} \in S : s_{ij} = 0\}$ is the set of dissimilar pairs, $|\cdot|$ denotes the number of elements of a set, $h_i, h_j \in \{-1, 1\}^K$ are the outputs of the fully connected hash layer for the inputs $x_i$ and $x_j$ respectively, $\delta(h_i, h_j)$ is the Hamming distance between $h_i$ and $h_j$, γ is the Cauchy distribution parameter, N is the number of multi-label images in the multi-label image set, and K is the hash code length.
7. The method according to any one of claims 1 to 6, wherein in step S7, a Hamming distance less than a set threshold is used as the stop condition.
8. The method of claim 6, wherein the hamming distance is calculated as follows:
Figure FDA0002762728340000031
9. The method of any one of claims 1 to 8, applied to image multi-label retrieval or image multi-label classification.
10. A hash code generation system for a multi-label image, comprising: a computer-readable storage medium and a processor;
the computer-readable storage medium is used for storing executable instructions;
the processor is configured to read executable instructions stored in the computer-readable storage medium, and execute the hash code generation method of the multi-label image according to any one of claims 1 to 9.
CN202011226768.2A 2020-11-05 2020-11-05 Hash code generation method and system for multi-label image Pending CN112395438A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011226768.2A CN112395438A (en) 2020-11-05 2020-11-05 Hash code generation method and system for multi-label image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011226768.2A CN112395438A (en) 2020-11-05 2020-11-05 Hash code generation method and system for multi-label image

Publications (1)

Publication Number Publication Date
CN112395438A true CN112395438A (en) 2021-02-23

Family

ID=74598242

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011226768.2A Pending CN112395438A (en) 2020-11-05 2020-11-05 Hash code generation method and system for multi-label image

Country Status (1)

Country Link
CN (1) CN112395438A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112800260A (en) * 2021-04-09 2021-05-14 北京邮电大学 Multi-label image retrieval method and device based on deep hash energy model
CN113177132A (en) * 2021-06-30 2021-07-27 中国海洋大学 Image retrieval method based on depth cross-modal hash of joint semantic matrix
CN113239214A (en) * 2021-05-19 2021-08-10 中国科学院自动化研究所 Cross-modal retrieval method, system and equipment based on supervised contrast
CN113449775A (en) * 2021-06-04 2021-09-28 广州大学 Multi-label image classification method and system based on class activation mapping mechanism
CN113704522A (en) * 2021-10-28 2021-11-26 山东建筑大学 Artificial intelligence-based target image rapid retrieval method and system
CN113886607A (en) * 2021-10-14 2022-01-04 哈尔滨工业大学(深圳) Hash retrieval method, device, terminal and storage medium based on graph neural network
CN114463583A (en) * 2022-01-26 2022-05-10 南通大学 Deep hashing method for pneumonia CT image classification
CN114596456A (en) * 2022-05-10 2022-06-07 四川大学 Image set classification method based on aggregated hash learning
CN115994237A (en) * 2023-01-05 2023-04-21 北京东方通网信科技有限公司 Label characterization construction method for multi-label image retrieval

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107992611A (en) * 2017-12-15 2018-05-04 清华大学 High-dimensional data retrieval method and system based on a Cauchy-distribution hashing method

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107992611A (en) * 2017-12-15 2018-05-04 清华大学 High-dimensional data retrieval method and system based on a Cauchy-distribution hashing method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YANZHAO XIE et al.: "Label-Attended Hashing for Multi-Label Image Retrieval", PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI-20), pages 955-961 *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112800260A (en) * 2021-04-09 2021-05-14 北京邮电大学 Multi-label image retrieval method and device based on deep hash energy model
CN113239214A (en) * 2021-05-19 2021-08-10 中国科学院自动化研究所 Cross-modal retrieval method, system and equipment based on supervised contrast
CN113449775A (en) * 2021-06-04 2021-09-28 广州大学 Multi-label image classification method and system based on class activation mapping mechanism
CN113449775B (en) * 2021-06-04 2023-02-24 广州大学 Multi-label image classification method and system based on class activation mapping mechanism
CN113177132A (en) * 2021-06-30 2021-07-27 中国海洋大学 Image retrieval method based on depth cross-modal hash of joint semantic matrix
CN113177132B (en) * 2021-06-30 2021-09-14 中国海洋大学 Image retrieval method based on depth cross-modal hash of joint semantic matrix
CN113886607A (en) * 2021-10-14 2022-01-04 哈尔滨工业大学(深圳) Hash retrieval method, device, terminal and storage medium based on graph neural network
CN113886607B (en) * 2021-10-14 2022-07-12 哈尔滨工业大学(深圳) Hash retrieval method, device, terminal and storage medium based on graph neural network
CN113704522B (en) * 2021-10-28 2022-02-18 山东建筑大学 Artificial intelligence-based target image rapid retrieval method and system
CN113704522A (en) * 2021-10-28 2021-11-26 山东建筑大学 Artificial intelligence-based target image rapid retrieval method and system
CN114463583A (en) * 2022-01-26 2022-05-10 南通大学 Deep hashing method for pneumonia CT image classification
CN114463583B (en) * 2022-01-26 2024-03-19 南通大学 Deep hashing method for pneumonia CT image classification
CN114596456A (en) * 2022-05-10 2022-06-07 四川大学 Image set classification method based on aggregated hash learning
CN114596456B (en) * 2022-05-10 2022-07-22 四川大学 Image set classification method based on aggregated hash learning
CN115994237A (en) * 2023-01-05 2023-04-21 北京东方通网信科技有限公司 Label characterization construction method for multi-label image retrieval

Similar Documents

Publication Publication Date Title
CN112395438A (en) Hash code generation method and system for multi-label image
JP7360497B2 (en) Cross-modal feature extraction method, extraction device, and program
CN111353076B (en) Method for training cross-modal retrieval model, cross-modal retrieval method and related device
CN112119411A (en) System and method for integrating statistical models of different data modalities
CN112819023B (en) Sample set acquisition method, device, computer equipment and storage medium
CN111382868A (en) Neural network structure search method and neural network structure search device
CN113177141B (en) Multi-label video hash retrieval method and device based on semantic embedded soft similarity
Guizilini et al. Learning to reconstruct 3D structures for occupancy mapping from depth and color information
CN116049459B (en) Cross-modal mutual retrieval method, device, server and storage medium
CN114612767B (en) Scene graph-based image understanding and expressing method, system and storage medium
CN114358188A (en) Feature extraction model processing method, feature extraction model processing device, sample retrieval method, sample retrieval device and computer equipment
CN116932722A (en) Cross-modal data fusion-based medical visual question-answering method and system
CN112836502A (en) Implicit causal relationship extraction method for events in financial field
CN115438160A (en) Question and answer method and device based on deep learning and electronic equipment
CN115599984A (en) Retrieval method
CN112800253B (en) Data clustering method, related device and storage medium
CN116703531B (en) Article data processing method, apparatus, computer device and storage medium
Hoxha et al. Retrieving images with generated textual descriptions
CN115640418A (en) Cross-domain multi-view target website retrieval method and device based on residual semantic consistency
CN110019815B (en) Natural language processing using KNN
CN117938951B (en) Information pushing method, device, computer equipment and storage medium
CN117252665B (en) Service recommendation method and device, electronic equipment and storage medium
Zhang et al. UHD Aerial Photograph Categorization by Leveraging Deep Multiattribute Matrix Factorization
CN115358235B (en) Quality control method and device for medical knowledge graph, computer equipment and storage medium
Li et al. Triplet Deep Hashing with Joint Supervised Loss Based on Deep Neural Networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination