CN111985548A - Label-guided cross-modal deep hashing method

Info

Publication number: CN111985548A
Application number: CN202010802092.0A
Authority: CN (China)
Prior art keywords: label, text, image, network, data
Legal status: Withdrawn (the legal status is an assumption and is not a legal conclusion)
Other languages: Chinese (zh)
Inventors: 曾焕强, 阮海涛, 朱建清, 陈婧, 曹九稳, 廖昀
Current Assignee: Huaqiao University
Original Assignee: Huaqiao University
Application filed by Huaqiao University; priority to CN202010802092.0A; published as CN111985548A

Classifications

    • G06F18/214 Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F16/9014 Information retrieval; indexing; data structures therefor; hash tables
    • G06F16/906 Information retrieval; clustering; classification
    • G06F18/22 Pattern recognition; matching criteria, e.g. proximity measures
    • G06F40/30 Handling natural language data; semantic analysis
    • G06N3/045 Neural networks; combinations of networks
    • G06N3/08 Neural networks; learning methods
    • G06N3/084 Neural networks; backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a label-guided cross-modal deep hashing method comprising the following steps: constructing feature extraction networks for images, texts, and the corresponding label information; designing a loss function; and performing common-representation-space learning and label-space learning on the input text and image modalities so as to bridge the semantic gap between different modalities. The invention specifically addresses a core difficulty of cross-modal retrieval: data of different modalities exhibit a semantic gap, being correlated in their high-level semantics while heterogeneous in their low-level features. The method can effectively improve cross-modal retrieval precision.

Description

Label-guided cross-modal deep hashing method
Technical Field
The invention relates to the field of multi-modal learning and multi-modal fusion, in particular to a label-guided cross-modal deep hash method.
Background
With the explosive growth of multi-modal multimedia data, cross-modal retrieval has become a pressing problem. Cross-modal retrieval performs retrieval across data of different modalities (image, text, audio, video, and the like), for example retrieving text with an image query, audio with a text query, or video with an audio query, and has substantial application value. Its application scenarios are very broad, including highlight retrieval on video websites and personalized semantic retrieval of short videos.
However, data of different modalities are typically heterogeneous in their low-level features while correlated in their high-level semantics. For example, the concept "tiger" is represented in the image modality by features such as SIFT and LBP, but in the text modality by dictionary (bag-of-words) vectors; the same semantics are thus expressed in completely different feature forms across modalities. Cross-modal retrieval is therefore highly challenging.
Disclosure of Invention
The invention provides a label-guided cross-modal deep hashing method addressing the semantic gap between data of different modalities in cross-modal retrieval.
The technical scheme adopted by the invention to solve this problem is as follows:
A label-guided cross-modal deep hashing method comprises a training process and a retrieval process:
Training process S1: input image-text pairs with the same semantics, together with their class label information, into a label-guided deep hash network model and train until the model converges, thereby obtaining a network model M.
Retrieval process S2: using the network model M trained in S1, extract feature vectors for the image to be queried and for each text in the candidate library, compute the similarity between the query image and each candidate text, sort by similarity, and return the retrieval results.
Preferably, the training process S1 comprises the following steps:
Step S11): input image data v_i of different classes into the image-modality feature extraction network to extract image features;
Step S12): input the text data t_i corresponding to the image data v_i in S11) into the text-modality feature extraction network to extract text features;
Step S13): construct the common-subspace feature B and the label-space feature L from the category information of the image data v_i and text data t_i;
Step S14): input the class label information l_i = [l_i1, ..., l_ic] annotated on the data into the label feature extraction network to extract the Hamming-space feature H_l and the label-space feature L_l;
Step S15): send the features B, L, H_l, and L_l obtained above into the label space and the common representation space, respectively, for joint learning; optimize the label-network loss with the error back-propagation algorithm until the label network converges, then use the converged label network to guide the update of the image and text networks; iterating this process yields the label-guided cross-modal deep hash network model M.
Preferably, in step S11), the image feature extraction network consists of five convolutional layers, pooling layers, and three fully connected layers, where the last fully connected layer has N = K + c hidden units, i.e., the sum of the hash code length K and the number of image data classes c.
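For illustration, a minimal PyTorch sketch of such an image branch is given below; the kernel sizes, channel counts, and 4096-unit hidden layers are assumptions (the patent fixes only the five-conv/three-FC structure and the final N = K + c layer).

```python
# Sketch of the image branch: five conv layers with pooling, then three FC
# layers whose final layer has N = K + c units. Layer sizes are illustrative
# assumptions, not the patent's exact configuration.
import torch
import torch.nn as nn

class ImageFeatureNet(nn.Module):
    def __init__(self, hash_bits: int, num_classes: int):
        super().__init__()
        n_out = hash_bits + num_classes  # N = K + c
        self.conv = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=11, stride=4), nn.ReLU(),
            nn.MaxPool2d(3, stride=2),
            nn.Conv2d(64, 256, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool2d(3, stride=2),
            nn.Conv2d(256, 256, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(256, 256, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(256, 256, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(3, stride=2),
        )
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.LazyLinear(4096), nn.ReLU(),   # input size inferred at first call
            nn.Linear(4096, 4096), nn.ReLU(),
            nn.Linear(4096, n_out),           # first K units: hash; last c: labels
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc(self.conv(x))

# Example: 16-bit hash codes over 24 classes on 224x224 RGB images.
net = ImageFeatureNet(hash_bits=16, num_classes=24)
out = net(torch.randn(2, 3, 224, 224))  # shape (2, 40)
```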
Preferably, in step S12), the text feature extraction network consists of an MS (multi-scale) model followed by a three-layer feedforward neural network, i.e., T → MS → 4096 → 512 → N overall, where T denotes the input layer of the text network, MS denotes the multi-scale model, 4096 and 512 denote the hidden-unit counts of the first two feedforward layers, and N = K + c is the sum of the hash code length K and the number of text data classes c. The MS model consists of five pooling layers with sizes 1×1, 2×2, 3×3, 5×5, and 10×10.
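A sketch of this text branch follows, assuming the bag-of-words input is reshaped into a square 2-D map before the five-scale average pooling; the side length of 50 is an illustration value, not taken from the patent.

```python
# Sketch of the multi-scale (MS) text branch: the BoW vector is reshaped to a
# 2-D map, average-pooled at scales 1x1, 2x2, 3x3, 5x5, 10x10, and the
# concatenated features feed the 4096 -> 512 -> N feedforward layers.
import torch
import torch.nn as nn

class TextFeatureNet(nn.Module):
    def __init__(self, hash_bits: int, num_classes: int, side: int = 50):
        super().__init__()
        self.side = side  # assumes the BoW dimension equals side * side
        self.scales = (1, 2, 3, 5, 10)
        self.pools = nn.ModuleList([nn.AdaptiveAvgPool2d(s) for s in self.scales])
        ms_dim = sum(s * s for s in self.scales)  # 1 + 4 + 9 + 25 + 100 = 139
        self.mlp = nn.Sequential(
            nn.Linear(ms_dim, 4096), nn.ReLU(),
            nn.Linear(4096, 512), nn.ReLU(),
            nn.Linear(512, hash_bits + num_classes),  # N = K + c
        )

    def forward(self, t: torch.Tensor) -> torch.Tensor:
        m = t.view(-1, 1, self.side, self.side)                       # T -> 2-D map
        ms = torch.cat([p(m).flatten(1) for p in self.pools], dim=1)  # MS features
        return self.mlp(ms)

net = TextFeatureNet(hash_bits=16, num_classes=24)
out = net(torch.randn(2, 2500))  # 2500 = 50 * 50; output shape (2, 40)
```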
Preferably, in step S13), the common-subspace feature B and the label-space feature L are extracted from the category information of the image data v_i and text data t_i. B is constructed from the similarity matrix S_ij: S_ij = 1 indicates that O_i and O_j are similar, and S_ij = 0 indicates that they are dissimilar. L takes the form l_i = [l_i1, ..., l_ic]: when data sample O_i belongs to class j, L_ij = 1; otherwise L_ij = 0.
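The construction of S and L from multi-hot category annotations can be sketched as follows, assuming (as is common in supervised hashing) that two samples count as similar when they share at least one class label.

```python
# Sketch: build the similarity matrix S (S_ij = 1 iff samples O_i and O_j
# share at least one class label) and the label matrix L (L_ij = 1 iff
# sample O_i belongs to class j).
import numpy as np

def build_supervision(labels: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """labels: (n, c) array with labels[i, j] = 1 iff sample i is in class j."""
    L = labels.astype(np.float32)
    S = (L @ L.T > 0).astype(np.float32)  # shared label => S_ij = 1, else 0
    return S, L

# Example: three samples over c = 3 classes.
S, L = build_supervision(np.array([[1, 0, 0], [1, 1, 0], [0, 0, 1]]))
print(S)  # samples 0 and 1 are similar; sample 2 is dissimilar to both
```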
Preferably, in step S14), the label feature extraction network consists of two four-layer feedforward neural networks with hidden-layer structure L → 4096 → 512 → N, where L denotes the input layer of the label network, 4096 and 512 denote the hidden-unit counts of the first two layers, and N = K + c is the sum of the hash code length K and the number of text data classes c; l_i = [l_i1, ..., l_ic] indicates that data sample O_i belongs to class j (L_ij = 1, otherwise L_ij = 0). For the extracted Hamming-space feature H_l and label-space feature L_l: H_l is generated by the sign function, i.e.

H_l = sign(f_l(l_i; θ_l)),

where the sign function is

sign(x) = 1 if x ≥ 0, and -1 otherwise;

and L_l is generated by applying a sigmoid function at the activation layer, where the sigmoid function is

sigmoid(x) = 1 / (1 + e^(-x)).

Here f_v, f_t, f_l denote the hash functions of the image, text, and label networks, θ_v, θ_t, θ_l denote the network parameters to be learned, and the outputs f(·; θ) are the semantic features learned in the Hamming space.
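The split of the label network's N = K + c output into H_l and L_l might look as follows; the slicing layout (first K units for the hash, last c for the labels) is an assumption consistent with N = K + c.

```python
# Sketch: split the N = K + c output into the Hamming-space feature H_l via
# the sign function and the label-space feature L_l via the sigmoid, per the
# formulas above.
import torch

def split_label_outputs(out: torch.Tensor, hash_bits: int):
    h = out[:, :hash_bits]
    # sign(x) = 1 for x >= 0, -1 otherwise (torch.sign maps 0 to 0, so use where)
    H_l = torch.where(h >= 0, torch.ones_like(h), -torch.ones_like(h))
    L_l = torch.sigmoid(out[:, hash_bits:])  # predicted class probabilities
    return H_l, L_l

H_l, L_l = split_label_outputs(torch.randn(4, 40), hash_bits=16)
```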
Preferably, in step S15), the label network is updated first. The objective function of the label network has the form

min J^l = λ1·J1 + λ2·J2 + λ3·J3 + λ4·J4,

where S_ij is the similarity matrix (S_ij = 1 indicates that O_i and O_j are similar; S_ij = 0 indicates that they are dissimilar), Θ_ij = (1/2) h_i·h_j is the inner-product similarity of the learned features, and H_l and L_l denote the predicted hash code and the predicted class label, respectively. The term weighted by λ1 preserves the similarity of semantic features, λ2 ensures that data instances with the same class label have similar hash codes, λ3 optimizes the learned hash-code loss, and λ4 optimizes the label-space loss.
The converged label network then guides the update of the image-text network; the objective function for the image and text feature learning has the analogous form

min J^{v,t} = α·ξ1 + γ·ξ2 + μ·ξ3 + β·ξ4,

where S_ij is the similarity matrix as above, Θ_ij = (1/2) h_i·h_j, and α, γ, μ, β are hyper-parameters, all set to 1. The term ξ1 preserves the similarity of semantic features, ξ2 ensures that data instances with the same class label have similar hash codes, ξ3 optimizes the learned hash-code loss, and ξ4 optimizes the label-space loss. Optimizing this objective yields the final model M.
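A hedged sketch of such a four-term objective is shown below; since the patent's equation images are not reproduced here, the exact term forms (negative log-likelihood, code consistency, quantization, and label losses) are assumptions chosen to match the stated roles of the weights.

```python
# Sketch of a lambda-weighted four-term hashing objective. The forms are
# assumptions consistent with the descriptions above, not the patent's exact
# equations.
import torch
import torch.nn.functional as F

def hash_objective(h, B, L_pred, L_true, S, w=(1.0, 1.0, 1.0, 1.0)):
    """h: (n, K) real-valued codes; B: (n, K) common-subspace codes;
    L_pred/L_true: (n, c) labels; S: (n, n) similarity matrix."""
    theta = 0.5 * h @ h.t()                     # Theta_ij = 1/2 h_i . h_j
    j1 = (F.softplus(theta) - S * theta).sum()  # similarity negative log-likelihood
    j2 = ((B - h) ** 2).sum()                   # same-label instances -> similar codes
    j3 = ((h - torch.sign(h)) ** 2).sum()       # quantization (hash-code) loss
    j4 = ((L_pred - L_true) ** 2).sum()         # label-space loss
    return w[0] * j1 + w[1] * j2 + w[2] * j3 + w[3] * j4
```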
Preferably, the retrieval process S2 comprises the following steps:
Step S21): using the network model M trained in S1, extract the hash code vector of the image to be queried and the hash code vector of each text in the candidate library (for the image-to-text retrieval task);
Step S22): compute the similarity between the query image and each text in the candidate library via the Hamming distance dist_H(b_i, b_j) = (K - b_i·b_j) / 2, where b_i and b_j denote the hash code of query image i and of the j-th text datum in the candidate library, respectively, and (·) denotes the inner product;
Step S23): sort in descending order of similarity and return the retrieval results.
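Because the codes are ±1-valued vectors of length K, the Hamming distance can be computed directly from the inner product, as in this sketch.

```python
# Sketch of the ranking step: for +/-1 codes of length K, Hamming distance
# relates to the inner product as dist_H = (K - b_i . b_j) / 2, so sorting by
# descending inner product equals sorting by ascending Hamming distance.
import torch

def rank_candidates(query_code: torch.Tensor, db_codes: torch.Tensor) -> torch.Tensor:
    """query_code: (K,) in {-1, +1}; db_codes: (n, K); returns sorted indices."""
    inner = db_codes @ query_code                 # similarity via inner product
    hamming = 0.5 * (query_code.numel() - inner)  # equivalent Hamming distance
    return torch.argsort(hamming)                 # nearest candidates first

raw = torch.randn(100, 16)
db = torch.where(raw >= 0, torch.ones_like(raw), -torch.ones_like(raw))
order = rank_candidates(db[0], db)  # index 0 ranks first (distance 0 to itself)
```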
The invention has the following beneficial effects:
The invention constructs a label-guided cross-modal deep hash network: it obtains the deep features of each modality through a dedicated deep network per modality, introduces a common subspace (the Hamming space) in which heterogeneous data can be compared, and fully exploits the clean semantics of label information to supervise the learning of the image and text modalities. The trained cross-modal deep hash model achieves high accuracy in both image-to-text and text-to-image retrieval. During retrieval, the trained network model extracts features of the query image (or text) and of the texts (or images) in the candidate library and computes Hamming distances, yielding the similarity between the query and the candidate data and thereby realizing cross-modal retrieval. By mapping the original features into the Hamming space, the method greatly accelerates computation, reduces storage, and improves retrieval precision.
The present invention is described in further detail below with reference to the drawings and embodiments, but the label-guided cross-modal deep hash network of the present invention is not limited to these embodiments.
Drawings
FIG. 1 is a block flow diagram of the method of the present invention.
Detailed Description
The technical scheme of the invention is specifically explained below with reference to the accompanying drawings.
Referring to fig. 1, the invention provides a label-guided cross-modal deep hash method whose model comprises a training process and a retrieval process. Specifically:
the training process S1 includes the following steps:
step S11): image data v of different classesiInputting the image characteristics into an image modal characteristic extraction network to extract image characteristics;
step S12): will be compared with the image data v in S11)iCorresponding text data tiInputting the data into a text modal characteristic extraction network to extract text data characteristics;
step S13): from image data viAnd text data tiExtracting a common subspace characteristic B and a characteristic L of a label space from the category information;
step S14): class label information l labeled on image datai=[li1,...,lic]Inputting the character into label character extracting network to extract character H in Hamming spacelAnd features L of the tag spacel
Step S15): the feature vector B, L, H obtained abovel、LlAnd respectively sending the label space and the common representation space to carry out joint learning, optimizing the label network loss by adopting an error back propagation algorithm to obtain a convergent label network, so as to guide updating of the image text network, and carrying out iteration to form a label-guided cross-mode deep hash network model M.
The steps of the retrieval process S2 are as follows:
step S21): respectively extracting the hash code vector of the image to be inquired and the hash code vector of each text in the candidate library in the image retrieval text task by using the basic network model M obtained by training in the S1;
step S22): by Hamming distance distH=(bi·bj) Calculating the similarity between the feature vector of the image to be queried and the feature vector of each text in the candidate library, bi,bjRespectively representing the hash code of the query image i and the hash code of the jth text data in the candidate library, and (phi) representing inner product operation;
step S23): and performing descending sorting according to the obtained similarity, and returning a retrieval result.
Further, in step S11), the image feature extraction network consists of five convolutional layers, pooling layers, and three fully connected layers, where the last fully connected layer has N = K + c hidden units, i.e., the sum of the hash code length K and the number of image data classes c.
Further, in step S12), the text feature extraction network consists of an MS (multi-scale) model followed by a three-layer feedforward neural network, i.e., T → MS → 4096 → 512 → N overall, where T denotes the input layer of the text network, MS denotes the multi-scale model, 4096 and 512 denote the hidden-unit counts of the first two feedforward layers, and N = K + c is the sum of the hash code length K and the number of text data classes c; the MS model consists of five pooling layers with sizes 1×1, 2×2, 3×3, 5×5, and 10×10.
Further, in step S13), the common-subspace feature B and the label-space feature L are extracted from the training data. B is constructed from the similarity matrix S_ij: S_ij = 1 indicates that O_i and O_j are similar, and S_ij = 0 indicates that they are dissimilar. L takes the form l_i = [l_i1, ..., l_ic]: when data sample O_i belongs to class j, L_ij = 1; otherwise L_ij = 0.
Further, in step S14), the label feature extraction network consists of two four-layer feedforward neural networks with hidden-layer structure L → 4096 → 512 → N, where L denotes the input layer of the label network, 4096 and 512 denote the hidden-unit counts of the first two layers, and N = K + c is the sum of the hash code length K and the number of text data classes c; l_i = [l_i1, ..., l_ic] indicates that data sample O_i belongs to class j (L_ij = 1, otherwise L_ij = 0). For the extracted Hamming-space feature H_l and label-space feature L_l: H_l is generated by the sign function, i.e.

H_l = sign(f_l(l_i; θ_l)),

where the sign function is

sign(x) = 1 if x ≥ 0, and -1 otherwise;

and L_l is generated by applying a sigmoid function at the activation layer, where the sigmoid function is

sigmoid(x) = 1 / (1 + e^(-x)).

Here f_v, f_t, f_l are the hash functions, θ_v, θ_t, θ_l are the network parameters to be learned, and the outputs f(·; θ) are the semantic features learned in the Hamming space.
further, in step S15), the label network is updated first, and the target function formula of the label network is:
Figure BDA0002627760820000065
wherein S isijIs a similarity matrix when SijWhen 1 denotes OiAnd OjSimilarly; when S isijWhen 0 denotes OiAnd OjAre not similar to each other, wherein
Figure BDA0002627760820000066
And HlAnd LlRespectively representing a predicted hash code and a predicted class label, where1To preserve the similarity of semantic features, λ2For ensuring that data instances having the same class label have similar hash codes, λ3Hash code loss, λ, to optimize learning4Is to optimize the loss of label space.
And then guiding the updating of the image text network and an objective function formula in the image and text feature learning process:
Figure BDA0002627760820000071
wherein S isijIs a similarity matrix when SijWhen 1 denotes OiAnd OjSimilarly; when S isijWhen 0 denotes OiAnd OjAre not similar to each other, wherein
Figure BDA0002627760820000072
Alpha, gamma, mu, beta are hyperparameters which all have a value of 1, and HlAnd LlRespectively representing a predicted hash code and a predicted class label, where ξ1To preserve the similarity, ξ, of semantic features2To ensure that data instances with the same class label have similar hash codes, ξ3Hash code loss, ξ, to optimize learning4Is to optimize the loss of label space.
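The alternating scheme (label network first, then label-guided image and text updates) could be organized as in the following sketch; the loader fields, optimizer choices, and reuse of the hash_objective sketch above are illustrative assumptions.

```python
# Sketch of the alternating optimization: the label network is updated via
# back-propagation, and its codes then supervise the image and text networks.
import torch

def train(label_net, image_net, text_net, loader, epochs, K, objective):
    opt_l = torch.optim.SGD(label_net.parameters(), lr=0.01)
    opt_vt = torch.optim.SGD(
        list(image_net.parameters()) + list(text_net.parameters()), lr=0.01)
    for _ in range(epochs):
        for images, texts, labels, S, B in loader:
            # Step 1: optimize the label-network loss (lambda-weighted terms).
            out_l = label_net(labels)
            loss_l = objective(out_l[:, :K], B, torch.sigmoid(out_l[:, K:]), labels, S)
            opt_l.zero_grad(); loss_l.backward(); opt_l.step()
            # Step 2: label-guided update of the image and text networks
            # (xi-weighted terms), using the label network's codes as targets.
            with torch.no_grad():
                target = label_net(labels)[:, :K].sign()
            out_v, out_t = image_net(images), text_net(texts)
            loss_vt = (objective(out_v[:, :K], target, torch.sigmoid(out_v[:, K:]), labels, S)
                       + objective(out_t[:, :K], target, torch.sigmoid(out_t[:, K:]), labels, S))
            opt_vt.zero_grad(); loss_vt.backward(); opt_vt.step()
    return label_net, image_net, text_net
```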
The above is only a preferred embodiment of the present invention. The invention is not limited to this embodiment; any equivalent changes and modifications made according to the present invention that do not depart from its functional effects fall within the scope of protection of the present invention.

Claims (8)

1. A label-guided cross-modal deep hashing method, characterized by comprising a training process and a retrieval process, as follows:
Training process S1: input image-text pairs with the same semantics, together with their class label information, into a label-guided deep hash network model and train until the model converges, thereby obtaining a network model M;
Retrieval process S2: using the network model M trained in S1, extract feature vectors for the image to be queried and for each text in the candidate library, compute the similarity between the query image and each candidate text, sort by similarity, and return the retrieval results.
2. The label-guided cross-modal deep hashing method of claim 1, wherein the training process S1 comprises the following steps:
Step S11): input image data v_i of different classes into the image-modality feature extraction network to extract image features;
Step S12): input the text data t_i corresponding to the image data v_i in S11) into the text-modality feature extraction network to extract text features;
Step S13): construct the common-subspace feature B and the label-space feature L from the category information of the image data v_i and text data t_i;
Step S14): input the class label information l_i = [l_i1, ..., l_ic] annotated on the data into the label feature extraction network to extract the Hamming-space feature H_l and the label-space feature L_l;
Step S15): send the features B, L, H_l, and L_l obtained above into the label space and the common representation space, respectively, for joint learning; optimize the label-network loss with the error back-propagation algorithm until the label network converges, then use the converged label network to guide the update of the image and text networks; iterating this process yields the label-guided cross-modal deep hash network model M.
3. The label-guided cross-modal deep hashing method of claim 2, wherein in step S11) the image feature extraction network consists of five convolutional layers, pooling layers, and three fully connected layers, where the last fully connected layer has N = K + c hidden units, i.e., the sum of the hash code length K and the number of image data classes c.
4. The label-guided cross-modal deep hashing method of claim 2, wherein in step S12) the text feature extraction network consists of an MS (multi-scale) model followed by a three-layer feedforward neural network, i.e., T → MS → 4096 → 512 → N overall, where T denotes the input layer of the text network, MS denotes the multi-scale model, 4096 and 512 denote the hidden-unit counts of the first two feedforward layers, and N = K + c is the sum of the hash code length K and the number of text data classes c; the MS model consists of five pooling layers with sizes 1×1, 2×2, 3×3, 5×5, and 10×10.
5. The label-guided cross-modal deep hashing method of claim 2, wherein in step S13) the common-subspace feature B and the label-space feature L are extracted from the category information of the image data v_i and text data t_i; B is constructed from the similarity matrix S_ij, where S_ij = 1 indicates that O_i and O_j are similar and S_ij = 0 indicates that they are dissimilar; L takes the form l_i = [l_i1, ..., l_ic], where L_ij = 1 when data sample O_i belongs to class j and L_ij = 0 otherwise.
6. The label-guided cross-modal deep hashing method of claim 2, wherein in step S14) the label feature extraction network consists of two four-layer feedforward neural networks with hidden-layer structure L → 4096 → 512 → N, where L denotes the input layer of the label network, 4096 and 512 denote the hidden-unit counts of the first two layers, and N = K + c is the sum of the hash code length K and the number of text data classes c; l_i = [l_i1, ..., l_ic] indicates that data sample O_i belongs to class j (L_ij = 1, otherwise L_ij = 0); and for the extracted Hamming-space feature H_l and label-space feature L_l: H_l is generated by the sign function, i.e.

H_l = sign(f_l(l_i; θ_l)),

where the sign function is

sign(x) = 1 if x ≥ 0, and -1 otherwise;

and L_l is generated by applying a sigmoid function at the activation layer, where the sigmoid function is

sigmoid(x) = 1 / (1 + e^(-x)).

Here f_v, f_t, f_l denote the hash functions and θ_v, θ_t, θ_l the network parameters to be learned; the outputs f(·; θ) are the semantic features learned in the Hamming space.
7. The label-guided cross-modal deep hashing method of claim 2, wherein in step S15) the label network is updated first, and the objective function of the label network has the form

min J^l = λ1·J1 + λ2·J2 + λ3·J3 + λ4·J4,

where S_ij is the similarity matrix (S_ij = 1 indicates that O_i and O_j are similar; S_ij = 0 indicates that they are dissimilar), Θ_ij = (1/2) h_i·h_j, and H_l and L_l denote the predicted hash code and the predicted class label, respectively; λ1 weights the term preserving the similarity of semantic features, λ2 ensures that data instances with the same class label have similar hash codes, λ3 optimizes the learned hash-code loss, and λ4 optimizes the label-space loss;
the converged label network then guides the update of the image-text network, and the objective function for the image and text feature learning has the analogous form

min J^{v,t} = α·ξ1 + γ·ξ2 + μ·ξ3 + β·ξ4,

where S_ij and Θ_ij are as above, α, γ, μ, β are hyper-parameters all set to 1, and H_l and L_l denote the predicted hash code and the predicted class label; ξ1 preserves the similarity of semantic features, ξ2 ensures that data instances with the same class label have similar hash codes, ξ3 optimizes the learned hash-code loss, and ξ4 optimizes the label-space loss; optimizing this objective yields the final model M.
8. The label-guided cross-modal deep hashing method of claim 1, wherein the retrieval process S2 comprises the following steps:
Step S21): using the network model M trained in S1, extract the hash code vector of the image to be queried and the hash code vector of each text in the candidate library (for the image-to-text retrieval task);
Step S22): compute the similarity between the query image and each text in the candidate library via the Hamming distance dist_H(b_i, b_j) = (K - b_i·b_j) / 2, where b_i and b_j denote the hash code of query image i and of the j-th text datum in the candidate library, respectively, and (·) denotes the inner product;
Step S23): sort in descending order of similarity and return the retrieval results.
CN202010802092.0A, filed 2020-08-11 (priority date 2020-08-11): Label-guided cross-modal deep hashing method; published as CN111985548A; status: Withdrawn.

Priority Applications (1)

• CN202010802092.0A: Label-guided cross-modal deep hashing method (CN111985548A)

Applications Claiming Priority (1)

• CN202010802092.0A: Label-guided cross-modal deep hashing method (CN111985548A)

Publications (1)

• CN111985548A, published 2020-11-24

Family

• ID=73433849

Family Applications (1)

• CN202010802092.0A: Label-guided cross-modal deep hashing method (CN111985548A)

Country Status (1)

• CN: CN111985548A (en)


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114239730A (en) * 2021-12-20 2022-03-25 华侨大学 Cross-modal retrieval method based on neighbor sorting relation
CN117237259A (en) * 2023-11-14 2023-12-15 华侨大学 Compressed video quality enhancement method and device based on multi-mode fusion
CN117237259B (en) * 2023-11-14 2024-02-27 华侨大学 Compressed video quality enhancement method and device based on multi-mode fusion
CN117975342A (en) * 2024-03-28 2024-05-03 江西尚通科技发展有限公司 Semi-supervised multi-mode emotion analysis method, system, storage medium and computer
CN117975342B (en) * 2024-03-28 2024-06-11 江西尚通科技发展有限公司 Semi-supervised multi-mode emotion analysis method, system, storage medium and computer


Legal Events

• PB01: Publication
• SE01: Entry into force of request for substantive examination
• WW01: Invention patent application withdrawn after publication (application publication date: 2020-11-24)