CN107402993B - Cross-modal retrieval method based on discriminative correlation maximization hashing - Google Patents

Publication number
CN107402993B
Authority
CN
China
Prior art keywords
hash
text
data
image
objective function
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201710581083.1A
Other languages
Chinese (zh)
Other versions
CN107402993A (en)
Inventor
张化祥
卢旭
万文博
刘丽
郭培莲
任玉伟
孙建德
王强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Normal University
Original Assignee
Shandong Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Normal University
Priority to CN201710581083.1A
Publication of CN107402993A
Application granted
Publication of CN107402993B
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention proposes a cross-modal retrieval method based on discriminative correlation maximization hashing, comprising: performing multi-modal feature extraction on a training data set to obtain a training multi-modal data set; constructing, for the training multi-modal data set, an objective function based on discriminative correlation maximization hashing; solving the objective function to obtain the projection matrices that project images and texts into a common Hamming space, together with the joint hash codes of the image-text pairs; projecting the test data set into the common Hamming space and quantizing it into hash codes with the hash functions; and performing cross-modal retrieval based on the hash codes. The invention improves the efficiency and accuracy of cross-media retrieval.

Description

Cross-modal retrieval method based on discriminative correlation maximization hashing
Technical field
The present invention relates to the field of data retrieval, and in particular to a cross-modal retrieval method based on discriminative correlation maximization hashing.
Background art
With the development of science and technology, large amounts of multi-modal data have poured onto the Internet, and a range of information retrieval techniques has emerged to retrieve useful information from it. Traditional information retrieval is single-modal: the query and the retrieved results belong to the same modality. This is a significant limitation, so it is desirable to extend single-modal retrieval to cross-modal retrieval, that is, given a picture, to retrieve text descriptions related to that picture, and vice versa.
Because data of different modalities have different characteristics, it is difficult to measure the similarity between them directly, which is the main challenge for cross-modal methods. The most common way to address this problem is subspace learning. Canonical correlation analysis (CCA) is a general unsupervised subspace learning method: it projects data of different modalities into the same space while maximizing the correlation between the two modalities. Whereas CCA aims to maximize the correlation between the data of two modalities, partial least squares (PLS) approaches cross-media retrieval from the perspective of covariance. Generalized multiview analysis (GMA) uses class labels as supervision information and extends CCA to the supervised setting.
The cross-media retrieval methods above generally consume a large amount of time and storage space when handling large-scale data. Hashing methods emerged to solve this problem. In a hashing method, data are represented by binary hash codes, and measuring the similarity between different data only requires bitwise XOR operations on their hash codes in Hamming space. Hashing therefore reduces computational complexity and uses less storage. Hashing-based cross-modal methods usually project the data of different modalities into a common Hamming space and obtain the hash codes of the different modalities in that space, so that the similarity between data of different modalities can be measured directly. Hashing-based cross-modal retrieval has already been applied effectively: collective matrix factorization hashing (CMFH) learns one unified hash code for multi-modal data and uses it to measure similarity in a common semantic space; latent semantic sparse hashing (LSSH) obtains the high-level semantic information of the two modalities using sparse coding and matrix factorization, respectively, and then performs cross-media retrieval by hashing.
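The XOR-based comparison described above can be illustrated with a minimal sketch. It assumes hash codes are stored as 0/1 bits packed into bytes; the helper names `pack_codes` and `hamming_distance` are hypothetical and not part of the patent.

```python
import numpy as np

def pack_codes(bits: np.ndarray) -> np.ndarray:
    """Pack an (n, L) array of 0/1 bits into bytes, one row per hash code."""
    return np.packbits(bits.astype(np.uint8), axis=1)

def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    """Count the differing bits of two packed codes: XOR, then count the 1 bits."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())

codes = pack_codes(np.array([[1, 0, 1, 1, 0, 0, 1, 0],
                             [1, 1, 1, 0, 0, 0, 1, 0]]))
print(hamming_distance(codes[0], codes[1]))  # 2 (bits 1 and 3 differ)
```

Packing eight bits per byte is what gives hashing its storage advantage: a 64-bit code occupies 8 bytes regardless of the dimensionality of the original features.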
Although there are many hashing-based cross-media retrieval methods, existing methods do not take the discriminative distribution of the data features into account. A discriminative distribution (features of the same class are as close as possible, and features of different classes are as far apart as possible) can make cross-media retrieval more accurate. Therefore, how to keep the discriminative distributions of images and texts while projecting them into a semantic space, so as to improve retrieval precision, is a technical problem that those skilled in the art still need to solve.
Summary of the invention
To solve the above problem, the present invention proposes a cross-modal retrieval method based on discriminative correlation maximization hashing. After the data features of the text and image modalities are projected into a common Hamming space, the method still preserves the discriminative distribution of each modality and maximizes the correlation between paired multi-modal data, thereby improving the accuracy of cross-modal retrieval.
The specific technical solution of the present invention is as follows:
A cross-modal retrieval method based on discriminative correlation maximization hashing, comprising the following steps:
Step 1: Obtain a training data set, in which each sample contains paired data of two modalities, image and text;
Step 2: Perform multi-modal feature extraction on the training data set to obtain the training multi-modal data set O_train;
Step 3: For the training multi-modal data set O_train, construct the objective function based on discriminative correlation maximization hashing on that data set;
Step 4: Solve the objective function to obtain the projection matrices W_1 and W_2 that project images and texts into the common Hamming space, and the joint hash codes B of the image-text pairs; the joint hash code B is used as the hash code of each image-text pair;
Step 5: Obtain a test data set and perform multi-modal feature extraction on it to obtain the test multi-modal data set O_test;
Step 6: For the test multi-modal data set O_test, use the projection matrices W_1 and W_2 obtained in step 4 to project the image or text of each test sample into the common Hamming space, and quantize it into a hash code with the hash function;
Step 7: Perform cross-modal retrieval: based on the hash codes, retrieve from the training data set the objects of the other modality that are related to the query sample in the test set;
The objective function in step 3 is:
where V and T are the data feature matrices of the images and the texts, respectively, and Y is the label matrix; λ, μ_1, μ_2, β and α are balance parameters, and γ is the regularization parameter.
Further, step 3 comprises:
Step 3-1: Let each data sample o_i of the training multi-modal data subset O_train consist of the image feature vector v_i, the text feature vector t_i and the class label y_i ∈ {0,1}^c, where N is the number of samples; project the data of the two modalities from their original heterogeneous spaces into a common Hamming space, and maximize the correlation between the paired image and text within a sample:
Step 3-2: Apply linear discriminant analysis to the text-modality data and transfer the resulting discriminative property to the image-modality data:
Step 3-3: Convert the data features of the two modalities into hash codes, and minimize the quantization loss incurred when the hash functions produce the hash codes:
Step 3-4: Add the class labels as supervision information to classify the hash codes:
Step 3-5: Add a regularization term to prevent over-fitting, defined as:
Step 3-6: Combine steps 3-1 to 3-5 to obtain the objective function.
Further, the objective function in step 4 is solved as follows:
Step 4-1: Fix the other variables in the objective function and solve for the image-modality projection matrix W_1;
Step 4-2: Fix the other variables in the objective function and solve for the text-modality projection matrix W_2;
Step 4-3: Fix the other variables in the objective function and solve for the joint hash codes B;
Step 4-4: Fix the other variables in the objective function and solve for the classifier matrix Q.
Further, the retrieval method also comprises: evaluating the retrieval accuracy according to the class labels carried by the multi-modal data set.
According to another aspect of the present invention, a method for constructing an objective function for cross-modal retrieval is also provided, comprising:
Step 1: Obtain a training data set, in which each sample contains paired data of two modalities, image and text; perform multi-modal feature extraction on the training data set to obtain the training multi-modal data set O_train;
Step 2: Project the data of the two modalities from their original heterogeneous spaces into a common Hamming space, and maximize the correlation between the paired image and text within a sample;
Step 3: Apply linear discriminant analysis to the text-modality data and transfer the resulting discriminative property to the image-modality data;
Step 4: Convert the data features of the two modalities into hash codes, and minimize the quantization loss incurred when the hash functions produce the hash codes;
Step 5: Add the class labels as supervision information;
Step 6: Add a regularization term to prevent over-fitting;
Step 7: Combine steps 2 to 6 to obtain the objective function based on discriminative correlation maximization hashing.
Further, maximizing the correlation between the paired image and text within a sample in step 2 is defined as:
where V and T are the data feature matrices of the images and the texts, respectively, and W_1 and W_2 are the projection matrices that project images and texts, respectively, into the common Hamming space.
Further, step 3 comprises: applying linear discriminant analysis to the text-modality data to obtain the within-class similarity matrix S_w and the between-class similarity matrix S_b, and transferring this property to the image-modality data, defined as:
Further, in step 4, minimizing the quantization loss incurred when the hash functions produce the hash codes is defined as:
where B is the joint hash code.
Further, the class-label supervision in step 5 is defined as:
where Q is the classifier matrix.
Further, the regularization term in step 6 is defined as:
The beneficial effects of the present invention are as follows:
In hashing-based cross-media retrieval, the present invention makes full use of the discriminative distribution of the data features: it applies linear discriminant analysis to the text modality and transfers the resulting property to the image modality. It also keeps the correlation between the multi-modal data of the same sample maximized after projection into the common Hamming space. Together these make the feature distribution in Hamming space more discriminative, so that the hash codes quantized from the features are easier to classify, which improves cross-media retrieval performance; at the same time, hashing reduces the time and space cost of cross-modal retrieval.
Description of the drawings
The accompanying drawings, which form a part of this application, are provided for further understanding of the application; the illustrative embodiments of the application and their descriptions serve to explain the application and do not unduly limit it.
Fig. 1 is the overall flow chart of cross-media retrieval based on discriminative correlation maximization hashing;
Fig. 2 is a schematic diagram of the construction of the objective function based on discriminative correlation maximization hashing;
Fig. 3 is a schematic diagram of solving the objective function.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings and embodiments.
It should be noted that the following detailed description is illustrative and intended to provide further explanation of the application. Unless otherwise indicated, all technical and scientific terms used herein have the meanings commonly understood by a person of ordinary skill in the art to which this application belongs.
It should also be noted that the terminology used here is only for describing specific embodiments and is not intended to limit the exemplary embodiments of the application. As used herein, unless the context clearly indicates otherwise, the singular forms are intended to include the plural forms as well; furthermore, the terms "comprising" and/or "including", when used in this specification, indicate the presence of features, steps, operations, devices, components and/or combinations thereof.
Embodiment one
This embodiment provides a cross-modal retrieval method based on discriminative correlation maximization hashing which, as shown in Fig. 1, comprises the following steps:
Step 1: Obtain a training data set, in which each sample contains paired data of two modalities, image and text;
Step 2: Perform multi-modal feature extraction on the training data set to obtain the training multi-modal data set O_train;
Step 3: For the training multi-modal data set O_train, construct the objective function based on discriminative correlation maximization hashing on that data set;
Step 4: Solve the objective function to obtain the projection matrices W_1 and W_2 that project images and texts into the common Hamming space, the joint hash codes B of the image-text pairs and the classifier matrix Q; the joint hash code B is used as the hash code of each image-text pair;
Step 5: Obtain a test data set and perform multi-modal feature extraction on it to obtain the test multi-modal data set O_test;
Step 6: For the test data set O_test, use the projection matrices W_1 and W_2 obtained in step 4 to project the image or text of each test sample into the common Hamming space; with the learned hash functions f(V) = sgn(W_1 V) and g(T) = sgn(W_2 T), the hash codes of the images and texts in the test data set can be computed directly;
Step 7: Perform cross-modal retrieval: based on the hash codes, retrieve from the training data set the objects of the other modality that are related to the query sample in the test set.
The objective function based on discriminative correlation maximization hashing (as shown in Fig. 2) is:
where V and T are the data feature matrices of the images and the texts, respectively, and Y is the label matrix; λ, μ_1, μ_2, β and α are balance parameters, and γ is the regularization parameter.
The objective function based on discriminative correlation maximization hashing is constructed as follows:
Step 1: Obtain the multi-modal data set O, which comprises the training multi-modal data subset O_train and the test multi-modal data subset O_test.
Each data sample o_i consists of an image feature vector v_i, a text feature vector t_i and a class label y_i ∈ {0,1}^c, where N is the number of samples. Each data sample contains one image-text pair; the image and the text differ in their physical characteristics but share the same semantic meaning and belong to the same class.
We assume here that each sample belongs to one of c classes. V and T are then the data feature matrices of the images and the texts, respectively, and Y is the label matrix: if the image and text features v_i and t_i of a sample o_i belong to the j-th class, then the j-th element of y_i is 1 and the remaining elements are 0.
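As a minimal sketch of the label matrix just described, assuming class labels are given as integer indices 0 to c-1 and that Y stores one sample per column (shape c × N, which matches the dimensions implied by the classifier relation between Y, Q and B); the helper name `one_hot_labels` is hypothetical:

```python
import numpy as np

def one_hot_labels(class_ids, num_classes):
    """Return Y of shape (c, N): column i holds a single 1, in row class_ids[i]."""
    class_ids = np.asarray(class_ids)
    Y = np.zeros((num_classes, class_ids.size), dtype=int)
    Y[class_ids, np.arange(class_ids.size)] = 1
    return Y

# Four samples belonging to classes 2, 0, 1 and 2 out of c = 3 classes.
Y = one_hot_labels([2, 0, 1, 2], num_classes=3)
print(Y)
```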
Step 2: Project the data features from the original heterogeneous spaces into a common Hamming space.
Step 2-1: For each sample o_i in O_train, set the hash functions f(V) = sgn(W_1 V) and g(T) = sgn(W_2 T) for the image and text modalities, which project the data of the two modalities from their original heterogeneous spaces into a common Hamming space.
The hash functions of the two modalities are denoted f and g. Using a common representation of hash functions, they take the specific form f(V) = sgn(W_1 V) and g(T) = sgn(W_2 T), where sgn(·) is the sign function, which discretizes continuous data into binary hash codes, and W_1 and W_2 are the projection matrices of the two modalities.
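The two hash functions can be sketched as follows. This is an illustration under stated assumptions: features are stored one sample per column, the matrix sizes and random projection matrices are hypothetical stand-ins for learned ones, and sgn(0) is mapped to +1 so that every bit is in {-1, +1}.

```python
import numpy as np

def hash_codes(W, X):
    """Quantize projected features as sgn(W X); zeros are mapped to +1 (an assumption)."""
    return np.where(W @ X >= 0, 1, -1)

rng = np.random.default_rng(0)
L, d_img, d_txt, N = 16, 128, 10, 5     # hypothetical code length and feature sizes
W1 = rng.standard_normal((L, d_img))    # stand-in for the learned image projection
W2 = rng.standard_normal((L, d_txt))    # stand-in for the learned text projection
V = rng.standard_normal((d_img, N))     # image features, one column per sample
T = rng.standard_normal((d_txt, N))     # text features, one column per sample

img_codes = hash_codes(W1, V)           # f(V) = sgn(W1 V), shape (L, N)
txt_codes = hash_codes(W2, T)           # g(T) = sgn(W2 T), shape (L, N)
```

Because both modalities are mapped to codes of the same length L, an image code and a text code can be compared directly even though the original feature dimensions differ.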
Step 2-2: Because the image and the text in a sample are paired in their original state, the correlation between the paired image and text should be maximized in the Hamming space after projection, defined as follows:
where W_1 and W_2 are the projection matrices of the image and the text.
Step 2-3: To preserve the discriminative property of the data, we introduce linear discriminant analysis (LDA) to process the text-modality data and transfer the resulting property to the image modality, defined as follows:
where S_w is the within-class similarity matrix and S_b is the between-class similarity matrix.
Linear discriminant analysis (LDA) projects data from a high-dimensional space into an optimally discriminative space in which, after projection, the distance between data of different classes is as large as possible and the distance between data of the same class is as small as possible. We apply LDA to the text-modality data, which makes the distribution of the text modality projected into the common Hamming space discriminative, and transfer this property to the image modality through S_w and S_b, defined as:
where tr(·) is the trace of a matrix. The formula is equivalent to:
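For orientation, the textbook LDA within-class and between-class scatter matrices can be computed as in the sketch below. This is an assumption-laden illustration: the exact definitions of S_w and S_b used by the patent are not reproduced in this text and may differ from the standard scatter matrices shown here; the function name `lda_scatter` is hypothetical.

```python
import numpy as np

def lda_scatter(T, class_ids):
    """Standard LDA scatter matrices for features T of shape (d, N), one sample per column."""
    class_ids = np.asarray(class_ids)
    d = T.shape[0]
    mean_all = T.mean(axis=1, keepdims=True)
    Sw = np.zeros((d, d))   # within-class scatter
    Sb = np.zeros((d, d))   # between-class scatter
    for c in np.unique(class_ids):
        Tc = T[:, class_ids == c]
        mean_c = Tc.mean(axis=1, keepdims=True)
        centered = Tc - mean_c
        Sw += centered @ centered.T
        diff = mean_c - mean_all
        Sb += Tc.shape[1] * (diff @ diff.T)
    return Sw, Sb

rng = np.random.default_rng(0)
T_demo = rng.standard_normal((4, 12))      # hypothetical text features
ids_demo = np.array([0, 1, 2] * 4)         # hypothetical class indices
Sw, Sb = lda_scatter(T_demo, ids_demo)
```

A projection is discriminative in the LDA sense when it keeps the projected within-class scatter small relative to the projected between-class scatter.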
Step 3: Using the hash functions defined in step 2-1, quantize the image and text features projected into the common Hamming space into hash codes.
Because a sample consists of an image-text pair sharing the same semantic meaning, we introduce an auxiliary variable, the joint hash code B of the two modalities; that is, the image and the text in a sample use the same hash code. The quantization loss incurred in generating the hash codes should be as small as possible, defined as follows:
Step 4: Add the class labels as supervision information so that the learned joint hash codes can readily be used for classification. Specifically, the learned hash codes are B and the semantic information added to them is Y; because the matrix dimensions of Y and B do not match, a classifier matrix Q is introduced to map between them, defined as:
Step 5: To prevent over-fitting, add a regularization term that constrains the projection matrices, defined as:
Combining the five steps above gives the complete objective function:
where λ, μ_1, μ_2, β and α are balance parameters, and γ is the regularization parameter (used to prevent over-fitting).
Our goal is to obtain the projection matrices W_1 and W_2 and the joint hash codes B by solving the objective function above. Because the objective function contains several unknown variables, it cannot be solved directly. The present invention therefore proposes an iterative algorithm that solves for one variable at a time with the others fixed, eventually obtaining the optimal solution. In addition, to simplify the computation, we relax the discrete constraint B ∈ {-1,1}^{L×N} on the joint hash codes to the continuous constraint 0 ≤ B ≤ 1.
Based on the objective function for discriminative correlation maximization hashing, we propose an iterative algorithm (as shown in Fig. 3) to solve for the required projection matrices W_1 and W_2, the joint hash codes B and the classifier matrix Q.
Step 1: Fix the other variables W_2, Q and B in the objective function and solve for the projection matrix W_1. The objective function becomes:
Taking the partial derivative with respect to W_1 and setting it to 0 gives the solution for W_1:
W_1 = (μ_1 B V^T + λ W_2 T V^T)(μ_1 V V^T + λ V V^T + γI)^{-1}.
Step 2: Fix the other variables W_1, Q and B and solve for the projection matrix W_2. The objective function becomes:
Taking the partial derivative with respect to W_2 and setting it to 0 gives the solution for W_2:
Step 3: Fix the other variables W_1, W_2 and Q and solve for the joint hash codes B. The objective function becomes:
Taking the partial derivative with respect to B and setting it to 0 gives the solution for B:
B = (α Q^T Q + (μ_1 + μ_2) I)^{-1} (α Q^T Y + μ_1 W_1 V + μ_2 W_2 T).
Step 4: Fix the other variables W_1, W_2 and B and solve for the classifier matrix Q. The objective function becomes:
Taking the partial derivative with respect to Q and setting it to 0 gives the solution for Q:
Q = (α Y B^T)(α B B^T + γI)^{-1}.
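The closed-form updates for W_1, B and Q stated above can be sketched in NumPy as follows. This is an illustration rather than the patent's implementation: it assumes the trailing factor of the W_1 update is inverted (as in the B and Q updates), it omits the W_2 update because that formula is not reproduced in this text, and it does not enforce any constraint on B. All sizes and parameter values are hypothetical.

```python
import numpy as np

def update_W1(B, V, W2, T, lam, mu1, gamma):
    """W1 = (mu1 B V^T + lam W2 T V^T)(mu1 V V^T + lam V V^T + gamma I)^{-1}."""
    rhs = mu1 * B @ V.T + lam * W2 @ T @ V.T
    gram = (mu1 + lam) * V @ V.T + gamma * np.eye(V.shape[0])
    # X @ gram = rhs is solved as gram.T @ X.T = rhs.T, avoiding an explicit inverse.
    return np.linalg.solve(gram.T, rhs.T).T

def update_B(Q, Y, W1, V, W2, T, alpha, mu1, mu2):
    """B = (alpha Q^T Q + (mu1 + mu2) I)^{-1}(alpha Q^T Y + mu1 W1 V + mu2 W2 T)."""
    lhs = alpha * Q.T @ Q + (mu1 + mu2) * np.eye(Q.shape[1])
    rhs = alpha * Q.T @ Y + mu1 * W1 @ V + mu2 * W2 @ T
    return np.linalg.solve(lhs, rhs)

def update_Q(Y, B, alpha, gamma):
    """Q = (alpha Y B^T)(alpha B B^T + gamma I)^{-1}."""
    gram = alpha * B @ B.T + gamma * np.eye(B.shape[0])
    return np.linalg.solve(gram.T, (alpha * Y @ B.T).T).T

# Hypothetical sizes: code length L, feature sizes d1 and d2, c classes, N samples.
rng = np.random.default_rng(0)
L, d1, d2, c, N = 8, 20, 10, 3, 50
V, T = rng.standard_normal((d1, N)), rng.standard_normal((d2, N))
Y = np.eye(c)[:, rng.integers(0, c, N)]   # one-hot label matrix, shape (c, N)
W2 = rng.standard_normal((L, d2))         # held fixed in this sketch
B = rng.standard_normal((L, N))           # relaxed (continuous) codes

Q = update_Q(Y, B, alpha=1.0, gamma=0.1)
W1 = update_W1(B, V, W2, T, lam=1.0, mu1=1.0, gamma=0.1)
B = update_B(Q, Y, W1, V, W2, T, alpha=1.0, mu1=1.0, mu2=1.0)
```

In a full solver these updates, together with the W_2 update, would be repeated in turn until the objective value stops changing appreciably; `np.linalg.solve` is used in place of an explicit matrix inverse for numerical stability.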
Finally, we use the joint hash codes B as the hash codes of the training samples; for a new test sample, we obtain its hash code by quantizing it with the hash functions. Cross-media retrieval is then performed by comparing the similarity between hash codes.
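The retrieval step can be sketched as a ranking of database codes by Hamming distance to the query code. The sketch assumes codes in {-1, +1} stored one per column and uses the identity that, for such codes of length L, the Hamming distance equals (L minus the inner product) divided by 2; the function name `rank_by_hamming` and the toy codes are hypothetical.

```python
import numpy as np

def rank_by_hamming(query, database):
    """Rank database codes (L, N) by Hamming distance to a query code (L,), nearest first."""
    L = query.shape[0]
    dist = (L - query @ database) // 2   # valid for codes with entries in {-1, +1}
    return np.argsort(dist, kind="stable"), dist

# Toy database of three 4-bit codes (columns) from the other modality.
database = np.array([[ 1,  1, -1],
                     [ 1, -1, -1],
                     [ 1,  1,  1],
                     [-1,  1,  1]])
query = np.array([1, 1, 1, -1])
order, dist = rank_by_hamming(query, database)
print(order, dist)  # [0 1 2] [0 2 3]
```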
The retrieval method further comprises evaluating the retrieval accuracy according to the class labels carried by the multi-modal data set. Here we use the common mean average precision (MAP) to evaluate the retrieval accuracy of the method. Given a set of query samples, the average precision (AP) of each query sample is defined in terms of the total number of samples in the retrieved set; P(r), the ratio of the number of relevant samples to the number of all samples retrieved up to rank r; and δ(r), which equals 1 if the r-th retrieved sample is relevant to the query sample and 0 otherwise. MAP is the mean of the AP values over all query samples.
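A minimal sketch of the evaluation follows. Because the AP formula itself is not reproduced in this text, the sketch assumes the form commonly used in the hashing literature: the sum of P(r)·δ(r) over the ranked list, divided by the number of relevant samples retrieved. The function names are hypothetical.

```python
import numpy as np

def average_precision(relevant):
    """AP for one query; `relevant` is the 0/1 sequence delta(r) over the ranked results."""
    rel = np.asarray(relevant, dtype=float)
    if rel.sum() == 0:
        return 0.0                                  # no relevant result retrieved
    ranks = np.arange(1, rel.size + 1)
    precision_at_r = np.cumsum(rel) / ranks         # P(r)
    # Normalizing by the number of relevant results is an assumption of this sketch.
    return float((precision_at_r * rel).sum() / rel.sum())

def mean_average_precision(relevance_lists):
    """MAP: the mean of the AP values over all queries."""
    return float(np.mean([average_precision(r) for r in relevance_lists]))

ap = average_precision([1, 0, 1, 0])                # (1/1 + 2/3) / 2
map_value = mean_average_precision([[1, 0, 1, 0], [0, 0, 0, 0]])
```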
Embodiment two
Based on the cross-modal retrieval method based on discriminative correlation maximization hashing described above, this embodiment provides the corresponding objective function construction method which, as shown in Fig. 2, comprises:
Step 1: Obtain a training data set, in which each sample contains paired data of two modalities, image and text; perform multi-modal feature extraction on the training data set to obtain the training multi-modal data set O_train;
Step 2: Project the data of the two modalities from their original heterogeneous spaces into a common Hamming space, and maximize the correlation between the paired image and text within a sample;
Step 3: Apply linear discriminant analysis to the text-modality data and transfer the resulting discriminative property to the image-modality data;
Step 4: Convert the data features of the two modalities into hash codes, and minimize the quantization loss incurred when the hash functions produce the hash codes;
Step 5: Add the class labels as supervision information;
Step 6: Add a regularization term to prevent over-fitting;
Step 7: Combine steps 2 to 6 to obtain the objective function based on discriminative correlation maximization hashing.
Maximizing the correlation between the paired image and text within a sample in step 2 is defined as:
where V and T are the data feature matrices of the images and the texts, respectively, and W_1 and W_2 are the projection matrices that project images and texts, respectively, into the common Hamming space.
Step 3 comprises: applying linear discriminant analysis to the text-modality data to obtain the within-class similarity matrix S_w and the between-class similarity matrix S_b, and transferring this property to the image-modality data, defined as:
In step 4, minimizing the quantization loss incurred when the hash functions produce the hash codes is defined as:
where B is the joint hash code.
The class-label supervision in step 5 is defined as:
where Q is the classifier matrix.
The regularization term in step 6 is defined as:
Experimental results:
The method was validated on the image-text data of the Wiki image-text data set; the retrieval accuracy is shown in Table 1.
Table 1. Comparison of the retrieval accuracy (MAP) of 6 cross-media retrieval methods (image-to-text and text-to-image) on the Wiki data set
As can be seen, the method of the invention learns a separate hash function for each of the text and image modalities, projects the original data features into a common Hamming space, and applies linear discriminant analysis (LDA) to the text-modality data so that the projected text features remain discriminative; this property is also transferred to the image modality. In the common Hamming space, the data features are converted into hash codes, and the class-label information makes the hash codes easy to classify. These operations yield good cross-media retrieval results, while hashing reduces the time and space cost of cross-modal retrieval.
The above are only preferred embodiments of the application and are not intended to limit it; for those skilled in the art, the application may have various modifications and variations. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the application shall fall within its scope of protection.
Although specific embodiments of the present invention have been described above with reference to the drawings, they do not limit the scope of protection of the invention; those skilled in the art should understand that various modifications or variations that can be made without creative effort on the basis of the technical solution of the invention remain within its scope of protection.

Claims (10)

1. A cross-modal retrieval method based on discriminative correlation maximization hashing, characterized by comprising the following steps:
Step 1: Obtain a training data set, in which each sample contains paired data of two modalities, image and text;
Step 2: Perform multi-modal feature extraction on the training data set to obtain the training multi-modal data set O_train;
Step 3: For the training multi-modal data set O_train, construct the objective function based on discriminative correlation maximization hashing on that data set;
Step 4: Solve the objective function to obtain the projection matrices W_1 and W_2 that project images and texts into the common Hamming space, and the hash codes B of the image-text pairs;
Step 5: Obtain a test data set and perform multi-modal feature extraction on it to obtain the test multi-modal data set O_test;
Step 6: For the test multi-modal data set O_test, use the projection matrices W_1 and W_2 obtained in step 4 to project the image or text of each test sample into the common Hamming space, and quantize it into a hash code with the hash function;
Step 7: Perform cross-modal retrieval: based on the hash codes, retrieve from the training data set the objects of the other modality that are related to the query sample in the test set;
The objective function in step 3 is:
s.t. B ∈ {-1,1}^{L×N}, W_1 W_1^T = I_k,
where V and T are the data feature matrices of the images and the texts, respectively, and Y is the label matrix; λ, μ_1, μ_2, β and α are balance parameters, γ is the regularization parameter, S_w is the within-class similarity matrix, S_b is the between-class similarity matrix, Q is the classifier matrix, N is the number of samples, and c is the number of classes.
2. The cross-modal retrieval method based on discriminative correlation maximization hashing according to claim 1, characterized in that step 3 comprises:
Step 3-1: Let each data sample o_i of the training multi-modal data subset O_train consist of the image feature vector v_i, the text feature vector t_i and the class label y_i ∈ {0,1}^c, where N is the number of samples; project the data of the two modalities from their original heterogeneous spaces into a common Hamming space, and maximize the correlation between the paired image and text within a sample:
s.t. W_1 W_1^T = I_k,
Step 3-2: Apply linear discriminant analysis to the text-modality data and transfer the resulting discriminative property to the image-modality data:
Step 3-3: Convert the data features of the two modalities into hash codes, and minimize the quantization loss incurred when the hash functions produce the hash codes:
s.t. B ∈ {-1,1}^L, W_1 W_1^T = I_k,
Step 3-4: Add the class labels as supervision information to classify the hash codes:
s.t. B ∈ {-1,1}^L
Step 3-5: Add a regularization term to prevent over-fitting, defined as:
Step 3-6: Combine steps 3-1 to 3-5 to obtain the objective function.
3. The cross-modal retrieval method based on discriminative correlation maximization hashing according to claim 2, characterized in that the objective function in step 4 is solved as follows:
Step 4-1: Fix the other variables in the objective function and solve for the image-modality projection matrix $W_1$;
Step 4-2: Fix the other variables in the objective function and solve for the text-modality projection matrix $W_2$;
Step 4-3: Fix the other variables in the objective function and solve for the joint hash codes $B$;
Step 4-4: Fix the other variables in the objective function and solve for the classifier matrix $Q$.
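The "fix the others, solve one" pattern of steps 4-1 to 4-4 is an alternating minimization. The sketch below shows that pattern only, on a deliberately tiny stand-in problem: a scalar weight `w` and binary codes `b` minimizing the sum of (b_i - w x_i)^2. It is not the patent's objective, which has four variable blocks ($W_1$, $W_2$, $B$, $Q$) and whose formula is not reproduced in this text; all names here are hypothetical.

```python
def sign(v):
    """Sign with zero mapped to +1 (an assumption for tie-breaking)."""
    return 1 if v >= 0 else -1


def loss(w, b, x):
    """Toy quantization loss: sum of (b_i - w * x_i)^2."""
    return sum((b_i - w * x_i) ** 2 for b_i, x_i in zip(b, x))


def alternate(x, w0, n_iters=5):
    """Alternate between a closed-form update of w and a sign update of b.

    Requires at least one non-zero entry in x so the w update is defined.
    """
    w = w0
    b = [sign(w * x_i) for x_i in x]
    history = [loss(w, b, x)]
    for _ in range(n_iters):
        # Fix b, solve for w: ordinary least squares in one dimension.
        w = sum(b_i * x_i for b_i, x_i in zip(b, x)) / sum(x_i * x_i for x_i in x)
        history.append(loss(w, b, x))
        # Fix w, solve for b: each bit independently takes the sign of w * x_i.
        b = [sign(w * x_i) for x_i in x]
        history.append(loss(w, b, x))
    return w, b, history


w, b, history = alternate([1.0, -2.0, 0.5], w0=0.1)
```

Because each update minimizes the loss exactly over one block while the other is held fixed, the recorded loss values never increase, which is the property that makes this kind of scheme usable as a solver.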
4. The cross-modal retrieval method based on discriminative correlation maximization hashing according to claim 1, characterized in that the retrieval method further comprises: evaluating the retrieval accuracy according to the category labels carried by the multi-modal data set.
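One common way to evaluate retrieval accuracy from category labels is mean average precision (mAP). The claim does not name a metric, so the choice of mAP is an assumption, as is the relevance rule used here (a retrieved object is relevant when its single label equals the query's label). The function names are hypothetical.

```python
def average_precision(ranked_labels, query_label):
    """Average precision of one ranked list under same-label relevance."""
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant position
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings, query_labels):
    """Mean of the per-query average precisions."""
    scores = [average_precision(r, q) for r, q in zip(rankings, query_labels)]
    return sum(scores) / len(scores)


# Labels of the retrieved objects, in ranked order, for two queries.
rankings = [["cat", "dog", "cat"], ["dog", "cat"]]
query_labels = ["cat", "cat"]
score = mean_average_precision(rankings, query_labels)
```

For multi-label data, the equality test would typically be replaced by a check that the two label vectors share at least one category.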
5. A method for constructing an objective function for cross-modal retrieval, characterized by comprising:
Step 1: Obtain a training data set in which each sample contains paired data of two modalities, an image and a text; perform multi-modal extraction on the training data set to obtain the training multi-modal data set $O_{train}$;
Step 2: Project the data of the two modalities from their original heterogeneous spaces into a common Hamming space, and maximize the correlation between the paired image and text within each sample;
Step 3: Apply linear discriminant analysis to the text-modality data and transfer its characteristics to the image-modality data;
Step 4: Convert the data features of the two modalities into hash codes, minimizing the quantization loss incurred when the hash function produces the hash codes;
Step 5: Add the category labels as supervision information;
Step 6: Add a regularization term to prevent over-fitting;
Step 7: Combine steps 2 to 6 to obtain the objective function based on discriminative correlation maximization hashing.
6. The method for constructing an objective function for cross-modal retrieval according to claim 5, characterized in that maximizing the correlation between the paired image and text within a sample in step 2 is defined as:
s.t. $W_1 W_1^{T} = I_k,$
where $V$ and $T$ are the data feature matrices of the image and the text, respectively, and $W_1$ and $W_2$ are the projection matrices that project the image and the text, respectively, into the common Hamming space.
7. The method for constructing an objective function for cross-modal retrieval according to claim 6, characterized in that step 3 comprises: applying linear discriminant analysis to the text-modality data to obtain the within-class similarity matrix $S_w$ and the between-class similarity matrix $S_b$, and transferring this characteristic to the image-modality data, defined as:
where $S_w$ is the within-class similarity matrix and $S_b$ is the between-class similarity matrix.
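The two matrices named in this claim can be illustrated with the standard within-class and between-class scatter matrices of linear discriminant analysis. Whether the patent's $S_w$ and $S_b$ follow exactly these textbook definitions is an assumption, since the defining formula is not reproduced in this text; the function names and toy data are hypothetical.

```python
def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def outer(u, v):
    """Outer product u v^T as a list of rows."""
    return [[a * b for b in v] for a in u]


def add(A, B):
    """Element-wise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scatter_matrices(samples, labels):
    """Return (S_w, S_b) for text feature vectors and their class labels."""
    d = len(samples[0])
    S_w = [[0.0] * d for _ in range(d)]
    S_b = [[0.0] * d for _ in range(d)]
    overall = mean(samples)
    for c in sorted(set(labels)):
        members = [x for x, y in zip(samples, labels) if y == c]
        mu = mean(members)
        # Within-class: spread of each member around its own class mean.
        for x in members:
            diff = [a - b for a, b in zip(x, mu)]
            S_w = add(S_w, outer(diff, diff))
        # Between-class: class mean versus overall mean, weighted by class size.
        gap = [a - b for a, b in zip(mu, overall)]
        weighted = [[len(members) * v for v in row] for row in outer(gap, gap)]
        S_b = add(S_b, weighted)
    return S_w, S_b


text_features = [[1.0, 2.0], [3.0, 2.0], [6.0, 5.0], [8.0, 7.0]]
text_labels = [0, 0, 1, 1]
S_w, S_b = scatter_matrices(text_features, text_labels)
```

A useful sanity check for these definitions is that `S_w` plus `S_b` equals the total scatter of the samples around the overall mean.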
8. The method for constructing an objective function for cross-modal retrieval according to claim 6 or 7, characterized in that minimizing the quantization loss incurred when the hash function produces the hash codes in step 4 is defined as:
s.t. $B \in \{-1,1\}^{L},\ W_1 W_1^{T} = I_k,$
where $B$ is the joint hash codes.
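The idea of a quantization loss can be illustrated as follows, assuming it is the squared Euclidean distance between a binary code and the real-valued projected feature (the patent's exact formula is not reproduced in this text). Under that assumption, the element-wise sign of the projection is the code with the smallest loss, which the sketch confirms by brute force over all codes of a short length. The function names are hypothetical.

```python
from itertools import product


def quantization_loss(code, z):
    """Squared Euclidean distance between a {-1, +1} code and projection z."""
    return sum((b - v) ** 2 for b, v in zip(code, z))


def sign_code(z):
    """Element-wise sign, with zero mapped to +1 (an assumption)."""
    return [1 if v >= 0 else -1 for v in z]


def brute_force_best_code(z):
    """Search every code in {-1, +1}^L for the minimum loss (small L only)."""
    candidates = product((-1, 1), repeat=len(z))
    return list(min(candidates, key=lambda c: quantization_loss(c, z)))


z = [0.7, -1.2, 0.1]              # a projected feature, hypothetical values
code = sign_code(z)
best = brute_force_best_code(z)
loss_value = quantization_loss(code, z)
```

The brute-force search grows as 2^L and is included only as a check; in practice the sign is applied directly.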
9. The method for constructing an objective function for cross-modal retrieval according to claim 8, characterized in that adding the category labels in step 5 is defined as:
s.t. $B \in \{-1,1\}^{L}$
where $Q$ is the classifier matrix and $Y$ is the label matrix.
10. The method for constructing an objective function for cross-modal retrieval according to claim 9, characterized in that the regularization term in step 6 is defined as:
CN201710581083.1A 2017-07-17 2017-07-17 Cross-modal retrieval method based on discriminative correlation maximization hashing Expired - Fee Related CN107402993B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710581083.1A CN107402993B (en) 2017-07-17 2017-07-17 Cross-modal retrieval method based on discriminative correlation maximization hashing

Publications (2)

Publication Number Publication Date
CN107402993A CN107402993A (en) 2017-11-28
CN107402993B true CN107402993B (en) 2018-09-11

Family

ID=60400727

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710581083.1A Expired - Fee Related CN107402993B (en) 2017-07-17 2017-07-17 Cross-modal retrieval method based on discriminative correlation maximization hashing

Country Status (1)

Country Link
CN (1) CN107402993B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20180911