CN110059198A

CN110059198A - Similarity-preserving discrete hashing retrieval method for cross-modal data

Info

Publication number: CN110059198A
Application number: CN201910277146.3A
Authority: CN
Inventors: 孔祥维; 李明阳
Original assignee: Dalian University of Technology; Zhejiang University ZJU
Current assignee: Dalian University of Technology; Zhejiang University ZJU
Priority date: 2019-04-08
Filing date: 2019-04-08
Publication date: 2019-07-26
Anticipated expiration: 2039-04-08
Also published as: CN110059198B

Abstract

The invention discloses a similarity-preserving discrete hashing retrieval method for cross-modal data. A cross-modal retrieval data set composed of samples containing two modalities is established and divided into a training set and a test set. An objective function that preserves both inter-modal and intra-modal similarity is established and solved with a discrete optimization method to obtain a hash code matrix. A hash function is learned for each modality from the hash code matrix. The hash codes of all samples in the training set and the test set are computed with the hash functions. The test set of one modality serves as the query set and the training set of the other modality serves as the retrieval set; the Hamming distances between the hash codes of the query samples and those of the retrieval-set samples are computed and sorted to produce the retrieval result. The invention effectively preserves inter-modal and intra-modal similarity, takes the discrete nature of hash codes into account, and solves the objective function with a discrete optimization method, thereby improving the accuracy of cross-modal retrieval.

Description

Similarity-preserving discrete hashing retrieval method for cross-modal data

Technical field

The present invention relates to a cross-modal retrieval method in the field of multimedia retrieval technology, and more particularly to a similarity-preserving discrete hashing retrieval method for cross-modal data.

Background art

With the rapid development of Internet information technology, multimedia information of various modalities is growing explosively on the network. In line with this trend, cross-modal retrieval has become an important problem and has attracted the attention of many researchers. In the typical cross-modal retrieval scenario, a query sample of one modality is given and similar samples of another modality are retrieved. Because of the heterogeneity gap, however, the similarity between different modalities cannot be measured directly. In addition, the explosive growth of data makes the storage cost and efficiency of large-scale retrieval a necessary concern. Hashing has been a popular approach in recent years; its goal is to map data to compact binary codes. With hashing, data can be stored in less memory, and the similarity between different modalities can be measured by the Hamming distance, which is computed quickly with bitwise XOR operations.
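The XOR-based Hamming distance mentioned above can be sketched as follows. This is a minimal illustration rather than part of the patented method; the helper names `pack_codes` and `hamming_xor` and the 8-bit example codes are assumptions made for the example.

```python
import numpy as np

def pack_codes(codes):
    """Pack an (n, k) array of {-1, +1} hash codes into bytes, one row per sample."""
    bits = (np.asarray(codes) > 0).astype(np.uint8)  # map -1 -> 0 and +1 -> 1
    return np.packbits(bits, axis=1)

def hamming_xor(packed_a, packed_b):
    """Hamming distance between two packed codes: XOR, then count the set bits."""
    diff = np.bitwise_xor(packed_a, packed_b)
    return int(np.unpackbits(diff).sum())

# Two hypothetical 8-bit codes that differ in three positions.
a = np.array([[1, -1, 1, 1, -1, -1, 1, -1]])
b = np.array([[1, 1, 1, -1, -1, -1, 1, 1]])
dist = hamming_xor(pack_codes(a)[0], pack_codes(b)[0])
```

Packing the codes into bytes is what gives hashing its low storage cost: a 16-bit code occupies two bytes per sample.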

In recent years, researchers have proposed many cross-modal hashing methods. The main idea of most of them is to learn hash functions from training data that map features from the original spaces into a common Hamming space while preserving the semantic correlation of the original feature space. Some typical and relatively recent cross-modal methods are briefly introduced next. CVH extends single-modal spectral hashing by minimizing weighted distances. IMH learns linear hash functions by preserving inter-modal and intra-modal consistency. CMFH uses collective matrix factorization to learn unified hash codes for the different modalities of a sample. SMFH is based on collective matrix factorization and learns unified hash codes while preserving both local geometric consistency and label consistency. FSH learns hash functions by preserving a graph-based fusion similarity across modalities.

Similarity preservation is a critical problem for cross-modal hashing. Most methods focus on inter-modal similarity: if an image sample and a text sample are semantically related, they should have similar hash codes. Intra-modal similarity is also important; it aims to preserve the local geometric structure of each modality. Some methods, such as SMFH, preserve intra-modal similarity with a graph Laplacian regularization term, but they only consider samples within the k nearest neighbors and set the weights of samples outside the k nearest neighbors in the relation matrix to 0. As a result, samples that are similar in the original feature space obtain similar hash codes, but the hash codes of dissimilar samples are not necessarily dissimilar, because they are left unconstrained. Furthermore, hash codes are binary, so learning them is a discrete optimization problem, which is usually NP-hard. Most existing cross-modal hashing methods relax the original discrete constraint into a continuous one, optimize the relaxed objective function, and then quantize the resulting continuous values into binary codes. This relaxation strategy, however, degrades retrieval performance.

Summary of the invention

In view of the above deficiencies in the prior art, the object of the present invention is to provide a similarity-preserving discrete hashing retrieval method for cross-modal data.

The technical solution adopted by the present invention comprises the following steps:

1) A cross-modal retrieval data set composed of samples containing two modalities is established in the database of a server. The two modalities of a sample are the image modality and the text modality, and the data set is divided into a training set and a test set;

2) An objective function that preserves inter-modal similarity and intra-modal similarity is established and solved with a discrete optimization method to obtain a hash code matrix;

3) The hash function of each modality is learned from the hash code matrix obtained in step 2);

4) The hash codes of all samples in the training set and the test set are computed with the hash functions;

5) The test set of one modality is used as the query set and the training set of the other modality as the retrieval set, with hash codes obtained as in step 4). The Hamming distances between the hash codes of the query samples and the hash codes of the retrieval-set samples are computed, the retrieval-set samples are sorted in ascending order of Hamming distance, and the top-ranked samples are returned as the retrieval result.

Step 1) is specifically as follows:

Images and texts are collected from web pages, and an image and a text with the same meaning form an image-text pair. Having the same meaning means that both describe the same thing; for example, an image of a person surfing and a text describing a person surfing form an image-text pair. The cross-modal retrieval data set is built from these image-text pairs, and the image feature and text feature of one image-text pair constitute one sample. The training set of the cross-modal retrieval data set has n samples, each containing the features of the image and text modalities. $X^{(1)}$ denotes the image modality matrix formed by the features of the n image modalities, $X^{(1)} = [x_1^{(1)}, x_2^{(1)}, \ldots, x_n^{(1)}] \in \mathbb{R}^{d_1 \times n}$, where each column is the image-modality feature of one sample; $x_p^{(1)}$ denotes the image-modality feature of the p-th sample, i.e. the p-th column of the image modality matrix, $x_p^{(1)} \in \mathbb{R}^{d_1}$, where $d_1$ is the dimension of the image-modality feature and $\mathbb{R}$ is the set of real numbers. $X^{(2)}$ denotes the text modality matrix formed by the features of the n text modalities, $X^{(2)} = [x_1^{(2)}, x_2^{(2)}, \ldots, x_n^{(2)}] \in \mathbb{R}^{d_2 \times n}$, where each column is the text-modality feature of one sample; $x_p^{(2)}$ denotes the text-modality feature of the p-th sample, i.e. the p-th column of the text modality matrix, $x_p^{(2)} \in \mathbb{R}^{d_2}$, where $d_2$ is the dimension of the text-modality feature. The corresponding features $x_p^{(1)}$ and $x_p^{(2)}$ of the two modalities constitute the sample features;

$Y = [y_1, y_2, \ldots, y_n]$ denotes the label matrix, $Y \in \{0,1\}^{c \times n}$, where c is the total number of classes and $y_p$ is the label vector of the p-th sample, i.e. the p-th column of the label matrix, $y_p = [y_{1p}, y_{2p}, \ldots, y_{ip}, \ldots, y_{cp}]^T$; $y_{ip}$ is the label of the p-th sample for the i-th class. If the p-th sample belongs to the i-th class, the element in row i and column p of the label matrix Y is $y_{ip} = 1$; otherwise $y_{ip} = 0$.

The present invention divides the data set into a training set and a test set and extracts features from the images and the texts separately. The training set contains the image and text features used for training, and the test set contains the image and text features used for testing.

Step 2) specifically comprises:

2.1) The two different modalities of the same sample learn the same hash code, which preserves inter-modal similarity. A similarity matrix S is first constructed from the label matrix Y using cosine similarity; the element in row p and column q of S is $S_{pq} = \dfrac{y_p \cdot y_q}{\|y_p\|_2 \, \|y_q\|_2}$, where p and q are sample indices, $y_p \cdot y_q$ is the inner product of the label vector $y_p$ of the p-th sample and the label vector $y_q$ of the q-th sample, and $\|y_p\|_2$ and $\|y_q\|_2$ are the 2-norms of $y_p$ and $y_q$ respectively;

Then the loss function for preserving inter-modal similarity is established (the equation itself is not reproduced in this text). In it, $\|\cdot\|_F^2$ is the squared Frobenius norm and B denotes the hash code matrix formed by the hash codes of all samples, $B \in \{-1,1\}^{k \times n}$, where k is the hash code length;
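The label-based cosine similarity matrix S of step 2.1) can be sketched as follows. The function name `label_cosine_similarity` and the toy label matrix are assumptions made for illustration; the zero-norm guard is likewise an added assumption, since the text does not say how samples without labels are handled.

```python
import numpy as np

def label_cosine_similarity(Y):
    """Y is a (c, n) 0/1 label matrix with one column per sample.
    Returns the (n, n) matrix S with S[p, q] = y_p . y_q / (||y_p|| * ||y_q||)."""
    Y = np.asarray(Y, dtype=float)
    norms = np.linalg.norm(Y, axis=0)
    norms[norms == 0] = 1.0  # assumption: samples without labels get similarity 0, not NaN
    Yn = Y / norms           # normalize every column to unit length
    return Yn.T @ Yn

# Hypothetical labels: 3 classes (rows), 3 samples (columns).
Y = np.array([[1, 1, 0],
              [0, 1, 0],
              [0, 0, 1]])
S = label_cosine_similarity(Y)
```

With single-label data such as the Wiki data set used later, S reduces to a 0/1 matrix that marks whether two samples share a class.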

2.2) Within a single modality, the local geometric structure of the samples should be preserved: samples that are similar in the original feature space should still have similar hash codes after being mapped into the Hamming space. Intra-modal similarity is preserved with a graph regularization term. For the m-th modality (m=1 denotes the image modality, m=2 denotes the text modality), the following loss function for preserving intra-modal similarity is established:

$$\mathcal{L}_{intra}^{(m)} = \frac{1}{2}\sum_{p=1}^{n}\sum_{q=1}^{n} W_{pq}^{(m)} \,\|b_p - b_q\|_2^2 = \mathrm{tr}\!\left(B L_m B^T\right)$$

where $b_p$ and $b_q$ are the p-th and q-th columns of the hash code matrix B, $W^{(m)}$ is the weight matrix of the m-th modality, $W_{pq}^{(m)}$ is the element in row p and column q of $W^{(m)}$, $L_m$ is the Laplacian matrix of the m-th modality, $D^{(m)}$ is the diagonal matrix of the m-th modality with diagonal elements $D_{pp}^{(m)} = \sum_{q} W_{pq}^{(m)}$, and $L_m = D^{(m)} - W^{(m)}$; $\mathrm{tr}(\cdot)$ denotes the trace of a matrix and $\|\cdot\|_2^2$ the squared 2-norm;

This method considers not only the samples close to a given sample but also the samples far from it, which yields more discriminative hash codes. The elements $W_{pq}^{(m)}$ of the weight matrix $W^{(m)}$ are specified by a formula that is not reproduced in this text.

In that formula, e is the natural constant; one set contains the $k_1$ sample features nearest to the sample feature $x_p^{(m)}$ in the m-th modality, and another set contains the $k_2$ sample features farthest from $x_p^{(m)}$ in the m-th modality; μ is a trade-off parameter, and σ is set to a maximum value (the quantity being maximized is not reproduced in this text).

Within the nearest-neighbor set, the closer a sample is to $x_p^{(m)}$, the larger its weight; within the farthest-neighbor set, the farther a sample is from $x_p^{(m)}$, the larger the absolute value of its weight. Setting the weights in this way brings the hash codes of similar samples close together after mapping and pushes the hash codes of dissimilar samples far apart.
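The graph regularizer of step 2.2) has two equivalent forms, a pairwise sum over weighted code distances and a trace involving the graph Laplacian. The following check is a sketch under stated assumptions: the weight matrix is taken to be symmetric, the pairwise sum carries the conventional factor 1/2, and the weights are random placeholders (including negative values, as for far-apart samples) rather than the weights defined by the method.

```python
import numpy as np

rng = np.random.default_rng(0)
k, n = 4, 6
B = rng.choice([-1.0, 1.0], size=(k, n))  # hash codes, one column per sample
W = rng.normal(size=(n, n))
W = (W + W.T) / 2                          # assumption: symmetric weights
np.fill_diagonal(W, 0.0)

D = np.diag(W.sum(axis=1))                 # D_pp = sum_q W_pq
L = D - W                                  # graph Laplacian L = D - W

# Pairwise form: 1/2 * sum_{p,q} W_pq * ||b_p - b_q||^2
diffs = B[:, :, None] - B[:, None, :]      # shape (k, n, n)
pairwise = 0.5 * np.sum(W * np.sum(diffs ** 2, axis=0))

# Trace form: tr(B L B^T)
trace_form = np.trace(B @ L @ B.T)
```

A positive weight penalizes a large distance between two codes, while a negative weight rewards it, which is how far-apart samples can be pushed toward dissimilar codes.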

2.3) loss function of similitude between keeping mode is combinedWith the loss function for keeping similitude in modeEstablish the overall goal function of following study Hash codes are as follows:

s.t.B∈{-1,1}^k×n

Wherein, α indicate keep mode between similitude loss function tradeoff parameter, β₁Indicate the mould of holding image modalities The tradeoff parameter of the loss function of similitude, β in state₂Indicate the power of the loss function of similitude in the mode of holding text modality Weigh parameter, T representing matrix transposition；

2.4) Because of the discrete constraint on the hash codes, solving the overall objective function of step 2.3) is an NP-hard problem. The overall objective function is therefore solved with a discrete optimization method, specifically:

2.4.1) The hash code matrix is randomly initialized as $B^{(0)} \in \{-1,1\}^{k \times n}$, where $B^{(0)}$ denotes the initial hash code matrix B; each of its elements is randomly set to -1 or 1.

2.4.2) The solution is obtained iteratively as follows:

First, the gradient of the overall objective function with respect to B is computed.

Then, in each iteration, the hash code matrix $B^{(j+1)}$ of the (j+1)-th iteration is obtained from the discrete hash code matrix $B^{(j)}$ of the j-th iteration by an update formula in which λ is the learning rate. The gradient and update formulas themselves are not reproduced in this text.

The hash code matrix is updated with this iterative formula until the optimization is complete, yielding the optimal hash code matrix B.
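Because the gradient and update formulas are not reproduced in the text, the following is only an illustrative sketch of how an iteration can keep B discrete at every step. It assumes a sign-projected gradient step, B ← sign(B − λ·∇), and uses the graph term tr(B L Bᵀ) alone as a stand-in objective; both choices are assumptions and may differ from the update actually used by the method. The helper name `sign_projected_step` is hypothetical.

```python
import numpy as np

def sign_projected_step(B, grad, lam):
    """One assumed update B <- sign(B - lam * grad); every entry stays in {-1, +1}."""
    B_next = np.sign(B - lam * grad)
    B_next[B_next == 0] = 1.0  # break ties so that no entry is left at 0
    return B_next

rng = np.random.default_rng(0)
k, n = 8, 20
B = rng.choice([-1.0, 1.0], size=(k, n))  # random initialization as in step 2.4.1)

# Stand-in objective tr(B L B^T) built from random symmetric placeholder weights.
W = rng.random((n, n))
W = (W + W.T) / 2
np.fill_diagonal(W, 0.0)
L = np.diag(W.sum(axis=1)) - W

for _ in range(10):
    grad = 2.0 * B @ L  # gradient of tr(B L B^T) for symmetric L
    B = sign_projected_step(B, grad, lam=0.01)
```

The point of the sketch is the structure of the loop: the iterate is a valid binary code matrix after every step, so no relaxation followed by a final quantization is needed.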

In step 3), the hash functions are simple linear mappings, $h_1(x^{(1)}) = \mathrm{sign}(P_1^T x^{(1)})$ and $h_2(x^{(2)}) = \mathrm{sign}(P_2^T x^{(2)})$. Learning the hash functions means learning the two mapping matrices $P_1$ and $P_2$, where $P_1$ is the mapping matrix of the image modality, $P_2$ is the mapping matrix of the text modality, $x^{(1)}$ is the image-modality feature of a sample, and $x^{(2)}$ is the text-modality feature of a sample;

The mapping matrices $P_1$ and $P_2$ are obtained by solving:

$$\min_{P_1, P_2} \; \mathcal{L}_P = \|B - P_1^T X^{(1)}\|_F^2 + \|B - P_2^T X^{(2)}\|_F^2 + \gamma\left(\|P_1\|_F^2 + \|P_2\|_F^2\right)$$

where $\mathcal{L}_P$ denotes the loss function for learning the mapping matrices and γ is a trade-off parameter;

Setting $\partial \mathcal{L}_P / \partial P_1 = 0$ gives $P_1 = (X^{(1)} X^{(1)T} + \gamma I)^{-1} X^{(1)} B^T$, where I denotes the identity matrix;

Setting $\partial \mathcal{L}_P / \partial P_2 = 0$ gives $P_2 = (X^{(2)} X^{(2)T} + \gamma I)^{-1} X^{(2)} B^T$.
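The closed-form solution for a mapping matrix can be checked numerically. The sketch below assumes the regularized least-squares loss ‖B − PᵀX‖²_F + γ‖P‖²_F for one modality, which is the loss consistent with the stated solution P = (XXᵀ + γI)⁻¹XBᵀ; the function name `learn_projection` and the random data are illustrative assumptions.

```python
import numpy as np

def learn_projection(X, B, gamma):
    """X: (d, n) features, B: (k, n) hash codes in {-1, +1}.
    Returns P of shape (d, k) with P = (X X^T + gamma I)^{-1} X B^T."""
    d = X.shape[0]
    # Solving the linear system avoids forming the matrix inverse explicitly.
    return np.linalg.solve(X @ X.T + gamma * np.eye(d), X @ B.T)

rng = np.random.default_rng(0)
d, k, n, gamma = 5, 4, 30, 0.1
X = rng.normal(size=(d, n))               # placeholder features
B = rng.choice([-1.0, 1.0], size=(k, n))  # placeholder hash codes
P = learn_projection(X, B, gamma)

# Gradient of ||B - P^T X||_F^2 + gamma * ||P||_F^2 with respect to P.
grad = 2 * (X @ X.T @ P - X @ B.T) + 2 * gamma * P
```

The gradient vanishes at the returned P, which confirms that the closed form is the stationary point of the assumed loss. The same function applies to either modality by passing that modality's feature matrix.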

In step 4), the hash codes of the image modality in the training set and the test set are computed with $h_1(x^{(1)}) = \mathrm{sign}(P_1^T x^{(1)})$, where $x^{(1)}$ is the image-modality feature of a sample and $h_1(x^{(1)})$ is the hash code computed from it. The hash codes of the text modality in the training set and the test set are computed with $h_2(x^{(2)}) = \mathrm{sign}(P_2^T x^{(2)})$, where $x^{(2)}$ is the text-modality feature of a sample and $h_2(x^{(2)})$ is the hash code computed from it.

After training in steps 2) and 3), the present invention retains the mapping matrices $P_1$ and $P_2$ and discards the training-set hash codes learned in step 2).

The beneficial effects of the present invention are:

The present invention effectively preserves inter-modal similarity and intra-modal similarity at the same time, and it attends to the hash codes of dissimilar samples as well as those of similar samples. With the method of the invention, samples that are dissimilar in the original feature space obtain dissimilar hash codes after being mapped into the Hamming space, and similar samples obtain similar hash codes, so the hash codes are more discriminative. This addresses the problem of learning discrete hash codes for data retrieval.

The present invention takes the discrete nature of hash codes into account and solves the objective function with a discrete optimization method, thereby improving the accuracy of cross-modal data retrieval.

Brief description of the drawings

Fig. 1 is a flow chart of the implementation steps of the present invention.

Fig. 2 is a schematic example of image-to-text retrieval on the cross-modal data set Wiki.

Fig. 3 is a schematic example of text-to-image retrieval on the cross-modal data set Wiki.

Detailed description of the embodiments

The present invention is described in further detail below with reference to the accompanying drawings and a specific embodiment.

As shown in Fig. 1, the specific embodiment of the present invention is as follows:

The technical solution of the invention is further described using the flow chart and the cross-modal data set Wiki. The Wiki data set comes from Wikipedia and contains 2866 image-text pairs collected from Wikipedia articles. Images are represented by 128-dimensional SIFT features and texts by 10-dimensional LDA features. The image-text pairs of the Wiki data set are divided into 10 semantic classes, and each pair belongs to one of them. 2173 samples are randomly selected to form the training set, and the remaining 693 samples form the test set.
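The random split described above can be sketched as follows. Only the sizes 2866, 2173 and 693 come from the text; the seed and the use of an index permutation are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # the seed is an arbitrary choice for reproducibility
n_total, n_train = 2866, 2173

perm = rng.permutation(n_total)      # random order of all image-text pair indices
train_idx = perm[:n_train]           # 2173 training pairs
test_idx = perm[n_train:]            # remaining 693 test pairs
```

Because an image and its paired text share one index, the same index arrays split both modalities consistently.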

1) A cross-modal retrieval data set composed of samples containing two modalities is established in the database of a server. The two modalities of a sample are the image modality and the text modality, and the data set is divided into a training set and a test set.

Images and texts are collected from web pages, and an image and a text with the same meaning form an image-text pair. Having the same meaning means that both describe the same thing; for example, an image of a person surfing and a text describing a person surfing form an image-text pair. The cross-modal retrieval data set is built from these image-text pairs, and the image feature and text feature of one image-text pair constitute one sample.

The training set of the cross-modal retrieval data set has n samples, each containing the features of the image and text modalities. $X^{(1)}$ denotes the image modality matrix formed by the features of the n image modalities, $X^{(1)} = [x_1^{(1)}, x_2^{(1)}, \ldots, x_n^{(1)}] \in \mathbb{R}^{d_1 \times n}$; $x_p^{(1)}$ denotes the image-modality feature of the p-th sample, $x_p^{(1)} \in \mathbb{R}^{d_1}$, where $d_1$ is the dimension of the image-modality feature and $\mathbb{R}$ is the set of real numbers. $X^{(2)}$ denotes the text modality matrix formed by the features of the n text modalities, $X^{(2)} = [x_1^{(2)}, x_2^{(2)}, \ldots, x_n^{(2)}] \in \mathbb{R}^{d_2 \times n}$; $x_p^{(2)}$ denotes the text-modality feature of the p-th sample, $x_p^{(2)} \in \mathbb{R}^{d_2}$, where $d_2$ is the dimension of the text-modality feature. The corresponding features $x_p^{(1)}$ and $x_p^{(2)}$ of the two modalities constitute the sample features. $Y = [y_1, y_2, \ldots, y_n]$ denotes the label matrix, $Y \in \{0,1\}^{c \times n}$, where c is the total number of classes and $y_p$ is the label vector of the p-th sample, $y_p = [y_{1p}, y_{2p}, \ldots, y_{ip}, \ldots, y_{cp}]^T$; $y_{ip}$ is the label of the p-th sample for the i-th class. If the p-th sample belongs to the i-th class, the element in row i and column p of the label matrix Y is $y_{ip} = 1$; otherwise $y_{ip} = 0$.

2) objective function of similitude in similitude and mode between keeping mode is established, and passes through a kind of discrete optimizing method Objective function is solved, Hash codes matrix is obtained, learns Hash codes for training set.

2.1) similarity matrix S is first constructed according to label matrix Y cosine similarity, the element of pth row q column is in S S_pq=y_p·y_q/(||y_p||₂||y_q||₂), wherein p and q is the ordinal number of sample, y_p·y_qIndicate the label of p-th of sample to Measure y_pWith the label vector y of q-th of sample_qBetween inner product, | | y_p||₂With | | y_q||₂Respectively indicate the label of p-th of sample to Measure y_pWith the label vector y of q-th of sample_qTwo norms；

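2.2) Intra-modal similarity is preserved with a graph regularization term. For the m-th modality (m=1 denotes the image modality, m=2 denotes the text modality), the following loss function for preserving intra-modal similarity is established:

$$\mathcal{L}_{intra}^{(m)} = \frac{1}{2}\sum_{p=1}^{n}\sum_{q=1}^{n} W_{pq}^{(m)} \,\|b_p - b_q\|_2^2 = \mathrm{tr}\!\left(B L_m B^T\right)$$

where $b_p$ and $b_q$ are columns of the hash code matrix B and $L_m = D^{(m)} - W^{(m)}$ is the Laplacian matrix of the m-th modality. The elements $W_{pq}^{(m)}$ of the weight matrix $W^{(m)}$ are set as described in the summary of the invention.

2.3) The overall objective function for learning the hash codes is minimized subject to $B \in \{-1,1\}^{k \times n}$.

2.4) The overall objective function is solved with a discrete optimization method, specifically:

2.4.2) The solution is obtained iteratively, starting from the gradient of the overall objective function, as described in the summary of the invention.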

3) The hash function of each modality is learned from the hash code matrix obtained in step 2).

The hash functions are simple linear mappings, $h_1(x^{(1)}) = \mathrm{sign}(P_1^T x^{(1)})$ and $h_2(x^{(2)}) = \mathrm{sign}(P_2^T x^{(2)})$. Learning the hash functions means learning the two mapping matrices $P_1$ and $P_2$, where $P_1$ is the mapping matrix of the image modality, $P_2$ is the mapping matrix of the text modality, $x^{(1)}$ is the image-modality feature of a given sample, and $x^{(2)}$ is the text-modality feature of a given sample;

The mapping matrices $P_1$ and $P_2$ are obtained by minimizing the loss function for learning the mapping matrices (the equation itself is not reproduced here):

Setting the partial derivative of this loss with respect to $P_1$ to zero gives $P_1 = (X^{(1)} X^{(1)T} + \gamma I)^{-1} X^{(1)} B^T$; $P_2$ is obtained in the same way from $X^{(2)}$.

4) The hash codes of all samples in the training set and the test set are computed with the hash functions.

The hash codes of the image modality in the training set and the test set are computed with $h_1(x^{(1)}) = \mathrm{sign}(P_1^T x^{(1)})$, where $x^{(1)}$ is the image-modality feature of a sample and $h_1(x^{(1)})$ is the hash code computed from it. The hash codes of the text modality in the training set and the test set are computed with $h_2(x^{(2)}) = \mathrm{sign}(P_2^T x^{(2)})$, where $x^{(2)}$ is the text-modality feature of a sample and $h_2(x^{(2)})$ is the hash code computed from it.

5) The test set of one modality is used as the query set and the training set of the other modality as the retrieval set, with hash codes obtained as in step 4). The Hamming distances between the hash codes of the query samples and the hash codes of the retrieval-set samples are computed, the retrieval-set samples are sorted in ascending order of Hamming distance, and the top-ranked samples are returned as the retrieval result.
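Steps 4) and 5) can be sketched together: codes are computed with sign(Pᵀx), and retrieval-set samples are ranked by Hamming distance to a query code. The function names `hash_codes` and `hamming_rank`, the mapping of sign(0) to +1, and the small example codes are assumptions made for illustration.

```python
import numpy as np

def hash_codes(P, X):
    """h(x) = sign(P^T x) for every column of X; returns a (k, n) matrix in {-1, +1}."""
    H = np.sign(P.T @ X)
    H[H == 0] = 1.0  # assumption: map sign(0) to +1 so that codes stay binary
    return H

def hamming_rank(query_code, db_codes):
    """Rank retrieval-set columns by ascending Hamming distance to one query code."""
    k = db_codes.shape[0]
    # For codes in {-1, +1}: hamming(q, b) = (k - <q, b>) / 2
    dists = (k - query_code @ db_codes) / 2
    order = np.argsort(dists, kind="stable")
    return order, dists[order]

# Hypothetical 4-bit codes of three retrieval-set samples (one per column).
db = np.array([[ 1,  1, -1],
               [ 1, -1, -1],
               [ 1,  1, -1],
               [-1,  1,  1]], dtype=float)
q = np.array([1, 1, 1, -1], dtype=float)  # hypothetical query code
order, dists = hamming_rank(q, db)
```

In a cross-modal query, `q` would come from one modality's hash function (for example an image from the test set) and `db` from the other modality's hash function applied to the training set.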

This embodiment uses mAP (mean Average Precision) as the evaluation criterion; the larger the mAP value, the better the cross-modal retrieval performance of the method. On the cross-modal data set Wiki, the method is compared with three cross-modal hashing methods: CMFH (Ding G, Guo Y, Zhou J. Collective matrix factorization hashing for multimodal data[C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014: 2075-2082), SMFH (Tang J, Wang K, Shao L. Supervised matrix factorization hashing for cross-modal retrieval[J]. IEEE Transactions on Image Processing, 2016, 25(7): 3157-3166) and FSH (Liu H, Ji R, Wu Y, et al. Cross-modality binary code learning via fusion similarity hashing[C]. Proceedings of CVPR. 2017: 6345-6353). Table 1 shows the mAP values computed over the top 100 returned samples at a hash code length of 16 bits.

Table 1: mAP values on the Wiki data set

Method	Image-to-text retrieval	Text-to-image retrieval
CMFH	0.2295	0.3479
SMFH	0.2411	0.3658
FSH	0.2408	0.3871
The present invention	0.2455	0.4086

As Table 1 shows, the method of the present invention achieves the highest mAP value among the compared methods in both retrieval directions and therefore the best cross-modal retrieval performance.
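The mAP criterion over the top R returned samples can be sketched as follows. The text does not spell out the exact definition, so this sketch assumes the common convention of averaging the precision at each relevant rank within the top R and normalizing by the number of relevant items found there; the function names are illustrative.

```python
import numpy as np

def average_precision_at_r(relevant, r):
    """relevant: 0/1 sequence in ranked order (1 = same semantic class as the query)."""
    rel = np.asarray(relevant[:r], dtype=float)
    if rel.sum() == 0:
        return 0.0  # no relevant item among the top r results
    precision_at_i = np.cumsum(rel) / np.arange(1, len(rel) + 1)
    return float((precision_at_i * rel).sum() / rel.sum())

def mean_average_precision(relevance_lists, r=100):
    """Mean of AP@r over all queries."""
    return float(np.mean([average_precision_at_r(rel, r) for rel in relevance_lists]))

# Hypothetical ranked result: relevant items at ranks 1 and 3.
ap = average_precision_at_r([1, 0, 1, 0], r=4)
```

For this ranked list the precision is 1 at rank 1 and 2/3 at rank 3, so the average precision is 5/6.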

Fig. 2 gives an example of image-to-text retrieval on the cross-modal data set Wiki. The top 6 ranked texts are returned, and the semantic class is given above each image and text. The query image belongs to the geography class. A solid frame indicates that the retrieved text belongs to the same semantic class as the query image, and a dashed frame indicates that it does not. The retrieval results of this example show that the method provided by the present invention outperforms the compared methods.

Fig. 3 gives an example of text-to-image retrieval on the cross-modal data set Wiki. The top 6 ranked images are returned, and the semantic class is given above each image and text. The query text belongs to the literature class. An image without a frame belongs to the same semantic class as the query text, and an image with a dashed frame does not. The retrieval results of this example show that the method provided by the present invention outperforms the compared methods.

In conclusion, the method of the present invention effectively preserves inter-modal similarity and intra-modal similarity. It attends to the hash codes of dissimilar samples as well as those of similar samples, which helps learn more discriminative hash codes, and it solves the discrete hash code learning problem with a discrete optimization method, thereby improving the accuracy of cross-modal retrieval.

Claims

1. A similarity-preserving discrete hashing retrieval method for cross-modal data, characterized in that the method comprises the following steps:

1) A cross-modal retrieval data set composed of samples containing two modalities is established in the database of a server. The two modalities of a sample are the image modality and the text modality, and the data set is divided into a training set and a test set;

2) An objective function that preserves inter-modal similarity and intra-modal similarity is established and solved with a discrete optimization method to obtain a hash code matrix;

5) The test set of one modality is used as the query set and the training set of the other modality as the retrieval set, with hash codes obtained as in step 4). The Hamming distances between the hash codes of the query samples and the hash codes of the retrieval-set samples are computed, the retrieval-set samples are sorted in ascending order of Hamming distance, and the top-ranked samples are returned as the retrieval result.

2. The similarity-preserving discrete hashing retrieval method for cross-modal data according to claim 1, characterized in that step 1) is specifically as follows:

Images and texts are collected from web pages, and an image and a text with the same meaning form an image-text pair. The cross-modal retrieval data set is built from these image-text pairs, and the image feature and text feature of one image-text pair constitute one sample. The training set of the cross-modal retrieval data set has n samples, each containing the features of the image and text modalities. $X^{(1)}$ denotes the image modality matrix formed by the features of the n image modalities, $X^{(1)} = [x_1^{(1)}, x_2^{(1)}, \ldots, x_n^{(1)}] \in \mathbb{R}^{d_1 \times n}$; $x_p^{(1)}$ denotes the image-modality feature of the p-th sample, $x_p^{(1)} \in \mathbb{R}^{d_1}$, where $d_1$ is the dimension of the image-modality feature and $\mathbb{R}$ is the set of real numbers. $X^{(2)}$ denotes the text modality matrix formed by the features of the n text modalities, $X^{(2)} = [x_1^{(2)}, x_2^{(2)}, \ldots, x_n^{(2)}] \in \mathbb{R}^{d_2 \times n}$; $x_p^{(2)}$ denotes the text-modality feature of the p-th sample, $x_p^{(2)} \in \mathbb{R}^{d_2}$, where $d_2$ is the dimension of the text-modality feature. The corresponding features $x_p^{(1)}$ and $x_p^{(2)}$ of the two modalities constitute the sample features. $Y = [y_1, y_2, \ldots, y_n]$ denotes the label matrix, $Y \in \{0,1\}^{c \times n}$, where c is the total number of classes and $y_p$ is the label vector of the p-th sample, $y_p = [y_{1p}, y_{2p}, \ldots, y_{ip}, \ldots, y_{cp}]^T$; $y_{ip}$ is the label of the p-th sample for the i-th class.

3. The similarity-preserving discrete hashing retrieval method for cross-modal data according to claim 1, characterized in that step 2) specifically comprises:

2.1) A similarity matrix S is first constructed from the label matrix Y using cosine similarity; the element in row p and column q of S is $S_{pq} = \dfrac{y_p \cdot y_q}{\|y_p\|_2 \, \|y_q\|_2}$, where p and q are sample indices, $y_p \cdot y_q$ is the inner product of the label vector $y_p$ of the p-th sample and the label vector $y_q$ of the q-th sample, and $\|y_p\|_2$ and $\|y_q\|_2$ are the 2-norms of $y_p$ and $y_q$ respectively;

Then the loss function for preserving inter-modal similarity is established (the equation itself is not reproduced in this text). In it, $\|\cdot\|_F^2$ is the squared Frobenius norm and B denotes the hash code matrix formed by the hash codes of all samples, $B \in \{-1,1\}^{k \times n}$, where k is the hash code length;

2.2) For the m-th modality (m=1 denotes the image modality, m=2 denotes the text modality), the following loss function for preserving intra-modal similarity is established:

$$\mathcal{L}_{intra}^{(m)} = \frac{1}{2}\sum_{p=1}^{n}\sum_{q=1}^{n} W_{pq}^{(m)} \,\|b_p - b_q\|_2^2 = \mathrm{tr}\!\left(B L_m B^T\right)$$

where $b_p$ and $b_q$ are the p-th and q-th columns of the hash code matrix B, $W^{(m)}$ is the weight matrix of the m-th modality, $W_{pq}^{(m)}$ is the element in row p and column q of $W^{(m)}$, $L_m$ is the Laplacian matrix of the m-th modality, $D^{(m)}$ is the diagonal matrix of the m-th modality with diagonal elements $D_{pp}^{(m)} = \sum_{q} W_{pq}^{(m)}$, and $L_m = D^{(m)} - W^{(m)}$; $\mathrm{tr}(\cdot)$ denotes the trace of a matrix and $\|\cdot\|_2^2$ the squared 2-norm;

The elements $W_{pq}^{(m)}$ of the weight matrix $W^{(m)}$ are specified by a formula that is not reproduced in this text.

In that formula, e is the natural constant; one set contains the $k_1$ sample features nearest to the sample feature $x_p^{(m)}$ in the m-th modality, and another set contains the $k_2$ sample features farthest from $x_p^{(m)}$ in the m-th modality; μ is a trade-off parameter, and σ is set to a maximum value (the quantity being maximized is not reproduced in this text);

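2.3) The following overall objective function for learning the hash codes is established:

$$\min_{B} \; \alpha\,\mathcal{L}_{inter} + \beta_1\,\mathrm{tr}\!\left(B L_1 B^T\right) + \beta_2\,\mathrm{tr}\!\left(B L_2 B^T\right) \quad \text{s.t. } B \in \{-1,1\}^{k \times n}$$

where $\mathcal{L}_{inter}$ stands for the inter-modal similarity loss of step 2.1), α is the trade-off parameter of the inter-modal similarity loss, $\beta_1$ is the trade-off parameter of the intra-modal similarity loss of the image modality, $\beta_2$ is the trade-off parameter of the intra-modal similarity loss of the text modality, and T denotes matrix transposition;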

2.4.1) The hash code matrix is randomly initialized as $B^{(0)} \in \{-1,1\}^{k \times n}$, where $B^{(0)}$ denotes the initial hash code matrix B;

2.4.2) The solution is obtained iteratively as follows:

First, the gradient of the overall objective function with respect to B is computed.

Then, in each iteration, the hash code matrix $B^{(j+1)}$ of the (j+1)-th iteration is obtained from the discrete hash code matrix $B^{(j)}$ of the j-th iteration by an update formula in which λ is the learning rate. The gradient and update formulas themselves are not reproduced in this text.

4. The similarity-preserving discrete hashing retrieval method for cross-modal data according to claim 1, characterized in that in step 3) the hash functions are simple linear mappings, $h_1(x^{(1)}) = \mathrm{sign}(P_1^T x^{(1)})$ and $h_2(x^{(2)}) = \mathrm{sign}(P_2^T x^{(2)})$. Learning the hash functions means learning the two mapping matrices $P_1$ and $P_2$, where $P_1$ is the mapping matrix of the image modality, $P_2$ is the mapping matrix of the text modality, $x^{(1)}$ is the image-modality feature of a sample, and $x^{(2)}$ is the text-modality feature of a sample;

The mapping matrices $P_1$ and $P_2$ are obtained by solving:

$$\min_{P_1, P_2} \; \|B - P_1^T X^{(1)}\|_F^2 + \|B - P_2^T X^{(2)}\|_F^2 + \gamma\left(\|P_1\|_F^2 + \|P_2\|_F^2\right)$$

where γ is a trade-off parameter.

5. The similarity-preserving discrete hashing retrieval method for cross-modal data according to claim 1, characterized in that in step 4) the hash codes of the image modality in the training set and the test set are computed with $h_1(x^{(1)}) = \mathrm{sign}(P_1^T x^{(1)})$, where $x^{(1)}$ is the image-modality feature of a sample and $h_1(x^{(1)})$ is the hash code computed from it; the hash codes of the text modality in the training set and the test set are computed with $h_2(x^{(2)}) = \mathrm{sign}(P_2^T x^{(2)})$, where $x^{(2)}$ is the text-modality feature of a sample and $h_2(x^{(2)})$ is the hash code computed from it.