CN103488713B - Cross-modal retrieval method capable of directly measuring similarity between data of different modalities - Google Patents
- Publication number: CN103488713B (application CN201310410553.XA)
- Authority: CN (China)
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/9032—Query formulation
Abstract
The invention discloses a cross-modal retrieval method capable of directly measuring the similarity between data of different modalities. It comprises the following steps: 1) feature extraction; 2) model construction and learning; 3) cross-media data retrieval; 4) result evaluation. The present invention can compare similarity directly between data of different modalities: for a cross-modal retrieval task, the user may submit data of any modality (text, image, sound, etc.) and retrieve the results they need in the corresponding modality. The difference between the present invention and traditional cross-media retrieval methods is that similarity comparison is carried out directly between data of different modalities, meeting the needs of cross-media retrieval and more directly realizing the user's retrieval intent. Compared with other cross-media retrieval algorithms that can directly measure cross-modal similarity, this method is more robust to noise and better expresses loosely correlated cross-modal data, so that the retrieval results are better.
Description
Technical field
The present invention relates to cross-modal retrieval, and in particular to a cross-modal retrieval method capable of directly measuring similarity between data of different modalities.
Background art
Nowadays, electronic data is increasingly rich and varied: data of many types, such as text, images, sound, and maps, is widespread on the Internet. The same semantic content can be described both by data of one modality and by data of other modalities. Sometimes, given a description of certain semantics in one type of data, one wishes to find the corresponding description in another type of data; for example, to retrieve pictures whose meaning is close to a given text, or to retrieve news reports related to a given picture. However, existing retrieval methods typically target a single modality, e.g., text retrieving text or images retrieving images. Some multi-modal or multimedia retrieval methods do exist, but most of them measure similarity within the same modality and then compute the similarity between media data through function mappings; retrieval methods that directly compare similarity across modalities are rare. The drawback of cross-media retrieval methods that measure similarity within the same modality is that they cannot learn the relations between cross-modal data; they must rely on matching relations specified in advance in the database, and for loose correspondences between multimedia data the query results are unsatisfactory. It is therefore necessary to propose a cross-media retrieval method that directly measures similarity between different modalities. The difficulty of directly comparing the similarity of data of different modalities is that the features of different modalities differ greatly, are generally high-dimensional, and suffer from the "semantic gap" problem.
To overcome the "semantic gap" and perform similarity comparison and retrieval between data of different modalities, some methods differing from the traditional ones have been proposed. They generally fall into two classes. One class treats the data of different modalities as random variables and seeks mappings to a latent space in which the correlation of these random variables is maximized; the data to be retrieved is also projected into the latent space, completing the cross-modal retrieval. The other class assumes latent topics in the data and models the correlation between data of different modalities through the topics. Both classes directly compare data similarity between modalities; however, when the correlation between modalities is loose at the semantic level, the reliability of semantic-level notions such as "maximum correlation" and "topics" is lower than that of known, deterministic information such as the categories of the different modalities and their association relations. The present invention introduces dictionary learning into cross-modal retrieval, directly learns explicit association relations, and exploits label information; it can model the loose correspondence between text and images at the semantic level well, thereby improving robustness to noise and improving the accuracy of cross-modal retrieval.
Summary of the invention
It is an object of the present invention to provide a cross-modal retrieval method capable of directly measuring similarity between data of different modalities, so that data of one modality can directly retrieve data of one or more other modalities.
The cross-modal retrieval method capable of directly measuring similarity between data of different modalities comprises the following steps:
1) performing feature extraction and label recording on the data of each modality in the database;
2) according to the correspondence information and label information between data of different modalities in the database, expressing the diversity and similarity between paired data of different modalities from the perspective of reconstruction, and, using the label information, building an overall cross-modal retrieval model and learning the model parameters;
3) after feature extraction on the known-modality data submitted by the user, using the cross-media retrieval model to return data of the other modalities that correspond to the user's request;
4) using the true cross-modal correspondence information and label information to evaluate the cross-media retrieval model in terms of both correspondence information and discriminative information.
Said step 1) includes:
1) extracting SIFT features from all image-modality data in the database, clustering them with the k-means method to form visual words, and then normalizing the features so that the feature vector representing each image is a unit vector;
2) performing part-of-speech tagging on all text-modality data in the database, removing the non-noun words and retaining the nouns in the text; forming a dictionary from the words occurring in the whole database; counting, for each text separately, the number of occurrences of each dictionary word; vectorizing each text by its word frequencies; and then normalizing the feature vectors so that the feature vector representing each text is a unit vector;
3) for data of other modalities in the database, extracting conventional standard features and normalizing them so that the feature vector representing each data item is a unit vector;
4) for corresponding data of different modalities in the database, recording their label information, i.e., the category each item comes from.
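The two feature-extraction steps above can be sketched as follows. This is a minimal NumPy illustration under our own assumptions: the function names and toy inputs are not from the patent, and real use would rely on an actual SIFT extractor and part-of-speech tagger.

```python
import numpy as np

def visual_word_histogram(descriptors, centers):
    """Quantize local SIFT descriptors against k-means centers and
    return an L2-normalized visual-word histogram (unit vector)."""
    # assign each descriptor to its nearest cluster center
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(centers)).astype(float)
    n = np.linalg.norm(hist)
    return hist / n if n > 0 else hist

def text_vector(tokens, vocabulary):
    """Count occurrences of the dictionary nouns in one text and
    L2-normalize, so each text is represented by a unit vector."""
    index = {w: i for i, w in enumerate(vocabulary)}
    v = np.zeros(len(vocabulary))
    for t in tokens:
        if t in index:
            v[index[t]] += 1.0
    n = np.linalg.norm(v)
    return v / n if n > 0 else v
```

Both representations end as unit vectors, matching the normalization required in steps 1) and 2).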
Said step 2) includes:
1) introducing the concept of dictionary learning into cross-media retrieval to form a dictionary-learning-based cross-modal retrieval algorithm: the data of each modality is reconstructed from a dictionary and sparse coefficients; the dictionaries encode the differences between modalities; the similarity between data of different modalities is modeled by association matrices between the sparse coefficients; and the dictionaries, sparse coefficients, and association matrices are all learned from the data of each modality;
2) using label information in the cross-modal retrieval: in dictionary learning, data of the same modality belonging to the same label share the same dictionary atoms, i.e., the same non-zero dictionary columns, so that the label information is encoded during dictionary learning and a dictionary carrying discriminative information is learned;
3) unifying the dictionaries, sparse coefficients, association matrices, and label information in the dictionary-learning-based multi-modal retrieval framework of expression (1), in which the corresponding data of the different modalities are expressed and learned as a whole;
where M denotes the number of modalities, J denotes the number of categories (labels), X^(m) denotes the feature data of the m-th modality, D^(m) denotes the dictionary of the m-th modality, A^(m) denotes the sparse coefficients of the m-th modality, and A_l^(m) denotes the sparse coefficients corresponding to the data of the m-th modality carrying label l (a corresponding norm is defined for any k×n matrix A); W^(m) is the association matrix of the m-th modality; λ_m (m = 1, …, M), β, and γ are adjustable parameters used to regulate the proportion of each part in the expression; d_i^(m) denotes a dictionary atom of D^(m), i.e., a column, and k is the number of columns;
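The figure carrying expression (1) does not survive in this text. Offered only as an illustrative reconstruction consistent with the symbols just defined, and not as the patent's actual formula, a coupled dictionary-learning objective of this general shape could be:

```latex
\min_{\{D^{(m)},\,A^{(m)},\,W^{(m)}\}}\;
\sum_{m=1}^{M}\Big(
\big\|X^{(m)}-D^{(m)}A^{(m)}\big\|_F^2
+\lambda_m\sum_{l=1}^{J}\big\|A^{(m)}_l\big\|_{1,2}\Big)
+\beta\sum_{m\neq n}\big\|A^{(m)}-W^{(m)}A^{(n)}\big\|_F^2
+\gamma\sum_{m=1}^{M}\big\|W^{(m)}\big\|_F^2
```

Here the first term is the per-modality reconstruction, the group norm on A_l^(m) would induce the shared non-zero dictionary columns per label described in step 2), the β term couples the sparse coefficients of different modalities through the association matrices W^(m), and the γ term regularizes the association matrices.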
4) cyclically updating the sparse coefficients, dictionaries, and association matrices: first fix the dictionaries and association matrices and update the sparse coefficients; then use the obtained sparse coefficients and the fixed association matrices to update the dictionaries; then use the updated sparse coefficients and dictionaries to update the association matrices; repeat this cycle until the convergence condition is met. The specific steps are as follows:
(1) first fix the dictionaries and association matrices, and update the sparse coefficients as follows:
(2) after the sparse coefficients are obtained, update the dictionary of each modality according to the following formula:
(3) finally, update the association matrices as follows:
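Since the concrete update formulas are carried in images that do not survive here, the cyclic structure of step 4) can only be sketched. The closed-form ridge updates below are our own illustrative stand-ins for the patent's actual updates, assuming M = 2 modalities:

```python
import numpy as np

def alternating_updates(X, D, A, W, n_iters=20, lam=0.1):
    """Alternate the three updates of step 4) for the coupled model
    X[m] ~ D[m] @ A[m],  A[m] ~ W[m] @ A[1 - m]  (two modalities).
    The ridge closed forms are stand-ins, not the patent's formulas."""
    M = len(X)
    for _ in range(n_iters):
        # (1) fix dictionaries and association matrices; update coefficients
        for m in range(M):
            k = D[m].shape[1]
            A[m] = np.linalg.solve(D[m].T @ D[m] + lam * np.eye(k),
                                   D[m].T @ X[m])
        # (2) with the coefficients obtained, update each modality's dictionary
        for m in range(M):
            k = A[m].shape[0]
            D[m] = X[m] @ A[m].T @ np.linalg.inv(A[m] @ A[m].T + lam * np.eye(k))
        # (3) finally, update the association matrices: A[m] ~ W[m] @ A[1 - m]
        for m in range(M):
            src = A[1 - m]
            W[m] = A[m] @ src.T @ np.linalg.inv(src @ src.T
                                                + lam * np.eye(src.shape[0]))
    return D, A, W
```

The loop mirrors the described cycle exactly: coefficients, then dictionaries, then association matrices, repeated until a fixed iteration budget (a convergence test on the objective would replace `n_iters` in practice).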
Said step 3) includes:
1) according to the known m-th-modality data submitted by the user and the learned known-modality dictionary D^(m), initializing the sparse coefficients of the known-modality data as follows:
where λ is a parameter adjusting the degree of sparsity;
2) according to the initialized sparse coefficients of the known-modality data and the learned association matrix W^(m), initializing the sparse coefficients of the required-modality data as follows:
3) according to the sparse coefficients of the required-modality data and the learned required-modality dictionary D^(n), initializing the required-modality data as follows:
4) according to the known-modality data, the learned information, and the above initialization, updating the sparse coefficients of the known modality and of the required modality as follows:
where β, λ_m, and λ_n are adjustable parameters corresponding to those in formula (1);
5) according to the updated sparse coefficients of the required modality and the required-modality dictionary, finally determining the required-modality data as follows:
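The retrieval-phase steps 1) to 5) can be sketched as follows, again with a ridge stand-in for the patent's unreproduced sparse-coding formulas; the function names and the direction of the association mapping are our own assumptions:

```python
import numpy as np

def ridge_code(x, D, lam=0.1):
    """Code one feature vector against dictionary D; a ridge
    stand-in for the patent's sparse-coding step."""
    k = D.shape[1]
    return np.linalg.solve(D.T @ D + lam * np.eye(k), D.T @ x)

def cross_modal_query(x_known, D_known, W_known, D_need, lam=0.1):
    """Steps 1)-5): code the known-modality query, map the coefficients
    through the learned association matrix, and reconstruct the
    required-modality data."""
    a_known = ridge_code(x_known, D_known, lam)  # step 1: known coefficients
    a_need = W_known @ a_known                   # step 2: required coefficients
    return D_need @ a_need                       # steps 3/5: required data
```

In the full method, steps 4) and 5) would jointly refine `a_known` and `a_need` before the final reconstruction; the sketch stops at the initialization-and-map stage.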
Said step 4) includes:
1) evaluating the cross-modal retrieval with correspondence information: attention is paid to the known-modality data and the other-modality data truly corresponding to it, and the quality of the result is measured by the position, in the result list, of the required-modality data corresponding to the known-modality data; for a given t% index, if the required-modality data corresponding to the known-modality data ranks within the top t%, the retrieval is considered correct, otherwise it is considered an error;
2) evaluating the cross-modal retrieval with discriminative information: attention is paid to the known-modality data and the required-modality data belonging to the same label, and the retrieval list is used to measure the cross-modal retrieval result; an item is considered relevant if it has the same label as the known-modality data, and irrelevant otherwise. Specifically, the MAP measure used in information retrieval is adopted for this index. For one cross-modal query item and the returned list of length R, MAP is defined on the basis of AP, and AP is defined as follows:
AP = (1/L) · Σ_{r=1..R} prec(r) · δ(r)
where L is the number of items in the returned list that are relevant to the query, prec(r) denotes the fraction of the first r items that are relevant to the query, and δ(r) = 1 if the r-th item is relevant to the query and δ(r) = 0 otherwise; MAP is defined as the mean of the AP values over all query items.
The present invention has the following beneficial effects: compared with traditional cross-media retrieval methods, the present invention can directly compare similarity between different modalities rather than relying on correspondence relations to propagate within-modality similarity comparisons across modalities. The benefit of directly comparing similarity between modalities is that the implicit associations between cross-media data can truly be mined, directly realizing the user's retrieval intent. Compared with other cross-media retrieval algorithms that can directly measure cross-modal similarity, the present invention improves the robustness of the measurement results to noise and the ability to express loosely correlated cross-media data, so that the retrieval results are better and semantically more relevant to the query.
Brief description of the drawings
Fig. 1 is a schematic diagram of the cross-modal retrieval method capable of directly measuring similarity between data of different modalities;
Fig. 2 shows examples of corresponding pictures and texts in the database of the embodiment;
Fig. 3 shows examples of image-to-text and text-to-image retrieval with the present invention. Each query lists the first four returned results. The top part is an image-to-text example; to better display the retrieval results, the retrieved similar texts are represented here by the true pictures corresponding to the texts. The bottom part is a text-to-image example. Each example compares the present invention (named SliM²) with another cross-media retrieval method (GMA) that directly measures similarity between different modalities.
Detailed description of the invention
Embodiment
Assume we have 2173 pairs of text and image data with known correspondence, and 693 text items and 693 image items with unknown correspondence; examples of the pictures and texts are shown in Fig. 2. First, SIFT features are extracted from all image-modality data in the database and clustered with the k-means method to form visual words; the features are then normalized so that the feature vector representing each image is a unit vector. At the same time, part-of-speech tagging is performed on all text-modality data in the database, the non-noun words are removed and the nouns in the texts are retained, a dictionary is formed from the words occurring in the whole database, the occurrences of the dictionary words are counted separately for each text, each text is vectorized by its word frequencies, and the feature vectors are then normalized so that the feature vector representing each text is a unit vector.
The 2173 pairs of data (features) are expressed in matrix form. As stipulated, M denotes the number of modalities, J the number of categories (labels), X^(m) the feature data of the m-th modality, D^(m) the dictionary of the m-th modality, A^(m) the sparse coefficients of the m-th modality, and A_l^(m) the sparse coefficients corresponding to the data of the m-th modality carrying label l (a corresponding norm is defined for any k×n matrix A); W^(m) is the association matrix of the m-th modality; λ_m (m = 1, …, M), β, and γ are adjustable parameters used to regulate the proportion of each part in the expression; d_i^(m) denotes a dictionary atom of D^(m), i.e., a column, and k is the number of columns. Here there are two modalities, text and image, so M = 2, and the text and image data serve as X^(1) and X^(2) respectively.
Then the following steps are performed:
1) first fix the dictionaries and association matrices, and update the sparse coefficients as follows:
2) after the sparse coefficients are obtained, update the dictionary of each modality according to the following formula:
3) finally, update the association matrices as follows:
Learning thus yields D = {D^(1), D^(2), …, D^(M)} and W = {W^(1), W^(2), …, W^(M)}. We then enter the retrieval phase, in which any one of the 693×2 texts and images with unknown correspondence can be used as a query and the corresponding text or image is returned. The specific steps are as follows:
Assume the user submits query data of a known image or text modality, where m = 1 or 2.
1) according to the known m-th-modality data submitted by the user and the learned known-modality dictionary D^(m), initialize the sparse coefficients of the known-modality data as follows:
where λ is a parameter adjusting the degree of sparsity;
2) according to the initialized sparse coefficients of the known-modality data and the learned association matrix W^(m), initialize the sparse coefficients of the required-modality data as follows:
3) according to the sparse coefficients of the required-modality data and the learned required-modality dictionary D^(n), initialize the required-modality data as follows:
4) according to the known-modality data, the learned information, and the above initialization, update the sparse coefficients of the known modality and of the required modality as follows:
where β, λ_m, and λ_n are adjustable parameters corresponding to those in formula (1);
5) according to the updated sparse coefficients of the required modality and the required-modality dictionary, finally determine the required-modality data as follows:
6) according to the required-modality data, rank the required-modality candidates and return the ranked result list.
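Step 6) can be sketched as a nearest-neighbour ranking of the required-modality candidates against the reconstructed query. This is a minimal illustration; in the embodiment the candidates would be the 693 held-out feature vectors of the required modality:

```python
import numpy as np

def rank_candidates(x_hat, candidates):
    """Order required-modality candidate rows by Euclidean distance
    to the reconstructed query x_hat, nearest first.  For unit
    vectors, cosine similarity gives an equivalent ordering."""
    d = np.linalg.norm(candidates - x_hat, axis=1)
    return np.argsort(d)
```

The returned index order is the result list handed back to the user.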
Fig. 3 shows concrete examples of cross-media retrieval, including an image-to-text example (top) and a text-to-image example (bottom), comparing the present invention (named SliM²) with another cross-media retrieval method (GMA) that directly measures similarity between different modalities. For the image-to-text example (top), to display the retrieval results more intuitively, the retrieved texts are represented by their corresponding true pictures. It can be seen that the query image comes from the sports category and the results retrieved by both methods also come from the sports category, but the present invention ranks the text corresponding to the query image (represented by its corresponding picture) first, and the remaining retrieved texts (likewise represented by their corresponding pictures) are also semantically and in content more relevant to the query image. For the text-to-image example, a section of the text is shown; its content mainly concerns parks and trails and belongs to the geography category. The results retrieved by the present invention belong to the same category as the query text and are related in content, whereas among the results of the alternative method the first- and fourth-ranked pictures come from the history category and are not as strongly related to the query text as those of the present invention.
From the above examples it can be seen that, unlike traditional methods, the present invention can directly perform similarity measurement between different modalities and thereby realize cross-modal retrieval, and that even compared with methods that likewise directly compare cross-modal similarity, the method of the present invention achieves better retrieval results.
Claims (4)
1. A cross-modal retrieval method capable of directly measuring similarity between data of different modalities, characterized by comprising the following steps:
1) performing feature extraction and label recording on the data of each modality in the database;
2) according to the correspondence information and label information between data of different modalities in the database, expressing the diversity and similarity between paired data of different modalities from the perspective of reconstruction, and, using the label information, building an overall cross-modal retrieval model and learning the model parameters;
3) after feature extraction on the known-modality data submitted by the user, using the overall cross-modal retrieval model to return data of the other modalities that correspond to the user's request;
4) using the true cross-modal correspondence information and label information to evaluate the overall cross-modal retrieval model in terms of both correspondence information and discriminative information;
wherein said step 1) specifically comprises:
1) extracting SIFT features from all image-modality data in the database, clustering them with the k-means method to form visual words, and then normalizing the features so that the feature vector representing each image is a unit vector;
2) performing part-of-speech tagging on all text-modality data in the database, removing the non-noun words and retaining the nouns in the text, forming a dictionary from the words occurring in the whole database, counting for each text separately the number of occurrences of each dictionary word, vectorizing each text by its word frequencies, and then normalizing the feature vectors so that the feature vector representing each text is a unit vector;
3) for data of other modalities in the database, extracting conventional standard features and normalizing them so that the feature vector representing each data item is a unit vector;
4) for corresponding data of different modalities in the database, recording their label information, i.e., the category each item comes from.
A kind of cross-module state search method that can directly measure similarity between different modalities data the most according to claim 1,
It is characterized in that, described step 2) including:
1) in cross-module state is retrieved, introduce the concept of dictionary learning, form cross-module state searching algorithm based on dictionary learning, with word
Allusion quotation and sparse coefficient rebuild the data of each mode, the different distinctivenesses between dictionary encoding different modalities, different modalities data
Between similarity modeled by the incidence relation matrix between sparse coefficient, dictionary, sparse coefficient and incidence relation matrix are all from respectively
Modal data learning obtains;
2) utilizing label information to participate in the retrieval of cross-module state, in dictionary learning, the same modal data belonging to same label is shared
Identical dictionary primitive, the dictionary being i.e. not zero arranges, so that label information encodes during dictionary learning, study is to tool
The dictionary of having any different property information;
3) dictionary, sparse coefficient, incidence relation matrix, label information be unified in such as formula (1) is many based on dictionary learning
In mode searching algorithm framework, the corresponding data of different modalities is expressed as entirety and learns;
where M is the number of modalities and J is the number of categories, i.e. labels; X^(m) is the feature data of modality m, D^(m) is the dictionary of modality m, and A^(m) is the sparse coefficient matrix of modality m; A_l^(m) denotes the sparse coefficients corresponding to those data of modality m that carry label l, the corresponding matrix norm over any k × n matrix A being defined in formula (1); W^(m) is the association relation matrix of modality m; λ_m, β and γ, with m = 1, …, M, are adjustable parameters that regulate the weight of each term in the expression; d_i^(m) denotes a dictionary atom of D^(m), i.e. one column, and k is the number of columns;
4) the sparse coefficients, dictionaries and association relation matrices are updated cyclically: first fix the dictionaries and association relation matrices and update the sparse coefficients; then use the obtained sparse coefficients and the fixed association relation matrices to update the dictionaries; then use the updated sparse coefficients and dictionaries to update the association relation matrices; this cycle is repeated until the convergence condition is met. The concrete steps are as follows:
(1) first, fix the dictionaries and the association relation matrices and update the sparse coefficients as follows:
(2) after the sparse coefficients are obtained, update the dictionary of each modality according to the following formula:
(3) finally, update the association relation matrices as follows:
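Formula (1) and the three update formulas were embedded as images and are not reproduced in this text. Purely as an illustrative assumption consistent with the symbols defined above — not the patent's actual formula — such a dictionary-learning objective could take the form:

```latex
\min_{\{D^{(m)},\,A^{(m)},\,W^{(m)}\}}\;
\sum_{m=1}^{M}\Bigl(\bigl\|X^{(m)}-D^{(m)}A^{(m)}\bigr\|_F^2
  +\lambda_m\bigl\|A^{(m)}\bigr\|_1\Bigr)
  +\beta\sum_{m\neq m'}\bigl\|A^{(m)}-W^{(m)}A^{(m')}\bigr\|_F^2
  +\gamma\sum_{m=1}^{M}\bigl\|W^{(m)}\bigr\|_F^2
\quad\text{s.t.}\;\bigl\|d_i^{(m)}\bigr\|_2\le 1,
```

with the label structure imposed by allowing the coefficients A_l^(m) to be non-zero only on the dictionary atoms assigned to label l.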
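As a runnable illustration of this alternating scheme (not the patent's actual update formulas, which are not reproduced here), the following sketch cycles between an ISTA sparse-coding step, a least-squares dictionary update with atom normalization, and a least-squares fit of an association matrix between the two sets of sparse codes; all function names, sizes and parameter values are illustrative assumptions:

```python
import numpy as np

def soft_threshold(Z, t):
    """Proximal operator of the L1 norm."""
    return np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)

def update_codes(X, D, lam, n_iter=50):
    """Fix the dictionary, update sparse codes: min ||X - D A||_F^2 + lam ||A||_1 (ISTA)."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the smooth part
    A = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(n_iter):
        A = soft_threshold(A - D.T @ (D @ A - X) / L, lam / L)
    return A

def update_dictionary(X, A):
    """Fix the codes, update the dictionary by least squares; renormalize the atoms."""
    D = X @ np.linalg.pinv(A)
    return D / np.maximum(np.linalg.norm(D, axis=0), 1e-12)

def update_association(A_src, A_dst):
    """Fit the association matrix W so that A_dst is approximated by W @ A_src."""
    return A_dst @ np.linalg.pinv(A_src)

rng = np.random.default_rng(0)
X1 = rng.standard_normal((20, 100))        # modality 1: 20-dim features, 100 items
X2 = rng.standard_normal((30, 100))        # modality 2: same 100 items, 30-dim features
D1 = update_dictionary(X1, rng.standard_normal((40, 100)))   # random 40-atom init
D2 = update_dictionary(X2, rng.standard_normal((40, 100)))

for _ in range(5):                         # the cyclic update described in the claim
    A1 = update_codes(X1, D1, lam=0.1)
    A2 = update_codes(X2, D2, lam=0.1)
    D1 = update_dictionary(X1, A1)
    D2 = update_dictionary(X2, A2)
A1 = update_codes(X1, D1, lam=0.1)         # final codes under the final dictionaries
A2 = update_codes(X2, D2, lam=0.1)
W12 = update_association(A1, A2)           # maps modality-1 codes to modality-2 codes
```

In practice the patent's joint objective couples the code updates across modalities; this sketch decouples them for brevity and only fits the association matrix afterwards.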
3. The cross-modal retrieval method capable of directly measuring similarity between data of different modalities according to claim 2,
characterized in that said step 3) comprises:
1) according to the known-modality data of modality m submitted by the user and the learned known-modality dictionary D^(m), initialize the sparse coefficients of the known-modality data as follows:
where λ is a parameter that adjusts the degree of sparsity of the coefficients;
2) according to the initialized sparse coefficients of the known-modality data and the learned association relation matrix W^(m), initialize the sparse coefficients of the demand-modality data as follows:
3) according to the sparse coefficients of the demand-modality data and the learned demand-modality dictionary D^(n), initialize the demand-modality data as follows:
4) according to the known-modality data, the information obtained by learning, and the initializations above, update the sparse coefficients of the known modality and the sparse coefficients of the demand modality as follows:
where β, λ_m and λ_n are adjustable parameters corresponding to those in formula (1);
5) according to the updated sparse coefficients of the demand modality and the demand-modality dictionary, finally determine the demand-modality data as follows:
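A minimal runnable sketch of this retrieval procedure, under the assumption that dictionaries and an association matrix have already been learned (here stand-ins: random unit-norm dictionaries and an identity association matrix), could look as follows; a simple one-pass ISTA coding replaces the patent's joint update of both sets of sparse coefficients, and all names and sizes are illustrative:

```python
import numpy as np

def sparse_code(x, D, lam, n_iter=100):
    """ISTA for one query: min ||x - D a||^2 + lam ||a||_1 with the dictionary fixed."""
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        a = a - D.T @ (D @ a - x) / L
        a = np.sign(a) * np.maximum(np.abs(a) - lam / L, 0.0)
    return a

rng = np.random.default_rng(1)
D1 = rng.standard_normal((20, 40)); D1 /= np.linalg.norm(D1, axis=0)  # known-modality dict
D2 = rng.standard_normal((30, 40)); D2 /= np.linalg.norm(D2, axis=0)  # demand-modality dict
W12 = np.eye(40)                           # stand-in for the learned association matrix

x_query = D1 @ (2.0 * np.eye(40)[0])       # a query built from atom 0 of D1
a1 = sparse_code(x_query, D1, lam=0.05)    # step 1: code the known-modality data
a2 = W12 @ a1                              # step 2: map the codes across modalities
x2_hat = D2 @ a2                           # step 3: reconstruct the demand-modality data

# steps 4-5: rank database items of the demand modality by similarity to x2_hat
database = rng.standard_normal((30, 5))
database = np.concatenate(
    [x2_hat[:, None] + 0.01 * rng.standard_normal((30, 1)), database], axis=1)
sims = database.T @ x2_hat / (np.linalg.norm(database, axis=0) *
                              np.linalg.norm(x2_hat) + 1e-12)
ranking = np.argsort(-sims)                # most similar demand-modality item first
```

Because the first database column is a slightly perturbed copy of the reconstruction, it should top the ranking, illustrating how similarity is measured directly across modalities through the shared code space.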
4. The cross-modal retrieval method capable of directly measuring similarity between data of different modalities according to claim 1,
characterized in that said step 4) comprises:
1) cross-modal retrieval is evaluated with correspondence information: attention is paid to a known-modality data item and the data items of the other modalities that literally correspond to it, and the quality of the result is evaluated by the position, in the result list, of the demand-modality data item corresponding to the known-modality item; for a given t% index, if the demand-modality data item corresponding to the known-modality item ranks in the top t%, the retrieval is considered correct, otherwise it is considered incorrect;
2) cross-modal retrieval is evaluated with discriminative information: attention is paid to the known-modality data item and the demand-modality data items that belong to the same label; the retrieved list is used to measure the cross-modal retrieval result, items bearing the same label being regarded as relevant to the known-modality item and all others as irrelevant; concretely, the MAP used in information retrieval is adopted as the measure of this index; for the data of one cross-modal retrieval request and the list of length R returned by the search, MAP is defined on the basis of AP, and AP is defined as follows:
where L is the number of data items relevant to the query in the list returned by the search; Prec(r) denotes the proportion of items relevant to the query among the first r items; δ(r) = 1 if the r-th item is relevant to the query and δ(r) = 0 otherwise; MAP is defined as the mean of the AP values over all query data.
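Both evaluation criteria are simple to compute. The following sketch (function names are illustrative, not from the patent) implements the top-t% correspondence test and the AP/MAP definition given above, with Prec(r) computed as cumulative precision and δ(r) as a boolean relevance flag:

```python
import numpy as np

def top_t_percent_correct(rank, list_len, t):
    """Correspondence criterion: the ground-truth item must rank in the top t% (rank is 0-based)."""
    return rank < int(np.ceil(list_len * t / 100.0))

def average_precision(relevance):
    """AP = (1/L) * sum_r Prec(r) * delta(r) over a ranked list of relevance flags."""
    rel = np.asarray(relevance, dtype=float)
    L = rel.sum()                                        # relevant items in the list
    if L == 0:
        return 0.0
    prec = np.cumsum(rel) / (np.arange(len(rel)) + 1)    # Prec(r) for r = 1..R
    return float((prec * rel).sum() / L)

def mean_average_precision(result_lists):
    """MAP: mean of the AP values over all queries."""
    return float(np.mean([average_precision(r) for r in result_lists]))
```

For example, `mean_average_precision([[1, 0, 1, 0], [0, 1]])` averages AP = 5/6 and AP = 1/2, giving 2/3.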
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310410553.XA CN103488713B (en) | 2013-09-10 | 2013-09-10 | A kind of cross-module state search method that can directly measure similarity between different modalities data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103488713A CN103488713A (en) | 2014-01-01 |
CN103488713B true CN103488713B (en) | 2016-09-28 |
Family
ID=49828939
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310410553.XA Active CN103488713B (en) | 2013-09-10 | 2013-09-10 | A kind of cross-module state search method that can directly measure similarity between different modalities data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103488713B (en) |
Families Citing this family (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104199826B (en) * | 2014-07-24 | 2017-06-30 | 北京大学 | A kind of dissimilar medium similarity calculation method and search method based on association analysis |
CN104166684A (en) * | 2014-07-24 | 2014-11-26 | 北京大学 | Cross-media retrieval method based on uniform sparse representation |
CN104317838B (en) * | 2014-10-10 | 2017-05-17 | 浙江大学 | Cross-media Hash index method based on coupling differential dictionary |
CN104317837B (en) * | 2014-10-10 | 2017-06-23 | 浙江大学 | A kind of cross-module state search method based on topic model |
CN104346450B (en) * | 2014-10-29 | 2017-06-23 | 浙江大学 | A kind of across media sort methods based on multi-modal recessive coupling expression |
CN104462489B (en) * | 2014-12-18 | 2018-02-23 | 北京邮电大学 | A kind of cross-module state search method based on Deep model |
CN105550190B (en) * | 2015-06-26 | 2019-03-29 | 许昌学院 | Cross-media retrieval system towards knowledge mapping |
CN108121750B (en) * | 2016-11-30 | 2022-07-08 | 西门子公司 | Model processing method and device and machine readable medium |
CN107633259B (en) * | 2017-08-21 | 2020-03-31 | 天津大学 | Cross-modal learning method based on sparse dictionary representation |
CN108038080A (en) * | 2017-11-29 | 2018-05-15 | 浙江大学 | A kind of method that local multi-modal sparse coding completion is carried out using the similar tactical ruleization of adaptability |
CN110059217B (en) * | 2019-04-29 | 2022-11-04 | 广西师范大学 | Image text cross-media retrieval method for two-stage network |
CN110704708B (en) * | 2019-09-27 | 2023-04-07 | 深圳市商汤科技有限公司 | Data processing method, device, equipment and storage medium |
CN111930972B (en) * | 2020-08-04 | 2021-04-27 | 山东大学 | Cross-modal retrieval method and system for multimedia data by using label level information |
CN112364197B (en) * | 2020-11-12 | 2021-06-01 | 四川省人工智能研究院(宜宾) | Pedestrian image retrieval method based on text description |
CN113656660B (en) * | 2021-10-14 | 2022-06-28 | 北京中科闻歌科技股份有限公司 | Cross-modal data matching method, device, equipment and medium |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103268346A (en) * | 2013-05-27 | 2013-08-28 | 翁时锋 | Semi-supervised classification method and semi-supervised classification system |
Non-Patent Citations (2)
Title |
---|
Cross-media Analysis and Retrieval; Wu Fei et al.; 《中国计算机学会通讯》 (Communications of the CCF); 20110228; vol. 7, no. 2; pp. 23-27 * |
Research on Text and Image Information Fusion Technology for Web Image Retrieval; Yin Xiangzhou; 《中国优秀硕士学位论文全文数据库 信息科技辑》 (China Master's Theses Full-text Database, Information Science and Technology); 20111215, no. S2; pp. 23-44 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103488713B (en) | A kind of cross-module state search method that can directly measure similarity between different modalities data | |
CN106295796B (en) | entity link method based on deep learning | |
Hendricks et al. | Deep compositional captioning: Describing novel object categories without paired training data | |
Min et al. | Question answering through transfer learning from large fine-grained supervision data | |
Peng et al. | Semi-supervised cross-media feature learning with unified patch graph regularization | |
CN106202256B (en) | Web image retrieval method based on semantic propagation and mixed multi-instance learning | |
CN102197393B (en) | Image-based semantic distance | |
US20210366025A1 (en) | Item recommendation method based on user intention in session and system thereof | |
CN109635083A (en) | It is a kind of for search for TED speech in topic formula inquiry document retrieval method | |
CN106886601A (en) | A kind of Cross-modality searching algorithm based on the study of subspace vehicle mixing | |
CN106156333A (en) | A kind of improvement list class collaborative filtering method of mosaic society information | |
CN106844738B (en) | The classification method of Junker relationship between food materials neural network based | |
CN110647904A (en) | Cross-modal retrieval method and system based on unmarked data migration | |
CN112417306A (en) | Method for optimizing performance of recommendation algorithm based on knowledge graph | |
CN105701514A (en) | Multi-modal canonical correlation analysis method for zero sample classification | |
CN105701225B (en) | A kind of cross-media retrieval method based on unified association hypergraph specification | |
CN105718940A (en) | Zero-sample image classification method based on multi-group factor analysis | |
CN113239159B (en) | Cross-modal retrieval method for video and text based on relational inference network | |
CN109472282B (en) | Depth image hashing method based on few training samples | |
CN110059220A (en) | A kind of film recommended method based on deep learning Yu Bayesian probability matrix decomposition | |
CN104317838A (en) | Cross-media Hash index method based on coupling differential dictionary | |
CN113779219A (en) | Question-answering method for embedding multiple knowledge maps by combining hyperbolic segmented knowledge of text | |
CN112800292A (en) | Cross-modal retrieval method based on modal specificity and shared feature learning | |
CN105893573A (en) | Site-based multi-modal media data subject extraction model | |
CN102693321A (en) | Cross-media information analysis and retrieval method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
EE01 | Entry into force of recordation of patent licensing contract | ||
EE01 | Entry into force of recordation of patent licensing contract |
Application publication date: 20140101 Assignee: TONGDUN HOLDINGS Co.,Ltd. Assignor: ZHEJIANG University Contract record no.: X2021990000612 Denomination of invention: A cross modal retrieval method that can directly measure the similarity between different modal data Granted publication date: 20160928 License type: Common License Record date: 20211012 |