CN1920818A - Transmedia search method based on multi-mode information convergence analysis - Google Patents

Transmedia search method based on multi-mode information convergence analysis

Info

Publication number
CN1920818A
Authority
CN
China
Prior art keywords
multimedia
media object
user
semantic information
distance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 200610053392
Other languages
Chinese (zh)
Other versions
CN100388282C (en)
Inventor
潘云鹤
庄越挺
吴飞
杨易
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CNB2006100533923A priority Critical patent/CN100388282C/en
Publication of CN1920818A publication Critical patent/CN1920818A/en
Application granted granted Critical
Publication of CN100388282C publication Critical patent/CN100388282C/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a cross-media retrieval method based on multimodal information fusion analysis. By fusing and analyzing multimodal information to understand multimedia semantics, the method supports content-based retrieval of multimedia documents, images, audio and text. A user can submit a query example of any modality to retrieve media objects or multimedia documents of any modality; for example, to retrieve images, the user can submit an image, an audio clip, a text, or a combination of these as the query example. Because the method does not rely on keywords alone but fuses and analyzes all media objects within a multimedia document, it combines the information carried by media of different modalities to understand the semantics and thereby obtains better retrieval results. Because the query example and the returned results can be of different modalities, the method is more capable and more widely applicable.

Description

Cross-media retrieval method based on multimodal information fusion analysis
Technical field
The present invention relates to multimedia retrieval, and in particular to a cross-media retrieval method based on multimodal information fusion analysis.
Background art
Multimedia documents are now a very common file type. A multimedia document is composed of several media objects of different modalities (including audio, images and text) and carries a certain meaning; multimedia encyclopedias, web pages and Microsoft PowerPoint slides are all multimedia documents. In general, multimedia documents have two characteristics. First, their structure is complex: media objects of several modalities coexist within one document. Second, the media objects of different modalities within the same document are semantically complementary, and the semantics of the document are expressed jointly by all the media objects it contains. Therefore, even when a single media object is ambiguous, the semantics of the multimedia document taken as a whole are usually clear. Traditional retrieval methods are mostly designed for media objects of a single modality and do not take into account the complementary information carried by the media objects of different modalities inside a multimedia document. As a result, they cannot jointly analyze these media objects to understand multimedia semantics, and so cannot meet user needs well.
With the development of storage and Internet technology, the number of multimedia files accessible to users, including text, pictures, audio clips and multimedia documents, keeps growing. Retrieval technology helps users quickly find the content they need in massive amounts of data and has become an increasingly important field of computer application technology. Traditional retrieval techniques can be divided into keyword-based retrieval and content-based retrieval. A keyword-based retrieval system requires multimedia objects to be annotated in advance. However, because the number of existing media objects is enormous, annotation involves a huge workload; and because annotations are inevitably influenced by the annotator's subjective judgment, different annotators may assign different keywords to the same multimedia object, so keywords often cannot reflect the full semantics of a multimedia object completely and objectively. A content-based retrieval system does not require annotation; the user submits a query example to retrieve multimedia objects. Traditional content-based retrieval, however, has two weaknesses. First, the user can only retrieve media objects of the same modality as the query example: images can be retrieved with an image example and audio with an audio example, but images cannot be retrieved with an audio example, nor audio with an image example. Second, there is a semantic gap between the low-level features of media objects and their high-level semantics, so precision is not satisfactory. Media objects often appear as part of multimedia documents, and the media objects within one multimedia document usually share the same semantics. To bridge the semantic gap, the semantic complementarity of media objects of different modalities can therefore be used to resolve ambiguity and understand multimedia semantics better. At the same time, to meet users' need for cross-media queries, such as querying images with an audio example, a content-based cross-media retrieval method is of considerable value.
Summary of the invention
The object of the present invention is to provide a content-based method for multimedia document retrieval and cross-media retrieval, characterized in that it comprises the following steps:
1) understanding multimedia semantics based on multimodal information fusion analysis;
2) the user submits a media object, either already in the database or from outside the database, as a query example for retrieval;
3) performing second-pass retrieval according to the user's relevance feedback;
4) maintaining the multimedia semantic space according to the user's relevance feedback.
The steps for understanding multimedia semantics based on multimodal information fusion analysis are as follows:
1) for all audio clips in the database, extract four features, namely root mean square (RMS), rolloff frequency (Rolloff), zero-crossing rate (ZCR) and spectral centroid (Centroid); compute the pairwise distances between all audio clips with the dynamic time warping (DTW) algorithm, and normalize all distances;
2) for all image objects in the database, extract color and texture features, compute the pairwise Euclidean distances between all image objects, and normalize all distances;
3) vectorize all text media objects in the database with the term frequency/inverse document frequency (TF/IDF) method, compute the pairwise distances between all text media objects, and normalize all distances;
4) fuse and analyze, by a nonlinear method, the information carried by the audio, text and image objects in each multimedia document, thereby obtaining the pairwise distances between multimedia documents;
5) build a multimedia document relation graph. Each multimedia document is a vertex of the graph, and every two vertices are joined by a weighted edge whose weight is the distance, obtained in step 4, between the two corresponding multimedia documents;
6) reconstruct the multimedia document relation graph: first set a threshold and set the weight of every edge whose weight exceeds the threshold to infinity; then, for every edge, take the length of the shortest path between its two endpoints as its new weight;
7) project the multimedia document relation graph into the multimedia semantic space using multidimensional scaling (MDS). This space preserves the topological relations of the relation graph; every multimedia document has a unique coordinate in the space and is indexed by that coordinate, and every media object is indexed by the coordinate of the multimedia document it belongs to.
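As a rough illustration of step 1, the sketch below computes a DTW distance between two sequences of per-frame feature vectors and rescales a list of distances into [0, 1]. It is a minimal sketch, not the patented implementation: the function names, the Euclidean frame distance inside DTW, and the 3-sigma form of Gaussian normalization are all assumptions, since the text names the techniques but gives no formulas for them.

```python
import math

def dtw_distance(a, b):
    """DTW distance between two sequences of equal-dimension feature vectors."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # assumed frame-level distance
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def gaussian_normalize(values):
    """Map values to [0, 1] via (x - mean) / (3 * std), shifted and clipped (assumed form)."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        return [0.5] * len(values)
    return [min(1.0, max(0.0, ((v - mean) / (3 * std) + 1) / 2)) for v in values]
```

Each frame here would hold the four audio features (RMS, rolloff, ZCR, centroid); the same normalization could be applied separately to the image, audio and text distance lists.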
The steps by which the user submits a media object already in the database as a query example are as follows: first find the coordinate of this media object in the multimedia semantic space; then, using the coordinates of all media objects in the multimedia semantic space, compute the Euclidean distance between the query example and every other media object in that space; finally, sort all media objects by this distance in ascending order and return the nearest media objects of the target modality.
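The ranking just described can be sketched as follows. This is an illustrative sketch only: the data layout (a dictionary mapping each object to its modality and host document, and a dictionary of document coordinates) and the function name are assumptions, not structures defined by the patent.

```python
import math

def rank_media_objects(query_id, objects, doc_coords, target_modality, limit):
    """Return up to `limit` objects of the target modality, nearest first.

    objects:    object_id -> (modality, doc_id)   (hypothetical layout)
    doc_coords: doc_id -> coordinate tuple in the semantic space
    """
    # A media object inherits the coordinate of the document it belongs to.
    query_coord = doc_coords[objects[query_id][1]]
    candidates = [
        (math.dist(query_coord, doc_coords[doc_id]), obj_id)
        for obj_id, (modality, doc_id) in objects.items()
        if obj_id != query_id and modality == target_modality
    ]
    candidates.sort()  # ascending Euclidean distance
    return [obj_id for _, obj_id in candidates[:limit]]
```

Because the query and the results only share a coordinate space, not a feature space, an audio query can return images without any audio-to-image feature comparison.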
The steps by which the user submits a media object from outside the database as a query example are as follows:
1) find all media objects in the database that have the same modality as the query example, and compute the low-level feature distance between each of them and the query example;
2) according to the low-level feature distances, find the k media objects in the database closest to the query example, take the coordinate of the centroid of these media objects in the multimedia semantic space as the query example, and perform cross-media retrieval with the method described above for query examples already in the database.
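The two steps above amount to a k-nearest-neighbor lookup in feature space followed by a centroid in the semantic space. The sketch below is one possible reading; the names and data layout are hypothetical, and plain Euclidean distance stands in for whatever modality-specific feature distance is used (for audio the text specifies DTW instead).

```python
import math

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def place_external_query(query_features, db_features, object_doc, doc_coords, k):
    """Estimate a semantic-space coordinate for a query from outside the database.

    db_features: object_id -> feature vector (same modality as the query)
    object_doc:  object_id -> doc_id of the host multimedia document
    doc_coords:  doc_id -> coordinate tuple in the semantic space
    """
    nearest = sorted(
        db_features, key=lambda o: math.dist(query_features, db_features[o])
    )[:k]
    return centroid([doc_coords[object_doc[o]] for o in nearest])
```

The returned coordinate can then be used as the query point for ranking in the semantic space.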
The steps for second-pass retrieval according to the user's relevance feedback are as follows: after the query results are returned, the user evaluates them and marks some results they find satisfactory. The system takes the coordinate of the centroid, in the multimedia semantic space, of the media objects the user marked as positive examples as the query example, computes the Euclidean distance between this query example and all other media objects in the multimedia semantic space, sorts all media objects by this distance, and returns the nearest media objects of the target modality.
The steps for maintaining the multimedia semantic space according to the user's relevance feedback are as follows:
1) according to the history of the user's relevance feedback, periodically modify the multimedia document relation graph and rebuild the multimedia semantic space so that it reflects multimedia semantic relations more accurately;
2) according to the user's relevance feedback, map query examples from outside the database into the multimedia semantic space, thereby updating the database.
Compared with the background art, the present invention has the following beneficial effects:
The present invention proposes a new content-based retrieval method. Because the method adopts a multimodal information fusion mechanism and makes full use of the information carried by media objects of different modalities, it is better able to bridge the semantic gap and therefore achieves higher precision. The method also provides cross-media retrieval: by submitting an example of any form (image, text, audio or multimedia document), the user can query media objects or multimedia documents of any modality. The query example and the returned results can be of different modalities, so the method is functionally more powerful than traditional content-based retrieval systems.
Brief description of the drawings
Fig. 1 is a system framework diagram of the present invention;
Fig. 2 shows a first-pass retrieval result of the present invention: the first 9 results returned when the user queries images by submitting an audio clip of a car engine.
Detailed description
The present invention understands the semantics of multimedia documents through multimodal information fusion analysis and builds a unified index over all multimedia documents. Because a multimedia object of any modality can be indexed by the coordinate of the multimedia document it belongs to, this also yields a unified index over multimedia objects of different modalities, which enables both multimedia document retrieval and cross-media retrieval.
An example of the content-based retrieval method proposed by the present invention is shown in Fig. 1 and is described in detail below:
1) Preprocessing module: this module performs semantic understanding of the media objects in the database and builds the unified index. It comprises three main algorithms: feature extraction, multimodal information fusion, and construction of the multimedia semantic space. They are described as follows:
a. Feature extraction and similarity computation for multimedia objects: this algorithm extracts features from the media objects of each modality and computes low-level feature distances. For all image objects in the database, texture and color features are extracted and the pairwise Euclidean distances between all image objects are computed. For all audio objects, four features (root mean square, zero-crossing rate, rolloff frequency and spectral centroid) are extracted, and the pairwise distances between all audio objects are computed with the dynamic time warping (DTW) algorithm. For all text objects, the texts are vectorized with the TF/IDF method and the pairwise Euclidean distances between all text objects are computed. The image distances, audio distances and text distances are then each Gaussian-normalized.
b. Multimodal information fusion: this algorithm computes the distance between multimedia documents by fusing and analyzing the relations between the media objects inside them. For any two multimedia documents, the distances between the images, audio objects and text objects they contain are obtained from step a, and the minimum mindis and maximum maxdis of these distances are found. The multimedia document distance MMDdis is defined as MMDdis = λ × mindis + (α + ln(β × (maxdis - mindis) + 1)). If the two multimedia documents share only one modality of media object, then MMDdis = λ × mindis + A, where mindis is the distance for that modality. Here α, β, λ and A are constants that can be tuned to the size of the database and the distribution of the data. If the two multimedia documents share no modality, their distance is first set to infinity; in a later step the shortest-path length is used as the distance between them.
c. Construction of the multimedia document relation graph: a corresponding vertex is created for each multimedia document in the database; an edge is placed between every two vertices, and its weight is the distance between the two corresponding multimedia documents. The graph is then reconstructed as follows: the weight of every edge whose weight exceeds a certain threshold is set to infinity (the threshold can be set to the difference between the mean and the standard deviation of all edge weights); the length of a path is defined as the sum of the weights of the edges along it; and the weight of the edge between every two vertices is reset to the length of the shortest path between them.
d. Construction of the multimedia semantic space: construct a matrix D whose entry d_ij is the distance in the multimedia document relation graph from the i-th multimedia document to the j-th multimedia document; if the distance between two multimedia documents is infinite, d_ij is set to 1. Then, with D as input, the relation graph is projected by multidimensional scaling to obtain the multimedia semantic space. Each multimedia document has a corresponding coordinate in this space and also holds pointers to the media objects it contains.
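The projection of the distance matrix D can be illustrated with classical (Torgerson) multidimensional scaling. The text does not say which MDS variant is used, so the classical form, the power-iteration eigen-solver, and all names below are assumptions made to keep the sketch dependency-free; for hundreds of documents a numerical library's eigen-decomposition would be the practical choice.

```python
import math
import random

def _matvec(B, v):
    return [sum(b * x for b, x in zip(row, v)) for row in B]

def classical_mds(D, dims, iters=1000, seed=0):
    """Embed a symmetric distance matrix D into `dims` dimensions."""
    n = len(D)
    # Squared distances; infinite entries are replaced by 1, as the text describes.
    sq = [[(1.0 if math.isinf(d) else d) ** 2 for d in row] for row in D]
    row_mean = [sum(r) / n for r in sq]
    total_mean = sum(row_mean) / n
    # Double centering: B = -1/2 * J * D^2 * J
    B = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    coords = [[0.0] * dims for _ in range(n)]
    for axis in range(dims):
        # Power iteration for the dominant eigenvector of the deflated B.
        v = [rng.random() - 0.5 for _ in range(n)]
        for _ in range(iters):
            w = _matvec(B, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        eig = sum(x * y for x, y in zip(v, _matvec(B, v)))  # Rayleigh quotient
        if eig <= 1e-12:
            break  # no positive eigenvalue left; remaining axes stay zero
        scale = math.sqrt(eig)
        for i in range(n):
            coords[i][axis] = v[i] * scale
        # Deflate so the next pass finds the next eigenvector.
        B = [[B[i][j] - eig * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords
```

When the input distances are exactly Euclidean, the pairwise distances between the returned coordinates reproduce D; graph shortest-path distances are generally only approximated.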
2) Retrieval module: this module implements cross-media retrieval, including multimedia document retrieval, image retrieval, audio retrieval and text retrieval. The user can submit a multimedia document, image, audio clip or text as a query example to retrieve media objects or multimedia documents of any modality. The cases are as follows:
a. When the query example is a multimedia document already in the database, first find the document's coordinate in the multimedia semantic space, then find the k nearest neighbors of the query example in that space. If the user is retrieving multimedia documents, the k nearest neighbors are returned directly; if the user is retrieving images, the images belonging to the k nearest multimedia documents are returned; retrieval of audio or text works in the same way as image retrieval.
b. When the query example is a multimedia object (image, audio or text) already in the database, first find the multimedia document the query example belongs to, then use that multimedia document as the query example and retrieve as in step a.
c. When the query example is a multimedia document outside the database, compute the distance from the query example to every multimedia document in the database with the multimedia document distance method of the preprocessing module, and find the k nearest neighbors of the query example. If the user is retrieving multimedia documents, the k nearest neighbors are returned directly; if the user is retrieving images, the images contained in the k nearest neighbors are returned; retrieval of audio or text works in the same way as image retrieval.
d. When the query example is a multimedia object outside the database, first compute the feature-space distances between the query example and the multimedia objects of the same modality in the database and find the k nearest neighbors of the query example in feature space; then obtain the multimedia documents these k neighbors belong to and compute their centroid in the multimedia semantic space. This centroid is used as the query example, and retrieval proceeds as in step a.
e. After the results are returned, the user can evaluate them. The system then performs second-pass retrieval using the positive examples marked by the user: the centroid of the positive examples in the multimedia semantic space is taken as the query example, and retrieval proceeds as in step a.
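The second-pass retrieval in the last step can be sketched as below: take the centroid of the documents marked positive, rank all documents by distance to it, and collect objects of the requested modality from nearest to farthest. The names and the dictionary layout are hypothetical, and the text does not say whether the positive examples themselves are excluded from the second-pass results; this sketch keeps them.

```python
import math

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def second_pass(positive_doc_ids, doc_coords, doc_media, target_modality, limit):
    """Re-query from the centroid of the positively marked documents.

    doc_coords: doc_id -> coordinate tuple in the semantic space
    doc_media:  doc_id -> {modality: [object_id, ...]}   (hypothetical layout)
    """
    query = centroid([doc_coords[d] for d in positive_doc_ids])
    ranked = sorted(doc_coords, key=lambda d: math.dist(query, doc_coords[d]))
    results = []
    for doc_id in ranked:  # nearest document first
        for obj_id in doc_media.get(doc_id, {}).get(target_modality, []):
            if len(results) < limit:
                results.append(obj_id)
    return results
```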
3) Maintenance module: this module refines and rebuilds the multimedia semantic space and maps multimedia objects and multimedia documents from outside the database into it. It is described as follows:
a. A log file is maintained in the system to record the user's feedback on each retrieval, including the user's evaluation of each returned result. The system periodically modifies the multimedia document relation graph according to the log. Specifically, in the relation graph, the weight of the edge between multimedia documents that the user marked as positive examples in the same retrieval is multiplied by a number less than 1, and the weight of the edge between a multimedia document marked as a positive example and one marked as a negative example in the same retrieval is multiplied by a number greater than 1. If the retrieved items are multimedia objects, that is, the positive and negative examples marked by the user are multimedia objects, the weights of the edges between their host multimedia documents are modified in the same way. The multimedia semantic space is then recomputed by projection.
b. When the query example submitted by the user is a media object or multimedia document outside the database, the system can use the user's relevance feedback to map the query example into the multimedia semantic space automatically, thereby extending the data set. Specifically: if the returned results are multimedia documents, first compute the centroid, in the multimedia semantic space, of the multimedia documents the user marked as positive examples; then take the three positive examples nearest to this centroid, compute the centroid of these three, and use it as the coordinate of the new query example in the multimedia semantic space. If the returned results are media objects, first compute the centroid, in the multimedia semantic space, of the multimedia documents to which the positively marked multimedia objects belong; then take the three positive examples nearest to this centroid, compute the centroid of these three, and use it as the coordinate of the new query example in the multimedia semantic space.
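Both maintenance operations are simple enough to sketch. In the code below, the multipliers 0.9 and 1.1 are placeholder assumptions (the text only requires one factor below 1 and one above 1), and the function names and matrix layout are hypothetical.

```python
import math

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def apply_feedback(weights, positives, negatives, shrink=0.9, grow=1.1):
    """Adjust a symmetric edge-weight matrix in place from one logged retrieval.

    positives / negatives: indices of documents marked positive / negative.
    shrink < 1 pulls positive pairs together; grow > 1 pushes
    positive-negative pairs apart. Both values are assumed.
    """
    for a in positives:
        for b in positives:
            if a != b:
                weights[a][b] *= shrink  # visits (a, b) and (b, a) once each
        for b in negatives:
            weights[a][b] *= grow
            weights[b][a] *= grow

def place_new_example(positive_coords):
    """Coordinate for an out-of-database query: the centroid of the three
    positive examples closest to the centroid of all positive examples."""
    center = centroid(positive_coords)
    nearest = sorted(positive_coords, key=lambda p: math.dist(p, center))[:3]
    return centroid(nearest)
```

Restricting the final centroid to the three nearest positives keeps one outlying positive example from dragging the new coordinate away from the main cluster.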
Example:
Suppose there are 900 multimedia documents, made up of 900 images, 300 audio clips and 700 texts. First, low-level features are extracted from all images, including the RGB color histogram, the color coherence vector and Tamura texture features, and the pairwise distances between all images are computed. For the audio clips, four features (root mean square, zero-crossing rate, rolloff frequency and spectral centroid) are extracted, and the pairwise distances between all audio objects are computed with the dynamic time warping (DTW) algorithm. For the texts, the pairwise distances between text objects are computed after TF/IDF vectorization. Once the media object distances have been computed, the image distances, text distances and audio distances are each normalized. Then, for any two multimedia documents A and B, the distances between the text, audio and image objects belonging to the two documents are found, and their maximum maxdis and minimum mindis are computed. If the two multimedia documents share only two modalities, for example audio and image, then maxdis and mindis are the larger and the smaller of the audio distance and the image distance; other cases are handled analogously. For example, if multimedia document A contains an image, a text and an audio clip while multimedia document B contains only an image and an audio object, then maxdis and mindis for these two documents are the larger and the smaller of the audio distance and the image distance. After maxdis and mindis have been computed, the multimedia document distance is computed with the formula MMDdis = mindis + (0.1 + ln(0.3 × (maxdis - mindis) + 1)). If the two multimedia documents share only one modality, their distance is set to the distance for that modality plus 0.1. For example, if multimedia document A contains only an image and an audio clip while multimedia document B contains only an audio clip and a text, their distance is set to the audio distance plus 0.1. If the two multimedia documents share no modality, their distance is first set to infinity; in a later step the shortest-path length is used as the distance between them. After the multimedia document distances have been computed, a weighted graph is built from them. Each multimedia document corresponds to a vertex of the graph, there is an edge between every two vertices, and the weight of the edge is the distance between the two corresponding multimedia documents. After the graph has been built, all weights greater than 0.35 are set to infinity; then, for all vertices, the pairwise shortest distances are found with Dijkstra's algorithm and used as the new weights of the edges between the vertices. A matrix D is constructed in which D_ij is the distance from multimedia document i to multimedia document j; if the distance between the two documents is infinite, D_ij is set to 1. D is then projected with multidimensional scaling (MDS) to obtain a 20-dimensional multimedia semantic space, in which each multimedia document has a 20-dimensional coordinate. Note that this construction of the multimedia semantic space is carried out offline.
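The fusion formula and the graph reconstruction in this example can be sketched together. The constants 0.1, 0.3 and 0.35 come from the example above and λ = 1 is implied by its formula; the function names and the list/matrix layout are assumptions of this sketch.

```python
import heapq
import math

def mmd_distance(shared, lam=1.0, alpha=0.1, beta=0.3, single_offset=0.1):
    """Distance between two multimedia documents.

    shared: normalized distances, one per modality present in both documents.
    """
    if not shared:
        return math.inf  # no common modality; resolved later by shortest paths
    if len(shared) == 1:
        return lam * shared[0] + single_offset
    lo, hi = min(shared), max(shared)
    return lam * lo + (alpha + math.log(beta * (hi - lo) + 1))

def rebuild_weights(weights, threshold=0.35):
    """Drop edges heavier than `threshold`, then return all-pairs
    shortest-path lengths computed with Dijkstra's algorithm."""
    n = len(weights)
    adj = [[(j, w) for j, w in enumerate(row) if j != i and w <= threshold]
           for i, row in enumerate(weights)]
    result = []
    for src in range(n):
        dist = [math.inf] * n
        dist[src] = 0.0
        heap = [(0.0, src)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue  # stale heap entry
            for v, w in adj[u]:
                if d + w < dist[v]:
                    dist[v] = d + w
                    heapq.heappush(heap, (dist[v], v))
        result.append(dist)
    return result
```

Documents that remain unreachable after the rebuild keep an infinite distance, which is the case the example handles by setting D_ij to 1 before projection.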
Fig. 2 shows the first 9 results returned when the user queries images by submitting an audio clip of a car engine. The retrieval proceeds as follows: when the user submits the car engine audio as the query example, the system first finds the coordinate, in the multimedia semantic space, of the multimedia document this audio file belongs to; it then sorts all multimedia documents in the database in ascending order of their distance to that document; then, going from nearest to farthest, it checks whether each multimedia document contains an image. If it does, the image is returned to the user as a result; if not, the system moves on to the next multimedia document, until the number of returned images reaches the number specified by the user. As Fig. 2 shows, the query results are quite accurate, which indicates that the proposed method can bridge the semantic gap effectively and understand multimedia semantics well, giving higher accuracy. Moreover, although the submitted query example is an audio clip and the returned results are images, the query example and the results are semantically consistent, which shows that the invention has good cross-media retrieval capability.
As the example above shows, because the present invention uses a multimodal information fusion mechanism to understand multimedia semantics, it understands multimedia semantics more accurately than traditional content-based multimedia retrieval and achieves higher retrieval accuracy. The invention also supports cross-media retrieval: a query example of any modality can be used to retrieve results of any modality (for example, retrieving images with audio), so it is functionally more powerful than traditional content-based multimedia retrieval.

Claims (6)

  1. A cross-media retrieval method based on multimodal information fusion analysis, characterized in that it comprises the following steps:
    1) performing multimodal information fusion analysis to understand multimedia semantics;
    2) the user submits a media object, either already in the database or from outside the database, as a query example for retrieval;
    3) performing second-pass retrieval according to the user's relevance feedback;
    4) maintaining the multimedia semantic space according to the user's relevance feedback.
  2. The cross-media retrieval method based on multimodal information fusion analysis according to claim 1, characterized in that the steps of performing multimodal information fusion analysis to understand multimedia semantics are as follows:
    1) for all audio clips in the database, extract four features, namely root mean square, rolloff frequency, zero-crossing rate and spectral centroid; compute the pairwise distances between all audio clips with the dynamic time warping algorithm, and normalize all distances;
    2) for all image objects in the database, extract color and texture features, compute the pairwise Euclidean distances between all image objects, and normalize all distances;
    3) vectorize all text media objects in the database with the term frequency/inverse document frequency method, compute the pairwise distances between all text media objects, and normalize all distances;
    4) fuse and analyze, by a nonlinear method, the information carried by the audio, text and image objects in each multimedia document, thereby obtaining the pairwise distances between multimedia documents;
    5) build a multimedia document relation graph. Each multimedia document is a vertex of the graph, and every two vertices are joined by a weighted edge whose weight is the distance, obtained in step 4, between the two corresponding multimedia documents;
    6) reconstruct the multimedia document relation graph: first set a threshold and set the weight of every edge whose weight exceeds the threshold to infinity; then, for every edge, take the length of the shortest path between its two endpoints as its new weight;
    7) project the multimedia document relation graph into the multimedia semantic space using multidimensional scaling. This space preserves the topological relations of the relation graph; every multimedia document has a unique coordinate in the space and is indexed by that coordinate, and every media object is indexed by the coordinate of the multimedia document it belongs to.
  3. The cross-media retrieval method based on multimodal information fusion analysis according to claim 1, characterized in that the steps by which the user submits a media object already in the database as a query example are: first find the coordinate of this media object in the multimedia semantic space; then, using the coordinates of all media objects in the multimedia semantic space, compute the Euclidean distance between the query example and every other media object in that space; sort all media objects by this distance and return the nearest media objects of the target modality.
  4. The cross-media retrieval method based on multi-modal information fusion analysis according to claim 1, characterized in that the steps by which the user submits a media object from outside the database as the query example are as follows:
    1) find all media objects in the database that have the same modality as the query example, and calculate the low-level feature distance between each of these media objects and the query example;
    2) according to the low-level feature distance, find the k media objects in the database closest to the query example, take the coordinate of the centroid of these media objects in the multimedia semantic space as the retrieval example, and perform cross-media retrieval by the method of claim 3.
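The two steps above can be sketched as a small routine that places an out-of-database query in the semantic space. The claim does not fix the low-level feature distance, so plain Euclidean distance is an assumption here; the function names, feature vectors and coordinates are likewise hypothetical.

```python
import math

def centroid(points):
    """Component-wise mean of a list of equal-length coordinate tuples."""
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dims))

def place_external_query(query_feature, query_modality, features, modality, coords, k=2):
    """Return a semantic-space coordinate for a query that is not in the database."""
    # Step 1: compare only against database objects of the same modality.
    same_modality = [m for m in features if modality[m] == query_modality]
    # Step 2: keep the k nearest by low-level feature distance (Euclidean assumed).
    nearest = sorted(
        same_modality, key=lambda m: math.dist(query_feature, features[m])
    )[:k]
    # The centroid of their semantic coordinates stands in for the query.
    return centroid([coords[m] for m in nearest])

# Hypothetical low-level features and semantic coordinates.
features = {"img_a": (0.9, 0.1), "img_b": (0.8, 0.2), "img_c": (0.1, 0.9), "snd_a": (0.9, 0.1)}
modality = {"img_a": "image", "img_b": "image", "img_c": "image", "snd_a": "sound"}
coords = {"img_a": (0.0, 0.0), "img_b": (0.2, 0.4), "img_c": (3.0, 3.0), "snd_a": (0.1, 0.1)}

query_point = place_external_query((0.85, 0.15), "image", features, modality, coords, k=2)
```

The returned coordinate can then be ranked against the database objects by Euclidean distance, as claim 3 describes.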
  5. The cross-media retrieval method based on multi-modal information fusion analysis according to claim 1, characterized in that the steps of performing a second retrieval according to the user's relevance feedback are as follows: after the query results are returned, the user evaluates them and marks some results that he or she approves of; the system takes the coordinate, in the multimedia semantic space, of the centroid of the media objects that the user marked as positive examples as the retrieval example, calculates the Euclidean distance between this retrieval example and all other media objects in the multimedia semantic space, sorts all media objects by this distance, and returns the nearest media objects of the target modality.
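A minimal sketch of this second retrieval pass follows. The function name `requery_with_feedback`, the `top_n` cut-off and the example data are hypothetical, and the sketch assumes that objects marked as positive are not removed from the candidate set, a point the claim leaves open.

```python
import math

def requery_with_feedback(positive_ids, coords, modality, target, top_n=3):
    """Re-rank objects of modality `target` around the centroid of the
    objects the user marked as positive examples."""
    points = [coords[m] for m in positive_ids]
    # Centroid of the positive examples becomes the new retrieval example.
    center = tuple(sum(axis) / len(points) for axis in zip(*points))
    candidates = [m for m in coords if modality[m] == target]
    candidates.sort(key=lambda m: math.dist(center, coords[m]))
    return candidates[:top_n]

# Hypothetical semantic-space coordinates and modalities.
coords = {
    "snd_roar": (0.1, 0.0),
    "snd_growl": (0.3, 0.0),
    "img_tiger": (0.0, 0.0),
    "img_lion": (0.25, 0.05),
    "img_bird": (2.0, 1.1),
}
modality = {
    "snd_roar": "sound",
    "snd_growl": "sound",
    "img_tiger": "image",
    "img_lion": "image",
    "img_bird": "image",
}
# The user approved two sounds and now asks for images.
second_pass = requery_with_feedback(["snd_roar", "snd_growl"], coords, modality, "image")
```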
  6. The cross-media retrieval method based on multi-modal information fusion analysis according to claim 1, characterized in that the steps of maintaining the multimedia semantic space according to the user's relevance feedback are as follows:
    1) according to the history of the user's relevance feedback, periodically and dynamically modify the multimedia document association graph and rebuild the multimedia semantic space, so that it reflects the semantic relations among multimedia more accurately;
    2) according to the user's relevance feedback, map query examples from outside the database into the multimedia semantic space, thereby updating the database.
CNB2006100533923A 2006-09-14 2006-09-14 Transmedia search method based on multi-mode information convergence analysis Expired - Fee Related CN100388282C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2006100533923A CN100388282C (en) 2006-09-14 2006-09-14 Transmedia search method based on multi-mode information convergence analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2006100533923A CN100388282C (en) 2006-09-14 2006-09-14 Transmedia search method based on multi-mode information convergence analysis

Publications (2)

Publication Number Publication Date
CN1920818A true CN1920818A (en) 2007-02-28
CN100388282C CN100388282C (en) 2008-05-14

Family

ID=37778544

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2006100533923A Expired - Fee Related CN100388282C (en) 2006-09-14 2006-09-14 Transmedia search method based on multi-mode information convergence analysis

Country Status (1)

Country Link
CN (1) CN100388282C (en)

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101923561A (en) * 2010-05-24 2010-12-22 中国科学技术信息研究所 Automatic document classifying method
WO2011001002A1 (en) * 2009-06-30 2011-01-06 Nokia Corporation A method, devices and a service for searching
CN101984424A (en) * 2010-10-26 2011-03-09 浙江工商大学 Mass inter-media index method
CN101599062B (en) * 2008-06-06 2011-06-15 佛山市顺德区顺达电脑厂有限公司 Search method and search system
CN102129477A (en) * 2011-04-23 2011-07-20 山东大学 Multimode-combined image reordering method
CN101441649B (en) * 2007-11-21 2011-09-21 株式会社日立制作所 Spoken document retrieval system
CN102262670A (en) * 2011-07-29 2011-11-30 中山大学 Cross-media information retrieval system and method based on mobile visual equipment
CN102289430A (en) * 2011-06-29 2011-12-21 北京交通大学 Method for analyzing latent semantics of fusion probability of multi-modality data
CN102693321A (en) * 2012-06-04 2012-09-26 常州南京大学高新技术研究院 Cross-media information analysis and retrieval method
CN103049526A (en) * 2012-12-20 2013-04-17 中国科学院自动化研究所 Cross-media retrieval method based on double space learning
CN103164539A (en) * 2013-04-15 2013-06-19 中国传媒大学 Interactive type image retrieval method of combining user evaluation and labels
CN101996191B (en) * 2009-08-14 2013-08-07 北京大学 Method and system for searching for two-dimensional cross-media element
CN103473327A (en) * 2013-09-13 2013-12-25 广东图图搜网络科技有限公司 Image retrieval method and image retrieval system
CN103559191A (en) * 2013-09-10 2014-02-05 浙江大学 Cross-media sorting method based on hidden space learning and two-way sorting learning
CN103995804A (en) * 2013-05-20 2014-08-20 中国科学院计算技术研究所 Cross-media topic detection method and device based on multimodal information fusion and graph clustering
CN104268140A (en) * 2014-07-31 2015-01-07 浙江大学 Image retrieval method based on weight learning hypergraphs and multivariate information combination
CN104317834A (en) * 2014-10-10 2015-01-28 浙江大学 Cross-media sorting method based on deep neural network
CN104346450A (en) * 2014-10-29 2015-02-11 浙江大学 Cross-media ordering method based on multi-modal implicit coupling expression
CN106021463A (en) * 2016-05-17 2016-10-12 北京百度网讯科技有限公司 Method for providing intelligent services on basis of artificial intelligence, intelligent service system and intelligent terminal
CN106446524A (en) * 2016-08-31 2017-02-22 北京智能管家科技有限公司 Intelligent hardware multimodal cascade modeling method and apparatus
CN103870500B (en) * 2012-12-14 2017-05-24 联想(北京)有限公司 Searching method and searching device
CN107766571A (en) * 2017-11-08 2018-03-06 北京大学 The search method and device of a kind of multimedia resource
CN108319686A (en) * 2018-02-01 2018-07-24 北京大学深圳研究生院 Antagonism cross-media retrieval method based on limited text space
US10339146B2 (en) 2014-11-25 2019-07-02 Samsung Electronics Co., Ltd. Device and method for providing media resource
CN111782921A (en) * 2020-03-25 2020-10-16 北京沃东天骏信息技术有限公司 Method and device for searching target

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI553494B (en) * 2015-11-04 2016-10-11 創意引晴股份有限公司 Multi-modal fusion based Intelligent fault-tolerant video content recognition system and recognition method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7610306B2 (en) * 2003-06-30 2009-10-27 International Business Machines Corporation Multi-modal fusion in content-based retrieval

Cited By (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101441649B (en) * 2007-11-21 2011-09-21 株式会社日立制作所 Spoken document retrieval system
CN101599062B (en) * 2008-06-06 2011-06-15 佛山市顺德区顺达电脑厂有限公司 Search method and search system
WO2011001002A1 (en) * 2009-06-30 2011-01-06 Nokia Corporation A method, devices and a service for searching
CN101996191B (en) * 2009-08-14 2013-08-07 北京大学 Method and system for searching for two-dimensional cross-media element
CN101923561A (en) * 2010-05-24 2010-12-22 中国科学技术信息研究所 Automatic document classifying method
CN101984424A (en) * 2010-10-26 2011-03-09 浙江工商大学 Mass inter-media index method
CN102129477B (en) * 2011-04-23 2013-01-09 山东大学 Multimode-combined image reordering method
CN102129477A (en) * 2011-04-23 2011-07-20 山东大学 Multimode-combined image reordering method
CN102289430A (en) * 2011-06-29 2011-12-21 北京交通大学 Method for analyzing latent semantics of fusion probability of multi-modality data
CN102289430B (en) * 2011-06-29 2013-11-13 北京交通大学 Method for analyzing latent semantics of fusion probability of multi-modality data
CN102262670A (en) * 2011-07-29 2011-11-30 中山大学 Cross-media information retrieval system and method based on mobile visual equipment
CN102693321A (en) * 2012-06-04 2012-09-26 常州南京大学高新技术研究院 Cross-media information analysis and retrieval method
CN103870500B (en) * 2012-12-14 2017-05-24 联想(北京)有限公司 Searching method and searching device
CN103049526A (en) * 2012-12-20 2013-04-17 中国科学院自动化研究所 Cross-media retrieval method based on double space learning
CN103049526B (en) * 2012-12-20 2015-08-05 中国科学院自动化研究所 Based on the cross-media retrieval method of double space study
CN103164539A (en) * 2013-04-15 2013-06-19 中国传媒大学 Interactive type image retrieval method of combining user evaluation and labels
CN103995804A (en) * 2013-05-20 2014-08-20 中国科学院计算技术研究所 Cross-media topic detection method and device based on multimodal information fusion and graph clustering
CN103995804B (en) * 2013-05-20 2017-02-01 中国科学院计算技术研究所 Cross-media topic detection method and device based on multimodal information fusion and graph clustering
CN103559191A (en) * 2013-09-10 2014-02-05 浙江大学 Cross-media sorting method based on hidden space learning and two-way sorting learning
CN103559191B (en) * 2013-09-10 2016-09-14 浙江大学 Based on latent space study and Bidirectional sort study across media sort method
CN103473327A (en) * 2013-09-13 2013-12-25 广东图图搜网络科技有限公司 Image retrieval method and image retrieval system
CN104268140B (en) * 2014-07-31 2017-06-23 浙江大学 Image search method based on weight self study hypergraph and multivariate information fusion
CN104268140A (en) * 2014-07-31 2015-01-07 浙江大学 Image retrieval method based on weight learning hypergraphs and multivariate information combination
CN104317834B (en) * 2014-10-10 2017-09-29 浙江大学 A kind of across media sort methods based on deep neural network
CN104317834A (en) * 2014-10-10 2015-01-28 浙江大学 Cross-media sorting method based on deep neural network
CN104346450B (en) * 2014-10-29 2017-06-23 浙江大学 A kind of across media sort methods based on multi-modal recessive coupling expression
CN104346450A (en) * 2014-10-29 2015-02-11 浙江大学 Cross-media ordering method based on multi-modal implicit coupling expression
US10339146B2 (en) 2014-11-25 2019-07-02 Samsung Electronics Co., Ltd. Device and method for providing media resource
CN106021463A (en) * 2016-05-17 2016-10-12 北京百度网讯科技有限公司 Method for providing intelligent services on basis of artificial intelligence, intelligent service system and intelligent terminal
CN106021463B (en) * 2016-05-17 2019-07-09 北京百度网讯科技有限公司 Method, intelligent service system and the intelligent terminal of intelligent Service are provided based on artificial intelligence
US11651002B2 (en) 2016-05-17 2023-05-16 Beijing Baidu Netcom Science And Technology Co., Ltd. Method for providing intelligent service, intelligent service system and intelligent terminal based on artificial intelligence
CN106446524A (en) * 2016-08-31 2017-02-22 北京智能管家科技有限公司 Intelligent hardware multimodal cascade modeling method and apparatus
CN107766571A (en) * 2017-11-08 2018-03-06 北京大学 The search method and device of a kind of multimedia resource
CN108319686A (en) * 2018-02-01 2018-07-24 北京大学深圳研究生院 Antagonism cross-media retrieval method based on limited text space
CN108319686B (en) * 2018-02-01 2021-07-30 北京大学深圳研究生院 Antagonism cross-media retrieval method based on limited text space
CN111782921A (en) * 2020-03-25 2020-10-16 北京沃东天骏信息技术有限公司 Method and device for searching target

Also Published As

Publication number Publication date
CN100388282C (en) 2008-05-14

Similar Documents

Publication Publication Date Title
CN100388282C (en) Transmedia search method based on multi-mode information convergence analysis
CN111680173B (en) CMR model for unified searching cross-media information
US8341112B2 (en) Annotation by search
US10445359B2 (en) Method and system for classifying media content
CN110442777A (en) Pseudo-linear filter model information search method and system based on BERT
CA2917153C (en) Method and system for simplifying implicit rhetorical relation prediction in large scale annotated corpus
Huang et al. A patent keywords extraction method using TextRank model with prior public knowledge
Jin et al. Entity linking at the tail: sparse signals, unknown entities, and phrase models
CN106528846A (en) Retrieval method and device
Bokhari et al. Multimodal information retrieval: Challenges and future trends
Yazici et al. An intelligent multimedia information system for multimodal content extraction and querying
US20070112839A1 (en) Method and system for expansion of structured keyword vocabulary
CN117932000A (en) Long document dense retrieval method and system based on topic clustering global features
Allani et al. Pattern graph-based image retrieval system combining semantic and visual features
Lu et al. A novel approach towards large scale cross-media retrieval
Yang et al. Exploring word similarity to improve chinese personal name disambiguation
Li et al. Sparse constraint nearest neighbour selection in cross-media retrieval
Sallaberry et al. Towards an IE and IR System Dealing with Spatial Information in Digital Libraries-Evaluation Case Study.
JP2011159100A (en) Successive similar document retrieval apparatus, successive similar document retrieval method and program
EP1876539A1 (en) Method and system for classifying media content
Patel et al. Recent Trends of Information Retrieval System: Review Based on IR Models and Applications
Ji et al. Vocabulary hierarchy optimization and transfer for scalable image search
Jin et al. Curator: Efficient Indexing for Multi-Tenant Vector Databases
Dobrescu et al. Multi-modal CBIR algorithm based on Latent Semantic Indexing
Peng Quantization to speedup approximate nearest neighbor search

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20080514

Termination date: 20120914