CN107766571A - Method and device for retrieving multimedia resources - Google Patents

Method and device for retrieving multimedia resources

Info

Publication number
CN107766571A
Authority
CN
China
Prior art keywords
multimedia resource
inquiry request
search library
information
multimedia
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711108216.XA
Other languages
Chinese (zh)
Other versions
CN107766571B (en)
Inventor
柳军飞
麻志毅
杨寒
李宏强
孙博
范红杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Application filed by Peking University
Priority to CN201711108216.XA
Publication of CN107766571A
Application granted
Publication of CN107766571B
Current legal status: Expired - Fee Related

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/7867 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/43 Querying

Abstract

The invention discloses a method and device for retrieving multimedia resources. The method comprises: receiving an inquiry request sent by a user; retrieving in a multimedia resource search library according to the inquiry request, and returning a retrieval result; wherein the multimedia resource search library stores multi-modal information of multiple multimedia resources. With the invention, multimedia resources that meet the search condition can be retrieved more fully, so that the retrieval requirements for multimedia resources are better met.

Description

Method and device for retrieving multimedia resources
Technical field
The present invention relates to the field of video retrieval, and in particular to a method and device for retrieving multimedia resources.
Background
With the rapid development of Internet technology and the significant increase in network bandwidth, the multimedia resources (videos) stored on the Internet are growing explosively. Among these massive multimedia resources there is no lack of valuable resources with great commercial value. How to retrieve efficiently among massive multimedia resources (videos) has become the key to using multimedia video resources effectively and maximizing their value.
At present, retrieval of multimedia resources (videos) mainly relies on keyword-based retrieval of the inventory information of the multimedia resources (videos). Different multimedia resource manufacturers typically define the inventory information of multimedia resources according to their own needs; therefore, the information contained in the inventory information is often limited or one-sided. Retrieval based on inventory information alone cannot meet retrieval requirements well, and many useful multimedia resources are missed.
Summary of the invention
In view of this, the object of the present invention is to propose a method and device for retrieving multimedia resources that can retrieve the multimedia resources meeting the search condition more fully, so as to better meet the retrieval requirements for multimedia resources.
Based on the above object, the present invention provides a method for retrieving multimedia resources, comprising:
receiving an inquiry request sent by a user;
retrieving in a multimedia resource search library according to the inquiry request, and returning a retrieval result;
wherein the multimedia resource search library stores multi-modal information of multiple multimedia resources.
Preferably, the multimedia resource search library also stores the inventory information of each multimedia resource.
Wherein the multi-modal information of the multimedia resource includes text information; and
the text information is stored in the multimedia resource search library in advance as follows:
identifying text information from the video of the multimedia resource;
storing the identified text information in the multimedia resource search library.
Wherein the multi-modal information of the multimedia resource includes voice information; wherein the voice information is stored in the multimedia resource search library in advance in compressed audio coding form and/or written form as follows:
extracting audio from the multimedia resource, converting it to text content by speech recognition, and storing the converted text content in the multimedia resource search library as the voice information in written form of the multimedia resource; and/or
extracting audio from the multimedia resource, further extracting features of the audio, and compression-encoding the extracted audio features to obtain the voice information in compressed audio coding form of the multimedia resource.
Wherein the multi-modal information of the multimedia resource includes image information; wherein the image information is stored in the multimedia resource search library in advance in pixel compressed encoding form and/or written form as follows:
extracting key frames from the video of the multimedia resource, performing image content description and/or image object labeling on the key frames, and storing the text content obtained from the image content description and/or the text content obtained from the image object labeling in the multimedia resource search library as the image information in written form of the multimedia resource; and/or
extracting key frames from the video of the multimedia resource, extracting the picture pixel features of the key frames and compression-encoding them to obtain the image information in pixel compressed encoding form of the multimedia resource, and storing it in the multimedia resource search library.
Wherein the retrieving in the multimedia resource search library according to the inquiry request comprises:
analyzing the inquiry request to obtain the keyword set K of the inquiry request;
expanding the keyword set K to obtain the expanded keyword set K';
retrieving in the multimedia resource search library according to the expanded keyword set K'.
Alternatively, the retrieving in the multimedia resource search library according to the inquiry request comprises:
analyzing the inquiry request to obtain the audio fragment in the inquiry request;
retrieving, according to the audio fragment, in the audio information in compressed audio coding form in the multimedia resource search library.
Alternatively, the retrieving in the multimedia resource search library according to the inquiry request comprises:
analyzing the inquiry request to obtain the picture in the inquiry request;
retrieving, according to the picture, in the image information in pixel compressed encoding form in the multimedia resource search library.
Further, after the retrieving in the multimedia resource search library according to the inquiry request, the method also comprises:
for the same multimedia resource, obtaining the matching degrees with which the inventory information of the multimedia resource and the information of each modality respectively correspond to the inquiry request;
taking a weighted average of the matching degrees of the inventory information and of the information of each modality with respect to the inquiry request, and using the resulting weighted average as the score with which the multimedia resource matches the inquiry request;
sorting the multimedia resources in descending order of score;
using the sorted list of multimedia resources as the retrieval result.
The present invention also provides a device for retrieving multimedia resources, comprising:
a multimedia resource search library, for storing the multi-modal information of multiple multimedia resources;
an inquiry request receiving module, for receiving an inquiry request sent by a user;
a retrieval module, for retrieving in the multimedia resource search library according to the inquiry request, and returning a retrieval result.
Further, the multimedia resource search library also stores the inventory information of each multimedia resource.
Wherein the multi-modal information of the multimedia resource includes at least one of the following: text information, voice information, image information; wherein the voice information is stored in the multimedia resource search library in advance in compressed audio coding form and/or written form, and the image information is stored in the multimedia resource search library in advance in pixel compressed encoding form and/or written form.
Further, the device also comprises a multi-modal information storage module; and
the multi-modal information storage module comprises at least one of the following units:
a text information storage unit, for identifying text information from the video of the multimedia resource and storing the identified text information in the multimedia resource search library;
a voice information storage unit, for extracting audio from the multimedia resource, converting it to text content by speech recognition, and storing the converted text content in the multimedia resource search library as the voice information in written form of the multimedia resource; and/or for extracting audio from the multimedia resource, further extracting features of the audio, compression-encoding the extracted audio features to obtain the voice information in compressed audio coding form of the multimedia resource, and storing that voice information in the multimedia resource search library;
an image information storage unit, for extracting key frames from the video of the multimedia resource, performing image content description and/or image object labeling on the key frames, and storing the text content obtained from the image content description and/or the image object labeling in the multimedia resource search library as the image information in written form of the multimedia resource; and/or for extracting key frames from the video of the multimedia resource, extracting the picture pixel features of the key frames and compression-encoding them to obtain the image information in pixel compressed encoding form of the multimedia resource, and storing it in the multimedia resource search library.
In the technical solution of the present invention, the multi-modal information of multimedia resources is stored in the multimedia resource search library, and retrieval is performed in the multimedia resource search library according to the inquiry request. Retrieval can therefore be based on information richer than the inventory information, so that multimedia resources meeting the search condition are retrieved more fully and the retrieval requirements for multimedia resources are better met.
Brief description of the drawings
Fig. 1 is a flow chart of a method for retrieving multimedia resources according to an embodiment of the present invention;
Fig. 2 is a flow chart of a method for obtaining and storing the text information of a multimedia resource according to an embodiment of the present invention;
Fig. 3 is a flow chart of a method for obtaining and storing the voice information of a multimedia resource according to an embodiment of the present invention;
Fig. 4 is a flow chart of a method for obtaining and storing the image information of a multimedia resource according to an embodiment of the present invention;
Fig. 5 is a block diagram of the internal structure of a device for retrieving multimedia resources according to an embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in more detail below with reference to specific embodiments and the accompanying drawings.
Embodiments of the invention are described in detail below, and examples of the embodiments are shown in the drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, are only used to explain the present invention, and are not to be construed as limiting the claims.
Those skilled in the art will appreciate that, unless expressly stated otherwise, the singular forms "a", "an", "said" and "the" used herein may also include the plural forms. The wording "and/or" used herein includes all or any unit and all combinations of one or more of the associated listed items.
It should be noted that all uses of "first" and "second" in the embodiments of the present invention are intended to distinguish two non-identical entities or non-identical parameters that share the same name. "First" and "second" are used only for convenience of expression and should not be construed as limiting the embodiments of the present invention; subsequent embodiments do not explain this again.
The inventors considered that a multimedia resource (video) contains multi-modal information, such as text, voice and images. If this information is used in retrieval, the multimedia resources meeting the search condition can be retrieved more fully, so that the retrieval requirements for multimedia resources are better met.
The technical solution of the present invention is described in detail below with reference to the drawings.
Based on the above idea, in order to use the multi-modal information of multimedia resources in retrieval, the technical solution of the embodiments of the present invention first pre-processes the stored multimedia resources, extracts multi-modal information from them, and stores it in the multimedia resource search library. In the multimedia resource search library provided by the embodiments of the present invention, the multi-modal information of each multimedia resource may include at least one of the following: text information, voice information, image information. The multi-modal information of the multimedia resources is stored in the multimedia resource search library in advance, wherein the voice information is stored in compressed audio coding form and/or written form, and the image information is stored in pixel compressed encoding form and/or written form. How the multi-modal information is obtained and stored is described in detail later. More preferably, the inventory information of the multimedia resources may also be stored in the multimedia resource search library.
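As a rough illustration of what one entry of such a search library might hold, the following Python sketch models a record with the optional inventory information and the three modalities in their written and compressed forms. The class names, field names and the in-memory dictionary are assumptions made for illustration only; the patent does not prescribe a storage layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ResourceRecord:
    """One search-library entry (hypothetical layout, not from the patent)."""
    resource_id: str
    inventory_info: str = ""                                  # optional inventory information
    text_info: List[str] = field(default_factory=list)        # text recognized in the video
    voice_text: str = ""                                      # voice information, written form
    voice_code: Optional[bytes] = None                        # voice information, compressed audio coding form
    image_text: List[str] = field(default_factory=list)       # image descriptions / object labels
    image_codes: List[bytes] = field(default_factory=list)    # pixel compressed codes, one per key frame

    def modalities(self) -> List[str]:
        """Names of the modalities for which this record actually holds data."""
        present = []
        if self.text_info:
            present.append("text")
        if self.voice_text or self.voice_code is not None:
            present.append("voice")
        if self.image_text or self.image_codes:
            present.append("image")
        return present


class SearchLibrary:
    """Minimal in-memory stand-in for the multimedia resource search library."""

    def __init__(self) -> None:
        self._records: Dict[str, ResourceRecord] = {}

    def store(self, record: ResourceRecord) -> None:
        # The description requires at least one kind of multi-modal information.
        if not record.modalities():
            raise ValueError("a record needs at least one kind of multi-modal information")
        self._records[record.resource_id] = record

    def get(self, resource_id: str) -> ResourceRecord:
        return self._records[resource_id]
```

Keeping the written and compressed forms as separate optional fields mirrors the "and/or" wording of the description: a resource may carry either form, both, or neither for a given modality.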
Based on the above multimedia resource search library, the flow of a method for retrieving multimedia resources provided by an embodiment of the present invention is shown in Fig. 1 and comprises the following steps:
S101: Receive the inquiry request sent by the user.
In this step, the received inquiry request may contain keywords to be queried, an audio fragment to be queried, or a picture to be queried.
S102: Retrieve in the multimedia resource search library according to the inquiry request.
In this step, for an inquiry request containing keywords to be queried, the inquiry request may first be analyzed to obtain the keyword set K of the inquiry request. For example, techniques such as tokenization, Chinese word segmentation, named entity recognition and sentiment analysis may be used to analyze the inquiry request and obtain its keyword set K.
The keyword set K is then expanded to obtain the expanded keyword set K'. For example, the keyword set K may be expanded by methods such as a knowledge graph or synonym expansion.
Afterwards, retrieval is performed in the multi-modal information of the multimedia resource search library according to the expanded keyword set K'; alternatively, retrieval may be performed in both the multi-modal information and the inventory information of the multimedia resource search library according to the expanded keyword set K'.
The keyword set is expanded here in order to improve the completeness of the query. For example, if the user's inquiry request contains "西红柿" (xihongshi, "tomato"), the technical solution of the present invention can also find videos whose content contains its synonym "番茄" (fanqie, also "tomato"). That is, retrieving according to the expanded keyword set yields more retrieval results related to the query condition in the inquiry request.
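The expansion from K to K' and the subsequent lookup can be sketched as follows. This is a minimal illustration that assumes a hand-written synonym table and plain substring matching; the patent only names knowledge graphs and synonym expansion as possible methods, and the function names here are hypothetical.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical synonym table; both words mean "tomato".
# A real system might derive such a table from a knowledge graph.
SYNONYMS: Dict[str, Set[str]] = {
    "西红柿": {"番茄"},
    "番茄": {"西红柿"},
}


def expand_keywords(keywords: Iterable[str], synonyms: Dict[str, Set[str]]) -> Set[str]:
    """Return K', i.e. K plus every listed synonym of a keyword in K."""
    expanded = set(keywords)
    for word in list(expanded):
        expanded |= synonyms.get(word, set())
    return expanded


def search(expanded: Set[str], documents: Dict[str, str]) -> List[str]:
    """Return ids of resources whose stored text contains any expanded keyword."""
    return sorted(
        doc_id
        for doc_id, text in documents.items()
        if any(word in text for word in expanded)
    )
```

With this sketch, a query for "西红柿" also matches a resource whose stored text only mentions "番茄", which is the completeness gain the expansion step is meant to provide.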
Methods of retrieving according to a keyword set are well known to those skilled in the art and are not repeated here.
In this step, for an inquiry request containing an audio fragment to be queried, the inquiry request is first analyzed to obtain the audio fragment in the inquiry request. Retrieval is then performed, according to the audio fragment, in the audio information in compressed audio coding form in the multimedia resource search library: the audio features of the audio fragment are extracted and compression-encoded, and a clustering algorithm is used to find similar audio information among the audio information in compressed audio coding form in the multimedia resource search library.
In this step, for an inquiry request containing a picture to be queried, the inquiry request is first analyzed to obtain the picture in the inquiry request. Retrieval is then performed, according to the picture, in the image information in pixel compressed encoding form in the multimedia resource search library: the picture pixel features of the picture are extracted and compression-encoded, and a clustering algorithm is used to find similar image information among the image information in pixel compressed encoding form in the multimedia resource search library.
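The patent does not specify the feature extraction, the compression code or the clustering algorithm. Purely as an illustration of querying by compressed features, the sketch below assumes that feature vectors of equal length are already available, encodes each as a binary code (one bit per feature, set when the feature exceeds the vector mean), and replaces the clustering step with a simple linear scan by Hamming distance. All names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def compress(features: Sequence[float]) -> int:
    """Encode a feature vector as bits: bit i is 1 if features[i] exceeds the mean."""
    mean = sum(features) / len(features)
    code = 0
    for value in features:
        code = (code << 1) | (1 if value > mean else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two codes of the same length."""
    return bin(a ^ b).count("1")


def find_similar(
    query: Sequence[float], index: Dict[str, int], max_distance: int
) -> List[Tuple[str, int]]:
    """Return (resource id, distance) pairs within max_distance, nearest first."""
    q = compress(query)
    hits = [(rid, hamming(q, code)) for rid, code in index.items()]
    return sorted((h for h in hits if h[1] <= max_distance), key=lambda h: (h[1], h[0]))
```

The same lookup shape applies to both the audio fragment and the picture case, since each reduces to comparing a compressed query code against stored codes; a clustering-based index would mainly change how candidates are narrowed before the comparison.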
Further, after retrieving in the multi-modal information and inventory information of the multimedia resource search library, the matching degrees with which the inventory information of the same multimedia resource and the information of each modality (i.e. text information, voice information, image information) respectively correspond to the inquiry request can be obtained. A weighted average of these matching degrees is taken, and the resulting weighted average is used as the score with which the multimedia resource matches the inquiry request. The multimedia resources are sorted in descending order of score, and the sorted list is used as the retrieval result.
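The scoring and ranking just described can be sketched in a few lines of Python. The weights below are hypothetical (the patent does not give values), and treating a missing source as a matching degree of 0 is likewise an assumption.

```python
from typing import Dict, List, Tuple

# Hypothetical weights for the inventory information and each modality.
WEIGHTS: Dict[str, float] = {"inventory": 0.4, "text": 0.2, "voice": 0.2, "image": 0.2}


def score(matching: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-source matching degrees; missing sources count as 0."""
    total_weight = sum(weights.values())
    weighted_sum = sum(w * matching.get(name, 0.0) for name, w in weights.items())
    return weighted_sum / total_weight


def rank(
    results: Dict[str, Dict[str, float]], weights: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Return (resource id, score) pairs sorted by score in descending order."""
    scored = [(rid, score(matching, weights)) for rid, matching in results.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Under these weights, a resource that matches only through its inventory information can be outranked by one that matches moderately in the inventory information but strongly in all three modalities.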
S103: Return the retrieval result.
After the retrieval result matching the query condition in the inquiry request is obtained, the retrieval result is returned to the user, who can then learn which multimedia resources meet the query condition or meet conditions close to the query condition.
The multi-modal information of each multimedia resource in the above multimedia resource search library is obtained and stored in advance. The flow of a specific method for obtaining and storing the text information of a multimedia resource provided by an embodiment of the present invention is shown in Fig. 2 and comprises the following steps:
S201: Identify text information from the video of the multimedia resource.
Specifically, picture frames of the multimedia resource that are highly similar to one another may be deduplicated, and text recognition is performed on the picture frames of the multimedia resource video that remain after deduplication.
S202: Store the identified text information in the multimedia resource search library.
In this step, preferably, the identified text information may first be deduplicated, and the deduplicated text information is stored in the multimedia resource search library. Deduplication helps remove a large amount of redundant information and saves space in the multimedia resource search library.
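The text deduplication in S202 can be sketched as follows. The normalization rule (collapsing whitespace and ignoring case) is an assumption chosen for illustration; the patent does not say when two recognized strings count as duplicates.

```python
from typing import Iterable, List


def normalize(text: str) -> str:
    """Collapse whitespace and case so trivially different strings compare equal."""
    return " ".join(text.split()).casefold()


def deduplicate(recognized: Iterable[str]) -> List[str]:
    """Keep the first occurrence of each recognized string, preserving order."""
    seen = set()
    unique = []
    for text in recognized:
        key = normalize(text)
        if key and key not in seen:  # skip empty recognition results
            seen.add(key)
            unique.append(text.strip())
    return unique
```

Because a caption or on-screen title usually persists across many consecutive frames, even this simple first-occurrence filter removes most of the repetition before the text is stored.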
The flow of a specific method for obtaining and storing in advance the voice information of a multimedia resource provided by an embodiment of the present invention is shown in Fig. 3 and comprises the following steps:
S301: Extract audio from the multimedia resource.
S302: Convert the extracted audio to text content by speech recognition, and/or further extract the features of the audio and compression-encode the extracted audio features to obtain the voice information in compressed audio coding form of the multimedia resource.
S303: Store the converted text content in the multimedia resource search library as the voice information of the multimedia resource, and/or store the voice information in compressed audio coding form obtained after compression encoding in the multimedia resource search library.
In this step, preferably, a text summary is made of the converted text content, and the text content obtained by summarization is stored in the multimedia resource search library as the voice information of the multimedia resource; and/or
in this step, the voice information in compressed audio coding form of the multimedia resource obtained by the compression encoding in step S302 is stored in the multimedia resource search library.
In general, the voice content of a multimedia resource is large, but only part of it is useful. A text summary is therefore made of the converted text content to remove the content that has no practical meaning, and the text content obtained by summarization is then added to the multimedia resource search library. This helps remove a large amount of redundant information and saves space in the multimedia resource search library.
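The patent does not name a summarization technique. As one simple possibility, the sketch below uses frequency-based extractive summarization: sentences are scored by how often their content words occur in the whole transcript, and only the top-scoring sentences are kept in their original order. The stop-word list, the English tokenization and the function names are all assumptions for illustration; Chinese transcripts would need a word segmenter instead of the regular expression used here.

```python
import re
from collections import Counter
from typing import List

# Hypothetical stop-word list, including common filler words.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "is", "in", "um", "uh", "so", "well"}


def content_words(sentence: str) -> List[str]:
    """Lower-cased words of a sentence with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]


def summarize(transcript: str, max_sentences: int = 2) -> str:
    """Keep the max_sentences sentences with the highest word-frequency score."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", transcript) if s.strip()]
    freq = Counter(w for s in sentences for w in content_words(s))
    scores = [sum(freq[w] for w in content_words(s)) for s in sentences]
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:max_sentences]
    return " ".join(sentences[i] for i in sorted(best))
```

Sentences made up only of filler words score zero and are dropped first, which matches the stated goal of removing content without practical meaning before storage.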
The flow of a specific method for obtaining and storing in advance the image information of a multimedia resource provided by an embodiment of the present invention is shown in Fig. 4 and comprises the following steps:
S401: Extract key frames from the video of the multimedia resource.
The video of a multimedia resource is made up of individual pictures (frames), and the semantic information contained in those pictures is essential for understanding the video content. The system first performs key frame extraction on the video to obtain the key frames.
S402: Perform image content description and/or image object labeling on the extracted key frames, and/or extract the picture pixel features of the key frames and compression-encode them.
In this step, image content description is performed on each key frame to generate text content describing that key frame, and/or image object labeling is performed on each key frame to obtain the text content of the image object labels. Specifically, artificial intelligence techniques such as deep learning may be used to perform the image content description and obtain text content describing the key frame; image object labeling of a key frame specifically means attaching text labels to the objects identified in the key frame. And/or
in this step, the picture pixel features of each key frame are extracted and compression-encoded to obtain the image information in pixel compressed encoding form of the multimedia resource.
S403: Store the text content obtained from the image content description and/or the image object labeling in the multimedia resource search library as the image information in written form of the multimedia resource, and/or store the obtained image information in pixel compressed encoding form of the multimedia resource in the multimedia resource search library.
In this step, preferably, the text content obtained from the image content description and/or the image object labeling may first be deduplicated, and the deduplicated text content is stored in the multimedia resource search library as the image information in written form of the multimedia resource; and/or
in this step, the obtained image information in pixel compressed encoding form of the multimedia resource is stored in the multimedia resource search library.
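Step S401 does not say how key frames are chosen. One common heuristic, shown here only as an assumption, is to keep a frame whenever it differs enough from the most recently kept key frame. The sketch represents each frame as a flattened list of grayscale pixel values; the threshold value and function names are hypothetical.

```python
from typing import List, Sequence

Frame = Sequence[int]  # flattened grayscale pixels, each in 0..255


def frame_difference(a: Frame, b: Frame) -> float:
    """Mean absolute pixel difference between two frames of equal size."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def extract_key_frames(frames: List[Frame], threshold: float = 20.0) -> List[int]:
    """Return indices of frames that differ from the previous key frame by more than threshold."""
    if not frames:
        return []
    keys = [0]  # the first frame always starts a new segment
    for i in range(1, len(frames)):
        if frame_difference(frames[keys[-1]], frames[i]) > threshold:
            keys.append(i)
    return keys
```

Comparing against the last kept key frame, rather than the immediately preceding frame, keeps slow gradual changes from being missed, at the cost of occasionally selecting a frame in the middle of a transition.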
Based on the above method, the internal structure of a device for retrieving multimedia resources provided by an embodiment of the present invention is shown in Fig. 5 and comprises: a multimedia resource search library 501, an inquiry request receiving module 502 and a retrieval module 503.
The multimedia resource search library 501 is used to store the multi-modal information of multiple multimedia resources; preferably, the multimedia resource search library 501 may also store the inventory information of each multimedia resource. The multi-modal information of the multimedia resource includes at least one of the following: text information, voice information, image information.
The inquiry request receiving module 502 is used to receive the inquiry request sent by the user.
The retrieval module 503 is used to retrieve in the multimedia resource search library 501 according to the inquiry request received by the inquiry request receiving module 502, and to return the retrieval result.
Preferably, the retrieval module 503 is used to analyze the inquiry request to obtain the keyword set K of the inquiry request; expand the keyword set K to obtain the expanded keyword set K'; and retrieve in the multimedia resource search library according to the expanded keyword set K'. For the specific retrieval method of the retrieval module 503, refer to the content of step S102 above, which is not repeated here.
Further, after retrieving in the multi-modal information and inventory information of the multimedia resource search library according to the expanded keyword set K', the retrieval module 503 may, for the same multimedia resource, obtain the matching degrees with which the inventory information of the multimedia resource and the information of each modality respectively correspond to the inquiry request, take a weighted average of these matching degrees, and use the resulting weighted average as the score with which the multimedia resource matches the inquiry request. The retrieval results are returned to the user in descending order of score.
Alternatively, the retrieval module 503 may also be used to analyze the inquiry request to obtain the audio fragment in the inquiry request, and to retrieve, according to the audio fragment, in the audio information in compressed audio coding form in the multimedia resource search library.
Alternatively, the retrieval module 503 may also be used to analyze the inquiry request to obtain the picture in the inquiry request, and to retrieve, according to the picture, in the image information in pixel compressed encoding form in the multimedia resource search library.
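The three alternative behaviours of the retrieval module can be pictured as a dispatch on the content of the inquiry request. The sketch below is a hypothetical arrangement, not the patent's implementation: the request type, the searcher callables and the precedence of keywords over audio over picture are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Searcher = Callable[[object], List[str]]  # takes query content, returns resource ids


@dataclass
class InquiryRequest:
    keywords: Optional[str] = None    # keyword query text
    audio: Optional[bytes] = None     # audio fragment to be queried
    picture: Optional[bytes] = None   # picture to be queried


class RetrievalModule:
    """Routes an inquiry request to the search routine for its content type."""

    def __init__(self, by_keywords: Searcher, by_audio: Searcher, by_picture: Searcher) -> None:
        self._by_keywords = by_keywords
        self._by_audio = by_audio
        self._by_picture = by_picture

    def retrieve(self, request: InquiryRequest) -> List[str]:
        if request.keywords is not None:
            return self._by_keywords(request.keywords)
        if request.audio is not None:
            return self._by_audio(request.audio)
        if request.picture is not None:
            return self._by_picture(request.picture)
        raise ValueError("inquiry request contains no keywords, audio fragment or picture")
```

Passing the three search routines in as callables keeps the module independent of how each modality is indexed, so the keyword, audio and picture paths can be developed and tested separately.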
Further, a kind of retrieval device of multimedia resource provided in an embodiment of the present invention can also include:Multi-modal letter Cease memory module 504;
Multi-modal information memory module 504 is included at least such as one of lower unit:Text information storage unit 511, voice letter Cease memory cell 512, image information memory cell 513.
Text information storage unit 511 is used to identify text message from the video of the multimedia resource;Will identification The text information storage gone out is into the multimedia resource search library 501.Text information storage unit 511 obtains and stores more matchmakers The specific method of the text message of body resource refers to each step method shown in above-mentioned Fig. 2, and here is omitted.
Voice messaging memory cell 512 is used to extract audio from the multimedia resource and turned after carrying out speech recognition Word content is changed to, is arrived the word content being converted to as the voice messaging storage of the written form of the multimedia resource In the multimedia resource search library;Audio will be extracted from the multimedia resource, audio described in onestep extraction of going forward side by side After feature and the audio frequency characteristics to extracting are compressed coding, the compressed audio coding form of the multimedia resource is obtained Voice messaging, the voice messaging storage of the compressed audio coding form of the obtained multimedia resource is provided to the multimedia In source search library 501.Voice messaging memory cell 512 obtains and the specific method of the voice messaging of storing multimedia resource can join It is admitted to and states each step method shown in Fig. 3, here is omitted.
Image information memory cell 513 extracts key frame from the video of the multimedia resource, and the key frame is entered Row picture material describes and/or carried out image object mark, the word content and/or image object that picture material is described to obtain Mark obtained word content and arrive the multimedia resource as the image information storage of the written form of the multimedia resource In search library;And/or key frame being extracted from the video of the multimedia resource, the picture pixels for extracting the key frame are special After levying and carrying out compressed encoding, the image information for obtaining the pixel compressed encoding form of the multimedia resource is stored to described more In media resource search library 501.Image information memory cell 513 obtains and the specific side of the image information of storing multimedia resource Method refers to each step method shown in above-mentioned Fig. 4, and here is omitted.
In the technical solution of the present invention, the multi-modal information of multimedia resources is stored in the multimedia resource search library, and retrieval is performed in that library according to the query request. Retrieval can therefore draw on information richer than the catalog information alone, so that multimedia resources satisfying the search conditions are retrieved more completely and the retrieval needs for multimedia resources are better met.
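The retrieval step can be sketched as follows, assuming the ranking set out in the claims: for each resource, the degrees of fit of its catalog information and of each modality with the query request are combined by a weighted average, and resources are sorted in descending order of the resulting score. The function names, the weight values, and the dictionary layout are hypothetical; the patent does not specify how the degrees of fit are computed or which weights are used.

```python
from typing import Dict, List, Tuple


def match_score(fit: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of the degrees of fit available for one resource.

    `fit` maps an information source ("catalog", "text", "voice", "image")
    to its degree of fit with the query request; only sources present in
    `fit` contribute to the average.
    """
    total_weight = sum(weights[name] for name in fit)
    if total_weight == 0:
        return 0.0
    return sum(fit[name] * weights[name] for name in fit) / total_weight


def rank_resources(fits: Dict[str, Dict[str, float]],
                   weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Score every resource and sort in descending order of score."""
    scored = [(resource_id, match_score(fit, weights))
              for resource_id, fit in fits.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical weights and degrees of fit, for illustration only.
weights = {"catalog": 1.0, "text": 1.0, "voice": 1.0, "image": 1.0}
fits = {
    "resource_a": {"catalog": 0.25, "text": 0.75},
    "resource_b": {"catalog": 1.0, "text": 0.5},
}
ranked = rank_resources(fits, weights)
```

The sorted list of (resource, score) pairs then serves as the retrieval result returned to the user.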
Those skilled in the art will appreciate that the steps, measures, and schemes in the various operations, methods, and flows discussed in the present invention may be alternated, changed, combined, or deleted. Further, other steps, measures, and schemes in the various operations, methods, and flows discussed in the present invention may also be alternated, changed, rearranged, decomposed, combined, or deleted. Further, steps, measures, and schemes in the prior art that correspond to those in the various operations, methods, and flows disclosed in the present invention may also be alternated, changed, rearranged, decomposed, combined, or deleted.
Those of ordinary skill in the art should understand that the discussion of any of the above embodiments is merely exemplary and is not intended to imply that the scope of the disclosure (including the claims) is limited to these examples. Within the idea of the present invention, the technical features of the above embodiment or of different embodiments may also be combined, the steps may be carried out in any order, and there are many other variations of the different aspects of the invention as described above, which are not provided in detail for the sake of brevity. Therefore, any omission, modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

  1. A method for retrieving multimedia resources, characterized by comprising:
    Receiving a query request sent by a user;
    Performing retrieval in a multimedia resource search library according to the query request, and returning a retrieval result;
    Wherein multi-modal information of a plurality of multimedia resources is stored in the multimedia resource search library.
  2. The method according to claim 1, characterized in that the multi-modal information of the multimedia resource includes text information; and
    The text information is pre-stored in the multimedia resource search library as follows:
    Recognizing text information from the video of the multimedia resource;
    Storing the recognized text information in the multimedia resource search library.
  3. The method according to claim 1, characterized in that the multi-modal information of the multimedia resource includes voice information; wherein the voice information is pre-stored in the multimedia resource search library in compressed audio coding form and/or in text form as follows:
    Extracting audio from the multimedia resource, converting it into text content through speech recognition, and storing the converted text content in the multimedia resource search library as the voice information in text form of the multimedia resource; and/or
    Extracting audio from the multimedia resource, further extracting features of the audio, and compression-coding the extracted audio features to obtain the voice information in compressed audio coding form of the multimedia resource.
  4. The method according to claim 1, characterized in that the multi-modal information of the multimedia resource includes image information; wherein the image information is pre-stored in the multimedia resource search library in pixel compression coding form and/or in text form as follows:
    Extracting key frames from the video of the multimedia resource, performing image content description and/or image object annotation on the key frames, and storing the text content obtained from the image content description and/or the image object annotation in the multimedia resource search library as the image information in text form of the multimedia resource; and/or
    Extracting key frames from the video of the multimedia resource, extracting the picture pixel features of the key frames and compression-coding them, and storing the resulting image information in pixel compression coding form of the multimedia resource in the multimedia resource search library.
  5. The method according to claim 3, characterized in that performing retrieval in the multimedia resource search library according to the query request comprises:
    Analyzing the query request to obtain an audio clip in the query request;
    Performing retrieval, according to the audio clip, in the audio information in compressed audio coding form in the multimedia resource search library.
  6. The method according to claim 4, characterized in that performing retrieval in the multimedia resource search library according to the query request comprises:
    Analyzing the query request to obtain a picture in the query request;
    Performing retrieval, according to the picture, in the image information in pixel compression coding form in the multimedia resource search library.
  7. The method according to claim 1, characterized in that, after performing retrieval in the multimedia resource search library according to the query request, the method further comprises:
    For the same multimedia resource, obtaining the degree of fit with the query request of the catalog information of the multimedia resource and of the information of each modality, respectively;
    Taking a weighted average of the degrees of fit with the query request of the catalog information of the multimedia resource and of the information of each modality, and using the resulting weighted average as the score with which the multimedia resource matches the query request;
    Sorting the multimedia resources in descending order of score;
    Using the ranking result of the multimedia resources as the retrieval result.
  8. A device for retrieving multimedia resources, comprising:
    A multimedia resource search library, configured to store multi-modal information of a plurality of multimedia resources;
    A query request receiving module, configured to receive a query request sent by a user;
    A retrieval module, configured to perform retrieval in the multimedia resource search library according to the query request and to return a retrieval result.
  9. The device according to claim 8, characterized in that the multi-modal information of the multimedia resource includes at least one of the following: text information, voice information, and image information; wherein the voice information is pre-stored in the multimedia resource search library in compressed audio coding form and/or in text form, and the image information is pre-stored in the multimedia resource search library in pixel compression coding form and/or in text form.
  10. The device according to claim 9, characterized by further comprising a multi-modal information storage module; and
    The multi-modal information storage module includes at least one of the following units:
    A text information storage unit, configured to recognize text information from the video of the multimedia resource and to store the recognized text information in the multimedia resource search library;
    A voice information storage unit, configured to extract audio from the multimedia resource, convert it into text content through speech recognition, and store the converted text content in the multimedia resource search library as the voice information in text form of the multimedia resource; and/or to extract audio from the multimedia resource, further extract features of the audio, compression-code the extracted audio features to obtain the voice information in compressed audio coding form of the multimedia resource, and store that voice information in the multimedia resource search library;
    An image information storage unit, configured to extract key frames from the video of the multimedia resource, perform image content description and/or image object annotation on the key frames, and store the text content obtained from the image content description and/or the image object annotation in the multimedia resource search library as the image information in text form of the multimedia resource; and/or to extract key frames from the video of the multimedia resource, extract the picture pixel features of the key frames and compression-code them, and store the resulting image information in pixel compression coding form of the multimedia resource in the multimedia resource search library.
CN201711108216.XA 2017-11-08 2017-11-08 Multimedia resource retrieval method and device Expired - Fee Related CN107766571B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711108216.XA CN107766571B (en) 2017-11-08 2017-11-08 Multimedia resource retrieval method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711108216.XA CN107766571B (en) 2017-11-08 2017-11-08 Multimedia resource retrieval method and device

Publications (2)

Publication Number Publication Date
CN107766571A true CN107766571A (en) 2018-03-06
CN107766571B CN107766571B (en) 2021-02-09

Family

ID=61272932

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711108216.XA Expired - Fee Related CN107766571B (en) 2017-11-08 2017-11-08 Multimedia resource retrieval method and device

Country Status (1)

Country Link
CN (1) CN107766571B (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108647245A (en) * 2018-04-13 2018-10-12 腾讯科技(深圳)有限公司 Matching process, device, storage medium and the electronic device of multimedia resource
CN109255036A (en) * 2018-08-31 2019-01-22 北京字节跳动网络技术有限公司 Method and apparatus for output information
CN109446356A (en) * 2018-09-21 2019-03-08 深圳市九洲电器有限公司 A kind of multimedia document retrieval method and device
CN109684553A (en) * 2018-12-26 2019-04-26 北京百度网讯科技有限公司 For obtaining the method and device of information
CN110110099A (en) * 2019-04-12 2019-08-09 华勤通讯技术有限公司 A kind of multimedia document retrieval method and device
CN110489594A (en) * 2018-05-14 2019-11-22 北京松果电子有限公司 Image vision mask method, device, storage medium and equipment
CN110532404A (en) * 2019-09-03 2019-12-03 北京百度网讯科技有限公司 One provenance multimedia determines method, apparatus, equipment and storage medium
CN111159435A (en) * 2019-12-27 2020-05-15 北大方正集团有限公司 Multimedia resource processing method, system, terminal and computer readable storage medium
CN111221984A (en) * 2020-01-15 2020-06-02 北京百度网讯科技有限公司 Multimodal content processing method, device, equipment and storage medium
CN112528053A (en) * 2020-12-23 2021-03-19 三星电子(中国)研发中心 Multimedia library classified retrieval management system
CN112818906A (en) * 2021-02-22 2021-05-18 浙江传媒学院 Intelligent full-media news cataloging method based on multi-mode information fusion understanding
WO2021136058A1 (en) * 2019-12-31 2021-07-08 华为技术有限公司 Video processing method and device
CN113507613A (en) * 2021-06-07 2021-10-15 茂名市群英网络有限公司 CDN-based video input scheduling system and method

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1920818A (en) * 2006-09-14 2007-02-28 浙江大学 Transmedia search method based on multi-mode information convergence analysis
CN101021849A (en) * 2006-09-14 2007-08-22 浙江大学 Transmedia searching method based on content correlation
CN101272397A (en) * 2008-05-05 2008-09-24 南京师范大学 Method for acquiring addressable stream media based on ASF data amalgamation technology
US20100005121A1 (en) * 1999-02-01 2010-01-07 At &T Corp. Multimedia integration description scheme, method and system for mpeg-7
US20100066684A1 (en) * 2008-09-12 2010-03-18 Behzad Shahraray Multimodal portable communication interface for accessing video content
US20100100439A1 (en) * 2008-06-12 2010-04-22 Dawn Jutla Multi-platform system apparatus for interoperable, multimedia-accessible and convertible structured and unstructured wikis, wiki user networks, and other user-generated content repositories
CN101968819A (en) * 2010-11-05 2011-02-09 中国传媒大学 Audio/video intelligent catalog information acquisition method facing to wide area network
CN102650993A (en) * 2011-02-25 2012-08-29 北大方正集团有限公司 Index establishing and searching methods, devices and systems for audio-video file
US20140032538A1 (en) * 2012-07-26 2014-01-30 Telefonaktiebolaget L M Ericsson (Publ) Apparatus, Methods, and Computer Program Products For Adaptive Multimedia Content Indexing
CN103778204A (en) * 2014-01-13 2014-05-07 北京奇虎科技有限公司 Voice analysis-based video search method, equipment and system
US20140201240A1 (en) * 2013-01-16 2014-07-17 Althea Systems and Software Pvt. Ltd System and method to retrieve relevant multimedia content for a trending topic
CN106209575A (en) * 2016-06-23 2016-12-07 厦门幻世网络科技有限公司 Method for sending information, acquisition methods, device and interface system
CN106446051A (en) * 2016-08-31 2017-02-22 北京新奥特云视科技有限公司 Deep search method of Eagle media assets
CN107203586A (en) * 2017-04-19 2017-09-26 天津大学 A kind of automation result generation method based on multi-modal information

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
范红杰等: "多策略相似度整合的XML模式匹配方法", 《计算机科学与探索》 *

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108647245B (en) * 2018-04-13 2023-04-18 腾讯科技(深圳)有限公司 Multimedia resource matching method and device, storage medium and electronic device
US11914639B2 (en) 2018-04-13 2024-02-27 Tencent Technology (Shenzhen) Company Limited Multimedia resource matching method and apparatus, storage medium, and electronic apparatus
CN108647245A (en) * 2018-04-13 2018-10-12 腾讯科技(深圳)有限公司 Matching process, device, storage medium and the electronic device of multimedia resource
WO2019196659A1 (en) * 2018-04-13 2019-10-17 腾讯科技(深圳)有限公司 Method and apparatus for matching multimedia resource, and storage medium and electronic device
CN110489594A (en) * 2018-05-14 2019-11-22 北京松果电子有限公司 Image vision mask method, device, storage medium and equipment
CN109255036A (en) * 2018-08-31 2019-01-22 北京字节跳动网络技术有限公司 Method and apparatus for output information
CN109446356A (en) * 2018-09-21 2019-03-08 深圳市九洲电器有限公司 A kind of multimedia document retrieval method and device
CN109684553A (en) * 2018-12-26 2019-04-26 北京百度网讯科技有限公司 For obtaining the method and device of information
CN110110099A (en) * 2019-04-12 2019-08-09 华勤通讯技术有限公司 A kind of multimedia document retrieval method and device
CN110532404A (en) * 2019-09-03 2019-12-03 北京百度网讯科技有限公司 One provenance multimedia determines method, apparatus, equipment and storage medium
CN110532404B (en) * 2019-09-03 2023-08-04 北京百度网讯科技有限公司 Source multimedia determining method, device, equipment and storage medium
CN111159435A (en) * 2019-12-27 2020-05-15 北大方正集团有限公司 Multimedia resource processing method, system, terminal and computer readable storage medium
CN111159435B (en) * 2019-12-27 2023-09-05 新方正控股发展有限责任公司 Multimedia resource processing method, system, terminal and computer readable storage medium
CN113128285A (en) * 2019-12-31 2021-07-16 华为技术有限公司 Method and device for processing video
WO2021136058A1 (en) * 2019-12-31 2021-07-08 华为技术有限公司 Video processing method and device
CN111221984A (en) * 2020-01-15 2020-06-02 北京百度网讯科技有限公司 Multimodal content processing method, device, equipment and storage medium
CN111221984B (en) * 2020-01-15 2024-03-01 北京百度网讯科技有限公司 Multi-mode content processing method, device, equipment and storage medium
CN112528053A (en) * 2020-12-23 2021-03-19 三星电子(中国)研发中心 Multimedia library classified retrieval management system
CN112818906A (en) * 2021-02-22 2021-05-18 浙江传媒学院 Intelligent full-media news cataloging method based on multi-mode information fusion understanding
CN112818906B (en) * 2021-02-22 2023-07-11 浙江传媒学院 Intelligent cataloging method of all-media news based on multi-mode information fusion understanding
CN113507613A (en) * 2021-06-07 2021-10-15 茂名市群英网络有限公司 CDN-based video input scheduling system and method

Also Published As

Publication number Publication date
CN107766571B (en) 2021-02-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20210209

Termination date: 20211108