CN106126619A

CN106126619A - Video retrieval method and system based on video content

Info

Publication number: CN106126619A
Application number: CN201610458196.8A
Authority: CN
Inventors: 罗笑南; 徐颂华; 姜涛; 林格
Original assignee: National Sun Yat Sen University
Current assignee: National Sun Yat Sen University
Priority date: 2016-06-20
Filing date: 2016-06-20
Publication date: 2016-11-16

Abstract

The invention discloses a video retrieval method and system based on video content. The method includes: matching the user's search keyword against the topic words in a background topic word set to obtain the topic words close to the search keyword; retrieving videos according to the topic words close to the search keyword to obtain the corresponding videos; obtaining basic video information from the corresponding videos; measuring video similarity according to the basic video information to obtain video similarity measurement results; performing a comprehensive weighted measurement on the video similarity measurement results to obtain a comprehensive video similarity measurement result; and displaying the videos according to the comprehensive result. Embodiments of the invention effectively address the inaccurate results of title-based video retrieval and improve the user experience.

Description

Video retrieval method and system based on video content

Technical field

The present invention relates to the technical field of video retrieval, and in particular to a video retrieval method and system based on video content.

Background art

Since the 1990s, with the development of network media technology and of social networks, network bandwidth has steadily increased, and more and more information is presented on the Internet in multimedia forms such as video, audio, and images; multimedia data has grown explosively. Among the various multimedia types, such as video, audio, text, graphics, images, and animation, video is especially popular because it is vivid, intuitive, and engaging. In recent years, video websites whose main business is video sharing have flourished: in China there are iQiyi, Youku, Tudou, CNTV, and others; abroad there are YouTube, Hulu, Yahoo Video, and others. The large number of video-sharing websites has greatly enriched people's audiovisual entertainment.

Massive multimedia video enriches people's lives, but as video data grows, managing it poses a major technical problem: substantial manpower and resources must be invested to classify and tag videos, yet this processing still cannot help users retrieve the videos they need quickly and accurately.

Summary of the invention

The object of the invention is to overcome the deficiencies of the prior art. The invention provides a video retrieval method and system based on video content that effectively address the inaccurate results of title-based video retrieval and improve the user experience.

To solve the above technical problem, the invention provides a video retrieval method based on video content, the method including:

matching the user's search keyword against the topic words in the background topic word set to obtain the topic words close to the search keyword;

retrieving videos according to the topic words close to the search keyword to obtain the videos corresponding to those topic words;

obtaining basic video information from the videos corresponding to the topic words close to the search keyword;

measuring video similarity according to the basic video information to obtain video similarity measurement results;

performing a comprehensive weighted measurement on the video similarity measurement results to obtain a comprehensive video similarity measurement result;

displaying the videos according to the comprehensive video similarity measurement result.

Preferably, before the user's search keyword is matched against the topic words in the background topic word set, the method further includes:

obtaining the caption information and audio information of all videos in the database, and converting the caption information and the audio information into first video text information;

processing the first video text information to obtain video text topic words;

using the video text topic words to obtain background document information related to them, and processing the background document information to obtain first background document topic word information;

obtaining the background topic word set from the video text topic words and the first background document topic word information.

Preferably, the basic video information includes at least any one of second video text information, second background document topic word information, and video comment information.

Preferably, the step of obtaining the video comment information includes:

crawling the video with a web crawler to obtain the video comment information.

Preferably, the step of measuring video similarity according to the basic video information to obtain video similarity measurement results includes:

measuring the similarity of the videos according to the second video text information to obtain a video similarity measurement result;

measuring the similarity of the videos according to the second background document topic word information to obtain a video similarity measurement result;

measuring the similarity of the videos according to the video comment information to obtain a video similarity measurement result.

Preferably, the step of measuring the similarity of the videos according to the second video text information to obtain a video similarity measurement result includes:

measuring video similarity according to the string similarity of the second video text information to obtain a string-based video similarity measurement result;

measuring video similarity according to the corpus similarity of the second video text information to obtain a corpus-based video similarity measurement result;

measuring video similarity according to the similarity of the textual content of the second video text information to obtain a content-based video similarity measurement result.

Preferably, the step of measuring the similarity of the videos according to the second background document topic word information to obtain a video similarity measurement result includes:

measuring video similarity according to the set similarity of the second background document topic word information to obtain a set-based video similarity measurement result;

measuring video similarity according to the lexical similarity of the second background document topic word information to obtain a vocabulary-based video similarity measurement result.

Preferably, the step of measuring the similarity of the videos according to the video comment information to obtain a video similarity measurement result includes:

processing the video comment information to obtain the relation information among videos, users, and comments contained in it;

measuring the similarity of the videos according to the relation information among videos, users, and comments to obtain a video similarity measurement result based on the video comment information.

Preferably, the step of performing a comprehensive weighted measurement on the video similarity measurement results to obtain a comprehensive video similarity measurement result includes:

building a comprehensive video similarity measurement model;

training on the similarity measurement results according to the comprehensive video similarity measurement model to obtain training results;

applying comprehensive weighting to the training results to obtain a comprehensive weighted result;

obtaining the comprehensive video similarity measurement result from the comprehensive weighted result.

In addition, the invention provides a video retrieval system based on video content, the system including:

a matching module, for matching the user's search keyword against the background topic word set to obtain the topic words close to the search keyword;

a retrieval module, for retrieving videos according to the topic words close to the search keyword to obtain the videos corresponding to those topic words;

an information acquisition module, for obtaining basic video information from the videos corresponding to the topic words close to the search keyword;

a similarity measurement module, for measuring video similarity according to the basic video information to obtain video similarity measurement results;

a comprehensive weighting module, for performing a comprehensive weighted measurement on the video similarity measurement results to obtain a comprehensive video similarity measurement result;

a display module, for displaying the videos according to the comprehensive video similarity measurement result.

In embodiments of the invention, the videos in the video library are first preprocessed to obtain the background topic word set of each video in the library; the user's search keyword is matched against the video background topic word sets to obtain topic words; video similarity is measured according to the topic words; the comprehensive video similarity measurement result is then obtained by comprehensive weighting; and the videos are presented to the user from high to low according to that result, completing the retrieval. This effectively addresses the inaccurate results of title-based video retrieval and improves the user experience.

Brief description of the drawings

To illustrate the technical solutions of the embodiments of the invention or of the prior art more clearly, the drawings needed in the description of the embodiments or of the prior art are briefly introduced below. Evidently, the drawings described below are only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flow diagram of the video retrieval method in an embodiment of the invention;

Fig. 2 is a flow diagram of the method for obtaining the background topic word set in an embodiment of the invention;

Fig. 3 is a flow diagram of the steps for obtaining the comprehensive video similarity measurement result in an embodiment of the invention;

Fig. 4 is a structural diagram of the video retrieval system in an embodiment of the invention;

Fig. 5 is a structural diagram of the comprehensive weighting module in an embodiment of the invention.

Detailed description of the invention

The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of the invention.

Fig. 1 is a flow diagram of the video retrieval method in an embodiment of the invention. As shown in Fig. 1, the method includes:

S11: matching the user's search keyword against the topic words in the background topic word set to obtain the topic words close to the search keyword;

S12: retrieving videos according to the topic words close to the search keyword to obtain the videos corresponding to those topic words;

S13: obtaining basic video information from the videos corresponding to the topic words close to the search keyword;

S14: measuring video similarity according to the basic video information to obtain video similarity measurement results;

S15: performing a comprehensive weighted measurement on the video similarity measurement results to obtain a comprehensive video similarity measurement result;

S16: displaying the videos according to the comprehensive video similarity measurement result.

S11 is described further:

The search keyword entered by the user is received through a search box. The received search keyword is matched against the background topic words, and the topic words close to the search keyword are obtained from the matching result.
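As a minimal sketch of the matching in S11: the description does not state how closeness between the search keyword and a topic word is computed, so the example below assumes a simple character-level ratio from Python's standard `difflib` module. The function name `close_topic_words` and the `cutoff` value are hypothetical.

```python
from difflib import get_close_matches

def close_topic_words(keyword, topic_words, max_results=5, cutoff=0.6):
    """Return the topic words closest to the keyword, best match first.

    Assumption: closeness is approximated by difflib's string ratio;
    the source does not specify the matching criterion.
    """
    lowered = {word.lower(): word for word in topic_words}
    hits = get_close_matches(keyword.lower(), list(lowered),
                             n=max_results, cutoff=cutoff)
    return [lowered[hit] for hit in hits]

topics = ["machine learning", "video retrieval", "speech recognition"]
result = close_topic_words("video retreival", topics)  # misspelled query
```

A semantic matcher (for example one based on a thesaurus) could be substituted for the string ratio without changing the rest of the flow.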

Further, the background topic word set is obtained before S11. Fig. 2 is a flow diagram of the method for obtaining the background topic word set in an embodiment of the invention. As shown in Fig. 2, the method includes:

S111: obtaining the caption information and audio information of all videos in the database, and converting the caption information and audio information into first video text information;

S112: processing the first video text information to obtain video text topic words;

S113: using the video text topic words to obtain background document information related to them, and processing the background document information to obtain first background document topic word information;

S114: obtaining the background topic word set from the video text topic words and the first background document topic word information.

S111 is described further:

First, all videos in the video library are extracted and each video is split into frames. OCR is used to obtain the caption information on each frame, and the caption information of each frame is compared with that of the following frame: if the similarity of the two captions exceeds 80%, the captions of the two frames are considered duplicates and only the caption of the earlier frame is kept. This continues until the captions of all frames of the video have been processed, yielding the caption information. The audio information is separated from the video and processed with ASR (automatic speech recognition), which converts it into text information.
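The frame-to-frame caption comparison can be sketched as follows. The description fixes the 80% threshold but not the similarity measure, so this example assumes `difflib.SequenceMatcher.ratio()`, and it compares each caption with the last caption kept rather than strictly with the immediately preceding frame; both choices are assumptions for illustration.

```python
from difflib import SequenceMatcher

def dedupe_captions(frame_captions, threshold=0.8):
    """Keep a caption only if it is not a near-repeat of the last kept one."""
    kept = []
    for caption in frame_captions:
        if kept and SequenceMatcher(None, kept[-1], caption).ratio() > threshold:
            continue  # duplicate of the earlier frame: keep only the earlier caption
        kept.append(caption)
    return kept

# Hypothetical OCR output for four consecutive frames.
frames = ["Hello world", "Hello world", "Hello world!", "Goodbye"]
captions = dedupe_captions(frames)
```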

Garbage strings are then removed from the caption information and the text information. The rules for this processing may be as follows:

(1) if a string contains more than 40 characters, it is regarded as a garbage string and filtered out;

(2) if the alphanumeric characters (or alphabetic characters) in a string make up less than 50% of its characters, it is regarded as a garbage string and filtered out;

(3) if a string contains 4 consecutive identical characters, it is regarded as a garbage string and filtered out;

(4) for a string consisting only of letters, the numbers of vowels and consonants are checked; if the count of one kind is less than 10% of the count of the other, the string is regarded as a garbage string and filtered out;

(5) after the first and last characters of a string are removed, if more than two kinds of punctuation characters remain, the string is regarded as a garbage string and filtered out;

(6) if the first and last characters of a string are both lowercase letters and an uppercase letter appears anywhere in between, the string is regarded as a garbage string and filtered out.

Based on the above rules, most garbage strings in the caption information and the text information are effectively filtered out. Strings in certain specific formats, such as e-mail addresses or other specific notations, are retained by means of regular expressions and a configuration file. The remaining caption information and text information are then deduplicated and merged to obtain the video text information.
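The six rules and the whitelist of special formats can be sketched as a token filter. This is only one possible reading of the rules: the e-mail pattern, the ASCII-only vowel check, and the names `is_garbage` and `clean` are assumptions, and the configuration file mentioned in the description is replaced here by a single hard-coded regular expression.

```python
import re
import string

# Assumed example of a whitelisted format (the source mentions e-mail addresses).
EMAIL = re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")
VOWELS = set("aeiouAEIOU")

def is_garbage(s):
    """Apply rules (1)-(6) to one whitespace-free string."""
    if not s:
        return True
    if EMAIL.match(s):
        return False                                   # retained special format
    if len(s) > 40:                                    # rule (1)
        return True
    if sum(c.isalnum() for c in s) < 0.5 * len(s):     # rule (2)
        return True
    if re.search(r"(.)\1{3}", s):                      # rule (3)
        return True
    if s.isalpha() and s.isascii():                    # rule (4)
        vowels = sum(c in VOWELS for c in s)
        consonants = len(s) - vowels
        if min(vowels, consonants) < 0.1 * max(vowels, consonants):
            return True
    inner = s[1:-1]
    if len({c for c in inner if c in string.punctuation}) > 2:  # rule (5)
        return True
    if s[0].islower() and s[-1].islower() and any(c.isupper() for c in inner):
        return True                                    # rule (6)
    return False

def clean(text):
    """Drop garbage tokens from whitespace-separated OCR or ASR text."""
    return " ".join(token for token in text.split() if not is_garbage(token))
```

Note that rule (4), read literally, also rejects vowel-free abbreviations; a whitelist entry would be needed to keep them.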

S112 is described further:

The KEA++ algorithm is used to process the video text information and obtain its topic words. In a specific implementation: training data set: the video text information is used as training data and the model is trained on it; controlled vocabulary: key phrases are extracted from the video text information, and the key phrases of each field are assigned to the controlled vocabulary of that field; topic word generation: based on the training data set and the controlled vocabulary, the KEA++ algorithm generates a learning model, and this model is used to predict topic words for the video text information, generating the topic words.

S113 is described further:

The topic words obtained in S112 are used for a literature search to obtain the background document information related to them, and the KEA++ algorithm is used to process the background document information and obtain the background document topic word information. In a specific implementation: training data set: the background document information is used as training data and the model is trained on it; controlled vocabulary: key phrases are extracted from the background document information, and the key phrases of each field are assigned to the controlled vocabulary of that field; topic word generation: based on the training data set and the controlled vocabulary, the KEA++ algorithm generates a learning model, and this model is used to predict topic words for the background document information, generating the background document topic word information.

S114 is described further:

For each topic word in the video text topic words and in the background document topic word information predicted by the learning model produced by the KEA++ algorithm, the predicted relevance probability of the topic word is multiplied by the Lucene score between the video and that topic word. The topic words with the highest products are used to build the topic word set.

The Lucene scoring rule depends on the term frequency (TF) with which the topic word or keyword of the query occurs in each document or text, on the inverse document frequency (IDF), on the weight of each document or text and of each query field, and on the length of the queried field in the document or text.
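The selection in S114 amounts to ranking candidates by a product of two scores. The sketch below assumes that the KEA++ probabilities and the Lucene scores have already been computed elsewhere and are supplied as plain numbers; the function name, the `top_k` cut-off, and the handling of duplicate topic words (keeping the larger product) are hypothetical choices, since the description only says that the highest products are kept.

```python
def build_topic_word_set(candidates, top_k=3):
    """candidates: iterable of (topic_word, kea_probability, lucene_score)."""
    best = {}
    for word, probability, lucene_score in candidates:
        product = probability * lucene_score
        # A word may come from both the video text and a background document.
        best[word] = max(product, best.get(word, 0.0))
    ranked = sorted(best, key=best.get, reverse=True)
    return ranked[:top_k]

# Hypothetical scores for illustration only.
candidates = [
    ("neural network", 0.9, 2.0),   # product 1.8
    ("deep learning", 0.6, 4.0),    # product 2.4
    ("lecture", 0.8, 0.5),          # product 0.4
    ("neural network", 0.5, 1.0),   # product 0.5, lower duplicate
]
top = build_topic_word_set(candidates, top_k=2)
```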

S12 is described further:

The videos in the video database are searched using the topic words close to the search keyword, thereby obtaining the videos corresponding to those topic words.

S13 is described further:

After the videos corresponding to the topic words close to the search keyword have been obtained in S12, the basic video information of each video is obtained. The basic video information includes at least video text information, background document topic word information, and video comment information.

The video text information and the background document topic word information are extracted from the results of the video preprocessing described under S11. The video comment information is obtained by crawling the comments of the video with a web crawler.

S14 is described further:

Similarity is measured for each video according to its basic video information to obtain video similarity measurement results. Because the basic video information includes at least video text information, background document topic word information, and video comment information, this embodiment measures video similarity with each of these three kinds of information separately.

Further, the similarity of the videos is measured separately according to the video text information, the background document topic word information, and the video comment information, and a measurement result is obtained for each. That is: the similarity of the videos is measured according to the video text information to obtain a video similarity measurement result; according to the background document topic word information to obtain a video similarity measurement result; and according to the video comment information to obtain a video similarity measurement result.

Further, measuring the similarity of the videos according to the video text information is divided into three measurements: one based on the string similarity of the video text information, one based on the corpus similarity of the video text information, and one based on the similarity of the textual content of the video text information. Together these yield the similarity measurement results.

The string-based video similarity measurement uses the cosine similarity between strings, computed as follows:

\cos(T_{i}, T_{j}) = \frac{\sum_{k=1}^{n} w_{ik} w_{jk}}{\sqrt{\sum_{k=1}^{n} w_{ik}^{2}} \, \sqrt{\sum_{k=1}^{n} w_{jk}^{2}}}

where T_i denotes the vector of the i-th string, w_{ik} denotes the k-th dimension of the i-th string vector, T_j denotes the vector of the j-th string, w_{jk} denotes the k-th dimension of the j-th string vector, k = 1, ..., n, and i, j = 1, ..., m.
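The cosine formula can be illustrated with a short sketch. The description does not say how a string is turned into a vector, so the example assumes whitespace-token term frequencies as the components w_{ik}; the function name is hypothetical.

```python
import math
from collections import Counter

def cosine_similarity(text_i, text_j):
    """Cosine similarity of two strings using term-frequency vectors (assumed)."""
    vec_i, vec_j = Counter(text_i.split()), Counter(text_j.split())
    dot = sum(vec_i[term] * vec_j[term] for term in vec_i)
    norm_i = math.sqrt(sum(v * v for v in vec_i.values()))
    norm_j = math.sqrt(sum(v * v for v in vec_j.values()))
    if norm_i == 0 or norm_j == 0:
        return 0.0  # an empty string has no direction; treat as dissimilar
    return dot / (norm_i * norm_j)
```

Other weightings, such as TF-IDF, fit the same formula by changing only how the vectors are built.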

The corpus-based video similarity measurement uses a set similarity algorithm (the Dice coefficient), computed as follows:

\mathrm{Dice}(T_{i}, T_{j}) = \frac{2 \times \mathrm{comm}(T_{i}, T_{j})}{\mathrm{size}(T_{i}) + \mathrm{size}(T_{j})}

where T_i denotes the video text information, T_j denotes the corpus information, comm(T_i, T_j) denotes the number of identical strings shared by the video text information and the corpus information, and size(T_i) and size(T_j) denote the sizes of the string sets of the video text information and of the corpus information, respectively.
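A direct transcription of the Dice formula, assuming both inputs are given as collections of strings that are treated as sets (duplicates ignored); the function name is hypothetical.

```python
def dice_similarity(strings_i, strings_j):
    """Dice coefficient: 2 * |common| / (|set_i| + |set_j|)."""
    set_i, set_j = set(strings_i), set(strings_j)
    if not set_i and not set_j:
        return 0.0  # convention chosen here for two empty sets
    return 2 * len(set_i & set_j) / (len(set_i) + len(set_j))
```

For example, the sets {a, b, c} and {b, c, d} share two strings, giving 2 × 2 / (3 + 3) = 2/3.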

The content-based video similarity measurement is computed with the Lin algorithm, as follows:

\mathrm{Lin}(T_{i}, T_{j}) = \frac{2 \times IC(LCS(T_{i}, T_{j}))}{IC(T_{i}) + IC(T_{j})}

where T_i and T_j are the two strings to be compared, LCS(T_i, T_j) is the nearest common ancestor of the two strings, and IC(T) denotes the information content of string T.
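To make the Lin formula concrete, the sketch below evaluates it on a small hand-made taxonomy. The taxonomy, the concept probabilities, and the definition IC(c) = -log p(c) are assumptions for illustration; the description does not say which taxonomy or corpus statistics are used.

```python
import math

# Hypothetical toy taxonomy (child -> parent) and concept probabilities p(c).
PARENT = {"dog": "animal", "cat": "animal", "car": "vehicle",
          "animal": "entity", "vehicle": "entity", "entity": None}
PROB = {"entity": 1.0, "animal": 0.5, "vehicle": 0.5,
        "dog": 0.25, "cat": 0.25, "car": 0.5}

def ancestors(concept):
    """Return the concept followed by its ancestors up to the root."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain

def information_content(concept):
    return -math.log(PROB[concept])

def lcs(a, b):
    """Nearest common ancestor: first ancestor of a that also subsumes b."""
    b_chain = set(ancestors(b))
    return next(c for c in ancestors(a) if c in b_chain)

def lin_similarity(a, b):
    denominator = information_content(a) + information_content(b)
    if denominator == 0:
        return 1.0 if a == b else 0.0
    return 2 * information_content(lcs(a, b)) / denominator
```

With these numbers, "dog" and "cat" share the ancestor "animal" and score 0.5, while "dog" and "car" meet only at the root, whose information content is zero.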

Further, measuring the similarity of the videos according to the background document topic word information is divided into two measurements: video similarity is measured according to the set similarity of the background document topic word information to obtain a set-based video similarity measurement result, and according to the lexical similarity of the background document topic word information to obtain a vocabulary-based video similarity measurement result.

The set-based video similarity measurement result is computed with a set similarity algorithm, as follows:

\mathrm{Sim}(S_{i}, S_{j}) = \frac{\sum_{t_{ik} \in S_{i}} \max_{h} \mathrm{wup}(t_{ik}, t_{jh}) + \sum_{t_{jh} \in S_{j}} \max_{k} \mathrm{wup}(t_{jh}, t_{ik})}{\mathrm{size}(S_{i}) + \mathrm{size}(S_{j})}

where S_i and S_j are the two topic word sets being compared; t_{ik} and t_{jh} are topic words in S_i and S_j, respectively; wup(t_{ik}, t_{jh}) is the wup similarity between two topic words; max_h wup(t_{ik}, t_{jh}) is the maximum wup similarity between topic word t_{ik} and all topic words in S_j; max_k wup(t_{jh}, t_{ik}) is the maximum wup similarity between topic word t_{jh} and all topic words in S_i; and size(S) denotes the number of elements in set S.
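The set similarity can be sketched as a best-match average in both directions. The example assumes that "wup" refers to Wu-Palmer similarity, 2 × depth(LCS) / (depth(a) + depth(b)), and evaluates it on a hypothetical toy taxonomy with the root at depth 1; neither the taxonomy nor the function names come from the description.

```python
# Hypothetical toy taxonomy (child -> parent).
PARENT = {"dog": "animal", "cat": "animal", "car": "vehicle",
          "animal": "entity", "vehicle": "entity", "entity": None}

def path_to_root(term):
    chain = []
    while term is not None:
        chain.append(term)
        term = PARENT[term]
    return chain

def wup(a, b):
    """Wu-Palmer similarity (assumed meaning of 'wup'), root depth = 1."""
    chain_a, chain_b = path_to_root(a), path_to_root(b)
    common = next(t for t in chain_a if t in chain_b)
    return 2 * len(path_to_root(common)) / (len(chain_a) + len(chain_b))

def set_similarity(set_i, set_j):
    """Sum of best matches in both directions over size(S_i) + size(S_j)."""
    if not set_i or not set_j:
        return 0.0
    forward = sum(max(wup(t, u) for u in set_j) for t in set_i)
    backward = sum(max(wup(u, t) for t in set_i) for u in set_j)
    return (forward + backward) / (len(set_i) + len(set_j))
```

Identical sets score 1, and the score falls as the best available match for each topic word moves further away in the taxonomy.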

Further, the similarity of the videos is measured according to the video comment information as follows: the video comment information is processed to obtain the relation information among videos, users, and comments contained in it; the similarity of the videos is then measured according to this relation information, yielding a video similarity measurement result based on the video comment information.
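The description does not specify how similarity is derived from the video, user, and comment relations. Purely as an illustration, the sketch below assumes one simple option: two videos are similar to the extent that the same users commented on both, measured with the Jaccard index. The record layout and function names are hypothetical.

```python
from collections import defaultdict

def commenters_by_video(comments):
    """comments: iterable of (video_id, user_id, text) relation records."""
    users = defaultdict(set)
    for video_id, user_id, _text in comments:
        users[video_id].add(user_id)
    return users

def comment_similarity(users, video_a, video_b):
    """Jaccard index of the two videos' commenter sets (an assumed measure)."""
    a, b = users.get(video_a, set()), users.get(video_b, set())
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical crawled records.
comments = [("v1", "u1", "nice"), ("v1", "u2", "good"),
            ("v2", "u2", "ok"), ("v2", "u3", "hm")]
users = commenters_by_video(comments)
score = comment_similarity(users, "v1", "v2")
```

A measure that also compares the comment texts themselves would use the third field, which this sketch ignores.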

S15 is described further:

A comprehensive video similarity measurement algorithm is used to build a comprehensive video similarity measurement model; the similarity measurement results are used for training according to this model to obtain training results; comprehensive weighting is applied to the training results to obtain a comprehensive weighted result; and the comprehensive video similarity measurement result is obtained from the comprehensive weighted result.

Further, Fig. 3 is a flow diagram of the steps for obtaining the comprehensive video similarity measurement result in an embodiment of the invention. As shown in Fig. 3, the flow includes:

S151: building a comprehensive video similarity measurement model;

S152: training on the similarity measurement results according to the comprehensive video similarity measurement model to obtain training results;

S153: applying comprehensive weighting to the training results to obtain a comprehensive weighted result;

S154: obtaining the comprehensive video similarity measurement result from the comprehensive weighted result.

S151 is described further:

The AdaBoost algorithm is set up first and used to prepare the training data. The degree of similarity between videos is represented by a real value in the interval [0, 1], where 0 means completely different and 1 means identical; the larger the value, the higher the similarity. The training data is labeled accordingly, which completes the data preparation and forms the comprehensive video similarity measurement model.

S152 is described further:

Using the AdaBoost algorithm, a weak regression learner customized for each video similarity metric is trained in turn. The metrics specifically include the similarity of the video text content, of the background document topic word content, and of the video comments. This yields the training results.

S153 is described further:

Comprehensive weighting is applied to the similarity results after training. The weighting is a weighted average in which all similarity results have the same weight, and it yields the comprehensive weighted result.

S154 is described further:

The videos are sorted by the size of the comprehensive weighted result, which determines the comprehensive video similarity measurement result.
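Steps S153 and S154 reduce to an equal-weight average followed by a sort. The sketch below assumes the per-metric similarity scores of each video are already available as numbers in [0, 1]; the function name and the sample values are hypothetical.

```python
def rank_videos(scores_by_video):
    """scores_by_video: {video_id: [similarity scores in [0, 1]]}.

    Returns (video_id, combined_score) pairs, highest score first.
    """
    combined = {
        video: sum(scores) / len(scores)   # equal weights = arithmetic mean
        for video, scores in scores_by_video.items() if scores
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)

# Hypothetical text, topic word, and comment similarity scores per video.
ranking = rank_videos({
    "video_b": [0.5, 0.5, 0.5],
    "video_a": [0.9, 0.6, 0.3],
    "video_c": [0.2, 0.1, 0.0],
})
```

The resulting order is the order in which S16 presents the videos to the user.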

S16 is described further:

The videos are displayed according to the comprehensive video similarity measurement result: they are shown to the user in descending order of that result so that the user can review them conveniently.

Fig. 4 is a structural diagram of the video retrieval system in an embodiment of the invention. As shown in Fig. 4, the system includes:

Matching module 11: for matching the user's search keyword against the background topic word set to obtain the topic words close to the search keyword;

Information acquisition module 12: for obtaining, according to the topic words close to the search keyword, the video text information, background document topic word information, and video comment information of the videos corresponding to those topic words;

Similarity measurement module 13: for measuring video similarity according to the video text information, background document topic word information, and video comment information to obtain video similarity measurement results;

Comprehensive weighting module 14: for performing a comprehensive weighted measurement on the video similarity measurement results to obtain a comprehensive video similarity measurement result;

Display module 15: for displaying the videos to the user from high to low according to the comprehensive video similarity measurement result.

Preferably, the information acquisition module 12 includes:

a video acquisition unit, for retrieving videos according to the topic words close to the search keyword to obtain the videos corresponding to those topic words;

an information acquisition unit, for obtaining the video text information, the video background document information, and the video comment information from the videos.

It should be noted that the information acquisition module includes the video acquisition unit and the information acquisition unit. The video acquisition unit retrieves videos using the topic words close to the search keyword and obtains the retrieved videos; the information acquisition unit processes these videos to obtain their video text information, video background document information, and video comment information.

Further, the video acquisition unit retrieves videos using the topic words close to the search keyword and obtains the retrieved videos. In the information acquisition unit, these videos are processed with OCR and ASR; redundancy is removed from the resulting text information, and the remaining text information is merged to obtain the video text information. The video text information is processed with KEA++ to obtain the video topic words; these topic words are used for a literature search to obtain the background document information; and the background document information is processed with KEA++ and Lucene to obtain the background document topic word information. After a video has been retrieved, it is also processed with a web crawler to obtain its video comment information.

Preferably, the information acquisition unit includes a crawler subunit;

the crawler subunit is used for crawling the video with a web crawler to obtain the video comment information.

Preferably, the similarity measurement module 13 includes:

Text information measurement unit: configured to measure the similarity of the videos according to the video text information, obtaining a video similarity measurement result;

Descriptor information measurement unit: configured to measure the similarity of the videos according to the background document descriptor information, obtaining a video similarity measurement result;

Comment information measurement unit: configured to measure the similarity of the videos according to the video comment information, obtaining a video similarity measurement result.

It should be noted that the text information measurement unit measures the similarity of the videos from the video text information, the descriptor information measurement unit measures it from the background document descriptor information, and the comment information measurement unit measures it from the video comment information, each obtaining a video similarity measurement result. The execution order of the three units is not limited; they may be executed simultaneously or separately.

Preferably, the text information measurement unit includes:

Character string measurement subunit: configured to measure video similarity according to the similarity of the character strings of the video text information, obtaining a character-string-based video similarity measurement result;

Corpus measurement subunit: configured to measure video similarity according to the corpus-based similarity of the video text information, obtaining a corpus-based video similarity measurement result;

Word content measurement subunit: configured to measure video similarity according to the similarity of the word content of the video text information, obtaining a word-content-based video similarity measurement result.

It should be noted that the character string measurement subunit measures video similarity according to the similarity of the character strings of the video text information, obtaining a character-string-based video similarity measurement result; the corpus measurement subunit measures video similarity according to the corpus-based similarity of the video text information, obtaining a corpus-based video similarity measurement result; and the word content measurement subunit measures video similarity according to the similarity of the word content of the video text information, obtaining a word-content-based video similarity measurement result. The order in which these three subunits perform their measurements is not fixed; they may run simultaneously or one after another.
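As an illustration of two of the text-based measures, the following sketch may help. The description does not name the specific algorithms, so both choices here are assumptions: the character-string measure uses the ratio computed by Python's difflib.SequenceMatcher, and the word-content measure uses cosine similarity over word counts. The function names are hypothetical. The corpus-based measure is not sketched because it depends on an external corpus that the text does not describe.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def string_similarity(text_a, text_b):
    """Character-string similarity in [0, 1] (assumed measure: SequenceMatcher ratio)."""
    return SequenceMatcher(None, text_a, text_b).ratio()


def word_content_similarity(text_a, text_b):
    """Word-content similarity in [0, 1] (assumed measure: cosine over word counts)."""
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    # Dot product over the words that the two texts share.
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty text shares no word content with anything.
    return dot / (norm_a * norm_b)
```

The two functions are independent of each other, which matches the statement that the subunits may run in any order or in parallel.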

Preferably, the descriptor information measurement unit includes:

Set measurement subunit: configured to measure video similarity according to the set similarity of the background document descriptor information, obtaining a set-based video similarity measurement result;

Vocabulary measurement subunit: configured to measure video similarity according to the lexical similarity of the background document descriptor information, obtaining a vocabulary-based video similarity measurement result.

It should be noted that the set measurement subunit measures video similarity according to the set similarity of the background document descriptor information, obtaining a set-based video similarity measurement result, and the vocabulary measurement subunit measures video similarity according to the lexical similarity of the background document descriptor information, obtaining a vocabulary-based video similarity measurement result. The order in which the two subunits perform their measurements is not fixed; they may run simultaneously or one after another.
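A set similarity between the background document descriptors of two videos could be computed as in the sketch below. The description does not state which set measure is used; the Jaccard coefficient is assumed here as a common choice, and the function name jaccard_similarity is hypothetical.

```python
def jaccard_similarity(descriptors_a, descriptors_b):
    """Set similarity of two descriptor collections (assumed measure: Jaccard coefficient)."""
    set_a, set_b = set(descriptors_a), set(descriptors_b)
    union = set_a | set_b
    if not union:
        return 0.0  # Two empty descriptor sets carry no evidence of similarity.
    # Share of descriptors common to both videos among all descriptors of either.
    return len(set_a & set_b) / len(union)
```

For example, two videos whose descriptor sets share two of four distinct descriptors would score 0.5 under this assumed measure.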

Preferably, the comment information measurement unit includes:

Comment information processing subunit: configured to process the video comment information to obtain relation information among the videos, users, and comments in the video comment information;

Measurement subunit: configured to measure the similarity of the videos according to the relation information among the videos, users, and comments in the video comment information, obtaining a similarity measurement result based on the video comment information.
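One possible reading of the two subunits above is sketched here. The description does not define how the video, user, and comment relations are turned into a similarity value, so the sketch makes an assumption: two videos are considered similar to the extent that the same users commented on both, measured by the overlap of their commenter sets. The tuple layout (user_id, video_id, comment_text) and both function names are hypothetical.

```python
from collections import defaultdict


def commenters_by_video(comments):
    """Build the video-to-users relation from (user_id, video_id, comment_text) tuples."""
    users = defaultdict(set)
    for user_id, video_id, _comment_text in comments:
        users[video_id].add(user_id)
    return users


def comment_similarity(comments, video_a, video_b):
    """Similarity of two videos as the overlap of their commenter sets (assumed measure)."""
    users = commenters_by_video(comments)
    set_a = users.get(video_a, set())
    set_b = users.get(video_b, set())
    union = set_a | set_b
    if not union:
        return 0.0  # Neither video has any comments.
    return len(set_a & set_b) / len(union)
```

The comment text itself is ignored in this sketch; a fuller treatment could also compare what the shared users wrote.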

Preferably, the comprehensive weighting module 14 includes:

Construction unit 141: configured to build a comprehensive video similarity measurement model;

Training unit 142: configured to perform data training on the similarity measurement results according to the comprehensive video similarity measurement model, obtaining a training result;

Weighting processing unit 143: configured to perform comprehensive weighting on the training result, obtaining a comprehensive weighting result;

Comprehensive measurement acquisition unit 144: configured to obtain the comprehensive video similarity measurement result according to the comprehensive weighting result.

It should be noted that the construction unit 141 builds the comprehensive video similarity measurement model; the training unit 142 performs data training on the video similarity measurement results according to the model that has been built, obtaining a training result; the weighting processing unit 143 performs comprehensive weighting on the training result, obtaining a comprehensive weighting result; and the comprehensive measurement acquisition unit 144 obtains the comprehensive video similarity measurement result according to the comprehensive weighting result.
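The weighting and ranking stage can be sketched as follows. The description does not give the form of the model or the training procedure, so this sketch assumes a simple linear combination whose weights have already been obtained by training; the weight values, the score names, and the function names combined_score and rank_videos are all hypothetical.

```python
def combined_score(scores, weights):
    """Weighted average of per-source similarity scores (assumed linear model)."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("weights must not sum to zero")
    # A source missing from `scores` contributes a similarity of 0.0.
    return sum(weights[name] * scores.get(name, 0.0) for name in weights) / total


def rank_videos(per_video_scores, weights):
    """Return (video_id, combined score) pairs sorted from high to low."""
    ranked = [(vid, combined_score(s, weights)) for vid, s in per_video_scores.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)
```

A call such as rank_videos(scores, {"text": 0.5, "descriptor": 0.3, "comment": 0.2}) would yield the ordering in which a display module could present the videos, highest combined similarity first.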

Specifically, for the operating principles of the functional modules of the system in this embodiment of the present invention, reference may be made to the description of the method embodiment; details are not repeated here.

A person of ordinary skill in the art will appreciate that all or part of the steps of the methods in the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may include read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, or the like.

The video retrieval method and system based on video content provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to set forth the principles and implementations of the invention, and the description of the above embodiments is intended only to help in understanding the method of the invention and its core idea. Meanwhile, a person of ordinary skill in the art may, following the idea of the invention, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims

1. A video retrieval method based on video content, characterized in that the method comprises:

Matching the search keyword of the user against the descriptors of the background descriptor set, and obtaining the matched descriptors that are close to the search keyword;

Performing video retrieval according to the descriptors close to the search keyword, and obtaining the videos corresponding to the descriptors close to the search keyword;

2. The video retrieval method according to claim 1, characterized in that, before the search keyword of the user is matched against the descriptors of the background descriptor set, the method further comprises:

Obtaining caption information and audio information of all videos in a database, and converting the caption information and the audio information into first video text information;

Obtaining, by means of the video text information descriptors, background document information related to the video text information descriptors, and processing the background document information to obtain first background document descriptor information;

Obtaining the background descriptor set according to the video text information descriptors and the first background document descriptor information.

3. The video retrieval method according to claim 1, characterized in that the video basic information includes at least any one of second video text information, second background document descriptor information, and video comment information.

4. The video retrieval method according to claim 3, characterized in that the step of obtaining the video comment information comprises:

5. The video retrieval method according to claim 1, characterized in that the step of measuring the video similarity according to the video basic information to obtain a video similarity measurement result comprises:

Measuring the similarity of the videos according to the second video text information, and obtaining a video similarity measurement result;

Measuring the similarity of the videos according to the second background document descriptor information, and obtaining a video similarity measurement result;

Measuring the similarity of the videos according to the video comment information, and obtaining a video similarity measurement result.

6. The video retrieval method according to claim 5, characterized in that the step of measuring the similarity of the videos according to the second video text information to obtain a video similarity measurement result comprises:

Measuring video similarity according to the similarity of the character strings of the second video text information, and obtaining a video similarity measurement result based on the character strings;

Measuring video similarity according to the corpus-based similarity of the second video text information, and obtaining a video similarity measurement result based on the corpus;

Measuring video similarity according to the similarity of the word content of the second video text information, and obtaining a video similarity measurement result based on the word content.

7. The video retrieval method according to claim 5, characterized in that the step of measuring the similarity of the videos according to the second background document descriptor information to obtain a video similarity measurement result comprises:

Measuring video similarity according to the set similarity of the second background document descriptor information, and obtaining a video similarity measurement result based on the set;

Measuring video similarity according to the lexical similarity of the second background document descriptor information, and obtaining a video similarity measurement result based on the vocabulary.

8. The video retrieval method according to claim 5, characterized in that the step of measuring the similarity of the videos according to the video comment information to obtain a video similarity measurement result comprises:

Processing the video comment information to obtain relation information among the videos, users, and comments in the video comment information;

Measuring the similarity of the videos according to the relation information among the videos, users, and comments in the video comment information, and obtaining a similarity measurement result based on the video comment information.

9. The video retrieval method according to claim 1, characterized in that the step of performing comprehensive weighted measurement on the video similarity measurement results to obtain a comprehensive video similarity measurement result comprises:

Building a comprehensive video similarity measurement model;

Performing data training on the similarity measurement results according to the comprehensive video similarity measurement model, and obtaining a training result;

10. A video retrieval system based on video content, characterized in that the system comprises:

Matching module: configured to match the search keyword of the user against the descriptors of the background descriptor set, and to obtain the matched descriptors that are close to the search keyword;

Retrieval module: configured to perform video retrieval according to the descriptors close to the search keyword, and to obtain the videos corresponding to the descriptors close to the search keyword;

Information obtaining module: configured to obtain video basic information according to the videos corresponding to the descriptors close to the search keyword;

Comprehensive weighting module: configured to perform comprehensive weighted measurement on the video similarity measurement results, and to obtain a comprehensive video similarity measurement result;