Method and apparatus for determining an original video
Technical field
The present invention relates to the field of video processing, and in particular to a method and apparatus for determining an original video.
Background
Multiple videos on a network may have identical or substantially identical content. A provider such as a web video service would like to identify these similar videos and avoid repeatedly recommending similar content to users, which can greatly improve the user experience.
The currently common approach is to first compute identification information for each video, such as an MD5 digest, and then use MapReduce to find videos with the same identification information and designate among them the original video to be recommended to users. By the nature of MD5, two videos that differ only slightly are likely to produce very different MD5 codes. Existing identification information such as an MD5 digest can therefore identify identical videos fairly accurately, but it performs poorly at identifying videos that are substantially similar but not identical. In practice, a user may edit an existing video before uploading it, for example by adding or deleting a small amount of content at the beginning or end of the video, or by inserting an advertisement in the middle. The current approach often misses such videos when collecting similar videos, so similar content is often recommended to users, resulting in a very poor user experience.
Summary of the invention
In view of this, the technical problem to be solved by the present invention is how to avoid repeatedly recommending similar content to users.
To solve the above technical problem, the present invention provides a method for determining an original video, including:
obtaining first-type similar video subsets from multiple videos based on the similarity of video fingerprints;
determining the original video in each first-type similar video subset.
For the above method, in a possible implementation, obtaining first-type similar video subsets from multiple videos based on the similarity of video fingerprints includes:
any two videos among the multiple videos constitute a video pair; for each video pair, if the similarity of the video fingerprints of its two videos is greater than a first threshold, the two videos are assigned to the same first-type similar video subset.
For the above method, in a possible implementation, obtaining first-type similar video subsets from multiple videos based on the similarity of video fingerprints includes:
any two videos among the multiple videos constitute a video pair; for each video pair, if the similarity of the video fingerprints of its two videos is greater than a first threshold and the similarity of the basic information of the two videos is greater than a second threshold, the two videos are assigned to the same first-type similar video subset, the basic information including some or all of the following elements: video category, video title, video tags, video quality.
For the above method, in a possible implementation, the multiple videos include a newly added video subset, and obtaining first-type similar video subsets from multiple videos based on the similarity of video fingerprints includes:
any video in the newly added video subset and any other video among the multiple videos constitute a video pair, while any two videos that do not belong to the newly added video subset do not constitute a video pair; for each video pair, if the similarity of the video fingerprints of its two videos is greater than a first threshold, the two videos are assigned to the same first-type similar video subset.
For the above method, in a possible implementation, the multiple videos include a newly added video subset, and obtaining first-type similar video subsets from multiple videos based on the similarity of video fingerprints includes:
any video in the newly added video subset and any other video among the multiple videos constitute a video pair, while any two videos that do not belong to the newly added video subset do not constitute a video pair; for each video pair, if the similarity of the video fingerprints of its two videos is greater than a first threshold and the similarity of the basic information of the two videos is greater than a second threshold, the two videos are assigned to the same first-type similar video subset, the basic information including some or all of the following elements: video category, video title, video tags, video quality.
For the above method, in a possible implementation, the video fingerprint of each video includes multiple pieces of feature information corresponding respectively to multiple frames of the video, and the similarity of the video fingerprints of two videos is determined based on the number of identical pieces of feature information in the two fingerprints.
For the above method, in a possible implementation, the similarity of the basic information of two videos is determined based on whether each pair of corresponding elements in the basic information of the two videos is identical.
For the above method, in a possible implementation, the method further includes:
when second-type similar video subsets are merged with first-type similar video subsets, merging the second-type and first-type similar video subsets that include the same video into one merged similar video subset;
determining the original video in each merged similar video subset.
For the above method, in a possible implementation, the original video in a similar video subset is determined based on the relevant information of each video in that subset, the relevant information including some or all of the following elements: video category, video title, video tags, video quality, video playback data, user interaction data.
For the above method, in a possible implementation, determining the original video in a similar video subset includes:
obtaining the relevant-information metric value of each video in the similar video subset, the relevant-information metric value of a video being determined based on the state of each element in the relevant information of that video;
selecting the video with the largest relevant-information metric value as the original video of the subset.
The present invention also provides a device for determining an original video, including:
a similar video subset acquisition module, configured to obtain first-type similar video subsets from multiple videos based on the similarity of video fingerprints;
an original determination module, configured to determine the original video in each first-type similar video subset.
For the above device, in a possible implementation, the similar video subset acquisition module includes any one or more of the following units:
a first acquiring unit, configured to, where any two videos among the multiple videos constitute a video pair, for each video pair, assign its two videos to the same first-type similar video subset if the similarity of their video fingerprints is greater than a first threshold;
a second acquiring unit, configured to, where any two videos among the multiple videos constitute a video pair, for each video pair, assign its two videos to the same first-type similar video subset if the similarity of their video fingerprints is greater than a first threshold and the similarity of their basic information is greater than a second threshold, the basic information including some or all of the following elements: video category, video title, video tags, video quality;
a third acquiring unit, configured to, where any video in a newly added video subset and any other video among the multiple videos constitute a video pair and any two videos that do not belong to the newly added video subset do not constitute a video pair, for each video pair, assign its two videos to the same first-type similar video subset if the similarity of their video fingerprints is greater than a first threshold;
a fourth acquiring unit, configured to, where the multiple videos include a newly added video subset, any video in the newly added video subset and any other video among the multiple videos constitute a video pair, and any two videos that do not belong to the newly added video subset do not constitute a video pair, for each video pair, assign its two videos to the same first-type similar video subset if the similarity of their video fingerprints is greater than a first threshold and the similarity of their basic information is greater than a second threshold, the basic information including some or all of the following elements: video category, video title, video tags, video quality.
For the above device, in a possible implementation, the video fingerprint of each video includes multiple pieces of feature information corresponding respectively to multiple frames of the video, and the similarity of the video fingerprints of two videos is determined based on the number of identical pieces of feature information in the two fingerprints.
For the above device, in a possible implementation, the device further includes:
a similar video subset merging module, configured to, when second-type similar video subsets are merged with first-type similar video subsets, merge the second-type and first-type similar video subsets that include the same video into one merged similar video subset;
the original determination module is further configured to determine the original video in each merged similar video subset.
For the above device, in a possible implementation, the original determination module is further configured to determine the original video in a similar video subset based on the relevant information of each video in that subset, the relevant information including some or all of the following elements: video category, video title, video tags, video quality, video playback data, user interaction data.
For the above device, in a possible implementation, the original determination module is further configured to obtain the relevant-information metric value of each video in the similar video subset, the relevant-information metric value of a video being determined based on the state of each element in the relevant information of that video, and to select the video with the largest relevant-information metric value as the original video of the subset.
Embodiments of the present invention obtain accurate similar video subsets based on the similarity of video fingerprints and determine the original video from each of them, thereby avoiding recommending similar videos to users and greatly improving the user experience.
Other features and aspects of the present invention will become clear from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Brief description of the drawings
The accompanying drawings, which are included in and constitute a part of the specification, illustrate exemplary embodiments, features, and aspects of the present invention together with the specification, and serve to explain the principles of the invention.
Fig. 1 shows a flowchart of a method for determining an original video according to an embodiment of the invention.
Fig. 2 shows a flowchart of a method for updating the original video according to an embodiment of the invention.
Fig. 3 shows a flowchart of a method, according to an embodiment of the invention, for combining similar video information obtained according to the present invention with similar video information obtained by other methods.
Fig. 4 shows a structural block diagram of a device for determining an original video according to an embodiment of the invention.
Fig. 5 shows a structural block diagram of a device for determining an original video according to another embodiment of the invention.
Detailed description
Various exemplary embodiments, features, and aspects of the present invention are described in detail below with reference to the accompanying drawings. Identical reference numerals in the drawings denote elements with identical or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless otherwise indicated.
The word "exemplary" as used here means "serving as an example, embodiment, or illustration". Any embodiment described here as "exemplary" should not necessarily be construed as preferred over or superior to other embodiments.
In addition, numerous specific details are given in the detailed description below in order to better illustrate the present invention. Those skilled in the art will understand that the present invention can equally be practiced without some of these specific details. In some instances, methods, means, elements, and circuits well known to those skilled in the art are not described in detail, so as to highlight the gist of the present invention.
Embodiment 1
Fig. 1 shows a flowchart of a method for determining an original video according to an embodiment of the invention. As shown in Fig. 1, the method mainly includes:
Step 101: obtain first-type similar video subsets from multiple videos based on the similarity of video fingerprints.
Step 102: determine the original video in each first-type similar video subset.
Specifically, a video fingerprint generally refers to information, produced through techniques such as identification, extraction, and compression, that can uniquely identify a video. The inventors found that, compared with MD5 codes, video fingerprints usually measure the degree of similarity between videos more effectively. In this embodiment, obtaining similar video subsets based on the similarity of video fingerprints and determining the original video from each of them can effectively avoid repeatedly recommending similar videos to users.
The similarity between the video fingerprints of different videos can be expressed in various ways. For example, the video fingerprint of each video may include multiple pieces of feature information corresponding respectively to multiple frames of the video (for example, multiple key frames), and the similarity of the video fingerprints of two videos can be determined based on the number of identical pieces of feature information in the two fingerprints. Suppose video A includes 90 pieces of feature information and video B includes 100, where video B is video A with an advertisement inserted at the beginning: the advertisement corresponds to 10 pieces of feature information, and the remaining 90 are identical to those of video A. In this case 100% of the feature information of video A is identical to that of video B, so the similarity of the video fingerprint of video A to that of video B can be taken as 100%, whereas 90% of the feature information of video B is identical to that of video A, so the similarity of the video fingerprint of video B to that of video A can be taken as 90%. This similarity is directional. Alternatively, the similarity of video A and video B can be taken as 95%, the arithmetic mean of 90% and 100%, in which case the similarity is non-directional. Either a directional or a non-directional video fingerprint similarity can be chosen depending on the situation. The above is an exemplary and non-limiting description of video fingerprint similarity; those skilled in the art may use any other suitable method to measure or obtain it.
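The directional and non-directional similarities in the example above can be sketched as follows. This is a minimal illustration only: it assumes that a fingerprint is a list of hashable per-frame feature values, and the function names `directional_similarity` and `symmetric_similarity` are hypothetical, not part of the described method.

```python
def directional_similarity(fp_a, fp_b):
    """Fraction of the features of fp_a that also occur in fp_b (assumed representation)."""
    if not fp_a:
        return 0.0
    features_b = set(fp_b)
    shared = sum(1 for feature in fp_a if feature in features_b)
    return shared / len(fp_a)


def symmetric_similarity(fp_a, fp_b):
    """Arithmetic mean of the two directional similarities."""
    return (directional_similarity(fp_a, fp_b) + directional_similarity(fp_b, fp_a)) / 2


# Video A has 90 features; video B is A with a 10-feature advertisement prepended.
video_a = [f"frame-{i}" for i in range(90)]
video_b = [f"ad-{i}" for i in range(10)] + video_a

a_to_b = directional_similarity(video_a, video_b)  # all 90 features of A occur in B
b_to_a = directional_similarity(video_b, video_a)  # 90 of the 100 features of B occur in A
both = symmetric_similarity(video_a, video_b)
```

The sketch counts set membership and ignores frame order, which is one possible reading of "number of identical pieces of feature information"; an order-aware comparison would be another.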
In a possible embodiment, any two videos among the multiple videos can be regarded as constituting a video pair. For each video pair, if the similarity of the video fingerprints of its two videos is greater than a first threshold, the two videos can be assigned to the same first-type similar video subset. For example, suppose the multiple videos are videos A, B, C, and D. If the video fingerprint similarity of videos A and B and that of videos A and C are both greater than the first threshold, videos A, B, and C can be assigned to the same first-type similar video subset. The first threshold can be set as needed. For example, if eliminating repeated recommendations has priority, a lower first threshold can be set; if avoiding omissions in recommendation has priority, a higher first threshold can be considered.
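The grouping of pairs into subsets can be sketched with a union-find structure, so that A, B, and C end up together when A-B and A-C both exceed the threshold. This is an illustrative assumption about how the grouping might be implemented in a single process; `group_similar`, the `similarity` callback, and the sample scores are hypothetical.

```python
from itertools import combinations


def group_similar(video_ids, similarity, first_threshold):
    """Group videos whose pairwise similarity exceeds first_threshold (union-find sketch)."""
    parent = {v: v for v in video_ids}

    def find(v):
        # Follow parent links to the representative, compressing the path on the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in combinations(video_ids, 2):
        if similarity(a, b) > first_threshold:
            parent[find(a)] = find(b)

    groups = {}
    for v in video_ids:
        groups.setdefault(find(v), set()).add(v)
    # Only groups with at least two videos are similar video subsets.
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical non-directional similarity scores for videos A, B, C, D.
scores = {("A", "B"): 0.9, ("A", "C"): 0.85}


def lookup(a, b):
    return scores.get((a, b), scores.get((b, a), 0.0))


subsets = group_similar(["A", "B", "C", "D"], lookup, first_threshold=0.8)
```

With these scores, video D matches nothing above the threshold and therefore belongs to no subset.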
The video pairs can also be filtered further, as needed, using the similarity of the basic information of the videos. Basic information here refers to information that can represent the basic characteristics of the video itself. The basic information of a video may include some or all of the following elements: video category (for example game, film and television), video title (for example Dota, Journey to the West), video tags (for example dota, pis, blue cat), and video quality (for example high definition, standard definition, or a video score computed from user interaction data). The elements included in the basic information can be set as needed.
In a possible embodiment, any two videos among the multiple videos can be regarded as constituting a video pair. For each video pair, if the similarity of the video fingerprints of its two videos is greater than a first threshold and the similarity of the basic information of the two videos is greater than a second threshold, the two videos can be assigned to the same first-type similar video subset. The second threshold can be set as needed. For example, suppose the multiple videos are videos A, B, C, and D. If the video fingerprint similarity of videos A and B and that of videos A and C are both greater than the first threshold, while the basic information similarity of videos A and B is greater than the second threshold and that of videos A and C is not, then videos A and B can be assigned to the same first-type similar video subset.
The similarity of the basic information of two videos can be determined based on whether the corresponding elements in their basic information are identical. For example, suppose the basic information of a video includes the elements "video category", "video title", "video tags", and "video quality". For the two videos of a video pair, each corresponding element can be scored "1" if identical and "0" if different, and the similarity of the basic information of the two videos can be expressed as the weighted sum of these comparison results. The weights of the different elements can be set according to business needs. For example, the weights of "video category", "video title", "video tags", and "video quality" can be set to 0.3, 0.3, 0.2, and 0.2 respectively. If video A and video B have the same video category and video title but different video tags and video quality, the similarity of their basic information can be expressed as 0.3+0.3+0+0=0.6. Those skilled in the art may use any other suitable method to measure or obtain the similarity of the basic information of videos.
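The weighted comparison in this example can be sketched as follows, using the weights 0.3, 0.3, 0.2, and 0.2 from the text. The dictionary keys and the function name `basic_info_similarity` are hypothetical choices for illustration.

```python
# Weights taken from the example in the text; the key names are assumptions.
WEIGHTS = {"category": 0.3, "title": 0.3, "tags": 0.2, "quality": 0.2}


def basic_info_similarity(info_a, info_b, weights=WEIGHTS):
    """Weighted sum of 1/0 comparison results over the corresponding elements."""
    total = 0.0
    for key, weight in weights.items():
        # An element counts as identical only if both videos carry it with the same value.
        if key in info_a and key in info_b and info_a[key] == info_b[key]:
            total += weight
    return total


video_a = {"category": "game", "title": "Dota", "tags": "dota", "quality": "HD"}
video_b = {"category": "game", "title": "Dota", "tags": "pis", "quality": "SD"}

similarity_ab = basic_info_similarity(video_a, video_b)  # 0.3 + 0.3 + 0 + 0
```

Treating a missing element as "different" is a design choice of this sketch; the text does not say how missing elements should be handled.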
The above process of obtaining first-type similar video subsets from multiple videos based on the similarity of video fingerprints can be implemented with an algorithm framework based on Spark+GraphX, which helps improve computational efficiency.
The original video in each first-type similar video subset can be determined based on the relevant information of the videos. Relevant information here refers to information that can represent the basic characteristics of the video itself and/or reflect the popularity of the video. The relevant information may include some or all of the following elements: video category (for example game, film and television), video title (for example Dota, Journey to the West), video tags (for example dota, pis, blue cat), video quality (for example high definition, standard definition, or a video score computed from user interaction data), video playback data (for example the play count), and user interaction data (for example user likes and dislikes).
For example, the relevant-information metric value of each video in a first-type similar video subset can be obtained, and the video with the largest relevant-information metric value can then be selected as the original video of that subset. The relevant-information metric value of a video can be determined based on the state of each element in the relevant information of the video. For example, if the target users are game enthusiasts, a "video category" whose state is "game" can contribute more to the relevant-information metric value than other states such as "film and television". If the state of "video playback data" is expressed by the play count, the play count can be counted into the relevant-information metric value at a certain ratio. When "user interaction data" is considered, likes can be counted into the relevant-information metric value as positive points at a certain ratio, and dislikes as negative points at a certain ratio, and so on. Those skilled in the art may use any other suitable method to measure or obtain the relevant-information metric value of a video.
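One way to read this metric is as a simple additive score. The sketch below is only an assumption about its shape: the field names, the category bonus of 10.0, and the ratios for plays, likes, and dislikes are all hypothetical values chosen for illustration, not values given in the text.

```python
def relevance_metric(video, preferred_category="game", category_bonus=10.0,
                     play_ratio=0.001, like_ratio=1.0, dislike_ratio=1.0):
    """Hypothetical additive relevant-information metric for one video."""
    score = 0.0
    # A category matching the target users contributes more than any other category.
    if video.get("category") == preferred_category:
        score += category_bonus
    # The play count is counted in at a fixed ratio.
    score += play_ratio * video.get("plays", 0)
    # Likes count as positive points, dislikes as negative points.
    score += like_ratio * video.get("likes", 0)
    score -= dislike_ratio * video.get("dislikes", 0)
    return score


def pick_original(subset):
    """Select the video with the largest metric value as the original."""
    return max(subset, key=relevance_metric)


subset = [
    {"id": "A", "category": "game", "plays": 50000, "likes": 120, "dislikes": 10},
    {"id": "B", "category": "game", "plays": 800, "likes": 5, "dislikes": 1},
    {"id": "C", "category": "film", "plays": 2000, "likes": 30, "dislikes": 40},
]
original = pick_original(subset)
```

Here video A wins because it combines the preferred category with the highest play count and the best like/dislike balance.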
When the number of videos keeps growing, for example when a video service provider such as a video website receives newly uploaded videos every day, the first-type similar video subsets and their original videos may need to be redetermined at regular intervals. If history videos and newly added videos are treated without distinction, the amount of computation becomes very large as videos accumulate over time, and computational efficiency is low. In a possible embodiment, the videos added within the most recent time period can be set as a newly added video subset. Any video in the newly added video subset and any other video among all the videos can be regarded as constituting a video pair, while any two videos that do not belong to the newly added video subset do not constitute a video pair. For each video pair, if the similarity of the video fingerprints of its two videos is greater than the first threshold, the two videos are assigned to the same first-type similar video subset. For example, suppose the history videos are videos A, B, and C, where videos A and B form a first-type similar video subset, and the newly added videos are videos D and E. Video D can form video pairs with videos A, B, C, and E, and video E can form video pairs with videos A, B, C, and D. The video fingerprint similarity of each of these 8 video pairs is evaluated. Suppose the result is that the video fingerprint similarity of video D and video A is greater than the first threshold while that of the other 7 pairs is not. Then video D and video A can be assigned to the same first-type similar video subset, giving the first-type similar video subset A, B, D. The above considers the case where the video fingerprint similarity is non-directional; the directional case is similar. This method does not need to evaluate the video fingerprint similarity between any two history videos, which saves a large amount of computation and greatly improves processing efficiency.
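The pair construction for the incremental case can be sketched as follows. The function name `candidate_pairs` is hypothetical, and pairs are generated as ordered (new video, other video) tuples so that the count matches the 8 pairs in the example above.

```python
def candidate_pairs(history_ids, new_ids):
    """Pair each newly added video with every other video; never pair two history videos."""
    all_ids = list(history_ids) + list(new_ids)
    return [(new, other) for new in new_ids for other in all_ids if other != new]


history = ["A", "B", "C"]
added = ["D", "E"]
pairs = candidate_pairs(history, added)
# D pairs with A, B, C, E and E pairs with A, B, C, D.
```

With n history videos and m newly added videos this yields m * (n + m - 1) ordered pairs, instead of the (n + m) * (n + m - 1) ordered pairs needed when every video is compared with every other video.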
Similarly, the pairs can be filtered further using the similarity of the basic information of the videos. In a possible embodiment, the videos added within the most recent time period can be set as a newly added video subset. Any video in the newly added video subset and any other video among all the videos can be regarded as constituting a video pair, while any two videos that do not belong to the newly added video subset do not constitute a video pair. For each video pair, if the similarity of the video fingerprints of its two videos is greater than the first threshold and the similarity of the basic information of the two videos is greater than the second threshold, the two videos are assigned to the same first-type similar video subset. For the basic information and its similarity, refer to the description above.
In some cases it may be necessary to merge second-type similar video subsets with the above first-type similar video subsets. "First-type" and "second-type" in the present invention are used only to make the description clearer and do not impose any additional limitation. A second-type similar video subset can be a similar video subset accumulated historically, a similar video subset obtained by another method, and so on. In this case, the second-type and first-type similar video subsets that include the same video can be merged into one merged similar video subset, and the original video in each merged similar video subset can then be determined.
Fig. 2 shows a flowchart of a method for updating the original video according to an embodiment of the invention.
Step 201: obtain similar video pairs based on the similarity of video fingerprints. The full set of videos can include the videos newly added on the current day and the history videos. Any newly added video and any other video in the full set constitute a video pair, while no two history videos constitute a video pair. For each video pair, if the similarity of the video fingerprints of its two videos is greater than the first threshold, the pair is regarded as a similar video pair; otherwise it is not.
Step 202: filter the similar video pairs based on the similarity of the basic information of the two videos in each pair. For each similar video pair, if the similarity of the basic information of its two videos is greater than the second threshold, the pair is retained; otherwise the pair is removed from the similar video pairs.
Step 203: obtain first-type similar video subsets. The two videos of each similar video pair that remains after filtering are assigned to the same first-type similar video subset.
Step 204: determine the original video in each first-type similar video subset. The original video in a first-type similar video subset can be determined based on the relevant information of each video in that subset. For example, for each first-type similar video subset, the relevant-information metric value of each of its videos can be obtained (see above), and the video with the largest relevant-information metric value can be selected as the original video of that subset. Information representing each first-type similar video subset obtained and its original video can be stored.
Further, the method for updating the original video can also include:
Step 205: the currently obtained first-type similar video subsets can be merged with the historically accumulated history similar video subsets. A history similar video subset and a first-type similar video subset that include the same video can be merged into one merged similar video subset. For example, if a first-type similar video subset includes videos A, B, and C, one history similar video subset includes videos A and D, and another history similar video subset includes videos B and E, the resulting merged similar video subset can include videos A, B, C, D, and E.
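The merging rule in this step, where any subsets that share a video are combined, can be sketched as follows. The function name `merge_subsets` is hypothetical, and the subsets are assumed to be Python sets of video identifiers.

```python
def merge_subsets(subsets):
    """Merge every group of subsets that share at least one video into a single subset."""
    merged = []
    for subset in subsets:
        current = set(subset)
        untouched = []
        for group in merged:
            if group & current:
                # The group shares a video with the incoming subset: absorb it.
                current |= group
            else:
                untouched.append(group)
        untouched.append(current)
        merged = untouched
    return merged


first_type = [{"A", "B", "C"}]
history_subsets = [{"A", "D"}, {"B", "E"}]
merged = merge_subsets(first_type + history_subsets)
```

Because each incoming subset absorbs every existing group it overlaps, subsets that are only linked indirectly through a third subset still end up in the same merged subset.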
Step 206: the original video in each merged similar video subset can be determined, using a method similar to that used for first-type similar video subsets. The original video in a merged similar video subset can be determined based on the relevant information of each video in that subset. For example, for each merged similar video subset, the relevant-information metric value of each of its videos can be obtained (see above), and the video with the largest relevant-information metric value can be selected as the original video of the merged subset. Because the relevant information of history videos can change over time, the relevant-information metric value of every video in the merged similar video subset (both newly added videos and history videos) can be calculated in this embodiment.
Step 207: the history similar video subsets and their original videos can be updated. The merged similar video subsets and their original videos obtained in step 206 can be taken as the new history similar video subsets and their original videos.
Fig. 3 illustrates the comprehensive similar video information obtained according to the present invention according to an embodiment of the invention and the method flow diagram of similar video information obtained according to additive method.
Step 301, the first kind similar video subset obtained according to the present invention can be merged and the similar video set obtained according to additive method.Several first kind video subsets including same video and several similar videos obtained according to additive method can be merged into the similar video subset after a merging.Such as, certain first kind similar video subset includes video A, B, C, video A, D is included according to a certain similar video subset that additive method obtains, include video B, E according to another similar video subset that additive method obtains, then the similar video subset after its obtained merging can include video A, B, C, D, E.
Step 302, the original video that can determine that in the similar video subset after each merging.The original video in the similar video subset after this merging can be determined based on the relevant information of each video in the similar video subset after each merging.Such as, for the similar video subset after each merging, the relevant information metric (can referring to above) of wherein each video can be obtained, and select the video with maximal correlation Information Meter value as the original video in the similar video subset after this merging.
With the method for determining a video original of the present invention, accurate similar video subsets can be obtained based on the similarity of video fingerprints and an original video can be determined from each of them, which avoids repeatedly recommending similar videos to the user and greatly improves the user experience.
Embodiment 2
Fig. 4 shows a structural block diagram of a device for determining a video original according to one embodiment of the present invention. The device may include a similar video subset acquisition module 41 and an original determination module 43. The similar video subset acquisition module 41 is used to obtain first-kind similar video subsets from a plurality of videos based on the similarity of video fingerprints. The original determination module 43 is used to determine the original video in each first-kind similar video subset.
In this embodiment, by obtaining similar video subsets based on the similarity of video fingerprints and determining the original from each of them, repeated recommendation of similar videos to the user can be effectively avoided.
Embodiment 3
Fig. 5 shows a structural block diagram of a device for determining a video original according to another embodiment of the present invention. This embodiment differs from the previous one in that the video fingerprint of each video may include a plurality of pieces of characteristic information corresponding respectively to a plurality of frames of that video, and the similarity of the video fingerprints of two videos may be determined based on the number of identical pieces of characteristic information in the video fingerprints of the two videos.
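A minimal sketch of such a fingerprint similarity follows. It assumes that a fingerprint is a list of per-frame feature values (for example frame hashes) and that "identical characteristic information" is counted irrespective of frame position; neither assumption is stated in the text, and the function name is hypothetical.

```python
def fingerprint_similarity(fingerprint_a, fingerprint_b):
    """Count the distinct per-frame features that appear in both fingerprints.

    Assumption: features are hashable values and frame order is ignored.
    """
    return len(set(fingerprint_a) & set(fingerprint_b))


# Two hypothetical fingerprints: the second video is the first with a
# short clip (features 900, 901) prepended.
video_1 = [11, 12, 13, 14, 15]
video_2 = [900, 901, 11, 12, 13, 14, 15]
shared_feature_count = fingerprint_similarity(video_1, video_2)
```

Under this position-independent counting, content inserted at the start, middle or end of a video leaves the count of shared features unchanged, which matches the editing scenarios described in the background.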
As shown in Fig. 5, the similar video subset acquisition module 41 may include the following units:
A first acquisition unit 411, used to, where any two videos among the plurality of videos constitute a video pair, for each video pair, place the two videos of the pair into the same first-kind similar video subset if the similarity of their video fingerprints is greater than a first threshold.
A second acquisition unit 412, used to, where any two videos among the plurality of videos constitute a video pair, for each video pair, place the two videos of the pair into the same first-kind similar video subset if the similarity of their video fingerprints is greater than the first threshold and the similarity of their basic information is greater than a second threshold, the basic information including some or all of the following elements: video category, video title, video tags, video quality.
A third acquisition unit 413, used to, where the plurality of videos includes a newly added video subset, any video in the newly added video subset constitutes a video pair with any other video among the plurality of videos, and any two videos among the plurality of videos that do not belong to the newly added video subset do not constitute a video pair, for each video pair, place the two videos of the pair into the same first-kind similar video subset if the similarity of their video fingerprints is greater than the first threshold.
A fourth acquisition unit 414, used to, where the plurality of videos includes a newly added video subset, any video in the newly added video subset constitutes a video pair with any other video among the plurality of videos, and any two videos among the plurality of videos that do not belong to the newly added video subset do not constitute a video pair, for each video pair, place the two videos of the pair into the same first-kind similar video subset if the similarity of their video fingerprints is greater than the first threshold and the similarity of their basic information is greater than the second threshold, the basic information including some or all of the following elements: video category, video title, video tags, video quality.
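The four units differ only in how video pairs are formed (all pairs, or only pairs involving a newly added video) and in whether the basic information is also checked. The following sketch illustrates both choices; the function names, the callable parameters `fp_sim` and `info_sim`, and the use of plain identifiers for videos are all hypothetical.

```python
from itertools import combinations


def candidate_pairs(videos, new_videos=None):
    """Form video pairs.

    Without new_videos, every two videos form a pair (units 411 and 412).
    With new_videos, only pairs containing at least one newly added video
    are formed; pairs of two history videos are skipped (units 413 and 414).
    """
    if new_videos is None:
        return list(combinations(videos, 2))
    new = set(new_videos)
    return [(a, b) for a, b in combinations(videos, 2) if a in new or b in new]


def similar_pairs(pairs, fp_sim, first_threshold,
                  info_sim=None, second_threshold=None):
    """Keep the pairs that satisfy the threshold condition(s).

    fp_sim and info_sim are caller-supplied similarity functions.
    When info_sim is None only the fingerprint condition is applied.
    """
    kept = []
    for a, b in pairs:
        if fp_sim(a, b) <= first_threshold:
            continue
        if info_sim is not None and info_sim(a, b) <= second_threshold:
            continue
        kept.append((a, b))
    return kept


videos = ["v1", "v2", "v3", "v4"]
all_pairs = candidate_pairs(videos)                      # 6 pairs
incremental_pairs = candidate_pairs(videos, ["v4"])      # 3 pairs
```

Restricting pairs to those involving a newly added video avoids recomputing similarities between history videos whose grouping is already known.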
In one possible implementation, the similarity of the basic information of two videos may be determined based on whether each pair of corresponding elements in the basic information of the two videos is identical.
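One way to read this is to count how many corresponding elements match. The sketch below makes that assumption; the dictionary keys and the choice of a simple match count (rather than, say, a weighted score) are illustrative only.

```python
# Hypothetical keys for the four basic-information elements named in the text.
BASIC_INFO_KEYS = ("category", "title", "tags", "quality")


def basic_info_similarity(info_a, info_b):
    """Count the basic-information elements that are present and equal in both."""
    return sum(
        1
        for key in BASIC_INFO_KEYS
        if key in info_a and key in info_b and info_a[key] == info_b[key]
    )


info_1 = {"category": "sports", "title": "Final highlights",
          "tags": ("football",), "quality": "hd"}
info_2 = {"category": "sports", "title": "Final highlights (re-upload)",
          "tags": ("football",), "quality": "sd"}
matching_elements = basic_info_similarity(info_1, info_2)
```

Here the category and tags match while the title and quality differ, so the two videos would pass a second threshold of 1 but not a second threshold of 2.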
In one possible implementation, the device further includes a similar video subset merging module 45. When second-kind similar video subsets are to be merged with first-kind similar video subsets, the similar video subset merging module 45 may be used to merge the second-kind and first-kind similar video subsets that include a same video into one merged similar video subset. The original determination module 43 may further be used to determine the original video in each merged similar video subset.
In one possible implementation, the original video in a similar video subset may be determined based on the relevant information of each video in that subset, and the relevant information may include some or all of the following elements: video category, video title, video tags, video quality, video playback data, user interaction data.
In one possible implementation, the original determination module 43 is further used to obtain the relevant information metric value corresponding to each video in the similar video subset, where the relevant information metric value of a video may be determined based on the state of each element in the relevant information of that video; the video with the largest relevant information metric value may be selected as the original video of the video subset.
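The selection can be sketched as follows. The scoring rule is purely hypothetical: the text only says that the metric value depends on the state of each element of the relevant information, so the field names, quality ranks and weights below are assumptions chosen for illustration.

```python
# Hypothetical quality ranking; the text does not define one.
QUALITY_RANK = {"sd": 0.0, "hd": 1.0, "fhd": 2.0}


def relevant_info_metric(video):
    """Compute an illustrative metric value from a video's relevant information."""
    score = 0.0
    score += 1.0 if video.get("category") else 0.0   # category is filled in
    score += 1.0 if video.get("title") else 0.0      # title is filled in
    score += 1.0 if video.get("tags") else 0.0       # at least one tag
    score += QUALITY_RANK.get(video.get("quality"), 0.0)
    score += video.get("play_count", 0) / 1000.0     # playback data
    score += video.get("comment_count", 0) / 100.0   # user interaction data
    return score


def pick_original(subset):
    """Select the video with the largest metric value as the original."""
    return max(subset, key=relevant_info_metric)


subset = [
    {"id": "A", "title": "clip", "tags": [], "quality": "sd", "play_count": 500},
    {"id": "B", "title": "clip", "tags": ["x"], "quality": "hd", "play_count": 2000},
]
original = pick_original(subset)
```

Because playback and interaction data change over time, the metric values would be recomputed whenever a subset is re-evaluated rather than cached.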
In one possible implementation, obtaining the first-kind similar video subsets from the plurality of videos based on the similarity of video fingerprints may be implemented using an algorithm framework based on Spark and GraphX.
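The text does not say which graph operation is used. One natural reading is that videos are vertices, similar video pairs are edges, and each connected component is a first-kind similar video subset. The sketch below illustrates that reading in plain single-machine Python rather than in Spark; the function name and input format are assumptions.

```python
from collections import defaultdict, deque


def group_similar(pairs):
    """Group videos into subsets by finding connected components.

    pairs: iterable of (video_a, video_b) tuples already judged similar.
    """
    adjacency = defaultdict(set)
    for a, b in pairs:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    subsets = []
    for start in list(adjacency):
        if start in seen:
            continue
        # Breadth-first search collects every video reachable from start.
        component = set()
        queue = deque([start])
        seen.add(start)
        while queue:
            video = queue.popleft()
            component.add(video)
            for neighbour in adjacency[video]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        subsets.append(component)
    return subsets


first_kind_subsets = group_similar([("A", "B"), ("B", "C"), ("D", "E")])
```

Note that A and C end up in the same subset even though they were never directly compared as similar; grouping is transitive under this reading.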
The above is merely a specific embodiment of the present invention, and the protection scope of the present invention is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.