CN105959686B - Video feature extraction method, video matching method and device - Google Patents

Video feature extraction method, video matching method and device

Info

Publication number
CN105959686B
CN105959686B (granted from application CN201610460125.1A)
Authority
CN
China
Prior art keywords
video
picture
binary
sequence
binary feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610460125.1A
Other languages
Chinese (zh)
Other versions
CN105959686A (en)
Inventor
徐敘遠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201610460125.1A priority Critical patent/CN105959686B/en
Publication of CN105959686A publication Critical patent/CN105959686A/en
Application granted granted Critical
Publication of CN105959686B publication Critical patent/CN105959686B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124: Quantisation
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136: Incoming video signal characteristics or properties

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the invention disclose a video feature extraction method, including: performing frame-rate conversion on an input video to obtain video pictures at a preset frame rate; performing grayscale processing on the video pictures and converting the grayscale pictures into a temporally informative representative image to obtain a first picture; applying anti-noise variation to the first picture to obtain multiple second pictures corresponding to the first picture; segmenting each second picture respectively to obtain M corresponding picture regions of equal area, and extracting the color-space features of the M picture regions respectively, where M is an integer greater than 1; and binarizing the extracted color-space features and sorting them according to a preset ordering rule to obtain the binary feature sequence of the video. The invention also discloses a video matching method and related apparatus. The invention effectively solves the technical problem that video features extracted by the prior art have poor noise robustness, and allows copyright detection to be completed quickly.

Description

Video feature extraction method, video matching method and device
Technical field
The present invention relates to the field of computer communication, and in particular to a video feature extraction method, a video matching method, a video feature extraction device and a video matching device.
Background technology
With the development of multimedia technology, and especially as the Internet has become an indispensable part of daily life, video content has grown ever richer, while pirated video has multiplied along with it. In multimedia copyright protection, a copyright examiner needs to quickly and effectively detect possible copies within a large body of video data and determine the ownership of the content, which requires content retrieval on the video, such as fingerprint or feature extraction.
In the prior art, Content-Based Copy Detection (CBCD) is used to protect the copyright of video. Current content-based copyright detection falls mainly into two kinds: video fingerprints (or video features) based on spatial color, and video fingerprints (or video features) based on feature extraction.
A spatial-color video fingerprint is essentially the histogram of a specific region of the picture over some period. But color features change with the format of the video, so color-based video fingerprints lack robustness against noise such as added logos or changes in black borders.
Representative of feature-based video fingerprints is the two-dimensional discrete cosine transform (2D-Discrete Cosine Transform, DCT) fingerprint. After the 2D-DCT, the fingerprint becomes a two-dimensional vector representing the video's feature in some region. Compared with color-space fingerprints, DCT-based feature fingerprints have some robustness against the color-space changes caused by video formats. But the DCT feature fingerprint is still not robust against variations such as video compression, logo addition, black borders and rotation.
Summary of the invention
The technical problem to be solved by the embodiments of the present invention is to provide a video feature extraction method, a video matching method, a video feature extraction device and a video matching device, effectively solving the technical problem that the video features extracted by the prior art have poor noise robustness.
To solve the above technical problem, a first aspect of the embodiments of the present invention discloses a video feature extraction method, including:
performing frame-rate conversion on an input video to obtain video pictures at a preset frame rate;
performing grayscale processing on the video pictures, and converting the grayscale pictures into a temporally informative representative image to obtain a first picture;
applying anti-noise variation to the first picture to obtain multiple second pictures corresponding to the first picture;
segmenting each second picture respectively to obtain M corresponding picture regions of equal area, and extracting the color-space features of the M picture regions respectively, where M is an integer greater than 1;
binarizing the extracted color-space features, and sorting them according to a preset ordering rule to obtain the binary feature sequence of the video.
A second aspect of the embodiments of the present invention discloses a video matching method, including:
extracting the binary feature sequence of a first input video;
matching the binary feature sequence of the first video against the binary feature sequences of second videos stored in a feature database, and outputting the video information corresponding to features whose Hamming distance is below a preset threshold;
wherein the binary feature sequences of the first video and the second videos are feature sequences extracted by the video feature extraction method of the first aspect above.
A third aspect of the embodiments of the present invention discloses a video feature extraction device, including:
a frame-rate conversion module, configured to perform frame-rate conversion on an input video to obtain video pictures at a preset frame rate;
a grayscale processing module, configured to perform grayscale processing on the video pictures;
a time-domain transform module, configured to convert the grayscale pictures into a temporally informative representative image to obtain a first picture;
an anti-noise variation module, configured to apply anti-noise variation to the first picture to obtain multiple second pictures corresponding to the first picture;
a segmentation module, configured to segment each second picture respectively to obtain M corresponding picture regions of equal area;
a feature extraction module, configured to extract the color-space features of the M picture regions respectively, where M is an integer greater than 1;
a binarization module, configured to binarize the extracted color-space features and obtain the binary feature sequence of the video after sorting according to a preset ordering rule.
A fourth aspect of the embodiments of the present invention discloses a video matching device, including:
a feature sequence extraction module, configured to extract the binary feature sequence of a first input video;
a matching output module, configured to match the binary feature sequence of the first video against the binary feature sequences of second videos stored in a feature database, and output the video information corresponding to features whose Hamming distance is below a preset threshold;
wherein the feature sequence extraction module includes the video feature extraction device of the third aspect above; the binary feature sequences of the first video and the second videos are feature sequences extracted by that video feature extraction device.
A fifth aspect of the embodiments of the present invention discloses an electronic device, including a processor, a memory, an input device and an output device, wherein the processor executes all the steps of the video feature extraction method of the first aspect above by executing a video feature extraction program stored in the memory.
A sixth aspect of the embodiments of the present invention discloses an electronic device, including a processor, a memory, an input device and an output device, wherein the processor executes all the steps of the video matching method of the second aspect above by executing a video matching program stored in the memory.
By implementing the embodiments of the present invention, the input video is frame-rate converted; the video pictures are grayscale processed and converted into a temporally informative representative image to obtain a first picture; the first picture undergoes anti-noise variation to obtain multiple corresponding second pictures; each second picture is segmented respectively into M corresponding picture regions of equal area; and the color-space features of the M regions are extracted, binarized, and finally sorted by a preset ordering rule to obtain the binary feature sequence of the video. This effectively solves the technical problem that the video features extracted by prior-art schemes have poor noise robustness. Compared with the prior-art video fingerprint based on spatial color, it has higher noise robustness, handling not only global variations of the video, such as rotation and scaling, but also local variations, such as logos and black borders; compared with the prior-art feature-based video fingerprint, it has efficient processing capability, extracting features rapidly and facilitating fast subsequent video matching, so copyright detection can be completed quickly.
Description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is the flow diagram of video feature extraction method provided in an embodiment of the present invention;
Fig. 2 is the schematic diagram of picture segmentation provided in an embodiment of the present invention;
Fig. 3 is a schematic flowchart of performing anti-noise variation on a picture according to an embodiment of the present invention;
Fig. 4 is the flow diagram of feature extraction provided in an embodiment of the present invention and binary conversion treatment;
Fig. 5 is the flow diagram of video matching method provided in an embodiment of the present invention;
Fig. 6 is the flow diagram of another embodiment of video matching method provided by the invention;
Fig. 7 is a schematic diagram of the geometric meaning of the least squares method provided by the present invention;
Fig. 8 is the structural schematic diagram of video feature extraction device provided in an embodiment of the present invention;
Fig. 9 is the structural schematic diagram of anti-noise variation module provided in an embodiment of the present invention;
Figure 10 is the structural schematic diagram of binary processing module provided in an embodiment of the present invention;
Figure 11 is the structural schematic diagram of another embodiment of binary processing module provided by the invention;
Figure 12 is the structural schematic diagram of another embodiment of video feature extraction device provided by the invention;
Figure 13 is the structural schematic diagram of another embodiment of video feature extraction device provided by the invention;
Figure 14 is the structural schematic diagram of video matching device provided in an embodiment of the present invention;
Figure 15 is the structural schematic diagram of matching output module provided in an embodiment of the present invention;
Figure 16 is the structural schematic diagram of another embodiment of video matching device provided by the invention;
Figure 17 is the structural schematic diagram of another embodiment of video matching device provided by the invention.
Detailed description of embodiments
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.
Referring to Fig. 1, which is a schematic flowchart of a video feature extraction method provided by an embodiment of the present invention, the method includes the following steps:
Step S100: performing frame-rate conversion on the input video to obtain video pictures at a preset frame rate;
Specifically, the input video is frame-rate converted in time to obtain J frames of video pictures per second (the preset frame rate), where J is a positive integer; for example, the preset frame rate of J frames/second can be 5 frames/second. The embodiment of the present invention does not limit the size of J; the user can set J according to their own needs, empirical data, and so on.
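As an illustrative sketch of step S100, frame-rate conversion to a preset rate can be read as picking one source frame per output slot. The uniform index-sampling strategy and the function name below are assumptions for illustration; a real pipeline would typically decode with a video library.

```python
def sample_frame_indices(src_fps: float, n_frames: int, target_fps: float = 5.0) -> list:
    """Pick which source-frame indices to keep so that the output
    plays at target_fps frames per second (the preset rate J)."""
    duration = n_frames / src_fps        # video length in seconds
    n_out = int(duration * target_fps)   # frames kept at the preset rate
    step = src_fps / target_fps          # source frames per kept frame
    return [min(int(k * step), n_frames - 1) for k in range(n_out)]

# A 10-second 25 fps clip sampled down to the example rate of 5 frames/second
kept = sample_frame_indices(25.0, 250, 5.0)
```

With these inputs, every fifth source frame is kept, yielding 50 frames for the 10-second clip.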
Step S102: performing grayscale processing on the video pictures, and converting the grayscale pictures into a temporally informative representative image to obtain a first picture;
Specifically, grayscale processing is performed on the J frames of pictures to obtain grayscale (black-and-white) pictures; the grayscale pictures are then converted into a Temporally Informative Representative Image (TIRI). The conversion of Equation 1 is a per-pixel weighted sum of the form

    TIRI(x, y) = Σᵢ ωᵢ · p(x, y, i)    (Equation 1)

where p(x, y, i) is the pixel value of the i-th image and ωᵢ is a coefficient, which may be a constant or a value computed from another model, such as a line or an exponential curve. After the TIRI conversion, only one TIRI picture per second (the first picture) is obtained, which lays the foundation for the efficiency and noise robustness of the subsequent video fingerprint.
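A minimal sketch of the TIRI conversion, assuming images are represented as nested lists of grayscale values; the default constant weights (and the normalization) are illustrative choices, with an exponential series as one of the alternatives the text mentions.

```python
def tiri(frames, weights=None):
    """Collapse J grayscale frames into one Temporally Informative
    Representative Image: each output pixel is the weighted sum of that
    pixel across the frames. Constant coefficients by default."""
    J = len(frames)
    if weights is None:
        weights = [1.0 / J] * J          # constant coefficients, summing to 1
    h, w = len(frames[0]), len(frames[0][0])
    return [[sum(weights[i] * frames[i][y][x] for i in range(J))
             for x in range(w)] for y in range(h)]

# Two 1x2 frames with equal weights: the TIRI is their per-pixel average
img = tiri([[[0, 100]], [[200, 100]]])
```

Here `img` is `[[100.0, 100.0]]`: a moving pixel (0 then 200) and a static one (100 both times) collapse into the same mid-gray, which is what makes the single picture "temporally informative".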
Step S104: applying anti-noise variation to the first picture to obtain multiple second pictures corresponding to the first picture;
Specifically, to improve the robustness of the color-space fingerprint, the embodiment of the present invention performs anti-noise variation before the extraction of video features (i.e., the video fingerprint), forming multiple noise-resistant samples (the multiple second pictures) to serve as samples for subsequent fingerprint matching.
Step S106: segmenting each second picture respectively to obtain M corresponding picture regions of equal area, and extracting the color-space features of the M picture regions respectively, where M is an integer greater than 1;
Specifically, a second picture can be divided into P*Q regions of equal area, where P*Q equals M, P and Q are positive integers greater than or equal to 1, and P and Q are not both equal to 1. As shown in Fig. 2, a schematic diagram of picture segmentation provided by an embodiment of the present invention, taking P and Q both equal to 4 as an example, a second picture can be divided into 4*4 regions of equal area; the color-space features of the 16 regions are then extracted respectively. It should be understood that step S106 segments and extracts features from each second picture respectively, thus obtaining the color-space features of the M picture regions of each of the multiple second pictures.
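The P*Q segmentation above can be sketched as pure slicing, assuming an image stored as a list of pixel rows whose width divides by P and height by Q (the regions are returned row by row; these conventions are assumptions for illustration).

```python
def split_regions(img, P=4, Q=4):
    """Split an image (list of pixel rows) into P*Q equal-area regions,
    returned row by row; assumes width % P == 0 and height % Q == 0."""
    h, w = len(img), len(img[0])
    rh, rw = h // Q, w // P              # region height and width
    return [[row[px * rw:(px + 1) * rw] for row in img[qy * rh:(qy + 1) * rh]]
            for qy in range(Q) for px in range(P)]

# A 4x4 image split into a 2x2 grid of 2x2 regions
grid = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
regions = split_regions(grid, P=2, Q=2)
```

Each region then feeds the per-region color-space feature extraction of steps S400 onward.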
Step S108: binarizing the extracted color-space features, and sorting them according to a preset ordering rule to obtain the binary feature sequence of the video.
Specifically, the binarization in the embodiment of the present invention may split the features by their mean, or by another split (such as the median), to obtain a binary feature B_{x,y} for each picture region, where x takes the values 0 to P-1 in turn and y takes the values 1 to Q in turn. The preset ordering rule in the embodiment of the present invention may include, but is not limited to, the following orderings:
Taking P and Q equal to 4 as an example, the sorted binary feature sequence of the video can be B_{0,1}, B_{0,2}, B_{0,3}, B_{0,4}, B_{1,1}, B_{1,2}, B_{1,3}, B_{1,4}, B_{2,1}, B_{2,2}, B_{2,3}, B_{2,4}, B_{3,1}, B_{3,2}, B_{3,3}, B_{3,4}. The sorted binary feature sequence of the video can also be B_{0,1}, B_{1,1}, B_{2,1}, B_{3,1}, B_{0,2}, B_{1,2}, B_{2,2}, B_{3,2}, B_{0,3}, B_{1,3}, B_{2,3}, B_{3,3}, B_{0,4}, B_{1,4}, B_{2,4}, B_{3,4}. The reverse of either ordering is also possible.
It should be understood that each second picture corresponds to its own binary feature sequence; the multiple binary feature sequences corresponding to the multiple second pictures together form the binary feature sequence of the video.
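The two example orderings can be sketched as two traversals of the same bit grid; the grid layout B[x][y] and the rule names are assumptions for illustration, and reversing either list gives the inverted orders the text also allows.

```python
def order_bits(B, rule="row"):
    """Serialize the grid of per-region bits B[x][y] into one sequence.
    'row' walks all y for x = 0, then x = 1, ... (the first example
    ordering); 'col' walks all x for each y (the second example)."""
    P, Q = len(B), len(B[0])
    if rule == "row":
        return [B[x][y] for x in range(P) for y in range(Q)]
    return [B[x][y] for y in range(Q) for x in range(P)]

bits = [[1, 0], [1, 1]]   # a 2x2 grid of region bits
```

For this grid the row rule yields 1, 0, 1, 1 and the column rule 1, 1, 0, 1; whichever rule is chosen must be fixed in advance so stored and probe sequences line up.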
By implementing the embodiments of the present invention, the input video is frame-rate converted; the video pictures are grayscale processed and converted into a temporally informative representative image to obtain a first picture; the first picture undergoes anti-noise variation to obtain multiple corresponding second pictures; each second picture is segmented respectively into M corresponding picture regions of equal area; and the color-space features of the M regions are extracted, binarized, and finally sorted by a preset ordering rule to obtain the binary feature sequence of the video. This effectively solves the technical problem that the video features extracted by prior-art schemes have poor noise robustness. Compared with the prior-art video fingerprint based on spatial color, it has higher noise robustness, handling not only global variations of the video, such as rotation and scaling, but also local variations, such as logos and black borders; compared with the prior-art feature-based video fingerprint, it has efficient processing capability, extracting features rapidly and facilitating fast subsequent video matching, so copyright detection can be completed quickly.
Further, with reference to Fig. 3, a schematic flowchart of performing anti-noise variation on a picture according to an embodiment of the present invention, how the present invention performs anti-noise variation on the first picture to obtain the multiple corresponding second pictures is described in detail; it may include the following steps:
Step S300: when a black border is detected in the first picture, deleting the black border from the first picture;
Specifically, black-border detection can first be performed on the first picture. When a black border (i.e., a fixed black region) is detected in the first picture, the black border is deleted from it; if no black border is detected in the first picture, step S302 can be executed directly.
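A minimal sketch of black-border removal, treating a "fixed black region" as leading or trailing rows/columns whose every pixel falls below a darkness threshold; the threshold value of 16 and this definition of "black" are assumptions for illustration.

```python
def strip_black_border(img, dark=16):
    """Crop away solid dark borders: drop leading/trailing rows and
    columns in which every pixel is below `dark`."""
    live_rows = [i for i, row in enumerate(img) if any(p >= dark for p in row)]
    if not live_rows:
        return []                                  # picture is entirely dark
    top, bottom = live_rows[0], live_rows[-1]
    live_cols = [j for j in range(len(img[0]))
                 if any(img[i][j] >= dark for i in range(top, bottom + 1))]
    left, right = live_cols[0], live_cols[-1]
    return [row[left:right + 1] for row in img[top:bottom + 1]]

# A bright 1x2 content area surrounded by a black frame
framed = [[0, 0, 0, 0],
          [0, 200, 90, 0],
          [0, 0, 0, 0]]
```

`strip_black_border(framed)` keeps only the bright interior, so that the later region means are not dragged down by the border.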
Step S302: rotating the first picture multiple times by a preset angle and direction to obtain multiple second pictures, where each rotated picture corresponds to one second picture;
Specifically, the embodiment of the present invention does not limit the preset angle R° or the number of rotations n, nor the direction of rotation. For example, the first picture can be rotated clockwise by 30° each time, 5 times, to obtain 5 second pictures; or rotated by 40° each time, 6 times, to obtain 6 second pictures; and so on. The product R*n can be less than 360.
Step S304: cropping and/or scaling the second pictures respectively to obtain second pictures of a preset width and height.
Specifically, the second pictures in the embodiment of the present invention are rectangular. The embodiment of the present invention does not limit the preset width and height W*H; the user can set W and H according to their own needs, empirical data, and so on, where W and H are positive numbers. If the aspect ratio of a second picture obtained directly from step S302 is the same as the aspect ratio of the preset W*H, the second picture need not be cropped, and its width and height are scaled directly to the preset W*H; otherwise the second picture is cropped and then scaled so that its width and height become the preset W and H. If the width and height of a second picture obtained directly from step S302 already equal the preset W*H, neither cropping nor scaling is needed; that is, the embodiment of the present invention does not necessarily execute step S304.
It should be noted that performing anti-noise variation on a picture in the manner of the Fig. 3 embodiment is preferred because it completes the anti-noise variation quickly and efficiently before video feature extraction, which not only improves the robustness of the color-space fingerprint but also improves the efficiency of the whole video extraction process. However, the embodiment of the present invention is not limited to the manner of the Fig. 3 embodiment; other anti-noise variation embodiments that improve the robustness of the color-space fingerprint (or color-space features) also fall within the protection scope of the video feature extraction solution of the present invention.
Still further, with reference to Fig. 4, a schematic flowchart of feature extraction and binarization provided by an embodiment of the present invention, the embodiments of extracting the color-space features of the M picture regions respectively in step S106 above and of step S108 are described in detail; they may include the following steps:
Step S400: calculating the pixel mean of each of the M picture regions respectively;
Specifically, taking P and Q equal to 4 as an example, a second picture can be divided into 4*4 regions of equal area, i.e., M equals 16, and the pixel means A_{x,y} of the 16 regions are calculated respectively, where x takes the values 0 to P-1 in turn and y takes the values 1 to Q in turn.
Step S402: comparing the pixel means of the M picture regions belonging to the same second picture with the pixel mean of the whole second picture respectively, and converting them into binary features;
Specifically, this embodiment takes the mean split as the example for binarization. The system can precompute the pixel mean C of the whole second picture and then convert the features into binary features; Equation 2 is a threshold of the form

    B_{x,y} = 1 if A_{x,y} ≥ C, otherwise B_{x,y} = 0    (Equation 2)
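A sketch of this thresholding, assuming the region means A_{x,y} are already flattened into a list in the chosen ordering; since the regions have equal area, the mean of the region means equals the whole picture's pixel mean C, which the code exploits.

```python
def binarize_regions(region_means):
    """Per-region bit: 1 when a region's pixel mean A is at least the
    whole picture's mean C, else 0. With equal-area regions, C is the
    mean of the region means."""
    C = sum(region_means) / len(region_means)
    return [1 if A >= C else 0 for A in region_means]

# Four equal-area regions: bright regions map to 1, dark regions to 0
region_bits = binarize_regions([10, 200, 130, 60])
```

Here C is 100, so the result is 0, 1, 1, 0; the comparison direction and the tie-breaking at A = C are assumptions.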
After the conversion, to increase robustness against local dissimilarity, the embodiment of the present invention can divide the feature combinations into two kinds, global binary feature sequences and local binary feature sequences, as detailed in steps S404 to S410.
Step S404: sorting the M obtained binary features according to the preset ordering rule to obtain the global binary feature sequence corresponding to the video;
Specifically, all the obtained binary features (e.g., the 16 binary features in the example of step S400) are sorted according to the preset ordering rule to obtain the global binary feature sequence corresponding to the video. For the specific orderings, refer to the description of step S108 in the Fig. 1 embodiment above, which is not repeated here.
Step S406: extracting K picture regions from the M picture regions belonging to the same second picture, where K is an integer greater than 0 and less than M;
Specifically, as in step S400, taking M equal to 16 as an example, K picture regions can be extracted, where K is an integer greater than 0 and less than 16, such as 15 or 14. The user can choose the size and positions of K according to the actual circumstances of the video; for example, videos usually carry logo patterns in the upper-right or upper-left corner, so the regions other than the upper-right or upper-left corner region can be extracted as the K picture regions.
It should be understood that the embodiment of the present invention obtains the multiple second pictures corresponding to the first picture by anti-noise variation, and the K picture regions extracted each time in step S406 belong to one second picture, so each subsequently obtained local binary feature sequence also corresponds to one second picture. Moreover, by extracting from different positions of the same second picture, different sets of K picture regions can be obtained; for example, with K equal to 15 but different extracted positions, different local binary feature sequences of length 15 are obtained.
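Building the local sequences can be sketched as dropping excluded region indices from the global bit list; representing each extraction position as a set of excluded indices (e.g. the corner region where a logo usually sits) is an assumption for illustration.

```python
def local_sequences(global_bits, excluded_sets):
    """One length-K local sequence per set of excluded region indices;
    each different excluded set is a different extraction position."""
    return [[b for i, b in enumerate(global_bits) if i not in excl]
            for excl in excluded_sets]

# Length-4 global sequence; drop region 0 or region 3 -> two K = 3 sequences
seqs = local_sequences([1, 0, 1, 1], [{0}, {3}])
```

Both resulting sequences have the same length K, so they later land in the same storage layer.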
Step S408: comparing the pixel means of the K picture regions with the pixel mean of the whole second picture respectively, and converting them into binary features;
Specifically, the K binary features can again be calculated by Equation 2.
Step S410: sorting the K obtained binary features according to the preset ordering rule to obtain the local binary feature sequence corresponding to the video;
Specifically, the K obtained binary features (e.g., 15 or 14 binary features) are sorted according to the preset ordering rule to obtain the local binary feature sequence corresponding to the video. For the specific orderings, refer to the description of step S108 in the Fig. 1 embodiment above, which is not repeated here.
Step S412: storing the global binary feature sequences and local binary feature sequences separately in layers, where binary feature sequences of the same length are stored in the same layer.
Specifically, after the global and local binary features are extracted, the features can be stored separately in layers, with binary feature sequences of the same length stored in the same layer; that is, binary feature sequences of different lengths are stored in different layers. Taking M equal to 16 as an example, the length of the global binary feature sequence is 16. If K=15 picture regions are extracted, with multiple different extraction positions, multiple different local binary feature sequences of length 15 can be obtained; these are stored separately in the same layer, while local binary feature sequences of length 14 are stored separately in another layer, thus achieving layered separate storage.
It should be noted that, again taking M equal to 16 as an example, if 15 and 14 picture regions are extracted respectively, and in extracting the 15 picture regions 3 different extraction positions are used, yielding 3 different local binary feature sequences of length 15, while in extracting the 14 picture regions 5 different extraction positions are used, yielding 5 different local binary feature sequences of length 14, then the binary feature sequences can be stored in 3 layers: one layer stores the global binary feature sequence of length 16, another layer stores the 3 different local binary feature sequences of length 15, and another layer stores the 5 different local binary feature sequences of length 14.
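The layered storage reduces to grouping sequences by their length, so a probe sequence is only ever compared against candidates of its own length. A minimal sketch:

```python
def layer_sequences(sequences):
    """Store binary feature sequences in layers keyed by length, e.g.
    a length-16 global layer, a length-15 layer, a length-14 layer."""
    layers = {}
    for seq in sequences:
        layers.setdefault(len(seq), []).append(seq)
    return layers

layers = layer_sequences([[1, 0], [0, 1], [1, 1, 1]])
```

Here the two length-2 sequences share one layer and the length-3 sequence gets its own, mirroring the 16/15/14 example above.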
It should also be noted that the stored video feature information may include the identifier of the video, the time information of the video, and the video features corresponding to each time point. When storing video feature information, the embodiment of the present invention can build an index for storage by means of an inverted index file (Invert Index File, IIF); that is, the video information is stored in the index order of video feature, time, and video identifier. Since subsequent video matching first compares video features, the time of the matched video is then obtained, and finally the corresponding video is determined.
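The inverted-index storage can be sketched as a map from each feature to the places it occurs; representing a stored feature as a hashable key and a record as a (feature, time, video id) triple are assumptions for illustration.

```python
def build_inverted_index(records):
    """records: (feature, time, video_id) triples. The inverted index
    maps each feature to its (video_id, time) occurrences, so a match
    on features recovers first the time, then the video."""
    index = {}
    for feature, t, video_id in records:
        index.setdefault(feature, []).append((video_id, t))
    return index

idx = build_inverted_index([("f1", 0, "vidA"), ("f2", 1, "vidA"), ("f1", 7, "vidB")])
```

Looking up "f1" returns both videos containing that feature together with the time points at which it occurs.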
By implementing the embodiment of the present invention, the input video undergoes frame rate conversion; the video pictures are then converted to grayscale and transformed into a time-domain representative image to obtain a first picture, and the first picture is subjected to anti-noise variation to obtain multiple second pictures corresponding to the first picture. Each second picture is segmented into M corresponding picture regions of equal area, the color space features of the M picture regions are extracted and binarized, and the binary feature sequence of the video is obtained after sorting according to a preset sorting rule. This effectively solves the technical problem that video features extracted by prior-art schemes have poor noise resistance. Compared with prior-art video fingerprints based on spatial color, the scheme has higher noise resistance and can handle not only spatial variations of the video, such as rotation and scaling, but also local variations, such as logos and black borders; compared with prior-art feature-based video fingerprints, it has efficient processing capability, can extract features rapidly, and thus facilitates rapid subsequent video matching and fast copyright detection.
Still further, Fig. 5 is a flow diagram of the video matching method provided by an embodiment of the present invention, which illustrates the technical solution of the video matching method of the present invention in detail and includes the following steps:
Step S500: extract the binary feature sequence of an input first video;
Specifically, the binary feature sequence of the first video is extracted in the manner of any of the embodiments of Fig. 1 to Fig. 4 above, which is not repeated here.
Step S502: match the binary feature sequence of the first video against the binary feature sequences of second videos stored in a feature database, and output the video information corresponding to the features whose Hamming distance is less than a preset threshold;
Specifically, the binary feature sequences of the second videos stored in the feature database are likewise extracted and stored in the manner of any of the embodiments of Fig. 1 to Fig. 4 above, which is not repeated here. Binary feature sequences of the first video are compared with binary feature sequences of the second videos of equal length, and the Hamming distance is computed, so as to output the video information corresponding to the features whose Hamming distance is less than the preset threshold. In the embodiment of the present invention, the video information may include the time-domain match information between time points of the first video and time points of the second video; for example, it may be obtained that the video content (i.e. video picture) at the 5th second of the first video matches the video content at the 10th second of the second video, that the video content at the 8th second of the first video matches the video content at the 23rd second of the second video, and so on.
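The Hamming-distance comparison in step S502 can be sketched as follows; the 4-bit features, time points, and threshold value are illustrative assumptions (the patent leaves the threshold as a preset):

```python
def hamming(a, b):
    """Hamming distance between two equal-length bit-strings."""
    return sum(x != y for x, y in zip(a, b))

def match(query_feats, db_feats, threshold=2):
    """query_feats/db_feats: lists of (time_point, bits).

    Returns the (query time, database time) pairs whose features are of equal
    length and whose Hamming distance falls below the threshold.
    """
    pairs = []
    for tq, fq in query_feats:
        for td, fd in db_feats:
            if len(fq) == len(fd) and hamming(fq, fd) < threshold:
                pairs.append((tq, td))
    return pairs

result = match([(5, "1100"), (8, "0011")],
               [(10, "1101"), (23, "0011"), (40, "1010")])
# -> [(5, 10), (8, 23)]: e.g. the 5th second of the query pairs with the
# 10th second of the database video.
```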
By implementing the embodiment of the present invention, during video feature extraction the input video undergoes frame rate conversion; the video pictures are converted to grayscale and transformed into a time-domain representative image to obtain a first picture, which is then subjected to anti-noise variation to obtain multiple corresponding second pictures; each second picture is segmented into M corresponding picture regions of equal area, whose color space features are extracted and binarized, and the binary feature sequence of the video is obtained after sorting according to a preset sorting rule. This effectively solves the technical problem that prior-art video features have poor noise resistance: compared with prior-art video fingerprints based on spatial color, the scheme has higher noise resistance and can handle not only spatial variations of the video, such as rotation and scaling, but also local variations, such as logos and black borders; compared with prior-art feature-based video fingerprints, it has efficient processing capability and can extract features rapidly, so that fast video matching can be achieved and copyright detection completed quickly.
Further, when the feature database stores the global binary feature sequences and local binary feature sequences of the second videos in layers, with binary feature sequences of the same length stored in the same layer, step S502 may specifically include: for each binary feature sequence of the first video, finding the corresponding layer in the feature database, wherein the length of the binary feature sequences of the second videos stored in that layer is the same as the length of the binary feature sequence of the first video; and matching each binary feature sequence of the first video against the binary feature sequences of the second videos in the corresponding layer.
Specifically, when matching a binary feature sequence of the first video whose length is, for example, 15, the corresponding layer is found in the feature database, in which the stored binary feature sequences of the second videos are all of length 15; the binary feature sequence of the first video is then matched against each binary feature sequence of the second videos in that layer, and the matching result is output. Likewise, when matching a binary feature sequence of the first video whose length is 14, the corresponding layer, storing binary feature sequences of length 14, is found, the matching is performed within it, and the matching result is output. Since the local binary feature sequences stored in the feature database are generated from picture regions extracted from the whole original video picture, and the positions of the extracted picture regions can vary, the local binary features effectively solve the technical problem that prior-art video features have poor noise resistance: they can handle not only spatial variations of the video, such as rotation and scaling, but also local variations, such as logos and black borders.
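The per-layer lookup and comparison described above can be sketched minimally; the layer dictionary shape and the distance bound are illustrative assumptions:

```python
def match_in_layers(query_seq, feature_layers, max_dist=2):
    """Look up only the layer whose stored sequences have the query's length,
    then compare the query against every sequence in that layer.

    feature_layers: {length: [bit_string, ...]}.
    Returns the stored sequences within max_dist Hamming distance.
    """
    layer = feature_layers.get(len(query_seq), [])
    dist = lambda a, b: sum(x != y for x, y in zip(a, b))
    return [s for s in layer if dist(query_seq, s) <= max_dist]

feature_layers = {3: ["101", "010"], 4: ["1011"]}
print(match_in_layers("100", feature_layers, max_dist=1))  # -> ['101']
```

Because sequences of other lengths live in other layers, a length-15 query never scans the length-14 or length-16 layers.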
Still further, Fig. 6 is a flow diagram of another embodiment of the video matching method provided by the present invention, which includes the following steps:
Step S600: extract the binary feature sequence of an input first video;
Step S602: match the binary feature sequence of the first video against the binary feature sequences of second videos stored in a feature database, and output the video information corresponding to the features whose Hamming distance is less than a preset threshold;
Specifically, steps S600 and S602 may correspond to the description of steps S500 and S502 above, which is not repeated here.
Step S604: according to the time-domain match information, perform time-domain matching analysis through a preset time point mapping algorithm, to obtain the matching relationship between a first duration in the first video and a second duration in the second video.
Specifically, the time point mapping algorithm in the embodiment of the present invention may include, but is not limited to, a time-domain scaling and/or offset algorithm, a time-domain curve algorithm, and the like. Preferably, the present invention uses a linear time-domain scaling and/or offset algorithm: for example, multiple matched pairs of a time point x of the first video and a time point y of the second video may be substituted into the formula ax + by = c, a, b and c are computed by the least squares method, and the matching relationship between the first duration in the first video and the second duration in the second video is then calculated from a, b and c.
More specifically, the embodiment of the present invention is illustrated by taking the preset time point mapping algorithm to be the following Formula 3:
ax + by = c (Formula 3)
where x is a video time point of the input first video, and y is a video time point of a second video (i.e. an original video) stored in the feature database. The time-domain match information output in step S602 must therefore include at least 2 or more match pairs so that the unknowns in Formula 3 can be solved. For example, if the output time-domain match information includes: the video content at the 5th second of the first video matching the 10th second of the second video, the 8th second of the first video matching the 23rd second of the second video, and the 10th second of the first video matching the 26th second of the second video, then the following system of equations can be formed:
5a + 10b = c
8a + 23b = c
10a + 26b = c
The embodiment of the present invention may solve such a system by the least squares method. When more than 2 match pairs exist, the formula ax + by = c can be written in the least-squares matrix form:
Xβ = y,
where X is the m×n coefficient matrix assembled from the matched time points, β is the vector of unknowns, m is the number of match pairs obtained in step S602, n is the number of unknowns (i.e. 3), and m is greater than n. Such an overdetermined system generally has no exact solution, so the most suitable β is chosen to make the equations hold "as well as possible" by introducing the residual sum of squares function S:
S(β) = ||Xβ − y||²
When β = β̂, S(β) attains its minimum, denoted:
β̂ = argmin S(β)
Differentiating S(β) to find the minimum yields the normal equations:
XᵀXβ = Xᵀy
If the matrix XᵀX is nonsingular, β has the unique solution:
β̂ = (XᵀX)⁻¹Xᵀy
Using the least squares method, the embodiment of the present invention can easily solve for the unknowns β (and thus obtain a, b and c in the formula) such that the sum of squared errors between the fitted values and the real data is minimized. Fig. 7 is a schematic diagram of the geometric meaning of the least squares method provided by the present invention: a straight line is fitted by minimizing the sum of squared errors, so as to obtain the matching relationship between the first duration in the first video and the second duration in the second video. For example, the 1st to 3rd seconds of the first video match the 2nd to 6th seconds of the second video, which is equivalent to this segment of the second video being played linearly slowed to half the speed of the first video. The time-domain deformation of the video can thus be estimated, and the time periods corresponding to the same video content obtained.
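The linear fit above can be sketched in closed form. Assuming b ≠ 0, Formula 3 is equivalent to the line y = px + q with p = −a/b and q = c/b, so a plain least-squares line fit over the matched time points suffices; the function name and sample points are illustrative:

```python
def fit_time_mapping(matches):
    """Least-squares fit of y = p*x + q over matched time points,
    x from the first video and y from the second.

    Closed-form solution of the normal equations for a single line.
    """
    n = len(matches)
    sx = sum(x for x, _ in matches)
    sy = sum(y for _, y in matches)
    sxx = sum(x * x for x, _ in matches)
    sxy = sum(x * y for x, y in matches)
    p = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    q = (sy - p * sx) / n
    return p, q

# Seconds 1-3 of the first video matching seconds 2-6 of the second:
p, q = fit_time_mapping([(1, 2), (2, 4), (3, 6)])
# p = 2.0, q = 0.0: the second video stretches this segment to twice the
# duration, i.e. plays it at half speed.
```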
By implementing the embodiment of the present invention, during video feature extraction the input video undergoes frame rate conversion; the video pictures are converted to grayscale and transformed into a time-domain representative image to obtain a first picture, which is then subjected to anti-noise variation to obtain multiple corresponding second pictures; each second picture is segmented into M corresponding picture regions of equal area, whose color space features are extracted and binarized, and the binary feature sequence of the video is obtained after sorting according to a preset sorting rule. This effectively solves the technical problem that prior-art video features have poor noise resistance: compared with prior-art video fingerprints based on spatial color, the scheme has higher noise resistance and can handle not only spatial variations of the video, such as rotation and scaling, but also local variations, such as logos and black borders; compared with prior-art feature-based video fingerprints, it has efficient processing capability and can extract features rapidly, so that fast video matching can be achieved. Moreover, time-domain matching analysis is performed during video matching, so that copyright detection can be completed quickly, the time periods corresponding to the matched videos obtained, and the user's need for preliminary identification of video copyright met.
To facilitate better implementation of the above solutions of the embodiments of the present invention, the present invention also correspondingly provides a video feature extraction device. Fig. 8 is a structural schematic diagram of the video feature extraction device provided by an embodiment of the present invention. Video feature extraction device 80 may include: frame rate conversion module 800, gradation processing module 802, time domain transforming module 804, anti-noise variation module 806, segmentation module 808, characteristic extracting module 8010 and binary processing module 8012, wherein
frame rate conversion module 800 is configured to perform frame rate conversion on the input video to obtain video pictures at a preset frame rate;
gradation processing module 802 is configured to convert the video pictures to grayscale;
time domain transforming module 804 is configured to transform the grayscale-processed pictures into a time-domain representative image to obtain a first picture;
anti-noise variation module 806 is configured to apply anti-noise variation to the first picture to obtain multiple second pictures corresponding to the first picture;
segmentation module 808 is configured to segment each second picture to obtain M corresponding picture regions of equal area;
characteristic extracting module 8010 is configured to extract the color space features of the M picture regions respectively, M being an integer greater than 1;
binary processing module 8012 is configured to binarize the extracted color space features and obtain the binary feature sequence of the video after sorting according to a preset sorting rule.
Specifically, Fig. 9 is a structural schematic diagram of the anti-noise variation module provided by an embodiment of the present invention. Anti-noise variation module 806 may include deleting unit 8060 and rotary unit 8062, wherein
deleting unit 8060 is configured to delete the black border in the first picture when it is detected that the first picture has a black border;
rotary unit 8062 is configured to rotate the first picture multiple times at a preset angle and direction to obtain multiple second pictures, wherein each rotated picture corresponds to one second picture.
Anti-noise variation module 806 may also include cutting and scaling unit 8064, configured to, after rotary unit 8062 rotates the first picture multiple times at a preset angle and direction to obtain multiple second pictures, and before segmentation module 808 segments each second picture, cut and/or scale each second picture to obtain second pictures of a preset aspect ratio.
Further, characteristic extracting module 8010 in the embodiment of the present invention may specifically be configured to calculate the pixel average of each of the M picture regions. Fig. 10 is a structural schematic diagram of the binary processing module provided by an embodiment of the present invention. Binary processing module 8012 may include first comparison unit 80120 and first sequencing unit 80122, wherein
first comparison unit 80120 is configured to compare the pixel average of each of the M picture regions belonging to the same second picture against the pixel average of the entire second picture, converting the comparisons into binary features;
first sequencing unit 80122 is configured to sort the M obtained binary features according to the preset sorting rule to obtain the global binary feature sequence corresponding to the video.
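The comparison performed by first comparison unit 80120 can be sketched as follows; the "1 if the region average is at least the picture average, else 0" convention is an assumption, as the patent only specifies a comparison:

```python
def global_binary_sequence(region_means, picture_mean):
    """Binarize each of the M region pixel averages against the pixel
    average of the entire second picture, in the regions' given order."""
    return "".join("1" if m >= picture_mean else "0" for m in region_means)

# 4 regions of a hypothetical 2x2 split, whole-picture average 100:
print(global_binary_sequence([120, 90, 100, 40], 100))  # -> "1010"
```

Sorting the M bits by a preset rule (here simply the region order) yields the global binary feature sequence.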
Still further, Fig. 11 is a structural schematic diagram of another embodiment of the binary processing module provided by the present invention. In addition to first comparison unit 80120 and first sequencing unit 80122, binary processing module 8012 may also include: area extracting unit 80124, second comparison unit 80126 and second sequencing unit 80128, wherein
area extracting unit 80124 is configured to extract K picture regions from the M picture regions belonging to the same second picture, K being an integer greater than 0 and less than M;
second comparison unit 80126 is configured to compare the pixel average of each of the K picture regions against the pixel average of the entire second picture, converting the comparisons into binary features;
second sequencing unit 80128 is configured to sort the K obtained binary features according to the preset sorting rule to obtain a local binary feature sequence corresponding to the video.
Specifically, area extracting unit 80124 of the embodiment of the present invention may execute the step of extracting K picture regions from the M picture regions belonging to the same second picture multiple times, the K values in each execution being the same or different;
wherein, in executions with the same K value, the positions of the picture regions extracted from the M picture regions belonging to the same second picture are different.
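The repeated K-region extraction with varying positions can be sketched as below. Random sampling of region positions is one possible way to vary them and is an assumption here; the patent only requires that the positions differ between draws of the same K:

```python
import random

def local_binary_sequences(region_means, picture_mean, k, repeats, seed=0):
    """Draw `repeats` different K-region subsets of the M regions and
    binarize each subset against the whole-picture pixel average,
    yielding several length-K local binary feature sequences."""
    rng = random.Random(seed)               # fixed seed for reproducibility
    m = len(region_means)
    out = []
    for _ in range(repeats):
        picks = sorted(rng.sample(range(m), k))   # region positions, in order
        out.append("".join("1" if region_means[i] >= picture_mean else "0"
                           for i in picks))
    return out

seqs = local_binary_sequences([120, 90, 100, 40, 130, 60], 100, k=4, repeats=3)
# three local sequences, each of length 4
```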
Still further, Fig. 12 is a structural schematic diagram of another embodiment of the video feature extraction device provided by the present invention. In addition to frame rate conversion module 800, gradation processing module 802, time domain transforming module 804, anti-noise variation module 806, segmentation module 808, characteristic extracting module 8010 and binary processing module 8012, video feature extraction device 80 may also include memory module 8014, configured to, after binary processing module 8012 obtains the binary feature sequence of the video by sorting according to the preset sorting rule, store the global binary feature sequence and the local binary feature sequences separately in layers, wherein binary feature sequences of the same length are stored in the same layer.
Please refer to Fig. 13, which is a structural schematic diagram of another embodiment of the video feature extraction device provided by the present invention. As shown in Fig. 13, video feature extraction device 130 may include: at least one processor 1301, such as a CPU, at least one network interface 1304, user interface 1303, memory 1305, at least one communication bus 1302 and display screen 1306. Communication bus 1302 is used to realize the connection and communication between these components. User interface 1303 may include a touch screen, keyboard or mouse, etc. Network interface 1304 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface). Memory 1305 may be a high-speed RAM memory or a non-volatile memory, such as at least one magnetic disk storage; memory 1305 in the embodiment of the present invention includes flash. Memory 1305 may optionally also be at least one storage system located remotely from the aforementioned processor 1301. As shown in Fig. 13, memory 1305, as a computer storage medium, may include an operating system, a network communication module, a user interface module and a video feature extraction program; display screen 1306 is regarded as an output device, user interface 1303 is regarded as an input device, and network interface 1304 may be regarded as either an input device or an output device.
In video feature extraction device 130 shown in Fig. 13, processor 1301 may be used to call the video feature extraction program stored in memory 1305 and execute the following operations:
performing frame rate conversion on the video input from the input device to obtain video pictures at a preset frame rate;
converting the video pictures to grayscale, and transforming the grayscale-processed pictures into a time-domain representative image to obtain a first picture;
applying anti-noise variation to the first picture to obtain multiple second pictures corresponding to the first picture;
segmenting each second picture to obtain M corresponding picture regions of equal area, and extracting the color space features of the M picture regions respectively, M being an integer greater than 1;
binarizing the extracted color space features, and obtaining the binary feature sequence of the video after sorting according to a preset sorting rule.
Specifically, processor 1301 applying anti-noise variation to the first picture to obtain multiple second pictures corresponding to the first picture may include:
when it is detected that the first picture has a black border, deleting the black border in the first picture;
rotating the first picture multiple times at a preset angle and direction to obtain multiple second pictures, wherein each rotated picture corresponds to one second picture.
Specifically, after processor 1301 rotates the first picture multiple times at a preset angle and direction to obtain multiple second pictures, and before segmenting each second picture, the following may also be executed:
cutting and/or scaling each second picture to obtain second pictures of a preset aspect ratio.
Specifically, processor 1301 extracting the color space features of the M picture regions respectively includes: calculating the pixel average of each of the M picture regions;
processor 1301 binarizing the extracted color space features and obtaining the binary feature sequence of the video after sorting according to the preset sorting rule may include:
comparing the pixel average of each of the M picture regions belonging to the same second picture against the pixel average of the entire second picture, converting the comparisons into binary features;
sorting the M obtained binary features according to the preset sorting rule to obtain the global binary feature sequence corresponding to the video.
Specifically, when processor 1301 binarizes the extracted color space features and obtains the binary feature sequence of the video after sorting according to the preset sorting rule, the following may also be executed:
extracting K picture regions from the M picture regions belonging to the same second picture, K being an integer greater than 0 and less than M;
comparing the pixel average of each of the K picture regions against the pixel average of the entire second picture, converting the comparisons into binary features;
sorting the K obtained binary features according to the preset sorting rule to obtain a local binary feature sequence corresponding to the video.
Specifically, processor 1301 may execute the step of extracting K picture regions from the M picture regions belonging to the same second picture multiple times, the K values in each execution being the same or different;
wherein, in executions with the same K value, the positions of the picture regions extracted from the M picture regions belonging to the same second picture are different.
Specifically, after processor 1301 obtains the binary feature sequence of the video by sorting according to the preset sorting rule, the following may also be executed:
storing the global binary feature sequence and the local binary feature sequences separately in layers, wherein binary feature sequences of the same length are stored in the same layer.
It should be noted that video feature extraction device 80 or video feature extraction device 130 in the embodiment of the present invention may be an electronic device such as a personal computer, a mobile intelligent terminal or a tablet computer. It can be understood that the function of each module in video feature extraction device 80 or video feature extraction device 130 may correspond to the specific implementation of any of the embodiments of Fig. 1 to Fig. 4 in the above method embodiments, which is not repeated here.
Still further, the present invention also correspondingly provides a video matching device. Fig. 14 is a structural schematic diagram of the video matching device provided by an embodiment of the present invention. Video matching device 140 may include characteristic sequence extraction module 1400 and matching output module 1402, wherein
characteristic sequence extraction module 1400 is configured to extract the binary feature sequence of an input first video;
matching output module 1402 is configured to match the binary feature sequence of the first video against the binary feature sequences of second videos stored in a feature database, and output the video information corresponding to the features whose Hamming distance is less than a preset threshold;
wherein characteristic sequence extraction module 1400 may include video feature extraction device 80 of any of the embodiments of Fig. 7 to Fig. 12; the binary feature sequence of the first video and the binary feature sequences of the second videos are feature sequences extracted by video feature extraction device 80.
Specifically, when the feature database stores the global binary feature sequences and local binary feature sequences of the second videos in layers, with binary feature sequences of the same length stored in the same layer, and as shown in Fig. 15, a structural schematic diagram of the matching output module provided by an embodiment of the present invention, matching output module 1402 may include searching unit 14020 and sequences match unit 14022, wherein
searching unit 14020 is configured to, for each binary feature sequence of the first video, find the corresponding layer in the feature database, wherein the length of the binary feature sequences of the second videos stored in that layer is the same as the length of the binary feature sequence of the first video;
sequences match unit 14022 is configured to match each binary feature sequence of the first video against the binary feature sequences of the second videos in the corresponding layer.
Still further, the video information in the embodiment of the present invention may include the time-domain match information between time points of the first video and time points of the second video. Fig. 16 is a structural schematic diagram of another embodiment of the video matching device provided by the present invention. In addition to characteristic sequence extraction module 1400 and matching output module 1402, video matching device 140 may also include time domain matching analysis module 1404, configured to, after matching output module 1402 outputs the video information corresponding to the features whose Hamming distance is less than the preset threshold, perform time-domain matching analysis through the preset time point mapping algorithm according to the time-domain match information, to obtain the matching relationship between the first duration in the first video and the second duration in the second video.
Please refer to Fig. 17, which is a structural schematic diagram of another embodiment of the video matching device provided by the present invention. As shown in Fig. 17, video matching device 170 may include: at least one processor 1701, such as a CPU, at least one network interface 1704, user interface 1703, memory 1705, at least one communication bus 1702 and display screen 1706. Communication bus 1702 is used to realize the connection and communication between these components. User interface 1703 may include a touch screen, keyboard or mouse, etc. Network interface 1704 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface). Memory 1705 may be a high-speed RAM memory or a non-volatile memory, such as at least one magnetic disk storage; memory 1705 in the embodiment of the present invention includes flash. Memory 1705 may optionally also be at least one storage system located remotely from the aforementioned processor 1701. As shown in Fig. 17, memory 1705, as a computer storage medium, may include an operating system, a network communication module, a user interface module and a video matching program; display screen 1706 is regarded as an output device, user interface 1703 is regarded as an input device, and network interface 1704 may be regarded as either an input device or an output device.
In video matching device 170 shown in Fig. 17, processor 1701 may be used to call the video matching program stored in memory 1705 and execute the following operations:
extracting the binary feature sequence of a first video input from the input device;
matching the binary feature sequence of the first video against the binary feature sequences of second videos stored in a feature database, and outputting the video information corresponding to the features whose Hamming distance is less than a preset threshold;
wherein the binary feature sequence of the first video and the binary feature sequences of the second videos are feature sequences extracted by the video feature extraction method of any of the embodiments of Fig. 1 to Fig. 4 above.
Specifically, when the feature database stores the global binary feature sequences and local binary feature sequences of the second videos in layers, with binary feature sequences of the same length stored in the same layer, processor 1701 matching the binary feature sequence of the first video against the binary feature sequences of the second videos stored in the feature database may include:
for each binary feature sequence of the first video, finding the corresponding layer in the feature database, wherein the length of the binary feature sequences of the second videos stored in that layer is the same as the length of the binary feature sequence of the first video;
matching each binary feature sequence of the first video against the binary feature sequences of the second videos in the corresponding layer.
Specifically, the video information includes the time-domain match information between time points of the first video and time points of the second video; after processor 1701 outputs the video information corresponding to the features whose Hamming distance is less than the preset threshold, the following may also be executed:
according to the time-domain match information, performing time-domain matching analysis through the preset time point mapping algorithm, to obtain the matching relationship between the first duration in the first video and the second duration in the second video.
It should be noted that video matching device 140 or video matching device 170 in the embodiment of the present invention may be an electronic device such as a personal computer, a mobile intelligent terminal or a tablet computer. It can be understood that the function of each module in video matching device 140 or video matching device 170 may correspond to the specific implementation of any of the embodiments of Fig. 5 to Fig. 7 in the above method embodiments, which is not repeated here.
In conclusion implementing the embodiment of the present invention, carried out by the video that will be inputted during extracting video features Then video pictures are carried out gray proces by frame rate conversion, and carry out time-domain information representative image is converted to the first picture, Then the first picture is subjected to anti-noise variation, obtains multiple corresponding second pictures of the first picture;And respectively to every second figure Piece is split, and obtains corresponding M picture region with equal area size, and extract the M picture respectively The color space characteristic in region simultaneously carries out binary conversion treatment, and video is obtained after being finally ranked up according to preset ordering rule Binary feature sequence efficiently solves the bad technical problem of the video features anti-noise ability of prior art extraction, relatively In the video finger print based on spatial color of the prior art, there is higher anti-noise ability, the change of sdi video can not only be handled Change, such as rotates, scaling etc., and localized variation, such as trade mark, black surround can be handled well etc.;Middle base compared with the existing technology In the video finger print of feature, there is efficient processing capacity, can rapidly carry out feature extraction, therefore can realize and quickly regard Frequency matches, and time domain the matching analysis is carried out during video matching, can quickly complete copyright detection, obtain matched regard Frequently the corresponding period meets the demand that user carries out the copyright of video preliminary identification.
Those of ordinary skill in the art will appreciate that all or part of the flows in the above method embodiments may be implemented by a computer program instructing relevant hardware. The program may be stored in a computer-readable storage medium, and when executed, may include the flows of the above method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The above disclosure describes only preferred embodiments of the present invention, which certainly cannot limit the scope of the claims of the present invention. Equivalent changes made in accordance with the claims of the present invention therefore still fall within the scope of the present invention.

Claims (22)

1. A video feature extraction method, characterized by comprising:
performing frame rate conversion on an input video to obtain video pictures at a preset frame rate;
performing gray-scale processing on the video pictures, and converting the pictures after gray-scale processing into a time-domain information representative image to obtain a first picture;
performing anti-noise variation on the first picture to obtain a plurality of second pictures corresponding to the first picture;
segmenting each second picture to obtain M corresponding picture regions of equal area, and extracting color space features of the M picture regions respectively, M being an integer greater than 1;
binarizing the extracted color space features, and sorting the results according to a preset sorting rule to obtain a binary feature sequence of the video;
wherein performing anti-noise variation on the first picture to obtain the plurality of second pictures corresponding to the first picture comprises:
when a black border is detected in the first picture, deleting the black border from the first picture.
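The claim does not say how the black border is detected. A minimal sketch, assuming detection by the mean intensity of edge rows and columns with an illustrative threshold, could be:

```python
# Black-border deletion sketch: trim edge rows/columns whose mean intensity
# falls below a small threshold. The threshold value is an illustrative
# assumption; the patent does not specify how the border is detected.
import numpy as np

def remove_black_border(gray, thresh=8.0):
    rows = gray.mean(axis=1) > thresh   # True for rows with visible content
    cols = gray.mean(axis=0) > thresh
    if not rows.any() or not cols.any():
        return gray                     # all black: nothing to keep
    r = np.where(rows)[0]
    c = np.where(cols)[0]
    return gray[r[0]:r[-1] + 1, c[0]:c[-1] + 1]

# A 10x10 frame with a 2-pixel black border around a 6x6 bright center:
frame = np.zeros((10, 10))
frame[2:8, 2:8] = 200.0
print(remove_black_border(frame).shape)  # -> (6, 6)
```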
2. The method according to claim 1, characterized in that performing anti-noise variation on the first picture to obtain the plurality of second pictures corresponding to the first picture comprises:
rotating the first picture multiple times by a predetermined angle and direction to obtain a plurality of second pictures, wherein each rotated picture corresponds to one second picture.
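A sketch of this rotation step, with the specific angles chosen for illustration (the claim only says "predetermined angle and direction"); a nearest-neighbour inverse mapping is written out here for self-containment, where a real implementation would typically use an image library's rotate function:

```python
# Rotation-based anti-noise variants sketch: rotate the first picture by each
# predetermined angle, producing one second picture per rotation.
import numpy as np

def rotate(gray, degrees):
    """Nearest-neighbour rotation about the image center; out-of-bounds stays 0."""
    h, w = gray.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = np.deg2rad(degrees)
    out = np.zeros_like(gray)
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse mapping: for each output pixel, sample the source pixel.
    sx = np.cos(theta) * (xs - cx) + np.sin(theta) * (ys - cy) + cx
    sy = -np.sin(theta) * (xs - cx) + np.cos(theta) * (ys - cy) + cy
    sx, sy = np.rint(sx).astype(int), np.rint(sy).astype(int)
    ok = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out[ys[ok], xs[ok]] = gray[sy[ok], sx[ok]]
    return out

def anti_noise_variants(gray, angles=(-5, 5)):  # illustrative predetermined angles
    return [rotate(gray, a) for a in angles]

frame = np.arange(64.0).reshape(8, 8)
variants = anti_noise_variants(frame)
print(len(variants))  # -> 2
```

Extracting the fingerprint from each rotated variant is what lets matching tolerate small rotations in a copied video.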
3. The method according to claim 2, characterized in that after rotating the first picture multiple times by a predetermined angle and direction to obtain the plurality of second pictures, and before segmenting each second picture, the method further comprises:
cutting and/or scaling the second pictures respectively to obtain second pictures with a preset aspect ratio.
4. The method according to claim 1, characterized in that extracting the color space features of the M picture regions respectively comprises: calculating pixel average values of the M picture regions respectively;
binarizing the extracted color space features, and sorting the results according to the preset sorting rule to obtain the binary feature sequence of the video, comprises:
comparing the pixel average value of each of the M picture regions belonging to the same second picture with the pixel average value of the whole second picture, and converting the results into binary features;
sorting the M obtained binary features according to the preset sorting rule to obtain a global binary feature sequence corresponding to the video.
5. The method according to claim 4, characterized in that binarizing the extracted color space features, and sorting the results according to the preset sorting rule to obtain the binary feature sequence of the video, further comprises:
extracting K picture regions from the M picture regions belonging to the same second picture, K being an integer greater than 0 and less than M;
comparing the pixel average values of the K picture regions respectively with the pixel average value of the whole second picture, and converting the results into binary features;
sorting the K obtained binary features according to the preset sorting rule to obtain a local binary feature sequence corresponding to the video.
6. The method according to claim 5, characterized in that the step of extracting K picture regions from the M picture regions belonging to the same second picture is executed multiple times, the K values in the respective executions being the same or different;
wherein, in executions with the same K value, the positions of the picture regions extracted from the M picture regions belonging to the same second picture are different.
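The local feature of claims 5-6 can be sketched as repeated selections over the already-computed M region means. The concrete selections below are illustrative; the claims only require each selection of K regions to use different positions when K repeats:

```python
# Local binary feature sketch: from the M region means of one second picture,
# take several K-region selections (different positions per execution, K values
# may repeat or differ) and binarize each selection against the whole picture's
# mean. The selections themselves are illustrative, not from the patent.
import numpy as np

def local_binary_sequences(region_means, whole_mean, selections):
    seqs = []
    for idx in selections:  # one execution of the extraction step per selection
        seqs.append(tuple(1 if region_means[i] > whole_mean else 0 for i in idx))
    return seqs

means = np.array([10.0, 200.0, 30.0, 250.0, 40.0, 220.0, 20.0, 240.0])
selections = [(0, 1, 2, 3), (4, 5, 6, 7), (1, 3, 5)]  # two K=4 draws, one K=3
print(local_binary_sequences(means, float(means.mean()), selections))
# -> [(0, 1, 0, 1), (0, 1, 0, 1), (1, 1, 1)]
```

Because each selection only covers part of the picture, these local sequences remain stable when a localized change (a logo, a caption) corrupts regions outside the selection.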
7. The method according to claim 6, characterized in that after obtaining the binary feature sequence of the video by sorting according to the preset sorting rule, the method further comprises:
storing the global binary feature sequence and the local binary feature sequences separately in layers, wherein binary feature sequences of the same length are stored in the same layer.
8. A video matching method, characterized by comprising:
extracting a binary feature sequence of an input first video;
matching the binary feature sequence of the first video against binary feature sequences of a second video stored in a feature database, and outputting video information corresponding to features whose Hamming distance is less than a predetermined threshold;
wherein the binary feature sequence of the first video and the binary feature sequences of the second video are feature sequences extracted by the video feature extraction method according to any one of claims 1-7.
9. The method according to claim 8, characterized in that, when the feature database stores the global binary feature sequences and local binary feature sequences of the second video in layers, with binary feature sequences of the same length stored in the same layer, matching the binary feature sequence of the first video against the binary feature sequences of the second video stored in the feature database comprises:
for each binary feature sequence of the first video, finding the corresponding layer in the feature database, wherein the binary feature sequences of the second video stored in that layer have the same length as the binary feature sequence of the first video;
matching each binary feature sequence of the first video against the binary feature sequences of the second video in the corresponding layer respectively.
10. The method according to claim 8, characterized in that the video information comprises time-domain match information between time points of the first video and time points of the second video; and after outputting the video information corresponding to the features whose Hamming distance is less than the predetermined threshold, the method further comprises:
according to the time-domain match information, performing time-domain matching analysis using a preset time-point mapping algorithm, to obtain a matching relationship between a first duration in the first video and a second duration in the second video.
11. The method according to claim 10, characterized in that the time-point mapping algorithm comprises a time-domain scaling and/or offset algorithm.
12. A video feature extraction device, characterized by comprising:
a frame rate conversion module, configured to perform frame rate conversion on an input video to obtain video pictures at a preset frame rate;
a gray-scale processing module, configured to perform gray-scale processing on the video pictures;
a time-domain conversion module, configured to convert the pictures after gray-scale processing into a time-domain information representative image to obtain a first picture;
an anti-noise variation module, configured to perform anti-noise variation on the first picture to obtain a plurality of second pictures corresponding to the first picture;
a segmentation module, configured to segment each second picture to obtain M corresponding picture regions of equal area;
a feature extraction module, configured to extract color space features of the M picture regions respectively, M being an integer greater than 1;
a binarization processing module, configured to binarize the extracted color space features, and sort the results according to a preset sorting rule to obtain a binary feature sequence of the video;
wherein the anti-noise variation module comprises:
a deletion unit, configured to delete a black border in the first picture when the black border is detected in the first picture.
13. The device according to claim 12, characterized in that the anti-noise variation module comprises:
a rotation unit, configured to rotate the first picture multiple times by a predetermined angle and direction to obtain a plurality of second pictures, wherein each rotated picture corresponds to one second picture.
14. The device according to claim 13, characterized in that the anti-noise variation module further comprises:
a cutting and scaling unit, configured to, after the rotation unit rotates the first picture multiple times by the predetermined angle and direction to obtain the plurality of second pictures, and before the segmentation module segments each second picture, cut and/or scale the second pictures respectively to obtain second pictures with a preset aspect ratio.
15. The device according to claim 12, characterized in that the feature extraction module is specifically configured to calculate pixel average values of the M picture regions respectively;
the binarization processing module comprises:
a first comparison unit, configured to compare the pixel average value of each of the M picture regions belonging to the same second picture with the pixel average value of the whole second picture, and convert the results into binary features;
a first sorting unit, configured to sort the M obtained binary features according to the preset sorting rule to obtain a global binary feature sequence corresponding to the video.
16. The device according to claim 15, characterized in that the binarization processing module further comprises:
a region extraction unit, configured to extract K picture regions from the M picture regions belonging to the same second picture, K being an integer greater than 0 and less than M;
a second comparison unit, configured to compare the pixel average values of the K picture regions respectively with the pixel average value of the whole second picture, and convert the results into binary features;
a second sorting unit, configured to sort the K obtained binary features according to the preset sorting rule to obtain a local binary feature sequence corresponding to the video.
17. The device according to claim 16, characterized in that the region extraction unit executes the step of extracting K picture regions from the M picture regions belonging to the same second picture multiple times, the K values in the respective executions being the same or different;
wherein, in executions with the same K value, the positions of the picture regions extracted from the M picture regions belonging to the same second picture are different.
18. The device according to claim 17, characterized in that the device further comprises:
a storage module, configured to, after the binarization processing module obtains the binary feature sequence of the video by sorting according to the preset sorting rule, store the global binary feature sequence and the local binary feature sequences separately in layers, wherein binary feature sequences of the same length are stored in the same layer.
19. A video matching device, characterized by comprising:
a feature sequence extraction module, configured to extract a binary feature sequence of an input first video;
a matching output module, configured to match the binary feature sequence of the first video against binary feature sequences of a second video stored in a feature database, and output video information corresponding to features whose Hamming distance is less than a predetermined threshold;
wherein the feature sequence extraction module comprises the video feature extraction device according to any one of claims 12-18; the binary feature sequence of the first video and the binary feature sequences of the second video are feature sequences extracted by the video feature extraction device.
20. The device according to claim 19, characterized in that, when the feature database stores the global binary feature sequences and local binary feature sequences of the second video in layers, with binary feature sequences of the same length stored in the same layer, the matching output module comprises:
a search unit, configured to find, for each binary feature sequence of the first video, the corresponding layer in the feature database, wherein the binary feature sequences of the second video stored in that layer have the same length as the binary feature sequence of the first video;
a sequence matching unit, configured to match each binary feature sequence of the first video against the binary feature sequences of the second video in the corresponding layer respectively.
21. The device according to claim 19, characterized in that the video information comprises time-domain match information between time points of the first video and time points of the second video; and the device further comprises:
a time-domain matching analysis module, configured to, after the matching output module outputs the video information corresponding to the features whose Hamming distance is less than the predetermined threshold, perform time-domain matching analysis according to the time-domain match information using a preset time-point mapping algorithm, to obtain a matching relationship between a first duration in the first video and a second duration in the second video.
22. The device according to claim 21, characterized in that the time-point mapping algorithm comprises a time-domain scaling and/or offset algorithm.
CN201610460125.1A 2016-06-22 2016-06-22 A kind of video feature extraction method, video matching method and device Active CN105959686B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610460125.1A CN105959686B (en) 2016-06-22 2016-06-22 A kind of video feature extraction method, video matching method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610460125.1A CN105959686B (en) 2016-06-22 2016-06-22 A kind of video feature extraction method, video matching method and device

Publications (2)

Publication Number Publication Date
CN105959686A CN105959686A (en) 2016-09-21
CN105959686B true CN105959686B (en) 2018-11-09

Family

ID=56903662

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610460125.1A Active CN105959686B (en) 2016-06-22 2016-06-22 A kind of video feature extraction method, video matching method and device

Country Status (1)

Country Link
CN (1) CN105959686B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019117892A1 (en) * 2017-12-13 2019-06-20 Google Llc Methods, systems, and media for detecting and transforming rotated video content items
CN110727928B (en) * 2019-10-12 2022-04-26 湘潭大学 3D video copyright comprehensive protection method based on deep reinforcement learning optimization
CN111444846A (en) * 2020-03-27 2020-07-24 深圳技术大学 Video feature extraction method and device, computer equipment and storage medium
CN111639198A (en) * 2020-06-03 2020-09-08 北京字节跳动网络技术有限公司 Media file identification method and device, readable medium and electronic equipment
CN111966859A (en) * 2020-08-27 2020-11-20 司马大大(北京)智能系统有限公司 Video data processing method and device and readable storage medium
CN112437340B (en) * 2020-11-13 2023-02-21 广东省广播电视局 Method and system for determining whether variant long advertisements exist in audio and video
TWI774270B (en) 2021-03-12 2022-08-11 瑞昱半導體股份有限公司 Movie detection system and movie detection method
CN114155473B (en) * 2021-12-09 2022-11-08 成都智元汇信息技术股份有限公司 Picture cutting method based on frame compensation, electronic equipment and medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7532804B2 (en) * 2003-06-23 2009-05-12 Seiko Epson Corporation Method and apparatus for video copy detection
CN102682298B (en) * 2012-04-28 2013-11-06 聂秀山 Video fingerprint method based on graph modeling

Also Published As

Publication number Publication date
CN105959686A (en) 2016-09-21

Similar Documents

Publication Publication Date Title
CN105959686B (en) A kind of video feature extraction method, video matching method and device
CN111860233B (en) SAR image complex building extraction method and system based on attention network selection
CN106650615B (en) A kind of image processing method and terminal
CN111325271A (en) Image classification method and device
CN112200115B (en) Face recognition training method, recognition method, device, equipment and storage medium
WO2021175040A1 (en) Video processing method and related device
Tang et al. Perceptual image hashing using local entropies and DWT
CN111935487B (en) Image compression method and system based on video stream detection
Abadpour et al. Color PCA eigenimages and their application to compression and watermarking
Tang et al. Detection of GAN‐Synthesized Image Based on Discrete Wavelet Transform
Jana et al. A new DCT based robust image watermarking scheme using cellular automata
Niu et al. Image retargeting quality assessment based on registration confidence measure and noticeability-based pooling
CN112419342A (en) Image processing method, image processing device, electronic equipment and computer readable medium
CN107977964A (en) Slit cropping evidence collecting method based on LBP and extension Markov feature
CN110163095B (en) Loop detection method, loop detection device and terminal equipment
CN115147606A (en) Medical image segmentation method and device, computer equipment and storage medium
CN110717405A (en) Face feature point positioning method, device, medium and electronic equipment
CN112749576B (en) Image recognition method and device, computing equipment and computer storage medium
CN114529793A (en) Depth image restoration system and method based on gating cycle feature fusion
CN116383470B (en) Image searching method with privacy protection function
CN116051811B (en) Region identification method, device, computer equipment and computer readable storage medium
CN115880362B (en) Code region positioning method, device, computer equipment and computer readable storage medium
CN115937540A (en) Image Matching Method Based on Transformer Encoder
CN113344110B (en) Fuzzy image classification method based on super-resolution reconstruction
CN115239955A (en) Image matching method and device for unequal pixels, computer equipment and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant