CN101631240A - Method for identifying dynamic video content - Google Patents

Method for identifying dynamic video content

Info

Publication number
CN101631240A
CN101631240A · CN200810132600A
Authority
CN
China
Prior art keywords
fingerprint
value
frame
sample
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN 200810132600
Other languages
Chinese (zh)
Inventor
张骥 (Zhang Ji)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING YUVAD Co Ltd
Original Assignee
BEIJING YUVAD Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING YUVAD Co Ltd
Priority to CN 200810132600 (CN101631240A)
Publication of CN101631240A
Legal status: Withdrawn

Landscapes

  • Collating Specific Patterns (AREA)

Abstract

The invention provides a method for identifying dynamic video content. A fingerprint database containing video signals of previously broadcast video content is formed in advance, and the method comprises at least the following steps: storing the dynamic video content, a sequence of consecutive video frames, in a buffer; acquiring video frames with a sampler; storing the sample values as a fingerprint, namely storing fingerprint A in the fingerprint database; and performing a matching calculation under a fingerprint model between fingerprint A in the fingerprint database and the remaining fingerprint part B to determine whether the dynamic video content has been broadcast before. The method can effectively manage, archive and search video content, reduces the cost of digital storage equipment, and identifies video content effectively with little or no human intervention.

Description

Method for identifying dynamic video content
Technical field
The present invention relates to a method for identifying dynamic video content, and in particular to a method for identifying dynamic video content by means of a fingerprint.
The term "fingerprint" here refers to a series of sample-point data produced by the decoder of a television receiver. Each sample point is taken from one frame of the TV signal, and the sampled frames are selected from the entire TV signal; one or more sample points may be selected from each frame. The "fingerprint" can therefore uniquely identify the TV signal in question.
Background art
This document describes mechanisms and methods for handling video content in detail. Video is the most effective way of delivering information to the masses. Nearly all video content is now in digital format: from video acquisition, production, editing and special effects to compression and distribution, everything is digital. In addition, an ever larger amount of video content is being stored on DVDs, video tapes and computer servers.
Managing digital video content has become a great challenge for all video owners, including video and broadband service providers and even home users. Unlike text, video content cannot simply be searched and identified by a computer. Unlike audio, video content data takes up far more space. Moreover, identifying video content by manual inspection is difficult and inefficient, and hard to carry out in practice because the process is extremely time-consuming. All of these factors make it very difficult to manage, retrieve and search video content effectively. Yet the need to search and identify video content is becoming ever more important, driven by the growing availability of broadband connections and the falling cost of digital storage devices.
There is therefore a need to identify video content effectively with little or no human intervention.
Summary of the invention
The present invention provides a method for identifying dynamic video content with which video content can be managed, archived and searched effectively.
The present invention also provides a method for identifying dynamic video content that can reduce the cost of data storage devices.
The present invention further provides a method for identifying dynamic video content that extracts information from a given video segment and then uses that information to automatically recognize the same video content when it appears in different video data streams.
Accordingly, the present invention provides a method for identifying dynamic video content in which a fingerprint database containing video signals of previously broadcast video content is formed in advance, the method comprising at least the following steps: storing the dynamic video content, a sequence of consecutive video frames, in a buffer; acquiring video frames with a sampler; storing the sample values as a fingerprint, that is, storing fingerprint A in the fingerprint database; and performing a matching calculation under a fingerprint model between fingerprint A in the fingerprint database and the remaining fingerprint part B, to determine whether the dynamic video content has been broadcast before.
The method provided by the present invention can effectively manage, archive and search video content, reduces the cost of digital storage equipment, and identifies video content effectively with little or no human intervention.
Description of drawings
Fig. 1: Schematic diagram of the key steps of partially sampling a video frame
Fig. 2: Schematic diagram of the matching process for determining whether original video contents A and B are identical, at least in the parts related to the matched fingerprints
Fig. 3: Schematic diagram of the elements for detecting a marked drop in the SSAD values
Fig. 4: Schematic diagram of further elements for detecting a marked drop in the SSAD values
Fig. 5: Schematic diagram of a preferred method of partially sampling a video frame
Fig. 6: Schematic diagram of the process of partially sampling each frame
Fig. 7: Schematic diagram of the process of organizing and managing the sample values
Fig. 8: Common methods of determining the number of video frames to be partially sampled
Fig. 9: Schematic diagram of the overall fingerprint extraction process
Fig. 10: Process of performing the fingerprint matching calculation with the SAD operation between the samples obtained from two video frames
Fig. 11: Process of matching between the samples obtained from two video frames
Fig. 12: Process of generating a series of SSAD values
Fig. 13: Identification of a fingerprint match, characterized by SSAD values that drop markedly before the match and rise markedly after the match
Embodiment
The following describes a method that extracts information from a given segment of video content and then uses the extracted information to recognize the same video content when it appears in a different video data stream.
The ability to correctly identify video content has many important applications, including but not limited to the following:
Video search
Video monitoring
Video splicing
Video switching
Video advertising
In the following, extracting information from the given video content data, referred to as the fingerprint extraction process, is described first. How the fingerprint data are then used to find a match within a different video content is described afterwards.
The discussion concentrates on the video signal, even though in most cases the video signal is accompanied by an audio signal. The audio signal can be considered synchronous with the video signal, so fingerprint recognition performed on the video signal also identifies the associated audio content. The remainder of this document therefore discusses only the fingerprint processing of the video signal.
It is likewise assumed that the video data is digitized. The concept can also be applied to analog video content, but the analog video signal must first be digitized into a digital data stream before the method described here is applied; this document therefore does not specifically address how to process analog video content.
In addition, it is assumed that the digital video content is in uncompressed form. Compressed video content must first be decompressed (decoded) before the method described here is applied.
Finally, it is assumed that the video frames are progressive, that is, each frame is displayed as a whole at the decoder. An interlaced frame is displayed at two separate points in time as two fields (odd and even); in that case, the operations described below are assumed to apply to either one of the two fields.
Uncompressed digital video data can be represented as video frames ordered in time. Each frame can be described as a two-dimensional sequence of pixel values, and each pixel value can be further decomposed into a luminance component and a chrominance component. For capturing and searching video content, only the luminance pixel values of the video are used.
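Where the source pixels are RGB rather than already-decoded luminance, a common conversion uses the BT.601 weighting; this is an assumption for illustration, as the patent only states that the luminance component of each pixel is used.

```python
def luminance(r, g, b):
    """Approximate luminance of an RGB pixel using BT.601 weights
    (an illustrative assumption; the patent does not specify a colour space)."""
    return 0.299 * r + 0.587 * g + 0.114 * b
```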
Digital video consists of frames that are consecutive in time and that appear as a continuous moving picture when presented to the human eye. The way information is extracted from these video frames, so that the extracted information can be used to identify the frames, is described first.
The steps required for fingerprint matching can be summarized as follows:
Extract the fingerprint from the video A data
Organize the fingerprint data extracted from video A into a data center
Extract the fingerprint from the video B data
Perform the fingerprint pattern matching calculation between the two fingerprints
Each of these steps is described in detail below.
2.1 Fingerprint extraction
The simplest way to extract a fingerprint is to store every frame in disk storage. The obvious drawback of this approach is that it occupies a very large amount of storage space. In addition, because of storage bandwidth limitations, it is also very difficult to retrieve the stored video frames quickly.
The first step of the method presented here is to partially sample the video frames. Specifically, each video frame is partially sampled spatially: a specific number of samples are taken from the frame and stored as sample values. The key steps are shown in Fig. 1.
2.1.1 Partial sampling of a video frame
As shown in Fig. 5, this sampling method is independent of the resolution of each frame, so it remains stable when handling pictures of different resolutions.
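As an illustration of this per-frame partial sampling, the sketch below takes five luminance samples at the fixed relative positions listed in claim 10. It is a minimal sketch, assuming the frame is available as a two-dimensional array of luminance values; the function and parameter names are illustrative and not taken from the patent.

```python
def sample_frame(luma, width, height):
    """Partially sample one video frame: take 5 luminance values at
    fixed relative positions (the positions described in claim 10)."""
    positions = [
        (width // 2, height // 2),      # sample 1: centre of the image
        (width // 2, height // 4),      # sample 2: centre column, upper quarter
        (width // 2, 3 * height // 4),  # sample 3: centre column, lower quarter
        (width // 4, height // 2),      # sample 4: centre row, left quarter
        (3 * width // 4, height // 2),  # sample 5: centre row, right quarter
    ]
    # luma is assumed to be indexable as luma[y][x] (rows of luminance values)
    return [luma[y][x] for (x, y) in positions]
```

Because the positions are defined as fractions of the frame size, the same sampler works unchanged across resolutions, which is the resolution independence noted above.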
2.1.2 Partial sampling of multiple video frames
This process is shown in Fig. 8. The first approach requires the least computation, memory and storage, while the last approach requires the most.
Each series of sampled video frames produces a continuous sequence of two-dimensional sample values. This sampled sequence is the so-called fingerprint of the sampled video.
Note that the sampling methods described above may produce more than one group of fingerprint sequences. The first and third sampling methods produce only a single fingerprint, whereas the second sampling method may produce several groups of fingerprint sequences, each group identifying a different part of the video. Of course, multiple groups of fingerprint data can be organized and managed as a more complex fingerprint sequence; this is not elaborated further in this document.
The following explains in detail how a single fingerprint sequence is processed.
2.2 Fingerprint matching
This section describes the reverse of the fingerprint extraction operation: a given fingerprint sequence is used to search, within a different video content stream, for video content that partially or fully matches the content represented by the fingerprint.
The details are shown in Fig. 2: two stored fingerprints are passed through several steps to determine whether or not they match.
2.2.1 Sum of absolute difference operation
The key to the fingerprint matching computation is the sum of absolute difference (SAD) operation between the two fingerprints. The absolute difference operation essentially evaluates the difference between samples: a large SAD value means that the picture content of the two video frames differs greatly. A concrete example is shown in Fig. 10.
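A minimal sketch of this SAD operation, assuming each frame's contribution to a fingerprint is the list of sample values produced by the sampler above; the function name is illustrative.

```python
def sad(samples_a, samples_b):
    """SAD(A, B) = |A1 - B1| + |A2 - B2| + ... between the sample values
    taken from corresponding positions of one frame of A and one frame of B."""
    return sum(abs(a - b) for a, b in zip(samples_a, samples_b))
```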
2.2.2 Moving SAD window and summation of the SAD sequence (sum of SAD, SSAD)
The SAD operation described above is repeated between two groups of fingerprint data, one taken from fingerprint A and the other from fingerprint B. The goal is to find out whether any part of fingerprint B matches fingerprint A. It is assumed here that fingerprint A contains fewer samples than fingerprint B.
The detailed SSAD process is shown in Fig. 12.
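The sliding-window process of Fig. 12 can be sketched as follows, reusing the sad helper above. Each fingerprint is assumed to be a list of per-frame sample lists, with fingerprint A shorter than fingerprint B; the names are illustrative.

```python
def ssad_sequence(fingerprint_a, fingerprint_b):
    """Slide fingerprint A over fingerprint B one frame at a time.
    For each offset, sum the per-frame SAD values over the whole length
    of A, producing one SSAD value per offset."""
    n = len(fingerprint_a)
    ssad = []
    for offset in range(len(fingerprint_b) - n + 1):
        total = sum(sad(fingerprint_a[i], fingerprint_b[offset + i])
                    for i in range(n))
        ssad.append(total)
    return ssad
```

A match shows up as a pronounced dip in this sequence: the SSAD values fall as the window approaches the matching position and rise again after it, which is the pattern the detector in the next section looks for.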
2.2.3 Fingerprint match detection
The SSAD values, denoted S(1), S(2), ..., S(n), S(n+1), are stored in a pattern sequence store. A pattern extractor examines all adjacent SSAD values and derives pattern values, denoted P(1), P(2), P(3), ..., as follows:
P(n)=(S(n)-S(n-1))/S(n)
Here S(n) represents the difference between fingerprint A and fingerprint B at the n-th frame within the fingerprint window, and the index n refers to the fingerprint sequence B being shifted frame by frame relative to fingerprint A. P(1) is not defined and is not used, and the pattern value is meaningful only when S(n) is neither zero nor close to the zero threshold; otherwise P(n) is set to zero. The pattern values so extracted are stored one after another in a pattern store. A pattern detector then selects values from the pattern store in the following steps:
First, a particular position in the pattern store, say m, is selected, and the position m is identified within a window of size 2M-1:
P(m-M+1),P(m-M+2),...,P(m-1),P(m),P(m+1),...,P(m+M-2),P(m+M-1),
Then, these values are accumulated by a pattern value collector to produce a result C(m), given by the following formula:
C(m)=-P(m-M+1)-...-P(m-1)-P(m)+P(m+1)+...+P(m+M-1)
Here M is a selected constant, chosen to ensure that enough P values are included in the sliding window 2M-1 when computing the C value.
Finally, the value of C(m) is compared with a user-supplied threshold to determine whether fingerprint A matches the candidate fingerprint B. The frame number is also determined by the above process and is output to a statistical distribution value collector.
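The detection step just described can be sketched as follows: derive the pattern values P(n) from adjacent SSAD values, accumulate them over a window of 2M-1 values into C(m), and report the offsets whose C(m) exceeds the user-supplied threshold. The function name, the eps guard that skips the division when S(n) is zero or near zero, and the 0-based indexing are illustrative assumptions rather than details taken from the patent.

```python
def detect_matches(ssad, M, threshold):
    """Pattern-based match detection over a sequence of SSAD values.
    Indices are 0-based here, so P[0] plays the role of the undefined
    P(1) and is left at zero."""
    eps = 1e-9
    n_vals = len(ssad)
    P = [0.0] * n_vals
    for n in range(1, n_vals):
        # P(n) = (S(n) - S(n-1)) / S(n); skipped when S(n) is near zero
        if abs(ssad[n]) > eps:
            P[n] = (ssad[n] - ssad[n - 1]) / ssad[n]
    matches = []
    for m in range(M - 1, n_vals - M + 1):
        # C(m) = -P(m-M+1) - ... - P(m) + P(m+1) + ... + P(m+M-1)
        c = -sum(P[m - M + 1:m + 1]) + sum(P[m + 1:m + M])
        if c > threshold:
            matches.append((m, c))
    return matches
```

The offsets collected in matches would then feed the statistical distribution value collector and the maximum value selector described in claims 4 and 5.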

Claims (10)

1. A method for identifying dynamic video content, wherein a fingerprint database containing video signals of previously broadcast video content is formed in advance, the method comprising at least the following steps: storing the dynamic video content, a sequence of consecutive video frames, in a buffer; acquiring video frames with a sampler; storing the sample values as a fingerprint, that is, storing fingerprint A in the fingerprint database; and performing a matching calculation under a fingerprint model between fingerprint A in the fingerprint database and the remaining fingerprint part B, to determine whether the dynamic video content has been broadcast before.
2. The method according to claim 1, wherein the matching calculation under the fingerprint model comprises performing a sum of absolute difference operation between fingerprint A and each candidate fingerprint B, according to the following formula:
SAD(A,B)=|A1-B1|+|A2-B2|+|A3-B3|+|A4-B4|+|A5-B5|+…
where |...| denotes the absolute value operation;
the samples obtained from the first frame of fingerprint A are denoted A1, A2, A3, A4, A5, ..., the samples obtained from the first frame of fingerprint B within the sliding window are denoted B1, B2, B3, B4, B5, ..., and the samples at the same positions of the respective frames of A and B are matched: A1 with B1, A2 with B2, ..., A5 with B5;
the SAD (sum of absolute difference) operation is then performed again between A and B for the samples of the second frame, and so on, until every frame of fingerprint A and the corresponding frame of the candidate fingerprint B have been included in the calculation for all video frames, and the results are added up to give a total sum of absolute differences (SSAD);
the same process is repeated with the frame position of fingerprint B shifted relative to fingerprint A, each such shift producing a new SSAD value, so that a series of SSAD values is generated and stored as a sequence; the identification of a fingerprint match is characterized by SSAD values that drop markedly before the match and rise markedly after the match.
3. The method according to claim 2, wherein the SSAD values, denoted S(1), S(2), ..., S(n), S(n+1), are stored in a pattern sequence store, and a pattern extractor examines all adjacent SSAD values and derives pattern values, denoted P(1), P(2), P(3), ..., according to the following formula:
P(n)=(S(n)-S(n-1))/S(n)
where S(n) represents the difference between fingerprint A and fingerprint B at the n-th frame of video A and video B within the fingerprint window, and the index n refers to the fingerprint sequence B being shifted frame by frame relative to the associated fingerprint sequence A; P(1) is not defined and is not used, and the pattern value is meaningful only when S(n) is neither zero nor close to the zero threshold, otherwise P(n) is set to zero; the pattern values so extracted are stored one after another in a pattern store, and a pattern detector then selects values from the pattern store in the following steps:
first, a particular position in the pattern store, say m, is selected, and the position m is identified within a window of size 2M-1:
P(m-M+1),P(m-M+2),...,P(m-1),P(m),P(m+1),...,P(m+M-2),P(m+M-1),
then, these values are accumulated by a pattern value collector to produce a result C(m), given by the following formula:
C(m)=-P(m-M+1)-...-P(m-1)-P(m)+P(m+1)+...+P(m+M-1)
where M is a selected constant, chosen to ensure that enough P values are included in the sliding window 2M-1 when computing the C value;
finally, the value of C(m) is compared with a user-supplied threshold to determine whether fingerprint A matches the candidate fingerprint B; the number of frames is also determined by the above process and is output to a statistical distribution value collector.
4. The method according to claim 3, wherein the statistical distribution value collector gathers all pattern values C(m) that exceed the given threshold, counts the number of times the threshold is exceeded, and stores the results in a sequence, each entry comprising an m value, the C(m) value, and the number of times C(m) exceeds the threshold.
5. The method according to claim 4, wherein a maximum value selector examines all the values in the histogram and finds the value with the largest number of occurrences; this value indicates the frame at which the fingerprints match.
6. The method according to claim 1, wherein the sample values are selected according to the video luminance of the samples.
7. The method according to claim 1, wherein the sample values are obtained at the same positions in each frame, the number of samples obtained from each frame is the same, and the positions from which the samples are obtained are fixed.
8. The method according to claim 1, wherein the samples are taken from the image so as to ensure that they are distributed as evenly as possible over the image.
9. The method according to claim 1, wherein the samples are located at integer positions in the horizontal and vertical directions of the video resolution.
10. The method according to claim 1, wherein 5 sample values are taken from each frame, wherein:
Sample 1 is at the centre of the image;
Sample 2 is at the centre in the horizontal direction and at the upper quarter in the vertical direction;
Sample 3 is at the centre in the horizontal direction and at the lower quarter in the vertical direction;
Sample 4 is at the centre in the vertical direction and a quarter of the way toward the left in the horizontal direction;
Sample 5 is at the centre in the vertical direction and a quarter of the way toward the right in the horizontal direction.
CN 200810132600 2008-07-17 2008-07-17 Method for identifying dynamic video content Withdrawn CN101631240A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200810132600 CN101631240A (en) 2008-07-17 2008-07-17 Method for identifying dynamic video content

Publications (1)

Publication Number Publication Date
CN101631240A (en) 2010-01-20

Family

ID=41576151

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200810132600 Withdrawn CN101631240A (en) 2008-07-17 2008-07-17 Method for identifying dynamic video content

Country Status (1)

Country Link
CN (1) CN101631240A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106488319A (en) * 2016-12-28 2017-03-08 上海昕丝文化传播有限公司 Networked video plays device blocks
CN108255821A (en) * 2016-12-28 2018-07-06 上海昕丝文化传播有限公司 Video the second screen plateform system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C04 Withdrawal of patent application after publication (patent law 2001)
WW01 Invention patent application withdrawn after publication