CN103093458B

CN103093458B - Key frame detection method and device

Info

Publication number: CN103093458B
Application number: CN201210592607.4A
Authority: CN
Inventors: 戴琼海; 张佳宏
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 2012-12-31
Filing date: 2012-12-31
Publication date: 2015-11-25
Anticipated expiration: 2032-12-31
Also published as: CN103093458A

Abstract

The present invention proposes a key frame detection method and device. The method comprises the following steps. S1: divide the input current video frame and its adjacent video frame into non-uniform blocks, and compute the histogram distribution probabilities and joint histogram distribution probabilities of the two adjacent video frames for each block and each color component. S2: compute the block-wise mutual information between the two adjacent video frames from the histogram distribution probabilities and joint histogram distribution probabilities, and compute the frame difference of the two adjacent video frames from the block-wise mutual information. S3: perform a first key frame detection on the current video frame according to the frame difference to obtain a preliminary detection result for the current video frame. S4: perform a second key frame detection on the current video frame according to the preliminary detection result to obtain the final detection result for the current video frame. The method is fast to compute, achieves high recall and precision, and is readily extensible and portable.

Description

Key frame detection method and device

Technical field

The present invention relates to the technical field of computer vision, and in particular to a key frame detection method and device.

Background art

A video is made up of scenes, each scene is made up of shots, and each shot contains anywhere from a few to several hundred or more video frames. Key frame detection refers to detecting the boundaries of these shots of varying length, that is, finding the video frames at which a shot change occurs. Shot changes generally include abrupt cuts, fade-in/fade-out transitions, and dissolve transitions.

Prior-art key frame detection methods use an image histogram distance to characterize the frame difference between adjacent frames and thereby find the video frames at which a shot change occurs, for example the histogram minimum frame difference method. The problem with this approach is that an image histogram is only a simple cumulative statistic and is not sufficient to accurately express the feature difference between the histograms of two adjacent frames.

The prior art can also use a dual-threshold method for key frame detection, which can detect both abrupt-change and gradual-change key frames of a video. This approach has two problems. First, it produces obvious false detections under illumination changes and the motion of large objects. Second, it cannot correctly detect some gradual-change key frames, such as dissolve key frames, which leads to both false detections and missed detections.

Summary of the invention

The object of the present invention is to solve at least one of the above technical deficiencies.

To achieve the above object, one aspect of the present invention proposes a key frame detection method comprising the following steps. S1: divide the input current video frame and the adjacent video frame of the current video frame into non-uniform blocks, and compute the histogram distribution probabilities and joint histogram distribution probabilities of the two adjacent video frames for each block and each color component. S2: compute the block-wise mutual information between the two adjacent video frames from their histogram distribution probabilities and joint histogram distribution probabilities for each block and each color component, and compute the frame difference of the two adjacent video frames from the block-wise mutual information. S3: perform a first key frame detection on the current video frame according to the frame difference of the two adjacent video frames to obtain a preliminary detection result for the current video frame. S4: perform a second key frame detection on the current video frame according to the preliminary detection result to obtain the final detection result for the current video frame.

The key frame detection method according to the embodiments of the present invention has the following advantages: (1) fast computation and high processing efficiency: video of any resolution can be processed by sampling, so real-time key frame detection can be achieved; (2) high recall and precision: key frames can be located accurately; (3) good extensibility and portability, and ease of use: the method can be combined with other key frame detection methods and other applications, giving it good extensibility and broad application potential.

To implement the above method, another aspect of the present invention proposes a key frame detection device, comprising: a distribution probability statistics module, configured to divide the input current video frame and the adjacent video frame of the current video frame into non-uniform blocks, and to compute the histogram distribution probabilities and joint histogram distribution probabilities of the two adjacent video frames for each block and each color component; a frame difference calculation module, connected to the distribution probability statistics module and configured to compute the block-wise mutual information between the two adjacent video frames from their histogram distribution probabilities and joint histogram distribution probabilities for each block and each color component, and to compute the frame difference of the two adjacent video frames from the block-wise mutual information; a first detection module, connected to the frame difference calculation module and configured to perform a first key frame detection on the current video frame according to the frame difference of the two adjacent video frames to obtain a preliminary detection result for the current video frame; and a second detection module, connected to the first detection module and configured to perform a second key frame detection on the current video frame according to the preliminary detection result to obtain the final detection result for the current video frame.

The key frame detection device according to the embodiments of the present invention is easy to use, has high processing efficiency, achieves high recall and precision, and has good extensibility and portability.

Additional aspects and advantages of the present invention will be given in part in the following description, and in part will become apparent from the following description or be learned through practice of the present invention.

Brief description of the drawings

The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:

Fig. 1 is a flow chart of the key frame detection method according to an embodiment of one aspect of the present invention;

Fig. 2 is a schematic diagram of the non-uniform blocks and their weights according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of the principle of the first key frame detection according to an embodiment of the present invention;

Fig. 4 is a schematic diagram of the distribution of pixel sampling points according to an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of the key frame detection device according to an embodiment of another aspect of the present invention;

Fig. 6 is a schematic structural diagram of the distribution probability statistics module 100 according to an embodiment of the present invention;

Fig. 7 is a schematic structural diagram of the frame difference calculation module 200 according to an embodiment of the present invention;

Fig. 8 is a schematic structural diagram of the first detection module 300 according to an embodiment of the present invention; and

Fig. 9 is a schematic structural diagram of the second detection module 400 according to an embodiment of the present invention.

Detailed description of the embodiments

Embodiments of the present invention are described in detail below. Examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals denote the same or similar elements, or elements having the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the present invention and are not to be construed as limiting it.

To describe the performance of the key frame detection method and device of the embodiments of the present invention, recall and precision are defined. Recall is the number of correctly detected key frames in video shot segmentation divided by the actual number of key frames. Precision is the number of correctly detected key frames in video shot segmentation divided by the total number of detected key frames.
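The two definitions above can be illustrated with a minimal Python sketch. The function name `recall_precision` and the frame indices are hypothetical and serve only as an illustration; they are not part of the described method.

```python
def recall_precision(detected, actual):
    """Return (recall, precision) for detected vs. ground-truth key-frame indices."""
    detected, actual = set(detected), set(actual)
    correct = len(detected & actual)              # correctly detected key frames
    recall = correct / len(actual) if actual else 0.0
    precision = correct / len(detected) if detected else 0.0
    return recall, precision

# Hypothetical frame indices, for illustration only.
recall, precision = recall_precision(detected=[10, 52, 97, 130],
                                     actual=[10, 52, 98, 130, 200])
```

Here three of the four detections are correct and three of the five actual key frames are found.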

Fig. 1 is a flow chart of the key frame detection method according to an embodiment of one aspect of the present invention. As shown in Fig. 1, the key frame detection method according to the embodiment of the present invention comprises the following steps:

Step S101: divide the input current video frame and the adjacent video frame of the current video frame into non-uniform blocks, and compute the histogram distribution probabilities and joint histogram distribution probabilities of the two adjacent video frames for each block and each color component.

Specifically, first, the length and the width of each of the two adjacent video frames are divided in the ratio 1:3:1, yielding 9 non-uniform blocks m (m = 1, 2, ..., 9), and weights W are assigned according to the positions of the non-uniform blocks, where

W = \begin{bmatrix} w_{1} & w_{2} & w_{3} \\ w_{4} & w_{5} & w_{6} \\ w_{7} & w_{8} & w_{9} \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 \\ 2 & 4 & 2 \\ 1 & 1 & 1 \end{bmatrix} .

A traditional histogram calculation reflects only the global color distribution of a video frame image and can hardly reflect its spatial information. In ordinary video, advertisements or captions often appear at the top and bottom of the frame, and their frequent changes interfere with the detection of shot key frames. The embodiments of the present invention therefore divide the video frame into non-uniform blocks and assign different weights according to block position, so that the change of the main content of the frame is emphasized and recall and precision are improved. Fig. 2 is a schematic diagram of the non-uniform blocks of an embodiment of the present invention and the weight assigned to each block: the video frame is divided in the ratio 1:3:1 into 3 × 3 blocks of unequal size. In Fig. 2, the number in each non-uniform block is its weight, W denotes the width of the video frame image, and H denotes its height.
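The 1:3:1 split and the weights can be sketched in Python as follows. The helper names `split_1_3_1` and `nonuniform_blocks`, and the integer rounding of the cut positions, are assumptions of this sketch; the text does not specify how fractional pixel boundaries are handled.

```python
# Weights from the matrix W, in row-major block order m = 1..9.
WEIGHTS = [1, 1, 1,
           2, 4, 2,
           1, 1, 1]

def split_1_3_1(length):
    """Split `length` pixels into three (start, end) spans in the ratio 1:3:1.

    Integer rounding of the cut positions is an assumption of this sketch.
    """
    first = length // 5          # end of the first fifth
    last = length - first        # start of the last fifth
    return [(0, first), (first, last), (last, length)]

def nonuniform_blocks(height, width):
    """Return the 9 blocks as (row_start, row_end, col_start, col_end), row-major."""
    return [(r0, r1, c0, c1)
            for r0, r1 in split_1_3_1(height)
            for c0, c1 in split_1_3_1(width)]

# Hypothetical 10 x 20 frame, for illustration only.
blocks = nonuniform_blocks(height=10, width=20)
```

The central block (index 4 in row-major order) covers the middle three fifths of each dimension and carries the largest weight.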

Then, for each of the two adjacent video frames, the number of pixels in block m whose R color component has the value i is counted, and this count is divided by the total number of pixels in block m to obtain the histogram distribution probabilities p_{t}^{m} (R_{i}) and p_{t - 1}^{m} (R_{j}). The histogram distribution probabilities on the G and B color components, p_{t}^{m} (G_{i}) and p_{t - 1}^{m} (G_{j}), and p_{t}^{m} (B_{i}) and p_{t - 1}^{m} (B_{j}), are obtained in the same way, where t denotes the current video frame and t-1 denotes the adjacent video frame of the current video frame.

Finally, the number of pixel pairs at corresponding positions in block m of the two adjacent video frames whose R color component values are i and j respectively is counted, and this count is divided by the total number of pixels in block m to obtain the joint histogram distribution probability p_{t, t - 1}^{m} (R_{i}, R_{j}). The joint histogram distribution probabilities on the G and B color components, p_{t, t - 1}^{m} (G_{i}, G_{j}) and p_{t, t - 1}^{m} (B_{i}, B_{j}), are obtained in the same way.
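The counting described above can be sketched in Python for one block and one color channel. The function name `block_probabilities`, the nested-list frame layout, and the sparse dictionary representation are assumptions of this sketch.

```python
from collections import Counter

def block_probabilities(cur, prev, block):
    """Marginal and joint histogram probabilities of one block of one channel.

    cur, prev: 2-D lists of integer intensities (frames t and t-1, same size).
    block: (row_start, row_end, col_start, col_end).
    Returns sparse dicts: p_cur[i], p_prev[j], p_joint[(i, j)].
    """
    r0, r1, c0, c1 = block
    total = (r1 - r0) * (c1 - c0)                 # pixels in the block
    pairs = [(cur[r][c], prev[r][c])
             for r in range(r0, r1) for c in range(c0, c1)]
    p_cur = {i: n / total for i, n in Counter(i for i, _ in pairs).items()}
    p_prev = {j: n / total for j, n in Counter(j for _, j in pairs).items()}
    p_joint = {ij: n / total for ij, n in Counter(pairs).items()}
    return p_cur, p_prev, p_joint

# Hypothetical 2 x 2 single-channel frames, for illustration only.
p_cur, p_prev, p_joint = block_probabilities(
    cur=[[0, 1], [1, 1]], prev=[[0, 0], [1, 1]], block=(0, 2, 0, 2))
```

The same function would be called once per block and per color component (R, G, B).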

Step S102: compute the block-wise mutual information between the two adjacent video frames from their histogram distribution probabilities and joint histogram distribution probabilities for each block and each color component, and compute the frame difference of the two adjacent video frames from the block-wise mutual information.

Specifically, first, from the histogram distribution probabilities and joint histogram distribution probabilities of the two adjacent video frames, the mutual information I_{t, t - 1}^{m} (R) of the two adjacent video frames on block m and the R color component is calculated according to the following formula, where N is the number of values a color component can take:

I_{t, t - 1}^{m} (R) = - \sum_{i = 0}^{N - 1} \sum_{j = 0}^{N - 1} p_{t, t - 1}^{m} (R_{i}, R_{j}) \log_{2} \frac{p_{t, t - 1}^{m} (R_{i}, R_{j})}{p_{t}^{m} (R_{i}) \, p_{t - 1}^{m} (R_{j})},

The mutual information of the two adjacent video frames on block m for the G and B color components, I_{t, t - 1}^{m} (G) and I_{t, t - 1}^{m} (B), is calculated in the same way.

Then, from the mutual information I_{t, t - 1}^{m} (R), I_{t, t - 1}^{m} (G) and I_{t, t - 1}^{m} (B) of the two adjacent video frames on block m for the R, G, and B color components, the total color mutual information I_{t, t - 1}^{m} (R, G, B) is calculated according to the following formula:

I_{t, t - 1}^{m} (R, G, B) = \frac{1}{3} \left( I_{t, t - 1}^{m} (R) + I_{t, t - 1}^{m} (G) + I_{t, t - 1}^{m} (B) \right) .

From the weight matrix W and the total color mutual information I_{t, t - 1}^{m} (R, G, B), the block-wise mutual information I_{t, t - 1} is calculated according to the following formula:

I_{t, t - 1} = \frac{\sum_{m = 1}^{9} w_{m} \, I_{t, t - 1}^{m} (R, G, B)}{\sum_{m = 1}^{9} w_{m}} .

Finally, from the block-wise mutual information I_{t, t - 1}, the frame difference dist_{t, t - 1} of the two adjacent video frames is calculated according to the following formula:

dist_{t, t - 1} = 1 - I_{t, t - 1} .
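The chain of formulas in step S102 can be sketched in Python as follows. The function names and the sparse dictionary inputs are assumptions of this sketch. Note also that the sketch uses the standard non-negative definition of mutual information, that is, without the leading minus sign of the per-channel formula as printed above; this choice is an assumption, not something the text states.

```python
import math

WEIGHTS = [1, 1, 1, 2, 4, 2, 1, 1, 1]   # matrix W in row-major block order

def mutual_information(p_cur, p_prev, p_joint):
    """Mutual information (in bits) of one block and one color channel.

    Uses the standard non-negative definition (an assumption of this sketch).
    p_cur[i], p_prev[j]: marginal probabilities; p_joint[(i, j)]: joint ones.
    """
    total = 0.0
    for (i, j), p in p_joint.items():
        if p > 0:                         # zero-probability terms contribute 0
            total += p * math.log2(p / (p_cur[i] * p_prev[j]))
    return total

def frame_difference(channel_mi_per_block, weights=WEIGHTS):
    """channel_mi_per_block: 9 tuples (I_R, I_G, I_B), one per block."""
    color_mi = [sum(mi) / 3 for mi in channel_mi_per_block]       # average R, G, B
    blockwise = sum(w * mi for w, mi in zip(weights, color_mi)) / sum(weights)
    return 1 - blockwise                                          # dist = 1 - I

# Hypothetical two-level distributions, for illustration only.
half = {0: 0.5, 1: 0.5}
mi_identical = mutual_information(half, half, {(0, 0): 0.5, (1, 1): 0.5})
mi_independent = mutual_information(
    half, half, {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25})
```

With these inputs, identical two-level blocks share one bit of information and statistically independent blocks share none, so a larger frame difference indicates less shared content.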

Step S103: perform a first key frame detection on the current video frame according to the frame difference of the two adjacent video frames to obtain a preliminary detection result for the current video frame.

Fig. 3 is a schematic diagram of the principle of the first key frame detection according to an embodiment of the present invention. Abrupt-change key frames and gradual-change key frames are detected separately by setting two thresholds on the size of the frame difference, a first threshold TH and a second threshold TL.

Specifically, first, the frame difference of the two adjacent video frames is compared with the first threshold TH and the second threshold TL.

In one embodiment of the present invention, the first threshold TH and the second threshold TL are set as follows. First, for the video sequence of length S between the previous key frame of the current video frame and the adjacent video frame of the current video frame, the frame differences dist_{i, i - 1} (i = 1, ..., S-1) are calculated. Then the mean frame difference μ is calculated from these frame differences, where

\mu = \frac{1}{S} \sum_{i = 1}^{S - 1} dist_{i, i - 1} .

Finally, the thresholds are calculated from the mean frame difference μ according to the following formula:

TH = 5\mu, \quad TL = 3\mu .

It should be understood that, according to the embodiments of the present invention, the first threshold TH and the second threshold TL can be adjusted adaptively during computation according to the video content, or can be set manually as required.
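The threshold setting can be sketched in Python as follows. The function name `adaptive_thresholds` is hypothetical; the division by S (rather than by the S-1 summed terms) follows the formula as given.

```python
def adaptive_thresholds(frame_diffs):
    """Return (TH, TL) from the S-1 frame differences of a length-S sequence."""
    s = len(frame_diffs) + 1          # sequence length S
    mu = sum(frame_diffs) / s         # mean frame difference, divided by S as in the formula
    return 5 * mu, 3 * mu

# Hypothetical frame differences since the previous key frame.
th, tl = adaptive_thresholds([0.1, 0.2, 0.3])
```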

If the frame difference of the two adjacent video frames is greater than the first threshold TH, the current video frame may be an abrupt-change key frame, and the first key frame detection proceeds as follows. A circular buffer stores the frame differences of adjacent video frames centered on the current video frame, with r frames on each side, where r = 3 to 5. The frame difference of the two adjacent video frames is compared with all 2r+1 frame differences in the circular buffer; if it is the maximum, the next check is performed, otherwise the current video frame is not an abrupt-change key frame. The frame difference is then compared with the second-largest of the 2r+1 frame differences in the circular buffer; if it exceeds 3 times the second-largest frame difference, the current video frame is an abrupt-change key frame, otherwise it is not.
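The abrupt-change check can be sketched in Python as follows. The function name `is_abrupt_change` is hypothetical, a list slice stands in for the circular buffer, and "exceeds 3 times the second-largest" is implemented as a strict comparison; these are assumptions of the sketch.

```python
def is_abrupt_change(diffs, t, th, r=3, ratio=3.0):
    """Preliminary abrupt-change test for frame t.

    diffs[k] is the frame difference between frame k and frame k-1.
    A slice of up to 2r+1 values centred on t stands in for the circular buffer.
    """
    if diffs[t] <= th:
        return False
    lo, hi = max(0, t - r), min(len(diffs), t + r + 1)
    neighbours = diffs[lo:t] + diffs[t + 1:hi]     # window without the centre
    if not neighbours:
        return True
    second_largest = max(neighbours)
    # Must be the window maximum and exceed `ratio` times the runner-up.
    return diffs[t] >= second_largest and diffs[t] > ratio * second_largest

# Hypothetical frame differences, for illustration only.
clean_cut = [0.1, 0.1, 0.1, 0.9, 0.1, 0.1, 0.1]
noisy = [0.1, 0.1, 0.1, 0.9, 0.4, 0.1, 0.1]
```

In a streaming implementation, `collections.deque(maxlen=2 * r + 1)` would be a natural way to hold the window.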

If the frame difference of the two adjacent video frames is less than the first threshold TH and greater than the second threshold TL, the current video frame is a gradual-change start frame, and the first key frame detection searches for the gradual-change end frame as follows. The frame difference dist_{t, k} between the current video frame and the k-th video frame after it is calculated, where k = t+1, t+2, .... The frame difference dist_{t, k} is compared with the first threshold; if dist_{t, k} is greater than the first threshold, the k-th frame is a candidate end frame. The frame differences dist_{k+j+1, k+j} of the a video frames following the k-th frame are then calculated, where j = 0, 1, 2, ..., a-1, and compared with the second threshold; if the frame differences dist_{k+j+1, k+j} are less than the second threshold, the k-th frame is the gradual-change end frame.
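The search for the gradual-change end frame can be sketched in Python as follows. The function name `find_gradual_end`, the callable `dist`, and the default a = 5 are assumptions of this sketch; the sketch also assumes that the search moves on to the next k when a candidate end frame fails the stability check, which the text does not state explicitly.

```python
def find_gradual_end(dist, start, n_frames, th, tl, a=5):
    """Return the index of the gradual-change end frame, or None.

    dist(x, y): frame difference between frames x and y (hypothetical callable).
    start: index of the gradual-change start frame t.
    """
    for k in range(start + 1, n_frames - a):
        if dist(start, k) > th:                       # candidate end frame
            # The next `a` adjacent frame differences must all be small.
            if all(dist(k + j + 1, k + j) < tl for j in range(a)):
                return k
    return None

# Hypothetical scalar "frames" whose content ramps up and then settles.
values = [0.0, 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6]
end = find_gradual_end(lambda x, y: abs(values[x] - values[y]),
                       start=1, n_frames=len(values), th=0.45, tl=0.05)
```

In this toy sequence frame 6 already differs enough from the start frame but is still changing, so the first stable candidate is frame 7.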

The first detection of step S103, also known as the dual-threshold method, adapts to different kinds of video content change through the two threshold parameters TH and TL. It detects abrupt-change key frames well and can also detect gradual-change key frames, in particular fade-in/fade-out key frames.

Step S104: perform a second key frame detection on the current video frame according to the preliminary detection result of the current video frame to obtain the final detection result for the current video frame.

Specifically, first, the preliminary detection result of the current video frame is obtained.

If the preliminary detection result of the current video frame is an abrupt-change key frame, the second key frame detection proceeds as follows. The block histogram difference and the pixel-difference histogram variance are calculated for the current video frame. The first change threshold is determined from the block histogram difference, and the second change threshold is determined from the pixel-difference histogram variance. The pixel-difference histogram variance is compared with the second change threshold, and the block histogram difference is compared with the first change threshold. If the pixel-difference histogram variance is greater than the second change threshold or the block histogram difference is less than the first change threshold, the current video frame is not an abrupt-change key frame. If the pixel-difference histogram variance is less than or equal to the second change threshold and the block histogram difference is greater than or equal to the first change threshold, the current video frame is an abrupt-change key frame.

More specifically, the block histogram difference (BHDM) D (t, t - 1) is calculated as follows:

D (t, t - 1) = \frac{\left[ \sum_{k = 1}^{m} DB (t, t - 1, k) \right] - \max_{k} DB (t, t - 1, k)}{m - 1},

DB (t, t - 1, k) = \frac{\sum_{j = 1}^{n} | H_{t, k} (j) - H_{t - 1, k} (j) |}{n},

where H_{t, k} (j) is the value of the normalized histogram of the k-th block of the current video frame t at gray level j. It is obtained by dividing the current video frame t into non-uniform blocks as in step S101, converting it to grayscale, counting the pixels in the k-th block whose gray value is j, and dividing this count by the total number of pixels in the k-th block. n denotes the number of gray levels and m denotes the number of blocks of the video frame. H_{t - 1, k} (j) of the adjacent video frame t-1 of the current video frame t is calculated in the same way.

The pixel-difference histogram variance (VDHM) V (t, t - 1) is calculated as follows:

V (t, t - 1) = \frac{\sum_{j = 1}^{n} \left( DH_{t, t - 1} (j) - \overline{DH} \right)^{2}}{n}

where DH_{t, t - 1} (j) is the value of the normalized pixel-difference histogram of the two adjacent video frames t and t-1 at difference level j. It is obtained by converting the two adjacent video frames t and t-1 to grayscale, subtracting the two grayscale images pixel by pixel at corresponding positions and taking the absolute value to form a new grayscale image, counting the pixels of this new image whose gray value is j, and dividing this count by the total number of pixels of the grayscale image. \overline{DH} is the mean of DH_{t, t - 1} (j), and n denotes the number of gray levels.

In the calculation, j takes integer values in the interval [0, 255], and n = 256.

It follows that 0 ≤ D (t, t - 1) ≤ 1, and the block histogram difference (BHDM) reflects the color difference between the images of the video frames. Likewise 0 ≤ V (t, t - 1) ≤ 1, and the pixel-difference histogram variance (VDHM) reflects the spatial difference between the images of the video frames. If the current video frame t is not an abrupt-change key frame, its pixel-difference histogram variance V (t, t - 1) is relatively large and its block histogram difference D (t, t - 1) is relatively small. If the current video frame t is an abrupt-change key frame, its pixel-difference histogram variance V (t, t - 1) is relatively small and its block histogram difference D (t, t - 1) is relatively large.
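The two measures can be sketched in Python as follows. The function names `normalized_hist`, `bhdm` and `vdhm` are hypothetical, and the inputs are assumed to be flat lists of gray values that have already been converted from color and, for `bhdm`, split into blocks.

```python
def normalized_hist(values, levels=256):
    """Normalized histogram of a flat list of integer gray values."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    return [h / len(values) for h in hist]

def bhdm(blocks_cur, blocks_prev, levels=256):
    """Block histogram difference D(t, t-1); blocks_* hold one flat list per block."""
    db = []
    for cur, prev in zip(blocks_cur, blocks_prev):
        h_cur, h_prev = normalized_hist(cur, levels), normalized_hist(prev, levels)
        db.append(sum(abs(a - b) for a, b in zip(h_cur, h_prev)) / levels)
    return (sum(db) - max(db)) / (len(db) - 1)     # drop the largest block difference

def vdhm(gray_cur, gray_prev, levels=256):
    """Pixel-difference histogram variance V(t, t-1) of two flat gray images."""
    diff = [abs(a - b) for a, b in zip(gray_cur, gray_prev)]
    dh = normalized_hist(diff, levels)
    mean = sum(dh) / levels
    return sum((x - mean) ** 2 for x in dh) / levels

# Hypothetical tiny inputs with few gray levels, for illustration only.
d = bhdm([[0, 0], [0, 0], [0, 1]], [[1, 1], [1, 1], [0, 1]], levels=2)
v = vdhm([1, 2, 3], [1, 2, 3], levels=4)
```

For identical frames the difference histogram is concentrated in bin 0, which gives a comparatively large variance, in line with the observation that non-key frames have a large V (t, t - 1).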

The first change threshold T_{b}, corresponding to the block histogram difference (BHDM), and the second change threshold T_{v}, corresponding to the pixel-difference histogram variance (VDHM), are determined by an adaptive threshold method based on a sliding window, where

T_{v} = \frac{1}{3} \cdot \frac{1}{S} \sum_{i = 1}^{S - 1} V (i, i - 1),

T_{b} = 3 \cdot \frac{1}{S} \sum_{i = 1}^{S - 1} D (i, i - 1) .
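The re-check of an abrupt-change candidate can be sketched in Python as follows. The function name `confirm_abrupt_change` and the history lists are hypothetical. The sketch assumes that V (t, t - 1) is compared with T_{v} and D (t, t - 1) with T_{b}, which is the pairing implied by how the two thresholds are derived.

```python
def confirm_abrupt_change(d, v, d_history, v_history):
    """Second detection of an abrupt-change candidate.

    d, v: BHDM and VDHM of the current frame pair.
    d_history, v_history: the S-1 earlier values inside the sliding window.
    """
    s = len(d_history) + 1
    t_b = 3 * sum(d_history) / s          # threshold for the block histogram difference
    t_v = sum(v_history) / (3 * s)        # threshold for the difference-histogram variance
    # Key frame: small spatial-difference variance and large color difference.
    return v <= t_v and d >= t_b

# Hypothetical window statistics, for illustration only.
d_hist, v_hist = [0.01, 0.01, 0.01], [0.9, 0.9, 0.9]
```

A candidate with a large block histogram difference and a small variance passes; failing either condition marks the preliminary result as a false detection.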

If the preliminary detection result of the current video frame is a gradual-change key frame, the second key frame detections described in (a) and (b) below are carried out respectively.

(a) According to the preliminary detection result of step S103, the R, G, and B values of pixels are sampled in the video frame sequence from the current video frame, that is, the gradual-change start frame, to the gradual-change end frame. Fig. 4 shows the distribution of the pixel sampling points: the black dots are the sampling positions, each of which is the midpoint of a line segment. It is then judged whether the sampled sequence contains an all-black sample. If it does, the current video frame and the gradual-change end frame are fade-in/fade-out key frames; if it does not, the preliminary detection result is a false detection.

In a typical fade-in/fade-out shot change there are always one or a few completely black frames, so detecting black video frames determines whether the shot change is a fade-in/fade-out.
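The black-frame check can be sketched in Python as follows. The function name `has_black_frame`, the tolerance `tol`, and the sampling positions are hypothetical, since the layout of Fig. 4 is not reproduced here. The sketch assumes that a frame counts as black when every sampled pixel is black.

```python
def has_black_frame(frames, points, tol=0):
    """True if some frame is black at every sampling point.

    frames: iterable of frames, each a 2-D list of (R, G, B) tuples.
    points: (row, col) sampling positions (hypothetical; Fig. 4 defines the real ones).
    tol: largest channel value still treated as black (an assumption).
    """
    for frame in frames:
        if all(max(frame[r][c]) <= tol for r, c in points):
            return True
    return False

# Hypothetical 2 x 2 frames, for illustration only.
black = [[(0, 0, 0), (0, 0, 0)], [(0, 0, 0), (0, 0, 0)]]
bright = [[(255, 255, 255), (200, 10, 10)], [(0, 0, 0), (90, 90, 90)]]
sample_points = [(0, 0), (1, 1)]
```

Sampling a handful of points instead of scanning every pixel keeps the re-check cheap.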

(b) According to the preliminary detection result of step S103, it is judged whether the length of the video frame sequence between two preliminarily detected key frames S_{a} and S_{b} is greater than 30 frames. If it is, the next calculation and judgment are carried out; if it is not, no dissolve key frame exists between S_{a} and S_{b}. The mean λ of the frame differences of adjacent video frames between S_{a} and S_{b} is computed. It is then judged whether the video frame sequence S_{a}, S_{a+1}, S_{a+2}, ..., S_{b-1}, S_{b} contains a frame S_{k} such that the frame differences of adjacent video frames in the α frames after it are all greater than the mean λ and less than the second threshold. If so, S_{k} is a candidate start frame of a dissolve, detection has now reached frame S_{k+α+1}, and the next calculation and judgment are carried out; otherwise the search for a candidate start frame of a dissolve continues, where α = 5 to 8. It is then judged whether S_{k+α+1}, ..., S_{b-1}, S_{b} contain a frame S_{r} such that the frame differences of adjacent video frames in the ω frames after it are all less than the mean λ. If so, S_{r} is the end frame of the dissolve; k is set to r+ω and detection returns to (b) to continue, ending when k > b, where ω = 5 to 8.
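The dissolve search in (b) can be sketched in Python as follows. The function name `find_dissolves`, the indexing convention for `diffs`, and the decision to stop when no end frame is found are assumptions of this sketch, not details given in the text.

```python
def find_dissolves(diffs, a, b, tl, alpha=5, omega=5, min_len=30):
    """Return (start, end) frame index pairs of dissolves between frames a and b.

    diffs[i] is the frame difference between frame i and frame i-1.
    tl is the second threshold TL.
    """
    if b - a <= min_len:                      # sequence too short for a dissolve
        return []
    segment = diffs[a + 1:b + 1]
    lam = sum(segment) / len(segment)         # mean frame difference (lambda)
    found = []
    k = a
    while k <= b:
        following = diffs[k + 1:k + 1 + alpha]
        is_start = k + alpha <= b and all(lam < d < tl for d in following)
        if not is_start:
            k += 1
            continue
        end = None
        for r in range(k + alpha + 1, b - omega + 1):
            if all(d < lam for d in diffs[r + 1:r + 1 + omega]):
                end = r                       # differences settle below the mean
                break
        if end is None:
            break
        found.append((k, end))
        k = end + omega                       # resume the search after the dissolve
    return found

# Hypothetical frame differences: quiet, a slow transition, then quiet again.
diffs = [0.0] + [0.01] * 10 + [0.1] * 10 + [0.01] * 20
dissolves = find_dissolves(diffs, a=0, b=40, tl=0.2)
```

In this toy sequence the moderately elevated differences of frames 11 to 20 are reported as one dissolve starting at frame 10 and ending at frame 20.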

The key frame detection method according to the embodiments of the present invention combines non-uniform blocking, mutual information, an improved dual-threshold preliminary detection, and a re-check, and has the following advantages: (1) fast computation and high processing efficiency: the method can process video of any resolution by sampling, so real-time key frame detection can be achieved; (2) high recall and precision: by combining the strengths of non-uniform blocking, mutual information, the improved dual-threshold preliminary detection, and the re-check, the method locates key frames accurately; (3) good extensibility and portability, and ease of use: the method can be combined with other key frame detection methods and other applications, giving it good extensibility and broad application potential.

Fig. 5 is the structural representation of the pick-up unit of the key frame of the present invention's another aspect embodiment.As shown in Figure 5, according to the pick-up unit of the key frame of the embodiment of the present invention, comprise distribution probability statistical module 100, frame difference computing module 200, first detection module 300 and the second detection module 400.

Wherein, distribution probability statistical module 100 for carrying out non-homogeneous piecemeal to the current video frame of input and the adjacent video frames of current video frame, and adds up the histogram distribution probability of two adjacent frame of video on each piecemeal and each color component and joint histogram distribution probability.

As shown in Figure 6, distribution probability statistical module 100 comprises non-homogeneous blocking unit 110, histogram distribution probability statistics unit 120 and joint histogram distribution probability statistic unit 130.Wherein, non-homogeneous blocking unit 110 for carrying out piecemeal to long and wide to two adjacent frame of video respectively according to the ratio of 1:3:1, obtain 9 non-homogeneous piecemeal m (m=1,2 ..., 9), and according to the position imparting weights W of non-homogeneous piecemeal.Histogram distribution probability calculation unit 120 is connected with non-homogeneous blocking unit 110, be the number of pixels of i for adding up two adjacent frame of video values on m piecemeal, R color component, and by the sum of all pixels order of number of pixels divided by m piecemeal, obtain histogram distribution probability with and the histogram distribution probability obtained successively on G, B color component with with wherein t represents current video frame, and t-1 represents the adjacent video frames of current video frame.Joint histogram distribution probability computing unit 130 is connected with histogram distribution probability calculation unit 120, the pixel of i and j is respectively to number for adding up two adjacent frame of video values on corresponding m piecemeal, R color component, and by pixel to the sum of all pixels order of number divided by m piecemeal, obtain joint histogram distribution probability and the described joint histogram distribution probability obtained successively on G, B color component

Frame difference computing module 200 is connected with distribution probability computing module 100, calculate the divided group transinformation content between two adjacent frame of video for the histogram distribution probability of the frame of video adjacent according to two on each piecemeal and each color component and joint histogram distribution probability, and it is poor to calculate the frame of two adjacent frame of video according to divided group transinformation content.

As shown in Figure 7, frame difference computing module 200 comprises single color transinformation content computing unit 210, total color transinformation content computing unit 220, divided group transinformation content computing unit 230 and frame difference computing unit 240.Wherein, single color transinformation content computing unit 210 is for the histogram distribution probability according to two adjacent frame of video and joint histogram distribution probability calculate two the adjacent transinformation contents of frame of video on m piecemeal, R color component and the frame of video that calculating two is adjacent is successively at m piecemeal, the transinformation content on G, B color component total color transinformation content computing unit 220 is connected with single color transinformation content computing unit 210, for according to two adjacent frame of video at m piecemeal, the transinformation content on R, G, B color component and calculate total color transinformation content divided group transinformation content computing unit 230 is connected with total color transinformation content computing unit 220, for according to weight matrix W and total color transinformation content calculate divided group transinformation content I _{t, t-1}.Frame difference computing unit 240 is connected with divided group transinformation content computing unit 230, for according to divided group transinformation content I _{t, t-1}, calculate the frame difference dist of adjacent two frame of video _{t, t-1}.

Detailed computation process can with reference to the computation process of step S102 in the detection method of the key frame of the embodiment of the present invention.

First time, detection module 300 and frame difference computing module 200 was connected, for carrying out first time key frame detection with the initial survey result obtaining current video frame according to the frame difference of two adjacent frame of video to current video frame.

As shown in Figure 8, detection module 300 comprises comparing unit 310, first judging unit 320, first sudden change key frame detecting unit 330 and the first gradual change key frame detecting unit 340 for the first time.Wherein, comparing unit 310 is for comparing the frame of two adjacent frame of video difference with first threshold and Second Threshold.First judging unit 320 is connected with comparing unit 310, for the comparative result according to comparing unit, judge the first detection mode of current video frame, if the frame difference of two adjacent frame of video is greater than described first threshold, then current video frame may be sudden change key frame, enter the first sudden change key frame detecting unit 330, if the frame difference of two adjacent frame of video is less than first threshold and is greater than Second Threshold, then current video frame is gradual change start frame, enters the first gradual change key frame detecting unit 340 and carries out the detection of gradual change end frame.First sudden change key frame detecting unit 330 is connected with the first judging unit 310, for judging whether current video frame is sudden change key frame.First gradual change key frame detecting unit 340 is connected with the first judging unit 310, for continuing the gradual change end frame detecting current video frame.

For the detailed detection process of the first abrupt-change key frame detecting unit 330 and the first gradual-change key frame detecting unit 340, refer to step S103 of the key frame detection method according to the embodiment of one aspect of the present invention.

In one embodiment of the invention, the first detection module 300 further comprises a threshold setting unit 350. The threshold setting unit 350 is connected to the first abrupt-change key frame detecting unit 330 and to the first gradual-change key frame detecting unit 340, and sets the first threshold and the second threshold. For the specific setting procedure, refer to the setting steps included in step S103 of the key frame detection method according to the embodiment of one aspect of the present invention.

The second detection module 400 is connected to the first detection module 300 and performs a second key frame detection on the current video frame, according to the preliminary detection result of the current video frame, to obtain the final detection result of the current video frame.

As shown in Figure 9, the second detection module 400 comprises an acquiring unit 410, a second judging unit 420, a second abrupt-change key frame detecting unit 430, a second fade-in/fade-out gradual-change key frame detecting unit 440 and a second dissolve-type gradual-change key frame detecting unit 450. The acquiring unit 410 is connected to the first detection module 300 and obtains the preliminary detection result of the current video frame. The second judging unit 420 is connected to the acquiring unit 410 and determines the second detection mode of the current video frame from the preliminary detection result obtained by the acquiring unit 410: if the preliminary detection result is an abrupt-change key frame, the frame is passed to the second abrupt-change key frame detecting unit 430; if the preliminary detection result is a gradual-change key frame, the frame is passed to the second fade-in/fade-out gradual-change key frame detecting unit 440 and to the second dissolve-type gradual-change key frame detecting unit 450. The second abrupt-change key frame detecting unit 430 is connected to the second judging unit 420 and re-checks whether the current video frame is an abrupt-change key frame. The second fade-in/fade-out gradual-change key frame detecting unit 440 is connected to the second judging unit 420 and re-checks whether the current video frame is a fade-in/fade-out key frame. The second dissolve-type gradual-change key frame detecting unit 450 is connected to the second judging unit 420 and re-checks whether the current video frame is a dissolve-type gradual-change key frame.

For the specific detection process of the second abrupt-change key frame detecting unit 430, the second fade-in/fade-out gradual-change key frame detecting unit 440 and the second dissolve-type gradual-change key frame detecting unit 450, refer to step S104 of the key frame detection method according to the embodiment of one aspect of the present invention.

The key frame detection method and device according to the embodiments of the present invention have at least the following beneficial effects:

(1) Fast computation and high processing efficiency: the method according to the embodiment of the present invention can apply sampling to video of arbitrary resolution and achieve real-time key frame detection, and the device of the embodiment of the present invention has high processing efficiency.

(2) High recall and precision: the method of the embodiment of the present invention combines the advantages of non-uniform blocking, mutual information, an improved dual-threshold method, shot re-checking and other algorithms, so it can locate key frames accurately and achieves high recall and precision.

(3) Good extensibility and portability, and ease of use: the method of the embodiment of the present invention can be combined with other key frame detection methods and with other applications, and therefore has good extensibility and a wide range of applications. The device of the embodiment of the present invention likewise has good extensibility and a wide range of applications.

Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions and variations can be made to these embodiments without departing from the principle and spirit of the present invention; the scope of the present invention is defined by the claims and their equivalents.

Claims

1. A key frame detection method, characterized by comprising the following steps:

S1: performing non-uniform blocking on the input current video frame and on the video frame adjacent to the current video frame, and computing the histogram distribution probabilities and the joint histogram distribution probability of the two adjacent video frames on each block and each color component;

S2: calculating the block mutual information between the two adjacent video frames from the histogram distribution probabilities and the joint histogram distribution probability of the two adjacent video frames on each block and each color component, and calculating the frame difference of the two adjacent video frames from the block mutual information;

S3: performing a first key frame detection on the current video frame according to the frame difference of the two adjacent video frames to obtain a preliminary detection result for the current video frame, this step further comprising:

S31: comparing the frame difference of the two adjacent video frames with a predetermined first threshold and a predetermined second threshold;

S32: if the frame difference of the two adjacent video frames is greater than the first threshold, the current video frame may be an abrupt-change key frame, and step S33 is performed,

if the frame difference of the two adjacent video frames is less than the first threshold and greater than the second threshold, the current video frame is a gradual-change start frame, and step S34 is performed to detect the gradual-change end frame;

S33: storing in a circular array the frame differences of adjacent video frames centered on the current video frame and extending r frames to the left and to the right, where r = 3 to 5,

comparing the frame difference of the two adjacent video frames with all 2r+1 frame differences in the circular array; if the frame difference of the two adjacent video frames is the maximum, proceeding to the next judgment, otherwise the current video frame is not an abrupt-change key frame,

comparing the frame difference of the two adjacent video frames with the second-largest of the 2r+1 frame differences in the circular array; if the frame difference of the two adjacent video frames is 3 times larger than the second-largest frame difference, the current video frame is an abrupt-change key frame, otherwise the current video frame is not an abrupt-change key frame;

S34: calculating the frame difference dist_{t, k} between the current video frame and the k-th video frame after it, where k = t+1, t+2, ...,

comparing the frame difference dist_{t, k} with the first threshold; if dist_{t, k} is greater than the first threshold, the k-th video frame is a candidate end frame,

calculating the frame differences dist_{k+j+1, k+j} of the a video frames following the k-th video frame, where j = 0, 1, 2, ..., a-1;

comparing the frame differences dist_{k+j+1, k+j} with the second threshold; if the frame differences dist_{k+j+1, k+j} are less than the second threshold, the k-th video frame is the gradual-change end frame;

S4: performing a second key frame detection on the current video frame according to the preliminary detection result of the current video frame to obtain the final detection result of the current video frame.
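The two-stage test of step S33 (maximum within the window, then comparison with the second-largest value) can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, the window is passed as a plain sequence rather than a circular array, and "3 times larger" is read as "greater than 3 times the second-largest value".

```python
from typing import Sequence


def is_abrupt_key_frame(window: Sequence[float], ratio: float = 3.0) -> bool:
    """Apply the S33 check to a window of 2r+1 frame differences.

    The centre entry is the frame difference of the current frame; the r
    entries on each side belong to its neighbours (hypothetical layout).
    """
    if len(window) < 3 or len(window) % 2 == 0:
        raise ValueError("window must hold an odd number (2r+1) of frame differences")
    mid = len(window) // 2
    centre = window[mid]
    # Stage 1: the current frame difference must be the maximum of the window.
    if centre < max(window):
        return False
    # Stage 2: it must also dominate the second-largest value by the given ratio.
    others = list(window[:mid]) + list(window[mid + 1:])
    second_largest = max(others)
    return centre > ratio * second_largest
```

In a streaming setting the window could be kept in a `collections.deque` with `maxlen=2 * r + 1`, which behaves like the circular array named in the claim.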

2. The key frame detection method according to claim 1, characterized in that step S1 further comprises:

S11: dividing the length and the width of each of the two adjacent video frames in the ratio 1:3:1 to obtain 9 non-uniform blocks m (m = 1, 2, ..., 9), and assigning a weight to each non-uniform block according to its position, where the weight matrix W is as follows:

W = [\begin{matrix} w_{1} & w_{2} & w_{3} \\ w_{4} & w_{5} & w_{6} \\ w_{7} & w_{8} & w_{9} \end{matrix}] = [\begin{matrix} 1 & 1 & 1 \\ 2 & 4 & 2 \\ 1 & 1 & 1 \end{matrix}];

S12: counting, for each of the two adjacent video frames, the number of pixels on the m-th block whose R color component has the value i, and dividing this number by the total number of pixels of the m-th block to obtain the histogram distribution probabilities p_{t}^{m} (R_{i}) and p_{t - 1}^{m} (R_{i}), and obtaining in turn the histogram distribution probabilities on the G and B color components, p_{t}^{m} (G_{i}) and p_{t - 1}^{m} (G_{i}), and p_{t}^{m} (B_{i}) and p_{t - 1}^{m} (B_{i}), where t denotes the current video frame and t-1 denotes the video frame adjacent to the current video frame;

S13: counting the number of pixel pairs on the corresponding m-th block of the two adjacent video frames whose R color component values are i and j respectively, and dividing this number by the total number of pixels of the m-th block to obtain the joint histogram distribution probability p_{t, t - 1}^{m} (R_{i}, R_{j}), and obtaining in turn the joint histogram distribution probabilities on the G and B color components

p_{t, t - 1}^{m} (G_{i}, G_{j}), p_{t, t - 1}^{m} (B_{i}, B_{j}) .
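The counting described in steps S11 to S13 can be sketched in plain Python. All function names are hypothetical, one block of one color component is represented as a flat sequence of integer values, and the rounding of the 1:3:1 split (integer division by 5) is an assumption, since the claim does not say how to handle dimensions that are not multiples of 5.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def block_bounds(length: int) -> List[Tuple[int, int]]:
    """Split the range [0, length) in the ratio 1:3:1 (rounding is an assumption)."""
    edge = length // 5
    return [(0, edge), (edge, length - edge), (length - edge, length)]


def histogram_probs(values: Sequence[int]) -> Dict[int, float]:
    """Histogram distribution probability of one color component on one block."""
    if not values:
        raise ValueError("block must contain at least one pixel")
    total = len(values)
    return {value: count / total for value, count in Counter(values).items()}


def joint_histogram_probs(cur: Sequence[int], prev: Sequence[int]) -> Dict[Tuple[int, int], float]:
    """Joint histogram distribution probability over co-located pixel pairs."""
    if not cur or len(cur) != len(prev):
        raise ValueError("blocks must be non-empty and of equal size")
    total = len(cur)
    return {pair: count / total for pair, count in Counter(zip(cur, prev)).items()}
```

Applying `block_bounds` to the width and to the height of a frame gives the 3 x 3 grid of blocks; values that never occur are simply absent from the returned dictionaries, which is equivalent to a probability of zero.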

3. The key frame detection method according to claim 2, characterized in that step S2 further comprises:

S21: calculating, from the histogram distribution probabilities and the joint histogram distribution probability of the two adjacent video frames, the mutual information I_{t, t - 1}^{m} (R) of the two adjacent video frames on the m-th block and the R color component by the following formula

I_{t, t - 1}^{m} (R) = - Σ_{i = 0}^{N - 1} Σ_{j = 0}^{N - 1} p_{t, t - 1}^{m} (R_{i}, R_{j}) * \log_{2} \frac{p_{t, t - 1}^{m} (R_{i}, R_{j})}{p_{t}^{m} (R_{i}) * p_{t - 1}^{m} (R_{j})},

and calculating in turn the mutual information of the two adjacent video frames on the m-th block and the G and B color components

I_{t, t - 1}^{m} (G), I_{t, t - 1}^{m} (B);

S22: calculating, from the mutual information I_{t, t - 1}^{m} (R), I_{t, t - 1}^{m} (G) and I_{t, t - 1}^{m} (B) of the two adjacent video frames on the m-th block and the R, G, B color components, the total-color mutual information I_{t, t - 1}^{m} (R, G, B) by the following formula

I_{t, t - 1}^{m} (R, G, B) = \frac{1}{3} (I_{t, t - 1}^{m} (R) + I_{t, t - 1}^{m} (G) + I_{t, t - 1}^{m} (B));

S23: calculating the block mutual information I_{t, t-1} from the weight matrix W and the total-color mutual information I_{t, t - 1}^{m} (R, G, B) by the following formula,

I_{t, t - 1} = Σ_{m = 1}^{9} (w_{m} * I_{t, t - 1}^{m} (R, G, B)) / Σ_{m = 1}^{9} w_{m};

S24: calculating the frame difference dist_{t, t-1} of the two adjacent video frames from the block mutual information I_{t, t-1} by the following formula,

dist_{t, t - 1} = 1 - I_{t, t - 1}.
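The chain from per-channel mutual information (rendered elsewhere in this translation as "transinformation content") to the frame difference can be sketched as below. The function names and the dictionary-based probability inputs are hypothetical. One assumption needs stating clearly: the sketch uses the standard, non-negative definition of mutual information, whereas the formula in step S21 is printed with a leading minus sign; the standard form is chosen here so that identical blocks give high mutual information and therefore a small frame difference.

```python
import math
from typing import Dict, Sequence, Tuple

# Weight matrix W from claim 2, flattened row by row (w_1 ... w_9).
WEIGHTS = (1, 1, 1, 2, 4, 2, 1, 1, 1)


def mutual_information(
    p_cur: Dict[int, float],
    p_prev: Dict[int, float],
    p_joint: Dict[Tuple[int, int], float],
) -> float:
    """Mutual information in bits for one block and one color component.

    Assumption: standard sign convention (no leading minus sign).
    """
    total = 0.0
    for (i, j), p_ij in p_joint.items():
        if p_ij > 0.0:  # zero-probability pairs contribute nothing
            total += p_ij * math.log2(p_ij / (p_cur[i] * p_prev[j]))
    return total


def frame_difference(block_mi_rgb: Sequence[Tuple[float, float, float]]) -> float:
    """Combine per-block (R, G, B) mutual information into dist = 1 - I."""
    if len(block_mi_rgb) != len(WEIGHTS):
        raise ValueError("expected mutual information for exactly 9 blocks")
    # S22: average the three color components of each block.
    total_color = [(r + g + b) / 3.0 for r, g, b in block_mi_rgb]
    # S23: weighted average over the 9 blocks.
    weighted = sum(w * mi for w, mi in zip(WEIGHTS, total_color)) / sum(WEIGHTS)
    # S24: frame difference.
    return 1.0 - weighted
```

Because mutual information in bits can exceed 1 for richly textured blocks, the value returned by `frame_difference` is not bounded below by zero in this sketch; the claims do not state a normalisation.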

4. The key frame detection method according to claim 1, characterized in that step S31 further comprises:

S311: calculating the frame differences dist_{i, i-1} (i = 1, ..., S-1) of the video sequence of length S that lies between the previous key frame of the current video frame and the video frame adjacent to the current video frame,

S312: calculating the mean frame difference μ of the video sequence of length S,

S313: calculating the first threshold and the second threshold from the mean frame difference μ, where the first threshold equals 5 times the mean frame difference μ and the second threshold equals 3 times the mean frame difference μ; and

S314: comparing the frame difference of the two adjacent video frames with the first threshold and the second threshold.
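The adaptive thresholds of steps S312 and S313 amount to a mean and two multiplications, as the following sketch shows. The function name is hypothetical, and the frame differences since the previous key frame are assumed to be available as a list.

```python
from typing import Sequence, Tuple


def adaptive_thresholds(frame_diffs: Sequence[float]) -> Tuple[float, float]:
    """Return (first_threshold, second_threshold) from the mean frame difference.

    frame_diffs holds the frame differences of the sequence since the
    previous key frame (hypothetical input layout).
    """
    if not frame_diffs:
        raise ValueError("at least one frame difference is required")
    mu = sum(frame_diffs) / len(frame_diffs)
    return 5.0 * mu, 3.0 * mu
```

For example, frame differences of 0.1, 0.2 and 0.3 have a mean of 0.2, giving a first threshold of about 1.0 and a second threshold of about 0.6.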

5. The key frame detection method according to claim 1, characterized in that step S4 further comprises:

S41: obtaining the preliminary detection result of the current video frame;

S42: if the preliminary detection result of the current video frame is an abrupt-change key frame, performing step S43,

if the preliminary detection result of the current video frame is a gradual-change key frame, performing step S44 and step S45;

S43: calculating a block histogram difference and a pixel-difference histogram variance from the current video frame,

determining a first change threshold from the block histogram difference, and determining a second change threshold from the pixel-difference histogram variance,

comparing the pixel-difference histogram variance with the first change threshold, and comparing the block histogram difference with the second change threshold,

if the pixel-difference histogram variance is greater than the first change threshold or the block histogram difference is less than the second change threshold, the current video frame is not an abrupt-change key frame,

if the pixel-difference histogram variance is less than or equal to the first change threshold and the block histogram difference is greater than or equal to the second change threshold, the current video frame is an abrupt-change key frame;

S44: according to the preliminary detection result of step S34, sampling the pixel R, G, B values of the video frame sequence from the current video frame, that is, the gradual-change start frame, to the gradual-change end frame,

determining whether the sequence of sample points contains a completely black sample point; if such a completely black sample point exists, the current video frame and the gradual-change end frame are fade-in/fade-out gradual-change key frames; if no completely black sample point exists, the preliminary detection result is a false detection,

S45: determining whether the length of the video frame sequence between two preliminarily detected key frames S_{a} and S_{b} is greater than 30 frames; if the length is greater than 30 frames, proceeding to the next calculation and judgment; if the length is not greater than 30 frames, no dissolve-type gradual-change key frame exists between the two preliminarily detected key frames S_{a} and S_{b},

computing the mean λ of the frame differences of adjacent video frames between the two preliminarily detected key frames S_{a} and S_{b},

determining whether the video frame sequence S_{a}, S_{a+1}, S_{a+2}, ..., S_{b-1}, S_{b} between the two preliminarily detected key frames S_{a} and S_{b} contains a frame S_{k} such that the frame differences of adjacent video frames over the a frames following it are all greater than the mean λ and less than the second threshold; if so, S_{k} is a candidate start frame of a dissolve-type gradual change, the frame sequence has now been examined up to S_{k+α+1}, and the next calculation and judgment proceed; otherwise, continuing to search for a candidate start frame of a dissolve-type gradual change, where a = 5 to 8,

determining whether S_{k+α+1}, ..., S_{b-1}, S_{b} contains a frame S_{r} such that the frame differences of adjacent video frames over the ω frames following it are all less than the mean λ; if so, S_{r} is the end frame of the dissolve-type gradual change, k is set to r+ω and detection continues from step S44, and detection ends when k>b, where ω = 5 to 8.

6. A key frame detection device, characterized by comprising:

a distribution probability statistics module, configured to perform non-uniform blocking on the input current video frame and on the video frame adjacent to the current video frame, and to compute the histogram distribution probabilities and the joint histogram distribution probability of the two adjacent video frames on each block and each color component;

a frame difference calculation module, configured to calculate the block mutual information between the two adjacent video frames from the histogram distribution probabilities and the joint histogram distribution probability of the two adjacent video frames on each block and each color component, and to calculate the frame difference of the two adjacent video frames from the block mutual information;

a first detection module, configured to perform a first key frame detection on the current video frame according to the frame difference of the two adjacent video frames to obtain a preliminary detection result for the current video frame, the first detection module further comprising:

a comparing unit, configured to compare the frame difference of the two adjacent video frames with a predetermined first threshold and a predetermined second threshold;

a first judging unit, connected to the comparing unit and configured to determine the first detection mode of the current video frame from the comparison result of the comparing unit,

a first abrupt-change key frame detecting unit, configured, according to the judgment of the first judging unit, when the frame difference of the two adjacent video frames is greater than the first threshold and the current video frame may therefore be an abrupt-change key frame, to store in a circular array the frame differences of adjacent video frames centered on the current video frame and extending r frames to the left and to the right, where r = 3 to 5,

a first gradual-change key frame detecting unit, configured, according to the judgment of the first judging unit, when the frame difference of the two adjacent video frames is less than the first threshold and greater than the second threshold and the current video frame is therefore a gradual-change start frame, to calculate the frame difference dist_{t, k} between the current video frame and the k-th video frame after it, where k = t+1, t+2, ...,

a second detection module, configured to perform a second key frame detection on the current video frame according to the preliminary detection result of the current video frame to obtain the final detection result of the current video frame.

7. The key frame detection device according to claim 6, characterized in that the distribution probability statistics module further comprises:

a non-uniform blocking unit, configured to divide the length and the width of each of the two adjacent video frames in the ratio 1:3:1 to obtain 9 non-uniform blocks m (m = 1, 2, ..., 9), and to assign a weight to each non-uniform block according to its position, where the weight matrix W is defined as follows:

W = [\begin{matrix} w_{1} & w_{2} & w_{3} \\ w_{4} & w_{5} & w_{6} \\ w_{7} & w_{8} & w_{9} \end{matrix}] = [\begin{matrix} 1 & 1 & 1 \\ 2 & 4 & 2 \\ 1 & 1 & 1 \end{matrix}];

a histogram distribution probability statistics unit, configured to count, for each of the two adjacent video frames, the number of pixels on the m-th block whose R color component has the value i, and to divide this number by the total number of pixels of the m-th block to obtain the histogram distribution probabilities p_{t}^{m} (R_{i}) and p_{t - 1}^{m} (R_{i}), and to obtain in turn the histogram distribution probabilities on the G and B color components, p_{t}^{m} (G_{i}) and p_{t - 1}^{m} (G_{i}), and p_{t}^{m} (B_{i}) and p_{t - 1}^{m} (B_{i}), where t denotes the current video frame and t-1 denotes the video frame adjacent to the current video frame;

a joint histogram distribution probability statistics unit, configured to count the number of pixel pairs on the corresponding m-th block of the two adjacent video frames whose R color component values are i and j respectively, and to divide this number by the total number of pixels of the m-th block to obtain the joint histogram distribution probability p_{t, t - 1}^{m} (R_{i}, R_{j}), and to obtain in turn the joint histogram distribution probabilities on the G and B color components, p_{t, t - 1}^{m} (G_{i}, G_{j}) and p_{t, t - 1}^{m} (B_{i}, B_{j}).

8. The key frame detection device according to claim 7, characterized in that the frame difference calculation module further comprises:

a single-color mutual information calculation unit, configured to calculate, from the histogram distribution probabilities and the joint histogram distribution probability of the two adjacent video frames, the mutual information I_{t, t - 1}^{m} (R) of the two adjacent video frames on the m-th block and the R color component by the following formula

I_{t, t - 1}^{m} (R) = - Σ_{i = 0}^{N - 1} Σ_{j = 0}^{N - 1} p_{t, t - 1}^{m} (R_{i}, R_{j}) * \log_{2} \frac{p_{t, t - 1}^{m} (R_{i}, R_{j})}{p_{t}^{m} (R_{i}) * p_{t - 1}^{m} (R_{j})},

I_{t, t - 1}^{m} (G), I_{t, t - 1}^{m} (B);

a total-color mutual information calculation unit, configured to calculate, from the mutual information I_{t, t - 1}^{m} (R), I_{t, t - 1}^{m} (G) and I_{t, t - 1}^{m} (B) of the two adjacent video frames on the m-th block and the R, G, B color components, the total-color mutual information I_{t, t - 1}^{m} (R, G, B) by the following formula

I_{t, t - 1}^{m} (R, G, B) = \frac{1}{3} (I_{t, t - 1}^{m} (R) + I_{t, t - 1}^{m} (G) + I_{t, t - 1}^{m} (B));

a block mutual information calculation unit, configured to calculate the block mutual information I_{t, t-1} from the weight matrix W and the total-color mutual information I_{t, t - 1}^{m} (R, G, B) by the following formula,

I_{t, t - 1} = Σ_{m = 1}^{9} (w_{m} * I_{t, t - 1}^{m} (R, G, B)) / Σ_{m = 1}^{9} w_{m};

a frame difference calculation unit, configured to calculate the frame difference dist_{t, t-1} of the two adjacent video frames from the block mutual information I_{t, t-1} by the following formula,

dist_{t, t - 1} = 1 - I_{t, t - 1}.

9. The key frame detection device according to claim 6, characterized in that the first detection module further comprises:

a threshold setting unit, configured to calculate the frame differences dist_{i, i-1} (i = 1, ..., S-1) of the video sequence of length S that lies between the previous key frame of the current video frame and the video frame adjacent to the current video frame,

to calculate the mean frame difference μ of the video sequence of length S,

and to calculate the first threshold and the second threshold from the mean frame difference μ, where the first threshold equals 5 times the mean frame difference μ and the second threshold equals 3 times the mean frame difference μ.

10. The key frame detection device according to claim 6, characterized in that the second detection module further comprises:

an acquiring unit, configured to obtain the preliminary detection result of the current video frame;

a second judging unit, configured to determine the second detection mode of the current video frame from the preliminary detection result obtained by the acquiring unit,

if the preliminary detection result of the current video frame is an abrupt-change key frame, the frame is passed to the second abrupt-change key frame detecting unit for key frame re-checking,

if the preliminary detection result of the current video frame is a gradual-change key frame, the frame is passed to the second fade-in/fade-out gradual-change key frame detecting unit and to the second dissolve-type gradual-change key frame detecting unit respectively for key frame re-checking;

a second abrupt-change key frame detecting unit, configured to calculate a block histogram difference and a pixel-difference histogram variance from the current video frame,

a second fade-in/fade-out gradual-change key frame detecting unit, configured to sample the pixel R, G, B values of the video frame sequence from the current video frame, that is, the gradual-change start frame, to the gradual-change end frame,

a second dissolve-type gradual-change key frame detecting unit, configured to determine whether the length of the video frame sequence between two preliminarily detected key frames S_{a} and S_{b} is greater than 30 frames; if the length is greater than 30 frames, proceeding to the next calculation and judgment; if the length is not greater than 30 frames, no dissolve-type gradual-change key frame exists between the two preliminarily detected key frames S_{a} and S_{b},