CN101360184A

CN101360184A - System and method for extracting key frame of video

Info

Publication number: CN101360184A
Application number: CN 200810211435
Authority: CN
Inventors: 陈波
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2008-09-22
Filing date: 2008-09-22
Publication date: 2009-02-04
Anticipated expiration: 2028-09-22
Also published as: CN101360184B

Abstract

The invention provides a system and a method for extracting video key frames, relating to the field of video image technology. The system comprises a key-frame extraction unit that performs histogram and grayscale-image operations on the image data of two adjacent frames in the video data to obtain a feature vector composed of the operation results, compares the Euclidean distance of the feature vectors with a preset threshold, and, based on the comparison result, obtains a shot transition boundary and extracts key frames. The system and method can improve the performance of video key-frame extraction.

Description

System and method for extracting video key frames

Technical field

The present invention relates to the field of video image technology and, more particularly, to a system and method for extracting video key frames.

Background

With the rapid development of Internet and imaging display technology, a large amount of multimedia information is transmitted over networks for people to watch, and obtaining multimedia information from the network has become an indispensable part of daily life. Video-on-demand systems, online video sharing, and video programs keep improving the consumer experience and meet people's cultural needs. However, as multimedia applications multiply, the number of online videos is expanding sharply, and many sensitive or unhealthy videos are likely to be uploaded among them. An important problem for online video is therefore how to review or analyze huge volumes of video data effectively, so as to prevent sensitive or unhealthy content from being published on the network.

The upload volume of online video is very large, and so is the amount of information in image and video data, whose content generally has to be watched by a person to be understood. Reviewing or analyzing video by manual viewing alone is undoubtedly very inefficient. To save time, key frames need to be extracted from the video, and the video is reviewed or analyzed by displaying or analyzing the key-frame images. The prior art extracts video key frames at a fixed time interval and generates a picture from each key frame. Because this scheme extracts key frames only at fixed intervals, some critical pictures can be missed; if the interval is shortened, reviewing or analyzing the key-frame image data wastes a great deal of manpower. The key frames extracted by the prior art therefore do not reflect the overall content of the video well, and the performance of key-frame extraction is low.

A new system and method for extracting video key frames is therefore needed to improve the performance of key-frame extraction.

Summary of the invention

One object of the present invention is to provide a system and method for extracting video key frames, intended to solve the problem of low performance in the prior art.

To achieve this object, the system comprises a key-frame extraction unit that performs histogram and grayscale-image operations on the image data of two adjacent frames in the video data, obtains a feature vector composed of the operation results, compares the Euclidean distance of the feature vectors with a predetermined threshold, and, based on the comparison result, obtains a shot transition boundary and extracts key frames.

The key-frame extraction unit comprises:

A computing module, which performs the histogram and grayscale-image operations on the image data of two adjacent frames, obtains a feature vector composed of the frame difference value of the two adjacent frames, the mean difference of the grayscale images, and the variance difference of the grayscale images, and applies weighting to the frame difference value, mean difference, and variance difference to obtain the Euclidean distance of the feature vectors;

A frame extraction module, which is connected to and exchanges data with the computing module, compares the Euclidean distance with the predetermined threshold, and obtains the shot transition boundary based on the comparison result so as to extract the corresponding key frame.

The shot transition boundary comprises a gradual-transition boundary and a shot-cut boundary, and the predetermined threshold comprises a gradual-transition threshold and a shot-cut threshold.

The frame extraction module is further used to compare the Euclidean distance with the gradual-transition threshold and the shot-cut threshold. When the Euclidean distance is greater than the gradual-transition threshold and less than the shot-cut threshold, it obtains a gradual-transition boundary and extracts the corresponding key frame; when the Euclidean distance is greater than the shot-cut threshold, it obtains a shot-cut boundary and extracts the corresponding key frame.

Preferably, the system further comprises:

A decoding unit, which decodes the video data to obtain decoded image data;

An image processing unit, which is connected to and exchanges data with the decoding unit and the key-frame extraction unit, normalizes the decoded image data, and passes the processed image data to the key-frame extraction unit;

A picture synthesis unit, which is connected to and exchanges data with the key-frame extraction unit and synthesizes the obtained key frames to generate dynamic picture data.

To better achieve the object of the invention, the method comprises:

A. performing histogram and grayscale-image operations on the image data of two adjacent frames to obtain a feature vector composed of the operation results;

B. comparing the Euclidean distance of the feature vectors with a predetermined threshold, and obtaining the shot transition boundary according to the comparison result;

C. extracting the key frame according to the shot transition boundary.

The feature vector is composed of the frame difference value of the two adjacent frames, the mean difference of the grayscale images, and the variance difference of the grayscale images, and the Euclidean distance is obtained by applying weighting to the frame difference value, mean difference, and variance difference.

The shot transition boundary comprises a gradual-transition boundary and a shot-cut boundary, and the predetermined threshold comprises a gradual-transition threshold and a shot-cut threshold. Step B further comprises: comparing the Euclidean distance with the gradual-transition threshold and the shot-cut threshold; when the Euclidean distance is greater than the gradual-transition threshold and less than the shot-cut threshold, obtaining a gradual-transition boundary; when the Euclidean distance is greater than the shot-cut threshold, obtaining a shot-cut boundary.

Before step A, the method further comprises: decoding the video data to obtain decoded image data, and normalizing the decoded image data.

After step C, the method further comprises: synthesizing the video key frames to generate dynamic picture data, and displaying the dynamic picture data.

As can be seen from the above, in extracting video key frames the present invention differs from the prior art in that it performs histogram and grayscale-image operations on the image data of two adjacent frames, obtains the shot transition boundary from the operation results, and extracts the corresponding key frames. The key frames extracted in this way reflect the overall content of the video, so the present invention improves the performance of key-frame extraction. In addition, in the video review process, the present invention differs from the prior art in that the decoded image data is normalized, which improves computation speed. The present invention also generates dynamic picture data from the extracted key frames, which makes browsing convenient for the user.

Description of drawings

Fig. 1 is a structural diagram of a system for extracting video key frames in one embodiment of the present invention;

Fig. 2 is a structural diagram of a system for extracting video key frames in one embodiment of the present invention;

Fig. 3 is a schematic diagram of the internal structure of the key-frame extraction unit in one embodiment of the present invention;

Fig. 4 is a flow chart of a method for extracting video key frames in one embodiment of the present invention;

Fig. 5 is a flow chart of a method for extracting video key frames in one embodiment of the present invention.

To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments.

Embodiment

In the present invention, histogram and grayscale-image operations are performed on the image data of two adjacent frames, and the final operation result is compared with a predetermined threshold to obtain the shot transition boundary and extract the corresponding key frame. This improves the performance of video key-frame extraction.

The system for extracting video key frames provided by the present invention comprises a key-frame extraction unit 300, which is used to perform histogram and grayscale-image operations on the image data of two adjacent frames in the video data, obtain a feature vector composed of the operation results, compare the Euclidean distance of the feature vectors with a predetermined threshold, and, based on the comparison result, obtain the shot transition boundary and extract key frames.

Fig. 1 shows the structure of a system for extracting video key frames in one embodiment of the present invention. The system comprises a decoding unit 100, an image processing unit 200, and a key-frame extraction unit 300. It should be noted that, in all drawings of the present invention, the connections between devices are shown in order to explain their information exchange and control processes clearly; they should therefore be regarded as logical connections and are not limited to physical connections. It should also be noted that the functional modules may communicate in a variety of ways, and the scope of protection of the present invention is not limited to any particular type of communication. In this system:

The decoding unit 100 is used to decode video data and obtain decoded image data. Because the video data uploaded to the Internet has been encoded in various formats, it must first be decoded to obtain the original image data.

The image processing unit 200 is connected to and exchanges data with the decoding unit 100 and the key-frame extraction unit 300. It is used to normalize the decoded image data and pass the processed image data to the key-frame extraction unit 300. Because the video data uploaded to the Internet may differ in frame size, the decoded image data is normalized to improve computation speed. In one embodiment, the image data may be normalized to a size of 120 × 90 pixels.
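The size normalization described above can be illustrated with a minimal sketch. The text does not say which resampling method is used, so nearest-neighbour sampling is an assumption here, as are the function name `resize_nearest` and the representation of a grayscale frame as a list of rows of integers.

```python
def resize_nearest(image, out_w=120, out_h=90):
    """Resize a grayscale frame (list of rows) by nearest-neighbour sampling.

    The 120 x 90 default comes from the embodiment; the sampling method
    itself is an assumption, since the text only says "normalize".
    """
    in_h = len(image)
    in_w = len(image[0])
    resized = []
    for y in range(out_h):
        src_y = y * in_h // out_h  # source row that maps to output row y
        row = [image[src_y][x * in_w // out_w] for x in range(out_w)]
        resized.append(row)
    return resized


# A toy 4 x 2 frame scaled to the 120 x 90 working size.
frame = [[0, 50, 100, 150],
         [200, 210, 220, 230]]
small = resize_nearest(frame)
print(len(small[0]), len(small))
```

Bringing every frame to one fixed size means the per-frame pixel count is the same for all videos, so the later histogram, mean, and variance computations cost the same regardless of the upload resolution.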

The key-frame extraction unit 300 is connected to and exchanges data with the picture browsing terminal 500. It is used to perform histogram and grayscale-image operations on the image data of two adjacent frames, obtain a feature vector composed of the operation results, compare the Euclidean distance of the feature vectors with a predetermined threshold, and, based on the comparison result, obtain the shot transition boundary and extract key frames.

In one embodiment, the shot transition boundary comprises a gradual-transition boundary and a shot-cut boundary, and the predetermined threshold comprises a gradual-transition threshold and a shot-cut threshold.

Based on the above embodiment, the present invention proposes another embodiment. Fig. 2 shows the structure of a system for extracting video key frames in this embodiment. In addition to the decoding unit 100, the image processing unit 200, and the key-frame extraction unit 300, the system also comprises a picture synthesis unit 400 and a picture browsing terminal 500, wherein:

The picture synthesis unit 400 is connected to and exchanges data with the key-frame extraction unit 300 and the picture browsing terminal 500. It is used to synthesize the obtained key frames to generate dynamic picture data. In one embodiment, the generated dynamic picture data is a picture in GIF file format.

The picture browsing terminal 500 is connected to and exchanges data with the picture synthesis unit 400. It is used to display the dynamic picture data generated by the picture synthesis unit 400. The resulting dynamic picture data can be reviewed or analyzed by the user.

It should be noted that the picture browsing terminal 500 may reside in the same device as the picture synthesis unit 400, or may be connected to the picture synthesis unit 400 over a network. When connected over a network, the dynamic picture data generated by the picture synthesis unit 400 can be compressed to improve transmission speed; the compressed dynamic picture data is sent over the network to the picture browsing terminal 500, which decompresses it and then displays it.
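The compress, transmit, decompress round trip described above can be sketched as follows. The text does not name a compression scheme, so the use of Python's standard `zlib` module is purely an assumption, and the function names and the placeholder payload are hypothetical.

```python
import zlib


def pack_for_transfer(picture_bytes, level=6):
    """Compress dynamic picture data before sending it over the network."""
    return zlib.compress(picture_bytes, level)


def unpack_after_transfer(payload):
    """Restore the original picture bytes on the browsing terminal."""
    return zlib.decompress(payload)


# Placeholder bytes standing in for a generated dynamic picture.
picture = b"GIF89a" + bytes(1000)
payload = pack_for_transfer(picture)
restored = unpack_after_transfer(payload)
print(len(picture), len(payload), restored == picture)
```

The compression must be lossless so that the terminal displays exactly what the synthesis unit produced. How much is saved depends on the picture format; data that is already compressed internally may shrink very little.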

Fig. 3 shows the internal structure of the key-frame extraction unit 300 in one embodiment of the present invention. The key-frame extraction unit 300 comprises a computing module 301, a cache module 302, a timing module 303, and a frame extraction module 304, wherein:

(1) The computing module 301 is connected to and exchanges data with the cache module 302, the timing module 303, and the frame extraction module 304. It is used to perform the histogram and grayscale-image operations on the image data of two adjacent frames, obtain a feature vector composed of the frame difference value of the two adjacent frames, the mean difference of the grayscale images, and the variance difference of the grayscale images, and apply weighting to the frame difference value, mean difference, and variance difference to obtain the Euclidean distance of the feature vectors.

In one embodiment, the feature vector obtained by the computing module 301 is:

D(k, k+1) = (Z(k, k+1), M(k, k+1), S(k, k+1)), where:

Z(k, k+1) is computed as:

Z(k, k+1) = 1 - \frac{1}{M} \sum_{i=1}^{N} \min(h_{k}(i), h_{k+1}(i))

where Z(k, k+1) is the frame difference value between frame k and frame k+1, M is the number of pixels in each frame, N is the number of colors, h_{k}(i) is the histogram value of color i in frame k, and h_{k+1}(i) is the histogram value of color i in frame k+1.
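The frame difference value Z(k, k+1) is one minus the normalized histogram intersection of the two frames. The sketch below illustrates it under stated assumptions: frames are grayscale lists of rows, and the N "colors" are taken to be 256 gray levels, which the text does not specify. The function names are hypothetical.

```python
def histogram(image, bins=256):
    """Count how many pixels take each of the `bins` values."""
    counts = [0] * bins
    for row in image:
        for value in row:
            counts[value] += 1
    return counts


def frame_difference(image_a, image_b, bins=256):
    """Z(k, k+1) = 1 - (1/M) * sum_i min(h_k(i), h_{k+1}(i))."""
    h_a = histogram(image_a, bins)
    h_b = histogram(image_b, bins)
    pixel_count = sum(h_a)  # M, the number of pixels per frame
    overlap = sum(min(a, b) for a, b in zip(h_a, h_b))
    return 1 - overlap / pixel_count


frame_a = [[0, 0], [255, 255]]
frame_b = [[0, 0], [0, 0]]
z_same = frame_difference(frame_a, frame_a)
z_half = frame_difference(frame_a, frame_b)
print(z_same, z_half)
```

Z lies between 0 (identical histograms) and 1 (no shared histogram mass). Because it ignores pixel positions, it is insensitive to motion within a shot, which is why the feature vector pairs it with the mean and variance differences.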

M(k, k+1) is computed as:

M(k, k+1) = \bar{μ}(k+1) - \bar{μ}(k)

where M(k, k+1) is the mean difference between the grayscale images of frame k and frame k+1, \bar{μ}(k+1) is the mean of the grayscale image of frame k+1, and \bar{μ}(k) is the mean of the grayscale image of frame k.

S(k, k+1) is computed as:

S(k, k+1) = \overline{σ^{2}}(k+1) - \overline{σ^{2}}(k)

where S(k, k+1) is the variance difference between the grayscale images of frame k and frame k+1, \overline{σ^{2}}(k+1) is the variance of the grayscale image of frame k+1, and \overline{σ^{2}}(k) is the variance of the grayscale image of frame k.

In an exemplary scenario, the mean of the grayscale image of a given frame is computed as:

\bar{μ} = \frac{1}{hw} \sum_{y=0}^{h-1} \sum_{x=0}^{w-1} I(x, y)

where \bar{μ} is the mean of the grayscale image, h is the height of the image, w is the width of the image, and I(x, y) is the gray value of the pixel at (x, y).

The variance of the grayscale image of a given frame is computed as:

\overline{σ^{2}} = \frac{1}{hw} \sum_{y=0}^{h-1} \sum_{x=0}^{w-1} (I(x, y) - \bar{μ})^{2}

where \overline{σ^{2}} is the variance of the grayscale image, h is the height of the image, w is the width of the image, I(x, y) is the gray value of the pixel at (x, y), and \bar{μ} is the mean of the grayscale image of that frame.
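The mean and variance formulas, and the differences M(k, k+1) and S(k, k+1) built from them, can be sketched as below. Frames are assumed to be grayscale lists of rows, and all function names are hypothetical.

```python
def gray_mean(image):
    """Mean gray value: (1 / (h*w)) * sum of I(x, y)."""
    h, w = len(image), len(image[0])
    return sum(sum(row) for row in image) / (h * w)


def gray_variance(image):
    """Population variance: (1 / (h*w)) * sum of (I(x, y) - mean)^2."""
    h, w = len(image), len(image[0])
    mu = gray_mean(image)
    return sum((v - mu) ** 2 for row in image for v in row) / (h * w)


def mean_difference(frame_k, frame_k1):
    """M(k, k+1): mean of frame k+1 minus mean of frame k."""
    return gray_mean(frame_k1) - gray_mean(frame_k)


def variance_difference(frame_k, frame_k1):
    """S(k, k+1): variance of frame k+1 minus variance of frame k."""
    return gray_variance(frame_k1) - gray_variance(frame_k)


frame_k = [[10, 20], [30, 40]]
frame_k1 = [[20, 20], [20, 20]]
m = mean_difference(frame_k, frame_k1)
s = variance_difference(frame_k, frame_k1)
print(m, s)
```

Both differences are signed: a frame that becomes darker gives a negative M, and a frame that becomes flatter gives a negative S, as in the example.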

After the computing module 301 obtains the feature vector, it applies weighting to the frame difference value, mean difference, and variance difference to obtain the Euclidean distance of the feature vectors. In this embodiment, the Euclidean distance is computed as:

d = \sqrt{α (Z(k-1, k) - Z(k, k+1))^{2} + β (M(k-1, k) - M(k, k+1))^{2} + δ (S(k-1, k) - S(k, k+1))^{2}}

where d is the Euclidean distance of the feature vectors, Z(k-1, k) is the frame difference value between frame k-1 and frame k, Z(k, k+1) is the frame difference value between frame k and frame k+1, M(k-1, k) is the mean difference between the grayscale images of frame k-1 and frame k, M(k, k+1) is the mean difference between the grayscale images of frame k and frame k+1, S(k-1, k) is the variance difference between the grayscale images of frame k-1 and frame k, S(k, k+1) is the variance difference between the grayscale images of frame k and frame k+1, and α, β, δ are weight coefficients. The values of α, β, and δ can vary with the application environment; the preferred values are 1.7, 1, and 1.2 respectively.
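The weighted distance compares two consecutive feature vectors, D(k-1, k) and D(k, k+1), so it needs three frames to produce one value. A minimal sketch follows; the tuple layout `(Z, M, S)` and the function name are assumptions, while the default weights are the preferred values from the text.

```python
import math


def weighted_distance(d_prev, d_next, weights=(1.7, 1.0, 1.2)):
    """Weighted Euclidean distance between D(k-1, k) and D(k, k+1).

    Each argument is a (Z, M, S) tuple; weights are (alpha, beta, delta).
    """
    alpha, beta, delta = weights
    dz = d_prev[0] - d_next[0]
    dm = d_prev[1] - d_next[1]
    ds = d_prev[2] - d_next[2]
    return math.sqrt(alpha * dz ** 2 + beta * dm ** 2 + delta * ds ** 2)


steady = weighted_distance((0.2, 1.0, 5.0), (0.2, 1.0, 5.0))
jump = weighted_distance((0.2, 0.0, 0.0), (0.2, 3.0, 0.0))
print(steady, jump)
```

Note that d measures the change in the inter-frame differences rather than the differences themselves, so a video that changes at a constant rate yields d = 0. The text does not say whether gray values are scaled before M and S are computed; with raw 0 to 255 values those two terms can dominate Z, which lies in [0, 1].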

(2) The cache module 302 is connected to and exchanges data with the computing module 301, the timing module 303, and the frame extraction module 304. It is used to store the image data features computed by the computing module 301, for example the histogram and grayscale-image values computed above. The cache module 302 can be set to store only the image data features of one run of consecutive frames. In a preferred embodiment, the cache module 302 is set to store the image data features of 50 consecutive frames; when its space is full, the stored data is replaced with the image data features of the next 50 frames. When the computing module 301 performs further operations, it can read the image data features stored in the cache module 302 directly, without recomputing them from the raw image data, which improves computation speed.

(3) The timing module 303 is connected to and exchanges data with the computing module 301, the cache module 302, and the frame extraction module 304, and is used for timing. For videos whose scenes rarely contain gradual shot transitions or shot cuts, key frames may occasionally be missed. When the timing module 303 finds that no key frame has been output for a certain period of time, the frame extraction module 304 forcibly outputs a frame to prevent such omissions, further improving the key-frame extraction performance of the present invention.

(4) The frame extraction module 304 is connected to and exchanges data with the computing module 301, the cache module 302, and the timing module 303. It is used to compare the Euclidean distance with the gradual-transition threshold and the shot-cut threshold. When the Euclidean distance is greater than the gradual-transition threshold and less than the shot-cut threshold, it obtains a gradual-transition boundary and extracts the corresponding key frame; when the Euclidean distance is greater than the shot-cut threshold, it obtains a shot-cut boundary and extracts the corresponding key frame.

In an exemplary scenario, d is the computed Euclidean distance, T1 is set as the gradual-transition threshold, and Th as the shot-cut threshold; the preferred value of T1 is 0.1 and the preferred value of Th is 0.3. The computing module 301 reads the image data features of two adjacent frames from the cache module 302 and computes the Euclidean distance d. When d < T1, neither a gradual transition nor a cut is occurring. When T1 < d < Th, the shot is in a gradual transition, and the gradual-transition boundary can be obtained while d remains below Th. When d > Th, the shot is being cut, and the shot-cut boundary is obtained while the cut takes place. In one embodiment, during a gradual transition, key frames can be extracted at the time points where the transition begins and ends and at its midpoint; when a cut occurs, a key frame can be extracted at the boundary where the cut begins. In this way, the extracted key frames reflect the overall content of the video.
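The two-threshold decision can be written as a small function. The default thresholds are the preferred values T1 = 0.1 and Th = 0.3 from the text. The text leaves the cases d = T1 and d = Th undefined; treating d = T1 as no transition and d = Th as gradual is an assumption of this sketch, as are the function name and the string labels.

```python
def classify_transition(d, t_gradual=0.1, t_cut=0.3):
    """Map a distance d to 'none', 'gradual', or 'cut'."""
    if d > t_cut:
        return "cut"
    if d > t_gradual:
        return "gradual"
    return "none"


labels = [classify_transition(d) for d in (0.05, 0.2, 0.5)]
print(labels)
```

Using two thresholds lets slow changes such as fades and dissolves, which never produce a large single-step distance, still be detected without lowering the cut threshold so far that ordinary motion triggers it.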

Fig. 4 shows the flow of a method for extracting video key frames in one embodiment of the present invention. The method is based on the system structure shown in Fig. 1 and proceeds as follows:

In step S401, the key-frame extraction unit 300 performs histogram and grayscale-image operations on the image data of two adjacent frames and obtains a feature vector composed of the operation results.

In step S402, the key-frame extraction unit 300 compares the Euclidean distance of the feature vectors with the predetermined threshold and obtains the shot transition boundary based on the comparison result.

In step S403, the key-frame extraction unit 300 extracts the key frame according to the shot transition boundary.

In one embodiment, the feature vector obtained by the key-frame extraction unit 300 comprises the computed frame difference value of the two adjacent frames, the mean difference of the grayscale images, and the variance difference of the grayscale images, and the Euclidean distance is obtained by applying weighting to the frame difference value, mean difference, and variance difference.
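Steps S401 and S402 can be chained into one pass over a frame sequence, as in the following sketch. It assumes grayscale frames given as lists of rows with values 0 to 255, uses the preferred weights and thresholds from the description, and uses hypothetical function names; it is an illustration of the flow, not the patented implementation.

```python
import math


def features(prev, nxt, bins=256):
    """Return D(k, k+1) = (Z, M, S) for two grayscale frames."""
    a = [v for row in prev for v in row]
    b = [v for row in nxt for v in row]
    n = len(a)
    h_a, h_b = [0] * bins, [0] * bins
    for v in a:
        h_a[v] += 1
    for v in b:
        h_b[v] += 1
    z = 1 - sum(min(x, y) for x, y in zip(h_a, h_b)) / n
    mu_a, mu_b = sum(a) / n, sum(b) / n
    var_a = sum((v - mu_a) ** 2 for v in a) / n
    var_b = sum((v - mu_b) ** 2 for v in b) / n
    return (z, mu_b - mu_a, var_b - var_a)


def detect_boundaries(frames, t_gradual=0.1, t_cut=0.3,
                      weights=(1.7, 1.0, 1.2)):
    """Return (frame_index, kind) for every frame whose d exceeds a threshold."""
    feats = [features(frames[k], frames[k + 1])
             for k in range(len(frames) - 1)]
    boundaries = []
    for k in range(1, len(feats)):
        # feats[k - 1] is D(k-1, k); feats[k] is D(k, k+1).
        d = math.sqrt(sum(w * (p - q) ** 2
                          for w, p, q in zip(weights, feats[k - 1], feats[k])))
        if d > t_cut:
            boundaries.append((k, "cut"))
        elif d > t_gradual:
            boundaries.append((k, "gradual"))
    return boundaries


dark = [[0, 0], [0, 0]]
bright = [[255, 255], [255, 255]]
boundaries = detect_boundaries([dark, dark, dark, bright, bright, bright])
print(boundaries)
```

In this toy sequence the cut lies between frames 2 and 3, and both frames are flagged: d compares consecutive feature vectors, so it is large when the difference appears and again when it disappears. Step S403 would then pick the key frame at the start of the flagged run.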

Fig. 5 shows the flow of a method for extracting video key frames in one embodiment of the present invention. The method is based on the system structure shown in Fig. 2 and proceeds as follows:

In step S501, the decoding unit 100 decodes the video data and obtains decoded image data.

In step S502, the image processing unit 200 normalizes the decoded image data. In one embodiment, the image processing unit 200 normalizes the decoded image data to a size of 120 × 90 pixels.

In step S503, the key-frame extraction unit 300 performs histogram and grayscale-image operations on the image data of two adjacent frames and obtains a feature vector composed of the operation results.

In one embodiment, step S503 proceeds as follows:

The computing module 301 performs the histogram and grayscale-image operations on the image data of two adjacent frames, obtains the frame difference value of the two adjacent frames, the mean difference of the grayscale images, and the variance difference of the grayscale images, and thereby obtains a feature vector comprising the frame difference value, mean difference, and variance difference.

D(k, k+1) = (Z(k, k+1), M(k, k+1), S(k, k+1)), where:

Z(k, k+1) is computed as:

Z(k, k+1) = 1 - \frac{1}{M} \sum_{i=1}^{N} \min(h_{k}(i), h_{k+1}(i))

M(k, k+1) is computed as:

M(k, k+1) = \bar{μ}(k+1) - \bar{μ}(k)

where M(k, k+1) is the mean difference between the grayscale images of frame k and frame k+1, \bar{μ}(k+1) is the mean of the grayscale image of frame k+1, and \bar{μ}(k) is the mean of the grayscale image of frame k.

S(k, k+1) is computed as:

S(k, k+1) = \overline{σ^{2}}(k+1) - \overline{σ^{2}}(k)

where \overline{σ^{2}}(k+1) is the variance of the grayscale image of frame k+1 and \overline{σ^{2}}(k) is the variance of the grayscale image of frame k.

The mean of the grayscale image of a given frame is computed as:

\bar{μ} = \frac{1}{hw} \sum_{y=0}^{h-1} \sum_{x=0}^{w-1} I(x, y)

The variance of the grayscale image of a given frame is computed as:

\overline{σ^{2}} = \frac{1}{hw} \sum_{y=0}^{h-1} \sum_{x=0}^{w-1} (I(x, y) - \bar{μ})^{2}

where \overline{σ^{2}} is the variance of the grayscale image, h is the height of the image, w is the width of the image, I(x, y) is the gray value of the pixel at (x, y), and \bar{μ} is the mean of the grayscale image of that frame.

In step S504, the key-frame extraction unit 300 compares the Euclidean distance of the feature vectors with the predetermined threshold, obtains the shot transition boundary according to the comparison result, and extracts the corresponding key frame.

In one embodiment, step S504 proceeds as follows:

After the computing module 301 obtains the feature vector, it applies weighting to the computed frame difference value, mean difference, and variance difference to obtain the Euclidean distance of the feature vectors. In this embodiment, the Euclidean distance is computed as:

d = \sqrt{α (Z(k-1, k) - Z(k, k+1))^{2} + β (M(k-1, k) - M(k, k+1))^{2} + δ (S(k-1, k) - S(k, k+1))^{2}}

where d is the Euclidean distance of the feature vectors, Z(k-1, k) is the frame difference value between frame k-1 and frame k, Z(k, k+1) is the frame difference value between frame k and frame k+1, M(k-1, k) is the mean difference between the grayscale images of frame k-1 and frame k, M(k, k+1) is the mean difference between the grayscale images of frame k and frame k+1, S(k-1, k) is the variance difference between the grayscale images of frame k-1 and frame k, S(k, k+1) is the variance difference between the grayscale images of frame k and frame k+1, and α, β, δ are weight coefficients. The values of α, β, and δ can vary with the application environment; the preferred values are 1.7, 1, and 1.2 respectively.

After the Euclidean distance is obtained, the frame extraction module 304 compares it with the predetermined threshold. In one embodiment, the predetermined threshold comprises a gradual-transition threshold and a shot-cut threshold. When the Euclidean distance is greater than the gradual-transition threshold and less than the shot-cut threshold, a gradual-transition boundary is obtained and the corresponding key frame is extracted; when the Euclidean distance is greater than the shot-cut threshold, a shot-cut boundary is obtained and the corresponding key frame is extracted.

In an exemplary scenario, d is the computed Euclidean distance, T1 is set as the gradual-transition threshold, and Th as the shot-cut threshold; the preferred value of T1 is 0.1 and the preferred value of Th is 0.3. The computing module 301 reads the image data features of two adjacent frames from the cache module 302 and computes the Euclidean distance d. When d < T1, neither a gradual transition nor a cut is occurring. When T1 < d < Th, the shot is in a gradual transition, and the gradual-transition boundary can be obtained while d remains below Th. When d > Th, the shot is being cut, and the shot-cut boundary is obtained while the cut takes place. In one embodiment, during a gradual transition, key frames can be extracted at the time points where the transition begins and ends and at its midpoint; when a cut occurs, a key frame can be extracted at the boundary where the cut begins.
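The key-frame choice in this scenario (start, midpoint, and end of a gradual transition, plus the frame where a cut begins) can be sketched from a list of per-frame distances. The input layout, the function name, and the handling of values exactly equal to a threshold are assumptions of this sketch.

```python
def select_key_frames(distances, t_gradual=0.1, t_cut=0.3):
    """distances[i] is d at frame i; return sorted key-frame indices."""
    keys = set()
    run_start = None   # first frame of the current gradual run, if any
    prev_cut = False   # whether the previous frame was above the cut threshold
    for i, d in enumerate(distances):
        in_gradual = t_gradual < d <= t_cut
        if in_gradual and run_start is None:
            run_start = i
        if not in_gradual and run_start is not None:
            run_end = i - 1
            keys.update({run_start, (run_start + run_end) // 2, run_end})
            run_start = None
        is_cut = d > t_cut
        if is_cut and not prev_cut:
            keys.add(i)  # only the frame where the cut begins
        prev_cut = is_cut
    if run_start is not None:  # gradual run reaching the end of the video
        run_end = len(distances) - 1
        keys.update({run_start, (run_start + run_end) // 2, run_end})
    return sorted(keys)


distances = [0.0, 0.15, 0.2, 0.25, 0.2, 0.15, 0.0, 0.5, 0.0]
key_frames = select_key_frames(distances)
print(key_frames)
```

Here frames 1 to 5 form one gradual transition, giving key frames 1, 3, and 5, and frame 7 starts a cut. Taking three frames from a gradual transition captures the scene before, during, and after the change.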

In a preferred embodiment, the cache module 302 is set to store the image data features of consecutive frames, for example the features of 50 consecutive frames; when the space of the cache module 302 is full, its contents are replaced with the image data features of the next 50 frames. The computing module 301 and the frame extraction module 304 can read the image data features stored in the cache module 302 directly for further computation or judgment.
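The batch-replacement behaviour of the cache can be sketched as a small class. The class name, its methods, and the dictionary keyed by frame index are all hypothetical; only the capacity of 50 and the replace-when-full rule come from the text.

```python
class FeatureCache:
    """Holds per-frame features for one batch of consecutive frames."""

    def __init__(self, capacity=50):
        self.capacity = capacity
        self._items = {}

    def put(self, frame_index, feature):
        # When the batch is full, discard it and start the next batch.
        if len(self._items) >= self.capacity:
            self._items.clear()
        self._items[frame_index] = feature

    def get(self, frame_index):
        """Return the cached feature, or None if it is not in this batch."""
        return self._items.get(frame_index)


cache = FeatureCache(capacity=3)
for index in range(3):
    cache.put(index, ("features", index))
hit = cache.get(1)
cache.put(3, ("features", 3))  # batch was full, so frames 0 to 2 are dropped
miss = cache.get(1)
print(hit, miss, cache.get(3))
```

One consequence of replacing the whole batch is that the first frame of a new batch has no cached predecessor. A sliding window, for example `collections.deque(maxlen=50)`, would avoid that gap, but it is a different design from the one described.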

In another preferred embodiment, the timing module 303 is set up for timing. For videos whose scenes rarely contain gradual shot transitions or shot cuts, key frames may occasionally be missed. When the timing module 303 finds that no key frame has been output for a certain period of time, the frame extraction module 304 forcibly outputs a frame to prevent such omissions, further improving the video review performance of the present invention.
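The forced-output rule can be sketched with frame timestamps instead of a real clock. The text only says "a certain period of time", so the 10-second gap is a hypothetical value, as are the class and function names.

```python
class ForcedOutputTimer:
    """Tracks how long it has been since a key frame was last output."""

    def __init__(self, max_gap_seconds=10.0):
        self.max_gap = max_gap_seconds
        self.last_output = 0.0

    def mark_output(self, timestamp):
        self.last_output = timestamp

    def should_force(self, timestamp):
        return timestamp - self.last_output >= self.max_gap


def frames_to_output(frame_times, detected, max_gap_seconds=10.0):
    """Return indices of detected key frames plus forced ones."""
    timer = ForcedOutputTimer(max_gap_seconds)
    output = []
    for index, timestamp in enumerate(frame_times):
        if index in detected or timer.should_force(timestamp):
            output.append(index)
            timer.mark_output(timestamp)
    return output


frame_times = [2.0 * i for i in range(16)]  # one frame every 2 seconds
output = frames_to_output(frame_times, detected={3})
print(output)
```

Frame 3 is a detected key frame; frames 8 and 13 are forced because 10 seconds passed without any output. This bounds the longest stretch of video that can go unrepresented in the extracted key frames.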

In step S505, the picture synthesis unit 400 synthesizes the extracted key frames to generate dynamic picture data. In one embodiment, the generated dynamic picture data is a picture in GIF file format.

In step S506, the picture browsing terminal 500 displays the dynamic picture data. In one embodiment, the picture browsing terminal 500 and the picture synthesis unit 400 are connected over a network. The dynamic picture data generated by the picture synthesis unit 400 can be compressed before being transmitted to the picture browsing terminal 500, which decompresses and displays it. By browsing the dynamic picture data, the user at the picture browsing terminal 500 can grasp the overall content of the video and thereby review or analyze the image data of the extracted key frames.

It should be noted that the above embodiments are only used to explain the technical solution of the present invention. The present invention is typically applied to, but is not limited to, video review and analysis systems; the method set forth in the present invention can also be used in other similar video processing systems.

The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A system for extracting video key frames, characterized in that the system comprises:

A key-frame extraction unit, which performs histogram and grayscale-image operations on the image data of two adjacent frames in the video data, obtains a feature vector composed of the operation results, compares the Euclidean distance of the feature vectors with a predetermined threshold, and, based on the comparison result, obtains a shot transition boundary and extracts key frames.

2. The system for extracting video key frames according to claim 1, characterized in that the key-frame extraction unit comprises:

3. The system for extracting video key frames according to claim 1 or 2, characterized in that the shot transition boundary comprises a gradual-transition boundary and a shot-cut boundary, and the predetermined threshold comprises a gradual-transition threshold and a shot-cut threshold.

4. The system for extracting video key frames according to claim 3, characterized in that the frame extraction module is further used to: compare the Euclidean distance with the gradual-transition threshold and the shot-cut threshold; when the Euclidean distance is greater than the gradual-transition threshold and less than the shot-cut threshold, obtain a gradual-transition boundary and extract the corresponding key frame; when the Euclidean distance is greater than the shot-cut threshold, obtain a shot-cut boundary and extract the corresponding key frame.

5. The system for extracting video key frames according to claim 1, characterized in that the system further comprises:

A decoding unit, which decodes the video data to obtain decoded image data;

6. A method for extracting video key frames, characterized in that the method comprises the following steps:

C. extracting the key frame according to the shot transition boundary.

7. The method for extracting video key frames according to claim 6, characterized in that the feature vector is composed of the frame difference value of the two adjacent frames, the mean difference of the grayscale images, and the variance difference of the grayscale images, and the Euclidean distance is obtained by applying weighting to the frame difference value, mean difference, and variance difference.

8. The method for extracting video key frames according to claim 6 or 7, characterized in that the shot transition boundary comprises a gradual-transition boundary and a shot-cut boundary, the predetermined threshold comprises a gradual-transition threshold and a shot-cut threshold, and step B further comprises: comparing the Euclidean distance with the gradual-transition threshold and the shot-cut threshold; when the Euclidean distance is greater than the gradual-transition threshold and less than the shot-cut threshold, obtaining a gradual-transition boundary; when the Euclidean distance is greater than the shot-cut threshold, obtaining a shot-cut boundary.

9. The method for extracting video key frames according to claim 6, characterized in that, before step A, the method further comprises: decoding the video data to obtain decoded image data, and normalizing the decoded image data.

10. The method for extracting video key frames according to claim 6, characterized in that, after step C, the method further comprises: synthesizing the video key frames to generate dynamic picture data, and displaying the dynamic picture data.