CN107590419A - Method and device for extracting shot key frames in video analysis - Google Patents
- Publication number
- CN107590419A (application CN201610533693.XA / CN201610533693A)
- Authority
- CN
- China
- Prior art keywords
- video
- frame
- distance
- cluster
- lens
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method and device for extracting shot key frames in video analysis. The method includes: obtaining a video file to be analyzed; within a set sliding window, calculating the distance between two adjacent video frames; determining, from the distances between adjacent video frames in each sliding window, the cut points for segmenting the video file into shots; dividing the video file into a number of video shots according to the determined cut points; and extracting, from each segmented video shot, shot key frames that represent the shot's main content. The key frames of the shots segmented from the video can thus be extracted comprehensively and accurately for video retrieval analysis, improving the accuracy of video search matching.
Description
Technical field
The invention belongs to the technical field of video analysis and retrieval, and in particular relates to a method and device for extracting shot key frames in video analysis.
Background technology
With the continuous development of network technology, online video has become increasingly popular: people find the videos they want through web search and watch them online, and the number of videos on the network keeps growing. Helping users find the videos they need among this massive collection is therefore a problem of wide concern in the video search field.
Traditional video search is typically text-based and usually requires manual annotation of the video files. With more and more videos on the network, annotating a large number of video files one by one is an enormous effort that requires substantial human resources, increases labor costs, and is inefficient.
Content-based video search has therefore begun to emerge. It extracts video features automatically, making search convenient for users and avoiding the above drawbacks to some extent: the user searches through a client tool, and video analysis produces the features of each video to facilitate retrieval. This approach needs accurate video features to achieve accurate search matching. Although existing video retrieval systems include a video information database that stores the feature information of videos, these records often contain only manually entered feature data; the information is limited in form and small in quantity, and can hardly satisfy users' search requirements.
To change this situation of video data retrieval, the unordered video data must be put in order so that a content-based video retrieval tool can be established, allowing users to retrieve the desired video data at any time, letting video adapt to its environment, supporting fast interactive retrieval, and enabling fast and reliable online transmission. This requires analyzing videos and extracting their feature information.
Therefore, how to extract comprehensive and accurate video feature information during video analysis has become an urgent technical problem.
Summary of the invention
In view of this, an object of the present invention is to provide a method and device for extracting shot key frames in video analysis, so as to solve the prior-art problem that video feature information cannot be extracted comprehensively and accurately for video search; through accurate segmentation of video shots and extraction of shot key frames, the extracted video features become more comprehensive and accurate. To provide a basic understanding of some aspects of the disclosed embodiments, a brief summary is given below. This summary is not an extensive overview, nor is it intended to identify key or critical elements or to delimit the scope of these embodiments; its sole purpose is to present some concepts in a simple form as a prelude to the detailed description that follows.
An embodiment of the present invention provides a method for extracting shot key frames in video analysis, including:
obtaining a video file to be analyzed;
within a set sliding window, calculating the distance between two adjacent video frames;
determining, from the distances between adjacent video frames in each sliding window, the cut points for segmenting the video file into shots;
dividing the video file into a number of video shots according to the determined cut points; and
extracting, from each segmented video shot, shot key frames that represent the shot's main content.
In some optional embodiments, calculating the distance between two adjacent video frames specifically includes:
calculating the histogram distance between the two adjacent video frames from their color histograms; or
calculating the Euclidean distance between the two adjacent video frames from their binary images.
In some optional embodiments, determining, from the distances between adjacent video frames in each sliding window, the cut points for segmenting the video file into shots specifically includes:
determining, within each sliding window, the maximum distance between two adjacent video frames and the average distance between video frames;
judging whether the ratio of the maximum distance to the average distance exceeds a set distance-change threshold;
if it does, determining that the two adjacent video frames are a video cut point; otherwise, considering that no cut point exists in the sliding window.
In some optional embodiments, extracting, from each segmented video shot, the shot key frames that represent the shot's main content specifically includes:
for each segmented video shot:
assigning each video frame contained in the shot to a different video-frame cluster;
extracting from each video-frame cluster the frame closest to the cluster centroid as the representative frame of that cluster; and
forming the shot key frames from all extracted representative frames.
In some optional embodiments, assigning each video frame contained in the shot to a different video-frame cluster specifically includes:
for each video frame in the shot:
calculating the distance between the current frame and the centroid of each established video-frame cluster; if the distance exceeds that cluster's distance threshold, not adding the current frame to the cluster, and otherwise recording the cluster as a candidate cluster for the current frame;
if the distances between the current frame and the centroids of all established clusters exceed the threshold, forming a new video-frame cluster with the current frame as its centroid;
otherwise, selecting from the recorded candidate clusters the cluster most similar to the current frame and adding the frame to it.
An embodiment of the present invention also provides a shot key-frame extraction device for video analysis, including:
an acquisition module, for obtaining a video file to be analyzed;
a shot segmentation module, for calculating, within a set sliding window, the distance between two adjacent video frames; determining, from the distances between adjacent video frames in each sliding window, the cut points for segmenting the video file into shots; and dividing the video file into a number of video shots according to the determined cut points; and
a first extraction module, for extracting, from each segmented video shot, shot key frames that represent the shot's main content.
In some optional embodiments, the shot segmentation module is specifically configured to:
calculate the histogram distance between the two adjacent video frames from their color histograms; or
calculate the Euclidean distance between the two adjacent video frames from their binary images.
In some optional embodiments, the shot segmentation module is specifically configured to:
determine, within each sliding window, the maximum distance between two adjacent video frames and the average distance between video frames;
judge whether the ratio of the maximum distance to the average distance exceeds a set distance-change threshold;
if it does, determine that the two adjacent video frames are a video cut point; otherwise, consider that no cut point exists in the sliding window.
In some optional embodiments, the first extraction module is specifically configured to, for each segmented video shot:
assign each video frame contained in the shot to a different video-frame cluster;
extract from each video-frame cluster the frame closest to the cluster centroid as the representative frame of that cluster; and
form the shot key frames from all extracted representative frames.
In some optional embodiments, the first extraction module is specifically configured to, for each video frame in the shot:
calculate the distance between the current frame and the centroid of each established video-frame cluster, not adding the current frame to a cluster whose distance exceeds that cluster's distance threshold and otherwise recording the cluster as a candidate cluster for the current frame;
form a new video-frame cluster with the current frame as its centroid if the distances between the current frame and the centroids of all established clusters exceed the threshold; and
otherwise select from the recorded candidate clusters the cluster most similar to the current frame and add the frame to it.
With the method and device for extracting shot key frames in video analysis provided by the embodiments of the present invention, shot cut points of the video file to be analyzed are determined by calculating the distances between adjacent video frames within a sliding window, realizing accurate shot segmentation and shot key-frame extraction. The feature information of the video can thus be extracted comprehensively and accurately from the shot key frames and used for video search matching, so that the video files users need can be supplied quickly, improving the accuracy, speed, and efficiency of video search matching.
To accomplish the foregoing and related ends, the one or more embodiments include the features described in detail below and particularly pointed out in the claims. The following description and the accompanying drawings set forth certain illustrative aspects in detail, indicating only some of the various ways in which the principles of the embodiments may be employed. Other benefits and novel features will become apparent from the following detailed description considered in conjunction with the drawings, and the disclosed embodiments are intended to include all such aspects and their equivalents.
Brief description of the drawings
The accompanying drawings are provided for a further understanding of the present invention and constitute a part of the specification; together with the embodiments of the present invention they serve to explain the invention, and they are not to be construed as limiting the invention. In the drawings:
Fig. 1 is a flowchart of the shot key-frame extraction method in video analysis according to Embodiment 1 of the present invention;
Fig. 2 is a flowchart of the shot key-frame extraction method in video analysis according to Embodiment 2 of the present invention;
Fig. 3 is a schematic structural diagram of the shot key-frame extraction device in video analysis according to Embodiment 3 of the present invention;
Fig. 4 is a flowchart of the scene key-frame extraction method in video analysis according to Embodiment 4 of the present invention;
Fig. 5 is a flowchart of the scene key-frame extraction method in video analysis according to Embodiment 5 of the present invention;
Fig. 6 is a schematic structural diagram of the scene key-frame extraction device in video analysis according to Embodiment 6 of the present invention.
Detailed description of the embodiments
The following description and drawings sufficiently illustrate specific embodiments of the invention to enable those skilled in the art to practice them. Other embodiments may incorporate structural, logical, electrical, process, and other changes; the embodiments merely represent possible variations. Unless explicitly required, individual components and functions are optional, and the order of operations may vary. Portions and features of some embodiments may be included in or substituted for those of other embodiments. The scope of the embodiments of the present invention includes the entire scope of the claims and all available equivalents of the claims. Herein, these embodiments of the invention may be referred to, individually or collectively, by the term "invention" merely for convenience; if more than one invention is in fact disclosed, this is not meant to automatically limit the scope of the application to any single invention or inventive concept.
To solve the prior-art problem that video feature information cannot be extracted comprehensively and accurately for video search, the embodiments of the present invention provide a shot key-frame extraction method and a scene key-frame extraction method in video analysis, which can accurately segment shots and scenes, thereby preparing for more comprehensive and accurate video feature extraction and improving the accuracy of video retrieval analysis.
Embodiment one
Embodiment 1 of the present invention provides a method for extracting shot key frames in video analysis; its flow, as shown in Fig. 1, comprises the following steps:
Step S101: obtain the video file to be analyzed.
A shot is the minimal physical unit for processing a video stream; within the same shot, the image characteristics of the video frames remain stable. Therefore, to obtain more comprehensive and accurate video features, the video file can first be divided into a number of shots, from which the video features are then extracted.
Step S102: within the set sliding window, calculate the distance between two adjacent video frames.
The distances between adjacent video frames can be calculated in the form of a sliding window, so that the shot cut points are determined from the distances. Optionally, the histogram distance between two adjacent frames is calculated from their color histograms, or the Euclidean distance between two adjacent frames is calculated from their binary images.
Step S103: determine, from the distances between adjacent video frames in each sliding window, the cut points for segmenting the video file into shots.
After the distances between adjacent frames are calculated, the shot cut points are determined from the maximum distance between two adjacent frames in each sliding window and the average distance between frames; for example, when the ratio of the maximum distance to the average distance exceeds a set distance-change threshold, the two adjacent frames are determined to be a video cut point, and otherwise the sliding window is considered to contain no cut point.
The size of the sliding window and the distance-change threshold are set according to at least one of the application to which the video belongs, the category of the video, or the format of the video.
Step S104: divide the video file to be analyzed into a number of video shots according to the determined cut points.
The number of resulting video shots is not fixed; it depends on the number of cut points determined.
Step S105: extract, from each segmented video shot, the shot key frames that represent the shot's main content.
For each segmented video shot, the shot key frames can be extracted by clustering: each video frame contained in the shot is assigned to a different video-frame cluster; from each cluster, the frame closest to the cluster centroid is extracted as the cluster's representative frame; and all extracted representative frames form the shot key frames.
Embodiment two
Embodiment 2 of the present invention provides a method for extracting shot key frames in video analysis; its flow, as shown in Fig. 2, comprises the following steps:
Step S201: obtain the video file to be analyzed.
Step S202: within the set sliding window, calculate the distance between two adjacent video frames.
When a sliding window is used, the default window size can be set to any specified value, according to actual needs.
Taking calculation from color histograms as an example, let H(f, k) denote the number of pixels of color k in the color histogram of video frame f, where k ranges over [0, N] and N is the maximum of the discrete color value range. The color-histogram distance between two adjacent video frames is measured with the histogram intersection method; the histogram distance d(f, f') between two frames f and f' is computed as:
d(f, f') = 1 − [Σ(k=0..N) min(H(f, k), H(f', k))] / [Σ(k=0..N) H(f, k)]
Taking calculation from binary images as an example, the Euclidean distance between two adjacent video frames can be computed with the following formulas:
(1) between two points a(x1, y1) and b(x2, y2) on a two-dimensional plane:
d(a, b) = √((x1 − x2)² + (y1 − y2)²)
(2) between two points a(x1, y1, z1) and b(x2, y2, z2) in three-dimensional space:
d(a, b) = √((x1 − x2)² + (y1 − y2)² + (z1 − z2)²)
(3) between two n-dimensional vectors a(x11, x12, …, x1n) and b(x21, x22, …, x2n):
d(a, b) = √(Σ(k=1..n) (x1k − x2k)²)
The appropriate formula is chosen according to the specific representation of the video frames.
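The two distance measures above can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: frames are given as flat histograms or 0/1 pixel vectors, and all function names and example values are assumptions.

```python
# Minimal sketch of the two frame-distance measures described above.
# Frames are given as flat histograms / pixel vectors; all names and
# example values are illustrative, not fixed by the patent.

def histogram_intersection_distance(h1, h2):
    """d(f, f') = 1 - sum_k min(H(f,k), H(f',k)) / sum_k H(f,k)."""
    overlap = sum(min(a, b) for a, b in zip(h1, h2))
    return 1.0 - overlap / sum(h1)

def euclidean_distance(a, b):
    """Formula (3): Euclidean distance between two n-dimensional vectors;
    for binary images, a and b are the flattened 0/1 pixel vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

# Identical histograms are at distance 0; disjoint ones at distance 1.
print(histogram_intersection_distance([4, 2, 2], [4, 2, 2]))  # 0.0
print(histogram_intersection_distance([8, 0], [0, 8]))        # 1.0
# Two binary "images" differing in exactly one pixel.
print(euclidean_distance([0, 1, 1, 0], [0, 1, 0, 0]))         # 1.0
```

Both measures return 0 for identical frames and grow with visual change, which is what the cut-point test in the next steps relies on.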
Step S203: determine, within each sliding window, the maximum distance between two adjacent video frames and the average distance between video frames.
Within each sliding window, determine the maximum distance dmax between two adjacent video frames and the average distance davr between video frames in the same window, and let T be the set distance-change threshold. When dmax/davr > T, the point is considered a shot cut point; otherwise, the current window is considered to contain no cut point. This method effectively avoids the influence of spikes on shot segmentation.
Step S204: judge whether the ratio of the determined maximum distance to the average distance exceeds the set distance-change threshold.
If it does, perform step S205; otherwise, perform step S206.
Step S205: determine that the two adjacent video frames are a video cut point.
Step S206: consider that no cut point exists in the sliding window.
Steps S203 to S206 realize determining, from the distances between adjacent video frames in each sliding window, the cut points for segmenting the video file into shots.
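The dmax/davr test over sliding windows can be sketched as follows; the window size and threshold defaults are illustrative assumptions, since the patent leaves both to be set per application, video category, or format.

```python
def find_cut_points(distances, window_size=8, threshold=3.0):
    """Scan the sequence of adjacent-frame distances in consecutive
    windows; a window yields a cut point when the maximum distance dmax
    exceeds the window average davr by more than the factor T (threshold)."""
    cut_points = []
    for start in range(0, len(distances) - window_size + 1, window_size):
        window = distances[start:start + window_size]
        d_max = max(window)
        d_avr = sum(window) / len(window)
        if d_avr > 0 and d_max / d_avr > threshold:
            # The cut lies at the frame pair with the maximum distance.
            cut_points.append(start + window.index(d_max))
        # Otherwise: no cut point in this window (step S206).
    return cut_points

# One sharp distance spike inside an otherwise quiet window -> one cut.
print(find_cut_points([0.1, 0.1, 0.1, 2.0, 0.1, 0.1, 0.1, 0.1]))  # [3]
print(find_cut_points([0.1] * 8))                                  # []
```

Because the test is a ratio against the window average rather than an absolute threshold, an isolated spike stands out while uniformly noisy windows produce no false cuts, matching the spike-resistance claim above.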
Step S207: divide the video file into a number of video shots according to the determined cut points.
Step S208: for each segmented video shot, perform the following steps.
Step S209: assign each video frame contained in the shot to a different video-frame cluster.
For each video frame in the shot:
calculate the distance between the current frame and the centroid of each established video-frame cluster; if the distance exceeds that cluster's distance threshold, the current frame is not added to the cluster; otherwise the cluster is recorded as a candidate cluster for the current frame;
if the distances between the current frame and the centroids of all established clusters exceed the threshold, a new video-frame cluster is formed with the current frame as its centroid;
otherwise, the cluster most similar to the current frame is selected from the recorded candidate clusters and the frame is added to it.
Step S210: extract from each video-frame cluster the frame closest to the cluster centroid as the representative frame of that cluster.
Step S211: form the shot key frames from all extracted representative frames.
Suppose a shot Si contains n video frames; it can be expressed as Si = {Fi1, …, Fin}, where Fi1 is the first frame and Fin is the last. The similarity between two adjacent video frames is defined as the similarity of their color histograms, and a distance threshold δ is predefined to control the density of the clusters.
Calculate the distance between the current video frame Fim and the centroid of each existing video-frame cluster. If the value exceeds δ, the frame is far from that cluster, and Fim is not added to it.
If the distances between Fim and the centroids of all existing clusters exceed δ, Fim forms a new cluster with Fim as its centroid. Otherwise, the frame is added to the most similar cluster, i.e. the cluster whose centroid is closest to the frame, and the cluster centroid is adjusted accordingly as follows:
centroid' = (Fn × centroid + Fim) / (Fn + 1)
where centroid and centroid' are the cluster centroid before and after the update, and Fn is the number of frames in the cluster.
After the n video frames contained in shot Si have been assigned to different video-frame clusters by the above method, the shot key frames can be selected: from each cluster, the frame closest to the cluster centroid is extracted as the cluster's representative frame, and the representative frames of all clusters constitute the shot key frames of the shot.
Constraints may be added to the clustering result; for example, the total frame count of any cluster must not be less than 10% of the shot's total frame count, and clusters with similar centroids are merged.
The threshold δ can be adjusted to the needs of different applications and different videos, so as to obtain different numbers of shot key frames.
Steps S208 to S211 realize extracting, from each segmented video shot, the shot key frames that represent the shot's main content.
Embodiment three
Embodiment 3 of the present invention provides a shot key-frame extraction device for video analysis; its structure, as shown in Fig. 3, includes: an acquisition module 301, a shot segmentation module 302, and a first extraction module 303.
The acquisition module 301 is used to obtain the video file to be analyzed.
The shot segmentation module 302 is used to calculate, within the set sliding window, the distance between two adjacent video frames; determine, from the distances between adjacent video frames in each sliding window, the cut points for segmenting the video file into shots; and divide the video file into a number of video shots according to the determined cut points.
The first extraction module 303 is used to extract, from each segmented video shot, the shot key frames that represent the shot's main content.
Preferably, the shot segmentation module 302 is specifically configured to calculate the histogram distance between two adjacent video frames from their color histograms, or to calculate the Euclidean distance between two adjacent video frames from their binary images.
Preferably, the shot segmentation module 302 is specifically configured to determine, within each sliding window, the maximum distance between two adjacent video frames and the average distance between video frames; judge whether the ratio of the maximum distance to the average distance exceeds the set distance-change threshold; and, if it does, determine that the two adjacent video frames are a video cut point, otherwise considering that no cut point exists in the sliding window.
Preferably, the first extraction module 303 is specifically configured to, for each segmented video shot: assign each video frame contained in the shot to a different video-frame cluster; extract from each cluster the frame closest to the cluster centroid as the cluster's representative frame; and form the shot key frames from all extracted representative frames.
Preferably, the first extraction module 303 is specifically configured to, for each video frame in the shot: calculate the distance between the current frame and the centroid of each established video-frame cluster, not adding the current frame to a cluster whose distance exceeds that cluster's distance threshold and otherwise recording the cluster as a candidate cluster for the current frame; form a new video-frame cluster with the current frame as its centroid if the distances between the current frame and the centroids of all established clusters exceed the threshold; and otherwise select from the recorded candidate clusters the cluster most similar to the current frame and add the frame to it.
Embodiment four
Embodiment 4 of the present invention provides a scene key-frame extraction method in video analysis; its flow, as shown in Fig. 4, comprises the following steps:
Step S401: obtain the video file to be analyzed.
Step S402: divide the video file into a number of video shots, and extract from each segmented shot the shot key frames that represent the shot's main content.
For the process of dividing the video file into shots and extracting the shot key frames, refer to the descriptions of Embodiment 1 and Embodiment 2.
Step S403: perform key-frame clustering on the shot key frames contained in the segmented video shots, assigning each shot key frame to a different key-frame cluster.
Each shot may contain one or more shot key frames; cluster analysis is performed on these key frames to determine the key-frame cluster to which each belongs.
For news video, according to its characteristics (a complete news segment is usually contained between two anchor appearances), the correct anchor shots are extracted, and the shots between two temporally consecutive but different anchor shots form a scene, completing the scene segmentation of the news video.
Step S404: combine temporally consecutive video shots whose shot key frames belong to the same key-frame cluster into a video scene.
When a video shot has more than one shot key frame, the key-frame cluster of each shot key frame is determined, and the cluster containing the most of the shot's key frames serves as the shot's key-frame cluster; it is used to determine whether the shot and its temporally adjacent shots belong to the same key-frame cluster.
For example, suppose a video shot has 10 shot key frames. If 7 of them belong to the same key-frame cluster as the following shot, the shot and the following shot belong to the same video scene; if 3 of them belong to the same key-frame cluster as the preceding shot, the shot and the preceding shot do not belong to the same video scene.
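The majority vote over a shot's key-frame clusters and the merging of consecutive shots can be sketched as follows; the cluster labels and function names are illustrative placeholders.

```python
from collections import Counter

def shot_cluster(key_frame_labels):
    """Majority vote: a shot's key-frame cluster is the cluster
    containing the most of the shot's key frames."""
    return Counter(key_frame_labels).most_common(1)[0][0]

def group_into_scenes(shots):
    """Merge temporally consecutive shots whose majority key-frame cluster
    is identical into one scene. Input: per shot, the list of cluster
    labels of its shot key frames. Output: a list of shot-index lists."""
    scenes, prev = [], None
    for i, labels in enumerate(shots):
        label = shot_cluster(labels)
        if scenes and label == prev:
            scenes[-1].append(i)   # same cluster as previous shot: same scene
        else:
            scenes.append([i])     # cluster changed: a new scene starts
        prev = label
    return scenes

# The example above: 7 of 10 key frames share the next shot's cluster "A".
shots = [["A"] * 7 + ["B"] * 3, ["A"], ["C"]]
print(group_into_scenes(shots))  # [[0, 1], [2]]
```

The first shot votes "A" (7 of 10), so it merges with the following "A" shot into one scene, while the "C" shot starts a new one, mirroring the 10-key-frame example in the text.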
Step S405: extract, from each segmented video scene, the scene key frames that represent the scene's main content.
For each segmented video scene:
assign each video frame contained in the scene to a different video-frame cluster;
extract from each cluster the frame closest to the cluster centroid as the cluster's representative frame; and
form the scene key frames from all extracted representative frames.
Embodiment five
Embodiment 5 of the present invention provides a scene key-frame extraction method in video analysis; its flow, as shown in Fig. 5, comprises the following steps:
Step S501: obtain the video file to be analyzed.
Step S502: divide the video file into a number of video shots, and extract from each segmented shot the shot key frames that represent the shot's main content.
Step S503: perform key-frame clustering on the shot key frames contained in the segmented video shots, assigning each shot key frame to a different key-frame cluster.
For each camera lens key frame:
The distance of current key frame and the barycenter of the key frame cluster of setting is calculated, if the distance is more than the pass of setting
Key frame cluster distance threshold, then current key frame be added without the key frame cluster in;Otherwise key frame cluster is recorded to work as
The alternative key frame cluster of preceding key frame;
If the distance of current key frame and the barycenter of the key frame cluster of all settings of setting is all higher than the threshold of setting
Value, then new key frame is formed as barycenter using present frame and clustered;
Otherwise the selection key maximum with current key frame similarity from the alternative key frame cluster of the present frame of record
Frame cluster adds.
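The candidate-based rule of step S503 amounts to a single online (leader-style) clustering pass. Below is a minimal Python sketch under assumptions the patent does not fix: key frames are numeric feature vectors, the distance is Euclidean, "greatest similarity" means smallest distance, and centroids are maintained as running means. Function and variable names are illustrative only.

```python
import numpy as np

def online_cluster(features, threshold):
    """Leader-style online clustering sketch of step S503.

    Each vector is compared with every existing cluster centroid; clusters
    within `threshold` become candidates and the vector joins the closest
    one. If no cluster is close enough, the vector seeds a new cluster with
    itself as centroid. Returns one cluster label per input vector.
    """
    centroids, counts, labels = [], [], []
    for x in np.asarray(features, dtype=float):
        dists = [np.linalg.norm(x - c) for c in centroids]
        cand = [i for i, d in enumerate(dists) if d <= threshold]
        if not cand:                          # every cluster is too far away
            centroids.append(x.copy())
            counts.append(1)
            labels.append(len(centroids) - 1)
        else:                                 # join the closest candidate
            i = min(cand, key=lambda j: dists[j])
            counts[i] += 1
            centroids[i] += (x - centroids[i]) / counts[i]  # running mean
            labels.append(i)
    return labels
```

Note the single-pass design: each key frame is assigned exactly once, so the result depends on input order, which matches the sequential description in the patent.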
Step S504: combine temporally consecutive video shots whose shot key frames belong to the same key frame cluster into a video scene.
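Step S504 is essentially a run-length grouping of per-shot cluster labels. A small Python sketch (illustrative names; shots are assumed to be given in temporal order, each already labelled with the key frame cluster of its shot key frame):

```python
from itertools import groupby

def shots_to_scenes(shot_cluster_ids):
    """Run-length grouping sketch of step S504.

    `shot_cluster_ids` lists, in temporal order, the key-frame cluster id
    of each video shot. Temporally consecutive shots with the same cluster
    id are merged into one scene; the result is a list of
    (first_shot, last_shot) index pairs, end inclusive, one per scene.
    """
    scenes, start = [], 0
    for _cid, run in groupby(shot_cluster_ids):
        n = len(list(run))                 # length of this run of equal ids
        scenes.append((start, start + n - 1))
        start += n
    return scenes
```

So shots labelled `[0, 0, 1, 1, 1, 0]` yield three scenes: shots 0-1, shots 2-4, and shot 5 on its own (the cluster id 0 recurring later starts a new scene, since only temporally consecutive shots are merged).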
Step S505: for each segmented video scene, perform the following steps:
Step S506: classify each video frame contained in the video scene into different video frame clusters.
For each video frame in the video scene:
compute the distance between the current frame and the centroid of an established video frame cluster; if the distance exceeds the set distance threshold of the video frame cluster, the current frame is not added to that cluster; otherwise, the cluster is recorded as a candidate cluster for the current frame;
if the distances between the current frame and the centroids of all established video frame clusters exceed the set threshold, a new video frame cluster is formed with the current frame as its centroid;
otherwise, the current frame is added to the recorded candidate cluster with the greatest similarity to it.
Step S507: extract, from each video frame cluster, the video frame nearest to the cluster centroid as the representative frame of that cluster.
Step S508: combine all extracted representative frames into the scene key frame.
Steps S505-S508 realize extracting, from each segmented video scene, a scene key frame capable of representing the main content of the scene. For a concrete algorithmic implementation, refer to the shot key frame extraction process.
Embodiment six
Embodiment Six of the present invention provides a scene key frame extraction device in video analysis, the structure of which is shown in Fig. 6 and which comprises: an acquisition module 601, a shot segmentation module 602, a first extraction module 603, a key frame clustering module 604, a scene segmentation module 605 and a second extraction module 606.
The acquisition module 601 is configured to obtain a video file to be analyzed.
The shot segmentation module 602 is configured to segment the video file into several video shots.
The first extraction module 603 is configured to extract, from each segmented video shot, a shot key frame capable of representing the main content of the shot.
The key frame clustering module 604 is configured to cluster the shot key frames contained in the segmented video shots, classifying each shot key frame into different key frame clusters.
The scene segmentation module 605 is configured to combine temporally consecutive video shots whose shot key frames belong to the same key frame cluster into a video scene.
The second extraction module 606 is configured to extract, from each segmented video scene, a scene key frame capable of representing the main content of the scene.
Preferably, the shot segmentation module 602 is specifically configured to: compute, within a set sliding window, the distance between two adjacent video frames; determine, according to the distances between the adjacent video frames in each sliding window, a cut point for segmenting the video file into shots; and segment the video file into several video shots according to the determined cut point.
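A cut-point detector of this kind might be sketched as below. This is an illustrative Python sketch only: it assumes per-frame color histograms as input, uses the L1 histogram distance of claim 2 and the max/mean ratio test of claim 3 over non-overlapping windows, and the window size and threshold are made-up defaults rather than values from the patent.

```python
import numpy as np

def find_cut_points(frames, window=10, ratio_threshold=3.0):
    """Sliding-window shot-boundary detection sketch.

    `frames` is a sequence of per-frame color histograms. Within each
    window, adjacent-frame L1 histogram distances are computed; if the
    window's maximum distance exceeds `ratio_threshold` times its mean
    distance, the frame after the maximum is reported as a cut point
    (i.e., the first frame of the new shot).
    """
    hists = np.asarray(frames, dtype=float)
    # L1 distance between each pair of adjacent frame histograms
    d = np.abs(np.diff(hists, axis=0)).sum(axis=1)
    cuts = []
    for start in range(0, len(d), window):
        w = d[start:start + window]
        if len(w) == 0 or w.mean() == 0:       # flat window: no cut possible
            continue
        if w.max() / w.mean() > ratio_threshold:
            cuts.append(start + int(np.argmax(w)) + 1)
    return cuts
```

With ten identical frames followed by ten frames of a different histogram, the single large adjacent-frame distance dominates its window's mean, and frame 10 is reported as the cut.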
Preferably, the key frame clustering module 604 is specifically configured to, for each shot key frame: compute the distance between the current key frame and the centroid of an established key frame cluster; if the distance exceeds the set distance threshold of the key frame cluster, not add the current key frame to that cluster, and otherwise record the cluster as a candidate cluster for the current key frame; if the distances between the current key frame and the centroids of all established key frame clusters exceed the set threshold, form a new key frame cluster with the current frame as its centroid; and otherwise add the current key frame to the recorded candidate cluster with the greatest similarity to it.
Preferably, the scene segmentation module 605 is specifically configured to: when a video shot contains more than one shot key frame, determine the key frame cluster to which each shot key frame belongs, and take the key frame cluster containing the most of the shot's key frames as the key frame cluster of the video shot, for determining whether the video shot and its temporally adjacent video shots belong to the same key frame cluster.
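The majority rule described for module 605, picking the cluster that holds the most of a shot's key frames, fits in one line. A hedged Python sketch with illustrative naming; the tie-breaking behavior is a choice of this sketch, since the patent does not specify one:

```python
from collections import Counter

def shot_cluster_id(keyframe_cluster_ids):
    """A shot with several key frames is assigned the key-frame cluster
    that contains the most of them. On a tie, the cluster encountered
    first among the shot's key frames wins (this sketch's own choice).
    """
    return Counter(keyframe_cluster_ids).most_common(1)[0][0]
```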
Preferably, the second extraction module 606 is specifically configured to, for each segmented video scene:
classify each video frame contained in the video scene into different video frame clusters;
extract, from each video frame cluster, the video frame nearest to the cluster centroid as the representative frame of that cluster; and
combine all extracted representative frames into the scene key frame.
Unless specifically stated otherwise, terms such as processing, computing, calculating, determining and displaying may refer to the actions and/or processes of one or more processing or computing systems or similar devices, which manipulate data represented as physical (e.g., electronic) quantities within the registers or memories of the processing system and transform it into other data similarly represented as physical quantities within the memories, registers or other such information storage, transmission or display devices of the processing system. Information and signals may be represented using any of a variety of different technologies and methods. For example, the data, instructions, commands, information, signals, bits, symbols and chips referred to in the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or particles, optical fields or particles, or any combination thereof.
It should be understood that the specific order or hierarchy of steps in the disclosed processes is an example of exemplary approaches. Based on design preferences, it should be appreciated that the specific order or hierarchy of steps in the processes may be rearranged without departing from the protection scope of the present disclosure. The accompanying method claims present elements of the various steps in an exemplary order, and are not meant to be limited to the specific order or hierarchy presented.
In the above detailed description, various features are grouped together in a single embodiment to streamline the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed embodiments require more features than are expressly recited in each claim. Rather, as the following claims reflect, the invention lies in less than all features of a single disclosed embodiment. Thus, the following claims are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate preferred embodiment of the present invention.
Those skilled in the art will further appreciate that the various illustrative logical blocks, modules, circuits and algorithm steps described in connection with the embodiments herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or as software depends upon the particular application and the design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as departing from the protection scope of the present disclosure.
The steps of a method or algorithm described in connection with the embodiments herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Alternatively, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC, and the ASIC may reside in a user terminal. Alternatively, the processor and the storage medium may reside as discrete components in a user terminal.
For a software implementation, the techniques described in this application may be implemented with modules (e.g., procedures, functions, and so on) that perform the functions described herein. The software codes may be stored in memory units and executed by processors. A memory unit may be implemented within the processor or external to the processor, in which latter case it can be communicatively coupled to the processor via various means, as is known in the art.
What has been described above includes examples of one or more embodiments. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the aforementioned embodiments, but one of ordinary skill in the art will recognize that many further combinations and permutations of the various embodiments are possible. Accordingly, the embodiments described herein are intended to embrace all such alterations, modifications and variations that fall within the protection scope of the appended claims. Furthermore, to the extent that the term "includes" is used in either the description or the claims, such term is intended to be inclusive in a manner similar to the term "comprising" as "comprising" is interpreted when employed as a transitional word in a claim. Furthermore, any use of the term "or" in the description or the claims is meant to denote a "non-exclusive or".
Claims (10)
- 1. A shot key frame extraction method in video analysis, characterized by comprising:
obtaining a video file to be analyzed;
within a set sliding window, computing the distance between two adjacent video frames;
determining, according to the distances between the adjacent video frames in each sliding window, a cut point for segmenting the video file into shots;
segmenting the video file into several video shots according to the determined cut point; and
extracting, from each segmented video shot, a shot key frame capable of representing the main content of the shot.
- 2. The method according to claim 1, characterized in that computing the distance between two adjacent video frames specifically comprises:
computing a histogram distance between the two adjacent video frames according to the color histograms of the video frames; or
computing a Euclidean distance between the two adjacent video frames according to the binary images of the video frames.
- 3. The method according to claim 1, characterized in that determining, according to the distances between the adjacent video frames in each sliding window, a cut point for segmenting the video file into shots specifically comprises:
determining the maximum of the distances between two adjacent video frames in each sliding window and the average of the distances between the video frames;
judging whether the ratio of the maximum distance to the average distance exceeds a set distance variation threshold; and
when the judgment is affirmative, determining that the two adjacent video frames form a video cut point, and otherwise considering that no cut point exists in the sliding window.
- 4. The method according to any one of claims 1-3, characterized in that extracting, from each segmented video shot, a shot key frame capable of representing the main content of the shot specifically comprises:
for each segmented video shot:
classifying each video frame contained in the video shot into different video frame clusters;
extracting, from each video frame cluster, the video frame nearest to the cluster centroid as the representative frame of that cluster; and
combining all extracted representative frames into the shot key frame.
- 5. The method according to claim 4, characterized in that classifying each video frame contained in the video shot into different video frame clusters specifically comprises:
for each video frame in the video shot:
computing the distance between the current frame and the centroid of an established video frame cluster; if the distance exceeds the set distance threshold of the video frame cluster, not adding the current frame to that cluster, and otherwise recording the cluster as a candidate cluster for the current frame;
if the distances between the current frame and the centroids of all established video frame clusters exceed the set threshold, forming a new video frame cluster with the current frame as its centroid; and
otherwise adding the current frame to the recorded candidate cluster with the greatest similarity to the current frame.
- 6. A shot key frame extraction device in video analysis, characterized by comprising:
an acquisition module, configured to obtain a video file to be analyzed;
a shot segmentation module, configured to compute, within a set sliding window, the distance between two adjacent video frames, determine, according to the distances between the adjacent video frames in each sliding window, a cut point for segmenting the video file into shots, and segment the video file into several video shots according to the determined cut point; and
a first extraction module, configured to extract, from each segmented video shot, a shot key frame capable of representing the main content of the shot.
- 7. The device according to claim 6, characterized in that the shot segmentation module is specifically configured to:
compute a histogram distance between two adjacent video frames according to the color histograms of the video frames; or
compute a Euclidean distance between two adjacent video frames according to the binary images of the video frames.
- 8. The device according to claim 6, characterized in that the shot segmentation module is specifically configured to:
determine the maximum of the distances between two adjacent video frames in each sliding window and the average of the distances between the video frames;
judge whether the ratio of the maximum distance to the average distance exceeds a set distance variation threshold; and
when the judgment is affirmative, determine that the two adjacent video frames form a video cut point, and otherwise consider that no cut point exists in the sliding window.
- 9. The device according to any one of claims 6-8, characterized in that the first extraction module is specifically configured to:
for each segmented video shot:
classify each video frame contained in the video shot into different video frame clusters;
extract, from each video frame cluster, the video frame nearest to the cluster centroid as the representative frame of that cluster; and
combine all extracted representative frames into the shot key frame.
- 10. The device according to claim 9, characterized in that the first extraction module is specifically configured to:
for each video frame in the video shot:
compute the distance between the current frame and the centroid of an established video frame cluster; if the distance exceeds the set distance threshold of the video frame cluster, not add the current frame to that cluster, and otherwise record the cluster as a candidate cluster for the current frame;
if the distances between the current frame and the centroids of all established video frame clusters exceed the set threshold, form a new video frame cluster with the current frame as its centroid; and
otherwise add the current frame to the recorded candidate cluster with the greatest similarity to the current frame.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610533693.XA CN107590419A (en) | 2016-07-07 | 2016-07-07 | Camera lens extraction method of key frame and device in video analysis |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610533693.XA CN107590419A (en) | 2016-07-07 | 2016-07-07 | Camera lens extraction method of key frame and device in video analysis |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107590419A true CN107590419A (en) | 2018-01-16 |
Family
ID=61046517
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610533693.XA Pending CN107590419A (en) | 2016-07-07 | 2016-07-07 | Camera lens extraction method of key frame and device in video analysis |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107590419A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110166829A (en) * | 2019-05-15 | 2019-08-23 | 上海商汤智能科技有限公司 | Method for processing video frequency and device, electronic equipment and storage medium |
CN110769259A (en) * | 2019-11-05 | 2020-02-07 | 智慧视通(杭州)科技发展有限公司 | Image data compression method for tracking track content of video target |
CN111563487A (en) * | 2020-07-14 | 2020-08-21 | 平安国际智慧城市科技股份有限公司 | Dance scoring method based on gesture recognition model and related equipment |
CN111768469A (en) * | 2019-11-13 | 2020-10-13 | 中国传媒大学 | Data visualization color matching extraction method based on image clustering |
CN114650447A (en) * | 2022-03-22 | 2022-06-21 | 中国电子技术标准化研究院 | Method and device for determining video content abnormal degree and computing equipment |
CN114979742A (en) * | 2021-02-24 | 2022-08-30 | 腾讯科技(深圳)有限公司 | Video processing method, device, equipment and storage medium |
CN116405745A (en) * | 2023-06-09 | 2023-07-07 | 深圳市信润富联数字科技有限公司 | Video information extraction method and device, terminal equipment and computer medium |
- 2016-07-07: CN CN201610533693.XA patent CN107590419A/en (active, Pending)
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020228418A1 (en) * | 2019-05-15 | 2020-11-19 | 上海商汤智能科技有限公司 | Video processing method and device, electronic apparatus, and storage medium |
CN110166829A (en) * | 2019-05-15 | 2019-08-23 | 上海商汤智能科技有限公司 | Method for processing video frequency and device, electronic equipment and storage medium |
CN110769259A (en) * | 2019-11-05 | 2020-02-07 | 智慧视通(杭州)科技发展有限公司 | Image data compression method for tracking track content of video target |
CN111768469A (en) * | 2019-11-13 | 2020-10-13 | 中国传媒大学 | Data visualization color matching extraction method based on image clustering |
CN111768469B (en) * | 2019-11-13 | 2024-05-28 | 中国传媒大学 | Image clustering-based data visual color matching extraction method |
CN111563487A (en) * | 2020-07-14 | 2020-08-21 | 平安国际智慧城市科技股份有限公司 | Dance scoring method based on gesture recognition model and related equipment |
CN111563487B (en) * | 2020-07-14 | 2020-10-23 | 平安国际智慧城市科技股份有限公司 | Dance scoring method based on gesture recognition model and related equipment |
CN114979742B (en) * | 2021-02-24 | 2024-04-09 | 腾讯科技(深圳)有限公司 | Video processing method, device, equipment and storage medium |
CN114979742A (en) * | 2021-02-24 | 2022-08-30 | 腾讯科技(深圳)有限公司 | Video processing method, device, equipment and storage medium |
CN114650447A (en) * | 2022-03-22 | 2022-06-21 | 中国电子技术标准化研究院 | Method and device for determining video content abnormal degree and computing equipment |
CN114650447B (en) * | 2022-03-22 | 2024-05-14 | 中国电子技术标准化研究院 | Method and device for determining video content abnormality degree and computing equipment |
CN116405745B (en) * | 2023-06-09 | 2023-11-17 | 深圳市信润富联数字科技有限公司 | Video information extraction method and device, terminal equipment and computer medium |
CN116405745A (en) * | 2023-06-09 | 2023-07-07 | 深圳市信润富联数字科技有限公司 | Video information extraction method and device, terminal equipment and computer medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107590420A (en) | Scene extraction method of key frame and device in video analysis | |
CN107590419A (en) | Camera lens extraction method of key frame and device in video analysis | |
CN109284729B (en) | Method, device and medium for acquiring face recognition model training data based on video | |
CN111460153B (en) | Hot topic extraction method, device, terminal equipment and storage medium | |
CN110083741B (en) | Character-oriented video abstract extraction method based on text and image combined modeling | |
CN107193962B (en) | Intelligent map matching method and device for Internet promotion information | |
US11748401B2 (en) | Generating congruous metadata for multimedia | |
US20210141826A1 (en) | Shape-based graphics search | |
TW201915787A (en) | Search method and processing device | |
CN111062871A (en) | Image processing method and device, computer equipment and readable storage medium | |
CN108520046B (en) | Method and device for searching chat records | |
Xu et al. | A supervoxel approach to the segmentation of individual trees from LiDAR point clouds | |
CN107590150A (en) | Video analysis implementation method and device based on key frame | |
US20160335493A1 (en) | Method, apparatus, and non-transitory computer-readable storage medium for matching text to images | |
US20150332117A1 (en) | Composition modeling for photo retrieval through geometric image segmentation | |
CN108897824A (en) | Point of interest spatial topotaxy construction method, device and storage medium | |
CN110489574B (en) | Multimedia information recommendation method and device and related equipment | |
CN110688524A (en) | Video retrieval method and device, electronic equipment and storage medium | |
CN105956051A (en) | Information finding method, device and system | |
Zhang et al. | Image composition assessment with saliency-augmented multi-pattern pooling | |
CN113221983A (en) | Training method and device for transfer learning model, and image processing method and device | |
CN110147460B (en) | Three-dimensional model retrieval method and device based on convolutional neural network and multi-view map | |
US8270731B2 (en) | Image classification using range information | |
CN108958592B (en) | Video processing method and related product | |
CN108024148B (en) | Behavior feature-based multimedia file identification method, processing method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20180116 |