CN102694966A - Construction method of full-automatic video cataloging system - Google Patents


Info

Publication number: CN102694966A
Authority: CN (China)
Prior art keywords: frame, video, key frame, metadata, shot
Legal status: Granted
Application number: CN2012100548125A
Other languages: Chinese (zh)
Other versions: CN102694966B (en)
Inventor: 蔡靖
Current Assignee: Tianjin University of Technology
Original Assignee: Tianjin University of Technology
Application filed by Tianjin University of Technology
Priority to CN201210054812.5A
Publication of CN102694966A; application granted and published as CN102694966B
Current legal status: Expired - Fee Related

Abstract

The invention discloses a fully automatic video cataloging system and method. The system automatically builds a metadata database for large volumes of unstructured media material, supports manual modification, addition, and refinement of the metadata on that basis, and ultimately enables effective management of massive media assets through the completed metadata database. The invention covers the functions and architecture of the fully automatic video cataloging system, a robust automatic extraction algorithm for video key frames, and a robust automatic extraction algorithm for shot-boundary frames. Software implemented with the proposed method achieves real-time processing on current mainstream computers. The proposed robust key frame and shot-boundary extraction algorithms resist the interference of camera flashes and of special effects that frequently appear in video, including wipes, fade-in/fade-out, and dissolves, so that these disturbances do not cause excessive missed or false detections.

Description

Construction method of a fully automatic video cataloging system
Technical field
The present invention relates to multimedia technology, and in particular to the creation and effective management of metadata for large volumes of media material.
Background art
Modern society produces enormous amounts of information of every kind each day, and that information grows at high speed. In the film and television industry in particular, large quantities of multimedia material are produced and accumulated. Multimedia data, however, is inherently unstructured, and managing it effectively is a pressing problem facing the media industry.
Media Asset Management (MAM) is an end-to-end solution for the comprehensive lifetime management of media of all kinds and of their content (video/audio data, text, charts, and so on), covering the whole process of collection, cataloging, management, transmission, and transcoding of digital media. It fully satisfies the media asset owner's need to collect, preserve, search, edit, and publish all kinds of information, provides media asset users with online content and convenient access, and preserves and exploits media assets safely, completely, efficiently, and at low cost.
The objects of media asset management, multimedia data, are carriers of information of every style, for example text, graphics, images, and sound. Their salient characteristics are: (1) multimedia data (mostly unstructured) is extremely varied, originates from different media, and comes in completely different forms and formats; (2) the data volume is huge; (3) multimedia data has temporal characteristics and version semantics and, unlike traditional numeric and character data, is unstructured information.
Media asset metadata is the information that describes a media asset. The quality, quantity, uniqueness, accessibility, and intelligibility of the metadata determine the success or failure of a media asset management system, so a sound metadata collection and production system is crucial.
The main function of a fully automatic media material cataloging system is precisely to collect and produce multimedia metadata, thereby providing effective information support to a media asset management system. Traditionally, media asset metadata is produced manually: operators watch the video content, annotate it by hand, and then generate the metadata. This is time-consuming, laborious, and inefficient, so the industry urgently needs an automatic metadata production method to replace manual work and raise efficiency. At the same time, the complexity of media asset content and the growing use of certain effects (fade-in/fade-out and wipe transitions, the influence of camera flashes in the footage, and so on) make automatic video segmentation very difficult. How to extract video key frames and shot-boundary frames efficiently and accurately remains an unsolved technical problem.
Automatic shot segmentation: a shot corresponds to one start-stop recording operation of a camera and represents an action that is continuous in time and space within a scene. Many types of transition can occur between shots; the most common is the hard cut, but there are also complex transitions such as fade-in/fade-out, dissolve, and wipe. The shot is the basic unit of material editing, and indexing by shot makes information retrieval convenient.
Automatic key frame extraction: in practice, the start and end of a shot are sometimes far apart, and the content and scene inside the shot can change considerably. Segmenting by shot alone is therefore not enough to capture all the important information in the material. To remedy this inherent shortcoming of shot fragments, the notion of a key frame segment is defined. Key frame segment analysis is based on correlation analysis of the video content (rather than on the physical start-stop action of the camera) and extracts representative key frames according to the complexity of the content, effectively serving as a video summary. A key frame segment is the smallest meaningful unit of a video fragment; the frames inside it have very similar content and hence high information redundancy, so a key frame can represent the information of all frames in its segment.
Summary of the invention
The purpose of the present invention is to address the time-consuming and laborious manual cataloging of large volumes of media material by proposing, with the help of advanced computer video analysis, a construction method for a fully automatic video cataloging system. Such a system saves a great deal of manpower and improves both the quality and the efficiency of video cataloging.
The invention discloses a flexible and efficient fully automatic video cataloging system together with the associated intelligent algorithms. The system automatically analyzes video content, extracts both shot-boundary frames, which arise from the physical start-stop of the camera, and key frames, which represent the content of a video segment, and generates a metadata database of the media material on the basis of the extracted key frames and shot-boundary frames. The proposed robust key frame and shot-boundary extraction algorithms resist the wipes, fade-in/fade-out and dissolve effects, and camera-flash disturbances that frequently occur in video, so these do not cause excessive missed or false detections.
First, the invention provides a functional framework for building the video cataloging system and defines the functional modules of each subsystem.
Second, the invention provides a fully automatic video key frame extraction algorithm that suppresses wipes, fade-in/fade-out and dissolve effects, and camera-flash influence well.
Third, the invention provides a fully automatic shot-boundary frame extraction algorithm with the same suppression properties.
To this end, the construction method of the fully automatic video cataloging system provided by the invention comprises:
1st, the construction of the fully automatic video cataloging system; the system comprises:
1.1st, a media acquisition module, which samples video stream data through a video capture card;
1.2nd, a media analysis module, which applies computer vision techniques to the video stream collected in step 1.1 to perform intelligent analysis, extracts key frames or shot-boundary frames, and displays them on the storyboard for further editing and processing by the user;
1.3rd, a metadata creation, editing, and management module, which, on the basis of the key frames and shot-boundary frames extracted by the media analysis module in step 1.2, forms media material metadata organized by video segment and accompanied by descriptive information; this metadata is the final output of the cataloging system and the main basis for the effective management of unstructured media assets; metadata editing and management comprise creation and modification of the metadata model file, metadata display, metadata editing, metadata cache management, and metadata saving;
1.3.1, the metadata model file, which defines the content and organization of the metadata produced when the cataloging system structurally analyzes the media material, including segment description, content classification, video author, and cataloger; the model file is stored as XML, the system provides a default file, and the user can modify it manually as needed;
1.3.2, metadata editing, comprising:
◆ key frame and shot-boundary frame editing: the user adds or deletes key frames or shot-boundary frames shown on the storyboard as the situation requires;
◆ video segment editing: starting from the key frames or shot-boundary frames extracted by the system, the user merges the content of two or more of them into video segments that are independent and meaningful in content, and adds the corresponding information according to the format defined by the metadata model file;
1.3.3, metadata display, comprising:
◆ key frame or shot-boundary frame display: the system shows the automatically extracted key frames or shot-boundary frames on the storyboard;
◆ segment content display: the system shows the edited segment information on the user interface;
1.3.4, metadata cache management, comprising:
◆ during automatic video and audio analysis, frequent hard disk access is avoided and results are written to disk only when the analysis completes;
◆ system RAM is used as the cache, with enough memory to support several hours of continuous metadata analysis;
◆ when a long continuous analysis exhausts memory, the system page file is used as a shared mapped file;
1.3.5, metadata saving, comprising:
◆ automatically produced or manually revised metadata is saved to the storage medium as an XML document;
◆ key frames and shot-boundary frames are saved to a designated folder;
1.4th, a configuration management module: the system stores configuration information in a configuration file in XML form, which the user can edit by hand or through the user interface; the file is read at system initialization to configure the software modules; the configuration information comprises:
● input device configuration: video capture card or media file on disk;
● algorithm selection: key frame or shot-boundary frame extraction;
● key frame extraction parameters, comprising:
■ key frame extraction sensitivity;
■ maximum or minimum frame gap: the interval between adjacent key frames is constrained to be no less than, or no more than, a specified interval.
2nd, robust automatic extraction of video key frames from the media material, comprising:
2.1st, preprocessing, with two parts: the input video is first spatially downsampled within each frame to reduce algorithmic complexity; inter-frame correlation is then used to obtain binarized frame-difference images, which are used to filter out a set of candidate key frames;
2.2nd, information extraction: the candidate key frame set from step 2.1 is processed further; histogram features are extracted, and divergence measures computed after a Fourier transform of the histograms, both between adjacent frames and between the frame under test and the previous key frame, are used to judge and extract key frames; within this step, fade-in/fade-out effects, wipe effects, and camera-flash influence are separately detected and suppressed;
2.2.1, fade-in/fade-out segments are detected by estimating the rate of change of the inter-frame luminance signal, which varies linearly within such segments;
2.2.2, wipe segments are detected by exploiting the spatial regularity of the inter-frame wipe region: both the wipe region between frames and the spatial evolution of the wiped region over the whole segment are examined;
2.2.3, camera-flash segments are detected by exploiting the short duration of a flash: the luminance difference between adjacent frames is large while the difference across the flash, between the frames on either side, is small;
2.3rd, information analysis: based on the feature information and special-scene detection results from step 2.2, a final comprehensive analysis selects the key frames that best represent the segment content;
2.3.1, for a fade-in/fade-out frame sequence, the last frame, where the effect completes, is output as the key frame;
2.3.2, for a wipe frame sequence, the last frame, where the effect completes, is output as the key frame;
2.3.3, frames affected by camera flash are filtered directly out of the candidate sequence;
2.3.4, for the remaining ordinary candidate key frame sequences, within each segment formed by a run of consecutive candidate frames, the frame whose histogram-Fourier divergence from the previous key frame is largest is output as the key frame.
3rd, robust automatic extraction of shot-boundary frames from the media material, comprising:
3.1st, preprocessing, with two parts: the input video is first spatially downsampled within each frame to reduce algorithmic complexity; inter-frame correlation is then used to obtain binarized frame-difference images, which are used to filter out a set of candidate shot-boundary frames;
3.2nd, information extraction: the candidate shot-boundary frame set from step 3.1 is processed further: histogram features are extracted, and a decayed-average computation produces the decayed histogram mean and the statistical variance from the start frame of the current shot up to the current frame; the inter-frame χ² histogram difference between the current frame and the decayed histogram mean is then computed and the statistical variance updated; a dynamic decision threshold derived from this continuously updated variance is used to judge shot-boundary frames; within this step, fade-in/fade-out effects, wipe effects, and camera-flash influence are separately detected and suppressed: fades are detected by estimating the rate of the linear inter-frame luminance change within such segments; wipes are detected from the spatial regularity of the inter-frame wipe region and its evolution over the whole segment; flashes are detected from the large adjacent-frame luminance difference combined with the small difference between the frames on either side of the flash;
3.3rd, information analysis: based on the feature information and special-scene detection results from step 3.2, a final comprehensive analysis judges the shot-boundary frames, including the special cases of fade-in/fade-out, wipe, and camera flash; for a detected fade sequence, the last frame is output; for a detected wipe sequence, the last frame is output; frames affected by camera flash are filtered directly out of the candidate sequence; for the remaining ordinary candidates, a frame is judged a shot boundary only if it satisfies both of the following conditions:
● mutation condition: the χ² difference between the current frame and the decayed histogram mean exceeds a mutation threshold determined by the statistical variance;
● smoothness condition: the χ² difference between the current frame and the decayed histogram mean exceeds a steadiness threshold determined by the χ² difference between the current frame and the following frame.
Advantages and positive effects of the present invention:
The invention uses advanced computer video analysis to analyze video content automatically and in real time and, according to the user's needs, extracts key frames and shot-boundary frames. On this basis it supports manual editing and refinement, building a content-based metadata database of the media material and supplying the downstream media asset management system with rich metadata. The disclosed robust key frame and shot-boundary extraction algorithms suppress the interference of special scenes such as fade-in/fade-out, dissolve, and wipe effects and camera flashes. Because raw video content is unstructured, conventional manual segmentation and cataloging waste time and effort; adopting the solution of the present invention saves substantial cost and social resources.
Description of drawings
Fig. 1 is the functional block diagram of the system architecture of the present invention.
Fig. 2 is the functional block diagram of a specific embodiment of the present invention.
Fig. 3 is an example of a fade-in/fade-out effect.
Fig. 4 is an example of a wipe effect.
Fig. 5 is an example of camera-flash influence.
Fig. 6 is an example of the metadata model file.
Fig. 7 is an example of the metadata configuration file.
Fig. 8 is the system flow of the robust key frame extraction algorithm.
Fig. 9 is the flow of the preprocessing submodule for key frame or shot-boundary frame extraction.
Fig. 10 is the flow of the fade-in/fade-out detection submodule.
Fig. 11 is the flow of the wipe detection submodule.
Fig. 12 is the flow of the camera-flash detection submodule.
Fig. 13 is the system flow of the robust shot-boundary frame extraction algorithm.
Detailed description of the embodiments
One, the fully automatic video cataloging system framework, shown in Fig. 1, comprises:
1. Media acquisition module
This subsystem samples video stream data into computer memory through a video capture card and its driver and passes the data to the media analysis module.
2. Media analysis module
This module selects, according to the configured options, the appropriate processing algorithms and their control parameters for the collected video stream, performs intelligent analysis and key frame or shot-boundary frame extraction, and displays the extracted frames together with their timecode information in the storyboard area of the user interface for further editing and processing by the user.
3. Media metadata creation, editing, and management module
Media metadata management comprises creation and modification of the metadata model file, metadata display, metadata editing, metadata cache management, and metadata saving.
The metadata model file defines the content and organization of the metadata produced when the cataloging system structurally analyzes the media material (segment description, content classification, video author, cataloger, and so on). It is stored as XML; the system provides a default file, which the user can modify by hand as needed. Fig. 6 shows an example of a metadata model file.
Metadata editing comprises:
◆ key frame and shot-boundary frame editing: the system displays each key frame or shot-boundary frame on a dialog control, and operations on this control modify or delete key frames or shot-boundary frames;
◆ video segment editing: through the dialog controls that display key frames or shot-boundary frames, the system lets the user compose video segments from them and provides metadata editing for each segment; the metadata is shown in the metadata list control.
Metadata display comprises:
◆ key frame or shot-boundary frame display: the system shows each key frame or shot-boundary frame on its own dialog control, and all these controls are attached to a storyboard view;
◆ segment content display: the system shows the edited segment information (segment description, content classification, video author, cataloger, and so on) in the metadata list control, whose content is stored on the storage medium as an XML metadata file.
Metadata cache management comprises:
◆ during automatic audio-visual analysis, frequent hard disk access is avoided and results are written to disk only when the analysis completes;
◆ system RAM is used as the cache, with enough memory to support several hours of continuous metadata analysis;
◆ when a long continuous analysis exhausts memory, the system page file is used as a shared mapped file.
Metadata saving comprises:
◆ automatically produced or manually revised metadata is saved to the storage medium as an XML document;
◆ key frames and shot-boundary frames are saved to a designated folder.
4. Configuration management module
The fully automatic video cataloging system stores configuration information in a configuration file in XML form. The system provides a user interface that displays and edits the configuration as dialog boxes, and the user may also modify the configuration file directly by hand. The file is read at system initialization to configure the software modules. Fig. 7 shows an example configuration file. The configuration information comprises:
● input device configuration: video capture card or media file on disk;
● algorithm selection: key frame or shot-boundary frame extraction;
● key frame extraction parameters, comprising:
■ key frame extraction sensitivity;
■ maximum or minimum frame gap: the interval between adjacent key frames is constrained to be no less than, or no more than, a specified interval.
Two, the robust video key frame extraction method provided by the invention comprises:
1. Processing flow design
The key frame extraction method is illustrated in Fig. 8. The overall method divides into three parts: a preprocessing submodule, an information extraction submodule, and an information analysis submodule.
◆ The preprocessing submodule first applies simple, fast preprocessing to each input video frame, including spatial downsampling and temporal pre-analysis, and filters out a rough set of candidate key frames.
◆ The information extraction submodule has two functions:
● First, it processes the candidate key frame set from the previous step further, extracts feature information, filters the candidate frame sequence again using the intra-frame and inter-frame features of each frame, and records each run of consecutive candidate frames together with its features as a single candidate segment for later analysis and key frame extraction.
● Second, it performs dedicated processing and detection for special scenes, comprising:
■ camera-flash effect detection;
■ fade-in/fade-out and dissolve effect detection;
■ inter-frame wipe effect detection.
◆ The information analysis submodule, building on the detection and processing above, applies a different strategy to each case and extracts the final video key frames.
1.1. Preprocessing submodule: obtaining candidate segment start frames
The preprocessing subsystem, illustrated in Fig. 9, comprises two steps:
1) Spatial downsampling to improve processing efficiency.
To improve efficiency without harming the analysis, the input video stream is first downsampled. In the implementation, an input stream of resolution 720 × 576 is downsampled 8 × 8, yielding a 90 × 72 video stream.
2) Pre-analysis. Key frames identify and describe the appearance of new scenes or new content in the video, so the many frames whose content is highly similar to the last key frame must be filtered out in this step. To this end:
● First define a binary template M(i, j) whose size equals the downsampled frame size, i.e. 90 × 72, with every point initialized to 0.
● For each frame, consider its luminance signal I_t and compute the two binarized frame-difference images below, using a morphological open-close operator OC(·) for noise reduction and detail removal. The threshold is T = 10.
■ Adjacent-frame binarized difference image:
d_{t,t-1}(i,j) = 0 if |I_t(i,j) - I_{t-1}(i,j)| < T, and 1 otherwise
d'_{t,t-1}(i,j) = OC(d_{t,t-1}(i,j))
■ Binarized difference image between the current frame I_t and the previous key frame I_k (if no key frame has been detected yet, an all-black frame stands in for the previous key frame):
d_{t,k}(i,j) = 0 if |I_t(i,j) - I_k(i,j)| < T or |I_{t-1}(i,j) - I_k(i,j)| < T, and 1 otherwise
d'_{t,k}(i,j) = OC(d_{t,k}(i,j))
● Update the template M(i, j):
M_new(i,j) = M_old(i,j) ∪ d'_{t,k}(i,j)
● Count the 1-pixels in the template M and in d_{t,t-1} and d_{t,k}:
N(M) = Σ_{i,j} M(i,j)
N(d_{t,t-1}) = Σ_{i,j} d_{t,t-1}(i,j)
N(d_{t,k}) = Σ_{i,j} d_{t,k}(i,j)
When either of the following conditions holds, a new content segment is considered to have begun, and all frames from the current frame on are marked as candidate frames until a new key frame is selected (a sketch of this step follows below):
1) N(M) > size/5 and N(d_{t,t-1}) - N(d_{t,k}) > size/10
2) N(d_{t,t-1}) > size/20 and N(d_{t,t-1}) - N(d_{t,k}) > size/10
Here size is the number of pixels in a video frame.
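For concreteness, this pre-analysis step can be written in C++ roughly as follows. This is a minimal sketch, assuming 8-bit luma planes stored row-major in std::vector<uint8_t>; the morphological open-close cleanup is omitted, and the function names (binaryDiff, candidateStart) are illustrative, not taken from the patent.

    #include <cstdint>
    #include <cstdlib>
    #include <vector>

    using Mask = std::vector<uint8_t>;   // binary image, one byte per pixel

    // Binarized frame difference: 1 where |a - b| >= T, else 0 (T = 10 as in the text).
    Mask binaryDiff(const std::vector<uint8_t>& a, const std::vector<uint8_t>& b, int T = 10) {
        Mask d(a.size());
        for (size_t i = 0; i < a.size(); ++i)
            d[i] = std::abs(int(a[i]) - int(b[i])) < T ? 0 : 1;
        return d;                        // an open-close operator would clean this up
    }

    size_t countOnes(const Mask& m) {
        size_t n = 0;
        for (uint8_t v : m) n += v;
        return n;
    }

    // Decide whether the current frame starts a new candidate segment.
    // M is the accumulated template; prev/key are the previous frame and last key frame.
    bool candidateStart(Mask& M, const std::vector<uint8_t>& cur,
                        const std::vector<uint8_t>& prev, const std::vector<uint8_t>& key) {
        const size_t size = cur.size();  // pixels per (downsampled) frame
        Mask dtt1 = binaryDiff(cur, prev);
        Mask dtk  = binaryDiff(cur, key);
        for (size_t i = 0; i < size; ++i)   // M_new = M_old OR d'_{t,k}
            M[i] |= dtk[i];
        size_t nM = countOnes(M), n1 = countOnes(dtt1), nk = countOnes(dtk);
        bool c1 = nM > size / 5  && n1 > nk + size / 10;   // condition 1)
        bool c2 = n1 > size / 20 && n1 > nk + size / 10;   // condition 2)
        return c1 || c2;
    }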
1.2. Information extraction submodule
On the basis of the filtered candidate video segments, the information extraction subsystem extracts further feature information and, at the same time, detects and suppresses the special scenes and effects that disturb key frame extraction: camera-flash influence, fade-in/fade-out and dissolve effects, and wipe effects.
1) Basic information extraction and candidate key frame sequence recording
● Basic information extraction
The raw basic information is based on the input YUV video signal.
First, a histogram analysis of the Y, U, and V components of every frame yields the 256-bin color histograms h_y, h_u, h_v.
Extended to a frame sequence, this gives a histogram map h(t, i), where t is the frame number and i ∈ [0, 255] is the luminance or chrominance component value. With this, define the following.
■ For a frame sequence of length N, the histogram average is defined as:
Ave = (1/N) Σ_{t=0}^{N-1} h(t, i)
where h_y(i), h_u(i), h_v(i) are the Y, U, and V component histograms respectively.
■ The Fourier transforms of the color histogram components are defined as:
h_y(i) ↔ H_y(jω)
h_u(i) ↔ H_u(jω)
h_v(i) ↔ H_v(jω)
■ The histogram difference measure between frame m and frame n is defined as:
d_{m,n} = Σ_{ω=0}^{π/4} ( |H_y^m(jω) - H_y^n(jω)|² + |H_u^m(jω) - H_u^n(jω)|² + |H_v^m(jω) - H_v^n(jω)|² )
As the formula shows, only the low-frequency components of the frequency domain enter the difference measure. This serves two purposes (a sketch of the computation follows below):
◆ the modulus of a frequency-domain signal is insensitive to shifts along the axis, which here means insensitivity to global brightness shifts of the YUV signal, so camera flashes are already partly suppressed;
◆ analyzing only the low-frequency components of the frequency-domain information is equivalent to smoothing the original signal, which suppresses noise.
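A minimal C++ sketch of this histogram-Fourier difference measure. Following the shift-insensitivity rationale above, it compares the magnitudes of the low-frequency spectra; the naive DFT, the kMax = 32 cutoff (ω ≤ π/4 over 256 bins), and all names are illustrative assumptions, not the patent's own code.

    #include <cmath>
    #include <complex>
    #include <cstdint>
    #include <vector>

    const double kPi = 3.14159265358979323846;

    // 256-bin histogram of one component plane (Y, U, or V).
    std::vector<double> histogram(const std::vector<uint8_t>& plane) {
        std::vector<double> h(256, 0.0);
        for (uint8_t v : plane) h[v] += 1.0;
        return h;
    }

    // Naive DFT of a histogram, keeping only the low-frequency bins
    // (omega in [0, pi/4] corresponds to k in [0, 32] for 256 samples).
    std::vector<std::complex<double>> lowFreqDFT(const std::vector<double>& h, int kMax = 32) {
        std::vector<std::complex<double>> H(kMax + 1);
        for (int k = 0; k <= kMax; ++k)
            for (int i = 0; i < 256; ++i)
                H[k] += h[i] * std::polar(1.0, -2.0 * kPi * k * i / 256.0);
        return H;
    }

    // Difference measure between frames m and n: summed squared differences of the
    // low-frequency spectral magnitudes of their Y, U, and V histograms.
    double histFourierDiff(const std::vector<std::vector<double>>& hm,   // {hy, hu, hv} of m
                           const std::vector<std::vector<double>>& hn) { // {hy, hu, hv} of n
        double d = 0.0;
        for (int c = 0; c < 3; ++c) {
            auto Hm = lowFreqDFT(hm[c]), Hn = lowFreqDFT(hn[c]);
            for (size_t k = 0; k < Hm.size(); ++k) {
                double diff = std::abs(Hm[k]) - std::abs(Hn[k]);  // moduli: shift-insensitive
                d += diff * diff;
            }
        }
        return d;
    }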
● Choosing the candidate key frame sequence
The histogram difference measure is used to choose the key frame candidate frames.
To make the detected key frames more representative, the detector finds a continuous key frame sequence and then chooses the most representative frame within that sequence as the key frame. The detection therefore has two steps:
● detection of the key frame segment start frame;
● detection of the key frame segment end frame.
Detection of the key frame segment start frame:
1. The first frame of the video is defined to be a key frame.
2. The histogram average Ave of the frames from the previous key frame up to the previous frame is maintained as the reference histogram.
3. The difference measures of the current frame against the previous frame and against the previous key frame, d_{t,t-1} and d_{t,k}, are computed. If d_{t,k} > T, the scene has changed substantially, and marking of key frame candidate frames begins: a new content segment has started.
Detection of the key frame segment end frame:
Detection of the first key frame candidate frame means a new video segment has begun. It remains to determine where the sequence ends, i.e. the key frame segment end frame. All frames between the segment start frame and end frame are marked as key frame candidates, and the final key frame is chosen from this sequence. The end frame is judged as follows (see the sketch after this list):
1. After the segment start frame, the reference histogram is first updated by the averaging operation, and each frame's difference measures against the previous frame and the reference histogram, d_{t,t-1} and d_{t,k}, are computed.
2. If d_{t,k} > T, another key frame segment has begun: the old segment is analyzed, the best key frame is output, and the new segment start frame is recorded.
3. If d_{t,t-1} < T, the video has become steady, and this frame is the segment end frame.
4. Otherwise, if this frame is more than 30 frames past the key frame, the segment is ended by force, the segment is analyzed, and the key frame is produced.
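The start/end logic condenses into a small decision function, sketched below under the assumption that the difference measures d_{t,t-1}, d_{t,k} and the threshold T are computed elsewhere (e.g. by histFourierDiff above); the Action enum and all names are illustrative.

    // One step of the key-frame segment state machine described above.
    // dPrev = d(t, t-1), dKey = d(t, last key frame); T is the detection
    // threshold; framesSinceKey counts frames since the last key frame.
    enum class Action { None, StartSegment, EndSegment, ForceEnd };

    Action keyFrameStep(double dPrev, double dKey, double T,
                        bool inSegment, int framesSinceKey) {
        if (!inSegment) {
            if (dKey > T) return Action::StartSegment;  // scene changed: new candidate segment
            return Action::None;
        }
        if (dKey > T)  return Action::StartSegment;     // next segment already starting:
                                                        // analyze the old one first
        if (dPrev < T) return Action::EndSegment;       // video settled: close the segment
        if (framesSinceKey > 30) return Action::ForceEnd; // cap the segment at 30 frames
        return Action::None;
    }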
2) Special scene detection
Some special scenes in a video stream strongly affect key frame extraction. In these scenes the inter-frame change is large while the content change is small, so conventional detectors tend to over-detect. The robust key frame detection proposed here treats these special cases explicitly with dedicated processing and achieves much better suppression.
● Fade-in/fade-out and dissolve detection:
Fade-in/fade-out and dissolve are common effects in video streams; a typical fragment is illustrated in Fig. 3. The two effects are formed on similar principles. A fade-in means the frame content emerges gradually over time out of an all-black frame; a fade-out is the opposite, with the content receding gradually until it disappears into an all-black frame. A dissolve superimposes two video segments of different content with time-varying weights: one segment recedes gradually until it disappears while the other emerges gradually until it is fully visible.
In essence the effect is a superposition of two frames in varying proportion, with the ratio k changing linearly in time. The detection process is illustrated in Fig. 10.
For fade and dissolve detection, consider the summed luminance signal of a video frame, defined as
Y_t = Σ_{i,j} y_t(i,j)
Within a fade effect frame sequence, adjacent frames satisfy
y_t(i,j) = k · y_{t-1}(i,j) + (1 - k) · y_{t+1}(i,j),  k ∈ [0, 1]
Using the frame luminance sums, the proportionality coefficient can be estimated as
k = (Y_{t+1} - Y_t) / (Y_{t+1} - Y_{t-1})
With the estimated coefficient k, define the frame difference measure and decision thresholds:
frame difference measure: d = Σ_{i,j} | k · y_{t-1}(i,j) - y_t(i,j) + (1 - k) · y_{t+1}(i,j) |
decision thresholds: t_1 = Σ_{i,j} | y_{t-1}(i,j) - y_t(i,j) |
t_2 = Σ_{i,j} | y_{t+1}(i,j) - y_t(i,j) |
When d < k·t_1 and d < k·t_2, the frame is marked as a fade frame (a sketch follows below).
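A sketch of this fade test in C++, assuming three consecutive 8-bit luma planes of equal size; the guard against a zero denominator and the rejection of k outside [0, 1] are added assumptions not spelled out in the text.

    #include <cmath>
    #include <cstdint>
    #include <numeric>
    #include <vector>

    double lumaSum(const std::vector<uint8_t>& y) {          // Y_t = sum of all luma values
        return std::accumulate(y.begin(), y.end(), 0.0);
    }

    // Fade/dissolve test for the middle frame of the triple (t-1, t, t+1):
    // estimate the mixing ratio k from the luma sums, then check that the
    // frame really is a k-weighted blend of its two neighbours.
    bool isFadeFrame(const std::vector<uint8_t>& prev,
                     const std::vector<uint8_t>& cur,
                     const std::vector<uint8_t>& next) {
        double Yp = lumaSum(prev), Yc = lumaSum(cur), Yn = lumaSum(next);
        if (Yn == Yp) return false;                          // no overall ramp: not a fade
        double k = (Yn - Yc) / (Yn - Yp);
        if (k < 0.0 || k > 1.0) return false;
        double d = 0.0, t1 = 0.0, t2 = 0.0;
        for (size_t i = 0; i < cur.size(); ++i) {
            double blend = k * prev[i] + (1.0 - k) * next[i];
            d  += std::abs(blend - cur[i]);                  // residual of the blend model
            t1 += std::abs(double(prev[i]) - cur[i]);
            t2 += std::abs(double(next[i]) - cur[i]);
        }
        return d < k * t1 && d < k * t2;                     // decision rule from the text
    }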
● Camera-flash detection and suppression:
1. Flash frame detection.
A flash fragment example is illustrated in Fig. 5. A flash strongly affects the luminance and chrominance of the image. During key frame or shot-boundary detection, flash-affected frames must therefore be detected so that they are not declared key frames. The detection process is illustrated in Fig. 12. To this end,
the inter-frame binarized difference image is defined, with T = 10:
D_{m,n}(i,j) = 0 if |y_m(i,j) - y_n(i,j)| < T, and 1 otherwise
A morphological open-close operator then removes noise and detail:
D'(i,j) = OC(D(i,j))
The inter-frame difference image measure is defined as
N(D'(i,j)) = Σ_{i,j} D'(i,j)
i.e. the number of 1-pixels in the difference image.
When frame t satisfies either of the following conditions, it is marked as a flash-affected frame (a sketch follows below):
a) N( D'_{t-1,t}(i,j) ∩ ¬D'_{t-1,t+1}(i,j) ) > min( size/5, N(D'_{t-1,t+1}(i,j)) )
b) N( D'_{t,t+1}(i,j) ∩ ¬D'_{t-1,t+1}(i,j) ) > min( size/5, N(D'_{t-1,t+1}(i,j)) )
Here size is the number of pixels in a frame, and ¬ denotes the binary complement.
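A sketch of the flash test in C++. The helpers repeat those of the preprocessing sketch (open-close cleanup again omitted), the intersection with the complement is evaluated pixel by pixel, and all names are illustrative.

    #include <algorithm>
    #include <cstdint>
    #include <cstdlib>
    #include <vector>

    using Mask = std::vector<uint8_t>;

    Mask binaryDiff(const std::vector<uint8_t>& a, const std::vector<uint8_t>& b, int T = 10) {
        Mask d(a.size());
        for (size_t i = 0; i < a.size(); ++i)
            d[i] = std::abs(int(a[i]) - int(b[i])) < T ? 0 : 1;
        return d;
    }

    size_t countOnes(const Mask& m) {
        size_t n = 0;
        for (uint8_t v : m) n += v;
        return n;
    }

    // Flash test for frame t: a large change against both neighbours that is
    // absent between the two neighbours themselves (the change "reverts").
    bool isFlashFrame(const std::vector<uint8_t>& prev,    // frame t-1
                      const std::vector<uint8_t>& cur,     // frame t
                      const std::vector<uint8_t>& next) {  // frame t+1
        const size_t size = cur.size();
        Mask dPC = binaryDiff(prev, cur);    // D'_{t-1,t}
        Mask dCN = binaryDiff(cur, next);    // D'_{t,t+1}
        Mask dPN = binaryDiff(prev, next);   // D'_{t-1,t+1}
        size_t a = 0, b = 0;
        for (size_t i = 0; i < size; ++i) {
            if (dPC[i] && !dPN[i]) ++a;      // condition a): changed at t, reverted by t+1
            if (dCN[i] && !dPN[i]) ++b;      // condition b)
        }
        size_t bound = std::min(size / 5, countOnes(dPN));
        return a > bound || b > bound;       // either condition marks a flash frame
    }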
2. Flash suppression
Besides explicit flash detection, the analysis suppresses other global brightness jumps by the following means:
a. the shift-insensitivity of the Fourier-transform modulus suppresses global amplitude changes;
b. an adaptive threshold suppresses flash effects: when a stretch of video changes too strongly, the decision threshold is raised adaptively to damp its influence.
● Wipe frame detection:
A wipe fragment example is illustrated in Fig. 4. A wipe typically spans 5 to 10 consecutive frames; each frame differs strongly from its predecessor in one part of the image and little elsewhere, and the strongly differing part moves regularly, for example left to right or top to bottom. The regularity of this difference region's movement is what makes the wipe interval detectable.
The wipe detection process, illustrated in Fig. 11, consists of the following four steps:
Step 1: Define a wipe template Mask that records the wiped region across the whole wipe fragment. The template has the same size as a video frame and is initialized to 0:
Mask(i,j) = 0, where i, j are the row and column indices.
Step 2: Use the inter-frame binarized difference image to detect the inter-frame wipe region:
2.1 Compute the inter-frame binarized difference image and clean it with the morphological open-close operator, obtaining the binary image D'(i,j).
2.2 Choose the largest connected region C(i,j) of the binary image D'(i,j) as the candidate wipe region.
2.3 When the template and this largest connected region satisfy both of the following conditions, judge the frame to be a wipe frame:
◆ the inter-frame wipe candidate region is large enough: N(C(i,j)) > size/15;
◆ the newly wiped region is large enough: N( ¬Mask(i,j) ∩ C(i,j) ) > size/20.
If no wipe candidate region is detected, go to step 4.
Step 3: If step 2 detected a wipe region, update the template and return to step 2 to process the next frame:
Mask_new(i,j) = Mask_old(i,j) ∪ C(i,j)
Otherwise, jump to step 4.
Step 4: If several consecutive wipe frames have been detected and both of the following conditions hold, judge the frame sequence to be a wipe fragment and designate its last frame as the key frame (a sketch of steps 1 to 4 follows below):
◆ N(Mask(i,j)) > size/3;
◆ the number of consecutive wipe frames is greater than 3.
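Steps 1 to 4 can be sketched as follows, assuming the binarized, cleaned difference image D' is supplied per frame as a Mask (one byte per pixel, row-major, as in the earlier sketches); the 4-connected BFS for the largest region and all names are illustrative.

    #include <cstdint>
    #include <queue>
    #include <vector>

    using Mask = std::vector<uint8_t>;

    // Largest 4-connected component of a binary image (w x h, row-major).
    Mask largestComponent(const Mask& d, int w, int h) {
        Mask best(d.size(), 0), seen(d.size(), 0);
        size_t bestSize = 0;
        for (int s = 0; s < w * h; ++s) {
            if (!d[s] || seen[s]) continue;
            Mask comp(d.size(), 0);
            size_t n = 0;
            std::queue<int> q;
            q.push(s); seen[s] = 1;
            while (!q.empty()) {
                int p = q.front(); q.pop();
                comp[p] = 1; ++n;
                int x = p % w, y = p / w;
                const int  nb[4] = { p - 1, p + 1, p - w, p + w };
                const bool ok[4] = { x > 0, x < w - 1, y > 0, y < h - 1 };
                for (int k = 0; k < 4; ++k)
                    if (ok[k] && d[nb[k]] && !seen[nb[k]]) { seen[nb[k]] = 1; q.push(nb[k]); }
            }
            if (n > bestSize) { bestSize = n; best = comp; }
        }
        return best;
    }

    // One wipe-detection step: returns true if the difference image d of the
    // current frame looks like a wipe region, updating the template (step 3).
    bool wipeStep(Mask& mask, const Mask& d, int w, int h) {
        const size_t size = size_t(w) * h;
        Mask C = largestComponent(d, w, h);          // candidate wipe region (step 2.2)
        size_t nC = 0, nNew = 0;
        for (size_t i = 0; i < size; ++i) {
            nC += C[i];
            if (C[i] && !mask[i]) ++nNew;            // region not wiped before this frame
        }
        if (nC > size / 15 && nNew > size / 20) {    // conditions of step 2.3
            for (size_t i = 0; i < size; ++i) mask[i] |= C[i];  // Mask_new = Mask_old OR C
            return true;
        }
        return false;
    }
    // Step 4: after more than 3 consecutive wipe frames with
    // countOnes(mask) > size/3, the run is a wipe fragment and its
    // last frame becomes the key frame.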
1.3. Information analysis submodule
The information analysis subsystem analyzes the recorded key frame fragments and produces the final, best key frames. Candidate frames of different natures, as detected above, are handled with different strategies:
4.1 For a detected fade frame sequence, the last frame is output as the key frame.
4.2 For a detected wipe frame sequence, the last frame is output as the key frame.
4.3 Frames affected by camera flash are filtered directly out of the candidate sequence.
4.4 For the remaining ordinary candidate key frame sequences, within each run of consecutive frames the frame with the largest histogram difference from the previous key frame is chosen, so the output key frame I_k satisfies:
d_{k,last} = max_{t∈D} ( d_{t,last} )
where k is the key frame index, D is the range of the consecutive candidate key frame sequence, and t is the index variable.
Three, the robust shot-boundary frame detection method provided by the invention comprises:
1. Processing flow design
The shot-boundary frame extraction system is illustrated in Fig. 13. The overall method divides into three parts: a preprocessing submodule, an information extraction submodule, and an information analysis submodule.
◆ The preprocessing submodule first applies simple, fast preprocessing to each input video frame, including spatial downsampling and temporal pre-analysis, and filters out a rough set of candidate frames.
◆ The information extraction submodule comprises two functions:
● First, it processes the candidate set from the previous step further, extracts feature information, filters the candidate frame sequence again using the intra-frame and inter-frame features of each frame, and records each run of consecutive candidate frames together with its features as a single candidate segment for later analysis.
● Second, it performs dedicated processing and detection for special scenes, comprising:
■ camera-flash effect detection;
■ fade-in/fade-out and dissolve effect detection;
■ inter-frame wipe effect detection.
◆ The information analysis submodule, building on the detection and processing above, applies a different strategy to each case and extracts the final shot-boundary frames.
1.1. Preprocessing submodule: obtaining candidate segment start frames
The function and method of the preprocessing subsystem are identical to those of key frame extraction and are not repeated here; see the description above.
Through the preprocessing subsystem, the video resolution is reduced and the candidate shot-boundary frame sequence is filtered out.
1.2. Information extraction submodule
On the basis of the filtered candidate video segments, the information extraction subsystem extracts further feature information and, at the same time, detects and suppresses the special scenes and effects that disturb the extraction: camera-flash influence, fade-in/fade-out and dissolve effects, and wipe effects.
1) Basic information extraction
● Basic information extraction
Two types of frame difference information are considered, together with a dynamic decision threshold:
a. Inter-frame pixel gray-level difference, DOP (Difference of Pixels)
The inter-frame binarized difference image is defined as:
d_{t,t-1}(i,j) = 0 if |I_t(i,j) - I_{t-1}(i,j)| < T, and 1 otherwise
with T = 10. A morphological open-close operator then removes noise and detail:
d'(i,j) = OC(d(i,j))
The inter-frame pixel gray-level difference measure is defined as
N(d'(i,j)) = Σ_{i,j} d'(i,j)
When N(d') < size/5, the current frame is similar to the previous frame and therefore cannot be a shot-boundary frame; the remaining processing is skipped.
When N(d') > size/5, the current frame and the previous frame differ substantially, and the subsequent processing is performed.
b. Inter-frame pixel histogram difference, DOH (Difference of Histogram)
The inter-frame DOH computation uses three techniques: the histogram decayed average, the histogram χ² statistical difference, and adaptive threshold judgment.
● Histogram χ² statistical difference.
Given the histograms H_m(i) and H_n(i) of frames m and n, the inter-frame histogram χ² statistical difference is defined as:
d(H_m, H_n) = (1/N²) Σ_i (H_m(i) - H_n(i))² / max(H_m(i), H_n(i)), taken over bins with H_m(i) ≠ H_n(i)
For YUV color histograms, the inter-frame color histogram divergence measure is defined as (a sketch follows below):
D_{m,n} = d_y(H_m, H_n) + d_u(H_m, H_n) + d_v(H_m, H_n)
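A minimal sketch of the χ² difference in C++, assuming 256-bin histograms stored as std::vector<double> and N the pixel count per frame; skipping bins where both histograms are empty is one way to read the H_m(i) ≠ H_n(i) restriction.

    #include <algorithm>
    #include <vector>

    // Chi-square-style difference between two 256-bin histograms, normalized by N^2.
    double chiSquareDiff(const std::vector<double>& Hm,
                         const std::vector<double>& Hn, double N) {
        double d = 0.0;
        for (int i = 0; i < 256; ++i) {
            double hi = std::max(Hm[i], Hn[i]);
            if (hi > 0.0)                         // skip bins with Hm(i) = Hn(i) = 0
                d += (Hm[i] - Hn[i]) * (Hm[i] - Hn[i]) / hi;
        }
        return d / (N * N);
    }

    // YUV colour-histogram divergence between frames m and n.
    double colorHistDiff(const std::vector<std::vector<double>>& m,   // {Hy, Hu, Hv}
                         const std::vector<std::vector<double>>& n, double N) {
        return chiSquareDiff(m[0], n[0], N) + chiSquareDiff(m[1], n[1], N)
             + chiSquareDiff(m[2], n[2], N);
    }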
● Histogram decayed-average computation
When computing the inter-frame χ² statistical difference, the current frame is compared not with the previous frame but with a weighted average of all preceding frames in the same shot. Frames near the frame under comparison receive larger weights, distant frames smaller ones. The decayed-average operation is defined as follows.
Given a sequence {H_t}, with t the frame number and attenuation coefficient α < 1, the decayed average is defined as:
H̄ = (H_t + α H_{t-1} + α² H_{t-2} + …) / (1 + α + α² + …)
The attenuation coefficient makes the weights of frames near the current frame larger and those of distant frames smaller, with the weights of frames too far back negligibly small. The resulting decayed average captures well both the slow variation of content within a shot and the correlation between frames.
In the implementation, the current decayed-average frame is updated iteratively as follows, without recording the information of all previous frames:
H̄_{t+1} = ( α (1 - α^t) H̄_t + (1 - α) H_{t+1} ) / (1 - α^{t+1})
During video analysis, the shot-boundary detector computes the histogram difference between the current frame's histogram and the decayed average of all preceding frames of the same shot (a sketch follows below):
D̄_{t,t-1} = d_y(H_t, H̄_{t-1}) + d_u(H_t, H̄_{t-1}) + d_v(H_t, H̄_{t-1})
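The iterative update reconstructed above can be carried by a small accumulator, sketched below; the struct and member names are illustrative, and α = 0.9 is an assumed value. At t = 1 the update reduces to H̄ = H_1, and at t = 2 to (H_2 + αH_1)/(1 + α), matching the definition.

    #include <vector>

    // Running decayed-average histogram over the current shot. The iterative
    // update keeps the weighted mean exact without storing past frames:
    //   avg_{t+1} = (alpha*(1 - alpha^t)*avg_t + (1 - alpha)*H_{t+1}) / (1 - alpha^{t+1})
    struct DecayHistogram {
        double alpha = 0.9;              // attenuation coefficient, alpha < 1 (assumed value)
        double alphaPowT = 1.0;          // alpha^t, where t = frames seen so far
        std::vector<double> avg = std::vector<double>(256, 0.0);

        void update(const std::vector<double>& h) {
            double alphaPowT1 = alphaPowT * alpha;          // alpha^{t+1}
            for (int i = 0; i < 256; ++i)
                avg[i] = (alpha * (1.0 - alphaPowT) * avg[i] + (1.0 - alpha) * h[i])
                         / (1.0 - alphaPowT1);
            alphaPowT = alphaPowT1;
        }
    };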
● Adaptive decision threshold
To account for noise and for the inter-frame correlation within a shot, the final decision uses an adaptive threshold, obtained in two steps:
■ the local variance σ within the shot is updated dynamically:
◆ σ_0 = 0
◆ σ_t = ( 0.7 (1 - 0.7^{t-1}) σ_{t-1} + 0.3 D̄_{t,t-1} ) / (1 - 0.7^t)
■ the adaptive threshold, accounting for noise and intra-shot correlation, is defined as:
T = 3σ + 10
where σ is the mean square deviation of the inter-frame difference within the current shot fragment. By the Gaussian distribution, more than 99.9% of samples lie within 3σ of the mean, so the histogram differences of almost all non-boundary frames fall below this threshold (a sketch follows below).
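The same recurrence drives the spread estimate, sketched below with the fixed coefficients 0.7 and 0.3 from the text; struct and member names are illustrative.

    // Running spread estimate of the inter-frame histogram differences inside
    // the current shot, using the same decayed-mean trick (alpha = 0.7,
    // new-sample weight 0.3). The adaptive threshold is T = 3*sigma + 10.
    struct AdaptiveThreshold {
        double sigma = 0.0;              // sigma_0 = 0
        double pow07 = 1.0;              // 0.7^{t-1} before the t-th update

        void update(double D) {          // D = histogram difference of the new frame
            double pow07Next = pow07 * 0.7;                  // 0.7^t
            sigma = (0.7 * (1.0 - pow07) * sigma + 0.3 * D) / (1.0 - pow07Next);
            pow07 = pow07Next;
        }
        double threshold() const { return 3.0 * sigma + 10.0; }
    };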
2) Special scene detection
Certain special scenes and effects in the video stream strongly affect the correctness of shot-boundary frame extraction and, if not treated specially, cause many false detections. These are the scenes and effects described above: fade-in/fade-out and dissolve, wipe effects, and camera-flash influence. Their detection is identical to that used in key frame extraction and is not repeated here; see the description above.
1.3. Information analysis submodule
The information analysis subsystem performs the final analysis on the information extracted and detected above and judges the shot-boundary frames. Candidate frames of different natures are handled with different strategies:
a. For a detected fade frame sequence, the last frame is output.
b. For a detected wipe frame sequence, the last frame is output.
c. Frames affected by camera flash are filtered directly out of the candidate sequence.
d. The remaining ordinary candidate frames are judged as follows (see the sketch after this list). If frame t simultaneously satisfies:
■ the mutation condition: D̄_{t,t-1} > 3σ_t + 10
■ the smoothness condition: D̄_{t,t-1} > 2 · D_{t,t+1}
then it is decided to be a shot-boundary frame.
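Combining the two conditions gives a one-line boundary test, sketched here assuming D̄_{t,t-1}, D_{t,t+1}, and the adaptive threshold are produced by the earlier sketches; names are illustrative.

    // Final shot-boundary test for an ordinary candidate frame t, where
    // dAvg  = difference between frame t and the decayed-average histogram,
    // dNext = difference between frame t and frame t+1, and
    // thr   = the adaptive threshold 3*sigma + 10.
    bool isShotBoundary(double dAvg, double dNext, double thr) {
        bool mutation   = dAvg > thr;           // abrupt change against the shot model
        bool smoothness = dAvg > 2.0 * dNext;   // and the video is steady right after
        return mutation && smoothness;
    }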
Four, an exemplary embodiment is introduced below.
The present embodiment is software running on the Windows platform, developed in Visual C++.
Fig. 1 shows the architecture of the embodiment. As the figure shows:
● the embodiment consists mainly of five modules:
■ application graphical user interface: implemented with Visual C++ MFC as a multiple-document application with three parts:
◆ video display: based on an MFC view; the video data being captured is displayed in this view frame in real time;
◆ storyboard: based on an MFC view; the automatically extracted key frames are displayed in the storyboard view frame as dialog controls;
◆ metadata: based on an MFC view; the edited metadata information is displayed in the metadata view frame as a list-box control;
■ cataloging system engine module (main program module): this module calls the lower-level modules (data acquisition, media analysis, metadata management, and so on) and provides video cataloging services to the user interface above; its main functions include application initialization, memory allocation, dynamic loading, and configuration of the submodules below;
■ data acquisition submodule: packaged as a dynamic library, dynamically loaded by the engine module, and run in its own thread; its main function is to drive the underlying data capture card and deliver the captured data to the input shared memory;
■ media analysis submodule: packaged as a dynamic library, dynamically loaded by the engine module, and run in its own thread; its main function is to run the intelligent key frame or shot-boundary frame extraction algorithm and output the results;
■ metadata management submodule;
■ configuration management submodule.
Fig. 2 shows the main classes of the embodiment and their call relations, where:
◆ the upper-level application is a multiple-document application framework based on MFC;
◆ the video display view is a class derived from CFormView that displays the captured real-time video and its timecode information;
◆ the storyboard view is a class derived from CFormView that displays key frames or shot-boundary frames; each extracted frame is shown on a dialog control;
◆ the metadata view is a class derived from CFormView that displays the metadata; the metadata content is shown in a list control;
◆ the fully automatic video cataloging system engine integrates all functions of the cataloging system, including data acquisition, media analysis, and metadata management, and provides interfaces to the upper layer;
◆ the data acquisition class captures video data in real time from the capture card into memory;
◆ the media analysis class invokes the key frame or shot-boundary frame processing algorithms on the data in memory and caches the results in the key frame data cache;
◆ the metadata management class encapsulates the metadata management functions; based on the key frame or shot-boundary frame information produced by the media analysis module, and responding to the user's manual annotations, it creates and manages the metadata.

Claims (4)

1. A construction method of a fully automatic video cataloging system, characterized in that the method comprises:
1st, the construction of the fully automatic video cataloging system;
2nd, robust automatic extraction of video key frames from the media material;
3rd, robust automatic extraction of shot-boundary frames from the media material.
2. method according to claim 1 is characterized in that described full-automatic video cataloging syytem of the 1st step comprises:
1.1st, acquisition of media module, through the video acquisition integrated circuit board, the sampling video flow data;
1.2nd, Media Analysis module; Go on foot the video stream data that the acquisition of media module collects to the 1.1st; Utilize advanced computer vision treatment technology to carry out intellectual analysis; Extract key frame or camera lens and cut apart frame, and the key frame that extracts or camera lens are cut apart frame be shown on the Storyboard, supply the user further to edit, handle;
1.3rd, metadata foundation, editor and administration module, the key frame that extracts with the 1.2nd step Media Analysis module is cut apart frame with camera lens and is the basis, and formation is unit with the video segment, and is aided with the media materials metadata of descriptor; Described metadata is the final result of cataloging syytem output, is the main foundation that the destructuring media asset is effectively managed; Metadata editor and management comprise the foundation of metadata model file and modification, metadata demonstration, metadata editor, metadata cache management and metadata preservation;
1.3.1, metadata model file; The content and the organizational form of the metadata information that cataloging syytem carries out media materials drawing after the structured analysis have been defined in this document; Comprise fragment description, classifying content, video author and lister; The metadata model file is with the XML representation of file, and system provides default file, and the user can revise the metadata model file through manual mode as required;
1.3.2, metadata editing, comprising:
◆ key frame and shot-segmentation frame editing: the user adds or deletes key frames or shot-segmentation frames displayed on the storyboard as the situation requires;
◆ video segment editing: based on the key frames or shot-segmentation frames extracted by the system, the user merges the content of two or more key frames or shot-segmentation frames through editing operations into video segments that are independent and meaningful in content, and adds the corresponding information according to the format defined by the metadata model file;
1.3.3, metadata display, comprising:
◆ key frame or shot-segmentation frame display: the system displays the automatically extracted key frames or shot-segmentation frames on the storyboard;
◆ segment content display: the system displays the edited segment information on the user interface;
1.3.4, metadata cache management, comprising:
◆ during automatic video and audio analysis, to avoid frequent hard-disk access, analysis results are written to the hard disk only when the analysis completes;
◆ system RAM serves as the cache; the memory size must support several hours of continuous metadata analysis;
◆ when a long continuous analysis exhausts memory, the system page file is used as a shared mapped file;
1.3.5, metadata saving, comprising:
◆ automatically analyzed or manually modified metadata is saved to the storage medium as an XML document;
◆ key frames and shot-segmentation frames are saved to a designated folder;
1.4, a configuration management module; the system stores configuration information in an XML configuration file, which the user can edit manually or through the user interface; the configuration file is read at system initialization to configure the software modules (a hypothetical example follows this claim); the configuration information comprises:
● input device configuration: video acquisition board, or a media file on disk;
● algorithm selection: key-frame extraction or shot-segmentation frame extraction;
● key-frame extraction algorithm parameters, comprising:
■ key-frame extraction sensitivity setting;
■ maximum or minimum frame gap setting: the interval between adjacent key frames is constrained to be no less than, or no more than, a specified time interval.
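As an illustration only, a configuration file covering these items might look as follows; the patent fixes XML as the format but discloses no schema, so every element and attribute name below is invented:

    <!-- Hypothetical configuration file; element names are assumptions. -->
    <config>
      <input device="captureBoard"/>      <!-- or: <input device="mediaFile" path="D:\clips\a.mpg"/> -->
      <algorithm>keyframe</algorithm>     <!-- keyframe | shotSegmentation -->
      <keyframe>
        <sensitivity>0.6</sensitivity>    <!-- extraction sensitivity, 0..1 -->
        <minGapFrames>25</minGapFrames>   <!-- adjacent key frames >= 1 s apart at 25 fps -->
        <maxGapFrames>7500</maxGapFrames> <!-- at most 5 min between key frames -->
      </keyframe>
    </config>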
3. The method according to claim 1, characterized in that the robust automatic extraction of video key frames from media material of step 2 comprises:
2.1, video low-level preprocessing, comprising two parts: first, the pixels within each frame are down-sampled to reduce algorithm complexity; second, inter-frame correlation is used to obtain a binarized frame-difference image, which is used to filter out the candidate key-frame set;
2.2, information extraction: the candidate key-frame set obtained in step 2.1 is further processed; histogram features are extracted, and key frames are judged and extracted using a divergence measure, computed after Fourier-transforming the histograms, between consecutive frames and between the frame under test and the previous key frame; during processing, this step independently detects and suppresses fade-in/fade-out, wipe effects, and camera-flash interference;
2.2.1, for fade-in/fade-out segments: since the inter-frame luminance signal within such a segment changes linearly, they are detected by estimating this rate of change;
2.2.2, for wipe-effect segments: since the inter-frame wipe region is spatially regular, they are detected by examining the wipe region between frames and the spatial variation of the wipe region within the whole segment;
2.2.3, for camera-flash segments: since a flash lasts only a short time, they are detected by exploiting the fact that the luminance difference between adjacent frames is large while that between frames separated by the flash is small;
2.3, information analysis: based on the feature information and special-scene detection results obtained by the information extraction of step 2.2, a final comprehensive analysis is performed, and the key frames that best represent the segment content are selected (a sketch of steps 2.1–2.3 follows this claim);
2.3.1, for a fade-in/fade-out frame sequence, the last frame, where the fade effect completes, is output as the key frame;
2.3.2, for a wipe sequence, the last frame, where the wipe effect completes, is output as the key frame;
2.3.3, frames affected by a detected camera flash are filtered directly out of the candidate sequence;
2.3.4, for the remaining ordinary candidate key-frame sequences, within each segment formed by consecutive candidate frames, the frame whose histogram Fourier transform differs most from that of the previous key frame is selected and output as the key frame.
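A minimal C++ sketch of steps 2.1–2.3 follows. The bin count, both thresholds, and the L2 distance over complex DFT coefficients are assumptions: the claim names a divergence measure on histogram Fourier transforms without fixing it, and the fade, wipe, and flash detectors of steps 2.2.1–2.2.3 are omitted here.

    #include <cmath>
    #include <complex>
    #include <cstddef>
    #include <cstdio>
    #include <cstdlib>
    #include <vector>

    using Pixels = std::vector<unsigned char>;   // down-sampled luminance plane
    const double kPi = 3.14159265358979323846;

    // Step 2.1: binarized frame difference on the down-sampled pixels; a frame
    // becomes a candidate key frame when enough pixels changed. Both
    // thresholds are invented for illustration.
    bool isCandidate(const Pixels& prev, const Pixels& cur,
                     int pixThresh = 24, double ratioThresh = 0.15) {
        std::size_t changed = 0;
        for (std::size_t i = 0; i < cur.size(); ++i)
            changed += std::abs(int(cur[i]) - int(prev[i])) > pixThresh;
        return changed > ratioThresh * cur.size();
    }

    // 64-bin normalized grayscale histogram of a frame.
    std::vector<double> histogram(const Pixels& p) {
        std::vector<double> h(64, 0.0);
        for (unsigned char v : p) h[v >> 2] += 1.0;
        if (!p.empty()) for (double& b : h) b /= p.size();
        return h;
    }

    // Naive DFT of the histogram (N = 64, so O(N^2) is cheap).
    std::vector<std::complex<double>> dft(const std::vector<double>& h) {
        const std::size_t n = h.size();
        std::vector<std::complex<double>> out(n);
        for (std::size_t k = 0; k < n; ++k) {
            std::complex<double> s(0.0, 0.0);
            for (std::size_t t = 0; t < n; ++t) {
                double a = 2.0 * kPi * double(k) * double(t) / double(n);
                s += h[t] * std::complex<double>(std::cos(a), -std::sin(a));
            }
            out[k] = s;
        }
        return out;
    }

    // Steps 2.2/2.3: divergence between two frames in the Fourier domain of
    // their histograms; an L2 distance over the DFT coefficients stands in
    // for the patent's unspecified divergence measure.
    double divergence(const Pixels& a, const Pixels& b) {
        auto fa = dft(histogram(a)), fb = dft(histogram(b));
        double d = 0.0;
        for (std::size_t k = 0; k < fa.size(); ++k) d += std::norm(fa[k] - fb[k]);
        return std::sqrt(d);
    }

    int main() {
        Pixels dark(4096, 10), bright(4096, 200);   // two synthetic frames
        if (isCandidate(dark, bright))              // large change: candidate
            std::printf("divergence = %.4f\n", divergence(dark, bright));
    }

In the full pipeline, isCandidate would be evaluated against the previous frame, the special-effect detectors would prune the candidates, and per step 2.3.4 a candidate would be promoted to a key frame when its divergence from the previous key frame is maximal within its run of consecutive candidates.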
4. The method according to claim 1, characterized in that the robust automatic extraction of shot-segmentation frames from media material of step 3 comprises:
3.1, preprocessing, comprising two parts: first, the pixels within each frame are down-sampled to reduce algorithm complexity; second, inter-frame correlation is used to obtain a binarized frame-difference image, which is used to filter out the candidate shot-segmentation frame set;
3.2, information extraction: the candidate shot-segmentation frame set obtained in step 3.1 is further processed: histogram features are extracted, and a decayed histogram averaging method is used to compute the decayed histogram mean and the statistical variance from the start frame of the shot to the current frame; the inter-frame χ² statistical difference between the current frame's histogram and the decayed histogram mean is then computed, and the statistical variance is updated; a dynamic decision threshold is derived from the continuously updated statistical variance and used to judge shot-segmentation frames; during processing, this step independently detects and suppresses fade-in/fade-out, wipe effects, and camera-flash interference: fade segments are detected, according to the linear change of the inter-frame luminance signal within such segments, by estimating this rate of change; wipe segments are detected, according to the spatial regularity of the inter-frame wipe region, by examining the wipe region between frames and the spatial variation of the wipe region across the whole segment; flash segments are detected, according to the short duration of a flash, by exploiting the fact that the luminance difference between adjacent frames is large while that between frames separated by the flash is small;
3.3, information analysis: based on the feature information and special-scene detection results obtained from the information extraction of step 3.2, a final comprehensive analysis is performed to judge shot-segmentation frames, including detection of the special cases of fade-in/fade-out, wipe effects, and camera flash; for a detected fade sequence, the last frame is output as the key frame; for a detected wipe sequence, the last frame is output as the key frame; frames affected by camera flash are filtered directly out of the candidate sequence; for the remaining ordinary candidate frames, a frame is judged a shot-segmentation frame only if it satisfies both of the following conditions (a sketch of this decision rule follows this claim):
● sudden-change condition: the χ² statistical difference between the current frame and the decayed histogram mean exceeds a sudden-change threshold determined by the statistical variance;
● smoothness condition: the χ² statistical difference between the current frame and the decayed histogram mean exceeds a smoothness threshold determined by the inter-frame χ² statistical difference between the current frame and the following frame.
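The decision rule of steps 3.2–3.3 can be sketched in C++ as follows, again as an assumption-laden illustration: the decay factor, warm-up length, and threshold multipliers are invented, Welford's method stands in for the unspecified variance update, and the fade/wipe/flash handling is omitted.

    #include <cmath>
    #include <cstddef>
    #include <cstdio>
    #include <vector>

    using Hist = std::vector<double>;   // normalized frame histogram

    // Chi-square statistical difference between two histograms.
    double chiSquare(const Hist& a, const Hist& b) {
        double d = 0.0;
        for (std::size_t i = 0; i < a.size(); ++i) {
            double s = a[i] + b[i];
            if (s > 0.0) d += (a[i] - b[i]) * (a[i] - b[i]) / s;
        }
        return d;
    }

    // Per-shot state: the decayed histogram mean plus running statistics
    // (Welford's update) of the chi-square differences observed in the shot.
    class ShotCutDetector {
    public:
        explicit ShotCutDetector(double decay = 0.9, double kSigma = 5.0,
                                 double kSmooth = 2.0)
            : decay_(decay), kSigma_(kSigma), kSmooth_(kSmooth) {}

        // True when `cur` is judged a shot-segmentation frame; `next` is the
        // frame after `cur`, needed for the smoothness condition.
        bool isCut(const Hist& cur, const Hist& next) {
            if (mean_.empty()) { reset(cur); return false; }   // shot start
            double d = chiSquare(cur, mean_);

            // Dynamic threshold from the statistics accumulated in this shot;
            // the 1e-3 floor avoids a zero threshold in degenerate shots.
            double sigma = (n_ > 0) ? std::sqrt(m2_ / n_) : 0.0;
            bool sudden = n_ >= 3 && d > avg_ + kSigma_ * sigma + 1e-3;
            bool smooth = d > kSmooth_ * chiSquare(cur, next);
            if (sudden && smooth) { reset(cur); return true; } // new shot

            // Otherwise fold `cur` into the decayed mean and the statistics.
            for (std::size_t i = 0; i < mean_.size(); ++i)
                mean_[i] = decay_ * mean_[i] + (1.0 - decay_) * cur[i];
            ++n_;
            double delta = d - avg_;
            avg_ += delta / n_;
            m2_ += delta * (d - avg_);
            return false;
        }

    private:
        void reset(const Hist& h) { mean_ = h; n_ = 0; avg_ = 0.0; m2_ = 0.0; }
        Hist mean_;
        std::size_t n_ = 0;
        double avg_ = 0.0, m2_ = 0.0;
        double decay_, kSigma_, kSmooth_;
    };

    int main() {
        ShotCutDetector det;
        Hist uniform(64, 1.0 / 64), spike(64, 0.0);
        spike[0] = 1.0;                             // very different histogram
        for (int i = 0; i < 10; ++i) det.isCut(uniform, uniform); // settle shot 1
        std::printf("cut detected: %d\n", det.isCut(spike, spike));
    }

Resetting the statistics at each detected cut makes the threshold adapt to the activity level of each shot, which is what gives the dynamic decision threshold its robustness against busy footage.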
CN201210054812.5A 2012-03-05 2012-03-05 Construction method of full-automatic video cataloging system Expired - Fee Related CN102694966B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210054812.5A CN102694966B (en) 2012-03-05 2012-03-05 Construction method of full-automatic video cataloging system

Publications (2)

Publication Number Publication Date
CN102694966A 2012-09-26
CN102694966B CN102694966B (en) 2014-05-21

Family

ID=46860234

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210054812.5A Expired - Fee Related CN102694966B (en) 2012-03-05 2012-03-05 Construction method of full-automatic video cataloging system

Country Status (1)

Country Link
CN (1) CN102694966B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002035377A2 (en) * 2000-10-23 2002-05-02 Binham Communications Corporation Method and system for providing rich media content over a computer network
US20080162450A1 (en) * 2006-12-29 2008-07-03 Mcintyre Dale F Metadata generation for image files
CN101872346A (en) * 2009-04-22 2010-10-27 中国科学院自动化研究所 Method for generating video navigation system automatically
CN101604325A (en) * 2009-07-17 2009-12-16 北京邮电大学 Method for classifying sports video based on key frame of main scene lens

Cited By (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103065511B (en) * 2012-12-29 2015-04-01 福州新锐同创电子科技有限公司 Implementation method of teaching plan editor
CN103065511A (en) * 2012-12-29 2013-04-24 福州新锐同创电子科技有限公司 Implementation method of teaching plan editor
CN104219491A (en) * 2013-06-04 2014-12-17 费珂 Image analysis function based video monitoring system and storage method thereof
CN104519401A (en) * 2013-09-30 2015-04-15 华为技术有限公司 Video division point acquiring method and equipment
CN104519401B (en) * 2013-09-30 2018-04-17 贺锦伟 Video segmentation point preparation method and equipment
CN103870598B (en) * 2014-04-02 2017-02-08 北京航空航天大学 Unmanned aerial vehicle surveillance video information extracting and layered cataloguing method
CN103870598A (en) * 2014-04-02 2014-06-18 北京航空航天大学 Unmanned aerial vehicle surveillance video information extracting and layered cataloguing method
CN104184960A (en) * 2014-08-19 2014-12-03 厦门美图之家科技有限公司 Method for carrying out special effect processing on video file
CN104822087A (en) * 2015-04-30 2015-08-05 无锡天脉聚源传媒科技有限公司 Processing method and apparatus of video segment
CN104822087B (en) * 2015-04-30 2017-11-28 无锡天脉聚源传媒科技有限公司 A kind of processing method and processing device of video-frequency band
CN105227862A (en) * 2015-09-16 2016-01-06 上海工程技术大学 Can the video recombination system of auto Segmentation camera lens and video recombination method thereof
CN110019025A (en) * 2017-07-20 2019-07-16 中国移动通信集团公司 A kind of stream data processing method and device
CN108111537A (en) * 2018-01-17 2018-06-01 杭州当虹科技有限公司 A kind of method of the online video contents of streaming media of rapid preview MP4 forms
CN108111537B (en) * 2018-01-17 2021-03-23 杭州当虹科技股份有限公司 Method for quickly previewing online streaming media video content in MP4 format
CN109005368A (en) * 2018-10-15 2018-12-14 Oppo广东移动通信有限公司 A kind of generation method of high dynamic range images, mobile terminal and storage medium
CN109800035A (en) * 2019-01-24 2019-05-24 博云视觉科技(青岛)有限公司 A kind of algorithm integration service framework system
CN109800035B (en) * 2019-01-24 2022-11-15 博云视觉科技(青岛)有限公司 Algorithm integrated service framework system
CN110147469A (en) * 2019-05-14 2019-08-20 腾讯音乐娱乐科技(深圳)有限公司 A kind of data processing method, equipment and storage medium
CN110147469B (en) * 2019-05-14 2023-08-08 腾讯音乐娱乐科技(深圳)有限公司 Data processing method, device and storage medium
CN111641869A (en) * 2020-06-04 2020-09-08 虎博网络技术(上海)有限公司 Video split mirror method, video split mirror device, electronic equipment and computer readable storage medium
CN111641869B (en) * 2020-06-04 2022-01-04 虎博网络技术(上海)有限公司 Video split mirror method, video split mirror device, electronic equipment and computer readable storage medium
CN113014957A (en) * 2021-02-25 2021-06-22 北京市商汤科技开发有限公司 Video shot segmentation method and device, medium and computer equipment
CN113221943A (en) * 2021-04-01 2021-08-06 中国科学技术大学先进技术研究院 Diesel vehicle black smoke image identification method, system and storage medium
CN113221943B (en) * 2021-04-01 2022-09-23 中国科学技术大学先进技术研究院 Diesel vehicle black smoke image identification method, system and storage medium
CN113612923A (en) * 2021-07-30 2021-11-05 重庆电子工程职业学院 Dynamic visual effect enhancement system and control method
CN113612923B (en) * 2021-07-30 2023-02-03 重庆电子工程职业学院 Dynamic visual effect enhancement system and control method
CN113473182A (en) * 2021-09-06 2021-10-01 腾讯科技(深圳)有限公司 Video generation method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN102694966B (en) 2014-05-21

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee (granted publication date: 20140521; termination date: 20150305)
EXPY Termination of patent right or utility model