CN102694966B - Construction method of full-automatic video cataloging system - Google Patents

Publication number: CN102694966B (application CN201210054812.5A; other version CN102694966A, Chinese)
Authority: CN (China)
Inventor: 蔡靖
Assignee (original and current): Tianjin University of Technology
Applicant: Tianjin University of Technology
Legal status: Expired - Fee Related (the legal status is an assumption by Google Patents, not a legal conclusion)
Prior art keywords: frame, video, key frame, shot, fade

Abstract

The invention discloses a fully automatic video cataloging system and method. The system automatically builds a metadata database for large volumes of unstructured media material, supports manual modification, addition and refinement of the metadata on that basis, and ultimately enables effective management of massive media assets through the completed metadata database. The invention comprises the cataloging system's functions and architecture, a robust automatic extraction algorithm for video key frames, and a robust automatic extraction algorithm for shot segmentation frames. Software implementing the method achieves real-time processing on current mainstream computers. The proposed robust key-frame and shot-segmentation extraction algorithms resist interference from camera flashes and from special effects that frequently appear in video, including wipes, fades and dissolves, avoiding the excessive misses and false detections such interference would otherwise cause.

Description

Construction method of a fully automatic video cataloging system
Technical field
The present invention relates to multimedia technology, and in particular to building and effectively managing metadata for large volumes of media material.
Background art
Modern society produces vast amounts of information of every kind each day. In the film and television industry in particular, large quantities of multimedia material are produced and accumulated. Multimedia data, however, is unstructured, and managing it effectively is an urgent problem facing the media industry.
Media Asset Management (MAM) is an end-to-end solution for the comprehensive, lifetime management of media and content of all kinds (video/audio data, text, charts, etc.), covering collection, cataloging, management, transmission, transcoding and all other stages of digital media. It satisfies the media asset owner's needs to collect, preserve, search, edit and publish information of all kinds, gives users of the media assets online content and convenient access, and preserves and exploits those assets safely, completely, efficiently and at low cost.
The objects of multimedia data management are carriers of information in every style, for example text, graphics, images and sound. Their salient features are: (1) multimedia data (mostly unstructured) is highly varied, originates from different media, and takes entirely different forms and formats; (2) the data volume is enormous; (3) multimedia data has temporal character and versioning, and unlike traditional numerical and character data it is unstructured.
Media asset metadata is the information that describes a media asset. The quality, quantity, uniqueness, descriptive content, accessibility and intelligibility of metadata determine the success or failure of a media asset management system, so a sound metadata collection and production system is crucial.
The main function of a fully automatic media material cataloging system is precisely to collect and produce multimedia metadata, thereby supplying effective information to the media asset management system. Traditionally, media asset metadata has been produced manually: operators watch the video content, annotate it by hand, and generate the metadata from those annotations. This is time-consuming, laborious and inefficient. The industry therefore urgently needs an automatic metadata generation method to replace manual work and raise efficiency. At the same time, the complexity of media asset content, and the growing presence of certain characteristic content (special effects such as fades and wipes, the influence of camera flashes, etc.), greatly complicate automatic video segmentation. Extracting video key frames or shot segmentation frames efficiently and accurately remains an unsolved technical problem.
Automatic shot segmentation: a shot corresponds to one start-stop recording operation of a camera and represents an action continuous in time and space within one scene. Transitions between shots take many forms; the most common is the hard cut, alongside more complex transitions such as fades, dissolves and wipes. The shot is the basic unit of source material editing, and indexing by shot makes information retrieval convenient.
Automatic key frame extraction: in practice, the start and end of a shot can be far apart in time, and the content and scene may change greatly within it. Shot segmentation alone is therefore insufficient to capture all the important information in the material. To compensate for this inherent limitation of shot fragments, the concept of the key frame fragment is defined. Key frame fragment analysis is based on correlation analysis of the video content (rather than the physical start-stop action of the camera) and extracts representative key frames according to the complexity of the content, effectively serving as a video summary. A key frame fragment is the smallest meaningful unit of a video segment; the frames within it have very similar content and hence large information redundancy, so a key frame can represent the information of all frames in the fragment.
Summary of the invention
The aim of the present invention is to address the time-consuming, laborious manual cataloging of large volumes of media material by using advanced computer video analysis to provide a construction method for a fully automatic video cataloging system. The system saves substantial labor and improves the quality and efficiency of video cataloging.
The invention discloses a flexible, efficient, fully automatic video cataloging system and the associated intelligent algorithms. The system automatically analyzes video content, extracts both shot segmentation frames, which arise from the physical start-stop of the camera, and key frames, which represent segment content based on changes in that content, and generates the media material metadata database on the basis of the extracted frames. The proposed robust extraction algorithms resist the wipes, fades, dissolves and camera-flash effects that frequently occur in video, so these interferences do not cause excessive misses or false detections.
First, the invention provides a functional framework for the video cataloging system and defines the functional modules of each subsystem.
Second, the invention provides a fully automatic video key frame extraction algorithm that effectively suppresses wipes, fades, dissolves and camera-flash interference.
Third, the invention provides a fully automatic shot segmentation frame extraction algorithm with the same suppression properties.
To this end, the construction method of the fully automatic video cataloging system provided by the invention comprises:
1. The structure of the fully automatic video cataloging system, which comprises:
1.1. A media acquisition module, which captures video stream data through a video capture board;
1.2. A media analysis module, which applies computer vision processing to the video stream captured in step 1.1, extracts key frames or shot segmentation frames, and displays them on the storyboard for the user to edit and process further;
1.3. A metadata establishment, editing and management module, which, on the basis of the key frames and shot segmentation frames extracted in step 1.2, forms media material metadata organized by video segment and accompanied by descriptive information. The metadata is the final output of the cataloging system and the principal basis for effective management of unstructured media assets. Metadata editing and management comprise establishing and modifying the metadata model file, metadata display, metadata editing, metadata cache management, and metadata saving;
1.3.1. The metadata model file, which defines the content and organization of the metadata produced by structured analysis of the media material, including segment description, content classification, video author and cataloger. The model file is stored as XML; the system provides a default file, which the user may modify manually as required;
1.3.2. Metadata editing, comprising:
◆ key frame and shot segmentation frame editing: the user adds or deletes key frames or shot segmentation frames shown in the storyboard as required by the specific situation;
◆ video segment editing: starting from the key frames or shot segmentation frames extracted by the system, the user manually merges the content of two or more such frames into independent, meaningful video segments, and adds the corresponding information in the format defined by the metadata model file;
1.3.3. Metadata display, comprising:
◆ key frame or shot segmentation frame display: the frames extracted by automatic analysis are shown in the storyboard;
◆ segment content display: the edited segment information is shown in the user interface;
1.3.4. Metadata cache management, comprising:
◆ during automatic video and audio analysis, avoiding frequent hard disk access by writing analysis results to disk only when the analysis completes;
◆ using system RAM as a cache large enough to support several hours of continuous metadata analysis;
◆ when over-long continuous analysis exhausts memory, using the system paging file as a shared mapped file;
1.3.5. Metadata saving, comprising:
◆ saving automatically produced or modified metadata to the storage medium as an XML document;
◆ saving key frames or shot segmentation frames to a designated folder;
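As a concrete illustration of clause 1.3.5, the sketch below serializes a few catalogued segments to an XML document with Python's standard library. The element names (`MediaMetadata`, `Fragment`, and the field tags) are hypothetical stand-ins; the actual schema is whatever the metadata model file of clause 1.3.1 defines.

```python
import xml.etree.ElementTree as ET

def save_fragment_metadata(fragments, path):
    """Save catalogued video fragments as an XML metadata document.

    The element names used here are illustrative placeholders, not the
    schema defined by the system's metadata model file.
    """
    root = ET.Element("MediaMetadata")
    for frag in fragments:
        el = ET.SubElement(root, "Fragment")
        for tag in ("Description", "Category", "Author", "Cataloger"):
            ET.SubElement(el, tag).text = frag[tag.lower()]
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)

fragments = [{"description": "Opening scene", "category": "news",
              "author": "unknown", "cataloger": "auto"}]
save_fragment_metadata(fragments, "metadata.xml")
```

Reading the file back with `ET.parse` recovers the same fragment records, which is what the editing module would do when reloading saved metadata.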
1.4. A configuration management module. The system stores configuration information in an XML configuration file, which the user can edit manually or through the user interface; at initialization the system reads this file to configure its software modules. The configuration information comprises:
● input device configuration: video capture board or media file on disk;
● algorithm selection: key frame or shot segmentation frame extraction;
● key frame extraction algorithm parameters, comprising:
■ extraction sensitivity;
■ maximum or minimum frame gap: the interval between adjacent key frames is constrained to be no less than, or no more than, the specified interval.
2. Robust automatic extraction of media material video key frames, comprising:
2.1. Preprocessing, in two parts: the input video signal is first down-sampled within each frame to reduce algorithmic complexity; inter-frame correlation is then used to obtain a binarized frame difference image, which filters out the set of candidate key frames;
2.2. Information extraction. The candidate key frame set from step 2.1 is processed further: histogram features are extracted, and key frames are judged using difference measures, after a histogram Fourier transform, between adjacent frames and between the frame under test and the previous key frame. This step separately detects and suppresses fades, wipe effects and flash interference;
2.2.1. Fade fragments are detected by estimating the rate of change of the inter-frame luminance signal, which varies approximately linearly within such a fragment;
2.2.2. Wipe fragments are detected from the spatial regularity of the inter-frame wipe region, by examining the spatial variation of the wipe region between frames and across the whole fragment;
2.2.3. Flash fragments are detected from the short duration of a flash: the luminance difference between adjacent frames is large, while the difference between frames one apart remains small;
2.3. Information analysis. A final comprehensive analysis is performed on the feature information and special-scene detection results from step 2.2, and the key frame that best represents the segment content is selected;
2.3.1. For a fade frame sequence, the last frame, at which the effect completes, is output as the key frame;
2.3.2. For a wipe fragment sequence, the last frame, at which the effect completes, is likewise output as the key frame;
2.3.3. Frames affected by a detected flash are filtered directly out of the candidate sequence;
2.3.4. For the remaining ordinary candidate key frame sequences, within the fragment formed by each run of consecutive frames, the frame whose histogram Fourier transform differs most from the previous key frame is output as the key frame.
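The flash handling of steps 2.2.3 and 2.3.3 can be sketched in a few lines. This is a minimal illustration assuming grayscale luminance frames; the two thresholds are illustrative, since the patent states the criterion (large adjacent-frame difference, small difference one frame apart) but not concrete values.

```python
import numpy as np

def detect_flash_frames(frames, jump_thresh=30.0, return_thresh=10.0):
    """Flag frames affected by a camera flash.

    Per the text: a flash is short, so the luminance difference to the
    adjacent frame is large, while the difference between the frames on
    either side of the flash stays small. Thresholds are illustrative.
    `frames` is a list of 2-D luminance arrays.
    """
    flashes = []
    for t in range(1, len(frames) - 1):
        d_adjacent = np.abs(frames[t].astype(float) - frames[t - 1]).mean()
        d_skip = np.abs(frames[t + 1].astype(float) - frames[t - 1]).mean()
        if d_adjacent > jump_thresh and d_skip < return_thresh:
            flashes.append(t)  # to be filtered from the candidate list
    return flashes

# Synthetic clip: a constant scene with one bright flash at frame 2.
clip = [np.full((72, 90), 60.0) for _ in range(5)]
clip[2] = np.full((72, 90), 200.0)
print(detect_flash_frames(clip))  # → [2]
```

A real cut fails the second test: the luminance does not return to its previous level after the jump, so the frame is kept as a candidate.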
3. Robust automatic extraction of media material shot segmentation frames, comprising:
3.1. Preprocessing, in two parts: the input video signal is first down-sampled within each frame to reduce algorithmic complexity; inter-frame correlation is then used to obtain a binarized frame difference image, which filters out the set of candidate shot segmentation frames;
3.2. Information extraction. The candidate shot segmentation frame set from step 3.1 is processed further: histogram features are extracted, and the decayed histogram averaging method computes the decayed histogram mean and the statistical variance from the start frame of the current shot to the current frame; the inter-frame histogram χ² difference between the current frame and the decayed mean is then computed, and the variance is updated; a dynamic decision threshold derived from the continuously updated variance is used to judge shot segmentation frames. This step separately detects and suppresses fades, wipe effects and flash interference: fade fragments are detected by estimating the rate of the approximately linear inter-frame luminance change; wipe fragments are detected from the spatial regularity of the inter-frame wipe region, by examining the spatial variation of the wipe region between frames and across the whole fragment; flash fragments are detected from the short duration of a flash, whose adjacent-frame luminance difference is large while the difference between frames one apart remains small;
3.3. Information analysis. A final comprehensive analysis of the feature information and special-scene detection results from step 3.2 judges the shot segmentation frames, including the special cases of fades, wipe effects and flashes. For a detected fade frame sequence, the last frame is output as the key frame; for a detected wipe fragment sequence, the last frame is output as the key frame; frames affected by a detected flash are filtered directly out of the candidate sequence. A remaining ordinary candidate frame is judged a shot segmentation frame only if it meets both of the following conditions:
● mutation condition: the χ² difference between the current frame and the decayed histogram mean exceeds the mutation threshold determined by the statistical variance;
● smoothness condition: the χ² difference between the current frame and the decayed histogram mean exceeds the smoothness threshold determined by the inter-frame χ² difference between the current frame and the following frame.
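A minimal sketch of the two cut-decision conditions above, assuming histograms are NumPy arrays. The bookkeeping that produces the decayed mean and variance is kept outside the function, and the gains `k_mut` and `k_smooth` stand in for the patent's dynamic threshold formulas, which are not given in closed form here.

```python
import numpy as np

def chi2(h1, h2, eps=1e-9):
    """Chi-square difference between two histograms."""
    return float(np.sum((h1 - h2) ** 2 / (h1 + h2 + eps)))

def is_shot_cut(curr_hist, next_hist, decay_avg, variance,
                k_mut=5.0, k_smooth=2.0):
    """Apply the mutation and smoothness conditions for a shot cut.

    `decay_avg` and `variance` are the decayed histogram mean and the
    statistical variance accumulated since the start of the current shot;
    the gains k_mut and k_smooth are illustrative assumptions.
    """
    d_avg = chi2(curr_hist, decay_avg)            # current frame vs decayed mean
    d_next = chi2(curr_hist, next_hist)           # current frame vs next frame
    mutation = d_avg > k_mut * variance ** 0.5    # mutation condition
    smooth = d_avg > k_smooth * d_next            # smoothness condition
    return mutation and smooth

same = np.array([10.0, 0.0])
other = np.array([0.0, 10.0])
print(is_shot_cut(same, same, other, 1.0))  # → True  (clear change of content)
print(is_shot_cut(same, same, same, 1.0))   # → False (no change)
```

Requiring both conditions is what makes the test robust: a frame must differ sharply from its own shot's history (mutation) and by much more than it differs from the frame that follows it (smoothness), so gradual drift within a shot does not trigger a cut.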
Advantages and positive effects of the invention:
The invention uses advanced computer video analysis to analyze video content automatically and in real time, extracting key frames and shot segmentation frames according to the user's needs. On that basis it supports manual editing and refinement to build a content-based media material metadata database, supplying ample metadata to the back-end media asset management system. The disclosed robust key frame and shot segmentation frame extraction algorithms suppress interference from special scenes such as fades, dissolves, wipes and camera flashes. Because raw video content is unstructured, traditional manual segmentation and cataloging waste time and effort; adopting the present solution saves substantial cost and social resources.
Brief description of the drawings
Fig. 1 is the system architecture functional block diagram of the present invention.
Fig. 2 is the functional block diagram of a specific embodiment of the present invention.
Fig. 3 is an example of a fade special effect.
Fig. 4 is an example of a wipe.
Fig. 5 is an example of camera-flash interference.
Fig. 6 is an example of a metadata model file.
Fig. 7 is an example of a metadata configuration file.
Fig. 8 is the system flow (method) of the robust key frame extraction algorithm.
Fig. 9 is the flow (method) of the preprocessing submodule for key frame or shot segmentation frame extraction.
Fig. 10 is the flow (method) of the fade detection submodule.
Fig. 11 is the flow (method) of the wipe detection submodule.
Fig. 12 is the flow (method) of the flash detection submodule.
Fig. 13 is the system flow (method) of the robust shot segmentation frame extraction algorithm.
Embodiment
One. Fully automatic video cataloging system framework, as shown in Fig. 1, comprising:
1. Media acquisition module
This subsystem captures video stream data into computer memory through a video capture board and its driver, and passes it to the media analysis module.
2. Media analysis module
For the captured video stream, this module selects the processing algorithm and control parameters according to the configured options, performs intelligent analysis and key frame or shot segmentation frame extraction, and displays the extracted frames with their timecode information in the storyboard region of the user interface for further editing by the user.
3. Media metadata establishment, editing and management module
Media metadata management comprises establishing and modifying the metadata model file, metadata display, metadata editing, metadata cache management, and metadata saving.
The metadata model file defines the content and organization of the metadata produced by structured analysis of the media material (segment description, content classification, video author, cataloger, etc.). It is stored as XML; the system provides a default, which the user may modify manually as needed. Fig. 6 shows an example of a metadata model file.
Metadata editing, comprising:
◆ key frame and shot segmentation frame editing: the system displays each key frame or shot segmentation frame on a dialog control, and operations on that control add, modify or delete frames;
◆ video segment editing: through the dialog controls displaying key frames or shot segmentation frames, the system composes video segments from those frames and provides metadata editing for each segment; the metadata is displayed in the metadata list control.
Metadata display, comprising:
◆ key frame or shot segmentation frame display: each frame is shown on its own dialog control, and all dialog controls are attached to a single storyboard view;
◆ segment content display: the edited segment information (segment description, content classification, video author, cataloger, etc.) is shown in the metadata list control, whose content is the metadata file stored as XML on the storage medium.
Metadata cache management, comprising:
◆ during automatic video and audio analysis, avoiding frequent hard disk access by writing analysis results to disk only when the analysis completes;
◆ using system RAM as a cache large enough to support several hours of continuous metadata analysis;
◆ when over-long continuous analysis exhausts memory, using the system paging file as a shared mapped file.
Metadata saving, comprising:
◆ saving automatically produced or modified metadata to the storage medium as an XML document;
◆ saving key frames or shot segmentation frames to a specific folder.
4. Configuration management module
The fully automatic video cataloging system stores configuration information in a configuration file in XML format. The system provides a user interface that displays and edits the file through dialogs, and the user may also modify it directly by hand. At initialization the system reads the file to configure its software modules. Fig. 7 shows an example of the configuration file. The configuration information comprises:
● input device configuration: video capture board or media file on disk;
● algorithm selection: key frame or shot segmentation frame extraction;
● key frame extraction algorithm parameters, comprising:
■ extraction sensitivity;
■ maximum or minimum frame gap: the interval between adjacent key frames is constrained to be no less than, or no more than, the specified interval.
Two. The robust video key frame extraction method provided by the invention comprises:
1. Processing flow design
The key frame extraction method is illustrated by Fig. 8. The whole method divides into three parts: a preprocessing submodule, an information extraction submodule, and an information analysis submodule.
◆ The preprocessing submodule first applies simple, fast preprocessing to each input video frame, comprising spatial down-sampling and temporal pre-analysis, and roughly filters out the candidate key frame set.
◆ Information extraction, comprising two parts:
● The first part further processes the candidate key frame set from the previous step: it extracts feature information, filters the candidate frame sequence further according to each frame's intra-frame and inter-frame features, and records each continuous candidate frame sequence with its features as a whole candidate fragment for later analysis and key frame selection.
● The second part performs special processing and detection for special scenes, comprising:
■ flash effect detection;
■ fade and dissolve effect detection;
■ inter-frame wipe effect detection.
◆ The information analysis submodule, on the basis of the preceding detection and processing, applies a different processing policy to each situation and finally extracts the video key frames.
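The fade and dissolve detection listed above rests on the near-linear luminance change within such fragments. A minimal sketch of the rate-estimation idea, using per-frame mean luminance and illustrative tolerances (the patent specifies the linearity criterion but not numeric bounds):

```python
import numpy as np

def looks_like_fade(mean_luma, tol=1.5, min_rate=2.0):
    """Check whether a run of frames behaves like a fade.

    The text models a fade as an approximately linear luminance change;
    here the per-frame rate is estimated from successive differences of
    the mean luminance, and the run is accepted when that rate is steady
    and non-negligible. Tolerances are illustrative assumptions.
    """
    rates = np.diff(np.asarray(mean_luma, dtype=float))
    steady = np.all(np.abs(rates - rates.mean()) < tol)
    significant = abs(rates.mean()) > min_rate
    return bool(steady and significant)

fade = [200, 180, 160, 140, 120]  # near-linear fade-out
cut = [200, 200, 60, 60, 60]      # abrupt cut
print(looks_like_fade(fade), looks_like_fade(cut))  # → True False
```

An abrupt cut concentrates the whole change in one frame, so its rate sequence is far from constant and the linearity test rejects it.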
1.1. Preprocessing submodule: obtaining candidate fragment start frames
The preprocessing subsystem is illustrated by Fig. 9 and comprises the following two steps:
1) Spatial down-sampling to improve processing efficiency.
To improve processing efficiency without harming the analysis, the input video stream is first down-sampled. In the specific implementation, input video with resolution 720 × 576 is down-sampled by 8 × 8, yielding video stream data with resolution 90 × 72.
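The 8 × 8 down-sampling step can be written in a few lines. This sketch averages each 8 × 8 block of the 576 × 720 luminance plane; block-averaging is one reasonable choice, as the patent does not fix the down-sampling kernel.

```python
import numpy as np

def downsample_8x8(luma):
    """Average each 8x8 block: a 576x720 luminance plane becomes 72x90."""
    h, w = luma.shape
    return luma.reshape(h // 8, 8, w // 8, 8).mean(axis=(1, 3))

frame = np.arange(576 * 720, dtype=float).reshape(576, 720)
print(downsample_8x8(frame).shape)  # → (72, 90)
```

All subsequent frame-difference and template operations then run on the 64-times-smaller frames, which is what makes real-time processing on ordinary hardware plausible.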
2) preanalysis.Key frame object is for mark, describes the appearance of new scene in video or fresh content, therefore in processing procedure, needs to filter out that those are a large amount of, describe the similar frame of segment contents height to last key frame.For this reason,
● first define two-value template M (i, j), its size is lower sampling rear video frame sign, 96 × 72.Each point value of initialization is 0.
● for each frame, consider its luminance signal I t, calculate following two binaryzation frame difference images, and utilize morphology open-close operator to carry out noise reduction and remove details.Here get threshold value T=10.
■ Adjacent-frame binarized difference image:

$$d_{t,t-1}(i,j)=\begin{cases}0, & |I_t(i,j)-I_{t-1}(i,j)|<T\\ 1, & \text{otherwise}\end{cases}$$

$$d'_{t,t-1}(i,j)=OC\left(d_{t,t-1}(i,j)\right)$$
● Binarized difference image between the current frame I_t and the previous key frame I_k (if no key frame has been detected yet, an all-black frame stands in for the previous key frame):

$$d_{t,k}(i,j)=\begin{cases}0, & |I_t(i,j)-I_k(i,j)|<T \ \text{or}\ |I_{t-1}(i,j)-I_k(i,j)|<T\\ 1, & \text{otherwise}\end{cases}$$

$$d'_{t,k}(i,j)=OC\left(d_{t,k}(i,j)\right)$$
● Update the template M(i, j):

$$M_{new}(i,j)=M_{old}(i,j)\cup d'_{t,k}(i,j)$$
● Count the 1-entries of M, d_{t,t-1} and d_{t,k}:

$$N(M)=\sum_{i,j}M(i,j),\qquad N(d_{t,t-1})=\sum_{i,j}d_{t,t-1}(i,j),\qquad N(d_{t,k})=\sum_{i,j}d_{t,k}(i,j)$$
When either of the following conditions holds, a new content fragment is considered to have started, and all video frames from the current frame onward are marked as candidate frames until a new key frame is selected:

1) N(M) > size/5 and N(d_{t,t-1}) − N(d_{t,k}) > size/10
2) N(d_{t,t-1}) > size/20 and N(d_{t,t-1}) − N(d_{t,k}) > size/10

where size is the number of pixels in a video frame.
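The pre-analysis steps above can be sketched end to end. The open-close operator is implemented here with a plain 3 × 3 square neighborhood in NumPy; the structuring element, and the choice to count the primed (post-OC) images in the conditions, are assumptions where the text is not explicit.

```python
import numpy as np

T = 10  # binarization threshold from the text

def _shifted_min_max(b, op):
    """3x3 neighborhood minimum (erosion) or maximum (dilation)."""
    p = np.pad(b, 1)
    stacks = [p[i:i + b.shape[0], j:j + b.shape[1]]
              for i in range(3) for j in range(3)]
    return op(stacks, axis=0)

def open_close(b):
    """Morphological open then close, standing in for the OC operator."""
    opened = _shifted_min_max(_shifted_min_max(b.astype(int), np.min), np.max)
    return _shifted_min_max(_shifted_min_max(opened, np.max), np.min)

def candidate_start(frame, prev, key, template):
    """Evaluate the two fragment-start conditions on down-sampled luminance.

    `template` is the accumulated binary mask M; the function returns the
    decision and the updated mask.
    """
    size = frame.size
    d_prev = open_close(np.abs(frame - prev) >= T)        # d'_{t,t-1}
    d_key = open_close((np.abs(frame - key) >= T)
                       & (np.abs(prev - key) >= T))       # d'_{t,k}
    template = template | d_key                           # M_new
    n_m, n_prev, n_key = template.sum(), d_prev.sum(), d_key.sum()
    started = ((n_m > size / 5 and n_prev - n_key > size / 10) or
               (n_prev > size / 20 and n_prev - n_key > size / 10))
    return started, template

prev = np.zeros((72, 90)); key = np.zeros((72, 90))
frame = np.zeros((72, 90)); frame[:20, :] = 100.0  # large new region appears
started, M = candidate_start(frame, prev, key, np.zeros((72, 90), dtype=int))
print(started)  # → True
```

The second condition fires here: the adjacent-frame change covers well over a twentieth of the frame while the change relative to the (all-black) previous key frame stays small after the open-close cleanup.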
1.2. Information extraction submodule
On the basis of the filtered candidate video fragments, the information extraction subsystem extracts further feature information, while detecting and suppressing the special scenes and effects that disturb key frame extraction: flash interference, fade and dissolve effects, and wipe effects.
1) Basic information extraction and candidate key frame sequence recording
● Basic information extraction
The raw basic information is based on the input YUV video signal.
For each video frame, the Y, U and V components are first analyzed by histogram statistics, giving the 256-bin color histograms h_y, h_u and h_v.
Extended to a frame sequence, this yields the histogram map h(t, i), where t is the frame index and i ∈ [0, 255] is the luminance or chroma component value. On this basis, define:
The frame sequence that ■ is N for sequence length, definition histogram average is:
Ave = 1 N &Sigma; t = 0 N - 1 h &rho; ( t , i )
wherein h y(i), h u(i), h v(i) be respectively Y, U, V histogram of component
■ Define the Fourier transforms of the color histogram components:

$$h_y(i)\leftrightarrow H_y(j\omega),\qquad h_u(i)\leftrightarrow H_u(j\omega),\qquad h_v(i)\leftrightarrow H_v(j\omega)$$
■ Define the histogram difference measure between frame m and frame n:

$$d_{m,n}=\sum_{\omega=0}^{\pi/4}\left(\left|H_y^m(j\omega)-H_y^n(j\omega)\right|^2+\left|H_u^m(j\omega)-H_u^n(j\omega)\right|^2+\left|H_v^m(j\omega)-H_v^n(j\omega)\right|^2\right)$$
As the equation shows, only the low-frequency components in the frequency domain are considered when computing the difference measure. This serves two purposes:
◆ the modulus of a frequency-domain signal is insensitive to delays along the time axis; here this means insensitivity to global shifts in overall YUV brightness, which to some extent suppresses flash interference;
◆ analysing only the low-frequency components of the frequency-domain information is equivalent to smoothing the original signal, which eliminates the influence of noise.
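The histogram Fourier transform and the low-frequency difference measure can be sketched as follows (an illustration, not the patented implementation; mapping ω ∈ [0, π/4] onto the first 256/8 = 32 DFT bins is an assumption about the discretization):

```python
import numpy as np

def yuv_histograms(y, u, v):
    """256-bin histograms h_y, h_u, h_v of the Y, U, V planes."""
    return [np.bincount(p.ravel(), minlength=256)[:256] for p in (y, u, v)]

def lowfreq_hist_distance(hists_m, hists_n, n_low=32):
    """d_{m,n}: summed squared spectral difference over the lowest DFT bins.

    Truncating to low frequencies acts like smoothing the histograms,
    which is the noise-suppression effect described in the text.
    """
    d = 0.0
    for hm, hn in zip(hists_m, hists_n):
        Hm = np.fft.fft(hm)[:n_low]
        Hn = np.fft.fft(hn)[:n_low]
        d += float(np.sum(np.abs(Hm - Hn) ** 2))
    return d
```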
● Candidate key-frame sequence selection
Using the histogram difference measure, candidate key frames are then selected.
During key-frame detection, in order to make the detected key frames more representative, a continuous key-frame sequence is detected first, and the most representative frame within that sequence is chosen as the key frame.
The detection process can therefore be divided into two steps:
● detection of the key-frame segment start frame;
● detection of the key-frame segment end frame.
Detection of the key-frame segment start frame:
1. The first frame of the video is defined as a key frame;
2. The histogram mean Ave of the frames from the previous key frame up to the frame before the current one is updated and used as the reference histogram;
3. The histogram difference measures d_{t,t-1} (current frame vs. previous frame) and d_{t,k} (current frame vs. previous key frame) are computed; if d_{t,k} > T, the scene has changed substantially, so marking of key-frame candidate frames begins, i.e. a new content fragment starts.
Detection of the key-frame segment end frame:
When the first key-frame candidate frame is detected, a new video segment has started. The next task is to determine when this sequence ends, i.e. to find the key-frame segment end frame. All frames between the key-frame start frame and the key-frame end frame are marked as key-frame candidates, and the final key frame is chosen from this sequence. The end frame is determined as follows:
1. After the key-frame segment start frame, the reference histogram is first updated with the mean operation, and the difference measures d_{t,t-1} and d_{t,k} of each frame against the previous frame and the reference histogram are computed;
2. If d_{t,k} > T, another key-frame segment has started; the old key-frame segment is analysed to output the best key frame, and the new key-frame start frame is recorded;
3. If d_{t,t-1} < T, the video transition has settled, and this frame is the key-frame segment end frame;
4. In addition, if this frame is more than 30 frames away from the key frame, the segment is forcibly ended, the key-frame segment is analysed, and the key frame is produced.
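The end-frame rules (settled transition, or the 30-frame force-close) can be sketched over a precomputed sequence of adjacent-frame differences; this is a simplified sketch, and the function name and inputs are hypothetical:

```python
def find_segment_end(d_prev, start, T, max_gap=30):
    """Scan forward from a detected segment start until an end condition.

    d_prev[t] stands in for d_{t,t-1}; T is the decision threshold.
    """
    for t in range(start + 1, len(d_prev)):
        if d_prev[t] < T:          # transition settled: segment ends here
            return t
        if t - start >= max_gap:   # force-close an overly long segment
            return t
    return len(d_prev) - 1
```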
2) Special scene detection
Some special scenes in the video stream strongly affect key-frame extraction: the inter-frame change is large while the actual content change is small, so conventional detection methods tend to over-detect. The robust key-frame detection proposed by the invention takes these special cases into account and applies a dedicated processing method to each, achieving good suppression.
● Detection of fade-in/fade-out and dissolve:
Fade-in/fade-out and dissolve are common effects in video streams; a typical fragment is illustrated by Fig. 3. The formation principles of fades and dissolves are similar. A fade-in means the frame content gradually emerges from an all-black frame over time until it is fully visible; a fade-out is the opposite: the frame content gradually darkens over time until it disappears into an all-black frame. A dissolve superimposes two video fragments of different content with weights that change over time: one fragment gradually vanishes until it disappears, while the other gradually appears until it is fully visible.
In essence, such an effect is the superposition of two frames in varying proportions, with the ratio k changing linearly over time. The detection process is illustrated by Figure 10.
When detecting fade and dissolve effects, the total luminance of a video frame is considered; define
Y_t = Σ y_t(i,j)
Within a fade frame sequence, adjacent frames satisfy:
y_t(i,j) = k·y_{t-1}(i,j) + (1−k)·y_{t+1}(i,j), k ∈ [0,1]
Using the frame luminance totals, the proportionality coefficient can be estimated:
k = (Y_{t+1} − Y_t) / (Y_{t+1} − Y_{t-1})
With the estimated coefficient k, an inter-frame difference measure and decision thresholds are defined:
Frame difference measure: d = Σ | k·y_{t-1}(i,j) + (1−k)·y_{t+1}(i,j) − y_t(i,j) |
Decision thresholds: t_1 = Σ | y_{t-1}(i,j) − y_t(i,j) |
t_2 = Σ | y_{t+1}(i,j) − y_t(i,j) |
When both d < k·t_1 and d < k·t_2 are satisfied, the frame is marked as a fade frame.
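The fade test can be sketched on three consecutive luma frames; the estimate of k and the d < k·t_1, d < k·t_2 decision follow the formulas above, while the array layout of the frames is an assumption:

```python
import numpy as np

def is_fade_frame(prev, cur, nxt):
    """Check the fade/dissolve condition for the middle of three luma frames."""
    Yp, Yc, Yn = (float(f.sum()) for f in (prev, cur, nxt))
    if Yn == Yp:                       # no overall change: cannot estimate k
        return False
    k = (Yn - Yc) / (Yn - Yp)          # estimated proportionality coefficient
    if not 0.0 <= k <= 1.0:
        return False
    blend = k * prev + (1.0 - k) * nxt # k-weighted blend of the neighbours
    d  = np.abs(blend - cur).sum()
    t1 = np.abs(prev - cur).sum()
    t2 = np.abs(nxt - cur).sum()
    return d < k * t1 and d < k * t2
```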
● Flash detection and suppression:
1. Flash frame detection.
A flash fragment example is illustrated by Fig. 5. A flash strongly affects the luminance and chrominance signals of the image. When detecting key frames or shot-segmentation frames, flash-affected frames must be detected so that they are not mistakenly chosen as key frames. The detection process is illustrated by Figure 12. To this end,
the binarized inter-frame difference image is defined as (here T = 10):
D_{m,n}(i,j) = 0 if |y_m(i,j) − y_n(i,j)| < T, 1 otherwise
Then the morphological open-close operator is used to remove noise and fine detail:
D′(i,j) = OC(D(i,j))
The inter-frame difference-image measure is defined as the number of 1s in the difference image:
N(D′(i,j)) = Σ_{i,j} D′(i,j)
A frame t is marked as a flash-affected frame when one of the following conditions is met:
a) N( D′_{t-1,t}(i,j) ∩ ¬D′_{t-1,t+1}(i,j) ) > min( size/5, N(D′_{t-1,t+1}(i,j)) )
b) N( D′_{t,t+1}(i,j) ∩ ¬D′_{t-1,t+1}(i,j) ) > min( size/5, N(D′_{t-1,t+1}(i,j)) )
Here, size is the number of pixels in a frame, and ¬ denotes the complement of the binary image.
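A sketch of the flash-frame test, assuming the frames are greyscale arrays; the morphological open-close cleanup is omitted here for brevity:

```python
import numpy as np

def binary_diff(a, b, T=10):
    """D_{m,n}: 1 where the luma difference reaches T, else 0."""
    return (np.abs(a.astype(int) - b.astype(int)) >= T).astype(np.uint8)

def is_flash_frame(prev, cur, nxt, T=10):
    """Flash test: cur differs from both neighbours while prev and nxt agree."""
    size = cur.size
    d_pc = binary_diff(prev, cur, T)   # D'_{t-1,t}
    d_cn = binary_diff(cur, nxt, T)    # D'_{t,t+1}
    d_pn = binary_diff(prev, nxt, T)   # D'_{t-1,t+1}
    not_pn = 1 - d_pn                  # complement of D'_{t-1,t+1}
    thresh = min(size / 5, int(d_pn.sum()))
    cond_a = int((d_pc & not_pn).sum()) > thresh
    cond_b = int((d_cn & not_pn).sum()) > thresh
    return cond_a or cond_b
```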
2. Flash suppression
During analysis, besides explicit flash detection, the following means suppress other global luminance jumps:
A. the shift-insensitivity of the Fourier-transform modulus in the time domain is used to suppress global amplitude changes;
B. an adaptive threshold is used to suppress flash effects: when a section of video changes excessively, the decision threshold is raised adaptively to suppress its influence.
● Wipe frame detection:
A wipe fragment example is illustrated by Fig. 4. A wipe generally behaves as follows: over 5 to 10 consecutive frames, each frame differs markedly from the previous frame in one visible region while the other regions differ little, and this region of large visible difference moves regularly, for example from left to right or from top to bottom. The wipe interval can be detected from the regularity of this difference-region movement.
The wipe detection process, illustrated by Figure 11, consists of the following 4 steps:
Step 1: Define a wipe template Mask that records the region wiped across all frames of the wipe fragment. The template has the same size as a video frame, with its contents initialized to 0:
Mask(i,j) = 0, where i and j are the row and column indices.
Step 2: Use the binarized inter-frame difference image to detect the inter-frame wipe region:
2.1 Compute the binarized inter-frame difference image and apply the morphological open-close operator to remove noise and detail, obtaining the processed binary image D′(i,j);
2.2 Select the largest connected component C(i,j) in the binary image D′(i,j) as the candidate wipe region;
2.3 When the template and the largest connected component simultaneously satisfy the following conditions, declare this frame a wipe frame:
◆ the inter-frame wipe candidate region is large enough: N(C(i,j)) > size/15
◆ the newly wiped region is large enough: N( ¬Mask(i,j) ∩ C(i,j) ) > size/20
If no wipe candidate region is detected, jump to Step 4.
Step 3: If Step 2 detected a wipe region, update the template Mask and return to Step 2 to process the next frame:
Mask_new(i,j) = Mask_old(i,j) ∪ C(i,j)
Otherwise, jump to Step 4.
Step 4: If a fragment of consecutive wipe frames has been detected and both of the following conditions are met, declare the frame sequence a wipe fragment and designate the last frame of the fragment as a key frame:
◆ N(Mask(i,j)) > size/3
◆ the number of consecutive wipe frames is greater than 3
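Steps 1-4 can be sketched as follows, with the largest connected component per frame assumed to be precomputed (component labelling and the open-close cleanup are omitted):

```python
import numpy as np

def detect_wipe(regions, size):
    """Run the wipe steps over precomputed per-frame change regions C_t.

    regions: list of binary arrays, each standing in for the largest
    connected component of the binarized frame difference.
    """
    mask = np.zeros_like(regions[0])          # step 1: empty wipe template
    wipe_frames = 0
    for C in regions:                         # step 2: test each frame
        big_enough = C.sum() > size / 15
        newly_wiped = ((1 - mask) & C).sum() > size / 20
        if big_enough and newly_wiped:
            mask |= C                         # step 3: grow the template
            wipe_frames += 1
        else:
            break                             # step 4: sequence ended
    return mask.sum() > size / 3 and wipe_frames > 3
```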
1.3, information analysis submodule
The information analysis subsystem analyses the recorded key-frame fragments and determines the best key frame. For the candidate frames of different natures detected above, the following strategies are applied:
a) For a detected fade frame sequence, the last frame is output as the key frame.
b) For a detected wipe fragment sequence, the last frame is output as the key frame.
c) Frames affected by a detected flash are filtered directly out of the candidate frame sequence.
d) For the remaining ordinary candidate key-frame sequences, the frame within each continuous sequence with the largest histogram difference from the last key frame is selected; the output key frame I_k satisfies:
d_{k,last} = max_{t∈D}( d_{t,last} )
where k is the key-frame index, D is the range of the continuous candidate key-frame sequence, and t is the sequence index variable.
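The four strategies can be sketched as a single dispatch; the `kind` labels and inputs are hypothetical:

```python
def analyse_segment(kind, frames, d_last=None):
    """Per-kind output strategy for one candidate segment: fades and wipes
    output their last frame, flash frames are dropped, and an ordinary run
    outputs the frame with the largest difference from the last key frame."""
    if kind in ("fade", "wipe"):
        return frames[-1]
    if kind == "flash":
        return None                  # filtered out of the candidates
    return max(frames, key=lambda t: d_last[t])
```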
Three, the robust shot-segmentation frame detection method provided by the invention comprises:
1, Processing flow design
The shot-segmentation frame extraction system is illustrated by Figure 13. The whole processing method is divided into three parts: the preprocessing submodule, the information extraction submodule and the information analysis submodule.
◆ The preprocessing submodule first performs simple, fast preprocessing on every frame of input video data, including spatial down-sampling and temporal pre-analysis, and roughly filters out the candidate key-frame set from it;
◆ The information extraction submodule comprises two parts:
● The first part further processes the candidate key-frame set filtered out in the previous step: it extracts characteristic information, further filters the candidate frame sequence according to the intra-frame and inter-frame characteristics of each frame, and records each continuous candidate frame sequence together with its characteristic information as a whole candidate fragment for further analysis and key-frame extraction.
● The second part performs special processing and detection for special scenes, including:
■ flash effect detection
■ fade-in/fade-out and dissolve effect detection
■ inter-frame wipe effect detection
◆ On the basis of the detection and processing in the previous step, the information analysis submodule applies different processing strategies to the different cases and finally extracts the shot-segmentation frames.
1.1, Preprocessing submodule: obtaining candidate fragment start frames
The function and method of the preprocessing subsystem are identical to those used for key-frame extraction; refer to the description above, which is not repeated here.
Through the preprocessing subsystem, the video resolution is reduced, and the candidate shot-segmentation frame sequence is filtered out.
1.2, information extraction submodule
On the basis of the filtered candidate video segments, the information extraction subsystem extracts further characteristic information; at the same time, it detects and suppresses special scenes and effects that interfere with the extraction, including flash interference, fade-in/fade-out and dissolve effects, and wipe effects.
1) essential information is extracted
● essential information is extracted
Two classes of inter-frame difference information are considered, together with a dynamic decision threshold:
A. Inter-frame pixel grey-level difference information, DOP (Difference of Pixels)
The binarized inter-frame difference image is defined as:
d_{t,t-1}(i,j) = 0 if |I_t(i,j) − I_{t-1}(i,j)| < T, 1 otherwise
Here T = 10.
Then the morphological open-close operator is used to remove noise and fine detail:
d′(i,j) = OC(d(i,j))
The inter-frame pixel grey-level difference measure is defined as:
N(d′(i,j)) = Σ_{i,j} d′(i,j)
When N(d′) < size/5, the current frame is similar to the previous frame and therefore cannot be a shot-segmentation frame, so the remaining processing is skipped.
When N(d′) > size/5, the current frame differs substantially from the previous frame, and subsequent processing is carried out.
B. Inter-frame pixel histogram difference information, DOH (Difference of Histograms)
Three techniques are applied when computing the inter-frame DOH: the histogram decay mean, the histogram χ² statistical difference, and adaptive threshold decision.
● Histogram χ² statistical difference.
Given the histograms H_m(i) and H_n(i) of frame m and frame n, the inter-frame histogram χ² statistical difference is defined as:
d(H_m, H_n) = (1/N²) · Σ_i (H_m(i) − H_n(i))² / max(H_m(i), H_n(i)), summed over the bins where H_m(i) ≠ H_n(i)
For YUV color histograms, the inter-frame color histogram difference measure is defined as:
D_{m,n} = d_y(H_m, H_n) + d_u(H_m, H_n) + d_v(H_m, H_n)
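The χ² difference can be sketched as follows; taking N as the total histogram count is an assumption, since the source does not spell N out:

```python
import numpy as np

def chi2_hist_diff(hm, hn):
    """Inter-frame histogram chi-square statistical difference.

    Bins where the histograms are equal contribute nothing, matching the
    H_m(i) != H_n(i) side condition (and avoiding division by zero).
    """
    hm = np.asarray(hm, dtype=float)
    hn = np.asarray(hn, dtype=float)
    n = hm.sum()                     # assumed value of N in the 1/N^2 factor
    mask = hm != hn
    num = (hm - hn)[mask] ** 2
    den = np.maximum(hm, hn)[mask]
    return float((num / den).sum()) / n**2

def yuv_hist_distance(hists_m, hists_n):
    """D_{m,n}: sum of per-component chi-square differences over Y, U, V."""
    return sum(chi2_hist_diff(a, b) for a, b in zip(hists_m, hists_n))
```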
● Histogram decay mean operation
When computing the inter-frame histogram χ² statistical difference, the current frame is compared not with the previous frame alone, but with the weighted average of all preceding frames in the same shot: frames near the frame to be compared receive larger weights, and frames further away receive smaller ones. The decay mean operation is defined as follows:
Given a sequence {H_t}, where t is the frame number, and an attenuation coefficient α < 1, the decay mean is defined as:
H̄ = ( H_t + α·H_{t-1} + α²·H_{t-2} + … ) / ( 1 + α + α² + … )
Through the attenuation coefficient, frames near the current frame receive larger weights, frames far from the current frame receive smaller weights, and frames too far away have negligible weights. The decay mean obtained in this way captures both the slow variation of content within the same shot and the correlation between frames.
During implementation, the current decay-mean frame is updated iteratively as follows, so that the information of all previous frames need not be stored:
H̄_{t+1} = ( α·(1 − α^t)·H̄_t + (1 − α)·H_{t+1} ) / ( 1 − α^{t+1} )
During video analysis, the histogram difference between the current frame's histogram and the decay mean of the histograms of all preceding frames in the same shot is computed to detect shot-segmentation frames:
D̄_{t,t-1} = d_y(H_t, H̄_{t-1}) + d_u(H_t, H̄_{t-1}) + d_v(H_t, H̄_{t-1})
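A sketch of the decay mean and its iterative update; because the printed recursion is garbled in the source, the exact update form below is reconstructed from the definition and should be read as an inference:

```python
import numpy as np

def decay_mean_direct(hists, alpha=0.7):
    """H-bar from the definition: the latest frame gets weight 1, earlier
    frames alpha, alpha^2, ... (hists is ordered oldest to newest)."""
    num = sum(alpha**s * h for s, h in enumerate(reversed(hists)))
    den = sum(alpha**s for s in range(len(hists)))
    return num / den

def decay_mean_update(prev_mean, new_hist, t, alpha=0.7):
    """Iterative update once frame t+1 arrives, given the decay mean over
    frames 1..t; no earlier frames need to be stored."""
    return (alpha * (1 - alpha**t) * prev_mean
            + (1 - alpha) * new_hist) / (1 - alpha**(t + 1))
```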
● Adaptive decision threshold
Considering the various noise influences and the inter-frame correlation within a shot, the final decision uses an adaptive threshold, obtained in the following two steps:
■ The local variance within the shot is dynamically updated:
◆ σ_0 = 0
◆ σ_t = ( 0.7·(1 − 0.7^{t-1})·σ_{t-1} + 0.3·D̄_{t,t-1} ) / ( 1 − 0.7^t )
■ Adaptive threshold: during processing, considering the noise influences and within-shot inter-frame correlation, the threshold is defined as:
T = 3·σ + 10
where σ is the standard deviation of the inter-frame difference variation within the same shot fragment. Under a Gaussian model, more than 99.7% of samples lie within 3σ of the mean, so the histogram difference of almost all non-shot-segmentation frames falls below this threshold.
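The variance update and threshold can be sketched directly from the formulas:

```python
def update_sigma(sigma_prev, d_cur, t):
    """Dynamic update of the within-shot difference deviation sigma_t
    (sigma_0 = 0; the 0.7/0.3 weights are those given in the document)."""
    return (0.7 * (1 - 0.7 ** (t - 1)) * sigma_prev
            + 0.3 * d_cur) / (1 - 0.7 ** t)

def cut_threshold(sigma):
    """Adaptive decision threshold T = 3*sigma + 10."""
    return 3 * sigma + 10
```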
2) special scene detects
Some special scenes and effects in the video stream strongly affect the correctness of shot-segmentation frame extraction; without dedicated handling they would cause many false detections. These special scenes and effects, mentioned above, include fade-in/fade-out and dissolve, wipe effects and flash interference. Their detection is identical to that used in key-frame extraction; refer to the description above, which is not repeated here.
1.3, information analysis submodule
The information analysis subsystem performs the final analysis according to the information extracted and detected above, and decides the shot-segmentation frames. For the candidate frames of different natures detected above, the following strategies are applied:
A. For a detected fade frame sequence, the last frame is output as the key frame.
B. For a detected wipe fragment sequence, the last frame is output as the key frame.
C. Frames affected by a detected flash are filtered directly out of the candidate frame sequence.
D. For the remaining ordinary candidate frame sequences, the following decision method is adopted:
If frame t simultaneously satisfies:
■ the sudden-change condition: D̄_{t,t-1} > 3·σ_t + 10
■ the smoothness condition: D̄_{t,t-1} > 2·D_{t,t+1}
then it is determined to be a shot-segmentation frame.
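A sketch of the two-condition decision, taking the difference measures as precomputed inputs:

```python
def is_shot_cut(d_decay, sigma, d_next):
    """Shot-segmentation decision for frame t: a sudden change relative to
    the decayed shot history AND a settled next frame must both hold."""
    sudden = d_decay > 3 * sigma + 10   # D-bar_{t,t-1} > 3*sigma_t + 10
    smooth = d_decay > 2 * d_next       # D-bar_{t,t-1} > 2*D_{t,t+1}
    return sudden and smooth
```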
Four, the following part introduces an exemplary embodiment
The present embodiment is software running on the Windows platform, developed with Visual C++.
Fig. 1 illustrates the architecture of the present embodiment. As seen from the figure,
● the present embodiment consists mainly of 5 modules:
■ Application graphical user interface: this module is implemented with Visual C++ MFC as a multiple-document application composed of 3 parts:
◆ Video display: implemented on the MFC view class; the video data collected in real time is displayed in this view frame;
◆ Storyboard: implemented on the MFC view class; the automatically extracted key frames are displayed in the storyboard view frame by means of dialog controls;
◆ Metadata: implemented on the MFC view class; the metadata information after editing and processing is displayed in the metadata view frame by means of a list-box control;
■ Cataloging system engine module, the main program module: by calling the lower-level modules (including the data acquisition module, the media analysis module and the metadata management module), it provides the video cataloging service to the upper-layer user interface; its main functions include application initialization, memory allocation, dynamic loading, and setting up the following sub-function modules;
■ Data acquisition submodule: this module is in dynamic-library form, dynamically loaded by the cataloging system engine module, and runs in an independent thread; its main function is to operate the underlying data acquisition card and collect data into the input shared memory;
■ Media analysis submodule: this module is in dynamic-library form, dynamically loaded by the cataloging system engine module, and runs in an independent thread; its main function is to run the intelligent key-frame or shot-segmentation frame extraction algorithm and output the results;
■ metadata management submodule
■ configuration management submodule
Fig. 2 illustrates the main classes used by this embodiment and their call relations, wherein,
◆ the upper-layer application is a multiple-document application framework based on MFC;
◆ the video display view class, inheriting from CFormView, is used to display the captured real-time video and time-code information;
◆ the storyboard view class, inheriting from CFormView, is used to display key frames or shot-segmentation frames; the extracted key frames or shot-segmentation frames are displayed via dialog controls;
◆ the metadata view class, inheriting from CFormView, is used to display metadata; the metadata content is displayed via a list control;
◆ the full-automatic video cataloging system engine integrates all functions of the video cataloging system, including data acquisition, media analysis and metadata management, and provides an interface to the upper layer;
◆ the data acquisition class provides real-time collection of video data through the data board; the collected data is placed in memory;
◆ the media analysis class calls the key-frame or shot-segmentation frame processing algorithm, processes the data in memory, and caches the results in the key-frame data cache;
◆ the metadata management class encapsulates the metadata management functions; according to the key-frame or shot-segmentation frame information produced by the media analysis module, and in response to the user's manual annotation, it performs metadata creation and management.

Claims (1)

1. A construction method for a full-automatic video cataloging system, characterized in that the method comprises:
1st, construction of the full-automatic video cataloging system;
1.1st, a media acquisition module, which collects video stream data through a video acquisition board;
1.2nd, a media analysis module, which performs intelligent analysis, using computer vision processing techniques, on the video stream data collected by the media acquisition module of step 1.1, extracts key frames or shot-segmentation frames, and displays the extracted key frames or shot-segmentation frames in the storyboard for further editing and processing by the user;
1.3rd, a metadata establishment, editing and management module, which, on the basis of the key frames and shot-segmentation frames extracted by the media analysis module of step 1.2, forms media material metadata organized by video fragment and accompanied by descriptive information; the metadata is the final output of the cataloging system and the main basis for effective management of unstructured media assets; metadata editing and management comprise establishment and modification of the metadata model file, metadata display, metadata editing, metadata cache management and metadata saving;
1.3.1, a metadata model file, which defines the content and organization of the metadata information obtained after the cataloging system performs structured analysis of the media material, including fragment description, content classification, video author and cataloger; the metadata model file is expressed in XML; the system provides a default file, and the user can modify the metadata model file manually as required;
1.3.2, metadata editing, comprising:
◆ key-frame and shot-segmentation frame editing: the user adds or deletes, as required by the specific situation, key frames or shot-segmentation frames displayed in the storyboard;
◆ video fragment editing: based on the key frames or shot-segmentation frames extracted by the system, the user merges, by manual editing, the content of two or more key frames or shot-segmentation frames into independent and meaningful video fragments, and adds the corresponding information according to the content format defined by the metadata model file;
1.3.3, metadata display, comprising:
◆ key-frame or shot-segmentation frame display: the system displays the automatically extracted key frames or shot-segmentation frames in the storyboard;
◆ fragment content display: the system displays the edited fragment information content in the user interface;
1.3.4, metadata cache management, comprising:
◆ during automatic video and audio analysis, to prevent frequent hard-disk reads and writes, the analysis results are written to the hard disk only when the analysis is complete;
◆ system RAM is used as the cache, with a memory size required to support several hours of continuous metadata analysis;
◆ when continuous analysis runs so long that memory becomes insufficient, the system paging file is used as a shared mapped file;
1.3.5, metadata saving, comprising:
◆ the automatically analysed or modified metadata is saved to the storage medium as an XML document;
◆ key frames or shot-segmentation frames are saved in a designated folder;
1.4th, a configuration management module; the system stores configuration information in a configuration file expressed in XML, which the user can edit manually or modify through the user interface; at system initialization this configuration file is read and the software modules are configured; the configuration information comprises:
● input device configuration: the video acquisition board or a media file on disk;
● algorithm function selection: key-frame or shot-segmentation frame extraction;
● key-frame extraction algorithm parameter settings, comprising:
■ key-frame extraction sensitivity setting;
■ maximum or minimum frame gap setting: constrain the interval between adjacent key frames to be no less than, or no more than, a specified time interval;
2nd, robust automatic extraction of media material video key frames;
2.1st, video low-level preprocessing, comprising two parts: first, intra-frame pixel down-sampling to reduce algorithm complexity; second, using inter-frame correlation to obtain a binarized frame difference image, which is used to filter out the candidate key-frame set;
2.2nd, information extraction: the candidate key-frame set obtained in step 2.1 is further processed to extract histogram feature information, and key-frame decision and extraction are carried out using the difference measures, after histogram Fourier transform, between adjacent frames and between the frame under test and the previous key frame; during this processing, this step provides separate detection and suppression of fades, wipe effects and flash interference;
2.2.1, for fade fragments, detection is based on estimating the rate of change, according to the property that the inter-frame luminance signal within such a fragment changes linearly;
2.2.2, for wipe-effect fragments, detection exploits the spatial regularity of the inter-frame wipe region, by detecting the inter-frame wipe region and the spatial variation of the wiped region across the whole wipe fragment;
2.2.3, for flash fragments, detection exploits the short duration of a flash: the luminance difference between adjacent frames is large while the luminance difference between alternate frames is small;
2.3rd, information analysis: according to the characteristic information and special scene detection results obtained by the information extraction of step 2.2, a final comprehensive analysis is performed and the key frame that best represents the fragment content is selected;
2.3.1, for a fade frame sequence, the last frame at which the fade effect completes is output as the key frame;
2.3.2, for a wipe fragment sequence, the last frame at which the wipe effect completes is output as the key frame;
2.3.3, frames affected by detected flashes are filtered directly out of the candidate frame sequence;
2.3.4, for the remaining ordinary candidate key-frame sequences, within the fragment formed by each continuous frame sequence, the frame whose histogram-Fourier-transform difference from the last key frame is largest is output as the key frame;
3rd, robust automatic extraction of media material shot-segmentation frames;
3.1st, preprocessing, comprising two parts: first, intra-frame pixel down-sampling to reduce algorithm complexity; second, using inter-frame correlation to obtain a binarized frame difference image, which is used to filter out the candidate shot-segmentation frame set;
3.2nd, information extraction: the candidate shot-segmentation frame set obtained in step 3.1 is further processed: histogram feature information is extracted, and the histogram decay mean method computes the decayed histogram mean and the statistical variance from the start frame of the current shot to the current frame; the inter-frame histogram χ² statistical difference between the current frame and the decayed histogram mean is then computed, and the statistical variance is updated; with the dynamically updated statistical variance, a dynamic decision threshold is computed and used to decide shot-segmentation frames; during this processing, this step provides separate detection and suppression of fades, wipe effects and flash interference; for fade fragments, detection is based on estimating the rate of change, according to the property that the inter-frame luminance signal within such a fragment changes linearly; for wipe fragments, detection exploits the spatial regularity of the inter-frame wipe region, by detecting the inter-frame wipe region and the spatial variation of the wiped region across the whole wipe fragment; for flash fragments, detection exploits the short duration of a flash: the luminance difference between adjacent frames is large while the luminance difference between alternate frames is small;
3.3rd, information analysis: according to the characteristic information and special scene detection results obtained from the information extraction of step 3.2, a final comprehensive analysis is performed and the shot-segmentation frames are decided, including handling of the special cases of fades, wipe effects and flashes; for a detected fade frame sequence, the last frame is output as the key frame; for a detected wipe fragment sequence, the last frame is output as the key frame; frames affected by detected flashes are filtered directly out of the candidate frame sequence; for the remaining ordinary candidate frames, the decision on a shot-segmentation frame requires that the following two conditions are both met:
● the sudden-change condition: the χ² statistical difference between the current frame and the decayed histogram mean exceeds the sudden-change threshold determined by the statistical variance;
● the smoothness condition: the χ² statistical difference between the current frame and the decayed histogram mean exceeds the smoothness threshold determined by the inter-frame χ² statistical difference between the current frame and the following frame.
CN201210054812.5A 2012-03-05 2012-03-05 Construction method of full-automatic video cataloging system Expired - Fee Related CN102694966B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210054812.5A CN102694966B (en) 2012-03-05 2012-03-05 Construction method of full-automatic video cataloging system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210054812.5A CN102694966B (en) 2012-03-05 2012-03-05 Construction method of full-automatic video cataloging system

Publications (2)

Publication Number Publication Date
CN102694966A CN102694966A (en) 2012-09-26
CN102694966B true CN102694966B (en) 2014-05-21

Family

ID=46860234

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210054812.5A Expired - Fee Related CN102694966B (en) 2012-03-05 2012-03-05 Construction method of full-automatic video cataloging system

Country Status (1)

Country Link
CN (1) CN102694966B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103065511B (en) * 2012-12-29 2015-04-01 福州新锐同创电子科技有限公司 Implementation method of teaching plan editor
CN104219491A (en) * 2013-06-04 2014-12-17 费珂 Image analysis function based video monitoring system and storage method thereof
CN104519401B (en) * 2013-09-30 2018-04-17 贺锦伟 Video segmentation point preparation method and equipment
CN103870598B (en) * 2014-04-02 2017-02-08 北京航空航天大学 Unmanned aerial vehicle surveillance video information extracting and layered cataloguing method
CN104184960A (en) * 2014-08-19 2014-12-03 厦门美图之家科技有限公司 Method for carrying out special effect processing on video file
CN104822087B (en) * 2015-04-30 2017-11-28 无锡天脉聚源传媒科技有限公司 A kind of processing method and processing device of video-frequency band
CN105227862A (en) * 2015-09-16 2016-01-06 上海工程技术大学 Can the video recombination system of auto Segmentation camera lens and video recombination method thereof
CN110019025B (en) * 2017-07-20 2021-12-21 中移动信息技术有限公司 Stream data processing method and device
CN108111537B (en) * 2018-01-17 2021-03-23 杭州当虹科技股份有限公司 Method for quickly previewing online streaming media video content in MP4 format
CN109005368B (en) * 2018-10-15 2020-07-31 Oppo广东移动通信有限公司 High dynamic range image generation method, mobile terminal and storage medium
CN109800035A (en) * 2019-01-24 2019-05-24 博云视觉科技(青岛)有限公司 A kind of algorithm integration service framework system
CN111641869B (en) * 2020-06-04 2022-01-04 虎博网络技术(上海)有限公司 Video split mirror method, video split mirror device, electronic equipment and computer readable storage medium
CN113014957A (en) * 2021-02-25 2021-06-22 北京市商汤科技开发有限公司 Video shot segmentation method and device, medium and computer equipment
CN113221943B (en) * 2021-04-01 2022-09-23 中国科学技术大学先进技术研究院 Diesel vehicle black smoke image identification method, system and storage medium
CN113473182B (en) * 2021-09-06 2021-12-07 腾讯科技(深圳)有限公司 Video generation method and device, computer equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002035377A2 (en) * 2000-10-23 2002-05-02 Binham Communications Corporation Method and system for providing rich media content over a computer network
CN101604325A (en) * 2009-07-17 2009-12-16 北京邮电大学 Method for classifying sports video based on key frame of main scene lens
CN101872346A (en) * 2009-04-22 2010-10-27 中国科学院自动化研究所 Method for generating video navigation system automatically

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8473525B2 (en) * 2006-12-29 2013-06-25 Apple Inc. Metadata generation for image files

Similar Documents

Publication Publication Date Title
CN102694966B (en) Construction method of full-automatic video cataloging system
JP5355422B2 (en) Method and system for video indexing and video synopsis
US7945142B2 (en) Audio/visual editing tool
CN103502936B (en) Automated systems and methods based on image
CN103347167A (en) Surveillance video content description method based on fragments
CN106937114B (en) Method and device for detecting video scene switching
CN102231820B (en) Monitoring image processing method, device and system
CN101887439B (en) Method and device for generating video abstract and image processing system including device
CN110008962B (en) Weak supervision semantic segmentation method based on attention mechanism
CN103678299A (en) Method and device for monitoring video abstract
CN102222104A (en) Method for intelligently extracting video abstract based on time-space fusion
Dumont et al. Automatic story segmentation for tv news video using multiple modalities
CN110795595A (en) Video structured storage method, device, equipment and medium based on edge calculation
CN106557760A (en) Monitoring system is filtered in a kind of image frame retrieval based on video identification technology
SenGupta et al. Video shot boundary detection: A review
Asim et al. A key frame based video summarization using color features
Zhu et al. Automatic scene detection for advanced story retrieval
Verma et al. A hierarchical shot boundary detection algorithm using global and local features
Paliwal et al. A survey on various text detection and extraction techniques from videos and images
Chen et al. A practical method for video scene segmentation
Lin et al. Background subtraction based on codebook model and texture feature
Zhang et al. An improved algorithm for video abstract
Lim et al. Plot preservation approach for video summarization
CN112543289A (en) AI (artificial intelligence) video point counting method, device, equipment and medium for pig breeding
Manonmani et al. Robust candidate frame detection in videos using semantic content modeling

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20140521

Termination date: 20150305

EXPY Termination of patent right or utility model