CN103426176A - Video shot detection method based on histogram improvement and clustering algorithm - Google Patents

Video shot detection method based on histogram improvement and clustering algorithm

Info

Publication number
CN103426176A
CN103426176A CN2013103799401A CN201310379940A
Authority
CN
China
Prior art keywords
shot
camera lens
frame
histogram
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2013103799401A
Other languages
Chinese (zh)
Other versions
CN103426176B (en)
Inventor
瞿中
陈昌志
刘达明
薛峙
高腾飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications filed Critical Chongqing University of Post and Telecommunications
Priority to CN201310379940.1A priority Critical patent/CN103426176B/en
Publication of CN103426176A publication Critical patent/CN103426176A/en
Application granted granted Critical
Publication of CN103426176B publication Critical patent/CN103426176B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)
  • Studio Devices (AREA)

Abstract

The invention discloses a video shot detection method based on an improved histogram and a clustering algorithm, and relates to image processing techniques. In the method, the improved histogram and the clustering algorithm are used to compute the intersection of the histograms of two adjacent image frames, and whether a shot change occurs is judged from the histogram similarity. If a shot change occurs, a secondary detection of the shot boundary is performed on the two adjacent frames using the inter-frame gray-level/color difference: the frames are processed with non-uniform block weighting, the pixel difference of each block is computed separately, each block's pixel difference is compared with a preset block frame-difference threshold to obtain a marker variable, the marker variables of all blocks are weighted and summed, and the weighted sum is compared with a preset block-partition threshold to complete the shot detection. The method improves the accuracy of shot detection and solves problems such as false detection of shots and discontinuous frame numbers.

Description

Video shot detection method based on improved histogram and clustering algorithm
Technical field
The present invention relates to image processing techniques, and in particular to a shot detection technique.
Background technology
A group of image frames that are continuous in the time domain forms a video stream. Because the video frame rate is generally high, even a very short video contains a large number of image frames, and adjacent frames are strongly correlated in their visual features, so content-based image retrieval methods cannot be applied directly in the video retrieval field. Only by structuring the video and building indexes and summaries for it, so that a linear structure of the video content is formed, can fast browsing and retrieval of video data be realized effectively. Video structuring includes shot segmentation, also called shot transition detection, which is the basis of video structural layering. Shot detection is required to avoid the influence of extraneous factors on the segmentation, to divide the video sequence into groups of shots, each consisting of consecutive frames with the same content, and to correctly detect the shot boundaries produced by various complex editing operations.
Shot segmentation must cut the video exactly at the shot boundaries to form individual independent shots, so as to guarantee the accuracy of key-frame extraction. Scholars such as Yeung and Nagasaka respectively proposed the histogram intersection algorithm and the χ² histogram algorithm, which improved the way the histogram difference is computed.
To reduce the interference that local motion within a shot may cause, Nagasaka et al. proposed processing each frame in blocks; to better detect sustained gradual transitions, Zhang et al. proposed a dual-threshold algorithm; for motion features, Shahraray et al. proposed a block matching algorithm that applies motion compensation to each block, improving the tolerance to local motion within a shot, while Akutsu et al. defined inter-frame similarity through the correlation coefficient of computed motion vectors and thereby detected shot transitions; since object edges within a shot also change when the shot changes, R. Zabih et al. proposed a scene segmentation method based on edge features; Chi-Chun Lo et al. proposed using the fuzzy C-means (FCM) clustering algorithm for shot segmentation, classifying all video frames into the two classes Shot Change (SC) and No Shot Change (NSC); Jin Hong et al. proposed using an unsupervised clustering algorithm to detect MPEG compressed video, with post-processing adapted to the characteristics of the video data; Cernekova [11] et al. proposed shot detection algorithms that combine mutual information and joint entropy between two adjacent frames. At present, many shot detection methods approach perfect detection of abrupt shot changes (cuts), but for gradual transitions, because of the diversity of transition modes and their vulnerability to noise, the detection performance of existing methods remains unsatisfactory. In addition, cuts and gradual transitions are generally detected with different methods, and identifying cuts alone is of little practical significance; a method that can identify both cuts and gradual transitions at the same time has therefore always been a research goal.
Shot segmentation is the basis of video structural layering; it has received wide attention from researchers and scholars, and rich research results exist. However, up to now there is still no "universally applicable" shot segmentation and detection method that performs well in all cases and for videos of all content types.
Shot transition detection divides a film or video into basic temporal units, namely shots. According to the editing mode that links shot boundaries, shot transitions can be divided into two classes: abrupt transitions (cuts) and gradual transitions. An abrupt transition (cut) is the process of switching suddenly from one shot to the next, corresponding to an editing mode in which two shots are joined directly; a gradual transition is the process in which the next shot gradually replaces the current one, also called a soft transition, corresponding to an editing mode in which two shots are joined using spatial or color effects. Gradual transitions include multiple transition modes, characterized by a switching process that is progressive and sustained. The more common gradual transitions are fade-in/fade-out, dissolve, wipe, sweep, and the like.
When a shot transition occurs, the video content (high-level semantics) usually changes as well. The ideal approach to shot detection and segmentation would analyze the video content (high-level semantics) directly, but because of the "semantic gap" and the ambiguity of high-level semantics involving human emotional factors, most shot detection algorithms still detect shot boundaries from changes in low-level video features at the boundary (visual and motion features such as color, edges, and texture). In general, a shot transition causes significant changes in the low-level features, such as a sudden change in the color distribution of the image frame or the moving in and out of object contours and edges. During a gradual transition, however, the low-level features change slowly and inconspicuously.
Moreover, even within the same shot, rapid changes of the video content and noise may cause large changes in the low-level features. In view of these many influencing factors, although existing algorithms can achieve good shot segmentation results in some particular cases, when the video contains extreme situations such as rapid object/camera motion or drastic changes of ambient illumination, and during gradual transitions, the segmentation results of many existing algorithms are still far from satisfactory.
In the prior art, the common approach to shot detection and segmentation is to compute a frame difference value Diff of low-level visual features or motion features between successive frames of the video and compare it with a preset or adaptive threshold T: if Diff > T, the position is a shot boundary; otherwise, the group of successive frames is considered to belong to the same shot. From this common approach, the metric used for the frame difference, the setting of the threshold, and the optimal combination of the two become the key points of shot detection and segmentation. Within the same shot, the video features change mainly for two reasons: object/camera motion and illumination changes. Object/camera motion causes new objects to appear constantly within the shot while old objects keep disappearing; if this is handled improperly it is easily confused with a gradual transition and causes false detection of shots. Illumination changes also occur frequently within a shot; if a frame suddenly brightens, the brightness-based frame difference jumps, and if handled improperly this will be detected as a cut, again causing false detection. Both factors must therefore be fully considered when designing the algorithm. To detect shot boundaries correctly and perform shot segmentation, the inter-frame content difference should ideally have the following property: the frame difference within a shot is small and relatively balanced, while at shot boundaries it is very large and jumps. Considering the two main causes of content change within a shot, the frame difference should be as insensitive as possible to object/camera motion and illumination changes within the shot, while keenly capturing the significant content changes at shot boundaries, where it jumps to a local maximum. In the research field of shot detection and segmentation, after decades of research and discussion many scholars and researchers have proposed their own algorithms, which, according to the characteristics of shot transitions, detect shot boundaries based on different frame visual features and camera motion features and have achieved certain results. In general, shot detection and segmentation algorithms fall into the following classes: pixel-based algorithms, histogram-based algorithms, algorithms based on motion features, algorithms based on edge features, and so on.
A histogram intuitively reflects the overall distribution of the gray levels (gray-level histogram) or colors (color histogram) of an image. Because of its excellent global property it is widely used in image processing, and several metrics exist: the basic method is to compute the histogram difference between adjacent video frames, but the result differs with the kind of histogram adopted. The basic method can be extended by introducing weighting coefficients and computing a weighted histogram distance between two images; the histogram intersection between two images, or other distance metrics, can also be computed.
Histogram-based algorithms are the most widely used shot detection and segmentation methods; they are simple and convenient to apply and have relatively low computational complexity, and for most videos they can achieve fairly good results as long as the threshold is set properly. The main advantage of histogram-based algorithms is precisely their global property.
The basic idea of histogram-based algorithms is the same as that of pixel-based algorithms: both compute a frame difference value; the difference lies in the metric adopted, the former being an extension built on the latter. Pixel-based algorithms sum the absolute gray-level or brightness differences of corresponding pixels of two adjacent frames to measure the degree of frame difference. This is the simplest and most basic algorithm for computing the frame difference, and it proceeds as follows:
The inter-frame gray-level or brightness difference of corresponding pixels is given by formula (1):

fd(i, j) = |f_n(i, j) − f_{n+1}(i, j)|   (1)

where f_n(i, j) and f_{n+1}(i, j) denote the gray level or brightness value of pixel (i, j) in the n-th and (n+1)-th frames respectively (depending on the histogram type). The total frame difference between the n-th and (n+1)-th frames is:

Fd = (1/(MN)) Σ_{i=1..M} Σ_{j=1..N} fd(i, j)   (2)

The total frame difference is then compared with a predetermined threshold; if it exceeds the threshold, a shot transition occurs at this position.
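As an illustration only, the pixel-based frame difference of formulas (1) and (2) can be sketched in a few lines of Python/NumPy; the function name, the synthetic frames, and the threshold value are assumptions made for this sketch and are not prescribed by the method described here.

```python
import numpy as np

def pixel_frame_difference(frame_n: np.ndarray, frame_n1: np.ndarray) -> float:
    """Mean absolute gray-level difference between two frames (formulas (1)-(2))."""
    # fd(i, j) = |f_n(i, j) - f_{n+1}(i, j)|
    fd = np.abs(frame_n.astype(np.float64) - frame_n1.astype(np.float64))
    # Fd = (1 / (M * N)) * sum of fd over all pixels
    return float(fd.mean())

# Usage: flag a shot transition when the difference exceeds a preset threshold T.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    f1 = rng.integers(0, 256, size=(240, 320), dtype=np.uint8)  # n-th gray frame
    f2 = rng.integers(0, 256, size=(240, 320), dtype=np.uint8)  # (n+1)-th gray frame
    T = 30.0  # illustrative threshold, not a value prescribed by the text
    print("cut detected" if pixel_frame_difference(f1, f2) > T else "same shot")
```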
Although the pixel-based algorithm is simple and easy to implement, it is very sensitive to object/camera motion within a shot: such motion changes the gray level or brightness of many pixels in the frame, leading to erroneous detection of shot boundaries. For this reason, histogram-based shot detection and segmentation methods were proposed.
(1) Histogram distance
An image histogram counts the distribution of pixels over each gray level, intensity level, or color level. Tonomura and Abe [14] proposed using the gray-level histogram as the frame-difference metric, computing the difference between the gray-level histograms of two adjacent frames as the image frame difference:

Σ_{v=0..V} |H(I_t, v) − H(I_{t−1}, v)| > T   (3)

If the frame difference of two adjacent frames satisfies formula (3), a shot transition occurs at this position. Various histogram-based improvements followed, for example: for color histograms, computing a quantized histogram difference according to the visual characteristics of the human eye and the need to reduce computation; for three-dimensional color spaces (typically RGB, HSV, etc.), computing the inter-frame histogram difference for each of the three color channels separately and taking a weighted sum. A representative extension is the quantized inter-frame histogram difference measure for three-dimensional color spaces proposed by Gargi and Kasturi [15]:

Σ_{k=1..3} Σ_{v=0..V} |H(I_t, C_k, v) − H(I_{t−1}, C_k, v)| > T   (4)

where C_k denotes the color channel of the color space, such as RGB or HSV; if the frame difference satisfies formula (4), a shot change occurs at this position.
(2) Histogram weighting
In a three-dimensional color space, some color components affect the color appearance of an image to a greater degree than others, or human vision is more sensitive to them (such as the Hue component of the HSV color space). Therefore, analyzed case by case, a large weight should be assigned to color components that strongly influence color appearance or better match human visual sensitivity, and a smaller weight to components with less influence or that are hard to perceive directly. The weighted sum gives a weighted inter-frame histogram difference that better reflects the content distance or difference between video frames as perceived by human vision [16]. If

Σ_{k=1..3} Σ_{v=0..V} (L(I_t, C_k) / L_mean(I_t)) · |H(I_t, C_k, v) − H(I_{t−1}, C_k, v)| > T   (5)

a shot transition is considered to occur at this position, where L(I_t, C_k) denotes the value of the k-th color component of frame t and L_mean(I_t) denotes the mean color obtained from all color components of frame t. Zhao [17] et al. proposed a new learning method that obtains a better similarity measure through min-max optimization, setting a different weight for each color component and thereby obtaining a weighted histogram distance. If

Σ_{k=1..3} Σ_{v=0..V} w(k, v) · |H(I_t, C_k, v) − H(I_{t−1}, C_k, v)| > T   (6)

a shot change is considered to occur, where w(k, v) denotes the weighting coefficient of the k-th color component of frame t.
(3) Histogram intersection
In the shot detection field, histogram intersection [2], another measure of histogram similarity, is also widely used, and it can be computed in several ways. For example, the histogram intersection of frames t−1 and t can be obtained by the minimum-function method: if

(1 − (1/(xy)) Σ_{v=0..V} min(H(I_t, v), H(I_{t−1}, v))) > T   (7)

a shot change is considered to occur at this position, where xy denotes the total number of pixels in the image frame; the histogram intersection computed in this way lies in [0, 1].

Another method of computing the histogram intersection [18] is shown in formula (8): if

(1 − (1/(xy)) Σ_{v=0..V} min(H(I_t, v), H(I_{t−1}, v)) / max(H(I_t, v), H(I_{t−1}, v))) > T   (8)

a shot change is considered to occur at this position.

The histogram intersection method counts the number of pixels of two adjacent frames that have the same gray level, brightness, or color value; its essence is the same as directly computing the histogram distance.
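To make the minimum-function form of the histogram intersection (formula (7)) concrete, a small Python/NumPy sketch follows; the bin count, the function name, and the threshold are illustrative assumptions.

```python
import numpy as np

def histogram_intersection_distance(frame_t: np.ndarray, frame_t1: np.ndarray,
                                    bins: int = 256) -> float:
    """1 - (1/xy) * sum_v min(H(I_t, v), H(I_{t-1}, v)), as in formula (7)."""
    xy = frame_t.size  # total number of pixels in the frame
    h_t, _ = np.histogram(frame_t, bins=bins, range=(0, 256))
    h_t1, _ = np.histogram(frame_t1, bins=bins, range=(0, 256))
    intersection = np.minimum(h_t, h_t1).sum() / xy
    return 1.0 - intersection  # in [0, 1]; large values suggest a shot change

# A shot change would be declared when this distance exceeds a preset threshold T.
```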
(4) χ² histogram
The χ² histogram method [3] is an effective extension of the traditional histogram method; because it amplifies the inter-frame histogram difference and the algorithm is more stable, it better reflects the difference between two adjacent frames and has been widely applied. χ² is defined as:
χ²(I_t, I_{t−1}) = Σ_{v=0..V} (H(I_t, v) − H(I_{t−1}, v))² / H(I_{t−1}, v)
χ² is then compared with a predetermined threshold T; if it is greater than T, a shot change is considered to occur at this position. Compared with the Kolmogorov-Smirnov test and Yakimovsky's likelihood-ratio test, this method performs better [19].
(5) Dual-threshold comparison method
The transition types of video shots can be divided into two kinds, cuts and gradual transitions. In general, the difference between consecutive frames during a gradual transition is smaller in amplitude than at a cut, but over the duration of the gradual transition the accumulated frame difference becomes quite pronounced. A single threshold therefore obviously cannot cope with the various cases of cuts and gradual transitions. For this reason, Zhang et al. proposed the dual-threshold comparison method (twin comparison) on the basis of the histogram distance [5]. First, two thresholds T_h and T_l are set, used respectively to detect cuts and gradual transitions. The frame difference of each pair of adjacent frames is computed in turn; if the frame difference somewhere exceeds T_h, a cut is considered to occur there. If the frame difference is smaller than T_h but greater than T_l, a gradual transition is considered to start there; the frame differences of subsequent frames continue to be computed and, if still greater than T_l, are accumulated, otherwise no shot transition is considered to have occurred, the start frame is discarded, the accumulated frame difference is cleared, and the judgment starts again from the next frame. When the accumulated frame difference exceeds T_h, the gradual transition is considered to end there; if by the last frame of the video, or when the frame difference falls below T_l, the accumulated frame difference still has not reached T_h, the earlier frame differences greater than T_l are considered to have been caused by other reasons.
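The twin-comparison logic described above can be sketched as follows; the frame-difference function and the two thresholds are left abstract, since the method works with whatever difference metric and threshold values are chosen.

```python
from typing import Callable, List, Sequence, Tuple

def twin_comparison(frames: Sequence, diff: Callable[[object, object], float],
                    t_high: float, t_low: float) -> List[Tuple[str, int]]:
    """Detect cuts and gradual transitions with two thresholds (twin comparison)."""
    events: List[Tuple[str, int]] = []
    acc, start = 0.0, None  # accumulated difference and start of a candidate gradual
    for i in range(1, len(frames)):
        d = diff(frames[i - 1], frames[i])
        if d > t_high:
            events.append(("cut", i))
            acc, start = 0.0, None
        elif d > t_low:
            if start is None:
                start, acc = i, 0.0
            acc += d
            if acc > t_high:                      # accumulated change behaves like a cut
                events.append(("gradual", start))
                acc, start = 0.0, None
        else:
            acc, start = 0.0, None                # abandon the candidate start frame
    return events
```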
The histogram-based and pixel-based algorithms of the prior art have the following problems:
(1) A histogram reflects only the overall distribution of image gray levels or colors and cannot convey the positional information and visual content of the image; two images whose content is completely unrelated may still have the same gray/color distribution. Moreover, two images with the same color distribution may contain the same objects and background but with the objects at different positions; typical examples are the tricolor national flags of France and the Netherlands, or Ireland and Côte d'Ivoire.
(2) A histogram intuitively reflects the overall distribution of the gray levels (gray-level histogram) or colors (color histogram) of an image and is fairly robust to slow object/camera motion within a shot, but its detection performance for rapid object/camera motion and for gradual transitions is still unsatisfactory, easily causing false detection or missed detection of shots.
(3) The various histogram-based measures detect shot boundaries from the overall change of gray level or color between video frames and do not take into account the interference of object/camera motion within a shot. During detection, if object/camera motion inside a shot causes a significant change in the overall gray-level or color distribution of a frame, that intra-shot frame is very likely to be identified as a shot boundary, causing erroneous detection. This problem can be alleviated by dividing each video frame into n × n image blocks, computing the inter-frame gray-level or color histogram difference of the corresponding blocks of adjacent frames, excluding the block with the largest difference, and accumulating the inter-frame histogram differences of the remaining blocks in some way. Compared with the traditional histogram-based method, this improvement detects camera motion within a shot better, but for some gradual-transition special effects, such as fade-in/fade-out, the detection performance is still unsatisfactory. In addition, violent illumination changes (such as flashes) also greatly disturb histogram-based shot detection.
(4) The dual-threshold comparison method fully considers the different characteristics of cuts and gradual transitions and detects them separately, satisfying general shot segmentation requirements. Moreover, a gradual transition is declared only when, on the premise that the frame difference is never below T_l, the accumulated frame difference reaches T_h, so the method has some resistance to burst noise. However, for some gradual-transition processes whose inter-frame changes are inconspicuous, the transition may already be over before the accumulated frame difference reaches T_h, probably causing a missed detection. In addition, if the difference between some pair of consecutive frames during the transition is very small (below T_l), the accumulation is terminated directly, also probably causing a missed detection.
Clustering algorithms are widely applied in the information sciences. Their basic idea is to start from an initial clustering and, according to certain video features and some similarity measure, assign each element of the sample set X = (X_1, X_2, ..., X_n) to the cluster with which it has the highest similarity, until the system or user requirements are finally met.
B. Gunsel, M. R. Naphade, and other scholars successively proposed using the K-means clustering algorithm [22]: according to the gray-level/color histogram difference of adjacent frames, the scenes are divided into two classes, with significant change and without significant change, for shot detection and segmentation. An isolated scene-change position is judged to be a cut, and consecutive scene-change positions are judged to be a gradual transition. The great advantage of K-means clustering for shot detection and segmentation is that no threshold needs to be set, multiple video features can be used simultaneously, and the Euclidean distance of the feature vectors is computed to improve the detection result. The essence of the clustering algorithm is to divide the frame difference values into two classes according to the minimum sum-of-squared-error criterion; its detection result is equivalent to setting a reasonable global threshold for each video segment. The algorithm adapts to each video sequence, but it is rather sensitive to external noise, and if the gradual-transition process is not distinct, it is easy to classify the transition into the class without obvious scene change.
Considering that the boundary between these two classes is fuzzy in actual scenes, Chi-Chun Lo [9] et al. proposed using the fuzzy C-means (FCM) clustering algorithm for shot detection and segmentation: all frame difference values are divided into three classes, Shot Change (SC), Suspected Shot Change (SSC), and No Shot Change (NSC), and the n suspected shot-change elements SSC(j), SSC(j+1), ..., SSC(j+n−1) lying between two adjacent elements SC(i) and SC(i+1) of the shot-change class are analyzed; each image frame in the suspected shot-change class is judged by formula (14) to belong either to the shot-change class or to the no-shot-change class:

H_SSC(k) ≥ param × [0.5 × (H_SC(i) + H_SC(i+1))]   (14)

where H_SC(i) and H_SC(i+1) denote the inter-frame histogram differences of the adjacent SC-class elements SC(i) and SC(i+1), and H_SSC(k) denotes the inter-frame histogram difference of the SSC-class element SSC(k) lying between SC(i) and SC(i+1). This algorithm needs no threshold and, by introducing the suspected shot-change class for further analysis, can classify some borderline frame differences more reasonably.
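The re-classification rule of formula (14) can be sketched as follows; the data layout (plain lists of inter-frame histogram differences) and the default value of param are assumptions made for illustration.

```python
from typing import List

def reclassify_suspected(h_ssc: List[float], h_sc_i: float, h_sc_i1: float,
                         param: float = 1.0) -> List[bool]:
    """Apply formula (14) to every suspected shot-change element between SC(i) and SC(i+1).

    Returns True where an element is promoted to the shot-change class,
    False where it is assigned to the no-shot-change class.
    """
    reference = param * 0.5 * (h_sc_i + h_sc_i1)
    return [h >= reference for h in h_ssc]
```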
To reduce the computational complexity of the fuzzy clustering algorithm, Xinbo Gao et al. also adopted a coarse-to-fine stepwise clustering method. First, coarse clustering is performed between frames spaced l frames apart (l ≥ 2) to obtain the approximate temporal position of an abrupt transition; then fine, frame-by-frame clustering is performed near the possible abrupt transition to determine its exact position.
The fuzzy clustering algorithm proposed by Xinbo Gao et al. [23] can also be used to detect gradual transitions. The algorithm measures the similarity of adjacent frames with a histogram difference metric (HDM) and a spatial difference metric (SDM), and defines all video frames as a point set in the feature space F_D generated by the HDM and SDM values:

F_D = {F_D(t) = (D_S(t), D_H(t)), t = 1, 2, ..., T}   (15)

In this way, the shot detection problem is converted into the problem of dividing the feature space into two subspaces, Significant Change (SC) and No Significant Change (NSC).

When processing a video with the above algorithm, the membership degrees of the current video frame with respect to the SC and NSC subspaces are first computed. If the membership degree of the current frame with respect to the significant-change class is higher, the frame is assigned to that class and represented by the Boolean value 1, otherwise by 0, until all frames of the video have been clustered; the video sequence is thus converted into a binary sequence, for example 1101001011110100101010... Cuts and gradual transitions each have their own characteristic patterns in this binary sequence, so abrupt and gradual shot transitions can be detected separately by pattern analysis of the transformed binary sequence. According to the analysis of Xinbo Gao et al., the binary pattern 010 indicates a cut, and the patterns 011 and 110 indicate a gradual transition.
In addition, the feature values of each video frame can be classified directly: since the low-level features of the frames within a shot have a certain similarity, the shot with the greatest feature similarity can be chosen as the shot to which a frame belongs. At a shot transition, the content change causes the visual or motion features of the frames to change, so the current frame at the transition is assigned to the next shot.
Among unsupervised clustering algorithms, iterative procedures are the most widely used. Their basic idea is to start from some initial clustering (selected in some way or specified manually) and, with a certain similarity measure, assign the elements of the sample set to the known clusters until the predetermined requirements of the system or user are met.
Because there is no supervision from expert prior knowledge, unsupervised clustering is a self-organizing, iterative, dynamic analysis process: as long as the termination condition is not met, it keeps converging according to some similarity computation until the user's or system's requirement on the number of clusters or the cluster density is satisfied. When clustering video frames with an unsupervised algorithm, the similarity measures described above can be adopted, including color histograms, edge change ratios, motion vectors, and so on.
The unsupervised clustering algorithm controls the cluster density with a threshold δ [10]. Taking the first frame f_1 as the initial cluster, it computes, for each subsequent frame f_i, i ∈ [1, N], the similarity S(f_i, C_k) to all previously known cluster centers (intra-shot class centers) C_k, k ∈ [1, M], keeps the maximum value S_max and its index k, compares it with the similarity threshold δ to decide whether the frame belongs to an existing class, and performs dynamic feature clustering on that basis; consecutive frames assigned to the same class form the same shot. If the k-th cluster originally contains N_k frames, then

C_k = (f_i + Σ_{j=1..N_k} f_j) / (N_k + 1),   if S_max ≥ δ
C_{k+1} = f_i,                                if S_max < δ       (16)

where C_k and C_{k+1} are the centers of the k-th and (k+1)-th clusters respectively.
K-means and ISODATA (Iterative Self-Organizing Data Analysis Technique) are two commonly used iterative unsupervised clustering algorithms. The K-means algorithm randomly selects k initial cluster centers and assigns each sample to the nearest cluster center in feature space for dynamic clustering; ISODATA performs repeated self-organizing dynamic analysis of the sample data, and within the allowed range of variation of the relevant parameters the final number of clusters is not fixed.
Unsupervised clustering algorithms reduce the computational complexity to some extent and avoid setting a threshold, but when the content changes greatly within a shot, the frames of one shot may be assigned to different clusters (shots), causing false detection, and the classification result is closely related to the initial centroid (start frame). In addition, because the temporal characteristics of video are not fully considered when the unsupervised clustering algorithm is applied in practice, the frame numbers within a shot may become discontinuous.
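A minimal sketch of the threshold-controlled clustering of formula (16) is given below; frames are represented by feature vectors and the similarity function is left abstract, both assumptions made for illustration.

```python
import numpy as np
from typing import Callable, List

def threshold_clustering(features: List[np.ndarray],
                         similarity: Callable[[np.ndarray, np.ndarray], float],
                         delta: float) -> List[int]:
    """Assign each frame feature to the most similar existing center, or open a new cluster."""
    centers: List[np.ndarray] = [features[0].astype(np.float64)]
    counts: List[int] = [1]
    labels: List[int] = [0]
    for f in features[1:]:
        sims = [similarity(f, c) for c in centers]
        k = int(np.argmax(sims))
        if sims[k] >= delta:                       # S_max >= delta: join cluster k
            centers[k] = (centers[k] * counts[k] + f) / (counts[k] + 1)
            counts[k] += 1
            labels.append(k)
        else:                                      # S_max < delta: start a new cluster
            centers.append(f.astype(np.float64))
            counts.append(1)
            labels.append(len(centers) - 1)
    return labels
```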
Summary of the invention
In view of the problems of existing video detection algorithms, such as false detection of shots and discontinuous frame numbers, the present invention proposes, for the shot detection part, an image detection method based on an improved histogram and an improved clustering algorithm.
The technical scheme by which the present invention solves the above technical problem is: a shot detection method based on an improved histogram and a frame-difference method, comprising the steps of: computing the intersection of the histograms of two adjacent image frames and judging from the histogram similarity whether a shot change occurs; if a shot change occurs, performing a secondary detection of the shot boundary on the two adjacent frames using the inter-frame gray-level/color difference, applying non-uniform block weighting, computing the pixel difference of each block separately, comparing the pixel difference with a preset block frame-difference threshold to obtain a marker variable, taking the weighted sum of the marker variables of all blocks, and comparing the weighted sum with a preset block-partition threshold; and merging any shot with fewer than 20 frames back into the preceding shot.
The histogram similarity of adjacent frames t and t−1 is computed according to the formula

S(t, t−1) = (m_h × S_h(t, t−1) + m_s × S_s(t, t−1) + m_v × S_v(t, t−1)) / 3

where S_h(t, t−1), S_s(t, t−1), and S_v(t, t−1) are the histogram similarities of the H, S, and V components respectively. The similarity of the H components of two adjacent frames is determined according to

S_h(t, t−1) = Σ_{i=1..N} min(h_t(i), h_{t−1}(i)) / max(h_t(i), h_{t−1}(i))

where h_t(i) and h_{t−1}(i) denote the histograms of the H component of frames t and t−1, and N denotes the number of gray or color quantization levels of the image. The weighting coefficients m_h, m_s, m_v of the H, S, and V components may be set to 0.9:0.3:0.1.
The present invention also proposes a shot detection method based on clustering detection. The first frame f_1 of the video sequence is taken as the first shot and as the intra-class center of the first shot, and the Boolean access variable of this shot is set to Shot.access ≡ 1. The next frame f_2 of the video sequence is extracted, the histogram similarities between the video sequence frame and the intra-class center of the current shot on the three components H, S, V are computed respectively, and the total histogram similarity is computed by weighting according to the formula

S(f, Shot) = (m_h × S_H(f, Shot) + m_S × S_S(f, Shot) + m_V × S_V(f, Shot)) / 3

If S(f, Shot) > T, the video sequence frame f is considered to belong to the shot with intra-class center Shot; f is put into Shot, and the intra-class center of the shot is recomputed according to the formula

Shot = (f + Σ_{i=1..Shot.len} f_i) / (Shot.len + 1);   Shot.len = Shot.len + 1

If S(f, Shot) < T, a new shot is created, the video sequence frame f is put into the new shot as the intra-class center of the new shot, the Boolean access variable of the previous shot is set to 0, and the Boolean access variable of the new shot is set to Shot.access ≡ 1, where f_i denotes the frames already inside the shot.
Computing the histogram similarities between the video sequence frame and the intra-class center of the current shot on the three components H, S, V is specifically: projecting the video sequence V = {f_1, f_2, ..., f_n} onto the HSV color space, performing non-uniform quantization of the H, S, and V components and determining the quantization levels, and, according to the histogram components H(i), S(j), V(k), computing the histogram similarities on the three components between the current video sequence frame under test and the intra-class center of the current shot by

S_H(f, Shot) = Σ_{i=1..8} min(H(i), Shot_H(i)) / max(H(i), Shot_H(i))
S_S(f, Shot) = Σ_{j=1..3} min(S(j), Shot_S(j)) / max(S(j), Shot_S(j))
S_V(f, Shot) = Σ_{k=1..3} min(V(k), Shot_V(k)) / max(V(k), Shot_V(k))
The two methods proposed by the present invention have low computational complexity; without significantly increasing the computation and time complexity, they improve the accuracy of shot detection and solve problems such as false detection of shots and discontinuous frame numbers.
Description of the drawings
Fig. 1 is the processing flow of the histogram method of the present invention;
Fig. 2 is the processing flow of the frame-difference method of the present invention;
Fig. 3 is the flow of the clustering algorithm of the present invention.
Embodiment
Histograms can be applied in many ways; the present invention adopts an improved form, the histogram intersection.
Because a histogram cannot convey the positional information and visual content of an image, two images whose content is completely unrelated may still have the same gray/color distribution. Therefore, the present invention improves the histogram with a non-uniform blocking and weighting preprocessing step to highlight the contribution of the core region to the frame difference, while greatly reducing the influence of small-range motion within a shot on shot detection; compared with the traditional color histogram method, the result is closer to human visual cognition. In addition, for the video content, the interference of advertisements or captions at the top or bottom of the video with shot detection is effectively suppressed.
Specifically:
The histogram method is used to detect shots: whether the shot changes is determined from the intersection of the histograms of two adjacent image frames.
(1) Obtain the intersection of the histograms of the two adjacent frames, compute the histogram similarity of the two adjacent frames, and compare the similarity with a threshold to make a preliminary judgment of whether a shot change occurs; if the similarity is below the threshold, a shot change is preliminarily judged to occur. According to the general range established by experiment, the histogram similarity threshold lies in 0.75-0.95, and the best overall result is obtained when the threshold is set to 0.9.
The similarity of the H components of two adjacent frames is determined by:

S_h(t, t−1) = Σ_{i=1..N} min(h_t(i), h_{t−1}(i)) / max(h_t(i), h_{t−1}(i))   (21)

where h_t(i) and h_{t−1}(i) denote the histograms of the H component of frames t and t−1, and N denotes the number of gray or color quantization levels of the image. Similarly, the histogram similarities of the S and V components are:

S_s(t, t−1) = Σ_{i=1..N} min(s_t(i), s_{t−1}(i)) / max(s_t(i), s_{t−1}(i))  and  S_v(t, t−1) = Σ_{i=1..N} min(v_t(i), v_{t−1}(i)) / max(v_t(i), v_{t−1}(i))

where s_t(i), s_{t−1}(i) and v_t(i), v_{t−1}(i) denote the histograms of the S and V components of frames t and t−1 respectively.
In the HSV space, the histogram similarity of frames t and t−1 is determined according to:

S(t, t−1) = (m_h × S_h(t, t−1) + m_s × S_s(t, t−1) + m_v × S_v(t, t−1)) / 3   (22)
The histogram similarity threshold is generally set in the range 0.75-0.95; extensive comparative experiments on collections of image frames show that, within this range, the best overall result is obtained when the threshold is set to 0.9.
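A compact sketch of the weighted HSV histogram similarity of formulas (21) and (22) follows; the frames are assumed to be already converted to HSV with every channel scaled to [0, 1], and the 8/3/3 bin counts follow the non-uniform quantization levels mentioned later in the text.

```python
import numpy as np

# Non-uniform quantization levels assumed from the text: 8 H bins, 3 S bins, 3 V bins.
BINS = {"h": 8, "s": 3, "v": 3}
WEIGHTS = {"h": 0.9, "s": 0.3, "v": 0.1}   # m_h : m_s : m_v = 0.9 : 0.3 : 0.1

def component_similarity(a: np.ndarray, b: np.ndarray, bins: int) -> float:
    """Formula (21): sum over bins of min(h_t, h_{t-1}) / max(h_t, h_{t-1})."""
    ha, _ = np.histogram(a, bins=bins, range=(0.0, 1.0))
    hb, _ = np.histogram(b, bins=bins, range=(0.0, 1.0))
    num = np.minimum(ha, hb).astype(np.float64)
    den = np.maximum(ha, hb).astype(np.float64)
    mask = den > 0                              # skip bins that are empty in both frames
    return float((num[mask] / den[mask]).sum())

def hsv_similarity(frame_t: np.ndarray, frame_t1: np.ndarray) -> float:
    """Formula (22): weighted mean of the H, S, V component similarities.

    Frames are assumed to be float arrays of shape (rows, cols, 3) in HSV order.
    """
    total = 0.0
    for idx, c in enumerate("hsv"):
        total += WEIGHTS[c] * component_similarity(frame_t[..., idx],
                                                   frame_t1[..., idx], BINS[c])
    return total / 3.0
```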
(2) If the similarity is below the threshold, the inter-frame gray-level/color difference is further used to perform a secondary detection of the shot boundary on the two adjacent frames: non-uniform block partitioning is applied (for example into 9 blocks, with the central block having the largest proportion and the weights summing to 1), the pixel difference of each block is computed separately and compared with a preset block frame-difference threshold (value range 10-30) to produce a marker, then the weighted sum of the marker variables of the blocks is taken and compared with a preset block-partition threshold (value range 0.0-0.4) to judge whether a shot change occurs.
The block frame-difference threshold can be obtained as follows. The pixel difference of corresponding blocks of two adjacent frames is:

Fd = (1/(MN)) Σ_{i=1..M} Σ_{j=1..N} |f_n(i, j) − f_{n+1}(i, j)|

where M × N is the size of the block and f_n(i, j), f_{n+1}(i, j) are the color values of the n-th and (n+1)-th frames at point (i, j). The best overall results are obtained when the block frame-difference threshold lies in the range 10-30.
The non-uniform block partitioning is designed mainly to overcome the shortcomings that the histogram method ignores positional information and that the frame-difference method is very sensitive to object/camera motion within a shot, and thereby to improve the recall and precision of shot detection. Extensive experiments show that the best results are obtained when the block-partition threshold lies in the range 0.0-0.4.
Two adjacent frames are extracted from the video, their histogram intersection is computed in the HSV space to obtain their histogram similarity, and the similarity is compared with the preset threshold; if it is below the threshold, a shot change is preliminarily judged. To judge more accurately whether the shot has changed, a further test is made. The inter-frame gray-level/color difference is used for the secondary detection of the shot boundary: the two adjacent frames are extracted from the video and partitioned into non-uniform blocks, then the pixel difference of each corresponding block is computed. If the pixel difference of a block exceeds the block frame-difference threshold, the block is marked 1, otherwise 0. The marker variables are then weighted and summed; if the weighted sum exceeds the block-partition threshold, the shot has changed, otherwise it has not.
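The secondary, block-based check just described can be sketched as follows; the 3×3 layout, the weight matrix favoring the central block, and the two default thresholds are illustrative assumptions consistent with the ranges quoted above.

```python
import numpy as np

# Illustrative non-uniform weights for a 3x3 partition; the central block dominates
# and the weights sum to 1, as suggested in the text.
BLOCK_WEIGHTS = np.array([[0.05, 0.10, 0.05],
                          [0.10, 0.40, 0.10],
                          [0.05, 0.10, 0.05]])

def block_secondary_detection(frame_n: np.ndarray, frame_n1: np.ndarray,
                              block_diff_threshold: float = 20.0,   # range 10-30
                              partition_threshold: float = 0.2      # range 0.0-0.4
                              ) -> bool:
    """Return True when the weighted block markers indicate a shot change."""
    rows, cols = frame_n.shape[:2]
    markers = np.zeros((3, 3))
    r_edges = np.linspace(0, rows, 4, dtype=int)
    c_edges = np.linspace(0, cols, 4, dtype=int)
    for bi in range(3):
        for bj in range(3):
            a = frame_n[r_edges[bi]:r_edges[bi + 1], c_edges[bj]:c_edges[bj + 1]]
            b = frame_n1[r_edges[bi]:r_edges[bi + 1], c_edges[bj]:c_edges[bj + 1]]
            # Fd for the block: mean absolute pixel difference of corresponding pixels
            fd = np.abs(a.astype(np.float64) - b.astype(np.float64)).mean()
            markers[bi, bj] = 1.0 if fd > block_diff_threshold else 0.0
    return float((markers * BLOCK_WEIGHTS).sum()) > partition_threshold
```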
Because in the HSV color space the human eye is most sensitive to the H component, according to the weighting ratio of the H, S, and V components, after the quantization values of H, S, and V are obtained, the coefficient ratio of the H, S, and V components is Q_H : Q_S : Q_V, where Q_H, Q_S, Q_V are the quantization levels of the H, S, and V components respectively; in the present invention the optimal coefficient ratio may be set to 9:3:1.
S_h(t, t−1), S_s(t, t−1), and S_v(t, t−1) are the histogram similarities of the H, S, and V components respectively, and the ratio of the image gray or color quantization levels N used in the H, S, and V component similarities is Q_H : Q_S : Q_V. To better reflect the contribution of the H, S, and V components to the histogram similarity, the weights of the H, S, and V components are set in a certain ratio; for example, the weighting coefficients m_h, m_s, m_v of the three components may be set to 0.9:0.3:0.1.
Based on considerations of human visual perception, the H, S, and V color components are each quantized non-uniformly, and in the corresponding similarity matching each color component is assigned a different weight; the inter-frame histogram difference computed in this way better reflects the degree of difference perceived by human vision and has a certain perceptual uniformity.
(3) Considering situations of strong illumination change, especially flashes, a shot with fewer than 20 frames is merged back into the preceding shot.
To further improve the recall and precision of shot detection, the above method, after detecting shots with the improved histogram method, further filters the detected shots with the frame-difference method, forming an overall shot detection approach that combines the histogram method and the frame-difference method and effectively reduces the missed and false detections that a purely histogram-based method may bring. In addition, violent illumination changes, especially flashes, last only a small number of frames, and because of the persistence-of-vision effect that applies to visual media such as animation and film (24 frames per second), the present invention merges any shot with fewer than 20 frames back into the preceding shot, making the result fit the human visual system.
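The post-processing step that merges short shots can be sketched in a few lines; shots are represented here simply as lists of frame indices, an assumption made for illustration.

```python
from typing import List

def merge_short_shots(shots: List[List[int]], min_len: int = 20) -> List[List[int]]:
    """Merge every shot shorter than min_len frames back into the preceding shot."""
    merged: List[List[int]] = []
    for shot in shots:
        if merged and len(shot) < min_len:
            merged[-1].extend(shot)   # flash-like segments rejoin the previous shot
        else:
            merged.append(list(shot))
    return merged
```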
For the video under test, the improved histogram intersection method is applied in the HSV color space: based on considerations of human visual perception, the H, S, and V color components are each quantized non-uniformly, and in the corresponding similarity matching each color component is assigned a different weight, so that the computed inter-frame histogram difference better reflects the difference perceived by human vision and has a certain perceptual uniformity. After this processing, the post-processing flow of the improved pixel frame-difference method is entered, matching and weighting by non-uniform blocks; this effectively suppresses the interference of advertisements or captions at the top or bottom of the video with shot detection, fully takes into account the positional information of each pixel of the frame, and plays a good complementary role to the improved histogram method.
The present invention can also adopt an improved clustering detection method to detect video shots, judging from the similarity whether the frame under test belongs to the current shot.
Fig. 3 shows the flowchart of the improved clustering algorithm.
When traditional unsupervised clustering algorithms are used for shot detection, because the characteristics of the video data stream are not fully considered, every data object under test (image frame) is compared for similarity with all known cluster centers (intra-shot class centers) and assigned to the most similar cluster (shot). This is very likely to cause false shot detection and discontinuous frame numbers within a shot, and the time and computational complexity are also large. In view of this, considering the temporal characteristics of the video stream, each video frame is compared for clustering only with the current shot whose clustering is not yet complete; shots whose segmentation is complete no longer take part in subsequent clustering (only by first judging whether the shot has changed, and whether a completely segmented shot is followed by a new shot, can the video be cut exactly at the shot boundaries to form individual shots and guarantee the accuracy of key-frame extraction). For this purpose a Boolean access variable access is introduced: when access ≡ 0 for a shot, the shot has been completely segmented; otherwise, the shot is the one currently being compared in the clustering. In addition, because the clustering algorithm also uses histograms in the HSV space, the HSV histogram weighting must also be considered when computing the similarity between the frame under test and the current shot. The video sequence V = {f_1, f_2, ..., f_n} is projected onto the HSV color space, the H, S, and V components are quantized non-uniformly, and the histogram components H(i), S(j), V(k) are computed, where i ∈ [1, 8], j ∈ [1, 3], k ∈ [1, 3] index the quantization levels of the H, S, and V components respectively.
Then, using the histogram intersection algorithm, the histogram similarities on the three components between the current video sequence frame under test and the intra-class center of the current shot are computed:

S_H(f, Shot) = Σ_{i=1..8} min(H(i), Shot_H(i)) / max(H(i), Shot_H(i))
S_S(f, Shot) = Σ_{j=1..3} min(S(j), Shot_S(j)) / max(S(j), Shot_S(j))
S_V(f, Shot) = Σ_{k=1..3} min(V(k), Shot_V(k)) / max(V(k), Shot_V(k))   (23)
Specifically, the following procedure can be adopted:
(1) The first frame f_1 of the video sequence is regarded as the first shot, f_1 is also the intra-class center of the shot, and the Boolean access variable of this shot is set to Shot.access ≡ 1.
(2) The next frame f_2 of the video sequence is extracted, and after the histogram similarities on the H, S, and V components between the current video sequence frame and the intra-class center of the shot are computed respectively, the total histogram similarity is computed by weighting according to formula (24):

S(f_i, Shot) = (m_h × S_H(f_i, Shot) + m_S × S_S(f_i, Shot) + m_V × S_V(f_i, Shot)) / 3   (24)

where m_h, m_s, m_v are the weighting coefficients of the H, S, and V components respectively.
In general, because vision is most sensitive to the H component, m_h ≥ m_s and m_h ≥ m_v. Consistent with the quantization weighting ratio of the HSV color space, and in order to reflect the contribution of the S and V components to the similarity, the weighting coefficients may be assigned the values 0.9, 0.3, and 0.1 respectively; the shot currently in the clustering must satisfy Shot.access ≡ 1.
(3) If S(f, Shot) > T, the video sequence frame f is considered to belong to the shot Shot. f is put into Shot, and the intra-class center of Shot is recomputed as:

Shot = (f + Σ_{i=1..Shot.len} f_i) / (Shot.len + 1);   Shot.len = Shot.len + 1   (25)

where f_i denotes the frames already inside the shot.
Otherwise, if S(f, Shot) < T, f is considered not to belong to Shot: a new shot is created, f is put into the new shot and also serves as its intra-class center, the number of clusters is increased by 1, the access variable of the previous shot is set to 0, and the new shot is given Shot.access ≡ 1.
Here, Shot is the intra-class center of the shot, f is the current frame, f_i denotes the frames already inside the shot, T is the shot similarity threshold, and Shot.len is the number of frames in the cluster.
(4) If the video has not been completely processed, return to step (2); otherwise the algorithm ends.
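Putting steps (1) to (4) together, a sketch of the improved clustering loop is given below; frame features are abstracted as 8/3/3-bin HSV histograms, the intra-class center is maintained as the mean histogram of the frames in the open shot, and the similarity threshold T is left to the caller; these are assumptions made for illustration.

```python
import numpy as np
from typing import List

def hsv_histograms(frame_hsv: np.ndarray):
    """8/3/3-bin histograms of the H, S, V channels (channels assumed scaled to [0, 1])."""
    return [np.histogram(frame_hsv[..., c], bins=b, range=(0.0, 1.0))[0].astype(np.float64)
            for c, b in enumerate((8, 3, 3))]

def weighted_similarity(hists_f, hists_shot, weights=(0.9, 0.3, 0.1)) -> float:
    """Formulas (23)-(24): per-channel histogram intersection, then weighted mean over 3."""
    total = 0.0
    for w, hf, hs in zip(weights, hists_f, hists_shot):
        den = np.maximum(hf, hs)
        mask = den > 0
        total += w * float((np.minimum(hf, hs)[mask] / den[mask]).sum())
    return total / 3.0

def cluster_shots(frames_hsv: List[np.ndarray], T: float) -> List[List[int]]:
    """Sequentially cluster frames into shots, comparing only with the open (access=1) shot."""
    shots: List[List[int]] = [[0]]
    center = hsv_histograms(frames_hsv[0])          # intra-class center of the open shot
    for idx in range(1, len(frames_hsv)):
        hists = hsv_histograms(frames_hsv[idx])
        if weighted_similarity(hists, center) > T:  # frame joins the current shot
            n = len(shots[-1])
            center = [(c * n + h) / (n + 1) for c, h in zip(center, hists)]  # formula (25)
            shots[-1].append(idx)
        else:                                       # close the shot, open a new one
            shots.append([idx])
            center = hists
    return shots
```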
In selecting test samples, considering the ubiquity and popularity of the chosen videos, five types of video were selected, including animation (Beelzebub ED), advertisement (innisfree cm), news (Cctv_news), TV guide (Anime 10th anniversary), and music video (Taiyou no Uta_clip), and recall and precision were used to measure the detection performance of the shot detection algorithms.
Recall:  R = N_c / (N_c + N_m) × 100%   (26)

Precision:  P = N_c / (N_c + N_f) × 100%   (27)

where N_c, N_m, and N_f are respectively the number of correctly detected, missed, and falsely detected shots.
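A trivial helper for formulas (26) and (27), included only to make the evaluation measures explicit:

```python
def recall(n_correct: int, n_missed: int) -> float:
    """Formula (26): R = Nc / (Nc + Nm) * 100%."""
    return 100.0 * n_correct / (n_correct + n_missed)

def precision(n_correct: int, n_false: int) -> float:
    """Formula (27): P = Nc / (Nc + Nf) * 100%."""
    return 100.0 * n_correct / (n_correct + n_false)
```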
The intersection of the histograms of two frames is computed by the minimum-function method to measure their similarity, and is compared with a preset threshold T to judge whether a scene switch exists. The histogram similarity of two adjacent frames is defined as:

Sim = (1/(xy)) Σ_{v=0..V} min(H(I_t, v), H(I_{t−1}, v)) / max(H(I_t, v), H(I_{t−1}, v))   (28)
Considering that the traditional frame-difference method is very sensitive to object/camera motion in the video and therefore easily causes erroneous detection, the frame-difference method of the present invention incorporates the idea of non-uniform block partitioning: the pixel difference of each block is computed point by point and compared with the preset block frame-difference threshold to produce a marker, then the marker variables of the blocks are weighted and summed and compared with the preset block-partition threshold to judge whether a cut exists. The frame difference of corresponding blocks of two adjacent frames is defined as:

Fd = (1/(MN)) Σ_{i=1..M} Σ_{j=1..N} |f_n(i, j) − f_{n+1}(i, j)|   (29)
To quantitatively evaluate the shot segmentation algorithm of the present invention in comparison with the histogram method and the frame-difference method, the algorithms proposed by the present invention were tested separately; the experimental results are shown in Table 1.
Table 1 shot detection result
As can be seen from Table 1, the shot detection precision obtained by the overall approach is higher than that of the two classic methods, but the shot recall is constrained by the results obtained with each of those two methods. Taking the music video "Taiyou no Uta_clip" in the last row of the table as an example, because it contains many fast cuts, gradual transitions, subject motion within shots, and illumination changes within some shots (assuming that the frames before and after a gradual transition and the frames during the transition belong to different shots), each of the applied methods exhibits some missed detections.
The two algorithms proposed by the present invention have relatively low computational complexity; without significantly increasing the computation and time complexity, they improve the accuracy of shot detection.

Claims (6)

1. A shot detection method based on an improved histogram and a frame-difference method, characterized in that: the intersection of the histograms of two adjacent image frames is computed to obtain the histogram similarity, and whether the shot changes is preliminarily judged from the histogram similarity; the inter-frame gray-level/color difference is used for a secondary detection of the shot boundary: the two adjacent frames are extracted from the video and partitioned into non-uniform blocks, the pixel difference of each corresponding block is computed and compared with a preset block frame-difference threshold to obtain a marker variable, the marker variables of the blocks are weighted and summed, and the weighted sum is compared with a preset block-partition threshold; if it is greater than the block-partition threshold, the shot has changed; and a shot with fewer than 20 frames is merged back into the preceding shot.
2. The method according to claim 1, characterized in that obtaining the histogram similarity specifically comprises: computing the histogram similarity of adjacent frames t and t−1 according to the formula

S(t, t−1) = (m_h × S_h(t, t−1) + m_s × S_s(t, t−1) + m_v × S_v(t, t−1)) / 3

and comparing the histogram similarity of the two adjacent frames with a threshold to judge whether a shot change occurs, wherein S_h(t, t−1), S_s(t, t−1), and S_v(t, t−1) are the histogram similarities of the H, S, and V components respectively, determined according to the formulas

S_h(t, t−1) = Σ_{i=1..N} min(h_t(i), h_{t−1}(i)) / max(h_t(i), h_{t−1}(i)),
S_s(t, t−1) = Σ_{i=1..N} min(s_t(i), s_{t−1}(i)) / max(s_t(i), s_{t−1}(i)),
S_v(t, t−1) = Σ_{i=1..N} min(v_t(i), v_{t−1}(i)) / max(v_t(i), v_{t−1}(i)),

wherein h_t(i) and h_{t−1}(i) denote the histograms of the H component of frames t and t−1, and N denotes the number of gray or color quantization levels of the image.
3. The method according to claim 1, characterized in that obtaining the histogram similarity specifically comprises: calculating the histogram similarity, on the H, S and V components, between a video sequence frame and the cluster center of the current shot, specifically: projecting the video sequence V = {f_1, f_2, ..., f_n} onto the HSV color space, performing non-uniform quantization of the H, S and V components and determining the quantization levels, and, from the H, S and V components H(i), S(j), V(k) of the histogram, invoking the formulas
S_H(f, Shot) = \sum_{i=1}^{8} \min(H(i), Shot_H(i)) / \max(H(i), Shot_H(i)), S_S(f, Shot) = \sum_{j=1}^{3} \min(S(j), Shot_S(j)) / \max(S(j), Shot_S(j)), S_V(f, Shot) = \sum_{k=1}^{3} \min(V(k), Shot_V(k)) / \max(V(k), Shot_V(k)) to respectively calculate the histogram similarity, on the three components, between the video sequence frame and the cluster center of the current shot, wherein Shot denotes the shot cluster center.
4. The method according to claim 2, characterized in that judging whether a shot change occurs according to the histogram similarity further comprises: comparing the histogram similarity with a set threshold, and preliminarily judging that a shot change occurs when the similarity is less than the set threshold.
5. The method according to claim 3, characterized in that judging whether a shot change occurs according to the histogram similarity further comprises: taking the first frame f_1 of the video sequence as the first shot and as the cluster center of the first shot, and setting the Boolean access variable of this shot to Shot.access = 1; calculating the total weighted histogram similarity according to the formula S(f_i, Shot) = [m_h \times S_H(f_i, Shot) + m_S \times S_S(f_i, Shot) + m_V \times S_V(f_i, Shot)] / 3; if S(f_i, Shot) > T, the video sequence frame f_i is considered to belong to the shot whose cluster center is Shot, f_i is put into Shot, and the shot cluster center is recalculated according to the formula [given only as an image in the original publication] together with Shot.len = Shot.len + 1; if S(f_i, Shot) < T, a shot change occurs, a new shot is established, the video sequence frame f_i is put into the new shot as the cluster center of the new shot, the Boolean access variable of the previous shot is set to 0, and the Boolean access variable of the new shot is set to Shot.access = 1.
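A hedged end-to-end sketch of the clustering in claim 5, assuming each shot cluster center is represented by per-channel H/S/V histograms with the 8/3/3 quantization of claim 3 and is updated as a running mean of its member frames' histograms. The running-mean update stands in for the center-update formula that appears only as an image in the original publication, and the OpenCV calls, the threshold value T and the function names are assumptions for illustration.

```python
import cv2
import numpy as np

def hsv_hists(frame_bgr, bins=(8, 3, 3)):
    """Per-channel H, S, V histograms with the 8/3/3 non-uniform quantization of claim 3."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    ranges = ([0, 180], [0, 256], [0, 256])          # OpenCV's H channel spans 0..179
    return [cv2.calcHist([hsv], [c], None, [bins[c]], ranges[c]).ravel() for c in range(3)]

def sim(h_a, h_b, eps=1e-10):
    """Sum over bins of min/max of two histograms (the S_H / S_S / S_V form)."""
    return float(np.sum(np.minimum(h_a, h_b) / (np.maximum(h_a, h_b) + eps)))

def cluster_shots(frames, T=2.0, weights=(0.9, 0.3, 0.1)):
    """Online shot clustering: assign each frame to the current shot center or open a new shot."""
    shots = []       # each shot: {"frames": [...], "center": [H, S, V hists], "len", "access"}
    current = None
    for idx, frame in enumerate(frames):
        hists = hsv_hists(frame)
        if current is None:                          # first frame founds the first shot
            current = {"frames": [idx], "center": hists, "len": 1, "access": 1}
            shots.append(current)
            continue
        m_h, m_s, m_v = weights
        s_h, s_s, s_v = (sim(a, b) for a, b in zip(hists, current["center"]))
        s_total = (m_h * s_h + m_s * s_s + m_v * s_v) / 3.0
        if s_total > T:                              # frame belongs to the current shot cluster
            current["frames"].append(idx)
            n = current["len"]
            # Running-mean center update (assumption; the original formula is only an image).
            current["center"] = [(c * n + h) / (n + 1) for c, h in zip(current["center"], hists)]
            current["len"] = n + 1
        else:                                        # shot change: close the old shot, open a new one
            current["access"] = 0
            current = {"frames": [idx], "center": hists, "len": 1, "access": 1}
            shots.append(current)
    return shots
```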
6. The method according to claim 2 or claim 5, characterized in that the weighting coefficients m_h, m_s, m_v of the H, S and V components are set in the ratio 0.9:0.3:0.1.
CN201310379940.1A 2013-08-27 2013-08-27 Based on the shot detection method improving rectangular histogram and clustering algorithm Active CN103426176B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310379940.1A CN103426176B (en) 2013-08-27 2013-08-27 Based on the shot detection method improving rectangular histogram and clustering algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310379940.1A CN103426176B (en) 2013-08-27 2013-08-27 Based on the shot detection method improving rectangular histogram and clustering algorithm

Publications (2)

Publication Number Publication Date
CN103426176A true CN103426176A (en) 2013-12-04
CN103426176B CN103426176B (en) 2017-03-01

Family

ID=49650866

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310379940.1A Active CN103426176B (en) 2013-08-27 2013-08-27 Based on the shot detection method improving rectangular histogram and clustering algorithm

Country Status (1)

Country Link
CN (1) CN103426176B (en)

Cited By (47)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104090951A (en) * 2014-07-04 2014-10-08 李阳 Abnormal data processing method
CN104243769A (en) * 2014-09-12 2014-12-24 刘鹏 Video scene change detection method based on self-adaptation threshold value
CN104410867A (en) * 2014-11-17 2015-03-11 北京京东尚科信息技术有限公司 Improved video shot detection method
CN104469545A (en) * 2014-12-22 2015-03-25 无锡天脉聚源传媒科技有限公司 Method and device for verifying splitting effect of video clip
CN104469546A (en) * 2014-12-22 2015-03-25 无锡天脉聚源传媒科技有限公司 Video clip processing method and device
CN104539942A (en) * 2014-12-26 2015-04-22 赞奇科技发展有限公司 Video shot switching detection method and device based on frame difference cluster
WO2015131772A1 (en) * 2014-03-04 2015-09-11 Tencent Technology (Shenzhen) Company Limited Method and apparatus for dividing image area
CN104994366A (en) * 2015-06-02 2015-10-21 陕西科技大学 FCM video key frame extracting method based on feature weighing
CN105915758A (en) * 2016-04-08 2016-08-31 绍兴文理学院元培学院 Video searching method
CN106131434A (en) * 2016-08-18 2016-11-16 深圳市金立通信设备有限公司 A kind of image pickup method based on multi-camera system and terminal
CN106162158A (en) * 2015-04-02 2016-11-23 无锡天脉聚源传媒科技有限公司 A kind of method and device identifying lens shooting mode
CN106412619A (en) * 2016-09-28 2017-02-15 江苏亿通高科技股份有限公司 HSV color histogram and DCT perceptual hash based lens boundary detection method
CN106603886A (en) * 2016-12-13 2017-04-26 Tcl集团股份有限公司 Video scene distinguishing method and system
CN106898036A (en) * 2017-02-28 2017-06-27 宇龙计算机通信科技(深圳)有限公司 Image processing method and mobile terminal
CN106960211A (en) * 2016-01-11 2017-07-18 北京陌上花科技有限公司 Key frame acquisition methods and device
CN107408264A (en) * 2014-12-30 2017-11-28 电子湾有限公司 Similar Articles detecting
CN107424163A (en) * 2017-06-09 2017-12-01 广东技术师范学院 A kind of lens boundary detection method based on TextTiling
CN107798304A (en) * 2017-10-20 2018-03-13 央视国际网络无锡有限公司 A kind of method of fast video examination & verification
CN108320294A (en) * 2018-01-29 2018-07-24 袁非牛 A kind of full-automatic replacement method of portrait background intelligent of China second-generation identity card photo
CN108769458A (en) * 2018-05-08 2018-11-06 东北师范大学 A kind of deep video scene analysis method
CN108777755A (en) * 2018-04-18 2018-11-09 上海电力学院 A kind of switching detection method of video scene
CN104021544B (en) * 2014-05-07 2018-11-23 中国农业大学 A kind of greenhouse vegetable disease monitor video extraction method of key frame, that is, extraction system
CN108984648A (en) * 2018-06-27 2018-12-11 武汉大学深圳研究院 The retrieval of the main eigen and animated video of digital cartoon and altering detecting method
CN109036479A (en) * 2018-08-01 2018-12-18 曹清 Clip point judges system and clip point judgment method
CN109344780A (en) * 2018-10-11 2019-02-15 上海极链网络科技有限公司 A kind of multi-modal video scene dividing method based on sound and vision
CN109783684A (en) * 2019-01-25 2019-05-21 科大讯飞股份有限公司 A kind of emotion identification method of video, device, equipment and readable storage medium storing program for executing
CN109964221A (en) * 2016-11-30 2019-07-02 谷歌有限责任公司 The similitude between video is determined using shot durations correlation
WO2019127504A1 (en) * 2017-12-29 2019-07-04 深圳配天智能技术研究院有限公司 Similarity measurement method and device, and storage device
CN110012350A (en) * 2019-03-25 2019-07-12 联想(北京)有限公司 A kind of method for processing video frequency and device, equipment, storage medium
CN110096945A (en) * 2019-02-28 2019-08-06 中国地质大学(武汉) Indoor Video key frame of video real time extracting method based on machine learning
CN110135428A (en) * 2019-04-11 2019-08-16 北京航空航天大学 Image segmentation processing method and device
CN110188625A (en) * 2019-05-13 2019-08-30 浙江大学 A kind of video fine structure method based on multi-feature fusion
CN110210379A (en) * 2019-05-30 2019-09-06 北京工业大学 A kind of lens boundary detection method of combination critical movements feature and color characteristic
CN110248182A (en) * 2019-05-31 2019-09-17 成都东方盛行电子有限责任公司 A kind of scene segment lens detection method
CN110430443A (en) * 2019-07-11 2019-11-08 平安科技(深圳)有限公司 The method, apparatus and computer equipment of video lens shearing
CN110708606A (en) * 2019-09-29 2020-01-17 新华智云科技有限公司 Method for intelligently editing video
CN110781710A (en) * 2018-12-17 2020-02-11 北京嘀嘀无限科技发展有限公司 Target object clustering method and device
CN111292267A (en) * 2020-02-04 2020-06-16 北京锐影医疗技术有限公司 Image subjective visual effect enhancement method based on Laplacian pyramid
CN111563937A (en) * 2020-07-14 2020-08-21 成都四方伟业软件股份有限公司 Picture color extraction method and device
CN111641869A (en) * 2020-06-04 2020-09-08 虎博网络技术(上海)有限公司 Video split mirror method, video split mirror device, electronic equipment and computer readable storage medium
CN112488107A (en) * 2020-12-04 2021-03-12 北京华录新媒信息技术有限公司 Video subtitle processing method and processing device
CN112579823A (en) * 2020-12-28 2021-03-30 山东师范大学 Video abstract generation method and system based on feature fusion and incremental sliding window
CN113379693A (en) * 2021-06-01 2021-09-10 大连东软教育科技集团有限公司 Capsule endoscopy key focus image detection method based on video abstraction technology
CN113591588A (en) * 2021-07-02 2021-11-02 四川大学 Video content key frame extraction method based on bidirectional space-time slice clustering
CN114241367A (en) * 2021-12-02 2022-03-25 北京智美互联科技有限公司 Visual semantic detection method and system
CN117456204A (en) * 2023-09-25 2024-01-26 珠海视熙科技有限公司 Target tracking method, device, video processing system, storage medium and terminal
US11948359B2 (en) 2021-01-27 2024-04-02 Boe Technology Group Co., Ltd. Video processing method and apparatus, computing device and medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102915536A (en) * 2012-08-29 2013-02-06 太原理工大学 Domain histogram lens mutation detection calculating method
CN103093458A (en) * 2012-12-31 2013-05-08 清华大学 Detecting method and detecting device for key frame

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102915536A (en) * 2012-08-29 2013-02-06 太原理工大学 Domain histogram lens mutation detection calculating method
CN103093458A (en) * 2012-12-31 2013-05-08 清华大学 Detecting method and detecting device for key frame

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Pan Lei et al.: "Video shot segmentation and key frame extraction based on clustering", Infrared and Laser Engineering *
Qu Zhong et al.: "Research on an improved video key frame extraction algorithm", Computer Science *
Qu Zhong et al.: "Research on an improved video key frame extraction algorithm", Computer Science, vol. 39, no. 8, 5 December 2012 (2012-12-05), pages 300-303 *

Cited By (65)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015131772A1 (en) * 2014-03-04 2015-09-11 Tencent Technology (Shenzhen) Company Limited Method and apparatus for dividing image area
US9852510B2 (en) 2014-03-04 2017-12-26 Tencent Technology (Shenzhen) Company Limited Method and apparatus for dividing image area
CN104021544B (en) * 2014-05-07 2018-11-23 中国农业大学 A kind of greenhouse vegetable disease monitor video extraction method of key frame, that is, extraction system
CN104090951A (en) * 2014-07-04 2014-10-08 李阳 Abnormal data processing method
CN104243769A (en) * 2014-09-12 2014-12-24 刘鹏 Video scene change detection method based on self-adaptation threshold value
CN104410867A (en) * 2014-11-17 2015-03-11 北京京东尚科信息技术有限公司 Improved video shot detection method
CN104469546A (en) * 2014-12-22 2015-03-25 无锡天脉聚源传媒科技有限公司 Video clip processing method and device
CN104469546B (en) * 2014-12-22 2017-09-15 无锡天脉聚源传媒科技有限公司 A kind of method and apparatus for handling video segment
CN104469545A (en) * 2014-12-22 2015-03-25 无锡天脉聚源传媒科技有限公司 Method and device for verifying splitting effect of video clip
CN104469545B (en) * 2014-12-22 2017-09-15 无锡天脉聚源传媒科技有限公司 A kind of method and apparatus for examining video segment cutting effect
CN104539942A (en) * 2014-12-26 2015-04-22 赞奇科技发展有限公司 Video shot switching detection method and device based on frame difference cluster
CN107408264A (en) * 2014-12-30 2017-11-28 电子湾有限公司 Similar Articles detecting
CN106162158A (en) * 2015-04-02 2016-11-23 无锡天脉聚源传媒科技有限公司 A kind of method and device identifying lens shooting mode
CN104994366A (en) * 2015-06-02 2015-10-21 陕西科技大学 FCM video key frame extracting method based on feature weighing
CN106960211B (en) * 2016-01-11 2020-04-14 北京陌上花科技有限公司 Key frame acquisition method and device
CN106960211A (en) * 2016-01-11 2017-07-18 北京陌上花科技有限公司 Key frame acquisition methods and device
CN105915758A (en) * 2016-04-08 2016-08-31 绍兴文理学院元培学院 Video searching method
CN105915758B (en) * 2016-04-08 2019-01-08 绍兴文理学院元培学院 A kind of video retrieval method
CN106131434A (en) * 2016-08-18 2016-11-16 深圳市金立通信设备有限公司 A kind of image pickup method based on multi-camera system and terminal
CN106412619A (en) * 2016-09-28 2017-02-15 江苏亿通高科技股份有限公司 HSV color histogram and DCT perceptual hash based lens boundary detection method
CN109964221A (en) * 2016-11-30 2019-07-02 谷歌有限责任公司 The similitude between video is determined using shot durations correlation
CN109964221B (en) * 2016-11-30 2023-09-12 谷歌有限责任公司 Determining similarity between videos using shot duration correlation
CN106603886B (en) * 2016-12-13 2020-08-18 Tcl科技集团股份有限公司 Video scene distinguishing method and system
CN106603886A (en) * 2016-12-13 2017-04-26 Tcl集团股份有限公司 Video scene distinguishing method and system
CN106898036A (en) * 2017-02-28 2017-06-27 宇龙计算机通信科技(深圳)有限公司 Image processing method and mobile terminal
CN107424163A (en) * 2017-06-09 2017-12-01 广东技术师范学院 A kind of lens boundary detection method based on TextTiling
CN107798304A (en) * 2017-10-20 2018-03-13 央视国际网络无锡有限公司 A kind of method of fast video examination & verification
CN107798304B (en) * 2017-10-20 2021-11-02 央视国际网络无锡有限公司 Method for rapidly auditing video
WO2019127504A1 (en) * 2017-12-29 2019-07-04 深圳配天智能技术研究院有限公司 Similarity measurement method and device, and storage device
CN108320294B (en) * 2018-01-29 2021-11-05 袁非牛 Intelligent full-automatic portrait background replacement method for second-generation identity card photos
CN108320294A (en) * 2018-01-29 2018-07-24 袁非牛 A kind of full-automatic replacement method of portrait background intelligent of China second-generation identity card photo
CN108777755A (en) * 2018-04-18 2018-11-09 上海电力学院 A kind of switching detection method of video scene
CN108769458A (en) * 2018-05-08 2018-11-06 东北师范大学 A kind of deep video scene analysis method
CN108984648A (en) * 2018-06-27 2018-12-11 武汉大学深圳研究院 The retrieval of the main eigen and animated video of digital cartoon and altering detecting method
CN109036479A (en) * 2018-08-01 2018-12-18 曹清 Clip point judges system and clip point judgment method
CN109344780A (en) * 2018-10-11 2019-02-15 上海极链网络科技有限公司 A kind of multi-modal video scene dividing method based on sound and vision
CN110781710A (en) * 2018-12-17 2020-02-11 北京嘀嘀无限科技发展有限公司 Target object clustering method and device
CN109783684A (en) * 2019-01-25 2019-05-21 科大讯飞股份有限公司 A kind of emotion identification method of video, device, equipment and readable storage medium storing program for executing
CN109783684B (en) * 2019-01-25 2021-07-06 科大讯飞股份有限公司 Video emotion recognition method, device and equipment and readable storage medium
CN110096945A (en) * 2019-02-28 2019-08-06 中国地质大学(武汉) Indoor Video key frame of video real time extracting method based on machine learning
CN110096945B (en) * 2019-02-28 2021-05-14 中国地质大学(武汉) Indoor monitoring video key frame real-time extraction method based on machine learning
CN110012350B (en) * 2019-03-25 2021-05-18 联想(北京)有限公司 Video processing method and device, video processing equipment and storage medium
CN110012350A (en) * 2019-03-25 2019-07-12 联想(北京)有限公司 A kind of method for processing video frequency and device, equipment, storage medium
CN110135428A (en) * 2019-04-11 2019-08-16 北京航空航天大学 Image segmentation processing method and device
CN110135428B (en) * 2019-04-11 2021-06-04 北京航空航天大学 Image segmentation processing method and device
CN110188625A (en) * 2019-05-13 2019-08-30 浙江大学 A kind of video fine structure method based on multi-feature fusion
CN110188625B (en) * 2019-05-13 2021-07-02 浙江大学 Video fine structuring method based on multi-feature fusion
CN110210379A (en) * 2019-05-30 2019-09-06 北京工业大学 A kind of lens boundary detection method of combination critical movements feature and color characteristic
CN110248182A (en) * 2019-05-31 2019-09-17 成都东方盛行电子有限责任公司 A kind of scene segment lens detection method
CN110430443B (en) * 2019-07-11 2022-01-25 平安科技(深圳)有限公司 Method and device for cutting video shot, computer equipment and storage medium
CN110430443A (en) * 2019-07-11 2019-11-08 平安科技(深圳)有限公司 The method, apparatus and computer equipment of video lens shearing
CN110708606A (en) * 2019-09-29 2020-01-17 新华智云科技有限公司 Method for intelligently editing video
CN111292267A (en) * 2020-02-04 2020-06-16 北京锐影医疗技术有限公司 Image subjective visual effect enhancement method based on Laplacian pyramid
CN111641869A (en) * 2020-06-04 2020-09-08 虎博网络技术(上海)有限公司 Video split mirror method, video split mirror device, electronic equipment and computer readable storage medium
CN111641869B (en) * 2020-06-04 2022-01-04 虎博网络技术(上海)有限公司 Video split mirror method, video split mirror device, electronic equipment and computer readable storage medium
CN111563937A (en) * 2020-07-14 2020-08-21 成都四方伟业软件股份有限公司 Picture color extraction method and device
CN112488107A (en) * 2020-12-04 2021-03-12 北京华录新媒信息技术有限公司 Video subtitle processing method and processing device
CN112579823B (en) * 2020-12-28 2022-06-24 山东师范大学 Video abstract generation method and system based on feature fusion and incremental sliding window
CN112579823A (en) * 2020-12-28 2021-03-30 山东师范大学 Video abstract generation method and system based on feature fusion and incremental sliding window
US11948359B2 (en) 2021-01-27 2024-04-02 Boe Technology Group Co., Ltd. Video processing method and apparatus, computing device and medium
CN113379693A (en) * 2021-06-01 2021-09-10 大连东软教育科技集团有限公司 Capsule endoscopy key focus image detection method based on video abstraction technology
CN113379693B (en) * 2021-06-01 2024-02-06 东软教育科技集团有限公司 Capsule endoscope key focus image detection method based on video abstraction technology
CN113591588A (en) * 2021-07-02 2021-11-02 四川大学 Video content key frame extraction method based on bidirectional space-time slice clustering
CN114241367A (en) * 2021-12-02 2022-03-25 北京智美互联科技有限公司 Visual semantic detection method and system
CN117456204A (en) * 2023-09-25 2024-01-26 珠海视熙科技有限公司 Target tracking method, device, video processing system, storage medium and terminal

Also Published As

Publication number Publication date
CN103426176B (en) 2017-03-01

Similar Documents

Publication Publication Date Title
CN103426176A (en) Video shot detection method based on histogram improvement and clustering algorithm
CN107944359B (en) Flame detecting method based on video
CN104598924A (en) Target matching detection method
CN108921130A (en) Video key frame extracting method based on salient region
NO329897B1 (en) Procedure for faster face detection
Prema et al. Survey on skin tone detection using color spaces
CN109271932A (en) Pedestrian based on color-match recognition methods again
Wang et al. Real-time smoke detection using texture and color features
CN114973112B (en) Scale self-adaptive dense crowd counting method based on countermeasure learning network
US9286690B2 (en) Method and apparatus for moving object detection using fisher&#39;s linear discriminant based radial basis function network
CN107045630B (en) RGBD-based pedestrian detection and identity recognition method and system
Ghazali et al. Pedestrian detection in infrared outdoor images based on atmospheric situation estimation
Ouyang et al. The comparison and analysis of extracting video key frame
CN107341456B (en) Weather sunny and cloudy classification method based on single outdoor color image
Bales et al. Bigbackground-based illumination compensation for surveillance video
CN110765982A (en) Video smoke detection method based on change accumulation graph and cascaded depth network
CN102163279B (en) Color human face identification method based on nearest feature classifier
CN115393788A (en) Multi-scale monitoring pedestrian re-identification method based on global information attention enhancement
Zhang et al. Shot boundary detection based on HSV color model
Gao et al. Spatio-temporal salience based video quality assessment
Sabeti et al. High-speed skin color segmentation for real-time human tracking
Yilmaz et al. Shot detection using principal coordinate system
Ghomsheh et al. A new skin detection approach for adult image identification
Vijaylaxmi et al. Fire detection using YCbCr color model
CN105023001A (en) Selective region-based multi-pedestrian detection method and system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant