CN104318208A - Video scene detection method based on graph partitioning and instance learning - Google Patents

Video scene detection method based on graph partitioning and instance learning

Info

Publication number
CN104318208A
Authority
CN
China
Prior art keywords
subgraph
shot
video
key frame
sift
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410525867.9A
Other languages
Chinese (zh)
Inventor
檀结庆
白天
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei University of Technology filed Critical Hefei University of Technology
Priority to CN201410525867.9A priority Critical patent/CN104318208A/en
Publication of CN104318208A publication Critical patent/CN104318208A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/49 Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/20 Scenes; Scene-specific elements in augmented reality scenes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a video scene detection method based on graph partitioning and instance learning. Compared with the prior art, it overcomes two defects: over-segmentation of scene boundaries and a time complexity too high for practical application. The method comprises the following steps: shot segmentation, in which a video sequence is input and a shot-segmentation method detects all shots in the given sequence; extraction of the shots' visual-similarity features, in which key frames are extracted from each shot and HSV and SIFT features are jointly used to construct the shot's visual-similarity feature; construction and partitioning of a directed temporal graph, in which a finite temporal graph describing the whole video is constructed and partitioned into several subgraphs; and instance-learning-based scene detection, in which some subgraphs are identified as training examples (TEs), the remaining unidentified subgraphs are assigned to the identified ones by instance learning, and all subgraphs finally output are the detected scenes. Over-segmentation is prevented and computational complexity is reduced.

Description

Video scene detection method based on graph partitioning and instance learning
Technical field
The present invention relates to the technical field of video processing, and in particular to a video scene detection method based on graph partitioning and instance learning.
Background art
Structured video-data analysis has in recent years been widely applied to digital-video analysis and processing. A film or television video is usually composed of thousands of shots, but a single shot carries little information, so similar shots need to be grouped into scenes. A video scene consists of multiple consecutive, semantically related shots, and the shots composing a scene express the same content.
Graph-based description methods are currently widely used in video scene detection and fall broadly into two classes: methods based on the scene transition graph (STG) and methods based on the shot similarity graph (SSG). STG methods first cluster the shots and then build a directed graph on top of the clusters; in an STG, each vertex represents a group of shots and each edge represents a transition between two groups. SSG methods obtain scenes by partitioning the graph with the normalized-cut technique; in an SSG, each vertex represents a shot, an edge exists between every pair of vertices, and each edge is weighted by the similarity of the two shots.
Existing graph-based description methods can segment video scenes to some extent, but they have two shortcomings. 1) Over-segmentation: a global threshold must be set, so the choice of threshold is critical; obtaining scene boundaries more accurately requires a larger threshold, which inevitably causes over-segmentation. 2) High computational complexity: both the shot clustering in STG and the normalized cut in SSG require a large amount of running time, and reducing the time complexity of scene detection to an engineering-applicable range remains a challenge.
Developing a video scene detection method that prevents over-segmentation and has low time complexity has therefore become an urgent technical problem.
Summary of the invention
The object of the invention is to overcome the prior art's defects of scene-boundary over-segmentation and impractical time complexity by providing a video scene detection method based on graph partitioning and instance learning.
To achieve this goal, the technical scheme of the present invention is as follows:
A video scene detection method based on graph partitioning and instance learning comprises the following steps:
Shot segmentation: a video sequence is input, and a shot-segmentation method detects all shots in the given sequence;
Extraction of the shots' visual-similarity features: key frames are extracted from each shot, and HSV and SIFT features are jointly used to construct the shot's visual-similarity feature;
Construction and partitioning of the directed temporal graph: a finite temporal graph describing the whole video is constructed and partitioned into several subgraphs;
Instance-learning-based scene detection: some subgraphs are identified as training examples (TEs), the remaining unidentified subgraphs are assigned to the identified ones by instance learning, and all subgraphs finally output are the detected scenes.
Extracting the shots' visual-similarity features comprises the following steps:
Extract the first frame, a middle frame and the last frame of each shot as key frames, obtaining a key-frame set that describes the whole shot;
Extract the SIFT features of the key frames and normalize them, the formulas being:
SimFF_sift(F_a^i, F_b^j) = M / Min(N_a^i, N_b^j);
SimSS_sift(S_i, S_j) = Max(SimFF_sift(F_h^i, F_l^j)), h ∈ KF_i, l ∈ KF_j;
where N_a^i is the number of SIFT features of key frame F_a^i and N_b^j the number of SIFT features of key frame F_b^j; key frame a belongs to shot i and key frame b to shot j; M is the number of features matched in the comparison; KF_i and KF_j are the key-frame sets of shots i and j; SimFF_sift(F_a^i, F_b^j) is the SIFT similarity of key frames a and b, and SimSS_sift(S_i, S_j) is the SIFT similarity of shots i and j; Th_sift is a threshold, and N_SimSS_sift(S_i, S_j) is the normalized SIFT similarity;
Extract the HSV features of the key frames and normalize them, the formulas being:
SimFF_color(F_i, F_j) = Σ_{h∈bins} Min(H_i(h), H_j(h)),
SimSS_color(S_i, S_j) = Max(SimFF_color(F_h, F_l)), h ∈ KF_i, l ∈ KF_j,
where H_i and H_j are the normalized HSV histograms of key frames F_i and F_j, KF_i and KF_j are the key-frame sets of shots i and j, SimFF_color(F_i, F_j) is the HSV similarity of key frames F_i and F_j, and SimSS_color(S_i, S_j) is the HSV similarity of shots i and j;
Jointly construct the shot visual-similarity feature from the HSV and SIFT features:
SimSS_visual(S_i, S_j) = α · N_SimSS_sift(S_i, S_j) + β · SimSS_color(S_i, S_j),
where α and β are non-negative weights and α + β = 1.
The construction and partitioning of the directed temporal graph comprise the following steps:
Generate the directed temporal graph of the video, the construction method being as follows:
311) Let G = (V, E) denote a directed graph, where V = {v_i | i = 1, 2, ..., N} is the vertex set and E = {e_i,j} the edge set; all shots are sorted in visual order ..., v_i, v_i+1, ..., vertex v_i representing the i-th shot and vertex v_j the j-th shot; if v_j − v_i = 1, add a directed edge from v_i to v_j;
312) Define variables i, j and L, with i = 1 and j = 2; L is the sliding-window length;
313) Judge whether the in-degree of vertex v_j is greater than 1; if so, go to step 315), otherwise go to step 314);
314) Judge whether SimSS_visual(S_i, S_j) is greater than the given threshold T; if so, generate a directed edge from vertex v_i to vertex v_j, otherwise go to step 315);
315) Add 1 to j; if j − i > L or j > N, go to step 316), otherwise return to step 313);
316) Add 1 to i and add 1 to j; if i < N, return to step 313), otherwise the construction of the directed temporal graph is complete.
Partition the directed temporal graph into subgraphs, each subgraph representing a video segment.
The instance-learning-based scene detection comprises the following steps:
Examine each subgraph obtained from the partitioning; a subgraph whose density is greater than 0.33 is taken as a training example (TE), the subgraph density being computed from Ne, the number of edges the subgraph contains, and Nv, the number of vertices it contains;
All subgraphs are divided into TE and non-TE parts;
Arrange the subgraphs in temporal order; the non-TE subgraphs, in temporal order, generate a label sequence; then detect the scene boundaries with the instance-learning method and finally output the obtained scenes.
The instance-learning method comprises the following steps:
Each subgraph between two TEs is assigned a label with value 0, 1 or −1, the assignment conditions being as follows:
Compute the similarity between the subgraph and the preceding TE and store the result in SL;
Compute the similarity between the subgraph and the following TE and store the result in SR;
If SL > SR, compute SR/SL, otherwise compute SL/SR, and store the result in S;
If S > 0.85, mark the subgraph's label as 0; otherwise, if SL > SR, mark it as −1, and if SL ≤ SR, mark it as 1; this yields a label sequence composed of 0, 1 and −1;
Compute the fuzzy value Fuz at each split position:
Fuz = Max((N_R − N_L) / N, N_zero / N)
where N is the number of labels between the two TEs, N_L is the sum of all labels in the left half after the split, N_R the sum of all labels in the right half, and N_zero the count of labels whose value is 0;
If more than 2N/3 of the labels in the sequence are 0, there is no scene boundary in the sequence;
If the fuzzy value Fuz computed at a split position is the maximum, that position is a suitable scene boundary; if several split positions have equal fuzzy values, choose the middle split position as the scene boundary.
Beneficial effect
Compared with the prior art, the video scene detection method based on graph partitioning and instance learning of the present invention prevents over-segmentation and reduces computational complexity. It improves the precision and recall of scene detection over the whole video and maintains good detection performance on scenes with drastic illumination changes and on high-motion scenes. SIFT features and HSV histograms are extracted as the visual features for scene detection, a construction and partitioning method for a shot-based directed temporal graph is proposed, and an instance-learning-based scene segmentation yields the final video scenes.
Description of the drawings
Fig. 1 is the flow chart of the method of the present invention;
Fig. 2 shows the directed graph after initialization in the construction-and-partitioning step;
Fig. 3 is a schematic diagram of the construction of the directed temporal graph;
Fig. 4 is a schematic diagram of the partitioning of the directed temporal graph;
Fig. 5 is a schematic diagram of assigning labels to subgraphs in the instance-learning-based scene detection step;
Fig. 6 is a schematic diagram of splitting the label sequence in the instance-learning-based scene detection step.
Detailed description of the embodiments
To give a better understanding of the structural features and effects of the present invention, a preferred embodiment is described in detail below with reference to the accompanying drawings:
As shown in Fig. 1, the video scene detection method based on graph partitioning and instance learning of the present invention comprises the following steps:
First step, shot segmentation: a video sequence is input, and a shot-segmentation method detects all shots in the given sequence. Any prior-art shot-segmentation method may be used, such as the one introduced in http://www-nlpir.nist.gov/projects/tvpubs/tvpapers03/ramonlull.paper.pdf.
Second step, extract the shots' visual-similarity features: key frames are extracted from each shot, and HSV and SIFT features are jointly used to construct the shot's visual-similarity feature. The concrete steps are as follows:
(1) Extract the first frame, a middle frame and the last frame of each shot as key frames, obtaining a key-frame set KF that describes the whole shot.
(2) Extract the SIFT features of the key frames, normalize them, count the number of matched feature points, and judge the similarity of the two shots; the formulas are:
SimFF_sift(F_a^i, F_b^j) = M / Min(N_a^i, N_b^j);
SimSS_sift(S_i, S_j) = Max(SimFF_sift(F_h^i, F_l^j)), h ∈ KF_i, l ∈ KF_j;
where N_a^i is the number of SIFT features of key frame F_a^i and N_b^j the number of SIFT features of key frame F_b^j; key frame a belongs to shot i and key frame b to shot j; M is the number of features matched in the comparison; KF_i and KF_j are the key-frame sets of shots i and j; SimFF_sift(F_a^i, F_b^j) is the SIFT similarity of key frames a and b, and SimSS_sift(S_i, S_j) is the SIFT similarity of shots i and j. Th_sift is a threshold, experimentally verified to be usually 0.12, and N_SimSS_sift(S_i, S_j) is the normalized SIFT similarity.
(3) Extract the HSV features of the key frames and normalize them. To compare HSV histogram features, first compute the normalized HSV histograms of the two frames and then judge the similarity of the two shots; the formulas are:
SimFF_color(F_i, F_j) = Σ_{h∈bins} Min(H_i(h), H_j(h)),
SimSS_color(S_i, S_j) = Max(SimFF_color(F_h, F_l)), h ∈ KF_i, l ∈ KF_j,
where H_i and H_j are the normalized HSV histograms of key frames F_i and F_j, KF_i and KF_j are the key-frame sets of shots i and j, SimFF_color(F_i, F_j) is the HSV similarity of key frames F_i and F_j, and SimSS_color(S_i, S_j) is the HSV similarity of shots i and j.
(4) Jointly construct the shot visual-similarity feature from the HSV and SIFT features:
SimSS_visual(S_i, S_j) = α · N_SimSS_sift(S_i, S_j) + β · SimSS_color(S_i, S_j),
where α and β are non-negative weights with α + β = 1.
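The following is a minimal sketch of this joint similarity, assuming OpenCV and key frames given as BGR images. The exact normalization N_SimSS_sift is not spelled out above, so clipping by Th_sift is an assumption, as are the Lowe ratio test used to count the matches M and the default weights α = β = 0.5.

import cv2
import numpy as np

sift = cv2.SIFT_create()
matcher = cv2.BFMatcher(cv2.NORM_L2)

def sim_ff_sift(frame_a, frame_b):
    # SimFF_sift: matched SIFT features M over the smaller feature count.
    _, des_a = sift.detectAndCompute(cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY), None)
    _, des_b = sift.detectAndCompute(cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY), None)
    if des_a is None or des_b is None:
        return 0.0
    # Count reliable matches with Lowe's ratio test (assumed matching rule).
    knn = matcher.knnMatch(des_a, des_b, k=2)
    m = sum(1 for p in knn if len(p) == 2 and p[0].distance < 0.7 * p[1].distance)
    return m / min(len(des_a), len(des_b))

def sim_ff_color(frame_a, frame_b, bins=(8, 8, 8)):
    # SimFF_color: histogram intersection of normalized HSV histograms.
    hists = []
    for f in (frame_a, frame_b):
        hsv = cv2.cvtColor(f, cv2.COLOR_BGR2HSV)
        h = cv2.calcHist([hsv], [0, 1, 2], None, list(bins), [0, 180, 0, 256, 0, 256])
        hists.append(h.ravel() / h.sum())
    return float(np.minimum(hists[0], hists[1]).sum())

def sim_ss_visual(kf_i, kf_j, alpha=0.5, beta=0.5, th_sift=0.12):
    # SimSS_visual: max over key-frame pairs, then weighted combination.
    s_sift = max(sim_ff_sift(a, b) for a in kf_i for b in kf_j)
    s_color = max(sim_ff_color(a, b) for a in kf_i for b in kf_j)
    n_sift = min(s_sift / th_sift, 1.0)  # assumed normalization by Th_sift
    return alpha * n_sift + beta * s_color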
Third step, construction and partitioning of the directed temporal graph: a finite temporal graph describing the whole video is constructed and then partitioned into several subgraphs. The temporal graph of the video is generated by the construction method below; the generated graph is then partitioned into subgraphs with Dijkstra's algorithm, each subgraph representing a video segment. The concrete steps are as follows:
(1) Generate the directed temporal graph of the video. Let G = (V, E) denote a directed graph, where V = {v_i | i = 1, 2, ..., N} is the vertex set and E = {e_i,j} the edge set; all shots are sorted in visual order ..., v_i, v_i+1, ..., vertex v_i representing the i-th shot and vertex v_j the j-th shot. If v_j − v_i = 1, add a directed edge from v_i to v_j; the initialized directed graph is shown in Fig. 2.
(2) Define variables i, j and L, with i = 1 and j = 2; L is the sliding-window length.
(3) The calculation procedure is shown in Fig. 3: first judge whether the in-degree of vertex v_j is greater than 1; if so, go to step (5), otherwise continue with the next step.
(4) Judge whether SimSS_visual(S_i, S_j) is greater than the given threshold T; experimental analysis shows T can be set to 0.6. If so, generate a directed edge from vertex v_i to vertex v_j; if not, continue with the next step.
(5) Add 1 to j; if j − i > L or j > N, continue with the next step, otherwise return to step (3).
(6) Add 1 to i and add 1 to j; if i < N, return to step (3), otherwise the construction of the directed temporal graph is complete.
(7) As shown in Fig. 4, partition the graph into several subgraphs with Dijkstra's algorithm, each subgraph representing a video segment. Dijkstra's algorithm searches for the shortest path through the temporal digraph; all edges on the shortest path are removed, yielding the subgraphs.
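Below is a sketch of this construction and the Dijkstra-based split, assuming networkx and the sim_ss_visual helper above. The text fixes T = 0.6 but not the edge weights for the shortest-path search, the window length, or the exact loop bounds, so unit weights, L = 8 and the bounds used here are assumptions.

import networkx as nx

def build_temporal_graph(shot_keyframes, L=8, T=0.6):
    # Vertices are shots in visual order; consecutive shots are chained.
    n = len(shot_keyframes)
    g = nx.DiGraph()
    g.add_nodes_from(range(n))
    for i in range(n - 1):
        g.add_edge(i, i + 1)
    # Sliding window: link v_i to v_j when the two shots are similar enough.
    for i in range(n):
        for j in range(i + 2, min(i + L + 1, n)):
            if g.in_degree(j) > 1:
                continue  # v_j already carries a similarity edge
            if sim_ss_visual(shot_keyframes[i], shot_keyframes[j]) > T:
                g.add_edge(i, j)
    return g

def split_into_subgraphs(g):
    # Remove the edges on the shortest first-to-last path (unit weights
    # assumed); the weakly connected components left over are the subgraphs.
    path = nx.dijkstra_path(g, 0, g.number_of_nodes() - 1)
    g2 = g.copy()
    g2.remove_edges_from(zip(path, path[1:]))
    return [g2.subgraph(c).copy() for c in nx.weakly_connected_components(g2)]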
Fourth step, instance-learning-based scene detection: some subgraphs are identified as training examples (TEs), the remaining unidentified subgraphs are assigned to the identified ones by instance learning, and all subgraphs finally output are the detected scenes. The concrete steps are as follows:
(1) Examine each subgraph obtained from the partitioning; a subgraph whose density is greater than 0.33 is taken as a training example (TE). The subgraph density is computed from Ne, the number of edges the subgraph contains, and Nv, the number of vertices it contains.
(2) Experiments demonstrate that a subgraph with density greater than 0.33 can serve as a training example TE. All subgraphs are thus divided into TE and non-TE parts; a non-TE subgraph may be the transition between two scenes, i.e. a scene boundary.
(3) Arrange the subgraphs in temporal order; the non-TE subgraphs, in temporal order, generate a label sequence; the instance-learning method generates the label sequence, detects the scene boundaries, and the resulting scenes are finally output.
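The density formula itself is not reproduced in this text version; the sketch below assumes the usual directed-graph density Ne/(Nv·(Nv − 1)), which is consistent with the Ne and Nv definitions above but is not confirmed by the source.

def subgraph_density(sg):
    # Assumed density: fraction of possible directed edges actually present.
    ne, nv = sg.number_of_edges(), sg.number_of_nodes()
    return 0.0 if nv < 2 else ne / (nv * (nv - 1))

def split_te(subgraphs, thresh=0.33):
    # Subgraphs denser than the 0.33 threshold become training examples.
    te = [sg for sg in subgraphs if subgraph_density(sg) > thresh]
    non_te = [sg for sg in subgraphs if subgraph_density(sg) <= thresh]
    return te, non_te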
The instance-learning method specifically comprises the following steps:
A. As shown in Fig. 5, each subgraph between two TEs is assigned a label with value 0, 1 or −1: 1 means the subgraph is visually similar to the next TE, −1 means it is visually similar to the previous TE, and 0 means no judgment can be made. The assignment conditions are as follows:
a. Compute the similarity between the subgraph and the preceding TE and store the result in SL;
b. Compute the similarity between the subgraph and the following TE and store the result in SR;
c. If SL > SR, compute SR/SL, otherwise compute SL/SR, and store the result in S;
d. If S > 0.85, mark the subgraph's label as 0; otherwise, if SL > SR, mark it as −1, and if SL ≤ SR, mark it as 1. This yields a label sequence composed of 0, 1 and −1.
B. Compute the fuzzy value Fuz at each split position:
Fuz = Max((N_R − N_L) / N, N_zero / N)
where N is the number of labels between the two TEs, as shown in Fig. 6; N_L is the sum of all labels in the left half after the split, N_R the sum of all labels in the right half, and N_zero the count of labels whose value is 0.
C. If more than 2N/3 of the labels in the sequence are 0, there is no scene boundary in the sequence.
D. As shown in Fig. 6, split the label sequence and obtain the scenes from the split result. The split position whose fuzzy value Fuz is the maximum is a suitable scene boundary; if several split positions have equal fuzzy values, choose the middle split position as the scene boundary.
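A sketch of steps A through D follows, assuming a similarity(sg, te) function between a subgraph and a TE (for instance the maximum SimSS_visual over their shots), which the text leaves unspecified.

def assign_label(sg, prev_te, next_te, similarity, ambiguous=0.85):
    # Step A: -1 = like the previous TE, 1 = like the next TE, 0 = unclear.
    sl = similarity(sg, prev_te)
    sr = similarity(sg, next_te)
    if max(sl, sr) == 0:
        return 0
    s = sr / sl if sl > sr else sl / sr  # ratio of the smaller to the larger
    if s > ambiguous:
        return 0
    return -1 if sl > sr else 1

def best_split(labels):
    # Steps B-D: pick the split maximizing Fuz = Max((N_R - N_L)/N, N_zero/N).
    n = len(labels)
    if n < 2:
        return None
    n_zero = labels.count(0)
    if n_zero > 2 * n / 3:
        return None  # step C: no scene boundary in this sequence
    fuz = [max((sum(labels[k:]) - sum(labels[:k])) / n, n_zero / n)
           for k in range(1, n)]
    best = max(fuz)
    ties = [k + 1 for k, v in enumerate(fuz) if v == best]
    return ties[len(ties) // 2]  # step D: middle split position on ties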
This method improves the precision and recall of the whole scene detection and maintains good detection performance on scenes with drastic illumination changes and on high-motion scenes. The present invention comprises the following components: scale-invariant features (SIFT) and HSV color histograms are jointly used to construct the shot similarity feature; key frames are extracted from each shot, and the joint SIFT-and-HSV visual feature extracted from the key frames represents the shot, each shot serving as a node of the directed temporal graph; a sliding-window comparison method constructs a directed temporal graph describing the video, the whole graph is then partitioned, and several subgraphs are obtained, each subgraph being a video segment; finally, the instance-learning method applied to the resulting video segments yields the determined video scenes. The present invention proposes the scene detection method based on instance learning and graph partitioning from the angle of engineering application, and its shot-similarity feature is highly robust to scale changes and lighting changes. The method improves the accuracy of scene detection for different kinds of film and television works and raises the application level of video scene detection technology in the post-production of all types of programmes.
The scene detection method described by the present invention maintains good detection performance on scenes with drastic illumination changes and on high-motion scenes. Because a joint feature based on SIFT features and HSV histogram features is adopted as the visual feature for shot detection, false detections and misses are reduced. Meanwhile, owing to the sliding-window directed temporal-graph technique and the application of the instance-learning method, the running time of the detection algorithm is effectively reduced. To verify the validity of the detection method, we ran extensive experiments to evaluate the test videos quantitatively and qualitatively; Table 1 gives the details of the test videos.
Table 1 Details of the test videos
The quantitative evaluation adopts the internationally common recall, precision and F-measure. The same video sequences are processed by the different detection methods, and a qualitative analysis judges the relative merits of the methods. As shown in Table 2, the detection method of the present invention is compared with the STG method.
Table 2 F-measure comparison results
The results show that the method of the present invention performs well in both recall and precision, and that the video scene detection method based on graph partitioning and instance learning is effective.
The basic principles, principal features and advantages of the present invention have been shown and described above. Those skilled in the art should understand that the present invention is not restricted to the above embodiments; the above embodiments and description merely illustrate the principle of the present invention. Various changes and improvements may be made without departing from the spirit and scope of the present invention, and all of them fall within the claimed scope, which is defined by the appended claims and their equivalents.

Claims (5)

1. A video scene detection method based on graph partitioning and instance learning, characterized by comprising the following steps:
11) shot segmentation: a video sequence is input, and a shot-segmentation method detects all shots in the given video sequence;
12) extraction of the shots' visual-similarity features: key frames are extracted from each shot, and HSV and SIFT features are jointly used to construct the shot's visual-similarity feature;
13) construction and partitioning of the directed temporal graph: a finite temporal graph describing the whole video is constructed and partitioned into several subgraphs;
14) instance-learning-based scene detection: some subgraphs are identified as training examples (TEs), the remaining unidentified subgraphs are assigned to the identified ones by instance learning, and all subgraphs finally output are the detected scenes.
2. The video scene detection method based on graph partitioning and instance learning according to claim 1, characterized in that the extraction of the shots' visual-similarity features comprises the following steps:
21) extract the first frame, a middle frame and the last frame of each shot as key frames, obtaining a key-frame set that describes the whole shot;
22) extract the SIFT features of the key frames and normalize them, the formulas being:
SimFF_sift(F_a^i, F_b^j) = M / Min(N_a^i, N_b^j);
SimSS_sift(S_i, S_j) = Max(SimFF_sift(F_h^i, F_l^j)), h ∈ KF_i, l ∈ KF_j;
where N_a^i is the number of SIFT features of key frame F_a^i and N_b^j the number of SIFT features of key frame F_b^j; key frame a belongs to shot i and key frame b to shot j; M is the number of features matched in the comparison; KF_i and KF_j are the key-frame sets of shots i and j; SimFF_sift(F_a^i, F_b^j) is the SIFT similarity of key frames a and b, and SimSS_sift(S_i, S_j) is the SIFT similarity of shots i and j; Th_sift is a threshold, and N_SimSS_sift(S_i, S_j) is the normalized SIFT similarity;
23) extract the HSV features of the key frames and normalize them, the formulas being:
SimFF_color(F_i, F_j) = Σ_{h∈bins} Min(H_i(h), H_j(h)),
SimSS_color(S_i, S_j) = Max(SimFF_color(F_h, F_l)), h ∈ KF_i, l ∈ KF_j,
where H_i and H_j are the normalized HSV histograms of key frames F_i and F_j, KF_i and KF_j are the key-frame sets of shots i and j, SimFF_color(F_i, F_j) is the HSV similarity of key frames F_i and F_j, and SimSS_color(S_i, S_j) is the HSV similarity of shots i and j;
24) jointly construct the shot visual-similarity feature from the HSV and SIFT features:
SimSS_visual(S_i, S_j) = α · N_SimSS_sift(S_i, S_j) + β · SimSS_color(S_i, S_j),
wherein α and β are non-negative weights and α + β = 1.
3. The video scene detection method based on graph partitioning and instance learning according to claim 1, characterized in that the construction and partitioning of the directed temporal graph comprise the following steps:
31) generate the directed temporal graph of the video, the construction method being as follows:
311) let G = (V, E) denote a directed graph, where V = {v_i | i = 1, 2, ..., N} is the vertex set and E = {e_i,j} the edge set; all shots are sorted in visual order ..., v_i, v_i+1, ..., vertex v_i representing the i-th shot and vertex v_j the j-th shot; if v_j − v_i = 1, add a directed edge from v_i to v_j;
312) define variables i, j and L, with i = 1 and j = 2, L being the sliding-window length;
313) judge whether the in-degree of vertex v_j is greater than 1; if so, go to step 315), otherwise go to step 314);
314) judge whether SimSS_visual(S_i, S_j) is greater than the given threshold T; if so, generate a directed edge from vertex v_i to vertex v_j, otherwise go to step 315);
315) add 1 to j; if j − i > L or j > N, go to step 316), otherwise return to step 313);
316) add 1 to i and add 1 to j; if i < N, return to step 313), otherwise the construction of the directed temporal graph is complete;
32) partition the directed temporal graph into subgraphs, each subgraph representing a video segment.
4. The video scene detection method based on graph partitioning and instance learning according to claim 1, characterized in that the instance-learning-based scene detection comprises the following steps:
41) examine each subgraph obtained from the partitioning; a subgraph whose density is greater than 0.33 is taken as a training example (TE), the subgraph density being computed from Ne, the number of edges the subgraph contains, and Nv, the number of vertices it contains;
42) divide all subgraphs into TE and non-TE parts;
43) arrange the subgraphs in temporal order; the non-TE subgraphs, in temporal order, generate a label sequence; then detect the scene boundaries with the instance-learning method and finally output the obtained scenes.
5. The video scene detection method based on graph partitioning and instance learning according to claim 4, characterized in that the instance-learning method comprises the following steps:
51) each subgraph between two TEs is assigned a label with value 0, 1 or −1, the assignment conditions being as follows:
511) compute the similarity between the subgraph and the preceding TE and store the result in SL;
512) compute the similarity between the subgraph and the following TE and store the result in SR;
513) if SL > SR, compute SR/SL, otherwise compute SL/SR, and store the result in S;
514) if S > 0.85, mark the subgraph's label as 0; otherwise, if SL > SR, mark it as −1, and if SL ≤ SR, mark it as 1; this yields a label sequence composed of 0, 1 and −1;
52) compute the fuzzy value Fuz at each split position:
Fuz = Max((N_R − N_L) / N, N_zero / N)
wherein N is the number of labels between the two TEs, N_L is the sum of all labels in the left half after the split, N_R the sum of all labels in the right half, and N_zero the count of labels whose value is 0;
53) if more than 2N/3 of the labels in the sequence are 0, there is no scene boundary in the sequence;
54) if the fuzzy value Fuz computed at a split position is the maximum, that position is a suitable scene boundary; if several split positions have equal fuzzy values, choose the middle split position as the scene boundary.
CN201410525867.9A 2014-10-08 2014-10-08 Video scene detection method based on graph partitioning and instance learning Pending CN104318208A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410525867.9A CN104318208A (en) 2014-10-08 2014-10-08 Video scene detection method based on graph partitioning and instance learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410525867.9A CN104318208A (en) 2014-10-08 2014-10-08 Video scene detection method based on graph partitioning and instance learning

Publications (1)

Publication Number Publication Date
CN104318208A true CN104318208A (en) 2015-01-28

Family

ID=52373438

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410525867.9A Pending CN104318208A (en) 2014-10-08 2014-10-08 Video scene detection method based on graph partitioning and instance learning

Country Status (1)

Country Link
CN (1) CN104318208A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104834919A (en) * 2015-05-20 2015-08-12 东南大学 Contour line based three-dimensional human face iteration preprocessing and feature point extracting method
CN105988369A (en) * 2015-02-13 2016-10-05 上海交通大学 Content-driving-based intelligent household control method
CN107274415A (en) * 2017-06-06 2017-10-20 东北大学 A kind of image partition method connected based on Tarjan algorithms and region
CN108388886A (en) * 2018-03-16 2018-08-10 广东欧珀移动通信有限公司 Method, apparatus, terminal and the computer readable storage medium of image scene identification
CN109214239A (en) * 2017-06-30 2019-01-15 创意引晴(开曼)控股有限公司 The discrimination method for extending information in video, identification system and storage media can be recognized
CN110879952A (en) * 2018-09-06 2020-03-13 阿里巴巴集团控股有限公司 Method and device for processing video frame sequence
CN110913243A (en) * 2018-09-14 2020-03-24 华为技术有限公司 Video auditing method, device and equipment
CN113225461A (en) * 2021-02-04 2021-08-06 江西方兴科技有限公司 System and method for detecting video monitoring scene switching

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1761331A (en) * 2005-09-29 2006-04-19 深圳清华大学研究院 Testing method of switching video scenes
WO2008127319A2 (en) * 2007-01-31 2008-10-23 Thomson Licensing Method and apparatus for automatically categorizing potential shot and scene detection information
CN103679189A (en) * 2012-09-14 2014-03-26 华为技术有限公司 Method and device for recognizing scene

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1761331A (en) * 2005-09-29 2006-04-19 深圳清华大学研究院 Testing method of switching video scenes
WO2008127319A2 (en) * 2007-01-31 2008-10-23 Thomson Licensing Method and apparatus for automatically categorizing potential shot and scene detection information
CN103679189A (en) * 2012-09-14 2014-03-26 华为技术有限公司 Method and device for recognizing scene

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
TIAN BAI ET AL: "Indistinct segmentation of scene in video using instance leaning", 《FOURTH INTERNATIONAL CONFERENCE ON DIGITAL HOME》 *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105988369A (en) * 2015-02-13 2016-10-05 上海交通大学 Content-driving-based intelligent household control method
CN104834919A (en) * 2015-05-20 2015-08-12 东南大学 Contour line based three-dimensional human face iteration preprocessing and feature point extracting method
CN104834919B (en) * 2015-05-20 2018-05-15 东南大学 A kind of pretreatment of three-dimensional face iteration and Feature Points Extraction based on contour line
CN107274415A (en) * 2017-06-06 2017-10-20 东北大学 A kind of image partition method connected based on Tarjan algorithms and region
CN107274415B (en) * 2017-06-06 2019-08-09 东北大学 A kind of image partition method connected based on Tarjan algorithm with region
CN109214239A (en) * 2017-06-30 2019-01-15 创意引晴(开曼)控股有限公司 The discrimination method for extending information in video, identification system and storage media can be recognized
CN108388886A (en) * 2018-03-16 2018-08-10 广东欧珀移动通信有限公司 Method, apparatus, terminal and the computer readable storage medium of image scene identification
CN110879952A (en) * 2018-09-06 2020-03-13 阿里巴巴集团控股有限公司 Method and device for processing video frame sequence
CN110879952B (en) * 2018-09-06 2023-06-16 阿里巴巴集团控股有限公司 Video frame sequence processing method and device
CN110913243A (en) * 2018-09-14 2020-03-24 华为技术有限公司 Video auditing method, device and equipment
CN113225461A (en) * 2021-02-04 2021-08-06 江西方兴科技有限公司 System and method for detecting video monitoring scene switching

Similar Documents

Publication Publication Date Title
CN104318208A (en) Video scene detection method based on graph partitioning and instance learning
Peng et al. TPM: Multiple object tracking with tracklet-plane matching
Adhikari et al. Faster bounding box annotation for object detection in indoor scenes
CN106294344B (en) Video retrieval method and device
Tang et al. Facial landmark detection by semi-supervised deep learning
CN104463250A (en) Sign language recognition translation method based on Davinci technology
CN103984943A (en) Scene text identification method based on Bayesian probability frame
Xia et al. Learning to refactor action and co-occurrence features for temporal action localization
CN109919060A (en) A kind of identity card content identifying system and method based on characteristic matching
CN105389558A (en) Method and apparatus for detecting video
Luo et al. SFA: small faces attention face detector
CN111414845B (en) Multi-form sentence video positioning method based on space-time diagram inference network
CN113963304B (en) Cross-modal video time sequence action positioning method and system based on time sequence-space diagram
CN115273154B (en) Thermal infrared pedestrian detection method and system based on edge reconstruction and storage medium
CN112949408A (en) Real-time identification method and system for target fish passing through fish channel
Ma et al. Location-aware box reasoning for anchor-based single-shot object detection
CN115712740A (en) Method and system for multi-modal implication enhanced image text retrieval
CN115659966A (en) Rumor detection method and system based on dynamic heteromorphic graph and multi-level attention
Sun et al. Boosting robust learning via leveraging reusable samples in noisy web data
Mijić et al. Traffic sign detection using yolov3
CN105469099B (en) Pavement crack detection and identification method based on sparse representation classification
CN115100497A (en) Robot-based method, device, equipment and medium for routing inspection of abnormal objects in channel
CN109726670B (en) Method for extracting target detection sample set from video
Chen et al. Online spatio-temporal action detection in long-distance imaging affected by the atmosphere
Wang et al. A deep learning-based method for vehicle licenseplate recognition in natural scene

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20150128

WD01 Invention patent application deemed withdrawn after publication