CN102970528B - Video object segmentation method based on change detection and frame difference accumulation - Google Patents

Video object segmentation method based on change detection and frame difference accumulation

Info

Publication number
CN102970528B
Authority
CN
China
Prior art keywords
video
frame
detection
segmentation
motion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210402443.4A
Other languages
Chinese (zh)
Other versions
CN102970528A (en)
Inventor
Zhu Shiping (祝世平)
Gao Jie (高洁)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Xin Xiang Technology Co., Ltd.
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN201210402443.4A priority Critical patent/CN102970528B/en
Publication of CN102970528A publication Critical patent/CN102970528A/en
Application granted granted Critical
Publication of CN102970528B publication Critical patent/CN102970528B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a video object segmentation method based on change detection and frame difference accumulation. First, a t significance test detects the inter-frame changes between symmetric frames separated by k frames; the detected initial motion change regions are then refined by temporal fixed-interval frame difference accumulation and filled to form a memory mask. Next, a Kirsch edge detection operator based on a discontinuity detection technique adjusts its threshold through an edge-continuity check, yielding all the connected edge information of the current frame. A spatio-temporal filter then produces the semantic video object plane, and selectively applied filling and morphological processing finally realize the segmentation of the video object. This new video object segmentation method effectively solves the interior loss of the video object caused by irregular object motion and the uncovered-background problem, both of which video object segmentation methods frequently suffer from; segmentation speed, segmentation quality, range of application, and portability are all greatly improved.

Description

Video object segmentation method based on change detection and frame difference accumulation
Technical field
The present invention relates to a processing method for video object extraction, and in particular to a video object segmentation method based on change detection and frame difference accumulation. While guaranteeing segmentation quality and speed, the method removes the drawback of conventional change detection, namely that the frame-difference threshold must be found through repeated trials, and makes full use of pixel-neighborhood information, so that the video object is obtained more accurately. Frame difference accumulation aggregates the information of a period of time and solves the loss of locally moving objects caused by the irregular motion of the moving object, while the Kirsch edge detection operator based on a discontinuity detection technique yields a complete, connected binary edge map of the current frame. The combined application of the three techniques makes the method more practical and more general.
Background technology
Object-based video segmentation is the key to realizing the content-based coding and interactive functionality of MPEG-4. With the popularization of the content-based techniques of the MPEG-4 standard, video object segmentation has become a focus of current research. At present, object-based video segmentation has extremely broad application prospects in fields such as video surveillance, human-computer interaction, the military, and communications. Although video object segmentation has been studied extensively in recent years, the problem is still not fully solved.
According to whether manual participation is required in the segmentation process, video object segmentation methods are divided into automatic segmentation (see Thomas Meier, King N. Ngan. Automatic segmentation of moving objects for video object plane generation [J]. IEEE Transactions on Circuits and Systems for Video Technology, 1998, 8(5): 525-538) and semi-automatic segmentation (see Munchurl Kim, J.G. Jeon, J.S. Kwak, M.H. Lee, C. Ahn. Moving object segmentation in video sequences by user interaction and automatic object tracking [J]. Image and Vision Computing, 2001, 19(5): 245-260).
According to the information used, segmentation methods mainly comprise temporal segmentation, spatial segmentation, and spatio-temporal fusion. Temporal segmentation exploits the temporal ordering of the video sequence and detects moving targets from the temporal changes between consecutive frames; common approaches are frame differencing and background subtraction. Frame differencing (see Zhan C H. An improved moving object detection algorithm based on frame difference and edge detection [A]. International Conference on Image and Graphics [C], 2007: 519-523) is simple to implement with low programming complexity, is relatively insensitive to scene changes such as lighting, adapts to various dynamic environments, and is fairly stable; however, it can only extract object boundaries rather than the complete object region, and it depends on the time interval chosen for the frame difference. Background subtraction (see Olivier Barnich, Marc Van Droogenbroeck. A universal background subtraction algorithm for video sequences [J]. IEEE Transactions on Image Processing, 2011, 20(6): 1709-1723) has a simple principle, with a threshold chosen according to the actual situation, and its result directly reflects the position, size, and shape of the moving target, so fairly accurate target information can be obtained; however, background updating is computationally expensive, discrete noise points are easily produced, and the method is strongly affected by changes in external conditions such as lighting and weather. Optical flow (see Jareal A, Venkatesh K S. A new color based optical flow algorithm for environment mapping using a mobile robot [A]. IEEE International Symposium on Intelligent Control [C]. 2007: 567-572) carries the motion information of the target and can detect moving objects well even when nothing is known about the scene, with high detection accuracy, but it is computationally heavy and generally cannot be applied to real-time processing. Spatial segmentation (see Jaime S. Cardoso, Jorge C.S. Cardoso, Luis Corte-Real. Object-based spatial segmentation of video guided by depth and motion information [A]. IEEE Workshop on Motion and Video Computing [C]. 2007) extracts video objects using spatial attributes such as the color, brightness, texture, and edge information of the image. Spatio-temporal fusion (see Ahmed, R., Karmakar, G.C., Dooley, L.S. Incorporation of texture information for joint spatio-temporal probabilistic video object segmentation [A]. IEEE International Conference on Image Processing [C]. 2007, 6, 293-296) is currently the most common approach: it combines the temporal information and the spatial information of the video sequence, and the two kinds of information complement each other to give a relatively accurate segmentation result.
However, whichever method is used to segment the moving object, uncovered background and the irregular motion of the object (the moving object, or some part of it, remaining stationary for a period of time) degrade segmentation accuracy. In motion analysis, uncovered background regions and the static foreground regions caused by irregular object motion are easily misclassified as foreground or background, which lowers the segmentation accuracy.
Summary of the invention:
The present invention proposes a video object segmentation method based on change detection and frame difference accumulation. Each video frame is first smoothed with a Gaussian filter; a t significance test then detects the inter-frame changes between symmetric frames separated by k frames; the detected initial motion change regions are refined by temporal fixed-interval frame difference accumulation and filled to form a memory mask. Next, a Kirsch edge detection operator based on a discontinuity detection technique adjusts its threshold through an edge-continuity check, yielding all the connected edge information of the current frame; this reduces the residual noise in the memory mask and protects low-intensity edge details while guaranteeing edge continuity. A spatio-temporal filter then produces the semantic video object plane, and selectively applied filling and morphological processing complete the segmentation of the video object. This new video object segmentation method effectively solves the interior loss of the video object caused by irregular motion (the moving object, or some part of it, remaining stationary for a period of time) and the uncovered-background problem, both of which video object segmentation methods frequently suffer from; segmentation speed, segmentation quality, range of application, and portability are all greatly improved.
The technical problems to be solved by the present invention are:
1. When the change region of the moving object is obtained directly by inter-frame difference, the threshold of the frame difference method must be found through repeated experiments, the method is sensitive to noise and lighting, and the extracted motion region suffers serious losses;
2. The occlusion (covering/uncovering) problem produced by the irregular motion of the video object (the moving object, or some part of it, remaining stationary for a period of time);
3. The discontinuity of the video object boundary obtained by the Kirsch edge detection operator.
The technical solution adopted by the present invention to solve the technical problems is a video object segmentation method based on change detection and frame difference accumulation, comprising the following steps:
Step 1: Smooth each frame of the video sequence with a Gaussian filter, and detect the inter-frame changes of symmetric frames separated by k frames with a t significance test to obtain the initial motion change region of each frame; AND the change regions of the symmetric frame distances to obtain the complete motion change region; then carry out temporal fixed-interval frame difference accumulation to obtain the effective template of each fixed interval, and fill it to form the memory mask, completing the temporal segmentation of the video object.
Step 2: Apply the improved Kirsch edge detection operator, i.e. the Kirsch operator based on the discontinuity detection technique, to each frame of the original video, and binarize the edge detection result, completing the spatial segmentation of the video object.
Step 3: In a parallel spatio-temporal fusion, AND the segmentation memory mask formed in step 1 with the binarized edge detection result of each frame of the video sequence obtained in step 2 to extract the accurate boundary contour of the moving object; selectively apply morphological opening/closing and filling according to the boundary information to complete the extraction of the video object.
Compared with the prior art, the present invention has the following advantages:
1. The method detects inter-frame changes with a t-distribution significance test, so the noise variance of the video need not be known and the estimation of noise parameters is avoided; no manual threshold trials are needed when computing the frame difference image, since the optimal threshold can be looked up in the t-distribution table. This statistical change detection by hypothesis testing suppresses the influence of camera noise on the segmentation result well, and the result is clearly better than that obtained by plain thresholding.
2. Using images separated by k frames, the method handles slowly moving video objects better. The concept of the memory mask MT (Memory Template) is proposed, obtained by the novel method of temporal fixed-interval frame difference accumulation, which effectively solves the loss of motion region boundaries.
3. Edge lines obtained directly by Kirsch edge detection are prone to breakpoints, and the result is unsatisfactory. The method computes the differences around a target point in 6 directions with 4 × 4 direction templates; when the maximum difference exceeds a certain threshold, the point is regarded as a discontinuity point, so the discontinuities of the image edge are detected. All connected edge information in the current frame is thus obtained, the residual noise in the memory mask is reduced, and low-intensity edge details are well protected while edge continuity is guaranteed.
Brief description of the drawings:
Fig. 1 is the flow chart of the video object segmentation method based on change detection and frame difference accumulation of the present invention.
Fig. 2 shows change detection and frame difference accumulation on the Akiyo sequence: (a) frame 5 of the Akiyo sequence; (b) frame 21 of the Akiyo sequence; (c) the initial motion change region obtained from (a) by the t significance test; (d) the initial motion change region obtained from (b) by the t significance test; (e) the complete motion change region of (c) after temporal fixed-interval frame difference accumulation; (f) the complete motion change region of (d) after temporal fixed-interval frame difference accumulation.
Fig. 3 shows the memory masks of the Akiyo sequence: (a) the first and second memory masks of the Akiyo sequence.
Fig. 4 shows VOP extraction on the Akiyo sequence: (a) frame 5 of the Akiyo sequence; (b) frame 21 of the Akiyo sequence; (c) the VOP extracted from (a); (d) the VOP extracted from (b).
Fig. 5 shows VOPs extracted with the method of the invention from the Grandma sequence: (a) frame 4 of the Grandma sequence; (b) the VOP extracted from (a); (c) frame 19 of the Claire sequence; (d) the VOP extracted from (c).
Fig. 6 shows VOPs extracted with the method of the invention from the Claire sequence: (a) frame 8 of the Claire sequence; (b) the VOP extracted from (a); (c) frame 16 of the Claire sequence; (d) the VOP extracted from (c).
Fig. 7 shows VOPs extracted with the method of the invention from the Miss-American sequence: (a) frame 20 of the Miss-American sequence; (b) the VOP extracted from (a); (c) frame 40 of the Miss-American sequence; (d) the VOP extracted from (c).
Fig. 8 shows VOPs extracted with the method of the invention from the Mother and daughter and Hall monitor sequences: (a) frame 15 of the Mother and daughter sequence; (b) the VOP extracted from (a); (c) frame 70 of the Hall monitor sequence; (d) the VOP extracted from (c).
Fig. 9 compares the spatial accuracy of the first 20 frames of the Grandma and Miss-American sequences segmented by the present method and by the reference method (Zhu Shiping, Ma Li, Hou Yangshuan. Video object segmentation algorithm based on temporal fixed-interval memory compensation [J]. Journal of Optoelectronics·Laser, 2010, 21(8): 1241-1246): (a) spatial accuracy comparison on the Grandma sequence; (b) spatial accuracy comparison on the Miss-American sequence; "spatial accuracy 1" denotes the result obtained with the present method and "spatial accuracy 2" the result obtained with the reference method.
Detailed description of the invention:
The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Fig. 1 is the flow chart of the video object segmentation method based on change detection and frame difference accumulation of the present invention; the method comprises the following steps:
Step 1: Smooth each frame of the video sequence with a Gaussian filter, and detect the inter-frame changes of symmetric frames separated by k frames with a t significance test to obtain the initial motion change region of each frame; AND the change regions of the symmetric frame distances to obtain the complete motion change region; then carry out temporal fixed-interval frame difference accumulation to obtain the effective template of each fixed interval, and fill it to form the memory mask, completing the temporal segmentation of the video object.
Let $F_{(n)}$ denote the n-th frame of the image sequence. The difference image between $F_{(n)}$ and $F_{(n-k)}$ contains the video object in $F_{(n)}$ plus the background uncovered by the motion of the object; the frame difference mask between $F_{(n+k)}$ and $F_{(n)}$ contains the video object in $F_{(n)}$ plus the background covered in $F_{(n+k)}$ by the motion of the object. Change detection based on the t-distribution is therefore applied to the video images: the values of the unchanged region and of the changed region in the frame difference image are analyzed statistically to obtain the initial change detection mask. Change detection based on the t-distribution avoids the repeated experiments needed to obtain a segmentation threshold and makes full use of the information in the neighborhood of the examined pixel, so the decision obtained is more accurate. Because the hypothesis test carries a false-alarm probability and loses some motion details, the initial motion change region contains holes inside the moving object, where texture detail is missing, together with scattered noise regions.
When temporal segmentation is carried out, if part of the target moves inconspicuously it is difficult to find a motion region containing the complete video object, so symmetric-frame-distance frame difference accumulation is adopted first.
After the video sequence is converted to grayscale, let the n-th frame be $F_n(x, y)$, and let $G_n(x, y)$ denote the result of Gaussian smoothing.
The noise of each frame of the video sequence is denoted $N_n(x, y)$, with variance $\sigma_n^2$. The n-th frame $G_n(x, y)$ of the video sequence can therefore be written as:

$$G_n(x,y) = \overline{G}_n(x,y) + N_n(x,y)$$

where $\overline{G}_n(x,y)$ is the true value of the video image. From this, the difference image is obtained:

$$FD(x,y) = \overline{G}_n(x,y) - \overline{G}_{n-k}(x,y) + N_n(x,y) - N_{n-k}(x,y)$$

Let $D(x,y) = N_n(x,y) - N_{n-k}(x,y)$, where $N_n(x,y)$ and $N_{n-k}(x,y)$ are mutually independent random variables with identical probability density; $D(x,y)$ is then still an additive zero-mean Gaussian random variable, with variance $2\sigma_n^2$.
Since the noise at each pixel is independent, if all nonzero frame differences inside a window are caused by noise, the mean μ of those values should be zero. A hypothesis test is therefore carried out according to probability theory: assume that position (x, y) belongs to the background (null hypothesis $H_0$): $H_0: \mu = 0$. Since the noise variance is unknown, a t significance test is used, and the test statistic t is constructed from the pixels in the neighborhood window:

$$t = \frac{A_d(n)}{s/\sqrt{p}}$$

where $A_d(n)$ and $s$ are the sample mean and sample standard deviation in the neighborhood window, and the window radius n gives $p = (2n+1)^2$ pixels:

$$A_d(n) = \frac{1}{p}\sum_{l=-n}^{n}\sum_{m=-n}^{n} |FD(x+l,\,y+m)|$$

$$s = \sqrt{\frac{1}{p-1}\sum_{l=-n}^{n}\sum_{m=-n}^{n} \left(|FD(x+l,\,y+m)| - A_d(n)\right)^2}$$

According to significance-test theory, the threshold is determined by the given significance level α and the distribution obeyed by t:

$$|t| \ge t_{\alpha/2}(p-1)$$

The choice of the significance level α is related to the camera-noise intensity of the specific video sequence; it is generally set to $10^{-2}$, $10^{-6}$, etc. (here α is chosen as $10^{-2}$ and the window size as 5 × 5), which gives good results. With the significance level α set, if $|t| \ge t_{\alpha/2}(p-1)$ holds, the central pixel belongs to the motion mask m(n).
The initial motion change region can thus be expressed as:

$$m_n(x,y)=\begin{cases}255, & |t| \ge t_{\alpha/2}(p-1)\\[2pt] 0, & \text{otherwise}\end{cases}$$

ANDing the change regions of the two symmetric frame distances includes the parts of the video object whose motion is inconspicuous and yields the complete motion change region:

$$M_n(x,y) = m_{(n,\,n-k)}(x,y) \cap m_{(n,\,n+k)}(x,y)$$
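For illustration, this change detection can be sketched in a few lines of Python/SciPy (the patent's reference implementation was written in C; the function name, the windowed-moment computation, and the clamping constant below are assumptions of this sketch, not part of the patent):

```python
import numpy as np
from scipy.ndimage import uniform_filter
from scipy.stats import t as t_dist

def t_test_change_mask(frame_a, frame_b, win=5, alpha=1e-2):
    """Per-pixel t significance test on |FD|, the absolute frame difference.

    frame_a, frame_b: Gaussian-smoothed grayscale frames (float or uint8).
    Returns a uint8 mask m(n): 255 where the window mean of |FD| is
    significantly nonzero at level alpha, i.e. the pixel is 'changed'.
    """
    fd = np.abs(frame_a.astype(np.float32) - frame_b.astype(np.float32))
    p = win * win
    mean = uniform_filter(fd, win)                   # A_d(n): window sample mean
    sq_mean = uniform_filter(fd * fd, win)
    var = np.maximum(sq_mean - mean ** 2, 0.0) * p / (p - 1)  # unbiased variance
    t_stat = mean / (np.sqrt(var) / np.sqrt(p) + 1e-12)
    thr = t_dist.ppf(1.0 - alpha / 2.0, df=p - 1)    # two-sided critical value
    return np.where(np.abs(t_stat) >= thr, 255, 0).astype(np.uint8)

# Complete motion change region: AND of the two symmetric-frame masks, e.g.
#   m_bwd = t_test_change_mask(G_n, G_n_minus_k)
#   m_fwd = t_test_change_mask(G_n_plus_k, G_n)
#   M_n   = np.minimum(m_bwd, m_fwd)
```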
If the interior texture of the moving target is highly uniform, or the moving target is wholly or partially stationary or slow-moving for a certain time, the change detection method above alone cannot detect the complete motion region; the accurate boundary contour of the moving object then cannot be obtained during spatio-temporal fusion, and parts of the target are missing from the final extraction.
For this situation the temporal fixed-interval frame difference accumulation method is applied, which effectively solves the loss of video object boundaries. The method exploits the inter-frame correlation of the video sequence in the time domain and the continuity of the target motion, counting the number of times each pixel occurs within a certain period: within the given time interval, the frequently occurring parts form an effective template that appears throughout the whole motion interval. The continuity of the video object motion over the period is thus fully taken into account, i.e. the temporal information of a whole block of frames is fully exploited.
Let the length of the given time interval be l, containing L video frames whose change masks are $M_1, M_2, \ldots, M_{L-1}, M_L$. The effective template EM (effective mask) of this interval has the same size as each video frame:

$$EM(x,y)=\begin{cases}255, & T \ge \tau\\[2pt] 0, & T < \tau\end{cases}$$

where $T = n_i/L$ and $n_i$ is the number of times point (x, y) is marked as a motion point in the L frames; $M_1(x,y), M_2(x,y), \ldots, M_L(x,y)$ are the motion change masks obtained by change detection, and τ is a preset proportion threshold. τ is chosen according to the video sequence: for fast motion of large amplitude a larger value can be chosen; conversely, for slow motion of small amplitude a smaller value should be selected.
For any pixel (x, y), if EM(x, y) = 0 no frame difference accumulation is performed; if EM(x, y) = 255, temporal fixed-interval frame difference accumulation is carried out.
After the accumulation, the pixel values of the corresponding points of every video frame within the corresponding interval are set to 255, i.e.

$$F_1(x,y)=F_2(x,y)=\cdots=F_L(x,y)=255$$
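A minimal Python sketch of this accumulation step is given below (function and variable names are illustrative assumptions; masks are assumed to be 0/255 as above):

```python
import numpy as np

def effective_mask(change_masks, tau):
    """Effective template EM over one fixed time interval.

    change_masks: the L binary change masks M_1..M_L (255 = motion point).
    A pixel is kept when its occurrence ratio T = n_i / L reaches tau.
    """
    stack = np.stack([m > 0 for m in change_masks], axis=0)
    T = stack.mean(axis=0)                # per-pixel occurrence ratio n_i / L
    return np.where(T >= tau, 255, 0).astype(np.uint8)

# e.g. effective_mask(masks, tau=2/12), the proportion threshold used in the
# Akiyo experiment described below
```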
Fig. 2 shows the results of t-distribution change detection and fixed-interval frame difference accumulation on Akiyo. In the test, the chosen symmetric frame distance is k = 2 and the proportion threshold in the frame difference accumulation is τ = 2/12.
As can be seen from Fig. 2, the motion region in the change-detection frame difference image contains many holes and the moving target is very incomplete. After the frame difference accumulation, the result improves considerably: not only is the complete boundary contour obtained, but the holes inside the motion region are also well filled.
After temporal fixed-interval frame difference accumulation, although the result is much improved over the initial t-distribution change detection, holes still exist inside the video object. The concept of the memory mask MT (Memory Template) is therefore introduced: morphological processing and filling are applied to the mask produced by the frame difference accumulation to obtain a complete video object mask, so that a sequence of N frames yields N/L memory masks.
Opening and closing are important operations in morphology; they are formed by cascading dilation and erosion.
Grayscale dilation and grayscale erosion can be regarded as image filtering operations in which a structuring element dilates and erodes the signal; they are defined as follows:
$$(A \oplus B)(s,t) = \max\{A(s-x,\,t-y) + B(x,y) \mid (s-x,\,t-y) \in D_A;\ (x,y) \in D_B\}$$

$$(A \ominus B)(s,t) = \min\{A(s+x,\,t+y) + B(x,y) \mid (s+x,\,t+y) \in D_A;\ (x,y) \in D_B\}$$
where $D_A$ and $D_B$ are the domains of A and B, and B is the square structuring element used for the reconstruction operation.
Opening generally smooths the contour of an image, breaking narrow isthmuses and removing burrs on the contour. Closing likewise smooths contour lines but, in contrast to opening, it removes narrow gulfs and long thin gaps, eliminates small holes, and fills breaks in the contour line.
Morphological opening and closing are given by the two formulas below:

$$A \circ B = (A \ominus B) \oplus B, \qquad A \bullet B = (A \oplus B) \ominus B$$
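As an illustration, both operations applied to a memory mask can be sketched with OpenCV as follows (Python here purely for illustration, since the patent's implementation was in C; the toy mask and the 3 × 3 element size are assumptions):

```python
import cv2
import numpy as np

# Toy stand-in for a memory mask MT produced by the accumulation step
mask = np.zeros((64, 64), np.uint8)
cv2.circle(mask, (32, 32), 20, 255, -1)       # filled object region
mask[30:34, 30:34] = 0                        # small interior hole

B = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))   # square element B
opened = cv2.morphologyEx(mask, cv2.MORPH_OPEN, B)      # A∘B = (A⊖B)⊕B
closed = cv2.morphologyEx(opened, cv2.MORPH_CLOSE, B)   # A•B = (A⊕B)⊖B
```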
The moving object template is filled from its boundary in the following steps (a sketch follows the list):
1. Horizontal fill: traverse the whole moving object template, find the first and the last boundary point in each row, and mark all pixels between the two as interior points of the moving object;
2. Vertical fill: traverse the whole moving object template, find the first and the last boundary point in each column, and mark all pixels between the two as interior points of the moving object;
3. Intersect the horizontal fill result with the vertical fill result to obtain the final filled moving object template of the video sequence.
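A Python sketch of this fill, under the assumption that the boundary points are simply the foreground pixels of the binary template (names are illustrative):

```python
import numpy as np

def fill_from_boundaries(template):
    """Fill a binary moving-object template between its boundary points.

    Horizontal fill and vertical fill are computed independently and then
    intersected, as in steps 1-3 above.
    """
    fg = template > 0
    horiz = np.zeros_like(fg)
    vert = np.zeros_like(fg)
    for i in range(fg.shape[0]):              # step 1: horizontal fill
        cols = np.flatnonzero(fg[i])
        if cols.size:
            horiz[i, cols[0]:cols[-1] + 1] = True
    for j in range(fg.shape[1]):              # step 2: vertical fill
        rows = np.flatnonzero(fg[:, j])
        if rows.size:
            vert[rows[0]:rows[-1] + 1, j] = True
    return np.where(horiz & vert, 255, 0).astype(np.uint8)  # step 3: intersection
```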
Since opening and closing operations increase the amount of computation, an MT with a fairly smooth contour needs no morphological processing; filling alone suffices. The mask contours in Fig. 2 are continuous and smooth, so they can be filled directly; the fill results are listed in Fig. 3.
Step 2: Apply the improved Kirsch edge detection operator, i.e. the Kirsch operator based on the discontinuity detection technique, to each frame of the original video; binarize the edge detection result to complete the spatial segmentation of the video object.
In edge detection, some important edge details become blurred and faint owing to interference or insufficient contrast. Edge lines obtained directly by Kirsch edge detection are prone to breakpoints, and the result is unsatisfactory. Here the threshold is adjusted through a continuity check on the image edges, so that connected image edges are obtained. At a discontinuity of an edge, the pixel values generally differ considerably; the differences around a target point are therefore computed in 6 directions with 4 × 4 direction templates, and when the maximum difference exceeds a certain threshold the point is regarded as a discontinuity point, so the discontinuities of the image edge are detected. In this way noise is suppressed and edge continuity is guaranteed while low-intensity edge details are well protected, with satisfactory results. The threshold determines the accuracy of edge localization and the continuity of the edges. The result of edge detection of the original video sequence, after filling and binarization, is denoted $M_e$.
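The numerical values of the 4 × 4 six-direction templates are not given in the text, so only the base stage, the classic 8-direction Kirsch operator whose broken edges the continuity check then repairs, is sketched below in Python (the starting threshold heuristic is an assumption):

```python
import numpy as np
from scipy.ndimage import convolve

def rotate45(k):
    """Rotate the outer ring of a 3x3 kernel by one position (45 degrees)."""
    ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    vals = [k[i, j] for i, j in ring]
    out = k.copy()
    for (i, j), v in zip(ring, vals[-1:] + vals[:-1]):
        out[i, j] = v
    return out

def kirsch_edges(gray, thresh=None):
    """Binary edge map from the classic 8-direction Kirsch compass operator."""
    k = np.array([[5, 5, 5], [-3, 0, -3], [-3, -3, -3]], dtype=np.float32)
    responses = []
    for _ in range(8):                                # all 8 compass kernels
        responses.append(convolve(gray.astype(np.float32), k))
        k = rotate45(k)
    mag = np.max(np.stack(responses), axis=0)
    if thresh is None:
        # assumed starting threshold; the patent adjusts it via the
        # edge-continuity check until the detected edges are connected
        thresh = mag.mean() + mag.std()
    return np.where(mag >= thresh, 255, 0).astype(np.uint8)
```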
Step 3: In a parallel spatio-temporal fusion, AND the segmentation memory mask formed in step 1 with the binarized edge detection result of each frame of the video sequence obtained in step 2 to extract the accurate boundary contour of the moving object; selectively apply morphological opening/closing and filling according to the boundary information to complete the extraction of the video object.
The N/L temporal memory masks MT are each fused with the spatial binarized edge detection results $M_e$ to extract the binary moving object template:

$$B(x,y) = MT(x,y) \cap M_e(x,y)$$

If the corresponding B(x, y) is 255, the point is finally labeled as foreground; otherwise it is labeled as background. With this fusion, the occlusion regions produced in the memory template by the motion of the video object are clearly weeded out by the boundary constraint. Finally, the segmentation of the video object is completed in combination with the original video sequence $V_O(x, y)$:

$$VO(x,y) = \begin{cases} V_O(x,y), & B(x,y)=255 \\[2pt] 255, & B(x,y)=0 \end{cases}$$
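The fusion and cut-out can be sketched as follows (a minimal Python sketch; the white background value 255 follows the VO formula above, while the 0/255 mask polarity convention is an assumption):

```python
import numpy as np

def extract_vop(frame, mt, me):
    """Parallel spatio-temporal fusion: B = MT AND Me, then VOP cut-out."""
    b = (mt > 0) & (me > 0)            # binary moving object template B(x, y)
    vop = np.full_like(frame, 255)     # background forced to white (255)
    vop[b] = frame[b]                  # keep original pixels where B = 255
    return vop
```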
To demonstrate the effectiveness of the method, the standard test sequences "Akiyo", "Grandma", "Claire", "Miss-American", "Mother and daughter", and "Hall monitor" were selected as experimental subjects. All six test videos are in QCIF format, with a size of 176 × 144 pixels. The experimental results show that the method achieves good segmentation on video sequences of different types.
The method described herein was implemented in C on a Core™ 2 Duo E6300 (1.86 GHz, 2 GB of memory) using the Visual C++ 6.0 development environment.
To better reflect the correctness of the method, the accuracy evaluation proposed by Wollborn et al. in the MPEG-4 core experiments is adopted. The spatial accuracy evaluation defines the spatial accuracy SA (Spatial Accuracy) of the segmented object mask of each frame.
The segmentation accuracy of a method is then given by the following formula:

$$SA = \Omega(I_e, I_r) = 1 - \frac{|I_e - I_r|}{|I_r|}$$

where $I_e$ and $I_r$ denote the object templates of frame t obtained by the actual segmentation method and by the reference segmentation, respectively. Spatial accuracy reflects the shape similarity between each frame's segmentation result and the reference segmentation template: the larger SA, the more accurate the segmentation; the smaller SA, the less accurate.
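Interpreting $|\cdot|$ as a pixel count over binary masks, which is consistent with the MPEG-4 core-experiment definition but is an assumption here, the measure can be computed as:

```python
import numpy as np

def spatial_accuracy(est_mask, ref_mask):
    """SA = 1 - |Ie XOR Ir| / |Ir| for binary masks (255 = object)."""
    e = est_mask > 0
    r = ref_mask > 0
    return 1.0 - np.logical_xor(e, r).sum() / max(int(r.sum()), 1)
```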
Tables 1 and 2 compare the spatial accuracy of the first 20 frames of the Grandma and Miss-American sequences segmented by the present method and by the reference method (Zhu Shiping, Ma Li, Hou Yangshuan. Video object segmentation algorithm based on temporal fixed-interval memory compensation [J]. Journal of Optoelectronics·Laser, 2010, 21(8): 1241-1246). The comparison shows that the spatial accuracy of the present method is decidedly better than that of the reference method.
Table 1. Spatial accuracy of the first 20 frames of Grandma obtained with the present method and with the reference method
Table 2. Spatial accuracy of the first 20 frames of Miss-American obtained with the present method and with the reference method

Claims (1)

1. A video object segmentation method based on change detection and frame difference accumulation, characterized in that: the temporal segmentation detects inter-frame changes with a t significance test, requiring no threshold set from tedious experimental data, since the optimal threshold is looked up in the t-distribution table, and requiring no knowledge of the noise variance of the video, so that the estimation of noise parameters is avoided; in the frame difference accumulation stage, the concepts of the effective template and the memory mask, together with their use and construction, are proposed; the spatial segmentation obtains fine connected edges with the improved Kirsch edge detection operator, i.e. the Kirsch operator based on the discontinuity detection technique; the specific steps of the video object segmentation method are as follows:
Step 1: Smooth each frame of the video sequence with a Gaussian filter, and detect the inter-frame changes of symmetric frames separated by k frames with a t significance test to obtain the initial motion change region of each frame; AND the detected initial motion change regions to obtain the complete motion change region; then carry out temporal fixed-interval frame difference accumulation and fill the result to form the memory mask, completing the temporal segmentation of the video object; the specific steps are as follows:
(1) After the video sequence is converted to grayscale, let the n-th frame be $F_n(x, y)$, and let $G_n(x, y)$ denote the result of Gaussian smoothing;
(2) The noise of each frame of the video sequence is denoted $N_n(x, y)$, with variance $\sigma_n^2$; the n-th grayscale frame $G_n(x, y)$ of the video sequence can therefore be written as:

$$G_n(x,y) = \overline{G}_n(x,y) + N_n(x,y)$$

where $\overline{G}_n(x,y)$ is the true value of the video image; from this the difference image is obtained:

$$FD(x,y) = \overline{G}_n(x,y) - \overline{G}_{n-k}(x,y) + N_n(x,y) - N_{n-k}(x,y)$$

Let $D(x,y) = N_n(x,y) - N_{n-k}(x,y)$, where $N_n(x,y)$ and $N_{n-k}(x,y)$ are mutually independent random variables with identical probability density; $D(x,y)$ is then still an additive zero-mean Gaussian random variable, with variance $2\sigma_n^2$;
Since the noise at each pixel is independent, if all nonzero frame differences inside a window are caused by noise, the mean μ of those values should be zero, so a hypothesis test is carried out according to probability theory: assume position (x, y) belongs to the background, i.e. null hypothesis $H_0$: $H_0: \mu = 0$; since the noise variance is unknown, a t significance test is used, and the test statistic t is constructed from the pixels in the neighborhood window:

$$t = \frac{A_d(n)}{s/\sqrt{p}}$$

where $A_d(n)$ and s are the sample mean and sample standard deviation in the neighborhood window, and p is the total number of pixels in the neighborhood window;
According to significance-test theory, the threshold is determined by the given significance level α and the distribution obeyed by t:

$$|t| \ge t_{\alpha/2}(p-1)$$

The choice of the significance level α is related to the camera-noise intensity of the specific video sequence; with the significance level α set, if $|t| \ge t_{\alpha/2}(p-1)$ holds, the central pixel of the neighborhood window belongs to m(n);
The initial motion change region can be expressed as:

$$m_n(x,y)=\begin{cases}255, & |t| \ge t_{\alpha/2}(p-1)\\[2pt] 0, & \text{otherwise}\end{cases}$$

ANDing the change regions of the symmetric frame distances includes the parts of the video object whose motion is inconspicuous and yields the complete motion change region;
(3) Temporal fixed-interval frame difference accumulation: for a video object whose interior texture is highly uniform, or a video object that is stationary or slow-moving within a certain period, the change detection of steps (1) and (2) alone cannot detect the complete motion region, so the accurate boundary contour of the moving object cannot be obtained during spatio-temporal filtering, and parts of the target are lost in the final video object extraction;
Applying the temporal fixed-interval frame difference accumulation method effectively solves this partial target loss; the method exploits the inter-frame correlation of the video sequence in the time domain and the continuity of the target motion: it not only counts the number of times each pixel occurs within a period, so that within the given interval the frequently occurring parts form an effective template appearing throughout the whole motion interval, but also fully accounts for the continuity of the video object motion over the period, i.e. fully exploits the temporal information of a whole block of frames;
Let the given time interval have length l and contain L video frames whose change masks are $M_1, M_2, \ldots, M_{L-1}, M_L$; the effective template EM (effective mask) corresponding to this interval has the same size as each video frame:

$$EM(x,y)=\begin{cases}255, & T \ge \tau\\[2pt] 0, & T < \tau\end{cases}$$

where $T = n_i/L$, $n_i$ is the number of times point (x, y) is marked as a motion point in the L frames, $M_1(x,y), M_2(x,y), \ldots, M_L(x,y)$ are the motion change masks obtained by change detection, and τ is the preset proportion threshold; a different proportion threshold τ is chosen for each video sequence: for fast motion of large amplitude a larger value can be chosen, and conversely, for slow motion of small amplitude a smaller value should be selected;
For any pixel (x, y), if EM(x, y) = 0 no frame difference accumulation is performed; if EM(x, y) = 255, temporal fixed-interval frame difference accumulation is carried out;
After the accumulation, the pixel values of the corresponding points of every video frame within the corresponding interval are set to 255, i.e. $F_1(x,y)=F_2(x,y)=\cdots=F_L(x,y)=255$;
(4) After temporal fixed-interval frame difference accumulation, although the result is much improved over the initial t-distribution change detection, holes still exist inside the video object; the invention therefore proposes the concept of the memory mask MT (Memory Template): morphological processing and filling are applied to the mask produced by the frame difference accumulation to obtain the complete video object memory mask MT;
Since opening and closing operations increase the amount of computation, an MT with a fairly smooth contour needs no morphological processing; filling alone suffices;
Step 2: Apply the improved Kirsch edge detection operator, i.e. the Kirsch operator based on the discontinuity detection technique, to each frame of the original video; binarize the edge detection result to complete the spatial segmentation of the video object; the specific steps are as follows:
(1) Apply the conventional Kirsch edge detection operator to obtain the initial edge image of each frame of the video sequence; during edge detection, some important edge details become blurred and faint owing to interference or insufficient contrast;
(2) Edge lines obtained directly by Kirsch edge detection are prone to breakpoints, and the result is unsatisfactory; the method computes the differences around a target point in 6 directions with 4 × 4 direction templates, and when the maximum difference exceeds a certain threshold the target point is regarded as a discontinuity point, so the discontinuities of the image edge are detected; all connected edge information in the current frame is thus obtained, the residual noise in the memory mask is reduced, and low-intensity edge details are well protected while edge continuity is guaranteed; after binarization, each spatial binarized edge detection result $M_e$ is obtained;
Step 3: In a parallel spatio-temporal fusion, AND the memory mask formed in step 1 with the binarized edge detection result of each frame of the video sequence obtained in step 2 to extract the accurate boundary contour of the moving object; selectively apply morphological opening/closing and filling according to the boundary information to complete the extraction of the video object; the specific steps are as follows:
(1) The temporal memory masks MT are each fused with the spatial binarized edge detection results $M_e$ to extract the binary moving object template:

$$B(x,y) = MT(x,y) \cap M_e(x,y)$$

If the corresponding B(x, y) is 255, point (x, y) is finally labeled as foreground; otherwise it is labeled as background;
(2) With this fusion, the occlusion regions produced in the memory template by the motion of the video object are weeded out by the boundary constraint;
Finally, the segmentation of the video object is completed in combination with the original video sequence $V_O(x, y)$.
CN201210402443.4A 2012-12-28 2012-12-28 Video object segmentation method based on change detection and frame difference accumulation Active CN102970528B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210402443.4A CN102970528B (en) 2012-12-28 2012-12-28 Video object segmentation method based on change detection and frame difference accumulation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210402443.4A CN102970528B (en) 2012-12-28 2012-12-28 Video object segmentation method based on change detection and frame difference accumulation

Publications (2)

Publication Number Publication Date
CN102970528A CN102970528A (en) 2013-03-13
CN102970528B true CN102970528B (en) 2016-12-21

Family

ID=47800372

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210402443.4A Active CN102970528B (en) 2012-12-28 2012-12-28 Video object segmentation method based on change detection and frame difference accumulation

Country Status (1)

Country Link
CN (1) CN102970528B (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103218830B * 2013-04-07 2016-09-14 Beihang University Video object contour extraction method based on centroid tracking and improved GVF Snake
US9584814B2 (en) * 2014-05-15 2017-02-28 Intel Corporation Content adaptive background foreground segmentation for video coding
US10133927B2 (en) 2014-11-14 2018-11-20 Sony Corporation Method and system for processing video content
CN105046682B (en) * 2015-05-20 2018-04-03 王向恒 A kind of video frequency monitoring method based on local computing
US10372977B2 (en) * 2015-07-09 2019-08-06 Analog Devices Gloval Unlimited Company Video processing for human occupancy detection
CN106156747B * 2016-07-21 2019-06-28 Sichuan Normal University Method for extracting semantic objects from surveillance video based on behavior features
CN106530248A (en) * 2016-10-28 2017-03-22 中国南方电网有限责任公司 Method for intelligently detecting scene video noise of transformer station
CN108769803B (en) * 2018-06-29 2021-06-22 北京字节跳动网络技术有限公司 Recognition method, cutting method, system, equipment and medium for video with frame
CN109784164B (en) * 2018-12-12 2020-11-06 北京达佳互联信息技术有限公司 Foreground identification method and device, electronic equipment and storage medium
CN109871846A (en) * 2019-02-18 2019-06-11 北京爱数智慧科技有限公司 A kind of object boundary recognition methods, device and equipment
CN110378327B (en) * 2019-07-09 2021-05-18 浙江大学 Target detection device and method with auxiliary significant features added
CN110728746B (en) * 2019-09-23 2021-09-21 清华大学 Modeling method and system for dynamic texture
CN112017135B (en) * 2020-07-13 2021-09-21 香港理工大学深圳研究院 Method, system and equipment for spatial-temporal fusion of remote sensing image data
CN114071166B (en) * 2020-08-04 2023-03-03 四川大学 HEVC compressed video quality improvement method combined with QP detection
CN112669324B (en) * 2020-12-31 2022-09-09 中国科学技术大学 Rapid video target segmentation method based on time sequence feature aggregation and conditional convolution
CN113160273A (en) * 2021-03-25 2021-07-23 常州工学院 Intelligent monitoring video segmentation method based on multi-target tracking
CN113329227A (en) * 2021-05-27 2021-08-31 中国电信股份有限公司 Video coding method and device, electronic equipment and computer readable medium
CN116524026B (en) * 2023-05-08 2023-10-27 哈尔滨理工大学 Dynamic vision SLAM method based on frequency domain and semantics
CN116225972B (en) * 2023-05-09 2023-07-18 成都赛力斯科技有限公司 Picture difference comparison method, device and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101719979A (en) * 2009-11-27 2010-06-02 北京航空航天大学 Video object segmentation method based on time domain fixed-interval memory compensation
CN101854467A (en) * 2010-05-24 2010-10-06 北京航空航天大学 Method for adaptively detecting and eliminating shadow in video segmentation

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7085401B2 (en) * 2001-10-31 2006-08-01 Infowrap Systems Ltd. Automatic object extraction
US7865015B2 (en) * 2006-02-22 2011-01-04 Huper Laboratories Co. Ltd. Method for video object segmentation

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101719979A (en) * 2009-11-27 2010-06-02 北京航空航天大学 Video object segmentation method based on time domain fixed-interval memory compensation
CN101854467A (en) * 2010-05-24 2010-10-06 北京航空航天大学 Method for adaptively detecting and eliminating shadow in video segmentation

Also Published As

Publication number Publication date
CN102970528A (en) 2013-03-13


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20181219

Address after: 518000 N District, 5th Floor, No. 3011 Shahe West Road, Xili Street, Nanshan District, Shenzhen City, Guangdong Province

Patentee after: Shenzhen Xin Xiang Technology Co., Ltd.

Address before: 100191 Xueyuan Road, Haidian District, Beijing, No. 37

Patentee before: Beihang University

TR01 Transfer of patent right