CN103413330A - Method for reliably generating video abstraction in complex scene - Google Patents

Method for reliably generating video abstraction in complex scene

Info

Publication number
CN103413330A
CN103413330A CN2013103895057A CN201310389505A CN103413330A CN 103413330 A CN103413330 A CN 103413330A CN 2013103895057 A CN2013103895057 A CN 2013103895057A CN 201310389505 A CN201310389505 A CN 201310389505A CN 103413330 A CN103413330 A CN 103413330A
Authority
CN
China
Prior art keywords
target
video
moving target
frame
moving
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2013103895057A
Other languages
Chinese (zh)
Inventor
郝红卫
袁飞
唐矗
田澍
冯媛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science
Priority to CN2013103895057A
Publication of CN103413330A
Legal status: Pending

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a method for reliably generating a video summary (video abstraction) in a complex scene. The method comprises the following steps: performing background modeling on an acquired video to obtain the moving foreground of the current frame in the video sequence; using prediction information to select a suitable target detector and performing target classification and detection on the moving foreground of the current frame; using a multi-target tracking method to calculate the trajectory of each moving target in the current frame from the obtained target type and detection result, and calculating the prediction information for the next frame; processing the next frame in the same way until the end of the video sequence; and generating the video summary from the trajectories, types and other information of all moving targets. The method introduces a mutual feedback mechanism between foreground detection and tracking, which effectively reduces the miss rate for small, weak targets, the false detection rate for dense, occluded targets, and the probability that a trajectory is tracked wrongly, lost or left incomplete. The method can be applied to a variety of complex scenes.

Description

A reliable method for generating a video summary in a complex scene
Technical field
The present invention relates to the technical field of image processing, and in particular to a reliable method for generating a video summary in a complex scene.
Background
In modern society, video surveillance systems play an increasingly important role across industries. Public security, fire protection, traffic management, industrial production and similar departments have a particularly urgent need for video surveillance of public places, and video surveillance has become an important part of maintaining public order, strengthening social management and ensuring production safety. The number of surveillance cameras is growing quickly and produces a massive amount of surveillance video every day. Finding clues in this video consumes a great deal of manpower, material resources and time, which lowers the effective utilization of surveillance video. According to statistics from ReportLinker, in 2011 there were more than 165 million surveillance cameras worldwide, producing 1.4 trillion hours of surveillance data. If 20% of this data is important and has to be watched manually, more than 100 million workers would be needed (working 8 hours a day, 300 days a year). Therefore, in a video surveillance system, letting the user browse a whole video quickly and lock onto the object being searched for is of great significance for improving the utilization of massive surveillance video.
In the field of image processing, video summarization can be used to make browsing video more efficient: the content of interest to the user is extracted from the video and rearranged compactly to produce a short video. To extract the content of interest automatically, the simplest method is to extract key frames from the original video and assemble them into a video summary (reference: Kim, C., Hwang, J.N.: An integrated scheme for object-based video abstraction. In: Proceedings of the eighth ACM international conference on Multimedia. (2000) 303-311). However, key frames cannot describe a whole video completely, so important information in the video may be lost, and because video content is so varied, choosing suitable key frames is a difficult problem. Another method first analyzes the video content, extracts information about the moving targets in the original video, and then arranges the extracted motion information compactly to generate the video summary (reference: Pritch, Y., Rav-Acha, A., Peleg, S.: Nonchronological video synopsis and indexing. IEEE Trans. Pattern Anal. Mach. Intell. 30 (2008) 1971-1984). This method preserves the dynamic content of the video better. For this method, the key problem is how to extract accurately all the events that interest the user.
Surveillance video is shot in very complex scenes. Some scenes, such as highways, contain many fast-moving vehicles. Some, such as railway stations or crossroads, contain many pedestrians who frequently occlude one another. In others, a moving target occupies only a very small pixel area in the picture. This complexity makes accurate detection of moving targets very challenging. Current video summarization techniques do not handle moving-target detection in complex scenes well: the miss rate for moving targets is usually high, key events in the video cannot be extracted accurately, and the generated video summary therefore misses important information in the original video.
Summary of the invention
Video summarization is important for the effective use of massive surveillance video and has broad market prospects, but existing video summarization techniques cannot effectively handle situations that arise in complex scenes, such as small, weak targets and targets that occlude or adhere to one another. The object of the present invention is to enable fast browsing of surveillance video, to reduce the miss rate and false detection rate when extracting moving-target events in complex scenes, to reduce the probability of tracking errors and lost targets while trajectories are generated, and to remove users' misgivings about existing video summarization techniques.
The present invention proposes a reliable method for generating a video summary in a complex scene, comprising:
Step 1: perform background modeling on the acquired video to obtain the moving foreground of the current frame in the video sequence;
Step 2: using the prediction information calculated while tracking the previous frame, select a suitable target detector to perform target classification and detection on the moving foreground of the current frame, and obtain the type and detection result of each moving target;
Step 3: according to the obtained type and detection result of each moving target, use a multi-target tracking method to calculate the trajectory of each moving target in the current frame, calculate the prediction information for the next frame accordingly, and return to step 2 to process the next frame, until all video frames have been processed;
Step 4: generate the video summary from the trajectories and types of all the moving targets obtained.
In step 1, the moving foreground obtained for the current video frame is also post-processed, which specifically comprises:
Step 11: using a morphological structuring element, apply a morphological opening and a morphological closing to the foreground targets to obtain a foreground with smooth contours, eliminating noise regions of small area and shrinking noise regions of larger area;
Step 12: calculate the area of each foreground target; if the number of pixels in the foreground target's area is less than the threshold $T_1 = 5$, filter out the foreground target; otherwise, retain it.
The target detectors in step 2 are obtained in advance by offline training that combines histograms of oriented gradients with a linear support vector machine, and are used to detect the type, contour and position of a moving target.
The positive sample set used to train the target detectors comprises the following three classes of images that appear in surveillance video: 1) pedestrians and bicycles seen from the front; 2) bicycles seen from the side; 3) motor vehicles. Training yields three target detectors.
Step 2 specifically comprises:
Step 21: determine the target detector according to the prediction information fed back from tracking the previous frame;
Step 22: perform target detection with the determined target detector;
Step 23: if a moving target is detected, output the detection result for the moving target in the current frame for use in tracking the moving target;
Step 24: if no moving target is detected, output the moving-foreground information obtained in background modeling to the tracking process.
In step 21, the prediction information fed back from tracking the previous frame comprises the type, position, area, aspect ratio and number of moving targets in the current frame; it is used to select the type and scale of the target detector and to locate the position at which the detector searches.
Step 3 specifically comprises the following steps:
Step 31: according to the detection results for the moving targets in the current frame, calculate the similarity between the color histogram features of the moving targets in the current frame and in the previous frame;
Step 32: according to the moving-target detection results of the previous frame, use a Kalman filter to predict the position in the current frame of each moving target from the previous frame; then, using the detection results for the current frame, calculate the Euclidean distance between each predicted position and the actual position of each detected moving target;
Step 33: using the results of step 31 and step 32, apply the Hungarian algorithm to match the moving targets detected in the current frame against the active tracks, obtain the matching result, and update the trajectories of the moving targets accordingly.
In step 4, the trajectory, class and time of appearance of each moving target obtained are presented in a single video snapshot image, and the video summary is formed from such video snapshot images.
The present invention addresses surveillance video of complex scenes. Using a novel video content analysis technique, it detects and tracks the moving targets in the original video and extracts the moving-target events. For each moving-target event it then compiles the trajectory and related information and presents them to the user compactly in the form of an image. By viewing the pictures that record each moving-target event, the user achieves the purpose of watching the original video, which greatly shortens the time spent watching video. When detecting and tracking the moving targets in a video, the method takes full account of the complexity of the scene. The technical scheme adopted ensures reliable results and keeps the miss rate for moving-target events at an extremely low level, so the technique can be widely used in the practical work of many departments, for example in public security investigations.
Brief description of the drawings
Fig. 1 is a flowchart of the reliable method for generating a video summary in a complex scene according to the present invention.
Fig. 2 is a flowchart of the moving-target detection method of the present invention.
Fig. 3 is a flowchart of the target tracking method of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to specific embodiments and the accompanying drawings.
The present invention proposes a reliable method for generating a video summary in a complex scene. The method is as follows:
First, background modeling is performed on the image sequence of the original video to obtain moving foreground blocks, which are then post-processed. Second, a series of target detectors trained offline is used to detect and classify targets in the moving foreground blocks. Third, a multi-target tracking method based on the Hungarian algorithm is built. Finally, a trajectory is generated for each moving target, and the moving target and its trajectory are merged into an image; the present invention calls the merged image a "video snapshot". Notably, the method introduces a mutual feedback mechanism between foreground target detection and tracking. The feedback lets target detection and target tracking reinforce each other, improves the accuracy and speed of both, and effectively reduces the miss rate for small, weak targets, the false detection rate for dense, occluded targets, and the probability that a trajectory is tracked wrongly, lost or left incomplete. Original video that can be summarized includes, but is not limited to: live video streams captured by a video surveillance system, video files stored by a video surveillance system, ordinary multimedia video files, television programs, and films.
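The detection and tracking feedback loop described above can be sketched as a short skeleton. This is only an illustration of the control flow: the three callables and their signatures are hypothetical placeholders, not interfaces defined by the invention.

```python
def summarize(frames, extract_foreground, detect, track):
    """Run the detect/track feedback loop over a frame sequence.

    All three callables are hypothetical placeholders:
      extract_foreground(frame)   -> foreground blobs
      detect(blobs, prediction)   -> detections (prediction is None at first)
      track(detections, tracks)   -> (updated tracks, prediction for next frame)
    """
    tracks = []
    prediction = None  # no tracking feedback exists before the first frame
    for frame in frames:
        blobs = extract_foreground(frame)               # background modeling
        detections = detect(blobs, prediction)          # detection guided by feedback
        tracks, prediction = track(detections, tracks)  # tracking produces new feedback
    return tracks  # the trajectories would then be rendered into a video snapshot
```

The point of the sketch is that the prediction produced while tracking frame t-1 is the only extra input the detector receives for frame t.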
Fig. 1 shows the flowchart of the reliable method for generating a video summary in a complex scene according to the present invention. As shown in Fig. 1, the method is implemented in the following steps:
Step S101: capture the video data for which a video summary is to be generated.
Step S102: store the captured original video to form an original video database. The original video may be captured in real time by a surveillance camera or may be a playback of recorded surveillance video.
Step S103: perform background modeling on the frames of the original video in the original video database to obtain the moving foreground and the background of each video frame, and post-process the moving foreground.
In an embodiment of the present invention, background modeling may use any of a number of related algorithms, which are not enumerated here. The purpose of background modeling is to determine the background and the foreground of the video scene. A scene consists of background and foreground: the background is the region of the video that stays unchanged, or changes only slightly, over a long period, and the foreground is correspondingly the region that changes significantly. For example, in surveillance video of a crossroad, the cars driving on the road and the pedestrians walking along it are present in the scene only briefly, so they are treated as moving foreground, while the road, the traffic lights and the trees on both sides of the road are present in the scene for a long time and can be treated as background. By performing background modeling on the original video, the moving foreground in the video can be extracted.
However, because real surveillance scenes are complex, noise can be mixed into the foreground. For example, when wind blows through the trees beside the road, the leaves shake; because the shaking leaves change position, they are classified as moving foreground.
For this reason, in a preferred embodiment of the present invention, the moving foreground obtained is post-processed using morphological operations, specifically:
First, using a morphological structuring element, a morphological opening and a morphological closing are applied to the foreground targets. This yields a foreground with smooth contours, eliminates noise regions of small area and shrinks noise regions of larger area.
Then, the area of each foreground target is calculated. If the number of pixels in the foreground target's area is less than the threshold $T_1 = 5$, the foreground target is considered noise and is filtered out; otherwise, it is retained. This removes the noise in the moving foreground and smooths the edges of the foreground.
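The area filter in the second post-processing step can be sketched as follows. This is a minimal illustration that assumes the foreground is a binary mask stored as nested lists and that blobs are 8-connected; neither detail is stated in the text, and the morphological opening and closing are left out.

```python
from collections import deque

T1 = 5  # area threshold in pixels, as given in the text


def filter_small_blobs(mask, min_area=T1):
    """Zero out 8-connected foreground blobs with fewer than min_area pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    out = [[0] * w for _ in range(h)]
    for y0 in range(h):
        for x0 in range(w):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first search collects one connected blob.
            blob, queue = [], deque([(y0, x0)])
            seen[y0][x0] = True
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(blob) >= min_area:  # retain only blobs that reach the threshold
                for y, x in blob:
                    out[y][x] = 1
    return out
```

With this rule a 6-pixel blob survives while an isolated pixel, such as one produced by a shaking leaf, is removed.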
Step S104: use the target detectors to detect and classify targets in the moving foreground, obtaining the contour, type, position and other information of each moving target.
After background modeling has extracted the moving foreground and the foreground has been post-processed, the contour information of the moving targets in the video is available. Because real surveillance scenes are complex, however, the contour alone is far from enough. Background modeling often produces target adhesion, target fragmentation, target occlusion and overly dense foreground targets, and these phenomena tend to lower detection accuracy. For example, in surveillance video of a crowded railway station, pedestrians often occlude one another, and the moving foreground then shows the contour of a group; in surveillance video of a busy road, vehicles often occlude one another, and the moving foreground then shows the combined contour of several vehicles. Moreover, the contour of a moving target alone cannot provide its class (pedestrian/bicycle/motor vehicle). In addition, in some frames background modeling fragments the moving foreground. For example, when a pedestrian walks along a road, the moving foreground should be the pedestrian's contour, but after the moving foreground is extracted the pedestrian's upper body may be treated as one moving object and the lower body as another. In summary, target detection must be performed on the moving foreground.
In a preferred embodiment of the present invention, a series of target detectors trained offline is used to detect and classify targets in regions such as the moving foreground. Offline training of the target detectors has three steps:
First, collect the sample set. The sample set is divided into a positive sample set and a negative sample set. For example, the positive sample set may be divided into three classes: 1) pedestrians and bicycles seen from the front; 2) bicycles seen from the side; 3) motor vehicles. It consists of the pedestrians, bicycles and motor vehicles that appear in a series of frames of surveillance video. The surveillance video comes from on the order of a hundred videos captured by the video surveillance system of the Ke Shan branch office in Quzhou City, Zhejiang Province. The pedestrians and vehicles in the positive sample set are annotated manually to form a sample data set of surveillance video, which is published online for public research. The negative sample set consists of surveillance images from surveillance video that contain no pedestrians, bicycles or motor vehicles.
Second, extract sample features. In a preferred embodiment of the present invention, the histogram of oriented gradients (HOG) is used for feature extraction. HOG is a feature descriptor for target detection that counts occurrences of gradient orientations in local parts of an image. It divides the whole image into small connected regions called cells, generates a histogram of oriented gradients for each cell, and combines these histograms into the descriptor of the sample. HOG features must be extracted from both the positive and the negative samples in the sample set.
Finally, train on the samples to obtain a series of detectors. In a preferred embodiment of the present invention, a linear support vector machine (SVM) is trained on the training samples to obtain a series of detectors for target detection. For example, because the positive training set is divided into three classes, three target detectors are obtained. If a moving target contains a front or side image of a pedestrian, bicycle or motor vehicle, the multi-class detector formed by these three target detectors can detect the position, contour and other information of the moving target and provide its class (pedestrian/bicycle/motor vehicle).
The offline training steps above need to be carried out only once, in order to obtain the detectors. Once the detectors have been obtained, generating video summaries for different surveillance videos only requires calling the trained detectors.
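To make the feature-extraction step more concrete, the following sketch computes the orientation histogram of a single HOG cell. It is a simplified illustration: the choice of 9 unsigned orientation bins and central-difference gradients follows common HOG practice and is an assumption here, and block normalization and vote interpolation are omitted.

```python
import math


def cell_histogram(cell, bins=9):
    """Orientation histogram of one grayscale cell (at least 3x3 pixels),
    with each pixel voting by its gradient magnitude."""
    h, w = len(cell), len(cell[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # central differences
            gy = cell[y + 1][x] - cell[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            # Unsigned orientation in [0, 180) degrees.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // (180.0 / bins)) % bins] += mag
    return hist
```

A full descriptor would concatenate such histograms over all cells of the sample window before they are passed to the linear SVM.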
When a video summary is generated, this series of detectors performs target detection on the moving foreground of the original video, and every moving foreground region in every video frame must be examined. To examine a moving foreground region, one detector must be selected from the series.
To meet the needs of moving-target detection in complex scenes and to improve detector performance and speed, the present invention introduces into the detection process the prediction information fed back by the tracking process. This feedback predicts the type, area, aspect ratio, position and number of moving targets in the current frame. From the prediction information, the most suitable detector can be selected and parameters such as the detector scale and detection position can be determined. The predicted target type helps select the right detector type and avoids running all three detectors on the same moving foreground, which would multiply detection time. The predicted area and aspect ratio help select the detector scale accurately. The predicted position helps the detector locate the effective search range. Through this feedback from tracking to detection, the present invention substantially improves detector performance and greatly reduces detection time.
For a new target in the moving foreground, the tracking process has no prediction information because the target has only just appeared. In this case one class of detector is selected according to features of the moving target such as its aspect ratio and area, the new target is detected, and the detection result is compared with a threshold. If the result is higher than the threshold, the result of that class of detector is adopted; otherwise another class of detector is used. Once a detection result is obtained, the type, area, position and other information of the moving target are output. Each detector has its own threshold; the present invention sets it to the minimum detection score obtained on the positive sample set during offline training of that detector.
With the above mechanism, the target detectors can detect the moving foreground of each frame fairly accurately even in complex scenes.
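The per-detector threshold and the fallback for new targets can be sketched as below. The detector interface (a callable returning a score), the dictionary layout and the inclusive comparison are assumptions made for illustration only.

```python
def calibrate_threshold(positive_scores):
    """Threshold = lowest score the trained detector gave any positive sample."""
    return min(positive_scores)


def classify_new_target(patch, detectors, thresholds, order):
    """Try detectors in the given order and accept the first that clears its threshold.

    detectors:  dict name -> callable(patch) -> score (hypothetical interface)
    thresholds: dict name -> float, from calibrate_threshold
    order:      detector names, most plausible first, for example ranked by
                the aspect ratio and area of the foreground blob
    """
    for name in order:
        score = detectors[name](patch)
        # Inclusive comparison (an assumption) so that the weakest positive
        # training sample would itself still be accepted.
        if score >= thresholds[name]:
            return name, score
    return None, None
```

Ordering the candidates by aspect ratio and area means that, in the common case, only one detector has to run on a new target.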
Fig. 2 shows the flowchart of the moving-target detection method of the present invention. As shown in Fig. 2, moving-target detection specifically comprises:
Step S1021: obtain the target prediction information fed back from tracking the moving targets in the previous frame, including the type, position, area and number of moving targets.
Step S1022: determine the type, scale and detection position of the detector from the target prediction information. Scale selection applies mainly to pedestrian targets. According to the pedestrian height H in the target prediction information, pedestrian targets are divided into very small targets (H < 30 pixels), small targets (30 < H < 50 pixels), normal-size targets (50 < H < 90 pixels) and large targets (H > 90 pixels), and each size class corresponds to a pedestrian detector covering 7 search-window scales. To ensure that the search region fully contains the target to be detected, the originally predicted area is enlarged by a factor of 1.4.
Step S1023: perform target detection with the selected detector.
Step S1024: if step S1023 detects a target, output the detection results such as the position and area of the target.
Step S1025: if step S1023 does not detect a target, output as the detection result the moving foreground that corresponds to the area and position in the prediction information.
In particular, during the first 3 frames after a moving target appears, tracking of the previous frame cannot provide accurate prediction information. In this case the detector scale is selected directly from the height of the moving foreground, the 3 classes of detector are applied to the moving foreground target in turn, and the result of the highest-scoring detector is chosen, which determines the target type, area, aspect ratio and position. This information is then output.
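The detection flow of steps S1021 to S1025 can be sketched as follows. Several details are assumptions for illustration: heights of exactly 30, 50 or 90 pixels are assigned to the larger size class, the factor 1.4 is applied to the width and height of the box about its center, and the detector interface and dictionary keys are hypothetical.

```python
def pedestrian_size_class(height_px):
    """Map a predicted pedestrian height to one of the four size classes."""
    if height_px < 30:
        return "very_small"
    if height_px < 50:
        return "small"
    if height_px < 90:
        return "normal"
    return "large"


def expand_box(box, factor=1.4):
    """Scale an (x, y, w, h) box about its center."""
    x, y, w, h = box
    return (x - w * (factor - 1) / 2, y - h * (factor - 1) / 2, w * factor, h * factor)


def detect_target(blob_box, prediction, detectors, thresholds):
    """Return (type, box, source) for one foreground blob.

    prediction: dict with "type" and "box" fed back by tracking, or None
    detectors:  dict type -> callable(search_box) -> (score, box)
    """
    if prediction is None:
        # No reliable feedback yet: run every detector and keep the best score.
        results = {name: run(blob_box) for name, run in detectors.items()}
        best = max(results, key=lambda name: results[name][0])
        return best, results[best][1], "best_of_all"
    kind = prediction["type"]
    # S1022/S1023: only the predicted detector runs, on the enlarged region.
    score, box = detectors[kind](expand_box(prediction["box"]))
    if score >= thresholds[kind]:
        return kind, box, "detector"           # S1024
    return kind, prediction["box"], "prediction"  # S1025: fall back to the predicted region
```

The third element of the result only records which branch produced the output, which is convenient when checking how often the fallback in S1025 is taken.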
Step S105: use the multi-target tracking method to obtain the trajectories of the moving targets. An active track is a track that is currently being tracked and is shown in the real-time results. A historical track is a track that is not currently being tracked but may become an active track again. A dead track is a track that has ended completely and is no longer tracked.
The present invention obtains the trajectories of the moving targets with a multi-target tracking method based on the Hungarian algorithm, which is used to compute the optimal correspondence among multiple moving targets. The similarity between moving targets is described by their color information and position information. Color information is quantified with a color histogram, a statistic of the color distribution in an image that gives the proportion of the image occupied by each color; it is simple to compute and is invariant to scale, translation and rotation. Position information is computed with a Kalman filter. Kalman filtering is an optimal estimation method for linear systems under the minimum mean square error criterion; its basic idea is to minimize the variance of the estimation error while keeping the estimate unbiased, which improves tracking.
As shown in Fig. 3, obtaining the trajectories of the moving targets with the Hungarian-algorithm-based multi-target tracking method can be divided into the following steps:
Step S1051: calculate the 8 × 8 × 8 color histogram feature of every moving target detected in step S104, then calculate the similarity between the color histogram features of the moving targets in the current frame and those of the moving targets in the previous frame. Preferably, the present invention calculates the color histogram of each moving target in the RGB color space: each of the three RGB color components is quantized into 8 subintervals, each resulting subspace corresponds to one dimension (bin) of the histogram, and the number of pixels falling into the subspace of each bin is counted to obtain the color histogram. The similarity between the color histogram feature of the moving target of each active track in the previous frame and that of each moving target in the current frame is then calculated. Preferably, the present invention uses the Hellinger distance to measure the similarity of two histogram distributions:
$$d(h_1, h_2) = \sqrt{1 - \frac{1}{\sqrt{\bar{h}_1 \bar{h}_2 N^2}} \sum_{q=1}^{N} \sqrt{h_1(q)\, h_2(q)}}$$
where $h_1(q)$ and $h_2(q)$ are the two color histogram vectors, $N = 8 \times 8 \times 8$, and $\bar{h}_k = \frac{1}{N} \sum_{j=1}^{N} h_k(j)$.
The more similar the color histograms of two targets are, that is, the smaller the Hellinger distance between their color histogram vectors, the more likely it is that the two targets match; the matching probability follows a Gaussian distribution. For example, suppose that in surveillance video of a highway there is a white car W on the left and a black car B on the right, and the method must track both moving targets to obtain their trajectories. Suppose the color histograms calculated for W and B as detected in the previous frame are $h_1$ and $h_2$, and the color histograms calculated for W and B in the current frame are $h_3$ and $h_4$. Calculating the Hellinger distances between $h_1$ and $h_3$, $h_1$ and $h_4$, $h_2$ and $h_3$, and $h_2$ and $h_4$ shows that the distances between $h_1$ and $h_3$ and between $h_2$ and $h_4$ are much smaller than those between $h_1$ and $h_4$ and between $h_2$ and $h_3$. It follows that $h_1$ and $h_3$ are the color histograms of W in two consecutive frames and that $h_2$ and $h_4$ are the color histograms of B in two consecutive frames. This information helps match the targets that appear in two consecutive frames.
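The color feature of step S1051 can be illustrated with a short sketch that builds the 8 × 8 × 8 RGB histogram and evaluates the Hellinger distance, taken here in its usual form with square roots. The input format (a list of 8-bit RGB tuples) and the function names are assumptions, and both histograms are assumed to be non-empty.

```python
import math

BINS = 8  # bins per RGB channel, so N = 8 * 8 * 8 = 512


def color_histogram(pixels):
    """pixels: iterable of (r, g, b) tuples with 8-bit channel values."""
    hist = [0] * (BINS ** 3)
    for r, g, b in pixels:
        idx = ((r * BINS // 256) * BINS * BINS
               + (g * BINS // 256) * BINS
               + (b * BINS // 256))
        hist[idx] += 1
    return hist


def hellinger(h1, h2):
    """Hellinger distance between two histograms of equal length."""
    n = len(h1)
    mean1, mean2 = sum(h1) / n, sum(h2) / n
    overlap = sum(math.sqrt(a * b) for a, b in zip(h1, h2))
    inner = 1.0 - overlap / math.sqrt(mean1 * mean2 * n * n)
    return math.sqrt(max(inner, 0.0))  # clamp guards against rounding below zero
```

For the white car W and black car B of the example, a patch compared with itself gives a distance close to 0, while a white patch compared with a black patch gives a distance of 1.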
Step S1052: use a Kalman filter to predict the positions of the moving targets from the active-track information of the previous frame. From each active track in frame t-1, the Kalman filter predicts the position at which the moving target will appear in frame t. Step S104 gives the detection results for the moving targets in frame t, that is, their exact positions in frame t. In this step, the Euclidean distance is calculated between each predicted position in frame t and each detection result produced by the detection module for frame t. The smaller the Euclidean distance, the closer the predicted position is to the exact position and the more likely it is that the two targets match; the matching probability follows a Gaussian distribution. For example, for the left vehicle W and the right vehicle B in the surveillance picture mentioned above, suppose that in frame t-1 the Kalman filter predicts positions $l_1'$ and $l_2'$ in frame t for W and B, and that step S104 detects W and B in frame t at actual positions $l_1$ and $l_2$. Because a vehicle's position cannot change greatly between two consecutive frames, the Euclidean distances between $l_1'$ and $l_1$ and between $l_2'$ and $l_2$ are far smaller than those between $l_1'$ and $l_2$ and between $l_2'$ and $l_1$. This information helps match the targets that appear in two consecutive frames.
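The position cue of step S1052 can be sketched with a small Kalman filter. The text does not specify the motion model, so this example assumes a constant-velocity model applied independently to each image coordinate, with a time step of one frame; the noise values `q` and `r` and the class name are illustrative assumptions.

```python
import math


class AxisKalman:
    """Constant-velocity Kalman filter for one coordinate (state: position, velocity)."""

    def __init__(self, pos, q=1e-2, r=1.0):
        self.p, self.v = float(pos), 0.0
        self.P = [[1.0, 0.0], [0.0, 1.0]]  # state covariance
        self.q, self.r = q, r              # process and measurement noise

    def predict(self):
        (a, b), (c, d) = self.P
        self.p += self.v  # position advances by one frame of velocity
        # P <- F P F^T + Q with F = [[1, 1], [0, 1]] and Q = q * I
        self.P = [[a + b + c + d + self.q, b + d], [c + d, d + self.q]]
        return self.p

    def update(self, z):
        (a, b), (c, d) = self.P
        s = a + self.r            # innovation variance (only position is measured)
        k0, k1 = a / s, c / s     # Kalman gain
        resid = z - self.p
        self.p += k0 * resid
        self.v += k1 * resid
        self.P = [[(1 - k0) * a, (1 - k0) * b], [c - k1 * a, d - k1 * b]]


def euclidean(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])
```

In use, `predict()` is called once per frame for each active track to obtain the predicted position, `euclidean` compares that prediction with every detection, and `update()` is called with the position of the detection that is finally matched.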
Step S1053: perform multi-target matching with the Hungarian algorithm, using both color information and position information. The Hungarian algorithm is a classic algorithm for the bipartite graph maximum matching problem. For example, suppose there are m active tracks in frame t-1 and step S104 detects n moving targets in frame t. The Hellinger distance is used to compute the similarity between the color histogram features of the active tracks of frame t-1 and those of the moving targets of frame t, which gives an m × n matrix M_1. Computing the Euclidean distance between the positions predicted in frame t for the active tracks of frame t-1 and the actual positions of the moving targets in frame t gives an m × n matrix M_2. Multiplying the elements of M_1 and M_2 at corresponding positions gives an m × n matrix M, which is the input to the Hungarian algorithm. The algorithm returns the matching between the m active tracks of frame t-1 and the n moving targets of frame t. If the similarity of a matched pair is less than the threshold T_2 = 0.5, the pair is treated as unmatched; otherwise the match succeeds.
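The matching step can be sketched as follows, under two stated assumptions. First, each distance is turned into a score in (0, 1] with a Gaussian kernel, with made-up widths `sigma_c` and `sigma_p`; the text says the probabilities are Gaussian but gives no parameters. Second, for brevity the optimal assignment is found by exhaustive search over permutations rather than by the Hungarian algorithm itself; for small m and n both return an assignment with the same maximal total score, and a real implementation would use a proper assignment solver.

```python
import math
from itertools import permutations

def gaussian_score(d, sigma):
    """Map a distance to a score in (0, 1]; 1 means distance 0."""
    return math.exp(-(d * d) / (2.0 * sigma * sigma))

def combine(color_dist, pos_dist, sigma_c=0.3, sigma_p=20.0):
    """Element-wise product of color and position scores (matrix M)."""
    return [[gaussian_score(c, sigma_c) * gaussian_score(p, sigma_p)
             for c, p in zip(row_c, row_p)]
            for row_c, row_p in zip(color_dist, pos_dist)]

def best_assignment(score):
    """Exhaustive stand-in for the Hungarian algorithm (small inputs only)."""
    m, n = len(score), len(score[0])
    if m <= n:
        cols = max(permutations(range(n), m),
                   key=lambda cs: sum(score[i][c] for i, c in enumerate(cs)))
        return list(enumerate(cols))
    rows = max(permutations(range(m), n),
               key=lambda rs: sum(score[r][j] for j, r in enumerate(rs)))
    return sorted((r, j) for j, r in enumerate(rows))

def match(score, threshold=0.5):
    """Return matched (track, detection) pairs plus the unmatched indices."""
    matched = [(i, j) for i, j in best_assignment(score)
               if score[i][j] >= threshold]          # T_2 = 0.5 in the text
    used_t = {i for i, _ in matched}
    used_d = {j for _, j in matched}
    lost = [i for i in range(len(score)) if i not in used_t]
    new = [j for j in range(len(score[0])) if j not in used_d]
    return matched, lost, new

# Toy data: 2 active tracks, 3 detections (all distances are made up).
color_dist = [[0.05, 0.80, 0.70], [0.75, 0.10, 0.60]]   # Hellinger distances
pos_dist = [[5.0, 300.0, 150.0], [310.0, 8.0, 140.0]]   # Euclidean distances
matched, lost_tracks, new_targets = match(combine(color_dist, pos_dist))
print(matched, lost_tracks, new_targets)
```

In this toy case both tracks find their detection and the third detection is left over as a candidate new target.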
Step S1054: generate the trajectories of the moving targets in the current frame from the matching result of the previous step, and at the same time predict information such as each target's position in the next frame.
If active track m_i of frame t-1 is successfully matched with moving target n_j of frame t, then m_i is taken to be the trajectory of target n_j over the first t-1 frames. Active track m_i is then updated with the detection result that step S104 produced for n_j in frame t. This completes the tracking of target n_j for frame t.
If a moving target in frame t is not matched to any active track of frame t-1, the target has no trajectory yet and is a new target. If an active track of frame t-1 is not matched to any moving target in frame t, the target has disappeared, and the active track is matched against the historical tracks: if a match is found, the active track and the historical track are merged into a new active track; otherwise the active track becomes a historical track.
After the active track of target n_j has been updated in frame t, the invention uses the Kalman filter to predict the position of n_j in frame t+1 and saves information such as the target's type, position, area and aspect ratio for use in target detection at frame t+1.
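The track bookkeeping of step S1054 can be sketched as below. The `Track` structure, the `update_tracks` function and the `same_target` predicate (which decides whether a lost active track and a historical track belong to the same target) are hypothetical names introduced for illustration; the text does not say how that comparison is carried out.

```python
from dataclasses import dataclass, field

@dataclass
class Track:
    track_id: int
    points: list = field(default_factory=list)   # (frame, x, y) tuples

def update_tracks(active, history, detections, matched, frame,
                  same_target, next_id):
    """Apply one frame of matching results to the active/historical tracks.

    matched: list of (active_track_index, detection_index) pairs.
    Returns the new active list, the history list and the next free id.
    """
    matched_tracks = {i for i, _ in matched}
    matched_dets = {j for _, j in matched}
    new_active = []

    # Matched pairs: extend the active track with the detection.
    for i, j in matched:
        x, y = detections[j]
        active[i].points.append((frame, x, y))
        new_active.append(active[i])

    # Unmatched active tracks: merge with a historical track or retire.
    for i, trk in enumerate(active):
        if i in matched_tracks:
            continue
        partner = next((h for h in history if same_target(trk, h)), None)
        if partner is not None:
            history.remove(partner)
            merged = Track(partner.track_id, sorted(partner.points + trk.points))
            new_active.append(merged)
        else:
            history.append(trk)

    # Unmatched detections: start a new track.
    for j, (x, y) in enumerate(detections):
        if j not in matched_dets:
            new_active.append(Track(next_id, [(frame, x, y)]))
            next_id += 1
    return new_active, history, next_id

# Toy frame: track 1 is matched, track 2 is lost, one new target appears.
active = [Track(1, [(0, 10, 10)]), Track(2, [(0, 50, 50)])]
active, history, next_id = update_tracks(
    active, [], [(12, 10), (200, 200)], [(0, 0)], frame=1,
    same_target=lambda a, b: False, next_id=3)
print([t.track_id for t in active], [t.track_id for t in history], next_id)
```

In this toy run track 1 continues, track 2 moves to the historical list, and the unmatched detection opens track 3.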
Step S106: generate the video summary.
The steps above yield information such as the trajectory and type of each moving target. In this method, when a historical track still cannot be matched with any moving foreground after N frames of matching attempts, the historical track is considered terminated; N = 50 in this algorithm. After a historical track terminates, it is merged with an image of the moving target to generate a single image, called a "video snapshot", which is annotated with the category of the moving target and the time span during which it appears. For example, if a fleeing suspect vehicle appears in a road surveillance video, its trajectory from entering the picture to leaving it can be marked, with an appearance time from 28.92 s to 41.54 s. The start of the trajectory is drawn in a lighter color and the end in a darker color, with the color of the trajectory points in between changing gradually from light to dark. The image of the vehicle at the midpoint of the trajectory is then merged with the trajectory, and the vehicle's category "motor vehicle" and appearance time "28.92s-41.54s" are marked beside the moving target. After the whole video has been processed, a series of "video snapshots" has been generated; together they make up the video summary, which the user can browse to understand the video content quickly.
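The snapshot annotations can be sketched as follows. The RGB endpoint colors, the linear interpolation, and the function names are assumptions for illustration; the text only requires a light-to-dark gradient, a category label, a time span, and termination after N = 50 unmatched frames.

```python
def gradient_colors(n_points, light=(255, 220, 220), dark=(128, 0, 0)):
    """One RGB color per trajectory point, fading from light (start) to dark (end)."""
    if n_points == 1:
        return [light]
    colors = []
    for k in range(n_points):
        t = k / (n_points - 1)   # 0.0 at the start point, 1.0 at the end point
        colors.append(tuple(round(a + t * (b - a)) for a, b in zip(light, dark)))
    return colors

def snapshot_label(category, start_s, end_s):
    """Text drawn beside the target in the video snapshot."""
    return f"{category} {start_s:.2f}s-{end_s:.2f}s"

def is_terminated(unmatched_frames, n_limit=50):
    """A historical track ends once it has gone N frames without a match."""
    return unmatched_frames >= n_limit

def midpoint_index(n_points):
    """Index of the trajectory point whose target image is pasted into the snapshot."""
    return n_points // 2

colors = gradient_colors(5)
print(colors[0], colors[-1])
print(snapshot_label("motor vehicle", 28.92, 41.54))
```

Drawing the trajectory points with these colors on the background image, and pasting the target image taken at `midpoint_index`, would give one snapshot per terminated track.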
The invention introduces a mutual feedback mechanism between the foreground target detection method and the target tracking method, so that detection and tracking reinforce each other and the performance of both improves. On the one hand, the detection process uses the prediction information fed back from tracking: the predicted type, position, area, aspect ratio and number of moving foreground targets are used to select the detector type and scale, which improves detector performance, reduces the miss rate for weak targets and the false detection rate for dense, occluded targets, and shortens detection time. On the other hand, the tracking process uses the detection results, which reduces the probability of wrong tracks, lost tracks and incomplete trajectories. Compared with conventional video summarization methods, the invention can extract the trajectories of foreground moving targets (the events of interest to the user) in complex scenes accurately, quickly and completely, and can therefore generate reliable video summaries in complex scenes.
The specific embodiments described above further explain the purpose, technical solution and beneficial effects of the invention. It should be understood that the foregoing is merely a set of specific embodiments of the invention and is not intended to limit it; any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (8)

1. A method for generating a video summary in a complex scene, comprising:
Step 1: performing background modeling on the acquired video to obtain the moving foreground of the current frame in the video sequence;
Step 2: using the prediction information computed during tracking of the previous frame, selecting a suitable target detector to classify and detect targets in the moving foreground of the current frame, obtaining the type and detection result of each moving target;
Step 3: according to the obtained types and detection results of the moving targets, using a multi-target tracking method to compute the trajectories of the moving targets in the current frame and, from these, the prediction information for the next frame, then returning to Step 2 to process the next frame until all video frames have been processed;
Step 4: generating the video summary according to the trajectories and types of all moving targets obtained.
2. The method of claim 1, wherein Step 1 further comprises post-processing the moving foreground obtained for the current video frame, specifically comprising:
Step 11: applying morphological opening and closing to the foreground targets with a morphological structuring element, obtaining foreground with smooth contours, eliminating noise regions of small area and shrinking noise regions of larger area;
Step 12: computing the area of each foreground target; if the area contains fewer pixels than the threshold T_1 = 5, the foreground target is filtered out; otherwise it is retained.
3. The method of claim 1, wherein the target detector in Step 2 is obtained in advance by offline training using a combination of histograms of oriented gradients and a linear support vector machine, and is used to detect the type, contour and position of moving targets.
4. The method of claim 3, wherein the positive sample set used to train the target detectors comprises the following three classes of images appearing in surveillance video: 1) pedestrians and bicycles seen from the front; 2) bicycles seen from the side; 3) motor vehicles; training yields three target detectors.
5. The method of claim 1, wherein Step 2 specifically comprises:
Step 21: determining the target detector according to the prediction information fed back from tracking of the previous frame;
Step 22: performing target detection with the determined target detector;
Step 23: if a moving target is detected, outputting the detection result for the moving target in the current frame for use in tracking the moving target;
Step 24: if no moving target is detected, outputting the moving foreground information obtained from background modeling to the tracking process.
6. The method of claim 5, wherein in Step 21 the prediction information fed back from tracking of the previous frame, which comprises the type, position, area, aspect ratio and number of moving targets in the current frame, is used to select the type and scale of the target detector and to locate the detection position of the target detector.
7. The method of claim 1, wherein Step 3 specifically comprises the following steps:
Step 31: according to the detection results for the moving targets in the current frame, computing the similarity between the color histogram features of the moving targets in the current frame and in the previous frame;
Step 32: according to the moving target detection results of the previous frame, using a Kalman filter to predict the positions in the current frame of the moving targets of the previous frame and, together with the detection results for the current frame, computing the Euclidean distance between the predicted position of each moving target and the actual position of each moving target in the detection results;
Step 33: according to the results of Step 31 and Step 32, using the Hungarian algorithm to match the moving targets detected in the current frame with the active tracks, obtaining a matching result, and updating the trajectories of the moving targets according to the matching result.
8. The method of claim 1, wherein in Step 4 the obtained trajectory, category and duration of appearance of a moving target are presented in a single video snapshot image, and the video summary is formed from such video snapshot images.
CN2013103895057A 2013-08-30 2013-08-30 Method for reliably generating video abstraction in complex scene Pending CN103413330A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2013103895057A CN103413330A (en) 2013-08-30 2013-08-30 Method for reliably generating video abstraction in complex scene

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2013103895057A CN103413330A (en) 2013-08-30 2013-08-30 Method for reliably generating video abstraction in complex scene

Publications (1)

Publication Number Publication Date
CN103413330A true CN103413330A (en) 2013-11-27

Family

ID=49606335

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2013103895057A Pending CN103413330A (en) 2013-08-30 2013-08-30 Method for reliably generating video abstraction in complex scene

Country Status (1)

Country Link
CN (1) CN103413330A (en)

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104216875A (en) * 2014-09-26 2014-12-17 中国科学院自动化研究所 Automatic microblog text abstracting method based on unsupervised key bigram extraction
CN104244113A (en) * 2014-10-08 2014-12-24 中国科学院自动化研究所 Method for generating video abstract on basis of deep learning technology
CN104469547A (en) * 2014-12-10 2015-03-25 西安理工大学 Video abstraction generation method based on arborescence moving target trajectory
CN104754248A (en) * 2013-12-30 2015-07-01 浙江大华技术股份有限公司 Method and device for acquiring target snapshot
CN105141903A (en) * 2015-08-13 2015-12-09 中国科学院自动化研究所 Method for retrieving object in video based on color information
CN105139406A (en) * 2015-09-08 2015-12-09 哈尔滨工业大学 Tracking accuracy inversion method based on sequence images
CN105187801A (en) * 2015-09-17 2015-12-23 桂林远望智能通信科技有限公司 Condensed video generation system and method
CN105205171A (en) * 2015-10-14 2015-12-30 杭州中威电子股份有限公司 Image retrieval method based on color feature
CN105224535A (en) * 2014-05-29 2016-01-06 浙江航天长峰科技发展有限公司 Based on the concern target quick reference system of massive video
CN105809714A (en) * 2016-03-07 2016-07-27 广东顺德中山大学卡内基梅隆大学国际联合研究院 Track confidence coefficient based multi-object tracking method
CN106127235A (en) * 2016-06-17 2016-11-16 武汉烽火众智数字技术有限责任公司 A kind of vehicle query method and system based on target characteristic collision
CN106210674A (en) * 2016-09-22 2016-12-07 江苏理工学院 Personnel-oriented monitoring video data file rapid processing method and system
CN106664455A (en) * 2014-06-09 2017-05-10 派尔高公司 Smart video digest system and method
CN106686403A (en) * 2016-12-07 2017-05-17 腾讯科技(深圳)有限公司 Video preview generation method, device, server and system
CN107025458A (en) * 2016-01-29 2017-08-08 深圳中兴力维技术有限公司 People's car sorting technique and device
JP2017225174A (en) * 2014-09-04 2017-12-21 インテル コーポレイション Real time video summarization
CN107659754A (en) * 2017-07-18 2018-02-02 孙战里 Effective method for concentration of monitor video in the case of a kind of leaf disturbance
CN107707975A (en) * 2017-09-20 2018-02-16 天津大学 Video intelligent clipping method based on monitor supervision platform
CN109902543A (en) * 2017-12-11 2019-06-18 北京京东尚科信息技术有限公司 Target trajectory estimation method, device and Target Tracking System
CN110009659A (en) * 2019-04-12 2019-07-12 武汉大学 Personage's video clip extracting method based on multiple target motion tracking
CN110264740A (en) * 2019-06-24 2019-09-20 长沙理工大学 The inhuman real-time track detector of traffic machine and detection method based on video
CN110298867A (en) * 2019-06-21 2019-10-01 江西洪都航空工业集团有限责任公司 A kind of video target tracking method
CN110580428A (en) * 2018-06-08 2019-12-17 Oppo广东移动通信有限公司 image processing method, image processing device, computer-readable storage medium and electronic equipment
CN110879970A (en) * 2019-10-21 2020-03-13 武汉兴图新科电子股份有限公司 Video interest area face abstraction method and device based on deep learning and storage device thereof
CN111611703A (en) * 2020-05-15 2020-09-01 深圳星地孪生科技有限公司 Sand table deduction method, device, equipment and storage medium based on digital twins
CN111949003A (en) * 2020-07-17 2020-11-17 浙江浙能技术研究院有限公司 Closed-loop control loop performance evaluation method based on SFA and Hellinger distance
CN112465870A (en) * 2020-12-10 2021-03-09 济南和普威视光电技术有限公司 Thermal image alarm intrusion detection method and device under complex background
CN112507860A (en) * 2020-12-03 2021-03-16 上海眼控科技股份有限公司 Video annotation method, device, equipment and storage medium
CN112967351A (en) * 2021-03-05 2021-06-15 北京字跳网络技术有限公司 Image generation method and device, electronic equipment and storage medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103079117A (en) * 2012-12-30 2013-05-01 信帧电子技术(北京)有限公司 Video abstract generation method and video abstract generation device

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
NAVNEET DALAL等: "Histograms of Oriented Gradients for Human Detection", 《PROCEEDINGS OF THE 2005 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR’05)》 *
OLIVIER BARNICH等: "ViBe: A Universal Background Subtraction Algorithm for Video Sequences", 《IEEE TRANSACTIONS ON IMAGE PROCESSING》 *
YUAN FEI等: "A Framework for Quick and Accurate Access of Interesting Visual Events in Surveillance Videos", 《9TH INTERNATIONAL SYMPOSIUM, ISVC 2013》 *
陈磊等 (Chen Lei et al.): "基于运动轨迹的视频摘要技术" (Video summarization technology based on motion trajectories), 《工程实践及应用技术》 *

Cited By (42)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104754248A (en) * 2013-12-30 2015-07-01 浙江大华技术股份有限公司 Method and device for acquiring target snapshot
CN104754248B (en) * 2013-12-30 2018-05-01 浙江大华技术股份有限公司 A kind of method and device for obtaining target snapshot
CN105224535A (en) * 2014-05-29 2016-01-06 浙江航天长峰科技发展有限公司 Based on the concern target quick reference system of massive video
CN106664455A (en) * 2014-06-09 2017-05-10 派尔高公司 Smart video digest system and method
JP2017225174A (en) * 2014-09-04 2017-12-21 インテル コーポレイション Real time video summarization
JP2019208259A (en) * 2014-09-04 2019-12-05 インテル コーポレイション Real time video summarization
CN104216875A (en) * 2014-09-26 2014-12-17 中国科学院自动化研究所 Automatic microblog text abstracting method based on unsupervised key bigram extraction
CN104216875B (en) * 2014-09-26 2017-05-03 中国科学院自动化研究所 Automatic microblog text abstracting method based on unsupervised key bigram extraction
CN104244113A (en) * 2014-10-08 2014-12-24 中国科学院自动化研究所 Method for generating video abstract on basis of deep learning technology
CN104469547B (en) * 2014-12-10 2017-06-06 西安理工大学 A kind of video abstraction generating method based on tree-shaped movement objective orbit
CN104469547A (en) * 2014-12-10 2015-03-25 西安理工大学 Video abstraction generation method based on arborescence moving target trajectory
CN105141903A (en) * 2015-08-13 2015-12-09 中国科学院自动化研究所 Method for retrieving object in video based on color information
CN105141903B (en) * 2015-08-13 2018-06-19 中国科学院自动化研究所 A kind of method for carrying out target retrieval in video based on colouring information
CN105139406A (en) * 2015-09-08 2015-12-09 哈尔滨工业大学 Tracking accuracy inversion method based on sequence images
CN105139406B (en) * 2015-09-08 2018-02-23 哈尔滨工业大学 A kind of tracking accuracy inversion method based on sequence image
CN105187801A (en) * 2015-09-17 2015-12-23 桂林远望智能通信科技有限公司 Condensed video generation system and method
CN105205171A (en) * 2015-10-14 2015-12-30 杭州中威电子股份有限公司 Image retrieval method based on color feature
CN105205171B (en) * 2015-10-14 2018-09-21 杭州中威电子股份有限公司 Image search method based on color characteristic
CN107025458B (en) * 2016-01-29 2019-08-30 深圳力维智联技术有限公司 People's vehicle classification method and device
CN107025458A (en) * 2016-01-29 2017-08-08 深圳中兴力维技术有限公司 People's car sorting technique and device
CN105809714A (en) * 2016-03-07 2016-07-27 广东顺德中山大学卡内基梅隆大学国际联合研究院 Track confidence coefficient based multi-object tracking method
CN106127235A (en) * 2016-06-17 2016-11-16 武汉烽火众智数字技术有限责任公司 A kind of vehicle query method and system based on target characteristic collision
CN106127235B (en) * 2016-06-17 2020-05-08 武汉烽火众智数字技术有限责任公司 Vehicle query method and system based on target feature collision
CN106210674A (en) * 2016-09-22 2016-12-07 江苏理工学院 Personnel-oriented monitoring video data file rapid processing method and system
CN106686403A (en) * 2016-12-07 2017-05-17 腾讯科技(深圳)有限公司 Video preview generation method, device, server and system
CN107659754B (en) * 2017-07-18 2020-09-04 安徽大学 Effective concentration method for monitoring video under condition of tree leaf disturbance
CN107659754A (en) * 2017-07-18 2018-02-02 孙战里 Effective method for concentration of monitor video in the case of a kind of leaf disturbance
CN107707975A (en) * 2017-09-20 2018-02-16 天津大学 Video intelligent clipping method based on monitor supervision platform
CN109902543A (en) * 2017-12-11 2019-06-18 北京京东尚科信息技术有限公司 Target trajectory estimation method, device and Target Tracking System
CN110580428A (en) * 2018-06-08 2019-12-17 Oppo广东移动通信有限公司 image processing method, image processing device, computer-readable storage medium and electronic equipment
CN110009659A (en) * 2019-04-12 2019-07-12 武汉大学 Personage's video clip extracting method based on multiple target motion tracking
CN110009659B (en) * 2019-04-12 2021-04-16 武汉大学 Character video clip extraction method based on multi-target motion tracking
CN110298867A (en) * 2019-06-21 2019-10-01 江西洪都航空工业集团有限责任公司 A kind of video target tracking method
CN110264740A (en) * 2019-06-24 2019-09-20 长沙理工大学 The inhuman real-time track detector of traffic machine and detection method based on video
CN110879970A (en) * 2019-10-21 2020-03-13 武汉兴图新科电子股份有限公司 Video interest area face abstraction method and device based on deep learning and storage device thereof
CN111611703A (en) * 2020-05-15 2020-09-01 深圳星地孪生科技有限公司 Sand table deduction method, device, equipment and storage medium based on digital twins
CN111949003A (en) * 2020-07-17 2020-11-17 浙江浙能技术研究院有限公司 Closed-loop control loop performance evaluation method based on SFA and Hellinger distance
CN111949003B (en) * 2020-07-17 2021-09-03 浙江浙能技术研究院有限公司 Closed-loop control loop performance evaluation method based on SFA and Hellinger distance
CN112507860A (en) * 2020-12-03 2021-03-16 上海眼控科技股份有限公司 Video annotation method, device, equipment and storage medium
CN112465870A (en) * 2020-12-10 2021-03-09 济南和普威视光电技术有限公司 Thermal image alarm intrusion detection method and device under complex background
CN112465870B (en) * 2020-12-10 2023-07-14 济南和普威视光电技术有限公司 Thermal image alarm intrusion detection method and device under complex background
CN112967351A (en) * 2021-03-05 2021-06-15 北京字跳网络技术有限公司 Image generation method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20131127