Background technology
Pedestrian flow statistics count the number of people passing through a given passage within a given time period. Pedestrian flow statistics systems can be classified from several angles. By hardware platform, they can be divided into contact systems (such as revolving doors), sensor-based systems (such as laser beams), and vision-based systems (such as video cameras). By camera placement, they can be divided into systems based on vertical shooting and systems based on oblique shooting. By whether the camera is calibrated, they can be divided into calibrated-camera systems and uncalibrated-camera systems. Contact and sensor-based systems easily cause congestion at entrances and exits, and their counting accuracy is relatively poor. Systems based on vertical shooting have high counting accuracy and simple algorithms, but they capture only the partial information visible from directly above the pedestrian; for security monitoring, other pedestrian information (such as facial features and clothing features) needs to be retained, and current surveillance cameras are based on oblique shooting. Calibrated-camera systems need to determine the calibration coefficients of the camera itself and are therefore not universal. Taking all of these factors into account, most current pedestrian flow statistics systems use an uncalibrated camera and oblique shooting.
In recent years some research has been carried out in this area; see, for example, Chan et al. (A. B. Chan, Z. S. J. Liang, and N. Vasconcelos, "Privacy Preserving Crowd Monitoring: Counting People without People Models or Tracking," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Anchorage, June 2008). In general, the algorithms used fall into three broad classes.
Methods based on pedestrian detection and tracking. The foreground is segmented first; pedestrians, or parts of pedestrians (face, head, head and shoulders, upper body, etc.), are then detected by a matching search; finally, the detection results are tracked, and each valid track represents one pedestrian. The two core techniques of this class, pedestrian detection and moving-target tracking, are themselves two hard problems in video surveillance, and the algorithms are complex, which challenges the real-time performance of a counting system.
Methods based on feature-point trajectory clustering. A number of feature points that are easy to track, such as corner points, are extracted first; these feature points are then tracked frame by frame to form feature trajectories; finally, trajectories with similar characteristics are clustered, and each cluster represents one pedestrian. These methods require the counted moving targets to be homogeneous and depend on finding a good clustering method.
Methods based on low-level feature regression. The foreground is segmented first; low-level features of the foreground region (area, perimeter, edges, edge orientation, texture, etc.) are then extracted to form a feature vector of some dimension; finally, a regression function determines the correspondence between the feature vector and the number of people. These methods need no pedestrian detection or tracking, but they depend heavily on the foreground segmentation result and generally require a large number of annotated training samples from the specific scene, so they lack generality.
Summary of the invention
The purpose of the present invention is to address the difficulty that existing pedestrian flow statistics systems have in achieving good real-time performance while satisfying counting accuracy, by providing a method with low algorithmic complexity and good accuracy for pedestrian flow statistics in general scenes.
The steps of the method are as follows:
Step (1): obtain the first frame of the input video and set a virtual door at an arbitrary position in this image.
Step (2): use the Gaussian mixture background modeling method to segment the foreground region from the background, and post-process the resulting foreground region. The post-processing mainly comprises the following steps:
1) remove noise by morphological erosion and dilation;
2) analyze the connectivity of the foreground points and remove connected regions smaller than a certain threshold.
Step (3): after converting the original image to HSV space, remove the shadow portion of the foreground region.
Step (4): learning stage, comprising the following steps:
1) perform pedestrian detection using a method based on histograms of oriented gradients;
2) form a pedestrian model from the ordinate of the center of each detected pedestrian's bounding rectangle and the total number of foreground pixels inside that rectangle, and fit several pedestrian models to a straight line by least squares to form the heuristic information;
3) using the heuristic information, determine for each point on the virtual door the ratio of that point to the total number of foreground pixels of the pedestrian model at its position, and assign this ratio to the point as its weight.
Step (5): counting stage, comprising the following steps:
1) apply motion compensation to the points on the door using the sparse Lucas-Kanade (LK) optical flow algorithm, as follows:
a. determine the direction of the motion vector from the angle between the phase angle given by the sparse LK optical flow algorithm and the direction of the virtual door;
b. determine the magnitude of the motion vector from the amplitude given by the sparse LK optical flow algorithm and the sine of the motion vector direction.
2) count the foreground points on the door and obtain the information of each one, including its weight and the magnitude and direction of its motion vector. The pedestrian flow is counted as the weighted sum of the compensated foreground points.
The present invention is a variant of the low-level feature regression approach. Unlike traditional methods that rely on detection, tracking, or feature-point clustering and therefore have high algorithmic complexity, the complexity of the present invention is concentrated in pedestrian detection; once the learning stage is finished, the counting stage runs in real time. In tests on several videos from the public CAVIAR Test Case Scenarios data set, the counting accuracy was above 85%.
Embodiment
Embodiments of the invention are described in detail below with reference to the accompanying drawings.
Fig. 1 is a flow block diagram of the heuristic-information-based pedestrian flow statistics method of the present invention.
The method can handle video in various encoding formats, such as MPG and FLV, provided that the video can be converted to AVI format with XviD encoding. In this embodiment the input video is assumed to already be XviD-encoded AVI.
Setting the virtual door is an essential step before the counting system runs. The first frame of the input video is obtained, and a virtual door is set at an arbitrary position in this image. The virtual door is simply a manually specified straight line at an arbitrary position in the image. To draw it, the user only needs to specify two end points, and the system draws the line automatically with the Bresenham algorithm. The virtual door can be regarded as a manually defined region of interest, and all subsequent steps are built around it. Its length and direction are arbitrary, but for counting accuracy it should generally lie on the visible ground plane and be roughly perpendicular to the pedestrians' direction of motion. An example virtual door is shown in Fig. 2; the coordinates of its two end points are (70, 178) and (290, 178).
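As an illustration of how the points of the virtual door can be enumerated from its two end points, the following is a minimal sketch of the integer Bresenham line algorithm. The function name `bresenham_line` and the variable `door_points` are chosen here for illustration and do not come from the embodiment.

```python
def bresenham_line(x0, y0, x1, y1):
    """Return the integer pixel coordinates on the line from (x0, y0) to (x1, y1)."""
    points = []
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1   # step direction along X
    sy = 1 if y0 < y1 else -1   # step direction along Y
    err = dx + dy               # running error term
    while True:
        points.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:            # error allows a step in X
            err += dy
            x0 += sx
        if e2 <= dx:            # error allows a step in Y
            err += dx
            y0 += sy
    return points


# End points of the example virtual door in Fig. 2.
door_points = bresenham_line(70, 178, 290, 178)
```

For the horizontal door of Fig. 2 this yields one point per column between the two end points, which is the point set that the later weighting and counting steps iterate over.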
The quality of the foreground segmentation (that is, motion region detection) directly affects the final statistics, because the pedestrian count is obtained by accumulating the foreground points on the virtual door. This embodiment obtains the foreground with the Gaussian mixture background model designed by Stauffer et al. (C. Stauffer, W. E. L. Grimson, "Adaptive background mixture models for real-time tracking," Computer Vision and Pattern Recognition, IEEE Computer Society Conference on, Vol. 2 (06 August 1999), pp. 246-252). This method uses N Gaussian distributions together to describe the distribution of each background pixel. The key variables are set as follows in the embodiment: number of Gaussian distributions N = 4; background ratio T = 0.7; learning rate Alpha = max(0.001, 1/frameIndex), where frameIndex is the index of the frame; learning rate Beta = max(Alpha, 1/frameIndex); matching threshold Lambda = 2.5; initial weight InitWeight = 0.05; initial variance InitDelta = 320. To make the result more accurate, post-processing is required. First, morphological operations (dilation followed by erosion) remove noise points in the foreground and background. Then connectivity is analyzed (connectivity here refers to the 8-connected neighborhood of a pixel), and noise blobs whose area is below a certain threshold are removed. This threshold is initialized to a small value (the empirical value 200 in the embodiment); in the counting stage it is adaptively adjusted to a suitable value according to the size of the pedestrian models obtained in the learning stage. Fig. 3(b) shows an example of the motion region extracted from Fig. 3(a) using the above background modeling and post-processing.
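The connectivity-analysis step of the post-processing can be sketched as follows. This is only an illustrative implementation that assumes the foreground mask is a list of rows of 0/1 values; the function name `remove_small_regions` and the breadth-first labeling are choices made here and are not specified by the embodiment.

```python
from collections import deque


def remove_small_regions(mask, min_area=200):
    """Zero out 8-connected foreground regions with fewer than min_area pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    out = [row[:] for row in mask]
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # Collect one 8-connected region with a breadth-first search.
            region = [(y, x)]
            seen[y][x] = True
            queue = deque(region)
            while queue:
                cy, cx = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            region.append((ny, nx))
                            queue.append((ny, nx))
            # Regions below the area threshold are treated as noise blobs.
            if len(region) < min_area:
                for ry, rx in region:
                    out[ry][rx] = 0
    return out
```

The default `min_area=200` mirrors the initial empirical threshold; in the counting stage the caller would pass the adaptively adjusted value instead.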
Shadow removal takes place after moving-object detection and eliminates shadows that share the motion characteristics of the moving targets. When the foreground is obtained, the shadows cast by moving pedestrians under the influence of illumination are also classified as moving pixels, which inflates the final count. Shadow removal is therefore an indispensable step. This embodiment adopts the method of R. Cucchiara, C. Grana, M. Piccardi, and A. Prati, "Detecting moving objects, ghosts, and shadows in video streams," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003, Vol. 25(10): 1337-1342: the original image is converted to HSV space, and shadows are detected and removed. Whether a given foreground point is a shadow in HSV space is decided according to formula (1).
In formula (1), IC(x, y) and BC(x, y) are the values of the current image and of the background image, respectively, at point (x, y), and alpha, beta, tauS, and tauH are the thresholds of the individual conditions, set to 0.60, 0.90, 0.1, and 2.0, respectively, in the embodiment. If the SP value of a foreground point is 1, the point is a shadow pixel; otherwise it is a non-shadow pixel. After this test, post-processing consisting of connectivity analysis and dilation is also applied to reduce the false detection rate. An example image after shadow removal is shown in Fig. 3(c).
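Formula (1) is not reproduced in this text. The sketch below shows one plausible per-pixel form of such a test, assuming it follows the general shape of the HSV shadow test in the cited Cucchiara et al. paper: a bounded ratio of the V channels, a bounded saturation difference, and a bounded hue difference. The exact conditions, the channel scaling, and the function name `is_shadow` are assumptions made for illustration, not statements about the embodiment.

```python
def is_shadow(cur_hsv, bg_hsv, alpha=0.60, beta=0.90, tau_s=0.1, tau_h=2.0):
    """Return 1 if the pixel looks like shadow, else 0 (assumed form of SP).

    cur_hsv and bg_hsv are (H, S, V) tuples for the current image and the
    background image at the same point. Channel scaling is an assumption.
    """
    ch, cs, cv = cur_hsv
    bh, bs, bv = bg_hsv
    if bv == 0:
        return 0  # the brightness ratio is undefined against a black background
    ratio = cv / bv
    darker = alpha <= ratio <= beta          # darker than background, within bounds
    similar_sat = (cs - bs) <= tau_s         # saturation does not rise much
    similar_hue = abs(ch - bh) <= tau_h      # hue stays close (no wrap-around handling)
    return int(darker and similar_sat and similar_hue)
```

The default thresholds are the values quoted for the embodiment. A pixel whose brightness drops to 70% of the background with unchanged hue and saturation would be flagged, while a pixel as bright as the background would not.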
In the learning stage, pedestrians are first detected with a pedestrian detection method based on HOG, the histogram of oriented gradients (N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," Conference on Computer Vision and Pattern Recognition (CVPR), 2005). The HOG parameters in the embodiment are set to 3 × 3 cells per block, 6 × 6 pixels per cell, and 9 histogram channels, forming a 3780-dimensional vector. The center and size of each pedestrian in the detection result are described by a rectangle, as shown in the example of Fig. 4. The center coordinates are CenterX = RectX + RectWidth/2 and CenterY = RectY + RectHeight/2, and the size is RectWidth by RectHeight. Here CenterX and CenterY are the abscissa and ordinate of the center, RectX and RectY are the abscissa and ordinate of the starting position of the detected pedestrian's bounding rectangle, and RectWidth and RectHeight are the width and height of the rectangle. Counting the foreground points inside a rectangle in the current frame yields one pedestrian model, which has two parameters: the center ordinate CenterY and the foreground point count AreaCount. N pedestrian models are obtained in the same way (N = 6 in the embodiment), and every pair of them must satisfy |CenterY[i] - CenterY[j]| ≥ 5 pixels, with i ≠ j and i, j = 1, 2, ..., N. From these N models, the threshold of the connectivity analysis in the post-processing of moving-object detection can be adjusted adaptively to T = min(AreaCount[i])/2, i = 1, 2, ..., N.
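The construction of a pedestrian model from a detection rectangle, the pairwise spacing condition, and the adaptive threshold can be sketched as follows. The helper names (`pedestrian_model`, `well_separated`, `adaptive_threshold`) and the list-of-rows mask representation are assumptions made for illustration; the detector that produces the rectangles is not shown.

```python
def pedestrian_model(rect, fg_mask):
    """Build (CenterY, AreaCount) from a rectangle (RectX, RectY, RectWidth, RectHeight)."""
    rect_x, rect_y, rect_w, rect_h = rect
    center_y = rect_y + rect_h / 2
    # AreaCount: number of foreground pixels inside the rectangle.
    area_count = sum(
        1
        for row in range(rect_y, rect_y + rect_h)
        for col in range(rect_x, rect_x + rect_w)
        if fg_mask[row][col]
    )
    return center_y, area_count


def well_separated(models, min_gap=5):
    """Check that every pair of models differs in CenterY by at least min_gap pixels."""
    ys = [center_y for center_y, _ in models]
    return all(abs(a - b) >= min_gap for i, a in enumerate(ys) for b in ys[i + 1:])


def adaptive_threshold(models):
    """Connectivity-analysis threshold T = min(AreaCount) / 2."""
    return min(area for _, area in models) / 2
```

A caller would keep collecting detections until N models pass `well_separated`, then feed `adaptive_threshold` back into the post-processing.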
These N models are fitted to a straight line, which serves as the heuristic information used to assign a weight to each point on the virtual door. Because of perspective, objects closer to the camera appear larger and objects farther away appear smaller. The size of an object is, on the whole, linearly related to its Y coordinate in the image, so with CenterY as the abscissa and AreaCount as the ordinate, the N pedestrian models can be fitted to a straight line L by least squares, as shown in the example of Fig. 5. Using this linear relationship as heuristic information, each point on the door is assigned a weight as follows:
1) count the number of points on the door, PointCount;
2) record the position of each point on the door: PosX[i] and PosY[i], i = 1, 2, ..., PointCount;
3) from the straight line L and the position PosY of each point, determine the foreground pixel count AreaCount[i] that characterizes its current position, i = 1, 2, ..., PointCount;
4) the weight of each point can then be expressed as PointWeight[i] = 1/AreaCount[i], i = 1, 2, ..., PointCount.
In the counting stage, motion compensation is applied to the foreground points on the door. Pedestrians walk at different speeds, so they cross the virtual door at different rates. Each point on the door therefore needs motion compensation, so that when the virtual door is scanned at a given instant, pixels are neither missed because they move too fast nor counted repeatedly because they move too slowly. The sparse Lucas-Kanade (LK) optical flow algorithm, with a 5 × 5 window, is used to determine the magnitude and direction of the motion vectors. After the X-direction motion component XMotionMap and the Y-direction motion component YMotionMap are obtained from two adjacent frames, MagMap and AngleMap are computed; these are the amplitude map and the phase angle map given by the LK algorithm.
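The conversion from the two flow components to amplitude and phase angle, and the use of the sine of the angle relative to the door, can be sketched per point as follows. The function names are illustrative, and the sign convention of the crossing component is an assumption; the optical flow computation itself is not shown.

```python
import math


def flow_to_polar(x_motion, y_motion):
    """Amplitude and phase angle (radians) of one flow vector."""
    return math.hypot(x_motion, y_motion), math.atan2(y_motion, x_motion)


def crossing_component(mag, angle, door_angle):
    """Signed motion perpendicular to the door.

    The sine of the angle between the motion direction and the door direction
    scales the amplitude: motion along the door contributes nothing, motion
    perpendicular to it contributes fully. The sign indicates the crossing
    direction (which side counts as positive is an assumption).
    """
    return mag * math.sin(angle - door_angle)
```

For a horizontal door (door angle 0), a point moving straight down the image at 2 pixels per frame has a crossing component of 2, while a point sliding along the door has a component of 0.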
Pedestrians are counted by accumulating foreground points. The foreground points here are those on the virtual door that have been assigned weights and motion-compensated. The specific steps are as follows:
1) compute the angle GateAngle between the virtual door and the X axis;
2) scan the PointCount points on the door frame by frame;
3) if a point is a foreground point, obtain its weight PointWeight, amplitude PointMag, and phase angle PointAngle;
4) compute the pedestrian component by formula (2):
where N is the number of foreground points and Alpha is an adjustment factor, set to the empirical value 0.85;
5) accumulate the pedestrian components to count the number of pedestrians:
6) when the value P reaches an integer, that integer is the cumulative pedestrian count at the current time. To improve statistical accuracy, if the number of foreground points on the virtual door stays below a threshold (the empirical value 10) for 5 frames, the current value of P is rounded to the nearest integer.
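The accumulation and rounding rule of steps 5) and 6) can be sketched as below. Formula (2) is not reproduced in this text, so the per-frame pedestrian component is taken as an input rather than computed. The class name `FlowCounter`, the reading of "5 frames" as 5 consecutive frames, and rounding half up are assumptions made for illustration.

```python
import math


class FlowCounter:
    """Accumulates per-frame pedestrian components into a running count P."""

    def __init__(self, quiet_frames=5, quiet_threshold=10):
        self.total = 0.0                      # running value of P
        self.quiet_frames = quiet_frames      # frames the door must stay quiet
        self.quiet_threshold = quiet_threshold
        self._quiet_run = 0                   # current run of quiet frames

    def update(self, pedestrian_component, foreground_on_door):
        """Add one frame's component; round P once the door has been quiet long enough."""
        self.total += pedestrian_component
        if foreground_on_door < self.quiet_threshold:
            self._quiet_run += 1
        else:
            self._quiet_run = 0
        if self._quiet_run >= self.quiet_frames:
            # Round half up; Python's built-in round() rounds halves to even.
            self.total = float(math.floor(self.total + 0.5))
            self._quiet_run = 0
        return self.total


counter = FlowCounter()
for component in (0.3, 0.3, 0.3):             # a pedestrian crossing the door
    counter.update(component, foreground_on_door=50)
before_rounding = counter.total
for _ in range(5):                            # door empty for five frames
    counter.update(0.0, foreground_on_door=0)
after_rounding = counter.total
```

Rounding only while the door is nearly empty snaps accumulated fractional error back to a whole number between crossings, without disturbing a crossing that is still in progress.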
As the above embodiment shows, the heuristic pedestrian flow statistics method proposed by the present invention uses a simple algorithm, offers convenient human-computer interaction, and meets real-time requirements while maintaining good accuracy.