WO2015184764A1 - Pedestrian detection method and apparatus - Google Patents

Pedestrian detection method and apparatus

Info

Publication number
WO2015184764A1
Authority
WO
WIPO (PCT)
Prior art keywords
pedestrian
map
codeword
edge
importance
Prior art date
Application number
PCT/CN2014/094421
Other languages
English (en)
French (fr)
Inventor
邓硕
董振江
田玉敏
郑海红
Original Assignee
中兴通讯股份有限公司 (ZTE Corporation)
Priority date
Filing date
Publication date
Application filed by 中兴通讯股份有限公司 (ZTE Corporation)
Publication of WO2015184764A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10 Character recognition
    • G06V 30/19 Recognition using electronic means
    • G06V 30/192 Recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references
    • G06V 30/194 References adjustable by an adaptive method, e.g. learning

Definitions

  • the present invention relates to the field of communications, and in particular to a pedestrian detection method and apparatus.
  • pedestrian detection is widely used in fields such as intelligent human-computer interaction and video surveillance, and has attracted extensive research interest.
  • existing pedestrian detection techniques fall mainly into three categories: pedestrian detection based on a background model, pedestrian detection based on a classifier, and pedestrian detection based on template matching.
  • Pedestrian detection based on the background model is fast, but the accuracy is relatively low.
  • the classifier-based pedestrian detection method has made great progress in recent years.
  • its core is to select the features that best distinguish different targets and then train a classifier offline. This method usually gives good detection results, but in some cases it fails.
  • when a surveillance camera is installed high above the ground, a pedestrian occupies a relatively small area of the field of view, so few discriminative pedestrian features are available and classification performance degrades. In this case the contour becomes the characteristic feature of a pedestrian, so pedestrian detection based on template matching has a relatively large advantage.
  • David Schreiber's GPU-based, multi-cue pedestrian detection method works well: it combines the shape of pedestrians with their motion information, and determines the position and occlusion of pedestrians based on optimization theory. The method is suited to surveillance environments and achieves good detection results even in complicated scenes. It nevertheless has the following drawbacks: (1) the target edge is represented at the pixel level, which makes edge maps in different directions highly repetitive, both lowering matching accuracy and increasing time complexity; (2) the background model is not well exploited to restrict edge extraction, which would reduce time complexity; (3) the height model is built on the CCD imaging principle, and the modeling process depends on the user manually calibrating data and substituting it into the model, which is complicated and difficult to apply in practice; (4) the number of matching templates is large (30), giving high time complexity.
  • thus, pedestrian detection methods in the related art do not consider the problem that the pixel-level edge map of the target edge leads to inaccurate detection results, and no effective solution has yet been proposed.
  • the present invention provides a pedestrian detection method and apparatus to solve at least the problem in the related art that pedestrian detection does not consider the edge map of the target edge, causing inaccurate detection results.
  • a pedestrian detection method comprising: processing a monitored video sequence to obtain a foreground map of the video sequence; obtaining an edge map of the selected region according to the foreground map; processing the edge points of the edge map to obtain a contour map to be detected; and performing pedestrian detection on the contour map to be detected according to a pre-established pedestrian contour template.
  • the method before performing pedestrian detection on the to-be-detected contour map according to the pre-established pedestrian contour template, the method further comprises: acquiring a pedestrian data set, and establishing the pedestrian contour template according to the data set.
  • the establishing of the pedestrian contour template according to the data set comprises: randomly selecting N people from the standard pedestrian database INRIA and marking feature points on the selected training images, each feature point having coordinates (x, y);
  • the set of the v marked feature point coordinates constitutes a shape S;
  • the shape S is normalized;
  • principal component analysis (PCA) is performed on the normalized shape S to obtain the average shape S0 and the shape features Si (i = 1, 2, ..., n) corresponding to the first n eigenvalues;
  • the contour of the pedestrian is represented by the linear equation S = S0 + b1·S1 + ... + bn·Sn; after the M sets of contour parameters are obtained, the pedestrian contour template is obtained (a minimal sketch of this PCA step follows below).
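As an illustration of the PCA step just described, the following minimal sketch builds the average shape S0 and the shape features Si from a matrix of aligned landmark shapes. It is a reconstruction under stated assumptions, not the patented implementation: the function names build_contour_model and synthesize_shape and the SVD route to the eigenvectors are illustrative choices.

```python
import numpy as np

def build_contour_model(shapes, energy=0.95):
    """shapes: (N, 2*v) array; each row is the flattened, normalized
    (x, y) landmark coordinates of one marked pedestrian shape S."""
    s0 = shapes.mean(axis=0)                        # average shape S0
    _, sing, vt = np.linalg.svd(shapes - s0, full_matrices=False)
    var = sing ** 2                                 # proportional to the PCA eigenvalues
    n = int(np.searchsorted(np.cumsum(var) / var.sum(), energy)) + 1
    return s0, vt[:n]                               # S0 and shape features S1..Sn

def synthesize_shape(s0, modes, b):
    """The linear contour model S = S0 + sum_i bi * Si."""
    return s0 + b @ modes
```

Sampling M parameter vectors b and calling synthesize_shape would then yield the M sets of contour parameters that make up the template.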
  • the method further includes:
  • determining whether a pedestrian height model exists; if the determination result is yes, performing pedestrian detection on the contour map to be detected according to the pre-established pedestrian contour template and the pedestrian height model.
  • a first edge map is calculated for the selected region; the foreground map and the first edge map are processed to obtain the edge map of the selected region; and, based on a line-segment fitting technique, the edge map is represented as a collection of line segments.
  • determining the foreground map according to the video sequence, wherein the codewords at each position of the codebook form a set of four-dimensional vectors (a pixel value plus an importance parameter), comprises: initializing the following parameters: the upper bound Hmax of the importance parameter, the learning weight gama1 of the importance parameter, the forgetting weight gama0 of the importance parameter, the maximum length Nmax of the codeword list at each position of the codebook, the matching threshold d, the weight alpha for updating a pixel value, and the parameter T for deciding foreground and background; initializing the codebook with the first frame image, that is, converting the current frame image into the YCbCr color space, letting each codeword be the value of the current pixel with importance parameter Si equal to gama1, and recording the length N of the codeword list at each position of the codebook;
  • reading the next frame image, converting it into the YCbCr color space, and performing the following operation on each pixel in the frame: if the new pixel value is u, comparing u with all codewords v at the pixel position in the codebook using the distance
  • dis_u_v = abs(u(1) - v(1)) + abs(u(2) - v(2)) + abs(u(3) - v(3)); updating the codebook, which gradually converges to form a background model; and determining the foreground map according to the background model.
  • the method further comprises: if the distance dis_u_v is less than or equal to the threshold d (u matches the codeword), updating the value v of the codeword and its importance parameter s, while reducing the importance of the other codewords at the pixel position by gama0; if a codeword's importance parameter is less than 0 after the reduction, deleting that codeword; sorting all updated codewords by importance from high to low; and, if the ratio of the sum of the importances of the currently matched codeword and the codewords after it to the sum of the importances of all codewords at the pixel position is less than the threshold T, setting the pixel at that position to foreground, otherwise to background;
  • if the distance dis_u_v is greater than the threshold d (no codeword matches), the following operations are performed: the pixel is made foreground; the importance of all codewords at the pixel position is reduced, and any codeword whose importance parameter is less than 0 after the reduction is deleted; all updated codewords are sorted by importance from high to low; if the codeword list at the pixel position has not reached the maximum length, the current pixel value is inserted at the end of the list with weight gama1, otherwise the last codeword is deleted and the current pixel value is inserted at the end with weight gama1 (see the sketch below).
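The per-pixel codebook update described in the two items above can be pictured with the following sketch. It is an illustrative reconstruction in Python, not the patented implementation: the data layout (a list of {"v": value, "s": importance} codewords per pixel) and the helper name update_pixel are assumptions, while the parameter names gama1, gama0, Hmax, Nmax, d, alpha and T follow the text.

```python
import numpy as np

def update_pixel(codewords, u, p):
    """One codebook update for a single pixel position.

    codewords: list of {"v": YCbCr value (3,), "s": importance};
    u: new YCbCr pixel value (3,); p: parameter dict with keys
    gama1, gama0, Hmax, Nmax, d, alpha, T. Returns 255 (foreground) or 0.
    """
    u = np.asarray(u, float)
    dists = [np.abs(u - cw["v"]).sum() for cw in codewords]   # dis_u_v
    i = int(np.argmin(dists)) if dists else -1
    if i >= 0 and dists[i] <= p["d"]:                         # u matches codeword i
        cw = codewords[i]
        cw["v"] = p["alpha"] * u + (1 - p["alpha"]) * cw["v"] # update value v
        cw["s"] = min(cw["s"] + p["gama1"], p["Hmax"])        # learn, capped at Hmax
        for other in codewords:
            if other is not cw:
                other["s"] -= p["gama0"]                      # forget the others
        codewords[:] = sorted((c for c in codewords if c["s"] >= 0),
                              key=lambda c: c["s"], reverse=True)
        k = next(j for j, c in enumerate(codewords) if c is cw)
        tail = sum(c["s"] for c in codewords[k:])             # matched codeword and after
        total = sum(c["s"] for c in codewords)
        return 255 if tail / total < p["T"] else 0
    for c in codewords:                                       # no match: foreground
        c["s"] -= p["gama0"]
    codewords[:] = sorted((c for c in codewords if c["s"] >= 0),
                          key=lambda c: c["s"], reverse=True)
    if len(codewords) >= p["Nmax"]:
        codewords.pop()                                       # delete the last codeword
    codewords.append({"v": u, "s": p["gama1"]})               # insert u at the end
    return 255
```

Running this for every pixel of every frame gradually converges the codebook into the background model from which the binary foreground map is read off.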
  • calculating the first edge map comprises: converting the original color video frame into a grayscale image; calculating the horizontal and vertical gradients of the grayscale image with a Sobel operator to obtain a gradient map; and binarizing the gradient map to obtain the first edge map.
  • processing the foreground map and the first edge map to obtain the edge map of the selected area comprises: bitwise ANDing the first edge map and the foreground map, before Chamfer matching is performed, to obtain the edge map of the selected area, as in the sketch below.
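A short sketch of the two steps above, assuming OpenCV; the function name roi_edge_map and the binarization threshold are illustrative, and image1 stands for the binary foreground mask produced by the background model.

```python
import cv2
import numpy as np

def roi_edge_map(frame_bgr, image1, thresh=60):
    """frame_bgr: color video frame; image1: binary foreground mask (uint8)."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)          # horizontal gradient
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)          # vertical gradient
    mag = cv2.convertScaleAbs(cv2.magnitude(gx, gy))
    _, image2 = cv2.threshold(mag, thresh, 255, cv2.THRESH_BINARY)
    return cv2.bitwise_and(image2, image1)          # edges inside the selected region
```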
  • expressing the edge map as a set of line segments based on a line-segment fitting technique comprises: traversing the edge map and recording the coordinates and number of all edge points, denoted edgeMap; determining whether the number of edge points remaining in edgeMap is smaller than a first threshold, and if so, stopping the line fitting; if not, determining whether the number of lines already fitted is smaller than a second threshold, and if so, continuing to fit lines as follows: randomly select a point (x0, y0) and, centered on that point, take a first neighborhood whose radius is a first predetermined distance; record the coordinates of all points in the first neighborhood and use them to fit line one, obtaining the normal vector of line one; then take a second neighborhood centered on (x0, y0) whose radius is a second predetermined distance, the first predetermined distance being smaller than the second predetermined distance, and count the points in the second neighborhood that are collinear with line one; if that count is greater than a third threshold, a fitted line is obtained; all points on the fitted line are removed, all lines are found among the remaining points, and all fitted line segments are stored in an array Edge_line as the collection of line segments (a sketch of this loop follows).
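The fitting loop above resembles a RANSAC-style search. The sketch below is a hedged reconstruction: the radii r1 < r2 and the three thresholds mirror the first/second/third thresholds and the two predetermined distances of the text, but their values, the total-least-squares line fit, and the trial bound are illustrative assumptions.

```python
import numpy as np

def fit_segments(edge_pts, r1=3.0, r2=10.0, min_pts=20, min_inliers=15,
                 max_lines=100, tol=1.5, trials=5000, seed=0):
    """edge_pts: (M, 2) array of edge-point coordinates (the edgeMap)."""
    pts = np.asarray(edge_pts, float)
    segments = []
    rng = np.random.default_rng(seed)
    for _ in range(trials):                          # bounded number of random draws
        if len(pts) < min_pts or len(segments) >= max_lines:
            break                                    # first / second threshold reached
        p0 = pts[rng.integers(len(pts))]             # random point (x0, y0)
        d0 = np.linalg.norm(pts - p0, axis=1)
        near = pts[d0 <= r1]                         # first (small) neighborhood
        if len(near) < 2:
            continue
        c = near.mean(axis=0)
        _, _, vt = np.linalg.svd(near - c)           # total-least-squares fit of line one
        direction, normal = vt[0], vt[1]             # direction and normal vector
        cand = pts[d0 <= r2]                         # second (larger) neighborhood
        inliers = cand[np.abs((cand - c) @ normal) <= tol]
        if len(inliers) <= min_inliers:              # third threshold not reached
            continue
        t = (inliers - c) @ direction                # project inliers onto the line
        segments.append((c + t.min() * direction, c + t.max() * direction))
        pts = pts[np.abs((pts - c) @ normal) > tol]  # remove points on the fitted line
    return segments                                  # the Edge_line array
```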
  • the method further comprises: using the set of line segments, establishing the pedestrian height model based on a multi-scale FDCM method and a data regression method.
  • the method further comprises: processing the line segment array Edge_line with the multi-scale FDCM method and recording the ordinate of each detected pedestrian position and the pedestrian's height to obtain a sample set of pedestrian heights; and fitting pedestrian height against pedestrian position with a linear fitting method to obtain the pedestrian height model, as in the sketch below.
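Since the height model is just a straight-line regression of pedestrian height against the foot ordinate, a minimal sketch needs only a degree-1 polynomial fit; the function name fit_height_model is an assumption.

```python
import numpy as np

def fit_height_model(foot_y, heights):
    """foot_y: ordinates of detected pedestrian positions (the feet);
    heights: corresponding detected pedestrian heights, both taken from
    the multi-scale FDCM sample set."""
    a, b = np.polyfit(foot_y, heights, deg=1)   # height ~= a * y + b
    return lambda y: a * y + b

# e.g. model = fit_height_model(ys, hs); expected_height = model(420)
```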
  • performing pedestrian detection on the contour map to be detected according to the pre-established pedestrian contour template and the pedestrian height model comprises: forming all the fitted line segments obtained in Edge_line into line-segment maps according to the line-segment direction and calculating the distance map of each line-segment map; scaling the matching template according to the pedestrian height model; calculating the Cost value of the pedestrian contour template at all positions according to the distance maps and scanning all Cost values: if a Cost is less than the fourth threshold, a pedestrian is detected and a rectangular box is output; the detected rectangular boxes are sorted in ascending order of Cost and each detection window is scanned: if the detection window does not overlap any rectangular window with a smaller Cost value, or the overlap does not reach a certain ratio, it is considered a new target, otherwise a vote is added to the overlapping target; if a detection's Cost value is greater than 0 and its number of votes is greater than the fifth threshold, it is determined as the final target of pedestrian detection (see the sketch below).
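The matching and voting stage can be sketched as follows, under loud assumptions: a single distance transform stands in for FDCM (the per-direction line-segment channels are omitted), the exhaustive position scan is unoptimized, and all names and thresholds are illustrative.

```python
import cv2
import numpy as np

def iou(a, b):
    """Overlap ratio of two (x, y, w, h) boxes."""
    ix = max(0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    return inter / float(a[2] * a[3] + b[2] * b[3] - inter)

def chamfer_detect(edge_map, template_pts, cost_thresh):
    """edge_map: binary uint8 image; template_pts: (k, 2) integer (x, y)
    contour offsets of the (height-scaled) pedestrian template."""
    dist = cv2.distanceTransform(255 - edge_map, cv2.DIST_L2, 3)
    tw = int(template_pts[:, 0].max()) + 1
    th = int(template_pts[:, 1].max()) + 1
    boxes = []
    for y in range(edge_map.shape[0] - th):          # Cost at every position
        for x in range(edge_map.shape[1] - tw):
            cost = dist[template_pts[:, 1] + y, template_pts[:, 0] + x].mean()
            if cost < cost_thresh:                   # a pedestrian is detected
                boxes.append((cost, x, y, tw, th))
    return sorted(boxes)                             # ascending order of Cost

def vote_nms(boxes, overlap=0.5, min_votes=3):
    """boxes: output of chamfer_detect, already sorted by Cost."""
    kept, votes = [], []
    for b in boxes:
        hit = next((i for i, k in enumerate(kept)
                    if iou(k[1:], b[1:]) > overlap), None)
        if hit is None:
            kept.append(b); votes.append(1)          # a new target
        else:
            votes[hit] += 1                          # a vote for the overlapping target
    return [k for k, v in zip(kept, votes) if k[0] > 0 and v > min_votes]
```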
  • a pedestrian detecting apparatus comprising: a first processing module configured to process a monitored video sequence to obtain a foreground map of the video sequence; a second processing module configured to obtain an edge map of the selected area according to the foreground map; a third processing module configured to process the edge points of the edge map to obtain a contour map to be detected; and a first pedestrian detection module configured to perform pedestrian detection on the contour map to be detected according to a pre-established pedestrian contour template.
  • the apparatus comprises: an establishing module configured to acquire a pedestrian data set and to establish the pedestrian contour template according to the data set.
  • the establishing module comprises: a marking unit configured to randomly select N people from the standard pedestrian database INRIA and mark feature points on the training images, each feature point having coordinates (x, y); the set of the v marked feature point coordinates constitutes a shape S;
  • a normalization processing unit configured to normalize the shape S;
  • the contour of the pedestrian is represented by the linear equation S = S0 + b1·S1 + ... + bn·Sn; after the M sets of contour parameters are obtained, the pedestrian contour template is obtained.
  • the device further includes: a determining module configured to determine whether a pedestrian height model exists; and a second pedestrian detecting module configured to, if the determination result is yes, perform pedestrian detection on the contour map to be detected according to the pre-established pedestrian contour template and the pedestrian height model.
  • a first calculating unit configured to calculate a first edge map for the selected region
  • a representation unit configured to process the foreground map and the first edge map to obtain an edge map of the selected region, and to represent the edge map as a set of line segments based on a line-segment fitting technique.
  • the command unit comprises: an initialization subunit configured to initialize the following parameters: the upper bound Hmax of the importance parameter, the learning weight gama1 of the importance parameter, the forgetting weight gama0 of the importance parameter, the maximum length Nmax of the codeword list at each position of the codebook, the matching threshold d, the weight alpha for updating a pixel value, and the parameter T for deciding foreground and background; a recording subunit configured to initialize the codebook with the first frame image, converting the current frame image into the YCbCr color space so that each codeword is the value of the current pixel with importance parameter Si equal to gama1, and recording the length N of the codeword list at each position of the codebook; and an operation subunit configured to read the next frame image, convert it into the YCbCr color space, and perform the following operation on each pixel in the frame: if the new pixel value is u, compare u with all codewords v at the pixel position using the distance dis_u_v;
  • the update subunit is configured to update the codebook, which gradually converges to form a background model; the foreground map is determined according to the background model.
  • the apparatus further includes: a first pixel processing unit configured to: if the distance dis_u_v is less than or equal to the threshold d, update the codeword, that is, update the value v of the codeword and the importance parameter s, while reducing the importance of the other codewords at the pixel position by gama0; delete any codeword whose importance parameter is less than 0 after the reduction; sort all updated codewords by importance from high to low; and, if the ratio of the sum of the importances of the matched codeword and the codewords after it to the sum of the importances of all codewords at the pixel position is less than the threshold T, set the pixel at that position to foreground, otherwise to background;
  • the second pixel processing unit is configured to: if the distance dis_u_v is greater than the threshold d, indicating that u and v do not match, perform the following operations: make the pixel foreground; reduce the importance of all codewords at the pixel position, deleting any codeword whose importance parameter is less than 0 after the reduction; sort all updated codewords by importance from high to low; and, if the codeword list at the pixel position has not reached the maximum length, insert the current pixel value at the end of the list with weight gama1, otherwise delete the last codeword and insert the current pixel value at the end with weight gama1.
  • the first calculating unit comprises: a converting subunit configured to convert the original color video frame into a grayscale image; and a calculating subunit configured to calculate the horizontal direction and the vertical direction of the grayscale image by using a Sobel operator Gradient, a gradient map is obtained; a binarization subunit is arranged to binarize the gradient map to obtain the first edge map.
  • the representation unit comprises: a first processing subunit configured to bitwise AND the first edge map and the foreground map, before Chamfer matching is performed, to obtain the edge map of the selected area.
  • the representation unit comprises: a traversal subunit configured to traverse the edge map and record the coordinates and number of all edge points, denoted edgeMap; a stop subunit configured to determine whether the number of edge points remaining in edgeMap is smaller than the first threshold and, if so, to stop the line fitting; and a fitting subunit configured to, if the judgment result is negative, determine whether the number of lines already fitted is smaller than the second threshold and, if so, continue to fit lines as follows: randomly select a point (x0, y0) and, centered on that point, take a first neighborhood whose radius is the first predetermined distance; record the coordinates of all points in the first neighborhood, use them to fit line one, and obtain its normal vector; then take a second neighborhood centered on (x0, y0) whose radius is the second predetermined distance and count the points in the second neighborhood that are collinear with line one; determine whether the number of collinear points is greater than the third threshold and, if so, obtain a fitted line, wherein the first predetermined distance is smaller than the second predetermined distance.
  • the apparatus further comprises: an establishing unit configured to establish the pedestrian height model based on the multi-scale FDCM method and the data regression method using the set of line segments.
  • the apparatus further comprises: a recording unit configured to process the line segment array Edge_line with the multi-scale FDCM method and record the ordinate of each detected pedestrian position and the pedestrian's height to obtain a sample set of pedestrian heights; and
  • a pedestrian-height fitting unit configured to fit pedestrian height against pedestrian appearance position with a linear fitting method to obtain the pedestrian height model.
  • the second pedestrian detection module comprises: a second calculation unit configured to form all the fitted line segments in Edge_line into line-segment maps according to the line-segment direction and to calculate the distance map of each line-segment map; a scaling unit configured to scale the matching template according to the pedestrian height model; a scanning unit configured to calculate the Cost value of the pedestrian contour template at all positions according to the distance maps and scan all Cost values, and, if a Cost is less than the fourth threshold, detect a pedestrian and output a rectangular box; an adding unit configured to sort the detected rectangular boxes in ascending order of Cost and scan each detection window, considering a window a new target if it does not overlap any rectangular window with a smaller Cost value, or if the overlap does not reach a certain ratio, and otherwise adding a vote to the overlapping target; and a target determining unit configured to determine a detection as the final target of pedestrian detection if its Cost value is greater than 0 and its number of votes is greater than the fifth threshold.
  • a foreground map of the video sequence is obtained by processing the monitored video sequence; an edge map of the selected region is obtained according to the foreground map; the edge points of the edge map are processed to obtain the contour map to be detected; and pedestrian detection is performed on the contour map to be detected according to the pre-established pedestrian contour template. This solves the problem that pedestrian detection methods in the related art do not consider the edge map of the target edge and therefore give inaccurate detection results, achieving a more accurate detection result.
  • FIG. 1 is a flow chart of a pedestrian detection method according to an embodiment of the present invention.
  • FIG. 2 is a block diagram of a pedestrian detecting apparatus according to an embodiment of the present invention.
  • FIG. 3 is a block diagram 1 of a pedestrian detection apparatus in accordance with a preferred embodiment of the present invention.
  • FIG. 4 is a block diagram 2 of a pedestrian detecting apparatus in accordance with a preferred embodiment of the present invention.
  • FIG. 5 is a block diagram 3 of a pedestrian detection apparatus in accordance with a preferred embodiment of the present invention.
  • FIG. 6 is a block diagram 4 of a pedestrian detection apparatus in accordance with a preferred embodiment of the present invention.
  • FIG. 7 is a block diagram 5 of a pedestrian detection apparatus in accordance with a preferred embodiment of the present invention.
  • FIG. 8 is a block diagram 6 of a pedestrian detecting apparatus in accordance with a preferred embodiment of the present invention.
  • FIG. 9 is a block diagram 7 of a pedestrian detecting apparatus in accordance with a preferred embodiment of the present invention.
  • FIG. 10 is a schematic diagram of a pedestrian detection system for long-range surveillance video according to an embodiment of the present invention.
  • FIG. 11 is a schematic diagram of a generated pedestrian profile template set according to an embodiment of the present invention.
  • FIG. 12 is a first schematic diagram of pedestrian detection in accordance with a preferred embodiment of the present invention.
  • FIG. 13 is a second schematic diagram of pedestrian detection in accordance with a preferred embodiment of the present invention.
  • FIG. 14 is a third schematic diagram of pedestrian detection in accordance with a preferred embodiment of the present invention.
  • FIG. 1 is a flowchart of a pedestrian detection method according to an embodiment of the present invention. As shown in FIG. 1, the flow includes the following steps:
  • Step S102: processing the monitored video sequence to obtain a foreground map of the video sequence;
  • Step S104: obtaining an edge map of the selected area according to the foreground map;
  • Step S106: processing the edge points of the edge map to obtain a contour map to be detected;
  • Step S108: performing pedestrian detection on the contour map to be detected according to a pre-established pedestrian contour template.
  • through the above steps, the monitored video sequence is processed to obtain a foreground map of the video sequence, an edge map of the selected region is obtained according to the foreground map, and the edge points of the edge map are processed to obtain the contour map to be detected; pedestrian detection is then performed on the contour map to be detected according to the pre-established pedestrian contour template. This solves the problem that pedestrian detection methods in the related art do not take into account the edge map of the target edge and therefore give inaccurate detection results, achieving a more accurate detection result.
  • the pedestrian data set is acquired, and the pedestrian contour template is established according to the data set.
  • the establishing of the pedestrian contour template according to the data set comprises: randomly selecting N people from the standard pedestrian database INRIA and marking feature points, each with coordinates (x, y); the set of the v marked feature point coordinates constitutes a shape S; the shape S is normalized; principal component analysis (PCA) is applied to the normalized shape S to obtain the average shape S0 and the shape features Si (i = 1, 2, ..., n) corresponding to the first n eigenvalues.
  • after the edge points of the edge map are processed to obtain the contour map to be detected, it is determined whether a pedestrian height model exists; if the judgment result is yes, pedestrian detection is performed on the contour map to be detected according to the pre-established pedestrian contour template and the pedestrian height model.
  • a first edge map is calculated for the selected region; the foreground map and the first edge map are processed to obtain an edge map of the selected region, and the edge map is represented as a set of line segments based on a line-segment fitting technique.
  • determining the foreground map according to the video sequence, wherein the codewords at each position of the codebook form a set of four-dimensional vectors, includes: initializing the following parameters: the upper bound Hmax of the importance parameter, the learning weight gama1 of the importance parameter, the forgetting weight gama0 of the importance parameter, the maximum length Nmax of the codeword list at each position of the codebook, the matching threshold d, the weight alpha for updating a pixel value, and the parameter T for deciding foreground and background; initializing the codebook with the first frame image, that is, converting the current frame image into the YCbCr color space, letting each codeword be the value of the current pixel with importance parameter Si equal to gama1, and recording the length N of the codeword list at each position of the codebook;
  • reading the next frame image, converting it into the YCbCr color space, and performing the following operation on each pixel in the frame: if the new pixel value is u, comparing u with all codewords at the pixel position in the codebook using the distance dis_u_v defined above.
  • if the distance is less than or equal to the threshold d, the codeword is updated: the value v of the codeword and the importance parameter s are updated, and the importance of the other codewords at the pixel position is reduced by gama0; if a codeword's importance parameter is less than 0 after the reduction, the codeword is deleted; all updated codewords are sorted by importance from high to low; if the ratio of the sum of the importances of the currently matched codeword and the codewords after it to the sum of the importances of all codewords at the pixel position is less than the threshold T, the pixel at that position is set to foreground, otherwise to background;
  • if the distance is greater than the threshold d, the following operations are performed: the pixel is made foreground; the importance of all codewords at the pixel position is reduced, and any codeword whose importance parameter is less than 0 after the reduction is deleted; all updated codewords are sorted by importance from high to low; if the codeword list at the pixel position has not reached the maximum length, the current pixel value is inserted at the end of the list with weight gama1, otherwise the last codeword is deleted and the current pixel value is inserted at the end with weight gama1.
  • the calculating of the first edge map for the selected area comprises: converting the original color video frame into a grayscale image; calculating the horizontal and vertical gradients of the grayscale image with a Sobel operator to obtain a gradient map; and binarizing the gradient map to obtain the first edge map.
  • the first edge map and the foreground image are bitwise ANDed to obtain the edge map of the selected region before the Chamfer matching is performed.
  • representing the edge map as a set of line segments with the above line-segment fitting technique includes: traversing the edge map and recording the coordinates and number of all edge points, denoted edgeMap; determining whether the number of edge points remaining in edgeMap is smaller than the first threshold, and if so, stopping the line fitting; if not, determining whether the number of lines already fitted is smaller than the second threshold, and if so, continuing to fit lines as follows: randomly select a point (x0, y0) and, centered on that point, take a first neighborhood whose radius is the first predetermined distance; record the coordinates of all points in the first neighborhood, use them to fit line one, and obtain the normal vector of line one; then take a second neighborhood centered on (x0, y0) whose radius is the second predetermined distance and count the points in the second neighborhood that are collinear with line one; if the number of collinear points is greater than the third threshold, a fitted line is obtained; all points on the fitted line are removed, all lines are found among the remaining points, and all fitted line segments are stored in the array Edge_line as the collection of line segments.
  • the pedestrian height model is also established based on the multi-scale FDCM method and the data regression method by using the set of line segments.
  • the multi-scale FDCM method is used to process the line segment array Edge_line, and the ordinate of each detected pedestrian position and the pedestrian's height are recorded to obtain a sample set of pedestrian heights; a linear fitting method is then used to fit pedestrian height against pedestrian position to get the pedestrian height model.
  • the pedestrian detection of the contour map to be detected according to the pre-established pedestrian contour template and the pedestrian height model includes: forming all the fitted line segments in Edge_line into line-segment maps according to the line-segment direction and calculating the distance map of each line-segment map; scaling the matching template according to the pedestrian height model; calculating the Cost value of the pedestrian contour template at all positions according to the distance maps and scanning all Cost values: if a Cost is less than the fourth threshold, a pedestrian is detected and a rectangular box is output; the detected rectangular boxes are sorted in ascending order of Cost and each detection window is scanned: if a window does not overlap any rectangular window with a smaller Cost value, or the overlap does not reach a certain ratio, it is considered a new target, otherwise a vote is added to the overlapping target; if a detection's Cost value is greater than 0 and its number of votes is greater than the fifth threshold, it is determined as the final target of pedestrian detection.
  • a pedestrian detection apparatus is also provided for implementing the above embodiments and preferred implementations; what has already been described is not repeated.
  • the term “module” may be a combination of software and/or hardware that implements a predetermined function.
  • although the apparatus described in the following embodiments is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
  • FIG. 2 is a block diagram of a pedestrian detecting apparatus according to an embodiment of the present invention. As shown in FIG. 2, the apparatus includes a first processing module 22, a second processing module 24, a third processing module 26, and a first pedestrian detection module 28. Each module is briefly described below.
  • the first processing module 22 is configured to process the monitored video sequence to obtain a foreground image of the video sequence
  • the second processing module 24 is configured to obtain an edge map of the selected area according to the foreground image
  • the third processing module 26 is configured to process the edge points of the edge map to obtain a contour to be detected
  • the first pedestrian detection module 28 is configured to perform pedestrian detection on the to-be-detected contour map according to a pre-established pedestrian contour template.
  • FIG. 3 is a block diagram 1 of a pedestrian detecting apparatus according to a preferred embodiment of the present invention. As shown in FIG. 3, the apparatus includes:
  • the establishing module 32 is configured to acquire a pedestrian data set and to establish the pedestrian contour template according to the data set.
  • the establishing module 32 includes:
  • the marking unit 42 is configured to randomly select N people from the standard pedestrian database INRIA and mark feature points, each with coordinates (x, y); the set of the v marked feature point coordinates constitutes a shape S;
  • the normalization processing unit 44 is configured to normalize the shape S;
  • the contour of the pedestrian is represented by the linear equation S = S0 + b1·S1 + ... + bn·Sn; after the M sets of contour parameters are obtained, the pedestrian contour template is obtained.
  • FIG. 5 is a block diagram 3 of a pedestrian detecting apparatus according to a preferred embodiment of the present invention. As shown in FIG. 5, the apparatus further includes:
  • the determining module 52 is configured to determine whether there is a pedestrian height model
  • the second pedestrian detection module 54 is configured to perform pedestrian detection on the to-be-detected contour map according to the pre-established pedestrian contour template and the pedestrian height model if the determination result is YES.
  • FIG. 6 is a block diagram 4 of a pedestrian detection apparatus in accordance with a preferred embodiment of the present invention.
  • the second processing module 24 includes:
  • the first calculating unit 64 is configured to calculate a first edge map for the selected area
  • the representation unit 66 is configured to process the foreground map and the first edge map to obtain an edge map of the selected region, and represent the edge map as a set of line segments based on a line segment fitting technique.
  • FIG. 7 is a block diagram 5 of a pedestrian detection apparatus in accordance with a preferred embodiment of the present invention.
  • the command unit 62 includes:
  • the initialization sub-unit 72 is arranged to initialize the following parameters: the upper bound Hmax of the importance parameter, the learning weight gama1 of the importance parameter, the forgetting weight gama0 of the importance parameter, the maximum length Nmax of the codebook for each position codeword, the matching threshold d, updating the weight alpha of the pixel value, determining the foreground and background parameters T;
  • the recording subunit 74 is configured to initialize the codebook with the first frame image, converting the current frame image into the YCbCr color space so that each codeword is the value of the current pixel with importance parameter Si equal to gama1, and recording the length N of the codeword list at each position of the codebook;
  • the update subunit 78 is configured to update the codebook, which gradually converges to form a background model; the foreground map is determined according to the background model.
  • the device further comprises:
  • the first pixel processing unit is configured to: if the distance dis_u_v is less than or equal to the threshold d, update the codeword, that is, update the value v of the codeword and the importance parameter s, while reducing the importance of the other codewords at the pixel position by gama0; delete any codeword whose importance parameter is less than 0 after the reduction; sort all updated codewords by importance from high to low; and, if the ratio of the sum of the importances of the matched codeword and the codewords after it to the sum of the importances of all codewords at the pixel position is less than the threshold T, set the pixel at that position to foreground, otherwise to background;
  • the second pixel processing unit is configured to: if the distance dis_u_v is greater than the threshold d, indicating that u and v do not match, perform the following operations: make the pixel foreground; reduce the importance of all codewords at the pixel position, deleting any codeword whose importance parameter is less than 0 after the reduction; sort all updated codewords by importance from high to low; and, if the codeword list at the pixel position has not reached the maximum length, insert the current pixel value at the end of the list with weight gama1, otherwise delete the last codeword and insert the current pixel value at the end with weight gama1.
  • FIG. 8 is a block diagram 6 of a pedestrian detecting apparatus according to a preferred embodiment of the present invention.
  • the first calculating unit 64 includes:
  • a conversion subunit 82 configured to convert the original color video frame into a grayscale image
  • the calculating subunit 84 is configured to calculate a gradient of the horizontal direction and the vertical direction of the grayscale image by using a Sobel operator to obtain a gradient map;
  • the binarization sub-unit 86 is arranged to binarize the gradient map to obtain the first edge map.
  • the above representation unit 66 includes: a first processing subunit configured to bitwise AND the first edge map and the foreground map to obtain the edge map of the selected region before Chamfer matching is performed.
  • the representation unit 66 further includes: a traversing subunit configured to traverse the edge map and record the coordinates and number of all edge points, denoted edgeMap; a stop subunit configured to determine whether the number of edge points remaining in edgeMap is smaller than the first threshold and, if so, to stop the line fitting; a fitting subunit configured to, if the judgment result is negative, determine whether the number of lines already fitted is smaller than the second threshold and, if so, continue to fit lines as follows: randomly select a point (x0, y0) and, centered on that point, take a first neighborhood whose radius is the first predetermined distance; record the coordinates of all points in the first neighborhood, use them to fit line one, and obtain the normal vector of line one; then take a second neighborhood centered on (x0, y0) whose radius is the second predetermined distance and count the points in the second neighborhood that are collinear with line one; determine whether the number of collinear points is greater than the third threshold and, if so, obtain a fitted line, wherein the first predetermined distance is smaller than the second predetermined distance; and a second processing subunit configured to remove all points on the fitted line, find all lines among the remaining points, and store all fitted line segments in the array Edge_line as the collection of line segments.
  • the apparatus further comprises: an establishing unit configured to establish the pedestrian height model based on the multi-scale FDCM method and the data regression method using the set of line segments.
  • the apparatus further comprises: a recording unit configured to process the line segment array Edge_line with the multi-scale FDCM method and record the ordinate of each detected pedestrian position and the pedestrian's height to obtain a sample set of pedestrian heights; and a pedestrian-height fitting unit configured to fit pedestrian height against pedestrian position with a linear fitting method to obtain the pedestrian height model.
  • FIG. 9 is a block diagram 7 of a pedestrian detection apparatus in accordance with a preferred embodiment of the present invention.
  • the second pedestrian detection module 54 includes:
  • the second calculating unit 92 is configured to form all the fitted line segments in Edge_line into line-segment maps according to the line-segment direction and to calculate the distance map of each line-segment map;
  • the scaling unit 94 is configured to scale the matching template according to the pedestrian height model
  • the scanning unit 96 is configured to calculate a Cost value of the pedestrian contour template at all positions according to the distance map, scan all Cost values, and if Cost is less than the fourth threshold, detect a pedestrian and output a rectangular frame;
  • the adding unit 98 is configured to sort the detected rectangular boxes in ascending order of Cost and scan each detection window: if a window does not overlap any rectangular window with a smaller Cost value, or the overlap does not reach a certain ratio, it is considered a new target; otherwise a vote is added to the overlapping target;
  • the target determining unit 910 is configured to determine a detection as the final target of pedestrian detection if its Cost value is greater than 0 and its number of votes is greater than the fifth threshold.
  • FIG. 10 is a schematic diagram of a pedestrian detection system for a remote surveillance video according to an embodiment of the present invention. As shown in FIG. 10, the following mainly includes:
  • ASM: active shape model
  • Step S1001: a pedestrian database is collected, and coordinate data representing pedestrian shapes is gathered;
  • Step S1002: an active shape model (ASM) is established;
  • Step S1003: a pedestrian contour template is constructed.
  • FIG. 11 is a schematic diagram of a generated pedestrian contour template set according to an embodiment of the present invention.
  • the pedestrian contour model is constructed as shown in FIG. 11: 120 people are randomly selected from a standard pedestrian database (INRIA), and feature point marking is manually performed on the selected training images.
  • PCA: principal component analysis
  • the average shape S0 of the set and the shape features Si (i = 1, 2, ..., n) corresponding to the first n eigenvalues (n being the number of eigenvalues at which the retained energy reaches 95%) are obtained.
  • the shape of any pedestrian can then be represented by the linear equation S = S0 + b1·S1 + ... + bn·Sn.
  • the user is allowed to manually select the region of interest (ROI), and the entire image frame is the default ROI region;
  • if a codeword matches, it is updated: the value v of the codeword and the importance parameter s are updated, and the importance of the other codewords at the position is reduced by gama0; if a codeword's importance parameter is less than 0 after the reduction, the codeword is deleted; all updated codewords are sorted by importance from high to low; if the ratio of the sum of the importances of the currently matched codeword and the codewords after it to the sum of the importances of all codewords is less than the threshold T, the pixel at the position is set to foreground (255), otherwise to background (0);
  • if no codeword matches, the pixel is foreground; the importance of all codewords at the position is reduced, and any codeword whose importance parameter is less than 0 is deleted; all updated codewords are sorted by importance from high to low; if the codeword list has not reached the maximum length, the current pixel value is inserted at the end of the list with weight gama1, otherwise the codeword at the end is deleted and the current pixel value is inserted with weight gama1.
  • the codebook is continuously updated, gradually converges to form a background model, and the background model can be used to obtain a binary foreground image;
  • a Gaussian filter is used to smooth the foreground image, which is then quantized into a binary image; a 3x3 all-ones template is selected and an erosion operation is applied to obtain the preprocessed foreground map, referred to as image1 (see the sketch below).
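A sketch of this clean-up step, assuming OpenCV; the Gaussian kernel size and the binarization threshold are illustrative assumptions, while the 3x3 all-ones template and the erosion follow the text.

```python
import cv2
import numpy as np

def preprocess_foreground(fg):
    """fg: raw binary foreground image from the background model (uint8)."""
    smoothed = cv2.GaussianBlur(fg, (5, 5), 0)       # smooth away isolated noise
    _, binary = cv2.threshold(smoothed, 127, 255, cv2.THRESH_BINARY)
    kernel = np.ones((3, 3), np.uint8)               # the 3x3 all-ones template
    return cv2.erode(binary, kernel)                 # erosion; the result is image1
```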
  • the edge map image2 is calculated as follows: the original color video frame is first converted into a grayscale image; the horizontal and vertical gradients of the image are then calculated with the Sobel operator to obtain the gradient map; the gradient map is binarized to get the edge map image2.
  • image1 and image2 are bitwise ANDed to obtain the edge map image3 of the region of interest, and the edge map is represented as a set of line segments based on the line-segment fitting technique.
  • the pedestrian height model is established based on multi-scale fast directional Chamfer matching (FDCM) and a data regression method, using the line segment sets generated from part of the input video frames.
  • FDCM: fast directional Chamfer matching
  • the line segment array Edge_line is processed with the multi-scale FDCM method, and the ordinate of each detected pedestrian position (i.e., the position of the feet) and the pedestrian's height are recorded; when this process is completed, a sample set of pedestrian heights is obtained; a linear fitting method is then used to fit pedestrian height against pedestrian position, which gives the height model.
  • all the fitted line segments obtained in Edge_line are formed into line-segment maps according to the line-segment direction, and the distance map of each line-segment map is calculated; the matching template is scaled according to the height model; the Cost value of the template at all positions is calculated according to the distance maps, and all Cost values are scanned: if a Cost is less than a certain threshold, a pedestrian is detected and a rectangular box is output; all detected rectangular boxes are sorted according to the value of Cost, and each detection window is scanned: if it does not overlap any rectangular window with a smaller Cost value, or the overlap does not reach a certain ratio, it is considered a new target, otherwise a vote is added to the overlapping target; if a detection's Cost value is greater than 0 and its number of votes is greater than a certain threshold, it is a final target.
  • in the embodiment of the present invention the pedestrian template is trained offline and saved into a file to be called by the pedestrian detection system; as shown in FIG. 10, the flow includes the following:
  • Step S1004: a video sequence is input and a region of interest is selected;
  • Step S1005: background modeling, which includes the following: the first 1000 frames of the video are used to train the background model; the relevant parameters are first initialized; the codebook is initialized with the first frame image, that is, the current frame image is converted into the YCbCr color space, each codeword is set to the value of the current pixel with importance parameter Si equal to gama1, and the length N of the codeword list at each position of the codebook is recorded; the following operations are then repeated for each subsequent frame: the next frame image is read and converted into the YCbCr color space, and for each pixel u in the frame the distance dis_u_v between u and all codewords v at the position in the codebook is compared; if the distance dis_u_v is less than or equal to the threshold d, u and v match, and the codeword value v and the importance parameter s are updated while the importance of the other codewords at the position is reduced; if an importance parameter is less than 0, the codeword is deleted; all updated codewords are sorted by importance from high to low;
  • Step S1006: preprocessing 1 (Smooth, Threshold, Erode), which includes the following: using the background model, a binary foreground image can be obtained; because the foreground image contains many noise points, and noise must be removed without losing edge information in order to lay the foundation for template matching, a Gaussian filter is selected to smooth the foreground image, which is then quantized into a binary image; finally, a 3x3 all-ones template is selected for the erosion operation, and the preprocessed foreground map is obtained, recorded as image1.
  • Step S1007: preprocessing 2 (rgb2gray, Sobel), which includes the following: the original video frame is first converted into a grayscale image; the horizontal and vertical gradients of the image are then calculated with the Sobel operator to obtain the gradient map; the gradient map is binarized to get the edge map image2;
  • Step S1008: line segments are fitted. First, the edge map image2 and the foreground map image1 are bitwise ANDed to obtain the edge map of interest, image3; image3 is traversed and the coordinates and number of all edge points are recorded as edgeMap. If the number of edge points remaining in edgeMap is less than the threshold, no more lines are fitted; otherwise the next step follows: if the number of existing lines is less than a certain threshold, line fitting continues as follows: randomly select a point (x0, y0) and, centered on that point, select a small neighborhood; record the coordinates of all points in the neighborhood, use these points to fit a line, and find its normal vector; then, centered on (x0, y0), select a larger neighborhood and count the points in the neighborhood that are collinear with the line just fitted; the above operation is repeated, and if the number of collinear points is greater than a certain threshold, an optimal line is found. All points on the fitted line are removed and the operation is repeated on the remaining points until all lines are found.
  • the construction of the pedestrian height model includes steps S1009-S1013; the video frames that follow completion of background modeling are used for height modeling. It is first judged whether a height model exists. If it does not, the line segment array Edge_line is matched with the multi-scale FDCM method, and the ordinate of each detected pedestrian position (i.e., the position of the feet) and the pedestrian's height are recorded; when this process is completed, a sample set of heights is obtained; a linear fitting method is used to fit pedestrian height against pedestrian position, which is the height model.
  • Step S1009: it is determined whether the height model is empty; if the determination result is no, step S1010 is performed, and if the determination result is yes, step S1011 is performed;
  • Step S1010: it is determined whether the height model needs to be updated; if the determination result is no, step S1013 is performed, and if the determination result is yes, step S1011 is performed;
  • Step S1012: a height model is calculated;
  • Step S1014: single-scale FDCM is performed;
  • Step S1015: non-maximum suppression is performed, and the detection result is output.
  • pedestrian detection includes the following: it is first determined whether the height model and the pedestrian template file exist; if they do, all the fitted line segments in Edge_line are formed into line-segment maps according to the line-segment direction, and the distance map of each line-segment map is calculated; the matching template is scaled according to the height model; the Cost value of the template at all positions is calculated according to the distance maps, and all Cost values are scanned: if a Cost is less than a certain threshold, a pedestrian is detected and a rectangular box is output; all detected rectangular boxes are sorted in ascending order of Cost and each detection window is scanned: if it does not overlap any rectangular window with a smaller Cost value, or the overlap does not reach a certain ratio, it is considered a new target, otherwise a vote is added to the overlapping target; if a detection's Cost value is greater than 0 and its number of votes is greater than a certain threshold, it is the final target.
  • FIG. 11 is a schematic diagram of a generated pedestrian contour template set according to an embodiment of the present invention. As shown in FIG. 11, pedestrian detection results were obtained with different templates and evaluated with three indicators; the data showed that the third template performed best, so the third template was used as the pedestrian matching template in the subsequent experiments.
  • a pedestrian contour template is constructed on a known pedestrian database, and the image to be detected is preprocessed, and pedestrian detection in the image is implemented by matching the pedestrian contour template with the preprocessed image.
  • FIG. 12 is a schematic diagram 1 of pedestrian detection according to a preferred embodiment of the present invention. As shown in Figure 12, it includes the following:
  • Steps S1201-S1203 are the same as steps S1001-S1003 described above, and are not described herein again.
  • Step S1204: an image to be detected is input, and preprocessing 2 is applied to it to obtain an edge map: the original image is first converted into a grayscale image; the horizontal and vertical gradients are then calculated with the Sobel operator to obtain a gradient map; the gradient map is binarized to get the edge map image2.
  • Step S1205, line segment fitting is performed on the edge map: the edge map image2 is traversed, and the coordinates and indices of all edge points are recorded as edgeMap. If the number of edge points remaining in edgeMap is less than a threshold, no more lines are fitted; otherwise the next step is taken. If the number of existing lines is less than a certain threshold, line fitting continues as follows: a point (x0, y0) is selected at random, a small neighborhood centered on that point is chosen, the coordinates of all points in the neighborhood are recorded, a line is fitted to these points, and its normal vector is computed; then, with (x0, y0) as the center, a larger neighborhood is selected, and the number of points in it that are collinear with the line just fitted is counted; the above operation is repeated, and if the number of collinear points is greater than a certain threshold, an optimal line has been found. All points on the fitted line are removed, and the operation is repeated on the remaining points to find all the lines, which are stored in the array Edge_line to lay the foundation for segment-based matching. A sketch of one round of this search follows.
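  • The fitting loop is essentially a RANSAC-style search; the sketch below shows one round of it. The neighborhood radii, the collinearity tolerance, and the inlier threshold (r_small, r_large, tol, min_collinear) are assumed example values, not values given by the text.

```python
import numpy as np

def fit_one_segment(points, r_small=3.0, r_large=15.0, tol=1.0,
                    min_collinear=20, rng=None):
    """One round of the segment search on an (N, 2) float array of edge
    points: pick a random point, fit a line to its small neighborhood,
    then count collinear points inside a larger neighborhood."""
    rng = rng or np.random.default_rng()
    p0 = points[rng.integers(len(points))]            # random (x0, y0)
    d = np.linalg.norm(points - p0, axis=1)
    near = points[d <= r_small]                       # small neighborhood
    if len(near) < 2:
        return None
    centered = near - near.mean(axis=0)               # total least squares
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    direction = vt[0]                                 # fitted line direction
    normal = np.array([-direction[1], direction[0]])  # its normal vector
    cand = points[d <= r_large]                       # larger neighborhood
    dist_to_line = np.abs((cand - p0) @ normal)
    inliers = cand[dist_to_line <= tol]               # collinear points
    return inliers if len(inliers) >= min_collinear else None
```

  • Points of an accepted segment are removed and the search is repeated until too few edge points remain; the accepted segments are accumulated in the array Edge_line.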
  • Step S1206, single-scale FDCM is performed. Pedestrian detection first determines whether a pedestrian template file exists; if so, all the fitted line segments obtained in Edge_line are grouped by segment direction into line segment maps, and the distance map of each line segment map is calculated; the matching template is scaled by a fixed ratio (such as 1:1). The Cost value of the template at all positions is calculated from the distance maps, and all Cost values are scanned; if a Cost is less than a certain threshold, a pedestrian is detected and a rectangular box is output; all detected rectangular boxes are sorted in increasing order of Cost, and each detection window is scanned; if it does not overlap a rectangular window with a smaller Cost value, or the overlap does not reach a certain proportion, it is considered a new target, otherwise one vote is added to the overlapping target; if the Cost value of a detection is greater than 0 and the number of votes it received is greater than a certain threshold, it is taken as a final target.
  • Step S1207, non-maximum suppression is performed, and the detection result is output.
  • In Embodiment 2, a pedestrian contour template is constructed on a known pedestrian database, a height model is constructed from the input video images, the video images to be detected are preprocessed, and pedestrian detection in the video images is implemented by matching the height-adaptive pedestrian contour template against the preprocessed images.
  • In this embodiment, the image preprocessing process uses background modeling combined with edge detection, and the height model is constructed by multi-scale FDCM; since the height model is then known, the detection process uses the single-scale FDCM method. FIG. 13 is a schematic diagram 2 of pedestrian detection according to a preferred embodiment of the present invention. As shown in FIG. 13, it includes the following:
  • Steps S1301-S1303 are the same as steps S1001-S1003 described above, and are not described herein again.
  • Step S1304, a video sequence is input, and a region of interest is selected;
  • Step S1305, background modeling: the first 1000 frames of the video are used to train the background model. The relevant parameters are initialized first; the codebook is initialized with the first frame, i.e., the current frame is converted to the YCbCr color space, each codeword is set to the value of the current pixel with importance parameter Si equal to gama1, and the length N of the codeword list at each position of the codebook is recorded. The following operations are then repeated for every subsequent frame: the next frame is read in and converted to the YCbCr color space, and for each pixel u in the frame the distance dis_u_v between u and every codeword v at that position of the codebook is compared. If the distance dis_u_v is less than or equal to the threshold d, u matches v, so the value v of the codeword and its importance parameter s are updated while the importance of the other codewords at that position is reduced; any codeword whose importance parameter falls below 0 is deleted; all updated codewords are sorted by importance from high to low; if the ratio of the sum of the importances of the currently matched codeword and the codewords after it to the sum of the importances of all codewords is less than a certain threshold T, the pixel at that position is set to foreground (255), otherwise to background (0). If the distance dis_u_v is greater than the threshold d, u does not match v, and the pixel is set to foreground; the importance of all codewords at that position is reduced, any codeword whose importance parameter falls below 0 is deleted, and all updated codewords are sorted by importance from high to low; if the codeword list at that position has not reached its maximum length, the current pixel value is inserted at the end with weight gama1, otherwise the last codeword is deleted and the current pixel value is inserted at the end with weight gama1. A per-pixel sketch of this update rule is given below.
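  • The codeword update rule above can be illustrated with a per-pixel sketch in Python. It is a minimal illustration: the parameter names d, alpha, gama1, gama0, Hmax, Nmax, and T follow the text, but the concrete default values are assumptions, not values given by the embodiment.

```python
import numpy as np

def update_pixel(codebook, u, d=20.0, alpha=0.05, gama1=5.0, gama0=1.0,
                 Hmax=100.0, Nmax=10, T=0.6):
    """One codebook update for a single pixel.  codebook is a list of
    [v, s] entries, v an np.array in YCbCr and s its importance; u is
    the new YCbCr pixel value.  Returns 255 (foreground) or 0 (background)."""
    for cw in codebook:
        if np.abs(u - cw[0]).sum() <= d:              # dis_u_v <= d: match
            cw[0] = (1 - alpha) * cw[0] + alpha * u   # v = (1-alpha)v + alpha*u
            cw[1] = min(cw[1] + gama1, Hmax)          # s = min(s+gama1, Hmax)
            for other in codebook:
                if other is not cw:
                    other[1] -= gama0                 # forget other codewords
            codebook[:] = [c for c in codebook if c[1] >= 0]
            codebook.sort(key=lambda c: c[1], reverse=True)
            idx = next(i for i, c in enumerate(codebook) if c is cw)
            tail = sum(c[1] for c in codebook[idx:])  # matched codeword onward
            total = sum(c[1] for c in codebook)
            return 255 if tail / total < T else 0     # foreground test
    # no codeword matched: the pixel is foreground, u becomes a new codeword
    for c in codebook:
        c[1] -= gama0
    codebook[:] = [c for c in codebook if c[1] >= 0]
    codebook.sort(key=lambda c: c[1], reverse=True)
    if len(codebook) >= Nmax:
        codebook.pop()                                # drop weakest codeword
    codebook.append([np.asarray(u, float), gama1])
    return 255
```

  • As frames stream in, the per-pixel codebooks converge to a background model, and the pixels flagged 255 form the binary foreground map.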
  • Step S1306, preprocessing 1 (Smooth, Threshold, Erode): using the background model, a binary foreground image is obtained; because the foreground image contains a large number of noise points, and in order to remove noise without losing edge information and so lay the foundation for template matching, a Gaussian filter is selected to smooth the foreground image, which is then quantized into a binary image; finally a 3x3 all-one template is selected to erode the image, and the preprocessed foreground image, denoted image1, is obtained. A sketch of this step follows.
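  • A minimal sketch of preprocessing 1 with OpenCV; the Gaussian kernel size (5x5) and the binarization threshold (127) are assumed example values.

```python
import cv2
import numpy as np

def preprocess_foreground(fg):
    """Preprocessing 1: Gaussian smoothing, binarization, 3x3 erosion."""
    smoothed = cv2.GaussianBlur(fg, (5, 5), 0)           # smooth foreground
    _, binary = cv2.threshold(smoothed, 127, 255, cv2.THRESH_BINARY)
    kernel = np.ones((3, 3), np.uint8)                   # 3x3 all-one template
    image1 = cv2.erode(binary, kernel)                   # erosion
    return image1
```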
  • Step S1307, preprocessing 2 (rgb2gray, Sobel): the original video frame is first converted into a grayscale image; the Sobel operator is then used to calculate the horizontal and vertical gradients of the image, yielding a gradient map; the gradient map is binarized to obtain the edge map image2.
  • Step S1308, line segment fitting: the edge map image2 and the foreground image image1 are first bitwise ANDed to obtain the edge map of interest, image3; the image image3 is traversed, and the coordinates and indices of all edge points are recorded as edgeMap. If the number of edge points remaining in edgeMap is less than a threshold, no more lines are fitted; otherwise the next step is taken. If the number of existing lines is less than a certain threshold, line fitting continues as follows: a point (x0, y0) is selected at random, a small neighborhood centered on that point is chosen, the coordinates of all points in the neighborhood are recorded, a line is fitted to these points, and its normal vector is computed; then, with (x0, y0) as the center, a larger neighborhood is selected, and the number of points in it that are collinear with the line just fitted is counted; the above operation is repeated, and if the number of collinear points is greater than a certain threshold, an optimal line has been found. All points on the fitted line are removed, the operation is repeated on the remaining points to find all the lines, and the fitted segments are stored in the array Edge_line to lay the foundation for segment-based matching.
  • The construction of the pedestrian height model includes steps S1309-S1311; the video frames following the completion of background modeling are used for height modeling. Since no height model exists yet, the line segment array Edge_line is matched by the multi-scale FDCM method, and the ordinate of each detected pedestrian position (i.e., the position of the feet) and the pedestrian's height are recorded; this process is continued to obtain a sample set of pedestrian heights. A linear fitting method is then used to fit pedestrian height as a function of the position where the pedestrian appears, which constitutes the height model (see the sketch after the step list below).
  • Step S1309, multi-scale FDCM pedestrian detection is performed;
  • Step S1310, the height model is calculated;
  • Step S1311, the pedestrian height model is constructed;
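  • Since the height model is a linear fit of pedestrian height against the ordinate of the foot position, it can be sketched with a least-squares fit; the sample values below are hypothetical placeholders for the measurements collected by the multi-scale FDCM stage.

```python
import numpy as np

def fit_height_model(foot_y, heights):
    """Fit pedestrian height as a linear function of the ordinate of the
    detected foot position: height ~ a * y + b."""
    a, b = np.polyfit(np.asarray(foot_y, float),
                      np.asarray(heights, float), deg=1)
    return lambda y: a * y + b

# example: samples gathered during multi-scale FDCM detection (hypothetical)
model = fit_height_model([220, 260, 310, 360], [48, 57, 69, 80])
scale = model(300) / 64.0   # scale factor for a 64-pixel-tall template
```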
  • Step S1312, single-scale FDCM pedestrian detection: it is first determined whether a height model and a pedestrian template file exist; if so, all the fitted line segments obtained in Edge_line are grouped by segment direction into line segment maps, and the distance map of each line segment map is calculated; the matching template is scaled according to the height model; the Cost value of the template at all positions is calculated from the distance maps, and all Cost values are scanned; if a Cost is less than a certain threshold, a pedestrian is detected and a rectangular box is output; all detected rectangular boxes are sorted in increasing order of Cost, and each detection window is scanned; if it does not overlap a rectangular window with a smaller Cost value, or the overlap does not reach a certain proportion, it is considered a new target, otherwise one vote is added to the overlapping target; if the Cost value of a detection is greater than 0 and the number of votes it received is greater than a certain threshold, it is taken as a final target. A sketch of the direction-wise distance maps and the Cost computation is given below.
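  • The segment-map, distance-map, and Cost computation can be sketched as follows. The number of direction bins (8), the collapsing of angles modulo pi, and integer segment endpoints are assumptions for the example; template_pts stands for the edge points (with orientations) of the scaled pedestrian template.

```python
import cv2
import numpy as np

def direction_distance_maps(segments, shape, n_bins=8):
    """segments: list of ((x1, y1), (x2, y2)) with integer coordinates.
    Returns one distance map per direction bin: small values lie near
    segments whose orientation falls in that bin."""
    angles = [np.arctan2(y2 - y1, x2 - x1) % np.pi
              for (x1, y1), (x2, y2) in segments]
    maps = []
    for b in range(n_bins):
        canvas = np.full(shape, 255, np.uint8)
        for seg, ang in zip(segments, angles):
            if int(ang / np.pi * n_bins) % n_bins == b:
                cv2.line(canvas, seg[0], seg[1], 0, 1)   # line segment map
        # distance to the nearest segment pixel of this direction
        maps.append(cv2.distanceTransform(canvas, cv2.DIST_L2, 3))
    return maps

def template_cost(dist_maps, template_pts, offset, n_bins=8):
    """Cost of a template placed at offset: template_pts is a list of
    (x, y, angle) edge points assumed to stay inside the image; a lower
    Cost means a better match."""
    ox, oy = offset
    total = 0.0
    for x, y, ang in template_pts:
        b = int((ang % np.pi) / np.pi * n_bins) % n_bins
        total += dist_maps[b][oy + y, ox + x]
    return total / len(template_pts)
```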
  • Step S1313, non-maximum suppression is performed, and the detection result is output.
  • In Embodiment 3, a pedestrian contour template is constructed on a known pedestrian database, the known height model of the video images is updated, the video images to be detected are preprocessed, and pedestrian detection in the video images is implemented by matching the height-adaptive pedestrian contour template against the preprocessed images.
  • The image preprocessing process uses background modeling combined with edge detection, and the height model is updated by multi-scale FDCM; since the height model is known, the detection process uses the single-scale FDCM method. FIG. 14 is a schematic diagram 3 of pedestrian detection according to a preferred embodiment of the present invention. As shown in FIG. 14, it includes the following:
  • Steps S1401-S1403 are the same as steps S1001-S1003 described above, and are not described herein again.
  • Step S1404, a video sequence is input, and a region of interest is selected;
  • Step S1405, background modeling: the first 1000 frames of the video are used to train the background model. The relevant parameters are initialized first; the codebook is initialized with the first frame, i.e., the current frame is converted to the YCbCr color space, each codeword is set to the value of the current pixel with importance parameter Si equal to gama1, and the length N of the codeword list at each position of the codebook is recorded. The following operations are then repeated for every subsequent frame: the next frame is read in and converted to the YCbCr color space, and for each pixel u in the frame the distance dis_u_v between u and every codeword v at that position of the codebook is compared. If the distance dis_u_v is less than or equal to the threshold d, u matches v, so the value of the codeword and its importance parameter s are updated while the importance of the other codewords at that position is reduced; any codeword whose importance parameter falls below 0 is deleted; all updated codewords are sorted by importance from high to low; if the ratio of the sum of the importances of the currently matched codeword and the codewords after it to the sum of the importances of all codewords is less than a certain threshold T, the pixel at that position is set to foreground (255), otherwise to background (0). If the distance dis_u_v is greater than the threshold d, u does not match v, and the pixel is set to foreground; the importance of all codewords at that position is reduced, any codeword whose importance parameter falls below 0 is deleted, and all updated codewords are sorted by importance from high to low; if the codeword list at that position has not reached its maximum length, the current pixel value is inserted at the end with weight gama1, otherwise the last codeword is deleted and the current pixel value is inserted at the end with weight gama1.
  • Step S1406, preprocessing 1 (Smooth, Threshold, Erode): using the background model, a binary foreground image is obtained; because the foreground image contains a large number of noise points, and in order to remove noise without losing edge information and so lay the foundation for template matching, a Gaussian filter is selected to smooth the foreground image, which is then quantized into a binary image; finally a 3x3 all-one template is selected to erode the image, and the preprocessed foreground image, denoted image1, is obtained.
  • Step S1407, preprocessing 2 (rgb2gray, Sobel): the original video frame is first converted into a grayscale image; the Sobel operator is then used to calculate the horizontal and vertical gradients of the image, yielding a gradient map; the gradient map is binarized to obtain the edge map image2.
  • Step S1408, line segment fitting: the edge map image2 and the foreground image image1 are first bitwise ANDed to obtain the edge map of interest, image3; the image image3 is traversed, and the coordinates and indices of all edge points are recorded as edgeMap. If the number of edge points remaining in edgeMap is less than a threshold, no more lines are fitted; otherwise the next step is taken. If the number of existing lines is less than a certain threshold, line fitting continues as follows: a point (x0, y0) is selected at random, a small neighborhood centered on that point is chosen, the coordinates of all points in the neighborhood are recorded, a line is fitted to these points, and its normal vector is computed; then, with (x0, y0) as the center, a larger neighborhood is selected, and the number of points in it that are collinear with the line just fitted is counted; the above operation is repeated, and if the number of collinear points is greater than a certain threshold, an optimal line has been found. All points on the fitted line are removed, the operation is repeated on the remaining points to find all the lines, and the fitted segments are stored in the array Edge_line to lay the foundation for segment-based matching.
  • The construction of the pedestrian height model includes steps S1409-S1412; the video frames following the completion of background modeling are used for height modeling. It is first determined whether the existing height model needs to be updated; if so, the line segment array Edge_line is matched by the multi-scale FDCM method, and the ordinate of each detected pedestrian position (i.e., the position of the feet) and the pedestrian's height are recorded; this process is continued to obtain a sample set of pedestrian heights. A linear fitting method is then used to fit pedestrian height as a function of the position where the pedestrian appears, which constitutes the height model.
  • Step S1409, it is determined whether the height model needs to be updated; if the determination result is yes, step S1410 is performed, and if the determination result is no, step S1412 is performed;
  • Step S1410, multi-scale FDCM pedestrian detection is performed;
  • Step S1411, the height model is calculated;
  • Step S1412, the pedestrian height model is constructed;
  • Step S1413, single-scale FDCM pedestrian detection: it is first determined whether a height model and a pedestrian template file exist; if so, all the fitted line segments obtained in Edge_line are grouped by segment direction into line segment maps, and the distance map of each line segment map is calculated; the matching template is scaled according to the height model; the Cost value of the template at all positions is calculated from the distance maps, and all Cost values are scanned; if a Cost is less than a certain threshold, a pedestrian is detected and a rectangular box is output; all detected rectangular boxes are sorted in increasing order of Cost, and each detection window is scanned; if it does not overlap a rectangular window with a smaller Cost value, or the overlap does not reach a certain proportion, it is considered a new target, otherwise one vote is added to the overlapping target; if the Cost value of a detection is greater than 0 and the number of votes it received is greater than a certain threshold, it is taken as a final target.
  • Step S1414, non-maximum suppression is performed, and the detection result is output.
  • The modules or steps of the present invention described above can be implemented by a general-purpose computing device; they can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they may be implemented by program code executable by the computing devices, so that they may be stored in a storage device and executed by the computing devices; in some cases, the steps shown or described may be performed in an order different from that described herein, or they may be made into individual integrated circuit modules, or multiple of the modules or steps may be made into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
  • Industrial applicability: as described above, a pedestrian detection method and apparatus are provided in the embodiments of the present invention, which solve the problem in the related art that pedestrian detection methods do not take the edge map of the target edge into account, leading to inaccurate detection results, and thereby achieve more accurate detection results.

Abstract

The present invention discloses a pedestrian detection method and apparatus, wherein the method includes: processing a monitored video sequence to obtain a foreground map of the video sequence; obtaining an edge map of a selected region according to the foreground map; processing edge points of the edge map to obtain a contour map to be detected; and performing pedestrian detection on the contour map to be detected according to a pre-established pedestrian contour template. The present invention solves the problem in the related art that pedestrian detection methods do not take the edge map of the target edge into account, which leads to inaccurate detection results, and thereby achieves more accurate detection results.


Claims (26)

  1. A pedestrian detection method, comprising:
    processing a monitored video sequence to obtain a foreground map of the video sequence;
    obtaining an edge map of a selected region according to the foreground map;
    processing edge points of the edge map to obtain a contour map to be detected;
    performing pedestrian detection on the contour map to be detected according to a pre-established pedestrian contour template.
  2. The method according to claim 1, wherein before performing pedestrian detection on the contour map to be detected according to the pre-established pedestrian contour template, the method further comprises:
    acquiring a pedestrian data set, and establishing the pedestrian contour template according to the data set.
  3. The method according to claim 2, wherein establishing the pedestrian contour template according to the data set comprises:
    randomly selecting N persons from the standard pedestrian database INRIA, and marking feature points on the video sequence, the coordinates of a feature point being (x, y), the set of the v marked feature point coordinates constituting a shape S;
    normalizing the shape S;
    performing a principal component analysis (PCA) transformation on the normalized shape S to obtain the average shape S0 of the video sequence and the shape features Si (i = 1, 2, ..., n) corresponding to the first n eigenvalues;
    the contour of a pedestrian being expressed by the following linear equation:
    Figure PCTCN2014094421-appb-100001
    after M groups of contour parameters are obtained, the pedestrian contour template is obtained.
  4. The method according to claim 1, wherein after processing the edge points of the edge map to obtain the contour map to be detected, the method further comprises:
    determining whether a pedestrian height model exists;
    in a case where the determination result is yes, performing pedestrian detection on the contour map to be detected according to the pre-established pedestrian contour template and the pedestrian height model.
  5. The method according to claim 4, wherein obtaining the edge map of the selected region according to the foreground map comprises:
    according to the video sequence, letting the codeword at each position of a codebook be a set Ci of 4-dimensional vectors so as to determine the foreground map, where Ci={Vi=(Yi,Cbi,Cri),Si}, i=1,2,....N, (Yi,Cbi,Cri) being pixel values describing the background and Si being a parameter reflecting the importance of the codeword;
    calculating a first edge map for the selected region;
    processing the foreground map and the first edge map to obtain the edge map of the selected region, and representing the edge map as a set of line segments based on a line segment fitting technique.
  6. The method according to claim 5, wherein, according to the video sequence, letting the codeword at each position of the codebook be a set Ci of 4-dimensional vectors so as to determine the foreground map comprises:
    initializing the following parameters: an upper bound Hmax of the importance parameter, a learning weight gama1 of the importance parameter, a forgetting weight gama0 of the importance parameter, a maximum length Nmax of the codeword list at each position of the codebook, a matching threshold d, a weight alpha for updating pixel values, and a parameter T for deciding foreground and background;
    initializing the codebook with a first frame image: converting the current frame image to the YCbCr color space, letting each codeword be the value of the current pixel with the importance parameter Si of each codeword being gama1, and recording the length N of the codeword list at each position of the codebook;
    reading in a next frame image, converting the image to the YCbCr color space, and performing the following operation on every pixel of the frame: if the new pixel value is u, comparing the distance dis_u_v between u and all codewords v at that pixel position of the codebook according to dis_u_v=abs(u(1)-v(1))+abs(u(2)-v(2))+abs(u(3)-v(3));
    updating the codebook, which gradually converges to form a background model, and determining the foreground map according to the background model.
  7. The method according to claim 6, wherein the method further comprises:
    if the distance dis_u_v is less than or equal to the threshold d, updating the codeword: updating the value v of the codeword and the importance parameter s according to the following formulas, while reducing the importance of the other codewords at the pixel position by gama0, and deleting any codeword whose importance parameter is less than 0 after the reduction; sorting all updated codewords by importance from high to low; if the ratio of the sum of the importances of the currently matched codeword and the codewords after it to the sum of the importances of all codewords at the pixel position is less than the threshold T, setting the pixel at that position to foreground, otherwise to background;
    v=(1-alpha)*v+alpha*u
    s=min([s+gama1,Hmax])
    if the distance dis_u_v is greater than the threshold d, u does not match v, and the following operations are performed: setting the pixel to foreground; reducing the importance of all codewords at the pixel position, and deleting any codeword whose importance parameter is less than 0 after the reduction; sorting all updated codewords by importance from high to low; if the codeword list at the pixel position has not reached the maximum length, inserting the current pixel value at the end of the list with weight gama1, otherwise deleting the last codeword and inserting the current pixel value at the end with weight gama1.
  8. The method according to claim 5, wherein calculating the first edge map for the selected region comprises:
    converting an original color video frame into a grayscale image;
    calculating the gradients of the grayscale image in the horizontal and vertical directions with the Sobel operator to obtain a gradient map;
    binarizing the gradient map to obtain the first edge map.
  9. The method according to claim 5, wherein processing the foreground map and the first edge map to obtain the edge map of the selected region comprises:
    before Chamfer matching is performed, bitwise ANDing the first edge map and the foreground map to obtain the edge map of the selected region.
  10. The method according to claim 5, wherein representing the edge map as a set of line segments based on a line segment fitting technique comprises:
    traversing the edge map, and recording the coordinates and indices of all edge points as edgeMap;
    determining whether the number of edge points remaining in the edgeMap is less than a first threshold, and stopping line fitting in a case where the determination result is yes;
    in a case where the determination result is no, determining whether the number of existing lines is less than a second threshold, and in a case where the determination result is yes, continuing to fit lines through the following steps: randomly selecting a point (x0, y0), selecting a first neighborhood centered on the point with a first predetermined distance as radius, recording the coordinates of all points in the first neighborhood, fitting a first line with the recorded points, and computing the normal vector of the first line; selecting a second neighborhood centered on (x0, y0) with a second predetermined distance as radius, counting the points in the second neighborhood that are collinear with the first line, and determining whether the number of collinear points is greater than a third threshold; obtaining a fitted line in a case where the determination result is yes, wherein the first predetermined distance is smaller than the second predetermined distance;
    removing all points on the fitted line, finding all the lines among the remaining points, and storing all the fitted segments in an array Edge_line as the set of line segments.
  11. The method according to claim 5, wherein the method further comprises:
    establishing the pedestrian height model based on a multi-scale FDCM method and a data regression method by using the set of line segments.
  12. The method according to claim 11, wherein:
    the line segment array Edge_line is processed by the multi-scale FDCM method, and the ordinate of each detected pedestrian position and the height of the pedestrian are recorded to obtain a sample set of pedestrian heights;
    a linear fitting method is used to fit pedestrian height according to the position where the pedestrian appears, obtaining the pedestrian height model.
  13. The method according to claim 4, wherein performing pedestrian detection on the contour map to be detected according to the pre-established pedestrian contour template and the pedestrian height model comprises:
    grouping all the fitted line segments obtained in Edge_line into line segment maps by segment direction, and calculating the distance map of each line segment map;
    scaling the matching template according to the pedestrian height model;
    calculating the Cost value of the pedestrian contour template at all positions according to the distance maps, scanning all Cost values, and, if a Cost is less than a fourth threshold, detecting a pedestrian and outputting a rectangular box;
    sorting the detected rectangular boxes in increasing order of Cost, scanning every detection window, and, if the detection window does not overlap a rectangular window with a smaller Cost value or the overlap does not reach a certain proportion, regarding it as a new target, otherwise adding one vote to the overlapping target;
    if the Cost value of a detection is greater than 0 and the number of votes it received is greater than a fifth threshold, determining it as a final target of pedestrian detection.
  14. A pedestrian detection apparatus, comprising:
    a first processing module, configured to process a monitored video sequence to obtain a foreground map of the video sequence;
    a second processing module, configured to obtain an edge map of a selected region according to the foreground map;
    a third processing module, configured to process edge points of the edge map to obtain a contour map to be detected;
    a first pedestrian detection module, configured to perform pedestrian detection on the contour map to be detected according to a pre-established pedestrian contour template.
  15. The apparatus according to claim 14, wherein the apparatus comprises:
    an establishing module, configured to acquire a pedestrian data set, and establish the pedestrian contour template according to the data set.
  16. The apparatus according to claim 15, wherein the establishing module comprises:
    a marking unit, configured to randomly select N persons from the standard pedestrian database INRIA and mark feature points on the video sequence, the coordinates of a feature point being (x, y), the set of the v marked feature point coordinates constituting a shape S;
    a normalization processing unit, configured to normalize the shape S;
    an analysis processing unit, configured to perform a principal component analysis (PCA) transformation on the normalized shape S to obtain the average shape S0 of the video sequence and the shape features Si (i = 1, 2, ..., n) corresponding to the first n eigenvalues;
    the contour of a pedestrian being expressed by the following linear equation:
    Figure PCTCN2014094421-appb-100002
    after M groups of contour parameters are obtained, the pedestrian contour template is obtained.
  17. The apparatus according to claim 14, wherein the apparatus further comprises:
    a judging module, configured to judge whether a pedestrian height model exists;
    a second pedestrian detection module, configured to, in a case where the judgment result is yes, perform pedestrian detection on the contour map to be detected according to the pre-established pedestrian contour template and the pedestrian height model.
  18. The apparatus according to claim 17, wherein the second processing module comprises:
    a command unit, configured to, according to the video sequence, let the codeword at each position of a codebook be a set Ci of 4-dimensional vectors so as to determine the foreground map, where Ci={Vi=(Yi,Cbi,Cri),Si}, i=1,2,....N, (Yi,Cbi,Cri) being pixel values describing the background and Si being a parameter reflecting the importance of the codeword;
    a first calculating unit, configured to calculate a first edge map for the selected region;
    a representation unit, configured to process the foreground map and the first edge map to obtain the edge map of the selected region, and represent the edge map as a set of line segments based on a line segment fitting technique.
  19. The apparatus according to claim 18, wherein the command unit comprises:
    an initialization subunit, configured to initialize the following parameters: an upper bound Hmax of the importance parameter, a learning weight gama1 of the importance parameter, a forgetting weight gama0 of the importance parameter, a maximum length Nmax of the codeword list at each position of the codebook, a matching threshold d, a weight alpha for updating pixel values, and a parameter T for deciding foreground and background;
    a recording subunit, configured to initialize the codebook with a first frame image by converting the current frame image to the YCbCr color space, letting each codeword be the value of the current pixel with the importance parameter Si of each codeword being gama1, and recording the length N of the codeword list at each position of the codebook;
    an operation subunit, configured to read in a next frame image, convert the image to the YCbCr color space, and perform the following operation on every pixel of the frame: if the new pixel value is u, comparing the distance dis_u_v between u and all codewords v at that pixel position of the codebook according to dis_u_v=abs(u(1)-v(1))+abs(u(2)-v(2))+abs(u(3)-v(3));
    an updating subunit, configured to update the codebook, which gradually converges to form a background model, and determine the foreground map according to the background model.
  20. The apparatus according to claim 19, wherein the apparatus further comprises:
    a first pixel processing unit, configured to, if the distance dis_u_v is less than or equal to the threshold d, update the codeword: update the value v of the codeword and the importance parameter s according to the following formulas, while reducing the importance of the other codewords at the pixel position by gama0, and delete any codeword whose importance parameter is less than 0 after the reduction; sort all updated codewords by importance from high to low; if the ratio of the sum of the importances of the currently matched codeword and the codewords after it to the sum of the importances of all codewords at the pixel position is less than the threshold T, set the pixel at that position to foreground, otherwise to background;
    v=(1-alpha)*v+alpha*u
    s=min([s+gama1,Hmax])
    a second pixel processing unit, configured to, if the distance dis_u_v is greater than the threshold d, in which case u does not match v, perform the following operations: set the pixel to foreground; reduce the importance of all codewords at the pixel position, and delete any codeword whose importance parameter is less than 0 after the reduction; sort all updated codewords by importance from high to low; if the codeword list at the pixel position has not reached the maximum length, insert the current pixel value at the end of the list with weight gama1, otherwise delete the last codeword and insert the current pixel value at the end with weight gama1.
  21. The apparatus according to claim 18, wherein the first calculating unit comprises:
    a conversion subunit, configured to convert an original color video frame into a grayscale image;
    a calculation subunit, configured to calculate the gradients of the grayscale image in the horizontal and vertical directions with the Sobel operator to obtain a gradient map;
    a binarization subunit, configured to binarize the gradient map to obtain the first edge map.
  22. The apparatus according to claim 18, wherein the representation unit comprises:
    a first processing subunit, configured to, before Chamfer matching is performed, bitwise AND the first edge map and the foreground map to obtain the edge map of the selected region.
  23. The apparatus according to claim 18, wherein the representation unit comprises:
    a traversal subunit, configured to traverse the edge map, and record the coordinates and indices of all edge points as edgeMap;
    a stopping subunit, configured to judge whether the number of edge points remaining in the edgeMap is less than a first threshold, and stop line fitting in a case where the judgment result is yes;
    a fitting subunit, configured to, in a case where the judgment result is no, judge whether the number of existing lines is less than a second threshold, and in a case where the judgment result is yes, continue to fit lines through the following steps: randomly selecting a point (x0, y0), selecting a first neighborhood centered on the point with a first predetermined distance as radius, recording the coordinates of all points in the first neighborhood, fitting a first line with the recorded points, and computing the normal vector of the first line; selecting a second neighborhood centered on (x0, y0) with a second predetermined distance as radius, counting the points in the second neighborhood that are collinear with the first line, and judging whether the number of collinear points is greater than a third threshold; obtaining a fitted line in a case where the judgment result is yes, wherein the first predetermined distance is smaller than the second predetermined distance;
    a second processing subunit, configured to remove all points on the fitted line, find all the lines among the remaining points, and store all the fitted segments in an array Edge_line as the set of line segments.
  24. The apparatus according to claim 18, wherein the apparatus further comprises:
    an establishing unit, configured to establish the pedestrian height model based on a multi-scale FDCM method and a data regression method by using the set of line segments.
  25. The apparatus according to claim 24, wherein the apparatus further comprises:
    a recording unit, configured to process the line segment array Edge_line by the multi-scale FDCM method, and record the ordinate of each detected pedestrian position and the height of the pedestrian to obtain a sample set of pedestrian heights;
    a pedestrian height fitting unit, configured to fit pedestrian height according to the position where the pedestrian appears by a linear fitting method, obtaining the pedestrian height model.
  26. The apparatus according to claim 17, wherein the second pedestrian detection module comprises:
    a second calculating unit, configured to group all the fitted line segments obtained in Edge_line into line segment maps by segment direction, and calculate the distance map of each line segment map;
    a scaling unit, configured to scale the matching template according to the pedestrian height model;
    a scanning unit, configured to calculate the Cost value of the pedestrian contour template at all positions according to the distance maps and scan all Cost values, and, if a Cost is less than a fourth threshold, detect a pedestrian and output a rectangular box;
    an adding unit, configured to sort the detected rectangular boxes in increasing order of Cost and scan every detection window, and, if the detection window does not overlap a rectangular window with a smaller Cost value or the overlap does not reach a certain proportion, regard it as a new target, otherwise add one vote to the overlapping target;
    a target determining unit, configured to determine a detection as a final target of pedestrian detection if its Cost value is greater than 0 and the number of votes it received is greater than a fifth threshold.