WO2018119668A1 - Pedestrian head recognition method and system - Google Patents
Pedestrian head recognition method and system
- Publication number
- WO2018119668A1 WO2018119668A1 PCT/CN2016/112383 CN2016112383W WO2018119668A1 WO 2018119668 A1 WO2018119668 A1 WO 2018119668A1 CN 2016112383 W CN2016112383 W CN 2016112383W WO 2018119668 A1 WO2018119668 A1 WO 2018119668A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- pixel
- value
- point
- region
- head
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/136—Segmentation; Edge detection involving thresholding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/194—Segmentation; Edge detection involving foreground-background segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/255—Detecting or recognising potential candidate objects based on visual cues, e.g. shapes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
- G06V20/53—Recognition of crowd images, e.g. recognition of crowd congestion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/162—Detection; Localisation; Normalisation using pixel segmentation or colour matching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
Definitions
- The invention belongs to the technical field of image processing, and in particular relates to a pedestrian head recognition method and system.
- Existing image-processing-based pedestrian head recognition methods generally work by recognizing certain physical characteristics of a person, such as hair color, head contour, or a head-and-shoulder model, but these features are not representative.
- Some dyed hair colors are not recognized well, hair color changes with lighting and other factors, and clothes whose color is close to the hair color, or a hat worn on the head, also interfere with recognition, resulting in low recognition accuracy. When a camera is used to extract the contour of the human head, the contour changes as the pedestrian moves and is not uniform; the head-and-shoulder model requires the camera to shoot obliquely downward, which causes occlusion problems, so heads cannot be accurately identified.
- The technical problem to be solved by the present invention is to provide a pedestrian head recognition method and system that intelligently and efficiently recognize human heads by means of image processing.
- The invention provides a pedestrian head recognition method, comprising:
- Step S1: acquiring a depth image collected from the target area while a depth camera is aimed vertically downward at the ground, and extracting a foreground image from the depth image;
- Step S2: extracting the potential regions of all heads from the foreground image as regions of interest (ROI regions);
- Step S3: taking each pixel in each ROI region as a center, constructing concentric circles to calculate the probability that the current pixel belongs to a head region, so as to obtain a probability value for each pixel in each ROI region; comparing the probability value of each pixel in each ROI region with a preset first threshold and filtering out pixels below the first threshold; the remaining pixels, which exist in the form of regions, are the points of head regions, and each such region is one recognized head.
- Step S1 is specifically: aiming a depth camera vertically downward at the ground and collecting a depth image of the target area to obtain the pixel value f(x, y) of the point at coordinates (x, y) in the depth image; comparing f(x, y) with the pixel value bg(x, y) at coordinates (x, y) obtained by prior background modeling; and combining them through the formula to obtain the pixel value mask(x, y) of the point at coordinates (x, y) in the foreground image: mask(x, y) = f(x, y) if |f(x, y) − bg(x, y)| > Tbg, and mask(x, y) = 0 otherwise;
- Tbg is the threshold that distinguishes the background model from the foreground image.
- The pixel value bg(x, y) at coordinates (x, y) is obtained by prior background modeling as follows: several background images of the target area are collected, and the pixel values of the points at coordinates (x, y) in those background images are averaged.
- The pixel value of the point at coordinates (x, y) is the relative distance from the corresponding point in the target area to the depth camera.
- Step S2 specifically includes:
- Step S21: taking each pixel P in the foreground image as a center point, calculating the mean m of the pixel values of the pixels in its 8-neighborhood according to the formula m = (1/8) Σₖ p(k), k = 1, …, 8;
- p(k) is the pixel value of the k-th pixel in the neighborhood.
- Step S22: if the absolute value d of the difference between the neighborhood mean m and the pixel value p of the center point is smaller than the preset second threshold Tm, calculating the variance v of the neighborhood according to the variance formula;
- the variance formula is: v = (1/8) Σₖ (p(k) − m)², k = 1, …, 8.
- Step S23: if the variance v is smaller than the preset third threshold Tv, determining that the neighborhood belongs to an ROI region.
- Step S3 specifically includes:
- Step S31: constructing concentric circles with each pixel in each ROI region as the center, the inner circle radius of the concentric circles being r and the outer circle radius being n × r;
- R is the average number of pixels from the center point to the edge of a human head region, obtained by statistics, and 1.2 ≤ n ≤ 1.7;
- Step S32: sorting the pixel values of the pixels obtained inside the inner circle and recording the sequence ArrayInner they form, the length of ArrayInner being lengthInner, where the pixel value of the point with the largest pixel value is NinnerMax; and sorting the pixel values of the pixels obtained in the area between the inner and outer circles and recording the sequence ArrayOuter they form, the length of ArrayOuter being lengthOuter, where the pixel value of the point with the smallest pixel value is NouterMin;
- the pixels obtained above are uniformly distributed within their respective regions, and the number lengthInner of pixels taken from the inner circle region equals the number lengthOuter of pixels taken from the area between the inner and outer circles;
- Step S33: counting the points in the sequence ArrayOuter smaller than NinnerMax as Num_1, counting the points in the sequence ArrayInner larger than NouterMin as Num_2, calculating the probability L that the current pixel belongs to a head region according to the formula, and recording the probability value;
- L = (lengthInner + lengthOuter − Num_1 − Num_2) / (lengthInner + lengthOuter);
- Step S34: increasing the inner circle radius to rnew = r + r × α, where α (0 < α < 1) is the rate at which the inner radius grows, the outer radius then being n × rnew; while rnew ≤ 2R, setting r = rnew and repeating steps S32–S34 to calculate and record the probability for each pixel, the maximum recorded probability of each pixel being taken as the final probability that the pixel belongs to a head region; when rnew > 2R, going to step S35;
- Step S35: comparing the final probability of each pixel with the first threshold and filtering out pixels below the first threshold; the remaining pixels, which exist in the form of regions, are the points of head regions, and each such region is one recognized head.
- The invention also provides a pedestrian head recognition system, comprising:
- a foreground image extraction module configured to acquire a depth image collected from the target area while a depth camera is aimed vertically downward at the ground, and to extract a foreground image from the depth image;
- an ROI region extraction module configured to extract the potential regions of all heads from the foreground image as regions of interest (ROI regions);
- a head recognition module configured to identify human head regions by constructing concentric circles; specifically, taking each pixel in each ROI region as a center, constructing concentric circles to calculate the probability that the current pixel belongs to a head region, obtaining a probability value for each pixel in each ROI region, comparing the probability value of each pixel in each ROI region with a preset first threshold, and filtering out pixels below the first threshold; the remaining pixels, which exist in the form of regions, are the points of head regions, and each such region is one recognized head.
- The foreground image extraction module is specifically configured to: aim a depth camera vertically downward at the ground, collect a depth image of the target area, obtain the pixel value f(x, y) of the point at coordinates (x, y) in the depth image, compare f(x, y) with the pixel value bg(x, y) at coordinates (x, y) obtained by prior background modeling, and combine them through the formula mask(x, y) = f(x, y) if |f(x, y) − bg(x, y)| > Tbg, otherwise 0, to obtain the pixel value mask(x, y) of the point at coordinates (x, y) in the foreground image;
- Tbg is the threshold that distinguishes the background model from the foreground image.
- The pixel value bg(x, y) at coordinates (x, y) is obtained by prior background modeling as follows: several background images of the target area are collected, and the pixel values of the points at coordinates (x, y) in those background images are averaged.
- The pixel value of the point at coordinates (x, y) is the relative distance from the corresponding point in the target area to the depth camera.
- The ROI region extraction module specifically includes:
- a mean calculation submodule configured to take each pixel P in the foreground image as a center point and calculate the mean m of the pixel values of the pixels in its 8-neighborhood according to the formula m = (1/8) Σₖ p(k), k = 1, …, 8;
- p(k) is the pixel value of the k-th pixel in the neighborhood;
- a variance calculation submodule configured to calculate the variance v of the neighborhood according to the variance formula v = (1/8) Σₖ (p(k) − m)² when the absolute value d of the difference between the neighborhood mean m and the pixel value p of the center point is less than the preset second threshold Tm;
- an ROI region determining submodule configured to determine that the neighborhood belongs to an ROI region when the variance v is less than the preset third threshold Tv.
- The head recognition module specifically includes:
- a concentric circle construction submodule configured to construct concentric circles with each pixel in each ROI region as the center;
- the inner circle radius of the concentric circles is r, and the outer circle radius is n × r;
- R is the average number of pixels from the center point to the edge of a human head region, obtained by statistics, and 1.2 ≤ n ≤ 1.7;
- a pixel value sorting submodule configured to sort the pixel values of the pixels obtained inside the inner circle and record the sequence ArrayInner they form, the length of ArrayInner being lengthInner, where the pixel value of the point with the largest pixel value is NinnerMax; and to sort the pixel values of the pixels obtained in the area between the inner and outer circles and record the sequence ArrayOuter they form, the length of ArrayOuter being lengthOuter, where the pixel value of the point with the smallest pixel value is NouterMin;
- the pixels obtained above are uniformly distributed within their respective regions, and the number lengthInner of pixels taken from the inner circle region equals the number lengthOuter of pixels taken from the area between the inner and outer circles;
- a first probability value determining submodule configured to count the points in the sequence ArrayOuter smaller than NinnerMax as Num_1 and the points in the sequence ArrayInner larger than NouterMin as Num_2, calculate the probability L that the current pixel belongs to a head region according to the formula, and record the probability value;
- L = (lengthInner + lengthOuter − Num_1 − Num_2) / (lengthInner + lengthOuter);
- a second probability value determining submodule configured to increase the inner circle radius to rnew = r + r × α, where α (0 < α < 1) is the rate at which the inner radius grows, the outer radius then being n × rnew; while rnew ≤ 2R, to set r = rnew and return to the pixel value sorting submodule to calculate and record the probability for each pixel in the ROI region, the maximum recorded probability of each pixel being taken as the final probability that the pixel belongs to a head region; and when rnew > 2R, to pass to the head recognition submodule;
- a head recognition submodule configured to compare the final probability of each pixel with the first threshold and filter out pixels below the first threshold; the remaining pixels, which exist in the form of regions, are the points of head regions, and each such region is one recognized head.
- The beneficial effects of the present invention are as follows: in the pedestrian head recognition method and system provided by the invention, on the one hand, before head recognition is performed, ROI regions are first delineated in the foreground image to lock onto the potential head areas, which effectively reduces the computational load of the algorithm and improves the recognition speed; on the other hand, exploiting the fact that within the human body only the head region has the concentric-circle property, concentric circles are used to test for head regions, which improves the accuracy of head recognition, effectively avoids the influence of clothing color, hair color and similar factors on head recognition, and improves the anti-interference capability of the algorithm.
- FIG. 1 is a schematic flowchart of a pedestrian head recognition method according to an embodiment of the present invention.
- FIG. 2 is a schematic block diagram of a pedestrian head recognition system according to an embodiment of the present invention.
- FIG. 3 is a schematic flowchart of a people flow statistics method according to an embodiment of the present invention.
- FIG. 4 is a schematic block diagram of a people flow statistics system according to an embodiment of the present invention.
- The main implementation idea of the present invention is as follows: a depth camera is used to acquire a depth image of the target area, and a foreground image is extracted from the depth image by background modeling; exploiting the fact that the head region is relatively flat, the potential regions of all heads are extracted from the foreground image as ROI regions; within each ROI region, the distance of the head region from the depth camera (i.e., its pixel value) is smaller than that of the shoulders and other parts of the body, and only the head region has the concentric-circle property, so heads are recognized by constructing concentric circles. Specifically, concentric circles are constructed with each point in an ROI region as the center; for a point inside a head region, the pixel values of the pixels inside the inner circle are generally smaller than the pixel values of the pixels between the inner and outer circles. The probability that each point belongs to a head region is calculated and compared with a preset first threshold, pixels below the first threshold are filtered out, and the remaining pixels, which exist in the form of regions, are the points of head regions, each region being one recognized head.
- Step S1: a depth image collected from the target area while the depth camera is aimed vertically downward at the ground is acquired, and a foreground image is extracted from the depth image.
- The camera used to capture the target area in the present invention is a depth camera.
- Its imaging principle is time-of-flight: light pulses are continuously emitted toward the target, the light returned from the target is received by a sensor, and the round-trip time of the light pulses is measured to obtain the distance to the target. The image formed by the depth camera is therefore a relative-distance image; that is, the value of each pixel in the image is the relative distance from the target to the depth camera.
- The depth camera shoots vertically downward at the ground, which effectively reduces occlusion between pedestrians.
- Step S1 is specifically: aiming a depth camera vertically downward at the ground, collecting a depth image of the target area, obtaining the pixel value f(x, y) of the point at coordinates (x, y) in the depth image, comparing f(x, y) with the pixel value bg(x, y) at coordinates (x, y) obtained by background modeling, and combining them through the formula mask(x, y) = f(x, y) if |f(x, y) − bg(x, y)| > Tbg, otherwise 0, to obtain the pixel value mask(x, y) of the point at coordinates (x, y) in the foreground image.
- Tbg is the threshold that distinguishes the background model from the foreground image; that is, pixel values belonging to the background image are set to 0 in mask(x, y), and the points whose pixel values in mask(x, y) are not 0 are the points of the foreground image.
- The pixel value bg(x, y) at coordinates (x, y) is obtained by prior background modeling as follows: several background images of the target area are collected, and the pixel values of the points at coordinates (x, y) in those background images are averaged, yielding the background model bg(x, y) of the target area, where (x, y) denotes coordinates in the image;
- the pixel value of the point at (x, y) is the relative distance from the corresponding point in the target area to the depth camera.
- The target area is the area to be detected within the range that the depth camera can capture, and the relative distance from a point in the target area to the depth camera is taken as the pixel value of that point in the image.
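- As an illustration of step S1, the following is a minimal sketch of the background modeling and foreground extraction described above, assuming NumPy arrays of depth frames; the names build_background, extract_foreground and t_bg are illustrative, not from the patent.

```python
import numpy as np

def build_background(frames: np.ndarray) -> np.ndarray:
    """bg(x, y): pixel-wise mean over several background depth images."""
    return frames.mean(axis=0)

def extract_foreground(depth: np.ndarray, bg: np.ndarray, t_bg: float) -> np.ndarray:
    """mask(x, y) keeps f(x, y) where the depth departs from the background
    by more than Tbg, and is set to 0 (background) elsewhere."""
    return np.where(np.abs(depth - bg) > t_bg, depth, 0.0)
```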
- Step S2: the potential regions of all heads are extracted from the foreground image as regions of interest (ROI regions).
- The pixel values within a head region are relatively close to one another in the depth image, that is, the variance of the region is small; regions of the image where the pixel values are flat can therefore be delineated as ROI regions (ROI: Region Of Interest). An ROI region represents a potential head region; the later head recognition is performed only on the ROI regions, and determining the ROI regions reduces the burden of the later head discrimination.
- Step S2 specifically includes:
- Step S21: the mean m of the pixel values of the pixels in the 8-neighborhood is calculated according to the formula m = (1/8) Σₖ p(k), k = 1, …, 8, with each pixel P in the foreground image as the center point.
- An 8-neighborhood is centered on one pixel; the 8 adjacent pixels constitute the 8-neighborhood of that center point.
- p(k) is the pixel value of the k-th pixel in the neighborhood.
- Step S22: if the absolute value d of the difference between the neighborhood mean m and the pixel value p of the center point is smaller than the preset second threshold Tm, the variance v of the neighborhood is calculated according to the variance formula.
- The variance formula is: v = (1/8) Σₖ (p(k) − m)², k = 1, …, 8.
- Step S23: if the variance v is smaller than the preset third threshold Tv, the neighborhood is determined to belong to an ROI region.
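- A minimal sketch of this ROI test follows, assuming the foreground mask from step S1 as a NumPy array; the nested raster scan and the threshold names t_m and t_v are illustrative.

```python
import numpy as np

def extract_roi(mask: np.ndarray, t_m: float, t_v: float) -> np.ndarray:
    """Mark flat 8-neighborhoods (small |m - p| and small variance) as ROI."""
    h, w = mask.shape
    roi = np.zeros((h, w), dtype=bool)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            p = mask[y, x]
            if p == 0:                        # background pixel, skip
                continue
            window = mask[y - 1:y + 2, x - 1:x + 2].ravel()
            neigh = np.delete(window, 4)      # the 8 neighbours of p
            m = neigh.mean()                  # m = (1/8) * sum of p(k)
            v = ((neigh - m) ** 2).mean()     # v = (1/8) * sum of (p(k) - m)^2
            if abs(m - p) < t_m and v < t_v:  # steps S22 and S23
                roi[y - 1:y + 2, x - 1:x + 2] = True
    return roi
```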
- Step S3: taking each pixel in each ROI region as a center, concentric circles are constructed to calculate the probability that the current pixel belongs to a head region, yielding a probability value for each pixel in each ROI region; the probability value of each pixel in each ROI region is compared with a preset first threshold, pixels below the first threshold are filtered out, and the remaining pixels, which exist in the form of regions, are the points of head regions, each such region being one recognized head.
- The head region is the highest part of the body, so the pixel values of the head region are smaller than those of other parts of the body; this feature can be exploited by constructing concentric circles on the foreground image.
- Step S3 specifically includes:
- Step S31: concentric circles are constructed with each pixel in each ROI region as the center, the inner circle radius of the concentric circles being r and the outer circle radius being n × r.
- R is the average number of pixels from the center point to the edge of a human head region, obtained by statistics, and 1.2 ≤ n ≤ 1.7.
- The statistical method for obtaining the average number R of pixels from the center point to the edge of a human head region is: the depth camera is used to capture, vertically, images of a large number of pedestrians passing through the shooting area, and the radius of the pedestrian head regions is measured over this large set of pedestrian images.
- Half of the average head region radius R is taken as the inner circle radius of the concentric circles, that is, the inner circle radius is r = R/2.
- The outer circle radius is n times the inner circle radius, that is, n × r; within a certain range, the larger n is, the stricter the criterion.
- Step S32: the pixel values of the pixels obtained inside the inner circle are sorted, and the sequence ArrayInner they form is recorded, the length of ArrayInner being lengthInner, where the pixel value of the point with the largest pixel value is NinnerMax; the pixel values of the pixels obtained in the area between the inner and outer circles are sorted, and the sequence ArrayOuter they form is recorded, the length of ArrayOuter being lengthOuter, where the pixel value of the point with the smallest pixel value is NouterMin.
- The pixels obtained above are uniformly distributed within their respective regions, and the number lengthInner of pixels taken from the inner circle region equals the number lengthOuter of pixels taken from the area between the inner and outer circles.
- Step S33: the number of points in the sequence ArrayOuter smaller than NinnerMax is counted as Num_1, the number of points in the sequence ArrayInner larger than NouterMin is counted as Num_2, and the probability L that the current pixel belongs to a head region is calculated according to the formula and recorded.
- L = (lengthInner + lengthOuter − Num_1 − Num_2) / (lengthInner + lengthOuter).
- Step S34: the inner circle radius is increased to rnew = r + r × α, where α (0 < α < 1) is the rate at which the inner radius grows, the outer radius then being n × rnew; while rnew ≤ 2R, r is set to rnew and steps S32–S34 are repeated to calculate and record the probability for each pixel in the ROI region, the maximum recorded probability of each pixel being taken as the final probability that the pixel belongs to a head region; when rnew > 2R, the method goes to step S35.
- Step S35: the final probability of each pixel is compared with the first threshold, and pixels below the first threshold are filtered out; the remaining pixels, which exist in the form of regions, are the points of head regions, and each such region is one recognized head.
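- A minimal sketch of steps S31–S35 for a single candidate pixel follows, with two simplifying assumptions: pixels are sampled at uniform angles on the inner and outer circles themselves (rather than uniformly within the inner disc and the annulus), and the values of n, alpha and the sample count are illustrative.

```python
import numpy as np

def circle_samples(img, cx, cy, radius, count):
    """Sample `count` pixel values at uniform angles on a circle."""
    angles = np.linspace(0.0, 2.0 * np.pi, count, endpoint=False)
    xs = np.clip(np.round(cx + radius * np.cos(angles)).astype(int), 0, img.shape[1] - 1)
    ys = np.clip(np.round(cy + radius * np.sin(angles)).astype(int), 0, img.shape[0] - 1)
    return img[ys, xs]

def head_probability(img, cx, cy, r, R, n=1.5, alpha=0.25, count=16):
    """Grow the inner radius from r towards 2R (step S34) and keep the
    best probability L that (cx, cy) lies inside a head region."""
    best = 0.0
    while r <= 2.0 * R:
        inner = np.sort(circle_samples(img, cx, cy, r, count))      # ArrayInner
        outer = np.sort(circle_samples(img, cx, cy, n * r, count))  # ArrayOuter
        num_1 = int((outer < inner[-1]).sum())  # ArrayOuter points below NinnerMax
        num_2 = int((inner > outer[0]).sum())   # ArrayInner points above NouterMin
        L = (len(inner) + len(outer) - num_1 - num_2) / (len(inner) + len(outer))
        best = max(best, L)
        r += r * alpha                          # rnew = r + r * alpha
    return best
```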
- The foreground image extraction module 10 is configured to acquire a depth image collected from the target area while the depth camera is aimed vertically downward at the ground, and to extract the foreground image from the depth image.
- The foreground image extraction module 10 is specifically configured to: aim a depth camera vertically downward at the ground, collect a depth image of the target area, obtain the pixel value f(x, y) of the point at coordinates (x, y) in the depth image, compare f(x, y) with the pixel value bg(x, y) at coordinates (x, y) obtained by prior background modeling, and combine them through the formula mask(x, y) = f(x, y) if |f(x, y) − bg(x, y)| > Tbg, otherwise 0, to obtain the pixel value mask(x, y) of the point at coordinates (x, y) in the foreground image.
- Tbg is the threshold that distinguishes the background model from the foreground image.
- The pixel value bg(x, y) at coordinates (x, y) is obtained by prior background modeling as follows: several background images of the target area are collected, and the pixel values of the points at coordinates (x, y) in those background images are averaged.
- The pixel value of the point at coordinates (x, y) is the relative distance from the corresponding point in the target area to the depth camera.
- The ROI region extraction module 11 is configured to extract the potential regions of all heads from the foreground image as regions of interest (ROI regions).
- The ROI region extraction module 11 specifically includes:
- a mean calculation submodule configured to take each pixel P in the foreground image as a center point and calculate the mean m of the pixel values of the pixels in its 8-neighborhood according to the formula m = (1/8) Σₖ p(k), k = 1, …, 8;
- p(k) is the pixel value of the k-th pixel in the neighborhood;
- a variance calculation submodule configured to calculate the variance v of the neighborhood according to the variance formula v = (1/8) Σₖ (p(k) − m)² when the absolute value d of the difference between the neighborhood mean m and the pixel value p of the center point is less than the preset second threshold Tm;
- an ROI region determining submodule configured to determine that the neighborhood belongs to an ROI region when the variance v is less than the preset third threshold Tv.
- The head recognition module 12 is configured to identify human head regions by constructing concentric circles; specifically, taking each pixel in each ROI region as a center, constructing concentric circles to calculate the probability that the current pixel belongs to a head region, obtaining a probability value for each pixel in each ROI region, comparing the probability value of each pixel in each ROI region with a preset first threshold, and filtering out pixels below the first threshold; the remaining pixels, which exist in the form of regions, are the points of head regions, and each such region is one recognized head.
- The head recognition module 12 specifically includes:
- a concentric circle construction submodule configured to construct concentric circles with each pixel in each ROI region as the center, the inner circle radius of the concentric circles being r and the outer circle radius being n × r.
- R is the average number of pixels from the center point to the edge of a human head region, obtained by statistics, and 1.2 ≤ n ≤ 1.7.
- a pixel value sorting submodule configured to sort the pixel values of the pixels obtained inside the inner circle and record the sequence ArrayInner they form, the length of ArrayInner being lengthInner, where the pixel value of the point with the largest pixel value is NinnerMax; and to sort the pixel values of the pixels obtained in the area between the inner and outer circles and record the sequence ArrayOuter they form, the length of ArrayOuter being lengthOuter, where the pixel value of the point with the smallest pixel value is NouterMin.
- The pixels obtained above are uniformly distributed within their respective regions, and the number lengthInner of pixels taken from the inner circle region equals the number lengthOuter of pixels taken from the area between the inner and outer circles.
- a first probability value determining submodule configured to count the points in the sequence ArrayOuter smaller than NinnerMax as Num_1 and the points in the sequence ArrayInner larger than NouterMin as Num_2, calculate the probability L that the current pixel belongs to a head region according to the formula, and record the probability value;
- L = (lengthInner + lengthOuter − Num_1 − Num_2) / (lengthInner + lengthOuter);
- a second probability value determining submodule configured to increase the inner circle radius to rnew = r + r × α, where α (0 < α < 1) is the rate at which the inner radius grows, the outer radius then being n × rnew; while rnew ≤ 2R, to set r = rnew and return to the pixel value sorting submodule to calculate and record the probability for each pixel in the ROI region, the maximum recorded probability of each pixel being taken as the final probability that the pixel belongs to a head region; and when rnew > 2R, to pass to the head recognition submodule;
- a head recognition submodule configured to compare the final probability of each pixel with the first threshold and filter out pixels below the first threshold; the remaining pixels, which exist in the form of regions, are the points of head regions, and each such region is one recognized head.
- The people flow statistics method is based on the pedestrian head recognition system described above. The method is: determining the motion trajectory of each pedestrian head region identified by the pedestrian head recognition system by tracking it, and counting when the motion trajectory passes through a preset area, thereby counting the flow of people in the target area.
- The people flow statistics method specifically includes:
- Step A1: each identified head region is surrounded by a rectangular frame, the head region being inscribed in the rectangular frame.
- Step A2: joint similarity is calculated between each head region in the previous frame of the foreground image and every head region in the subsequent frame.
- Tracking of a head target relies on calculating the joint similarity between the coordinates of the intersection of the diagonals of the head's rectangular frame and the size of the head region's area across two consecutive frames.
- P_associate(d1, d2), A_position(d1, d2) and A_area(d1, d2) denote the joint similarity, the position similarity and the area similarity, respectively, where A_position(d1, d2) and A_area(d1, d2) are calculated from the following quantities:
- (x1, y1) and (x2, y2) denote the coordinates of the diagonal intersections of any pair of head regions d1 and d2 in two consecutive frames;
- s1 and s2 denote the areas of the head regions d1 and d2 in the two consecutive frames; a_x and a_y denote the variances, on the X-axis and Y-axis, of the diagonal-intersection coordinates of the rectangles of all head regions in the two consecutive frames; and a_s denotes the variance of the areas of all head regions in the two consecutive frames.
- Step A3: the maximum of the joint similarities calculated between each head region in the previous frame and all head regions in the subsequent frame is compared with a threshold. If it is greater than the threshold, the head region in the previous frame is successfully matched with the head region in the subsequent frame that yields the maximum joint similarity; if it is less than the threshold, the matching fails and the target is lost.
- Step A4: the diagonal intersections of the rectangular frames of the two head regions successfully matched in each pair of consecutive frames are connected, thereby determining the motion trajectory of the head region.
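- The source text names the quantities that enter A_position and A_area but does not reproduce the similarity formulas themselves; the sketch below therefore assumes Gaussian kernels over the position and area differences, scaled by the variances a_x, a_y and a_s, as one plausible reading rather than the patent's exact formula.

```python
import math

def joint_similarity(d1, d2, a_x, a_y, a_s):
    """d1, d2: (x, y, s) tuples giving the diagonal-intersection coordinates
    and the area of a head rectangle in two consecutive frames."""
    (x1, y1, s1), (x2, y2, s2) = d1, d2
    a_position = math.exp(-((x1 - x2) ** 2 / (2 * a_x) + (y1 - y2) ** 2 / (2 * a_y)))
    a_area = math.exp(-((s1 - s2) ** 2) / (2 * a_s))
    return a_position * a_area  # P_associate(d1, d2)

def match_heads(prev_heads, cur_heads, a_x, a_y, a_s, threshold):
    """Step A3: keep, for each head of the previous frame, the best match in
    the current frame if its joint similarity exceeds the threshold."""
    matches = []
    for i, d1 in enumerate(prev_heads):
        sims = [joint_similarity(d1, d2, a_x, a_y, a_s) for d2 in cur_heads]
        if sims:
            j = max(range(len(sims)), key=sims.__getitem__)
            if sims[j] > threshold:
                matches.append((i, j))  # otherwise the target is lost
    return matches
```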
- Step A5: when a motion trajectory is detected passing through the preset area, the people counter is incremented, thereby counting the flow of people in the target area.
- The preset area is the region formed by two parallel virtual decision lines L1 and L2 placed on the frame images of the foreground image, together with the edges of the frame; counting is triggered when a motion trajectory is detected crossing L1 and L2 in succession. More specifically, when a motion trajectory is detected crossing the virtual decision lines in the order L1 then L2, the entry counter is incremented; when it is detected crossing them in the order L2 then L1, the exit counter is incremented. The motion trajectories of all head regions are detected and counted in this way, and the results of the two counters are output in real time.
- The distance between L1 and L2 is twice the length of the top region of a standard adult head, and the center line between L1 and L2 is the center line of the frame image.
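- A minimal sketch of the counting rule follows, assuming horizontal decision lines at the y-coordinates l1_y and l2_y and trajectories given as lists of diagonal-intersection points; the entry/exit naming follows the order-of-crossing rule above.

```python
def count_crossings(tracks, l1_y, l2_y):
    """tracks: list of trajectories, each a list of (x, y) points over time.
    Crossing L1 then L2 counts as entering; L2 then L1 as leaving."""
    entered = left = 0
    for track in tracks:
        order = []
        for (_, y_prev), (_, y_cur) in zip(track, track[1:]):
            for y_line in (l1_y, l2_y):
                if min(y_prev, y_cur) <= y_line < max(y_prev, y_cur):
                    order.append(y_line)
        if order[:2] == [l1_y, l2_y]:
            entered += 1
        elif order[:2] == [l2_y, l1_y]:
            left += 1
    return entered, left
```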
- The following describes a people flow statistics system, which is based on the pedestrian head recognition system described above. The system is configured to determine the motion trajectory of each head region identified by the pedestrian head recognition system by tracking it, and to count when the motion trajectory passes through a preset area, thereby counting the flow of people in the target area.
- The people flow statistics system specifically includes:
- a head region framing module 20 configured to surround each identified head region with a rectangular frame, the head region being inscribed in the rectangular frame.
- a joint similarity calculation module 21 configured to calculate the joint similarity between each head region in the previous frame image and every head region in the subsequent frame image.
- The joint similarity is the joint similarity between the coordinates of the intersection of the diagonals of the head's rectangular frame and the size of the head region's area across two consecutive frames.
- P_associate(d1, d2), A_position(d1, d2) and A_area(d1, d2) denote the joint similarity, the position similarity and the area similarity, respectively, where A_position(d1, d2) and A_area(d1, d2) are calculated from the following quantities:
- (x1, y1) and (x2, y2) denote the coordinates of the diagonal intersections of any pair of head regions d1 and d2 in two consecutive frames;
- s1 and s2 denote the areas of the head regions d1 and d2 in the two consecutive frames; a_x and a_y denote the variances, on the X-axis and Y-axis, of the diagonal-intersection coordinates of the rectangles of all head regions in the two frames; and a_s denotes the variance of the areas of all head regions in the two consecutive frames.
- The head region matching module 22 is configured to compare the maximum of the joint similarities calculated between each head region and all head regions in the subsequent frame image with a threshold; if it is greater than the threshold, the head region in the previous frame is successfully matched with the head region yielding the maximum joint similarity; if it is less than the threshold, the matching fails and the target is lost.
- The motion trajectory determining module 23 is configured to connect the diagonal intersections of the rectangular frames of the two head regions successfully matched in each pair of consecutive frames, thereby determining the motion trajectory of the head region.
- The people flow statistics module 24 is configured to count when a motion trajectory is detected passing through the preset area, thereby counting the flow of people in the target area.
- The preset area is the region formed by two parallel virtual decision lines L1 and L2 placed on the frame images of the foreground image, together with the edges of the frame; when a motion trajectory is detected crossing the virtual decision lines in the order L1 then L2, the entry counter is incremented; when it is detected crossing them in the order L2 then L1, the exit counter is incremented; the results of the two counters are output in real time.
- The distance between L1 and L2 is twice the length of the top region of a standard adult head, and the center line between L1 and L2 is the center line of the frame image.
- The system can provide people flow statistics services for places with dense traffic such as airports, shopping malls and railway stations.
- In the pedestrian head recognition method provided by the present invention, on the one hand, before head recognition is performed, ROI regions are first delineated in the foreground image to lock onto the potential head areas, which effectively reduces the computational load of the algorithm and improves the recognition speed; on the other hand, exploiting the fact that within the human body only the head region has the concentric-circle property, concentric circles are used to test for head regions, which improves the head recognition accuracy, effectively avoids the influence of clothing color, hair color and similar factors on head recognition, and improves the anti-interference capability of the algorithm.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Human Computer Interaction (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Image Analysis (AREA)
- Traffic Control Systems (AREA)
- Studio Devices (AREA)
Claims (8)
- A pedestrian head recognition method, comprising: step S1, acquiring a depth image collected from a target area while a depth camera is aimed vertically downward at the ground, and extracting a foreground image from the depth image; step S2, extracting the potential regions of all heads from the foreground image as regions of interest (ROI regions); step S3, taking each pixel in each ROI region as a center, constructing concentric circles to calculate the probability that the current pixel belongs to a head region, so as to obtain a probability value for each pixel in each ROI region, comparing the probability value of each pixel in each ROI region with a preset first threshold, and filtering out pixels below the first threshold, wherein the remaining pixels, which exist in the form of regions, are the points of head regions, and each such region is one recognized head.
- The pedestrian head recognition method according to claim 1, wherein step S1 is specifically: aiming a depth camera vertically downward at the ground, collecting a depth image of the target area, obtaining the pixel value f(x,y) of the point at coordinates (x,y) in the depth image, comparing the pixel value f(x,y) with the pixel value bg(x,y) at coordinates (x,y) obtained by prior background modeling, and combining them through the formula to obtain the pixel value mask(x,y) of the point at coordinates (x,y) in the foreground image, wherein Tbg is the threshold distinguishing the background model from the foreground image; the pixel value bg(x,y) at coordinates (x,y) is obtained by prior background modeling as follows: several background images of the target area are collected, and the pixel values of the points at coordinates (x,y) in those background images are averaged; the pixel value of the point at coordinates (x,y) is the relative distance from the corresponding point in the target area to the depth camera.
- The pedestrian head recognition method according to claim 1, wherein step S3 specifically comprises: step S31, constructing concentric circles with each pixel in each ROI region as the center, the inner circle radius of the concentric circles being r and the outer circle radius being n×r; step S32, sorting the pixel values of the pixels obtained inside the inner circle and recording the sequence ArrayInner they form, the length of ArrayInner being lengthInner, wherein the pixel value of the point with the largest pixel value is NinnerMax, and sorting the pixel values of the pixels obtained in the area between the inner and outer circles and recording the sequence ArrayOuter they form, the length of ArrayOuter being lengthOuter, wherein the pixel value of the point with the smallest pixel value is NouterMin, the pixels obtained above being uniformly distributed within their respective regions and the number lengthInner of pixels taken from the inner circle region being equal to the number lengthOuter of pixels taken from the area between the inner and outer circles; step S33, counting the points in the sequence ArrayOuter smaller than NinnerMax as Num_1, counting the points in the sequence ArrayInner larger than NouterMin as Num_2, calculating the probability L that the current pixel belongs to a head region according to the formula L=(lengthInner+lengthOuter−Num_1−Num_2)/(lengthInner+lengthOuter), and recording the probability value; step S34, increasing the inner circle radius of the concentric circles to rnew, rnew=r+r×α, wherein α denotes the rate at which the inner circle radius r grows and 0<α<1, the outer circle radius then being n×rnew; when rnew≤2R, setting r=rnew and repeating steps S32–S34 to calculate the probability that each pixel in the ROI region belongs to a head region and recording the probability value, the maximum recorded probability of each pixel being taken as the final probability that the pixel belongs to a head region; when rnew>2R, going to step S35; step S35, comparing the final probability of each pixel with the first threshold and filtering out pixels below the first threshold, wherein the remaining pixels, which exist in the form of regions, are the points of head regions, and each such region is one recognized head.
- A pedestrian head recognition system, comprising: a foreground image extraction module configured to acquire a depth image collected from a target area while a depth camera is aimed vertically downward at the ground, and to extract a foreground image from the depth image; an ROI region extraction module configured to extract the potential regions of all heads from the foreground image as regions of interest (ROI regions); and a head recognition module configured to identify human head regions by constructing concentric circles, specifically, taking each pixel in each ROI region as a center, constructing concentric circles to calculate the probability that the current pixel belongs to a head region, so as to obtain a probability value for each pixel in each ROI region, comparing the probability value of each pixel in each ROI region with a preset first threshold, and filtering out pixels below the first threshold, wherein the remaining pixels, which exist in the form of regions, are the points of head regions, and each such region is one recognized head.
- The pedestrian head recognition system according to claim 5, wherein the foreground image extraction module is specifically configured to: aim a depth camera vertically downward at the ground, collect a depth image of the target area, obtain the pixel value f(x,y) of the point at coordinates (x,y) in the depth image, compare the pixel value f(x,y) with the pixel value bg(x,y) at coordinates (x,y) obtained by prior background modeling, and combine them through the formula to obtain the pixel value mask(x,y) of the point at coordinates (x,y) in the foreground image, wherein Tbg is the threshold distinguishing the background model from the foreground image; the pixel value bg(x,y) at coordinates (x,y) is obtained by prior background modeling as follows: several background images of the target area are collected, and the pixel values of the points at coordinates (x,y) in those background images are averaged; the pixel value of the point at coordinates (x,y) is the relative distance from the corresponding point in the target area to the depth camera.
- The pedestrian head recognition system according to claim 5, wherein the head recognition module specifically comprises: a concentric circle construction submodule configured to construct concentric circles with each pixel in each ROI region as the center, the inner circle radius of the concentric circles being r and the outer circle radius being n×r; a pixel value sorting submodule configured to sort the pixel values of the pixels obtained inside the inner circle and record the sequence ArrayInner they form, the length of ArrayInner being lengthInner, wherein the pixel value of the point with the largest pixel value is NinnerMax, and to sort the pixel values of the pixels obtained in the area between the inner and outer circles and record the sequence ArrayOuter they form, the length of ArrayOuter being lengthOuter, wherein the pixel value of the point with the smallest pixel value is NouterMin, the pixels obtained above being uniformly distributed within their respective regions and the number lengthInner of pixels taken from the inner circle region being equal to the number lengthOuter of pixels taken from the area between the inner and outer circles; a first probability value determining submodule configured to count the points in the sequence ArrayOuter smaller than NinnerMax as Num_1, count the points in the sequence ArrayInner larger than NouterMin as Num_2, calculate the probability L that the current pixel belongs to a head region according to the formula L=(lengthInner+lengthOuter−Num_1−Num_2)/(lengthInner+lengthOuter), and record the probability value; a second probability value determining submodule configured to increase the inner circle radius of the concentric circles to rnew, rnew=r+r×α, wherein α denotes the rate at which the inner circle radius r grows and 0<α<1, the outer circle radius then being n×rnew; when rnew≤2R, to set r=rnew and return to the pixel value sorting submodule to calculate the probability that each pixel in the ROI region belongs to a head region and record the probability value, the maximum recorded probability of each pixel being taken as the final probability that the pixel belongs to a head region; and when rnew>2R, to proceed to the head recognition submodule; and a head recognition submodule configured to compare the final probability of each pixel with the first threshold and filter out pixels below the first threshold, wherein the remaining pixels, which exist in the form of regions, are the points of head regions, and each such region is one recognized head.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2016/112383 WO2018119668A1 (zh) | 2016-12-27 | 2016-12-27 | Pedestrian head recognition method and system |
JP2018519925A JP6549797B2 (ja) | 2016-12-27 | 2016-12-27 | Pedestrian head identification method and system |
US15/832,715 US10445567B2 (en) | 2016-12-27 | 2017-12-05 | Pedestrian head identification method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2016/112383 WO2018119668A1 (zh) | 2016-12-27 | 2016-12-27 | Pedestrian head recognition method and system |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/832,715 Continuation US10445567B2 (en) | 2016-12-27 | 2017-12-05 | Pedestrian head identification method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018119668A1 true WO2018119668A1 (zh) | 2018-07-05 |
Family
ID=62629663
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2016/112383 WO2018119668A1 (zh) | 2016-12-27 | 2016-12-27 | Pedestrian head recognition method and system |
Country Status (3)
Country | Link |
---|---|
US (1) | US10445567B2 (zh) |
JP (1) | JP6549797B2 (zh) |
WO (1) | WO2018119668A1 (zh) |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4751442B2 (ja) * | 2008-12-24 | 2011-08-17 | 株式会社東芝 | Video surveillance system |
JP5349080B2 (ja) * | 2009-02-27 | 2013-11-20 | 株式会社東芝 | Admission management system, admission management apparatus, and admission management method |
WO2016199244A1 (ja) * | 2015-06-10 | 2016-12-15 | 株式会社日立製作所 | Object recognition device and object recognition system |
2016
- 2016-12-27: WO PCT/CN2016/112383 patent/WO2018119668A1/zh active Application Filing
- 2016-12-27: JP JP2018519925A patent/JP6549797B2/ja not_active Expired - Fee Related
2017
- 2017-12-05: US US15/832,715 patent/US10445567B2/en not_active Expired - Fee Related
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130272576A1 (en) * | 2011-09-30 | 2013-10-17 | Intel Corporation | Human head detection in depth images |
CN102930524A (zh) * | 2012-09-11 | 2013-02-13 | 无锡数字奥森科技有限公司 | Human head detection method based on a vertically mounted depth camera |
CN105096259A (zh) * | 2014-05-09 | 2015-11-25 | 株式会社理光 | Depth value restoration method and system for depth images |
CN105138979A (zh) * | 2015-08-19 | 2015-12-09 | 南京理工大学 | Moving human head detection method based on stereo vision |
Also Published As
Publication number | Publication date |
---|---|
JP6549797B2 (ja) | 2019-07-24 |
US10445567B2 (en) | 2019-10-15 |
US20180181803A1 (en) | 2018-06-28 |
JP2019505866A (ja) | 2019-02-28 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| ENP | Entry into the national phase | Ref document number: 2018519925; Country of ref document: JP; Kind code of ref document: A
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 16925413; Country of ref document: EP; Kind code of ref document: A1
| NENP | Non-entry into the national phase | Ref country code: DE
| 32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 16.09.2019)
| 122 | Ep: pct application non-entry in european phase | Ref document number: 16925413; Country of ref document: EP; Kind code of ref document: A1