CN109035305B - Indoor human body detection and tracking method based on RGB-D low-visual-angle condition - Google Patents


Info

Publication number
CN109035305B
CN109035305B (granted); application number CN201810908661.2A
Authority
CN
China
Prior art keywords
cluster
point cloud
human body
points
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810908661.2A
Other languages
Chinese (zh)
Other versions
CN109035305A (en
Inventor
袁泽慧
段荣杰
安晓红
李世中
张亚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
North University of China
Original Assignee
North University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by North University of China filed Critical North University of China
Priority to CN201810908661.2A
Publication of CN109035305A
Application granted
Publication of CN109035305B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/251 Analysis of motion using feature-based methods involving models
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person

Abstract

The invention relates to an indoor human body detection and tracking method under RGB-D low-visual-angle conditions, belonging to the technical field of human body detection. The method comprises the following steps: collecting a 3D point cloud with an ASUS Xtion Pro; performing noise reduction and down-sampling on the 3D point cloud; detecting and removing the ground; performing 3D clustering using the Euclidean distance between points; computing the HOG features of each cluster and feeding them to a pre-trained SVM binary soft classifier, classifying clusters with high HOG confidence as people, thereby achieving human body detection; and finally realizing human body tracking using a joint likelihood probability formed by color consistency and distance consistency in the data association process. The invention has high precision and is widely applicable to indoor human body detection and tracking under low-visual-angle conditions.

Description

Indoor human body detection and tracking method based on RGB-D low-visual-angle condition
Technical Field
The invention relates to an indoor human body detection and tracking method based on RGB-D low visual angle, belonging to the technical field of human body detection.
Background
Human detection and tracking is key to the tasks of mobile robots in indoor environments. A mobile robot must be able to distinguish human bodies from other obstacles in order to adjust its trajectory according to its task. A service robot, for example, must provide assistance to a particular person in a particular environment.
Currently, there are methods that detect and track the human body using only an RGB-D depth camera or radar ranging. With the advent of RGB-D depth cameras such as the Microsoft Kinect or ASUS Xtion Pro, which can capture 640 × 480 pixel images at a rate of 30 frames/s, and owing to their low power consumption, these sensors have in recent years been widely used for robot 3D perception and indoor positioning and recognition.
Several of the more successful human detection algorithms are based on the whole human body, especially when the head is visible. In some cases, such as when the robot is particularly close to the subject, a large portion of the subject may be outside the sensor's sensing range. Also, when the RGB-D sensor is mounted on a small robot such as the Turtlebot 2, its line of sight is very close to the ground, so only the lower body or only the legs of a person are observed. Human detection and tracking is very difficult at low viewing angles, mainly because major features are lost and no obvious features remain to distinguish human bodies from other objects, such as table legs and chairs. Based on the observation that when an RGB-D sensor is installed on a Turtlebot 2 and a person is very close to the robot (distance < 100 cm), in most cases the person is visible from the feet to the waist, we propose a human detection and tracking algorithm for the low-viewing-angle case.
Disclosure of Invention
In order to solve the technical problems in the prior art, the invention provides an indoor human body detection and tracking method based on the RGB-D low-visual-angle condition, which clusters objects on the ground at a limited height, distinguishes people or objects by combining HOG characteristics and an SVM classifier, then tracks a human body by using combined likelihood probability formed by color consistency and distance consistency, and has higher accuracy.
In order to achieve the above object, the technical solution adopted by the present invention is an indoor human body detection and tracking method based on RGB-D low viewing angle, comprising the following steps,
step a. data acquisition and preprocessing
Performing point cloud collection using the ASUS Xtion Pro to obtain a dense 3D point cloud, performing noise reduction on the dense 3D point cloud through a pass-through filter, and down-sampling the noise-reduced 3D point cloud through a three-dimensional voxel grid filter;
step b, ground detection and filtering
Detecting the ground in the 3D point cloud processed in step a using a RANSAC-based least squares method, wherein for non-first-frame data the ground parameters obtained from the previous frame's detection can be used as the initial parameters of the next frame; performing ground detection on the current frame data using the RANSAC method according to the given initial ground parameters and a preset initial distance threshold, and filtering the ground data out of the 3D point cloud according to the indices of all three-dimensional points lying on the ground; for the first frame data, determining the relation to the ground according to the mounting position of the RGB-D sensor on the robot, thereby giving the initial ground parameters;
step c. clustering
Acquiring point cloud data within 130 cm of the ground from the 3D point cloud with the ground removed, then performing 3D clustering using the Euclidean distance between points, where two points are defined to belong to the same class when the Euclidean distance between them is smaller than a predefined distance threshold; clustering requires two initial thresholds: the distance below which two points are considered to belong to the same cluster, and the minimum number of points required to form a cluster;
step d. HOG + SVM classification
Projecting the point cloud inside each bounding box after 3D clustering onto the RGB image, computing the HOG descriptor of the resulting image patch, then feeding the obtained HOG descriptor to a pre-trained SVM classifier and computing the HOG confidence of each cluster; when the computed HOG confidence is higher than a set threshold, the cluster is judged to be a person, otherwise it is judged not to be a person;
step e. tracking
The obtained human body clusters are used as the input of the tracking module, i.e., as the objects to be tracked next; the human body clusters detected in each frame are then matched against the existing tracked objects, and the maximum likelihood probability between the currently detected human body and the known tracked objects is computed using a method combining distance consistency and color consistency.
Preferably, in step c, when clustering is performed: 1) a Kd-tree of the 3D point cloud is first created as the search method used in the subsequent point cloud extraction; 2) an empty cluster list C and an empty point queue Q are set up; 3) for each point p in the point cloud, all neighboring points within a sphere centered at p with a preset distance threshold as radius are searched; each neighbor found is first checked for membership in another cluster, and if it belongs to none, it is added to queue Q; 4) after all points in queue Q have been processed, queue Q is added to cluster list C; 5) clustering finishes once all points in the initial point cloud have been processed.
Preferably, in step c, after clustering is completed, the over-clustering and under-clustering problems need further treatment;
the process of over-clustering is as follows: for each resulting cluster CiFirst, the projection p of its center point on the XZ plane is calculatediIf p isiAnd cluster CjCentral projection point p ofjIs less than a set threshold, then cluster C is considered to beiAnd CjBelong to the same cluster and then combine the clusters;
the under-clustering process is as follows: for each cluster, calculating geometrical information of the cluster, wherein the geometrical information specifically comprises width, depth and height information; if the geometric information of some clusters is far larger than a set threshold value, further dividing the clusters by using color information, namely classifying the points with the same color into the same class; for clusters where there are too few points, discard directly.
Preferably, in step e,
Distance consistency is defined as follows: given a human detection cluster C_i, the nearest tracked object T_j is found by global nearest neighbor data association; if their distance is less than a threshold, the detected cluster is considered linked to the tracked object. Then, for each point p_{i,j} in the tracked object T_j, the nearest point p_{j,i} in the detection target C_i is found using an octree and the distance between them is computed. The distance consistency probability between points p_{i,j} and p_{j,i} is defined as follows:
[Equation rendered as an image in the original: the distance consistency probability L_d(p_{i,j}, p_{j,i})]
where α is a weight vector;
Color consistency is defined as follows: when comparing the color information of the current detection cluster C_i and the tracked object T_j, the color consistency between the nearest pair of points <p_{j,i}, p_{i,j}> is computed; it can be calculated in RGB, HSV, or other color spaces. Taking HSV space as an example, the color consistency probability between points p_{i,j} and p_{j,i} is defined as follows:
[Equation rendered as an image in the original: the color consistency probability L_c(p_{i,j}, p_{j,i})]
where c_{i,j} and c_{j,i} denote the HSV information of p_{i,j} and p_{j,i} respectively, and β denotes a weight;
The joint consistency probability between p_{i,j} and p_{j,i} is defined as:
L(p_{i,j}, p_{j,i}) = L_d(p_{i,j}, p_{j,i}) · L_c(p_{i,j}, p_{j,i});
For each tracked object T_j and detection cluster C_i, the maximum joint likelihood probability is defined as:
L(j, i) = max over point pairs <p_{i,j}, p_{j,i}> of L(p_{i,j}, p_{j,i});
If L(j, i) is higher than the set threshold, the current cluster C_i and the tracked object T_j are the same person; otherwise, if no tracked object associated with C_i is found, a new tracked object is created.
Compared with the prior art, the invention has the following technical effects. The invention provides a human body detection algorithm that takes the lower half of the human body as the main feature, targeting the case where only the lower half of the body is visible, i.e., when the RGB-D sensor sits at a low viewing angle close to the ground or when the detected subject is close to the sensor. The algorithm effectively improves the accuracy of human body detection. Based on the common-sense fact that humans move on the ground, the ground in the scene is first detected and removed, objects on the ground are clustered up to a limited height, HOG features are computed and fed to a pre-trained SVM binary soft classifier, and objects with high HOG confidence values are classified as people while the rest are classified as non-human. The detected human body results are then used as the input to the tracking module; the tracked object closest to the currently detected human body cluster is sought using the joint likelihood probability formed by color consistency and distance consistency, and when the maximum likelihood probability is greater than a set threshold, the currently detected human body cluster and the tracked object are considered to be the same person.
Drawings
FIG. 1 is a flow chart of the present invention.
FIG. 2 is a diagram of the dense 3D point cloud data collected by the present invention.
FIG. 3 is a point cloud diagram of the dense 3D point cloud data of the present invention after being preprocessed.
Fig. 4 is a schematic diagram of the recognition situation under different environments.
Detailed Description
In order to make the technical problems, technical solutions and advantageous effects to be solved by the present invention more clearly apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in fig. 1, an indoor human body detection and tracking method based on RGB-D low viewing angle condition includes the following steps,
Step a. Data acquisition and preprocessing
As shown in fig. 2, a dense 3D point cloud is obtained by performing point cloud collection with the ASUS Xtion Pro. Owing to various error sources, in particular the discretization effect in the depth measurement and the fact that the camera is calibrated only within a certain range, a large amount of noise is present in the acquired initial RGB-D point cloud, and each acquired frame contains 307200 points, corresponding to the 640 × 480 size of its depth image. Therefore, to improve the processing speed and accuracy, points of no interest are first removed with a pass-through filter; for example, according to the parameters of the Xtion Pro, points beyond 5 m in the z direction are filtered out. A three-dimensional voxel grid filter is then used to down-sample the point cloud: the algorithm approximates all points inside a voxel by their center of gravity, realizing the down-sampling of the point cloud data and reducing the computational load. The side length of a voxel (i.e., a three-dimensional cube) is set to 0.06 m. Down-sampling compresses the point cloud data acquired by the RGB-D sensor by an order of magnitude, greatly shortening later point cloud processing, while making the density of the down-sampled data the same everywhere. The processed point cloud data is shown in fig. 3.
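The centroid-per-voxel down-sampling described above (0.06 m leaf size) can be sketched as follows. This is a minimal NumPy illustration, not the actual voxel grid filter the authors used (presumably PCL's VoxelGrid); `voxel_downsample` is a name chosen here.

```python
import numpy as np

def voxel_downsample(points, leaf=0.06):
    """Down-sample an (N, 3) point cloud by replacing all points in each
    voxel (cube of side `leaf`, 0.06 m in the text) with their centroid."""
    idx = np.floor(points / leaf).astype(np.int64)      # voxel index per point
    # Group points by voxel index and average each group.
    _, inverse = np.unique(idx, axis=0, return_inverse=True)
    n_voxels = inverse.max() + 1
    sums = np.zeros((n_voxels, 3))
    counts = np.zeros(n_voxels)
    np.add.at(sums, inverse, points)
    np.add.at(counts, inverse, 1)
    return sums / counts[:, None]
```

Because every occupied voxel contributes exactly one point, the output density is roughly uniform, which matches the remark that the down-sampled data has the same density everywhere.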
Step b, ground detection and filtering
This step is based on the assumption that a person walks on the ground, so we first detect the ground in the filtered point cloud and remove it. When the RGB-D sensor is fixedly mounted on the mobile robot, its position relative to the ground is approximately known; based on this, initial ground parameters are set when the first frame's point cloud is processed, the ground in the 3D point cloud is detected with a RANSAC-based least squares method starting from those initial parameters, and the updated ground parameters obtained are used as the initial parameters for the next frame. This allows the ground parameters to be updated in real time in the robot coordinate system. In theory, the ground is strictly a plane whose parameters are fixed relative to the robot, i.e., in the robot coordinate system; in practice, however, camera vibration during robot motion and slight inclinations of the actual ground make the ground parameters differ in the robot coordinate system at different times.
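The RANSAC plane detection above can be sketched as below. The threshold and iteration count are illustrative; a full implementation would also refine the plane by least squares over the inliers and warm-start from the previous frame's parameters, as the text describes.

```python
import numpy as np

def ransac_plane(points, threshold=0.05, iters=200, rng=None):
    """Fit a plane n·x + d = 0 to an (N, 3) point cloud with RANSAC.
    Returns (unit normal, d, inlier mask). `threshold` is the inlier
    distance; values here are illustrative, not the patent's."""
    rng = np.random.default_rng(rng)
    best_inliers = np.zeros(len(points), dtype=bool)
    best_model = None
    for _ in range(iters):
        sample = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(n)
        if norm < 1e-9:            # degenerate (collinear) sample
            continue
        n = n / norm
        d = -n.dot(sample[0])
        inliers = np.abs(points @ n + d) < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers, best_model = inliers, (n, d)
    return best_model[0], best_model[1], best_inliers
```

Once the plane is found, the inlier mask gives exactly the "indices of all three-dimensional points lying on the ground" used to filter the cloud.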
Step c. clustering
Acquiring point cloud data within 130 cm of the ground from the 3D point cloud with the ground removed, then performing 3D clustering using the Euclidean distance between points, where two points are defined to belong to the same class when the Euclidean distance between them is smaller than a predefined distance threshold; clustering requires two initial thresholds: the distance below which two points are considered to belong to the same cluster, and the minimum number of points required to form a cluster.
Once the ground is detected, the three-dimensional points belonging to it can be removed, so that objects on the ground are no longer connected to it and appear as disconnected objects. Since our human body detection method uses the lower part of the human body as the feature description, the point cloud data to be analyzed is limited to points within 130 cm of the ground, and 3D clustering is then performed using the Euclidean distance between points.
In clustering: 1) a Kd-tree of the 3D point cloud is first created as the search method for the subsequent point cloud extraction; 2) an empty cluster list C and an empty point queue Q are set up; 3) for each point p in the point cloud, all neighboring points within a sphere centered at p with a preset distance threshold as radius are searched; each neighbor found is first checked for membership in another cluster, and if it belongs to none, it is added to queue Q; 4) after all points in queue Q have been processed, queue Q is added to cluster list C; 5) clustering finishes once all points in the initial point cloud have been processed.
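Steps 1)-5) can be sketched compactly with SciPy's Kd-tree; the tolerance and minimum cluster size below are illustrative values, not the patent's.

```python
import numpy as np
from scipy.spatial import cKDTree

def euclidean_cluster(points, tol=0.1, min_size=10):
    """Euclidean clustering as in steps 1)-5): a Kd-tree radius search
    grows each cluster through a queue of unprocessed points."""
    tree = cKDTree(points)                 # 1) build the Kd-tree
    processed = np.zeros(len(points), dtype=bool)
    clusters = []                          # 2) empty cluster list C
    for seed in range(len(points)):
        if processed[seed]:
            continue
        queue = [seed]                     # seed a new queue Q
        processed[seed] = True
        i = 0
        while i < len(queue):              # 3) expand via radius search
            for nb in tree.query_ball_point(points[queue[i]], tol):
                if not processed[nb]:
                    processed[nb] = True
                    queue.append(nb)
            i += 1
        if len(queue) >= min_size:         # 4) accept queue Q into list C
            clusters.append(queue)
    return clusters                        # 5) all points processed
```

Each returned cluster is a list of point indices; the minimum-size check plays the role of the second initial threshold described above.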
After Euclidean clustering, two typical problems arise in practice: over-clustering and under-clustering. (1) Over-clustering: a point cloud that should belong to a single object is segmented into multiple clusters, mainly due to noise, occlusion, and missing depth data. (2) Under-clustering, as the name implies: points that should belong to two or more different objects are improperly grouped into the same cluster. For example, when a person stands very close to a background wall, cabinet, or table, the person's points are often lumped in with the background or furniture. In our experiments, since only points within 130 cm of the ground are considered, it is not uncommon for the points of two different people to be merged together by the clustering operation.
To solve these two problems, the resulting clusters need further processing after the initial Euclidean clustering.
For the over-clustering problem, after clustering is completed, for each obtained cluster C_i the projection p_i of its center point on the XZ plane is first computed; if the distance between p_i and the center projection p_j of cluster C_j is less than a set threshold, clusters C_i and C_j are considered to belong to the same cluster and are merged.
For under-clustering, color information is used to separate a person and background that have been grouped into one class. This is based on the assumption that the color of the person's trousers differs from the background color, for example blue jeans against white walls. The specific algorithm is as follows: for each cluster, its geometric information is computed, specifically the width, depth, and height; if the geometric information of a cluster far exceeds a set threshold, the cluster is further divided using color information, i.e., points with the same color are grouped into the same class; likewise, clusters with too few points are discarded directly in this operation.
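The XZ-center merge used against over-clustering can be sketched as follows; the merge tolerance is an illustrative value, and the color-based split for under-clustering would follow the same pattern on HSV values.

```python
import numpy as np

def merge_overclusters(clusters, merge_tol=0.3):
    """Merge over-segmented clusters (lists/arrays of 3D points): if the
    XZ-plane projections of two cluster centers are closer than
    `merge_tol` (illustrative), treat them as one object."""
    # Project each cluster's center point onto the XZ plane.
    centers = [np.asarray(c).mean(axis=0)[[0, 2]] for c in clusters]
    merged, used = [], [False] * len(clusters)
    for i in range(len(clusters)):
        if used[i]:
            continue
        group = list(clusters[i])
        used[i] = True
        for j in range(i + 1, len(clusters)):
            if not used[j] and np.linalg.norm(centers[i] - centers[j]) < merge_tol:
                group.extend(clusters[j])   # C_i and C_j belong together
                used[j] = True
        merged.append(group)
    return merged
```

Projecting onto XZ deliberately ignores height, so an upper-body fragment and a leg fragment of the same person merge even though their Y extents differ.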
Step d. HOG + SVM classification
Projecting the point cloud inside each bounding box after 3D clustering onto the RGB image, computing the HOG descriptor of the resulting image patch, then feeding the obtained HOG descriptor to a pre-trained SVM classifier and computing the HOG confidence of each cluster; when the computed HOG confidence is higher than a set threshold, the cluster is judged to be a person, otherwise it is judged not to be a person;
step e. tracking
The obtained human body clusters are used as the input of the tracking module, i.e., as the objects to be tracked next; the detected human body clusters are then matched against the existing tracked objects, specifically by computing the maximum likelihood probability between the currently detected human body and the known tracked objects using a method combining distance consistency and color consistency.
The tracking algorithm estimates the trajectory of each target using a particle filter. Assuming that a person moves on the ground, the tracked state of each person is represented as a 2D transform, i.e., the position (x, y) of the center of gravity and the rotation angle θ. The motion model is set to constant-velocity motion, since this model copes well with full occlusion.
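A constant-velocity prediction step for the (x, y, θ) state described above might look like the sketch below. The particle layout (x, y, θ, vx, vy) and the noise levels are assumptions made here for illustration, not the patent's actual particle filter.

```python
import numpy as np

def predict_particles(particles, dt, pos_noise=0.05, ang_noise=0.02, rng=None):
    """Constant-velocity prediction for (M, 5) particles laid out as
    (x, y, theta, vx, vy); noise magnitudes are illustrative."""
    rng = np.random.default_rng(rng)
    particles = particles.copy()
    m = len(particles)
    particles[:, 0] += particles[:, 3] * dt + rng.normal(0, pos_noise, m)
    particles[:, 1] += particles[:, 4] * dt + rng.normal(0, pos_noise, m)
    particles[:, 2] += rng.normal(0, ang_noise, m)   # theta as a random walk
    return particles
```

During full occlusion no measurement update is available, and this prediction alone carries the hypothesis forward, which is why a constant-velocity model suits that case.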
When a cluster is classified as human but is not associated with any known tracked object, a new tracked object is created. Naturally, every human body detected in the first frame is initialized as a new tracked object.
To match the person detected in the current frame with the known tracked objects, consistency based on color and distance is computed as follows:
Distance consistency is defined as follows: given a human detection cluster C_i, the nearest human tracked object T_j is found by global nearest neighbor data association; if their distance is less than a threshold, the detected cluster is considered associated with the tracked object. Then, for each point p_{i,j} in the tracked object T_j, the nearest point p_{j,i} in the detection target C_i is found using an octree and the distance between them is computed. The distance consistency probability between points p_{i,j} and p_{j,i} is defined as follows:
[Equation rendered as an image in the original: the distance consistency probability L_d(p_{i,j}, p_{j,i})]
where α is a weight vector;
Color consistency is defined as follows: when comparing the color information of the current detection cluster C_i and the tracked object T_j, the color consistency between the nearest pair of points <p_{j,i}, p_{i,j}> is computed. Color consistency may be calculated in RGB, HSV, or other color spaces. Taking HSV space as an example, the color consistency probability between points p_{i,j} and p_{j,i} is defined as follows:
[Equation rendered as an image in the original: the color consistency probability L_c(p_{i,j}, p_{j,i})]
where c_{i,j} and c_{j,i} denote the HSV information of p_{i,j} and p_{j,i} respectively, and β denotes a weight;
The joint consistency probability between p_{i,j} and p_{j,i} is defined as:
L(p_{i,j}, p_{j,i}) = L_d(p_{i,j}, p_{j,i}) · L_c(p_{i,j}, p_{j,i});
For each tracked object T_j and detection cluster C_i, the maximum joint likelihood probability is defined as:
L(j, i) = max over point pairs <p_{i,j}, p_{j,i}> of L(p_{i,j}, p_{j,i});
If L(j, i) is higher than the set threshold, the current cluster C_i and the tracked object T_j are the same person; otherwise, if no tracked object associated with C_i is found, a new tracked object is created.
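Because the exact likelihood formulas appear only as equation images in the original, the sketch below assumes a simple exponential-decay form exp(-a·d) for both consistency terms; the structure (nearest-neighbor pairing, product of distance and color terms, maximum over pairs) follows the text.

```python
import numpy as np
from scipy.spatial import cKDTree

def joint_likelihood(track_pts, det_pts, track_hsv, det_hsv,
                     alpha=1.0, beta=1.0):
    """Maximum joint likelihood L(j, i) between a tracked object
    (track_pts, track_hsv) and a detection cluster (det_pts, det_hsv).
    The exponential forms of L_d and L_c are assumptions; the patent
    gives those formulas only as images."""
    tree = cKDTree(det_pts)
    dists, idx = tree.query(track_pts)        # nearest detection point per track point
    l_dist = np.exp(-alpha * dists)           # distance consistency L_d (assumed form)
    color_d = np.linalg.norm(track_hsv - det_hsv[idx], axis=1)
    l_color = np.exp(-beta * color_d)         # color consistency L_c (assumed form)
    return float(np.max(l_dist * l_color))    # L(j, i) = max over point pairs
```

A Kd-tree stands in here for the octree named in the text; either spatial index serves the same nearest-neighbor role.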
To validate the proposed algorithm, the following experiment was performed: an ASUS Xtion Pro and a laser radar were installed on a Turtlebot 2, where the Xtion Pro collects the raw RGB-D point cloud data of the environment and the laser radar is used for obstacle avoidance.
As shown in fig. 4, in order to verify the method proposed in the present invention, experiments were performed in three different scenarios, respectively:
simple environment: in an environment without obstacles, the sensor is fixed and two persons move with the same trajectory, as shown by a in fig. 4.
Moderate environment: in an environment without obstacles, the sensor is fixed, two or more persons move at random, and their motion trajectories cross, as shown by b in fig. 4.
Difficult environment: there are obstacles, the robot moves, and three or more people walk at random with their trajectories crossing one another, as shown by c in fig. 4.
For each scene, the video sequence contains about 250 frames, and the total test set includes 798 frames. There are 2698 person instances in total, manually marked on the RGB images as ground truth.
Human body detection results:
in order to verify the performance of the proposed method, frame-based metrics are used, and the advantages and disadvantages of the proposed method are measured by using three items of precision (p), recall (r) and f1 score. The three terms are defined as:
p = TP / (TP + FP)
r = TP / (TP + FN)
f1 = 2pr / (p + r)
where TP denotes true positives, FP false positives, and FN false negatives.
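The three frame-based metrics follow directly from the TP/FP/FN counts, for example:

```python
def detection_metrics(tp, fp, fn):
    """Frame-based precision, recall, and f1 score, reading TP, FP,
    and FN as counts of true positives, false positives, and false
    negatives."""
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    f1 = 2 * p * r / (p + r)
    return p, r, f1
```

The f1 score is the harmonic mean of precision and recall, which is why the table rows below always lie between the two.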
As shown in fig. 4, each detected human body is enclosed in a green frame. It is evident from the figure that the experimental results are better for the first two cases, especially in the simple environment. In the difficult scenario, however, the results are not as good as the first two, which is readily understood since that scene contains obstacles. In some cases, classification by geometric information and HOG + SVM is difficult, so some obstacles are erroneously recognized as human bodies. The human detection performance in the three environments is recorded in table one.
To reduce the false positive rate, the HOG confidence threshold is set strictly to -2.2. This helps reduce the false positive rate, but also increases the false negative rate.
Table one: Performance parameters of the human body detection model in three different environments

Environment             Precision   Recall   f1 score
Simple environment      0.97        0.91     0.94
Moderate environment    0.92        0.88     0.89
Difficult environment   0.82        0.78     0.80
Human body tracking results:
We evaluated our tracking in terms of false positive (FP) and false negative (FN) rates; table two records the tracking performance in the three environments. Results are better in the simple and moderate cases, whereas in the difficult case the FP and FN rates are somewhat higher, at 5.8% and 5.2% respectively. This is mainly because in that scene people move faster, are occluded by others, or leave the camera's field of view.
Table two: Performance parameters of the human body tracking model in three different environments

Environment             FP      FN
Simple environment      2.4%    1.8%
Moderate environment    4.6%    4.4%
Difficult environment   5.8%    5.2%
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principles of the present invention are intended to be included therein.

Claims (4)

1. An indoor human body detection and tracking method based on RGB-D low visual angle condition is characterized by comprising the following steps,
step a. data acquisition and preprocessing
Performing point cloud collection using the ASUS Xtion Pro to obtain a dense 3D point cloud, performing noise reduction on the dense 3D point cloud through a pass-through filter, and down-sampling the noise-reduced 3D point cloud through a three-dimensional voxel grid filter;
step b, ground detection and filtering
Detecting the ground in the 3D point cloud processed in step a using a RANSAC-based least squares method, wherein for non-first-frame data the ground parameters obtained from the previous frame's detection can be used as the initial parameters of the next frame; performing ground detection on the current frame data using the RANSAC method according to the given initial ground parameters and a preset initial distance threshold, and filtering the ground data out of the 3D point cloud according to the indices of all three-dimensional points lying on the ground; for the first frame data, determining the relation to the ground according to the mounting position of the RGB-D sensor on the robot, thereby giving the initial ground parameters;
step c. clustering
Acquiring point cloud data within 130 cm of the ground from the 3D point cloud with the ground removed, then performing 3D clustering using the Euclidean distance between points, where two points are defined to belong to the same class when the Euclidean distance between them is smaller than a predefined distance threshold; clustering requires two initial thresholds: the distance below which two points are considered to belong to the same cluster, and the minimum number of points required to form a cluster;
step d. HOG + SVM classification
Projecting the point cloud inside each bounding box after 3D clustering onto the RGB image, computing the HOG descriptor of the resulting image patch, then feeding the obtained HOG descriptor to a pre-trained SVM classifier and computing the HOG confidence of each cluster; when the computed HOG confidence is higher than a set threshold, the cluster is judged to be a person, otherwise it is judged not to be a person;
step e. tracking
The obtained human body clusters are used as the input of the tracking module, i.e., as the objects to be tracked next; the human body clusters detected in each frame are then matched against the existing tracked objects, and the maximum likelihood probability between the currently detected human body and the known tracked objects is computed using a method combining distance consistency and color consistency.
2. The method as claimed in claim 1, wherein in step c: 1) a Kd-tree of the 3D point cloud is first created as the search method used in the subsequent point cloud extraction; 2) an empty cluster list C and an empty point queue Q are set up; 3) for each point p in the point cloud, all neighboring points within a sphere centered at p with a preset distance threshold as radius are searched; each neighbor found is first checked for membership in another cluster, and if it belongs to none, it is added to queue Q; 4) after all points in queue Q have been processed, queue Q is added to cluster list C; 5) clustering finishes once all points in the initial point cloud have been processed.
3. The indoor human body detection and tracking method based on the RGB-D low-visual-angle condition as claimed in claim 2, wherein in step c, after clustering is completed, the problems of over-clustering and under-clustering are further handled;
the process of over-clustering is handled as follows: for each obtained cluster C_i, the projection p_i of its centre point on the XZ plane is first calculated; if the distance between p_i and the centre projection point p_j of a cluster C_j is less than a set threshold, clusters C_i and C_j are considered to belong to the same cluster and are merged;
the process of under-clustering is handled as follows: for each cluster, its geometric information, specifically width, depth and height, is calculated; if the geometric information of a cluster greatly exceeds a set threshold, the cluster is further divided using color information, i.e. points with the same color are classified into the same class; clusters with too few points are discarded directly.
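The merge-and-discard part of this post-processing can be sketched as below. The merge distance and minimum cluster size are assumptions for illustration; the color-based split of oversized clusters is omitted to keep the sketch short.

```python
import numpy as np

def postprocess(clusters, merge_dist=0.3, min_points=3):
    """Handle over-clustering and tiny clusters: discard clusters with
    fewer than `min_points` points, then merge clusters whose centre
    projections on the XZ plane lie closer than `merge_dist`.
    Each cluster is an (N, 3) array of XYZ points."""
    clusters = [c for c in clusters if len(c) >= min_points]
    merged = []
    for c in clusters:
        centre = c[:, [0, 2]].mean(axis=0)       # projection on XZ plane
        for k, (m, mc) in enumerate(merged):
            if np.linalg.norm(centre - mc) < merge_dist:
                m = np.vstack([m, c])            # same person: merge
                merged[k] = (m, m[:, [0, 2]].mean(axis=0))
                break
        else:
            merged.append((c, centre))
    return [m for m, _ in merged]

a = np.array([[0.0, 0.2, 1.0], [0.1, 1.0, 1.0], [0.0, 1.4, 1.1]])
b = a + np.array([0.05, 0.0, 0.05])   # a second fragment of the same person
tiny = np.array([[3.0, 0.5, 4.0]])    # noise cluster, discarded
out = postprocess([a, b, tiny])
print(len(out))                       # 1
```

Projecting centres onto the XZ plane deliberately ignores height, so a head fragment and a torso fragment of the same person merge even though their Y centres differ.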
4. The indoor human body detection and tracking method based on the RGB-D low-visual-angle condition as claimed in claim 1, wherein in step e,
distance consistency is defined as follows: given a human detection cluster C_i, the nearest tracking object T_j is found by processing the global nearest neighbour data; if their distance is less than a threshold, the detected cluster is considered linked to the tracked cluster; then, for each point p_{i,j} in the tracked object T_j, the corresponding point p_{j,i} in the detection target C_i is found using an octree method and the distance between them is calculated; the distance consistency probability between the points p_{i,j} and p_{j,i} is defined as follows:
[formula image FDA0003059271330000031: definition of the distance consistency probability L_d(p_{i,j}, p_{j,i})]
wherein α is a weight vector;
color consistency is defined as follows: when comparing the current detection cluster C_i and the tracking object T_j, the color consistency between the nearest pair of points <p_{j,i}, p_{i,j}> is calculated; it can be computed in RGB, HSV or another color space; in HSV space, the color consistency probability between the points p_{i,j} and p_{j,i} is defined as follows:
[formula image FDA0003059271330000032: definition of the color consistency probability L_c(p_{i,j}, p_{j,i})]
wherein c_{i,j} and c_{j,i} represent the HSV information of p_{i,j} and p_{j,i} respectively, and β represents a weight;
the joint consistency probability between p_{i,j} and p_{j,i} is defined as:
L(p_{i,j}, p_{j,i}) = L_d(p_{i,j}, p_{j,i}) · L_c(p_{i,j}, p_{j,i});
for each tracked object T_j and detection cluster C_i, the maximum joint likelihood probability is defined as:
[formula image FDA0003059271330000041: definition of the maximum joint likelihood probability L(j, i)]
if L(j, i) is higher than the set threshold, the current cluster C_i and the tracking object T_j are the same person; otherwise, if no tracking object associated with C_i is found, a new tracking object is created.
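The matching step can be sketched as below. The patent's consistency equations are published only as images, so the exponential forms L_d = exp(-α·‖p − q‖) and L_c = exp(-β·‖c − c'‖) used here are assumptions, chosen only to be consistent with the stated product L = L_d · L_c; the weights α, β are likewise illustrative, and brute-force search stands in for the octree.

```python
import numpy as np

def joint_likelihood(track_pts, track_hsv, det_pts, det_hsv,
                     alpha=5.0, beta=2.0):
    """Maximum joint distance/color consistency between a tracked
    object (track_pts, track_hsv) and a detection cluster
    (det_pts, det_hsv).  The exponential terms are assumed forms."""
    best = 0.0
    for p, c in zip(track_pts, track_hsv):
        # nearest detected point to this tracked point (brute force
        # stands in for the octree search of the claim)
        d = np.linalg.norm(det_pts - p, axis=1)
        j = int(d.argmin())
        l_d = np.exp(-alpha * d[j])                           # distance term
        l_c = np.exp(-beta * np.linalg.norm(det_hsv[j] - c))  # color term
        best = max(best, l_d * l_c)                           # max over pairs
    return best

track = np.array([[0.0, 1.0, 2.0], [0.1, 1.2, 2.0]])
hsv_t = np.array([[0.5, 0.8, 0.9], [0.5, 0.8, 0.9]])
det_same = track + 0.01                       # same person, barely moved
det_far = track + np.array([4.0, 0.0, 0.0])   # a different, distant cluster
same = joint_likelihood(track, hsv_t, det_same, hsv_t)
far = joint_likelihood(track, hsv_t, det_far, hsv_t)
```

A detection whose score exceeds the set threshold is associated with the tracked object; otherwise a new track is created, exactly as the claim states.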
CN201810908661.2A 2018-08-10 2018-08-10 Indoor human body detection and tracking method based on RGB-D low-visual-angle condition Active CN109035305B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810908661.2A CN109035305B (en) 2018-08-10 2018-08-10 Indoor human body detection and tracking method based on RGB-D low-visual-angle condition

Publications (2)

Publication Number Publication Date
CN109035305A CN109035305A (en) 2018-12-18
CN109035305B true CN109035305B (en) 2021-06-25

Family

ID=64632680

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810908661.2A Active CN109035305B (en) 2018-08-10 2018-08-10 Indoor human body detection and tracking method based on RGB-D low-visual-angle condition

Country Status (1)

Country Link
CN (1) CN109035305B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110175528B (en) * 2019-04-29 2021-10-26 北京百度网讯科技有限公司 Human body tracking method and device, computer equipment and readable medium
CN110110687B (en) * 2019-05-15 2020-11-17 江南大学 Method for automatically identifying fruits on tree based on color information and three-dimensional contour information
CN110456308B (en) * 2019-07-08 2021-05-04 广西工业职业技术学院 Three-dimensional space positioning rapid searching method
CN111582352B (en) * 2020-04-30 2023-06-27 上海高仙自动化科技发展有限公司 Object-based perception method, object-based perception device, robot and storage medium
CN112070840B (en) * 2020-09-11 2023-10-10 上海幻维数码创意科技股份有限公司 Human body space positioning and tracking method fused by multiple depth cameras
CN113033481B (en) * 2021-04-20 2023-06-02 湖北工业大学 Handheld stick detection method based on first-order full convolution target detection algorithm

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106384079A (en) * 2016-08-31 2017-02-08 东南大学 RGB-D information based real-time pedestrian tracking method
CN107016373A (en) * 2017-04-12 2017-08-04 广东工业大学 The detection method and device that a kind of safety cap is worn
CN107491712A (en) * 2016-06-09 2017-12-19 北京雷动云合智能技术有限公司 A kind of human body recognition method based on RGB D images
CN107833270A (en) * 2017-09-28 2018-03-23 浙江大学 Real-time object dimensional method for reconstructing based on depth camera

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9111351B2 (en) * 2011-12-15 2015-08-18 Sony Corporation Minimizing drift using depth camera images

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Target detection and tracking for ground mobile robots based on RGB-D sensors; Zhang Song; China Master's Theses Full-text Database; 20170815; full text *
Development and application of functional software for mobile service robots based on RGB-D sensors; Ju Qing; China Master's Theses Full-text Database; 20170315; full text *


Similar Documents

Publication Publication Date Title
CN109035305B (en) Indoor human body detection and tracking method based on RGB-D low-visual-angle condition
JP6288221B2 (en) Enhanced layer-based object detection by deep convolutional neural networks
Mittal et al. M2tracker: A multi-view approach to segmenting and tracking people in a cluttered scene using region-based stereo
CN109934848B (en) Method for accurately positioning moving object based on deep learning
CN104573614B (en) Apparatus and method for tracking human face
Zhou et al. Self‐supervised learning to visually detect terrain surfaces for autonomous robots operating in forested terrain
Gritti et al. Kinect-based people detection and tracking from small-footprint ground robots
Eppenberger et al. Leveraging stereo-camera data for real-time dynamic obstacle detection and tracking
Damen et al. Detecting carried objects from sequences of walking pedestrians
CN104966062B (en) Video monitoring method and device
Zhang et al. Multiple vehicle-like target tracking based on the velodyne lidar
Wang et al. An overview of 3d object detection
JP2016099982A (en) Behavior recognition device, behaviour learning device, method, and program
Huang et al. Fish tracking and segmentation from stereo videos on the wild sea surface for electronic monitoring of rail fishing
Volkhardt et al. Fallen person detection for mobile robots using 3D depth data
Hsieh et al. Abnormal scene change detection from a moving camera using bags of patches and spider-web map
Batool et al. Telemonitoring of daily activities based on multi-sensors data fusion
Fang et al. Real-time RGB-D based people detection and tracking system for mobile robots
Saisan et al. Multi-view classifier swarms for pedestrian detection and tracking
KR101542206B1 (en) Method and system for tracking with extraction object using coarse to fine techniques
US20200258237A1 (en) Method for real time surface tracking in unstructured environments
Kuo et al. People counting base on head and shoulder information
Behendi et al. Non-invasive performance measurement in combat sports
Pane et al. A people counting system for business analytics
Rougier et al. 3D head trajectory using a single camera

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant