CN110598510A - Vehicle-mounted gesture interaction technology - Google Patents

Vehicle-mounted gesture interaction technology

Info

Publication number
CN110598510A
CN110598510A
Authority
CN
China
Prior art keywords
point
points
value
depth
control method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810606708.XA
Other languages
Chinese (zh)
Other versions
CN110598510B (en)
Inventor
Zhou Qinna (周秦娜)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Point Cloud Intelligent Technology Co ltd
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN201810606708.XA
Publication of CN110598510A
Application granted
Publication of CN110598510B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/28Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

A vehicle-mounted gesture interaction technology comprises the following steps: (1) identifying moving objects using an improved moving object detection algorithm; (2) judging whether the moving object identified in step (1) is a human palm by using a gesture recognition control method. The improved moving object detection algorithm comprises the following steps: 2.1 initialization; 2.2 detecting whether pixel points are motion points; 2.3 performing kmeans clustering on the motion points; 2.4 region growing; 2.5 region extraction and pixel-point updating. The gesture recognition control method comprises the following steps: 3.1 feature selection and model training; 3.2 judging whether the target image is a human palm. The feature selection and model training comprises the following steps: 3.1.1 collecting training data; 3.1.2 selecting sample points from the data to be trained; 3.1.3 calculating the optimal division value over all sample points; 3.1.4 establishing a random forest corresponding to the sample points based on the optimal division value calculation result.

Description

Vehicle-mounted gesture interaction technology
Technical Field
The invention relates to the field of image recognition and processing, in particular to a vehicle-mounted gesture interaction technology.
Background
With the progress of science and technology, the functions of automobiles increase day by day, their internal information systems grow ever more complex, and operation becomes correspondingly more complicated for users. Operating traditional automobile buttons and touch screens requires both eyes and hands, which affects driving safety. Voice interaction, although fast, is not accurate enough, because a running vehicle is noisy and the interference is large.
Inside the car, using gestures to interact with the vehicle is, compared with traditional buttons or voice interaction, fast, accurate, safe, and strongly resistant to interference.
The camera used in traditional vehicle-mounted gesture interaction is an rgb camera, which recognizes the hand by skin color. This approach has limitations: dark skin tones, dim or night-time conditions, and the color of the car seats all strongly interfere with gesture recognition based on an rgb camera. The invention adopts a depth camera and detects moving objects based on the basic principle of motion detection, drawing on the motion detection algorithms of traditional rgb cameras and improving on them, so as to detect moving objects more reliably.
The current common scheme for extracting the palm with a depth camera is based on a depth threshold: any object farther than a given distance value is discarded. In actual driving, however, the user's hands are usually at or below the level of the steering wheel, and the user cannot be required to raise the hand above the steering wheel when using the product; moreover, the user's own motion and other moving objects interfere with each other, making it difficult to determine whether a given moving object is a person's palm. In the invention, the camera shoots downward from the roof of the vehicle, so objects that move during driving, such as the steering wheel, the human body, the head and the shoulders, are all captured by the camera, and the person's hands may appear at any position in the camera's field of view.
Drawings
The accompanying drawings illustrate the main steps of the technical solution.
Fig. 1 gives an overview of the two parts of the present technical solution, a vehicle-mounted gesture interaction technology: identifying moving objects using an improved moving object detection algorithm, and then judging whether the recognized moving object is a human palm by using a gesture recognition control method.
Fig. 2 shows the main steps of the improved moving object detection algorithm used in the present solution to identify moving objects.
Fig. 3 shows a main step of the gesture recognition control method used in the present technical solution, which determines whether the recognized moving object is a human palm.
Fig. 4 shows the main steps of feature selection and model training.
Fig. 5 shows the main steps for determining whether the target image is a human palm.
Disclosure of Invention
The invention provides a vehicle-mounted gesture interaction technology, which comprises the following steps: (1) identifying moving objects using an improved moving object detection algorithm; (2) judging whether the moving object identified in step (1) is a human palm by using a gesture recognition control method.
Wherein the moving object detection algorithm comprises the following steps: initialization; detecting whether pixel points are motion points; performing kmeans clustering on the motion points; region growing; region extraction; and updating the pixel points.
The gesture recognition control method comprises the following steps: selecting characteristics and training a model; and judging whether the target image is a human palm.
Further, the gesture recognition control method forms part of the pixel-point updating step of the moving object detection algorithm: it judges whether the target image is a human hand; if so, the history information of the pixel points is updated so as to increase the variation of the depth information of the motion points, and if not, the information is kept unchanged, so that the motion pixel points are extracted more effectively the next time.
The invention combines a depth camera with gesture technology; the depth camera overcomes interference from illumination, skin color and ornaments inside the car. By drawing on the traditional rgb-camera motion detection algorithm and innovating on that basis, moving objects can be detected more reliably. In the invention, the depth camera shoots downward from the car roof, and the gesture recognition control method uses machine-learning random forest training with an innovation in the feature selection step of the decision tree, thereby judging whether the target image is a hand.
Detailed Description
To further explain the technical solution, the depth camera is combined with gesture technology: an improved moving object detection algorithm is used to detect a moving object, and a gesture recognition control method is then used to judge whether the recognized moving object is a human palm. The specific implementation is explained below with reference to the accompanying drawings.
Fig. 1 gives an overview of the two parts of the present technical solution, a vehicle-mounted gesture interaction technology: identifying moving objects using an improved moving object detection algorithm, and then judging whether the recognized moving object is a human palm by using a gesture recognition control method.
Fig. 2 shows the main steps of the improved moving object detection algorithm used in the present solution to identify moving objects.
Further, the improved moving object detection algorithm is characterized in that, in step 2.1, several dozen consecutive depth frames are obtained from the depth camera and a history library is created for each pixel point.
Further, the improved moving object detection algorithm is characterized in that, in step 2.2, for each pixel point in every frame acquired by the camera, it is determined whether the pixel point is a moving pixel point, specifically comprising the following steps: 2.2.1 setting a counter a to 0; 2.2.2 calculating the difference between the current depth value of the pixel point and a depth value in its history library, and adding 1 to the counter a if the difference is greater than a set threshold; 2.2.3 after performing step 2.2.2 for every record of the pixel point in the history library, setting the pixel point as a motion point if the value of the counter a is greater than a threshold.
Preferably, neither the set difference threshold nor the threshold against which the counter a is compared is uniquely fixed; both can be adjusted according to actual needs.
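As an illustration of step 2.2, the following minimal Python sketch flags a pixel as a motion point by counting how many of its history entries differ from the current depth by more than a threshold; the function name and the threshold values are illustrative choices, not values taken from the patent.

```python
def is_motion_point(current_depth, history, diff_threshold=30.0, count_threshold=5):
    """Step 2.2 sketch: compare the current depth of one pixel against its history
    library and flag it as a motion point when enough entries differ strongly.
    diff_threshold and count_threshold are illustrative, adjustable values."""
    a = 0                                    # step 2.2.1: counter a starts at 0
    for past_depth in history:               # step 2.2.2: one comparison per history record
        if abs(current_depth - past_depth) > diff_threshold:
            a += 1
    return a > count_threshold               # step 2.2.3: motion point if a exceeds its threshold
```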
Further, the improved moving object detection algorithm is characterized in that, after all the motion points are obtained, step 2.3 is executed to perform kmeans clustering on all the motion points, specifically comprising the following steps: 2.3.1 randomly selecting some of the pixel points as initial cluster centers; 2.3.2 assigning each of the remaining pixel points to the most similar cluster according to its similarity (distance) to the cluster centers from 2.3.1; 2.3.3 recalculating the center of each resulting new cluster, i.e., the mean of all objects in that cluster; 2.3.4 computing a criterion function: if it has converged, terminating the algorithm, otherwise repeating steps 2.3.2 to 2.3.4, thereby obtaining a set of categories; 2.3.5 for the categories obtained in step 2.3.4, each having a center point and its corresponding motion pixel points, setting a threshold on the number of elements per category, removing categories that do not reach the threshold, and then assigning a category serial number to each motion point.
Preferably, the threshold on the number of category elements is not uniquely fixed and can be adjusted according to actual needs.
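A minimal k-means sketch of step 2.3 in Python (NumPy assumed), with illustrative values for k, the minimum cluster size and the convergence tolerance; it clusters the (x, y) coordinates of the motion points and drops undersized categories.

```python
import numpy as np

def cluster_motion_points(points, k=5, min_cluster_size=50, iters=100, tol=1e-3):
    """Step 2.3 sketch: k-means over motion-point coordinates, then drop clusters
    whose element count is below a threshold. All parameter values are illustrative."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(0)
    centers = points[rng.choice(len(points), size=k, replace=False)]      # step 2.3.1

    for _ in range(iters):
        # step 2.3.2: assign every point to its nearest cluster center
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # step 2.3.3: recompute each center as the mean of its members
        new_centers = np.array([points[labels == j].mean(axis=0)
                                if np.any(labels == j) else centers[j]
                                for j in range(k)])
        # step 2.3.4: stop once the centers have converged
        if np.linalg.norm(new_centers - centers) < tol:
            centers = new_centers
            break
        centers = new_centers

    # final assignment with the converged centers
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)

    # step 2.3.5: discard categories that are too small; the dict key acts as the serial number
    kept = [j for j in range(k) if np.sum(labels == j) >= min_cluster_size]
    return {j: points[labels == j] for j in kept}
```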
Further, the improved moving object detection algorithm is characterized in that, after the categories are obtained, step 2.4 is executed to perform region growing, specifically comprising the following steps: 2.4.1 comparing the depth value of a pixel already detected as a motion point with that of a nearby pixel to be examined; if the depth difference between the two is less than a set threshold, the new pixel is considered similar to the detected motion point and is therefore also set as a motion point; 2.4.2 according to step 2.4.1, if a new pixel point is judged to be a new motion point in two categories, the attributes of the two categories are similar, so the two categories are merged and given the same category serial number, until the detection of all motion points is completed.
Preferably, the threshold against which the depth difference is compared is not uniquely fixed and can be adjusted according to actual needs.
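The following sketch illustrates step 2.4 as a breadth-first region growing over a depth map with merging of touching categories; the 4-neighborhood, the depth threshold and the union-find bookkeeping are illustrative implementation choices rather than details taken from the patent.

```python
from collections import deque

def grow_regions(depth, labels, depth_threshold=20.0):
    """Step 2.4 sketch: `depth` is an HxW depth map, `labels` an HxW integer array
    with 0 for background and a positive category serial number for each motion point.
    Neighbours with a small depth difference become motion points; categories that
    meet through the same pixel are merged. The threshold is illustrative."""
    h, w = depth.shape
    queue = deque((y, x) for y in range(h) for x in range(w) if labels[y, x] > 0)
    merged = {}                                     # category -> representative category

    def find(c):                                    # follow merge links to the representative
        while merged.get(c, c) != c:
            c = merged[c]
        return c

    while queue:
        y, x = queue.popleft()
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if not (0 <= ny < h and 0 <= nx < w):
                continue
            if abs(depth[ny, nx] - depth[y, x]) < depth_threshold:      # step 2.4.1
                if labels[ny, nx] == 0:
                    labels[ny, nx] = labels[y, x]   # the new pixel joins this category
                    queue.append((ny, nx))
                elif find(labels[ny, nx]) != find(labels[y, x]):
                    # step 2.4.2: the same pixel links two categories, so merge them
                    merged[find(labels[ny, nx])] = find(labels[y, x])
    return labels, merged
```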
Further, the improved moving object detection algorithm is characterized in that, after all the motion points are detected, it is judged whether the picture extracted in step 2.5 is a hand; if it is judged to be a hand, the history information of the pixel points is updated so as to increase the variation of the depth information of the motion points, and if not, the information is kept unchanged.
Fig. 3 shows the main steps of the gesture recognition control method used in the present technical solution, which determines whether the recognized moving object is a human palm. The method is characterized by comprising the following steps: 3.1 feature selection and model training; 3.2 judging whether the target image is a human palm.
Further, step 3.1 comprises the following steps: 3.1.2 selecting sample points from the data to be trained; 3.1.3 calculating the optimal division value over all sample points; 3.1.4 establishing a random forest corresponding to the sample points based on the optimal division value calculation result.
Further, step 3.1.1 precedes step 3.1.2 and is intended to collect training data: the images to be trained are obtained by at least two cameras, of which at least one is a depth camera and at least one is an rgb camera.
Preferably, the cameras are mounted at the roof and shoot downward. The recorded person wears blue gloves on both hands, sits in the middle of the car, and freely makes various gestures and actions, including driving actions such as turning the steering wheel and operating the hand brake. The positions of the blue pixels in the rgb camera are extracted to obtain the corresponding hand region in the depth camera, thereby annotating the palm data.
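A sketch of this annotation idea, assuming OpenCV and an rgb frame already registered to the depth frame: blue-glove pixels are thresholded in HSV and the mask marks the palm region in the depth map. The HSV range and the function name are illustrative assumptions.

```python
import cv2
import numpy as np

def label_palm_pixels(bgr_image, depth_image):
    """Step 3.1.1 annotation sketch: find blue-glove pixels in the color frame and
    carry the mask over to the aligned depth frame. OpenCV loads images as BGR."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    lower_blue = np.array([100, 80, 50])              # illustrative HSV range for blue
    upper_blue = np.array([130, 255, 255])
    mask = cv2.inRange(hsv, lower_blue, upper_blue)   # glove pixels -> positive samples
    positive_depths = depth_image[mask > 0]           # palm region in the depth map
    return mask, positive_depths
```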
Further, the gesture recognition control method is characterized in that, in step 3.1.2, the palm portion in the depth map is used as positive samples and the non-palm portion as negative samples, and the same number of pixel points is randomly selected from the positive and negative sample portions as the sample points to be trained.
Further, the gesture recognition control method is characterized in that the optimal division value calculation in step 3.1.3 comprises the following steps: 3.1.3.1 calculating the depth mean of each neighborhood of a sample point; 3.1.3.2 calculating the difference between the depth mean of each neighborhood and the depth value of the sample point; 3.1.3.3 calculating the information entropy; 3.1.3.4 obtaining the optimal division value.
Further, the gesture recognition control method is characterized in that the depth mean calculation in the step 3.1.3.1 includes the following steps: 3.1.3.1.1 randomly selecting a certain sample point P; 3.1.3.1.2 calculating the average of the depth values of a square neighborhood centered at P; 3.1.3.1.3 the average of the depth values of the neighborhood of all sample points is calculated based on the calculation method of step 3.1.3.1.2.
Further, the gesture recognition control method is characterized in that, in step 3.1.3.1.2, the average of the depth values of the square neighborhood centered at P is calculated as follows: the sizes of the neighborhoods of the point P are 3, 5, 7, 9, ..., 2n+1 in sequence, i.e., the number of pixels on one side of the square neighborhood, with P as the center point of the square neighborhood.
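A sketch of step 3.1.3.1, computing the mean depth of the square neighborhoods of sizes 3, 5, ..., 2n+1 around a sample point P; clipping at the image border and the default maximum n are illustrative choices.

```python
import numpy as np

def neighborhood_depth_means(depth, y, x, max_n=4):
    """Step 3.1.3.1 sketch: for sample point P = (y, x), return the mean depth of the
    (2n+1)x(2n+1) square neighborhoods for n = 1..max_n, keyed by side length."""
    h, w = depth.shape
    means = {}
    for n in range(1, max_n + 1):
        side = 2 * n + 1
        y0, y1 = max(0, y - n), min(h, y + n + 1)     # clip the window at the image border
        x0, x1 = max(0, x - n), min(w, x + n + 1)
        means[side] = float(depth[y0:y1, x0:x1].mean())
    return means
```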
Further, the gesture recognition control method is characterized in that the information entropy calculation in step 3.1.3.3 comprises: 3.1.3.3.1 dividing all positive and negative sample points into several equal parts, each containing positive and negative sample points in the same proportion; 3.1.3.3.2 for every sample point in one equal part, when the neighborhood is 3, there is a difference d between the depth mean of the 3-neighborhood and the depth of the point; 3.1.3.3.3 each difference d can divide the set of differences into two parts, one larger than d and one smaller than d; 3.1.3.3.4 the final information entropy s is obtained according to the information entropy formula; 3.1.3.3.5 based on the calculation method of steps 3.1.3.3.2 to 3.1.3.3.4, the corresponding information entropy is obtained when the neighborhood is 5, 7, 9, ..., 2n+1.
Preferably, the information entropy is defined as follows: what is considered is not the uncertainty of a single symbol of the source but the average uncertainty over all possible symbols. If the source symbol takes n values U1, ..., Ui, ..., Un with corresponding probabilities P1, ..., Pi, ..., Pn, and the symbols occur independently of each other, then the average uncertainty of the source is the statistical average E of the single-symbol uncertainty -log Pi, which is called the information entropy: H(U) = E[-log Pi] = -(P1 log P1 + P2 log P2 + ... + Pn log Pn). The logarithm is usually taken to base 2 and the unit is the bit; other logarithm bases (with their corresponding units) may be chosen and converted via the change-of-base formula.
Further, as an example, for 100 sample points in the 3-neighborhood, 100 difference values can be calculated and recorded as (d1, d2, d3, ..., d100). k values (0 < k < 100) are randomly selected from the 100 difference values, and for each selected difference taken as a division value, the entropy score of splitting the differences at that value is obtained from the definition of entropy and recorded as S. (For instance, if the split leaves 30 points on the left and 70 points on the right, the information entropy is S = -(0.3 log(0.3) + 0.7 log(0.7)).)
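A sketch of the entropy score described above: a candidate division value d splits the set of depth differences into two parts, and the entropy of the resulting proportions is computed (base-2 logarithm assumed, matching the 30/70 example).

```python
import math

def split_entropy(differences, d):
    """Step 3.1.3.3 sketch: entropy of the two-way split of the depth differences at
    a candidate division value d, e.g. a 30/70 split gives -(0.3*log2(0.3) + 0.7*log2(0.7))."""
    left = sum(1 for v in differences if v < d)
    right = len(differences) - left
    total = len(differences)
    s = 0.0
    for count in (left, right):
        if count:                       # an empty side contributes 0 by convention
            p = count / total
            s -= p * math.log2(p)
    return s
```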
Further, the step of obtaining the optimal division value in step 3.1.3.4 is: 3.1.3.4.1 for the 3-neighborhood, selecting the largest of the information entropy values corresponding to all sample points, recording it as S3, the score of the 3-neighborhood, and recording the division value d of this split as D3; 3.1.3.4.2 following step 3.1.3.4.1, obtaining score S5 and division value D5 when the neighborhood is 5, score S7 and value D7 when the neighborhood is 7, and score S(2n+1) and value D(2n+1) when the neighborhood is 2n+1; 3.1.3.4.3 selecting the S value with the maximum score, recorded as Sm; the corresponding neighborhood is the optimal neighborhood m and the corresponding d is the optimal division value Dm.
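A sketch of step 3.1.3.4 that reuses split_entropy from the sketch above: for each neighborhood size it scores k randomly chosen candidate division values and keeps the neighborhood m and value Dm with the highest score Sm. The number of candidates k is an illustrative assumption.

```python
import random

def optimal_split(diffs_by_neighborhood, k=10):
    """Step 3.1.3.4 sketch: diffs_by_neighborhood maps a neighborhood side length
    (3, 5, 7, ...) to the list of depth differences d of the sample points.
    Returns (m, Dm, Sm): the optimal neighborhood, division value and score."""
    best = (None, None, -1.0)
    for side, diffs in diffs_by_neighborhood.items():
        candidates = random.sample(list(diffs), min(k, len(diffs)))   # random candidate splits
        for d in candidates:
            s = split_entropy(diffs, d)          # reuse the entropy sketch above
            if s > best[2]:
                best = (side, d, s)
    return best
```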
Further, in step 3.1.4, a random forest corresponding to the sample point is established by: 3.1.4.1 constructing a decision tree based on the optimal partition neighborhood and the optimal partition value; 3.1.4.2 building the random forest based on the decision tree.
Preferably, each decision tree is a binary tree, a random forest is composed of a plurality of decision trees, and each decision tree can be trained with the pixel points already extracted or with different pixel points.
Further, in step 3.1.4.1, the step of constructing a decision tree comprises: 3.1.4.1.1 storing the optimal neighborhood m and the optimal division value Dm obtained in step 3.1.3.4.3 in the root node of a decision tree as the pair (m, Dm); 3.1.4.1.2 dividing one equal part of sample points into two parts based on the optimal division value stored at the root node, the left side being points whose d is greater than Dm and the right side points whose d is less than Dm; 3.1.4.1.3 recursively applying steps 3.1.3.3, 3.1.3.4 and 3.1.4.1 to the left and right parts until the left and right subtrees contain only positive or only negative samples, or the maximum depth of the tree is reached; 3.1.4.1.4 when the maximum depth is reached, the leaf nodes store the numbers of positive and negative sample points, thus completing one decision tree.
Further, in step 3.1.4.2, a decision tree can be constructed for each equal part of sample points, and a random forest is formed from these decision trees.
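A sketch of steps 3.1.4.1 and 3.1.4.2 that reuses neighborhood_depth_means and optimal_split from the sketches above. The Node class, the neighborhood sizes (3, 5, 7, 9) and the maximum depth are illustrative assumptions, and samples are (y, x, label) tuples with label 1 for palm pixels.

```python
class Node:
    """One decision-tree node: internal nodes store (m, Dm); leaves store the
    positive/negative sample counts, as described in step 3.1.4.1.4."""
    def __init__(self, m=None, dm=None, left=None, right=None, pos=0, neg=0):
        self.m, self.dm, self.left, self.right = m, dm, left, right
        self.pos, self.neg = pos, neg            # only meaningful on leaf nodes

def build_tree(samples, depth_map, depth_left=10):
    """Step 3.1.4.1 sketch: samples is a list of (y, x, label); depth_left is an
    illustrative maximum tree depth."""
    labels = [lab for _, _, lab in samples]
    if depth_left == 0 or len(set(labels)) <= 1:                       # stopping rules
        return Node(pos=sum(labels), neg=len(labels) - sum(labels))
    # per step 3.1.3: choose the optimal neighborhood m and division value Dm
    diffs_by_n = {side: [neighborhood_depth_means(depth_map, y, x)[side] - depth_map[y, x]
                         for y, x, _ in samples]
                  for side in (3, 5, 7, 9)}
    m, dm, _ = optimal_split(diffs_by_n)
    d = [neighborhood_depth_means(depth_map, y, x)[m] - depth_map[y, x] for y, x, _ in samples]
    left = [s for s, v in zip(samples, d) if v > dm]                   # step 3.1.4.1.2
    right = [s for s, v in zip(samples, d) if v <= dm]
    if not left or not right:
        return Node(pos=sum(labels), neg=len(labels) - sum(labels))
    return Node(m, dm, build_tree(left, depth_map, depth_left - 1),
                build_tree(right, depth_map, depth_left - 1))

def build_forest(sample_parts, depth_map):
    """Step 3.1.4.2 sketch: one decision tree per equal part of the training samples."""
    return [build_tree(part, depth_map) for part in sample_parts]
```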
Preferably, the feature selection and model training steps are completed offline; those skilled in the art will understand that once model training is complete it need not be repeated each time a target image is to be predicted, and the judgment is made on the basis of the trained model.
Further, in step 3.2, determining whether the target image is a human palm may comprise the following steps: 3.2.1 judging, based on the random forest obtained from model training, whether each point of the target image is a point of the human palm; 3.2.2 judging whether the target image is a human palm based on the result of step 3.2.1.
Further, step 3.2.1 can be divided into the following steps: 3.2.1.1 calculating the depth difference: for each pixel point, taking a decision tree, computing the depth mean of the optimal neighborhood m stored at its root node, and computing the difference between that mean and the depth value of the pixel point; 3.2.1.2 recursing down the decision tree: comparing the difference computed in step 3.2.1.1 with the optimal division value Dm stored at the node, recursing into the left branch if the difference is smaller than Dm and into the right branch if it is larger, and continuing until a leaf node, which stores the numbers of positive and negative samples, is reached; 3.2.1.3 accumulating the positive and negative sample counts over all trees of the random forest for this point and judging: if the total number of positive samples is larger than the number of negative samples, the pixel point belongs to a hand; if the total number of positive samples is smaller than the number of negative samples, the pixel point does not belong to a hand.
Further, the method for determining in step 3.2.2 whether the target image is a human hand is as follows: after step 3.2.1 has been performed for every point in the target image, the number of pixel points predicted to be a hand and the number predicted not to be a hand are counted; if the number predicted to be a hand is greater, the target image is a hand.
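A sketch of steps 3.2.1 and 3.2.2 that reuses the Node trees and neighborhood_depth_means from the sketches above. It follows the training sketch's convention that differences greater than Dm go to the left branch; since the prose states the left/right direction differently in training and prediction, treat the direction here as an assumption.

```python
def classify_pixel(forest, depth_map, y, x):
    """Step 3.2.1 sketch: run one pixel down every tree and tally the positive and
    negative counts stored at the leaves it reaches."""
    pos = neg = 0
    for root in forest:
        node = root
        while node.m is not None:               # internal node: descend left or right
            d = neighborhood_depth_means(depth_map, y, x)[node.m] - depth_map[y, x]
            node = node.left if d > node.dm else node.right
        pos += node.pos
        neg += node.neg
    return pos > neg                            # step 3.2.1.3: majority of stored samples

def classify_image(forest, depth_map, candidate_pixels):
    """Step 3.2.2 sketch: the target image is a palm when more of its pixels are
    predicted 'hand' than 'not hand'."""
    votes = [classify_pixel(forest, depth_map, y, x) for y, x in candidate_pixels]
    return sum(votes) > len(votes) - sum(votes)
```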
Preferably, for pixel points judged to be a hand, the history information in the history library is updated, and for pixel points judged not to be a hand, the records in the history library are kept unchanged, so that the variation of the depth information of the motion points is increased and the motion pixel points are extracted more effectively the next time.
The above is the specific implementation of the technical solution, which addresses the problem of interaction between a driver and the automobile during driving.

Claims (23)

1. A vehicle-mounted gesture interaction control method is characterized by comprising the following steps:
(1) identifying moving objects using a modified moving object detection algorithm;
(2) and (3) judging whether the moving object identified in the step (1) is a human palm by using a gesture identification control method.
2. Control method according to claim 1, characterized in that said step for identifying moving objects comprises the steps of:
1.1 initializing;
1.2 detecting whether the pixel points are motion points or not;
1.3 performing kmeans clustering on the motion points;
1.4 area growth;
1.5 extracting the region;
1.6 updating the pixel points.
3. The control method according to claim 2, wherein, in step 2.2, for each pixel point in every image acquired by the camera, it is detected whether the pixel point is a moving pixel point, specifically comprising the following steps:
2.2.1. setting a counter a to 0;
2.2.2. calculating the difference between the current depth value of the pixel point and a depth value in the history library, and adding 1 to the counter a if the difference is greater than a set threshold;
2.2.3. after step 2.2.2 has been carried out for each historical record of the pixel point in the history library, setting the pixel point as a motion point if the value of the counter a is greater than a threshold.
4. The control method according to any one of claims 2 to 5, wherein after all the motion points are obtained, step 2.3 is performed to perform kmeans clustering on all the motion points, and the method specifically comprises the following steps:
2.3.1 randomly selecting a part of pixel points from all the pixel points as an initial clustering center;
2.3.2 assigning each of the remaining pixel points to the most similar cluster according to its similarity (distance) to the cluster centers from 2.3.1;
2.3.3 recalculating the cluster center of each obtained new cluster, namely calculating the mean value of all objects in the new cluster;
2.3.4 calculating a criterion function; if it has converged, terminating the algorithm, otherwise repeating steps 2.3.2, 2.3.3 and 2.3.4, thereby obtaining a set of categories;
2.3.5 for the categories obtained in step 2.3.4, each having a pixel center point and its corresponding motion pixel points, setting a threshold on the number of category elements, removing categories that do not reach the threshold, and then assigning a category serial number to each motion point.
5. The control method according to any one of claims 2 to 6, wherein after obtaining the category, step 2.4 is performed to perform region growing, and specifically includes the following steps:
2.4.1 comparing the depth value of a pixel already detected as a motion point with that of a nearby pixel to be examined; if the depth difference between the two is less than a set threshold, the new pixel is similar to the detected motion point and is therefore also set as a motion point;
2.4.2 according to step 2.4.1, if a new pixel point is judged to be a new motion point in two categories, the attributes of the two categories are similar, so the two categories are merged and given the same category serial number, until the detection of all motion points is completed.
6. The method according to claim 1 or 2, wherein it is determined whether the picture extracted in step 2.5 is a human hand; if it is judged to be a hand, the history information of the pixel points is updated so as to increase the variation of the depth information of the motion points, and if not, the information is kept unchanged.
7. The control method according to any one of claims 1 to 6, characterized in that the step of identifying whether the moving object is a human palm comprises the steps of:
3.1 feature selection and model training;
3.2 judging whether the target image is the human palm.
8. Control method according to any of claims 1 to 7, characterized in that said step 3.1 comprises the steps of:
3.1.2 selecting sample points from the data to be trained;
3.1.3, calculating the optimal division value of all the sample points;
3.1.4 establishing a random forest corresponding to the sample point based on the optimal division value calculation result.
9. Control method according to any of claims 1 to 8, characterized in that it comprises, before said step 3.1.2, the following steps:
3.1.1 collecting training data, obtaining images to be trained by at least two cameras, wherein at least one camera is a depth camera and at least one camera is an rgb camera.
10. The control method according to any one of claims 1 to 8, characterized in that in step 3.1.2, the palm portion in the depth map is taken as a positive sample, the non-palm portion is taken as a negative sample, and the same number of pixel points are randomly selected from the positive sample portion and the negative sample portion as sample points to be trained.
11. Control method according to any of claims 1 to 8, characterized in that the optimal partition value calculation in step 3.1.3 comprises the steps of:
3.1.3.1, calculating the depth mean value of each neighborhood of the sample point;
3.1.3.2 calculating the difference between the depth mean value of each neighborhood of the sample point and the depth value of the sample point;
3.1.3.3 calculating information entropy;
3.1.3.4 obtain the optimal division value.
12. Control method according to claim 11, characterized in that the depth mean calculation in step 3.1.3.1 comprises the following steps:
3.1.3.1.1 randomly selecting a certain sample point P;
3.1.3.1.2 calculating the average of the depth values of a square neighborhood centered at P;
3.1.3.1.3 calculating the average of the depth values of the neighborhoods of all sample points based on the calculation method of step 3.1.3.1.2.
13. The method according to claim 12, wherein the average value of the depth values in the neighborhood of the square centered at P in step 3.1.3.1.2 is calculated by:
the sizes of the neighborhoods of the point P are 3, 5, 7, 9, ..., 2n+1 in sequence, i.e., the number of pixels on one side of the square neighborhood, with P as the center point of the square neighborhood.
14. The control method according to any one of claims 12 to 13, wherein the information entropy calculation step in step 3.1.3.3 is:
3.1.3.3.1 dividing all positive and negative sample points into several equal parts, each containing positive and negative sample points in the same proportion;
3.1.3.3.2 for every sample point in one equal part, when the neighborhood is 3, there is a difference d between the depth mean of the 3-neighborhood and the depth of the point;
3.1.3.3.3 each difference d can divide the set of differences into two parts, one larger than d and one smaller than d;
3.1.3.3.4 obtaining the final information entropy s according to the information entropy formula;
3.1.3.3.5 based on the calculation method of steps 3.1.3.3.2-3.1.3.3.4, obtaining the corresponding information entropy when the neighborhood is 5, 7, 9, ..., 2n+1.
15. The control method according to any one of claims 12 to 14, wherein the step of obtaining the optimal division value in step 3.1.3.4 is:
3.1.3.4.1 for the 3-neighborhood, selecting the largest of the information entropy values corresponding to all sample points, recording it as S3, the score of the 3-neighborhood, and recording the division value d of this split as D3;
3.1.3.4.2 following step 3.1.3.4.1, obtaining score S5 and division value D5 when the neighborhood is 5, score S7 and value D7 when the neighborhood is 7, and score S(2n+1) and value D(2n+1) when the neighborhood is 2n+1;
3.1.3.4.3 selecting the S value with the maximum score, recorded as Sm; the corresponding neighborhood is the optimal neighborhood m and the corresponding d is the optimal division value Dm.
16. A control method according to any one of claims 7 to 16, characterized in that in step 3.1.4 a random forest is established corresponding to the sample points by:
3.1.4.1 constructing a decision tree based on the optimal partition neighborhood and the optimal partition value;
3.1.4.2 building the random forest based on the decision tree.
17. A control method according to claim 16, wherein in said step 3.1.4.1, the step of constructing a decision tree comprises:
3.1.4.1.1 storing the optimal neighborhood m and the optimal division value Dm obtained in step 3.1.3.4.3 in the root node of a decision tree as the pair (m, Dm);
3.1.4.1.2 dividing one equal part of sample points into two parts based on the optimal division value stored at the root node, the left side being points whose d is greater than Dm and the right side points whose d is less than Dm;
3.1.4.1.3 recursively applying steps 3.1.3.3, 3.1.3.4 and 3.1.4.1 to the left and right parts until the left and right subtrees contain only positive or only negative samples, or the maximum depth of the tree is reached;
3.1.4.1.4 when the maximum depth is reached, the leaf nodes store the numbers of positive and negative sample points, thus completing one decision tree.
18. The control method according to claim 16, wherein in step 3.1.4.2 a decision tree is constructed for each equal part of sample points, a random forest is formed from these decision trees, and model training is completed.
19. The algorithm according to claim 7, characterized in that said step 3.2 comprises in particular the steps of:
3.2.1 judging whether each point of the target image is one point of the human palm based on the random forest obtained by the model training;
3.2.2 judging whether the target image is a human palm or not based on the judgment result in the step 3.2.1.
20. The control method according to claim 19, wherein in the step 3.2.1, the step of determining whether each point of the target image is a point of the human palm comprises:
3.2.1.1 calculate depth difference:
for each pixel point, taking a decision tree from the trained model, calculating the depth mean of the optimal neighborhood m stored at its root node, and calculating the difference between that mean and the depth value of the pixel point;
3.2.1.2 recurse the decision tree:
comparing the difference calculated in step 3.2.1.1 with the optimal division value Dm stored at the node; if the difference is smaller than Dm, recursing into the left branch, and if it is larger than Dm, recursing into the right branch, continuing until a leaf node is reached and obtaining the numbers of positive and negative samples stored at the leaf node;
3.2.1.3 counting the positive and negative sample numbers of all trees of the point in the random forest and judging:
for the pixel point, the positive and negative sample counts of all trees in the random forest are accumulated; if the total number of positive samples is larger than the number of negative samples, the pixel point belongs to a hand; if the total number of positive samples is smaller than the number of negative samples, the pixel point does not belong to a hand.
21. The control method according to any one of claims 7 to 20, wherein step 3.2.2 determines whether the target image is a human hand by, after step 3.2.1 has been performed for every point in the target image, counting the number of pixel points predicted to be a hand and the number predicted not to be a hand; if the number predicted to be a hand is larger, the target image is a hand.
22. The camera of claim 3 or 9 is a depth camera.
23. The algorithm according to any one of claims 1 to 3, characterized in that a depth camera is used to obtain continuous depth maps of tens of frames, and a historical record library is created for each pixel point.
CN201810606708.XA 2018-06-13 2018-06-13 Vehicle-mounted gesture interaction technology Active CN110598510B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810606708.XA CN110598510B (en) 2018-06-13 2018-06-13 Vehicle-mounted gesture interaction technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810606708.XA CN110598510B (en) 2018-06-13 2018-06-13 Vehicle-mounted gesture interaction technology

Publications (2)

Publication Number Publication Date
CN110598510A true CN110598510A (en) 2019-12-20
CN110598510B CN110598510B (en) 2023-07-04

Family

ID=68849600

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810606708.XA Active CN110598510B (en) 2018-06-13 2018-06-13 Vehicle-mounted gesture interaction technology

Country Status (1)

Country Link
CN (1) CN110598510B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140204013A1 (en) * 2013-01-18 2014-07-24 Microsoft Corporation Part and state detection for gesture recognition
CN103413145A (en) * 2013-08-23 2013-11-27 南京理工大学 Articulation point positioning method based on depth image
CN106778813A (en) * 2016-11-24 2017-05-31 金陵科技学院 The self-adaption cluster partitioning algorithm of depth image
CN106845513A (en) * 2016-12-05 2017-06-13 华中师范大学 Staff detector and method based on condition random forest
CN107688779A (en) * 2017-08-18 2018-02-13 北京航空航天大学 A kind of robot gesture interaction method and apparatus based on RGBD camera depth images

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Chen Hao, Lu Haiming: "A Survey of Gesture Recognition Based on Depth Images", Journal of Inner Mongolia University (Natural Science Edition) (《内蒙古大学学报(自然科学版)》) *

Also Published As

Publication number Publication date
CN110598510B (en) 2023-07-04

Similar Documents

Publication Publication Date Title
CN108764185B (en) Image processing method and device
CN111931701B (en) Gesture recognition method and device based on artificial intelligence, terminal and storage medium
CN107203753B (en) Action recognition method based on fuzzy neural network and graph model reasoning
US8620024B2 (en) System and method for dynamic gesture recognition using geometric classification
CN106778796B (en) Human body action recognition method and system based on hybrid cooperative training
CN109671102B (en) Comprehensive target tracking method based on depth feature fusion convolutional neural network
EP1934941B1 (en) Bi-directional tracking using trajectory segment analysis
CN111563417B (en) Pyramid structure convolutional neural network-based facial expression recognition method
US7331671B2 (en) Eye tracking method based on correlation and detected eye movement
CN107194371B (en) User concentration degree identification method and system based on hierarchical convolutional neural network
KR100969298B1 (en) Method For Social Network Analysis Based On Face Recognition In An Image or Image Sequences
CN109685037B (en) Real-time action recognition method and device and electronic equipment
CN112734775A (en) Image annotation, image semantic segmentation and model training method and device
CN108805016B (en) Head and shoulder area detection method and device
CN109858553B (en) Method, device and storage medium for updating driving state monitoring model
CN107273866A (en) A kind of human body abnormal behaviour recognition methods based on monitoring system
CN107564035B (en) Video tracking method based on important area identification and matching
CN110188668B (en) Small sample video action classification method
CN111160095B (en) Unbiased face feature extraction and classification method and system based on depth self-encoder network
CN111062292A (en) Fatigue driving detection device and method
JP2017041206A (en) Learning device, search device, method, and program
CN110599463A (en) Tongue image detection and positioning algorithm based on lightweight cascade neural network
US11132577B2 (en) System and a method for efficient image recognition
CN108564067A (en) The Threshold and system of face alignment
CN114140696A (en) Commodity identification system optimization method, commodity identification system optimization device, commodity identification equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
TA01 Transfer of patent application right

Effective date of registration: 20220125

Address after: 518063 2W, Zhongdian lighting building, Gaoxin South 12th Road, Nanshan District, Shenzhen, Guangdong

Applicant after: Shenzhen point cloud Intelligent Technology Co.,Ltd.

Address before: 518023 No. 3039 Baoan North Road, Luohu District, Shenzhen City, Guangdong Province

Applicant before: Zhou Qinna

GR01 Patent grant
GR01 Patent grant