CN110598510B - Vehicle-mounted gesture interaction technology - Google Patents

Vehicle-mounted gesture interaction technology

Info

Publication number
CN110598510B
CN110598510B (application CN201810606708.XA)
Authority
CN
China
Prior art keywords
point
value
depth
neighborhood
pixel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810606708.XA
Other languages
Chinese (zh)
Other versions
CN110598510A (en)
Inventor
Zhou Qinna (周秦娜)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Point Cloud Intelligent Technology Co ltd
Original Assignee
Shenzhen Point Cloud Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Point Cloud Intelligent Technology Co ltd filed Critical Shenzhen Point Cloud Intelligent Technology Co ltd
Priority to CN201810606708.XA priority Critical patent/CN110598510B/en
Publication of CN110598510A publication Critical patent/CN110598510A/en
Application granted granted Critical
Publication of CN110598510B publication Critical patent/CN110598510B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/28Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

A vehicle-mounted gesture interaction technology, comprising the following steps: (1) identifying a moving object using an improved moving object detection algorithm; (2) judging whether the moving object identified in step (1) is a human palm by using a gesture recognition control method. The improved moving object detection algorithm comprises the following steps: 2.1 initializing; 2.2 detecting whether a pixel point is a motion point; 2.3 carrying out kmeans clustering on the motion points; 2.4 region growing and extracting a region; and 2.5 updating the pixel points. The gesture recognition control method comprises the following steps: 3.1 feature selection and model training; 3.2 judging whether the target image is a human palm. The feature selection and model training comprises the following steps: 3.1.1 collecting training data; 3.1.2 selecting sample points from the data to be trained; 3.1.3 calculating optimal division values for all the sample points; and 3.1.4 establishing a random forest corresponding to the sample points based on the optimal division value calculation result.

Description

Vehicle-mounted gesture interaction technology
Technical Field
The invention relates to the field of image recognition and processing, in particular to a vehicle-mounted gesture interaction technology.
Background
With the progress of science and technology, automobiles offer more and more functions, their internal information systems are becoming more complex, and operating them is becoming more demanding for users. Operating conventional buttons and touch screens requires the driver's eyes and hands at the same time, which affects driving safety. Voice interaction is quick, but because a running vehicle is noisy and full of interference, voice recognition is often not accurate enough.
Inside the automobile, using gestures to interact with the car, compared with traditional buttons or voice interaction, has the advantages of being quick, accurate, safe and highly resistant to interference.
Traditional vehicle-mounted gesture interaction technology uses an RGB camera and locates the hand through its skin color, but this method has limitations: dark skin, dim lighting or night-time conditions, or the color of the seats inside the vehicle can strongly interfere with gesture recognition by an RGB camera. The invention adopts a depth camera, detects moving objects based on the basic principle of motion detection, takes the motion detection algorithm of the traditional RGB camera as a reference, improves on that basis, and can detect moving objects better.
The current common scheme for extracting the palm with a depth camera is based on a depth threshold: any object farther than a given distance value is discarded. In actual driving, however, the user's hands are usually located at or below the middle of the steering wheel, and the user cannot be required to raise the hand above the steering wheel when using the product; in addition, the movement of the user and of other moving objects causes interference, so determining whether a given moving object is a person's palm is difficult. In the invention, the camera shoots from the top down; objects that move during driving include the steering wheel, the human body, the head and the shoulders, and the human hand can appear at any position in the camera's field of view.
Drawings
The specification and the drawings show the main steps of the technical scheme.
Fig. 1 shows two general parts of the present technical solution, providing a vehicle gesture interaction technique: identifying a moving object using an improved moving object detection algorithm; and then judging whether the identified moving object is a human palm or not by using a gesture identification control method.
Fig. 2 shows the main steps of the improved moving object detection algorithm used in the present solution to identify moving objects.
Fig. 3 shows a main step of determining whether a recognized moving object is a human palm according to a gesture recognition control method used in the present technical solution.
Fig. 4 shows the main steps of feature selection and model training; and
Fig. 5 shows the main steps of determining whether the target image is a human palm.
Disclosure of Invention
The invention provides a vehicle-mounted gesture interaction technology, which comprises the following steps: (1) identifying a moving object using an improved moving object detection algorithm; (2) judging whether the moving object identified in step (1) is a human palm by using a gesture recognition control method.
The moving object detection algorithm comprises the following steps: initializing; detecting whether the pixel point is a motion point or not; carrying out kmeans clustering on the motion points; growing a region; extracting a region; and updating the pixel point.
The gesture recognition control method comprises the following steps: feature selection and model training; and judging whether the target image is a human palm.
Further, the gesture recognition control method forms part of the pixel-updating step of the moving object detection algorithm: it judges whether the target image is a human hand; if so, the history record information of the pixel points is updated, and if not, the information is kept unchanged. This increases the variation of the depth information at the motion points, so that moving pixel points can be extracted more effectively next time.
The invention combines a depth camera with gesture technology; the depth camera overcomes interference from illumination, skin color and ornaments inside the vehicle. The motion detection algorithm of the traditional RGB camera is taken as a reference and improved upon, so that moving objects can be detected better. In this method the depth camera shoots downward from the vehicle roof, and the gesture recognition control method uses machine-learning random forest training with an improved feature selection step for the decision trees, in order to judge whether a target image is a hand.
Detailed Description
In order to further explain the technical scheme, the depth camera is combined with the gesture technology, an improved moving object detection algorithm is used for detecting a moving object, then a gesture recognition control method is used for judging whether the recognized moving object is a human palm, and the specific implementation mode is described below with reference to the attached drawings.
Fig. 1 shows two general parts of the present technical solution, providing a vehicle gesture interaction technique: identifying a moving object using an improved moving object detection algorithm; and then judging whether the identified moving object is a human palm or not by using a gesture identification control method.
Fig. 2 shows the main steps of the improved moving object detection algorithm used in the present solution to identify moving objects.
Further, the improved moving object detection algorithm is characterized in that, in step 2.1, several tens of consecutive frames of depth maps are obtained through the depth camera, and a history record library is created for each pixel point.
Further, the improved moving object detection algorithm is characterized in that, in step 2.2, for each pixel point of each frame obtained by the camera, whether the point is a moving pixel point is detected, which specifically includes the following steps: 2.2.1 setting a counter a to 0; 2.2.2 calculating the difference between the current depth value of the pixel point and a depth value in the history record library, and adding 1 to the counter a if the difference is larger than a certain set threshold value; 2.2.3 after step 2.2.2 has been performed for every history entry of the pixel in the history record library, setting the pixel as a motion point if the value of the counter a is greater than a threshold value.
Preferably, the certain set threshold is not uniquely fixed and can be adjusted according to actual needs; the threshold value compared with the value of the counter a is not uniquely fixed and can be adjusted according to actual needs.
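By way of illustration, steps 2.1 and 2.2 could be sketched in Python roughly as follows, assuming the depth maps are available as NumPy arrays; the function names and the two threshold values are illustrative placeholders, not values taken from this description.

import numpy as np

def init_history(first_frames):
    # Step 2.1: build the history record library from several tens of
    # consecutive depth frames -> one (H, W, N) array, N values per pixel.
    return np.stack(first_frames, axis=-1).astype(np.float32)

def detect_motion_points(depth_frame, history, diff_threshold=30.0, count_threshold=10):
    # Step 2.2: for every pixel, counter a counts how many history entries
    # differ from the current depth value by more than diff_threshold.
    diffs = np.abs(history - depth_frame[..., None])      # (H, W, N)
    counter_a = (diffs > diff_threshold).sum(axis=-1)     # (H, W)
    # A pixel whose counter exceeds count_threshold is marked as a motion point.
    return counter_a > count_threshold                    # boolean motion mask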
Further, the improved moving object detection algorithm is characterized in that, after all the motion points are obtained, step 2.3 is executed to perform kmeans clustering on all the motion points, which specifically comprises the following steps: 2.3.1 randomly selecting a part of the pixel points from all the pixel points as initial cluster centers; 2.3.2 for each of the remaining pixel points, assigning it to the cluster most similar to it according to its similarity (distance) to the cluster centers of step 2.3.1; 2.3.3 recalculating the cluster center of each new cluster obtained, i.e. calculating the mean of all objects in the new cluster; 2.3.4 calculating a criterion function; if the function converges, the algorithm terminates, otherwise steps 2.3.2, 2.3.3 and 2.3.4 are executed recursively, yielding a set of categories; 2.3.5 each category obtained in step 2.3.4 has a pixel center point and the motion pixel points corresponding to it; a threshold for the number of category elements is set, categories that do not reach the threshold are removed, and a category serial number is then assigned to each motion point.
Preferably, the threshold value of the number of the category elements is not uniquely fixed, and can be adjusted according to actual needs.
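A minimal sketch of the kmeans clustering of step 2.3 is shown below. It assumes the motion points are clustered by their pixel coordinates and uses scikit-learn's standard k-means, since this description does not name a particular implementation; the cluster count and the element-count threshold are placeholder values.

import numpy as np
from sklearn.cluster import KMeans

def cluster_motion_points(motion_mask, k=8, min_cluster_size=50):
    # Collect the coordinates of all motion points.
    ys, xs = np.nonzero(motion_mask)
    points = np.column_stack([xs, ys]).astype(np.float32)
    if len(points) < k:
        return np.full(len(points), -1), points

    # Steps 2.3.1-2.3.4: standard k-means until the criterion function converges.
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(points)

    # Step 2.3.5: drop categories whose element count does not reach the
    # threshold, then assign consecutive category serial numbers.
    keep = [c for c in range(k) if np.sum(labels == c) >= min_cluster_size]
    remap = {c: i for i, c in enumerate(keep)}
    labels = np.array([remap.get(int(l), -1) for l in labels])   # -1 = discarded
    return labels, points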
Further, the improved moving object detection algorithm is characterized in that, after the categories are obtained, step 2.4 is executed to perform region growing, which specifically includes the following steps: 2.4.1 comparing the depth value of a pixel point that has already been detected as a motion point with that of a nearby new pixel point to be detected; if the difference between the depth values is smaller than a set threshold value, the new pixel point is similar to the pixel point already detected as a motion point, and the new pixel point is therefore also set as a motion point; 2.4.2 according to step 2.4.1, if the new pixel point is judged to be a new motion point in two categories, the two categories have similar attributes, so the two categories are merged and given the same category serial number; this continues until all motion point detection is completed.
Preferably, the set threshold value compared with the depth difference value is not uniquely fixed, and can be adjusted according to actual needs.
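The region growing and category merging of step 2.4 might look roughly like the following sketch; the 4-connected neighbourhood, the union-find bookkeeping used to merge categories, and the depth threshold value are assumptions of this sketch, not details given in the description.

import numpy as np
from collections import deque

def region_grow(depth, labels_img, depth_threshold=20.0):
    # labels_img: -1 for non-motion pixels, a category serial number elsewhere.
    h, w = depth.shape
    parent = {}                      # union-find parents for merged categories

    def find(c):
        while parent.get(c, c) != c:
            c = parent[c]
        return c

    queue = deque(zip(*np.nonzero(labels_img >= 0)))
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if not (0 <= ny < h and 0 <= nx < w):
                continue
            # Step 2.4.1: a nearby pixel with a small depth difference is similar.
            if abs(float(depth[ny, nx]) - float(depth[y, x])) >= depth_threshold:
                continue
            if labels_img[ny, nx] < 0:            # new motion point joins this category
                labels_img[ny, nx] = labels_img[y, x]
                queue.append((ny, nx))
            elif find(labels_img[ny, nx]) != find(labels_img[y, x]):
                # Step 2.4.2: the same pixel is reached from two categories -> merge them.
                parent[find(labels_img[ny, nx])] = find(labels_img[y, x])

    # Write the merged serial numbers back.
    for c in np.unique(labels_img[labels_img >= 0]):
        labels_img[labels_img == c] = find(c)
    return labels_img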
Further, the improved moving object detection algorithm is characterized in that, after all the motion points are detected, it is judged whether the picture extracted in step 2.5 is a human hand; if it is judged to be a hand, the history record information of the pixel points is updated so as to increase the variation of the depth information of the motion points, and if it is judged not to be a hand, the information is kept unchanged.
Fig. 3 shows a main step of determining whether a recognized moving object is a human palm according to a gesture recognition control method used in the present technical solution. The method is characterized by comprising the following steps of: 3.1, feature selection and model training; 3.2 judging whether the target image is a human palm.
Further, step 3.1 includes the following steps: 3.1.2 selecting sample points from the data to be trained; 3.1.3 calculating the optimal division values of all the sample points; and 3.1.4 establishing a random forest corresponding to the sample points based on the optimal division value calculation result.
Further, before step 3.1.2, step 3.1.1 is performed: the images to be trained are obtained through at least two cameras, of which at least one is a depth camera and at least one is an RGB camera. The purpose of this step is to collect training data.
Preferably, the camera shoots downward from the vehicle roof, and the person to be recorded wears blue gloves on both hands and freely makes various gestures and actions with the palms inside the vehicle, including actions performed while driving, such as operating the steering wheel or the hand brake. The positions of the blue pixel points in the RGB image are extracted and used to obtain the corresponding hand region in the depth image, thereby labelling the palm data.
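A minimal sketch of this blue-glove labelling step is given below, assuming the RGB and depth images are registered pixel to pixel and using OpenCV for the colour conversion; the HSV range chosen for "blue" is an illustrative guess rather than a value from the description.

import numpy as np
import cv2

def label_palm_from_blue_glove(bgr_image, depth_map):
    # Extract the blue glove in HSV space; the bounds below are a rough guess.
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    blue_mask = cv2.inRange(hsv, (100, 80, 50), (130, 255, 255)) > 0
    valid = depth_map > 0                     # ignore pixels with no depth reading
    positive = blue_mask & valid              # palm pixels -> positive samples
    negative = (~blue_mask) & valid           # everything else -> negative samples
    return positive, negative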
Further, in the gesture recognition control method, in the step 3.1.2, a palm portion in the depth map is used as a positive sample, a non-palm portion is used as a negative sample, and the same number of pixels are randomly selected from the positive sample portion and the negative sample portion to be used as sample points to be trained.
Further, the gesture recognition control method is characterized in that the calculation of the optimal division value in step 3.1.3 includes the following steps: 3.1.3.1 calculating the depth average of each neighborhood of the sample point; 3.1.3.2 calculating the difference between the depth average of each neighborhood of the sample point and the depth value of the sample point; 3.1.3.3 calculating the information entropy; and 3.1.3.4 obtaining the optimal division value.
Further, the gesture recognition control method is characterized in that the depth average calculation in step 3.1.3.1 includes the following steps: 3.1.3.1.1 randomly selecting a sample point P; 3.1.3.1.2 calculating the average of the depth values of a square neighborhood centered on P; 3.1.3.1.3 calculating the average of the depth values of the neighborhood of every sample point using the method of step 3.1.3.1.2.
Further, the gesture recognition control method is characterized in that, in step 3.1.3.1.2, the average of the square neighborhood depth values centered on P is calculated as follows: the size of the neighborhood of point P takes the values 3, 5, 7, 9, ..., 2n+1 in sequence (the side length of the square neighborhood, in pixels), with P as the center point of the square neighborhood; if part of the neighborhood lies beyond the depth map, only the average of the depth values of the points within range is calculated.
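A small helper illustrating this neighbourhood averaging with boundary clipping could look as follows (the function name and array layout are hypothetical):

import numpy as np

def neighborhood_mean(depth, y, x, size):
    # Mean depth of the size x size (size = 2n+1) square neighbourhood centred
    # on (y, x); any part of the neighbourhood outside the depth map is ignored.
    half = size // 2
    h, w = depth.shape
    y0, y1 = max(0, y - half), min(h, y + half + 1)
    x0, x1 = max(0, x - half), min(w, x + half + 1)
    return float(depth[y0:y1, x0:x1].mean())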
Further, the gesture recognition control method is characterized in that the information entropy calculation in step 3.1.3.3 proceeds as follows: 3.1.3.3.1 dividing all positive and negative sample points into several equal parts, each containing positive and negative sample points in the same proportion; 3.1.3.3.2 for every sample point in one equal part, when the neighborhood size is 3, there is a difference d between its depth value and the 3-neighborhood depth average; 3.1.3.3.3 each difference d can divide all the differences into two parts, one larger than d and one smaller than d; 3.1.3.3.4 the final information entropy s is obtained from the information entropy formula; 3.1.3.3.5 by the same method as in steps 3.1.3.3.2 to 3.1.3.3.4, the corresponding information entropy is obtained for neighborhood sizes 5, 7 and 9.
Preferably, the information entropy is defined as follows: in a source, what is considered is not the uncertainty of a single symbol occurring, but the average uncertainty over everything the source may emit. If the source symbol can take n values U_1, ..., U_i, ..., U_n with corresponding probabilities P_1, ..., P_i, ..., P_n, and the symbols occur independently of one another, then the average uncertainty of the source is the statistical average (expectation) of the single-symbol uncertainty -log P_i, which is called the information entropy:
H(U) = E[-log P_i] = -∑_{i=1}^{n} P_i log P_i
The logarithm is usually taken to base 2, giving the entropy in bits; other bases and corresponding units may be used, and units can be converted with the change-of-base formula.
Further, by way of example, for a 3-neighborhood and 100 sample points, 100 difference values can be calculated, denoted (d1, d2, d3, ..., d100). From these 100 differences, k values (0 < k < 100) are randomly selected; for each selected difference used as a division value, the information entropy score of dividing the differences by that value, denoted S, is obtained from the definition of entropy. (For example, if 30 points fall on the left and 70 on the right after division, the information entropy is S = -(0.3×log(0.3) + 0.7×log(0.7)).)
Further, the step of obtaining the optimal division value in step 3.1.3.4 is as follows: 3.1.3.4.1 for the 3-neighborhood, selecting the largest of the information entropy values corresponding to all sample points, denoted S3, as the score of the 3-neighborhood, and recording the division value d at that moment as D3; 3.1.3.4.2 in the same way as step 3.1.3.4.1, when the neighborhood is 5 the score is S5 and the value is D5; when the neighborhood is 7 the score is S7 and the value is D7; when the neighborhood is 2n+1 the score is S(2n+1) and the value is D(2n+1); 3.1.3.4.3 the S value with the largest score is denoted Sm, the corresponding neighborhood is denoted m, and the corresponding division value is denoted Dm.
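Steps 3.1.3.2 to 3.1.3.4 could be sketched as below, scoring each candidate division value with the entropy of the resulting split as in the worked example above; the helper names, the number of candidate thresholds k and the use of base-2 logarithms are assumptions of this sketch.

import numpy as np

def split_entropy(diffs, d):
    # Entropy of splitting the difference values at threshold d,
    # e.g. a 30/70 split gives -(0.3*log2(0.3) + 0.7*log2(0.7)).
    p_left = float(np.mean(diffs < d))
    probs = np.array([p_left, 1.0 - p_left])
    probs = probs[probs > 0]
    return float(-np.sum(probs * np.log2(probs)))

def best_division(depth, sample_points, neighborhood_sizes=(3, 5, 7, 9), k=20):
    # Returns (m, Dm): the neighbourhood size and division value with the
    # largest split entropy over the given sample points (list of (y, x)).
    def nb_mean(y, x, size):
        half = size // 2
        h, w = depth.shape
        return float(depth[max(0, y - half):min(h, y + half + 1),
                           max(0, x - half):min(w, x + half + 1)].mean())

    best_m, best_d, best_s = None, None, -1.0
    for size in neighborhood_sizes:
        # Step 3.1.3.2: neighbourhood mean minus the sample point's own depth.
        diffs = np.array([nb_mean(y, x, size) - float(depth[y, x])
                          for y, x in sample_points])
        # Steps 3.1.3.3/3.1.3.4: try k randomly chosen differences as division values.
        for d in np.random.choice(diffs, size=min(k, len(diffs)), replace=False):
            s = split_entropy(diffs, d)
            if s > best_s:
                best_m, best_d, best_s = size, float(d), s
    return best_m, best_d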
Further, in step 3.1.4, a random forest corresponding to the sample points is established by the following steps: 3.1.4.1 constructing a decision tree based on the optimal division neighborhood and the optimal division value; 3.1.4.2 building the random forest based on the decision trees.
Preferably, each decision tree is a binary tree, and the decision forest is formed by a plurality of decision trees; each decision tree can be trained with the extracted pixel points or with different pixel points.
Further, in step 3.1.4.1, the steps of constructing a decision tree are: 3.1.4.1.1 storing the optimal neighborhood m and the optimal division value Dm obtained in step 3.1.3.4.3 on the root node of the decision tree as (m, Dm); 3.1.4.1.2 dividing one equal part of sample points into two parts based on the optimal division value stored on the root node, with the points whose d is larger than Dm on the left and the points whose d is smaller than Dm on the right; 3.1.4.1.3 recursively performing the contents of steps 3.1.3.3, 3.1.3.4 and 3.1.4.1 on the left and right parts until the class of a left or right subtree contains only positive samples or only negative samples, or the maximum depth of the tree is reached; 3.1.4.1.4 when the maximum depth is reached, the leaf nodes store the numbers of positive and negative sample points, thereby forming a decision tree.
Further, in step 3.1.4.2, a decision tree can be formed for each equal part of sample points, and the random forest is formed from these decision trees.
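A compact sketch of the tree and forest construction of step 3.1.4 is given below; the node and leaf dictionary layout, the maximum depth and the per-node sampling of candidate division values are illustrative choices, while the leaf nodes store positive and negative sample counts as the description requires.

import numpy as np

def build_tree(diffs_by_size, labels, depth=0, max_depth=10, k=20):
    # diffs_by_size: {neighbourhood size: array of 'neighbourhood mean minus
    #                 centre depth' differences, one per sample point}
    # labels: 1 for palm (positive) sample points, 0 for non-palm (negative).
    n_pos = int(labels.sum())
    n_neg = int(len(labels) - n_pos)
    if depth >= max_depth or n_pos == 0 or n_neg == 0:
        return {"leaf": (n_pos, n_neg)}       # leaf stores positive/negative counts

    def split_entropy(vals, d):
        p = float(np.mean(vals < d))
        probs = np.array([p, 1.0 - p])
        probs = probs[probs > 0]
        return float(-np.sum(probs * np.log2(probs)))

    # Choose the (neighbourhood, division value) pair with the largest entropy.
    best = None
    for size, vals in diffs_by_size.items():
        for d in np.random.choice(vals, size=min(k, len(vals)), replace=False):
            s = split_entropy(vals, d)
            if best is None or s > best[0]:
                best = (s, size, float(d))
    _, m, dm = best

    left = diffs_by_size[m] > dm              # left branch holds points with d > Dm

    def subset(mask):
        return {sz: v[mask] for sz, v in diffs_by_size.items()}, labels[mask]

    return {"node": (m, dm),
            "left": build_tree(*subset(left), depth + 1, max_depth, k),
            "right": build_tree(*subset(~left), depth + 1, max_depth, k)}

def build_forest(aliquots):
    # One decision tree per equal part of sample points -> the random forest.
    return [build_tree(diffs, labels) for diffs, labels in aliquots]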
Preferably, the feature selection and model training steps are completed offline; those skilled in the art understand that once model training is complete, it does not need to be repeated each time a target image is predicted, and the judgment is made based on the trained model.
Further, step 3.2, judging whether the target image is a human palm, can be further divided into the following steps: 3.2.1 judging whether each point of the target image is a point of a human palm based on the random forest obtained by the model training; 3.2.2 judging whether the target image is a human palm based on the result of step 3.2.1.
Further, step 3.2.1 can be further divided into the following steps: 3.2.1.1 calculating the depth difference: for each pixel point, take a decision tree, calculate the depth average of the optimal neighborhood m stored on its root node, and calculate the difference between that average and the depth value of the pixel point; 3.2.1.2 recursing down the decision tree: compare the difference calculated in step 3.2.1.1 with the optimal division value Dm stored in the node; if the difference is smaller than Dm recurse into the left branch, and if it is larger than Dm recurse into the right branch, continuing until a leaf node is reached, where the numbers of positive and negative samples are stored; 3.2.1.3 counting the positive and negative sample numbers of the point over all trees in the random forest and judging: for the pixel point, sum the positive and negative sample counts from all trees in the random forest; if the total positive count is larger than the negative count, the pixel point is a hand; if the total positive count is smaller than the negative count, the pixel point is not a hand.
Further, the method of judging whether the target image is a human hand in step 3.2.2 is as follows: after step 3.2.1 has been performed on every point in the target image, count the number of pixels predicted to be a hand and the number predicted not to be a hand; if the number of pixels predicted to be a hand is greater, the target image is a hand.
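Steps 3.2.1 and 3.2.2 might be implemented roughly as in the following sketch, which reuses the hypothetical node layout of the training sketch above; note that it branches left on differences greater than Dm so that prediction stays consistent with that training sketch.

import numpy as np

def predict_pixel(tree, depth, y, x):
    # Walk one decision tree for pixel (y, x); return the (positive, negative)
    # counts stored on the leaf node that is reached.
    h, w = depth.shape
    while "leaf" not in tree:
        m, dm = tree["node"]
        half = m // 2
        nb = depth[max(0, y - half):min(h, y + half + 1),
                   max(0, x - half):min(w, x + half + 1)]
        diff = float(nb.mean()) - float(depth[y, x])
        # Same branch convention as the training sketch: left holds diff > Dm.
        tree = tree["left"] if diff > dm else tree["right"]
    return tree["leaf"]

def is_hand_region(forest, depth, region_pixels):
    # Step 3.2.1: per-pixel vote over all trees; step 3.2.2: per-region vote.
    hand_pixels = 0
    for y, x in region_pixels:
        pos = neg = 0
        for tree in forest:
            p, n = predict_pixel(tree, depth, y, x)
            pos += p
            neg += n
        if pos > neg:
            hand_pixels += 1
    return hand_pixels > len(region_pixels) - hand_pixels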
Preferably, for pixel points judged to be a hand, the history information in the history record library is updated, and for points judged not to be a hand, the records in the history record library are kept unchanged; this increases the variation of the depth information of the motion points, so that moving pixel points can be extracted more effectively next time.
The above is a specific implementation of the technical scheme, through which the problem of interaction between the driver and the automobile during driving can be solved.

Claims (7)

1. A vehicle-mounted gesture interaction control method comprises the following steps:
(1) Identifying a moving object using an improved moving object detection algorithm; the step for identifying a moving object includes the steps of:
1.1) initializing;
1.2) detecting whether a pixel point is a motion point;
1.3) carrying out kmeans clustering on the motion points;
1.4) region growing;
1.5) extracting a region;
1.6) updating the pixel points;
(2) Judging whether the moving object identified in step (1) is a human palm by using a gesture recognition control method;
2.1) feature selection and model training;
2.1.1) collecting training data: images to be trained are obtained through at least two cameras, of which at least one is a depth camera and at least one is an RGB camera;
2.1.2) selecting sample points from the data to be trained: the palm part in the depth map is taken as positive samples and the non-palm part as negative samples, and the same number of pixel points are randomly selected from the positive and negative sample parts as the sample points to be trained;
2.1.3) calculating the optimal division values of all the sample points; the optimal division value calculation comprises the following steps:
2.1.3.1) calculating the depth average value of each neighborhood of the sample point;
a sample point P is randomly selected, the average of the depth values of the square neighborhood centered on P is calculated, and the average of the depth values of the neighborhood of every sample point is calculated; the average of the square neighborhood depth values centered on P is calculated as follows: the size of the neighborhood of point P takes the values 3, 5, 7, 9, ..., 2n+1 in sequence (the side length of the square neighborhood, in pixels), with P as the center point of the square neighborhood; if part of the neighborhood extends beyond the depth map, only the average of the depth values of the points within range is calculated;
2.1.3.2) calculating the difference value between the depth average value of each neighborhood of the sample point and the depth value of the sample point;
2.1.3.3) calculating the information entropy;
dividing all positive and negative sample points into several equal parts, each containing positive and negative sample points in the same proportion; for every sample point in one equal part, when the neighborhood is 3, there is a difference d between its depth value and the 3-neighborhood depth average; each difference d can divide all the differences into two parts, one larger than d and one smaller than d; the final information entropy s is obtained from the information entropy formula, and the corresponding information entropy is obtained in sequence when the neighborhood in 2.1.3.1 is 5, 7 and 9;
2.1.3.4) obtaining the optimal division value;
for the 3-neighborhood, selecting the largest of the information entropy values corresponding to all sample points, denoted S3, as the score of the 3-neighborhood, and recording the division value d at that moment as D3; in the same way, when the neighborhood is 5 the score is S5 and the value is D5; when the neighborhood is 7 the score is S7 and the value is D7; when the neighborhood is 2n+1 the score is S(2n+1) and the value is D(2n+1); the S value with the largest score is selected and denoted Sm, the corresponding neighborhood is denoted m, and the corresponding d is denoted Dm;
2.1.4) establishing a random forest corresponding to the sample points based on the optimal division value calculation result, specifically comprising the following steps:
2.1.4.1) constructing a decision tree based on the optimal division neighborhood and the optimal division value;
storing the optimal neighborhood m and the optimal division value Dm obtained in step 2.1.3.4 on the root node of the decision tree as (m, Dm); dividing one equal part of sample points into two parts based on the optimal division value stored on the root node, with the points whose d is larger than Dm on the left and the points whose d is smaller than Dm on the right; recursively executing the contents of steps 2.1.3.3, 2.1.3.4 and 2.1.4.1 on the left and right parts until the categories of the left and right subtrees contain only positive samples or only negative samples, or the maximum depth of the tree is reached; when the maximum depth is reached, the leaf nodes store the numbers of positive and negative sample points, thereby forming a decision tree;
2.1.4.2) forming a decision tree for each equal part of sample points; the random forest is formed from these decision trees, and model training is completed;
2.2) judging whether the target image is a human palm, further comprising: judging whether each point of the target image is a point of a human palm based on the random forest obtained by the model training, and then judging whether the target image is a human palm.
2. The interaction control method according to claim 1, wherein in step 1.2, for each pixel point of each frame image acquired by the camera, whether the point is a moving pixel point is detected, specifically comprising the following steps:
1.2.1. setting a counter a to 0;
1.2.2. calculating the difference between the current depth value of the pixel point and the depth value in the history record library, and adding 1 to the counter a if the difference is larger than a certain set threshold value;
1.2.3. after step 1.2.2 is performed on each history of the pixel in the history repository, if the value of the counter a is greater than a threshold value, the pixel is set as a motion point.
3. The control method according to claim 1 or 2, characterized in that a history record library is created for each pixel point by acquiring several tens of consecutive frames of depth maps by means of a depth camera.
4. The control method according to any one of claims 1 or 2, characterized in that after obtaining all the motion points, step 1.3 is performed, and kmeans clustering is performed on all the motion points, and specifically comprises the following steps:
1.3.1) randomly selecting a part of the pixel points from all the pixel points as initial cluster centers;
1.3.2) for each of the remaining pixel points, assigning it to the cluster most similar to it according to its similarity to the cluster centers of 1.3.1;
1.3.3) recalculating the cluster center of each new cluster obtained, i.e. calculating the mean of all objects in the new cluster;
1.3.4) calculating a criterion function; if the function converges, the algorithm terminates, otherwise steps 1.3.2, 1.3.3 and 1.3.4 are executed recursively to obtain a set of categories;
1.3.5) each category obtained in step 1.3.4 has a pixel center point and the motion pixel points corresponding to it; a threshold for the number of category elements is set, categories that do not reach the threshold are removed, and a category serial number is then assigned to each motion point.
5. The control method according to claim 4, wherein after the category is obtained, step 1.4 is performed to perform region growing, and the method specifically comprises the steps of:
1.4.1) comparing the depth value of a pixel point that has already been detected as a motion point with that of a nearby new pixel point to be detected; if the depth difference between the two is smaller than a set threshold value, the new pixel point is similar to the pixel point already detected as a motion point, and the new pixel point is therefore set as a motion point;
1.4.2) according to step 1.4.1, if the new pixel point is judged to be a new motion point in two categories, the two categories have similar attributes, so the two categories are merged and given the same category serial number, until all motion point detection is completed.
6. The control method according to claim 5, wherein it is judged whether the picture extracted in step 1.5 is a human hand; if it is judged to be a hand, the history record information of the pixel points is updated so as to increase the variation of the depth information of the motion points, and if it is judged not to be a hand, the information is kept unchanged.
7. The control method according to claim 6, wherein the step 2.2 of determining whether the target image is a human palm includes the steps of:
2.2.1) judging whether each point of the target image is a point of a human palm based on the random forest obtained by the model training;
calculating the depth difference value: for each pixel point, take a decision tree in the trained model, calculate the depth average of the optimal neighborhood m stored on its root node, and calculate the difference between that average and the depth value of the pixel point; compare the calculated difference with the optimal division value Dm stored in the node; if the difference is smaller than Dm recurse into the left branch, and if it is larger than Dm recurse into the right branch, continuing until a leaf node is reached and the numbers of positive and negative samples stored on that leaf node are obtained; sum the positive and negative sample counts of the point over all trees in the random forest; if the total positive count is larger than the negative count, the pixel point is a hand, otherwise it is not a hand;
2.2.2) judging whether the target image is a human palm based on the result of step 2.2.1;
after step 2.2.1 has been performed on every point in the target image, count the number of pixels predicted to be a hand and the number predicted not to be a hand; if the number of pixels predicted to be a hand is greater, the target image is a hand, otherwise it is not a hand.
CN201810606708.XA 2018-06-13 2018-06-13 Vehicle-mounted gesture interaction technology Active CN110598510B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810606708.XA CN110598510B (en) 2018-06-13 2018-06-13 Vehicle-mounted gesture interaction technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810606708.XA CN110598510B (en) 2018-06-13 2018-06-13 Vehicle-mounted gesture interaction technology

Publications (2)

Publication Number Publication Date
CN110598510A CN110598510A (en) 2019-12-20
CN110598510B true CN110598510B (en) 2023-07-04

Family

ID=68849600

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810606708.XA Active CN110598510B (en) 2018-06-13 2018-06-13 Vehicle-mounted gesture interaction technology

Country Status (1)

Country Link
CN (1) CN110598510B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103413145A (en) * 2013-08-23 2013-11-27 南京理工大学 Articulation point positioning method based on depth image
CN106778813A (en) * 2016-11-24 2017-05-31 金陵科技学院 The self-adaption cluster partitioning algorithm of depth image
CN106845513A (en) * 2016-12-05 2017-06-13 华中师范大学 Staff detector and method based on condition random forest
CN107688779A (en) * 2017-08-18 2018-02-13 北京航空航天大学 A kind of robot gesture interaction method and apparatus based on RGBD camera depth images

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140204013A1 (en) * 2013-01-18 2014-07-24 Microsoft Corporation Part and state detection for gesture recognition

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103413145A (en) * 2013-08-23 2013-11-27 南京理工大学 Articulation point positioning method based on depth image
CN106778813A (en) * 2016-11-24 2017-05-31 金陵科技学院 The self-adaption cluster partitioning algorithm of depth image
CN106845513A (en) * 2016-12-05 2017-06-13 华中师范大学 Staff detector and method based on condition random forest
CN107688779A (en) * 2017-08-18 2018-02-13 北京航空航天大学 A kind of robot gesture interaction method and apparatus based on RGBD camera depth images

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A Survey of Gesture Recognition Based on Depth Images; Chen Hao, Lu Haiming; Journal of Inner Mongolia University (Natural Science Edition); 2014-01-31; Vol. 45, No. 1; full text *

Also Published As

Publication number Publication date
CN110598510A (en) 2019-12-20

Similar Documents

Publication Publication Date Title
CN107203753B (en) Action recognition method based on fuzzy neural network and graph model reasoning
CN108460356B (en) Face image automatic processing system based on monitoring system
CN107492251B (en) Driver identity recognition and driving state monitoring method based on machine learning and deep learning
Shrivastava et al. Training region-based object detectors with online hard example mining
US8620024B2 (en) System and method for dynamic gesture recognition using geometric classification
EP1934941B1 (en) Bi-directional tracking using trajectory segment analysis
JP4513898B2 (en) Image identification device
CN111563417B (en) Pyramid structure convolutional neural network-based facial expression recognition method
CN112966691B (en) Multi-scale text detection method and device based on semantic segmentation and electronic equipment
CN109671102B (en) Comprehensive target tracking method based on depth feature fusion convolutional neural network
CN106778796B (en) Human body action recognition method and system based on hybrid cooperative training
CN112734775A (en) Image annotation, image semantic segmentation and model training method and device
CN106960181B (en) RGBD data-based pedestrian attribute identification method
CN105956560A (en) Vehicle model identification method based on pooling multi-scale depth convolution characteristics
CN106778501A (en) Video human face ONLINE RECOGNITION method based on compression tracking with IHDR incremental learnings
CN110032932B (en) Human body posture identification method based on video processing and decision tree set threshold
CN111886600A (en) Device and method for instance level segmentation of image
CN109165658B (en) Strong negative sample underwater target detection method based on fast-RCNN
CN110599463A (en) Tongue image detection and positioning algorithm based on lightweight cascade neural network
CN111444817B (en) Character image recognition method and device, electronic equipment and storage medium
CN111563549B (en) Medical image clustering method based on multitasking evolutionary algorithm
CN110580499B (en) Deep learning target detection method and system based on crowdsourcing repeated labels
US20210019547A1 (en) System and a method for efficient image recognition
CN114140696A (en) Commodity identification system optimization method, commodity identification system optimization device, commodity identification equipment and storage medium
CN110598510B (en) Vehicle-mounted gesture interaction technology

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220125

Address after: 518063 2W, Zhongdian lighting building, Gaoxin South 12th Road, Nanshan District, Shenzhen, Guangdong

Applicant after: Shenzhen point cloud Intelligent Technology Co.,Ltd.

Address before: 518023 No. 3039 Baoan North Road, Luohu District, Shenzhen City, Guangdong Province

Applicant before: Zhou Qinna

GR01 Patent grant