CN107886057B - Robot hand waving detection method and system and robot - Google Patents

Info

Publication number
CN107886057B
CN107886057B (application CN201711042859.9A)
Authority
CN
China
Prior art keywords
angular
point
palm
motion
points
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711042859.9A
Other languages
Chinese (zh)
Other versions
CN107886057A (en)
Inventor
张帆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Avatarmind Robot Technology Co ltd
Original Assignee
Nanjing Avatarmind Robot Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Avatarmind Robot Technology Co ltd filed Critical Nanjing Avatarmind Robot Technology Co ltd
Priority to CN201711042859.9A priority Critical patent/CN107886057B/en
Priority to PCT/CN2017/112211 priority patent/WO2019085060A1/en
Publication of CN107886057A publication Critical patent/CN107886057A/en
Application granted granted Critical
Publication of CN107886057B publication Critical patent/CN107886057B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/50 Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a robot hand-waving detection method, a detection system, and a robot. The method comprises the following steps: detecting a palm in a standard position in a video stream with a cascade classifier; extracting corner points of the palm in the standard position to obtain a corner point set; tracking each corner point of the set through the video stream to obtain a motion trajectory for each corner point; and judging from these motion trajectories whether the palm is waving. The invention detects hand waving in complex environments, has strong resistance to dynamic background noise, and achieves high detection accuracy.

Description

Robot hand waving detection method and system and robot
Technical Field
The invention relates to the field of image recognition, and in particular to a robot hand-waving detection method and system, and a robot.
Background
With the development of science and technology, artificial intelligence is increasingly popular, and people hope to interact with a robot as naturally as with a real person. However, the current modes of interacting with robots rely mainly on the mouse, keyboard, and touch screen; these traditional modes have many limitations and no longer satisfy users, so more intelligent interaction methods need to be designed. Vision is an important channel through which people understand each other during interaction: by observing the body movements of an interaction partner, one can understand the partner's intent. Therefore, to make robot-human interaction more lifelike, the robot must understand some human body language, such as waving a hand in greeting.
The invention patent application No. 201610859376.7 discloses a hand-waving detection method based on motion history images. That method first detects the rough area where a human body is located using a human body detector, and then analyzes the area based on a motion history image to determine whether the person is waving. However, analysis based on motion history images is only suitable for a static background: when a person is in a complex dynamic background, the moving background is easily misjudged as a waving hand. In practice, that algorithm performs a differential operation on three consecutive frames of the area to be detected, obtains a binary image from the difference image by thresholding, and finally judges hand waving from the accumulated history of binary images. Because it relies on frame differencing of consecutive images, a non-static background, for example another person moving through the area to be detected, easily interferes and is misidentified as a waving hand.
Therefore, it is necessary to design a method that can accurately recognize a waving hand even against a complex background, so as to improve the intelligence of human-computer interaction.
Disclosure of Invention
The invention provides a robot hand-waving detection method, a detection system, and a robot, which recognize and track a palm against a complex background, detect whether the hand is waving, and resist interference from background noise. The technical scheme is as follows:
a method of detecting a hand wave for a robot, comprising: detecting a palm in a standard position in a video stream with a cascade classifier; extracting corner points of the palm in the standard position to obtain a corner point set; tracking each corner point of the set through the video stream to obtain a motion trajectory for each corner point; and judging from these motion trajectories whether the palm is waving.
Compared with prior-art hand-waving detection methods, this palm detection can be performed on a single video frame, and a wave can be detected while it is in progress, without combining historical data from preceding and following frames. Gesture recognition is completed against a complex background by the cascade classifier, and the corner points of the palm are then detected and tracked, realizing hand-waving detection in complex environments with strong resistance to dynamic background noise and high detection accuracy.
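The four-step scheme above can be sketched as a simple control flow. This is only an illustrative skeleton: every callable here is a hypothetical placeholder that a real system would supply (for example with a cascade classifier and optical flow tracking), not the patent's implementation.

```python
# Hypothetical sketch of the four-step pipeline (detect palm -> extract corners
# -> track corners -> judge wave); every callable is a placeholder supplied by
# the caller, e.g. a cascade classifier and an optical-flow tracker.
def detect_wave(frames, detect_palm, extract_corners, track_corners, judge_wave):
    """Return True as soon as a wave is detected in the frame sequence."""
    for i, frame in enumerate(frames):
        roi = detect_palm(frame)                     # step 1: palm in standard position
        if roi is None:
            continue                                 # no palm in this frame
        corners = extract_corners(frame, roi)        # step 2: corner point set
        tracks = track_corners(frames[i:], corners)  # step 3: per-corner trajectories
        return judge_wave(tracks)                    # step 4: trajectory-based decision
    return False
```

Note that, consistent with the text, the pipeline starts from a single frame in which the palm is found; no history prior to that frame is needed.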
Preferably, judging whether the user is waving according to the motion trajectory of each corner point in the set specifically includes: counting the number of effective motion corner points in the set from the trajectory of each corner point between two adjacent frames of the video stream and a preset pixel distance; evaluating the detection condition of all corner points in the set from the number of effective motion corner points and a first preset number; when the detection condition meets the preset requirement, counting the number of effective waving corner points among the effective motion corner points from their motion trajectories; and judging whether the user is waving from the number of effective waving corner points and a second preset number.
By first detecting corner trajectories in the video stream and judging whether each corner performs effective motion, the accuracy of hand-waving detection is improved; by then judging whether a corner moves effectively both leftwards and rightwards, it can be reliably decided whether the palm is waving.
Preferably, counting the number of effective motion corner points in the set from each corner's trajectory between two adjacent frames of the video stream and the preset pixel distance specifically includes: calculating, from the trajectory of each corner point, the distance the same corner point moves between two adjacent frames of the video stream; judging whether this motion distance is larger than the preset pixel distance; when it is larger, judging that the corner point performs effective motion and counting it as an effective motion corner point; when it is smaller, judging that the corner point performs invalid motion and subtracting 1 from the effective-motion count N_track. The final N_track is the number of effective motion corner points in the set, where N_track is initialized to the total number N of corner points in the set.
the angular points which do effective motion can be effectively screened out by comparing the angular point motion distance with the preset pixel distance, and the effective motion angular points N obtained from the angular pointstrackThe quantity of the palm is accurate, and preliminary preparation is made for judging whether the palm swings the hand or not.
Preferably, evaluating the detection condition of all corner points in the set from the number of effective motion corner points and the first preset number specifically includes: judging whether N_track is larger than the first preset number; when N_track is smaller than the first preset number, stopping the trajectory detection of the effective motion corner points and judging that the user is not waving; when N_track is larger than the first preset number, continuing to detect the trajectories of the effective motion corner points.
Judging whether N_track exceeds the first preset number gives a preliminary decision on whether the palm is waving: if the number of effective motion corner points is too low, the palm can be judged to be still; if N_track is larger than the first preset number, the next judgment step proceeds.
Preferably, when N_track is larger than the first preset number, the leftward and rightward motion distances of each effective motion corner point between two adjacent frames of the video stream are detected; whether both the leftward and the rightward motion distances are larger than a preset distance is judged; and when both are larger than the preset distance, the corner point is recorded as an effective waving corner point and the count N_move of effective waving corner points is incremented by 1. The final N_move is the number of effective waving corner points in the set, where N_move is initialized to 0.
If N_track is larger than the first preset number, judging whether each effective motion corner point moves effectively both leftwards and rightwards yields an accurate count N_move of effective waving corner points, facilitating the final judgment step.
Preferably, judging whether the user is waving from the number of effective waving corner points and the second preset number specifically includes: judging whether N_move is larger than the second preset number; when N_move is larger than the second preset number, judging that the palm is waving; otherwise, judging that the palm is not waving.
Because the palm must sweep left and right while waving, the trajectories of the palm corner points also move left and right. The number of effective waving corner points can therefore be detected, and when it exceeds the second preset number, it can be accurately decided whether the palm is waving.
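The full two-threshold decision chain, N_track against the first preset number and then N_move against the second, can be sketched as below. All threshold values are illustrative assumptions; the patent leaves them as preset parameters.

```python
import math

# Sketch of the two-threshold wave decision described above. Thresholds
# (min_pixel_dist, lr_dist, first_preset, second_preset) are assumed values.
def is_waving(tracks, min_pixel_dist=2.0, lr_dist=5.0,
              first_preset=3, second_preset=2):
    # Effective motion corners: some adjacent-frame step exceeds min_pixel_dist.
    effective = [t for t in tracks
                 if any(math.hypot(x1 - x0, y1 - y0) > min_pixel_dist
                        for (x0, y0), (x1, y1) in zip(t, t[1:]))]
    if len(effective) <= first_preset:  # too few moving corners: not a wave
        return False
    # Effective waving corners: move farther than lr_dist both left and right.
    n_move = 0
    for t in effective:
        dxs = [x1 - x0 for (x0, _), (x1, _) in zip(t, t[1:])]
        if max(dxs) > lr_dist and min(dxs) < -lr_dist:
            n_move += 1
    return n_move > second_preset
```

The left-right requirement is what rejects a dynamic background: a corner drifting steadily in one direction never satisfies both the leftward and rightward distance tests.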
Preferably, before detecting the palm in the standard position in the video stream with the cascade classifier, the method comprises: making palm training samples; extracting the LBP texture feature vector of each palm training sample to obtain the processed palm training samples; and training the cascade classifier with the processed palm training samples.
Before hand-waving detection, an AdaBoost cascade classifier must be trained; extracting LBP texture features from the palm training samples used for this training improves the recognition accuracy of the AdaBoost cascade classifier.
Preferably, making the palm training samples specifically comprises: collecting palm samples of different users against different backgrounds, including samples in which the palm is rotated within a preset angle range relative to the standard position.
When screening palm samples, to make LBP feature extraction convenient, the selected samples are restricted to palms within a preset angle range of the standard position, which raises the gesture recognition success rate of the cascade classifier near that position. Collecting palm samples of different users against different backgrounds increases sample diversity; when the collected training samples contain some variation in shape and illumination, the resulting detection works well and the classifier's gesture recognition success rate improves.
Preferably, extracting the LBP texture feature vector of a palm training sample specifically includes: dividing the detection window into N x N pixel regions, where N ∈ {64, 32, 16, 8}, and comparing the gray value of each central pixel point in a region with the gray values of its 8 adjacent pixel points; if an adjacent pixel's value is larger than the central pixel's, it is marked 1, otherwise 0, producing an 8-bit binary number used as the LBP value of the central pixel, so that the LBP value of every pixel in the region can be calculated; computing a statistical histogram of the region from the LBP values of its pixels and normalizing the histogram; and concatenating the statistical histograms of all regions of the sample to form its LBP texture feature vector.
before training the cascade classifier, the LBP characteristic detection is carried out on the palm sample, so that the trained cascade classifier has high calculation speed and can be accurately positioned to the position of the palm on the image.
Preferably, detecting the corner points of the palm region to obtain a corner point set, and tracking all corner points of the set through the video stream to obtain each corner point's motion trajectory, specifically comprise: extracting the corner points of the palm in the standard position with a corner detection algorithm to obtain the corner point set; performing optical flow tracking on all corner points of the set through the video stream; detecting the positions of the corner points after movement with the optical flow tracking algorithm; and calculating the motion trajectory of each corner point with a sparse optical flow algorithm.
A corner detection algorithm effectively extracts the corner points of the palm region; an optical flow tracking algorithm then follows each corner point through the video stream, and a sparse optical flow algorithm recovers the corner trajectories, preparing for the subsequent hand-waving detection.
A robot hand-waving detection system comprises: a detection module for detecting the palm in a standard position in a video stream with a cascade classifier; a corner extraction module, electrically connected with the detection module, for extracting the corner points of the palm in the standard position to obtain a corner point set; a corner tracking module, electrically connected with the corner extraction module, for tracking each corner point of the set through the video stream to obtain each corner point's motion trajectory; and a judging module, electrically connected with the corner tracking module, for judging from these trajectories whether the palm is waving.
Preferably, the judging module includes an analysis sub-module that counts the number of effective motion corner points in the set from each corner's trajectory between two adjacent frames of the video stream and the preset pixel distance. The analysis sub-module further evaluates the detection condition of all corner points from the number of effective motion corner points and the first preset number, and, when that condition meets the preset requirement, counts the effective waving corner points among the effective motion corner points from their trajectories. A judging sub-module, electrically connected with the analysis sub-module, judges whether the user is waving from the number of effective waving corner points and the second preset number.
Preferably, the judging module further includes a calculation sub-module for calculating, from each corner's trajectory, the distance the same corner point moves between two adjacent frames of the video stream. The judging sub-module, also electrically connected with the calculation sub-module, judges whether this motion distance is larger than the preset pixel distance; when it is larger, the corner point is judged to perform effective motion and counted as an effective motion corner point; when it is smaller, the corner point is judged to perform invalid motion and the count N_track is decremented by 1. The final N_track is the number of effective motion corner points in the set, where N_track is initialized to the total number N of corner points in the set.
preferably, the determining sub-module is further configured to determine the number N of effective motion corner pointstrackWhether the number is larger than a first preset number; the judgment sub-module is further used for judging the number N of the effective motion corner pointstrackWhen the number of the effective motion angular points is less than the first preset number, stopping detecting the motion trail of the effective motion angular points, and judging that the user does not wave the hand; the judgment sub-module is further used for judging the number N of the effective motion corner pointstrackAnd when the number of the effective motion corner points is larger than the first preset number, continuously detecting the motion trail of the effective motion corner points.
Preferably, the judging module further includes a detection sub-module for detecting, when N_track is larger than the first preset number, the leftward and rightward motion distances of each effective motion corner point between two adjacent frames of the video stream. The judging sub-module further judges whether both distances are larger than the preset distance; when both are, it records the corner point as an effective waving corner point and increments the count N_move by 1. The final N_move is the number of effective waving corner points in the set, where N_move is initialized to 0.
Preferably, the judging sub-module is further configured to judge whether N_move is larger than the second preset number; when N_move is larger than the second preset number, it judges that the palm is waving; otherwise, it judges that the palm is not waving.
Preferably, the system further comprises: a sample making module for making palm training samples; a feature extraction module for extracting the LBP texture feature vector of each palm training sample to obtain the processed samples; and a classifier training module, electrically connected with the feature extraction module, for training the cascade classifier with the processed palm training samples.
Preferably, the feature extraction module includes a processing sub-module for dividing the detection window into N x N pixel regions, where N ∈ {64, 32, 16, 8}, and comparing the gray value of each central pixel point in a region with the gray values of its 8 adjacent pixel points, and a feature value calculation sub-module, electrically connected with the processing sub-module: if an adjacent pixel's value is larger than the central pixel's, it is marked 1, otherwise 0, producing an 8-bit binary number used as the LBP value of the central pixel, so that the LBP value of every pixel in the region can be calculated. The processing sub-module is further used to compute a statistical histogram of each region from its LBP values, normalize it, and concatenate the histograms of all regions of the sample into its LBP texture feature vector.
preferably, the corner point tracking module is further configured to extract the corner points of the palm located in the standard position through a corner point detection algorithm to obtain a corner point set, perform optical flow tracking on all corner points in the corner point set in a video stream, detect positions of all corner points in the corner point set after movement according to the optical flow tracking algorithm, and calculate a movement track corresponding to each corner point according to a sparse optical flow algorithm.
A robot integrates the robot hand-waving detection system described above.
The robot hand-waving detection method, system, and robot provided by the invention achieve at least one of the following beneficial effects:
1. Compared with prior-art hand-waving detection methods, palm detection can be performed on a single video frame, and a wave can be detected while it is in progress, without combining historical data from preceding and following frames. Gesture recognition is completed against a complex background by the cascade classifier, and the corner points of the palm are then detected and tracked, realizing hand-waving detection in complex environments with strong resistance to dynamic background noise and high detection accuracy.
2. A corner detection algorithm effectively extracts the corner points of the palm region; an optical flow tracking algorithm then follows each corner point through the video stream, and a sparse optical flow algorithm recovers the corner trajectories, preparing for the subsequent hand-waving detection.
3. By first detecting corner trajectories in the video stream and judging whether each corner performs effective motion, the accuracy of hand-waving detection is improved. By then judging whether a corner moves effectively both leftwards and rightwards, it can be reliably decided whether the palm is waving.
4. When screening palm samples, to make LBP feature extraction convenient, the selected samples are restricted to palms within a preset angle range of the standard position, which raises the gesture recognition success rate of the cascade classifier near that position. Collecting palm samples of different users against different backgrounds increases sample diversity; when the collected training samples contain some variation in shape and illumination, the resulting detection works well and the classifier's gesture recognition success rate improves.
5. Performing LBP feature extraction on the palm samples before training makes the trained cascade classifier fast to compute and able to locate the palm accurately in the image.
Drawings
The above features, technical characteristics, and advantages of the robot hand-waving detection method, system, and robot, and their implementation, are further described below through preferred embodiments with reference to the accompanying drawings.
FIG. 1 is a flowchart of an embodiment of a method for detecting a waving hand of a robot according to the present invention;
FIG. 2 is a flowchart of another embodiment of a method for detecting a waving hand of a robot according to the present invention;
FIG. 3 is a schematic view of a palm swing of the present invention;
FIG. 4 is a flowchart of another embodiment of a method for detecting a waving hand of a robot according to the present invention;
FIG. 5 is a diagram of palm corner point tracking detection according to the present invention;
FIG. 6 is a flowchart of another embodiment of a method for detecting a waving hand of a robot according to the present invention;
FIG. 7 is a schematic view of a hand swing detecting system of a robot according to the present invention;
FIG. 8 is another schematic view of the hand swing detecting system of the robot according to the present invention;
FIG. 9 is another schematic view of the hand swing detecting system of the robot according to the present invention;
the reference numbers illustrate:
the system comprises a sample making module-1, a feature extraction module-2, a processing sub-module-21, a feature value calculating sub-operator module-22, a classifier training module-3, a detection module-4, a corner extraction module-5, a corner tracking module-6, a judgment module-7, an analysis sub-module-71, a judgment sub-module-72 and a calculation sub-module-73.
Detailed Description
To illustrate the embodiments of the invention and the technical solutions of the prior art more clearly, the following description refers to the accompanying drawings. Obviously, the drawings described below are only some examples of the invention; a person skilled in the art can derive other drawings and embodiments from them without inventive effort.
For simplicity, the drawings schematically show only the parts relevant to the invention and do not represent the actual structure of a product. In addition, to keep the drawings concise and understandable, components with the same structure or function are in some drawings only schematically illustrated or only partially labeled. In this document, "one" means not only "only one" but can also mean "more than one".
As shown in fig. 1, the present invention provides an embodiment of a method for detecting a waving hand of a robot, including:
S4, detecting a palm at the standard position in the video stream through the cascade classifier;
S5, extracting the corner points of the palm at the standard position to obtain a corner point set;
S6, tracking and detecting each corner point in the corner point set in the video stream to obtain a motion trail corresponding to each corner point;
S7, judging whether the palm is waving according to the motion trail of each corner point in the corner point set.
Specifically, in this embodiment, an AdaBoost cascade classifier first detects a palm at the standard position in the video stream. After the palm at the standard position is detected, corner points on the palm are acquired by a corner acquisition method, the corner points are tracked in the video stream to obtain the motion trail of each corner point, and whether the palm in the video stream is waving can be judged from the motion trails.
In this embodiment, the robot detects the palm and the motion trails of the corner points on the palm synchronously while acquiring the video stream; hand waving detection is not performed on the basis of historical images, and the AdaBoost cascade classifier gives a high recognition rate for the palm.
In this embodiment, whether the palm is waving is determined by detecting the motion trails of the corner points: the motion trail of each corner point in the corner point set is examined, and if more than half of the corner points perform effective left-to-right motion, it can be determined that the user is waving.
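The majority rule just described can be sketched as a tiny Python helper; the function name and the boolean-flag input are illustrative assumptions, not part of the patent:

```python
def is_waving(effective_motion_flags):
    """Majority-vote decision: True when more than half of the tracked
    corner points performed effective left-to-right motion."""
    flags = list(effective_motion_flags)
    return bool(flags) and sum(flags) > len(flags) / 2
```

Any upstream tracker that can label each corner point as "moved effectively left-to-right" or not can feed this decision.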
As shown in fig. 2, the present invention also provides another embodiment of a method for detecting a waving hand of a robot, including:
S1, making a palm training sample;
S2, extracting LBP texture feature vectors of the palm training samples to obtain extracted and processed palm training samples;
S3, training a cascade classifier with the extracted and processed palm training samples;
S4, detecting a palm at the standard position in the video stream through the cascade classifier;
S5, extracting the corner points of the palm at the standard position to obtain a corner point set;
S6, tracking and detecting each corner point in the corner point set in the video stream to obtain a motion trail corresponding to each corner point;
S7, judging whether the palm is waving according to the motion trail of each corner point in the corner point set.
Preferably, the making of the palm training sample specifically comprises: S11, collecting palm samples of different users under different backgrounds, the palm samples including samples in which the palm is rotated within a preset angle range relative to the standard position.
Preferably, the process of extracting the LBP texture feature vector of the palm training sample specifically includes:
s21, dividing the detection window into N multiplied by N pixel regions, and comparing the gray value of a central pixel point in the pixel regions with the gray value of 8 adjacent pixel points, wherein N belongs to {64,32,16,8 };
s22, if the pixel value of the neighboring pixel is greater than the pixel value of the central pixel, the neighboring pixel is marked as 1, otherwise, the neighboring pixel is 0, thereby generating an 8-bit binary number as the LBP value of the central pixel, and thereby calculating the LBP value of each pixel in the pixel region;
s23, calculating a statistical histogram of the pixel region according to the LBP value obtained by each pixel point in the pixel region, and carrying out normalization processing on the statistical histogram;
s24, connecting the statistical histograms of each pixel region of the palm training sample to form an LBP texture feature vector of the palm training sample;
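Steps S21 to S24 can be illustrated with a minimal NumPy sketch; the clockwise bit ordering, the 256-bin histogram and the function names are assumptions made for the example, since the text fixes only the greater-than comparison and the 8-bit code:

```python
import numpy as np

def lbp_value(patch3x3):
    """LBP code of the centre pixel of a 3x3 patch: each of the 8
    neighbours contributes a 1-bit when its gray value exceeds the
    centre's, read clockwise from the top-left (assumed ordering)."""
    c = patch3x3[1, 1]
    neighbours = [patch3x3[0, 0], patch3x3[0, 1], patch3x3[0, 2],
                  patch3x3[1, 2], patch3x3[2, 2], patch3x3[2, 1],
                  patch3x3[2, 0], patch3x3[1, 0]]
    code = 0
    for bit, n in enumerate(neighbours):
        if n > c:
            code |= 1 << (7 - bit)
    return code

def lbp_histogram(cell):
    """Normalized 256-bin histogram of the LBP codes over one cell
    (border pixels are skipped for simplicity)."""
    h, w = cell.shape
    hist = np.zeros(256)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            hist[lbp_value(cell[y - 1:y + 2, x - 1:x + 2])] += 1
    s = hist.sum()
    return hist / s if s else hist
```

Concatenating the per-cell histograms, as in S24, then yields the LBP texture feature vector of the whole sample.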
Specifically, the cascade classifier needs to be trained before it can detect the palm at the standard position in the video stream; this embodiment describes the sample making process for the cascade classifier and how the cascade classifier is trained.
As shown in FIG. 3, the palm of the user moves back and forth among the first, second and third positions while waving. In this embodiment, the palm in the video frame is detected by the AdaBoost algorithm combined with the LBP feature. Since this algorithm is not rotation invariant, only a palm whose rotation angle is within ±15° of a certain standard position can be recognized. In this embodiment, the gesture with the palm at the second position is selected as the standard palm; gestures at the other positions are not recognized.
The training process of the cascade classifier can be divided into: 1. making a palm training sample; 2. extracting the LBP characteristics of the palm training sample; 3. a cascade classifier is trained.
When palm training samples are collected, in order to extract LBP feature values, the selected samples are palms rotated within a preset angle range relative to the standard position; the preset angle range may be ±15°, that is, the samples include gesture samples whose rotation angle relative to the standard position is within ±15°, which improves the gesture recognition success rate of the cascade classifier near the standard position. Palm samples of different users under different backgrounds are collected to increase sample diversity; when the collected palm training samples vary somewhat in shape and illumination, the detection effect is very good, and the gesture recognition success rate of the cascade classifier is improved to a greater extent.
The extraction process of the LBP features of the palm training sample is as follows: a. divide the detection window into 16×16 small regions (cells); in each cell, compare the gray value of each pixel with those of its 8 neighboring pixels, and if a surrounding pixel value is greater than the central pixel value, mark that position as 1, otherwise 0. In this way the 8 points in each 3×3 neighborhood produce, by comparison, an 8-bit binary number, which is the LBP value of the central pixel of the window; b. compute a histogram for each cell, i.e. the frequency of occurrence of each LBP value (taken as a decimal number), and normalize the histogram; c. finally, concatenate the statistical histograms of all cells into one feature vector, which is the LBP texture feature vector of the whole image.
The AdaBoost algorithm used to train the cascade classifier is an iterative algorithm. Its core idea is to train different classifiers (weak classifiers) on the same training set and then combine these weak classifiers into a stronger final classifier. The algorithm has the following 3 steps:
a. Initialize the weight distribution of the training data. If there are N samples, each training sample is initially given the same weight 1/N:
D1 = (w11, w12, ..., w1N), w1i = 1/N, i = 1, 2, ..., N
b. Train the weak classifier. If a sample point has been accurately classified, its weight is reduced when the next training set is constructed; conversely, if a sample point has not been classified accurately, its weight is increased. The sample set with updated weights is then used to train the next classifier, and the whole training process proceeds iteratively. A basic classifier Gm is learned from the training data set with this weight distribution (the threshold with the lowest error rate is selected to design the basic classifier):
Gm(x):x→{-1,+1}
calculate the classification error rate on the training data set:
em = P(Gm(xi) ≠ yi) = Σi=1..N wmi · I(Gm(xi) ≠ yi)
from the above equation, the error rate on the training data set is the sum of the weights of the misclassified samples.
Calculate the coefficient αm of Gm(x), which represents the importance of Gm(x) in the final classifier (objective: obtain the weight that the basic classifier occupies in the final classifier):
αm = (1/2) · ln((1 − em) / em)
As can be seen from the above equation, αm ≥ 0 when em ≤ 1/2, and αm increases as em decreases, which means that a basic classifier with a smaller classification error rate plays a larger role in the final classifier.
The weight distribution of the training data set is updated (in order to obtain a new weight distribution of the samples) for the next iteration
Dm+1=(wm+1,1,wm+1,2,...,wm+1,N)
wm+1,i = (wmi / Zm) · exp(−αm · yi · Gm(xi)), i = 1, 2, ..., N, where Zm is a normalization factor that makes Dm+1 a probability distribution.
In this way, the weights of the samples misclassified by the basic classifier are increased and the weights of the correctly classified samples are decreased, so the AdaBoost method can "focus on" those samples that are harder to separate.
c. Combine the weak classifiers obtained by training into a strong classifier. After the training of each weak classifier finishes, the weight of a weak classifier with a small classification error rate is increased so that it plays a larger decision role in the final classification function, while the weight of a weak classifier with a large classification error rate is reduced so that it plays a smaller decision role. In other words, weak classifiers with low error rates occupy a larger weight in the final classifier, and those with high error rates a smaller one.
Combining the weak classifiers to obtain a final classifier as follows:
G(x) = sign(f(x)) = sign(Σm=1..M αm · Gm(x))
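The three AdaBoost steps above reduce to a little arithmetic that can be checked numerically; the following is a sketch of the update rules, not the patent's implementation, and all function names are assumed:

```python
import math

def alpha(err):
    """Coefficient of a weak classifier: alpha_m = (1/2) ln((1 - e_m)/e_m)."""
    return 0.5 * math.log((1.0 - err) / err)

def update_weights(weights, labels, preds, a):
    """One boosting round: w_{m+1,i} = w_mi * exp(-a * y_i * G_m(x_i)) / Z_m,
    so misclassified samples gain weight and correct ones lose it."""
    raw = [w * math.exp(-a * y * g) for w, y, g in zip(weights, labels, preds)]
    z = sum(raw)  # normalization factor Z_m
    return [r / z for r in raw]

def final_classify(x, weak_learners):
    """Final classifier: sign of the alpha-weighted vote of (alpha, G) pairs."""
    s = sum(a * g(x) for a, g in weak_learners)
    return 1 if s >= 0 else -1
```

For example, with four equally weighted samples and one misclassification, em = 0.25 gives αm = 0.5·ln 3 ≈ 0.55, and the update rule raises the misclassified sample's weight from 1/4 to 1/2.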
because the improved palm detection algorithm is adopted, the method has the following advantages:
1. the gesture detection algorithm performs palm detection on separate video frames without combining historical data of previous and subsequent video frames.
2. The detection algorithm works on the LBP features of the palm image; the calculation is fast, and the position of the palm in the image can be accurately located.
3. When the collected palm training samples have certain shape and illumination difference, the obtained detection effect is very good, and the recognition rate is high.
As shown in fig. 4, the present invention also provides another embodiment of a method for detecting a waving hand of a robot, including:
s11, acquiring palm samples of different users under different backgrounds, wherein the palm samples comprise samples of a palm rotating within a preset angle range relative to a standard position;
s21, dividing the detection window into N multiplied by N pixel regions, and comparing the gray value of a central pixel point in the pixel regions with the gray value of 8 adjacent pixel points, wherein N belongs to {64,32,16,8 };
s22, if the pixel value of the neighboring pixel is greater than the pixel value of the central pixel, the neighboring pixel is marked as 1, otherwise, the neighboring pixel is 0, thereby generating an 8-bit binary number as the LBP value of the central pixel, and thereby calculating the LBP value of each pixel in the pixel region;
s23, calculating a statistical histogram of the pixel region according to the LBP value obtained by each pixel point in the pixel region, and carrying out normalization processing on the statistical histogram;
s24, connecting the statistical histograms of each pixel region of the palm training sample to form an LBP texture feature vector of the palm training sample;
s3 training a cascade classifier by extracting the processed palm training sample;
s4, detecting a palm at the standard position in the video stream through the cascade classifier;
s51, extracting the corner points of the palm in the standard position through a corner point detection algorithm to obtain a corner point set;
s61, carrying out optical flow tracking on all the angular points in the angular point set in a video stream, detecting the positions of all the angular points in the angular point set after movement according to an optical flow tracking algorithm, and calculating to obtain a movement track corresponding to each angular point according to a sparse optical flow algorithm;
s71, analyzing and obtaining the number of effective motion corner points in the corner point set according to the motion trail of each corner point in the corner point set between two adjacent frames in the video stream and the preset pixel distance;
s72, when the number of the effective motion corner points is larger than a first preset number, continuing to track and detect all corner points in the corner point set;
s73, analyzing and obtaining the number of effective hand waving corner points in the effective motion corner points according to the motion trail of the effective motion corner points;
and S74, when the number of the effective hand waving corner points is larger than a second preset number, judging that the user is waving hands.
The embodiment specifically explains the detection and tracking of the corner points of the palm, and how to detect the process of waving the hand according to the motion tracks of the corner points.
Specifically, fig. 5 shows the palm motion detection and tracking process: first, a palm at the standard position is detected in a video frame; second, corner points in the palm area are extracted by a corner detection algorithm; the extracted corner points are numerous, a part of them is shown schematically in the figure, and the black dots are the detected corner points; third, optical flow tracking is performed on the video frames, taking as initial positions the corner points detected when the palm was at the standard position; the fourth and fifth stages are the moved positions of the corner points detected by the optical flow tracking algorithm in subsequent video frames; L1 and L2 are the motion paths of the hand corner points traced by the sparse optical flow algorithm during the palm swing.
In this embodiment, a corner detection algorithm is adopted to extract the corner points, so the corner points of the palm region can be extracted effectively; other feature point extraction methods, such as the FAST feature point extraction method, may also be used and are not detailed here. An optical flow tracking algorithm is adopted to track the corner points in the video stream, so each corner point can be tracked effectively and the effectiveness of tracking is improved; the motion trails of the corner points can be better detected by the sparse optical flow algorithm, which facilitates the subsequent hand waving detection.
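As an illustration of the corner extraction step, the following computes a Harris-style corner response in plain NumPy; the patent does not name a specific corner detector, so the Harris measure, the 3×3 window and k = 0.04 are conventional assumptions for this sketch:

```python
import numpy as np

def harris_response(img, k=0.04, win=1):
    """Harris corner response R = det(M) - k * trace(M)^2, where M sums
    the gradient products over a (2*win+1) x (2*win+1) window."""
    iy, ix = np.gradient(img.astype(float))   # gradients along rows, cols
    ixx, iyy, ixy = ix * ix, iy * iy, ix * iy
    h, w = img.shape
    r = np.zeros((h, w))
    for y in range(win, h - win):
        for x in range(win, w - win):
            sl = (slice(y - win, y + win + 1), slice(x - win, x + win + 1))
            sxx, syy, sxy = ixx[sl].sum(), iyy[sl].sum(), ixy[sl].sum()
            det = sxx * syy - sxy * sxy
            trace = sxx + syy
            r[y, x] = det - k * trace * trace
    return r
```

Pixels where the response is large in both gradient directions (true corners) score high, edges score negative, and flat regions score near zero, which is why thresholding this response yields a corner set like the one tracked here.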
When the motion distance of a corner point between two adjacent frames is greater than a preset pixel distance, the corner point is taken as an effective motion corner point, and the number of effective motion corner points is counted.
The detection situation of all corner points in the corner point set is obtained by comparing the number of effective motion corner points with the first preset number. One case is that the number of effective motion corner points is less than the first preset number. The first preset number is relatively small and can be set to 5 to 10; since the number of extracted palm corner points far exceeds 5 to 10, when the number of effective motion corner points is less than the first preset number, most corner points are not moving effectively, and it can be judged that the palm is not waving.
When the number of effective motion corner points is greater than the first preset number, that is, when the detection situation meets the preset requirement, the number of effective hand waving corner points among the effective motion corner points is obtained by analyzing their motion trails. When a hand is waved, the palm necessarily moves leftwards and rightwards, and the corner points on the palm make the same motion, so corresponding motion trails are formed, and waving can be judged from the motion trails of the corner points. During tracking and detection of the effective motion corner points, targets are inevitably lost, so the number of effective motion corner points obtained by tracking may be smaller than the original number. After an effective motion corner point has moved effectively both leftwards and rightwards, it is recorded as an effective hand waving corner point. When the number of effective hand waving corner points is greater than a second preset number, it is judged that the user is waving.
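The two-stage check described above, first enough effectively moving corner points and then enough of them moving both left and right, might be sketched as follows; the function name, the input shape and the "more than half" second threshold (taken from the earlier embodiment) are assumptions:

```python
def detect_wave(corner_steps, pixel_dist=2.0, first_preset=5):
    """corner_steps: for each tracked corner point, a list of signed
    frame-to-frame displacements (positive = rightward, negative = leftward).
    Returns True when the palm is judged to be waving."""
    # stage 1: corners whose motion between some pair of adjacent frames
    # exceeds the preset pixel distance are effective motion corner points
    effective = [steps for steps in corner_steps
                 if any(abs(s) > pixel_dist for s in steps)]
    if len(effective) < first_preset:
        return False  # most corners are not moving effectively: not a wave
    # stage 2: an effective motion corner that moved effectively both
    # leftwards and rightwards is an effective hand waving corner point
    n_move = sum(1 for steps in effective
                 if any(s < -pixel_dist for s in steps)
                 and any(s > pixel_dist for s in steps))
    # assumed second preset: more than half of the effective motion corners
    return n_move > len(effective) / 2
```

Note that one-directional drift (for example, a hand simply passing through the frame) produces effective motion corners but no hand waving corners, so it is rejected.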
As shown in fig. 6, the present invention also provides another embodiment of a hand swing detection method of a robot, including:
s11, acquiring palm samples of different users under different backgrounds, wherein the palm samples comprise samples of a palm rotating within a preset angle range relative to a standard position;
s21, dividing the detection window into N multiplied by N pixel regions, and comparing the gray value of a central pixel point in the pixel regions with the gray value of 8 adjacent pixel points, wherein N belongs to {64,32,16,8 };
s22, if the pixel value of the neighboring pixel is greater than the pixel value of the central pixel, the neighboring pixel is marked as 1, otherwise, the neighboring pixel is 0, thereby generating an 8-bit binary number as the LBP value of the central pixel, and thereby calculating the LBP value of each pixel in the pixel region;
s23, calculating a statistical histogram of the pixel region according to the LBP value obtained by each pixel point in the pixel region, and carrying out normalization processing on the statistical histogram;
s24, connecting the statistical histograms of each pixel region of the palm training sample to form an LBP texture feature vector of the palm training sample;
s3 training a cascade classifier by extracting the processed palm training sample;
s4, detecting a palm at the standard position in the video stream through the cascade classifier;
s51, extracting the corner points of the palm in the standard position through a corner point detection algorithm to obtain a corner point set;
s61, carrying out optical flow tracking on all the angular points in the angular point set in a video stream, detecting the positions of all the angular points in the angular point set after movement according to an optical flow tracking algorithm, and calculating to obtain a movement track corresponding to each angular point according to a sparse optical flow algorithm;
s711, calculating the motion distance of the same corner point between two adjacent frames in the video stream according to the motion trail of each corner point in the corner point set between two adjacent frames in the video stream;
s712 determining whether the movement distance is greater than a preset pixel distance;
s713, when the motion distance is larger than a preset pixel distance, judging the angular point to carry out effective motion, and taking the angular point as the effective motion angular point;
s714, when the motion distance is smaller than the preset pixel distance, judging that the corner point performs invalid motion, and subtracting 1 from the count Ntrack of effective motion corner points; the finally calculated Ntrack is the number of effective motion corner points in the corner point set, and the initial value of Ntrack is the total number N of corner points in the corner point set;
s721, judging whether the number Ntrack of effective motion corner points is larger than a first preset number;
s722, when the number Ntrack of effective motion corner points is smaller than the first preset number, stopping detecting the motion trails of the effective motion corner points and judging that the user is not waving;
s723, when the number Ntrack of effective motion corner points is larger than the first preset number, continuing to detect the motion trails of the effective motion corner points;
s731, when the number Ntrack of effective motion corner points is larger than the first preset number, detecting the leftward motion distance and the rightward motion distance of each effective motion corner point between two adjacent frames in the video stream;
s732, judging whether the leftward motion distance and the rightward motion distance of the effective motion corner point between two adjacent frames in the video stream are both larger than a preset distance;
s733, when both the leftward motion distance and the rightward motion distance of an effective motion corner point between two adjacent frames in the video stream are larger than the preset distance, recording the effective motion corner point as an effective hand waving corner point and adding 1 to the count Nmove; the finally calculated Nmove is the number of effective hand waving corner points in the corner point set, and the initial value of Nmove is 0;
s741, judging whether the number Nmove of effective hand waving corner points is larger than a second preset number;
s742, when the number Nmove of effective hand waving corner points is larger than the second preset number, judging that the palm is waving;
s743, otherwise, judging that the palm is not waving.
Specifically, the present embodiment describes in more detail how to determine whether the palm is waving according to the motion trajectory of each corner point in the set of corner points.
The hand waving detection process specifically comprises the following steps:
First, a standard gesture is detected in a video frame and tracking of the palm begins; the motion paths Li of all tracked corner points Ai, i = 1, 2, ..., N, in the palm area are counted, where N is the total number of corner points in the palm area. It is stipulated that Li > 0 indicates the corner point has moved to the right of the standard gesture, and Li < 0 indicates the corner point has moved to the left of the standard gesture.
Second, when Li < −Dleft, the corner point has effectively moved to the left, and its flag Mi,left = true, where Dleft is the threshold for the leftward movement distance and may be set to a two-pixel distance; when Li > Dright, the corner point has effectively moved to the right, and its flag Mi,right = true, where Dright is the threshold for the rightward movement distance and may also be set to a two-pixel distance;
Third, when the motion path Li of a corner point Ai between two frames in the video stream is less than a two-pixel distance, tracking of that corner point is regarded as invalid, and the count Ntrack of effectively tracked corner points is decremented by 1; the initial value of Ntrack is the total number N of corner points in the corner point set;
Fourth, when the total number Ntrack of effectively tracked corner points is less than 5, tracking stops; when Ntrack ≥ 5, it is further judged whether the effective motion corner points in the video stream perform effective motion both to the left and to the right;
Fifth, when a corner point Ai has effectively moved both to the left and to the right, that is, its motion flags Mi,left = true and Mi,right = true, the count Nmove of effective hand waving corner points is incremented by 1; the initial value of Nmove is 0;
Sixth, the number Nmove of effective hand waving corner points is counted in real time; when
Nmove > Ntrack / 2
it can be determined that the user is waving; at the same time, corner point tracking stops and a new round of palm detection begins.
Seventh, if during the hand waving detection the total number of effective motion corner points falls to
Ntrack < 5
the detection is invalid, and a new round of detection is started.
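The per-corner bookkeeping in the flow above, a cumulative motion path Li plus the Mi,left and Mi,right flags, might look like the following sketch; the class name and attribute names are assumed, with Dleft = Dright set to the two-pixel distance mentioned in the text:

```python
class CornerTracker:
    """Tracks one corner point's cumulative path L_i and its left/right
    flags: L_i < -D_left sets M_left, L_i > D_right sets M_right; a corner
    with both flags set counts as an effective hand waving corner point."""

    def __init__(self, d_left=2.0, d_right=2.0):
        self.path = 0.0            # cumulative signed path L_i
        self.d_left = d_left       # leftward movement threshold D_left
        self.d_right = d_right     # rightward movement threshold D_right
        self.m_left = False        # flag M_i,left
        self.m_right = False       # flag M_i,right

    def step(self, dx):
        """Accumulate one frame-to-frame displacement and update flags."""
        self.path += dx
        if self.path < -self.d_left:
            self.m_left = True
        if self.path > self.d_right:
            self.m_right = True

    @property
    def waved(self):
        return self.m_left and self.m_right
```

Feeding each tracked corner point's optical-flow displacements into one such tracker and counting those with `waved == True` reproduces the Nmove statistic used in the decision step.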
This embodiment provides a method that can accurately recognize hand waving against a dynamic background: the camera detects, through continuous video frames, whether a person is waving. The invention differs in nature from recognition based on the frame difference method. The palm is first detected in each frame by a palm detector; once the palm is detected, it is tracked, and waving is judged from the motion trail of the palm. Palm detection is performed on single frames without combining preceding and following historical image frames, so it is not disturbed by complex or dynamic backgrounds. That is, the detection method is accurate and stable, is not affected by a complex background, and can accurately recognize whether a hand is being waved even when people or other objects move in the background.
As shown in fig. 7, the present invention provides an embodiment of a waving detection system of a robot, including:
the detection module is used for detecting the palm at the standard position in the video stream through the cascade classifier;
the angular point extraction module is electrically connected with the detection module and used for extracting the angular points of the palm at the standard position to obtain an angular point set;
the angular point tracking module is electrically connected with the angular point extraction module and used for tracking and detecting each angular point in the angular point set in a video stream to obtain a motion track corresponding to each angular point;
and the judging module is electrically connected with the angular point tracking module and is used for judging whether the palm waves the hand or not according to the motion trail of each angular point in the angular point set.
Firstly, a robot calls a detection module to detect a palm at a standard position in a video stream through a cascade classifier; secondly, calling an angular point extraction module to extract the angular points of the palm at the standard position to obtain an angular point set; thirdly, calling an angular point tracking module to perform tracking detection on each angular point in the angular point set in the video stream to obtain a motion track corresponding to each angular point; and finally, calling a judging module, and judging whether the palm waves the hand according to the motion trail of each angular point in the angular point set.
In this embodiment, the robot detects the palm and the motion trails of the corner points on the palm synchronously while acquiring the video stream; hand waving detection is not performed on the basis of historical images, and the AdaBoost cascade classifier gives a high recognition rate for the palm.
As shown in fig. 8, the present invention provides an embodiment of a waving detection system of a robot, and based on the above embodiment, the waving detection system further includes:
the sample making module is used for making a palm training sample;
the feature extraction module is used for extracting the LBP texture feature vector of the palm training sample to obtain the extracted and processed palm training sample;
and the classifier training module is electrically connected with the feature extraction module and used for training the cascade classifier through the extracted palm training sample.
Preferably, the feature extraction module includes:
the processing submodule is used for dividing the detection window into N multiplied by N pixel areas and comparing a central pixel point in the pixel areas with the gray values of 8 adjacent pixel points, wherein N belongs to {64,32,16 and 8 };
the feature value calculation submodule is electrically connected with the processing submodule and is configured to mark a neighboring pixel point as 1 if its pixel value is greater than that of the central pixel point and as 0 otherwise, thereby generating an 8-bit binary number as the LBP value of the central pixel point, so that the LBP value of each pixel point in the pixel region can be calculated;
the processing submodule is also used for calculating a statistical histogram of the pixel region according to the LBP value obtained by each pixel point in the pixel region and carrying out normalization processing on the statistical histogram;
the processing sub-module is further configured to connect the statistical histograms of each pixel region of the palm training sample to form an LBP texture feature vector of the palm training sample;
the corner point tracking module is further used for extracting the corner points of the palm in the standard position through a corner point detection algorithm to obtain a corner point set, carrying out optical flow tracking on all the corner points in the corner point set in a video stream, detecting the positions of all the corner points in the corner point set after movement according to the optical flow tracking algorithm, and calculating a movement track corresponding to each corner point according to a sparse optical flow algorithm.
As shown in fig. 3, the palm of the user moves back and forth among the first, second and third positions while waving, and in this embodiment the palm in the video frame is detected by the AdaBoost algorithm combined with the LBP feature. Since this algorithm is not rotation invariant, only a palm whose rotation angle is within ±15° of a certain standard position can be recognized. In this embodiment, the gesture with the palm at the second position is selected as the standard palm; gestures at the other positions are not recognized.
The training process of the cascade classifier can be divided into: 1. making a palm training sample; 2. extracting the LBP characteristics of the palm training sample; 3. a cascade classifier is trained.
When palm training samples are collected, in order to extract LBP feature values, the selected samples are palms rotated within a preset angle range relative to the standard position; the preset angle range may be ±15°, that is, the samples include gesture samples whose rotation angle relative to the standard position is within ±15°, which improves the gesture recognition success rate of the cascade classifier near the standard position. Palm samples of different users under different backgrounds are collected to increase sample diversity; when the collected palm training samples vary somewhat in shape and illumination, the detection effect is very good, and the gesture recognition success rate of the cascade classifier is improved to a greater extent.
The process of extracting the LBP features of the palm training sample by the feature extraction module is as follows: a. divide the detection window into 16×16 small regions (cells); in each cell, compare the gray value of each pixel with those of its 8 neighboring pixels, and if a surrounding pixel value is greater than the central pixel value, mark that position as 1, otherwise 0. In this way the 8 points in each 3×3 neighborhood produce, by comparison, an 8-bit binary number, which is the LBP value of the central pixel of the window; b. compute a histogram for each cell, i.e. the frequency of occurrence of each LBP value (taken as a decimal number), and normalize the histogram; c. finally, concatenate the statistical histograms of all cells into one feature vector, which is the LBP texture feature vector of the whole image.
The AdaBoost algorithm used for training the cascade classifier is an iterative algorithm. Its core idea is to train different classifiers (weak classifiers) on the same training set and then combine these weak classifiers into a stronger final classifier. The algorithm consists of the following 3 steps:
a. Initialize the weight distribution of the training data. If there are N samples, each training sample is initially given the same weight 1/N:
$$D_1 = (w_{11}, w_{12}, \ldots, w_{1N}), \qquad w_{1i} = \frac{1}{N}, \quad i = 1, 2, \ldots, N$$
b. Train the weak classifiers. If a sample point has been classified accurately, its weight is decreased when constructing the next training set; conversely, if a sample point is not classified accurately, its weight is increased. The sample set with updated weights is then used to train the next classifier, and the whole training process proceeds iteratively. A basic classifier is learned from the training data set with the current weight distribution (choosing the threshold with the lowest error rate to design the basic classifier):
$$G_m(x): x \to \{-1, +1\}$$
calculate the classification error rate on the training data set:
$$e_m = \sum_{i=1}^{N} w_{mi}\, I\left(G_m(x_i) \neq y_i\right)$$
from the above equation, the error rate on the training data set is the sum of the weights of the misclassified samples.
Calculate the coefficient of the basic classifier, which represents its importance in the final classifier (objective: obtain the weight that the basic classifier occupies in the final classifier):
$$\alpha_m = \frac{1}{2} \ln \frac{1 - e_m}{e_m}$$
As can be seen from the above equation, when $e_m \le 1/2$ we have $\alpha_m \ge 0$, and $\alpha_m$ increases as $e_m$ decreases, which means that a basic classifier with a smaller classification error rate plays a larger role in the final classifier.
The weight distribution of the training data set is updated (in order to obtain a new weight distribution of the samples) for the next iteration
$$D_{m+1} = (w_{m+1,1}, w_{m+1,2}, \ldots, w_{m+1,N}), \qquad w_{m+1,i} = \frac{w_{mi}}{Z_m} \exp\left(-\alpha_m y_i G_m(x_i)\right)$$

where $Z_m$ is a normalization factor that makes $D_{m+1}$ a probability distribution.
In this way, the weights of the samples misclassified by the basic classifier are increased, and the weights of the correctly classified samples are decreased. In this manner, the AdaBoost method "focuses on" the samples that are harder to classify.
c. Combine the weak classifiers obtained by training into a strong classifier. After each weak classifier is trained, the weight of a weak classifier with a small classification error rate is increased so that it plays a larger role in the final classification function, and the weight of a weak classifier with a large classification error rate is decreased so that it plays a smaller role. In other words, weak classifiers with low error rates occupy a larger weight in the final classifier, and those with high error rates a smaller one.
Combining the weak classifiers to obtain a final classifier as follows:
$$G(x) = \operatorname{sign}\left(\sum_{m=1}^{M} \alpha_m G_m(x)\right)$$
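The three steps above can be sketched as a toy AdaBoost on one-dimensional features with threshold stumps (our illustrative code, with our own function names and parameters; the patent's actual weak learners operate on LBP features):

```python
import numpy as np

def adaboost_train(x, y, rounds=10):
    """Minimal AdaBoost on 1-D features with threshold stumps G_m(x) -> {-1,+1}."""
    n = len(x)
    w = np.full(n, 1.0 / n)                 # step a: uniform initial weights 1/N
    ensemble = []                           # list of (alpha, threshold, polarity)
    for _ in range(rounds):
        # step b: pick the stump with the lowest weighted error on (x, y)
        best = None
        for thr in x:
            for pol in (1, -1):
                pred = np.where(x > thr, pol, -pol)
                err = w[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, thr, pol, pred)
        err, thr, pol, pred = best
        err = min(max(err, 1e-10), 1 - 1e-10)       # avoid log(0)
        alpha = 0.5 * np.log((1 - err) / err)       # coefficient alpha_m
        w *= np.exp(-alpha * y * pred)              # raise weights of misclassified samples
        w /= w.sum()                                # normalize (the Z_m factor)
        ensemble.append((alpha, thr, pol))
    return ensemble

def adaboost_predict(ensemble, x):
    """Step c: final classifier G(x) = sign(sum_m alpha_m * G_m(x))."""
    s = sum(a * np.where(np.asarray(x) > t, p, -p) for a, t, p in ensemble)
    return np.sign(s)
```

Each round re-weights the training set exactly as in the update formula above, so later stumps concentrate on the samples earlier stumps got wrong.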
Because the improved palm detection algorithm is adopted, the method has the following advantages:
1. The gesture detection algorithm performs palm detection on individual video frames without combining historical data of preceding and following video frames.
2. The detection algorithm detects via the LBP features of the palm image, so the calculation speed is high and the position of the palm in the image can be accurately located.
3. When the collected palm training samples have some variation in shape and illumination, the detection effect obtained is very good and the recognition rate is high.
As shown in fig. 9, the present invention provides an embodiment of a waving detection system of a robot, including:
the detection module is used for detecting the palm at the standard position in the video stream through the cascade classifier;
the angular point extraction module is electrically connected with the detection module and used for extracting the angular points of the palm at the standard position to obtain an angular point set;
the angular point tracking module is electrically connected with the angular point extraction module and used for tracking and detecting each angular point in the angular point set in a video stream to obtain a motion track corresponding to each angular point;
the judging module is electrically connected with the angular point tracking module and is used for judging whether the palm waves the hand or not according to the motion trail of each angular point in the angular point set;
the judging module comprises:
the analysis submodule analyzes and obtains the number of effective motion angular points in the angular point set according to the motion track of each angular point in the angular point set between two adjacent frames in the video stream and a preset pixel distance;
the analysis submodule is further used for analyzing and obtaining the detection conditions of all the angular points in the angular point set according to the number of the effective motion angular points and a first preset number;
the analysis submodule is further used for analyzing and obtaining the number of effective hand waving angular points in the effective motion angular points according to the motion trail of the effective motion angular points when the detection condition meets the preset requirement;
the judging submodule is electrically connected with the analyzing submodule and used for judging whether the user waves the hand or not according to the number of the effective hand waving angular points and a second preset number;
the judging module further comprises:
the calculation submodule is used for calculating the motion distance of the same angular point between two adjacent frames in the video stream according to the motion trail of each angular point in the angular point set between two adjacent frames in the video stream;
the judgment submodule is also electrically connected with the calculation submodule and is used for judging whether the movement distance is greater than a preset pixel distance;
the judgment sub-module is further used for judging that the angular point performs effective motion when the motion distance is larger than a preset pixel distance, and taking the angular point as the effective motion angular point;
the judgment sub-module is further used for judging that an angular point performs invalid motion when its motion distance is smaller than the preset pixel distance, and subtracting 1 from the number of effective motion angular points N_track; the finally calculated N_track is the number of effective motion angular points in the angular point set, and the initial value of N_track is the total number N of angular points in the angular point set;
the judging submodule is also used for judging whether the number of effective motion angular points N_track is larger than a first preset number;
the judging submodule is further used for stopping the detection of the motion tracks of the effective motion angular points and judging that the user does not wave when N_track is smaller than the first preset number;
the judging submodule is further used for continuing to detect the motion tracks of the effective motion angular points when N_track is larger than the first preset number.
The judging module further comprises a detection submodule for detecting the leftward motion distance and the rightward motion distance of the effective motion angular points between two adjacent frames in the video stream when the number of effective motion angular points N_track is larger than the first preset number;
the judgment sub-module is further used for judging whether the left movement distance and the right movement distance of the effective movement corner point between two adjacent frames in the video stream are both larger than a preset distance;
the judgment sub-module is further used for recording an effective motion angular point as an effective hand-waving angular point and adding 1 to the number of effective hand-waving angular points N_move when both its leftward motion distance and rightward motion distance between two adjacent frames in the video stream are larger than the preset distance; the finally calculated N_move is the number of effective hand-waving angular points in the angular point set, and the initial value of N_move is 0;
the judging submodule is also used for judging whether the number of effective hand-waving angular points N_move is larger than a second preset number;
the judging submodule is also used for judging that the palm is waving when N_move is larger than the second preset number, and otherwise judging that the palm is not waving;
the sample making module is used for making a palm training sample;
the feature extraction module is used for extracting the LBP texture feature vector of the palm training sample to obtain the extracted and processed palm training sample;
the classifier training module is electrically connected with the feature extraction module and used for training a cascade classifier through the extracted palm training sample;
the feature extraction module includes:
the processing submodule is used for dividing the detection window into N × N pixel regions and comparing the gray value of the central pixel point of each pixel region with the gray values of its 8 adjacent pixel points, wherein N ∈ {64, 32, 16, 8};
the characteristic value calculation submodule is electrically connected with the processing submodule; if the pixel value of an adjacent pixel point is greater than that of the central pixel point, the adjacent pixel point is marked as 1, otherwise 0, so that an 8-bit binary number is generated and used as the LBP value of the central pixel point, and the LBP value of each pixel point in the pixel region can thus be calculated;
the processing submodule is also used for calculating a statistical histogram of the pixel region according to the LBP value obtained by each pixel point in the pixel region and carrying out normalization processing on the statistical histogram;
the processing sub-module is further configured to connect the statistical histograms of each pixel region of the palm training sample to form an LBP texture feature vector of the palm training sample;
the corner point tracking module is further used for extracting the corner points of the palm in the standard position through a corner point detection algorithm to obtain a corner point set, carrying out optical flow tracking on all the corner points in the corner point set in a video stream, detecting the positions of all the corner points in the corner point set after movement according to the optical flow tracking algorithm, and calculating a movement track corresponding to each corner point according to a sparse optical flow algorithm.
The embodiment specifically explains the detection and tracking of the corner points of the palm, and how to detect the process of waving the hand according to the motion tracks of the corner points.
Specifically, fig. 5 shows the palm motion detection and tracking process. Firstly, a palm in the standard position is detected in a video frame; secondly, angular points in the palm region are extracted through an angular point detection algorithm (multiple angular points are extracted; a part of them is shown schematically in the figure, the black dots being the detected angular points); thirdly, optical flow tracking is performed on the video frames, taking the positions of the angular points detected when the palm is in the standard position as the initial positions; the fourth and fifth parts show the positions, after movement, of the angular points detected by the optical flow tracking algorithm in the next video frames; L1 and L2 are the motion paths of the hand angular points traced by the sparse optical flow algorithm during the palm swing.
In this embodiment, a corner detection algorithm is adopted to extract the corner points, which effectively extracts the corner points of the palm region; other feature point extraction methods, such as the FAST feature point extraction method, may also be used and are not detailed here. An optical flow tracking algorithm is used to track the corner points in the video stream, so that each corner point can be tracked effectively and the tracking reliability is improved; the motion tracks of the corner points are better detected through the sparse optical flow algorithm, which facilitates the subsequent hand-waving detection.
And when the motion distance of the angular points in the two adjacent frames is greater than a preset pixel distance, taking the angular points as effective motion angular points, and counting the number of the effective motion angular points.
The detection condition of all the angular points in the angular point set is analyzed according to the number of effective motion angular points and a first preset number. Specifically, one detection condition is that the number of effective motion angular points is smaller than the first preset number. The first preset number is relatively small and can be set to 5 to 10; since the number of extracted palm angular points far exceeds 5 to 10, when the number of effective motion angular points is smaller than the first preset number, most angular points have not performed effective motion, and it can be judged that the palm is not waving.
When the number of effective motion angular points is larger than the first preset number, i.e. when the detection condition meets the preset requirement, the number of effective hand-waving angular points among the effective motion angular points is analyzed and obtained according to their motion tracks. When a hand waves, the palm necessarily moves leftwards and rightwards, and the angular points on the palm perform the same motion, forming corresponding motion tracks, so that waving can be judged from the motion tracks of the angular points. Since targets are inevitably lost during tracking, the number of effective motion angular points obtained by tracking is smaller than the original number. After an effective motion angular point has effectively moved both leftwards and rightwards, it is recorded as an effective hand-waving angular point. When the number of effective hand-waving angular points is greater than a second preset number, it is judged that the user is waving.
Specifically, this embodiment describes in more detail how to judge whether the palm is waving according to the motion track of each corner point in the corner point set. The hand-waving detection process is as follows:
Firstly, a standard gesture is detected in a video frame and tracking of the palm begins; the motion path L_i of each tracked angular point A_i (i = 1, 2, ..., N) of the palm region is counted, where N is the total number of angular points in the palm region. It is stipulated that L_i > 0 indicates that the angular point has moved to the right of the standard gesture, and L_i < 0 that it has moved to the left.
Secondly, when L_i < -D_left, the angular point has effectively moved to the left and its flag M_i,left = true, where D_left is the threshold of the leftward motion distance and may be set to a two-pixel distance; when L_i > D_right, the angular point has effectively moved to the right and its flag M_i,right = true, where D_right is the threshold of the rightward motion distance and may likewise be set to a two-pixel distance;
Thirdly, when the motion path L_i of an angular point A_i between two frames in the video stream is less than a two-pixel distance, the tracking of that angular point is regarded as invalid and the number of effectively tracked angular points N_track is reduced by 1; the initial value of N_track is the total number N of angular points in the angular point set;
Fourthly, when the number of effectively tracked angular points N_track < 5, tracking is stopped; when N_track ≥ 5, it is further judged whether the effective motion angular points in the video stream perform effective motion to the left and to the right;
Fifthly, when an angular point A_i has effectively moved both to the left and to the right, i.e. its motion flags M_i,left = true and M_i,right = true, the number of effective hand-waving angular points N_move is increased by 1; the initial value of N_move is 0;
Sixthly, the number of effective hand-waving angular points N_move is counted in real time; when

$$N_{move} \ge \frac{N_{track}}{2}$$

it can be determined that the user is waving; at the same time, the corner point tracking is stopped and a new round of palm detection is started.
Seventhly, if at the end of detection the number of effective hand-waving angular points relative to the total number of effective motion angular points satisfies

$$N_{move} < \frac{N_{track}}{2}$$

the detection is invalid and a new round of detection is started.
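Putting the per-corner counting steps above together, the decision logic might be sketched as follows (our illustrative code; the function name, the threshold values, and the assumption that the "second preset number" takes the form N_track/2 are ours):

```python
def detect_wave(paths, d_thresh=2.0, min_track=5):
    """Hand-wave decision from per-corner frame-to-frame horizontal steps.

    paths: one list of signed per-frame x-displacements per tracked corner
    (negative = left of the standard gesture, positive = right).
    d_thresh: the preset pixel distance (a two-pixel distance in the text).
    min_track: the first preset number of corners that must keep tracking.
    """
    n_track = len(paths)              # starts at the total number of corners N
    n_move = 0                        # effective hand-waving corners, starts at 0
    for steps in paths:
        pos = 0.0
        left = right = effective = False
        for d in steps:
            if abs(d) > d_thresh:     # effective motion between adjacent frames
                effective = True
            pos += d                  # accumulated motion path L_i
            left = left or pos < -d_thresh    # flag M_i,left
            right = right or pos > d_thresh   # flag M_i,right
        if not effective:
            n_track -= 1              # invalid corner: shrink N_track
        elif left and right:
            n_move += 1               # moved both ways: effective waving corner
    if n_track < min_track:
        return False                  # too few corners survived tracking
    return n_move >= n_track / 2      # assumed form of the "second preset number"
```

A corner that oscillates left and right past the threshold counts toward N_move; a near-static corner is dropped from N_track, so a waving palm yields a high N_move/N_track ratio while a static or drifting background does not.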
This embodiment provides a method capable of accurately recognizing hand waving against a dynamic background; the camera detects whether a person is waving through consecutive video frames. The invention differs in nature from frame-difference-based recognition: the palm is first detected in each frame by a palm detector; once detected, the palm is tracked, and waving is judged from the motion tracks of the palm. Since palm detection is performed on a single frame without combining preceding and following image frames, it is not disturbed by complex or dynamic backgrounds. That is, the detection method is accurate and stable, is unaffected by a complex background, and can accurately recognize waving even when the background contains motion of people or other objects.
It should be noted that the above embodiments can be freely combined as necessary. The foregoing is only a preferred embodiment of the present invention; those skilled in the art can make various modifications and refinements without departing from the principle of the present invention, and these modifications and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (12)

1. A method for detecting a hand swinging of a robot, comprising:
detecting a palm in a standard position in a video stream by a cascade of classifiers;
extracting angular points of the palm at the standard position to obtain an angular point set;
tracking and detecting each corner point in the corner point set in a video stream to obtain a motion track corresponding to each corner point;
analyzing the number of effective motion angular points in the angular point set according to the motion trail of each angular point in the angular point set between two adjacent frames in the video stream and a preset pixel distance;
when the number of the effective motion angular points is larger than a first preset number, continuously tracking and detecting all angular points in the angular point set;
analyzing and obtaining the number of effective waving angular points in the effective motion angular points according to the motion trail of the effective motion angular points;
and when the number of the effective hand waving corner points is larger than a second preset number, judging that the user waves the hand.
2. The method as claimed in claim 1, wherein the analyzing the number of effective motion corners in the corner set according to the motion trajectory of each corner in the corner set between two adjacent frames of the video stream and the preset pixel distance comprises:
calculating the motion distance of the same corner point between two adjacent frames in the video stream according to the motion trail of each corner point in the corner point set between two adjacent frames in the video stream;
judging whether the movement distance is larger than a preset pixel distance;
when the motion distance is larger than a preset pixel distance, judging that the angular point performs effective motion, and taking the angular point as the effective motion angular point;
when the motion distance is smaller than the preset pixel distance, judging that the corner point performs invalid motion, and subtracting 1 from the number of effective motion corner points N_track; the finally calculated N_track is the number of effective motion corner points in the corner point set, and the initial value of N_track is the total number N of corner points in the corner point set.
3. The method for detecting the waving of a robot as claimed in any one of claims 1 to 2, wherein before detecting the palm in the standard position in the video stream by the cascade classifier, the method comprises:
making a palm training sample;
extracting the LBP texture characteristic vector of the palm training sample to obtain an extracted and processed palm training sample;
and training a cascade classifier by extracting the processed palm training sample.
4. The method according to claim 3, wherein the step of preparing a palm training sample comprises:
palm samples of different users in different contexts are collected, including samples in which the palm is rotated within a preset angular range relative to a standard position.
5. The method as claimed in claim 3, wherein the process of extracting the LBP texture feature vector of the palm training sample comprises:
dividing a detection window into N multiplied by N pixel regions, and comparing the gray value of a central pixel point in the pixel regions with the gray value of 8 adjacent pixel points, wherein N belongs to {64,32,16,8 };
if the pixel value of the adjacent pixel point is larger than that of the central pixel point, the adjacent pixel point is marked as 1, otherwise, the adjacent pixel point is 0, so that 8-bit binary number is generated and used as the LBP value of the central pixel point, and the LBP value of each pixel point in the pixel area can be obtained through calculation;
calculating a statistical histogram of the pixel region according to the LBP value obtained by each pixel point in the pixel region, and carrying out normalization processing on the statistical histogram;
and connecting the statistical histograms of each pixel region of the palm training sample to form an LBP texture feature vector of the palm training sample.
6. The method according to any one of claims 1 to 2, wherein the extracting angular points of the palm at a standard position to obtain an angular point set, and performing tracking detection on each angular point in the angular point set in a video stream to obtain a motion trajectory corresponding to each angular point specifically comprises:
extracting the angular points of the palm at the standard position through an angular point detection algorithm to obtain an angular point set, carrying out optical flow tracking on each angular point in the angular point set in a video stream, detecting the position of each angular point in the angular point set after movement according to the optical flow tracking algorithm, and calculating the movement track corresponding to each angular point according to a sparse optical flow algorithm.
7. A robot hand-waving detection system comprising:
the detection module is used for detecting the palm at the standard position in the video stream through the cascade classifier;
the angular point extraction module is electrically connected with the detection module and used for extracting the angular points of the palm at the standard position to obtain an angular point set;
the angular point tracking module is electrically connected with the angular point extraction module and used for tracking and detecting each angular point in the angular point set in a video stream to obtain a motion track corresponding to each angular point;
the judging module is electrically connected with the angular point tracking module and is used for judging whether the palm waves the hand or not according to the motion trail of each angular point in the angular point set; wherein, the judging module comprises:
the analysis submodule analyzes and obtains the number of effective motion angular points in the angular point set according to the motion track of each angular point in the angular point set between two adjacent frames in the video stream and a preset pixel distance;
the analysis submodule is further used for continuing to track and detect all the angular points in the angular point set when the number of the effective motion angular points is larger than a first preset number;
the analysis submodule is further used for analyzing and obtaining the number of effective hand waving angular points in the effective motion angular points according to the motion trail of the effective motion angular points;
and the judging submodule is electrically connected with the analyzing submodule and used for judging that the user waves the hand when the number of the effective hand waving angular points is larger than a second preset number.
8. The system of claim 7, wherein the determining module further comprises:
the calculation submodule is used for calculating the motion distance of the same angular point between two adjacent frames in the video stream according to the motion trail of each angular point in the angular point set between two adjacent frames in the video stream;
the judgment submodule is also electrically connected with the calculation submodule and is used for judging whether the movement distance is greater than a preset pixel distance;
the judgment sub-module is further used for judging that the angular point performs effective motion when the motion distance is larger than a preset pixel distance, and taking the angular point as the effective motion angular point;
the judgment sub-module is further used for judging that an angular point performs invalid motion when the motion distance is smaller than the preset pixel distance, and subtracting 1 from the number of effective motion angular points N_track; the finally calculated N_track is the number of effective motion angular points in the angular point set, and the initial value of N_track is the total number N of angular points in the angular point set.
9. A robot hand-waving detection system according to any one of claims 7 to 8, further comprising:
the sample making module is used for making a palm training sample;
the feature extraction module is used for extracting the LBP texture feature vector of the palm training sample to obtain the extracted and processed palm training sample;
and the classifier training module is electrically connected with the feature extraction module and used for training the cascade classifier through the extracted palm training sample.
10. The system of claim 8, wherein the feature extraction module comprises:
the processing submodule is used for dividing the detection window into N × N pixel regions and comparing the gray value of the central pixel point of each pixel region with the gray values of its 8 adjacent pixel points, wherein N ∈ {64, 32, 16, 8};
the characteristic value calculation submodule is electrically connected with the processing submodule; if the pixel value of an adjacent pixel point is greater than that of the central pixel point, the adjacent pixel point is marked as 1, otherwise 0, so that an 8-bit binary number is generated and used as the LBP value of the central pixel point, and the LBP value of each pixel point in the pixel region can thus be calculated;
the processing submodule is also used for calculating a statistical histogram of the pixel region according to the LBP value obtained by each pixel point in the pixel region and carrying out normalization processing on the statistical histogram;
the processing sub-module is further configured to connect the statistical histograms of each pixel region of the palm training samples to form an LBP texture feature vector of the palm training samples.
11. A robot hand-waving detection system as set forth in any one of claims 7 to 8, characterized in that:
the corner point tracking module is further used for extracting the corner points of the palm in the standard position through a corner point detection algorithm to obtain a corner point set, carrying out optical flow tracking on each corner point in the corner point set in a video stream, detecting the positions of all the corner points in the corner point set after movement according to the optical flow tracking algorithm, and calculating the movement track corresponding to each corner point according to a sparse optical flow algorithm.
12. A robot characterized by being integrated with a hand swing detection system of a robot as claimed in any one of claims 7 to 11.
CN201711042859.9A 2017-10-30 2017-10-30 Robot hand waving detection method and system and robot Active CN107886057B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201711042859.9A CN107886057B (en) 2017-10-30 2017-10-30 Robot hand waving detection method and system and robot
PCT/CN2017/112211 WO2019085060A1 (en) 2017-10-30 2017-11-21 Method and system for detecting waving of robot, and robot

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711042859.9A CN107886057B (en) 2017-10-30 2017-10-30 Robot hand waving detection method and system and robot

Publications (2)

Publication Number Publication Date
CN107886057A CN107886057A (en) 2018-04-06
CN107886057B true CN107886057B (en) 2021-03-30

Family

ID=61782978

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711042859.9A Active CN107886057B (en) 2017-10-30 2017-10-30 Robot hand waving detection method and system and robot

Country Status (2)

Country Link
CN (1) CN107886057B (en)
WO (1) WO2019085060A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110852216A (en) * 2019-10-30 2020-02-28 平安科技(深圳)有限公司 Palm print verification method and device, computer equipment and readable storage medium
CN111950588B (en) * 2020-07-03 2023-10-17 国网冀北电力有限公司 Distributed power island detection method based on improved Adaboost algorithm
CN111680671A (en) * 2020-08-13 2020-09-18 北京理工大学 Automatic generation method of camera shooting scheme based on optical flow
CN116612119B (en) * 2023-07-20 2023-09-19 山东行创科技有限公司 Machine vision-based method for detecting working state image of drill bit for machine tool

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5454043A (en) * 1993-07-30 1995-09-26 Mitsubishi Electric Research Laboratories, Inc. Dynamic and static hand gesture recognition through low-level image analysis
CN103179359A (en) * 2011-12-21 2013-06-26 北京新岸线移动多媒体技术有限公司 Method and device for controlling video terminal and video terminal
CN103593679A (en) * 2012-08-16 2014-02-19 北京大学深圳研究生院 Visual human-hand tracking method based on online machine learning
CN104571482A (en) * 2013-10-22 2015-04-29 中国传媒大学 Digital device control method based on somatosensory recognition
CN107292295A (en) * 2017-08-03 2017-10-24 华中师范大学 Hand Gesture Segmentation method and device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102426480A (en) * 2011-11-03 2012-04-25 康佳集团股份有限公司 Man-machine interactive system and real-time gesture tracking processing method for same
US9524445B2 (en) * 2015-02-27 2016-12-20 Sharp Laboratories Of America, Inc. Methods and systems for suppressing non-document-boundary contours in an image
US20160307057A1 (en) * 2015-04-20 2016-10-20 3M Innovative Properties Company Fully Automatic Tattoo Image Processing And Retrieval
CN105469043A (en) * 2015-11-20 2016-04-06 苏州铭冠软件科技有限公司 Gesture recognition system
CN105869166B (en) * 2016-03-29 2018-07-10 北方工业大学 A kind of human motion recognition method and system based on binocular vision
CN107175660B (en) * 2017-05-08 2019-11-29 同济大学 A kind of six-freedom degree robot kinematics scaling method based on monocular vision

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Feature Extraction from 2D Gesture Trajectory in Dynamic Hand Gesture Recognition; M.K. Bhuyan et al.; 2006 IEEE Conference on Cybernetics and Intelligent Systems; 2006-12-04; full text *
Research on Gesture Tracking and Recognition Algorithms Based on Monocular Vision; Chen Jie; China Master's Theses Full-text Database, Information Science and Technology Series; 2012-10-15; Vol. 2012, No. 10; Chapter 2, paragraph 3; Section 3.2.1, paragraph 4; Section 4.2.2, paragraph 1; Section 4.2.3, last paragraph; Section 4.3, paragraph 1 *

Also Published As

Publication number Publication date
CN107886057A (en) 2018-04-06
WO2019085060A1 (en) 2019-05-09

Similar Documents

Publication Publication Date Title
CN110147743B (en) Real-time online pedestrian analysis and counting system and method under complex scene
Tang et al. A real-time hand posture recognition system using deep neural networks
CN107886057B (en) Robot hand waving detection method and system and robot
WO2020108362A1 (en) Body posture detection method, apparatus and device, and storage medium
Xu et al. Online dynamic gesture recognition for human robot interaction
Megavannan et al. Human action recognition using depth maps
Bhuyan et al. Fingertip detection for hand pose recognition
Singha et al. Hand gesture recognition based on Karhunen-Loeve transform
CN110688965B (en) IPT simulation training gesture recognition method based on binocular vision
Lim et al. Block-based histogram of optical flow for isolated sign language recognition
CN110232308A (en) Robot gesture track recognizing method is followed based on what hand speed and track were distributed
Kumar et al. Indian sign language recognition using graph matching on 3D motion captured signs
Thongtawee et al. A novel feature extraction for American sign language recognition using webcam
Mesbahi et al. Hand gesture recognition based on convexity approach and background subtraction
Perimal et al. Hand-gesture recognition-algorithm based on finger counting
Xu et al. Robust hand gesture recognition based on RGB-D Data for natural human–computer interaction
Zhu et al. Action recognition in broadcast tennis video using optical flow and support vector machine
Thabet et al. Fast marching method and modified features fusion in enhanced dynamic hand gesture segmentation and detection method under complicated background
Guo et al. Small aerial target detection using trajectory hypothesis and verification
Kovalenko et al. Real-time hand tracking and gesture recognition using semantic-probabilistic network
Al-Azzo et al. 3D Human action recognition using Hu moment invariants and euclidean distance classifier
Kishore et al. Spatial Joint features for 3D human skeletal action recognition system using spatial graph kernels
Tofighi et al. Hand pointing detection using live histogram template of forehead skin
CN106709442B (en) Face recognition method
CN111913584B (en) Mouse cursor control method and system based on gesture recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant