CN105261038A - Bidirectional optical flow and perceptual hash based fingertip tracking method - Google Patents
- Publication number: CN105261038A
- Application number: CN201510646203.2A
- Authority: CN (China)
- Legal status: Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/017—Gesture based interaction, e.g. based on a set of recognized hand gestures
Abstract
The invention provides a fingertip tracking method based on bidirectional optical flow and perceptual hashing. The method comprises the steps of: S1, calculating dense optical flow information corresponding to the scene information, and constructing a skin color filter to acquire the hand area; S2, performing vector dot product and cross product calculations on each contour point to remove non-fingertip contour points, then detecting the position of the fingertip point according to the geometric features of the fingertip to obtain fingertip contour points; and S3, processing the fingertip contour points with a bidirectional pyramid optical flow algorithm to obtain bidirectionally matched fingertip contour points and estimate the fingertip motion area, generating a 64-bit perceptual hash sequence with a perceptual hash algorithm, and matching it against a fingertip hash sequence template, the maximally matching area being the fingertip area; whether a next round of fingertip tracking is carried out is then judged, realizing continuous fingertip tracking. The method can effectively track a fingertip continuously in a complex environment while avoiding the poor tracking results caused by a discontinuous fingertip tracking trajectory.
Description
Technical Field
The invention relates to the technical field of image processing and analysis, in particular to a fingertip tracking method based on bidirectional optical flow and perceptual hashing.
Background
Traditional human-computer interaction mainly completes the information exchange between people and computers through media such as a mouse or keyboard; these media occupy a certain amount of space and are inconvenient to use. Nowadays, with the development of computer science and artificial intelligence, human-computer interaction continues to evolve, and technologies such as touch screens, voice recognition and computer vision have been applied, making human-computer interaction systems more convenient and friendly.
As an important component of computer-vision-based human-computer interaction systems, fingertip tracking has attracted increasing attention from researchers in recent years. Fingertip tracking technology can be widely applied in fields such as gesture recognition, identity authentication, mouse control and home entertainment remote control, and has great commercial value. Current fingertip tracking technologies can be broadly divided into two categories: one realizes continuous tracking by continuously detecting the fingertip; the other first detects the fingertip and then realizes continuous tracking through analysis, prediction and similar means.
A typical implementation of the first category marks the fingertip with a special pigment or binds a colored tape to it, captures video scene information with a camera, and detects the tracked fingertip by following the special color in the scene. However, this method requires a special color mark each time, is inconvenient to use, and works only when the scene is simple. Some researchers have proposed using data gloves and locating the fingertip from sensor data; the tracking results are good, but data gloves are generally expensive, inconvenient to use and hard to popularize. In recent years, with the development of camera technology, especially Microsoft's special-purpose Kinect camera, many researchers have begun to consider applying special cameras to human-computer interaction tasks such as finger tracking. Such cameras generally provide more useful information than ordinary cameras and yield good interaction results, but they are expensive and not suitable for widespread use. In general, fingertip tracking methods that only perform continuous detection often ignore the temporal and spatial relations in a video scene, so the tracking results are mediocre, and problems such as jerky tracking and discontinuous tracking trajectories easily arise.
The second category of fingertip tracking technology has been proposed only in recent years: after the fingertip position in a video scene is detected by various techniques, continuous fingertip tracking is realized by means of velocity and position prediction, particle filtering and the like. However, this technology is not yet mature; most algorithms achieve only mediocre tracking results, and many can be applied only in simple scenes.
Therefore, at the current stage, capturing the video scene with an ordinary camera to realize continuous fingertip tracking in a complex environment, which offers convenience, broad applicability and good robustness, is the trend in research on finger-based human-computer interaction systems.
Disclosure of Invention
The invention aims to remedy the shortcomings of existing fingertip detection and tracking technology by providing a fingertip tracking method based on bidirectional optical flow and perceptual hashing.
In order to achieve this purpose, the invention is realized by the following technical scheme: a fingertip tracking method based on bidirectional optical flow and perceptual hashing is used for continuously tracking the fingertip of a hand, and is characterized by comprising the following three steps:
the method comprises the steps of firstly, capturing scene information, and obtaining a scene image containing a hand area by calculating dense optical flow information corresponding to the scene information and performing binarization preprocessing; then constructing a skin color filter, and segmenting a scene image containing a hand region to obtain the hand region;
secondly, sampling the contour of the hand region, and performing vector dot product and cross product calculation on each contour point to remove non-fingertip contour points to obtain candidate fingertip points; setting a threshold value according to the geometric characteristics of the fingertips, counting the number of the non-fingertip contour points which are removed near the candidate fingertip points and comparing the number with the threshold value to detect the positions of the fingertip points and obtain the fingertip contour points; if the fingertip point is not detected, returning to the first step for reinitialization;
thirdly, calculating the fingertip contour points by using a bidirectional pyramid optical flow algorithm to obtain bidirectional matched fingertip contour points to estimate a fingertip movement area; in an estimated motion area of a fingertip, discrete cosine transform is sequentially carried out by adopting a search window strategy, a neighborhood is extracted, a corresponding discrete cosine transform low-frequency coefficient matrix is calculated, a 64-bit perceptual hash sequence is generated by adopting a perceptual hash algorithm, the 64-bit perceptual hash sequence is matched with a fingertip hash sequence template, and then the maximum matching area is the fingertip area; and finally, setting a threshold value of the matching distance to judge whether the 64-bit perceptual hash sequence is updated into a fingertip hash sequence template, setting a threshold value of the fingertip outline matching point number to judge whether the fingertip area is correctly tracked, if so, carrying out next round of fingertip tracking to realize continuous fingertip tracking, and if not, returning to the first step.
In the first step, dense optical flow information corresponding to the captured scene information is calculated and binarization preprocessing is performed to obtain a scene image containing the hand area; then a skin color filter is constructed, and the hand region is segmented from the scene image containing the hand region. This comprises the following steps:
(1.1) capturing scene information and calculating dense optical flow information corresponding to the scene information;
(1.2) traversing the optical flow information, searching for the maximum optical flow value, and normalizing all optical flows in the X-axis and Y-axis directions by the maximum optical flow value;
(1.3) calculating the hue and saturation of the area which changes along the optical flow direction according to the normalized optical flow, and marking different color values;
(1.4) setting a threshold value, converting the color-marked optical flow change area into a binary image according to the threshold value, and combining logical operations and mathematical morphology operations to obtain a scene image containing the hand area;
(1.5) constructing a YCbCr skin color filter according to the human body skin color clustering characteristics, and removing redundant color and brightness information;
(1.6) filtering with the YCbCr skin color filter, then sorting all contours of the scene image containing the hand region in descending order and taking the maximum connected region as the hand region, thereby realizing segmentation of the hand region from the scene image.
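The skin-color segmentation of steps (1.5) and (1.6) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the patent's implementation: it uses a full-range BT.601 RGB-to-YCbCr conversion, and the Cb/Cr bands (Cb ∈ [100,127], Cr ∈ [128,170]) are those given in the embodiment below; in practice the conversion and thresholding would typically be done with OpenCV (cvtColor, inRange) before extracting the maximum connected region.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Full-range BT.601 conversion of an H x W x 3 uint8 RGB image to YCbCr."""
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return np.stack([y, cb, cr], axis=-1)

def skin_mask(rgb, cb_range=(100, 127), cr_range=(128, 170)):
    """Binary skin mask using the narrow Cb/Cr bands of the skin color model."""
    ycbcr = rgb_to_ycbcr(rgb)
    cb, cr = ycbcr[..., 1], ycbcr[..., 2]
    return ((cb >= cb_range[0]) & (cb <= cb_range[1]) &
            (cr >= cr_range[0]) & (cr <= cr_range[1]))
```

The resulting mask would then be intersected with the binarized optical-flow region from step (1.4) before the largest contour is taken as the hand.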
In the second step, the contour of the hand area is sampled, and vector dot product and cross product calculations are performed on each contour point to remove non-fingertip contour points and obtain candidate fingertip points; a threshold is set according to the geometric characteristics of the fingertip, and the number of removed non-fingertip contour points near each candidate fingertip point is counted and compared with the threshold to detect the position of the fingertip point and obtain the fingertip contour points; if no fingertip point is detected, the method returns to the first step for reinitialization. This comprises the following steps:
(2.1) sampling the contour of the hand region, and calculating the vector dot product of each contour point, wherein the vector dot product is K curvature;
(2.2) setting thresholds T1 and T2, performing vector cross product calculation on the contour points whose K curvature lies between T1 and T2, and removing non-fingertip contour points according to the calculation result to obtain candidate fingertip points;
(2.3) setting a threshold T0 according to the geometric characteristics of the fingertip, counting the number of continuously distributed contour points near each candidate fingertip point that do not satisfy the vector cross product criterion, and comparing it with T0 to detect the position of the fingertip point and obtain the fingertip contour points; if no fingertip point is detected, returning to the first step for reinitialization.
In step (2.1), calculating the vector dot product of each contour point means calculating the cosine absolute value of each contour point with formula (1):

|cosθ| = |V1 · V2| / (|V1| |V2|)   (1)

where V1 is the vector from a contour point P0 to the K-th point P1 before it, and V2 is the vector from P0 to the K-th point P2 after it.
In step (2.2), performing vector cross product calculation on the contour points whose K curvature lies between the thresholds T1 and T2 and removing non-fingertip contour points according to the result to obtain candidate fingertip points means: if the cosine absolute value of a contour point satisfies formula (2), the vector cross product of that point is calculated via formula (3); otherwise, no vector cross product calculation is needed for that point:

T1 ≤ |cosθ| ≤ T2   (2)

V = V1 × V2   (3)

If the vector cross product V is greater than 0, the contour point is a candidate fingertip point; otherwise, the contour point is judged to be a non-fingertip contour point and is removed.
In step (2.3), counting the number of continuously distributed contour points near each candidate fingertip point that do not satisfy the vector cross product criterion, comparing it with the threshold T0, detecting the position of the fingertip point and obtaining the fingertip contour points means: counting the number N of contour points near each candidate fingertip point that do not satisfy V > 0; if N > T0, the corresponding candidate fingertip point is a fingertip contour point, and its position is detected; otherwise, returning to the first step for reinitialization.
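Steps (2.1) and (2.2) can be sketched as below; this is a hedged NumPy illustration in which K, T1 and T2 are illustrative values, the cross-product sign convention depends on the contour orientation, and the consecutive-point count against T0 from step (2.3) is omitted for brevity.

```python
import numpy as np

def k_curvature_candidates(contour, K=10, T1=0.5, T2=0.95):
    """Return indices of candidate fingertip points on a closed contour.

    contour: N x 2 array of (x, y) points.  For each point P0, V1 points to
    the K-th point before it and V2 to the K-th point after it; |cos| is
    formula (1), the [T1, T2] band plays the role of formula (2), and the
    cross product V = V1 x V2 of formula (3) keeps only convex points.
    """
    contour = np.asarray(contour, dtype=np.float64)
    n = len(contour)
    candidates = []
    for idx in range(n):
        p0 = contour[idx]
        v1 = contour[(idx - K) % n] - p0
        v2 = contour[(idx + K) % n] - p0
        denom = np.linalg.norm(v1) * np.linalg.norm(v2)
        if denom == 0.0:
            continue
        cos_abs = abs(np.dot(v1, v2)) / denom      # formula (1)
        if T1 <= cos_abs <= T2:                    # curvature band, formula (2)
            if v1[0] * v2[1] - v1[1] * v2[0] > 0:  # z-component of V1 x V2
                candidates.append(idx)
    return candidates
```

Collinear stretches of contour give |cos| = 1 and are rejected by the upper bound T2, so only sufficiently sharp convex corners survive as candidates.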
In the third step, the fingertip outline points are calculated by using a bidirectional pyramid optical flow algorithm to obtain bidirectional matched fingertip outline points to estimate a fingertip movement area; in an estimated motion area of a fingertip, discrete cosine transform is sequentially carried out by adopting a search window strategy, a neighborhood is extracted, a corresponding discrete cosine transform low-frequency coefficient matrix is calculated, a 64-bit perceptual hash sequence is generated by adopting a perceptual hash algorithm, the 64-bit perceptual hash sequence is matched with a fingertip hash sequence template, and then the maximum matching area is the fingertip area; and finally, setting a threshold value of the matching distance to judge whether the 64-bit perceptual hash sequence is updated into a fingertip hash sequence template, setting a threshold value of a fingertip outline matching point number to judge whether the fingertip area is correctly tracked, if so, carrying out next round of fingertip tracking to realize continuous fingertip tracking, and if not, returning to the first step, wherein the method comprises the following steps:
(3.1) carrying out forward optical flow calculation on the fingertip outline points detected in the second step to obtain forward estimation fingertip outline points corresponding to the fingertip outline points in the next frame;
(3.2) carrying out reverse optical flow calculation on the forward estimation fingertip outline points to obtain corresponding reverse estimation fingertip outline points in the current frame;
(3.3) matching the initial fingertip outline points and the reverse estimation fingertip outline points to obtain the fingertip outline points capable of being subjected to bidirectional matching;
(3.4) estimating the motion area of the fingertip in the next frame by utilizing the bidirectional matching fingertip outline point;
(3.5) in the fingertip movement estimation area, adopting a search window strategy, sequentially performing discrete cosine transform in the search window area, and extracting low-frequency effective components in an 8 x 8 neighborhood at the upper left corner after the discrete cosine transform to obtain a corresponding discrete cosine coefficient matrix;
(3.6) calculating a median value of the discrete cosine coefficient matrix, and comparing each pixel point in the neighborhood with the median value to generate a 64-bit perceptual hash sequence;
(3.7) performing matching analysis between the 64-bit perceptual hash sequence and the fingertip hash sequence template retained from the previous frame's tracking result, the maximally matching area being the fingertip area; for the first round of tracking after initialization, the fingertip hash sequence template is set according to the initialization result;
(3.8) setting a threshold T3 for the matching distance and judging whether the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template;
(3.9) setting a threshold T4 for the number of matched fingertip contour points and judging whether the fingertip area is tracked correctly; if the number of matched fingertip contour points M < T4, an obvious loss of the fingertip target has occurred, and the method returns to the first step to reinitialize and re-detect the fingertip contour points; otherwise, the fingertip area is judged to be tracked correctly, and the method returns to step (3.1) for the next round of fingertip tracking, realizing continuous fingertip tracking.
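The bidirectional matching of steps (3.1) to (3.4) amounts to a forward-backward consistency check. A minimal NumPy sketch follows, assuming the forward and backward point sets have already been produced by a pyramidal Lucas-Kanade tracker (for example, OpenCV's calcOpticalFlowPyrLK run in both directions); eps and margin are illustrative parameters not fixed by the patent.

```python
import numpy as np

def bidirectional_match(points, fwd_points, bwd_points, eps=1.0):
    """Forward-backward consistency check (steps (3.1)-(3.3)).

    points:     N x 2 fingertip contour points in the current frame
    fwd_points: N x 2 forward optical-flow estimates in the next frame
    bwd_points: N x 2 backward estimates, tracked back into the current frame
    A point is bidirectionally matched when its backward estimate lands
    within eps pixels of the original point.
    """
    err = np.linalg.norm(points - bwd_points, axis=1)
    matched = err < eps
    return fwd_points[matched], matched

def motion_box(matched_fwd, margin=8):
    """Step (3.4): bounding box of the matched points, padded by a margin,
    as a rough estimate of the fingertip motion area in the next frame."""
    x0, y0 = matched_fwd.min(axis=0) - margin
    x1, y1 = matched_fwd.max(axis=0) + margin
    return x0, y0, x1, y1
```

Points whose backward track drifts away from their starting position are dropped, so the motion box is built only from contour points that the flow tracks consistently in both directions.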
In step (3.5), in the fingertip motion estimation region, a search window strategy is adopted, discrete cosine transform is sequentially performed in the search window region, and low-frequency effective components in an 8 × 8 neighborhood of the upper left corner after the discrete cosine transform are extracted, so as to obtain a corresponding discrete cosine coefficient matrix, where: in an estimated motion area of a fingertip, a search window strategy is adopted, a 16 x 16 search window is set, from the upper left corner to the lower right corner of the motion estimation area, 16 x 16 image blocks are sequentially converted into gray images, discrete cosine transformation is carried out, low-frequency effective components of 8 x 8 neighborhoods at the upper left corner after the discrete cosine transformation are extracted, and a corresponding discrete cosine coefficient matrix is calculated.
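The hash construction of steps (3.5) and (3.6) can be sketched as below, with a self-contained orthonormal 2-D DCT-II in NumPy standing in for what would typically be cv2.dct: each 16 × 16 grayscale block yields an 8 × 8 low-frequency corner, and thresholding its coefficients at their median produces the 64-bit perceptual hash sequence.

```python
import numpy as np

def dct2(block):
    """Orthonormal 2-D DCT-II of a square block, via the DCT matrix."""
    n = block.shape[0]
    k, i = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)   # DC row scaling for orthonormality
    return c @ block @ c.T

def phash64(gray16):
    """64-bit perceptual hash of a 16 x 16 grayscale block:
    DCT, keep the low-frequency 8 x 8 corner, threshold at its median."""
    low = dct2(np.asarray(gray16, dtype=np.float64))[:8, :8]
    med = np.median(low)
    return (low > med).astype(np.uint8).ravel()   # 64-element 0/1 sequence
```

Sliding the 16 × 16 window across the motion-estimation area and hashing each block gives the candidate sequences that are matched against the fingertip hash sequence template.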
In step (3.8), setting the threshold T3 of the matching distance and judging whether to update the 64-bit perceptual hash sequence into the fingertip hash sequence template means: calculating the Hamming distance between the 64-bit perceptual hash sequence Snew and the fingertip hash sequence template Stemplate via formula (4), and comparing it with the threshold T3:

Dhamming(Stemplate, Snew) = Σ(i=1..64) |Stemplate(i) - Snew(i)|   (4)

If the Hamming distance Dhamming(Stemplate, Snew) ≥ T3, the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template; otherwise the fingertip hash sequence template is not updated. In the invention, for the first tracking after initialization, the fingertip hash sequence template is set according to the initialization result; in the subsequent continuous tracking process, the fingertip hash sequence template is the one updated in the previous frame's tracking process.
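Formula (4) and the template-update test of step (3.8) can be sketched as follows; the update direction (replace the template once the Hamming distance reaches T3) follows the translated text as given, and T3 = 10 is an illustrative value.

```python
import numpy as np

def hamming(seq_a, seq_b):
    """Hamming distance between two 64-bit hash sequences (formula (4))."""
    return int(np.count_nonzero(np.asarray(seq_a) != np.asarray(seq_b)))

def maybe_update_template(template, new_hash, T3=10):
    """Template-update rule of step (3.8): when the Hamming distance
    D(template, new) >= T3 the template is replaced by the new sequence,
    otherwise the existing template is kept."""
    if hamming(template, new_hash) >= T3:
        return new_hash
    return template
```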
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. When the method is implemented, an ordinary camera captures the scene video; no special equipment is needed, and no special mark is required on the hand. By calculating dense optical flow and constructing a skin color filter, the method can accurately segment the area where the hand is located from a complex scene, and can cope with scenes containing many skin-color-like areas and background changes. Then, through an improved K-curvature method, the correct position of the fingertip point is found, ensuring that the fingertip point satisfies both the curvature characteristic and the geometric characteristic. Finally, according to the fingertip detection result, continuous fingertip tracking is realized with bidirectional pyramid optical flow and perceptual hashing: the bidirectional pyramid optical flow is applied to analyze the frame-to-frame correlation of the fingertip contour points and can accurately predict the motion trend and range of the current target, while perceptual hash matching finds the maximum-likelihood target within the motion-estimation range. In addition, during fingertip tracking the method introduces a template-update and contour-matching-point strategy, realizing stable and efficient fingertip tracking.
2. The fingertip tracking method can effectively realize the continuous tracking of the fingertip in a complex environment, and simultaneously avoids the problem of poor fingertip tracking effect caused by discontinuous fingertip tracking track.
Drawings
FIG. 1 is a block flow diagram of a fingertip tracking method of the present invention;
FIG. 2 is a flow chart of a method for segmenting a complete hand region from a complex environment in a first step;
FIG. 3 is a flowchart of a second method for performing fingertip detection on a hand region image;
FIG. 4 is a schematic view of the positions of a contour point P0, the K-th point P1 before it, and the K-th point P2 after it, in the second step;
FIG. 5 is a flowchart of a third step of a method for implementing continuous fingertip tracking based on bidirectional pyramid optical flow and perceptual hashing;
FIG. 6 is a schematic diagram of the process of obtaining fingertip outline points capable of bidirectional matching in the third step;
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
Examples
As shown in fig. 1 to 6, the fingertip tracking method based on bidirectional optical flow and perceptual hash is used for continuously tracking the fingertip of a hand; the method comprises the following three steps:
the method comprises the steps of firstly, capturing scene information, and obtaining a scene image containing a hand area by calculating dense optical flow information corresponding to the scene information and performing binarization preprocessing; then constructing a skin color filter, and segmenting a scene image containing a hand region to obtain the hand region;
secondly, sampling the contour of the hand region, and performing vector dot product and cross product calculation on each contour point to remove non-fingertip contour points to obtain candidate fingertip points; setting a threshold value according to the geometric characteristics of the fingertips, counting the number of the non-fingertip contour points which are removed near the candidate fingertip points and comparing the number with the threshold value to detect the positions of the fingertip points and obtain the fingertip contour points; if the fingertip point is not detected, returning to the first step for reinitialization;
thirdly, calculating the fingertip contour points by using a bidirectional pyramid optical flow algorithm to obtain bidirectional matched fingertip contour points to estimate a fingertip movement area; in an estimated motion area of a fingertip, discrete cosine transform is sequentially carried out by adopting a search window strategy, a neighborhood is extracted, a corresponding discrete cosine transform low-frequency coefficient matrix is calculated, a 64-bit perceptual hash sequence is generated by adopting a perceptual hash algorithm, the 64-bit perceptual hash sequence is matched with a fingertip hash sequence template, and then the maximum matching area is the fingertip area; and finally, setting a threshold value of the matching distance to judge whether the 64-bit perceptual hash sequence is updated into a fingertip hash sequence template, setting a threshold value of the fingertip outline matching point number to judge whether the fingertip area is correctly tracked, if so, carrying out next round of fingertip tracking to realize continuous fingertip tracking, and if not, returning to the first step.
In the first step, dense optical flow information corresponding to the captured scene information is calculated and binarization preprocessing is performed to obtain a scene image containing the hand area; then a skin color filter is constructed, and the hand region is segmented from the scene image containing the hand region. This comprises the following steps:
(1.1) capturing scene information and calculating the dense optical flow information corresponding to it; although dense optical flow is generally expensive to compute, in this method it is calculated only to obtain the approximate candidate hand area, so a two-layer pyramid is used for the optical flow calculation and a larger search window (15×15) is set;
(1.2) traversing the optical flow information, searching for the maximum optical flow value, and normalizing all optical flows in the X-axis and Y-axis directions by the maximum optical flow value;
(1.3) calculating the hue and saturation of the area which changes along the optical flow direction according to the normalized optical flow, and marking different color values;
(1.4) setting a threshold value, converting the color-marked optical flow change area into a binary image according to the threshold value, and combining logical operations and mathematical morphology operations to obtain a scene image containing the hand area;
(1.5) constructing a YCbCr skin color filter according to the clustering characteristics of human skin color, and removing redundant color and brightness information; because the distribution of human skin color has an obvious clustering characteristic, the RGB image is converted into a YCbCr image and, based on extensive experimental results, narrow color bands (Cb ∈ [100,127], Cr ∈ [128,170]) are selected as the skin color model to construct the YCbCr skin color filter.
(1.6) filtering with the YCbCr skin color filter, then sorting all contours of the scene image containing the hand region in descending order and taking the maximum connected region as the hand region, thereby realizing segmentation of the hand region from the scene image.
In the second step, the contour of the hand area is sampled, and vector dot product and cross product calculations are performed on each contour point to remove non-fingertip contour points and obtain candidate fingertip points; a threshold is set according to the geometric characteristics of the fingertip, and the number of removed non-fingertip contour points near each candidate fingertip point is counted and compared with the threshold to detect the position of the fingertip point and obtain the fingertip contour points; if no fingertip point is detected, the method returns to the first step for reinitialization. This comprises the following steps:
(2.1) sampling the contour of the hand region, and calculating the vector dot product of each contour point, wherein the vector dot product is K curvature;
(2.2) setting thresholds T1 and T2, performing vector cross product calculation on the contour points whose K curvature lies between T1 and T2, and removing non-fingertip contour points according to the calculation result to obtain candidate fingertip points;
(2.3) setting a threshold T0 according to the geometric characteristics of the fingertip, counting the number of continuously distributed contour points near each candidate fingertip point that do not satisfy the vector cross product criterion, and comparing it with T0 to detect the position of the fingertip point and obtain the fingertip contour points; if no fingertip point is detected, returning to the first step for reinitialization.
In step (2.1), calculating the vector dot product of each contour point means calculating the cosine absolute value of each contour point with formula (1):

|cosθ| = |V1 · V2| / (|V1| |V2|)   (1)

where V1 is the vector from a contour point P0 to the K-th point P1 before it, and V2 is the vector from P0 to the K-th point P2 after it.
In step (2.2), the pair of K curvatures is at a threshold T1And T2The vector cross multiplication calculation is carried out on the contour points between the finger points, the non-fingertip contour points are removed according to the calculation result, and the candidate fingertip points are obtained by the following steps: if the cosine absolute value of the contour point meets the formula (2), vector cross-multiplication calculation of the point is carried out through the formula (3); otherwise, the point does not need to carry out vector cross multiplication calculation;
V = V1 × V2   (3)
If the vector cross product V is greater than 0, the contour point is a candidate fingertip point; otherwise, it is judged to be a non-fingertip contour point and removed.
In step (2.3), counting the number of continuously distributed contour points near each candidate fingertip point that do not satisfy the cross-product criterion and comparing it with the threshold T0 to detect the fingertip positions and obtain the fingertip contour points means: for each candidate fingertip point, the number N of nearby contour points that do not satisfy V > 0 is counted; if N > T0, the candidate fingertip point is a fingertip contour point and its position is detected; otherwise, the method returns to the first step for reinitialization.
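One plausible reading of steps (2.2)-(2.3) is sketched below: per-point cross products are assumed precomputed (None marks points whose |cos θ| fell outside (T1, T2)), and the window width is an illustrative parameter the patent does not specify:

```python
def fingertip_indices(cross_vals, t0, window=10):
    """Select fingertip contour points from per-point cross products V.

    cross_vals[i] is the z-component of V1 x V2 at contour point i, or
    None when the point was not a K-curvature candidate. A candidate
    (V > 0) is accepted as a fingertip when more than t0 removed points
    (V <= 0) lie within `window` positions on either side.
    """
    n = len(cross_vals)
    tips = []
    for i, v in enumerate(cross_vals):
        if v is None or v <= 0:
            continue  # not a candidate fingertip point
        removed = sum(
            1
            for d in range(1, window + 1)
            for j in ((i - d) % n, (i + d) % n)
            if cross_vals[j] is not None and cross_vals[j] <= 0
        )
        if removed > t0:  # N > T0: accept as a fingertip contour point
            tips.append(i)
    return tips
```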
In the third step, the fingertip contour points are processed with a bidirectional pyramid optical flow algorithm to obtain bidirectionally matched fingertip contour points, from which a fingertip motion region is estimated. Within the estimated motion region, a search window strategy is adopted: the discrete cosine transform is applied window by window, a low-frequency neighborhood is extracted, the corresponding low-frequency DCT coefficient matrix is calculated, and a 64-bit perceptual hash sequence is generated with a perceptual hash algorithm; this sequence is matched against the fingertip hash sequence template, and the region with the best match is the fingertip region. Finally, a matching-distance threshold decides whether the 64-bit perceptual hash sequence replaces the fingertip hash sequence template, and a threshold on the number of matched fingertip contour points decides whether the fingertip region was tracked correctly; if so, the next round of fingertip tracking proceeds, achieving continuous tracking, and if not, the method returns to the first step. This comprises the following steps:
(3.1) carrying out forward optical flow calculation on the fingertip contour points detected in the second step to obtain the corresponding forward-estimated fingertip contour points in the next frame;
(3.2) carrying out backward optical flow calculation on the forward-estimated fingertip contour points to obtain the corresponding backward-estimated fingertip contour points in the current frame;
(3.3) matching the initial fingertip contour points against the backward-estimated fingertip contour points to obtain the fingertip contour points that can be bidirectionally matched;
(3.4) estimating the motion region of the fingertip in the next frame from the bidirectionally matched fingertip contour points;
(3.5) within the estimated fingertip motion region, adopting a search window strategy: applying the discrete cosine transform in each search window in turn and extracting the low-frequency effective components in the 8 × 8 neighborhood at the upper-left corner of the transform to obtain the corresponding discrete cosine coefficient matrix;
(3.6) calculating the median of the discrete cosine coefficient matrix and comparing each coefficient in the neighborhood with the median to generate a 64-bit perceptual hash sequence;
(3.7) carrying out matching analysis between the 64-bit perceptual hash sequence and the fingertip hash sequence template retained from the previous frame's tracking result, the region with the best match being the fingertip region; for the first tracking pass after initialization, the fingertip hash sequence template is set from the initialization result;
(3.8) setting a threshold T3 on the matching distance to judge whether the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template;
(3.9) setting a threshold T4 on the number of matched fingertip contour points to judge whether the fingertip region is tracked correctly; if the number of matched fingertip contour points M < T4, the fingertip target has clearly been lost, and the method returns to the first step to reinitialize and re-detect the fingertip contour points; otherwise, the fingertip region is judged to be tracked correctly, and the method returns to step (3.1) for the next round of fingertip tracking, achieving continuous fingertip tracking.
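The forward-backward consistency check of steps (3.1)-(3.3) can be sketched independently of any particular optical flow library; `forward` and `backward` below are stand-ins for the pyramidal Lucas-Kanade calls, and the error threshold `eps` is an illustrative assumption:

```python
def bidirectional_match(points, forward, backward, eps=1.0):
    """Keep the contour points that survive a forward-backward round trip.

    forward maps a point of the current frame into the next frame
    (step (3.1)); backward maps it back (step (3.2)). A point is
    bidirectionally matched (step (3.3)) when the round trip returns
    within eps of the starting position.
    """
    matched = []
    for p in points:
        q = forward(p)   # forward-estimated point in the next frame
        r = backward(q)  # backward-estimated point in the current frame
        if (r[0] - p[0]) ** 2 + (r[1] - p[1]) ** 2 <= eps ** 2:
            matched.append((p, q))
    return matched
```

The matched pairs then bound the fingertip motion region of step (3.4), e.g. via the bounding box of the forward-estimated points.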
In step (3.5), adopting the search window strategy within the fingertip motion estimation region, applying the discrete cosine transform in each search window in turn, and extracting the low-frequency effective components in the 8 × 8 neighborhood at the upper-left corner of the transform to obtain the corresponding discrete cosine coefficient matrix means: within the estimated fingertip motion region, a 16 × 16 search window is set and moved from the upper-left to the lower-right corner of the motion estimation region; each 16 × 16 image block is converted to grayscale in turn, the discrete cosine transform is applied, the low-frequency effective components of the 8 × 8 neighborhood at the upper-left corner of the transform are extracted, and the corresponding discrete cosine coefficient matrix is calculated.
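The sweep-and-hash procedure of steps (3.5)-(3.6) can be sketched end to end; the stride of the 16 × 16 window and the naive DCT are illustrative assumptions (a real implementation would use an optimized transform):

```python
import math
import statistics

def window_positions(width, height, win=16, stride=4):
    """Top-left corners of the search windows, swept from the upper-left
    to the lower-right of the estimated motion region; the stride is an
    assumption, as the patent does not state one."""
    return [(x, y)
            for y in range(0, height - win + 1, stride)
            for x in range(0, width - win + 1, stride)]

def dct2(block):
    """Naive 2-D DCT-II (unscaled) of a square block -- fine for a sketch."""
    n = len(block)
    return [[sum(block[x][y]
                 * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                 * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                 for x in range(n) for y in range(n))
             for v in range(n)]
            for u in range(n)]

def phash64(gray16):
    """64-bit perceptual hash of a 16x16 grayscale block: keep the 8x8
    low-frequency DCT corner (step (3.5)) and set one bit per coefficient
    by comparing it with the median of those 64 coefficients (step (3.6))."""
    coeffs = dct2(gray16)
    low = [coeffs[u][v] for u in range(8) for v in range(8)]
    med = statistics.median(low)
    return ''.join('1' if c > med else '0' for c in low)
```

Hashing the block at every window position and keeping the one closest to the fingertip hash sequence template gives the maximum matching region of step (3.7).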
In step (3.8), setting the threshold T3 of the matching distance to judge whether the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template means: the Hamming distance between the 64-bit perceptual hash sequence and the template is calculated and compared with the threshold T3 by formula (4):

D_hamming(S_template, S_new) ≥ T3   (4)

If the Hamming distance D_hamming(S_template, S_new) ≥ T3, the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template; otherwise, the template is not updated. In the invention, for the first tracking pass after initialization, the fingertip hash sequence template is set from the initialization result; in the subsequent continuous tracking process, the template is the one updated during the previous frame's tracking.
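The template update rule of step (3.8) / formula (4) can be sketched directly over the 64-bit hash strings (the function names are illustrative):

```python
def hamming(a, b):
    """Bit-wise Hamming distance between two equal-length hash strings."""
    return sum(x != y for x, y in zip(a, b))

def maybe_update_template(template, new_hash, t3):
    """Adopt the new 64-bit hash as the fingertip template when its
    Hamming distance from the current template reaches T3 (formula (4));
    otherwise keep the existing template."""
    return new_hash if hamming(template, new_hash) >= t3 else template
```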
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to them; any change, modification, substitution, combination, or simplification that does not depart from the spirit and principle of the present invention shall be regarded as an equivalent and is intended to fall within the scope of the present invention.
Claims (9)
1. A fingertip tracking method based on bidirectional optical flow and perceptual hashing, for continuously tracking the fingertips of a hand, characterized in that the method comprises the following three steps:
firstly, capturing scene information, calculating the dense optical flow information corresponding to it, and performing binarization preprocessing to obtain a scene image containing the hand region; then constructing a skin color filter and segmenting the scene image to obtain the hand region;
secondly, sampling the contour of the hand region and performing vector dot-product and cross-product calculations on each contour point to remove non-fingertip contour points and obtain candidate fingertip points; setting a threshold according to the geometric characteristics of the fingertip, counting the number of removed non-fingertip contour points near each candidate fingertip point, and comparing this count with the threshold to detect the fingertip positions and obtain the fingertip contour points; if no fingertip point is detected, returning to the first step for reinitialization;
thirdly, processing the fingertip contour points with a bidirectional pyramid optical flow algorithm to obtain bidirectionally matched fingertip contour points and estimating a fingertip motion region from them; within the estimated motion region, adopting a search window strategy, applying the discrete cosine transform window by window, extracting a low-frequency neighborhood, calculating the corresponding low-frequency discrete cosine coefficient matrix, and generating a 64-bit perceptual hash sequence with a perceptual hash algorithm; matching this sequence against the fingertip hash sequence template, the region with the best match being the fingertip region; and finally, setting a matching-distance threshold to judge whether the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template, and setting a threshold on the number of matched fingertip contour points to judge whether the fingertip region is tracked correctly; if so, carrying out the next round of fingertip tracking to achieve continuous tracking, and if not, returning to the first step.
2. The bi-directional optical flow and perceptual hashing based fingertip tracking method according to claim 1, wherein in the first step, calculating the dense optical flow information corresponding to the captured scene information and performing binarization preprocessing to obtain a scene image containing the hand region, then constructing a skin color filter and segmenting the scene image to obtain the hand region, comprises the following steps:
(1.1) capturing scene information and calculating dense optical flow information corresponding to the scene information;
(1.2) traversing the optical flow information, finding the maximum optical flow value, and normalizing all optical flows in the X-axis and Y-axis directions by the maximum optical flow value;
(1.3) calculating the hue and saturation of the area which changes along the optical flow direction according to the normalized optical flow, and marking different color values;
(1.4) setting a threshold, converting the color-marked optical flow change region into a binary image according to the threshold, and combining logical operations and mathematical morphology operations to obtain a scene image containing the hand region;
(1.5) constructing a YCbCr skin color filter according to the clustering characteristics of human skin color, removing redundant color and brightness information;
(1.6) after filtering with the YCbCr skin color filter, sorting all contours of the scene image containing the hand region in descending order and taking the largest connected region as the hand region, thereby achieving segmentation of the hand region from the scene image.
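Steps (1.5)-(1.6) rely on the clustering of skin tones in the CbCr plane. A minimal per-pixel sketch: the BT.601 conversion is standard, but the Cb/Cr cluster bounds used below are common literature values, not numbers taken from the claim:

```python
def is_skin(r, g, b):
    """True when an RGB pixel falls inside a typical YCbCr skin cluster.

    Luma is ignored (the claim removes redundant brightness information);
    the bounds 77 <= Cb <= 127 and 133 <= Cr <= 173 are illustrative.
    """
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return 77 <= cb <= 127 and 133 <= cr <= 173
```

A mask built from this test, cleaned up with morphology, leaves a set of contours; sorting them by area and keeping the largest connected region then yields the hand region of step (1.6).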
3. The bi-directional optical flow and perceptual hashing based fingertip tracking method according to claim 1, wherein in the second step, sampling the contour of the hand region, performing vector dot-product and cross-product calculations on each contour point to remove non-fingertip contour points and obtain candidate fingertip points, setting a threshold according to the geometric characteristics of the fingertip, counting the number of removed non-fingertip contour points near each candidate fingertip point and comparing it with the threshold to detect the fingertip positions and obtain the fingertip contour points, and returning to the first step for reinitialization if no fingertip point is detected, comprises the following steps:
(2.1) sampling the contour of the hand region and calculating the vector dot product of each contour point, where the dot product gives the K-curvature;
(2.2) setting thresholds T1 and T2; for each contour point whose K-curvature lies between T1 and T2, carrying out the vector cross-product calculation, and removing non-fingertip contour points according to the result to obtain candidate fingertip points;
(2.3) setting a threshold T0 according to the geometric characteristics of the fingertip; counting the number of continuously distributed contour points near each candidate fingertip point that do not satisfy the cross-product criterion, and comparing this count with T0 to detect the fingertip positions and obtain the fingertip contour points; if no fingertip point is detected, returning to the first step for reinitialization.
4. The bi-directional optical flow and perceptual hashing based fingertip tracking method according to claim 3, wherein in step (2.1), calculating the vector dot product of each contour point means calculating the absolute cosine of each contour point by formula (1):

|cos θ| = |V1 · V2| / (|V1| |V2|)   (1)

where V1 is the vector from a contour point P0 to the K-th point P1 before it, and V2 is the vector from P0 to the K-th point P2 after it.
5. The bi-directional optical flow and perceptual hashing based fingertip tracking method according to claim 4, wherein in step (2.2), carrying out the vector cross-product calculation on the contour points whose K-curvature lies between the thresholds T1 and T2 and removing non-fingertip contour points according to the result to obtain candidate fingertip points means: if the absolute cosine of a contour point satisfies formula (2), i.e. T1 < |cos θ| < T2, the vector cross product of that point is computed by formula (3); otherwise, no cross-product calculation is needed for that point;
V = V1 × V2   (3)
If the vector cross product V is greater than 0, the contour point is a candidate fingertip point; otherwise, it is judged to be a non-fingertip contour point and removed.
6. The bi-directional optical flow and perceptual hashing based fingertip tracking method according to claim 5, wherein in step (2.3), counting the number of continuously distributed contour points near each candidate fingertip point that do not satisfy the cross-product criterion and comparing it with the threshold T0 to detect the fingertip positions and obtain the fingertip contour points means: for each candidate fingertip point, the number N of nearby contour points that do not satisfy V > 0 is counted; if N > T0, the candidate fingertip point is a fingertip contour point and its position is detected; otherwise, the method returns to the first step for reinitialization.
7. The bi-directional optical flow and perceptual hashing based fingertip tracking method according to claim 1, wherein in the third step, processing the fingertip contour points with a bidirectional pyramid optical flow algorithm to obtain bidirectionally matched fingertip contour points and estimate a fingertip motion region; within the estimated motion region, applying the discrete cosine transform window by window under a search window strategy, extracting a neighborhood, calculating the corresponding low-frequency DCT coefficient matrix, generating a 64-bit perceptual hash sequence with a perceptual hash algorithm, and matching it against the fingertip hash sequence template, the region with the best match being the fingertip region; and finally setting a matching-distance threshold to judge whether the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template and a threshold on the number of matched fingertip contour points to judge whether the fingertip region is tracked correctly, comprises the following steps:
(3.1) carrying out forward optical flow calculation on the fingertip contour points detected in the second step to obtain the corresponding forward-estimated fingertip contour points in the next frame;
(3.2) carrying out backward optical flow calculation on the forward-estimated fingertip contour points to obtain the corresponding backward-estimated fingertip contour points in the current frame;
(3.3) matching the initial fingertip contour points against the backward-estimated fingertip contour points to obtain the fingertip contour points that can be bidirectionally matched;
(3.4) estimating the motion region of the fingertip in the next frame from the bidirectionally matched fingertip contour points;
(3.5) within the estimated fingertip motion region, adopting a search window strategy: applying the discrete cosine transform in each search window in turn and extracting the low-frequency effective components in the 8 × 8 neighborhood at the upper-left corner of the transform to obtain the corresponding discrete cosine coefficient matrix;
(3.6) calculating the median of the discrete cosine coefficient matrix and comparing each coefficient in the neighborhood with the median to generate a 64-bit perceptual hash sequence;
(3.7) carrying out matching analysis between the 64-bit perceptual hash sequence and the fingertip hash sequence template retained from the previous frame's tracking result, the region with the best match being the fingertip region; for the first tracking pass after initialization, the fingertip hash sequence template is set from the initialization result;
(3.8) setting a threshold T3 on the matching distance to judge whether the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template;
(3.9) setting a threshold T4 on the number of matched fingertip contour points to judge whether the fingertip region is tracked correctly; if the number of matched fingertip contour points M < T4, the fingertip target has clearly been lost, and the method returns to the first step to reinitialize and re-detect the fingertip contour points; otherwise, the fingertip region is judged to be tracked correctly, and the method returns to step (3.1) for the next round of fingertip tracking, achieving continuous fingertip tracking.
8. The bi-directional optical flow and perceptual hashing based fingertip tracking method according to claim 7, wherein in step (3.5), adopting the search window strategy within the fingertip motion estimation region, applying the discrete cosine transform in each search window in turn, and extracting the low-frequency effective components in the 8 × 8 neighborhood at the upper-left corner of the transform to obtain the corresponding discrete cosine coefficient matrix means: within the estimated fingertip motion region, a 16 × 16 search window is set and moved from the upper-left to the lower-right corner of the motion estimation region; each 16 × 16 image block is converted to grayscale in turn, the discrete cosine transform is applied, the low-frequency effective components of the 8 × 8 neighborhood at the upper-left corner of the transform are extracted, and the corresponding discrete cosine coefficient matrix is calculated.
9. The bi-directional optical flow and perceptual hashing based fingertip tracking method according to claim 7, wherein in step (3.8), setting the threshold T3 of the matching distance to judge whether the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template means: the Hamming distance between the 64-bit perceptual hash sequence and the template is calculated and compared with the threshold T3 by formula (4):

D_hamming(S_template, S_new) ≥ T3   (4)

If the Hamming distance D_hamming(S_template, S_new) ≥ T3, the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template; otherwise, the template is not updated.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510646203.2A CN105261038B (en) | 2015-09-30 | 2015-09-30 | Fingertip tracking method based on bidirectional optical flow and perceptual hash |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510646203.2A CN105261038B (en) | 2015-09-30 | 2015-09-30 | Fingertip tracking method based on bidirectional optical flow and perceptual hash |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105261038A true CN105261038A (en) | 2016-01-20 |
CN105261038B CN105261038B (en) | 2018-02-27 |
Family
ID=55100709
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510646203.2A Active CN105261038B (en) | Fingertip tracking method based on bidirectional optical flow and perceptual hash |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105261038B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100103092A1 (en) * | 2008-10-23 | 2010-04-29 | Tatung University | Video-based handwritten character input apparatus and method thereof |
CN102270348A (en) * | 2011-08-23 | 2011-12-07 | 中国科学院自动化研究所 | Method for tracking deformable hand gesture based on video streaming |
CN102402289A (en) * | 2011-11-22 | 2012-04-04 | 华南理工大学 | Mouse recognition method for gesture based on machine vision |
CN104599288A (en) * | 2013-10-31 | 2015-05-06 | 展讯通信(天津)有限公司 | Skin color template based feature tracking method and device |
US20150253864A1 (en) * | 2014-03-06 | 2015-09-10 | Avago Technologies General Ip (Singapore) Pte. Ltd. | Image Processor Comprising Gesture Recognition System with Finger Detection and Tracking Functionality |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105825201A (en) * | 2016-03-31 | 2016-08-03 | 武汉理工大学 | Moving object tracking method in video monitoring |
CN106327486A (en) * | 2016-08-16 | 2017-01-11 | 广州视源电子科技股份有限公司 | Method and device for tracking finger web position |
CN106327486B (en) * | 2016-08-16 | 2018-12-28 | 广州视源电子科技股份有限公司 | Method and device for tracking finger web position |
CN106408579A (en) * | 2016-10-25 | 2017-02-15 | 华南理工大学 | Video based clenched finger tip tracking method |
WO2018076484A1 (en) * | 2016-10-25 | 2018-05-03 | 华南理工大学 | Method for tracking pinched fingertips based on video |
CN106408579B (en) * | 2016-10-25 | 2019-01-29 | 华南理工大学 | A kind of kneading finger tip tracking based on video |
CN110036638A (en) * | 2017-01-04 | 2019-07-19 | 高通股份有限公司 | Motion vector for bi-directional optical stream (BIO) is rebuild |
CN110036638B (en) * | 2017-01-04 | 2023-06-27 | 高通股份有限公司 | Method, device, equipment and storage medium for decoding video data |
CN113947683A (en) * | 2021-10-15 | 2022-01-18 | 兰州交通大学 | Fingertip point detection method and system and fingertip point motion track identification method and system |
Also Published As
Publication number | Publication date |
---|---|
CN105261038B (en) | 2018-02-27 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||