CN105261038A - Bidirectional optical flow and perceptual hash based fingertip tracking method - Google Patents

Bidirectional optical flow and perceptual hash based fingertip tracking method

Info

Publication number
CN105261038A
Authority
CN
China
Prior art keywords
fingertip
points
point
contour
optical flow
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510646203.2A
Other languages
Chinese (zh)
Other versions
CN105261038B (en)
Inventor
康文雄
吴桂乐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN201510646203.2A priority Critical patent/CN105261038B/en
Publication of CN105261038A publication Critical patent/CN105261038A/en
Application granted granted Critical
Publication of CN105261038B publication Critical patent/CN105261038B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a bidirectional optical flow and perceptual hash based fingertip tracking method. The method comprises the steps of S1, calculating dense optical flow information corresponding to scene information, and constructing a skin color filter to acquire a hand area; S2, carrying out vector dot product and cross multiplication calculation on each contour point so as to remove non-fingertip contour points, detecting the position of a fingertip point according to geometrical features of the fingertip, and acquiring fingertip contour points; and S3, calculating the fingertip contour points by using a bidirectional pyramid optical flow algorithm, acquiring bidirectionally matched fingertip contour points so as to carry out estimation on a fingertip moving area, generating a 64-bit perceptual hash sequence by adopting a perceptual hash algorithm, and matching the 64-bit perceptual hash sequence with a fingertip hash sequence template, wherein the maximum matching area is a fingertip area; and judging whether a next turn of fingertip tracking is carried out or not, and realizing fingertip continuous tracking. The method provided by the invention can effectively realize continuous tracking for the fingertip in a complex environment, and a problem of poor fingertip tracking effect caused by discontinuous fingertip tracking trajectory is avoided at the same time.

Description

Fingertip tracking method based on bidirectional optical flow and perceptual hash
Technical Field
The invention relates to the technical field of image processing and analysis, in particular to a fingertip tracking method based on bidirectional optical flow and perceptual hashing.
Background
Traditional human-computer interaction exchanges information between people and computers mainly through media such as a mouse or keyboard; these media occupy physical space and are inconvenient to use. With the development of computer science and artificial intelligence, human-computer interaction has continued to evolve, and technologies such as touch screens, voice recognition, and computer vision have made human-computer interaction systems more convenient and user-friendly.
Fingertip tracking, an important component of computer-vision-based human-computer interaction systems, has attracted growing attention from researchers in recent years. It can be widely applied to gesture recognition, identity authentication, mouse control, home entertainment remote control, and similar fields, and has great commercial value. Current fingertip tracking technologies fall broadly into two categories: the first achieves continuous tracking by repeatedly detecting the fingertip; the second detects the fingertip first and then tracks it continuously through analysis, prediction, and similar means.
A typical implementation of the first category marks the fingertip with a special pigment or wraps it in colored tape, captures the video scene with a camera, and detects the fingertip by tracking the special color in the scene. However, this approach requires a color marker every time, is inconvenient to use, and works only in simple scenes. Some researchers have proposed using data gloves and locating the fingertip from sensor data. The tracking quality of this approach is good, but data gloves are generally expensive, inconvenient to use, and hard to popularize. In recent years, with the development of camera technology, and especially of special cameras such as Microsoft's Kinect, many researchers have begun to consider using such cameras in human-computer interaction systems such as finger tracking. These special cameras generally provide more useful information than ordinary cameras and give good interaction results, but they are not widely available, are expensive, and are therefore unsuitable for popularization. In general, methods of this category only perform repeated detection and often ignore the temporal and spatial relationships in the video scene, so their tracking performance is mediocre and they are prone to problems such as jerky tracking and discontinuous tracking trajectories.
The second category has been proposed only in recent years. After the fingertip position in the video scene has been detected by some technique, continuous fingertip tracking is achieved by means such as velocity and position prediction or particle filtering. However, this technology is not yet mature: most algorithms track only moderately well, and many can be applied only in simple scenes.
Therefore, capturing the video scene with an ordinary camera and continuously tracking fingertips in a complex environment offers convenience of use, ease of popularization, and good robustness, and is the current trend in research on finger-based human-computer interaction systems.
Disclosure of Invention
The invention aims to overcome the shortcomings of existing fingertip detection and tracking technologies and provides a fingertip tracking method based on bidirectional optical flow and perceptual hashing.
To achieve this aim, the invention adopts the following technical scheme: a fingertip tracking method based on bidirectional optical flow and perceptual hashing, used for continuously tracking the fingertips of a hand, characterized in that the method comprises the following three steps:
first step: capturing scene information, and obtaining a scene image containing the hand region by calculating the dense optical flow corresponding to the scene information and performing binarization preprocessing; then constructing a skin color filter and segmenting the scene image containing the hand region to obtain the hand region;
second step: sampling the contour of the hand region, and performing vector dot product and cross product calculations at each contour point to remove non-fingertip contour points and obtain candidate fingertip points; setting a threshold according to the geometric characteristics of a fingertip, counting the removed non-fingertip contour points near each candidate fingertip point, and comparing the count with the threshold to detect the position of the fingertip point and obtain the fingertip contour points; if no fingertip point is detected, returning to the first step for reinitialization;
third step: processing the fingertip contour points with a bidirectional pyramid optical flow algorithm to obtain bidirectionally matched fingertip contour points, which are used to estimate the fingertip motion region; within the estimated motion region, applying a search window strategy in which each window is discrete-cosine transformed in turn, a neighborhood is extracted to form the corresponding low-frequency discrete cosine transform coefficient matrix, and a 64-bit perceptual hash sequence is generated by a perceptual hash algorithm; matching each 64-bit perceptual hash sequence against a fingertip hash sequence template, the best-matching region being the fingertip region; and finally, setting a threshold on the matching distance to decide whether the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template, and setting a threshold on the number of matched fingertip contour points to decide whether the fingertip region has been tracked correctly; if so, carrying out the next round of fingertip tracking to achieve continuous tracking, and if not, returning to the first step.
In the first step, obtaining a scene image containing the hand region by calculating the dense optical flow corresponding to the captured scene information and performing binarization preprocessing, and then constructing a skin color filter and segmenting that scene image to obtain the hand region, comprises the following steps:
(1.1) capturing scene information and calculating the dense optical flow corresponding to the scene information;
(1.2) traversing the optical flow, finding the maximum optical flow value, and normalizing all optical flow components in the X-axis and Y-axis directions by this maximum value;
(1.3) calculating, from the normalized optical flow, the hue and saturation of the regions that change along the optical flow direction, and marking them with different color values;
(1.4) setting a threshold, converting the color-marked optical flow change regions into a binary image according to the threshold, and combining logical operations and mathematical morphology operations to obtain a scene image containing the hand region;
(1.5) constructing a YCbCr skin color filter according to the clustering characteristics of human skin color, and removing redundant color and brightness information;
(1.6) after filtering with the YCbCr skin color filter, sorting all contours of the scene image containing the hand region in descending order and taking the largest connected region as the hand region, thereby segmenting the hand region from the scene image.
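Steps (1.2) to (1.4) can be illustrated with a minimal pure-Python sketch. The source does not say how hue and saturation are derived from the flow or which threshold is used, so the mapping below (hue from flow direction, saturation from normalized magnitude), the function name `flow_to_binary_mask`, and the default threshold of 0.2 are all assumptions made for illustration. The dense flow field itself is taken as given, and the logical and morphological operations of step (1.4) are omitted.

```python
import math

def flow_to_binary_mask(flow, threshold=0.2):
    """Normalize a dense flow field and binarize it (hypothetical helper).

    flow is a 2D list of (dx, dy) tuples, one per pixel.
    Returns (hue_sat, mask): per-pixel (hue in degrees, saturation in [0, 1])
    and a 0/1 mask of pixels whose normalized motion exceeds the threshold.
    """
    # Step (1.2): find the maximum flow magnitude and normalize by it.
    max_mag = max(math.hypot(dx, dy) for row in flow for dx, dy in row)
    if max_mag == 0:
        max_mag = 1.0  # static scene: avoid division by zero

    hue_sat, mask = [], []
    for row in flow:
        hs_row, mask_row = [], []
        for dx, dy in row:
            nx, ny = dx / max_mag, dy / max_mag
            # Step (1.3), assumed mapping: direction -> hue, magnitude -> saturation.
            hue = math.degrees(math.atan2(ny, nx)) % 360.0
            sat = math.hypot(nx, ny)
            hs_row.append((hue, sat))
            # Step (1.4): threshold the color-marked motion into a binary image.
            mask_row.append(1 if sat > threshold else 0)
        hue_sat.append(hs_row)
        mask.append(mask_row)
    return hue_sat, mask
```

In practice the flow field would come from a dense optical flow routine; the sketch only shows how a single maximum value can normalize both axes before thresholding.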
In the second step, sampling the contour of the hand region, performing vector dot product and cross product calculations at each contour point to remove non-fingertip contour points and obtain candidate fingertip points, setting a threshold according to the geometric characteristics of a fingertip, counting the removed non-fingertip contour points near each candidate fingertip point and comparing the count with the threshold to detect the position of the fingertip point and obtain the fingertip contour points, and returning to the first step for reinitialization if no fingertip point is detected, comprises the following steps:
(2.1) sampling the contour of the hand region and calculating the vector dot product at each contour point, the vector dot product being the K curvature;
(2.2) setting thresholds T1 and T2, performing the vector cross product calculation on contour points whose K curvature lies between T1 and T2, and removing non-fingertip contour points according to the result to obtain candidate fingertip points;
(2.3) setting a threshold T0 according to the geometric characteristics of a fingertip, counting the consecutively distributed contour points near each candidate fingertip point that do not satisfy the vector cross product criterion, and comparing this count with T0 to detect the position of the fingertip point and obtain the fingertip contour points; if no fingertip point is detected, returning to the first step for reinitialization.
In step (2.1), the calculating the vector dot product of each contour point includes: calculating the cosine absolute value of each contour point by adopting a formula (1):
$$|\cos\alpha| = \left|\frac{V_1 \cdot V_2}{|V_1|\,|V_2|}\right| \qquad (1)$$
where V1 is the vector formed by a point P0 on the contour and the K-th point before it, P1, and V2 is the vector formed by P0 and the K-th point after it, P2.
In step (2.2), performing the vector cross product calculation on contour points whose K curvature lies between thresholds T1 and T2, and removing non-fingertip contour points according to the result to obtain candidate fingertip points, means: if the absolute cosine value at a contour point satisfies formula (2), the vector cross product at that point is calculated by formula (3); otherwise, no vector cross product calculation is needed for that point;
$$T_1 < |\cos\alpha|_{P_0} < T_2 \qquad (2)$$
$$V = V_1 \times V_2 \qquad (3)$$
if the vector cross product V is greater than 0, the contour point is a candidate fingertip point; otherwise, it is judged to be a non-fingertip contour point and is removed.
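Formulas (1) to (3) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patented implementation: the contour is assumed to be a closed list of (x, y) points, the vectors are assumed to point from P0 to P1 and from P0 to P2, the cross product is taken as its scalar z-component, and the function name and the threshold values used in the usage note are hypothetical. The sign of the cross product depends on the contour orientation and on the image coordinate convention, which the source does not specify.

```python
import math

def candidate_fingertips(contour, k, t1, t2):
    """Return indices of contour points that pass formulas (1)-(3)."""
    n = len(contour)
    candidates = []
    for i, (x0, y0) in enumerate(contour):
        x1, y1 = contour[(i - k) % n]  # P1: K-th point before P0
        x2, y2 = contour[(i + k) % n]  # P2: K-th point after P0
        v1 = (x1 - x0, y1 - y0)
        v2 = (x2 - x0, y2 - y0)
        norm = math.hypot(*v1) * math.hypot(*v2)
        if norm == 0:
            continue  # degenerate neighborhood, skip
        # Formula (1): absolute cosine of the angle between V1 and V2.
        cos_abs = abs(v1[0] * v2[0] + v1[1] * v2[1]) / norm
        # Formula (2): only points inside (T1, T2) go on to the cross product.
        if not (t1 < cos_abs < t2):
            continue
        # Formula (3): z-component of V1 x V2; V > 0 marks a candidate.
        cross = v1[0] * v2[1] - v1[1] * v2[0]
        if cross > 0:
            candidates.append(i)
    return candidates
```

For example, on the six-point contour `[(0, 0), (1, 3), (2, 6), (3, 3), (4, 0), (2, -2)]` with `k=1`, `t1=0.5`, and `t2=0.95`, only the sharp apex at index 2 survives both tests.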
In step (2.3), counting the consecutively distributed contour points near each candidate fingertip point that do not satisfy the vector cross product criterion and comparing the count with threshold T0 to detect the position of the fingertip point and obtain the fingertip contour points means: counting the number N of contour points near each candidate fingertip point that do not satisfy V > 0; if N > T0, the corresponding candidate fingertip point is a fingertip contour point and the position of the fingertip point is detected; otherwise, returning to the first step for reinitialization.
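One possible reading of the counting rule in step (2.3) is sketched below. The source does not define how large the neighborhood "near" a candidate is or how consecutive points are grouped, so the window `radius`, the use of the longest run of failing points as N, and both function names are assumptions made for illustration.

```python
def longest_failed_run(cross_ok, center, radius):
    """Longest run of consecutive points with V <= 0 inside the window.

    cross_ok[i] is True when contour point i satisfies V > 0.
    The contour is treated as closed, so indices wrap around.
    """
    n = len(cross_ok)
    best = run = 0
    for offset in range(-radius, radius + 1):
        if cross_ok[(center + offset) % n]:
            run = 0
        else:
            run += 1
            best = max(best, run)
    return best

def confirm_fingertips(cross_ok, candidates, radius, t0):
    """Keep the candidates whose count N exceeds the threshold T0."""
    return [i for i in candidates
            if longest_failed_run(cross_ok, i, radius) > t0]
```

If the returned list is empty, the method as described would return to the first step and reinitialize.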
In the third step, processing the fingertip contour points with a bidirectional pyramid optical flow algorithm to obtain bidirectionally matched fingertip contour points and estimate the fingertip motion region; applying, within the estimated motion region, a search window strategy in which each window is discrete-cosine transformed in turn, a neighborhood is extracted to form the corresponding low-frequency discrete cosine transform coefficient matrix, and a 64-bit perceptual hash sequence is generated by a perceptual hash algorithm and matched against a fingertip hash sequence template, the best-matching region being the fingertip region; and finally setting a threshold on the matching distance to decide whether the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template and a threshold on the number of matched fingertip contour points to decide whether the fingertip region has been tracked correctly, carrying out the next round of fingertip tracking if so and returning to the first step if not, comprises the following steps:
(3.1) carrying out forward optical flow calculation on the fingertip outline points detected in the second step to obtain forward estimation fingertip outline points corresponding to the fingertip outline points in the next frame;
(3.2) carrying out reverse optical flow calculation on the forward estimation fingertip outline points to obtain corresponding reverse estimation fingertip outline points in the current frame;
(3.3) matching the initial fingertip outline points and the reverse estimation fingertip outline points to obtain the fingertip outline points capable of being subjected to bidirectional matching;
(3.4) estimating the motion area of the fingertip in the next frame by utilizing the bidirectional matching fingertip outline point;
(3.5) in the fingertip movement estimation area, adopting a search window strategy, sequentially performing discrete cosine transform in the search window area, and extracting low-frequency effective components in an 8 x 8 neighborhood at the upper left corner after the discrete cosine transform to obtain a corresponding discrete cosine coefficient matrix;
(3.6) calculating a median value of the discrete cosine coefficient matrix, and comparing each pixel point in the neighborhood with the median value to generate a 64-bit perceptual hash sequence;
(3.7) matching the 64-bit perceptual hash sequence against the fingertip hash sequence template retained from the tracking result of the previous frame, the best-matching region being the fingertip region; for the first tracking round after initialization, the fingertip hash sequence template is set according to the initialization result;
(3.8) setting a threshold T3 on the matching distance to decide whether the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template;
(3.9) setting a threshold T4 on the number of matched fingertip contour points to decide whether the fingertip region has been tracked correctly: if the number of matched fingertip contour points M < T4, the fingertip target has clearly been lost, and the method returns to the first step to reinitialize and detect the fingertip contour points again; otherwise, the fingertip region is judged to be tracked correctly, and the method returns to step (3.1) for the next round of fingertip tracking, achieving continuous fingertip tracking.
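The forward-backward consistency check of steps (3.1) to (3.4), together with the lost-target test of step (3.9), might look like the sketch below. The optical flow itself is abstracted behind two callables, `track_forward` and `track_backward`, which are hypothetical stand-ins for a pyramidal optical flow tracker. The matching criterion (a distance tolerance `max_error`), the bounding-box form of the motion region, and the `margin` value are assumptions, since the source does not state them.

```python
import math

def bidirectional_matches(points, track_forward, track_backward, max_error=1.0):
    """Keep points whose forward-then-backward track returns near the start.

    Returns a list of (current_point, next_frame_point) pairs.
    """
    matched = []
    for p in points:
        q = track_forward(p)            # step (3.1): forward estimate
        if q is None:
            continue
        p_back = track_backward(q)      # step (3.2): reverse estimate
        if p_back is None:
            continue
        # Step (3.3): accept the point only if both directions agree.
        if math.hypot(p[0] - p_back[0], p[1] - p_back[1]) <= max_error:
            matched.append((p, q))
    return matched

def motion_region(matched, margin=8):
    """Step (3.4): bounding box (x0, y0, x1, y1) around the matched points."""
    xs = [q[0] for _, q in matched]
    ys = [q[1] for _, q in matched]
    return (min(xs) - margin, min(ys) - margin,
            max(xs) + margin, max(ys) + margin)

def target_lost(matched, t4):
    """Step (3.9): the target is considered lost when M < T4."""
    return len(matched) < t4
```

Points that drift when tracked back are dropped, so the number of surviving pairs doubles as the count M that is compared with T4.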
In step (3.5), applying a search window strategy in the fingertip motion estimation region, performing the discrete cosine transform on each search window in turn, and extracting the low-frequency components in the 8 x 8 neighborhood at the upper left corner of the transform to obtain the corresponding discrete cosine coefficient matrix means: within the estimated motion region of the fingertip, a 16 x 16 search window is moved from the upper left corner to the lower right corner of the region; each 16 x 16 image block is converted to a grayscale image and discrete-cosine transformed, and the low-frequency components in the 8 x 8 neighborhood at the upper left corner of the transform are extracted to form the corresponding discrete cosine coefficient matrix.
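For one 16 x 16 grayscale block, steps (3.5) and (3.6) can be sketched as follows. The source does not state which DCT variant or normalization is used, so the orthonormal DCT-II below is an assumption. The bit order of the hash and the strict "greater than the median" comparison are also illustrative choices, and the function names are hypothetical.

```python
import math
from statistics import median

def dct2(block):
    """Orthonormal 2D DCT-II of a square block given as a list of rows."""
    n = len(block)
    basis = [[math.cos(math.pi * (2 * x + 1) * u / (2 * n)) for x in range(n)]
             for u in range(n)]
    scale = [math.sqrt(1.0 / n)] + [math.sqrt(2.0 / n)] * (n - 1)
    # Transform along each row, then along each column.
    tmp = [[scale[v] * sum(row[y] * basis[v][y] for y in range(n))
            for v in range(n)] for row in block]
    return [[scale[u] * sum(tmp[x][v] * basis[u][x] for x in range(n))
             for v in range(n)] for u in range(n)]

def perceptual_hash(gray_block):
    """64-bit hash from the 8 x 8 low-frequency corner of the block's DCT."""
    coeffs = dct2(gray_block)
    # Step (3.5): keep the upper-left 8 x 8 low-frequency coefficients.
    low = [coeffs[u][v] for u in range(8) for v in range(8)]
    # Step (3.6): one bit per coefficient, set when it exceeds the median.
    med = median(low)
    bits = 0
    for c in low:
        bits = (bits << 1) | (1 if c > med else 0)
    return bits
```

Calling `perceptual_hash` on every 16 x 16 window of the estimated motion region yields one 64-bit integer per window, which can then be compared with the template.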
In step (3.8), setting the threshold T3 on the matching distance to decide whether the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template means: calculating the Hamming distance between the 64-bit perceptual hash sequence and the fingertip hash sequence template, and comparing this Hamming distance with the threshold T3 according to formula (4) to decide whether the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template;
$$\begin{cases} D_{hamming}(S_{template}, S_{new}) \ge T_3, & \text{template update} \\ D_{hamming}(S_{template}, S_{new}) < T_3, & \text{template no update} \end{cases} \qquad (4)$$
If the Hamming distance D_hamming(S_template, S_new) ≥ T3, the fingertip hash sequence template is updated; otherwise, the fingertip hash sequence template is not updated. In the invention, for the first tracking round after initialization, the fingertip hash sequence template is set according to the initialization result; in the subsequent continuous tracking process, the fingertip hash sequence template is the one carried over from the previous frame, that is, the template as updated during tracking of the previous frame.
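The matching of step (3.7) and the update rule of formula (4) can be sketched with 64-bit hashes held as Python integers. Treating the "maximum matching" window as the one with the smallest Hamming distance to the template is an assumption, and the function names are hypothetical. The update condition follows formula (4) as written, so the template is replaced when the distance is at least T3.

```python
def hamming_distance(a, b):
    """Number of differing bits between two hash values."""
    return bin(a ^ b).count("1")

def best_match(template, window_hashes):
    """Return (index, distance) of the window hash closest to the template."""
    distances = [hamming_distance(template, h) for h in window_hashes]
    best = min(range(len(distances)), key=distances.__getitem__)
    return best, distances[best]

def update_template(template, new_hash, t3):
    """Formula (4): update the template when D_hamming >= T3."""
    if hamming_distance(template, new_hash) >= t3:
        return new_hash
    return template
```

Under this rule the template is left alone while the tracked region still resembles it, and is refreshed once the appearance has changed by at least T3 bits.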
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. The method captures the scene video with an ordinary camera; no special equipment is needed, and the hand needs no special marker. By calculating dense optical flow and constructing a skin color filter, the method accurately segments the hand region from a complex scene and copes with scenes that contain many skin-colored regions or a changing background. Then, through an improved K-curvature method, the correct position of the fingertip point is found, ensuring that the fingertip point satisfies both the curvature characteristic and the geometric characteristic. Finally, starting from the fingertip detection result, continuous fingertip tracking is achieved with bidirectional pyramid optical flow and perceptual hashing: the bidirectional pyramid optical flow analyzes the frame-to-frame correlation of the fingertip contour points and can accurately predict the motion trend and motion range of the current target, while perceptual hash matching finds the most likely target within the estimated motion range. In addition, the method introduces template updating and contour matching point strategies during tracking, achieving stable and efficient fingertip tracking.
2. The fingertip tracking method can effectively realize the continuous tracking of the fingertip in a complex environment, and simultaneously avoids the problem of poor fingertip tracking effect caused by discontinuous fingertip tracking track.
Drawings
FIG. 1 is a block flow diagram of a fingertip tracking method of the present invention;
FIG. 2 is a flowchart of the method for segmenting the complete hand region from a complex environment in the first step;
FIG. 3 is a flowchart of the method for performing fingertip detection on the hand region image in the second step;
FIG. 4 is a schematic diagram of the positions of a point P0 on the contour, the K-th point before it, P1, and the K-th point after it, P2, in the second step;
FIG. 5 is a flowchart of the method for continuous fingertip tracking based on bidirectional pyramid optical flow and perceptual hashing in the third step;
FIG. 6 is a schematic diagram of the process of obtaining bidirectionally matchable fingertip contour points in the third step.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
Examples
As shown in fig. 1 to 6, the fingertip tracking method based on bidirectional optical flow and perceptual hash is used for continuously tracking the fingertip of a hand; the method comprises the following three steps:
first step: capturing scene information, and obtaining a scene image containing the hand region by calculating the dense optical flow corresponding to the scene information and performing binarization preprocessing; then constructing a skin color filter and segmenting the scene image containing the hand region to obtain the hand region;
second step: sampling the contour of the hand region, and performing vector dot product and cross product calculations at each contour point to remove non-fingertip contour points and obtain candidate fingertip points; setting a threshold according to the geometric characteristics of a fingertip, counting the removed non-fingertip contour points near each candidate fingertip point, and comparing the count with the threshold to detect the position of the fingertip point and obtain the fingertip contour points; if no fingertip point is detected, returning to the first step for reinitialization;
third step: processing the fingertip contour points with a bidirectional pyramid optical flow algorithm to obtain bidirectionally matched fingertip contour points, which are used to estimate the fingertip motion region; within the estimated motion region, applying a search window strategy in which each window is discrete-cosine transformed in turn, a neighborhood is extracted to form the corresponding low-frequency discrete cosine transform coefficient matrix, and a 64-bit perceptual hash sequence is generated by a perceptual hash algorithm; matching each 64-bit perceptual hash sequence against a fingertip hash sequence template, the best-matching region being the fingertip region; and finally, setting a threshold on the matching distance to decide whether the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template, and setting a threshold on the number of matched fingertip contour points to decide whether the fingertip region has been tracked correctly; if so, carrying out the next round of fingertip tracking to achieve continuous tracking, and if not, returning to the first step.
In the first step, obtaining a scene image containing the hand region by calculating the dense optical flow corresponding to the captured scene information and performing binarization preprocessing, and then constructing a skin color filter and segmenting that scene image to obtain the hand region, comprises the following steps:
(1.1) capturing scene information and calculating the dense optical flow corresponding to the scene information; although dense optical flow is generally computationally expensive, the invention computes it only to obtain the approximate candidate hand region, so the optical flow is calculated with a two-level pyramid and a relatively large search window (15x15);
(1.2) traversing the optical flow, finding the maximum optical flow value, and normalizing all optical flow components in the X-axis and Y-axis directions by this maximum value;
(1.3) calculating, from the normalized optical flow, the hue and saturation of the regions that change along the optical flow direction, and marking them with different color values;
(1.4) setting a threshold, converting the color-marked optical flow change regions into a binary image according to the threshold, and combining logical operations and mathematical morphology operations to obtain a scene image containing the hand region;
(1.5) constructing a YCbCr skin color filter according to the clustering characteristics of human skin color, and removing redundant color and brightness information; because the distribution of human skin color clusters distinctly, and based on a large number of experimental results, the RGB image is converted into a YCbCr image and a long, narrow color band (Cb ∈ [100,127], Cr ∈ [128,170]) is selected as the skin color model to construct the YCbCr skin color filter;
(1.6) after filtering with the YCbCr skin color filter, sorting all contours of the scene image containing the hand region in descending order and taking the largest connected region as the hand region, thereby segmenting the hand region from the scene image.
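The skin color band of step (1.5) can be sketched as a per-pixel test. The Cb and Cr ranges come from the text above. The RGB-to-YCbCr conversion coefficients are an assumption (the common full-range BT.601 form), because the source does not say which conversion it uses, and the function names are hypothetical. Luminance Y is not computed, since the filter discards brightness.

```python
def rgb_to_cbcr(r, g, b):
    """Chroma components under the assumed full-range BT.601 conversion."""
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr

def is_skin(r, g, b, cb_range=(100, 127), cr_range=(128, 170)):
    """True when the pixel falls inside the Cb/Cr band given in step (1.5)."""
    cb, cr = rgb_to_cbcr(r, g, b)
    return (cb_range[0] <= cb <= cb_range[1]
            and cr_range[0] <= cr <= cr_range[1])

def skin_mask(image):
    """Apply the filter to a 2D list of (r, g, b) pixels; returns a 0/1 mask."""
    return [[1 if is_skin(*px) else 0 for px in row] for row in image]
```

The resulting mask would then be combined with the motion mask from step (1.4) before the largest connected region is selected in step (1.6).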
In the second step, sampling the contour of the hand region, performing vector dot product and cross product calculations at each contour point to remove non-fingertip contour points and obtain candidate fingertip points, setting a threshold according to the geometric characteristics of a fingertip, counting the removed non-fingertip contour points near each candidate fingertip point and comparing the count with the threshold to detect the position of the fingertip point and obtain the fingertip contour points, and returning to the first step for reinitialization if no fingertip point is detected, comprises the following steps:
(2.1) sampling the contour of the hand region and calculating the vector dot product at each contour point, the vector dot product being the K curvature;
(2.2) setting thresholds T1 and T2, performing the vector cross product calculation on contour points whose K curvature lies between T1 and T2, and removing non-fingertip contour points according to the result to obtain candidate fingertip points;
(2.3) setting a threshold T0 according to the geometric characteristics of a fingertip, counting the consecutively distributed contour points near each candidate fingertip point that do not satisfy the vector cross product criterion, and comparing this count with T0 to detect the position of the fingertip point and obtain the fingertip contour points; if no fingertip point is detected, returning to the first step for reinitialization.
In step (2.1), the calculating the vector dot product of each contour point includes: calculating the cosine absolute value of each contour point by adopting a formula (1):
$$|\cos\alpha| = \left|\frac{V_1 \cdot V_2}{|V_1|\,|V_2|}\right| \qquad (1)$$
where V1 is the vector formed by a point P0 on the contour and the K-th point before it, P1, and V2 is the vector formed by P0 and the K-th point after it, P2.
In step (2.2), performing the vector cross product calculation on contour points whose K curvature lies between thresholds T1 and T2, and removing non-fingertip contour points according to the result to obtain candidate fingertip points, means: if the absolute cosine value at a contour point satisfies formula (2), the vector cross product at that point is calculated by formula (3); otherwise, no vector cross product calculation is needed for that point;
$$T_1 < |\cos\alpha|_{P_0} < T_2 \qquad (2)$$
$$V = V_1 \times V_2 \qquad (3)$$
if the vector cross product V is greater than 0, the contour point is a candidate fingertip point; otherwise, it is judged to be a non-fingertip contour point and is removed.
In step (2.3), counting the number of consecutively distributed contour points near each candidate fingertip point that do not satisfy the vector cross-product condition, and comparing that number with the threshold T0 to detect the position of the fingertip and obtain the fingertip contour points, means: for each candidate fingertip point, counting the number N of consecutive contour points near it that do not satisfy V > 0; if N > T0, the corresponding candidate fingertip point is a fingertip contour point and the position of the fingertip is detected; otherwise, returning to the first step for reinitialization.
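One possible reading of this counting rule is sketched below. The text does not define "near" or the scanning direction, so the sketch assumes a closed contour, scans up to `window` points on each side of a candidate, and stops at the first point that passed the V > 0 test. The names `count_rejected_run`, `detect_fingertips` and `window` are hypothetical.

```python
def count_rejected_run(passed, i, window):
    """Count consecutive points adjacent to index i that did not pass V > 0.

    passed[j] is True when contour point j satisfied formulas (2) and (3).
    The scan covers at most `window` points on each side (assumption).
    """
    n = len(passed)
    total = 0
    for step in (-1, 1):
        for d in range(1, window + 1):
            if passed[(i + step * d) % n]:
                break  # the run of rejected points ends here
            total += 1
    return total

def detect_fingertips(passed, candidates, t0, window):
    """Keep the candidates whose rejected-run count N exceeds T0."""
    return [i for i in candidates
            if count_rejected_run(passed, i, window) > t0]

# Toy example: one isolated candidate at index 5 on a 10-point contour.
flags = [False] * 10
flags[5] = True
isolated = detect_fingertips(flags, [5], t0=4, window=3)  # N = 6
```

An empty result would correspond to the "no fingertip detected" branch, in which the method returns to the first step.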
In the third step, the fingertip contour points are tracked with a bidirectional pyramid optical flow algorithm to obtain bidirectionally matched fingertip contour points, which are used to estimate the fingertip motion region; within the estimated motion region, a search-window strategy is applied in which the discrete cosine transform is performed window by window, a neighborhood is extracted and the corresponding low-frequency discrete cosine transform coefficient matrix is calculated, and a 64-bit perceptual hash sequence is generated by a perceptual hash algorithm and matched against the fingertip hash sequence template, the region of maximum matching being the fingertip region; finally, a threshold on the matching distance is set to judge whether the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template, and a threshold on the number of matched fingertip contour points is set to judge whether the fingertip region is tracked correctly; if so, the next round of fingertip tracking is performed to achieve continuous fingertip tracking, and if not, the method returns to the first step. The third step comprises the following steps:
(3.1) performing forward optical flow calculation on the fingertip contour points detected in the second step to obtain the forward-estimated fingertip contour points corresponding to them in the next frame;
(3.2) performing reverse optical flow calculation on the forward-estimated fingertip contour points to obtain the corresponding reverse-estimated fingertip contour points in the current frame;
(3.3) matching the initial fingertip contour points against the reverse-estimated fingertip contour points to obtain the fingertip contour points that can be matched bidirectionally;
(3.4) estimating the motion region of the fingertip in the next frame from the bidirectionally matched fingertip contour points;
(3.5) in the estimated fingertip motion region, applying a search-window strategy, performing the discrete cosine transform window by window, and extracting the low-frequency significant components in the 8 × 8 neighborhood at the upper left corner of the transform result to obtain the corresponding discrete cosine coefficient matrix;
(3.6) calculating the median of the discrete cosine coefficient matrix, and comparing each element of the neighborhood with the median to generate a 64-bit perceptual hash sequence;
(3.7) matching the 64-bit perceptual hash sequence against the fingertip hash sequence template retained from the tracking result of the previous frame, the region of maximum matching being the fingertip region; for the first tracking after initialization is finished, the fingertip hash sequence template is set according to the initialization result;
(3.8) setting a threshold T3 on the matching distance to judge whether the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template;
(3.9) setting a threshold T4 on the number of matched fingertip contour points to judge whether the fingertip region is tracked correctly: if the number M of matched fingertip contour points satisfies M < T4, the fingertip target has clearly been lost, and the method returns to the first step to reinitialize and detect the fingertip contour points again; otherwise, the fingertip region is judged to be tracked correctly, and the method returns to step (3.1) for the next round of fingertip tracking, thereby achieving continuous fingertip tracking.
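Steps (3.3), (3.4) and (3.9) can be sketched independently of the optical flow computation itself. The sketch assumes the forward and reverse point estimates have already been produced by some pyramidal optical flow tracker (for example OpenCV's `calcOpticalFlowPyrLK`, which the patent does not name). The matching criterion (a pixel tolerance `max_error`), the bounding-box form of the motion region and the `margin` parameter are all assumptions, since the text does not specify them.

```python
import math

def bidirectional_matches(initial_pts, backward_pts, max_error):
    """Step (3.3): indices whose reverse estimate lands near the start point."""
    matched = []
    for i, (p, q) in enumerate(zip(initial_pts, backward_pts)):
        if math.hypot(p[0] - q[0], p[1] - q[1]) <= max_error:
            matched.append(i)
    return matched

def estimate_region(forward_pts, matched, margin):
    """Step (3.4): padded bounding box (x0, y0, x1, y1) of the matched points."""
    xs = [forward_pts[i][0] for i in matched]
    ys = [forward_pts[i][1] for i in matched]
    return (min(xs) - margin, min(ys) - margin,
            max(xs) + margin, max(ys) + margin)

def tracking_ok(matched, t4):
    """Step (3.9): tracking is judged lost when M < T4."""
    return len(matched) >= t4

# Hypothetical point sets: the third point drifts and fails the round trip.
initial = [(0, 0), (10, 10), (20, 20)]
forward = [(2, 1), (12, 11), (30, 5)]
backward = [(0.2, 0.1), (10, 10.5), (25, 25)]
matched = bidirectional_matches(initial, backward, max_error=1.0)
region = estimate_region(forward, matched, margin=8)
```

Points that survive the round trip are the ones used for the region estimate, so a drifting point such as the third one above does not enlarge the search area.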
In step (3.5), applying the search-window strategy in the estimated fingertip motion region, performing the discrete cosine transform window by window, and extracting the low-frequency significant components in the 8 × 8 neighborhood at the upper left corner of the transform result to obtain the corresponding discrete cosine coefficient matrix, means: a 16 × 16 search window is set in the estimated motion region of the fingertip; moving from the upper left corner to the lower right corner of the region, each 16 × 16 image block is converted to a grayscale image and transformed by the discrete cosine transform, the low-frequency significant components in the 8 × 8 neighborhood at the upper left corner of the transform result are extracted, and the corresponding discrete cosine coefficient matrix is calculated.
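A minimal sketch of steps (3.5) and (3.6) for a single 16 × 16 window follows, using only the standard library. It assumes the block is already grayscale, uses an unnormalized DCT-II, and sets a bit to 1 when a coefficient is strictly greater than the median; the text specifies none of these details, and the function names are hypothetical.

```python
import math
from statistics import median

def dct_2d(block):
    """Unnormalized 2-D DCT-II of a square block (list of rows)."""
    n = len(block)
    basis = [[math.cos((2 * x + 1) * u * math.pi / (2 * n))
              for x in range(n)] for u in range(n)]
    # Transform along rows first, then along columns (separable transform).
    rows = [[sum(block[y][x] * basis[u][x] for x in range(n))
             for u in range(n)] for y in range(n)]
    return [[sum(rows[y][u] * basis[v][y] for y in range(n))
             for u in range(n)] for v in range(n)]

def perceptual_hash(block16):
    """Return a 64-bit integer hash of a 16 x 16 grayscale block."""
    coeffs = dct_2d(block16)
    # Low-frequency components: the 8 x 8 neighborhood at the upper left.
    low = [coeffs[v][u] for v in range(8) for u in range(8)]
    med = median(low)
    bits = 0
    for c in low:
        bits = (bits << 1) | (1 if c > med else 0)  # assumed bit convention
    return bits

# A uniform block: only the DC coefficient is far from zero.
flat = [[100] * 16 for _ in range(16)]
flat_hash = perceptual_hash(flat)
```

Sliding this over the motion region, one window at a time, yields one hash per window for the matching step.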
In step (3.8), setting the threshold T3 on the matching distance to judge whether the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template means: calculating the Hamming distance between the 64-bit perceptual hash sequence and the fingertip hash sequence template, and comparing the Hamming distance with the threshold T3 by formula (4) to judge whether the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template;
$$\begin{cases} D_{hamming}(S_{template}, S_{new}) \ge T_3, & \text{template update} \\ D_{hamming}(S_{template}, S_{new}) < T_3, & \text{template no update} \end{cases} \tag{4}$$
if the Hamming distance D_hamming(S_template, S_new) ≥ T3, the fingertip hash sequence template is updated; otherwise, the fingertip hash sequence template is not updated. In the invention, for the first tracking after initialization is finished, the fingertip hash sequence template is set according to the initialization result; in the subsequent continuous tracking process, the fingertip hash sequence template is the one carried over from the previous frame, namely the template as updated during the tracking of the previous frame.
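The matching and update rules can be sketched with hashes stored as Python integers. `update_template` follows formula (4) as written, replacing the template when the distance reaches T3. `best_window` assumes that "maximum matching" means the smallest Hamming distance to the template, which is an interpretation rather than something the text states; all function names are hypothetical.

```python
def hamming_distance(a, b):
    """Number of differing bits between two 64-bit hash values."""
    return bin(a ^ b).count("1")

def best_window(template, window_hashes):
    """Index of the window hash closest to the template (assumed criterion)."""
    return min(range(len(window_hashes)),
               key=lambda i: hamming_distance(template, window_hashes[i]))

def update_template(template, new_hash, t3):
    """Formula (4): adopt the new hash when the distance is >= T3."""
    if hamming_distance(template, new_hash) >= t3:
        return new_hash
    return template

# Small hypothetical values (4 bits shown for readability).
template = 0b1010
hashes = [0b0101, 0b1011, 0b0110]
closest = best_window(template, hashes)          # distances 4, 1, 2
updated = update_template(template, 0b0110, 2)   # distance 2 >= 2
kept = update_template(template, 0b0110, 3)      # distance 2 < 3
```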
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (9)

1. A fingertip tracking method based on bidirectional optical flow and perceptual hash, used for continuously tracking the fingertips of a hand, characterized in that the method comprises the following three steps:
first step: capturing scene information, and obtaining a scene image containing a hand region by calculating the dense optical flow information corresponding to the scene information and performing binarization preprocessing; then constructing a skin color filter and segmenting the scene image containing the hand region to obtain the hand region;
second step: sampling the contour of the hand region, and performing vector dot-product and cross-product calculations on each contour point to remove non-fingertip contour points and obtain candidate fingertip points; setting a threshold according to the geometric characteristics of the fingertip, counting the number of removed non-fingertip contour points near each candidate fingertip point, and comparing that number with the threshold to detect the position of the fingertip and obtain the fingertip contour points; if no fingertip point is detected, returning to the first step to reinitialize;
third step: tracking the fingertip contour points with a bidirectional pyramid optical flow algorithm to obtain bidirectionally matched fingertip contour points, which are used to estimate the fingertip motion region; within the estimated motion region, applying a search-window strategy in which the discrete cosine transform is performed window by window, a neighborhood is extracted and the corresponding low-frequency discrete cosine transform coefficient matrix is calculated, and a 64-bit perceptual hash sequence is generated by a perceptual hash algorithm and matched against the fingertip hash sequence template, the region of maximum matching being the fingertip region; finally, setting a threshold on the matching distance to judge whether the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template, and setting a threshold on the number of matched fingertip contour points to judge whether the fingertip region is tracked correctly; if so, performing the next round of fingertip tracking to achieve continuous fingertip tracking, and if not, returning to the first step.
2. The bi-directional optical flow and perceptual hashing based fingertip tracking method according to claim 1, wherein: in the first step, capturing scene information, obtaining a scene image containing a hand region by calculating the dense optical flow information corresponding to the scene information and performing binarization preprocessing, and then constructing a skin color filter and segmenting the scene image containing the hand region to obtain the hand region, comprises the following steps:
(1.1) capturing scene information and calculating the dense optical flow information corresponding to the scene information;
(1.2) traversing the optical flow information to find the maximum optical flow value, and normalizing all optical flows in the X-axis and Y-axis directions by the maximum optical flow value;
(1.3) calculating, from the normalized optical flow, the hue and saturation of the regions that change along the optical flow direction, and marking them with different color values;
(1.4) setting a threshold, converting the color-marked optical flow change regions into a binary image according to the threshold, and combining logical operations and mathematical morphology operations to obtain a scene image containing the hand region;
(1.5) constructing a YCbCr skin color filter according to the clustering characteristics of human skin color, and removing redundant color and brightness information;
(1.6) after filtering with the YCbCr skin color filter, sorting all contours of the scene image containing the hand region in descending order and taking the largest connected region as the hand region, thereby segmenting the hand region from the scene image containing the hand region.
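The skin color test of step (1.5) might look like the sketch below. The RGB to CbCr conversion uses the full-range BT.601 (JFIF) coefficients. The Cb and Cr ranges are commonly used illustrative values and are not taken from the patent, which does not state its thresholds; the function names are hypothetical. Discarding the Y channel is one way to read "removing redundant brightness information".

```python
def rgb_to_cbcr(r, g, b):
    """Full-range BT.601 (JFIF) chroma components of an 8-bit RGB pixel."""
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr

def is_skin(r, g, b, cb_range=(77, 127), cr_range=(133, 173)):
    """True when the pixel's chroma falls inside the assumed skin cluster."""
    cb, cr = rgb_to_cbcr(r, g, b)
    return (cb_range[0] <= cb <= cb_range[1]
            and cr_range[0] <= cr <= cr_range[1])

# Hypothetical pixels: a light skin-like tone and saturated blue.
skin_like = is_skin(200, 150, 120)
blue = is_skin(0, 0, 255)
```

Applying `is_skin` to every pixel of the scene image gives a binary mask whose largest connected region would then be taken as the hand, as step (1.6) describes.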
3. The bi-directional optical flow and perceptual hashing based fingertip tracking method according to claim 1, wherein: in the second step, sampling the contour of the hand region, performing vector dot-product and cross-product calculations on each contour point to remove non-fingertip contour points and obtain candidate fingertip points, setting a threshold according to the geometric characteristics of the fingertip, counting the number of removed non-fingertip contour points near each candidate fingertip point and comparing that number with the threshold to detect the position of the fingertip and obtain the fingertip contour points, and returning to the first step to reinitialize if no fingertip point is detected, comprises the following steps:
(2.1) sampling the contour of the hand region, and calculating the vector dot product at each contour point, the vector dot product being the K-curvature;
(2.2) setting thresholds T1 and T2, performing a vector cross-product calculation on the contour points whose K-curvature lies between T1 and T2, and removing non-fingertip contour points according to the result to obtain candidate fingertip points;
(2.3) setting a threshold T0 according to the geometric characteristics of the fingertip, counting the number of consecutively distributed contour points near each candidate fingertip point that do not satisfy the vector cross-product condition, and comparing that number with T0 to detect the position of the fingertip and obtain the fingertip contour points; if no fingertip point is detected, returning to the first step to reinitialize.
4. The bi-directional optical flow and perceptual hashing based fingertip tracking method according to claim 3, wherein: in step (2.1), calculating the vector dot product of each contour point means calculating the absolute value of the cosine at each contour point by formula (1):
$$|\cos\alpha| = \left| \frac{V_1 \cdot V_2}{|V_1|\,|V_2|} \right| \tag{1}$$
where V1 is the vector formed by a point P0 on the contour and the K-th point P1 preceding it, and V2 is the vector formed by P0 and the K-th point P2 following it.
5. The bi-directional optical flow and perceptual hashing based fingertip tracking method according to claim 4, wherein: in step (2.2), performing the vector cross-product calculation on the contour points whose K-curvature lies between the thresholds T1 and T2, and removing the non-fingertip contour points according to the result to obtain the candidate fingertip points, means: if the absolute value of the cosine at a contour point satisfies formula (2), the vector cross product at that point is calculated by formula (3); otherwise, no cross-product calculation is needed for that point;
$$T_1 < |\cos\alpha|_{P_0} < T_2 \tag{2}$$
$$V = V_1 \times V_2 \tag{3}$$
if the vector cross product V is greater than 0, the contour point is a candidate fingertip point; otherwise, it is judged to be a non-fingertip contour point and is removed.
6. The bi-directional optical flow and perceptual hashing based fingertip tracking method of claim 5, wherein: in step (2.3), counting the number of consecutively distributed contour points near each candidate fingertip point that do not satisfy the vector cross-product condition, and comparing that number with the threshold T0 to detect the position of the fingertip and obtain the fingertip contour points, means: for each candidate fingertip point, counting the number N of consecutive contour points near it that do not satisfy V > 0; if N > T0, the corresponding candidate fingertip point is a fingertip contour point and the position of the fingertip is detected; otherwise, returning to the first step for reinitialization.
7. The bi-directional optical flow and perceptual hashing based fingertip tracking method according to claim 1, wherein: in the third step, tracking the fingertip contour points with a bidirectional pyramid optical flow algorithm to obtain bidirectionally matched fingertip contour points, which are used to estimate the fingertip motion region; within the estimated motion region, applying a search-window strategy in which the discrete cosine transform is performed window by window, a neighborhood is extracted and the corresponding low-frequency discrete cosine transform coefficient matrix is calculated, and a 64-bit perceptual hash sequence is generated by a perceptual hash algorithm and matched against the fingertip hash sequence template, the region of maximum matching being the fingertip region; finally, setting a threshold on the matching distance to judge whether the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template, and setting a threshold on the number of matched fingertip contour points to judge whether the fingertip region is tracked correctly, performing the next round of fingertip tracking if so and returning to the first step if not, comprises the following steps:
(3.1) performing forward optical flow calculation on the fingertip contour points detected in the second step to obtain the forward-estimated fingertip contour points corresponding to them in the next frame;
(3.2) performing reverse optical flow calculation on the forward-estimated fingertip contour points to obtain the corresponding reverse-estimated fingertip contour points in the current frame;
(3.3) matching the initial fingertip contour points against the reverse-estimated fingertip contour points to obtain the fingertip contour points that can be matched bidirectionally;
(3.4) estimating the motion region of the fingertip in the next frame from the bidirectionally matched fingertip contour points;
(3.5) in the estimated fingertip motion region, applying a search-window strategy, performing the discrete cosine transform window by window, and extracting the low-frequency significant components in the 8 × 8 neighborhood at the upper left corner of the transform result to obtain the corresponding discrete cosine coefficient matrix;
(3.6) calculating the median of the discrete cosine coefficient matrix, and comparing each element of the neighborhood with the median to generate a 64-bit perceptual hash sequence;
(3.7) matching the 64-bit perceptual hash sequence against the fingertip hash sequence template retained from the tracking result of the previous frame, the region of maximum matching being the fingertip region; for the first tracking after initialization is finished, the fingertip hash sequence template is set according to the initialization result;
(3.8) setting a threshold T3 on the matching distance to judge whether the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template;
(3.9) setting a threshold T4 on the number of matched fingertip contour points to judge whether the fingertip region is tracked correctly: if the number M of matched fingertip contour points satisfies M < T4, the fingertip target has clearly been lost, and the method returns to the first step to reinitialize and detect the fingertip contour points again; otherwise, the fingertip region is judged to be tracked correctly, and the method returns to step (3.1) for the next round of fingertip tracking, thereby achieving continuous fingertip tracking.
8. The bi-directional optical flow and perceptual hashing based fingertip tracking method of claim 7, wherein: in step (3.5), applying the search-window strategy in the estimated fingertip motion region, performing the discrete cosine transform window by window, and extracting the low-frequency significant components in the 8 × 8 neighborhood at the upper left corner of the transform result to obtain the corresponding discrete cosine coefficient matrix, means: a 16 × 16 search window is set in the estimated motion region of the fingertip; moving from the upper left corner to the lower right corner of the region, each 16 × 16 image block is converted to a grayscale image and transformed by the discrete cosine transform, the low-frequency significant components in the 8 × 8 neighborhood at the upper left corner of the transform result are extracted, and the corresponding discrete cosine coefficient matrix is calculated.
9. The bi-directional optical flow and perceptual hashing based fingertip tracking method of claim 7, wherein: in step (3.8), setting the threshold T3 on the matching distance to judge whether the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template means: calculating the Hamming distance between the 64-bit perceptual hash sequence and the fingertip hash sequence template, and comparing the Hamming distance with the threshold T3 by formula (4) to judge whether the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template;
$$\begin{cases} D_{hamming}(S_{template}, S_{new}) \ge T_3, & \text{template update} \\ D_{hamming}(S_{template}, S_{new}) < T_3, & \text{template no update} \end{cases} \tag{4}$$
if the Hamming distance D_hamming(S_template, S_new) ≥ T3, the fingertip hash sequence template is updated; otherwise, the fingertip hash sequence template is not updated.
CN201510646203.2A 2015-09-30 2015-09-30 Finger tip tracking based on two-way light stream and perception Hash Active CN105261038B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510646203.2A CN105261038B (en) 2015-09-30 2015-09-30 Finger tip tracking based on two-way light stream and perception Hash

Publications (2)

Publication Number Publication Date
CN105261038A true CN105261038A (en) 2016-01-20
CN105261038B CN105261038B (en) 2018-02-27

Family

ID=55100709

Country Status (1)

Country Link
CN (1) CN105261038B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant