CN105261038B - Fingertip tracking method based on bidirectional optical flow and perceptual hash - Google Patents

Fingertip tracking method based on bidirectional optical flow and perceptual hash

Info

Publication number: CN105261038B
Authority: CN (China)
Prior art keywords: fingertip, points, point, contour, area
Legal status: Active (as listed by Google Patents; an assumption, not a legal conclusion)
Application number: CN201510646203.2A
Other languages: Chinese (zh)
Other versions: CN105261038A (en)
Inventors: 康文雄, 吴桂乐
Current Assignee: South China University of Technology SCUT
Original Assignee: South China University of Technology SCUT
Application filed by: South China University of Technology SCUT
Priority to: CN201510646203.2A
Publication of CN105261038A (application)
Publication of CN105261038B (grant)
Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures

Abstract

The present invention provides a fingertip tracking method based on bidirectional optical flow and perceptual hashing. The method comprises: in the first step, calculating the dense optical flow information corresponding to the scene information, then constructing a skin color filter to obtain the hand region; in the second step, performing dot product and cross product calculations on each contour point to reject non-fingertip contour points, then detecting the position of the fingertip point according to the geometric characteristics of the fingertip and obtaining the fingertip contour points; in the third step, processing the fingertip contour points with a bidirectional pyramid optical flow algorithm to obtain bidirectionally matched fingertip contour points and estimate the fingertip motion region, generating 64-bit perceptual hash sequences with a perceptual hash algorithm and matching them against a fingertip hash sequence template, the best-matching region being the fingertip region, and then deciding whether to proceed to the next round of fingertip tracking so as to achieve continuous fingertip tracking. The method can effectively achieve continuous fingertip tracking in complex environments while avoiding the poor tracking results caused by discontinuous fingertip tracking trajectories.

Description

Fingertip tracking method based on bidirectional optical flow and perceptual hash
Technical Field
The invention relates to the technical field of image processing and analysis, in particular to a fingertip tracking method based on bidirectional optical flow and perceptual hashing.
Background
Traditional human-computer interaction mainly exchanges information between people and computers through media such as a mouse or keyboard; these media occupy a certain amount of space and are inconvenient to use. Today, with the development of computer science and artificial intelligence, human-computer interaction continues to advance, for example through touch screens, voice recognition, and computer vision, making human-computer interaction systems more convenient and user-friendly.
As an important component of computer-vision-based human-computer interaction systems, fingertip tracking has attracted growing attention from researchers in recent years. Fingertip tracking technology can be widely applied to gesture recognition, identity authentication, mouse control, home entertainment remote control, and similar fields, and has great commercial value. Current fingertip tracking technologies can be broadly divided into two categories: the first achieves continuous tracking by continuously detecting the fingertip; the second detects the fingertip first and then tracks it continuously by means of analysis, prediction, and the like.
A typical implementation of the first category marks the fingertip with a special pigment or wraps it with colored adhesive tape, captures the video scene with a camera, and detects the fingertip by tracking the special color in the scene. However, this approach requires a special color mark every time, which is inconvenient, and it works only when the scene is very simple. Some researchers have proposed using data gloves and locating the fingertip from sensor data; the tracking results are good, but data gloves are generally expensive, inconvenient to use, and hard to popularize. In recent years, with the development of camera technology, especially the Kinect special-purpose camera developed by Microsoft, many researchers have begun to consider applying special-purpose cameras to human-computer interaction systems such as finger tracking. Such cameras generally provide more useful information than ordinary cameras and give good interaction results, but they are not widespread, are expensive, and are not suitable for popularization. In general, fingertip tracking methods of this category only perform continuous detection and often ignore the temporal and spatial relationships in the video scene, so the tracking results are mediocre and problems such as jerky tracking and discontinuous tracking trajectories arise easily.
The second category has been proposed only in recent years: after the fingertip position in the video scene is detected by various techniques, continuous fingertip tracking is achieved by means such as velocity and position prediction or particle filtering. However, this technology is not yet mature; most algorithms give mediocre tracking results, and many can be applied only in simple scenes.
Therefore, at the present stage, capturing the video scene with an ordinary camera and achieving continuous fingertip tracking in a complex environment offers convenience of use, ease of popularization, and good robustness, and is the trend in research on finger-based human-computer interaction systems.
Disclosure of Invention
The invention aims to overcome the deficiencies of existing fingertip detection and tracking technology by providing a fingertip tracking method based on bidirectional optical flow and perceptual hash.
To achieve this purpose, the invention is realized by the following technical scheme: a fingertip tracking method based on bidirectional optical flow and perceptual hash, used for continuously tracking the fingertips of a hand, characterized in that the method comprises the following three steps:
First step: capture scene information, and obtain a scene image containing the hand region by calculating the dense optical flow information corresponding to the scene information and performing binarization preprocessing; then construct a skin color filter and segment the scene image containing the hand region to obtain the hand region;
Second step: sample the contour of the hand region, and perform vector dot product and cross product calculations on each contour point to remove non-fingertip contour points and obtain candidate fingertip points; set a threshold according to the geometric characteristics of the fingertip, count the removed non-fingertip contour points near each candidate fingertip point, and compare the count with the threshold to detect the position of the fingertip point and obtain the fingertip contour points; if no fingertip point is detected, return to the first step for re-initialization;
Third step: process the fingertip contour points with a bidirectional pyramid optical flow algorithm to obtain bidirectionally matched fingertip contour points and estimate the fingertip motion region; within the estimated motion region, apply a search window strategy, perform the discrete cosine transform on each window in turn, extract a neighborhood and compute the corresponding matrix of low-frequency discrete cosine transform coefficients, generate a 64-bit perceptual hash sequence with a perceptual hash algorithm, and match it against the fingertip hash sequence template, the best-matching region being the fingertip region; finally, set a matching-distance threshold to decide whether to update the fingertip hash sequence template with the 64-bit perceptual hash sequence, and set a threshold on the number of matched fingertip contour points to decide whether the fingertip region is tracked correctly: if so, proceed to the next round of fingertip tracking to achieve continuous tracking; otherwise, return to the first step.
In the first step, capturing the scene information, obtaining the scene image containing the hand region through dense optical flow calculation and binarization preprocessing, and then constructing a skin color filter to segment the hand region comprises the following steps:
(1.1) capturing scene information and calculating the dense optical flow information corresponding to the scene information;
(1.2) traversing the optical flow information to find the maximum optical flow value, and normalizing all optical flows in the X-axis and Y-axis directions by this maximum value;
(1.3) calculating, from the normalized optical flow, the hue and saturation of the regions that change along the optical flow direction, and marking them with different color values;
(1.4) setting a threshold, converting the color-marked optical flow change regions into a binary image according to the threshold, and combining logical operations and mathematical morphology operations to obtain a scene image containing the hand region;
(1.5) constructing a YCbCr skin color filter according to the clustering characteristics of human skin color, and removing redundant color and brightness information;
(1.6) after filtering with the YCbCr skin color filter, sorting all contours of the scene image containing the hand region in descending order and taking the largest connected region as the hand region, thereby segmenting the hand region from the scene image.
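Steps (1.2) to (1.4) can be illustrated with a minimal sketch in plain Python. It assumes the dense flow field is already available as a 2-D list of (dx, dy) tuples; the function name `flow_to_motion_mask`, the mapping of direction to hue and magnitude to saturation, and the rule of thresholding the saturation are illustrative assumptions, not details stated in the patent.

```python
import math

def flow_to_motion_mask(flow, threshold=0.2):
    """flow is a 2-D list of (dx, dy) tuples; returns (hue, saturation, mask)."""
    # (1.2) Find the maximum flow magnitude and use it to normalise every vector.
    max_mag = max(math.hypot(dx, dy) for row in flow for dx, dy in row)
    if max_mag == 0.0:
        max_mag = 1.0  # no motion at all; avoid dividing by zero
    hue, sat, mask = [], [], []
    for row in flow:
        hue_row, sat_row, mask_row = [], [], []
        for dx, dy in row:
            nx, ny = dx / max_mag, dy / max_mag
            # (1.3) Assumed colour coding: direction -> hue in degrees,
            # normalised magnitude -> saturation.
            hue_row.append(math.degrees(math.atan2(ny, nx)) % 360.0)
            s = min(1.0, math.hypot(nx, ny))
            sat_row.append(s)
            # (1.4) Assumed binarisation rule: a pixel is "moving" when its
            # saturation reaches the threshold.
            mask_row.append(1 if s >= threshold else 0)
        hue.append(hue_row)
        sat.append(sat_row)
        mask.append(mask_row)
    return hue, sat, mask
```

In practice the binary mask would still need the logical and morphological clean-up mentioned in step (1.4), which this sketch leaves out.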
In the second step, sampling the hand-region contour, removing non-fingertip contour points by vector dot product and cross product calculations to obtain candidate fingertip points, and detecting the fingertip points by comparison with a threshold set according to the geometric characteristics of the fingertip comprises the following steps:
(2.1) sampling the contour of the hand region and calculating the vector dot product of each contour point, the vector dot product being the K-curvature;
(2.2) setting thresholds $T_1$ and $T_2$, performing the vector cross product calculation on the contour points whose K-curvature lies between $T_1$ and $T_2$, and removing non-fingertip contour points according to the result to obtain candidate fingertip points;
(2.3) setting a threshold $T_0$ according to the geometric characteristics of the fingertip, counting the consecutively distributed contour points near each candidate fingertip point that do not satisfy the vector cross product condition, and comparing the count with $T_0$ to detect the position of the fingertip point and obtain the fingertip contour points; if no fingertip point is detected, returning to the first step for re-initialization.
In step (2.1), calculating the vector dot product of each contour point means calculating the absolute cosine value at each contour point by formula (1):
$$|\cos\theta| = \frac{|V_1 \cdot V_2|}{|V_1|\,|V_2|} \quad (1)$$
where $V_1$ is the vector formed by a point $P_0$ on the contour and the K-th point $P_1$ before it, $V_2$ is the vector formed by $P_0$ and the K-th point $P_2$ after it, and $\theta$ is the angle between $V_1$ and $V_2$.
In step (2.2), performing the vector cross product calculation on the contour points whose K-curvature lies between $T_1$ and $T_2$ and removing non-fingertip contour points to obtain the candidate fingertip points means: if the absolute cosine value of a contour point satisfies formula (2), that is, lies between the thresholds $T_1$ and $T_2$, the vector cross product at that point is calculated by formula (3); otherwise, no cross product calculation is needed for that point;
$$V = V_1 \times V_2 \quad (3)$$
If the vector cross product $V$ is greater than 0, the contour point is a candidate fingertip point; otherwise, it is judged to be a non-fingertip contour point and is removed.
In step (2.3), counting the consecutively distributed contour points near each candidate fingertip point that do not satisfy the vector cross product condition and comparing the count with the threshold $T_0$ means: counting, near each candidate fingertip point, the number $N$ of consecutively distributed contour points that do not satisfy $V > 0$; if $N > T_0$, the corresponding candidate fingertip point is a fingertip contour point and the fingertip position is detected; otherwise, returning to the first step for re-initialization.
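The candidate test of steps (2.1) and (2.2) can be sketched as follows. This is a minimal illustration, assuming a closed contour given as a list of (x, y) points; the function name `fingertip_candidates`, the choice of taking both vectors from $P_0$ outward, and the inclusive threshold comparison are assumptions. The sign of the cross product depends on the contour orientation and the image axis convention, so `cross > 0` may need to be flipped for a given contour extractor. Step (2.3) is not covered.

```python
import math

def fingertip_candidates(contour, k, t1, t2):
    """Return indices of contour points that pass the cosine and cross-product tests."""
    n = len(contour)
    candidates = []
    for i in range(n):
        p0 = contour[i]
        p1 = contour[(i - k) % n]  # K-th point before P0 (contour treated as closed)
        p2 = contour[(i + k) % n]  # K-th point after P0
        v1 = (p1[0] - p0[0], p1[1] - p0[1])
        v2 = (p2[0] - p0[0], p2[1] - p0[1])
        norm = math.hypot(*v1) * math.hypot(*v2)
        if norm == 0.0:
            continue  # degenerate: repeated points
        # Absolute cosine of the angle between V1 and V2 (the K-curvature measure).
        cos_abs = abs(v1[0] * v2[0] + v1[1] * v2[1]) / norm
        if not (t1 <= cos_abs <= t2):
            continue  # not sharp enough, or outside the accepted band
        # z-component of the 2-D cross product V1 x V2; its sign separates
        # convex tips from concave valleys under the assumed orientation.
        cross = v1[0] * v2[1] - v1[1] * v2[0]
        if cross > 0:
            candidates.append(i)
    return candidates
```

For example, on a small contour with a single spike at index 2, `fingertip_candidates(contour, 1, 0.7, 1.0)` keeps only that index, while the collinear point on the base is rejected because its cross product is zero.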
In the third step, estimating the fingertip motion region with the bidirectional pyramid optical flow algorithm, locating the fingertip region by perceptual hash matching within that region, and deciding whether to update the template and continue tracking comprises the following steps:
(3.1) performing forward optical flow calculation on the fingertip contour points detected in the second step to obtain the forward-estimated fingertip contour points corresponding to them in the next frame;
(3.2) performing backward optical flow calculation on the forward-estimated fingertip contour points to obtain the corresponding backward-estimated fingertip contour points in the current frame;
(3.3) matching the initial fingertip contour points against the backward-estimated fingertip contour points to obtain the fingertip contour points that match bidirectionally;
(3.4) estimating the motion region of the fingertip in the next frame from the bidirectionally matched fingertip contour points;
(3.5) within the estimated fingertip motion region, applying a search window strategy: performing the discrete cosine transform on each search window in turn and extracting the low-frequency components in the 8 × 8 neighborhood at the upper left corner of the transform to obtain the corresponding discrete cosine coefficient matrix;
(3.6) calculating the median of the discrete cosine coefficient matrix and comparing each point in the neighborhood with the median to generate a 64-bit perceptual hash sequence;
(3.7) matching the 64-bit perceptual hash sequence against the fingertip hash sequence template retained from the previous frame's tracking result, the best-matching region being the fingertip region; for the first tracking after initialization, the fingertip hash sequence template is set according to the initialization result;
(3.8) setting a matching-distance threshold $T_3$ to decide whether to update the fingertip hash sequence template with the 64-bit perceptual hash sequence;
(3.9) setting a threshold $T_4$ on the number of matched fingertip contour points to decide whether the fingertip region is tracked correctly: if the number of matched fingertip contour points $M < T_4$, the fingertip target has clearly been lost, and the method returns to the first step to re-initialize and re-detect the fingertip contour points; otherwise, the fingertip region is judged to be tracked correctly, and the method returns to step (3.1) for the next round of fingertip tracking, achieving continuous fingertip tracking.
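The bidirectional matching of steps (3.1) to (3.4) and the loss check of step (3.9) can be sketched independently of any particular optical flow implementation. In this minimal illustration, `forward` and `backward` are hypothetical callables standing in for the pyramid optical flow tracker (each maps a point to its estimate in the other frame, or `None` on failure); the round-trip tolerance `max_error` and the bounding-box `margin` are assumptions, since the patent does not state how the match or the region is computed.

```python
import math

def bidirectional_matches(points, forward, backward, max_error=1.0):
    """Keep (current_point, next_frame_point) pairs that survive the round trip."""
    kept = []
    for p in points:
        q = forward(p)            # (3.1) forward estimate in the next frame
        if q is None:
            continue
        r = backward(q)           # (3.2) backward estimate in the current frame
        if r is None:
            continue
        # (3.3) Assumed matching rule: the round trip must land near the start.
        if math.hypot(r[0] - p[0], r[1] - p[1]) <= max_error:
            kept.append((p, q))
    return kept

def motion_region(matches, margin=8):
    """(3.4) Assumed region estimate: padded bounding box of the matched points."""
    xs = [q[0] for _, q in matches]
    ys = [q[1] for _, q in matches]
    return (min(xs) - margin, min(ys) - margin, max(xs) + margin, max(ys) + margin)

def tracking_lost(matches, t4):
    """(3.9) The target counts as lost when fewer than T4 points match."""
    return len(matches) < t4
```

A caller would re-run the first step whenever `tracking_lost` returns true and otherwise search for the fingertip inside `motion_region(matches)`.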
In step (3.5), applying the search window strategy means: within the estimated fingertip motion region, a 16 × 16 search window is set and moved from the upper left corner to the lower right corner of the region; each 16 × 16 image block is converted to a grayscale image in turn and transformed by the discrete cosine transform, the low-frequency components in the 8 × 8 neighborhood at the upper left corner of the transform are extracted, and the corresponding discrete cosine coefficient matrix is computed.
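The hash generation for one search window, steps (3.5) and (3.6), can be sketched in plain Python. The sketch assumes the 16 × 16 block has already been converted to grayscale and is given as a square 2-D list; the orthonormal DCT-II scaling, the strict "greater than median" bit rule, and the row-major bit order are assumptions, and the function names are illustrative.

```python
import math
from statistics import median

def dct_2d(block):
    """2-D DCT-II (orthonormal scaling) of a square block given as a list of rows."""
    n = len(block)
    basis = [[math.sqrt((1 if u == 0 else 2) / n)
              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
              for x in range(n)] for u in range(n)]
    # Transform along each row first, then along each column.
    tmp = [[sum(basis[v][y] * block[x][y] for y in range(n)) for v in range(n)]
           for x in range(n)]
    return [[sum(basis[u][x] * tmp[x][v] for x in range(n)) for v in range(n)]
            for u in range(n)]

def perceptual_hash(block):
    """64-bit hash from the upper-left 8 x 8 low-frequency DCT coefficients."""
    coeffs = dct_2d(block)
    low = [coeffs[u][v] for u in range(8) for v in range(8)]
    med = median(low)
    bits = 0
    for c in low:
        # One bit per coefficient: 1 if above the median, else 0.
        bits = (bits << 1) | (1 if c > med else 0)
    return bits
```

Because every coefficient is compared with the median of the same block, uniformly scaling the brightness of a block leaves its hash unchanged, which is one reason this kind of hash tolerates lighting changes.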
In step (3.8), setting the matching-distance threshold $T_3$ to decide whether to update the fingertip hash sequence template with the 64-bit perceptual hash sequence means: calculating the Hamming distance between the 64-bit perceptual hash sequence and the fingertip hash sequence template, and comparing it with the threshold $T_3$ according to formula (4) to decide whether to update the template;
If the Hamming distance $D_{hamming}(S_{template}, S_{new}) \ge T_3$, the fingertip hash sequence template is updated with the 64-bit perceptual hash sequence; otherwise, the fingertip hash sequence template is not updated. In the invention, for the first tracking after initialization, the fingertip hash sequence template is set according to the initialization result; during subsequent continuous tracking, the template is the one carried over from the previous frame's tracking, i.e. the fingertip hash sequence template as updated in the previous frame.
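The matching and update rule can be sketched with 64-bit hashes stored as Python integers. The update condition follows the text above (update when the distance is at least $T_3$); treating the window with the smallest Hamming distance as the best match is an assumption, and the function names are illustrative.

```python
def hamming_distance(a, b):
    """Number of differing bits between two hash values."""
    return bin(a ^ b).count("1")

def best_window(template, window_hashes):
    """Index of the search window whose hash is closest to the template (assumed rule)."""
    return min(range(len(window_hashes)),
               key=lambda i: hamming_distance(template, window_hashes[i]))

def update_template(template, new_hash, t3):
    """Replace the template when the distance reaches T3, as stated in the text."""
    if hamming_distance(template, new_hash) >= t3:
        return new_hash
    return template
```

With this representation, comparing a window against the template costs one XOR and one bit count, so scanning every 16 × 16 window in the estimated motion region stays cheap.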
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. In implementation, the method captures the scene video with an ordinary camera; no special equipment is needed, and no special marks are needed on the hand. By computing dense optical flow and constructing a skin color filter, the method accurately segments the hand region from a complex scene and copes with scenes containing many skin-colored regions and background changes. Then, through an improved K-curvature method, it finds the correct position of the fingertip point and ensures that the fingertip point satisfies both the curvature characteristic and the geometric characteristic. Finally, based on the fingertip detection result, it achieves continuous fingertip tracking with bidirectional pyramid optical flow and perceptual hashing: the bidirectional pyramid optical flow analyzes the frame-to-frame correlation of the fingertip contour points and accurately predicts the motion trend and motion range of the current target, and perceptual hash matching finds the most likely target within the estimated motion range. In addition, during fingertip tracking the method introduces a strategy of updating the template and the contour matching points, achieving stable and efficient fingertip tracking.
2. The fingertip tracking method can effectively achieve continuous fingertip tracking in a complex environment, while avoiding the poor tracking results caused by discontinuous fingertip tracking trajectories.
Drawings
FIG. 1 is a block flow diagram of a fingertip tracking method of the present invention;
FIG. 2 is a flowchart of the first step, segmenting the complete hand region from a complex environment;
FIG. 3 is a flowchart of the second step, performing fingertip detection on the hand region image;
FIG. 4 is a schematic diagram of the positions of a point $P_0$ on the contour, the K-th point $P_1$ before it, and the K-th point $P_2$ after it in the second step;
FIG. 5 is a flowchart of the third step, achieving continuous fingertip tracking based on bidirectional pyramid optical flow and perceptual hashing;
FIG. 6 is a schematic diagram of the process of obtaining the bidirectionally matched fingertip contour points in the third step.
Detailed Description
The invention is described in further detail below with reference to the drawings and the detailed description.
Examples
As shown in FIGS. 1 to 6, the fingertip tracking method based on bidirectional optical flow and perceptual hash is used for continuously tracking the fingertips of a hand and comprises the following three steps:
First step: capture scene information, and obtain a scene image containing the hand region by calculating the dense optical flow information corresponding to the scene information and performing binarization preprocessing; then construct a skin color filter and segment the scene image containing the hand region to obtain the hand region;
Second step: sample the contour of the hand region, and perform vector dot product and cross product calculations on each contour point to remove non-fingertip contour points and obtain candidate fingertip points; set a threshold according to the geometric characteristics of the fingertip, count the removed non-fingertip contour points near each candidate fingertip point, and compare the count with the threshold to detect the position of the fingertip point and obtain the fingertip contour points; if no fingertip point is detected, return to the first step for re-initialization;
Third step: process the fingertip contour points with a bidirectional pyramid optical flow algorithm to obtain bidirectionally matched fingertip contour points and estimate the fingertip motion region; within the estimated motion region, apply a search window strategy, perform the discrete cosine transform on each window in turn, extract a neighborhood and compute the corresponding matrix of low-frequency discrete cosine transform coefficients, generate a 64-bit perceptual hash sequence with a perceptual hash algorithm, and match it against the fingertip hash sequence template, the best-matching region being the fingertip region; finally, set a matching-distance threshold to decide whether to update the fingertip hash sequence template with the 64-bit perceptual hash sequence, and set a threshold on the number of matched fingertip contour points to decide whether the fingertip region is tracked correctly: if so, proceed to the next round of fingertip tracking to achieve continuous tracking; otherwise, return to the first step.
In the first step, capturing the scene information, obtaining the scene image containing the hand region through dense optical flow calculation and binarization preprocessing, and then constructing a skin color filter to segment the hand region comprises the following steps:
(1.1) capturing scene information and calculating the dense optical flow information corresponding to the scene information; although dense optical flow is generally expensive to compute, in this method it is used only to obtain the approximate candidate hand region, so the optical flow is computed with a two-level pyramid and a large search window (15 × 15);
(1.2) traversing the optical flow information to find the maximum optical flow value, and normalizing all optical flows in the X-axis and Y-axis directions by this maximum value;
(1.3) calculating, from the normalized optical flow, the hue and saturation of the regions that change along the optical flow direction, and marking them with different color values;
(1.4) setting a threshold, converting the color-marked optical flow change regions into a binary image according to the threshold, and combining logical operations and mathematical morphology operations to obtain a scene image containing the hand region;
(1.5) constructing a YCbCr skin color filter according to the clustering characteristics of human skin color, and removing redundant color and brightness information; because the distribution of human skin color shows obvious clustering, and based on a large number of experimental results, the RGB image is converted into a YCbCr image, and a long, narrow color band (Cb ∈ [100, 127], Cr ∈ [128, 170]) is selected as the skin color model to construct the YCbCr skin color filter;
(1.6) after filtering with the YCbCr skin color filter, sorting all contours of the scene image containing the hand region in descending order and taking the largest connected region as the hand region, thereby segmenting the hand region from the scene image.
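The skin color filter of step (1.5) can be sketched with the Cb and Cr ranges given above. The patent does not say which RGB to YCbCr conversion it uses; the sketch assumes the full-range BT.601 conversion used by JPEG/JFIF, and the function names are illustrative. The luma component Y is not computed because the filter ignores brightness.

```python
def rgb_to_cbcr(r, g, b):
    """Chroma components under the assumed full-range BT.601 (JFIF) conversion."""
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr

def is_skin(r, g, b, cb_range=(100, 127), cr_range=(128, 170)):
    """True when the pixel falls inside the narrow Cb/Cr band from step (1.5)."""
    cb, cr = rgb_to_cbcr(r, g, b)
    return cb_range[0] <= cb <= cb_range[1] and cr_range[0] <= cr <= cr_range[1]

def skin_mask(image):
    """image is a 2-D list of (r, g, b) tuples; returns a 0/1 mask of the same shape."""
    return [[1 if is_skin(*px) else 0 for px in row] for row in image]
```

The resulting mask would then be combined with the motion mask from the optical flow stage before the largest connected region is selected in step (1.6).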
In the second step, sampling the hand-region contour, removing non-fingertip contour points by vector dot product and cross product calculations to obtain candidate fingertip points, and detecting the fingertip points by comparison with a threshold set according to the geometric characteristics of the fingertip comprises the following steps:
(2.1) sampling the contour of the hand region and calculating the vector dot product of each contour point, the vector dot product being the K-curvature;
(2.2) setting thresholds $T_1$ and $T_2$, performing the vector cross product calculation on the contour points whose K-curvature lies between $T_1$ and $T_2$, and removing non-fingertip contour points according to the result to obtain candidate fingertip points;
(2.3) setting a threshold $T_0$ according to the geometric characteristics of the fingertip, counting the consecutively distributed contour points near each candidate fingertip point that do not satisfy the vector cross product condition, and comparing the count with $T_0$ to detect the position of the fingertip point and obtain the fingertip contour points; if no fingertip point is detected, returning to the first step for re-initialization.
In step (2.1), calculating the vector dot product of each contour point means calculating the absolute cosine value at each contour point by formula (1):
$$|\cos\theta| = \frac{|V_1 \cdot V_2|}{|V_1|\,|V_2|} \quad (1)$$
where $V_1$ is the vector formed by a point $P_0$ on the contour and the K-th point $P_1$ before it, $V_2$ is the vector formed by $P_0$ and the K-th point $P_2$ after it, and $\theta$ is the angle between $V_1$ and $V_2$.
In step (2.2), performing the vector cross product calculation on the contour points whose K-curvature lies between $T_1$ and $T_2$ and removing non-fingertip contour points to obtain the candidate fingertip points means: if the absolute cosine value of a contour point satisfies formula (2), that is, lies between the thresholds $T_1$ and $T_2$, the vector cross product at that point is calculated by formula (3); otherwise, no cross product calculation is needed for that point;
$$V = V_1 \times V_2 \quad (3)$$
If the vector cross product $V$ is greater than 0, the contour point is a candidate fingertip point; otherwise, it is judged to be a non-fingertip contour point and is removed.
In step (2.3), counting the consecutively distributed contour points near each candidate fingertip point that do not satisfy the vector cross product condition and comparing the count with the threshold $T_0$ means: counting, near each candidate fingertip point, the number $N$ of consecutively distributed contour points that do not satisfy $V > 0$; if $N > T_0$, the corresponding candidate fingertip point is a fingertip contour point and the fingertip position is detected; otherwise, returning to the first step for re-initialization.
In the third step, calculating the fingertip contour points with a bidirectional pyramid optical flow algorithm to obtain bidirectionally matched fingertip contour points and estimate the fingertip motion area; within the estimated motion area of the fingertip, sequentially performing the discrete cosine transform under a search window strategy, extracting a neighborhood, calculating the corresponding matrix of low-frequency discrete cosine transform coefficients, generating a 64-bit perceptual hash sequence with a perceptual hash algorithm, and matching the 64-bit perceptual hash sequence against the fingertip hash sequence template, the area of maximum match being the fingertip area; and finally setting a threshold on the matching distance to judge whether the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template, and setting a threshold on the number of matched fingertip contour points to judge whether the fingertip area is correctly tracked, carrying out the next round of fingertip tracking if so (realizing continuous fingertip tracking) and returning to the first step if not, comprises the following steps:
(3.1) carrying out forward optical flow calculation on the fingertip outline points detected in the second step to obtain forward estimation fingertip outline points corresponding to the fingertip outline points in the next frame;
(3.2) carrying out reverse optical flow calculation on the forward estimation fingertip outline points to obtain corresponding reverse estimation fingertip outline points in the current frame;
(3.3) matching the initial fingertip outline points and the reverse estimation fingertip outline points to obtain the fingertip outline points capable of being subjected to bidirectional matching;
(3.4) estimating the motion area of the fingertip in the next frame by utilizing the bidirectional matching fingertip outline point;
(3.5) in the fingertip movement estimation area, adopting a search window strategy, sequentially performing discrete cosine transform in the search window area, and extracting low-frequency effective components in an 8 x 8 neighborhood at the upper left corner after the discrete cosine transform to obtain a corresponding discrete cosine coefficient matrix;
(3.6) calculating a median value of the discrete cosine coefficient matrix, and comparing each pixel point in the neighborhood with the median value to generate a 64-bit perceptual hash sequence;
(3.7) carrying out matching analysis on the 64-bit perceptual hash sequence and a fingertip hash sequence template reserved in the last frame of tracking result, wherein the maximum matching area is a fingertip area; in the tracking process after the initialization is finished, setting a fingertip hash sequence template according to an initialization result;
(3.8) setting a threshold $T_3$ on the matching distance to judge whether to update the 64-bit perceptual hash sequence into the fingertip hash sequence template;
(3.9) setting a threshold $T_4$ on the number of matched fingertip contour points to judge whether the fingertip area is correctly tracked: if the number of matched fingertip contour points M < $T_4$, the fingertip target has clearly been lost, and the method returns to the first step to reinitialize and detect the fingertip contour points again; otherwise, the fingertip area is judged to be correctly tracked, and the method returns to step (3.1) for the next round of fingertip tracking, realizing continuous fingertip tracking.
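Steps (3.3), (3.4) and (3.9) can be illustrated with a short Python sketch that takes the forward and reverse optical flow results as given. The optical flow itself would come from a pyramid optical flow routine (for example a pyramidal Lucas-Kanade implementation; this choice is an assumption, since the text does not name one). The matching tolerance `max_error`, the bounding-box form of the motion area, and the `margin` parameter are likewise hypothetical.

```python
import math

def bidirectional_matches(initial, backward, max_error):
    """Indices of points whose reverse estimate lands within max_error pixels
    of the initial point (the tolerance is an assumption)."""
    return [i for i, (p, q) in enumerate(zip(initial, backward))
            if math.dist(p, q) <= max_error]

def motion_region(forward, matched, margin):
    """Axis-aligned box (x_min, y_min, x_max, y_max) around the forward
    estimates of the matched points, enlarged by margin (assumed form)."""
    xs = [forward[i][0] for i in matched]
    ys = [forward[i][1] for i in matched]
    return (min(xs) - margin, min(ys) - margin,
            max(xs) + margin, max(ys) + margin)

def tracking_ok(matched, t4):
    """Step (3.9): tracking is judged lost when M < T4."""
    return len(matched) >= t4

initial = [(10, 10), (20, 10), (30, 10)]       # fingertip contour points, current frame
forward = [(12, 11), (22, 11), (40, 30)]       # forward estimates, next frame
backward = [(10.2, 10.1), (20, 10), (35, 18)]  # reverse estimates, current frame

matched = bidirectional_matches(initial, backward, max_error=1.0)
region = motion_region(forward, matched, margin=2)
```

The third point drifts far from its starting position after the round trip, so it is discarded before the motion area is estimated.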
In step (3.5), adopting a search window strategy in the fingertip motion estimation area, sequentially performing the discrete cosine transform in the search window area, and extracting the low-frequency effective components in the 8 × 8 neighborhood at the upper left corner after the discrete cosine transform to obtain the corresponding discrete cosine coefficient matrix means: in the estimated motion area of the fingertip, a 16 × 16 search window is set and moved from the upper left corner to the lower right corner of the motion estimation area; each 16 × 16 image block is converted in turn into a grayscale image and the discrete cosine transform is applied to it; the low-frequency effective components in the 8 × 8 neighborhood at the upper left corner of the transformed block are then extracted, giving the corresponding discrete cosine coefficient matrix.
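Steps (3.5) and (3.6) can be sketched together in plain Python for a single 16 × 16 grayscale block. The orthonormal DCT-II scaling, the strict "greater than median" comparison, and the row-major bit order are assumptions; the text fixes only the block size, the 8 × 8 low-frequency corner, and the comparison against the median.

```python
import math
from statistics import median

def dct_1d(values):
    """Orthonormal 1-D DCT-II."""
    n = len(values)
    out = []
    for u in range(n):
        s = sum(values[x] * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
                for x in range(n))
        scale = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def dct_2d(block):
    """Separable 2-D DCT: transform the rows, then the columns."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]   # transpose back to [u][v]

def perceptual_hash(gray_block):
    """64-bit hash from the top-left 8 x 8 DCT coefficients of a 16 x 16 block."""
    coeffs = dct_2d(gray_block)
    low = [coeffs[u][v] for u in range(8) for v in range(8)]
    med = median(low)
    return [1 if c > med else 0 for c in low]  # strict '>' is an assumption

# Deterministic synthetic 16 x 16 grayscale block (illustrative data only).
block = [[(37 * x + 91 * y + 13 * x * y * y) % 256 for x in range(16)]
         for y in range(16)]
bits = perceptual_hash(block)
```

Running the same function on every 16 × 16 window of the estimated motion area yields one 64-bit sequence per window, which is what step (3.7) compares against the template.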
In step (3.8), setting the threshold $T_3$ on the matching distance to judge whether to update the 64-bit perceptual hash sequence into the fingertip hash sequence template means: calculating the Hamming distance between the 64-bit perceptual hash sequence and the fingertip hash sequence template through formula (4), and comparing the Hamming distance with the threshold $T_3$ to judge whether to update the 64-bit perceptual hash sequence into the fingertip hash sequence template;
if the Hamming distance $D_{hamming}(S_{template}, S_{new}) \geq T_3$, the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template; otherwise, the fingertip hash sequence template is not updated. In the invention, for the first tracking after initialization is completed, the fingertip hash sequence template is set according to the initialization result; in the subsequent continuous tracking process, the fingertip hash sequence template is the one retained from the tracking of the previous frame, that is, the template as updated during the previous frame.
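The update rule of step (3.8) can be sketched as follows. Formula (4) is not reproduced in the text above, so the sketch assumes the standard definition of the Hamming distance (the number of bit positions in which two equal-length sequences differ); the function names are hypothetical.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length bit sequences differ
    (standard definition, assumed to correspond to formula (4))."""
    if len(a) != len(b):
        raise ValueError("hash sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))

def maybe_update_template(template, new_hash, t3):
    """Return the template to carry into the next frame: the new hash when
    D_hamming(template, new_hash) >= T3, as stated in the text, otherwise
    the unchanged template."""
    if hamming_distance(template, new_hash) >= t3:
        return new_hash
    return template

template = [0] * 64
new_hash = [1] * 10 + [0] * 54
distance = hamming_distance(template, new_hash)
updated = maybe_update_template(template, new_hash, t3=5)
kept = maybe_update_template(template, new_hash, t3=20)
```

With a distance of 10, the template is replaced when $T_3 = 5$ and kept when $T_3 = 20$.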
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (9)

1. A fingertip tracking method based on bidirectional optical flow and perceptual hash is used for continuously tracking fingertips of hands; the method is characterized in that: the method comprises the following three steps:
in the first step, scene information is captured, and a scene image containing a hand region is obtained by calculating the dense optical flow information corresponding to the scene information and performing binarization preprocessing; a skin color filter is then constructed, and the scene image containing the hand region is segmented to obtain the hand region;
secondly, sampling the contour of the hand region, and performing vector dot product and cross product calculation on each contour point to remove non-fingertip contour points to obtain candidate fingertip points; setting a threshold value according to the geometric characteristics of the fingertips, counting the number of the non-fingertip contour points which are removed near the candidate fingertip points and comparing the number with the threshold value to detect the positions of the fingertip points and obtain the fingertip contour points; if the fingertip point is not detected, returning to the first step for reinitialization;
thirdly, calculating the fingertip contour points by using a bidirectional pyramid optical flow algorithm to obtain bidirectional matched fingertip contour points to estimate a fingertip movement area; in an estimated motion area of a fingertip, discrete cosine transform is sequentially carried out by adopting a search window strategy, a neighborhood is extracted, a corresponding discrete cosine transform low-frequency coefficient matrix is calculated, a 64-bit perceptual hash sequence is generated by adopting a perceptual hash algorithm, the 64-bit perceptual hash sequence is matched with a fingertip hash sequence template, and then the maximum matching area is the fingertip area; and finally, setting a threshold value of the matching distance to judge whether the 64-bit perceptual hash sequence is updated into a fingertip hash sequence template, setting a threshold value of the fingertip outline matching point number to judge whether the fingertip area is correctly tracked, if so, carrying out next round of fingertip tracking to realize continuous fingertip tracking, and if not, returning to the first step.
2. The bi-directional optical flow and perceptual hashing based fingertip tracking method according to claim 1, wherein: in the first step, capturing scene information, obtaining a scene image containing a hand region by calculating the dense optical flow information corresponding to the scene information and performing binarization preprocessing, and then constructing a skin color filter and segmenting the scene image containing the hand region to obtain the hand region, comprises the following steps:
(1.1) capturing scene information and calculating dense optical flow information corresponding to the scene information;
(1.2) traversing the optical flow information, searching for the maximum optical flow value, and normalizing all optical flows in the X-axis and Y-axis directions by the maximum optical flow value;
(1.3) calculating, from the normalized optical flow, the hue and saturation of the areas that change along the direction of the optical flow, and marking them with different color values;
(1.4) setting a threshold, converting the optical flow change area of the color mark into a binary image according to the threshold, and combining logical operation and mathematical morphology operation to obtain a scene image containing a hand area;
(1.5) constructing a YCbCr skin color filter according to the human body skin color clustering characteristics, and removing redundant color and brightness information;
(1.6) after filtering with the YCbCr skin color filter, sorting all contours of the scene image containing the hand region in descending order and taking the largest connected region as the hand region, thereby segmenting the hand region from the scene image containing the hand region.
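The skin color filtering of step (1.5) can be illustrated with a minimal Python sketch. The text gives no conversion coefficients or threshold values, so the sketch assumes the full-range ITU-R BT.601 RGB-to-YCbCr conversion and a commonly used chrominance range for skin (Cb in [77, 127], Cr in [133, 173]); both are assumptions, as are the function names. Ignoring the Y component corresponds to discarding the brightness information.

```python
# Assumed chrominance ranges for skin; not values taken from the text.
CB_RANGE = (77.0, 127.0)
CR_RANGE = (133.0, 173.0)

def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 conversion for 8-bit RGB values."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def is_skin(r, g, b):
    """Classify a pixel by chrominance only; the luma Y is ignored."""
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    return (CB_RANGE[0] <= cb <= CB_RANGE[1]
            and CR_RANGE[0] <= cr <= CR_RANGE[1])

def skin_mask(image):
    """Binary mask (1 = skin) for an image given as rows of (r, g, b) tuples."""
    return [[1 if is_skin(*pixel) else 0 for pixel in row] for row in image]

mask = skin_mask([[(200, 150, 120), (0, 0, 255), (255, 255, 255)]])
```

The resulting binary mask would then be passed to the contour sorting of step (1.6) to find the largest connected region.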
3. The bi-directional optical flow and perceptual hashing based fingertip tracking method according to claim 1, wherein: in the second step, sampling the contour of the hand region, performing vector dot product and cross product calculations on each contour point to remove non-fingertip contour points and obtain candidate fingertip points, setting a threshold according to the geometric characteristics of the fingertip, counting the number of removed non-fingertip contour points near the candidate fingertip points and comparing it with the threshold to detect the position of the fingertip point and obtain the fingertip contour points, and returning to the first step for reinitialization if no fingertip point is detected, comprises the following steps:
(2.1) sampling the contour of the hand region, and calculating the vector dot product of each contour point, wherein the vector dot product is K curvature;
(2.2) setting thresholds $T_1$ and $T_2$, performing the vector cross product calculation on the contour points whose K-curvature lies between the thresholds $T_1$ and $T_2$, and removing non-fingertip contour points according to the calculation result to obtain candidate fingertip points;
(2.3) setting a threshold $T_0$ according to the geometric characteristics of the fingertip, counting the number of consecutively distributed contour points near each candidate fingertip point that do not satisfy the vector cross product result, and comparing that number with the threshold $T_0$ to detect the position of the fingertip point and obtain the fingertip contour points; if no fingertip point is detected, returning to the first step for reinitialization.
4. The bi-directional optical flow and perceptual hashing based fingertip tracking method according to claim 3, wherein: in step (2.1), calculating the vector dot product of each contour point means: calculating the absolute value of the cosine at each contour point using formula (1):
$$|\cos\theta| = \frac{|V_1 \cdot V_2|}{\|V_1\|\,\|V_2\|} \tag{1}$$
where $V_1$ is the vector formed by a point $P_0$ on the contour and the K-th point $P_1$ before it, and $V_2$ is the vector formed by the point $P_0$ and the K-th point $P_2$ after it.
5. The bi-directional optical flow and perceptual hashing based fingertip tracking method according to claim 4, wherein: in step (2.2), performing the vector cross product calculation on the contour points whose K-curvature lies between the thresholds $T_1$ and $T_2$, and removing non-fingertip contour points according to the result to obtain candidate fingertip points, means: if the cosine absolute value of a contour point satisfies formula (2), that is, it lies between $T_1$ and $T_2$, the vector cross product at that point is calculated by formula (3); otherwise, no vector cross product needs to be calculated for that point;
$$V = V_1 \times V_2 \tag{3}$$
if the vector cross product V is greater than 0, the contour point is a candidate fingertip point; otherwise, it is judged to be a non-fingertip contour point and is removed.
6. The bi-directional optical flow and perceptual hashing based fingertip tracking method of claim 5, wherein: in step (2.3), counting the number of consecutively distributed contour points near the candidate fingertip points that do not satisfy the vector cross product result, comparing it with the threshold $T_0$, and detecting the position of the fingertip point to obtain the fingertip contour points means: for each candidate fingertip point, counting the number N of consecutively distributed nearby contour points that do not satisfy V > 0; if N > $T_0$, the corresponding candidate fingertip point is a fingertip contour point and the position of the fingertip contour point is detected; otherwise, returning to the first step for reinitialization.
7. The bi-directional optical flow and perceptual hashing based fingertip tracking method according to claim 1, wherein: in the third step, calculating the fingertip contour points with a bidirectional pyramid optical flow algorithm to obtain bidirectionally matched fingertip contour points and estimate the fingertip motion area; within the estimated motion area of the fingertip, sequentially performing the discrete cosine transform under a search window strategy, extracting a neighborhood, calculating the corresponding matrix of low-frequency discrete cosine transform coefficients, generating a 64-bit perceptual hash sequence with a perceptual hash algorithm, and matching the 64-bit perceptual hash sequence against the fingertip hash sequence template, the area of maximum match being the fingertip area; and finally setting a threshold on the matching distance to judge whether the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template, and setting a threshold on the number of matched fingertip contour points to judge whether the fingertip area is correctly tracked, carrying out the next round of fingertip tracking if so (realizing continuous fingertip tracking) and returning to the first step if not, comprises the following steps:
(3.1) performing forward optical flow calculation on the fingertip outline points detected in the second step to obtain forward estimation fingertip outline points corresponding to the fingertip outline points in the next frame;
(3.2) carrying out reverse optical flow calculation on the forward estimation fingertip outline points to obtain corresponding reverse estimation fingertip outline points in the current frame;
(3.3) matching the initial fingertip outline points and the reverse estimation fingertip outline points to obtain the fingertip outline points capable of being subjected to bidirectional matching;
(3.4) estimating the motion area of the fingertip in the next frame by utilizing the bidirectional matching fingertip outline point;
(3.5) in the fingertip movement estimation area, adopting a search window strategy, sequentially performing discrete cosine transform in the search window area, and extracting low-frequency effective components in an 8 x 8 neighborhood at the upper left corner after the discrete cosine transform to obtain a corresponding discrete cosine coefficient matrix;
(3.6) calculating a median value of the discrete cosine coefficient matrix, and comparing each pixel point in the neighborhood with the median value to generate a 64-bit perceptual hash sequence;
(3.7) carrying out matching analysis on the 64-bit perceptual hash sequence and a fingertip hash sequence template reserved in the last frame of tracking result, wherein the maximum matching area is a fingertip area; in the tracking process after the initialization is finished, setting a fingertip hash sequence template according to an initialization result;
(3.8) setting a threshold $T_3$ on the matching distance to judge whether to update the 64-bit perceptual hash sequence into the fingertip hash sequence template;
(3.9) setting a threshold $T_4$ on the number of matched fingertip contour points to judge whether the fingertip area is correctly tracked: if the number of matched fingertip contour points M < $T_4$, the fingertip target has clearly been lost, and the method returns to the first step to reinitialize and detect the fingertip contour points again; otherwise, the fingertip area is judged to be correctly tracked, and the method returns to step (3.1) for the next round of fingertip tracking, realizing continuous fingertip tracking.
8. The bi-directional optical flow and perceptual hashing based fingertip tracking method of claim 7, wherein: in step (3.5), adopting a search window strategy in the fingertip motion estimation area, sequentially performing the discrete cosine transform in the search window area, and extracting the low-frequency effective components in the 8 × 8 neighborhood at the upper left corner after the discrete cosine transform to obtain the corresponding discrete cosine coefficient matrix means: in the estimated motion area of the fingertip, a 16 × 16 search window is set and moved from the upper left corner to the lower right corner of the motion estimation area; each 16 × 16 image block is converted in turn into a grayscale image and the discrete cosine transform is applied to it; the low-frequency effective components in the 8 × 8 neighborhood at the upper left corner of the transformed block are then extracted, giving the corresponding discrete cosine coefficient matrix.
9. The bi-directional optical flow and perceptual hashing based fingertip tracking method of claim 7, wherein: in step (3.8), setting the threshold $T_3$ on the matching distance to judge whether to update the 64-bit perceptual hash sequence into the fingertip hash sequence template means: calculating the Hamming distance between the 64-bit perceptual hash sequence and the fingertip hash sequence template through formula (4), and comparing the Hamming distance with the threshold $T_3$ to judge whether to update the 64-bit perceptual hash sequence into the fingertip hash sequence template;
if the Hamming distance $D_{hamming}(S_{template}, S_{new}) \geq T_3$, the 64-bit perceptual hash sequence is updated into the fingertip hash sequence template; otherwise, the fingertip hash sequence template is not updated.
CN201510646203.2A 2015-09-30 2015-09-30 Finger tip tracking based on two-way light stream and perception Hash Active CN105261038B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510646203.2A CN105261038B (en) 2015-09-30 2015-09-30 Finger tip tracking based on two-way light stream and perception Hash

Publications (2)

Publication Number Publication Date
CN105261038A CN105261038A (en) 2016-01-20
CN105261038B true CN105261038B (en) 2018-02-27

Family

ID=55100709

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510646203.2A Active CN105261038B (en) 2015-09-30 2015-09-30 Finger tip tracking based on two-way light stream and perception Hash

Country Status (1)

Country Link
CN (1) CN105261038B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105825201A (en) * 2016-03-31 2016-08-03 武汉理工大学 Moving object tracking method in video monitoring
CN106327486B (en) * 2016-08-16 2018-12-28 广州视源电子科技股份有限公司 Track the method and device thereof of the finger web position
CN106408579B (en) * 2016-10-25 2019-01-29 华南理工大学 A kind of kneading finger tip tracking based on video
US10931969B2 (en) * 2017-01-04 2021-02-23 Qualcomm Incorporated Motion vector reconstructions for bi-directional optical flow (BIO)
CN113947683B (en) * 2021-10-15 2022-07-08 兰州交通大学 Fingertip point detection method and system and fingertip point motion track identification method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102270348A (en) * 2011-08-23 2011-12-07 中国科学院自动化研究所 Method for tracking deformable hand gesture based on video streaming
CN102402289A (en) * 2011-11-22 2012-04-04 华南理工大学 Mouse recognition method for gesture based on machine vision
CN104599288A (en) * 2013-10-31 2015-05-06 展讯通信(天津)有限公司 Skin color template based feature tracking method and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI382352B (en) * 2008-10-23 2013-01-11 Univ Tatung Video based handwritten character input device and method thereof
RU2014108820A (en) * 2014-03-06 2015-09-20 ЭлЭсАй Корпорейшн IMAGE PROCESSOR CONTAINING A SYSTEM FOR RECOGNITION OF GESTURES WITH FUNCTIONAL FEATURES FOR DETECTING AND TRACKING FINGERS

Also Published As

Publication number Publication date
CN105261038A (en) 2016-01-20

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant