CN108682021A - Rapid hand tracking, device, terminal and storage medium - Google Patents
- Publication number
- CN108682021A (application CN201810349972.XA)
- Authority
- CN
- China
- Prior art keywords
- region
- calibration frame
- hand
- calibration
- video
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/017—Gesture based interaction, e.g. based on a set of recognized hand gestures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/60—Analysis of geometric attributes
- G06T7/66—Analysis of geometric attributes of image moments or centre of gravity
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
- G06V20/42—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/28—Recognition of hand or arm movements, e.g. recognition of deaf sign language
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20092—Interactive image processing based on input by user
- G06T2207/20104—Interactive definition of region of interest [ROI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
Abstract
A rapid hand tracking method includes: displaying, on a display interface, a video containing a human hand region captured by an imaging device; receiving a calibration frame that a user marks on the video containing the human hand region; extracting the histogram of oriented gradients (HOG) feature of the region marked by the calibration frame, and segmenting that region according to the HOG feature to obtain a hand image; and tracking the hand image with a continuously adaptive mean-shift (CamShift) operator. The present invention also provides a rapid hand tracking device, a terminal and a storage medium. The present invention can quickly extract the HOG features inside the marked calibration frame and accurately segment the hand region according to those features, yielding a good tracking effect.
Description
Technical field
The present invention relates to the field of hand tracking techniques, and in particular to a rapid hand tracking method, device, terminal and storage medium.
Background technology
Gestures are an important means of natural interaction, with significant research value and broad application prospects. The first and most important step of gesture recognition and hand tracking is to segment the hand region from the image. The quality of this segmentation directly affects the results of subsequent gesture recognition and gesture tracking.
During human-robot interaction, when the video capture device installed on the robot is some distance from the human body, the captured frame contains the whole body. Because such a frame contains a large amount of background and the hand occupies only a small part of the picture, how to detect the hand among extensive background and segment it quickly and accurately is a problem worth studying.
Invention content
In view of the foregoing, it is necessary to propose a rapid hand tracking method, device, terminal and storage medium that can shorten the time needed to extract the hand region and improve the accuracy and efficiency of hand recognition and hand tracking, with particularly good tracking efficiency for hand regions under complex backgrounds.
The first aspect of the present invention provides a rapid hand tracking method, the method including:
displaying, on a display interface, a video containing a human hand region captured by an imaging device;
receiving a calibration frame that a user marks on the video containing the human hand region;
extracting the histogram of oriented gradients feature of the region marked by the calibration frame, and segmenting that region according to the histogram of oriented gradients feature to obtain a hand image; and
tracking the hand image with a continuously adaptive mean-shift operator, where the tracking with the continuously adaptive mean-shift operator specifically includes:
converting the color space of the hand image to the HSV color space and isolating the hue component of the hand image; based on the hue-component hand image I(i, j) and the initialized centroid position and size of the search window, computing the centroid position (M10/M00, M01/M00) of the current search window and the current search-window size s = 2·sqrt(M00/256), where M10 = Σi Σj i·I(i, j) and M01 = Σi Σj j·I(i, j) are the first-order moments of the current search window, M00 = Σi Σj I(i, j) is the zeroth-order moment of the current search window, i is the horizontal pixel coordinate of I(i, j), and j is the vertical pixel coordinate of I(i, j).
In a preferred implementation, displaying on the display interface the video containing the human hand region captured by the imaging device further includes:
displaying a preset standard calibration frame in a preset display mode, the preset display mode including one or more of the following:
displaying the preset standard calibration frame when a display instruction is received;
hiding the preset standard calibration frame when a hide instruction is received;
displaying the preset standard calibration frame when the display instruction is received, and hiding the preset standard calibration frame automatically when no further instruction is received for longer than a preset time period.
In a preferred implementation, receiving the calibration frame that the user marks on the video containing the human hand region includes:
receiving a standard calibration frame that the user marks on the video containing the human hand region, including:
receiving a rough calibration frame drawn by the user on the video containing the human hand region;
matching, by a fuzzy-matching method, the preset standard calibration frame corresponding to the rough calibration frame; and
marking the video containing the human hand region according to the matched standard calibration frame and displaying the marked standard calibration frame, where the geometric center of the rough calibration frame is identical to the geometric center of the matched standard calibration frame.
In a preferred implementation, receiving the calibration frame that the user marks on the video containing the human hand region further includes:
receiving a standard calibration frame that the user marks on the video containing the human hand region, including:
directly receiving a standard calibration frame chosen by the user, marking the video containing the human hand region according to that standard calibration frame, and displaying the marked standard calibration frame.
In a preferred implementation, receiving the standard calibration frame that the user marks on the video containing the human hand region further includes: enlarging, shrinking, moving or deleting the displayed standard calibration frame when an enlarge, shrink, move or delete instruction is received.
In a preferred implementation, the method further includes:
preprocessing the region marked by the standard calibration frame, the preprocessing including one or more of the following: grayscale conversion and correction processing.
In a preferred implementation, the method further includes:
obtaining the depth information of the portion of the video containing the human hand region that corresponds to the region marked by the calibration frame, and standardizing the hand image according to the depth information, the standardization being: S2 * (H2 / H1), where S1 is the size of the hand image segmented from the region marked by the first standard calibration frame, H1 is the depth-of-field information corresponding to that first marked region, S2 is the size of the hand image segmented from the currently marked region, and H2 is the depth-of-field information corresponding to the currently marked region.
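The depth-based rescaling can be illustrated with a small sketch (the numeric values are hypothetical; the scale factor H2/H1 follows the S2 * (H2/H1) rule stated in the text):

```python
def standardize_size(s2, h1, h2):
    """Rescale the current hand-image size s2 by the ratio of the current
    depth h2 to the depth h1 recorded at the first calibration, following
    the S2 * (H2 / H1) rule from the text."""
    return s2 * (h2 / h1)

# Hypothetical example: first calibration saw the hand at 1.0 m,
# the current frame sees it at 1.5 m, and the current crop is 120 px wide.
print(standardize_size(120, 1.0, 1.5))  # → 180.0
```

This compensates for apparent size changes as the hand moves toward or away from the depth camera, so the segmented hand image stays comparable across frames.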
The second aspect of the present invention provides a rapid hand tracking device, the device including:
a display module, configured to display on a display interface a video containing a human hand region captured by an imaging device;
a calibration module, configured to receive a calibration frame that a user marks on the video containing the human hand region;
a segmentation module, configured to extract the histogram of oriented gradients feature of the region marked by the calibration frame and to segment that region according to the histogram of oriented gradients feature to obtain a hand image; and
a tracking module, configured to track the hand image with a continuously adaptive mean-shift operator.
The third aspect of the present invention provides a terminal including a processor and a memory, the processor being configured to implement the rapid hand tracking method when executing a computer program stored in the memory.
The fourth aspect of the present invention provides a computer-readable storage medium on which a computer program is stored, the computer program implementing the rapid hand tracking method when executed by a processor.
The rapid hand tracking method, device, terminal and storage medium of the present invention first obtain a calibration frame by roughly marking the hand region, then extract the HOG features of the region marked by the calibration frame and, according to those HOG features, accurately segment the hand region out of the marked region. This reduces the area of the region from which HOG features are extracted and effectively shortens the HOG extraction time, so that hand-region segmentation and tracking can be performed rapidly. Second, obtaining the depth information of the video containing the hand further ensures the clarity of the hand contour; for hand-region tracking under complex backgrounds in particular, the gain in tracking efficiency is especially notable.
Description of the drawings
To explain the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are merely embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from them without creative effort.
Fig. 1 is the flowchart of the rapid hand tracking method provided by embodiment one of the present invention.
Fig. 2 is the flowchart of the rapid hand tracking method provided by embodiment two of the present invention.
Fig. 3 is the structure diagram of the rapid hand tracking device provided by embodiment three of the present invention.
Fig. 4 is the structure diagram of the rapid hand tracking device provided by embodiment four of the present invention.
Fig. 5 is the schematic diagram of the terminal provided by embodiment five of the present invention.
The following detailed description will further illustrate the present invention in conjunction with the above drawings.
Specific implementation mode
To make the objects, features and advantages of the present invention easier to understand, the present invention is described in detail below in conjunction with the drawings and specific embodiments. It should be noted that, in the absence of conflict, the embodiments of the present invention and the features in the embodiments can be combined with each other.
Many details are set forth in the following description to facilitate a thorough understanding of the present invention; the described embodiments are only a part of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of the present invention.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field of the present invention. The terms used in the description of the invention are intended only to describe specific embodiments and are not intended to limit the present invention.
The rapid hand tracking method of the embodiments of the present invention is applied to one or more terminals. The method can also be applied to a hardware environment composed of a terminal and a server connected to the terminal through a network. Networks include, but are not limited to: wide area networks, metropolitan area networks and local area networks. The rapid hand tracking method of the embodiments of the present invention can be executed by the server, by the terminal, or jointly by the server and the terminal.
A terminal that needs rapid hand tracking can directly integrate the rapid hand tracking function provided by the method of the present invention, or install a client for implementing the method of the present invention. As another example, the method provided by the present invention can also run on a device such as a server in the form of a Software Development Kit (SDK); an interface for the rapid hand tracking function is provided in SDK form, and a terminal or other device can realize hand tracking through the provided interface.
Embodiment one
Fig. 1 is the flowchart of the rapid hand tracking method provided by embodiment one of the present invention. Depending on different requirements, the execution order in the flowchart can change, and certain steps can be omitted.
101: Display on a display interface the video containing a human hand region captured by an imaging device.
In this embodiment, the terminal provides a display interface that simultaneously displays the video containing the human hand region captured by the imaging device. The imaging device is a 2D camera.
102: Receive the calibration frame that the user marks on the video containing the human hand region.
In this embodiment, when the user finds hand information of interest in the video containing the human hand region shown on the display interface, the user indicates the calibrated hand information of interest by adding a calibration frame on the display interface.
The user can touch the display interface with a finger, a stylus or any other suitable object, preferably touching the display interface with a finger to add a calibration frame on it.
103: Extract the histogram of oriented gradients feature of the region marked by the calibration frame, and segment that region according to the histogram of oriented gradients feature to obtain a hand image.
The detailed process of extracting the histogram of oriented gradients (HOG) feature of the region marked by the calibration frame includes:
11) computing the gradient information of every pixel in the region marked by the calibration frame, the gradient information including gradient magnitude and gradient direction;
first-derivative templates such as the one-dimensional centered template [1, 0, -1], the one-dimensional non-centered template [-1, 1], the one-dimensional cubic-corrected template [1, -8, 8, -1] or the Sobel operator may be used to compute, for each pixel of the region marked by the calibration frame, the gradients in the horizontal and vertical directions; the gradient magnitude and gradient direction of the marked region are then computed from the horizontal and vertical gradients.
In this preferred embodiment, the one-dimensional centered template [1, 0, -1] is used to compute the gradient information of every pixel of the region marked by the calibration frame. Denoting the marked region as I(x, y), the gradients of a pixel in the horizontal and vertical directions are computed as in formula (1-1):
Gh(x, y) = I(x+1, y) - I(x-1, y),  Gv(x, y) = I(x, y+1) - I(x, y-1)      (1-1)
where Gh(x, y) and Gv(x, y) denote the gradient values of the pixel (x, y) in the horizontal and vertical directions respectively.
The gradient magnitude (also called gradient intensity) and gradient direction of the pixel (x, y) are computed as in formula (1-2):
M(x, y) = sqrt(Gh(x, y)^2 + Gv(x, y)^2),  θ(x, y) = arctan(Gv(x, y) / Gh(x, y))      (1-2)
where M(x, y) and θ(x, y) denote the gradient magnitude and gradient direction of the pixel (x, y) respectively.
Further, the range of the gradient direction is restricted. An unsigned range is generally used, that is, the sign of the gradient-direction angle is ignored; the unsigned gradient direction can be expressed as in formula (1-3):
θ(x, y) = θ(x, y) + 180°  if θ(x, y) < 0°,  otherwise θ(x, y)      (1-3)
After the computation of formula (1-3), the gradient direction of every pixel of the region marked by the calibration frame is restricted to 0 to 180 degrees.
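Step 11) can be sketched in NumPy — the [1, 0, -1] gradients, magnitude and unsigned direction of formulas (1-1) to (1-3); the sample patch is illustrative only (note that `arctan2` followed by a mod-180° wrap gives the same unsigned angle as the arctan quotient form):

```python
import numpy as np

def gradient_info(region):
    """Per-pixel gradients of a grayscale region via the [1, 0, -1] template,
    plus gradient magnitude and unsigned direction in [0, 180) degrees."""
    img = region.astype(np.float64)
    gh = np.zeros_like(img)  # horizontal gradient Gh(x, y), formula (1-1)
    gv = np.zeros_like(img)  # vertical gradient Gv(x, y)
    gh[:, 1:-1] = img[:, 2:] - img[:, :-2]
    gv[1:-1, :] = img[2:, :] - img[:-2, :]
    magnitude = np.hypot(gh, gv)                # M(x, y), formula (1-2)
    direction = np.degrees(np.arctan2(gv, gh))  # signed angle
    direction = np.mod(direction, 180.0)        # unsigned, formula (1-3)
    return magnitude, direction

# Illustrative 3x3 patch: a pure horizontal intensity ramp.
patch = np.array([[0, 1, 2], [0, 1, 2], [0, 1, 2]])
mag, ang = gradient_info(patch)
print(mag[1, 1], ang[1, 1])  # → 2.0 0.0  (horizontal edge, 0-degree direction)
```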
12) dividing the region marked by the calibration frame into multiple blocks, and dividing each block into multiple cell units, each cell unit containing multiple pixels;
In this embodiment, the size of a cell unit is 8*8 pixels, and adjacent cell units do not overlap.
For example, assuming the region I(x, y) marked by the calibration frame is 64*128 in size, the size of each block is set to 16*16 and the size of each cell unit to 8*8; then the region marked by the calibration frame can be divided into 105 blocks, each block containing 4 cell units and each cell unit containing 64 pixels.
This embodiment divides cell units in a non-overlapping manner, which makes computing the gradient histogram of each block faster.
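The 105-block figure follows from sliding a 16*16 block over the 64*128 region with a stride of one 8*8 cell — the standard HOG layout; the stride is implied rather than stated in the text:

```python
def hog_geometry(width, height, cell=8, block_cells=2):
    """Number of block positions when a (block_cells * cell)-sized block
    slides over a width x height region with a stride of one cell."""
    cells_x = width // cell                    # cells across: 64 / 8 = 8
    cells_y = height // cell                   # cells down: 128 / 8 = 16
    blocks_x = cells_x - block_cells + 1       # 7 horizontal positions
    blocks_y = cells_y - block_cells + 1       # 15 vertical positions
    return blocks_x * blocks_y

print(hog_geometry(64, 128))  # → 105  (7 * 15 block positions)
```

Note that while the 8*8 cells themselves do not overlap, adjacent blocks share cells under this stride, which is what yields 105 rather than 32 blocks.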
13) quantizing the gradient information of every pixel in each cell unit to obtain the gradient histogram of the region marked by the calibration frame;
In this embodiment, the gradient directions of the pixels of each cell unit are first divided into 9 bins (9 direction channels), the 9 bins serving as the horizontal axis of the gradient histogram: [0°, 20°], [20°, 40°], [40°, 60°], [60°, 80°], [80°, 100°], [100°, 120°], [120°, 140°], [140°, 160°], [160°, 180°]; then the gradient magnitudes of the pixels falling into each bin are accumulated to form the vertical axis of the gradient histogram.
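Step 13) for a single cell can be sketched with `np.histogram`, using the magnitudes as weights (hard assignment of each pixel to one bin; many HOG implementations additionally interpolate between neighboring bins, which the text does not specify):

```python
import numpy as np

def cell_histogram(magnitude, direction, bins=9):
    """Magnitude-weighted 9-bin orientation histogram of one cell unit.
    `direction` holds unsigned angles in [0, 180)."""
    hist, _ = np.histogram(direction,
                           bins=bins,
                           range=(0.0, 180.0),
                           weights=magnitude)
    return hist

# Illustrative cell: three pixels near 30 degrees, one at 100 degrees.
mags = np.array([1.0, 2.0, 1.0, 3.0])
angs = np.array([25.0, 30.0, 35.0, 100.0])
print(cell_histogram(mags, angs))  # → [0. 4. 0. 0. 0. 3. 0. 0. 0.]
```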
14) normalizing the gradient histogram of each block to obtain each block's normalized gradient histogram;
In this preferred embodiment, a normalization function may be used to normalize the gradient histogram of each block; the normalization function can be the L2 norm or the L1 norm.
Because of local illumination changes and changes in foreground/background contrast, the gradient magnitudes of pixels vary over a very wide range. Normalization compresses illumination, shadow and edges, so that the HOG feature vector space is robust to changes in illumination, shadow and edges.
15) concatenating the normalized gradient histograms of all blocks to obtain the final HOG feature of the region marked by the calibration frame;
16) segmenting the hand region out of the region marked by the calibration frame according to the final HOG feature.
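Steps 14)-15) — block normalization and concatenation — can be sketched as follows; with the 64*128 example above this yields 105 blocks * 4 cells * 9 bins = 3780 feature values (the random histograms are stand-ins for real block data):

```python
import numpy as np

def hog_descriptor(block_histograms, eps=1e-6):
    """L2-normalize each block's histogram (step 14) and concatenate all
    blocks into the final HOG feature vector (step 15)."""
    normalized = [h / (np.linalg.norm(h) + eps) for h in block_histograms]
    return np.concatenate(normalized)

# 105 blocks, each 4 cells * 9 bins = 36 values (random stand-ins).
rng = np.random.default_rng(0)
blocks = [rng.random(36) for _ in range(105)]
feature = hog_descriptor(blocks)
print(feature.shape)  # → (3780,)
```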
104: Track the hand image with a continuously adaptive mean-shift operator.
In this embodiment, the continuously adaptive mean-shift (Continuously Adaptive Mean Shift, CamShift) algorithm is a color-based method that can track a target using its characteristic color. It automatically adjusts the size and position of the search window, locates the size and center of the tracked target, and uses the result of the previous frame (i.e., the search-window size and centroid) as the size and centroid of the target in the next frame.
Tracking the hand image with the continuously adaptive mean-shift operator specifically includes:
21) converting the color space of the hand image to the HSV (Hue, Saturation, Value) color space and isolating the hue (H) component of the hand image;
22) initializing, based on the hue-component hand image, the centroid position and size S of the search window W;
23) computing the moments of the current search window;
the zeroth-order moment of the current search window is computed according to formula (1-4), and the first-order moments according to formula (1-5):
M00 = Σi Σj I(i, j)      (1-4)
M10 = Σi Σj i·I(i, j),  M01 = Σi Σj j·I(i, j)      (1-5)
24) computing the centroid position (M10/M00, M01/M00) of the current search window from its moments;
25) computing the current search-window size from its moments, s = 2·sqrt(M00/256);
the currently computed search window is then compared with a preset search-window threshold: when the currently computed search window is greater than or equal to the preset threshold, steps 21)-25) are repeated; when the currently computed search window is smaller than the preset threshold, tracking ends, and the position of the search-window centroid is the current position of the tracked target.
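Steps 23)-25) can be sketched in NumPy on a hue-probability image — a simplified single-window update, not OpenCV's full `cv2.CamShift`; the probability image and window here are illustrative:

```python
import numpy as np

def camshift_step(prob, window):
    """One CamShift-style update: compute the moments of the search window
    over the probability image `prob`, then return the new centroid and the
    adapted window size s = 2 * sqrt(M00 / 256)."""
    x, y, w, h = window                       # top-left corner plus size
    roi = prob[y:y + h, x:x + w]
    jj, ii = np.mgrid[0:h, 0:w]               # j vertical, i horizontal
    m00 = roi.sum()                           # zeroth-order moment (1-4)
    m10 = (ii * roi).sum()                    # first-order moments (1-5)
    m01 = (jj * roi).sum()
    cx, cy = x + m10 / m00, y + m01 / m00     # centroid (M10/M00, M01/M00)
    s = 2.0 * np.sqrt(m00 / 256.0)            # adaptive window size
    return (cx, cy), s

# Illustrative probability image: a bright 10x10 blob, window partly off-center.
prob = np.zeros((64, 64))
prob[25:35, 20:30] = 255.0
(cx, cy), s = camshift_step(prob, (15, 20, 30, 30))
print(round(cx, 1), round(cy, 1))  # → 24.5 29.5  (window recenters on the blob)
```

Iterating this update until the window size drops below the preset threshold reproduces the loop described in steps 21)-25).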
In conclusion, in the rapid hand tracking method of the present invention, after the user marks the hand information of interest with a calibration frame on the video containing the human hand region, the HOG features of the region marked by the calibration frame are extracted and the hand region is segmented out of the marked region according to those HOG features. Thus only the HOG features inside the region marked by the calibration frame need to be computed; compared with computing them over the entire video image containing the human hand region, the present invention, by receiving the user-marked calibration frame, reduces the area over which HOG features are extracted, effectively shortens the HOG extraction time, and can therefore quickly segment the hand region out of the video containing the human hand region.
In addition, since the gradient information of every pixel of the region marked by the calibration frame is processed with the cell unit as the processing unit, the computed HOG features preserve the geometric and optical characteristics of the hand region. Second, computing cell units block by block allows the relationships among the pixels of the hand region to be characterized well. Finally, the normalization that is applied partially offsets the influence of illumination changes, which ensures the clarity of the extracted hand region and allows the hand region to be segmented accurately.
Embodiment two
Fig. 2 is the flowchart of the rapid hand tracking method provided by embodiment two of the present invention. Depending on different requirements, the execution order in the flowchart can change, and certain steps can be omitted.
201: Display on a display interface the video containing a human hand region captured by an imaging device, while displaying a preset standard calibration frame in a preset display mode.
In this embodiment, the terminal provides a display interface that simultaneously displays the video containing the human hand region captured by the imaging device; the display interface also displays a standard calibration frame at the same time.
The imaging device is a 3D depth camera. A 3D depth camera differs from a 2D camera in that it can simultaneously capture grayscale image information and 3D information including depth information. After the video containing the human hand region is acquired with the 3D depth camera, it is simultaneously displayed on the display interface of the terminal.
In this embodiment, the preset standard calibration frame is provided for the user to mark the displayed video containing the human hand region in order to obtain the hand information of interest.
The preset display mode includes one or more of the following:
1) displaying the preset standard calibration frame when a display instruction is received;
The display instruction corresponds to a display operation input by the user, which includes, but is not limited to: clicking any position of the display interface, touching any position of the display interface for longer than a first preset time period (for example, 1 second), or uttering a first preset voice command (for example, "calibration frame").
When it detects that the user has performed a click operation on the display interface, that the duration of a touch operation performed on the display interface exceeds the preset time, or that the user has uttered the first preset voice command, the terminal determines that a display instruction has been received and displays the preset standard calibration frame.
2) hiding the preset standard calibration frame when a hide instruction is received;
The hide instruction corresponds to a hide operation input by the user, which includes, but is not limited to: clicking any position of the display interface, touching any position of the display interface for longer than a second preset time period (for example, 2 seconds), or uttering a second preset voice command (for example, "exit").
When it detects that the user has performed a click operation on the display interface, that the duration of a touch operation performed on the display interface exceeds the second preset time, or that the user has uttered the second preset voice command, the terminal determines that a hide instruction has been received and hides the preset standard calibration frame.
The hide instruction may be identical to the display instruction or different from it. The first preset time period may or may not be identical to the second preset time period. Preferably, the first preset time period is shorter than the second: setting a shorter first preset time period allows the preset standard calibration frame to be displayed quickly, while setting a longer second preset time period prevents the preset standard calibration frame from being hidden because of an unconscious action or operation error by the user.
The pre-set standard calibration frame is shown when receiving idsplay order, enables to display interface in display institute
When stating the video comprising human hands region, user can demarcate interested hand region;It is not receiving simultaneously
When to the idsplay order, the pre-set standard calibration frame is not shown, or is received the hiding instruction and hidden institute
Pre-set standard calibration frame is stated, the video comprising human hands region of display can be avoided for a long time by described pre-
The standard calibration frame being first arranged blocks, to cause the omission of important information or check that described includes human body hand to user
Visual sense of discomfort is brought when the video in portion region.
3) the pre-set standard calibration frame is shown receiving the idsplay order, and be not received by appoint later
When the time of what instruction is more than third preset time period, the pre-set standard calibration frame is hidden automatically.
After showing the pre-set standard calibration frame, when user no longer inputs any operation and is more than the third
When preset time period, pre-set standard calibration frame is hidden automatically, can be triggered at unconscious to avoid user
Idsplay order and for a long time the case where showing pre-set standard calibration frame occur, secondly, automatically by pre-set standard
Calibration frame is hidden, it helps promotes the interactive experience of user.
In this embodiment, the preset standard calibration frame can be a circle, an ellipse, a rectangle, a square, etc.
202: Receive the standard calibration frame that the user marks on the video containing the human hand region.
In this embodiment, when the user finds hand information of interest in the video containing the human hand region shown on the display interface, the user indicates the calibrated hand information of interest by adding a standard calibration frame on the display interface.
In the present embodiment, receiving the standard calibration frame that the user calibrates on the video containing the human hand region includes the following two cases:
The first case: receiving a rough calibration frame drawn by the user on the video containing the human hand region; matching, by a fuzzy matching method, the pre-set standard calibration frame corresponding to the rough calibration frame; and, according to the matched standard calibration frame, calibrating on the video containing the human hand region and displaying the calibrated standard calibration frame, wherein the geometric center of the rough calibration frame is identical to the geometric center of the matched standard calibration frame.
In the present embodiment, the calibration frame drawn by the user with a finger on the display interface is not standard in shape; for example, a circular calibration frame drawn by the user is not very precise. Therefore, after the terminal receives the rough calibration frame drawn by the user, it matches the shape of the corresponding pre-set standard calibration frame according to the general shape of the rough calibration frame. Matching the corresponding standard calibration frame by the fuzzy matching method facilitates the subsequent cropping of the region calibrated by the frame.
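A minimal sketch of this fuzzy matching step is given below. The circularity test, its 0.15 threshold and the function name `match_standard_frame` are illustrative assumptions; the patent only requires that the rough frame's general shape select a standard frame sharing its geometric center.

```python
import numpy as np

def match_standard_frame(stroke):
    """Match a rough user-drawn stroke to a pre-set standard frame shape.

    stroke: (N, 2) array of (x, y) points sampled from the drawn curve.
    Returns (shape_name, center, size) of the snapped standard frame.
    A stroke whose distance to its geometric center varies little is
    treated as a circle; otherwise it is snapped to a rectangle given by
    the stroke's bounding box. The geometric center is preserved.
    """
    pts = np.asarray(stroke, dtype=float)
    center = pts.mean(axis=0)                     # geometric center is kept
    radii = np.linalg.norm(pts - center, axis=1)
    spread = radii.std() / (radii.mean() + 1e-9)  # relative radius variation
    if spread < 0.15:                             # nearly constant -> circle
        return "circle", center, radii.mean()
    w, h = pts.max(axis=0) - pts.min(axis=0)      # otherwise bounding box
    return "rectangle", center, (w, h)
```

A wobbly hand-drawn loop thus snaps to a standard circle centered where the user drew, which makes the subsequent cropping of the calibrated region straightforward.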
The second case: directly receiving a standard calibration frame chosen by the user, and, according to the standard calibration frame, calibrating on the video containing the human hand region and displaying the calibrated standard calibration frame.
In the present embodiment, the user inputs a display operation to trigger the display instruction, whereupon multiple pre-set standard calibration frames are shown. The user touches a standard calibration frame; after the terminal detects the touch signal on that frame, it determines that the frame is selected. The user then moves the selected standard calibration frame and drags it onto the video containing the human hand region, and the terminal displays the dragged standard calibration frame on the video containing the human hand region.
Preferably, step 202 may further include: when an enlarging, shrinking, moving or deleting instruction is received, enlarging, shrinking, moving or deleting the displayed standard calibration frame accordingly.
203: Pre-processing the region calibrated by the standard calibration frame.
In the present embodiment, the pre-processing may include one or a combination of the following: gray-scale processing and correction processing.
The gray-scale processing refers to converting the image of the region calibrated by the standard calibration frame into a gray-scale image, because color information has little influence on the extraction of the histogram-of-oriented-gradients feature; converting the image of the calibrated region into a gray-scale image therefore does not interfere with the subsequent computation of the gradient information of each pixel in the region, and also reduces the amount of computation of that gradient information.
Gamma correction may be used for the correction processing, because the exposure of the local surface contributes a large proportion of the texture intensity of an image; after Gamma correction processing, local shadows and illumination variations in the image can be effectively reduced.
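The two pre-processing operations can be sketched as follows. The standard luminance weights and the example gamma value of 0.5 are conventional choices assumed for illustration; the patent does not fix either.

```python
import numpy as np

def preprocess(region_bgr, gamma=0.5):
    """Pre-process the calibrated region: gray-scaling, then Gamma correction.

    region_bgr: (H, W, 3) uint8 image of the region marked by the frame.
    gamma < 1 lifts dark areas, reducing local shadow/illumination effects.
    """
    img = region_bgr.astype(np.float64)
    # Gray-scaling with standard luminance weights (color adds little to HOG).
    gray = 0.114 * img[..., 0] + 0.587 * img[..., 1] + 0.299 * img[..., 2]
    # Gamma correction applied to the normalized intensities.
    corrected = 255.0 * (gray / 255.0) ** gamma
    return corrected.astype(np.uint8)
```

The gray-scale step removes the color channels before gradient computation, and the power-law step compresses the intensity range so that shadowed parts of the hand contribute gradients comparable to well-lit parts.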
204: Extracting the histogram-of-oriented-gradients feature of the pre-processed region calibrated by the standard calibration frame, and segmenting the region according to the histogram-of-oriented-gradients feature to obtain a hand image.
Step 204 in the present embodiment is the same as step 103 in Embodiment 1 and is not detailed here.
205: Tracking the hand image using the continuously adaptive mean-shift algorithm.
Step 205 in the present embodiment is the same as step 104 in Embodiment 1 and is not detailed here.
Further, in order to make full use of depth information, after step 204 and before step 205, the method further includes: obtaining the depth information of the region calibrated by the calibration frame in the corresponding video containing the human hand region, and normalizing the hand image according to the depth information.
The depth information is obtained from the 3D depth camera. The specific process of normalizing the hand image according to the depth information is as follows: the size of the hand image obtained by segmenting the region calibrated by the standard calibration frame for the first time is denoted as the standard size S1, and the depth-of-field information corresponding to the region calibrated for the first time is denoted as the standard depth of field H1; the size of the hand image obtained by segmenting the region currently calibrated by the standard calibration frame is denoted as S2, and the depth-of-field information corresponding to the currently calibrated region is denoted as H2; the hand image obtained by segmenting the currently calibrated region is then normalized to S2*(H2/H1).
Normalizing the size of the hand image ensures that the finally extracted HOG feature representations have a uniform scale, i.e., the same dimensions, which improves the accuracy of the hand tracking.
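A dependency-free sketch of this depth-based normalization follows. The nearest-neighbour resampling and the function name are illustrative assumptions; the scaling factor H2/H1 follows the formula S2*(H2/H1) above.

```python
import numpy as np

def normalize_by_depth(hand_img, h_current, h_standard):
    """Rescale a segmented hand image using depth-of-field information.

    The first segmentation fixes the standard depth H1 (`h_standard`); a
    hand currently seen at depth H2 (`h_current`) is rescaled by H2/H1,
    so that hands at different distances yield images of comparable size
    and hence HOG features with the same dimensions.
    """
    scale = h_current / h_standard
    h, w = hand_img.shape[:2]
    new_h, new_w = max(1, round(h * scale)), max(1, round(w * scale))
    # Nearest-neighbour resampling keeps the sketch dependency-free.
    rows = (np.arange(new_h) * h / new_h).astype(int)
    cols = (np.arange(new_w) * w / new_w).astype(int)
    return hand_img[rows][:, cols]
```

A hand that moves to twice the standard depth appears half as large in the image; multiplying by H2/H1 = 2 restores it to a size comparable to the standard, keeping the feature dimensions uniform.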
In conclusion rapid hand tracking of the present invention, provides two kinds of standard calibration frames to described comprising people
The video of body hand region is demarcated, and it is standard calibration frame to enable to the calibration frame that user demarcates, and then divides and obtain
The shape of hand region is standard, and the calibration frame of the standard based on the segmentation carries out hand tracking effect more preferably.
It should be noted that the rapid dynamic hand tracking method of the present invention is applicable to the tracking of a single hand and also to the tracking of multiple hands. Multiple hands are tracked by a parallel-tracking method, which is essentially multiple single-hand tracking processes and is not detailed here. Any hand tracking method that uses the idea of the present invention shall fall within the protection scope of the present invention.
The above are only specific implementations of the present invention, but the protection scope of the present invention is not limited thereto. Those skilled in the art may also make improvements without departing from the concept of the present invention, and these improvements shall all fall within the protection scope of the present invention.
With reference to Figs. 3 to 5, the functional modules and hardware structure of the terminal that implements the above rapid hand tracking method are introduced respectively.
It should be understood that the embodiments are for illustration only, and the patent claims are not limited by this structure.
Embodiment three
Fig. 3 is the functional block diagram of a preferred embodiment of the rapid hand tracking device of the present invention.
In some embodiments, the rapid hand tracking device 30 runs in a terminal. The rapid hand tracking device 30 may include multiple functional modules composed of program code segments. The program code of each program segment in the rapid hand tracking device 30 can be stored in a memory and executed by at least one processor to perform the tracking of the hand region (refer to Fig. 1 and its associated description).
In the present embodiment, the functions of the rapid hand tracking device 30 are divided into multiple functional modules according to the functions performed by the terminal. The functional modules may include: a display module 301, a calibration module 302, a segmentation module 303 and a tracking module 304. A module referred to in the present invention is a series of computer program segments which can be executed by at least one processor and can complete a fixed function, and which are stored in the memory. In some embodiments, the functions of each module will be described in detail in subsequent embodiments.
The display module 301 is used to display, on a display interface, the video containing the human hand region acquired by an imaging device.
In the present embodiment, the terminal provides a display interface, which displays in real time the video containing the human hand region acquired by the imaging device. The imaging device is a 2D camera.
The calibration module 302 is used to receive the calibration frame that the user calibrates on the video containing the human hand region.
In the present embodiment, when the user finds hand information of interest in the video containing the human hand region shown on the display interface, the user indicates the hand information of interest by adding a calibration frame on the display interface.
The user can touch the display interface with a finger, a stylus or any other suitable object; preferably, the user touches the display interface with a finger and adds a calibration frame on the display interface.
The segmentation module 303 is used to extract the histogram-of-oriented-gradients feature of the region calibrated by the calibration frame, and to segment the region according to the histogram-of-oriented-gradients feature to obtain a hand image.
The segmentation module 303 extracts the histogram of oriented gradients (HOG) feature of the region calibrated by the calibration frame specifically as follows:
11) gradient information of each pixel in the fixed region of the calibration collimation mark is calculated, the gradient information includes ladder
Spend amplitude and gradient direction;
First-order differential templates such as the one-dimensional centered template [1, 0, -1], the one-dimensional non-centered template [-1, 1], the one-dimensional cubic-corrected template [1, -8, 0, 8, -1] and the Sobel operator may be used to compute, for each pixel in the calibrated region, the gradients in the horizontal and vertical directions; the gradient magnitude and gradient direction of the calibrated region are then computed from the horizontal and vertical gradients.
In this preferred embodiment, the one-dimensional centered template [1, 0, -1] is taken as an example for computing the gradient information of each pixel in the calibrated region. The calibrated region is denoted as I(x, y); the gradients of a pixel in the horizontal and vertical directions are computed as shown in formula (1-1):
Gh(x, y) = I(x+1, y) - I(x-1, y),  Gv(x, y) = I(x, y+1) - I(x, y-1)      (1-1)
where Gh(x, y) and Gv(x, y) denote the gradient values of the pixel (x, y) in the horizontal and vertical directions respectively.
The gradient magnitude (also called gradient intensity) and gradient direction of the pixel (x, y) are computed as shown in formula (1-2):
M(x, y) = sqrt(Gh(x, y)^2 + Gv(x, y)^2),  θ(x, y) = arctan(Gv(x, y) / Gh(x, y))      (1-2)
where M(x, y) and θ(x, y) denote the gradient magnitude and gradient direction of the pixel (x, y) respectively.
Further, the range of the gradient direction is restricted. The unsigned range is generally used, i.e., the sign of the gradient direction angle is ignored; the unsigned gradient direction can be expressed as shown in formula (1-3):
θ(x, y) = θ(x, y) + 180°, if θ(x, y) < 0°; otherwise θ(x, y) remains unchanged      (1-3)
After the computation of formula (1-3), the gradient direction of each pixel in the calibrated region is restricted to the range of 0 degrees to 180 degrees.
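Formulas (1-1) to (1-3) can be implemented directly. The sketch below uses the centered template [1, 0, -1] and leaves a zero border at the region edge, which is an assumption the text does not specify.

```python
import numpy as np

def gradient_info(I):
    """Per-pixel gradient magnitude and unsigned direction (formulas 1-1 to 1-3).

    I: 2-D gray-scale array of the calibrated region.
    Returns (M, theta) with theta in degrees, restricted to [0, 180).
    """
    I = I.astype(np.float64)
    Gh = np.zeros_like(I)
    Gv = np.zeros_like(I)
    Gh[:, 1:-1] = I[:, 2:] - I[:, :-2]        # (1-1) horizontal gradient
    Gv[1:-1, :] = I[2:, :] - I[:-2, :]        # (1-1) vertical gradient
    M = np.hypot(Gh, Gv)                      # (1-2) gradient magnitude
    theta = np.degrees(np.arctan2(Gv, Gh))    # (1-2) direction in (-180, 180]
    theta = np.mod(theta, 180.0)              # (1-3) unsigned range [0, 180)
    return M, theta
```

On a purely horizontal intensity ramp this yields a direction of 0 degrees, and on a vertical ramp 90 degrees, matching the unsigned convention of formula (1-3).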
12) Dividing the region calibrated by the calibration frame into multiple blocks, and dividing each block into multiple cell units, each cell unit containing multiple pixels;
In the present embodiment, the size of a cell unit is 8*8 pixels, and adjacent cell units do not overlap.
For example, assuming that the calibrated region I(x, y) has a size of 64*128, the size of each block is set to 16*16 and the size of each cell unit to 8*8; with the blocks sliding at a stride of one cell (8 pixels), the calibrated region can be divided into (64/8-1)*(128/8-1) = 105 blocks, each block containing 4 cell units and each cell unit containing 64 pixels.
Using the non-overlapping manner of dividing cell units in the present embodiment makes the computation of the gradient direction histogram of each block faster.
13) Quantizing the gradient information of each pixel in each cell unit to obtain the gradient histogram of the region calibrated by the calibration frame;
In the present embodiment, the gradient direction of each pixel of each cell unit is first divided into 9 bins (9 direction channels), and the 9 bins serve as the horizontal axis of the gradient histogram, being respectively [0°, 20°], [20°, 40°], [40°, 60°], [60°, 80°], [80°, 100°], [100°, 120°], [120°, 140°], [140°, 160°] and [160°, 180°]; then the gradient magnitudes of the pixels falling in each bin are accumulated to form the vertical axis of the gradient histogram.
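The quantization of one cell can be sketched as follows. The half-open 20° bins and the clamping of an exact 180° direction into the last bin are assumptions consistent with the bin list above.

```python
import numpy as np

def cell_histogram(mag, theta):
    """Quantize one 8x8 cell into the 9-bin gradient histogram.

    mag, theta: 8x8 arrays of gradient magnitude and unsigned direction
    (0 to 180 degrees). The 9 bins [0,20), [20,40), ... [160,180] form the
    histogram's horizontal axis; the accumulated magnitudes of the pixels
    falling in each bin form its vertical axis.
    """
    bins = np.minimum((theta // 20).astype(int), 8)  # bin index 0..8
    hist = np.zeros(9)
    np.add.at(hist, bins.ravel(), mag.ravel())       # accumulate magnitudes
    return hist
```

Accumulating magnitudes rather than simple counts means strong edges dominate the histogram, which is what gives the HOG feature its sensitivity to hand contours.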
14) Normalizing the gradient histogram of each block to obtain the normalization result of the gradient histogram of each block;
In this preferred embodiment, a normalization function may be used to normalize the gradient histogram of each block; the normalization function can be the L2 norm or the L1 norm.
Because of variations in local illumination and in foreground/background contrast, the gradient magnitudes of pixels vary over a very wide range. Normalization compresses illumination, shadows and edges, so that the histogram-of-oriented-gradients feature vector space is robust to variations in illumination, shadow and edge.
15) Concatenating the normalized gradient histograms of all blocks to obtain the final HOG feature of the region calibrated by the calibration frame;
16) Segmenting the hand region from the region calibrated by the calibration frame according to the final HOG feature.
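Steps 14) and 15) can be sketched as below. The 2x2-cell block sliding at a one-cell stride and the L2 normalization match the 64*128 / 105-block example above, though the function name and the array layout are illustrative assumptions.

```python
import numpy as np

def hog_descriptor(cell_hists, eps=1e-9):
    """Assemble the final HOG feature (steps 14 and 15).

    cell_hists: (rows, cols, 9) array of per-cell 9-bin histograms.
    2x2-cell blocks slide with a one-cell stride; each block's 36-D
    vector is L2-normalized, and all blocks are concatenated into the
    final feature vector.
    """
    r, c, _ = cell_hists.shape
    blocks = []
    for i in range(r - 1):
        for j in range(c - 1):
            v = cell_hists[i:i + 2, j:j + 2].ravel()          # one 2x2 block
            blocks.append(v / np.sqrt(np.sum(v * v) + eps))   # L2 norm
    return np.concatenate(blocks)
```

For a 64*128 calibrated region (8x16 cells of 8x8 pixels) this produces 7*15 = 105 blocks of 36 values each, a 3780-dimensional feature, consistent with the block count in the example above.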
The tracking module 304 is used to track the hand image using the continuously adaptive mean-shift algorithm.
In the present embodiment, the continuously adaptive mean shift (CamShift) algorithm is a method based on color information, which can track a target by using its particular color. It automatically adjusts the size and position of the search window, locates the size and center of the tracked target, and uses the result of the previous frame (i.e., the search window size and centroid) as the size and centroid of the target in the next frame of the image.
Tracking the hand image using the continuously adaptive mean-shift algorithm specifically includes:
21) Converting the color space of the hand image into the HSV (Hue, Saturation, Value) color space, and separating out the hand image of the hue (H) component;
22) Initializing the centroid position and size S of a search window W based on the hand image of the hue component;
23) Computing the moments of the current search window;
The zeroth-order moment of the current search window is computed according to formula (1-4), and the first-order moments according to formula (1-5):
M00 = Σx Σy I(x, y)      (1-4)
M10 = Σx Σy x·I(x, y),  M01 = Σx Σy y·I(x, y)      (1-5)
24) Computing the centroid position (M10/M00, M01/M00) of the current search window according to the moments of the current search window;
25) Computing the size of the current search window according to the moments of the current search window;
The currently computed search window is compared with a preset search-window threshold. When the currently computed search window is greater than or equal to the preset threshold, the above steps 21)-25) are repeated; when the currently computed search window is smaller than the preset threshold, the tracking ends, and the position of the centroid of the search window at that moment is the current position of the tracked target.
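A single iteration of steps 23)-25) over a hue back-projection map can be sketched as follows. The window-size rule 2*sqrt(M00) and the function name are illustrative assumptions (CamShift implementations differ in this detail), while the centroid (M10/M00, M01/M00) follows formulas (1-4) and (1-5).

```python
import numpy as np

def camshift_step(prob, win):
    """One CamShift iteration on a back-projection map (steps 23 to 25).

    prob: 2-D array, per-pixel probability that the pixel has the hand's hue.
    win: (x, y, w, h) current search window.
    Returns the new centroid and a window size derived from the zeroth moment.
    """
    x, y, w, h = win
    roi = prob[y:y + h, x:x + w]
    ys, xs = np.mgrid[0:h, 0:w]
    m00 = roi.sum()                          # zeroth-order moment (1-4)
    if m00 == 0:
        return (x + w // 2, y + h // 2), (w, h)
    m10 = (xs * roi).sum()                   # first-order moments (1-5)
    m01 = (ys * roi).sum()
    cx = x + m10 / m00                       # centroid (M10/M00, M01/M00)
    cy = y + m01 / m00
    s = 2 * np.sqrt(m00)                     # window side tracks target area
    return (cx, cy), (int(s), int(s))
```

Re-centering the window on the returned centroid and repeating this step until convergence, then feeding the result into the next frame, gives the continuously adaptive behavior described above.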
In conclusion rapid hand tracks of device 30 of the present invention, by user to described comprising human hands region
Video in interested hand information with calibration collimation mark it is fixed after, then extract the HOG features in the fixed region of the calibration collimation mark, root
Hand region is split from described demarcate in the fixed region of collimation mark according to the HOG features.Thus, it is only necessary to calculate the calibration
HOG features in the fixed region of collimation mark, compared to the video image for entirely including human hands region is calculated, the present invention is by connecing
Receive user calibration calibration frame, can reduce extraction HOG features region area, to effectively shortening extract HOG features when
Between, it is thus possible to quickly hand region is split from the video comprising human hands region.
In addition, since the gradient information of each pixel in the region calibrated by the calibration frame is processed with the cell unit as the processing unit, the computed HOG feature can preserve the geometric and optical characteristics of the hand region; secondly, the processing mode of dividing cell units by blocks allows the relationship between the pixels of the hand region to be well characterized; finally, the normalization processing can partially offset the influence of illumination variations, thereby ensuring the clarity of the extracted hand region and an accurate segmentation of the hand region.
Embodiment Four
Fig. 4 is the functional block diagram of a preferred embodiment of the rapid hand tracking device of the present invention.
In some embodiments, the rapid hand tracking device 40 runs in a terminal. The rapid hand tracking device 40 may include multiple functional modules composed of program code segments. The program code of each program segment in the rapid hand tracking device 40 can be stored in a memory and executed by at least one processor to perform the tracking of the hand region (refer to Fig. 2 and its associated description).
In the present embodiment, the functions of the rapid hand tracking device are divided into multiple functional modules according to the functions performed by the terminal. The functional modules may include: a display module 401, a calibration module 402, a pre-processing module 403, a segmentation module 404, a tracking module 405 and a normalization module 406. A module referred to in the present invention is a series of computer program segments which can be executed by at least one processor and can complete a fixed function, and which are stored in the memory. In some embodiments, the functions of each module will be described in detail in subsequent embodiments.
The display module 401 includes a first display sub-module 4010 and a second display sub-module 4012. The first display sub-module 4010 is used to display, on a display interface, the video containing the human hand region acquired by an imaging device; the second display sub-module 4012 is used to display a pre-set standard calibration frame in a pre-set display mode.
In the present embodiment, the terminal provides a display interface, which displays in real time the video containing the human hand region acquired by the imaging device; the display interface also displays the standard calibration frame at the same time.
The imaging device is a 3D depth camera. The difference between the 3D depth camera and a 2D camera is that the 3D depth camera can simultaneously capture gray-scale image information and three-dimensional information including depth information. After the video containing the human hand region is acquired by the 3D depth camera, it is displayed in real time on the display interface of the terminal.
In the present embodiment, the pre-set standard calibration frame is used by the user to calibrate on the displayed video containing the human hand region so as to obtain the hand information of interest.
The pre-set display mode includes one or a combination of the following:
1) When the display instruction is received, showing the pre-set standard calibration frame;
The display instruction corresponds to a display operation input by the user. The display operation input by the user includes, but is not limited to: clicking any position on the display interface, touching any position on the display interface for longer than a first preset time period (for example, 1 second), or uttering a first preset voice command (for example, "calibration frame"), etc.
When it is detected that the user performs a click operation on the display interface, or that the time of a touch operation performed by the user on the display interface exceeds the preset time, or that the user has uttered the first preset voice command, the terminal determines that the display instruction has been received and shows the pre-set standard calibration frame.
2) When the hiding instruction is received, hiding the pre-set standard calibration frame;
The hiding instruction corresponds to a hiding operation input by the user. The hiding operation input by the user includes, but is not limited to: clicking any position on the display interface, touching any position on the display interface for longer than a second preset time period (for example, 2 seconds), or uttering a second preset voice command (for example, "exit"), etc.
When it is detected that the user performs a click operation on the display interface, or that the time of a touch operation performed by the user on the display interface exceeds the second preset time period, or that the user has uttered the second preset voice command, the terminal determines that the hiding instruction has been received and hides the pre-set standard calibration frame.
The hiding instruction can be identical to or different from the display instruction. The first preset time period can be identical to or different from the second preset time period. Preferably, the first preset time period is shorter than the second preset time period: setting a shorter first preset time period allows the pre-set standard calibration frame to be shown quickly, while setting a longer second preset time period avoids the situation where the pre-set standard calibration frame is hidden because of an unconscious action or mis-operation by the user.
Showing the pre-set standard calibration frame when the display instruction is received enables the user to calibrate a hand region of interest while the display interface is displaying the video containing the human hand region. Conversely, when the display instruction has not been received, the pre-set standard calibration frame is not shown, or it is hidden upon receipt of the hiding instruction; this prevents the displayed video containing the human hand region from being blocked by the pre-set standard calibration frame for a long time, which would otherwise cause important information to be missed or bring visual discomfort to the user when viewing the video containing the human hand region.
3) The pre-set standard calibration frame is shown when the display instruction is received, and is hidden automatically when no further instruction is received for longer than a third preset time period.
After the pre-set standard calibration frame is shown, if the user inputs no further operation for longer than the third preset time period, the frame is hidden automatically. This avoids the situation where the user unconsciously triggers the display instruction and the pre-set standard calibration frame remains shown for a long time; in addition, hiding the frame automatically helps improve the user's interactive experience.
In the present embodiment, the pre-set standard calibration frame can be a circle, an ellipse, a rectangle, a square, etc.
The calibration module 402 is used to receive the standard calibration frame that the user calibrates on the video containing the human hand region.
In the present embodiment, when the user finds hand information of interest in the video containing the human hand region shown on the display interface, the user indicates the hand information of interest by adding a standard calibration frame on the display interface.
In the present embodiment, the calibration module 402 further includes a first calibration sub-module 4020, a second calibration sub-module 4022 and a third calibration sub-module 4024.
The first calibration sub-module 4020 is used to receive a rough calibration frame drawn by the user on the video containing the human hand region; to match, by a fuzzy matching method, the pre-set standard calibration frame corresponding to the rough calibration frame; and, according to the matched standard calibration frame, to calibrate on the video containing the human hand region and display the calibrated standard calibration frame, wherein the geometric center of the rough calibration frame is identical to the geometric center of the matched standard calibration frame.
In the present embodiment, the calibration frame drawn by the user with a finger on the display interface is not standard in shape; for example, a circular calibration frame drawn by the user is not very precise. Therefore, after the terminal receives the rough calibration frame drawn by the user, it matches the shape of the corresponding pre-set standard calibration frame according to the general shape of the rough calibration frame. Matching the corresponding standard calibration frame by the fuzzy matching method facilitates the subsequent cropping of the region calibrated by the frame.
The second calibration sub-module 4022 is used to directly receive a standard calibration frame chosen by the user and, according to the standard calibration frame, to calibrate on the video containing the human hand region and display the calibrated standard calibration frame.
In the present embodiment, the user inputs a display operation to trigger the display instruction, whereupon multiple pre-set standard calibration frames are shown. The user touches a standard calibration frame; after the terminal detects the touch signal on that frame, it determines that the frame is selected. The user then moves the selected standard calibration frame and drags it onto the video containing the human hand region, and the terminal displays the dragged standard calibration frame on the video containing the human hand region.
The third calibration sub-module 4024 is used to enlarge, shrink, move or delete the displayed standard calibration frame when an enlarging, shrinking, moving or deleting instruction is received.
The pre-processing module 403 is used to pre-process the region calibrated by the standard calibration frame.
In the present embodiment, the pre-processing may include one or a combination of the following: gray-scale processing and correction processing.
The gray-scale processing refers to converting the image of the region calibrated by the standard calibration frame into a gray-scale image, because color information has little influence on the extraction of the histogram-of-oriented-gradients feature; converting the image of the calibrated region into a gray-scale image therefore does not interfere with the subsequent computation of the gradient information of each pixel in the region, and also reduces the amount of computation of that gradient information.
Gamma correction may be used for the correction processing, because the exposure of the local surface contributes a large proportion of the texture intensity of an image; after Gamma correction processing, local shadows and illumination variations in the image can be effectively reduced.
The segmentation module 404 is used to extract the histogram-of-oriented-gradients feature of the pre-processed region calibrated by the standard calibration frame, and to segment the region according to the histogram-of-oriented-gradients feature to obtain a hand image.
The tracking module 405 is used to track the hand image using the continuously adaptive mean-shift algorithm.
Further, the rapid hand tracking device 40 further includes a normalization module 406, which is used to obtain the depth information of the region calibrated by the calibration frame in the corresponding video containing the human hand region, and to normalize the hand image according to the depth information.
The depth information is obtained from the 3D depth camera. The specific process of normalizing the hand image according to the depth information is as follows: the size of the hand image obtained by segmenting the region calibrated by the standard calibration frame for the first time is denoted as the standard size S1, and the depth-of-field information corresponding to the region calibrated for the first time is denoted as the standard depth of field H1; the size of the hand image obtained by segmenting the region currently calibrated by the standard calibration frame is denoted as S2, and the depth-of-field information corresponding to the currently calibrated region is denoted as H2; the hand image obtained by segmenting the currently calibrated region is then normalized to S2*(H2/H1).
Normalizing the size of the hand image ensures that the finally extracted HOG feature representations have a uniform scale, i.e., the same dimensions, which improves the accuracy of the hand tracking.
In conclusion rapid hand tracks of device 40 of the present invention, provide two kinds of standard calibration frames includes to described
The video in human hands region is demarcated, and it is standard calibration frame to enable to the calibration frame that user demarcates, and then divides and obtain
The shape of hand region be standard, the calibration frame of the standard based on the segmentation carries out hand tracking effect more preferably.
It should be noted that the rapid dynamic hand tracking devices 30 and 40 of the present invention are applicable to the tracking of a single hand and also to the tracking of multiple hands. Multiple hands are tracked by a parallel-tracking method, which is essentially multiple single-hand tracking processes and is not detailed here. Any hand tracking device that uses the idea of the present invention shall fall within the protection scope of the present invention.
The above integrated units implemented in the form of software function modules can be stored in a computer-readable storage medium. The above software function modules are stored in a storage medium and include several instructions for causing a computer device (which can be a personal computer, a dual-screen device, a network device, etc.) or a processor to execute part of the methods of the embodiments of the present invention.
Embodiment five
Fig. 5 is a schematic diagram of the terminal provided by Embodiment 5 of the present invention.
The terminal 5 includes: a memory 51, at least one processor 52, a computer program 53 which is stored in the memory 51 and can run on the at least one processor 52, at least one communication bus 54 and an imaging device 55.
When the at least one processor 52 executes the computer program 53, the steps in the above rapid hand tracking method embodiments are implemented, such as steps 101 to 104 shown in Fig. 1 or steps 201 to 205 shown in Fig. 2. Alternatively, when the at least one processor 52 executes the computer program 53, the functions of the modules/units in the above device embodiments are implemented, such as the modules 301 to 304 in Fig. 3 or the modules 401 to 406 in Fig. 4.
Illustratively, the computer program 53 may be divided into one or more modules/units, which are stored in the memory 51 and executed by the at least one processor 52 to carry out the present invention. The one or more modules/units may be a series of computer program instruction segments capable of completing specific functions; the instruction segments describe the execution process of the computer program 53 in the terminal 5. For example, the computer program 53 may be divided into the display module 301, calibration module 302, segmentation module 303, and tracking module 304 in Fig. 3, or into the display module 401, calibration module 402, preprocessing module 403, segmentation module 404, tracking module 405, and standardization module 406 in Fig. 4. The display module 401 includes a first display submodule 4010 and a second display submodule 4012; the calibration module 402 includes a first calibration submodule 4020, a second calibration submodule 4022, and a third calibration submodule 4024. For the specific functions of each module, refer to Embodiments One and Two and their corresponding descriptions.
The imaging device 55 includes a 2D camera, a 3D depth camera, etc. The imaging device 55 may be installed in the terminal 5, or may be separate from the terminal 5 and exist as an independent element.
The terminal 5 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. Those skilled in the art will appreciate that the schematic diagram 5 is only an example of the terminal 5 and does not constitute a limitation on the terminal 5; the terminal may include more or fewer components than illustrated, combine certain components, or have different components. For example, the terminal 5 may also include input/output devices, network access devices, buses, etc.
The at least one processor 52 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The processor 52 may be a microprocessor, or any conventional processor; the processor 52 is the control center of the terminal 5 and connects the various parts of the entire terminal 5 through various interfaces and lines.
The memory 51 may be used to store the computer program 53 and/or the modules/units. The processor 52 implements the various functions of the terminal 5 by running or executing the computer program and/or modules/units stored in the memory 51 and by calling data stored in the memory 51. The memory 51 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and application programs required by at least one function (such as a sound playing function, an image playing function, etc.), and the data storage area may store data created according to the use of the terminal 5 (such as audio data, a phone book, etc.). In addition, the memory 51 may include a high-speed random access memory, and may also include a non-volatile memory, such as a hard disk, a memory, a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.
If the integrated modules/units of the terminal 5 are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the present invention may implement all or part of the processes in the methods of the above embodiments by instructing relevant hardware through a computer program. The computer program may be stored in a computer-readable storage medium, and when executed by a processor, the computer program can implement the steps of the above method embodiments. The computer program includes computer program code, which may be in source code form, object code form, executable file form, or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electric carrier signal, a telecommunication signal, a software distribution medium, etc. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, a computer-readable medium does not include electric carrier signals and telecommunication signals.
In the several embodiments provided by the present invention, it should be understood that the disclosed terminal and method may be implemented in other ways. For example, the terminal embodiments described above are merely schematic; for example, the division of the units is only a logical function division, and there may be other division manners in actual implementation.
In addition, the functional units in each embodiment of the present invention may be integrated in the same processing unit, or each unit may exist physically alone, or two or more units may be integrated in the same unit. The above integrated unit may be implemented in the form of hardware, or in the form of hardware plus software function modules.
It is obvious to those skilled in the art that the invention is not limited to the details of the above exemplary embodiments, and that the present invention can be realized in other specific forms without departing from its spirit or essential attributes. Therefore, from whatever point of view, the embodiments are to be considered illustrative and not restrictive, and the scope of the present invention is defined by the appended claims rather than by the above description; it is intended that all changes falling within the meaning and scope of equivalents of the claims be included within the present invention. Any reference signs in the claims should not be construed as limiting the claims involved. Furthermore, it is clear that the word "comprising" does not exclude other units or steps, and the singular does not exclude the plural. A plurality of units or devices stated in the system claims may also be implemented by one unit or device through software or hardware. Words such as "first" and "second" are used to denote names and do not denote any particular order.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solution of the present invention and are not limiting. Although the present invention has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art will understand that the technical solution of the present invention may be modified or equivalently replaced without departing from the spirit and scope of the technical solution of the present invention.
Claims (10)
1. A rapid hand tracking method, characterized in that the method comprises:
displaying, on a display interface, a video containing a human hand region captured by an imaging device;
receiving a calibration frame marked by a user on the video containing the human hand region;
extracting a histogram of oriented gradients (HOG) feature of the region marked by the calibration frame, and segmenting the region marked by the calibration frame according to the HOG feature to obtain a hand image; and
tracking the hand image using a continuously adaptive mean shift (CamShift) operator, wherein tracking the hand image using the CamShift operator specifically comprises:
converting the color space of the hand image into the HSV color space, separating out the hand image of the hue component, and, based on the hue-component hand image I(i, j) and the centroid position and size of the initialized search window, calculating the centroid position (M10/M00, M01/M00) of the current search window and the current search window size s = 2·sqrt(M00/256), wherein M10 = Σ_i Σ_j i·I(i, j) and M01 = Σ_i Σ_j j·I(i, j) are the first-order moments of the current search window, M00 = Σ_i Σ_j I(i, j) is the zeroth-order moment of the current search window, i is the pixel coordinate of I(i, j) in the horizontal direction, and j is the pixel coordinate of I(i, j) in the vertical direction.
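The centroid-and-window update in claim 1 can be sketched as follows. This is a minimal illustration of the standard CamShift moment computation for an 8-bit hue image; the function name and array layout are our assumptions, not the patent's.

```python
import numpy as np

def search_window_update(hue_window):
    """One CamShift update step over the hue component I(i, j).

    hue_window: 2-D array of hue values inside the current search
    window; rows index j (vertical), columns index i (horizontal).
    """
    I = hue_window.astype(np.float64)
    rows, cols = np.indices(I.shape)    # rows -> j, cols -> i
    m00 = I.sum()                       # zeroth-order moment M00
    m10 = (cols * I).sum()              # first-order moment M10 = sum(i * I(i, j))
    m01 = (rows * I).sum()              # first-order moment M01 = sum(j * I(i, j))
    centroid = (m10 / m00, m01 / m00)   # (M10/M00, M01/M00)
    s = 2.0 * np.sqrt(m00 / 256.0)      # window size for an 8-bit image
    return centroid, s
```

For a uniform 4×4 window of ones, the centroid is (1.5, 1.5) and the window size is 0.5.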
2. The method according to claim 1, characterized in that displaying, on the display interface, the video containing the human hand region captured by the imaging device further comprises:
displaying a preset standard calibration frame in a preset display mode, the preset display mode comprising a combination of one or more of the following:
displaying the preset standard calibration frame when a display instruction is received;
hiding the preset standard calibration frame when a hide instruction is received;
after the preset standard calibration frame is displayed upon receiving the display instruction, automatically hiding the preset standard calibration frame when the time during which no instruction is received exceeds a preset time period.
3. The method according to claim 2, characterized in that receiving the calibration frame marked by the user on the video containing the human hand region comprises:
receiving a standard calibration frame marked by the user on the video containing the human hand region, including:
receiving a rough calibration frame drawn by the user in the video containing the human hand region;
matching the rough calibration frame to a corresponding preset standard calibration frame by a fuzzy matching method; and
marking the video containing the human hand region according to the matched standard calibration frame and displaying the marked standard calibration frame, wherein the geometric center of the rough calibration frame is the same as the geometric center of the matched standard calibration frame.
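The fuzzy-matching step of claim 3 can be sketched as follows. The preset frame sizes, the function name, and the nearest-size metric are illustrative assumptions; the claim only requires that the matched standard frame keep the geometric center of the rough frame.

```python
# Assumed preset standard calibration frame sizes (width, height);
# the actual presets are not specified in the claim.
STANDARD_SIZES = [(80, 80), (120, 160)]

def snap_to_standard(rough):
    """Match a rough user-drawn box (x, y, w, h) to the closest preset
    standard frame, keeping the rough box's geometric center."""
    x, y, w, h = rough
    cx, cy = x + w / 2.0, y + h / 2.0            # geometric center of the rough frame
    sw, sh = min(STANDARD_SIZES,
                 key=lambda s: abs(s[0] - w) + abs(s[1] - h))
    # Recenter the matched standard frame on (cx, cy).
    return (cx - sw / 2.0, cy - sh / 2.0, sw, sh)
```

A box of 82×78 pixels, for example, would snap to the assumed 80×80 preset around the same center.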
4. The method according to claim 2, characterized in that receiving the calibration frame marked by the user on the video containing the human hand region comprises:
receiving a standard calibration frame marked by the user on the video containing the human hand region, including:
directly receiving a standard calibration frame chosen by the user, marking the video containing the human hand region according to the standard calibration frame, and displaying the marked standard calibration frame.
5. The method according to claim 3 or 4, characterized in that receiving the standard calibration frame marked by the user on the video containing the human hand region further comprises:
when an enlarge, shrink, move, or delete instruction is received, enlarging, shrinking, moving, or deleting the displayed standard calibration frame accordingly.
6. The method according to claim 5, characterized in that the method further comprises:
preprocessing the region marked by the standard calibration frame, the preprocessing comprising a combination of one or more of the following: grayscale processing and correction processing.
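The grayscale preprocessing in claim 6 can be sketched as a standard luma conversion. The BT.601 weights and the function name are assumptions; the claim does not specify a particular grayscale method.

```python
import numpy as np

def to_gray(bgr):
    """Convert an H x W x 3 uint8 BGR region (e.g. the area inside the
    standard calibration frame) to grayscale with BT.601 luma weights."""
    b, g, r = bgr[..., 0], bgr[..., 1], bgr[..., 2]
    gray = 0.114 * b + 0.587 * g + 0.299 * r
    return np.rint(gray).astype(np.uint8)
```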
7. The method according to claim 6, characterized in that the method further comprises:
obtaining the depth information of the video containing the human hand region corresponding to the region marked by the calibration frame, and standardizing the hand image according to the depth information, the standardization process being: S1 = S2 × (H2/H1), wherein S1 is the size of the hand image obtained by segmenting the region marked by the first standard calibration frame, and H1 is the depth-of-field information corresponding to the region marked by the first calibration frame; S2 is the size of the hand image obtained by segmenting the region marked by the current standard calibration frame, and H2 is the depth-of-field information corresponding to the region marked by the current calibration frame.
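The standardization of claim 7 can be sketched as follows, assuming the reconstructed formula S1 = S2 × (H2/H1); the function name is ours.

```python
def standardize_size(s2, h2, h1):
    """Scale the currently segmented hand-image size s2 back to the
    scale of the first calibration, using the current depth h2 and the
    depth h1 recorded at the first calibration: S1 = S2 * (H2 / H1)."""
    if h1 <= 0:
        raise ValueError("reference depth H1 must be positive")
    return s2 * (h2 / h1)
```

A hand at twice the first-calibration distance appears about half as large in the image; multiplying by H2/H1 scales its size back to the reference scale.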
8. A rapid hand tracking device, characterized in that the device comprises:
a display module, configured to display, on a display interface, a video containing a human hand region captured by an imaging device;
a calibration module, configured to receive a calibration frame marked by a user on the video containing the human hand region;
a segmentation module, configured to extract a histogram of oriented gradients (HOG) feature of the region marked by the calibration frame, and to segment the region marked by the calibration frame according to the HOG feature to obtain a hand image; and
a tracking module, configured to track the hand image using a continuously adaptive mean shift (CamShift) operator, wherein tracking the hand image using the CamShift operator specifically comprises:
converting the color space of the hand image into the HSV color space, separating out the hand image of the hue component, and, based on the hue-component hand image I(i, j) and the centroid position and size of the initialized search window, calculating the centroid position (M10/M00, M01/M00) of the current search window and the current search window size s = 2·sqrt(M00/256), wherein M10 = Σ_i Σ_j i·I(i, j) and M01 = Σ_i Σ_j j·I(i, j) are the first-order moments of the current search window, M00 = Σ_i Σ_j I(i, j) is the zeroth-order moment of the current search window, i is the pixel coordinate of I(i, j) in the horizontal direction, and j is the pixel coordinate of I(i, j) in the vertical direction.
9. A terminal, characterized in that the terminal comprises a processor and a memory, the processor being configured to execute a computer program stored in the memory to implement the rapid hand tracking method according to any one of claims 1 to 7.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the rapid hand tracking method according to any one of claims 1 to 7.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810349972.XA CN108682021B (en) | 2018-04-18 | 2018-04-18 | Rapid hand tracking method, device, terminal and storage medium |
PCT/CN2018/100227 WO2019200785A1 (en) | 2018-04-18 | 2018-08-13 | Fast hand tracking method, device, terminal, and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810349972.XA CN108682021B (en) | 2018-04-18 | 2018-04-18 | Rapid hand tracking method, device, terminal and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108682021A true CN108682021A (en) | 2018-10-19 |
CN108682021B CN108682021B (en) | 2021-03-05 |
Family
ID=63801123
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810349972.XA Active CN108682021B (en) | 2018-04-18 | 2018-04-18 | Rapid hand tracking method, device, terminal and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN108682021B (en) |
WO (1) | WO2019200785A1 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109886928A (en) * | 2019-01-24 | 2019-06-14 | 平安科技(深圳)有限公司 | A kind of target cell labeling method, device, storage medium and terminal device |
WO2021130549A1 (en) * | 2019-12-23 | 2021-07-01 | Sensetime International Pte. Ltd. | Target tracking method and apparatus, electronic device, and storage medium |
WO2023001039A1 (en) * | 2021-07-19 | 2023-01-26 | 北京字跳网络技术有限公司 | Image matching method and apparatus, and device and storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103390168A (en) * | 2013-07-18 | 2013-11-13 | 重庆邮电大学 | Intelligent wheelchair dynamic gesture recognition method based on Kinect depth information |
CN105678809A (en) * | 2016-01-12 | 2016-06-15 | 湖南优象科技有限公司 | Handheld automatic follow shot device and target tracking method thereof |
CN106157308A (en) * | 2016-06-30 | 2016-11-23 | 北京大学 | Rectangular target object detecting method |
US20170287139A1 (en) * | 2009-10-07 | 2017-10-05 | Microsoft Technology Licensing, Llc | Methods and systems for determining and tracking extremities of a target |
CN107240117A (en) * | 2017-05-16 | 2017-10-10 | 上海体育学院 | The tracking and device of moving target in video |
WO2018031102A1 (en) * | 2016-08-12 | 2018-02-15 | Qualcomm Incorporated | Methods and systems of performing content-adaptive object tracking in video analytics |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3120294A1 (en) * | 2014-03-20 | 2017-01-25 | Telecom Italia S.p.A. | System and method for motion capture |
CN105825524B (en) * | 2016-03-10 | 2018-07-24 | 浙江生辉照明有限公司 | Method for tracking target and device |
CN105957107A (en) * | 2016-04-27 | 2016-09-21 | 北京博瑞空间科技发展有限公司 | Pedestrian detecting and tracking method and device |
2018
- 2018-04-18 CN CN201810349972.XA patent/CN108682021B/en active Active
- 2018-08-13 WO PCT/CN2018/100227 patent/WO2019200785A1/en unknown
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170287139A1 (en) * | 2009-10-07 | 2017-10-05 | Microsoft Technology Licensing, Llc | Methods and systems for determining and tracking extremities of a target |
CN103390168A (en) * | 2013-07-18 | 2013-11-13 | 重庆邮电大学 | Intelligent wheelchair dynamic gesture recognition method based on Kinect depth information |
CN105678809A (en) * | 2016-01-12 | 2016-06-15 | 湖南优象科技有限公司 | Handheld automatic follow shot device and target tracking method thereof |
CN106157308A (en) * | 2016-06-30 | 2016-11-23 | 北京大学 | Rectangular target object detecting method |
WO2018031102A1 (en) * | 2016-08-12 | 2018-02-15 | Qualcomm Incorporated | Methods and systems of performing content-adaptive object tracking in video analytics |
CN107240117A (en) * | 2017-05-16 | 2017-10-10 | 上海体育学院 | The tracking and device of moving target in video |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109886928A (en) * | 2019-01-24 | 2019-06-14 | 平安科技(深圳)有限公司 | A kind of target cell labeling method, device, storage medium and terminal device |
CN109886928B (en) * | 2019-01-24 | 2023-07-14 | 平安科技(深圳)有限公司 | Target cell marking method, device, storage medium and terminal equipment |
WO2021130549A1 (en) * | 2019-12-23 | 2021-07-01 | Sensetime International Pte. Ltd. | Target tracking method and apparatus, electronic device, and storage medium |
US11244154B2 (en) | 2019-12-23 | 2022-02-08 | Sensetime International Pte. Ltd. | Target hand tracking method and apparatus, electronic device, and storage medium |
WO2023001039A1 (en) * | 2021-07-19 | 2023-01-26 | 北京字跳网络技术有限公司 | Image matching method and apparatus, and device and storage medium |
Also Published As
Publication number | Publication date |
---|---|
WO2019200785A1 (en) | 2019-10-24 |
CN108682021B (en) | 2021-03-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109359538B (en) | Training method of convolutional neural network, gesture recognition method, device and equipment | |
CN108765278B (en) | Image processing method, mobile terminal and computer readable storage medium | |
CN103927016B (en) | Real-time three-dimensional double-hand gesture recognition method and system based on binocular vision | |
US8792722B2 (en) | Hand gesture detection | |
CN104268583B (en) | Pedestrian re-recognition method and system based on color area features | |
US8750573B2 (en) | Hand gesture detection | |
CN109165538B (en) | Bar code detection method and device based on deep neural network | |
Shivakumara et al. | A new multi-modal approach to bib number/text detection and recognition in Marathon images | |
CN110413816A (en) | Colored sketches picture search | |
CN110827312B (en) | Learning method based on cooperative visual attention neural network | |
CN105405116B (en) | A kind of solid matching method cut based on figure | |
WO2019071976A1 (en) | Panoramic image saliency detection method based on regional growth and eye movement model | |
CN113240691A (en) | Medical image segmentation method based on U-shaped network | |
CN112418216A (en) | Method for detecting characters in complex natural scene image | |
CN110222572A (en) | Tracking, device, electronic equipment and storage medium | |
WO2023151237A1 (en) | Face pose estimation method and apparatus, electronic device, and storage medium | |
CN108682021A (en) | Rapid hand tracking, device, terminal and storage medium | |
CN111080670A (en) | Image extraction method, device, equipment and storage medium | |
CN109558790B (en) | Pedestrian target detection method, device and system | |
CN111753923A (en) | Intelligent photo album clustering method, system, equipment and storage medium based on human face | |
CN114549557A (en) | Portrait segmentation network training method, device, equipment and medium | |
Paul et al. | Hand segmentation from complex background for gesture recognition | |
CN109740674A (en) | A kind of image processing method, device, equipment and storage medium | |
Mussi et al. | A novel ear elements segmentation algorithm on depth map images | |
CN112686122B (en) | Human body and shadow detection method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||