WO2016145625A1 - 3D hand pose recovery from binocular imaging system - Google Patents

3D hand pose recovery from binocular imaging system

Info

Publication number
WO2016145625A1
Authority
WO
WIPO (PCT)
Prior art keywords
hand
image
matched
parts
hand part
Prior art date
Application number
PCT/CN2015/074447
Other languages
French (fr)
Inventor
Xiaoou Tang
Chen QIAN
Tak Wai HUI
Chen Change Loy
Original Assignee
Xiaoou Tang
Priority date
Filing date
Publication date
Application filed by Xiaoou Tang
Priority to PCT/CN2015/074447
Priority to CN201580077259.9A (CN108140243B)
Publication of WO2016145625A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/50 - Depth or shape recovery
    • G06T7/55 - Depth or shape recovery from multiple images
    • G06T7/593 - Depth or shape recovery from multiple images from stereo images
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/30 - Subject of image; Context of image processing
    • G06T2207/30196 - Human being; Person


Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

Disclosed are an apparatus, a method and a system for constructing a 3D hand model from a binocular imaging system. The apparatus may comprise a retrieving device configured to retrieve a hand region from a stereo frame comprising at least a first image and a second image; a segmenting device in electrical communication with the retrieving device and configured to segment one or more hand parts, each having feature points, from the retrieved hand region; an acquiring device electrically coupled with the segmenting device and configured to, for each segmented hand part, acquire a plurality of matched feature point pairs in which the feature points in the first image are matched with corresponding feature points in the second image; and a generating device in electrical communication with the acquiring device and configured to generate a 3D model of each hand part based on the matched feature point pairs of the hand part to construct the 3D hand model.

Description

3D HAND POSE RECOVERY FROM A BINOCULAR IMAGING SYSTEM
Technical Field
The present application generally relates to the field of body pose recognition and, more particularly, to an apparatus for constructing a 3D hand model from a binocular imaging system. The present application further relates to a method and a system for constructing a 3D hand model from a binocular imaging system.
Background
Recently, body pose recognition systems, especially hand pose recognition systems, have been applied in several applications, such as hand gesture control in human-computer interfaces (HCI) and sign language recognition. Conventional recovery of a 3D model from a stereo image is generally divided into two steps: extracting a 3D point cloud from the stereo image and then fitting the 3D point cloud to a 3D model.
However, traditional methods generally face the following problems. Firstly, the 2D features of one finger are hardly distinguishable from those of the other fingers; the resulting ambiguity in establishing the correspondence of the same 3D point across two or more images of a stereo pair affects the accuracy of the 3D reconstruction. Secondly, distinctive feature extraction and feature matching can hardly meet real-time requirements. Thirdly, a hand is a multi-body (articulated) object, so hand pose recovery is an ill-posed task when traditional single-model fitting is used. Fourthly, even if complex multi-body model fitting is used instead of a single model, it is a computationally intensive task.
Traditional methods that do not consider the unique characteristics of the human hand can hardly overcome these difficulties.
Summary
In view of the above, an apparatus, a system and a method are proposed to solve the aforementioned problems. With the apparatus, system and method, properties of the human hand are exploited by introducing the concept of hand parts to overcome the above difficulties. Therefore, the hand pose, including the 3D positions and directions of the fingers and palm, can be recovered in real time.
According to an embodiment of the present application, disclosed is an apparatus for constructing a 3D hand model. The apparatus may comprise a retrieving device configured to retrieve a hand region from a stereo frame comprising at least a first image and a second image; a segmenting device in electrical communication with the retrieving device and configured to segment one or more hand parts each consisting of a number of feature points from the retrieved hand region; an acquiring device electrically coupled with the segmenting device and configured to, for each segmented hand part, acquire a plurality of matched feature point pairs in which the feature points in the first image are matched with corresponding feature points in the second image; and a generating device in electrical communication with the acquiring device and configured to generate a 3D model of each hand part based on the matched feature point pairs of the hand parts to construct the 3D hand model.
According to an embodiment of the present application, disclosed is a method for constructing a 3D hand model. The method may comprise the following steps: retrieving a hand region from a stereo frame comprising at least a first image and a second image; segmenting, from the retrieved hand region, one or more hand parts each consisting of a number of feature points; acquiring, for each hand part, a plurality of matched feature point pairs in which the feature points in the first image are matched with the corresponding feature points in the second image; and generating a 3D model of each hand part based on the matched feature point pairs of the hand parts to construct the 3D hand model.
According to an embodiment of the present application, disclosed is a system for constructing a 3D hand model. The system may comprise a memory that stores executable components and a processor, electrically coupled to the memory to execute the executable components to retrieve a hand region from a stereo frame comprising at least a first image and a second image; segment one or more hand parts each consisting of a number of feature points from the retrieved hand region; acquire, for each hand part, a plurality of matched feature point pairs in which the feature points in the first image are matched with corresponding  feature points in the second image; and generate a 3D model of each hand part based on the matched feature point pairs of the hand parts to construct the 3D hand model.
The following description and the annexed drawings set forth certain illustrative aspects of the disclosure. These aspects are indicative, however, of but a few of the various ways in which the principles of the disclosure may be employed. Other aspects of the disclosure will become apparent from the following detailed description of the disclosure when considered in conjunction with the drawings.
Brief Description of the Drawing
Exemplary non-limiting embodiments of the present invention are described below with reference to the attached drawings. The drawings are illustrative and generally not to an exact scale. The same or similar elements on different figures are referenced with the same reference numbers.
Fig. 1 is a schematic diagram illustrating an apparatus for constructing a 3D hand model consistent with an embodiment of the present application.
Fig. 2 is a schematic diagram illustrating a segmenting device of the apparatus for constructing a 3D hand model consistent with some disclosed embodiments.
Fig. 3 is a schematic diagram illustrating a generating device of the apparatus for constructing a 3D hand model consistent with one embodiment of the present application.
Fig. 4 is a schematic diagram illustrating an example of a constructed 3D hand model consistent with one embodiment of the present application.
Fig. 5 is a schematic flowchart illustrating a method for constructing a 3D hand model consistent with some disclosed embodiments.
Fig. 6 is a schematic flowchart illustrating a step of segmenting of the method for constructing a 3D hand model consistent with some other disclosed embodiments.
Fig. 7 is a schematic flowchart illustrating a step of generating of the method for constructing a 3D hand model consistent with some other disclosed embodiments.
Fig. 8 is a schematic diagram illustrating a system for constructing a 3D hand model consistent with an embodiment of the present application.
Detailed Description
Reference will now be made in detail to some specific embodiments of the invention including the best modes contemplated by the inventors for carrying out the invention. Examples of these specific embodiments are illustrated in the accompanying drawings. While the invention is described in conjunction with these specific embodiments, it will be understood that it is not intended to limit the invention to the described embodiments. On the contrary, it is intended to cover alternatives, modifications, and equivalents as may be included within the spirit and scope of the invention as defined by the appended claims. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. The present invention may be practiced without some or all of these specific details. In other instances, well-known process operations have not been described in detail in order not to unnecessarily obscure the present invention.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms "a" , "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising, " when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
As will be appreciated by one skilled in the art, the present invention may be embodied as a system, method or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc. ) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit, ” “module” or “system. ” Furthermore, the present invention may take the form of a computer program product embodied in any tangible medium of expression having computer-usable program code embodied in the medium.
It is further understood that the use of relational terms such as first and second, and the like, if any, are used solely to distinguish one from another entity, item, or action without necessarily requiring or implying any actual such relationship or order between such entities, items or actions.
Much of the inventive functionality and many of the inventive principles when implemented, are best supported with or in software or integrated circuits (ICs) , such as a digital signal processor and software therefore or application specific ICs. It is expected that one of ordinary skill, notwithstanding possibly significant effort and many design choices motivated by, for example, available time, current technology, and economic considerations, when guided by the concepts and principles disclosed herein will be readily capable of generating such software instructions or ICs with minimal experimentation. Therefore, in the interest of brevity and minimization of any risk of obscuring the principles and concepts according to the present invention, further discussion of such software and ICs, if any, will be limited to the essentials with respect to the principles and concepts used by the preferred embodiments.
Fig. 1 is a schematic diagram illustrating an exemplary apparatus 1000 for constructing a 3D hand model of a user from a binocular imaging system consistent with some disclosed embodiments. As shown, the apparatus 1000 may comprise a retrieving device 100, a segmenting device 200, an acquiring device 300 and a generating device 400.
In the embodiment shown in Fig. 1, the retrieving device 100 may retrieve a hand region from a stereo frame comprising at least a first image and a second image. In an embodiment, the retrieving device 100 may capture the stereo frame of the user’s hand from the binocular imaging system and retrieve the largest connected component of each of the images in the stereo frame as the hand region. Herein, a connected component refers to a region consisting of a set of adjacently located image points.
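As a rough illustration of this retrieval step (a minimal sketch only, not the patent's implementation; the OpenCV routines, the helper name and the threshold value are assumptions):

    import cv2
    import numpy as np

    def retrieve_hand_region(ir_image, threshold=40):
        # Binarize the IR image; only the illuminated hand should survive.
        _, binary = cv2.threshold(ir_image, threshold, 255, cv2.THRESH_BINARY)
        # Label connected components and keep the largest foreground one.
        num_labels, labels, stats, _ = cv2.connectedComponentsWithStats(binary)
        if num_labels < 2:  # label 0 is the background
            return np.zeros_like(binary)
        largest = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])
        return (labels == largest).astype(np.uint8) * 255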
The segmenting device 200 may be in communication with the retrieving device 100 and may segment one or more hand parts from the retrieved hand region, wherein each of the hand parts consists of a number of feature points, as will be described later in detail with reference to Fig. 2.
The acquiring device 300 may be electrically coupled with the segmenting  device 200. For each hand part, the acquiring device 300 may acquire a plurality of matched feature point pairs in which the feature points in the first image are matched with the corresponding feature points in the second image.
The generating device 400 may be in electrical communication with the acquiring device 300 and may generate a 3D model of each hand part based on the matched feature point pairs of the hand parts to construct the 3D hand model, as will be described later in detail with reference to Fig. 3.
With the apparatus 1000, the 3D positions and orientations of the fingers and palm of the user’s hand can be recovered in real time. Fig. 4 illustrates an example of a 3D hand model constructed according to one embodiment of the present application, wherein five circles and an ellipse represent the detected fingertips and palm of the user’s hand, respectively.
The binocular imaging system (also known as a stereo camera) may be, for example, an infra-red stereo camera. Hereinafter, each component of the apparatus 1000 will be described in detail in an exemplary embodiment in which an infra-red (IR) stereo camera with a brightness-adjustable IR light source is used to capture stereo images. In this way, only objects which are illuminated by the light source will be captured by the binocular imaging system. Note that the images may be captured by any other kind of imaging system and the present application is not limited thereto. For simplicity, the binocular imaging system is assumed to be calibrated; that is, image rectification is performed for every stereo image frame.
For the binocular imaging system, the stereo frame has at least two images, namely a left image I1 captured by the left camera and a right image I2 captured by the right camera. Hereinafter, the first and second images may refer to either of the left and right images in the stereo image frame (I1, I2), unless otherwise specifically stated.
Referring to Fig. 2, the segmenting device 200 may further comprise a chooser 201, an extractor 202 and a detector 203. In particular, the chooser 201 may choose a representative point for identifying each of the hand parts from the hand region, the extractor 202 may extract a connected component of each hand part according to the chosen representative point, and the detector 203 may detect the corresponding feature points of each hand part according to the extracted connected component, so as to segment at least one hand part with the detected feature points.
The segmenting device 200 may segment the hand region into a plurality of hand parts at least comprising, for example, five finger parts and a palm part. In order to identify a hand part, each hand part is assigned a representative point so as to distinguish it from the other hand parts. In an embodiment, the chooser 201 may use a geometric method, choosing the most protruding point in the hand region as the representative point for identifying a finger part. In another embodiment, the chooser 201 may use an intensity approach, choosing the point with the highest intensity (i.e., the brightest point) in the hand region as the representative point of a finger part. For the palm part, the chooser 201 may choose a center of the hand region as its representative point. Note that other properties of the hand can also be used to identify finger or palm parts according to different imaging systems and the present application is not limited thereto.
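For illustration, the two choosing strategies might be sketched as follows; the distance-transform definition of the hand-region center is an assumption (the patent only speaks of "a center of the hand region"):

    import cv2
    import numpy as np

    def brightest_point(ir_image, hand_mask):
        # Intensity approach: highest-intensity point inside the hand region.
        masked = np.where(hand_mask > 0, ir_image, 0)
        y, x = np.unravel_index(np.argmax(masked), masked.shape)
        return np.array([x, y])

    def palm_center(hand_mask):
        # One plausible hand-region center: the point deepest inside the
        # mask, i.e. the maximum of the distance transform.
        dist = cv2.distanceTransform(hand_mask, cv2.DIST_L2, 5)
        y, x = np.unravel_index(np.argmax(dist), dist.shape)
        return np.array([x, y])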
The extractor 202 may extract the connected component of each hand part according to its representative point. The connected component consists of a set of image points around the representative point of the hand part. For the palm part, the connected component is a set of image points around its representative point within the average palm radius. For the geometric approach, each connected component of a finger part is a set of image points around the protruding point within a distance not exceeding an average finger length. In another embodiment, the connected component of the finger part is a set of image points such that the image points at the same distance to the protruding point do not span more than an average finger width. In another embodiment, for finger parts identified using the intensity-based approach, the connected component is a set of image points around the brightest point such that the intensities of the image points at contour lines of the distance map (i.e., at equal distances from the brightest point) are lower than a certain threshold. The average palm radius, the average finger length and the average finger width are pre-determined.
In an embodiment, the segmenting device 200 may further comprise a remover 204 configured to remove the hand part, comprising the chosen representative point and the extracted connected component, from the hand region, such that the choosing, the extracting and the removing are performed repeatedly in the remaining hand region until no hand part needs to be removed. In this way, hand parts are segmented iteratively, with a single new hand part recovered from the remaining hand region in each iteration. In an embodiment, the chooser 201 may first choose the most protruding point as the representative point of a hand part. Then, the extractor 202 may extract the connected component around the protruding point within a distance not exceeding the predetermined average finger length. The currently searched hand part, comprising the chosen representative point and the extracted connected component, is then removed from the hand region. The processes of the chooser 201, the extractor 202 and the remover 204 are performed repeatedly in the remaining hand region until the whole hand region has been searched.
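A sketch of this iterative segment-and-remove loop is given below. Realizing the "within a distance" constraint as a breadth-first geodesic growth, and all function names, are assumptions rather than details given in the patent:

    from collections import deque
    import numpy as np

    def extract_component(mask, seed, max_dist):
        # Grow a connected set of mask points around `seed`, stopping at a
        # geodesic distance of `max_dist` pixels (e.g. the average finger length).
        h, w = mask.shape
        comp = np.zeros_like(mask, dtype=bool)
        comp[seed[1], seed[0]] = True
        queue = deque([(seed, 0)])
        while queue:
            (x, y), d = queue.popleft()
            if d >= max_dist:
                continue
            for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if 0 <= nx < w and 0 <= ny < h and mask[ny, nx] and not comp[ny, nx]:
                    comp[ny, nx] = True
                    queue.append(((nx, ny), d + 1))
        return comp

    def segment_hand_parts(hand_mask, choose_point, avg_finger_len):
        # Repeat choose -> extract -> remove until the region is exhausted.
        remaining = hand_mask.astype(bool).copy()
        parts = []
        while remaining.any():
            seed = choose_point(remaining)  # e.g. the most protruding point
            if seed is None:
                break
            comp = extract_component(remaining, seed, avg_finger_len)
            parts.append((seed, comp))
            remaining &= ~comp              # remove the searched hand part
        return parts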
The detector 203 may detect the corresponding feature points of each of the hand parts according to each connected component extracted by the extractor 202. The feature points should be distributed widely enough to cover the whole hand part and be discriminative, so that the 2D image projections of different 3D points are distinguishable from each other. In an embodiment, the image points located on the boundary of the connected component of a hand part are used as the feature points of the hand part.
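As one possible realization (the use of cv2.findContours is an assumption; the patent does not name a routine), the boundary feature points can be read off the component's contour:

    import cv2
    import numpy as np

    def boundary_feature_points(component_mask):
        # Return the 2D image points on the boundary of a hand-part mask.
        contours, _ = cv2.findContours(component_mask.astype(np.uint8),
                                       cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
        if not contours:
            return np.empty((0, 2), dtype=np.int32)
        return np.vstack([c.reshape(-1, 2) for c in contours])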
In an embodiment, the segmenting device 200 may further comprise a validator 205. To validate the correctness of a segmented hand part, the validator 205 is configured to validate whether the segmented hand part is a finger part according to the extracted connected component of the hand part. If it does not belong to a finger part, the segmented hand part is considered as a part of the palm. A length-to-width ratio, defined as the ratio of the length to the width of the hand part, is used to determine whether the current hand part is a valid finger. The length and the width of the hand part are provided by the connected component extracted by the extractor 202. Note that other properties relating to the representative point can also be used to provide useful cues to facilitate the validation of the hand part and the present application is not limited thereto.
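A minimal sketch of this validation; the ratio threshold of 1.5 is a placeholder, since the patent does not give a value:

    def is_valid_finger(length, width, min_ratio=1.5):
        # A segmented part counts as a finger if it is sufficiently elongated;
        # otherwise it is merged into the palm part.
        if width <= 0:
            return False
        return (length / width) >= min_ratio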
Referring to Fig. 1 again, the acquiring device 300 is configured to acquire a matched hand part pair in which each hand part in the first image is matched with a hand part in the second image. In an embodiment, the acquiring device 300 may further acquire the matched feature point pairs in each matched hand part.
For a set of hand parts H = (F1, F2, ..., F5, P) segmented by the segmenting device 200, the first five components represent the five finger parts and the last one represents the palm part. Herein, for the palm part P, the center of the hand region is chosen as the representative point p_p. The acquiring device 300 may acquire only the matched finger part pairs, in which each finger part in the first image is matched with a finger part in the second image according to the representative point of the hand part.
In particular, for the stereo frame $(I_1, I_2)$, each finger part $(F_1)_i$ in the first image $I_1$ is matched to a finger part $(F_2)_j$ in the second image $I_2$ by measuring the difference between the distance of the representative point of $(F_1)_i$ relative to the palm center $p_{p1}$ and the distance of the representative point of $(F_2)_j$ relative to the palm center $p_{p2}$, as follows:

$$ j^{*} = \arg\min_{j} \left| \, \left\| (p_{f1})_i - p_{p1} \right\| - \left\| (p_{f2})_j - p_{p2} \right\| \, \right| \qquad (1) $$

where $(p_{f1})_i$ and $(p_{f2})_j$ represent the $i$-th and the $j$-th representative points of the finger parts $(F_1)_i$ and $(F_2)_j$ in $I_1$ and $I_2$, respectively, and $i, j = 1, 2, \ldots, 5$.
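A sketch of this matching step under the reconstruction of formula (1) above; the greedy one-to-one assignment is an implementation assumption:

    import numpy as np

    def match_finger_parts(reps1, palm1, reps2, palm2):
        # reps1, reps2: (N, 2) arrays of finger representative points in I1, I2;
        # palm1, palm2: the palm centers p_p1 and p_p2.
        d1 = np.linalg.norm(reps1 - palm1, axis=1)  # ||(p_f1)_i - p_p1||
        d2 = np.linalg.norm(reps2 - palm2, axis=1)  # ||(p_f2)_j - p_p2||
        cost = np.abs(d1[:, None] - d2[None, :])    # the measure in formula (1)
        pairs, used = [], set()
        for i in range(len(d1)):
            candidates = [j for j in range(len(d2)) if j not in used]
            if not candidates:
                break
            j = min(candidates, key=lambda k: cost[i, k])
            used.add(j)
            pairs.append((i, j))
        return pairs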
From the matched hand parts, the acquiring device 300 may further acquire the matched feature point pairs, that is, the 2D image points and the hand part labels associated with the 2D image points. Herein, the disparities $d(p_H)$ of all the feature points $x = (x, y)^T$ in the same hand part $H$ are assumed to be the same as that of the associated representative point $p_H$. In other words, for the stereo frame $(I_1, I_2)$, the correspondence of a feature point $x_1$ in the first image $I_1$ is defined to be

$$ \hat{x}_2 = x_1 + \left( d(p_H), \, 0 \right)^T \qquad (2) $$

in the second image $I_2$. The disparity may be provided by the generating device 400, which will be described later. After rejecting some of the impossible correspondences, the optimal matched feature point $x_2$ for $x_1$ is defined as follows:

$$ x_2 = \arg\min_{x'_2} \left\| x'_2 - \hat{x}_2 \right\| \qquad (3) $$

where the $x'_2$'s are the 2D image positions of the feature points around $\hat{x}_2$. Then, the acquiring device 300 acquires the matched feature point pair $(x_1, x_2)$.
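Illustratively, formulas (2) and (3) amount to predicting each correspondence with the part's shared disparity and snapping it to the nearest feature point in the second image; the rejection radius below is an assumed parameter:

    import numpy as np

    def match_feature_points(feats1, feats2, part_disparity, max_offset=5.0):
        # feats1, feats2: (N, 2) / (M, 2) feature points of a matched part in I1 / I2;
        # part_disparity: the disparity d(p_H) of the part's representative point,
        # assumed shared by all feature points of the part.
        matches = []
        for x1 in feats1:
            x2_hat = x1 + np.array([part_disparity, 0.0])   # formula (2)
            dists = np.linalg.norm(feats2 - x2_hat, axis=1)
            j = int(np.argmin(dists))
            if dists[j] <= max_offset:   # reject impossible correspondences
                matches.append((x1, feats2[j]))             # formula (3)
        return matches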
Referring to Fig. 3, the generating device 400 may comprise an establisher 401, a determiner 402 and a fitter 403. The establisher 401 may establish a 3D point cloud for the first image and the second image from the matched feature point pairs of each hand part. Each of the matched feature point pairs may comprise hand part labels associated with the 2D coordinates of the feature points. The determiner 402 may determine whether the established 3D point cloud of a hand part belongs to a finger part or not according to the hand part label. The fitter 403 may fit each established 3D point cloud with a specific 3D model according to the hand part label associated with the 3D point cloud.
For the matched feature point pair $(x_1, x_2)$, the depth $Z(x_1, x_2)$ is defined as follows:

$$ Z(x_1, x_2) = \frac{f \, b}{d} \qquad (4) $$

where $d = x_2 - x_1$ represents the disparity for the matched feature point pair $(x_1, x_2)$, and $f$ and $b$ represent the focal length and the baseline of the stereo camera after rectification, respectively.

Therefore, the establisher 401 establishes the 3D point cloud, such that the 3D position $X_1$ with respect to the camera center associated with $I_1$ is defined as follows:

$$ X_1 = \frac{Z(x_1, x_2)}{f} \left( x_1, \, y_1, \, f \right)^T $$
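A sketch of formula (4) and the back-projection; image coordinates are assumed to be measured relative to the principal point, and taking the absolute disparity is an assumption about sign handling:

    import numpy as np

    def triangulate(x1, x2, f, b):
        # x1, x2: matched 2D points (relative to the principal point);
        # f: focal length in pixels; b: baseline after rectification.
        d = x2[0] - x1[0]                 # disparity d = x2 - x1
        Z = f * b / abs(d)                # formula (4)
        X1 = (Z / f) * np.array([x1[0], x1[1], f])  # 3D point w.r.t. camera of I1
        return Z, X1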
Then, the determiner 402 may determine whether the established 3D point cloud of one hand part belongs to a finger part or not according to the hand part label, such that the fitter 403 may fit the established 3D point cloud with a specific 3D model.
If the established 3D point cloud of a hand part is determined as belonging to a finger, a 3D finger model fitting is performed by the fitter 403. Herein, a finger is modeled as a cylinder in 3D space, which is further simplified to a line segment. The line segment can be parameterized by the finger length $L$, the 3D coordinates of the fingertip $P_f$ and a unit direction vector $\hat{v}$ of the finger, wherein $L$ may be pre-determined by the segmenting device 200. The parameters $P_f$ and $\hat{v}$ may be initialized, and their optimal values can be obtained by using a gradient descent optimization to minimize the total distance from all 3D feature points of the finger part to the line segment. Therefore, a cost function $E(P_f, \hat{v})$ is defined as follows:

$$ E(P_f, \hat{v}) = \sum_{i} \left\| \left( (X_f)_i - P_f \right) - \left( \left( (X_f)_i - P_f \right) \cdot \hat{v} \right) \hat{v} \right\|^2 \qquad (5) $$

where $P_f$ represents the 3D coordinates of the fingertip of the finger part; $\hat{v}$ represents the unit direction vector of the finger part; and $(X_f)_i$ represents the $i$-th point of the 3D point cloud for the finger part. From this, the 3D finger model of the finger part is constructed accordingly.
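The fitting of formula (5) could be sketched as plain gradient descent on the fingertip and direction; the central-difference gradients, step size and iteration count are implementation assumptions:

    import numpy as np

    def finger_cost(params, points):
        # params = [P_f (3 values), v (3 values)]; v is normalized inside.
        p_f, v = params[:3], params[3:]
        v = v / np.linalg.norm(v)
        r = points - p_f                      # vectors from the fingertip
        perp = r - np.outer(r @ v, v)         # residuals perpendicular to the line
        return np.sum(perp ** 2)              # formula (5)

    def fit_finger(points, p0, v0, lr=1e-3, iters=200, eps=1e-5):
        params = np.concatenate([p0, v0 / np.linalg.norm(v0)])
        for _ in range(iters):
            grad = np.zeros(6)
            for k in range(6):                # central-difference gradient
                dp = np.zeros(6)
                dp[k] = eps
                grad[k] = (finger_cost(params + dp, points)
                           - finger_cost(params - dp, points)) / (2 * eps)
            params -= lr * grad
        p_f, v = params[:3], params[3:]
        return p_f, v / np.linalg.norm(v)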
On the other hand, if the established 3D point cloud of a hand part is determined as belonging to the palm, a 3D palm model fitting is performed by the fitter 403. Herein, the palm is modeled as a 3D circle, parameterized by a palm center $C_p$, a radius $r$ and a surface unit normal $\hat{n}$. After the palm center $C_p$, the radius $r$ and the unit normal $\hat{n}$ are initialized, a gradient descent optimization is performed on $E(C_p, \hat{n})$ to minimize the total distance from all 3D points to the 3D circle and its variance. A cost function $E(C_p, \hat{n})$ is defined as follows:

$$ E(C_p, \hat{n}) = \sum_{i} \left( \left( (X_p)_i - C_p \right) \cdot \hat{n} \right)^2 + \lambda \sum_{i} \left( \left\| (X_p)_i - C_p \right\| - \bar{r} \right)^2 \qquad (6) $$

where $(X_p)_i$ represents the $i$-th point of the 3D point cloud for the palm part; $\bar{r}$ represents the mean distance of the $(X_p)_i$'s to the palm center $C_p$; and $\lambda$ represents an adjustment factor. After that, the radius $r$ is re-estimated according to the calculated $C_p$.
Then, the above two steps are performed iteratively such that a final 3D hand model can be obtained.
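Under the reconstruction of formula (6) given above, the palm fitting and the radius re-estimation could be sketched as follows; the cost form and the numerical gradient descent follow the surrounding text but remain assumptions:

    import numpy as np

    def palm_cost(params, points, lam=1.0):
        # params = [C_p (3 values), n (3 values)]; n is normalized inside.
        c_p, n = params[:3], params[3:]
        n = n / np.linalg.norm(n)
        r_vec = points - c_p
        out_of_plane = r_vec @ n                  # distance to the palm plane
        radial = np.linalg.norm(r_vec, axis=1)    # distances to the palm center
        return (np.sum(out_of_plane ** 2)
                + lam * np.sum((radial - radial.mean()) ** 2))  # formula (6)

    def fit_palm(points, c0, n0, lr=1e-3, iters=200, eps=1e-5):
        params = np.concatenate([c0, n0 / np.linalg.norm(n0)])
        for _ in range(iters):
            grad = np.zeros(6)
            for k in range(6):                    # central-difference gradient
                dp = np.zeros(6)
                dp[k] = eps
                grad[k] = (palm_cost(params + dp, points)
                           - palm_cost(params - dp, points)) / (2 * eps)
            params -= lr * grad
        c_p, n = params[:3], params[3:]
        radius = np.linalg.norm(points - c_p, axis=1).mean()  # re-estimate r
        return c_p, n / np.linalg.norm(n), radius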
Fig. 5 is a flowchart illustrating a method 2000 for constructing a 3D hand model, and Figs. 6 and 7 are flowcharts respectively illustrating the segmenting step S502 and the generating step S504 of the method shown in Fig. 5. Hereinafter, the method 2000 will be described in detail with reference to Figs. 5-7.
As shown in Fig. 5, at step S501, a hand region may be retrieved from a stereo frame comprising at least a first image and a second image. At step S502, one or more hand parts, each consisting of a number of feature points, may be segmented from the hand region retrieved at step S501. Then, at step S503, for each hand part, a plurality of matched feature point pairs, in which the feature points in the first image are matched with the corresponding feature points in the second image, are acquired. After that, at step S504, a 3D model of each hand part may be generated based on the matched feature point pairs of the hand parts to construct the 3D hand model.
In an embodiment, the step S502 shown in Fig. 5 may further comprise steps S5021 to S5023 as shown in Fig. 6. Referring to Fig. 6, at S5021, a representative point for identifying each of the hand parts is chosen from the hand region. At S5022, the connected component of each of the hand parts is extracted according to the chosen representative point. Then, at S5023, the corresponding feature points of each of the hand parts are detected according to the extracted connected component, so as to segment at least one hand part with the detected feature points.
In an embodiment, the step S503 further comprises a step of acquiring a matched hand part pair, in which each hand part in the first image is matched with a hand part in the second image according to the representative point of the hand part, and a step of acquiring the matched feature point pairs in each matched hand part.
In an embodiment, the step S504 shown in Fig. 5 further comprises steps S5041 to S5044 as shown in Fig. 7. Each matched feature point pair of a hand part acquired at step S503 may comprise hand part labels associated with the 2D coordinates of the feature points. At step S5041, a 3D point cloud is established for the first image and the second image from the matched feature point pairs of each hand part. At step S5042, whether the established 3D point cloud of a hand part belongs to a finger part or not is determined according to the hand part label. If it is determined that the hand part is a finger part, then, at step S5043, a 3D finger model fitting process is performed, which may be governed by the above-mentioned formula (5). If not, at step S5044, a 3D palm model fitting process is performed, which may be governed by the above-mentioned formula (6).
Fig. 8 illustrates a system 3000 for constructing a 3D hand model consistent with an embodiment of the present application. Referring to Fig. 8, the system 3000 comprises a memory 3001 that stores executable components and a processor 3002, electrically coupled to the memory 3001 to execute the executable components to perform operations of the system 3000. The executable components may comprise: a retrieving component 3003 configured to retrieve a hand region from a stereo frame comprising at least a first image and a second image; a segmenting component 3004 configured to segment one or more hand parts each consisting of a number of feature points from the retrieved hand region; an acquiring component 3005 configured to, for each hand part, acquire a plurality of matched feature point pairs in which the feature points in the first image are matched with corresponding feature points in the second image; and a generating component 3006 configured to generate a 3D model of each hand part based on the matched feature point pairs of the hand parts to construct the 3D hand model. The functions of the components 3003 to 3006 are similar to those of the devices 100 to 400, respectively, and thus the detailed descriptions thereof are omitted herein.
Although the preferred examples of the present invention have been described, those skilled in the art can make variations or modifications to these examples upon learning the basic inventive concept. The appended claims are intended to be construed as covering the preferred examples and all the variations or modifications that fall within the scope of the present invention.
Obviously, those skilled in the art can make variations or modifications to the present invention without departing from the spirit and scope of the present invention. As such, if these variations or modifications belong to the scope of the claims and equivalent techniques, they also fall within the scope of the present invention.

Claims (19)

  1. An apparatus for constructing a 3D hand model comprising:
    a retrieving device configured to retrieve a hand region from a stereo frame comprising at least a first image and a second image;
    a segmenting device in electrical communication with the retrieving device and configured to segment one or more hand parts each consisting of a number of feature points from the retrieved hand region;
    an acquiring device electrically coupled with the segmenting device and configured to, for each segmented hand part, acquire a plurality of matched feature point pairs in which the feature points in the first image are matched with corresponding feature points in the second image; and
    a generating device in electrical communication with the acquiring device and configured to generate a 3D model of each hand part based on the matched feature point pairs of the hand parts to construct the 3D hand model.
  2. The apparatus of claim 1, wherein the segmenting device further comprises:
    a chooser configured for choosing a representative point for identifying each of the hand parts from the hand region;
    an extractor configured for extracting a connected component of each of the hand parts according to the chosen representative point;
    a detector configured for detecting feature points of each of the hand parts according to the extracted connected component to segment at least one hand part with the detected feature points.
  3. The apparatus of claim 2, wherein the segmenting device further comprises:
    a remover configured for removing the hand part comprising the chosen representative point and the extracted connected component from the hand region, wherein the choosing, the extracting and the removing are performed repeatedly in the remaining hand region until no hand part needs to be removed from the hand region.
  4. The apparatus of claim 2, wherein the segmenting device further comprises:
    a validator configured to validate whether the segmented hand part is a finger part according to the extracted connected component.
  5. The apparatus of claim 2, wherein the hand parts at least comprise a plurality of finger parts, and wherein
    the most protruding point in the hand region is chosen as the representative point for identifying a finger part, and
    the connected component of one of the finger parts is a set of image points around the representative point within a distance not exceeding a predetermined finger length.
  6. The apparatus of claim 2, wherein the hand parts at least comprise a palm part, and wherein
    a center of the hand region is chosen as the representative point for identifying the palm part, and
    the connected component of the palm part is a set of image points around the representative point within a predetermined palm radius.
  7. The apparatus of claim 2, wherein the acquiring device is further configured to acquire a matched hand part pair in which each hand part in the first image is matched with a hand part in the second image, according to the representative point of each hand part; and acquire the matched feature point pairs in each matched hand part.
  8. The apparatus of claim 1, wherein each of the matched feature point pairs comprises hand part labels associated with 2D coordinates of the feature points, and the generating device further comprises:
    an establisher configured to establish a 3D point cloud for the first image and the second image from the matched feature point pairs of each hand part;
    a determiner configured to determine whether the established 3D point cloud of a hand part belongs to a finger part or not according to the hand part label; and
    a fitter configured to fit the established 3D point cloud with a specific 3D model according to the hand part label associated with the 3D point cloud.
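The establisher of claim 8 reduces to standard stereo triangulation once the pairs are known. The sketch below assumes a rectified rig with the principal point at the image origin, so depth follows Z = fB/d; the focal length and baseline are calibration inputs, not claim elements. The determiner and fitter would then branch on the hand part label, for instance fitting an axis to a finger cloud and a plane to the palm cloud, a choice this sketch does not fix.

```python
import numpy as np

def triangulate_part(pairs, focal_px, baseline):
    # Establisher of claim 8: lift matched feature point pairs, given as
    # ((u1, v1), (u2, v2)) 2D coordinates, into a 3D point cloud.
    cloud = []
    for (u1, v1), (u2, v2) in pairs:
        disparity = u1 - u2
        if disparity <= 0:          # discard degenerate or mismatched pairs
            continue
        z = focal_px * baseline / disparity
        cloud.append((u1 * z / focal_px, v1 * z / focal_px, z))
    return np.asarray(cloud)
```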
  9. The apparatus of claim 1, wherein the retrieving device is further configured to capture the stereo frame of a user’s hand from a binocular image system and retrieve the largest connected component of each of the first and the second images as the hand region.
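The retrieval step of claim 9 is a standard largest-connected-component pass. A sketch using OpenCV's connected-components routine is given below; the claim does not prescribe a library, and binarisation of the captured frame is assumed to have happened upstream.

```python
import cv2
import numpy as np

def retrieve_hand_region(binary_image):
    # Claim 9: keep the largest connected component of the binarised
    # (8-bit, single-channel) frame as the hand region.
    count, labels, stats, _ = cv2.connectedComponentsWithStats(binary_image)
    if count <= 1:                  # label 0 is the background
        return np.zeros_like(binary_image)
    # Skip the background row when searching for the largest area.
    largest = 1 + int(np.argmax(stats[1:, cv2.CC_STAT_AREA]))
    return (labels == largest).astype(np.uint8)
```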
  10. A method for constructing a 3D hand model comprising:
    retrieving a hand region from a stereo frame comprising at least a first image and a second image;
    segmenting, from the retrieved hand region, one or more hand parts each consisting of a number of feature points;
    acquiring, for each hand part, a plurality of matched feature point pairs in which the feature points in the first image are matched with the corresponding feature points in the second image; and
    generating a 3D model of each hand part based on the matched feature point pairs of the hand parts to construct the 3D hand model.
  11. The method of claim 10, wherein the segmenting further comprises:
    choosing a representative point for identifying each of the hand parts from the hand region;
    extracting a connected component of each of the hand parts according to the chosen representative point; and
    detecting the feature points of each of the hand parts according to the extracted connected component to segment at least one hand part with the detected feature points.
  12. The method of claim 11, wherein the segmenting further comprises:
    removing the hand part comprising the chosen representative point and the extracted connected component from the hand region, wherein the choosing, the extracting and the removing are performed repeatedly on the remaining hand region until no hand part needs to be removed from the hand region.
  13. The method of claim 11, wherein the segmenting further comprises:
    validating whether the segmented hand part is a finger part according to the extracted connected component.
  14. The method of claim 11, wherein the hand parts at least comprise a plurality of finger parts, and wherein
    the most protruding point in the hand region is chosen as the representative point for identifying a finger part, and
    the connected component of one of the finger parts is a set of image points around the representative point within a distance not exceeding a predetermined finger length.
  15. The method of claim 11, wherein the hand parts at least comprise a palm part, and wherein
    a center of the hand region is chosen as the representative point for identifying the palm part, and
    the connected component of the palm part is a set of image points around the representative point within a predetermined palm radius.
  16. The method of claim 11, wherein the acquiring further comprises:
    acquiring a matched hand part pair in which each hand part in the first image is matched with a hand part in the second image, according to the representative point of each hand part; and
    acquiring the matched feature point pairs in each matched hand part.
  17. The method of claim 10, wherein each of the matched feature point pairs comprises hand part labels associated with 2D coordinates of the feature points, and the generating further comprises:
    establishing a 3D point cloud for the first image and the second image from the matched feature point pairs of each hand part;
    determining whether the established 3D point cloud of a hand part belongs to a finger part or not according to the hand part label; and
    fitting the established 3D point cloud with a specific 3D model according to the hand part label associated with the 3D point cloud.
  18. The method of claim 10, further comprising:
    capturing the stereo frame of a user’s hand from a binocular image system; and
    retrieving the largest connected component of each of the first and the second images as the hand region.
  19. A system for constructing a 3D hand model, comprising:
    a memory that stores executable components; and
    a processor, electrically coupled to the memory to execute the executable components, to:
    retrieve a hand region from a stereo frame comprising at least a first image and a second image;
    segment one or more hand parts each consisting of a number of feature points from the retrieved hand region;
    acquire, for each hand part, a plurality of matched feature point pairs in which the feature points in the first image are matched with corresponding feature points in the second image; and
    generate a 3D model of each hand part based on the matched feature point pairs of the hand parts to construct the 3D hand model.

Priority Applications (2)

Application Number | Priority Date | Filing Date | Title
PCT/CN2015/074447 (WO2016145625A1, en) | 2015-03-18 | 2015-03-18 | 3d hand pose recovery from binocular imaging system
CN201580077259.9A (CN108140243B, en) | 2015-03-18 | 2015-03-18 | Method, device and system for constructing 3D hand model

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
PCT/CN2015/074447 (WO2016145625A1, en) | 2015-03-18 | 2015-03-18 | 3d hand pose recovery from binocular imaging system

Publications (1)

Publication Number | Publication Date
WO2016145625A1 (en) | 2016-09-22

Family

ID=56919552

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
PCT/CN2015/074447 (WO2016145625A1, en) | 3d hand pose recovery from binocular imaging system | 2015-03-18 | 2015-03-18

Country Status (2)

Country | Link
CN (1) | CN108140243B (en)
WO (1) | WO2016145625A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party

Publication Number | Priority Date | Publication Date | Assignee | Title
WO2007130122A2 * | 2006-05-05 | 2007-11-15 | Thomson Licensing | System and method for three-dimensional object reconstruction from two-dimensional images
US20110316963A1 * | 2008-12-30 | 2011-12-29 | Huawei Device Co., Ltd. | Method and device for generating 3d panoramic video streams, and videoconference method and device
CN102164233A * | 2009-12-25 | 2011-08-24 | 卡西欧计算机株式会社 | Imaging device and 3d modeling data creation method
CN102208116A * | 2010-03-29 | 2011-10-05 | 卡西欧计算机株式会社 | 3D modeling apparatus and 3D modeling method

Family Cites Families (6)

* Cited by examiner, † Cited by third party

Publication Number | Priority Date | Publication Date | Assignee | Title
US6873723B1 * | 1999-06-30 | 2005-03-29 | Intel Corporation | Segmenting three-dimensional video images using stereo
CN101038671A * | 2007-04-25 | 2007-09-19 | 上海大学 | Tracking method of three-dimensional finger motion locus based on stereo vision
CN101763636B * | 2009-09-23 | 2012-07-04 | 中国科学院自动化研究所 | Method for tracing position and pose of 3D human face in video sequence
CN101720047B * | 2009-11-03 | 2011-12-21 | 上海大学 | Method for acquiring range image by stereo matching of multi-aperture photographing based on color segmentation
CN102982557B * | 2012-11-06 | 2015-03-25 | 桂林电子科技大学 | Method for processing space hand signal gesture command based on depth camera
CN103714345B * | 2013-12-27 | 2018-04-06 | Tcl集团股份有限公司 | Method and system for detecting the spatial position of a fingertip by binocular stereo vision

Also Published As

Publication Number | Publication Date
CN108140243A (en) | 2018-06-08
CN108140243B (en) | 2022-01-11

Similar Documents

Publication Publication Date Title
US9286694B2 (en) Apparatus and method for detecting multiple arms and hands by using three-dimensional image
JP6125188B2 (en) Video processing method and apparatus
WO2015161816A1 (en) Three-dimensional facial recognition method and system
CN104933389B (en) Identity recognition method and device based on finger veins
US9020251B2 (en) Image processing apparatus and method
JP2010176380A (en) Information processing device and method, program, and recording medium
EP3345123B1 (en) Fast and robust identification of extremities of an object within a scene
JP2007249592A (en) Three-dimensional object recognition system
JP2016014954A (en) Method for detecting finger shape, program thereof, storage medium of program thereof, and system for detecting finger shape
Park et al. Hand detection and tracking using depth and color information
JP2014170368A (en) Image processing device, method and program and movable body
Raheja et al. Hand gesture pointing location detection
KR20170053807A (en) A method of detecting objects in the image with moving background
JP2018081402A5 (en)
KR20170023565A (en) method for finger counting by using image processing and apparatus adopting the method
CN106406507B (en) Image processing method and electronic device
JP6393495B2 (en) Image processing apparatus and object recognition method
KR101339616B1 (en) Object detection and tracking method and device
Wang et al. Skin Color Weighted Disparity Competition for Hand Segmentation from Stereo Camera.
WO2016145625A1 (en) 3d hand pose recovery from binocular imaging system
JP2012003724A (en) Three-dimensional fingertip position detection method, three-dimensional fingertip position detector and program
Dehankar et al. Using AEPI method for hand gesture recognition in varying background and blurred images
Li et al. Algorithm of fingertip detection and its improvement based on kinect
KR101706674B1 (en) Method and computing device for gender recognition based on long distance visible light image and thermal image
KR20160062913A (en) System and Method for Translating Sign Language for Improving the Accuracy of Lip Motion Device

Legal Events

Code | Description | Reference data
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 15885008; Country of ref document: EP; Kind code of ref document: A1
NENP | Non-entry into the national phase | Ref country code: DE
122 | Ep: pct application non-entry in european phase | Ref document number: 15885008; Country of ref document: EP; Kind code of ref document: A1