WO2022255642A1 - Weight-reduced hand joint prediction method and device for implementation of real-time hand motion interface of augmented reality glass device - Google Patents

Weight-reduced hand joint prediction method and device for implementation of real-time hand motion interface of augmented reality glass device

Info

Publication number
WO2022255642A1
Authority
WO
WIPO (PCT)
Prior art keywords
hand
joint
augmented reality
real
prediction
Prior art date
Application number
PCT/KR2022/005823
Other languages
French (fr)
Korean (ko)
Inventor
최치원
조성동
김정환
백지엽
민경진
이강휘
Original Assignee
주식회사 피앤씨솔루션
Priority date
Filing date
Publication date
Application filed by 주식회사 피앤씨솔루션
Publication of WO2022255642A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G PHYSICS
    • G02 OPTICS
    • G02B OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B 27/00 Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B 27/01 Head-up displays
    • G PHYSICS
    • G02 OPTICS
    • G02B OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B 27/00 Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B 27/01 Head-up displays
    • G02B 27/017 Head mounted
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/002 Specific input/output arrangements not covered by G06F3/01 - G06F3/16
    • G06F 3/005 Input arrangements through a video camera
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 19/00 Manipulating 3D models or images for computer graphics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 19/00 Manipulating 3D models or images for computer graphics
    • G06T 19/006 Mixed reality
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T 7/251 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20 Movements or behaviour, e.g. gesture recognition

Definitions

  • The present invention relates to a hand joint prediction method and device, and more particularly, to a lightweight hand joint prediction method and device for implementing a real-time hand motion interface in an augmented reality glasses device.
  • A head mounted display (HMD), a kind of wearable device, refers to a device that can be worn on a user's head to receive multimedia content and the like.
  • The head mounted display (HMD) is worn on the user's body and provides images to the user in various environments as the user moves.
  • Head-mounted displays (HMDs) are classified into see-through and see-closed types.
  • The see-through type is mainly used for augmented reality (AR), and the closed type is mainly used for virtual reality (VR).
  • For an HMD for augmented reality (hereinafter, an augmented reality glasses device), gesture (hand motion) recognition is important in order to interact with the wearer easily and conveniently without a separate input device.
  • To implement a hand gesture interface that controls the augmented reality glasses device with hand gestures, the hand gestures must first be detected accurately.
  • As prior art related to the present invention, Korean Patent Registration No. 10-2102309 (Title of Invention: Object Recognition Method for 3D Virtual Space of Head-Worn Display Device; Registration Date: April 13, 2020) has been disclosed.
  • The present invention is proposed to solve the above problems of the previously proposed methods: it detects candidate keypoints of hand joints in the entire input image without a hand detection process, and then predicts at least one hand joint included in the input image based on the correlations between the candidate keypoints using a joint evaluation model.
  • In this way, a plurality of hand joints can be predicted from a single round of candidate keypoint detection, without separate hand region detection and without a joint prediction process in each detected hand region.
  • The hand joint prediction process can therefore be simplified and lightened, and since the joint prediction time and computation do not increase in proportion to the number of hands in the input image, hand joints can be predicted quickly in real time in an embedded environment. Its purpose is to provide a lightweight hand joint prediction method and device for implementing a real-time hand motion interface of an augmented reality glasses device with these properties.
  • A lightweight hand joint prediction method for implementing a real-time hand motion interface of an augmented reality glasses device, according to a feature of the present invention for achieving the above object, is
  • a hand joint prediction method in which each step is performed in the augmented reality glasses device to implement the real-time hand motion interface, comprising: (1) storing a joint evaluation model that has learned correlations between hand joints based on an artificial neural network; (2) detecting, in real time, candidate keypoints that become hand joint candidates in the entire input image captured by the augmented reality glasses device; (3) evaluating the correlations between the candidate keypoints detected in step (2) using the joint evaluation model stored in step (1), and determining the joint points corresponding to hand joints among the candidate keypoints and the connection relationships between the joint points; and
  • (4) predicting at least one hand joint by connecting the joint points according to the connection relationships based on the determination result of step (3), wherein at least one hand joint included in the input image is predicted by performing steps (2) to (4) once.
  • Preferably, the joint evaluation model of step (1) may be configured by learning a hand joint point map and a joint relationship map based on an artificial neural network.
  • Preferably, in step (3), using the correlations between the candidate keypoints, the joint points may be selected by classifying them for each hand included in the input image, and the connection relationships between the joint points may be determined within each classified group.
  • Preferably, in step (2), the candidate keypoints are detected from a two-dimensional input image captured by the augmented reality glasses device,
  • and in step (4), three-dimensional hand joints may be predicted.
  • A lightweight hand joint prediction device for implementing a real-time hand motion interface of an augmented reality glasses device, according to a feature of the present invention for achieving the above object, is
  • a hand joint prediction device mounted on the augmented reality glasses device to implement the real-time hand motion interface, comprising:
  • a model storage unit for storing a joint evaluation model that has learned correlations between hand joints based on an artificial neural network; and a prediction unit for predicting at least one hand joint from an input image captured by the augmented reality glasses device.
  • The prediction unit includes:
  • a detection module for detecting, in real time, candidate keypoints that become hand joint candidates in the entire input image captured by the augmented reality glasses device;
  • a determination module for evaluating, using the joint evaluation model stored in the model storage unit, the correlations between the candidate keypoints detected by the detection module, and determining the joint points corresponding to hand joints among the candidate keypoints and the connection relationships between the joint points; and
  • a prediction module for predicting at least one hand joint by connecting the joint points according to the connection relationships based on the determination result of the determination module.
  • The prediction unit is characterized in that it predicts at least one hand joint included in the input image by operating the detection module, the determination module, and the prediction module sequentially, once.
  • Preferably, the model storage unit
  • may store the joint evaluation model configured by learning a hand joint point map and a joint relationship map based on an artificial neural network.
  • Preferably, the determination module,
  • using the correlations between the candidate keypoints, may select the joint points by classifying them for each hand included in the input image, and may determine the connection relationships between the joint points within each classified group.
  • Preferably, the detection module detects the candidate keypoints from a two-dimensional input image captured by the augmented reality glasses device,
  • and the prediction module may predict three-dimensional hand joints.
  • According to the proposed method and device, candidate keypoints of hand joints are detected in the entire input image without a hand detection process, and then at least one hand joint included in the input image is predicted based on the correlations between the candidate keypoints using the joint evaluation model.
  • FIG. 1 is a diagram showing the configuration of an augmented reality glasses device equipped with a lightweight hand joint prediction device for implementing a real-time hand motion interface of the augmented reality glasses device according to an embodiment of the present invention.
  • FIG. 2 is a diagram showing the configuration of a lightweight hand joint prediction device for implementing a real-time hand motion interface of an augmented reality glasses device according to an embodiment of the present invention.
  • FIG. 3 is a flowchart illustrating a lightweight hand joint prediction method for implementing a real-time hand motion interface of an augmented reality glasses device according to an embodiment of the present invention.
  • FIG. 4 is a diagram illustrating a prediction process according to a conventional hand joint prediction method.
  • FIG. 5 is a diagram showing a simplified flow of a conventional hand joint prediction method.
  • FIG. 6 is a diagram showing a simplified flow of a lightweight hand joint prediction method for implementing a real-time hand motion interface of an augmented reality glasses device according to an embodiment of the present invention.
  • FIG. 7 is a diagram illustrating a prediction process according to a lightweight hand joint prediction method for implementing a real-time hand motion interface of an augmented reality glasses device according to an embodiment of the present invention.
  • FIG. 1 is a diagram showing the configuration of an augmented reality glasses device 10 equipped with a lightweight hand joint prediction device 100 for implementing a real-time hand motion interface of the augmented reality glasses device 10 according to an embodiment of the present invention.
  • As shown in FIG. 1, the lightweight hand joint prediction device 100 for implementing a real-time hand motion interface of the augmented reality glasses device 10 according to an embodiment of the present invention may be mounted on the augmented reality glasses device 10.
  • That is, the augmented reality glasses device 10 may include the hand joint prediction device 100 to implement a real-time hand motion interface. More specifically, the hand joint prediction device 100 predicts hand joints from an input image captured by the camera 200 of the augmented reality glasses device 10 and transmits the predicted hand joints to the controller 300, so that the controller 300 can process the hand motion interface corresponding to the predicted hand joints.
  • Here, a hand joint refers to a skeleton-like frame connecting the joint points constituting the hand,
  • so predicting a hand joint may mean predicting the plurality of joint points constituting the hand and the connection relationships between those joint points.
  • Hand joint prediction estimates the skeletal shape of the hand so that a hand motion can be constructed from the estimated skeleton and used for the hand motion interface.
  • As shown in FIG. 2, the lightweight hand joint prediction device 100 for implementing a real-time hand motion interface of the augmented reality glasses device 10 according to an embodiment of the present invention is a hand joint prediction device 100 mounted on the augmented reality glasses device 10, and may comprise a model storage unit 110 for storing a joint evaluation model 111 that has learned correlations between hand joints based on an artificial neural network, and a prediction unit 120 for predicting at least one hand joint from an input image captured by the augmented reality glasses device 10.
  • The prediction unit 120 includes a detection module 121 for detecting, in real time, candidate keypoints that become hand joint candidates in the entire input image captured by the augmented reality glasses device 10; a determination module 122 for evaluating, using the joint evaluation model 111 stored in the model storage unit 110, the correlations between the candidate keypoints detected by the detection module 121, and determining the joint points corresponding to hand joints among the candidate keypoints and the connection relationships between the joint points; and a prediction module 123 for predicting at least one hand joint by connecting the joint points according to the connection relationships based on the determination result of the determination module 122. The prediction unit 120 may predict at least one hand joint included in the input image by operating the detection module 121, the determination module 122, and the prediction module 123 sequentially, once (a code-level sketch of this module structure follows below).
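  • For illustration only, the module structure above can be sketched in code. The following minimal Python outline is a hedged stand-in rather than the patented implementation: the class names, the dummy candidate keypoints, and the distance-based pair score are all assumptions introduced here to show how the detection module 121, determination module 122, and prediction module 123 could run sequentially, once per frame.

```python
import numpy as np

class JointEvaluationModel:
    """Stand-in for the trained joint evaluation model (111). Instead of a
    learned network, this dummy scores keypoint pairs by image distance."""
    def pairwise_score(self, p, q):
        return 1.0 / (1.0 + np.linalg.norm(np.asarray(p) - np.asarray(q)))

class HandJointPredictor:
    """Mirrors the described structure: a model storage unit (110) holding
    the model (111) and a prediction unit (120) with three modules."""
    def __init__(self, model):
        self.model = model  # model storage unit (110)

    def detect(self, frame):
        # Detection module (121): candidate keypoints (x, y, joint type)
        # over the WHOLE frame; fixed dummy values stand in for a detector.
        return [(10, 12, 0), (14, 18, 1), (90, 40, 0), (95, 47, 1)]

    def determine(self, candidates):
        # Determination module (122): keep pairs whose correlation score
        # clears a threshold; each kept pair is one joint connection.
        links = []
        for i in range(len(candidates)):
            for j in range(i + 1, len(candidates)):
                p, q = candidates[i], candidates[j]
                if self.model.pairwise_score(p[:2], q[:2]) > 0.05:
                    links.append((i, j))
        return links

    def predict(self, frame):
        # The three modules run sequentially, exactly once per frame,
        # regardless of how many hands appear in the image.
        candidates = self.detect(frame)
        links = self.determine(candidates)
        return candidates, links  # prediction module (123) connects these

predictor = HandJointPredictor(JointEvaluationModel())
print(predictor.predict(np.zeros((480, 640))))
```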
  • FIG. 3 is a flow diagram illustrating a lightweight hand joint prediction method for implementing a real-time hand motion interface of the augmented reality glasses device 10 according to an embodiment of the present invention.
  • As shown in FIG. 3, a lightweight hand joint prediction method for implementing a real-time hand motion interface of the augmented reality glasses device 10 according to an embodiment of the present invention is a hand joint prediction method in which each step is performed in the augmented reality glasses device 10, and
  • may be implemented including: storing a joint evaluation model 111 that has learned correlations between hand joints based on an artificial neural network (S100); detecting, in real time, candidate keypoints that become hand joint candidates in the entire input image captured by the augmented reality glasses device 10 (S200); evaluating the correlations between the candidate keypoints using the joint evaluation model 111, and determining the joint points corresponding to hand joints among the candidate keypoints and the connection relationships between the joint points (S300); and predicting at least one hand joint by connecting the joint points according to the connection relationships (S400).
  • Through a single pass of steps S200 to S400, at least one hand joint included in the input image can be predicted. More specifically, without a hand detection process that detects hand regions in the input image and without predicting hand joints separately in each individual hand region, candidate keypoints for hand joints are detected in the entire input image, and then at least one hand joint included in the input image is predicted at once based on the correlations between the candidate keypoints using the joint evaluation model 111.
  • The lightweight hand joint prediction method for implementing the real-time hand motion interface of the augmented reality glasses device 10 according to an embodiment of the present invention can predict a plurality of hand joints by performing steps S200 to S400 only once. Therefore, the hand joint prediction process can be simplified and lightened, and since the joint prediction time and computation do not increase in proportion to the number of hands in the input image, hand joints can be predicted quickly in real time in an embedded environment.
  • FIG. 4 is a diagram showing a prediction process according to a conventional hand joint prediction method, and FIG. 5 is a diagram showing a simplified flow of the conventional hand joint prediction method.
  • As shown in FIGS. 4 and 5, conventional hand joint prediction estimates joints in a top-down manner. That is, candidate regions that may contain a human hand are found in the input image (FIG. 4(a)), and each hand region is detected in the form of a bounding box by determining whether the detected candidate region is a real hand (FIG. 4(b)).
  • A hand image is then obtained by cropping the bounding box of each detected hand region, and hand joints are estimated by applying a pose estimation algorithm to each hand image (FIGS. 4(c) and 4(d)).
  • As such, in the conventional hand joint prediction method, when multiple hand candidate regions exist in one input image, both hand region detection and joint estimation in each hand region must be performed, so the hand detection resource usage and the time and computation required for joint estimation increase in proportion to the number of hand candidate regions. In a small embedded environment such as the augmented reality glasses device 10, computational resources therefore run short and unstable CPU resource symptoms such as momentary FPS drops appear, which makes the method unsuitable for real-time operation of the augmented reality glasses device 10.
  • FIG. 6 is a diagram showing a simplified flow of a lightweight hand joint prediction method for implementing a real-time hand motion interface of the augmented reality glasses device 10 according to an embodiment of the present invention, and FIG. 7 is a diagram showing a prediction process according to that method.
  • As shown in FIGS. 6 and 7, the lightweight hand joint prediction method for implementing a real-time hand motion interface of the augmented reality glasses device 10 according to an embodiment of the present invention detects candidate keypoints of hand joints in the entire input image (FIG. 7(a)) and predicts hand joints by evaluating the correlations between the candidate keypoints (FIG. 7(b)). In contrast to the conventional top-down method, this can be called a bottom-up method.
  • According to this bottom-up method, hand joints can be predicted without the candidate-region detection of the conventional top-down method shown in FIG. 5 and without repeating the procedure for every candidate region, so the method uses few resources and can achieve fast, real-time estimation speed.
  • Since the conventional top-down method obtains joint points from hand images cropped to the detected hand regions, the accuracy of its estimation result may be higher than that of the bottom-up method of the present invention.
  • However, the hand shapes used for a hand gesture interface are limited, and considering the small resource usage and high speed of the bottom-up method of the present invention, the lightweight hand joint prediction method for implementing the real-time hand motion interface of the augmented reality glasses device 10 according to an embodiment of the present invention
  • may be more efficient than the conventional method for real-time application in the embedded environment of the augmented reality glasses device 10.
  • The present invention relates to a lightweight hand joint prediction method for implementing a real-time hand motion interface of an augmented reality glasses device 10, and may be configured as software running on hardware including a memory and a processor.
  • The lightweight hand joint prediction method for implementing a real-time hand motion interface of the augmented reality glasses device 10 of the present invention may be stored in and implemented by the augmented reality glasses device 10.
  • In the following description, mention of the subject performing each step may be omitted.
  • In step S100, the joint evaluation model 111 obtained by learning the correlations between hand joints based on an artificial neural network may be stored. More specifically, the joint evaluation model 111 is stored in the model storage unit 110 of the hand joint prediction device 100, and the prediction unit 120 uses it to predict hand joints in the embedded environment of the augmented reality glasses device 10. In particular, the learning process, which requires substantial computing resources, may be handled on a server computer or the like, and the joint evaluation model 111 generated through learning may then be stored in and used by the augmented reality glasses device 10.
  • The joint evaluation model 111 of step S100 may be configured by learning a hand joint point map and a joint relationship map based on an artificial neural network. More specifically, a hand joint point map and a joint relationship map are created from images of various hand gestures taken at various angles and under various lighting, these maps are configured as training data, and the joint evaluation model 111 may be created using a deep learning algorithm.
  • The algorithm used to generate the joint evaluation model 111 may be an artificial neural network model; a CNN, an RNN, or the like may be used.
  • In particular, a graph neural network such as a graph convolutional network (GCN) may be used for effective learning of the joint relationship map (a minimal network sketch follows below).
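  • As a rough illustration of such a model, the sketch below defines a small convolutional network that outputs per-joint keypoint heatmaps (a hand joint point map) and paired relationship channels (a joint relationship map), in the spirit of part-affinity-field approaches. The architecture, the 21-joint and 20-bone counts, and the layer choices are assumptions made here for illustration; the patent does not pin down a concrete network, and a GCN variant would instead operate on graph-structured joint maps.

```python
import torch
import torch.nn as nn

NUM_JOINTS = 21  # assumed: 21 keypoints per hand, as in common hand models
NUM_BONES = 20   # assumed: one relationship field per skeletal bone

class JointEvaluationNet(nn.Module):
    """Toy CNN emitting a joint point map and a joint relationship map."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        # One heatmap channel per joint type.
        self.point_head = nn.Conv2d(64, NUM_JOINTS, 1)
        # Two channels (an x/y vector field) per bone, encoding which
        # joint candidates belong together.
        self.relation_head = nn.Conv2d(64, 2 * NUM_BONES, 1)

    def forward(self, image):
        feats = self.backbone(image)
        return self.point_head(feats), self.relation_head(feats)

net = JointEvaluationNet()
points, relations = net(torch.randn(1, 3, 256, 256))
print(points.shape, relations.shape)  # (1, 21, 256, 256) (1, 40, 256, 256)
```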
  • In step S200, candidate keypoints serving as hand joint candidates may be detected in real time from the entire input image captured by the augmented reality glasses device 10. That is, in step S200, candidate keypoints of hand joints can be detected from the entire captured input image, without detecting hand regions or detecting and cropping bounding boxes in the input image captured by the camera 200 (see FIG. 7(a)).
  • Here, the candidate keypoints may be detected from a two-dimensional input image captured by the augmented reality glasses device 10.
  • That is, the camera 200 of the augmented reality glasses device 10 may be an ordinary camera that captures two-dimensional images in the direction of the wearer's gaze (a keypoint-extraction sketch follows below).
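  • One common way to turn such a two-dimensional image into candidate keypoints is to read local maxima out of predicted per-joint heatmaps. The NumPy sketch below is an illustrative assumption, not the patent's stated detector: it extracts peak coordinates above a confidence threshold, and two peaks of the same joint type may later turn out to belong to two different hands.

```python
import numpy as np

def extract_candidates(heatmap, threshold=0.3):
    """Return (row, col, score) peaks of one joint-type heatmap. A pixel
    is a peak if it beats the threshold and its four neighbours."""
    h, w = heatmap.shape
    peaks = []
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            v = heatmap[r, c]
            if (v > threshold and v >= heatmap[r - 1, c]
                    and v >= heatmap[r + 1, c]
                    and v >= heatmap[r, c - 1]
                    and v >= heatmap[r, c + 1]):
                peaks.append((r, c, float(v)))
    return peaks

# Toy heatmap with two blobs: two candidates of the same joint type.
hm = np.zeros((32, 32))
hm[8, 8] = 0.9
hm[20, 25] = 0.8
print(extract_candidates(hm))  # [(8, 8, 0.9), (20, 25, 0.8)]
```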
  • In step S300, the correlations between the candidate keypoints detected in step S200 are evaluated using the joint evaluation model 111 stored in step S100, and the joint points corresponding to hand joints among the candidate keypoints and the connection relationships between the joint points are determined.
  • More specifically, the candidate keypoints are matched with one another to evaluate their correlations through the joint evaluation model 111, and based on the evaluated correlations it can be determined whether joint points constitute a hand and which joint points are connected to each other, that is, which joint points belong to one and the same hand.
  • In step S300, using the correlations between the candidate keypoints, the joint points may be selected by classifying them for each hand included in the input image, and the connection relationships between the joint points may be determined within each classified group. That is, as shown in FIG. 7, if a plurality of hands are included in the input image, the joint points constituting each hand may be grouped by classifying the joint points hand by hand: the correlations between the candidate keypoints detected in FIG. 7(a) are evaluated, the joint points corresponding to the left hand and those corresponding to the right hand are classified and selected, and the connection relationships among the left-hand joint points and among the right-hand joint points may be determined (a grouping sketch follows below).
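  • To make the grouping step concrete: once pairwise correlations have been accepted or rejected, the candidate keypoints can be partitioned into per-hand groups, for example with a union-find over the accepted connections. This is an illustrative sketch under assumed data shapes, not the patent's specific algorithm.

```python
def group_joint_points(num_points, accepted_links):
    """Union-find: joint points connected by accepted links end up in the
    same group, i.e. the same hand (e.g. left hand vs. right hand)."""
    parent = list(range(num_points))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for a, b in accepted_links:
        parent[find(a)] = find(b)

    groups = {}
    for i in range(num_points):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())

# Points 0-2 linked together and points 3-5 linked together: two hands.
print(group_joint_points(6, [(0, 1), (1, 2), (3, 4), (4, 5)]))
# [[0, 1, 2], [3, 4, 5]]
```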
  • In step S400, at least one hand joint may be predicted by connecting the joint points according to the connection relationships based on the determination result of step S300. More specifically, hand joints can be predicted and hand motions estimated using the joint points of each group and their connection relationships. That is, as shown in FIG. 7(b), a left hand joint may be predicted from the connection relationships between the left-hand joint points, and a right hand joint may be predicted from the connection relationships between the right-hand joint points.
  • In step S400, three-dimensional hand joints may be predicted. That is, since the hand joints can be constructed in three dimensions according to the relative positions of the joint points, a three-dimensional hand motion can be estimated from a two-dimensional input image (a depth-recovery sketch follows below).
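  • Recovering 3D structure from a single 2D image requires extra constraints; one textbook trick is to assume known bone lengths and read the relative depth of each bone from its foreshortening in the image. The sketch below illustrates only that geometric idea with made-up numbers; it is not the procedure claimed in the patent, which does not specify how the 3D lifting is performed.

```python
import math

def relative_depth(p2d_a, p2d_b, bone_length):
    """Depth offset between two connected joint points, assuming the true
    bone length is known: the projected length can only shrink, and the
    deficit corresponds to slant along the camera axis."""
    dx = p2d_b[0] - p2d_a[0]
    dy = p2d_b[1] - p2d_a[1]
    proj = math.hypot(dx, dy)
    return math.sqrt(max(bone_length ** 2 - proj ** 2, 0.0))

# A bone of (assumed) length 5 that projects to length 3 slants in depth.
print(relative_depth((0, 0), (3, 0), 5.0))  # 4.0
```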
  • As described above, according to the lightweight hand joint prediction method and device for implementing a real-time hand motion interface of the augmented reality glasses device 10 proposed in the present invention, candidate keypoints of hand joints are detected from the entire input image without a hand detection process,
  • and the joint evaluation model 111 is then used to predict at least one hand joint included in the input image based on the correlations between the candidate keypoints. Since a plurality of hand joints can thus be predicted from a single round of candidate keypoint detection, without separate hand region detection and without joint prediction in each detected hand region, the hand joint prediction process can be simplified and lightened; and since the joint prediction time and computation do not increase in proportion to the number of hands in the input image, hand joints can be predicted quickly in real time in an embedded environment.
  • Meanwhile, the present invention may include a computer-readable medium containing program instructions for performing operations implemented in various communication terminals.
  • For example, the computer-readable medium may include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory.
  • Such computer-readable media may contain program instructions, data files, data structures, and the like, alone or in combination.
  • The program instructions recorded on the computer-readable medium may be specially designed and configured to implement the present invention, or may be known to and usable by those skilled in computer software.
  • They may include not only machine language code generated by a compiler but also high-level language code that can be executed by a computer using an interpreter.

Abstract

According to the weight-reduced (lightweight) hand joint prediction method and device for implementing a real-time hand motion interface of an augmented reality glasses device proposed in the present invention, candidate keypoints of hand joints are detected from the whole input image without a hand detection process, and then at least one hand joint included in the input image is predicted on the basis of the correlations between the candidate keypoints using a joint evaluation model. The invention therefore allows a plurality of hand joints to be predicted from a single round of candidate keypoint detection, without separate hand region detection and joint prediction in each detected hand region; this simplifies and lightens the hand joint prediction process, keeps the joint prediction time and computation from growing in proportion to the number of hands in the input image, and thus enables rapid real-time hand joint prediction in an embedded environment.

Description

Lightweight hand joint prediction method and device for implementing a real-time hand motion interface of an augmented reality glasses device
The present invention relates to a hand joint prediction method and device, and more particularly, to a lightweight hand joint prediction method and device for implementing a real-time hand motion interface in an augmented reality glasses device.
Various wearable devices are being developed in line with the trend toward lighter and smaller digital devices. A head mounted display (HMD), a kind of wearable device, refers to a device that can be worn on a user's head to receive multimedia content and the like. The HMD is worn on the user's body and provides images to the user in various environments as the user moves. HMDs are classified into see-through and see-closed types; the see-through type is mainly used for augmented reality (AR), and the closed type is mainly used for virtual reality (VR).
Meanwhile, for an HMD for augmented reality (hereinafter, an augmented reality glasses device), gesture (hand motion) recognition is important in order to interact with the wearer easily and conveniently without a separate input device. To implement a hand gesture interface that controls the augmented reality glasses device with hand gestures, the hand gestures must first be detected accurately.
For accurate detection of hand gestures, existing computer vision techniques for detecting the position and orientation of an object can be used. Recently, artificial intelligence has been applied to computer vision and actively used to detect and estimate object positions, and such techniques can also be applied to hand joint estimation.
Meanwhile, since an augmented reality glasses device worn on the head must minimize its size and weight, it is difficult for it to provide high computing power. Existing artificial-intelligence-based hand joint estimation, however, requires high computing power, so it is difficult to process in real time in an embedded environment. In particular, as the number of hands whose joints must be estimated increases, the amount of computation grows accordingly, so in a small embedded environment momentary FPS drops and unstable CPU resource shortages can appear. A solution to this problem is therefore needed.
Meanwhile, as prior art related to the present invention, Korean Patent Registration No. 10-2102309 (Title of Invention: Object Recognition Method for 3D Virtual Space of Head-Worn Display Device; Registration Date: April 13, 2020) has been disclosed.
The present invention is proposed to solve the above problems of the previously proposed methods. It detects candidate keypoints of hand joints in the entire input image without a hand detection process, and then predicts at least one hand joint included in the input image based on the correlations between the candidate keypoints using a joint evaluation model, so that a plurality of hand joints can be predicted from a single round of candidate keypoint detection, without separate hand region detection and without a joint prediction process in each detected hand region. The hand joint prediction process can thus be simplified and lightened, and since the joint prediction time and computation do not increase in proportion to the number of hands in the input image, hand joints can be predicted quickly in real time in an embedded environment. Its purpose is to provide a lightweight hand joint prediction method and device for implementing a real-time hand motion interface of an augmented reality glasses device with these properties.
A lightweight hand joint prediction method for implementing a real-time hand motion interface of an augmented reality glasses device according to a feature of the present invention for achieving the above object is
a hand joint prediction method in which each step is performed in the augmented reality glasses device to implement the real-time hand motion interface, comprising:
(1) storing a joint evaluation model that has learned correlations between hand joints based on an artificial neural network;
(2) detecting, in real time, candidate keypoints that become hand joint candidates in the entire input image captured by the augmented reality glasses device;
(3) evaluating the correlations between the candidate keypoints detected in step (2) using the joint evaluation model stored in step (1), and determining the joint points corresponding to hand joints among the candidate keypoints and the connection relationships between the joint points; and
(4) predicting at least one hand joint by connecting the joint points according to the connection relationships based on the determination result of step (3),
and it is characterized in that at least one hand joint included in the input image is predicted by performing steps (2) to (4) once.
Preferably, the joint evaluation model of step (1)
may be configured by learning a hand joint point map and a joint relationship map based on an artificial neural network.
Preferably, in step (3),
using the correlations between the candidate keypoints, the joint points may be selected by classifying them for each hand included in the input image, and the connection relationships between the joint points may be determined within each classified group.
Preferably,
in step (2), the candidate keypoints are detected from a two-dimensional input image captured by the augmented reality glasses device, and
in step (4), three-dimensional hand joints may be predicted.
A lightweight hand joint prediction device for implementing a real-time hand motion interface of an augmented reality glasses device according to a feature of the present invention for achieving the above object is
a hand joint prediction device mounted on the augmented reality glasses device to implement the real-time hand motion interface, comprising:
a model storage unit for storing a joint evaluation model that has learned correlations between hand joints based on an artificial neural network; and
a prediction unit for predicting at least one hand joint from an input image captured by the augmented reality glasses device,
wherein the prediction unit includes:
a detection module for detecting, in real time, candidate keypoints that become hand joint candidates in the entire input image captured by the augmented reality glasses device;
a determination module for evaluating, using the joint evaluation model stored in the model storage unit, the correlations between the candidate keypoints detected by the detection module, and determining the joint points corresponding to hand joints among the candidate keypoints and the connection relationships between the joint points; and
a prediction module for predicting at least one hand joint by connecting the joint points according to the connection relationships based on the determination result of the determination module,
and wherein the prediction unit
is characterized in that it predicts at least one hand joint included in the input image by operating the detection module, the determination module, and the prediction module sequentially, once.
Preferably, the model storage unit
may store the joint evaluation model configured by learning a hand joint point map and a joint relationship map based on an artificial neural network.
Preferably, the determination module,
using the correlations between the candidate keypoints, may select the joint points by classifying them for each hand included in the input image, and may determine the connection relationships between the joint points within each classified group.
Preferably,
the detection module detects the candidate keypoints from a two-dimensional input image captured by the augmented reality glasses device, and
the prediction module may predict three-dimensional hand joints.
According to the lightweight hand joint prediction method and device for implementing a real-time hand motion interface of an augmented reality glasses device proposed in the present invention, candidate keypoints of hand joints are detected in the entire input image without a hand detection process, and then at least one hand joint included in the input image is predicted based on the correlations between the candidate keypoints using a joint evaluation model. A plurality of hand joints can therefore be predicted from a single round of candidate keypoint detection, without separate hand region detection and without joint prediction in each detected hand region, so the hand joint prediction process can be simplified and lightened; and since the joint prediction time and computation do not increase in proportion to the number of hands in the input image, hand joints can be predicted quickly in real time in an embedded environment.
FIG. 1 is a diagram showing the configuration of an augmented reality glasses device equipped with a lightweight hand joint prediction device for implementing a real-time hand motion interface of the augmented reality glasses device according to an embodiment of the present invention.
FIG. 2 is a diagram showing the configuration of a lightweight hand joint prediction device for implementing a real-time hand motion interface of an augmented reality glasses device according to an embodiment of the present invention.
FIG. 3 is a diagram showing the flow of a lightweight hand joint prediction method for implementing a real-time hand motion interface of an augmented reality glasses device according to an embodiment of the present invention.
FIG. 4 is a diagram showing a prediction process according to a conventional hand joint prediction method.
FIG. 5 is a diagram showing a simplified flow of a conventional hand joint prediction method.
FIG. 6 is a diagram showing a simplified flow of a lightweight hand joint prediction method for implementing a real-time hand motion interface of an augmented reality glasses device according to an embodiment of the present invention.
FIG. 7 is a diagram showing a prediction process according to a lightweight hand joint prediction method for implementing a real-time hand motion interface of an augmented reality glasses device according to an embodiment of the present invention.
<Description of reference signs>
10: augmented reality glasses device
100: hand joint prediction device
110: model storage unit
111: joint evaluation model
120: prediction unit
121: detection module
122: determination module
123: prediction module
200: camera
300: controller
S100: Storing a joint evaluation model that has learned correlations between hand joints based on an artificial neural network
S200: Detecting, in real time, candidate keypoints that become hand joint candidates in the entire input image captured by the augmented reality glasses device
S300: Evaluating the correlations between candidate keypoints using the joint evaluation model, and determining the joint points corresponding to hand joints among the candidate keypoints and the connection relationships between the joint points
S400: Predicting at least one hand joint by connecting joint points according to their connection relationships
Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings so that those of ordinary skill in the art can easily practice the invention. In describing the preferred embodiments in detail, detailed descriptions of well-known functions or configurations are omitted where they might unnecessarily obscure the gist of the present invention. The same reference numerals are used throughout the drawings for parts with similar functions and actions.
In addition, throughout the specification, when a part is said to be 'connected' to another part, this includes not only the case where it is 'directly connected' but also the case where it is 'indirectly connected' with another element in between. Also, 'including' a certain component does not exclude other components but means that other components may be further included, unless specifically stated otherwise.
FIG. 1 is a diagram showing the configuration of an augmented reality glasses device 10 equipped with a lightweight hand joint prediction device 100 for implementing a real-time hand motion interface of the augmented reality glasses device 10 according to an embodiment of the present invention. As shown in FIG. 1, the lightweight hand joint prediction device 100 may be mounted on the augmented reality glasses device 10.
That is, the augmented reality glasses device 10 may include the hand joint prediction device 100 to implement a real-time hand motion interface. More specifically, the hand joint prediction device 100 predicts hand joints from an input image captured by the camera 200 of the augmented reality glasses device 10 and transmits the predicted hand joints to the controller 300, so that the controller 300 can process the hand motion interface corresponding to the predicted hand joints.
Here, a hand joint refers to a skeleton-like frame connecting the joint points constituting the hand; predicting a hand joint may mean predicting the plurality of joint points constituting the hand and the connection relationships between those joint points. Hand joint prediction estimates the skeletal shape of the hand so that a hand motion can be constructed from the estimated skeleton and used for the hand motion interface.
FIG. 2 is a diagram showing the configuration of the lightweight hand joint prediction device 100 for implementing a real-time hand motion interface of the augmented reality glasses device 10 according to an embodiment of the present invention. As shown in FIG. 2, the lightweight hand joint prediction device 100 is mounted on the augmented reality glasses device 10 and may comprise a model storage unit 110 for storing a joint evaluation model 111 that has learned correlations between hand joints based on an artificial neural network, and a prediction unit 120 for predicting at least one hand joint from an input image captured by the augmented reality glasses device 10.
The prediction unit 120 includes a detection module 121 for detecting, in real time, candidate keypoints that become hand joint candidates in the entire input image captured by the augmented reality glasses device 10; a determination module 122 for evaluating, using the joint evaluation model 111 stored in the model storage unit 110, the correlations between the candidate keypoints detected by the detection module 121, and determining the joint points corresponding to hand joints among the candidate keypoints and the connection relationships between the joint points; and a prediction module 123 for predicting at least one hand joint by connecting the joint points according to the connection relationships based on the determination result of the determination module 122. The prediction unit 120 may predict at least one hand joint included in the input image by operating the detection module 121, the determination module 122, and the prediction module 123 sequentially, once.
FIG. 3 is a diagram showing the flow of a lightweight hand joint prediction method for implementing a real-time hand motion interface of the augmented reality glasses device 10 according to an embodiment of the present invention. As shown in FIG. 3, the method is a hand joint prediction method in which each step is performed in the augmented reality glasses device 10, and may be implemented including: storing a joint evaluation model 111 that has learned correlations between hand joints based on an artificial neural network (S100); detecting, in real time, candidate keypoints that become hand joint candidates in the entire input image captured by the augmented reality glasses device 10 (S200); evaluating the correlations between the candidate keypoints using the joint evaluation model 111, and determining the joint points corresponding to hand joints among the candidate keypoints and the connection relationships between the joint points (S300); and predicting at least one hand joint by connecting the joint points according to the connection relationships (S400).
In addition, the lightweight hand joint prediction method for implementing a real-time hand motion interface of the augmented reality glasses device 10 according to an embodiment of the present invention can predict at least one hand joint included in the input image through a single pass of steps S200 to S400. More specifically, without a hand detection process that detects hand regions in the input image and without predicting hand joints separately in each individual hand region, candidate keypoints for hand joints are detected in the entire input image, and then at least one hand joint included in the input image is predicted at once based on the correlations between the candidate keypoints using the joint evaluation model 111.
That is, in the conventional hand joint prediction method, after hand regions are detected, keypoint detection and hand joint prediction are performed for each individual hand region, so these processes are repeated as many times as the number of detected hand regions. By contrast, the lightweight hand joint prediction method according to an embodiment of the present invention can predict a plurality of hand joints by performing steps S200 to S400 only once. Therefore, the hand joint prediction process can be simplified and lightened, and since the joint prediction time and computation do not increase in proportion to the number of hands in the input image, hand joints can be predicted quickly in real time in an embedded environment.
Hereinafter, the lightweight hand joint prediction method for implementing a real-time hand motion interface of the augmented reality glasses device 10 according to an embodiment of the present invention will be described in detail in comparison with the conventional hand joint prediction method.
도 4는 종래의 손 관절 예측 방법에 따른 예측 과정을 도시한 도면이고, 도 5는 종래의 손 관절 예측 방법의 흐름을 간략화하여 도시한 도면이다. 도 4 및 도 5에 도시된 바와 같이, 종래의 손 관절 예측 시에는 top-down 방식으로 관절을 추정하였다. 즉, 입력 영상에서 사람의 손이 있는 후보 영역을 찾고(도 4의 (a)), 검출된 후보 영역이 실제 손인지를 판단하여 손 영역을 바운딩 박스(bounding box) 형태로 검출한다(도 4의 (b)). 검출한 손 영역의 바운딩 박스를 잘라서 손 영상을 획득하며, 개별 손 영상에 대해 포즈 추정 알고리즘을 적용해 손 관절을 추정하게 된다(도 4의 (c) 및 (d)).4 is a diagram showing a prediction process according to a conventional hand joint prediction method, and FIG. 5 is a diagram showing a simplified flow of the conventional hand joint prediction method. As shown in FIGS. 4 and 5 , when predicting a conventional hand joint, a top-down method was used to estimate the joint. That is, a candidate region with a human hand is found in the input image (FIG. 4(a)), and the hand region is detected in the form of a bounding box by determining whether the detected candidate region is a real hand (FIG. 4(a)). of (b)). A hand image is acquired by cutting the bounding box of the detected hand region, and a hand joint is estimated by applying a pose estimation algorithm to each hand image (Fig. 4(c) and (d)).
As described above, when a single input image contains multiple hand candidate regions, the conventional hand joint prediction method must perform both hand region detection and joint estimation for every region, so the hand detection resource usage and the time and computation required for joint estimation grow in proportion to the number of hand candidate regions. In a small embedded environment such as the augmented reality glasses device 10, computational resources therefore run short and unstable CPU behavior such as momentary FPS drops appears, making the conventional method unsuitable for real-time operation of the augmented reality glasses device 10.
FIG. 6 shows a simplified flow of the lightweight hand joint prediction method for implementing a real-time hand motion interface of the augmented reality glasses device 10 according to an embodiment of the present invention, and FIG. 7 illustrates the prediction process of that method. As shown in FIGS. 6 and 7, the method detects candidate keypoints for hand joints across the entire input image (FIG. 7(a)) and predicts the hand joints by evaluating the correlations between the candidate keypoints (FIG. 7(b)). In contrast to the conventional top-down approach, this can be called a bottom-up approach. Because the bottom-up approach requires neither the candidate-region detection of the conventional top-down prediction shown in FIG. 5 nor the repeated procedure over all candidate regions, it predicts hand joints with low resource usage and a fast, real-time estimation speed.
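To make the contrast with FIG. 5 concrete, the single-pass bottom-up flow can be sketched in Python as follows. All function bodies are toy stand-ins for the trained networks described later, not the disclosed implementation; only the control flow is the point: one detection pass, one correlation pass, and no per-hand loop.

import numpy as np

def detect_candidate_keypoints(frame):
    # Step S200 stand-in: run once over the whole frame,
    # with no hand-region detection or bounding-box cropping.
    return np.argwhere(frame > 0.95)[:10].astype(float)

def evaluate_correlations(keypoints):
    # Step S300 stand-in for the joint evaluation model 111:
    # this toy scores keypoint pairs by proximity.
    d = np.linalg.norm(keypoints[:, None] - keypoints[None, :], axis=-1)
    return 1.0 / (1.0 + d)

def predict_hands(frame, threshold=0.02):
    kps = detect_candidate_keypoints(frame)
    corr = evaluate_correlations(kps)
    hands, assigned = [], set()
    for i in range(len(kps)):
        if i in assigned:
            continue
        group = [j for j in range(len(kps))
                 if j not in assigned and corr[i, j] > threshold]
        assigned.update(group)
        hands.append(kps[group])   # step S400 connects each group
    return hands

hands = predict_hands(np.random.rand(240, 320))
print(f"{len(hands)} hand group(s) predicted in a single pass")

Note that the cost of this flow depends on the number of candidate keypoints in the frame, not on how many hands were detected, which is the source of the efficiency claim above.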
Since the conventional top-down approach obtains joint points from hand images cropped from the detected hand regions, the accuracy of its estimation results may be higher than that of the bottom-up approach of the present invention. However, the hand shapes used in a hand motion interface are limited, and considering the low resource usage and high speed of the bottom-up approach of the present invention, the lightweight hand joint prediction method for implementing a real-time hand motion interface of the augmented reality glasses device 10 according to an embodiment of the present invention may be more efficient than the conventional method for real-time application in the embedded environment of the augmented reality glasses device 10.
Hereinafter, each step of the lightweight hand joint prediction method for implementing a real-time hand motion interface of the augmented reality glasses device 10 according to an embodiment of the present invention will be described in detail.
The present invention relates to a lightweight hand joint prediction method for implementing a real-time hand motion interface of the augmented reality glasses device 10, and may be implemented as software executed on hardware including a memory and a processor. For example, the lightweight hand joint prediction method of the present invention may be stored in and executed by the augmented reality glasses device 10. In the following description, the subject performing each step may be omitted for convenience.
In step S100, a joint evaluation model 111 that has learned the correlations between hand joints based on an artificial neural network may be stored. More specifically, the model storage unit 110 of the hand joint prediction device 100 stores the joint evaluation model 111, and the prediction unit 120 uses it to predict hand joints in the embedded environment of the augmented reality glasses device 10. In particular, the training process, which requires substantial computing resources, may be handled on a server computer or the like, and the joint evaluation model 111 produced by that training may be stored in the augmented reality glasses device 10 for use.
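For illustration, this division of labor in step S100, training on a server and deploying a frozen model to the glasses, might look like the following PyTorch sketch. The placeholder architecture and the file name joint_evaluation_model.pt are assumptions for illustration only and are not part of the disclosure.

import torch
import torch.nn as nn

# Server side: train the joint evaluation model 111, then export a
# frozen copy for the embedded device. This architecture is a
# placeholder, not the network disclosed in the patent.
model = nn.Sequential(nn.Linear(42, 128), nn.ReLU(), nn.Linear(128, 21))
# ... a training loop on server-class hardware would run here ...
scripted = torch.jit.script(model)
scripted.save("joint_evaluation_model.pt")   # hypothetical file name

# Device side (model storage unit 110): load the frozen model once
# and run inference only, with no training-time resource cost.
deployed = torch.jit.load("joint_evaluation_model.pt")
deployed.eval()
with torch.no_grad():
    scores = deployed(torch.randn(1, 42))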
Here, the joint evaluation model 111 of step S100 may be constructed by training on hand joint point maps and joint relation maps based on an artificial neural network. More specifically, hand joint point maps and joint relation maps are generated from images of various hand gestures captured at various angles and under various lighting conditions, and these maps are organized as training data to generate the joint evaluation model 111 using a deep learning algorithm.
The algorithm used to generate the joint evaluation model 111 may be an artificial neural network model such as a CNN or an RNN. In addition, a Graph Neural Network (GNN) or a Graph Convolutional Network (GCN) may be used for effective learning of the joint relation map.
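As an illustration only, one graph-convolution layer over an assumed 21-joint hand skeleton, in the standard form of Kipf and Welling, could be sketched as below. The patent names GCNs as one option and does not fix this layer or this joint topology; both are assumptions here.

import torch
import torch.nn as nn

class GraphConv(nn.Module):
    """One GCN layer: H' = relu(D^-1/2 (A + I) D^-1/2 H W).

    A is the joint relation map, here the adjacency of the hand
    skeleton's joints. The layer form follows Kipf & Welling (2017).
    """
    def __init__(self, in_dim, out_dim, adj):
        super().__init__()
        a_hat = adj + torch.eye(adj.size(0))          # add self-loops
        d_inv_sqrt = torch.diag(a_hat.sum(dim=1).pow(-0.5))
        self.register_buffer("norm_adj", d_inv_sqrt @ a_hat @ d_inv_sqrt)
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x):
        # x: (num_joints, in_dim) per-joint features, e.g. 2D positions
        return torch.relu(self.linear(self.norm_adj @ x))

# Toy joint relation map: 21 joints, wrist plus five 4-joint fingers
# (assumed topology for illustration only).
adj = torch.zeros(21, 21)
edges = [(0, b) for b in (1, 5, 9, 13, 17)]
edges += [(i, i + 1) for b in (1, 5, 9, 13, 17) for i in range(b, b + 3)]
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0

layer = GraphConv(2, 16, adj)
features = layer(torch.randn(21, 2))   # -> (21, 16)

Encoding the skeleton as a graph lets the model share evidence between physically connected joints, which is why graph networks suit the joint relation map better than a plain CNN over coordinates.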
In step S200, candidate keypoints that are hand joint candidates may be detected in real time across the entire input image captured by the augmented reality glasses device 10. That is, in step S200, candidate keypoints for hand joints are detected from the whole captured input image, without detecting a hand region in the image from the camera 200 and without detecting and cropping a bounding box (FIG. 7(a)).
More specifically, in step S200, the candidate keypoints may be detected from a two-dimensional input image captured by the augmented reality glasses device 10. That is, the camera 200 of the augmented reality glasses device 10 is an ordinary camera that captures two-dimensional images, and can capture images in the direction of the wearer's gaze.
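Step S200 is commonly realized with per-joint confidence heatmaps computed over the whole frame; the patent does not fix the detector, so the heatmap and non-maximum-suppression reading below is an assumption for illustration.

import numpy as np
from scipy.ndimage import maximum_filter

def candidate_keypoints(heatmaps, thresh=0.3):
    """Extract candidate keypoints from per-joint heatmaps.

    heatmaps: (J, H, W) confidence maps a keypoint network produced
    for the whole frame, with no hand crop. Local maxima above
    `thresh` become candidates; several hands simply yield several
    peaks per joint type.
    """
    candidates = []
    for j, hm in enumerate(heatmaps):
        peaks = (hm == maximum_filter(hm, size=5)) & (hm > thresh)
        for y, x in np.argwhere(peaks):
            candidates.append((j, float(x), float(y), float(hm[y, x])))
    return candidates   # (joint type, x, y, confidence)

# Toy input: 21 joint types over a QVGA frame.
cands = candidate_keypoints(np.random.rand(21, 240, 320))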
In step S300, the joint evaluation model 111 stored in step S100 is used to evaluate the correlations between the candidate keypoints detected in step S200, and to determine, among the candidate keypoints, the joint points corresponding to hand joints and the connection relations between those joint points. That is, the candidate keypoints are matched against one another and their correlations are evaluated through the joint evaluation model 111, and based on the evaluated correlations it is determined whether a candidate is a joint point constituting a hand and whether joint points are connected to each other, that is, whether they are joint points of the same hand.
Also, in step S300, the correlations between the candidate keypoints may be used to classify the joint points by hand, selecting the joint points for each hand included in the input image and determining the connection relations between the joint points within each classified group. That is, as shown in FIG. 7, when the input image contains a plurality of hands, the joint points are classified by hand so that the joint points constituting each hand are grouped together. In other words, the correlations of the candidate keypoints detected in FIG. 7(a) are evaluated, the joint points corresponding to the left hand and those corresponding to the right hand are classified and selected, and the connection relations among the left-hand joint points and among the right-hand joint points are determined.
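As a non-authoritative sketch of this grouping, the classification could be done greedily as below. Here pair_score stands in for the joint evaluation model 111, and the greedy strategy itself is an assumption; the patent only requires that the correlations drive the grouping.

import numpy as np

def group_by_hand(candidates, pair_score, min_score=0.5):
    """Greedily group candidate keypoints into per-hand sets.

    pair_score(a, b) plays the role of the joint evaluation model 111:
    a high score means a and b are judged to be joints of the same hand.
    """
    hands = []
    for cand in candidates:
        best, best_s = None, min_score
        for hand in hands:
            s = min(pair_score(cand, member) for member in hand)
            if s > best_s:
                best, best_s = hand, s
        if best is None:
            hands.append([cand])   # start a new hand group
        else:
            best.append(cand)      # join the best-matching hand
    return hands

# Toy score: keypoints within 60 px count as correlated (same hand).
score = lambda a, b: 1.0 if np.hypot(a[0] - b[0], a[1] - b[1]) < 60 else 0.0
left = [(10, 20), (25, 30), (40, 35)]
right = [(200, 80), (215, 95)]
print(len(group_by_hand(left + right, score)))   # -> 2 groups

The work here scales with the number of candidate keypoints, so the evaluation model itself never has to be re-run per hand, consistent with the single-pass claim of step S300.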
In step S400, at least one hand joint may be predicted by connecting the joint points according to the connection relations determined in step S300. More specifically, the joint points of each group and their connection relations are used to predict the hand joints and estimate the hand motion. That is, as shown in FIG. 7(b), the left-hand joints can be predicted from the connection relations of the left-hand joint points, and the right-hand joints from the connection relations of the right-hand joint points.
In this way, a plurality of hand joints can be predicted at once from the determination result of step S300, without running the joint evaluation model 111 multiple times, so resource usage is minimized and hand motions can be estimated quickly in real time.
Also, in step S400, three-dimensional hand joints may be predicted. That is, since the hand joints can be constructed in three dimensions according to the relative positions of the joint points, a three-dimensional hand motion can be estimated from the two-dimensional input image.
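The patent states only that the relative positions of the joint points permit a three-dimensional reconstruction. One plausible, assumed realization is a small 2D-to-3D lifting network over the connected joint points of one hand, sketched below in PyTorch; its architecture and the 21-joint layout are illustrative, and in practice the weights would be trained on paired 2D/3D hand data.

import torch
import torch.nn as nn

# Assumed lifting network: maps the 21 connected 2D joint points of
# one hand to 3D coordinates.
lifter = nn.Sequential(
    nn.Linear(21 * 2, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 21 * 3),
)

joints_2d = torch.randn(1, 21 * 2)             # flattened (x, y) per joint
joints_3d = lifter(joints_2d).view(1, 21, 3)   # (x, y, z) per joint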
As described above, according to the lightweight hand joint prediction method and device for implementing a real-time hand motion interface of the augmented reality glasses device 10 proposed in the present invention, candidate keypoints for hand joints are detected across the entire input image without a hand detection process, and the joint evaluation model 111 then predicts at least one hand joint included in the input image based on the correlations between the candidate keypoints. Since a plurality of hand joints can thus be predicted from a single round of candidate keypoint detection, without separate hand region detection and joint prediction for each detected hand region, the hand joint prediction process is simplified and lightweight, and because the joint prediction time and the amount of computation do not increase in proportion to the number of hands in the input image, hand joints can be predicted quickly and in real time in an embedded environment.
Meanwhile, the present invention may include a computer-readable medium containing program instructions for performing operations implemented on various communication terminals. For example, the computer-readable medium may include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory.
Such a computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and constructed for the present invention, or may be known to and usable by those skilled in computer software. For example, they may include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter.
The present invention described above may be variously modified or applied by those of ordinary skill in the art to which it pertains, and the scope of the technical idea according to the present invention should be defined by the claims below.

Claims (8)

  1. A hand joint prediction method in which each step is performed in an augmented reality glasses device 10 to implement a real-time hand motion interface of the augmented reality glasses device 10, the method comprising:
    (1) storing a joint evaluation model 111 that has learned correlations between hand joints based on an artificial neural network;
    (2) detecting, in real time, candidate keypoints that are hand joint candidates across the entire input image captured by the augmented reality glasses device 10;
    (3) using the joint evaluation model 111 stored in step (1), evaluating correlations between the candidate keypoints detected in step (2), and determining joint points corresponding to hand joints among the candidate keypoints and connection relations between the joint points; and
    (4) predicting at least one hand joint by connecting the joint points according to the connection relations based on the determination result of step (3),
    wherein at least one hand joint included in the input image is predicted through a single execution of steps (2) to (4), the method being a lightweight hand joint prediction method for implementing a real-time hand motion interface of the augmented reality glasses device 10.
  2. The lightweight hand joint prediction method of claim 1, wherein the joint evaluation model 111 of step (1) is constructed by learning a hand joint point map and a joint relation map based on an artificial neural network.
  3. The lightweight hand joint prediction method of claim 1, wherein, in step (3), the correlations between the candidate keypoints are used to classify the joint points by hand included in the input image, to select the joint points, and to determine the connection relations between the joint points within each classified group.
  4. The lightweight hand joint prediction method of claim 1,
    wherein, in step (2), the candidate keypoints are detected from a two-dimensional input image captured by the augmented reality glasses device 10, and
    wherein, in step (4), three-dimensional hand joints are predicted.
  5. A hand joint prediction device 100 mounted on an augmented reality glasses device 10 to implement a real-time hand motion interface of the augmented reality glasses device 10, the device comprising:
    a model storage unit 110 storing a joint evaluation model 111 that has learned correlations between hand joints based on an artificial neural network; and
    a prediction unit 120 predicting at least one hand joint from an input image captured by the augmented reality glasses device 10,
    wherein the prediction unit 120 includes:
    a detection module 121 detecting, in real time, candidate keypoints that are hand joint candidates across the entire input image captured by the augmented reality glasses device 10;
    a determination module 122 evaluating, using the joint evaluation model 111 stored in the model storage unit 110, correlations between the candidate keypoints detected by the detection module 121, and determining joint points corresponding to hand joints among the candidate keypoints and connection relations between the joint points; and
    a prediction module 123 predicting at least one hand joint by connecting the joint points according to the connection relations based on the determination result of the determination module 122,
    and wherein the prediction unit 120 predicts at least one hand joint included in the input image by operating the detection module 121, the determination module 122, and the prediction module 123 sequentially and only once, the device being a lightweight hand joint prediction device 100 for implementing a real-time hand motion interface of the augmented reality glasses device 10.
  6. The lightweight hand joint prediction device 100 of claim 5, wherein the model storage unit 110 stores the joint evaluation model 111 constructed by learning a hand joint point map and a joint relation map based on an artificial neural network.
  7. The lightweight hand joint prediction device 100 of claim 5, wherein the determination module 122 uses the correlations between the candidate keypoints to classify the joint points by hand included in the input image, to select the joint points, and to determine the connection relations between the joint points within each classified group.
  8. The lightweight hand joint prediction device 100 of claim 5,
    wherein the detection module 121 detects the candidate keypoints from a two-dimensional input image captured by the augmented reality glasses device 10, and
    wherein the prediction module 123 predicts three-dimensional hand joints.
PCT/KR2022/005823 2021-06-04 2022-04-24 Weight-reduced hand joint prediction method and device for implementation of real-time hand motion interface of augmented reality glass device WO2022255642A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020210073068A KR102548208B1 (en) 2021-06-04 2021-06-04 Lightweight hand joint prediction method and apparatus for real-time hand motion interface implementation of ar glasses device
KR10-2021-0073068 2021-06-04

Publications (1)

Publication Number Publication Date
WO2022255642A1 true WO2022255642A1 (en) 2022-12-08

Family

ID=84324265

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2022/005823 WO2022255642A1 (en) 2021-06-04 2022-04-24 Weight-reduced hand joint prediction method and device for implementation of real-time hand motion interface of augmented reality glass device

Country Status (2)

Country Link
KR (1) KR102548208B1 (en)
WO (1) WO2022255642A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180024641A1 (en) * 2016-07-20 2018-01-25 Usens, Inc. Method and system for 3d hand skeleton tracking
KR20200032990A (en) * 2018-09-19 2020-03-27 재단법인 실감교류인체감응솔루션연구단 Method for modelling virtual hand on real hand and apparatus therefor
WO2021051131A1 (en) * 2019-09-09 2021-03-18 Snap Inc. Hand pose estimation from stereo cameras
US20210089162A1 (en) * 2019-09-19 2021-03-25 Finch Technologies Ltd. Calibration of inertial measurement units in alignment with a skeleton model to control a computer system based on determination of orientation of an inertial measurement unit from an image of a portion of a user
US20210166486A1 (en) * 2019-12-03 2021-06-03 Samsung Electronics Co., Ltd. Electronic apparatus and method for controlling thereof

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102102309B1 (en) * 2019-03-12 2020-04-21 주식회사 피앤씨솔루션 Object recognition method for 3d virtual space of head mounted display apparatus

Also Published As

Publication number Publication date
KR102548208B1 (en) 2023-06-28
KR20220164376A (en) 2022-12-13

Similar Documents

Publication Publication Date Title
WO2019050360A1 (en) Electronic device and method for automatic human segmentation in image
WO2018048000A1 (en) Device and method for three-dimensional imagery interpretation based on single camera, and computer-readable medium recorded with program for three-dimensional imagery interpretation
WO2016107231A1 (en) System and method for inputting gestures in 3d scene
WO2020180134A1 (en) Image correction system and image correction method thereof
WO2017082539A1 (en) Augmented reality providing apparatus and method for user styling
WO2019093599A1 (en) Apparatus for generating user interest information and method therefor
EP3776469A1 (en) System and method for 3d association of detected objects
WO2019208851A1 (en) Virtual reality interface method and apparatus allowing merging with real space
WO2022197136A1 (en) System and method for enhancing machine learning model for audio/video understanding using gated multi-level attention and temporal adversarial training
CN109086725A (en) Hand tracking and machine readable storage medium
WO2021221490A1 (en) System and method for robust image-query understanding based on contextual features
WO2022255642A1 (en) Weight-reduced hand joint prediction method and device for implementation of real-time hand motion interface of augmented reality glass device
EP3469790A1 (en) Display apparatus and control method thereof
WO2019240330A1 (en) Image-based strength prediction system and method therefor
WO2020141907A1 (en) Image generation apparatus for generating image on basis of keyword and image generation method
WO2022255641A1 (en) Method and apparatus for enhancing hand gesture and voice command recognition performance, for input interface of augmented reality glass device
WO2022092762A1 (en) Stereo matching method and image processing device performing same
WO2020230921A1 (en) Method for extracting features from image using laser pattern, and identification device and robot using same
WO2020116685A1 (en) Device for processing face feature point estimation image on basis of standard face model, and physical computer-readable recording medium in which program for processing face feature point estimation image on basis of standard face model is recorded
WO2016036049A1 (en) Search service providing apparatus, system, method, and computer program
WO2019198900A1 (en) Electronic apparatus and control method thereof
WO2023219254A1 (en) Hand distance estimation method and device for augmented reality glasses
WO2022139327A1 (en) Method and apparatus for detecting unsupported utterances in natural language understanding
WO2019124602A1 (en) Object tracking method and devices for performing same
WO2022050742A1 (en) Method for detecting hand motion of wearable augmented reality device by using depth image and wearable augmented reality device capable of detecting hand motion by using depth image

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22816301

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE