CN113866987A - Method for interactively adjusting interpupillary distance and image surface of augmented reality helmet display by utilizing gestures - Google Patents

Method for interactively adjusting interpupillary distance and image surface of augmented reality helmet display by utilizing gestures

Info

Publication number
CN113866987A
CN113866987A
Authority
CN
China
Prior art keywords
gesture
adjustment
image plane
gestures
interpupillary distance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111154355.2A
Other languages
Chinese (zh)
Inventor
陈靖
倪科
王剑
雷霆
杨露梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT filed Critical Beijing Institute of Technology BIT
Priority to CN202111154355.2A priority Critical patent/CN113866987A/en
Publication of CN113866987A publication Critical patent/CN113866987A/en

Classifications

    • GPHYSICS
    • G02OPTICS
    • G02BOPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01Head-up displays
    • G02B27/017Head mounted
    • GPHYSICS
    • G02OPTICS
    • G02BOPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/0075Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00 with means for altering, e.g. increasing, the depth of field or depth of focus
    • GPHYSICS
    • G02OPTICS
    • G02BOPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/0081Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00 with means for altering, e.g. enlarging, the entrance or exit pupil
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013Eye tracking input arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017Gesture based interaction, e.g. based on a set of recognized hand gestures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/003D [Three Dimensional] image rendering
    • G06T15/005General purpose rendering architectures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00Manipulating 3D models or images for computer graphics
    • G06T19/006Mixed reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2210/00Indexing scheme for image generation or computer graphics
    • G06T2210/61Scene description

Abstract

In order to solve the problem that adjusting the interpupillary distance and the image plane of an augmented reality helmet is not natural enough, the method for interactively adjusting the interpupillary distance and the image plane of an augmented reality helmet display by gestures comprises the following steps: shooting gesture actions in a real scene, numbering them according to preset gesture types, and constructing a gesture recognition model from the gesture actions; creating two virtual cameras in a virtual scene rendered by a three-dimensional rendering engine and setting an initial distance between them to simulate the left and right human eyes; creating two empty objects as carriers of the pictures presented to the simulated left and right eyes, associating the two virtual cameras with the two empty objects respectively, and setting the depth of the imaging picture; using the three-dimensional rendering engine to render prompt information for interpupillary distance and image plane adjustment independently on one of the empty objects simulating the pictures presented to the left and right eyes; and defining virtual scene logic in the three-dimensional rendering engine and mapping the gesture numbers to the virtual scene logic.

Description

Method for interactively adjusting interpupillary distance and image surface of augmented reality helmet display by utilizing gestures
Technical Field
The invention belongs to the technical field of augmented reality, and particularly relates to a method for interactively adjusting the interpupillary distance and the image plane of an augmented reality helmet display by utilizing gestures.
Background
Augmented Reality (AR) technology overlays computer-generated virtual information onto the user's real world, and is characterized by virtual-real combination, real-time interaction, and three-dimensional registration. Unlike virtual reality, AR technology uses three-dimensional registration to compute the positions of virtual objects in the real environment, and augments the real world by bringing virtual objects or information from the computer into it.
With recent technological development, AR has been widely applied in fields such as industry, the military, medicine, and education. Near-eye display systems, which evolved from head-mounted displays, have become one of the main forms of AR display and can be divided into video see-through and optical see-through displays according to how they present the real environment. For an AR near-eye device, whether its display creates an immersive mixed environment for the user is critical to the success or failure of the whole system.
At present, the virtual images displayed by most AR near-eye display devices are rendered according to the principle of binocular stereoscopic vision: the left- and right-eye views shown by the AR helmet display should restore, as closely as possible, the binocular parallax of human eyes viewing real objects, so that the rendered image appears vivid and stereoscopic. In this rendering process, the main factor affecting the stereoscopic display of the virtual model is the distance between the optical centers of the virtual cameras corresponding to the left and right screens of the AR helmet display, i.e., the virtual "interpupillary distance". If the user's actual interpupillary distance does not match the distance between the optical centers of the two virtual cameras of the AR helmet, a prism effect arises: the stereoscopic effect is reduced, the left- and right-eye pictures may fail to fuse, and users whose interpupillary distance differs greatly from the virtual one may experience visual fatigue, dizziness, nausea, and similar symptoms. At the same time, the distance at which the AR helmet's display picture is imaged in front of the eyes is also an important factor in whether the user obtains an immersive experience and comfortable interaction when using the device.
Most AR near-eye display devices currently on the market adjust the virtual interpupillary distance and the imaging distance of the display picture either physically or in software, mainly by keys on the device or by changing the corresponding parameters in companion software. Both approaches have problems and do not fully consider the errors a user may introduce during the adjustment and calibration stage or how the interaction feels. With physical adjustment, pressing keys on the device after the AR helmet is worn can change the original relative position between the helmet and the eyes, which shifts the whole display picture and degrades wearing comfort during adjustment. With software adjustment, the user must simultaneously attend to the sharpness of the display picture in the helmet and the numeric value being entered, which increases the burden of use and visual fatigue; such an adjustment mode is not natural enough.
To make interpupillary distance and image plane adjustment natural and efficient while the user is wearing an AR helmet, both can be adjusted in real time through natural gesture interaction. Real-time gesture interaction offers fast recognition, high feasibility, and convenient operation, so the virtual scene of the AR helmet can be prepared offline while the interpupillary distance and image plane of the AR helmet display are adjusted online through real-time gestures. The virtual cameras simulating human eyes and the imaging pictures in the virtual scene can be set up with a three-dimensional rendering engine; mature engines such as Unity3D and Unreal can both generate virtual objects for augmented reality and build a simulation environment. During real-time gesture interaction, the user first makes the corresponding gesture according to the current AR helmet display picture and the adjustment requirement; a camera captures the user's current gesture in real time; a gesture recognition algorithm then classifies and recognizes the user's static and dynamic gestures; the recognition result is transmitted to the virtual scene; finally, the virtual cameras and imaging pictures are moved according to the predefined scene logic, completing the interpupillary distance and image plane adjustment of the AR helmet display.
Disclosure of Invention
In order to solve the problem that adjusting the interpupillary distance and the image plane of an augmented reality helmet is not natural enough, the invention provides a method for interactively adjusting the interpupillary distance and the image plane of an augmented reality helmet display by using gestures, so that both can be adjusted more naturally and efficiently while the helmet is worn.
In order to achieve the purpose, the technical scheme of the invention is as follows:
the method for interactively adjusting the interpupillary distance and the image plane of the augmented reality helmet display by using gestures is characterized by comprising an off-line stage and an on-line stage, wherein the off-line stage specifically comprises the following steps:
Step one: shooting gesture actions in a real scene, numbering them according to preset gesture types, and constructing a gesture recognition model from the gesture actions;
Step two: using a three-dimensional rendering engine to create two virtual cameras in the rendered virtual scene, and setting an initial distance between them to simulate the left and right human eyes; creating two empty objects as carriers of the pictures presented to the simulated left and right eyes, associating the two virtual cameras with the two empty objects respectively, and setting the depth of the imaging picture;
Step three: using the three-dimensional rendering engine to render prompt information for interpupillary distance and image plane adjustment independently on one of the empty objects simulating the pictures presented to the left and right eyes;
Step four: defining virtual scene logic in the three-dimensional rendering engine, and mapping the gesture numbers to the virtual scene logic;
wherein the on-line stage specifically comprises:
Step 1: a camera acquires a gesture image sequence of the user and transmits it to the gesture recognition model;
Step 2: after receiving the current gesture image sequence, the gesture recognition model outputs a gesture recognition result and sends the number of the recognition result to the three-dimensional rendering engine;
Step 3: after receiving the number, the three-dimensional rendering engine changes the positional relationships between the corresponding components in the virtual scene according to the virtual scene logic.
The invention has the beneficial effects that:
compared with the prior art, the method can solve the problem that the adjusting mode of matching the pupil distance of the user and changing the distance of the position of the imaging picture under the augmented reality helmet is unnatural, and has the effect of enabling the pupil distance and the image plane to be adjusted more flexibly and efficiently.
Drawings
Fig. 1 is a flowchart of a method for interactively adjusting the interpupillary distance and the image plane of an augmented reality head-mounted display by using gestures according to an embodiment of the present invention.
Detailed Description
The invention is described in detail below by way of example with reference to the accompanying drawings.
The method for interactively adjusting the interpupillary distance and the image plane of the augmented reality helmet display by using gestures comprises an offline stage and an online stage, wherein the offline stage specifically comprises the following steps:
Step one: shooting gesture actions in a real scene, numbering them according to preset gesture types, and constructing a gesture recognition model from the gesture actions;
in this embodiment, the gesture types are divided into static gestures and dynamic gestures.
Step two, using a three-dimensional rendering engine to create two virtual cameras in a rendered virtual scene, and setting an initial distance between the two virtual cameras to simulate the left and right eyes of a human; creating two empty objects as carriers of pictures presented by the left eye and the right eye of the simulated human, respectively corresponding the two virtual cameras to the two empty objects, and setting the depth of an imaging picture;
independently rendering prompt information of interpupillary distance and image plane adjustment on a hollow object simulating a picture presented by the left eye and the right eye of a human by using a three-dimensional rendering engine;
step four: and defining virtual scene logic in the three-dimensional rendering engine, and corresponding the gesture number with the virtual scene logic.
In this embodiment, the virtual scene logic includes adjustment of the interpupillary distance and the image plane, switching between adjustment interfaces, reset adjustment, and exit adjustment.
Wherein the online phase specifically comprises:
step 1: the camera acquires a gesture image sequence of a user in real time and transmits the gesture image sequence to the gesture recognition model;
in specific implementation, after a user wears the AR helmet and before the camera acquires a gesture image sequence of the user in real time, a correct pupil distance adjustment or image plane adjustment dynamic gesture and a static gesture for confirming task completion are made according to a current display picture and an adjustment requirement.
Step 2: after receiving the current gesture image sequence, the gesture recognition model outputs a gesture recognition result and sends the number of the recognition result to the three-dimensional rendering engine;
Step 3: after receiving the number, the three-dimensional rendering engine changes the positional relationships between the corresponding components in the virtual scene according to the virtual scene logic.
In this embodiment, changing the positional relationships between the corresponding components in the virtual scene specifically includes: interpupillary distance adjustment, image plane adjustment, interface switching and reset adjustment, and exit adjustment. Specifically:
the pupil distance adjustment is that the distance between the left camera and the right camera in the virtual scene is changed through a dynamic gesture based on hand translation, and when a display picture in the virtual scene is clear and is combined, the user makes a defined confirmation static gesture to complete the pupil distance adjustment and confirmation tasks.
For image plane adjustment, the user changes the distance between the cameras and the empty objects carrying the picture in the virtual scene through a dynamic gesture based on hand translation; when the virtual picture is at a suitable distance in front of the eyes, the user makes the defined confirmation static gesture to complete the image plane adjustment and confirmation task.
For interface switching and reset adjustment, the user makes the corresponding gestures for interface switching, reset adjustment, and exit adjustment as required; switching between the interpupillary distance and image plane adjustment interfaces, reset adjustment, and exit adjustment are each completed through different static gestures.
For exit adjustment, after confirming the interpupillary distance and image plane adjustment results, the user makes the static exit gesture and leaves the adjustment stage, completing the whole adjustment process.
Example 1:
The following embodiment is described in detail in terms of an off-line stage and an on-line stage, as shown in Fig. 1, and specifically includes:
First, off-line stage
Step 1: First, gesture types and action specifications for interaction are defined according to the adjustment requirements for the interpupillary distance and the image plane, and each gesture type is numbered. The gesture types comprise 4 static gestures and 2 dynamic gestures. The static gestures are mainly used for functions such as interface switching, reset adjustment, and exit adjustment; the dynamic gestures are divided into an initial action and a continuous action and are mainly used for functions such as changing parameters and adjusting distances. The specific action specifications are shown in the following table:
(Table of gesture action specifications, available only as an image in the original publication.)
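The gesture table itself is reproduced only as an image above. As a purely illustrative sketch (the numbering and the Python names are assumptions, not the patent's table), the six gesture classes and the roles described in this embodiment could be enumerated as follows:

```python
from enum import IntEnum

class Gesture(IntEnum):
    """Hypothetical numbering of the 6 gesture classes used in this embodiment.

    The 4 static gestures drive interface switching, reset and exit; the 2
    dynamic gestures (an initial action followed by a continuous action)
    drive parameter changes such as the inter-camera distance and the
    image-plane depth. The concrete hand shapes are given only in the
    table image, so these names are illustrative assumptions.
    """
    LEFT_FINGER = 0    # static: switch to the previous adjustment interface
    RIGHT_FINGER = 1   # static: switch to the next adjustment interface
    RESET_C = 2        # static "C": reset the current adjustment
    CONFIRM_OK = 3     # static "OK": confirm the result / exit adjustment
    GRAB_LEFT = 4      # dynamic: grab and translate the hand to the left
    GRAB_RIGHT = 5     # dynamic: grab and translate the hand to the right

STATIC_GESTURES = {Gesture.LEFT_FINGER, Gesture.RIGHT_FINGER,
                   Gesture.RESET_C, Gesture.CONFIRM_OK}
DYNAMIC_GESTURES = {Gesture.GRAB_LEFT, Gesture.GRAB_RIGHT}
```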
step 2: and (3) shooting the gesture action defined in the step (1) in a real scene through a monocular camera, and then storing the shot video stream, wherein the whole gesture collection process comprises 20 tested subjects. In the static gesture collection process, a camera is used for recording 4 static actions for 3 times respectively, and each time is recorded for 6 seconds; in the dynamic gesture collection process, the camera records 2 dynamic actions for 3 times respectively, and the recording time for each time is the time consumed by the testee for completing one grabbing-sliding.
Step 3: The video streams saved in Step 2 are preprocessed, and the acquired gesture image sequences are labeled with an annotation tool. The specific implementation steps comprise:
(1) image sequence pre-processing
In the off-line stage, the saved video streams are converted into image sequences and stored in grayscale format using the video stream processing functions of OpenCV.
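A minimal sketch of this preprocessing step with OpenCV is given below; the file paths and frame-naming convention are assumptions, and only the video-to-grayscale-sequence conversion itself comes from the description:

```python
import os
import cv2

def video_to_grayscale_frames(video_path: str, out_dir: str) -> int:
    """Convert a recorded gesture video into a grayscale image sequence."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    idx = 0
    while True:
        ok, frame = cap.read()              # next BGR frame, if any
        if not ok:
            break                           # end of the video stream
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        cv2.imwrite(os.path.join(out_dir, f"frame_{idx:05d}.png"), gray)
        idx += 1
    cap.release()
    return idx                              # number of frames written
```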
(2) Gesture image sequence annotation
The acquired gesture image sequences are labeled with the LabelImg annotation tool: blurred images are removed during labeling, the regions of the thumb, index finger, and palm are labeled in each clear frame, and the corresponding labels are written.
Step 4: According to the gesture actions defined in Step 1, a gesture recognition model is constructed to recognize static and dynamic gestures. The specific implementation steps comprise:
(1) static gesture recognition
A neural network is built with the Python-based PyTorch framework and run in PyCharm. The network adopts the YOLO-v5 model; grayscale images of the 4 static gestures and of the initial actions of the 2 dynamic gestures are used as network input, training is performed by extracting features from the gesture image sequences, and a gesture recognition result is output. If the recognition result is a static gesture, it is output directly by YOLO-v5; otherwise, a sequence of several frames adjacent to the current frame is fed into the dynamic gesture recognition network for processing.
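No code is given in the description; the following sketch shows how a YOLO-v5 detector trained on the labeled gesture frames might be loaded and queried for one frame. The torch.hub entry point, the weight file name, and the confidence handling are assumptions:

```python
import cv2
import torch

# Load custom YOLO-v5 weights trained on the labeled gesture frames.
# The 'ultralytics/yolov5' hub repo and the 'best.pt' path are assumptions.
model = torch.hub.load("ultralytics/yolov5", "custom", path="best.pt")

def recognize_gesture_frame(frame_gray):
    """Return (class_id, confidence) of the highest-confidence detection, or None."""
    img = cv2.cvtColor(frame_gray, cv2.COLOR_GRAY2RGB)  # replicate the single channel
    results = model(img)
    det = results.xyxy[0]                # rows: [x1, y1, x2, y2, conf, cls]
    if det.shape[0] == 0:
        return None                      # no gesture detected in this frame
    best = det[det[:, 4].argmax()]
    return int(best[5]), float(best[4])

# If the predicted class is a static gesture it is used directly; otherwise
# the neighbouring frames are forwarded to the dynamic gesture network.
```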
(2) Dynamic gesture recognition
A neural network is built and trained with the PyTorch framework, using ResNet50 as the backbone network and the A2J algorithm for hand keypoint detection; hand joint positions are predicted by aggregating the estimates of multiple anchor points. A grayscale image sequence of a dynamic gesture is taken as input, the network tracks and predicts the joint points of the thumb and index finger across the sequence, and a dynamic gesture recognition result is output.
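The core of A2J is that every joint is estimated as a response-weighted aggregation of per-anchor predictions; the fragment below sketches only that aggregation step, with tensor shapes and names that are assumptions rather than the authors' implementation:

```python
import torch

def aggregate_joints(anchor_xy, offsets, responses):
    """A2J-style aggregation: each joint is a softmax-weighted sum, over all
    anchor points, of (anchor position + predicted offset to that joint).

    anchor_xy : (A, 2)    in-plane coordinates of the A anchor points
    offsets   : (A, J, 2) per-anchor offset towards each of the J hand joints
    responses : (A, J)    per-anchor informativeness for each joint
    returns   : (J, 2)    estimated joint positions
    """
    weights = torch.softmax(responses, dim=0)            # normalise over anchors
    candidates = anchor_xy.unsqueeze(1) + offsets         # (A, J, 2)
    return (weights.unsqueeze(-1) * candidates).sum(dim=0)

# Example with 256 anchors and 21 hand joints (dimensions are illustrative):
joints = aggregate_joints(torch.rand(256, 2) * 176,
                          torch.randn(256, 21, 2),
                          torch.randn(256, 21))           # -> shape (21, 2)
```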
Step 5: The three-dimensional rendering engine Unity is used to render the virtual information picture of the AR helmet. Two virtual cameras must be set up to simulate the left and right human eyes; their initial positions in the Unity world coordinate system are set to (-0.032, 0, 0) for the left camera and (0.032, 0, 0) for the right camera, in meters, i.e., the distance between the centers of the two virtual cameras is 0.032 - (-0.032) = 0.064 m. The interpupillary distance of typical human eyes ranges from 58 mm to 68 mm, and the adjustable range between the two virtual cameras is set to 50 mm to 80 mm.
Step 6: Two 3D Quad objects are placed in the three-dimensional rendering engine Unity as carriers of the left- and right-eye display pictures, and the two virtual cameras from Step 5 are assigned to the left and right Quads respectively for imaging. The Z-axis value of the two Quads in the Unity world coordinate system is set to 5, in meters, i.e., the virtual imaging picture is rendered 5 m in front of the eyes. The adjustable depth range of the imaging picture is 1 m to 10 m.
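The camera placement in Step 5 and the Quad depth in Step 6 amount to simple arithmetic with range clamping; the Python sketch below (the function names are illustrative, and it is not part of the Unity project itself) summarizes that parameterization:

```python
def camera_positions(inter_camera_m: float):
    """Symmetric left/right virtual-camera x-coordinates for a given
    inter-camera distance, clamped to the 50-80 mm adjustable range."""
    d = min(max(inter_camera_m, 0.050), 0.080)
    half = d / 2.0
    return (-half, 0.0, 0.0), (half, 0.0, 0.0)   # left, right (metres)

def image_plane_z(depth_m: float) -> float:
    """Z value of the left/right Quad carriers, clamped to the 1-10 m range."""
    return min(max(depth_m, 1.0), 10.0)

# Initial configuration from the embodiment: 64 mm between camera centres,
# imaging picture rendered 5 m in front of the eyes.
left_cam, right_cam = camera_positions(0.064)   # (-0.032, 0, 0), (0.032, 0, 0)
quad_depth = image_plane_z(5.0)                 # 5 m
```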
Step 7: The three-dimensional rendering engine is used to render, in the virtual scene, a gesture icon and a scale bar indicating whether the user is currently in an adjustment state. This prompt information is rendered only on the Quad component of the virtual camera simulating the right eye, so that the user does not misjudge the prompt because of image-fusion failure caused by a mismatched interpupillary distance during adjustment.
Step 8: Scene logic is defined in the three-dimensional rendering engine, including switching between interpupillary distance and image plane adjustment, reset adjustment, and exit adjustment. The correspondence between each gesture number and the scene logic in Unity is shown in the following table:
(Table mapping gesture numbers to scene logic, available only as an image in the original publication.)
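Since the mapping table is only available as an image, the sketch below shows one way such a gesture-number-to-scene-logic dispatch could be expressed; the particular pairing of numbers to actions and the method names on the scene object are assumptions:

```python
def apply_scene_logic(gesture_id: int, scene) -> None:
    """Dispatch a recognised gesture number to the corresponding scene logic.

    The real number-to-logic mapping is defined in the table above, which is
    reproduced only as an image, so this pairing is hypothetical.
    """
    actions = {
        0: scene.switch_to_previous_interface,   # static "left finger"
        1: scene.switch_to_next_interface,       # static "right finger"
        2: scene.reset_current_adjustment,       # static "C"
        3: scene.confirm_and_exit,               # static "OK"
        4: scene.decrease_current_parameter,     # dynamic grab to the left
        5: scene.increase_current_parameter,     # dynamic grab to the right
    }
    action = actions.get(gesture_id)
    if action is not None:
        action()
```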
Second, on-line stage
step 1: and (3) a user wears a set of augmented reality helmet and performs dynamic gesture actions of pupil distance adjustment or image plane adjustment and static gestures for confirming task completion in the step 1 in the off-line stage according to the current display picture and the adjustment requirement.
Step 2: and acquiring a gesture image sequence of the current frame of the user in real time by using a camera, and transmitting the gesture image sequence to the gesture recognition model.
Step 3: After receiving the current gesture image sequence, the gesture recognition model outputs a gesture recognition result and sends it to the three-dimensional rendering engine over TCP, in the form of the numbers defined in Step 1 of the off-line stage.
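A minimal sketch of this hand-off from the recognition process to the rendering engine is shown below; the host, port, and one-number-per-line framing are assumptions, since the patent only states that TCP communication is used:

```python
import socket

def send_gesture_number(gesture_id: int,
                        host: str = "127.0.0.1", port: int = 9000) -> None:
    """Send one recognised gesture number to the rendering engine over TCP."""
    with socket.create_connection((host, port), timeout=1.0) as sock:
        sock.sendall(f"{gesture_id}\n".encode("ascii"))
```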
Step 4: After receiving the gesture number, the three-dimensional rendering engine changes the positional relationships between the corresponding components in the virtual scene according to the scene logic defined in Step 8 of the off-line stage. The specific adjustment process is as follows:
(1) interpupillary distance adjustment
On the interpupillary distance adjustment interface, the user changes the distance between the left and right cameras in the virtual scene through leftward/rightward grab dynamic gestures; when the display picture in the virtual scene is clear and fused, the user releases the current dynamic gesture to confirm the interpupillary distance.
(2) Image plane adjustment
On the image plane adjustment interface, the user changes the distance between the cameras and the empty objects carrying the picture in the virtual scene through leftward/rightward grab dynamic gestures; when the virtual picture is at a suitable distance in front of the eyes, the user releases the current dynamic gesture to confirm the image plane.
(3) Interface switching and reset adjustment
After the interpupillary distance or image plane has been adjusted, releasing the left/right grab dynamic gesture stops the adjustment and stores the current result; the user can then make gestures for interface switching, reset adjustment, or exit adjustment as needed. Switching between the interpupillary distance and image plane adjustment interfaces is completed with the static gestures "left finger" and "right finger"; reset adjustment uses the static gesture "C".
(4) Exit adjustment
After confirming the interpupillary distance and image plane adjustment results, the user makes the static gesture "OK", exits the adjustment stage, and completes the whole adjustment process.
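Taken together, the four behaviours above form a small state machine; the following Python sketch models that control flow. The gesture numbers, step sizes, and limits mirror the embodiment where stated, but the class itself and its method names are illustrative assumptions rather than the Unity-side implementation:

```python
IPD_STEP = 0.001     # metres of camera separation per grab update (illustrative)
DEPTH_STEP = 0.1     # metres of image-plane depth per grab update (illustrative)

class AdjustmentSession:
    """Illustrative state machine for the online adjustment stage."""

    def __init__(self):
        self.mode = "ipd"          # current interface: "ipd" or "image_plane"
        self.ipd = 0.064           # initial inter-camera distance (m)
        self.depth = 5.0           # initial image-plane depth (m)
        self.done = False

    def on_gesture(self, gesture_id: int) -> None:
        if gesture_id in (4, 5):                 # dynamic grab left / right
            sign = -1.0 if gesture_id == 4 else 1.0
            if self.mode == "ipd":
                self.ipd = min(max(self.ipd + sign * IPD_STEP, 0.050), 0.080)
            else:
                self.depth = min(max(self.depth + sign * DEPTH_STEP, 1.0), 10.0)
        elif gesture_id in (0, 1):               # static "left/right finger": switch
            self.mode = "image_plane" if self.mode == "ipd" else "ipd"
        elif gesture_id == 2:                    # static "C": reset current values
            self.ipd, self.depth = 0.064, 5.0
        elif gesture_id == 3:                    # static "OK": exit adjustment
            self.done = True
```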
In this way, gesture-based interpupillary distance and image plane adjustment under the augmented reality helmet is achieved.
In summary, the above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (15)

1. A method for interactively adjusting the interpupillary distance and the image plane of an augmented reality helmet display by utilizing gestures is characterized by comprising an off-line stage and an on-line stage;
wherein the off-line stage comprises: shooting gesture actions in a real scene, numbering them according to preset gesture types, and constructing a gesture recognition model from the gesture actions; creating two virtual cameras in a virtual scene rendered by a three-dimensional rendering engine, and setting an initial distance between the two virtual cameras to simulate the left and right human eyes; creating two empty objects as carriers of the pictures presented to the simulated left and right eyes, associating the two virtual cameras with the two empty objects respectively, and setting the depth of the imaging picture; using the three-dimensional rendering engine to render prompt information for interpupillary distance and image plane adjustment independently on one of the empty objects simulating the pictures presented to the left and right eyes; and defining virtual scene logic in the three-dimensional rendering engine and mapping the gesture numbers to the virtual scene logic;
wherein the on-line stage comprises: a camera acquires a gesture image sequence of the user and transmits it to the gesture recognition model; after receiving the current gesture image sequence, the gesture recognition model outputs a gesture recognition result and sends the number of the recognition result to the three-dimensional rendering engine; and after receiving the number, the three-dimensional rendering engine changes the positional relationships between the corresponding components in the virtual scene according to the virtual scene logic.
2. The method for interactively adjusting the interpupillary distance and the image plane of an augmented reality head-mounted display by gestures as claimed in claim 1, wherein the types of gestures are classified into static gestures and dynamic gestures.
3. The method for interactively adjusting the interpupillary distance and the image plane of the augmented reality head-mounted display by gestures as claimed in claim 1 or 2, wherein the virtual scene logic comprises adjustment of the interpupillary distance and the image plane, interface switching, reset adjustment, and exit adjustment.
4. The method for interactively adjusting the interpupillary distance and the image plane of an augmented reality head-mounted display by using gestures as claimed in claim 1 or 2, wherein, in the on-line stage, after the user wears the AR helmet and before the camera acquires the user's gesture image sequence in real time, the user performs the appropriate dynamic gesture for interpupillary distance or image plane adjustment and a static gesture confirming task completion, according to the current display picture and the adjustment requirement.
5. The method for interactively adjusting the interpupillary distance and the image plane of the augmented reality head-mounted display by using gestures according to claim 1 or 2, wherein changing the positional relationships between the corresponding components in the virtual scene specifically comprises: interpupillary distance adjustment, image plane adjustment, interface switching and reset adjustment, and exit adjustment.
6. The method for interactively adjusting the interpupillary distance and the image plane of the augmented reality head-mounted display by gestures as claimed in claim 5, wherein the interpupillary distance adjustment is performed by changing the distance between the left camera and the right camera in the virtual scene through a dynamic gesture based on hand translation, and when the displayed image in the virtual scene is clear, the user makes the defined confirmation static gesture to complete the interpupillary distance adjustment and confirmation task.
7. The method as claimed in claim 5, wherein the image plane adjustment is performed by changing the distance between the camera and the empty object presenting the picture in the virtual scene through a dynamic gesture based on hand translation, and when the virtual picture is at a suitable distance in front of the eyes, the user makes the defined confirmation static gesture to complete the image plane adjustment and confirmation task.
8. The method for interactively adjusting the interpupillary distance and the image plane of the augmented reality helmet display by gestures as claimed in claim 5, wherein, for the interface switching and reset adjustment, the user makes the corresponding gestures for interface switching, reset adjustment, and exit adjustment as required, and switching between the interpupillary distance and image plane adjustment interfaces, reset adjustment, and exit adjustment are each completed by different static gestures.
9. The method for interactively adjusting the interpupillary distance and the image plane of an augmented reality helmet display by gestures as claimed in claim 5, wherein the exit adjustment is a static exit gesture made after the user confirms the interpupillary distance and image plane adjustment results, upon which the adjustment stage is exited and the whole adjustment process is completed.
10. The method for interactively adjusting the interpupillary distance and the image plane of the augmented reality helmet display by using gestures as claimed in claim 1 or 2, wherein, after the gesture actions are shot in the real scene, the captured video stream is converted into an image sequence using a video stream processing algorithm in OpenCV and saved in a grayscale image format.
11. The method for adjusting the interpupillary distance and the image plane of the augmented reality head-mounted display by using gesture interaction as claimed in claim 1 or 2, wherein the constructed gesture recognition model comprises a static gesture recognition model and a dynamic gesture recognition model.
12. The method for interactively adjusting the interpupillary distance and the image plane of the augmented reality head-mounted display through gestures as claimed in claim 11, wherein the static gesture recognition model is trained by building a neural network model using the Python-based PyTorch framework.
13. The method as claimed in claim 12, wherein the neural network model adopts the YOLO-v5 model, takes grayscale images of the static gestures and of the initial actions of the dynamic gestures as network input, is trained by extracting features from the gesture image sequences, and outputs a gesture recognition result; if the recognition result is a static gesture, the result is output directly by YOLO-v5, otherwise a sequence of several frames adjacent to the current gesture frame is input into the dynamic gesture recognition network for processing.
14. The method for interactively adjusting the interpupillary distance and the image plane of the augmented reality head-mounted display through gestures as claimed in claim 11, wherein the dynamic gesture recognition model is trained by building a neural network using the PyTorch framework.
15. The method as claimed in claim 14, wherein the neural network adopts ResNet50 as the backbone network and uses the A2J algorithm for hand keypoint detection, predicting hand joint positions by aggregating the estimates of a plurality of anchor points; a grayscale image sequence of a dynamic gesture is taken as input, the network tracks and predicts the joint points of the thumb and index finger in the image sequence, and the dynamic gesture recognition result is output.
CN202111154355.2A 2021-09-29 2021-09-29 Method for interactively adjusting interpupillary distance and image surface of augmented reality helmet display by utilizing gestures Pending CN113866987A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111154355.2A CN113866987A (en) 2021-09-29 2021-09-29 Method for interactively adjusting interpupillary distance and image surface of augmented reality helmet display by utilizing gestures

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111154355.2A CN113866987A (en) 2021-09-29 2021-09-29 Method for interactively adjusting interpupillary distance and image surface of augmented reality helmet display by utilizing gestures

Publications (1)

Publication Number Publication Date
CN113866987A true CN113866987A (en) 2021-12-31

Family

ID=79000589

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111154355.2A Pending CN113866987A (en) 2021-09-29 2021-09-29 Method for interactively adjusting interpupillary distance and image surface of augmented reality helmet display by utilizing gestures

Country Status (1)

Country Link
CN (1) CN113866987A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114866757A (en) * 2022-04-22 2022-08-05 深圳市华星光电半导体显示技术有限公司 Stereoscopic display system and method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107015366A (en) * 2017-04-20 2017-08-04 苏州神罗信息科技有限公司 A kind of Intelligent Hybrid Reality glasses
CN107329256A (en) * 2016-04-28 2017-11-07 江苏慧光电子科技有限公司 Display device and its control method
CN109002164A (en) * 2018-07-10 2018-12-14 歌尔科技有限公司 It wears the display methods for showing equipment, device and wears display equipment
CN111061363A (en) * 2019-11-21 2020-04-24 青岛小鸟看看科技有限公司 Virtual reality system

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107329256A (en) * 2016-04-28 2017-11-07 江苏慧光电子科技有限公司 Display device and its control method
CN107015366A (en) * 2017-04-20 2017-08-04 苏州神罗信息科技有限公司 A kind of Intelligent Hybrid Reality glasses
CN109002164A (en) * 2018-07-10 2018-12-14 歌尔科技有限公司 It wears the display methods for showing equipment, device and wears display equipment
CN111061363A (en) * 2019-11-21 2020-04-24 青岛小鸟看看科技有限公司 Virtual reality system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
FU XIONG; BOSHEN ZHANG; YANG XIAO, ETC: "A2J: Anchor-to-Joint Regression Network for 3D Articulated Pose Estimation from a Single Depth Image", 《IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS, COMMUNICATIONS AND COMPUTER SCIENCES》 *
MEI-YING NG; CHIN-BOON CHNG; WAI-KIN KOH, ETC: "An enhanced self-attention and A2J approach for 3D hand pose estimation", 《MULTIMEDIA TOOLS AND APPLICATION》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114866757A (en) * 2022-04-22 2022-08-05 深圳市华星光电半导体显示技术有限公司 Stereoscopic display system and method
CN114866757B (en) * 2022-04-22 2024-03-05 深圳市华星光电半导体显示技术有限公司 Stereoscopic display system and method

Similar Documents

Publication Publication Date Title
US11238568B2 (en) Method and system for reconstructing obstructed face portions for virtual reality environment
EP2568355A2 (en) Combined stereo camera and stereo display interaction
CN110023814A (en) Mask capture is carried out by wearable device
EP1431798A2 (en) Arbitrary object tracking in augmented reality applications
WO2016021034A1 (en) Algorithm for identifying three-dimensional point of gaze
CN107357434A (en) Information input equipment, system and method under a kind of reality environment
WO2013171731A1 (en) A system worn by a moving user for fully augmenting reality by anchoring virtual objects
CN106598252A (en) Image display adjustment method and apparatus, storage medium and electronic device
US20210278671A1 (en) Head wearable device with adjustable image sensing modules and its system
CN113866987A (en) Method for interactively adjusting interpupillary distance and image surface of augmented reality helmet display by utilizing gestures
Tharatipyakul et al. Pose estimation for facilitating movement learning from online videos
Jeanne et al. A study on improving performance in gesture training through visual guidance based on learners' errors
CN111399662B (en) Human-robot interaction simulation device and method based on high-reality virtual avatar
Mania et al. Gaze-aware displays and interaction
WO2023240999A1 (en) Virtual reality scene determination method and apparatus, and system
JPH11195131A (en) Virtual reality method and device therefor and storage medium
Krishna et al. Gan based indian sign language synthesis
US20230139989A1 (en) Videoconference method and videoconference system
Madritsch CCD-Camera Based Optical Tracking for Human-Computer Interaction
CN110097644B (en) Expression migration method, device and system based on mixed reality and processor
Wu et al. Depth-disparity calibration for augmented reality on binocular optical see-through displays
Miura et al. SynSLaG: Synthetic sign language generator
WO2023282913A1 (en) Blendshape weights prediction for facial expression of hmd wearer using machine learning model trained on rendered avatar training images
Khan et al. Face-off: A face reconstruction technique for virtual reality (VR) scenarios
Filer A 3-D Virtual Environment Display System

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20211231

RJ01 Rejection of invention patent application after publication