WO2020087919A1 - Augmented reality human-computer interaction device and control method based on gaze tracking - Google Patents

Augmented reality human-computer interaction device and control method based on gaze tracking

Info

Publication number
WO2020087919A1
Authority
WO
WIPO (PCT)
Prior art keywords
layer
network
eye
module
input
Prior art date
Application number
PCT/CN2019/088729
Other languages
English (en)
French (fr)
Inventor
崔笑宇
纪欣伯
陈卫兴
Original Assignee
东北大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 东北大学 filed Critical 东北大学
Publication of WO2020087919A1 publication Critical patent/WO2020087919A1/zh


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013Eye tracking input arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Definitions

  • the present invention belongs to the fields of gaze tracking and embedded systems, and particularly relates to an augmented reality human-computer interaction device and control method based on gaze tracking.
  • as a technology that combines the virtual with the real, augmented reality will be widely used in medicine, industrial design, the military, entertainment and other industries; it is expected to become a general-purpose computing platform in the future and will change the way people work and live.
  • advances in machine intelligence make computers' understanding of natural human intent increasingly reliable, giving intelligent interaction the opportunity to move from the laboratory into practical use.
  • the development of GPUs and other hardware has greatly improved computing power, which not only broadens the application of deep learning and artificial intelligence but also promotes the development of augmented reality.
  • Deep learning is a branch of machine learning, an algorithm that attempts to perform high-level abstraction on data using multiple processing layers that contain complex structures or are composed of multiple nonlinear transformations.
  • a typical DL architecture can include many layers of neurons and millions of parameters.
  • among existing DL frameworks, the convolutional neural network (CNN) is one of the most popular architectures; its artificial neurons respond to surrounding units within a local receptive field, and compared with other deep feed-forward neural networks it gives better results for image processing, making it an attractive deep learning structure.
  • the present invention provides an augmented reality human-machine interaction device based on gaze tracking, which is characterized by comprising: a frame, a left interaction system and a right interaction system;
  • the left interaction system and the right interaction system have identical, symmetrical structures, and each system includes a miniature eye tracking camera, an optical waveguide AR lens, an embedded processor, a drive control board, and a wire-collecting slot;
  • the wire-collecting slot is provided on the frame;
  • the drive control board is installed on the frame, connected with the optical waveguide AR lens, and the connecting wire is accommodated in the collecting groove;
  • the embedded processor is installed on the drive control board;
  • the optical waveguide AR lens is used to display the output information of the drive control board
  • the optical waveguide AR lens and the miniature eye-tracking camera are mounted on the frame, within the wearer's field of view.
  • the embedded processor has a Pascal GPU architecture and an independent operating system.
  • the miniature eye tracking camera uses a camera that can record the original red, green, and blue three-channel images.
  • a method for controlling an augmented reality human-computer interaction device based on gaze tracking, using the aforementioned device, includes the following steps:
  • Step 1 Establish an eye movement interaction system in the interaction device;
  • the eye movement interaction system uses a CNN architecture-based convolutional neural network;
  • Step 2 training the convolutional neural network:
  • the training set images used by the convolutional neural network model are simulated eye images captured from three-dimensional eye models with different skin colors, ethnicities, iris colors and eyeball sizes, at different angles, under different simulated lighting and with different gaze directions;
  • the model is constructed according to the ResNet network, and its training process is:
  • the input image passes through a BatchNorm (BN) layer, a convolution (CONV) layer with a 7 × 7 kernel, and a rectified linear unit (ReLU) layer, and then enters the convolutional network;
  • BN: BatchNorm
  • CONV: convolution
  • ReLU: rectified linear unit
  • the convolutional network includes a first module, a second module, a third module and a fourth module, and the input image passes sequentially through the four modules of the convolutional network;
  • each module is composed of several networks, and all networks within the same module are identical;
  • each network in a module is formed by a BN layer, a 3 × 3 CONV layer and a ReLU layer connected in sequence;
  • the first network of the first module takes the received input image as an input; the input of other networks of the first module is the sum of the output and input of the previous network;
  • the input of the first network of each subsequent module is the sum of the output and the input of the last network of the previous module; the input of every other network of those modules is the sum of the output and the input of the previous network;
  • the output of the fourth module is, on the one hand, reduced in dimension and passed through a fully connected (FC) layer to obtain 32 iris feature points, and on the other hand passes through a BN layer, a 3 × 3 CONV layer and a ReLU layer, followed by dimension reduction and an FC layer, to obtain 33 other feature points; the pupil center is obtained from the 32 iris feature points; eye movements are recognized from the 33 other feature points; all 55 feature points are used as input to 3 FC layers to obtain 2 gaze vectors; the intersection of the two gaze vectors determines the position in space of the focal point of the eyes' gaze;
  • Step 3 The eye-movement interaction system acquires original red, green, and blue three-channel images of the left and right eyes through two miniature eye tracking cameras, respectively, and performs on them, in sequence, red-channel histogram equalization, contrast enhancement, sharpening, and resizing to 256 × 256 pixels;
  • Step 4 Perform gaze-trajectory recognition on the images processed in step 3 through the eye-movement interaction system, and then recognize the various patterns drawn by the gaze trajectory to perform the corresponding interactive actions; eye movements are recognized at the same time.
  • the present invention proposes an augmented reality human-computer interaction device and control method based on gaze tracking. On the one hand, it improves the way people obtain useful information and the efficiency of doing so; on the other hand, interaction by gaze complements voice and gesture operation, so interaction remains possible even when those two channels are occupied.
  • the present invention adopts an eye-movement interaction system based on a CNN-architecture convolutional neural network, which allows ordinary cameras, otherwise inferior to infrared cameras, to be used, improving the accuracy of gaze tracking while reducing cost.
  • the present invention is reasonable in design, easy to implement, and has good practical value.
  • FIG. 1 is a schematic structural diagram of an augmented reality human-computer interaction device based on line-of-sight tracking according to a specific embodiment of the present invention.
  • the present invention proposes an augmented reality human-machine interaction device based on line-of-sight tracking, as shown in FIG. 1, including: a frame 4, a left interaction system, and a right interaction system;
  • the left interaction system and the right interaction system are respectively installed in the left half and the right half of the frame 4;
  • each system includes a miniature eye tracking camera 1, an optical waveguide AR lens 2, an embedded processor, a drive control board 3, and a wire-collecting slot 5;
  • the wire-collecting slot 5 is provided on the frame 4;
  • the drive control board 3 is mounted on the frame 4 and connected to the optical waveguide AR lens 2, with the connecting wires accommodated in the wire-collecting slot 5;
  • the embedded processor is installed on the drive control board 3;
  • the embedded processor is the control center and image processing center of the device: it processes the signals returned by the miniature eye tracking camera 1 and sends the result to the optical waveguide lens for display; it has a Pascal GPU architecture, giving it powerful image processing capability, and an independent operating system;
  • the miniature eye tracking camera 1 is used to record the original red, green, and blue three-channel images of the eyes, and realize human-computer interaction through binocular tracking;
  • the optical waveguide AR lens 2 is used to display the output information of the drive control board 3;
  • the optical waveguide AR lens 2 and the miniature eye tracking camera 1 are mounted on the frame 4, within the wearer's field of view.
  • the present invention provides a method for controlling an augmented reality human-machine interaction device based on line-of-sight tracking.
  • the method uses the aforementioned augmented reality human-computer interaction device based on gaze tracking and includes the following steps:
  • Step 1 Establish an eye movement interaction system in the interaction device;
  • the eye movement interaction system uses a CNN architecture-based convolutional neural network;
  • Step 2 training the convolutional neural network:
  • the training set images used by the convolutional neural network model are simulated eye images captured from three-dimensional eye models with different skin colors, ethnicities, iris colors and eyeball sizes, at different angles, under different simulated lighting and with different gaze directions;
  • the model is constructed according to the ResNet network, and its training process is:
  • the input image passes through a BatchNorm (BN) layer, a convolution (CONV) layer with a 7 × 7 kernel, and a rectified linear unit (ReLU) layer, and then enters the convolutional network;
  • BN: BatchNorm
  • CONV: convolution
  • ReLU: rectified linear unit
  • the convolutional network includes a first module, a second module, a third module and a fourth module, and the input image sequentially passes through 4 modules of the convolutional network;
  • each module is composed of several networks, and all networks within the same module are identical;
  • each network in a module is formed by a BN layer, a 3 × 3 CONV layer and a ReLU layer connected in sequence;
  • the first network of the first module takes the received input image as an input; the input of other networks of the first module is the sum of the output and input of the previous network;
  • the input of the first network of each subsequent module is the sum of the output and the input of the last network of the previous module;
  • the input of every other network of those modules is the sum of the output and the input of the previous network;
  • the output of the fourth module is, on the one hand, reduced in dimension and passed through a fully connected (FC) layer to obtain 32 iris feature points, and on the other hand passes through a BN layer, a 3 × 3 CONV layer and a ReLU layer, followed by dimension reduction and an FC layer, to obtain 33 other feature points; the pupil center is obtained from the 32 iris feature points; eye movements are recognized from the 33 other feature points; all 55 feature points are used as input to 3 FC layers to obtain 2 gaze vectors; the intersection of the two gaze vectors determines the position in space of the focal point of the eyes' gaze;
  • Step 3 The eye-movement interaction system uses the two miniature eye tracking cameras 1 to acquire the original red, green, and blue three-channel images of the left and right eyes, respectively, and performs on them, in sequence, red-channel histogram equalization, contrast enhancement, sharpening, and resizing to 256 × 256 pixels;
  • Step 4 Perform gaze-trajectory recognition on the images processed in step 3 through the eye-movement interaction system, and then recognize the various patterns drawn by the gaze trajectory to perform the corresponding interactive actions; eye movements are recognized at the same time;
  • a blink among the eye movements is used as the switch for the interactive actions of the eye-movement interaction system.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Eye Examination Apparatus (AREA)
  • Image Processing (AREA)

Abstract

The present invention relates to an augmented reality human-computer interaction device and control method based on gaze tracking, belonging to the fields of gaze tracking and embedded systems. The device comprises a frame, a left interaction system and a right interaction system; each system comprises a miniature eye tracking camera, an optical waveguide AR lens, an embedded processor, a drive control board and a wire-collecting slot. The method comprises: 1) establishing an eye-movement interaction system; 2) training a convolutional neural network; 3) processing the captured image data; 4) recognizing eye movements. On the one hand, the present invention improves the way people obtain useful information and the efficiency of doing so; on the other hand, interaction by gaze complements voice and gesture operation, so interaction remains possible even when those two channels are occupied.

Description

Augmented Reality Human-Computer Interaction Device and Control Method Based on Gaze Tracking
Technical Field
[0001] The present invention belongs to the fields of gaze tracking and embedded systems, and specifically relates to an augmented reality human-computer interaction device and control method based on gaze tracking.
Background Art
[0002] As a technology that combines the virtual with the real, augmented reality will be widely used in medicine, industrial design, the military, entertainment and other industries; it is expected to become a general-purpose computing platform in the future and will change the way people work and live. Advances in machine intelligence make computers' understanding of natural human intent increasingly reliable, giving intelligent interaction the opportunity to move from the laboratory into practical use. The development of GPUs and other hardware has greatly improved computing power, which not only broadens the application of deep learning and artificial intelligence but also promotes the development of augmented reality.
[0003] With the emergence of interactive devices, there are more and more ways for people to interact with computers, and how to communicate with a computing platform efficiently, quickly and conveniently has become a hot research topic. For existing devices such as HoloLens and Magic Leap, human-computer interaction is still limited to voice and gestures; no mature gaze-based interaction has yet appeared, which to some extent diminishes the advantages of augmented reality. The gaze-tracking glasses developed by companies such as Tobii and SMI serve only as pure gaze analysis and have not risen to the level of interaction and control. Against this technical background of AR and eye movement, gaze as an interaction modality fits augmented reality glasses extremely well and provides a new opportunity to improve the way people obtain useful information.
[0004] Deep learning is a branch of machine learning: a class of algorithms that attempt high-level abstraction of data using multiple processing layers built from complex structures or multiple nonlinear transformations. A typical DL architecture can include many layers of neurons and millions of parameters. Among existing DL frameworks, the convolutional neural network (CNN) is one of the most popular architectures; its artificial neurons respond to surrounding units within a local receptive field, and compared with other deep feed-forward neural networks it gives better results for image processing, making it an attractive deep learning structure.
Summary of the Invention
Technical Problem
Solution to Problem
Technical Solution
[0005] In view of the above technical problems, the present invention provides an augmented reality human-computer interaction device based on gaze tracking, characterized by comprising: a frame, a left interaction system and a right interaction system;
[0006] The left interaction system and the right interaction system have identical, symmetrical structures, and each system includes a miniature eye tracking camera, an optical waveguide AR lens, an embedded processor, a drive control board and a wire-collecting slot;
[0007] The wire-collecting slot is provided on the frame;
[0008] The drive control board is mounted on the frame and connected to the optical waveguide AR lens, with the connecting wires accommodated in the wire-collecting slot;
[0009] The embedded processor is mounted on the drive control board;
[0010] The optical waveguide AR lens is used to display the output information of the drive control board;
[0011] The optical waveguide AR lens and the miniature eye tracking camera are mounted on the frame, within the wearer's field of view.
[0012] The embedded processor has a Pascal GPU architecture and an independent operating system.
[0013] The miniature eye tracking camera is a camera capable of recording original red, green and blue three-channel images.
[0014] A method for controlling an augmented reality human-computer interaction device based on gaze tracking, using the above augmented reality human-computer interaction device based on gaze tracking, comprises the following steps:
[0015] Step 1: establish an eye-movement interaction system in the interaction device; the eye-movement interaction system uses a convolutional neural network based on the CNN architecture;
[0016] Step 2: train the convolutional neural network:
[0017] The training set images used by the convolutional neural network model are simulated eye images captured from three-dimensional eye models with different skin colors, ethnicities, iris colors and eyeball sizes, at different angles, under different simulated lighting and with different gaze directions;
[0018] The training set images are sharpened to emphasize edges for learning, and resized to 256 × 256 pixels;
[0019] The model is constructed according to the ResNet network, and its training process is as follows:
[0020] The input image passes in sequence through a BatchNorm (BN) layer, a convolution (CONV) layer with a 7 × 7 kernel and a rectified linear unit (ReLU) layer, and then enters the convolutional network;
[0021] The convolutional network includes a first module, a second module, a third module and a fourth module, and the input image passes sequentially through the four modules of the convolutional network;
[0022] Each module is composed of several networks, and all networks within the same module are identical;
[0023] Each network in a module is formed by a BN layer, a 3 × 3 CONV layer and a ReLU layer connected in sequence;
[0024] The first network of the first module takes the received input image as its input; the input of every other network of the first module is the sum of the output and the input of the previous network;
[0025] The input of the first network of each subsequent module is the sum of the output and the input of the last network of the previous module; the input of every other network of those modules is the sum of the output and the input of the previous network;
[0026] The output of the fourth module is, on the one hand, reduced in dimension and passed through a fully connected (FC) layer to obtain 32 iris feature points; on the other hand, it passes in sequence through a BN layer, a 3 × 3 CONV layer and a ReLU layer, and is then reduced in dimension and passed through an FC layer to obtain 33 other feature points;
[0027] The pupil center is derived from the 32 iris feature points; eye movements are recognized from the 33 other feature points; all 55 feature points are used as input to 3 FC layers to obtain 2 gaze vectors; the intersection of the two gaze vectors is taken as the position in space of the focal point of the eyes' gaze;
[0028] The obtained pupil center, gaze vectors and gaze focal point serve as the training result, so that the eye-movement interaction system meets the requirements for use;
[0029] Step 3: the eye-movement interaction system acquires, through the two miniature eye tracking cameras, the original red, green and blue three-channel images of the left and right eyes, respectively, and performs the following operations in sequence:
[0030] (1) Apply histogram equalization to the red channel of the image to enhance image detail in most scenes;
[0031] (2) Increase the contrast to highlight the color differences between skin and eyeball and between the white of the eye and the iris;
[0032] (3) Apply sharpening to emphasize edge features;
[0033] (4) Resize the image to 256 × 256 pixels;
[0034] Step 4: perform gaze-trajectory recognition on the images processed in step 3 through the eye-movement interaction system, and then recognize the various patterns drawn by the gaze trajectory to perform the corresponding interactive actions; eye movements are recognized at the same time.
Advantageous Effects of the Invention
Advantageous Effects
[0035] Advantageous effects of the present invention:
[0036] The present invention proposes an augmented reality human-computer interaction device and control method based on gaze tracking. On the one hand, it improves the way people obtain useful information and the efficiency of doing so; on the other hand, interaction by gaze complements voice and gesture operation, so interaction remains possible even when those two channels are occupied.
[0037] The present invention adopts an eye-movement interaction system based on a CNN-architecture convolutional neural network, which allows ordinary cameras, otherwise inferior to infrared cameras, to be used, improving the accuracy of gaze tracking while reducing cost.
[0038] The present invention is reasonably designed, easy to implement, and of good practical value.
Brief Description of the Drawings
[0039] FIG. 1 is a schematic structural diagram of the augmented reality human-computer interaction device based on gaze tracking according to a specific embodiment of the present invention.
[0040] In the figure: 1. miniature eye tracking camera; 2. optical waveguide AR lens; 3. drive control board; 4. frame; 5. wire-collecting slot.
Description of Embodiments
[0041] In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the present invention and not to limit it.
[0042] The present invention proposes an augmented reality human-computer interaction device based on gaze tracking, as shown in FIG. 1, comprising: a frame 4, a left interaction system and a right interaction system;
[0043] The left interaction system and the right interaction system are respectively installed in the left half and the right half of the frame 4;
[0044] The left interaction system and the right interaction system have identical, symmetrical structures, and each system includes a miniature eye tracking camera 1, an optical waveguide AR lens 2, an embedded processor, a drive control board 3 and a wire-collecting slot 5;
[0045] The wire-collecting slot 5 is provided on the frame 4;
[0046] The drive control board 3 is mounted on the frame 4 and connected to the optical waveguide AR lens 2, with the connecting wires accommodated in the wire-collecting slot 5;
[0047] The embedded processor is mounted on the drive control board 3;
[0048] The embedded processor is the control center and image processing center of the device; it processes the signals returned by the miniature eye tracking camera 1 and sends the result to the optical waveguide lens for display. It has a Pascal GPU architecture, giving it powerful image processing capability, and also has an independent operating system;
[0049] The miniature eye tracking camera 1 is used to record the original red, green and blue three-channel images of the eyes, and human-computer interaction is realized through binocular tracking;
[0050] The optical waveguide AR lens 2 is used to display the output information of the drive control board 3;
[0051] The optical waveguide AR lens 2 and the miniature eye tracking camera 1 are mounted on the frame 4, within the wearer's field of view.
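The per-frame data flow implied by paragraphs [0047]-[0051] (each miniature eye tracking camera 1 feeds the embedded processor, which processes the images and drives the optical waveguide AR lens 2) can be sketched as follows. This is an illustrative assumption, not firmware disclosed in the patent; the camera indices and the process_pair callback, which stands in for the preprocessing, the CNN and the AR rendering, are placeholders.

```python
import cv2

def interaction_loop(process_pair):
    """Grab one frame per eye from the two miniature cameras each iteration and
    hand the pair to the embedded processor's processing pipeline."""
    cam_left, cam_right = cv2.VideoCapture(0), cv2.VideoCapture(1)  # placeholder device indices
    try:
        while True:
            ok_left, left = cam_left.read()
            ok_right, right = cam_right.read()
            if not (ok_left and ok_right):
                break
            process_pair(left, right)  # preprocessing -> CNN -> overlay on the waveguide AR lens
    finally:
        cam_left.release()
        cam_right.release()
```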
[0052] The present invention proposes a method for controlling an augmented reality human-computer interaction device based on gaze tracking, using the above augmented reality human-computer interaction device based on gaze tracking, and comprising the following steps:
[0053] Step 1: establish an eye-movement interaction system in the interaction device; the eye-movement interaction system uses a convolutional neural network based on the CNN architecture;
[0054] Step 2: train the convolutional neural network:
[0055] The training set images used by the convolutional neural network model are simulated eye images captured from three-dimensional eye models with different skin colors, ethnicities, iris colors and eyeball sizes, at different angles, under different simulated lighting and with different gaze directions;
[0056] The training set images are sharpened to emphasize edges for learning, and resized to 256 × 256 pixels;
[0057] The model is constructed according to the ResNet network, and its training process is as follows:
[0058] The input image passes in sequence through a BatchNorm (BN) layer, a convolution (CONV) layer with a 7 × 7 kernel and a rectified linear unit (ReLU) layer, and then enters the convolutional network;
[0059] The convolutional network includes a first module, a second module, a third module and a fourth module, and the input image passes sequentially through the four modules of the convolutional network;
[0060] Each module is composed of several networks, and all networks within the same module are identical;
[0061] Each network in a module is formed by a BN layer, a 3 × 3 CONV layer and a ReLU layer connected in sequence;
[0062] The first network of the first module takes the received input image as its input; the input of every other network of the first module is the sum of the output and the input of the previous network;
[0063] The input of the first network of each subsequent module is the sum of the output and the input of the last network of the previous module; the input of every other network of those modules is the sum of the output and the input of the previous network;
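To make the structure of paragraphs [0058]-[0063] concrete, the following minimal PyTorch-style sketch is offered for illustration; it is an assumption of this presentation, not code disclosed in the patent, and the channel width and the number of networks per module are placeholders since the patent does not specify them.

```python
import torch
import torch.nn as nn

class EyeNetBlock(nn.Module):
    """One 'network' of a module (paragraph [0061]): BN -> 3x3 CONV -> ReLU.
    Returning output + input gives the next network the sum described in
    paragraphs [0062]-[0063]."""
    def __init__(self, channels: int):
        super().__init__()
        self.bn = nn.BatchNorm2d(channels)
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.relu(self.conv(self.bn(x)))  # input + output of this network

def make_module(num_networks: int, channels: int) -> nn.Sequential:
    """A 'module' is several identical networks in sequence (paragraph [0060])."""
    return nn.Sequential(*[EyeNetBlock(channels) for _ in range(num_networks)])

# Stem of paragraph [0058] (BN -> 7x7 CONV -> ReLU) followed by the four modules of
# paragraph [0059]; two networks per module and 64 channels are illustrative values.
backbone = nn.Sequential(
    nn.BatchNorm2d(3),
    nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3),
    nn.ReLU(inplace=True),
    make_module(2, 64),
    make_module(2, 64),
    make_module(2, 64),
    make_module(2, 64),
)
features = backbone(torch.randn(1, 3, 256, 256))  # 256 x 256 input as in paragraph [0056]
```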
[0064] The output of the fourth module is, on the one hand, reduced in dimension and passed through a fully connected (FC) layer to obtain 32 iris feature points; on the other hand, it passes in sequence through a BN layer, a 3 × 3 CONV layer and a ReLU layer, and is then reduced in dimension and passed through an FC layer to obtain 33 other feature points;
[0065] The pupil center is derived from the 32 iris feature points; eye movements are recognized from the 33 other feature points; all 55 feature points are used as input to 3 FC layers to obtain 2 gaze vectors; the intersection of the two gaze vectors is taken as the position in space of the focal point of the eyes' gaze;
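The two output branches and the gaze regressor of paragraphs [0064]-[0065] can be sketched in the same spirit. Again this is an editorial assumption rather than disclosed code: the pooling used for dimension reduction, the channel count and the FC widths are placeholders, the pupil center is taken here simply as the mean of the 32 iris points, and the sketch follows the wording literally in deriving 2 gaze vectors from all 55 feature points.

```python
import torch
import torch.nn as nn

class GazeHeads(nn.Module):
    """Heads on the fourth module's feature map: 32 iris points, 33 other points,
    then all 55 points through 3 FC layers to 2 gaze vectors."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)          # placeholder dimension reduction
        self.iris_fc = nn.Linear(channels, 32 * 2)   # 32 iris feature points (x, y)
        self.branch = nn.Sequential(                 # BN -> 3x3 CONV -> ReLU branch
            nn.BatchNorm2d(channels),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.other_fc = nn.Linear(channels, 33 * 2)  # 33 other feature points (x, y)
        self.gaze_fc = nn.Sequential(                # 55 points -> 3 FC layers -> 2 gaze vectors
            nn.Linear(55 * 2, 128), nn.ReLU(inplace=True),
            nn.Linear(128, 64), nn.ReLU(inplace=True),
            nn.Linear(64, 2 * 3),
        )

    def forward(self, feat: torch.Tensor):
        iris = self.iris_fc(self.pool(feat).flatten(1)).view(-1, 32, 2)
        other = self.other_fc(self.pool(self.branch(feat)).flatten(1)).view(-1, 33, 2)
        pupil_center = iris.mean(dim=1)              # assumption: mean of the iris points
        all_points = torch.cat([iris, other], dim=1).flatten(1)
        gaze_vectors = self.gaze_fc(all_points).view(-1, 2, 3)
        return pupil_center, iris, other, gaze_vectors
```

With the previous sketch, GazeHeads(64)(features) would then produce the quantities named in paragraph [0065].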
[0066] The obtained pupil center, gaze vectors and gaze focal point serve as the training result, so that the eye-movement interaction system meets the requirements for use;
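The gaze focal point used above is defined in paragraph [0065] as the intersection of the two gaze vectors. Two rays in 3D rarely intersect exactly, so a common approximation, shown in the NumPy sketch below as an editorial assumption, is the midpoint of the shortest segment between the two gaze rays; the example origins and directions are placeholders.

```python
import numpy as np

def gaze_focal_point(o_left, d_left, o_right, d_right):
    """Approximate gaze focal point from two gaze rays (origin = pupil center,
    direction = gaze vector): midpoint of the shortest segment between the rays,
    which equals their intersection when the rays actually meet."""
    o_left, o_right = np.asarray(o_left, float), np.asarray(o_right, float)
    d_left = np.asarray(d_left, float) / np.linalg.norm(d_left)
    d_right = np.asarray(d_right, float) / np.linalg.norm(d_right)

    # Closest points o_left + t*d_left and o_right + s*d_right (standard two-line solution).
    w0 = o_left - o_right
    a, b, c = d_left @ d_left, d_left @ d_right, d_right @ d_right
    d, e = d_left @ w0, d_right @ w0
    denom = a * c - b * b
    if abs(denom) < 1e-9:                      # nearly parallel gaze rays
        t, s = 0.0, e / c
    else:
        t = (b * e - c * d) / denom
        s = (a * e - b * d) / denom
    return (o_left + t * d_left + o_right + s * d_right) / 2.0

# Illustrative values: pupil centers about 6 cm apart, both looking at a point 40 cm ahead.
focus = gaze_focal_point([-0.03, 0.0, 0.0], [0.03, 0.0, 0.4],
                         [0.03, 0.0, 0.0], [-0.03, 0.0, 0.4])
```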
[0067] Step 3: the eye-movement interaction system acquires, through the two miniature eye tracking cameras 1, the original red, green and blue three-channel images of the left and right eyes, respectively, and performs the following operations in sequence (a code sketch of this pipeline follows item (4) below):
[0068] (1) Apply histogram equalization to the red channel of the image to enhance image detail in most scenes;
[0069] (2) Increase the contrast to highlight the color differences between skin and eyeball and between the white of the eye and the iris;
[0070] (3) Apply sharpening to emphasize edge features;
[0071] (4) Resize the image to 256 × 256 pixels;
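A minimal OpenCV sketch of the four operations of step 3 follows, offered as an illustration; the contrast gain and the sharpening kernel are assumptions, since the patent does not specify their values.

```python
import cv2
import numpy as np

def preprocess_eye_image(bgr: np.ndarray) -> np.ndarray:
    """Apply the four operations of step 3 to one raw three-channel eye image
    (OpenCV loads it in BGR order, so the red channel is index 2)."""
    img = bgr.copy()

    # (1) Histogram-equalize the red channel to enhance detail in most scenes.
    img[:, :, 2] = cv2.equalizeHist(img[:, :, 2])

    # (2) Increase contrast to separate skin/eyeball and sclera/iris; the gain
    #     alpha = 1.5 is an illustrative value, not taken from the patent.
    img = cv2.convertScaleAbs(img, alpha=1.5, beta=0)

    # (3) Sharpen to emphasize edge features (simple Laplacian-style kernel).
    kernel = np.array([[0, -1, 0],
                       [-1, 5, -1],
                       [0, -1, 0]], dtype=np.float32)
    img = cv2.filter2D(img, -1, kernel)

    # (4) Resize to the 256 x 256 input size expected by the network.
    return cv2.resize(img, (256, 256), interpolation=cv2.INTER_AREA)

# Usage with one eye-camera frame (the file path is a placeholder):
# frame = cv2.imread("left_eye.png")
# net_input = preprocess_eye_image(frame)
```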
[0072] Step 4: perform gaze-trajectory recognition on the images processed in step 3 through the eye-movement interaction system, and then recognize the various patterns drawn by the gaze trajectory to perform the corresponding interactive actions; eye movements are recognized at the same time;
[0073] Among the eye movements, a blink is used as the switch for the interactive actions of the eye-movement interaction system.
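How the blink switch of paragraph [0073] and the trajectory patterns of paragraph [0072] could be combined at runtime is not detailed in the patent, so the following sketch is a hedged assumption: a blink toggles trajectory recording, and a deliberately simple placeholder classifier maps the recorded trajectory to an action.

```python
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]

class BlinkGatedGazeInput:
    """A blink starts recording the gaze trajectory; the next blink stops recording,
    classifies the recorded pattern and triggers the corresponding action."""

    def __init__(self, actions: Dict[str, Callable[[], None]]):
        self.actions = actions
        self.recording = False
        self.trajectory: List[Point] = []

    def on_blink(self) -> None:
        if not self.recording:            # first blink: arm and start recording
            self.trajectory.clear()
            self.recording = True
        else:                             # second blink: classify and act
            self.recording = False
            action = self.actions.get(self.classify(self.trajectory))
            if action is not None:
                action()

    def on_gaze_point(self, point: Point) -> None:
        if self.recording:
            self.trajectory.append(point)

    @staticmethod
    def classify(trajectory: List[Point]) -> str:
        """Placeholder pattern recognizer: horizontal vs. vertical stroke by extent."""
        if len(trajectory) < 2:
            return "none"
        xs, ys = zip(*trajectory)
        return "swipe_h" if (max(xs) - min(xs)) >= (max(ys) - min(ys)) else "swipe_v"
```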

Claims

[Claim 1] An augmented reality human-computer interaction device based on gaze tracking, characterized by comprising: a frame, a left interaction system and a right interaction system;
the left interaction system and the right interaction system have identical, symmetrical structures, and each system comprises a miniature eye tracking camera, an optical waveguide AR lens, an embedded processor, a drive control board and a wire-collecting slot; the wire-collecting slot is provided on the frame;
the drive control board is mounted on the frame and connected to the optical waveguide AR lens, with the connecting wires accommodated in the wire-collecting slot;
the embedded processor is mounted on the drive control board;
the optical waveguide AR lens and the miniature eye tracking camera are mounted on the frame, within the wearer's field of view.
[Claim 2] The augmented reality human-computer interaction device based on gaze tracking according to claim 1, characterized in that the embedded processor has a Pascal GPU architecture and an independent operating system.
[Claim 3] The augmented reality human-computer interaction device based on gaze tracking according to claim 1, characterized in that the miniature eye tracking camera is a camera capable of recording original red, green and blue three-channel images.
[Claim 4] A method for controlling an augmented reality human-computer interaction device based on gaze tracking, characterized by using the augmented reality human-computer interaction device based on gaze tracking according to claim 3 and comprising the following steps:
Step 1: establish an eye-movement interaction system in the interaction device; the eye-movement interaction system uses a convolutional neural network based on the CNN architecture;
Step 2: train the convolutional neural network:
the training set images used by the convolutional neural network model are simulated eye images captured from three-dimensional eye models with different skin colors, ethnicities, iris colors and eyeball sizes, at different angles, under different simulated lighting and with different gaze directions;
the training set images are sharpened to emphasize edges for learning, and resized to 256 × 256 pixels;
the model is constructed according to the ResNet network, and its training process is as follows:
the input image passes in sequence through a BatchNorm (BN) layer, a convolution (CONV) layer with a 7 × 7 kernel and a rectified linear unit (ReLU) layer, and then enters the convolutional network;
the convolutional network includes a first module, a second module, a third module and a fourth module, and the input image passes sequentially through the four modules of the convolutional network;
each module is composed of several networks, and all networks within the same module are identical;
each network in a module is formed by a BN layer, a 3 × 3 CONV layer and a ReLU layer connected in sequence;
the first network of the first module takes the received input image as its input; the input of every other network of the first module is the sum of the output and the input of the previous network;
the input of the first network of each subsequent module is the sum of the output and the input of the last network of the previous module; the input of every other network of those modules is the sum of the output and the input of the previous network;
the output of the fourth module is, on the one hand, reduced in dimension and passed through a fully connected (FC) layer to obtain 32 iris feature points; on the other hand, it passes in sequence through a BN layer, a 3 × 3 CONV layer and a ReLU layer, and is then reduced in dimension and passed through an FC layer to obtain 33 other feature points;
the pupil center is derived from the 32 iris feature points; eye movements are recognized from the 33 other feature points; all 55 feature points are used as input to 3 FC layers to obtain 2 gaze vectors; the intersection of the two gaze vectors is taken as the position in space of the focal point of the eyes' gaze;
the obtained pupil center, gaze vectors and gaze focal point serve as the training result, so that the eye-movement interaction system meets the requirements for use;
Step 3: the eye-movement interaction system acquires, through the two miniature eye tracking cameras, the original red, green and blue three-channel images of the left and right eyes, respectively, and performs the following operations in sequence:
(1) apply histogram equalization to the red channel of the image to enhance image detail in most scenes;
(2) increase the contrast to highlight the color differences between skin and eyeball and between the white of the eye and the iris;
(3) apply sharpening to emphasize edge features;
(4) resize the image to 256 × 256 pixels;
Step 4: perform gaze-trajectory recognition on the images processed in step 3 through the eye-movement interaction system, and then recognize the various patterns drawn by the gaze trajectory to perform the corresponding interactive actions; eye movements are recognized at the same time.
PCT/CN2019/088729 2018-10-30 2019-05-28 Augmented reality human-computer interaction device and control method based on gaze tracking WO2020087919A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811278631.4A CN109240510B (zh) 2018-10-30 2018-10-30 Augmented reality human-computer interaction device and control method based on gaze tracking
CN201811278631.4 2018-10-30

Publications (1)

Publication Number Publication Date
WO2020087919A1 true WO2020087919A1 (zh) 2020-05-07

Family

ID=65079352

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/088729 WO2020087919A1 (zh) 2018-10-30 2019-05-28 Augmented reality human-computer interaction device and control method based on gaze tracking

Country Status (2)

Country Link
CN (1) CN109240510B (zh)
WO (1) WO2020087919A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109240510B (zh) * 2018-10-30 2023-12-26 东北大学 Augmented reality human-computer interaction device and control method based on gaze tracking
CN117289788A (zh) * 2022-11-28 2023-12-26 清华大学 Interaction method and apparatus, electronic device and computer storage medium
CN116185192B (zh) * 2023-02-09 2023-10-20 北京航空航天大学 Eye-movement-recognition VR interaction method based on a denoising variational encoder

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105589551A (zh) * 2014-10-22 2016-05-18 褚秀清 Eye tracking method for human-computer interaction on mobile devices
CN106354264A (zh) * 2016-09-09 2017-01-25 电子科技大学 Real-time human-computer interaction system based on gaze tracking and working method thereof
CN107545302A (zh) * 2017-08-02 2018-01-05 北京航空航天大学 Gaze direction calculation method combining left-eye and right-eye images
CN109240510A (zh) * 2018-10-30 2019-01-18 东北大学 Augmented reality human-computer interaction device and control method based on gaze tracking

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070011609A1 (en) * 2005-07-07 2007-01-11 Florida International University Board Of Trustees Configurable, multimodal human-computer interface system and method
CN102749991B (zh) * 2012-04-12 2016-04-27 广东百泰科技有限公司 Non-contact free-space gaze tracking method suitable for human-computer interaction
US10922393B2 (en) * 2016-07-14 2021-02-16 Magic Leap, Inc. Deep neural network for iris identification
CA3034644A1 (en) * 2016-08-22 2018-03-01 Magic Leap, Inc. Augmented reality display device with deep learning sensors
CN106407772A (zh) * 2016-08-25 2017-02-15 北京中科虹霸科技有限公司 Human-computer interaction and identity authentication device suitable for virtual reality equipment, and method therefor
CN106447184B (zh) * 2016-09-21 2019-04-05 中国人民解放军国防科学技术大学 UAV operator state assessment method based on multi-sensor measurement and neural network learning
DE102016118647B4 (de) * 2016-09-30 2018-12-06 Deutsche Telekom Ag Augmented reality communication system and augmented reality interaction device
US10642887B2 (en) * 2016-12-27 2020-05-05 Adobe Inc. Multi-modal image ranking using neural networks
CN107105333A (zh) * 2017-04-26 2017-08-29 电子科技大学 VR live-video interaction method and device based on gaze tracking technology
CN107656613B (zh) * 2017-09-08 2020-12-18 国网智能科技股份有限公司 Human-computer interaction system based on eye tracking and working method thereof

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105589551A (zh) * 2014-10-22 2016-05-18 褚秀清 Eye tracking method for human-computer interaction on mobile devices
CN106354264A (zh) * 2016-09-09 2017-01-25 电子科技大学 Real-time human-computer interaction system based on gaze tracking and working method thereof
CN107545302A (zh) * 2017-08-02 2018-01-05 北京航空航天大学 Gaze direction calculation method combining left-eye and right-eye images
CN109240510A (zh) * 2018-10-30 2019-01-18 东北大学 Augmented reality human-computer interaction device and control method based on gaze tracking

Also Published As

Publication number Publication date
CN109240510B (zh) 2023-12-26
CN109240510A (zh) 2019-01-18

Similar Documents

Publication Publication Date Title
Hickson et al. Eyemotion: Classifying facial expressions in VR using eye-tracking cameras
JP7125950B2 (ja) Accumulation and reliability assignment of iris codes
CN111385462A (zh) Signal processing device, signal processing method and related products
WO2020062392A1 (zh) Signal processing device, signal processing method and related products
WO2020087919A1 (zh) Augmented reality human-computer interaction device and control method based on gaze tracking
CN111046734A (zh) Multimodal fusion gaze estimation method based on dilated convolution
Lemley et al. Eye tracking in augmented spaces: A deep learning approach
KR20170111938A (ko) Apparatus and method for playing content using gaze recognition
US11972634B2 (en) Image processing method and apparatus
CN112183200A (zh) Eye tracking method and system based on video images
CN111476151B (zh) Eyeball detection method, apparatus, device and storage medium
WO2022267653A1 (zh) Image processing method, electronic device and computer-readable storage medium
Wan et al. Robust and accurate pupil detection for head-mounted eye tracking
Kang et al. Real-time eye tracking for bare and sunglasses-wearing faces for augmented reality 3D head-up displays
US10269116B2 (en) Proprioception training method and apparatus
Kurdthongmee et al. A yolo detector providing fast and accurate pupil center estimation using regions surrounding a pupil
US20220159174A1 (en) Method of controlling electronic device by recognizing movement in peripheral zone of field of view of camera, and electronic device therefor
CN116012459A (zh) Mouse positioning method based on three-dimensional gaze estimation and screen plane estimation
CN115862095A (zh) Adaptive gaze estimation method, system, electronic device and storage medium
CN115393963A (zh) Exercise motion correction method, system, storage medium, computer device and terminal
KR20220067964A (ko) Method for controlling an electronic device by recognizing movement at the edge of the camera field of view (FOV), and electronic device therefor
EP4217921A1 (en) Multi-wavelength biometric imaging system
EP4217918A1 (en) Flexible illumination for imaging systems
EP4217973A2 (en) Pose optimization in biometric authentication systems
CN115731326A (zh) Virtual character generation method and apparatus, computer-readable medium and electronic device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19877620

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19877620

Country of ref document: EP

Kind code of ref document: A1