CN110263657B - Human eye tracking method, device, system, equipment and storage medium
- Publication number: CN110263657B (application CN201910438457.3A)
- Authority: CN (China)
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/18—Eye characteristics, e.g. of the iris
- G06V40/19—Sensors therefor
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
- H04N23/61—Control of cameras or camera modules based on recognised objects
- H04N23/611—Control of cameras or camera modules based on recognised objects where the recognised objects include parts of the human body
Abstract
The embodiment of the invention discloses a human eye tracking method, device, system, equipment and storage medium. The method comprises the following steps: calling a first camera to acquire user images in a preset viewing area, and determining three-dimensional head orientation information corresponding to each user in the preset viewing area according to the user images; determining a first preset number of target second cameras from a plurality of second cameras according to the three-dimensional head orientation information; calling each target second camera to collect a face image of the user, and determining two-dimensional binocular pupil orientation information of the user according to each face image; and determining three-dimensional binocular pupil orientation information of the user according to at least two pieces of the two-dimensional binocular pupil orientation information. The technical scheme of the embodiment of the invention improves the calculation speed and the calculation precision simultaneously and is suitable for multi-user eye tracking scenes.
Description
Technical Field
The present invention relates to image processing technologies, and in particular, to a method, an apparatus, a system, a device, and a storage medium for tracking human eyes.
Background
Human eye tracking technology is mainly applied in fields such as human-computer interaction, naked-eye 3D display and virtual reality, where a person's gaze viewpoint is obtained by tracking the movement of the eyeballs. Current naked-eye 3D display screens use eye tracking to judge the viewer's current watching position and modify the displayed image, reducing crosstalk between the left-eye and right-eye views of the 3D image.
Existing eye tracking is mainly realized with PCCR (pupil center corneal reflection) combined with image-processing recognition, as in the Tobii eye tracker. The Tobii eye tracker uses a near-infrared light source to create reflection images on the cornea and pupil of a user's eye, and then captures the eye and the reflection images with two image sensors. The spatial position of the eye and the gaze direction are then accurately calculated based on image processing algorithms and a three-dimensional eyeball model.
However, the existing eye tracking method has no user identification capability and is only suitable for single-user, short-distance scenes such as using a computer, VR glasses or an eye examination. In addition, the existing method usually recognizes a face region in the user image and then calculates the pupil positions of the user's two eyes from that region; because the face region may occupy only a small pixel area of the user image, calculating the binocular pupil positions directly in the user image means that calculation accuracy and calculation speed cannot be improved at the same time.
Therefore, a human eye tracking method that improves both the calculation speed and the calculation accuracy is urgently needed.
Disclosure of Invention
The embodiments of the invention provide a human eye tracking method, device, system, equipment and storage medium, which improve the calculation speed and accuracy simultaneously and are suitable for multi-user eye tracking scenes.
In a first aspect, an embodiment of the present invention provides an eye tracking method, including:
calling a first camera to collect user images in a preset viewing area, and determining three-dimensional head orientation information corresponding to each user in the preset viewing area according to the user images;
determining a first preset number of target second cameras from the plurality of second cameras according to the three-dimensional head orientation information;
calling each target second camera to collect a face image of the user, and determining two-dimensional binocular pupil orientation information of the user according to each face image;
and determining three-dimensional binocular pupil orientation information of the user according to at least two pieces of the two-dimensional binocular pupil orientation information.
In a second aspect, an embodiment of the present invention further provides an eye tracking apparatus, including:
the three-dimensional head orientation information determining module is used for calling a first camera to collect user images in a preset viewing area and determining three-dimensional head orientation information corresponding to each user in the preset viewing area according to the user images;
the target second camera determining module is used for determining a first preset number of target second cameras from the plurality of second cameras according to the three-dimensional head orientation information;
the two-dimensional binocular pupil orientation information determining module is used for calling each target second camera to acquire a face image of the user and determining the two-dimensional binocular pupil orientation information of the user according to each face image;
and the three-dimensional binocular pupil orientation information determining module is used for determining the three-dimensional binocular pupil orientation information of the user according to at least two pieces of the two-dimensional binocular pupil orientation information.
In a third aspect, an embodiment of the present invention further provides an eye tracking system, where the system includes: the system comprises a first camera, a plurality of second cameras and a human eye tracking device; the eye tracking device is used for realizing the eye tracking method provided by any embodiment of the invention.
In a fourth aspect, an embodiment of the present invention further provides an apparatus, where the apparatus includes:
one or more processors;
a memory for storing one or more programs;
an input device for acquiring an image;
output means for displaying screen information;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the eye tracking method provided by any embodiment of the invention.
In a fifth aspect, the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement the eye tracking method according to any embodiment of the present invention.
According to the embodiments of the invention, the three-dimensional head orientation information corresponding to each user in the preset viewing area is determined according to the user images acquired by the first camera, so that the user's head is tracked. Based on the user's three-dimensional head orientation information, a first preset number of target second cameras is determined from the plurality of second cameras, so that the binocular pupil region occupies a large pixel area in the face image collected by each target second camera; the user's three-dimensional binocular pupil orientation information can therefore be determined accurately and quickly from the several face images, achieving high-speed eye tracking while improving calculation precision. In the embodiments of the invention, a plurality of second cameras collect face images of users in the preset viewing area, so that different second cameras can be called to collect face images of different users simultaneously, making the scheme suitable for multi-user eye tracking scenes.
Drawings
Fig. 1 is a flowchart of an eye tracking method according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a second camera matching according to an embodiment of the present invention;
FIG. 3 is an example of the distance sensitivity of light rays to the eye position in the depth direction according to an embodiment of the present invention;
FIG. 4 is an example of estimating the second three-dimensional head orientation information according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of receiving face image data according to an embodiment of the present invention;
FIG. 6 is a flowchart illustrating an eye tracking method according to a second embodiment of the present invention;
FIG. 7 is an example of a spiral search pattern according to a second embodiment of the present invention;
fig. 8 is a schematic diagram of several second cameras juxtaposed at one layout position according to the second embodiment of the present invention;
fig. 9 is an example of a three-layer data table corresponding to a second camera according to a second embodiment of the present invention;
FIG. 10 is a schematic diagram of a human eye tracking apparatus according to a third embodiment of the present invention;
fig. 11 is a schematic structural diagram of an eye tracking system according to a fourth embodiment of the present invention;
fig. 12 is a layout example of the first cameras when the preset viewing area is an area inside a circle according to the fourth embodiment of the present invention;
fig. 13 is a layout example of a first camera when a preset viewing area is a circular outer area according to a fourth embodiment of the present invention;
fig. 14 is a layout example of a first camera when a preset viewing zone is a straight-line one-sided viewing zone according to a fourth embodiment of the present invention;
fig. 15 is an example of an intersection area of two adjacent second cameras according to a fourth embodiment of the present invention;
fig. 16 is a layout example of the second camera when the preset viewing area is an area inside a circle according to the fourth embodiment of the present invention;
fig. 17 is a layout example of a second camera when a preset viewing area is a circular outer area according to a fourth embodiment of the present invention;
fig. 18 is a layout example of second cameras when a preset viewing zone is a straight-line one-sided viewing zone according to a fourth embodiment of the present invention;
fig. 19 is a schematic structural diagram of another eye tracking system according to the fourth embodiment of the present invention;
fig. 20 is a schematic structural diagram of an apparatus according to a fifth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not to be construed as limiting the invention. It should be further noted that, for the convenience of description, only some structures related to the present invention are shown in the drawings, not all of them.
Example one
Fig. 1 is a flowchart of an eye tracking method according to the first embodiment of the present invention. This embodiment is applicable to tracking and positioning the pupils of a user's two eyes while the user views a 3D display screen. The method can be executed by a human eye tracking device, which can be implemented in software and/or hardware and integrated in a device with a 3D display function, such as a naked-eye 3D advertising machine or a naked-eye 3D display. As shown in fig. 1, the method specifically includes the following steps:
s110, calling a first camera to collect user images in the preset watching area, and determining three-dimensional head direction information corresponding to each user in the preset watching area according to the user images.
The first camera is a camera used for tracking the user's head. It may be a 3D camera or several 2D cameras. Because the first camera is only responsible for tracking the head, the requirements on facial detail and tracking speed are not high, and a first camera with a large visual angle can be selected. The preset viewing area is the region in which users view the 3D display screen; it can be predetermined according to the shape and position of the screen. For example, if the 3D display screen is circular and displays toward the center of the circle, the preset viewing area may be the circular area enclosed by that circle. One or more users may view within the preset viewing area simultaneously. In this embodiment, the number and shooting positions (i.e., the layout) of the first cameras can be preset according to the shape and position of the 3D display screen, so that the total detection area of the first cameras covers the preset viewing area and the head of every user in it can be tracked. The heads of at least 10 users can be tracked simultaneously in this embodiment; the exact number depends on the shooting performance of the first camera. The three-dimensional head orientation information may include the user's three-dimensional head position information and head orientation information, where the head orientation information reflects the state of the user's head, such as the head being raised, lowered or turned.
Specifically, each first camera can be called to collect user images in the preset viewing area; tracking and positioning of users' heads can be achieved by visual positioning principles and the like, and the user images can be processed with technologies such as face matching to recognize different users' heads, so that the three-dimensional head orientation information corresponding to each user in the preset viewing area can be calculated. In this embodiment, the first camera may preferably be a high-definition color camera, so that the user's hair color, skin color and so on can be identified from the RGB information in the acquired user image and the three-dimensional head orientation information can be determined more accurately.
Illustratively, the three-dimensional head orientation information corresponding to each user in the preset viewing area may be stored in a data structure with the following fields:
hid is an identification number for a tracked user head, distinguishing different users' heads; (x, y, z) are the user's three-dimensional head position coordinates, in millimeters (mm); (angle_x, angle_y, angle_z) is the orientation vector corresponding to the user's head orientation. In this embodiment, the accuracy of the rotation of the user's head in the horizontal direction (i.e., the rotation angle about the y-axis) is greater than in the other two directions (i.e., the rotation angles about the x-axis and the z-axis).
And S120, determining a first preset number of target second cameras from the plurality of second cameras according to the three-dimensional head orientation information.
The second camera is a camera used for tracking the pupils of the user's two eyes, and may be a 2D camera. In this embodiment, the specific number and shooting positions (i.e., the layout) of the second cameras can be preset according to the shape and position of the 3D display screen, so that the total detection area of the second cameras covers the preset viewing area; the eyes of every user in the preset viewing area can then be tracked with the second cameras, achieving multi-user eye tracking. A target second camera is a second camera, selected from the plurality of second cameras, that is best positioned to acquire the user's face image. The first preset number may be set according to service requirements and the scenario; in this embodiment it is at least two.
Specifically, for each user, the target second cameras corresponding to that user can be determined from the plurality of second cameras based on the user's three-dimensional head orientation information, so that the binocular pupil region occupies a large pixel area in the face image acquired by each target second camera, which preserves resolution and improves the accuracy of eye tracking. For example, the second camera in this embodiment may be a black-and-white camera with an infrared illumination source disposed at its mounting position, so that images can be acquired under supplementary lighting. The resolution of the second camera may be smaller than that of the first camera, to further increase the image processing speed.
Exemplarily, S120 may include: determining, according to the three-dimensional head orientation information and the orientation configuration information of each second camera, a second preset number of candidate second cameras corresponding to the user and the matching degree corresponding to each candidate second camera; and screening a first preset number of target second cameras from the candidate second cameras according to the matching degrees and the current number of calls corresponding to each candidate second camera.
The orientation configuration information of a second camera may include, but is not limited to, its installation position, resolution, depth-of-field parameter and viewing-angle range. For example, the orientation configuration information of each second camera may be stored in a data structure with the following fields:
cid is an identification number distinguishing different second cameras; (x, y, z) are the installation-position coordinates of the second camera, in millimeters (mm); (angle_x, angle_y, angle_z) is the direction vector of the second camera's shooting center; width and height represent the resolution of the second camera; fov_h and fov_v are its viewing angles in the horizontal and vertical directions, in degrees; dof is its depth-of-field parameter; and Type is the layout mode of the second cameras, such as circular inward, circular outward or flat. Circular inward means that the shooting directions of the second cameras distributed on a circle all point toward the circle's center; circular outward means that they all point away from the center; flat means that the second cameras distributed along a straight line all shoot toward the viewing area on one side of that line. A corresponding screening optimization strategy can be selected based on the layout mode. Note that the installation-position coordinates (x, y, z) stored for each second camera are coordinates in the world coordinate system, so that data matching can be performed.
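The original listing of this structure is likewise absent from this text; a sketch with the fields described above (layout assumed) might be:

```python
from dataclasses import dataclass

@dataclass
class SecondCameraConfig:
    """Orientation configuration of one second camera; fields follow the description above."""
    cid: int        # identification number distinguishing different second cameras
    x: float        # installation position in the world coordinate system, in mm
    y: float
    z: float
    angle_x: float  # direction vector of the camera's shooting center
    angle_y: float
    angle_z: float
    width: int      # resolution
    height: int
    fov_h: float    # horizontal viewing angle, in degrees
    fov_v: float    # vertical viewing angle, in degrees
    dof: float      # depth-of-field parameter
    type: str       # layout mode: "circular inward", "circular outward" or "flat"
```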
Specifically, in this embodiment the three-dimensional head orientation information is matched against the orientation configuration information of each second camera to determine a second preset number of candidate second cameras capable of shooting the user's face image, and the matching degree of each candidate can be determined from the included angle between the user's head orientation and the center line of the candidate's shooting angle: the larger that included angle, the smaller the matching degree. Whether shooting occlusion exists can also be detected from the positional relationship between multiple users in the preset viewing area, and the matching degree adjusted accordingly. For example, if a user B would block user A when a candidate second camera shoots user A, the matching degree of that candidate can be reduced, or set to the minimum, to avoid using it for user A. When multiple users are tracked simultaneously, each second camera may carry multiple shooting tasks, each corresponding to one call of the camera, so that it can shoot for different users; the current number of calls of a candidate second camera is the number of its pending shooting tasks at the current time. The first preset number in this embodiment is less than or equal to the second preset number, whose size can be determined by the service scenario and actual operating conditions. When the first preset number is smaller than the second preset number, the optimal first preset number of second cameras, i.e., the target second cameras, can be further screened from the second preset number of candidates based on each candidate's matching degree and current number of calls. When the first preset number equals the second preset number, every determined candidate second camera may be directly taken as a target second camera.
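The text only states that the matching degree falls as the angle between the head orientation and a camera's shooting-angle center line grows. One plausible scoring function, reusing the SecondCameraConfig sketch above, is the cosine of the angle between the head orientation and the head-to-camera direction; this is an assumption, not the patent's formula:

```python
import numpy as np

def matching_degree(head_pos, head_dir, cam) -> float:
    """Score a candidate second camera for one user's head.

    head_pos: (x, y, z) head position; head_dir: head orientation vector;
    cam: a SecondCameraConfig. Returns the cosine of the angle between the
    head orientation and the direction from the head to the camera, so a
    larger angle yields a smaller matching degree, as described above.
    """
    to_cam = np.array([cam.x, cam.y, cam.z]) - np.asarray(head_pos, float)
    to_cam /= np.linalg.norm(to_cam)
    d = np.asarray(head_dir, float)
    d /= np.linalg.norm(d)
    return float(d @ to_cam)
```

Occlusion by another user would then be handled by lowering or zeroing this score, per the description above.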
Fig. 2 shows an exemplary schematic diagram of second camera matching. As shown in fig. 2, the second cameras are arranged circularly inward: several layout positions are evenly distributed on a circle, several second cameras can be installed at each layout position, and every camera shoots toward the circle's center. In fig. 2, the dotted lines indicate the directions of the users' heads; C1 is the shooting angle of view of the second camera C1 at layout position 1; C2 is that of the second camera C2 at layout position 2; C3 is that of the second camera C3 at layout position 2; and C4 is that of the second camera C4 at layout position 3. The head of user A faces between layout position 1 and layout position 2, so for user A an optimal second camera can be matched at both positions; the optimal camera at a layout position can be determined from the shooting angle of each second camera there. As shown in fig. 2, the two target second cameras matched to user A are the second cameras C1 and C2. Similarly, the two target second cameras matched to user B are the second cameras C3 and C4.
S130, calling each target second camera to collect the face image of the user, and determining the two-dimensional binocular pupil orientation information of the user according to each face image.
The two-dimensional binocular pupil orientation information may include the user's two-dimensional binocular pupil position information and eye gaze direction information, where the two-dimensional binocular pupil position information comprises two-dimensional left-eye and right-eye pupil position information. Illustratively, the two-dimensional binocular pupil orientation information of the user may be stored in a data structure with the following fields:
(left_x, left_y) and (right_x, right_y) are the two-dimensional position coordinates of the left-eye and right-eye pupils, where the x-axis runs from left to right and the y-axis from top to bottom; (angle_x, angle_y) represents the eye gaze direction, where the x direction is divided into the four ranges "none", "left", "middle" and "right", and the y direction into the four ranges "none", "up", "middle" and "down", with "none" meaning the direction is uncertain. Time is the time point of the pupil-position calculation, in ms, used to distinguish pupil positions computed at different times. Mode is the detected eye mode, for example one of the four modes binocular, left eye, right eye and monocular, where monocular means that only one eye's pupil position is detected and it cannot be recognized whether it is the left or right eye; the two-dimensional coordinates in monocular mode may be stored in either the left pupil position (left_x, left_y) or the right pupil position (right_x, right_y). The precision of the two-dimensional binocular pupil coordinates calculated in this embodiment is ±1 pixel.
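As with the structures above, a sketch of this record with assumed layout:

```python
from dataclasses import dataclass

@dataclass
class Pupil2D:
    """Two-dimensional binocular pupil orientation record; fields follow the description above."""
    left_x: float   # left-eye pupil position; x runs left to right, precision ±1 pixel
    left_y: float   # y runs top to bottom
    right_x: float  # right-eye pupil position
    right_y: float
    angle_x: str    # gaze direction in x: "none", "left", "middle" or "right"
    angle_y: str    # gaze direction in y: "none", "up", "middle" or "down"
    time: int       # time point of the pupil-position calculation, in ms
    mode: str       # "binocular", "left eye", "right eye" or "monocular"
```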
Specifically, this embodiment uses the screened target second cameras to collect the user's face images in a targeted way, so that even when a target second camera with lower resolution is used, the binocular pupil region still occupies a large pixel area in the face image, and calculation accuracy can be improved while the calculation speed is increased. After the first preset number of target second cameras is determined, each face image corresponding to the user is obtained by calling each target second camera, and each collected face image can be processed with an image processing algorithm, so that the two-dimensional binocular pupil orientation information in each face image can be calculated quickly and accurately.
And S140, determining three-dimensional binocular pupil orientation information of the user according to at least two pieces of the two-dimensional binocular pupil orientation information.
The three-dimensional binocular pupil orientation information may include the user's three-dimensional binocular pupil position information and eye gaze direction information. In light field display, because of the angles of the light rays, pixels in different directions differ in how sensitive their light is to the eye position in the depth direction. Fig. 3 gives an example of this distance sensitivity. The light field display screen in fig. 3 is circular and is viewed from within the circular area. In fig. 3, the light emitted from "pixel 1" constrains the eye pupil position in the y-axis direction to a precision of d1, and the light emitted from "pixel 2" to a precision of d2, with d2 < d1; that is, the light from "pixel 2" demands the higher precision in the y-axis direction. If the eye pupil position cannot be determined to within d2 in the y-axis direction, the light emitted by "pixel 2" may fail to irradiate the user's eye pupil position, resulting in a missing-pixel phenomenon; the precision in the depth distance therefore needs to be improved.
Specifically, this embodiment performs three-dimensional reconstruction of the user's binocular pupils from at least two pieces of two-dimensional binocular pupil orientation information and calculates the user's three-dimensional binocular pupil orientation information, improving the precision in the depth distance and avoiding the missing-pixel phenomenon described above. The three-dimensional binocular pupil orientation information can be fed into the 3D display screen driver, so that the 3D display screen determines the corresponding display data according to it and the user views the correct three-dimensional picture.
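The patent does not spell out the reconstruction step. As a hedged illustration only: with calibrated cameras, a standard way to recover a 3D pupil position from two 2D detections is to back-project a ray through each detected pixel and take the midpoint of the shortest segment between the two rays:

```python
import numpy as np

def triangulate_midpoint(o1, d1, o2, d2):
    """Midpoint triangulation of one pupil from two camera rays.

    o1, o2: the two target second cameras' optical centers; d1, d2: unit
    direction vectors of the rays back-projected through the detected pupil
    pixels. This is a generic two-view method, assumed here rather than
    taken from the patent.
    """
    o1, d1, o2, d2 = (np.asarray(v, float) for v in (o1, d1, o2, d2))
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    w = o1 - o2
    denom = a * c - b * b               # near zero if the rays are parallel
    s = (b * (d2 @ w) - c * (d1 @ w)) / denom
    t = (a * (d2 @ w) - b * (d1 @ w)) / denom
    return ((o1 + s * d1) + (o2 + t * d2)) / 2.0
```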
In the technical scheme of this embodiment, the three-dimensional head orientation information corresponding to each user in the preset viewing area is determined from the user images collected by the first camera, so that users' heads are tracked. A first preset number of target second cameras is determined from the plurality of second cameras based on a user's three-dimensional head orientation information, so that the binocular pupil region occupies a large pixel area in the face image collected by each target second camera; the user's three-dimensional binocular pupil orientation information can therefore be determined accurately and quickly from the several face images, achieving high-speed eye tracking while improving calculation accuracy. In addition, because several second cameras collect face images of users in the preset viewing area, different second cameras can be called to collect face images of different users simultaneously, realizing multi-user eye tracking.
On the basis of the above technical solution, "screening out a first preset number of target second cameras from each candidate second camera according to each matching degree and the current calling number corresponding to each candidate second camera", may include: screening out candidate second cameras with the current calling times smaller than or equal to the preset calling times according to the current calling times corresponding to the candidate second cameras as second cameras to be selected; and based on the matching degree corresponding to the second cameras to be selected, performing descending order arrangement on the cameras, and determining the first preset number of the second cameras to be selected after arrangement as target second cameras.
The preset calling times may refer to a maximum value of the number of the shooting tasks to be executed corresponding to the second camera, and may be preset according to the service requirement and the scene. For example, the preset number of calls may be set to 5. Specifically, in this embodiment, in each candidate second camera, a candidate second camera whose current calling frequency is less than or equal to a preset calling frequency is screened, and the screened candidate second camera is used as a second camera to be selected, then the matching degrees of the second cameras to be selected are arranged from large to small, and the arranged first preset number of second cameras to be selected is determined as the target second camera.
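Put together, the screening reads as a filter-then-sort selection; a minimal sketch (the default of 5 preset calls is only the example value mentioned above):

```python
def select_target_cameras(candidates, first_preset_number, preset_calls=5):
    """Screen target second cameras from candidates as described above.

    candidates: list of (camera, matching_degree, current_call_count)
    tuples. Cameras whose current number of calls exceeds the preset
    number of calls are dropped; the rest are sorted by matching degree
    in descending order and the first preset number of them returned.
    """
    eligible = [c for c in candidates if c[2] <= preset_calls]
    eligible.sort(key=lambda c: c[1], reverse=True)
    return [c[0] for c in eligible[:first_preset_number]]
```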
On the basis of the above technical solution, the eye tracking device in this embodiment may invoke the first camera periodically, acquiring a user image each period, determining the corresponding user's three-dimensional head orientation information from each acquired image, using that information to determine the target second cameras, and determining the user's two-dimensional binocular pupil orientation information from the face images collected by those target second cameras. That is, corresponding three-dimensional head orientation information can be determined from the periodically acquired user images. If at least two pieces of two-dimensional binocular pupil position information could not be determined from the user images collected over a preset number of past acquisitions, the user's head has remained occluded or the user has remained in motion; in that case, when determining the current target second cameras, all of the second preset number of candidate second cameras can be determined as target second cameras, increasing the probability of locating the eyes and improving the tracking efficiency.
Alternatively, before S140, the method may further include: if only one two-dimensional binocular pupil orientation information is determined according to each face image, re-screening at least one target second camera from the remaining candidate second cameras after the target second camera is screened, calling the re-screened target second camera to collect the face image of the user, and determining corresponding two-dimensional binocular pupil orientation information according to the re-collected face image; or if the two-dimensional binocular pupil orientation information cannot be determined according to each face image, re-screening at least two target second cameras from the remaining candidate second cameras after the target second cameras are screened, calling each re-screened target second camera to acquire the face image of the user, and determining corresponding two-dimensional binocular pupil orientation information according to each re-acquired face image.
Specifically, after the first preset number of target second cameras is called to acquire the first preset number of the user's face images, the acquired face images may fail to contain the user's binocular pupil information, for example if the user suddenly turns or is located at a junction between second cameras, so the corresponding two-dimensional binocular pupil orientation information cannot be determined from them. If only one piece of two-dimensional binocular pupil orientation information is determined from the face images, at least one further target second camera can be screened from the remaining candidate second cameras using the same screening rule, i.e., screening by the candidates' current number of calls and matching degrees; the newly screened target second camera re-collects the user's face image, and the corresponding two-dimensional binocular pupil orientation information is determined from it anew, so that at least two pieces are obtained. Similarly, if no two-dimensional binocular pupil orientation information can be determined from the face images, i.e., none of them contains binocular pupil information, at least two target second cameras can be re-screened from the remaining candidates, the user's face images re-acquired with the re-screened cameras, and the corresponding two-dimensional binocular pupil orientation information determined anew, again yielding at least two pieces.
On the basis of the above technical solution, after S140 and before the next S110, the method further includes: estimating second three-dimensional head orientation information of the user at the next moment according to the user's three-dimensional binocular pupil orientation information at the current moment and at historical moments; and predicting the user's three-dimensional binocular pupil orientation information at the next moment according to the second three-dimensional head orientation information.
The first camera in this embodiment collects user images in the preset viewing area frame by frame at a preset frame rate, so the user's three-dimensional head orientation information can be determined periodically from the periodically collected images. The delay in calculating three-dimensional binocular pupil orientation information from three-dimensional head orientation information is shorter than the delay in acquiring a user image and determining three-dimensional head orientation information from it. That is, after the user's three-dimensional binocular pupil orientation information at the current moment is determined from the current frame, the three-dimensional head orientation information at the next moment, determined from the next frame, is not yet available: head positioning and tracking is slower than eye positioning and tracking. In this embodiment, after the three-dimensional binocular pupil orientation information at the current moment is determined from the current frame, i.e., after the operations of steps S110 to S140, and before the three-dimensional head orientation information at the next moment is determined from the next frame, the user's head orientation at the next moment can be estimated, alleviating the slow head positioning and tracking and further increasing the tracking speed. The second three-dimensional head orientation information in this embodiment is head orientation information estimated from the user's existing three-dimensional binocular pupil orientation information; the three-dimensional binocular pupil orientation information at a historical moment is that accurately determined from the user image at that moment.
Specifically, if after S140 the three-dimensional head orientation information at the next moment, determined from a user image, has not yet been obtained, the user's second three-dimensional head orientation information at the next moment can be accurately estimated from the three-dimensional binocular pupil orientation information calculated at the current moment and at historical moments; the user's three-dimensional binocular pupil orientation information at the next moment can then be estimated from the second three-dimensional head orientation information, alleviating the slow head positioning and tracking. When the accurate three-dimensional head orientation information determined from a user image becomes available, the operations of steps S120 to S140 are performed with it, reducing the delay time and further increasing the tracking speed.
Illustratively, the second three-dimensional head orientation information includes a three-dimensional head position and a rotation angle; accordingly, second three-dimensional head orientation information of the user at the next moment can be estimated according to the following formula:
where (X_p1, Y_p1, Z_p1) and α1 are the user's three-dimensional eye pupil position and gaze direction angle at the current moment P1; (X_p2, Y_p2, Z_p2) and α2 are those at the historical moment P2; (X_p3, Y_p3, Z_p3) and α3 are those at the historical moment P3; and (X, Y, Z) and α are the estimated three-dimensional head position and rotation angle of the user at the next moment. The three-dimensional eye pupil positions at the moments P1, P2 and P3 may all be either the user's three-dimensional right-eye or left-eye pupil positions. Fig. 4 gives an example of estimating the second three-dimensional head orientation information. In fig. 4, point A and angle a are respectively the user's three-dimensional eye pupil position and gaze direction angle at the next moment; point A1 and angle a1 are those at the current moment P1; point A2 and angle a2 are those at the historical moment P2; point A3 and angle a3 are those at the historical moment P3. In this embodiment, the user's three-dimensional head position and rotation angle at the next moment can be estimated from the accurate three-dimensional eye pupil positions and gaze direction angles at the current moment P1 and the two preceding moments P2 and P3 by a proportionally variable prediction mechanism, so the operations of steps S120 to S140 can continue with the estimated second three-dimensional head orientation information, increasing the tracking speed.
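The formula itself is not legible in this text. Under the stated setup (three known samples at P3, P2, P1, predicting the next moment), one natural reading is a constant-acceleration extrapolation over uniform time steps; the sketch below is that assumption, not the patent's exact "proportionally variable" formula:

```python
import numpy as np

def predict_next(p1, p2, p3):
    """Extrapolate the next-moment value from the last three samples.

    p1: value at the current moment P1; p2, p3: values at the historical
    moments P2 and P3. Fitting a quadratic through three uniformly spaced
    samples and evaluating one step ahead gives 3*p1 - 3*p2 + p3; this
    applies elementwise to (X, Y, Z) positions and to the angle alpha.
    """
    p1, p2, p3 = (np.asarray(v, float) for v in (p1, p2, p3))
    return 3 * p1 - 3 * p2 + p3
```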
On the basis of the above technical solution, after S130, the method further includes: determining the position and size of a target face region in the face image collected by each target second camera according to the three-dimensional head orientation information. The target face region is the image region bounded by the face contour in the face image. Demarcating its position and size based on the user's three-dimensional head orientation information shrinks the calculation region: image processing only needs to be performed on the target face region, so the user's two-dimensional binocular pupil orientation information can be calculated more quickly, further increasing the calculation speed of eye positioning. A sketch of such a demarcation follows.
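As a hedged sketch of that demarcation, reusing the Head3D and SecondCameraConfig sketches from above: under a simple pinhole assumption, the head position can be projected into the camera image and the region sized from the head's distance. The 180 mm nominal face width and the projection model are assumptions, not values from the patent:

```python
import numpy as np

def face_roi(head, cam, face_width_mm=180.0):
    """Estimate the target face region (x, y, w, h) in a second camera's image.

    head: a Head3D; cam: a SecondCameraConfig. For brevity the head position
    is assumed to be already expressed in the camera's coordinate frame,
    with the camera looking down its local +z axis.
    """
    f = (cam.width / 2) / np.tan(np.radians(cam.fov_h) / 2)  # focal length, pixels
    u = cam.width / 2 + f * head.x / head.z                  # projected face center
    v = cam.height / 2 + f * head.y / head.z
    size = f * face_width_mm / head.z                        # face extent, pixels
    return (u - size / 2, v - size / 2, size, size)
```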
On the basis of the foregoing technical solution, the "determining the two-dimensional binocular pupil orientation information of the user according to each face image" in S130 may include: determining the time for receiving data through scan lines according to the position and size of the target face region, receiving the target face region data sent by the target second camera according to that time, and determining the user's two-dimensional binocular pupil orientation information from the received target face image data.
The target second camera may send the acquired face image data to the eye tracking device line by line through a CMOS (complementary metal oxide semiconductor) sensor interface. The target second camera may store image data in a contiguous memory module and transmit the face image data by directly sending the row-pointer list of the face image, i.e., by scan lines, improving the data transmission speed. From the position and size of the target face region, this embodiment can calculate the time needed to receive data through the scan lines, so that little useless data is received beyond the face region; this reduces the delay caused by the target second camera's image acquisition and further increases the tracking speed. Illustratively, fig. 5 gives a schematic diagram of receiving face image data; the face image resolution in fig. 5 is 640 × 480. When the target second camera starts to transmit a frame of the face image, it first sends a frame-start synchronization signal (transmission time denoted Ts) and, at the end of the transmission, a frame-end synchronization signal (transmission time denoted Te); the line transmission speed of the target second camera follows from these two signals as (Te − Ts)/480. From the three-dimensional head orientation information, the eye tracking device can determine that the last line of the target face region lies at height 300, so the data receiving time corresponding to the target face region is 300 × (Te − Ts)/480. Timing starts when the frame-start synchronization signal is received; once the receiving time reaches 300 × (Te − Ts)/480, the target face image data containing the face region has been received and the remaining face image data need not be received. This raises the data transmission speed, so the user's two-dimensional binocular pupil orientation information can be determined more quickly from the received target face image data, reducing the delay time and further increasing the tracking and calculation speed.
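The timing in the 640 × 480 example above can be written down directly; a small sketch:

```python
def roi_receive_time(ts, te, roi_bottom_row, image_height=480):
    """Receive time for scan lines up to the last row of the face region.

    ts, te: times of the frame-start and frame-end synchronization signals,
    so (te - ts) / image_height is the per-line transmission time. With
    roi_bottom_row=300 and image_height=480 this reproduces the
    300 * (Te - Ts) / 480 figure from the example above.
    """
    return roi_bottom_row * (te - ts) / image_height
```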
Example two
Fig. 6 is a flowchart of an eye tracking method according to the second embodiment of the present invention. On the basis of the above embodiment, this embodiment optimizes "determining three-dimensional binocular pupil orientation information of the user according to at least two pieces of two-dimensional binocular pupil orientation information". Explanations of terms identical or corresponding to those of the above embodiment are omitted.
Referring to fig. 6, the eye tracking method provided in this embodiment specifically includes the following steps:
s210, calling a first camera to collect user images in the preset watching area, and determining three-dimensional head direction information corresponding to each user in the preset watching area according to the user images.
S220, determining a first preset number of target second cameras from the plurality of second cameras according to the three-dimensional head orientation information.
And S230, calling each target second camera to acquire the face image of the user.
S240, if at least two pieces of two-dimensional binocular pupil orientation information corresponding to the current moment of the user cannot be determined according to each face image, determining a spiral searching rule according to the three-dimensional binocular pupil orientation information at the historical moment.
Specifically, when the user suddenly turns around, is located at a boundary between second cameras, or the like, calling the target second cameras fails to detect the user's binocular pupil information, i.e., at least two pieces of two-dimensional binocular pupil orientation information cannot be determined from the face images. This indicates that the three-dimensional head orientation information determined from the user image is inaccurate, and the user's three-dimensional head orientation at the current moment needs to be predicted so as to adjust it. Generally, when a user watches the screen, the head rotates at a faster angular speed than it translates, so besides establishing a head kinematics model, searching outward in a spiral improves the prediction speed. In this embodiment, the radius value at each node of the spiral and the motion trajectory can be determined from the three-dimensional binocular pupil orientation information determined at historical moments, thereby determining the spiral search rule. Fig. 7 gives an example of a spiral search pattern: nodes "0", "1", "2", "3", "4" and "5" represent different head positions, and the straight line at each node represents the head orientation corresponding to that node. The spiral search rule in this embodiment may take node "0" as the starting position and search outward repeatedly in the order of nodes "1" to "5".
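The patent derives each node's radius and the trajectory from historical binocular pupil data; the concrete spiral below (an Archimedean spiral with a fixed per-node radius step) is only one plausible shape for the search pattern of fig. 7:

```python
import numpy as np

def spiral_search_offsets(step_radius, n_nodes=5):
    """Generate outward spiral offsets for re-predicting the head position.

    Node "0" is the starting position; nodes "1"..n_nodes spiral outward
    with growing radius, matching the repeated outward search described
    above. step_radius would come from historical pupil data in the patent.
    """
    offsets = [np.zeros(2)]                                # node "0"
    for i in range(1, n_nodes + 1):
        theta = 2.0 * np.pi * i / n_nodes                  # angle around spiral
        r = step_radius * i                                # radius grows per node
        offsets.append(np.array([r * np.cos(theta), r * np.sin(theta)]))
    return offsets
```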
And S250, taking the three-dimensional head orientation information determined according to the user image as the three-dimensional head orientation information of the user at the current moment.
Specifically, when the user's three-dimensional head orientation is to be predicted with the spiral search rule, the three-dimensional head orientation information determined from the user image is inaccurate, i.e., the target second cameras called according to it cannot detect the user's binocular pupil information. In this embodiment that three-dimensional head orientation information is first taken as the user's three-dimensional head orientation information at the current moment, so that it can then be adjusted.
And S260, adjusting the three-dimensional head orientation information at the current moment according to the spiral search rule, and taking the adjusted three-dimensional head orientation information at the current moment as first three-dimensional head orientation information.
Specifically, the three-dimensional head orientation information at the current time may be adjusted according to the spiral search graph corresponding to the spiral search rule. Illustratively, as in fig. 7, a node "0" represents three-dimensional head position information determined from the user image, that is, three-dimensional head position information at the current time. When the re-search is performed outward according to the sequence of the nodes "1" to "5", the three-dimensional head position and the head orientation corresponding to the next node "1" of the node "0" may be determined as the adjusted three-dimensional head orientation information at the current time, that is, the first three-dimensional head orientation information, so that the head orientation information is reasonably adjusted and predicted.
And S270, determining at least two pieces of first two-dimensional binocular pupil orientation information of the user at the current moment according to the first three-dimensional head orientation information.
Specifically, after the first three-dimensional head orientation information is predicted, a first preset number of target second cameras can be determined from the plurality of second cameras according to it, each target second camera called to collect the user's face image, and at least two pieces of first two-dimensional binocular pupil orientation information of the user at the current moment determined from the face images.
S280, determining three-dimensional binocular pupil orientation information of the user according to the at least two first two-dimensional binocular pupil orientation information.
Specifically, the embodiment may perform three-dimensional reconstruction on the pupils of the two eyes of the user according to the at least two pieces of first two-dimensional pupil orientation information, and calculate the three-dimensional pupil orientation information of the two eyes of the user, so that the calculation accuracy and speed may be improved.
In the technical scheme of this embodiment, when the target second cameras called according to the three-dimensional head orientation information determined from the user image cannot detect the user's binocular pupil information, that orientation information can be adjusted and predicted with the spiral search rule, yielding more accurate first three-dimensional head orientation information; based on it, at least two pieces of first two-dimensional binocular pupil orientation information can be obtained, so that the user's three-dimensional binocular pupil orientation information can be determined. This solves the failure to track the eyes when, for example, the user turns suddenly, and further improves the eye tracking speed and precision.
On the basis of the above technical scheme, if at least two pieces of first two-dimensional binocular pupil orientation information at the current time cannot be determined according to the first three-dimensional head orientation information, and the current test duration is less than the preset duration, the first three-dimensional head orientation information is taken as the three-dimensional head orientation information at the current time, and step S260 is performed again.
The preset duration may be determined according to the frame rate of the first camera; for example, the time interval between two adjacent frames captured by the first camera may be used as the preset duration, which ensures that the prediction adjustment finishes before the next three-dimensional head orientation information determined from a user image becomes available. Specifically, after a first preset number of target second cameras are determined from the plurality of second cameras according to the first three-dimensional head orientation information, and each target second camera is called to collect a face image of the user, if at least two pieces of first two-dimensional binocular pupil orientation information of the user at the current moment cannot be determined from the face images and the current test duration is less than the preset duration, the predicted first three-dimensional head orientation information is wrong but time remains for another attempt. In that case, the first three-dimensional head orientation information is taken as the three-dimensional head orientation information at the current time, and the operations of steps S260-S280 are performed again, adjusting the three-dimensional head orientation information at the current time once more and updating the first three-dimensional head orientation information. For example, when node "1" in fig. 7 serves as the three-dimensional head orientation information at the current time and the search continues outward in the order of nodes "1" to "5", the three-dimensional head position and head orientation corresponding to node "2", the node next to node "1", may be taken as the adjusted three-dimensional head orientation information at the current time, that is, the updated first three-dimensional head orientation information, so that the head orientation information is reasonably adjusted and predicted again.
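The retry loop described above can be summarized as follows. This is a minimal sketch: the node offsets, the two-coordinate pose representation, and the detect_pupils callback are illustrative assumptions, not parts of the claimed method.

```python
import time

# hypothetical (dx, dy) steps for nodes "1" to "5" around node "0" in fig. 7
SPIRAL_OFFSETS = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0)]

def predict_head_pose(initial_pose, detect_pupils, step_size, frame_interval):
    """Walk the spiral nodes until pupils are found or the frame interval expires.

    initial_pose   -- head pose ("node 0") determined from the user image, as (x, y)
    detect_pupils  -- callable: pose -> list of 2D binocular pupil detections
    step_size      -- spatial distance between adjacent spiral nodes
    frame_interval -- preset duration, i.e. the time between two first-camera frames
    """
    start = time.monotonic()
    for dx, dy in SPIRAL_OFFSETS:
        if time.monotonic() - start >= frame_interval:
            break  # the next user-image head pose is imminent; stop predicting
        # take the next spiral node as the adjusted ("first") head orientation
        pose = (initial_pose[0] + dx * step_size, initial_pose[1] + dy * step_size)
        detections = detect_pupils(pose)
        if len(detections) >= 2:
            return pose, detections  # enough views for 3D reconstruction
    return None, []  # caller falls back to the previous moment's 3D pupil info
```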
On the basis of the above technical scheme, if at least two pieces of first two-dimensional binocular pupil orientation information at the current moment cannot be determined according to the first three-dimensional head orientation information, and the current test duration is equal to the preset duration, the three-dimensional binocular pupil orientation information determined for the user at the previous moment is used as the user's three-dimensional binocular pupil orientation information at the current moment.
Specifically, if at least two pieces of first two-dimensional binocular pupil orientation information at the current time cannot be determined according to the first three-dimensional head orientation information, and the current test duration is equal to the preset duration, new three-dimensional head orientation information determined from the next user image is about to become available. The prediction operation may therefore be stopped, to avoid searching indefinitely, and the three-dimensional binocular pupil orientation information determined at the previous time is directly used as the three-dimensional binocular pupil orientation information at the current time. As a refinement, if exactly one piece of first two-dimensional binocular pupil orientation information at the current time has been determined when the test duration reaches the preset duration, the depth distance may be calculated from the two-dimensional binocular pupil orientation information determined at the previous time, and the three-dimensional binocular pupil orientation information at the current time may then be calculated more accurately from that depth distance and the single piece of first two-dimensional binocular pupil orientation information.
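For the single-view fallback just described, one common way to combine a carried-over depth with a single 2D detection is pinhole back-projection. The intrinsics model below is an assumption; the patent only states that a depth distance from the previous moment is reused.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Recover a 3D point from one 2D pixel (u, v) plus a known depth, under
    an assumed pinhole model with focal lengths (fx, fy) and principal point
    (cx, cy)."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)
```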
On the basis of the above technical scheme, before calling the first camera to acquire the user image in the preset viewing area, the method further comprises: determining the number of first layout positions corresponding to the first camera according to a viewing angle range corresponding to a preset viewing area and a first preset orientation error corresponding to the first camera; and determining the number of the first cameras corresponding to each first layout position according to the first visual angle of the first cameras.
The first preset orientation error may refer to the angle the user's head must rotate before a new first camera needs to be matched, and may be preset according to the service requirements and the scene. For example, if the first cameras are arranged on a circle, the first preset orientation error may be set to 60 degrees, that is, for every 60 degrees of head rotation there is a new first camera able to capture a frontal image of the user. The viewing angle range refers to the viewing angle corresponding to the preset viewing area; for example, if the preset viewing area is a circular area, the corresponding viewing angle range is 360 degrees. The first view angle of a first camera refers to its range of shooting angles. The preset detection distance corresponding to the preset viewing area may be the maximum distance from a user in the preset viewing area to the first camera. The first depth of field of each first camera is larger than the preset detection distance, so that the first camera can clearly capture user images anywhere in the preset viewing area.
Specifically, the present embodiment may determine, as the number of the first layout positions, a result obtained by dividing the viewing angle range corresponding to the preset viewing area by the first preset orientation error corresponding to the first camera. And determining the number of the first cameras corresponding to each first layout position according to the view angle range required by each first layout position and the first view angle of the first cameras, so that the whole preset viewing area can be covered. For example, if the required viewing angle range of the first layout positions is 150 degrees and the first viewing angle of the first camera is 150 degrees, it may be determined that only one first camera needs to be installed at each first layout position.
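The arithmetic of this step can be sketched as follows, assuming the divisions are rounded up to whole cameras; the patent only says the counts are obtained by division.

```python
import math

def first_camera_layout(view_range_deg, orientation_error_deg,
                        per_position_range_deg, camera_fov_deg):
    """Number of first layout positions, and first cameras per position."""
    num_positions = math.ceil(view_range_deg / orientation_error_deg)
    cams_per_position = math.ceil(per_position_range_deg / camera_fov_deg)
    return num_positions, cams_per_position

# The example in the text: 360 / 60 = 6 positions, 150 / 150 = 1 camera each.
print(first_camera_layout(360, 60, 150, 150))  # -> (6, 1)
```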
On the basis of the above technical scheme, before calling the first camera to acquire the user image in the preset viewing area, the method further comprises: determining the number of second layout positions corresponding to the second camera according to a viewing angle range corresponding to the preset viewing area and a second preset orientation error corresponding to the second camera; determining the number of depth layers corresponding to each second layout position according to the second depth of field of the second camera and the preset detection distance corresponding to the preset viewing area; and determining the number of second cameras corresponding to each layer of depth of field according to the second visual angle of the second cameras.
The second preset orientation error may refer to a degree of rotation of the head of the user when the optimal second camera is matched, and may be preset according to a service requirement and a scene. For example, if the first camera is laid out on a circle, the first preset orientation error may be set to 30 degrees, so that an optimal second camera may exist to capture the front image of the user every 30 degrees of head rotation of the user, and it may be ensured that the left and right eyes of the user are distinguished. The viewing angle range may refer to a viewing angle corresponding to the preset viewing area, for example, if the preset viewing area is a circular area, the corresponding preset viewing angle is 360 degrees. The preset detection distance corresponding to the preset viewing area may be a maximum distance from the user to the second camera in the preset viewing area. The second depth of field of the second camera can be smaller than the preset detection distance, so that at least two layers of depth of field layouts can be performed, the resolution of the image is guaranteed, and the contrast is improved. The second angle of view of the second camera may refer to a range of shooting angles of the second camera, which may be obtained by lens parameters of the second camera. In this embodiment, the first camera is used for tracking the head of the user, and has low requirements on the face details and the tracking speed, and the second camera is used for tracking the eye position, and has high requirements on the tracking speed and the positioning accuracy, so that the first preset orientation error can be set to be greater than or equal to the second preset orientation error, and the first view angle is greater than the second view angle.
Specifically, this embodiment may determine the number of second layout positions as the result of dividing the viewing angle range corresponding to the preset viewing area by the second preset orientation error corresponding to the second camera. The appropriate number of depth-of-field layers is then selected according to the second depth of field of the second camera and the preset detection distance corresponding to the preset viewing area, so that the whole preset viewing area is covered and images remain sharp at different shooting distances, improving image resolution and contrast. Illustratively, this embodiment may use three depth-of-field layers (near, mid, far) for the multi-layer layout, and each layer may increase the tracking coverage by arranging its second cameras side by side. The number of second cameras corresponding to each depth-of-field layer is determined according to the viewing angle range required by that layer and the second view angle of the second camera, so that the whole preset viewing area can be covered. For example, if the horizontal viewing angle range required for each depth-of-field layer is 150 degrees and the second view angle of the second camera is 30 degrees, 6 second cameras may be assigned to each layer to guarantee 150 degrees of horizontal coverage, as shown in fig. 8. To improve the coverage angle range in the vertical direction, at least two second camera groups may be stacked at each second layout position, each group comprising at least two layers of second cameras with different depths of field, so that the user can still be tracked when lowering or raising the head. For example, when a second layout position uses 3 depth-of-field layers, those 3 layers (i.e., one second camera group) correspond to 18 second cameras, and if two second camera groups are stacked at each layout position, 36 second cameras need to be installed at each second layout position; this count is worked through in the sketch below.
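A worked count using the figures quoted above. The groupings come from the text; treating the counts as straight products is an assumption.

```python
# Figures quoted in the text for a circular viewing area:
view_range_deg = 360
orientation_error_deg = 30
num_positions = view_range_deg // orientation_error_deg      # 12 second layout positions

depth_layers = 3          # near / mid / far, spanning the preset detection distance
cams_per_layer = 6        # 150 degrees of horizontal coverage with 30-degree cameras
groups_per_position = 2   # stacked groups for head-down / head-up coverage

cams_per_position = depth_layers * cams_per_layer * groups_per_position  # 36
total_cameras = num_positions * cams_per_position                        # 432
print(num_positions, cams_per_position, total_cameras)  # 12 36 432
```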
After the corresponding number of second cameras is installed at each second layout position, the orientation configuration information of each second camera can be stored in a three-level data table to speed up matching and lookup of the second cameras, and thus speed up determination of the target second cameras. Illustratively, the three-level data table may include one second-layout-position pointer table, a plurality of structure pointer tables, and a plurality of data structure tables. Fig. 9 shows an example of the three-level data table corresponding to the second cameras. As shown in fig. 9, the second-layout-position pointer table arranges the second layout positions sequentially along the layout direction (e.g., clockwise or counterclockwise). Each second layout position corresponds to a structure pointer table, which stores the identification code cid of each second camera at that layout position. The data structure tables store the orientation configuration information struct camera_position corresponding to each second camera.
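A minimal sketch of this three-level lookup, in which Python dictionaries and lists stand in for the pointer tables. The field names cid and camera_position follow the text; everything else is illustrative.

```python
from dataclasses import dataclass

@dataclass
class CameraPosition:            # one data-structure-table entry (struct camera_position)
    x: float
    y: float
    z: float
    yaw_deg: float

# level 3: data structure tables, keyed by camera identification code cid
camera_positions = {"cid_0": CameraPosition(0.0, 1.2, 3.0, 180.0)}

# level 2: one structure pointer table per second layout position, listing its cids
structure_tables = [["cid_0"]]

# level 1: second-layout-position pointer table, ordered along the layout direction
position_table = [0]             # each entry indexes a structure pointer table

def cameras_at(position_index):
    """Resolve a second layout position to its cameras' orientation configuration."""
    return [camera_positions[cid]
            for cid in structure_tables[position_table[position_index]]]
```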
The following is an embodiment of an eye tracking apparatus provided in an embodiment of the present invention, which belongs to the same inventive concept as the eye tracking method in the above embodiments, and reference may be made to the above embodiment of the eye tracking method for details that are not described in detail in the embodiment of the eye tracking apparatus.
EXAMPLE III
Fig. 10 is a schematic structural diagram of a human eye tracking device according to a third embodiment of the present invention, applicable to tracking and positioning the binocular pupils of a user watching a 3D display screen. The device includes: a three-dimensional head orientation information determining module 310, a target second camera determining module 320, a two-dimensional binocular pupil orientation information determining module 330, and a three-dimensional binocular pupil orientation information determining module 340.
The three-dimensional head orientation information determining module 310 is configured to invoke a first camera to acquire a user image in a preset viewing area, and determine three-dimensional head orientation information corresponding to each user in the preset viewing area according to the user image; the target second camera determining module 320 is configured to determine a first preset number of target second cameras from the plurality of second cameras according to the three-dimensional head orientation information; the two-dimensional binocular pupil orientation information determining module 330 is configured to call each target second camera to acquire a face image of the user, and determine two-dimensional binocular pupil orientation information of the user according to each face image; the three-dimensional binocular pupil orientation information determining module 340 is configured to determine three-dimensional binocular pupil orientation information of the user according to at least two pieces of the two-dimensional binocular pupil orientation information.
Optionally, the target second camera determining module 320 includes:
the candidate second camera determining unit is used for determining a second preset number of candidate second cameras corresponding to the user and the matching degree corresponding to each candidate second camera according to the three-dimensional head position information and the position configuration information of each second camera;
the target second camera determining unit is used for screening out a first preset number of target second cameras from the candidate second cameras according to the matching degrees and the current calling times corresponding to each candidate second camera; wherein the first preset number is smaller than the second preset number.
Optionally, the target second camera determining unit is specifically configured to: screening out candidate second cameras with the current calling times smaller than or equal to the preset calling times according to the current calling times corresponding to the candidate second cameras to serve as second cameras to be selected; and performing descending order arrangement on the second cameras to be selected based on the matching degrees corresponding to the second cameras to be selected, and determining the first preset number of the second cameras to be selected after arrangement as target second cameras.
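The screening just described amounts to a filter followed by a sort; a minimal sketch follows, in which the tuple layout and names are assumptions.

```python
def select_target_cameras(candidates, first_preset_number, max_calls):
    """candidates: list of (camera_id, match_degree, current_call_count) tuples."""
    eligible = [c for c in candidates if c[2] <= max_calls]   # call-count screening
    eligible.sort(key=lambda c: c[1], reverse=True)           # descending match degree
    return [c[0] for c in eligible[:first_preset_number]]

# select_target_cameras([("a", 0.9, 1), ("b", 0.7, 5), ("c", 0.8, 0)], 2, 3)
# -> ["a", "c"]   ("b" is screened out because it has been called too often)
```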
Optionally, the three-dimensional binocular pupil orientation information determining module 340 is specifically configured to: if at least two pieces of two-dimensional binocular pupil orientation information corresponding to the current moment of the user cannot be determined according to each face image, determine a spiral search rule according to the three-dimensional binocular pupil orientation information at historical moments; take the three-dimensional head orientation information determined according to the user image as the three-dimensional head orientation information of the user at the current moment; adjust the three-dimensional head orientation information at the current moment according to the spiral search rule, and take the adjusted three-dimensional head orientation information at the current moment as first three-dimensional head orientation information; determine at least two pieces of first two-dimensional binocular pupil orientation information of the user at the current moment according to the first three-dimensional head orientation information; and determine the three-dimensional binocular pupil orientation information of the user according to the at least two pieces of first two-dimensional binocular pupil orientation information.
Optionally, the apparatus further comprises:
the second three-dimensional head orientation information determining module is used for estimating second three-dimensional head orientation information of the user at the next moment according to the three-dimensional binocular pupil orientation information of the user at the current moment and the three-dimensional binocular pupil orientation information at historical moments, after the three-dimensional binocular pupil orientation information of the user is determined according to the at least two pieces of two-dimensional binocular pupil orientation information and before the three-dimensional head orientation information corresponding to each user in the preset viewing area is determined according to the user image;
and the three-dimensional binocular pupil position information estimation module is used for estimating the three-dimensional binocular pupil position information of the user at the next moment according to the second three-dimensional head position information.
Optionally, the second three-dimensional head position information comprises a three-dimensional head position and a rotation angle; correspondingly, second three-dimensional head orientation information of the user at the next moment is estimated according to the following formula:
where (X_{p1}, Y_{p1}, Z_{p1}) and α_1 are the three-dimensional binocular pupil position and gaze direction angle of the user at the current moment P1; (X_{p2}, Y_{p2}, Z_{p2}) and α_2 are the three-dimensional binocular pupil position and gaze direction angle of the user at the historical moment P2; (X_{p3}, Y_{p3}, Z_{p3}) and α_3 are the three-dimensional binocular pupil position and gaze direction angle of the user at the historical moment P3; and (X, Y, Z) and α are the estimated three-dimensional head position and rotation angle of the user at the next moment.
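The formula itself appears only as a drawing in the original. As a stand-in consistent with three equally spaced samples, the sketch below uses second-order extrapolation (next = 3·P1 − 3·P2 + P3, applied per coordinate); this is an assumption, not the claimed formula.

```python
def predict_next(pose1, pose2, pose3):
    """pose = (X, Y, Z, alpha); pose1 is the current sample P1, pose2 and
    pose3 the historical samples P2 and P3 (assumed equally spaced in time)."""
    return tuple(3.0 * a - 3.0 * b + c for a, b, c in zip(pose1, pose2, pose3))

# e.g. a head moving at constant speed along X:
# predict_next((2, 0, 0, 0), (1, 0, 0, 0), (0, 0, 0, 0)) -> (3.0, 0.0, 0.0, 0.0)
```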
Optionally, the apparatus further comprises:
and the target face area determining module is used for determining the position and the size of a target face area in the face image collected by each target second camera according to the three-dimensional head orientation information after determining a first preset number of target second cameras from the plurality of second cameras.
Optionally, the two-dimensional binocular pupil orientation information determining module 330 is specifically configured to: determine, according to the position and size of the target face area, the time at which its data arrive on the scan lines; receive the target face area data sent by the target second camera at that time; and determine the two-dimensional binocular pupil orientation information of the user from the received target face area data.
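One way to realize this timed reception, assuming a sequential row readout; the arithmetic and parameter names are illustrative, since the patent does not fix them.

```python
def roi_arrival_window(frame_start, line_time, roi_top_row, roi_height):
    """Time window in which the target face area's rows arrive, assuming the
    camera reads out one scan line per `line_time` starting at `frame_start`."""
    begin = frame_start + roi_top_row * line_time
    end = frame_start + (roi_top_row + roi_height) * line_time
    return begin, end

# e.g. a 480-row sensor at 90 frames per second: line_time ~= (1 / 90) / 480 s,
# so a face region spanning rows 200-299 arrives about 4.6-6.9 ms into the frame.
```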
Optionally, the first camera is a color camera or a 3D camera, the second camera is a black-and-white camera, and an illumination infrared light source is arranged at a mounting position of the second camera.
Optionally, the apparatus further comprises: the number determining module of the first layout positions is used for determining the number of the first layout positions corresponding to the first camera according to the viewing angle range corresponding to the preset viewing area and the first preset orientation error corresponding to the first camera before the first camera is called to collect the user image in the preset viewing area;
and the first camera number determining module is used for determining the number of the first cameras corresponding to each first layout position according to the first visual angle of the first cameras, wherein the first depth of field of each first camera is greater than the preset detection distance corresponding to the preset viewing area.
Optionally, the apparatus further comprises: the number determining module of the second layout positions is used for determining the number of the second layout positions corresponding to the second camera according to the viewing angle range corresponding to the preset viewing area and the second preset orientation error corresponding to the second camera before the first camera is called to collect the user image in the preset viewing area;
the depth-of-field layer number determining module is used for determining the depth-of-field layer number corresponding to each second layout position according to the second depth of field of the second camera and the preset detection distance corresponding to the preset viewing area, wherein the second depth of field is smaller than the preset detection distance;
and the second camera number determining module is used for determining the number of the second cameras corresponding to each layer of depth of field according to the second visual angle of the second cameras.
The eye tracking device provided by the embodiment of the invention can execute the eye tracking method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects for executing the eye tracking method.
It should be noted that, in the embodiment of the eye tracking apparatus, the included units and modules are merely divided according to the functional logic, but are not limited to the above division, as long as the corresponding functions can be implemented; in addition, the specific names of the functional units are only for the convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.
EXAMPLE IV
Fig. 11 is a schematic structural diagram of an eye tracking system according to a fourth embodiment of the present invention. Referring to fig. 11, the system includes: a first camera 410, a plurality of second cameras 420, and a human eye tracking device 430; the eye tracking apparatus 430 may be used to implement the eye tracking method according to any embodiment of the present invention.
The first camera 410 is used for capturing an image of a user in a preset viewing area, so as to track the head of the user. The first camera 410 in this embodiment may be a 3D camera or a plurality of 2D cameras. Since the first camera 410 is responsible for tracking the head of the user, the requirements on the details of the face and the tracking speed are not high, so that the first camera 410 with a large viewing angle can be selected. The second camera 420 is used for collecting the face image of the user so as to realize high-speed tracking of human eyes. The second camera 420 in this embodiment may be a 2D camera. Illustratively, the first camera 410 may be a high-definition color camera and the second camera 420 may be a black and white camera. The resolution of the first camera in this embodiment may be greater than the resolution of the second camera. According to the embodiment, the calculation speed can be increased on the premise of ensuring the calculation accuracy by using the second camera with lower resolution.
The layout requirement of the first camera 410 in this embodiment may be set as, but is not limited to: (1) The depth of field of the first camera 410 is greater than the preset detection distance corresponding to the preset viewing area, so that only the first cameras with one depth of field need to be utilized within the preset detection distance range, that is, the depth of field of each first camera is the same. (2) The area occupied by the head region in the user image acquired by the first camera 410 may be larger than the precision requirement, such as 60 × 60 pixels, so as to make the captured head region clearer. (3) A sufficient number of first cameras are set according to a first preset orientation error, which may be set to 60 degrees, and may cover the entire preset viewing area.
The present embodiment may layout the first camera based on the layout requirements of the first camera 410 and the preset viewing area. Specifically, at least one first camera is arranged at each first layout position in the preset viewing area, and the total detection area corresponding to each first camera is the preset viewing area, that is, the total shooting range of each first camera can cover the preset viewing area. For example, when the preset viewing area is an area in a circle, the first layout positions may be distributed on the circle corresponding to the preset viewing area, and at least one first camera is disposed at each first layout position, and the shooting direction of each first camera faces the center position of the circle. The distance between every two adjacent first layout positions can be equal, namely, the first layout positions are uniformly distributed on the closed shape corresponding to the preset viewing area; or within a preset allowable error range. For example, fig. 12 shows a layout example of the first cameras when the preset viewing area is an area inside a circle. The dotted line in fig. 12 indicates the optical axis of the first camera, and the solid lines on both sides of the dotted line indicate the viewing angle range of the first camera. As shown in fig. 12, 6 first layout positions are uniformly distributed on the circle, and one first camera is disposed at each first layout position, that is, the entire circular inner area can be covered by disposing 6 first cameras, so that user images at various positions in the circular inner area can be collected, and the layout manner of the first cameras can be referred to as a circular inward type.
For example, when the preset viewing area is a circular outer area, the first layout positions may be distributed on a circle corresponding to the preset viewing area, and each first layout position is provided with at least one first camera, and the shooting direction of each first camera deviates from the center position of the circle. The distance between every two adjacent first layout positions can be equal, namely, the first layout positions are uniformly distributed on the closed shape corresponding to the preset viewing area; or within a preset allowable error range. For example, fig. 13 shows a layout example of the first camera when the preset viewing area is a circular outer area. The broken line in fig. 13 indicates the optical axis of the first camera, and the solid lines on both sides of the broken line indicate the viewing angle range of the first camera. As shown in fig. 13, 6 first layout positions are uniformly distributed on the circle, and one first camera is disposed at each first layout position, that is, the entire circular outer area can be covered by disposing 6 first cameras, so that the user image in the circular outer area can be captured, and the layout manner of the first cameras can be referred to as a circular outward shape.
For example, when the preset viewing area is a straight single-sided viewing area, such as a viewing area in a movie theater, the first layout positions may be distributed on a straight line of the straight single-sided viewing area, and at least one first camera is disposed at each first layout position, and a photographing direction of each first camera is photographed toward the straight single-sided viewing area. The distance between every two adjacent first layout positions can be equal, namely, the first layout positions are uniformly distributed on a straight line of a straight line single-side viewing area; or within a preset allowable error range. For example, fig. 14 shows a layout example of the first cameras when the preset viewing zone is a straight-line one-sided viewing zone. The broken line in fig. 14 indicates the optical axis of the first camera, and the solid lines on both sides of the broken line indicate the viewing angle range of the first camera. As shown in fig. 14, 3 first layout positions are uniformly distributed on a straight line, and one first camera is provided at each first layout position, and the three first cameras can shoot toward a straight line one-sided viewing area in a fan-shaped direction so as to cover the entire straight line one-sided viewing area, and this layout manner of the first cameras can be referred to as a planar type.
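The circular inward, circular outward, and planar layouts above differ only in where the positions sit and which way the optical axes point. A small sketch for the two circular cases follows; the coordinate conventions are assumed.

```python
import math

def camera_poses_on_circle(num_positions, radius, inward=True):
    """Evenly spaced positions on a circle; each camera's optical axis points
    toward the center (circular inward) or away from it (circular outward)."""
    poses = []
    for i in range(num_positions):
        theta = 2.0 * math.pi * i / num_positions
        x, y = radius * math.cos(theta), radius * math.sin(theta)
        yaw = theta + math.pi if inward else theta   # aim at or away from the center
        poses.append((x, y, math.degrees(yaw) % 360.0))
    return poses

# camera_poses_on_circle(6, 3.0) reproduces the six inward-facing first cameras of fig. 12.
```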
The layout requirements of the second camera 420 in this embodiment may be set as, but are not limited to: (1) At least two depth-of-field layers are used at each second layout position, to improve image resolution and contrast. (2) The frame rate of the second camera is greater than 60 frames per second. (3) The area occupied by the face region in the face image acquired by the second camera may exceed the precision requirement, such as 100 × 100 pixels, to improve the calculation precision of the two-dimensional binocular pupil orientation information. (4) The shortest distance d across the depth-of-field crossing region of two adjacent second cameras at each second layout position (such as the shaded region in fig. 15) is greater than the distance between the two pupils, to ensure that a face image containing both the left and right pupils can be acquired; d may be set to 6.5 cm, a typical interpupillary distance. (5) A sufficient number of second layout positions are set according to the second preset orientation error, which may be set to 30 degrees, to cover the entire preset viewing area. (6) When the second camera is a black-and-white camera, an infrared illumination source is arranged at the center of each second layout position to provide fill light.
For example, at least one second camera group may be disposed at each second layout position; each second camera group includes at least two layers of second cameras, each layer consists of several second cameras with the same depth of field, and different layers have different depths of field. As shown in fig. 8, when the user is within a detection range of 1 to 5 meters from the second cameras, each second camera may be a high-speed (greater than 90 frames per second), low-resolution (640 × 480) black-and-white camera with a second view angle of 30 degrees. Each second camera group may then include three second cameras with different focal lengths, so that users at different distances are imaged clearly, and by arranging 6 cameras side by side per layer the horizontal coverage can reach 150 degrees at maximum.
This embodiment may lay out the second cameras based on the layout requirements of the second cameras 420 and the preset viewing area. Specifically, at least one second camera is arranged at each second layout position in the preset viewing area, and the total detection area of all second cameras covers the preset viewing area, that is, their combined shooting range covers the preset viewing area. For example, when the preset viewing area is the area inside a circle, the second layout positions may be distributed on the circle corresponding to the preset viewing area, with at least one second camera at each second layout position and the shooting direction of each second camera facing the center of the circle. The distance between every two adjacent second layout positions can be equal, that is, the second layout positions are uniformly distributed on the closed shape corresponding to the preset viewing area, or can be within a preset allowable error range. For example, fig. 16 shows a layout example of the second cameras when the preset viewing area is the area inside a circle. As shown in fig. 16, 12 second layout positions are uniformly distributed on the circle; if each is laid out per fig. 8, 18 second cameras can be arranged horizontally at each second layout position (each second camera group includes 18 second cameras), guaranteeing a 150-degree horizontal shooting angle per position, for a total of 12 × 18 = 216 second cameras. To improve the coverage angle range in the vertical direction, at least two second camera groups may additionally be stacked at each second layout position, so that the user can still be tracked when looking down or up; in that case at least 12 × 18 × 2 = 432 second cameras are needed in total, so that for every 30 degrees of head rotation an optimal target second camera can capture a frontal image of the user, reducing occlusion and improving tracking accuracy. This arrangement of the second cameras may be referred to as circular inward.
For example, when the preset viewing area is a circular outer area, the second layout positions may be distributed on a closed shape corresponding to the preset viewing area, and at least one second camera is disposed at each second layout position, and the shooting direction of each second camera deviates from the center position of the circle. The distance between every two adjacent second layout positions can be equal, namely, the second layout positions are uniformly distributed on the closed shape corresponding to the preset viewing area; or within a preset allowable error range. For example, fig. 17 shows a layout example of the second camera when the preset viewing area is a circular outer area. As shown in fig. 17, 12 second layout positions 1 to 12 are uniformly distributed on the circle, and each second layout position is provided with 18 second cameras in the horizontal direction, so as to ensure that the shooting angle of each second layout position in the horizontal direction can be 150 degrees.
For example, when the preset viewing area is a straight single-sided viewing area, such as a viewing area in a movie theater, the second layout positions may be distributed on a straight line of the straight single-sided viewing area, and at least one second camera is disposed at each second layout position, and a photographing direction of each second camera is photographed toward the straight single-sided viewing area. The distance between every two adjacent second layout positions can be equal, namely, the second layout positions are uniformly distributed on a straight line of the straight line single-side viewing area; and can be within the preset allowable error range. For example, fig. 18 shows an example of the layout of the second camera when the preset viewing zone is a straight-line one-sided viewing zone. As shown in fig. 18, 3 second layout positions are uniformly distributed on a straight line, if each second layout position is laid out according to the manner of fig. 8, 18 second cameras are arranged to ensure that the shooting angle of each second layout position in the horizontal direction can be 150 degrees, and the shooting direction corresponding to the three second layout positions can be a fan shape so as to cover the whole straight line single-side viewing area of the second layout positions in the horizontal direction, and this manner of laying out the second cameras can be referred to as a plane type.
The working process of the eye tracking system in this embodiment is as follows. The eye tracking device 430 calls the first camera 410, which collects a user image in the preset viewing area and transmits it to the eye tracking device 430. The eye tracking device 430 determines the three-dimensional head orientation information corresponding to each user in the preset viewing area according to the user image, determines a first preset number of target second cameras from the plurality of second cameras according to the three-dimensional head orientation information, and calls each target second camera. Each target second camera collects a face image of the user and transmits it to the eye tracking device 430, which determines the two-dimensional binocular pupil orientation information of the user from each face image and then determines the three-dimensional binocular pupil orientation information of the user from at least two pieces of two-dimensional binocular pupil orientation information, thereby realizing high-speed tracking of the user's binocular pupils while improving calculation accuracy.
According to the eye tracking system provided by the embodiment, the first camera and the second camera are used for respectively realizing head identification and high-speed eye tracking, and the eye tracking device 430 is used for scheduling and managing the second camera, so that high-speed tracking of pupils of both eyes of a user can be realized, and meanwhile, the calculation precision is improved.
Based on the above technical solutions, the eye tracking apparatus 430 may be integrated into one server to implement the eye tracking method provided by any embodiment of the present invention, or the first client, the plurality of second clients, and the central server may be utilized to implement the eye tracking method provided by any embodiment of the present invention. Fig. 19 is a schematic structural diagram of another eye tracking system provided in this embodiment, and as shown in fig. 19, the system includes: a first camera 410, a plurality of second cameras 420, a first client 440, a plurality of second clients 450, and a central server 460.
The first camera 410 is connected to the first client 440, and is configured to collect a user image in the preset viewing area and send the user image to the first client 440. The first client 440 is connected with the central server 460, and is configured to determine the three-dimensional head orientation information corresponding to each user in the preset viewing area according to the user image sent by the first camera 410, and send the three-dimensional head orientation information to the central server 460. Each second camera 420 is connected with a corresponding second client 450, and is configured to collect a face image of the user and send the face image to that second client 450. The second client 450 is connected with the central server 460, and is configured to determine the two-dimensional binocular pupil orientation information of the user according to each face image, and send it to the central server 460. The central server 460 is configured to determine a first preset number of target second cameras from the plurality of second cameras according to the three-dimensional head orientation information sent by the first client 440, call the second client connected to each target second camera, obtain the two-dimensional binocular pupil orientation information sent by each called second client, and determine the three-dimensional binocular pupil orientation information of the user according to at least two pieces of two-dimensional binocular pupil orientation information.
It should be noted that the first client may be, but is not limited to, a high-performance PC (Personal Computer), and the second client may be, but is not limited to, an embedded computer, to increase response speed. In this embodiment there may be one or more first cameras 410, to extend the head tracking range and improve head tracking accuracy. When there are multiple first cameras 410, each is connected to the first client 440, so that the first client 440 can process the user images acquired by all first cameras and determine the three-dimensional head orientation information of each user in the preset viewing area more accurately.
Specifically, the working process of the eye tracking system of fig. 19 is as follows. The first client 440 calls the first camera 410, which collects a user image in the preset viewing area and sends it to the first client 440. The first client 440 determines the three-dimensional head orientation information corresponding to each user in the preset viewing area according to the user image, and sends it to the central server 460. The central server 460 determines a first preset number of target second cameras from the plurality of second cameras 420 according to the three-dimensional head orientation information sent by the first client 440, and calls the second client connected to each target second camera. Each such second client 450 calls its target second camera, which collects a face image of the user and sends it back to that second client 450. The second client 450 determines the two-dimensional binocular pupil orientation information of the user from each received face image and sends it to the central server 460, which determines the three-dimensional binocular pupil orientation information of the user from at least two pieces of two-dimensional binocular pupil orientation information. The central server 460 may further be connected to the 3D display screen driver, inputting the user's three-dimensional binocular pupil orientation information into the driver so that the 3D display screen can determine the corresponding display data and the user can view the corresponding three-dimensional picture. In this embodiment, the first client, the second clients, and the central server are each responsible for one link of the eye tracking process: the first client processes the user image, the second clients process the face images, and the central server matches and schedules the second cameras and calculates the user's three-dimensional binocular pupil orientation information. The system therefore runs faster and processes more efficiently, further improving the eye tracking speed.
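The division of labor just described can be summarized in a short sketch. Every callable below is a stand-in, since the patent fixes which node performs which step but not any API.

```python
def run_tracking_cycle(first_client, central_server, second_clients, n_targets):
    # 1. first client: user image -> per-user 3D head orientation information
    for head_pose in first_client["locate_heads"]():
        # 2. central server: match and schedule the target second cameras
        target_ids = central_server["match_cameras"](head_pose)[:n_targets]
        # 3. each scheduled second client: face image -> 2D binocular pupil info
        pupil_2d = [second_clients[cid]["locate_pupils"]() for cid in target_ids]
        # 4. central server: two or more views -> 3D pupil info -> display driver
        if len(pupil_2d) >= 2:
            central_server["to_display"](central_server["triangulate"](pupil_2d))
```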
EXAMPLE V
Fig. 20 is a schematic structural diagram of an apparatus according to a fifth embodiment of the present invention. Referring to fig. 20, the apparatus includes:
one or more processors 510;
a memory 520 for storing one or more programs;
an input device 530 for acquiring an image;
an output device 540 for displaying screen information;
the one or more programs, when executed by the one or more processors 510, cause the one or more processors 510 to implement a method for eye tracking as set forth in any of the embodiments above.
In FIG. 20, one processor 510 is taken as an example; the processor 510, the memory 520, the input device 530 and the output device 540 in the apparatus may be connected by a bus or other means, and the connection by the bus is exemplified in fig. 20.
The memory 520 may be used as a computer-readable storage medium for storing software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the eye tracking method in the embodiment of the present invention (for example, the three-dimensional head position information determining module 310, the target second camera determining module 320, the two-dimensional binocular pupil position information determining module 330, and the three-dimensional binocular pupil position information determining module 340 in the eye tracking apparatus). The processor 510 implements the eye tracking method described above by executing software programs, instructions, and modules stored in the memory 520 to execute various functional applications of the device and data processing.
The memory 520 mainly includes a program storage area and a data storage area; the program storage area can store an operating system and the application program required by at least one function, and the data storage area may store data created according to the use of the device, and the like. Further, the memory 520 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the memory 520 can further include memory located remotely from the processor 510, connected to the device through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 530 may include a camera or other capturing device for capturing the user image and the face image and inputting them to the processor 510 for data processing.
The output device 540 may include a display device such as a display screen for displaying screen information.
The apparatus of the present embodiment and the eye tracking method of the present embodiment belong to the same inventive concept, and the technical details that are not described in detail in the present embodiment can be referred to the above embodiments, and the present embodiment has the same advantageous effects as the eye tracking method.
EXAMPLE VI
The present embodiment provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the eye tracking method as provided by any of the embodiments of the present invention.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer-readable storage medium may be, for example but not limited to: an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It will be understood by those skilled in the art that the modules or steps of the invention described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed across a network of computing devices, and optionally they may be implemented by program code executable by a computing device, such that it may be stored in a memory device and executed by a computing device, or it may be separately fabricated into various integrated circuit modules, or it may be fabricated by fabricating a plurality of modules or steps thereof into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. Those skilled in the art will appreciate that the present invention is not limited to the particular embodiments described herein, and that various obvious changes, rearrangements and substitutions will now be apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.
Claims (20)
1. A method of eye tracking, comprising:
calling a first camera to collect user images in a preset viewing area, and determining three-dimensional head orientation information corresponding to each user in the preset viewing area according to the user images;
determining a first preset number of target second cameras from the plurality of second cameras according to the three-dimensional head orientation information;
calling each target second camera to collect the face image of the user, and determining the two-dimensional binocular pupil orientation information of the user according to each face image;
determining three-dimensional binocular pupil orientation information of the user according to the at least two pieces of two-dimensional binocular pupil orientation information;
if at least two pieces of two-dimensional binocular pupil orientation information corresponding to the current moment of the user cannot be determined according to each face image, determining a spiral search rule according to the three-dimensional binocular pupil orientation information at historical moments; taking the three-dimensional head orientation information determined according to the user image as the three-dimensional head orientation information of the user at the current moment; adjusting the three-dimensional head orientation information at the current moment according to the spiral search rule, and taking the adjusted three-dimensional head orientation information at the current moment as first three-dimensional head orientation information; determining at least two pieces of first two-dimensional binocular pupil orientation information of the user at the current moment according to the first three-dimensional head orientation information; and determining the three-dimensional binocular pupil orientation information of the user according to the at least two pieces of first two-dimensional binocular pupil orientation information.
2. The method of claim 1, wherein determining a first preset number of target second cameras from a plurality of second cameras based on the three-dimensional head position information comprises:
determining a second preset number of candidate second cameras corresponding to the user and a matching degree corresponding to each candidate second camera according to the three-dimensional head position information and the position configuration information of each second camera;
screening a first preset number of target second cameras from the candidate second cameras according to the matching degrees and the current calling times corresponding to the candidate second cameras;
wherein the first preset number is less than or equal to the second preset number.
3. The method of claim 2, wherein the step of screening a first preset number of target second cameras from the candidate second cameras according to the matching degrees and the current calling times corresponding to each candidate second camera comprises:
screening out candidate second cameras with the current calling times smaller than or equal to preset calling times according to the current calling times corresponding to the candidate second cameras, and taking the candidate second cameras as second cameras to be selected;
and based on the matching degree corresponding to the second cameras to be selected, performing descending order arrangement on the second cameras to be selected, and determining the arranged first preset number of second cameras to be selected as target second cameras.
4. The method according to claim 1, further comprising, after determining three-dimensional binocular pupil orientation information of the user according to at least two of the two-dimensional binocular pupil orientation information, and before determining three-dimensional head orientation information corresponding to each user in the preset viewing area according to the user image:
according to the three-dimensional binocular pupil position information of the user at the current moment and the three-dimensional binocular pupil position information of the user at the historical moment, second three-dimensional head position information of the user at the next moment is estimated;
and estimating the three-dimensional binocular pupil orientation information of the user at the next moment according to the second three-dimensional head orientation information.
5. The method of claim 4, wherein the second three-dimensional head position information comprises a three-dimensional head position and a rotation angle;
correspondingly, second three-dimensional head orientation information of the user at the next moment is estimated according to the following formula:
where (X_{p1}, Y_{p1}, Z_{p1}) and α_1 are the three-dimensional binocular pupil position and gaze direction angle of the user at the current moment P1; (X_{p2}, Y_{p2}, Z_{p2}) and α_2 are the three-dimensional binocular pupil position and gaze direction angle of the user at the historical moment P2; (X_{p3}, Y_{p3}, Z_{p3}) and α_3 are the three-dimensional binocular pupil position and gaze direction angle of the user at the historical moment P3; and (X, Y, Z) and α are the estimated three-dimensional head position and rotation angle of the user at the next moment.
6. The method of claim 1, after determining a first preset number of target second cameras from the plurality of second cameras, further comprising:
and determining the position and the size of a target face area in the face image acquired by each target second camera according to the three-dimensional head position information.
7. The method of claim 6, wherein determining the two-dimensional binocular pupil position information of the user from each of the face images comprises:
determining, according to the position and the size of the target face area, the time at which data is to be received over a scan line; receiving, at that time, the target face area data sent by the target second camera; and determining the two-dimensional binocular pupil position information of the user according to the received target face area data.
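Claim 7's partial readout can be pictured as simple scan-line arithmetic. The sketch below assumes a rolling-shutter camera with a fixed line period; the claim itself only requires that the receive time be derived from the target face area's position and size, so the timing model here is an assumption.

```python
def face_readout_window(face_top_line, face_height_lines, line_period_us):
    """Return the (start, end) time in microseconds during which the
    scan lines covering the target face area arrive, so only that
    window of data needs to be received (hypothetical timing model).
    """
    start_us = face_top_line * line_period_us
    end_us = (face_top_line + face_height_lines + 1) * line_period_us
    return start_us, end_us
```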
8. The method according to any one of claims 1 to 7, wherein the first camera is a color camera or a 3D camera, the second camera is a black and white camera, and an illuminating infrared light source is arranged at the installation position of the second camera.
9. The method of claim 1, prior to calling the first camera to acquire the user image in the preset viewing area, further comprising:
determining the number of first layout positions corresponding to the first camera according to the viewing angle range corresponding to the preset viewing area and the first preset orientation error corresponding to the first camera;
and determining the number of the first cameras corresponding to each first layout position according to the first visual angle of the first cameras, wherein the first depth of field of each first camera is greater than the preset detection distance corresponding to the preset viewing area.
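Claim 9 leaves the exact counting rule open; one plausible reading is the ceiling arithmetic sketched below, where the rounding policy and the even split of the angle range across positions are assumptions, not the patent's stated method.

```python
import math

def first_camera_layout(viewing_angle_deg, preset_error_deg, camera_fov_deg):
    """Sketch of claim 9's two counts (not the patent's exact rule).

    Layout positions: enough that each serves a slice of the viewing
    angle range within the first preset orientation error. Cameras per
    position: enough to cover that slice given the first visual angle.
    """
    positions = math.ceil(viewing_angle_deg / preset_error_deg)
    per_position = math.ceil((viewing_angle_deg / positions) / camera_fov_deg)
    return positions, per_position
```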
10. The method of claim 1, prior to calling the first camera to acquire the user image in the preset viewing area, further comprising:
determining the number of second layout positions corresponding to the second camera according to the viewing angle range corresponding to the preset viewing area and a second preset orientation error corresponding to the second camera;
determining the number of depth-of-field layers corresponding to each second layout position according to a second depth of field of the second camera and a preset detection distance corresponding to the preset viewing area, wherein the second depth of field is smaller than the preset detection distance;
and determining the number of the second cameras corresponding to each layer of depth of field according to the second visual angle of the second cameras.
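Claim 10 extends the same counting to depth: because a second camera's depth of field is shorter than the preset detection distance, the distance is tiled in layers. A sketch under the same assumed rounding as the claim-9 sketch above:

```python
import math

def second_camera_layers(detection_distance, camera_dof,
                         viewing_angle_deg, preset_error_deg, camera_fov_deg):
    """Sketch of claim 10's counts (assumed rounding, as above)."""
    # Layout positions, determined as for the first cameras.
    positions = math.ceil(viewing_angle_deg / preset_error_deg)
    # Depth-of-field layers needed to tile the detection distance.
    layers = math.ceil(detection_distance / camera_dof)
    # Second cameras per layer, from the second visual angle.
    per_layer = math.ceil((viewing_angle_deg / positions) / camera_fov_deg)
    return positions, layers, per_layer
```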
11. An eye tracking device, comprising:
the three-dimensional head position information determining module is used for calling a first camera to acquire a user image in a preset viewing area and determining three-dimensional head position information corresponding to each user in the preset viewing area according to the user image;
the target second camera determining module is used for determining a first preset number of target second cameras from the plurality of second cameras according to the three-dimensional head position information;
the two-dimensional binocular pupil position information determining module is used for calling each target second camera to acquire a face image of the user and determining the two-dimensional binocular pupil position information of the user according to each face image;
the three-dimensional binocular pupil position information determining module is used for determining the three-dimensional binocular pupil position information of the user according to at least two pieces of two-dimensional binocular pupil position information;
the three-dimensional binocular pupil position information determining module is specifically configured to: if at least two pieces of two-dimensional binocular pupil position information of the user at the current moment cannot be determined from the face images, determine a spiral search rule from the three-dimensional binocular pupil position information at historical moments; take the three-dimensional head position information determined from the user image as the three-dimensional head position information of the user at the current moment; adjust the three-dimensional head position information at the current moment according to the spiral search rule to obtain first three-dimensional head position information; determine at least two pieces of first two-dimensional binocular pupil position information of the user at the current moment from the first three-dimensional head position information; and determine the three-dimensional binocular pupil position information of the user from the at least two pieces of first two-dimensional binocular pupil position information.
12. An eye tracking system, the system comprising: the system comprises a first camera, a plurality of second cameras and a human eye tracking device; wherein the eye tracking apparatus is used to implement the eye tracking method of any one of claims 1-10.
13. The system of claim 12, wherein the first camera is a color camera or a 3D camera and the second camera is a black and white camera.
14. The system according to claim 12, wherein at least one first camera is disposed at each first layout position within a preset viewing area, and the total detection area covered by the first cameras together is the preset viewing area;
at least one second camera is disposed at each second layout position within the preset viewing area, and the total detection area covered by the second cameras together is the preset viewing area.
15. The system of claim 14, wherein an illuminating infrared light source is disposed at the center of each of the second layout positions.
16. The system according to any one of claims 12-15, wherein the depth of field of each first camera is greater than the preset detection distance corresponding to the preset viewing area.
17. The system according to claim 14, wherein at least one second camera group is disposed at each second layout position, the second camera group comprises at least two layers of second cameras, each layer consists of a plurality of second cameras having the same depth of field, and second cameras in different layers have different depths of field.
18. The system according to claim 14, wherein, at each second layout position, the minimum extent of the region where the depth-of-field ranges of two adjacent second cameras overlap is greater than the distance between the pupils of the two eyes.
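Claim 18's constraint is easy to state as a check: the overlap of two adjacent second cameras' depth-of-field ranges must be wider than the interpupillary distance, so both pupils always fall inside at least one camera's sharp range during a handoff between cameras. A sketch follows; the 65 mm default is a typical interpupillary distance, not a value from the patent.

```python
def dof_overlap_ok(near_a, far_a, near_b, far_b, ipd_mm=65.0):
    """Check the claim-18 handoff constraint for two adjacent second
    cameras with depth-of-field ranges [near_a, far_a] and
    [near_b, far_b]; all values must share one unit (here mm).
    """
    overlap = min(far_a, far_b) - max(near_a, near_b)
    return overlap > ipd_mm
```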
19. An apparatus, characterized in that the apparatus comprises:
one or more processors;
a memory for storing one or more programs;
an input device for acquiring an image;
output means for displaying screen information;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the eye tracking method of any one of claims 1-10.
20. A computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, carries out the eye tracking method according to any one of claims 1 to 10.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910438457.3A CN110263657B (en) | 2019-05-24 | 2019-05-24 | Human eye tracking method, device, system, equipment and storage medium |
PCT/CN2019/106701 WO2020237921A1 (en) | 2019-05-24 | 2019-09-19 | Eye tracking method, apparatus and system, device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910438457.3A CN110263657B (en) | 2019-05-24 | 2019-05-24 | Human eye tracking method, device, system, equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110263657A CN110263657A (en) | 2019-09-20 |
CN110263657B true CN110263657B (en) | 2023-04-18 |
Family
ID=67915324
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910438457.3A Active CN110263657B (en) | 2019-05-24 | 2019-05-24 | Human eye tracking method, device, system, equipment and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110263657B (en) |
WO (1) | WO2020237921A1 (en) |
Families Citing this family (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112929638B (en) * | 2019-12-05 | 2023-12-15 | 北京芯海视界三维科技有限公司 | Eye positioning method and device and multi-view naked eye 3D display method and device |
CN113132643B (en) * | 2019-12-30 | 2023-02-07 | Oppo广东移动通信有限公司 | Image processing method and related product |
CN113128243B (en) * | 2019-12-31 | 2024-07-26 | 苏州协尔智能光电有限公司 | Optical recognition system, optical recognition method and electronic equipment |
CN111158162B (en) * | 2020-01-06 | 2022-08-30 | 亿信科技发展有限公司 | Super multi-viewpoint three-dimensional display device and system |
CN113448428B (en) * | 2020-03-24 | 2023-04-25 | 中移(成都)信息通信科技有限公司 | Sight focal point prediction method, device, equipment and computer storage medium |
CN111586352A (en) * | 2020-04-26 | 2020-08-25 | 上海鹰觉科技有限公司 | Multi-photoelectric optimal adaptation joint scheduling system and method |
CN111881861B (en) * | 2020-07-31 | 2023-07-21 | 北京市商汤科技开发有限公司 | Display method, device, equipment and storage medium |
CN111935473B (en) * | 2020-08-17 | 2022-10-11 | 广东申义实业投资有限公司 | Rapid eye three-dimensional image collector and image collecting method thereof |
KR20220039113A (en) * | 2020-09-21 | 2022-03-29 | 삼성전자주식회사 | Method and apparatus for transmitting video content using edge computing service |
CN112417977B (en) * | 2020-10-26 | 2023-01-17 | 青岛聚好联科技有限公司 | Target object searching method and terminal |
CN112711982B (en) * | 2020-12-04 | 2024-07-09 | 科大讯飞股份有限公司 | Visual detection method, device, system and storage device |
CN112583980A (en) * | 2020-12-23 | 2021-03-30 | 重庆蓝岸通讯技术有限公司 | Intelligent terminal display angle adjusting method and system based on visual identification and intelligent terminal |
CN112804504B (en) * | 2020-12-31 | 2022-10-04 | 成都极米科技股份有限公司 | Image quality adjusting method, image quality adjusting device, projector and computer readable storage medium |
CN114697602B (en) * | 2020-12-31 | 2023-12-29 | 华为技术有限公司 | Conference device and conference system |
CN112799407A (en) * | 2021-01-13 | 2021-05-14 | 信阳师范学院 | Pedestrian navigation-oriented gaze direction estimation method |
CN113138664A (en) * | 2021-03-30 | 2021-07-20 | 青岛小鸟看看科技有限公司 | Eyeball tracking system and method based on light field perception |
CN113476037A (en) * | 2021-06-29 | 2021-10-08 | 京东方科技集团股份有限公司 | Sleep monitoring method based on child sleep system and terminal processor |
CN114449250A (en) * | 2022-01-30 | 2022-05-06 | 纵深视觉科技(南京)有限责任公司 | Method and device for determining viewing position of user relative to naked eye 3D display equipment |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090196460A1 (en) * | 2008-01-17 | 2009-08-06 | Thomas Jakobs | Eye tracking system and method |
US8970452B2 (en) * | 2011-11-02 | 2015-03-03 | Google Inc. | Imaging method |
WO2016146486A1 (en) * | 2015-03-13 | 2016-09-22 | SensoMotoric Instruments Gesellschaft für innovative Sensorik mbH | Method for operating an eye tracking device for multi-user eye tracking and eye tracking device |
2019
- 2019-05-24: CN application CN201910438457.3A, patent CN110263657B (status: active)
- 2019-09-19: WO application PCT/CN2019/106701, publication WO2020237921A1 (application filing)
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101068342A (en) * | 2007-06-05 | 2007-11-07 | 西安理工大学 | Video frequency motion target close-up trace monitoring method based on double-camera head linkage structure |
CN103324284A (en) * | 2013-05-24 | 2013-09-25 | 重庆大学 | Mouse control method based on face and eye detection |
WO2016142489A1 (en) * | 2015-03-11 | 2016-09-15 | SensoMotoric Instruments Gesellschaft für innovative Sensorik mbH | Eye tracking using a depth sensor |
CN105930821A (en) * | 2016-05-10 | 2016-09-07 | 上海青研信息技术有限公司 | Method for identifying and tracking human eye and apparatus for applying same to naked eye 3D display |
CN107609516A (en) * | 2017-09-13 | 2018-01-19 | 重庆爱威视科技有限公司 | Adaptive eye moves method for tracing |
CN109598253A (en) * | 2018-12-14 | 2019-04-09 | 北京工业大学 | Mankind's eye movement measuring method based on visible light source and camera |
CN109688403A (en) * | 2019-01-25 | 2019-04-26 | 广州杏雨信息科技有限公司 | One kind being applied to perform the operation indoor naked eye 3D human eye method for tracing and its equipment |
Non-Patent Citations (3)
Title |
---|
Multi-user eye tracking suitable for 3D display applications; Hopf K, et al.; 3DTV-CON; 2011-06-16; full text *
Research on collaborative eye-movement tracking technology and its interactive applications; Shen Xiaoquan; China Masters' Theses Full-text Database, Information Science and Technology; 2019-01-15; full text *
Human eye gaze estimation based on dark-pupil images; Zhang Taining, et al.; Acta Physica Sinica; 2013-07-08 (No. 13); Chapters 2 and 4 *
Also Published As
Publication number | Publication date |
---|---|
CN110263657A (en) | 2019-09-20 |
WO2020237921A1 (en) | 2020-12-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110263657B (en) | Human eye tracking method, device, system, equipment and storage medium | |
US10477149B2 (en) | Holographic video capture and telepresence system | |
EP3804301B1 (en) | Re-creation of virtual environment through a video call | |
EP3414742B1 (en) | Optimized object scanning using sensor fusion | |
RU2722495C1 (en) | Perception of multilayer augmented entertainment | |
CN108292489A (en) | Information processing unit and image generating method | |
KR101675567B1 (en) | Apparatus and system for acquiring panoramic images, method using it, computer program and computer readable recording medium for acquiring panoramic images | |
CN108139803A (en) | For the method and system calibrated automatically of dynamic display configuration | |
CN113272863A (en) | Depth prediction based on dual pixel images | |
CN108028887A (en) | Focusing method of taking pictures, device and the equipment of a kind of terminal | |
WO2016108720A1 (en) | Method and device for displaying three-dimensional objects | |
CN108885342A (en) | Wide Baseline Stereo for low latency rendering | |
KR20160094190A (en) | Apparatus and method for tracking an eye-gaze | |
KR20220044897A (en) | Wearable device, smart guide method and device, guide system, storage medium | |
US20200159339A1 (en) | Desktop spatial stereoscopic interaction system | |
JP5741353B2 (en) | Image processing system, image processing method, and image processing program | |
CN115393182A (en) | Image processing method, device, processor, terminal and storage medium | |
KR101788005B1 (en) | Method for generating multi-view image by using a plurality of mobile terminal | |
JP2024013947A (en) | Imaging device, imaging method of imaging device and program | |
CN114581514A (en) | Method for determining fixation point of eyes and electronic equipment | |
JP6855493B2 (en) | Holographic video capture and telepresence system | |
EP4439250A1 (en) | Gaze-driven autofocus camera for mixed-reality passthrough | |
JP2024017297A (en) | Imaging apparatus, wearable device, control method, program, and system | |
US20230185371A1 (en) | Electronic device | |
US20230188828A1 (en) | Electronic device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
REG | Reference to a national code | Ref country code: HK; Ref legal event code: DE; Ref document number: 40008454; Country of ref document: HK |
GR01 | Patent grant | ||