CN104036238B - The method of the human eye positioning based on active light - Google Patents
The method of the human eye positioning based on active light
- Publication number
- CN104036238B CN104036238B CN201410231543.4A CN201410231543A CN104036238B CN 104036238 B CN104036238 B CN 104036238B CN 201410231543 A CN201410231543 A CN 201410231543A CN 104036238 B CN104036238 B CN 104036238B
- Authority
- CN
- China
- Prior art keywords
- human eye
- face
- human
- positioning
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Image Analysis (AREA)
Abstract
In this active-light-based human eye positioning method, an active light generator in a video device projects light onto the human face, and an image capture device extracts two fields of images, a bright-pupil image and a dark-pupil image. Using the bright pupil effect induced by the active light projection, candidate regions for eye positioning are obtained by differencing the two image fields and filtering the result; face and eye localization methods then complete the eye positioning. In particular, the face region is located by a knowledge-based, feature-based, template-matching, or appearance-based face localization method; then, according to the geometric features of the face, the eye region is located by a knowledge-based, feature-based, template-matching, or appearance-based eye localization method.
Description
Technical Field
The invention relates to a method for locating human eyes, and in particular to an active-light-based eye positioning method for use in video devices.
Background Art
Research on human eye positioning has a long history; the earliest work dates back to the 1940s, but real progress has come only in the last 20 years. Input images for eye positioning generally fall into three categories: frontal, profile, and oblique views. Since IBM's work in 1997, most eye-positioning research has focused on frontal or near-frontal eye images.
Human eye positioning is a highly challenging problem with important theoretical and practical value. It refers to determining whether an image contains a human eye and, if so, further determining the eye's position and scale, then marking the eye region with a polygonal or circular frame. Potential applications include robot vision, gaze-controlled pointing, driver-fatigue warning, assistive technology for the disabled, human-computer interaction, artificial intelligence, and many other areas.
Numerous eye-positioning methods have been proposed at home and abroad; they can be broadly grouped into four categories: knowledge-based, feature-based, template-matching, and appearance-based methods.
Knowledge-based eye positioning encodes human knowledge about typical eyes into rules and uses those rules to locate the eyes. The rules mainly include: contour rules, e.g. the eye contour can be approximated by an ellipse; layout rules, e.g. in a frontal face the eyes lie in the upper half of the face; symmetry rules, e.g. the two eyes are symmetric; and motion rules, e.g. blinking can be used to separate the eyes from the background.
Feature-based eye positioning looks for attributes or structural features of the eye that do not depend on external conditions and uses them to locate the eyes. These attributes or structural features are first learned from a large number of samples and then used to locate the eyes.
Template-matching eye positioning is a classic pattern-recognition approach: a standard eye template is predefined or parameterized, the correlation between an image region and the template is computed, and a threshold decides whether the region is an eye. The eye template can be updated dynamically.
Appearance-based eye positioning generally uses statistical analysis and machine learning to find characteristics that distinguish eye images from non-eye images. The learned characteristics are summarized into distribution models or discriminant functions, which are then used to locate the eyes. The theoretical basis of appearance-based methods is probability theory, and knowledge of probability and mathematical statistics is generally required.
AdaBoost is an iterative algorithm whose core idea is to train different classifiers (weak classifiers) on the same training set and then combine them into a stronger final classifier (strong classifier). The algorithm works by reweighting the data: the weight of each sample is updated according to whether it was classified correctly in the current round and the overall accuracy of the previous round. The reweighted data set is passed to the next classifier for training, and the classifiers obtained in each round are finally fused into the final decision classifier.
The prior art includes eye positioning based on active light. Active light refers to a beam emitted by an infrared or near-infrared source and projected onto the surface of the detected target. When a face is imaged under infrared illumination and certain conditions are met, the pupil appears markedly brighter than the surrounding region in the picture; this phenomenon is called the "bright pupil effect". Using the bright pupil effect under active light, candidate regions for eye positioning can be obtained by differencing images and filtering, which accelerates eye positioning.
Summary of the Invention
The object of the invention is to propose an eye positioning method based on active illumination. The method uses active light projection and, through face localization followed by eye localization, quickly and effectively separates the eye regions from the rest of the image, achieving real-time positioning of one or more eyes against complex backgrounds. The tracking, template matching, correlation computation, and filtering algorithms used in the method ensure the accuracy, stability, and real-time performance of the eye positioning algorithm.
The technical solution of the invention is as follows: in the active-light-based eye positioning method, an active light generator in a video device projects light onto the face, and an image capture device extracts two fields of images, bright-pupil and dark-pupil. Using the bright pupil effect induced by the active light projection, candidate regions for eye positioning are obtained by differencing the two fields and filtering; face and eye localization methods then complete the eye positioning.
From the candidate eye regions, the face region is located by a knowledge-based, feature-based, template-matching, or appearance-based face localization method; then, according to the geometric features of the face, the eye region is located by a knowledge-based, feature-based, template-matching, or appearance-based eye localization method.
The eye positioning can be optimized by tracking the located face or eye positions with a tracking algorithm; by improving performance with template matching and correlation computation; or by improving face or eye localization with filtering. Several of these three kinds of methods may be chosen and combined with the basic active-light eye positioning method.
In the face localization stage, a filtering algorithm improves localization accuracy; in the face or eye localization stage, a tracking algorithm accelerates eye localization in subsequent frames; in the eye localization stage, template matching and correlation computation improve the accuracy and stability of eye positioning.
The video device is used with a stereoscopic display and can be placed at a suitable position on the display.
A band-pass filter is mounted on the imaging lens of the image capture device; the center frequency of the filter is equal or close to the center frequency of the active light source.
The images obtained by the capture device are analyzed by digital image processing to further determine the eye positions. The process, shown in Figure 1, mainly comprises the following steps:
Acquisition of candidate eye regions: the candidate regions come from two sources. One part is tracked from the eye positions located in the previous frame; the other is obtained by differencing the bright-pupil and dark-pupil images and thresholding the result. The second part usually requires a filtering step to remove false candidates caused by edges or motion. Figure 2 shows the bright-pupil and dark-pupil images.
Face localization: face localization methods include knowledge-based, feature-based, template-matching, and appearance-based approaches. Taking the feature-based AdaBoost method as an example, a number of well-performing features are trained and combined into a cascaded strong classifier. Based on the positions of the candidate eye regions and the layout of facial organs, possible face regions are examined at multiple scales in turn, and a threshold comparison decides whether each region is a face region. Figure 3 shows several types of Haar features that may be used in the AdaBoost algorithm.
Eye localization and optimization: eye localization likewise uses knowledge-based, feature-based, template-matching, or appearance-based methods. Taking the appearance-based support vector machine (SVM) method as an example, several classes of Haar-like features are first selected as the feature space; grid search, cross-validation with a weighted balanced error rate, and SVM training then yield the support vectors and corresponding weight coefficients, which define the separating hyperplane. The hyperplane is used to test each candidate region and decide whether it is an eye region. Once an eye region is located, its position can be refined by template matching and inter-frame correlation computation; a commonly used correlation measure is the minimum mean squared error.
Eye position tracking: from the eye positions located in the current and several preceding frames, a tracking algorithm predicts the likely eye positions in the next frame. These positions are fed directly to eye detection, skipping face detection, which saves time in the next frame's eye-positioning pass. Commonly used tracking algorithms include the Kalman predictor and the Mean-Shift predictor.
The improvements of the invention are as follows. Compared with existing eye positioning methods and devices, the invention uses active illumination to quickly obtain candidate eye regions, accelerating the search; the two-level "face, then eye" localization structure saves processing time relative to locating eyes directly in the image; a filtering algorithm in the face localization stage improves face localization accuracy, and template matching and correlation computation in the eye localization stage improve eye localization accuracy and stability; after the face or eye region is located, a tracking algorithm predicts the likely eye region in the next frame, saving time for eye positioning in subsequent frames; and a band-pass filter mounted on the imaging lens, with center frequency close or equal to that of the active light source, reduces the influence of ambient-light changes on the eye positioning result.
The beneficial effects of the invention are: active light illuminates the detected target, and the bright pupil effect under active light characterizes the target, describing its main features with a minimum of data; this both speeds up eye positioning and reduces the influence of ambient-light changes on the positioning result. In the image-processing stage, hierarchical localization, filtering, tracking, and template matching combined with correlation-based optimization improve the accuracy and stability of eye positioning while preserving its real-time performance.
Description of the Drawings
Figure 1 is a flow chart of the method of the invention;
Figure 2 shows the bright-pupil and dark-pupil images of the invention;
Figure 3 shows examples of several classes of Haar features usable in the image processing of the invention;
Figure 4 is a structural schematic of an embodiment of the invention applied to a stereoscopic display device;
Figure 5 is a block diagram of the one-step Kalman predictor.
Detailed Description
Figure 1 shows the flow of the active-illumination eye positioning algorithm: in the active-light-based eye positioning method, an active light generator projects light onto the face, and an image capture device extracts the bright-pupil and dark-pupil images; candidate eye regions are obtained by differencing the two fields and filtering; from the candidate regions, the face region is located by a knowledge-based, feature-based, template-matching, or appearance-based face localization method; according to the geometric features of the face, the eye region is then located by a knowledge-based, feature-based, template-matching, or appearance-based eye localization method. A filtering algorithm in the face localization stage improves face localization accuracy; a tracking algorithm in the face or eye localization stage accelerates eye localization in subsequent frames; template matching and correlation computation in the eye localization stage improve the accuracy and stability of eye positioning.
In the flow of the invention, processing proceeds as follows after an image is captured or input:
The eye-position detection and tracking device is applied to a stereoscopic display. An active light generator is placed at the lower part of the screen, and a camera serves as the image input device. The distance from the eyes to the screen is generally between 30 cm and 80 cm; when viewing, the eyes are generally not below the lower edge of the screen and within a 30-degree angle above the upper edge.
Taking a 17-inch stereoscopic display as an example, the relative layout of the components is shown in Figure 4. The 17-inch LCD panel is 338 mm long and 268 mm wide; including the surrounding bezel, it is about 420 mm long and 389 mm wide. When viewing a screen, the eyes are habitually positioned in front of and slightly above it; although they often move up and down or left and right within the region ABCD, they generally stay within a certain range, and this range determines the required coverage of the active illumination. Typical dimensions are marked in the figure (it is normal that the stereoscopic effect cannot be perceived outside this range).
The distance from the face to the screen is generally between 30 cm and 80 cm, for example at point D; when viewing, the face is generally not below the lower edge of the screen and within a 30-degree angle above the upper edge, as shown in Figure 4.
For the input images, the bright-pupil and dark-pupil images are first differenced; points with large difference values are likely to correspond to eye regions. To reject large difference values caused by motion or edges, the points are filtered according to their clustering and shape, which reduces the number of candidate eye regions and saves processing time in the subsequent face and eye localization. The bright-pupil and dark-pupil images are shown in Figure 2.
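The differencing, thresholding, and shape-based filtering described above can be sketched as follows. The threshold and blob-size limits are illustrative values, not taken from the patent; the aspect-ratio test is one simple way to implement the "clustering and shape" filter that rejects edge and motion artifacts.

```python
import numpy as np
from collections import deque

def pupil_candidates(bright, dark, diff_thresh=40, min_area=4, max_area=400):
    """Candidate eye regions from a bright-pupil / dark-pupil image pair.

    Differences the two fields, thresholds, then keeps only compact
    connected blobs (long thin blobs are typically edge artifacts).
    Parameter values are illustrative, not from the patent.
    """
    diff = bright.astype(np.int16) - dark.astype(np.int16)
    mask = diff > diff_thresh

    # 4-connected component labelling via BFS (pure NumPy + stdlib).
    labels = np.zeros(mask.shape, dtype=np.int32)
    h, w = mask.shape
    current = 0
    candidates = []
    for sy, sx in zip(*np.nonzero(mask)):
        if labels[sy, sx]:
            continue
        current += 1
        labels[sy, sx] = current
        q = deque([(sy, sx)])
        pixels = []
        while q:
            y, x = q.popleft()
            pixels.append((y, x))
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not labels[ny, nx]:
                    labels[ny, nx] = current
                    q.append((ny, nx))
        ys = np.array([p[0] for p in pixels])
        xs = np.array([p[1] for p in pixels])
        area = len(pixels)
        height = ys.max() - ys.min() + 1
        width = xs.max() - xs.min() + 1
        # pupils appear as small, roughly round blobs
        if min_area <= area <= max_area and max(height, width) <= 3 * min(height, width):
            candidates.append((int(ys.mean()), int(xs.mean())))
    return candidates
```

A compact bright blob passes the filter, while a long thin streak of the kind produced by an edge or by motion between the two fields is rejected.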
After the candidate eye regions are obtained, face detection can be performed by knowledge-based, feature-based, template-matching, or appearance-based methods. Taking the feature-based AdaBoost algorithm as an example, machine learning on face and non-face sample libraries finds a number of Haar features that separate the positive and negative sample libraries well. Given a training set S = {(x_1, y_1), (x_2, y_2), ..., (x_m, y_m)}, each sample is first assigned an initial weight. Every weak classifier in the weak-classifier space H then classifies the samples; the classification results are weighted and summed, and the best-performing weak classifier h_1 is selected. The sample weights are updated according to the classification results, with misclassified samples given higher weight; the steps are then repeated to select from the weak-classifier space the next best-predicting weak classifier h_2, and after N rounds, N weak classifiers are obtained. Each weak classifier is also assigned a weight: classifiers with good performance receive large weights and those with poor performance small weights. The final strong classifier's decision is the weighted vote of the N weak classifiers.
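The training loop described above can be sketched with single-feature threshold stumps standing in for the Haar-feature weak classifiers; this is an illustrative minimal AdaBoost, not the patent's face classifier.

```python
import numpy as np

def adaboost_train(X, y, n_rounds=10):
    """Minimal AdaBoost sketch. Weak classifiers are single-feature
    threshold stumps (stand-ins for Haar-feature classifiers).
    X is (m, d); labels y are in {-1, +1}."""
    m, d = X.shape
    w = np.full(m, 1.0 / m)                     # initial sample weights
    ensemble = []                               # (feature, threshold, polarity, alpha)
    for _ in range(n_rounds):
        best = None
        for j in range(d):                      # search all stumps for lowest weighted error
            for t in np.unique(X[:, j]):
                for pol in (1, -1):
                    pred = np.where(pol * (X[:, j] - t) >= 0, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, t, pol, pred)
        err, j, t, pol, pred = best
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)   # accurate stumps get larger voting weight
        w *= np.exp(-alpha * y * pred)          # raise weight of misclassified samples
        w /= w.sum()
        ensemble.append((j, t, pol, alpha))
    return ensemble

def adaboost_predict(ensemble, X):
    """Weighted vote of the selected weak classifiers."""
    score = np.zeros(len(X))
    for j, t, pol, alpha in ensemble:
        score += alpha * np.where(pol * (X[:, j] - t) >= 0, 1, -1)
    return np.sign(score)
```

On a separable one-dimensional toy set, a few rounds suffice to classify all samples correctly.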
Based on the positions of the candidate eye regions and the layout of facial organs, the resulting strong classifier performs multi-scale face detection on the image under examination, thereby locating the face region. Several classes of Haar features used in the machine learning are shown in Figure 3.
After the face region is located, eye detection can be performed by knowledge-based, feature-based, template-matching, or appearance-based methods. Taking the appearance-based support vector machine (SVM) algorithm as an example, machine learning is first performed on eye and non-eye sample libraries, yielding several classes of Haar-like features that form the feature space of the SVM algorithm. The purpose of selection is to pick a small number of representative features from the Haar feature space, thereby simplifying computation. A scoring formula F(i) is used to select a small number of features with the best classification performance to build the sample vectors; the formula involves the probability density of the i-th feature taking value x in the positive sample set together with a corresponding weight, and the analogous density and weight in the negative sample set. The smaller F(i) is, the stronger the i-th feature's ability to distinguish positive from negative samples.
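The text describes F(i) only in words; one expression consistent with that description (smaller when the weighted class-conditional densities of feature i overlap less) is a weighted density overlap. The exact form below is an assumption for illustration, not the patent's formula:

$$F(i) = \int \min\!\big(w_i^{+}\, p_i^{+}(x),\; w_i^{-}\, p_i^{-}(x)\big)\, dx$$

where $p_i^{+}(x)$ is the probability density of the $i$-th feature taking value $x$ in the positive sample set, $w_i^{+}$ the corresponding weight, and $p_i^{-}(x)$, $w_i^{-}$ the corresponding quantities in the negative sample set. Features with the smallest $F(i)$ separate positive from negative samples best.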
After a set of Haar features is obtained by the above criterion, the normalized feature values form a vector, and the training samples are projected into this space as the training space of the SVM. Training uses the LIBSVM library; since this library is widely used, the specific training process is not described here.
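The patent trains with LIBSVM; as a self-contained stand-in that only illustrates obtaining a separating hyperplane from normalized feature vectors, a minimal linear SVM can be trained by subgradient descent on the hinge loss. This is a sketch, not the patent's training procedure.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=500, lr=0.1):
    """Minimal linear SVM: subgradient descent on the regularized hinge
    loss. X is (m, d) normalized feature vectors; y in {-1, +1}.
    Returns the hyperplane parameters (w, b). Illustrative only."""
    m, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1                       # samples inside the margin
        if viol.any():
            grad_w = lam * w - (y[viol, None] * X[viol]).mean(axis=0)
            grad_b = -y[viol].mean()
        else:
            grad_w = lam * w
            grad_b = 0.0
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```

On a linearly separable toy set, the learned hyperplane classifies every sample by the sign of w·x + b.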
After the eye region is obtained, the located region is refined to enhance the accuracy and stability of eye positioning. Specifically, the eye-detection images from the preceding frames are stored as templates; template matching is applied around the detected position in the current frame, and the region with the highest match score is taken as the eye region. This reduces the error caused by the coarse scale step of the detection window and improves the precision of eye positioning.
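The refinement step above can be sketched with normalized cross-correlation as the match score; the search radius is an illustrative choice, and the patent does not specify the similarity measure used here.

```python
import numpy as np

def refine_eye_position(frame, template, det_y, det_x, search=5):
    """Refine a detected eye position by template matching: `template` is
    an eye patch stored from a previous frame, (det_y, det_x) is the
    detected patch top-left corner, and the best-matching location within
    +/- `search` pixels is returned. Uses normalized cross-correlation."""
    th, tw = template.shape
    t = template - template.mean()
    tnorm = np.sqrt((t * t).sum()) + 1e-12
    best_score, best_pos = -np.inf, (det_y, det_x)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = det_y + dy, det_x + dx
            if y < 0 or x < 0 or y + th > frame.shape[0] or x + tw > frame.shape[1]:
                continue
            patch = frame[y:y + th, x:x + tw].astype(np.float64)
            p = patch - patch.mean()
            score = (p * t).sum() / (np.sqrt((p * p).sum()) * tnorm + 1e-12)
            if score > best_score:
                best_score, best_pos = score, (y, x)
    return best_pos
```

When the frame contains an exact copy of the template near the coarse detection, the refined position snaps onto it.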
To handle jumps in the eye coordinates caused by external disturbances, a filtering method based on the correlation between face displacement and eye displacement is used. At 25 frames per second, the difference between frames is small; even when the pose and expression of the face change, the relative coordinates do not change much within 1/25 s. The displacement of the face is obtained and compared with the displacement of the eyes; if the two displacements differ too much, a jump is deemed to have occurred, and a jump correction is applied. The face displacement is obtained by a three-step search based on template matching.
The face offset is obtained by the three-step search. Template matching has lower precision but high stability, produces no sudden changes, and reflects the eye's motion trajectory well. In video frames without eye jumps, the trajectory obtained by template matching and the trajectory obtained by eye localization coincide closely. When a jump occurs, the two trajectories diverge, so monitoring their correlation reveals whether a jump has happened. If a jump has occurred, the previous frame's eye coordinates plus the motion-estimated offset are used as the detection value.
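The classic three-step block-matching search used to estimate the face's frame-to-frame displacement can be sketched as follows; block size and initial step are illustrative values.

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two equal-size blocks."""
    return np.abs(a.astype(np.int32) - b.astype(np.int32)).sum()

def three_step_search(prev, curr, y, x, block=16, step=4):
    """Three-step search: (y, x) is the block's top-left corner in `prev`;
    returns the (dy, dx) displacement into `curr` that minimizes the SAD,
    halving the search step each round."""
    ref = prev[y:y + block, x:x + block]
    by, bx = 0, 0                               # best displacement so far
    while step >= 1:
        best = None
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                ny, nx = y + by + dy, x + bx + dx
                if ny < 0 or nx < 0 or ny + block > curr.shape[0] or nx + block > curr.shape[1]:
                    continue
                cost = sad(ref, curr[ny:ny + block, nx:nx + block])
                if best is None or cost < best[0]:
                    best = (cost, dy, dx)
        by += best[1]
        bx += best[2]
        step //= 2
    return by, bx
```

For identical frames the search returns zero motion, and for a frame shifted by a whole step it locks onto the true displacement in the first round.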
After the eye region is located, a tracking algorithm predicts where the eyes will appear in the next frame, based on the eye positions in the current frame and several preceding frames. Taking the Kalman predictive tracking algorithm as an example, likely eye regions in the next frame are predicted; these regions skip face detection and go directly to eye detection, accelerating eye positioning in the next frame.
The specific method is: a uniform-acceleration motion model is chosen, a sequence of eye-position changes is collected, and suitable parameters of the Kalman equations are selected for the samples. The Kalman filter computation proceeds as follows:
From the above assumptions, the recursive procedure of Kalman prediction is obtained:
1. At time t = k-1, compute the one-step state prediction x̂(k|k-1) = A(k)x̂(k-1|k-1);
2. Compute the covariance matrix of the prediction error P(k|k-1) = A(k)P(k-1|k-1)A(k)^T + Q(k-1);
3. Compute the gain matrix K(k) = P(k|k-1)C(k)^T [C(k)P(k|k-1)C(k)^T + R(k)]^(-1);
4. Compute the estimate of the state at the current time x̂(k|k) = x̂(k|k-1) + K(k)[y(k) - C(k)x̂(k|k-1)];
5. Compute the estimation error P(k|k) = (I - K(k)C(k))P(k|k-1);
At the next time step, operations 1-5 are repeated. A block diagram of this process is shown in Figure 5.
In application, the Kalman algorithm computes new parameter estimates on the basis of the known position sequence, using the new data and the previous estimates together with the system's state-transition equation and a set of recursive formulas. Likely eye regions in the next frame are predicted and fed directly to eye detection, accelerating the localization of the next frame.
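Recursion steps 1-5 with the uniform-acceleration model chosen above can be sketched for one image coordinate as follows. The noise levels q and r, the frame-unit time step, and the initial covariance are illustrative choices, not values from the patent.

```python
import numpy as np

def make_kalman(dt=1.0, q=0.01, r=1.0):
    """Constant-acceleration Kalman model for one image coordinate.
    dt is the frame interval (in frames); q, r are illustrative process
    and measurement noise levels. State is [position, velocity,
    acceleration]; only the position is measured."""
    A = np.array([[1.0, dt, 0.5 * dt * dt],
                  [0.0, 1.0, dt],
                  [0.0, 0.0, 1.0]])
    C = np.array([[1.0, 0.0, 0.0]])
    Q = q * np.eye(3)
    R = np.array([[r]])
    return A, C, Q, R

def kalman_predict_track(measurements, A, C, Q, R):
    """Runs recursion steps 1-5 for each measured eye position and returns
    the position predicted for the following frame after each update."""
    x = np.array([measurements[0], 0.0, 0.0])        # initial state estimate
    P = np.eye(3) * 10.0                             # initial estimation error
    preds = []
    for z in measurements:
        x_pred = A @ x                               # 1. one-step state prediction
        P_pred = A @ P @ A.T + Q                     # 2. prediction-error covariance
        S = C @ P_pred @ C.T + R
        K = P_pred @ C.T @ np.linalg.inv(S)          # 3. gain matrix
        x = x_pred + K @ (np.array([z]) - C @ x_pred)  # 4. state update
        P = (np.eye(3) - K @ C) @ P_pred             # 5. estimation error
        preds.append(float((A @ x)[0]))              # position predicted for next frame
    return preds
```

For an eye moving at a constant 2 pixels per frame, the one-step-ahead predictions converge to the true next-frame position, so the predicted region can be handed straight to eye detection.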
In the invention, a band-pass filter is mounted on the imaging lens of the image capture device; the center frequency of the band-pass filter is equal or close to the center frequency of the active light source.
The embodiments do not limit the invention; simple improvements or equivalent solutions based on the principle of the invention remain within the scope of protection claimed.
Claims (1)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410231543.4A CN104036238B (en) | 2014-05-28 | 2014-05-28 | The method of the human eye positioning based on active light |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410231543.4A CN104036238B (en) | 2014-05-28 | 2014-05-28 | The method of the human eye positioning based on active light |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104036238A CN104036238A (en) | 2014-09-10 |
CN104036238B true CN104036238B (en) | 2017-07-07 |
Family
ID=51467004
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410231543.4A Expired - Fee Related CN104036238B (en) | 2014-05-28 | 2014-05-28 | The method of the human eye positioning based on active light |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104036238B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106127145B (en) * | 2016-06-21 | 2019-05-14 | 重庆理工大学 | Pupil diameter and tracking |
DE102016215766A1 (en) * | 2016-08-23 | 2018-03-01 | Robert Bosch Gmbh | Method and device for operating an interior camera |
CN106846399B (en) * | 2017-01-16 | 2021-01-08 | 浙江大学 | A method and device for obtaining the visual center of gravity of an image |
CN107085703A (en) * | 2017-03-07 | 2017-08-22 | 中山大学 | A Fusion Face Detection and Tracking Method for Car Occupant Counting |
CN111714080B (en) * | 2020-06-30 | 2021-03-23 | 重庆大学 | Disease classification system based on eye movement information |
CN118248308B (en) * | 2024-05-30 | 2024-07-23 | 荣科科技股份有限公司 | Data integration method and system based on stream data dynamic monitoring |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1877599A (en) * | 2006-06-29 | 2006-12-13 | 南京大学 | Face setting method based on structured light |
CN102830797A (en) * | 2012-07-26 | 2012-12-19 | 深圳先进技术研究院 | Man-machine interaction method and system based on sight judgment |
CN103605968A (en) * | 2013-11-27 | 2014-02-26 | 南京大学 | Pupil locating method based on mixed projection |
CN103744978A (en) * | 2014-01-14 | 2014-04-23 | 清华大学 | Parameter optimization method for support vector machine based on grid search technology |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2005044330A (en) * | 2003-07-24 | 2005-02-17 | Univ Of California San Diego | Weak hypothesis generation apparatus and method, learning apparatus and method, detection apparatus and method, facial expression learning apparatus and method, facial expression recognition apparatus and method, and robot apparatus |
Non-Patent Citations (2)
Title |
---|
Zhou Jin et al., "A real-time human eye detection method for stereoscopic display," Computer Applications and Software, vol. 30, no. 4, 30 Apr. 2013, full text *
Zheng Wei et al., "Design and implementation of a human eye detection system based on DM642," Modern Electronics Technique, vol. 35, no. 4, 15 Feb. 2012, p. 106 paras. 4-11; p. 107 paras. 2-4 and 11-12 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Li et al. | A parallel and robust object tracking approach synthesizing adaptive Bayesian learning and improved incremental subspace learning | |
Shah et al. | Multi-view action recognition using contrastive learning | |
CN104036238B (en) | The method of the human eye positioning based on active light | |
CN102375970B (en) | A kind of identity identifying method based on face and authenticate device | |
CN106951867A (en) | Face recognition method, device, system and equipment based on convolutional neural network | |
CN104036237B (en) | The detection method of rotation face based on on-line prediction | |
CN111178208A (en) | Pedestrian detection method, device and medium based on deep learning | |
JP2018538631A (en) | Method and system for detecting an action of an object in a scene | |
CN106407958B (en) | Face feature detection method based on double-layer cascade | |
Chen et al. | Using FTOC to track shuttlecock for the badminton robot | |
CN104036284A (en) | Adaboost algorithm based multi-scale pedestrian detection method | |
García et al. | Adaptive multi-cue 3D tracking of arbitrary objects | |
CN105930808A (en) | Moving object tracking method based on vector boosting template updating | |
Mosayyebi et al. | Gender recognition in masked facial images using EfficientNet and transfer learning approach | |
KR20120089948A (en) | Real-time gesture recognition using mhi shape information | |
KR101542206B1 (en) | Method and system for tracking with extraction object using coarse to fine techniques | |
Hanzla et al. | Robust Human Pose Estimation and Action Recognition over Multi-level Perceptron | |
Yuan et al. | Ear detection based on CenterNet | |
Hu et al. | Gesture detection from RGB hand image using modified convolutional neural network | |
CN104517300A (en) | Vision judgment tracking method based on statistical characteristic | |
CN116665097A (en) | Self-adaptive target tracking method combining context awareness | |
Powar et al. | Reliable face detection in varying illumination and complex background | |
Mishra et al. | Automated detection of fighting styles using localized action features | |
Guang et al. | Application of Neural Network-based Intelligent Refereeing Technology in Volleyball | |
Bakheet | A fuzzy framework for real-time gesture spotting and recognition |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
Granted publication date: 20170707 |