CN116168444A - Multi-screen two-dimensional fixation point positioning method based on polarization imaging - Google Patents


Info

Publication number: CN116168444A
Application number: CN202310036450.5A
Authority: CN (China)
Prior art keywords: screen, polarization, bright spot, human eye, area
Legal status: Pending (the legal status is an assumption, not a legal conclusion)
Other languages: Chinese (zh)
Inventors: 王亚飞, 丁雪妍, 付先平
Current Assignee: Dalian Maritime University
Original Assignee: Dalian Maritime University
Application filed by Dalian Maritime University
Priority to CN202310036450.5A
Publication of CN116168444A

Classifications

    • G06V40/18 Eye characteristics, e.g. of the iris
    • G06V40/197 Matching; Classification
    • G06T7/13 Edge detection
    • G06T7/136 Segmentation involving thresholding
    • G06T7/62 Analysis of geometric attributes of area, perimeter, diameter or volume
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06V10/763 Clustering, non-hierarchical techniques, e.g. based on statistics of modelling distributions
    • G06V10/766 Pattern recognition or machine learning using regression, e.g. by projecting features on hyperplanes
    • G06T2207/20036 Morphological image processing
    • G06T2207/20081 Training; Learning
    • G06T2207/20112 Image segmentation details
    • G06T2207/20164 Salient point detection; Corner detection


Abstract

The invention provides a multi-screen two-dimensional fixation point positioning method based on polarization imaging, which estimates where the human eye's gaze lands in a scene containing several screen displays and thereby achieves cross-screen two-dimensional gaze point estimation. Exploiting the polarization characteristics of the multi-screen bright spots reflected from the iris plane of the human eye, the method captures images at four polarization angles, computes the corresponding polarization degree image and polarization angle image, and fuses them to locate the pupil center and the multi-screen bright spot regions. It then constructs pupil center-to-multi-screen bright spot center feature vectors, builds a model relating these feature vectors to the multi-screen gaze planes, selects the gazed screen plane, and finally completes gaze point calibration and positioning on the selected single screen with a homography normalization method. No additional artificial auxiliary light source is required: only the screen displays already present in the scene serve as light sources. The method offers high accuracy, strong robustness and user friendliness, and achieves head-unconstrained two-dimensional gaze point positioning.

Description

Multi-screen two-dimensional fixation point positioning method based on polarization imaging
Technical Field
The invention relates to the technical field of human-computer interaction, and in particular to a multi-screen two-dimensional fixation point positioning method based on polarization imaging.
Background
Existing human eye gaze point positioning methods generally rely on the pupil-cornea reflection technique to extract the pupil center and the Purkinje spot reflected by a light source, which requires an additional near-infrared light source. However, prolonged irradiation by a near-infrared source can fatigue the user. Moreover, the Purkinje spot reflected by the light source is distinct only within the iris region around the pupil and is hard to detect in the sclera region, which degrades the detection result. In addition, existing gaze point positioning methods all target a single screen display and fail when the gaze switches across multiple screens. Under these constraints, existing methods severely restrict the user's head movement: a chin rest or head rest is needed to fix the head, and the Purkinje spot must not stray far beyond the iris region for a good detection result to be obtained.
Disclosure of Invention
The invention provides a multi-screen two-dimensional fixation point positioning method based on polarization imaging, which uses the displays themselves as auxiliary light sources and achieves head-unconstrained two-dimensional gaze point positioning through gaze point calibration across several screen displays. This removes the severe restriction on the user's head movement while still accurately estimating where the gaze lands in a scene containing several screen displays.
The invention adopts the following technical means:
a multi-screen two-dimensional fixation point positioning method based on polarization imaging, implemented on a multi-screen gaze point positioning system which comprises: a single polarization camera and several LCD screen displays arranged on the same side of the human eye;
the method comprises the following steps:
acquiring human eye region images at four polarization angles with the polarization camera, computing the Stokes parameters from these four images, and deriving the polarization degree and polarization angle information of the human eye region so as to obtain the polarization degree image and polarization angle image of the human eye region;
performing statistics-based threshold segmentation on each of the four polarization-angle human eye region images and applying fused morphological processing to the pupil region, so as to locate the pupil center;
performing adaptive threshold segmentation on the polarization degree image of the human eye region to obtain a binarized image of the multi-screen bright spot regions, using this binarized image as a mask on the corresponding polarization angle image to locate the multi-screen bright spot regions, and computing the centroid and area of each screen's bright spot region;
constructing vectors from the pupil center to the centroids of the different screens' bright spot regions, and building a multi-screen regression model to select the gazed screen plane;
gazing at each single screen, computing the area of the corresponding screen's bright spot region and judging the gaze reliability of that screen, locating the four corners of the screen's bright spot region with corner detection, fitting the bright spot region image with a rectangle, and achieving dynamic smooth-pursuit calibration of the single screen with moving points.
Further, acquiring human eye region images at four polarization angles with the polarization camera comprises capturing polarized images of the human eye region at the four polarization angles 0°, 45°, 90° and 135°: I(0°), I(45°), I(90°) and I(135°);
The Stokes parameters are:
S0 = I(0°) + I(90°)
S1 = I(0°) − I(90°)
S2 = I(45°) − I(135°)
where I(0°) denotes the human eye region polarized image at polarization angle 0°, I(45°) the image at polarization angle 45°, I(90°) the image at polarization angle 90°, and I(135°) the image at polarization angle 135°.
Further, the polarization degree image is computed as:
I_D = √(S1² + S2²) / S0
where I_D denotes the polarization degree image and S0, S1, S2 are the Stokes parameters.
The polarization angle image is computed as:
I_A = (1/2)·arctan(S2 / S1)
where I_A denotes the polarization angle image.
Further, performing statistics-based threshold segmentation on each of the four polarization-angle human eye region images and applying fused morphological processing to the pupil region, so as to locate the pupil center, comprises:
on each of the four polarization-angle human eye region images, selecting as threshold the gray value corresponding to the first 10% of the cumulative distribution of the gray histogram, and converting the gray image into a binary image;
preliminarily removing invalid small regions in the binary image by morphological processing;
summing the binary images obtained from the four polarization-angle images to obtain the pupil region to be estimated;
correcting the pupil region with an ellipse fitting algorithm, defining the pupil center as the position with minimal cost of reaching all points in the pupil region, and solving for the pupil center within the pupil region under this minimum-cost constraint.
Further, performing adaptive threshold segmentation on the polarization degree image of the human eye region to obtain a binary image of the multi-screen bright spot regions, using it as a mask on the corresponding polarization angle image to locate the multi-screen bright spot regions, and computing the centroid and area of each screen's bright spot region, comprises:
performing adaptive threshold segmentation on the polarization degree image of the human eye region with Otsu's method, converting it into a binary image;
preliminarily removing invalid small regions in the binary image by morphological processing;
performing region clustering segmentation on the polarization angle map of the human eye region, keeping region blocks of more than 5×5 pixels, and locating the multi-screen bright spot regions in the polarization angle image with the binarized polarization degree map as a mask;
computing the centroid and area of each screen's bright spot region.
Further, constructing vectors from the pupil center to the centroids of the different screens' bright spot regions and building a multi-screen regression model to select the gazed screen plane comprises:
taking the pupil center as the starting point and connecting it to the bright spot centers of the different screens, establishing pupil center-to-multi-screen bright spot center feature vectors in the imaging plane coordinate system;
having the human eye gaze at the different screen planes in turn, and labeling the feature vectors, screen bright spot areas and index screen numbers of the different screen planes;
establishing the relation model between the feature vectors and the gaze plane by partial least squares regression.
Further, constructing vectors from the pupil center to the centroids of the different screens' bright spot regions and building a multi-screen regression model to select the gazed screen plane may also comprise:
constructing one or more triangles with the multi-screen bright spot center points as vertices, and taking all interior angle values of the triangles as the feature vector;
establishing the relation model between the feature vectors and the gaze plane by nearest-neighbor regression.
Further, gazing at each single screen, computing the area of the corresponding screen's bright spot region and judging the gaze reliability of that screen, locating the four corners of the screen's bright spot region with corner detection, fitting the bright spot region with a rectangle, and achieving dynamic smooth-pursuit calibration of the single screen with moving points, comprises:
having the human eye gaze at the single screen to be calibrated, computing the area of the corresponding screen's bright spot region, and judging the area reliability, computed as the currently measured bright spot area divided by the bright spot area of that screen labeled during the multi-screen gaze plane selection stage; if the area reliability is greater than 0.9, the current screen need not be recalibrated; if it is less than or equal to 0.9, the current head pose rotation is considered to differ substantially from the labeled head pose, and the screen must be used for calibration;
when calibration is needed, locating the four corners of the current screen's bright spot region by corner detection, fitting the bright spot region with a rectangle fitting algorithm, and determining the final four corner positions of the region;
computing the corresponding homography matrices from the two mapping relations, imaging plane to cornea reflection plane and cornea reflection plane to screen plane, thereby projecting the gaze point from the imaging plane to the cornea reflection plane and then onto the screen plane.
Compared with the prior art, the invention has the following advantages:
1. The invention provides a multi-screen two-dimensional fixation point positioning method based on polarization imaging that estimates where the human eye's gaze lands in a scene containing several screen displays, achieving cross-screen two-dimensional gaze point estimation. Exploiting the polarization characteristics of the multi-screen bright spots on the iris reflection plane of the human eye, it captures images at four polarization angles, computes the corresponding polarization degree and polarization angle images, and fuses them to locate the pupil center and the multi-screen bright spot regions.
2. The invention constructs pupil center-to-multi-screen bright spot center feature vectors, establishes a relation model between the feature vectors and the multi-screen gaze planes, selects the gazed plane, and then completes gaze point calibration and positioning on a single screen with a homography normalization method.
3. The invention adds no artificial auxiliary light source: only the screen displays in the scene serve as light sources. It offers high accuracy, strong robustness and user friendliness, and achieves head-unconstrained two-dimensional gaze point positioning.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings required by the embodiments or the description of the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person skilled in the art may derive other drawings from them without inventive effort.
FIG. 1 is a flow chart of a multi-screen two-dimensional fixation point positioning method based on polarization imaging.
Fig. 2 is an imaging schematic diagram of a multi-screen gaze point positioning system in accordance with an embodiment of the present invention.
FIG. 3 is a schematic diagram of multi-screen polarization imaging in accordance with an embodiment of the present invention.
Fig. 4 is a flowchart of performing multi-screen two-dimensional gaze point positioning in an embodiment of the present invention.
Detailed Description
In order that those skilled in the art may better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, embodiments of the present invention. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Since an LCD display itself includes a polarizing filter, the light it emits is polarized. The invention provides a multi-screen two-dimensional fixation point positioning method based on polarization imaging that uses the displays as auxiliary light sources. The polarized bright spots reflected from a display are more robust than infrared bright spots and cause no secondary harm to the human eye. Because the polarizing filter angles of different screen displays differ, the bright spots reflected by different displays are easily distinguished in the human eye region image captured by the polarization camera. Head-unconstrained two-dimensional gaze point positioning is then achieved through gaze point calibration across several screen displays. Specifically, as shown in fig. 1, the invention is implemented on a multi-screen gaze point positioning system comprising a single polarization camera and several LCD screen displays arranged on the same side of the human eye, as shown in fig. 2. The method comprises the following steps:
s1, acquiring human eye region images with four polarization angles through a polarization camera, solving Stokes parameters according to the human eye region images with the four angles, and acquiring polarization degree information and polarization angle information of the human eye region through calculation so as to acquire the polarization degree image and the polarization angle image of the human eye region.
This step mainly acquires the image information. The LCD modulates its screen backlight through a linear polarizing filter, so the display emits polarized light.
Specifically, the gaze point positioning system consists of a single polarization camera and several LCD displays. The polarization camera acquires polarized images of the human eye region at the four polarization angles 0°, 45°, 90° and 135°: I(0°), I(45°), I(90°) and I(135°). Here the polarization angle is the angle between the transmission direction of the polarizer and the vertical direction. A linear polarizer transmits linearly polarized light along one direction, while polarized light orthogonal to that direction is absorbed or deflected. Because the polarizing filter angles of different screen displays differ, the bright spots reflected by different displays are easily distinguished in the human eye region image captured by the polarization camera. When the polarization camera captures polarized images of the human eye region, the polarized reflection of the display can be observed. The imaging principle is shown in fig. 3: the screen display emits light, the light passes through a polarizer at the corresponding polarization angle, and the polarization camera captures images through its own polarizer at the corresponding angle. The intensity of the corneal-surface bright spots in the polarized eye image can be adjusted with the polarizing filter at the camera. The polarization angle value of the corneal-surface bright spot region is almost equal to the polarization angle of the light emitted by the display. The LCD display may be a desktop display, an industrial personal computer display, a portable display, and so on. Common screen arrangements are: several screens side by side horizontally, several screens side by side vertically, or a mixed horizontal and vertical arrangement.
Further, the Stokes parameters are obtained as: S0 = I(0°) + I(90°), S1 = I(0°) − I(90°), S2 = I(45°) − I(135°). The polarization state information computable from the Stokes formalism includes the degree of polarization and the polarization angle. The polarization degree map I_D and the polarization angle map I_A are computed as:
I_D = √(S1² + S2²) / S0
I_A = (1/2)·arctan(S2 / S1)
where the value of the polarization angle depends on the normal vector of the object surface.
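As a concrete illustration, the Stokes, polarization-degree and polarization-angle computation above can be sketched in a few lines of numpy. The function and variable names are our own, and the small epsilon guard against division by zero is an implementation detail not stated in the text:

```python
import numpy as np

def stokes_maps(i0, i45, i90, i135):
    """Compute the polarization degree image I_D and polarization angle
    image I_A from the four polarizer-angle intensity images
    I(0°), I(45°), I(90°), I(135°)."""
    s0 = i0.astype(np.float64) + i90       # S0 = I(0°) + I(90°)
    s1 = i0.astype(np.float64) - i90       # S1 = I(0°) - I(90°)
    s2 = i45.astype(np.float64) - i135     # S2 = I(45°) - I(135°)
    eps = 1e-12                            # guard against division by zero
    dop = np.sqrt(s1**2 + s2**2) / (s0 + eps)   # I_D
    aop = 0.5 * np.arctan2(s2, s1)              # I_A, in radians
    return dop, aop
```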
S2, respectively carrying out threshold segmentation based on statistics on the human eye region images with the four polarization angles, and carrying out fusion morphology processing on the pupil region so as to position the pupil center.
The invention exploits the polarization characteristics of the multi-screen bright spots on the iris reflection plane of the human eye and achieves fused positioning of the pupil center from the four captured polarization-angle images together with the computed polarization degree and polarization angle images. To reduce the influence of random noise during acquisition, the images are preprocessed with a 5×5 Gaussian filter with a standard deviation of 2 pixels.
On each of the four polarization-angle images, a statistics-based threshold segmentation is applied: the gray value corresponding to the first 10% of the cumulative distribution of the gray histogram is selected as the threshold, converting the gray image into a binary image. Invalid small regions are preliminarily removed by morphological processing (dilation, erosion, etc.). The binary images of the four polarization angles are summed to obtain the pupil region to be estimated, which is corrected with an ellipse fitting algorithm. The pupil center is defined as the position in the pupil region with minimal cost of reaching all points of the region, and is solved for under this minimum-cost constraint.
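A minimal numpy sketch of the statistics-based pupil segmentation above, assuming the pupil is the darkest region of each image; the morphological cleanup and ellipse fitting are skipped, and a plain centroid stands in for the minimum-cost center, so all names and simplifications here are illustrative:

```python
import numpy as np

def low_percentile_mask(img, frac=0.10):
    """Binarize at the gray value covering the first `frac` of the
    cumulative gray histogram (the pupil is the darkest region)."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = np.cumsum(hist) / img.size
    thresh = int(np.searchsorted(cdf, frac))
    return img <= thresh

def fused_pupil_center(imgs):
    """Fuse the masks from the four polarization-angle images and return
    the centroid of the fused pupil region (stand-in for the
    minimum-cost pupil center)."""
    fused = np.sum([low_percentile_mask(im) for im in imgs], axis=0) >= len(imgs)
    ys, xs = np.nonzero(fused)
    return xs.mean(), ys.mean()
```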
S3, performing self-adaptive threshold segmentation on the polarization degree image of the human eye region, obtaining a binarized image of the multi-screen bright spot region, and positioning the multi-screen bright spot region by taking the binarized image of the multi-screen bright spot region as a mask on the corresponding polarization angle image, so as to respectively calculate the mass centers and the areas of the different screen bright spot regions.
Specifically, adaptive threshold segmentation is performed on the polarization degree map with Otsu's method, converting it into a binary image. Invalid small regions are preliminarily removed by morphological processing (dilation, erosion, etc.). Region clustering segmentation is performed on the polarization angle image, region blocks of more than 5×5 pixels are kept, and the multi-screen bright spot regions are located in the polarization angle image with the binarized polarization degree map as a mask. The centroid and area of each screen's bright spot region are then computed.
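The adaptive thresholding and per-region centroid/area computation might look as follows. Otsu's method and a simple stack-based connected-component labeling are written out in plain numpy for self-containment; helper names are our own, and a real implementation would likely use OpenCV equivalents:

```python
import numpy as np

def otsu_threshold(img):
    """Otsu's method: the gray level maximizing between-class variance."""
    hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()
    omega = np.cumsum(p)                    # class-0 probability
    mu = np.cumsum(p * np.arange(256))      # class-0 cumulative mean
    mu_t = mu[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1 - omega))
    sigma_b[~np.isfinite(sigma_b)] = 0.0
    return int(np.argmax(sigma_b))

def region_centroids_areas(mask, min_pixels=25):
    """Label 4-connected foreground regions; return ((cx, cy), area) per
    region, dropping blocks smaller than min_pixels (5x5 in the text)."""
    labels = np.zeros(mask.shape, dtype=int)
    results, cur = [], 0
    for y, x in zip(*np.nonzero(mask)):
        if labels[y, x]:
            continue
        cur += 1
        labels[y, x] = cur
        stack, pix = [(y, x)], []
        while stack:
            cy, cx = stack.pop()
            pix.append((cy, cx))
            for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = cur
                    stack.append((ny, nx))
        if len(pix) >= min_pixels:
            arr = np.array(pix)
            results.append(((arr[:, 1].mean(), arr[:, 0].mean()), len(pix)))
    return results
```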
S4, constructing vectors from the pupil center to the centroids of the different screens' bright spot regions, and establishing a multi-screen regression model to select the gazed screen plane.
This step mainly implements multi-screen gaze plane selection and model training. Specifically, the invention constructs pupil center-to-multi-screen bright spot center feature vectors, builds a relation model between the feature vectors and the multi-screen gaze planes, selects the gazed plane, and then performs gaze point positioning on the selected single screen. Gaze positions outside the gazed screen are not considered; the multi-screen gaze plane selection switches between gaze planes, achieving cross-screen gaze point positioning.
In one embodiment, for gazing with the head fixed on a chin rest, the pupil center is taken as the starting point and connected to the bright spot centers of the different screens, establishing pupil center-to-multi-screen bright spot center feature vectors in the imaging plane coordinate system. The human eye gazes at the different screen planes in turn, and the feature vectors, screen bright spot areas and screen index numbers of the different screen planes are labeled. The relation model between the feature vectors and the gaze plane is established by partial least squares regression.
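A toy version of this plane-selection model is sketched below. Note the text specifies partial least squares regression; this numpy-only stand-in substitutes ordinary least squares on one-hot screen labels, which behaves similarly when the feature clusters for the different screens are well separated. Class and method names are our own:

```python
import numpy as np

class GazePlaneSelector:
    """Linear model mapping pupil-center-to-glint-center feature vectors
    to a screen index (OLS on one-hot labels as a stand-in for PLS)."""

    def fit(self, feats, screen_ids):
        X = np.hstack([np.asarray(feats, float),
                       np.ones((len(feats), 1))])   # bias column
        n_screens = int(max(screen_ids)) + 1
        Y = np.eye(n_screens)[screen_ids]           # one-hot screen labels
        self.W, *_ = np.linalg.lstsq(X, Y, rcond=None)
        return self

    def predict(self, feats):
        X = np.hstack([np.asarray(feats, float),
                       np.ones((len(feats), 1))])
        return np.argmax(X @ self.W, axis=1)        # most likely screen
```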
In another embodiment, for the case of large head pose rotation, one or more triangles are constructed with the multi-screen bright spot center points as vertices, all interior angle values of the triangles are taken as the feature vector, and the relation model between the feature vectors and the gaze plane is established by nearest-neighbor regression. As shown in fig. 3, the multi-screen bright spot regions are captured in the polarization camera imaging plane, the center of each screen's bright spot region is computed, and triangles are constructed with the multi-screen bright spot centers as vertices. If there are more than three bright spot regions, several triangular regions are constructed by Delaunay triangulation. All interior angle values of the triangles are taken as the feature vector, and the relation model between the feature vectors and the gaze plane is established by nearest-neighbor regression.
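The triangle-angle feature and the neighbor lookup can be sketched as follows. A 1-nearest-neighbor classifier stands in for the neighbor-regression model of the text, and all names are illustrative; interior angles are a natural choice here because they are invariant to in-plane translation, rotation and scaling of the glint pattern:

```python
import numpy as np

def interior_angles(p0, p1, p2):
    """Interior angles (radians) of the triangle whose vertices are
    three screen glint centers; angs[i] is the angle at vertex i."""
    pts = [np.asarray(p, float) for p in (p0, p1, p2)]
    angs = []
    for i in range(3):
        a = pts[(i + 1) % 3] - pts[i]
        b = pts[(i + 2) % 3] - pts[i]
        cosang = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
        angs.append(np.arccos(np.clip(cosang, -1.0, 1.0)))
    return np.array(angs)

def nearest_plane(feature, calib_feats, calib_ids):
    """1-nearest-neighbor stand-in for the neighbor-regression model."""
    d = np.linalg.norm(np.asarray(calib_feats) - feature, axis=1)
    return calib_ids[int(np.argmin(d))]
```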
S5, gazing at each single screen, computing the area of the corresponding screen's bright spot region and judging the gaze reliability of that screen, locating the four corners of the screen's bright spot region with corner detection, fitting the bright spot region with a rectangle, and achieving dynamic smooth-pursuit calibration of the single screen with moving points.
This step is mainly used to achieve single-screen gaze calibration and single-screen gaze estimation. Specifically:
the invention carries out single-screen fixation calibration on all screen planes, and mainly carries out homography mapping through corresponding screen corner points. The specific method comprises the following steps:
First, the human eye gazes at the single screen to be calibrated, the area of the corresponding screen bright spot region is calculated, and the area reliability is judged: the currently solved bright spot region area is divided by the bright spot region area of the same screen recorded during the multi-screen gazing plane selection stage. The four corner points of the current screen bright spot region are then located with a corner detection method, the region is fitted with a rectangle fitting algorithm, and the final positions of the four corner points of the screen bright spot region are determined.
Line-of-sight calibration is then completed with a homography normalization method. According to the two mapping relations, from the imaging plane to the corneal reflection plane and from the corneal reflection plane to the screen plane, the corresponding homography matrices are computed, realizing projection of the gaze point from the imaging plane to the corneal reflection plane and then to the screen plane.
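The two-stage projection can be sketched in plain NumPy with a standard direct linear transform (DLT) homography estimate; the corner coordinates below are invented stand-ins for the detected glint-region corners and the known screen corners, not values from the patent:

```python
# Minimal sketch of the two-stage homography mapping: imaging plane ->
# corneal reflection plane -> screen plane. Point sets are invented.
import numpy as np

def homography(src, dst):
    """Estimate a 3x3 homography from 4+ point pairs via the DLT."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)              # null vector of A
    return H / H[2, 2]

def apply_h(H, pt):
    """Apply a homography to a 2-D point (homogeneous divide)."""
    p = H @ np.array([pt[0], pt[1], 1.0])
    return p[:2] / p[2]

img_corners = [(100, 80), (300, 85), (295, 240), (105, 235)]   # imaging plane
cornea_corners = [(0, 0), (1, 0), (1, 1), (0, 1)]              # cornea plane
screen_corners = [(0, 0), (1920, 0), (1920, 1080), (0, 1080)]  # screen pixels

H1 = homography(img_corners, cornea_corners)
H2 = homography(cornea_corners, screen_corners)
gaze_img = (200.0, 160.0)                 # gaze point on the imaging plane
gaze_screen = apply_h(H2 @ H1, gaze_img)  # projected to the screen plane
```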
After all single screens complete gaze calibration, gaze estimation can be performed. In the gaze estimation stage, the system detects and locates the pupil center and the bright spot regions of the human eye. When the eye gazes at a screen, the estimation system first performs multi-screen gazing plane selection to judge which plane is being gazed at, and then calculates the reliability of the current screen bright spot region. If the bright spot area reliability is greater than 0.9, the current screen does not need recalibration. If it is less than or equal to 0.9, the current head posture is considered to differ significantly from the recorded head posture, and calibration using the screen is required. The execution flow of line-of-sight calibration and line-of-sight estimation in this embodiment is shown in fig. 4.
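The reliability test that gates recalibration in this flow reduces to a ratio against the area recorded at the multi-screen plane selection stage; a minimal sketch, using the 0.9 threshold stated above:

```python
def needs_recalibration(current_area: float, calibrated_area: float,
                        threshold: float = 0.9) -> bool:
    """True when the glint-region area reliability falls to or below the
    threshold, i.e. the head pose is taken to differ too much from the
    pose recorded at the marking stage."""
    reliability = current_area / calibrated_area
    return reliability <= threshold
```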
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.

Claims (8)

1. A multi-screen two-dimensional fixation point positioning method based on polarization imaging, characterized in that it is realized on the basis of a multi-screen fixation point positioning system, wherein the multi-screen fixation point positioning system comprises: a single polarization camera and a plurality of LCD screen displays disposed on the same side of the human eye;
the method comprises the following steps:
obtaining human eye region images with four polarization angles through a polarization camera, solving Stokes parameters according to the human eye region images with the four angles, and obtaining polarization degree information and polarization angle information of the human eye region through calculation so as to obtain the polarization degree image and the polarization angle image of the human eye region;
respectively carrying out threshold segmentation based on statistics on the human eye region images with the four polarization angles, and carrying out fusion morphology processing on the pupil region so as to position the center position of the pupil;
performing adaptive threshold segmentation on the polarization degree image of the human eye region, solving a binarized image of the multi-screen bright spot regions, locating the multi-screen bright spot regions on the corresponding polarization angle image with the binarized image as a mask, and respectively calculating the centroids and areas of the different screen bright spot regions;
constructing vectors from the pupil center to the centroids of the bright spot regions of different screens, and constructing a multi-screen regression model to realize gazing screen plane selection;
gazing at each single screen, calculating the area of the corresponding screen bright spot region and judging the screen gazing reliability; detecting and positioning the four corner points of the screen bright spot region by corner detection, fitting the bright spot region imaging with a rectangle, and realizing smooth-pursuit dynamic calibration of the single screen using moving points.
2. The polarization imaging-based multi-screen two-dimensional gaze point positioning method of claim 1, wherein obtaining human eye region images of four polarization angles by a polarization camera comprises obtaining polarized human eye region images at the four polarization angles of 0°, 45°, 90° and 135° with the polarization camera: I(0°), I(45°), I(90°) and I(135°);
the stokes parameters are:
S0 = I(0°) + I(90°)
S1 = I(0°) - I(90°)
S2 = I(45°) - I(135°)
wherein I(0°) represents the polarized human eye region image at a polarization angle of 0°, I(45°) the polarized human eye region image at 45°, I(90°) the polarized human eye region image at 90°, and I(135°) the polarized human eye region image at 135°.
3. The multi-screen two-dimensional gaze point positioning method based on polarization imaging of claim 2, wherein the polarization degree image is obtained according to the following calculation:
I_D = √(S1² + S2²) / S0
wherein I_D represents the polarization degree image, and S0, S1, S2 are the Stokes parameters.
The polarization angle image is obtained according to the following calculation:
I_A = (1/2) arctan(S2 / S1)
wherein I_A represents the polarization angle image.
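A minimal NumPy sketch of the Stokes computation in claims 2 and 3; the constant 4×4 arrays stand in for real polarization camera captures, and arctan2 is used as a quadrant-safe form of the arctangent:

```python
# Stokes parameters from the four polarizer angles, then the
# degree-of-polarization and polarization-angle images.
import numpy as np

I0   = np.full((4, 4), 120.0)   # I(0 deg)
I45  = np.full((4, 4), 100.0)   # I(45 deg)
I90  = np.full((4, 4), 60.0)    # I(90 deg)
I135 = np.full((4, 4), 80.0)    # I(135 deg)

S0 = I0 + I90                   # total intensity
S1 = I0 - I90                   # 0/90 deg difference
S2 = I45 - I135                 # 45/135 deg difference

I_D = np.sqrt(S1**2 + S2**2) / S0   # polarization degree image
I_A = 0.5 * np.arctan2(S2, S1)      # polarization angle image (radians)
```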
4. The polarization imaging-based multi-screen two-dimensional gaze point positioning method of claim 1, wherein performing statistics-based threshold segmentation respectively on the four polarization angle human eye region images and performing fusion morphology processing on the pupil region, so as to locate the pupil center position, comprises:
adopting a statistics-based threshold segmentation method: on each of the four polarization angle human eye region images, selecting the gray value corresponding to the first 10% of the cumulative distribution of the gray histogram as the threshold, and converting the gray image into a binary image;
preliminarily removing invalid small areas in the binary image by morphological processing;
adding the obtained binary images of the human eye region images with the four polarization angles to obtain a pupil region to be estimated;
and correcting the pupil region with an ellipse fitting algorithm, defining the pupil center as the position with the minimum cost of reaching all points in the pupil region, and solving for the pupil center within the pupil region using this minimum-cost constraint.
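The per-angle thresholding and fusion of claim 4 can be sketched as follows; the random 8-bit images are stand-ins for real captures, the morphological clean-up and ellipse fit are omitted, and the vote threshold of 3 is an illustrative way to use the summed binary maps:

```python
# Sketch: threshold each polarization-angle image at the grey value covering
# the darkest 10% of its cumulative histogram, then sum the binary maps.
import numpy as np

def dark_fraction_threshold(gray, fraction=0.10):
    """Grey value below which `fraction` of pixels fall (cumulative histogram)."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = np.cumsum(hist) / gray.size
    return int(np.searchsorted(cdf, fraction))

rng = np.random.default_rng(0)
images = [rng.integers(0, 256, (32, 32), dtype=np.uint8) for _ in range(4)]
binaries = [(img <= dark_fraction_threshold(img)).astype(np.uint8)
            for img in images]
pupil_votes = sum(binaries)            # 0..4 votes per pixel
pupil_candidate = pupil_votes >= 3     # pixels dark in most polarization angles
```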
5. The polarization imaging-based multi-screen two-dimensional fixation point positioning method of claim 1, wherein performing adaptive threshold segmentation on the polarization degree image of the human eye region to obtain a binarized image of the multi-screen bright spot regions, locating the multi-screen bright spot regions on the corresponding polarization angle image with the binarized image as a mask, and respectively calculating the centroid and area of the different screen bright spot regions, comprises the following steps:
performing adaptive threshold segmentation on the polarization degree image of the human eye region using Otsu's method, converting the polarization degree image into a binary image;
preliminarily removing invalid small areas in the binary image by morphological processing;
performing region clustering segmentation on the polarization angle image of the human eye region, retaining region blocks larger than 5×5 pixels, and locating the multi-screen bright spot regions in the polarization angle image using the binarized polarization degree image as a mask;
the centroid and the area of the different screen bright spot areas are calculated respectively.
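A self-contained NumPy sketch of claim 5's glint localisation: Otsu's method for the adaptive threshold, then per-region centroid and area via a simple flood-fill labelling. The toy image holds two bright rectangles standing in for two screen glints; the polarization-angle clustering step is omitted:

```python
# Otsu threshold + connected components + per-glint centroid and area.
import numpy as np

def otsu_threshold(gray):
    """Otsu's method on an 8-bit image: maximise between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    omega = np.cumsum(p)                       # class-0 probability
    mu = np.cumsum(p * np.arange(256))         # class-0 cumulative mean
    mu_t = mu[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1.0 - omega))
    return int(np.nanargmax(sigma_b))          # threshold with max variance

def label_regions(mask):
    """4-connected components by flood fill; list of (n, 2) pixel arrays."""
    labels = -np.ones(mask.shape, dtype=int)
    regions = []
    for seed in zip(*np.nonzero(mask)):
        if labels[seed] != -1:
            continue
        stack, pixels = [seed], []
        labels[seed] = len(regions)
        while stack:
            y, x = stack.pop()
            pixels.append((y, x))
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and labels[ny, nx] == -1):
                    labels[ny, nx] = len(regions)
                    stack.append((ny, nx))
        regions.append(np.array(pixels))
    return regions

img = np.zeros((40, 40), dtype=np.uint8)
img[5:12, 5:15] = 200                          # glint of screen A
img[25:33, 22:34] = 220                        # glint of screen B
mask = img > otsu_threshold(img)

regions = label_regions(mask)
stats = [(pix.mean(axis=0), len(pix)) for pix in regions]  # (centroid, area)
```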
6. The polarization imaging-based multi-screen two-dimensional fixation point positioning method of claim 1, wherein constructing vectors from the pupil center to the centroids of the bright spot regions of different screens and constructing a multi-screen regression model to realize gazing screen plane selection comprises:
connecting the pupil center with different screen bright spot centers by taking the pupil center as a starting point, and establishing a pupil center-multi-screen bright spot center characteristic vector under an imaging plane coordinate system;
the human eyes sequentially watch different screen planes, and feature vectors, screen bright spot area and index screen serial numbers of the different screen planes are marked;
and establishing a relation model of the feature vector and the gazing plane by using partial least square regression.
7. The polarization imaging-based multi-screen two-dimensional fixation point positioning method of claim 1, wherein constructing vectors from the pupil center to the centroids of the bright spot regions of different screens and constructing a multi-screen regression model to realize gazing screen plane selection comprises:
connecting and constructing one or more triangles with the multi-screen bright spot center points as vertices, and taking all interior angle values of the triangles as the feature vector;
and establishing a relation model of the feature vector and the gazing plane by utilizing a neighbor regression method.
8. The polarization imaging-based multi-screen two-dimensional gaze point positioning method of claim 1, wherein gazing at each single screen, calculating the area of the corresponding screen bright spot region, judging the screen gazing reliability, detecting and positioning the four corner points of the screen bright spot region by corner detection, fitting the bright spot region imaging with a rectangle, and realizing smooth-pursuit dynamic calibration of the single screen using moving points, comprises:
the human eye gazes at the single screen to be calibrated, the area of the corresponding screen bright spot region is calculated, and the area reliability is judged, the area reliability being the currently solved bright spot region area divided by the bright spot region area of the same screen recorded during the multi-screen gazing plane selection stage; if the bright spot area reliability is greater than 0.9, the current screen does not need to be recalibrated; if it is less than or equal to 0.9, the current head posture is considered to differ significantly from the recorded head posture, and calibration using the screen is required;
when calibration is needed, positioning four corners of a current screen bright spot area by using a corner detection method, fitting the screen bright spot area by combining a rectangular fitting algorithm, and determining the final four corner positions of the screen bright spot area;
and according to the two mapping relations from the imaging plane to the corneal reflection plane and from the corneal reflection plane to the screen plane, computing the corresponding homography matrices, realizing projection mapping of the gaze point from the imaging plane to the corneal reflection plane and then to the screen plane.
CN202310036450.5A 2023-01-10 2023-01-10 Multi-screen two-dimensional fixation point positioning method based on polarization imaging Pending CN116168444A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310036450.5A CN116168444A (en) 2023-01-10 2023-01-10 Multi-screen two-dimensional fixation point positioning method based on polarization imaging

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310036450.5A CN116168444A (en) 2023-01-10 2023-01-10 Multi-screen two-dimensional fixation point positioning method based on polarization imaging

Publications (1)

Publication Number Publication Date
CN116168444A true CN116168444A (en) 2023-05-26

Family

ID=86415806

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310036450.5A Pending CN116168444A (en) 2023-01-10 2023-01-10 Multi-screen two-dimensional fixation point positioning method based on polarization imaging

Country Status (1)

Country Link
CN (1) CN116168444A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination