CN110858095A - Electronic device capable of being controlled by head and operation method thereof - Google Patents

Electronic device capable of being controlled by head and operation method thereof

Info

Publication number
CN110858095A
CN110858095A
Authority
CN
China
Prior art keywords
head
user
feature points
angle
facial feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201810965579.3A
Other languages
Chinese (zh)
Inventor
吴政泽
李安正
邱圣霖
洪英士
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Acer Inc
Original Assignee
Acer Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Acer Inc filed Critical Acer Inc
Priority to CN201810965579.3A
Publication of CN110858095A
Legal status: Withdrawn

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/012Head tracking input arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013Eye tracking input arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • G06V40/166Detection; Localisation; Normalisation using acquisition arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G06V40/171Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18Eye characteristics, e.g. of the iris
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2203/00Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F2203/01Indexing scheme relating to G06F3/01
    • G06F2203/012Walk-in-place systems for allowing a user to walk in a virtual environment while constraining him to a given position in the physical environment

Abstract

The invention provides an electronic device and an operation method. The electronic device includes an image acquisition device, a storage device, and a processor. The image acquisition device captures an image of a user, and the storage device stores a plurality of modules. The processor is coupled to the image acquisition device and the storage device and is configured to: set the image acquisition device to acquire a head image of the user; perform a face recognition operation on the head image to obtain a face block; detect a plurality of facial feature points in the face block; estimate the head posture angle of the user according to the facial feature points; calculate the gaze position of the user on the screen according to the head posture angle, a rotation reference angle, and a preset correction position; and set the screen to display a corresponding visual effect according to the gaze position.

Description

Electronic device capable of being controlled by head and operation method thereof
Technical Field
The present invention relates to electronic devices, and particularly to an electronic device that can be controlled by the head and an operating method thereof.
Background
With the development of technology, users of electronic devices continually seek more convenient ways of operating them. For example, brain-wave sensing, eye tracking, head tracking, and body-movement detection are used together with, or in place of, the traditional keyboard and mouse to control electronic devices more conveniently. Taking eye tracking as an example, it has been widely adopted in consumer applications such as computers, mobile phones, head-mounted display devices, automobiles, and game machines. The basic principle of eye tracking is to illuminate the user's eyeballs with infrared beams, acquire the light reflected from parts such as the pupil, iris, and cornea with a sensor, and then analyze the user's gaze position with a complex algorithm. In other words, current eye tracking and gaze prediction technologies require, in addition to a general image capturing device such as a web camera (webcam), an additional infrared emitter and a corresponding infrared sensor to track the eyes. Eye tracking is also susceptible to the small size of the eyeballs or to the eyeballs being occluded by objects such as the nose and glasses.
In addition, head tracking and limb tracking technologies generally use an image pickup device such as a web camera to track a part of the user such as the nose or a finger in order to analyze the user's input. However, in situations such as virtual reality or medical assistance, the user may wear virtual-reality glasses, surgical loupes, or a mask, which may cause tracking errors or even detection failures. For example, the mouse-replacement software Enable Viacam (http://eviacam.crea-si.com) controls the movement of the mouse cursor on the display by detecting the swing of the nose on the user's face. Similarly, another software, Camera Mouse (http://cameramouse.org), also provides an alternative way of operating a mouse cursor, again by detecting a single site (e.g., the nose, eyes, lips, or a finger) to control the cursor's movement. Because the nose occupies a prominent position on the face, it is less affected by the background or lighting when detected. For this reason, nose swings are often used in existing solutions as a substitute for controlling the mouse cursor; however, the accuracy and controllability achieved by detecting the face or nose to operate an electronic device still leave much to be desired.
Disclosure of Invention
In view of the above, the present invention provides an electronic device and an operating method thereof, which can estimate the gaze position of a user viewing a screen through an image obtained by a general camera, so that the user can control the electronic device through the head movement.
An electronic device is arranged for enabling a screen to display a plurality of image pictures and comprises an image acquisition device, a storage device and a processor. The storage device stores a plurality of modules. The processor is coupled to the image capture device and the storage device, and is configured to execute the modules in the storage device to perform the following steps: setting a screen to display a plurality of marked objects at a plurality of preset correction positions; setting an image acquisition device to acquire a plurality of first head images when a user watches the preset correction positions; performing a plurality of first face recognition operations on the first head images to obtain a plurality of first face blocks corresponding to the preset correction positions; detecting a plurality of first facial feature points corresponding to the first face blocks; calculating a plurality of rotation reference angles of the preset correction positions watched by the user according to the first facial feature points; setting an image acquisition device to acquire a second head image of the user; performing a second face recognition operation on the second head image to obtain a second face block; detecting a plurality of second face feature points in a second face block; estimating the head posture angle of the user according to the second facial feature points; calculating the fixation position of the user on the screen according to the head posture angle, the rotation reference angles and the preset correction positions; and setting a screen to display the corresponding visual effect according to the gaze position.
An operation method is suitable for an electronic device which comprises an image acquisition device and enables a screen to display a plurality of image pictures, and comprises the following steps: setting a screen to display a plurality of marked objects at a plurality of preset correction positions; setting an image acquisition device to acquire a plurality of first head images when a user watches the preset correction positions; performing a plurality of first face recognition operations on the first head images to obtain a plurality of first face blocks corresponding to the preset correction positions; detecting a plurality of first facial feature points corresponding to the first face blocks; calculating a plurality of rotation reference angles of the preset correction positions watched by the user according to the first facial feature points; setting an image acquisition device to acquire a second head image of the user; performing a second face recognition operation on the second head image to obtain a second face block; detecting a plurality of second face feature points in a second face block; estimating the head posture angle of the user according to the second facial feature points; calculating the fixation position of the user on the screen according to the head posture angle, the rotation reference angles and the preset correction positions; and setting a screen to display the corresponding visual effect according to the gaze position.
Based on the above, the electronic device according to the embodiment of the invention can capture the head image of the user by using a general camera device without using additional devices such as an infrared emitter and an infrared sensor, and then perform face recognition and face feature point detection on the captured image. Then, the electronic device can estimate the head posture angle of the user according to the facial feature points generated by the detection of the facial feature points, and can deduce the watching position of the user watching the screen according to the head posture angle of the user, so that the user can interact with the electronic device based on the watching position.
In order to make the aforementioned and other features and advantages of the invention more comprehensible, embodiments accompanied with figures are described in detail below.
Drawings
FIG. 1 is a block diagram of an electronic device according to an embodiment of the invention.
Fig. 2A is a schematic diagram illustrating a scenario of the control electronic device according to an embodiment of the invention.
Fig. 2B is a schematic diagram illustrating an example of an image acquired by the electronic device according to the embodiment of the invention.
Fig. 3 is a flowchart illustrating an operation method of the electronic device according to an embodiment of the invention.
FIG. 4 is a block diagram of an electronic device according to an embodiment of the invention.
FIG. 5A is a schematic diagram illustrating an example of displaying a plurality of marked objects according to an embodiment of the present invention.
Fig. 5B and 5C are schematic diagrams illustrating an example of a situation in which a user views a tagged object according to fig. 5A.
Fig. 6A is a schematic view of a user looking at a screen.
Fig. 6B is a schematic diagram of the image displayed on the screen corresponding to the gazing position of the user in fig. 6A.
FIG. 7 is a block diagram of an electronic device according to an embodiment of the invention.
Fig. 8A and 8B are schematic diagrams illustrating an exemplary method for establishing a rotation reference angle based on different predetermined distances according to an embodiment of the invention.
Description of the reference numerals
10, 40, 70: electronic device
110: screen
120: image acquisition device
130: storage device
131: face recognition module
132: feature point detection module
133: head pose estimation module
134: gaze location prediction module
135: interactive application module
140: processor
20, 50, 80: user
F1, F2, F3: gaze location
Img1: image
B1: face block
P1-P4: preset correction positions
D1, D2: preset distances
obj1-obj4: marker objects
S301-S305: steps
Detailed Description
Fig. 1 is a block diagram of an electronic device according to an embodiment of the invention, which is for convenience of illustration only and is not intended to limit the invention. Referring to fig. 1, the electronic device 10 includes a screen 110, an image capturing device 120, a storage device 130, and a processor 140, and the screen 110, the image capturing device 120, the storage device 130, and the processor 140 may be integrated into a whole or respectively disposed in different devices according to different design considerations. In an embodiment, the electronic device 10 may be a notebook computer, a tablet computer, or a smart tv integrated with the screen 110 and the image capturing device 120. In another embodiment, the electronic device 10 may be a desktop computer, a set-top box, a game console, etc., the storage device 130 and the processor 140 are disposed in the same body, and the screen 110 and the image capturing device 120 are implemented by different devices and connected to the storage device 130 and the processor 140 in a wired or wireless manner.
The screen 110 is used to display a plurality of image frames, and may be a liquid-crystal display (LCD), a plasma display, a vacuum fluorescent display, a light-emitting diode (LED) display, a field emission display (FED), a head-mounted display, and/or another suitable display, but is not limited thereto.
The image capturing device 120 provides an image capturing function and includes an imaging lens and a photosensitive element. The photosensitive element senses the intensity of light entering the lens to generate an image. The photosensitive element may be a charge-coupled device (CCD), a complementary metal-oxide-semiconductor (CMOS) element, or another element. For example, the image capturing device 120 may be an external web camera or a web camera built into a mobile phone, tablet computer, or notebook computer. In the present embodiment, the image capturing device 120 captures images in the visible-light range (e.g., visible light with a wavelength of 390 nm to 700 nm) and does not have the function of capturing images in the infrared range (e.g., infrared with a wavelength of 750 nm or more).
The storage device 130 is used for storing data such as images and program codes, and may be any type of fixed or removable Random Access Memory (RAM), read-only memory (ROM), flash memory (flash memory), hard disk or other similar devices, integrated circuits, and combinations thereof. In the present embodiment, the storage device 130 is used to record a plurality of modules, which may include a face recognition module 131, a feature point detection module 132, a head pose estimation module 133, a gaze location prediction module 134, and an interactive application module 135.
The processor 140 may be a central processing unit (CPU), or another programmable general-purpose or special-purpose microprocessor, digital signal processor (DSP), programmable controller, application-specific integrated circuit (ASIC), programmable logic device (PLD), or other similar device or a combination thereof, and is coupled to the screen 110, the image capturing device 120, and the storage device 130.
In the embodiment, the modules stored in the storage device 130 may be computer programs and may be loaded by the processor 140 to execute the operation method of the electronic device according to the embodiment of the invention.
Fig. 2A is a schematic diagram illustrating a scenario of the control electronic device according to an embodiment of the invention. Fig. 2B is a schematic diagram illustrating an example of an image acquired by the electronic device according to the embodiment of the invention. Referring to fig. 2A and 2B, when the user 20 uses the electronic device 10 (here, a notebook computer is taken as an example), the image capturing device 120 captures an image Img1 towards the user 20. The electronic device 10 may analyze the head pose angle of the user 20 according to the face information carried by the face block B1 in Img1, and predict the gaze position F1 of the line of sight of the user 20 projected on the screen 110 according to the head pose angle of the user 20. Accordingly, the electronic device 10 can perform an interactive application with the user according to the gaze position F1 and the operation interface currently displayed on the screen 110. In other words, when the user 20 uses the electronic device 10, the user 20 can control the electronic device 10 according to the head swing naturally generated when viewing the screen 110.
To further illustrate how the electronic device 10 interacts with the user according to the image captured by the image capturing device 120, the present invention is described below with reference to an embodiment. Fig. 3 is a flowchart illustrating an operation method of the electronic device according to an embodiment of the invention, and the method flow of fig. 3 may be implemented by various elements of the electronic device 10 of fig. 1. Referring to fig. 1 and fig. 3, the following describes detailed steps of the operation method of the electronic device of the present embodiment in combination with various elements and devices of the electronic device 10 in fig. 1.
In step S301, the face recognition module 131 performs a face recognition operation on the image acquired by the image acquisition device 120 to obtain a face block in the head image. Here, the head image acquired by the image acquisition device 120 serves as the basis for predicting the gaze position at which the user views the screen 110. The face recognition module 131 can detect a face image in the head image acquired by the image acquisition device 120 to obtain a face block containing the face image. In detail, the face recognition module 131 may extract a plurality of facial features from the head image by using a specific feature extraction algorithm, such as Haar-like features, Histogram of Oriented Gradients (HOG) features, or another suitable facial feature description method, and input the extracted features into a classifier. The classifier classifies the extracted features of each image block according to a pre-trained data model, so that a face block matching the model is detected in the image.
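As a concrete illustration of step S301, the following sketch uses OpenCV's Haar-cascade face detector as one possible realization of the Haar-like-feature approach mentioned above; the camera index and cascade file are illustrative assumptions rather than part of this disclosure.

```python
# Minimal sketch of step S301: detect a face block in one head image.
import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture(0)          # image acquisition device 120 (webcam), index assumed
ok, frame = cap.read()             # head image of the user
if ok:
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) > 0:
        x, y, w, h = faces[0]      # bounding box of the face block (cf. B1)
        face_block = frame[y:y + h, x:x + w]
cap.release()
```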
In step S302, the feature point detection module 132 detects a plurality of facial feature points within the face block. In detail, the feature point detection module 132 may detect a plurality of facial feature points (facial landmarks) that mark the facial contour, facial shape, and positions of the facial features from the face block by using machine learning, deep learning, or other suitable algorithms. For example, the feature point detection module 132 may detect the facial feature points from the face block by using a Constrained Local Model (CLM) algorithm, a Constrained Local Neural Fields (CLNF) algorithm, or an Active Shape Model (ASM) algorithm.
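The following is a hedged sketch of step S302 that produces facial feature points with dlib's pre-trained 68-point landmark predictor; dlib's regression-tree model is used here only as a readily available stand-in for the CLM, CLNF, or ASM algorithms named above, and the model file path is an assumption.

```python
# Sketch of step S302: detect facial feature points (landmarks) inside a face block.
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")  # assumed path

def facial_landmarks(gray_image):
    faces = detector(gray_image, 1)
    if not faces:
        return []
    shape = predictor(gray_image, faces[0])
    # 68 (x, y) feature points marking the facial contour, eyes, nose and mouth
    return [(shape.part(i).x, shape.part(i).y) for i in range(shape.num_parts)]
```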
In step S303, the head pose estimation module 133 estimates the head pose angle of the user according to the facial feature points. In one embodiment, the head pose estimation module 133 may map the two-dimensional position coordinates of the facial feature points in a planar coordinate system to three-dimensional position coordinates in a stereoscopic coordinate system; that is, two-dimensional coordinates in the camera coordinate system are mapped to three-dimensional coordinates in the world coordinate system. Then, the head pose estimation module 133 estimates the head pose angle of the user according to the three-dimensional position coordinates of the facial feature points. For example, by using a Perspective-n-Point (PnP) algorithm, the head pose estimation module 133 can estimate the head pose angle of the user in the world coordinate system according to the coordinate positions of the facial feature points. In addition, the head pose estimation module 133 may perform the coordinate mapping on all or only part of the facial feature points, so as to estimate the head pose angle of the user by using all or part of the facial feature points.
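A minimal sketch of the PnP-based estimation in step S303 is given below. The generic 3-D face model points (nose tip, chin, eye corners, mouth corners) and the pinhole camera approximation are common illustrative choices, not values specified by this disclosure.

```python
# Sketch of step S303: estimate head pose from a subset of 2-D facial feature points.
import cv2
import numpy as np

# Generic 3-D model points (nose tip, chin, eye corners, mouth corners), in millimetres
MODEL_POINTS = np.array([
    (0.0, 0.0, 0.0),           # nose tip
    (0.0, -330.0, -65.0),      # chin
    (-225.0, 170.0, -135.0),   # left eye outer corner
    (225.0, 170.0, -135.0),    # right eye outer corner
    (-150.0, -150.0, -125.0),  # left mouth corner
    (150.0, -150.0, -125.0),   # right mouth corner
], dtype=np.float64)

def head_pose(image_points_2d, frame_size):
    h, w = frame_size
    focal = w                                   # rough pinhole approximation
    camera_matrix = np.array([[focal, 0, w / 2],
                              [0, focal, h / 2],
                              [0, 0, 1]], dtype=np.float64)
    dist_coeffs = np.zeros((4, 1))              # assume no lens distortion
    ok, rvec, tvec = cv2.solvePnP(MODEL_POINTS,
                                  np.array(image_points_2d, dtype=np.float64),
                                  camera_matrix, dist_coeffs)
    return rvec, tvec                           # rotation and translation of the head
```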
In one embodiment, the head pose angle may include a head tilt angle (pitch) rotated about a first axis and a head yaw angle rotated about a second axis. In another embodiment, the head pose angle may include a head tilt angle rotated about a first axis, a head yaw angle rotated about a second axis, and a head roll angle (also referred to below as the head flip angle) rotated about a third axis. For example, the head tilt angle can indicate the angle of the user's head when nodding up and down about a virtual horizontal axis parallel to the screen 110; the head yaw angle can indicate the angle of the user's head when turning left or right about a virtual vertical axis parallel to the screen 110; and the head flip angle can indicate the angle of the user's head when rotating about a virtual longitudinal axis perpendicular to the screen 110.
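Assuming the rotation vector produced by a PnP solver as in the previous sketch, the head tilt, yaw, and roll (flip) angles can be recovered with a standard rotation-matrix-to-Euler-angle decomposition; the mapping between decomposition axes and tilt/yaw/roll is an assumption that depends on the camera coordinate convention.

```python
# Sketch: decompose the head pose into tilt (pitch), yaw, and roll (flip) angles.
import cv2
import numpy as np

def head_pose_angles(rvec):
    rot, _ = cv2.Rodrigues(rvec)                 # 3x3 rotation matrix
    sy = np.sqrt(rot[0, 0] ** 2 + rot[1, 0] ** 2)
    # ZYX Euler decomposition; with OpenCV camera axes (x right, y down, z forward),
    # rotation about x ~ tilt/pitch, about y ~ yaw, about z ~ roll (flip) -- assumed
    tilt = np.degrees(np.arctan2(rot[2, 1], rot[2, 2]))
    yaw = np.degrees(np.arctan2(-rot[2, 0], sy))
    roll = np.degrees(np.arctan2(rot[1, 0], rot[0, 0]))
    return tilt, yaw, roll
```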
In step S304, the gaze location prediction module 134 calculates the gaze position of the user on the screen 110 according to the head pose angle, the rotation reference angles, and the preset calibration positions. In one embodiment, the gaze location prediction module 134 may obtain a plurality of rotation reference angles corresponding to a plurality of preset calibration positions on the screen 110. These rotation reference angles represent the head pose angles of the user's head when the user gazes at the preset calibration positions on the screen 110, and serve as reference information for the subsequent prediction of the user's gaze point. The invention does not limit the number of the preset calibration positions or their actual coordinate positions on the screen 110, but the number of preset calibration positions is at least two. From at least two preset calibration positions and the corresponding rotation reference angles, the gaze location prediction module 134 can establish a correspondence between the head tilt angle and the vertical offset of the gaze position, and a correspondence between the head yaw angle and the horizontal offset of the gaze position. In one embodiment, the preset calibration positions may be located at the four corners or the four edges of the screen 110, respectively. In another embodiment, the preset calibration positions may be located on the diagonal lines of the screen 110, respectively. Then, according to the rotation reference angles, the preset calibration positions, and the correspondence between them, the gaze location prediction module 134 can convert the head pose angle estimated in real time into the gaze position at which the user's line of sight is currently projected on the screen 110. The gaze position may be represented in a screen coordinate system defined based on the size and resolution of the screen 110, and thus may include a first coordinate value in a first axial direction and a second coordinate value in a second axial direction (e.g., coordinate values on the X axis and the Y axis). In addition to converting the head pose angle into the gaze position based on the rotation reference angles and the preset calibration positions, in another embodiment the gaze location prediction module 134 may also look up the user's head rotation angle in a pre-established lookup table to obtain the user's gaze position.
Accordingly, after obtaining the gaze position of the user, in step S305 the interactive application module 135 sets the screen 110 to display a corresponding visual effect according to the gaze position, so as to control the image frames displayed by the screen 110. In one embodiment, the interactive application module 135 may control the screen 110 to display an indication object at the gaze position, where the indication object may be an arrow or a geometric figure (such as a dot). Accordingly, the indication object moves on the screen 110 in response to changes in the gaze position predicted in real time. In other words, the user can control the indication object to move on the screen 110 through his or her line of sight, and can further perform more diversified interactive control according to the position at which the indication object is gazed.
In another embodiment, the interactive application module 135 may further control the screen 110 to display an information prompt block associated with a display object indicated by the gaze position. The display object may be any icon or text label on the user interface displayed on the screen 110. Further, when the gaze position overlaps the area covered by the display object, the interactive application module 135 may display the information prompt block of the display object to provide other detailed information related to the display object. For example, when the user's gaze position overlaps the system-time block of the operating-system toolbar, the interactive application module 135 may expand a window displaying more detailed calendar and time information. In addition, in one embodiment, the interactive application module 135 may further control the screen 110 to change the display effect of the display block where the gaze position is located. The display effect may include the display color, display font, display size, or the like. Specifically, the entire display area of the screen 110 can be divided into a plurality of display blocks. When the user's gaze position falls in one of the display blocks, the interactive application module 135 may change the display effect of that display block, for example by displaying its content in an enlarged manner.
In one embodiment, the rotation reference angles used to predict the user's gaze position may be obtained through prior testing by the product designer, in which case the rotation reference angles may be fixed preset values recorded in the storage device 130. In another embodiment, because different users have different habits and biological characteristics, the rotation reference angles may be established for the actual user by having the user perform a calibration procedure, which can improve the accuracy of predicting the gaze position.
The following embodiments describe how the rotation reference angles corresponding to a plurality of preset calibration positions are obtained through the calibration procedure, and further describe in detail how the gaze position is predicted from the head pose angle and the rotation reference angles.
FIG. 4 is a block diagram of an electronic device according to an embodiment of the invention. Referring to fig. 4, the electronic device 40 of the present embodiment includes a screen 110, an image capturing device 120, a storage device 130, and a processor 140, and the coupling relationship and functions thereof are similar to those of the previous embodiment and are not described herein again. Different from the previous embodiments, the storage device 130 of the present embodiment further records the calibration module 136, and the processor 140 can load and execute the calibration module 136 to obtain the corresponding rotation reference angle for a specific actual user.
In the present embodiment, the calibration module 136 can control the screen 110 to display a plurality of marked objects at a plurality of predetermined calibration positions on the screen 110, wherein the predetermined calibration positions can be respectively located at an upper boundary, a lower boundary, a left boundary and a right boundary of the screen 110. Referring to fig. 5A, fig. 5A is a schematic diagram illustrating an example of displaying a plurality of marked objects according to an embodiment of the present invention. The calibration module 136 may control the screen 110 to display the markup object obj1 at the upper boundary of the screen 110; displaying a marker object obj2 at the right border of the screen 110; displaying a marker object obj3 at the lower boundary of the screen 110; and a markup object obj4 is displayed at the left border of the screen 110. However, fig. 5A is only an exemplary illustration and is not intended to limit the invention.
Based on the example of fig. 5A, the calibration module 136 may prompt the user to view the mark objects obj 1-obj 4, and at the same time, the image capturing device 120 may capture a plurality of head images when the user gazes at the mark objects obj 1-obj 4. In detail, the calibration module 136 prompts the user to view the mark object obj1 first, and drives the image capturing device 120 to capture the head image when the user gazes at the mark object obj1, and so on, and then captures the head images corresponding to obj2 to obj4 in the subsequent process. After obtaining the head images of the markers obj 1-obj 4 viewed by the user, the calibration module 136 may analyze the head images to estimate the rotation reference angles respectively corresponding to the preset calibration positions P1-P4. Specifically, the calibration module 136 may utilize the operations and algorithms executed by the face recognition module 131, the feature point detection module 132 and the head pose estimation module 133 to estimate the rotation reference angles corresponding to the preset calibration positions P1-P4 according to the head images obtained during the calibration procedure. The calibration module 136 records the rotation reference angles corresponding to the preset calibration positions P1-P4 in the storage device 130.
Based on the example of fig. 5A, the rotation reference angles recorded by the calibration module 136 may include a first tilt angle corresponding to the preset calibration position P1, a second tilt angle corresponding to the preset calibration position P3, a first yaw angle corresponding to the preset calibration position P2, and a second yaw angle corresponding to the preset calibration position P4. Specifically, based on the example of fig. 5A, since the preset correction positions P1-P4 are respectively located at the upper boundary, the lower boundary, the left boundary and the right boundary of the screen 110, the rotation reference angle obtained by the correction module 136 may include the maximum tilt angle, the minimum tilt angle, the maximum yaw angle and the minimum yaw angle corresponding to the four preset correction positions P1-P4.
In detail, referring to fig. 5B, fig. 5B is a schematic diagram illustrating a situation in which a user views a marked object according to the example shown in fig. 5A. When the user gazes at the marked object obj1, the head of the user 50 is raised upward, and the calibration module 136 can estimate the maximum tilt angle θ_pmax corresponding to the preset calibration position P1 according to the head image obtained at this moment. Similarly, referring to fig. 5C, fig. 5C is a schematic diagram illustrating a situation in which a user views a marked object according to the example shown in fig. 5A. When the user gazes at the marked object obj3, the head of the user 50 swings downward, and the calibration module 136 can estimate the minimum tilt angle θ_pmin corresponding to the preset calibration position P3 according to the head image Img3 acquired at this moment. Similarly, the calibration module 136 may obtain the maximum yaw angle θ_ymax and the minimum yaw angle θ_ymin corresponding to the preset calibration positions P2 and P4. The calibration module 136 then records the maximum tilt angle θ_pmax, the minimum tilt angle θ_pmin, the maximum yaw angle θ_ymax, and the minimum yaw angle θ_ymin as the rotation reference angles. Here, the maximum tilt angle θ_pmax, the minimum tilt angle θ_pmin, the maximum yaw angle θ_ymax, and the minimum yaw angle θ_ymin define the range of possible head swing angles when the user views the screen 110.
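A hedged sketch of this calibration procedure is given below; the helper names prompt(), capture_head_image(), and estimate_head_angles() are placeholders standing in for the display prompt and for steps S301 to S303, and the sign convention relating left/right gaze to maximum/minimum yaw is an assumption.

```python
# Sketch of the calibration procedure of figs. 5A-5C: record rotation reference angles.
PRESET_POSITIONS = ["top", "right", "bottom", "left"]   # P1..P4 (marker objects obj1..obj4)

def run_calibration(prompt, capture_head_image, estimate_head_angles):
    reference = {}
    for position in PRESET_POSITIONS:
        prompt(position)                        # ask the user to gaze at the marker object
        image = capture_head_image()            # first head image for this position
        tilt, yaw, _ = estimate_head_angles(image)
        reference[position] = (tilt, yaw)
    theta_pmax = reference["top"][0]            # maximum tilt angle (gazing at P1)
    theta_pmin = reference["bottom"][0]         # minimum tilt angle (gazing at P3)
    theta_ymax = reference["right"][1]          # maximum yaw angle (assumed sign convention)
    theta_ymin = reference["left"][1]           # minimum yaw angle
    return theta_pmax, theta_pmin, theta_ymax, theta_ymin
```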
In addition, it should be noted that fig. 5A is an example in which the preset calibration positions are located at the upper, lower, left, and right boundaries of the screen 110, so that the calibration module 136 can obtain the maximum tilt angle θ_pmax, the minimum tilt angle θ_pmin, the maximum yaw angle θ_ymax, and the minimum yaw angle θ_ymin. However, in other embodiments, even if the preset calibration positions are not located on the upper, lower, left, and right boundaries of the screen 110, the calibration module 136 can still obtain two tilt angles (i.e., a first tilt angle and a second tilt angle) and two yaw angles (i.e., a first yaw angle and a second yaw angle) as the rotation reference angles, together with the corresponding preset calibration positions, so as to establish the correspondence between the head tilt angle and the vertical offset of the gaze position and the correspondence between the head yaw angle and the horizontal offset of the gaze position.
In an embodiment, the gaze location prediction module 134 performs an interpolation or extrapolation operation according to the first yaw angle, the second yaw angle, a first position corresponding to the first yaw angle among the preset calibration positions, a second position corresponding to the second yaw angle among the preset calibration positions, and the head yaw angle, to obtain the first coordinate value of the gaze position. Moreover, the gaze location prediction module 134 may perform an interpolation or extrapolation operation according to the first tilt angle, the second tilt angle, a third position corresponding to the first tilt angle among the preset calibration positions, a fourth position corresponding to the second tilt angle among the preset calibration positions, and the head tilt angle, to obtain the second coordinate value of the gaze position.
Specifically, continuing the description with the rotation reference angles generated based on fig. 5A as an example, fig. 6A is a schematic view of a user gazing at the screen 110, and fig. 6B is a schematic view of the position in the image displayed on the screen 110 corresponding to the gaze position of the user in fig. 6A. Referring to fig. 6A and fig. 6B, the gaze location prediction module 134 may perform interpolation according to the maximum yaw angle θ_ymax, the minimum yaw angle θ_ymin, the X-axis coordinate in the screen coordinate system of the preset calibration position corresponding to the maximum yaw angle θ_ymax (namely Sw), the X-axis coordinate in the screen coordinate system of the preset calibration position corresponding to the minimum yaw angle θ_ymin (namely 0), and the user's current head yaw angle θ_y, so as to obtain the X coordinate value (namely Sx) of the gaze position F2. Similarly, the gaze location prediction module 134 may perform interpolation according to the maximum tilt angle θ_pmax, the minimum tilt angle θ_pmin, the Y-axis coordinate in the screen coordinate system of the preset calibration position corresponding to the maximum tilt angle θ_pmax (namely 0), the Y-axis coordinate in the screen coordinate system of the preset calibration position corresponding to the minimum tilt angle θ_pmin (namely Sh), and the user's current head tilt angle θ_p, so as to obtain the Y coordinate value (namely Sy) of the gaze position F2. Thus, the gaze location prediction module 134 can predict the user's gaze position in the screen coordinate system as (Sx, Sy). Here, the screen coordinate system may be a coordinate system defined in units of pixels based on the screen resolution. In one embodiment, the gaze location prediction module 134 may calculate the coordinates of the gaze position according to the following equations (1) and (2).
Sx = Sw × (θ_y − θ_ymin) / (θ_ymax − θ_ymin)   (1)
Sy = Sh × (θ_pmax − θ_p) / (θ_pmax − θ_pmin)   (2)
where Sw and Sh are the width and height of the screen 110 in the screen coordinate system.
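A minimal sketch of equations (1) and (2), assuming the screen width Sw and height Sh are given in pixels of the screen coordinate system; the clamping of out-of-range values is an added assumption rather than part of the equations.

```python
# Sketch: map the current head tilt/yaw angles to a screen-coordinate gaze position.
def gaze_position(theta_y, theta_p,
                  theta_ymin, theta_ymax, theta_pmin, theta_pmax,
                  screen_w, screen_h):
    sx = screen_w * (theta_y - theta_ymin) / (theta_ymax - theta_ymin)   # equation (1)
    sy = screen_h * (theta_pmax - theta_p) / (theta_pmax - theta_pmin)   # equation (2)
    # values outside the calibrated range amount to extrapolation; clamp to the screen
    sx = min(max(sx, 0), screen_w)
    sy = min(max(sy, 0), screen_h)
    return sx, sy
```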
In the foregoing embodiment, the prediction of the gaze location does not take into account the distance between the user and the screen 110. However, the distance between the user and the screen 110 will affect the head swing amplitude of the user when viewing the screen 110. Generally, the closer the user is to the screen 110, the greater the amplitude of the head swing that the user sees the edge of the screen 110. Conversely, the farther away the user is from the screen 110, the smaller the head swing amplitude of the user viewing the edge of the screen 110. It can be known that the rotation reference angle, which is preset in advance or generated according to the calibration procedure, is generated based on a preset specific distance, so that the actual viewing distance between the user and the screen 110 may affect the prediction result of the gazing position.
FIG. 7 is a block diagram of an electronic device according to an embodiment of the invention. Referring to fig. 7, the electronic device 70 of the present embodiment includes a screen 110, an image capturing device 120, a storage device 130, and a processor 140, and the coupling relationship and functions thereof are similar to those of the previous embodiment and are not described herein again. Unlike the previous embodiments, the storage device 130 of the present embodiment further records the distance adjustment module 137, and the processor 140 may load and execute the distance adjustment module 137 to adjust the prediction result of the gazing position based on the actual viewing distance between the user and the screen 110, so as to improve the accuracy of predicting the gazing position.
In the present embodiment, the distance adjusting module 137 can calculate the viewing distance between the user and the screen 110. The distance adjusting module 137 may also estimate the viewing distance between the user and the screen 110 based on the head image acquired by the image acquiring device 120, for example, the viewing distance between the user and the screen 110 is estimated according to the size of the face region. The distance adjustment module 137 may also calculate the viewing distance between the user and the screen 110 according to the facial feature points in the facial block. Then, the distance adjusting module 137 may adjust the previously recorded rotation reference angle according to the viewing distance, so that the gaze position predicting module 134 may convert the head pose angle into the gaze position according to the adjusted rotation reference angle. Alternatively, the distance adjustment module 137 may adjust the gaze location generated by the gaze location prediction module 134 according to the viewing distance.
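One possible way to estimate the viewing distance from the facial feature points is sketched below, using the apparent inter-pupillary distance under a pinhole-camera approximation; the average eye separation of 63 mm and the focal length in pixels are assumptions requiring per-camera calibration, and this disclosure does not prescribe this particular method.

```python
# Sketch: estimate the viewing distance from two facial feature points (the eye centers).
import math

AVG_EYE_DISTANCE_MM = 63.0          # assumed average inter-pupillary distance

def viewing_distance_mm(left_eye_px, right_eye_px, focal_length_px):
    eye_distance_px = math.dist(left_eye_px, right_eye_px)
    # similar triangles: real_size / distance = pixel_size / focal_length
    return AVG_EYE_DISTANCE_MM * focal_length_px / eye_distance_px
```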
In an embodiment, the distance adjusting module 137 may calculate a plurality of first viewing distances between the user and the screen 110 according to the facial feature points when the calibration procedure is executed. In addition, the distance adjusting module 137 may estimate a second viewing distance between the user and the screen 110 according to the facial feature points when the user actually operates the electronic device 70. Thereafter, the distance adjustment module 137 may adjust the rotation reference angle or the gaze position according to the second viewing distance and/or the plurality of first viewing distances.
In detail, in order for the distance adjustment module 137 to adjust the rotation reference angles or the gaze position according to the viewing distance, two sets of rotation reference angles corresponding to at least two different preset viewing distances (i.e., the plurality of first viewing distances) need to be established in advance. Fig. 8A and fig. 8B are schematic diagrams illustrating an exemplary method for obtaining rotation reference angles based on different preset distances according to an embodiment of the invention. Referring to fig. 8A and fig. 8B, when the calibration procedure described in the foregoing embodiment is used to obtain the rotation reference angles, the calibration module 136 may prompt the user 80 to repeat the calibration procedure at positions away from the screen 110 by different preset distances. Accordingly, the calibration module 136 can first obtain a first set of rotation reference angles for the first preset distance D1 (e.g., the first yaw angle θ_ymax_1 and the second yaw angle θ_ymin_1 shown in fig. 8A), and then obtain a second set of rotation reference angles for the second preset distance D2 (e.g., the first yaw angle θ_ymax_2 and the second yaw angle θ_ymin_2 shown in fig. 8B). The details of obtaining the rotation reference angles based on the head images have been described in the foregoing embodiments and are not repeated here. Table 1 is an example of two sets of rotation reference angles established for different preset distances, but the invention is not limited thereto.
Table 1: an example of two sets of rotation reference angles corresponding to the first preset distance D1 and the second preset distance D2 (the numeric values appear only in the original drawing).
In other words, the storage device 130 stores at least two sets of rotation reference angles corresponding to different preset distances. Accordingly, taking table 1 as an example, the distance adjusting module 137 may obtain the viewing distance (i.e., the second viewing distance) between the user and the screen 110 after the calibration procedure is performed, and then perform interpolation or extrapolation operation according to the first predetermined distance D1, the second predetermined distance D2, the viewing distance, and two sets of rotation reference angles (as shown in table 1) corresponding to the first predetermined distance D1 and the second predetermined distance D2, so as to obtain the adjusted rotation reference angle. Accordingly, the gaze location prediction module 134 may predict the gaze location of the user's gaze projected on the screen 110 based on the adjusted rotation reference angle and the real-time estimated head posture angle.
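A sketch of this distance adjustment is given below, assuming a simple linear interpolation (or extrapolation) between the two sets of rotation reference angles recorded at the preset distances D1 and D2.

```python
# Sketch: adjust the rotation reference angles to the user's current viewing distance.
def adjust_reference_angles(d1, angles_d1, d2, angles_d2, viewing_distance):
    t = (viewing_distance - d1) / (d2 - d1)     # t < 0 or t > 1 means extrapolation
    return tuple(a1 + t * (a2 - a1) for a1, a2 in zip(angles_d1, angles_d2))

# Usage: angles_d1 / angles_d2 are (theta_pmax, theta_pmin, theta_ymax, theta_ymin)
# recorded at D1 and D2, e.g. adjusted = adjust_reference_angles(500, set1, 800, set2, 650)
# (the distances are illustrative values in millimetres).
```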
Alternatively, the gaze location prediction module 134 may first calculate a first gaze position (Sx1, Sy1) according to the first set of rotation reference angles corresponding to the first preset distance D1, and then calculate a second gaze position (Sx2, Sy2) according to the second set of rotation reference angles corresponding to the second preset distance D2. The distance adjustment module 137 may then perform interpolation or extrapolation according to the first preset distance D1, the second preset distance D2, the viewing distance, the first gaze position (Sx1, Sy1), and the second gaze position (Sx2, Sy2), so as to obtain an adjusted gaze position (Sxf, Syf).
When the user's head flip angle changes, the position at which the user gazes may also change. In another embodiment, the calibration module 136 may also perform a calibration operation on the head flip angle, and the gaze location prediction module 134 may adjust the corresponding gaze position on the screen 110 according to changes in the head flip angle.
In the above embodiment, the electronic device 10 is described as a notebook computer or the like. In other embodiments, various components of the electronic device 10 may be appropriately modified according to different design considerations. In another embodiment, when the user wears VR glasses, surgical loupes, head-mounted display or other wearable device on the head, the image of the user's head and part of the facial feature points are hidden by the wearable device, and the processor 140 can use the facial feature points that are not hidden by the wearable device to perform the estimation of the user's sight line. In another embodiment, the VR glasses, the surgical loupe, the head-mounted display, or other wearable devices are worn on the head of the user, the head image and part of the facial feature points of the user are hidden by the wearable devices, and one or more simulated feature points are disposed on the wearable devices, and the processor 140 can estimate the line of sight of the user by using the facial feature points not hidden by the wearable devices and the simulated feature points on the wearable devices. In addition, in some applications of the virtual reality, the user wears VR glasses with a screen on the head, and the user can perform a wider range of actions on the head to experience the virtual reality effect around the circumference. By the above embodiments and one or more image capturing devices, even if the eyes of the user are shielded, the position of the user gazing at the screen of the VR glasses can be estimated, and the desired image effect can be generated on the screen correspondingly.
In summary, in the embodiments of the invention, the electronic device captures and analyzes head images with a general photographing apparatus and can accurately estimate the point at which the user's line of sight falls on the screen without analyzing the pupil position from eye images, so that the user can interact with the electronic device in a more intuitive and simple manner. That is, the proposed operation method can be used with current consumer electronic products equipped with a common camera lens, thereby achieving the purpose of controlling the electronic device according to the line of sight of the user viewing the screen and broadening the practical range of application. In addition, the electronic device can obtain rotation reference angles for each user as a reference for comparing head rotation angles, and can adjust the prediction result of the gaze position according to the actual viewing distance between the user and the screen, so that the accuracy of predicting the on-screen gaze position from the face image can be greatly improved.
Although the present invention has been described with reference to the above embodiments, it should be understood that the invention is not limited to the embodiments, and various changes and modifications can be made by those skilled in the art without departing from the spirit and scope of the invention.

Claims (16)

1. An electronic device arranged to cause a screen to display a plurality of image pictures, comprising:
an image acquisition device;
a storage device storing a plurality of modules; and
a processor, coupled to the image capture device and the storage device, configured to execute the plurality of modules in the storage device to:
setting the screen to display a plurality of marked objects at a plurality of preset correction positions;
setting the image acquisition device to acquire a plurality of first head images when a user gazes at the preset correction positions;
performing a plurality of first face recognition operations on the plurality of first head images to obtain a plurality of first face blocks corresponding to the plurality of preset correction positions;
detecting a plurality of first facial feature points corresponding to the plurality of first facial blocks;
calculating a plurality of rotation reference angles of the user gazing at the preset correction positions according to the first facial feature points;
setting the image acquisition device to acquire a second head image of the user;
performing a second face recognition operation on the second head image to obtain a second face block;
detecting a plurality of second facial feature points in the second face block;
estimating the head posture angle of the user according to the plurality of second facial feature points;
calculating the gaze position of the user on the screen according to the head posture angle, the plurality of rotation reference angles and the plurality of preset correction positions; and
and setting the screen to display a corresponding visual effect according to the gaze position.
2. The electronic device of claim 1, wherein the gaze location comprises a first coordinate value in a first axial direction and a second coordinate value in a second axial direction.
3. The electronic device of claim 2, wherein the head pose angle comprises a head tilt angle and a head yaw angle, and the plurality of rotation reference angles comprises a first tilt angle, a second tilt angle, a first yaw angle, and a second yaw angle corresponding to the plurality of preset correction positions.
4. The electronic device according to claim 3, wherein the processor performs an interpolation or extrapolation operation according to the first yaw angle, the second yaw angle, a first position corresponding to the first yaw angle among the plurality of preset correction positions, a second position corresponding to the second yaw angle among the plurality of preset correction positions, and the head yaw angle to obtain the first coordinate value of the gaze position; and
the processor performs interpolation or extrapolation operation according to the first tilt angle, the second tilt angle, a third position corresponding to the first tilt angle among the plurality of preset correction positions, a fourth position corresponding to the second tilt angle among the plurality of preset correction positions, and the head tilt angle, so as to obtain the second coordinate value of the gaze position.
5. The electronic device of claim 1, wherein the processor calculates a plurality of first viewing distances between the user and the screen based on the plurality of first facial feature points;
the processor estimates a second viewing distance between the user and the screen according to the plurality of second facial feature points; and
the processor adjusts the plurality of rotation reference angles or the gaze position according to the second viewing distance and the plurality of first viewing distances.
6. The electronic device of claim 1, wherein the processor maps a plurality of two-dimensional position coordinates of the plurality of second facial feature points in a planar coordinate system to a plurality of three-dimensional position coordinates in a stereoscopic coordinate system; and
the processor estimates the head pose angle from the plurality of three-dimensional position coordinates of the plurality of second facial feature points.
7. The electronic device of claim 1, wherein the second head image includes a wearable device, and the second plurality of facial feature points does not include a third plurality of facial feature points obscured by the wearable device by the user.
8. The electronic device of claim 1, wherein the second head image includes a wearing device, and the second facial feature points include one or more simulated feature points identified by the wearing device.
9. An operation method is suitable for an electronic device which comprises an image acquisition device and enables a screen to display a plurality of image pictures, and comprises the following steps:
setting the screen to display a plurality of marked objects at a plurality of preset correction positions;
setting the image acquisition device to acquire a plurality of first head images when a user gazes at the preset correction positions;
performing a plurality of first face recognition operations on the plurality of first head images to obtain a plurality of first face blocks corresponding to the plurality of preset correction positions;
detecting a plurality of first facial feature points corresponding to the plurality of first facial blocks;
calculating a plurality of rotation reference angles of the user gazing at the preset correction positions according to the first facial feature points;
setting the image acquisition device to acquire a second head image of the user;
performing a second face recognition operation on the second head image to obtain a second face block;
detecting a plurality of second facial feature points in the second face block;
estimating the head posture angle of the user according to the plurality of second facial feature points;
calculating the gaze position of the user on the screen according to the head posture angle, the plurality of rotation reference angles and the plurality of preset correction positions; and
and setting the screen to display a corresponding visual effect according to the gaze position.
10. The method of operation of claim 9, wherein the gaze location comprises a first coordinate value in a first axial direction and a second coordinate value in a second axial direction.
11. The method of operation of claim 10, wherein the head pose angle comprises a head tilt angle and a head yaw angle, and the plurality of rotation reference angles comprises a first tilt angle, a second tilt angle, a first yaw angle, and a second yaw angle corresponding to the plurality of preset correction positions.
12. The operating method of claim 11, wherein the step of calculating the gaze position of the user on the screen according to the head pose angle, the plurality of rotation reference angles and the plurality of preset correction positions comprises:
performing an interpolation operation or an extrapolation operation according to the first yaw angle, the second yaw angle, a first position corresponding to the first yaw angle among the plurality of preset correction positions, a second position corresponding to the second yaw angle among the plurality of preset correction positions, and the head yaw angle, so as to obtain the first coordinate value of the gaze position; and
performing an interpolation operation or an extrapolation operation according to the first tilt angle, the second tilt angle, a third position corresponding to the first tilt angle among the plurality of preset correction positions, a fourth position corresponding to the second tilt angle among the plurality of preset correction positions, and the head tilt angle, so as to obtain the second coordinate value of the gaze position.
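The interpolation and extrapolation recited in claim 12 can be as simple as a linear map from head angle to screen coordinate between two calibrated reference points, with angles outside the calibrated range extrapolating along the same line. The sketch below is a minimal illustration with made-up angle and position values, not the patent's own formula.

```python
def interpolate_axis(angle: float, angle_a: float, angle_b: float,
                     pos_a: float, pos_b: float) -> float:
    """Map a measured head angle onto one screen axis.

    angle_a/angle_b are rotation reference angles recorded during calibration and
    pos_a/pos_b the matching preset correction positions on that axis; an angle
    outside [angle_a, angle_b] extrapolates along the same line.
    """
    if angle_b == angle_a:          # degenerate calibration; avoid division by zero
        return pos_a
    t = (angle - angle_a) / (angle_b - angle_a)
    return pos_a + t * (pos_b - pos_a)

# Example: yaw calibrated at -10 deg (x = 0) and +10 deg (x = 1919).
gaze_x = interpolate_axis(2.5, -10.0, 10.0, 0.0, 1919.0)
```

With these assumed values, a measured yaw of 2.5 degrees lands at roughly x = 1199 on the 1920-pixel-wide screen; the same helper applied to the tilt angles and the corresponding correction positions yields the vertical coordinate.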
13. The operating method of claim 9, further comprising:
calculating a plurality of first viewing distances between the user and the screen according to the plurality of first facial feature points;
estimating a second viewing distance between the user and the screen according to the plurality of second facial feature points; and
adjusting the plurality of rotation reference angles or the gaze position according to the second viewing distance and the plurality of first viewing distances.
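One plausible reading of claim 13 is that the viewing distance is inferred from the apparent size of the face, for example the pixel spacing of the two eye landmarks under a pinhole-camera relation, and the calibrated reference angles are then rescaled to the current distance. The constant, the focal-length handling, and the first-order scaling below are assumptions, not values from the patent.

```python
import math

REAL_IPD_MM = 63.0   # assumed average adult interpupillary distance

def viewing_distance_mm(eye_left_px, eye_right_px, focal_px):
    """Approximate user-to-camera distance from the two eye landmark positions."""
    ipd_px = math.hypot(eye_left_px[0] - eye_right_px[0],
                        eye_left_px[1] - eye_right_px[1])
    return REAL_IPD_MM * focal_px / ipd_px   # similar triangles: Z = f * X / x

def rescale_reference_angle(reference_angle_deg, calib_distance_mm, current_distance_mm):
    """First-order fix: the same screen offset subtends a smaller angle farther away."""
    return reference_angle_deg * calib_distance_mm / current_distance_mm
```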
14. The operating method of claim 9, further comprising:
mapping a plurality of two-dimensional position coordinates of the plurality of second facial feature points under a plane coordinate system into a plurality of three-dimensional position coordinates under a three-dimensional coordinate system; and
estimating the head pose angle from the plurality of three-dimensional position coordinates of the plurality of second facial feature points.
15. The operating method of claim 9, wherein the second head image includes a wearable device, and the plurality of second facial feature points do not include a plurality of third facial feature points of the user that are occluded by the wearable device.
16. The operating method of claim 9, wherein the second head image includes a wearable device, and the plurality of second facial feature points include one or more simulated feature points identified from the wearable device.
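Claims 7-8 and 15-16 cover a user wearing a device such as a head-mounted display: landmarks hidden by the device are dropped and, optionally, simulated feature points detected on the device itself are substituted. A minimal sketch of that filtering, with an assumed bounding-box occlusion test and a hypothetical detect_device_markers helper, could look like this.

```python
from typing import Callable, List, Optional, Tuple

Point = Tuple[float, float]

def select_feature_points(landmarks: List[Point],
                          device_bbox: Tuple[float, float, float, float],
                          detect_device_markers: Optional[Callable[[], List[Point]]] = None
                          ) -> List[Point]:
    """Drop landmarks covered by the worn device; optionally add simulated points."""
    x0, y0, x1, y1 = device_bbox
    visible = [p for p in landmarks
               if not (x0 <= p[0] <= x1 and y0 <= p[1] <= y1)]  # occluded points removed
    if detect_device_markers is not None:
        visible.extend(detect_device_markers())  # simulated feature points on the device
    return visible
```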
CN201810965579.3A 2018-08-23 2018-08-23 Electronic device capable of being controlled by head and operation method thereof Withdrawn CN110858095A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810965579.3A CN110858095A (en) 2018-08-23 2018-08-23 Electronic device capable of being controlled by head and operation method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810965579.3A CN110858095A (en) 2018-08-23 2018-08-23 Electronic device capable of being controlled by head and operation method thereof

Publications (1)

Publication Number Publication Date
CN110858095A true CN110858095A (en) 2020-03-03

Family

ID=69635069

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810965579.3A Withdrawn CN110858095A (en) 2018-08-23 2018-08-23 Electronic device capable of being controlled by head and operation method thereof

Country Status (1)

Country Link
CN (1) CN110858095A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1694045A (en) * 2005-06-02 2005-11-09 北京中星微电子有限公司 Non-contact type visual control operation system and method
CN1700242A (en) * 2005-06-15 2005-11-23 北京中星微电子有限公司 Method and apparatus for distinguishing direction of visual lines
US7742623B1 (en) * 2008-08-04 2010-06-22 Videomining Corporation Method and system for estimating gaze target, gaze sequence, and gaze map from video
US20160255303A1 (en) * 2013-09-24 2016-09-01 Sharp Kabushiki Kaisha Image display apparatus and image processing device
US20160202756A1 (en) * 2015-01-09 2016-07-14 Microsoft Technology Licensing, Llc Gaze tracking via eye gaze model

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114648942A (en) * 2020-12-02 2022-06-21 深圳市奥拓电子股份有限公司 LED display screen, local brightness adjusting method thereof and LED display controller

Similar Documents

Publication Publication Date Title
TWI704501B (en) Electronic apparatus operated by head movement and operation method thereof
US11546505B2 (en) Touchless photo capture in response to detected hand gestures
KR102212209B1 (en) Method, apparatus and computer readable recording medium for eye gaze tracking
CN106796449B (en) Sight tracking method and device
US9367951B1 (en) Creating realistic three-dimensional effects
CN117178247A (en) Gestures for animating and controlling virtual and graphical elements
CN116724285A (en) Micro-gestures for controlling virtual and graphical elements
CN117120962A (en) Controlling two-handed interactions between mapped hand regions of virtual and graphical elements
US20140184494A1 (en) User Centric Interface for Interaction with Visual Display that Recognizes User Intentions
Takemura et al. Estimation of a focused object using a corneal surface image for eye-based interaction
Cho et al. Long range eye gaze tracking system for a large screen
WO2015133889A1 (en) Method and apparatus to combine ocular control with motion control for human computer interaction
WO2014028477A1 (en) Systems and methods for iris detection and gaze estimation
AU2024200190A1 (en) Presenting avatars in three-dimensional environments
KR20200138349A (en) Image processing method and apparatus, electronic device, and storage medium
KR20120060978A (en) Method and Apparatus for 3D Human-Computer Interaction based on Eye Tracking
Ko et al. A robust gaze detection method by compensating for facial movements based on corneal specularities
US20190369807A1 (en) Information processing device, information processing method, and program
WO2020080107A1 (en) Information processing device, information processing method, and program
CN110858095A (en) Electronic device capable of being controlled by head and operation method thereof
KR20140090538A (en) Display apparatus and controlling method thereof
KR101861096B1 (en) Method and apparatus for controlling information displayed on screen by recognizing hand gesture of user
CN114779925A (en) Sight line interaction method and device based on single target
US20170302904A1 (en) Input/output device, input/output program, and input/output method
Lee et al. A new eye tracking method as a smartphone interface

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20200303