WO2023071884A1 - Gaze detection method, control method for electronic device, and related devices - Google Patents


Info

Publication number
WO2023071884A1
WO2023071884A1 (application PCT/CN2022/126148, CN2022126148W)
Authority
WO
WIPO (PCT)
Prior art keywords
information
face
gaze
coordinates
gaze point
Prior art date
Application number
PCT/CN2022/126148
Other languages
French (fr)
Chinese (zh)
Inventor
龚章泉
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp., Ltd.
Publication of WO2023071884A1 publication Critical patent/WO2023071884A1/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Definitions

  • the present application relates to the technical field of consumer electronics, and in particular to a gaze detection method, a control method for electronic equipment, a detection device, a control device, electronic equipment, and a non-volatile computer-readable storage medium.
  • electronic devices can estimate a user's gaze point by collecting face images.
  • the present application provides a gaze detection method, a control method of an electronic device, a detection device, a control device, an electronic device and a non-volatile computer-readable storage medium.
  • the gaze detection method of an embodiment of the present application includes determining the pose information of the face according to the face information, and determining the reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determining correction parameters according to the pose information; and determining gaze information according to the reference gaze point coordinates and the correction parameters.
  • a detection device includes a first determination module, a second determination module and a third determination module.
  • the first determination module is used to determine the pose information of the face according to the face information, and determine the reference gaze point coordinates according to the face information;
  • the second determination module is used to determine correction parameters according to the pose information in response to the pose information being greater than a preset threshold;
  • the third determination module is configured to determine gaze information according to the coordinates of the reference gaze point and the correction parameters.
  • An electronic device includes a processor configured to determine the pose information of the face according to the face information, and determine the reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determine correction parameters according to the pose information; and determine gaze information according to the reference gaze point coordinates and the correction parameters.
  • In the gaze detection method, detection device and electronic device of the present application, after the face information is obtained, the face pose is first calculated from the face information. If the pose information is greater than the preset threshold, the calculation accuracy of the gaze point coordinates would be affected; therefore the reference gaze point coordinates are calculated from the face information, the correction parameters are calculated from the pose information, and the reference gaze point coordinates are then corrected according to the correction parameters. This prevents an excessive face shooting angle in the obtained face information from degrading gaze detection, and thus improves the accuracy of gaze detection.
  • the method for controlling an electronic device includes determining the pose information of the face according to the face information, and determining the reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determining correction parameters according to the pose information; determining gaze information according to the reference gaze point coordinates and the correction parameters; and controlling the electronic device according to the gaze information.
  • the control device in the embodiment of the present application includes an acquisition module, a first determination module and a second determination module.
  • the acquisition module is used to determine the pose information of the face according to the face information, and determine the reference gaze point coordinates according to the face information;
  • the first determination module is used to determine correction parameters according to the pose information in response to the pose information being greater than a preset threshold;
  • the second determination module is used to determine gaze information according to the reference gaze point coordinates and the correction parameters;
  • the electronic device includes a processor configured to determine the pose information of the face according to the face information, and determine the reference gaze point coordinates according to the face information; in response to the pose information being greater than the preset threshold, determine a correction parameter according to the pose information; determine gaze information according to the reference gaze point coordinates and the correction parameter; and control the electronic device according to the gaze information.
  • the processor is caused to execute a gaze detection method or a control method.
  • the gaze detection method includes determining the pose information of the face according to the face information, and determining the reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determining correction parameters according to the pose information; and determining gaze information according to the reference gaze point coordinates and the correction parameters.
  • the control method of the electronic device includes determining the pose information of the face according to the face information, and determining the reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determining a correction parameter according to the pose information; determining gaze information according to the reference gaze point coordinates and the correction parameters; and controlling the electronic device according to the gaze information.
  • FIG. 1 is a schematic flow chart of a gaze detection method in some embodiments of the present application.
  • FIG. 2 is a block diagram of a detection device in some embodiments of the present application.
  • FIG. 3 is a schematic plan view of an electronic device in some embodiments of the present application.
  • Fig. 4 is a schematic diagram of connection between an electronic device and a cloud server in some embodiments of the present application.
  • FIGS. 5 to 7 are schematic flowcharts of gaze detection methods in some embodiments of the present application.
  • Fig. 8 is a schematic structural diagram of a detection model in some embodiments of the present application.
  • FIG. 9 is a schematic flowchart of a method for controlling an electronic device in some embodiments of the present application.
  • Fig. 10 is a block diagram of a control device in some embodiments of the present application.
  • FIGS. 11 to 14 are schematic diagrams of scenarios of control methods in some embodiments of the present application.
  • FIGS. 15 and 16 are schematic flowcharts of the control method in some embodiments of the present application.
  • FIGS. 17 and 18 are schematic diagrams of scenarios of the control method in some embodiments of the present application.
  • FIG. 19 is a schematic flowchart of a control method in some embodiments of the present application.
  • Fig. 20 is a schematic diagram of connection between a processor and a computer-readable storage medium in some embodiments of the present application.
  • the gaze detection method of the present application includes determining the pose information of the face according to the face information, and determining the reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determining the correction parameters according to the pose information; and determining gaze information according to the reference gaze point coordinates and the correction parameters.
  • the gaze detection method further includes: in response to the pose information being less than the preset threshold, calculating the reference gaze point coordinates according to the face information as the gaze information.
  • the posture information includes a posture angle
  • the posture angle includes a pitch angle and a yaw angle.
  • judging whether the pose information of the face is greater than a preset threshold includes: judging, according to the face information, whether the pitch angle or the yaw angle is greater than a preset threshold.
  • the gaze detection method further includes: obtaining a training sample set, where the training sample set includes first-type samples whose face pose information is less than the preset threshold and second-type samples whose face pose information is greater than the preset threshold; and training a preset detection model according to the first-type samples and the second-type samples; determining the correction parameters according to the pose information includes: determining the correction parameters according to the pose information based on the detection model.
  • the detection model includes a gaze point detection module and a correction module, and training the detection model according to the first-type samples and the second-type samples includes: inputting the first-type samples into the gaze point detection module to output first training coordinates; inputting the second-type samples into the gaze point detection module and the correction module to output second training coordinates; based on a preset loss function, calculating a first loss value according to the first preset coordinates and the first training coordinates corresponding to the first-type samples, and calculating a second loss value according to the second preset coordinates and the second training coordinates corresponding to the second-type samples; and adjusting the detection model according to the first loss value and the second loss value until the detection model converges.
  • the detection model is determined to have converged when, over N consecutive samples, the first difference between the first loss values corresponding to any two samples of the first type and the second difference between the second loss values corresponding to any two samples of the second type are both sufficiently small, where N is a positive integer greater than 1; or, when the first loss value and the second loss value are both less than a predetermined loss threshold, the detection model is determined to have converged.
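As a rough illustration only, the two-branch convergence test described above might be sketched as follows. The helper name and the interpretation of "sufficiently small" pairwise loss differences are assumptions; the patent does not specify an implementation:

```python
def has_converged(first_losses, second_losses, n, diff_threshold, loss_threshold):
    """Decide convergence from the recorded first/second loss values.

    Converged if, over the last n steps, the maximum pairwise difference
    within each loss series is below diff_threshold, or if the latest
    first and second losses are both below loss_threshold.
    """
    if len(first_losses) >= n and len(second_losses) >= n:
        recent1, recent2 = first_losses[-n:], second_losses[-n:]
        spread1 = max(recent1) - min(recent1)  # largest pairwise difference
        spread2 = max(recent2) - min(recent2)
        if spread1 < diff_threshold and spread2 < diff_threshold:
            return True
    return first_losses[-1] < loss_threshold and second_losses[-1] < loss_threshold
```

Either branch alone would suffice; checking both mirrors the "or" in the passage above.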
  • the face information includes a face mask, a left-eye image and a right-eye image
  • the face mask is used to indicate the position of the face in the image
  • calculating the reference gaze point coordinates according to the face information includes: calculating the position information of the face relative to the electronic device according to the face mask; and calculating the reference gaze point coordinates according to the position information, the left-eye image and the right-eye image.
  • the face information includes face feature points
  • the pose information includes pose angles and three-dimensional coordinate offsets
  • the correction parameters include rotation matrices and translation matrices
  • determining the pose information of the face according to the face information includes: calculating the pose angle and the three-dimensional coordinate offset according to the face feature points; calculating the correction parameters according to the pose information includes: calculating the rotation matrix according to the pose angle, and calculating the translation matrix according to the three-dimensional coordinate offset.
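As one possible sketch of these correction parameters, a rotation matrix can be built from the pose angles and a homogeneous translation matrix from the three-dimensional coordinate offset. The axis-order convention (Rz·Ry·Rx) and the function names are illustrative assumptions, not the patent's specified method:

```python
import numpy as np

def rotation_matrix(pitch, yaw, roll):
    """3x3 rotation from pose angles in radians, composed as Rz(roll) @ Ry(yaw) @ Rx(pitch).

    The composition order is an assumption; the patent fixes no convention.
    """
    cx, sx = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    cz, sz = np.cos(roll), np.sin(roll)
    rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])   # rotation about x
    ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])   # rotation about y
    rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])   # rotation about z
    return rz @ ry @ rx

def translation_matrix(offset):
    """4x4 homogeneous translation built from a 3-D coordinate offset."""
    t = np.eye(4)
    t[:3, 3] = offset
    return t
```

With both matrices, a reference gaze point expressed in homogeneous coordinates can be rotated and shifted back toward the frontal-pose frame.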
  • the electronic device control method of the present application includes determining the pose information of the face according to the face information, and determining the reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determining the correction parameters according to the pose information; determining gaze information according to the reference gaze point coordinates and the correction parameters; and controlling the electronic device according to the gaze information.
  • the control method further includes: in response to the pose information being less than the preset threshold, calculating the reference gaze point coordinates according to the face information as the gaze information.
  • the face information includes a face mask, a left-eye image and a right-eye image
  • the face mask is used to indicate the position of the face in the image
  • calculating the reference gaze point coordinates according to the face information includes: calculating the position information of the face relative to the electronic device according to the face mask; and calculating the reference gaze point coordinates according to the position information, the left-eye image and the right-eye image.
  • the face information includes face feature points
  • the pose information includes pose angles and three-dimensional coordinate offsets
  • the correction parameters include rotation matrices and translation matrices
  • calculating the correction parameters according to the pose information includes: calculating the pose angle and the three-dimensional coordinate offset according to the face feature points; calculating the rotation matrix according to the pose angle, and calculating the translation matrix according to the three-dimensional coordinate offset.
  • the gazing information includes gaze point coordinates.
  • the control method further includes: within a first predetermined duration, acquiring a captured image; and in response to the captured image containing face information, controlling the electronic device according to the gaze information, including: in response to the gaze point coordinates being located in the display area of the display screen, continuing to light the screen for a second predetermined duration.
  • the display area is associated with a preset coordinate range
  • the control method further includes: when the gaze point coordinates are within the preset coordinate range, determining that the gaze point coordinates are located in the display area.
  • before determining the pose information of the face according to the face information and determining the reference gaze point coordinates according to the face information, the control method further includes: in response to the electronic device not receiving an input operation, acquiring a captured image; controlling the electronic device according to the gaze information includes: in response to the captured image containing a human face and the gaze point coordinates being located in the display area, adjusting the display brightness of the display screen to a first predetermined brightness; in response to the captured image not containing a human face, or the captured image containing a human face but the gaze point coordinates being outside the display area, adjusting the display brightness to a second predetermined brightness, the second predetermined brightness being smaller than the first predetermined brightness.
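The display-area test and the brightness rule described above can be sketched as follows. The function names, the (x_min, y_min, x_max, y_max) rectangle encoding of the preset coordinate range, and the brightness values are illustrative assumptions:

```python
def in_display_area(gaze_xy, coord_range):
    """True if the gaze point lies within the preset coordinate range
    (x_min, y_min, x_max, y_max) associated with the display area."""
    x, y = gaze_xy
    x_min, y_min, x_max, y_max = coord_range
    return x_min <= x <= x_max and y_min <= y <= y_max

def target_brightness(has_face, gaze_on_screen, first_brightness, second_brightness):
    """Brightness rule from the passage: full (first) brightness only when a
    face is present AND the gaze point is on the display area; otherwise dim
    to the second brightness (second < first)."""
    return first_brightness if (has_face and gaze_on_screen) else second_brightness
```

For example, a captured image with a face whose gaze point falls inside the screen rectangle keeps the display at the first brightness; every other case dims it.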
  • the detection device of the present application includes a first determination module, a second determination module and a third determination module.
  • the first determination module is used to determine the pose information of the face according to the face information, and determine the coordinates of the reference gaze point according to the face information;
  • the second determination module is used to determine the correction parameters according to the pose information in response to the pose information being greater than a preset threshold;
  • the third determination module is configured to determine gaze information according to the reference gaze point coordinates and the correction parameters.
  • the control device of the present application includes an acquisition module, a first determination module, a second determination module and a control module.
  • the acquisition module is used to determine the pose information of the face according to the face information, and determine the reference gaze point coordinates according to the face information;
  • the first determination module is used to determine the correction parameters according to the pose information in response to the pose information being greater than a preset threshold;
  • the second determination module is used to determine the gaze information according to the reference gaze point coordinates and the correction parameters;
  • the control module is used to control the electronic equipment according to the gaze information.
  • the electronic device of the present application includes a processor configured to determine the pose information of the face according to the face information, and determine the reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determine the correction parameters according to the pose information; and determine gaze information according to the reference gaze point coordinates and the correction parameters.
  • the electronic device of the present application includes a processor configured to determine the pose information of the face according to the face information, and determine the reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determine the correction parameters according to the pose information; determine gaze information according to the reference gaze point coordinates and the correction parameters; and control the electronic device according to the gaze information.
  • the non-transitory computer-readable storage medium of the present application includes a computer program; when the computer program is executed by a processor, the processor executes the gaze detection method of any of the above embodiments, or the control method of the electronic device of any of the above embodiments.
  • the gaze detection method of the embodiment of the present application includes the following steps:
  • 011 Determine the pose information of the face according to the face information, and determine the reference gaze point coordinates according to the face information;
  • 013 In response to the pose information being greater than a preset threshold, determine the correction parameters according to the pose information;
  • 015 Determine the gaze information according to the reference gaze point coordinates and the correction parameters.
  • the detection device 10 in the embodiment of the present application includes a first determination module 11 , a second determination module 12 and a third determination module 13 .
  • the first determination module 11 is used to determine the pose information of the face according to the face information, and determines the coordinates of the reference gaze point according to the face information;
  • the second determination module 12 is used to determine the correction parameters according to the pose information in response to the pose information being greater than a preset threshold ;
  • the third determination module 13 is used to determine the gaze information according to the coordinates of the reference gaze point and the correction parameters. That is to say, step 011 can be implemented by the first determination module 11 , step 013 can be performed by the second determination module 12 and step 015 can be performed by the third determination module 13 .
  • the electronic device 100 in the embodiment of the present application includes a processor 60 and a collection device 30 .
  • The acquisition device 30 is used for collecting face information at a predetermined frame rate (the face information may comprise a face image, such as a visible light image, an infrared image, or a depth image of the face);
  • The acquisition device 30 can be one or more of a visible light camera, an infrared camera, and a depth camera, where the visible light camera collects visible light face images, the infrared camera collects infrared face images, and the depth camera collects depth face images.
  • In some embodiments, the collection device 30 includes a visible light camera, an infrared camera and a depth camera, and the acquisition device 30 simultaneously collects a visible light face image, an infrared face image and a depth face image.
  • The processor 60 may include an image signal processor (ISP), a neural-network processing unit (NPU) and an application processor (AP). The detection device 10 is arranged in the electronic device 100, where the first determination module 11 can be arranged on the ISP and the NPU, and the processor 60 is connected to the collection device 30. After the collection device 30 collects the face image, the ISP processes the face image to obtain the face information, and the NPU determines the reference gaze point coordinates according to the face information; the second determination module 12 and the third determination module 13 can be arranged on the NPU.
  • ISP Image Signal Processor
  • NPU Neural-Network Processing Unit
  • AP Application Processor
  • The processor 60 (specifically, the ISP and the NPU) is used to determine the pose information of the face according to the face information; the processor 60 (specifically, the NPU) is also used to determine the correction parameters according to the pose information in response to the pose information being greater than a preset threshold, and to determine gaze information according to the reference gaze point coordinates and the correction parameters. That is to say, step 011, step 013 and step 015 may be executed by the processor 60.
  • the electronic device 100 may be a mobile phone, a smart watch, a tablet computer, a display device, a notebook computer, a teller machine, a gate, a head-mounted display device, a game machine, and the like. As shown in FIG. 3 , the embodiment of the present application is described by taking the electronic device 100 as a mobile phone as an example. It can be understood that the specific form of the electronic device 100 is not limited to the mobile phone.
  • The collection device 30 can collect the user's face information once every predetermined time interval, so that gaze detection is performed on the user continuously while the power consumption of the electronic device 100 remains small; or, when the user is using an application that requires gaze detection (such as browser software, forum software, or video software), face information is collected at a predetermined frame rate (such as 10 frames per second), so that face information is collected only when gaze detection is required, minimizing the power consumption of gaze detection.
  • The processor 60 can identify the face image. For example, the processor 60 can compare the face image with a preset face template, so as to determine the face in the face image and the image areas where different parts of the face (such as the eyes and nose) are located. The processor 60 can perform face recognition in a trusted execution environment (TEE) to protect the user's privacy. Alternatively, the preset face template can be stored in the cloud server 200, and the electronic device 100 sends the face image to the cloud server 200 for comparison to determine the face area image; handing face recognition over to the cloud server 200 reduces the processing load of the electronic device 100 and improves image processing efficiency. The processor 60 can then recognize the face area image to determine the pose information of the face. More specifically, the face and its different parts can be recognized according to their shape features, so as to obtain the face information.
  • the pose information of the face can be calculated according to the face information.
  • The pose information can be calculated by extracting features of the face image and computing the pose from the position coordinates of the extracted feature points. For example, the nose tip, the centers of the left and right eyes, and the left and right corners of the mouth are used as feature points, and as the pose of the face changes, the position coordinates of the feature points change accordingly. For example, a three-dimensional coordinate system is established with the nose tip as the origin, and the pitch, yaw and roll angles respectively represent the rotation angles of the face relative to the three coordinate axes of this coordinate system. Taking the horizontal rotation of the face relative to the display screen 40 of the electronic device 100 as an example, the larger the yaw angle (i.e., the horizontal rotation angle), the closer together the two feature points corresponding to the left and right eyes appear. Therefore, the pose information of the face can be accurately calculated from the position coordinates of the feature points.
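The observation above, that the projected distance between the two eye feature points shrinks as the yaw angle grows, can be turned into a toy yaw estimate. This sketch assumes the projected inter-eye spacing scales with cos(yaw) relative to a frontal reference spacing; a real system would calibrate that reference or solve a full perspective-n-point problem from many landmarks:

```python
import math

def estimate_yaw_from_eyes(left_eye, right_eye, interocular_ref):
    """Rough yaw (degrees) from apparent eye spacing.

    left_eye/right_eye are (x, y) pixel coordinates of the eye centers;
    interocular_ref is the spacing measured when the face is frontal
    (an assumed calibration input, not something the patent specifies).
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    observed = math.hypot(dx, dy)
    # Projected spacing ~ interocular_ref * cos(yaw); clamp for safety.
    ratio = max(-1.0, min(1.0, observed / interocular_ref))
    return math.degrees(math.acos(ratio))
```

A frontal face (observed spacing equal to the reference) yields 0 degrees; a face whose eyes appear half as far apart yields about 60 degrees.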
  • the correction parameters can be determined according to the posture information, thereby correcting the error of the gaze point detection caused by the change of posture, so that the gaze obtained according to the coordinates of the reference point of gaze and the correction parameters The information is more accurate.
  • The processor 60 can directly calculate the reference gaze point coordinates from the face area image, or perform feature point recognition on the face area image and calculate the reference gaze point coordinates from the feature points, which requires relatively little computation. Alternatively, the processor 60 can obtain the face area image and the human eye area image, perform feature point recognition on the face area image, and jointly calculate the reference gaze point coordinates from the feature points of the face area image together with the human eye area image, further improving the calculation accuracy of the reference gaze point coordinates while keeping the amount of calculation small.
  • The processor 60 can first determine whether the pose information is greater than a preset threshold. The pose information includes the pitch angle, roll angle and yaw angle of the face; however, because a change of the roll angle (rotation of the face parallel to the display screen 40) does not change the positions of the facial feature points within the face, it is sufficient to judge only whether the pitch angle or the yaw angle is greater than the preset threshold.
  • For example, when the face is directly facing the display screen, the pitch angle, roll angle and yaw angle are all 0 degrees; if the preset threshold is 0 degrees, then when the pose information is greater than 0 degrees (such as when the pitch angle or yaw angle is greater than 0 degrees), it can be determined that the reference gaze point coordinates need to be corrected.
  • Since the pitch angle, roll angle, and yaw angle are directional, they may take negative values, which would affect the accuracy of the judgment; therefore, when judging whether the pose information is greater than the preset threshold, the absolute value of the pose information can be compared against the preset threshold.
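A minimal sketch of this judgment, assuming only pitch and yaw are compared (roll is ignored, as discussed above) and absolute values are used for the signed angles; the function name is illustrative:

```python
def needs_correction(pitch, yaw, threshold):
    """True if the reference gaze point coordinates need correcting.

    Roll is deliberately not checked: rotating in the screen plane does
    not move the feature points within the face. Absolute values handle
    the signed (directional) angles.
    """
    return abs(pitch) > threshold or abs(yaw) > threshold
```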
  • the processor 60 calculates the correction parameters according to the posture information.
  • The correction parameters include coordinate correction coefficients, and the gaze information can be obtained from the reference gaze point coordinates and the coordinate correction coefficients.
  • For example, if the reference gaze point coordinates are (x, y) and the coordinate correction coefficients are a and b, then the gaze information is (ax, by). Alternatively, the reference gaze point coordinates include both the two-dimensional coordinates on the display screen 40 and the direction of the line of sight.
  • the correction parameters include coordinate correction coefficient and direction correction coefficient.
  • the gaze information can be obtained according to the coordinates of the reference gaze point and the coordinate correction coefficient.
  • For example, if the reference gaze point coordinates are (x, y), the direction of the line of sight is (α, β, γ), the coordinate correction coefficients are a and b, and the direction correction coefficients are c, d and e, then the gaze information is (ax, by, cα, dβ, eγ).
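The component-wise correction in this example can be written directly. The tuple layout is taken from the example above; the function name and argument grouping are illustrative:

```python
def correct_gaze(ref_point, sight_dir, coord_coef, dir_coef):
    """Apply per-component correction coefficients.

    ref_point  = (x, y)            scaled by coord_coef = (a, b)
    sight_dir  = (alpha, beta, gamma) scaled by dir_coef = (c, d, e)
    Returns the combined gaze information (ax, by, c*alpha, d*beta, e*gamma).
    """
    x, y = ref_point
    a, b = coord_coef
    corrected_xy = (a * x, b * y)
    corrected_dir = tuple(k * v for k, v in zip(dir_coef, sight_dir))
    return corrected_xy + corrected_dir
```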
  • If the pose information is less than or equal to the preset threshold, the user is facing the display screen 40 or the deflection angle of the user relative to the display screen 40 is small; it can then be determined that the reference gaze point coordinates do not need to be corrected, and after calculating the reference gaze point coordinates, the processor 60 can directly take them as the final gaze information, saving the computation of the correction parameters.
  • The preset threshold can also be set larger. For example, with a preset threshold of 5 degrees, the deflection of the face is small enough that the detection accuracy of the gaze information is essentially unaffected, and the calculation of correction parameters can be skipped. Alternatively, the preset threshold can be set according to the requirements on the gaze information: if the gaze information only includes the gaze direction and does not need accurate gaze point coordinates, the preset threshold can be set larger; if the gaze information includes the gaze point coordinates on the display screen 40, the preset threshold can be set smaller to ensure the accuracy of gaze point detection.
  • After determining the gaze information, the electronic device 100 can be controlled according to the gaze information (the gaze direction and/or the gaze point coordinates). For example, when the gaze point coordinates are detected to be within the display area of the display screen 40, the screen is kept always on; after the gaze point coordinates have been outside the display area of the display screen 40 for a predetermined duration (such as 10 s or 20 s), the screen is turned off. Or, operations such as page turning are performed according to changes of the gaze direction.
  • the obtained face information may not be accurate enough due to factors such as shooting angles, thereby affecting the accuracy of gaze point detection.
• In the gaze detection method, the detection device 10 and the electronic device 100 of the present application, after the face information is obtained, the face posture is first calculated from it. If the posture information is greater than the preset threshold, which would affect the calculation accuracy of the gaze point coordinates, the reference gaze point coordinates are first calculated from the face information and the correction parameters are then calculated from the posture information, so that the reference gaze point coordinates can be corrected according to the correction parameters. This prevents the shooting angle of the face from unduly affecting gaze detection and improves the accuracy of gaze detection.
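The overall flow just described could be sketched as follows; all function names and the stub estimators passed in are hypothetical placeholders, not the application's actual modules:

```python
def detect_gaze(face_info, threshold, estimate_pose, estimate_ref_point,
                compute_correction, apply_correction):
    """Gaze-detection flow from the text: correct the reference gaze
    point only when the face posture exceeds the preset threshold;
    otherwise use the reference point directly and skip the correction."""
    pose = estimate_pose(face_info)            # e.g. deflection angle of the face
    ref_point = estimate_ref_point(face_info)  # reference gaze point coordinates
    if pose <= threshold:
        # Face is (nearly) frontal: no correction parameters are computed.
        return ref_point
    params = compute_correction(pose)
    return apply_correction(ref_point, params)

# Frontal face: posture below threshold, reference point returned as-is.
frontal = detect_gaze("face", 5.0, lambda f: 3.0, lambda f: (1.0, 2.0),
                      lambda p: None, lambda r, c: r)
# Deflected face: a (toy) correction parameter scales the reference point.
deflected = detect_gaze("face", 5.0, lambda f: 10.0, lambda f: (1.0, 2.0),
                        lambda p: 0.5, lambda r, c: (r[0] * c, r[1] * c))
print(frontal, deflected)
```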
  • the face information includes a face mask, a left-eye image and a right-eye image, and the face mask is used to indicate the position of the face in the image
  • Step 011 Calculate the reference gaze point coordinates according to the face information, including:
  • 0111 Calculate the position information of the face relative to the electronic device 100 according to the face mask
  • 0112 Calculate the coordinates of the reference gaze point according to the position information, the left-eye image and the right-eye image.
  • the first determining module 11 is configured to calculate the position information of the face relative to the electronic device 100 according to the face mask; and calculate the reference gaze point coordinates according to the position information, the left-eye image and the right-eye image. That is to say, step 0111 and step 0112 can be executed by the first determination module 11 .
  • the processor 60 is further configured to calculate the position information of the face relative to the electronic device 100 according to the face mask; and calculate the reference gaze point coordinates according to the position information, the left-eye image and the right-eye image. That is to say, step 0111 and step 0112 may be executed by the processor 60 .
• The processor 60 can also first determine the face mask of the face image; the face mask is used to represent the position of the face in the face image and can be obtained by recognizing that position. The processor 60 can then calculate the position information of the face relative to the electronic device 100 according to the face mask (for example, the distance between the face and the electronic device 100 can be calculated from the ratio of the face mask to the face image).
• It can be understood that when the distance between the face and the electronic device 100 changes, the gaze point coordinates of the human eye change even if the gaze direction does not. Therefore, when calculating the gaze information, in addition to using the face image and/or the eye images (such as the left-eye and right-eye images), the position information can also be combined to calculate the gaze point coordinates more accurately.
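As a toy illustration of estimating position information from the mask-to-image ratio, a pinhole-style model makes the fraction of the image occupied by the face shrink with the square of distance; the function name and the calibration constant k below are purely hypothetical:

```python
import math

def estimate_distance(mask_area, image_area, k=0.09):
    """Illustrative pinhole-style estimate: distance ~ k / sqrt(ratio),
    where ratio is the fraction of the image covered by the face mask.
    k is a hypothetical calibration constant, not a value from the source."""
    ratio = mask_area / image_area
    return k / math.sqrt(ratio)

# A face filling 1/4 of the frame reads as half the distance of one filling 1/16.
print(estimate_distance(16, 64) / estimate_distance(4, 64))
```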
  • face information includes face feature points
  • attitude information includes attitude angle and three-dimensional coordinate offset
  • correction parameters include rotation matrix and translation matrix
  • step 011 Determine the posture information of the face according to the face information, including:
  • Step 013 Calculate the correction parameters according to the attitude information, including
  • 0131 Calculate the rotation matrix according to the attitude angle, and calculate the translation matrix according to the three-dimensional coordinate offset.
  • the first determination module 11 is also used to calculate the attitude angle and three-dimensional coordinate offset according to the facial feature points; the second determination module 12 is also used to calculate the rotation matrix according to the attitude angle, and calculate the three-dimensional coordinate offset according to the Compute the translation matrix. That is to say, step 0113 can be performed by the first determination module 11 , and step 0131 can be performed by the second determination module 12 .
• The processor 60 is further configured to calculate an attitude angle and a three-dimensional coordinate offset based on the facial feature points, calculate a rotation matrix based on the attitude angle, and calculate a translation matrix based on the three-dimensional coordinate offset. That is to say, step 0113 and step 0131 may be executed by the processor 60.
  • the correction parameters may include a rotation matrix and a translation matrix to represent the face position change and pose change respectively.
• The attitude angle and the three-dimensional coordinate offset may be calculated first according to the face feature points, where the attitude angle represents the posture of the face (such as the pitch angle, roll angle and yaw angle) and the three-dimensional coordinate offset represents the position of the face. The rotation matrix is then calculated according to the attitude angle and the translation matrix according to the three-dimensional coordinate offset, so as to determine the correction parameters for the reference gaze point coordinates and accurately calculate the gaze information from the reference gaze point coordinates, the rotation matrix and the translation matrix.
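As a sketch of how such correction parameters could be formed, a rotation matrix can be composed from the pitch, roll and yaw attitude angles and a translation applied from the three-dimensional coordinate offset; the composition order and axis conventions below are illustrative assumptions, since the source does not fix them:

```python
import math

def matmul(A, B):
    """3x3 matrix product (plain lists, no external dependencies)."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_matrix(pitch, roll, yaw):
    """Compose a 3x3 rotation matrix from attitude angles in radians.
    Pitch is taken about X1, roll about Y1 and yaw about Z1, matching the
    axes described later in the text; the Rz*Ry*Rx order is one common
    convention, chosen here for illustration only."""
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    cy, sy = math.cos(yaw), math.sin(yaw)
    Rx = [[1, 0, 0], [0, cp, -sp], [0, sp, cp]]
    Ry = [[cr, 0, sr], [0, 1, 0], [-sr, 0, cr]]
    Rz = [[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]]
    return matmul(Rz, matmul(Ry, Rx))

def translate(point, offset):
    """Apply the translation given by a three-dimensional coordinate offset."""
    return [p + o for p, o in zip(point, offset)]
```

With zero attitude angles the rotation matrix is the identity, so an unrotated face leaves the reference coordinates unchanged.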
  • gaze detection method also includes:
  • the training sample set includes the first type of samples whose face pose information is less than a preset threshold and the second type of samples whose face pose information is greater than a preset threshold;
  • 0102 Train a preset detection model according to the first type of samples and the second type of samples;
  • Step 013 includes:
  • the detection device 10 further includes an acquisition module 14 and a training module 15 . Both the acquiring module 14 and the training module 15 can be set in the NPU to train the detection model.
  • the acquisition module 14 is used to obtain the training sample set;
• the training module 15 is used to train the preset detection model according to the first type of samples and the second type of samples;
  • the second determination module 12 is also used to determine the correction parameters according to the posture information based on the detection model . That is to say, step 0101 may be performed by the acquisition module 14 , step 0102 may be performed by the training module 15 , and step 0132 may be performed by the second determination module 12 .
  • the processor 60 is further configured to obtain a training sample set; train a preset detection model according to the first type of samples and the second type of samples; and determine correction parameters according to the posture information based on the detection model. That is to say, step 0101 , step 0102 and step 0132 can be executed by the processor 60 .
  • the present application can realize calculation of gaze information through a preset detection model.
  • it is necessary to first train the detection model so that the detection model converges.
• In order to enable the detection model to still accurately calculate the gaze information when the face is deflected relative to the display screen 40, a plurality of first-type samples whose face posture information is less than the preset threshold and a plurality of second-type samples whose face posture information is greater than the preset threshold can be pre-selected as the training sample set. The first-type samples are face images whose posture information is less than the preset threshold; the second-type samples are face images whose posture information is greater than the preset threshold. By training the detection model with both types of samples until convergence, the influence of the face's deflection relative to the display screen 40 on gaze detection is minimized, ensuring the accuracy of gaze detection.
  • step 0102 includes:
  • 01021 Input the first type of samples into the fixation point detection module to output the first training coordinates
  • 01022 Input the second type of samples into the gaze point detection module and the correction module to output the second training coordinates;
• 01023 Based on the preset loss function, calculate the first loss value according to the first preset coordinates corresponding to the first type of samples and the first training coordinates, and calculate the second loss value according to the second preset coordinates corresponding to the second type of samples and the second training coordinates;
  • 01024 Adjust the detection model according to the first loss value and the second loss value until the detection model converges.
• The training module 15 is also used to: input the first type of samples into the gaze point detection module to output the first training coordinates; input the second type of samples into the gaze point detection module and the correction module to output the second training coordinates; based on the preset loss function, calculate the first loss value according to the first preset coordinates corresponding to the first type of samples and the first training coordinates, and calculate the second loss value according to the second preset coordinates corresponding to the second type of samples and the second training coordinates; and adjust the detection model according to the first loss value and the second loss value until the detection model converges. That is to say, Step 01021 to Step 01024 can be executed by the training module 15.
• The processor 60 is also used to: input the first type of samples into the gaze point detection module to output the first training coordinates; input the second type of samples into the gaze point detection module and the correction module to output the second training coordinates; based on the preset loss function, calculate the first loss value according to the first preset coordinates corresponding to the first type of samples and the first training coordinates, and calculate the second loss value according to the second preset coordinates corresponding to the second type of samples and the second training coordinates; and adjust the detection model according to the first loss value and the second loss value until the detection model converges. That is to say, Step 01021 to Step 01024 may be executed by the processor 60.
• the detection model 50 includes a gaze point detection module 51 and a correction module 52.
• The training sample set is input into the detection model. The first type of samples are input into the gaze point detection module 51, which outputs the first training coordinates directly, since the posture information of the first-type training samples is less than the preset threshold. The second type of training samples are input into both the gaze point detection module 51 and the correction module 52: the gaze point detection module 51 outputs reference training coordinates, the correction module 52 outputs the correction parameters, and the reference training coordinates are corrected according to the correction parameters to output the second training coordinates.
  • each training sample has a corresponding preset coordinate
  • the preset coordinate represents the actual gaze information of the training sample
  • the first type of training sample corresponds to the first preset coordinate
  • the second type of preset sample corresponds to the second preset coordinate
• The processor 60 can calculate the first loss value based on the preset loss function, the first training coordinates and the first preset coordinates, and then adjust the gaze point detection module 51 based on the first loss value so that the first training coordinates output by the gaze point detection module 51 gradually approach the first preset coordinates until convergence. Likewise, the processor 60 can calculate the second loss value based on the preset loss function, the second training coordinates and the second preset coordinates, and then simultaneously adjust the gaze point detection module 51 and the correction module 52 based on the second loss value so that the second training coordinates output by the detection model gradually approach the second preset coordinates until convergence.
• In the loss function, loss is the loss value, N is the number of training samples contained in each training sample set, X and Y are the training coordinates (such as the first training coordinates or the second training coordinates), and Gx and Gy are the preset coordinates (such as the first preset coordinates or the second preset coordinates). When the training coordinates are the gaze direction, X and Y represent the pitch angle and the yaw angle respectively; when the training coordinates are the gaze point coordinates, X and Y represent the gaze point coordinates in the plane of the display screen 40. The first loss value and the second loss value can thus be calculated quickly.
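The formula itself is not reproduced in the text; given the symbols defined above, one plausible reading is a mean Euclidean distance between the training and preset coordinates over a batch of N samples. The sketch below is that assumption, not the application's exact loss:

```python
import math

def batch_loss(training_coords, preset_coords):
    """Assumed loss: mean Euclidean distance between predicted (X, Y)
    and target (Gx, Gy) pairs over a batch of N samples.  The averaging
    and the distance form are assumptions; the source only names the
    symbols loss, N, X, Y, Gx and Gy."""
    n = len(training_coords)
    return sum(
        math.hypot(x - gx, y - gy)
        for (x, y), (gx, gy) in zip(training_coords, preset_coords)
    ) / n

print(batch_loss([(0, 0), (3, 4)], [(0, 0), (0, 0)]))  # mean of 0 and 5
```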
  • the processor 60 can adjust the detection model according to the first loss value and the second loss value, so that the gradient of the detection model decreases continuously, so that the training coordinates are getting closer to the preset coordinates, and finally the detection model is trained to convergence.
• When, over N consecutive training batches, the first difference between the first loss values corresponding to any two first-type samples and the second difference between the second loss values corresponding to any two second-type samples are both less than a predetermined difference threshold, the detection model is determined to have converged, where N is a positive integer greater than 1. That is, if during N consecutive batches the first loss value and the second loss value remain essentially unchanged, they have reached their limit and the detection model can be determined to have converged. Alternatively, the detection model is determined to have converged when both the first loss value and the second loss value are less than a predetermined loss threshold.
  • the detection model is trained to converge through the first type of training samples and the second type of training samples, so as to ensure that the detection model can still output accurate gaze information according to the face information when the face is deflected.
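The convergence criteria just described (loss values essentially unchanged over N consecutive batches, or below a predetermined loss threshold) could be checked as follows; the function and parameter names are illustrative:

```python
def has_converged(loss_history, n, diff_threshold=None, loss_threshold=None):
    """loss_history: per-batch loss values, most recent last.
    Converged if the latest loss is below loss_threshold, or if the
    spread of the last n losses is below diff_threshold (losses no
    longer changing over n consecutive batches)."""
    if loss_threshold is not None and loss_history and loss_history[-1] < loss_threshold:
        return True
    if diff_threshold is not None and len(loss_history) >= n:
        recent = loss_history[-n:]
        return max(recent) - min(recent) < diff_threshold
    return False

print(has_converged([0.9, 0.31, 0.30, 0.30], n=3, diff_threshold=0.05))
```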
  • control method of the electronic device 100 in the embodiment of the present application includes the following steps:
  • 021 Determine the posture information of the face according to the face information, and determine the reference gaze point coordinates according to the face information;
  • the control device 20 in the embodiment of the present application includes an acquisition module 21 , a first determination module 22 , a second determination module 23 and a control module 24 .
  • Acquisition module 21 is used for determining the posture information of people's face according to people's face information, determines the coordinates of reference point of gaze according to people's face information;
  • the first determining module 22 is used for determining correction parameters according to posture information in response to posture information being greater than a preset threshold;
  • the second determination module 23 is used to determine the gaze information according to the coordinates of the reference gaze point and the correction parameters;
  • the control module 24 is used to control the electronic device 100 according to the gaze information. That is to say, step 021 can be performed by the acquisition module 21 , step 023 can be performed by the first determination module 22 , step 025 can be performed by the second determination module 23 and step 027 can be performed by the control module 24 .
  • the electronic device 100 in the embodiment of the present application includes a processor 60 and a collection device 30 .
• The acquisition device 30 is used to collect face information at a predetermined frame rate (the face information comprises face images, such as a visible light image, an infrared image and a depth image of the face).
• The acquisition device 30 can be one or more of a visible light camera, an infrared camera and a depth camera, wherein the visible light camera collects visible light face images, the infrared camera collects infrared face images, and the depth camera collects depth face images.
• When the acquisition device 30 includes a visible light camera, an infrared camera and a depth camera, the acquisition device 30 simultaneously collects a visible light face image, an infrared face image and a depth face image.
• The processor 60 may include an ISP, an NPU and an AP. When the control device 20 is arranged in the electronic device 100, the acquisition module 21 is arranged in the ISP and the NPU, and the processor 60 is connected with the acquisition device 30. After the acquisition device 30 collects the face image, the ISP can process the face image to determine the posture information of the face according to the face information, and the NPU can determine the reference gaze point coordinates according to the face information. The first determination module 22 and the second determination module 23 can be arranged in the NPU, and the control module 24 can be arranged in the AP.
• The processor 60 (specifically, the ISP and the NPU) is used to obtain the face information and the posture information; the processor 60 (specifically, the NPU) is also used to determine the correction parameters according to the posture information in response to the posture information being greater than the preset threshold, and to determine the gaze information according to the reference gaze point coordinates and the correction parameters; the processor 60 (specifically, the AP) can also be used to control the electronic device 100 according to the gaze information. That is to say, step 021 can be executed by the acquisition device 30 in cooperation with the processor 60, and step 023, step 025 and step 027 can be executed by the processor 60.
• For the manner of determining the gaze information, that is, Step 021, Step 023 and Step 025, please refer to the descriptions of Step 011, Step 013 and Step 015 respectively; details are not repeated here.
  • the electronic device 100 can be controlled according to the gaze direction and gaze point coordinates.
  • a three-dimensional coordinate system is established with the midpoint of the eyes as the origin O1, the X1 axis is parallel to the direction of the line connecting the centers of the eyes, the Y1 axis is located on the horizontal plane and perpendicular to the X1 axis, and the Z1 axis is perpendicular to the X1 axis and Y1 axis.
  • the three-axis rotation angle of the line of sight S and the three-dimensional coordinate system indicates the user's gaze direction.
  • the gaze direction includes pitch angle, roll angle and yaw angle respectively.
• The pitch angle represents the rotation angle around the X1 axis, the roll angle represents the rotation angle around the Y1 axis, and the yaw angle represents the rotation angle around the Z1 axis.
• The processor 60 can perform page turning or sliding of the display content of the electronic device 100 according to the gaze direction. For example, the change of the gaze direction can be determined from the gaze directions of multiple consecutive frames of human eye area images (such as 10 consecutive frames). Referring to FIG. 11 and FIG. 12, when the pitch angle gradually increases (that is, the line of sight S tilts upward), it can be determined that the user wants the displayed content to slide up or the page to turn down; conversely, when the pitch angle gradually decreases (that is, the line of sight S tilts downward), it can be determined that the user wants the displayed content to slide down or the page to turn up.
• Sliding or page turning of the electronic device 100 can also be realized according to the gaze point coordinates.
  • the center of the display screen 40 can be used as the coordinate origin O2 to establish a plane coordinate system
  • the width direction parallel to the electronic device 100 is used as the X2 axis
  • the length direction parallel to the electronic device 100 is used as the Y2 axis
  • the gaze point coordinates include the abscissa (corresponding to the position on the X2 axis) and the ordinate (corresponding to the position on the Y2 axis).
• If the ordinate gradually increases, the gaze point M moves up, and it can be determined that the user wants the displayed content to slide up or the page to turn down; if the ordinate gradually decreases, the gaze point M moves down, and it can be determined that the user wants the displayed content to slide down or the page to turn up.
• The processor 60 can also determine the change speed of the gaze direction over 10 consecutive frames (for example, from the difference between the pitch angles of the first frame and the tenth frame, or the difference between the ordinates of the gaze point M, together with the elapsed duration); the faster the change speed, the more new display content is shown after sliding.
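The gaze-direction page-turning logic above (pitch angle rising over consecutive frames means slide up or page down, falling means slide down or page up) might be sketched like this; the dead-zone threshold and action names are hypothetical values:

```python
def scroll_action(pitch_angles, min_change=5.0):
    """Decide a sliding/page action from the pitch angles (in degrees)
    of consecutive frames, e.g. 10 frames as in the text.  Increasing
    pitch (line of sight tilting upward) maps to slide up / page down;
    decreasing pitch maps to slide down / page up.  min_change is an
    illustrative dead zone to ignore small jitters."""
    change = pitch_angles[-1] - pitch_angles[0]
    if change > min_change:
        return "slide_up_or_page_down"
    if change < -min_change:
        return "slide_down_or_page_up"
    return "none"

print(scroll_action([0, 2, 5, 9, 12]))
```

The magnitude of `change` divided by the elapsed duration would give the change speed used to scale how much new content is shown.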
• After the user has not looked at the display screen 40 for a predetermined duration (such as 10 s, 20 s, etc.), the screen can be turned off.
• With the control method, the control device 20 and the electronic device 100, after the face information and the posture information are obtained, if the posture information is greater than the preset threshold, which would affect the calculation accuracy of the gaze point coordinates, the reference gaze point coordinates are first calculated according to the face information and the correction parameters are then calculated according to the posture information. The reference gaze point coordinates can then be corrected according to the correction parameters to obtain accurate gaze information, preventing an excessive shooting angle of the acquired face information from impairing gaze detection and thereby improving the accuracy of gaze detection.
  • the control accuracy of the electronic device 100 can be improved.
  • the face information includes a face mask, a left-eye image and a right-eye image, and the face mask is used to indicate the position of the face in the image
  • Step 021 Calculate the reference gaze point coordinates according to the face information, including:
  • 0212 Calculate the coordinates of the reference gaze point according to the position information, the left-eye image and the right-eye image.
  • the first determining module 22 is also used to calculate the position information of the face relative to the electronic device 100 according to the face mask; and calculate the reference gaze point coordinates according to the position information, the left-eye image and the right-eye image. That is to say, step 0211 and step 0212 can be executed by the first determination module 22 .
  • the processor 60 is further configured to calculate the position information of the face relative to the electronic device 100 according to the face mask; and calculate the reference gaze point coordinates according to the position information, the left-eye image and the right-eye image. That is to say, step 0211 and step 0212 can be executed by the processor 60 .
  • step 0231 and step 0232 please refer to step 0131 and step 0132 respectively, and details are not repeated here.
  • face information includes face feature points
  • attitude information includes attitude angle and three-dimensional coordinate offset
  • correction parameters include rotation matrix and translation matrix
  • step 021 Determine the pose information of the face based on the face information, including:
  • Step 023 includes:
  • 0231 Calculate the rotation matrix according to the attitude angle, and calculate the translation matrix according to the three-dimensional coordinate offset.
  • the first determination module 22 is also used to calculate the attitude angle and three-dimensional coordinate offset according to the face feature points; calculate the rotation matrix according to the attitude angle, and calculate the translation matrix according to the three-dimensional coordinate offset. That is to say, step 0233 and step 0234 can be executed by the first determination module 22 .
  • the processor 60 is further configured to calculate an attitude angle and a three-dimensional coordinate offset based on facial feature points; calculate a rotation matrix based on the attitude angle, and calculate a translation matrix based on the three-dimensional coordinate offset. That is to say, step 0233 and step 0234 can be executed by the processor 60 .
  • Step 0233 and Step 0234 please refer to Step 0133 and Step 0134 respectively, which will not be repeated here.
  • the gaze information includes gaze point coordinates
  • the control method also includes:
  • Step 027 Control the electronic device 100 according to the gaze information, including:
• The control module 24 is also used to acquire the captured image within the first predetermined duration before the screen turns off, and, in response to the captured image containing a human face and the gaze point coordinates being located in the display area of the display screen 40, keep the screen on for a second predetermined duration. That is to say, step 0201, step 0202 and step 0271 can be executed by the control module 24.
• The processor 60 is further configured to acquire the captured image within the first predetermined duration before the screen turns off, and, in response to the captured image containing a human face and the gaze point coordinates being located in the display area of the display screen 40, keep the screen on for a second predetermined duration. That is to say, step 0201, step 0202 and step 0271 may be executed by the processor 60.
• the gaze information can be used to realize screen-off control.
  • gaze detection is first performed.
• the processor 60 first acquires a captured image. If there is a human face in the captured image, the gaze information is determined according to the captured image.
  • the first predetermined time period such as 5 seconds, 10 seconds, etc.
• When the gaze point M is located within the display area of the display screen 40, it can be determined that the user is looking at the display screen 40, so the display screen 40 remains on for a second predetermined duration, which can be greater than the first predetermined duration. Within the first predetermined duration before the screen would turn off again, the captured image is acquired once more, so that the screen remains bright while the user looks at the display screen 40 and turns off once the user no longer looks at it.
  • the center of the display area can be used as the coordinate origin O2 to establish a two-dimensional coordinate system parallel to the display screen 40.
• After the coordinate system is established, the abscissa range and the ordinate range of the display area can be determined; when the gaze point coordinates are within the preset coordinate range (that is, the abscissa of the gaze point coordinates is within the abscissa range and the ordinate is within the ordinate range), the gaze point coordinates are located in the display area, so determining whether the user gazes at the display screen 40 is relatively simple.
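With the origin O2 at the center of the display area, the containment test described above reduces to comparing absolute coordinates against half the display dimensions; a minimal sketch with hypothetical names:

```python
def gaze_in_display(gaze_point, width, height):
    """True if gaze point (x, y), expressed in the screen-centred
    coordinate system described above, falls inside a display area of
    the given width and height (abscissa within the abscissa range,
    ordinate within the ordinate range)."""
    x, y = gaze_point
    return abs(x) <= width / 2 and abs(y) <= height / 2

print(gaze_in_display((10, -20), 80, 160))
print(gaze_in_display((50, 0), 80, 160))
```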
  • the gaze information includes gaze point coordinates
  • the control method also includes:
  • Step 027 includes:
• The control module 24 is also used to acquire the captured image when the electronic device 100 does not receive an input operation; in response to the captured image containing a human face and the gaze point coordinates being located in the display area, adjust the display brightness of the display screen 40 to a first predetermined brightness; and in response to the captured image not containing a human face, or containing a human face while the gaze point coordinates are outside the display area, adjust the display brightness to a second predetermined brightness, the second predetermined brightness being less than the first predetermined brightness. That is to say, step 0203, step 0204, step 0272 and step 0273 can be executed by the control module 24.
• The processor 60 is also configured to acquire the captured image when the electronic device 100 does not receive an input operation; in response to the captured image containing a human face and the gaze point coordinates being located in the display area, adjust the display brightness of the display screen 40 to a first predetermined brightness; and in response to the captured image not containing a human face, or containing a human face while the gaze point coordinates are outside the display area, adjust the display brightness to a second predetermined brightness, the second predetermined brightness being less than the first predetermined brightness. That is to say, step 0203, step 0204, step 0272 and step 0273 can be executed by the processor 60.
  • the gaze information can also be used to realize intelligent brightening of the screen.
• To save power, the electronic device 100 generally reduces the display brightness after the screen has been on for a certain period, keeps the screen lit at low brightness for a further period, and then turns the screen off.
  • the processor 60 can obtain the captured image. If the image contains a human face, the gaze information is calculated according to the captured image.
  • the display brightness is adjusted to the first predetermined brightness.
• The first predetermined brightness can be the brightness set by the user for normal display on the display screen 40, or it can change in real time according to the ambient light so as to adapt to it. This ensures that the screen stays bright while the user views the displayed content even without operating the electronic device 100, preventing the screen from suddenly turning off during viewing and affecting the user experience.
  • the display brightness can be adjusted to a second predetermined brightness, which is less than the first predetermined brightness, so as to avoid unnecessary power consumption.
  • the display brightness is adjusted back to the first predetermined brightness, so as to ensure the user's normal viewing experience. In this way, when the user is not operating the electronic device 100 but is looking at the display area, the display area is displayed at normal brightness; when the user is not looking at the display area, the brightness is reduced to save battery.
  • one or more non-transitory computer-readable storage media 300 containing a computer program 302: when the computer program 302 is executed by one or more processors 60, the processors 60 can execute the gaze detection method or the control method of the electronic device 100 in any one of the above embodiments.
  • processors 60 when the computer program 302 is executed by one or more processors 60, the processors 60 are made to perform the following steps:
  • 011 Determine the posture information of the face according to the face information, and determine the reference gaze point coordinates according to the face information;
  • 015 Determine the fixation information according to the reference fixation point coordinates and correction parameters.
  • processors 60 may also perform the following steps:
  • 021 Determine the posture information of the face according to the face information, and determine the reference gaze point coordinates according to the face information;
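The gaze-driven brightness policy described in the bullets above can be sketched as a small decision function. This is a minimal illustrative sketch, not the disclosed implementation; the function name and the two brightness values are assumptions, chosen only so that the second predetermined brightness is less than the first.

```python
# Hypothetical sketch of the brightness policy when no input operation
# has been received. The concrete brightness values are assumptions.

FIRST_PREDETERMINED = 0.8   # normal viewing brightness (fraction of max)
SECOND_PREDETERMINED = 0.3  # dimmed brightness, less than FIRST_PREDETERMINED

def choose_brightness(has_face: bool, gaze_in_display: bool) -> float:
    """Return the target display brightness for the current captured image."""
    if has_face and gaze_in_display:
        # Face detected and gaze point inside the display area:
        # keep the screen at normal brightness.
        return FIRST_PREDETERMINED
    # No face, or face present but gaze point outside the display area: dim.
    return SECOND_PREDETERMINED
```

When the user looks back at the display area, the same function returns the first predetermined brightness again, matching the "adjust back" behaviour in the text.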

Abstract

A gaze detection method, a control method for an electronic device (100), and a detection apparatus (10), a control apparatus (20), an electronic device (100) and a non-volatile computer-readable storage medium (300). The gaze detection method comprises: determining posture information of a human face according to facial information, and determining reference gaze point coordinates according to the facial information (011); in response to the posture information being greater than a preset threshold, determining a correction parameter according to the posture information (013); and determining gaze information according to the reference gaze point coordinates and the correction parameter (015).

Description

Gaze detection method, control method for electronic device, and related devices
Priority Information
This application claims priority to and the benefit of Chinese Patent Application No. 202111271397.4, filed with the China National Intellectual Property Administration on October 29, 2021, which is hereby incorporated by reference in its entirety.
Technical Field
The present application relates to the technical field of consumer electronics, and in particular to a gaze detection method, a control method for an electronic device, a detection device, a control device, an electronic device, and a non-volatile computer-readable storage medium.
Background
Currently, electronic devices can estimate a user's gaze point by collecting face images.
Summary of the Invention
The present application provides a gaze detection method, a control method for an electronic device, a detection device, a control device, an electronic device, and a non-volatile computer-readable storage medium.
A gaze detection method according to one embodiment of the present application includes: determining pose information of a human face according to face information, and determining reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determining a correction parameter according to the pose information; and determining gaze information according to the reference gaze point coordinates and the correction parameter.
A detection device according to one embodiment of the present application includes a first determination module, a second determination module, and a third determination module. The first determination module is configured to determine pose information of a human face according to face information and determine reference gaze point coordinates according to the face information; the second determination module is configured to determine a correction parameter according to the pose information in response to the pose information being greater than a preset threshold; and the third determination module is configured to determine gaze information according to the reference gaze point coordinates and the correction parameter.
An electronic device according to one embodiment of the present application includes a processor. The processor is configured to determine pose information of a human face according to face information and determine reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determine a correction parameter according to the pose information; and determine gaze information according to the reference gaze point coordinates and the correction parameter.
With the gaze detection method, detection device, and electronic device of the present application, after the face information is acquired, the face pose is first calculated from the face information. When the pose information is greater than the preset threshold and would therefore affect the accuracy of the gaze point calculation, the reference gaze point coordinates are first calculated from the face information, and the correction parameter is then calculated from the pose information, so that the reference gaze point coordinates can be corrected according to the correction parameter. This prevents an excessively large face shooting angle in the acquired face information from degrading gaze detection, and thus improves the accuracy of gaze detection.
A control method for an electronic device according to an embodiment of the present application includes: determining pose information of a human face according to face information, and determining reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determining a correction parameter according to the pose information; determining gaze information according to the reference gaze point coordinates and the correction parameter; and controlling the electronic device according to the gaze information.
A control device according to an embodiment of the present application includes an acquisition module, a first determination module, and a second determination module. The acquisition module is configured to determine pose information of a human face according to face information and determine reference gaze point coordinates according to the face information; the first determination module is configured to determine a correction parameter according to the pose information in response to the pose information being greater than a preset threshold; and the second determination module is configured to determine gaze information according to the reference gaze point coordinates and the correction parameter.
An electronic device according to another embodiment of the present application includes a processor. The processor is configured to determine pose information of a human face according to face information and determine reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determine a correction parameter according to the pose information; determine gaze information according to the reference gaze point coordinates and the correction parameter; and control the electronic device according to the gaze information.
An embodiment of the present application provides a non-volatile computer-readable storage medium containing a computer program. When the computer program is executed by one or more processors, the processors are caused to execute the gaze detection method or the control method. The gaze detection method includes: determining pose information of a human face according to face information, and determining reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determining a correction parameter according to the pose information; and determining gaze information according to the reference gaze point coordinates and the correction parameter. The control method for the electronic device includes: determining pose information of a human face according to face information, and determining reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determining a correction parameter according to the pose information; determining gaze information according to the reference gaze point coordinates and the correction parameter; and controlling the electronic device according to the gaze information.
Additional aspects and advantages of the present application will be set forth in part in the following description, and in part will become apparent from the following description or be learned through practice of the present application.
Brief Description of the Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application or in the prior art, the drawings required in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a schematic flowchart of a gaze detection method according to some embodiments of the present application;
FIG. 2 is a schematic block diagram of a detection device according to some embodiments of the present application;
FIG. 3 is a schematic plan view of an electronic device according to some embodiments of the present application;
FIG. 4 is a schematic diagram of the connection between an electronic device and a cloud server according to some embodiments of the present application;
FIG. 5 to FIG. 7 are schematic flowcharts of gaze detection methods according to some embodiments of the present application;
FIG. 8 is a schematic structural diagram of a detection model according to some embodiments of the present application;
FIG. 9 is a schematic flowchart of a control method for an electronic device according to some embodiments of the present application;
FIG. 10 is a schematic block diagram of a control device according to some embodiments of the present application;
FIG. 11 to FIG. 14 are schematic diagrams of scenarios of control methods according to some embodiments of the present application;
FIG. 15 and FIG. 16 are schematic flowcharts of control methods according to some embodiments of the present application;
FIG. 17 and FIG. 18 are schematic diagrams of scenarios of control methods according to some embodiments of the present application;
FIG. 19 is a schematic flowchart of a control method according to some embodiments of the present application; and
FIG. 20 is a schematic diagram of the connection between a processor and a computer-readable storage medium according to some embodiments of the present application.
Detailed Description
Embodiments of the present application are further described below with reference to the accompanying drawings. The same or similar reference numerals in the drawings denote the same or similar elements, or elements having the same or similar functions, throughout. In addition, the embodiments of the present application described below with reference to the accompanying drawings are exemplary, are intended only to explain the embodiments of the present application, and should not be construed as limiting the present application.
The gaze detection method of the present application includes: determining pose information of a human face according to face information, and determining reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determining a correction parameter according to the pose information; and determining gaze information according to the reference gaze point coordinates and the correction parameter.
In some embodiments, the gaze detection method further includes: in response to the pose information being less than the preset threshold, calculating the reference gaze point coordinates according to the face information as the gaze information.
In some embodiments, the pose information includes a pose angle, and the pose angle includes a pitch angle and a yaw angle. Judging, according to the face information, whether the pose information of the face is greater than the preset threshold includes: judging, according to the face information, whether the pitch angle or the yaw angle is greater than the preset threshold.
In some embodiments, the gaze detection method further includes: acquiring a training sample set, the training sample set including first-class samples in which the pose information of the face is less than the preset threshold and second-class samples in which the pose information of the face is greater than the preset threshold; and training a preset detection model according to the first-class samples and the second-class samples. Determining the correction parameter according to the pose information includes: determining the correction parameter according to the pose information based on the detection model.
In some embodiments, the detection model includes a gaze point detection module and a correction module, and training the detection model according to the first-class samples and the second-class samples includes: inputting the first-class samples into the gaze point detection module to output first training coordinates; inputting the second-class samples into the gaze point detection module and the correction module to output second training coordinates; based on a preset loss function, calculating a first loss value according to first preset coordinates corresponding to the first-class samples and the first training coordinates, and calculating a second loss value according to second preset coordinates corresponding to the second-class samples and the second training coordinates; and adjusting the detection model according to the first loss value and the second loss value until the detection model converges.
In some embodiments, the detection model is determined to have converged when, over N consecutive batches of training samples, a first difference between the first loss values corresponding to any two first-class samples and a second difference between the second loss values corresponding to any two second-class samples are both less than a predetermined difference threshold, where N is a positive integer greater than 1; or the detection model is determined to have converged when the first loss value and the second loss value are both less than a predetermined loss threshold.
In some embodiments, the face information includes a face mask, a left-eye image, and a right-eye image, the face mask being used to indicate the position of the face in the image. Calculating the reference gaze point coordinates according to the face information includes: calculating position information of the face relative to the electronic device according to the face mask; and calculating the reference gaze point coordinates according to the position information, the left-eye image, and the right-eye image.
In some embodiments, the face information includes face feature points, the pose information includes a pose angle and a three-dimensional coordinate offset, and the correction parameter includes a rotation matrix and a translation matrix. Determining the pose information of the face according to the face information includes: calculating the pose angle and the three-dimensional coordinate offset according to the face feature points. Calculating the correction parameter according to the pose information includes: calculating the rotation matrix according to the pose angle, and calculating the translation matrix according to the three-dimensional coordinate offset.
The control method for an electronic device of the present application includes: determining pose information of a human face according to face information, and determining reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determining a correction parameter according to the pose information; determining gaze information according to the reference gaze point coordinates and the correction parameter; and controlling the electronic device according to the gaze information.
In some embodiments, the control method further includes: in response to the pose information being less than the preset threshold, calculating the reference gaze point coordinates according to the face information as the gaze information.
In some embodiments, the face information includes a face mask, a left-eye image, and a right-eye image, the face mask being used to indicate the position of the face in the image. Calculating the reference gaze point coordinates according to the face information includes: calculating position information of the face relative to the electronic device according to the face mask; and calculating the reference gaze point coordinates according to the position information, the left-eye image, and the right-eye image.
In some embodiments, the face information includes face feature points, the pose information includes a pose angle and a three-dimensional coordinate offset, and the correction parameter includes a rotation matrix and a translation matrix. Calculating the correction parameter according to the pose information includes: calculating the pose angle and the three-dimensional coordinate offset according to the face feature points; and calculating the rotation matrix according to the pose angle and the translation matrix according to the three-dimensional coordinate offset.
In some embodiments, the gaze information includes gaze point coordinates. Before determining the pose information of the face according to the face information and determining the reference gaze point coordinates according to the face information, the control method further includes: acquiring a captured image within a first predetermined duration before the screen is turned off, in response to the captured image containing face information. Controlling the electronic device according to the gaze information includes: in response to the gaze point coordinates being located in a display area of the display screen, keeping the screen lit for a second predetermined duration.
In some embodiments, the display area is associated with a preset coordinate range, and the control method further includes: determining that the gaze point coordinates are located in the display area when the gaze point coordinates are within the preset coordinate range.
In some embodiments, before determining the pose information of the face according to the face information and determining the reference gaze point coordinates according to the face information, the control method further includes: acquiring a captured image in response to the electronic device not receiving an input operation. Controlling the electronic device according to the gaze information includes: in response to the captured image containing a human face and the gaze point coordinates being located in the display area, adjusting the display brightness of the display screen to a first predetermined brightness; and in response to the captured image not containing a human face, or the captured image containing a human face with the gaze point coordinates outside the display area, adjusting the display brightness to a second predetermined brightness, the second predetermined brightness being less than the first predetermined brightness.
The detection device of the present application includes a first determination module, a second determination module, and a third determination module. The first determination module is configured to determine pose information of a human face according to face information and determine reference gaze point coordinates according to the face information; the second determination module is configured to determine a correction parameter according to the pose information in response to the pose information being greater than a preset threshold; and the third determination module is configured to determine gaze information according to the reference gaze point coordinates and the correction parameter.
The control device of the present application includes an acquisition module, a first determination module, a second determination module, and a control module. The acquisition module is configured to determine pose information of a human face according to face information and determine reference gaze point coordinates according to the face information; the first determination module is configured to determine a correction parameter according to the pose information in response to the pose information being greater than a preset threshold; the second determination module is configured to determine gaze information according to the reference gaze point coordinates and the correction parameter; and the control module is configured to control the electronic device according to the gaze information.
The electronic device of the present application includes a processor. The processor is configured to determine pose information of a human face according to face information and determine reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determine a correction parameter according to the pose information; and determine gaze information according to the reference gaze point coordinates and the correction parameter.
The electronic device of the present application includes a processor. The processor is configured to determine pose information of a human face according to face information and determine reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determine a correction parameter according to the pose information; determine gaze information according to the reference gaze point coordinates and the correction parameter; and control the electronic device according to the gaze information.
The non-volatile computer-readable storage medium of the present application includes a computer program. When the computer program is executed by a processor, the processor is caused to execute the gaze detection method of any of the above embodiments, or the control method for an electronic device of any of the above embodiments.
Referring to FIG. 1 to FIG. 3, the gaze detection method of the embodiments of the present application includes the following steps:
011: Determine pose information of a human face according to face information, and determine reference gaze point coordinates according to the face information;
013: In response to the pose information being greater than a preset threshold, determine a correction parameter according to the pose information; and
015: Determine gaze information according to the reference gaze point coordinates and the correction parameter.
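The three steps above can be sketched as one high-level function. This is a structural sketch only: `estimate_pose`, `estimate_reference_point`, and `make_correction` are placeholder callables standing in for the pose estimation, gaze point detection, and correction-parameter modules described in the text.

```python
def detect_gaze(face_info, threshold, estimate_pose,
                estimate_reference_point, make_correction):
    """Steps 011/013/015: correct the reference gaze point only when the
    face pose exceeds the preset threshold."""
    pose = estimate_pose(face_info)               # step 011: pose information
    ref = estimate_reference_point(face_info)     # step 011: reference point
    if pose > threshold:                          # step 013
        correct = make_correction(pose)           # correction parameter
        ref = correct(ref)                        # step 015: corrected point
    return ref                                    # gaze information
```

When the pose is below the threshold, the reference gaze point coordinates are returned unchanged as the gaze information, matching the small-pose branch described elsewhere in the document.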
The detection device 10 of the embodiments of the present application includes a first determination module 11, a second determination module 12, and a third determination module 13. The first determination module 11 is configured to determine pose information of a human face according to face information and determine reference gaze point coordinates according to the face information; the second determination module 12 is configured to determine a correction parameter according to the pose information in response to the pose information being greater than a preset threshold; and the third determination module 13 is configured to determine gaze information according to the reference gaze point coordinates and the correction parameter. That is to say, step 011 can be implemented by the first determination module 11, step 013 can be executed by the second determination module 12, and step 015 can be executed by the third determination module 13.
The electronic device 100 of the embodiments of the present application includes a processor 60 and a collection device 30. The collection device 30 is configured to collect face information at a predetermined frame rate (the face information may include face images, such as a visible-light image, an infrared image, or a depth image of the face). The collection device 30 may be one or more of a visible-light camera, an infrared camera, and a depth camera, where the visible-light camera can collect visible-light face images, the infrared camera can collect infrared face images, and the depth camera can collect depth face images. In this embodiment, the collection device 30 includes a visible-light camera, an infrared camera, and a depth camera, and collects visible-light face images, infrared face images, and depth face images simultaneously. The processor 60 may include an image signal processor (ISP), a neural-network processing unit (NPU), and an application processor (AP). The detection device 10 is arranged in the electronic device 100, where the first determination module 11 may be arranged in the ISP and the NPU. The processor 60 is connected to the collection device 30; after the collection device 30 collects a face image, the ISP can process the face image to obtain the pose information of the face, and the NPU can determine the reference gaze point coordinates according to the face information. The second determination module 12 and the third determination module 13 may be arranged in the NPU. The processor 60 (specifically, the ISP and the NPU) is configured to determine the pose information of the face according to the face information; the processor 60 (specifically, the NPU) is further configured to, in response to the pose information being greater than the preset threshold, determine the correction parameter according to the pose information, and determine the gaze information according to the reference gaze point coordinates and the correction parameter. That is to say, step 011, step 013, and step 015 can be executed by the processor 60.
The electronic device 100 may be a mobile phone, a smart watch, a tablet computer, a display device, a notebook computer, an automated teller machine, a gate machine, a head-mounted display device, a game console, or the like. As shown in FIG. 3, the embodiments of the present application are described by taking the electronic device 100 being a mobile phone as an example; it can be understood that the specific form of the electronic device 100 is not limited to a mobile phone.
Specifically, when the user uses the electronic device 100, the collection device 30 may collect the user's face information once every predetermined time interval, so that gaze detection is performed on the user continuously while keeping the power consumption of the electronic device 100 low. Alternatively, face information may be collected at a predetermined frame rate (for example, 10 frames per second) only when the user is using an application that requires gaze detection (such as browser software, forum software, or video software), so that face information is collected only when gaze detection is actually needed, minimizing the power consumption of gaze detection.
Referring to FIG. 4, after the face information (taking a face image as an example) is obtained, the processor 60 may recognize the face image. For example, the processor 60 may compare the face image with a preset face template, so as to determine the face in the face image and the image regions where different parts of the face (such as the eyes and the nose) are located, thereby obtaining a face region image. The preset face template may be stored in the memory of the electronic device 100, and the processor 60 may perform face recognition in a Trusted Execution Environment (TEE) to protect the user's privacy. Alternatively, the preset face template may be stored in a cloud server 200, and the electronic device 100 then sends the face image to the cloud server 200 for comparison to determine the face region image; handing face recognition over to the cloud server 200 for processing can reduce the processing load of the electronic device 100 and improve image processing efficiency. The processor 60 may then recognize the face region image to determine the pose information of the face. More specifically, the face and different parts of the face can be recognized according to the shape features of the face and its parts, so as to obtain the face information, which may include images of different parts, such as a face region image and an eye region image.
After the face information is obtained, the pose information of the face can be calculated from it. The pose information can be calculated by extracting features from the face image and using the position coordinates of the extracted feature points. For example, the tip of the nose, the centers of the left and right eyes, and the left and right corners of the mouth may be used as feature points; as the pose of the face changes, the position coordinates of these feature points change accordingly. For example, a three-dimensional coordinate system may be established with the tip of the nose as the origin, and the pitch angle, horizontal rotation angle, and tilt angle of the face respectively represent the rotation angles of the face about the three coordinate axes of this coordinate system. Taking the horizontal rotation angle relative to the pose of squarely facing the display screen 40 of the electronic device 100 as an example, the larger the deflection angle (i.e., the horizontal rotation angle) of the face, the closer together the two feature points corresponding to the left and right eyes become. Therefore, the pose information of the face can be accurately calculated from the position coordinates of the feature points.
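As an illustrative sketch (not part of the original disclosure), the relationship described above between the apparent inter-eye distance and the horizontal rotation angle can be turned into a rough yaw estimate under a weak-perspective assumption; the function name, inputs, and calibration value are all assumptions for illustration:

```python
import math

def estimate_yaw_deg(left_eye, right_eye, frontal_eye_dist):
    """Rough yaw estimate from the apparent inter-eye distance.

    left_eye / right_eye: (x, y) image coordinates of the eye centers.
    frontal_eye_dist: inter-eye distance (same units) measured when the
    face squarely faces the screen, e.g. from a calibration frame.
    """
    observed = math.dist(left_eye, right_eye)
    # Under a weak-perspective model the projected eye distance shrinks
    # with cos(yaw); clamp the ratio to guard against measurement noise.
    ratio = max(-1.0, min(1.0, observed / frontal_eye_dist))
    return math.degrees(math.acos(ratio))

# A face turned away shows a smaller eye distance, hence a larger yaw.
print(round(estimate_yaw_deg((100, 200), (160, 200), 60.0), 1))  # 0.0
print(round(estimate_yaw_deg((100, 200), (130, 200), 60.0), 1))  # 60.0
```

In practice the pose angles would come from a full landmark-based pose solver rather than a single distance ratio; this sketch only illustrates why the left-eye/right-eye feature points carry yaw information.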
It can be understood that different poses of the face affect the user's gaze direction and gaze point coordinates. Therefore, after the reference gaze point coordinates are calculated from the face information, the correction parameters can be determined according to the pose information so as to correct the gaze point detection error caused by pose changes, making the gaze information obtained from the reference gaze point coordinates and the correction parameters more accurate.
When calculating the reference gaze point coordinates from the face information, the processor 60 may calculate them directly from the face region image, or perform feature point recognition on the face region image and calculate the reference gaze point coordinates from the feature points, which requires relatively little computation. Alternatively, the processor 60 may obtain both the face region image and the eye region image, perform feature point recognition on the face region image, and calculate the reference gaze point coordinates jointly from the feature points of the face region image and the eye region image, further improving the calculation accuracy of the reference gaze point coordinates while keeping the amount of computation small.
It can then be understood that the face information obtained when the user squarely faces the display screen 40 of the electronic device 100 is generally the most accurate. Therefore, the processor 60 may first determine whether the pose information is greater than a preset threshold. The pose information includes the pitch angle, roll angle, and yaw angle of the face; of course, since a change in the roll angle (rotation of the face in a plane parallel to the display screen 40) does not change the positions of the feature points on the face, it may be sufficient to determine only whether the pitch angle or the yaw angle is greater than 0 degrees.
For example, when the user's pose squarely faces the display screen 40 of the electronic device 100, the pitch angle, roll angle, and yaw angle of the face are all 0 degrees; with a preset threshold of 0 degrees, it can be determined that the reference gaze point coordinates need to be corrected when the pose information is greater than 0 degrees (for example, when the pitch angle or the yaw angle is greater than 0 degrees). Since the pitch angle, roll angle, and yaw angle are directional and may take negative values, which would affect the accuracy of the judgment, the absolute value of the pose information may be compared with the preset threshold when determining whether the pose information is greater than the preset threshold.
In this case the processor 60 calculates the correction parameters according to the pose information. For example, when the reference gaze point coordinates include only the two-dimensional coordinates of the line of sight on the display screen 40, the correction parameters include coordinate correction coefficients, and the gaze information can be obtained from the reference gaze point coordinates and the coordinate correction coefficients: if the reference gaze point coordinates are (x, y) and the coordinate correction coefficients are a and b, the gaze information is (ax, by). Alternatively, when the reference gaze point coordinates include both the two-dimensional coordinates of the line of sight on the display screen 40 and the direction of the line of sight, the correction parameters include coordinate correction coefficients and direction correction coefficients: if the reference gaze point coordinates are (x, y), the direction of the line of sight is (α, β, γ), the coordinate correction coefficients are a and b, and the direction correction coefficients are c, d, and e, the gaze information is (ax, by, cα, dβ, eγ).
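The per-axis correction described above can be sketched directly; this is an illustrative helper (the function name and argument layout are assumptions, not part of the disclosure):

```python
def apply_correction(ref_gaze, coord_coeffs, dir_coeffs=None):
    """Apply per-axis correction coefficients to reference gaze values.

    ref_gaze: (x, y) or (x, y, alpha, beta, gamma) reference gaze values.
    coord_coeffs: (a, b) coordinate correction coefficients.
    dir_coeffs: optional (c, d, e) direction correction coefficients.
    Returns (ax, by) or (ax, by, c*alpha, d*beta, e*gamma).
    """
    a, b = coord_coeffs
    corrected = [a * ref_gaze[0], b * ref_gaze[1]]
    if dir_coeffs is not None:
        corrected += [k * v for k, v in zip(dir_coeffs, ref_gaze[2:])]
    return tuple(corrected)

print(apply_correction((100, 50), (2.0, 0.5)))  # (200.0, 25.0)
```

A 2D-only reference gaze point is corrected with two coefficients; when the line-of-sight direction is also present, three more coefficients scale the direction components.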
When the pose information is less than or equal to the preset threshold, the user is facing the display screen 40 or is deflected from it by only a small angle, so it can be determined that the reference gaze point coordinates do not need to be corrected. After calculating the reference gaze point coordinates, the processor 60 may directly take them as the final gaze information, saving the computation of the correction parameters.
Of course, to further reduce the amount of computation, the preset threshold can be set larger. For example, with a preset threshold of 5 degrees, the deflection of the face is so small that the detection accuracy of the gaze information is essentially unaffected, and the correction parameters need not be calculated. Alternatively, the preset threshold can be set according to the requirements on the gaze information: if the gaze information includes only the gaze direction and accurate gaze point coordinates are not needed, the preset threshold can be set larger, whereas if the gaze information includes the coordinates of the gaze point on the display screen 40, the preset threshold can be set smaller to ensure the detection accuracy of the gaze point.
After the gaze information is obtained, the electronic device 100 can be controlled according to it (the gaze direction and/or the gaze point coordinates). For example, the screen is kept on while the detected gaze point coordinates lie within the display area of the display screen 40, and is turned off after the gaze point coordinates have been detected outside the display area for a predetermined duration (such as 10 s or 20 s). Alternatively, operations such as page turning may be performed according to changes in the gaze direction.
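The keep-screen-on behavior described above can be sketched as a small state machine; the class, its members, and the injected clock are illustrative assumptions rather than the patent's implementation:

```python
import time

OFF_DELAY_S = 10.0  # predetermined duration; the text mentions 10 s or 20 s

class ScreenController:
    """Keep the screen on while the gaze point stays inside the display area."""

    def __init__(self, width, height, now=time.monotonic):
        self.width, self.height = width, height
        self.now = now                 # injectable clock, eases testing
        self.last_inside = now()       # last time the gaze was on-screen
        self.screen_on = True

    def on_gaze(self, x, y):
        """Feed one gaze point; returns whether the screen should be on."""
        inside = 0 <= x < self.width and 0 <= y < self.height
        if inside:
            self.last_inside = self.now()
            self.screen_on = True
        elif self.now() - self.last_inside >= OFF_DELAY_S:
            self.screen_on = False
        return self.screen_on
```

Each new gaze sample inside the display area resets the off-timer; only a sustained off-screen gaze turns the screen off.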
In the face images collected by the electronic device 100, the obtained face information may be insufficiently accurate due to factors such as the shooting angle, which affects the detection accuracy of the gaze point.
With the gaze detection method, the detection device 10, and the electronic device 100 of the present application, after the face information is obtained, the face pose is first calculated from it. When the pose information is greater than the preset threshold and would affect the calculation accuracy of the gaze point coordinates, the reference gaze point coordinates are first calculated from the face information, and the correction parameters are then calculated from the pose information, so that the reference gaze point coordinates can be corrected with the correction parameters. This prevents an excessively large shooting angle of the face in the obtained face information from impairing gaze detection, and can improve the accuracy of gaze detection.
Referring to FIG. 2, FIG. 3, and FIG. 5, in some embodiments, the face information includes a face mask, a left-eye image, and a right-eye image, the face mask being used to indicate the position of the face in the image. Step 011: calculating the reference gaze point coordinates according to the face information includes:
0111: Calculate the position information of the face relative to the electronic device 100 according to the face mask; and
0112: Calculate the reference gaze point coordinates according to the position information, the left-eye image, and the right-eye image.
In some embodiments, the first determination module 11 is configured to calculate the position information of the face relative to the electronic device 100 according to the face mask, and to calculate the reference gaze point coordinates according to the position information, the left-eye image, and the right-eye image. That is to say, step 0111 and step 0112 may be executed by the first determination module 11.
In some embodiments, the processor 60 is further configured to calculate the position information of the face relative to the electronic device 100 according to the face mask, and to calculate the reference gaze point coordinates according to the position information, the left-eye image, and the right-eye image. That is to say, step 0111 and step 0112 may be executed by the processor 60.
Specifically, before the gaze information is calculated, the processor 60 may first determine the face mask of the face image. The face mask is used to represent the position of the face in the face image and can be determined by recognizing the position of the face in the face image. The processor 60 can calculate the position information of the face relative to the electronic device 100 from the face mask (for example, the distance between the face and the electronic device 100 can be calculated from the proportion of the face image occupied by the face mask). It can be understood that when the distance between the face and the electronic device 100 changes, the gaze point coordinates of the human eyes still change even if the gaze direction does not. Therefore, when calculating the gaze information, in addition to the face image and/or the eye images (such as the left-eye image and the right-eye image), the position information can also be taken into account, so that the gaze point coordinates are calculated more accurately.
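The distance-from-mask-proportion idea can be sketched under a pinhole-camera assumption; the calibration inputs and scaling law here are illustrative assumptions, not the patent's stated method:

```python
import math

def estimate_distance_cm(mask_area_px, image_area_px, ref_ratio, ref_distance_cm):
    """Estimate the face-to-device distance from the face-mask proportion.

    mask_area_px / image_area_px: face-mask and full-image areas in pixels.
    ref_ratio / ref_distance_cm: the mask-to-image ratio observed at a known
    calibration distance. Under a pinhole-camera model the apparent face
    area scales with 1/distance^2, so distance scales with
    sqrt(ref_ratio / ratio).
    """
    ratio = mask_area_px / image_area_px
    return ref_distance_cm * math.sqrt(ref_ratio / ratio)
```

A face whose mask occupies a quarter of its calibrated proportion of the image is estimated to be about twice as far away.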
Referring again to FIG. 2, FIG. 3, and FIG. 5, in some embodiments, the face information includes face feature points, the pose information includes a pose angle and a three-dimensional coordinate offset, and the correction parameters include a rotation matrix and a translation matrix. Step 011: determining the pose information of the face according to the face information further includes:
0113: Calculate the pose angle and the three-dimensional coordinate offset according to the face feature points.
Step 013: calculating the correction parameters according to the pose information includes:
0131: Calculate the rotation matrix according to the pose angle, and calculate the translation matrix according to the three-dimensional coordinate offset.
In some embodiments, the first determination module 11 is further configured to calculate the pose angle and the three-dimensional coordinate offset according to the face feature points, and the second determination module 12 is further configured to calculate the rotation matrix according to the pose angle and the translation matrix according to the three-dimensional coordinate offset. That is to say, step 0113 may be executed by the first determination module 11, and step 0131 may be executed by the second determination module 12.
In some embodiments, the processor 60 is further configured to calculate the pose angle and the three-dimensional coordinate offset according to the face feature points, calculate the rotation matrix according to the pose angle, and calculate the translation matrix according to the three-dimensional coordinate offset. That is to say, step 0113 and step 0131 may be executed by the processor 60.
Specifically, the correction parameters may include a rotation matrix and a translation matrix, which respectively represent the pose change and the position change of the face. When calculating the correction parameters, the pose angle and the three-dimensional coordinate offset may first be calculated from the face feature points, where the pose angle represents the pose of the face (such as the pitch angle, roll angle, and yaw angle) and the three-dimensional coordinate offset represents the position of the face. The rotation matrix is then calculated from the pose angle, and the translation matrix from the three-dimensional coordinate offset, thereby determining the correction parameters of the reference gaze point coordinates. The gaze information can then be accurately calculated from the reference gaze point coordinates, the rotation matrix, and the translation matrix.
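Building the rotation matrix from the three pose angles and applying it together with a translation can be sketched as follows; the Z·Y·X axis convention is an assumption for illustration, since the patent does not fix one:

```python
import math

def rotation_matrix(pitch, yaw, roll):
    """Build a 3x3 rotation matrix from pose angles (radians), Z*Y*X order.

    The angle-to-axis convention (roll about Z, yaw about Y, pitch about X,
    applied in that order) is an illustrative assumption.
    """
    cx, sx = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    cz, sz = math.cos(roll), math.sin(roll)
    rx = [[1, 0, 0], [0, cx, -sx], [0, sx, cx]]
    ry = [[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]]
    rz = [[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]]

    def matmul(a, b):
        return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
                for i in range(3)]

    return matmul(matmul(rz, ry), rx)

def correct_point(point, rotation, translation):
    """Correct a 3D gaze-related point as R @ p + t."""
    return [sum(rotation[i][k] * point[k] for k in range(3)) + translation[i]
            for i in range(3)]
```

With all pose angles at zero, the rotation matrix is the identity and only the translation (the three-dimensional coordinate offset) moves the point.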
Referring to FIG. 2, FIG. 3, and FIG. 6, in some embodiments, the gaze detection method further includes:
0101: Obtain a training sample set, the training sample set including first-type samples in which the pose information of the face is less than the preset threshold and second-type samples in which the pose information of the face is greater than the preset threshold; and
0102: Train a preset detection model according to the first-type samples and the second-type samples.
Step 013 includes:
0132: Based on the detection model, determine the correction parameters according to the pose information.
In some embodiments, the detection device 10 further includes an acquisition module 14 and a training module 15, both of which may be disposed in the NPU to train the detection model. The acquisition module 14 is configured to obtain the training sample set; the training module 15 is configured to train the preset detection model according to the first-type samples and the second-type samples; and the second determination module 12 is further configured to determine the correction parameters according to the pose information based on the detection model. That is to say, step 0101 may be executed by the acquisition module 14, step 0102 may be executed by the training module 15, and step 0132 may be executed by the second determination module 12.
In some embodiments, the processor 60 is further configured to obtain the training sample set, train the preset detection model according to the first-type samples and the second-type samples, and determine the correction parameters according to the pose information based on the detection model. That is to say, step 0101, step 0102, and step 0132 may be executed by the processor 60.
Specifically, the present application may calculate the gaze information through a preset detection model. To ensure the accuracy of the gaze information, the detection model needs to be trained first so that it converges.
During training, in order to enable the detection model to still calculate the gaze information accurately when the face is deflected relative to the display screen 40, a plurality of first-type samples in which the pose information of the face is less than the preset threshold and a plurality of second-type samples in which the pose information of the face is greater than the preset threshold may be selected in advance as the training sample set, where the first-type samples are face images whose pose information is less than the preset threshold, and the second-type samples are face images whose pose information is greater than the preset threshold. In this way, by training the detection model to convergence with first-type samples below the preset threshold and second-type samples above it, the influence of the deflection of the face relative to the display screen 40 on gaze information detection can be minimized, ensuring the accuracy of gaze detection.
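The two-class split of the training set can be sketched as follows; the sample representation (a dict with a `pose` tuple in degrees) and the threshold value are illustrative assumptions:

```python
PRESET_THRESHOLD_DEG = 5.0  # illustrative value; the text mentions 5 degrees

def split_samples(samples):
    """Split samples into the two classes described above.

    Each sample is assumed (for illustration) to be a dict with a 'pose'
    entry holding (pitch, roll, yaw) in degrees; the pose magnitude is
    taken as the largest absolute angle, since the angles are directional.
    """
    first_type, second_type = [], []
    for s in samples:
        magnitude = max(abs(a) for a in s["pose"])
        (first_type if magnitude < PRESET_THRESHOLD_DEG else second_type).append(s)
    return first_type, second_type
```

Taking the absolute value before comparing mirrors the earlier note that the signed pose angles may be negative.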
Referring to FIG. 2, FIG. 3, and FIG. 7, in some embodiments, step 0102 includes:
01021: Input the first-type samples into a gaze point detection module to output first training coordinates;
01022: Input the second-type samples into the gaze point detection module and a correction module to output second training coordinates;
01023: Based on a preset loss function, calculate a first loss value according to first preset coordinates corresponding to the first-type samples and the first training coordinates, and calculate a second loss value according to second preset coordinates corresponding to the second-type samples and the second training coordinates; and
01024: Adjust the detection model according to the first loss value and the second loss value until the detection model converges.
In some embodiments, the training module 15 is further configured to: input the first-type samples into the gaze point detection module to output the first training coordinates; input the second-type samples into the gaze point detection module and the correction module to output the second training coordinates; based on the preset loss function, calculate the first loss value according to the first preset coordinates corresponding to the first-type samples and the first training coordinates, and calculate the second loss value according to the second preset coordinates corresponding to the second-type samples and the second training coordinates; and adjust the detection model according to the first loss value and the second loss value until the detection model converges. That is to say, step 01021 to step 01024 may be executed by the training module 15.
In some embodiments, the processor 60 is further configured to: input the first-type samples into the gaze point detection module to output the first training coordinates; input the second-type samples into the gaze point detection module and the correction module to output the second training coordinates; based on the preset loss function, calculate the first loss value according to the first preset coordinates corresponding to the first-type samples and the first training coordinates, and calculate the second loss value according to the second preset coordinates corresponding to the second-type samples and the second training coordinates; and adjust the detection model according to the first loss value and the second loss value until the detection model converges. That is to say, step 01021 to step 01024 may be executed by the processor 60.
Specifically, referring to FIG. 8, the detection model 50 includes a gaze point detection module 51 and a correction module 52. During training, the training sample set is input into the detection model: the first-type samples are input into the gaze point detection module 51 to output the first training coordinates, which are output directly because the pose information of the first-type samples is less than the preset threshold; the second-type samples are input into both the gaze point detection module 51 and the correction module 52, where the gaze point detection module 51 outputs reference training coordinates, and the correction module 52 then outputs correction parameters and corrects the reference training coordinates according to them to output the second training coordinates.
It can be understood that each training sample has corresponding preset coordinates, which represent the actual gaze information of the training sample; the first-type samples correspond to the first preset coordinates and the second-type samples correspond to the second preset coordinates. Therefore, the processor 60 can calculate the first loss value based on the preset loss function, the first training coordinates, and the first preset coordinates, and then adjust the gaze point detection module 51 based on the first loss value so that the first training coordinates output by the gaze point detection module 51 gradually approach the first preset coordinates until convergence. Likewise, the processor 60 can calculate the second loss value based on the preset loss function, the second training coordinates, and the second preset coordinates, and then adjust both the gaze point detection module 51 and the correction module 52 based on the second loss value so that the second training coordinates output by the detection model gradually approach the second preset coordinates until convergence.
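The two-branch forward pass of FIG. 8 can be sketched with stand-in callables for modules 51 and 52; the sample format, the callables, and the per-axis correction form are illustrative assumptions:

```python
def forward(sample, gaze_module, correction_module, threshold):
    """Route a training sample through the two branches described above.

    gaze_module(sample) -> reference coordinates; correction_module(sample)
    -> per-axis correction coefficients. Both callables are stand-ins for
    the learned gaze point detection module 51 and correction module 52.
    """
    pose_magnitude = max(abs(a) for a in sample["pose"])
    coords = gaze_module(sample)
    if pose_magnitude <= threshold:
        return coords  # first-type sample: output the coordinates directly
    # Second-type sample: correct the reference coordinates.
    corrections = correction_module(sample)
    return tuple(k * v for k, v in zip(corrections, coords))
```

First-type samples bypass the correction module entirely, so its parameters receive gradient only from second-type samples, matching the training description above.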
For example, the loss function is as follows:

[Formula image: PCTCN2022126148-appb-000001]

where loss is the loss value, N is the number of training samples contained in each training sample set, X and Y are training coordinates (such as the first training coordinates or the second training coordinates), and Gx and Gy are preset coordinates (such as the first preset coordinates and the second preset coordinates). When the training coordinates represent the gaze direction, X and Y respectively denote the pitch angle and the yaw angle; when the training coordinates are gaze point coordinates, X and Y respectively denote the coordinates of the gaze point in the plane of the display screen 40. The first loss value and the second loss value can thus be calculated quickly.
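The formula itself is published only as an image, so the exact form is not recoverable here; one common instantiation consistent with the surrounding description is a mean absolute error over the two coordinates, sketched below purely as an assumption:

```python
def gaze_loss(train_coords, preset_coords):
    """Mean absolute error between training and preset gaze coordinates.

    train_coords / preset_coords: lists of (X, Y) / (Gx, Gy) pairs for the
    N samples of a batch. This L1 form is an assumption; the patent gives
    the loss function only as an image.
    """
    n = len(train_coords)
    return sum(abs(x - gx) + abs(y - gy)
               for (x, y), (gx, gy) in zip(train_coords, preset_coords)) / n

print(gaze_loss([(1.0, 2.0), (3.0, 4.0)], [(1.0, 2.0), (3.0, 4.0)]))  # 0.0
```

The same function serves for both loss values: the first loss value uses first training/preset coordinate pairs, the second uses second pairs.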
The processor 60 can then adjust the detection model according to the first loss value and the second loss value, so that the gradient of the detection model keeps decreasing and the training coordinates come ever closer to the preset coordinates, finally training the detection model to convergence. For example, the detection model is determined to have converged when, over N consecutive batches of training samples, the first difference between the first loss values corresponding to any two first-type samples and the second difference between the second loss values corresponding to any two second-type samples are both less than a predetermined difference threshold, N being a positive integer greater than 1; that is to say, if the first loss value essentially stops changing over N consecutive batches of training, the first loss value and the second loss value have reached their limits, and it can be determined that the detection model has converged. Alternatively, the detection model is determined to have converged when both the first loss value and the second loss value are less than a predetermined loss threshold; in that case the training coordinates are already very close to the preset coordinates, and it can be determined that the detection model has converged.
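The first convergence criterion above (loss values stable over N consecutive batches) can be sketched as a small check applied separately to each loss's history; the function shape is an illustrative assumption:

```python
def has_converged(loss_history, n_batches, diff_threshold):
    """Convergence test: over the last n_batches recorded loss values,
    every pairwise difference must be below diff_threshold.

    max - min over the window bounds every pairwise difference, so one
    comparison suffices. Run once on the first-loss history and once on
    the second-loss history; the model converges when both pass.
    """
    if len(loss_history) < n_batches:
        return False
    recent = loss_history[-n_batches:]
    return max(recent) - min(recent) < diff_threshold
```

The alternative criterion (both loss values below a predetermined loss threshold) is a simple pair of comparisons and needs no history.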
In this way, the detection model is trained to convergence with the first-type and second-type training samples, which ensures that the detection model can still output accurate gaze information from the face information even when the face is deflected.
Referring to FIG. 3, FIG. 9 and FIG. 10, the control method of the electronic device 100 according to an embodiment of the present application includes the following steps:
021: determining the posture information of the face according to the face information, and determining the reference gaze point coordinates according to the face information;
023: in response to the posture information being greater than a preset threshold, determining correction parameters according to the posture information;
025: determining the gaze information according to the reference gaze point coordinates and the correction parameters; and
027: controlling the electronic device 100 according to the gaze information.
The control device 20 of the embodiment of the present application includes an acquisition module 21, a first determination module 22, a second determination module 23 and a control module 24. The acquisition module 21 is configured to determine the posture information of the face according to the face information and to determine the reference gaze point coordinates according to the face information; the first determination module 22 is configured to determine correction parameters according to the posture information in response to the posture information being greater than a preset threshold; the second determination module 23 is configured to determine the gaze information according to the reference gaze point coordinates and the correction parameters; and the control module 24 is configured to control the electronic device 100 according to the gaze information. That is to say, step 021 can be performed by the acquisition module 21, step 023 by the first determination module 22, step 025 by the second determination module 23, and step 027 by the control module 24.
The electronic device 100 of the embodiment of the present application includes a processor 60 and a collection device 30. The collection device 30 is configured to collect face information at a predetermined frame rate (the face information includes face images, such as visible light images, infrared images and depth images of the face). The collection device 30 may be one or more of a visible light camera, an infrared camera and a depth camera, where the visible light camera collects visible light face images, the infrared camera collects infrared face images, and the depth camera collects depth face images. In this embodiment, the collection device 30 includes a visible light camera, an infrared camera and a depth camera, and collects visible light face images, infrared face images and depth face images simultaneously.
The processor 60 may include an ISP, an NPU and an AP. For example, the control device 20 is arranged in the electronic device 100, the acquisition module 21 is arranged in the ISP and the NPU, and the processor 60 is connected to the collection device 30. After the collection device 30 collects a face image, the ISP can process the face image to determine the posture information of the face according to the face information, and the NPU can determine the reference gaze point coordinates according to the face information; the first determination module 22 and the second determination module 23 can be arranged in the NPU, and the control module 24 can be arranged in the AP. The processor 60 (specifically, the ISP and the NPU) is configured to obtain the face information and the posture information; the processor 60 (specifically, the NPU) is further configured to determine correction parameters according to the posture information in response to the posture information being greater than the preset threshold, and to determine the gaze information according to the reference gaze point coordinates and the correction parameters; and the processor 60 (specifically, the AP) can further be configured to control the electronic device 100 according to the gaze information. That is to say, step 021 can be performed by the collection device 30 in cooperation with the processor 60, and steps 023, 025 and 027 can be performed by the processor 60.
Specifically, for the manner of determining the gaze information, that is, steps 021, 023 and 025, please refer to the descriptions of steps 011, 013 and 015, respectively; details are not repeated here.
After the gaze information (such as the gaze direction and the gaze point coordinates) is obtained, the electronic device 100 can be controlled according to the gaze direction and the gaze point coordinates. Referring to FIG. 11, for example, a three-dimensional coordinate system is established with the midpoint between the two eyes as the origin O1: the X1 axis is parallel to the line connecting the centres of the two eyes, the Y1 axis lies in the horizontal plane and is perpendicular to the X1 axis, and the Z1 axis is perpendicular to both the X1 axis and the Y1 axis. The user's gaze direction is expressed by the rotation angles of the line of sight S about the three axes of this coordinate system: the gaze direction includes a pitch angle, a roll angle and a yaw angle, where the pitch angle is the rotation angle about the X1 axis, the roll angle is the rotation angle about the Y1 axis, and the yaw angle is the rotation angle about the Z1 axis. The processor 60 can perform page-turning or sliding operations on the display content of the electronic device 100 according to the gaze direction. For example, by determining the gaze direction over multiple consecutive frames of eye-region images (such as 10 consecutive frames), the change of the gaze direction can be determined. Referring to FIG. 11 and FIG. 12, when the pitch angle gradually increases (i.e., the line of sight S tilts upward), it can be determined that the user wants the displayed content to slide up or the page to turn down; referring to FIG. 11 and FIG. 13, when the pitch angle gradually decreases (i.e., the line of sight S tilts downward), it can be determined that the user wants the displayed content to slide down or the page to turn up. Similarly, by detecting the moving direction of the gaze point M, the electronic device 100 can also be slid or page-turned. Referring to FIG. 14, a plane coordinate system can be established with the centre of the display screen 40 as the coordinate origin O2, the direction parallel to the width of the electronic device 100 as the X2 axis, and the direction parallel to the length of the electronic device 100 as the Y2 axis. The gaze point coordinates include an abscissa (the position on the X2 axis) and an ordinate (the position on the Y2 axis). If the ordinate gradually increases, the gaze point M moves up, and it can be determined that the user wants the displayed content to slide up or the page to turn down; conversely, if the ordinate gradually decreases, the gaze point M moves down, and it can be determined that the user wants the displayed content to slide down or the page to turn up.
In other embodiments, the processor 60 can also determine the change speed of the gaze direction (for example, from the difference between the pitch angles of the 1st and 10th frames, or the difference between the ordinates of the gaze point M, together with the duration taken to acquire the 10 consecutive frames): the faster the change, the more new display content is revealed after sliding.
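The direction-and-speed logic described above can be sketched as follows. The first-frame/last-frame convention follows the text, while the pixel gain, the return convention and the function name are illustrative assumptions:

```python
def scroll_from_pitch(pitch_angles, frame_interval_s, gain=100.0):
    """Sketch of gaze-driven scrolling as described above.

    pitch_angles: pitch angle (degrees) of the gaze direction over
    consecutive frames (e.g. 10 frames).  A rising pitch (line of sight
    tilting up) is read as "content slides up / page down", a falling
    pitch as the opposite; the change per unit time scales the scroll.
    gain converts degrees-per-second into a pixel offset (assumed).
    """
    delta = pitch_angles[-1] - pitch_angles[0]            # first vs last frame
    duration = frame_interval_s * (len(pitch_angles) - 1) # acquisition time
    speed = delta / duration                              # degrees per second
    direction = "slide_up" if delta > 0 else "slide_down"
    return direction, abs(speed) * gain                   # scroll offset
```

The same structure applies to gaze-point-based scrolling, with the ordinate of the gaze point M in place of the pitch angle.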
In another example, when the gaze point coordinates are detected to lie within the display area of the display screen 40, the user is still viewing the display screen 40, so the screen is kept lit. When the gaze point coordinates are detected to lie outside the display area, the user is not viewing the display screen 40; however, to avoid misjudging a user who only glances outside the display area occasionally, the screen is turned off only after the user has not viewed the display screen 40 for a predetermined duration (such as 10 s or 20 s).
With the control method of the electronic device 100, the control device 20 and the electronic device 100 of the present application, after the face information and the posture information are obtained, when the posture information is greater than the preset threshold and would affect the accuracy of the gaze point coordinate calculation, the reference gaze point coordinates are first calculated according to the face information, the correction parameters are then calculated according to the posture information, and the reference gaze point coordinates are corrected according to the correction parameters to obtain accurate gaze information. This prevents an excessively large face shooting angle in the acquired face information from degrading gaze detection, and improves the accuracy of gaze detection. Moreover, when the electronic device 100 is controlled according to accurate gaze information, the control accuracy of the electronic device 100 is improved.
Referring to FIG. 3, FIG. 10 and FIG. 15, in some embodiments, the face information includes a face mask, a left-eye image and a right-eye image, the face mask being used to indicate the position of the face in the image, and step 021 of calculating the reference gaze point coordinates according to the face information includes:
0211: calculating the position information of the face relative to the electronic device 100 according to the face mask; and
0212: calculating the reference gaze point coordinates according to the position information, the left-eye image and the right-eye image.
In some embodiments, the first determination module 22 is further configured to calculate the position information of the face relative to the electronic device 100 according to the face mask, and to calculate the reference gaze point coordinates according to the position information, the left-eye image and the right-eye image. That is to say, steps 0211 and 0212 can be performed by the first determination module 22.
In some embodiments, the processor 60 is further configured to calculate the position information of the face relative to the electronic device 100 according to the face mask, and to calculate the reference gaze point coordinates according to the position information, the left-eye image and the right-eye image. That is to say, steps 0211 and 0212 can be performed by the processor 60.
Specifically, for the detailed descriptions of steps 0211 and 0212, please refer to steps 0131 and 0132, respectively; details are not repeated here.
Referring to FIG. 3, FIG. 10 and FIG. 15, in some embodiments, the face information includes face feature points, the posture information includes a posture angle and a three-dimensional coordinate offset, and the correction parameters include a rotation matrix and a translation matrix. Step 021 of determining the posture information of the face according to the face information includes:
0213: calculating the posture angle and the three-dimensional coordinate offset according to the face feature points;
and step 023 includes:
0231: calculating the rotation matrix according to the posture angle, and calculating the translation matrix according to the three-dimensional coordinate offset.
In some embodiments, the first determination module 22 is further configured to calculate the posture angle and the three-dimensional coordinate offset according to the face feature points, to calculate the rotation matrix according to the posture angle, and to calculate the translation matrix according to the three-dimensional coordinate offset. That is to say, steps 0213 and 0231 can be performed by the first determination module 22.
In some embodiments, the processor 60 is further configured to calculate the posture angle and the three-dimensional coordinate offset according to the face feature points, to calculate the rotation matrix according to the posture angle, and to calculate the translation matrix according to the three-dimensional coordinate offset. That is to say, steps 0213 and 0231 can be performed by the processor 60.
Specifically, for the detailed descriptions of steps 0213 and 0231, please refer to steps 0133 and 0134, respectively; details are not repeated here.
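As an illustration of the rotation- and translation-matrix construction, the matrices can be assembled from the attitude angles and the three-dimensional offset in the standard Euler-angle way. This sketch is not taken from the disclosure: the axis assignment follows the coordinate system of FIG. 11 (pitch about X1, roll about Y1, yaw about Z1), while the composition order Z·Y·X and all function names are assumptions.

```python
import math

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_matrix(pitch, yaw, roll=0.0):
    # Pitch about X1, roll about Y1, yaw about Z1 (FIG. 11); the Z*Y*X
    # composition order is an assumption, the patent does not fix it.
    return matmul(rot_z(yaw), matmul(rot_y(roll), rot_x(pitch)))

def translation_matrix(dx, dy, dz):
    # 4x4 homogeneous translation built from the 3D coordinate offset.
    return [[1, 0, 0, dx], [0, 1, 0, dy], [0, 0, 1, dz], [0, 0, 0, 1]]
```

Angles are in radians; a library such as SciPy's Rotation class would normally replace the hand-rolled matrices in practice.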
Referring to FIG. 3, FIG. 10 and FIG. 16, in some embodiments, the gaze information includes gaze point coordinates, and before step 021 the control method further includes:
0201: acquiring a captured image within a first predetermined duration before the screen turns off; and
0202: responding to the captured image containing a human face;
and step 027 of controlling the electronic device 100 according to the gaze information includes:
0271: in response to the gaze point coordinates being located in the display area of the display screen 40, keeping the screen lit for a second predetermined duration.
In some embodiments, the control module 24 is further configured to acquire the captured image within the first predetermined duration before the screen turns off, and, in response to the captured image containing a human face and the gaze point coordinates being located in the display area of the display screen 40, to keep the screen lit for the second predetermined duration. That is to say, steps 0201, 0202 and 0271 can be performed by the control module 24.
In some embodiments, the processor 60 is further configured to acquire the captured image within the first predetermined duration before the screen turns off, and, in response to the captured image containing a human face and the gaze point coordinates being located in the display area of the display screen 40, to keep the screen lit for the second predetermined duration. That is to say, steps 0201, 0202 and 0271 can be performed by the processor 60.
Specifically, the gaze information can be used for screen-off control. Before the screen turns off, gaze detection is performed first: the processor 60 first acquires a captured image, and if a human face is present in the captured image, the gaze information is determined from the captured image. Of course, to ensure there is enough time before the screen turns off to acquire the captured image and calculate the gaze information, the captured image needs to be acquired within the first predetermined duration (such as 5 seconds or 10 seconds) before the screen turns off.
Referring to FIG. 17 and FIG. 18, when the gaze point M is located within the display area of the display screen 40, it can be determined that the user is gazing at the display screen 40, so the display screen 40 is kept lit for a second predetermined duration, which may be greater than the first predetermined duration. Within the first predetermined duration before the screen would turn off again, a captured image is acquired again. In this way, the screen stays lit while the user gazes at the display screen 40, and turns off once the user no longer gazes at it.
Referring to FIG. 17 again, a two-dimensional coordinate system parallel to the display screen 40 can be established with the centre of the display area as the coordinate origin O2. The display area is associated with a preset coordinate range, i.e., the abscissa range and the ordinate range of the display area in this coordinate system serve as the preset coordinate range. When the gaze point coordinates fall within the preset coordinate range (i.e., the abscissa of the gaze point lies within the abscissa range and the ordinate lies within the ordinate range), it can be determined that the gaze point is located within the display area, which provides a simple way to judge whether the user is gazing at the display screen 40.
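The preset-coordinate-range check described above reduces to two interval tests. The sketch below is a minimal illustration; the function name and the use of the physical display extents in the centred O2 coordinate system are assumptions, not part of the disclosure:

```python
def gaze_in_display(gaze_x, gaze_y, width, height):
    """Check whether a gaze point lies inside the display area.

    Coordinate convention follows the text: origin O2 at the centre of
    the display area, X2 along the device width, Y2 along the device
    length.  width/height are the extents of the display area in the
    same units as the gaze point coordinates (an assumption).
    """
    return abs(gaze_x) <= width / 2 and abs(gaze_y) <= height / 2
```

A gaze point returning False here would start (or continue) the predetermined screen-off countdown described above.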
Moreover, since captured images are acquired and gaze information is calculated only within the first predetermined duration before the screen turns off, power consumption is saved.
Referring to FIG. 3, FIG. 10 and FIG. 19, in some embodiments, the gaze information includes gaze point coordinates, and before step 021 the control method further includes:
0203: in response to the electronic device 100 not receiving an input operation, acquiring a captured image;
and step 027 includes:
0272: in response to the captured image containing a human face and the gaze point coordinates being located in the display area, adjusting the display brightness of the display screen 40 to a first predetermined brightness; and
0273: in response to the captured image not containing a human face, or the captured image containing a human face while the gaze point coordinates are outside the display area, adjusting the display brightness to a second predetermined brightness, the second predetermined brightness being less than the first predetermined brightness.
In some embodiments, the control module 24 is further configured to acquire the captured image in response to the electronic device 100 not receiving an input operation; to adjust the display brightness of the display screen 40 to the first predetermined brightness in response to the captured image containing a human face and the gaze point coordinates being located in the display area; and to adjust the display brightness to the second predetermined brightness, which is less than the first predetermined brightness, in response to the captured image not containing a human face, or the captured image containing a human face while the gaze point coordinates are outside the display area. That is to say, steps 0203, 0272 and 0273 can be performed by the control module 24.
In some embodiments, the processor 60 is further configured to acquire the captured image in response to the electronic device 100 not receiving an input operation; to adjust the display brightness of the display screen 40 to the first predetermined brightness in response to the captured image containing a human face and the gaze point coordinates being located in the display area; and to adjust the display brightness to the second predetermined brightness, which is less than the first predetermined brightness, in response to the captured image not containing a human face, or the captured image containing a human face while the gaze point coordinates are outside the display area. That is to say, steps 0203, 0272 and 0273 can be performed by the processor 60.
Specifically, referring to FIG. 17 and FIG. 18 again, the gaze information can also be used for intelligent screen brightening. To save power, the electronic device 100 generally reduces the display brightness after the screen has been lit for a certain time, stays at low brightness for another period, and then turns the screen off. In this embodiment, when the electronic device 100 receives no input operation from the user, it can be inferred that the user may not be operating the electronic device 100 or may only be viewing the displayed content. The processor 60 can acquire a captured image; if the captured image contains a human face, the gaze information is calculated from it. If the gaze point coordinates are located in the display area, the user, although not operating the electronic device 100, is viewing the displayed content, and the display brightness is adjusted to the first predetermined brightness. The first predetermined brightness may be the brightness customised by the user for normal display on the display screen 40, or may vary in real time with the ambient light brightness to adapt to it. This ensures that the screen stays lit even when the user is not operating the electronic device 100, and prevents the screen from turning off suddenly while the user is viewing the displayed content, which would harm the user experience.
When the electronic device 100 receives no input operation and the captured image contains no human face, or the captured image contains a human face but the gaze point coordinates are outside the display area (i.e., the user is not viewing the display area), it can be determined that the user currently does not need to use the electronic device 100. The display brightness can therefore be adjusted to the second predetermined brightness, which is less than the first predetermined brightness, preventing unnecessary power consumption. When the user gazes at the display area again, the display brightness is adjusted back to the first predetermined brightness, ensuring a normal viewing experience. In this way, when the user is not operating the electronic device 100, the display area is shown at normal brightness while the user gazes at it and at low brightness otherwise, maximising power savings while preserving the viewing experience.
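The brightness policy of steps 0272 and 0273 reduces to a small decision function. The sketch below is illustrative only; the concrete brightness values and all parameter names are assumptions:

```python
def target_brightness(received_input, face_present, gaze_in_area,
                      first_brightness=200, second_brightness=30):
    """Sketch of the dimming policy described above.

    With no input operation received, the screen keeps the normal
    (first) brightness only if a face is present AND the gaze point
    lies inside the display area; otherwise it dims to the second,
    lower brightness.  Brightness units are arbitrary placeholders.
    """
    if received_input:
        return first_brightness      # user is actively operating the device
    if face_present and gaze_in_area:
        return first_brightness      # user is watching the screen
    return second_brightness         # dim to save power
```

Re-evaluating this function as new frames arrive restores the first brightness as soon as the user's gaze returns to the display area.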
Referring to FIG. 20, embodiments of the present application further provide one or more non-volatile computer-readable storage media 300 containing a computer program 302. When the computer program 302 is executed by one or more processors 60, the processors 60 are caused to perform the gaze detection method or the control method of the electronic device 100 of any of the above embodiments.
For example, referring to FIG. 1, when the computer program 302 is executed by one or more processors 60, the processors 60 are caused to perform the following steps:
011: determining the posture information of the face according to the face information, and determining the reference gaze point coordinates according to the face information;
013: in response to the posture information being greater than a preset threshold, determining correction parameters according to the posture information; and
015: determining the gaze information according to the reference gaze point coordinates and the correction parameters.
For another example, referring to FIG. 9, when the computer program 302 is executed by one or more processors 60, the processors 60 can further perform the following steps:
021: determining the posture information of the face according to the face information, and determining the reference gaze point coordinates according to the face information;
023: in response to the posture information being greater than a preset threshold, determining correction parameters according to the posture information;
025: determining the gaze information according to the reference gaze point coordinates and the correction parameters; and
027: controlling the electronic device 100 according to the gaze information.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "exemplary embodiment", "example", "specific example" or "some examples" means that a specific feature, structure, material or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the described specific features, structures, materials or characteristics may be combined in a suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine different embodiments or examples described in this specification, and features of different embodiments or examples, provided they do not contradict each other.
Any process or method description in a flowchart or otherwise described herein may be understood as representing a module, segment or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process, and the scope of the preferred embodiments of the present application includes additional implementations in which functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present application belong.
Although the embodiments of the present application have been shown and described above, it can be understood that the above embodiments are exemplary and should not be construed as limiting the present application, and those of ordinary skill in the art may make changes, modifications, substitutions and variations to the above embodiments within the scope of the present application.

Claims (20)

  1. A gaze detection method, characterized by comprising:
    determining posture information of a face according to face information, and determining reference gaze point coordinates according to the face information;
    in response to the posture information being greater than a preset threshold, determining correction parameters according to the posture information; and
    determining gaze information according to the reference gaze point coordinates and the correction parameters.
  2. The gaze detection method according to claim 1, characterized by further comprising:
    in response to the posture information being less than the preset threshold, calculating the reference gaze point coordinates according to the face information as the gaze information.
  3. The gaze detection method according to claim 1, wherein the pose information comprises a pose angle, the pose angle comprises a pitch angle and a yaw angle, and determining, according to the face information, whether the pose information of the face is greater than the preset threshold comprises:
    determining, according to the face information, whether the pitch angle or the yaw angle is greater than the preset threshold.
  4. The gaze detection method according to claim 1, further comprising:
    obtaining a training sample set, the training sample set comprising first-type samples whose face pose information is smaller than the preset threshold and second-type samples whose face pose information is greater than the preset threshold; and
    training a preset detection model according to the first-type samples and the second-type samples;
    wherein determining the correction parameter according to the pose information comprises:
    determining the correction parameter according to the pose information based on the detection model.
  5. The gaze detection method according to claim 4, wherein the detection model comprises a gaze point detection module and a correction module, and training the detection model according to the first-type samples and the second-type samples comprises:
    inputting the first-type samples into the gaze point detection module to output first training coordinates;
    inputting the second-type samples into the gaze point detection module and the correction module to output second training coordinates;
    based on a preset loss function, calculating a first loss value according to first preset coordinates corresponding to the first-type samples and the first training coordinates, and calculating a second loss value according to second preset coordinates corresponding to the second-type samples and the second training coordinates; and
    adjusting the detection model according to the first loss value and the second loss value until the detection model converges.
  6. The gaze detection method according to claim 5, wherein the detection model is determined to have converged when, among N consecutive batches of training samples, a first difference between the first loss values corresponding to any two of the first-type samples and a second difference between the second loss values corresponding to any two of the second-type samples are both smaller than a predetermined difference threshold, N being a positive integer greater than 1; or the detection model is determined to have converged when both the first loss value and the second loss value are smaller than a predetermined loss threshold.
  7. The gaze detection method according to claim 1, wherein the face information comprises a face mask, a left-eye image, and a right-eye image, the face mask indicating a position of the face in an image, and calculating the reference gaze point coordinates according to the face information comprises:
    calculating position information of the face relative to an electronic device according to the face mask; and
    calculating the reference gaze point coordinates according to the position information, the left-eye image, and the right-eye image.
  8. The gaze detection method according to claim 1, wherein the face information comprises face feature points, the pose information comprises a pose angle and a three-dimensional coordinate offset, and the correction parameter comprises a rotation matrix and a translation matrix;
    determining the pose information of the face according to the face information comprises:
    calculating the pose angle and the three-dimensional coordinate offset according to the face feature points; and
    calculating the correction parameter according to the pose information comprises:
    calculating the rotation matrix according to the pose angle, and calculating the translation matrix according to the three-dimensional coordinate offset.
  9. A control method for an electronic device, comprising:
    determining pose information of a human face according to face information, and determining reference gaze point coordinates according to the face information;
    in response to the pose information being greater than a preset threshold, determining a correction parameter according to the pose information;
    determining gaze information according to the reference gaze point coordinates and the correction parameter; and
    controlling the electronic device according to the gaze information.
  10. The control method according to claim 9, further comprising:
    in response to the pose information being smaller than the preset threshold, calculating the reference gaze point coordinates according to the face information to serve as the gaze information.
  11. The control method according to claim 9, wherein the face information comprises a face mask, a left-eye image, and a right-eye image, the face mask indicating a position of the face in an image, and calculating the reference gaze point coordinates according to the face information comprises:
    calculating position information of the face relative to the electronic device according to the face mask; and
    calculating the reference gaze point coordinates according to the position information, the left-eye image, and the right-eye image.
  12. The control method according to claim 9, wherein the face information comprises face feature points, the pose information comprises a pose angle and a three-dimensional coordinate offset, the correction parameter comprises a rotation matrix and a translation matrix, and calculating the correction parameter according to the pose information comprises:
    calculating the pose angle and the three-dimensional coordinate offset according to the face feature points; and
    calculating the rotation matrix according to the pose angle, and calculating the translation matrix according to the three-dimensional coordinate offset.
  13. The control method according to claim 9, wherein the gaze information comprises gaze point coordinates, and
    before determining the pose information of the face according to the face information and determining the reference gaze point coordinates according to the face information, the control method further comprises:
    acquiring a captured image within a first predetermined duration before the screen is turned off; and
    responding to the captured image containing face information;
    wherein controlling the electronic device according to the gaze information comprises:
    in response to the gaze point coordinates being located within a display area of a display screen, keeping the screen on for a second predetermined duration.
  14. The control method according to claim 13, wherein the display area is associated with a preset coordinate range, and the control method further comprises:
    determining that the gaze point coordinates are located in the display area when the gaze point coordinates fall within the preset coordinate range.
  15. The control method according to claim 9, wherein before determining the pose information of the face according to the face information and determining the reference gaze point coordinates according to the face information, the control method further comprises:
    acquiring a captured image in response to the electronic device receiving no input operation;
    wherein controlling the electronic device according to the gaze information comprises:
    in response to the captured image containing a face and the gaze point coordinates being located within a display area of a display screen, adjusting a display brightness of the display screen to a first predetermined brightness; and
    in response to the captured image not containing a face, or the captured image containing a face and the gaze point coordinates being located outside the display area, adjusting the display brightness to a second predetermined brightness, the second predetermined brightness being lower than the first predetermined brightness.
  16. A detection device, comprising:
    a first determination module configured to determine pose information of a human face according to face information, and determine reference gaze point coordinates according to the face information;
    a second determination module configured to determine a correction parameter according to the pose information in response to the pose information being greater than a preset threshold; and
    a third determination module configured to determine gaze information according to the reference gaze point coordinates and the correction parameter.
  17. A control device, comprising:
    an acquisition module configured to determine pose information of a human face according to face information, and determine reference gaze point coordinates according to the face information;
    a first determination module configured to determine a correction parameter according to the pose information in response to the pose information being greater than a preset threshold;
    a second determination module configured to determine gaze information according to the reference gaze point coordinates and the correction parameter; and
    a control module configured to control an electronic device according to the gaze information.
  18. An electronic device, comprising a processor, the processor being configured to: determine pose information of a human face according to face information, and determine reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determine a correction parameter according to the pose information; and determine gaze information according to the reference gaze point coordinates and the correction parameter.
  19. An electronic device, comprising a processor, the processor being configured to: determine pose information of a human face according to face information, and determine reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determine a correction parameter according to the pose information; determine gaze information according to the reference gaze point coordinates and the correction parameter; and control the electronic device according to the gaze information.
  20. A non-transitory computer-readable storage medium comprising a computer program that, when executed by a processor, causes the processor to perform the gaze detection method according to any one of claims 1-8, or the control method for an electronic device according to any one of claims 9-15.
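Claims 8 and 12 derive the correction parameter as a rotation matrix computed from the pose angle and a translation matrix computed from the three-dimensional coordinate offset. A minimal sketch of one possible realization is given below; the Euler-angle convention (X-pitch, Y-yaw, Z-roll), the `Rz·Ry·Rx` composition order, and the way the correction is applied to the reference gaze point are assumptions for illustration — the claims do not fix a specific parameterization:

```python
import numpy as np


def rotation_from_pose(pitch, yaw, roll):
    """Build a rotation matrix from pose angles in radians.

    Assumed convention: pitch about X, yaw about Y, roll about Z,
    composed as Rz @ Ry @ Rx.
    """
    cx, sx = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    cz, sz = np.cos(roll), np.sin(roll)
    rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return rz @ ry @ rx


def correct_gaze_point(p_ref, pitch, yaw, roll, offset):
    """Apply the correction parameters to a 3D reference gaze point:
    rotation matrix from the pose angle, translation from the
    three-dimensional coordinate offset."""
    rotation = rotation_from_pose(pitch, yaw, roll)
    translation = np.asarray(offset, dtype=float)
    return rotation @ np.asarray(p_ref, dtype=float) + translation
```

With zero pose angles and zero offset the correction is the identity, matching the intuition that a frontal face needs no correction; in practice (per claim 8) the pose angle and offset would themselves be estimated from the face feature points.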
PCT/CN2022/126148 2021-10-29 2022-10-19 Gaze detection method, control method for electronic device, and related devices WO2023071884A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111271397.4A CN113936324A (en) 2021-10-29 2021-10-29 Gaze detection method, control method of electronic device and related device
CN202111271397.4 2021-10-29

Publications (1)

Publication Number Publication Date
WO2023071884A1 true WO2023071884A1 (en) 2023-05-04

Family

ID=79285003

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/126148 WO2023071884A1 (en) 2021-10-29 2022-10-19 Gaze detection method, control method for electronic device, and related devices

Country Status (2)

Country Link
CN (1) CN113936324A (en)
WO (1) WO2023071884A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117495937A (en) * 2023-12-25 2024-02-02 荣耀终端有限公司 Face image processing method and electronic equipment

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113936324A (en) * 2021-10-29 2022-01-14 Oppo广东移动通信有限公司 Gaze detection method, control method of electronic device and related device
CN116052235B (en) * 2022-05-31 2023-10-20 荣耀终端有限公司 Gaze point estimation method and electronic equipment
CN116052261A (en) * 2022-05-31 2023-05-02 荣耀终端有限公司 Sight estimation method and electronic equipment
CN116030512B (en) * 2022-08-04 2023-10-31 荣耀终端有限公司 Gaze point detection method and device
CN115509351B (en) * 2022-09-16 2023-04-07 上海仙视电子科技有限公司 Sensory linkage situational digital photo frame interaction method and system
CN117133043A (en) * 2023-03-31 2023-11-28 荣耀终端有限公司 Gaze point estimation method, electronic device, and computer-readable storage medium
CN116737051B (en) * 2023-08-16 2023-11-24 北京航空航天大学 Visual touch combination interaction method, device and equipment based on touch screen and readable medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160217318A1 (en) * 2013-08-29 2016-07-28 Nec Corporation Image processing device, image processing method, and program
CN109993029A (en) * 2017-12-29 2019-07-09 上海聚虹光电科技有限公司 Blinkpunkt model initialization method
CN112232128A (en) * 2020-09-14 2021-01-15 南京理工大学 Eye tracking based method for identifying care needs of old disabled people
CN112509007A (en) * 2020-12-14 2021-03-16 科大讯飞股份有限公司 Real fixation point positioning method and head-wearing sight tracking system
CN113544626A (en) * 2019-03-15 2021-10-22 索尼集团公司 Information processing apparatus, information processing method, and computer-readable recording medium
CN113936324A (en) * 2021-10-29 2022-01-14 Oppo广东移动通信有限公司 Gaze detection method, control method of electronic device and related device



Also Published As

Publication number Publication date
CN113936324A (en) 2022-01-14

Similar Documents

Publication Publication Date Title
WO2023071884A1 (en) Gaze detection method, control method for electronic device, and related devices
US9373156B2 (en) Method for controlling rotation of screen picture of terminal, and terminal
WO2023071882A1 (en) Human eye gaze detection method, control method and related device
TWI704501B (en) Electronic apparatus operated by head movement and operation method thereof
US9696859B1 (en) Detecting tap-based user input on a mobile device based on motion sensor data
US9740281B2 (en) Human-machine interaction method and apparatus
CN104317391B (en) A kind of three-dimensional palm gesture recognition exchange method and system based on stereoscopic vision
KR101977638B1 (en) Method for correcting user’s gaze direction in image, machine-readable storage medium and communication terminal
US11715231B2 (en) Head pose estimation from local eye region
CN100343867C (en) Method and apparatus for distinguishing direction of visual lines
US10489912B1 (en) Automated rectification of stereo cameras
TWI631506B (en) Method and system for whirling view on screen
CN109375765B (en) Eyeball tracking interaction method and device
CN104574321A (en) Image correction method and device and video system
WO2020042542A1 (en) Method and apparatus for acquiring eye movement control calibration data
US10866492B2 (en) Method and system for controlling tracking photographing of stabilizer
WO2012137801A1 (en) Input device, input method, and computer program
WO2020019504A1 (en) Robot screen unlocking method, apparatus, smart device and storage medium
US20210118157A1 (en) Machine learning inference on gravity aligned imagery
WO2021197466A1 (en) Eyeball detection method, apparatus and device, and storage medium
CN109377518A (en) Target tracking method, device, target tracking equipment and storage medium
CN102725713A (en) Operation input device
CN113487670A (en) Cosmetic mirror and state adjusting method
WO2016033877A1 (en) Method for dynamically adjusting screen character display of terminal, and terminal
US20150063631A1 (en) Dynamic image analyzing system and operating method thereof

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22885762

Country of ref document: EP

Kind code of ref document: A1