WO2023071884A1 - Gaze detection method, control method for electronic device, and related devices - Google Patents


Info

Publication number
WO2023071884A1
WO2023071884A1 (application PCT/CN2022/126148, CN2022126148W)
Authority
WO
WIPO (PCT)
Prior art keywords
information
face
gaze
coordinates
gaze point
Prior art date
Application number
PCT/CN2022/126148
Other languages
French (fr)
Chinese (zh)
Inventor
龚章泉
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp., Ltd.
Publication of WO2023071884A1 publication Critical patent/WO2023071884A1/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Definitions

  • the present application relates to the technical field of consumer electronics, and in particular to a gaze detection method, a control method for electronic equipment, a detection device, a control device, electronic equipment, and a non-volatile computer-readable storage medium.
  • electronic devices can estimate a user's gaze point by collecting face images.
  • the present application provides a gaze detection method, a control method of an electronic device, a detection device, a control device, an electronic device and a non-volatile computer-readable storage medium.
  • the gaze detection method of an embodiment of the present application includes determining the pose information of the face according to the face information, and determining the reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determining correction parameters according to the pose information; and determining gaze information according to the reference gaze point coordinates and the correction parameters.
  • a detection device includes a first determination module, a second determination module and a third determination module.
  • the first determination module is used to determine the pose information of the face according to the face information, and determine the reference gaze point coordinates according to the face information;
  • the second determination module is used to determine correction parameters according to the pose information in response to the pose information being greater than a preset threshold;
  • the third determination module is configured to determine gaze information according to the coordinates of the reference gaze point and the correction parameters.
  • An electronic device includes a processor configured to determine the pose information of the face according to the face information, and determine the reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determine correction parameters according to the pose information; and determine gaze information according to the reference gaze point coordinates and the correction parameters.
  • In the gaze detection method, detection device and electronic device of the present application, after the face information is obtained, the face pose is first calculated from the face information. If the pose information is greater than the preset threshold, the calculation accuracy of the gaze point coordinates would be affected; therefore the reference gaze point coordinates are calculated from the face information, the correction parameters are calculated from the pose information, and the reference gaze point coordinates are then corrected according to the correction parameters. This prevents an excessive face shooting angle in the obtained face information from degrading gaze detection, and thus improves the accuracy of gaze detection.
  • the method for controlling an electronic device includes determining the pose information of the face according to the face information, and determining the reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determining correction parameters according to the pose information; determining gaze information according to the reference gaze point coordinates and the correction parameters; and controlling the electronic device according to the gaze information.
  • the control device in the embodiment of the present application includes an acquisition module, a first determination module and a second determination module.
  • the acquisition module is used to determine the pose information of the face according to the face information, and determine the reference gaze point coordinates according to the face information;
  • the first determination module is used to determine correction parameters according to the pose information in response to the pose information being greater than a preset threshold;
  • the second determination module is used to determine gaze information according to the reference gaze point coordinates and the correction parameters;
  • the electronic device includes a processor configured to determine the pose information of the face according to the face information, and determine the reference gaze point coordinates according to the face information; in response to the pose information being greater than the preset threshold, determine a correction parameter according to the pose information; determine gaze information according to the reference gaze point coordinates and the correction parameter; and control the electronic device according to the gaze information.
  • the processor is caused to execute a gaze detection method or a control method.
  • the gaze detection method includes determining the pose information of the face according to the face information, and determining the reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determining correction parameters according to the pose information; and determining gaze information according to the reference gaze point coordinates and the correction parameters.
  • the control method of the electronic device includes determining the pose information of the face according to the face information, and determining the reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determining a correction parameter according to the pose information; determining gaze information according to the reference gaze point coordinates and the correction parameters; and controlling the electronic device according to the gaze information.
  • FIG. 1 is a schematic flow chart of a gaze detection method in some embodiments of the present application.
  • FIG. 2 is a block diagram of a detection device in some embodiments of the present application.
  • FIG. 3 is a schematic plan view of an electronic device in some embodiments of the present application.
  • Fig. 4 is a schematic diagram of connection between an electronic device and a cloud server in some embodiments of the present application.
  • FIGS. 5 to 7 are schematic flowcharts of gaze detection methods in some embodiments of the present application.
  • Fig. 8 is a schematic structural diagram of a detection model in some embodiments of the present application.
  • FIG. 9 is a schematic flowchart of a method for controlling an electronic device in some embodiments of the present application.
  • Fig. 10 is a block diagram of a control device in some embodiments of the present application.
  • FIGS. 11 to 14 are schematic diagrams of scenarios of control methods in some embodiments of the present application.
  • FIGS. 15 and 16 are schematic flowcharts of the control method in some embodiments of the present application.
  • FIGS. 17 and 18 are schematic diagrams of scenarios of the control method in some embodiments of the present application.
  • FIG. 19 is a schematic flowchart of a control method in some embodiments of the present application.
  • Fig. 20 is a schematic diagram of connection between a processor and a computer-readable storage medium in some embodiments of the present application.
  • the gaze detection method of the present application includes determining the pose information of the face according to the face information, and determining the reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determining the correction parameters according to the pose information; and determining gaze information according to the reference gaze point coordinates and the correction parameters.
  • the gaze detection method further includes: in response to the pose information being less than the preset threshold, calculating the reference gaze point coordinates according to the face information as the gaze information.
  • the posture information includes a posture angle
  • the posture angle includes a pitch angle and a yaw angle.
  • judging whether the pose information of the face is greater than a preset threshold includes: judging, according to the face information, whether the pitch angle or the yaw angle is greater than a preset threshold.
  • the gaze detection method further includes: obtaining a training sample set, where the training sample set includes first-type samples whose face pose information is less than the preset threshold and second-type samples whose face pose information is greater than the preset threshold; and training a preset detection model according to the first-type samples and the second-type samples; determining the correction parameters according to the pose information includes: determining the correction parameters according to the pose information based on the detection model.
  • the detection model includes a gaze point detection module and a correction module, and training the detection model according to the first-type samples and the second-type samples includes: inputting the first-type samples into the gaze point detection module to output first training coordinates; inputting the second-type samples into the gaze point detection module and the correction module to output second training coordinates; based on a preset loss function, calculating a first loss value according to the first preset coordinates and the first training coordinates corresponding to the first-type samples, and calculating a second loss value according to the second preset coordinates and the second training coordinates corresponding to the second-type samples; and adjusting the detection model according to the first loss value and the second loss value until the detection model converges.
  • the detection model is determined to have converged when, over N consecutive samples, the first difference between the first loss values corresponding to any two samples of the first type and the second difference between the second loss values corresponding to any two samples of the second type are both sufficiently small, where N is a positive integer greater than 1; or, when the first loss value and the second loss value are both less than a predetermined loss threshold, the detection model is determined to have converged.
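As a rough illustration only, the two-branch convergence test described above might be sketched as follows. The helper name and the interpretation of "sufficiently small" pairwise loss differences are assumptions; the patent does not specify an implementation:

```python
def has_converged(first_losses, second_losses, n, diff_threshold, loss_threshold):
    """Decide convergence from the recorded first/second loss values.

    Converged if, over the last n steps, the maximum pairwise difference
    within each loss series is below diff_threshold, or if the latest
    first and second losses are both below loss_threshold.
    """
    if len(first_losses) >= n and len(second_losses) >= n:
        recent1, recent2 = first_losses[-n:], second_losses[-n:]
        spread1 = max(recent1) - min(recent1)  # largest pairwise difference
        spread2 = max(recent2) - min(recent2)
        if spread1 < diff_threshold and spread2 < diff_threshold:
            return True
    return first_losses[-1] < loss_threshold and second_losses[-1] < loss_threshold
```

Either branch alone would suffice; checking both mirrors the "or" in the passage above.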
  • the face information includes a face mask, a left-eye image and a right-eye image
  • the face mask is used to indicate the position of the face in the image
  • calculating the reference gaze point coordinates according to the face information includes: calculating the position information of the face relative to the electronic device according to the face mask; and calculating the reference gaze point coordinates according to the position information, the left-eye image and the right-eye image.
  • the face information includes face feature points
  • the pose information includes pose angles and three-dimensional coordinate offsets
  • the correction parameters include rotation matrices and translation matrices
  • determining the pose information of the face according to the face information includes: calculating the pose angle and the three-dimensional coordinate offset according to the face feature points; calculating the correction parameters according to the pose information includes: calculating the rotation matrix according to the pose angle, and calculating the translation matrix according to the three-dimensional coordinate offset.
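As one possible sketch of these correction parameters, a rotation matrix can be built from the pose angles and a homogeneous translation matrix from the three-dimensional coordinate offset. The axis-order convention (Rz·Ry·Rx) and the function names are illustrative assumptions, not the patent's specified method:

```python
import numpy as np

def rotation_matrix(pitch, yaw, roll):
    """3x3 rotation from pose angles in radians, composed as Rz(roll) @ Ry(yaw) @ Rx(pitch).

    The composition order is an assumption; the patent fixes no convention.
    """
    cx, sx = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    cz, sz = np.cos(roll), np.sin(roll)
    rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])   # rotation about x
    ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])   # rotation about y
    rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])   # rotation about z
    return rz @ ry @ rx

def translation_matrix(offset):
    """4x4 homogeneous translation built from a 3-D coordinate offset."""
    t = np.eye(4)
    t[:3, 3] = offset
    return t
```

With both matrices, a reference gaze point expressed in homogeneous coordinates can be rotated and shifted back toward the frontal-pose frame.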
  • the electronic device control method of the present application includes determining the pose information of the face according to the face information, and determining the reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determining the correction parameters according to the pose information; determining gaze information according to the reference gaze point coordinates and the correction parameters; and controlling the electronic device according to the gaze information.
  • the control method further includes: in response to the pose information being less than the preset threshold, calculating the reference gaze point coordinates according to the face information as the gaze information.
  • the face information includes a face mask, a left-eye image and a right-eye image
  • the face mask is used to indicate the position of the face in the image
  • calculating the reference gaze point coordinates according to the face information includes: calculating the position information of the face relative to the electronic device according to the face mask; and calculating the reference gaze point coordinates according to the position information, the left-eye image and the right-eye image.
  • the face information includes face feature points
  • the pose information includes pose angles and three-dimensional coordinate offsets
  • the correction parameters include rotation matrices and translation matrices
  • calculating the correction parameters according to the pose information includes: calculating the pose angle and the three-dimensional coordinate offset according to the face feature points; calculating the rotation matrix according to the pose angle, and calculating the translation matrix according to the three-dimensional coordinate offset.
  • the gazing information includes gaze point coordinates.
  • the control method further includes: within a first predetermined duration, acquiring a captured image; and in response to the captured image containing face information, controlling the electronic device according to the gaze information, including: in response to the gaze point coordinates being located in the display area of the display screen, continuing to light the screen for a second predetermined duration.
  • the display area is associated with a preset coordinate range
  • the control method further includes: when the gaze point coordinates are within the preset coordinate range, determining that the gaze point coordinates are located in the display area.
  • before determining the pose information of the face according to the face information and determining the reference gaze point coordinates according to the face information, the control method further includes: in response to the electronic device not receiving an input operation, acquiring a captured image; controlling the electronic device according to the gaze information includes: in response to the captured image containing a human face and the gaze point coordinates being located in the display area, adjusting the display brightness of the display screen to a first predetermined brightness; in response to the captured image not containing a human face, or the captured image containing a human face but the gaze point coordinates being outside the display area, adjusting the display brightness to a second predetermined brightness, the second predetermined brightness being smaller than the first predetermined brightness.
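The display-area test and the brightness rule described above can be sketched as follows. The function names, the (x_min, y_min, x_max, y_max) rectangle encoding of the preset coordinate range, and the brightness values are illustrative assumptions:

```python
def in_display_area(gaze_xy, coord_range):
    """True if the gaze point lies within the preset coordinate range
    (x_min, y_min, x_max, y_max) associated with the display area."""
    x, y = gaze_xy
    x_min, y_min, x_max, y_max = coord_range
    return x_min <= x <= x_max and y_min <= y <= y_max

def target_brightness(has_face, gaze_on_screen, first_brightness, second_brightness):
    """Brightness rule from the passage: full (first) brightness only when a
    face is present AND the gaze point is on the display area; otherwise dim
    to the second brightness (second < first)."""
    return first_brightness if (has_face and gaze_on_screen) else second_brightness
```

For example, a captured image with a face whose gaze point falls inside the screen rectangle keeps the display at the first brightness; every other case dims it.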
  • the detection device of the present application includes a first determination module, a second determination module and a third determination module.
  • the first determination module is used to determine the pose information of the face according to the face information, and determine the coordinates of the reference gaze point according to the face information;
  • the second determination module is used to determine the correction parameters according to the pose information in response to the pose information being greater than a preset threshold;
  • the third determination module is configured to determine gaze information according to the reference gaze point coordinates and the correction parameters.
  • the control device of the present application includes an acquisition module, a first determination module, a second determination module and a control module.
  • the acquisition module is used to determine the pose information of the face according to the face information, and determine the reference gaze point coordinates according to the face information;
  • the first determination module is used to determine the correction parameters according to the pose information in response to the pose information being greater than a preset threshold;
  • the second determination module is used to determine the gaze information according to the reference gaze point coordinates and the correction parameters;
  • the control module is used to control the electronic equipment according to the gaze information.
  • the electronic device of the present application includes a processor configured to determine the pose information of the face according to the face information, and determine the reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determine the correction parameters according to the pose information; and determine gaze information according to the reference gaze point coordinates and the correction parameters.
  • the electronic device of the present application includes a processor configured to determine the pose information of the face according to the face information, and determine the reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determine the correction parameters according to the pose information; determine gaze information according to the reference gaze point coordinates and the correction parameters; and control the electronic device according to the gaze information.
  • the non-transitory computer-readable storage medium of the present application includes a computer program; when the computer program is executed by a processor, the processor executes the gaze detection method of any of the above embodiments, or the control method of the electronic device of any of the above embodiments.
  • the gaze detection method of the embodiment of the present application includes the following steps:
  • 011 Determine the pose information of the face according to the face information, and determine the reference gaze point coordinates according to the face information;
  • 013 In response to the pose information being greater than a preset threshold, determine the correction parameters according to the pose information;
  • 015 Determine the gaze information according to the reference gaze point coordinates and the correction parameters.
  • the detection device 10 in the embodiment of the present application includes a first determination module 11 , a second determination module 12 and a third determination module 13 .
  • the first determination module 11 is used to determine the pose information of the face according to the face information, and determines the coordinates of the reference gaze point according to the face information;
  • the second determination module 12 is used to determine the correction parameters according to the pose information in response to the pose information being greater than a preset threshold ;
  • the third determination module 13 is used to determine the gaze information according to the coordinates of the reference gaze point and the correction parameters. That is to say, step 011 can be implemented by the first determination module 11 , step 013 can be performed by the second determination module 12 and step 015 can be performed by the third determination module 13 .
  • the electronic device 100 in the embodiment of the present application includes a processor 60 and a collection device 30 .
  • The acquisition device 30 is used for collecting face information at a predetermined frame rate (the face information may comprise a face image, such as a visible light image, an infrared image, or a depth image of the face);
  • The acquisition device 30 can be one or more of a visible light camera, an infrared camera, and a depth camera, where the visible light camera collects visible light face images, the infrared camera collects infrared face images, and the depth camera collects depth face images.
  • In some embodiments, the collection device 30 includes a visible light camera, an infrared camera and a depth camera, and the acquisition device 30 simultaneously collects a visible light face image, an infrared face image and a depth face image.
  • The processor 60 may include an image signal processor (ISP), a neural-network processing unit (NPU) and an application processor (AP). The detection device 10 is arranged in the electronic device 100, where the first determination module 11 can be arranged on the ISP and the NPU, and the processor 60 is connected to the collection device 30. After the collection device 30 collects the face image, the ISP processes the face image to obtain the face information, and the NPU determines the reference gaze point coordinates according to the face information; the second determination module 12 and the third determination module 13 can be arranged on the NPU.
  • ISP Image Signal Processor
  • NPU Neural-Network Processing Unit
  • AP Application Processor
  • The processor 60 (specifically, the ISP and the NPU) is used to determine the pose information of the face according to the face information; the processor 60 (specifically, the NPU) is also used to determine the correction parameters according to the pose information in response to the pose information being greater than a preset threshold, and to determine gaze information according to the reference gaze point coordinates and the correction parameters. That is to say, step 011, step 013 and step 015 may be executed by the processor 60.
  • the electronic device 100 may be a mobile phone, a smart watch, a tablet computer, a display device, a notebook computer, a teller machine, a gate, a head-mounted display device, a game machine, and the like. As shown in FIG. 3 , the embodiment of the present application is described by taking the electronic device 100 as a mobile phone as an example. It can be understood that the specific form of the electronic device 100 is not limited to the mobile phone.
  • The collection device 30 can collect the user's face information once every predetermined time interval, so that gaze detection is performed on the user continuously while the power consumption of the electronic device 100 remains small; or, when the user is using an application that requires gaze detection (such as browser software, forum software, or video software), face information is collected at a predetermined frame rate (such as 10 frames per second), so that face information is collected only when gaze detection is required, minimizing the power consumption of gaze detection.
  • The processor 60 can identify the face image. For example, the processor 60 can compare the face image with a preset face template, so as to determine the face in the face image and the image areas where different parts of the face (such as the eyes and nose) are located. The processor 60 can perform face recognition in a trusted execution environment (TEE) to protect the user's privacy. Alternatively, the preset face template can be stored in the cloud server 200, and the electronic device 100 sends the face image to the cloud server 200 for comparison to determine the face area image; handing face recognition over to the cloud server 200 reduces the processing load of the electronic device 100 and improves image processing efficiency. The processor 60 can then recognize the face area image to determine the pose information of the face. More specifically, the face and its different parts can be recognized according to their shape features, so as to obtain the face information.
  • the pose information of the face can be calculated according to the face information.
  • The pose information can be calculated by extracting features of the face image and computing the pose from the position coordinates of the extracted feature points. For example, the nose tip, the centers of the left and right eyes, and the left and right corners of the mouth are used as feature points, and as the pose of the face changes, the position coordinates of the feature points change accordingly. For example, a three-dimensional coordinate system is established with the nose tip as the origin, and the pitch, yaw and roll angles respectively represent the rotation angles of the face relative to the three coordinate axes of this coordinate system. Taking the horizontal rotation of the face relative to the display screen 40 of the electronic device 100 as an example, the larger the yaw angle (i.e., the horizontal rotation angle), the closer together the two feature points corresponding to the left and right eyes appear. Therefore, the pose information of the face can be accurately calculated from the position coordinates of the feature points.
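The observation above, that the projected distance between the two eye feature points shrinks as the yaw angle grows, can be turned into a toy yaw estimate. This sketch assumes the projected inter-eye spacing scales with cos(yaw) relative to a frontal reference spacing; a real system would calibrate that reference or solve a full perspective-n-point problem from many landmarks:

```python
import math

def estimate_yaw_from_eyes(left_eye, right_eye, interocular_ref):
    """Rough yaw (degrees) from apparent eye spacing.

    left_eye/right_eye are (x, y) pixel coordinates of the eye centers;
    interocular_ref is the spacing measured when the face is frontal
    (an assumed calibration input, not something the patent specifies).
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    observed = math.hypot(dx, dy)
    # Projected spacing ~ interocular_ref * cos(yaw); clamp for safety.
    ratio = max(-1.0, min(1.0, observed / interocular_ref))
    return math.degrees(math.acos(ratio))
```

A frontal face (observed spacing equal to the reference) yields 0 degrees; a face whose eyes appear half as far apart yields about 60 degrees.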
  • the correction parameters can be determined according to the posture information, thereby correcting the error of the gaze point detection caused by the change of posture, so that the gaze obtained according to the coordinates of the reference point of gaze and the correction parameters The information is more accurate.
  • The processor 60 can directly calculate the reference gaze point coordinates from the face area image, or perform feature point recognition on the face area image and calculate the reference gaze point coordinates from the feature points, which requires relatively little computation. Alternatively, the processor 60 can obtain the face area image and the human eye area image, perform feature point recognition on the face area image, and jointly calculate the reference gaze point coordinates from the feature points of the face area image together with the human eye area image, further improving the calculation accuracy of the reference gaze point coordinates while keeping the amount of calculation small.
  • The processor 60 can first determine whether the pose information is greater than a preset threshold. The pose information includes the pitch angle, roll angle and yaw angle of the face; however, because a change of the roll angle (rotation of the face parallel to the display screen 40) does not change the positions of the facial feature points within the face, it is sufficient to judge only whether the pitch angle or the yaw angle is greater than the preset threshold.
  • For example, when the face is directly facing the display screen, the pitch angle, roll angle and yaw angle are all 0 degrees; if the preset threshold is 0 degrees, then when the pose information is greater than 0 degrees (such as when the pitch angle or yaw angle is greater than 0 degrees), it can be determined that the reference gaze point coordinates need to be corrected.
  • Since the pitch angle, roll angle, and yaw angle are directional, they may take negative values, which would affect the accuracy of the judgment; therefore, when judging whether the pose information is greater than the preset threshold, the absolute value of the pose information can be compared against the preset threshold.
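A minimal sketch of this judgment, assuming only pitch and yaw are compared (roll is ignored, as discussed above) and absolute values are used for the signed angles; the function name is illustrative:

```python
def needs_correction(pitch, yaw, threshold):
    """True if the reference gaze point coordinates need correcting.

    Roll is deliberately not checked: rotating in the screen plane does
    not move the feature points within the face. Absolute values handle
    the signed (directional) angles.
    """
    return abs(pitch) > threshold or abs(yaw) > threshold
```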
  • the processor 60 calculates the correction parameters according to the posture information.
  • The correction parameters include coordinate correction coefficients, and the gaze information can be obtained from the reference gaze point coordinates and the coordinate correction coefficients.
  • For example, if the reference gaze point coordinates are (x, y) and the coordinate correction coefficients are a and b, then the gaze information is (ax, by). Alternatively, the reference gaze point coordinates include both the two-dimensional coordinates on the display screen 40 and the direction of the line of sight.
  • the correction parameters include coordinate correction coefficient and direction correction coefficient.
  • the gaze information can be obtained according to the coordinates of the reference gaze point and the coordinate correction coefficient.
  • For example, if the reference gaze point coordinates are (x, y), the direction of the line of sight is (α, β, γ), the coordinate correction coefficients are a and b, and the direction correction coefficients are c, d and e, then the gaze information is (ax, by, cα, dβ, eγ).
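The component-wise correction in this example can be written directly. The tuple layout is taken from the example above; the function name and argument grouping are illustrative:

```python
def correct_gaze(ref_point, sight_dir, coord_coef, dir_coef):
    """Apply per-component correction coefficients.

    ref_point  = (x, y)            scaled by coord_coef = (a, b)
    sight_dir  = (alpha, beta, gamma) scaled by dir_coef = (c, d, e)
    Returns the combined gaze information (ax, by, c*alpha, d*beta, e*gamma).
    """
    x, y = ref_point
    a, b = coord_coef
    corrected_xy = (a * x, b * y)
    corrected_dir = tuple(k * v for k, v in zip(dir_coef, sight_dir))
    return corrected_xy + corrected_dir
```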
  • If the pose information is less than or equal to the preset threshold, the user is facing the display screen 40 or the deflection angle of the user relative to the display screen 40 is small; it can then be determined that the reference gaze point coordinates do not need to be corrected, and after calculating the reference gaze point coordinates, the processor 60 can directly take them as the final gaze information, saving the computation of the correction parameters.
  • The preset threshold can also be set larger. For example, with a preset threshold of 5 degrees, the deflection of the face is small enough that the detection accuracy of the gaze information is essentially unaffected, and the calculation of correction parameters can be skipped. Alternatively, the preset threshold can be set according to the requirements on the gaze information: if the gaze information only includes the gaze direction and does not need accurate gaze point coordinates, the preset threshold can be set larger; if the gaze information includes the gaze point coordinates on the display screen 40, the preset threshold can be set smaller to ensure the accuracy of gaze point detection.
  • After determining the gaze information, the electronic device 100 can be controlled according to the gaze information (the gaze direction and/or the gaze point coordinates). For example, when the gaze point coordinates are detected to be within the display area of the display screen 40, the screen is kept always on; after the gaze point coordinates have been outside the display area of the display screen 40 for a predetermined duration (such as 10 s or 20 s), the screen is turned off. Or, operations such as page turning are performed according to changes of the gaze direction.
  • the obtained face information may not be accurate enough due to factors such as shooting angles, thereby affecting the accuracy of gaze point detection.
• In the gaze detection method, the detection device 10 and the electronic device 100 of the present application, after the face information is obtained, the face posture is first calculated from it. If the posture information is greater than the preset threshold, which would affect the calculation accuracy of the gaze point coordinates, the reference gaze point coordinates are first calculated from the face information and the correction parameters are then calculated from the posture information, so that the reference gaze point coordinates can be corrected according to the correction parameters. This prevents the shooting angle of the face from unduly affecting gaze detection and improves the accuracy of gaze detection.
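The overall flow just described could be sketched as follows; all function names and the stub estimators passed in are hypothetical placeholders, not the application's actual modules:

```python
def detect_gaze(face_info, threshold, estimate_pose, estimate_ref_point,
                compute_correction, apply_correction):
    """Gaze-detection flow from the text: correct the reference gaze
    point only when the face posture exceeds the preset threshold;
    otherwise use the reference point directly and skip the correction."""
    pose = estimate_pose(face_info)            # e.g. deflection angle of the face
    ref_point = estimate_ref_point(face_info)  # reference gaze point coordinates
    if pose <= threshold:
        # Face is (nearly) frontal: no correction parameters are computed.
        return ref_point
    params = compute_correction(pose)
    return apply_correction(ref_point, params)

# Frontal face: posture below threshold, reference point returned as-is.
frontal = detect_gaze("face", 5.0, lambda f: 3.0, lambda f: (1.0, 2.0),
                      lambda p: None, lambda r, c: r)
# Deflected face: a (toy) correction parameter scales the reference point.
deflected = detect_gaze("face", 5.0, lambda f: 10.0, lambda f: (1.0, 2.0),
                        lambda p: 0.5, lambda r, c: (r[0] * c, r[1] * c))
print(frontal, deflected)
```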
  • the face information includes a face mask, a left-eye image and a right-eye image, and the face mask is used to indicate the position of the face in the image
  • Step 011 Calculate the reference gaze point coordinates according to the face information, including:
  • 0111 Calculate the position information of the face relative to the electronic device 100 according to the face mask
  • 0112 Calculate the coordinates of the reference gaze point according to the position information, the left-eye image and the right-eye image.
  • the first determining module 11 is configured to calculate the position information of the face relative to the electronic device 100 according to the face mask; and calculate the reference gaze point coordinates according to the position information, the left-eye image and the right-eye image. That is to say, step 0111 and step 0112 can be executed by the first determination module 11 .
  • the processor 60 is further configured to calculate the position information of the face relative to the electronic device 100 according to the face mask; and calculate the reference gaze point coordinates according to the position information, the left-eye image and the right-eye image. That is to say, step 0111 and step 0112 may be executed by the processor 60 .
• The processor 60 can also first determine the face mask of the face image; the face mask is used to represent the position of the face in the face image and can be obtained by recognizing that position. The processor 60 can then calculate the position information of the face relative to the electronic device 100 according to the face mask (for example, the distance between the face and the electronic device 100 can be calculated from the ratio of the face mask to the face image).
• It can be understood that when the distance between the face and the electronic device 100 changes, the gaze point coordinates of the human eye change even if the gaze direction does not. Therefore, when calculating the gaze information, in addition to using the face image and/or the eye images (such as the left-eye and right-eye images), the position information can also be combined to calculate the gaze point coordinates more accurately.
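As a toy illustration of estimating position information from the mask-to-image ratio, a pinhole-style model makes the fraction of the image occupied by the face shrink with the square of distance; the function name and the calibration constant k below are purely hypothetical:

```python
import math

def estimate_distance(mask_area, image_area, k=0.09):
    """Illustrative pinhole-style estimate: distance ~ k / sqrt(ratio),
    where ratio is the fraction of the image covered by the face mask.
    k is a hypothetical calibration constant, not a value from the source."""
    ratio = mask_area / image_area
    return k / math.sqrt(ratio)

# A face filling 1/4 of the frame reads as half the distance of one filling 1/16.
print(estimate_distance(16, 64) / estimate_distance(4, 64))
```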
  • face information includes face feature points
  • attitude information includes attitude angle and three-dimensional coordinate offset
  • correction parameters include rotation matrix and translation matrix
  • step 011 Determine the posture information of the face according to the face information, including:
  • Step 013 Calculate the correction parameters according to the attitude information, including
  • 0131 Calculate the rotation matrix according to the attitude angle, and calculate the translation matrix according to the three-dimensional coordinate offset.
  • the first determination module 11 is also used to calculate the attitude angle and three-dimensional coordinate offset according to the facial feature points; the second determination module 12 is also used to calculate the rotation matrix according to the attitude angle, and calculate the three-dimensional coordinate offset according to the Compute the translation matrix. That is to say, step 0113 can be performed by the first determination module 11 , and step 0131 can be performed by the second determination module 12 .
• The processor 60 is further configured to calculate an attitude angle and a three-dimensional coordinate offset based on the facial feature points, calculate a rotation matrix based on the attitude angle, and calculate a translation matrix based on the three-dimensional coordinate offset. That is to say, step 0113 and step 0131 may be executed by the processor 60.
  • the correction parameters may include a rotation matrix and a translation matrix to represent the face position change and pose change respectively.
• The attitude angle and the three-dimensional coordinate offset may be calculated first according to the face feature points, where the attitude angle represents the posture of the face (such as the pitch angle, roll angle and yaw angle) and the three-dimensional coordinate offset represents the position of the face. The rotation matrix is then calculated according to the attitude angle and the translation matrix according to the three-dimensional coordinate offset, so as to determine the correction parameters for the reference gaze point coordinates and accurately calculate the gaze information from the reference gaze point coordinates, the rotation matrix and the translation matrix.
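As a sketch of how such correction parameters could be formed, a rotation matrix can be composed from the pitch, roll and yaw attitude angles and a translation applied from the three-dimensional coordinate offset; the composition order and axis conventions below are illustrative assumptions, since the source does not fix them:

```python
import math

def matmul(A, B):
    """3x3 matrix product (plain lists, no external dependencies)."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_matrix(pitch, roll, yaw):
    """Compose a 3x3 rotation matrix from attitude angles in radians.
    Pitch is taken about X1, roll about Y1 and yaw about Z1, matching the
    axes described later in the text; the Rz*Ry*Rx order is one common
    convention, chosen here for illustration only."""
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    cy, sy = math.cos(yaw), math.sin(yaw)
    Rx = [[1, 0, 0], [0, cp, -sp], [0, sp, cp]]
    Ry = [[cr, 0, sr], [0, 1, 0], [-sr, 0, cr]]
    Rz = [[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]]
    return matmul(Rz, matmul(Ry, Rx))

def translate(point, offset):
    """Apply the translation given by a three-dimensional coordinate offset."""
    return [p + o for p, o in zip(point, offset)]
```

With zero attitude angles the rotation matrix is the identity, so an unrotated face leaves the reference coordinates unchanged.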
  • gaze detection method also includes:
  • the training sample set includes the first type of samples whose face pose information is less than a preset threshold and the second type of samples whose face pose information is greater than a preset threshold;
  • 0102 Train a preset detection model according to the first type of samples and the second type of samples;
  • Step 013 includes:
  • the detection device 10 further includes an acquisition module 14 and a training module 15 . Both the acquiring module 14 and the training module 15 can be set in the NPU to train the detection model.
  • the acquisition module 14 is used to obtain the training sample set;
• the training module 15 is used to train the preset detection model according to the first type of samples and the second type of samples;
  • the second determination module 12 is also used to determine the correction parameters according to the posture information based on the detection model . That is to say, step 0101 may be performed by the acquisition module 14 , step 0102 may be performed by the training module 15 , and step 0132 may be performed by the second determination module 12 .
  • the processor 60 is further configured to obtain a training sample set; train a preset detection model according to the first type of samples and the second type of samples; and determine correction parameters according to the posture information based on the detection model. That is to say, step 0101 , step 0102 and step 0132 can be executed by the processor 60 .
  • the present application can realize calculation of gaze information through a preset detection model.
  • it is necessary to first train the detection model so that the detection model converges.
• In order to enable the detection model to still accurately calculate the gaze information when the face is deflected relative to the display screen 40, a plurality of first-type samples whose face posture information is less than the preset threshold and a plurality of second-type samples whose face posture information is greater than the preset threshold can be pre-selected as the training sample set. The first-type samples are face images whose posture information is less than the preset threshold; the second-type samples are face images whose posture information is greater than the preset threshold. By training the detection model with both types of samples until convergence, the influence of the face's deflection relative to the display screen 40 on gaze detection is minimized, ensuring the accuracy of gaze detection.
  • step 0102 includes:
  • 01021 Input the first type of samples into the fixation point detection module to output the first training coordinates
  • 01022 Input the second type of samples into the gaze point detection module and the correction module to output the second training coordinates;
• 01023 Based on the preset loss function, calculate the first loss value according to the first preset coordinates corresponding to the first type of samples and the first training coordinates, and calculate the second loss value according to the second preset coordinates corresponding to the second type of samples and the second training coordinates;
  • 01024 Adjust the detection model according to the first loss value and the second loss value until the detection model converges.
• The training module 15 is also used to: input the first type of samples into the gaze point detection module to output the first training coordinates; input the second type of samples into the gaze point detection module and the correction module to output the second training coordinates; based on the preset loss function, calculate the first loss value according to the first preset coordinates corresponding to the first type of samples and the first training coordinates, and calculate the second loss value according to the second preset coordinates corresponding to the second type of samples and the second training coordinates; and adjust the detection model according to the first loss value and the second loss value until the detection model converges. That is to say, Step 01021 to Step 01024 can be executed by the training module 15.
• The processor 60 is also used to: input the first type of samples into the gaze point detection module to output the first training coordinates; input the second type of samples into the gaze point detection module and the correction module to output the second training coordinates; based on the preset loss function, calculate the first loss value according to the first preset coordinates corresponding to the first type of samples and the first training coordinates, and calculate the second loss value according to the second preset coordinates corresponding to the second type of samples and the second training coordinates; and adjust the detection model according to the first loss value and the second loss value until the detection model converges. That is to say, Step 01021 to Step 01024 may be executed by the processor 60.
• the detection model 50 includes a gaze point detection module 51 and a correction module 52.
• The training sample set is input into the detection model. The first type of samples are input into the gaze point detection module 51, which outputs the first training coordinates directly, since the posture information of the first-type training samples is less than the preset threshold. The second type of training samples are input into both the gaze point detection module 51 and the correction module 52: the gaze point detection module 51 outputs reference training coordinates, the correction module 52 outputs the correction parameters, and the reference training coordinates are corrected according to the correction parameters to output the second training coordinates.
  • each training sample has a corresponding preset coordinate
  • the preset coordinate represents the actual gaze information of the training sample
  • the first type of training sample corresponds to the first preset coordinate
  • the second type of preset sample corresponds to the second preset coordinate
• The processor 60 can calculate the first loss value based on the preset loss function, the first training coordinates and the first preset coordinates, and then adjust the gaze point detection module 51 based on the first loss value so that the first training coordinates output by the gaze point detection module 51 gradually approach the first preset coordinates until convergence. Likewise, the processor 60 can calculate the second loss value based on the preset loss function, the second training coordinates and the second preset coordinates, and then simultaneously adjust the gaze point detection module 51 and the correction module 52 based on the second loss value so that the second training coordinates output by the detection model gradually approach the second preset coordinates until convergence.
• In the loss function, loss is the loss value, N is the number of training samples contained in each training sample set, X and Y are the training coordinates (such as the first training coordinates or the second training coordinates), and Gx and Gy are the preset coordinates (such as the first preset coordinates or the second preset coordinates). When the training coordinates are the gaze direction, X and Y represent the pitch angle and the yaw angle respectively; when the training coordinates are the gaze point coordinates, X and Y represent the gaze point coordinates in the plane of the display screen 40. The first loss value and the second loss value can thus be calculated quickly.
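The formula itself is not reproduced in the text; given the symbols defined above, one plausible reading is a mean Euclidean distance between the training and preset coordinates over a batch of N samples. The sketch below is that assumption, not the application's exact loss:

```python
import math

def batch_loss(training_coords, preset_coords):
    """Assumed loss: mean Euclidean distance between predicted (X, Y)
    and target (Gx, Gy) pairs over a batch of N samples.  The averaging
    and the distance form are assumptions; the source only names the
    symbols loss, N, X, Y, Gx and Gy."""
    n = len(training_coords)
    return sum(
        math.hypot(x - gx, y - gy)
        for (x, y), (gx, gy) in zip(training_coords, preset_coords)
    ) / n

print(batch_loss([(0, 0), (3, 4)], [(0, 0), (0, 0)]))  # mean of 0 and 5
```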
  • the processor 60 can adjust the detection model according to the first loss value and the second loss value, so that the gradient of the detection model decreases continuously, so that the training coordinates are getting closer to the preset coordinates, and finally the detection model is trained to convergence.
• When, over N consecutive training batches, the first difference between the first loss values corresponding to any two first-type samples and the second difference between the second loss values corresponding to any two second-type samples are both less than a predetermined difference threshold, the detection model is determined to have converged, where N is a positive integer greater than 1. That is, if during N consecutive batches the first loss value and the second loss value remain essentially unchanged, they have reached their limit and the detection model can be determined to have converged. Alternatively, the detection model is determined to have converged when both the first loss value and the second loss value are less than a predetermined loss threshold.
  • the detection model is trained to converge through the first type of training samples and the second type of training samples, so as to ensure that the detection model can still output accurate gaze information according to the face information when the face is deflected.
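The convergence criteria just described (loss values essentially unchanged over N consecutive batches, or below a predetermined loss threshold) could be checked as follows; the function and parameter names are illustrative:

```python
def has_converged(loss_history, n, diff_threshold=None, loss_threshold=None):
    """loss_history: per-batch loss values, most recent last.
    Converged if the latest loss is below loss_threshold, or if the
    spread of the last n losses is below diff_threshold (losses no
    longer changing over n consecutive batches)."""
    if loss_threshold is not None and loss_history and loss_history[-1] < loss_threshold:
        return True
    if diff_threshold is not None and len(loss_history) >= n:
        recent = loss_history[-n:]
        return max(recent) - min(recent) < diff_threshold
    return False

print(has_converged([0.9, 0.31, 0.30, 0.30], n=3, diff_threshold=0.05))
```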
  • control method of the electronic device 100 in the embodiment of the present application includes the following steps:
  • 021 Determine the posture information of the face according to the face information, and determine the reference gaze point coordinates according to the face information;
  • the control device 20 in the embodiment of the present application includes an acquisition module 21 , a first determination module 22 , a second determination module 23 and a control module 24 .
  • Acquisition module 21 is used for determining the posture information of people's face according to people's face information, determines the coordinates of reference point of gaze according to people's face information;
  • the first determining module 22 is used for determining correction parameters according to posture information in response to posture information being greater than a preset threshold;
  • the second determination module 23 is used to determine the gaze information according to the coordinates of the reference gaze point and the correction parameters;
  • the control module 24 is used to control the electronic device 100 according to the gaze information. That is to say, step 021 can be performed by the acquisition module 21 , step 023 can be performed by the first determination module 22 , step 025 can be performed by the second determination module 23 and step 027 can be performed by the control module 24 .
  • the electronic device 100 in the embodiment of the present application includes a processor 60 and a collection device 30 .
• The acquisition device 30 is used to collect face information at a predetermined frame rate (the face information comprises face images, such as a visible light image, an infrared image and a depth image of the face).
• The acquisition device 30 can be one or more of a visible light camera, an infrared camera and a depth camera, wherein the visible light camera collects visible light face images, the infrared camera collects infrared face images, and the depth camera collects depth face images.
• When the acquisition device 30 includes a visible light camera, an infrared camera and a depth camera, the acquisition device 30 simultaneously collects a visible light face image, an infrared face image and a depth face image.
• The processor 60 may include an ISP, an NPU and an AP. When the control device 20 is arranged in the electronic device 100, the acquisition module 21 is arranged in the ISP and the NPU, and the processor 60 is connected with the acquisition device 30. After the acquisition device 30 collects the face image, the ISP can process the face image to determine the posture information of the face according to the face information, and the NPU can determine the reference gaze point coordinates according to the face information. The first determination module 22 and the second determination module 23 can be arranged in the NPU, and the control module 24 can be arranged in the AP.
• The processor 60 (specifically, the ISP and the NPU) is used to obtain the face information and the posture information; the processor 60 (specifically, the NPU) is also used to determine the correction parameters according to the posture information in response to the posture information being greater than the preset threshold, and to determine the gaze information according to the reference gaze point coordinates and the correction parameters; the processor 60 (specifically, the AP) can also be used to control the electronic device 100 according to the gaze information. That is to say, step 021 can be executed by the acquisition device 30 in cooperation with the processor 60, and step 023, step 025 and step 027 can be executed by the processor 60.
• For the manner of determining the gaze information, that is, Step 021, Step 023 and Step 025, please refer to the descriptions of Step 011, Step 013 and Step 015 respectively; details are not repeated here.
  • the electronic device 100 can be controlled according to the gaze direction and gaze point coordinates.
  • a three-dimensional coordinate system is established with the midpoint of the eyes as the origin O1, the X1 axis is parallel to the direction of the line connecting the centers of the eyes, the Y1 axis is located on the horizontal plane and perpendicular to the X1 axis, and the Z1 axis is perpendicular to the X1 axis and Y1 axis.
  • the three-axis rotation angle of the line of sight S and the three-dimensional coordinate system indicates the user's gaze direction.
  • the gaze direction includes pitch angle, roll angle and yaw angle respectively.
• The pitch angle represents the rotation angle around the X1 axis, the roll angle represents the rotation angle around the Y1 axis, and the yaw angle represents the rotation angle around the Z1 axis.
• The processor 60 can perform page turning or sliding of the display content of the electronic device 100 according to the gaze direction. For example, the change of the gaze direction can be determined from the gaze directions of multiple consecutive frames of human eye area images (such as 10 consecutive frames). Referring to FIG. 11 and FIG. 12, when the pitch angle gradually increases (that is, the line of sight S tilts upward), it can be determined that the user wants the displayed content to slide up or the page to turn down; conversely, when the pitch angle gradually decreases (that is, the line of sight S tilts downward), it can be determined that the user wants the displayed content to slide down or the page to turn up.
• Sliding or page turning of the electronic device 100 can also be realized according to the gaze point coordinates.
  • the center of the display screen 40 can be used as the coordinate origin O2 to establish a plane coordinate system
  • the width direction parallel to the electronic device 100 is used as the X2 axis
  • the length direction parallel to the electronic device 100 is used as the Y2 axis
  • the gaze point coordinates include the abscissa (corresponding to the position on the X2 axis) and the ordinate (corresponding to the position on the Y2 axis).
• If the ordinate gradually increases, the gaze point M moves up, and it can be determined that the user wants the displayed content to slide up or the page to turn down; if the ordinate gradually decreases, the gaze point M moves down, and it can be determined that the user wants the displayed content to slide down or the page to turn up.
• The processor 60 can also determine the change speed of the gaze direction over 10 consecutive frames (for example, from the difference between the pitch angles of the first frame and the tenth frame, or the difference between the ordinates of the gaze point M, together with the elapsed duration); the faster the change speed, the more new display content is shown after sliding.
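The gaze-direction page-turning logic above (pitch angle rising over consecutive frames means slide up or page down, falling means slide down or page up) might be sketched like this; the dead-zone threshold and action names are hypothetical values:

```python
def scroll_action(pitch_angles, min_change=5.0):
    """Decide a sliding/page action from the pitch angles (in degrees)
    of consecutive frames, e.g. 10 frames as in the text.  Increasing
    pitch (line of sight tilting upward) maps to slide up / page down;
    decreasing pitch maps to slide down / page up.  min_change is an
    illustrative dead zone to ignore small jitters."""
    change = pitch_angles[-1] - pitch_angles[0]
    if change > min_change:
        return "slide_up_or_page_down"
    if change < -min_change:
        return "slide_down_or_page_up"
    return "none"

print(scroll_action([0, 2, 5, 9, 12]))
```

The magnitude of `change` divided by the elapsed duration would give the change speed used to scale how much new content is shown.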
• After the user has not looked at the display screen 40 for a predetermined duration (such as 10 s, 20 s, etc.), the screen can be turned off.
• With the control method, the control device 20 and the electronic device 100, after the face information and the posture information are obtained, if the posture information is greater than the preset threshold, which would affect the calculation accuracy of the gaze point coordinates, the reference gaze point coordinates are first calculated according to the face information and the correction parameters are then calculated according to the posture information. The reference gaze point coordinates can then be corrected according to the correction parameters to obtain accurate gaze information, preventing an excessive shooting angle of the acquired face information from impairing gaze detection and thereby improving the accuracy of gaze detection.
  • the control accuracy of the electronic device 100 can be improved.
  • the face information includes a face mask, a left-eye image and a right-eye image, and the face mask is used to indicate the position of the face in the image
  • Step 021 Calculate the reference gaze point coordinates according to the face information, including:
  • 0212 Calculate the coordinates of the reference gaze point according to the position information, the left-eye image and the right-eye image.
  • the first determining module 22 is also used to calculate the position information of the face relative to the electronic device 100 according to the face mask; and calculate the reference gaze point coordinates according to the position information, the left-eye image and the right-eye image. That is to say, step 0211 and step 0212 can be executed by the first determination module 22 .
  • the processor 60 is further configured to calculate the position information of the face relative to the electronic device 100 according to the face mask; and calculate the reference gaze point coordinates according to the position information, the left-eye image and the right-eye image. That is to say, step 0211 and step 0212 can be executed by the processor 60 .
  • step 0231 and step 0232 please refer to step 0131 and step 0132 respectively, and details are not repeated here.
  • face information includes face feature points
  • attitude information includes attitude angle and three-dimensional coordinate offset
  • correction parameters include rotation matrix and translation matrix
  • step 021 Determine the pose information of the face based on the face information, including:
  • Step 023 includes:
  • 0231 Calculate the rotation matrix according to the attitude angle, and calculate the translation matrix according to the three-dimensional coordinate offset.
  • the first determination module 22 is also used to calculate the attitude angle and three-dimensional coordinate offset according to the face feature points; calculate the rotation matrix according to the attitude angle, and calculate the translation matrix according to the three-dimensional coordinate offset. That is to say, step 0233 and step 0234 can be executed by the first determination module 22 .
  • the processor 60 is further configured to calculate an attitude angle and a three-dimensional coordinate offset based on facial feature points; calculate a rotation matrix based on the attitude angle, and calculate a translation matrix based on the three-dimensional coordinate offset. That is to say, step 0233 and step 0234 can be executed by the processor 60 .
  • Step 0233 and Step 0234 please refer to Step 0133 and Step 0134 respectively, which will not be repeated here.
  • the gaze information includes gaze point coordinates
  • the control method also includes:
  • Step 027 Control the electronic device 100 according to the gaze information, including:
• The control module 24 is also used to acquire the captured image within the first predetermined duration before the screen turns off, and, in response to the captured image containing a human face and the gaze point coordinates being located in the display area of the display screen 40, keep the screen on for a second predetermined duration. That is to say, step 0201, step 0202 and step 0271 can be executed by the control module 24.
• The processor 60 is further configured to acquire the captured image within the first predetermined duration before the screen turns off, and, in response to the captured image containing a human face and the gaze point coordinates being located in the display area of the display screen 40, keep the screen on for a second predetermined duration. That is to say, step 0201, step 0202 and step 0271 may be executed by the processor 60.
• the gaze information can be used to realize screen-off control.
  • gaze detection is first performed.
• the processor 60 first acquires a captured image. If there is a human face in the captured image, the gaze information is determined according to the captured image.
  • the first predetermined time period such as 5 seconds, 10 seconds, etc.
• When the gaze point M is located within the display area of the display screen 40, it can be determined that the user is looking at the display screen 40, so the display screen 40 remains on for a second predetermined duration, which can be greater than the first predetermined duration. Within the first predetermined duration before the screen would turn off again, the captured image is acquired once more, so that the screen remains bright while the user looks at the display screen 40 and turns off once the user no longer looks at it.
  • the center of the display area can be used as the coordinate origin O2 to establish a two-dimensional coordinate system parallel to the display screen 40.
• After the coordinate system is established, the abscissa range and the ordinate range of the display area can be determined; when the gaze point coordinates are within the preset coordinate range (that is, the abscissa of the gaze point coordinates is within the abscissa range and the ordinate is within the ordinate range), the gaze point coordinates are located in the display area, so determining whether the user gazes at the display screen 40 is relatively simple.
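With the origin O2 at the center of the display area, the containment test described above reduces to comparing absolute coordinates against half the display dimensions; a minimal sketch with hypothetical names:

```python
def gaze_in_display(gaze_point, width, height):
    """True if gaze point (x, y), expressed in the screen-centred
    coordinate system described above, falls inside a display area of
    the given width and height (abscissa within the abscissa range,
    ordinate within the ordinate range)."""
    x, y = gaze_point
    return abs(x) <= width / 2 and abs(y) <= height / 2

print(gaze_in_display((10, -20), 80, 160))
print(gaze_in_display((50, 0), 80, 160))
```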
  • the gaze information includes gaze point coordinates
  • the control method also includes:
  • Step 027 includes:
• The control module 24 is also used to acquire the captured image when the electronic device 100 does not receive an input operation; in response to the captured image containing a human face and the gaze point coordinates being located in the display area, adjust the display brightness of the display screen 40 to a first predetermined brightness; and in response to the captured image not containing a human face, or containing a human face while the gaze point coordinates are outside the display area, adjust the display brightness to a second predetermined brightness, the second predetermined brightness being less than the first predetermined brightness. That is to say, step 0203, step 0204, step 0272 and step 0273 can be executed by the control module 24.
• The processor 60 is also configured to acquire the captured image when the electronic device 100 does not receive an input operation; in response to the captured image containing a human face and the gaze point coordinates being located in the display area, adjust the display brightness of the display screen 40 to a first predetermined brightness; and in response to the captured image not containing a human face, or containing a human face while the gaze point coordinates are outside the display area, adjust the display brightness to a second predetermined brightness, the second predetermined brightness being less than the first predetermined brightness. That is to say, step 0203, step 0204, step 0272 and step 0273 can be executed by the processor 60.
  • the gaze information can also be used to realize intelligent brightening of the screen.
• To save power, the electronic device 100 generally reduces the display brightness after the screen has been on for a certain period, keeps the screen lit at low brightness for a further period, and then turns the screen off.
  • the processor 60 can obtain the captured image. If the image contains a human face, the gaze information is calculated according to the captured image.
  • the display brightness is adjusted to the first predetermined brightness.
• The first predetermined brightness can be the brightness set by the user for normal display on the display screen 40, or it can change in real time according to the ambient light so as to adapt to it. This ensures that the screen stays bright while the user views the displayed content even without operating the electronic device 100, preventing the screen from suddenly turning off during viewing and affecting the user experience.
  • the display brightness can be adjusted to a second predetermined brightness, which is less than the first predetermined brightness, so as to avoid unnecessary power consumption.
  • the display brightness is adjusted back to the first predetermined brightness, so as to ensure the user's normal viewing experience. In this way, when the user is not operating the electronic device 100 but is looking at the display area, the display area is displayed at normal brightness; when the user is not looking at the display area, the brightness is reduced to save battery.
  • one or more non-transitory computer-readable storage media 300 containing a computer program 302: when the computer program 302 is executed by one or more processors 60, the processors 60 can execute the gaze detection method or the control method of the electronic device 100 in any one of the above embodiments.
  • processors 60 when the computer program 302 is executed by one or more processors 60, the processors 60 are made to perform the following steps:
  • 011 Determine the posture information of the face according to the face information, and determine the reference gaze point coordinates according to the face information;
  • 015 Determine the fixation information according to the reference fixation point coordinates and correction parameters.
  • processors 60 may also perform the following steps:
  • 021 Determine the posture information of the face according to the face information, and determine the reference gaze point coordinates according to the face information;
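The gaze-driven brightness policy described in the bullets above can be sketched as a small decision function. This is a minimal illustrative sketch, not the disclosed implementation; the function name and the two brightness values are assumptions, chosen only so that the second predetermined brightness is less than the first.

```python
# Hypothetical sketch of the brightness policy when no input operation
# has been received. The concrete brightness values are assumptions.

FIRST_PREDETERMINED = 0.8   # normal viewing brightness (fraction of max)
SECOND_PREDETERMINED = 0.3  # dimmed brightness, less than FIRST_PREDETERMINED

def choose_brightness(has_face: bool, gaze_in_display: bool) -> float:
    """Return the target display brightness for the current captured image."""
    if has_face and gaze_in_display:
        # Face detected and gaze point inside the display area:
        # keep the screen at normal brightness.
        return FIRST_PREDETERMINED
    # No face, or face present but gaze point outside the display area: dim.
    return SECOND_PREDETERMINED
```

When the user looks back at the display area, the same function returns the first predetermined brightness again, matching the "adjust back" behaviour in the text.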

Abstract

A gaze detection method, a control method for an electronic device (100), and a detection apparatus (10), a control apparatus (20), an electronic device (100) and a non-volatile computer-readable storage medium (300). The gaze detection method comprises: determining posture information of a human face according to facial information, and determining reference gaze point coordinates according to the facial information (011); in response to the posture information being greater than a preset threshold, determining a correction parameter according to the posture information (013); and determining gaze information according to the reference gaze point coordinates and the correction parameter (015).

Description

Gaze detection method, control method for electronic device, and related devices
Priority Information
This application claims priority to and the benefit of Chinese Patent Application No. 202111271397.4, filed with the China National Intellectual Property Administration on October 29, 2021, which is hereby incorporated by reference in its entirety.
Technical Field
The present application relates to the technical field of consumer electronics, and in particular to a gaze detection method, a control method for an electronic device, a detection device, a control device, an electronic device, and a non-volatile computer-readable storage medium.
Background
Currently, electronic devices can estimate a user's gaze point by collecting face images.
Summary of the Invention
The present application provides a gaze detection method, a control method for an electronic device, a detection device, a control device, an electronic device, and a non-volatile computer-readable storage medium.
A gaze detection method according to one embodiment of the present application includes: determining pose information of a human face according to face information, and determining reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determining a correction parameter according to the pose information; and determining gaze information according to the reference gaze point coordinates and the correction parameter.
A detection device according to one embodiment of the present application includes a first determination module, a second determination module, and a third determination module. The first determination module is configured to determine pose information of a human face according to face information and determine reference gaze point coordinates according to the face information; the second determination module is configured to determine a correction parameter according to the pose information in response to the pose information being greater than a preset threshold; and the third determination module is configured to determine gaze information according to the reference gaze point coordinates and the correction parameter.
An electronic device according to one embodiment of the present application includes a processor. The processor is configured to determine pose information of a human face according to face information and determine reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determine a correction parameter according to the pose information; and determine gaze information according to the reference gaze point coordinates and the correction parameter.
With the gaze detection method, detection device, and electronic device of the present application, after the face information is acquired, the face pose is first calculated from the face information. When the pose information is greater than the preset threshold and would therefore affect the accuracy of the gaze point calculation, the reference gaze point coordinates are first calculated from the face information, and the correction parameter is then calculated from the pose information, so that the reference gaze point coordinates can be corrected according to the correction parameter. This prevents an excessively large face shooting angle in the acquired face information from degrading gaze detection, and thus improves the accuracy of gaze detection.
A control method for an electronic device according to an embodiment of the present application includes: determining pose information of a human face according to face information, and determining reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determining a correction parameter according to the pose information; determining gaze information according to the reference gaze point coordinates and the correction parameter; and controlling the electronic device according to the gaze information.
A control device according to an embodiment of the present application includes an acquisition module, a first determination module, and a second determination module. The acquisition module is configured to determine pose information of a human face according to face information and determine reference gaze point coordinates according to the face information; the first determination module is configured to determine a correction parameter according to the pose information in response to the pose information being greater than a preset threshold; and the second determination module is configured to determine gaze information according to the reference gaze point coordinates and the correction parameter.
An electronic device according to another embodiment of the present application includes a processor. The processor is configured to determine pose information of a human face according to face information and determine reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determine a correction parameter according to the pose information; determine gaze information according to the reference gaze point coordinates and the correction parameter; and control the electronic device according to the gaze information.
An embodiment of the present application provides a non-volatile computer-readable storage medium containing a computer program. When the computer program is executed by one or more processors, the processors are caused to execute the gaze detection method or the control method. The gaze detection method includes: determining pose information of a human face according to face information, and determining reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determining a correction parameter according to the pose information; and determining gaze information according to the reference gaze point coordinates and the correction parameter. The control method for the electronic device includes: determining pose information of a human face according to face information, and determining reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determining a correction parameter according to the pose information; determining gaze information according to the reference gaze point coordinates and the correction parameter; and controlling the electronic device according to the gaze information.
Additional aspects and advantages of the present application will be set forth in part in the following description, and in part will become apparent from the following description or be learned through practice of the present application.
Brief Description of the Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application or in the prior art, the drawings required in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a schematic flowchart of a gaze detection method according to some embodiments of the present application;
FIG. 2 is a schematic block diagram of a detection device according to some embodiments of the present application;
FIG. 3 is a schematic plan view of an electronic device according to some embodiments of the present application;
FIG. 4 is a schematic diagram of the connection between an electronic device and a cloud server according to some embodiments of the present application;
FIG. 5 to FIG. 7 are schematic flowcharts of gaze detection methods according to some embodiments of the present application;
FIG. 8 is a schematic structural diagram of a detection model according to some embodiments of the present application;
FIG. 9 is a schematic flowchart of a control method for an electronic device according to some embodiments of the present application;
FIG. 10 is a schematic block diagram of a control device according to some embodiments of the present application;
FIG. 11 to FIG. 14 are schematic diagrams of scenarios of control methods according to some embodiments of the present application;
FIG. 15 and FIG. 16 are schematic flowcharts of control methods according to some embodiments of the present application;
FIG. 17 and FIG. 18 are schematic diagrams of scenarios of control methods according to some embodiments of the present application;
FIG. 19 is a schematic flowchart of a control method according to some embodiments of the present application; and
FIG. 20 is a schematic diagram of the connection between a processor and a computer-readable storage medium according to some embodiments of the present application.
Detailed Description
Embodiments of the present application are further described below with reference to the accompanying drawings. The same or similar reference numerals in the drawings denote the same or similar elements, or elements having the same or similar functions, throughout. In addition, the embodiments of the present application described below with reference to the accompanying drawings are exemplary, are intended only to explain the embodiments of the present application, and should not be construed as limiting the present application.
The gaze detection method of the present application includes: determining pose information of a human face according to face information, and determining reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determining a correction parameter according to the pose information; and determining gaze information according to the reference gaze point coordinates and the correction parameter.
In some embodiments, the gaze detection method further includes: in response to the pose information being less than the preset threshold, calculating the reference gaze point coordinates according to the face information as the gaze information.
In some embodiments, the pose information includes a pose angle, and the pose angle includes a pitch angle and a yaw angle. Judging, according to the face information, whether the pose information of the face is greater than the preset threshold includes: judging, according to the face information, whether the pitch angle or the yaw angle is greater than the preset threshold.
In some embodiments, the gaze detection method further includes: acquiring a training sample set, the training sample set including first-class samples in which the pose information of the face is less than the preset threshold and second-class samples in which the pose information of the face is greater than the preset threshold; and training a preset detection model according to the first-class samples and the second-class samples. Determining the correction parameter according to the pose information includes: determining the correction parameter according to the pose information based on the detection model.
In some embodiments, the detection model includes a gaze point detection module and a correction module, and training the detection model according to the first-class samples and the second-class samples includes: inputting the first-class samples into the gaze point detection module to output first training coordinates; inputting the second-class samples into the gaze point detection module and the correction module to output second training coordinates; based on a preset loss function, calculating a first loss value according to first preset coordinates corresponding to the first-class samples and the first training coordinates, and calculating a second loss value according to second preset coordinates corresponding to the second-class samples and the second training coordinates; and adjusting the detection model according to the first loss value and the second loss value until the detection model converges.
In some embodiments, the detection model is determined to have converged when, over N consecutive batches of training samples, a first difference between the first loss values corresponding to any two first-class samples and a second difference between the second loss values corresponding to any two second-class samples are both less than a predetermined difference threshold, where N is a positive integer greater than 1; or the detection model is determined to have converged when the first loss value and the second loss value are both less than a predetermined loss threshold.
In some embodiments, the face information includes a face mask, a left-eye image, and a right-eye image, the face mask being used to indicate the position of the face in the image. Calculating the reference gaze point coordinates according to the face information includes: calculating position information of the face relative to the electronic device according to the face mask; and calculating the reference gaze point coordinates according to the position information, the left-eye image, and the right-eye image.
In some embodiments, the face information includes face feature points, the pose information includes a pose angle and a three-dimensional coordinate offset, and the correction parameter includes a rotation matrix and a translation matrix. Determining the pose information of the face according to the face information includes: calculating the pose angle and the three-dimensional coordinate offset according to the face feature points. Calculating the correction parameter according to the pose information includes: calculating the rotation matrix according to the pose angle, and calculating the translation matrix according to the three-dimensional coordinate offset.
The control method for an electronic device of the present application includes: determining pose information of a human face according to face information, and determining reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determining a correction parameter according to the pose information; determining gaze information according to the reference gaze point coordinates and the correction parameter; and controlling the electronic device according to the gaze information.
In some embodiments, the control method further includes: in response to the pose information being less than the preset threshold, calculating the reference gaze point coordinates according to the face information as the gaze information.
In some embodiments, the face information includes a face mask, a left-eye image, and a right-eye image, the face mask being used to indicate the position of the face in the image. Calculating the reference gaze point coordinates according to the face information includes: calculating position information of the face relative to the electronic device according to the face mask; and calculating the reference gaze point coordinates according to the position information, the left-eye image, and the right-eye image.
In some embodiments, the face information includes face feature points, the pose information includes a pose angle and a three-dimensional coordinate offset, and the correction parameter includes a rotation matrix and a translation matrix. Calculating the correction parameter according to the pose information includes: calculating the pose angle and the three-dimensional coordinate offset according to the face feature points; and calculating the rotation matrix according to the pose angle and the translation matrix according to the three-dimensional coordinate offset.
In some embodiments, the gaze information includes gaze point coordinates. Before determining the pose information of the face according to the face information and determining the reference gaze point coordinates according to the face information, the control method further includes: acquiring a captured image within a first predetermined duration before the screen is turned off, in response to the captured image containing face information. Controlling the electronic device according to the gaze information includes: in response to the gaze point coordinates being located in a display area of the display screen, keeping the screen lit for a second predetermined duration.
In some embodiments, the display area is associated with a preset coordinate range, and the control method further includes: determining that the gaze point coordinates are located in the display area when the gaze point coordinates are within the preset coordinate range.
In some embodiments, before determining the pose information of the face according to the face information and determining the reference gaze point coordinates according to the face information, the control method further includes: acquiring a captured image in response to the electronic device not receiving an input operation. Controlling the electronic device according to the gaze information includes: in response to the captured image containing a human face and the gaze point coordinates being located in the display area, adjusting the display brightness of the display screen to a first predetermined brightness; and in response to the captured image not containing a human face, or the captured image containing a human face with the gaze point coordinates outside the display area, adjusting the display brightness to a second predetermined brightness, the second predetermined brightness being less than the first predetermined brightness.
The detection device of the present application includes a first determination module, a second determination module, and a third determination module. The first determination module is configured to determine pose information of a human face according to face information and determine reference gaze point coordinates according to the face information; the second determination module is configured to determine a correction parameter according to the pose information in response to the pose information being greater than a preset threshold; and the third determination module is configured to determine gaze information according to the reference gaze point coordinates and the correction parameter.
The control device of the present application includes an acquisition module, a first determination module, a second determination module, and a control module. The acquisition module is configured to determine pose information of a human face according to face information and determine reference gaze point coordinates according to the face information; the first determination module is configured to determine a correction parameter according to the pose information in response to the pose information being greater than a preset threshold; the second determination module is configured to determine gaze information according to the reference gaze point coordinates and the correction parameter; and the control module is configured to control the electronic device according to the gaze information.
The electronic device of the present application includes a processor. The processor is configured to determine pose information of a human face according to face information and determine reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determine a correction parameter according to the pose information; and determine gaze information according to the reference gaze point coordinates and the correction parameter.
The electronic device of the present application includes a processor. The processor is configured to determine pose information of a human face according to face information and determine reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determine a correction parameter according to the pose information; determine gaze information according to the reference gaze point coordinates and the correction parameter; and control the electronic device according to the gaze information.
The non-volatile computer-readable storage medium of the present application includes a computer program. When the computer program is executed by a processor, the processor is caused to execute the gaze detection method of any of the above embodiments, or the control method for an electronic device of any of the above embodiments.
Referring to FIG. 1 to FIG. 3, the gaze detection method of the embodiments of the present application includes the following steps:
011: Determine pose information of a human face according to face information, and determine reference gaze point coordinates according to the face information;
013: In response to the pose information being greater than a preset threshold, determine a correction parameter according to the pose information; and
015: Determine gaze information according to the reference gaze point coordinates and the correction parameter.
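The three steps above can be sketched as one high-level function. This is a structural sketch only: `estimate_pose`, `estimate_reference_point`, and `make_correction` are placeholder callables standing in for the pose estimation, gaze point detection, and correction-parameter modules described in the text.

```python
def detect_gaze(face_info, threshold, estimate_pose,
                estimate_reference_point, make_correction):
    """Steps 011/013/015: correct the reference gaze point only when the
    face pose exceeds the preset threshold."""
    pose = estimate_pose(face_info)               # step 011: pose information
    ref = estimate_reference_point(face_info)     # step 011: reference point
    if pose > threshold:                          # step 013
        correct = make_correction(pose)           # correction parameter
        ref = correct(ref)                        # step 015: corrected point
    return ref                                    # gaze information
```

When the pose is below the threshold, the reference gaze point coordinates are returned unchanged as the gaze information, matching the small-pose branch described elsewhere in the document.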
The detection device 10 of the embodiments of the present application includes a first determination module 11, a second determination module 12, and a third determination module 13. The first determination module 11 is configured to determine pose information of a human face according to face information and determine reference gaze point coordinates according to the face information; the second determination module 12 is configured to determine a correction parameter according to the pose information in response to the pose information being greater than a preset threshold; and the third determination module 13 is configured to determine gaze information according to the reference gaze point coordinates and the correction parameter. That is to say, step 011 can be implemented by the first determination module 11, step 013 can be executed by the second determination module 12, and step 015 can be executed by the third determination module 13.
The electronic device 100 of the embodiments of the present application includes a processor 60 and a collection device 30. The collection device 30 is configured to collect face information at a predetermined frame rate (the face information may include face images, such as a visible-light image, an infrared image, or a depth image of the face). The collection device 30 may be one or more of a visible-light camera, an infrared camera, and a depth camera, where the visible-light camera can collect visible-light face images, the infrared camera can collect infrared face images, and the depth camera can collect depth face images. In this embodiment, the collection device 30 includes a visible-light camera, an infrared camera, and a depth camera, and collects visible-light face images, infrared face images, and depth face images simultaneously. The processor 60 may include an image signal processor (ISP), a neural-network processing unit (NPU), and an application processor (AP). The detection device 10 is arranged in the electronic device 100, where the first determination module 11 may be arranged in the ISP and the NPU. The processor 60 is connected to the collection device 30; after the collection device 30 collects a face image, the ISP can process the face image to obtain the pose information of the face, and the NPU can determine the reference gaze point coordinates according to the face information. The second determination module 12 and the third determination module 13 may be arranged in the NPU. The processor 60 (specifically, the ISP and the NPU) is configured to determine the pose information of the face according to the face information; the processor 60 (specifically, the NPU) is further configured to, in response to the pose information being greater than the preset threshold, determine the correction parameter according to the pose information, and determine the gaze information according to the reference gaze point coordinates and the correction parameter. That is to say, step 011, step 013, and step 015 can be executed by the processor 60.
The electronic device 100 may be a mobile phone, a smart watch, a tablet computer, a display device, a notebook computer, an automated teller machine, a gate machine, a head-mounted display device, a game console, or the like. As shown in FIG. 3, the embodiments of the present application are described by taking the electronic device 100 being a mobile phone as an example; it can be understood that the specific form of the electronic device 100 is not limited to a mobile phone.
Specifically, when the user uses the electronic device 100, the collection device 30 may collect the user's face information once every predetermined time interval, so that gaze detection is performed on the user continuously while keeping the power consumption of the electronic device 100 low. Alternatively, face information may be collected at a predetermined frame rate (for example, 10 frames per second) only when the user is using an application that requires gaze detection (such as browser software, forum software, or video software), so that face information is collected only when gaze detection is actually needed, minimizing the power consumption of gaze detection.
Referring to FIG. 4, after the face information (taking a face image as an example) is obtained, the processor 60 may recognize the face image. For example, the processor 60 may compare the face image with a preset face template, so as to determine the face in the face image and the image regions where different parts of the face (such as the eyes and the nose) are located, thereby obtaining a face region image. The preset face template may be stored in the memory of the electronic device 100, and the processor 60 may perform face recognition in a Trusted Execution Environment (TEE) to protect the user's privacy. Alternatively, the preset face template may be stored in a cloud server 200, and the electronic device 100 then sends the face image to the cloud server 200 for comparison to determine the face region image; handing face recognition over to the cloud server 200 for processing can reduce the processing load of the electronic device 100 and improve image processing efficiency. The processor 60 may then recognize the face region image to determine the pose information of the face. More specifically, the face and different parts of the face can be recognized according to the shape features of the face and its parts, so as to obtain the face information, which may include images of different parts, such as a face region image and an eye region image.
After the face information is obtained, the pose information of the face can be calculated from it. The pose information can be calculated by extracting features from the face image and using the position coordinates of the extracted feature points. For example, the tip of the nose, the centers of the left and right eyes, and the left and right corners of the mouth may be used as feature points; as the pose of the face changes, the position coordinates of these feature points change accordingly. For example, a three-dimensional coordinate system may be established with the tip of the nose as the origin, and the pitch angle, horizontal rotation angle, and tilt angle of the face respectively represent the rotation angles of the face about the three coordinate axes of this coordinate system. Taking the horizontal rotation angle relative to the pose of squarely facing the display screen 40 of the electronic device 100 as an example, the larger the deflection angle (i.e., the horizontal rotation angle) of the face, the closer together the two feature points corresponding to the left and right eyes become. Therefore, the pose information of the face can be accurately calculated from the position coordinates of the feature points.
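As an illustrative sketch (not part of the original disclosure), the relationship described above between the apparent inter-eye distance and the horizontal rotation angle can be turned into a rough yaw estimate under a weak-perspective assumption; the function name, inputs, and calibration value are all assumptions for illustration:

```python
import math

def estimate_yaw_deg(left_eye, right_eye, frontal_eye_dist):
    """Rough yaw estimate from the apparent inter-eye distance.

    left_eye / right_eye: (x, y) image coordinates of the eye centers.
    frontal_eye_dist: inter-eye distance (same units) measured when the
    face squarely faces the screen, e.g. from a calibration frame.
    """
    observed = math.dist(left_eye, right_eye)
    # Under a weak-perspective model the projected eye distance shrinks
    # with cos(yaw); clamp the ratio to guard against measurement noise.
    ratio = max(-1.0, min(1.0, observed / frontal_eye_dist))
    return math.degrees(math.acos(ratio))

# A face turned away shows a smaller eye distance, hence a larger yaw.
print(round(estimate_yaw_deg((100, 200), (160, 200), 60.0), 1))  # 0.0
print(round(estimate_yaw_deg((100, 200), (130, 200), 60.0), 1))  # 60.0
```

In practice the pose angles would come from a full landmark-based pose solver rather than a single distance ratio; this sketch only illustrates why the left-eye/right-eye feature points carry yaw information.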
It can be understood that different poses of the face affect the user's gaze direction and gaze point coordinates. Therefore, after the reference gaze point coordinates are calculated from the face information, the correction parameters can be determined according to the pose information so as to correct the gaze point detection error caused by pose changes, making the gaze information obtained from the reference gaze point coordinates and the correction parameters more accurate.
When calculating the reference gaze point coordinates from the face information, the processor 60 may calculate them directly from the face region image, or perform feature point recognition on the face region image and calculate the reference gaze point coordinates from the feature points, which requires relatively little computation. Alternatively, the processor 60 may obtain both the face region image and the eye region image, perform feature point recognition on the face region image, and calculate the reference gaze point coordinates jointly from the feature points of the face region image and the eye region image, further improving the calculation accuracy of the reference gaze point coordinates while keeping the amount of computation small.
It can then be understood that the face information obtained when the user squarely faces the display screen 40 of the electronic device 100 is generally the most accurate. Therefore, the processor 60 may first determine whether the pose information is greater than a preset threshold. The pose information includes the pitch angle, roll angle, and yaw angle of the face; of course, since a change in the roll angle (rotation of the face in a plane parallel to the display screen 40) does not change the positions of the feature points on the face, it may be sufficient to determine only whether the pitch angle or the yaw angle is greater than 0 degrees.
For example, when the user's pose squarely faces the display screen 40 of the electronic device 100, the pitch angle, roll angle, and yaw angle of the face are all 0 degrees; with a preset threshold of 0 degrees, it can be determined that the reference gaze point coordinates need to be corrected when the pose information is greater than 0 degrees (for example, when the pitch angle or the yaw angle is greater than 0 degrees). Since the pitch angle, roll angle, and yaw angle are directional and may take negative values, which would affect the accuracy of the judgment, the absolute value of the pose information may be compared with the preset threshold when determining whether the pose information is greater than the preset threshold.
In this case the processor 60 calculates the correction parameters according to the pose information. For example, when the reference gaze point coordinates include only the two-dimensional coordinates of the line of sight on the display screen 40, the correction parameters include coordinate correction coefficients, and the gaze information can be obtained from the reference gaze point coordinates and the coordinate correction coefficients: if the reference gaze point coordinates are (x, y) and the coordinate correction coefficients are a and b, the gaze information is (ax, by). Alternatively, when the reference gaze point coordinates include both the two-dimensional coordinates of the line of sight on the display screen 40 and the direction of the line of sight, the correction parameters include coordinate correction coefficients and direction correction coefficients: if the reference gaze point coordinates are (x, y), the direction of the line of sight is (α, β, γ), the coordinate correction coefficients are a and b, and the direction correction coefficients are c, d, and e, the gaze information is (ax, by, cα, dβ, eγ).
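The per-axis correction described above can be sketched directly; this is an illustrative helper (the function name and argument layout are assumptions, not part of the disclosure):

```python
def apply_correction(ref_gaze, coord_coeffs, dir_coeffs=None):
    """Apply per-axis correction coefficients to reference gaze values.

    ref_gaze: (x, y) or (x, y, alpha, beta, gamma) reference gaze values.
    coord_coeffs: (a, b) coordinate correction coefficients.
    dir_coeffs: optional (c, d, e) direction correction coefficients.
    Returns (ax, by) or (ax, by, c*alpha, d*beta, e*gamma).
    """
    a, b = coord_coeffs
    corrected = [a * ref_gaze[0], b * ref_gaze[1]]
    if dir_coeffs is not None:
        corrected += [k * v for k, v in zip(dir_coeffs, ref_gaze[2:])]
    return tuple(corrected)

print(apply_correction((100, 50), (2.0, 0.5)))  # (200.0, 25.0)
```

A 2D-only reference gaze point is corrected with two coefficients; when the line-of-sight direction is also present, three more coefficients scale the direction components.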
When the pose information is less than or equal to the preset threshold, the user is facing the display screen 40 or is deflected from it by only a small angle, so it can be determined that the reference gaze point coordinates do not need to be corrected. After calculating the reference gaze point coordinates, the processor 60 may directly take them as the final gaze information, saving the computation of the correction parameters.
Of course, to further reduce the amount of computation, the preset threshold can be set larger. For example, with a preset threshold of 5 degrees, the deflection of the face is so small that the detection accuracy of the gaze information is essentially unaffected, and the correction parameters need not be calculated. Alternatively, the preset threshold can be set according to the requirements on the gaze information: if the gaze information includes only the gaze direction and accurate gaze point coordinates are not needed, the preset threshold can be set larger, whereas if the gaze information includes the coordinates of the gaze point on the display screen 40, the preset threshold can be set smaller to ensure the detection accuracy of the gaze point.
After the gaze information is obtained, the electronic device 100 can be controlled according to it (the gaze direction and/or the gaze point coordinates). For example, the screen is kept on while the detected gaze point coordinates lie within the display area of the display screen 40, and is turned off after the gaze point coordinates have been detected outside the display area for a predetermined duration (such as 10 s or 20 s). Alternatively, operations such as page turning may be performed according to changes in the gaze direction.
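The keep-screen-on behavior described above can be sketched as a small state machine; the class, its members, and the injected clock are illustrative assumptions rather than the patent's implementation:

```python
import time

OFF_DELAY_S = 10.0  # predetermined duration; the text mentions 10 s or 20 s

class ScreenController:
    """Keep the screen on while the gaze point stays inside the display area."""

    def __init__(self, width, height, now=time.monotonic):
        self.width, self.height = width, height
        self.now = now                 # injectable clock, eases testing
        self.last_inside = now()       # last time the gaze was on-screen
        self.screen_on = True

    def on_gaze(self, x, y):
        """Feed one gaze point; returns whether the screen should be on."""
        inside = 0 <= x < self.width and 0 <= y < self.height
        if inside:
            self.last_inside = self.now()
            self.screen_on = True
        elif self.now() - self.last_inside >= OFF_DELAY_S:
            self.screen_on = False
        return self.screen_on
```

Each new gaze sample inside the display area resets the off-timer; only a sustained off-screen gaze turns the screen off.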
In the face images collected by the electronic device 100, the obtained face information may be insufficiently accurate due to factors such as the shooting angle, which affects the detection accuracy of the gaze point.
With the gaze detection method, the detection device 10, and the electronic device 100 of the present application, after the face information is obtained, the face pose is first calculated from it. When the pose information is greater than the preset threshold and would affect the calculation accuracy of the gaze point coordinates, the reference gaze point coordinates are first calculated from the face information, and the correction parameters are then calculated from the pose information, so that the reference gaze point coordinates can be corrected with the correction parameters. This prevents an excessively large shooting angle of the face in the obtained face information from impairing gaze detection, and can improve the accuracy of gaze detection.
Referring to FIG. 2, FIG. 3, and FIG. 5, in some embodiments, the face information includes a face mask, a left-eye image, and a right-eye image, the face mask being used to indicate the position of the face in the image. Step 011: calculating the reference gaze point coordinates according to the face information includes:
0111: Calculate the position information of the face relative to the electronic device 100 according to the face mask; and
0112: Calculate the reference gaze point coordinates according to the position information, the left-eye image, and the right-eye image.
In some embodiments, the first determination module 11 is configured to calculate the position information of the face relative to the electronic device 100 according to the face mask, and to calculate the reference gaze point coordinates according to the position information, the left-eye image, and the right-eye image. That is to say, step 0111 and step 0112 may be executed by the first determination module 11.
In some embodiments, the processor 60 is further configured to calculate the position information of the face relative to the electronic device 100 according to the face mask, and to calculate the reference gaze point coordinates according to the position information, the left-eye image, and the right-eye image. That is to say, step 0111 and step 0112 may be executed by the processor 60.
Specifically, before the gaze information is calculated, the processor 60 may first determine the face mask of the face image. The face mask is used to represent the position of the face in the face image and can be determined by recognizing the position of the face in the face image. The processor 60 can calculate the position information of the face relative to the electronic device 100 from the face mask (for example, the distance between the face and the electronic device 100 can be calculated from the proportion of the face image occupied by the face mask). It can be understood that when the distance between the face and the electronic device 100 changes, the gaze point coordinates of the human eyes still change even if the gaze direction does not. Therefore, when calculating the gaze information, in addition to the face image and/or the eye images (such as the left-eye image and the right-eye image), the position information can also be taken into account, so that the gaze point coordinates are calculated more accurately.
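The distance-from-mask-proportion idea can be sketched under a pinhole-camera assumption; the calibration inputs and scaling law here are illustrative assumptions, not the patent's stated method:

```python
import math

def estimate_distance_cm(mask_area_px, image_area_px, ref_ratio, ref_distance_cm):
    """Estimate the face-to-device distance from the face-mask proportion.

    mask_area_px / image_area_px: face-mask and full-image areas in pixels.
    ref_ratio / ref_distance_cm: the mask-to-image ratio observed at a known
    calibration distance. Under a pinhole-camera model the apparent face
    area scales with 1/distance^2, so distance scales with
    sqrt(ref_ratio / ratio).
    """
    ratio = mask_area_px / image_area_px
    return ref_distance_cm * math.sqrt(ref_ratio / ratio)
```

A face whose mask occupies a quarter of its calibrated proportion of the image is estimated to be about twice as far away.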
Referring again to FIG. 2, FIG. 3, and FIG. 5, in some embodiments, the face information includes face feature points, the pose information includes a pose angle and a three-dimensional coordinate offset, and the correction parameters include a rotation matrix and a translation matrix. Step 011: determining the pose information of the face according to the face information further includes:
0113: Calculate the pose angle and the three-dimensional coordinate offset according to the face feature points.
Step 013: calculating the correction parameters according to the pose information includes:
0131: Calculate the rotation matrix according to the pose angle, and calculate the translation matrix according to the three-dimensional coordinate offset.
In some embodiments, the first determination module 11 is further configured to calculate the pose angle and the three-dimensional coordinate offset according to the face feature points, and the second determination module 12 is further configured to calculate the rotation matrix according to the pose angle and the translation matrix according to the three-dimensional coordinate offset. That is to say, step 0113 may be executed by the first determination module 11, and step 0131 may be executed by the second determination module 12.
In some embodiments, the processor 60 is further configured to calculate the pose angle and the three-dimensional coordinate offset according to the face feature points, calculate the rotation matrix according to the pose angle, and calculate the translation matrix according to the three-dimensional coordinate offset. That is to say, step 0113 and step 0131 may be executed by the processor 60.
Specifically, the correction parameters may include a rotation matrix and a translation matrix, which respectively represent the pose change and the position change of the face. When calculating the correction parameters, the pose angle and the three-dimensional coordinate offset may first be calculated from the face feature points, where the pose angle represents the pose of the face (such as the pitch angle, roll angle, and yaw angle) and the three-dimensional coordinate offset represents the position of the face. The rotation matrix is then calculated from the pose angle, and the translation matrix from the three-dimensional coordinate offset, thereby determining the correction parameters of the reference gaze point coordinates. The gaze information can then be accurately calculated from the reference gaze point coordinates, the rotation matrix, and the translation matrix.
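Building the rotation matrix from the three pose angles and applying it together with a translation can be sketched as follows; the Z·Y·X axis convention is an assumption for illustration, since the patent does not fix one:

```python
import math

def rotation_matrix(pitch, yaw, roll):
    """Build a 3x3 rotation matrix from pose angles (radians), Z*Y*X order.

    The angle-to-axis convention (roll about Z, yaw about Y, pitch about X,
    applied in that order) is an illustrative assumption.
    """
    cx, sx = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    cz, sz = math.cos(roll), math.sin(roll)
    rx = [[1, 0, 0], [0, cx, -sx], [0, sx, cx]]
    ry = [[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]]
    rz = [[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]]

    def matmul(a, b):
        return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
                for i in range(3)]

    return matmul(matmul(rz, ry), rx)

def correct_point(point, rotation, translation):
    """Correct a 3D gaze-related point as R @ p + t."""
    return [sum(rotation[i][k] * point[k] for k in range(3)) + translation[i]
            for i in range(3)]
```

With all pose angles at zero, the rotation matrix is the identity and only the translation (the three-dimensional coordinate offset) moves the point.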
Referring to FIG. 2, FIG. 3, and FIG. 6, in some embodiments, the gaze detection method further includes:
0101: Obtain a training sample set, the training sample set including first-type samples in which the pose information of the face is less than the preset threshold and second-type samples in which the pose information of the face is greater than the preset threshold; and
0102: Train a preset detection model according to the first-type samples and the second-type samples.
Step 013 includes:
0132: Based on the detection model, determine the correction parameters according to the pose information.
In some embodiments, the detection device 10 further includes an acquisition module 14 and a training module 15, both of which may be disposed in the NPU to train the detection model. The acquisition module 14 is configured to obtain the training sample set; the training module 15 is configured to train the preset detection model according to the first-type samples and the second-type samples; and the second determination module 12 is further configured to determine the correction parameters according to the pose information based on the detection model. That is to say, step 0101 may be executed by the acquisition module 14, step 0102 may be executed by the training module 15, and step 0132 may be executed by the second determination module 12.
In some embodiments, the processor 60 is further configured to obtain the training sample set, train the preset detection model according to the first-type samples and the second-type samples, and determine the correction parameters according to the pose information based on the detection model. That is to say, step 0101, step 0102, and step 0132 may be executed by the processor 60.
Specifically, the present application may calculate the gaze information through a preset detection model. To ensure the accuracy of the gaze information, the detection model needs to be trained first so that it converges.
During training, in order to enable the detection model to still calculate the gaze information accurately when the face is deflected relative to the display screen 40, a plurality of first-type samples in which the pose information of the face is less than the preset threshold and a plurality of second-type samples in which the pose information of the face is greater than the preset threshold may be selected in advance as the training sample set, where the first-type samples are face images whose pose information is less than the preset threshold, and the second-type samples are face images whose pose information is greater than the preset threshold. In this way, by training the detection model to convergence with first-type samples below the preset threshold and second-type samples above it, the influence of the deflection of the face relative to the display screen 40 on gaze information detection can be minimized, ensuring the accuracy of gaze detection.
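The two-class split of the training set can be sketched as follows; the sample representation (a dict with a `pose` tuple in degrees) and the threshold value are illustrative assumptions:

```python
PRESET_THRESHOLD_DEG = 5.0  # illustrative value; the text mentions 5 degrees

def split_samples(samples):
    """Split samples into the two classes described above.

    Each sample is assumed (for illustration) to be a dict with a 'pose'
    entry holding (pitch, roll, yaw) in degrees; the pose magnitude is
    taken as the largest absolute angle, since the angles are directional.
    """
    first_type, second_type = [], []
    for s in samples:
        magnitude = max(abs(a) for a in s["pose"])
        (first_type if magnitude < PRESET_THRESHOLD_DEG else second_type).append(s)
    return first_type, second_type
```

Taking the absolute value before comparing mirrors the earlier note that the signed pose angles may be negative.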
Referring to FIG. 2, FIG. 3, and FIG. 7, in some embodiments, step 0102 includes:
01021: Input the first-type samples into a gaze point detection module to output first training coordinates;
01022: Input the second-type samples into the gaze point detection module and a correction module to output second training coordinates;
01023: Based on a preset loss function, calculate a first loss value according to first preset coordinates corresponding to the first-type samples and the first training coordinates, and calculate a second loss value according to second preset coordinates corresponding to the second-type samples and the second training coordinates; and
01024: Adjust the detection model according to the first loss value and the second loss value until the detection model converges.
In some embodiments, the training module 15 is further configured to: input the first-type samples into the gaze point detection module to output the first training coordinates; input the second-type samples into the gaze point detection module and the correction module to output the second training coordinates; based on the preset loss function, calculate the first loss value according to the first preset coordinates corresponding to the first-type samples and the first training coordinates, and calculate the second loss value according to the second preset coordinates corresponding to the second-type samples and the second training coordinates; and adjust the detection model according to the first loss value and the second loss value until the detection model converges. That is to say, step 01021 to step 01024 may be executed by the training module 15.
In some embodiments, the processor 60 is further configured to: input the first-type samples into the gaze point detection module to output the first training coordinates; input the second-type samples into the gaze point detection module and the correction module to output the second training coordinates; based on the preset loss function, calculate the first loss value according to the first preset coordinates corresponding to the first-type samples and the first training coordinates, and calculate the second loss value according to the second preset coordinates corresponding to the second-type samples and the second training coordinates; and adjust the detection model according to the first loss value and the second loss value until the detection model converges. That is to say, step 01021 to step 01024 may be executed by the processor 60.
Specifically, referring to FIG. 8, the detection model 50 includes a gaze point detection module 51 and a correction module 52. During training, the training sample set is input into the detection model: the first-type samples are input into the gaze point detection module 51 to output the first training coordinates, which are output directly because the pose information of the first-type samples is less than the preset threshold; the second-type samples are input into both the gaze point detection module 51 and the correction module 52, where the gaze point detection module 51 outputs reference training coordinates, and the correction module 52 then outputs correction parameters and corrects the reference training coordinates according to them to output the second training coordinates.
It can be understood that each training sample has corresponding preset coordinates, which represent the actual gaze information of the training sample; the first-type samples correspond to the first preset coordinates and the second-type samples correspond to the second preset coordinates. Therefore, the processor 60 can calculate the first loss value based on the preset loss function, the first training coordinates, and the first preset coordinates, and then adjust the gaze point detection module 51 based on the first loss value so that the first training coordinates output by the gaze point detection module 51 gradually approach the first preset coordinates until convergence. Likewise, the processor 60 can calculate the second loss value based on the preset loss function, the second training coordinates, and the second preset coordinates, and then adjust both the gaze point detection module 51 and the correction module 52 based on the second loss value so that the second training coordinates output by the detection model gradually approach the second preset coordinates until convergence.
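The two-branch forward pass of FIG. 8 can be sketched with stand-in callables for modules 51 and 52; the sample format, the callables, and the per-axis correction form are illustrative assumptions:

```python
def forward(sample, gaze_module, correction_module, threshold):
    """Route a training sample through the two branches described above.

    gaze_module(sample) -> reference coordinates; correction_module(sample)
    -> per-axis correction coefficients. Both callables are stand-ins for
    the learned gaze point detection module 51 and correction module 52.
    """
    pose_magnitude = max(abs(a) for a in sample["pose"])
    coords = gaze_module(sample)
    if pose_magnitude <= threshold:
        return coords  # first-type sample: output the coordinates directly
    # Second-type sample: correct the reference coordinates.
    corrections = correction_module(sample)
    return tuple(k * v for k, v in zip(corrections, coords))
```

First-type samples bypass the correction module entirely, so its parameters receive gradient only from second-type samples, matching the training description above.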
For example, the loss function is as follows:

[Formula image: PCTCN2022126148-appb-000001]

where loss is the loss value, N is the number of training samples contained in each training sample set, X and Y are training coordinates (such as the first training coordinates or the second training coordinates), and Gx and Gy are preset coordinates (such as the first preset coordinates and the second preset coordinates). When the training coordinates represent the gaze direction, X and Y respectively denote the pitch angle and the yaw angle; when the training coordinates are gaze point coordinates, X and Y respectively denote the coordinates of the gaze point in the plane of the display screen 40. The first loss value and the second loss value can thus be calculated quickly.
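The formula itself is published only as an image, so the exact form is not recoverable here; one common instantiation consistent with the surrounding description is a mean absolute error over the two coordinates, sketched below purely as an assumption:

```python
def gaze_loss(train_coords, preset_coords):
    """Mean absolute error between training and preset gaze coordinates.

    train_coords / preset_coords: lists of (X, Y) / (Gx, Gy) pairs for the
    N samples of a batch. This L1 form is an assumption; the patent gives
    the loss function only as an image.
    """
    n = len(train_coords)
    return sum(abs(x - gx) + abs(y - gy)
               for (x, y), (gx, gy) in zip(train_coords, preset_coords)) / n

print(gaze_loss([(1.0, 2.0), (3.0, 4.0)], [(1.0, 2.0), (3.0, 4.0)]))  # 0.0
```

The same function serves for both loss values: the first loss value uses first training/preset coordinate pairs, the second uses second pairs.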
The processor 60 can then adjust the detection model according to the first loss value and the second loss value, so that the gradient of the detection model keeps decreasing and the training coordinates come ever closer to the preset coordinates, finally training the detection model to convergence. For example, the detection model is determined to have converged when, over N consecutive batches of training samples, the first difference between the first loss values corresponding to any two first-type samples and the second difference between the second loss values corresponding to any two second-type samples are both less than a predetermined difference threshold, N being a positive integer greater than 1; that is to say, if the first loss value essentially stops changing over N consecutive batches of training, the first loss value and the second loss value have reached their limits, and it can be determined that the detection model has converged. Alternatively, the detection model is determined to have converged when both the first loss value and the second loss value are less than a predetermined loss threshold; in that case the training coordinates are already very close to the preset coordinates, and it can be determined that the detection model has converged.
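The first convergence criterion above (loss values stable over N consecutive batches) can be sketched as a small check applied separately to each loss's history; the function shape is an illustrative assumption:

```python
def has_converged(loss_history, n_batches, diff_threshold):
    """Convergence test: over the last n_batches recorded loss values,
    every pairwise difference must be below diff_threshold.

    max - min over the window bounds every pairwise difference, so one
    comparison suffices. Run once on the first-loss history and once on
    the second-loss history; the model converges when both pass.
    """
    if len(loss_history) < n_batches:
        return False
    recent = loss_history[-n_batches:]
    return max(recent) - min(recent) < diff_threshold
```

The alternative criterion (both loss values below a predetermined loss threshold) is a simple pair of comparisons and needs no history.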
In this way, the detection model is trained to convergence with the first-type and second-type training samples, which ensures that the detection model can still output accurate gaze information from the face information even when the face is deflected.
Referring to FIG. 3, FIG. 9 and FIG. 10, the control method of the electronic device 100 according to an embodiment of the present application includes the following steps:
021: determining the posture information of the face according to the face information, and determining the reference gaze point coordinates according to the face information;
023: in response to the posture information being greater than a preset threshold, determining correction parameters according to the posture information;
025: determining the gaze information according to the reference gaze point coordinates and the correction parameters; and
027: controlling the electronic device 100 according to the gaze information.
The control device 20 of the embodiment of the present application includes an acquisition module 21, a first determination module 22, a second determination module 23 and a control module 24. The acquisition module 21 is configured to determine the posture information of the face according to the face information and to determine the reference gaze point coordinates according to the face information; the first determination module 22 is configured to determine correction parameters according to the posture information in response to the posture information being greater than a preset threshold; the second determination module 23 is configured to determine the gaze information according to the reference gaze point coordinates and the correction parameters; and the control module 24 is configured to control the electronic device 100 according to the gaze information. That is to say, step 021 can be performed by the acquisition module 21, step 023 by the first determination module 22, step 025 by the second determination module 23, and step 027 by the control module 24.
The electronic device 100 of the embodiment of the present application includes a processor 60 and a collection device 30. The collection device 30 is configured to collect face information at a predetermined frame rate (the face information includes face images, such as visible light images, infrared images and depth images of the face). The collection device 30 may be one or more of a visible light camera, an infrared camera and a depth camera, where the visible light camera collects visible light face images, the infrared camera collects infrared face images, and the depth camera collects depth face images. In this embodiment, the collection device 30 includes a visible light camera, an infrared camera and a depth camera, and collects visible light face images, infrared face images and depth face images simultaneously.
The processor 60 may include an ISP, an NPU and an AP. For example, the control device 20 is arranged in the electronic device 100, the acquisition module 21 is arranged in the ISP and the NPU, and the processor 60 is connected to the collection device 30. After the collection device 30 collects a face image, the ISP can process the face image to determine the posture information of the face according to the face information, and the NPU can determine the reference gaze point coordinates according to the face information; the first determination module 22 and the second determination module 23 can be arranged in the NPU, and the control module 24 can be arranged in the AP. The processor 60 (specifically, the ISP and the NPU) is configured to obtain the face information and the posture information; the processor 60 (specifically, the NPU) is further configured to determine correction parameters according to the posture information in response to the posture information being greater than the preset threshold, and to determine the gaze information according to the reference gaze point coordinates and the correction parameters; and the processor 60 (specifically, the AP) can further be configured to control the electronic device 100 according to the gaze information. That is to say, step 021 can be performed by the collection device 30 in cooperation with the processor 60, and steps 023, 025 and 027 can be performed by the processor 60.
Specifically, for the manner of determining the gaze information, that is, steps 021, 023 and 025, please refer to the descriptions of steps 011, 013 and 015, respectively; details are not repeated here.
After the gaze information (such as the gaze direction and the gaze point coordinates) is obtained, the electronic device 100 can be controlled according to the gaze direction and the gaze point coordinates. Referring to FIG. 11, for example, a three-dimensional coordinate system is established with the midpoint between the two eyes as the origin O1: the X1 axis is parallel to the line connecting the centres of the two eyes, the Y1 axis lies in the horizontal plane and is perpendicular to the X1 axis, and the Z1 axis is perpendicular to both the X1 axis and the Y1 axis. The user's gaze direction is expressed by the rotation angles of the line of sight S about the three axes of this coordinate system: the gaze direction includes a pitch angle, a roll angle and a yaw angle, where the pitch angle is the rotation angle about the X1 axis, the roll angle is the rotation angle about the Y1 axis, and the yaw angle is the rotation angle about the Z1 axis. The processor 60 can perform page-turning or sliding operations on the display content of the electronic device 100 according to the gaze direction. For example, by determining the gaze direction over multiple consecutive frames of eye-region images (such as 10 consecutive frames), the change of the gaze direction can be determined. Referring to FIG. 11 and FIG. 12, when the pitch angle gradually increases (i.e., the line of sight S tilts upward), it can be determined that the user wants the displayed content to slide up or the page to turn down; referring to FIG. 11 and FIG. 13, when the pitch angle gradually decreases (i.e., the line of sight S tilts downward), it can be determined that the user wants the displayed content to slide down or the page to turn up. Similarly, by detecting the moving direction of the gaze point M, the electronic device 100 can also be slid or page-turned. Referring to FIG. 14, a plane coordinate system can be established with the centre of the display screen 40 as the coordinate origin O2, the direction parallel to the width of the electronic device 100 as the X2 axis, and the direction parallel to the length of the electronic device 100 as the Y2 axis. The gaze point coordinates include an abscissa (the position on the X2 axis) and an ordinate (the position on the Y2 axis). If the ordinate gradually increases, the gaze point M moves up, and it can be determined that the user wants the displayed content to slide up or the page to turn down; conversely, if the ordinate gradually decreases, the gaze point M moves down, and it can be determined that the user wants the displayed content to slide down or the page to turn up.
In other embodiments, the processor 60 can also determine the change speed of the gaze direction (for example, from the difference between the pitch angles of the 1st and 10th frames, or the difference between the ordinates of the gaze point M, together with the duration taken to acquire the 10 consecutive frames): the faster the change, the more new display content is revealed after sliding.
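The direction-and-speed logic described above can be sketched as follows. The first-frame/last-frame convention follows the text, while the pixel gain, the return convention and the function name are illustrative assumptions:

```python
def scroll_from_pitch(pitch_angles, frame_interval_s, gain=100.0):
    """Sketch of gaze-driven scrolling as described above.

    pitch_angles: pitch angle (degrees) of the gaze direction over
    consecutive frames (e.g. 10 frames).  A rising pitch (line of sight
    tilting up) is read as "content slides up / page down", a falling
    pitch as the opposite; the change per unit time scales the scroll.
    gain converts degrees-per-second into a pixel offset (assumed).
    """
    delta = pitch_angles[-1] - pitch_angles[0]            # first vs last frame
    duration = frame_interval_s * (len(pitch_angles) - 1) # acquisition time
    speed = delta / duration                              # degrees per second
    direction = "slide_up" if delta > 0 else "slide_down"
    return direction, abs(speed) * gain                   # scroll offset
```

The same structure applies to gaze-point-based scrolling, with the ordinate of the gaze point M in place of the pitch angle.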
In another example, when the gaze point coordinates are detected to lie within the display area of the display screen 40, the user is still viewing the display screen 40, so the screen is kept lit. When the gaze point coordinates are detected to lie outside the display area, the user is not viewing the display screen 40; however, to avoid misjudging a user who only glances outside the display area occasionally, the screen is turned off only after the user has not viewed the display screen 40 for a predetermined duration (such as 10 s or 20 s).
With the control method of the electronic device 100, the control device 20 and the electronic device 100 of the present application, after the face information and the posture information are obtained, when the posture information is greater than the preset threshold and would affect the accuracy of the gaze point coordinate calculation, the reference gaze point coordinates are first calculated according to the face information, the correction parameters are then calculated according to the posture information, and the reference gaze point coordinates are corrected according to the correction parameters to obtain accurate gaze information. This prevents an excessively large face shooting angle in the acquired face information from degrading gaze detection, and improves the accuracy of gaze detection. Moreover, when the electronic device 100 is controlled according to accurate gaze information, the control accuracy of the electronic device 100 is improved.
Referring to FIG. 3, FIG. 10 and FIG. 15, in some embodiments, the face information includes a face mask, a left-eye image and a right-eye image, the face mask being used to indicate the position of the face in the image, and step 021 of calculating the reference gaze point coordinates according to the face information includes:
0211: calculating the position information of the face relative to the electronic device 100 according to the face mask; and
0212: calculating the reference gaze point coordinates according to the position information, the left-eye image and the right-eye image.
In some embodiments, the first determination module 22 is further configured to calculate the position information of the face relative to the electronic device 100 according to the face mask, and to calculate the reference gaze point coordinates according to the position information, the left-eye image and the right-eye image. That is to say, steps 0211 and 0212 can be performed by the first determination module 22.
In some embodiments, the processor 60 is further configured to calculate the position information of the face relative to the electronic device 100 according to the face mask, and to calculate the reference gaze point coordinates according to the position information, the left-eye image and the right-eye image. That is to say, steps 0211 and 0212 can be performed by the processor 60.
Specifically, for the detailed descriptions of steps 0211 and 0212, please refer to steps 0131 and 0132, respectively; details are not repeated here.
Referring to FIG. 3, FIG. 10 and FIG. 15, in some embodiments, the face information includes face feature points, the posture information includes a posture angle and a three-dimensional coordinate offset, and the correction parameters include a rotation matrix and a translation matrix. Step 021 of determining the posture information of the face according to the face information includes:
0213: calculating the posture angle and the three-dimensional coordinate offset according to the face feature points;
and step 023 includes:
0231: calculating the rotation matrix according to the posture angle, and calculating the translation matrix according to the three-dimensional coordinate offset.
In some embodiments, the first determination module 22 is further configured to calculate the posture angle and the three-dimensional coordinate offset according to the face feature points, to calculate the rotation matrix according to the posture angle, and to calculate the translation matrix according to the three-dimensional coordinate offset. That is to say, steps 0213 and 0231 can be performed by the first determination module 22.
In some embodiments, the processor 60 is further configured to calculate the posture angle and the three-dimensional coordinate offset according to the face feature points, to calculate the rotation matrix according to the posture angle, and to calculate the translation matrix according to the three-dimensional coordinate offset. That is to say, steps 0213 and 0231 can be performed by the processor 60.
Specifically, for the detailed descriptions of steps 0213 and 0231, please refer to steps 0133 and 0134, respectively; details are not repeated here.
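As an illustration of the rotation- and translation-matrix construction, the matrices can be assembled from the attitude angles and the three-dimensional offset in the standard Euler-angle way. This sketch is not taken from the disclosure: the axis assignment follows the coordinate system of FIG. 11 (pitch about X1, roll about Y1, yaw about Z1), while the composition order Z·Y·X and all function names are assumptions.

```python
import math

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_matrix(pitch, yaw, roll=0.0):
    # Pitch about X1, roll about Y1, yaw about Z1 (FIG. 11); the Z*Y*X
    # composition order is an assumption, the patent does not fix it.
    return matmul(rot_z(yaw), matmul(rot_y(roll), rot_x(pitch)))

def translation_matrix(dx, dy, dz):
    # 4x4 homogeneous translation built from the 3D coordinate offset.
    return [[1, 0, 0, dx], [0, 1, 0, dy], [0, 0, 1, dz], [0, 0, 0, 1]]
```

Angles are in radians; a library such as SciPy's Rotation class would normally replace the hand-rolled matrices in practice.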
Referring to FIG. 3, FIG. 10 and FIG. 16, in some embodiments, the gaze information includes gaze point coordinates, and before step 021 the control method further includes:
0201: acquiring a captured image within a first predetermined duration before the screen turns off; and
0202: responding to the captured image containing a human face;
and step 027 of controlling the electronic device 100 according to the gaze information includes:
0271: in response to the gaze point coordinates being located in the display area of the display screen 40, keeping the screen lit for a second predetermined duration.
In some embodiments, the control module 24 is further configured to acquire the captured image within the first predetermined duration before the screen turns off, and, in response to the captured image containing a human face and the gaze point coordinates being located in the display area of the display screen 40, to keep the screen lit for the second predetermined duration. That is to say, steps 0201, 0202 and 0271 can be performed by the control module 24.
In some embodiments, the processor 60 is further configured to acquire the captured image within the first predetermined duration before the screen turns off, and, in response to the captured image containing a human face and the gaze point coordinates being located in the display area of the display screen 40, to keep the screen lit for the second predetermined duration. That is to say, steps 0201, 0202 and 0271 can be performed by the processor 60.
Specifically, the gaze information can be used for screen-off control. Before the screen turns off, gaze detection is performed first: the processor 60 first acquires a captured image, and if a human face is present in the captured image, the gaze information is determined from the captured image. Of course, to ensure there is enough time before the screen turns off to acquire the captured image and calculate the gaze information, the captured image needs to be acquired within the first predetermined duration (such as 5 seconds or 10 seconds) before the screen turns off.
Referring to FIG. 17 and FIG. 18, when the gaze point M is located within the display area of the display screen 40, it can be determined that the user is gazing at the display screen 40, so the display screen 40 is kept lit for a second predetermined duration, which may be greater than the first predetermined duration. Within the first predetermined duration before the screen would turn off again, a captured image is acquired again. In this way, the screen stays lit while the user gazes at the display screen 40, and turns off once the user no longer gazes at it.
Referring to FIG. 17 again, a two-dimensional coordinate system parallel to the display screen 40 can be established with the centre of the display area as the coordinate origin O2. The display area is associated with a preset coordinate range, i.e., the abscissa range and the ordinate range of the display area in this coordinate system serve as the preset coordinate range. When the gaze point coordinates fall within the preset coordinate range (i.e., the abscissa of the gaze point lies within the abscissa range and the ordinate lies within the ordinate range), it can be determined that the gaze point is located within the display area, which provides a simple way to judge whether the user is gazing at the display screen 40.
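The preset-coordinate-range check described above reduces to two interval tests. The sketch below is a minimal illustration; the function name and the use of the physical display extents in the centred O2 coordinate system are assumptions, not part of the disclosure:

```python
def gaze_in_display(gaze_x, gaze_y, width, height):
    """Check whether a gaze point lies inside the display area.

    Coordinate convention follows the text: origin O2 at the centre of
    the display area, X2 along the device width, Y2 along the device
    length.  width/height are the extents of the display area in the
    same units as the gaze point coordinates (an assumption).
    """
    return abs(gaze_x) <= width / 2 and abs(gaze_y) <= height / 2
```

A gaze point returning False here would start (or continue) the predetermined screen-off countdown described above.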
Moreover, since captured images are acquired and gaze information is calculated only within the first predetermined duration before the screen turns off, power consumption is saved.
Referring to FIG. 3, FIG. 10 and FIG. 19, in some embodiments, the gaze information includes gaze point coordinates, and before step 021 the control method further includes:
0203: in response to the electronic device 100 not receiving an input operation, acquiring a captured image;
and step 027 includes:
0272: in response to the captured image containing a human face and the gaze point coordinates being located in the display area, adjusting the display brightness of the display screen 40 to a first predetermined brightness; and
0273: in response to the captured image not containing a human face, or the captured image containing a human face while the gaze point coordinates are outside the display area, adjusting the display brightness to a second predetermined brightness, the second predetermined brightness being less than the first predetermined brightness.
In some embodiments, the control module 24 is further configured to acquire the captured image in response to the electronic device 100 not receiving an input operation; to adjust the display brightness of the display screen 40 to the first predetermined brightness in response to the captured image containing a human face and the gaze point coordinates being located in the display area; and to adjust the display brightness to the second predetermined brightness, which is less than the first predetermined brightness, in response to the captured image not containing a human face, or the captured image containing a human face while the gaze point coordinates are outside the display area. That is to say, steps 0203, 0272 and 0273 can be performed by the control module 24.
In some embodiments, the processor 60 is further configured to acquire the captured image in response to the electronic device 100 not receiving an input operation; to adjust the display brightness of the display screen 40 to the first predetermined brightness in response to the captured image containing a human face and the gaze point coordinates being located in the display area; and to adjust the display brightness to the second predetermined brightness, which is less than the first predetermined brightness, in response to the captured image not containing a human face, or the captured image containing a human face while the gaze point coordinates are outside the display area. That is to say, steps 0203, 0272 and 0273 can be performed by the processor 60.
Specifically, referring to FIG. 17 and FIG. 18 again, the gaze information can also be used for intelligent screen brightening. To save power, the electronic device 100 generally reduces the display brightness after the screen has been lit for a certain time, stays at low brightness for another period, and then turns the screen off. In this embodiment, when the electronic device 100 receives no input operation from the user, it can be inferred that the user may not be operating the electronic device 100 or may only be viewing the displayed content. The processor 60 can acquire a captured image; if the captured image contains a human face, the gaze information is calculated from it. If the gaze point coordinates are located in the display area, the user, although not operating the electronic device 100, is viewing the displayed content, and the display brightness is adjusted to the first predetermined brightness. The first predetermined brightness may be the brightness customised by the user for normal display on the display screen 40, or may vary in real time with the ambient light brightness to adapt to it. This ensures that the screen stays lit even when the user is not operating the electronic device 100, and prevents the screen from turning off suddenly while the user is viewing the displayed content, which would harm the user experience.
When the electronic device 100 receives no input operation and the captured image contains no human face, or the captured image contains a human face but the gaze point coordinates are outside the display area (i.e., the user is not viewing the display area), it can be determined that the user currently does not need to use the electronic device 100. The display brightness can therefore be adjusted to the second predetermined brightness, which is less than the first predetermined brightness, preventing unnecessary power consumption. When the user gazes at the display area again, the display brightness is adjusted back to the first predetermined brightness, ensuring a normal viewing experience. In this way, when the user is not operating the electronic device 100, the display area is shown at normal brightness while the user gazes at it and at low brightness otherwise, maximising power savings while preserving the viewing experience.
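The brightness policy of steps 0272 and 0273 reduces to a small decision function. The sketch below is illustrative only; the concrete brightness values and all parameter names are assumptions:

```python
def target_brightness(received_input, face_present, gaze_in_area,
                      first_brightness=200, second_brightness=30):
    """Sketch of the dimming policy described above.

    With no input operation received, the screen keeps the normal
    (first) brightness only if a face is present AND the gaze point
    lies inside the display area; otherwise it dims to the second,
    lower brightness.  Brightness units are arbitrary placeholders.
    """
    if received_input:
        return first_brightness      # user is actively operating the device
    if face_present and gaze_in_area:
        return first_brightness      # user is watching the screen
    return second_brightness         # dim to save power
```

Re-evaluating this function as new frames arrive restores the first brightness as soon as the user's gaze returns to the display area.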
Referring to FIG. 20, embodiments of the present application further provide one or more non-volatile computer-readable storage media 300 containing a computer program 302. When the computer program 302 is executed by one or more processors 60, the processors 60 are caused to perform the gaze detection method or the control method of the electronic device 100 of any of the above embodiments.
For example, referring to FIG. 1, when the computer program 302 is executed by one or more processors 60, the processors 60 are caused to perform the following steps:
011: determining the posture information of the face according to the face information, and determining the reference gaze point coordinates according to the face information;
013: in response to the posture information being greater than a preset threshold, determining correction parameters according to the posture information; and
015: determining the gaze information according to the reference gaze point coordinates and the correction parameters.
For another example, referring to FIG. 9, when the computer program 302 is executed by one or more processors 60, the processors 60 can further perform the following steps:
021: determining the posture information of the face according to the face information, and determining the reference gaze point coordinates according to the face information;
023: in response to the posture information being greater than a preset threshold, determining correction parameters according to the posture information;
025: determining the gaze information according to the reference gaze point coordinates and the correction parameters; and
027: controlling the electronic device 100 according to the gaze information.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "exemplary embodiment", "example", "specific example" or "some examples" means that a specific feature, structure, material or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the described specific features, structures, materials or characteristics may be combined in a suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine different embodiments or examples described in this specification, and features of different embodiments or examples, provided they do not contradict each other.
Any process or method description in a flowchart or otherwise described herein may be understood as representing a module, segment or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process, and the scope of the preferred embodiments of the present application includes additional implementations in which functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present application belong.
Although the embodiments of the present application have been shown and described above, it can be understood that the above embodiments are exemplary and should not be construed as limiting the present application, and those of ordinary skill in the art may make changes, modifications, substitutions and variations to the above embodiments within the scope of the present application.

Claims (20)

  1. A gaze detection method, characterized by comprising:
    determining posture information of a face according to face information, and determining reference gaze point coordinates according to the face information;
    in response to the posture information being greater than a preset threshold, determining correction parameters according to the posture information; and
    determining gaze information according to the reference gaze point coordinates and the correction parameters.
  2. The gaze detection method according to claim 1, characterized by further comprising:
    in response to the posture information being less than the preset threshold, calculating the reference gaze point coordinates according to the face information as the gaze information.
  3. The gaze detection method according to claim 1, wherein the pose information comprises a pose angle, the pose angle comprises a pitch angle and a yaw angle, and determining, according to the face information, whether the pose information of the face is greater than the preset threshold comprises:
    determining, according to the face information, whether the pitch angle or the yaw angle is greater than the preset threshold.
  4. The gaze detection method according to claim 1, further comprising:
    obtaining a training sample set, the training sample set comprising first-type samples whose face pose information is smaller than the preset threshold and second-type samples whose face pose information is greater than the preset threshold; and
    training a preset detection model according to the first-type samples and the second-type samples;
    wherein determining the correction parameter according to the pose information comprises:
    determining the correction parameter according to the pose information based on the detection model.
  5. The gaze detection method according to claim 4, wherein the detection model comprises a gaze point detection module and a correction module, and training the detection model according to the first-type samples and the second-type samples comprises:
    inputting the first-type samples into the gaze point detection module to output first training coordinates;
    inputting the second-type samples into the gaze point detection module and the correction module to output second training coordinates;
    based on a preset loss function, calculating a first loss value according to first preset coordinates corresponding to the first-type samples and the first training coordinates, and calculating a second loss value according to second preset coordinates corresponding to the second-type samples and the second training coordinates; and
    adjusting the detection model according to the first loss value and the second loss value until the detection model converges.
  6. The gaze detection method according to claim 5, wherein the detection model is determined to have converged when, among N consecutive batches of training samples, a first difference between the first loss values corresponding to any two of the first-type samples and a second difference between the second loss values corresponding to any two of the second-type samples are both smaller than a predetermined difference threshold, N being a positive integer greater than 1; or the detection model is determined to have converged when both the first loss value and the second loss value are smaller than a predetermined loss threshold.
  7. The gaze detection method according to claim 1, wherein the face information comprises a face mask, a left-eye image, and a right-eye image, the face mask indicating a position of the face in an image, and calculating the reference gaze point coordinates according to the face information comprises:
    calculating position information of the face relative to an electronic device according to the face mask; and
    calculating the reference gaze point coordinates according to the position information, the left-eye image, and the right-eye image.
  8. The gaze detection method according to claim 1, wherein the face information comprises face feature points, the pose information comprises a pose angle and a three-dimensional coordinate offset, and the correction parameter comprises a rotation matrix and a translation matrix;
    determining the pose information of the face according to the face information comprises:
    calculating the pose angle and the three-dimensional coordinate offset according to the face feature points; and
    calculating the correction parameter according to the pose information comprises:
    calculating the rotation matrix according to the pose angle, and calculating the translation matrix according to the three-dimensional coordinate offset.
  9. A control method for an electronic device, comprising:
    determining pose information of a human face according to face information, and determining reference gaze point coordinates according to the face information;
    in response to the pose information being greater than a preset threshold, determining a correction parameter according to the pose information;
    determining gaze information according to the reference gaze point coordinates and the correction parameter; and
    controlling the electronic device according to the gaze information.
  10. The control method according to claim 9, further comprising:
    in response to the pose information being smaller than the preset threshold, calculating the reference gaze point coordinates according to the face information to serve as the gaze information.
  11. The control method according to claim 9, wherein the face information comprises a face mask, a left-eye image, and a right-eye image, the face mask indicating a position of the face in an image, and calculating the reference gaze point coordinates according to the face information comprises:
    calculating position information of the face relative to the electronic device according to the face mask; and
    calculating the reference gaze point coordinates according to the position information, the left-eye image, and the right-eye image.
  12. The control method according to claim 9, wherein the face information comprises face feature points, the pose information comprises a pose angle and a three-dimensional coordinate offset, the correction parameter comprises a rotation matrix and a translation matrix, and calculating the correction parameter according to the pose information comprises:
    calculating the pose angle and the three-dimensional coordinate offset according to the face feature points; and
    calculating the rotation matrix according to the pose angle, and calculating the translation matrix according to the three-dimensional coordinate offset.
  13. The control method according to claim 9, wherein the gaze information comprises gaze point coordinates, and
    before determining the pose information of the face according to the face information and determining the reference gaze point coordinates according to the face information, the control method further comprises:
    acquiring a captured image within a first predetermined duration before the screen is turned off; and
    responding to the captured image containing face information;
    wherein controlling the electronic device according to the gaze information comprises:
    in response to the gaze point coordinates being located within a display area of a display screen, keeping the screen on for a second predetermined duration.
  14. The control method according to claim 13, wherein the display area is associated with a preset coordinate range, and the control method further comprises:
    determining that the gaze point coordinates are located in the display area when the gaze point coordinates fall within the preset coordinate range.
  15. The control method according to claim 9, wherein before determining the pose information of the face according to the face information and determining the reference gaze point coordinates according to the face information, the control method further comprises:
    acquiring a captured image in response to the electronic device receiving no input operation;
    wherein controlling the electronic device according to the gaze information comprises:
    in response to the captured image containing a face and the gaze point coordinates being located within a display area of a display screen, adjusting a display brightness of the display screen to a first predetermined brightness; and
    in response to the captured image not containing a face, or the captured image containing a face and the gaze point coordinates being located outside the display area, adjusting the display brightness to a second predetermined brightness, the second predetermined brightness being lower than the first predetermined brightness.
  16. A detection device, comprising:
    a first determination module configured to determine pose information of a human face according to face information, and determine reference gaze point coordinates according to the face information;
    a second determination module configured to determine a correction parameter according to the pose information in response to the pose information being greater than a preset threshold; and
    a third determination module configured to determine gaze information according to the reference gaze point coordinates and the correction parameter.
  17. A control device, comprising:
    an acquisition module configured to determine pose information of a human face according to face information, and determine reference gaze point coordinates according to the face information;
    a first determination module configured to determine a correction parameter according to the pose information in response to the pose information being greater than a preset threshold;
    a second determination module configured to determine gaze information according to the reference gaze point coordinates and the correction parameter; and
    a control module configured to control an electronic device according to the gaze information.
  18. An electronic device, comprising a processor, the processor being configured to: determine pose information of a human face according to face information, and determine reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determine a correction parameter according to the pose information; and determine gaze information according to the reference gaze point coordinates and the correction parameter.
  19. An electronic device, comprising a processor, the processor being configured to: determine pose information of a human face according to face information, and determine reference gaze point coordinates according to the face information; in response to the pose information being greater than a preset threshold, determine a correction parameter according to the pose information; determine gaze information according to the reference gaze point coordinates and the correction parameter; and control the electronic device according to the gaze information.
  20. A non-transitory computer-readable storage medium comprising a computer program that, when executed by a processor, causes the processor to perform the gaze detection method according to any one of claims 1-8, or the control method for an electronic device according to any one of claims 9-15.
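Claims 8 and 12 derive the correction parameter as a rotation matrix computed from the pose angle and a translation matrix computed from the three-dimensional coordinate offset. A minimal sketch of one possible realization is given below; the Euler-angle convention (X-pitch, Y-yaw, Z-roll), the `Rz·Ry·Rx` composition order, and the way the correction is applied to the reference gaze point are assumptions for illustration — the claims do not fix a specific parameterization:

```python
import numpy as np


def rotation_from_pose(pitch, yaw, roll):
    """Build a rotation matrix from pose angles in radians.

    Assumed convention: pitch about X, yaw about Y, roll about Z,
    composed as Rz @ Ry @ Rx.
    """
    cx, sx = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    cz, sz = np.cos(roll), np.sin(roll)
    rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return rz @ ry @ rx


def correct_gaze_point(p_ref, pitch, yaw, roll, offset):
    """Apply the correction parameters to a 3D reference gaze point:
    rotation matrix from the pose angle, translation from the
    three-dimensional coordinate offset."""
    rotation = rotation_from_pose(pitch, yaw, roll)
    translation = np.asarray(offset, dtype=float)
    return rotation @ np.asarray(p_ref, dtype=float) + translation
```

With zero pose angles and zero offset the correction is the identity, matching the intuition that a frontal face needs no correction; in practice (per claim 8) the pose angle and offset would themselves be estimated from the face feature points.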
PCT/CN2022/126148 2021-10-29 2022-10-19 Gaze detection method, control method for electronic device, and related devices WO2023071884A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111271397.4A CN113936324A (en) 2021-10-29 2021-10-29 Gaze detection method, control method of electronic device and related device
CN202111271397.4 2021-10-29

Publications (1)

Publication Number Publication Date
WO2023071884A1 true WO2023071884A1 (en) 2023-05-04

Family

ID=79285003

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/126148 WO2023071884A1 (en) 2021-10-29 2022-10-19 Gaze detection method, control method for electronic device, and related devices

Country Status (2)

Country Link
CN (1) CN113936324A (en)
WO (1) WO2023071884A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117495937A (en) * 2023-12-25 2024-02-02 荣耀终端有限公司 Face image processing method and electronic equipment

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113936324A (en) * 2021-10-29 2022-01-14 Oppo广东移动通信有限公司 Gaze detection method, control method of electronic device and related device
CN116052235B (en) * 2022-05-31 2023-10-20 荣耀终端有限公司 Gaze point estimation method and electronic equipment
CN116052261A (en) * 2022-05-31 2023-05-02 荣耀终端有限公司 Sight estimation method and electronic equipment
CN116030512B (en) * 2022-08-04 2023-10-31 荣耀终端有限公司 Gaze point detection method and device
CN115509351B (en) * 2022-09-16 2023-04-07 上海仙视电子科技有限公司 Sensory linkage situational digital photo frame interaction method and system
CN117133043A (en) * 2023-03-31 2023-11-28 荣耀终端有限公司 Gaze point estimation method, electronic device, and computer-readable storage medium
CN116737051B (en) * 2023-08-16 2023-11-24 北京航空航天大学 Visual touch combination interaction method, device and equipment based on touch screen and readable medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160217318A1 (en) * 2013-08-29 2016-07-28 Nec Corporation Image processing device, image processing method, and program
CN109993029A (en) * 2017-12-29 2019-07-09 上海聚虹光电科技有限公司 Blinkpunkt model initialization method
CN112232128A (en) * 2020-09-14 2021-01-15 南京理工大学 Eye tracking based method for identifying care needs of old disabled people
CN112509007A (en) * 2020-12-14 2021-03-16 科大讯飞股份有限公司 Real fixation point positioning method and head-wearing sight tracking system
CN113544626A (en) * 2019-03-15 2021-10-22 索尼集团公司 Information processing apparatus, information processing method, and computer-readable recording medium
CN113936324A (en) * 2021-10-29 2022-01-14 Oppo广东移动通信有限公司 Gaze detection method, control method of electronic device and related device



Also Published As

Publication number Publication date
CN113936324A (en) 2022-01-14

Similar Documents

Publication Publication Date Title
WO2023071884A1 (en) Gaze detection method, control method for electronic device, and related devices
US9373156B2 (en) Method for controlling rotation of screen picture of terminal, and terminal
WO2023071882A1 (en) Human eye gaze detection method, control method and related device
TWI704501B (en) Electronic apparatus operated by head movement and operation method thereof
US9696859B1 (en) Detecting tap-based user input on a mobile device based on motion sensor data
US9740281B2 (en) Human-machine interaction method and apparatus
CN104317391B (en) A kind of three-dimensional palm gesture recognition exchange method and system based on stereoscopic vision
KR101977638B1 (en) Method for correcting user’s gaze direction in image, machine-readable storage medium and communication terminal
US11715231B2 (en) Head pose estimation from local eye region
CN100343867C (en) Method and apparatus for distinguishing direction of visual lines
US10489912B1 (en) Automated rectification of stereo cameras
TWI631506B (en) Method and system for whirling view on screen
CN109375765B (en) Eyeball tracking interaction method and device
CN104574321A (en) Image correction method and device and video system
WO2020042542A1 (en) Method and apparatus for acquiring eye movement control calibration data
US10866492B2 (en) Method and system for controlling tracking photographing of stabilizer
WO2012137801A1 (en) Input device, input method, and computer program
WO2020019504A1 (en) Robot screen unlocking method, apparatus, smart device and storage medium
US20210118157A1 (en) Machine learning inference on gravity aligned imagery
WO2021197466A1 (en) Eyeball detection method, apparatus and device, and storage medium
CN109377518A (en) Target tracking method, device, target tracking equipment and storage medium
CN102725713A (en) Operation input device
CN113487670A (en) Cosmetic mirror and state adjusting method
WO2016033877A1 (en) Method for dynamically adjusting screen character display of terminal, and terminal
US20150063631A1 (en) Dynamic image analyzing system and operating method thereof

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22885762

Country of ref document: EP

Kind code of ref document: A1