WO2023222128A9 - A display method and electronic device - Google Patents

A display method and electronic device

Info

Publication number
WO2023222128A9
WO2023222128A9 (PCT/CN2023/095379)
Authority
WO
WIPO (PCT)
Prior art keywords
user
image
terminal
interface
camera
Prior art date
Application number
PCT/CN2023/095379
Other languages
English (en)
French (fr)
Other versions
WO2023222128A1 (zh)
Inventor
邸皓轩
李丹洪
张晓武
Original Assignee
荣耀终端有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 荣耀终端有限公司 (Honor Device Co., Ltd.)
Publication of WO2023222128A1
Publication of WO2023222128A9

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F 3/013 Eye tracking input arrangements
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using neural networks
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168 Feature extraction; Face representation
    • G06V 40/172 Classification, e.g. identification
    • G06V 40/18 Eye characteristics, e.g. of the iris
    • G06V 40/193 Preprocessing; Feature extraction
    • G06V 40/197 Matching; Classification
    • G06F 2203/00 Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F 2203/01 Indexing scheme relating to G06F3/01
    • G06F 2203/011 Emotion or mood input determined on the basis of sensed human body parameters such as pulse, heart rate or beat, temperature of skin, facial expressions, iris, voice pitch, brain activity patterns

Definitions

  • the present application relates to the field of terminals, and in particular, to a display method and electronic device.
  • Embodiments of the present application provide a display method and electronic device.
  • a terminal device equipped with a camera can determine the user's eyeball gaze position through the camera module, and then determine whether the user is gazing at a preset area on the screen. After identifying that the user is gazing at the preset area, the terminal can display a notification interface associated with that area, so that the user can quickly obtain notifications without any touch operation.
  • this application provides a display method, which is applied to an electronic device.
  • the electronic device includes a screen, and the screen of the electronic device includes a first preset area.
  • the method includes: displaying a first interface; in response to a first operation by the user, displaying a second interface; if the second interface is a preset interface, collecting, by the electronic device, a first image within a first preset time period while the second interface is displayed; and determining the user's first eye gaze area based on the first image.
  • the first eye gaze area is the screen area that the user gazes at when looking at the screen; when the first eye gaze area is within the first preset area, a third interface including one or more notifications is displayed.
  • the electronic device can enable eye gaze recognition detection on a specific interface.
  • through eye gaze recognition detection, the electronic device can detect whether the user is looking at a preset area on the screen.
  • after detecting that the user is gazing at the preset area, the electronic device can automatically display the notification interface showing the notifications.
  • the user can control the electronic device to display the notification interface through gaze operations, thereby quickly obtaining notifications.
  • this method provides users with another way to obtain notifications, which is conducive to improving the user experience.
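The gating logic described above can be sketched in a few lines. This is a minimal illustration only: the patent publishes no code, and the interface names and the 3-second window below are assumptions taken from the examples in the text.

```python
# Illustrative sketch only: the patent does not publish code. The interface
# names and the 3-second window are drawn from the examples in the text.
PRESET_INTERFACES = {"first_desktop", "second_desktop", "negative_screen"}
GAZE_WINDOW_S = 3.0  # detect only during the first N seconds of display

def should_detect_gaze(interface_name: str, seconds_since_display: float) -> bool:
    """Eye gaze recognition runs only on preset interfaces, and only within
    the first few seconds after the interface appears, which saves power and
    limits how long the camera stays on."""
    return (interface_name in PRESET_INTERFACES
            and seconds_since_display < GAZE_WINDOW_S)
```

For example, `should_detect_gaze("first_desktop", 1.0)` evaluates to `True`, while a non-preset interface or an elapsed window yields `False`.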
  • the second interface is any one of the following multiple interfaces: the first desktop, the second desktop, and the negative screen.
  • the electronic device can detect whether the user is looking at a certain preset area on the screen when displaying interfaces such as the first desktop, the second desktop, and the negative screen. In this way, the user can control the electronic device to display the notification interface through a gaze operation when the electronic device displays the first desktop, the second desktop or the negative screen.
  • it can be understood that the second interface on which eye gaze recognition detection is enabled, determined based on the user's habit of obtaining notifications, is not limited to the above-mentioned first desktop, second desktop, and negative screen.
  • the first preset time period is the first N seconds after the second interface starts to be displayed.
  • the electronic device will not continuously detect whether the user is looking at a certain area of the screen while the second interface is displayed; instead, it performs detection only within a preset period of time, such as the first 3 seconds of displaying the second interface, so as to save power and, at the same time, prevent camera abuse from affecting user information security.
  • the first eye gaze area is a cursor point formed by one display unit on the screen, or a cursor area formed by multiple display units on the screen.
  • the first eye gaze area being within the first preset area includes: the position of the first eye gaze area on the screen is contained in the first preset area, or the position of the first eye gaze area on the screen intersects the first preset area.
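The containment-or-intersection test above can be sketched with axis-aligned rectangles. The coordinate convention (origin at top-left, y growing downward) is an assumption for illustration, not a detail from the patent.

```python
from dataclasses import dataclass

@dataclass
class Rect:
    left: float
    top: float
    right: float
    bottom: float

def contains(outer: Rect, inner: Rect) -> bool:
    """True if `inner` lies entirely inside `outer`."""
    return (outer.left <= inner.left and outer.top <= inner.top
            and inner.right <= outer.right and inner.bottom <= outer.bottom)

def intersects(a: Rect, b: Rect) -> bool:
    """True if the two rectangles overlap in both axes."""
    return (a.left < b.right and b.left < a.right
            and a.top < b.bottom and b.top < a.bottom)

def gaze_in_preset_area(gaze: Rect, preset: Rect) -> bool:
    """Per the claim wording, the gaze area counts as 'within' the preset
    area if it is contained in it or merely intersects it."""
    return contains(preset, gaze) or intersects(gaze, preset)
```

A gaze area fully inside the notification-bar rectangle, or one that only partially overlaps it, both trigger the same result.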
  • the first interface is an interface to be unlocked, and the first operation is an unlocking operation.
  • the electronic device can display second interfaces such as the first desktop, the second desktop, and the negative screen after being successfully unlocked. While displaying the above interface, the electronic device may also detect whether the user is looking at the first preset area on the screen. In this way, in the scenario of unlocking and entering the second interface, the user can control the electronic device to display the notification interface through a gaze operation.
  • the first interface is an interface provided by a first application installed on the electronic device, and the first operation is an operation of exiting the first application.
  • the electronic device can display a second interface after exiting an application.
  • the electronic device can also detect whether the user is looking at the first preset area on the screen. In this way, after exiting an application, the user can immediately control the electronic device to display the notification interface through gaze operations, and then process the pending tasks indicated by the notification.
  • the operation of exiting the first application includes: an operation detected by the electronic device in which the user instructs it to exit the first application, or an exit operation generated by the electronic device when it detects that the user has not performed any operation on the first application for a period of time.
  • the electronic device can determine to exit the first application through the user's operation to exit the first application, or can also determine to exit the first application through the user's failure to perform operations on the first application for a long time. In this way, the electronic device can provide users with a more convenient and automated service of displaying notifications.
  • the electronic device includes a camera module; the camera module includes at least one 2D camera and at least one 3D camera, where the 2D camera is used to obtain two-dimensional images and the 3D camera is used to obtain images including depth information; the first image includes a two-dimensional image and an image including depth information.
  • the camera module of the electronic device may include multiple cameras, and the multiple cameras include at least one 2D camera and at least one 3D camera.
  • the electronic device can obtain two-dimensional images and three-dimensional images indicating the gaze position of the user's eyeballs.
  • the combination of two-dimensional and three-dimensional images can help improve the precision and accuracy with which the electronic device identifies the user's eye gaze position.
  • determining the user's first eye gaze area based on the first image specifically includes: using the first image to determine feature data, where the feature data includes one or more of a left-eye image, a right-eye image, a face image, and face mesh data; and using the eye gaze recognition model to determine the first eye gaze area indicated by the feature data.
  • the eye gaze recognition model is established based on a convolutional neural network.
  • the electronic device can obtain left eye images, right eye images, face images and face grid data respectively from the two-dimensional images and three-dimensional images collected by the camera module, thereby extracting more features to improve recognition precision and accuracy.
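A rough sketch of assembling the four feature categories (left-eye image, right-eye image, face image, face grid) from one captured frame follows. The crop-box convention and the binary face-grid encoding are assumptions borrowed from common gaze-estimation formulations; the patent does not specify these encodings.

```python
# Illustrative sketch only: crop boxes are (left, top, right, bottom) pixel
# coordinates, and the face grid is a coarse binary mask marking where the
# face sits within the full frame.
def build_feature_data(image, face_box, left_eye_box, right_eye_box, grid_size=25):
    """Assemble the four feature categories consumed by the gaze model:
    left-eye crop, right-eye crop, face crop, and face-grid data."""
    h, w = len(image), len(image[0])

    def crop(box):
        left, top, right, bottom = box
        return [row[left:right] for row in image[top:bottom]]

    # Face grid: 1 where the face bounding box covers the sampled pixel.
    left, top, right, bottom = face_box
    grid = [[0] * grid_size for _ in range(grid_size)]
    for gy in range(grid_size):
        for gx in range(grid_size):
            px, py = gx * w // grid_size, gy * h // grid_size
            if left <= px < right and top <= py < bottom:
                grid[gy][gx] = 1

    return {
        "left_eye": crop(left_eye_box),
        "right_eye": crop(right_eye_box),
        "face": crop(face_box),
        "face_grid": grid,
    }
```

The face grid gives the model a cheap cue about where the face lies in the frame, which complements the tightly cropped eye and face images.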
  • using the first image to determine the feature data specifically includes: performing face correction on the first image to obtain a first image with a corrected facial image; and determining the feature data based on the corrected first image.
  • in this way, the electronic device can perform face correction on the images collected by the camera module to improve the accuracy of the left-eye image, right-eye image, and face image.
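One common way to realize such face correction is to estimate the head's roll angle from the two eye centers and rotate to undo it, so the eyes end up on a horizontal line. The sketch below illustrates that idea under the assumption of 2D eye keypoints; the patent's actual correction pipeline may differ.

```python
import math

# One common alignment approach (an assumption, not the patent's algorithm):
# estimate head roll from the two eye centers, then rotate to undo it.
def face_roll_angle(left_eye, right_eye):
    """Roll angle (radians) of the line through the eye centers; 0 means the
    eyes already lie on a horizontal line."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.atan2(dy, dx)

def correct_point(point, center, angle):
    """Rotate `point` by -angle about `center`, undoing the measured roll so
    that, after correction, the two eyes share the same y coordinate."""
    x, y = point[0] - center[0], point[1] - center[1]
    c, s = math.cos(-angle), math.sin(-angle)
    return (center[0] + x * c - y * s, center[1] + x * s + y * c)
```

In practice the same rotation would be applied to the whole image before cropping the eye and face regions.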
  • the first image is stored in a secure data buffer; before determining the user's first eye gaze area based on the first image, the method further includes: obtaining the first image from the secure data buffer in a trusted execution environment.
  • the electronic device can store the images collected by the camera module in the secure data buffer.
  • the image data stored in the secure data buffer can only be transmitted to the eye gaze recognition algorithm through the secure transmission channel provided by the security service, thereby improving the security of the image data.
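The access restriction described above can be caricatured in a few lines. `SecureDataBuffer` and the `trusted` flag are purely illustrative stand-ins for the hardware-layer buffer and the secure channel provided by the security service; a real implementation lives in a trusted execution environment, not in application code.

```python
# Purely illustrative stand-in: a real secure buffer lives at the hardware
# layer and is mediated by a TEE; here a `trusted` flag on the channel object
# models the secure transmission channel the security service provides.
class SecureDataBuffer:
    def __init__(self):
        self._frames = []

    def store(self, frame):
        """Camera frames are written into the secure buffer."""
        self._frames.append(frame)

    def read_all(self, channel):
        """Frames can only be read through a trusted channel, i.e. from the
        trusted execution environment."""
        if not getattr(channel, "trusted", False):
            raise PermissionError("secure buffer is only readable via a trusted channel")
        return list(self._frames)
```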
  • the secure data buffer is set at the hardware layer of the electronic device.
  • the present application provides an electronic device, which includes one or more processors and one or more memories; the one or more memories are coupled to the one or more processors.
  • the one or more memories are used to store computer program code.
  • the computer program code includes computer instructions.
  • embodiments of the present application provide a chip system, which is applied to an electronic device.
  • the chip system includes one or more processors, and the processors are used to call computer instructions to cause the electronic device to execute the method described in the first aspect and any possible implementation manner of the first aspect.
  • the present application provides a computer-readable storage medium, including instructions.
  • when the above instructions are run on an electronic device, they cause the electronic device to execute the method described in the first aspect and any possible implementation manner of the first aspect.
  • the present application provides a computer program product containing instructions.
  • when the computer program product is run on an electronic device, it causes the electronic device to execute the method described in the first aspect and any possible implementation manner of the first aspect.
  • the electronic device provided by the second aspect, the chip system provided by the third aspect, the computer storage medium provided by the fourth aspect, and the computer program product provided by the fifth aspect are all used to execute the method provided by this application. Therefore, for the beneficial effects they can achieve, reference may be made to the beneficial effects of the corresponding methods, which will not be described again here.
  • Figures 1A-1C are a set of user interfaces provided by embodiments of the present application.
  • Figures 2A-2H are a set of user interfaces provided by embodiments of the present application.
  • Figures 3A-3H are a set of user interfaces provided by embodiments of the present application.
  • Figures 4A-4D are a set of user interfaces provided by embodiments of the present application.
  • Figures 5A-5D are a set of user interfaces provided by embodiments of the present application.
  • Figures 6A-6E are a set of user interfaces provided by embodiments of the present application.
  • Figure 7 is a flow chart of a display method provided by an embodiment of the present application.
  • Figure 8 is a structural diagram of an eye gaze recognition model provided by an embodiment of the present application.
  • Figure 9A is a flow chart of face correction provided by an embodiment of the present application.
  • Figures 9B-9D are schematic diagrams of a set of face corrections provided by embodiments of the present application.
  • Figure 10 is a structural diagram of a convolutional network of an eye gaze recognition model provided by an embodiment of the present application.
  • Figure 11 is a schematic diagram of a separable convolution technology provided by an embodiment of the present application.
  • Figure 12 is a schematic system structure diagram of the terminal 100 provided by the embodiment of the present application.
  • Figure 13 is a schematic diagram of the hardware structure of the terminal 100 provided by the embodiment of the present application.
  • after the phone is unlocked and the desktop is displayed, the phone usually first detects the user's pull-down operation.
  • the above-mentioned pull-down operation refers to an operation of sliding downward from the top of the screen.
  • the mobile phone may display a notification interface.
  • the interface shown in Figure 1B may be called a notification interface.
  • the notification interface may display one or more notifications received by the mobile phone, such as notification 121, notification 122, and notification 123.
  • One or more of the above notification information comes from the operating system, system applications and/or third-party applications installed on the mobile phone.
  • the user first performs a pull-down operation and instructs the phone to display the notification interface because: after turning on the phone, the user usually first wants to check notifications and confirm whether there are any urgent matters to be processed.
  • in a scenario where face unlock is enabled, the phone can quickly detect whether the user's facial image matches, and then quickly complete the face unlock and display the desktop. This makes it difficult for the user to carefully check and confirm notifications before the desktop is displayed.
  • the user is more likely to perform a pull-down operation after displaying the desktop and instruct the mobile phone to display the notification interface to check and confirm the notification.
  • the mobile phone displays a notification interface that can give the user prompts and instruct the user what operations they can perform.
  • the notification interface can display notifications for updating software, notifications for receiving calls or messages from contacts, etc.
  • the mobile phone can first display the notification interface showing the above notification. In this way, the user can confirm operations such as updating the software or replying to the contact with a call or message, thereby providing the user with efficient notification reminders and improving the user experience.
  • embodiments of the present application provide a display method.
  • This method can be applied to terminal devices such as mobile phones and tablet computers.
  • terminal devices such as mobile phones and tablet computers that implement the above method may be denoted as the terminal 100.
  • hereinafter, the terminal 100 will be used to refer to the above-mentioned terminal devices such as mobile phones and tablet computers.
  • the terminal 100 can also be a desktop computer, a laptop computer, a handheld computer, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a cellular phone, a personal digital assistant (PDA), an augmented reality (AR) device, a virtual reality (VR) device, an artificial intelligence (AI) device, a wearable device, a vehicle-mounted device, a smart home device, and/or smart city equipment.
  • the terminal 100 can detect that the user picks up the mobile phone to perform an unlocking operation. After detecting successful unlocking, the terminal 100 can enable the 2D camera and the 3D camera to collect the user's face data, thereby determining the user's eyeball gaze position.
  • the above-mentioned 2D camera refers to a camera that generates two-dimensional images, such as a camera commonly used on mobile phones that generates RGB images.
  • the above-mentioned 3D camera refers to a camera that can generate a three-dimensional image or a camera that can generate an image including depth information, such as a TOF camera.
  • the three-dimensional images generated by 3D cameras also include depth information, that is, information indicating the distance between the photographed object and the 3D camera.
  • the user's eyeball gaze position refers to the position where the user's gaze focuses on the screen of the terminal 100 when the user gazes at the terminal 100 .
  • a cursor point S may be displayed on the screen of the terminal 100.
  • the position where the user's gaze focuses on the screen shown in FIG. 1C is the cursor point S; that is, the user's eyeball gaze position is the cursor point S.
  • the cursor point S can be any position on the screen.
  • the arbitrary position may correspond to an application icon or control, or it may be a blank display area.
  • the user may look directly at the cursor point S when facing the screen, or may look sideways at the cursor point when not facing the screen. That is to say, the terminal 100 does not limit the gesture of the user looking at the screen, and the terminal 100 can determine the user's eyeball gaze position in various head postures.
  • the terminal 100 may display the notification interface shown in FIG. 1B. As shown in FIG. 1C , the area enclosed by the dotted frame 131 may be called the notification bar area. When the user's eyeball gaze position is within the notification bar area of the terminal 100, it may mean that the user is looking at the notification bar.
  • the terminal 100 can determine whether the user is looking at the notification bar within 3 seconds after the unlocking is successful. Therefore, the terminal 100 can quickly determine whether to display the notification interface after completing the unlocking, which not only realizes interactive control through eye gaze, but also prevents long interaction time from affecting the user experience.
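The 3-second post-unlock window can be sketched as a bounded scan over captured frames. `classify_gaze` is a hypothetical stand-in for the eye gaze recognition model, and 90 frames is an assumed approximation of 3 seconds at 30 fps.

```python
# Sketch of the bounded detection window; `classify_gaze` stands in for the
# eye gaze recognition model and maps one camera frame to a region name.
def detect_gaze_after_unlock(frames, classify_gaze, window_frames=90):
    """Scan at most `window_frames` frames (roughly 3 s at 30 fps) and report
    whether any frame shows the user gazing at the notification bar."""
    for frame in frames[:window_frames]:
        if classify_gaze(frame) == "notification_bar":
            return True
    return False
```

If no gaze at the notification bar is found within the window, the terminal simply keeps the main interface and turns the camera off.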
  • eye gaze can not only achieve the same control display effect as the touch operation, but also reduce the user's usage restrictions. For example, in scenarios where it is inconvenient for users to perform touch actions, such as cooking, cleaning, etc., eye gaze can provide users with a convenient interactive operation experience. In a scenario where the user does not know what to do after unlocking the terminal 100, eye gaze can instruct the terminal 100 to display a notification interface to prompt the user what operations can be performed next.
  • the terminal 100 can also automatically display the detailed information of another notification after the user finishes processing one notification, prompting the user to process it, thereby saving user operations and improving the user experience.
  • FIG. 2A exemplarily shows the user interface of the terminal 100 that is on but not unlocked (interface to be unlocked).
  • the time and date can be displayed on the interface to be unlocked for the user to view.
  • the terminal 100 can enable the camera module 210 to collect and generate image frames.
  • the above image frame may include the user's facial image.
  • the terminal 100 can perform facial recognition on the above-mentioned image frame, and determine whether the above-mentioned facial image is the facial image of the owner, that is, determine whether the user performing the unlocking operation is the owner himself.
  • the camera module 210 may include multiple camera devices.
  • the camera module 210 of the terminal 100 includes at least a 2D camera and a 3D camera.
  • the camera module 210 may also include multiple 2D cameras and multiple 3D cameras, which is not limited in the embodiments of the present application.
  • the number of cameras used by the terminal 100 is one.
  • this camera is a 3D camera.
  • the terminal 100 may display the user interface shown in Figures 2B-2C.
  • the terminal 100 may display the user interface (unlocking success interface) shown in FIG. 2B.
  • An icon 211 may be displayed on the successful unlocking interface.
  • the icon 211 can be used to prompt the user that the face unlock is successful.
  • the terminal 100 may display the user interface shown in FIG. 2C. This interface may be called the main interface of the terminal 100 .
  • the terminal 100 can also adopt password unlocking (graphic password, digital password), fingerprint unlocking and other unlocking methods. After the unlocking is successful, the terminal 100 can also display the main interface shown in Figure 2C.
  • the main interface may include a notification bar 221, a page indicator 222, a frequently used application icon tray 223, and a plurality of other application icon trays 224.
  • the notification bar may include one or more signal strength indicators of mobile communication signals (also known as cellular signals), such as signal strength indicator 221A and signal strength indicator 221B, a wireless fidelity (Wi-Fi) signal strength indicator 221C, a battery status indicator 221D, and a time indicator 221E.
  • the page indicator 222 may be used to indicate the positional relationship of the currently displayed page to other pages.
  • the main interface of the terminal 100 may include multiple pages.
  • the interface shown in Figure 2C may be one of the multiple pages mentioned above.
  • the main interface of the terminal 100 also includes other pages. This other page is not shown in Figure 2C.
  • the terminal 100 can display other pages mentioned above, that is, switch pages.
  • the page indicator 222 will also change to different forms to indicate different pages; this will be introduced in detail in subsequent embodiments.
  • the frequently used application icon tray 223 may include multiple common application icons (such as a camera application icon, an address book application icon, a phone application icon, and an information application icon), and the frequently used application icons remain displayed when the page is switched.
  • the above common application icons are optional and are not limited in this embodiment of the present application.
  • the other application icon tray 224 may include a plurality of general application icons, such as a settings application icon, an application market application icon, a gallery application icon, a browser application icon, etc.
  • General application icons may be distributed in other application icon trays 224 on multiple pages of the main interface.
  • the general application icons displayed in the other application icon tray 224 will be changed accordingly when the page is switched.
  • the icon of an application can be a general application icon or a commonly used application icon. When the above icon is placed in the common application icon tray 223, the above icon is a common application icon; when the above icon is placed in the other application icon tray 224, the above icon is a general application icon.
  • FIG. 2C only illustrates a main interface or a page of a main interface of the terminal 100, and should not be construed as limiting the embodiments of the present application.
  • the terminal 100 may collect and generate an image frame including the user's face through the camera module 210 .
  • the number of cameras used by the terminal 100 is two, including a 2D camera and a 3D camera. Of course, it is not limited to one 2D camera and one 3D camera.
  • the terminal 100 can also use more cameras to obtain more user facial features, especially eye features, so as to determine the user's eyeball gaze position more quickly and accurately.
  • in the face-unlock scenario described above, the 3D camera of the terminal 100 is already turned on; therefore, the terminal 100 only needs to turn on the 2D camera of the camera module 210.
  • in other scenarios, the cameras of the terminal 100 are turned off; at this time, the terminal 100 needs to turn on both the 2D camera and the 3D camera in the camera module 210.
  • the time during which the terminal 100 collects and generates image frames is the first 3 seconds of displaying the main interface shown in Figure 2C.
  • the terminal 100 can turn off the camera module 210 to save power consumption.
  • setting the gaze recognition time too short, such as 1 second, may result in inaccurate eye gaze recognition results.
  • setting the gaze recognition time too long, such as 7 seconds or 10 seconds, will result in excessive power consumption, which is not conducive to the battery life of the terminal 100.
  • the gaze recognition time can also take other values, such as 2.5 seconds, 3.5 seconds, 4 seconds, etc., which are not limited in the embodiments of the present application.
  • the subsequent introduction will take 3 seconds as an example.
  • the camera module 210 may continuously collect and generate image frames including the user's facial image. Then, the terminal 100 may recognize the user's eyeball gaze position using the above image frame. Referring to FIG. 2D , when it is recognized that the user's eyeball gaze position is within the notification bar 221 , it is determined that the user is gazing at the notification bar 221 , and the terminal 100 may display the notification interface shown in FIG. 2E for the user to obtain notification information.
  • the notification interface shown in Figure 2E is the same as Figure 1B and will not be described again here.
  • the terminal 100 provides the user with the ability to control the display of the notification interface through eye gaze.
  • the user only needs to look at the notification bar 221 to get the notification interface without performing a pull-down operation, which saves user operations.
  • the above-mentioned interaction method based on eye gaze can provide users with great convenience.
  • the terminal 100 can determine the scene in which the user needs to display the notification interface (a period of time when the main interface starts to be displayed), and then provide the user with an eye gaze recognition service in the corresponding scene, avoiding the resource waste caused by keeping the camera on for a long time.
  • the terminal 100 can also enable eye gaze recognition when entering an application through a notification of the interface to be unlocked and displaying an interface of the application.
  • the terminal 100 can turn on the camera in real time and obtain the user's eyeball gaze position to determine whether the user controls the display of the notification interface through eyeball gaze.
  • the terminal 100 may detect whether the user is looking at the top of the screen or a pop-up banner notification.
  • the terminal 100 may display a notification interface or an interface corresponding to the banner notification, or the like.
  • Figure 2F exemplarily shows a main interface including multiple pages, each of which may be called a main interface.
  • the main interface may include page 20, page 21, and page 22.
  • Page 21 may be called the first desktop; the first desktop is also called the homepage, home screen, or start screen. It is understandable that when the main interface has only one page, the page indicator 222 has only one dot.
  • Page 22 can be called the second desktop. It can be understood that the second desktop is the desktop adjacent to the first desktop on the right. For example, when the first desktop is displayed and the user's sliding operation from right to left is detected, the second desktop is displayed.
  • Page 20 can be called the negative screen. It can be understood that the negative screen is the interface adjacent to the first desktop on the left, and it can be a functional page. For example, when the first desktop is displayed and the user's sliding operation from left to right is detected, the negative screen is displayed.
  • the page layout of the second desktop is the same as that of the first desktop, which will not be described again here.
  • the number of desktops in the main interface can be increased or reduced according to the user's settings. Only the first desktop, the second desktop, etc. are shown in FIG. 2F.
  • the main interface displayed by the terminal 100 is actually the first desktop in the main interface shown in FIG. 2F .
  • the terminal 100 first displays the first desktop.
  • the terminal 100 may display the negative screen, the first desktop or the second desktop.
  • which one of the negative screen, the first desktop, or the second desktop the terminal 100 displays depends on the page the user stayed on when last exiting.
  • the terminal 100 may also display the main interface shown in FIG. 2G or 2H (the second desktop or the negative screen of the main interface).
  • the terminal 100 can also collect and generate image frames including the user's facial image through the camera module 210 to identify whether the user is looking at the notification bar 221 . If it is recognized that the user's eyeball gaze position is within the notification bar 221, the terminal 100 may also display the notification interface shown in FIG. 2E for the user to obtain notification information.
  • the terminal 100 can detect the user's eyeball gaze position within the first 3 seconds, thereby meeting the user's need to view notifications first after unlocking.
  • the terminal 100 can also enable the camera module 210 to collect and generate an image frame including the user's facial image, and identify whether the user is looking at the notification bar 221.
  • the terminal 100 may first display the first desktop. Within the first 3 seconds of displaying the first desktop, the terminal 100 can enable the camera module 210 to collect and generate image frames including the user's facial image, and identify whether the user is looking at the notification bar 221.
  • the terminal 100 may detect a left sliding operation (an operation of sliding from the right side to the left side of the screen). In response to the above operation, the terminal 100 may display the second desktop, refer to FIG. 3B. At this time, within the first 3 seconds of displaying the second desktop, the terminal 100 can also enable the camera module 210 to collect and generate image frames including the user's facial image, and identify whether the user is looking at the notification bar 221. As shown in FIGS. 3C to 3D , when it is recognized that the user is looking at the notification bar 221 , the terminal 100 may also display a notification interface. The notification interface shown in Figure 3D is the same as Figure 1B, and will not be described again here.
  • the terminal 100 may also detect a right sliding operation (an operation of sliding from the left side to the right side of the screen), see Figure 3E. In response to the above operation, the terminal 100 may display a negative screen, see FIG. 3F. Similarly, within the first 3 seconds of displaying the negative screen, the terminal 100 can also enable the camera module 210 to collect and generate image frames including the user's facial image, and identify whether the user is looking at the notification bar 221. As shown in Figures 3G-3H, when it is recognized that the user is looking at the notification bar 221, the terminal 100 may also display a notification interface.
  • the terminal 100 can detect the user's eyeball gaze position multiple times, providing the user with multiple opportunities to control the display by eyeball gaze.
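  • The gaze recognition window described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the notification bar rectangle, the sampling rate, and the 3-second window are assumed example values.

```python
# Assumed example values: the notification bar rectangle (x0, y0, x1, y1)
# in screen pixels, the camera sampling rate, and the 3-second window.
NOTIFICATION_BAR = (0, 0, 1080, 120)
GAZE_RECOGNITION_TIME = 3.0   # seconds
FPS = 10                      # gaze samples per second

def in_notification_bar(gaze_xy, bar=NOTIFICATION_BAR):
    """True if an (x, y) gaze position falls inside the notification bar."""
    x, y = gaze_xy
    x0, y0, x1, y1 = bar
    return x0 <= x <= x1 and y0 <= y <= y1

def run_gaze_window(gaze_samples, fps=FPS, duration=GAZE_RECOGNITION_TIME):
    """Consume gaze samples for one recognition window after a page is shown.

    `gaze_samples` stands in for the camera + recognition pipeline; each
    item is an (x, y) position or None (no face found). Returns True as
    soon as a sample hits the bar (show the notification interface and
    stop the camera early), or False when the window expires (stop the
    camera to save power).
    """
    budget = int(fps * duration)
    for i, gaze in enumerate(gaze_samples):
        if i >= budget:
            break
        if gaze is not None and in_notification_bar(gaze):
            return True
    return False
```

  • As the sketch shows, the camera only runs for at most one window per page display, which matches the power-saving behavior described above.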
  • the terminal 100 may display the main interface. At this time, users often also want to open the notification interface to see what notifications are pending. Therefore, in some embodiments, when detecting a return to the main interface from a running application, the terminal 100 may also detect whether the user is looking at the notification bar, and then determine whether to display the notification interface.
  • FIG. 4A exemplarily shows a user interface displayed by the terminal 100 when running the gallery application, which is denoted as gallery interface.
  • the user can browse image resources such as pictures and videos stored on the terminal 100 through the gallery interface.
  • the terminal 100 may detect a slide-up operation (a slide-up operation from the bottom of the screen), see FIG. 4B .
  • the terminal 100 may display the main interface, see FIG. 4C.
  • the terminal 100 can also enable the camera module 210 to collect and generate image frames including the user's facial image, and identify whether the user is looking at the notification bar 221.
  • the terminal 100 may also display a notification interface, see FIG. 4D.
  • the gallery application is an exemplary application program installed on the terminal 100. Not limited to the gallery application, when detecting an operation to return to the main interface from any other application, the terminal 100 can enable the camera module 210 to collect and generate image frames including the user's facial image, identify whether the user is looking at the notification bar 221, and then determine whether to display the notification interface.
  • the terminal 100 can also detect the user's eye gaze position, thereby meeting the user's need to view pending notifications after using an application.
  • the terminal 100 may also confirm the number of notifications in the notification interface. If two or more notifications are displayed, the terminal 100 can automatically display other notifications in the notification interface after the user finishes processing one notification.
  • the terminal 100 may detect a user operation acting on a certain notification. In response to the above operation, the terminal 100 can expand the notification and display the detailed notification content corresponding to the notification.
  • the terminal 100 may detect a user operation acting on the notification 121.
  • the notification 121 is an exemplary information notification received by the terminal 100.
  • the terminal 100 can display the user interface for sending and receiving information as shown in FIG. 5B, which is referred to as the information interface.
  • the information interface may include contacts 511, information 512, and input fields 513.
  • Contact 511 may indicate the source of the received information. For example, “Lisa” can indicate that the sender of the information displayed in the interface is "Lisa”.
  • Information 512 can display complete information content.
  • the input field 513 may be used to receive input information from a user of the terminal 100 . When the user wants to reply to "Lisa", the user can click on the input field 513. In response to the above click operation, the terminal 100 may display an input keyboard, receive the user's input information, and display it in the input field 513 . After completing the input, in response to the user's sending operation, the terminal 100 may send the information in the above-mentioned input field 513 to "Lisa".
  • the information interface also includes multiple information type options.
  • a message type option can be used to send a special type of message.
  • photo option 514 may be used to send photo type information. Users can send various special information such as photos, emoticons, red envelopes, locations, etc. to their contacts through multiple information type options.
  • the terminal 100 can monitor the user's operations to determine whether the user has finished processing the notification. Specifically, after displaying the information interface shown in FIG. 5B, the terminal 100 may monitor whether a user operation acting on the above information interface is detected within the first waiting period. If no user operation is detected within the first waiting period, the terminal 100 may determine that the user has finished processing the notification. If a user operation is detected within the first waiting period, then from the moment the user operation is detected, the terminal 100 restarts counting the first waiting period and continues to detect user operations within the new period. If no user operation is detected, the terminal 100 may determine that the user has finished processing the notification; otherwise, the terminal 100 again restarts counting the first waiting period and continues to detect user operations.
  • the above-mentioned first waiting time is preset, for example, 5 seconds.
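  • The restartable first waiting period can be sketched as a simple timestamp computation. This is an illustrative model only; `notification_processed_at` is a hypothetical helper name, and the 5-second value follows the example above.

```python
FIRST_WAITING_PERIOD = 5.0   # seconds, the preset value in the example above

def notification_processed_at(display_time, op_times, wait=FIRST_WAITING_PERIOD):
    """Return the time at which the user counts as done with a notification.

    `op_times` are timestamps of user operations on the detail interface,
    which was displayed at `display_time`. Every operation that arrives
    before the current waiting period expires restarts the period; the
    notification is considered processed `wait` seconds after the last
    such operation (or after display, if there were none).
    """
    last = display_time
    for t in sorted(op_times):
        if t - last < wait:
            last = t       # arrived in time: restart the waiting period
        else:
            break          # the period had already expired before t
    return last + wait
```

  • For example, with the interface displayed at t = 0 and operations at t = 2 and t = 4, the notification counts as processed at t = 9; an operation at t = 10 arrives too late to restart an already-expired period.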
  • the terminal 100 may display the notification interface, see FIG. 5C. At this time, the notification interface does not include the processed notification 121, but only the remaining notifications 122 and 123.
  • the user may choose to click notification 122.
  • the user can also choose to click notification 123.
  • notification 122 is taken as an example.
  • the terminal 100 may display a page containing detailed content of the notification 122.
  • the notification 122 may be a weather forecast notification.
  • the terminal 100 may display a user interface showing current weather and weather forecast information as shown in FIG. 5D , which is denoted as a weather interface. Therefore, users can quickly obtain weather information.
  • after a notification is processed, the terminal 100 automatically displays the notification interface again to remind the user to process the other unprocessed notifications in the notification interface. This also makes it convenient for the user to process notifications, without requiring a pull-down operation every time.
  • the terminal 100 can collect and generate image frames including the user's facial image, identify the user's eye gaze position, and then determine whether to display the notification bar to provide convenience for the user to view notifications.
  • when the eye gaze recognition function is disabled, the terminal 100 will not collect the user's facial image for identifying the user's eye gaze position.
  • FIGS. 6A-6D exemplarily illustrate a set of user interfaces for enabling or disabling the eye gaze recognition function.
  • FIG. 6A exemplarily shows the setting interface on the terminal 100.
  • Multiple setting options may be displayed on the setting interface, such as account setting option 611, WLAN option 612, Bluetooth option 613, mobile network option 614, etc.
  • the setting interface also includes auxiliary function options 615. Accessibility option 615 can be used to set some shortcut operations.
  • Terminal 100 may detect user operations on accessibility options 615 .
  • the terminal 100 may display the user interface shown in FIG. 6B , which is referred to as the auxiliary function setting interface.
  • the interface may display multiple accessibility options, such as accessibility options 621, one-handed mode options 622, and so on.
  • the auxiliary function setting interface also includes quick start and gesture options 623. Quick start and gesture options 623 can be used to set some gesture actions and eye gaze actions to control interaction.
  • the terminal 100 may detect user operations on the quick launch and gesture options 623 .
  • the terminal 100 may display the user interface shown in FIG. 6C , which is denoted as the quick startup and gesture setting interface.
  • the interface can display multiple quick startup and gesture setting options, such as smart voice option 631, screenshot option 632, screen recording option 633, and quick call option 634.
  • the quick start and gesture setting interface also includes an eye gaze option 635.
  • the eye gaze option 635 can be used to set the eye gaze recognition area and corresponding shortcut operations.
  • Terminal 100 may detect user operations on eye gaze option 635 .
  • the terminal 100 may display the user interface shown in FIG. 6D , which is referred to as the eye gaze setting interface.
  • the interface may display multiple function options based on eye gaze recognition, such as notification bar option 641 .
  • When the switch in the notification bar option 641 is "ON", it means that the terminal 100 has enabled the notification bar gaze recognition function shown in FIGS. 2A to 2H, 3A to 3D, and 4A to 4D.
  • When the switch in the notification bar option 641 is "OFF", it means that the terminal 100 has not enabled the above notification bar gaze recognition function. Therefore, when the unlocking succeeds and the main interface is displayed, when switching between pages of the main interface, or when returning to the main interface, the terminal 100 will not collect the user's facial image, nor will it determine whether the user is looking at the notification bar.
  • the eye gaze setting interface may also include a payment code option 642 and a health code option 643.
  • the payment code option 642 can be used to turn on or off the function of eye gaze controlling the display of payment codes.
  • the terminal 100 can use the collected image frames containing the user's facial image to confirm whether the user is gazing at the upper right corner area of the screen.
  • the upper right corner area of the above screen may refer to the upper right corner area shown in Figure 6E.
  • the terminal 100 may display the payment code. In this way, users can quickly and easily obtain the payment code and complete the payment behavior, avoiding a large number of tedious user operations, improving interaction efficiency and user experience.
  • the health code option 643 can be used to turn on or off the function of eye gaze control to display the health code.
  • the terminal 100 can use the collected image frames containing the user's facial image to confirm whether the user is gazing at the lower left corner area of the screen; refer to the lower left corner area shown in Figure 6E.
  • the terminal 100 may display the health code. In this way, users can quickly and easily obtain the health code and complete the health check.
  • The mapping relationship between the payment code and the upper right corner area, and the mapping relationship between the health code and the lower left corner area, are exemplary. Developers or users can also set other mapping relationships, such as gazing at the upper left corner to display the payment code. This is not limited in the embodiments of the present application.
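  • Such a user-configurable mapping from gaze corner to shortcut could be sketched as follows; the corner fraction, screen size, and action names are assumed values for illustration.

```python
# Assumed values for illustration: corner regions occupy 25% of each screen
# dimension, and the screen is 1080 x 2340 pixels.
CORNER_FRACTION = 0.25

# Exemplary, user-configurable mapping, mirroring the embodiment above.
CORNER_ACTIONS = {
    "upper-right": "show_payment_code",
    "lower-left": "show_health_code",
}

def classify_corner(gaze_xy, screen_w, screen_h, frac=CORNER_FRACTION):
    """Map a gaze position to a named corner region, or None."""
    x, y = gaze_xy
    horiz = "left" if x <= screen_w * frac else (
        "right" if x >= screen_w * (1 - frac) else None)
    vert = "upper" if y <= screen_h * frac else (
        "lower" if y >= screen_h * (1 - frac) else None)
    return f"{vert}-{horiz}" if horiz and vert else None

def shortcut_for(gaze_xy, screen_w=1080, screen_h=2340):
    """Look up the shortcut bound to the corner the user is gazing at."""
    return CORNER_ACTIONS.get(classify_corner(gaze_xy, screen_w, screen_h))
```

  • Because the mapping is a plain dictionary, changing "pay attention to the upper left corner to display the payment code" is a one-entry edit, matching the configurability described above.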
  • FIG. 7 exemplarily shows a flow chart of a display method provided by an embodiment of the present application.
  • the following is a detailed introduction to the processing flow of the terminal 100 implementing the above display method in conjunction with the user interface shown in FIG. 7 and FIGS. 2A-2H.
  • the terminal 100 detects that the unlocking is successful and displays the main interface.
  • When the user is not using the terminal 100, the terminal 100 may be in a screen-off state or an AOD (Always On Display) state.
  • In the screen-off state, the display of the terminal 100 goes into a sleep state and becomes a black screen, but other devices and programs work normally.
  • The AOD state refers to lighting up part of the screen without lighting up the entire screen, that is, controlling part of the screen to light up based on the screen-off state.
  • the terminal 100 can light up the entire screen and display the interface to be unlocked as shown in Figure 2A. After lighting up the screen, the terminal 100 can enable the camera module 210 to collect and generate image frames including the user's facial image.
  • the camera module 210 of the terminal 100 includes at least a 2D camera and a 3D camera.
  • the camera used by the terminal 100 may be a 3D camera in the camera module 210 .
  • the terminal 100 may input the above image frame into the face recognition model.
  • the facial image of the owner can be stored in the face recognition model.
  • the face recognition model can identify whether the facial image in the above-mentioned image frame matches the facial image of the machine owner.
  • the above face recognition model is an existing technology and will not be described again here.
  • the terminal 100 can confirm that the unlocking is successful. At this time, the terminal 100 may display the unlocking success interface shown in FIG. 2B to prompt the user that the unlocking is successful. Subsequently, the terminal 100 may display the main interface shown in FIG. 2C.
  • the main interface may include multiple pages.
  • displaying the main interface by the terminal 100 includes displaying any page in the main interface, such as displaying the first desktop, displaying the second desktop, or displaying the negative screen.
  • in some embodiments, the terminal 100 may display the first desktop by default.
  • in other embodiments, the terminal 100 may continue to display the negative screen, the first desktop, or the second desktop according to the page where the user last exited. Therefore, after displaying the unlocking success interface shown in FIG. 2B, the terminal 100 may also display the main interface shown in FIG. 2G or 2H.
  • the terminal 100 can also directly display the main interface shown in Figure 2C, Figure 2G, or Figure 2H without displaying Figure 2B; that is, Figure 2B is not required.
  • the terminal 100 obtains an image frame including the user's facial image, and uses the above image frame to determine the user's eyeball gaze position.
  • the terminal 100 can set a gaze recognition time. Within the gaze recognition time, the terminal 100 can collect and generate image frames including the user's facial image through the camera module 210, so as to identify the user's eyeball gaze position.
  • an initial period of displaying the main interface can be set as the gaze recognition time, for example, the first 3 seconds of displaying the main interface shown in Figure 2C.
  • the above time is summarized based on the user's behavioral habits of controlling the display of the notification interface, and is the optimal time to meet the user's need to view the notification interface.
  • the terminal 100 will also enable the camera module 210 to collect and generate image frames including the user's facial image within the initial period of displaying the main interface (the gaze recognition time).
  • the image frames collected and generated during this time can be called the target input image.
  • the target input image can be used by the terminal 100 to determine whether the user is looking at the notification bar area, and thereby determine whether to display the notification interface.
  • in the scenario of face unlocking, the 3D camera of the terminal 100 is already turned on. Therefore, in S102, the terminal 100 only needs to turn on the 2D camera of the camera module 210. In the scenarios of password unlocking and fingerprint unlocking, the cameras of the terminal 100 are turned off. At this time, the terminal 100 needs to turn on both the 2D camera and the 3D camera in the camera module 210.
  • the terminal 100 may input the above image into the eye gaze recognition model.
  • the eye gaze recognition model is a model preset in the terminal 100 .
  • the eye gaze recognition model can identify the user's eye gaze position in the input image frame, and then the terminal 100 can determine whether the user is looking at the notification bar based on the above eye gaze position.
  • the subsequent Figure 8 will specifically introduce the structure of the eye gaze recognition model used in this application, which will not be elaborated here.
  • the eye gaze recognition model can also output the user's eye gaze area.
  • An eye-gaze area can be contracted into an eye-gaze position, and an eye-gaze position can also be expanded into an eye-gaze area.
  • a gaze point formed by one display unit on the screen can be called an eye gaze position; correspondingly, a gaze area formed by multiple display units on the screen can be called an eye gaze area.
  • the terminal 100 can determine whether the user is looking at the notification bar by determining the position of the eye gaze area on the screen, and then determine whether to display the notification interface.
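  • The overlap test between an eye gaze area and the notification bar region can be sketched with axis-aligned rectangles; the (x0, y0, x1, y1) representation is an assumption for illustration.

```python
def gaze_area_hits_bar(gaze_area, bar):
    """True if a rectangular eye gaze area overlaps the notification bar.

    Both rectangles are (x0, y0, x1, y1). A single eye gaze position is
    just the degenerate case with x0 == x1 and y0 == y1, matching the
    position/area relationship described above.
    """
    ax0, ay0, ax1, ay1 = gaze_area
    bx0, by0, bx1, by1 = bar
    return ax0 <= bx1 and bx0 <= ax1 and ay0 <= by1 and by0 <= ay1
```

  • This single predicate covers both model outputs: an eye gaze area contracted to a position, or a position expanded to an area, as noted above.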
  • the terminal 100 determines that the user is gazing at the notification bar. In response to the user's action of looking at the notification bar, the terminal 100 displays a notification interface.
  • the terminal 100 may determine to display the notification interface. Referring to the user interfaces shown in FIGS. 2D to 2E , the terminal 100 can determine that the user's eyeball gaze position is within the notification bar 221 area, and thus the terminal 100 can display the notification interface.
  • the terminal 100 does not display the notification interface.
  • the preset eye gaze recognition time is not limited to the period before the main interface is displayed for the first time after successful unlocking.
  • the terminal 100 is also set with other eye gaze recognition times, for example, a period of time before updating the main interface after detecting the user's page switching operation, and a period of time before returning to the main interface after exiting an application.
  • the terminal 100 can also identify whether the user is looking at the notification bar, and then determine whether to display the notification interface. This will be described in detail later, but will not be discussed here.
  • the terminal 100 collects and generates image frames containing the user's facial image and identifies the eye gaze position at the same time. Therefore, before the gaze recognition time ends, if the terminal 100 recognizes that the user is gazing at the notification bar area, the terminal 100 may display the notification interface, and at the same time the camera module 210 may stop collecting and generating image frames. After the gaze recognition time is over, if the terminal 100 still has not recognized that the user is gazing at the notification bar area, the terminal 100 also stops collecting and generating image frames to save power consumption.
  • the terminal 100 determines that multiple notifications are displayed on the notification interface, and after detecting that the user has confirmed one notification, automatically displays the notification interface containing the remaining notifications. S104 is optional.
  • the terminal 100 may determine the number of notifications displayed by the notification interface. If the number of notifications displayed on the notification interface is multiple (two or more), the terminal 100 can automatically display the details of the remaining notifications after detecting that the user has confirmed one notification.
  • the terminal 100 may display the information interface shown in FIG. 5B. After displaying the above information interface, the terminal 100 can monitor user operations to determine whether the user has finished processing the notification.
  • the terminal 100 can determine whether the user has finished processing a notification through preset user operations.
  • the above-mentioned preset user operations include sliding up to return to the main interface.
  • the terminal 100 may also monitor whether a user operation acting on the above information interface is detected within the first waiting period. If no user operation is detected within the first waiting period, the terminal 100 may determine that the user has finished processing the notification. Referring to FIG. 5B, when the information sent by Lisa shown in FIG. 5B has been displayed for a period of time and no reply-editing operation by the user is detected, the terminal 100 may confirm that the user has finished processing the information notification. For another example, when it is detected that the user browses the interface corresponding to a certain notification, slides to a certain position on the interface, and stays there for more than a period of time, the terminal 100 can confirm that the user has finished processing the notification. For another example, when it is detected that the user browses the video corresponding to a certain notification and the video has finished playing, the terminal 100 can confirm that the user has finished processing the notification, and so on.
  • the terminal 100 can automatically display the notification interface. In this way, users can get the notification interface and view the remaining unprocessed notifications without performing a pull-down operation. The user can then proceed with the pending notifications mentioned above. This not only reminds users to process the remaining unprocessed notifications, but also provides convenience for users to process notifications, saving user operations.
  • the terminal 100 may also determine whether to automatically display the detailed content of the notification according to the type of the notification.
  • Notifications can be divided into transaction notifications and recommendation notifications.
  • Transactional notifications include ticket order notifications, itinerary reminder notifications, etc. sent after the user purchases a ticket.
  • Transactional notifications require user confirmation.
  • Recommendation notifications include, for example, notifications promoting air ticket discounts. Recommendation notifications can be ignored by the user. If the remaining notifications are recommendation notifications, the terminal 100 may not automatically display the detailed content of those notifications, so as to avoid disturbing the user and reducing the user experience.
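  • The distinction between transaction and recommendation notifications could be modeled as a simple filter; the kind labels below are hypothetical, since a real terminal would rely on notification metadata rather than these strings.

```python
# Hypothetical kind labels; a real terminal would read notification metadata.
TRANSACTION_KINDS = {"ticket_order", "itinerary_reminder", "payment_result"}

def notifications_to_auto_show(remaining):
    """Keep only transaction notifications for automatic re-display.

    `remaining` is a list of (kind, text) pairs. Recommendation
    notifications are filtered out so the user is not disturbed.
    """
    return [(kind, text) for kind, text in remaining
            if kind in TRANSACTION_KINDS]
```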
  • After the gaze recognition time is over, if the terminal 100 has not yet recognized that the user is gazing at the notification bar area, the terminal 100 also stops collecting and generating image frames. At this time, the user may switch the currently displayed main interface or open an application.
  • the terminal 100 can also enable the camera module 210 to collect and generate image frames to determine whether the user is looking at the notification bar, and then determine whether to display the notification interface.
  • the terminal 100 may also set an eye gaze recognition time, and identify whether the user is looking at the notification bar within the eye gaze recognition period. After detecting the user's action of looking at the notification bar, the terminal 100 also displays a notification interface.
  • Combined with the user interfaces shown in FIGS. 3A to 3D, within the first 3 seconds of initially displaying the first desktop of the main interface, the terminal 100 may not recognize the user gazing at the notification bar. At this time, the terminal 100 will not display the notification interface. At the same time, after the 3 seconds, the terminal 100 will turn off the camera module 210 to reduce power consumption.
  • the terminal 100 may detect a left sliding operation (an operation of switching the main interface). In response to the above operation, the terminal 100 may display the second desktop of the main interface. At this time, the first 3 seconds when the second desktop is displayed can also be set as the gaze recognition time. Therefore, within the first 3 seconds of displaying the second desktop, the terminal 100 can also enable the camera module 210 to collect and generate image frames to determine whether the user is looking at the notification bar and then determine whether to display the notification interface.
  • the gaze recognition time of different pages of the main interface can also be different.
  • the first 2 seconds of the second desktop can be set as the gaze recognition time of the second desktop.
  • the terminal 100 can detect the user's eyeball gaze position multiple times, providing the user with multiple opportunities to control the display by eyeball gaze. At the same time, the terminal 100 also avoids keeping the camera in a working state all the time and avoids the problem of excessive power consumption.
  • the terminal 100 may also set the eye gaze recognition time, and identify whether the user is looking at the notification bar within the eye gaze recognition period. After detecting the user's action of looking at the notification bar, the terminal 100 also displays a notification interface.
  • the terminal 100 can detect the slide-up operation, that is, the operation of exiting the gallery and returning to the main interface. In response to the above operation, the terminal 100 may display the main interface. At this time, the first 3 seconds of displaying the above main interface can also be set as the gaze recognition time. During this time, the camera module 210 of the terminal 100 can collect and generate image frames. When detecting that the user looks at the notification bar, the terminal 100 may also display the notification interface.
  • the scene in which the user controls the display of the notification interface through eye gaze is not limited to a fixed period of time after unlocking. Users can control the display of the notification interface through eye gaze in more scenarios, such as switching the main interface and returning to the main interface.
  • the terminal 100 only activates the camera when it recognizes the preset trigger scene and identifies the user's eyeball gaze position, thus avoiding the problems of resource waste and high power consumption caused by the camera being in working state for a long time.
  • Figure 8 exemplarily shows the structure of the eye gaze recognition model.
  • the eye gaze recognition model used in the embodiment of the present application will be introduced in detail below with reference to Figure 8 .
  • the eye gaze recognition model is established based on convolutional neural networks (Convolutional Neural Networks, CNN).
  • the eye gaze recognition model may include: a face correction module, a dimensionality reduction module, and a convolutional network module.
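  • The three-stage pipeline named above can be sketched as follows. The stage bodies are placeholders, not the modules of this application: correction passes the frame through unchanged, dimensionality reduction is plain striding over a 2D pixel list, and the convolutional network is stubbed by a caller-supplied function.

```python
class EyeGazeModel:
    """Minimal sketch of the three-stage pipeline: face correction,
    dimensionality reduction, convolutional network.

    Each stage here is an illustrative stand-in, not the patented
    implementation.
    """

    def __init__(self, predict):
        self.predict = predict            # stands in for the CNN stage

    def correct_face(self, frame):
        # Placeholder: a real module rotates tilted faces upright.
        return frame

    def reduce_dim(self, frame, stride=2):
        # Keep every `stride`-th row and column of the 2D pixel list.
        return [row[::stride] for row in frame[::stride]]

    def gaze_position(self, frame):
        corrected = self.correct_face(frame)
        features = self.reduce_dim(corrected)
        return self.predict(features)
```

  • The point of the sketch is the data flow: every input frame passes through correction and dimensionality reduction before the network produces a gaze position, matching the module order listed above.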
  • the image including the user's face collected by the camera module 210 may first be input into the face correction module.
  • the face correction module can be used to identify whether the facial image in the input image frame is straight. For image frames in which the facial image is not straight (such as head tilt), the face correction module can correct the image frame to make it straight, thereby avoiding subsequent impact on the eye gaze recognition effect.
  • FIG. 9A exemplarily shows the correction processing flow of the face correction module for the image frames generated by the camera module 210 and containing the user's facial image.
  • S201: Use the facial key point recognition algorithm to determine the facial key points in the image frame T1.
  • the key points of the human face include the left eye, the right eye, the nose, the left lip corner, and the right lip corner.
  • Face key point recognition algorithms are existing technologies, such as the Kinect-based face key point recognition algorithm, and will not be described again here.
  • FIG. 9B exemplarily shows an image frame including a user's facial image, which is denoted as image frame T1.
  • The face correction module can use the facial key point recognition algorithm to determine the key points of the face in the image frame T1: left eye a, right eye b, nose c, left lip corner d, and right lip corner e, and determine the coordinate position of each key point; refer to image frame T1 in Figure 9C.
  • S202: Use the face key points to determine the calibrated line of the image frame T1, and then determine the face deflection angle θ of the image frame T1.
  • In a straight facial image, the left and right eyes are on the same horizontal line, so the straight line connecting the left-eye key point and the right-eye key point (the calibrated line) is parallel to the horizontal line; that is, the face deflection angle (the angle formed by the calibrated line and the horizontal line) θ is 0.
  • The face correction module can use the recognized coordinate positions of the left eye a and the right eye b to determine the calibrated line L1. Then, based on L1 and the horizontal line, the face correction module can determine the face deflection angle θ of the facial image in the image frame T1.
  • When θ ≠ 0, the facial image in the image frame T1 is not straight.
  • the face correction module can correct the image frame T1.
  • The face correction module can first use the coordinate positions of the left eye a and the right eye b to determine the rotation center point y, and then rotate the image frame T1 by θ° around point y to obtain an image frame with a straight facial image, which is recorded as image frame T2.
  • In image frame T2, point A can represent the position of the rotated left eye a, point B the position of the rotated right eye b, point C the position of the rotated nose c, point D the position of the rotated left lip corner d, and point E the position of the rotated right lip corner e.
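The correction steps above (determining the calibrated line, the face deflection angle θ, and the rotation about center point y) can be sketched numerically. The keypoint coordinates below are hypothetical, and a real implementation would rotate the pixel data of the whole frame as well, not just the keypoints:

```python
import math

def deflection_angle(left_eye, right_eye):
    """Angle (degrees) between the calibrated line joining the eyes
    and the horizontal line."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))

def rotate_point(p, center, angle_deg):
    """Rotate point p about center by angle_deg (counter-clockwise)."""
    rad = math.radians(angle_deg)
    x, y = p[0] - center[0], p[1] - center[1]
    return (center[0] + x * math.cos(rad) - y * math.sin(rad),
            center[1] + x * math.sin(rad) + y * math.cos(rad))

# Hypothetical keypoints for a tilted face: the right eye sits 10 px
# lower than the left eye in image coordinates.
a, b = (100.0, 100.0), (160.0, 110.0)
theta = deflection_angle(a, b)                      # face deflection angle
y_center = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)   # rotation center point y
A = rotate_point(a, y_center, -theta)               # rotated left eye
B = rotate_point(b, y_center, -theta)               # rotated right eye
```

After the rotation by -θ, points A and B lie on the same horizontal line, which is exactly the "straight" condition the face correction module checks for.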
  • S205: Process the corrected image frame obtained after rotation to obtain the left eye image, the right eye image, the face image, and the face grid data.
  • the face grid data can be used to reflect the position of the face image in the entire image.
  • the face correction module can center on the key points of the face and crop the corrected image according to the preset size, thereby obtaining the left eye image, right eye image, and face image corresponding to the image.
  • the face correction module may determine face mesh data.
  • the face correction module can determine a rectangle of fixed size centered on the left eye A.
  • the image covered by this rectangle is the left eye image.
  • the face correction module can determine the right eye image with the right eye B as the center, and the face image with the nose C as the center.
  • The sizes of the left eye image and the right eye image are the same, while the size of the face image is different from that of the left eye image.
  • the face correction module can correspondingly obtain the face grid data, that is, the position of the face image in the entire image.
  • the terminal 100 can obtain the corrected image frame T1, and obtain the corresponding left eye image, right eye image, facial image and face grid data from the above image frame T1.
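The keypoint-centered cropping described above can be sketched as follows. The crop sizes, image dimensions, and keypoint coordinates are illustrative assumptions, since the text only states that the crops are fixed-size and centered on key points:

```python
def crop_box(center, size, image_w, image_h):
    """Fixed-size crop box centered on a keypoint, clamped to the image."""
    half = size // 2
    left = max(0, min(center[0] - half, image_w - size))
    top = max(0, min(center[1] - half, image_h - size))
    return (left, top, left + size, top + size)

# Hypothetical corrected-frame size and keypoint positions.
IMG_W, IMG_H = 640, 480
left_eye, right_eye, nose = (220, 180), (330, 180), (275, 240)

left_eye_box = crop_box(left_eye, 64, IMG_W, IMG_H)    # left eye image
right_eye_box = crop_box(right_eye, 64, IMG_W, IMG_H)  # same size as left
face_box = crop_box(nose, 224, IMG_W, IMG_H)           # larger face image

# Face grid data: position of the face crop within the whole image,
# here expressed as normalized (left, top, right, bottom) coordinates.
face_grid = (face_box[0] / IMG_W, face_box[1] / IMG_H,
             face_box[2] / IMG_W, face_box[3] / IMG_H)
```

The clamping keeps every crop inside the frame; the normalized face grid is one common way to encode where the face sits in the full image.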
  • the face correction module can input the left eye image, right eye image, facial image and face mesh data output by itself into the dimensionality reduction module.
  • the dimensionality reduction module can be used to reduce the dimensionality of the input left eye image, right eye image, facial image and face grid data to reduce the computational complexity of the convolutional network module and improve the speed of eye gaze recognition.
  • the dimensionality reduction methods used by the dimensionality reduction module include but are not limited to principal component analysis (PCA), downsampling, 1*1 convolution kernel, etc.
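As one example of the listed options, downsampling can be sketched in a few lines (2*2 average pooling; the input values are arbitrary):

```python
def downsample_2x(image):
    """2x2 average pooling: halves each spatial dimension, so the
    convolutional network has a quarter of the pixels to process."""
    h, w = len(image), len(image[0])
    return [[(image[y][x] + image[y][x + 1]
              + image[y + 1][x] + image[y + 1][x + 1]) / 4
             for x in range(0, w - 1, 2)]
            for y in range(0, h - 1, 2)]

img = [[1, 3, 5, 7],
       [1, 3, 5, 7],
       [2, 4, 6, 8],
       [2, 4, 6, 8]]
small = downsample_2x(img)  # 4x4 -> 2x2
```

PCA and 1*1 convolutions reduce dimensionality along feature channels instead of spatial extent, but serve the same purpose of cutting the computational load of the convolutional network module.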
  • Each image that has undergone dimensionality reduction processing can be input into the convolutional network module.
  • The convolutional network module can output the eye gaze position based on the above input images.
  • the structure of the convolutional network in the convolutional network module can be referred to Figure 10.
  • the convolution network may include convolution group 1 (CONV1), convolution group 2 (CONV2), and convolution group 3 (CONV3).
  • a convolution group includes: convolution kernel (Convolution), activation function PRelu, pooling kernel (Pooling) and local response normalization layer (Local Response Normalization, LRN).
  • The convolution kernel of CONV1 is a 7*7 matrix and its pooling kernel is a 3*3 matrix; the convolution kernel of CONV2 is a 5*5 matrix and its pooling kernel is a 3*3 matrix; the convolution kernel of CONV3 is a 3*3 matrix and its pooling kernel is a 2*2 matrix.
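The three convolution groups described above can be written down as plain configuration data. Channel counts and strides are not specified in the text, so only the kernel sizes from the description appear here:

```python
# Kernel sizes per convolution group, taken from the description above.
CONV_GROUPS = [
    {"name": "CONV1", "conv_kernel": (7, 7), "pool_kernel": (3, 3)},
    {"name": "CONV2", "conv_kernel": (5, 5), "pool_kernel": (3, 3)},
    {"name": "CONV3", "conv_kernel": (3, 3), "pool_kernel": (2, 2)},
]
# Each group applies: Convolution -> PReLU -> Pooling -> LRN.

def conv_kernel_elements(group):
    """Number of stored values in one convolution kernel matrix."""
    k_h, k_w = group["conv_kernel"]
    return k_h * k_w
```

Note how the kernel element counts shrink group by group (49, 25, 9), consistent with the small-model goal stated below.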
  • separable convolution technology refers to decomposing an n*n matrix into an n*1 column matrix and a 1*n row matrix for storage, thereby reducing the demand for storage space. Therefore, the eye gaze module used in this application has the advantages of small size and easy deployment, so as to be adapted to be deployed on electronic devices such as terminals.
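The storage decomposition described here only applies to matrices that are actually separable (rank 1), where the n*n kernel is the outer product of its column and row factors. A minimal sketch with hypothetical kernel values:

```python
def outer(col, row):
    """Reconstruct an n*n kernel from its stored n*1 and 1*n factors."""
    return [[c * r for r in row] for c in col]

# A separable 3*3 kernel stored as 3 + 3 = 6 values instead of 3 * 3 = 9.
col, row = [1, 2, 1], [1, 0, -1]    # hypothetical kernel factors
kernel = outer(col, row)
stored = len(col) + len(row)         # 6 values with separable storage
full = len(kernel) * len(kernel[0])  # 9 values without
```

The saving grows with kernel size: a 7*7 kernel needs 14 stored values instead of 49, which is the kind of reduction that makes the model small enough to deploy on a terminal.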
  • the convolutional network may include connection layer 1 (FC1), connection layer 2 (FC2), and connection layer 3 (FC3).
  • FC1 may include a combination module (concat), a convolution kernel 101, PRelu, and a fully connected module 102. Among them, concat can be used to combine left eye images and right eye images.
  • the face image can be input into FC2 after passing through CONV1, CONV2, and CONV3.
  • FC2 may include a convolution kernel 103, PRelu, a fully connected module 104, and a fully connected module 105. FC2 can perform two full connections on face images.
  • the face mesh data can be input into FC3 after passing through CONV1, CONV2, and CONV3.
  • FC3 includes a fully connected module.
  • Connection layers with different structures are constructed for different types of images (such as left eye, right eye, and face images), which can better capture the characteristics of each type of image, thereby improving the accuracy of the model and enabling the terminal 100 to identify the user's eyeball gaze position more accurately.
  • the full connection module 106 can perform another full connection on the left eye image, right eye image, face image, and face grid data, and finally output the eye gaze position.
  • the eyeball gaze position indicates the specific position where the user's gaze focuses on the screen, that is, the user's gaze position. Refer to the cursor point S shown in Figure 1C. Furthermore, when the eyeball gaze position is within the notification bar area, the terminal 100 can determine that the user is gazing at the notification bar.
  • The convolutional neural network used by the eye gaze recognition model in this application has fewer parameters. Therefore, the time required to compute and predict the user's eyeball gaze position using the model is shorter; that is, the terminal 100 can quickly determine whether the user is gazing at a specific area such as the notification bar.
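The final region check, deciding whether the eyeball gaze position falls within the notification bar, can be sketched as a simple bounds test; the screen and notification-bar dimensions below are illustrative assumptions:

```python
def in_region(gaze, region):
    """True if the eyeball gaze position (x, y) falls inside a
    screen region given as (left, top, right, bottom)."""
    x, y = gaze
    left, top, right, bottom = region
    return left <= x < right and top <= y < bottom

# Hypothetical layout: the notification bar spans the top 80 px
# of a 1080-px-wide screen.
NOTIFICATION_BAR = (0, 0, 1080, 80)

looking_at_bar = in_region((540, 40), NOTIFICATION_BAR)     # gaze in the bar
looking_elsewhere = in_region((540, 400), NOTIFICATION_BAR)
```

The same test generalizes to any "first preset area": only the region rectangle changes.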
  • the first preset area may be the notification bar 221 in the interface shown in Figure 2C;
  • the first interface may be an interface to be unlocked as shown in Figure 2A, or an interface for exiting an application and displaying the application as shown in Figure 5B;
  • the second interface may be any one of the main interfaces such as the first desktop, the second desktop, and the negative screen shown in Figure 2F;
  • the third interface may be the third interface shown in Figure 2E.
  • Figure 12 is a schematic system structure diagram of the terminal 100 according to the embodiment of the present application.
  • the layered architecture divides the system into several layers, and each layer has clear roles and division of labor.
  • the layers communicate through software interfaces.
  • the system is divided into five layers, from top to bottom: application layer, application framework layer, hardware abstraction layer, kernel layer and hardware layer.
  • The application layer can include multiple applications, such as the dialer application, the gallery application, and so on.
  • the application layer also includes an eye gaze SDK (software development kit).
  • The system of the terminal 100 and third-party applications installed on the terminal 100 can identify the user's eyeball gaze position by calling the eyeball gaze SDK.
  • the framework layer provides application programming interface (API) and programming framework for applications in the application layer.
  • the framework layer includes some predefined functions.
  • the framework layer may include a camera service interface and an eyeball gaze service interface.
  • the camera service interface is used to provide an application programming interface and programming framework for using the camera.
  • the eye gaze service interface provides an application programming interface and programming framework that uses the eye gaze recognition model.
  • the hardware abstraction layer is the interface layer between the framework layer and the driver layer, providing a virtual hardware platform for the operating system.
  • the hardware abstraction layer may include a camera hardware abstraction layer and an eye gaze process.
  • the camera hardware abstraction layer can provide virtual hardware for camera device 1 (RGB camera), camera device 2 (TOF camera), or more camera devices.
  • the calculation process of identifying the user's eye gaze position through the eye gaze recognition module is performed during the eye gaze process.
  • the driver layer is the layer between hardware and software.
  • the driver layer includes drivers for various hardware.
  • the driver layer may include camera device drivers.
  • the camera device driver is used to drive the sensor of the camera to collect images and drive the image signal processor to preprocess the images.
  • the hardware layer includes sensors and secure data buffers.
  • the sensors include RGB camera (ie 2D camera) and TOF camera (ie 3D camera). RGB cameras capture and generate 2D images.
  • The TOF camera is a depth-sensing camera that can collect and generate 3D images with depth information. Data collected by the cameras is stored in a secure data buffer. When any upper-layer process or application obtains image data collected by the camera, it needs to obtain it from the secure data buffer and cannot obtain it through other means. Therefore, the secure data buffer can also avoid the problem of abuse of image data collected by the camera.
  • The software layers introduced above, and the modules or interfaces included in each layer, run in a rich execution environment (REE).
  • The terminal 100 also includes a trusted execution environment (TEE). Data communication in the TEE is more secure than in the REE.
  • The TEE can include an eye gaze recognition algorithm module, a trusted application (TA) module, and a security service module.
  • the eye gaze recognition algorithm module stores the executable code of the eye gaze recognition model.
  • TA can be used to safely send the recognition results output by the above model to the eye gaze process.
  • the security service module can be used to securely input the image data stored in the secure data buffer to the eye gaze recognition algorithm module.
  • The terminal 100 determines to perform the eye gaze recognition operation. After recognizing that unlocking is successful, after switching pages following unlocking, or after returning to the main interface, the terminal 100 may determine to perform the eye gaze recognition operation within the eye gaze recognition time.
  • the terminal 100 calls the eye gaze service through the eye gaze SDK.
  • the eye gaze service can call the camera service of the frame layer to collect and obtain image frames containing the user's facial image through the camera service.
  • the camera service can send instructions to start the RGB camera and TOF camera by calling camera device 1 (RGB camera) and camera device 2 (TOF camera) in the camera hardware abstraction layer.
  • the camera hardware abstraction layer sends this instruction to the camera device driver of the driver layer.
  • the camera device driver can start the camera according to the above instructions.
  • The instructions sent by camera device 1 to the camera device driver can be used to start the RGB camera.
  • The instructions sent by camera device 2 to the camera device driver can be used to start the TOF camera.
  • After the RGB camera and TOF camera are turned on, they collect optical signals, and the image signal processor generates 2D or 3D images from the resulting electrical signals.
  • The eye gaze service creates an eye gaze process and initializes the eye gaze recognition model.
  • Images generated by the image signal processor can be stored in a secure data buffer.
  • The image data stored in the secure data buffer can be transmitted to the eye gaze recognition algorithm through the secure transmission channel in the TEE provided by the security service.
  • the eye gaze recognition algorithm can input the above image data into the eye gaze recognition model established based on CNN to determine the user's eye gaze position. Then, TA safely returns the above-mentioned eye gaze position to the eye gaze process, and then returns it to the application layer eye gaze SDK through the camera service and eye gaze service.
  • the eye gaze SDK can determine whether the user is looking at the notification bar based on the received eye gaze position, and then determine whether to display the notification interface.
  • Figure 13 shows a schematic diagram of the hardware structure of the terminal 100.
  • The terminal 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headphone interface 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, etc.
  • The sensor module 180 may include a pressure sensor 180A, a gyro sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, etc.
  • the structure illustrated in the embodiment of the present invention does not constitute a specific limitation on the terminal 100.
  • the terminal 100 may include more or fewer components than shown in the figures, or some components may be combined, or some components may be separated, or may be arranged differently.
  • the components illustrated may be implemented in hardware, software, or a combination of software and hardware.
  • the processor 110 may include one or more processing units.
  • The processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc.
  • the controller can generate operation control signals based on the instruction operation code and timing signals to complete the control of fetching and executing instructions.
  • the processor 110 may also be provided with a memory for storing instructions and data.
  • The memory in the processor 110 is a cache memory. This memory may hold instructions or data that the processor 110 has recently used or uses cyclically. If the processor 110 needs to use the instructions or data again, it can call them directly from this memory. Repeated access is thereby avoided and the waiting time of the processor 110 is reduced, improving the efficiency of the system.
  • processor 110 may include one or more interfaces.
  • Interfaces may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, etc.
  • the interface connection relationships between the modules illustrated in the embodiment of the present invention are only schematic illustrations and do not constitute a structural limitation on the terminal 100 .
  • the terminal 100 may also adopt different interface connection methods in the above embodiments, or a combination of multiple interface connection methods.
  • the charging management module 140 is used to receive charging input from the charger.
  • the power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110.
  • the wireless communication function of the terminal 100 can be implemented through the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor and the baseband processor.
  • Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals.
  • the mobile communication module 150 can provide wireless communication solutions including 2G/3G/4G/5G applied to the terminal 100.
  • the mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (LNA), etc.
  • the mobile communication module 150 can receive electromagnetic waves through the antenna 1, perform filtering, amplification and other processing on the received electromagnetic waves, and transmit them to the modem processor for demodulation.
  • the mobile communication module 150 can also amplify the signal modulated by the modem processor and convert it into electromagnetic waves through the antenna 1 for radiation.
  • a modem processor may include a modulator and a demodulator.
  • the modulator is used to modulate the low-frequency baseband signal to be sent into a medium-high frequency signal.
  • the demodulator is used to demodulate the received electromagnetic wave signal into a low-frequency baseband signal.
  • The wireless communication module 160 can provide wireless communication solutions applied to the terminal 100, including wireless local area network (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR), etc.
  • the wireless communication module 160 receives electromagnetic waves via the antenna 2 , frequency modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110 .
  • the wireless communication module 160 can also receive the signal to be sent from the processor 110, frequency modulate it, amplify it, and convert it into electromagnetic waves through the antenna 2 for radiation.
  • the antenna 1 of the terminal 100 is coupled to the mobile communication module 150, and the antenna 2 is coupled to the wireless communication module 160, so that the terminal 100 can communicate with the network and other devices through wireless communication technology.
  • The wireless communication technology may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technology, etc.
  • The GNSS may include global positioning system (GPS), global navigation satellite system (GLONASS), Beidou navigation satellite system (BDS), quasi-zenith satellite system (QZSS), and/or satellite based augmentation systems (SBAS).
  • some notifications received by the terminal 100 are sent by application servers corresponding to applications installed on the terminal 100 .
  • the terminal 100 receives the above notification by implementing the wireless communication function through the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor and the baseband processor, and then displays the above notification.
  • the terminal 100 implements the display function through the GPU, the display screen 194, and the application processor.
  • The GPU is a microprocessor for image processing and is connected to the display screen 194 and the application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering.
  • Processor 110 may include one or more GPUs that execute program instructions to generate or alter display information.
  • the display screen 194 is used to display images, videos, etc.
  • Display 194 includes a display panel.
  • the display panel can use a liquid crystal display (LCD).
  • The display panel can also use an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a mini-LED, a micro-LED, a micro-OLED, a quantum dot light emitting diode (QLED), etc.
  • the terminal 100 may include 1 or N display screens 194, where N is a positive integer greater than 1.
  • The terminal 100 uses the display functions provided by the GPU, the display screen 194, and the application processor to display the user interfaces shown in Figures 2A-2H, 3A-3H, 4A-4D, 5A-5D, and 6A-6E.
  • the terminal 100 can implement the shooting function through the ISP, camera 193, video codec, GPU, display screen 194, application processor, etc.
  • the camera 193 includes an RGB camera (2D camera) that generates two-dimensional images and a TOF camera (3D camera) that generates three-dimensional images.
  • the ISP is used to process the data fed back by the camera 193. For example, when taking a photo, the shutter is opened, the light is transmitted to the camera sensor through the lens, the optical signal is converted into an electrical signal, and the camera sensor passes the electrical signal to the ISP for processing, and converts it into an image visible to the naked eye. ISP can also perform algorithm optimization on image noise and brightness. ISP can also optimize the exposure, color temperature and other parameters of the shooting scene. In some embodiments, the ISP may be provided in the camera 193.
  • Camera 193 is used to capture still images or video.
  • the object passes through the lens to produce an optical image that is projected onto the photosensitive element.
  • the photosensitive element can be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
  • the photosensitive element converts the optical signal into an electrical signal, and then passes the electrical signal to the ISP to convert it into a digital image signal.
  • ISP outputs digital image signals to DSP for processing.
  • DSP converts digital image signals into standard RGB, YUV and other format image signals.
  • the terminal 100 may include 1 or N cameras 193, where N is a positive integer greater than 1.
  • Digital signal processors are used to process digital signals. In addition to digital image signals, they can also process other digital signals.
  • Video codecs are used to compress or decompress digital video. Terminal 100 may support one or more video codecs.
  • NPU is a neural network (NN) computing processor.
  • the NPU can realize intelligent cognitive applications of the terminal 100, such as image recognition, face recognition, speech recognition, text understanding, etc.
  • the terminal 100 collects and generates image frames through the shooting capabilities provided by the ISP and the camera 193.
  • the terminal 100 can execute the eye gaze recognition algorithm through the NPU, and then identify the user's eye gaze position through the collected image frames.
  • the internal memory 121 may include one or more random access memories (RAM) and one or more non-volatile memories (NVM).
  • Random access memory can include static random-access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM; for example, fifth-generation DDR SDRAM is generally called DDR5 SDRAM), etc.
  • Non-volatile memory may include disk storage devices and flash memory.
  • the random access memory can be directly read and written by the processor 110, can be used to store executable programs (such as machine instructions) of the operating system or other running programs, and can also be used to store user and application data, etc.
  • the non-volatile memory can also store executable programs and user and application program data, etc., and can be loaded into the random access memory in advance for direct reading and writing by the processor 110.
  • Eye Gaze SDK application code can be stored in non-volatile memory.
  • the application code of the Eye Gaze SDK can be loaded into random access memory. Data generated when running the above code can also be stored in random access memory.
  • the external memory interface 120 can be used to connect an external non-volatile memory to expand the storage capability of the terminal 100 .
  • the external non-volatile memory communicates with the processor 110 through the external memory interface 120 to implement the data storage function. For example, save music, video and other files in external non-volatile memory.
  • the terminal 100 can implement audio functions through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headphone interface 170D, and the application processor. Such as music playback, recording, etc.
  • the audio module 170 is used to convert digital audio information into analog audio signal output, and is also used to convert analog audio input into digital audio signals.
  • The speaker 170A is also called a "loudspeaker".
  • The receiver 170B, also called an "earpiece", is used to convert audio electrical signals into sound signals.
  • The voice can be heard by bringing the receiver 170B close to the human ear.
  • The microphone 170C is also called a "mike" or "mic".
  • the headphone interface 170D is used to connect wired headphones.
  • the pressure sensor 180A is used to sense pressure signals and can convert the pressure signals into electrical signals.
  • the gyro sensor 180B may be used to determine the angular velocity of the terminal 100 around three axes (ie, x, y, and z axes), and thereby determine the motion posture of the terminal 100.
  • The acceleration sensor 180E can detect the acceleration of the terminal 100 in various directions (generally along three axes). Therefore, the acceleration sensor 180E can be used to recognize the posture of the terminal 100. In this embodiment of the present application, when the screen is off or in always-on display (AOD) mode, the terminal 100 can detect whether the user picks up the phone through the acceleration sensor 180E and the gyro sensor 180B, and then determine whether to light up the screen.
  • Air pressure sensor 180C is used to measure air pressure.
  • Magnetic sensor 180D includes a Hall sensor.
  • The terminal 100 may use the magnetic sensor 180D to detect the opening and closing of a flip cover. Therefore, in some embodiments, when the terminal 100 is a flip phone, the terminal 100 can detect the opening and closing of the flip cover based on the magnetic sensor 180D, and then determine whether to light up the screen.
  • Distance sensor 180F is used to measure distance.
  • Proximity light sensor 180G may include, for example, a light emitting diode (LED) and a light detector.
  • The terminal 100 can use the proximity light sensor 180G to detect a scene in which the user holds the terminal 100 close to the ear, such as a handset call.
  • the ambient light sensor 180L is used to sense ambient light brightness.
  • the terminal 100 can adaptively adjust the brightness of the display screen 194 according to the perceived ambient light brightness.
  • Fingerprint sensor 180H is used to collect fingerprints.
  • the terminal 100 can use the collected fingerprint characteristics to implement fingerprint unlocking, access application lock and other functions.
  • Temperature sensor 180J is used to detect temperature.
  • Bone conduction sensor 180M can acquire vibration signals.
  • Touch sensor 180K, also known as a "touch device".
  • the touch sensor 180K can be disposed on the display screen 194.
  • the touch sensor 180K and the display screen 194 form a touchscreen, also called a "touch-control screen".
  • the touch sensor 180K is used to detect a touch operation on or near the touch sensor 180K.
  • the touch sensor can pass the detected touch operation to the application processor to determine the touch event type.
  • Visual output related to the touch operation may be provided through the display screen 194.
  • the touch sensor 180K may also be disposed on the surface of the terminal 100 in a position different from that of the display screen 194.
  • the terminal 100 detects whether there is a user operation on the screen through the touch sensor 180K, such as click, left swipe, right swipe and other operations. Based on the user operation on the screen detected by the touch sensor 180K, the terminal 100 can determine the actions to be performed subsequently, such as running a certain application program, displaying the interface of the application program, and so on.
  • the buttons 190 include a power button, a volume button, etc.
  • The buttons 190 may be mechanical buttons or touch buttons.
  • the motor 191 can generate vibration prompts.
  • the motor 191 can be used for vibration prompts for incoming calls and can also be used for touch vibration feedback.
  • the indicator 192 may be an indicator light, which may be used to indicate charging status, power changes, messages, missed calls, notifications, etc.
  • the SIM card interface 195 is used to connect a SIM card.
  • the terminal 100 can support 1 or N SIM card interfaces.
  • UI (user interface)
  • the term "user interface (UI)" in the description, claims and drawings of this application refers to a medium interface for interaction and information exchange between an application or operating system and a user; it realizes the conversion between an internal form of information and a form acceptable to the user.
  • the user interface of an application is source code written in specific computer languages such as Java and extensible markup language (XML).
  • the interface source code is parsed and rendered on the terminal device, and finally presented as content that the user can recognize.
  • A control, also called a widget, is the basic element of a user interface. Typical controls include toolbars, menu bars, text boxes, buttons, scroll bars (scrollbars), images and text.
  • the properties and contents of controls in the interface are defined through tags or nodes.
  • XML specifies the controls contained in the interface through nodes such as <TextView>, <ImgView>, and <VideoView>.
  • a node corresponds to a control or property in the interface. After parsing and rendering, the node is rendered into user-visible content.
  • applications such as hybrid applications, often include web pages in their interfaces.
  • a web page also known as a page, can be understood as a special control embedded in an application interface.
  • a web page is source code written in a specific computer language, such as hypertext markup language (HTML), cascading style sheets (CSS), JavaScript (JS), etc.
  • web page source code can be loaded and displayed as user-recognizable content by a browser or a web page display component with functions similar to the browser.
  • the specific content contained in the web page is also defined through tags or nodes in the web page source code.
  • HTML defines the elements and attributes of the web page through tags such as <p>, <img>, <video>, and <canvas>.
  • GUI (graphical user interface)
  • the commonly used form of user interface is the graphical user interface (GUI), which refers to a user interface, displayed graphically, that relates to computer operations. It can consist of icons, windows, controls and other interface elements displayed on the display screen of the terminal device.
  • the controls can include icons, buttons, menus, tabs, text boxes, dialog boxes, status bars, navigation bars, widgets and other visual interface elements.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server or data center through wired (such as coaxial cable, optical fiber, digital subscriber line) or wireless (such as infrared, radio, microwave) means.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center integrating one or more available media.
  • the available media may be magnetic media (e.g., floppy disk, hard disk, tape), optical media (e.g., DVD), or semiconductor media (e.g., solid state drive), etc.


Abstract

Embodiments of this application provide a display method and an electronic device. The method can be applied to terminal devices such as mobile phones and tablet computers. After detecting a successful unlock or an operation of returning to the home screen, the terminal device can determine the user's eye-gaze position through its camera module and thereby determine whether the user is gazing at the notification bar. When the action of the user gazing at the notification bar is recognized, the terminal can display a notification interface showing notification messages, so that the user can obtain notifications quickly without any touch operation.

Description

A Display Method and Electronic Device
This application claims priority to the Chinese patent application No. 202210549604.6, titled "A Display Method and Electronic Device", filed with the China National Intellectual Property Administration on May 20, 2022, and to the Chinese patent application No. 202210764445.1, titled "A Display Method and Electronic Device", filed with the China National Intellectual Property Administration on June 30, 2022, both of which are incorporated herein by reference in their entirety.
Technical Field
This application relates to the field of terminals, and in particular to a display method and an electronic device.
Background
With the rise of mobile terminals and the maturing of communication technology, people have begun to explore new human-computer interaction methods that go beyond the mouse and keyboard, such as voice control and gesture recognition, so as to offer users more diverse and convenient interaction and improve the user experience.
Summary
Embodiments of this application provide a display method and an electronic device. With this method, a terminal device equipped with a camera can determine the user's eye-gaze position through its camera module and thereby determine whether the user is gazing at a preset region of the screen. When it recognizes that the user is gazing at the preset region, the terminal can display a notification interface associated with that region, so that the user can obtain notifications quickly without any touch operation.
In a first aspect, this application provides a display method applied to an electronic device that includes a screen with a first preset region. The method includes: displaying a first interface; in response to a first operation by the user, displaying a second interface; if the second interface is a preset interface, collecting a first image within a first preset time period while the second interface is displayed; determining, based on the first image, the user's first eye-gaze region, which is the screen region the user is gazing at when looking at the screen; and, when the first eye-gaze region falls within the first preset region, displaying a third interface that includes one or more notifications.
With the method of the first aspect, the electronic device can enable eye-gaze recognition on specific interfaces. Through eye-gaze recognition, the electronic device can detect whether the user is gazing at a preset region of the screen and, when it detects such a gaze, automatically display the notification interface. In this way, the user can control the electronic device to display the notification interface by gaze alone and obtain notifications quickly. Especially in scenarios where touch operations are inconvenient, this method gives the user another way to obtain notifications and improves the user experience.
In combination with the first aspect, in some embodiments, the second interface is any one of the following interfaces: a first desktop, a second desktop, or the minus-one screen.
With the method of this embodiment, the electronic device can detect whether the user is gazing at a preset region of the screen while displaying the first desktop, the second desktop, the minus-one screen, or similar interfaces. In this way, the user can control the electronic device to display the notification interface by gaze while any of these interfaces is shown. The second interfaces on which eye-gaze recognition is enabled are determined by the user's habits of checking notifications and are not limited to the first desktop, second desktop and minus-one screen above.
In combination with the first aspect, in some embodiments, the first preset time period is the first N seconds of displaying the second interface.
With the method of this embodiment, the electronic device does not continuously check whether the user is gazing at a region of the screen while the second interface is shown; it checks only within a preset period, for example the first 3 seconds of displaying the second interface, so as to save power and to prevent camera misuse from compromising the user's information security.
In combination with the first aspect, in some embodiments, the first eye-gaze region is a cursor point formed by a single display unit on the screen, or a cursor point or cursor area formed by multiple display units on the screen.
In combination with the first aspect, in some embodiments, the first eye-gaze region being within the first preset region includes: the position of the first eye-gaze region on the screen being contained in the first preset region, or the position of the first eye-gaze region on the screen intersecting the first preset region.
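The containment-or-intersection test above can be sketched as a simple rectangle check. This is a minimal illustration only; the box representation and function name are assumptions, not part of the application:

```python
def gaze_hits_region(gaze_box, preset_box):
    """Return True if the gaze region is contained in, or intersects,
    the preset region. Boxes are (left, top, right, bottom) in pixels."""
    gl, gt, gr, gb = gaze_box
    pl, pt, pr, pb = preset_box
    # containment: every edge of the gaze box lies inside the preset box
    contained = gl >= pl and gt >= pt and gr <= pr and gb <= pb
    # intersection: the boxes overlap along both axes
    intersects = gl < pr and gr > pl and gt < pb and gb > pt
    return contained or intersects
```

A gaze box fully inside the notification bar, or merely overlapping its edge, both count as a hit; a gaze box elsewhere on the screen does not.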
In combination with the first aspect, in some embodiments, the first interface is a to-be-unlocked interface, and the first operation is an unlock operation.
With the method of this embodiment, after a successful unlock the electronic device can display a second interface of the first-desktop, second-desktop or minus-one-screen type, and while displaying it can also detect whether the user is gazing at the first preset region of the screen. Thus, in the scenario of unlocking into the second interface, the user can control the electronic device to display the notification interface by gaze.
In combination with the first aspect, in some embodiments, the first interface is an interface provided by a first application installed on the electronic device, and the first operation is an operation of exiting the first application.
With the method of this embodiment, the electronic device can display the second interface after exiting an application, and at that point can likewise detect whether the user is gazing at the first preset region of the screen. In this way, immediately after exiting an application, the user can control the electronic device to display the notification interface by gaze and then handle the pending tasks indicated by the notifications.
In combination with the first aspect, in some embodiments, the operation of exiting the first application includes: an operation detected by the electronic device in which the user instructs it to exit the first application, and an exit operation generated when the electronic device detects that the user has not operated the first application for a long time.
With the method of this embodiment, the electronic device can determine that the first application is to be exited either from the user's explicit exit operation or from the user's prolonged inactivity in the first application. In this way, the electronic device can provide the user with a more convenient, automated notification display service.
In combination with the first aspect, in some embodiments, the electronic device includes a camera module; the camera module includes at least one 2D camera, used to obtain two-dimensional images, and at least one 3D camera, used to obtain images containing depth information; the first image includes a two-dimensional image and an image containing depth information.
With the method of this embodiment, the camera module of the electronic device can include multiple cameras, among them at least one 2D camera and at least one 3D camera. In this way, the electronic device can obtain both two-dimensional and three-dimensional images indicating the user's eye-gaze position; combining them improves the precision and accuracy with which the electronic device recognizes the eye-gaze position.
In combination with the first aspect, in some embodiments, determining the user's first eye-gaze region based on the first image specifically includes: determining feature data from the first image, the feature data including one or more of a left-eye image, a right-eye image, a face image and face grid data; and determining the first eye-gaze region indicated by the feature data using an eye-gaze recognition model, the eye-gaze recognition model being built on a convolutional neural network.
With the method of this embodiment, the electronic device can obtain the left-eye image, right-eye image, face image and face grid data separately from the two-dimensional and three-dimensional images captured by the camera module, thereby extracting more features and improving recognition precision and accuracy.
In combination with the first aspect, in some embodiments, determining the feature data from the first image specifically includes: performing face correction on the first image to obtain a first image with an upright face, and determining the feature data based on the corrected first image.
With the method of this embodiment, before obtaining the left-eye, right-eye and face images, the electronic device can perform face correction on the images captured by the camera module, improving the accuracy of the left-eye, right-eye and face images.
In combination with the first aspect, in some embodiments, the first image is stored in a secure data buffer; before determining the user's first eye-gaze region based on the first image, the method further includes obtaining the first image from the secure data buffer in a trusted execution environment.
With the method of this embodiment, before the electronic device processes the images captured by the camera module, it stores them in a secure data buffer. Image data in the secure data buffer can be delivered to the eye-gaze recognition algorithm only through the secure transmission channel provided by the security service, improving the security of the image data.
In combination with the first aspect, in some embodiments, the secure data buffer is located at the hardware layer of the electronic device.
In a second aspect, this application provides an electronic device comprising one or more processors and one or more memories; the one or more memories are coupled to the one or more processors and store computer program code, the computer program code includes computer instructions, and when the one or more processors execute the computer instructions, the electronic device is caused to perform the method described in the first aspect or any possible implementation of the first aspect.
In a third aspect, an embodiment of this application provides a chip system applied to an electronic device, the chip system comprising one or more processors configured to invoke computer instructions to cause the electronic device to perform the method described in the first aspect or any possible implementation of the first aspect.
In a fourth aspect, this application provides a computer-readable storage medium comprising instructions that, when run on an electronic device, cause the electronic device to perform the method described in the first aspect or any possible implementation of the first aspect.
In a fifth aspect, this application provides a computer program product containing instructions that, when run on an electronic device, cause the electronic device to perform the method described in the first aspect or any possible implementation of the first aspect.
It can be understood that the electronic device of the second aspect, the chip system of the third aspect, the computer storage medium of the fourth aspect and the computer program product of the fifth aspect are all used to perform the method provided by this application; the beneficial effects they can achieve can therefore be found in the corresponding method and are not repeated here.
Brief Description of the Drawings
FIG. 1A-FIG. 1C are a set of user interfaces provided by an embodiment of this application;
FIG. 2A-FIG. 2H are a set of user interfaces provided by an embodiment of this application;
FIG. 3A-FIG. 3H are a set of user interfaces provided by an embodiment of this application;
FIG. 4A-FIG. 4D are a set of user interfaces provided by an embodiment of this application;
FIG. 5A-FIG. 5D are a set of user interfaces provided by an embodiment of this application;
FIG. 6A-FIG. 6E are a set of user interfaces provided by an embodiment of this application;
FIG. 7 is a flowchart of a display method provided by an embodiment of this application;
FIG. 8 is a structural diagram of an eye-gaze recognition model provided by an embodiment of this application;
FIG. 9A is a flowchart of face correction provided by an embodiment of this application;
FIG. 9B-FIG. 9D are a set of schematic diagrams of face correction provided by an embodiment of this application;
FIG. 10 is a structural diagram of the convolutional network of an eye-gaze recognition model provided by an embodiment of this application;
FIG. 11 is a schematic diagram of a separable convolution technique provided by an embodiment of this application;
FIG. 12 is a schematic diagram of the system structure of the terminal 100 provided by an embodiment of this application;
FIG. 13 is a schematic diagram of the hardware structure of the terminal 100 provided by an embodiment of this application.
Detailed Description
The terms used in the following embodiments are intended only to describe particular embodiments and are not intended to limit this application.
Taking a smartphone as an example, statistics show that after the phone is unlocked and the desktop is displayed, the phone usually first detects a pull-down operation by the user, that is, a downward swipe starting from the top of the screen; see the user operation shown in FIG. 1A. In response to this operation, the phone can display a notification interface; the interface shown in FIG. 1B may be called a notification interface. The notification interface can display one or more notifications received by the phone, for example notification 121, notification 122 and notification 123. These notifications come from the operating system, system applications and/or third-party applications installed on the phone.
In some scenarios, the user performs the pull-down operation first, instructing the phone to display the notification interface, because after opening the phone the user usually wants to check notifications first and confirm whether there are urgent pending matters. In particular, in the face-unlock scenario, once the user picks up the phone it can quickly verify whether the user's face matches, complete face unlock and display the desktop, which makes it difficult for the user to carefully check and confirm the notifications before the unlock completes. The user is then more inclined to perform the pull-down operation after the desktop is displayed, instructing the phone to show the notification interface so as to check and confirm the notifications.
In other scenarios, after the phone is unlocked and the desktop is displayed, the user is often unsure what to do next. At this point, displaying the notification interface can prompt the user about which operations can be performed. For example, the notification interface can show a software-update notification, or a notification of an incoming call or message from a contact. By first showing these notifications after displaying the desktop, the phone lets the user decide on operations such as updating the software or returning the contact's call or message, providing efficient notification reminders and improving the user experience.
For the above usage scenarios, an embodiment of this application provides a display method. The method can be applied to terminal devices such as mobile phones and tablet computers. A phone, tablet or similar terminal device implementing the method can be denoted terminal 100; subsequent embodiments use terminal 100 to refer to such devices.
Not limited to mobile phones and tablet computers, the terminal 100 may also be a desktop computer, laptop computer, handheld computer, notebook computer, ultra-mobile personal computer (UMPC), netbook, cellular phone, personal digital assistant (PDA), augmented reality (AR) device, virtual reality (VR) device, artificial intelligence (AI) device, wearable device, vehicle-mounted device, smart home device and/or smart city device; the embodiments of this application place no particular restriction on the specific type of terminal.
Specifically, the terminal 100 can detect the user picking up the phone and performing an unlock operation. After detecting a successful unlock, the terminal 100 can enable the 2D camera and the 3D camera to collect the user's face data and thereby determine the user's eye-gaze position.
The 2D camera is a camera that generates two-dimensional images, such as the RGB camera commonly used on phones. The 3D camera is a camera capable of generating three-dimensional images, or images containing depth information, such as a time-of-flight (TOF) camera. Compared with a 2D camera, the three-dimensional images generated by a 3D camera additionally contain depth information, that is, the position of the photographed object relative to the 3D camera.
The user's eye-gaze position is the point on the screen of the terminal 100 where the user's gaze focuses when the user looks at it. As shown in FIG. 1C, a cursor point S may be displayed on the screen of the terminal 100. When the user gazes at the cursor point S, the position where the user's gaze focuses on the screen shown in FIG. 1C is the cursor point S, that is, the user's eye-gaze position is the cursor point S. The cursor point S can be anywhere on the screen; it may correspond to an application icon or control, or to a blank display area. The user may gaze at the cursor point S straight-on, facing the screen, or obliquely, not facing it. That is, the terminal 100 places no restriction on the posture with which the user gazes at the screen and can determine the eye-gaze position under a variety of head poses.
After determining the user's eye-gaze position, if that position falls within the notification bar region of the terminal 100, the terminal 100 can display the notification interface shown in FIG. 1B. As shown in FIG. 1C, the region enclosed by dashed box 131 may be called the notification bar region. The user's eye-gaze position being within this region indicates that the user is gazing at the notification bar.
In this embodiment of the application, the terminal 100 can determine within 3 seconds of a successful unlock whether the user is gazing at the notification bar. The terminal 100 can therefore quickly decide, after completing the unlock, whether to display the notification interface, enabling gaze-based interaction control while avoiding an overlong interaction that would hurt the user experience.
In particular, in scenarios where the user intends a pull-down (touch) operation to make the terminal 100 display the notification interface, eye gaze not only achieves the same display control as the touch operation but also lowers the constraints on the user. For example, in scenarios where touch actions are inconvenient, such as cooking or cleaning, eye gaze provides the user with a convenient interaction experience; and when the user does not know what to do after unlocking the terminal 100, eye gaze can make it display the notification interface, prompting the user about what to do next.
In some embodiments, when the notification interface displays multiple notifications, the terminal 100 can also automatically display the details of another notification after the user finishes handling one, prompting the user to handle it, thereby saving user operations and improving the user experience.
The scenarios in which the terminal 100 implements the above gaze-recognition-based interaction method are described in detail below.
FIG. 2A exemplarily shows the user interface of the terminal 100 when the screen is lit but not unlocked (the to-be-unlocked interface). The to-be-unlocked interface can display the time and date for the user to view.
After displaying the user interface shown in FIG. 2A, the terminal 100 can enable the camera module 210 to capture and generate image frames, which may include the user's face image. The terminal 100 can then perform face recognition on these frames to judge whether the face image is the owner's, that is, whether the user performing the unlock operation is the owner.
As shown in FIG. 2A, the camera module 210 can include multiple camera devices. In this embodiment of the application, the camera module 210 of the terminal 100 includes at least one 2D camera and one 3D camera. Optionally, the camera module 210 may also include multiple 2D cameras and multiple 3D cameras, which this embodiment does not limit. During face unlock verification, the terminal 100 uses one camera; generally, this camera is the 3D camera.
When face unlock succeeds, that is, when the captured face image matches the owner's face image, the terminal 100 can display the user interfaces shown in FIG. 2B-FIG. 2C. First, the terminal 100 can display the user interface shown in FIG. 2B (the unlock-success interface), which can display an icon 211 used to prompt the user that face unlock succeeded. The terminal 100 can then display the user interface shown in FIG. 2C, which may be called the main interface of the terminal 100.
Not limited to the face unlock introduced above, the terminal 100 can also use password unlock (pattern password, numeric password), fingerprint unlock and other unlock methods. After a successful unlock, the terminal 100 can likewise display the main interface shown in FIG. 2C.
The main interface can include a notification bar 221, a page indicator 222, a frequently-used application icon tray 223, and several other application icon trays 224.
The notification bar can include one or more signal strength indicators of the mobile communication (also called cellular) signal (for example signal strength indicator 221A and signal strength indicator 221B), a wireless fidelity (Wi-Fi) signal strength indicator 221C, a battery status indicator 221D, and a time indicator 221E.
The page indicator 222 can indicate the positional relationship between the currently displayed page and the other pages. Generally, the main interface of the terminal 100 can include multiple pages; the interface shown in FIG. 2C can be one of them, and the other pages are not shown in FIG. 2C. When a left-swipe or right-swipe operation by the user is detected, the terminal 100 can display those other pages, that is, switch pages. The page indicator 222 then changes to different forms to indicate the different pages; this is described in detail in later embodiments.
The frequently-used application icon tray 223 can include multiple frequently-used application icons (for example a camera application icon, a contacts application icon, a phone application icon and a messaging application icon), which remain displayed when pages are switched. These frequently-used application icons are optional, and this embodiment does not limit them.
The other application icon trays 224 can include multiple general application icons, for example a settings application icon, an app market application icon, a gallery application icon and a browser application icon. General application icons can be distributed across the other application icon trays 224 of the multiple pages of the main interface, and the general application icons shown in the trays 224 change accordingly when pages are switched. An application's icon can be a general application icon or a frequently-used application icon: when placed in the frequently-used application icon tray 223 it is a frequently-used application icon, and when placed in an other application icon tray 224 it is a general application icon.
It can be understood that FIG. 2C merely shows, by way of example, one main interface of the terminal 100, or one page of a main interface, and should not limit the embodiments of this application.
After displaying FIG. 2C, the terminal 100 can capture and generate image frames containing the user's face through the camera module 210. At this point the terminal 100 uses two cameras, including one 2D camera and one 3D camera. Of course, not limited to one 2D camera and one 3D camera, the terminal 100 may use more cameras to obtain more of the user's facial features, especially eye features, and thus determine the user's eye-gaze position faster and more accurately.
In the face-unlock scenario, the 3D camera of the terminal 100 is already on, so the terminal 100 only needs to turn on the 2D camera of the camera module 210. In the password-unlock and fingerprint-unlock scenarios, the cameras of the terminal 100 are off; the terminal 100 then needs to turn on both the 2D camera and the 3D camera of the camera module 210.
Preferably, the time during which the terminal 100 captures and generates image frames (denoted the gaze recognition time) is the first 3 seconds of displaying the main interface shown in FIG. 2C. After 3 seconds, the terminal 100 can turn off the camera module 210 to save power. Setting the gaze recognition time too short, for example 1 second, may make the eye-gaze recognition result inaccurate; moreover, it is hard for the user to fixate the notification bar within 1 second of the main interface appearing. Setting it too long, for example 7 or 10 seconds, consumes too much power and hurts the battery life of the terminal 100. Of course, not limited to 3 seconds, the gaze recognition time can take other values, such as 2.5 seconds, 3.5 seconds or 4 seconds, which this embodiment does not limit; the following description uses 3 seconds as an example.
Within the gaze recognition time, the camera module 210 can continuously capture and generate image frames containing the user's face image. The terminal 100 can then use these frames to recognize the user's eye-gaze position. Referring to FIG. 2D, when it recognizes that the user's eye-gaze position is within the notification bar 221, that is, determines that the user is gazing at the notification bar 221, the terminal 100 can display the notification interface shown in FIG. 2E for the user to obtain notification information. The notification interface shown in FIG. 2E is the same as FIG. 1B and is not repeated here.
The terminal 100 thus provides the user with the ability to control the display of the notification interface by eye gaze. The user only needs to gaze at the notification bar 221 to obtain the notification interface, without performing a pull-down operation, saving user operations. Especially in scenarios such as cooking where touch operations are inconvenient, this gaze-based interaction method can offer the user great convenience.
Further, through analysis of user behavior, the terminal 100 can identify the scenarios in which the user needs the notification interface displayed (the initial period after the main interface starts being shown), and provide the gaze recognition service only in those scenarios, avoiding the resource waste caused by keeping the camera on for long periods.
Optionally, the terminal 100 can also enable eye-gaze recognition when it enters an application via a notification on the to-be-unlocked interface and displays one of that application's interfaces. Of course, if power consumption is not a concern, the terminal 100 could keep the camera on in real time to obtain the user's eye-gaze position and determine whether the user is controlling the notification display by gaze. For example, in a video player, the terminal 100 could detect whether the user gazes at the top of the screen or at a pop-up banner notification; when such a gaze is detected, the terminal 100 could display the notification interface or the interface corresponding to the banner notification, and so on.
FIG. 2F exemplarily shows a main interface comprising multiple pages, each of which may be called a main interface.
As shown in FIG. 2F, the main interface can include page 20, page 21 and page 22. Page 21 may be called the first desktop; the first desktop is also called the homepage, home screen or start screen, and can be understood as the desktop holding the application icon when the page indicator 222 has only one dot. Page 22 may be called the second desktop; it can be understood as the desktop adjacent to the right of the first desktop: when the first desktop is displayed and a right-to-left swipe by the user is detected, the second desktop is displayed. Page 20 may be called the minus-one screen; it can be understood as the interface adjacent to the left of the first desktop, which can be a feature page: when the first desktop is displayed and a left-to-right swipe by the user is detected, the minus-one screen is displayed. The page layout of the second desktop is the same as the first desktop and is not repeated here. The number of desktops in the main interface can increase or decrease according to the user's settings; FIG. 2F shows only the first desktop and the second desktop, etc.
In the main interface shown in FIG. 2C, the main interface displayed by the terminal 100 is actually the first desktop of the main interface shown in FIG. 2F. In some embodiments, after a successful unlock, the terminal 100 always displays the first desktop first. In other embodiments, after a successful unlock, the terminal 100 may display the minus-one screen, the first desktop or the second desktop. Optionally, which of these the terminal 100 displays depends on the page it was on when last exited.
Therefore, after displaying the unlock-success interface shown in FIG. 2B, the terminal 100 may also display the main interface shown in FIG. 2G or FIG. 2H (the second desktop or minus-one screen of the main interface). Referring to the earlier description of FIG. 2C, within the first 3 seconds of displaying the second desktop or minus-one screen, the terminal 100 can likewise capture and generate image frames containing the user's face image through the camera module 210 and recognize whether the user is gazing at the notification bar 221. If it recognizes that the user's eye-gaze position is within the notification bar 221, the terminal 100 can also display the notification interface shown in FIG. 2E for the user to obtain notification information.
In this way, whichever main interface the terminal 100 displays after unlock, it can detect the user's eye-gaze position within the first 3 seconds, satisfying the user's wish to check notifications first after unlocking.
When the user is unsure what to do after unlocking, the user usually performs left-swipe and right-swipe operations to switch the page of the currently displayed main interface. After switching pages, the user often also performs a pull-down first, instructing the terminal 100 to display the notification interface. Therefore, in some embodiments, each time a page is switched, the terminal 100 can also enable the camera module 210 to capture and generate image frames containing the user's face image and recognize whether the user is gazing at the notification bar 221.
As shown in FIG. 3A, after a successful unlock, the terminal 100 can first display the first desktop. First, within the first 3 seconds of displaying the first desktop, the terminal 100 can enable the camera module 210 to capture and generate image frames containing the user's face image and recognize whether the user is gazing at the notification bar 221.
At some moment within those 3 seconds (or after the 3 seconds, with no gaze at the notification bar recognized), the terminal 100 can detect a left-swipe operation (a swipe from the right side of the screen toward the left). In response to this operation, the terminal 100 can display the second desktop; see FIG. 3B. At this point, within the first 3 seconds of displaying the second desktop, the terminal 100 can also enable the camera module 210 to capture and generate image frames containing the user's face image and recognize whether the user is gazing at the notification bar 221. As shown in FIG. 3C-FIG. 3D, when it recognizes that the user is gazing at the notification bar 221, the terminal 100 can also display the notification interface. The notification interface shown in FIG. 3D is the same as FIG. 1B and is not repeated here.
At some moment within those 3 seconds (or after the 3 seconds, with no gaze at the notification bar recognized), the terminal 100 may also detect a right-swipe operation (a swipe from the left side of the screen toward the right); see FIG. 3E. In response to this operation, the terminal 100 can display the minus-one screen; see FIG. 3F. Similarly, within the first 3 seconds of displaying the minus-one screen, the terminal 100 can also enable the camera module 210 to capture and generate image frames containing the user's face image and recognize whether the user is gazing at the notification bar 221. As shown in FIG. 3G-FIG. 3H, when it recognizes that the user is gazing at the notification bar 221, the terminal 100 can also display the notification interface.
In this way, while the user switches pages, the terminal 100 can detect the user's eye-gaze position multiple times, giving the user multiple opportunities for gaze-controlled display.
After detecting that the user exits an application, the terminal 100 can display the main interface. At this point, the user often also brings up the notification interface to see which notifications remain to be handled. Therefore, in some embodiments, when it detects a return from a running application to the main interface, the terminal 100 can also detect whether the user is gazing at the notification bar, and hence determine whether to display the notification interface.
FIG. 4A exemplarily shows a user interface displayed while the terminal 100 runs the gallery application, denoted the gallery interface. Through the gallery interface the user can browse pictures, videos and other image resources stored on the terminal 100. The terminal 100 can detect an up-swipe operation (a swipe from the bottom of the screen upward); see FIG. 4B. In response to the up-swipe operation, the terminal 100 can display the main interface; see FIG. 4C. At this point, within the first 3 seconds of displaying the main interface, the terminal 100 can also enable the camera module 210 to capture and generate image frames containing the user's face image and recognize whether the user is gazing at the notification bar 221. When it detects that the user's eye-gaze position is within the notification bar 221, the terminal 100 can also display the notification interface; see FIG. 4D.
The gallery application is an exemplary application installed on the terminal 100. Not limited to the gallery application, when it detects an operation of returning to the main interface from any other application, the terminal 100 can likewise enable the camera module 210 to capture and generate image frames containing the user's face image, recognize whether the user is gazing at the notification bar 221, and hence determine whether to display the notification interface.
In this way, while the user exits the currently running application and returns to the main interface, the terminal 100 can also detect the user's eye-gaze position, satisfying the user's need to check pending notifications after finishing with an application.
In some embodiments, after displaying the notification interface, the terminal 100 can also check the number of notifications in the notification interface. If two or more notifications are displayed, the terminal 100 can, after the user has finished handling one notification, automatically display the other notifications in the notification interface.
Referring to FIG. 5A, after displaying the notification interface, the terminal 100 can detect a user operation acting on a notification. In response to this operation, the terminal 100 can expand the notification and display its detailed content.
For example, the terminal 100 can detect a user operation acting on notification 121, an exemplary message notification received by the terminal 100. In response to this operation, the terminal 100 can display the user interface for sending and receiving messages shown in FIG. 5B, denoted the message interface.
The message interface can include a contact 511, a message 512 and an input field 513. The contact 511 can indicate the source of the received message; "Lisa", for example, can indicate that the sender of the message shown in this interface is "Lisa". The message 512 can show the full message content. The input field 513 can receive input from the user of the terminal 100. When the user wants to reply to "Lisa", the user can tap the input field 513. In response to the tap, the terminal 100 can display an input keyboard, receive the user's input and show it in the input field 513. After input is complete, in response to the user's send operation, the terminal 100 can send the content of the input field 513 to "Lisa".
The message interface also includes several message type options, each of which can be used to send a particular kind of special message. For example, the photo option 514 can be used to send photo-type messages. Through the message type options, the user can send the contact photos, emoticons, red packets, locations and other kinds of special messages.
After displaying the message interface shown in FIG. 5B, the terminal 100 can monitor user operations to determine whether the user has finished handling the notification. Specifically, after displaying the message interface of FIG. 5B, the terminal 100 can monitor whether no user operation acting on the message interface is detected within a first waiting duration. If no user operation is detected within the first waiting duration, the terminal 100 can determine that the user has finished handling the notification. If a user operation is detected within the first waiting duration, then from the moment of that operation the terminal 100 restarts the first waiting duration and checks for user operations within it; if none is detected, the terminal 100 can determine the user has finished handling the notification; otherwise, it again restarts the first waiting duration and continues to check. The first waiting duration is preset, for example 5 seconds.
Taking 5 seconds as an example, if within 5 seconds after some user operation the terminal 100 detects no further user operation acting on the message interface of FIG. 5B, the terminal 100 can determine that the user has finished handling notification 121. The terminal 100 can then display the notification interface; see FIG. 5C. At this point, the notification interface no longer includes the handled notification 121, but only the remaining notification 122 and notification 123.
The user can then choose to tap notification 122 (or notification 123; notification 122 is used as the example here). In response to the user operation of tapping notification 122, the terminal 100 can display a page containing its detailed content. Exemplarily, notification 122 may be a weather forecast notification. In response to the operation of tapping the weather forecast notification, the terminal 100 can display the user interface shown in FIG. 5D presenting the current weather and forecast information, denoted the weather interface. The user can thus quickly obtain the weather information.
With this method, after one notification is handled, the terminal 100 automatically displays the notification interface again, which can remind the user to handle the other unhandled notifications in it while also making handling convenient, without requiring a pull-down operation each time.
The user can choose to enable or disable the eye-gaze recognition function. In the scenario where eye-gaze recognition is enabled, after the unlock completes, the terminal 100 can capture and generate image frames containing the user's face image, recognize the user's eye-gaze position, and hence determine whether to show the notification interface, making it convenient for the user to check notifications. Conversely, in the scenario where eye-gaze recognition is disabled, the terminal 100 will not capture the user's face image for recognizing the eye-gaze position.
FIG. 6A-FIG. 6D exemplarily show a set of user interfaces for enabling or disabling the eye-gaze recognition function.
FIG. 6A exemplarily shows the settings interface of the terminal 100, which can display multiple setting options, such as an account settings option 611, a WLAN option 612, a Bluetooth option 613 and a mobile network option 614. In this embodiment of the application, the settings interface also includes an auxiliary functions option 615, which can be used to configure some shortcut operations.
The terminal 100 can detect a user operation acting on the auxiliary functions option 615. In response to this operation, the terminal 100 can display the user interface shown in FIG. 6B, denoted the auxiliary function settings interface. This interface can display multiple auxiliary function options, such as an accessibility option 621 and a one-handed mode option 622. In this embodiment, the auxiliary function settings interface also includes a quick-launch and gestures option 623, which can be used to configure gesture actions and eye-gaze actions for interaction control.
The terminal 100 can detect a user operation acting on the quick-launch and gestures option 623. In response to this operation, the terminal 100 can display the user interface shown in FIG. 6C, denoted the quick-launch and gesture settings interface. This interface can display multiple quick-launch and gesture setting options, such as a smart voice option 631, a screenshot option 632, a screen recording option 633 and a quick call option 634. In this embodiment, the quick-launch and gesture settings interface also includes an eye-gaze option 635, which can be used to configure the regions for eye-gaze recognition and the corresponding shortcut operations.
The terminal 100 can detect a user operation acting on the eye-gaze option 635. In response to this operation, the terminal 100 can display the user interface shown in FIG. 6D, denoted the eye-gaze settings interface.
As shown in FIG. 6D, this interface can display multiple function options based on eye-gaze recognition, such as a notification bar option 641. When the switch in the notification bar option 641 is "ON", the terminal 100 has enabled the notification-bar gaze recognition function shown in FIG. 2A-FIG. 2H, FIG. 3A-FIG. 3D and FIG. 4A-FIG. 4D. When the switch is "OFF", the function is not enabled; thus when unlock succeeds and the main interface is displayed, or its pages are switched, or the main interface is returned to, the terminal 100 neither captures the user's face image nor judges whether the user is gazing at the notification bar.
The eye-gaze settings interface can also include a payment code option 642 and a health code option 643.
The payment code option 642 can be used to enable or disable the function of displaying the payment code by eye gaze. For example, with this function enabled, when unlock succeeds and the main interface is displayed (or its pages are switched, or it is returned to), the terminal 100 can use the captured image frames containing the user's face image to confirm whether the user is gazing at the upper-right region of the screen; see the upper-right region shown in FIG. 6E. When it detects the action of the user gazing at the upper-right region of the screen, the terminal 100 can display the payment code. In this way, the user can quickly and conveniently obtain the payment code and complete the payment, avoiding many tedious user operations, improving interaction efficiency and the user experience.
The health code option 643 can be used to enable or disable the function of displaying the health code by eye gaze. For example, with this function enabled, when unlock succeeds and the main interface is displayed (or its pages are switched, or it is returned to), the terminal 100 can use the captured image frames containing the user's face image to confirm whether the user is gazing at the lower-left region of the screen; see the lower-left region shown in FIG. 6E. When it detects the action of the user gazing at the lower-left region of the screen, the terminal 100 can display the health code, so the user can quickly and conveniently obtain it and complete the health check.
The mappings of the payment code to the upper-right region and of the health code to the lower-left region are exemplary. Developers or users can also define other mappings, such as gazing at the upper-left corner to display the payment code, which this embodiment of the application does not restrict.
FIG. 7 exemplarily shows a flowchart of a display method provided by an embodiment of this application. The processing flow with which the terminal 100 implements the display method is described in detail below with reference to FIG. 7 and the user interfaces shown in FIG. 2A-FIG. 2H.
S101. The terminal 100 detects a successful unlock and displays the main interface.
When the user is not using the terminal 100, it can be in the screen-off state or the screen-off AOD (always-on display) state. In the screen-off state, the display of the terminal 100 sleeps and goes black while the other components and programs work normally. The screen-off AOD state is a state in which part of the screen is lit without lighting the whole phone screen, that is, part of the screen is controlled to light up on the basis of the screen-off state.
When it detects the action of the user picking up the phone, the terminal 100 can light the whole screen and display the to-be-unlocked interface shown in FIG. 2A. After lighting the screen, the terminal 100 can enable the camera module 210 to capture and generate image frames containing the user's face image. Referring to the description of FIG. 2A, in this embodiment of the application the camera module 210 of the terminal 100 includes at least one 2D camera and one 3D camera. During face unlock verification, the camera used by the terminal 100 can be the 3D camera of the camera module 210.
After generating the image frames, the terminal 100 can input them into a face recognition model, which can store the owner's face image. Upon receiving the image frames captured and generated by the camera module 210, the face recognition model can recognize whether the face image in those frames matches the owner's face image. The face recognition model is an existing one and is not described further here.
When the face images match, the terminal 100 can confirm a successful unlock. At this point, the terminal 100 can display the unlock-success interface shown in FIG. 2B to prompt the user that the unlock succeeded. The terminal 100 can then display the main interface shown in FIG. 2C.
Referring to the description of FIG. 2F, the main interface can include multiple pages. Here, the terminal 100 displaying the main interface covers displaying any of its pages, for example displaying the first desktop, the second desktop or the minus-one screen. In some embodiments, the terminal 100 always displays the first desktop. In other embodiments, the terminal 100 can continue with the minus-one screen, the first desktop or the second desktop according to the page it was on when last exited. Therefore, after displaying the unlock-success interface of FIG. 2B, the terminal 100 may also display the main interface shown in FIG. 2G or FIG. 2H.
Of course, after confirming a successful unlock, the terminal 100 may also directly display the main interface shown in FIG. 2C, FIG. 2G or FIG. 2H without showing FIG. 2B; FIG. 2B is not mandatory.
S102. Within the preset gaze recognition time, the terminal 100 obtains image frames containing the user's face image and uses them to determine the user's eye-gaze position.
To avoid excessive power consumption and privacy risks, the camera module 210 is not kept working continuously. The terminal 100 can therefore set a gaze recognition time. Within this time, the terminal 100 can capture and generate image frames containing the user's face image through the camera module 210, for use in recognizing the user's eye-gaze position.
The initial period of displaying the main interface, generally the first few seconds, can be set as the gaze recognition time, for example the first 3 seconds of displaying the main interface cited in FIG. 2C. This period is derived from users' behavioral habits when controlling the display of the notification interface and is a preferred window for satisfying the user's wish to check the notification interface.
Therefore, after confirming a successful unlock, during the initial period of displaying the main interface (the gaze recognition time), the terminal 100 also enables the camera module 210 to capture and generate image frames containing the user's face image. The frames captured and generated during this time may be called target input images. The target input images can be used by the terminal 100 to determine whether the user is gazing at the notification bar region, and hence whether to display the notification interface.
In the face-unlock scenario, the 3D camera of the terminal 100 is already on; therefore, in S102 the terminal 100 only needs to turn on the 2D camera of the camera module 210. In the password-unlock and fingerprint-unlock scenarios, the cameras of the terminal 100 are off; the terminal 100 then needs to turn on both the 2D camera and the 3D camera of the camera module 210.
Specifically, after obtaining the target input images, the terminal 100 can input them into the eye-gaze recognition model, a model preset in the terminal 100. The eye-gaze recognition model can recognize the user's eye-gaze position in the input image frames, and the terminal 100 can then determine from that position whether the user is gazing at the notification bar. The structure of the eye-gaze recognition model used in this application is described in detail later with FIG. 8 and is not expanded here.
Optionally, the eye-gaze recognition model can also output the user's eye-gaze region. An eye-gaze region can contract to an eye-gaze position, and an eye-gaze position can expand to an eye-gaze region. In some examples, a cursor point formed by one display unit on the screen may be called an eye-gaze position; correspondingly, a cursor point or cursor area formed by multiple display units on the screen is called an eye-gaze region.
After an eye-gaze region is output, the terminal 100 can determine whether the user is gazing at the notification bar, and hence whether to display the notification interface, by judging the position of the eye-gaze region on the screen.
S103. When it determines that the user's eye-gaze position is within the notification bar region, the terminal 100 determines that the user is gazing at the notification bar. In response to the action of the user gazing at the notification bar, the terminal 100 displays the notification interface.
When the eye-gaze recognition model recognizes that the eye-gaze position in the input image frames is within the notification bar region, the terminal 100 can decide to display the notification interface. Referring to the user interfaces shown in FIG. 2D-FIG. 2E, the terminal 100 can determine that the user's eye-gaze position is within the region of the notification bar 221, and accordingly display the notification interface.
If none of the eye-gaze positions in the image frames captured and generated during the gaze recognition time falls within the notification bar region, the terminal 100 does not display the notification interface. Of course, the preset gaze recognition time is not limited to the initial period of the first main-interface display after a successful unlock. The terminal 100 also sets other gaze recognition times, for example the initial period after the main interface is updated following a detected page-switch operation, and the initial period after returning to the main interface from an application. At those times too, the terminal 100 can recognize whether the user is gazing at the notification bar and hence determine whether to display the notification interface, as described in detail later.
Within the preset gaze recognition time (3 seconds), the terminal 100 captures and generates image frames containing the user's face image and recognizes the eye-gaze position simultaneously. Therefore, if the terminal 100 recognizes that the user is gazing at the notification bar region before the gaze recognition time ends, the terminal 100 can display the notification interface, and at the same time the camera module 210 can stop capturing and generating image frames. If the terminal 100 has still not recognized a gaze at the notification bar region after the gaze recognition time ends, it also stops capturing and generating image frames, to save power.
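The behavior of S102-S103, capturing and recognizing in parallel during the window and stopping the camera as soon as a hit occurs or the window expires, can be sketched as follows. The function name, the frame-rate parameter and the frame representation are illustrative assumptions, not part of the application:

```python
def gaze_window(frames, gazing_at_bar, window_s=3.0, fps=10):
    """Consume frames captured during the gaze recognition window.
    Returns (show_notifications, frames_processed): True as soon as a
    frame shows a gaze at the notification bar; False if the window
    ends without a hit. The camera stops in either case."""
    max_frames = int(window_s * fps)
    for i, frame in enumerate(frames[:max_frames]):
        if gazing_at_bar(frame):
            return True, i + 1         # early stop: hit before the window ends
    return False, min(len(frames), max_frames)
```

A hit on the third frame ends the session immediately; with no hit, exactly one window's worth of frames is processed before the camera shuts off.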
S104. The terminal 100 determines that multiple notifications are displayed on the notification interface and, after detecting that the user has finished confirming one notification, automatically displays the notification interface containing the remaining notifications. S104 is optional.
After displaying the notification interface, the terminal 100 can determine the number of notifications it displays. If the number is multiple (two or more), then after detecting that the user has finished confirming one notification, the terminal 100 can automatically display the detailed content of the remaining notifications.
Referring to the description of FIG. 5A-FIG. 5D, after the user opens notification 121, the terminal 100 can display the message interface shown in FIG. 5B. After displaying the message interface, the terminal 100 can monitor user operations to determine whether the user has finished handling the notification.
Optionally, the terminal 100 can determine whether the user has finished handling a notification from a preset user operation, such as the up-swipe operation of returning to the main interface.
Optionally, the terminal 100 can also monitor whether no user operation acting on the message interface is detected within the first waiting duration. If no user operation is detected within the first waiting duration, the terminal 100 can determine that the user has finished handling the notification. Referring to FIG. 5B, when the message sent by Lisa has been displayed for a period with no reply-editing operation detected, the terminal 100 can confirm that the user has finished handling the message notification. As another example, when it detects the user browsing the interface corresponding to a notification, scrolling to some position in it and staying there beyond a period, the terminal 100 can confirm the user has finished handling that notification. As yet another example, when it detects the user browsing a video corresponding to a notification and replaying it several times, the terminal 100 can confirm the user has finished handling that notification, and so on.
After confirming that the user has finished handling a notification, the terminal 100 can automatically display the notification interface. In this way, the user can obtain the notification interface and view the remaining unhandled notifications without performing a pull-down operation, and can then continue to handle them. This both reminds the user to handle the remaining unhandled notifications and makes handling them convenient, saving user operations.
Further, the terminal 100 can decide, according to a notification's type, whether to display its detailed content automatically. Notifications can be divided into transactional notifications and promotional notifications. Transactional notifications include, for example, the flight order notification and itinerary reminder sent after the user buys a ticket; these require the user's confirmation. Promotional notifications include, for example, a notification advertising a flight sale; these can be ignored by the user. If the remaining notifications are promotional, the terminal 100 can refrain from automatically displaying their detailed content, so as not to disturb the user and degrade the user experience.
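The type-based decision above can be sketched as a simple filter. The `kind` field and the function name are hypothetical representations for illustration; the application does not prescribe a data model:

```python
def notifications_to_auto_show(remaining):
    """Select the remaining notifications whose details may be shown
    automatically: transactional ones (orders, itinerary reminders) are
    surfaced, while promotional ones are left for the user to open."""
    return [n for n in remaining if n["kind"] == "transactional"]
```

With one flight-order notification and one sale advertisement pending, only the order is expanded automatically.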
In S103, if the terminal 100 has still not recognized a gaze at the notification bar region after the gaze recognition time ends, it stops capturing and generating image frames. At this point, the user may switch the currently displayed main-interface page, or may open an application.
When unsure what operation to perform first after opening the phone, the user often swipes left and right to switch main-interface pages and browse casually, and in the end still chooses to check the notifications to confirm whether any important ones are pending. Therefore, after detecting a switch of the currently displayed main interface, during the initial period of displaying the switched-to main interface, the terminal 100 can also enable the camera module 210 to capture and generate image frames and determine whether the user is gazing at the notification bar, and hence whether to display the notification interface.
For this scenario, after detecting the user operation of switching the currently displayed main interface, the terminal 100 can also set a gaze recognition time and, within it, recognize whether the user gazes at the notification bar. Upon detecting the action of the user gazing at the notification bar, the terminal 100 also displays the notification interface.
With reference to the user interfaces shown in FIG. 3A-FIG. 3D, within the first 3 seconds of initially displaying the first desktop of the main interface, the terminal 100 may not recognize the action of the user gazing at the notification bar. In that case, the terminal 100 does not display the notification interface. Meanwhile, after the 3 seconds, the terminal 100 turns off the camera module 210 to reduce power consumption.
Afterwards, the terminal 100 may detect a left-swipe operation (an operation of switching the main interface). In response to this operation, the terminal 100 can display the second desktop of the main interface. The first 3 seconds of displaying the second desktop can also be set as a gaze recognition time. Therefore, within the first 3 seconds of displaying the second desktop, the terminal 100 can also enable the camera module 210 to capture and generate image frames and determine whether the user is gazing at the notification bar, and hence whether to display the notification interface.
Optionally, the gaze recognition times of different pages of the main interface can also differ; for example, the first 2 seconds of the second desktop can be set as the second desktop's gaze recognition time.
In this way, while the user switches pages, the terminal 100 can detect the user's eye-gaze position multiple times, giving the user multiple opportunities for gaze-controlled display. At the same time, the terminal 100 avoids keeping the camera permanently in the working state and the resulting excessive power consumption.
After exiting an application, the user also tends to check the notifications again to confirm whether any important ones are pending. For this scenario, after detecting the user operation of returning to the main interface, the terminal 100 can also set a gaze recognition time and, within it, recognize whether the user gazes at the notification bar. Upon detecting the action of the user gazing at the notification bar, the terminal 100 also displays the notification interface.
Referring to the user interfaces shown in FIG. 4A-FIG. 4D, in the scenario of running the gallery application, the terminal 100 can detect an up-swipe operation, that is, the operation of exiting the gallery and returning to the main interface. In response to this operation, the terminal 100 can display the main interface. The first 3 seconds of displaying this main interface can also be set as a gaze recognition time, during which the camera module 210 of the terminal 100 can capture and generate image frames. When it detects the action of the user gazing at the notification bar, the terminal 100 can also display the notification interface.
With this method, the scenarios in which the user controls the display of the notification interface by eye gaze are not limited to one fixed period after unlock. The user can control it by gaze in more scenarios, such as switching main-interface pages or returning to the main interface. Meanwhile, with this method, the terminal 100 enables the camera and recognizes the user's eye-gaze position only in the preset trigger scenarios, avoiding the resource waste and high power consumption caused by keeping the camera working for long periods.
FIG. 8 exemplarily shows the structure of the eye-gaze recognition model. The eye-gaze recognition model used in the embodiments of this application is described in detail below with reference to FIG. 8. In the embodiments of this application, the eye-gaze recognition model is built on convolutional neural networks (CNN).
As shown in FIG. 8, the eye-gaze recognition model can include a face correction module, a dimensionality reduction module and a convolutional network module.
(1) Face correction module.
Images containing the user's face captured by the camera module 210 can first be input into the face correction module, which can be used to recognize whether the face image in an input frame is upright. For frames in which the face image is not upright (for example, a tilted head), the face correction module can correct the frame to make it upright, avoiding a degraded eye-gaze recognition result later.
FIG. 9A exemplarily shows the correction flow that the face correction module applies to an image frame containing the user's face image generated by the camera module 210.
S201: Determine the face keypoints in image frame T1 using a face keypoint recognition algorithm.
In this embodiment of the application, the face keypoints include the left eye, right eye, nose, left lip corner and right lip corner. Face keypoint recognition algorithms are existing, for example Kinect-based face keypoint recognition algorithms, and are not described further here.
Referring to FIG. 9B, which exemplarily shows an image frame containing the user's face image, denoted image frame T1, the face correction module can use the face keypoint recognition algorithm to determine the face keypoints in image frame T1: left eye a, right eye b, nose c, left lip corner d and right lip corner e, and determine the coordinate position of each keypoint; see image frame T1 in FIG. 9C.
S202: Determine the line to be calibrated in image frame T1 from the face keypoints, and from it determine the face deflection angle θ of image frame T1.
In an upright face image, the left and right eyes lie on the same horizontal line, so the straight line joining the left-eye and right-eye keypoints (the line to be calibrated) is parallel to the horizontal, that is, the face deflection angle θ (the angle between the line to be calibrated and the horizontal) is 0.
As shown in FIG. 9C, the face correction module can use the recognized coordinate positions of left eye a and right eye b to determine the line to be calibrated L1. Then, from L1 and the horizontal, the face correction module can determine the face deflection angle θ of the face image in image frame T1.
S203: If θ = 0°, determine that the face image in image frame T1 is upright and needs no correction.
S204: If θ ≠ 0°, determine that the face image in image frame T1 is not upright; further, correct image frame T1 to obtain an image frame with an upright face image.
In FIG. 9C, θ ≠ 0, that is, the face image in image frame T1 is not upright. The face correction module can then correct image frame T1.
Specifically, the face correction module can first use the coordinate positions of left eye a and right eye b to determine the rotation center point y, and then, with point y as the rotation center, rotate image frame T1 by θ° to obtain an image frame with an upright face image, denoted image frame T2. As shown in FIG. 9C, point A can represent the rotated position of left eye a, point B the rotated position of right eye b, point C the rotated position of nose c, point D the rotated position of left lip corner d, and point E the rotated position of right lip corner e.
It can be understood that when image frame T1 is rotated, every pixel in the image is rotated. The points A, B, C, D and E above merely illustrate the rotation process of the keypoints in the image, rather than implying that only the face keypoints are rotated.
S205: Process the rotated image frame with the upright face image to obtain the left-eye image, right-eye image, face image and face grid data. The face grid data can be used to reflect the position of the face image within the whole image.
Specifically, the face correction module can crop the corrected image, centered on the face keypoints and at preset sizes, to obtain the corresponding left-eye image, right-eye image and face image. While determining the face image, the face correction module can determine the face grid data.
Referring to FIG. 9D, the face correction module can determine a rectangle of fixed size centered on left eye A; the image it covers is the left-eye image. In the same way, the face correction module can determine the right-eye image centered on right eye B, and the face image centered on nose C. The left-eye image and the right-eye image have the same size; the face image has a different size from the left-eye image. After determining the face image, the face correction module can derive the face grid data accordingly, that is, the position of the face image within the whole image.
After face correction is complete, the terminal 100 can obtain the corrected image frame T1 and, from it, the corresponding left-eye image, right-eye image, face image and face grid data.
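The geometry of S202-S204 can be worked through numerically: compute the deflection angle θ from the two eye keypoints, then rotate the keypoints by -θ about the midpoint between the eyes. This is a sketch with assumed coordinates and function names; a real implementation would rotate every pixel, for example with an affine warp, and might choose a different rotation center:

```python
import math

def deflection_angle(left_eye, right_eye):
    """Angle theta (radians) between the eye line and the horizontal."""
    (xa, ya), (xb, yb) = left_eye, right_eye
    return math.atan2(yb - ya, xb - xa)

def rotate_about(p, center, angle):
    """Rotate point p by `angle` radians around `center`."""
    x, y = p[0] - center[0], p[1] - center[1]
    c, s = math.cos(angle), math.sin(angle)
    return (center[0] + c * x - s * y, center[1] + s * x + c * y)

def correct_keypoints(points, left_eye, right_eye):
    """Rotate all keypoints by -theta so the eye line becomes horizontal."""
    theta = deflection_angle(left_eye, right_eye)
    center = ((left_eye[0] + right_eye[0]) / 2,
              (left_eye[1] + right_eye[1]) / 2)
    return [rotate_about(p, center, -theta) for p in points]
```

After correction the two eye points share the same vertical coordinate, and distances between keypoints are preserved, as expected of a pure rotation.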
(2) Dimensionality reduction module.
The face correction module can feed the left-eye image, right-eye image, face image and face grid data it outputs into the dimensionality reduction module. The dimensionality reduction module can reduce the dimensionality of these inputs to lower the computational complexity of the convolutional network module and speed up eye-gaze recognition. The reduction methods used include but are not limited to principal components analysis (PCA), downsampling, 1*1 convolution kernels, and so on.
(3) Convolutional network module.
The dimensionality-reduced images can be input into the convolutional network module, which can output the eye-gaze position based on these inputs. In this embodiment of the application, the structure of the convolutional network in the module can be seen in FIG. 10.
As shown in FIG. 10, the convolutional network can include convolution group 1 (CONV1), convolution group 2 (CONV2) and convolution group 3 (CONV3). A convolution group includes: a convolution kernel (Convolution), a PRelu activation function, a pooling kernel (Pooling) and a local response normalization (LRN) layer. The convolution kernel of CONV1 is a 7*7 matrix and its pooling kernel a 3*3 matrix; CONV2 has a 5*5 convolution kernel and a 3*3 pooling kernel; CONV3 has a 3*3 convolution kernel and a 2*2 pooling kernel.
The separable convolution technique can lower the storage requirements of the convolution kernels and pooling kernels, reducing the overall model's demand for storage space so that the model can be deployed on terminal devices.
Specifically, the separable convolution technique stores an n*n matrix decomposed into an n*1 column matrix and a 1*n row matrix, reducing the storage space needed. The eye-gaze model used in this application therefore has the advantages of small size and easy deployment, suiting deployment on terminals and other electronic devices.
Specifically, referring to FIG. 11, matrix A can represent a 3*3 convolution kernel. If matrix A is stored directly, it occupies 9 storage units. Matrix A can be split into column matrix A1 and row matrix A2 (column matrix A1 × row matrix A2 = matrix A); column matrix A1 and row matrix A2 need only 6 storage units.
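The storage saving can be demonstrated for a concrete kernel. Note that an exact factorization of an n*n kernel into an n*1 column and a 1*n row exists only when the kernel is rank-1 (and here, when its top-left entry is nonzero); this is an assumption of the sketch, as are the function names:

```python
def factor_rank1(a):
    """Factor a rank-1 n*n kernel into a column vector and a row vector
    such that a[i][j] == col[i] * row[j]; stores 2n values instead of n*n.
    Assumes a[0][0] != 0."""
    col = [r[0] for r in a]                # first column of the kernel
    row = [v / a[0][0] for v in a[0]]      # first row, scaled so col*row == a
    return col, row

def outer(col, row):
    """Reconstruct the kernel from its two factors."""
    return [[c * r for r in row] for c in col]
```

A 3*3 rank-1 kernel is stored as 3 + 3 = 6 values instead of 9, matching the saving described for FIG. 11.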
After processing by CONV1, CONV2 and CONV3, the different images can be input into different connection layers for full connection. As shown in FIG. 10, the convolutional network can include connection layer 1 (FC1), connection layer 2 (FC2) and connection layer 3 (FC3).
The left-eye image and right-eye image, after CONV1, CONV2 and CONV3, can be input into FC1. FC1 can include a combination module (concat), convolution kernel 101, PRelu and fully-connected module 102; the concat module can be used to combine the left-eye and right-eye images. The face image, after CONV1, CONV2 and CONV3, can be input into FC2. FC2 can include convolution kernel 103, PRelu, fully-connected module 104 and fully-connected module 105; FC2 can fully connect the face image twice. The face grid data, after CONV1, CONV2 and CONV3, can be input into FC3, which includes one fully-connected module.
Connection layers of different structures are built for the different types of image (for example, left eye, right eye, face image); they can better capture the features of each type of image, improving the model's accuracy so that the terminal 100 can recognize the user's eye-gaze position more accurately.
Fully-connected module 106 can then perform one more full connection over the left-eye and right-eye images, the face image and the face grid data, finally outputting the eye-gaze position. The eye-gaze position indicates the specific point on the screen where the user's gaze focuses, that is, the user's gaze position; see cursor point S shown in FIG. 1C. Then, when the eye-gaze position is within the notification bar region, the terminal 100 can determine that the user is gazing at the notification bar.
In addition, the convolutional neural network of the eye-gaze model used in this application has relatively few parameters. Therefore, the time needed to compute and predict the user's eye-gaze position with the eye-gaze model is small, that is, the terminal 100 can quickly determine whether the user is gazing at the notification bar or another specific region.
In the embodiments of this application:
the first preset region may be the notification bar 221 in the interface shown in FIG. 2C;
the first interface may be the to-be-unlocked interface shown in FIG. 2A, or an interface of an application displayed when exiting that application as shown in FIG. 5B;
the second interface may be any of the main-interface pages shown in FIG. 2F, such as the first desktop, the second desktop or the minus-one screen;
the third interface may be the third interface shown in FIG. 2E.
FIG. 12 is a schematic diagram of the system structure of the terminal 100 according to an embodiment of this application.
The layered architecture divides the system into several layers, each with a clear role and division of labor; layers communicate with each other through software interfaces. In some embodiments, the system is divided into five layers, from top to bottom: the application layer, the application framework layer, the hardware abstraction layer, the kernel layer and the hardware layer.
The application layer can include multiple applications, such as a dialer application and a gallery application. In this embodiment of the application, the application layer also includes an eye-gaze SDK (software development kit). The system of the terminal 100 and third-party applications installed on it can recognize the user's eye-gaze position by calling the eye-gaze SDK.
The framework layer provides application programming interfaces (API) and a programming framework for the applications of the application layer, and includes some predefined functions. In this embodiment, the framework layer can include a camera service interface and an eye-gaze service interface. The camera service interface provides the application programming interface and programming framework for using the camera; the eye-gaze service interface provides the application programming interface and programming framework for using the eye-gaze recognition model.
The hardware abstraction layer is an interface layer between the framework layer and the driver layer and provides a virtual hardware platform for the operating system. In this embodiment, the hardware abstraction layer can include a camera hardware abstraction layer and an eye-gaze process. The camera hardware abstraction layer can provide the virtual hardware of camera device 1 (the RGB camera), camera device 2 (the TOF camera) or more camera devices. The computation that recognizes the user's eye-gaze position through the eye-gaze recognition module executes in the eye-gaze process.
The driver layer is the layer between hardware and software and includes the drivers of the various hardware components. The driver layer can include the camera device driver, which drives the camera's sensors to capture images and drives the image signal processor to preprocess the images.
The hardware layer includes the sensors and the secure data buffer. The sensors include the RGB camera (that is, the 2D camera) and the TOF camera (that is, the 3D camera). The RGB camera can capture and generate 2D images. The TOF camera, a depth camera, can capture and generate 3D images carrying depth information. The data captured by the cameras is stored in the secure data buffer. Any upper-layer process or application obtaining the image data captured by the cameras must obtain it from the secure data buffer and cannot obtain it in any other way; the secure data buffer therefore also prevents the camera's image data from being misused.
The software levels described above and the modules or interfaces in each layer run in the rich execution environment (REE). The terminal 100 also includes a trusted execution environment (TEE); data communication in the TEE is more secure than in the REE.
The TEE can include the eye-gaze recognition algorithm module, the trusted application (TA) module and the security service module. The eye-gaze recognition algorithm module stores the executable code of the eye-gaze recognition model. The TA can be used to securely send the recognition results output by the model to the eye-gaze process. The security service module can be used to securely feed the image data stored in the secure data buffer into the eye-gaze recognition algorithm module.
The gaze-recognition-based interaction method in the embodiments of this application is described in detail below in combination with the above hardware structure and system structure:
The terminal 100 determines to perform the eye-gaze recognition operation. Upon recognizing a successful unlock, a page switch after unlock, or a return to the main interface, the terminal 100 can determine to perform the eye-gaze recognition operation within the gaze recognition time.
The terminal 100 calls the eye-gaze service through the eye-gaze SDK.
On the one hand, the eye-gaze service can call the camera service of the framework layer and, through it, capture and obtain image frames containing the user's face image. The camera service can send instructions to start the RGB camera and the TOF camera by calling camera device 1 (RGB camera) and camera device 2 (TOF camera) in the camera hardware abstraction layer. The camera hardware abstraction layer sends these instructions to the camera device driver in the driver layer, which can start the cameras accordingly: the instruction sent via camera device 1 can be used to start the RGB camera, and the one sent via camera device 2 can be used to start the TOF camera. Once on, the RGB and TOF cameras collect light signals, and the image signal processor generates 2D or 3D images from the electrical signals. On the other hand, the eye-gaze service can create the eye-gaze process and initialize the eye-gaze recognition model.
The images generated by the image signal processor can be stored in the secure data buffer. After the eye-gaze process is created and initialized, the image data stored in the secure data buffer can be delivered to the eye-gaze recognition algorithm via the secure transmission channel (TEE) provided by the security service. Upon receiving the image data, the eye-gaze recognition algorithm can input it into the CNN-based eye-gaze recognition model to determine the user's eye-gaze position. The TA then securely passes the eye-gaze position back to the eye-gaze process, from which it is returned through the camera service and the eye-gaze service to the eye-gaze SDK at the application layer.
Finally, the eye-gaze SDK can determine from the received eye-gaze position whether the user is gazing at the notification bar, and hence whether to display the notification interface.
FIG. 13 shows a schematic diagram of the hardware structure of the terminal 100.
The terminal 100 can include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, antenna 1, antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headset jack 170D, a sensor module 180, buttons 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, and so on. The sensor module 180 can include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, etc.
It can be understood that the structure illustrated in this embodiment of the invention does not constitute a specific limitation on the terminal 100. In other embodiments of this application, the terminal 100 may include more or fewer components than shown, combine some components, split some components, or arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 110 can include one or more processing units; for example, the processor 110 can include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc. The different processing units may be independent devices or may be integrated into one or more processors.
The controller can generate operation control signals according to the instruction operation code and timing signals, completing the control of fetching and executing instructions.
A memory can also be provided in the processor 110 for storing instructions and data. In some embodiments, the memory in the processor 110 is a cache, which can hold instructions or data that the processor 110 has just used or uses cyclically. If the processor 110 needs the instructions or data again, it can call them directly from this memory, avoiding repeated accesses, reducing the waiting time of the processor 110 and thus improving system efficiency.
In some embodiments, the processor 110 can include one or more interfaces. The interfaces can include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, etc.
It can be understood that the interface connections between the modules illustrated in this embodiment of the invention are only schematic and do not limit the structure of the terminal 100. In other embodiments of this application, the terminal 100 may also use interface connections different from the above embodiment, or a combination of several interface connections.
The charging management module 140 is used to receive charging input from the charger. The power management module 141 is used to connect the battery 142 and the charging management module 140 with the processor 110.
The wireless communication function of the terminal 100 can be implemented through antenna 1, antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, the baseband processor, and so on.
Antenna 1 and antenna 2 are used to transmit and receive electromagnetic wave signals. The mobile communication module 150 can provide solutions for wireless communication including 2G/3G/4G/5G applied on the terminal 100, and can include at least one filter, switch, power amplifier, low noise amplifier (LNA), etc. The mobile communication module 150 can receive electromagnetic waves via antenna 1, filter and amplify the received waves, and pass them to the modem processor for demodulation; it can also amplify signals modulated by the modem processor and radiate them as electromagnetic waves via antenna 1.
The modem processor can include a modulator and a demodulator. The modulator is used to modulate the low-frequency baseband signal to be sent into a medium- or high-frequency signal; the demodulator is used to demodulate the received electromagnetic wave signal into a low-frequency baseband signal.
The wireless communication module 160 can provide solutions for wireless communication applied on the terminal 100, including wireless local area networks (WLAN) (such as wireless fidelity (Wi-Fi) networks), Bluetooth (BT), the global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR), etc. The wireless communication module 160 receives electromagnetic waves via antenna 2, frequency-modulates and filters the signals, and sends the processed signals to the processor 110. It can also receive signals to be sent from the processor 110, frequency-modulate and amplify them, and radiate them as electromagnetic waves via antenna 2.
In some embodiments, antenna 1 of the terminal 100 is coupled to the mobile communication module 150 and antenna 2 is coupled to the wireless communication module 160, so that the terminal 100 can communicate with networks and other devices through wireless communication technologies. These wireless communication technologies can include the global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technology, etc. The GNSS can include the global positioning system (GPS), the global navigation satellite system (GLONASS), the BeiDou navigation satellite system (BDS), the quasi-zenith satellite system (QZSS), and/or satellite based augmentation systems (SBAS).
In the embodiments of this application, some notifications received by the terminal 100 are sent by the application servers corresponding to applications installed on it. The terminal 100 receives these notifications through the wireless communication function implemented by antenna 1, antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor and the baseband processor, and then displays them.
终端100通过GPU,显示屏194,以及应用处理器等实现显示功能。GPU为图像处 理的微处理器,连接显示屏194和应用处理器。GPU用于执行数学和几何计算,用于图形渲染。处理器110可包括一个或多个GPU,其执行程序指令以生成或改变显示信息。
显示屏194用于显示图像,视频等。显示屏194包括显示面板。显示屏194包括显示面板。显示面板可以采用液晶显示屏(liquid crystal display,LCD)。显示面板还可以采用有机发光二极管(organic light-emitting diode,OLED),有源矩阵有机发光二极体或主动矩阵有机发光二极体(active-matrix organic light emitting diode,AMOLED),柔性发光二极管(flex light-emitting diode,FLED),miniled,microled,micro-oled,量子点发光二极管(quantum dot light emitting diodes,QLED)等制造。在一些实施例中,终端100可以包括1个或N个显示屏194,N为大于1的正整数。
In the embodiments of this application, the terminal 100 displays the user interfaces shown in FIG. 2A to FIG. 2H, FIG. 3A to FIG. 3H, FIG. 4A to FIG. 4D, FIG. 5A to FIG. 5D, and FIG. 6A to FIG. 6E through the display function provided by the GPU, the display screen 194, the application processor, and the like.
The terminal 100 may implement a photographing function by using the ISP, the camera 193, the video codec, the GPU, the display screen 194, the application processor, and the like. In the embodiments of this application, the camera 193 includes an RGB camera (2D camera) that generates two-dimensional images and a TOF camera (3D camera) that generates three-dimensional images.
The ISP is configured to process data fed back by the camera 193. For example, during photographing, the shutter is opened, light is transferred to the photosensitive element of the camera through the lens, the optical signal is converted into an electrical signal, and the photosensitive element of the camera transfers the electrical signal to the ISP for processing, so that it is converted into an image visible to the naked eye. The ISP may further perform algorithm optimization on the noise and brightness of the image. The ISP may also optimize parameters such as the exposure and color temperature of the photographing scene. In some embodiments, the ISP may be disposed in the camera 193.
The camera 193 is configured to capture static images or videos. An optical image of an object is generated through the lens and projected onto the photosensitive element. The photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal, and then transfers the electrical signal to the ISP to be converted into a digital image signal. The ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into an image signal in a standard format such as RGB or YUV. In some embodiments, the terminal 100 may include one or N cameras 193, where N is a positive integer greater than 1.
The digital signal processor is configured to process digital signals, and may process other digital signals in addition to digital image signals. The video codec is configured to compress or decompress digital video. The terminal 100 may support one or more video codecs.
The NPU is a neural-network (NN) computing processor. By drawing on the structure of biological neural networks, for example the transfer mode between neurons in the human brain, it processes input information rapidly and can also learn continuously. Applications such as intelligent cognition of the terminal 100, for example image recognition, facial recognition, speech recognition, and text understanding, can be implemented through the NPU.
In the embodiments of this application, the terminal 100 captures and generates image frames through the photographing capability provided by the ISP and the camera 193. The terminal 100 may run an eyeball-gaze recognition algorithm on the NPU, and thereby identify the user's eyeball gaze position from the captured image frames.
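As an illustrative sketch only (not the patented implementation), the gaze-driven behavior described above can be modeled as a dwell test: the camera/NPU pipeline yields a stream of estimated gaze coordinates, and an action fires once the gaze stays inside a screen region for enough consecutive frames. The function name, region format, and 10-frame threshold below are assumptions made for the example.

```python
def gaze_dwell_trigger(gaze_points, region, min_frames=10):
    """Fire once `min_frames` consecutive estimated gaze positions fall
    inside `region` = (left, top, right, bottom), in screen pixels.

    `gaze_points` stands in for the per-frame output of the eyeball-gaze
    recognition model; the 10-frame dwell threshold is an assumption."""
    left, top, right, bottom = region
    run = 0
    for x, y in gaze_points:
        if left <= x <= right and top <= y <= bottom:
            run += 1          # gaze is dwelling inside the region
            if run >= min_frames:
                return True   # dwelled long enough: trigger the action
        else:
            run = 0           # gaze left the region: restart the count
    return False
```

A stream that keeps jumping out of the region never accumulates the required dwell, so brief glances do not trigger the action.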
The internal memory 121 may include one or more random access memories (RAM) and one or more non-volatile memories (NVM).
The random access memory may include a static random-access memory (SRAM), a dynamic random access memory (DRAM), a synchronous dynamic random access memory (SDRAM), a double data rate synchronous dynamic random access memory (DDR SDRAM; for example, fifth-generation DDR SDRAM is generally referred to as DDR5 SDRAM), and the like. The non-volatile memory may include a magnetic disk storage device and a flash memory.
The random access memory can be directly read and written by the processor 110, may be used to store the executable programs (for example, machine instructions) of the operating system or other running programs, and may also store data of users and applications. The non-volatile memory may also store executable programs and data of users and applications, which can be loaded into the random access memory in advance for the processor 110 to read and write directly.
The application code of the eyeball-gaze SDK may be stored in the non-volatile memory. When the eyeball-gaze SDK is run to invoke the eyeball-gaze service, its application code may be loaded into the random access memory. Data generated while running the code may also be stored in the random access memory.
The external memory interface 120 may be configured to connect an external non-volatile memory, to extend the storage capability of the terminal 100. The external non-volatile memory communicates with the processor 110 through the external memory interface 120, to implement a data storage function, for example, saving files such as music and videos in the external non-volatile memory.
The terminal 100 may implement audio functions, for example music playback and recording, by using the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headset jack 170D, the application processor, and the like.
The audio module 170 is configured to convert digital audio information into an analog audio signal output, and is also configured to convert an analog audio input into a digital audio signal. The speaker 170A, also referred to as a "loudspeaker", is configured to convert an audio electrical signal into a sound signal. The terminal 100 may listen to music or a hands-free call through the speaker 170A. The receiver 170B, also referred to as an "earpiece", is configured to convert an audio electrical signal into a sound signal. When the terminal 100 answers a call or a voice message, the receiver 170B may be placed close to the human ear to listen to the voice. The microphone 170C, also referred to as a "mike" or "mic", is configured to convert a sound signal into an electrical signal. When making a call or sending a voice message, the user may speak close to the microphone 170C to input the sound signal into the microphone 170C. The headset jack 170D is configured to connect a wired headset.
The pressure sensor 180A is configured to sense a pressure signal and can convert the pressure signal into an electrical signal. The gyroscope sensor 180B may be configured to determine the angular velocity of the terminal 100 around three axes (namely, the x, y, and z axes), and thereby determine the motion posture of the terminal 100. The acceleration sensor 180E can detect the magnitude of acceleration of the terminal 100 in various directions (generally along three axes). Therefore, the acceleration sensor 180E may be used to recognize the posture of the terminal 100. In the embodiments of this application, when the screen is off or in the off-screen always-on-display (AOD) state, the terminal 100 may detect through the acceleration sensor 180E and the gyroscope sensor 180B whether the user picks up the phone, and thereby determine whether to light up the screen.
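The pick-up detection described above can be approximated by combining the two sensors: a change in the gravity direction reported by the accelerometer (the phone tilting from lying flat toward the user) together with a burst of rotation rate from the gyroscope. The sketch below is a hedged illustration with invented names and thresholds, not the actual detection logic of the terminal 100.

```python
import math

def _angle_deg(v1, v2):
    """Angle between two 3-D vectors, in degrees."""
    dot = sum(a * b for a, b in zip(v1, v2))
    n1 = math.sqrt(sum(a * a for a in v1))
    n2 = math.sqrt(sum(b * b for b in v2))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / (n1 * n2)))))

def detect_pickup(accel_samples, gyro_samples, tilt_deg=30.0, rot_rate=1.0):
    """Guess whether the phone was picked up during the sample window.

    accel_samples: per-frame (ax, ay, az) gravity readings in g.
    gyro_samples: per-frame (gx, gy, gz) angular rates in rad/s.
    Both thresholds are illustrative assumptions."""
    # How far the gravity vector rotated between the start and end of the window.
    tilt = _angle_deg(accel_samples[0], accel_samples[-1])
    # Peak rotation rate anywhere in the window.
    peak = max(math.sqrt(x * x + y * y + z * z) for x, y, z in gyro_samples)
    return tilt >= tilt_deg and peak >= rot_rate
```

Requiring both conditions filters out a phone nudged on a table (rotation burst but no sustained tilt change) and slow drift without a deliberate lift.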
The barometric pressure sensor 180C is configured to measure barometric pressure. The magnetic sensor 180D includes a Hall sensor. The terminal 100 may use the magnetic sensor 180D to detect the opening and closing of a flip leather cover. Therefore, in some embodiments, when the terminal 100 is a flip phone, the terminal 100 may detect the opening and closing of the flip cover by using the magnetic sensor 180D, and thereby determine whether to light up the screen.
The distance sensor 180F is configured to measure distance. The proximity light sensor 180G may include, for example, a light-emitting diode (LED) and a photodetector. The terminal 100 may use the proximity light sensor 180G to detect a scenario in which the user holds the terminal 100 close to the body, for example an earpiece call. The ambient light sensor 180L is configured to sense ambient light brightness. The terminal 100 may adaptively adjust the brightness of the display screen 194 according to the sensed ambient light brightness.
The fingerprint sensor 180H is configured to collect fingerprints. The terminal 100 may use the collected fingerprint characteristics to implement functions such as fingerprint unlocking and accessing an application lock. The temperature sensor 180J is configured to detect temperature. The bone conduction sensor 180M may acquire vibration signals.
The touch sensor 180K is also referred to as a "touch device". The touch sensor 180K may be disposed on the display screen 194, and the touch sensor 180K and the display screen 194 form a touchscreen, also referred to as a "touch screen". The touch sensor 180K is configured to detect a touch operation acting on or near it. The touch sensor may transfer the detected touch operation to the application processor to determine the type of the touch event. Visual output related to the touch operation may be provided through the display screen 194. In some other embodiments, the touch sensor 180K may also be disposed on a surface of the terminal 100 at a position different from that of the display screen 194.
In the embodiments of this application, the terminal 100 detects through the touch sensor 180K whether there is a user operation acting on the screen, for example a tap, a left swipe, or a right swipe. Based on the user operation on the screen detected by the touch sensor 180K, the terminal 100 determines the action to be performed subsequently, for example running an application or displaying the interface of an application.
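A minimal sketch of how such touch operations might be classified from raw touch data (down point, up point, duration). The thresholds and gesture names are hypothetical values chosen for the example, not values used by the terminal 100.

```python
def classify_touch(x0, y0, x1, y1, duration_ms, swipe_px=50, tap_ms=300):
    """Map a single touch (down at (x0, y0), up at (x1, y1)) to a gesture name."""
    dx, dy = x1 - x0, y1 - y0
    # Large horizontal displacement dominates: a left or right swipe.
    if abs(dx) >= swipe_px and abs(dx) >= abs(dy):
        return "swipe_right" if dx > 0 else "swipe_left"
    # Otherwise a large vertical displacement: an up or down swipe.
    if abs(dy) >= swipe_px:
        return "swipe_down" if dy > 0 else "swipe_up"
    # Little movement: distinguish a quick tap from a long press by duration.
    return "tap" if duration_ms <= tap_ms else "long_press"
```

The classified gesture would then be dispatched to the action layer, for example a right swipe returning to the previous interface.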
The buttons 190 include a power button, volume buttons, and the like. The buttons 190 may be mechanical buttons or touch buttons. The motor 191 may generate vibration prompts. The motor 191 may be used for incoming-call vibration prompts, and may also be used for touch vibration feedback. The indicator 192 may be an indicator light, and may be used to indicate the charging status and battery level changes, and may also be used to indicate messages, missed calls, notifications, and the like.
The SIM card interface 195 is configured to connect a SIM card. The terminal 100 may support one or N SIM card interfaces.
The term "user interface (UI)" in the specification, claims, and accompanying drawings of this application is a medium interface for interaction and information exchange between an application or operating system and a user; it implements conversion between the internal form of information and a form acceptable to the user. The user interface of an application is source code written in a specific computer language such as Java or extensible markup language (XML); the interface source code is parsed and rendered on the terminal device, and is finally presented as content the user can recognize, such as controls including pictures, text, and buttons. A control, also called a widget, is a basic element of a user interface; typical controls include a toolbar, a menu bar, a text box, a button, a scrollbar, pictures, and text. The attributes and content of the controls in an interface are defined by tags or nodes; for example, XML specifies the controls contained in an interface through nodes such as <Textview>, <ImgView>, and <VideoView>. One node corresponds to one control or attribute in the interface, and a node is presented as user-visible content after being parsed and rendered. In addition, the interfaces of many applications, such as hybrid applications, usually also contain web pages. A web page, also called a page, can be understood as a special control embedded in an application interface; a web page is source code written in specific computer languages, for example hypertext markup language (HTML), cascading style sheets (CSS), and JavaScript (JS), and the web page source code can be loaded and displayed as user-recognizable content by a browser or a web-page display component with functions similar to those of a browser. The specific content contained in a web page is also defined by tags or nodes in the web page source code; for example, HTML defines the elements and attributes of a web page through <p>, <img>, <video>, and <canvas>.
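To illustrate the node-to-control mapping described above, the toy sketch below parses a layout fragment with Python's standard xml.etree module and lists each node as a (control, attributes) pair. The layout fragment itself is invented for the example; only the node names <Textview> and <ImgView> echo the ones mentioned in the text.

```python
import xml.etree.ElementTree as ET

# A made-up layout fragment using node names like those mentioned above.
LAYOUT = """
<Layout>
  <Textview text="Hello"/>
  <ImgView src="pic.png"/>
  <Button label="OK"/>
</Layout>
"""

def list_controls(xml_source):
    """Parse the layout and return one (tag, attributes) pair per control node."""
    root = ET.fromstring(xml_source)
    return [(child.tag, dict(child.attrib)) for child in root]
```

Each returned pair corresponds to one control the renderer would present: the tag names the control type, and the attributes carry its content and properties.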
A commonly used presentation form of the user interface is the graphical user interface (GUI), which refers to a user interface related to computer operations that is displayed graphically. It may consist of interface elements such as icons, windows, and controls displayed on the display screen of the terminal device, where the controls may include visible interface elements such as icons, buttons, menus, tabs, text boxes, dialog boxes, status bars, navigation bars, and widgets.
As used in the specification and appended claims of this application, the singular expressions "a", "an", "the", "the foregoing", "said", and "this" are intended to also include plural expressions, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" used in this application refers to and includes any or all possible combinations of one or more of the listed items. As used in the foregoing embodiments, depending on the context, the term "when" may be interpreted to mean "if", "after", "in response to determining", or "in response to detecting". Similarly, depending on the context, the phrase "upon determining" or "if (a stated condition or event) is detected" may be interpreted to mean "if it is determined", "in response to determining", "upon detecting (the stated condition or event)", or "in response to detecting (the stated condition or event)".
The foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented by software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of this application are generated in whole or in part. The computer may be a general-purpose computer, a dedicated computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium accessible by a computer, or a data storage device such as a server or data center integrating one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), a semiconductor medium (for example, a solid-state drive), or the like.
A person of ordinary skill in the art can understand that all or part of the procedures in the methods of the foregoing embodiments may be completed by a computer program instructing related hardware. The program may be stored in a computer-readable storage medium, and when executed, may include the procedures of the foregoing method embodiments. The foregoing storage medium includes various media capable of storing program code, such as a ROM, a random access memory (RAM), a magnetic disk, or an optical disc.

Claims (18)

  1. A display method, applied to an electronic device, the electronic device comprising a screen and a camera module, wherein the method comprises:
    displaying a first interface on the screen, the first interface comprising a notification message, the notification message being displayed in a first region of the screen;
    determining, according to a first input of the camera module, that a user is gazing at the first region;
    in response to the user gazing at the first region, displaying a second interface corresponding to the notification message.
  2. The method according to claim 1, wherein
    the first region is a partial region of the screen in which the notification message pops up for display.
  3. The method according to claim 1, wherein
    the camera module comprises at least one first camera and one second camera, the first camera being configured to acquire a two-dimensional image, and the second camera being configured to acquire an image containing depth information;
    the first input is a first image, the first image comprising the two-dimensional image and the image containing depth information.
  4. The method according to claim 3, wherein
    the first camera is a camera that generates RGB images, and the second camera is a TOF camera.
  5. The method according to claim 3 or 4, wherein the determining, according to a first input of the camera module, that a user is gazing at the first region comprises:
    determining the user's eyeball gaze position according to the first image.
  6. The method according to claim 5, wherein the method further comprises:
    displaying a first icon on the screen to indicate the determined eyeball gaze position of the user.
  7. The method according to claim 6, wherein the method further comprises:
    determining, according to the user's eyeball gaze position, that the user is gazing at the first region.
  8. The method according to claim 7, wherein the method further comprises:
    displaying the first icon in the first region.
  9. The method according to claim 7, wherein the determining, according to the user's eyeball gaze position, that the user is gazing at the first region comprises:
    the user's eyeball gaze position being contained in the first region, or the user's eyeball gaze position intersecting the first region.
  10. The method according to any one of claims 6-9, wherein
    the first icon is a cursor point formed by one display unit on the screen, or the first icon is a cursor point or cursor region formed by multiple display units on the screen.
  11. The method according to any one of claims 5-10, wherein the determining the user's eyeball gaze position according to the first image specifically comprises:
    determining feature data by using the first image, the feature data comprising one or more of a left-eye image, a right-eye image, a face image, and face mesh data;
    determining, by using an eyeball-gaze recognition model, the user's eyeball gaze position indicated by the feature data, the eyeball-gaze recognition model being built on a convolutional neural network.
  12. The method according to claim 11, wherein the determining feature data by using the first image specifically comprises:
    performing face correction on the first image to obtain a first image with an upright facial image;
    determining the feature data based on the first image with the upright facial image.
  13. The method according to claim 12, wherein the obtaining a first image with an upright facial image specifically comprises:
    determining a to-be-calibrated line of the first image, the to-be-calibrated line being the straight line connecting the left-eye key point and the right-eye key point;
    determining a face deflection angle of the first image, the face deflection angle being the angle between the to-be-calibrated line and the horizontal line;
    if it is determined that the face deflection angle is equal to 0 degrees, the first image is the first image with the upright facial image;
    if it is determined that the face deflection angle is not equal to 0 degrees, determining a rotation center point based on the left-eye coordinate position and the right-eye coordinate position, and rotating the first image around the rotation center point by a first angle to obtain the first image with the upright facial image, the first angle being equal to the face deflection angle.
  14. The method according to claim 1, wherein the first interface is an interface provided by a first application installed on the electronic device.
  15. The method according to claim 1, wherein the first interface is any one of the following interfaces:
    a first home screen, a second home screen, or the minus-one screen.
  16. The method according to claim 15, wherein the method further comprises:
    displaying a to-be-unlocked interface;
    in response to an unlocking operation of the user, displaying the first interface.
  17. An electronic device, comprising one or more processors and one or more memories, wherein the one or more memories are coupled to the one or more processors, the one or more memories are configured to store computer program code, the computer program code comprises computer instructions, and when the one or more processors execute the computer instructions, the method according to any one of claims 1-16 is performed.
  18. A computer-readable storage medium comprising instructions, wherein when the instructions are run on an electronic device, the method according to any one of claims 1-16 is performed.
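The face-correction steps of claims 12 and 13 amount to measuring the roll angle of the line through the two eye key points and rotating about a point derived from the eye coordinates until that line is horizontal. The sketch below applies the same geometry to the eye coordinates only (a real implementation would rotate the whole image); the function names and the midpoint choice for the rotation center are illustrative assumptions, not the claimed implementation.

```python
import math

def face_roll_deg(left_eye, right_eye):
    """Angle (degrees) between the left-eye -> right-eye line and the horizontal."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))

def rotate_point(pt, center, angle_deg):
    """Rotate `pt` about `center` by `angle_deg` (counter-clockwise)."""
    a = math.radians(angle_deg)
    x, y = pt[0] - center[0], pt[1] - center[1]
    return (center[0] + x * math.cos(a) - y * math.sin(a),
            center[1] + x * math.sin(a) + y * math.cos(a))

def level_eyes(left_eye, right_eye):
    """Rotate both eye points about their midpoint so the face roll becomes 0."""
    angle = face_roll_deg(left_eye, right_eye)
    center = ((left_eye[0] + right_eye[0]) / 2, (left_eye[1] + right_eye[1]) / 2)
    return (rotate_point(left_eye, center, -angle),
            rotate_point(right_eye, center, -angle))
```

When the measured roll is already 0 degrees the rotation is the identity, matching the first branch of claim 13; otherwise the rotation by the deflection angle levels the eye line.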
PCT/CN2023/095379 2022-05-20 2023-05-19 A display method and electronic device WO2023222128A1 (zh)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202210549604 2022-05-20
CN202210549604.6 2022-05-20
CN202210764445.1 2022-06-30
CN202210764445.1A CN116027887B (zh) 2022-05-20 2022-06-30 A display method and electronic device

Publications (2)

Publication Number Publication Date
WO2023222128A1 WO2023222128A1 (zh) 2023-11-23
WO2023222128A9 true WO2023222128A9 (zh) 2023-12-21

Family

ID=86069557

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/095379 WO2023222128A1 (zh) 2022-05-20 2023-05-19 A display method and electronic device

Country Status (2)

Country Link
CN (2) CN116027887B (zh)
WO (1) WO2023222128A1 (zh)



Also Published As

Publication number Publication date
CN116027887B (zh) 2024-03-29
CN116027887A (zh) 2023-04-28
WO2023222128A1 (zh) 2023-11-23
CN116700477A (zh) 2023-09-05


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 23807077; Country of ref document: EP; Kind code of ref document: A1)
WWE Wipo information: entry into national phase (Ref document number: 2023807077; Country of ref document: EP)
ENP Entry into the national phase (Ref document number: 2023807077; Country of ref document: EP; Effective date: 20240325)