CN114255505A - Eyeball tracking processing method and related device

Eyeball tracking processing method and related device

Info

Publication number
CN114255505A
Authority
CN
China
Prior art keywords
human eye
eye image
image
user
eye
Prior art date
Legal status
Pending
Application number
CN202011012605.4A
Other languages
Chinese (zh)
Inventor
吴义孝
王文东
Current Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202011012605.4A
Publication of CN114255505A

Classifications

    • G PHYSICS
    • G02 OPTICS
    • G02B OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00 Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/0093 Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00 with means for monitoring data relating to the user, e.g. head-tracking, eye-tracking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013 Eye tracking input arrangements

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Optics & Photonics (AREA)
  • Human Computer Interaction (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The embodiment of the application discloses an eyeball tracking processing method and a related device, applied to a terminal and comprising the following steps: in the eye tracking process, acquiring a human eye image of a user; while a first eye tracking algorithm is called to process the human eye image to obtain a first gaze point position, judging, based on the human eye image, whether the user blinks, where the first eye tracking algorithm is the eye tracking algorithm to be used when the user does not blink; if the user blinks, calling a second eye tracking algorithm to process the human eye image to obtain a second gaze point position, and determining the second gaze point position as the final gaze point position; if not, determining the first gaze point position as the final gaze point position. The method and the device help identify the gaze point position more accurately during eye tracking.

Description

Eyeball tracking processing method and related device
Technical Field
The application relates to the technical field of mobile terminals, in particular to an eyeball tracking processing method and a related device.
Background
With the rapid development of intelligent terminals such as mobile phones, terminals with an eye tracking function have emerged. During eye tracking, a user may blink while viewing the screen. Some conventional eye tracking algorithms cannot capture and react to a blink in time, so the identified gaze position can deviate significantly from the actual gaze position, which is highly detrimental to applications of eye tracking technology.
Disclosure of Invention
The embodiment of the application provides an eyeball tracking processing method and a related device, which help identify the gaze point position more accurately during eye tracking.
In a first aspect, an embodiment of the present application provides an eyeball tracking processing method, which is applied to a terminal, and the method includes:
in the eye tracking process, acquiring a human eye image of a user;
while a first eye tracking algorithm is called to process the human eye image to obtain a first gaze point position, judging, based on the human eye image, whether the user blinks, where the first eye tracking algorithm is the eye tracking algorithm to be used when the user does not blink;
if the user blinks, calling a second eye tracking algorithm to process the human eye image to obtain a second gaze point position;
determining the second gaze point position as the final gaze point position;
and if not, determining the first gaze point position as the final gaze point position.
In a second aspect, an embodiment of the present application provides an eyeball tracking processing apparatus, applied to a terminal, the eyeball tracking processing apparatus comprising an acquisition unit, a processing unit, and a determining unit, wherein,
the acquisition unit is used for acquiring a human eye image of a user in an eyeball tracking process;
the processing unit is configured to judge, based on the human eye image, whether the user blinks while a first eye tracking algorithm is called to process the human eye image to obtain a first gaze point position, where the first eye tracking algorithm is the eye tracking algorithm to be used when the user does not blink;
the determining unit is configured to, if it is determined that the user blinks, call a second eye tracking algorithm to process the human eye image to obtain a second gaze point position, and to determine the second gaze point position as the final gaze point position;
the determining unit is further configured to, if it is determined that the user does not blink, determine the first gaze point position as the final gaze point position.
In a third aspect, an embodiment of the present application provides a terminal, including a processor, a memory, a communication interface, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the processor, and the program includes instructions for executing steps in any method of the first aspect of the embodiment of the present application.
In a fourth aspect, the present application provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program for electronic data exchange, where the computer program makes a computer perform part or all of the steps described in any one of the methods of the first aspect of the present application.
In a fifth aspect, the present application provides a computer program product, wherein the computer program product includes a non-transitory computer-readable storage medium storing a computer program, and the computer program is operable to cause a computer to perform some or all of the steps as described in any one of the methods of the first aspect of the embodiments of the present application. The computer program product may be a software installation package.
It can be seen that in the embodiment of the application, the terminal first acquires a human eye image of the user during eye tracking; then, while the first eye tracking algorithm is processing the human eye image to produce the first gaze point position, the terminal judges, based on the human eye image, whether the user blinks, the first eye tracking algorithm being the one used when the user does not blink; finally, if the user blinks, the second eye tracking algorithm is called to process the human eye image to obtain the second gaze point position, which is taken as the final gaze point position, and otherwise the first gaze point position is taken as the final gaze point position. Because the terminal performs blink recognition and gaze point recognition in parallel, it can quickly recognize whether the user blinks without increasing the overall processing time of eye tracking, and can correct the gaze point output while a blink is in progress. This helps keep the gaze point continuous during eye tracking and improves the accuracy of eye tracking.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application; for those skilled in the art, other drawings can be obtained from them without creative effort.
Fig. 1 is a schematic structural diagram of a terminal provided in an embodiment of the present application;
fig. 2 is a schematic diagram of a software and hardware system architecture of a terminal according to an embodiment of the present disclosure;
fig. 3 is a schematic structural diagram of a terminal provided in an embodiment of the present application;
fig. 4A is a schematic flowchart of an eyeball tracking processing method according to an embodiment of the present application;
FIG. 4B is a reference diagram of a human eye image histogram according to an embodiment of the present disclosure;
FIG. 4C is a schematic diagram of a gray scale image of a human eye provided by an embodiment of the present application;
fig. 5 is a schematic flowchart illustrating another eye tracking processing method according to an embodiment of the present disclosure;
fig. 6 is a block diagram of distributed functional units of an eyeball tracking processing device according to an embodiment of the present application;
fig. 7 is a block diagram of an integrated functional unit of an eyeball tracking processing device according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments. Obviously, the described embodiments are only a part of the embodiments of the present application, not all of them. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present application.
The terms "first," "second," and the like in the description and claims of the present application and in the above-described drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
In a first section, the software and hardware operating environment of the eye tracking processing techniques disclosed herein is described as follows.
Referring to fig. 1, a block diagram of a terminal 100 according to an exemplary embodiment of the present application is shown. The terminal 100 may be a communication-capable terminal, which may include various handheld devices having wireless communication functions, vehicle-mounted devices, wearable devices, computing devices or other processing devices connected to a wireless modem, as well as various forms of User Equipment (UE), Mobile Station (MS), terminal Equipment (terminal device), and so on. The terminal 100 in the present application may include one or more of the following components: a processor 110, a memory 120, and an input-output device 130.
Processor 110 may include one or more processing cores. The processor 110 connects various parts of the terminal 100 using various interfaces and lines, and performs the functions of the terminal 100 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 120 and by calling data stored in the memory 120. The processor 110 may include one or more processing units. For example, the processor 110 may include a Central Processing Unit (CPU), an Application Processor (AP), a modem processor, a Graphics Processing Unit (GPU), an Image Signal Processor (ISP), a controller, a video codec, a Digital Signal Processor (DSP), a baseband processor, and/or a Neural-network Processing Unit (NPU), etc. The controller may be the neural center and command center of the terminal 100; it can generate operation control signals according to instruction operation codes and timing signals to control instruction fetching and execution. The CPU mainly handles the operating system, user interface, application programs, and the like; the GPU is used for rendering and drawing display content; the modem handles wireless communications; and the digital signal processor processes digital signals, such as digital image signals.
The memory 120 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). Optionally, the memory 120 includes a non-transitory computer-readable medium. The memory 120 may be used to store instructions, programs, code sets, or instruction sets. The memory 120 may include a program storage area and a data storage area, wherein the program storage area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the various method embodiments described below, and the like. The operating system may be an Android system (including systems developed in depth on the basis of Android), an iOS system developed by Apple Inc. (including systems developed in depth on the basis of iOS), or another system. The data storage area may also store data created by the terminal 100 in use, such as a phonebook, audio and video data, and chat log data.
The software system of the terminal 100 may adopt a hierarchical architecture, an event-driven architecture, a micro-core architecture, a micro-service architecture, or a cloud architecture. The embodiment of the present application exemplifies a software architecture of the terminal 100 by taking an Android system with a layered architecture as an example.
As shown in fig. 2, the memory 120 may store a Linux kernel layer 220, a system runtime library layer 240, an application framework layer 260, and an application layer 280, where the layers communicate with each other through a software interface, and the Linux kernel layer 220, the system runtime library layer 240, and the application framework layer 260 belong to an operating system space.
The application layer 280 belongs to user space, and at least one application program runs in it. These may be native applications carried by the operating system or third-party applications developed by third-party developers, and may specifically include applications such as an eye tracker 281, a password 282, a calendar 283, a call 284, a map 285, and a navigation 286. The eye tracking application can realize the following operations: in the eye tracking process, acquiring a human eye image of the user; while a first eye tracking algorithm is called to process the human eye image to obtain a first gaze point position, judging, based on the human eye image, whether the user blinks, where the first eye tracking algorithm is the eye tracking algorithm to be used when the user does not blink; if the user blinks, calling a second eye tracking algorithm to process the human eye image to obtain a second gaze point position, and determining the second gaze point position as the final gaze point position; if not, determining the first gaze point position as the final gaze point position.
The application framework layer 260 provides various APIs that may be used by applications that build the application layer, and developers may also build their own applications by using these APIs, such as a window manager 261, a content provider 262, a view system 263, a phone manager 264, a resource manager 265, a notification manager 266, a message manager 267, and so on.
The system runtime library layer 240 provides the main feature support for the Android system through some C/C++ libraries. For example, the SQLite library 241 provides support for a database, the OpenGL/ES library 242 provides support for 3D drawing, the Webkit library 243 provides support for a browser kernel, and the like. Also provided in the system runtime library layer 240 is an Android Runtime library (Android Runtime) 244, which mainly provides some core libraries that allow developers to write Android applications using the Java language.
The Linux kernel layer 220 provides the underlying drivers for the various hardware of the terminal 100, such as a display driver 221, a Bluetooth driver 222, an audio driver 223, a Wi-Fi driver 224, a camera driver 225, power management 226, and the like.
It should be understood that the eyeball tracking processing method described in the embodiments of the present application may be applied to the Android system, and may also be applied to other operating systems, such as the iOS system.
A currently-used terminal configuration will be described in detail with reference to fig. 3, and it should be understood that the configuration illustrated in the embodiment of the present application is not intended to specifically limit the terminal 100. In other embodiments of the present application, terminal 100 may include more or fewer components than shown, or some components may be combined, some components may be split, or a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
As shown in fig. 3, the terminal 100 includes a system on chip 410, an external memory interface 420, an internal memory 421, a Universal Serial Bus (USB) interface 430, a charging management module 440, a power management module 441, a battery 442, an antenna 1, an antenna 2, a mobile communication module 450, a wireless communication module 460, an audio module 470, a speaker 470A, a receiver 470B, a microphone 470C, an earphone interface 470D, a sensor module 480, a button 490, a motor 491, an indicator 492, a camera 493, a display 494, an infrared transmitter 495, a Subscriber Identity Module (SIM) card interface 496, and the like. The sensor module 480 may include a pressure sensor 480A, a gyroscope sensor 480B, an air pressure sensor 480C, a magnetic sensor 480D, an acceleration sensor 480E, a distance sensor 480F, a proximity light sensor 480G, a fingerprint sensor 480H, a temperature sensor 480J, a touch sensor 480K, an ambient light sensor 480L, a bone conduction sensor 480M, and the like.
The wireless communication function of the terminal 400 may be implemented by the antenna 1, the antenna 2, the mobile communication module 450, the wireless communication module 460, the modem processor, the baseband processor, and the like.
The charging management module 440 is configured to receive charging input from a charger.
The power management module 441 is used to connect the battery 442, the charging management module 440 and the processor 440.
The terminal 100 implements a display function through the GPU, the display screen 494, and the application processor, etc. The GPU is an image processing microprocessor connected to a display screen 494 and an application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. Processor 440 may include one or more GPUs that execute program instructions to generate or alter display information. When the eye image of the user is obtained, the GPU can process the image data to obtain a blink recognition result, so that whether the user blinks in the eyeball tracking process is quickly recognized.
The display screen 494 is used to display images, videos, and the like. The display screen 494 includes a display panel. The display panel may adopt a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-OLED, a quantum dot light-emitting diode (QLED), and the like. In some embodiments, the terminal 400 may include 1 or N display screens 494, N being a positive integer greater than 1. In this embodiment of the application, the display screen 494 may be configured to display a red dot or a numbered red dot on an APP icon to prompt the user that a new message is waiting to be processed.
The terminal 100 may implement a photographing function through the ISP, the camera 493, the video codec, the GPU, the display screen 494, the application processor, and the like.
The ISP is used to process the data fed back by the camera 493.
The camera 493 is used to capture still images or video. The object generates an optical image through the lens and projects the optical image to the photosensitive element. The photosensitive element may be a Charge Coupled Device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The light sensing element converts the optical signal into an electrical signal, which is then passed to the ISP where it is converted into a digital image signal. And the ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into image signal in standard RGB, YUV and other formats. In some embodiments, the terminal 400 may include 1 or N cameras 493, where N is a positive integer greater than 1. In the eyeball tracking process, the camera 493 may be used to obtain a human eye image of the user, and then the GPU performs blink recognition and fixation point recognition on the human eye image.
The external memory interface 420 may be used to connect an external memory card, such as a Micro SD card, to extend the memory capability of the terminal 400.
The internal memory 421 may be used to store computer-executable program code, including instructions.
The terminal 400 may implement audio functions, such as music playing and recording, through the audio module 470, the speaker 470A, the receiver 470B, the microphone 470C, the earphone interface 470D, the application processor, and the like.
The pressure sensor 480A in the sensor module 480 is used to sense a pressure signal, which can be converted into an electrical signal. The gyro sensor 480B may be used to determine the motion attitude of the terminal 400. The air pressure sensor 480C is used to measure air pressure. The magnetic sensor 480D includes a hall sensor. The terminal 400 can detect the opening and closing of the flip holster using the magnetic sensor 480D. The acceleration sensor 480E may detect the magnitude of acceleration of the terminal 400 in various directions (typically three axes). A distance sensor 480F for measuring distance. The proximity light sensor 480G may include, for example, a Light Emitting Diode (LED) and a light detector, such as a photodiode. The ambient light sensor 480L is used to sense the ambient light level. The fingerprint sensor 480H is used to collect a fingerprint. The temperature sensor 480J is used to detect temperature. The touch sensor 480K is also referred to as a "touch panel". The touch sensor 480K is used to detect a touch operation applied thereto or thereabout. The bone conduction sensor 480M may acquire a vibration signal.
The keys 490 include a power-on key, a volume key, etc. The terminal 400 may receive a key input, and generate a key signal input related to user setting and function control of the terminal 400.
The motor 491 may generate a vibration indication.
The indicator 492 may be an indicator light, and may be used to indicate a charging status, a change in charge level, or a message, a missed call, a notification, etc.
The SIM card interface 496 is used to connect a SIM card.
The infrared transmitter 495 may be an infrared lamp that emits infrared light onto the user's face so as to form a light spot (glint) on the human eye. During eye tracking, the captured human eye image then contains this glint, which helps calculate the human eye gaze point from the position of the glint in the human eye image.
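To make the glint-based calculation concrete, the following is a minimal illustrative sketch rather than the patent's actual algorithm: it maps the pupil-glint offset to screen coordinates through a calibration matrix, in the style of pupil-center/corneal-reflection gaze estimation. The function name and the affine calibration are assumptions introduced for illustration.

```python
# Illustrative sketch only, not the patent's actual algorithm: a simplified
# pupil-center/corneal-reflection style mapping from the pupil-glint offset
# to screen coordinates. gaze_from_glint and the affine calibration matrix
# are assumptions.
import numpy as np

def gaze_from_glint(pupil_center, glint_center, calib):
    """pupil_center, glint_center: (x, y) pixel positions in the eye image.
    calib: 2x3 affine matrix obtained from a prior calibration step."""
    dx = pupil_center[0] - glint_center[0]
    dy = pupil_center[1] - glint_center[1]
    v = np.array([dx, dy, 1.0])   # homogeneous pupil-glint vector
    return calib @ v              # (screen_x, screen_y) gaze point estimate
```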
In a second section, the background of the eyeball tracking processing technology disclosed in the present application is described as follows.
At present, as mobile phone photographing becomes more and more powerful, it is applied in more and more scenes. To satisfy users' increasingly individual requirements, phones must process real-time photographs faster and faster, so shooting speed has to be improved by making algorithms more efficient. Many scenes require the phone to compute depth information from an image, which is a heavy workload, while the image processing speed of a mobile CPU is limited and its computing power is often insufficient. The eye tracking process comprises blink recognition and gaze point recognition; the algorithms involved in blink recognition can be accelerated on the DSP or GPU of the electronic device to ensure that blink recognition is processed faster than gaze point recognition.
During eye tracking the user may blink, and when a blink occurs a conventional eye tracking algorithm cannot capture it in time, which causes a large deviation in the gaze point position computed over that short interval; this deviation is very unfavorable for applications of eye tracking technology. Because a human blink lasts only an instant, quickly recognizing blinks is a problem that must be solved for eye tracking technology to develop further. Several blink recognition schemes are currently known, but they still have many deficiencies, for the following reasons. First, their processing speed is too slow: eye tracking is a highly real-time computation that processes many frames per second, so a user's blink must be recognized quickly. Second, conventional blink recognition aims to determine whether the user blinked at all, whereas eye tracking cares about the blinking process itself; once the eyelids are detected to have closed to a certain degree, that state can already be regarded as a blink. Third, conventional blink recognition emphasizes accuracy, i.e., reducing false judgments, while blink recognition in eye tracking can tolerate a certain degree of false judgment but has strict power consumption requirements; what matters more is how to recognize blinks quickly and simply.
In summary, the present application provides a fast blink recognition scheme suitable for eye tracking, in which the eye tracking process includes both blink recognition and gaze point recognition and the blink recognition result can influence gaze point recognition. In one possible case, blink recognition and gaze point recognition are performed sequentially: whether the user blinks is recognized first; if no blink is detected, the default eye tracking algorithm computes the final gaze point position, and if a blink is detected, another eye tracking algorithm must be used instead. In another possible case, blink recognition and gaze point recognition are performed simultaneously; to ensure that the overall duration of the eye tracking process does not increase, blink recognition must then be faster than gaze point recognition. In either case, optimizing the speed of the blink recognition step is extremely important in the eye tracking process and helps reduce its latency.
In the third section, the eyeball tracking processing method provided by the embodiment of the present application is introduced as follows.
Referring to fig. 4A, fig. 4A is a schematic flowchart of an eye tracking processing method according to an embodiment of the present disclosure, where the eye tracking processing method includes the following operations.
S401, in the eye tracking process, human eye images of the user are obtained.
During eye tracking, the terminal acquires human eye images of the user through an eye tracking assembly, which includes an infrared emitter, a camera, computing hardware, and the like. In a specific implementation, the eye tracking assembly first acquires face images of the user over a preset period or for multiple consecutive frames, and then crops the face images to obtain the human eye images of the user.
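As a rough sketch of this acquisition-and-cropping step (the patent does not specify the detector), one could locate and crop the eye region with OpenCV's stock Haar eye cascade; the capture source and taking the first detection are assumptions:

```python
# A rough sketch under stated assumptions: grab a frame from a camera and
# crop the eye region with OpenCV's stock Haar eye detector. The real
# assembly uses an infrared camera; the capture source, detector choice,
# and taking the first detection are all assumptions.
import cv2

eye_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

def capture_eye_image(cap):
    ok, frame = cap.read()
    if not ok:
        return None
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    eyes = eye_detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(eyes) == 0:
        return None                      # no eye found in this frame
    x, y, w, h = eyes[0]                 # take the first detected eye region
    return frame[y:y + h, x:x + w]       # cropped human eye image

cap = cv2.VideoCapture(0)                # front camera as a stand-in
eye_img = capture_eye_image(cap)
```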
S402, while a first eye tracking algorithm is called to process the human eye image to obtain a first gaze point position, whether the user blinks is judged based on the human eye image; the first eye tracking algorithm is the eye tracking algorithm to be used when the user does not blink.
While the first eye tracking algorithm is called to process the human eye image to obtain the first gaze point position, the terminal performs blink recognition on the human eye image to obtain a blink recognition result. A first and a second eye tracking algorithm are preset in the terminal: when no blink is recognized during eye tracking, the first algorithm computes the first gaze point position, which is taken as the final gaze point position; when a blink is recognized, the second algorithm computes the second gaze point position, which is taken as the final gaze point position. Blink recognition and gaze point recognition are thus processed in parallel, but blink recognition is required to be faster than gaze point recognition.
In this possible example, the method further comprises: judging, with a graphics processing unit (GPU) or a digital signal processor (DSP), whether the user blinks based on the human eye image.
While judging whether the user blinks based on the human eye image, the DSP or GPU of the terminal accelerates the blink recognition processing to ensure that it runs faster than gaze point recognition. When the human eye image of the user is obtained, the GPU or DSP processes its image data and performs blink recognition; if the user is judged to have blinked, the second eye tracking algorithm must be called to compute the final gaze point position, and if not, the first gaze point position obtained by the first eye tracking algorithm is the final gaze point position.
Therefore, in this example, when the terminal processes blink recognition and gaze point recognition at the same time, blink recognition can be accelerated on the terminal's GPU or DSP to ensure it finishes before gaze point recognition, thereby improving the overall processing speed of eye tracking.
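A minimal sketch of this parallel dispatch is given below, assuming hypothetical detect_blink, first_algorithm, and second_algorithm stand-ins; in the scheme described above, the blink branch would be offloaded to the GPU/DSP so that its result is ready before gaze point recognition completes:

```python
# A minimal sketch of the parallel dispatch, assuming hypothetical stand-ins
# for the patent's algorithms. On a real terminal detect_blink would be
# offloaded to the GPU/DSP so its result is ready before gaze recognition.
from concurrent.futures import ThreadPoolExecutor

def detect_blink(eye_image): ...        # hypothetical fast blink recognizer
def first_algorithm(eye_image): ...     # hypothetical default gaze algorithm
def second_algorithm(eye_image): ...    # hypothetical blink-aware algorithm

def final_gaze_point(eye_image):
    with ThreadPoolExecutor(max_workers=2) as pool:
        blink_f = pool.submit(detect_blink, eye_image)    # blink recognition
        gaze_f = pool.submit(first_algorithm, eye_image)  # gaze recognition
        if blink_f.result():                              # user blinked
            return second_algorithm(eye_image)            # second gaze point
        return gaze_f.result()                            # first gaze point
```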
In this possible example, the determining whether the user blinks based on the human eye image includes: converting the human eye image into a gray scale image, and calculating a human eye closure index of the human eye image according to the gray scale image; and judging whether the user blinks or not according to the eye closure index of the eye image.
The human eye image contains the user's eye region and the area around it, chiefly the eyeball region, the eye-white (sclera) region, and the periocular region. Converting the image into a gray scale map makes these three regions easier to distinguish, so the eye closure index of each human eye image can be computed more accurately. The eye closure index characterizes how far the eye is closed: the higher the index, the more closed the eye, i.e., the more likely the user is blinking. Computing the eye closure index of the human eye image therefore helps judge whether the user blinks.
Therefore, in the example, the eye images of the user are converted into the gray level images, and the eye closure index of each eye image is calculated according to the gray level images, so that whether the user blinks can be judged according to the eye closure index, and whether the user blinks can be accurately identified.
In this possible example, converting the human eye image into a gray scale map and calculating the eye closure index of the human eye image according to the gray scale map includes: drawing a histogram of the human eye image according to human eye characteristics, where the characteristics comprise a first color corresponding to the eyeball region, a second color corresponding to the eye-white region, and a third color corresponding to the periocular region; ternarizing (three-level thresholding) the histogram of the human eye image to obtain a gray scale map of the image; traversing the gray scale map to obtain the maximum eye length value and maximum eye width value corresponding to it; and calculating the ratio of the maximum eye length value to the maximum eye width value to obtain the eye closure index of the human eye image.
The human eye image is partitioned by three colors: the eyeball region corresponds to a first color, the eye-white region to a second color, and the periocular region to a third color, and a histogram of the image is drawn. As shown in fig. 4B, its three peaks correspond to these three colors. Ternarizing the image using the histogram yields the gray scale map shown in fig. 4C, which distinguishes the three regions: the eyeball region is gray, the eye-white region is white, and the periocular region is black.
The gray scale map of the human eye image is traversed to obtain the maximum eye length value L and maximum eye width value W of each image. The ratio of L to W measures the degree of eyelid opening: it is smallest when the eye is fully open and tends to infinity when the eyelids are fully closed. The eye closure index is therefore the ratio of the maximum eye length value L to the maximum eye width value W.
It can be seen that, in this example, the eye closure index is a ratio of the maximum length value L of the eye to the maximum width value W of the eye, and since the ratio is the smallest when the eye is fully opened and tends to be infinite when the eyelid is fully closed, it is advantageous to accurately calculate the eye closure index of the eye image according to the gray scale map.
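The following is a minimal sketch of the closure-index computation described above, assuming fixed placeholder thresholds where the real scheme would derive them from the valleys between the histogram's three peaks:

```python
# A minimal sketch of the closure-index computation, assuming fixed
# placeholder thresholds (t_low, t_high); the scheme above would derive
# them from the valleys between the histogram's three peaks.
import numpy as np

def eye_closure_index(gray, t_low=70, t_high=170):
    tern = np.zeros_like(gray)                   # 0 = periocular (black)
    tern[(gray >= t_low) & (gray < t_high)] = 1  # 1 = eyeball region (gray)
    tern[gray >= t_high] = 2                     # 2 = eye-white region (white)
    eye_mask = tern > 0                          # pixels of the visible eye
    cols = np.where(eye_mask.any(axis=0))[0]     # columns containing the eye
    rows = np.where(eye_mask.any(axis=1))[0]     # rows containing the eye
    if len(cols) == 0 or len(rows) == 0:
        return float("inf")                      # no visible eye: fully closed
    L = cols[-1] - cols[0] + 1                   # maximum eye length
    W = rows[-1] - rows[0] + 1                   # maximum eye width (opening)
    return L / W                                 # grows as the eyelid closes
```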
In this possible example, the determining whether the user blinks according to the eye closure index of the eye image includes: judging whether the human eye closing index of the human eye image is larger than a preset threshold value or not; if so, determining that the user blinks; if not, determining that the user does not blink.
The ratio of the maximum eye length value to the maximum eye width value during blinks needs to be measured and computed in advance so that the preset threshold can be set; only when the eye closure index exceeds this preset threshold is the eye regarded as blinking. In addition, the human eye images may be time-stamped so that the eye tracking result and the blink recognition result under the same timestamp can later be looked up and compared.
If the user blinks during eye tracking, the eye tracking assembly captures a plurality of blink images; the user is considered to be blinking only when the eye closure indexes of these human eye images are detected to exceed the preset threshold, which avoids falsely judging that the user blinked.
In this example, the user is determined to blink during eye tracking only when the eye closure index of the human eye image exceeds the preset threshold, and the preset threshold is determined from the user's previous blinking behavior, which avoids falsely judging that the user blinked.
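A sketch of this decision step, with placeholder values for the preset threshold and for the number of consecutive frames used to confirm a blink (both of which the scheme above derives from prior measurement), might look like:

```python
# A sketch of the blink decision with placeholder values: the threshold would
# be preset from the user's measured L/W ratios during blinks, and several
# consecutive frames are required before a blink is reported, as described
# above, to avoid false positives.
BLINK_THRESHOLD = 6.0   # assumed preset threshold on the closure index
CONFIRM_FRAMES = 3      # assumed number of consecutive frames to confirm

def user_blinked(closure_indices):
    """closure_indices: closure indexes of recent frames, oldest first."""
    recent = closure_indices[-CONFIRM_FRAMES:]
    return (len(recent) == CONFIRM_FRAMES
            and all(c > BLINK_THRESHOLD for c in recent))
```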
S403, calling the first eye tracking algorithm to process the human eye image to obtain the first gaze point position.
S404, if the user blinks, calling a second eye tracking algorithm to process the human eye image to obtain a second gaze point position.
S405, determining the second gaze point position as the final gaze point position.
Whether the user blinks determines the final gaze point position: if the user blinks, the second eye tracking algorithm must be called to compute the second gaze point position, which is the final gaze point position; if the user does not blink, the first gaze point position computed by the first eye tracking algorithm is the final gaze point position.
S406, if the user does not blink, determining the first gaze point position as the final gaze point position.
During eye tracking, blinking strongly affects gaze point identification, so whether the user blinks must be recognized first. If the user does not blink, the first gaze point position computed by the first eye tracking algorithm is the final gaze point position; if the user blinks, the second eye tracking algorithm is called to compute the second gaze point position, which is the accurate gaze point position.
It can be seen that in the embodiment of the application, the terminal first acquires a human eye image of the user during eye tracking; then, while the first eye tracking algorithm is processing the human eye image to produce the first gaze point position, the terminal judges, based on the human eye image, whether the user blinks, the first eye tracking algorithm being the one used when the user does not blink; finally, if the user blinks, the second eye tracking algorithm is called to process the human eye image to obtain the second gaze point position, which is taken as the final gaze point position, and otherwise the first gaze point position is taken as the final gaze point position. Because the terminal performs blink recognition and gaze point recognition in parallel, it can quickly recognize whether the user blinks without increasing the overall processing time of eye tracking, and can correct the gaze point output while a blink is in progress. This helps keep the gaze point continuous during eye tracking and improves the accuracy of eye tracking.
Referring to fig. 5, fig. 5 is a flowchart illustrating another eyeball tracking processing method according to an embodiment of the present disclosure.
S501, in the eye tracking process, human eye images of the user are obtained.
S502, while a first eye tracking algorithm is called to process the human eye image to obtain a first gaze point position, whether the user blinks is judged based on the human eye image; the first eye tracking algorithm is the eye tracking algorithm to be used when the user does not blink.
S503, if the user blinks, acquiring the gaze point positions at a plurality of moments determined by eye tracking before the capture moment of the human eye image.
S504, determining the variation range of the gaze point trajectory before the capture moment of the human eye image according to the gaze point positions at those moments.
The gaze point positions at a plurality of moments determined by the eye tracking algorithm before the capture moment of the human eye image, i.e., the gaze points tracked before the user blinked, are acquired so that the variation range of the gaze point trajectory before the blink can be determined and the second gaze point position derived from it. For example, the gaze point positions at n moments before the human eye image are acquired, and the user's gaze point trajectory is determined from them.
The gaze point positions of the user at a plurality of moments before the blink may be obtained by either the first or the second eye tracking algorithm.
S505, when the variation range of the gaze point trajectory is smaller than a preset range, determining the second gaze point position as the gaze point position at any one of the plurality of moments.
Gaze point positions at a plurality of moments before the capture moment of the human eye image are acquired to determine the variation range of the user's gaze point trajectory before the blink. When that variation range is smaller than the preset range, the gaze point during the blink can be regarded as unchanged from before the blink, so the second gaze point position is the gaze point position at any one of those moments.
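A minimal sketch of steps S503-S505, assuming a placeholder preset range, is shown below; it measures the spread of the pre-blink gaze points and reuses the most recent one when the trajectory barely moved:

```python
# A minimal sketch of steps S503-S505, assuming a placeholder preset range:
# measure the spread of the pre-blink gaze points and, if the trajectory
# barely moved, reuse the most recent gaze point as the second gaze point.
import numpy as np

PRESET_RANGE = 40.0                        # assumed tolerance, in pixels

def second_gaze_point(recent_points):
    """recent_points: [(x, y), ...] gaze points from before the blink."""
    pts = np.asarray(recent_points, dtype=float)
    spread = (pts.max(axis=0) - pts.min(axis=0)).max()  # trajectory range
    if spread < PRESET_RANGE:
        return tuple(pts[-1])              # gaze barely moved: keep last point
    return None                            # fall through to polynomial fitting
```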
S506, when the variation range of the gaze point trajectory is larger than the preset range, calculating the second gaze point position by the preset polynomial fitting.
When the variation range of the gaze point trajectory is detected to be larger than the preset range, the gaze point fluctuated considerably before the blink, and the second gaze point position must instead be computed by the preset polynomial fitting.
When the second gaze point position is calculated by polynomial fitting, the current blink moment t is recorded, and the gaze point positions $(x_i, y_i)$, $1 \le i \le n$, computed by eye tracking at the n moments before t are selected and fitted with an N-th order polynomial:

$$f(i) = a_0 + a_1 i + a_2 i^2 + \cdots + a_N i^N$$

In order for the fitted curve to reflect the trend of the given data as closely as possible, each residual $|\delta_i| = |f(i) - x_i|$ should be small, which is achieved by minimizing the sum of squared residuals:

$$\min \sum_{i=1}^{n} \delta_i^2 = \min \sum_{i=1}^{n} \bigl(f(i) - x_i\bigr)^2$$

where $(x_i, y_i)$ denotes the gaze point coordinates at each of the n moments (the y coordinates are fitted analogously). Using this formula to obtain the eye tracking gaze point position during a blink effectively avoids misidentifying the gaze point position while the user blinks, so the user's gaze point can be obtained more accurately during eye tracking.
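A minimal sketch of this fitting step, using numpy's least-squares polyfit (which minimizes the sum of squared residuals, matching the formula above) with an assumed polynomial order, is:

```python
# A minimal sketch of the fitting step: fit the x and y coordinates of the n
# pre-blink gaze points against their time index with numpy's least-squares
# polyfit (which minimizes the sum of squared residuals, as in the formula
# above) and evaluate the fit at the blink instant. The polynomial order is
# an assumption.
import numpy as np

def fit_gaze_during_blink(recent_points, order=2):
    pts = np.asarray(recent_points, dtype=float)
    i = np.arange(1, len(pts) + 1)                 # time indices 1..n
    fx = np.polyfit(i, pts[:, 0], order)           # least-squares fit of x(i)
    fy = np.polyfit(i, pts[:, 1], order)           # analogous fit of y(i)
    t = len(pts) + 1                               # the blink instant
    return (np.polyval(fx, t), np.polyval(fy, t))  # predicted gaze point
```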
S507, determining the second gaze point position as the final gaze point position.
It can be seen that in the embodiment of the application, the terminal first acquires a human eye image of the user during eye tracking; then, while the first eye tracking algorithm is processing the human eye image to produce the first gaze point position, the terminal judges, based on the human eye image, whether the user blinks, the first eye tracking algorithm being the one used when the user does not blink; finally, if the user blinks, the second eye tracking algorithm is called to process the human eye image to obtain the second gaze point position, which is taken as the final gaze point position, and otherwise the first gaze point position is taken as the final gaze point position. Because the terminal performs blink recognition and gaze point recognition in parallel, it can quickly recognize whether the user blinks without increasing the overall processing time of eye tracking, and can correct the gaze point output while a blink is in progress. This helps keep the gaze point continuous during eye tracking and improves the accuracy of eye tracking.
The present embodiment provides an eyeball tracking processing apparatus, which may be a terminal 100. Specifically, the eyeball tracking processing device is used for executing the steps of the eyeball tracking processing method. The eyeball tracking processing device provided by the embodiment of the application can comprise modules corresponding to the corresponding steps.
In the embodiment of the present application, the eyeball tracking processing device may be divided into the functional modules according to the method example, for example, each functional module may be divided according to each function, or two or more functions may be integrated into one processing module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The division of the modules in the embodiment of the present application is schematic, and is only a logic function division, and there may be another division manner in actual implementation.
Fig. 6 is a schematic diagram showing a possible configuration of the eye tracking processing device according to the above embodiment, in a case where each functional module is divided for each function. As shown in fig. 6, the eyeball tracking processing apparatus 6 includes an acquisition unit 61, a processing unit 62, and a determination unit 63.
The acquiring unit 61 is configured to acquire an image of a human eye of a user in an eyeball tracking process;
the processing unit 62 is configured to judge, based on the human eye image, whether the user blinks while a first eye tracking algorithm is called to process the human eye image to obtain a first gaze point position, where the first eye tracking algorithm is the eye tracking algorithm to be used when the user does not blink;
the determining unit 63 is configured to, if it is determined that the user blinks, invoke a second eye tracking algorithm to process the human eye image to obtain a second gaze point position, and to determine the second gaze point position as the final gaze point position;
the determining unit 63 is further configured to, if it is determined that the user does not blink, determine the first gaze point position as the final gaze point position.
All relevant contents of each step related to the above method embodiment may be referred to the functional description of the corresponding functional module, and are not described herein again. Of course, the eyeball tracking processing apparatus provided in the embodiment of the present application includes, but is not limited to, the above modules, for example: the eye tracking processing apparatus may further include a storage unit 64. The storage unit 64 may be used to store program codes and data of the eye tracking processing apparatus.
In the case of using an integrated unit, a schematic structural diagram of the eyeball tracking processing device provided by the embodiment of the present application is shown in fig. 7. In fig. 7, the eyeball tracking processing device 7 includes: a processing module 72 and a communication module 71. The processing module 72 is used for controlling and managing the actions of the eyeball tracking processing device, for example, executing the steps performed by the acquisition unit 61, the processing unit 62, and the determination unit 63, and/or other processes for executing the techniques described herein. The communication module 71 is used to support interaction between the eyeball tracking processing apparatus and other devices. As shown in fig. 7, the eyeball tracking processing apparatus may further include a storage module 73, which is used for storing program codes and data of the apparatus, for example, the contents stored in the storage unit 64.
The Processing module 72 may be a Processor or a controller, such as a Central Processing Unit (CPU), a general purpose Processor, a Digital Signal Processor (DSP), an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. Which may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the disclosure. The processor may also be a combination of computing functions, e.g., comprising one or more microprocessors, DSPs, and microprocessors, among others. The communication module 71 may be a transceiver, an RF circuit or a communication interface, etc. The storage module 73 may be a memory.
All relevant contents of each scene related to the method embodiment may be referred to the functional description of the corresponding functional module, and are not described herein again. The eye tracking processing device 6 and the eye tracking processing device 7 can both execute the eye tracking processing method shown in fig. 4A.
Embodiments of the present application also provide a computer storage medium, where the computer storage medium stores a computer program for electronic data exchange, the computer program enables a computer to execute part or all of the steps of any one of the methods described in the above method embodiments, and the computer includes a terminal.
Embodiments of the present application also provide a computer program product comprising a non-transitory computer readable storage medium storing a computer program operable to cause a computer to perform some or all of the steps of any of the methods as described in the above method embodiments. The computer program product may be a software installation package, the computer comprising a terminal.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the above-described embodiments of the apparatus are merely illustrative, and for example, the above-described division of the units is only one type of division of logical functions, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of some interfaces, devices or units, and may be an electric or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit may be stored in a computer-readable memory if it is implemented in the form of a software functional unit and sold or used as a stand-alone product. Based on such understanding, the technical solution of the present application in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a memory, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned memory includes: a USB flash disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and other media capable of storing program code.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by relevant hardware instructed by a program, and the program may be stored in a computer-readable memory, which may include a flash memory disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or the like.
The embodiments of the present application have been described in detail above, and specific examples have been used herein to illustrate the principles and implementations of the present application; the above description of the embodiments is provided only to help understand the method and core concept of the present application. Meanwhile, a person skilled in the art may, based on the idea of the present application, make changes to the specific implementations and the application scope. In summary, the content of this specification should not be construed as limiting the present application.

Claims (10)

1. An eyeball tracking processing method, comprising:
acquiring a human eye image of a user in an eyeball tracking process;
in the process of obtaining a first gaze point position by invoking a first eyeball tracking algorithm to process the human eye image, determining whether the user blinks based on the human eye image, wherein the first eyeball tracking algorithm is an eyeball tracking algorithm that needs to be enabled when the user does not blink;
if yes, invoking a second eyeball tracking algorithm to process the human eye image to obtain a second gaze point position;
determining the second gaze point position as a final gaze point position; and
if not, determining the first gaze point position as the final gaze point position.
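
By way of illustration only (this sketch is not part of the claims), the branch logic of claim 1 can be summarized as follows; the names first_algorithm, second_algorithm, and user_blinked are hypothetical placeholders, not the actual implementation of this application:

    # Minimal sketch of the claim 1 decision flow; all names are hypothetical.
    def first_algorithm(eye_image):
        # Stand-in for the first (blink-sensitive) eyeball tracking algorithm.
        return (0.5, 0.5)  # normalized (x, y) gaze point

    def second_algorithm(eye_image):
        # Stand-in for the second (blink-tolerant) eyeball tracking algorithm.
        return (0.4, 0.6)

    def user_blinked(eye_image):
        # Stand-in for the blink decision of claims 2-4 (sketched further below).
        return False

    def final_gaze_point(eye_image):
        first = first_algorithm(eye_image)      # runs while the blink check proceeds
        if user_blinked(eye_image):
            return second_algorithm(eye_image)  # blink detected: use the fallback result
        return first                            # no blink: keep the first result
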
2. The method of claim 1, wherein the determining whether the user blinks based on the human eye image comprises:
converting the human eye image into a grayscale map, and calculating a human eye closure index of the human eye image according to the grayscale map; and
determining whether the user blinks according to the human eye closure index of the human eye image.
3. The method of claim 2, wherein the converting the human eye image into a grayscale map and calculating a human eye closure index of the human eye image according to the grayscale map comprises:
drawing a histogram of the human eye image according to human eye features, wherein the human eye features comprise a first color corresponding to an eyeball region, a second color corresponding to a white-of-eye region, and a third color corresponding to a periocular region;
performing a ternarization operation on the histogram of the human eye image to obtain a grayscale map of the human eye image;
traversing the grayscale map of the human eye image to obtain a maximum human eye length value and a maximum human eye width value corresponding to the grayscale map; and
calculating a ratio of the maximum human eye length value to the maximum human eye width value to obtain the human eye closure index of the human eye image.
4. The method of claim 2, wherein the determining whether the user blinks according to the human eye closure index of the human eye image comprises:
determining whether the human eye closure index of the human eye image is greater than a preset threshold;
if so, determining that the user blinks; and
if not, determining that the user does not blink.
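
For illustration only (not part of the claims), one way the closure index of claims 3-4 might be realized is sketched below; the gray-level cut points, the blink threshold, and the mapping of the three gray bands to eye regions are assumed values chosen for the sketch, not values taken from this application:

    # Hedged sketch of claims 2-4; T_LOW, T_HIGH, and BLINK_THRESHOLD are assumed.
    import numpy as np

    T_LOW, T_HIGH = 60, 180   # assumed cut points separating the three gray bands
    BLINK_THRESHOLD = 5.0     # assumed length-to-width ratio threshold

    def closure_index(gray_eye_image):
        # Ternarize the grayscale map into three bands by intensity.
        ternary = np.digitize(gray_eye_image, [T_LOW, T_HIGH])
        # Assumed mapping: darkest band (0) = eyeball, brightest band (2) =
        # white of the eye, middle band (1) = periocular skin.
        eye_mask = (ternary == 0) | (ternary == 2)
        rows = np.flatnonzero(eye_mask.any(axis=1))
        cols = np.flatnonzero(eye_mask.any(axis=0))
        if rows.size == 0 or cols.size == 0:
            return float("inf")              # no eye region found: treat as closed
        max_width = rows[-1] - rows[0] + 1   # vertical extent (the eye opening)
        max_length = cols[-1] - cols[0] + 1  # horizontal extent of the eye
        return max_length / max_width        # grows as the eyelid closes

    def user_blinked(gray_eye_image):
        return closure_index(gray_eye_image) > BLINK_THRESHOLD

Under these assumptions the ratio increases as the eye opening narrows, so exceeding the threshold is read as a blink, consistent with claim 4.
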
5. The method of claim 1, further comprising:
determining, by a graphics processing unit (GPU) or a digital signal processor (DSP), whether the user blinks based on the human eye image.
6. The method according to claim 1 or 2, wherein the invoking a second eyeball tracking algorithm to process the human eye image to obtain a second gaze point position comprises:
acquiring gaze point positions at a plurality of moments determined by eyeball tracking before a capture moment of the human eye image;
determining a gaze point trajectory variation range before the capture moment of the human eye image according to the gaze point positions at the plurality of moments; and
determining the second gaze point position according to the gaze point trajectory variation range.
7. The method of claim 6, wherein the determining the second gaze point position according to the gaze point trajectory variation range comprises:
when the gaze point trajectory variation range is smaller than a preset variation range, determining the second gaze point position as the gaze point position at any one of the plurality of moments; and
when the gaze point trajectory variation range is larger than the preset variation range, calculating the second gaze point position according to preset polynomial fitting.
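
For illustration only (not part of the claims), the two branches of claim 7 might be realized as follows; the preset variation range and the polynomial degree are assumed values for the sketch:

    # Hedged sketch of claims 6-7; PRESET_RANGE and POLY_DEGREE are assumed.
    import numpy as np

    PRESET_RANGE = 0.05   # assumed bound on the gaze trajectory spread
    POLY_DEGREE = 2       # assumed degree of the fitting polynomial

    def second_gaze_point(history):
        # history: shape (n, 2), the gaze point positions (x, y) at the n
        # moments preceding the capture moment of the current eye image.
        history = np.asarray(history, dtype=float)
        spread = history.max(axis=0) - history.min(axis=0)
        if np.all(spread < PRESET_RANGE):
            # Gaze is nearly stationary: reuse a gaze point from the history.
            return history[-1]
        # Gaze is moving: fit x(t) and y(t) and extrapolate one step ahead.
        t = np.arange(len(history))
        fit_x = np.polyfit(t, history[:, 0], POLY_DEGREE)
        fit_y = np.polyfit(t, history[:, 1], POLY_DEGREE)
        t_next = len(history)
        return np.array([np.polyval(fit_x, t_next), np.polyval(fit_y, t_next)])
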
8. An eyeball tracking processing apparatus, applied to a terminal, the eyeball tracking processing apparatus comprising an acquisition unit, a processing unit, and a determination unit, wherein:
the acquisition unit is configured to acquire a human eye image of a user in an eyeball tracking process;
the processing unit is configured to, in the process of obtaining a first gaze point position by invoking a first eyeball tracking algorithm to process the human eye image, determine whether the user blinks based on the human eye image, wherein the first eyeball tracking algorithm is an eyeball tracking algorithm that needs to be enabled when the user does not blink;
the determination unit is configured to, if it is determined that the user blinks, invoke a second eyeball tracking algorithm to process the human eye image to obtain a second gaze point position, and determine the second gaze point position as a final gaze point position; and
the determination unit is further configured to, if it is determined that the user does not blink, determine the first gaze point position as the final gaze point position.
9. A terminal, comprising a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, and the programs comprise instructions for performing the steps in the method of any one of claims 1-7.
10. A computer-readable storage medium storing a computer program for electronic data exchange, wherein the computer program causes a computer to perform the method according to any one of claims 1-7.
CN202011012605.4A 2020-09-23 2020-09-23 Eyeball tracking processing method and related device Pending CN114255505A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011012605.4A CN114255505A (en) 2020-09-23 2020-09-23 Eyeball tracking processing method and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011012605.4A CN114255505A (en) 2020-09-23 2020-09-23 Eyeball tracking processing method and related device

Publications (1)

Publication Number Publication Date
CN114255505A true CN114255505A (en) 2022-03-29

Family

ID=80789857

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011012605.4A Pending CN114255505A (en) 2020-09-23 2020-09-23 Eyeball tracking processing method and related device

Country Status (1)

Country Link
CN (1) CN114255505A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117707330A (en) * 2023-05-19 2024-03-15 荣耀终端有限公司 Electronic equipment and eye movement tracking method

Similar Documents

Publication Publication Date Title
CN111782102B (en) Window display method and related device
CN110650379B (en) Video abstract generation method and device, electronic equipment and storage medium
US20220150403A1 (en) Input Method and Electronic Device
CN110572716B (en) Multimedia data playing method, device and storage medium
WO2020048392A1 (en) Application virus detection method, apparatus, computer device, and storage medium
CN111104980B (en) Method, device, equipment and storage medium for determining classification result
CN110933468A (en) Playing method, playing device, electronic equipment and medium
CN112835445B (en) Interaction method, device and system in virtual reality scene
CN111738365B (en) Image classification model training method and device, computer equipment and storage medium
CN110839128A (en) Photographing behavior detection method and device and storage medium
CN111027490A (en) Face attribute recognition method and device and storage medium
CN110837858A (en) Network model training method and device, computer equipment and storage medium
WO2022095640A1 (en) Method for reconstructing tree-shaped tissue in image, and device and storage medium
CN111880647B (en) Three-dimensional interface control method and terminal
CN114255505A (en) Eyeball tracking processing method and related device
CN110728167A (en) Text detection method and device and computer readable storage medium
CN114547429A (en) Data recommendation method and device, server and storage medium
CN111931712A (en) Face recognition method and device, snapshot machine and system
CN113391775A (en) Man-machine interaction method and equipment
CN115437601A (en) Image sorting method, electronic device, program product, and medium
CN113343709B (en) Method for training intention recognition model, method, device and equipment for intention recognition
CN111310526B (en) Parameter determination method and device for target tracking model and storage medium
CN111258673A (en) Fast application display method and terminal equipment
CN110458289B (en) Multimedia classification model construction method, multimedia classification method and device
CN116909439B (en) Electronic equipment and interaction method thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination