Detailed Description
The embodiments of the present application are applicable to technologies and scenarios such as gaze estimation and gaze tracking in vehicle-mounted scenarios, game interaction, and the like.
Fig. 1 is a schematic flowchart of an eyeball tracking method according to an embodiment of the present application. The eyeball tracking method provided in this embodiment may be executed by a vehicle-mounted device (such as an in-vehicle infotainment unit), or by a terminal device such as a mobile phone or a computer; this is not particularly limited in the present solution. As shown in Fig. 1, the method may include steps 101-104, as follows:
101. preprocessing a gray image and a depth image to obtain a gray-depth image of a target under a preset coordinate system, wherein the gray image and the depth image both contain head information of the target;
the target may be a user, a robot, or the like, and this is not particularly limited in the embodiment of the present application.
As an optional implementation, as shown in Fig. 2, the gray image and the depth image are preprocessed as follows: a high-resolution gray image of the target is acquired by an infrared (IR) sensor, and a low-resolution depth image of the target is acquired by a depth camera; the low-resolution depth image and the high-resolution gray image are then aligned, interpolated, fused, and so on, to obtain a high-resolution point cloud in the coordinate system of the infrared sensor.
Specifically, the infrared sensor and the depth sensor are calibrated to obtain the conversion relationship between their coordinate systems, the depth data of the depth sensor is then converted into the infrared sensor coordinate system, and finally the aligned IR-Depth data, namely the gray-depth image of the target, is output.
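For illustration only, the following is a minimal sketch of such an alignment step, assuming pinhole intrinsics K_depth and K_ir and calibrated extrinsics R, t that map depth-camera coordinates into infrared-camera coordinates (all names are illustrative; the interpolation/hole filling mentioned above is omitted):

```python
import numpy as np

def align_depth_to_ir(depth_map, K_depth, K_ir, R, t, ir_shape):
    """Project a depth map into the IR sensor's image plane.

    depth_map: (H, W) depth in meters from the depth sensor.
    K_depth, K_ir: 3x3 intrinsic matrices of the two sensors.
    R, t: rotation (3x3) and translation (3,) mapping depth-camera
          coordinates into IR-camera coordinates (from calibration).
    Returns a depth image aligned to the IR image grid (zeros = no data).
    """
    h, w = depth_map.shape
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_map.reshape(-1)
    valid = z > 0
    pix = np.stack([us.reshape(-1), vs.reshape(-1), np.ones(h * w)], axis=0)

    # Back-project valid depth pixels to 3D points in the depth-camera frame.
    pts_depth = np.linalg.inv(K_depth) @ (pix[:, valid] * z[valid])
    # Transform the point cloud into the IR-camera coordinate system.
    pts_ir = R @ pts_depth + t.reshape(3, 1)

    # Project onto the IR image plane.
    proj = K_ir @ pts_ir
    u_ir = np.round(proj[0] / proj[2]).astype(int)
    v_ir = np.round(proj[1] / proj[2]).astype(int)

    aligned = np.zeros(ir_shape, dtype=np.float32)
    inside = (u_ir >= 0) & (u_ir < ir_shape[1]) & (v_ir >= 0) & (v_ir < ir_shape[0])
    aligned[v_ir[inside], u_ir[inside]] = pts_ir[2, inside]
    return aligned
```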
102. Performing human head detection on the gray-depth image of the target to obtain a gray-depth image of the head of the target;
As an alternative implementation, human head detection is performed on the gray-depth image of the target by using a detection algorithm, for example a common deep-learning-based human head detection algorithm.
103. Carrying out face reconstruction processing on the gray-depth image of the head of the target to obtain face information of the target;
As an alternative implementation, Fig. 3 is a schematic diagram of a face model reconstruction method provided in an embodiment of the present application. Feature extraction is performed on the gray-depth image of the head of the target to obtain a gray level feature and a depth feature of the target; the gray level feature and the depth feature of the target are then fused to obtain the face model parameters of the target.
Optionally, the face model parameters include an identity parameter, an expression parameter, a texture parameter, a rotation parameter, a displacement parameter, and a spherical harmonic parameter. The identity parameter indicates the identity information of the user; the expression parameter indicates the expression information of the user; the texture parameter is the albedo principal-component coefficient of the user; the rotation parameter is the rotation vector that converts the head of the user from the world coordinate system to the camera coordinate system; the displacement parameter is the corresponding translation vector; and the spherical harmonic parameters are the parameters of the illumination model and are used to model the illumination.
The face information of the target is then obtained based on the face model parameters of the target.
As another optional implementation, the gray-depth image of the head of the target is input into a face reconstruction network model for processing to obtain the face information of the target. The face reconstruction network model extracts features from the gray-depth image of the head of the target to obtain the gray level feature and the depth feature of the target, fuses the gray level feature and the depth feature to obtain the face model parameters of the target, and then obtains the face information of the target from the face model parameters. In other words, the face model parameters are regressed by the face reconstruction network model, and the face mesh information in the preset coordinate system, namely the face information, is then obtained.
Specifically, the gray-depth image of the head of the target is input into a first feature extraction layer of the face reconstruction network model for gray level feature extraction and into a second feature extraction layer for depth feature extraction; the features extracted by the two layers are then input into a feature fusion layer for fusion; finally, the face model parameters regressed by the face reconstruction network model are output.
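As a non-limiting illustration of such a two-branch regression network, the following PyTorch-style sketch uses assumed layer sizes and parameter dimensions (identity, expression, texture, rotation, displacement, and spherical harmonic coefficients); the actual network structure is not limited to this:

```python
import torch
import torch.nn as nn

class FaceReconNet(nn.Module):
    """Two-branch regression sketch: one branch extracts gray level features,
    the other depth features; a fusion layer merges them and regresses the
    face model parameters. All dimensions below are assumptions."""

    def __init__(self, n_id=80, n_exp=64, n_tex=80):
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.gray_branch = branch()   # first feature extraction layer
        self.depth_branch = branch()  # second feature extraction layer
        # rotation (3), displacement (3), spherical harmonics
        # (9 coefficients for a grayscale 2nd-order model, an assumption)
        self.splits = (n_id, n_exp, n_tex, 3, 3, 9)
        self.fusion = nn.Sequential(nn.Linear(128, 256), nn.ReLU(),
                                    nn.Linear(256, sum(self.splits)))

    def forward(self, gray, depth):
        # gray, depth: (B, 1, H, W) tensors from the aligned gray-depth image
        f = torch.cat([self.gray_branch(gray), self.depth_branch(depth)], dim=1)
        params = self.fusion(f)
        # identity, expression, texture, rotation, displacement, spherical harmonics
        return torch.split(params, self.splits, dim=1)
```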
The face reconstruction network model may be obtained by training a convolutional neural network. Specifically, as shown in Fig. 4, feature extraction is performed separately on a gray level image sample of a user and a depth image sample of the user that are input into the face reconstruction network model, so as to obtain the gray level feature and the depth feature of the user. The gray level feature and the depth feature of the user are then fused to obtain the face model parameters of the user, which include an identity parameter, an expression parameter, a texture parameter, a rotation parameter, a displacement parameter, and a spherical harmonic parameter. Face information is obtained from the face model parameters of the user, and a loss value is obtained from the face information, the user gray level image sample, and the user depth image sample. If the loss value does not reach the stop condition, the parameters of the face reconstruction network model are adjusted and the above steps are repeated until the stop condition is reached, yielding the trained face reconstruction network model. The weight of the user's eyes in a first loss function corresponding to the loss value is not less than a preset threshold; the first loss function may be the geometric loss function.
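A minimal training-loop sketch corresponding to the above description is given below; the loss function, data loader, and stop threshold are placeholders and are assumptions rather than the concrete training configuration:

```python
import torch

def train(model, loader, optimizer, loss_fn, max_epochs=50, loss_threshold=1e-3):
    """Sketch: regress face model parameters from the gray/depth samples,
    compute the self-supervised loss against those same samples, and stop
    when the mean loss (or the epoch budget) reaches the stop condition."""
    for epoch in range(max_epochs):
        running = 0.0
        for gray, depth in loader:
            params = model(gray, depth)
            # loss_fn is assumed to combine the losses described below,
            # with the eye-region weight not less than the preset threshold.
            loss = loss_fn(params, gray, depth)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            running += loss.item()
        if running / len(loader) <= loss_threshold:  # stop condition on the loss value
            break
    return model
```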
As an alternative implementation, the convolutional neural network is trained in a self-supervised manner using the following three loss functions:
1) Geometric loss $E_{gro}(X)$, used to calculate the error between the face point cloud and the depth-image point cloud:
$E_{gro}(X) = w_{pp}E_{pp}(X) + w_{ps}E_{ps}(X)$
where $E_{pp}(X)$ is the point-to-point loss; $E_{ps}(X)$ is the loss from points to the surface of the face model; $w_{pp}$ is the point-to-point weight; and $w_{ps}$ is the point-to-surface weight.
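For illustration, a simple sketch of this geometric loss is given below, assuming that the face-model vertices have already been posed in camera coordinates and matched one-to-one with depth-image points (the correspondence search and the exact distance form are assumptions):

```python
import torch

def geometric_loss(verts, normals, cloud, w_pp=1.0, w_ps=1.0):
    """Sketch of E_gro(X) = w_pp * E_pp(X) + w_ps * E_ps(X).

    verts:   (N, 3) face-model vertices posed in camera coordinates,
             assumed to be in correspondence with `cloud`.
    normals: (N, 3) unit normals of the face model at those vertices.
    cloud:   (N, 3) depth-image points matched to the vertices.
    """
    diff = cloud - verts
    e_pp = (diff ** 2).sum(dim=1).mean()              # point-to-point term
    e_ps = ((diff * normals).sum(dim=1) ** 2).mean()  # point-to-surface term
    return w_pp * e_pp + w_ps * e_ps
```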
2) Face key point loss $E_{lan}(X)$, used to calculate the projection error of the three-dimensional key points of the face model;
where $L$ is the set of visible face key points; $LP$ is the set of visible eye key points; $q_i$ is the $i$-th face key point; $p_i$ is the $i$-th three-dimensional (3D) key point on the face model; $R$ is the rotation matrix; $t$ is the displacement vector; and $\Pi(\cdot)$ denotes the projection onto the image plane.
$\|(q_i - q_j) - (\Pi(Rp_i + t) - \Pi(Rp_j + t))\|_2$ denotes the square root of the sum of squares of the components of $(q_i - q_j) - (\Pi(Rp_i + t) - \Pi(Rp_j + t))$; $\sum_{i \in L} \|q_i - \Pi(Rp_i + t)\|_2$ denotes the summation of $\|q_i - \Pi(Rp_i + t)\|_2$ over the visible key points, where $\|q_i - \Pi(Rp_i + t)\|_2$ denotes the square root of the sum of squares of the components of $q_i - \Pi(Rp_i + t)$; $i$ and $j$ are positive integers.
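The following sketch illustrates only the visible-key-point term of such a loss, using an assumed pinhole projection with intrinsic matrix K for $\Pi(\cdot)$; the pairwise term described above could be added analogously:

```python
import torch

def project(points, R, t, K):
    """Perspective projection Pi(R p + t) of 3D model key points (no skew assumed)."""
    cam = points @ R.T + t          # (N, 3) points in camera coordinates
    uv = cam[:, :2] / cam[:, 2:3]   # normalized image-plane coordinates
    return uv @ K[:2, :2].T + K[:2, 2]

def landmark_loss(q, p, R, t, K, visible):
    """Sketch of the visible-key-point term sum_{i in L} ||q_i - Pi(R p_i + t)||_2.

    q: (N, 2) detected 2D key points; p: (N, 3) model 3D key points;
    visible: (N,) boolean mask of the visible key points.
    """
    err = q[visible] - project(p[visible], R, t, K)
    return err.norm(dim=1).sum()
```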
3) Pixel loss $E_{col}(X)$, used to calculate the gray-level difference between the rendered gray values of the face model and the IR gray image;
where $F$ is the set of pixels visible to the face model; $I_{syn}$ is the synthesized (rendered) pixel value; and $I_{real}$ is the pixel value in the actual image.
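A minimal sketch of such a pixel loss is shown below; the use of a mean absolute gray difference over the visible-pixel mask is an assumption, since the exact form is not given here:

```python
import torch

def pixel_loss(I_syn, I_real, visible_mask):
    """Sketch of E_col(X): gray-level difference between the rendered face
    image I_syn and the real IR image I_real over the pixels F visible to
    the face model. visible_mask is a {0, 1} float mask of those pixels."""
    diff = (I_syn - I_real) * visible_mask
    return diff.abs().sum() / visible_mask.sum().clamp(min=1)
```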
The convolutional neural network also applies the following face model regularization loss $E_{reg}(X)$ as a face constraint:
where $\alpha_{id}$ is the face identity coefficient; $\alpha_{alb}$ is the face albedo coefficient; $\alpha_{exp}$ is the facial expression coefficient; $\sigma_{id}$ is the identity coefficient weight; $\sigma_{alb}$ is the albedo coefficient weight; and $\sigma_{exp}$ is the expression coefficient weight.
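As an illustration, the sketch below uses a weighted squared-norm prior on the coefficients, which is a common form of such a regularization term; the exact expression used by the model is not limited to this:

```python
import torch

def regular_loss(alpha_id, alpha_alb, alpha_exp,
                 sigma_id=1.0, sigma_alb=1.0, sigma_exp=1.0):
    """Sketch of E_reg(X): weighted squared-norm prior on the identity,
    albedo and expression coefficients (the precise form is an assumption)."""
    return (sigma_id * (alpha_id ** 2).sum()
            + sigma_alb * (alpha_alb ** 2).sum()
            + sigma_exp * (alpha_exp ** 2).sum())
```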
Because the human eyes are the key region in the eyeball tracking process, the present solution may appropriately increase the weight of the human eyes in the geometric loss $E_{gro}(X)$ used to calculate the error between the face point cloud and the depth-image point cloud:
$E_{gro}(X) = w_1 E_{eye}(X) + w_2 E_{nose}(X) + w_3 E_{mouth}(X) + w_4 E_{other}(X)$
where $E_{eye}(X)$ is the vertex loss of the eye region in the face model; $E_{nose}(X)$ is the vertex loss of the nose region; $E_{mouth}(X)$ is the vertex loss of the mouth region; $E_{other}(X)$ is the vertex loss of the other regions; $w_1$ is the coefficient of the eye region in the face model; $w_2$ is the coefficient of the nose region; $w_3$ is the coefficient of the mouth region; and $w_4$ is the coefficient of the other regions.
The coefficient $w_1$ of the eye region in the face model satisfies the condition of being not less than a preset threshold, and the preset threshold may be any value. For example, $w_1$ satisfies $w_1 \ge w_2$, $w_1 \ge w_3$, and $w_1 \ge w_4$.
This embodiment increases the loss weight of the eye region, so that the reconstruction accuracy of the eye region is higher.
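A possible sketch of this region-weighted geometric loss is shown below; the per-vertex residuals, the region labelling, and the example weights are assumptions used only to illustrate that the eye-region weight w1 is not smaller than the other weights:

```python
import torch

def weighted_geometric_loss(residuals, region_ids, w=(4.0, 1.0, 1.0, 1.0)):
    """Sketch of E_gro(X) = w1*E_eye + w2*E_nose + w3*E_mouth + w4*E_other.

    residuals:  (N,) per-vertex geometric errors between the face model
                and the depth point cloud.
    region_ids: (N,) integer labels: 0 = eye, 1 = nose, 2 = mouth, 3 = other.
    w:          region weights; w[0] >= the other weights so the eye region
                dominates (the example values are assumptions).
    """
    loss = residuals.new_zeros(())
    for region, weight in enumerate(w):
        mask = region_ids == region
        if mask.any():
            loss = loss + weight * residuals[mask].mean()
    return loss
```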
A geometric loss value, a face key point loss value, and a pixel loss value are calculated based on the three loss functions. If the geometric loss value is not greater than a preset geometric loss threshold, the face key point loss value is not greater than a preset key point loss threshold, and the pixel loss value is not greater than a preset pixel loss threshold, training is stopped and the trained face reconstruction network model is obtained. If the loss values do not meet these conditions, the network parameters are adjusted and the training process is repeated until the stop condition is reached.
The stop condition in the above embodiment is described using the example in which the loss value is not greater than a preset loss threshold. The stop condition may also be that the number of iterations reaches a preset number, or the like; this is not specifically limited in the present solution.
The above description is given by taking three loss functions as examples. Other loss functions may also be used, and this is not specifically limited in this embodiment.
104. Obtaining the pupil position of the target according to the face information.
As an optional implementation, the coordinates of the pupils can be obtained from the eye-region key points of the three-dimensional face. Specifically, the pupil position of the target is solved from the position information of preset key points on the face, such as the eyelids and canthi. The pupil position is the starting point of the line of sight.
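For illustration, a very simple sketch is given below; taking the centroid of the reconstructed eyelid/canthus key points as the pupil position is an assumption, since the text only states that the pupil position is solved from these preset key points:

```python
import numpy as np

def pupil_position(eye_keypoints_3d):
    """Rough sketch: estimate the pupil (sight-line starting point) of one eye
    from the reconstructed 3D eyelid/canthus key points by taking their
    centroid (an assumed solving rule, for illustration only)."""
    return np.asarray(eye_keypoints_3d, dtype=np.float64).mean(axis=0)
```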
The embodiments of the present application are described only by taking eyeball tracking as an example. With the above method, the position of the mouth, the nose, the ears, and the like of the target can also be obtained; the present solution is not particularly limited in this respect.
In the embodiments of the present application, the gray-depth image of the target is obtained based on the gray image and the depth image of the target, the gray-depth image of the head of the target is obtained through human head detection, and face reconstruction processing is performed on the gray-depth image of the head of the target, so as to obtain the pupil position of the target. With this method, the face of the target is reconstructed based on two sources of information, the gray image and the depth image, so an accurate sight-line starting point can be obtained in real time.
The sight-line starting point depends critically on the accuracy of the eye region, and the eyeball tracking result is affected when the eyes of the target are occluded by hands, glasses, a hat, or the like, or when the image changes due to lighting changes, depth errors in the depth image, and so on. In order to simulate the situations that can occur in various real scenes and enable the face reconstruction network model to cope with various complex scenes, the present solution further provides an eyeball tracking method that performs eyeball tracking based on the obtained enhanced two-dimensional image and three-dimensional point cloud image of the key region of the target, thereby improving the robustness of the algorithm.
Fig. 5 is a schematic flowchart of another eyeball tracking method according to an embodiment of the present application. The eyeball tracking method provided in this embodiment may be executed by a vehicle-mounted device (such as an in-vehicle infotainment unit), or by a terminal device such as a mobile phone or a computer; this is not particularly limited in the present solution. As shown in Fig. 5, the method may include steps 501-504, as follows:
501. preprocessing a gray image and a depth image to obtain a gray-depth image of a target under a preset coordinate system, wherein the gray image and the depth image both contain head information of the target;
the target may be a user, a robot, or the like, and this is not particularly limited in the embodiment of the present application.
As an optional implementation, as shown in Fig. 2, the gray image and the depth image are preprocessed as follows: a high-resolution gray image of the target is acquired by an infrared (IR) sensor, and a low-resolution depth image of the target is acquired by a depth camera; the low-resolution depth image and the high-resolution gray image are then aligned, interpolated, fused, and so on, to obtain a high-resolution point cloud in the coordinate system of the infrared sensor.
Specifically, the infrared sensor and the depth sensor are calibrated to obtain the conversion relationship between their coordinate systems, the depth data of the depth sensor is then converted into the infrared sensor coordinate system, and finally the aligned IR-Depth data, namely the gray-depth image of the target, is output.
502. Performing human head detection on the gray-depth image of the target to obtain a gray-depth image of the head of the target;
As an alternative implementation, human head detection is performed on the gray-depth image of the target by using a detection algorithm, for example a common deep-learning-based human head detection algorithm.
503. Carrying out face reconstruction processing on the gray-depth image of the head of the target to obtain face information of the target;
the face reconstruction network model can be obtained by training based on steps 5031 and 5039, and the details are as follows:
5031. Acquiring a first point cloud sample of a user, and a point cloud sample and a texture sample of an occluder;
the first point cloud sample may be an original point cloud sample of the user, i.e., the point cloud sample of the user without the obstruction.
The occluder is an object that blocks the eyes, such as a hand, glasses, or a hat, or another influence such as a lighting change.
5032. Overlaying the point cloud sample of the occluder on the first point cloud sample of the user to obtain a second point cloud sample of the user;
and superposing the point cloud sample of the shielding object in front of the visual angle of the first point cloud sample camera of the user (namely on a camera coordinate system) to obtain a second point cloud sample of the user.
5033. Blanking the second point cloud sample of the user to obtain a third point cloud sample of the user;
When drawing realistic graphics, depth information is lost in the projection transformation, which often makes the resulting image ambiguous. To remove such ambiguity, the hidden, invisible lines or surfaces must be removed during rendering; this is conventionally referred to as hidden-line and hidden-surface removal, or simply blanking.
Blanking is performed on the invisible points behind the occluder, for example by removing the point cloud behind the occluder with a blanking algorithm (such as the Z-buffer algorithm), so as to obtain the blanked third point cloud sample of the user.
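The following is a simplified sketch of such z-buffer-style blanking, assuming the merged point cloud (user plus occluder) is given in camera coordinates with a pinhole intrinsic matrix K; the pixel grid and depth tolerance are assumptions:

```python
import numpy as np

def zbuffer_blank(points, K, image_shape, eps=5e-3):
    """Z-buffer style blanking sketch: project all points into the camera,
    keep for every pixel only the points closest to the camera (within eps),
    and mark points hidden behind the occluder as removed.

    points: (N, 3) point cloud in camera coordinates (z > 0).
    Returns a boolean mask of the points that remain visible.
    """
    proj = points @ K.T
    u = np.round(proj[:, 0] / proj[:, 2]).astype(int)
    v = np.round(proj[:, 1] / proj[:, 2]).astype(int)
    inside = (u >= 0) & (u < image_shape[1]) & (v >= 0) & (v < image_shape[0])

    # Pass 1: nearest depth per pixel.
    zbuf = np.full(image_shape, np.inf)
    np.minimum.at(zbuf, (v[inside], u[inside]), points[inside, 2])

    # Pass 2: keep points within eps of the nearest depth at their pixel.
    idx = np.where(inside)[0]
    visible = np.zeros(len(points), dtype=bool)
    visible[idx] = points[idx, 2] <= zbuf[v[idx], u[idx]] + eps
    return visible
```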
5034. Rendering the third point cloud sample of the user and the texture sample of the occluder to obtain a two-dimensional image sample of the user;
the texture sample of the shielding object positioned in front of the user is rendered to cover the texture of the user behind the shielding object, so that the two-dimensional image sample of the user can be obtained.
5035. Performing noise-adding enhancement on the two-dimensional image sample of the user and on the third point cloud sample, respectively, to obtain an enhanced two-dimensional image sample and an enhanced depth image sample of the user, where the enhanced two-dimensional image sample and the enhanced depth image sample are, respectively, the user gray level image sample and the user depth image sample input into the face reconstruction network model;
two-dimensional images and three-dimensional point clouds are obtained after shielding enhancement is carried out, and blocks in various shapes can be superposed to serve as noise. The pixel values or point cloud coordinate values within such a block may conform to a predetermined distribution (e.g., the pixel value distribution satisfies a gaussian distribution with a mean value of 10 and a standard deviation of 0.1, and the point cloud coordinate is assigned a value of zero). To be more realistic, illumination noise, Time of flight (TOF) sensor noise data may also be simulated. For example, blocks of 25 × 25 pixel size, 50 × 50 pixel size, and 100 × 100 pixel size are randomly generated on an IR image and a TOF point cloud, where the gray values of the gray blocks in the two-dimensional image satisfy a gaussian distribution, the mean value of the distribution is the pixel mean value of the corresponding block in the original image, and the standard deviation is 0.01. The block in the point cloud picture can simulate noise such as holes, and the setting depth is zero at the moment. The effect is shown in fig. 6b, where fig. 6a is an effect diagram without superimposed noise.
As an alternative implementation, an original two-dimensional image and three-dimensional point cloud of the user in the cabin are acquired, and a three-dimensional scanned point cloud and texture information of the occluder are acquired with a scanner. The point cloud of the occluder is superimposed on the three-dimensional point cloud of the user, and the points behind the occluder are removed with a z-buffer algorithm to obtain the processed point cloud of the user. The processed point cloud of the user is then rendered with the scanned texture of the occluder to generate the processed two-dimensional image of the user.
Taking hand occlusion as an example, in order to obtain data with the hand occluding various positions, a scanner can first be used to scan the hand to obtain its three-dimensional point cloud and texture information. In the original image, the positions of the face key points in the two-dimensional image are obtained with a face key point algorithm, and the positions of these key points in the camera coordinate system can then be found in the depth image or the three-dimensional point cloud according to their image positions. The previously scanned three-dimensional hand model can then be placed at the corresponding position using the coordinate information of the key points on the face. The occluder is now in front, so from the sensor's perspective some face regions that were previously unoccluded are now occluded by the hand, and the face points behind the hand can be eliminated with a blanking algorithm (e.g., the z-buffer algorithm). In this way, a complete piece of synthesized point cloud data is obtained.
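The following sketch illustrates this synthesis step under simplifying assumptions: the scanned hand cloud is re-centred at a chosen 3D face key point plus an offset toward the camera (the placement rule is an assumption), and the zbuffer_blank helper from the earlier sketch is reused to remove the occluded face points:

```python
import numpy as np

def synthesize_hand_occlusion(face_points, hand_points, keypoint_3d, offset,
                              K, image_shape):
    """Sketch: move the scanned hand point cloud to a position derived from a
    3D face key point, merge it with the face cloud, and blank the face
    points hidden behind it using the zbuffer_blank sketch above.

    face_points: (N, 3) user face point cloud in camera coordinates.
    hand_points: (M, 3) scanned hand point cloud (arbitrary frame).
    keypoint_3d: (3,) chosen face key point in camera coordinates.
    offset:      (3,) displacement toward the camera (an assumed placement rule).
    """
    hand = hand_points - hand_points.mean(axis=0) + keypoint_3d + offset
    merged = np.concatenate([face_points, hand], axis=0)
    visible = zbuffer_blank(merged, K, image_shape)   # from the earlier sketch
    face_visible = visible[:len(face_points)]         # which face points survive
    return merged[visible], face_visible
```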
After the point cloud data is acquired, texture information can be obtained from the point cloud data and a two-dimensional image under the camera view angle can be rendered, so that the enhanced two-dimensional image and three-dimensional depth image are obtained.
The above description is only an example; data for reflective glasses, opaque sunglasses, and other accessories that may cause occlusion can also be synthesized. The reconstruction data of the 3D object is obtained with the scanner, the rotation matrix R and displacement vector T of the human eyes relative to the camera are roughly estimated by an algorithm, the 3D object is moved to the corresponding position using R and T and superimposed on the time-of-flight (TOF) point cloud data using a blanking algorithm, and the mesh gray information is superimposed on the IR image by perspective projection, thereby completing the data synthesis.
5036. Inputting the user gray level image sample and the user depth image sample into a face reconstruction network model to obtain the gray level feature and the depth feature of the user;
the user gray level image sample here is the enhanced two-dimensional image sample of the user, and the user depth image sample here is the enhanced depth image sample.
5037. Fusing the gray level features and the depth features of the user to obtain face model parameters of the user;
5038. obtaining face information according to the face model parameters of the user;
5039. Obtaining a loss value according to the face information and a first gray image sample and a first depth image sample of the user; if the loss value does not reach a stop condition, adjusting the parameters of the face reconstruction network model and repeating the above steps until the stop condition is reached, to obtain the trained face reconstruction network model, where the weight of the user's eyes in a first loss function corresponding to the loss value is not less than a preset threshold;
the first grayscale image sample of the user is an original grayscale image sample of the user, that is, the grayscale image sample of the user when there is no obstruction. The first depth image sample of the user is an original depth image sample of the user, that is, a depth image sample of the user without an obstruction.
For the related description of steps 5036 to 5039, reference may be made to the foregoing embodiments, and details are not repeated here.
504. Obtaining the pupil position of the target according to the face information.
In the embodiments of the present application, the point cloud sample of the user and the point cloud sample and texture sample of the occluder are obtained, and the presence of an occluder is simulated, so that a face reconstruction network model that can cope with occluders is obtained through training. With this solution, the data of the eye region is enhanced, so the reconstruction accuracy of the eye region is higher; in addition, conditions that can occur in various real scenes can be simulated and the corresponding enhanced two-dimensional and three-dimensional images obtained, thereby improving the robustness of the algorithm.
It should be noted that the eyeball tracking method provided in the present application can be executed locally, or can be executed by uploading the gray image and the depth image of the target to the cloud. The cloud may be implemented by a server, which may be a virtual server, a physical server, or the like, or by other devices; the present solution is not particularly limited in this respect.
Referring to Fig. 7, an embodiment of the present application provides an eyeball tracking apparatus, which may be a vehicle-mounted apparatus (e.g., an in-vehicle infotainment unit) or a terminal device such as a mobile phone or a computer. The apparatus comprises a preprocessing module 701, a detection module 702, a reconstruction processing module 703, and an acquisition module 704, as follows:
the preprocessing module 701 is configured to preprocess the gray image and the depth image to obtain a gray-depth image of the target in a preset coordinate system, where the gray image and the depth image both include head information of the target;
a detection module 702, configured to perform human head detection on the gray-depth image of the target to obtain a gray-depth image of the head of the target;
a reconstruction processing module 703, configured to perform face reconstruction processing on the gray-depth image of the head of the target to obtain face information of the target;
an acquisition module 704, configured to obtain the pupil position of the target according to the face information.
In the embodiments of the present application, the gray-depth image of the target is obtained based on the gray image and the depth image of the target, the gray-depth image of the head of the target is obtained through human head detection, and face reconstruction processing is performed on the gray-depth image of the head of the target, so as to obtain the pupil position of the target. With this apparatus, the face of the target is reconstructed based on two sources of information, the gray image and the depth image, so an accurate sight-line starting point can be obtained in real time.
As an optional implementation manner, the reconstruction processing module 703 is configured to:
performing feature extraction on the gray-depth image of the head of the target to obtain a gray level feature and a depth feature of the target;
fusing the gray level feature and the depth feature of the target to obtain the face model parameters of the target;
and obtaining the face information of the target according to the face model parameters of the target.
The face model parameters of the target are obtained by fusing the gray level feature and the depth feature of the target, and the face information of the target is then obtained. Because the face model parameters of the target fuse the gray level features and the depth features, they are more comprehensive than in prior-art methods that use only gray level features, which can effectively improve the eyeball tracking accuracy.
As an alternative implementation, the face reconstruction processing on the gray-depth image of the head of the target is performed by a face reconstruction network model.
As an optional implementation manner, the face reconstruction network model is obtained by training as follows:
respectively extracting features from a user gray level image sample and a user depth image sample that are input into the face reconstruction network model, to obtain the gray level features and the depth features of the user;
fusing the gray level features and the depth features of the user to obtain face model parameters of the user, wherein the face model parameters comprise identity parameters, expression parameters, texture parameters, rotation parameters and displacement parameters;
obtaining face information according to the face model parameters of the user;
and obtaining a loss value according to the face information; if the loss value does not reach a stop condition, adjusting the parameters of the face reconstruction network model and repeating the above steps until the stop condition is reached, to obtain the trained face reconstruction network model, where the weight of the user's eyes in a first loss function corresponding to the loss value is not less than a preset threshold.
As another optional implementation, the apparatus is further configured to: acquire a first point cloud sample of the user, and a point cloud sample and a texture sample of an occluder; superimpose the point cloud sample of the occluder on the first point cloud sample of the user to obtain a second point cloud sample of the user; perform blanking on the second point cloud sample of the user to obtain a third point cloud sample of the user; render the third point cloud sample and the texture sample of the occluder to obtain a two-dimensional image sample of the user; and perform noise-adding enhancement on the two-dimensional image sample of the user and the third point cloud sample, respectively, to obtain an enhanced two-dimensional image sample and an enhanced depth image sample of the user, where the enhanced two-dimensional image sample and the enhanced depth image sample of the user are, respectively, the user gray level image sample and the user depth image sample input into the face reconstruction network model.
It should be noted that the preprocessing module 701, the detection module 702, the reconstruction processing module 703, and the acquisition module 704 are configured to execute the relevant steps of the foregoing methods. For example, the preprocessing module 701 is configured to execute the relevant content of step 101 and/or step 501, the detection module 702 is configured to execute the relevant content of step 102 and/or step 502, the reconstruction processing module 703 is configured to execute the relevant content of step 103 and/or step 503, and the acquisition module 704 is configured to execute the relevant content of step 104 and/or step 504.
In the embodiments of the present application, the point cloud sample of the user and the point cloud sample and texture sample of the occluder are obtained, and the presence of an occluder is simulated, so that a face reconstruction network model that can cope with occluders is obtained through training. With this solution, the data of the eye region is enhanced, so the reconstruction accuracy of the eye region is higher; in addition, conditions that can occur in various real scenes can be simulated and the corresponding enhanced two-dimensional image and enhanced three-dimensional point cloud image obtained, thereby improving the robustness of the algorithm.
In this embodiment, the eyeball tracking device is represented in a module form. A "module" herein may refer to an application-specific integrated circuit (ASIC), a processor and memory that execute one or more software or firmware programs, an integrated logic circuit, and/or other devices that may provide the described functionality. Further, the above preprocessing module 701, the detection module 702, the reconstruction processing module 703 and the acquisition module 704 may be implemented by the processor 801 of the eye tracking apparatus shown in fig. 8.
Fig. 8 is a schematic structural diagram of another eyeball tracking apparatus according to an embodiment of the present application. As shown in Fig. 8, the eyeball tracking apparatus 800 comprises at least one processor 801, at least one memory 802, and at least one communication interface 803. The processor 801, the memory 802, and the communication interface 803 are connected through a communication bus and communicate with each other.
The processor 801 may be a general purpose Central Processing Unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of programs according to the above schemes.
The communication interface 803 is used for communicating with other devices or communication networks, such as Ethernet, a radio access network (RAN), or a wireless local area network (WLAN).
The memory 802 may be a read-only memory (ROM) or other type of static storage device that can store static information and instructions, a random access memory (RAM) or other type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, and the like), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory may be self-contained and coupled to the processor via a bus. The memory may also be integrated with the processor.
The memory 802 is used for storing application program codes for executing the above schemes, and is controlled by the processor 801 to execute. The processor 801 is used to execute application program code stored in the memory 802.
The memory 802 stores code that may perform one of the eye tracking methods provided above.
It should be noted that although the eye tracking apparatus 800 shown in fig. 8 only shows a memory, a processor and a communication interface, in the specific implementation process, those skilled in the art will understand that the eye tracking apparatus 800 also includes other devices necessary for normal operation. Also, as may be appreciated by those skilled in the art, the eye tracking apparatus 800 may also include hardware components for performing other additional functions, according to particular needs. Furthermore, those skilled in the art will appreciate that the eye tracking apparatus 800 may also include only those components necessary to implement the embodiments of the present application, and need not include all of the components shown in FIG. 8.
An embodiment of the present application further provides a chip system applied to an electronic device. The chip system includes one or more interface circuits and one or more processors, which are interconnected by lines. The interface circuit is configured to receive a signal from a memory of the electronic device and send the signal to the processor, the signal including computer instructions stored in the memory; when the processor executes the computer instructions, the electronic device performs the above method.
Embodiments of the present application also provide a computer-readable storage medium having stored therein instructions, which when executed on a computer or processor, cause the computer or processor to perform one or more steps of any one of the methods described above.
The embodiment of the application also provides a computer program product containing instructions. The computer program product, when run on a computer or processor, causes the computer or processor to perform one or more steps of any of the methods described above.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
It should be understood that in the description of the present application, unless otherwise indicated, "/" indicates a relationship where the objects associated before and after are an "or", e.g., a/B may indicate a or B; wherein A and B can be singular or plural. Also, in the description of the present application, "a plurality" means two or more than two unless otherwise specified. "at least one of the following" or similar expressions refer to any combination of these items, including any combination of the singular or plural items. For example, at least one (one) of a, b, or c, may represent: a, b, c, a-b, a-c, b-c, or a-b-c, wherein a, b, c may be single or multiple. In addition, in order to facilitate clear description of technical solutions of the embodiments of the present application, in the embodiments of the present application, terms such as "first" and "second" are used to distinguish the same items or similar items having substantially the same functions and actions. Those skilled in the art will appreciate that the terms "first," "second," etc. do not denote any order or quantity, nor do the terms "first," "second," etc. denote any order or importance. Also, in the embodiments of the present application, words such as "exemplary" or "for example" are used to mean serving as examples, illustrations or illustrations. Any embodiment or design described herein as "exemplary" or "e.g.," is not necessarily to be construed as preferred or advantageous over other embodiments or designs. Rather, use of the word "exemplary" or "such as" is intended to present relevant concepts in a concrete fashion for ease of understanding.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the division of the unit is only one logical function division, and other division may be implemented in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. The shown or discussed mutual coupling, direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some interfaces, and may be in an electrical, mechanical or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. The procedures or functions according to the embodiments of the present application are wholly or partially generated when the computer program instructions are loaded and executed on a computer. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored on or transmitted over a computer-readable storage medium. The computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)), or wirelessly (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that includes one or more of the available media. The usable medium may be a read-only memory (ROM), or a Random Access Memory (RAM), or a magnetic medium, such as a floppy disk, a hard disk, a magnetic tape, a magnetic disk, or an optical medium, such as a Digital Versatile Disk (DVD), or a semiconductor medium, such as a Solid State Disk (SSD).
The above description is only a specific implementation of the embodiments of the present application, but the scope of the embodiments of the present application is not limited thereto, and any changes or substitutions within the technical scope disclosed in the embodiments of the present application should be covered by the scope of the embodiments of the present application. Therefore, the protection scope of the embodiments of the present application shall be subject to the protection scope of the claims.