CN113238652B - Gaze estimation method, apparatus, device and storage medium


Info

Publication number
CN113238652B (application CN202110512902.3A)
Authority
CN
China
Prior art keywords
eye
image
reference image
sight line
Legal status
Active
Application number
CN202110512902.3A
Other languages
Chinese (zh)
Other versions
CN113238652A (en)
Inventor
刘钢
唐堂
Current Assignee
Beijing Zitiao Network Technology Co Ltd
Original Assignee
Beijing Zitiao Network Technology Co Ltd
Application filed by Beijing Zitiao Network Technology Co Ltd
Priority to CN202110512902.3A
Publication of CN113238652A
Application granted
Publication of CN113238652B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011: Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F 3/013: Eye tracking input arrangements
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/18: Eye characteristics, e.g. of the iris

Abstract

Embodiments of the present disclosure relate to a gaze estimation method, apparatus, device, and storage medium. An eye image of a subject is acquired; based on the eye image and at least one reference image of the subject's eye, a gaze-direction difference between the eye in the eye image and the eye in each reference image is determined, together with an influence weight of each reference image on the gaze estimate for the eye image. The gaze direction of the eye in the eye image is then determined from these differences, the influence weights, and the gaze-direction annotation information carried by each reference image. Embodiments of the present disclosure can improve the accuracy of gaze estimation.

Description

Gaze estimation method, apparatus, device and storage medium
Technical Field
Embodiments of the present disclosure relate to the technical field of image recognition, and in particular to a gaze estimation method, apparatus, device, and storage medium.
Background
Gaze estimation methods provided by the related art can estimate the gaze direction of an eye from an eye image. In practice, however, the internal and external structures of the eye differ from one individual to another, so the estimation result is often inaccurate. How to improve the accuracy of gaze estimation is therefore a technical problem that remains to be solved in the art.
Disclosure of Invention
In order to solve, or at least partially solve, the above technical problem, embodiments of the present disclosure provide a gaze estimation method, apparatus, device, and storage medium.
A first aspect of the embodiments of the present disclosure provides a gaze estimation method, including: acquiring an eye image of a subject; determining, based on the eye image and at least one reference image of the subject's eye, a gaze-direction difference between the eye in the eye image and the eye in each reference image, and an influence weight of each reference image on the gaze estimate for the eye image, each reference image carrying annotation information indicating the gaze direction of the eye in that reference image; and determining the gaze direction of the eye in the eye image based on the gaze-direction differences, the influence weights, and the annotation information on each reference image.
A second aspect of the embodiments of the present disclosure provides a gaze estimation apparatus, including:
an acquisition module, configured to acquire an eye image of a subject;
a first determining module, configured to determine, based on the eye image and at least one reference image of the subject's eye, a gaze-direction difference between the eye in the eye image and the eye in each reference image, and an influence weight of each reference image on the gaze estimate for the eye image, each reference image carrying annotation information indicating the gaze direction of the eye in that reference image;
and a second determining module, configured to determine the gaze direction of the eye in the eye image based on the gaze-direction differences, the influence weights, and the annotation information on each reference image.
A third aspect of the embodiments of the present disclosure provides a terminal device, including a memory and a processor, wherein the memory stores a computer program which, when executed by the processor, implements the method of the first aspect.
A fourth aspect of the embodiments of the present disclosure provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method of the first aspect.
Compared with the prior art, the technical solutions provided by the embodiments of the present disclosure have the following advantages:
In the embodiments of the present disclosure, an eye image of a subject is acquired; based on the eye image and at least one reference image of the subject's eye, a gaze-direction difference between the eye in the eye image and the eye in each reference image is determined, together with an influence weight of each reference image on the gaze estimate for the eye image; the gaze direction of the eye in the eye image is then determined from these differences, the influence weights, and the gaze-direction annotation information carried by each reference image. By combining the gaze-direction differences against multiple reference images with the influence weights of those reference images, the gaze estimate for the subject's eye image is corrected, which can improve the accuracy of gaze estimation.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure.
In order to illustrate the embodiments of the present disclosure or the solutions in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below; it will be apparent to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a flowchart of a gaze estimation method provided by an embodiment of the present disclosure;
FIG. 2 is a flowchart of a method of generating a reference image provided by an embodiment of the present disclosure;
FIGS. 3A-3D are schematic diagrams of a method of generating a reference image provided by an embodiment of the present disclosure;
FIG. 3E is a schematic diagram of an interface provided by an embodiment of the present disclosure;
FIG. 3F is a schematic diagram of yet another interface provided by an embodiment of the present disclosure;
FIG. 4 is a flowchart of a method of determining the gaze-direction differences between an eye image and reference images, and the influence weights of the reference images on the gaze estimate for the eye image, provided by an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a gaze estimation model provided by an embodiment of the present disclosure;
FIG. 6 is a schematic structural diagram of a gaze estimation apparatus provided by an embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of a terminal device in an embodiment of the present disclosure.
Detailed Description
In order that the above objects, features, and advantages of the present disclosure may be more clearly understood, aspects of the present disclosure are further described below. It should be noted that, unless they conflict, the embodiments of the present disclosure and the features in the embodiments may be combined with each other.
In the following description, numerous specific details are set forth to provide a thorough understanding of the present disclosure; however, the present disclosure may also be practiced in ways other than those described here. Clearly, the embodiments in this specification are only some, not all, of the embodiments of the disclosure.
One embodiment of the present disclosure provides a gaze estimation method that can estimate the gaze direction of a subject (e.g., a living being with a visual organ, such as a human or an animal) from an eye image of the subject. The method can be applied to any scenario that requires gaze estimation, such as, but not limited to, gaze tracking.
By way of example, FIG. 1 is a flowchart of a gaze estimation method provided by an embodiment of the present disclosure. The method may be performed by a terminal device, or by a program product or model carried on the terminal device. The terminal device may be understood as any device with image processing and image capture capabilities, such as a mobile phone, tablet computer, notebook computer, desktop computer, wearable electronic device, or smart home device, and the program product or model may be any product or model with a gaze estimation capability. As shown in FIG. 1, the method provided in this embodiment includes the following steps:
Step 101, acquiring an eye image of a subject.
References to "subject" in accordance with embodiments of the present disclosure are to be understood as an organism having a visual organ, e.g., a human or other animal, etc.
Reference to an "ocular image" in accordance with embodiments of the present disclosure may be understood to include an image of an ocular organ. The image may not be limited to include only the ocular organ of the subject, but may include other portions of the subject. Such as the face, torso, etc. of the subject.
It should be noted that the eye image may include not only an image of an eye organ of one subject, but also images of eye organs of a plurality of subjects.
The eye image in this embodiment may be obtained through various channels or in various ways. In one possible implementation, the eye image of the subject is obtained from a preset database; in another, it is captured by the terminal device itself or by a capture apparatus mounted on another device. These examples are illustrative rather than limiting: in practice, the channel and manner of acquiring the eye image can be set as needed and need not be restricted to any particular one.
Step 102, determining, based on the eye image of the subject and at least one reference image of the subject's eye, a gaze-direction difference between the eye in the eye image and the eye in each reference image, and an influence weight of each reference image on the gaze estimate for the eye image, where each reference image carries annotation information indicating the gaze direction of the eye in that reference image.
A reference image in this embodiment contains at least an image of the eye of the subject referred to in step 101 and annotation information for the subject's gaze direction.
The reference image may be obtained directly from a preset database or from another device, or may be generated by the terminal device itself; this implementation places no specific limitation here. The reference image may also be obtained in advance.
There are various ways to determine, based on the eye image of the subject and at least one reference image of the subject's eye, the gaze-direction difference between the eye in the eye image and the eye in each reference image and the influence weight of each reference image on the gaze estimate for the eye image. In one possible method, the eye image of the subject and the previously obtained reference images are used as input data for a pre-trained first model and a pre-trained second model; the first model outputs the gaze-direction difference between the subject's eye in the eye image and in each reference image, and the second model outputs the influence weight of each reference image on the gaze estimate for the eye image. In another possible method, the eye image of the subject and the previously obtained reference images are input into a pre-trained third model, which outputs both the gaze-direction differences and the influence weights. Of course, these two methods are merely examples and are not limiting. The first, second, and third models can be trained with standard model training methods, which are not described in detail in this embodiment.
It should be noted that the first, second, and third models may also be trained to recognize subjects or eye organs. When the eye image obtained in step 101 includes the eye organs of multiple subjects, these models can be used to identify the eye organ of the same subject in both the eye image and the reference images, for example, to identify the eye organ of subject A in the eye image and then identify the eye organ of subject A in the reference images.
Step 103, determining the gaze direction of the eye in the eye image based on the gaze-direction differences between the eye in the eye image and the eye in each reference image, the influence weights of the reference images on the gaze estimate for the eye image, and the annotation information on each reference image.
For example, assume that the gaze-direction difference between the eye organ in the eye image obtained in step 101 and the eye organ of the same subject in reference image t1 is d1, and the difference with respect to reference image t2 is d2; the gaze direction annotated on reference image t1 is g1 and on reference image t2 is g2; and the influence weight of reference image t1 on the gaze estimate for the eye image is w1 while that of reference image t2 is w2. Then the gaze estimate of the eye image relative to reference image t1 can be written as d1+g1, the gaze estimate relative to reference image t2 as d2+g2, and the gaze direction in the eye image as w1(d1+g1)+w2(d2+g2). That is, in this embodiment, the gaze estimate of the eye image relative to each reference image can be determined from the gaze-direction difference between the eye image and that reference image together with the gaze direction annotated on that reference image, and the gaze direction of the eye in the eye image can then be determined from the influence weights and these per-reference gaze estimates. Specifically, for each reference image, the gaze direction annotated on the reference image and the gaze-direction difference between the eye in the eye image and the eye in the reference image are summed to obtain the gaze estimate of the eye image relative to that reference image; this estimate is then weighted by the reference image's influence weight to obtain a weighted value; finally, once the weighted values for all reference images have been obtained, they are summed to obtain the gaze direction of the eye in the eye image.
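As a concrete illustration, the following is a minimal Python sketch of this combination rule, assuming gaze directions are represented as (yaw, pitch) vectors; the numeric values are invented for illustration and do not come from this disclosure.

import numpy as np

# Illustrative values only: g_i are the gaze directions annotated on the
# reference images, d_i the predicted differences, w_i the influence weights.
g = np.array([[0.10, -0.05],   # g1: annotated gaze direction of reference image t1
              [-0.20, 0.15]])  # g2: annotated gaze direction of reference image t2
d = np.array([[0.02, 0.01],    # d1: difference of the eye image w.r.t. t1
              [0.12, -0.19]])  # d2: difference of the eye image w.r.t. t2
w = np.array([0.7, 0.3])       # w1, w2 (assumed here to sum to 1)

per_ref = d + g                             # per-reference estimates d_i + g_i
gaze = (w[:, None] * per_ref).sum(axis=0)   # w1*(d1+g1) + w2*(d2+g2)
print(gaze)                                 # gaze direction of the eye in the eye image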
Of course, the foregoing is merely illustrative and not restrictive. In other embodiments the number of reference images need not be two; for other numbers of reference images, the gaze direction is determined in the same way as in the two-reference-image case, which is not repeated here.
In the embodiments of the present disclosure, an eye image of a subject is acquired; based on the eye image and at least one reference image of the subject's eye, the gaze-direction difference between the eye organ in the eye image and the eye organ in each reference image is determined, together with the influence weight of each reference image on the gaze estimate for the eye image; the gaze direction of the eye organ in the eye image is then determined from these differences, the influence weights, and the gaze-direction annotation information carried by each reference image. By combining the gaze-direction differences against multiple reference images with the influence weights of those reference images, the gaze estimate for the subject's eye image is corrected, which can improve the accuracy of gaze estimation.
By way of example, FIG. 2 is a flowchart of a method of generating a reference image provided by an embodiment of the present disclosure. As shown in FIG. 2, the method includes the following steps:
Step 201, in response to receiving a shooting instruction, providing a shooting interface to the subject, the shooting interface including prompt information for indicating a gaze direction.
Step 202, obtaining a captured eye image of the subject through the shooting interface.
Step 203, annotating the captured eye image with the gaze direction indicated by the prompt information, thereby generating a reference image of the subject's eye.
For example, FIGS. 3A-3D are schematic diagrams of a method of generating a reference image according to an embodiment of the disclosure. As shown in FIGS. 3A-3D, when the "first key" in FIG. 3A is triggered, the terminal device enters the shooting interface shown in FIG. 3B, which includes at least an image acquisition area, a second key, and prompt information for indicating a gaze direction. The image acquisition area may be an area of any shape in which the image of the subject is displayed. The prompt information in FIG. 3B is used to guide the subject's gaze so that the subject's gaze direction coincides with the prompted direction. The prompt information includes at least the gaze direction; in other embodiments it may also include, for example, a prompt about the distance between the subject and the terminal device. This is, of course, merely illustrative: the content of the prompt information in this embodiment can be set as needed and is not limited to any specific content. In FIG. 3B, once the subject's gaze reaches the prompted direction in accordance with the prompt information, shooting can be triggered via the "second key" to obtain the captured eye image shown in FIG. 3C (the captured image is not limited to containing only the subject's eyes). A reference image as referred to in this embodiment, such as the one shown in FIG. 3D, is then obtained by annotating the captured eye image of FIG. 3C with the gaze direction prompted in FIG. 3B. The manner of annotating the gaze direction on the captured eye image can be set as needed, and this embodiment does not limit it.
It will be appreciated that FIGS. 3A-3D are merely exemplary interface diagrams, not the only ones provided by embodiments of the present disclosure. Indeed, the interfaces shown in FIGS. 3A-3D may be modified, replaced, or removed as needed, and in some other embodiments interfaces for additional steps may be inserted into the flow of FIGS. 3A-3D. For example, the interface shown in FIG. 3A may be replaced by an interface that prompts the user to issue a voice instruction, with the interface of FIG. 3B entered when the corresponding voice instruction is received. As another example, the interface of FIG. 3B may be replaced by the interface shown in FIG. 3E, which may include a graphic for attracting the subject's gaze (shown as a black triangle in FIG. 3E by way of example), the second key, and text prompting the subject to look at the graphic. As yet another example, when the captured eye image acquired in FIG. 3B does not meet preset requirements (e.g., the illumination is too low, no eye organ is included, or the distance between the subject and the terminal device is too large or too small), an interface as shown in FIG. 3F, containing a re-shooting prompt, may be presented after FIG. 3B and before FIG. 3C. The above are, of course, merely examples and do not limit the method or flow of generating reference images.
This embodiment thus realizes an automated method of generating reference images. By providing the subject with a shooting interface that displays prompt information indicating a gaze direction, the subject's gaze is guided to coincide with the prompted direction. This improves the accuracy of the annotation information on the reference images, which in turn ensures that the reference images correct the gaze estimation result accurately and improves the accuracy of gaze estimation. A data-structure sketch of the resulting reference sample is given below.
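To make the result of steps 201-203 concrete, the following is a minimal Python sketch of pairing a captured eye image with the prompted gaze direction; the class and field names and the (yaw, pitch) representation are assumptions for illustration, not details fixed by this disclosure.

from dataclasses import dataclass
from typing import Tuple
import numpy as np

@dataclass
class ReferenceImage:
    pixels: np.ndarray      # the captured eye image, e.g. an HxWx3 uint8 array
    gaze_label: np.ndarray  # the prompted gaze direction, e.g. (yaw, pitch)

def make_reference(captured: np.ndarray, prompted_gaze: Tuple[float, float]) -> ReferenceImage:
    """Attach the gaze direction prompted on the shooting interface to the
    captured eye image, yielding an annotated reference image."""
    return ReferenceImage(pixels=captured, gaze_label=np.asarray(prompted_gaze))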
FIG. 4 is a flowchart of a method of determining the gaze-direction differences between an eye image and reference images and the influence weights of the reference images on the gaze estimate for the eye image, according to an embodiment of the present disclosure. As shown in FIG. 4, the method includes:
Step 401, processing the eye image and the at least one reference image to obtain a feature map for the eye image and a feature map for each reference image.
Step 402, processing the feature maps of the eye image and the at least one reference image to obtain the gaze-direction difference between the eye in the eye image and the eye in each reference image, and the influence weight of each reference image on the gaze estimate for the eye image.
For example, in one implementation of the embodiment of the present disclosure, the method of this embodiment may be performed by a gaze estimation model carried on the terminal device. FIG. 5 is a schematic structural diagram of a gaze estimation model provided by an embodiment of the present disclosure; the model includes at least a convolution layer, a first network, and a second network. As shown in FIG. 5, after the subject's eye image and m reference images are input into the model of FIG. 5, the convolution layer performs convolution on the eye image and on each reference image, yielding a feature map F0 for the eye image and feature maps F1 to Fm for the reference images. Feature map F0 and feature maps F1 to Fm are then fed as input data into the first network and the second network. In the first network, F0 is stitched with each of F1 to Fm to obtain m stitched images, and a preset linear transformation (for example, weighting and matrix multiplication of the stitched images, although the linear transformations are not limited to those listed here) is applied to the m stitched images to obtain the eye features of each stitched image; the eye features obtained here comprise the eye features of the eye image's feature map F0 and the eye features of the reference-image feature map included in the stitched image. The eye features of each stitched image are then fed into a fully connected layer of the first network, which outputs the gaze-direction differences d1 to dm between the eye image and each reference image. Similarly, in the second network, F0 is stitched with each of F1 to Fm to obtain m stitched images, a preset linear transformation (for example, weighting and matrix multiplication, again not limited to these) is applied to obtain the eye features of each stitched image, and these eye features are fed into a fully connected layer of the second network, which outputs the influence weights w1 to wm of the reference images on the gaze estimate for the eye image. If the gaze directions annotated on reference images 1 to m are g1 to gm respectively, the gaze estimate of the eye image relative to the i-th reference image can be written as di+gi, and, weighted by the influence weight wi of the i-th reference image on the gaze estimate for the eye image, the weighted result for the i-th reference image can be written as wi(di+gi). The m weighted results corresponding to the m reference images are then summed to obtain the gaze direction in the eye image. In other words, in the above example, the first network and the second network are two pre-trained networks: the first network is used to obtain the gaze-direction differences between the eye image and the reference images, and the second network is used to obtain the influence weights of the reference images on the gaze estimate for the eye image.
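The following is a schematic PyTorch sketch of the structure in FIG. 5, offered only as a reading aid. The backbone layer sizes, the flattening step, the softmax normalization of the weights, and the sharing of a single linear transformation between the two heads are illustrative assumptions: the disclosure describes each network performing its own stitching and linear transformation and fixes none of these hyperparameters.

import torch
import torch.nn as nn

class GazeEstimator(nn.Module):
    def __init__(self, feat_dim: int = 128):
        super().__init__()
        # Convolution layer shared across inputs; produces F0 and F1..Fm.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(32 * 4 * 4, feat_dim),
        )
        self.fuse = nn.Linear(2 * feat_dim, feat_dim)  # linear transform of the stitched features
        self.diff_head = nn.Linear(feat_dim, 2)        # first network: outputs d_i as (yaw, pitch)
        self.weight_head = nn.Linear(feat_dim, 1)      # second network: outputs scalar w_i

    def forward(self, eye: torch.Tensor, refs: torch.Tensor, ref_gazes: torch.Tensor):
        # eye: (B, 3, H, W); refs: (B, m, 3, H, W); ref_gazes: (B, m, 2) = g_1..g_m
        B, m = refs.shape[:2]
        f0 = self.backbone(eye)                                   # F0: (B, feat_dim)
        fr = self.backbone(refs.flatten(0, 1)).view(B, m, -1)     # F1..Fm: (B, m, feat_dim)
        stitched = torch.cat([f0.unsqueeze(1).expand(-1, m, -1), fr], dim=-1)
        fused = torch.relu(self.fuse(stitched))                   # eye features per stitched image
        d = self.diff_head(fused)                                 # d_1..d_m: (B, m, 2)
        w = torch.softmax(self.weight_head(fused).squeeze(-1), dim=1)  # w_1..w_m: (B, m)
        # Final gaze direction: sum_i w_i * (d_i + g_i)
        return (w.unsqueeze(-1) * (d + ref_gazes)).sum(dim=1)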
In an example of this embodiment, the method of determining the gaze-direction difference between the eye in the eye image and the eye in each reference image, and the influence weight of each reference image on the gaze estimate for the eye image, may include: for each reference image, stitching the feature map of the reference image with the feature map of the eye image to obtain a stitched image; applying a linear transformation to the stitched image to obtain the eye features of the stitched image; and, from the eye features, obtaining the gaze-direction difference between the eye in the eye image and the eye in the reference image, and the influence weight of the reference image on the gaze estimate for the eye image.
It should be noted that although in FIG. 5 the inputs to the first and second networks are the feature maps of the subject's eye image and of the reference images, and for each reference image both networks first stitch the eye image's feature map with the reference image's feature map and then apply a linear transformation to the resulting stitched image, in other embodiments the stitching and linear transformation may instead be performed by another preset network or model. For example, that network or model may, for each reference image, first stitch the feature maps of the eye image and the reference image to obtain a stitched image and then apply a linear transformation to obtain the eye features of the stitched image. The eye features obtained by that network or model are then input into the first network and the second network respectively: the first network outputs the gaze-direction difference between the subject's eye image and the reference image, and the second network outputs the influence weight of the reference image on the gaze estimate for the subject's eye image.
Of course, the implementation in the above example is only one possible implementation of this embodiment, not all of them. In fact, any method that processes the subject's eye image and the reference images of the subject's eye to obtain the gaze-direction difference between the eye in the eye image and the eye in a reference image, and the influence weight of the reference image on the gaze estimate for the eye image, may be adopted by this embodiment.
By analyzing the gaze-direction difference and the influence weight with two separate networks, this embodiment prevents the two analyses from interfering with each other and improves the accuracy of each individual analysis result.
FIG. 6 is a schematic structural diagram of a gaze estimation apparatus provided by an embodiment of the present disclosure, which may be understood as the terminal device in the above embodiments, or as some of the functional modules in that terminal device. As shown in FIG. 6, the gaze estimation apparatus 60 includes:
an acquisition module 61, configured to acquire an eye image of a subject;
a first determining module 62, configured to determine, based on the eye image and at least one reference image of the subject's eye, a gaze-direction difference between the eye in the eye image and the eye in each reference image, and an influence weight of each reference image on the gaze estimate for the eye image, where each reference image carries annotation information indicating the gaze direction of the eye in that reference image;
a second determining module 63, configured to determine the gaze direction of the eye in the eye image based on the gaze-direction differences, the influence weights, and the annotation information on each reference image.
In one embodiment, the gaze estimation apparatus 60 further includes:
a subject shooting module, configured to provide a shooting interface to the subject when a shooting instruction is received, the shooting interface including prompt information for indicating a gaze direction, and to obtain a captured eye image of the subject through the shooting interface;
and a generating module, configured to annotate the captured eye image with the gaze direction indicated by the prompt information and generate a reference image of the subject's eye.
In one embodiment, the first determining module 62 is configured to:
processing the eye image and the at least one reference image to obtain a feature map for the eye image and a feature map for each reference image;
and processing the feature maps of the eye image and the at least one reference image to obtain the gaze-direction difference between the eye in the eye image and the eye in each reference image, and the influence weight of each reference image on the gaze estimate for the eye image.
In one embodiment, the first determining module 62 is configured to:
for each reference image, stitching the feature map of the reference image with the feature map of the eye image to obtain a stitched image;
applying a linear transformation to the stitched image to obtain the eye features of the stitched image;
processing the eye features with a first network to obtain the gaze-direction difference between the eye in the eye image and the eye in the reference image;
and processing the eye features with a second network to obtain the influence weight of the reference image on the gaze estimate for the eye image.
In one embodiment, the second determining module 63 includes:
a first determining sub-module, configured to determine a gaze estimate of the eye image relative to each reference image based on the gaze-direction difference between the eye in the eye image and the eye in that reference image, and the annotation information on that reference image;
and a second determining sub-module, configured to determine the gaze direction of the eye in the eye image based on the influence weights of the reference images on the gaze estimate for the eye image, and the gaze estimates of the eye image relative to the reference images.
In one embodiment, the first determining sub-module is configured to:
for each reference image, sum the gaze direction annotated on the reference image and the gaze-direction difference between the eye in the eye image and the eye in the reference image, to obtain the gaze estimate of the eye image relative to that reference image.
In one embodiment, the second determining sub-module is configured to:
for each reference image, weight the gaze estimate of the eye image relative to the reference image by the influence weight of the reference image on the gaze estimate for the eye image, to obtain a weighted value of the gaze estimate of the eye image relative to that reference image;
and sum the weighted values of the gaze estimates of the eye image relative to all the reference images, to obtain the gaze direction of the eye in the eye image.
The apparatus provided in this embodiment can perform the method of any of the embodiments of FIGS. 1-5; its implementation and beneficial effects are similar and are not repeated here.
An embodiment of the present disclosure further provides a terminal device including a processor and a memory, the memory storing a computer program which, when executed by the processor, implements the method of any of the embodiments of FIGS. 1-5; the manner of execution and the beneficial effects are similar and are not repeated here.
FIG. 7 is a schematic structural diagram of a terminal device in an embodiment of the disclosure. Referring now in particular to FIG. 7, a schematic diagram of a terminal device 1000 suitable for implementing embodiments of the present disclosure is shown. Terminal device 1000 in embodiments of the present disclosure may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., in-vehicle navigation terminals), and wearable electronic devices, and stationary terminals such as digital TVs, desktop computers, and smart home devices. The terminal device shown in FIG. 7 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in fig. 7, terminal device 1000 can include a processing means (e.g., a central processor, a graphics processor, etc.) 1001 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 1002 or a program loaded from a storage means 1008 into a Random Access Memory (RAM) 1003. In the RAM 1003, various programs and data necessary for the operation of the terminal device 1000 are also stored. The processing device 1001, the ROM 1002, and the RAM 1003 are connected to each other by a bus 1004. An input/output (I/O) interface 1005 is also connected to bus 1004.
In general, the following devices may be connected to the I/O interface 1005: input devices 1006 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, and the like; an output device 1007 including, for example, a liquid crystal display (LCD), speaker, vibrator, etc.; storage 1008 including, for example, magnetic tape, hard disk, etc.; and communication means 1009. Communication means 1009 may allow terminal device 1000 to communicate wirelessly or by wire with other devices to exchange data. While FIG. 7 shows terminal device 1000 with various means, it is to be understood that not all of the illustrated means are required to be implemented or provided; more or fewer devices may be implemented or provided instead.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a non-transitory computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 1009, or installed from the storage device 1008, or installed from the ROM 1002. The above-described functions defined in the method of the embodiment of the present disclosure are performed when the computer program is executed by the processing device 1001.
It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
In some implementations, clients and servers may communicate using any currently known or future-developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected by digital data communication in any form or medium (e.g., a communication network). Examples of communication networks include local area networks ("LAN"), wide area networks ("WAN"), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed networks.
The computer-readable medium may be contained in the terminal device, or may exist separately without being incorporated into the terminal device.
The computer-readable medium carries one or more programs which, when executed by the terminal device, cause the terminal device to: acquire an eye image of a subject; determine, based on the eye image and at least one reference image of the subject's eye, a gaze-direction difference between the eye in the eye image and the eye in each reference image, and an influence weight of each reference image on the gaze estimate for the eye image, each reference image carrying annotation information indicating the gaze direction of the eye in that reference image; and determine the gaze direction of the eye in the eye image based on the gaze-direction differences, the influence weights, and the annotation information on each reference image.
The embodiments of the present disclosure also provide a computer program product comprising computer program code which, when executed by a processor, can perform the method of the embodiments of FIGS. 1-5 described above.
Computer program code for carrying out the operations of the present disclosure may be written in one or more programming languages, including, but not limited to, object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, or partly on the user's computer and partly on a remote computer, and so on. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present disclosure may be implemented by means of software, or may be implemented by means of hardware. Wherein the names of the units do not constitute a limitation of the units themselves in some cases.
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The embodiments of the present disclosure further provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method of any of the embodiments of FIGS. 1-5; the manner of implementation and the beneficial effects are similar and are not repeated here.
It should be noted that, in this document, relational terms such as "first" and "second" are used solely to distinguish one entity or action from another and do not necessarily require or imply any such actual relationship or order between these entities or actions. Moreover, the terms "comprises," "comprising," and any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such a process, method, article, or apparatus. Without further limitation, an element qualified by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The foregoing describes only specific embodiments of the disclosure, so that those skilled in the art can understand or practice the disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown and described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (14)

1. A gaze estimation method, comprising:
acquiring an eye image of a subject;
determining, based on the eye image and at least one reference image of the subject's eye, a gaze-direction difference between the eye in the eye image and the eye in each reference image, and an influence weight of each reference image on a gaze estimate for the eye image, each reference image carrying annotation information indicating the gaze direction of the eye in that reference image; and
determining a gaze direction of the eye in the eye image based on the gaze-direction differences between the eye in the eye image and the eye in each reference image, the influence weights of the reference images on the gaze estimate for the eye image, and the annotation information on each reference image;
wherein the determining, based on the eye image and at least one reference image of the subject's eye, of the gaze-direction difference between the eye in the eye image and the eye in each reference image and the influence weight of each reference image on the gaze estimate for the eye image comprises:
processing the eye image and the at least one reference image to obtain a feature map for the eye image and a feature map for each reference image; and
processing the feature maps of the eye image and the at least one reference image to obtain the gaze-direction difference between the eye in the eye image and the eye in each reference image, and the influence weight of each reference image on the gaze estimate for the eye image.
2. The method according to claim 1, wherein, before the acquiring of the eye image of the subject, the method further comprises:
in response to receiving a shooting instruction, providing a shooting interface to the subject, the shooting interface including prompt information for indicating a gaze direction;
obtaining a captured eye image of the subject through the shooting interface; and
annotating the captured eye image with the gaze direction indicated by the prompt information, to generate a reference image of the subject's eye.
3. The method according to claim 1, wherein the processing of the feature maps of the eye image and the at least one reference image to obtain the gaze-direction difference between the eye in the eye image and the eye in each reference image, and the influence weight of each reference image on the gaze estimate for the eye image, comprises:
for each reference image, stitching the feature map of the reference image with the feature map of the eye image to obtain a stitched image;
applying a linear transformation to the stitched image to obtain eye features of the stitched image;
processing the eye features with a first network to obtain the gaze-direction difference between the eye in the eye image and the eye in the reference image; and
processing the eye features with a second network to obtain the influence weight of the reference image on the gaze estimate for the eye image.
4. The method according to claim 1 or 2, wherein the determining of the gaze direction of the eye in the eye image based on the gaze-direction differences between the eye in the eye image and the eye in each reference image, the influence weights of the reference images on the gaze estimate for the eye image, and the annotation information on each reference image, comprises:
determining a gaze estimate of the eye image relative to each reference image based on the gaze-direction difference between the eye in the eye image and the eye in that reference image, and the annotation information on that reference image; and
determining the gaze direction of the eye in the eye image based on the influence weights of the reference images on the gaze estimate for the eye image, and the gaze estimates of the eye image relative to the reference images.
5. The method according to claim 4, wherein the determining of the gaze estimate of the eye image relative to each reference image based on the gaze-direction difference between the eye in the eye image and the eye in that reference image, and the annotation information on that reference image, comprises:
for each reference image, summing the gaze direction annotated on the reference image and the gaze-direction difference between the eye in the eye image and the eye in the reference image, to obtain the gaze estimate of the eye image relative to that reference image.
6. The method according to claim 4, wherein the determining of the gaze direction of the eye in the eye image based on the influence weights of the reference images on the gaze estimate for the eye image, and the gaze estimates of the eye image relative to the reference images, comprises:
for each reference image, weighting the gaze estimate of the eye image relative to the reference image by the influence weight of the reference image on the gaze estimate for the eye image, to obtain a weighted value of the gaze estimate of the eye image relative to that reference image; and
summing the weighted values of the gaze estimates of the eye image relative to all the reference images, to obtain the gaze direction of the eye in the eye image.
7. A gaze estimation apparatus, comprising:
an acquisition module, configured to acquire an eye image of a subject;
a first determining module, configured to determine, based on the eye image and at least one reference image of the subject's eye, a gaze-direction difference between the eye in the eye image and the eye in each reference image, and an influence weight of each reference image on a gaze estimate for the eye image, each reference image carrying annotation information indicating the gaze direction of the eye in that reference image; and
a second determining module, configured to determine a gaze direction of the eye in the eye image based on the gaze-direction differences between the eye in the eye image and the eye in each reference image, the influence weights of the reference images on the gaze estimate for the eye image, and the annotation information on each reference image;
wherein the first determining module is configured to:
process the eye image and the at least one reference image to obtain a feature map for the eye image and a feature map for each reference image; and
process the feature maps of the eye image and the at least one reference image to obtain the gaze-direction difference between the eye in the eye image and the eye in each reference image, and the influence weight of each reference image on the gaze estimate for the eye image.
8. The apparatus of claim 7, wherein the apparatus further comprises:
a subject shooting module, configured to provide a shooting interface for the subject when a shooting instruction is received, wherein the shooting interface includes prompt information for prompting a sight line direction, and to obtain a captured eye image of the subject based on the shooting interface;
and a generating module, configured to mark the sight line direction prompted by the prompt information on the captured eye image, so as to generate a reference image of the subject's eye.
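As a hedged illustration of how such a module might assemble labeled reference images, here is a short sketch; the prompt directions, the capture_eye_image helper, and the data layout are assumptions for illustration, not the patent's:

```python
# Prompt the subject to look toward known directions, capture one eye image
# per prompt, and store the prompted direction as the image's marker information.
CALIBRATION_DIRECTIONS = [(0.0, 0.0), (0.3, 0.0), (-0.3, 0.0), (0.0, 0.2)]  # (yaw, pitch)

def build_reference_set(capture_eye_image):
    references = []
    for yaw, pitch in CALIBRATION_DIRECTIONS:
        print(f"Please look toward (yaw={yaw}, pitch={pitch})")  # prompt information
        image = capture_eye_image()                              # shot via the interface
        references.append({"image": image, "marker": (yaw, pitch)})
    return references
```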
9. The apparatus of claim 8, wherein the first determining module is configured to:
for each reference image, stitch the feature maps of the reference image and the eye image to obtain a stitched feature map;
perform a linear transformation on the stitched feature map to obtain an eye feature of the stitched feature map;
process the eye feature based on a first network to obtain the difference in sight line direction between the eye on the eye image and the eye on the reference image;
and process the eye feature based on a second network to obtain the influence weight of the reference image on the sight line estimate of the eye image.
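A minimal PyTorch sketch of this stitch-transform-two-network pipeline, under assumed feature-map shapes and layer sizes; class and attribute names are ours, not the patent's:

```python
import torch
import torch.nn as nn

class DifferentialGazeHead(nn.Module):
    def __init__(self, channels=64, spatial=6 * 6, feat_dim=128):
        super().__init__()
        # linear transformation over the stitched feature map -> eye feature
        self.to_eye_feature = nn.Linear(2 * channels * spatial, feat_dim)
        self.first_network = nn.Linear(feat_dim, 2)   # sight line difference (yaw, pitch)
        self.second_network = nn.Linear(feat_dim, 1)  # unnormalized influence weight

    def forward(self, eye_fmap, ref_fmap):
        # stitch the two feature maps along channels, then flatten for the linear layer
        stitched = torch.cat([eye_fmap, ref_fmap], dim=1).flatten(1)
        eye_feature = torch.relu(self.to_eye_feature(stitched))
        return self.first_network(eye_feature), self.second_network(eye_feature)

# Usage with dummy feature maps for one eye image and one reference image
head = DifferentialGazeHead()
eye_fmap = torch.randn(1, 64, 6, 6)
ref_fmap = torch.randn(1, 64, 6, 6)
diff, weight_logit = head(eye_fmap, ref_fmap)
```

In practice the second network's outputs across all N reference images would be normalized (e.g., with a softmax) so the weighted per-reference estimates combine into a single sight line direction.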
10. The apparatus according to claim 7 or 8, wherein the second determining module comprises:
a first determining submodule, configured to determine a sight line estimate of the eye image relative to each reference image based on the difference in sight line direction between the eye on the eye image and the eye on each reference image, and the marker information on each reference image;
and a second determining submodule, configured to determine the sight line direction of the eye on the eye image based on the influence weight of each reference image on the sight line estimate of the eye image and the sight line estimate of the eye image relative to each reference image.
11. The apparatus of claim 10, wherein the first determining submodule is configured to:
for each reference image, sum the sight line direction marked on the reference image and the difference in sight line direction between the eye on the eye image and the eye on the reference image, to obtain the sight line estimate of the eye image relative to the reference image.
12. The apparatus of claim 11, wherein the second determining submodule is configured to:
for each reference image, weight the sight line estimate of the eye image relative to the reference image by the influence weight of the reference image on the sight line estimate of the eye image, to obtain a weighted sight line estimate of the eye image relative to the reference image;
and sum the weighted sight line estimates of the eye image relative to the reference images to obtain the sight line direction of the eye on the eye image.
13. A terminal device, comprising a processor and a memory;
wherein the memory stores a computer program which, when executed by the processor, performs the method according to any one of claims 1-6.
14. A computer-readable storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method according to any one of claims 1-6.
CN202110512902.3A 2021-05-11 2021-05-11 Sight line estimation method, device, equipment and storage medium Active CN113238652B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110512902.3A CN113238652B (en) 2021-05-11 2021-05-11 Sight line estimation method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113238652A (en) 2021-08-10
CN113238652B (en) 2023-07-14

Family

ID=77133538

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110512902.3A Active CN113238652B (en) 2021-05-11 2021-05-11 Sight line estimation method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113238652B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109074748A (en) * 2016-05-11 2018-12-21 索尼公司 Image processing equipment, image processing method and movable body
CN112040834A (en) * 2018-02-22 2020-12-04 因诺登神经科学公司 Eyeball tracking method and system

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9661215B2 (en) * 2014-04-22 2017-05-23 Snapaid Ltd. System and method for controlling a camera based on processing an image captured by other camera
US20210319585A1 (en) * 2018-08-31 2021-10-14 Eyeware Tech Sa Method and system for gaze estimation
US11514602B2 (en) * 2018-12-25 2022-11-29 Samsung Electronics Co., Ltd. Method and apparatus for gaze estimation
CN112181128A (en) * 2019-07-04 2021-01-05 北京七鑫易维信息技术有限公司 Sight estimation method and device with calculation resource self-adaption function
CN112183160A (en) * 2019-07-04 2021-01-05 北京七鑫易维科技有限公司 Sight estimation method and device
CN110458122B (en) * 2019-08-15 2022-04-22 京东方科技集团股份有限公司 Sight line calibration method, display device playing method and sight line calibration system
CN111325736B (en) * 2020-02-27 2024-02-27 成都航空职业技术学院 Eye differential image-based sight angle estimation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant