CN111178307A - Gaze direction identification method and device, electronic equipment and storage medium - Google Patents

Gaze direction identification method and device, electronic equipment and storage medium Download PDF

Info

Publication number
CN111178307A
Authority
CN
China
Prior art keywords
image
eye
sight line direction
gaze direction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911424093.XA
Other languages
Chinese (zh)
Inventor
杨大业 (Yang Daye)
宋建华 (Song Jianhua)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lenovo Beijing Ltd
Original Assignee
Lenovo Beijing Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lenovo Beijing Ltd filed Critical Lenovo Beijing Ltd
Priority to CN201911424093.XA priority Critical patent/CN111178307A/en
Publication of CN111178307A publication Critical patent/CN111178307A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18 Eye characteristics, e.g. of the iris
    • G06V40/19 Sensors therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/80 Geometric correction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18 Eye characteristics, e.g. of the iris
    • G06V40/193 Preprocessing; Feature extraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18 Eye characteristics, e.g. of the iris
    • G06V40/197 Matching; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G06T2207/30201 Face

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Ophthalmology & Optometry (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The embodiment of the application discloses a gaze direction identification method and apparatus, an electronic device and a storage medium. After an image is collected, a face region is detected in the image and an eye region image is cut out from the face region; the resolution of the eye region image is reduced; the reduced-resolution eye region image is input into a gaze direction identification model to obtain a likelihood characterization value that the line-of-sight direction of the eyes in the image belongs to each candidate line-of-sight direction, or a likelihood characterization value that the line-of-sight direction of each eye in the image belongs to each candidate line-of-sight direction; and the gaze direction of the eyes in the image is determined according to these likelihood characterization values.

Description

Gaze direction identification method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of information processing technologies, and in particular, to a gaze direction identification method and apparatus, an electronic device, and a storage medium.
Background
With the increasing demand for intelligence in electronic devices, more and more applications provide eye control functions. Eye control offers users a more convenient mode of interaction and a better user experience.
The core of eye control is obtaining the gaze point of the human eye, that is, acquiring the gaze direction. At present, the gaze direction is generally acquired from the Purkinje spot and the geometry of the eye's lens. However, this method requires an infrared camera, which makes the electronic device costly.
Therefore, how to identify the gaze direction at low cost is an urgent technical problem to be solved.
Disclosure of Invention
The application aims to provide a gaze direction identification method and apparatus, an electronic device and a storage medium, with the following technical scheme:
A gaze direction identification method, comprising:
collecting an image;
detecting a face region in the image, and cutting out an eye region image from the face region;
processing the eye region image to reduce its resolution;
inputting the reduced-resolution eye region image into a gaze direction identification model to obtain a likelihood characterization value that the line-of-sight direction of the eyes in the image belongs to each candidate line-of-sight direction, or a likelihood characterization value that the line-of-sight direction of each eye in the image belongs to each candidate line-of-sight direction;
and determining the gaze direction of the eyes in the image according to the likelihood characterization value that the line-of-sight direction of the eyes in the image belongs to each candidate line-of-sight direction, or the likelihood characterization value that the line-of-sight direction of each eye in the image belongs to each candidate line-of-sight direction.
In the above method, preferably, cutting out the eye region image from the face region includes: cutting out one frame of binocular image from the face region;
and determining the gaze direction of the eyes in the image according to the likelihood characterization values that the line-of-sight direction of the eyes in the image belongs to each candidate line-of-sight direction includes:
determining the line-of-sight direction corresponding to the likelihood characterization value indicating the greatest likelihood as the gaze direction of the eyes in the image.
In the above method, preferably, cutting out one frame of binocular image from the face region includes:
cutting out two frames of monocular images from the face region;
and stitching the two frames of monocular images into one frame of binocular image.
In the above method, preferably, cutting out the eye region image from the face region includes: cutting out two frames of monocular images from the face region;
and determining the gaze direction of the eyes in the image according to the likelihood characterization value that the line-of-sight direction of each eye in the image belongs to each candidate line-of-sight direction includes:
weighting and summing the likelihood characterization values with which the eyes' line-of-sight directions in the two frames of monocular images belong to the same line-of-sight direction, to obtain the likelihood characterization value that the line-of-sight direction of the eyes belongs to each candidate line-of-sight direction;
and determining the line-of-sight direction corresponding to the likelihood characterization value indicating the greatest likelihood as the gaze direction of the eyes in the image.
In the above method, preferably, cutting out the eye region image from the face region includes:
performing geometric correction on the face region to obtain a corrected face region;
and capturing an eye image from the corrected face region.
The above method preferably further comprises:
recording the rotation angle and the rotation direction of the face region when the face region is geometrically corrected;
and, after the line-of-sight direction of the eyes is determined:
rotating the line-of-sight direction of the eyes by the recorded rotation angle in the direction opposite to the recorded rotation direction.
In the above method, preferably, the gaze direction identification model is a three-stage convolutional neural network model, each stage has one convolutional layer, the number of convolution kernels in each convolutional layer is the same, and, for convolutional layers of two adjacent stages, the size of the convolution kernels in the later stage's convolutional layer is smaller than the size of the convolution kernels in the earlier stage's convolutional layer.
A gaze direction identification apparatus, comprising:
an acquisition module, configured to collect an image;
a detection module, configured to detect a face region in the image and cut out an eye region image from the face region;
a resolution processing module, configured to process the eye region image to reduce its resolution;
a gaze direction identification module, configured to input the reduced-resolution eye region image into a gaze direction identification model to obtain a likelihood characterization value that the line-of-sight direction of the eyes in the image belongs to each candidate line-of-sight direction, or a likelihood characterization value that the line-of-sight direction of each eye in the image belongs to each candidate line-of-sight direction;
and a gaze direction determination module, configured to determine the gaze direction of the eyes in the image according to the likelihood characterization value that the line-of-sight direction of the eyes in the image belongs to each candidate line-of-sight direction, or the likelihood characterization value that the line-of-sight direction of each eye in the image belongs to each candidate line-of-sight direction.
An electronic device, comprising:
an image acquisition device, configured to collect images;
a memory, configured to store at least one set of instructions;
and a processor, configured to invoke and execute the set of instructions in the memory, and by executing the set of instructions:
acquiring an image through the image acquisition device;
detecting a face region in the image, and cutting out an eye region image from the face region;
processing the eye region image to reduce its resolution;
inputting the reduced-resolution eye region image into a gaze direction identification model to obtain a likelihood characterization value that the line-of-sight direction of the eyes in the image belongs to each candidate line-of-sight direction, or a likelihood characterization value that the line-of-sight direction of each eye in the image belongs to each candidate line-of-sight direction;
and determining the gaze direction of the eyes in the image according to the likelihood characterization value that the line-of-sight direction of the eyes in the image belongs to each candidate line-of-sight direction, or the likelihood characterization value that the line-of-sight direction of each eye in the image belongs to each candidate line-of-sight direction.
A readable storage medium, having stored thereon a computer program which, when executed by a processor, carries out the steps of the gaze direction identification method as defined in any one of the preceding claims.
According to the above scheme, in the gaze direction identification method and apparatus, electronic device and storage medium provided by the application, after an image is collected, a face region is detected in the image and an eye region image is cut out from the face region; the resolution of the eye region image is reduced; the reduced-resolution eye region image is input into a gaze direction identification model to obtain a likelihood characterization value that the line-of-sight direction of the eyes in the image belongs to each candidate line-of-sight direction, or a likelihood characterization value that the line-of-sight direction of each eye in the image belongs to each candidate line-of-sight direction; and the gaze direction of the eyes in the image is determined according to these likelihood characterization values.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on the drawings without creative efforts.
Fig. 1 is a flowchart of an implementation of a gaze direction identification method provided in an embodiment of the present application;
fig. 2 is a flowchart of an implementation of determining a gaze direction of an eye in an image according to a likelihood characterization value that a gaze direction of each eye in the image belongs to each gaze direction, according to an embodiment of the present application;
fig. 3 is a flowchart of an implementation of cutting out an image of an eye region from a face region according to an embodiment of the present application;
fig. 4 is a schematic diagram of a network architecture of a gaze direction recognition model provided in an embodiment of the present application;
fig. 5 is a schematic structural diagram of a gaze direction identification apparatus provided in an embodiment of the present application;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings described above, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in other sequences than described or illustrated herein.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without inventive step, are within the scope of the present disclosure.
The gazing direction identification method provided by the embodiment of the application can be applied to electronic equipment, and the electronic equipment can acquire images.
Fig. 1 shows a flow chart of an implementation of a gaze direction identification method provided by the present application, which may include:
step S11: and collecting an image.
Since the user's gaze direction is to be tracked, images of the user are acquired in real time, and the following steps may be performed each time a frame of image is acquired.
Step S12: a face region is detected in the image, and an eye region image is cut out from the face region.
To locate eye features more easily, the embodiment of the application does not detect the eye region directly in the image; instead, the face region is detected in the image first, the eye region is then detected within the face region, and the eye region image is cut out.
Optionally, a face alignment algorithm based on a regression tree may be used to align a face in the image to determine a face region, after the face region is determined, an eye feature region may be extracted by using a method based on a geometric proportion, and an eye region image may be cut out according to the eye feature region. The ocular features may include, but are not limited to, at least one of: eyebrows, corners of the eye, eye contours, pupil contours, or iris contours, etc. After the eye feature region is extracted, the eye region image may be determined according to the relative positional relationship between the eye region and the eye feature region.
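For illustration, the following Python sketch shows one way this detection-and-cropping step could be implemented with the open-source dlib library, whose landmark predictor is itself an ensemble-of-regression-trees face alignment method of the kind referenced above. The model file name, the landmark indices and the crop margin are assumptions made for the sketch, not details specified by this application.

```python
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
# Assumed model file: dlib's regression-tree 68-point landmark predictor.
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def crop_eye_regions(image):
    """Detect a face region, then cut out the two eye region images."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    faces = detector(gray)
    if not faces:
        return None
    shape = predictor(gray, faces[0])
    pts = [(shape.part(i).x, shape.part(i).y) for i in range(68)]
    eyes = []
    # In the 68-point scheme, landmarks 36-41 outline one eye, 42-47 the other.
    for lo, hi in ((36, 42), (42, 48)):
        xs = [pts[i][0] for i in range(lo, hi)]
        ys = [pts[i][1] for i in range(lo, hi)]
        m = 10  # crop margin in pixels (an illustrative choice)
        y0, x0 = max(min(ys) - m, 0), max(min(xs) - m, 0)
        eyes.append(image[y0:max(ys) + m, x0:max(xs) + m])
    return eyes
```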
Step S13: the eye region image is processed to reduce the resolution of the eye region image.
To improve computational efficiency, in the embodiment of the application, after the eye region image is obtained, it is resampled to a lower resolution, which reduces the amount of subsequent computation and improves efficiency.
For example, the resolution of the eye region image may be reduced to 50 × 50. Of course, this is only an example, and other sizes are also possible, and this scheme is not particularly limited thereto.
Optionally, the eye region image may be divided into a plurality of image blocks (for example, blocks of size 2 × 2), and the average of the pixels in each block may be used as that block's pixel value, thereby reducing the resolution of the eye region image. In an optional embodiment, if the target resolution is not reached after one such resolution adjustment, the already-reduced image may be adjusted again to lower its resolution further. That is to say, in the embodiment of the application, the eye region image may be brought down to the target resolution by performing the resolution-reduction processing one or more times.
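A minimal Python sketch of this resolution-reduction step, assuming OpenCV is available: cv2.resize with INTER_AREA performs exactly this kind of block-area averaging, and the loop mirrors the repeat-until-target behaviour described above. The 50 × 50 target is the example value from the text.

```python
import cv2

def reduce_resolution(eye_img, target=(50, 50)):
    """Repeatedly halve the image by 2x2 block averaging until the target
    resolution is reached (never undershooting the target)."""
    h, w = eye_img.shape[:2]
    while (w, h) != target:
        w, h = max(w // 2, target[0]), max(h // 2, target[1])
        eye_img = cv2.resize(eye_img, (w, h), interpolation=cv2.INTER_AREA)
    return eye_img
```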
Step S14: the reduced-resolution eye region image is input into the gaze direction identification model to obtain a likelihood characterization value that the line-of-sight direction of the eyes in the image belongs to each candidate line-of-sight direction, or a likelihood characterization value that the line-of-sight direction of each eye in the image belongs to each candidate line-of-sight direction.
In the embodiment of the application, the gaze direction identification model may directly predict the likelihood characterization values with which the combined line-of-sight direction of the two eyes belongs to each candidate line-of-sight direction, or it may predict, for each eye separately, the likelihood characterization values with which that eye's line-of-sight direction belongs to each candidate line-of-sight direction.
The likelihood characterization value that the line-of-sight direction of the eyes belongs to a given line-of-sight direction may be a score or a probability. In general, the higher the score or probability for a given line-of-sight direction, the more likely it is that the eyes are looking in that direction.
The line-of-sight direction of the eyes may include, but is not limited to: center, center-left, center-right, upper-left, upper-right, lower-left, lower-right, center-up and center-down.
Step S15: the gaze direction of the eyes in the image is determined according to the likelihood characterization value that the line-of-sight direction of the eyes in the image belongs to each candidate line-of-sight direction, or the likelihood characterization value that the line-of-sight direction of each eye in the image belongs to each candidate line-of-sight direction.
In this embodiment, the "line-of-sight direction of the eyes" refers to the combined line-of-sight direction of the two eyes.
According to the gaze direction identification method, the Purkinje spot does not need to be positioned, an infrared camera is not needed, and only a common camera is needed, so that the gaze direction is identified and tracked at low cost. Moreover, in the recognition process, resolution reduction processing is performed on the eye region image, and the calculation amount of the recognition process is reduced while the recognition accuracy is ensured.
In an optional embodiment, cutting out the eye region image from the face region includes cutting out one frame of binocular image from the face region. Specifically,
one frame of binocular image can be cut out from the face region directly. However, such a crop contains a lot of irrelevant information, such as the region between the two eyes, which increases the amount of subsequent computation and also degrades the recognition accuracy.
In a preferred embodiment, two monocular images may instead be cut out from the face region and then stitched into one binocular image. Because each monocular crop generally contains only the eye region, it is more precise; the binocular image obtained by stitching the two monocular crops therefore discards a large amount of irrelevant information, reducing computation and improving the recognition accuracy.
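A short Python sketch of this stitching variant. Resizing both monocular crops to a common size before placing them side by side is an assumption made so that the concatenation is well-defined; the 25 × 50 per-eye size (yielding a 50 × 50 binocular frame) is illustrative.

```python
import cv2
import numpy as np

def stitch_binocular(left_eye, right_eye, size=(25, 50)):
    """Stitch two monocular crops into one binocular image.
    size is (width, height) per eye; the result here is 50x50."""
    left = cv2.resize(left_eye, size)
    right = cv2.resize(right_eye, size)
    return np.hstack([left, right])
```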
In an optional embodiment, after the two monocular images are cut out, they may also be input into the gaze direction identification model directly, without being stitched.
In this embodiment, regardless of whether the input is one frame of binocular image or two frames of monocular images, the gaze direction identification model outputs a likelihood characterization value that the line-of-sight direction of the eyes in the image belongs to each candidate line-of-sight direction.
Correspondingly, determining the gaze direction of the eyes in the image according to the likelihood characterization values that the line-of-sight direction of the eyes in the image belongs to each candidate line-of-sight direction includes:
determining the line-of-sight direction corresponding to the likelihood characterization value indicating the greatest likelihood as the gaze direction of the eyes in the image.
In the embodiment of the application, the reduced-resolution binocular image is input into the gaze direction identification model, which directly predicts the likelihood characterization values with which the combined line-of-sight direction of the two eyes belongs to each candidate line-of-sight direction. The line-of-sight direction corresponding to the likelihood characterization value indicating the greatest likelihood may then be determined as the gaze direction of the eyes in the image.
For example, suppose the scores output by the gaze direction identification model for the nine line-of-sight directions center, center-left, center-right, upper-left, upper-right, lower-left, lower-right, center-up and center-down are, in order: 2, 3, 4, 9, 1, 5, 8, 7 and 6. The upper-left direction has the highest score (9), so the gaze direction of the eyes in the image can be determined to be upper-left.
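The selection step is a simple argmax over the nine scores, as the following Python sketch of the example above shows (the direction labels are illustrative names, not identifiers defined by this application):

```python
import numpy as np

DIRECTIONS = ["center", "center-left", "center-right", "upper-left",
              "upper-right", "lower-left", "lower-right", "center-up",
              "center-down"]

scores = np.array([2, 3, 4, 9, 1, 5, 8, 7, 6])   # example scores from above
print(DIRECTIONS[int(np.argmax(scores))])         # -> "upper-left"
```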
In an optional embodiment, one implementation of cutting out the eye region image from the face region is: cutting out two frames of monocular images from the face region.
In the embodiment of the application, the two reduced-resolution monocular images are input into the gaze direction identification model, which outputs a likelihood characterization value that the line-of-sight direction of each eye belongs to each candidate line-of-sight direction.
In an alternative embodiment, the input to the gaze direction identification model may also be one frame of binocular image cut out from the face region.
In this embodiment, regardless of whether the input is one frame of binocular image or two frames of monocular images, the gaze direction identification model outputs a likelihood characterization value that the line-of-sight direction of each eye in the image belongs to each candidate line-of-sight direction.
Accordingly, an implementation flow of determining the gaze direction of the eyes in the image according to the likelihood characterization value that the line-of-sight direction of each eye in the image belongs to each candidate line-of-sight direction is shown in fig. 2 and may include:
Step S21: weighting and summing the likelihood characterization values with which the eyes' line-of-sight directions in the two frames of monocular images belong to the same line-of-sight direction, to obtain the likelihood characterization value that the line-of-sight direction of the eyes belongs to each candidate line-of-sight direction.
Optionally, the average of the likelihood characterization values with which the two monocular images' line-of-sight directions belong to the same line-of-sight direction may be computed, giving the likelihood characterization value that the line-of-sight direction of the eyes belongs to each candidate line-of-sight direction.
For example, suppose the scores output by the gaze direction identification model for the left eye over the nine line-of-sight directions center, center-left, center-right, upper-left, upper-right, lower-left, lower-right, center-up and center-down are, in order: 2, 3, 4, 6, 1, 5, 8, 7 and 9, and the scores for the right eye over the same nine directions are, in order: 2, 3, 3, 6, 1, 5, 9, 7 and 7. Then the per-direction averages of the two eyes' likelihood characterization values are: center (2+2)/2 = 2, center-left (3+3)/2 = 3, center-right (4+3)/2 = 3.5, upper-left (6+6)/2 = 6, upper-right (1+1)/2 = 1, lower-left (5+5)/2 = 5, lower-right (8+9)/2 = 8.5, center-up (7+7)/2 = 7 and center-down (9+7)/2 = 8.
Step S22: determining the line-of-sight direction corresponding to the likelihood characterization value indicating the greatest likelihood as the gaze direction of the eyes in the image.
In the above example, the average likelihood characterization value is highest for the lower-right direction (8.5), so the gaze direction of the eyes in the image can be determined to be lower-right.
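A Python sketch of this two-eye fusion using the example scores above. Equal weights of 0.5 reproduce the averaging variant; other weightings are possible under the weighted-sum formulation. The direction labels are illustrative.

```python
import numpy as np

DIRECTIONS = ["center", "center-left", "center-right", "upper-left",
              "upper-right", "lower-left", "lower-right", "center-up",
              "center-down"]

left = np.array([2, 3, 4, 6, 1, 5, 8, 7, 9])    # left-eye scores per direction
right = np.array([2, 3, 3, 6, 1, 5, 9, 7, 7])   # right-eye scores per direction

fused = 0.5 * left + 0.5 * right                 # equal weights = averaging
print(DIRECTIONS[int(np.argmax(fused))])         # -> "lower-right" (score 8.5)
```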
In practical applications, the user's head is not fixed, so the acquired image may not show a frontal face but one tilted at some angle, or the face in the image may be somewhat deformed. This interferes with eye detection, reduces the efficiency of locating eye features, and can even reduce the recognition accuracy of the gaze direction. Based on this, in an optional embodiment, an implementation flow of cutting out the eye region image from the face region, as provided in this embodiment of the application, is shown in fig. 3 and may include:
step S31: and carrying out geometric correction on the face area to obtain a corrected face area.
Optionally, the geometric correction of the face region includes a rotation of the face region such that the face region is a forward face.
Step S32: and intercepting the eye image from the corrected face area.
The eye image is intercepted in the corrected face area, so that the eye area can be conveniently positioned, the calculation amount of subsequent calculation can be reduced, and the identification precision of the gazing direction is improved.
After the geometric correction is performed on the face region, the eye region is also correspondingly subjected to the geometric correction, which changes the gaze direction of the eyes, so that when the geometric correction is performed on the face region, the rotation angle and the rotation direction of the face region are required to be recorded when the geometric correction is performed on the face region;
after the eye sight line direction is determined, the method further comprises the following steps:
and rotating the sight line direction of the eyes by the rotation angle according to the reverse direction of the rotation direction. The influence of the geometric correction of the face area on the gaze direction is avoided.
In an alternative embodiment, the gaze direction identification model may be a three-stage convolutional neural network model, each stage has one convolutional layer, the number of convolutional kernels in each convolutional layer is the same, and in convolutional layers of two adjacent stages, the size of convolutional kernels in convolutional layers of a later stage is smaller than that of convolutional kernels in convolutional layers of a previous stage.
Referring to fig. 4, fig. 4 is a schematic diagram of a network architecture of a gaze direction recognition model according to an embodiment of the present application. In this example, the gaze direction recognition model includes three convolutional layers, a fully-connected layer and an output layer. Each convolutional layer contains 24 convolutional kernels, and the sizes of the convolutional kernels become smaller as the depth of the network increases, for example, the size of the convolutional kernel in the first convolutional layer is 7 × 7, the size of the convolutional kernel in the second convolutional layer is 5 × 5, and the size of the convolutional kernel in the third convolutional layer is 3 × 3.
An activation function and a pooling layer follow each convolutional layer. The activation function may be ReLU, or a ReLU variant such as Leaky ReLU, PReLU or RReLU. The pooling layer may be a max pooling layer or, alternatively, a mean pooling layer.
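A Python (PyTorch) sketch of this architecture for illustration: the three convolutional layers with 24 kernels each and kernel sizes 7 × 7, 5 × 5 and 3 × 3 follow the text, while the 50 × 50 grayscale input, the padding, the max pooling, the fully-connected width and the nine-way output are assumptions chosen to make the dimensions consistent.

```python
import torch
import torch.nn as nn

class GazeNet(nn.Module):
    """Three-stage CNN per Fig. 4: 24 kernels per layer, shrinking kernels."""
    def __init__(self, num_directions=9):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 24, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool2d(2),                        # 50x50 -> 25x25
            nn.Conv2d(24, 24, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool2d(2),                        # 25x25 -> 12x12
            nn.Conv2d(24, 24, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                        # 12x12 -> 6x6
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(24 * 6 * 6, 128), nn.ReLU(),  # fully-connected layer
            nn.Linear(128, num_directions),         # output: per-direction scores
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# scores = GazeNet()(torch.randn(1, 1, 50, 50))  # nine per-direction scores
```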
The embodiment of the application realizes the accurate tracking and positioning of the gazing direction through a shallow convolutional neural network.
Corresponding to the method embodiment, an embodiment of the present application further provides a device for identifying a gazing direction, where a schematic structural diagram of the device for identifying a gazing direction is shown in fig. 5, and the device for identifying a gazing direction may include:
an acquisition module 51, a detection module 52, a resolution processing module 53, a gaze direction identification module 54 and a gaze direction determination module 55; wherein the content of the first and second substances,
the acquisition module 51 is used for acquiring images;
the detection module 52 is configured to detect a face region in the image, and cut out an eye region image from the face region;
the resolution processing module 53 is configured to process the eye region image to reduce the resolution of the eye region image;
the gaze direction identification module 54 is configured to input the eye region image with the reduced resolution into a gaze direction identification model, and obtain a likelihood characterization value that a gaze direction of an eye in the image belongs to each gaze direction, or obtain a likelihood characterization value that a gaze direction of each eye in the image belongs to each gaze direction;
the gaze direction determining module 55 is configured to determine the gaze direction of the eye in the image according to the likelihood characterization value that the gaze direction of the eye in the image belongs to each gaze direction, or the likelihood characterization value that the gaze direction of each eye in the image belongs to each gaze direction.
The gaze direction recognition device provided by the embodiment of the application does not need to locate the purkinje spot, does not need an infrared camera, and only needs a general camera, so that the gaze direction recognition and tracking are realized at lower cost. Moreover, in the recognition process, resolution reduction processing is performed on the eye region image, and the calculation amount of the recognition process is reduced while the recognition accuracy is ensured.
In an optional embodiment, when cutting out the eye region image from the face region, the detection module 52 is specifically configured to: cut out one frame of binocular image from the face region;
and when determining the gaze direction of the eyes in the image according to the likelihood characterization values that the line-of-sight direction of the eyes in the image belongs to each candidate line-of-sight direction, the gaze direction determination module 55 is specifically configured to:
determine the line-of-sight direction corresponding to the likelihood characterization value indicating the greatest likelihood as the gaze direction of the eyes in the image.
In an alternative embodiment, the detection module 52 may include:
a cropping unit, configured to cut out two frames of monocular images from the face region;
and a stitching unit, configured to stitch the two frames of monocular images into one frame of binocular image.
In an optional embodiment, when cutting out the eye region image from the face region, the detection module 52 is specifically configured to: cut out two frames of monocular images from the face region;
and when determining the gaze direction of the eyes in the image according to the likelihood characterization value that the line-of-sight direction of each eye in the image belongs to each candidate line-of-sight direction, the gaze direction determination module 55 is specifically configured to:
weight and sum the likelihood characterization values with which the eyes' line-of-sight directions in the two frames of monocular images belong to the same line-of-sight direction, to obtain the likelihood characterization value that the line-of-sight direction of the eyes belongs to each candidate line-of-sight direction;
and determine the line-of-sight direction corresponding to the likelihood characterization value indicating the greatest likelihood as the gaze direction of the eyes in the image.
In an optional embodiment, when cutting out the eye region image from the face region, the detection module 52 is specifically configured to:
perform geometric correction on the face region to obtain a corrected face region;
and capture an eye image from the corrected face region.
In an alternative embodiment, the detection module 52 is further configured to: record the rotation angle and the rotation direction of the face region when the face region is geometrically corrected;
and the gaze direction identification apparatus further includes a correction module, configured to rotate the line-of-sight direction of the eyes by the recorded rotation angle, in the direction opposite to the recorded rotation direction, after the gaze direction determination module 55 determines the line-of-sight direction of the eyes.
In an optional embodiment, the gaze direction identification model is a three-stage convolutional neural network model, each stage has one convolutional layer, the number of convolutional kernels in each convolutional layer is the same, and in convolutional layers of two adjacent stages, the size of convolutional kernels in convolutional layers of a later stage is smaller than that of convolutional kernels in convolutional layers of a previous stage.
Corresponding to the method embodiment, the present application further provides an electronic device, a schematic structural diagram of which is shown in fig. 6, and the electronic device may include:
an image acquisition device 60 for acquiring images;
A memory 61 for storing at least one set of instructions;
a processor 62 for invoking and executing the set of instructions in the memory, by executing the set of instructions:
acquiring an image through the image acquisition device 60;
detecting a face region in the image, and cutting out an eye region image from the face region;
processing the eye region image to reduce its resolution;
inputting the reduced-resolution eye region image into a gaze direction identification model to obtain a likelihood characterization value that the line-of-sight direction of the eyes in the image belongs to each candidate line-of-sight direction, or a likelihood characterization value that the line-of-sight direction of each eye in the image belongs to each candidate line-of-sight direction;
and determining the gaze direction of the eyes in the image according to the likelihood characterization value that the line-of-sight direction of the eyes in the image belongs to each candidate line-of-sight direction, or the likelihood characterization value that the line-of-sight direction of each eye in the image belongs to each candidate line-of-sight direction.
Optionally, the refinement and extension functions of the instruction set may be as described above.
Embodiments of the present application further provide a storage medium, where a program suitable for execution by a processor may be stored, where the program is configured to:
collecting an image;
detecting a face region in the image, and cutting out an eye region image from the face region;
processing the eye region image to reduce its resolution;
inputting the reduced-resolution eye region image into a gaze direction identification model to obtain a likelihood characterization value that the line-of-sight direction of the eyes in the image belongs to each candidate line-of-sight direction, or a likelihood characterization value that the line-of-sight direction of each eye in the image belongs to each candidate line-of-sight direction;
and determining the gaze direction of the eyes in the image according to the likelihood characterization value that the line-of-sight direction of the eyes in the image belongs to each candidate line-of-sight direction, or the likelihood characterization value that the line-of-sight direction of each eye in the image belongs to each candidate line-of-sight direction.
Optionally, the refinement and extension functions of the instruction set may be as described above.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
It should be understood that the features of the various embodiments and of the claims may be combined with one another to solve the technical problems described herein.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A gaze direction identification method, comprising:
collecting an image;
detecting a face region in the image, and cutting out an eye region image from the face region;
processing the eye region image to reduce its resolution;
inputting the reduced-resolution eye region image into a gaze direction identification model to obtain a likelihood characterization value that the line-of-sight direction of the eyes in the image belongs to each candidate line-of-sight direction, or a likelihood characterization value that the line-of-sight direction of each eye in the image belongs to each candidate line-of-sight direction;
and determining the gaze direction of the eyes in the image according to the likelihood characterization value that the line-of-sight direction of the eyes in the image belongs to each candidate line-of-sight direction, or the likelihood characterization value that the line-of-sight direction of each eye in the image belongs to each candidate line-of-sight direction.
2. The method of claim 1, wherein cutting out the eye region image from the face region comprises: cutting out one frame of binocular image from the face region;
and wherein determining the gaze direction of the eyes in the image according to the likelihood characterization values that the line-of-sight direction of the eyes in the image belongs to each candidate line-of-sight direction comprises:
determining the line-of-sight direction corresponding to the likelihood characterization value indicating the greatest likelihood as the gaze direction of the eyes in the image.
3. The method of claim 2, wherein cutting out one frame of binocular image from the face region comprises:
cutting out two frames of monocular images from the face region;
and stitching the two frames of monocular images into one frame of binocular image.
4. The method of claim 1, wherein cutting out the eye region image from the face region comprises: cutting out two frames of monocular images from the face region;
and wherein determining the gaze direction of the eyes in the image according to the likelihood characterization value that the line-of-sight direction of each eye in the image belongs to each candidate line-of-sight direction comprises:
weighting and summing the likelihood characterization values with which the eyes' line-of-sight directions in the two frames of monocular images belong to the same line-of-sight direction, to obtain the likelihood characterization value that the line-of-sight direction of the eyes belongs to each candidate line-of-sight direction;
and determining the line-of-sight direction corresponding to the likelihood characterization value indicating the greatest likelihood as the gaze direction of the eyes in the image.
5. The method according to any one of claims 1-4, wherein cutting out the eye region image from the face region comprises:
performing geometric correction on the face region to obtain a corrected face region;
and capturing an eye image from the corrected face region.
6. The method of claim 5, further comprising:
recording the rotation angle and the rotation direction of the face region when the face region is geometrically corrected;
and, after the line-of-sight direction of the eyes is determined:
rotating the line-of-sight direction of the eyes by the recorded rotation angle in the direction opposite to the recorded rotation direction.
7. The method of claim 1, wherein the gaze direction identification model is a three-stage convolutional neural network model, each stage has one convolutional layer, the number of convolution kernels in each convolutional layer is the same, and, for convolutional layers of two adjacent stages, the size of the convolution kernels in the later stage's convolutional layer is smaller than the size of the convolution kernels in the earlier stage's convolutional layer.
8. A gaze direction identification apparatus, comprising:
an acquisition module, configured to collect an image;
a detection module, configured to detect a face region in the image and cut out an eye region image from the face region;
a resolution processing module, configured to process the eye region image to reduce its resolution;
a gaze direction identification module, configured to input the reduced-resolution eye region image into a gaze direction identification model to obtain a likelihood characterization value that the line-of-sight direction of the eyes in the image belongs to each candidate line-of-sight direction, or a likelihood characterization value that the line-of-sight direction of each eye in the image belongs to each candidate line-of-sight direction;
and a gaze direction determination module, configured to determine the gaze direction of the eyes in the image according to the likelihood characterization value that the line-of-sight direction of the eyes in the image belongs to each candidate line-of-sight direction, or the likelihood characterization value that the line-of-sight direction of each eye in the image belongs to each candidate line-of-sight direction.
9. An electronic device, comprising:
an image acquisition device, configured to collect images;
a memory, configured to store at least one set of instructions;
and a processor, configured to invoke and execute the set of instructions in the memory, and by executing the set of instructions:
acquiring an image through the image acquisition device;
detecting a face region in the image, and cutting out an eye region image from the face region;
processing the eye region image to reduce its resolution;
inputting the reduced-resolution eye region image into a gaze direction identification model to obtain a likelihood characterization value that the line-of-sight direction of the eyes in the image belongs to each candidate line-of-sight direction, or a likelihood characterization value that the line-of-sight direction of each eye in the image belongs to each candidate line-of-sight direction;
and determining the gaze direction of the eyes in the image according to the likelihood characterization value that the line-of-sight direction of the eyes in the image belongs to each candidate line-of-sight direction, or the likelihood characterization value that the line-of-sight direction of each eye in the image belongs to each candidate line-of-sight direction.
10. A readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the gaze direction identification method according to any one of claims 1-7.
CN201911424093.XA 2019-12-31 2019-12-31 Gaze direction identification method and device, electronic equipment and storage medium Pending CN111178307A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911424093.XA CN111178307A (en) 2019-12-31 2019-12-31 Gaze direction identification method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911424093.XA CN111178307A (en) 2019-12-31 2019-12-31 Gaze direction identification method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111178307A true CN111178307A (en) 2020-05-19

Family

ID=70652573

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911424093.XA Pending CN111178307A (en) 2019-12-31 2019-12-31 Gaze direction identification method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111178307A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767820A (en) * 2020-06-23 2020-10-13 京东数字科技控股有限公司 Method, device, equipment and storage medium for identifying object concerned
CN113361441A (en) * 2021-06-18 2021-09-07 山东大学 Sight line area estimation method and system based on head posture and space attention

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104091155A (en) * 2014-07-04 2014-10-08 武汉工程大学 Rapid iris positioning method with illumination robustness
US20160267713A1 (en) * 2015-03-11 2016-09-15 Oculus Vr, Llc Display device with dual data drivers
CN107909057A (en) * 2017-11-30 2018-04-13 广东欧珀移动通信有限公司 Image processing method, device, electronic equipment and computer-readable recording medium
CN108345848A (en) * 2018-01-31 2018-07-31 广东欧珀移动通信有限公司 The recognition methods of user's direction of gaze and Related product
CN109240504A (en) * 2018-09-25 2019-01-18 北京旷视科技有限公司 Control method, model training method, device and electronic equipment

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104091155A (en) * 2014-07-04 2014-10-08 武汉工程大学 Rapid iris positioning method with illumination robustness
US20160267713A1 (en) * 2015-03-11 2016-09-15 Oculus Vr, Llc Display device with dual data drivers
CN107909057A (en) * 2017-11-30 2018-04-13 广东欧珀移动通信有限公司 Image processing method, device, electronic equipment and computer-readable recording medium
CN108345848A (en) * 2018-01-31 2018-07-31 广东欧珀移动通信有限公司 The recognition methods of user's direction of gaze and Related product
CN109240504A (en) * 2018-09-25 2019-01-18 北京旷视科技有限公司 Control method, model training method, device and electronic equipment

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767820A (en) * 2020-06-23 2020-10-13 京东数字科技控股有限公司 Method, device, equipment and storage medium for identifying object concerned
CN113361441A (en) * 2021-06-18 2021-09-07 山东大学 Sight line area estimation method and system based on head posture and space attention

Similar Documents

Publication Publication Date Title
US11430205B2 (en) Method and apparatus for detecting salient object in image
WO2020015468A1 (en) Image transmission method and apparatus, terminal device, and storage medium
US9508004B2 (en) Eye gaze detection apparatus, computer-readable recording medium storing eye gaze detection program and eye gaze detection method
US11132544B2 (en) Visual fatigue recognition method, visual fatigue recognition device, virtual reality apparatus and storage medium
WO2021016873A1 (en) Cascaded neural network-based attention detection method, computer device, and computer-readable storage medium
CN104794462A (en) Figure image processing method and device
JP5235691B2 (en) Information processing apparatus and information processing method
CN111598038B (en) Facial feature point detection method, device, equipment and storage medium
CN106056064A (en) Face recognition method and face recognition device
CN110070531B (en) Model training method for detecting fundus picture, and fundus picture detection method and device
CN110059666B (en) Attention detection method and device
KR101761586B1 (en) Method for detecting borderline between iris and sclera
WO2019223068A1 (en) Iris image local enhancement method, device, equipment and storage medium
US20190066311A1 (en) Object tracking
EP3699808B1 (en) Facial image detection method and terminal device
CN111178307A (en) Gaze direction identification method and device, electronic equipment and storage medium
CN109815823B (en) Data processing method and related product
WO2022227594A1 (en) Eyeball tracking method and virtual reality device
JP2019517079A (en) Shape detection
CN113642393A (en) Attention mechanism-based multi-feature fusion sight line estimation method
CN115965653A (en) Light spot tracking method and device, electronic equipment and storage medium
CN113850238B (en) Document detection method and device, electronic equipment and storage medium
WO2019223067A1 (en) Multiprocessing-based iris image enhancement method and apparatus, and device and medium
CN113409056A (en) Payment method and device, local identification equipment, face payment system and equipment
CN114255494A (en) Image processing method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination