WO2021197466A1 - Eyeball detection method, apparatus and device, and storage medium - Google Patents

Eyeball detection method, apparatus and device, and storage medium

Info

Publication number
WO2021197466A1
WO2021197466A1, PCT/CN2021/085237, CN2021085237W
Authority
WO
WIPO (PCT)
Prior art keywords
eyeball
eye image
image
target
position information
Prior art date
Application number
PCT/CN2021/085237
Other languages
French (fr)
Chinese (zh)
Inventor
张小伟
项伟
刘更代
Original Assignee
百果园技术(新加坡)有限公司
张小伟
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 百果园技术(新加坡)有限公司, 张小伟 filed Critical 百果园技术(新加坡)有限公司
Publication of WO2021197466A1 publication Critical patent/WO2021197466A1/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18Eye characteristics, e.g. of the iris
    • G06V40/19Sensors therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18Eye characteristics, e.g. of the iris
    • G06V40/193Preprocessing; Feature extraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18Eye characteristics, e.g. of the iris
    • G06V40/197Matching; Classification

Definitions

  • The embodiments of the present application relate to the field of image recognition, and in particular to an eyeball detection method, apparatus, device, and storage medium.
  • Eyeball detection technology generally includes eyeball key point positioning technology, which is an important technology in the field of image processing and computer vision.
  • Eyeball detection technology aims to accurately locate the iris and pupil in an input face image or video.
  • the eyeball detection technology mainly includes the detection of the iris boundary or the key points on the boundary and the detection of the pupil center point. Eyeball detection technology plays an important role in the fields of live entertainment, short video special effects, virtual dolls, and security.
  • Eyeball detection methods can be roughly divided into two categories, one is based on manual feature extraction methods in the traditional computer vision field, and the other is based on neural network technology.
  • The manual feature extraction methods from the traditional computer vision field mainly use image gradients to extract features, such as scale-invariant feature transform (SIFT) features, and combine them with traditional algorithms (such as the Hough transform and support vector machines) to perform iris edge detection or key point detection. This kind of scheme requires different parameters to be set for different scenarios, and its accuracy is generally low.
  • The methods based on neural network technology mainly use multi-layer convolutional neural networks to extract image features and then regress the positions of the key points. This kind of scheme is more accurate than the former, but the computational complexity of the model is high, which places high demands on computing resources. Therefore, the eyeball detection schemes in the related technology are still imperfect and need to be improved.
  • the embodiments of the present application provide eyeball detection methods, devices, equipment, and storage media, which can optimize eyeball detection solutions in related technologies.
  • The embodiment of the present application provides an eyeball detection method, which includes: acquiring a target eye image to be detected; inputting the target eye image into a pre-trained eyeball detection model, where the eyeball detection model is a convolutional neural network model including a reversible residual network; and determining the position information of key eyeball points in the target eye image according to the output result of the eyeball detection model.
  • An embodiment of the present application provides an eyeball detection device, which includes: a target eye image acquisition module configured to acquire a target eye image to be detected; an image input module configured to input the target eye image into a pre-trained eyeball detection model, where the eyeball detection model is a convolutional neural network model including a reversible residual network; and a position information determination module configured to determine the position information of the key eyeball points in the target eye image according to the output result of the eyeball detection model.
  • The embodiment of the present application provides a computer device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor implements the eyeball detection method provided in the embodiment of the present application when executing the program.
  • the embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, the eyeball detection method as provided in the embodiment of the present application is implemented.
  • FIG. 1 is a schematic flowchart of an eyeball detection method provided by an embodiment of this application
  • FIG. 2 is a schematic diagram of the distribution of key eyeball points according to an embodiment of the application
  • FIG. 3 is a schematic flowchart of another eyeball detection method provided by an embodiment of the application.
  • FIG. 4 is a schematic flowchart of another eyeball detection method provided by an embodiment of the application.
  • FIG. 5 is a schematic diagram of a flow of eyeball detection provided by an embodiment of this application.
  • FIG. 6 is a schematic diagram of a network structure of an eyeball detection model provided by an embodiment of this application.
  • FIG. 7 is a schematic structural diagram of a reversible residual network provided by an embodiment of this application.
  • FIG. 8 is a structural block diagram of an eyeball detection device provided by an embodiment of the application.
  • FIG. 9 is a structural block diagram of a computer device provided by an embodiment of this application.
  • FIG. 1 is a schematic flow chart of an eyeball detection method provided by an embodiment of the application.
  • the method can be executed by an eyeball detection device, which can be implemented by software and/or hardware, and generally can be integrated in a computer device. As shown in Figure 1, the method includes the following steps.
  • Step 101 Acquire an eye image of a target to be detected.
  • the computer device may include mobile terminal devices such as mobile phones, tablet computers, notebook computers, and personal digital assistants, and may also include other devices such as desktop computers.
  • The embodiment of the present application can effectively improve the calculation efficiency of eyeball detection while ensuring its accuracy, and the computational complexity is also effectively controlled. Therefore, the method provided in this embodiment can be widely applied to mobile computing platforms and other platforms with restricted computing resources; that is, the computer device can be a device with limited computing resources, such as a low-end (for example, low hardware configuration) mobile phone or security device. Tests have shown that such computer equipment can reach millisecond-level running speeds.
  • the solutions provided by the embodiments of the present application can be applied to a variety of application scenarios, such as tracking the direction of the user's line of sight, eye tracking, and other applications that require the use of eyeball position related information.
  • For example, it can be used for special effects, stickers, virtual dolls, and 3-dimensional (3D) expressions in live video or short video applications, and can also be used in security equipment to assist iris recognition, face recognition, liveness detection, and the like.
  • the target eye image may be an image containing human eyes.
  • the proportion of the human eye area in the entire target eye image is not limited, and the target eye image may include other parts of the facial features of the human face, or may only include human eyes, which is not limited in the embodiment of the present application.
  • The original image collected by an image capture device such as a camera generally contains the entire face, and may also contain other image information such as the background behind the person. Therefore, operations such as cropping can be performed on the original image to obtain the target eye image and reduce the amount of calculation.
  • Step 102 Input the target eye image into a pre-trained eyeball detection model, where the eyeball detection model is a convolutional neural network model including a reversible residual network.
  • the eyeball detection model used for eyeball detection used in the embodiments of the present application may be a convolutional neural network model including a reversible residual network.
  • Eyeball detection models in the related technology generally use convolutional networks with many layers, so the computational complexity is very high and they cannot be used on devices with limited computing resources. Moreover, due to the high computational complexity, the calculation speed is also greatly affected and the calculation efficiency is low, which affects the real-time performance of eyeball detection.
  • the embodiment of the present application applies the reversible residual network to the eyeball detection model.
  • One or more modules based on the reversible residual network can be set in the model, which can improve the calculation efficiency while ensuring accuracy.
  • the position of the reversible residual network in the eyeball detection model, the number of reversible residual networks, and the parameters in the reversible residual network can be set according to actual applications and scenarios, which are not limited in the embodiment of the present application.
  • the eyeball detection model may also include a convolutional layer, a pooling layer, and a fully connected layer, etc.
  • the structure of the eyeball detection model is not limited in this embodiment.
  • The convolutional layers can be recombined and redesigned to balance the accuracy and complexity of the neural network, reducing the complexity of the network while maintaining accuracy.
  • The network structure corresponding to the eyeball detection model can be determined according to actual needs to obtain an eyeball detection training model, and training data can be used to train this model to optimize the values of multiple parameters in it, thereby obtaining a trained eyeball detection model, that is, the pre-trained eyeball detection model in the embodiment of the present application.
  • Step 103 Determine position information of key eyeball points in the target eye image according to the output result of the eyeball detection model.
  • the key points of the eyeball in the target eye image may include, for example, points around the iris, and may also include the center point of the pupil.
  • the number of key points of the eyeball is not limited, for example, 20, which may include 19 points on the periphery of the iris and the center point of the pupil.
  • the position information of the eyeball key points in the target eye image may include information related to the position of the eyeball key points, such as coordinate information of the eyeball key points, and visibility information of the eyeball key points.
  • the coordinate information may include the plane coordinate values of the eyeball key points in the target eye image, and the visibility information may include whether the eyeball key points are occluded by the eyelids.
  • Figure 2 is a schematic diagram of a distribution of key eyeball points provided by an embodiment of the application. As shown in the figure, a total of 20 key points are marked, among which the points numbered 11 to 17 are hidden by the eyelids and their visibility information is "invisible".
  • The training data used for model training may be marked according to the content contained in the location information. For example, a preset number of eye images can be selected, the key point coordinates and the visibility of the key points in each eye image can be marked to obtain training eye images, and the training eye images can be used for model training. The preset number can be set according to actual requirements such as model accuracy, and is generally on the order of tens of thousands, such as 60,000.
  • The eyeball detection method provided in the embodiment of the present application acquires a target eye image to be detected, inputs the target eye image into a pre-trained eyeball detection model, where the eyeball detection model is a convolutional neural network model including a reversible residual network, and determines the position information of the key points of the eyeball in the target eye image according to the output result of the eyeball detection model.
  • FIG. 3 is a schematic flow diagram of another eyeball detection method provided by an embodiment of the application. Based on the above-mentioned multiple optional embodiments, the acquisition of the target eye image to be detected is described.
  • The acquiring of the target eye image to be detected may include: using a preset face detection method to detect the image to be detected to determine eye-corner position information; intercepting a binocular image according to the eye-corner position information; and determining the target eye image according to the binocular image.
  • the advantage of this setting is that it can further reduce the amount of calculation and improve the detection efficiency.
  • the binocular image may be an image containing both the left eye and the right eye, and the binocular image may also be two images containing the left eye image and the right eye image respectively.
  • the capturing binocular images according to the position information of the corners of the eyes includes: respectively capturing a left-eye image and a right-eye image according to the position information of the corners of the eyes.
  • the advantage of this setting is that it facilitates targeted detection for the left eye and the right eye, and can effectively control the scale of the eyeball detection model.
  • determining the target eye image according to the binocular image includes: reducing and adjusting the binocular image to a preset size to obtain the target eye image.
  • the advantage of this setting is that the amount of calculation can be further controlled.
  • The input picture, that is, the image to be detected, may be of a large size, such as some high-definition images. If the intercepted binocular images are directly used as the target eye image and input into the eyeball detection model, this will bring a greater computational burden while contributing little to accuracy, so the image size can be reduced to the preset size while ensuring accuracy.
  • The preset size can be set according to actual needs, and different types of binocular images have different corresponding preset sizes. Taking a binocular image that contains both the left and right eyes in a single image as an example, the preset size can be 30 pixels * 90 pixels; for a binocular image consisting of separate left-eye and right-eye images, the preset size can be 30 pixels * 30 pixels.
  • the method includes the following steps.
  • Step 301 Use a preset face detection method to detect the image to be detected to determine the position information of the corner of the eye.
  • the image to be detected may be an image containing a human face, for example, it may be derived from a live video image, an image in a surveillance video, etc., and the source of the image to be detected is not limited.
  • the preset face detection method can be selected according to the actual situation, such as the SIFT method.
  • the corner position information may include position information of the two inner corners of the left eye and the two inner corners of the right eye in the image to be detected, such as coordinate information.
  • Step 302 Separately intercept the left-eye image and the right-eye image according to the position information of the corner of the eye.
  • a rectangular cropping frame can be constructed with the distance between two points corresponding to the two inner corners of the left eye being one side length.
  • The rectangular frame can be expanded outward by a preset ratio, and the preset ratio can be set according to actual needs. For example, if the distance between the two points corresponding to the two inner corners of the eye is L, the preset ratio is k, and the rectangle is a square, then the side length of the square is kL, where L and k are both greater than zero.
  • the square cropping frame can be centered on the midpoint of the line connecting the two inner corners of the eye.
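As a rough illustration of the cropping step above, the square frame can be computed directly from the two eye-corner points: side length kL, centred on the midpoint of the line joining them. The following Python sketch assumes image coordinates with x to the right and y down; the function name, parameter names, and the k value used in the example are illustrative, not taken from the patent.

```python
import math

def square_crop_box(corner_a, corner_b, k):
    """Square crop frame as described above: side length k * L, where L is
    the distance between the two eye-corner points, centred on the midpoint
    of the line joining them. (Names are illustrative, not from the patent.)"""
    ax, ay = corner_a
    bx, by = corner_b
    L = math.hypot(bx - ax, by - ay)        # distance between the corner points
    side = k * L                            # expanded side length
    cx, cy = (ax + bx) / 2, (ay + by) / 2   # centre = midpoint of the corner line
    half = side / 2
    # (left, top, right, bottom) in image coordinates
    return (cx - half, cy - half, cx + half, cy + half)

# hypothetical corner points 30 px apart with an assumed expansion ratio k = 1.5
box = square_crop_box((10.0, 20.0), (40.0, 20.0), k=1.5)
```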
  • Capturing the left-eye image and the right-eye image separately according to the eye-corner position information may include: determining the relative positions of the two inner corner points of each eye according to the eye-corner position information corresponding to each of the left eye and the right eye; rotating the image to be detected according to the relative positions so that the two inner corner points of each eye are on the same horizontal line; and intercepting the image of each eye.
  • The advantage of this setting is that, due to different head postures and different shooting angles, the line connecting the two inner corner points may not be on the same horizontal line. After the image to be detected is rotated, the two inner eye corners are adjusted to be on the same horizontal line, so that the captured left-eye and right-eye images are more standard, ensuring that the images input to the network vary less and have roughly the same layout, which facilitates the eyeball detection model in quickly and accurately locating the key points.
  • The rotating of the image to be detected according to the relative positions so that the two inner corner points of each eye are on the same horizontal line includes: calculating the center point of the line connecting the two inner corner points of each eye according to the relative positions; calculating the included angle between the horizontal line passing through the center point and the line connecting the two inner corner points of each eye; determining the rotation matrix according to the included angle; and rotating the image to be detected based on the rotation matrix so that the two inner corner points of each eye are on the same horizontal line.
  • The advantage of this setting is that the image to be detected can be rotated more accurately.
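The alignment steps above (center point, included angle, rotation matrix) can be sketched as follows. This is a hedged illustration rather than the patent's implementation: it builds a 2x3 affine matrix that rotates about the midpoint of the two corner points, and all names are invented for the example.

```python
import math

def rotation_to_horizontal(p1, p2):
    """2x3 affine rotation matrix, about the midpoint of p1-p2, that brings
    the two eye-corner points onto the same horizontal line, following the
    steps described above. (Illustrative sketch; names are not from the patent.)"""
    cx, cy = (p1[0] + p2[0]) / 2, (p1[1] + p2[1]) / 2   # center of the corner line
    angle = math.atan2(p2[1] - p1[1], p2[0] - p1[0])    # included angle vs. horizontal
    c, s = math.cos(-angle), math.sin(-angle)           # rotate by -angle to level it
    # translation chosen so that the center point stays fixed under the rotation
    tx = cx - (c * cx - s * cy)
    ty = cy - (s * cx + c * cy)
    return [[c, -s, tx], [s, c, ty]]

def apply_affine(M, pt):
    """Apply a 2x3 affine matrix to a 2-D point."""
    x, y = pt
    return (M[0][0] * x + M[0][1] * y + M[0][2],
            M[1][0] * x + M[1][1] * y + M[1][2])
```

After applying the matrix, both corner points share the same y-coordinate, so the eye is level before cropping.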
  • the training data corresponding to the eyeball detection model includes training eye images that have undergone random perturbation processing and random rotation processing.
  • Random rotation processing can be performed for the crop frame.
  • the crop frame is rotated at a random angle with a certain probability, and the range of the random angle can be preset, such as 1 degree to 5 degrees.
  • Step 303 The left-eye image and the right-eye image are respectively reduced and adjusted to a preset size to obtain a target eye image.
  • The left-eye image and the right-eye image are respectively reduced and adjusted to a size of 30*30 pixels to obtain the target left-eye image and the target right-eye image.
  • Step 304 Input the target eye image into a pre-trained eyeball detection model, where the eyeball detection model is a convolutional neural network model including a reversible residual network.
  • Step 305 Determine coordinate information and visibility information of key eyeball points in the target eye image according to the output result of the eyeball detection model.
  • Performing the reverse rotation processing on the relative position information based on the rotation matrix may include: calculating the reverse rotation matrix according to the rotation matrix, and calculating the product of the reverse rotation matrix and the coordinate information contained in the relative position information, so as to obtain the coordinate information of the eyeball key points of the target eye image in the image to be detected, where the reverse rotation matrix is the inverse matrix of the rotation matrix.
  • the relative position information of the key eyeball points is the position information of the key eyeball points in the to-be-detected image after the rotation.
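Since the alignment is a pure rotation about a point, mapping the key points back into the original image is just multiplication by the inverse matrix, as the passage above describes. A minimal sketch, assuming a 2x3 affine rotation matrix and using the fact that a rotation's 2x2 linear part is inverted by transposition (all names are illustrative):

```python
import math

def invert_rotation(M):
    """Inverse of a 2x3 rotation-about-a-point affine matrix, used to map
    key-point coordinates found in the rotated image back into the original
    image. For a pure rotation R, the inverse of the linear part is R's
    transpose, and the inverse translation is -R^T t. (Illustrative sketch.)"""
    a, b, tx = M[0]
    c, d, ty = M[1]
    itx = -(a * tx + c * ty)
    ity = -(b * tx + d * ty)
    return [[a, c, itx], [b, d, ity]]

def transform_point(M, pt):
    """Apply a 2x3 affine matrix to a 2-D point."""
    x, y = pt
    return (M[0][0] * x + M[0][1] * y + M[0][2],
            M[1][0] * x + M[1][1] * y + M[1][2])
```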
  • The eyeball detection method provided by the embodiment of the present application uses a preset face detection method to detect the image to be detected to determine the eye-corner position information, intercepts the left-eye image and the right-eye image according to the eye-corner position information, and performs size reduction processing to determine the corresponding target left-eye image and target right-eye image, which can effectively reduce the amount of calculation, effectively control the scale of the eyeball detection model, and improve the detection efficiency.
  • FIG. 4 is a schematic flowchart of another eyeball detection method provided by an embodiment of the application, which is described on the basis of the foregoing multiple optional embodiments.
  • The inputting of the target eye image into a pre-trained eyeball detection model and determining the position information of key eyeball points in the target eye image according to the output result of the eyeball detection model includes: inputting a first target eye image into the pre-trained eyeball detection model, and determining the position information of key eyeball points in the first target eye image according to a first output result of the eyeball detection model; and horizontally flipping a second target eye image, inputting the horizontally flipped second target eye image into the eyeball detection model, determining the position information of the eyeball key points in the horizontally flipped second target eye image according to a second output result of the eyeball detection model, and horizontally flipping that position information to obtain the position information of the eyeball key points in the second target eye image; where the first target eye image is a target left-eye image and the second target eye image is a target right-eye image, or the first target eye image is a target right-eye image and the second target eye image is a target left-eye image.
  • The advantage of this setting is that, by using the symmetrical relationship between the left and right eyes, only one eyeball detection model needs to be trained for one of the eyes, and this model can then also be used on the other eye; that is, one model serves both eyes without separately training two models, which improves the training efficiency and the scope of application of the model.
  • FIG. 5 is a schematic diagram of a flow of eyeball detection provided by an embodiment of this application, and the embodiment of this application can be described with reference to FIG. 5.
  • the method includes the following steps.
  • Step 401 Use a preset face detection method to detect the image to be detected to determine the position information of the corner of the eye.
  • Step 402 Separately intercept the left-eye image and the right-eye image according to the position information of the corner of the eye.
  • Step 403 The left-eye image and the right-eye image are respectively reduced and adjusted to a preset size to obtain the target left-eye image and the target right-eye image.
  • the left-eye image and the right-eye image are respectively reduced and adjusted to a size of 30*30 pixels to obtain the target left-eye image and the target right-eye image.
  • Step 404 Input the target right-eye image into the pre-trained eyeball detection model, and determine the position information of the key eyeball points in the target right-eye image according to the first output result of the eyeball detection model.
  • the eyeball detection model includes multiple reversible residual networks, and also includes a convolutional layer, a pooling layer, and a fully connected layer. From input to output, the eye detection model includes a convolutional layer, a pooling layer, a reversible residual network, a pooling layer, a reversible residual network, a pooling layer, a reversible residual network, and a fully connected layer.
  • The eyeball detection model may include at least two fully connected layers, where the coordinate information of the key eyeball points in the target eye image is determined according to the output of the first fully connected layer, and the visibility information of the key eyeball points in the target eye image is determined according to the output of a preset activation function following the second fully connected layer.
  • the preset activation function may be, for example, a sigmoid function.
  • FIG. 6 is a schematic diagram of a network structure of an eyeball detection model provided by an embodiment of the application.
  • An optional network structure of the eyeball detection model may include a sequentially connected convolutional layer, a first max pooling layer, and alternating reversible residual modules and pooling layers, followed by a third fully connected layer (C64); the third fully connected layer connects to the first fully connected layer (C40) and the second fully connected layer (C20).
  • The input image of the model can be reduced to 30 pixels per side, that is, a size of 30*30 pixels.
  • The C values marked on the convolutional layers and the fully connected layers in Figure 6 represent the number of output channels of the respective layer.
  • For example, 3x3 convolution C8 means that the current layer is a 3x3 convolutional layer that outputs 8 feature maps.
  • Maximum pooling uses 2x2 pooling.
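Putting the layer order together, the spatial sizes through the pipeline can be checked with simple arithmetic. The sketch below assumes "same"-padded 3x3 convolutions (spatial size unchanged) and stride-2 2x2 max pooling with floor division; the patent gives only the 30x30 input size and the layer order, so padding and strides here are assumptions.

```python
def pool2x2(size):
    """Spatial size after an assumed stride-2 2x2 max pool (floor division)."""
    return size // 2

# Assumed walkthrough of the Figure 6 pipeline for a 30x30-pixel input:
# conv (3x3, 'same' padding, size unchanged) -> pool -> reversible residual
# -> pool -> reversible residual -> pool -> reversible residual -> FC layers.
size = 30
size_after_conv = size            # 3x3 conv with 'same' padding keeps 30x30
s1 = pool2x2(size_after_conv)     # after the first pool
s2 = pool2x2(s1)                  # after the second pool
s3 = pool2x2(s2)                  # after the third pool, before the FC layers
```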
  • the embodiment of the application uses the structure of the reversible residual module to improve the accuracy of the model.
  • Figure 7 is a schematic structural diagram of a reversible residual network provided by an embodiment of the application.
  • In the figure, the number of input feature channels is m, the expansion parameter is k, and the number of output feature channels is n.
  • The values marked in the reversible residual modules in Figure 6 represent m, k, and n; for example, (8, 8, 1) indicates that the number of input feature channels of the first reversible residual module is 8, the expansion parameter is 8, and the number of output feature channels is 1. The values in each reversible residual module can be set according to actual needs.
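If the (m, k, n) blocks are read as MobileNetV2-style inverted residual blocks (1x1 expansion to m*k channels, 3x3 depthwise convolution, 1x1 projection to n channels), which is one plausible interpretation of the "reversible residual module" here, a rough weight count shows why such blocks are cheap. Both the interpretation and the function below are assumptions, not taken from the patent:

```python
def inverted_residual_params(m, k, n, dw_kernel=3):
    """Approximate weight count for an inverted-residual-style block with
    m input channels, expansion parameter k, and n output channels:
    1x1 expand -> depthwise conv -> 1x1 project (biases and BN ignored).
    This MobileNetV2-style reading is an assumption about the patent's blocks."""
    hidden = m * k
    expand = m * hidden                  # 1x1 pointwise expansion weights
    depthwise = hidden * dw_kernel ** 2  # one dw_kernel x dw_kernel filter per hidden channel
    project = hidden * n                 # 1x1 pointwise projection weights
    return expand + depthwise + project

# the first module in Figure 6 is annotated (8, 8, 1)
weights = inverted_residual_params(8, 8, 1)
```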
  • A batch normalization (BN) layer and a rectified linear unit (ReLU) activation layer can also be set.
  • the setting of the BN normalization layer can make the training objective function better converge; the setting of the ReLU activation layer can increase the nonlinearity of the network.
  • the fully connected layer C20 in Figure 6 will get 20 values.
  • The sigmoid activation function that follows operates independently on the 20 values and outputs 20 numbers between 0 and 1, which can be used as the probability that the corresponding key point is visible; 0 means completely invisible, and 1 means completely visible.
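The element-wise sigmoid described above is straightforward to sketch; only the function name is invented here:

```python
import math

def visibility_probs(logits):
    """Apply the element-wise sigmoid described above: each fully connected
    output becomes a probability in (0, 1) that the corresponding key point
    is visible (near 0 = occluded by the eyelid, near 1 = visible)."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

# a strongly negative logit maps near 0, zero maps to 0.5, strongly positive near 1
probs = visibility_probs([-4.0, 0.0, 4.0])
```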
  • Step 405 Horizontally flip the target left-eye image, input the horizontally flipped target left-eye image into the eyeball detection model, determine the position information of the eyeball key points in the horizontally flipped target left-eye image according to the second output result of the eyeball detection model, and horizontally flip that position information to obtain the position information of the key points of the eyeball in the target left-eye image.
  • The target right-eye image and the target left-eye image may be input into the network separately, or the target right-eye image and the horizontally flipped target left-eye image may be combined and input into the network together, which is not limited in the embodiment of the present application.
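The flip bookkeeping in step 405 amounts to mirroring x-coordinates across the image's vertical center line, once on the input image and once on the predicted key points. A minimal sketch, assuming 0-based pixel indices (the coordinate convention is not specified in the patent):

```python
def flip_x(points, width):
    """Mirror key-point x-coordinates across the vertical center line of a
    `width`-pixel-wide image. Used twice in the scheme above: to flip the
    left-eye image before inference, and to flip the predicted coordinates
    back. (Illustrative; assumes 0-based pixel indices 0 .. width - 1.)"""
    return [(width - 1 - x, y) for (x, y) in points]

# flipping twice is the identity, which is what lets one model serve both eyes
pts = [(3, 7), (29, 0)]
round_trip = flip_x(flip_x(pts, 30), 30)
```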
  • Step 406 Summarize the position information of the key eyeball points in the right-eye image of the target and the position information of the key eyeball points in the left-eye image of the target to obtain a detection result of the key eyeball points.
  • The eyeball detection method provided in the embodiment of the application trains the eyeball detection model for one eye and, during detection, horizontally flips the image of the other eye, thereby achieving the purpose of detecting both eyes with the same model and improving the scope of application of the model.
  • By optimizing the network structure of the eyeball detection model in the embodiments of the present application, it is possible to effectively improve the calculation efficiency of eyeball detection while ensuring its accuracy, quickly obtain eyeball detection results, and improve the response speed of applications related to eyeball detection.
  • FIG. 8 is a structural block diagram of an eyeball detection device provided by an embodiment of the application.
  • the device can be implemented by software and/or hardware, generally can be integrated in a computer device, and eyeball detection can be performed by executing an eyeball detection method.
  • The device includes: a target eye image acquisition module 801 configured to acquire a target eye image to be detected; an image input module 802 configured to input the target eye image into a pre-trained eyeball detection model, where the eyeball detection model is a convolutional neural network model including a reversible residual network; and a position information determination module 803 configured to determine the position information of the key eyeball points in the target eye image according to the output result of the eyeball detection model.
• The eyeball detection apparatus acquires a target eye image to be detected and inputs the target eye image into a pre-trained eyeball detection model, where the eyeball detection model is a convolutional neural network model including a reversible residual network;
• the apparatus then determines the position information of the eyeball key points in the target eye image according to the output result of the eyeball detection model.
  • FIG. 9 is a structural block diagram of a computer device provided by an embodiment of this application.
• The computer device 900 includes a memory 901, a processor 902, and a computer program that is stored on the memory 901 and executable on the processor 902.
• When executing the computer program, the processor 902 implements the eyeball detection method provided in the embodiments of the present application.
• An embodiment of the present application also provides a storage medium containing computer-executable instructions which, when executed by a computer processor, perform the eyeball detection method provided by the embodiments of the present application.
• The eyeball detection apparatus, computer device, and storage medium provided in the above embodiments can execute the eyeball detection method provided by any embodiment of the present application, and have corresponding functional modules for executing the eyeball detection method provided by any embodiment of the present application.

Abstract

Disclosed are an eyeball detection method, apparatus and device, and a storage medium. The method comprises: acquiring a target eye image to be detected; inputting the target eye image into a pre-trained eyeball detection model, wherein the eyeball detection model is a convolutional neural network model including a reversible residual network; and determining position information of an eyeball key point in the target eye image according to an output result from the eyeball detection model.

Description

Eyeball detection method, apparatus, device, and storage medium
This application claims priority to Chinese patent application No. 202010261001.7, filed with the Chinese Patent Office on April 3, 2020, the entire content of which is incorporated herein by reference.
Technical Field
The embodiments of the present application relate to the field of image recognition, and, for example, to an eyeball detection method, apparatus, device, and storage medium.
Background
Eyeball detection technology, which generally includes eyeball key point localization, is an important technology in the fields of image processing and computer vision. Its purpose is to accurately locate positions such as the iris and the pupil in an input face image or video; it mainly covers the detection of the iris boundary or of key points on that boundary, and the detection of the pupil center point. Eyeball detection plays an important role in fields such as live entertainment streaming, short-video special effects, virtual avatars, and security.
Eyeball detection methods can be roughly divided into two categories: handcrafted feature extraction methods from traditional computer vision, and methods based on neural network technology. Handcrafted methods mainly use image gradients to extract features, such as scale-invariant feature transform (SIFT) features, combined with traditional algorithms (such as the Hough transform and support vector machines) to perform iris edge detection or key point detection; such schemes require different parameters for different scenarios, and their accuracy is generally low. Neural-network-based methods mainly use multi-layer convolutional neural networks to extract image features and then regress the positions of the key points; such schemes are more accurate than the former, but the computational complexity of the models is high, placing heavy demands on computing resources. Therefore, the eyeball detection schemes in the related art are still imperfect and need to be improved.
Summary
The embodiments of the present application provide an eyeball detection method, apparatus, device, and storage medium, which can optimize the eyeball detection schemes in the related art.
An embodiment of the present application provides an eyeball detection method, including: acquiring a target eye image to be detected; inputting the target eye image into a pre-trained eyeball detection model, where the eyeball detection model is a convolutional neural network model including a reversible residual network; and determining the position information of eyeball key points in the target eye image according to the output result of the eyeball detection model.
An embodiment of the present application provides an eyeball detection apparatus, including: a target eye image acquisition module, configured to acquire a target eye image to be detected; an image input module, configured to input the target eye image into a pre-trained eyeball detection model, where the eyeball detection model is a convolutional neural network model including a reversible residual network; and a position information determination module, configured to determine the position information of eyeball key points in the target eye image according to the output result of the eyeball detection model.
An embodiment of the present application provides a computer device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor, when executing the computer program, implements the eyeball detection method provided by the embodiments of the present application.
An embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, where the program, when executed by a processor, implements the eyeball detection method provided by the embodiments of the present application.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of an eyeball detection method provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of the distribution of eyeball key points provided by an embodiment of the present application;
FIG. 3 is a schematic flowchart of another eyeball detection method provided by an embodiment of the present application;
FIG. 4 is a schematic flowchart of yet another eyeball detection method provided by an embodiment of the present application;
FIG. 5 is a schematic flowchart of eyeball detection provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of the network structure of an eyeball detection model provided by an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a reversible residual network provided by an embodiment of the present application;
FIG. 8 is a structural block diagram of an eyeball detection apparatus provided by an embodiment of the present application;
FIG. 9 is a structural block diagram of a computer device provided by an embodiment of the present application.
Detailed Description
The present application is described below with reference to the drawings and embodiments. It can be understood that the embodiments described here are only intended to explain the present application, not to limit it. FIG. 1 is a schematic flowchart of an eyeball detection method provided by an embodiment of the present application. The method can be executed by an eyeball detection apparatus, which can be implemented by software and/or hardware and can generally be integrated into a computer device. As shown in FIG. 1, the method includes the following steps.
Step 101: Acquire a target eye image to be detected.
Exemplarily, the computer device may be a mobile terminal device such as a mobile phone, a tablet computer, a notebook computer, or a personal digital assistant, or another device such as a desktop computer. In addition, the embodiments of the present application can effectively improve the computational efficiency of eyeball detection while maintaining detection accuracy, and the computational complexity is also effectively controlled. Therefore, the method provided in this embodiment is widely applicable to mobile computing platforms and other platforms with limited computing resources; that is, the computer device may be a device with limited computing resources, such as a low-end mobile phone (one with a low hardware configuration) or a security device. Tests have shown that such computer devices can reach millisecond-level running speeds.
The solutions provided by the embodiments of the present application can be applied in a variety of scenarios, such as gaze direction tracking, eye tracking, and other applications that need information related to eyeball positions. Optionally, they can be used for special effects, stickers, virtual avatars, and three-dimensional (3D) expressions in live-streaming or short-video applications, and can also be used in security devices to assist iris and face recognition, liveness detection, and the like.
Exemplarily, the target eye image may be an image containing a human eye. The proportion of the human eye region in the target eye image is not limited; the target eye image may contain other parts of the facial features or only the human eye, which is not limited in the embodiments of the present application.
Optionally, in some application scenarios, the original image collected by an image acquisition device such as a camera generally contains the entire face and may also contain other image information such as the background behind the person. Therefore, the original image can be cropped or otherwise processed to obtain the target eye image, so as to reduce the amount of computation.
Step 102: Input the target eye image into a pre-trained eyeball detection model, where the eyeball detection model is a convolutional neural network model including a reversible residual network.
The eyeball detection model used in the embodiments of the present application may be a convolutional neural network model including a reversible residual network. Eyeball detection models in the related art generally use convolutional networks with many layers; their computational complexity is very high, they cannot be used on devices with limited computing resources, and, because of that complexity, their computation speed and efficiency suffer, which affects the real-time performance of eyeball detection. To reduce the computational complexity, the embodiments of the present application apply a reversible residual network to the eyeball detection model; one or more modules based on the reversible residual network can be set in the model, which can improve computational efficiency while maintaining accuracy. The position of the reversible residual network in the eyeball detection model, the number of reversible residual networks, and the parameters in the reversible residual network can be set according to the actual application and scenario, and are not limited in the embodiments of the present application. In addition, the eyeball detection model may also include convolutional layers, pooling layers, fully connected layers, and so on; the structure of the eyeball detection model is likewise not limited. The convolutional layers can be recombined and redesigned to balance the accuracy and complexity of the neural network, reducing the complexity of the network while maintaining its accuracy.
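As a loose, framework-free sketch of the skip connection at the heart of such a residual module (the internal expand/project transform shown here is purely an assumption for illustration, since the embodiments deliberately leave the exact layer composition open):

```python
import numpy as np

rng = np.random.default_rng(0)

def reversible_residual_block(x, w_expand, w_project):
    # Expand the channel width, apply a nonlinearity, project back to the
    # input width, then add the skip connection: the residual structure
    # that lets blocks be stacked without losing the input signal.
    h = np.maximum(x @ w_expand, 0.0)      # 1x1 expansion + ReLU
    h = h @ w_project                      # 1x1 projection back
    return x + h                           # skip connection

x = rng.standard_normal((4, 8))            # 4 positions, 8 channels
w1 = rng.standard_normal((8, 32)) * 0.01   # expand 8 -> 32 channels
w2 = rng.standard_normal((32, 8)) * 0.01   # project 32 -> 8 channels
y = reversible_residual_block(x, w1, w2)
print(y.shape)                             # (4, 8): output width matches input
```

Because the output width matches the input width, several such blocks can be chained with pooling layers between them, as in the network structure described later.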
Exemplarily, the network structure of the eyeball detection model can be determined according to actual requirements to obtain an eyeball detection training model, and training data can then be used to train that model, optimizing the values of its multiple parameters so as to obtain a trained eyeball detection model, that is, the pre-trained eyeball detection model of the embodiments of the present application.
Step 103: Determine the position information of the eyeball key points in the target eye image according to the output result of the eyeball detection model.
Exemplarily, the eyeball key points in the target eye image may include points around the iris and may also include the pupil center point. The number of eyeball key points is not limited; there may be, for example, 20 of them, comprising 19 points around the iris plus the pupil center point.
The position information of the eyeball key points in the target eye image may include information related to their positions, such as the coordinate information of the eyeball key points and their visibility information. The coordinate information may include the planar coordinate values of the eyeball key points in the target eye image, and the visibility information may indicate whether an eyeball key point is occluded by the eyelid. FIG. 2 is a schematic diagram of the distribution of eyeball key points provided by an embodiment of the present application; as shown in the figure, a total of 20 key points are marked, among which the points numbered 11 to 17 are occluded by the eyelid and are therefore marked as invisible points.
Exemplarily, the training data used for model training may be labeled according to the content contained in the position information. For example, a preset number of eye images can be selected, and the key point coordinates and key point visibility in each image can be labeled to obtain training eye images, which are then used for model training. The preset number can be set according to actual requirements such as model precision and accuracy, and is generally on the order of tens of thousands, for example 60,000.
In the eyeball detection method provided by the embodiments of the present application, a target eye image to be detected is acquired and input into a pre-trained eyeball detection model, where the eyeball detection model is a convolutional neural network model including a reversible residual network, and the position information of the eyeball key points in the target eye image is determined according to the output result of the eyeball detection model. Because the pre-trained eyeball detection model is a convolutional neural network model including a reversible residual network, this technical solution can effectively improve the computational efficiency of eyeball detection while maintaining detection accuracy, obtain detection results quickly, and improve the response speed of applications that rely on eyeball detection.
FIG. 3 is a schematic flowchart of another eyeball detection method provided by an embodiment of the present application; on the basis of the foregoing optional embodiments, the acquisition of the target eye image to be detected is described.
Exemplarily, acquiring the target eye image to be detected may include: detecting the image to be detected with a preset face detection method to determine eye corner position information; cropping a binocular image according to the eye corner position information; and determining the target eye image from the binocular image. The benefit of this arrangement is that it further reduces the amount of computation and improves detection efficiency. The binocular image may be a single image containing both the left eye and the right eye, or two separate images, one containing the left eye and one containing the right eye. Optionally, cropping the binocular image according to the eye corner position information includes: cropping a left-eye image and a right-eye image separately according to the eye corner position information. The benefit of this arrangement is that it facilitates targeted detection of the left and right eyes and effectively controls the scale of the eyeball detection model.
Optionally, determining the target eye image from the binocular image includes: shrinking and adjusting the binocular image to a preset size to obtain the target eye image. The benefit of this arrangement is that the amount of computation can be further controlled. The input picture, that is, the image to be detected, may be large, for example a high-definition image; if the cropped binocular image were fed directly into the eyeball detection model as the target eye image, it would impose a considerable computational burden while contributing little to accuracy. The size can therefore be reduced, without sacrificing accuracy, to a preset size. The preset size can be set according to actual requirements and differs with the type of binocular image: for a single image containing both eyes, the preset size may be 30 pixels by 90 pixels; for separate left-eye and right-eye images, the preset size may be 30 pixels by 30 pixels.
Optionally, the method includes the following steps.
Step 301: Detect the image to be detected with a preset face detection method to determine eye corner position information.
Exemplarily, the image to be detected may be an image containing a human face, which may come, for example, from a live video stream or from surveillance footage; its source is not limited. The preset face detection method can be selected according to the actual situation, for example the SIFT method. The eye corner position information may include the position information, such as coordinate information, of the two inner corners of the left eye and the two inner corners of the right eye in the image to be detected.
Step 302: Crop a left-eye image and a right-eye image separately according to the eye corner position information.
Exemplarily, taking the left-eye image as an example, a rectangular cropping box can be constructed with the distance between the two points corresponding to the two inner corners of the left eye as one side length. Optionally, to ensure that this cropping box contains the entire eye, the rectangular box can be expanded outward by a preset ratio, which can be set according to actual requirements. For example, if the distance between the two inner-corner points is L, the preset ratio is k, and the rectangle is a square, then the side length of the square is kL, where both L and k are greater than 0. The square cropping box can be centered on the midpoint of the line connecting the two inner-corner points.
Exemplarily, cropping the left-eye image and the right-eye image separately according to the eye corner position information may include: determining the relative position of the two inner-corner points of each eye according to the eye corner position information of the left eye and the right eye; rotating the image to be detected according to the relative position so that the two inner-corner points of each eye lie on the same horizontal line; and cropping the image of each eye. The benefit of this arrangement is that, because head poses and shooting angles vary, the line connecting the two inner-corner points may not be horizontal; rotating the image to be detected brings the two inner-corner points onto the same horizontal line, so that the cropped left-eye and right-eye images are more standardized. This keeps the variation among the pictures fed into the network small and their layout roughly consistent, which helps the eyeball detection model locate the key points quickly and accurately.
Rotating the image to be detected according to the relative position so that the two inner-corner points of each eye lie on the same horizontal line includes: calculating the center point of the line connecting the two inner-corner points of each eye according to the relative position; calculating the angle between the horizontal line through that center point and the line connecting the two inner-corner points; determining a rotation matrix according to that angle; and rotating the image to be detected based on the rotation matrix so that the two inner-corner points of each eye lie on the same horizontal line. The benefit of this arrangement is that the image to be detected can be rotated more precisely.
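The leveling rotation above can be sketched in pure NumPy; the forward convention p' = R(p - c) + c, rotating about the corner-line midpoint c, is an assumption about how the rotation is applied.

```python
import math
import numpy as np

def leveling_rotation(corner_a, corner_b):
    # Angle between the inner-corner line and the horizontal, measured at
    # the midpoint of the two corner points; rotating by -angle levels it.
    (ax, ay), (bx, by) = corner_a, corner_b
    center = np.array([(ax + bx) / 2.0, (ay + by) / 2.0])
    angle = math.atan2(by - ay, bx - ax)
    c, s = math.cos(-angle), math.sin(-angle)
    R = np.array([[c, -s], [s, c]])
    return R, center

def rotate_points(points, R, center):
    # Apply p' = R @ (p - center) + center to each row of `points`.
    return (np.asarray(points) - center) @ R.T + center

R, center = leveling_rotation((0.0, 0.0), (2.0, 2.0))
leveled = rotate_points([(0.0, 0.0), (2.0, 2.0)], R, center)
print(leveled)   # both corner points now share the same y-coordinate
```

Rotating the full image with the same matrix (for example via an affine warp) then yields eye crops whose corner line is horizontal.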
Optionally, the training data corresponding to the eyeball detection model includes training eye images that have undergone random perturbation and random rotation. The benefit of this arrangement is that it improves the robustness of the model. The random rotation can be applied to the cropping box, for example rotating it by a random angle with a certain probability, where the range of the random angle can be preset, for example 1 to 5 degrees.
Step 303: Shrink and adjust the left-eye image and the right-eye image to a preset size to obtain the target eye images.
Exemplarily, the left-eye image and the right-eye image are each shrunk and adjusted to a size of 30*30 to obtain the target left-eye image and the target right-eye image.
Step 304: Input the target eye image into a pre-trained eyeball detection model, where the eyeball detection model is a convolutional neural network model including a reversible residual network.
Step 305: Determine the coordinate information and visibility information of the eyeball key points in the target eye image according to the output result of the eyeball detection model.
If a rotation operation was performed on the image to be detected before the eye images were cropped, then, optionally, the position information of the eyeball key points in the target eye image includes the coordinate information of the eyeball key points in the image to be detected. In that case, determining the position information of the eyeball key points in the target eye image according to the output result of the eyeball detection model includes: determining the relative position information of the eyeball key points in the target eye image according to the output result of the eyeball detection model; and performing reverse rotation processing on the relative position information based on the rotation matrix to obtain the coordinate information, in the image to be detected, of the eyeball key points of the target eye image. The benefit of this arrangement is that the coordinate information of the eyeball key points in the image to be detected can be calculated accurately, providing a basis for subsequent related applications such as special effects and stickers. Performing reverse rotation processing on the relative position information based on the rotation matrix may include, for example: calculating a reverse rotation matrix from the rotation matrix, where the reverse rotation matrix is the inverse of the rotation matrix, and calculating the product of the reverse rotation matrix and the coordinate information contained in the relative position information, thereby obtaining the coordinate information, in the image to be detected, of the eyeball key points of the target eye image.
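The reverse rotation step can be sketched as follows. For a pure rotation matrix the inverse equals the transpose, so mapping predictions back to the original image is cheap; the centering convention p' = R(p - c) + c used here is an assumption for illustration.

```python
import math
import numpy as np

def unrotate_keypoints(keypoints, R, center):
    # Inverse of p' = R @ (p - center) + center. For a rotation matrix the
    # inverse is the transpose, so (p' - center) @ R maps predictions made
    # in the rotated image back to the original image to be detected.
    return (np.asarray(keypoints) - center) @ R + center

angle = math.radians(10.0)
R = np.array([[math.cos(angle), -math.sin(angle)],
              [math.sin(angle),  math.cos(angle)]])
center = np.array([15.0, 15.0])

original = np.array([[20.0, 18.0], [12.0, 14.0]])
rotated = (original - center) @ R.T + center     # forward rotation
recovered = unrotate_keypoints(rotated, R, center)
print(np.allclose(recovered, original))          # True
```

The round trip recovers the original coordinates up to floating-point error, which is what lets the key points be drawn on the unrotated input frame.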
The relative position information of the eyeball key points is their position information in the rotated image to be detected.
In the eyeball detection method provided by the embodiments of the present application, the image to be detected is detected with a preset face detection method to determine the eye corner position information, the left-eye image and the right-eye image are cropped according to the eye corner position information and reduced in size, and the corresponding target left-eye image and target right-eye image are determined. This effectively reduces the amount of computation, effectively controls the scale of the eyeball detection model, and improves detection efficiency.
FIG. 4 is a schematic flowchart of yet another eyeball detection method provided by an embodiment of the present application, described on the basis of the foregoing optional embodiments.
Exemplarily, inputting the target eye image into the pre-trained eyeball detection model and determining the position information of the eyeball key points in the target eye image according to the output result of the eyeball detection model includes: inputting a first target eye image into the pre-trained eyeball detection model and determining the position information of the eyeball key points in the first target eye image according to a first output result of the eyeball detection model; horizontally flipping a second target eye image, inputting the horizontally flipped second eye image into the eyeball detection model, determining the position information of the eyeball key points in the horizontally flipped second target eye image according to a second output result of the eyeball detection model, and performing horizontal flip processing on that position information to obtain the position information of the eyeball key points in the second target eye image; where the first target eye image is the target left-eye image and the second target eye image is the target right-eye image, or the first target eye image is the target right-eye image and the second target eye image is the target left-eye image.
The benefit of this arrangement is that, by exploiting the symmetry between the left and right eyes, an eyeball detection model only needs to be trained for one of the eyes and can then also be applied to the other eye; one model serves both eyes, so two models do not need to be trained separately, which improves the training efficiency and the scope of application of the model.
FIG. 5 is a schematic flowchart of eyeball detection provided by an embodiment of the present application; the embodiment of the present application can be described with reference to FIG. 5.
Optionally, the method includes the following steps.
Step 401: detect the image to be detected with a preset face detection method to determine eye-corner position information.
Step 402: crop the left-eye image and the right-eye image separately according to the eye-corner position information.
Step 403: shrink the left-eye image and the right-eye image and adjust each to a preset size, obtaining the target left-eye image and the target right-eye image.
Exemplarily, the left-eye image and the right-eye image are each shrunk and adjusted to a size of 30×30 pixels, yielding the target left-eye image and the target right-eye image.
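A minimal sketch of steps 402 and 403: crop a square patch around the two corner points of one eye and shrink it to 30×30. The `margin` padding and the nearest-neighbour resampling are illustrative assumptions; the patent does not specify how the crop box is padded or which interpolation is used.

```python
import numpy as np

def crop_eye(image, corner_a, corner_b, margin=0.3):
    """Crop a square patch centred between two eye-corner points.

    `margin` (an assumed parameter, not from the patent) widens the
    corner-to-corner box so the whole eyeball fits inside the crop.
    """
    (xa, ya), (xb, yb) = corner_a, corner_b
    cx, cy = (xa + xb) / 2.0, (ya + yb) / 2.0
    half = abs(xb - xa) * (1.0 + margin) / 2.0
    x0, x1 = int(cx - half), int(cx + half)
    y0, y1 = int(cy - half), int(cy + half)
    return image[max(y0, 0):y1, max(x0, 0):x1]

def resize_nearest(patch, size=30):
    """Nearest-neighbour shrink to size x size (a stand-in for a
    library resize such as cv2.resize)."""
    h, w = patch.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return patch[rows][:, cols]

face = np.arange(120 * 160).reshape(120, 160)
eye_patch = crop_eye(face, (40, 60), (80, 60))
target_eye = resize_nearest(eye_patch, 30)
print(target_eye.shape)   # (30, 30)
```

In practice the same helper is applied once per eye, producing the target left-eye and target right-eye images of step 403.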
Step 404: input the target right-eye image into the pre-trained eyeball detection model, and determine the position information of the eyeball key points in the target right-eye image according to the first output result of the eyeball detection model.
Exemplarily, the eyeball detection model contains multiple reversible residual networks, as well as convolutional, pooling, and fully connected layers. From input to output, the eyeball detection model comprises a convolutional layer, a pooling layer, a reversible residual network, a pooling layer, a reversible residual network, a pooling layer, a reversible residual network, and a fully connected layer. The eyeball detection model may include at least two fully connected layers: the coordinate information of the eyeball key points in the target eye image is determined from the output of the first fully connected layer, and the visibility information of the eyeball key points in the target eye image is determined from the output of a preset activation function following the second fully connected layer. The preset activation function may be, for example, a sigmoid function.
FIG. 6 is a schematic diagram of a network structure of an eyeball detection model provided by an embodiment of the present application. As shown in FIG. 6, one optional network structure of the eyeball detection model may comprise, connected in sequence, a convolutional layer, a first max-pooling layer, a first reversible residual module (reversible residual network), a second max-pooling layer, a second reversible residual module, a third max-pooling layer, a third reversible residual module, a fourth reversible residual module, and a third fully connected layer (C64); the third fully connected layer connects to the first fully connected layer (C40) and the second fully connected layer (C20).
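The two output heads above can be sketched as follows: layer C40 yields 40 values read as 20 (x, y) keypoint coordinates, and layer C20 yields 20 logits that a sigmoid maps to per-keypoint visibility probabilities. The reshape order is an assumption for illustration; the patent only fixes the layer widths.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def parse_model_outputs(fc_c40, fc_c20):
    """Interpret the two fully connected heads of FIG. 6.

    fc_c40: 40 values from layer C40 -> 20 (x, y) keypoint coordinates.
    fc_c20: 20 values from layer C20 -> visibility probabilities after
            an element-wise sigmoid (0 = completely invisible,
            1 = completely visible).
    """
    coords = np.asarray(fc_c40, dtype=float).reshape(20, 2)
    visibility = sigmoid(np.asarray(fc_c20, dtype=float))
    return coords, visibility

coords, vis = parse_model_outputs(np.zeros(40), np.zeros(20))
print(coords.shape, vis[0])   # (20, 2) 0.5
```

Note that the sigmoid operates on each of the 20 values independently, matching the per-keypoint visibility reading described for layer C20.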
In the embodiments of the present application, the model's input image size can be reduced to 30 pixels, i.e., 30×30 pixels. In FIG. 6, the C attached to a convolutional or fully connected layer denotes the number of output channels of that layer; for example, "3×3 convolution C8" means that the current layer is a 3×3 convolutional layer outputting 8 feature maps. Max pooling uses 2×2 pooling. The embodiments of the present application use the structure of the reversible residual module to improve the model's accuracy. FIG. 7 is a schematic structural diagram of a reversible residual network provided by an embodiment of the present application. As shown in FIG. 7, the figure depicts a reversible residual module whose number of input feature channels is m, whose expansion parameter is k, and whose number of output feature channels is n. The values attached to the reversible residual modules in FIG. 6 denote m, k, and n respectively; for example, (8, 8, 1) indicates that the first reversible residual module has 8 input feature channels, an expansion parameter of 8, and 1 output feature channel. The values in each reversible residual module can be set according to actual needs.
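The internals of FIG. 7 are not reproduced here, so the following is only a sketch of one plausible (m, k, n) reversible residual module, assuming a MobileNetV2-style layout: a 1×1 expansion to m·k channels, a depthwise 3×3 stage, and a linear 1×1 projection to n channels, with a shortcut added only when input and output shapes match. The random weights and the mean-filter depthwise stage are placeholders for learned parameters.

```python
import numpy as np

def conv1x1(x, w):
    # x: (C_in, H, W), w: (C_out, C_in) -> pointwise convolution
    return np.tensordot(w, x, axes=([1], [0]))

def depthwise3x3(x):
    # Per-channel 3x3 mean filter with zero padding: a stand-in for a
    # learned depthwise convolution.
    c, h, w = x.shape
    p = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(x, dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += p[:, dy:dy + h, dx:dx + w]
    return out / 9.0

def relu(x):
    return np.maximum(x, 0.0)

def reversible_residual_block(x, m, k, n, rng):
    """Expand (1x1, m -> m*k) -> depthwise 3x3 -> project (1x1 -> n).

    The shortcut condition (only when m == n) mirrors typical
    inverted-residual designs; the patent does not spell it out, so it
    is an assumption here.
    """
    assert x.shape[0] == m
    w_expand = rng.standard_normal((m * k, m)) * 0.1
    w_project = rng.standard_normal((n, m * k)) * 0.1
    h = relu(conv1x1(x, w_expand))
    h = relu(depthwise3x3(h))
    y = conv1x1(h, w_project)          # linear projection, no ReLU
    return y + x if m == n else y

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 30, 30))
y = reversible_residual_block(x, m=8, k=8, n=1, rng=rng)
print(y.shape)   # (1, 30, 30)
```

With (m, k, n) = (8, 8, 1) as in the first module of FIG. 6, an 8-channel input is expanded to 64 hidden channels and projected down to a single output channel.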
In the network structure shown in FIG. 6, each convolutional layer may also be followed by a batch normalization (BN) layer and a rectified linear unit (ReLU) activation layer. The BN layer helps the training objective converge better; the ReLU activation layer increases the nonlinearity of the network. The fully connected layer C20 in FIG. 6 produces 20 values; the sigmoid activation function that follows operates on the 20 values independently and outputs 20 numbers between 0 and 1, which can serve as the probability that the corresponding key point is visible, where 0 means completely invisible and 1 means completely visible.
Step 405: horizontally flip the target left-eye image and input the flipped image into the eyeball detection model; determine the position information of the eyeball key points in the horizontally flipped target left-eye image according to the second output result of the eyeball detection model, and horizontally flip that position information to obtain the position information of the eyeball key points in the target left-eye image.
Optionally, in the embodiments of the present application, the target right-eye image and the target left-eye image may be fed into the network separately, or the target right-eye image and the horizontally flipped target left-eye image may be batched together and fed into the network; no limitation is imposed here.
Step 406: aggregate the position information of the eyeball key points in the target right-eye image and the position information of the eyeball key points in the target left-eye image to obtain the eyeball key point detection result.
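Steps 404 to 406 can be sketched as follows, using the mirror trick of step 405: flip the left-eye image, run the same right-eye model, then mirror the predicted x coordinates back with x' = W - 1 - x. The `model` interface (image in, (N, 2) array of (x, y) points out) is an assumption for illustration; a toy brightest-pixel "model" stands in for the trained network.

```python
import numpy as np

def detect_with_flip(model, right_eye, left_eye):
    """Run a single right-eye detector on both eyes (steps 404-406)."""
    right_pts = model(right_eye)               # step 404
    flipped = left_eye[:, ::-1]                # step 405: horizontal flip
    pts = model(flipped)
    w = left_eye.shape[1]
    left_pts = np.column_stack([w - 1 - pts[:, 0], pts[:, 1]])
    return np.vstack([right_pts, left_pts])    # step 406: aggregate

def toy_model(img):
    # Toy stand-in: reports the brightest pixel as the single keypoint.
    y, x = np.unravel_index(np.argmax(img), img.shape)
    return np.array([[x, y]], dtype=float)

eye = np.zeros((30, 30))
eye[10, 4] = 1.0                               # keypoint at (x=4, y=10)
all_pts = detect_with_flip(toy_model, eye, eye)
print(all_pts)   # both rows are [4., 10.]
```

The flip applied to the left-eye prediction exactly undoes the flip applied to the left-eye image, so the same detector weights serve both eyes.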
The eyeball detection method provided by the embodiments of the present application trains the eyeball detection model for one eye and, at detection time, horizontally flips the other eye, so that the same model detects both eyes, widening the model's range of application. In addition, by optimizing the network structure of the eyeball detection model, the embodiments of the present application effectively improve the computational efficiency of eyeball detection while preserving its accuracy, obtain eyeball detection results quickly, and improve the response speed of applications that rely on eyeball detection.
FIG. 8 is a structural block diagram of an eyeball detection apparatus provided by an embodiment of the present application. The apparatus may be implemented in software and/or hardware, may generally be integrated in a computer device, and performs eyeball detection by executing the eyeball detection method. As shown in FIG. 8, the apparatus includes: a target eye image acquisition module 801, configured to acquire a target eye image to be detected; an image input module 802, configured to input the target eye image into a pre-trained eyeball detection model, wherein the eyeball detection model is a convolutional neural network model containing a reversible residual network; and a position information determination module 803, configured to determine the position information of the eyeball key points in the target eye image according to the output result of the eyeball detection model.
The eyeball detection apparatus provided in the embodiments of the present application acquires a target eye image to be detected, inputs the target eye image into a pre-trained eyeball detection model, the eyeball detection model being a convolutional neural network model containing a reversible residual network, and determines the position information of the eyeball key points in the target eye image according to the output result of the eyeball detection model. Because the pre-trained eyeball detection model is a convolutional neural network model containing a reversible residual network, this technical solution effectively improves the computational efficiency of eyeball detection while preserving its accuracy, obtains eyeball detection results quickly, and improves the response speed of applications that rely on eyeball detection.
An embodiment of the present application provides a computer device in which the eyeball detection apparatus provided by the embodiments of the present application may be integrated. FIG. 9 is a structural block diagram of a computer device provided by an embodiment of the present application. The computer device 900 includes a memory 901, a processor 902, and a computer program stored in the memory 901 and runnable on the processor 902; when executing the computer program, the processor 902 implements the eyeball detection method provided by the embodiments of the present application.
An embodiment of the present application further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, perform the eyeball detection method provided by the embodiments of the present application.
The eyeball detection apparatus, device, and storage medium provided in the foregoing embodiments can execute the eyeball detection method provided by any embodiment of the present application and possess the functional modules corresponding to that method. For technical details not described in the foregoing embodiments, refer to the eyeball detection method provided by any embodiment of the present application.

Claims (13)

  1. An eyeball detection method, comprising:
    acquiring a target eye image to be detected;
    inputting the target eye image into a pre-trained eyeball detection model, wherein the eyeball detection model is a convolutional neural network model containing a reversible residual network; and
    determining position information of eyeball key points in the target eye image according to an output result of the eyeball detection model.
  2. The method according to claim 1, wherein the position information of the eyeball key points comprises at least one of coordinate information of the eyeball key points and visibility information of the eyeball key points.
  3. The method according to claim 1, wherein acquiring the target eye image to be detected comprises:
    detecting an image to be detected with a preset face detection method to determine eye-corner position information;
    cropping a binocular image according to the eye-corner position information; and
    determining the target eye image according to the binocular image.
  4. The method according to claim 3, wherein cropping the binocular image according to the eye-corner position information comprises:
    cropping a left-eye image and a right-eye image separately according to the eye-corner position information.
  5. The method according to claim 4, wherein the eye-corner position information comprises eye-corner position information corresponding to the left eye and eye-corner position information corresponding to the right eye; and
    cropping the left-eye image and the right-eye image separately according to the eye-corner position information comprises:
    determining relative positions of two inner eye-corner points of each of the left eye and the right eye according to the eye-corner position information corresponding to that eye;
    rotating the image to be detected according to the relative positions of the two inner eye-corner points of each eye so that the two inner eye-corner points of each eye lie on the same horizontal line; and
    cropping the image of each eye.
  6. The method according to claim 5, wherein rotating the image to be detected according to the relative positions of the two inner eye-corner points of each eye so that the two inner eye-corner points of each eye lie on the same horizontal line comprises:
    calculating a center point of a line connecting the two inner eye-corner points of each eye according to the relative positions of the two inner eye-corner points of that eye;
    calculating an angle between a horizontal line through the center point and the line connecting the two inner eye-corner points;
    determining a rotation matrix according to the angle; and
    rotating the image to be detected based on the rotation matrix so that the two inner eye-corner points of each eye lie on the same horizontal line.
  7. The method according to claim 6, wherein the position information of the eyeball key points in the target eye image comprises coordinate information of the eyeball key points in the image to be detected; and
    determining the position information of the eyeball key points in the target eye image according to the output result of the eyeball detection model comprises:
    determining relative position information of the eyeball key points in the target eye image according to the output result of the eyeball detection model, wherein the relative position information of the eyeball key points is the position information of the eyeball key points in the rotated image to be detected; and
    performing reverse rotation on the relative position information based on the rotation matrix to obtain the coordinate information, in the image to be detected, of the eyeball key points in the target eye image.
  8. The method according to claim 3, wherein determining the target eye image according to the binocular image comprises:
    shrinking the binocular image and adjusting it to a preset size to obtain the target eye image.
  9. The method according to claim 4, wherein inputting the target eye image into the pre-trained eyeball detection model and determining the position information of the eyeball key points in the target eye image according to the output result of the eyeball detection model comprises:
    inputting a first target eye image into the pre-trained eyeball detection model, and determining position information of eyeball key points in the first target eye image according to a first output result of the eyeball detection model; and
    horizontally flipping a second target eye image, inputting the horizontally flipped second target eye image into the eyeball detection model, determining position information of eyeball key points in the horizontally flipped second target eye image according to a second output result of the eyeball detection model, and horizontally flipping that position information to obtain position information of eyeball key points in the second target eye image;
    wherein the first target eye image is a target left-eye image and the second target eye image is a target right-eye image; or the first target eye image is a target right-eye image and the second target eye image is a target left-eye image.
  10. The method according to claim 1, wherein the eyeball detection model comprises a first fully connected layer and a second fully connected layer, coordinate information of the eyeball key points in the target eye image is determined according to an output of the first fully connected layer, and visibility information of the eyeball key points in the target eye image is determined according to an output of a preset activation function following the second fully connected layer.
  11. An eyeball detection apparatus, comprising:
    a target eye image acquisition module, configured to acquire a target eye image to be detected;
    an image input module, configured to input the target eye image into a pre-trained eyeball detection model, wherein the eyeball detection model is a convolutional neural network model containing a reversible residual network; and
    a position information determination module, configured to determine position information of eyeball key points in the target eye image according to an output result of the eyeball detection model.
  12. A computer device, comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor, when executing the computer program, implements the method according to any one of claims 1-10.
  13. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the method according to any one of claims 1-10.
PCT/CN2021/085237 2020-04-03 2021-04-02 Eyeball detection method, apparatus and device, and storage medium WO2021197466A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010261001.7A CN111476151B (en) 2020-04-03 2020-04-03 Eyeball detection method, device, equipment and storage medium
CN202010261001.7 2020-04-03

Publications (1)

Publication Number Publication Date
WO2021197466A1 true WO2021197466A1 (en) 2021-10-07

Family

ID=71750560

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/085237 WO2021197466A1 (en) 2020-04-03 2021-04-02 Eyeball detection method, apparatus and device, and storage medium

Country Status (2)

Country Link
CN (1) CN111476151B (en)
WO (1) WO2021197466A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111476151B (en) * 2020-04-03 2023-02-03 广州市百果园信息技术有限公司 Eyeball detection method, device, equipment and storage medium
CN113591815B (en) * 2021-09-29 2021-12-21 北京万里红科技有限公司 Method for generating canthus recognition model and method for recognizing canthus in eye image

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108009531A (en) * 2017-12-28 2018-05-08 北京工业大学 A kind of face identification method of more tactful antifraud
CN108229301A (en) * 2017-11-03 2018-06-29 北京市商汤科技开发有限公司 Eyelid line detecting method, device and electronic equipment
CN108229293A (en) * 2017-08-09 2018-06-29 北京市商汤科技开发有限公司 Face image processing process, device and electronic equipment
CN108509894A (en) * 2018-03-28 2018-09-07 北京市商汤科技开发有限公司 Method for detecting human face and device
US10185891B1 (en) * 2016-07-08 2019-01-22 Gopro, Inc. Systems and methods for compact convolutional neural networks
CN110096968A (en) * 2019-04-10 2019-08-06 西安电子科技大学 A kind of ultrahigh speed static gesture identification method based on depth model optimization
CN111476151A (en) * 2020-04-03 2020-07-31 广州市百果园信息技术有限公司 Eyeball detection method, device, equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10684681B2 (en) * 2018-06-11 2020-06-16 Fotonation Limited Neural network image processing apparatus
CN110555426A (en) * 2019-09-11 2019-12-10 北京儒博科技有限公司 Sight line detection method, device, equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115578753A (en) * 2022-09-23 2023-01-06 中国科学院半导体研究所 Human body key point detection method and device, electronic equipment and storage medium
CN115578753B (en) * 2022-09-23 2023-05-05 中国科学院半导体研究所 Human body key point detection method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN111476151B (en) 2023-02-03
CN111476151A (en) 2020-07-31

Similar Documents

Publication Publication Date Title
WO2021197466A1 (en) Eyeball detection method, apparatus and device, and storage medium
CN109359575B (en) Face detection method, service processing method, device, terminal and medium
US11107232B2 (en) Method and apparatus for determining object posture in image, device, and storage medium
US11778403B2 (en) Personalized HRTFs via optical capture
CN108960045A (en) Eyeball tracking method, electronic device and non-transient computer-readable recording medium
CN111091075B (en) Face recognition method and device, electronic equipment and storage medium
GB2560340A (en) Verification method and system
US11120535B2 (en) Image processing method, apparatus, terminal, and storage medium
US9892315B2 (en) Systems and methods for detection of behavior correlated with outside distractions in examinations
WO2016013634A1 (en) Image registration device, image registration method, and image registration program
WO2021169754A1 (en) Photographic composition prompting method and apparatus, storage medium, and electronic device
WO2020063000A1 (en) Neural network training and line of sight detection methods and apparatuses, and electronic device
US10991124B2 (en) Determination apparatus and method for gaze angle
WO2021218568A1 (en) Image depth determination method, living body recognition method, circuit, device, and medium
CN112257696A (en) Sight estimation method and computing equipment
WO2020164284A1 (en) Method and apparatus for recognising living body based on planar detection, terminal, and storage medium
CN106713740A (en) Positioning and tracking video shooting method and system
WO2021135639A1 (en) Living body detection method and apparatus
JP2017123087A (en) Program, device and method for calculating normal vector of planar object reflected in continuous photographic images
CN112017212B (en) Training and tracking method and system of face key point tracking model
CN111325107A (en) Detection model training method and device, electronic equipment and readable storage medium
CN111563490A (en) Face key point tracking method and device and electronic equipment
Cai et al. Gaze estimation driven solution for interacting children with ASD
Tepencelik et al. Body and head orientation estimation with privacy preserving LiDAR sensors
WO2022078291A1 (en) Sound pickup method and sound pickup apparatus

Legal Events

Date Code Title Description
121  Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21779951; Country of ref document: EP; Kind code of ref document: A1)
NENP  Non-entry into the national phase (Ref country code: DE)
122  Ep: pct application non-entry in european phase (Ref document number: 21779951; Country of ref document: EP; Kind code of ref document: A1)