CN111353429A - Interest degree method and system based on eyeball turning

Interest degree method and system based on eyeball turning

Info

Publication number
CN111353429A
CN111353429A (application number CN202010128432.6A)
Authority
CN
China
Prior art keywords
eyeball
characteristic
video
target
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010128432.6A
Other languages
Chinese (zh)
Inventor
卢宁 (Lu Ning)
徐国强 (Xu Guoqiang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
OneConnect Smart Technology Co Ltd
OneConnect Financial Technology Co Ltd Shanghai
Original Assignee
OneConnect Financial Technology Co Ltd Shanghai
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by OneConnect Financial Technology Co Ltd Shanghai filed Critical OneConnect Financial Technology Co Ltd Shanghai
Priority to CN202010128432.6A priority Critical patent/CN111353429A/en
Publication of CN111353429A publication Critical patent/CN111353429A/en
Priority to PCT/CN2021/071261 priority patent/WO2021169642A1/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18: Eye characteristics, e.g. of the iris
    • G06V40/193: Preprocessing; Feature extraction
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011: Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013: Eye tracking input arrangements

Abstract

The invention discloses a video-based eyeball turning determination method, which comprises the following steps: acquiring a target video, wherein the target video is a video of a target user watching a target product; inputting the target video into an eyeball turning feature recognition model to obtain an eyeball turning feature queue matrix; and determining a target turning angle of the target user based on the eyeball turning feature queue matrix. The invention also discloses a video-based eyeball turning determination system. By inputting the target video into the eyeball turning feature recognition model to obtain the eyeball turning feature queue matrix, and then deriving from that matrix the target turning angle and the target turning time with respect to the corresponding target product, the invention improves the accuracy of eyeball tracking.

Description

Interest degree method and system based on eyeball turning
Technical Field
The embodiment of the invention relates to the technical field of computer vision, in particular to a method and a system for determining eyeball turning based on video.
Background Art
Eye tracking has long been used to study the visual attention of individuals, and the most common eye tracking technique is pupil center corneal reflection (PCCR). The principle of PCCR is that a light source illuminates the eye and produces highly visible reflections that are captured by the camera of the eye-tracking device; the captured images are used to identify the reflections of the light source on the cornea and in the pupil, and the gaze direction is finally obtained by calculating the vector angle formed between the corneal reflection and the pupil, together with other geometric features. However, this scheme depends heavily on the light source, is subject to many interference factors, and its recognition is inaccurate.
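For illustration only (this is background art, not part of the claimed method), a minimal single-glint PCCR-style calculation might look as follows, assuming the pupil center and the corneal reflection have already been located in image coordinates; the function name and the single-glint simplification are assumptions.

```python
import math

def pccr_gaze_direction(pupil_center, glint_center):
    """Estimate a coarse gaze direction from the pupil-center-to-glint vector.

    Single-glint simplification: the vector from the corneal reflection (glint)
    to the pupil center is used as a proxy for gaze direction. Real PCCR systems
    rely on calibrated geometry and usually more than one glint.
    """
    dx = pupil_center[0] - glint_center[0]
    dy = pupil_center[1] - glint_center[1]
    angle_deg = math.degrees(math.atan2(dy, dx))  # direction of the vector, in degrees
    magnitude = math.hypot(dx, dy)                # larger offset ~ larger gaze deviation
    return angle_deg, magnitude

# Example: pupil center at (102, 88), corneal glint at (98, 90), in pixel coordinates
print(pccr_gaze_direction((102, 88), (98, 90)))
```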
Current vision applications in artificial intelligence are mainly based on image processing, or decompose a video into individual frames and process them one by one, so they are essentially single-frame approaches. The relationship between frames is not modeled, and the correlation and continuity between successive images cannot be reflected; as a result, eyeball tracking is not accurate enough.
Disclosure of Invention
In view of this, an object of the embodiments of the present invention is to provide a method and a system for determining eyeball steering based on video, so as to improve the accuracy of eyeball tracking.
In order to achieve the above object, an embodiment of the present invention provides a method for determining eyeball steering based on a video, including:
acquiring a target video, wherein the target video is a video of a target product watched by a target user;
performing eyeball feature labeling on the target video to obtain a labeled video;
inputting the marked video into an eyeball turning characteristic recognition model, wherein the eyeball turning characteristic recognition model comprises an eyeball characteristic extraction layer, a frame relation processing layer and an eyeball turning action recognition layer;
converting each frame of image of the marked video into a characteristic matrix through the eyeball characteristic extraction layer, and inputting the characteristic matrix corresponding to each frame of image into the frame relation processing layer;
the frame relation processing layer sorts the feature matrixes of each frame of image according to the video time points corresponding to the feature matrixes to obtain feature queues, and the feature queues are input to the eyeball turning action recognition layer;
and the eyeball turning action recognition layer performs characteristic fusion on the characteristic queue to obtain an eyeball turning characteristic queue matrix, and determines the target turning angle of the target user based on the eyeball turning characteristic queue matrix.
Further, after determining the target steering angle of the target user based on the eyeball steering characteristic queue matrix, the method further includes:
acquiring a video time point corresponding to the eyeball turning characteristic queue matrix;
and calculating the distance from the video time point corresponding to the first characteristic matrix to the video time point corresponding to the last characteristic matrix in the eyeball turning characteristic queue matrix as eyeball turning time.
Further, the performing eyeball feature labeling on the target video to obtain a labeled video includes:
identifying eyeball characteristics of each frame of image in the target video;
and performing frame selection on the area where the eyeball characteristics are located through the marking frame to obtain a marked video.
Further, the converting each frame of image of the annotated video into a feature matrix by the eyeball feature extraction layer comprises:
determining eyeball key points of each frame of image of the annotated video, wherein the eyeball key points comprise 128 key points or 256 key points;
acquiring pixel point coordinates of eyeball key points of each frame of image;
and establishing a characteristic matrix according to the eyeball key points of each frame of image, wherein the characteristic matrix comprises 128 or 256 pixel point coordinates.
Further, the performing feature fusion on the feature queue by the eyeball turning action recognition layer to obtain an eyeball turning feature queue matrix includes:
calculating the differential image characteristics of the adjacent frame images to judge whether the eyeball characteristics corresponding to the adjacent frame images are the same or not;
if the characteristic matrixes are the same, the characteristic matrix corresponding to one frame of image is reserved, and the other same characteristic matrix is deleted from the characteristic queue until the characteristic matrixes in the characteristic queue are different, so that a target characteristic queue is obtained;
and combining the characteristic matrixes in the target characteristic queue to obtain the eyeball turning characteristic queue matrix.
Further, calculating a difference image feature of the target image of the adjacent frame to determine whether eyeball features corresponding to the target image of the adjacent frame are the same includes:
acquiring pixel point coordinates of adjacent frame images;
carrying out differential operation on the pixel point coordinates of the adjacent frame images to obtain differential image characteristics;
and comparing the differential image characteristics with a preset binarization threshold value to judge whether the eyeball characteristics corresponding to the target images of adjacent frames are the same.
Further, determining the target steering angle of the target user based on the eye-steering feature queue matrix comprises:
marking the position of a product in the target video by taking the central position of the eyeball of the target user as an origin;
and calculating a matrix value of the eyeball turning characteristic queue matrix to obtain a target turning angle.
In order to achieve the above object, an embodiment of the present invention further provides a video-based eyeball steering determination system, including:
the acquisition module is used for acquiring a target video, wherein the target video is a video of a target product watched by a target user;
the labeling module is used for labeling eyeball characteristics of the target video to obtain a labeled video;
the input module is used for inputting the marked video into an eyeball turning characteristic identification model, wherein the eyeball turning characteristic identification model comprises an eyeball characteristic extraction layer, a frame relation processing layer and an eyeball turning action identification layer;
the conversion module is used for converting each frame of image of the annotated video into a feature matrix through the eyeball feature extraction layer and inputting the feature matrix corresponding to each frame of image into the frame relation processing layer;
the characteristic sorting module is used for sorting the characteristic matrix of each frame of image by the frame relation processing layer according to the video time point corresponding to the characteristic matrix to obtain a characteristic queue and inputting the characteristic queue to the eyeball turning action identification layer;
and the characteristic fusion and output module is used for performing characteristic fusion on the characteristic queue by the eyeball turning action recognition layer to obtain an eyeball turning characteristic queue matrix, and determining the target turning angle of the target user based on the eyeball turning characteristic queue matrix.
To achieve the above object, an embodiment of the present invention further provides a computer device, which includes a memory and a processor, where the memory stores a computer program that is executable on the processor, and the computer program, when executed by the processor, implements the steps of the video-based eye turning determination method as described above.
To achieve the above object, an embodiment of the present invention further provides a computer-readable storage medium, in which a computer program is stored, where the computer program is executable by at least one processor to cause the at least one processor to execute the steps of the video-based eyeball turning determination method as described above.
According to the method and the system for determining the eyeball turning direction based on the video, provided by the embodiment of the invention, the target video is marked to obtain the marked video, the marked video is input into the eyeball turning characteristic identification model to obtain the eyeball turning characteristic queue matrix, and then the target turning angle of the corresponding target user is obtained based on the eyeball turning characteristic queue matrix, so that the accuracy of eyeball tracking is improved.
Drawings
Fig. 1 is a flowchart of a first embodiment of a video-based eyeball-direction determination method according to the present invention.
Fig. 2 is a flowchart of step S102 in fig. 1 according to an embodiment of the present invention.
FIG. 3 is a flowchart illustrating step S106 in FIG. 1 according to an embodiment of the present invention.
Fig. 4 is a flowchart of step S110 in fig. 1 according to an embodiment of the present invention.
Fig. 5 is a flowchart of step S110A in fig. 4 according to the embodiment of the present invention.
FIG. 6 is a flowchart illustrating another embodiment of step S110 in FIG. 1 according to the present invention.
Fig. 7 is a flowchart of step S111 and step S112 according to an embodiment of the invention.
Fig. 8 is a schematic diagram of program modules of a second embodiment of a video-based eye-turning determination system according to the present invention.
Fig. 9 is a schematic diagram of a hardware structure of a third embodiment of the computer apparatus according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example one
Referring to fig. 1, a flowchart illustrating steps of a video-based eye-turning determination method according to a first embodiment of the present invention is shown. It is to be understood that the flow charts in the embodiments of the present method are not intended to limit the order in which the steps are performed. The following description is made by way of example with the computer device 2 as the execution subject. The details are as follows.
Step S100, a target video is obtained, wherein the target video is the video of a target product watched by a target user.
Specifically, the process that the target user watches the target product is shot through the camera to obtain a target video, and the target video is transmitted to the computer device 2 for processing.
Step S102, performing eyeball feature annotation on the target video to obtain an annotated video.
Specifically, each frame of image of the target video is subjected to image segmentation, object detection, image annotation and other processing, so as to obtain an annotated video.
Exemplarily, referring to fig. 2, step S102 further includes:
step S102A, identifying eyeball features of each frame of image in the target video.
Specifically, eyeball features of each frame of image in the target video are identified through eyeball key point detection.
Step S102B, selecting the area where the eyeball characteristics are located through the marking frame to obtain a marked video.
Specifically, the region corresponding to the eyeball key points of each video frame is box-selected through the labeling frame to obtain the annotated video. The orientation of the eyeball is also labeled so as to obtain the turning motion area of the eyeball in the target video.
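The patent does not prescribe a particular key point detector; as one hedged illustration of this step, the sketch below uses the dlib 68-point facial landmark model (eye landmarks at indices 36 to 47) and OpenCV to box-select the eye region in one frame. The model file path is an assumption.

```python
import cv2
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
# Pretrained 68-point landmark model; the file path is an assumption.
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def label_eye_region(frame):
    """Draw a labeling box around the detected eyeball key points of one frame."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for face in detector(gray):
        shape = predictor(gray, face)
        # Indices 36-47 are the left and right eye landmarks in the 68-point model.
        eye_pts = np.array([(shape.part(i).x, shape.part(i).y) for i in range(36, 48)],
                           dtype=np.int32)
        x, y, w, h = cv2.boundingRect(eye_pts)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    return frame
```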
And step S104, inputting the marked video into an eyeball turning characteristic identification model, wherein the eyeball turning characteristic identification model comprises an eyeball characteristic extraction layer, a frame relation processing layer and an eyeball turning action identification layer.
Specifically, the eyeball turning characteristic identification model is a pre-trained model and is used for analyzing the marked video and obtaining an eyeball turning characteristic queue matrix. Pre-training an eyeball turning characteristic recognition model based on a deep learning network model:
acquiring a large amount of sample video data, and identifying each frame of sample eyeball characteristic area in each sample video data to obtain a sample image; labeling the sample images according to the time sequence to obtain labeled sample images; inputting the marked sample image into a deep neural network, and extracting a sample characteristic vector of the marked sample image from a CNN convolution layer of the deep neural network; calculating the difference between the marked sample images of the adjacent frames by pixel processing of the sample characteristic vectors to obtain a difference value; deleting the same sample image according to the difference value to obtain a feature queue; and outputting an eyeball steering characteristic queue matrix obtained based on the characteristic queue through the full-connection output layer. The feature extraction method includes, but is not limited to, a facial feature extraction algorithm based on a deep neural network and an eyeball turning feature extraction algorithm based on geometric features.
Illustratively, the eyeball feature extraction layer is used for extracting eyeball features of a target user from each frame of image of the target video and converting the eyeball features into a feature matrix;
the frame relation processing layer is used for determining the frame relation between each frame of image with eyeball characteristics according to the video time point of each frame of image of the target video; and
and the eyeball turning action recognition layer is used for determining an eyeball turning characteristic queue matrix of the target user according to the frame relation and the characteristic matrix.
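The patent names the three layers but not their internal architecture; the following PyTorch sketch is only one possible reading of the eyeball feature extraction layer, the frame relation processing layer (here reduced to dropping near-duplicate frames), and a fully connected output producing the turning feature queue matrix. All layer sizes, the threshold and the class name are assumptions.

```python
import torch
import torch.nn as nn

class EyeTurnFeatureModel(nn.Module):
    """Sketch of the eyeball turning feature recognition model:
    CNN feature extraction -> frame relation (drop duplicates) -> FC output."""

    def __init__(self, feat_dim=128, out_dim=64, diff_threshold=1e-3):
        super().__init__()
        self.extractor = nn.Sequential(              # eyeball feature extraction layer
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(8), nn.Flatten(),
            nn.Linear(16 * 8 * 8, feat_dim),
        )
        self.output = nn.Linear(feat_dim, out_dim)   # fully connected output layer
        self.diff_threshold = diff_threshold

    def forward(self, frames):                       # frames: (T, 1, H, W), in time order
        feats = self.extractor(frames)               # (T, feat_dim), one vector per frame
        kept = [feats[0]]
        for t in range(1, feats.size(0)):            # frame relation processing layer:
            if (feats[t] - kept[-1]).abs().mean() > self.diff_threshold:
                kept.append(feats[t])                # keep only frames whose features changed
        queue = torch.stack(kept)                    # feature queue in time order
        return self.output(queue)                    # eyeball turning feature queue matrix

# Example: 10 grayscale frames of 64x64 pixels
model = EyeTurnFeatureModel()
print(model(torch.randn(10, 1, 64, 64)).shape)
```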
And step S106, converting each frame of image of the annotated video into a feature matrix through the eyeball feature extraction layer, and inputting the feature matrix corresponding to each frame of image into the frame relation processing layer.
Specifically, the eyeball feature extraction layer splits the target video into individual frames and extracts eyeball turning features from each frame of image to obtain the features corresponding to each frame. The eyeball features are composed of a plurality of key points, and can be feature matrices composed of 128 or 256 key points.
Exemplarily, referring to fig. 3, step S106 further includes:
step S106A, determining eyeball key points of each frame of image of the annotation video, where the eyeball key points include 128 key points or 256 key points.
Specifically, the eyeball feature extraction layer splits the annotated video into individual frames and extracts eyeball turning features from each frame of image to obtain a feature matrix corresponding to each frame. The eyeball feature is composed of a plurality of eyeball key points, and can be 128 key points or 256 key points.
Step S106B, obtaining coordinates of pixel points of the eyeball key points of each frame of image.
Specifically, each frame of image is converted to grayscale to obtain a two-dimensional gray image, and the pixel point coordinates of each eyeball key point are then obtained as two-dimensional coordinates in this image.
Step S106C, establishing a feature matrix according to the eyeball key points of each frame of image, wherein the feature matrix comprises 128 or 256 pixel point coordinates.
Specifically, the coordinates of the pixel points are sorted to obtain a feature matrix in the form of 128 rows or 256 rows and 2 columns.
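As a concrete sketch of this step (the helper name is an assumption), the key point coordinates of one frame can be arranged into a 128x2 or 256x2 feature matrix as follows:

```python
import numpy as np

def build_feature_matrix(keypoints):
    """Arrange eyeball key point pixel coordinates into a feature matrix
    of 128 or 256 rows and 2 columns."""
    matrix = np.asarray(keypoints, dtype=np.float32)
    if matrix.shape not in {(128, 2), (256, 2)}:
        raise ValueError(f"expected 128 or 256 (x, y) key points, got shape {matrix.shape}")
    return matrix

# Example with 128 dummy key points taken from a grayscale frame
dummy_keypoints = [(i % 32, i // 32) for i in range(128)]
print(build_feature_matrix(dummy_keypoints).shape)   # (128, 2)
```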
And step S108, the frame relation processing layer sorts the feature matrixes of each frame of image according to the video time points corresponding to the feature matrixes to obtain feature queues, and the feature queues are input to the eyeball turning action recognition layer.
Specifically, the frame relation processing layer calculates the feature matrices corresponding to adjacent video time points to determine whether the frame images need to be processed. The frame relation processing layer performs a difference operation on two adjacent frames to obtain difference image features, and obtains the movement route of the eyeball turning through difference image feature analysis: when the difference image features of two adjacent frames change from changing to remaining unchanged, the eyeball has finished its turning movement at that moment; when the difference image features of two adjacent frames change from unchanged to changing, the eyeball begins its turning motion at that moment, and the feature queue at that moment is obtained. The feature matrices of each frame of image are arranged in the order of their video time points to obtain a feature queue, which facilitates subsequent calculation. The feature queue is taken as the frame relation between the features corresponding to each frame of image.
Step S110, the eyeball turning action recognition layer performs feature fusion on the feature queue to obtain an eyeball turning feature queue matrix, and determines the target turning angle of the target user based on the eyeball turning feature queue matrix.
Specifically, the eyeball turning action recognition layer performs duplicate checking on the feature queue and deletes identical features in the queue to obtain a target feature queue in which the eyeball features all differ, and the arrays of the target feature queue are combined in time order to obtain the eyeball turning feature queue matrix.
Exemplarily, referring to fig. 4, step S110 further includes:
in step S110A, difference image features of adjacent frame images are calculated to determine whether corresponding eyeball features of the adjacent frame images are the same.
Specifically, the difference between the feature matrices of the adjacent frame images is calculated through difference operation, and the difference image features are obtained. When the difference image characteristics of two adjacent frames of images are changed to be kept unchanged, indicating that the eyeball turns to complete turning movement at the moment; when the difference image characteristic of two adjacent frame images changes from unchanged to changed, the eyeball begins to perform eyeball turning motion at the moment.
Step S110B, if the feature matrixes are the same, retaining the feature matrix corresponding to one of the frames of images, and deleting another same feature matrix from the feature queue until the feature matrixes in the feature queue are all different, thereby obtaining a target feature queue.
Specifically, if the features are the same, the feature matrix corresponding to the eyeball features of one of the frames is retained; the retained feature may be either the later or the earlier one. If the retained eyeball feature is the last of the identical eyeball features, the feature queue includes the turning time; if the retained eyeball feature is not the last of the identical features, the feature queue does not include the turning time. When the feature matrices in the feature queue are all different, that is, the eyeball orientations all differ, the multi-frame images corresponding to the feature queue represent an eyeball motion area.
Step S110C, combining the feature matrices in the target feature queue to obtain the eyeball turning feature queue matrix.
Specifically, the target feature queue comprises a target steering angle, the target steering angle corresponds to the feature queue, and feature matrices of the feature queue are combined according to a time sequence to obtain an eyeball steering feature queue matrix.
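A minimal sketch of the de-duplication and time-ordered combination described above (the function and variable names are assumptions):

```python
import numpy as np

def build_turn_queue_matrix(feature_matrices, time_points, atol=1e-6):
    """Keep only frames whose feature matrix differs from the previously kept one,
    then stack the survivors in time order into the eyeball turning feature queue matrix."""
    order = np.argsort(time_points)
    kept, kept_times = [], []
    for idx in order:
        fm = feature_matrices[idx]
        if not kept or not np.allclose(fm, kept[-1], atol=atol):
            kept.append(fm)
            kept_times.append(time_points[idx])
    return np.stack(kept), kept_times   # shape (num_kept, 128, 2), plus their video time points

# Example: three frames, the first two identical
frames = [np.zeros((128, 2)), np.zeros((128, 2)), np.ones((128, 2))]
matrix, times = build_turn_queue_matrix(frames, [0.0, 0.04, 0.08])
print(matrix.shape, times)   # (2, 128, 2) [0.0, 0.08]
```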
Exemplarily, referring to fig. 5, the step S110A further includes:
step S110a1, obtaining coordinates of pixel points of adjacent frame images.
Specifically, let the current frame image be F_k(x, y) and the previous frame image be F_(k-1)(x, y), where (x, y) are the coordinates of the pixel points in each frame image.
Step S110a2, performing difference operation on the pixel coordinates of the adjacent frame images to obtain a difference image feature.
Specifically, the calculation uses the difference operation formula D_k(x, y) = |F_k(x, y) - F_(k-1)(x, y)|, where D_k(x, y) is the difference image feature.
Step S110a3, comparing the difference image features with a preset binarization threshold to determine whether the eyeball features corresponding to the target images of adjacent frames are the same.
Specifically, the difference image feature is compared with a preset binarization threshold T according to the formula D_k(x, y) > T: if the difference image feature is larger than the preset binarization threshold, the eyeball features of the two frames are different; if it is not larger than the threshold, they are the same.
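Taken together, steps S110A1 to S110A3 amount to the following hedged sketch (OpenCV is used only for the grayscale conversion and the absolute difference; the threshold value of 25 is an assumption):

```python
import cv2

def eyeball_features_same(frame_k, frame_k_minus_1, threshold=25):
    """Compute D_k(x, y) = |F_k(x, y) - F_(k-1)(x, y)| and compare it with a
    preset binarization threshold T.

    Returns True when no pixel difference exceeds T (the eyeball features of the
    adjacent frames are judged the same), and False otherwise."""
    gray_k = cv2.cvtColor(frame_k, cv2.COLOR_BGR2GRAY)
    gray_prev = cv2.cvtColor(frame_k_minus_1, cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(gray_k, gray_prev)   # D_k(x, y)
    return int(diff.max()) <= threshold
```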
Exemplarily, referring to fig. 6, step S110 further includes:
step S1101, marking coordinates of the position of the product in the target video with the center position of the eyeball of the target user as an origin.
Specifically, the target video is provided with a plurality of products, and the position of each product is marked by taking the central position of the eyeball of the target user as an origin, so that the target steering angle calculated based on the eyeball steering characteristic queue matrix corresponds to the target product. The angle of each product relative to the center position of the eyeball can also be calculated according to the coordinates of the products.
Step S1102, calculating a matrix value of the eyeball steering characteristic queue matrix to obtain a target steering angle.
Specifically, a matrix value of an eyeball steering characteristic queue matrix corresponding to the target characteristic queue is calculated, a target steering angle of the target user is obtained, and the target steering angle corresponds to the coordinate, so that the target product is obtained. And if the position corresponding to the target steering angle has deviation, selecting the product closest to the target steering angle as the target product.
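A hedged sketch of mapping the computed target turning angle onto the closest labelled product (the product layout, the helper names and the use of atan2 are assumptions, not the patent's formula):

```python
import math

def nearest_product(target_angle_deg, products):
    """Pick the product whose angle, measured from the eyeball-center origin,
    is closest to the target turning angle.

    `products` maps product name -> (x, y) coordinates with the eyeball center at (0, 0)."""
    def angle_of(pos):
        return math.degrees(math.atan2(pos[1], pos[0]))

    def angular_gap(a, b):
        return abs((a - b + 180.0) % 360.0 - 180.0)   # wrap-around angular difference

    return min(products, key=lambda name: angular_gap(target_angle_deg, angle_of(products[name])))

# Example: three products laid out around the viewer
layout = {"product_a": (1.0, 0.0), "product_b": (0.0, 1.0), "product_c": (-1.0, 0.0)}
print(nearest_product(80.0, layout))   # product_b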
Illustratively, referring to fig. 7, the method further comprises:
and step S111, acquiring a video time point corresponding to the eyeball turning characteristic queue matrix.
Specifically, the video time points corresponding to the plurality of frame images corresponding to the eyeball turning characteristic queue matrix are obtained, and the frame relationship processing layer can perform time labeling so as to obtain the video time points.
Step S112, calculating a distance between a video time point corresponding to the first feature matrix and a video time point corresponding to the last feature matrix in the eyeball turning feature queue matrix, as eyeball turning time.
Specifically, the video time point of the first frame image is subtracted from that of the last frame image to obtain the target turning time for the target product. The reciprocal of the target turning time is taken as the interest degree: the shorter the turning time, the larger the reciprocal and the higher the interest degree.
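A tiny worked sketch of this last computation, following the reciprocal definition above (times are video time points in seconds; the helper name is an assumption):

```python
def turning_time_and_interest(time_points):
    """Turning time = last video time point minus the first;
    interest degree = its reciprocal (shorter turn, higher interest)."""
    turning_time = time_points[-1] - time_points[0]
    interest = 1.0 / turning_time if turning_time > 0 else float("inf")
    return turning_time, interest

print(turning_time_and_interest([10.0, 10.2, 10.5]))   # (0.5, 2.0)
```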
Example two
Referring to fig. 8, a schematic diagram of program modules of a second embodiment of the video-based eyeball turning determination system according to the invention is shown. In the present embodiment, the video-based eyeball turning determination system 20 may include or be divided into one or more program modules, which are stored in a storage medium and executed by one or more processors to implement the present invention and the above-described video-based eyeball turning determination method. The program modules referred to in the embodiments of the present invention are a series of computer program instruction segments capable of performing specific functions, and are more suitable than the program itself for describing the execution process of the video-based eyeball turning determination system 20 in the storage medium. The following description specifically describes the functions of the program modules of the present embodiment:
the first obtaining module 200 is configured to obtain a target video, where the target video is a video of a target product watched by a target user.
Specifically, the process that the target user watches the target product is shot through the camera to obtain a target video, and the target video is transmitted to the computer device 2 for processing.
And the labeling module 202 is configured to perform eyeball feature labeling on the target video to obtain a labeled video.
Illustratively, the annotation module 202 is further configured to:
and identifying eyeball characteristics of each frame of image in the target video.
Specifically, eyeball features of each frame of image in the target video are identified through eyeball key point detection.
And performing frame selection on the area where the eyeball characteristics are located through the marking frame to obtain a marked video.
Specifically, the region corresponding to the eyeball key points of each video frame is box-selected through the labeling frame to obtain the annotated video. The orientation of the eyeball is also labeled so as to obtain the turning motion area of the eyeball in the target video.
The input module 204 is configured to input the annotation video into an eyeball turning characteristic identification model, where the eyeball turning characteristic identification model includes an eyeball characteristic extraction layer, a frame relationship processing layer, and an eyeball turning action identification layer.
Specifically, the eyeball turning characteristic identification model is a pre-trained model and is used for analyzing the marked video and obtaining an eyeball turning characteristic queue matrix. Pre-training an eyeball turning characteristic recognition model based on a deep learning network model:
acquiring a large amount of sample video data, and identifying each frame of sample eyeball characteristic area in each sample video data to obtain a sample image; labeling the sample images according to the time sequence to obtain labeled sample images; inputting the marked sample image into a deep neural network, and extracting a sample characteristic vector of the marked sample image from a CNN convolution layer of the deep neural network; calculating the difference between the marked sample images of the adjacent frames by pixel processing of the sample characteristic vectors to obtain a difference value; deleting the same sample image according to the difference value to obtain a feature queue; and outputting an eyeball steering characteristic queue matrix obtained based on the characteristic queue through the full-connection output layer. The feature extraction method includes, but is not limited to, a facial feature extraction algorithm based on a deep neural network and an eyeball turning feature extraction algorithm based on geometric features.
Illustratively, the eyeball feature extraction layer is used for extracting eyeball features of a target user from each frame of image of the target video and converting the eyeball features into a feature matrix;
the frame relation processing layer is used for determining the frame relation between each frame of image with eyeball characteristics according to the video time point of each frame of image of the target video; and
and the eyeball turning action recognition layer is used for determining an eyeball turning characteristic queue matrix of the target user according to the frame relation and the characteristic matrix.
The conversion module 206 is configured to convert each frame of image of the annotation video into a feature matrix through the eyeball feature extraction layer, and input the feature matrix corresponding to each frame of image into the frame relationship processing layer.
Specifically, the eyeball feature extraction layer splits the target video into individual frames and extracts eyeball turning features from each frame of image to obtain the features corresponding to each frame. The eyeball features are composed of a plurality of key points, and can be feature matrices composed of 128 or 256 key points.
Illustratively, the conversion module 206 is further configured to:
determining eyeball key points of each frame of image of the annotated video, wherein the eyeball key points comprise 128 key points or 256 key points.
Specifically, the eyeball feature extraction layer splits the annotated video into individual frames and extracts eyeball turning features from each frame of image to obtain a feature matrix corresponding to each frame. The eyeball feature is composed of a plurality of eyeball key points, and can be 128 key points or 256 key points.
And acquiring the pixel point coordinates of the eyeball key points of each frame of image.
Specifically, each frame of image is converted to grayscale to obtain a two-dimensional gray image, and the pixel point coordinates of each eyeball key point are then obtained as two-dimensional coordinates in this image.
And establishing a characteristic matrix according to the eyeball key points of each frame of image, wherein the characteristic matrix comprises 128 or 256 pixel point coordinates.
Specifically, the coordinates of the pixel points are sorted to obtain a feature matrix in the form of 128 rows or 256 rows and 2 columns.
And the feature sorting module 208 is configured to sort, by the frame relation processing layer, the feature matrix of each frame of image according to the video time point corresponding to the feature matrix to obtain a feature queue, and input the feature queue to the eyeball turning action identification layer.
Specifically, the frame relation processing layer calculates the feature matrices corresponding to adjacent video time points to determine whether the frame images need to be processed. The frame relation processing layer performs a difference operation on two adjacent frames to obtain difference image features, and obtains the movement route of the eyeball turning through difference image feature analysis: when the difference image features of two adjacent frames change from changing to remaining unchanged, the eyeball has finished its turning movement at that moment; when the difference image features of two adjacent frames change from unchanged to changing, the eyeball begins its turning motion at that moment, and the feature queue at that moment is obtained. The feature matrices of each frame of image are arranged in the order of their video time points to obtain a feature queue, which facilitates subsequent calculation. The feature queue is taken as the frame relation between the features corresponding to each frame of image.
A feature fusion and output module 210, configured to perform feature fusion on the feature queue by the eyeball turning motion recognition layer to obtain an eyeball turning feature queue matrix, and determine a target turning angle of the target user based on the eyeball turning feature queue matrix.
Specifically, the eyeball turning action recognition layer performs duplicate checking on the feature queue and deletes identical features in the queue to obtain a target feature queue in which the eyeball features all differ, and the arrays of the target feature queue are combined in time order to obtain the eyeball turning feature queue matrix.
Illustratively, the feature fusion and output module 210 is further configured to:
and calculating the difference image characteristics of the adjacent frame images to judge whether the eyeball characteristics corresponding to the adjacent frame images are the same.
Specifically, the difference between the feature matrices of the adjacent frame images is calculated through difference operation, and the difference image features are obtained. When the difference image characteristics of two adjacent frames of images are changed to be kept unchanged, indicating that the eyeball turns to complete turning movement at the moment; when the difference image characteristic of two adjacent frame images changes from unchanged to changed, the eyeball begins to perform eyeball turning motion at the moment.
If the characteristic matrixes are the same, the characteristic matrix corresponding to one frame of image is reserved, and the other same characteristic matrix is deleted from the characteristic queue until the characteristic matrixes in the characteristic queue are different, so that the target characteristic queue is obtained.
Specifically, if the features are the same, one of the eyeball features is retained; the retained feature may be either the later or the earlier one. If the retained eyeball feature is the last of the identical eyeball features, the feature queue includes the turning time; if it is not the last of the identical features, the feature queue does not include the turning time. When the feature matrices in the target feature queue are all different, that is, the eyeball orientations all differ, the multi-frame images corresponding to the target feature queue represent an eyeball motion area.
And combining the characteristic matrixes in the target characteristic queue to obtain the eyeball turning characteristic queue matrix.
Specifically, the target feature queue comprises a target steering angle, the target steering angle corresponds to the feature queue, and feature matrices of the feature queue are combined according to a time sequence to obtain an eyeball steering feature queue matrix.
Illustratively, the feature fusion and output module 210 is further configured to:
and acquiring the pixel point coordinates of the adjacent frame images.
Specifically, let the current frame image be F_k(x, y) and the previous frame image be F_(k-1)(x, y), where (x, y) are the coordinates of the pixel points in each frame image.
And carrying out differential operation on the pixel point coordinates of the adjacent frame images to obtain the differential image characteristics.
Specifically, the calculation uses the difference operation formula D_k(x, y) = |F_k(x, y) - F_(k-1)(x, y)|, where D_k(x, y) is the difference image feature.
And comparing the differential image characteristics with a preset binarization threshold value to judge whether the eyeball characteristics corresponding to the target images of adjacent frames are the same.
Specifically, the difference image feature is compared with a preset binarization threshold T according to the formula D_k(x, y) > T: if the difference image feature is larger than the preset binarization threshold, the eyeball features of the two frames are different; if it is not larger than the threshold, they are the same.
Illustratively, the feature fusion and output module 210 is further configured to:
and marking the position of the product in the target video by taking the central position of the eyeball of the target user as an origin.
Specifically, the target video is provided with a plurality of products, and the position of each product is marked by taking the central position of the eyeball of the target user as an origin, so that the target steering angle calculated based on the eyeball steering characteristic queue matrix corresponds to the target product. The angle of each product relative to the center position of the eyeball can also be calculated according to the coordinates of the products.
And calculating the matrix value of the target characteristic queue to obtain a target steering angle.
Specifically, a matrix value of an eyeball steering characteristic queue matrix corresponding to the target characteristic queue is calculated, a target steering angle of the target user is obtained, and the target steering angle corresponds to the coordinate, so that the target product is obtained. And if the position corresponding to the target steering angle has deviation, selecting the product closest to the target steering angle as the target product.
EXAMPLE III
Fig. 9 is a schematic diagram of a hardware architecture of a computer device according to a third embodiment of the present invention. In the present embodiment, the computer device 2 is a device capable of automatically performing numerical calculation and/or information processing in accordance with preset or stored instructions. The computer device 2 may be a rack server, a blade server or a tower server (whether an independent server or a server cluster composed of a plurality of servers), and the like. As shown in fig. 9, the computer device 2 includes, but is not limited to, at least a memory 21, a processor 22, a network interface 23, and a video-based eyeball turning determination system 20, which may be communicatively coupled to each other via a system bus. Wherein:
in this embodiment, the memory 21 includes at least one type of computer-readable storage medium including a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the storage 21 may be an internal storage unit of the computer device 2, such as a hard disk or a memory of the computer device 2. In other embodiments, the memory 21 may also be an external storage device of the computer device 2, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), or the like provided on the computer device 2. Of course, the memory 21 may also comprise both internal and external memory units of the computer device 2. In this embodiment, the memory 21 is generally used for storing an operating system installed in the computer device 2 and various types of application software, such as the program codes of the video-based eye-turning determining system 20 of the second embodiment. Further, the memory 21 may also be used to temporarily store various types of data that have been output or are to be output.
Processor 22 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 22 is typically used to control the overall operation of the computer device 2. In this embodiment, the processor 22 is configured to execute the program code stored in the memory 21 or process data, for example, execute the video-based eye-turning determining system 20, so as to implement the video-based eye-turning determining method according to the first embodiment.
The network interface 23 may comprise a wireless network interface or a wired network interface, and the network interface 23 is generally used for establishing a communication connection between the computer device 2 and other electronic devices. For example, the network interface 23 is used to connect the computer device 2 to an external terminal via a network, and to establish a data transmission channel and a communication connection between the computer device 2 and the external terminal. The network may be a wireless or wired network such as an Intranet, the Internet, the Global System for Mobile Communications (GSM), Wideband Code Division Multiple Access (WCDMA), a 4G network, a 5G network, Bluetooth, Wi-Fi, and the like. It is noted that fig. 9 only shows the computer device 2 with components 20-23, but it is to be understood that not all of the shown components are required to be implemented, and that more or fewer components may be implemented instead.
In this embodiment, the video-based eye-turning determination system 20 stored in the memory 21 may also be divided into one or more program modules, which are stored in the memory 21 and executed by one or more processors (in this embodiment, the processor 22) to complete the present invention.
For example, fig. 8 is a schematic diagram of the program modules of the second embodiment of the video-based eyeball turning determination system 20, in which the video-based eyeball turning determination system 20 may be divided into an acquisition module 200, an annotation module 202, an input module 204, a conversion module 206, a feature sorting module 208, and a feature fusion and output module 210. The program modules referred to herein are a series of computer program instruction segments that can perform specific functions, and are more suitable than the program itself for describing the execution of the video-based eyeball turning determination system 20 in the computer device 2. The specific functions of the program modules 200 to 210 have been described in detail in the second embodiment and are not repeated here.
Example four
The present embodiment also provides a computer-readable storage medium, such as a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, a server, an App application mall, etc., on which a computer program is stored, which when executed by a processor implements corresponding functions. The computer-readable storage medium of the present embodiment is used for storing the video-based eye-turning determining system 20, and when being executed by the processor, the computer-readable storage medium implements the video-based eye-turning determining method of the first embodiment.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A video-based eyeball steering determination method is characterized by comprising the following steps:
acquiring a target video, wherein the target video is a video of a target product watched by a target user;
performing eyeball feature labeling on the target video to obtain a labeled video;
inputting the marked video into an eyeball turning characteristic recognition model, wherein the eyeball turning characteristic recognition model comprises an eyeball characteristic extraction layer, a frame relation processing layer and an eyeball turning action recognition layer;
converting each frame of image of the marked video into a characteristic matrix through the eyeball characteristic extraction layer, and inputting the characteristic matrix corresponding to each frame of image into the frame relation processing layer;
the frame relation processing layer sorts the feature matrixes of each frame of image according to the video time points corresponding to the feature matrixes to obtain feature queues, and the feature queues are input to the eyeball turning action recognition layer;
and the eyeball turning action recognition layer performs characteristic fusion on the characteristic queue to obtain an eyeball turning characteristic queue matrix, and determines the target turning angle of the target user based on the eyeball turning characteristic queue matrix.
2. The method of claim 1, after determining the target steering angle of the target user based on the eye-steering feature queue matrix, further comprising:
acquiring a video time point corresponding to the eyeball turning characteristic queue matrix;
and calculating the distance from the video time point corresponding to the first characteristic matrix to the video time point corresponding to the last characteristic matrix in the eyeball turning characteristic queue matrix as eyeball turning time.
3. The method of claim 1, wherein performing eye feature labeling on the target video to obtain a labeled video comprises:
identifying eyeball characteristics of each frame of image in the target video;
and performing frame selection on the area where the eyeball characteristics are located through the marking frame to obtain a marked video.
4. The method of claim 1, wherein converting each frame of image of the annotated video into a feature matrix by the eye feature extraction layer comprises:
determining eyeball key points of each frame of image of the annotated video, wherein the eyeball key points comprise 128 key points or 256 key points;
acquiring pixel point coordinates of eyeball key points of each frame of image;
and establishing a characteristic matrix according to the eyeball key points of each frame of image, wherein the characteristic matrix comprises 128 or 256 pixel point coordinates.
5. The method according to claim 4, wherein the eye-turning action recognition layer performs feature fusion on the feature queue to obtain an eye-turning feature queue matrix, which comprises:
calculating the differential image characteristics of the adjacent frame images to judge whether the eyeball characteristics corresponding to the adjacent frame images are the same or not;
if the characteristic matrixes are the same, the characteristic matrix corresponding to one frame of image is reserved, and the other same characteristic matrix is deleted from the characteristic queue until the characteristic matrixes in the characteristic queue are different, so that a target characteristic queue is obtained;
and combining the characteristic matrixes in the target characteristic queue to obtain the eyeball turning characteristic queue matrix.
6. The method of claim 5, wherein calculating the difference image characteristic of the target image of the adjacent frames to determine whether the eyeball characteristics corresponding to the target image of the adjacent frames are the same comprises:
acquiring pixel point coordinates of adjacent frame images;
carrying out differential operation on the pixel point coordinates of the adjacent frame images to obtain differential image characteristics;
and comparing the differential image characteristics with a preset binarization threshold value to judge whether the eyeball characteristics corresponding to the target images of adjacent frames are the same.
7. The method of claim 5, wherein determining the target steering angle for the target user based on the eye-steering feature queue matrix comprises:
marking the position of a product in the target video by taking the central position of the eyeball of the target user as an origin;
and calculating a matrix value of the eyeball turning characteristic queue matrix to obtain a target turning angle.
8. A video-based eye-steering determination system, comprising:
the acquisition module is used for acquiring a target video, wherein the target video is a video of a target product watched by a target user;
the labeling module is used for labeling eyeball characteristics of the target video to obtain a labeled video;
the input module is used for inputting the marked video into an eyeball turning characteristic identification model, wherein the eyeball turning characteristic identification model comprises an eyeball characteristic extraction layer, a frame relation processing layer and an eyeball turning action identification layer;
the conversion module is used for converting each frame of image of the annotated video into a feature matrix through the eyeball feature extraction layer and inputting the feature matrix corresponding to each frame of image into the frame relation processing layer;
the characteristic sorting module is used for sorting the characteristic matrix of each frame of image by the frame relation processing layer according to the video time point corresponding to the characteristic matrix to obtain a characteristic queue and inputting the characteristic queue to the eyeball turning action identification layer;
and the characteristic fusion and output module is used for performing characteristic fusion on the characteristic queue by the eyeball turning action recognition layer to obtain an eyeball turning characteristic queue matrix, and determining the target turning angle of the target user based on the eyeball turning characteristic queue matrix.
9. A computer device, characterized in that the computer device comprises a memory, a processor, the memory having stored thereon a computer program being executable on the processor, the computer program being executable by the processor to implement the steps of the video-based eye-turning determination method according to any of the claims 1-7.
10. A computer-readable storage medium, having stored thereon a computer program which is executable by at least one processor to cause the at least one processor to perform the steps of the video-based eye-turning determination method according to any one of claims 1-7.
CN202010128432.6A 2020-02-28 2020-02-28 Interest degree method and system based on eyeball turning Pending CN111353429A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010128432.6A CN111353429A (en) 2020-02-28 2020-02-28 Interest degree method and system based on eyeball turning
PCT/CN2021/071261 WO2021169642A1 (en) 2020-02-28 2021-01-12 Video-based eyeball turning determination method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010128432.6A CN111353429A (en) 2020-02-28 2020-02-28 Interest degree method and system based on eyeball turning

Publications (1)

Publication Number Publication Date
CN111353429A 2020-06-30

Family

ID=71195806

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010128432.6A Pending CN111353429A (en) 2020-02-28 2020-02-28 Interest degree method and system based on eyeball turning

Country Status (2)

Country Link
CN (1) CN111353429A (en)
WO (1) WO2021169642A1 (en)

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111353429A (en) * 2020-02-28 2020-06-30 深圳壹账通智能科技有限公司 Interest degree method and system based on eyeball turning

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105677024A (en) * 2015-12-31 2016-06-15 北京元心科技有限公司 Eye movement detection tracking method and device, and application of eye movement detection tracking method
CN107679448A (en) * 2017-08-17 2018-02-09 平安科技(深圳)有限公司 Eyeball action-analysing method, device and storage medium
US20190294240A1 (en) * 2018-03-23 2019-09-26 Aisin Seiki Kabushiki Kaisha Sight line direction estimation device, sight line direction estimation method, and sight line direction estimation program
CN109359512A (en) * 2018-08-28 2019-02-19 深圳壹账通智能科技有限公司 Eyeball position method for tracing, device, terminal and computer readable storage medium
CN110555426A (en) * 2019-09-11 2019-12-10 北京儒博科技有限公司 Sight line detection method, device, equipment and storage medium

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021169642A1 (en) * 2020-02-28 2021-09-02 深圳壹账通智能科技有限公司 Video-based eyeball turning determination method and system
CN112053600A (en) * 2020-08-31 2020-12-08 上海交通大学医学院附属第九人民医院 Orbit endoscope navigation surgery training method, device, equipment and system
CN115544473A (en) * 2022-09-09 2022-12-30 苏州吉弘能源科技有限公司 Photovoltaic power station operation and maintenance terminal login control system
CN115544473B (en) * 2022-09-09 2023-11-21 苏州吉弘能源科技有限公司 Photovoltaic power station operation and maintenance terminal login control system

Also Published As

Publication number Publication date
WO2021169642A1 (en) 2021-09-02

Similar Documents

Publication Publication Date Title
CN110909651B (en) Method, device and equipment for identifying video main body characters and readable storage medium
CN112597941B (en) Face recognition method and device and electronic equipment
CN109960742B (en) Local information searching method and device
CN110929622A (en) Video classification method, model training method, device, equipment and storage medium
CN109858333B (en) Image processing method, image processing device, electronic equipment and computer readable medium
CN111797657A (en) Vehicle peripheral obstacle detection method, device, storage medium, and electronic apparatus
CN111626123A (en) Video data processing method and device, computer equipment and storage medium
WO2023010758A1 (en) Action detection method and apparatus, and terminal device and storage medium
WO2021169642A1 (en) Video-based eyeball turning determination method and system
CN115427982A (en) Methods, systems, and media for identifying human behavior in digital video using convolutional neural networks
CN111062263B (en) Method, apparatus, computer apparatus and storage medium for hand gesture estimation
CN112528974B (en) Distance measuring method and device, electronic equipment and readable storage medium
CN114049512A (en) Model distillation method, target detection method and device and electronic equipment
CN112560796A (en) Human body posture real-time detection method and device, computer equipment and storage medium
CN113496208B (en) Video scene classification method and device, storage medium and terminal
CN111914762A (en) Gait information-based identity recognition method and device
CN113706481A (en) Sperm quality detection method, sperm quality detection device, computer equipment and storage medium
CN112200056A (en) Face living body detection method and device, electronic equipment and storage medium
WO2023279799A1 (en) Object identification method and apparatus, and electronic system
CN114842466A (en) Object detection method, computer program product and electronic device
CN113780145A (en) Sperm morphology detection method, sperm morphology detection device, computer equipment and storage medium
CN114359787A (en) Target attribute identification method and device, computer equipment and storage medium
CN111652181B (en) Target tracking method and device and electronic equipment
CN115115552B (en) Image correction model training method, image correction device and computer equipment
CN115577768A (en) Semi-supervised model training method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination