RU2782543C1 - Method and device for sight line detection, method and device for video data processing, device and data carrier

Info

Publication number: RU2782543C1
Application number: RU2022102314A
Authority: RU
Inventors: Гэндай ЛЮ
Original assignee: Биго Текнолоджи Пте. Лтд.
Priority date: 2019-07-10
Filing date: 2020-06-22
Publication date: 2022-10-31

Abstract

FIELD: image processing.

SUBSTANCE: the invention relates to the field of image processing, in particular to a method and device for line of sight detection and video data processing. Line of sight detection involves: determining, on the basis of a key feature point in a face image, the face pose and the pupil rotational displacement corresponding to the face image, the pupil rotational displacement being the displacement of the pupil center relative to the eyeball center in the face image; and back-projecting the pupil rotational displacement, based on a preset projection function and the face pose, into the 3D space where the real face is located, to obtain the gaze direction of the real face.

EFFECT: detection of a face image and of the gaze direction in a video frame, with increased detection efficiency and accuracy of the determined gaze direction.

20 cl, 13 dwg

Description

Cross-reference to related application

[0001] This application claims priority to Chinese Patent Application No. 201910620700.3, filed on July 10, 2019, the contents of which are incorporated herein by reference in their entirety.

Technical field

[0002] Embodiments of the present invention relate to the field of image processing, and in particular to a method and device for line of sight detection, a method and device for video data processing, a device, and a storage medium.

Background of the invention

[0003] With the rapid development of video technology, three-dimensional (3D) virtual characters are increasingly used in entertainment, film, and virtual reality (VR) simulation. In this context, the eyes are the most important part of a virtual character. The range of natural eye rotation is much smaller than the range of facial expressions and body movements. However, people are highly sensitive to eye movements, so unnatural rotation angles and movements of the eyeballs are easily noticed. Eye motion capture devices, which capture the center points of the eyeballs and pupils, are usually head-mounted and inconvenient to use.

[0004] To address these problems in probabilistic facial expression animation scenarios, the conversion of the captured pupil-center motion into eyeball motion is usually implemented by synthesizing eyeball textures, as in the following references:

[0005] [1] Justus Thies, Michael [...], Marc Stamminger, Christian Theobalt, and Matthias [...], 2018: FaceVR: Real-Time Gaze-Aware Facial Reenactment in Virtual Reality, ACM Trans. Graph., 37, 2, Article 25 (June 2018);

[0006] [2] Justus Thies, Michael [...], Christian Theobalt, Marc Stamminger, and Matthias Niessner, 2018: HeadOn: Real-time Reenactment of Human Portrait Videos, ACM Trans. Graph., 37, 4, Article 164 (July 2018);

[0007] [3] Chen Cao, Hongzhi Wu, Yanlin Weng, Tianjia Shao, and Kun Zhou, 2016: Real-time Facial Animation with Image-based Dynamic Avatars, ACM Trans. Graph., 35, 4, Article 126 (July 2016), 12 pages;

[0008] [4] System and method for tracking facial muscle and eye motion for computer graphics animation, CN101069214A; and

[0009] [5] Method for creating a 3D virtual eye movement model with a wide range of emotional expression, CN103279969A.

[0010] References [1] and [2] describe methods akin to data-driven approaches, and reference [3] uses a simpler and more intuitive billboard-based method. All of these methods select, from a large number of eyeball textures, the texture that best matches the current state of the eyeball and map it onto the target eyeball to capture the change in eyeball movement. This requires comparison with a large number of eyeball textures in the statistical data and increases the amount of data to be processed, which reduces the efficiency of line of sight estimation. Reference [4] tracks eye movement directly from the movement of the eye muscles, and reference [5] uses a rule-based method to directly synthesize various motion effects when the gaze direction changes. Neither of these two methods analyzes eye movement directly from changes in the pupil position, which reduces the accuracy of the determined gaze direction of the eyeball.

Summary of the invention

[0011] Embodiments of the present invention provide a method and device for line of sight detection, a method and device for video data processing, a device, and a storage medium, which detect a face image and the gaze direction in a video frame and improve the detection efficiency and the accuracy of the determined gaze direction.

[0012] Embodiments of the present invention provide a line of sight detection method. The method includes:

[0013] determining, based on a key feature point in a face image, the face pose and the pupil rotational displacement corresponding to the face image, the pupil rotational displacement being the displacement of the pupil center relative to the eyeball center in the face image; and

[0014] obtaining the gaze direction of the real face by back-projecting the pupil rotational displacement, based on a preset projection function and the face pose, into the 3D space where the real face is located.
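
The back-projection step in [0014] can be illustrated with a minimal sketch. The projection function is not specified at this point in the text, so the sketch assumes a weak-perspective projection (image coordinates equal `scale` times the camera-space x and y) and a spherical eyeball of radius `eye_radius`; these two parameters, the 3x3 head rotation matrix `R`, and both function names are hypothetical and chosen only for illustration. Under these assumptions a 2D pupil displacement determines the x and y components of a unit gaze vector, and the z component follows from the unit-length constraint.

```python
import math

def backproject_gaze(d, scale, eye_radius):
    """Lift a 2D pupil displacement d (pixels) to a unit gaze vector in camera space.

    Assumption: weak-perspective projection, image = scale * (x, y),
    and a spherical eyeball of radius eye_radius.
    """
    gx = d[0] / (scale * eye_radius)
    gy = d[1] / (scale * eye_radius)
    planar = gx * gx + gy * gy
    if planar > 1.0:
        # Clamp noisy measurements onto the silhouette of the eyeball sphere.
        norm = math.sqrt(planar)
        gx, gy, planar = gx / norm, gy / norm, 1.0
    # Convention assumed here: the camera looks along +z, so the eye faces -z.
    gz = -math.sqrt(1.0 - planar)
    return (gx, gy, gz)

def to_head_frame(v, R):
    """Express a camera-space vector in head coordinates, i.e. compute R^T v.

    R is the head rotation (face pose) as a 3x3 matrix given as nested rows.
    """
    return tuple(sum(R[r][c] * v[r] for r in range(3)) for c in range(3))
```

In this sketch the face pose enters only through `R`, which separates eye rotation from head rotation; a different projection function would change `backproject_gaze` but not the overall structure.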

[0015] Embodiments of the present invention provide a video data processing method. The method includes:

[0016] obtaining a video frame from the video data to be processed; and

[0017] obtaining the gaze direction of the real face corresponding to the video frame by performing the line of sight detection method according to any of the above embodiments.

[0018] Embodiments of the present invention provide a device for line of sight detection. The device includes:

[0019] a parameter determination module configured to determine, based on a key feature point in a face image, the face pose and the pupil rotational displacement corresponding to the face image, the pupil rotational displacement being the displacement of the pupil center relative to the eyeball center in the face image; and

[0020] a line of sight detection module configured to obtain the gaze direction of the real face by back-projecting the pupil rotational displacement, based on a preset projection function and the face pose, into the 3D space where the real face is located.

[0021] Embodiments of the present invention provide a device for video data processing. The device includes:

[0022] a video frame acquisition module configured to obtain a video frame from the video data to be processed; and

[0023] a line of sight detection module configured to obtain the gaze direction of the real face corresponding to the video frame by performing the line of sight detection method according to any of the above embodiments.

[0024] Embodiments of the present invention provide a line of sight processing system. The system includes an image capture device and a data processing device that are communicatively connected to each other, the image capture device being arranged on the data processing device; and

[0025] the image capture device captures a face image to be processed or video data to be processed and transmits it to the data processing device; and the data processing device is provided with the device for line of sight detection according to any of the above embodiments and the device for video data processing according to any of the above embodiments.

[0026] Embodiments of the present invention provide a device. The device includes:

[0027] one or more processors; and

[0028] a memory configured to store one or more programs; wherein:

[0029] the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the line of sight detection method according to any embodiment of the present invention or the video data processing method according to any embodiment of the present invention.

[0030] Embodiments of the present invention provide a computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, causes the processor to implement the line of sight detection method according to any embodiment of the present invention or the video data processing method according to any embodiment of the present invention.

Brief description of the figures

[0031] FIG. 1A is a flowchart of the line of sight detection method according to the first embodiment of the present invention;

[0032] FIG. 1B schematically shows the pupil rotational displacement in a face image in the line of sight detection method according to the first embodiment of the present invention;

[0033] FIG. 1C schematically shows a reconstructed face mesh model in the line of sight detection method according to the first embodiment of the present invention;

[0034] FIG. 1D schematically illustrates the principle of the line of sight detection process according to the first embodiment of the present invention;

[0035] FIG. 2A is a flowchart of the line of sight detection method according to the second embodiment of the present invention;

[0036] FIG. 2B schematically illustrates the principle of the line of sight detection process according to the second embodiment of the present invention;

[0037] FIG. 3A is a flowchart of the line of sight detection method according to the third embodiment of the present invention;

[0038] FIG. 3B schematically illustrates the principle of the line of sight detection process according to the third embodiment of the present invention;

[0039] FIG. 4 is a flowchart of the video data processing method according to the fourth embodiment of the present invention;

[0040] FIG. 5 is a structural diagram of the device for line of sight detection according to the fifth embodiment of the present invention;

[0041] FIG. 6 is a structural diagram of the device for video data processing according to the sixth embodiment of the present invention;

[0042] FIG. 7 is a structural diagram of the line of sight processing system according to the seventh embodiment of the present invention; and

[0043] FIG. 8 is a structural diagram of the device according to the eighth embodiment of the present invention.

Detailed description of the invention

[0044] The present invention is described below with reference to the accompanying drawings and by way of some of its embodiments. The embodiments disclosed below serve solely to illustrate the present invention and are not restrictive. In addition, to simplify the description, the accompanying drawings show only some, not all, of the structures related to the present invention.

[0045] First embodiment

[0046] FIG. 1A is a flowchart of the line of sight detection method according to the first embodiment of the present invention. This embodiment is applicable to any situation in which the user's gaze direction is detected by capturing a face image. The line of sight detection method according to this embodiment may be performed by the device for line of sight detection according to an embodiment of the present invention. That device may be implemented in software and/or hardware and integrated into the apparatus that performs the method. The apparatus may be any 3D model processing device with image processing capabilities.

[0047] As shown in FIG. 1A, the method may include the steps described below.

[0048] In step S110, the face pose and the pupil rotational displacement corresponding to the face image are determined based on a key feature point in the face image.

[0049] When video frames containing facial expressions are displayed in entertainment applications or on gaming sites, a corresponding 3D face model must be reconstructed from face images with many different expressions. In this case, the user's actual gaze direction in the face image must be determined so that a matching 3D eyeball effect can be added to the reconstructed 3D face model according to that gaze direction. The eyeball can then rotate naturally in the reconstructed 3D face model. Likewise, to improve device automation in cases where a data processing operation is performed directly on the object that the user's line of sight points at, the user's actual gaze direction must also be detected from the face image.

[0050] A feature point is a point in a face image that has distinct characteristics, can effectively reflect important characteristics of the model, and can identify a target part of the image. In this embodiment, a feature point is a pixel that can represent one of the many features in the face image. A key feature point is a local feature point in the face image. Optionally, the key feature points are those feature points, among all the feature points, that clearly represent the positions of facial features such as the eyes, nose, mouth, and cheeks in the face image, or the positions of the eye sockets, the wings of the nose, and other facial details. The face pose is the head posture when the user looks at the camera or turns the head, for example the offset rotation angle of the head in the face image. The pupil rotational displacement is the displacement of the pupil center relative to the eyeball center in the face image, i.e. the rotational displacement of the eyeball in the face image. As shown in FIG. 1B, the hatched part of the eye region in the face image is the pupil, and the pupil rotational displacement is indicated in FIG. 1B as the offset d.
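
As a minimal illustration of the offset d defined in [0050], the sketch below computes the 2D displacement of the pupil center relative to the eyeball center. The text does not say how the eyeball center is located in the image, so the sketch assumes it can be approximated by the mean of the eye-contour key points; that approximation and the function names are hypothetical.

```python
from statistics import fmean

def eye_center(contour):
    """Approximate the 2D eyeball center as the mean of the eye-contour key points.

    Assumption: the contour points surround the visible eye region evenly.
    """
    xs, ys = zip(*contour)
    return (fmean(xs), fmean(ys))

def pupil_displacement(contour, pupil_center):
    """Return the offset d = pupil center - eyeball center, in pixels."""
    cx, cy = eye_center(contour)
    return (pupil_center[0] - cx, pupil_center[1] - cy)

# Usage: four contour key points around the eye and a detected pupil center.
contour = [(0, 0), (4, 0), (4, 2), (0, 2)]
d = pupil_displacement(contour, (3, 1))
```

For the contour above the approximated eyeball center is (2, 1), so a pupil detected at (3, 1) gives a displacement of one pixel along x and none along y.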

[0051] In this embodiment, optionally, when the user's actual gaze direction in the face image is detected, the key feature points in the face image are first determined by image recognition, and the positions of the facial features represented by the key feature points are then determined. The positions of the key feature points are compared with the positions of the corresponding facial features in a normal face image and analyzed to estimate the rotation or offset of the head in the face image. The face pose of the user in the face image is thereby determined. In addition, the eye image located by the key feature points in the face image is analyzed, and the pupil rotational displacement is determined from the offset of the pupil center relative to the eyeball center in the eye image. The pupil rotational displacement can then be processed directly to determine the gaze direction of the real face.
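
The comparison between detected key feature points and their positions in a normal face image can be sketched as a least-squares alignment. The example below is a simplified illustration, not the method of this embodiment: it assumes the normal face image is a frontal reference and recovers only the in-plane rotation (roll), scale, and translation with a 2D Procrustes fit. Estimating a full 3D head rotation would require a 3D model and a camera model. The function name and the point layout are hypothetical.

```python
import math

def estimate_in_plane_pose(reference, observed):
    """Fit observed = scale * R(angle) * reference + t in the least-squares sense.

    reference, observed: equally long lists of matched (x, y) key feature points.
    Returns (angle in radians, scale, (tx, ty)).
    """
    n = len(reference)
    rcx = sum(p[0] for p in reference) / n
    rcy = sum(p[1] for p in reference) / n
    ocx = sum(q[0] for q in observed) / n
    ocy = sum(q[1] for q in observed) / n
    a = b = norm = 0.0
    for (px, py), (qx, qy) in zip(reference, observed):
        # Work with centered coordinates so translation drops out.
        px, py = px - rcx, py - rcy
        qx, qy = qx - ocx, qy - ocy
        a += px * qx + py * qy      # proportional to scale * cos(angle)
        b += px * qy - py * qx      # proportional to scale * sin(angle)
        norm += px * px + py * py
    angle = math.atan2(b, a)
    scale = math.hypot(a, b) / norm
    tx = ocx - scale * (math.cos(angle) * rcx - math.sin(angle) * rcy)
    ty = ocy - scale * (math.sin(angle) * rcx + math.cos(angle) * rcy)
    return angle, scale, (tx, ty)
```

The recovered angle corresponds to the head tilting within the image plane; the scale and translation describe how far the face is from the camera and where it sits in the frame.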

[0052] For example, when this embodiment is applied to a scenario in which a 3D eyeball model is added to a reconstructed 3D face model, the 3D face model must first be reconstructed from the face image. To reduce the amount of data to be processed, the key feature points in this embodiment can be obtained directly from the reconstructed 3D face model. In this case, before the face pose and the pupil rotational displacement are determined from the key feature points in the face image, this embodiment may further include: obtaining face data corresponding to the face image by scanning the face image; obtaining a reconstructed face mesh model by reconstructing a preset 3D face template mesh using the face data; and extracting key feature points from the reconstructed face mesh model and using them as the key feature points in the face image.

[0053] First, the captured face image is scanned using 3D scanning technology to obtain the corresponding face data. The positions of the feature points in the face data are then matched against the corresponding feature points in the predetermined 3D face template mesh, and the template mesh is continuously deformed according to the spatially consistent positions of the feature points in the face data. As a result, the positions of the feature points in the deformed 3D face template mesh correspond one-to-one to the positions of the feature points in the face data, which yields the reconstructed face mesh model shown in FIG. 1C. A set of corresponding key feature points is then extracted directly from the reconstructed face mesh model and used as the key feature points in the face image. In this case, the key feature points are mesh vertices of the face mesh model, which improves the efficiency and accuracy of key feature point extraction.

[0054] In step S120, the gaze direction of the real face is obtained by backprojecting the rotational movement of the eye pupil, based on the predetermined projection function and the face position, into the three-dimensional space where the real face is located.

[0055] The projection function is the function by which the user's face in three-dimensional space is projected onto the corresponding two-dimensional imaging surface when the face image is captured; it provides the basis for the transformation that yields the face image. The real face in three-dimensional space and the face image on the two-dimensional imaging surface are therefore related by a mapping relationship, and the rotational movement of the eye pupil on the two-dimensional imaging surface corresponds to the gaze direction of the real face in three-dimensional space.

[0056] A change in the face position in the face image indicates that the real face in three-dimensional space has rotated; accordingly, the gaze direction has also changed. Therefore, in this embodiment of the present invention, once the face position and the rotational movement of the eye pupil in the face image have been obtained, as shown in FIG. 1D, a plurality of parameters of the predetermined projection function can be adjusted based on the face position, and the adjusted projection function is then used to backproject the rotational movement of the eye pupil into the three-dimensional space where the real face is located. The gaze direction of the real face is thus obtained from the displacement of the backprojected pupil movement in three-dimensional space, so that subsequent operations can be performed according to the gaze direction.

[0057] In addition, an image capturing device has different projection functions for different shooting parameters. Therefore, before the gaze direction of the real face is obtained by backprojecting the rotational movement of the eye pupil, based on the predetermined projection function and the face position, into the three-dimensional space where the real face is located, the method according to this embodiment of the present invention may further include: determining the predetermined projection function based on a parameter of the image capturing device corresponding to the face image.

[0058] The parameter of the image capturing device in this embodiment may be its imaging focal length. The image capturing device has different projection functions for different imaging focal lengths, so the size of the image of the same object varies with the imaging focal length. Optionally, in this embodiment of the present invention, the projection function corresponding to the parameter of the image capturing device that captured the face image is determined as the predetermined projection function, which improves the accuracy of backprojecting the rotational movement of the eye pupil in the face image into the three-dimensional space where the real face is located.
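
The dependence of the projection function on focal length can be sketched as follows. This is a minimal illustration that assumes an ideal pinhole camera; the description does not specify the form of the projection function, and the names `make_projection`, `focal_length_px` and `principal_point` are hypothetical.

```python
import numpy as np

def make_projection(focal_length_px, principal_point=(0.0, 0.0)):
    """Return a pinhole projection function for one focal length (assumed model)."""
    cx, cy = principal_point

    def project(points_3d):
        # points_3d: (..., 3) array of camera-space coordinates with z > 0
        pts = np.asarray(points_3d, dtype=float)
        x, y, z = pts[..., 0], pts[..., 1], pts[..., 2]
        u = focal_length_px * x / z + cx
        v = focal_length_px * y / z + cy
        return np.stack([u, v], axis=-1)

    return project

# The same 3D point imaged at two focal lengths (hypothetical values).
point = np.array([0.1, 0.05, 2.0])
short = make_projection(500.0)(point)
long_ = make_projection(1000.0)(point)
```

Under this assumed model, doubling the focal length doubles the projected offset of the point, which is the size variation described above; choosing the projection function that matches the capture parameters keeps the backprojection consistent with the image.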

[0059] In the technical solution of this embodiment of the present invention, the face position corresponding to the face image and the rotational movement of the pupil center relative to the eyeball center in the face image are first determined based on the key feature points in the face image. The rotational movement of the eye pupil is then backprojected, based on the predetermined projection function and the face position, into the three-dimensional space where the real face is located, which yields the gaze direction of the real face. This solution does not need to compare the eye texture in the face image with the eye textures in the many face images contained in the statistical data, nor to determine the rotation direction of the eyeball from the movement of facial features other than the eyeball. This reduces the amount of eye data to be processed and improves the efficiency of gaze direction detection. The gaze direction of the real face is analyzed directly from the rotation of the eye pupil in the face image, which improves the accuracy of gaze direction detection.

[0060] Second embodiment of the present invention

[0061] FIG. 2A is a flowchart of the sight line detection method according to the second embodiment of the present invention, and FIG. 2B is a schematic diagram illustrating the principle of the sight line detection process according to the second embodiment of the present invention. This embodiment builds on the embodiment described above and mainly illustrates the process of determining the face position and the rotational movement of the eye pupil in the face image.

[0062] Optionally, as shown in FIG. 2A, the method according to this embodiment of the present invention may include the steps described below.

[0063] In step S210, face data corresponding to the face image is obtained by scanning the face image; a reconstructed face mesh model is obtained by reconstructing a predetermined 3D face template mesh using the face data; and a key feature point is extracted from the reconstructed face mesh model and used as the key feature point in the face image.

[0064] In step S220, the spatially consistent position of the key feature point in the face image is determined.

[0065] Optionally, the spatially consistent positions indicate the positions of the facial features in face images with different expressions. Once the key feature points in the face image have been obtained, the plurality of key feature points can be analyzed to estimate the spatially consistent positions of the facial features to which they correspond.

[0066] In step S230, the face position in the face image is determined based on the spatially consistent position.

[0067] The spatially consistent positions of the plurality of key feature points are compared with those of the corresponding key feature points in a standard expressionless face image template to determine the offsets of the key feature points. Information such as the rotation angle and translation distance of the face image is then determined from these offsets, which gives the face position in the face image.
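
As a simplified illustration of this comparison, the sketch below recovers an in-plane rotation angle and a translation from key points and their counterparts in an expressionless template, using a closed-form least-squares fit. It is restricted to 2D and is not presented as the method of the patent; `estimate_pose_2d` and the sample coordinates are hypothetical.

```python
import numpy as np

def estimate_pose_2d(template_pts, observed_pts):
    """Least-squares rotation angle (radians) and translation mapping template -> observed."""
    t = np.asarray(template_pts, dtype=float)
    o = np.asarray(observed_pts, dtype=float)
    t_mean, o_mean = t.mean(axis=0), o.mean(axis=0)
    t0, o0 = t - t_mean, o - o_mean
    # For centered point sets, the optimal 2D rotation angle is atan2(sum of cross, sum of dot).
    cross = np.sum(t0[:, 0] * o0[:, 1] - t0[:, 1] * o0[:, 0])
    dot = np.sum(t0[:, 0] * o0[:, 0] + t0[:, 1] * o0[:, 1])
    angle = np.arctan2(cross, dot)
    c, s = np.cos(angle), np.sin(angle)
    rotation = np.array([[c, -s], [s, c]])
    translation = o_mean - rotation @ t_mean
    return angle, translation

# Hypothetical template key points and the same points after a 12 degree turn and a shift.
template = np.array([[-30.0, 0.0], [30.0, 0.0], [0.0, 20.0], [0.0, 50.0]])
theta = np.deg2rad(12.0)
rot = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
observed = template @ rot.T + np.array([5.0, -3.0])
angle, shift = estimate_pose_2d(template, observed)
```

A full implementation would estimate a 3D rotation and translation from the mesh key points; the 2D case is used here only because it has a short closed form.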

[0068] In step S240, the eyeball center corresponding to the face image is determined based on the key feature point in the reconstructed face mesh model.

[0069] Optionally, when the reconstructed face mesh model is obtained by reconstructing the predetermined face mesh model using the face data, the face mesh model has the same dimensions as the face image. The key feature points can then be determined in the face mesh model, and the position and size of the eye socket in the face mesh model are analyzed based on the plurality of key feature points. The position and size of the eye socket in the mesh model coincide with those of the eye socket in the face image, and the center point of the eye socket is taken as the center of the corresponding eyeball.

[0070] In step S250, the pupil center corresponding to the eye image is obtained by recognizing the eye image in the face image.

[0071] Optionally, the position of the eye image in the face image can be determined from the position of the eye socket in the reconstructed face mesh model. The eye image is then recognized using image recognition technology, and the position of the eye pupil in the eye image is determined. Since the pupil is circular, the center of the pupil circle is taken as the pupil center corresponding to the eye image.
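
The idea of taking the center of the pupil circle can be illustrated with a deliberately simple stand-in: treat the darkest pixels of a grayscale eye image as the pupil and take their centroid. This thresholding approach is an assumption made for illustration only and is not the recognition technique of the patent; `pupil_center`, the threshold value and the synthetic image are hypothetical.

```python
import numpy as np

def pupil_center(eye_gray, threshold=50):
    """Centroid (x, y) of pixels darker than `threshold`, or None if there are none."""
    eye = np.asarray(eye_gray)
    ys, xs = np.nonzero(eye < threshold)
    if xs.size == 0:
        return None
    return float(xs.mean()), float(ys.mean())

# Synthetic 60x40 eye image: bright background with a dark disc centered at (35, 18).
eye = np.full((40, 60), 200, dtype=np.uint8)
yy, xx = np.mgrid[0:40, 0:60]
eye[(xx - 35) ** 2 + (yy - 18) ** 2 <= 6 ** 2] = 10
center = pupil_center(eye)
```

On real images, eyelashes, shadows and reflections make a fixed threshold unreliable, so this sketch only shows the geometric idea of a circle center.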

[0072] For example, to ensure that the pupil center is determined accurately, as shown in FIG. 2B, obtaining the pupil center corresponding to the face image by recognizing the eye image may, in this embodiment of the present invention, include: cropping the eye image from the face image; and obtaining the pupil center corresponding to the eye image by inputting the eye image into a pre-built deep network model.

[0073] The deep network model is a neural network model that has been pre-trained on a large number of statistical eye images and can accurately recognize the pupil center in an eye image. In this embodiment of the present invention, the corresponding eye image can be cropped from the face image based on the position of the eye socket in the reconstructed face mesh model. The cropped eye image is input into the pre-built deep network model and analyzed using the pre-trained network parameters of the model, which yields the corresponding pupil center in the eye image.

[0074] In step S260, the rotational movement of the eye pupil corresponding to the face image is determined based on the position of the eyeball center and the position of the pupil center.

[0075] After the eyeball center and the pupil center in the eye image have been determined, the offset of the pupil center relative to the eyeball center is determined by taking the difference between the position of the pupil center and the position of the eyeball center, which gives the corresponding rotational movement of the eye pupil.
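
A minimal sketch of this difference is given below. It assumes that the eyeball center is approximated by the center of the bounding box of the eye-socket key points, which is one possible reading of "the center point of the eye socket"; the function names and coordinates are hypothetical.

```python
import numpy as np

def eyeball_center(eye_socket_pts):
    """Center of the bounding box of the eye-socket key points (assumed definition)."""
    pts = np.asarray(eye_socket_pts, dtype=float)
    return (pts.min(axis=0) + pts.max(axis=0)) / 2.0

def pupil_displacement(pupil_center_xy, eyeball_center_xy):
    """Offset of the pupil center relative to the eyeball center, in pixels."""
    return np.asarray(pupil_center_xy, dtype=float) - np.asarray(eyeball_center_xy, dtype=float)

# Hypothetical eye-socket key points (left corner, top, right corner, bottom) and pupil center.
socket = [(10.0, 20.0), (30.0, 14.0), (50.0, 20.0), (30.0, 26.0)]
center = eyeball_center(socket)
offset = pupil_displacement((34.0, 19.0), center)
```

Here the pupil sits 4 pixels to the right of and 1 pixel above the eyeball center (image y grows downward), so the 2D rotational movement is `(4, -1)`.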

[0076] In this embodiment of the present invention, the steps of determining the face position and of determining the rotational movement of the eye pupil can be performed simultaneously, and no particular order is required. In other words, steps S220 and S230 are regarded as one process, and steps S240, S250 and S260 are regarded as another process. The two processes may be performed simultaneously, and this embodiment of the present invention places no limitation on their order.

[0077] In step S270, the gaze direction of the real face is obtained by backprojecting the rotational movement of the eye pupil, based on the predetermined projection function and the face position, into the three-dimensional space where the real face is located.

[0078] In step S280, a corresponding 3D eyeball model is constructed in the reconstructed face mesh model based on the gaze direction of the real face.

[0079] Optionally, when this embodiment of the present invention is applied to a scenario in which a corresponding 3D eyeball model is added to the reconstructed 3D face model, once the gaze direction of the real face has been obtained, a 3D eyeball model matching the line-of-sight display effect of the eyeball is placed, based on the gaze direction of the real face, in the reconstructed 3D face mesh model, that is, in the eye socket region of the face mesh model in this embodiment, so that the 3D eyeball model rotates naturally and smoothly in the reconstructed face mesh model. In addition, eyeball animation is driven in the reconstructed face mesh model to synthesize special effects, which improves the animation effect of virtual eyeball rotation.

[0080] In the technical solution of this embodiment of the present invention, the face position in the face image is determined based on the spatially consistent position of the key feature point, and the corresponding rotational movement of the eye pupil is determined based on the position of the eyeball center and the position of the pupil center in the eye image, which ensures the accuracy of both the face position and the rotational movement of the eye pupil. Moreover, the gaze direction of the real face is obtained by backprojecting the rotational movement of the eye pupil, based on the predetermined projection function and the face position, into the three-dimensional space where the real face is located, which improves the efficiency and accuracy of gaze direction detection. In addition, a corresponding 3D eyeball model is built in the reconstructed face mesh model based on the gaze direction of the real face, which ensures natural and smooth rotation of the eyeball in the reconstructed face mesh model and improves the animation effect of virtual eyeball rotation in that model.

[0081] Third embodiment of the present invention

[0082] FIG. 3A is a flowchart of the sight line detection method according to the third embodiment of the present invention, and FIG. 3B is a schematic diagram illustrating the principle of the sight line detection process according to the third embodiment of the present invention. This embodiment builds on the embodiments described above and mainly illustrates the step of backprojecting the rotational movement of the eye pupil in the face image to the gaze direction of the real face in three-dimensional space.

[0083] Optionally, as shown in FIG. 3A, the method according to this embodiment of the present invention may include the steps described below.

[0084] In step S310, the face position and the rotational movement of the eye pupil corresponding to the face image are determined based on the key feature point in the face image.

[0085] In step S320, a corresponding line-of-sight optimization function is constructed based on the predetermined projection function, the face position and the rotational movement of the eye pupil.

[0086] Optionally, in this embodiment of the present invention, the appropriate projection function is selected; taking the face position into account, this projection function projects the real face onto the corresponding two-dimensional imaging surface to generate the corresponding face image. So that the gaze direction of the real face, when projected onto the two-dimensional imaging surface, coincides as closely as possible with the rotational movement of the eye pupil, a corresponding line-of-sight optimization function is constructed from the projection relationship, given by the predetermined projection function and the face position, between the gaze direction of the real face and the rotational movement of the eye pupil. The optimization target of the line-of-sight optimization function is to minimize the difference between the position at which the gaze direction of the real face is projected onto the two-dimensional imaging surface and the rotational movement of the eye pupil.

[0087] In step S330, a gaze direction that achieves the predetermined optimization target of the line-of-sight optimization function is obtained in the three-dimensional space in which the real face is located, and the obtained gaze direction is taken as the gaze direction of the real face.

[0088] Optionally, the line-of-sight optimization function constructed in this embodiment of the present invention can accurately measure the difference between two quantities: the rotational movement of the eye pupil in a projection image, obtained by projecting a large number of faces in three-dimensional space onto the two-dimensional imaging surface, and the rotational movement of the eye pupil in the captured face image. Since the predetermined optimization target of the line-of-sight optimization function is to minimize the difference between the position at which the gaze direction of the real face is projected onto the two-dimensional imaging surface and the rotational movement of the eye pupil, this embodiment selects the projection image with the minimum difference and determines the gaze direction of the corresponding face in three-dimensional space, thereby obtaining the gaze direction of the real face.

[0089] For example, to improve the accuracy of line-of-sight detection, before the gaze direction that achieves the predetermined optimization target of the line-of-sight optimization function is obtained in the three-dimensional space in which the real face is located and taken as the gaze direction of the real face, the method according to this embodiment of the present invention may further include: obtaining an associated image related to the face image; and updating the line-of-sight optimization function based on the gaze direction in the associated image, a predetermined associated smoothing parameter, and a predetermined stabilization parameter.

[0090] The associated image carries the gaze direction that corresponds to it. When the gaze direction of a face must be detected in the frames of a pre-recorded video, the associated image related to the face image is the video frame that precedes the video frame containing the face image. Because the gaze direction of the face is detected frame by frame, the gaze direction in the preceding video frame has already been determined when the gaze direction in the current video frame is detected. In this embodiment of the present invention, to keep the rotation of the eyeball smooth when the video frames are displayed in sequence, the difference between the gaze directions in adjacent video frames must be kept to a minimum, which reduces erratic, jerky movement of the eyeball in the face image as far as possible. Accordingly, the line-of-sight optimization function can be updated based on the gaze direction in the associated image, the predetermined associated smoothing parameter, and the predetermined stabilization parameter. The gaze direction that achieves the predetermined optimization target of the updated function is then obtained in the three-dimensional space in which the real face is located and taken as the gaze direction of the real face.

[0091] For example, the updated line-of-sight optimization function is written as follows:

[formula image missing in the source text]

[0092] In the above function, Π(Rx+t) denotes the predetermined projection function; R denotes the rotation parameter of the face position; t denotes the shift parameter of the face position; x denotes the gaze direction in the face image; d denotes the rotational movement of the eye pupil; x₀ denotes the gaze direction in the associated image; α denotes the predetermined associated smoothing parameter; and β denotes the predetermined stabilization parameter.

Π(Rx+t) can be expressed as sPRx+t, where s denotes the scaling factor and P is defined by a formula image that is missing in the source text.

[0093] The function is the sum of three terms. The first term denotes the difference between the real face projected onto the two-dimensional imaging surface and the rotational movement of the eye pupil in the face image. The second term denotes the difference between the gaze directions of the real face and of the associated image. The third term denotes the degree of stabilization of the gaze direction of the real face; it is designed to limit the movement of the eyeball so that the obtained gaze direction does not become inconsistent with the gaze direction of the real face. In this embodiment of the present invention, to keep the rotation of the eyeball natural and smooth, the predetermined optimization target can be updated to minimizing the sum of the three terms of the line-of-sight optimization function. Based on the updated function, the gaze direction that achieves the predetermined optimization target is obtained in the three-dimensional space in which the real face is located and taken as the gaze direction of the real face.
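The formula images for this function are not present in the text. A form consistent with the symbol definitions in [0092] and the three terms described in [0093] would be the following. The squared Euclidean norms and the explicit form of P (an orthographic matrix that drops the depth coordinate) are assumptions made for illustration and are not taken from the source:

```latex
\min_{x}\;
\lVert \Pi(Rx+t) - d \rVert^{2}
+ \alpha \,\lVert x - x_{0} \rVert^{2}
+ \beta \,\lVert x \rVert^{2},
\qquad
\Pi(Rx+t) = sPRx + t,
\qquad
P = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}
```

Under these assumptions the first term is the reprojection difference, the second term penalizes deviation from the gaze direction x₀ of the associated image, and the third term limits the magnitude of x.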

[0094] In the technical solution of this embodiment of the present invention, a corresponding line-of-sight optimization function is constructed based on the predetermined projection function, the face position, and the rotational movement of the eye pupil. The predetermined optimization target of this function is to minimize the difference between the face image and the projection image obtained by projecting the real face onto the two-dimensional imaging surface with the projection function. The gaze direction that achieves this optimization target is obtained in the three-dimensional space in which the real face is located and is taken as the gaze direction of the real face. This improves the accuracy of gaze direction detection and keeps the rotation of the eyeball natural and smooth across the gaze directions in different images.
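The construction summarized above can be sketched numerically. The sketch below is illustrative only and rests on several assumptions that the source does not state: the terms are squared Euclidean norms, the projection is the linear form sPRx+t with a hypothetical orthographic matrix P, and t is treated as a two-dimensional shift in the image plane. Under those assumptions the objective is quadratic in x, so its minimizer follows from the normal equations. The names `objective` and `solve_gaze` are hypothetical.

```python
import numpy as np

# Assumed orthographic projection matrix (drops the depth coordinate).
P = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])

def objective(x, R, t, d, s, x0, alpha, beta):
    """Sum of the three terms: reprojection, smoothing, stabilization."""
    residual = s * P @ R @ x + t - d
    return residual @ residual + alpha * (x - x0) @ (x - x0) + beta * x @ x

def solve_gaze(R, t, d, s=1.0, x0=None, alpha=0.0, beta=0.0):
    """Minimize the assumed quadratic objective over the 3D gaze vector x."""
    R = np.asarray(R, dtype=float)
    A = s * P @ R                                  # 2x3 linear part of the projection
    b = np.asarray(d, dtype=float) - np.asarray(t, dtype=float)
    x0 = np.zeros(3) if x0 is None else np.asarray(x0, dtype=float)
    # Normal equations: (A^T A + (alpha + beta) I) x = A^T b + alpha x0
    lhs = A.T @ A + (alpha + beta) * np.eye(3)
    rhs = A.T @ b + alpha * x0
    # lstsq also covers alpha = beta = 0, where lhs is rank deficient
    # (it then returns the minimum-norm solution).
    x, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    return x

if __name__ == "__main__":
    gaze = solve_gaze(np.eye(3), np.zeros(2), np.array([0.2, -0.1]),
                      x0=np.array([0.1, 0.0, 0.9]), alpha=0.5, beta=0.1)
    print(gaze)
```

With α = β = 0 only the reprojection term remains and the depth component of x is not constrained by the image, which is one way to see why the smoothing and stabilization terms help. If a unit gaze vector is needed, the result can be normalized afterwards.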

[0095] Fourth embodiment of the present invention

[0096] FIG. 4 is a flowchart of a video data processing method according to the fourth embodiment of the present invention. This embodiment is applicable to any scenario in which the gaze direction of a user is detected in a plurality of video frames of a video. The video data processing method according to this embodiment may be performed by the video data processing apparatus according to one of the embodiments of the present invention. The apparatus may be implemented in software and/or hardware and integrated into a device that performs the method. This device may be any three-dimensional model processing device with image processing capability.

[0097] As shown in FIG. 4, the method may include the steps described below.

[0098] In step S410, a video frame in the video data to be processed is obtained.

[0099] When the gaze direction of a face must be detected in the video frames of video data, the video data may first be processed to extract a plurality of video frames, so that the gaze direction can be detected in the video frames one after another.

[00100] In step S420, the gaze direction of the real face corresponding to the video frame is obtained by performing the line-of-sight detection method according to the embodiments of the present invention described above.

[00101] Once the video frames in the video data to be processed have been obtained, the gaze directions of the real face in the plurality of video frames can be obtained by applying the line-of-sight detection method according to the embodiments described above to each video frame in turn.
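The frame-by-frame detection described above can be sketched as a loop that hands each frame's result to the next call, where it can serve as the gaze direction of the associated image mentioned in [0090]. This is a minimal sketch: `detect_gaze_per_frame` and the `detect` callable are hypothetical names, and the per-frame detector itself is supplied by the caller.

```python
from typing import Callable, Iterable, List, Optional, Sequence

Gaze = Sequence[float]

def detect_gaze_per_frame(
    frames: Iterable[object],
    detect: Callable[[object, Optional[Gaze]], Gaze],
) -> List[Gaze]:
    """Run a per-frame gaze detector in order, passing on the previous result."""
    results: List[Gaze] = []
    previous: Optional[Gaze] = None   # the first frame has no associated image
    for frame in frames:
        gaze = detect(frame, previous)
        results.append(gaze)
        previous = gaze               # becomes the associated gaze for the next frame
    return results

def toy_detector(frame, previous):
    """Stand-in detector: the 'frame' is already a raw gaze estimate.

    It averages the raw estimate with the previous result to mimic smoothing.
    """
    if previous is None:
        return list(frame)
    return [(a + b) / 2.0 for a, b in zip(frame, previous)]

if __name__ == "__main__":
    print(detect_gaze_per_frame([[0.0, 0.0, 1.0], [1.0, 0.0, 1.0]], toy_detector))
```

Passing the detector in as a callable keeps the loop independent of how a single frame is analyzed.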

[00102] In addition, to make video data processing more intelligent, once the gaze direction of the face in each video frame has been determined, a corresponding operation can be performed directly on the video data to be processed according to where the gaze is directed. Therefore, after the gaze direction corresponding to a video frame is obtained, the method according to this embodiment of the present invention may further include: determining a corresponding line-of-sight offset based on the gaze direction corresponding to an adjacent video frame in the video data to be processed; and performing a corresponding video editing operation according to the line-of-sight offset.

[00103] Once the gaze directions in the plurality of video frames of the video data to be processed have been determined, the corresponding line-of-sight offset can be determined by analyzing the gaze direction corresponding to the adjacent video frame, and a corresponding editing operation can then be performed directly on the video data according to that offset. For example, if the line of sight in the video data shifts to the left, special-effect maps may be added to the video data to be processed.
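The offset-driven editing step can be illustrated with a short sketch. It assumes that gaze directions are three-component vectors, that a negative change in the first component means a shift to the left in the image, and that a fixed threshold separates a real shift from noise. None of these conventions is stated in the source, and the function names and the returned label are hypothetical.

```python
import numpy as np

def gaze_offset(prev_gaze, curr_gaze):
    """Line-of-sight offset between two adjacent frames (current minus previous)."""
    return np.asarray(curr_gaze, dtype=float) - np.asarray(prev_gaze, dtype=float)

def choose_edit(offset, threshold=0.05):
    """Pick an editing operation from the offset; returns None when nothing applies.

    Assumption: a negative x component means the gaze moved left in the image.
    """
    if offset[0] < -threshold:
        return "add_special_effect"
    return None

if __name__ == "__main__":
    shift = gaze_offset([0.1, 0.0, 1.0], [-0.1, 0.0, 1.0])
    print(choose_edit(shift))
```

A real pipeline would map the returned label to an actual editing routine. The threshold would need tuning for the scale of the gaze vectors in use.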

[00104] In the technical solution of this embodiment of the present invention, the gaze directions in the video frames of the video data to be processed are detected, the line-of-sight offset is determined, and the corresponding video editing operation is performed directly, which makes video data processing more intelligent.

[00105] Fifth embodiment of the present invention

[00106] FIG. 5 is a structural diagram of a line-of-sight detection device according to the fifth embodiment of the present invention. As shown in FIG. 5, the device may include: a parameter determination module 510 configured to determine, based on a key feature point in a face image, the face position and the rotational movement of the eye pupil that correspond to the face image, the rotational movement of the eye pupil being the displacement of the pupil center relative to the eyeball center in the face image; and a line-of-sight detection module 520 configured to obtain the gaze direction of the real face by back-projecting the rotational movement of the eye pupil, based on the predetermined projection function and the face position, into the three-dimensional space in which the real face is located.

[00107] In the technical solution of this embodiment of the present invention, the face position corresponding to the face image and the rotational movement of the pupil center relative to the eyeball center are determined based on the key feature point in the face image. The rotational movement of the eye pupil is then back-projected, based on the predetermined projection function and the face position, into the three-dimensional space in which the real face is located, which yields the gaze direction of the real face. This solution does not need to compare the eye texture in the face image with the eye textures in a large number of historical face images, nor to infer the direction of eyeball rotation from the movement of facial features other than the eyeball. This reduces the amount of eye data to be processed and improves the efficiency of gaze direction detection. The gaze direction of the real face is analyzed directly from the rotation of the eye pupil in the face image, which improves the accuracy of gaze direction detection.

[00108] The line-of-sight detection device may further include a feature point determination module configured to: obtain face data corresponding to the face image by scanning the face image; obtain a reconstructed face mesh model by reconstructing a predetermined three-dimensional face template mesh using the face data; and extract a key feature point in the reconstructed face mesh model and take it as the key feature point in the face image.

[00109] The parameter determination module 510 may include: a pose-consistent position determination unit configured to determine the pose-consistent position of the key feature point in the face image; and a face position determination unit configured to determine the face position in the face image based on the pose-consistent position.

[00110] The parameter determination module 510 may include: an eye center determination unit configured to determine the eyeball center corresponding to the face image based on a key feature point in the reconstructed face mesh model; a pupil center determination unit configured to obtain the pupil center corresponding to an eye image by recognizing the eye image in the face image; and a pupil displacement determination unit configured to determine, based on the position of the eyeball center and the position of the pupil center, the rotational movement of the eye pupil corresponding to the face image.
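The three units listed above reduce, at their simplest, to a vector difference in image coordinates. The sketch below assumes that the eyeball center can be approximated by the mean of the eye key points projected into the image, which is only one possible choice and is not specified in the source. The pupil center is taken as given, for example from a separate detector. All function names are hypothetical.

```python
import numpy as np

def eyeball_center_from_keypoints(eye_keypoints_2d):
    """Approximate the eyeball center as the mean of 2D eye key points (assumption)."""
    return np.asarray(eye_keypoints_2d, dtype=float).mean(axis=0)

def pupil_displacement(eyeball_center_2d, pupil_center_2d):
    """Displacement d of the pupil center relative to the eyeball center."""
    return (np.asarray(pupil_center_2d, dtype=float)
            - np.asarray(eyeball_center_2d, dtype=float))

if __name__ == "__main__":
    center = eyeball_center_from_keypoints([[0, 0], [2, 0], [2, 2], [0, 2]])
    print(pupil_displacement(center, [1.5, 1.0]))
```

The resulting two-component vector plays the role of d, the rotational movement of the eye pupil.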

[00111] The pupil center determination unit may be configured to: capture an eye image in the face image; and obtain the pupil center corresponding to the eye image by inputting the eye image into a pre-built deep network model.

[00112] The line-of-sight detection device may further include an eyeball reconstruction module configured to build a corresponding three-dimensional eye model in the reconstructed face mesh model based on the gaze direction of the real face.

[00113] The line-of-sight detection module 520 may include: an optimization function construction unit configured to construct a corresponding line-of-sight optimization function based on the predetermined projection function, the face position, and the rotational movement of the eye pupil; and a line-of-sight detection unit configured to obtain, in the three-dimensional space in which the real face is located, a gaze direction that achieves the predetermined optimization target of the line-of-sight optimization function, and to take the obtained gaze direction as the gaze direction of the real face.

[00114] The face position includes a rotation parameter and a shift parameter of the face in the face image.

[00115] The line-of-sight detection module 520 may further include: an associated image acquisition unit configured to obtain an associated image related to the face image, the associated image carrying the gaze direction that corresponds to it; and an optimization function update unit configured to update the line-of-sight optimization function based on the gaze direction in the associated image, the predetermined associated smoothing parameter, and the predetermined stabilization parameter.

[00116] The updated line-of-sight optimization function can be expressed as follows:

[formula image missing in the source text]

where Π(Rx+t) denotes the predetermined projection function; R denotes the rotation parameter of the face position; t denotes the shift parameter of the face position; x denotes the gaze direction in the face image; d denotes the rotational movement of the eye pupil; x₀ denotes the gaze direction in the associated image; α denotes the predetermined associated smoothing parameter; and β denotes the predetermined stabilization parameter.

[00117] The line-of-sight detection device may further include a projection function determination module configured to determine the predetermined projection function based on a parameter of the image capture device corresponding to the face image.

[00118] The parameter of the image capture device is its imaging focal length.
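As a hedged illustration of how a projection function could be derived from the focal length, the sketch below builds a standard pinhole projection. The patent states only that the projection function is determined from the imaging focal length; the pinhole model, the omission of a principal point offset, and the name `make_projection` are assumptions.

```python
from typing import Callable, Tuple

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]


def make_projection(focal_length: float) -> Callable[[Point3D], Point2D]:
    """Return a pinhole projection for the given focal length (assumed model)."""
    if focal_length <= 0:
        raise ValueError("focal length must be positive")

    def project(point: Point3D) -> Point2D:
        x, y, z = point
        if z <= 0:
            raise ValueError("point must lie in front of the camera (z > 0)")
        # Perspective division followed by scaling with the focal length.
        return (focal_length * x / z, focal_length * y / z)

    return project


project = make_projection(2.0)
print(project((1.0, 2.0, 4.0)))  # (0.5, 1.0)
```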

[00119] The line-of-sight detection device according to this embodiment of the present invention is applicable to implementing the line-of-sight detection method according to any of the embodiments described above, and has the corresponding functions.

[00120] Sixth embodiment of the present invention

[00121] FIG. 6 is a block diagram of a video data processing device according to the sixth embodiment of the present invention. As shown in FIG. 6, the device includes: a video frame acquisition module 610 configured to acquire a video frame from the video data to be processed; and a line-of-sight detection module 620 configured to obtain the gaze direction of the real face corresponding to the video frame by implementing the line-of-sight detection method according to any of the embodiments of the present invention.

[00122] In the technical solution of this embodiment of the present invention, the gaze direction is detected in each video frame of the video data to be processed and the line-of-sight offset is determined, so that the corresponding video editing operation can be performed directly, which makes the video data processing more intelligent.

[00123] The video data processing device may further include an operation module configured to: determine the line-of-sight offset corresponding to adjacent video frames based on the gaze directions corresponding to the adjacent video frames in the video data to be processed; and perform the video editing operation corresponding to the adjacent video frames according to the line-of-sight offset.
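One way to picture the line-of-sight offset between adjacent video frames is as the angle between their gaze direction vectors. The sketch below takes that view; the angular definition, the threshold value, and the names `gaze_offset_deg` and `should_edit` are assumptions for illustration, since the text does not specify how the offset is measured or which editing operation it triggers.

```python
import math
from typing import Sequence


def gaze_offset_deg(prev_dir: Sequence[float], curr_dir: Sequence[float]) -> float:
    """Angle in degrees between the gaze directions of two adjacent frames."""
    dot = sum(a * b for a, b in zip(prev_dir, curr_dir))
    norm = (math.sqrt(sum(a * a for a in prev_dir))
            * math.sqrt(sum(b * b for b in curr_dir)))
    if norm == 0.0:
        raise ValueError("gaze directions must be non-zero vectors")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))


def should_edit(prev_dir: Sequence[float], curr_dir: Sequence[float],
                threshold_deg: float = 10.0) -> bool:
    """Hypothetical trigger: edit when the offset exceeds a threshold."""
    return gaze_offset_deg(prev_dir, curr_dir) > threshold_deg
```

For example, `should_edit((0, 0, 1), (0, 1, 0))` compares two perpendicular gaze directions, so the offset is 90 degrees and the hypothetical trigger fires.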

[00124] The video data processing device according to this embodiment of the present invention is applicable to implementing the video data processing method according to any of the embodiments described above, and has the corresponding functions.

[00125] Seventh embodiment of the present invention

[00126] FIG. 7 is a block diagram of a line-of-sight processing system according to the seventh embodiment of the present invention. As shown in FIG. 7, the line-of-sight processing system includes an image capture device 710 and a data processing device 720 that are communicatively connected to each other. The image capture device 710 is arranged on the data processing device 720.

[00127] The image capture device 710 captures the face image to be detected and the video data to be processed, and transmits them to the data processing device 720. The data processing device 720 is provided with the line-of-sight detection device and the video data processing device according to the embodiments of the present invention described above, and is applicable to implementing the line-of-sight detection method and the video data processing method according to any of the embodiments of the present invention. For the execution steps, refer to the line-of-sight detection method and the video data processing method in any of the embodiments of the present invention. The corresponding functions are achieved and are not described again in detail here.

[00128] Eighth embodiment of the present invention

[00129] FIG. 8 is a block diagram of a device according to the eighth embodiment of the present invention. As shown in FIG. 8, the device includes a processor 80, a memory 81, and a communication module 82. The device may have one or more processors 80; FIG. 8 shows an example with a single processor 80. The processor 80, the memory 81, and the communication module 82 may be interconnected via a bus or by other means; FIG. 8 shows an example with a bus connection.

[00130] As a computer-readable storage medium, the memory 81 may be configured to store software programs, computer-executable programs, and modules, for example the program instructions/modules corresponding to the line-of-sight detection method or the video data processing method according to any of the embodiments of the present invention. The processor 80 runs the various functional applications of the device and performs its data processing by executing the software programs, instructions, and modules stored in the memory 81. In other words, it implements the line-of-sight detection method or the video data processing method.

[00131] The memory 81 may mainly include a program storage area and a data storage area. The program storage area may store an operating system and an application required for at least one function; the data storage area may store data created through use of the terminal and similar data. In addition, the memory 81 may include high-speed random access memory and may further include non-volatile memory, such as at least one magnetic disk storage device, flash memory, or another non-volatile solid-state storage device. In some examples, the memory 81 may include memory located remotely from the processor 80, and this remote memory may be connected to the device over a network. Examples of such a network include, but are not limited to, the Internet, a corporate intranet, a local area network, a mobile communication network, and combinations thereof.

[00132] The communication module 82 may be configured to establish a network connection or a mobile data connection between devices.

[00133] The device according to this embodiment of the present invention may be configured to implement the line-of-sight detection method or the video data processing method according to any of the embodiments described above, and to perform the corresponding functions.

[00134] Ninth embodiment of the present invention

[00135] The ninth embodiment of the present invention further provides a computer-readable storage medium storing a computer program. The computer program, when executed by a processor, implements the line-of-sight detection method or the video data processing method according to any of the embodiments of the present invention described above.

[00136] The line-of-sight detection method may include:

[00137] determining, based on a key feature point in the face image, the face position and the rotational movement of the pupil of the eye that correspond to the face image, wherein the rotational movement of the pupil is the displacement of the pupil center relative to the eyeball center in the face image; and

[00138] obtaining the gaze direction of the real face by back-projecting the rotational movement of the pupil, based on the predetermined projection function and the face position, into the three-dimensional space where the real face is located.
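The first of the two steps above defines the rotational movement of the pupil as a displacement between two centers in the face image. A minimal sketch of that definition follows, assuming both centers are already available as 2D image coordinates; the name `pupil_displacement` and the sample coordinates are hypothetical.

```python
from typing import Tuple

Point2D = Tuple[float, float]


def pupil_displacement(pupil_center: Point2D, eyeball_center: Point2D) -> Point2D:
    """Displacement of the pupil center relative to the eyeball center."""
    return (pupil_center[0] - eyeball_center[0],
            pupil_center[1] - eyeball_center[1])


d = pupil_displacement((105.0, 98.0), (100.0, 100.0))
print(d)  # (5.0, -2.0)
```

The resulting vector plays the role of d in the line-of-sight optimization function, that is, the image-space observation that the back-projected gaze direction has to explain.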

[00139] The video data processing method may include:

[00140] acquiring a video frame from the video data to be processed; and

[00141] obtaining the gaze direction of the real face corresponding to the video frame by performing the line-of-sight detection method according to any of the embodiments of the present invention.

[00142] In the storage medium containing computer-executable instructions according to an embodiment of the present invention, the computer-executable instructions are not limited to the method steps described above; they may also perform related operations of the line-of-sight detection method or the video data processing method according to any of the embodiments of the present invention.

[00143] From the embodiments described above, those skilled in the art will understand that the invention may be implemented by software together with general-purpose hardware, and may also be implemented by hardware alone. On this basis, the technical solutions of the present invention may be embodied as a software product. The computer software product may be stored in a computer-readable storage medium, such as a floppy disk, read-only memory (ROM), random access memory (RAM), flash memory, a hard disk, or an optical disk, and contains a plurality of instructions that cause a computing device (which may be a personal computer, a server, a network device, or the like) to execute the methods according to the various embodiments of the present invention.

[00144] In the embodiments of the line-of-sight detection device and the video data processing device, the units and modules are divided solely according to their logical functions; other divisions are possible provided that the corresponding functions can still be performed. In addition, the names of the functional units serve only to distinguish them from one another and are not intended to limit the scope of protection of the present invention.

Claims

1. A method for detecting a line of sight, comprising:

determining, based on a key feature point in a face image, a face position and a rotational movement of the pupil of the eye that correspond to the face image, wherein the rotational movement of the pupil of the eye is a displacement of the pupil center relative to the eyeball center in the face image; and

obtaining a gaze direction of a real face by back-projecting the rotational movement of the pupil of the eye, based on a predetermined projection function and the face position, into the three-dimensional space where the real face is located;

wherein obtaining the gaze direction of the real face by back-projecting the rotational movement of the pupil of the eye, based on the predetermined projection function and the face position, into the three-dimensional space where the real face is located comprises:

constructing a corresponding line-of-sight optimization function based on the predetermined projection function, the face position, and the rotational movement of the pupil of the eye; and

obtaining a gaze direction that achieves a predetermined optimization target of the line-of-sight optimization function in the three-dimensional space where the real face is located, and taking the obtained gaze direction as the gaze direction of the real face.

2. The method according to claim 1, wherein, before determining the corresponding rotational movement of the pupil of the eye based on a key feature point in the face image, the method further comprises:

obtaining relevant face data by scanning the face image;

obtaining a reconstructed mesh model of the face by reconstructing the face data using the 3D face template mesh; and

extracting a key feature point in the reconstructed face mesh model and taking the key feature point in the reconstructed face mesh model as a key feature point in the face image.

3. The method according to claim 2, wherein determining the appropriate position of the face based on a key feature point in the face image includes:

determination of the position of the key characteristic point in the face image consistent with the spatial position; and

determining the position of the face in the face image based on the position consistent with the spatial position.

4. The method of claim 2, wherein determining the corresponding rotational movement of the pupil of the eye based on a key feature point in the face image comprises:

determining an appropriate center of the eyeball based on a key feature point in the face mesh model;

obtaining the corresponding center of the pupil by recognizing the image of the eye in the image of the face; and

determining the appropriate rotational movement of the pupil of the eye based on the position of the center of the eyeball and the position of the center of the pupil.

5. The method according to claim 4, in which obtaining the corresponding center of the pupil of the eye by recognizing the image of the eye in the image of the face involves:

capturing an eye image in a face image; and

obtaining the corresponding pupil center by inputting the eye image into the pre-built deep network model.

6. The method according to any one of claims 2-5, wherein, after obtaining the gaze direction of the real face by back-projecting the rotational movement of the pupil of the eye, based on the predetermined projection function and the face position, into the three-dimensional space where the real face is located, the method further comprises:

building an appropriate function for optimizing the line of sight based on a given projection function of the three-dimensional eye model in the reconstructed mesh model of the face based on the line of sight of the real face.

7. The method of claim 1, wherein the face position includes a rotation parameter and a face shift parameter in the face image.

8. The method according to claim 1, wherein, before obtaining the gaze direction that achieves the predetermined optimization target of the line-of-sight optimization function in the three-dimensional space where the real face is located and taking the obtained gaze direction as the gaze direction of the real face, the method further comprises:

obtaining an associated image corresponding to the face image, the associated image carrying a corresponding gaze direction; and

updating the line-of-sight optimization function based on the gaze direction in the associated image, a predetermined associated smoothing parameter, and a predetermined stabilization parameter.

9. The method of claim 8, wherein the updated line-of-sight optimization function is written as follows:

where Π(Rx+t) denotes the predetermined projection function; R denotes the rotation parameter in the face position; t denotes the shift parameter in the face position; x denotes the gaze direction in the face image; d denotes the rotational movement of the pupil of the eye; x₀ denotes the gaze direction in the associated image; α denotes the associated smoothing parameter; and β denotes the stabilization parameter.

10. The method according to any one of claims 1-5, wherein, before obtaining the gaze direction of the real face by back-projecting the rotational movement of the pupil of the eye, based on the predetermined projection function and the face position, into the three-dimensional space where the real face is located, the method further comprises:

determining an appropriate projection function based on the parameter of the image capturing device corresponding to the face image.

11. The method of claim 10, wherein the parameter of the image capture device is its imaging focal length.

12. A method for processing video data, which includes:

obtaining a video frame in the video data to be processed; and

obtaining the gaze direction of a real face corresponding to the video frame by implementing the line-of-sight detection method according to any one of claims 1-11.

13. The method according to claim 12, wherein, after obtaining the gaze direction corresponding to the video frame, the method further comprises:

determining an appropriate gaze line offset based on a gaze direction corresponding to an adjacent video frame in the video data to be processed; and

performing an appropriate video editing operation based on the line of sight shift.

14. A device for detecting the line of sight, containing:

a parameter determination module configured to determine, based on a key feature point in the face image, the corresponding position of the face and the corresponding rotational movement of the pupil of the eye, wherein the rotational movement of the eye pupil is a displacement of the center of the pupil relative to the center of the eyeball in the face image; and

a line-of-sight detection module configured to obtain a gaze direction of a real face by back-projecting the rotational movement of the pupil of the eye, based on a predetermined projection function and the face position, into the three-dimensional space where the real face is located;

wherein the line-of-sight detection module is configured to:

construct a corresponding line-of-sight optimization function based on the predetermined projection function, the face position, and the rotational movement of the pupil of the eye; and

15. Device for processing video data, containing:

a video frame acquisition module, configured to acquire a video frame in the video data to be processed; and

a line-of-sight detection module configured to obtain the gaze direction of a real face corresponding to the video frame by implementing the line-of-sight detection method according to any one of claims 1-11.

16. A line-of-sight processing system, including an image capture device and a data processing device that are communicatively connected to each other, the image capture device being arranged on the data processing device; wherein:

the image capture device captures the face image to be detected and the video data to be processed, and transmits them to the data processing device, and the data processing device is provided with the line-of-sight detection device of claim 14 and the video data processing device of claim 15.

17. Computing device containing:

one or more processors; and

a memory configured to store one or more programs; wherein:

the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the line-of-sight detection method according to any one of claims 1-11.

18. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, causes the processor to implement the line-of-sight detection method according to any one of claims 1-11.

19. Computing device, containing:

one or more processors; and

a memory configured to store one or more programs; wherein:

the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the video data processing method according to claim 12 or 13.

20. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, causes the processor to implement the video data processing method according to claim 12 or 13.