KR102305880B1

KR102305880B1 - User's Gaze Tracking Method, and Medium Being Recorded with Program for Executing the Method

Info

Publication number: KR102305880B1
Application number: KR1020200078229A
Authority: KR
Inventors: 이종우
Original assignee: 이종우
Priority date: 2020-06-26
Filing date: 2020-06-26
Publication date: 2021-09-28
Also published as: WO2021261771A1

Abstract

Disclosed are a method for tracking a user's gaze and a recording medium on which a program for executing the method is recorded. In the present invention, a computing device calculates position information of the user's pupil from a video image of the user, and generates a gaze vector of the user based on the rotation angle and rotation direction of the user's face, calculated from the video image, and the position information of the pupil. According to the present invention, the user's gaze can be tracked with high precision using a general-purpose camera device such as a webcam rather than infrared camera equipment.

Description

User's Gaze Tracking Method, and Medium Being Recorded with Program for Executing the Method

The present invention relates to a method for tracking a user's gaze and to a recording medium on which a program for executing the method is recorded. More particularly, it relates to a gaze tracking method, and a recording medium storing a program for executing it, that allow a user's gaze to be tracked with high precision using a general-purpose camera device such as a webcam rather than infrared camera equipment.

Eye tracking, the technology of tracking a user's gaze, has been studied for a long time. It is known to have first been applied in the field of psychology, in the expectation that tracking a person's gaze would make it possible to analyze that person's psychological state.

Meanwhile, the most widespread eye-tracking technology, typified by that of Tobii, is based on a mechanism that relies on the interaction between the user's pupil and infrared light, and it therefore uses special equipment such as an infrared camera.

Because conventional eye-tracking technology thus relies on infrared camera equipment as its basic tool, it cannot be implemented in an environment without such equipment, and this technical limitation makes it difficult to apply widely.

Accordingly, an object of the present invention is to provide a user's gaze tracking method, and a recording medium on which a program for executing the method is recorded, that allow a user's gaze to be tracked with high precision using a general-purpose camera device such as a webcam rather than infrared camera equipment.

To achieve the above object, a user's gaze tracking method according to the present invention includes: (a) calculating, by a computing device, position information of the user's pupil from a video image of the user; and (b) generating, by the computing device, a gaze vector of the user based on the rotation angle and rotation direction of the user's face, calculated from the video image of the user, and the position information of the pupil.

Preferably, the method further includes (c) calculating, by the computing device, the coordinates of the intersection between the gaze vector and the user's gaze target surface included in the video image of the user.

In addition, step (a) includes: (a1) extracting, by the computing device, the user's face region from the video image of the user; (a2) extracting, by the computing device, the user's eye region from the extracted face region; and (a3) calculating, by the computing device, the position of the user's pupil based on the extracted eye region.

In addition, step (a3) includes: extracting a pupil region based on the boundary at which the color value changes within the eye region; and calculating the center coordinates of the pupil region.

Meanwhile, a recording medium according to the present invention is characterized in that a program for executing the above method is recorded on it.

According to the present invention, a user's gaze can be tracked with high precision using a general-purpose camera device such as a webcam rather than infrared camera equipment.

In addition, according to the present invention, the position of a user's pupil can be recognized with high precision using a general-purpose camera device such as a webcam rather than infrared camera equipment.

FIG. 1 is a configuration diagram of a system for executing a method for calculating a user's pupil position, and a user's gaze tracking method using the same, according to an embodiment of the present invention;
FIG. 2 is a flowchart showing the execution process of the method for calculating a user's pupil position according to an embodiment of the present invention;
FIG. 3 is a flowchart showing the execution process of the user's gaze tracking method using the pupil position calculation method according to an embodiment of the present invention; and
FIG. 4 is a diagram showing the operating principle of the pupil position calculation method according to an embodiment of the present invention.

Hereinafter, the present invention is described in more detail with reference to the drawings. Note that identical components are denoted by the same reference numerals throughout the drawings wherever possible. Detailed descriptions of well-known functions and configurations that could unnecessarily obscure the gist of the present invention are omitted.

FIG. 1 is a configuration diagram of a system for executing a method for calculating a user's pupil position, and a user's gaze tracking method using the same, according to an embodiment of the present invention, and FIG. 2 is a flowchart showing the execution process of the method for calculating a user's pupil position according to an embodiment of the present invention.

The execution process of the pupil position calculation method according to an embodiment of the present invention is described below with reference to FIGS. 1 and 2.

As shown in FIG. 1, a camera device 10 such as a webcam photographs the face of a user who is looking at a gaze target 20 such as a monitor, a newspaper, a test paper, or a work of art, and transmits the captured video image data to the computing device 100 in real time.

A program that executes each step of the method for calculating a user's pupil position, and of the gaze tracking method using it, according to an embodiment of the present invention is installed on the computing device 100. Such a program may be downloaded from a program distribution server to a computing device 100 with a communication function, such as a PC or a smartphone, and installed and used there, or it may be transferred or lent while recorded on various recording media such as a CD or a USB drive.

The computing device 100 acquires captured images of the user from the camera device 10 in real time (S210), and extracts the user's face region from the video image through an image analysis process on the video image received from the camera device 10 (S230).

In practicing the present invention, the video images captured by various kinds of camera device 10, such as webcams, may differ in size and resolution. It is therefore preferable for the computing device 100 to apply a DCN (Deep Convolutional Network) approach so as to obtain, through a predetermined image processing procedure on the video image received from the camera device 10, a video image of uniform size and resolution.
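
The description does not detail the DCN-based procedure. Purely as an illustration of bringing frames to a uniform size, the following sketch substitutes plain nearest-neighbor resampling, which is a different and much simpler technique than the one named above. The function name `resize_nearest` and the representation of an image as a list of rows of pixel values are assumptions made for this example.

```python
def resize_nearest(image, out_width, out_height):
    """Resample a 2D list of pixel values to out_width x out_height.

    Nearest-neighbor resampling, used here only as a stand-in for the
    unspecified normalization step (not the DCN approach itself).
    """
    in_height, in_width = len(image), len(image[0])
    return [
        [image[y * in_height // out_height][x * in_width // out_width]
         for x in range(out_width)]
        for y in range(out_height)
    ]


# Example: a 2x2 frame enlarged to the (hypothetical) working size 4x4.
frame = [[1, 2],
         [3, 4]]
normalized = resize_nearest(frame, 4, 4)
```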

This allows the computing device 100 to increase the speed and accuracy of the computations in each execution procedure of the pupil position calculation method and the gaze tracking method according to the present invention.

In addition, when the computing device 100 recognizes the user's face region in step S230 described above, it may use a method based on general computer vision algorithms, a method based on an artificial neural network, which is a statistical learning algorithm inspired by biological neural networks, or a face detection method based on machine learning.

Next, the computing device 100 extracts the user's eye region 30, as shown in FIG. 4, from the extracted face region (S250). Specifically, the computing device 100 may extract the eye region 30 by continuously detecting, along the boundary of the externally exposed area of the user's eyeball, the points at which the color value of the eyelid skin differs from that of the white of the eye.

After extracting the user's eye region 30 in this way, the computing device 100 extracts the user's pupil region 40 from the extracted eye region 30 (S270). Specifically, the computing device 100 may extract the pupil region 40 by continuously detecting, along the boundary between the white of the eye and the iris within the eye region 30, the points at which the color value of the white of the eye differs from that of the iris.
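
The boundary detection described above might be sketched, for a single horizontal scan line of the eye region, as follows. This is a minimal illustration under stated assumptions: the pixels are grayscale intensities (0 to 255) rather than full color values, the white of the eye is brighter than the iris, and the jump threshold `min_jump` as well as the function names are hypothetical.

```python
def find_boundaries(row, min_jump=60):
    """Indices i where the intensity changes sharply between row[i] and row[i + 1]."""
    return [i for i in range(len(row) - 1)
            if abs(row[i + 1] - row[i]) >= min_jump]


def iris_span(row, min_jump=60):
    """Return (left, right) pixel indices of the dark run in one scan line, or None.

    A falling edge (bright white of the eye to dark iris) followed by a
    rising edge (iris back to white) delimits the run.
    """
    edges = find_boundaries(row, min_jump)
    for a in edges:
        if row[a + 1] < row[a]:                    # falling edge
            for b in edges:
                if b > a and row[b + 1] > row[b]:  # rising edge
                    return (a + 1, b)
    return None


scan_line = [200, 205, 198, 60, 55, 58, 62, 201, 199]
span = iris_span(scan_line)  # indices of the first and last dark pixel
```

Repeating this for every scan line of the eye region yields the boundary points from which the pupil region 40 can be assembled.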

The computing device 100 then calculates the position information of the pupil by calculating the center coordinates of the pupil region 40 extracted in this way (S290).

Specifically, the computing device 100 may calculate the position information of the pupil by calculating the center coordinates of the circle defined by the pupil region 40 extracted in step S270 described above.
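
As one simple way to obtain such center coordinates, the sketch below takes the centroid of a binary mask of the pupil region. It assumes the region is fully visible and roughly circular, in which case the centroid coincides with the circle center; the mask representation and the name `region_center` are assumptions for illustration.

```python
def region_center(mask):
    """Centroid (x, y) of the nonzero pixels of a binary mask, or None if empty."""
    sum_x = sum_y = count = 0
    for y, line in enumerate(mask):
        for x, value in enumerate(line):
            if value:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)


pupil_mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
center = region_center(pupil_mask)
```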

In practicing the present invention, when part of the pupil region 40 is covered by the eyelid, the computing device 100 may extract a circular pupil region 40 by interpolating the remaining, covered section on the basis of the annular section over which the points of color difference between the white of the eye and the iris are continuously detected within the eye region 30.
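
The description does not say which interpolation is used. One common way to recover a full circle from a visible arc is an algebraic least-squares circle fit, sketched here as an assumption rather than as the method of the invention. Each visible boundary point (x, y) contributes one equation x² + y² + D·x + E·y + F = 0, and the center and radius follow from D, E, and F.

```python
import math


def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    a = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("boundary points are collinear or too few")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_circle(points):
    """Least-squares circle through boundary points; returns (cx, cy, radius)."""
    sxx = sxy = syy = sx = sy = 0.0
    bx = by = b1 = 0.0
    for x, y in points:
        z = -(x * x + y * y)
        sxx += x * x
        sxy += x * y
        syy += y * y
        sx += x
        sy += y
        bx += x * z
        by += y * z
        b1 += z
    n = float(len(points))
    d, e, f = solve3([[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]], [bx, by, b1])
    cx, cy = -d / 2.0, -e / 2.0
    return cx, cy, math.sqrt(cx * cx + cy * cy - f)


# Only the lower arc is visible (the upper part is hidden by the eyelid).
visible_arc = [(10 + 5 * math.cos(math.radians(t)), 8 + 5 * math.sin(math.radians(t)))
               for t in range(20, 161, 20)]
cx, cy, radius = fit_circle(visible_arc)
```

The fitted center then serves as the pupil position even though part of the boundary was never observed.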

Thus, according to the present invention, the position of the user's pupil can be recognized with high precision even with a general-purpose camera device 10 such as a webcam rather than infrared camera equipment.

In addition, in practicing the present invention, the computing device 100 may calculate the position of the pupil using a computer vision method in which parameter values that define the pupil, such as its color value, shape, and size, are set in advance.

In addition, when calculating the position information of the pupil, the computing device 100 may use a blob detection algorithm, that is, an algorithm that compares properties of a digital image such as brightness or color with those of the surrounding regions.

More specifically, the computing device 100 may calculate the position of the pupil by sequentially applying basic computer vision techniques such as thresholding and blurring to filter out the pixel values that do not belong to the pupil.
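
A minimal sketch of such a filtering pipeline is given below. The order (blur first, then threshold), the 3x3 box kernel, and the threshold value are assumptions chosen for illustration; the description only names the techniques. Blurring suppresses isolated dark noise pixels so that only a compact dark area survives the threshold.

```python
def box_blur(image):
    """3x3 mean filter; border pixels average over the neighbors that exist."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            values = [image[j][i]
                      for j in range(max(0, y - 1), min(height, y + 2))
                      for i in range(max(0, x - 1), min(width, x + 2))]
            out[y][x] = sum(values) / len(values)
    return out


def dark_mask(image, threshold=80):
    """1 where the blurred intensity is below the threshold, else 0."""
    return [[1 if value < threshold else 0 for value in line]
            for line in box_blur(image)]


# Bright 7x7 eye patch with a dark 3x3 pupil and one dark noise pixel.
patch = [[200] * 7 for _ in range(7)]
for y in range(2, 5):
    for x in range(2, 5):
        patch[y][x] = 20
patch[0][6] = 0
mask = dark_mask(patch)  # the noise pixel is removed, the pupil core remains
```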

When this yields several pupil candidates, the computing device 100 may compare the diameter of each candidate with a predetermined reference pupil diameter, select only the candidates that fall within a predetermined error range, and calculate their position coordinates.
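
The candidate selection can be sketched as a simple filter. The tuple layout `(center_x, center_y, diameter)`, the pixel units, and the numeric reference and tolerance values are hypothetical.

```python
def select_pupils(candidates, reference_diameter, tolerance):
    """Keep candidates whose diameter is within tolerance of the reference."""
    return [c for c in candidates
            if abs(c[2] - reference_diameter) <= tolerance]


# (center_x, center_y, diameter) in pixels; the second candidate is too small.
candidates = [(12.0, 9.0, 11.5), (40.0, 3.0, 4.0), (13.0, 9.5, 12.4)]
pupils = select_pupils(candidates, reference_diameter=12.0, tolerance=1.0)
```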

FIG. 3 is a flowchart showing the execution process of the user's gaze tracking method using the pupil position calculation method according to an embodiment of the present invention. The execution process of this gaze tracking method is described below with reference to FIGS. 1 and 3.

As described above, after the pupil position information has been calculated by the pupil position calculation method of FIG. 2 according to an embodiment of the present invention (S310), the computing device 100 generates the user's gaze vector l on the basis of the calculated pupil position information (S330).

Specifically, the computing device 100 calculates the three-dimensional rotation direction and rotation angle of the user's face through image analysis of the user's face region extracted from the video image in step S230 described above.

In practicing the present invention, the computing device 100 may calculate the three-dimensional rotation direction and rotation angle of the user's face using a facial landmark detection technique.

Next, the computing device 100 generates the user's three-dimensional gaze vector l, as shown in FIG. 1, on the basis of the user's pupil position coordinates calculated in step S310 described above and the three-dimensional rotation direction and rotation angle of the user's face.

In practicing the present invention, the magnitude of the gaze vector l may be about 1.5 to 2 times the distance from the user to the gaze target 20.

The gaze vector l generated in this way may be defined by the equation of a three-dimensional ray that starts at the three-dimensional position coordinates of the user's pupil and extends along the three-dimensional rotation direction angle of the user's face.
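
Such a ray can be written as P(t) = O + t·d for t ≥ 0, where O is the pupil position and d a unit direction. The sketch below derives d from two face rotation angles. The coordinate convention (camera at the origin, z pointing from the camera toward the user, so that a frontal face looks along -z) and the yaw and pitch parameterization are assumptions made for illustration only.

```python
import math


def gaze_direction(yaw_deg, pitch_deg):
    """Unit direction for a face rotated by yaw (about y) and pitch (about x).

    Assumed convention: yaw = pitch = 0 means looking straight at the camera,
    i.e. along (0, 0, -1).
    """
    yaw, pitch = math.radians(yaw_deg), math.radians(pitch_deg)
    return (math.sin(yaw) * math.cos(pitch),
            math.sin(pitch),
            -math.cos(yaw) * math.cos(pitch))


def point_on_ray(origin, direction, t):
    """Point O + t * d on the gaze ray; t must be non-negative."""
    if t < 0:
        raise ValueError("a ray is only defined for t >= 0")
    return tuple(o + t * d for o, d in zip(origin, direction))


pupil = (0.0, 0.0, 0.6)            # hypothetical pupil position in meters
direction = gaze_direction(0.0, 0.0)
target = point_on_ray(pupil, direction, 0.6)
```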

In practicing the present invention, for the computing device 100 to calculate the three-dimensional position coordinates of the pupil in step S290 described above, it is preferable that the shooting distance, that is, the separation between the camera device 10 and the user, be kept constant at a predetermined reference distance, and that the computing device 100 calculate the three-dimensional position coordinates of the pupil on the basis of that reference distance, which is preset in the computing device 100.
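
With the depth fixed at the reference distance, a two-dimensional pupil coordinate can be lifted to three dimensions with the standard pinhole camera model. The description does not specify the camera model; the focal length in pixels, the principal point, and the function name below are assumptions for this sketch.

```python
def pupil_to_3d(u, v, reference_distance, focal_px, center_x, center_y):
    """Back-project pixel (u, v) to camera coordinates at a known depth.

    Pinhole model: X = (u - cx) * Z / f, Y = (v - cy) * Z / f, Z = depth.
    """
    x = (u - center_x) * reference_distance / focal_px
    y = (v - center_y) * reference_distance / focal_px
    return (x, y, reference_distance)


# Hypothetical 640x480 webcam, focal length 500 px, user 0.6 m away.
pupil_3d = pupil_to_3d(420, 240, reference_distance=0.6,
                       focal_px=500.0, center_x=320.0, center_y=240.0)
```

If the user moves closer to or farther from the camera than the reference distance, the recovered coordinates scale accordingly, which is why the distance is held constant.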

Thus, according to the present invention, the user's gaze can be tracked with high precision even with a general-purpose camera device 10 such as a webcam rather than infrared camera equipment.

In addition, the computing device 100 may calculate, in three dimensions, the coordinates of the gaze target surface, that is, the plane of the gaze target 20 at which the user is looking.

In practicing the present invention, when the gaze target 20 is a newspaper, a test paper, a work of art, or the like that is located where it can be photographed by the camera device 10 and lies at a predetermined shooting distance from the camera device 10, the computing device 100 may calculate the three-dimensional coordinates of the gaze target surface from the video image received from the camera device 10 in step S210 described above.

In addition, when the gaze target 20 is not photographed by the camera device 10, as when the gaze target 20 is a monitor on which the webcam is mounted, it is preferable that the three-dimensional coordinate information of the gaze target surface be stored in the computing device 100 in advance.

The computing device 100 can thereby calculate the coordinates of the intersection between the user's gaze vector l generated in step S330 described above and the gaze target surface (S350).

In this way, the computing device 100 can determine the point at which the user's gaze vector l meets the gaze target surface as the area of the gaze target surface on which the user's gaze rests (S370).
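
The intersection can be computed with the standard ray-plane formula. In the sketch below the gaze target surface is assumed to be an unbounded plane given by one point on it and its normal; these inputs and the function name are assumptions. With ray P(t) = O + t·d and plane (P - P0)·n = 0, the parameter is t = ((P0 - O)·n) / (d·n).

```python
def dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))


def intersect_plane(origin, direction, plane_point, plane_normal, eps=1e-9):
    """Intersection of the gaze ray with a plane, or None if there is none.

    Returns None when the ray is parallel to the plane or when the plane
    lies behind the ray's starting point (t < 0).
    """
    denom = dot(direction, plane_normal)
    if abs(denom) < eps:
        return None
    offset = tuple(p - o for p, o in zip(plane_point, origin))
    t = dot(offset, plane_normal) / denom
    if t < 0:
        return None
    return tuple(o + t * d for o, d in zip(origin, direction))


# Pupil 0.6 m in front of a monitor lying in the plane z = 0.
hit = intersect_plane(origin=(0.1, 0.0, 0.6), direction=(0.0, 0.0, -1.0),
                      plane_point=(0.0, 0.0, 0.0), plane_normal=(0.0, 0.0, 1.0))
```

A real gaze target surface is bounded, so an implementation would presumably also check whether the returned point lies inside the surface's outline before treating it as a valid gaze point.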

In addition, the computing device 100 may calculate the user's degree of attention to the gaze target 20 on the basis of the result of calculating, in step S370 described above, the coordinates of the intersection between the user's gaze vector l and the gaze target surface (S390).

Specifically, when the user is watching lecture content displayed on a monitor and it is determined that no intersection exists between the user's gaze vector l and the gaze target surface, the computing device 100 may determine that the user is not currently looking at the lecture content displayed on the monitor.

In addition, the user's degree of attention to the lecture content may be calculated as the proportion of the content's running time during which an intersection between the user's gaze vector l and the gaze target surface exists, that is, during which the user's gaze is determined to rest on the gaze target surface.
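
This ratio can be sketched as follows, assuming the running time is split into intervals, each tagged with whether an intersection existed during it. The `(duration_seconds, on_target)` sample format and the function name are hypothetical.

```python
def attention_ratio(samples):
    """Fraction of total time during which the gaze was on the target surface.

    samples: iterable of (duration_seconds, on_target) pairs.
    """
    samples = list(samples)
    total = sum(duration for duration, _ in samples)
    if total <= 0:
        return 0.0
    on_target = sum(duration for duration, hit in samples if hit)
    return on_target / total


# 20 s of lecture content, of which 15 s were spent looking at the monitor.
ratio = attention_ratio([(10.0, True), (5.0, False), (5.0, True)])
```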

In addition, when the gaze target 20 is electronic equipment that requires user control, such as a tablet PC, the computing device 100 may generate control information for the gaze target 20 on the basis of information about the user's gaze area on the gaze target surface.

Specifically, when the user's gaze area on the gaze target surface coincides with the area of the tablet PC screen in which a scroll button is displayed, the computing device 100 may generate a scroll control command and transmit it to the gaze target 20, such as the tablet PC, so that a control operation of the electronic equipment is executed on the basis of the user's gaze information.
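
The matching of a gaze point against a button area might look like the sketch below. The rectangle format `(left, top, width, height)` in screen pixels and the command dictionary are hypothetical; the description does not define a command format or a transport to the tablet PC.

```python
def control_command(gaze_xy, scroll_button_rect):
    """Return a scroll command if the gaze point lies inside the button area."""
    x, y = gaze_xy
    left, top, width, height = scroll_button_rect
    if left <= x < left + width and top <= y < top + height:
        return {"command": "scroll"}
    return None


scroll_button = (900, 600, 60, 60)   # hypothetical button area on the screen
command = control_command((925, 640), scroll_button)
```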

In practicing the present invention, the video image captured by the camera device 10 in step S210 described above may contain a plurality of users. In that case the computing device 100 may track the gazes of the plurality of users simultaneously by independently executing, for each user, the steps described above: extraction of the face region (S230), extraction of the eye region (S250), extraction of the pupil region 40 (S270), calculation of the center coordinates of the pupil region 40 (S290), calculation of the pupil position information (S310), generation of the gaze vector (S330), calculation of the intersection between the gaze vector l and the gaze target surface (S350), determination of the gaze area (S370), and calculation of the degree of attention (S390).

Specifically, the computing device 100 may use a sliding window approach or a CNN (convolutional neural network) approach to recognize the faces of a plurality of users, which may appear at various sizes, at the same time.

More specifically, when the CNN approach is applied, the computing device 100 outputs the output values of a CNN layer in the form of a feature map, marks, on the basis of the feature map, specific locations that are regions in which a face exists with a predetermined probability, and may then output an estimated distribution of the eye region 30 at each such location.
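
The step from a feature map to candidate face regions can be illustrated as follows, assuming the feature map is a two-dimensional grid of face probabilities in which each cell covers a `stride` x `stride` block of the input image. The grid values, the stride, the threshold, and the function name are hypothetical, and no actual CNN is run here.

```python
def face_regions(feature_map, threshold=0.5, stride=16):
    """Image-space boxes (x, y, width, height, probability) for likely face cells."""
    regions = []
    for row, line in enumerate(feature_map):
        for col, probability in enumerate(line):
            if probability >= threshold:
                regions.append((col * stride, row * stride,
                                stride, stride, probability))
    return regions


# Hypothetical 2x2 probability map; only the top-right cell passes.
probabilities = [[0.1, 0.9],
                 [0.2, 0.3]]
faces = face_regions(probabilities)
```

Each returned region would then be passed on to the per-user eye region estimation.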

The terms used in the present invention are used only to describe specific embodiments and are not intended to limit the present invention. A singular expression includes the plural unless the context clearly indicates otherwise. In the present application, terms such as "include" or "have" are intended to indicate that the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification exist, and should be understood not to exclude in advance the existence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Although preferred embodiments and application examples of the present invention have been shown and described above, the present invention is not limited to those specific embodiments and application examples. Those of ordinary skill in the art to which the invention pertains can of course make various modifications without departing from the gist of the present invention as claimed in the claims, and such modifications should not be understood separately from the technical idea or outlook of the present invention.

10: camera device, 20: gaze target,
30: eye region, 40: pupil region,
100: computing device.

Claims

(a) extracting, by the computing device, face regions of a plurality of users from a video image of the plurality of users;
(b) extracting, by the computing device, eye regions of a plurality of users from the extracted face regions of the plurality of users;
(c) extracting, by the computing device, a pupil region based on a boundary at which a color value changes in the eye region, wherein, when a part of the pupil region is covered by an eyelid, a pupil region having a circular shape is extracted by performing interpolation on the remaining section covered by the eyelid, based on the annular continuous detection section of the boundary at which the color value changes in the eye region;
(d) calculating, by the computing device, the coordinates of the center of the pupil region, thereby calculating position information of the user's pupil;
(e) generating, by the computing device, a gaze vector of the user based on the rotation angle and rotation direction of the user's face calculated from the video image of the user, and the position information of the pupil;
(f) calculating, by the computing device, three-dimensional coordinates of a gaze target surface included in the video image; and
(g) calculating, by the computing device, the coordinates of the intersection between the gaze vector and the three-dimensional coordinates of the gaze target surface included in the video image,
wherein step (a) includes:
(a1) outputting, by the computing device, an output value of a convolutional neural network layer in the form of a feature map; and
(a2) indicating, by the computing device, regions in which the faces of the plurality of users exist with a predetermined probability, based on the feature map, and
wherein step (b) includes outputting, by the computing device, an estimated distribution value of the eye regions of the plurality of users within the regions in which the faces of the plurality of users exist.

(Deleted)

A computer-readable recording medium in which a program for executing the method of claim 1 is recorded.