KR20190103570A

KR20190103570A - Method for eye-tracking and terminal for executing the same

Info

Publication number: KR20190103570A
Application number: KR1020180024121A
Authority: KR
Inventors: 석윤찬; 이태희
Original assignee: 주식회사 비주얼캠프
Priority date: 2018-02-28
Filing date: 2018-02-28
Publication date: 2019-09-05
Also published as: KR102094944B1

Abstract

Provided are a method for tracking a gaze and a terminal for performing the same. According to one embodiment of the present invention, the terminal capable of tracking a gaze of a user comprises: a model selection unit selecting a gaze tracking model corresponding to a current screen of the terminal among a plurality of set gaze tracking models; and a gaze tracking unit tracking the gaze of the user on the current screen using the selected gaze tracking model. The gaze tracking model is configured to divide the current screen into a plurality of set areas and identify an area to which a position of the gaze belongs among the divided areas to track the gaze.

Description

METHOD FOR EYE-TRACKING AND TERMINAL FOR EXECUTING THE SAME

Embodiments of the present invention relate to eye tracking technology.

Eye tracking is a technology that tracks the position of a user's gaze by detecting the user's eye movements. Methods such as image analysis, contact lenses, and attached sensors may be used. The image analysis method detects pupil movement by analyzing real-time camera images and calculates the gaze direction relative to a fixed position reflected on the cornea. The contact lens method uses light reflected from a contact lens with an embedded mirror, or the magnetic field of a contact lens with an embedded coil; it is less convenient but highly accurate. The sensor attachment method attaches sensors around the eye and detects eye movement from changes in the electric field caused by that movement, so eye movement can be detected even when the eyes are closed (for example, during sleep).

Recently, the range of devices and fields to which eye tracking technology is applied has been gradually expanding, and attempts to use it when providing advertising services and the like on terminals such as smartphones are increasing. Conventionally, however, the position coordinates (for example, x and y coordinates) of the point at which the user is gazing are obtained and the user's gaze is tracked from those coordinates, in which case the accuracy of gaze tracking degrades depending on the position coordinates. In addition, because each of the position coordinates must be learned when collecting training data for gaze tracking, training takes a long time.

Korean Registered Patent Publication No. 10-1479471 (2015.01.13)

Embodiments of the present invention aim to improve the accuracy of gaze tracking by tracking the gaze through identification of the on-screen area to which the gaze position belongs, rather than the position coordinates of the gaze.

According to an exemplary embodiment of the present invention, there is provided a terminal capable of tracking a user's gaze, the terminal comprising: a model selection unit that selects, from among a plurality of preset gaze tracking models, a gaze tracking model corresponding to a current screen of the terminal; and a gaze tracking unit that tracks the user's gaze on the current screen using the selected gaze tracking model, wherein the gaze tracking model is configured to divide the current screen into a plurality of preset areas and to track the gaze by identifying, among the divided areas, the area to which the gaze position belongs.

The model selection unit may identify, among a plurality of applications installed on the terminal, the application that outputs the current screen, and may select the gaze tracking model corresponding to the identified application.

The position, size, and number of the divided areas may differ for each gaze tracking model, and each gaze tracking model may correspond to one or more different applications among the plurality of applications.

Each of the areas may be an area delimited so that the user can provide input to it in order to perform a different preset function.

The terminal may further comprise a learning unit that, when a preset action is input by a user gazing at a specific area within the current screen, trains the gaze tracking model by collecting a face image of the user captured at the time the action is input and position information of the specific area.

When the user touches the specific area, the learning unit may collect the face image of the user captured at the time of the touch and the position information of the specific area.

According to another exemplary embodiment of the present invention, there is provided a gaze tracking method performed in a terminal capable of tracking a user's gaze, the method comprising: selecting, by a model selection unit of the terminal, a gaze tracking model corresponding to a current screen of the terminal from among a plurality of preset gaze tracking models; and tracking, by a gaze tracking unit of the terminal, the user's gaze on the current screen using the selected gaze tracking model, wherein the gaze tracking model is configured to divide the current screen into a plurality of preset areas and to track the gaze by identifying, among the divided areas, the area to which the gaze position belongs.

The selecting of the gaze tracking model may include identifying, among a plurality of applications installed on the terminal, the application that outputs the current screen, and selecting the gaze tracking model corresponding to the identified application.

The gaze tracking method may further comprise training, by a learning unit of the terminal, the gaze tracking model: when a preset action is input by a user gazing at a specific area within the current screen, the learning unit collects a face image of the user captured at the time the action is input and position information of the specific area.

The training of the gaze tracking model may include, when the user touches the specific area, collecting the face image of the user captured at the time of the touch and the position information of the specific area.

According to embodiments of the present invention, the accuracy of gaze tracking can be improved by tracking the gaze through identification of the on-screen area to which the gaze position belongs, rather than the position coordinates of the gaze. When the gaze is tracked by divided screen areas instead of gaze position coordinates, the number of classes for gaze tracking decreases, so the computation speed and accuracy of gaze tracking increase and the computing resources used in the gaze tracking process can be minimized.

In addition, according to embodiments of the present invention, each gaze tracking model is trained on training data expressed as divided screen areas instead of gaze position coordinates. Because the number of classes for gaze tracking decreases in this case, the training speed of the gaze tracking model improves and the computing resources used in the training process can be minimized.

FIG. 1 is a block diagram showing the detailed configuration of a terminal according to an embodiment of the present invention.
FIG. 2 is an example illustrating a gaze tracking model for each screen (or each application) according to an embodiment of the present invention.
FIG. 3 is an example illustrating a gaze tracking model for each screen (or each application) according to an embodiment of the present invention.
FIG. 4 is an example illustrating a gaze tracking model for each screen (or each application) according to an embodiment of the present invention.
FIG. 5 is an example illustrating the process by which the learning unit trains a gaze tracking model according to an embodiment of the present invention.
FIG. 6 is a flowchart illustrating a gaze tracking method according to an embodiment of the present invention.
FIG. 7 is a block diagram illustrating a computing environment including a computing device suitable for use in exemplary embodiments.

Hereinafter, specific embodiments of the present invention will be described with reference to the drawings. The following detailed description is provided to assist in a comprehensive understanding of the methods, devices, and/or systems described herein. However, it is only an example, and the present invention is not limited thereto.

In describing the embodiments of the present invention, a detailed description of known technology related to the present invention is omitted where it is judged that it would unnecessarily obscure the gist of the invention. The terms used below are defined in consideration of their functions in the present invention and may vary according to the intention or custom of a user or operator. Their definitions should therefore be based on the content of this specification as a whole. The terminology used in the detailed description is only for describing embodiments of the present invention and should in no way be limiting. Unless clearly used otherwise, expressions in the singular include the plural. In this description, expressions such as "comprising" or "including" are intended to indicate certain features, numbers, steps, operations, elements, or parts or combinations thereof, and should not be construed to exclude the presence or possibility of one or more other features, numbers, steps, operations, elements, or parts or combinations thereof beyond those described.

FIG. 1 is a block diagram showing the detailed configuration of a terminal 100 according to an embodiment of the present invention. The terminal 100 is a device capable of tracking a user's gaze and may be, for example, a mobile device such as a smartphone, a tablet PC, or a laptop. However, the type of the terminal 100 is not limited thereto, and various communication devices having a screen for displaying content and a photographing device for photographing the user may correspond to the terminal 100 according to embodiments of the present invention.

As shown in FIG. 1, the terminal 100 according to an embodiment of the present invention includes a model selection unit 102, a gaze tracking unit 104, and a memory removal unit 106, and may further include a photographing unit 108 and a learning unit 110 depending on the embodiment.

The model selection unit 102 selects, from among a plurality of preset gaze tracking models, the gaze tracking model corresponding to the current screen of the terminal 100. In these embodiments, a gaze tracking model is a model used to track the user's gaze, and it may differ for each screen of the terminal 100 or for each application installed on the terminal 100. When the user runs one of the applications installed on the terminal 100, the model selection unit 102 may identify the application that outputs the current screen of the terminal 100 and select the gaze tracking model corresponding to the identified application.
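The selection step can be pictured as a lookup from the foreground application to a preset model. The following is a minimal sketch under stated assumptions: the application identifiers, the `GazeModelSpec` type, and the `select_model` function are hypothetical names introduced for illustration, and only the area counts (28, 3, and 13) come from the examples in FIGS. 2 to 4.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class GazeModelSpec:
    """Hypothetical descriptor of one preset gaze tracking model."""
    name: str
    num_areas: int  # number of areas the model divides the screen into


# Hypothetical registry: application identifier -> preset model.
MODEL_BY_APP: Dict[str, GazeModelSpec] = {
    "home_screen": GazeModelSpec("first_model", 28),
    "scrollable_app": GazeModelSpec("second_model", 3),
    "dialer": GazeModelSpec("third_model", 13),
}


def select_model(foreground_app: str) -> Optional[GazeModelSpec]:
    """Return the model registered for the app that draws the current screen."""
    return MODEL_BY_APP.get(foreground_app)
```

Returning `None` for an unregistered application is a design choice of this sketch; the source does not say what the terminal does when no model matches.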

Here, the gaze tracking model is configured to divide the screen of the terminal 100 into a plurality of preset areas and to track the user's gaze by identifying which of the divided areas the user is gazing at. The position, size, and number of the divided areas may differ for each gaze tracking model. The gaze tracking models are described in detail below with reference to FIGS. 2 to 4.

The gaze tracking unit 104 tracks the user's gaze on the current screen using the selected gaze tracking model. The gaze tracking unit 104 may load the gaze tracking model selected from the plurality of gaze tracking models and obtain from it information about each divided area of the current screen. The gaze tracking unit 104 may then use the gaze tracking model to track the gaze by identifying, among the divided areas, the area to which the user's gaze position belongs.
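As a rough illustration of the area identification described above, the sketch below assumes the selected model yields one score per divided area and that the tracker reports the highest-scoring area. The score vector, the `identify_area` name, and the 1-based numbering convention are assumptions made for illustration; the source does not specify the model's output format.

```python
from typing import Sequence


def identify_area(scores: Sequence[float]) -> int:
    """Return the 1-based number of the area with the highest score.

    scores[i] is assumed to be the model's score for area i + 1.
    """
    if not scores:
        raise ValueError("the model must describe at least one area")
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return best_index + 1


# Hypothetical scores for a three-area model such as the one in FIG. 3.
gazed_area = identify_area([0.1, 0.7, 0.2])
```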

When the currently running application is terminated, the memory removal unit 106 removes the gaze tracking model corresponding to that application from memory (not shown).

FIGS. 2 to 4 are examples illustrating a gaze tracking model for each screen (or each application) according to an embodiment of the present invention.

First, FIG. 2 illustrates a case in which the terminal 100 outputs a home screen. Here, the model selection unit 102 is assumed to select a first gaze tracking model.

Referring to FIG. 2, the first gaze tracking model may divide the screen into 28 areas and store the position information of each of the 28 areas in advance. When the user then gazes at the screen of the terminal 100, the first gaze tracking model may track the gaze by identifying which of the 28 areas the user's gaze position belongs to. For example, when the user gazes at area 7, the first gaze tracking model may confirm that the gaze position belongs to area 7 and thereby determine that the user is gazing at area 7. Accordingly, the weather-related application corresponding to area 7 may be executed.

Next, FIG. 3 illustrates a case in which a vertically scrollable application is executed and the terminal 100 outputs a screen containing content related to that application. Here, the model selection unit 102 is assumed to select a second gaze tracking model.

Referring to FIG. 3, the second gaze tracking model may divide the screen into three areas and store the position information of each of the three areas in advance. When the user then gazes at the screen of the terminal 100, the second gaze tracking model may track the gaze by identifying which of the three areas the user's gaze position belongs to. For example, when the user gazes at area 1, the second gaze tracking model may confirm that the gaze position belongs to area 1 and thereby determine that the user is gazing at area 1. Accordingly, the "scroll up" function corresponding to area 1 may be executed in the vertically scrollable application, and the screen may scroll upward.

Next, FIG. 4 illustrates a case in which a phone-dialing application is executed and the terminal 100 outputs a screen containing content related to that application. Here, the model selection unit 102 is assumed to select a third gaze tracking model.

Referring to FIG. 4, the third gaze tracking model may divide the screen into 13 areas and store the position information of each of the 13 areas in advance. When the user then gazes at the screen of the terminal 100, the third gaze tracking model may track the gaze by identifying which of the 13 areas the user's gaze position belongs to. For example, when the user gazes at area 1, area 5, and area 7 in sequence, the third gaze tracking model may confirm that the gaze position belongs to area 1, area 5, and area 7 in turn and thereby determine that the user is gazing at those areas in sequence. Accordingly, "number 1", "number 5", and "number 7", which correspond to areas 1, 5, and 7, may be input in sequence in the dialing application.

In this way, the model selection unit 102 selects, from among the plurality of preset gaze tracking models, the gaze tracking model corresponding to the current screen of the terminal 100 (or to the application that outputs the current screen), and the gaze tracking unit 104 may track the user's gaze on the current screen using the selected gaze tracking model. Each gaze tracking model may correspond to one or more different applications among the plurality of applications, and a different gaze tracking model may be used for each screen that is output.

In addition, each divided area may be an area delimited so that the user can provide input to it in order to perform a different preset function. In the examples above, each area in FIG. 2 is delimited so that the user can provide input to launch an application displayed on the home screen; the areas in FIG. 3 are delimited so that the user can provide input to perform the "scroll up", "scroll stop", and "scroll down" functions, respectively; and the areas in FIG. 4 may correspond to "number 1", "number 2", ..., "number 9", "number 0", "symbol *", "symbol #", and the "dial button", each of which the user can input. The terminal 100 may track the user's gaze by identifying, among the divided areas, the area the user is determined to be gazing at, and may perform the function (or action) corresponding to that area.
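The mapping from an identified area to a function can be sketched as a lookup table per model. This is an illustration under stated assumptions: the description confirms only that area 1 of FIG. 3 is "scroll up" and that areas 1, 5, and 7 of FIG. 4 are the digits 1, 5, and 7; the remaining area numbers, the table names, and the `functions_for` helper are hypothetical.

```python
from typing import Dict, Iterable, List

# FIG. 3: area 1 is "scroll up" per the description; the numbering of the
# other two areas is an assumption.
SCROLL_FUNCTIONS: Dict[int, str] = {1: "scroll_up", 2: "scroll_stop", 3: "scroll_down"}

# FIG. 4: areas 1, 5, and 7 are stated to be digits 1, 5, and 7; mapping the
# other digits and numbering areas 10 to 13 this way is an assumption.
DIAL_FUNCTIONS: Dict[int, str] = {n: str(n) for n in range(1, 10)}
DIAL_FUNCTIONS.update({10: "0", 11: "*", 12: "#", 13: "call"})


def functions_for(gazed_areas: Iterable[int], table: Dict[int, str]) -> List[str]:
    """Translate a sequence of gazed areas into the functions they trigger."""
    return [table[area] for area in gazed_areas if area in table]


# Gazing at areas 1, 5, and 7 in sequence enters the digits 1, 5, and 7.
dialed = "".join(functions_for([1, 5, 7], DIAL_FUNCTIONS))
```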

That is, according to embodiments of the present invention, the accuracy of gaze tracking can be improved by tracking the gaze through identification of the on-screen area to which the gaze position belongs, rather than the position coordinates of the gaze. When the gaze is tracked by divided screen areas instead of gaze position coordinates, the number of classes for gaze tracking decreases, so the computation speed and accuracy of gaze tracking increase and the computing resources used in the gaze tracking process can be minimized.
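To make the class-count argument concrete, the sketch below compares the number of possible outputs if every pixel coordinate were treated as its own class with the number of areas used by a model such as the first gaze tracking model. The 1080 x 1920 resolution is a hypothetical value chosen for illustration, and treating each pixel as a class is a simplification of coordinate-based tracking, not something the source states.

```python
from typing import Tuple


def class_counts(width_px: int, height_px: int, num_areas: int) -> Tuple[int, int]:
    """Return (classes if every pixel is a class, classes if areas are classes)."""
    return width_px * height_px, num_areas


# Hypothetical 1080 x 1920 screen divided into the 28 areas of FIG. 2.
per_pixel, per_area = class_counts(1080, 1920, 28)
```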

Returning to FIG. 1, the terminal 100 may further include a photographing unit 108 and a learning unit 110 depending on the embodiment.

The photographing unit 108 includes a photographing device (not shown) and captures a face image of the user through the photographing device. The photographing device may be, for example, a camera or a camcorder, and may be provided on one side of the terminal 100.

The learning unit 110 trains each gaze tracking model in conjunction with the photographing unit 108. The learning unit 110 may train each gaze tracking model using a preset deep learning model, which may be, for example, a convolutional neural network (CNN) model.

Specifically, when a preset action is input by a user gazing at a specific area within the current screen of the terminal 100, the learning unit 110 may train the gaze tracking model by collecting the face image of the user captured through the photographing device at the time the action is input, together with the position information of the specific area. As one example, when the user touches the specific area, the learning unit 110 may train the gaze tracking model by collecting the face image of the user captured at the time of the touch and the position information of the specific area. The learning unit 110 may obtain the user's pupil position coordinates, face position coordinates, pupil direction vector, and the like from the face image, and may train each gaze tracking model by inputting these, together with the position information of the specific area, into the deep learning model.
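The touch-triggered collection described above can be sketched as a handler that pairs a captured frame with the touched area. The `TrainingSample` and `SampleCollector` types and the `capture_face` hook are hypothetical names introduced for this sketch; feature extraction (pupil coordinates, face coordinates, direction vector) and the deep learning model itself are left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class TrainingSample:
    face_image: bytes  # frame captured at the moment of the touch
    area: int          # number of the area the user touched (the label)


@dataclass
class SampleCollector:
    capture_face: Callable[[], bytes]  # hypothetical camera hook
    samples: List[TrainingSample] = field(default_factory=list)

    def on_touch(self, area: int) -> TrainingSample:
        """Capture a face image at touch time and label it with the touched area."""
        sample = TrainingSample(face_image=self.capture_face(), area=area)
        self.samples.append(sample)
        return sample


# Usage with a stand-in camera that always returns the same bytes.
collector = SampleCollector(capture_face=lambda: b"frame")
collector.on_touch(10)
```

The premise, as in the source, is that a user touching an area is also looking at it, so the touched area serves as the gaze label.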

FIG. 5 is an example illustrating the process by which the learning unit 110 trains a gaze tracking model according to an embodiment of the present invention.

Referring to FIG. 5, when the user touches area 10, the learning unit 110 may collect the face image of the user captured at the time of the touch and the position information of area 10. By repeating this process for each of areas 1 to 13, the learning unit 110 may learn the user's gaze information for each divided area. The gaze information learned for each area is input to the corresponding gaze tracking model, and the learning unit 110 may train that gaze tracking model by repeating this process.
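Because the process is repeated for every area, it can help to know which areas still have no samples. The following is a small sketch of such a check, assuming samples are labeled with 1-based area numbers; the `missing_areas` helper is hypothetical and not part of the source.

```python
from collections import Counter
from typing import Iterable, List


def missing_areas(sample_areas: Iterable[int], num_areas: int) -> List[int]:
    """Return the area numbers (1..num_areas) that have no collected samples."""
    counts = Counter(sample_areas)
    return [area for area in range(1, num_areas + 1) if counts[area] == 0]


# Hypothetical labels collected so far for the 13-area model of FIG. 5.
still_needed = missing_areas([10, 1, 1, 5, 7], 13)
```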

According to embodiments of the present invention, each gaze tracking model is trained on training data expressed as divided screen areas instead of gaze position coordinates. Because the number of classes for gaze tracking decreases in this case, the training speed of the gaze tracking model improves and the computing resources used in the training process can be minimized.

FIG. 6 is a flowchart illustrating a gaze tracking method according to an embodiment of the present invention. Although the flowchart describes the method as a plurality of steps, at least some of the steps may be performed in a different order, combined with other steps and performed together, omitted, divided into sub-steps, or supplemented with one or more steps not shown.

In step S102, the terminal 100 launches an application in response to user input.

In step S104, the terminal 100 identifies the launched application.

In step S106, the terminal 100 selects the gaze tracking model corresponding to the identified application and loads the selected gaze tracking model.

In step S108, the terminal 100 tracks the user's gaze using the gaze tracking model. Specifically, the terminal 100 may obtain from the gaze tracking model information about each divided area of the current screen, and may track the gaze by identifying, among the divided areas, the area to which the user's gaze position belongs.

In step S110, the terminal 100 terminates the application in response to user input and removes the gaze tracking model corresponding to the application from memory.
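Steps S106 and S110 together describe a load-on-launch, unload-on-exit lifecycle for each model. The sketch below illustrates that lifecycle under stated assumptions: the `ModelCache` class, its method names, and the `loader` callback are hypothetical, and the loaded "model" is a stand-in object rather than a real network.

```python
from typing import Callable, Dict


class ModelCache:
    """Keeps a gaze tracking model in memory only while its app is running."""

    def __init__(self, loader: Callable[[str], object]) -> None:
        self._loader = loader               # hypothetical: loads a model by app id
        self._loaded: Dict[str, object] = {}

    def on_app_started(self, app_id: str) -> object:
        """S104-S106: identify the app, then select and load its model."""
        if app_id not in self._loaded:
            self._loaded[app_id] = self._loader(app_id)
        return self._loaded[app_id]

    def on_app_terminated(self, app_id: str) -> None:
        """S110: remove the app's model from memory when the app exits."""
        self._loaded.pop(app_id, None)

    def is_loaded(self, app_id: str) -> bool:
        return app_id in self._loaded


# Usage with a stand-in loader that returns a plain dict as the "model".
cache = ModelCache(loader=lambda app_id: {"app": app_id})
model = cache.on_app_started("dialer")
loaded_while_running = cache.is_loaded("dialer")
cache.on_app_terminated("dialer")
```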

FIG. 7 is a block diagram illustrating a computing environment 10 including a computing device suitable for use in exemplary embodiments. In the illustrated embodiment, each component may have functions and capabilities different from those described below, and additional components other than those described below may be included.

The illustrated computing environment 10 includes a computing device 12. In one embodiment, the computing device 12 may be the terminal 100, or one or more components included in the terminal 100.

The computing device 12 includes at least one processor 14, a computer-readable storage medium 16, and a communication bus 18. The processor 14 may cause the computing device 12 to operate according to the exemplary embodiments described above. For example, the processor 14 may execute one or more programs stored in the computer-readable storage medium 16. The one or more programs may include one or more computer-executable instructions which, when executed by the processor 14, may be configured to cause the computing device 12 to perform operations according to the exemplary embodiments.

The computer-readable storage medium 16 is configured to store computer-executable instructions or program code, program data, and/or other suitable forms of information. A program 20 stored in the computer-readable storage medium 16 includes a set of instructions executable by the processor 14. In one embodiment, the computer-readable storage medium 16 may be memory (volatile memory such as random access memory, non-volatile memory, or a suitable combination thereof), one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, any other form of storage medium that can be accessed by the computing device 12 and can store the desired information, or a suitable combination thereof.

The communication bus 18 interconnects the various components of the computing device 12, including the processor 14 and the computer-readable storage medium 16.

The computing device 12 may also include one or more input/output interfaces 22, which provide an interface for one or more input/output devices 24, and one or more network communication interfaces 26. The input/output interface 22 may include the above-described scroll screen 102, input interface 104, input screen 105, and the like. The input/output interface 22 and the network communication interface 26 are connected to the communication bus 18. The input/output device 24 may be connected to other components of the computing device 12 through the input/output interface 22. Exemplary input/output devices 24 may include input devices such as a pointing device (a mouse, a trackpad, or the like), a keyboard, a touch input device (a touchpad, a touchscreen, or the like), a voice or sound input device, various types of sensor devices, and/or a photographing device, and/or output devices such as a display device, a printer, a speaker, and/or a network card. The exemplary input/output device 24 may be included inside the computing device 12 as a component of the computing device 12, or may be connected to the computing device 12 as a separate device distinct from the computing device 12.

Although the present invention has been described in detail above through representative embodiments, those of ordinary skill in the art to which the present invention pertains will understand that various modifications can be made to the above-described embodiments without departing from the scope of the present invention. Therefore, the scope of rights of the present invention should not be limited to the described embodiments, but should be defined by the claims below and by their equivalents.

100: gaze tracking system
102: model selection unit
104: gaze tracking unit
106: memory removal unit
108: photographing unit
110: learning unit

Claims

A terminal capable of tracking a gaze of a user, the terminal comprising:
a model selection unit configured to select, from among a plurality of preset gaze tracking models, a gaze tracking model corresponding to a current screen of the terminal; and
a gaze tracking unit configured to track the gaze of the user on the current screen using the selected gaze tracking model,
wherein the gaze tracking model is configured to divide the current screen into a plurality of preset areas and to track the gaze by identifying, among the plurality of divided areas, the area to which the position of the gaze belongs.

The terminal according to claim 1,
wherein the model selection unit identifies, among a plurality of applications installed in the terminal, the application outputting the current screen, and selects the gaze tracking model corresponding to the identified application.

The terminal according to claim 2,
wherein the position, size, and number of the divided areas differ for each gaze tracking model, and
each gaze tracking model corresponds to one or more different applications among the plurality of applications.

The terminal according to claim 3,
wherein each of the areas is an area partitioned so that the user can provide an input to it in order to perform a different preset function.

The terminal according to claim 1,
further comprising a learning unit configured to, when a preset action is received from the user gazing at a specific area in the current screen, train the gaze tracking model by collecting a face image of the user photographed at the time the action is received and position information of the specific area.

The terminal according to claim 5,
wherein, when the user touches the specific area, the learning unit collects the face image of the user photographed at the time the touch is made and the position information of the specific area.

A gaze tracking method performed in a terminal capable of tracking a gaze of a user, the method comprising:
selecting, by a model selection unit of the terminal, a gaze tracking model corresponding to a current screen of the terminal from among a plurality of preset gaze tracking models; and
tracking, by a gaze tracking unit of the terminal, the gaze of the user on the current screen using the selected gaze tracking model,
wherein the gaze tracking model is configured to divide the current screen into a plurality of preset areas and to track the gaze by identifying, among the plurality of divided areas, the area to which the position of the gaze belongs.

The method according to claim 7,
wherein the selecting of the gaze tracking model comprises identifying, among a plurality of applications installed in the terminal, the application outputting the current screen, and selecting the gaze tracking model corresponding to the identified application.

The method according to claim 8,
wherein the position, size, and number of the divided areas differ for each gaze tracking model, and
each gaze tracking model corresponds to one or more different applications among the plurality of applications.

The method according to claim 9,
wherein each of the areas is an area partitioned so that the user can provide an input to it in order to perform a different preset function.

The method according to claim 7,
further comprising training, by a learning unit of the terminal, the gaze tracking model by collecting, when a preset action is received from the user gazing at a specific area in the current screen, a face image of the user photographed at the time the action is received and position information of the specific area.

The method according to claim 11,
wherein the training of the gaze tracking model comprises, when the user touches the specific area, collecting the face image of the user photographed at the time the touch is made and the position information of the specific area.
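
For illustration only, the touch-triggered collection of training data recited above can be sketched as below. This is not part of the claims and is not the applicant's implementation: the `Region` type, the rectangular area layout, the functions `find_region` and `collect_sample`, and the use of raw bytes as a stand-in for a face image are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Region:
    """One preset screen area (hypothetical rectangular layout, in pixels)."""

    name: str
    x: int
    y: int
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        # Half-open bounds so that adjacent regions never both claim a point.
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def find_region(regions: List[Region], px: int, py: int) -> Optional[Region]:
    """Return the region containing the point, or None if it is off-layout."""
    for region in regions:
        if region.contains(px, py):
            return region
    return None


def collect_sample(regions: List[Region],
                   touch_xy: Tuple[int, int],
                   face_image: bytes) -> Optional[Tuple[bytes, str]]:
    """Pair the face image captured at touch time with the touched region.

    Assumption: the user is looking at the area being touched, so the region
    name can serve as the training label for the captured image.
    """
    region = find_region(regions, touch_xy[0], touch_xy[1])
    if region is None:
        return None
    return (face_image, region.name)


# Usage: a screen split into two halves, and one touch in the right half.
layout = [Region("left", 0, 0, 540, 1920), Region("right", 540, 0, 540, 1920)]
sample = collect_sample(layout, (600, 10), b"raw-camera-frame")
```

Labeling by area rather than by exact coordinates matches the area-identification approach of the gaze tracking model: the model only needs to decide which preset area the gaze falls in.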