KR102092931B1

KR102092931B1 - Method for eye-tracking and user terminal for executing the same

Info

Publication number: KR102092931B1
Application number: KR1020170167334A
Authority: KR
Inventors: 석윤찬; 이태희
Original assignee: 주식회사 비주얼캠프
Priority date: 2017-09-13
Filing date: 2017-12-07
Publication date: 2020-03-24
Also published as: KR20190030140A; CN110476141A

Abstract

A gaze tracking method and a user terminal for performing the same are provided. A user terminal according to an embodiment of the present invention includes: a photographing device that captures a face image of a user; and a gaze tracking unit that obtains, from the face image and based on a set rule, a vector indicating the direction in which the user's face is facing and a pupil image of the user, and that tracks the user's gaze by inputting the face image, the vector, and the pupil image into a set deep learning model.

Description

METHOD FOR EYE-TRACKING AND USER TERMINAL FOR EXECUTING THE SAME

The present invention relates to eye-tracking technology.

Eye tracking is a technology that detects a user's eye movements and tracks the position of the gaze. Methods such as image analysis, contact lenses, and attached sensors may be used. The image analysis method detects pupil movement by analyzing real-time camera images and calculates the gaze direction relative to a fixed position reflected on the cornea. The contact lens method uses light reflected from a contact lens with an embedded mirror, or the magnetic field of a contact lens with an embedded coil; it is less convenient but highly accurate. The sensor attachment method places sensors around the eye and detects eye movement from changes in the electric field caused by that movement; it can detect eye movement even when the eyes are closed (for example, during sleep).

Recently, the range of devices and fields to which eye-tracking technology is applied has been expanding, and attempts to use the technology when providing advertising services on terminals such as smartphones are increasing accordingly. To provide an efficient advertising service, however, the accuracy of gaze tracking needs to be improved further, and the bidding scheme, reward scheme, and the like that depend on advertisement viewing need to be configured efficiently.

Korean Patent Registration No. 10-1479471 (2015.01.13)

According to an exemplary embodiment of the present invention, there is provided a user terminal including: a photographing device that captures a face image of a user; and a gaze tracking unit that obtains, from the face image and based on a set rule, a vector indicating the direction in which the user's face is facing and a pupil image of the user, and that tracks the user's gaze by inputting the face image, the vector, and the pupil image into a set deep learning model.

The user terminal may further include a learning data collection unit that, when a set action is received from a gazer looking at a set point on the screen, collects learning data including a face image of the gazer captured at the time the action is received and position information of the set point. The gaze tracking unit may train the deep learning model on the learning data and track the user's gaze using the deep learning model trained on the learning data.

When the gazer touches the point, the learning data collection unit may collect the learning data at the time the touch is made.

The learning data collection unit may collect the learning data by operating the photographing device at the time the gazer touches the point.

The learning data collection unit may transmit to a server the learning data collected at the time the gazer touches the point.

When the gazer touches the point while the photographing device is operating, the learning data collection unit may collect the learning data at the time the touch is made and at times a set interval before and after the touch.
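The collection at the touch time and at a set interval before and after it can be illustrated with a minimal Python sketch. It assumes the photographing device is already running and that recent frames are kept in a timestamped buffer; the names `Frame`, `Sample`, `nearest_frame`, and `collect_samples`, and the use of integer millisecond timestamps, are hypothetical and are not taken from the patent.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Tuple


@dataclass
class Frame:
    timestamp_ms: int  # capture time in milliseconds (assumed unit)
    image_id: str      # stand-in for the captured face image


@dataclass
class Sample:
    image_id: str               # face image captured near the requested time
    target_xy: Tuple[int, int]  # screen position of the touched point (the label)


def nearest_frame(frames: Deque[Frame], t_ms: int) -> Frame:
    """Return the buffered frame whose timestamp is closest to t_ms."""
    return min(frames, key=lambda f: abs(f.timestamp_ms - t_ms))


def collect_samples(frames: Deque[Frame], touch_ms: int,
                    target_xy: Tuple[int, int], offset_ms: int) -> List[Sample]:
    """Collect one sample before the touch, one at the touch, and one after it."""
    if not frames:
        raise ValueError("frame buffer is empty")
    times = (touch_ms - offset_ms, touch_ms, touch_ms + offset_ms)
    return [Sample(nearest_frame(frames, t).image_id, target_xy) for t in times]
```

Each sample pairs a face image with the same target position, on the assumption that the gaze stays on the point shortly before and after the touch.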

After the gazer touches the point, the learning data collection unit may change a visual element of the point so that the gazer's gaze stays on the point even after the touch.

The learning data collection unit may display a set phrase at the point and, when the gazer speaks, collect the learning data at the time the utterance begins.

The gaze tracking unit may obtain pupil position coordinates and face position coordinates of the user from the face image based on the rule, and may further input the pupil position coordinates and the face position coordinates into the deep learning model together with the vector indicating the direction in which the user's face is facing.

The user terminal may further include a content providing unit that displays advertisement content on the screen. The gaze tracking unit may determine whether the user is gazing at the advertisement content based on the detected gaze of the user and the position of the advertisement content on the screen, and the content providing unit may change the position of the advertisement content on the screen in consideration of that position and the time for which the user gazed at the advertisement content.

According to another exemplary embodiment of the present invention, there is provided a gaze tracking method including: capturing, by a photographing device, a face image of a user; obtaining, by a gaze tracking unit and based on a set rule, a vector indicating the direction in which the user's face is facing and a pupil image of the user from the face image; and tracking, by the gaze tracking unit, the user's gaze by inputting the face image, the vector, and the pupil image into a set deep learning model.

The gaze tracking method may further include: collecting, by a learning data collection unit, when a set action is received from a gazer looking at a set point on the screen, learning data including a face image of the gazer captured at the time the action is received and position information of the set point; and training, by the gaze tracking unit, the deep learning model on the learning data. The tracking of the user's gaze may use the deep learning model trained on the learning data.

In the collecting of the learning data, when the gazer touches the point, the learning data may be collected at the time the touch is made.

In the collecting of the learning data, the learning data may be collected by operating the photographing device at the time the gazer touches the point.

The gaze tracking method may further include transmitting, by the learning data collection unit, the learning data collected at the time the gazer touches the point to a server.

In the collecting of the learning data, when the gazer touches the point while the photographing device is operating, the learning data may be collected at the time the touch is made and at times a set interval before and after the touch.

The gaze tracking method may further include changing, by the learning data collection unit, a visual element of the point after the gazer touches the point so that the gazer's gaze stays on the point even after the touch.

In the collecting of the learning data, a set phrase may be displayed at the point and, when the gazer speaks, the learning data may be collected at the time the utterance begins.

The gaze tracking method may further include obtaining, by the gaze tracking unit, pupil position coordinates and face position coordinates of the user from the face image based on the rule. In the tracking of the user's gaze, the pupil position coordinates and the face position coordinates may be further input into the deep learning model together with the vector indicating the direction in which the user's face is facing.

The gaze tracking method may further include: displaying, by a content providing unit, advertisement content on the screen; determining, by the gaze tracking unit, whether the user is gazing at the advertisement content based on the detected gaze of the user and the position of the advertisement content on the screen; and changing, by the content providing unit, the position of the advertisement content on the screen in consideration of that position and the time for which the user gazed at the advertisement content.

Embodiments of the present invention are intended to provide a means of further improving the accuracy of gaze tracking based on a deep learning model.

According to embodiments of the present invention, the accuracy of gaze tracking based on a deep learning model can be further improved by using, as input data of the deep learning model, not only the user's face image and pupil image but also a vector indicating the direction in which the user's face is facing.

In addition, according to embodiments of the present invention, when an action such as a touch or a voice input is received from a gazer looking at a set point on the screen, the face image of the gazer captured at the time the action is received and the position information of the point are used as learning data for the deep learning model for gaze tracking, which can further improve the accuracy and reliability of gaze tracking.

FIG. 1 is a block diagram showing the detailed configuration of an advertisement system according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the detailed configuration of a terminal according to an embodiment of the present invention.
FIG. 3 is a diagram for explaining the process by which a gaze tracking unit according to an embodiment of the present invention tracks a user's gaze.
FIG. 4 is an example of a face vector according to an embodiment of the present invention.
FIG. 5 is an example showing the process of tracking a user's gaze through a deep learning model according to an embodiment of the present invention.
FIG. 6 is an example for explaining the process by which a learning data collection unit according to an embodiment of the present invention collects learning data to be input to the deep learning model.
FIG. 7 is another example for explaining the process by which the learning data collection unit according to an embodiment of the present invention collects learning data to be input to the deep learning model.
FIG. 8 is an example for explaining the process of changing a visual element of a set point when the gazer touches the point in FIG. 7.
FIG. 9 is another example for explaining the process by which the learning data collection unit according to an embodiment of the present invention collects learning data to be input to the deep learning model.
FIG. 10 is an example for explaining a gaze-based bidding scheme according to an embodiment of the present invention.
FIG. 11 is a flowchart for explaining a gaze tracking method according to an embodiment of the present invention.
FIG. 12 is a block diagram illustrating a computing environment including a computing device suitable for use in exemplary embodiments.

Hereinafter, specific embodiments of the present invention will be described with reference to the drawings. The following detailed description is provided to aid a comprehensive understanding of the methods, devices, and/or systems described herein. However, this is merely an example, and the present invention is not limited thereto.

In describing the embodiments of the present invention, a detailed description of known technology related to the present invention is omitted where it is judged that it could unnecessarily obscure the gist of the invention. The terms used below are defined in consideration of their functions in the present invention and may vary according to the intention or custom of a user or operator; their definitions should therefore be based on the content of this specification as a whole. The terminology used in the detailed description is intended only to describe embodiments of the present invention and should in no way be limiting. Unless clearly used otherwise, singular expressions include the plural. In this description, expressions such as "including" or "having" are intended to indicate certain features, numbers, steps, operations, elements, or parts or combinations thereof, and should not be construed as excluding the presence or possibility of one or more other features, numbers, steps, operations, elements, or parts or combinations thereof.

FIG. 1 is a block diagram showing the detailed configuration of an advertisement system 100 according to an embodiment of the present invention. As shown in FIG. 1, the advertisement system 100 includes a user terminal 102, a server 104, an advertiser terminal 106, and a content developer terminal 108.

The user terminal 102 is a device carried by a user to receive various advertising services, and may be, for example, a mobile device such as a smartphone, a tablet PC, or a laptop. However, the type of the user terminal 102 is not limited to these; various communication devices equipped with a screen for displaying advertisement content and a photographing device for photographing the user may serve as the user terminal 102 according to embodiments of the present invention.

The user terminal 102 may have a screen and may display advertisement content on it. The user terminal 102 may also have a photographing device such as a camera or camcorder, and may track the user's gaze from a face image of the user captured by the photographing device. Accordingly, the user terminal 102 may determine whether the user is gazing at the advertisement content based on the detected gaze of the user and the position of the advertisement content on the screen. Here, the user terminal 102 may receive a set mobile application from the server 104 and, through the application, operate together with the screen, the photographing device, and other components of the user terminal 102 to provide the advertisement content and perform the gaze tracking function described above.

In addition, the user terminal 102 may track the user's gaze using a set rule-based algorithm and a deep learning model. Here, the rule-based algorithm is an algorithm used to obtain various data for gaze tracking by means of predetermined video processing techniques, image processing techniques, mathematical formulas, and the like. It may be, for example, a face recognition algorithm (for example, principal component analysis (PCA) or linear discriminant analysis (LDA)), a facial feature point detection algorithm (for example, a support vector machine (SVM) or speeded up robust features (SURF)), an image-based head-tracking algorithm, or an algorithm for extracting the pupils and calculating pupil position coordinates. The deep learning model may be, for example, a convolutional neural network (CNN) model.

The server 104 relays various data for providing the advertising service between the user terminal 102 and the advertiser terminal 106. As shown in FIG. 1, the server 104 may be connected to the terminal 102, the advertiser terminal 106, and the content developer terminal 108 through a network (not shown). At the request of the terminal 102, the server 104 may provide the terminal 102 with a mobile application for providing the advertising service. The terminal 102 may connect to the server 104 through the mobile application and provide the user with the various advertising services offered by the server 104. The server 104 may also, in cooperation with the advertiser terminal 106, receive advertisement content from the content developer terminal 108 and provide it to the terminal 102. Thereafter, the server 104 may collect from the terminal 102 various data related to the advertising effect of the advertisement content (for example, the time and number of times each item of advertisement content was displayed, and the time and number of times each item was gazed at) and provide the data to the advertiser terminal 106.
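The per-content advertising-effect data mentioned above (display time and count, gaze time and count) could be aggregated along the following lines. This is only a sketch in Python: the event tuple format, the `AdStats` record, and the `aggregate` function are assumptions made for illustration, and the patent does not specify how the server 104 stores or computes these figures.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass
class AdStats:
    display_count: int = 0
    display_seconds: float = 0.0
    gaze_count: int = 0
    gaze_seconds: float = 0.0


def aggregate(events: Iterable[Tuple[str, str, float]]) -> Dict[str, AdStats]:
    """Sum display and gaze events per advertisement.

    Each event is (ad_id, kind, seconds), where kind is "display" or "gaze".
    """
    stats: Dict[str, AdStats] = defaultdict(AdStats)
    for ad_id, kind, seconds in events:
        entry = stats[ad_id]
        if kind == "display":
            entry.display_count += 1
            entry.display_seconds += seconds
        elif kind == "gaze":
            entry.gaze_count += 1
            entry.gaze_seconds += seconds
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return dict(stats)
```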

The advertiser terminal 106 is a terminal carried by an advertiser and may be connected to the server 104 through a network. The advertiser terminal 106 may receive the advertiser's selection of at least one of a plurality of items of advertisement content provided by the content developer terminal 108, and may provide information on the selected advertisement content to the server 104. The advertiser terminal 106 may also receive from the server 104 various data related to the advertising effect of the advertisement content.

The content developer terminal 108 is a terminal carried by a developer of advertisement content and may be connected to the server 104 through a network. The content developer terminal 108 may provide advertisement content produced or edited by the content developer to the advertiser terminal 106 via the server 104. The server 104 may receive from the advertiser terminal 106 information on the advertisement content selected by the advertiser, and may provide the advertisement content corresponding to that information to the user terminal 102.

FIG. 2 is a block diagram showing the detailed configuration of the terminal 102 according to an embodiment of the present invention. As shown in FIG. 2, the terminal 102 includes a content providing unit 202, a photographing device 204, a gaze tracking unit 206, and a learning data collection unit 208.

The content providing unit 202 displays advertisement content on the screen of the user terminal 102. As one example, the content providing unit 202 may display advertisement content on a lock screen. The lock screen is the screen displayed when, while the user terminal 102 is in a locked state, a touch for releasing the locked state is received from the user. The content providing unit 202 may display advertisement content in the form of text, images, or video on the lock screen. As another example, when an execution command for a set application, menu, or the like is received from the user, the content providing unit 202 may display advertisement content on the screen in response to the command. However, the screens on which advertisement content is displayed are not limited to these examples, and the advertisement content may be displayed on various preset types of screens.

The photographing device 204 is a device that photographs a user gazing at the screen of the user terminal 102, and may be, for example, a camera or a camcorder. The photographing device 204 may be provided, for example, on the front of the user terminal 102. The user terminal 102 may obtain a face image of the user through the photographing device 204 and track the user's gaze from the face image.

The gaze tracking unit 206 tracks the user's gaze. The gaze tracking unit 206 may track the user's gaze using a set rule-based algorithm and a deep learning model. In the present embodiments, deep learning is a kind of artificial neural network (ANN) based on the theory of human neural networks, and refers to a set of machine learning models or algorithms that denote a deep neural network (DNN) organized in a layer structure with one or more hidden layers between an input layer and an output layer. The gaze tracking unit 206 may track the user's gaze in cooperation with the photographing device 204.

As one example, when the photographing device 204 detects the user's face, the gaze tracking unit 206 may track the user's gaze using the rule-based algorithm and deep learning model described above. As another example, when the photographing device 204 does not detect the user's face, the gaze tracking unit 206 may operate in a sleep mode and suspend the various operations for gaze tracking.
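The switch between tracking and sleep mode can be pictured as a simple per-frame gate. The sketch below is in Python and is purely illustrative; the `Mode` enum and the `GazeTrackerGate` class are hypothetical names, and the patent does not say how the transition is implemented.

```python
from enum import Enum


class Mode(Enum):
    TRACKING = "tracking"
    SLEEP = "sleep"


class GazeTrackerGate:
    """Decides, frame by frame, whether gaze tracking should run."""

    def __init__(self) -> None:
        # Start asleep until a face has been seen (an assumption of this sketch).
        self.mode = Mode.SLEEP

    def on_frame(self, face_detected: bool) -> Mode:
        """Track while a face is detected; otherwise suspend tracking."""
        self.mode = Mode.TRACKING if face_detected else Mode.SLEEP
        return self.mode
```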

When the photographing device 204 detects the user's face, the gaze tracking unit 206 may obtain the face image of the user captured by the photographing device 204 and, based on a set rule, obtain from the face image a vector indicating the direction in which the user's face is facing and a pupil image of the user. The gaze tracking unit 206 may then track the user's gaze by inputting the face image, the vector, and the pupil image into the deep learning model 210. Here, the deep learning model is assumed to have been trained in advance on a sufficient amount of learning data collected by the learning data collection unit 208. The gaze tracking unit 206 may also obtain, based on the rule, the user's pupil position coordinates, face position coordinates, pupil direction vector, and the like from the face image, and input them into the deep learning model 210. By inputting into the deep learning model 210 not only the images of the user's face and pupils but also the various quantitative gaze tracking data obtained on a rule basis, the gaze tracking unit 206 can further improve the accuracy of gaze tracking.
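The combination of image inputs and rule-based quantitative inputs can be sketched as a single input bundle handed to the model. The Python below is a sketch under stated assumptions: `GazeInputs`, `numeric_features`, and `track_gaze` are hypothetical names, images are represented as plain nested lists, and the model is any callable supplied by the caller rather than the patent's deep learning model 210.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Image = List[List[float]]  # grayscale image as rows of pixel values (assumed format)


@dataclass
class GazeInputs:
    face_image: Image
    pupil_image: Image
    face_vector: Tuple[float, float, float]  # direction the face is pointing
    pupil_xy: Tuple[float, float]            # pupil position coordinates
    face_xy: Tuple[float, float]             # face position coordinates

    def numeric_features(self) -> List[float]:
        """Flatten the rule-based quantities into one feature list."""
        return [*self.face_vector, *self.pupil_xy, *self.face_xy]


# The model takes both images and the numeric features and returns a
# gaze point (x, y) on the screen.
Model = Callable[[Image, Image, List[float]], Tuple[float, float]]


def track_gaze(inputs: GazeInputs, model: Model) -> Tuple[float, float]:
    """Feed the images and the rule-based features to the model together."""
    return model(inputs.face_image, inputs.pupil_image, inputs.numeric_features())
```

In a real system the callable would wrap a trained network, for example a CNN with image branches and a small dense branch for the numeric features; that architecture is an assumption here, not something the text above specifies.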

In addition, the gaze tracking unit 206 may determine whether the user is gazing at the advertisement content based on the detected gaze of the user and the position of the advertisement content on the screen. As described later, the content providing unit 202 may change the position of the advertisement content on the screen in consideration of that position and the time for which the user gazed at the advertisement content.
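One simple way to decide whether the user is gazing at the advertisement content is to test whether the estimated gaze point falls inside the content's on-screen rectangle, and to accumulate gaze time from the samples that do. The Python sketch below assumes a rectangular ad region and gaze samples taken at a fixed period; `Rect`, `contains`, and `dwell_seconds` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Rect:
    left: float
    top: float
    width: float
    height: float

    def contains(self, x: float, y: float) -> bool:
        """True if (x, y) lies inside the rectangle (right/bottom edges excluded)."""
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)


def dwell_seconds(gaze_points: Iterable[Tuple[float, float]],
                  ad_region: Rect, sample_period: float) -> float:
    """Estimate gaze time as (number of samples inside the ad) * sample period."""
    hits = sum(1 for x, y in gaze_points if ad_region.contains(x, y))
    return hits * sample_period
```

The resulting gaze time, together with the region's current position, is the kind of information the content providing unit 202 could consult when repositioning the advertisement.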

FIG. 3 is a diagram for explaining a process in which the gaze tracking unit 206 tracks the user's gaze according to an embodiment of the present invention, and FIG. 4 is an example of a face vector according to an embodiment of the present invention. FIG. 5 is an example showing a process of tracking the user's gaze through the deep learning model 210 according to an embodiment of the present invention.

Referring to FIG. 3, the gaze tracking unit 206 may apply a rule-based algorithm to the face image of the user acquired through the photographing device 204 to obtain a vector representing the direction in which the user's face is directed, a pupil image, pupil location coordinates, and the like. In general, a user who gazes at a specific point turns his or her face toward that point, so the direction in which the face is directed is highly likely to coincide with the user's gaze direction. Accordingly, in embodiments of the present invention, the gaze tracking unit 206 uses not only the face image and the pupil image of the user but also the vector representing the direction in which the user's face is directed as input data of the deep learning model 210, thereby further improving the accuracy of gaze tracking. The gaze tracking unit 206 may, for example, extract a feature vector of the face image through a predetermined feature point extraction algorithm and obtain from the feature vector the vector representing the direction in which the user's face is directed, that is, a face vector. An example of a face vector obtained in this way is shown in FIG. 4. In addition, the gaze tracking unit 206 may detect an eye region from the face image through an image processing technique and obtain an image of the eye region (that is, the pupil image) and the location coordinates of the iris or pupil. The gaze tracking unit 206 may also detect the user's face region within the entire screen and obtain the location coordinates of the face region. The gaze tracking unit 206 may input the vector, the pupil image/location coordinates, the face image/location coordinates, and the like obtained in this way into the deep learning model 210.
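
As a rough illustration of the rule-based stage, the sketch below derives a face vector from three facial feature points and bundles it with the other inputs passed to the model. It rests on assumptions that are not in the specification: the feature points are taken to be 3-D coordinates of the two eyes and the mouth, the face vector is taken to be the unit normal of the plane through them, and `face_vector` and `ModelInput` are hypothetical names. The specification leaves the actual feature point extraction algorithm open.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


def face_vector(left_eye: Vec3, right_eye: Vec3, mouth: Vec3) -> Vec3:
    """Unit normal of the plane through three 3-D facial feature points.

    Assumption: the normal of the eye-eye-mouth plane approximates the
    direction in which the face is directed.
    """
    u = tuple(r - l for l, r in zip(left_eye, right_eye))  # left eye -> right eye
    v = tuple(m - l for l, m in zip(left_eye, mouth))      # left eye -> mouth
    # Cross product u x v gives a vector perpendicular to the face plane.
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    norm = math.sqrt(sum(c * c for c in n))
    if norm == 0.0:
        raise ValueError("feature points are collinear; no plane is defined")
    return tuple(c / norm for c in n)


@dataclass(frozen=True)
class ModelInput:
    """Hypothetical bundle of the inputs listed in the description."""
    face_image: bytes               # cropped face image
    pupil_image: bytes              # cropped eye-region image
    face_vec: Vec3                  # direction in which the face is directed
    pupil_xy: Tuple[float, float]   # pupil (or iris) location coordinates
    face_xy: Tuple[float, float]    # face-region location coordinates
```

The sign of the resulting normal depends on the coordinate convention and on the order of the feature points, so a real implementation would need to fix both.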

Referring to FIG. 5, the deep learning model 210 may include a plurality of layers arranged in a hierarchical structure, and the input data described above may be input to the layers. The deep learning model 210 may track the user's gaze based on the learning data on which it was trained in advance and the newly input data.

Meanwhile, for the gaze tracking unit 206 to track the user's gaze more accurately using the deep learning model 210, the data used to train the deep learning model 210, that is, the learning data for gaze tracking, must be highly reliable.

To this end, referring again to FIG. 2, the learning data collection unit 208 may collect a large amount of learning data used to train the deep learning model 210 on the basis of gaze actions. Specifically, when a set action is input from a gazer who is gazing at a set point on the screen of the user terminal 102, the learning data collection unit 208 may collect learning data including the face image of the gazer photographed through the photographing device 204 at the time the action is input and the location information of the set point. The action may be, for example, the gazer touching the screen or uttering a voice, and examples of collecting the learning data are as follows.

<Example>

· When the gazer touch-inputs a pattern to release the lock screen, the photographing device 204 operates at the time the gazer's touch input is first made and photographs the gazer's face → the photographed face image of the gazer (or the face image/location coordinates, the vector representing the direction in which the gazer's face is directed, the gazer's pupil image/location coordinates, and the like) and the location information of the point where the pattern was first touched are collected as learning data

· When the gazer touches (or clicks) a set application icon or menu button on the screen, the photographing device 204 operates at the time the gazer's touch input is made and photographs the gazer's face → the photographed face image of the gazer (or the face image/location coordinates, the vector representing the direction in which the gazer's face is directed, the gazer's pupil image/location coordinates, and the like) and the location information of the touched icon or button are collected as learning data

· When a single point is displayed on the screen to induce the gazer to touch it and the gazer touches the point, the photographing device 204 operates at the time the gazer's touch input is made and photographs the gazer's face → the photographed face image of the gazer (or the face image/location coordinates, the vector representing the direction in which the gazer's face is directed, the gazer's pupil image/location coordinates, and the like) and the location information of the touched point are collected as learning data
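
The collection pattern shared by the examples above can be sketched as an event handler that captures one frame when the action occurs and pairs it with the location of the set point. This is only an illustrative sketch: `TrainingSample`, `LearningDataCollector`, and the `capture` callable standing in for the photographing device 204 are hypothetical, and the image is represented as raw bytes purely for simplicity.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class TrainingSample:
    """One labelled example: a face image and the point being gazed at."""
    face_image: bytes
    target_xy: Tuple[int, int]   # location information of the set point
    action: str                  # e.g. "touch" or "voice"


class LearningDataCollector:
    """Collects a sample each time a set action is input (hypothetical API)."""

    def __init__(self, capture: Callable[[], bytes]) -> None:
        # `capture` stands in for the camera; it is invoked only when an
        # action arrives, so the camera need not run continuously.
        self._capture = capture
        self.samples: List[TrainingSample] = []

    def on_action(self, target_xy: Tuple[int, int],
                  action: str = "touch") -> TrainingSample:
        sample = TrainingSample(self._capture(), tuple(target_xy), action)
        self.samples.append(sample)
        return sample
```

For a pattern-unlock gesture, such a handler would be called once, with the coordinates of the first touched point, rather than for every point in the pattern.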

The learning data collected in this way may be input to the deep learning model 210 for training. Specifically, the gaze tracking unit 206 may train the deep learning model 210 with the learning data and track the user's gaze using the deep learning model 210 trained with the learning data. Hereinafter, how the learning data collection unit 208 collects learning data is described in more detail with reference to FIGS. 6 to 9.

FIG. 6 is an example for explaining a process in which the learning data collection unit 208 collects learning data to be input to the deep learning model 210 according to an embodiment of the present invention.

Referring to FIG. 6, the learning data collection unit 208 may display nine points for pattern input on the lock screen. The gazer may then touch-input a predefined Z-shaped pattern to release the lock screen, in the direction from the start point S to the end point E. The learning data collection unit 208 may collect learning data including the face image of the gazer photographed through the photographing device 204 at the time the gazer's touch input is first made, that is, at the time the gazer touches the start point S, and the location information of the start point S.

FIG. 7 is another example for explaining a process in which the learning data collection unit 208 collects learning data to be input to the deep learning model 210 according to an embodiment of the present invention, and FIG. 8 is an example for explaining a process of changing a visual element of a set point when the gazer touches that point in FIG. 7.

Referring to FIG. 7, the learning data collection unit 208 may display button A (back button), button B (forward button), button C (start button), button D (end button), and the like on the screen. If the gazer touches button A, the learning data collection unit 208 may collect learning data including the face image of the gazer photographed through the photographing device 204 at the time the gazer's touch input is made, that is, at the time the gazer touches button A, and the location information of button A.

In addition, the learning data collection unit 208 may change the visual elements of the point after the gazer touches the point so that the gazer's gaze remains at the touched point even after the touch. Here, a visual element is an element needed to visually recognize objects output on the screen, for example the size, shape, color, brightness, or texture of an object output on the screen, of a region containing the object, or of the border line of the object.

Referring to FIG. 8, when the gazer touches button A, the learning data collection unit 208 may display button A in a darker color, thereby inducing the gazer's gaze to remain on button A even after the touch.

Meanwhile, the learning data collection unit 208 may collect the learning data by operating the photographing device 204 at the time the gazer touches the set point. That is, the photographing device 204 normally remains off and is operated by the learning data collection unit 208 to photograph the user at the time the gazer touches the set point, which prevents the battery consumption of the user terminal 102 from increasing due to continuous operation of the photographing device 204. In addition, the learning data collection unit 208 may transmit the face image of the gazer photographed at the time the gazer touches the point and the location information of the point (that is, the learning data collected at the time the point is touched) to the server 104, so that the server 104 can collect and analyze them. The server 104 may collect the learning data from the user terminal 102, store it in a database (not shown), and perform the analysis process that the user terminal 102 performs (for example, extraction of the face vector, the pupil image/location coordinates, and the face image/location coordinates).

In addition, when the gazer touches the set point while the photographing device 204 is operating, the learning data collection unit 208 may collect the learning data at the time the touch is made and at times a set interval before and after it (for example, 1 second before the touch and 1 second after the touch), respectively. In general, a gazer who intends to touch a specific point gazes at that point immediately before and immediately after the touch, so learning data collected immediately before and after the touch, and not only at the time of the actual touch, can also be regarded as highly reliable. That is, according to embodiments of the present invention, when the gazer touches the set point while the photographing device 204 is operating, learning data is collected at the time the touch is made and at times a set interval before and after it, so that a large amount of highly reliable learning data can be collected more easily.
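
One way to picture the before/at/after collection is to select, from a buffer of timestamped frames, the frames closest to the three target times. The sketch below assumes such a buffer exists while the photographing device is operating; `frames_around_touch`, the `offset` default of 1 second (taken from the example in the text), and the `tolerance` parameter are illustrative assumptions.

```python
from typing import Any, List, Tuple

Frame = Tuple[float, Any]  # (timestamp in seconds, image)


def frames_around_touch(frames: List[Frame], touch_time: float,
                        offset: float = 1.0,
                        tolerance: float = 0.1) -> List[Frame]:
    """Pick the buffered frames nearest to touch_time - offset, touch_time,
    and touch_time + offset.

    A target time is skipped when no frame lies within `tolerance` seconds
    of it (for example, when the buffer does not reach that far back).
    """
    selected: List[Frame] = []
    if not frames:
        return selected
    for target in (touch_time - offset, touch_time, touch_time + offset):
        ts, image = min(frames, key=lambda f: abs(f[0] - target))
        if abs(ts - target) <= tolerance:
            selected.append((ts, image))
    return selected
```

Each selected frame would then be paired with the location information of the touched point, exactly as for the frame captured at the touch itself.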

FIG. 9 is another example for explaining a process in which the learning data collection unit 208 collects learning data to be input to the deep learning model 210 according to an embodiment of the present invention.

Referring to FIG. 9, the learning data collection unit 208 may display a set phrase at a specific point and, when the gazer utters a voice in response to the phrase, collect learning data including the face image of the gazer photographed through the photographing device 204 at the time the utterance starts and the location information of the point. As an example, the learning data collection unit 208 may display the phrases “Say the word below” and “Apple” at the top and center of the screen, respectively, and, when the gazer utters a voice to read “Apple” aloud, collect learning data including the face image of the gazer photographed through the photographing device 204 at the time the utterance starts and the location information of the point where the phrase “Apple” is displayed.

As described above, according to embodiments of the present invention, when an action such as a touch or a voice is input from a gazer who is gazing at a set point on the screen, the face image of the gazer photographed at the time the action is input and the location information of the point are used as learning data of the deep learning model 210 for gaze tracking, which further improves the accuracy and reliability of gaze tracking.

FIG. 10 is an example for explaining a gaze-based bidding scheme according to an embodiment of the present invention. The gaze tracking unit 206 may compare the detected gaze of the user with the location of the advertisement content on the screen to determine whether the user is gazing at the advertisement content, and thereby determine at which locations the user gazed at the advertisement content the most. The gaze tracking unit 206 may calculate, for each region, the time for which and the number of times the user gazed at the advertisement content, and provide these to the server 104. The server 104 may then, in cooperation with the advertiser terminal 106, apply different bidding to the advertisement content for each region in which the advertisement content is located.

Referring to FIG. 10, the server 104 may bid 1 dollar for advertisement content in a region the user gazed at relatively much and 0.6 dollars for advertisement content in a region the user gazed at relatively little, and charge the advertiser terminal 106 accordingly.
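
The region-dependent pricing can be sketched as a two-tier rule. The specification only gives the 1 dollar and 0.6 dollar figures and does not say how "relatively much" is decided, so the sketch below assumes, purely for illustration, that a region counts as heavily gazed when its total gaze time is at or above the mean over all regions; `bid_prices` is a hypothetical name.

```python
from typing import Dict


def bid_prices(gaze_seconds: Dict[str, float],
               high: float = 1.0, low: float = 0.6) -> Dict[str, float]:
    """Assign a per-region price from accumulated gaze time.

    Assumption: regions at or above the mean gaze time get the `high`
    price and the rest get the `low` price.
    """
    if not gaze_seconds:
        return {}
    mean = sum(gaze_seconds.values()) / len(gaze_seconds)
    return {region: (high if seconds >= mean else low)
            for region, seconds in gaze_seconds.items()}
```

For example, with 12 seconds of gaze on a top banner and 3 seconds on a bottom banner, the top region would be priced at 1.0 and the bottom region at 0.6 under this assumed rule.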

In addition, the content providing unit 202 may change the location of the advertisement content on the screen in consideration of that location and the time for which the user gazed at the advertisement content. For example, the content providing unit 202 may identify, among a plurality of regions in which advertisement content has been displayed, a region gazed at for a set number of times or a set time or more, and move the currently displayed advertisement content to that region. This induces the user to gaze at the advertisement content more.
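
The relocation rule can be sketched as follows, assuming per-region statistics of total gaze time and gaze count are available. `choose_ad_region` is a hypothetical name, and two details are assumptions not stated in the specification: the advertisement stays where it is if its current region already qualifies, and ties among qualifying regions are broken by total gaze time.

```python
from typing import Dict, Tuple


def choose_ad_region(current: str,
                     stats: Dict[str, Tuple[float, int]],
                     min_seconds: float,
                     min_count: int) -> str:
    """Return the region in which the advertisement should be displayed.

    `stats` maps a region name to (total gaze seconds, gaze count). A region
    qualifies if it meets either the time threshold or the count threshold.
    """
    qualifying = [region for region, (seconds, count) in stats.items()
                  if seconds >= min_seconds or count >= min_count]
    if not qualifying or current in qualifying:
        return current  # nothing better, or already well placed
    return max(qualifying, key=lambda region: stats[region][0])
```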

FIG. 11 is a flowchart for explaining a gaze tracking method according to an embodiment of the present invention. In the illustrated flowchart, the method is described as a plurality of steps, but at least some of the steps may be performed in a different order, performed in combination with other steps, omitted, divided into sub-steps, or supplemented with one or more steps not shown.

In step S102, the content providing unit 202 displays advertisement content on the screen.

In step S104, the gaze tracking unit 206 acquires a face image of the user through the photographing device 204.

In step S106, the gaze tracking unit 206 tracks the user's gaze using the set rule-based algorithm and the deep learning model. The way in which the gaze tracking unit 206 tracks the user's gaze using the rule-based algorithm and the deep learning model has been described in detail above, so a detailed description is omitted here.

In step S108, the gaze tracking unit 206 determines whether the user is gazing at the advertisement content based on the detected gaze of the user and the location of the advertisement content on the screen.

In step S110, when it is determined that the user is gazing at the advertisement content, the gaze tracking unit 206 identifies the location of the advertisement content on the screen, the time for which and the number of times the gazer gazed at the advertisement content, and the like.
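
The gaze time and count identified in step S110 can be accumulated from the per-frame results of step S108. The sketch below assumes those results arrive as chronologically ordered (timestamp, is_gazing) pairs; `gaze_statistics` is a hypothetical name, and counting each transition from "not gazing" to "gazing" as one gaze is an assumption about how the number of gazes is defined.

```python
from typing import Iterable, Tuple


def gaze_statistics(samples: Iterable[Tuple[float, bool]]) -> Tuple[float, int]:
    """Return (total gaze seconds, number of gazes) for one advertisement.

    `samples` are (timestamp in seconds, is_gazing) pairs in time order.
    The interval between two consecutive samples is counted as gaze time
    when the earlier sample was a gazing one.
    """
    total = 0.0
    count = 0
    prev_time = None
    prev_gazing = False
    for timestamp, gazing in samples:
        if prev_gazing and prev_time is not None:
            total += timestamp - prev_time
        if gazing and not prev_gazing:
            count += 1  # a new gaze starts here
        prev_time, prev_gazing = timestamp, gazing
    return total, count
```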

FIG. 12 is a block diagram illustrating a computing environment 10 that includes a computing device suitable for use in exemplary embodiments. In the illustrated embodiment, each component may have functions and capabilities different from those described below, and additional components other than those described below may be included.

The illustrated computing environment 10 includes a computing device 12. In one embodiment, the computing device 12 may be one or more components included in the advertisement system 100 or the user terminal 102.

The computing device 12 includes at least one processor 14, a computer-readable storage medium 16, and a communication bus 18. The processor 14 may cause the computing device 12 to operate according to the exemplary embodiments mentioned above. For example, the processor 14 may execute one or more programs stored in the computer-readable storage medium 16. The one or more programs may include one or more computer-executable instructions, which, when executed by the processor 14, may be configured to cause the computing device 12 to perform operations according to the exemplary embodiments.

The computer-readable storage medium 16 is configured to store computer-executable instructions or program code, program data, and/or other suitable forms of information. A program 20 stored in the computer-readable storage medium 16 includes a set of instructions executable by the processor 14. In one embodiment, the computer-readable storage medium 16 may be a memory (a volatile memory such as a random access memory, a non-volatile memory, or a suitable combination thereof), one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, any other form of storage medium that can be accessed by the computing device 12 and can store desired information, or a suitable combination thereof.

The communication bus 18 interconnects the various other components of the computing device 12, including the processor 14 and the computer-readable storage medium 16.

The computing device 12 may also include one or more input/output interfaces 22 that provide an interface for one or more input/output devices 24, and one or more network communication interfaces 26. The input/output interface 22 may include the above-described scroll screen 102, input interface 104, input screen 105, and the like. The input/output interface 22 and the network communication interface 26 are connected to the communication bus 18. The input/output device 24 may be connected to other components of the computing device 12 through the input/output interface 22. Exemplary input/output devices 24 may include input devices such as a pointing device (a mouse, a trackpad, or the like), a keyboard, a touch input device (a touchpad, a touchscreen, or the like), a voice or sound input device, various types of sensor devices, and/or a photographing device, and/or output devices such as a display device, a printer, a speaker, and/or a network card. The exemplary input/output device 24 may be included inside the computing device 12 as a component of the computing device 12, or may be connected to the computing device 12 as a separate device distinct from the computing device 12.

Although the present invention has been described in detail above through representative embodiments, those of ordinary skill in the art to which the present invention pertains will understand that various modifications can be made to the above-described embodiments without departing from the scope of the present invention. Therefore, the scope of rights of the present invention should not be limited to the described embodiments, but should be determined by the claims set out below and their equivalents.

100: advertisement system
102: user terminal
104: server
106: advertiser terminal
108: content developer terminal
202: content providing unit
204: photographing device
206: gaze tracking unit
208: learning data collection unit
210: deep learning model
212: reward providing unit

Claims

A user terminal comprising:
a photographing device for photographing a face image of a user; and
a gaze tracking unit that obtains, from the face image and based on a set rule, a vector representing the direction in which the user's face is directed and a pupil image of the user, and tracks the user's gaze by inputting the face image, the vector, and the pupil image into a set deep learning model,
wherein the photographing device, when a set action is input from a gazer gazing at a set point on a screen, photographs a face image of the gazer at the time the action is input,
wherein the gaze tracking unit obtains the vector and the pupil image from the face image photographed at the time the action is input,
wherein the user terminal
further comprises a learning data collection unit that collects learning data including the face image of the gazer photographed at the time the action is input and location information of the set point, and
wherein the gaze tracking unit trains the deep learning model with the learning data and tracks the user's gaze using the deep learning model trained with the learning data.

delete

The user terminal according to claim 1,
wherein, when the gazer touches the point, the learning data collection unit collects the learning data at the time the touch is made.

The user terminal according to claim 3,
wherein the photographing device normally remains off and operates at the time the gazer touches the set point to photograph the face image of the gazer, and
wherein the learning data collection unit collects the learning data at the time the gazer touches the point.

The user terminal according to claim 3,
wherein the learning data collection unit transmits, to a server, the learning data collected at the time the gazer touches the point.

The user terminal according to claim 3,
wherein, when the gazer touches the point while the photographing device is operating, the learning data collection unit collects the learning data at the time the touch is made and at times a set interval before and after the time the touch is made, respectively.

The user terminal according to claim 3,
wherein the learning data collection unit changes a visual element of the point after the gazer touches the point so that the gazer's gaze remains at the point even after the touch.

The user terminal according to claim 1,
wherein the learning data collection unit displays a set phrase at the point and, when the gazer utters a voice, collects the learning data at the time the utterance starts.

The user terminal according to claim 1,
wherein the gaze tracking unit obtains pupil location coordinates and face location coordinates of the user from the face image based on the rule, and further inputs the pupil location coordinates and the face location coordinates into the deep learning model together with the vector representing the direction in which the user's face is directed.

The user terminal according to claim 1,
further comprising a content providing unit that displays advertisement content on the screen,
wherein the gaze tracking unit determines whether the user is gazing at the advertisement content based on the detected gaze of the user and the location of the advertisement content on the screen, and
wherein the content providing unit changes the location of the advertisement content on the screen in consideration of the location of the advertisement content on the screen and the time for which the user gazed at the advertisement content.

A gaze tracking method comprising: photographing, by a photographing device, a face image of a user;
obtaining, by a gaze tracking unit and based on a set rule, a vector representing the direction in which the user's face is directed and a pupil image of the user from the face image; and
tracking, by the gaze tracking unit, the user's gaze by inputting the face image, the vector, and the pupil image into a set deep learning model,
wherein the gaze tracking method
further comprises photographing, by the photographing device, when a set action is input from a gazer gazing at a set point on a screen, a face image of the gazer at the time the action is input,
wherein, in the tracking of the gaze, the vector and the pupil image are obtained from the face image photographed at the time the action is input,
wherein the gaze tracking method further comprises collecting, by a learning data collection unit, learning data including the face image of the gazer photographed at the time the action is input and location information of the set point; and
training, by the gaze tracking unit, the deep learning model with the learning data, and
wherein the tracking of the user's gaze comprises tracking the user's gaze using the deep learning model trained with the learning data.

delete

The gaze tracking method according to claim 11,
wherein the collecting of the learning data comprises, when the gazer touches the point, collecting the learning data at the time the touch is made.

The method according to claim 13,
wherein the photographing device normally remains off and is activated at the time the candidate touches the set point, to photograph the face image of the candidate, and
the step of collecting the learning data includes collecting the learning data at the time the candidate touches the point.

The method according to claim 13,
further comprising transmitting, by the learning data collection unit, the collected learning data to a server when the candidate touches the point.

The method according to claim 13,
wherein, in the step of collecting the learning data, when the candidate touches the point while the photographing device is operating, the learning data is collected separately at the time the touch is made and at times a set interval before and after the time the touch is made.
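
As a rough illustration of collecting samples at the touch time and at a set interval before and after it, the sketch below picks the nearest frame to each of the three instants from a list of timestamped frames. It assumes the device has been buffering frames while operating; the function names, the millisecond timestamps, and the nearest-frame selection are assumptions, not details from the claim.

```python
from bisect import bisect_left
from typing import Any, List, Tuple

Frame = Tuple[int, Any]  # (timestamp in milliseconds, image data)


def nearest_frame(frames: List[Frame], t: int) -> Frame:
    """Return the frame whose timestamp is closest to t.

    `frames` must be non-empty and sorted by timestamp.
    """
    times = [ts for ts, _ in frames]
    i = bisect_left(times, t)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = frames[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda f: abs(f[0] - t))


def frames_around_touch(frames: List[Frame], touch_ms: int,
                        offset_ms: int) -> List[Frame]:
    """Select frames at touch_ms - offset_ms, touch_ms, and touch_ms + offset_ms."""
    return [nearest_frame(frames, touch_ms + d)
            for d in (-offset_ms, 0, offset_ms)]
```

Each of the three selected frames would then be stored as its own learning-data record with the same point location.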

The method according to claim 13,
further comprising changing, by the learning data collection unit, a visual element of the point after the candidate touches the point so that the candidate's gaze remains on the point even after the touch.

The method according to claim 11,
wherein the step of collecting the learning data includes displaying a set phrase at the point and, when the candidate speaks, collecting the learning data at the time the utterance starts.

The method according to claim 11,
further comprising obtaining, by the gaze tracking unit, pupil position coordinates and face position coordinates of the user from the face image based on the rule,
wherein the step of tracking the user's gaze comprises inputting the pupil position coordinates and the face position coordinates into the deep learning model together with the vector indicating the direction in which the user's face is oriented.

The method according to claim 11,
further comprising:
displaying, by a content providing unit, advertisement content on the screen;
determining, by the gaze tracking unit, whether the user is gazing at the advertisement content based on the detected gaze of the user and the location of the advertisement content on the screen; and
changing, by the content providing unit, the location of the advertisement content on the screen in consideration of the location of the advertisement content on the screen and the length of time the user gazes at the advertisement content.
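
One possible reading of the repositioning step is sketched below in Python. The claim does not specify a policy, so the rule used here (keep the advertisement in place if it was gazed at for at least `min_dwell_ms`, otherwise move it to the candidate region that attracted the most gaze) is purely an assumption, as are the names `Rect`, `dwell_time_ms`, and `choose_ad_position` and the fixed sampling period.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Rect:
    """Axis-aligned screen region in pixels."""
    x: float
    y: float
    w: float
    h: float

    def contains(self, px: float, py: float) -> bool:
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h


def dwell_time_ms(gaze: Iterable[Tuple[float, float]], rect: Rect,
                  period_ms: int) -> int:
    """Total time the gaze fell inside rect, given fixed-period gaze samples."""
    return sum(period_ms for x, y in gaze if rect.contains(x, y))


def choose_ad_position(gaze: List[Tuple[float, float]], current: Rect,
                       candidates: List[Rect], period_ms: int,
                       min_dwell_ms: int) -> Rect:
    """Keep the ad where it is if it was watched long enough; else relocate."""
    if not candidates or dwell_time_ms(gaze, current, period_ms) >= min_dwell_ms:
        return current
    # Assumed policy: move to the candidate region with the longest dwell time.
    return max(candidates, key=lambda r: dwell_time_ms(gaze, r, period_ms))
```

Other policies consistent with the claim are possible, for example moving the advertisement only after repeated short dwell times or weighting recent gaze samples more heavily.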