KR20210085270A

KR20210085270A - Apparatus and method for detecting interest in gazed objects

Info

Publication number: KR20210085270A
Application number: KR1020190178141A
Authority: KR
Inventors: 김호원; 유장희; 한병옥
Original assignee: 한국전자통신연구원
Priority date: 2019-12-30
Filing date: 2019-12-30
Publication date: 2021-07-08
Also published as: KR102477231B1

Abstract

An apparatus and a method for detecting interest in a gaze target are disclosed. A gaze target interest detection apparatus according to an embodiment of the present invention comprises one or more processors and an execution memory storing at least one program executed by the one or more processors. The at least one program receives, from at least one image sensor, an image capturing a user gazing at a display; receives tag information inserted into tag-based content about a gaze target output through the display; detects, from the image, facial feature information of the user looking at the gaze target and gaze information obtained by tracking the gaze of the user looking at the gaze target; and detects the user's degree of interest in the gaze target based on the facial feature information, the gaze information, and the tag information.

Description

Apparatus and method for detecting interest in gazed objects {APPARATUS AND METHOD FOR DETECTING INTEREST IN GAZED OBJECTS}

The present invention relates to technology for detecting interest in an object and, more particularly, to technology for detecting a user's interest in an object at which the user gazes.

A conventional method of detecting a user's interest in an object, such as Tobii's eye-tracking system, uses an image sensor system fixedly attached to the top or bottom of a monitor so that it can capture the face of a user seated on a chair positioned to view the monitor at close range. As preprocessing, a user-monitor calibration is performed by detecting the gaze direction of the user gazing at specific positions on the monitor. Then, provided the face does not leave its calibration-time position, the point on the screen at which the user is looking is detected in order to find the object on which the gaze point dwells. This method requires a calibration process, severely restricts the user's movement after calibration, and detects only the gaze on a target. If the user leaves the calibration position or turns the head substantially, gaze point detection becomes difficult.

Another method detects the user's gaze movement and gaze point by attaching an image sensor that directly captures the user's eyes to eyewear equipment such as an HMD or AR eyeglasses. An image sensor that captures the space in the direction the user is looking may also be attached to the outside of the eyewear to detect where the user's gaze is directed. These methods are mainly used to provide virtual or augmented reality services to users through VR or AR devices. Because the eye-facing image sensor always views the eye from a fixed viewpoint and uses active light, such as infrared, that facilitates pupil detection, the gaze can be detected precisely. However, the user must wear eyewear equipped with a separate image sensor, which is inconvenient, and a smart terminal that provides services such as content has no way of determining the interest of users who do not wear the eyewear or who decline to provide information.

Meanwhile, Korean Patent Registration No. 10-1647969, "User gaze detection apparatus and method for detecting a user's gaze, and computer program for executing the method," discloses a user gaze detection apparatus and method that photograph a user with a depth camera and use the captured depth map to detect the user's gaze effectively.

An object of the present invention is to detect a user's level of interest and interactively provide content that reflects that level of interest.

Another object of the present invention is to maximize advertising effectiveness by detecting a user's level of interest.

Another object of the present invention is to provide diagnostic information based on the characteristics of a user's gaze by detecting the user's level of interest.

Another object of the present invention is to provide a conversational interaction service by detecting a user's level of interest.

To achieve the above objects, a gaze target interest detection apparatus according to an embodiment of the present invention includes one or more processors and an execution memory storing at least one program executed by the one or more processors. The at least one program may receive, from at least one image sensor, an image capturing a user gazing at a display; receive tag information inserted in tag-based content about a gaze target output through the display; detect, from the image, facial feature information of the user looking at the gaze target and gaze information obtained by tracking the gaze of the user looking at the gaze target; and detect the user's interest in the gaze target based on the facial feature information, the gaze information, and the tag information.

In this case, the tag information may include at least one of the position, region, size, attribute, product category, and material of the gaze target output on the display.

In this case, when the gaze information indicates that the user's gaze point has been maintained for a preset time, the at least one program may detect the user's interest level by detecting, based on the facial feature information, changes in the user's facial features during the preset time.

In this case, the user's interest level may be detected as at least one of indifference, liking, and dislike according to the change in the facial features.

In this case, the at least one program may detect the user's interest level based on at least one of a change in the feature points of the user's mouth and a change in the feature points of the user's eyebrows among the changes in the user's facial features.

Also, to achieve the above objects, a gaze target interest detection method according to an embodiment of the present invention, performed by a gaze target interest detection apparatus, includes: receiving, from at least one image sensor, an image capturing a user gazing at a display, and receiving tag information inserted in tag-based content about a gaze target output through the display; and detecting, from the image, facial feature information of the user looking at the gaze target and gaze information obtained by tracking the gaze of the user looking at the gaze target, and detecting the user's interest in the gaze target based on the facial feature information, the gaze information, and the tag information.

In this case, when the gaze information indicates that the user's gaze point has been maintained for a preset time, the step of detecting the user's interest may detect the user's interest level by detecting, based on the facial feature information, changes in the user's facial features during the preset time.

The user's interest level may be detected as at least one of indifference, liking, and dislike according to the change in the facial features.

In this case, the step of detecting the user's interest may detect the user's interest level based on at least one of a change in the feature points of the user's mouth and a change in the feature points of the user's eyebrows among the changes in the user's facial features.

The present invention can detect a user's level of interest and interactively provide content that reflects that level of interest.

In addition, the present invention can maximize advertising effectiveness by detecting a user's level of interest.

In addition, the present invention can provide diagnostic information based on the characteristics of a user's gaze by detecting the user's level of interest.

In addition, the present invention can provide a conversational interaction service by detecting a user's level of interest.

FIG. 1 is a block diagram illustrating a gaze target interest detection apparatus according to an embodiment of the present invention.
FIG. 2 is a detailed block diagram illustrating an example of the gaze target interest detection unit shown in FIG. 1.
FIGS. 3 and 4 are diagrams illustrating gaze target interest detection apparatuses using a smart device and a fixed-site display apparatus according to an embodiment of the present invention.
FIG. 5 is an operation flowchart illustrating a gaze target interest detection method according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating a computer system according to an embodiment of the present invention.

The present invention will now be described in detail with reference to the accompanying drawings. Repeated descriptions, and detailed descriptions of well-known functions and configurations that could unnecessarily obscure the gist of the present invention, are omitted. The embodiments of the present invention are provided to explain the present invention more completely to those of ordinary skill in the art. Accordingly, the shapes and sizes of elements in the drawings may be exaggerated for clarity.

Throughout the specification, when a part is said to "include" an element, this means that it may further include other elements, rather than excluding them, unless specifically stated otherwise.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a block diagram illustrating a gaze target interest detection apparatus according to an embodiment of the present invention. FIG. 2 is a detailed block diagram illustrating an example of the gaze target interest detection unit shown in FIG. 1. FIGS. 3 and 4 are diagrams illustrating gaze target interest detection apparatuses using a smart device and a fixed-site display apparatus.

Referring to FIG. 1, a gaze target interest detection apparatus according to an embodiment of the present invention includes an image sensor input unit 110, a content input unit 120, a gaze target interest detection unit 130, an interaction application interworking unit 140, and a database unit 150.

The image sensor input unit 110 may receive, from at least one image sensor, an image capturing a user gazing at a display.

The content input unit 120 may receive tag information inserted in tag-based content about a gaze target output through the display.

In this case, the content input unit 120 may receive tag-based content that includes tag information describing, for each frame along the content's timeline, the type or product name of each object displayed on the screen together with its position and region on the screen.

The content input unit 120 may also receive, as tag-based content, a 3D map of the layout and shape of a space that includes tag information describing the position and region of the objects making up the space.
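The specification does not define a concrete format for the tag information, so the following is only a minimal sketch of how per-frame tags might be represented. The class names `Tag` and `TaggedContent`, all field names, and the normalized screen coordinates are hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Tag:
    """One tagged gaze target in a frame (all field names are hypothetical)."""
    name: str                                   # object type or product name
    region: Tuple[float, float, float, float]   # x, y, width, height in normalized screen units
    category: str = ""                          # optional product category
    material: str = ""                          # optional material attribute


@dataclass
class TaggedContent:
    """Tag-based content: maps a frame index to the tags visible in that frame."""
    frames: Dict[int, List[Tag]] = field(default_factory=dict)

    def tags_at(self, frame_index: int) -> List[Tag]:
        # Frames without any tag yield an empty list.
        return self.frames.get(frame_index, [])


# Illustrative content with two tagged frames.
content = TaggedContent(frames={
    0: [Tag("sneaker", (0.10, 0.20, 0.30, 0.40), category="shoes", material="leather")],
    1: [Tag("sneaker", (0.12, 0.20, 0.30, 0.40), category="shoes", material="leather"),
        Tag("handbag", (0.60, 0.25, 0.20, 0.30), category="bags")],
})
```

Keying the tags by frame index mirrors the per-frame, time-series description above; a 3D map of a physical space could presumably use the same `Tag` idea with 3D positions instead of screen regions.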

The gaze target interest detection unit 130 may detect, from the image, facial feature information of the user looking at the gaze target and gaze information obtained by tracking the gaze of the user looking at the gaze target, and may detect the user's interest in the gaze target based on the facial feature information, the gaze information, and the tag information.

In this case, the gaze target interest detection unit 130 may receive the input image and the tag information and, when the user gazes at some object in the content or in the space, identify the gaze target and detect the interest in that gaze target.

In this case, when the gaze information indicates that the user's gaze point has been maintained for a preset time, the gaze target interest detection unit 130 may detect the user's interest level by detecting, based on the facial feature information, changes in the user's facial features during the preset time.

In this case, the gaze target interest detection unit 130 may detect the user's interest level based on at least one of a change in the feature points of the user's mouth and a change in the feature points of the user's eyebrows among the changes in the user's facial features.

Referring to FIG. 2, the gaze target interest detection unit 130 may include a facial feature detection unit 131, a time-series tag parser unit 132, an interest detection unit 133, a gaze tracking unit 134, a gaze target detection unit 135, and an interest recording unit 136.

The facial feature detection unit 131 may detect facial feature information for each image received from the image sensor input unit 110. The facial feature information used here may cover the eyes, nose, mouth, eyebrows, ears, jawline, and other parts of the face. It may be defined as facial landmarks consisting of several to several dozen feature points or, for a 3D face model built from a triangular or quadrilateral mesh, as the facial feature points corresponding to the vertices of the mesh.

The facial feature points may include the characteristics of the elements that make up the face.

In this case, the facial feature detection unit 131 may use a facial feature information detection technique such as PRNet. However, the 3D model that PRNet outputs may correspond to a scaled 3D model, expressed in a normalized 3D space with no defined scale.

In this case, to detect facial feature information that corresponds to physical space, the facial feature detection unit 131 may remove the scale information by using the correspondences between feature points of the scaled 3D models obtained from the individual image sensors.

In this case, after removing the scale information from the scaled 3D model, the facial feature detection unit 131 may detect 3D facial features that correspond to the actual size of the face, expressed in the 3D spatial coordinate system obtained through camera calibration.

In this case, using a plurality of image sensors, the facial feature detection unit 131 may perform a visibility test on the 3D model obtained from each image sensor, taking into account the viewpoint of each image sensor and the position of the user.

In this case, the facial feature detection unit 131 may selectively collect the 3D model information obtained from the images with maximum visibility.

In this case, the facial feature detection unit 131 may integrate, in a single 3D spatial coordinate system, the information that was not visible to each individual image sensor.

In this case, the facial feature detection unit 131 may represent all of the information from the multiple cameras consistently and detect facial features from the integrated information.

In this case, for each individual triangular or quadrilateral mesh face of the 3D model, the facial feature detection unit 131 may record on that face the information from the image for which the dot product between the face's normal vector and the camera's viewing vector is largest in magnitude in the negative direction (that is, the most negative), thereby obtaining a 3D model with maximized visibility.
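The per-face visibility rule above can be sketched as follows. This is an illustrative sketch, not the patent's implementation: it assumes unit-length vectors, treats "the camera's viewing vector" as the direction in which the camera looks, and the function name `most_visible_camera` is hypothetical.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def most_visible_camera(face_normal: Vec3, view_dirs: Sequence[Vec3]) -> int:
    """Return the index of the camera whose viewing direction has the most
    negative dot product with the face normal (unit vectors assumed)."""
    scores = [dot(face_normal, d) for d in view_dirs]
    return min(range(len(scores)), key=scores.__getitem__)


# A mesh face whose normal points toward -z, seen by three cameras.
normal = (0.0, 0.0, -1.0)
cameras = [
    (0.0, 0.0, 1.0),        # looks straight at the face: dot = -1
    (0.7071, 0.0, 0.7071),  # oblique view: dot is about -0.71
    (1.0, 0.0, 0.0),        # grazing view: dot = 0
]
best = most_visible_camera(normal, cameras)
```

A face that looks straight back at a camera has a normal opposite to that camera's viewing direction, which is why the most negative dot product marks the best view.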

In this case, the facial feature detection unit 131 may unwrap the acquired 3D shape and color of the user from the 3D model, in the manner of the UV map used in computer graphics to represent humanoid characters (similar to a picture of a globe flattened into two dimensions).

In this case, the facial feature detection unit 131 may train on the unwrapped images using deep learning techniques, so that all of the user's facial features can be detected simultaneously even when the user turns the head about 90 degrees to the side relative to the display screen.

In this case, the facial feature detection unit 131 can detect facial features even when the user's face is not facing the front, and can efficiently detect the facial features of a user looking around multiple displays or a space.

In this case, the facial feature detection unit 131 may also detect facial features in the individual images first and then derive a single set of facial features using the correspondences between those features.

In this case, when the image sensor is not composed only of a color camera, such as an ordinary webcam, but can actively obtain depth information using patterns, TOF (Time of Flight), active stereo, or the like, as Kinect does, the facial feature detection unit 131 can obtain the same result as the color-camera-only embodiment described above with a single image sensor or a much smaller number of sensors.

The gaze tracking unit 134 may use the 3D facial feature information found by the facial feature detection unit 131 to identify the positions of the left and right eyes in the image and to find the image region that contains the user's eye region.

In this case, the gaze tracking unit 134 may use the 3D feature information detected from the plurality of image sensors to identify the image sensor with the highest visibility of the eyes.

In this case, the gaze tracking unit 134 may make this identification by selecting the image sensor for which the dot product between the normal vector perpendicular to the surface of the eye, obtained from the 3D information of the eye region, and the camera's viewing vector is largest in magnitude in the negative direction.

In this case, for the left and right eye image regions with the best visibility, the gaze tracking unit 134 may detect the position of the pupil by identifying the iris and pupil in the image through training with deep learning techniques.

In this case, the gaze tracking unit 134 may find the 3D position of the pupil by relating the detected pupil position to the 3D model of the user's face obtained by the facial feature detection unit 131.

In this case, because the found 3D pupil position lies on the surface of the user's actual, roughly spherical eyeball, the gaze tracking unit 134 may use a 3D model of the eyeball to detect the gaze direction in 3D.

The 3D eyeball model may be simplified to a sphere with a center point and a fixed radius, or a more complex, realistic eyeball model may be simulated in 3D.

For the spherical eyeball model, when precision on the order of replacing a mouse on a monitor is not required, the radius may be determined from average human anatomical dimensions, taking into account face size, age group, and the like.

In this case, the gaze tracking unit 134 may set the center point by estimating the user's anatomical skeleton from the facial features obtained through the facial feature detection unit 131.

In this case, by finding the 3D intersection of the 3D gaze directions of the left and right eyes, the gaze tracking unit 134 may find the 3D position of the gaze point, which is that 3D gaze intersection expressed in the 3D spatial coordinate system obtained through camera calibration, and a gaze candidate region that accounts for the error range arising during gaze tracking.
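The spherical eyeball model and the left/right gaze intersection can be sketched as below. This is a sketch under stated assumptions: the gaze direction is taken as the unit vector from the eyeball center through the 3D pupil position, and, because two measured rays rarely meet exactly, the "intersection" is approximated by the midpoint of the shortest segment between the two gaze lines. That midpoint rule, the helper names, and the numeric values (centimeters) are illustrative choices, not details from the specification.

```python
import math
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def add(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def scale(a: Vec3, k: float) -> Vec3:
    return (a[0] * k, a[1] * k, a[2] * k)


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def normalize(a: Vec3) -> Vec3:
    return scale(a, 1.0 / math.sqrt(dot(a, a)))


def gaze_direction(eye_center: Vec3, pupil: Vec3) -> Vec3:
    """Spherical eye model: the gaze runs from the eyeball center through the pupil."""
    return normalize(sub(pupil, eye_center))


def gaze_point(c_left: Vec3, d_left: Vec3, c_right: Vec3, d_right: Vec3) -> Optional[Vec3]:
    """Midpoint of the shortest segment between the two gaze lines,
    or None when the lines are (nearly) parallel."""
    w0 = sub(c_left, c_right)
    a, b, c = dot(d_left, d_left), dot(d_left, d_right), dot(d_right, d_right)
    d, e = dot(d_left, w0), dot(d_right, w0)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        return None
    t = (b * e - c * d) / denom   # parameter along the left gaze line
    s = (a * e - b * d) / denom   # parameter along the right gaze line
    p_left = add(c_left, scale(d_left, t))
    p_right = add(c_right, scale(d_right, s))
    return scale(add(p_left, p_right), 0.5)


# Illustrative setup: both eyes look at a target 60 cm in front of the face.
target = (0.0, 0.0, 60.0)
left_center, right_center = (-3.0, 0.0, 0.0), (3.0, 0.0, 0.0)
radius = 1.2  # illustrative eyeball radius in cm
left_pupil = add(left_center, scale(normalize(sub(target, left_center)), radius))
right_pupil = add(right_center, scale(normalize(sub(target, right_center)), radius))
point = gaze_point(left_center, gaze_direction(left_center, left_pupil),
                   right_center, gaze_direction(right_center, right_pupil))
```

In practice the length of the shortest segment between the two lines could also serve as one input to the error range discussed next, although the specification does not say so.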

Errors may arise from uncertainty in pupil detection, in the center point and radius of the eyeball model, and in the detection of the 3D face model. In this case, the gaze tracking unit 134 may determine the error range in consideration of the 3D-space size of the identifiable objects in the on-screen content.

In this case, the gaze tracking unit 134 may provide the set 3D error range together with the 3D gaze point as a 3D ellipsoid model.

The time-series tag parser unit 132 may parse the tag information inserted in the content received by the content input unit 120.

In this case, the time-series tag parser unit 132 may provide the gaze target detection unit 135 with the 3D information of the gaze targets shown in each frame displayed on the display screen.

In this case, the time-series tag parser unit 132 may generate the 3D information of a gaze target by calculating its size and position in 3D space, taking into account the 3D spatial placement of the display expressed in the 3D spatial coordinate system.
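One way to turn an on-screen tag region into 3D information is sketched below. It assumes a flat display described by the 3D position of its top-left corner, unit vectors along its rightward and downward edges, and its physical width and height. These parameters, the normalized `(x, y, w, h)` region format, and the function name `region_to_3d` are hypothetical; the specification only states that the display's 3D placement is taken into account.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]
Region = Tuple[float, float, float, float]  # x, y, width, height in [0, 1] screen units


def region_to_3d(region: Region, origin: Vec3, right: Vec3, down: Vec3,
                 width: float, height: float) -> Tuple[Vec3, Tuple[float, float]]:
    """Map a normalized screen region to its 3D center and physical size.

    origin: 3D position of the display's top-left corner
    right, down: unit vectors along the display's edges
    width, height: physical size of the display (same unit as origin)
    """
    x, y, w, h = region
    cx = (x + w / 2.0) * width    # offset of the region center along `right`
    cy = (y + h / 2.0) * height   # offset of the region center along `down`
    center = (
        origin[0] + right[0] * cx + down[0] * cy,
        origin[1] + right[1] * cx + down[1] * cy,
        origin[2] + right[2] * cx + down[2] * cy,
    )
    return center, (w * width, h * height)


# A 100 x 60 display centered on the world origin in the z = 0 plane.
center, size = region_to_3d(
    region=(0.25, 0.5, 0.5, 0.5),
    origin=(-50.0, 30.0, 0.0), right=(1.0, 0.0, 0.0), down=(0.0, -1.0, 0.0),
    width=100.0, height=60.0,
)
```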

In addition, when the target is a physical site such as an actual store, the time-series tag parser unit 132 may parse the position, size, and attribute information of the tagged objects on a digital 3D map obtained in advance, for example through a 3D scan of the space, and provide it to the gaze target detection unit 135.

The gaze target detection unit 135 may detect the gaze target closest to the user by using the tag information of the objects currently shown on the display screen, obtained through the time-series tag parser unit 132, and the 3D information of the gaze target detected by the gaze tracking unit 134.

In this case, considering the 3D gaze point and the 3D error range in the 3D information of the gaze target, the gaze target detection unit 135 may detect, among the objects within the error range, the gaze target nearest in distance to the 3D gaze point.
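The selection of the nearest object within the error range can be sketched as follows. The sketch assumes an axis-aligned error ellipsoid centered on the 3D gaze point and treats each object as a single 3D point (for example, its center). Both simplifications, and the names `inside_ellipsoid` and `nearest_target`, are illustrative assumptions.

```python
import math
from typing import Dict, Optional, Tuple

Vec3 = Tuple[float, float, float]


def inside_ellipsoid(point: Vec3, center: Vec3, radii: Vec3) -> bool:
    """True when `point` lies inside the axis-aligned ellipsoid at `center`."""
    return sum(((p - c) / r) ** 2 for p, c, r in zip(point, center, radii)) <= 1.0


def nearest_target(gaze_point: Vec3, radii: Vec3, targets: Dict[str, Vec3]) -> Optional[str]:
    """Among targets inside the error ellipsoid, return the one closest
    to the gaze point, or None when no target falls inside."""
    candidates = [
        (math.dist(gaze_point, position), name)
        for name, position in targets.items()
        if inside_ellipsoid(position, gaze_point, radii)
    ]
    return min(candidates)[1] if candidates else None


# Illustrative scene: two objects inside the error range, one far outside.
objects = {
    "sneaker": (4.0, 0.0, 0.0),
    "handbag": (0.0, 3.0, 0.0),
    "poster": (30.0, 0.0, 0.0),
}
chosen = nearest_target((0.0, 0.0, 0.0), (10.0, 5.0, 20.0), objects)
```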

In this case, the gaze target detection unit 135 may provide the detected gaze target, its 3D information, and the corresponding parser information to the interest recording unit 136.

Based on the 3D gaze point information from the gaze tracking unit 134, the interest detection unit 133 may detect the user's interest in the gaze target when 3D tracking of the user's gaze shows that the gaze point has been maintained for at least a preset time along the timeline.

Optionally, for computational efficiency, the interest detection unit 133 may detect the user's interest only when the gaze point has been maintained for at least the preset time and the gaze target detection unit 135 has detected a gaze target.
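The dwell condition can be sketched as follows. The sketch assumes that each gaze sample has already been resolved to a gaze target (or `None` when no target was detected) and interprets "the gaze point is maintained" as the same target being reported continuously. The sample format and the function name `dwell_intervals` are hypothetical.

```python
from typing import List, Optional, Tuple

Sample = Tuple[float, Optional[str]]  # (timestamp in seconds, gaze target or None)


def dwell_intervals(samples: List[Sample], min_duration: float) -> List[Tuple[str, float, float]]:
    """Return (target, start, end) for every run of consecutive samples on the
    same target that lasts at least `min_duration` seconds."""
    intervals = []
    current: Optional[str] = None
    start = last = 0.0
    for timestamp, target in samples:
        if target != current:
            # The run on `current` just ended; keep it if it lasted long enough.
            if current is not None and last - start >= min_duration:
                intervals.append((current, start, last))
            current, start = target, timestamp
        last = timestamp
    if current is not None and last - start >= min_duration:
        intervals.append((current, start, last))
    return intervals


gaze_log = [
    (0.0, "sneaker"), (0.5, "sneaker"), (1.0, "sneaker"),
    (1.5, None),
    (2.0, "handbag"), (2.2, "handbag"),
]
dwells = dwell_intervals(gaze_log, min_duration=1.0)
```

Only the intervals returned here would then need their facial feature changes examined, which matches the computational-efficiency option described above.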

In this case, when the above condition is satisfied, the interest detection unit 133 may examine the changes in the facial features supplied in chronological order by the facial feature detection unit 131 within the time range during which the gaze point is maintained.

In this case, the interest detection unit 133 may detect the user's interest level based on the changes in the facial features.

The interest level may be broadly defined as liking, indifference, or dislike.

In this case, the interest detection unit 133 may detect these three categories of interest by analyzing the changes in the facial features.

In this case, the interest detection unit 133 may detect the interest level by analyzing the overall movement of the face.

예를 들어, 사용자는 응시 대상에 대해 호감을 가질 경우 의식 중 혹은 무의식 중에 고개를 끄덕일 수 있다.For example, the user may nod his head consciously or unconsciously when he has a liking for the gaze target.

이를 통해, 관심도 검출부(133)는 얼굴 특징의 변화가 응시 대상에 대해 상하로 일정 범위로 반복하여 움직이는 움직임을 검출하여 호감으로 관심도를 검출할 수 있다.Through this, the interest level detection unit 133 may detect the level of interest as a crush by detecting a movement in which a change in a facial feature repeatedly moves up and down in a predetermined range with respect to the gaze target.

반대로 비호감일 경우, 사용자는 응시 대상에 대해 고개를 좌우로 움직일 수 있다.Conversely, in case of unfavorable feeling, the user may move the head left and right with respect to the gaze target.

이 때, 관심도 검출부(133)는 얼굴 특징의 변화가 응시 대상에 대해 좌우로의 반복하여 움직이는 움직임을 검출하여 비호감으로 관심도를 검출할 수 있다.In this case, the interest level detection unit 133 may detect the level of interest as an unfavorable feeling by detecting a movement in which a change in a facial feature moves left and right with respect to the gaze target.

무관심일 경우, 사용자는 응시 대상에 대해 얼굴특징이 변화하지 않을 수 있다.In the case of indifference, the user may not change the facial feature with respect to the gaze target.

이 때, 관심도 검출부(133)는 얼굴 특징의 변화가 응시 대상에 대해 변화하지 않는 경우, 무관심으로 관심도를 검출할 수 있다.At this time, when the change in the facial feature does not change with respect to the gaze target, the interest level detector 133 may detect the interest level with indifference.
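
The nod and head-shake cues described above can be sketched in Python as counting direction reversals in the vertical and horizontal motion of the face. This is a simplified illustration, not the method of the specification: the inputs are assumed to be per-frame vertical and horizontal displacements of some tracked facial feature, and the thresholds `min_amplitude` and `min_reversals` are hypothetical.

```python
def count_oscillations(values, min_amplitude):
    """Count direction reversals whose swing is at least min_amplitude."""
    reversals = 0
    anchor = values[0]  # last accepted extreme
    direction = 0       # +1 rising, -1 falling, 0 not yet known
    for v in values[1:]:
        delta = v - anchor
        if direction == 0:
            if abs(delta) >= min_amplitude:
                direction = 1 if delta > 0 else -1
                anchor = v
        elif delta * direction > 0:
            anchor = v  # still moving the same way: extend the extreme
        elif abs(delta) >= min_amplitude:
            direction = -direction  # swung back far enough: one reversal
            reversals += 1
            anchor = v
    return reversals


def classify_head_motion(vertical, horizontal, min_amplitude=1.0, min_reversals=2):
    """Map repeated up/down motion to 'like', repeated left/right motion to
    'dislike', and no motion at all to 'indifference'."""
    nods = count_oscillations(vertical, min_amplitude)
    shakes = count_oscillations(horizontal, min_amplitude)
    if nods >= min_reversals and nods > shakes:
        return "like"
    if shakes >= min_reversals and shakes > nods:
        return "dislike"
    if nods == 0 and shakes == 0:
        return "indifference"
    return "undetermined"
```

The `"undetermined"` result is an addition of this sketch for ambiguous motion; the specification itself defines only the three categories.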

In this case, the interest detection unit 133 may also detect the degree of interest from the movement of the mouth feature points.

In this case, the interest detection unit 133 may identify liking by analyzing whether the feature points at both ends of the lips move upward on the face relative to the central feature point of the lips.

In this case, the interest detection unit 133 may identify the degree of liking by analyzing the relative range of movement of the mouth feature points in the smiling direction.

In this case, the interest detection unit 133 may also identify liking by detecting the mouth shape formed when uttering exclamations such as "Oh!" or "Wow!".

In this case, the interest detection unit 133 may identify the degree of dislike by analyzing the range of movement of the mouth feature points in the direction opposite to that of liking.

In this case, the interest detection unit 133 may identify indifference by analyzing the range of movement of the individual feature points relative to the arrangement of mouth feature points that corresponds to an expressionless state.
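
One way to picture the mouth cue is a signed score that measures how far the lip corners have lifted relative to the lip center, compared with an expressionless reference. The Python sketch below assumes 2D landmarks expressed in a face-aligned frame whose +y axis points toward the top of the face; the landmark keys, the function name `mouth_valence`, and the normalization by `face_height` are all assumptions of this example.

```python
def mouth_valence(neutral, current, face_height):
    """neutral, current: dicts with keys 'left', 'right', 'center', each an
    (x, y) landmark with +y toward the top of the face.

    Returns the change in lip-corner lift relative to the lip center,
    normalized by face_height: positive for a smile-like movement,
    negative for the opposite movement, near zero for no change.
    """
    def corner_lift(landmarks):
        mean_corner_y = (landmarks["left"][1] + landmarks["right"][1]) / 2.0
        return mean_corner_y - landmarks["center"][1]

    return (corner_lift(current) - corner_lift(neutral)) / face_height
```

The magnitude of the score would play the role of the "range of movement" mentioned above, while its sign separates the liking direction from the dislike direction.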

In this case, the interest detection unit 133 may also detect the degree of interest from the movement of the features that make up the eyebrows.

In this case, the interest detection unit 133 may identify dislike by detecting the movement of the facial features that occurs when the user frowns and the space between the eyebrows narrows.

Conversely, the interest detection unit 133 may identify liking by detecting the movement of the facial features that occurs with pleasant surprise.

In addition, the interest detection unit 133 may define the degree of interest through various other facial feature changes that differ from person to person.

The changes in facial features described above may occur in isolation or in combination.

In this case, the interest detection unit 133 may detect the degree of interest by generating training data from the chronological facial feature changes of various users viewing liked, disliked, and indifferent content, together with the corresponding interest values, and training on that data with a deep learning technique.

The output value of the degree of interest is a positive value that increases with the degree of liking, zero for indifference, and a negative value whose magnitude increases with the degree of dislike, and may be normalized to the range [-1, 1]. The output may also include the gaze time.
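
The output convention can be sketched as follows. The raw score and its scale `max_abs_score` are hypothetical inputs; the specification only fixes the sign convention, the [-1, 1] range, and the optional gaze time.

```python
from dataclasses import dataclass


@dataclass
class InterestOutput:
    value: float         # normalized interest in [-1, 1]; sign encodes like/dislike
    gaze_seconds: float  # how long the gaze was held on the target


def normalize_interest(raw_score, max_abs_score, gaze_seconds):
    """Scale a signed raw score into [-1, 1], clamping out-of-range values."""
    if max_abs_score <= 0:
        raise ValueError("max_abs_score must be positive")
    value = max(-1.0, min(1.0, raw_score / max_abs_score))
    return InterestOutput(value=value, gaze_seconds=gaze_seconds)
```

Clamping is a design choice of this sketch so that an unexpectedly large raw score cannot leave the normalized range.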

The interest recording unit 136 may receive the duration of the interest and the normalized interest value from the interest detection unit 133 and the parser information of the gaze target from the gaze target detection unit 135, and record them together in the database unit 150.

The database unit 150 may also provide the recorded values to the interaction application linkage unit 140.

The database unit 150 may store the user's degree of interest in association with the tag information.

The interaction application linkage unit 140 may provide a customized service suited to the user based on the user's degree of interest.

In this case, the interaction application linkage unit 140 may include the gaze target interest detection unit 130, and may selectively present tag-based content on the display device that the user is viewing in order to identify the user's degree of interest through the gaze target interest detection unit 130.

The interaction application linkage unit 140 may also interactively provide the user with content that matches the identified degree of interest.

In this case, through the interaction application, the interaction application linkage unit 140 allows the application to intelligently determine the user's degree of interest and provide information or content in the direction the user wants.

FIG. 3 shows an embodiment of the gaze target interest detection apparatus and method using a smart device such as a TV, and FIG. 4 shows an embodiment using a physical site such as a retail store.

As in the smart-device and site embodiments, the image sensor input unit 110 may include a single display with one or more image sensors, or a plurality of displays with a plurality of image sensor units.

The tag-based content input to the tag-based content input unit 120 may include tag information describing the position and region of each gaze target shown over time on the display unit of the smart device, along with the object's attributes, as in the smart-device example of FIG. 3. Alternatively, as in the site example, it may include tag information describing the position, size, and attributes of objects arranged in a space. The tag may additionally include various information expressing the object's attributes, such as product category and material.

Referring to FIGS. 3 and 4, the image sensor input unit 110 may include one or more sensors, and a single sensor may itself contain multiple composite image sensors, as in a stereo camera or Kinect. These sensors are placed on the smart device or in the space. Through camera calibration, the common computer vision procedure that determines the relative placement of multiple cameras in space with respect to a particular coordinate system as well as intrinsic properties such as camera focus, the sensors can exchange data with respect to a single 3D spatial coordinate system. The calibration process can also yield the placement of the display unit relative to the 3D spatial coordinate system and the placement of each object in the space relative to that coordinate system.

If, after this camera calibration, multiple image sensors can capture a point in space without occlusion, the 3D position of that point relative to the 3D spatial coordinate system used as the reference in calibration can be computed with standard computer vision techniques.

The configuration of the present invention has been described above through specific methodologies and embodiments.

The present invention describes a technique for detecting interest in a gazed object based on facial feature analysis. Compared with the invention of [Reference 3], which records information on the object in the gaze direction obtained through simple gaze tracking and gives that information priority in later searches, the present invention also provides the user's degree of interest in the object, so that content or information better matching the user's preferences can be provided.

FIG. 5 is a flowchart illustrating a gaze target interest detection method according to an embodiment of the present invention.

Referring to FIG. 5, the gaze target interest detection method according to an embodiment of the present invention may first receive an image and content (S210).

That is, step S210 may receive, from at least one image sensor, an image of a user gazing at a display.

In this case, step S210 may receive tag information embedded in tag-based content about the gaze target output through the display.

In this case, step S210 may receive tag-based content that includes, for each frame in the content's time series, tag information describing the type or product name of each object shown on the screen and its position and region on the screen.
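
A per-frame tag track of this kind might be represented as in the Python sketch below. The specification does not define a tag format, so the `Tag` fields, the frame-indexed dictionary, and the helper names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Tag:
    label: str                         # object type or product name
    region: Tuple[int, int, int, int]  # x, y, width, height on screen, in pixels


def tags_at(timeline: Dict[int, List[Tag]], frame: int) -> List[Tag]:
    """Return the tags shown in the given frame (empty if none)."""
    return timeline.get(frame, [])


def tag_hit(tags: List[Tag], x: int, y: int) -> Optional[Tag]:
    """Return the first tag whose screen region contains pixel (x, y)."""
    for tag in tags:
        rx, ry, rw, rh = tag.region
        if rx <= x < rx + rw and ry <= y < ry + rh:
            return tag
    return None
```

Keying the timeline by frame number mirrors the statement that tag information is provided for each frame of the content.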

In this case, step S210 may receive, as tag-based content, a 3D map of the layout and shape of a space that includes tag information describing the position and region of the objects making up the space.

Next, the gaze target interest detection method according to an embodiment of the present invention may detect the degree of interest in the gaze target (S220).

That is, step S220 may detect, from the image, facial feature information of the user looking at the gaze target and gaze information obtained by tracking the user's gaze toward the gaze target, and may detect the user's degree of interest in the gaze target based on the facial feature information, the gaze information, and the tag information.

In this case, step S220 may receive the input image and the tag information and, when the user gazes at an object in the content or in the space, identify the gaze target and detect the degree of interest in it.

In this case, when the gaze information indicates that the user's gaze point has been maintained for a predetermined time, step S220 may detect the user's degree of interest by detecting changes in the user's facial features during that time based on the facial feature information.

In this case, step S220 may detect the user's degree of interest based on at least one of a change in the feature points of the user's mouth and a change in the feature points of the eyebrows, among the user's facial feature changes.

In this case, step S220 may detect facial feature information for each image input from the image sensor input unit 110. The facial feature information may cover the eyes, nose, mouth, eyebrows, ears, jawline, and other parts of the face. It may be defined as facial landmarks consisting of several to several dozen feature points, or, for a 3D face model composed of a triangular or quadrilateral mesh, as facial feature points corresponding to the vertices of the mesh.

In this case, step S220 may use a facial feature detection technique such as PRNet. However, the 3D model that PRNet outputs may correspond to a scaled 3D model expressed in a normalized 3D space with no defined scale.

In this case, to detect facial feature information that corresponds to physical space, step S220 may remove the scale information by using the correspondences between feature points of the scaled 3D models obtained from the individual image sensors.

In this case, after removing the scale information from the scaled 3D model, step S220 may detect 3D facial features that correspond to the actual size of the face in the 3D spatial coordinate system obtained through camera calibration.

In this case, using a plurality of image sensors, step S220 may perform a visibility test on the 3D model obtained from each image sensor, taking into account each sensor's viewpoint and the user's position.

In this case, step S220 may selectively collect the 3D model information obtained from the image with the greatest visibility.

In this case, step S220 may integrate, with respect to a single 3D spatial coordinate system, information that was not visible to individual image sensors.

In this case, step S220 may represent all the information from the multiple cameras consistently and detect facial features from the integrated information.

In this case, for each triangular or quadrilateral mesh face of the 3D model, step S220 may record in that face the information from the image for which the dot product between the face's normal vector and the camera's viewing vector has the largest absolute value in the negative direction, thereby obtaining a 3D model with maximized visibility.
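
The per-face visibility rule can be sketched in Python as choosing, for one mesh face, the camera whose viewing direction gives the most negative dot product with the face normal. The function names and the convention that each viewing vector points from the camera toward the scene are assumptions of this example.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def most_visible_camera(face_normal, view_dirs):
    """view_dirs: dict mapping camera id -> viewing direction vector
    (pointing from the camera toward the scene).

    Return the id of the camera with the most negative dot product between
    the face normal and its viewing direction, or None when the face points
    away from every camera (no negative dot product).
    """
    n = normalize(face_normal)
    best_id, best_dot = None, 0.0
    for cam_id, direction in view_dirs.items():
        dot = sum(a * b for a, b in zip(n, normalize(direction)))
        if dot < best_dot:
            best_id, best_dot = cam_id, dot
    return best_id
```

A dot product of -1 means the camera looks straight at the face, so picking the most negative value selects the most frontal view of that face.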

In this case, step S220 may unwrap the acquired 3D shape and color of the user from the 3D model (much like a globe flattened into a 2D map), in the manner of the UV map used in computer graphics to represent humanoid characters.

In this case, step S220 may train on the unwrapped images using deep learning techniques, and may detect all of the user's facial features simultaneously even when the user turns the head about 90 degrees to the side relative to the display screen.

In this case, step S220 can detect facial features even when the user's face is not looking straight ahead, and can efficiently detect the facial features of a user looking around multiple displays or a space.

In this case, step S220 may instead detect facial features in the individual images and then use the correspondences between those features to detect a single set of facial features.

In this case, step S220 may use images acquired with image sensors consisting only of color cameras, such as ordinary webcams. With sensors that can actively obtain depth information using patterns, TOF (Time of Flight), or active stereo, such as Kinect, the same result as the color-camera-only embodiment can be obtained with a single image sensor or with far fewer sensors.

In this case, step S220 may use the 3D facial feature information found by the facial feature detection unit 131 to identify the positions of the left and right eyes in the image and find the image region containing the user's eye region.

In this case, step S220 may identify the image sensor with the highest eye visibility using the 3D feature information detected from the plurality of image sensors.

In this case, step S220 may make this identification by selecting the image sensor for which the dot product between the normal vector perpendicular to the eye surface, obtained from the 3D information of the eye region, and the camera's viewing vector has the largest absolute value in the negative direction.

In this case, for the left and right eye image regions with the best visibility, step S220 may identify the iris and pupil in the image through learning with a deep learning technique and detect the pupil position.

In this case, step S220 may find the 3D position of the pupil by relating the detected pupil position to the 3D model of the user's face obtained by the facial feature detection unit 131.

In this case, because the 3D pupil position found is the position of the pupil on the surface of the user's actual spherical eyeball, step S220 may detect the gaze direction in 3D using a 3D model of the eyeball.

In this case, step S220 may set the center point by predicting the user's anatomical skeleton from the facial features obtained through the facial feature detection unit 131.

In this case, by finding the 3D gaze intersection that links the 3D gaze directions of the left and right eyes, step S220 may find the 3D position of the gaze point, which is that intersection expressed in the 3D spatial coordinate system obtained through camera calibration, and a gaze candidate region that accounts for the error range arising in the gaze tracking process.
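
Because two measured gaze rays rarely intersect exactly, one common way to realize a "3D gaze intersection" is to take the midpoint of the shortest segment between the two rays. The Python sketch below uses that approach as an assumption; the specification does not state how the intersection is computed, and the function names are hypothetical.

```python
def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def _scale(a, k):
    return tuple(x * k for x in a)


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def gaze_intersection(origin_l, dir_l, origin_r, dir_r, eps=1e-9):
    """Return the midpoint of the shortest segment between the left and
    right gaze lines, or None when the lines are (nearly) parallel.

    origin_*: 3D eye positions; dir_*: 3D gaze directions (any length).
    """
    w = _sub(origin_l, origin_r)
    a, b, c = _dot(dir_l, dir_l), _dot(dir_l, dir_r), _dot(dir_r, dir_r)
    d, e = _dot(dir_l, w), _dot(dir_r, w)
    denom = a * c - b * b
    if abs(denom) < eps:
        return None
    s = (b * e - c * d) / denom  # parameter of the closest point on the left line
    t = (a * e - b * d) / denom  # parameter of the closest point on the right line
    p_left = _add(origin_l, _scale(dir_l, s))
    p_right = _add(origin_r, _scale(dir_r, t))
    return _scale(_add(p_left, p_right), 0.5)
```

The distance between the two closest points could also serve as a rough indicator of how uncertain the estimated gaze point is.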

The error may arise from uncertainty in pupil detection, in the center point and radius of the eyeball model, and in the detection of the 3D face model. In this case, step S220 may determine the error range in consideration of the 3D spatial size of the identifiable objects in the on-screen content.

In this case, step S220 may provide the determined 3D error range together with the 3D gaze point as a 3D ellipsoid model.

In this case, step S220 may parse the tag information embedded in the content input through the tag-based content input unit 120.

In this case, step S220 may provide the gaze target detection unit 135 with the 3D information of the gaze targets shown in each frame displayed on the display screen.

In this case, step S220 may compute the size and position of the gaze target in 3D space, taking into account the 3D spatial placement of the display expressed in the 3D spatial coordinate system, and generate the 3D information of the gaze target.
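
As an illustration of turning an on-screen tag region into 3D information, the Python sketch below maps display pixels into the calibrated coordinate system. It assumes a flat display whose top-left corner position, edge directions (unit vectors), physical size in meters, and pixel resolution are known from calibration; these parameters and the function names are assumptions of this example.

```python
def pixel_to_world(px, py, resolution, origin, right, down, size_m):
    """Map a display pixel to a 3D point.

    resolution: (width, height) in pixels; origin: 3D position of the
    display's top-left corner; right, down: unit vectors along the display
    edges; size_m: physical (width, height) of the display in meters.
    """
    u = px / resolution[0] * size_m[0]  # meters along the top edge
    v = py / resolution[1] * size_m[1]  # meters along the left edge
    return tuple(o + u * r + v * d for o, r, d in zip(origin, right, down))


def region_to_world(region, resolution, origin, right, down, size_m):
    """Return (3D center, physical (width, height)) of a tag region given
    as (x, y, width, height) in pixels."""
    x, y, w, h = region
    center = pixel_to_world(x + w / 2, y + h / 2, resolution,
                            origin, right, down, size_m)
    extent = (w / resolution[0] * size_m[0], h / resolution[1] * size_m[1])
    return center, extent
```

The resulting center and extent are the kind of 3D position and size that can then be compared against the 3D gaze point and its error range.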

In this case, when the target is a physical site such as an actual store, step S220 may parse the position, size, and attribute information of tagged objects on a digital 3D map obtained in advance, for example through a 3D scan of the space, and provide it to the gaze target detection unit 135.

In this case, step S220 may detect the gaze target nearest to the user by using the tag information of the objects currently shown on the display screen, obtained through the time-series tag parser unit 132, and the 3D information of the gaze target detected by the gaze tracking unit 134.

In this case, step S220 may consider the 3D gaze point and the 3D error range in the 3D information of the gaze target and detect, among the objects within the error range, the one closest in distance to the 3D gaze point as the gaze target.

In this case, step S220 may provide the detected gaze target, together with its 3D information and the corresponding parser information, to the interest recording unit 136.

In this case, step S220 may track the user's gaze in 3D based on the 3D gaze point information from the gaze tracking unit 134 and, when the user's gaze point is maintained for a predetermined time or longer in the time series, detect the user's degree of interest in the gaze target.

In this case, optionally, for computational efficiency, step S220 may detect the user's degree of interest only when the gaze point is maintained for the predetermined time or longer and the gaze target detection unit 135 has detected a gaze target.

In this case, once the above condition is met, step S220 may examine the changes in the facial features that the facial feature detection unit 131 supplies in chronological order within the time range in which the gaze point is maintained.

In this case, step S220 may detect the user's degree of interest based on the changes in the facial features.

In this case, step S220 may detect the three categories of interest described above by analyzing the changes in the facial features.

이 때, 단계(S220)는 얼굴의 전체적인 움직임 분석을 통해 검출할 수 있다.In this case, step S220 may be detected through the overall motion analysis of the face.

이 때, 단계(S220)는 얼굴 특징의 변화가 응시 대상에 대해 상하로 일정 범위로 반복하여 움직이는 움직임을 검출하여 호감으로 관심도를 검출할 수 있다.In this case, in step S220 , the degree of interest may be detected as a crush by detecting a movement in which the change of the facial feature repeatedly moves up and down with respect to the gaze target in a certain range.

이 때, 단계(S220)는 얼굴 특징의 변화가 응시 대상에 대해 좌우로의 반복하여 움직이는 움직임을 검출하여 비호감으로 관심도를 검출할 수 있다.In this case, in step S220 , the degree of interest may be detected as unfavorable by detecting a movement in which the change of the facial feature moves left and right with respect to the gaze target.

이 때, 단계(S220)는 얼굴 특징의 변화가 응시 대상에 대해 변화하지 않는 경우, 무관심으로 관심도를 검출할 수 있다.In this case, in step S220, when the change in the facial feature does not change with respect to the gaze target, the degree of interest may be detected with indifference.

이 때, 단계(S220)는 입의 특징점 움직임을 통해 검출할 수도 있다. In this case, step S220 may be detected through the movement of the feature point of the mouth.

이 때, 단계(S220)는 호감일 경우, 입술의 양끝 특징점이 입술의 중심 특징점을 기준으로 얼굴의 위쪽 방향으로 상대적으로 이동하지를 분석하여 식별할 수 있다.In this case, in step S220 , in the case of a favorable feeling, it is possible to identify by analyzing whether the feature points at both ends of the lips move relatively in the upward direction of the face based on the central feature point of the lips.

이 때, 단계(S220)는 미소를 짓는 방향으로의 입의 특징점들의 상대적 이동범위를 분석하여 호감의 정도를 식별할 수 있다.In this case, in step S220, the degree of liking may be identified by analyzing the relative movement ranges of the feature points of the mouth in the smiling direction.

이 때, 단계(S220)는 호감의 경우, '오!, 와!' 등의 감탄사를 할때의 입모양을 검출함으로써 식별될 수도 있다.At this time, step S220 is a case of liking, 'Oh!, Wow!' It can also be identified by detecting the shape of the mouth when exclamation such as etc.

이 때, 단계(S220)는 비호감일 경우, 호감의 반대방향으로 입 특징점들의 이동범위의 분석을 통해 비호감의 정도를 식별할 수 있다.In this case, in step S220 , in the case of unfavorable feeling, the degree of unfavorable feeling may be identified through analysis of the movement range of the mouth feature points in the opposite direction to the favorable feeling.

이 때, 단계(S220)는 무관심의 경우 무표정 상태에 해당하는 입의 특징점들의 배치에서의 개별 특징점들의 이동범위를 분석하여 식별할 수 있다.In this case, in step S220 , in the case of indifference, the movement range of individual feature points in the arrangement of the feature points of the mouth corresponding to the expressionless state may be analyzed and identified.

이 때, 단계(S220)는 눈썹을 구성하는 특징들의 움직임을 통해서도 관심도를 검출할 수 있다.In this case, in step S220, the degree of interest may also be detected through movement of features constituting the eyebrows.

이 때, 단계(S220)는 비호감의 경우 미간이 좁아지면서 얼굴을 찌푸릴 때 발생하는 얼굴 특징들의 움직임을 검출함으로써 식별할 수 있다.In this case, in step S220, in the case of unfavorable feeling, the facial features may be identified by detecting movements of facial features that occur when the forehead is narrowed and the face is frowned.

이 때, 단계(S220)는 반대로 호감의 경우 호의적 놀람의 경우에 발생하는 얼굴 특징들의 움직임을 검출함으로써 식별할 수 있다.In this case, in step S220, on the contrary, in the case of a favorable feeling, it can be identified by detecting the movement of the facial features that occur in the case of a favorable surprise.

이 때, 단계(S220)는 그 외에도 개인별로 다양한 얼굴의 특징변화를 통해 관심도를 정의할 수 있다.In this case, in step S220, the degree of interest may be defined through various facial feature changes for each individual.

이 때, 단계(S220)는 다양한 사용자별로 호감, 비호감, 무관심 콘텐츠를 볼때의 시간순의 얼굴 특징변화와 그때의 관심도 값을 학습데이터로 생성하여 딥러닝 기법을 통해 학습함으로써 관심도를 검출할 수 있다.At this time, in step S220, the degree of interest can be detected by generating as learning data the change of facial features in chronological order and the interest value at that time when viewing content of liking, dislike, and indifference by various users, and learning through a deep learning technique.

In this case, step S220 may receive the duration of the interest and the normalized interest value from the interest detection unit 133, receive the parser information of the gaze target from the gaze target detection unit 135, and record them together in the database unit 150.
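As a minimal sketch of this recording step, the following uses Python's standard sqlite3 module in place of the database unit 150. The table name `interest_log`, its columns, the `tag` dictionary keys, and the assumed range of -1.0 to 1.0 for the normalized interest value are all hypothetical; the embodiment does not define a schema or a normalization range.

```python
import sqlite3


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database with a hypothetical interest log table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS interest_log ("
        "target_id TEXT, category TEXT, duration_s REAL, interest REAL)"
    )
    return conn


def record_interest(
    conn: sqlite3.Connection, tag: dict, duration_s: float, interest: float
) -> None:
    """Store the interest duration and value together with the tag info."""
    # Assumption: the normalized interest value lies in [-1.0, 1.0].
    if not -1.0 <= interest <= 1.0:
        raise ValueError("interest must be normalized to [-1.0, 1.0]")
    conn.execute(
        "INSERT INTO interest_log VALUES (?, ?, ?, ?)",
        (tag["id"], tag["category"], duration_s, interest),
    )
    conn.commit()
```

For example, `record_interest(open_db(), {"id": "item-1", "category": "shoes"}, 2.5, 0.8)` would log 2.5 seconds of positive interest in a tagged item.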

In addition, the gaze target interest detection method according to an embodiment of the present invention may provide a user-customized service (S230).

That is, step S230 may provide a customized service suited to the user based on the user's degree of interest.

In this case, in step S230, the database unit 150 may provide the recorded values to the interaction application interworking unit 140.

In this case, in step S230, the database unit 150 may store the user's degree of interest in association with the tag information.

In this case, step S230 may selectively display tag-based content on the display device that the user is viewing so that the gaze target interest detection unit 130 can identify the user's degree of interest.

In this case, step S230 may interactively provide the user with content that satisfies the identified degree of interest.
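One simple way to choose such content is sketched below, assuming recorded values of the kind held in the database unit 150 are available as (category, duration, interest) tuples. The function name `preferred_category` and the duration-weighted averaging rule are illustrative assumptions, not part of the embodiment.

```python
from collections import defaultdict
from typing import Iterable, Optional, Tuple

Record = Tuple[str, float, float]  # (category, duration_s, interest)


def preferred_category(records: Iterable[Record]) -> Optional[str]:
    """Return the category with the highest duration-weighted interest."""
    sums = defaultdict(lambda: [0.0, 0.0])  # category -> [weighted sum, duration]
    for category, duration_s, interest in records:
        sums[category][0] += duration_s * interest
        sums[category][1] += duration_s
    means = {c: s / d for c, (s, d) in sums.items() if d > 0}
    if not means:
        return None
    return max(means, key=means.get)
```

An interaction application could then request further content from the returned category; weighting by duration gives longer gazes more influence than brief glances.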

In this case, in step S230, the interaction application may intelligently determine the user's degree of interest and provide information or content in the direction the user wants.

FIG. 6 is a diagram illustrating a computer system according to an embodiment of the present invention.

Referring to FIG. 6, the gaze target interest detection apparatus according to an embodiment of the present invention may be implemented in a computer system 1100, such as a computer-readable recording medium. As shown in FIG. 6, the computer system 1100 may include one or more processors 1110, a memory 1130, a user interface input device 1140, a user interface output device 1150, and storage 1160, which communicate with one another via a bus 1120. The computer system 1100 may further include a network interface 1170 connected to a network 1180. The processor 1110 may be a central processing unit or a semiconductor device that executes processing instructions stored in the memory 1130 or the storage 1160. The memory 1130 and the storage 1160 may be various types of volatile or non-volatile storage media. For example, the memory may include a ROM 1131 or a RAM 1132.

The gaze target interest detection apparatus according to an embodiment of the present invention includes one or more processors 1110 and an execution memory 1130 that stores at least one program executed by the one or more processors 1110. The at least one program may receive, from at least one image sensor, an image of a user gazing at a display; receive tag information inserted into tag-based content related to a gaze target output through the display; detect, from the image, facial feature information of the user looking at the gaze target and gaze information obtained by tracking the gaze of the user looking at the gaze target; and detect the user's degree of interest in the gaze target based on the facial feature information, the gaze information, and the tag information.
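To illustrate how the gaze information and the tag information might be combined, the sketch below matches a gaze point on the display against tagged regions. The `Tag` fields, the rectangular region model, and the rule of preferring the smallest overlapping region are assumptions for illustration; the embodiment states only that the tag information is inserted into the tag-based content.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Tag:
    """Hypothetical tag record: a target's rectangle on the display."""
    target_id: str
    x: float
    y: float
    width: float
    height: float


def gazed_target(tags: Iterable[Tag], gx: float, gy: float) -> Optional[Tag]:
    """Return the tagged target containing the gaze point, if any.

    When regions overlap, the smallest one is chosen on the assumption
    that it is the most specific target.
    """
    hits = [
        t for t in tags
        if t.x <= gx <= t.x + t.width and t.y <= gy <= t.y + t.height
    ]
    if not hits:
        return None
    return min(hits, key=lambda t: t.width * t.height)
```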

The gaze target interest detection apparatus and method according to an embodiment of the present invention may provide a way to find the object at which a user is gazing among objects in a real space or on a screen, and to automatically detect, using a non-contact image sensor, whether the user likes, dislikes, or is indifferent to that object. This applies to viewing educational or advertising content on a smartphone, smart pad, monitor, or TV, to shopping in stores such as department stores, and to visiting exhibition or experience halls.

In addition, the gaze target interest detection apparatus and method according to an embodiment of the present invention may use learning-based image analysis to obtain the user's degree of interest in a target through a non-contact image sensor. The target may be an object displayed on the screen of a smart terminal capable of photographing the user's face, such as a smartphone, monitor, advertising display, or kiosk, or an object in a real 3D space represented by a digital 3D map or 360-degree video. The degree of interest in the gaze target is detected by analyzing the user's facial features in the image.

In addition, the gaze target interest detection apparatus and method according to an embodiment of the present invention may use the user's degree of interest, detected automatically in real time, to survey or analyze customer interest in content, advertisements, or exhibits and provide intelligent user-customized content. They may also be used in services, such as interactive services between a user and a computer, in which the computer provides intelligent interaction based on the user's degree of interest.

In addition, the gaze target interest detection apparatus and method according to an embodiment of the present invention may be used in environments where it is difficult to observe the user from a fixed frontal viewpoint at a fixed position, such as viewing educational or advertising content on a smartphone, smart pad, monitor, or TV, shopping in stores such as department stores, or visiting exhibition or experience halls. In such environments they provide a way to find the object at which the user is gazing among objects in a real space or on a screen and to automatically detect the user's degree of interest in that object using a non-contact image sensor. Because the computer interactively provides content that reflects the user's degree of interest, a user-friendly UX can be offered.

In addition, when applied to smart advertising, the gaze target interest detection apparatus and method according to an embodiment of the present invention identify the target within an advertisement at which passers-by turn their heads to gaze, together with their degree of interest at that moment, and analyze the resulting stored big data. Advertisements suited to the interests of the people passing the billboard can then be shown according to the billboard's installation location and the time of day, actively drawing the gaze of passers-by to the billboard and maximizing the advertising effect. Furthermore, when a user stops and gazes at an advertisement, additional advertisements likely to interest the user can be displayed naturally according to the identified degree of interest, maximizing the effect of the advertising exposure. By combining the degree of interest identified for interactively provided advertisements with attributes such as age and body type to build big data, it is possible to minimize exposure of disliked advertisements, provide user-customized advertisements that lead more readily to purchases, and collect user trends in real time.

In addition, when applied to shopping in stores such as department stores, the gaze target interest detection apparatus and method according to an embodiment of the present invention can be used to analyze the characteristics of consumer groups, or to arrange and manage products in the store more efficiently, through big data analysis of the locations and objects on which window shoppers' gazes mainly rest and their degree of interest in those objects. In a store, this can be combined with the billboard example above: the customer's gender, age, and body type are identified from the image sensor installed on the billboard, and products the customer is likely to prefer are shown to the customer in real time, drawing the customer into the store and maximizing the likelihood of a purchase. When the present invention is applied to an exhibition hall, the degree of interest of users viewing the exhibits can be combined with age group, gender, and the like so that the exhibits are arranged in a way that draws users' interest.

In addition, when applied to educational facilities such as kindergartens, the gaze target interest detection apparatus and method according to an embodiment of the present invention can identify the content elements in which children passing the display are mainly interested, so that the children are naturally drawn to the display. By analyzing the gaze characteristics of children as they look at objects of interest on the screen, the apparatus and method can also be used for early screening of child development disorders such as autism.

In addition, when applied to a robot, the gaze target interest detection apparatus and method according to an embodiment of the present invention can be used in a service that provides natural conversational interaction in the direction the user prefers. Such a service may combine information such as gender and age, obtained by analyzing the user's facial features, with the accumulated interest information of existing users, or may rely on the user's immediate interest information identified over time.

As described above, the gaze target interest detection apparatus and method according to an embodiment of the present invention are not limited to the configurations and methods of the embodiments described above; rather, all or some of the embodiments may be selectively combined so that various modifications can be made.

110: image sensor input unit 120: content input unit
130: gaze target interest detection unit
131: facial feature detection unit 132: time-series tag parser unit
133: interest detection unit 134: gaze tracking unit
135: gaze target detection unit 136: interest recording unit
140: interaction application interworking unit
150: database unit
1100: computer system 1110: processor
1120: bus 1130: memory
1131: ROM 1132: RAM
1140: user interface input device
1150: user interface output device
1160: storage 1170: network interface
1180: network

Claims

1. A gaze target interest detection apparatus comprising:
one or more processors; and
an execution memory that stores at least one program executed by the one or more processors,
wherein the at least one program
receives, from at least one image sensor, an image of a user gazing at a display, and receives tag information inserted into tag-based content related to a gaze target output through the display, and
detects, from the image, facial feature information of the user looking at the gaze target and gaze information obtained by tracking the gaze of the user looking at the gaze target, and detects the user's degree of interest in the gaze target based on the facial feature information, the gaze information, and the tag information.

2. The apparatus according to claim 1,
wherein the tag information
includes at least one of a location, area, size, attribute, product category, and material of the gaze target output on the display.

3. The apparatus according to claim 2,
wherein the at least one program,
when the user's gaze point is determined from the gaze information to have been maintained for a preset time, detects the user's degree of interest by detecting, based on the facial feature information, a change in the user's facial features during the preset time.

4. The apparatus according to claim 3,
wherein the user's degree of interest
is detected as at least one of indifference, liking, and dislike according to the change in the facial features.

5. The apparatus according to claim 4,
wherein the at least one program
detects the user's degree of interest based on at least one of a change in feature points of the user's mouth and a change in feature points of the user's eyebrows among the changes in the user's facial features.

6. A gaze target interest detection method of a gaze target interest detection apparatus, the method comprising:
receiving, from at least one image sensor, an image of a user gazing at a display, and receiving tag information inserted into tag-based content related to a gaze target output through the display; and
detecting, from the image, facial feature information of the user looking at the gaze target and gaze information obtained by tracking the gaze of the user looking at the gaze target, and detecting the user's degree of interest in the gaze target based on the facial feature information, the gaze information, and the tag information.

7. The method of claim 6,
wherein the tag information
includes at least one of a location, area, size, attribute, product category, and material of the gaze target output on the display.

8. The method of claim 7,
wherein the detecting of the user's degree of interest comprises,
when the user's gaze point is determined from the gaze information to have been maintained for a preset time, detecting the user's degree of interest by detecting, based on the facial feature information, a change in the user's facial features during the preset time.

9. The method of claim 8,
wherein the user's degree of interest
is detected as at least one of indifference, liking, and dislike according to the change in the facial features.

10. The method of claim 9,
wherein the detecting of the user's degree of interest comprises
detecting the user's degree of interest based on at least one of a change in feature points of the user's mouth and a change in feature points of the user's eyebrows among the changes in the user's facial features.