KR102195068B1

KR102195068B1 - User terminal with Line-of-sight Matching function and method using the same

Info

Publication number: KR102195068B1
Application number: KR1020190105091A
Authority: KR
Inventors: 손명규; 김현덕; 이상헌
Original assignee: 재단법인대구경북과학기술원
Priority date: 2019-08-27
Filing date: 2019-08-27
Publication date: 2020-12-24

Abstract

The present invention relates to a user terminal with a line-of-sight matching function and a line-of-sight matching method using the same. The user terminal includes: a database storing the size of the display screen of the user terminal and the position of the camera lens of the user terminal; a receiving unit receiving an image captured by the camera; a face information detection unit acquiring the user's face and pupil position information by applying the input image to machine learning; a control unit acquiring an actual viewpoint value by estimating the user's line of sight from the face and pupil position information and deriving a corrected final viewpoint value from the actual viewpoint value; an eye image generation unit generating a pupil image with a changed viewpoint by applying the final viewpoint value to the pupil image included in the input image; and an output unit matching the pupil image with the changed viewpoint to the image and outputting a corrected image on the display screen. By applying a viewpoint correction value to the user's actual line-of-sight value, the user terminal can correct the line-of-sight mismatch caused by the positional difference between the display screen and the camera. In addition, by applying deep learning to the user's eye image to generate a new eye image, it can correct the line of sight while preserving the original image.

Description

User terminal with Line-of-sight Matching function and method using the same

The present invention relates to a user terminal equipped with a line-of-sight matching function and a line-of-sight matching method using the same. More particularly, it relates to a user terminal, and a method using it, that corrects the line-of-sight mismatch arising when a user takes a self-portrait or makes a video call, because the display screen and the camera lens mounted on the user terminal are at different positions.

In general, a smartphone user takes a self-portrait or makes a video call while watching his or her own image, or the other party's image, on the smartphone's display screen.

The camera of a current smartphone is located outside the display screen. Therefore, when taking a self-portrait, the user shoots while looking at his or her own image on the display screen, and in the resulting photograph the user appears to be looking somewhere other than at the camera.

That is, in an image captured while the user's gaze is fixed on the camera, the user appears to be looking straight ahead. Conversely, in an image captured while the user's gaze is fixed on the display screen, the user appears to be looking elsewhere, because the gaze is directed at the display screen rather than at the camera lens.

However, if the user looks straight at the camera while shooting, the position of the camera makes it difficult for the user to check the image or video shown on the display screen.

The background art of the present invention is disclosed in Korean Registered Patent Publication No. 10-0307854 (published on November 2, 2001).

The technical problem to be solved by the present invention is to provide a user terminal equipped with a line-of-sight matching function, and a line-of-sight matching method using the same, that correct the mismatched line of sight arising when a self-portrait is taken or a video call is made with the user terminal, because the display screen and the camera lens mounted on the user terminal are at different positions.

According to an embodiment of the present invention for solving this technical problem, a user terminal equipped with a line-of-sight matching function includes: a database that stores the display screen size of the user terminal and position information of the camera lens mounted on the user terminal; a receiving unit that receives an image captured by the camera; a face information detection unit that applies the input image to machine learning to obtain the user's face position information and pupil position information; a control unit that estimates the user's gaze from the obtained face position information and pupil position information to obtain an actual viewpoint value, and derives a corrected final viewpoint value from the obtained actual viewpoint value; an eye image generation unit that applies the final viewpoint value to the pupil image included in the input image to generate a pupil image with a changed viewpoint; and an output unit that matches the pupil image with the changed viewpoint to the image and outputs the corrected image on the display screen.

The face information detection unit may input images captured by the camera into a pre-built machine learning model to train it on the user's face position and pupil positions, input an image acquired after training is complete into the model to detect the user's face position and pupil positions, and use the distance between the two detected pupils to obtain information on the distance between the display screen and the user.

The control unit may extract an eye image from the image using the obtained pupil position information, input the extracted eye image into a pre-built machine learning model to train it to detect coordinate values for the outline of the eye and coordinate values for the pupil, and input the image captured at the current time into the trained model to estimate the actual viewpoint value of the user's gaze.

The control unit may determine whether the estimated actual viewpoint value lies inside the display screen and, if it does, apply the standard viewpoint value to the actual viewpoint value.
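The inside-the-screen test described above can be pictured with a small sketch. It assumes, purely for illustration, that the estimated gaze has already been projected to a point on the display plane, measured in millimetres from the top-left corner of the screen; the function name and the coordinate convention are hypothetical and are not taken from the patent.

```python
from typing import Tuple


def gaze_inside_display(gaze_xy_mm: Tuple[float, float],
                        display_w_mm: float,
                        display_h_mm: float) -> bool:
    """Return True if the gaze point falls within the display rectangle.

    Assumed convention: origin at the top-left corner of the screen,
    x to the right, y downward, all values in millimetres.
    """
    x, y = gaze_xy_mm
    return 0.0 <= x <= display_w_mm and 0.0 <= y <= display_h_mm


# A point in the middle of a 70 mm x 150 mm screen is inside;
# a point 8 mm above the top edge (where a camera might sit) is not.
inside = gaze_inside_display((35.0, 75.0), 70.0, 150.0)
outside = gaze_inside_display((35.0, -8.0), 70.0, 150.0)
```

Under this reading, correction would be applied only when the check returns True, that is, when the user is looking at the screen rather than already at the camera.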

The final viewpoint value may be calculated through the following equation.

[Equation not preserved in this text]

Here, the three symbols of the equation denote, in order, the final viewpoint value, the standard viewpoint value, and the actual viewpoint value.

The standard viewpoint value may be obtained, for the case where the user's gaze is directed at the camera, from the pre-stored position information of the camera lens mounted on the user terminal, the eye position information, and the distance between the display screen and the user.

The eye image generation unit may use the input images to store eye images for all of the user's gaze directions, together with their corresponding coordinates, in the form of a lookup table, and then extract from the lookup table the image that matches the final viewpoint value.

The eye image generation unit may train on images using a GAN (Generative Adversarial Network) and, once training is complete, apply the final viewpoint value to the user's eye image included in the input image to generate an eye image with a corrected viewpoint.

According to another embodiment of the present invention, a line-of-sight matching method using a learning-based user terminal includes: receiving an image captured by the camera while the display screen size of the user terminal and the position information of the camera lens corresponding to the display screen are stored; applying the input image to machine learning to obtain the user's face position information and pupil position information; estimating the user's gaze from the obtained face position information and pupil position information to obtain an actual viewpoint value, and deriving a corrected final viewpoint value from the obtained actual viewpoint value; applying the final viewpoint value to the pupil image included in the input image to generate a pupil image with a changed viewpoint; and matching the pupil image with the changed viewpoint to the image and outputting the corrected image on the display screen.

As described above, according to the present invention, the user terminal applies a viewpoint compensation value to the user's actual gaze value so that the line-of-sight mismatch caused by the different positions of the display screen and the camera can be corrected, and it generates a new eye image by applying deep learning to the user's eye image, so that only the gaze is corrected while the original image is preserved.

In addition, according to the present invention, the user terminal corrects in software the mismatched line of sight caused by the different positions of the display screen and the camera during a self-portrait or video call, so that the user can make more natural eye contact while taking the self-portrait or making the video call.

FIG. 1 is a block diagram illustrating a user terminal according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating a line-of-sight matching method according to an embodiment of the present invention.
FIG. 3 is an exemplary diagram illustrating how the face position and pupil positions are obtained by machine learning in step S220 shown in FIG. 2.
FIG. 4 is a flowchart illustrating step S240 shown in FIG. 2.
FIG. 5 is an exemplary diagram illustrating the pupil coordinate values and gaze angle values output through machine learning in step S242 shown in FIG. 4.
FIG. 6 is a diagram illustrating how the standard viewpoint value is obtained in step S243 shown in FIG. 4.
FIG. 7 is a diagram illustrating how the actual viewpoint value is obtained in step S244 shown in FIG. 4.
FIG. 8 is a diagram illustrating how an image is extracted through deep learning in step S252 shown in FIG. 2.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. In the drawings, the thickness of lines and the size of components may be exaggerated for clarity and convenience of description.

In addition, the terms used below are defined in consideration of their functions in the present invention, and may vary according to the intention or custom of users or operators. Therefore, these terms should be defined based on the content of this specification as a whole.

Hereinafter, a user terminal equipped with a line-of-sight matching function according to an embodiment of the present invention will be described in more detail with reference to FIG. 1.

As shown in FIG. 1, the user terminal 100 according to an embodiment of the present invention includes a database 110, a receiving unit 120, a face information detection unit 130, a control unit 140, an eye image generation unit 150, and an output unit 160.

First, the database 110 stores the size of the display screen included in the user terminal 100 and position information of the camera lens mounted on the user terminal 100.

More specifically, the display screen is located on the front of the user terminal and its size differs according to the product specification. The camera lens used for self-portraits or video calls is located not at the inner center of the display screen but outside its upper edge. Accordingly, the database 110 stores the size of the user terminal 100 and the position information of the camera lens mounted on the user terminal 100, as input by the user. Here, the user terminal is a terminal capable of self-portraits or video calls, and includes a smartphone, a PDA, a notebook computer, and the like.

Next, the receiving unit 120 receives an image captured by the camera mounted on the user terminal 100 and passes the received image to the face information detection unit 130.

The face information detection unit 130 obtains position information for the user's face and pupils as shown on the display screen of the user terminal 100.

In detail, the face information detection unit 130 inputs the received images into a pre-built machine learning model. Using the plurality of input images, it trains the model to output coordinates for the user's face position and eye positions. Once training is complete, the face information detection unit 130 inputs the image received at the current time into the model to detect coordinate values for the user's face position and eye positions.

Next, the face information detection unit 130 uses the coordinates of the detected eye positions to obtain the distance between the two eyes. It then uses this inter-pupil distance to estimate the distance between the display screen of the user terminal and the user.

The control unit 140 extracts an eye image from the image using the coordinate values of the eye positions, and inputs the extracted eye image into the machine learning model to obtain position information for the pupils. The control unit 140 then substitutes the obtained pupil position information into the standard gaze value, which is obtained on the assumption that the user's gaze is on the camera, to derive the difference in gaze angle. The control unit 140 treats this derived gaze angle difference as the actual viewpoint value. Next, the control unit 140 calculates the final viewpoint value from the difference between the actual viewpoint value and the standard viewpoint value.

The eye image generation unit 150 applies the calculated final viewpoint value to the eye image to generate a pupil image with a changed viewpoint. The eye image generation unit 150 may generate the pupil image by either of two methods.

In the first method, the eye image generation unit 150 stores a plurality of eye images with different gaze directions, together with the coordinates for each eye image, in the form of a lookup table, and then extracts from the lookup table the eye image that matches the final viewpoint value.
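The lookup-table method can be sketched as a nearest-key search. The text does not say how a "matching" entry is chosen, so the sketch below assumes the table is keyed by an angle pair and that the closest key in Euclidean distance is taken; the file names and the function `nearest_eye_image` are hypothetical.

```python
import math
from typing import Dict, Tuple

# Assumed key format: (pupil-position angle, gaze angle) in degrees.
Angles = Tuple[float, float]


def nearest_eye_image(table: Dict[Angles, str], final_value: Angles) -> str:
    """Return the stored eye image whose key is closest to final_value."""
    if not table:
        raise ValueError("lookup table is empty")
    # math.dist computes the Euclidean distance between two points.
    best_key = min(table, key=lambda key: math.dist(key, final_value))
    return table[best_key]


# Hypothetical table: each value stands in for a stored eye image.
eye_table = {
    (0.0, 0.0): "eye_center.png",
    (0.0, 5.0): "eye_up_5.png",
    (0.0, 10.0): "eye_up_10.png",
}
chosen = nearest_eye_image(eye_table, (0.5, 6.0))  # closest key is (0.0, 5.0)
```

A denser table gives smoother corrections at the cost of storage, which may be one reason the text also offers a generative second method.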

In the second method, the eye image generation unit 150 trains on eye images using a GAN (Generative Adversarial Network) and, once training is complete, applies the final viewpoint value to the user's eye image included in the received image to generate an eye image with a corrected viewpoint.

Finally, the output unit 160 composites the pupil image whose viewpoint has been corrected according to the final viewpoint value into the image, and outputs the corrected image through the display screen.
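The compositing step can be illustrated with a minimal sketch that overwrites a rectangular region of the frame with the corrected eye patch. The image representation (a list of grayscale rows) and the function `paste_patch` are assumptions made for illustration; the text does not specify how the patch is merged, and a practical implementation would likely blend the patch edges.

```python
from typing import List

Image = List[List[int]]  # assumed: rows of grayscale pixel values


def paste_patch(frame: Image, patch: Image, top: int, left: int) -> Image:
    """Return a copy of frame with patch written at (top, left).

    Pixels of the patch that would fall outside the frame are skipped,
    and the input frame itself is left unmodified.
    """
    out = [row[:] for row in frame]
    for dy, patch_row in enumerate(patch):
        for dx, value in enumerate(patch_row):
            y, x = top + dy, left + dx
            if 0 <= y < len(out) and 0 <= x < len(out[y]):
                out[y][x] = value
    return out


frame = [[0] * 4 for _ in range(3)]
corrected_eye = [[9, 9], [9, 9]]
result = paste_patch(frame, corrected_eye, top=1, left=2)
```

Because only the eye region is replaced, the rest of the frame stays identical to the captured image, which matches the stated goal of correcting the gaze while keeping the original image.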

Hereinafter, a method of matching the line of sight using the user terminal according to an embodiment of the present invention will be described in more detail with reference to FIGS. 2 to 7.

FIG. 2 is a flowchart illustrating a line-of-sight matching method according to an embodiment of the present invention, and FIG. 3 is an exemplary diagram illustrating how the face position and pupil positions are obtained by machine learning in step S220 shown in FIG. 2.

First, the database 110 stores the display screen size input by the user and the position information of the camera lens mounted outside the display screen.

Next, as shown in FIG. 2, the user terminal 100 according to an embodiment of the present invention receives an image captured by the camera mounted on its periphery (S210). Here, the image is one that the user captures of himself or herself while taking a self-portrait or making a video call.

The receiving unit 120 then passes the received captured image to the face information detection unit 130.

The face information detection unit 130 applies the received image to machine learning to obtain the user's face position information and eye position information (S220).

The face information detection unit 130 then uses the obtained pupil position information to obtain information on the distance between the display screen and the user's face (S230).

Steps S220 and S230 are described in more detail as follows. First, the face information detection unit 130 receives from the receiving unit 120 a plurality of images captured by the user, and inputs the received images into a pre-built machine learning model to train it.

Then, as shown in FIG. 3, the trained model tracks the user's face and pupils in the input image and sets bounding boxes at the positions corresponding to the face and the eyes. The model outputs each bounding box as four coordinate values.

Once the coordinate values for the user's face position and eye positions have been output, the face information detection unit 130 uses the output eye position coordinates to obtain the distance between the user's two eyes. It then estimates the distance between the display screen of the user terminal and the user from this inter-eye distance.
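One way to picture steps S220 to S230 is with a pinhole-camera sketch. The text does not give the formula used to turn the inter-eye distance into a viewing distance, so everything below is an assumption: the box format `(x_min, y_min, x_max, y_max)`, the focal length in pixels, and a typical adult interpupillary distance of about 63 mm. The function names are hypothetical.

```python
import math
from typing import Tuple

# Assumed bounding-box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def box_center(box: Box) -> Tuple[float, float]:
    """Center of a bounding box given by its four coordinate values."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def estimate_distance_mm(left_eye: Box, right_eye: Box,
                         focal_length_px: float,
                         real_eye_distance_mm: float = 63.0) -> float:
    """Pinhole estimate: distance = focal length * real size / pixel size."""
    (lx, ly), (rx, ry) = box_center(left_eye), box_center(right_eye)
    pixel_distance = math.hypot(rx - lx, ry - ly)
    if pixel_distance == 0:
        raise ValueError("eye boxes coincide; cannot estimate distance")
    return focal_length_px * real_eye_distance_mm / pixel_distance


# Eye centers 120 px apart with an assumed 600 px focal length.
distance = estimate_distance_mm((100, 200, 140, 220), (220, 200, 260, 220),
                                focal_length_px=600.0)
```

Strictly, this yields the camera-to-user distance; treating it as the display-to-user distance relies on the camera lying in, or very near, the plane of the display.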

Next, the face information detection unit 130 passes the obtained coordinate values for the user's face position and eye positions to the control unit 140.

The control unit 140 then uses the received eye position coordinate values to obtain the user's actual viewpoint value, and derives the corrected final viewpoint value from the obtained actual viewpoint value (S240).

Hereinafter, step S240 will be described in more detail with reference to FIGS. 4 and 5.

FIG. 4 is a flowchart illustrating step S240 shown in FIG. 2, and FIG. 5 is an exemplary diagram illustrating the pupil coordinate values and gaze angle values output through machine learning in step S242 shown in FIG. 4.

First, the control unit 140 inputs eye images obtained from the captured images into a pre-built machine learning model to train it (S241).

That is, the model is trained to receive an eye image and output, from it, coordinate values for the outline of the eye and coordinate values for the pupil.

As shown in FIG. 5, the control unit 140 extracts an eye image using the coordinate values of the user's eye position and feeds the extracted eye image to the machine learning model as the input image. The model then outputs coordinate values for the outline of the eye, shown in green, and coordinate values for the pupil, shown in red. The control unit 140 obtains one output image for each input image fed to the model. FIG. 5, however, shows a plurality of output images for one input image in order to illustrate how the pupil position and the eye outline change with the user's gaze.

When training of the model is complete in step S241, the control unit 140 uses the eye image included in the image input at the current time to obtain coordinate values for the eye outline and for the pupil (S242).

The control unit 140 then obtains the standard viewpoint value using the position information of the camera lens mounted on the user terminal 100, which is stored in the database 110, together with the pupil position information and the distance between the display screen and the user obtained in step S230 (S243).

Here, the standard viewpoint value is the value obtained when the user looks directly at the camera.

Hereinafter, a method of obtaining the standard viewpoint value will be described in more detail with reference to FIG. 6.

FIG. 6 is a diagram illustrating how the standard viewpoint value is obtained in step S243 shown in FIG. 4.

As shown in (a) of FIG. 6, the control unit 140 uses the pre-stored position information of the camera lens to generate a vertical line (v) within the display screen, centered on the position of the camera lens. The control unit 140 then uses the pupil position information obtained in step S220 to generate a horizontal line (h) within the display screen, and sets the point where the vertical line (v) and the horizontal line (h) cross at right angles as the intersection point.

Next, the control unit 140 sets the midpoint between the user's two eyes, as output on the display screen, as the center point on the display. As shown in (b) of FIG. 6, the control unit 140 also sets the midpoint between the two eyes on the user's actual face as the center point (O) in 3D space.

The distance between the point set on the display screen and the midpoint (O) between the user's eyes can be expressed as r_o.

Next, the control unit 140 obtains the standard viewpoint value using the angle for the pupil position and the angle for the gaze; the standard viewpoint value is expressed as the pair of these two angles.

Here, the angle for the pupil position is formed by the intersection point, the center point (O) in 3D space, and the center point on the display, and the angle for the gaze is formed by the center point on the display, the center point (O) in 3D space, and the position (c) of the camera lens.
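The two angles can be computed from point coordinates with a short sketch. It assumes that each angle has its vertex at the 3D center point O (the middle point in each triple listed above) and that all points are expressed in one common 3D coordinate system in millimetres; the function names and the sample coordinates are hypothetical.

```python
import math
from typing import Tuple

Point3 = Tuple[float, float, float]


def angle_at(vertex: Point3, a: Point3, b: Point3) -> float:
    """Angle in degrees at vertex between the rays vertex->a and vertex->b."""
    va = tuple(p - q for p, q in zip(a, vertex))
    vb = tuple(p - q for p, q in zip(b, vertex))
    norm_a, norm_b = math.hypot(*va), math.hypot(*vb)
    if norm_a == 0 or norm_b == 0:
        raise ValueError("a and b must differ from the vertex")
    cos_theta = sum(p * q for p, q in zip(va, vb)) / (norm_a * norm_b)
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


def viewpoint_value(intersection: Point3, display_center: Point3,
                    camera: Point3, eye_midpoint: Point3) -> Tuple[float, float]:
    """Return (pupil-position angle, gaze angle), both measured at O."""
    pupil_angle = angle_at(eye_midpoint, intersection, display_center)
    gaze_angle = angle_at(eye_midpoint, display_center, camera)
    return (pupil_angle, gaze_angle)


# Display in the plane z = 0, user 300 mm in front of it (made-up numbers).
value = viewpoint_value(intersection=(0.0, 300.0, 0.0),
                        display_center=(0.0, 0.0, 0.0),
                        camera=(0.0, 300.0, 0.0),
                        eye_midpoint=(0.0, 0.0, 300.0))
```

In this made-up geometry both angles come out at 45 degrees, since the intersection point and the camera are placed at the same spot, 300 mm above the display center point.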

When the standard viewpoint value has been obtained in step S243, the control unit 140 uses the pupil coordinate values obtained in step S242 to obtain the actual viewpoint value (S244).

At this time, the user may be looking directly at the camera, or may be looking at his or her own eyes as output on the display screen.

Hereinafter, a method of obtaining the actual viewpoint value will be described in more detail with reference to FIG. 7.

FIG. 7 is a diagram illustrating how the actual viewpoint value is obtained in step S244 shown in FIG. 4.

부연하자면, 제어부(140)는 표준시점값(

)을 획득하는 방법과 동일하게 실제 시점값(

)을 획득한다. In addition, the control unit 140 is a standard time point value (

In the same way as to obtain the actual point value (

).

다만, 도 7의 (a)에 도시된 바와 같이, 디스플레이상의 중심점(

)은 S242단계에서 획득한 눈동자의 좌표값을 이용하여 설정한다. 즉, 제어부(140)는 눈동자와 눈동자 사이에 위치하는 중간지점을 디스플레이상의 중심점(

)으로 설정한다. However, as shown in Figure 7 (a), the center point on the display (

) Is set using the coordinate value of the pupil obtained in step S242. That is, the control unit 140 selects an intermediate point located between the pupil and the pupil.

).

For example, assuming that the user is looking at his or her own eyes shown on the display screen, the user's pupils appear on the display as if they were looking toward the bottom of the display, as shown in FIG. 7(b). The control unit 140 may then, according to the pupil coordinate values, set the center point on the display at a position slightly lower than the horizontal line (h) in the display screen.

On the other hand, assuming that the user is looking at the camera, the control unit 140 may, according to the pupil coordinate values, set the center point on the display at the same position as the horizontal line (h) in the display screen.

Next, the control unit 140 obtains the actual viewpoint value using the angle for the pupil position and the angle for the line of sight, and the actual viewpoint value is expressed as the pair of these two angles.

Here, the angle for the pupil position is formed by the intersection point, the center point (O) in 3D space, and the center point on the display, and the angle for the line of sight is formed by the center point on the display, the center point (O) in 3D space, and the position (c) of the camera lens.

Next, the control unit 140 calculates the final viewpoint value by applying the obtained standard viewpoint value and actual viewpoint value to Equation 1 below (S245).

Here, the symbols in Equation 1 denote the final viewpoint value, the standard viewpoint value, and the actual viewpoint value.

That is, the final viewpoint value is calculated from the difference between the standard viewpoint value and the actual viewpoint value. Because the standard viewpoint value is obtained under the assumption that the user's gaze is directed at the camera, the actual viewpoint value equals the standard viewpoint value when the user looks directly at the camera. In that case, the final viewpoint value is 0.
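The relationship described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the function names, the 3D coordinates, and in particular the sign convention (standard minus actual) are assumptions made for illustration, since the text only states that the final viewpoint value is derived from the difference between the two values. Each viewpoint value is modeled as a pair of angles measured at the center point O.

```python
import math


def angle_at(vertex, p, q):
    """Return the angle in radians at `vertex` between the rays to p and q."""
    u = [a - b for a, b in zip(p, vertex)]
    v = [a - b for a, b in zip(q, vertex)]
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(a * a for a in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("p and q must both differ from vertex")
    cos_angle = sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_angle)))


def viewpoint_value(intersection, origin, display_center, camera):
    """Hypothetical helper: pair of (pupil-position angle, line-of-sight angle)."""
    pupil_angle = angle_at(origin, intersection, display_center)
    gaze_angle = angle_at(origin, display_center, camera)
    return (pupil_angle, gaze_angle)


def final_viewpoint_value(standard, actual):
    """Component-wise difference; the sign convention is an assumption."""
    return tuple(s - a for s, a in zip(standard, actual))
```

With this sketch, passing the same pair as both the standard and the actual viewpoint value yields a zero correction, which matches the case where the user looks directly at the camera.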

When derivation of the final viewpoint value is completed in step S240, the control unit 140 transmits the derived final viewpoint value to the eye image generation unit 150.

The eye image generation unit 150 then generates a pupil image with a changed viewpoint by applying the final viewpoint value to the pupil image included in the input image (S250).

At this time, the eye image generation unit 150 generates the pupil image with the changed viewpoint by one of two methods.

In the first method, the eye image generation unit 150 first collects eye images for all gaze directions of the user. The eye image generation unit 150 then stores the collected eye images and the coordinate values corresponding to each eye image in the form of a lookup table. Accordingly, the eye image generation unit 150 extracts an eye image by comparing the received final viewpoint value with the coordinate values stored in the lookup table.
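The lookup-table comparison can be sketched as follows. This is only an illustration under stated assumptions: the text says the final viewpoint value is compared with the stored coordinate values but does not specify the comparison rule, so a nearest-neighbor match by squared Euclidean distance is assumed here. The function name `nearest_eye_image` and the table layout (a list of coordinate and image pairs) are hypothetical.

```python
def nearest_eye_image(lookup_table, final_value):
    """Return the stored image whose coordinates are closest to final_value.

    lookup_table: list of (coords, image) pairs, where coords is a tuple of
    numbers with the same length as final_value.
    """
    if not lookup_table:
        raise ValueError("lookup table is empty")

    def squared_distance(entry):
        coords, _image = entry
        return sum((c - f) ** 2 for c, f in zip(coords, final_value))

    # Assumed matching rule: pick the entry with the smallest distance.
    _coords, image = min(lookup_table, key=squared_distance)
    return image
```

For example, with entries stored at (0.0, 0.0), (0.1, 0.0), and (0.0, 0.1), a final viewpoint value of (0.08, 0.01) selects the image stored at (0.1, 0.0).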

In the second method, the eye image generation unit 150 extracts an eye image whose gaze has been corrected through deep learning.

FIG. 8 is a diagram illustrating a method of extracting an eye image whose gaze has been corrected through deep learning in step S252 shown in FIG. 2.

As shown in FIG. 8, the eye image generation unit 150 trains on pupil images using a pre-built GAN (Generative Adversarial Network).

That is, the eye image generation unit 150 trains the GAN to receive the user's eye image and the final viewpoint value as inputs and to output a pupil image having a new gaze value.

When training of the GAN is completed, the eye image generation unit 150 inputs the image of the user's eyes included in the input image into the GAN. The GAN then generates a pupil image having a corrected viewpoint by applying the final viewpoint value.

When generation of the pupil image having the new viewpoint is completed in step S250, the output unit 160 matches the generated pupil image to the input image and outputs the corrected image (S260).

That is, the output unit 160 outputs the image with only the gaze corrected, while keeping the appearance of the eyes included in the input image as it is.
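The matching step can be pictured, in a simplified form, as writing the generated eye patch back into the eye region of the input frame while leaving all other pixels untouched. The sketch below assumes images are plain 2D lists of pixel values and that the eye region's top-left corner is already known; the function name `paste_patch` and this representation are illustrative assumptions, and no blending or alignment is attempted.

```python
def paste_patch(frame, patch, top, left):
    """Return a copy of `frame` with `patch` written at row `top`, column `left`."""
    height, width = len(frame), len(frame[0])
    patch_h, patch_w = len(patch), len(patch[0])
    if top < 0 or left < 0 or top + patch_h > height or left + patch_w > width:
        raise ValueError("patch does not fit inside the frame")

    # Copy each row so the original frame is preserved.
    out = [row[:] for row in frame]
    for r in range(patch_h):
        out[top + r][left:left + patch_w] = patch[r]
    return out
```

Because the function returns a copy, the original frame remains available, which is in keeping with the stated aim of maintaining the original image and changing only the eye region.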

As described above, the user terminal according to an embodiment of the present invention applies a viewpoint compensation value to the user's actual gaze value, so that a gaze mismatch caused by the difference in position between the display screen and the camera can be corrected. In addition, by applying deep learning to the user's eye image to generate a new eye image, only the gaze is corrected while the original image is maintained.

Furthermore, when a self-portrait is taken or a video call is made, the user terminal according to an embodiment of the present invention corrects in software the gaze mismatch caused by the difference in position between the display screen and the camera, so that the user can make eye contact more naturally during self-portrait capture or a video call.

The present invention has been described with reference to the embodiments shown in the drawings, but these are merely exemplary, and those of ordinary skill in the art will understand that various modifications and other equivalent embodiments are possible therefrom. Therefore, the true technical scope of protection of the present invention should be determined by the technical idea of the following claims.

100: user terminal
110: database
120: receiving unit
130: face information detection unit
140: control unit
150: eye image generation unit
160: output unit

Claims

A user terminal equipped with a gaze matching function, comprising:
a database storing the display screen size of the user terminal and position information of a camera lens mounted on the user terminal;
a receiving unit receiving an image captured through the camera;
a face information detection unit obtaining face position information and pupil position information of a user by applying the input image to machine learning;
a control unit that extracts an eye image from the input image using the obtained pupil position information, inputs the extracted eye image into pre-built machine learning to learn a method of detecting coordinate values of the outline of the eye and coordinate values of the pupil, estimates an actual viewpoint value for the user's gaze by inputting an image captured at the current time into the machine learning for which the learning is completed, and derives a corrected final viewpoint value from the estimated actual viewpoint value;
an eye image generation unit generating a pupil image with a changed viewpoint by applying the final viewpoint value to the pupil image included in the input image; and
an output unit matching the pupil image with the changed viewpoint to the image and outputting a corrected image on the display screen,
wherein the face information detection unit
inputs the image captured by the camera into pre-built machine learning to learn the user's face position and pupil position, detects the user's face position and pupil position by inputting an acquired image into the machine learning after the learning is completed, and
acquires information on the distance between the display screen and the user by using the detected distance between the two pupils.

(deleted)

The user terminal of claim 1,
wherein the control unit
determines whether the estimated actual viewpoint value is located inside the display screen and, when the actual viewpoint value is located inside the display screen according to the determination result, calculates the final viewpoint value by applying a standard viewpoint value to the actual viewpoint value.

The user terminal of claim 4,
wherein the final viewpoint value
is calculated through the following equation:

where the symbols denote the final viewpoint value, the standard viewpoint value, and the actual viewpoint value.

The user terminal of claim 5,
wherein the standard viewpoint value
is obtained, for a state in which the user's gaze is directed toward the camera, from the position information of the camera lens, the position information of the pupil, and the distance value between the display screen and the user.

The user terminal of claim 5,
wherein the eye image generation unit
stores, using the input image, eye images for all gaze directions of the user and the coordinate values corresponding to them in the form of a lookup table, and then extracts an image matching the final viewpoint value from the lookup table.

The user terminal of claim 5,
wherein the eye image generation unit
trains on images using a Generative Adversarial Network (GAN) and, when the training is completed, generates a pupil image having a corrected viewpoint by applying the final viewpoint value to the user's eye image included in the input image.

A gaze matching method using a user terminal, comprising:
receiving an image captured through a camera in a state in which the display screen size of the user terminal and position information of the camera lens mounted on the user terminal are stored;
obtaining face position information and pupil position information of a user by applying the input image to machine learning;
extracting an eye image from the image using the obtained pupil position information, inputting the extracted eye image into pre-built machine learning to learn a method of detecting coordinate values of the outline of the eye and coordinate values of the pupil, then estimating an actual viewpoint value for the user's gaze by inputting an image received at the current time into the machine learning for which the learning is completed, and deriving a corrected final viewpoint value from the estimated actual viewpoint value;
generating a pupil image with a changed viewpoint by applying the final viewpoint value to the pupil image included in the input image; and
matching the pupil image with the changed viewpoint to the image and outputting a corrected image on the display screen,
wherein the obtaining of the user's face position information and pupil position information comprises:
learning the user's face position and pupil position by inputting the image captured through the camera into pre-built machine learning;
detecting the user's face position and pupil position by inputting an acquired image into the machine learning after the learning is completed; and
acquiring information on the distance between the display screen and the user by using the detected distance between the two pupils.

(deleted)

The method of claim 9,
wherein the deriving of the final viewpoint value comprises
determining whether the estimated actual viewpoint value is located inside the display screen and, when the actual viewpoint value is located inside the display screen according to the determination result, calculating the final viewpoint value by applying a standard viewpoint value to the actual viewpoint value.

The method of claim 9,
wherein the final viewpoint value
is calculated through the following equation:

where the symbols denote the final viewpoint value, the standard viewpoint value, and the actual viewpoint value.

The method of claim 13,
wherein the standard viewpoint value
is obtained, for a state in which the user's gaze is directed toward the camera, from the position information of the camera lens mounted on the user terminal, the position information of the pupil, and the distance value between the display screen and the user.

The method of claim 13,
wherein the generating of the pupil image with the changed viewpoint comprises
storing, using the input image, eye images for all gaze directions of the user and the coordinate values corresponding to them in the form of a lookup table, and then extracting an image matching the final viewpoint value from the lookup table.

The method of claim 13,
wherein the generating of the pupil image with the changed viewpoint comprises
training on images using a Generative Adversarial Network (GAN) and, when the training is completed, generating a pupil image having a corrected viewpoint by applying the final viewpoint value to the user's eye image included in the input image.