KR102548476B1

KR102548476B1 - System for social interaction feedback using deep learning and method thereof

Info

Publication number: KR102548476B1
Application number: KR1020210066628A
Authority: KR
Inventors: 전동욱; 김성진; 정도운
Original assignee: 인제대학교 산학협력단
Priority date: 2021-05-25
Filing date: 2021-05-25
Publication date: 2023-06-28
Also published as: KR20220158958A

Abstract

The present invention relates to a social interaction feedback method using a deep learning-based social interaction feedback system. The method includes: providing a subject, through a user terminal, with a facial image of another party whose facial expression changes at regular time intervals; analyzing the subject's facial expression by applying a facial image of the subject, captured by a camera while the subject watches the other party's facial image on the screen of the user terminal, to a deep learning model; calculating a facial expression feedback rate indicating whether the subject's facial expression matches the other party's facial expression; calculating an immersion rate indicating whether the subject gazes at the other party's eye region, using the subject's gaze result tracked by a gaze tracking device; computing an interaction feedback value using the calculated facial expression feedback rate and immersion rate; and providing the interaction feedback value for the subject to the user terminal.
According to the present invention, training for social interaction based on immersion and facial expression feedback can be provided to the subject. In addition, by giving the subject objective feedback through the interaction feedback value, social interaction can be promoted and social function improved in patients with severe social anxiety or schizophrenia.

Description

Social interaction feedback system using deep learning and its method {SYSTEM FOR SOCIAL INTERACTION FEEDBACK USING DEEP LEARNING AND METHOD THEREOF}

The present invention relates to a social interaction feedback system using deep learning and a method thereof, and more particularly, to a system and method that provide an interaction feedback value by calculating a subject's immersion rate and facial expression feedback rate through deep learning-based analysis of the subject's facial expression.

As social beings, humans have a constant need to form relationships with others and to strengthen social bonds. From an evolutionary point of view, such relationships form because belonging to a social group is more advantageous for survival than living alone.

Language and emotion therefore developed in ways that favor forming these social relationships. Emotion developed earlier than language; whereas language developed to convey information accurately, emotion is a means of communication that can be conveyed faster than language. The ability to read other people's emotions and to express one's own emotions in an appropriate response makes it possible to distinguish hostile from friendly relationships and to recognize dangerous situations more quickly. Maintaining social relationships also depends on making eye contact with the other party and showing a facial expression that fits the situation at the right moment.

In psychiatry, however, patients with severe social anxiety or schizophrenia have difficulty making social eye contact and producing social smiles, and this difficulty hinders social interaction and degrades social function.

Accordingly, patients who find it difficult to form social relationships need training in social interaction, but such training is difficult because no dedicated tool for practicing social interaction between people exists.

Therefore, a technology is needed that uses the patient's emotions, facial expressions, and the like to help form social relationships with other people.

The background art of the present invention is disclosed in Korean Patent Publication No. 10-1913811 (published on October 31, 2018).

The technical problem to be solved by the present invention is to provide a social interaction feedback system using deep learning, and a method thereof, that provide an interaction feedback value by calculating a subject's immersion rate and facial expression feedback rate through deep learning-based analysis of the subject's facial expression.

According to an embodiment of the present invention for solving this technical problem, a social interaction feedback method using a deep learning-based social interaction feedback system includes: providing a subject, through a user terminal, with a facial image of another party whose facial expression changes at regular time intervals; analyzing the subject's facial expression by applying a facial image of the subject, captured by a camera while the subject watches the other party's facial image on the screen of the user terminal, to a deep learning model; calculating a facial expression feedback rate indicating whether the subject's facial expression matches the other party's facial expression; calculating an immersion rate indicating whether the subject gazes at the other party's eye region, using the subject's gaze result tracked by a gaze tracking device; computing an interaction feedback value using the calculated immersion rate and facial expression feedback rate; and providing the interaction feedback value for the subject to the user terminal.

In the step of providing the other party's facial image to the subject through the user terminal, a smiling image and an expressionless image of the other party may be provided alternately on the screen of the user terminal at regular time intervals.

In the step of analyzing the subject's facial expression, a deep learning model based on a Residual Masking Network (RMN) may be used to recognize at least one facial expression among anger, neutral, sadness, disgust, happiness, surprise, and fear from the subject's facial image.

In the step of calculating the facial expression feedback rate, the facial expression feedback rate may be calculated through the following equation by measuring the time during which the other party's facial expression and the subject's facial expression match.

Here, A is the total time (sec) during which the other party's facial image is displayed on the user terminal, B is the time (sec) during which the subject's facial expression matches the other party's facial expression, C is the time (sec) during which the subject smiles while the other party is expressionless, and D is the time (sec) during which the subject is expressionless while the other party smiles.

In the step of calculating the immersion rate, the time during which the subject gazes at the other party's eye region may be measured and the subject's immersion rate calculated through the following equation.

In the step of computing the interaction feedback value, the interaction feedback value may be computed by averaging the facial expression feedback rate and the immersion rate.

According to another embodiment of the present invention, a social interaction feedback system using deep learning includes: an image transmission unit that provides a subject, through a user terminal, with a facial image of another party whose facial expression changes at regular time intervals; a facial expression analysis unit that analyzes the subject's facial expression by applying a facial image of the subject, captured by a camera while the subject watches the other party's facial image on the screen of the user terminal, to a deep learning model; a control unit that calculates a facial expression feedback rate indicating whether the subject's facial expression matches the other party's facial expression, and calculates an immersion rate indicating whether the subject gazes at the other party's eye region, using the subject's gaze result tracked by a gaze tracking device; and a feedback unit that computes an interaction feedback value using the calculated immersion rate and facial expression feedback rate and provides it to the user terminal.

As described above, according to the present invention, training for social interaction based on immersion and facial expression feedback can be provided to the subject. In addition, by giving the subject objective feedback through the interaction feedback value, social interaction can be promoted and social function improved in patients with severe social anxiety or schizophrenia. Providing the interaction feedback value to people preparing for interviews can also help them prepare for job interviews.

FIG. 1 is a diagram for explaining a social interaction feedback system using deep learning according to an embodiment of the present invention.
FIG. 2 is a diagram for explaining the configuration of a social interaction feedback system using deep learning according to an embodiment of the present invention.
FIG. 3 is a flowchart illustrating a social interaction feedback method using a deep learning-based social interaction feedback system according to an embodiment of the present invention.
FIG. 4 is an exemplary diagram for explaining step S310 of FIG. 3.
FIG. 5 is an exemplary diagram for explaining step S340 of FIG. 3.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the invention pertains can readily practice them. The present invention may, however, be embodied in many different forms and is not limited to the embodiments described herein. To describe the invention clearly, parts unrelated to the description are omitted from the drawings, and similar parts are given similar reference numerals throughout the specification.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

Embodiments of the present invention are now described in detail with reference to the accompanying drawings so that those of ordinary skill in the art can readily practice them.

Hereinafter, a social interaction feedback system 100 using deep learning according to an embodiment of the present invention is described with reference to FIG. 1.

FIG. 1 is a diagram for explaining a social interaction feedback system using deep learning according to an embodiment of the present invention.

As shown in FIG. 1, the social interaction feedback system 100 according to an embodiment of the present invention includes a user terminal 200, a camera 300, and a gaze tracking device 400.

First, the social interaction feedback system 100 provides the user terminal 200 with an interaction feedback value computed from the subject's immersion rate and facial expression feedback rate.

The user terminal 200 receives, from the social interaction feedback system 100, a facial image of the other party whose facial expression changes at regular time intervals, and displays the received image on its screen.

Here, the user terminal 200 may be implemented as any device that can connect to a network by wire or wirelessly to exchange information. Although this embodiment is described with a PC as the user terminal, it may also be implemented as a notebook computer, a smart pad, a smartphone, or a similar device. When implemented as a smartphone or smart pad, the user terminal can access the social interaction feedback system 100 through a pre-installed application.

Next, the camera 300 is installed on the user terminal 200 side, photographs the subject's face while the subject watches the other party's facial image, and transmits the captured facial image of the subject to the social interaction feedback system 100.

Next, the gaze tracking device 400 may be implemented in a form that the user wears like glasses, as shown in FIG. 1, or as a bar mounted at the bottom of the user terminal 200, and transmits data tracking the subject's gaze to the social interaction feedback system 100.

When implemented as a bar mounted at the bottom of the user terminal 200, the gaze tracking device 400 tracks the subject's eye position using infrared light. When the subject wears a glasses-type gaze tracking device 400, the subject's eye position is tracked using a camera attached to the gaze tracking device 400.

Hereinafter, the configuration of the social interaction feedback system 100 using deep learning according to an embodiment of the present invention is described with reference to FIG. 2.

FIG. 2 is a diagram for explaining the configuration of a social interaction feedback system using deep learning according to an embodiment of the present invention.

As shown in FIG. 2, the social interaction feedback system 100 using deep learning according to an embodiment of the present invention includes an image transmission unit 110, a facial expression analysis unit 120, a control unit 130, and a feedback unit 140.

First, the image transmission unit 110 provides the subject, through the user terminal 200, with a facial image of the other party whose facial expression changes at regular time intervals.

The time interval may vary according to the user's settings, and the image transmission unit 110 may alternately provide a smiling image and an expressionless image of the other party on the screen of the user terminal 200.

For example, the image transmission unit 110 may alternately provide a smiling image and an expressionless image of the other party on the screen of the user terminal 200 at 15-second intervals.

Next, the facial expression analysis unit 120 analyzes the subject's facial expression by applying the subject's facial image, captured by the camera 300 while the subject watches the other party's facial image on the screen of the user terminal 200, to a deep learning model.

Here, the facial expression analysis unit 120 may use a deep learning model based on a Residual Masking Network (RMN) to recognize at least one facial expression among anger, neutral, sadness, disgust, happiness, surprise, and fear from the subject's facial image.

Here, the Residual Masking Network (RMN) refers to a deep learning network that recognizes a person's face in an image using feature maps extracted by a convolutional neural network (CNN) and estimates emotion from the facial expression.

The RMN recognizes the facial expression of the person in an image by using the entire facial image as a feature.

That is, the facial expression analysis unit 120 may analyze the facial expression using the RMN-based deep learning model, with the subject's entire facial image as the feature.
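
As a rough illustration of how a seven-class prediction might be reduced to the smile/expressionless comparison used later in the method, the sketch below takes a per-frame score vector and maps the top class to a coarse label. The class order, the `scores` input, and the mapping of "happiness" to a smile are assumptions made for illustration; the source does not specify the RMN output format or this mapping.

```python
from typing import Sequence

# The seven expression classes named in the description.
# Their order here is an assumption, not the RMN model's actual output order.
EMOTIONS = ("anger", "neutral", "sadness", "disgust",
            "happiness", "surprise", "fear")


def top_emotion(scores: Sequence[float]) -> str:
    """Return the class with the highest score for one face frame."""
    if len(scores) != len(EMOTIONS):
        raise ValueError("expected one score per emotion class")
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return EMOTIONS[best_index]


def coarse_label(emotion: str) -> str:
    """Map a recognized emotion to 'smile', 'neutral', or 'other'.

    Hypothetical mapping: happiness is treated as a smile and neutral as
    expressionless; every other class is kept apart as 'other'.
    """
    if emotion == "happiness":
        return "smile"
    if emotion == "neutral":
        return "neutral"
    return "other"
```

Keeping an explicit "other" label avoids silently counting, say, a surprised face as either a match or a mismatch; how such frames should be treated is not stated in the source.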

Next, the control unit 130 calculates the facial expression feedback rate by measuring the time during which the subject's facial expression matches the other party's facial expression, and calculates the immersion rate by measuring the time during which the subject gazes at the other party's eye region, using the subject's gaze result tracked by the gaze tracking device 400.

Here, the control unit 130 may calculate the facial expression feedback rate by applying different weights depending on the subject's facial expression and the other party's facial expression.

Next, the feedback unit 140 computes an interaction feedback value using the calculated facial expression feedback rate and immersion rate and provides it to the user terminal 200.

Here, the feedback unit 140 may compute the interaction feedback value by averaging the calculated facial expression feedback rate and immersion rate.

Hereinafter, a social interaction feedback method using the deep learning-based social interaction feedback system is described with reference to FIGS. 3 to 5.

FIG. 3 is a flowchart illustrating a social interaction feedback method using a deep learning-based social interaction feedback system according to an embodiment of the present invention.

First, the social interaction feedback system 100 provides a pre-stored facial image of the other party to the subject through the user terminal 200 (S310).

Here, the social interaction feedback system 100 may alternately provide a smiling image and an expressionless image of the other party on the screen of the user terminal 200 at regular time intervals, and the switching period may be set by the user.

FIG. 4 is an exemplary diagram for explaining step S310 of FIG. 3.

For example, as shown in FIG. 4, the user terminal 200 may repeatedly alternate between a smiling image and an expressionless image of the other party at 15-second intervals, presenting the other party's facial image for a total of 60 seconds.
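
The alternating presentation in this example can be sketched as a simple schedule. This is a minimal sketch under stated assumptions: the function names, the label strings "smile" and "neutral", and the choice of which expression is shown first are illustrative and are not specified in the source.

```python
from typing import List, Tuple

Segment = Tuple[float, float, str]  # (start_sec, end_sec, expression)


def build_schedule(total_sec: float = 60.0, interval_sec: float = 15.0,
                   first: str = "smile") -> List[Segment]:
    """Alternate 'smile' and 'neutral' segments until total_sec is covered.

    Starting with a smile is an assumption; the source only says the two
    images alternate at a user-configurable interval.
    """
    if total_sec <= 0 or interval_sec <= 0:
        raise ValueError("durations must be positive")
    order = ("smile", "neutral") if first == "smile" else ("neutral", "smile")
    segments: List[Segment] = []
    start, index = 0.0, 0
    while start < total_sec:
        end = min(start + interval_sec, total_sec)
        segments.append((start, end, order[index % 2]))
        start, index = end, index + 1
    return segments


def expression_at(schedule: List[Segment], t: float) -> str:
    """Return the other party's displayed expression at time t (seconds)."""
    for start, end, expression in schedule:
        if start <= t < end:
            return expression
    raise ValueError("time is outside the presentation window")
```

With the defaults, `build_schedule()` yields four 15-second segments covering 60 seconds, and `expression_at` gives the displayed expression against which the subject's expression would be compared.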

Next, the camera 300 photographs the subject's face while the subject watches the other party's facial image, and transmits the captured image to the social interaction feedback system 100 (S320).

That is, while the subject gazes at the other party's facial image on the screen of the user terminal 200, the camera 300 captures, from the front, the subject's facial expression as it changes in response to the other party's facial image.

Next, the social interaction feedback system 100 analyzes the subject's facial expression by applying the subject's facial image to the deep learning model (S330).

Here, the social interaction feedback system 100 may apply the image to a deep learning model based on a Residual Masking Network (RMN) to recognize at least one facial expression among anger, neutral, sadness, disgust, happiness, surprise, and fear from the subject's facial image.

For example, when the subject is frowning in the image, the social interaction feedback system 100 can automatically recognize through the deep learning model that the subject's facial expression is angry.

Next, the social interaction feedback system 100 calculates the facial expression feedback rate by measuring the time during which the subject's facial expression matches the other party's facial expression in the other party's facial image (S340).

Here, the social interaction feedback system 100 measures the time during which the other party's facial expression and the subject's facial expression match, and calculates the facial expression feedback rate through Equation 1 below.

Here, A is the total time (sec) during which the other party's facial image is displayed on the user terminal 200, B is the time (sec) during which the subject's facial expression matches the other party's facial expression, C is the time (sec) during which the subject smiles while the other party is expressionless, and D is the time (sec) during which the subject is expressionless while the other party smiles.

FIG. 5 is an exemplary diagram for explaining step S340 of FIG. 3.

For example, as shown in FIG. 5, assume that the other party's facial image is displayed on the user terminal 200 for a total of 60 seconds. If the subject and the other party both smile for 20 seconds, both are expressionless for 19 seconds, the subject smiles while the other party is expressionless for 10 seconds, and the subject is expressionless while the other party smiles for 11 seconds, the social interaction feedback system 100 can calculate the facial expression feedback rate as 70%.
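
The tally of A, B, C, and D in the example above can be sketched as follows. The per-step `(other_party, subject)` label pairs, the label strings, and the function names are assumptions for illustration. Because the weighted equation is not reproduced in this text, the sketch stops at the unweighted match ratio B/A; it is not the source's Equation 1 and does not apply the weighting described for the control unit 130.

```python
from typing import Iterable, Tuple


def tally_expression_times(samples: Iterable[Tuple[str, str]],
                           dt: float = 1.0) -> Tuple[float, float, float, float]:
    """Accumulate A, B, C, D (seconds) from time-aligned label pairs.

    Each sample is (other_party_expression, subject_expression) covering
    dt seconds. Labels 'smile' and 'neutral' are hypothetical names.
    """
    a = b = c = d = 0.0
    for other_party, subject in samples:
        a += dt                      # A: total display time
        if other_party == subject:
            b += dt                  # B: expressions match
        elif other_party == "neutral" and subject == "smile":
            c += dt                  # C: subject smiles at a neutral face
        elif other_party == "smile" and subject == "neutral":
            d += dt                  # D: subject neutral at a smiling face
    return a, b, c, d


def unweighted_match_ratio(a: float, b: float) -> float:
    """Percentage of display time with matching expressions (B / A * 100)."""
    if a <= 0:
        raise ValueError("total display time must be positive")
    return 100.0 * b / a


# Durations taken from the example: 20 s + 19 s matching, 10 s C, 11 s D.
example = ([("smile", "smile")] * 20 + [("neutral", "neutral")] * 19
           + [("neutral", "smile")] * 10 + [("smile", "neutral")] * 11)
A, B, C, D = tally_expression_times(example)
```

For these durations the tally gives A = 60, B = 39, C = 10, and D = 11, and the unweighted ratio is 65%. The 70% stated in the example differs from this figure, presumably because the source's equation also weights C and D.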

The social interaction feedback system 100 then calculates the immersion rate by measuring the time during which the subject gazes at the other party's eye region (S350).

Here, the social interaction feedback system 100 may receive the subject's tracked gaze result from the gaze tracking device 400 and calculate the immersion rate from it.

Accordingly, the social interaction feedback system 100 measures the time during which the subject gazes at the other party's eye region and calculates the subject's immersion rate through Equation 2 below.
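
The equation itself is not reproduced in this text. A form consistent with the worked example in the description (20 seconds of gaze on the eye region out of 60 seconds of display, giving 33%) would be the following, where A is the total display time defined for the facial expression feedback rate and E is a hypothetical symbol introduced here for the time (sec) during which the subject gazes at the other party's eye region:

```latex
\[
\text{immersion rate}\ (\%) = \frac{E}{A} \times 100,
\qquad \text{e.g.}\quad \frac{20}{60} \times 100 \approx 33\,\%
\]
```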

For example, assuming that the other party's facial image is displayed on the user terminal 200 for a total of 60 seconds and that the subject gazes at the other party's eye region in the displayed image for 20 seconds, the social interaction feedback system 100 can calculate the subject's immersion rate as 33%.
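
One way to obtain the gaze time used in this example is to count gaze samples that fall inside a rectangle around the other party's eyes on screen. This is a minimal sketch: the `Rect` type, the screen-coordinate gaze samples, the fixed sampling step `dt`, and the plain ratio of gaze time to display time are assumptions, not details given in the source.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Rect:
    """Axis-aligned screen rectangle (hypothetical eye-region model)."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def immersion_rate(gaze_points: Iterable[Tuple[float, float]],
                   eye_region: Rect, dt: float = 1.0) -> float:
    """Percentage of display time spent gazing inside eye_region.

    Each gaze point is assumed to represent dt seconds of display time.
    """
    total = 0.0
    on_eyes = 0.0
    for x, y in gaze_points:
        total += dt
        if eye_region.contains(x, y):
            on_eyes += dt
    if total <= 0:
        raise ValueError("no gaze samples")
    return 100.0 * on_eyes / total


# 60 one-second samples, 20 of them inside the eye region.
region = Rect(left=0, top=0, right=100, bottom=100)
gaze = [(50.0, 50.0)] * 20 + [(500.0, 500.0)] * 40
rate = immersion_rate(gaze, region)
```

Here `rate` is about 33.3, which rounds to the 33% given in the example.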

Next, the social interaction feedback system 100 computes an interaction feedback value using the calculated facial expression feedback rate and immersion rate (S360).

Here, the social interaction feedback system 100 may compute the interaction feedback value by averaging the calculated facial expression feedback rate and immersion rate.

That is, in the above example, the social interaction feedback system 100 averages the 70% facial expression feedback rate and the 33% immersion rate calculated in steps S340 and S350 to obtain an interaction feedback value of 51.5%.
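
The averaging in this example amounts to an arithmetic mean of the two percentages. The function name and the range check in the sketch below are illustrative additions.

```python
def interaction_feedback_value(expression_feedback_rate: float,
                               immersion_rate: float) -> float:
    """Arithmetic mean of the two rates, both given in percent."""
    for rate in (expression_feedback_rate, immersion_rate):
        if not 0.0 <= rate <= 100.0:
            raise ValueError("rates are expected in the range 0 to 100")
    return (expression_feedback_rate + immersion_rate) / 2.0


value = interaction_feedback_value(70.0, 33.0)  # 51.5
```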

The social interaction feedback system 100 then provides the computed interaction feedback value for the subject to the user terminal 200 (S370).

That is, the social interaction feedback system 100 can feed back to the subject, through the user terminal 200, the interaction feedback value, which is the overall result of the immersion rate and the facial expression feedback rate.

As described above, according to an embodiment of the present invention, training for social interaction based on immersion and facial expression feedback can be provided to the subject. In addition, by giving the subject objective feedback through the interaction feedback value, social interaction can be promoted and social function improved in patients with severe social anxiety or schizophrenia. Providing the interaction feedback value to people preparing for interviews can also help them prepare for job interviews.

Although the present invention has been described with reference to the embodiments shown in the drawings, these are merely exemplary, and those of ordinary skill in the art will understand that various modifications and other equivalent embodiments are possible. Therefore, the true scope of technical protection of the present invention should be determined by the technical spirit of the appended claims.

100: personal propensity prediction system, 110: learning unit,
120: immersion analysis unit, 130: emotion analysis unit,
140: propensity prediction unit, 200: user terminal,
300: camera unit, 400: gaze tracking device

Claims

Providing, by an image transmission unit, a face image of the other party, whose facial expression changes at regular time intervals, to the subject through a user terminal;
Analyzing, by a facial expression analysis unit, the subject's facial expression by applying the subject's face image, captured through a camera while the subject watches the other party's face image on the screen of the user terminal, to a deep learning model;
Calculating, by a control unit, a facial expression feedback rate indicating whether the subject's facial expression matches the other party's facial expression;
Calculating, by the control unit, an immersion rate indicating whether the subject gazes at the other party's eye region, using the subject's gaze tracked through a gaze tracking device;
Calculating, by a feedback unit, an interaction feedback value using the calculated immersion rate and facial expression feedback rate; and
Providing, by the feedback unit, the interaction feedback value for the subject to the user terminal;
In the step of calculating the facial expression feedback rate,
The facial expression feedback rate is calculated through the following equation by measuring the time during which the other party's facial expression and the subject's facial expression match,

Here, A is the total time (sec) for which the other party's face image is displayed on the user terminal, B is the time (sec) during which the facial expressions of the subject and the other party match, C is the time (sec) during which the subject is smiling while the other party is expressionless, and D is the time (sec) during which the subject is expressionless while the other party is smiling,
In the step of calculating the immersion rate,
The immersion rate of the subject is calculated through the following equation by measuring the time the subject spends looking at the other party's eyes,

In the step of calculating the interaction feedback value,
A social interaction feedback method in which the interaction feedback value is calculated by averaging the facial expression feedback rate and the immersion rate.
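For illustration only, the time quantities A to D defined in this claim could be accumulated from periodically sampled expression labels as sketched below. The sampling scheme, the label strings `"happy"` and `"neutral"` (standing in for smiling and expressionless), and all names are assumptions made here; the equation that combines A to D is not reproduced in this text, so the sketch stops at measuring the durations.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class Durations:
    total: float       # A: total display time (sec)
    match: float       # B: time both expressions match (sec)
    smile_only: float  # C: subject smiling, other party expressionless (sec)
    blank_only: float  # D: subject expressionless, other party smiling (sec)


def accumulate(samples: Iterable[Tuple[str, str]], dt: float) -> Durations:
    """Sum durations from (other_party, subject) labels sampled every dt seconds."""
    a = b = c = d = 0.0
    for other, subject in samples:
        a += dt
        if other == subject:
            b += dt
        elif other == "neutral" and subject == "happy":
            c += dt
        elif other == "happy" and subject == "neutral":
            d += dt
        # Any other mismatch (e.g. subject "sad") counts toward A only.
    return Durations(a, b, c, d)


# Hypothetical samples taken every 0.5 s.
samples = [("happy", "happy"), ("happy", "neutral"),
           ("neutral", "happy"), ("neutral", "neutral"),
           ("neutral", "sad")]
print(accumulate(samples, 0.5))
```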

According to claim 1,
In the step of providing the other party's face image to the subject through the user terminal,
A social interaction feedback method in which a smiling image and an expressionless image of the other party are provided alternately on the screen of the user terminal at regular time intervals.

According to claim 1,
In the step of analyzing the subject's facial expression,
A social interaction feedback method in which at least one facial expression among anger, neutrality, sadness, disgust, happiness, surprise, and fear is recognized from the subject's face image using a deep learning model based on a residual masking network (RMN).

(Deleted)

In a deep learning-based social interaction feedback system,
An image transmission unit that provides a face image of the other party, whose facial expression changes at regular time intervals, to the subject through a user terminal;
A facial expression analysis unit that analyzes the subject's facial expression by applying the subject's face image, captured through a camera while the subject watches the other party's face image on the screen of the user terminal, to a deep learning model;
A control unit that calculates a facial expression feedback rate indicating whether the subject's facial expression matches the other party's facial expression, and calculates an immersion rate indicating whether the subject is gazing at the other party's eye region, using the subject's gaze tracked through a gaze tracking device; and
A feedback unit that calculates an interaction feedback value using the calculated immersion rate and facial expression feedback rate and provides the calculated interaction feedback value to the user terminal;
The control unit
Calculates the facial expression feedback rate through the following equation by measuring the time during which the other party's facial expression and the subject's facial expression match,

Here, A is the total time (sec) for which the other party's face image is displayed on the user terminal, B is the time (sec) during which the facial expressions of the subject and the other party match, C is the time (sec) during which the subject is smiling while the other party is expressionless, and D is the time (sec) during which the subject is expressionless while the other party is smiling,
The control unit
Calculates the immersion rate of the subject through the following equation by measuring the time the subject spends looking at the other party's eyes,

The feedback unit
Calculates the interaction feedback value by averaging the facial expression feedback rate and the immersion rate, in the social interaction feedback system.

According to claim 7,
The image transmission unit
Alternately provides a smiling image and an expressionless image of the other party on the screen of the user terminal at regular time intervals, in the social interaction feedback system.

According to claim 7,
The facial expression analysis unit
Recognizes at least one facial expression among anger, neutrality, sadness, disgust, happiness, surprise, and fear from the subject's face image using a deep learning model based on a residual masking network (RMN), in the social interaction feedback system.

(Deleted)