KR20230114195A

KR20230114195A - Emotion analysis result providing device and emotion analysis result providing system

Info

Publication number: KR20230114195A
Application number: KR1020230003827A
Authority: KR
Inventors: 김석원
Original assignee: 주식회사 허니엠앤비
Priority date: 2022-01-24
Filing date: 2023-01-11
Publication date: 2023-08-01
Also published as: KR102630804B1; KR102630803B1; KR20230114196A

Abstract

An emotion analysis result providing device according to an embodiment may include: a first analysis result acquisition unit that obtains a plurality of first analysis results by analyzing a client's emotions in real time, based on at least one of first voice data and first image data extracted from video data received in real time from a client terminal; and
a second analysis result acquisition unit that obtains a plurality of second analysis results by analyzing the client's emotions after a predetermined time has elapsed, based on at least one of second voice data and second image data extracted from accumulated video data in which the video data received in real time is accumulated over the predetermined time.

Description

Emotion analysis result providing device and emotion analysis result providing system

The present invention relates to an emotion analysis result providing device and an emotion analysis result providing system. More specifically, it relates to a device and a system that analyze a client's emotions in real time based on video data captured during psychological counseling while also analyzing the client's emotions based on accumulated video data, so that the client's emotions are analyzed from multiple angles and an integrated result is provided.

Psychological counseling is a treatment method in which a counselor with specialized knowledge, relating to a client who has psychological problems on the basis of empathic understanding, unconditional positive regard, and genuineness, draws on the theories of counseling psychology, such as psychoanalysis, behaviorism, humanism, cognitivism, Gestalt psychology, reality therapy, transactional analysis, and family therapy, to help the client solve those problems. It guides clients to explore human thoughts, emotions, behaviors, and interpersonal relationships so that they can understand their various problems and change.

Such counseling has mainly been conducted in the traditional manner, face-to-face in a designated counseling room. However, clients or patients who have interpersonal problems or severe social anxiety may find face-to-face counseling and psychotherapy, in which they must meet the counselor in person, burdensome or off-putting, so that psychotherapy is sometimes discontinued rather than sustained.

Accordingly, online psychotherapy programs that provide psychological counseling and treatment online have been proposed as an alternative to face-to-face counseling.

KR 2022-0005945 A

An object of the present invention is to analyze a client's emotions in real time based on video data captured while an online psychotherapy program is in progress, while also analyzing the client's emotions based on accumulated video data, so that the client's emotions are analyzed from multiple angles and an integrated result is provided.

An emotion analysis result providing device according to an embodiment may include: a first analysis result acquisition unit that obtains a plurality of first analysis results by analyzing a client's emotions in real time, based on at least one of first voice data and first image data extracted from video data received in real time from a client terminal; and

a second analysis result acquisition unit that obtains a plurality of second analysis results by analyzing the client's emotions after a predetermined time has elapsed, based on at least one of second voice data and second image data extracted from accumulated video data in which the video data received in real time is accumulated over the predetermined time.

The plurality of first analysis results may include at least two of: a probability value for each of a plurality of general emotional states, a representative emotional state, a heart rate and a stress index, a frequency for each keyword, a frequency for each of the keywords organized by topic, a voice parameter, gaze position information, and face movement information.

The plurality of second analysis results may include at least two of: a probability value for each of a plurality of general emotional states, a probability value for each of a plurality of substitutional emotional states, a heart rate and a stress index, a frequency for each keyword, a frequency for each of the keywords organized by topic, a voice parameter, gaze position information, and face movement information.

The plurality of substitutional emotional states may be a combination of emotional states defined as superordinate concepts of each of the plurality of general emotional states.

The second analysis result acquisition unit may obtain a probability value for each of a plurality of general emotional states using a plurality of models based on ensemble learning, on the basis of the second image data, the second voice data, and text data converted from the second voice data.

The second analysis result acquisition unit may obtain the frequency for each of the keywords organized by topic by performing Latent Dirichlet Allocation (LDA) topic modeling on the text data converted from the second voice data.

The second analysis result acquisition unit may recognize the client's face based on the second image data, detect feature points of the recognized face, detect a pupil center position in a region of interest designated on the basis of the facial feature points, and calculate the gaze position information, which is the degree of change of the detected pupil center position over time, based on the change in Euclidean distance.
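
The last step above, computing gaze position information from the change in Euclidean distance of the pupil center, can be sketched as follows. This is a minimal illustration only: it assumes the pupil center coordinates have already been detected for each analyzed frame, and the function name `gaze_movement` and the sample coordinates are hypothetical, not taken from the specification.

```python
import math

def gaze_movement(pupil_centers):
    """Euclidean distance between consecutive pupil-center positions.

    pupil_centers: list of (x, y) tuples, one per analyzed frame
    (assumed to come from an upstream face/pupil detector).
    """
    return [math.dist(a, b) for a, b in zip(pupil_centers, pupil_centers[1:])]

# Hypothetical pupil-center coordinates in pixels for three frames.
centers = [(100.0, 50.0), (103.0, 54.0), (103.0, 54.0)]
steps = gaze_movement(centers)   # per-frame displacement
total = sum(steps)               # overall amount of gaze movement
```

A larger per-frame value indicates a larger shift of the pupil center between two frames; how such values are aggregated or displayed is left open here.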

An emotion analysis result providing system according to an embodiment includes a client terminal, a counselor terminal, and an emotion analysis result providing device.

The emotion analysis result providing device includes:

a first analysis result acquisition unit that obtains a plurality of first analysis results by analyzing a client's emotions in real time, based on at least one of first voice data and first image data extracted from video data received in real time from the client terminal;

a second analysis result acquisition unit that obtains a plurality of second analysis results by analyzing the client's emotions after a predetermined time has elapsed, based on at least one of second voice data and second image data extracted from accumulated video data in which the video data received in real time is accumulated over the predetermined time; and

a transmission unit that transmits information including all of the plurality of first analysis results and information including all of the plurality of second analysis results to the counselor terminal.

The counselor terminal may include:

a first analysis result providing unit that provides, through a single user interface screen, the information including all of the plurality of first analysis results received from the emotion analysis result providing device; and

a second analysis result providing unit that provides, through the single user interface screen, information including a second analysis result selected by a user from among all of the plurality of second analysis results received from the emotion analysis result providing device.

The single user interface screen may include a plurality of menus representing the plurality of second analysis results, and the second analysis result providing unit may provide, through the single user interface screen, information including the second analysis result corresponding to the menu selected by the user from among the plurality of menus.

According to the present invention, the first analysis result is obtained through real-time analysis by the first analysis result acquisition unit, while the second analysis result is obtained through analysis after a predetermined time has elapsed by the second analysis result acquisition unit. The details of a counseling session with a given client can therefore be provided in real time through real-time analysis, and the details of the entire session can be provided through integrated analysis after the session has ended. Because various forms of analysis results are available for a single counseling session, counseling records can be managed in a more diversified and systematic way.

In addition, the user interface screen shown while counseling is in progress and the user interface screen that provides analysis results after counseling has ended are implemented differently, which makes it easier for the counselor to manage and review counseling records. For example, while counseling is in progress, the real-time analysis results are provided and all of the analysis results are presented on a single screen of the user interface unit, so that the counselor can review, analyze, and/or assess the session in real time more quickly and easily. After counseling has ended, one (selected/individual) analysis result for the entire accumulated session is presented on a single screen of the user interface unit, so that the user can review the entire session in an integrated manner.

FIG. 1 is a system diagram of an emotion analysis result providing system 1 according to an embodiment.
FIGS. 2 and 3 are flowcharts illustrating the operation of an emotion analysis result providing device 20 according to an embodiment.
FIG. 4 illustrates a screen of the user interface unit 33 according to the operation of the first analysis result providing unit 31 of the counselor terminal 30.
FIGS. 5 to 11 illustrate screens of the user interface unit 33 according to the operation of the second analysis result providing unit 32 of the counselor terminal 30.
FIGS. 12 and 13 are diagrams referenced to describe the operation of the emotion analysis result providing device 20.

The detailed description of the present invention that follows refers to the accompanying drawings, which show, by way of illustration, specific embodiments in which the invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention. It should be understood that the various embodiments of the invention, although different from one another, are not necessarily mutually exclusive. For example, a particular shape, structure, or characteristic described herein in connection with one embodiment may be implemented in another embodiment without departing from the spirit and scope of the invention. It should also be understood that the location or arrangement of individual components within each disclosed embodiment may be changed without departing from the spirit and scope of the invention. The following detailed description is therefore not to be taken in a limiting sense, and the scope of the invention, if properly described, is limited only by the appended claims, along with the full range of equivalents to which those claims are entitled. In the drawings, like reference numerals refer to the same or similar functions throughout the several aspects.

For reference, each step in the flowcharts of the present invention is an example, and the invention applies identically or similarly even when the order of the steps is changed and/or the steps are combined differently.

FIG. 1 is a system diagram of an emotion analysis result providing system 1 according to an embodiment.

The client terminal 10 may be equipped with an application for conducting text/voice/video counseling and may connect to the emotion analysis result providing device 20, through which it can conduct text/voice/video counseling with the counselor terminal 30. The client terminal 10 should include at least a screen on which the counselor is displayed, a camera (not shown) that captures the client so that the image can be transmitted to the counselor terminal 30, and a microphone (not shown) and a speaker (not shown) that collect and output voice so that the client can converse with the counselor.

The client terminal 10 may transmit video data in real time to the emotion analysis result providing device 20. The video data includes image data obtained by capturing the client's face with the camera (not shown) and the client's voice data collected with the microphone (not shown).

Depending on the embodiment, the client terminal 10 may output the emotion analysis result received from the emotion analysis result providing device 20 on a screen of a user interface unit (not shown) to provide it to the client.

The counselor terminal 30 may be equipped with an application for conducting text/voice/video counseling and may connect to the emotion analysis result providing device 20, through which it can conduct text/voice/video counseling with the client terminal 10. The counselor terminal 30 should include at least a screen on which the client is displayed, a camera that captures the counselor so that the image can be transmitted to the client terminal 10, and a microphone and a speaker that collect and output voice so that the counselor can converse with the client.

The counselor terminal 30 may transmit video data in real time to the emotion analysis result providing device 20. The video data includes image data obtained by capturing the counselor's face with a camera (not shown) and the counselor's voice data collected with a microphone (not shown).

The counselor terminal 30 may output the emotion analysis result received from the emotion analysis result providing device 20 on the screen of the user interface unit 33 to provide it to a user such as the counselor or an administrator.

The emotion analysis result providing device 20 may mediate video counseling between the client terminal 10 and the counselor terminal 30, and may transmit the result of analyzing the client's emotions during the video counseling to the client terminal 10 and/or the counselor terminal 30.

FIGS. 2 and 3 are flowcharts illustrating the operation of the emotion analysis result providing device 20 according to an embodiment.

Depending on the embodiment, the first analysis result acquisition unit 21a may receive video data in real time from the client terminal 10 (s11).

The video data may include first image data, in which the client's face is captured, and first voice data.

The first analysis result acquisition unit 21a may extract the first voice data and the first image data from the video data (s12).

Depending on the embodiment, the first image data may be a single image frame received in real time, and the first voice data may be a single voice frame received in real time.

Depending on the embodiment, extracting the first image data and the first voice data from the video data may be implemented using any of various known algorithms.

The first analysis result acquisition unit 21a may obtain a plurality of first analysis results by analyzing the client's emotions in real time based on at least one of the first voice data and the first image data (s13).

Depending on the embodiment, in order to obtain analysis results in the same categories, the first analysis result acquisition unit 21a may obtain the plurality of first analysis results in real time, based on at least one of the voice data and the image data, through real-time analysis that uses an analysis method identical or similar to that of the second analysis result acquisition unit 21b, or any of various known algorithms. A detailed description of the analysis for obtaining the plurality of first analysis results is therefore omitted.

For example, the analysis method by which the second analysis result acquisition unit 21b obtains the probability value of each of the plurality of general emotional states may be used identically or similarly by the first analysis result acquisition unit 21a to obtain the probability value of each of the plurality of general emotional states.

Depending on the embodiment, the plurality of first analysis results may include at least two of: a probability value for each of a plurality of general emotional states, a representative emotional state, a heart rate and a stress level, a frequency for each keyword, a frequency for each of the keywords organized by topic, a voice parameter, gaze position information, and face movement information.

Depending on the embodiment, the plurality of general emotional states may include at least two of joy, anger, contempt, surprise, fear, calm, and sadness at the given time, but the present invention is not limited thereto.

Depending on the embodiment, the first analysis result acquisition unit 21a may obtain the probability value of each of the plurality of general emotional states based on the first image data and the first voice data, in a manner identical or similar to that in which the second analysis result acquisition unit 21b obtains the probability value of each of the plurality of general emotional states based on the second image data and the second voice data.

Depending on the embodiment, the representative emotional state may be the final emotional state that can represent the client's emotion among the plurality of general emotional states at the given time. For example, the first analysis result acquisition unit 21a may determine the general emotional state corresponding to the highest of the probability values of the plurality of general emotional states as the representative emotional state at that time.
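
The selection rule described above, taking the general emotional state with the highest probability value as the representative emotional state, can be sketched as follows. The function name `representative_state` and the probability values are hypothetical and serve only as an illustration; the emotion labels follow the examples listed in the description.

```python
def representative_state(probabilities):
    """Return the emotional state whose probability value is highest.

    probabilities: dict mapping a general emotional state to its
    probability value at a given time.
    """
    return max(probabilities, key=probabilities.get)

# Hypothetical probability values for one point in time.
probs = {"joy": 0.10, "anger": 0.05, "sadness": 0.60, "calm": 0.25}
top = representative_state(probs)
```

If two states share the highest value, `max` returns the first one encountered; the description does not specify a tie-breaking rule, so this behavior is an assumption of the sketch.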

Depending on the embodiment, the heart rate may be obtained as an RPPG (remote photoplethysmography) value, and the stress level may be obtained as a level such as high/medium/low.

Depending on the embodiment, the first analysis result acquisition unit 21a may obtain the heart rate and the stress index based on the first image data, in a manner identical or similar to that in which the second analysis result acquisition unit 21b obtains the heart rate and the stress index based on the second image data.

Depending on the embodiment, the frequency for each keyword may be defined as the number of times each of a set of preset keywords is uttered by the client. Depending on the embodiment, the preset keywords may be changed in real time by a user such as the counselor or an administrator.

Depending on the embodiment, the first analysis result acquisition unit 21a may obtain the frequency for each preset keyword by analyzing text data converted from the first voice data.

This may be applied in a manner identical or similar to that in which the second analysis result acquisition unit 21b obtains the frequency for each preset keyword by analyzing text data converted from the second voice data.
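
Counting how often each preset keyword appears in the transcribed text can be sketched as follows. This is a simplified illustration under stated assumptions: keywords are single words, the transcript is whitespace-delimited text, and matching is case-insensitive. The function name `keyword_frequencies` and the sample transcript are hypothetical, and Korean text would in practice likely require morphological analysis rather than simple tokenization.

```python
import re
from collections import Counter

def keyword_frequencies(transcript, keywords):
    """Count occurrences of each preset keyword in a transcript.

    transcript: text converted from the client's voice data.
    keywords:   preset single-word keywords (may be changed by the user).
    """
    tokens = re.findall(r"\w+", transcript.lower())
    counts = Counter(tokens)
    # Keywords that never occur are reported with a frequency of 0.
    return {kw: counts[kw.lower()] for kw in keywords}

# Hypothetical transcript and keyword list.
text = "I feel anxious. Work makes me anxious and tired."
freqs = keyword_frequencies(text, ["anxious", "tired", "sleep"])
```

Because the keyword list is passed in as an argument, a counselor or administrator changing the preset keywords during a session only changes the input, not the counting logic.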

Depending on the embodiment, the first analysis result acquisition unit 21a may obtain the frequency for each of the keywords organized by topic by performing Latent Dirichlet Allocation (LDA) topic modeling on the text data converted from the first voice data, in a manner identical or similar to that in which the second analysis result acquisition unit 21b performs LDA topic modeling on the text data converted from the second voice data to obtain the frequency for each of the keywords organized by topic.

Depending on the embodiment, the first analysis result acquisition unit 21a may obtain the voice parameter by analyzing the first voice data, in a manner identical or similar to that in which the second analysis result acquisition unit 21b obtains the voice parameter by analyzing the second voice data.

Depending on the embodiment, the voice parameter may include indicators representing a waveform, pitch, pause sections, speech rate, and/or hesitation sections.

Depending on the embodiment, the first analysis result acquisition unit 21a may obtain the gaze position information and the face movement information by analyzing the first image data, in a manner identical or similar to that in which the second analysis result acquisition unit 21b obtains the gaze position information and the face movement information by analyzing the second image data.

Depending on the embodiment, the first analysis result acquisition unit 21a may generate the plurality of general emotional states in the form of a bar graph, generate the representative emotional state as an icon and/or on a radial graph, generate the frequency for each preset keyword as the size of a circle, and generate the frequency for each of the keywords organized by topic in the form of a table. Depending on the embodiment, the frequency for each preset keyword may be reflected in the size of the circle according to a level (for example, levels 1 to 5) corresponding to the frequency. For example, level 1 indicates that the keyword was uttered once by the client and is drawn as a relatively small circle, while level 5 indicates that the keyword was uttered five or more times by the client and is drawn as a relatively large circle.
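
The mapping from keyword frequency to a display level and circle size can be sketched as follows. The description fixes only the endpoints (level 1 for one utterance, level 5 for five or more); treating levels 2 to 4 as two to four utterances, and the `base` and `step` pixel values, are assumptions of this sketch. The function names are hypothetical.

```python
def frequency_level(count, max_level=5):
    """Map an utterance count to a display level from 1 to max_level.

    A count of 0 yields level 0 (keyword not uttered, nothing drawn).
    Counts at or above max_level are capped at max_level.
    """
    if count < 1:
        return 0
    return min(count, max_level)

def circle_radius(level, base=10, step=6):
    """Return a circle radius in pixels that grows with the level.

    base and step are illustrative values, not from the specification.
    """
    return base + step * (level - 1) if level else 0

small = circle_radius(frequency_level(1))   # uttered once
large = circle_radius(frequency_level(7))   # uttered five or more times
```

Capping at the maximum level keeps frequently repeated keywords from dominating the screen while still distinguishing them from keywords uttered only once.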

다만, 이는 실시예이며, 복수의 제1 분석 결과 각각에 대해서 막대 그래프 형태, 아이콘 형태, 방사형 그래프 형태, 원형 형태 등 다양하게 적용하여 생성할 수 있다.However, this is an example, and it can be generated by variously applying a bar graph form, an icon form, a radial graph form, a circle form, etc. to each of the plurality of first analysis results.

전송부(21c)는 복수의 제1 분석 결과 전체를 포함한 정보를 상담자 단말(30)로 전송할 수 있다(s14). The transmission unit 21c may transmit information including all of the plurality of first analysis results to the counselor terminal 30 (s14).

실시예에 따라 복수의 제1 분석 결과 전체를 포함한 정보는, 복수의 제1 분석 결과 전체, 내담자 단말(10)로부터 실시간 수신한 동영상데이터, 및 상기 실시간 수신한 동영상데이터로부터 분리된 내담자의 제1 음성데이터가 변환된 텍스트데이터와 상담자의 음성데이터가 변환된 텍스트데이터의 컨텐츠를 포함할 수 있다.According to the embodiment, the information including all of the plurality of first analysis results may include all of the plurality of first analysis results, video data received in real time from the client terminal 10, and the client's first video data separated from the video data received in real time. Contents of text data obtained by converting voice data and text data obtained by converting counselor's voice data may be included.

도 4는 상담자 단말(30)의 제1 분석결과제공부(31)의 동작에 따른 사용자 인터페이스부(33)의 화면을 예시한다.4 illustrates the screen of the user interface unit 33 according to the operation of the first analysis result providing unit 31 of the counselor terminal 30 .

상담자 단말(30)의 제1 분석결과제공부(31)는 감정 분석 결과 제공 장치(20)로부터 수신한 복수의 제1 분석 결과 전체를 포함한 정보와 상담자가 실시간 기록한 상담 노트를 하나의 사용자 인터페이스부(33)의 화면을 통해 출력함으로써 관리자 또는 상담자등의 사용자에게 제공할 수 있다.The first analysis result providing unit 31 of the counselor terminal 30 transmits information including all of the plurality of first analysis results received from the emotion analysis result providing device 20 and counseling notes recorded by the counselor in real time into one user interface unit. By outputting through the screen of (33), it can be provided to users such as administrators or counselors.

실시예에 따라 복수의 제1 분석 결과 전체를 포함한 정보는, 복수의 제1 분석 결과 전체, 내담자가 촬영된 동영상데이터, 및 내담자가 촬영된 동영상데이터로부터 분리된 내담자의 음성데이터가 변환된 텍스트데이터와 상담자의 음성데이터가 변환된 텍스트데이터에 대응되는 컨텐츠를 포함할 수 있다.According to an embodiment, the information including all of the plurality of first analysis results may include all of the plurality of first analysis results, video data in which the client is photographed, and text data obtained by converting the client's voice data separated from the video data in which the client is photographed. and content corresponding to the text data converted from the counselor's voice data.

According to an embodiment, each item of content may be output to its own separate area on the screen of the user interface unit 33.

For example, FIG. 4 shows an example screen of the user interface unit 33, which may include the plurality of first analysis results (①), the video data received in real time from the client terminal 10 (②), the counseling notes recorded by the counselor in real time (③), and the text data converted from the client's audio data together with the text data converted from the counselor's audio data (④).

Among the plurality of first analysis results (①), (1-1) shows the plurality of general emotional states, (1-2) the representative emotional state, (1-3) the heart rate and stress index, (1-4) the frequency of each keyword, and (1-5) the frequency of each of the keywords that make up each topic.

According to an embodiment, the second analysis result acquisition unit 21b may generate accumulated video data by accumulating the video data received in real time over a predetermined time (s21).

According to an embodiment, the second analysis result acquisition unit 21b may acquire, as the accumulated video data, the video data received in the interval between the time at which reception of video data from the client terminal 10 starts and the time at which that reception ends.

According to an embodiment, the second analysis result acquisition unit 21b may acquire, as the accumulated video data, the video data in the interval between the time at which recording of the video data received from the client terminal 10 starts and the time at which that recording ends.

According to an embodiment, instructions for defining the video data accumulated over the predetermined time as the analysis target may be programmed in advance, and the second analysis result acquisition unit 21b may define the analysis target by referring to those instructions.

According to an embodiment, the predetermined time may correspond to the time interval from the start to the end of a counseling session between the client and the counselor.

According to an embodiment, the second analysis result acquisition unit 21b may extract second audio data and second image data from the accumulated video data (s22).

According to an embodiment, the first image data may be a single image frame received in real time, whereas the second image data may include a plurality of image frames accumulated over the predetermined time.

According to an embodiment, the first audio data may be a single audio frame received in real time, whereas the second audio data may include a plurality of audio frames accumulated over the predetermined time.

According to an embodiment, extracting the second image data and the second audio data from the accumulated video data may be implemented using any of various known algorithms.

The second analysis result acquisition unit 21b may obtain a plurality of second analysis results by analyzing the client's emotions, after the predetermined time has elapsed, based on at least one of the second audio data and the second image data (s23).

According to an embodiment, instructions may be programmed in advance to perform the analysis immediately after the predetermined time has elapsed (or after an arbitrary further time has elapsed), or at a time set by the user, and the second analysis result acquisition unit 21b may perform the analysis by referring to those instructions.

According to an embodiment, the plurality of second analysis results may include at least two of: a probability value of each of a plurality of general emotional states, a probability value of each of a plurality of substitute emotional states, a heart rate and a stress index, a frequency of each keyword, a frequency of each of the keywords that make up each topic, voice parameters, gaze position information, and face movement information.

According to an embodiment, the second analysis result acquisition unit 21b may graph at least some of the plurality of second analysis results and transmit the resulting graphs.

According to an embodiment, the plurality of general emotional states may include at least two of joy, anger, contempt, surprise, fear, tranquility, and sadness at a given point in time (or time period), but the present invention is not limited thereto.

According to an embodiment, the probability value of each of the plurality of general emotional states may also be obtained by averaging the probability values of that general emotional state obtained during a predetermined time interval.
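The averaging over a time interval described above can be illustrated with a minimal sketch. The function name, the `(timestamp, probabilities)` input layout, and the fixed window length are assumptions made for illustration and do not appear in the specification.

```python
from collections import defaultdict


def average_probabilities(samples, window_seconds):
    """Average per-emotion probabilities inside fixed-length time windows.

    samples: list of (timestamp_seconds, {emotion: probability}) pairs
             (hypothetical layout chosen for this sketch).
    Returns {window_index: {emotion: mean probability}}.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for timestamp, probabilities in samples:
        window = int(timestamp // window_seconds)
        counts[window] += 1
        for emotion, value in probabilities.items():
            sums[window][emotion] += value
    # An emotion missing from a sample is treated as probability 0 in that sample.
    return {
        window: {emotion: total / counts[window] for emotion, total in sums[window].items()}
        for window in sorted(sums)
    }


samples = [
    (0, {"joy": 0.6, "anger": 0.4}),
    (5, {"joy": 0.8, "anger": 0.2}),
    (12, {"joy": 0.1, "anger": 0.9}),
]
averaged = average_probabilities(samples, window_seconds=10)
```

Here the first two samples fall into the same 10-second window and are averaged, while the third sample forms its own window.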

According to an embodiment, the second analysis result acquisition unit 21b may generate and transmit a graph of the probability value of each of the plurality of general emotional states over time, as shown in FIG. 5.

According to an embodiment, the second analysis result acquisition unit 21b may obtain the probability value of each of the plurality of general emotional states based on the second image data and the second audio data.

According to an embodiment, the second analysis result acquisition unit 21b may obtain the probability value of each of the plurality of general emotional states by applying a plurality of models based on ensemble learning to the second image data, the second audio data, and text data converted from the second audio data.

Specifically, the second analysis result acquisition unit 21b may obtain the probability value of each of the plurality of general emotional states by applying a plurality of models based on ensemble learning to the plurality of image frames constituting the second image data, the plurality of audio frames constituting the second audio data, and the plurality of text data obtained by converting the plurality of audio frames through speech-to-text (STT).

According to an embodiment, the second analysis result acquisition unit 21b may generate a first model trained on training image frames, a second model trained on training audio frames, and a third model trained on training text data, and may obtain the probability value of each of the plurality of general emotional states by referring to a first weight of the first model, a second weight of the second model, and a third weight of the third model.

Specifically, a first classification value is obtained through the first model based on at least one image frame among the plurality of image frames, a second classification value is obtained through the second model based on at least one audio frame among the plurality of audio frames, and a third classification value is obtained through the third model based on at least one item of text data among the plurality of text data. By applying the first weight to the first classification value, the second weight to the second classification value, and the third weight to the third classification value, the probability value of each of the plurality of general emotional states based on that image frame, audio frame, and text data may be obtained. This process may then be performed over the plurality of image frames, the plurality of audio frames, and the plurality of text data to obtain the probability value of each of the plurality of general emotional states as a whole.

According to an embodiment, the first model may be a model based on MobileNet.

According to an embodiment, the second model may be a model based on a support vector machine (SVM).

According to an embodiment, the third model may be a model based on BERT.

According to an embodiment, the second analysis result acquisition unit 21b may obtain, through the first model, a first classification value representing the client's general emotional state based on an image frame of the client.

According to an embodiment, the second analysis result acquisition unit 21b may obtain the first classification value representing the client's general emotional state through a MobileNet model based on the image frame.

According to an embodiment, the first classification value may include a first preliminary probability value for each of the plurality of general emotional states.

The second analysis result acquisition unit 21b may obtain, through the second model, a second classification value representing the client's general emotional state based on an audio frame of the client.

According to an embodiment, the second analysis result acquisition unit 21b may obtain the second classification value representing the client's general emotional state through a second model based on an SVM algorithm, using the audio frame as input.

According to an embodiment, the second classification value may include a second preliminary probability value for each of the plurality of general emotional states.

The second analysis result acquisition unit 21b may obtain, through the third model, a third classification value representing the client's general emotional state based on the text data.

According to an embodiment, the second analysis result acquisition unit 21b may obtain the third classification value through a BERT model based on the text data.

According to an embodiment, the third classification value may include a third preliminary probability value for each of the plurality of general emotional states.

Specifically, referring to FIG. 13, the second analysis result acquisition unit 21b may obtain a plurality of first results by applying the first weight to the first preliminary probability value of each of the plurality of general emotional states, a plurality of second results by applying the second weight to the second preliminary probability value of each of the plurality of general emotional states, and a plurality of third results by applying the third weight to the third preliminary probability value of each of the plurality of general emotional states, and may then sum the first, second, and third results for each general emotional state to obtain the probability value of each of the plurality of general emotional states.

For example, suppose the first preliminary probability values of joy and anger are [0.7, 0.3], the second preliminary probability values of joy and anger are [0.8, 0.2], and the third preliminary probability values of joy and anger are [0.2, 0.8], with a weight (a) of 0.3 for the first model, a weight (b) of 0.5 for the second model, and a weight (c) of 0.2 for the third model. The summed probability value for joy is then 0.7*0.3 + 0.8*0.5 + 0.2*0.2 = 0.65, and the summed probability value for anger is 0.3*0.3 + 0.2*0.5 + 0.8*0.2 = 0.35.
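The weighted summation in the example above can be reproduced with a short sketch. The function name `fuse` and the dictionary layout are assumptions made for illustration; only the numeric values come from the example.

```python
def fuse(preliminary, weights):
    """Weighted sum of per-model preliminary probabilities for each emotion.

    preliminary: list of {emotion: probability} dicts, one per model.
    weights: list of model weights in the same order.
    """
    emotions = preliminary[0].keys()
    return {
        emotion: sum(weight * probs[emotion] for probs, weight in zip(preliminary, weights))
        for emotion in emotions
    }


# Values taken from the example: image, audio, and text models in that order.
image_model = {"joy": 0.7, "anger": 0.3}
audio_model = {"joy": 0.8, "anger": 0.2}
text_model = {"joy": 0.2, "anger": 0.8}

fused = fuse([image_model, audio_model, text_model], weights=[0.3, 0.5, 0.2])
```

Because the three weights sum to 1 and each model's preliminary values sum to 1, the fused values also sum to 1 and can be read as probabilities.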

The training process of each model is described later with reference to FIG. 13.

According to an embodiment, the plurality of substitute emotional states may be a combination of emotional states each defined as a superordinate concept of the general emotional states. That is, each substitute emotional state may be defined so that one or more general emotional states, defined as its subordinate concepts, are mapped to it. For example, if the general emotional states are expressed as seven states, the substitute emotional states may be expressed as four states (positive: joy; negative: anger, contempt, surprise, fear; neutral: tranquility; other: sadness).

According to an embodiment, the probability value of each of the plurality of substitute emotional states may be obtained by summing the probability values of the one or more general emotional states corresponding to that substitute emotional state. That is, the second analysis result acquisition unit 21b may calculate the probability value of each of the plurality of general emotional states, then sum the probability values of the general emotional states classified into the same group, and use that sum as the probability value of the substitute emotional state.

For example, if at a specific point in time (or time period) the probability value of joy is 20%, anger 10%, contempt 15%, surprise 5%, fear 20%, tranquility 10%, and sadness 20%, then the probability value of positive is calculated as 20%, negative as 50%, neutral as 10%, and other as 20%.
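The grouping and summation in the example above can be sketched as follows. The group assignment follows the four-state mapping given in the description; the identifier names are assumptions made for illustration.

```python
# Mapping from substitute emotional states to their general emotional states,
# following the four-state example in the description.
SUBSTITUTE_GROUPS = {
    "positive": ["joy"],
    "negative": ["anger", "contempt", "surprise", "fear"],
    "neutral": ["tranquility"],
    "other": ["sadness"],
}


def substitute_probabilities(general):
    """Sum general-emotion probabilities within each substitute group."""
    return {
        group: sum(general.get(emotion, 0.0) for emotion in members)
        for group, members in SUBSTITUTE_GROUPS.items()
    }


general = {
    "joy": 0.20, "anger": 0.10, "contempt": 0.15, "surprise": 0.05,
    "fear": 0.20, "tranquility": 0.10, "sadness": 0.20,
}
substitute = substitute_probabilities(general)
```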

According to an embodiment, the probability value of each of the plurality of substitute emotional states may also be obtained by averaging the probability values of that substitute emotional state obtained during a predetermined time interval.

According to an embodiment, the second analysis result acquisition unit 21b may generate and transmit a graph of the probability value of each of the plurality of substitute emotional states over time, as shown in FIG. 6.

According to an embodiment, the second analysis result acquisition unit 21b may obtain a heart rate and a stress index based on the second image data.

Specifically, the second analysis result acquisition unit 21b may compute a remote photoplethysmography (rPPG) signal from each of the image frames constituting the second image data and extract the heart rate from the computed rPPG signal.

According to an embodiment, a Haar cascade (Haar-like feature) algorithm may be used to detect the face in an image frame as a region of interest (ROI); after noise removal and frequency transformation, the change in skin color may be separated into RGB channels, the change in the mean pixel value may be measured in each channel, and the heart rate may be calculated from the values extracted from each channel.
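The final step above, turning a per-frame mean pixel signal into a heart rate, can be sketched under simplifying assumptions. The sketch assumes that the mean value of one color channel inside the face region has already been measured for every frame; face detection and noise removal are not shown. The function name, the 42 to 240 bpm search range, and the brute-force spectral search are choices made for illustration and are not taken from the specification.

```python
import math


def estimate_heart_rate_bpm(channel_means, fps, min_bpm=42, max_bpm=240):
    """Return the bpm whose frequency carries the most power in the signal.

    channel_means: per-frame mean pixel value of one channel in the face ROI.
    fps: frame rate of the video in frames per second.
    """
    n = len(channel_means)
    mean = sum(channel_means) / n
    centered = [value - mean for value in channel_means]  # remove the DC component

    best_bpm, best_power = None, -1.0
    for bpm in range(min_bpm, max_bpm + 1):
        freq = bpm / 60.0  # candidate heart rate in Hz
        real = sum(v * math.cos(2 * math.pi * freq * i / fps) for i, v in enumerate(centered))
        imag = sum(v * math.sin(2 * math.pi * freq * i / fps) for i, v in enumerate(centered))
        power = real * real + imag * imag
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm
```

For a clean synthetic signal oscillating at 1.2 Hz and sampled at 30 frames per second for 10 seconds, the search returns 72 bpm. Real recordings would typically need filtering and motion compensation before this step.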

The stress index may then be calculated using the positive correlation between heart rate variability indices, which are based on the heart rate extracted from the computed rPPG signal, and stress-induced activation of the autonomic nervous system.

According to an embodiment, the heart rate variability indices may include an SDNN (standard deviation of all RR intervals) value, an RMSSD (square root of the mean squared differences of successive normal sinus intervals) value, an SDNN/RMSSD ratio, NN (normal-to-normal interval), NN50 (the number of cases in which successive NN intervals differ by more than 50 ms), pNN50 (the proportion of NN50 among all NN intervals), an LF (low frequency) value, an HF (high frequency) value, an LF/HF value, and the like.
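The time-domain indices listed above can be computed from a sequence of NN intervals as in the following sketch. The function name and input layout are assumptions made for illustration. Two conventions are also assumed here: SDNN uses the sample standard deviation (n - 1 in the denominator), and pNN50 is taken relative to the number of successive differences; other conventions exist.

```python
import math


def hrv_time_domain(nn_intervals_ms):
    """Compute SDNN, RMSSD, NN50, and pNN50 from NN intervals in milliseconds."""
    n = len(nn_intervals_ms)
    mean = sum(nn_intervals_ms) / n
    # SDNN: standard deviation of the NN intervals (sample form, assumed here).
    sdnn = math.sqrt(sum((x - mean) ** 2 for x in nn_intervals_ms) / (n - 1))

    # Differences between successive NN intervals.
    diffs = [b - a for a, b in zip(nn_intervals_ms, nn_intervals_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))

    # NN50: count of successive differences larger than 50 ms.
    nn50 = sum(1 for d in diffs if abs(d) > 50)
    pnn50 = nn50 / len(diffs)

    return {"sdnn": sdnn, "rmssd": rmssd, "nn50": nn50, "pnn50": pnn50,
            "sdnn_rmssd_ratio": sdnn / rmssd}


indices = hrv_time_domain([800, 810, 790, 860, 800])
```

The frequency-domain values (LF, HF, LF/HF) require a spectral estimate of the interval series and are not covered by this sketch.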

For example, a larger SDNN value means greater variability, and because greater variability generally indicates better health, physical fatigue can be computed from it; a higher RMSSD value indicates a more stable heart, from which parasympathetic activity can be inferred. The LF value reflects sympathetic nervous system activity; it is affected by mental stress and fatigue, can be used in particular to assess the degree of acute stress, and correlates with depression or anger. The HF value reflects parasympathetic nervous system activity; it is closely related to respiratory activity and correlates with long-term stress, anxiety, or fear. The LF/HF value reflects the overall balance of the autonomic nervous system, and in a healthy person the LF value is generally higher than the HF value while awake. The stress index can therefore be calculated using the positive correlation between the heart rate variability indices extracted from the rPPG signal and stress-induced activation of the autonomic nervous system.

According to an embodiment, the average of each heart rate variability index acquired during the first 5 minutes after counseling starts may be calculated and set as a reference value, and the heart rate variability indices calculated after the first 5 minutes may be analyzed by comparing them with the reference value.
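The baseline comparison described above can be sketched as follows. The function name, the `(seconds, value)` input layout, and the use of a simple difference from the reference value are assumptions made for illustration; the specification does not state how the comparison is scored.

```python
def compare_to_baseline(samples, baseline_seconds=300):
    """Use the mean of the first 5 minutes as a reference and report later deviations.

    samples: list of (seconds_since_session_start, index_value) pairs for one
             heart rate variability index (hypothetical layout).
    Returns (reference_value, [(seconds, value - reference_value), ...]).
    """
    baseline = [value for seconds, value in samples if seconds < baseline_seconds]
    if not baseline:
        raise ValueError("no samples inside the baseline window")
    reference = sum(baseline) / len(baseline)
    deviations = [
        (seconds, value - reference)
        for seconds, value in samples
        if seconds >= baseline_seconds
    ]
    return reference, deviations


reference, deviations = compare_to_baseline(
    [(0, 40), (150, 50), (299, 60), (300, 35), (600, 70)]
)
```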

However, this is only an example, and the heart rate, heart rate variability, and stress index may be calculated by any of various known methods.

According to an embodiment, the second analysis result acquisition unit 21b may generate and transmit a graph of the heart rate and stress index over time, as shown in FIG. 7. According to an embodiment, the second analysis result acquisition unit 21b may show the heart rate and stress index over time on a single graph. According to an embodiment, the second analysis result acquisition unit 21b may create and transmit a graph in which the heart rate and stress index are expressed as the measured values themselves and/or as classification information corresponding to the measured values (for example, normal/caution/alert, or normal stress/moderate stress/severe stress).

According to an embodiment, the second analysis result acquisition unit 21b may obtain the frequency of each preset keyword by analyzing the text data converted from the second audio data.

Specifically, the second analysis result acquisition unit 21b may obtain the frequency of each keyword preset by the counselor by analyzing the entire text data obtained by converting, through STT, each of the audio frames constituting the second audio data. This may be done through an analysis that maps the text data to the preset keywords.
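Mapping a transcript to preset keywords can be illustrated with a minimal sketch. The function name and the case-insensitive substring matching are assumptions made for illustration; a production system would more likely match on morphemes or tokens.

```python
from collections import Counter


def keyword_frequencies(transcript, keywords):
    """Count how often each preset keyword occurs in the transcript text."""
    text = transcript.lower()
    # Substring counting is a simplification assumed for this sketch.
    return Counter({keyword: text.count(keyword.lower()) for keyword in keywords})


frequencies = keyword_frequencies(
    "I feel anxious at work. Work makes me anxious and tired.",
    ["anxious", "work", "sleep"],
)
# most_common() lists the keywords in descending order of frequency.
ranked = frequencies.most_common()
```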

According to an embodiment, the preset keywords may be implemented so that the counselor can change them in real time.

According to an embodiment, the second analysis result acquisition unit 21b may generate and transmit a graph showing the frequency of each keyword, as shown in FIG. 8.

According to an embodiment, the second analysis result acquisition unit 21b may provide the keyword frequencies sorted in descending or ascending order.

According to an embodiment, the second analysis result acquisition unit 21b may also present the frequency of each keyword as the size of a circle. For example, the graph may be generated so that a keyword uttered more frequently is represented by a relatively larger circle.

According to an embodiment, the second analysis result acquisition unit 21b may receive the set keywords from the counselor terminal 30, or the emotion analysis result providing device 20 may receive the set keywords directly as user input through a separate user interface unit (not shown).

According to an embodiment, the second analysis result acquisition unit 21b may perform Latent Dirichlet Allocation (LDA) topic modeling on the text data converted from the second audio data to obtain the frequency of each of the keywords that make up each topic.

Specifically, the second analysis result acquisition unit 21b may perform LDA topic modeling on the entire text data obtained by converting, through STT, each of the audio frames constituting the second audio data, to obtain the frequency of each of the keywords that make up each topic.

According to an embodiment, the second analysis result acquisition unit 21b may preprocess each of a plurality of documents constituting the entire text data, extract at least one topic from the preprocessed documents, and calculate, for each topic, the word frequency of each of the keywords that make up the topic.

According to an embodiment, the preprocessing may include at least one of special character removal (or number removal), morphological analysis, stopword removal, and synonym (or near-synonym) handling.

According to an embodiment, the second analysis result acquisition unit 21b may extract at least one topic using the preprocessed documents and match each of the preprocessed documents with each of the extracted topics. Specifically, keywords are extracted from the preprocessed documents, and a frequency analysis of the extracted keywords is performed. The second analysis result acquisition unit 21b may also generate a term-document matrix for topic modeling. Using the LDA technique, the second analysis result acquisition unit 21b may extract at least one topic from the keywords that make up the documents and classify each document by topic. A single document may belong to a plurality of topics; that is, documents and topics may have a 1:N correspondence (N being a natural number of 1 or more). The LDA model predicts which topics a given document covers by analyzing the words of the document against a precomputed distribution of words per topic. As the criterion for dividing topics, the number of clusters K is determined through silhouette analysis, a cluster analysis technique, and the top keywords are analyzed for each of the K topics.
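The term-document matrix mentioned above can be illustrated with a small sketch. It covers only the matrix construction from already preprocessed (tokenized) documents; fitting the LDA model and choosing K by silhouette analysis would normally be delegated to a topic-modeling library and are not shown. The function name and the rows-as-terms layout are assumptions made for illustration.

```python
from collections import Counter


def term_document_matrix(documents):
    """Build a term-document count matrix from tokenized documents.

    documents: list of token lists, one per preprocessed document.
    Returns (vocabulary, matrix) where matrix[i][j] is the number of times
    vocabulary[i] occurs in documents[j].
    """
    vocabulary = sorted({token for document in documents for token in document})
    counts = [Counter(document) for document in documents]
    matrix = [[count[term] for count in counts] for term in vocabulary]
    return vocabulary, matrix


vocabulary, matrix = term_document_matrix(
    [["sleep", "stress", "sleep"], ["stress", "family"]]
)
```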

According to an embodiment, the second analysis result acquisition unit 21b may plot, for each topic, the word frequency of each of the keywords that make up the topic as a plurality of graphs, as shown in FIG. 9A and FIG. 9B (an enlarged view of the graph of FIG. 9A).

According to an embodiment, one graph may show as many circles as there are topics, with the size of each circle proportional to the frequency of the keywords corresponding to that topic. The distance between circles may be calculated from the similarity between the topics. Another graph may show the keywords corresponding to the topic of each circle together with the frequency of each keyword.

According to an embodiment, the second analysis result acquisition unit 21b may obtain voice parameters by analyzing the second audio data.

According to the present invention, various known techniques may be applied to obtain the voice parameters for each of the audio frames of the second audio data, and a description thereof is omitted.

The pitch may be determined based on the frequency and period of the voice.

A pause interval may be defined as an interval in which no waveform occurs or in which the waveform is interrupted.

According to an embodiment, when a pause interval lasts for a predetermined time (e.g., 2 seconds) or longer, the second analysis result acquisition unit 21b may determine it to be an interval in which the client hesitates and use it to generate indices describing hesitation (e.g., the number of hesitations and the average hesitation time).
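The hesitation indices described above can be derived from a list of pause intervals as in the following sketch. The function name and the `(start, end)` input layout are assumptions made for illustration; the 2-second threshold is the example value from the description.

```python
def hesitation_metrics(pause_intervals, min_duration=2.0):
    """Count pauses at or above the threshold and compute their mean duration.

    pause_intervals: list of (start_seconds, end_seconds) pairs for detected pauses.
    Returns (hesitation_count, average_hesitation_seconds).
    """
    hesitations = [
        end - start
        for start, end in pause_intervals
        if end - start >= min_duration
    ]
    count = len(hesitations)
    average = sum(hesitations) / count if count else 0.0
    return count, average


count, average = hesitation_metrics([(1.0, 1.5), (4.0, 6.5), (10.0, 13.5)])
```

In this input the 0.5-second pause is ignored, and the two remaining pauses of 2.5 and 3.5 seconds give two hesitations with an average of 3.0 seconds.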

Depending on the embodiment, the second analysis result acquisition unit 21b may generate graphs of the voice parameters over time, as shown in FIGS. 10A and 10B.

For example, FIG. 10A shows the change in pitch over time, and FIG. 10B shows the change in pauses over time.

Depending on the embodiment, the second analysis result acquisition unit 21b may obtain gaze position information and face movement information by analyzing the second image data.

Depending on the embodiment, the second analysis result acquisition unit 21b may obtain the gaze position information by recognizing the client's face in each image frame of the second image data, detecting feature points of the recognized face, designating a region of interest based on those feature points, detecting the pupil center position within the designated region of interest, and calculating the degree of change of the detected pupil center position over time based on the change in Euclidean distance.

Recognizing the client's face in an image frame and detecting the feature points of the recognized face may be performed by applying a known algorithm.

Depending on the embodiment, the second analysis result acquisition unit 21b may designate as regions of interest the regions formed by the facial feature points corresponding to the client's left eye and right eye, respectively, and may detect the pupil center position in each of the left eye region and the right eye region.

Depending on the embodiment, the second analysis result acquisition unit 21b may designate the region of interest by comparing a previously generated feature point template (which includes the region of interest) with the feature points of the recognized face.

For example, referring to FIG. 12, for the left eye region (and likewise for the right eye region), the second analysis result acquisition unit 21b may calculate relative coordinates (the center coordinate between the first end points and the center coordinate between the second end points) from the coordinates of the first end points, taken along the horizontal axis, and of the second end points, taken along the vertical axis, and may detect those relative coordinates as the pupil center position.

Then, as the client's gaze changes over time, the Euclidean distance between the relative coordinates and reference coordinates is calculated, and the gaze position information may be obtained using the average of the Euclidean distances calculated for the left eye region and the right eye region.

Depending on the embodiment, the reference coordinates may be calculated as the average of the Euclidean distances measured in real time during a predetermined time (e.g., about the first minute after counseling starts).
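The gaze calculation described above can be sketched as follows. This is only one possible reading under stated assumptions: the pupil center is taken as the midpoint of the horizontal end points in x and the midpoint of the vertical end points in y, and the reference coordinates are taken as the mean pupil center over an initial baseline window. The text does not fix either detail, and the function names are hypothetical.

```python
import math


def pupil_center(horizontal_ends, vertical_ends):
    """Estimate the pupil center of one eye region.

    horizontal_ends: the two end points along the horizontal axis, as (x, y).
    vertical_ends:   the two end points along the vertical axis, as (x, y).
    Assumption: x comes from the horizontal pair, y from the vertical pair.
    """
    (x1, _), (x2, _) = horizontal_ends
    (_, y1), (_, y2) = vertical_ends
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def mean_point(points):
    """Mean of a list of (x, y) points, used here as the assumed reference."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def gaze_displacement(left, right, ref_left, ref_right):
    """Average of the left-eye and right-eye Euclidean distances to reference."""
    return (math.dist(left, ref_left) + math.dist(right, ref_right)) / 2


# Baseline frames (for example, the first minute) give the reference per eye.
ref_left = mean_point([(20.0, 20.0), (22.0, 20.0)])
ref_right = mean_point([(60.0, 20.0), (60.0, 22.0)])

left = pupil_center(((10, 20), (30, 20)), ((20, 15), (20, 25)))
right = pupil_center(((50, 21), (70, 21)), ((60, 16), (60, 26)))
print(gaze_displacement(left, right, ref_left, ref_right))
```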

Depending on the embodiment, as shown in FIG. 11, the second analysis result acquisition unit 21b may graph the average Euclidean distance over time (displacement per reference time, i.e., movement per second), or may graph the cumulative sum of that average over time (cumulative displacement per reference time, i.e., cumulative movement), and may create and transmit the resulting graph.

Depending on the embodiment, the second analysis result acquisition unit 21b may obtain the face movement information by recognizing the client's face in each image frame of the second image data, detecting a feature point of the recognized face, and calculating the degree of change of the detected feature point position over time based on the change in Euclidean distance.

Recognizing the client's face in an image frame and detecting the feature point of the recognized face may be performed by applying a known algorithm.

Depending on the embodiment, the second analysis result acquisition unit 21b may use the nose tip point of the face, in particular, to detect the position of the face as it moves.

Depending on the embodiment, the second analysis result acquisition unit 21b may detect the nose tip point as the target feature point by comparing a previously generated feature point template (facial feature points including the nose tip) with the feature points of the recognized face.

Depending on the embodiment, the coordinates of the nose tip point may be absolute coordinates taken with respect to the entire image frame.

Depending on the embodiment, the absolute coordinates may be calculated in units of the pixels that make up the image frame.

Then, as the client's face moves over time, the Euclidean distance between the nose tip coordinates at the current time point and the nose tip coordinates at the previous time point is calculated to obtain the face movement position information.
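The frame-to-frame face movement described above can be sketched as follows. The sketch assumes the nose tip has already been located in each frame and is given as absolute pixel coordinates; `face_movement` and `cumulative_movement` are hypothetical names, and the cumulative series corresponds to the accumulated displacement that the text describes graphing.

```python
import math
from itertools import accumulate


def face_movement(nose_tips):
    """Euclidean distance between the nose tip in consecutive frames.

    nose_tips: list of (x, y) pixel coordinates, one per frame, in time order.
    Returns one distance per pair of neighboring frames.
    """
    return [math.dist(prev, cur) for prev, cur in zip(nose_tips, nose_tips[1:])]


def cumulative_movement(distances):
    """Running total of the per-frame distances (accumulated displacement)."""
    return list(accumulate(distances))


nose_tips = [(0, 0), (3, 4), (3, 4), (6, 8)]
distances = face_movement(nose_tips)
print(distances)
print(cumulative_movement(distances))
```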

Depending on the embodiment, the second analysis result acquisition unit 21b may graph the gaze position information and the face movement information over time, respectively, as shown in FIG. 11, and may create and transmit the graphs.

The transmission unit 21c may transmit information including all of the plurality of second analysis results to the counselor terminal 30 (s24).

Depending on the embodiment, the information including all of the plurality of second analysis results may include all of the plurality of second analysis results, the video data received in real time from the client terminal 10, and content consisting of the text data converted from the client's voice data separated from that video data and the text data converted from the counselor's voice data.

FIGS. 5 to 11 illustrate screens of the user interface unit 33 produced by the operation of the second analysis result providing unit 32 of the counselor terminal 30.

The second analysis result providing unit 32 of the counselor terminal 30 may output, on a single screen of the user interface unit 33, information including the second analysis result selected by the user from among the plurality of second analysis results received from the emotion analysis result providing device 20, together with the counseling notes recorded by the counselor in real time, thereby providing them to a user such as an administrator or a counselor.

Depending on the embodiment, the information including the second analysis result selected by the user may include the selected second analysis result, an auxiliary analysis result related to the selected second analysis result, the video data in which the client was recorded, and content corresponding to the text data converted from the client's voice data separated from that video data and the text data converted from the counselor's voice data.

Depending on the embodiment, the second analysis result providing unit 32 may receive, through the screen of the user interface unit 33, a user input for outputting a second analysis result selected from among the plurality of second analysis results.

Depending on the embodiment, the screen of the user interface unit 33 includes a plurality of menus (M) representing the plurality of second analysis results, and the second analysis result providing unit 32 may output the second analysis result (①) corresponding to the menu selected by the user on the screen of the user interface unit 33, thereby providing it to the user.

For example, each figure illustrates the screen that is output when the user selects the menu for outputting a particular item as the second analysis result: FIG. 5 for a plurality of general emotional states; FIG. 6 for a plurality of substitute emotional states; FIG. 7 for heart rate and stress measurements; FIG. 8 for the frequency of each keyword; FIGS. 9A and 9B for the frequency of each of the keywords grouped by topic; FIGS. 10A and 10B for voice parameters; and FIG. 11 for gaze position information and/or face movement information.

Depending on the embodiment, each item of content may be output in its own separate area of the screen of the user interface unit 33.

For example, the screens of the user interface unit 33 in FIGS. 5 to 11 may include the second analysis result (①), the auxiliary analysis result related to the second analysis result (②), the video data in which the client was recorded (⑤), the text data converted from the client's voice data and the text data converted from the counselor's voice data (④), and the counseling notes recorded by the counselor in real time (③).

On the screens of the user interface unit 33 in FIGS. 5 to 7, 10A, 10B, and 11, the second analysis result over time may, depending on the embodiment, be provided in the form of a graph.

Specifically, the single user interface screen 33 is divided into a plurality of areas, and the second analysis result providing unit 32 may output, in one of those areas, a graph showing the second analysis result selected by the user over time.

On the screens of the user interface unit 33 in FIGS. 5 to 11, depending on the embodiment, the text data converted from the client's voice data and the text data converted from the counselor's voice data (④) may be edited by the user or provided to the user as a file.

On the screens of the user interface unit 33 in FIGS. 5 to 11, depending on the embodiment, client information may be listed so that the client information selected by the user can be retrieved.

Depending on the embodiment, across the screens of the user interface unit 33 in FIGS. 5 to 11, only the content corresponding to the second analysis result (①) and the auxiliary analysis result related to the second analysis result (②) differs from screen to screen, while the remaining content (③, ④, ⑤) may be implemented identically.

Referring to FIG. 5, a graph showing the probability value of each of a plurality of general emotional states over time may be provided in the area showing the second analysis result (①). Depending on the embodiment, the auxiliary analysis result related to the second analysis result (②) may include item information for each of the plurality of general emotional states, and the screen may be set so that only the probability values of the general emotional state items selected by the user are output on the graph.

Depending on the embodiment, among the probability values of the plurality of general emotional states shown at the same time point on the graph, the general emotional state with the largest probability value may be determined to be the representative emotional state at that time point.
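Selecting the representative emotional state at a time point amounts to taking the state with the largest probability value. The following is a minimal sketch; the emotion labels and probability values are made up for illustration, and the function names are hypothetical.

```python
def representative_emotion(probabilities):
    """Return the emotional state with the largest probability value.

    probabilities: dict mapping an emotional state label to its probability
    at a single time point. Ties resolve to the first label in dict order.
    """
    return max(probabilities, key=probabilities.get)


def representative_timeline(series):
    """Map a list of (time, probabilities) pairs to (time, representative)."""
    return [(t, representative_emotion(p)) for t, p in series]


series = [
    (0, {"happy": 0.6, "sad": 0.3, "neutral": 0.1}),
    (10, {"happy": 0.2, "sad": 0.7, "neutral": 0.1}),
]
print(representative_timeline(series))
```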

According to the present invention, the second analysis result of FIG. 5 lets the user see, as a whole, how the client's emotions change over time across the entire counseling session. The user can check the probability values of the plurality of general emotional states together, or check the probability value of a selected general emotional state individually. By understanding the overall context of the client's emotional changes, the user can then plan the direction of future counseling.

Referring to FIG. 6, a graph showing the probability value of each of a plurality of substitute emotional states over time may be provided in the area showing the second analysis result (①). Depending on the embodiment, the auxiliary analysis result related to the second analysis result (②) may include frequency and/or probability information for each of the positive, negative, neutral, and other categories, together with a chart of that information. Depending on the embodiment, the screen may be set so that only the information corresponding to the substitute emotional state selected by the user is output.

According to the present invention, the second analysis result of FIG. 6 lets the user see, as a whole, how the client's emotions change over time across the entire counseling session. The user can check the probability values of the plurality of substitute emotional states together, or check the probability value of a selected substitute emotional state individually. By understanding the overall context of the client's emotional changes, the user can then plan the direction of future counseling.

In particular, according to the present invention, the second analysis result of FIG. 5 shows emotional state information classified into finer categories, whereas the second analysis result of FIG. 6 shows emotional state information classified into broader categories, so the user can check emotional state information in different ways according to need and choice.

Referring to FIG. 7, a graph showing the heart rate and the stress index together over time may be provided in the area showing the second analysis result (①).

Depending on the embodiment, the auxiliary analysis result related to the second analysis result may include information such as the average heart rate and the average stress level. Depending on the embodiment, the screen may be set so that only the information selected by the user, from among the heart rate and the stress level, is output.

According to the present invention, the second analysis result of FIG. 7 lets the user see, as a whole, the client's stress level over time across the entire counseling session. By understanding the overall context, including the client's stress level and how it changes, the user can then plan the direction of future counseling.

Referring to FIG. 8, a graph showing the frequency of each keyword may be provided in the area showing the second analysis result (①).

Depending on the embodiment, the auxiliary analysis result related to the second analysis result may include keyword list information set in advance by the counselor.

According to the present invention, it is possible to check how often the main keywords that matter from a psychological counseling perspective, or that the counselor considers important, appear among the words uttered by the client during counseling. For example, words such as "suicide" and "depression" starkly reveal the client's psychological state and circumstances, so checking how often they appear is very important. In addition to the words registered in advance by an administrator, the counselor conducting the session may register words in real time and check them, and words to be excluded from the analysis may be set as excluded. When the analysis should cover every word uttered by the client rather than only the registered words, all words may be analyzed.
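The keyword counting described above, with registered words, excluded words, and an all-words mode, can be sketched as follows. The sketch assumes the client's utterances have already been converted to text and split into words; real Korean text would need morphological analysis, which is not shown. The function name and parameters are hypothetical.

```python
from collections import Counter


def keyword_frequencies(words, keywords=None, excluded=()):
    """Count how often words appear in the client's utterances.

    words:    list of already-tokenized words uttered by the client.
    keywords: registered words to report; None means analyze every word.
    excluded: words to leave out of the analysis in either mode.
    """
    excluded = set(excluded)
    counts = Counter(w for w in words if w not in excluded)
    if keywords is None:
        return dict(counts)
    # Registered words that never occur are reported with a count of 0.
    return {k: counts[k] for k in keywords if k not in excluded}


words = ["I", "feel", "depression", "and", "depression", "suicide"]
print(keyword_frequencies(words, keywords=["depression", "suicide"]))
print(keyword_frequencies(words, excluded=["I", "and"]))
```

Reporting a zero count for a registered word that never appears is a choice made here so that every registered keyword shows up on the graph; the text does not specify this.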

Referring to FIG. 9A, the frequency of each of the keywords grouped by topic may be graphed in several forms and provided in the area showing the second analysis result (①).

Depending on the embodiment, the auxiliary analysis result related to the second analysis result may include list information showing the frequency and rank of each of the keywords grouped by topic.

According to the present invention, the second analysis result of FIG. 9A makes it possible to identify, within the content of the counseling, the subject areas in which the client is mainly interested, and to examine how they relate to other subjects. By understanding the client's main areas of interest as a whole, the counselor can then plan the direction of future counseling.

Referring to FIGS. 10A and 10B, the voice parameters over time may be graphed and provided in the area showing the second analysis result (①).

Depending on the embodiment, the auxiliary analysis result related to the second analysis result may include information such as the average pitch, the average pause, the average number of utterances per predetermined time, the number of hesitations, and the average hesitation value. Depending on the embodiment, the screen may be set so that only the information selected by the user from among these is output.

According to the present invention, the counselor can see from the graphs when the client raises their voice, speaks quietly, speaks slowly, or speaks quickly. By viewing as a whole how the voice waveform changes with the client's emotions and with the content of what is said, the counselor can judge subtle changes in the client's emotions over time.

Referring to FIG. 11, the gaze position information and the face movement information over time may each be graphed and provided in the area showing the second analysis result (①).

Depending on the embodiment, the auxiliary analysis result related to the second analysis result may include the average gaze position change, the cumulative average gaze position change, the average face movement change, the cumulative average face movement change, and the like. Depending on the embodiment, the screen may be set so that only the information selected by the user from among these is output.

According to the present invention, movement of the gaze or face can be evidence of the client's emotional state showing in the body. Through the analysis result of FIG. 11, the counselor can therefore identify the client's unconscious behavior and use it to judge the client's emotional state, and can grasp the client's emotions more accurately by finding the time points at which a large displacement appears on the graph and reviewing the counseling content from that period.

According to the present invention, the first analysis result acquisition unit 21a obtains the first analysis results through real-time analysis, while the second analysis result acquisition unit 21b obtains the second analysis results through analysis performed after a predetermined time has elapsed. The counselor can thus receive analysis of a session with a client in real time and, after the session has ended, receive an integrated analysis of the entire session. Because a single session yields analysis results in several forms, counseling records can be managed in a more diverse and systematic way.

In addition, the user interface screen shown while counseling is in progress is implemented differently from the user interface screen that provides analysis results after counseling has ended, which makes it easier for the counselor to manage and review counseling records. For example, while counseling is in progress, the real-time analysis results are all provided on a single screen of the user interface unit 33, so the counselor can check, analyze, and/or judge the session more quickly and easily. After counseling has ended, one (selected) analysis result for the entire accumulated session is provided on a single screen of the user interface unit 33, so the user can review the entire session in an integrated way.

FIG. 13 is a diagram referred to in describing a training process for generating an ensemble model according to an embodiment.

Depending on the embodiment, the second analysis result acquisition unit 21b may generate each of a plurality of models in advance and store them in the storage unit 22.

Depending on the embodiment, the plurality of models may be generated as ensemble-based models trained on training image frames, training voice frames, and training text data, respectively.

Depending on the embodiment, the plurality of models may be generated as an ensemble of machine learning and deep learning.

Depending on the embodiment, the second analysis result acquisition unit 21b may obtain a first model trained on the training image frames and store it in the storage unit 22.

Depending on the embodiment, the first model may be a MobileNet model based on a convolutional neural network (CNN) algorithm, but it may also be generated by training based on algorithms such as a deep neural network (DNN), deep convolutional neural network (DCNN), recurrent neural network (RNN), k-nearest neighbors (KNN), support vector machine (SVM), random forest, or decision tree. The MobileNet architecture adopts depthwise separable convolutions (a form of factorized convolution) to minimize processing operations (i.e., floating-point operations, multiplications, and/or additions). To speed up processing by reducing the number of operations required, a depthwise separable convolution factorizes a standard convolution into a depthwise convolution and a 1 x 1 convolution (also called a "pointwise convolution"). The depthwise convolution applies a single filter to each input channel. The pointwise convolution then applies a 1 x 1 convolution to combine the outputs of the depthwise convolution. Filtering and combining are thus separated into two steps, rather than being performed as the single filter-and-combine operation of a standard convolution. Accordingly, a structure in the MobileNet architecture may include two convolutional layers, one depthwise convolutional layer and one pointwise convolutional layer, which together define or exemplify a "layer group".
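The reduction in operations obtained by this factorization can be illustrated by counting multiplications. The sketch below assumes stride 1 and padding that keeps the output feature map the same size as the input; the layer sizes in the example are arbitrary illustrative values and are not taken from this disclosure.

```python
def standard_conv_mults(k, m, n, f):
    """Multiplications in a standard convolution.

    k: kernel width and height, m: input channels, n: output channels,
    f: width and height of the (square) output feature map.
    """
    return k * k * m * n * f * f


def depthwise_separable_mults(k, m, n, f):
    """Multiplications in a depthwise convolution plus a 1 x 1 pointwise one."""
    depthwise = k * k * m * f * f  # one k x k filter per input channel
    pointwise = m * n * f * f      # 1 x 1 convolution combining the channels
    return depthwise + pointwise


k, m, n, f = 3, 32, 64, 112
standard = standard_conv_mults(k, m, n, f)
separable = depthwise_separable_mults(k, m, n, f)
# The ratio equals 1/n + 1/k**2, independent of m and f.
print(standard, separable, separable / standard)
```

With a 3 x 3 kernel, the separable form needs roughly one eighth to one ninth of the multiplications of the standard form once the number of output channels is large.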

실시예에 따라, 제2 분석결과획득부(21b)는 학습용 영상프레임을 CNN을 통한 신경망 학습을 통해 학습용 영상프레임에 대한 감정이 분류되도록 함으로써, 제1 모델을 생성할 수 있다.Depending on the embodiment, the second analysis result acquisition unit 21b may generate a first model by classifying emotions for the learning image frame through neural network learning through CNN.

The second analysis result acquisition unit 21b may obtain a second model trained on training voice frames and store it in the storage unit 22.

Depending on the embodiment, the second model may be a model based on the support vector machine (SVM) algorithm, but it may also be generated by training based on algorithms such as a CNN, deep neural network (DNN), deep convolutional neural network (DCNN), recurrent neural network (RNN), K-nearest neighbor (KNN), random forest, or decision tree. The SVM algorithm is used for classification and regression analysis. Given a set of data in which each item belongs to one of two categories, the SVM algorithm builds, from that data set, a non-probabilistic binary linear classification model that determines which category new data belongs to.

Depending on the embodiment, the second analysis result acquisition unit 21b may generate the second model by extracting voice feature vectors based on frequency analysis of the training voice frames and training on the extracted voice feature vectors with the SVM algorithm, so that the voice feature vectors are classified by emotion during training.
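As a rough illustration of this flow (frequency analysis, then feature vectors, then SVM training), the following pure-Python sketch may help. Everything in it is an assumption made for illustration: the specification does not state which spectral features or SVM variant are used, so the sketch uses normalized DFT band energies as the feature vector, a linear SVM trained by sub-gradient descent on the hinge loss, synthetic sine-wave "frames", and two hypothetical labels (+1 and -1). A real emotion classifier would need multi-class handling, for example one-vs-rest.

```python
import cmath
import math


def band_energies(frame, n_bands=4):
    """Frequency analysis: naive DFT power pooled into equal-width bands."""
    n = len(frame)
    half = n // 2
    power = []
    for k in range(half):
        coeff = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(coeff) ** 2)
    width = half // n_bands
    total = sum(power) or 1.0  # avoid division by zero on silent frames
    return [sum(power[b * width:(b + 1) * width]) / total for b in range(n_bands)]


def train_linear_svm(xs, ys, lam=0.01, epochs=200, lr=0.1):
    """Linear SVM via sub-gradient descent on the regularized hinge loss."""
    w = [0.0] * len(xs[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Regularization shrinks w; the hinge term acts only inside the margin.
            w = [wi * (1 - lr * lam) for wi in w]
            if margin < 1:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


def tone(bin_index, amplitude=1.0, phase=0.0, n=32):
    """Synthetic stand-in for a voice frame: a sine wave at a given DFT bin."""
    return [amplitude * math.sin(2 * math.pi * bin_index * t / n + phase) for t in range(n)]


# Hypothetical training set: +1 = low-pitched frames, -1 = high-pitched frames.
train_frames = [tone(1), tone(2, 0.5, 0.3), tone(9), tone(10, 2.0, 1.0)]
train_labels = [1, 1, -1, -1]

features = [band_energies(f) for f in train_frames]
w, b = train_linear_svm(features, train_labels)

# Classify frames that were not in the training set.
print(predict(w, b, band_energies(tone(3, 0.8, 0.5))))
print(predict(w, b, band_energies(tone(11, 1.5, 0.2))))
```

Normalizing the band energies makes the feature vector insensitive to loudness, so the classifier in this sketch separates frames only by where their spectral energy sits.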

The second analysis result acquisition unit 21b may obtain a third model 22c trained on training text data and store it in the storage unit 22.

Depending on the embodiment, the third model may be a BERT (Bidirectional Encoder Representations from Transformers) model, but it may also be generated by training based on algorithms such as a deep neural network (DNN), deep convolutional neural network (DCNN), CNN, recurrent neural network (RNN), K-nearest neighbor (KNN), support vector machine (SVM), random forest, or decision tree. BERT is an artificial intelligence model based on the Transformer architecture, which has an encoder-decoder structure. It stacks a plurality of Transformer layers to build deep representations of the input and is characterized by applying a masking process to token sequences for masked language modeling. Through fine-tuning, the BERT model achieves high accuracy even with a small amount of data. Because it is an attention-based model that improves performance by focusing on specific vectors, its performance does not degrade as sentences grow longer, which has the advantage of maintaining accuracy even for long sentences.
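The masking process mentioned above can be sketched as follows. This is a minimal illustration rather than the embodiment's code: it follows the masking scheme commonly described for BERT pre-training (about 15% of tokens are selected; of those, 80% become `[MASK]`, 10% become a random token, and 10% stay unchanged). The function name `mask_tokens`, the toy vocabulary, and the example sentence are assumptions.

```python
import random

MASK = "[MASK]"


def mask_tokens(tokens, vocab, rng, select_prob=0.15):
    """Return (masked_tokens, labels) for masked language modeling.

    labels[i] holds the original token at selected positions and None
    elsewhere, so a model is trained to predict only the selected tokens.
    """
    masked = list(tokens)
    labels = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() >= select_prob:
            continue  # position not selected for prediction
        labels[i] = tok
        r = rng.random()
        if r < 0.8:
            masked[i] = MASK               # 80%: replace with the mask token
        elif r < 0.9:
            masked[i] = rng.choice(vocab)  # 10%: replace with a random token
        # remaining 10%: keep the original token
    return masked, labels


# Hypothetical vocabulary and utterance, for illustration only.
vocab = ["i", "have", "been", "feeling", "anxious", "about", "work", "lately"]
tokens = "i have been feeling anxious about work lately".split()

rng = random.Random(0)  # seeded so that runs are repeatable
masked, labels = mask_tokens(tokens, vocab, rng)
print(masked)
print(labels)
```

Keeping some selected tokens unchanged or replacing them with random tokens means the model cannot rely on `[MASK]` always marking the position to predict, which matters because the mask token does not appear during fine-tuning.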

Depending on the embodiment, the second analysis result acquisition unit 21b may generate the third model through neural network training that obtains context-based embedding values from the training text data, so that the emotion of the training text data is classified.

The embodiments described above may be implemented in the form of program instructions that can be executed through various computer components and recorded on a computer-readable recording medium. The computer-readable recording medium may include program instructions, data files, data structures, and the like, alone or in combination.

The program instructions recorded on the computer-readable recording medium may be specially designed and configured for the present invention, or may be known and available to those skilled in the art of computer software.

Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware device may be configured to operate as one or more software modules to perform the processing according to the present invention, and vice versa.

Aspects of this specification may take the form of entirely hardware, entirely software (including firmware, resident software, microcode, etc.), or a computer program product embodied in one or more computer-readable media having computer-readable program code embodied thereon.

The features, structures, effects, and the like described in the embodiments above are included in one embodiment of the present invention and are not necessarily limited to only one embodiment. Furthermore, the features, structures, effects, and the like illustrated in each embodiment can be combined or modified for other embodiments by a person having ordinary skill in the field to which the embodiments belong. Therefore, matters related to such combinations and modifications should be construed as falling within the scope of the present invention.

In addition, although the description above has focused on the embodiments, they are merely examples and do not limit the present invention. A person having ordinary skill in the field to which the present invention belongs will appreciate that various modifications and applications not illustrated above are possible without departing from the essential characteristics of the embodiments. For example, each component specifically shown in the embodiments can be modified and implemented. Differences related to such modifications and applications should be construed as falling within the scope of the present invention as defined in the appended claims.

Claims

An emotion analysis result providing device comprising:
a first analysis result acquisition unit configured to obtain a plurality of first analysis results by analyzing a client's emotion in real time based on at least one of first voice data and first image data extracted from video data received in real time from a client terminal; and
a second analysis result acquisition unit configured to obtain a plurality of second analysis results by analyzing the client's emotion after a predetermined time has elapsed, based on at least one of second voice data and second image data extracted from accumulated video data in which the video data received in real time has been accumulated for the predetermined time.

The emotion analysis result providing device of claim 1,
wherein the plurality of first analysis results include at least two of: a probability value for each of a plurality of general emotional states, a representative emotional state, a heart rate and a stress index, a frequency for each keyword, a frequency for each of keywords organized by topic, a voice parameter, gaze position information, and facial movement information.

The emotion analysis result providing device of claim 1,
wherein the plurality of second analysis results include at least two of: a probability value for each of a plurality of general emotional states, a probability value for each of a plurality of substitutional emotional states, a heart rate and a stress index, a frequency for each keyword, a frequency for each of keywords organized by topic, a voice parameter, gaze position information, and facial movement information.

The emotion analysis result providing device of claim 3,
wherein the plurality of substitutional emotional states are a combination of emotional states defined as a superordinate concept of each of the plurality of general emotional states.

The emotion analysis result providing device of claim 3,
wherein the second analysis result acquisition unit obtains the probability value for each of the plurality of general emotional states using a plurality of models based on ensemble learning, based on the second image data, the second voice data, and text data converted from the second voice data.

The emotion analysis result providing device of claim 3,
wherein the second analysis result acquisition unit performs Latent Dirichlet Allocation (LDA) topic modeling on text data converted from the second voice data to obtain the frequency for each of the keywords organized by topic.

The emotion analysis result providing device of claim 3,
wherein the second analysis result acquisition unit recognizes the client's face based on the second image data, detects feature points of the recognized face, detects a pupil center position within a region of interest designated based on the facial feature points, and calculates the gaze position information, which is a degree of change over time, based on a change in the Euclidean distance of the detected pupil center position.

An emotion analysis result providing system comprising a client terminal, a counselor terminal, and an emotion analysis result providing device,
wherein the emotion analysis result providing device comprises:
a first analysis result acquisition unit configured to obtain a plurality of first analysis results by analyzing a client's emotion in real time based on at least one of first voice data and first image data extracted from video data received in real time from the client terminal;
a second analysis result acquisition unit configured to obtain a plurality of second analysis results by analyzing the client's emotion after a predetermined time has elapsed, based on at least one of second voice data and second image data extracted from accumulated video data in which the video data received in real time has been accumulated for the predetermined time; and
a transmitter configured to transmit, to the counselor terminal, information including all of the plurality of first analysis results and information including all of the plurality of second analysis results, and
wherein the counselor terminal comprises:
a first analysis result providing unit configured to provide, through a single user interface screen, information including all of the plurality of first analysis results received from the emotion analysis result providing device; and
a second analysis result providing unit configured to provide, through the single user interface screen, information including a second analysis result selected by a user from among all of the plurality of second analysis results received from the emotion analysis result providing device.

The emotion analysis result providing system of claim 8,
wherein the single user interface screen includes a plurality of menus representing the plurality of second analysis results, and
wherein the second analysis result providing unit provides, through the single user interface screen, information including the second analysis result corresponding to a menu selected by the user from among the plurality of menus.

The emotion analysis result providing system of claim 8,
wherein the plurality of first analysis results include at least two of: a probability value for each of a plurality of general emotional states, a representative emotional state, a heart rate and a stress index, a frequency for each keyword, a frequency for each of keywords organized by topic, a voice parameter, gaze position information, and facial movement information.

The emotion analysis result providing system of claim 8,
wherein the plurality of second analysis results include at least two of: a probability value for each of a plurality of general emotional states, a probability value for each of a plurality of substitutional emotional states, a heart rate and a stress index, a frequency for each keyword, a frequency for each of keywords organized by topic, a voice parameter, gaze position information, and facial movement information.