KR102299455B1 - Method and apparatus for neural network based sentiment analysis and sentiment therapy apparatus based on the same

Info

Publication number: KR102299455B1
Application number: KR1020190108607A
Authority: KR
Inventors: 지승도; 김훈; 서원진; 이선정; 조정래; 강정석; 김현근
Original assignee: 한국항공대학교산학협력단
Priority date: 2019-09-03
Filing date: 2019-09-03
Publication date: 2021-09-06
Also published as: KR20210027769A

Abstract

The present application relates to a neural network-based emotion analysis method. The neural network-based emotion analysis method may include the steps of: (a) calculating a first emotion analysis result by applying neural network-based voice analysis to voice data input from a user; (b) calculating a second emotion analysis result by applying emotion dictionary-based context analysis to the voice data; and (c) outputting an integrated emotion analysis result for the voice data by using the first emotion analysis result and the second emotion analysis result as inputs to a neural network.

Description

Neural network-based emotion analysis method and apparatus, and apparatus for emotion treatment based thereon

The present application relates to a neural network-based emotion analysis and emotion treatment system. In particular, the present application relates to a neural network-based emotion analysis apparatus, method, and system, and to an emotion treatment apparatus, method, and system based on neural network-based emotion analysis.

Emotional engineering refers to technically realizing non-verbal elements such as the human mind and mood by incorporating them into engineering so that they are useful in everyday life. Unlike past corporate strategies, which focused only on function and quality, developing products that appeal to or make use of people's emotions has recently become a central strategy.

Affective computing is a field that researches and develops artificial intelligence for designing systems and devices that can recognize, interpret, and process human emotions; it refers to artificial intelligence-based systems that bring together diverse fields such as computer science, psychology, and cognitive science. It is a technology that recognizes people's psychological responses to physical or sensory stimuli and can be applied to human-computer interaction.

With recent advances in technology, machines and humans have come into a close relationship, and because humans are characterized by both reason and emotion, the emotional aspect must be considered when humans and robots interact. Accordingly, many domestic and foreign companies are actively conducting research on affective computing, but the field is still at an early stage of commercialization and remains well worth researching.

Conventional techniques for analyzing a user's emotion include emotion analysis techniques that use voice analysis, natural language processing, and the like.

However, conventionally known voice analysis-based emotion analysis techniques have difficulty recognizing the case where a user speaks an emotional sentence with a neutral intonation, and they yield different emotion analysis results for the same sentence because of individual differences, so the accuracy of the emotion analysis result is low.

The background technology of the present application is disclosed in [Yelin Kim and Emily Mower Provost. 2015. Emotion Recognition During Speech Using Dynamics of Multiple Regions of the Face. ACM Trans. Multimedia Comput. Commun. Appl. 12, 1s, Article 25 (October 2015), 23 pages.].

The present application is intended to solve the problems of the prior art described above, and its purpose is to provide a neural network-based emotion analysis apparatus and method capable of deriving more accurate emotion analysis results, and an emotion treatment apparatus and method based on neural network-based emotion analysis.

However, the technical problems to be achieved by the embodiments of the present application are not limited to those described above, and other technical problems may exist.

As a technical means for achieving the above technical problem, a neural network-based emotion analysis method according to a first aspect of the present application may include the steps of: (a) calculating a first emotion analysis result by applying neural network-based voice analysis to voice data input from a user; (b) calculating a second emotion analysis result by applying emotion dictionary-based context analysis to the voice data; and (c) outputting an integrated emotion analysis result for the voice data by using the first emotion analysis result and the second emotion analysis result as inputs to a neural network.

As a technical means for achieving the above technical problem, an emotion treatment method based on neural network-based emotion analysis according to a second aspect of the present application may determine and provide an emotion therapy based on the integrated emotion analysis result provided by performing the neural network-based emotion analysis method according to the first aspect of the present application.

As a technical means for achieving the above technical problem, a neural network-based emotion analysis apparatus according to a third aspect of the present application may include: a voice analysis module that calculates a first emotion analysis result by applying neural network-based voice analysis to voice data input from a user; a context analysis module that calculates a second emotion analysis result by applying emotion dictionary-based context analysis to the voice data; and an integration module that outputs an integrated emotion analysis result for the voice data by using the first emotion analysis result and the second emotion analysis result as inputs to a neural network.

As a technical means for achieving the above technical problem, an emotion treatment apparatus based on neural network-based emotion analysis according to a fourth aspect of the present application may determine and provide an emotion therapy based on the integrated emotion analysis result provided by performing neural network-based emotion analysis with the neural network-based emotion analysis apparatus according to the third aspect of the present application.

As a technical means for achieving the above technical problem, a computer program according to a fifth aspect of the present application may be stored in a recording medium in order to execute the neural network-based emotion analysis method according to the first aspect of the present application and the emotion treatment method based on neural network-based emotion analysis according to the second aspect of the present application.

The above-described means for solving the problem are merely exemplary and should not be construed as limiting the present application. In addition to the exemplary embodiments described above, additional embodiments may exist in the drawings and in the detailed description of the invention.

According to the above-described means for solving the problem, the integrated emotion analysis result is output by considering neural network-based voice analysis and emotion dictionary-based context analysis together, so the user's emotional state for the input voice data can be derived more accurately.

According to the above-described means for solving the problem, providing an emotion therapy determined on the basis of the integrated emotion analysis result makes it possible to accurately identify the user's emotional state and, on that basis, to guide the user's current emotion toward a more positive one, thereby providing the user with a better life.

According to the above-described means for solving the problem, providing the emotion therapy can guide the user's emotion toward a positive emotion if the user's emotion is negative or neutral, and can guide a positive emotion to be maintained or further expanded if the user's emotion is already positive.

However, the effects obtainable from the present application are not limited to those described above, and other effects may exist.

FIG. 1 is a diagram showing a schematic configuration of a neural network-based emotion analysis system according to an embodiment of the present application.
FIG. 2 is a diagram showing a schematic configuration of a neural network-based emotion analysis apparatus according to an embodiment of the present application.
FIG. 3 shows an example of an integrated emotion analysis result derived by the neural network-based emotion analysis apparatus according to an embodiment of the present application.
FIG. 4 shows another example of an integrated emotion analysis result derived by the neural network-based emotion analysis apparatus according to an embodiment of the present application.
FIG. 5 is a diagram showing a schematic configuration of an emotion treatment system based on neural network-based emotion analysis according to an embodiment of the present application.
FIG. 6 is a diagram showing a schematic configuration of an emotion treatment apparatus based on neural network-based emotion analysis according to an embodiment of the present application.
FIG. 7 is a diagram for explaining an example in which a user's emotion treatment is performed by the emotion treatment apparatus based on neural network-based emotion analysis according to an embodiment of the present application.
FIG. 8 is a diagram schematically showing a menu configuration in a mobile application that interworks with the emotion treatment apparatus based on neural network-based emotion analysis according to an embodiment of the present application.
FIG. 9 is a diagram for explaining an example of calculating a second emotion analysis result by the context analysis module in the emotion treatment apparatus based on neural network-based emotion analysis according to an embodiment of the present application.
FIGS. 10A to 10C are diagrams for explaining an example of calculating a first emotion analysis result by the voice analysis module in the emotion treatment apparatus based on neural network-based emotion analysis according to an embodiment of the present application.
FIG. 11 is a diagram for explaining an example of calculating an integrated emotion analysis result by the integration module in the emotion treatment apparatus based on neural network-based emotion analysis according to an embodiment of the present application.
FIG. 12 is a diagram showing an example screen display of an emotion analysis result obtained by applying conventional natural language processing to given voice data.
FIG. 13 is a diagram showing an example screen display of an emotion analysis result obtained by applying, to given voice data, context analysis by the context analysis module of the emotion treatment apparatus based on neural network-based emotion analysis according to an embodiment of the present application.
FIG. 14 is an operation flowchart of a neural network-based emotion analysis method according to an embodiment of the present application.
FIG. 15 is an operation flowchart of an emotion treatment method based on neural network-based emotion analysis according to an embodiment of the present application.

Hereinafter, embodiments of the present application will be described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present application pertains can easily practice them. However, the present application may be embodied in many different forms and is not limited to the embodiments described herein. In the drawings, parts irrelevant to the description are omitted in order to describe the present application clearly, and similar reference numerals denote similar parts throughout the specification.

Throughout this specification, when a part is said to be "connected" to another part, this includes not only the case where it is "directly connected" but also the case where it is "electrically connected" or "indirectly connected" with another element interposed therebetween.

Throughout this specification, when a member is said to be located "on", "above", "at the top of", "under", "below", or "at the bottom of" another member, this includes not only the case where the member is in contact with the other member but also the case where another member exists between the two members.

Throughout this specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

FIG. 1 is a diagram showing a schematic configuration of a neural network-based emotion analysis system 200 according to an embodiment of the present application. FIG. 2 is a diagram showing a schematic configuration of a neural network-based emotion analysis apparatus 100 according to an embodiment of the present application.

Hereinafter, for convenience of description, the neural network-based emotion analysis system 200 according to an embodiment of the present application is referred to as the present system 200, and the neural network-based emotion analysis apparatus 100 according to an embodiment of the present application is referred to as the present apparatus 100.

Referring to FIGS. 1 and 2, the present system 200 may include a user terminal 2 and the present apparatus 100.

The user terminal 2 may receive, from a user 1, voice data d to be subjected to emotion analysis. The user terminal 2 may transmit the received voice data d to the present apparatus 100.

The user terminal 2 may be, for example, a mobile and portable communication device such as a smartphone, a smart pad, a tablet PC, a notebook computer, or a wearable device. However, the user terminal 2 is not limited thereto and may include any type of wired or wireless communication device, such as a Personal Communication System (PCS), Global System for Mobile communication (GSM), Personal Digital Cellular (PDC), Personal Handyphone System (PHS), Personal Digital Assistant (PDA), International Mobile Telecommunication (IMT)-2000, Code Division Multiple Access (CDMA)-2000, Wideband Code Division Multiple Access (W-CDMA), or Wireless Broadband Internet (WiBro) terminal, a smartphone, a smart pad, a tablet PC, a notebook computer, a wearable device, or a desktop PC.

The present apparatus 100 may analyze the user's emotion by analyzing (through voice analysis and context analysis) the voice data d input from the user 1. The present apparatus 100 may apply voice analysis and context analysis to the voice data d input from the user 1, and may output, as the user's emotion analysis result for the voice data d, an integrated emotion analysis result d3 that integrates a first emotion analysis result d1 calculated through the voice analysis and a second emotion analysis result d2 calculated through the context analysis. Here, the voice data d received by the present apparatus 100 may have been input through the user terminal 2 (in other words, received from the user terminal 2).
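The data flow described above (voice data d to d1 and d2, and then to d3) can be sketched as follows. This is a minimal illustration rather than the apparatus's implementation: the function names, the dictionary-of-scores representation, and the averaging stand-in for the integration step are assumptions made here. In the present application the integration is performed by a neural network.

```python
from typing import Callable, Dict

Scores = Dict[str, float]  # assumed representation: emotion label -> score


def analyze_emotion(
    voice_data: bytes,
    voice_analysis: Callable[[bytes], Scores],
    context_analysis: Callable[[bytes], Scores],
    integrate: Callable[[Scores, Scores], Scores],
) -> Scores:
    """Run both analyses on the same voice data d and integrate them into d3."""
    d1 = voice_analysis(voice_data)    # first emotion analysis result
    d2 = context_analysis(voice_data)  # second emotion analysis result
    return integrate(d1, d2)           # integrated emotion analysis result


def average_scores(d1: Scores, d2: Scores) -> Scores:
    """Placeholder integration (hypothetical): plain average of both results."""
    return {label: (d1[label] + d2[label]) / 2.0 for label in d1}


if __name__ == "__main__":
    # Dummy analyzers that ignore the audio and return fixed scores.
    fake_voice = lambda d: {"Happy": 0.1, "Neutral": 0.6, "Sad": 0.2, "Angry": 0.1}
    fake_context = lambda d: {"Happy": 0.0, "Neutral": 0.2, "Sad": 0.7, "Angry": 0.1}
    print(analyze_emotion(b"", fake_voice, fake_context, average_scores))
```

Passing the three stages in as callables keeps the sketch independent of how each module is actually realized.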

The present apparatus 100 may be, for example, a server. However, it is not limited thereto; as another example, the present apparatus 100 may be a device provided in the user terminal 2 and, in particular, may refer to the mobile application 2a itself installed on the user terminal 2.

An example of the case where the present apparatus 100 is provided in the form of a server is as follows. A mobile application 2a that interworks with the present apparatus 100, which performs emotion analysis on the voice data d input from the user 1, may be installed on the user terminal 2. The mobile application 2a in the user terminal 2 may transmit the voice data d received from the user 1 to the present apparatus 100. The present apparatus 100 may generate the integrated emotion analysis result d3, as the user's emotion analysis result for the voice data d, by applying voice analysis and context analysis to the voice data d received from the mobile application 2a. The present apparatus 100 may then provide (output) the generated integrated emotion analysis result d3 to the mobile application 2a in the user terminal 2, so that the integrated emotion analysis result d3 can be displayed on the screen of the user terminal 2. The user can check the integrated emotion analysis result d3 output from the present apparatus 100 on the screen of the user terminal 2.

As such, when the present apparatus 100 is a server (provided in the form of a server), data may be transmitted and received between the present apparatus 100 and the user terminal 2 through network communication (that is, the voice data is transmitted from the user terminal to the present apparatus, and the integrated emotion analysis result is transmitted from the present apparatus to the user terminal).
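The exchange between the user terminal and the server can be pictured as a pair of messages. The sketch below is only an illustration under stated assumptions: the JSON layout, the field names `voice_data` and `integrated_result`, the base64 encoding, and all function names are hypothetical, since the present application does not specify a message format or transport.

```python
import base64
import json
from typing import Callable, Dict


def build_request(voice_data: bytes) -> str:
    """Terminal side: wrap raw audio bytes in a JSON message (assumed format)."""
    encoded = base64.b64encode(voice_data).decode("ascii")
    return json.dumps({"voice_data": encoded})


def handle_request(message: str, analyze: Callable[[bytes], Dict[str, float]]) -> str:
    """Server side: decode the audio, run the analysis, and return d3 as JSON."""
    voice_data = base64.b64decode(json.loads(message)["voice_data"])
    return json.dumps({"integrated_result": analyze(voice_data)})


def parse_response(message: str) -> Dict[str, float]:
    """Terminal side: extract the integrated emotion analysis result d3."""
    return json.loads(message)["integrated_result"]


if __name__ == "__main__":
    # A stand-in analyzer; a real one would run voice and context analysis.
    dummy = lambda audio: {"Happy": 0.7, "Neutral": 0.1, "Sad": 0.1, "Angry": 0.1}
    reply = handle_request(build_request(b"\x00\x01"), dummy)
    print(parse_response(reply))
```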

The network communication may include, for example, a 3rd Generation Partnership Project (3GPP) network, a Long Term Evolution (LTE) network, a Worldwide Interoperability for Microwave Access (WiMAX) network, the Internet, a Local Area Network (LAN), a Wireless Local Area Network (Wireless LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), a Bluetooth network, a Near Field Communication (NFC) network, a satellite broadcasting network, an analog broadcasting network, a Digital Multimedia Broadcasting (DMB) network, and the like, but is not limited thereto.

An example of the case where the present apparatus 100 (in particular, the emotion analysis method performed by the present apparatus) is provided in the form of the mobile application 2a in the user terminal 2 is as follows. The present apparatus 100 may receive the voice data d from the user 1 and may generate the integrated emotion analysis result d3, as the user's emotion analysis result for the voice data d, by applying voice analysis and context analysis to the input voice data d. The present apparatus 100 may then provide (output) the generated integrated emotion analysis result d3 on the screen of the user terminal 2, so that the integrated emotion analysis result d3 is displayed on the screen of the user terminal 2. The user can check the integrated emotion analysis result d3 output from the present apparatus 100 on the screen of the user terminal 2.

Hereinafter, for convenience of description, the present apparatus 100 is assumed to be a server (provided in the form of a server), and a detailed description of the present apparatus 100 follows.

The present apparatus 100 may include a voice analysis module 10, a context analysis module 20, and an integration module 30.

The voice analysis module 10 may calculate the first emotion analysis result d1 by applying voice analysis based on a neural network 12 to the voice data d input from the user 1. The voice analysis module 10 may transmit the calculated first emotion analysis result d1 to the integration module 30.

The voice analysis module 10 may calculate the first emotion analysis result d1 through voice analysis based on the neural network 12, which takes as input voice waveform features extracted from the voice data d.

Here, the voice waveform features may be MFCC feature values output by performing a Mel-Frequency Cepstral Coefficient (MFCC) transform on the voice data d. In addition, the first emotion analysis result calculated by the voice analysis module 10 may be a value assigned as a probability to each of a plurality of emotions including Happy, Neutral, Sad, and Angry.
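One common way to obtain a probability for each emotion is to pass the network's raw output scores through a softmax. The sketch below assumes this approach for illustration; the present application states only that the result is a probabilistically assigned value per emotion, so the softmax itself and the function names are assumptions.

```python
import math
from typing import Dict, List, Sequence

EMOTIONS = ("Happy", "Neutral", "Sad", "Angry")


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def to_first_result(logits: Sequence[float]) -> Dict[str, float]:
    """Pair each emotion label with its probability (a possible form of d1)."""
    return dict(zip(EMOTIONS, softmax(logits)))


if __name__ == "__main__":
    d1 = to_first_result([2.0, 1.0, 0.0, -1.0])
    print(d1, max(d1, key=d1.get))
```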

Specifically, the voice analysis module 10 may receive, from the user terminal 2, the voice data d input through the mobile application 2a of the user terminal 2.

For example, the user 1 may record a sentence expressing his or her current emotion through the mobile application 2a of the user terminal 2. The voice file corresponding to the sentence recorded through the mobile application 2a may be provided as input to the voice analysis module 10 as the voice data d to be analyzed. Although a recorded file has been described as an example of the voice data d handled by the present apparatus 100, the voice data d is not limited thereto and, as another example, may be voice data input from the user in real time.

The voice analysis module 10 may receive, from the mobile application 2a, a voice file with the .wav extension as the voice data d to be analyzed.
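As an illustration of this input step, a .wav file can be read into a list of samples with Python's standard `wave` module. The sketch assumes 16-bit PCM audio and keeps only the first channel; the function name and these format restrictions are assumptions, not requirements stated in the present application.

```python
import struct
import wave
from typing import List, Tuple


def read_wav_mono16(path: str) -> Tuple[int, List[float]]:
    """Return (sample_rate, samples scaled to [-1, 1)) from a 16-bit PCM .wav file."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM audio")
        n_channels = wf.getnchannels()
        rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    # Samples are little-endian signed 16-bit integers, interleaved by channel.
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)
    first_channel = values[::n_channels]
    return rate, [v / 32768.0 for v in first_channel]
```

The returned sample list is the kind of input a feature-extraction step such as an MFCC transform would consume.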

The voice analysis module 10 may apply, for example, Mel-Frequency Cepstral Coefficients (MFCC), a technique used in speech recognition, to the input voice data d. By applying MFCC, the voice analysis module 10 may perform an MFCC transform 11 on the input voice data d and thereby output MFCC feature values as the voice waveform features of the voice data d. That is, the voice analysis module 10 may perform the MFCC transform 11 on the input voice data d to model the sound corresponding to the voice data d, and may output the resulting MFCC feature values (MFCC values) as the voice waveform features of the voice data d.
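For readers unfamiliar with the MFCC transform, the following is a simplified, standard-library-only sketch of the usual computation: framing, Hamming windowing, power spectrum, triangular mel filterbank, logarithm, and DCT-II. The frame sizes, filter counts, and function names are assumptions chosen for illustration, and the naive DFT is slow; the present application does not disclose these parameters.

```python
import cmath
import math
from typing import List, Sequence


def hz_to_mel(f: float) -> float:
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m: float) -> float:
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame: Sequence[float]) -> List[float]:
    """Hamming-window the frame and return |DFT|^2 / N for bins 0..N/2."""
    n = len(frame)
    win = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
           for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(win[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spec.append(abs(acc) ** 2 / n)
    return spec


def mel_filterbank(n_filters: int, n_fft: int, rate: int) -> List[List[float]]:
    """Triangular filters spaced evenly on the mel scale from 0 Hz to rate/2."""
    top = hz_to_mel(rate / 2.0)
    mels = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / rate)) for m in mels]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):   # rising edge
            filt[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):   # falling edge
            filt[k] = (hi - k) / (hi - mid)
        bank.append(filt)
    return bank


def dct2(values: Sequence[float], n_out: int) -> List[float]:
    """Unnormalized DCT-II, keeping the first n_out coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values)) for k in range(n_out)]


def mfcc(samples: Sequence[float], rate: int, frame_len: int = 256,
         hop: int = 128, n_filters: int = 20, n_coeffs: int = 13) -> List[List[float]]:
    """Return one list of n_coeffs MFCC values per frame."""
    bank = mel_filterbank(n_filters, frame_len, rate)
    feats = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        spec = power_spectrum(samples[start:start + frame_len])
        energies = [sum(w * p for w, p in zip(f, spec)) for f in bank]
        log_e = [math.log(max(e, 1e-10)) for e in energies]  # floor avoids log(0)
        feats.append(dct2(log_e, n_coeffs))
    return feats
```

In practice an established audio library would normally be used instead of a hand-written transform; the sketch is meant only to make the individual stages visible.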

The voice analysis module 10 may provide the voice waveform features, that is, the MFCC feature values output through the MFCC transform 11, as input to the neural network 12.

The neural network 12 may be implemented using, for example, Keras. Keras is an open-source neural network library written in Python. Keras can run on top of MXNet, Deeplearning4j, TensorFlow, Microsoft Cognitive Toolkit, or Theano. Keras was designed to enable fast experimentation with deep neural networks and focuses on being minimal, modular, and extensible. Keras was developed as part of the research effort of the ONEIROS (Open-ended Neuro-Electronic Intelligent Robot Operating System) project.

Because Keras is well known to those of ordinary skill in the art to which the present application pertains, the following description focuses on an example in which the Keras-based neural network 12 is applied to the neural network-based emotion analysis apparatus 100 according to an embodiment of the present application, rather than on Keras itself.

In the apparatus 100, a convolutional neural network (CNN) may be considered as the neural network 12, for example. However, the neural network 12 is not limited thereto, and various neural networks that are already known or will be developed in the future, such as a recurrent neural network (RNN) or a deep neural network (DNN), may be applied. The neural network 12 may also be referred to as a deep learning model or the like.

The voice analysis module 10 may provide the voice waveform features, i.e., the MFCC feature values output by the MFCC transformation 11, as an input to the neural network 12 and, by performing voice analysis based on the neural network 12, may calculate the first emotion analysis result d1 as the output of the neural network 12.

In this case, the voice analysis module 10 may provide the neural network 12 with not only the voice waveform features but also model data obtained from a local storage 13. The voice analysis module 10 may thereby calculate the first emotion analysis result d1 through voice analysis based on the neural network 12.

Here, for a neural network trained to take the MFCC feature values of a plurality of voice data as input and to output the first emotion analysis result corresponding to the input MFCC feature values, the model data may mean the learned values determined by repeatedly training the neural network (specifically, the weight values of the neural network determined through training). That is, the model data may mean the weight values of the trained neural network. The local storage 13 may mean a database in which a plurality of such model data are stored.

By passing the voice waveform features of the voice data d (i.e., the MFCC feature values) and the model data obtained from the local storage 13 through the neural network 12, the voice analysis module 10 may calculate (output) the first emotion analysis result d1 from the neural network 12.

The first emotion analysis result d1 may be a set of values probabilistically assigned to each of a plurality of emotions including happiness (Happy), neutrality (Neutral), sadness (Sad), and anger (Angry). These values represent the probabilities of the emotions carried by the voice data d. In other words, the first emotion analysis result d1 represents the probability (probability value, probability score) of each of the plurality of emotions contained in the voice data d.

Because the first emotion analysis result d1 consists of values probabilistically assigned to each of the plurality of emotions, the sum of these values may be 1.
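
One common way to obtain per-emotion values that sum to 1 is a softmax over the raw outputs of the network's last layer. The present application does not state which output function is used, so the softmax here, as well as the raw scores, are assumptions made purely for illustration.

```python
import math

EMOTIONS = ["Happy", "Neutral", "Sad", "Angry"]

def softmax(logits):
    """Map arbitrary real scores to positive values that sum to 1."""
    peak = max(logits)                        # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw network outputs for one utterance (not taken from the source).
raw_scores = [0.2, -0.3, 2.8, 1.6]
d1 = dict(zip(EMOTIONS, softmax(raw_scores)))

for emotion, probability in d1.items():
    print(f"{emotion}: {probability:.3f}")
print("sum:", round(sum(d1.values()), 6))
```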

For example, the voice analysis module 10 may calculate a first emotion analysis result d1 such as 'Happy: 0.05, Neutral: 0.03, Sad: 0.7, Angry: 0.22'.

In this case, according to the first emotion analysis result d1, the probability value (probability score) of Sad, 0.7, is the largest among the four emotions. The voice analysis module 10 may therefore determine, from the first emotion analysis result d1 calculated by the neural network-based voice analysis, that the emotion 'sadness' accounts for the largest share of the input voice data d. In other words, the voice analysis module 10 may determine from the first emotion analysis result d1 that the user's emotion for the voice data d, as estimated by the neural network-based voice analysis, is 'sadness'.
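
The selection of the dominant emotion from the example values above can be sketched as follows. The dictionary layout and the variable names are illustrative assumptions; only the four probability values come from the example in the text.

```python
import math

# First emotion analysis result d1, using the example values given in the text.
d1 = {"Happy": 0.05, "Neutral": 0.03, "Sad": 0.7, "Angry": 0.22}

# The values are probabilities, so they should sum to 1 (up to rounding).
probabilities_sum_to_one = math.isclose(sum(d1.values()), 1.0)

# The emotion with the largest probability is taken as the voice-analysis result.
dominant_emotion = max(d1, key=d1.get)

print(probabilities_sum_to_one, dominant_emotion, d1[dominant_emotion])
```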

Although the apparatus 100 is described with four emotions (happiness, neutrality, sadness, and anger) as an example, it is not limited thereto, and the number and types of emotions used in the apparatus 100 may be varied.

The first emotion analysis result d1 calculated by the voice analysis module 10 may be input to the integration module 30, where it may be used to derive (generate) the integrated emotion analysis result d3.

The context analysis module 20 may calculate a second emotion analysis result d2 by applying context analysis based on an emotion dictionary (emotional word dictionary, emotion dictionary DB, emotion word DB) 23 to the voice data d input from the user 1.

The context analysis module 20 may convert the voice data d into text, extract at least one word base form (lemma) from the converted text (Text Data), and match the extracted lemma or lemmas against the lemmas included in the emotion dictionary DB 23 to calculate the second emotion analysis result d2.

Here, the emotion dictionary DB 23 may be constructed such that an emotion score relating to the strength of positivity or negativity is assigned to each of a plurality of lemmas. The context analysis module 20 may calculate the second emotion analysis result d2 by taking into account the emotion score of each matching lemma, i.e., each extracted lemma that matches a lemma included in the emotion dictionary DB 23.

As a specific example, the context analysis module 20 may receive, from the mobile application 2a, a voice file with a .wav extension as the voice data d to be analyzed.

The context analysis module 20 may convert the input voice data d (i.e., the .wav voice file) into text (Text Data) through STT (Speech To Text) 21. The context analysis module 20 may then extract word base forms (lemmas) from the converted text using, for example, Google Natural Language 22, Google's natural language processing technology. In other words, the context analysis module 20 may extract at least one lemma from the converted text by applying Google Natural Language 22 to it.

Words in a text appear in inflected forms that derive from a base form, the root from which the variants are formed; this base form is called a lemma. For example, 'write', 'writing', 'wrote', and 'written' all share the base form 'write'. Therefore, if the converted text contains at least one of 'write', 'writing', 'wrote', and 'written', the context analysis module 20 may extract 'write' as a lemma from the converted text.
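
The idea of reducing inflected forms to a lemma can be sketched with a toy lookup table. This is only an illustration: the table, the tokenizing rule, and the function name are assumptions, and the sketch does not reproduce how Google Natural Language 22 performs lemmatization.

```python
import re

# Toy lemma table covering only the example from the text (illustrative assumption).
LEMMA_TABLE = {
    "write": "write",
    "writing": "write",
    "wrote": "write",
    "written": "write",
}

def extract_lemmas(text):
    """Return the distinct lemmas of the text, in order of first appearance."""
    tokens = re.findall(r"[a-z']+", text.lower())
    lemmas = []
    for token in tokens:
        lemma = LEMMA_TABLE.get(token, token)  # unknown words are kept as they are
        if lemma not in lemmas:
            lemmas.append(lemma)
    return lemmas

print(extract_lemmas("She wrote it and is writing again"))
```

Because 'wrote' and 'writing' both map to 'write', the lemma appears only once in the result.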

In addition, by applying Google Natural Language 22, the context analysis module 20 may extract not only lemmas but also dependencies from the converted text (Text Data).

Here, a dependency denotes a dependency relation between word tokens (units of text within a sentence, such as words). The dependencies of the converted text may be extracted (identified) through, for example, the dependency tree provided by Google Natural Language 22. The dependency tree can be more readily understood with reference to, for example, Ryan McDonald, Joakim Nivre, and 11 others, "Universal Dependency Annotation for Multilingual Parsing", Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, pages 92-97, Sofia, Bulgaria, August 4-9, 2013, and a detailed description is therefore omitted herein.

The dependencies extracted by the context analysis module 20 may be used to strengthen the influence of emotion words and to identify their cause.

After extracting at least one lemma from the converted text (Text Data), the context analysis module 20 may match the extracted lemma or lemmas against the lemmas included in the emotion dictionary DB 23. Here, the emotion dictionary DB 23 may be constructed and provided such that an emotion score relating to the strength of positivity or negativity is assigned to each of a plurality of lemmas.

By matching the extracted lemma or lemmas against the lemmas included in the emotion dictionary DB 23, the context analysis module 20 may obtain the emotion score of each matching lemma, i.e., each extracted lemma that matches a lemma included in the emotion dictionary DB 23.

For example, the emotion dictionary DB 23 may store in advance data such as 'Fun: +7.0', 'gift: +6.0', 'travel: +3.0', 'slipper: 0.0', 'funeral: -6.0', and 'fail: -5.0'.

Here, 'Fun: +7.0' may mean that an emotion score of +7.0, relating to the strength of positivity, is assigned to the lemma 'Fun'. Likewise, 'funeral: -6.0' means that an emotion score of -6.0, relating to the strength of negativity, is assigned to the lemma 'funeral', and 'slipper: 0.0' means that an emotion score of 0.0, indicating neither positivity nor negativity, is assigned to the lemma 'slipper'. These per-lemma emotion scores are merely examples given to aid understanding of the present application, which is not limited thereto.

Here, an emotion score of 0.0 may mean that the word expresses neither positivity nor negativity but a neutral emotion. A positive (+) emotion score means that the word expresses positivity, and the larger the number, the greater the strength (intensity, degree) of positivity. Conversely, a negative (-) emotion score means that the word expresses negativity, and the larger the magnitude (absolute value) of the number, the greater the strength (intensity, degree) of negativity.

The context analysis module 20 may include, for example, an emotion score module 24. The emotion score module 24 may obtain the emotion score of each matching lemma, i.e., each extracted lemma that matches a lemma included in the emotion dictionary DB 23. The emotion score module 24 may then calculate the second emotion analysis result d2 by taking the emotion score of each matching lemma into account. In particular, the emotion score module 24 may, for example, sum the emotion scores of the matching lemmas and calculate the second emotion analysis result d2 based on the summed emotion score.

For example, suppose that 'Fun' and 'travel' are extracted as lemmas from the converted text. In this case, by matching the extracted lemmas against the lemmas included in the emotion dictionary DB 23, the emotion score module 24 may obtain '+7.0' as the emotion score of the matching lemma for 'Fun' and '+3.0' as the emotion score of the matching lemma for 'travel'. The emotion score module 24 may then sum the emotion scores of the matching lemmas and calculate the summed emotion score, +10.0 (+7.0 + 3.0 = +10.0), as the second emotion analysis result d2.
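
The matching and summation performed by the emotion score module 24 can be sketched as a dictionary lookup followed by a sum. The scores below are the example values given in the text; the function name and the case-insensitive matching are assumptions made for illustration.

```python
# Emotion dictionary DB 23, populated with the example entries from the text.
EMOTION_DICTIONARY = {
    "fun": 7.0,
    "gift": 6.0,
    "travel": 3.0,
    "slipper": 0.0,
    "funeral": -6.0,
    "fail": -5.0,
}

def second_emotion_result(lemmas):
    """Sum the emotion scores of the lemmas that match a dictionary entry."""
    matched = [lemma.lower() for lemma in lemmas if lemma.lower() in EMOTION_DICTIONARY]
    return sum(EMOTION_DICTIONARY[lemma] for lemma in matched)

d2 = second_emotion_result(["Fun", "travel"])
print(d2)  # +7.0 + 3.0
```

Lemmas that are absent from the dictionary contribute nothing, so an utterance with no emotion words yields a score of 0.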

In this case, since the summed emotion score according to the second emotion analysis result d2 is +10.0, the context analysis module 20 may determine, from the second emotion analysis result d2 calculated by the context analysis based on the emotion dictionary 23, that the input voice data d expresses a positive rather than a negative emotion. In other words, the context analysis module 20 may determine from the second emotion analysis result d2 that the user's emotion for the voice data d, as estimated by the emotion dictionary-based context analysis, is a 'positive emotion'.

The context analysis module 20 may determine that the user's emotion is a 'positive emotion' when the summed emotion score is positive (+), a 'negative emotion' when the summed emotion score is negative (-), and a 'neutral emotion' when the summed emotion score is 0.
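
The sign rule above amounts to a three-way comparison against zero. A minimal sketch, in which the function name and the returned labels are illustrative assumptions:

```python
def polarity(summed_score):
    """Classify a summed emotion score by its sign."""
    if summed_score > 0:
        return "positive emotion"
    if summed_score < 0:
        return "negative emotion"
    return "neutral emotion"

for score in (10.0, -3.0, 0.0):
    print(score, "->", polarity(score))
```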

Put differently, if any of the words in the sentence (i.e., the at least one lemma extracted from the converted text) belongs to the emotion dictionary DB 23 (i.e., is a matching lemma), the context analysis module 20 may obtain the emotion score indicating the positive or negative polarity of that word (i.e., the emotion score relating to the strength of positivity or negativity). The context analysis module 20 may then calculate, based on the obtained emotion scores, the emotion score for the entire sentence (i.e., the whole sentence corresponding to the voice data), namely the summed emotion score, as the second emotion analysis result d2. Here, the magnitude of the summed emotion score indicates the strength of the positive or negative emotion.

The second emotion analysis result d2 calculated by the context analysis module 20 may be input to the integration module 30, where it may be used to derive (generate) the integrated emotion analysis result d3.

The integration module 30 may output and provide an integrated emotion analysis result d3 for the voice data d by using the first emotion analysis result d1 and the second emotion analysis result d2 as inputs to a neural network 31. Here, as with the neural network 12 described above, a network built with Keras may be considered as the neural network 31, for example.

That is, the integration module 30 may obtain the first emotion analysis result d1 calculated by the voice analysis module 10 and the second emotion analysis result d2 calculated by the context analysis module 20, and provide them as inputs to the neural network 31. Through analysis based on the neural network 31 with the first emotion analysis result d1 and the second emotion analysis result d2 as its inputs, the integration module 30 may calculate, as the output of the neural network 31, the integrated emotion analysis result d3 in which the first emotion analysis result d1 and the second emotion analysis result d2 are combined.

In this case, the integration module 30 may provide the neural network 31 with not only the first and second emotion analysis results d1 and d2 but also model data obtained from a local storage 32. That is, when performing the analysis based on the neural network 31 to calculate the integrated emotion analysis result d3, the integration module 30 may check whether previously constructed model data is stored in the local storage 32 and, if so, load and use that model data. The integration module 30 may thus calculate the integrated emotion analysis result d3 through the analysis based on the neural network 31.
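
The "load previously stored model data if it exists" behavior can be sketched as follows. The JSON file format, the file name, and the weight layout are all assumptions made for illustration; the present application does not specify how model data is serialized in the local storage 32.

```python
import json
import os
import tempfile

def load_model_data(path, initial_weights):
    """Return stored model data if the file exists, otherwise the initial weights."""
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    return initial_weights

def save_model_data(path, weights):
    """Persist model data (learned weights) to the local storage."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(weights, f)

# A temporary directory stands in for the local storage 32.
with tempfile.TemporaryDirectory() as storage_dir:
    model_path = os.path.join(storage_dir, "integration_model.json")  # hypothetical name
    initial = {"weights": [[0.0] * 5 for _ in range(4)], "bias": [0.0] * 4}

    first_run = load_model_data(model_path, initial)    # nothing stored yet

    trained = {"weights": [[0.1] * 5 for _ in range(4)], "bias": [0.2] * 4}
    save_model_data(model_path, trained)

    second_run = load_model_data(model_path, initial)   # stored data is reused

print(first_run == initial, second_run == trained)
```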

Here, for a neural network trained to take a plurality of first emotion analysis results and a plurality of second emotion analysis results as input and to output the integrated emotion analysis result corresponding to the input first and second emotion analysis results d1 and d2, the model data may mean the learned values determined by repeatedly training the neural network (specifically, the weight values of the neural network determined through training). That is, the model data may mean the weight values of the trained neural network. The local storage 32 may mean a database in which a plurality of such model data are stored.

By passing the first and second emotion analysis results d1 and d2 and the model data obtained from the local storage 32 through the neural network 31, the integration module 30 may calculate (output) the integrated emotion analysis result d3 from the neural network 31.

Here, the integrated emotion analysis result may first be derived in the form of values probabilistically assigned to each of a plurality of emotions including happiness (Happy), neutrality (Neutral), sadness (Sad), and anger (Angry), and may then be the result of determining, as the final emotion, the emotion having the highest probability among the plurality of emotions.

In other words, the integration module 30 may input, to its neural network 31, the first emotion analysis result d1 received from the voice analysis module 10 (the values probabilistically assigned to each of the plurality of emotions including happiness, neutrality, sadness, and anger; for example, the probability values of the four emotions) and the second emotion analysis result d2 received from the context analysis module 20 (the sum of the emotion scores of the matching lemmas, i.e., the emotion score relating to the strength of positivity or negativity). The integration module 30 may thereby output and provide the user's final emotion corresponding to the voice data d as the integrated emotion analysis result d3, which is the output of the neural network 31.

Here, to output and provide the integrated emotion analysis result d3, the integration module 30 may first derive, as with the first emotion analysis result produced by the voice analysis module 10, a value (weight value) probabilistically assigned to each of the plurality of emotions, for example the four emotions Happy, Neutral, Sad, and Angry. The emotion corresponding to the maximum of the derived probability values for the four emotions may then be determined as the final emotion of the user 1 corresponding to the voice data d. That is, the integration module 30 may determine the emotion having the maximum probability value among the derived probability values (weight values) of the four emotions as the final emotion of the user 1 corresponding to the voice data d, and may output and provide the determined final emotion as the integrated emotion analysis result d3.
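
The two-stage behavior of the integration module 30 (per-emotion probabilities first, then the emotion with the maximum probability) can be sketched with a single dense layer followed by a softmax. Everything numeric here is an assumption: the weights are hand-picked rather than learned, the input probabilities are invented, and the scaling of d2 by 10 is an arbitrary normalization. The actual architecture of the neural network 31 is not specified in the present application.

```python
import math

EMOTIONS = ["Happy", "Neutral", "Sad", "Angry"]

# Hypothetical model data: one row per output emotion, columns are
# [p_happy, p_neutral, p_sad, p_angry, scaled d2]. Not learned values.
WEIGHTS = [
    [2.0, 0.0, 0.0, 0.0,  2.0],   # Happy is reinforced by a positive d2
    [0.0, 2.0, 0.0, 0.0,  0.0],   # Neutral ignores d2
    [0.0, 0.0, 2.0, 0.0, -2.0],   # Sad is reinforced by a negative d2
    [0.0, 0.0, 0.0, 2.0, -2.0],   # Angry is reinforced by a negative d2
]
BIAS = [0.0, 0.0, 0.0, 0.0]

def softmax(logits):
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def integrate(d1, d2, score_scale=10.0):
    """Return (per-emotion probabilities, final emotion) for inputs d1 and d2."""
    x = [d1[e] for e in EMOTIONS] + [d2 / score_scale]
    logits = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(WEIGHTS, BIAS)]
    probabilities = dict(zip(EMOTIONS, softmax(logits)))    # stage 1
    final_emotion = max(probabilities, key=probabilities.get)  # stage 2
    return probabilities, final_emotion

# Invented inputs: voice analysis leans Happy, context analysis is negative.
d1 = {"Happy": 0.4, "Neutral": 0.3, "Sad": 0.2, "Angry": 0.1}
probabilities, d3 = integrate(d1, d2=-3.0)
print(d3)
```

With these made-up numbers the negative context score outweighs the weak Happy signal from the voice analysis, which illustrates how combining the two results can change the final emotion.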

By combining neural network-based voice analysis with emotion dictionary-based context analysis to analyze the user's emotion for the voice data d, the apparatus 100 can effectively improve the accuracy of emotion analysis compared with techniques that analyze emotion based on voice analysis alone or on context analysis alone.

FIG. 3 shows an example of the integrated emotion analysis result d3 derived by the neural network-based emotion analysis apparatus 100 according to an embodiment of the present application.

Referring to FIG. 3, when the voice data d contains no paralinguistic features (for example, intonation or tone of voice), the user's emotion cannot be analyzed (identified) accurately merely by applying the voice analysis of the voice analysis module 10 to the voice data d.

Accordingly, there are cases in which the absence of paralinguistic features in the voice data d makes it difficult to determine accurately, from the voice analysis of the voice analysis module 10, which emotion the user has, while the utterance of the voice data d nevertheless contains a word expressing emotion (an emotion word); in other words, the utterance has no distinctive intonation or the like but does contain an emotion word. In such a case, the integration module 30 may calculate and provide the integrated emotion analysis result d3, which represents the user's final emotion for the voice data d, by considering not only the first emotion analysis result d1 but also the second emotion analysis result d2 calculated through the context analysis.

For example, regarding the case in which the utterance of the voice data d input from the user contains no paralinguistic features (intonation, tone of voice, etc.), suppose that the utterance "Why is it so selfish?" is input as the voice data d.

By applying voice analysis to this voice data d, the voice analysis module 10 may calculate the probability (probability value) of each of the plurality of emotions as the first emotion analysis result d1. When voice analysis alone is applied to the input voice data d, the probability corresponding to Happy is the largest among the probabilities of the plurality of emotions (four emotions), so the emotion result 'Happy' may be produced. That is, with voice analysis alone, the user's emotion may be calculated as 'Happy'.

Meanwhile, by applying context analysis to this voice data d, the context analysis module 20 may calculate an emotion value of -3.0 (negative emotion) as the second emotion analysis result d2. That is, when context analysis alone is applied to the input voice data d, the emotion result "-3.0 (negative emotion)" may be produced. In other words, with context analysis alone, the user's emotion may be calculated as 'negative'.

The integration module 30 may perform the fused (integrated) analysis based on the neural network 31 by considering together the second emotion analysis result d2, '-3.0 (negative emotion)', and the first emotion analysis result d1, the 'probability of each of the plurality of emotions', and may thereby calculate the probability of each of the plurality of emotions as the integrated emotion analysis result d3 for the voice data d. As a result of the integrated analysis of the first and second emotion analysis results d1 and d2, the probability corresponding to Sad is the largest among the probabilities of the four emotions, so the integration module 30 may calculate the user's final emotion for the voice data d as 'Sad'. That is, because the emotion having the highest probability among the plurality of emotions in the integrated analysis is 'Sad', the integration module 30 may calculate the user's final emotion as 'Sad'.

도 4는 본원의 일 실시예에 따른 신경망 기반 감정 분석 장치(100)에 의하여 도출되는 통합 감정 분석 결과(d3)의 다른 예를 나타낸다.4 shows another example of an integrated emotion analysis result d3 derived by the neural network-based emotion analysis apparatus 100 according to an embodiment of the present application.

Referring to FIG. 4, if the voice data (d) contains no word expressing the user's emotion (emotion word), the user's emotion cannot be analyzed (identified) merely by applying context analysis based on the emotion dictionary (emotion dictionary DB, 23) to the voice data (d).

Therefore, when the voice data (d) contains no emotion word and it is difficult to determine, from the context analysis by the context analysis module 20, whether the user's emotion is positive or negative (in other words, when the second emotion analysis result for the voice data is 0.0, i.e., a neutral emotion), the integration module 30 may calculate and provide the integrated emotion analysis result (d3) indicating the user's final emotion for the voice data (d) by considering not only the second emotion analysis result (d2) but also the first emotion analysis result (d1) obtained through voice analysis.

As an example of a case in which the utterance in the voice data (d) input from the user contains no word expressing emotion (emotion word), suppose that the utterance "I'm done. I don't want to play." is input as the voice data (d).

문맥분석 모듈(20)은 이러한 음성 데이터(d)에 문맥분석을 적용한 결과, 제2 감정 분석 결과(d2)로서 0.0(중립 감정)에 해당하는 감정 값을 산출할 수 있다. 즉, 입력된 음성 데이터(d)에 대한 문맥분석의 단일 처리 적용시, 그 결과로는 "0.0(-)(중립 감정)"에 해당하는 감정 결과가 산출될 수 있다. 다시 말해, 문맥분석의 단일 적용으로는 사용자의 감정이 '중립'인 것으로 산출될 수 있다. As a result of applying the context analysis to the voice data d, the context analysis module 20 may calculate an emotion value corresponding to 0.0 (neutral emotion) as the second emotion analysis result d2. That is, when a single process of context analysis is applied to the input voice data d, an emotion result corresponding to "0.0(-) (neutral emotion)" may be calculated as a result. In other words, with a single application of context analysis, the user's emotion can be calculated as 'neutral'.
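The neutral result above can be illustrated with a minimal sketch of a dictionary lookup. The dictionary entries, their scores, and the function name `context_score` are assumptions made for illustration only; the context analysis of the context analysis module 20 and the contents of the emotion dictionary DB 23 are not specified at this level of detail here.

```python
import re

# Hypothetical emotion dictionary: word -> polarity score (illustrative values only).
EMOTION_DICT = {"happy": 2.0, "love": 3.0, "sad": -2.0, "hate": -3.0}

def context_score(utterance: str) -> float:
    """Sum the dictionary scores of the emotion words found in the utterance."""
    words = re.findall(r"[a-z']+", utterance.lower())
    return float(sum(EMOTION_DICT.get(word, 0.0) for word in words))

# No word of the utterance is in the dictionary, so the score stays neutral.
d2 = context_score("I'm done. I don't want to play.")
print(d2)
```

With no dictionary hit, the score carries no polarity information, which is why the voice-analysis result is needed as a second input.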

Meanwhile, by applying voice analysis to this voice data (d), the voice analysis module 10 may calculate a probability (probability value) for each of the plurality of emotions as the first emotion analysis result (d1). When voice analysis alone is applied to the input voice data (d), the probability for anger (Angry) has the largest value (0.00740) among the probabilities for the plurality of emotions (four emotions), so the result is 'Angry'. That is, voice analysis alone would classify the user's emotion as 'Angry'.

The integration module 30 may perform a fusion (integrated) analysis based on the neural network 31 that jointly considers the second emotion analysis result (d2), '0.0 (neutral emotion)', and the first emotion analysis result (d1), 'a probability for each of a plurality of emotions', and may thereby calculate a probability for each of the plurality of emotions as the integrated emotion analysis result (d3) for the voice data (d). Because the probability for anger (Angry) is the largest of the four emotion probabilities produced by the integrated analysis of the first and second emotion analysis results (d1, d2), the integration module 30 may determine the user's final emotion for the voice data (d) to be 'Angry'. That is, the final emotion is the emotion with the maximum probability among the plurality of emotions, which here is 'Angry'.
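The fusion step can be sketched as a single dense layer followed by a softmax over the five-element input formed from d1 (four probabilities) and d2 (one score). This is only a sketch under stated assumptions: the architecture of the neural network 31 is not given here, the weights below are hand-set placeholders rather than trained values, and the d1 values are made up and are not the values of FIG. 4.

```python
import math

EMOTIONS = ["Happy", "Neutral", "Sad", "Angry"]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(d1, d2, weights, bias):
    """Map [d1 (4 probabilities), d2 (polarity score)] to d3 and a final emotion."""
    x = list(d1) + [d2]
    logits = [sum(w * v for w, v in zip(row, x)) + b
              for row, b in zip(weights, bias)]
    d3 = softmax(logits)
    final = EMOTIONS[max(range(len(d3)), key=d3.__getitem__)]
    return d3, final

# Placeholder weights (assumption): pass d1 through, let d2 shift Happy vs. Sad/Angry.
WEIGHTS = [
    [4.0, 0.0, 0.0, 0.0, 0.5],
    [0.0, 4.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 4.0, 0.0, -0.5],
    [0.0, 0.0, 0.0, 4.0, -0.5],
]
BIAS = [0.0, 0.0, 0.0, 0.0]

d1 = [0.1, 0.2, 0.3, 0.4]  # made-up voice-analysis probabilities
d3, final_emotion = fuse(d1, 0.0, WEIGHTS, BIAS)
print(final_emotion)
```

When d2 is 0.0 the last input contributes nothing to the logits, so the final emotion follows the voice-analysis probabilities, which mirrors the situation described for FIG. 4.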

이에 따르면, 본 장치(100)는 문맥분석과 음성분석 중 어느 하나의 분석만 단일 처리로 적용하여 사용자의 감정을 산출하는 것 대비, 2가지 유형 분석(문맥분석과 음성분석) 기법을 융합하여 함께 고려해 사용자의 감정을 산출함으로써, 음성 데이터(d)에 대한 사용자의 감정 산출(추정)의 정확도를 효과적으로 높일 수 있다.According to this, the device 100 combines two types of analysis (context analysis and voice analysis) techniques, in contrast to calculating the user's emotion by applying only one analysis of context analysis and voice analysis as a single process. By calculating the user's emotion in consideration of the user's emotion, the accuracy of calculating (estimating) the user's emotion with respect to the voice data d can be effectively increased.

즉, 본 장치(100)는 발화문에 감정어가 존재하지 않아 감성사전 기반의 감정 분석이 어려운 경우, 음성분석을 함께 활용함으로써 감정 추정의 정확도가 높일 수 있다. 또한, 본 장치(100)는 발화문의 억양에 특별한 부분은 없지만 감정어가 포함된 경우, 감성사전 기반의 문맥분석을 함께 활용함으로써 감정 추정의 정확도를 높일 수 있다.That is, when the emotion analysis based on the emotion dictionary is difficult because the emotional word does not exist in the utterance, the apparatus 100 may increase the accuracy of emotion estimation by using the voice analysis together. In addition, the apparatus 100 may increase the accuracy of emotion estimation by using the sentiment dictionary-based context analysis together when an emotional word is included although there is no special part in the intonation of an utterance.

통합 모듈(30)은 통합분석을 기반으로 산출된 통합 감정 분석 결과(d3)를 사용자 단말(2)의 모바일 어플리케이션(2a)으로 제공(출력 제공)할 수 있다. 이를 통해, 사용자 단말(2)의 화면 상에 통합 감정 분석 결과(d3)가 표시(디스플레이)될 수 있다. 사용자는 본 장치(100)로부터 출력 제공되는 통합 감정 분석 결과(d3)를 사용자 단말(2)의 화면을 통해 제공받아 확인할 수 있다.The integrated module 30 may provide (output) the integrated emotion analysis result d3 calculated based on the integrated analysis to the mobile application 2a of the user terminal 2 . Through this, the integrated emotion analysis result d3 may be displayed (displayed) on the screen of the user terminal 2 . The user may receive and check the integrated emotion analysis result d3 output from the device 100 through the screen of the user terminal 2 .

도 5는 본원의 일 실시예에 따른 신경망 기반 감정 분석 기반의 감정 치료 시스템(200')의 개략적인 구성을 나타낸 도면이다. 도 6은 본원의 일 실시예에 따른 신경망 기반 감정 분석 기반의 감정 치료 장치(110)의 개략적인 구성을 나타낸 도면이다.5 is a diagram showing a schematic configuration of a neural network-based emotion analysis-based emotion treatment system 200' according to an embodiment of the present application. 6 is a diagram illustrating a schematic configuration of an emotion treatment apparatus 110 based on neural network-based emotion analysis according to an embodiment of the present application.

이하에서는 설명의 편의상 본원의 일 실시예에 따른 신경망 기반 감정 분석 기반의 감정 치료 시스템(200')을 본 시스템(200')이라 하고, 본원의 일 실시예에 따른 신경망 기반 감정 분석 기반의 감정 치료 장치(110)를 본 장치(110)라 하기로 한다. 또한, 본원에서 감정 치료는 감성 치료라 달리 표현될 수 있다.Hereinafter, for convenience of explanation, the neural network-based emotion analysis-based emotion treatment system 200 ′ according to an embodiment of the present application is referred to as the present system 200 ′, and neural network-based emotion analysis-based emotion treatment system according to an embodiment of the present application The device 110 will be referred to as the present device 110 . In addition, emotion therapy herein may be expressed differently as emotional therapy.

본 시스템(200')은 앞서 설명한 본 시스템(200, 본원의 일 실시예에 따른 신경망 기반 감정 분석 시스템)과 동일한 시스템일 수 있다. 또한, 본 장치(110)는 앞서 설명한 본 장치(100, 본원의 일 실시예에 따른 신경망 기반 감정 분석 장치)와 동일한 장치일 수 있다. 즉, 본 시스템(200')과 본 장치(110)는 앞서 설명한 본 시스템(200)과 본 장치(100)와 대비하여, 치료 모듈(40)의 구성이 추가되는 것에서만 차이가 있을 뿐, 다른 구성들 및 그 기능에 대해서는 동일할 수 있다.The present system 200' may be the same system as the present system 200 (a neural network-based emotion analysis system according to an embodiment of the present application) described above. Also, the apparatus 110 may be the same apparatus as the apparatus 100 (a neural network-based emotion analysis apparatus according to an embodiment of the present application) described above. That is, the present system 200 ′ and the present apparatus 110 differ only in that the configuration of the treatment module 40 is added, as compared to the present system 200 and the present apparatus 100 described above. The configurations and their functions may be the same.

따라서, 이하 생략된 내용이라 하더라도, 본 시스템(200)에 대하여 설명된 내용은 본 시스템(200')에 대한 설명에도 동일하게 적용될 수 있다. 또한, 이하 생략된 내용이라 하더라도, 본 장치(100)에 대하여 설명된 내용은 본 장치(110)에 대한 설명에도 동일하게 적용될 수 있다.Accordingly, even if omitted below, the descriptions of the present system 200 may be equally applied to the description of the present system 200 ′. Also, even if omitted below, the description of the apparatus 100 may be equally applied to the description of the apparatus 110 .

도 5 및 도 6을 참조하면, 본 시스템(200')은 사용자 단말(2) 및 본 장치(110)를 포함할 수 있다. 본 장치(110)는 음성분석 모듈(10), 문맥분석 모듈(20), 통합 모듈(30) 및 치료 모듈(40)을 포함할 수 있다. 여기서, 사용자 단말(2), 음성분석 모듈(10), 문맥분석 모듈(20) 및 통합 모듈(30)에 대한 설명은 앞서 자세히 설명했으므로, 이하 중복되는 설명은 생략하기로 한다.5 and 6 , the system 200 ′ may include a user terminal 2 and the device 110 . The apparatus 110 may include a voice analysis module 10 , a context analysis module 20 , an integration module 30 , and a treatment module 40 . Here, since the description of the user terminal 2, the voice analysis module 10, the context analysis module 20, and the integration module 30 has been described in detail above, the overlapping description will be omitted below.

The present apparatus 110 may determine and provide an emotion therapy method (d4) based on the integrated emotion analysis result (d3) obtained by performing the neural network-based emotion analysis of the present apparatus 100 (the neural network-based emotion analysis apparatus) described above. A detailed description follows.

통합 모듈(30)은 제1 감정 분석 결과(d1) 및 제2 감정 분석 결과(d2)를 신경망(31)의 입력으로 하여 음성 데이터(d)에 대한 통합 감정 분석 결과(d3)를 산출하고, 산출된 통합 감정 분석 결과(d3)를 치료 모듈(40)로 전달할 수 있다.The integration module 30 uses the first emotion analysis result d1 and the second emotion analysis result d2 as inputs to the neural network 31 to calculate the integrated emotion analysis result d3 for the voice data d, The calculated integrated emotion analysis result d3 may be transmitted to the treatment module 40 .

치료 모듈(40)은 통합 모듈(30)에 의해 산출된 통합 감정 분석 결과(d3)를 통합 모듈(30)로부터 전달받아 획득할 수 있다.The treatment module 40 may receive and obtain the integrated emotion analysis result d3 calculated by the integration module 30 from the integration module 30 .

The treatment module 40 may determine the emotion therapy method (d4) (therapy determination, 41) based on the obtained integrated emotion analysis result (d3). That is, the treatment module 40 may identify the user's final emotion from the integrated emotion analysis result (d3) and, according to the identified final emotion, decide which emotion therapy method (therapy, d4) to provide to the user 1.

Here, the types of emotion therapy method (therapy, d4) may include art therapy, music therapy, dance/movement therapy, and the like, but are not limited thereto; various emotion therapy methods that enable treatment of the user's emotions may be applied to the present apparatus 110.

When the integrated emotion analysis result (d3) indicates that the user's emotion is negative or neutral, the emotion therapy method (d4) determined by the treatment module 40 may be provided in the form of content that induces a change of the emotion of the user 1 toward a positive emotion. When the integrated emotion analysis result indicates that the emotion of the user 1 is positive, the emotion therapy method (d4) may be provided in the form of content that induces the positive emotion of the user 1 to be maintained and expanded.

여기서, 예시적으로 통합 감정 분석 결과(d3)로서 사용자의 감정이 부정적 감정으로 나타났다는 것은, 행복(Happy), 중립(Neutral), 슬픔(Sad) 및 분노(Angry)를 포함하는 복수의 감정 중 최종 감정이 슬픔(Sad) 또는 분노(Angry)로 결정된 경우를 의미할 수 있다. 또한, 통합 감정 분석 결과(d3)로서 사용자의 감정이 중립 감정으로 나타났다는 것은, 복수의 감정 중 최종 감정이 중립(Neutral)으로 결정된 경우를 의미할 수 있다. 또한, 통합 감정 분석 결과(d3)로서 사용자의 감정이 긍정적 감정으로 나타났다는 것은, 복수의 감정 중 최종 감정이 행복(Happy)으로 결정된 경우를 의미할 수 있다.Here, as an example of the integrated emotion analysis result (d3), the fact that the user's emotion is negative emotion is one of a plurality of emotions including happiness, neutrality, sadness, and anger. It may mean a case in which the final emotion is determined to be Sad or Angry. Also, as the result of the integrated emotion analysis d3 , that the user's emotion is a neutral emotion may mean a case in which the final emotion among a plurality of emotions is determined to be neutral. Also, as the result of the integrated emotion analysis d3 , that the user's emotion is a positive emotion may mean a case in which the final emotion among the plurality of emotions is determined to be Happy.

Accordingly, when the integrated emotion analysis result (d3) indicates a negative or neutral emotion (that is, when the final emotion is determined to be one of neutral, sadness, and anger), the treatment module 40 may provide the emotion therapy method (d4) in the form of content that induces the user's current emotion (neutral, sadness, or anger) to change into the positive emotion of happiness.

In addition, when the integrated emotion analysis result (d3) indicates a positive emotion (that is, when the final emotion is determined to be happiness), the treatment module 40 may provide the emotion therapy method (d4) in the form of content that induces the positive emotion corresponding to the user's current emotion of happiness to be maintained or expanded.
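The mapping from the final emotion to the goal of the emotion therapy method can be sketched as follows. The function names and the goal labels are hypothetical; only the grouping of the four emotions into negative, neutral, and positive and the two therapy goals come from the description above.

```python
NEGATIVE_EMOTIONS = {"Sad", "Angry"}

def polarity(final_emotion: str) -> str:
    """Classify one of the four final emotions as negative, neutral, or positive."""
    if final_emotion in NEGATIVE_EMOTIONS:
        return "negative"
    if final_emotion == "Neutral":
        return "neutral"
    if final_emotion == "Happy":
        return "positive"
    raise ValueError(f"unknown emotion: {final_emotion}")

def therapy_goal(final_emotion: str) -> str:
    """Negative or neutral emotions call for a change; positive ones are maintained."""
    if polarity(final_emotion) == "positive":
        return "maintain_and_expand_positive"
    return "induce_change_to_positive"

for emotion in ("Happy", "Neutral", "Sad", "Angry"):
    print(emotion, polarity(emotion), therapy_goal(emotion))
```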

일예로, 치료 모듈(40)은 사용자의 감정에 대해 긍정적 감정으로의 변화를 유도하는 컨텐츠 형태의 감정치료법으로서 미술치료법을 제공할 수 있다. 또한, 치료 모듈(40)은 사용자의 감정에 대해 긍정적 감정의 유지 및 확장을 유도하는 컨텐츠 형태의 감정치료법으로서 영화, 음악 및 도서 중 적어도 하나의 컨텐츠를 제공할 수 있다.As an example, the treatment module 40 may provide an art therapy method as an emotion therapy method in the form of content that induces a change in a user's emotion into a positive emotion. In addition, the treatment module 40 may provide at least one content among movies, music, and books as an emotion therapy method in the form of content that induces maintenance and expansion of positive emotions with respect to the user's emotions.

Once the emotion therapy method (d4) is determined (therapy determination, 41) based on the obtained integrated emotion analysis result (d3), the treatment module 40 may prepare the determined emotion therapy method (d4) in a form that allows the user to perform the emotion therapy (42), and may provide it to the mobile application 2a of the user terminal 2.

In doing so, the treatment module 40 may apply therapy information obtained from the local storage 43 to the emotion therapy method determined based on the integrated emotion analysis result (d3), thereby preparing the determined emotion therapy method (d4) in a form that allows the user to perform the emotion therapy (42). The treatment module 40 may then provide the emotion therapy method (d4) prepared in this way to the user terminal 2.

로컬 스토리지(43)에는 복수의 감정 각각에 대하여 각 감정별로 그에 대응하는 적어도 하나의 치료법 정보가 기 저장되어 있을 수 있다. 여기서, 치료법 정보는 컨텐츠 정보를 의미할 수 있다. 예시적으로, 치료법 정보에는 최종 감정이 슬픔인 경우에 제공 가능한 복수의 미술그림 관련 컨텐츠 정보, 최종 감정이 행복인 경우에 제공 가능한 복수의 음악 관련 컨텐츠 정보 등이 포함될 수 있다.At least one treatment information corresponding to each emotion for each emotion may be pre-stored in the local storage 43 . Here, the treatment information may mean content information. For example, the treatment information may include a plurality of pieces of art picture related content information that can be provided when the final emotion is sadness, and a plurality of pieces of music related content information that can be provided when the final emotion is happiness.
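A lookup of pre-stored therapy information per emotion can be sketched with a plain dictionary. The storage layout, the content identifiers, and the random selection among candidates are all assumptions for illustration; how the local storage 43 is organized and how one item is chosen is not specified here.

```python
import random

# Hypothetical contents of the local storage: emotion -> list of content records.
LOCAL_STORAGE = {
    "Sad": [
        {"therapy": "art", "content": "art_picture_01"},
        {"therapy": "art", "content": "art_picture_02"},
    ],
    "Happy": [
        {"therapy": "music", "content": "music_track_01"},
    ],
}

def pick_therapy_info(final_emotion, rng=random):
    """Return one stored content record for the emotion, or None if none is stored."""
    candidates = LOCAL_STORAGE.get(final_emotion, [])
    return rng.choice(candidates) if candidates else None

info = pick_therapy_info("Sad")
print(info)
```

Returning `None` for an emotion without stored content leaves the fallback behavior to the caller, which is a design choice of this sketch.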

For example, suppose that art therapy is determined (41) as the emotion therapy method based on the integrated emotion analysis result (d3). In this case, the treatment module 40 may obtain from the local storage 43, as therapy information, one of a plurality of art pictures that enables treatment of the final emotion (for example, treatment that induces the negative emotion of sadness to change into the positive emotion of happiness). The treatment module 40 may then prepare the determined emotion therapy method, based on the art picture corresponding to the therapy information obtained from the local storage 43, in a form that allows the user to perform the emotion therapy (42), and provide it to the user terminal 2.

As an example, the treatment module 40 may prepare the art picture corresponding to the obtained therapy information in the form of a drawing board, so that the user can look at the picture and copy it, and provide it to the user terminal 2. With the emotion therapy method (d4) provided to the user terminal 2 in this way, one area of the screen of the user terminal 2 may display the art picture corresponding to the emotion therapy method determined by the treatment module 40 (that is, the art picture obtained from the local storage), and the remaining area of the screen may display (provide) a drawing board on which the user can draw while looking at that picture.

The screen of the user terminal 2 on which the integrated emotion analysis result (d3) and/or the emotion therapy method (d4) is displayed (provided) may be controlled, for example, by the integration module 30 and/or the treatment module 40 of the present apparatus 100, 110.

In other words, the integrated emotion analysis result (d3) and/or the emotion therapy method (d4) provided by the present apparatus 100, 110 may be displayed on the screen of the user terminal 2 through the mobile application 2a. To this end, the integration module 30 and/or the treatment module 40 of the present apparatus 100, 110 may control the mobile application 2a so that the integrated emotion analysis result (d3) and/or the emotion therapy method (d4) is displayed on the screen of the user terminal 2.

이처럼, 치료 모듈(40)이 통합 감정 분석 결과(d3)에 대응하여 결정된 감정치료법(d4)을 사용자 단말(2)로 제공함으로써, 사용자는 본 장치(100)에 의해 제공되는 감정치료법(d4)을 기반으로 감정 치료를 수행할 수 있다.As such, the treatment module 40 provides the emotion treatment method d4 determined in response to the integrated emotion analysis result d3 to the user terminal 2 , so that the user can use the emotion treatment method d4 provided by the device 100 . Emotional therapy can be performed based on

도 7은 본원의 일 실시예에 따른 신경망 기반 감정 분석 기반의 감정 치료 장치(110)에 의하여 사용자의 감정 치료가 이루어지는 예를 설명하기 위한 도면이다. 특히, 도 7은 사용자의 감정 분석 및 감정 치료를 수행하기 위해, 본 장치(110)가 모바일 어플리케이션(2a)을 제어함으로써 표시되는 사용자 단말(2)의 화면 예를 나타낸 도면이다. 7 is a diagram for explaining an example in which a user's emotion treatment is performed by the emotion treatment apparatus 110 based on neural network-based emotion analysis according to an embodiment of the present application. In particular, FIG. 7 is a diagram illustrating an example of a screen of the user terminal 2 displayed by the device 110 controlling the mobile application 2a in order to perform emotion analysis and emotion treatment of the user.

Referring to FIG. 7, for emotion treatment based on the neural network-based emotion analysis by the present apparatus 100, 110, the mobile application 2a may provide a 'Start' menu, a 'List' menu, and the like on the main screen of the user terminal 2.

사용자 단말(2)의 화면 상에 표시된 '시작' 메뉴에 대한 사용자 입력이 이루어지면, 사용자(1)로부터 음성 데이터(d)에 대한 입력(녹음)이 가능하도록 하는 음성 데이터 입력 버튼이 화면 상에 제공될 수 있다. 사용자는 음성 데이터 입력 버튼에 대한 입력(예를 들어, 터치 입력 등)을 수행한 다음, 감정 분석을 수행하고자 하는 음성을 발화할 수 있다. 사용자 단말(2)의 모바일 어플리케이션(2a)은 사용자로부터 입력된 발화에 대응하는 음성 데이터(d)를 본 장치(100, 110)로 제공할 수 있다. 본 장치(100, 110)는 입력된 음성 데이터(d)에 대하여 음성분석 및 문맥분석을 적용함으로써 사용자의 감정(현재 감정)(음성 데이터에 대응하는 감정)을 분석할 수 있다.When a user input is made to the 'Start' menu displayed on the screen of the user terminal 2, a voice data input button that enables input (recording) of the voice data d from the user 1 is displayed on the screen. may be provided. The user may perform an input (eg, a touch input, etc.) to the voice data input button, and then utter a voice to perform emotion analysis. The mobile application 2a of the user terminal 2 may provide the voice data d corresponding to the utterance input by the user to the apparatuses 100 and 110 . The apparatuses 100 and 110 may analyze the user's emotion (current emotion) (emotion corresponding to the voice data) by applying voice analysis and context analysis to the input voice data d.

이후, 본 장치(100, 110)는 입력된 음성 데이터(d)에 응답하여, 분석 결과로서 사용자의 최종 감정을 나타내는 통합 감정 분석 결과(d3)를 산출할 수 있다. 또한, 본 장치(100, 110)는 산출된 통합 감정 분석 결과(d3)를 기반으로 감정치료법(d4)을 결정하고, 결정된 감정치료법(d4)을 모바일 어플리케이션(2a)으로 제공할 수 있다. Thereafter, in response to the input voice data d, the apparatuses 100 and 110 may calculate an integrated emotion analysis result d3 representing the user's final emotion as an analysis result. Also, the apparatuses 100 and 110 may determine an emotion treatment method d4 based on the calculated integrated emotion analysis result d3 and provide the determined emotion treatment method d4 to the mobile application 2a.

As an example, when the analyzed final emotion of the user is sadness (Sad), the present apparatus 100, 110 may provide to the mobile application 2a, as the emotion therapy method (d4), content that induces the user's negative emotion (that is, sadness) to change into a positive emotion (for example, an art picture according to art therapy). Accordingly, an art picture for performing art therapy may be displayed on the screen of the user terminal 2 as the emotion therapy method (d4). In addition to the art picture, the present apparatus 100, 110 may provide the mobile application 2a with a therapy execution mode that allows the user 1 to perform emotion therapy using that picture. The therapy execution mode may be, for example, a drawing board on which the user can draw while looking at the art picture. That is, the mobile application 2a may display, for example, a drawing board for art therapy on the screen of the user terminal 2 as the therapy execution mode for art therapy.

이에 따르면, 모바일 어플리케이션(2a)은 입력된 음성 데이터(d)에 대한 사용자의 감정 분석 결과로서, 미술그림 혹은 치료수행 모드에 관한 정보를 포함하는 감정치료법(d4) 및 통합 감정 분석 결과(d3)를 사용자 단말(2)의 화면 상에 표시(제공)할 수 있다.According to this, the mobile application 2a is a user's emotion analysis result for the input voice data d, and the emotion therapy method (d4) and the integrated emotion analysis result (d3) including information about an art picture or a treatment execution mode (d3) may be displayed (provided) on the screen of the user terminal 2 .

다른 일예로, 결정된 감정치료법(d4)이 무용/동작 치료법인 경우, 사용자 단말(2)의 화면 상에는 치료수행 모드로서 사용자가 주어진 동작을 따라 행동할 수 있도록 하는 동영상 등이 표시될 수 있다.As another example, when the determined emotion therapy d4 is a dance/movement therapy, a video or the like may be displayed on the screen of the user terminal 2 to allow the user to act according to a given motion as a treatment execution mode.

도 8은 본원의 일 실시예에 따른 신경망 기반 감정 분석 기반의 감정 치료 장치(110)와 연동되는 모바일 어플리케이션(2a) 내의 메뉴 구성을 개략적으로 나타낸 도면이다. 도 8에서는 일예로 감정치료법으로서 미술치료법이 고려되는 경우에 대하여 설명하기로 한다. FIG. 8 is a diagram schematically illustrating a menu configuration in the mobile application 2a that is linked to the neural network-based emotion analysis-based emotion treatment apparatus 110 according to an embodiment of the present application. In FIG. 8, as an example, a case in which art therapy is considered as an emotion therapy will be described.

도 8을 참조하면, 모바일 어플리케이션(2a)은 사용자 단말(2)의 화면 상에 표시 가능한 메뉴로서 일예로 녹음 메뉴와 히스토리 목록 메뉴를 포함할 수 있다.Referring to FIG. 8 , the mobile application 2a is a menu that can be displayed on the screen of the user terminal 2 and may include, for example, a recording menu and a history list menu.

녹음 메뉴에는 미술 치료 예제 메뉴 및 미술 치료 그림판 메뉴가 포함될 수 있다.The recording menu may include an art therapy sample menu and an art therapy paint menu.

The art therapy example menu may store, for example, information related to the emotion therapy method (d4) that the user receives from the present apparatus 100, 110 when performing emotion therapy with it. That is, the art therapy example menu may store example art pictures that allow the user to perform art therapy in accordance with the art therapy method. By referring to the plurality of example art pictures stored in the art therapy example menu, the user can carry out art therapy work such as drawing a picture. The screen of the user terminal corresponding to the art therapy example menu may be, for example, like the 'example picture screen before art therapy' shown in FIG. 7.

The art therapy drawing board menu may be a menu that provides a screen on which the user can draw a picture to actually perform the art therapy determined according to the final emotion analyzed by the present apparatus 100, 110 (that is, a menu that provides the therapy execution mode). The screen of the user terminal corresponding to the art therapy drawing board may be, for example, like the 'user drawing board screen for art therapy' shown in FIG. 7. Based on the therapy execution mode provided through the art therapy drawing board menu, the user may, for example, select a pen color and draw a picture according to the manual of the art therapy method, whereby art therapy is carried out.

The history list menu may include, for example, four menus: a date history menu, a mood history menu, a voice history menu, and a picture history menu. The history list menu may store history information (date, mood, voice, and picture history) generated when the user performs emotion therapy based on data provided by the present apparatus 100, 110 (for example, an integrated emotion analysis result or an emotion therapy method). Here, mood may mean emotion.

The history list menu has four menu functions (date, mood, voice, picture), which respectively keep the dates on which the user previously used the mobile application 2a linked with the present apparatus 100, 110, and the mood, voice, and picture entered at that time.

사용자는 기분 히스토리 메뉴를 통하여 사용자 자신의 이전 감정 상태에 대해 확인할 수 있다. 사용자는 음성 히스토리 메뉴를 통하여 감정 분석 내지 감정 치료를 수행하던 당시에 입력(녹음)된 음성 데이터(음성 파일)을 열어 당시의 기분을 확인할 수 있다. 사용자는 그림 히스토리 메뉴를 통하여 감정 분석 내지 감정 치료를 수행한 당시의 기분을 긍정적으로 변화되도록 유도하기 위해 수행되었던 감정치료법에 대한 치료 결과(예를 들어, 미술치료의 결과)인 그림을 확인할 수 있다.The user may check the user's previous emotional state through the mood history menu. The user can check the mood at the time by opening the voice data (voice file) input (recorded) at the time when emotion analysis or emotion treatment is performed through the voice history menu. The user can check the picture that is the treatment result (for example, the result of art therapy) for the emotion therapy method that was performed to induce a positive change in the mood at the time when the emotion analysis or emotion treatment was performed through the picture history menu. .
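One possible shape for a stored history record, covering the four history menus (date, mood, voice, picture), is sketched below. The class name, field names, file names, and sample entries are hypothetical and are given only to make the four kinds of history information concrete.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class HistoryEntry:
    day: date          # date history: when the application was used
    mood: str          # mood history: final emotion, e.g. "Sad"
    voice_file: str    # voice history: recorded utterance (hypothetical path)
    picture_file: str  # picture history: drawing made during therapy (hypothetical path)

# Made-up sample entries for illustration.
history = [
    HistoryEntry(date(2019, 9, 4), "Happy", "voice_002.wav", "drawing_002.png"),
    HistoryEntry(date(2019, 9, 3), "Sad", "voice_001.wav", "drawing_001.png"),
]

def mood_history(entries):
    """Return (ISO date, mood) pairs in chronological order."""
    ordered = sorted(entries, key=lambda entry: entry.day)
    return [(entry.day.isoformat(), entry.mood) for entry in ordered]

print(mood_history(history))
```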

The present apparatus 100, 110 may analyze the user's emotion for the input voice data d by fusing neural network based voice analysis (for example, using a CNN) with context analysis based on an emotion dictionary (emotion word dictionary). Based on this fusion-based emotion analysis result (the integrated emotion analysis result d3), the present apparatus 100, 110 may provide the user 1 with an emotion therapy method (emotion therapy technique), that is, enable the user to perform emotion therapy.

The present apparatus 100, 110 may provide, for example, art therapy as the emotion therapy method. The emotion therapy provided by the present apparatus 100, 110 can induce the user's negative emotions (Sad, Angry) or neutral emotion (Neutral) to change into a positive emotion. It can also induce the user's positive emotion (Happy) to be maintained and expanded. In other words, the emotion therapy provided by the present apparatus 100, 110 serves either to shift the user's emotion in a more positive direction or to maintain and expand an already positive emotion.

When voice data d is input, the present apparatus 100, 110 may provide the input voice data d to both the voice analysis module 10 and the context analysis module 20. The voice analysis module 10 may process the voice data d by voice analysis and calculate, as the first emotion analysis result d1, a probability value for each of, for example, four emotions (Happy, Neutral, Sad, Angry). The context analysis module 20 may process the voice data d by context analysis and calculate, as the second emotion analysis result d2, an emotion score that expresses the strength of positivity or negativity (that is, a single positive/negative result value). The integration module 30 may then take the first and second emotion analysis results d1 and d2 as inputs and calculate the integrated emotion analysis result d3, which includes the user's final emotion. In other words, the integration module 30 may derive the user's final emotion from the emotion data obtained from the voice analysis module 10 and the positive/negative result value obtained from the context analysis module 20. The therapy module 40 may determine an emotion therapy method d4 based on the calculated integrated emotion analysis result d3 and provide the determined emotion therapy method d4 to the mobile application 2a so that it is displayed on the screen of the user terminal 2. The user 1 can then perform emotion therapy based on the emotion therapy method d4 shown on the screen of the user terminal 2.

FIG. 9 is a diagram for explaining an example of how the context analysis module 20 calculates the second emotion analysis result d2 in the neural network based emotion analysis based emotion therapy apparatus 110 according to an embodiment of the present application.

Referring to FIG. 9, assume as an example that the input voice data d is 'When you have meet so many kind and wonderful Stepmothers.'

The context analysis module 20 may convert the input voice data d into text through Speech To Text (STT, 21). The context analysis module 20 may then extract the base form (lemma) of each word from the converted text using Google Natural Language 22. As an example, the context analysis module 20 may extract [I, have, met, so, many, kind, and, wonderful, Stepmothers] from the voice data d as the at least one word base form.

Next, the context analysis module 20 may match each extracted word base form against the word base forms contained in the emotion dictionary DB (emotion word dictionary DB, 23) to check whether a matching word base form exists. Through this matching, the context analysis module 20 may obtain the emotion score of each matching word base form.

Put differently, the context analysis module 20 may apply Google Natural Language 22 to the converted text to extract the base forms (lemmas) of the words that express emotion (emotion words), match the extracted base forms against the emotion dictionary DB 23, and obtain the pre-stored index of each matching word base form. Based on the obtained index, the context analysis module 20 may then obtain the emotion score assigned in advance to that matching word base form. The emotion dictionary DB 23 may store the word base form, the index, and the emotion score in association with one another in the form of a lookup table.

For example, the emotion dictionary DB 23 may hold pre-stored entries such as 'kind (index 3): +1.0' and 'wonderful (index 9): +2.0'. In this case, the context analysis module 20 may obtain from the emotion dictionary DB 23, as the emotion score of each matching word base form, '+1.0' for the matching word base form kind and '+2.0' for the matching word base form wonderful.

The context analysis module 20 may then sum the emotion scores of the matching words corresponding to the extracted word base forms and output the sum as the second emotion analysis result d2. In this example, the summed emotion score (Sentiment Score), that is, the emotion score for the whole sentence corresponding to the voice data d, is +3.0 (+1.0 + 2.0 = +3.0), a positive emotion value. Context analysis of the voice data d therefore yields a 'positive emotion' for the user.
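The lookup and summation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `EMOTION_DICTIONARY`, `match_lemmas`, and `sentiment_score` are hypothetical, and the dictionary holds only the two entries quoted in the example. STT and lemma extraction are assumed to have already produced the lemma list.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical lookup table: lemma -> (index, emotion score).
# Only these two entries appear in the text; a real dictionary would be larger.
EMOTION_DICTIONARY: Dict[str, Tuple[int, float]] = {
    "kind": (3, 1.0),
    "wonderful": (9, 2.0),
}


def match_lemmas(lemmas: Iterable[str]) -> List[Tuple[str, int, float]]:
    """Return (lemma, index, score) for every lemma found in the dictionary."""
    matches = []
    for lemma in lemmas:
        entry = EMOTION_DICTIONARY.get(lemma.lower())
        if entry is not None:
            index, score = entry
            matches.append((lemma.lower(), index, score))
    return matches


def sentiment_score(lemmas: Iterable[str]) -> float:
    """Sum the scores of all matching lemmas (the second result d2)."""
    return sum(score for _, _, score in match_lemmas(lemmas))


# Lemma list taken from the example in the text.
lemmas = ["I", "have", "met", "so", "many", "kind", "and", "wonderful", "Stepmothers"]
d2 = sentiment_score(lemmas)
print(d2)  # 3.0
```

Lemmas without a dictionary entry contribute nothing, so a sentence with no emotion words scores 0 in this sketch.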

FIGS. 10A to 10C are diagrams for explaining an example of how the voice analysis module 10 calculates the first emotion analysis result d1 in the neural network based emotion analysis based emotion therapy apparatus 110 according to an embodiment of the present application.

In particular, FIG. 10A shows an example of the voice waveform features (that is, MFCC feature values) extracted by applying the MFCC transform 11 to the voice data d. FIG. 10B shows an example of the learning curve of the model data obtained from the local storage 13. FIG. 10C shows an example of the first emotion analysis result d1.

Referring to FIGS. 10A to 10C, assume as an example that the input voice data d is 'When you have meet so many kind and wonderful Stepmothers.'

The voice analysis module 10 may extract voice waveform features (that is, MFCC feature values) from the voice data d by applying the MFCC transform 11 to it. For example, the voice analysis module 10 may extract 26 voice waveform features from the voice data d. The voice analysis module 10 may provide the extracted voice waveform features as input to the neural network 12.

The voice analysis module 10 may also load model data from the local storage 13 and provide it to the neural network 12. Here, the model data stored in the local storage 13 may refer to a neural network model trained to classify voices from MFCC feature values, or to the neural network weight values determined by training that model.

As an example, the model data may refer to neural network model data (for example, weight values) obtained by training on the MFCC feature values of 2000 Tess voice samples. The blue curve in FIG. 10B indicates accuracy, and the model data applied to the present apparatus 100, 110 may be, for example, model data showing a training accuracy of 98%.

The voice analysis module 10 may apply the voice waveform features extracted from the voice data d and the model data obtained from the local storage 13 to the neural network 12, and obtain the first emotion analysis result d1 as the output of the neural network 12.

The first emotion analysis result d1 calculated by the voice analysis module 10 may include a probability value for each of a plurality of emotions including Happy, Neutral, Sad, and Angry. FIG. 10C shows, by way of example, a first emotion analysis result d1 in which, after voice analysis of the voice data d, 'Happy' has the largest value among the four emotions. Voice analysis of the voice data d therefore yields 'Happy' as the user's emotion.

In the final stage, the result of applying voice analysis to the voice data d may be normalized with a softmax function so that the values sum to 1. In this regard, the emotion analysis result shown in FIG. 10C (the first emotion analysis result d1) may, by way of example, represent the values before the softmax function is applied.
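The softmax normalization mentioned here can be illustrated with a short sketch. The raw scores below are hypothetical placeholders, not the values of FIG. 10C, and the function name `softmax` is chosen only for this example.

```python
import math
from typing import Dict


def softmax(scores: Dict[str, float]) -> Dict[str, float]:
    """Map raw per-emotion scores to probabilities that sum to 1."""
    peak = max(scores.values())  # subtract the maximum for numerical stability
    exps = {name: math.exp(value - peak) for name, value in scores.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}


# Hypothetical pre-softmax outputs for the four emotions.
raw_scores = {"Happy": 2.1, "Neutral": 0.3, "Sad": -0.5, "Angry": -1.2}
d1 = softmax(raw_scores)
```

Softmax preserves the ordering of the inputs, so the emotion with the largest raw score also has the largest probability.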

FIG. 11 is a diagram for explaining an example of how the integration module 30 calculates the integrated emotion analysis result d3 in the neural network based emotion analysis based emotion therapy apparatus 110 according to an embodiment of the present application.

Referring to FIG. 11, the integration module 30 may provide the first emotion analysis result d1 and the second emotion analysis result d2 as inputs to the neural network 31 and perform a fusion (integrated) analysis, thereby calculating a probability for each of the plurality of emotions as the integrated emotion analysis result d3 for the voice data d.
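The text does not specify the architecture of the neural network 31. As a rough sketch under that caveat, the fusion can be pictured as a single dense layer with a softmax output that takes five inputs (the four probabilities of d1 plus the score d2). The weights `W` and biases `B` below are hand-picked for illustration only; a real network would learn them, and the names `fuse` and `dense_softmax` are hypothetical.

```python
import math
from typing import Dict, List, Sequence

EMOTIONS = ["Happy", "Neutral", "Sad", "Angry"]


def dense_softmax(x: Sequence[float],
                  weights: Sequence[Sequence[float]],
                  biases: Sequence[float]) -> List[float]:
    """One fully connected layer followed by softmax."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(d1: Sequence[float], d2: float,
         weights: Sequence[Sequence[float]],
         biases: Sequence[float]) -> Dict[str, float]:
    """Concatenate d1 (4 probabilities) and d2 (1 score), then classify."""
    features = list(d1) + [d2]
    return dict(zip(EMOTIONS, dense_softmax(features, weights, biases)))


# Illustrative weights (assumption): each voice probability feeds its own
# emotion, and a positive text score raises Happy and lowers Sad and Angry.
W = [
    [1.0, 0.0, 0.0, 0.0, 0.5],   # Happy
    [0.0, 1.0, 0.0, 0.0, 0.0],   # Neutral
    [0.0, 0.0, 1.0, 0.0, -0.5],  # Sad
    [0.0, 0.0, 0.0, 1.0, -0.5],  # Angry
]
B = [0.0, 0.0, 0.0, 0.0]

d3 = fuse([0.7, 0.1, 0.1, 0.1], 3.0, W, B)
```

Because this sketch ends in a softmax, its outputs always sum to 1; an implementation without that final normalization would not share this property.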

When the integrated analysis of the first and second emotion analysis results d1 and d2 shows that, among the probabilities for the four emotions, the probability for Happy is the largest, the integration module 30 may determine that the user's final emotion for the voice data d is 'Happy'. That is, because the emotion with the maximum probability among the plurality of emotions is 'Happy', the integration module 30 takes 'Happy' as the user's final emotion. The integration module 30 may output, as the integrated emotion analysis result d3, the result of this integrated analysis (voice analysis and context analysis performed together) in which 'Happy' is determined to be the user's final emotion.

As an example, the integrated analysis emotion values shown in FIG. 11 may refer to the final result values for each of the plurality of emotions, including Happy, Neutral, Sad, and Angry, in the integrated emotion analysis result d3 calculated by the integration module 30. These are the final values obtained by secondary processing of two inputs together: the first emotion analysis result d1, which is the voice analysis result for the voice data d calculated by the voice analysis module 10 and normalized by the softmax function to sum to 1, and the second emotion analysis result d2, which is the natural language analysis result calculated by the context analysis module 20. The final result values for the plurality of emotions may or may not sum to 1.

The therapy module 40 may determine the emotion therapy method d4 based on the integrated emotion analysis result d3 and provide it to the user terminal 2. For example, when the user's final emotion included in the integrated emotion analysis result d3 is 'Happy', a positive emotion, the therapy module 40 may provide the emotion therapy method d4 in the form of content that encourages the positive emotion to be maintained and expanded. As an example, the therapy module 40 may provide a drawing board so that the user can perform the emotion therapy 'drawing a happy scene' as the emotion therapy method d4.
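The selection of the final emotion and of a matching therapy can be sketched as below. The function names and the therapy labels are hypothetical; the text states only that a positive final emotion leads to content that maintains and expands it, and that negative or neutral emotions lead to therapy that induces a positive change.

```python
from typing import Dict

POSITIVE_EMOTIONS = {"Happy"}  # Sad, Angry and Neutral are handled together


def final_emotion(d3: Dict[str, float]) -> str:
    """Pick the emotion with the highest value in the integrated result d3."""
    return max(d3, key=d3.get)


def choose_therapy(emotion: str) -> str:
    """Return an illustrative therapy label d4 for the final emotion."""
    if emotion in POSITIVE_EMOTIONS:
        # Maintain and expand the positive emotion.
        return "maintain: draw a happy scene on the drawing board"
    # Negative or neutral emotion: guide the user toward a positive emotion.
    return "induce-positive: art therapy session"


d3_example = {"Happy": 0.8, "Neutral": 0.1, "Sad": 0.05, "Angry": 0.05}
d4 = choose_therapy(final_emotion(d3_example))
```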

The present apparatus 100, 110 may process the voice data d of the user 1 through the neural network based voice analysis module 10 and the text based context analysis module 20 to obtain the respective results d1 and d2, and then integrate them in the integration module 30 to derive the user's final emotion as the analysis result d3.

The present apparatus 100, 110 may provide the emotion therapy method d4 according to the integrated analysis result d3. Depending on that result, the emotion therapy method d4 provided to the user terminal 2 may be art therapy or a content recommendation (movies, music, books, and the like). By providing the emotion therapy method d4, the present apparatus 100, 110 can help relieve problems related to the user's stress and emotions.

For example, when the user's final (current) emotion is 'Sad', the present apparatus 100, 110 may provide, for example, art therapy as an emotion therapy method d4 capable of changing the sad emotion into a positive emotion (for example, a happy emotion).

The present apparatus 100, 110 may analyze the user's emotion by fusing voice analysis and context analysis. Emotion analysis that relies on voice analysis alone has difficulty with neutral intonation, and even for the same sentence the detected emotion may differ depending on the individual's paralinguistic characteristics.

A number of studies have been conducted on voice-based emotion analysis, but most report a relatively low accuracy of 60 to 70%. In addition, most known emotion analysis techniques perform emotion analysis using data such as images, and no technique exists that performs emotion analysis by fusing context analysis and voice analysis. Furthermore, conventional emotion analysis techniques based on natural language processing cannot recognize emotion accurately when the input voice data contains no words that express emotion (emotion words).

Specifically, conventional emotion analysis techniques that use voice analysis alone have the following problems. They have difficulty recognizing an emotional sentence spoken with neutral intonation, and even for the same sentence the emotion analysis result may differ because of individual differences.

One example of a conventional technique for recognizing human emotion is the following document: [Yelin Kim and Emily Mower Provost. 2015. Emotion Recognition During Speech Using Dynamics of Multiple Regions of the Face. ACM Trans. Multimedia Comput. Commun. Appl. 12, 1s, Article 25 (October 2015), 23 pages.]

The above document proposes an extension of the Deep Belief Network and differs from the technique proposed herein in that it does not use neural network (deep learning) technology. In addition, the emotion analysis in that document shows a low accuracy of 60 to 70% and differs in that it performs emotion analysis based on images of the face.

Meanwhile, conventional emotion analysis techniques based on natural language processing have the following problems. When the voice data obtained from the user contains no word that expresses emotion (emotion word), emotion recognition is difficult. Moreover, even when the voice data contains no emotion word, a sentence may still be emotional because of dependencies between words or voice characteristics; such techniques fail to detect this and may produce inaccurate emotion analysis results.

FIG. 12 is a diagram showing an example screen of an emotion analysis result obtained by applying conventional natural language processing to given voice data. FIG. 13 is a diagram showing an example screen of an emotion analysis result obtained for the given voice data by applying the context analysis of the context analysis module 20 of the neural network based emotion analysis based emotion therapy apparatus 110 according to an embodiment of the present application.

Referring to FIGS. 12 and 13, the emotion analysis result in FIG. 12 is an example in which Google Natural Language from Google is applied to the voice data as conventional natural language processing. By contrast, the emotion analysis result in FIG. 13 is an example in which an improved emotion analysis technique is applied, namely the context analysis technique of the present apparatus, which improves on the conventional technique by using the Dependency Tree and the modifier relation structure of Google Natural Language.

As described below, the improved emotion analysis technique applied herein (that is, the context analysis based emotion analysis performed by the context analysis module of the present apparatus) performs better than emotion analysis that simply uses Google Natural Language, a conventional natural language processing technique.

Specifically, FIGS. 12 and 13 show the emotion analysis results for the case where the voice data to be analyzed is the sentence "I can’t be happy sometimes not being with my family makes me feel so lonely".

Because this example contains the expressions "can’t be happy" and "lonely", the given voice data should be judged to be a negative sentence overall. That is, the user's emotion for the given voice data should be judged to be a negative emotion.

Referring to FIG. 12, however, conventional Google Natural Language assigns an emotion score of 0.2 to the given voice data and accordingly judges the user's emotion to be a 'neutral emotion'. This shows that the conventional natural language processing based emotion analysis result is inaccurate.

By contrast, referring to FIG. 13, the context analysis module 20 of the present apparatus 100, 110 calculates an emotion score of -2.5 as the emotion analysis result for the given voice data (that is, the second emotion analysis result d2) and accordingly judges the user's emotion to be a 'negative emotion'. That is, by applying the context analysis of the context analysis module 20, the present apparatus 100, 110 can conclude that the sentence of the given voice data is negative overall.
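The text does not detail how the dependency tree and modifier relations are used. One way such structure can change a score is by letting a negation flip the polarity of the word it modifies. The sketch below is a deliberately simplified stand-in: it uses a fixed token window instead of a real dependency parse, and the dictionary scores, the negator list, and the function name are all assumptions. It does not reproduce the -2.5 score of FIG. 13.

```python
from typing import Dict, List, Set

# Hypothetical dictionary scores and negation words (not from the source).
SCORES: Dict[str, float] = {"happy": 2.0, "lonely": -2.0}
NEGATORS: Set[str] = {"not", "can't", "cannot", "never", "no"}


def negation_aware_score(tokens: List[str], window: int = 2) -> float:
    """Sum dictionary scores, flipping the sign of a word that has a
    negator among the `window` tokens immediately before it."""
    total = 0.0
    for i, token in enumerate(tokens):
        score = SCORES.get(token)
        if score is None:
            continue
        preceding = tokens[max(0, i - window):i]
        if any(t in NEGATORS for t in preceding):
            score = -score
        total += score
    return total


sentence = ("i can't be happy sometimes not being with my family "
            "makes me feel so lonely")
tokens = sentence.split()

plain = negation_aware_score(tokens, window=0)   # no negation handling
aware = negation_aware_score(tokens, window=2)   # "can't be happy" is flipped
```

With these assumed scores, plain summation cancels "happy" against "lonely" and yields a neutral total, while the negation-aware variant yields a clearly negative total, which mirrors the qualitative difference between FIG. 12 and FIG. 13.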

This shows that the context analysis based emotion analysis performed by the context analysis module 20 of the present apparatus 100, 110 yields more accurate results than conventional natural language processing (for example, Google Natural Language).

The present application may provide a neural network based emotion analysis and emotion therapy system 200, 200'. The present system 200, 200' may analyze the user's emotion for input voice data by fusing voice analysis with emotion dictionary based context analysis, and may provide the user with suitable emotion therapy and content according to the analysis result. The present system 200, 200' may be referred to as NBST (Neural network Based Sentiment analysis and Therapy system).

The present apparatus 100, 110 and the present system 200, 200' may perform emotion analysis on the input voice data by fusing voice analysis with context analysis based on an emotion dictionary (emotion word dictionary), and then provide the user with a therapy technique (emotion therapy method) based on the analysis result. Art therapy may be considered as one example of such an emotion therapy method, and providing it can change the user's negative emotion into a positive one.

By providing the emotion therapy method d4 determined by the therapy module 40 to the mobile application 2a, the present apparatus 100, 110 can help the user perform the emotion therapy corresponding to the determined emotion therapy method d4.

In other words, an emotion therapy method d4 such as the art therapy considered in the present system 200, 200' serves to change the negative emotion of the user 1 into a positive one. When the integrated analysis shows that the user's final emotion is negative or neutral, the present apparatus 100, 110 may provide an emotion therapy method d4 that induces the negative or neutral emotion to change into a positive emotion. When the integrated analysis shows that the user's final emotion is positive, the present apparatus 100, 110 may provide an emotion therapy method d4 based on content such as movies, music, and books, so that the user's current positive emotion is maintained and/or expanded.

Most conventional mobile applications rely on text-based emotion analysis, and none of them includes an element capable of changing the user's emotion. That is, most conventional mobile applications that perform emotion analysis give no consideration to techniques for changing the user's emotion. In contrast, the systems (200, 200') analyze in time series the user emotions identified by fusing voice analysis and context analysis, determine how the user's emotion changes, and can provide an emotion therapy such as art therapy according to the result. By providing the emotion therapy, the systems (200, 200') can guide the user's emotion toward a positive emotion when the user's current emotion is negative or neutral.

In other words, the systems (200, 200') fuse voice analysis and context analysis to identify the user's emotion for each piece of input voice data, accumulate the results, and analyze them in time series to track changes in the user's emotion. Based on this result, the systems (200, 200') can recommend (provide) emotion therapies (treatment techniques) and content to the user.
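The accumulation and time-series analysis described above can be sketched as follows. This is only an illustrative sketch: the `EmotionHistory` class, the `VALENCE` mapping, and the half-versus-half comparison are assumptions made for the example, since the source does not specify how the accumulated results are analyzed.

```python
from collections import deque

# Assumed mapping from a final emotion to a valence value (not from the source).
VALENCE = {"happy": 1, "neutral": 0, "sad": -1, "angry": -1}


class EmotionHistory:
    """Keeps the most recent final emotions in chronological order."""

    def __init__(self, max_entries=30):
        self.entries = deque(maxlen=max_entries)

    def add(self, emotion):
        self.entries.append(emotion)

    def trend(self):
        """Compare the mean valence of the newer half with the older half."""
        values = [VALENCE[e] for e in self.entries]
        if len(values) < 2:
            return "insufficient data"
        half = len(values) // 2
        older = sum(values[:half]) / half
        newer = sum(values[half:]) / (len(values) - half)
        if newer > older:
            return "improving"
        if newer < older:
            return "worsening"
        return "stable"


# Hypothetical sequence of final emotions from successive voice inputs.
history = EmotionHistory()
for emotion in ["sad", "angry", "neutral", "happy"]:
    history.add(emotion)
```

Calling `history.trend()` on this hypothetical sequence reports an improving trend, which a system of this kind could use when choosing what to recommend next.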

When the integrated emotion analysis shows that the user currently has a negative or neutral emotion, the systems (200, 200') can guide that emotion toward a positive one. When the integrated emotion analysis shows that the user currently has a positive emotion, the systems (200, 200') can recommend and provide content such as movies, music, and books, helping the user's current positive emotion to be maintained or further expanded.

The systems (200, 200') can be applied to market research, which accounts for the largest share of the global affective (emotion) computing market. The systems (200, 200') can also be used directly in the health care field. In addition, the systems (200, 200') are applicable to the psychotherapy market and can deliver emotion treatment through a personal mobile application.

Hereinafter, the operation flow of the present application is briefly reviewed based on the details described above.

FIG. 14 is an operation flowchart of a neural network-based emotion analysis method according to an embodiment of the present application.

The neural network-based emotion analysis method shown in FIG. 14 may be performed by the device (100) described above. Therefore, even where omitted below, the description given for the device (100) applies equally to the neural network-based emotion analysis method.

Referring to FIG. 14, in step S11 the voice analysis module may calculate a first emotion analysis result by applying neural network-based voice analysis to voice data input from the user.

In step S11, the voice analysis module may calculate the first emotion analysis result through neural network-based voice analysis that takes voice waveform features extracted from the voice data as input.

Here, the voice waveform features may be MFCC feature values obtained by performing a Mel-Frequency Cepstral Coefficient (MFCC) transformation on the voice data.
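For orientation, the sketch below computes MFCC values for a single frame using only the Python standard library. It is a simplified illustration, not the transformation used in the source: the Hamming window, the 20 mel filters, the 13 coefficients, the 8 kHz sample rate, and all function names are assumptions, and a practical system would more likely rely on an audio processing library.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Apply a Hamming window and return the one-sided power spectrum."""
    n = len(frame)
    windowed = [
        x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
        for i, x in enumerate(frame)
    ]
    spectrum = []
    for k in range(n // 2 + 1):  # naive DFT, acceptable for one short frame
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(windowed)
        )
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(num_filters, num_bins, sample_rate):
    """Build triangular filters spaced evenly on the mel scale."""
    nyquist = sample_rate / 2.0
    top = hz_to_mel(nyquist)
    points = [mel_to_hz(top * i / (num_filters + 1)) for i in range(num_filters + 2)]
    bins = [int(round(p / nyquist * (num_bins - 1))) for p in points]
    bank = []
    for m in range(1, num_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * num_bins
        for b in range(left, center):          # rising edge
            row[b] = (b - left) / (center - left)
        row[center] = 1.0                       # peak of the triangle
        for b in range(center + 1, right):      # falling edge
            row[b] = (right - b) / (right - center)
        bank.append(row)
    return bank


def mfcc(frame, sample_rate, num_filters=20, num_coefficients=13):
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(spectrum), sample_rate)
    log_energies = [
        math.log(sum(w * p for w, p in zip(row, spectrum)) + 1e-10)
        for row in bank
    ]
    # A DCT-II turns the log mel energies into cepstral coefficients.
    return [
        sum(
            e * math.cos(math.pi * c * (j + 0.5) / num_filters)
            for j, e in enumerate(log_energies)
        )
        for c in range(num_coefficients)
    ]


# Hypothetical input: one 256-sample frame of a 440 Hz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
frame = [math.sin(2.0 * math.pi * 440.0 * i / SAMPLE_RATE) for i in range(256)]
features = mfcc(frame, SAMPLE_RATE)
```

The resulting `features` list holds one coefficient vector for the frame; a full utterance would yield one such vector per frame, which could then be fed to the voice analysis network.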

In addition, the first emotion analysis result may be a value probabilistically assigned to each of a plurality of emotions including happy, neutral, sad, and angry.
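To make the shape of such a result concrete, the sketch below converts raw network outputs into one probability per emotion with a softmax. The softmax output layer, the ordering of `EMOTIONS`, and the example scores are assumptions for illustration; the source does not state how the voice analysis network produces its probabilities.

```python
import math

# Emotion labels named in the description; the ordering is an assumption.
EMOTIONS = ["happy", "neutral", "sad", "angry"]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def first_emotion_result(raw_scores):
    """Pair each emotion label with its probability."""
    return dict(zip(EMOTIONS, softmax(raw_scores)))


# Hypothetical raw outputs of a voice analysis network for one utterance.
result = first_emotion_result([2.0, 0.5, 0.1, -1.0])
```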

Next, in step S12 the context analysis module may calculate a second emotion analysis result by applying emotion dictionary-based context analysis to the voice data input from the user.

In step S12, the context analysis module may convert the voice data into text, extract at least one lemma (base word form) from the converted text, and match the extracted lemma or lemmas against the lemmas contained in the emotion dictionary DB to calculate the second emotion analysis result.

Here, the emotion dictionary DB may be built so that each of a plurality of lemmas is assigned an emotion score reflecting its positive strength and negative strength.

In step S12, the context analysis module may calculate the second emotion analysis result by taking into account the emotion score of each matching lemma, that is, each extracted lemma that matches a lemma contained in the emotion dictionary DB.
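The dictionary matching in step S12 can be sketched roughly as follows. Everything specific here is assumed for illustration: the dictionary entries and scores are invented, 0.0 is taken as the neutral reference score, and averaging the matched scores is only one possible way to combine them. Speech-to-text conversion and lemma extraction are treated as already done.

```python
# Hypothetical emotion dictionary: lemma -> emotion score.
# Assumption: 0.0 is the neutral reference, scores above it are positive
# and scores below it are negative; the entries are invented examples.
EMOTION_DICTIONARY = {
    "happy": 0.8,
    "love": 0.9,
    "tired": -0.4,
    "hate": -0.9,
}
NEUTRAL_SCORE = 0.0


def second_emotion_result(lemmas, dictionary=EMOTION_DICTIONARY):
    """Average the scores of the lemmas found in the dictionary.

    Returns the neutral score when no lemma matches, that is, when the
    utterance contains no emotion word.
    """
    matched = [dictionary[lemma] for lemma in lemmas if lemma in dictionary]
    if not matched:
        return NEUTRAL_SCORE
    return sum(matched) / len(matched)


def polarity(score):
    """Label a score relative to the neutral reference."""
    if score > NEUTRAL_SCORE:
        return "positive"
    if score < NEUTRAL_SCORE:
        return "negative"
    return "neutral"


# Lemmas as they might come out of speech-to-text plus lemma extraction.
score = second_emotion_result(["i", "be", "tired", "and", "hate", "monday"])
```

Here only "tired" and "hate" match the dictionary, so `polarity(score)` labels the utterance as negative, while an utterance with no matching lemma falls back to the neutral score.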

Next, in step S13 the integration module may output an integrated emotion analysis result for the voice data by using the first emotion analysis result calculated in step S11 and the second emotion analysis result calculated in step S12 as inputs to a neural network.
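A minimal sketch of such an integration step is given below, assuming a single dense layer followed by a softmax. The layer shape, the hand-set `WEIGHTS` and `BIASES`, and the use of a single scalar for the second emotion analysis result are all hypothetical; the source does not disclose the architecture, and an actual integration network would learn its parameters from labelled data.

```python
import math

EMOTIONS = ["happy", "neutral", "sad", "angry"]

# Hypothetical, hand-set parameters of a single dense layer (not learned,
# not from the source). Each row holds the weights applied to
# [p_happy, p_neutral, p_sad, p_angry, lexicon_score].
WEIGHTS = [
    [4.0, 0.0, 0.0, 0.0, 3.0],    # happy
    [0.0, 4.0, 0.0, 0.0, 0.0],    # neutral
    [0.0, 0.0, 4.0, 0.0, -1.5],   # sad
    [0.0, 0.0, 0.0, 4.0, -1.5],   # angry
]
BIASES = [0.0, 0.0, 0.0, 0.0]


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def integrate(first_result, second_result):
    """Forward pass: concatenate both results, apply the layer, normalize."""
    inputs = [first_result[e] for e in EMOTIONS] + [second_result]
    logits = [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(WEIGHTS, BIASES)
    ]
    return dict(zip(EMOTIONS, softmax(logits)))


# Hypothetical first result (voice) and second result (lexicon score).
voice = {"happy": 0.30, "neutral": 0.40, "sad": 0.20, "angry": 0.10}
integrated = integrate(voice, second_result=0.8)
```

With these illustrative weights, a clearly positive lexicon score shifts the outcome toward "happy", whereas a neutral score of 0.0 leaves the ranking of the voice-based probabilities unchanged.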

Here, the integrated emotion analysis result is first derived as a value probabilistically assigned to each of a plurality of emotions including happy, neutral, sad, and angry, and may be the result of determining the emotion with the highest probability among them as the final emotion.
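Selecting the final emotion amounts to taking the label with the largest probability. The short sketch below illustrates this; the function name `final_emotion` and the example probabilities are assumptions.

```python
def final_emotion(integrated_result):
    """Return the emotion with the highest probability and that probability."""
    emotion = max(integrated_result, key=integrated_result.get)
    return emotion, integrated_result[emotion]


# Hypothetical integrated emotion analysis result.
integrated = {"happy": 0.12, "neutral": 0.18, "sad": 0.55, "angry": 0.15}
emotion, probability = final_emotion(integrated)
```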

In the above description, steps S11 to S13 may be further divided into additional steps or combined into fewer steps, depending on the implementation of the present application. In addition, some steps may be omitted as necessary, and the order of the steps may be changed.

FIG. 15 is an operation flowchart of an emotion treatment method based on neural network-based emotion analysis according to an embodiment of the present application.

The emotion treatment method based on neural network-based emotion analysis shown in FIG. 15 may be performed by the device (110) described above. Therefore, even where omitted below, the description given for the device (110) applies equally to the emotion treatment method based on neural network-based emotion analysis.

Each of steps S21 to S23 in FIG. 15 may be the same as the corresponding one of steps S11 to S13 in FIG. 14 described above. Therefore, steps S21 to S23 are described only briefly below.

Referring to FIG. 15, in step S21 the voice analysis module may calculate a first emotion analysis result by applying neural network-based voice analysis to voice data input from the user. Next, in step S22 the context analysis module may calculate a second emotion analysis result by applying emotion dictionary-based context analysis to the voice data input from the user. Next, in step S23 the integration module may output an integrated emotion analysis result for the voice data by using the first emotion analysis result calculated in step S21 and the second emotion analysis result calculated in step S22 as inputs to a neural network.

Next, in step S24 the treatment module may determine and provide an emotion therapy based on the integrated emotion analysis result output in step S23.

Here, when the integrated emotion analysis result shows the user's emotion to be negative or neutral, the emotion therapy may be provided in the form of content that induces a change of the user's emotion toward a positive emotion.

In addition, when the integrated emotion analysis result shows the user's emotion to be positive, the emotion therapy may be provided in the form of content that induces the user's positive emotion to be maintained and expanded.
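The two cases above can be sketched as a simple selection rule. The grouping of emotions into positive and negative, the content lists, and the function name `choose_therapy` are assumptions made for the example; the source names art therapy and movies, music, and books only as examples.

```python
# Assumption: happy counts as positive; sad and angry count as negative.
POSITIVE = {"happy"}
NEGATIVE = {"sad", "angry"}

# Hypothetical content catalogue based on the examples named in the text.
CHANGE_INDUCING = ["art therapy"]
MAINTAINING = ["movie", "music", "book"]


def choose_therapy(final_emotion):
    """Pick the goal and content type for the determined final emotion."""
    if final_emotion in POSITIVE:
        return {"goal": "maintain and expand", "content": MAINTAINING}
    if final_emotion in NEGATIVE or final_emotion == "neutral":
        return {"goal": "induce positive change", "content": CHANGE_INDUCING}
    raise ValueError(f"unknown emotion: {final_emotion}")


plan = choose_therapy("sad")
```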

In the above description, steps S21 to S24 may be further divided into additional steps or combined into fewer steps, depending on the implementation of the present application. In addition, some steps may be omitted as necessary, and the order of the steps may be changed.

The neural network-based emotion analysis method and the emotion treatment method based on neural network-based emotion analysis according to embodiments of the present application may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the present invention, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the present invention, and vice versa.

In addition, the above-described neural network-based emotion analysis method and emotion treatment method based on neural network-based emotion analysis may also be implemented in the form of a computer program or application that is stored in a recording medium and executed by a computer.

The foregoing description of the present application is provided for illustration, and those of ordinary skill in the art to which the present application pertains will understand that it can readily be modified into other specific forms without changing its technical spirit or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and likewise components described as distributed may be implemented in a combined form.

The scope of the present application is defined by the claims below rather than by the above detailed description, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present application.

200: Neural network-based emotion analysis system
200': Emotion treatment system based on neural network-based emotion analysis
100: Neural network-based emotion analysis device
110: Emotion treatment device based on neural network-based emotion analysis
10: Voice analysis module
20: Context analysis module
30: Integration module
40: Treatment module
2: User terminal

Claims

A neural network-based emotion analysis method performed by a neural network-based emotion analysis device, comprising:
(a) calculating a first emotion analysis result by applying a neural network-based voice analysis to voice data input from a user;
(b) calculating a second emotion analysis result by applying context analysis based on emotion dictionary to the voice data; and
(c) outputting an integrated emotion analysis result for the voice data by using the first emotion analysis result and the second emotion analysis result as inputs to a neural network;
wherein the first emotion analysis result is a value probabilistically assigned to each of a plurality of emotions including happy, neutral, sad, and angry,
wherein the second emotion analysis result includes positive, negative, and neutral, such that, relative to the emotion score corresponding to neutral, positive is an emotion score greater than the neutral score and related to the strength of positive emotion, and negative is an emotion score smaller than the neutral score and related to the strength of negative emotion,
wherein step (b) comprises converting the voice data into text, extracting at least one lemma and a dependency from the converted text, and matching the extracted at least one lemma against the lemmas contained in an emotion dictionary DB to calculate the second emotion analysis result, the second emotion analysis result being calculated in consideration of the dependency,
wherein, in step (c), the neural network is
constructed, when no paralinguistic feature is present in the voice data, to derive the integrated emotion analysis result by giving more consideration to the second emotion analysis result than to the first emotion analysis result, because the first emotion analysis result is a value probabilistically assigned to each of the plurality of emotions, and
constructed, when the second emotion analysis result is given the emotion score corresponding to neutral because the voice data contains no emotion word expressing the user's emotion, to output as the integrated emotion analysis result the emotion assigned the highest probability value among the probability values for the plurality of emotions in the first emotion analysis result.

The method of claim 1,
wherein, in step (a), the first emotion analysis result is calculated through the neural network-based voice analysis that takes voice waveform features extracted from the voice data as input.

The method of claim 2,
wherein the voice waveform features are MFCC feature values obtained by performing a Mel-Frequency Cepstral Coefficient (MFCC) transformation on the voice data.

(Deleted)

The method of claim 1,
wherein the emotion dictionary DB is built so that each of a plurality of lemmas is assigned an emotion score related to its positive strength and negative strength, and
wherein, in step (b), the second emotion analysis result is calculated in consideration of the emotion score of each matching lemma, among the extracted at least one lemma, that matches a lemma contained in the emotion dictionary DB.

The method of claim 1,
wherein, in step (c), the integrated emotion analysis result is first derived as a value probabilistically assigned to each of a plurality of emotions including happy, neutral, sad, and angry, and is the result of determining the emotion with the highest probability among the plurality of emotions as the final emotion.

(Deleted)

A neural network-based emotion analysis device, comprising:
a voice analysis module for calculating a first emotion analysis result by applying a neural network-based voice analysis to voice data input from a user;
a context analysis module for calculating a second emotion analysis result by applying an emotion dictionary-based context analysis to the voice data; and
an integration module for outputting an integrated emotion analysis result for the voice data by using the first emotion analysis result and the second emotion analysis result as inputs to a neural network,
wherein the first emotion analysis result is a value probabilistically assigned to each of a plurality of emotions including happy, neutral, sad, and angry,
wherein the second emotion analysis result includes positive, negative, and neutral, such that, relative to the emotion score corresponding to neutral, positive is an emotion score greater than the neutral score and related to the strength of positive emotion, and negative is an emotion score smaller than the neutral score and related to the strength of negative emotion,
wherein the context analysis module
converts the voice data into text, extracts at least one lemma and a dependency from the converted text, and matches the extracted at least one lemma against the lemmas contained in an emotion dictionary DB to calculate the second emotion analysis result, the second emotion analysis result being calculated in consideration of the dependency,
wherein the neural network is
constructed, when no paralinguistic feature is present in the voice data, to derive the integrated emotion analysis result by giving more consideration to the second emotion analysis result than to the first emotion analysis result, because the first emotion analysis result is a value probabilistically assigned to each of the plurality of emotions, and
constructed, when the second emotion analysis result is given the emotion score corresponding to neutral because the voice data contains no emotion word expressing the user's emotion, to output as the integrated emotion analysis result the emotion assigned the highest probability value among the probability values for the plurality of emotions in the first emotion analysis result.

The device of claim 10,
wherein the voice analysis module calculates the first emotion analysis result through the neural network-based voice analysis that takes voice waveform features extracted from the voice data as input.

The device of claim 11,
wherein the voice waveform features are MFCC feature values obtained by performing a Mel-Frequency Cepstral Coefficient (MFCC) transformation on the voice data.

(Deleted)

The device of claim 10,
wherein the emotion dictionary DB is built so that each of a plurality of lemmas is assigned an emotion score related to its positive strength and negative strength, and
wherein the context analysis module calculates the second emotion analysis result in consideration of the emotion score of each matching lemma, among the extracted at least one lemma, that matches a lemma contained in the emotion dictionary DB.

The device of claim 10,
wherein the integrated emotion analysis result is first derived as a value probabilistically assigned to each of a plurality of emotions including happy, neutral, sad, and angry, and is the result of determining the emotion with the highest probability among the plurality of emotions as the final emotion.

An emotion treatment device based on neural network-based emotion analysis,
wherein the device determines and provides an emotion therapy based on the integrated emotion analysis result provided by the neural network-based emotion analysis device according to claim 10 performing neural network-based emotion analysis.

The device of claim 17,
wherein the emotion therapy is
provided, when the integrated emotion analysis result shows the user's emotion to be negative or neutral, in the form of content that induces a change of the user's emotion toward a positive emotion, and
provided, when the integrated emotion analysis result shows the user's emotion to be positive, in the form of content that induces the user's positive emotion to be maintained and expanded.